A Reconfigurable Spectrally Efficient FDM Baseband Modulator

P N Whatmough†‡ and I Darwazeh†
† University College London, ‡ ARM Ltd.

Abstract: Spectrally Efficient FDM (SEFDM) systems employ non-orthogonal overlapped carriers to improve spectral efficiency for future communication systems. One of the challenges for SEFDM systems is to demonstrate efficient hardware implementations for transmitters and receivers. This paper presents the first VLSI digital baseband transmitter architecture for SEFDM. The transmitter is reconfigurable between three bandwidth compression ratios, including OFDM and Fast OFDM, and therefore supports operation with current OFDM systems. A complexity analysis of the proposed architecture is presented, along with an area- and power-efficient hardware mapping implemented using a 65nm CMOS cell library, which allows area and power to be compared against a baseline OFDM transmitter.

1. Introduction

Spectrally Efficient FDM (SEFDM) systems offer spectral savings, when compared to Orthogonal Frequency Division Multiplexing (OFDM), by multiplexing non-orthogonal overlapped carriers [1]. In principle, non-orthogonal multicarrier systems achieve spectral savings by reducing the spacing between the subcarriers and/or the transmission time [1][2][4]. Despite the advantage of saving precious spectrum, the loss of orthogonality complicates both signal generation and detection. For the detection problem, many detectors have been proposed and evaluated in the literature. Maximum likelihood (ML) detection is suggested as the optimum technique in Additive White Gaussian Noise (AWGN) channels [1]. Nevertheless, ML detection is overly complex, with a computational complexity that grows exponentially with the size of the system. Linear detectors, such as minimum mean square error (MMSE) and zero forcing (ZF), yield competitive BER performance only for small SEFDM systems [3]. More recently, detection based on the sphere decoders proposed in [5] has shown optimum BER performance with decreased computational complexity and higher immunity to noise.

As for signal generation, recent work [6] has shown that SEFDM signals can be realized with a complexity similar to that of OFDM, by utilizing standard Inverse Discrete Fourier Transform (IDFT) blocks, judiciously arranged for SEFDM generation. In this paper, we present a VLSI architecture for SEFDM signal generation with complexity that approaches that of conventional OFDM systems. The architecture is reconfigurable for differing degrees of sub-carrier overlap, enabling the transmitter to switch from conventional OFDM to a more spectrally efficient SEFDM system on a symbol-by-symbol basis.

The rest of the paper is organized as follows: Section 2 introduces the mathematical model of the SEFDM signal and the proposed signal generation algorithm. Section 3 describes the algorithm-hardware mapping and Section 4 gives the implementation results. Section 5 concludes the paper with a focus on the advantages of the proposed architecture.

2. The SEFDM System

Fig. 1 shows a generic SEFDM system. The signal is generated by the superposition of several non-orthogonal carriers, each carrying a complex symbol denoted as \( s = s_R + js_I \) to represent two-dimensional modulation. The carriers in SEFDM systems are spaced by a fraction of the inverse of the symbol duration, thereby violating the orthogonality condition of the OFDM system, where the spacing is equal to the inverse of the symbol duration. The distance between the carriers in frequency, denoted by \( \Delta f \), is given by \( \Delta f = \alpha / T \), where \( \alpha \) denotes the amount of bandwidth compression (\( \alpha = 1 \) recovers OFDM) and \( T \) is the duration of one SEFDM symbol. Equation (1) gives the SEFDM signal, denoted by \( x(t) \), in baseband representation, where \( M \) is the number of subcarriers and \( s_{l,m} \) denotes the symbol modulated on the \( m^{th} \) subcarrier in the \( l^{th} \) SEFDM frame.

\[
x(t) = \frac{1}{\sqrt{T}} \sum_{m=0}^{M-1} s_{l,m} e^{j2\pi m \alpha (t-t_l)/T} \qquad (1)
\]
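The effect of the reduced spacing \( \Delta f = \alpha / T \) on orthogonality can be checked numerically. The sketch below is an illustration only: the function name `subcarrier_correlation` is hypothetical, and it simply evaluates, in closed form, the normalised inner product of two complex carriers spaced by \( (m - n)\alpha / T \) over one symbol duration.

```python
import cmath
import math


def subcarrier_correlation(m, n, alpha):
    """Normalised inner product of subcarriers m and n over one symbol.

    Evaluates (1/T) * integral_0^T exp(j*2*pi*(m - n)*alpha*t/T) dt in
    closed form; the result does not depend on T.
    """
    d = (m - n) * alpha
    if d == 0:
        return 1 + 0j
    return (cmath.exp(2j * math.pi * d) - 1) / (2j * math.pi * d)


# Adjacent carriers for the three compression ratios considered in this paper.
for alpha in (1.0, 2 / 3, 0.5):
    rho = subcarrier_correlation(1, 0, alpha)
    print(f"alpha = {alpha:.3f}: |correlation| = {abs(rho):.4f}")
```

For \( \alpha = 1 \) the adjacent-carrier correlation vanishes (to rounding error), as expected for OFDM, while for \( \alpha = 2/3 \) and \( \alpha = 1/2 \) its magnitude is roughly 0.41 and 0.64 respectively. This is the inter-carrier interference that the receiver must resolve.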

A discrete model of the SEFDM system can be obtained by sampling the SEFDM frame with index zero from (1) at a rate \( N/T \), where \( N \geq M \), giving:

$$X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} s_n e^{j2\pi nk\alpha/N}, \qquad (2)$$

where $s_n$ is the symbol on the $n^{th}$ subcarrier of that frame, taken as zero for $n \geq M$, and $k = 0, \ldots, N-1$.

The samples of the SEFDM signal as in (2) can be generated based on IDFT operations. Taking $\alpha = b/c$, where $b$ and $c \in \mathbb{Z}$ and $b < c$, the SEFDM signal can be expressed as:

$$X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{cN-1} s'_n e^{j2\pi nk/(cN)}, \qquad (3)$$

where

$$s'_i = \begin{cases} 
s_{i/b}, & i \in I \\
0, & \text{otherwise}
\end{cases}$$

and $I = \{0, b, \ldots, b(N-1)\}$. Equation (3) can be rearranged as

$$X[k] = \frac{1}{\sqrt{N}} \sum_{i=0}^{c-1} e^{j2\pi ik/(cN)} \sum_{l=0}^{N-1} s'_{i+lc} e^{j2\pi lk/N}, \qquad (4)$$

by substituting with $n = i + lc$.

Equation (4) clearly shows that the samples of the SEFDM signal can be generated using $c$ IDFT operations, each of length $N$. The input symbols are padded with $(c-1)N$ zeros and then arranged as a $c \times N$ matrix in column-major order. An IDFT operation is then performed on each row. The signal is finally composed by combining rotated versions of the IDFT outputs, as depicted in Fig. 2.
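The procedure can be sketched in a few lines of Python and checked against direct evaluation of the sampled sum. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, symbol $n$ is placed at index $nb$ of a length-$cN$ zero-padded vector, row $i$ of the $c \times N$ matrix holds entries $i, i+c, i+2c, \ldots$ of that vector, and row $i$ is rotated by $e^{j2\pi ik/(cN)}$ before summation. A plain $O(N^2)$ IDFT stands in for the IFFT.

```python
import cmath
import math


def idft(x):
    """Unnormalised IDFT: y[k] = sum_l x[l] * exp(j*2*pi*l*k/len(x))."""
    n = len(x)
    return [sum(x[l] * cmath.exp(2j * math.pi * l * k / n) for l in range(n))
            for k in range(n)]


def sefdm_direct(symbols, b, c):
    """Reference: evaluate the sampled SEFDM sum directly, with alpha = b/c."""
    n = len(symbols)
    return [sum(s * cmath.exp(2j * math.pi * m * k * b / (c * n))
                for m, s in enumerate(symbols)) / math.sqrt(n)
            for k in range(n)]


def sefdm_idft(symbols, b, c):
    """Same samples from c IDFTs of length N plus output rotations."""
    n = len(symbols)
    padded = [0j] * (c * n)               # zero-padded vector of length cN
    for m, s in enumerate(symbols):
        padded[m * b] = s                 # symbol m sits at index m*b
    rows = [idft(padded[i::c]) for i in range(c)]   # row i: i, i+c, i+2c, ...
    return [sum(cmath.exp(2j * math.pi * i * k / (c * n)) * rows[i][k]
                for i in range(c)) / math.sqrt(n)
            for k in range(n)]


# Eight QPSK symbols, checked for alpha = 1, 1/2 and 2/3.
symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
for b, c in [(1, 1), (1, 2), (2, 3)]:
    direct = sefdm_direct(symbols, b, c)
    fast = sefdm_idft(symbols, b, c)
    assert max(abs(p - q) for p, q in zip(direct, fast)) < 1e-9
```

With $b = c = 1$ the sketch degenerates to a single IDFT, i.e. conventional OFDM.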

### 3. VLSI Architecture

In this section, we present a VLSI architecture for the SEFDM signal generation algorithm described in the previous section. The architecture is based on IFFTs, which are the key building blocks of current OFDM transmitters and the subject of significant implementation research that has led to some highly-optimised architectures [7][8]. The algorithm presented in the previous section is suitable for generation of signals with arbitrary values of $\alpha$, but complexity rises linearly with $c$, so we present here both a general implementation targeted at FPGA and an ASIC implementation optimised for $\alpha \in \{1, \frac{2}{3}, \frac{1}{2}\}$. The FPGA implementation is intended for use in an SEFDM performance evaluation test bed to further enable practical demonstration of the spectrally efficient physical layer.

An SEFDM solution with reconfigurable sub-carrier spacing has two key benefits. Firstly, it allows us to adapt $\alpha$ in order to optimise the trade-off between receiver complexity, spectral efficiency and prevailing channel conditions. Secondly, supporting $\alpha = 1$ allows us to maintain backward compatibility with the many incumbent OFDM systems. To this end, we ensure that $\alpha$ can be reconfigured for each FDM symbol. We assume $M$-ary digital QAM modulation for sub-carrier symbols, and do not consider mapping of pilot symbols. Word sizes have been generalised to $w$ bits where possible.

Fig. 3 illustrates the general symbol re-ordering operation, which consists of padding the input symbols with \((c - 1)N\) zeros before arranging them as a \(c \times N\) matrix in column-major order, as described in Section 2. A naive implementation of this operation implies a buffer of \(cN\) complex words to hold the sparse complex matrix. Our proposed architecture avoids this large storage requirement by performing the symbol mapping serially: the symbols are first loaded into a single \(N\)-complex-word buffer (RAM or register file); the IFFTs are then enabled and each of the \(c\) IFFT inputs is loaded with either a symbol addressed from the buffer or a zero, as determined by an address generation unit. In the FPGA implementation the various control signals required to operate the multiplexers are generated by modulo arithmetic operations on a \(\log_2(cN)\)-bit counter, which reconfigures for different \(b/c\) parameters. For the optimised ASIC implementation this process is replaced with a LUT containing addresses for pre-set values of \(b/c\).
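The address generation can be illustrated with a short sketch. It is one plausible formulation, not the RTL used in this work: the function names and the use of `None` to signal "load a zero" are assumptions, as is the convention that input \(l\) of IFFT \(i\) corresponds to position \(i + lc\) of the virtual zero-padded vector, which holds a symbol only when that position is a multiple of \(b\).

```python
def ifft_input_address(i, l, b, c, n_sym):
    """Buffer address feeding input l of IFFT i, or None to load a zero."""
    n = i + l * c                      # position in the virtual cN-word vector
    if n % b == 0 and n // b < n_sym:  # only multiples of b carry a symbol
        return n // b
    return None


def address_table(b, c, n_sym):
    """Row i lists the buffer addresses (or None) for the inputs of IFFT i."""
    return [[ifft_input_address(i, l, b, c, n_sym) for l in range(n_sym)]
            for i in range(c)]


# Small example: b/c = 2/3 with N = 4 buffered symbols.
for row in address_table(b=2, c=3, n_sym=4):
    print(row)
```

In this small example each of the four buffer addresses appears exactly once across the three rows and every other input is a zero, so only the \(N\)-word buffer is ever stored. The modulo test corresponds to the counter-based control of the FPGA version, while tabulating the result for fixed \(b/c\) corresponds to the LUT of the ASIC version.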

The \(N\)-point IDFTs of (4) are efficiently implemented as \(c\) separate \(N\)-point IFFTs, which can be realised either as \(c\) parallel IFFT blocks or as a smaller number of time-multiplexed blocks. We have opted to use \(c\) parallel IFFTs in order to allow the highest throughput and a constant latency independent of \(\alpha\). We have used 64-point, 12-bit complex IFFT blocks in a RAM-based Radix-8 configuration [7]. The IFFTs have an enable signal which, when de-asserted, gates the internal clock and clears the output registers to zero.

The post-processing operation (Fig. 2) combines the parallel IFFT outputs after multiplication with a complex exponential in order to produce the discrete-time output samples, \(X[k]\). The complexity of the post-processing is a linear function of \(c\), since \(c-1\) complex multiply-accumulate (CMAC) operations are required. The hardware required includes the CMACs and LUTs to store pre-calculated rotation coefficients in read-only memory (ROM). No explicit reconfiguration is required when \(c < 3\), since we ensure that the outputs of the unused IFFTs are \(0 + j0\) in this case; however, LUT address signals are explicitly masked when not required in order to reduce redundant toggling. A total of \((c-1)\) LUTs are required for each configuration (one for \(c = 2\) and two for \(c = 3\)), which results in three \(N\)-complex-word LUTs. Due to inherent symmetry in the rotation coefficients, the storage required can in practice be easily reduced to the range \([0, \pi/2]\) per LUT at the cost of increased complexity in the address generator logic, which must then count both up and down the LUT and conditionally negate the output value. For higher performance implementations, the critical paths through the complex multipliers form a feedforward cutset that can be arbitrarily pipelined to meet throughput requirements at the cost of additional latency.
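The quarter-wave storage reduction can be sketched as follows. This is a generic illustration of the symmetry argument, not the LUT organisation used in this design: it assumes a single table of cosine values indexed by phase step over \([0, \pi/2]\), a period that is a multiple of four (true for \(cN = 128\) and \(cN = 192\)), and hypothetical function names. Reading the table forwards gives the cosine, reading it backwards gives the sine, and the quadrant selects the swaps and sign changes.

```python
import cmath
import math


def quarter_table(period):
    """cos(2*pi*p/period) for p = 0 .. period/4, i.e. the first quadrant only."""
    assert period % 4 == 0
    return [math.cos(2 * math.pi * p / period) for p in range(period // 4 + 1)]


def rotation(p, period, table):
    """Rebuild exp(j*2*pi*p/period) from the first-quadrant table."""
    q = period // 4
    quadrant, r = divmod(p % period, q)
    c = table[r]          # count up the table: cosine of the in-quadrant angle
    s = table[q - r]      # count down the table: sine of the same angle
    if quadrant == 0:
        return complex(c, s)
    if quadrant == 1:     # multiply by j
        return complex(-s, c)
    if quadrant == 2:     # multiply by -1
        return complex(-c, -s)
    return complex(s, -c)  # multiply by -j


# Rotation applied to output k of IFFT i is exp(j*2*pi*i*k/(c*N)).
c_blocks, n_points = 3, 64
period = c_blocks * n_points
table = quarter_table(period)
worst = max(abs(rotation(i * k, period, table)
                - cmath.exp(2j * math.pi * i * k / period))
            for i in range(c_blocks) for k in range(n_points))
print(f"table entries: {len(table)}, worst-case error: {worst:.1e}")
```

The sketch uses floating-point values; a fixed-point ROM would additionally quantise the stored coefficients to the chosen word length.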

### 4. Implementation Results

The proposed architecture has been implemented in VHDL and verified using RTL simulation and FPGA prototyping. The design was synthesised to a commercial 65nm CMOS cell library using Synopsys Design Compiler. Synopsys IC Compiler was used to place and route the design, and the PrimeTime and PrimeTime PX tools were used for static timing analysis (STA) and power estimation respectively. For a baseline comparison, we also present results for a conventional OFDM modulator which consists of a single IFFT module. Both designs achieved the constrained clock period of 2 ns, which was verified using STA. Power dissipation is analysed at two clock frequencies to give an impression of how power scales with baud rate (\(V_{dd} = 1 V\) for both analyses). The results are summarised in Table I.

The presented reconfigurable SEFDM transmitter requires approximately 3.6x greater circuit area than our baseline OFDM transmitter, which is intuitive given that the architecture includes three IFFT datapaths and additional input and output circuitry. Power dissipation follows a similar trend (around 3.1x increase) compared to the baseline OFDM implementation. Due to the clock tree gating in the design, dynamic power dissipation scales well with different values of \(\alpha\), and in fact, for \(\alpha = 1\) has a comparable power dissipation to the OFDM baseline, although with a greater leakage power contribution due to the significantly increased circuit area. It is also important to consider that successfully achieving improved spectral efficiency may well lead to considerable improvements in power efficiency at the system level if it allows a reduction in transmission power at equal throughput. Fig. 4 shows the spectrum for 64 QPSK subcarriers at all three compression ratios.

![Figure 4. Spectrum of transmitter output signal for compression ratios of 1 (conventional OFDM), ½ (“Fast” OFDM) and ⅔, with 64 QPSK subcarriers.](image)

### Table I. Implementation Results

<table>
<thead>
<tr>
<th></th>
<th>Area</th>
<th>Clock Period / Frequency</th>
<th>Power Dissipation</th>
</tr>
</thead>
<tbody>
<tr>
<td>OFDM</td>
<td>89,517 µm² (62 K gates<sup>a</sup>)</td>
<td>5 ns / 200 MHz</td>
<td>9.4 mW</td>
</tr>
<tr>
<td></td>
<td></td>
<td>2 ns / 500 MHz</td>
<td>22.5 mW</td>
</tr>
<tr>
<td>Reconfig SEFDM</td>
<td>324,244 µm² (225 K gates<sup>a</sup>)</td>
<td>5 ns / 200 MHz</td>
<td>10.8 mW, α = 1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>21.6 mW, α = ½</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>29.4 mW, α = ⅔</td>
</tr>
<tr>
<td></td>
<td></td>
<td>2 ns / 500 MHz</td>
<td>23.3 mW, α = 1</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>50.3 mW, α = ½</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>69.5 mW, α = ⅔</td>
</tr>
</tbody>
</table>

a. Equivalent NAND2 gates.
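The area and power ratios quoted in this section follow directly from Table I. The short script below reproduces them as a sanity check; the variable names are illustrative only, and just the best-case (\(\alpha = 1\)) and worst-case (\(\alpha = ⅔\)) SEFDM power figures are used.

```python
# Values transcribed from Table I.
ofdm = {"area_um2": 89_517, "power_mw": {200: 9.4, 500: 22.5}}
sefdm = {
    "area_um2": 324_244,
    # Best case (alpha = 1) and worst case (alpha = 2/3) at each clock rate.
    "power_mw": {200: {"1": 10.8, "2/3": 29.4}, 500: {"1": 23.3, "2/3": 69.5}},
}

area_ratio = sefdm["area_um2"] / ofdm["area_um2"]
print(f"area ratio: {area_ratio:.2f}x")

power_ratio = {}
for mhz, by_alpha in sefdm["power_mw"].items():
    for alpha, mw in by_alpha.items():
        power_ratio[(mhz, alpha)] = mw / ofdm["power_mw"][mhz]
        print(f"{mhz} MHz, alpha = {alpha}: {power_ratio[(mhz, alpha)]:.2f}x")
```

The worst-case power ratio is close to 3.1x at both clock frequencies, while the \(\alpha = 1\) overhead is about 1.15x at 200 MHz and about 1.04x at 500 MHz.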

### 5. Conclusion

For the first time, a practical implementation of the newly developed Spectrally Efficient FDM (SEFDM) system is described. The hardware implementation uses a recently proposed algorithm which employs the IDFT, which we map to an efficient VLSI architecture. This architecture is detailed in the context of two scenarios: one with an arbitrary compression ratio of $b/c$ targeted at an FPGA test bed, and the other an optimised ASIC implementation constrained to $\alpha \in \{1, ⅔, ½\}$. The latter has been implemented in a 65nm cell library, along with a baseline OFDM transmitter. Analysis of these designs shows a 3.6x increase in area, along with power dissipation that scales with $\alpha$ from 3.1x ($\alpha = ⅔$) to approximately 1.1x ($\alpha = 1$), as compared to the baseline OFDM implementation. These results demonstrate that a reconfigurable SEFDM transmitter can be realistically implemented with a modest increase in circuit area and power dissipation when compared to conventional OFDM.

### References


