The authors present a novel 16/32/64/128-point single-path delay feedback pipeline fast Fourier transform (FFT) architecture targeting the multi-rate and multi-regional orthogonal frequency division multiplexing (MR-OFDM) physical layer of IEEE 802.15.4-g. The proposed FFT architecture employs a mixed-radix algorithm to significantly reduce the number of complex multipliers. It uses a configurable complex constant multiplier structure instead of a fixed constant multiplier to perform the W_32, W_64, and W_128 twiddle factor multiplications efficiently. A hardware-sharing mechanism has also been formulated to reduce the memory requirements of the proposed 16/32/64/128-point FFT computation scheme. The proposed design is implemented on Xilinx Virtex-5 and Altera field-programmable gate array (FPGA) devices. For the computation of a 128-point FFT, the proposed mixed-radix FFT architecture significantly reduces the hardware cost in comparison with existing FFT architectures. The proposed FFT architecture is also implemented in 90 nm complementary metal-oxide-semiconductor (CMOS) technology with a supply voltage of 1 V. Post-synthesis results reveal that the design is efficient in terms of gate count and power consumption compared to earlier reported designs. With a word length of 12 bits, the proposed variable-length FFT architecture has a gate count of 22.3K and consumes 3.832 mW, which makes it well suited to the IEEE 802.15.4-g standard.
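As a rough software illustration of the mixed-radix idea mentioned in this abstract, the sketch below computes a 16/32/64/128-point FFT by splitting the length into small radices and applying W_N twiddle factors between stages. It is not the authors' single-path delay feedback hardware: the function names `dft` and `mixed_radix_fft` and the radix orderings (for example `[2, 4, 4, 4]` for 128 points) are assumptions chosen for illustration only.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT; serves as reference and as the small radix-r butterfly."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def mixed_radix_fft(x, radices):
    """Decimation-in-time FFT of length prod(radices).

    radices is a hypothetical factorisation of len(x), e.g. [2, 4, 4, 4]
    for 128 points; the paper's actual stage ordering is not given here.
    """
    n = len(x)
    if len(radices) == 1:
        assert radices[0] == n
        return dft(x)
    r = radices[0]
    m = n // r
    # Split into r interleaved sub-sequences and transform each (length m).
    subs = [mixed_radix_fft(x[j::r], radices[1:]) for j in range(r)]
    out = [0j] * n
    for k in range(m):
        # Twiddle factor multiplication by W_n^(j*k).
        t = [subs[j][k] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(r)]
        # Radix-r butterfly: an r-point DFT across the twiddled values.
        b = dft(t)
        for q in range(r):
            out[k + q * m] = b[q]
    return out

# The four transform lengths named in the abstract, with assumed factorisations.
plans = {16: [4, 4], 32: [2, 4, 4], 64: [4, 4, 4], 128: [2, 4, 4, 4]}
x128 = [complex(i % 7, -(i % 5)) for i in range(128)]
y128 = mixed_radix_fft(x128, plans[128])
```

Only the twiddle multiplications between stages need general complex multipliers; the radix-2 and radix-4 butterflies themselves involve only additions and multiplications by ±1 or ±j, which is the usual motivation for mixed-radix designs.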
This paper presents a generalized mixed-radix decimation-in-time (DIT) fast algorithm for computing the modified discrete cosine transform (MDCT) of composite lengths N = 2 × q^m, m ≥ 2, where q is an odd positive integer. The proposed algorithm not only has the merits of parallelism and numerical stability, but also requires fewer multiplications than MDCT algorithms based on the type-IV discrete cosine transform (DCT-IV) and the type-II discrete cosine transform (DCT-II), owing to the optimized length-(N/q) modules. The computation of the MDCT for composite lengths N = q^m × 2^n, m ≥ 2, n ≥ 2, can then be realized by combining the proposed algorithm with the fast radix-2 MDCT algorithm developed for N = 2^n. The combined algorithm can be used for the computation of the length-12/36 MDCT used in MPEG-1/-2 layer III audio coding, as well as in recently established wideband speech and audio coding standards such as G.729.1, where a length-640 MDCT is used. The inverse MDCT (IMDCT) can be obtained by transposing the signal flow graph of the MDCT. © 2011 Elsevier B.V. All rights reserved.
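The remark that the IMDCT follows by transposing the MDCT flow graph can be illustrated with a direct-form reference. The sketch below is not the fast mixed-radix algorithm of the paper; it assumes the common unnormalised kernel cos(2π/N · (t + 1/2 + N/4) · (k + 1/2)) with N inputs and N/2 outputs, and the names `mdct_matrix`, `mdct`, and `imdct` are hypothetical.

```python
import math

def mdct_matrix(n):
    """(n/2) x n kernel: C[k][t] = cos(2*pi/n * (t + 1/2 + n/4) * (k + 1/2))."""
    return [[math.cos(2 * math.pi / n * (t + 0.5 + n / 4) * (k + 0.5))
             for t in range(n)] for k in range(n // 2)]

def mdct(x):
    """Direct O(N^2) MDCT: n input samples -> n/2 coefficients (X = C x)."""
    c = mdct_matrix(len(x))
    return [sum(row[t] * x[t] for t in range(len(x))) for row in c]

def imdct(coeffs):
    """Unnormalised IMDCT as the transpose of the MDCT kernel (y = C^T X)."""
    n = 2 * len(coeffs)
    c = mdct_matrix(n)
    return [sum(c[k][t] * coeffs[k] for k in range(n // 2)) for t in range(n)]

# Length-36 block as in MPEG-1/-2 layer III: 36 samples -> 18 coefficients.
x36 = [math.sin(0.37 * t) for t in range(36)]
X18 = mdct(x36)
```

With this unnormalised kernel the rows of C are mutually orthogonal with squared norm N/2, so `mdct(imdct(X))` returns `X` scaled by N/2. A single `imdct(mdct(x))` does not recover `x`, because it contains time-domain aliasing that is cancelled only by overlap-add of adjacent windowed blocks.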
Designers must carefully choose the best-suited fast Fourier transform (FFT) algorithm among the various available techniques for a custom implementation that meets their design requirements, such as throughput, latency, and area. This article, to the best of the authors' knowledge, is the first to present a compact yet high-throughput parameterisable hardware architecture for implementing different FFT algorithms, including radix-2, radix-4, radix-8, mixed-radix, and split-radix algorithms. The designed architectures are fully parameterisable to support a variety of transform lengths and variable word-lengths. The FFT algorithms have been modelled and simulated in double-precision floating-point and fixed-point representations using the authors' custom-developed library of numerical operations. The designed FFT architectures are modelled in the Verilog hardware description language, and their cycle-accurate and bit-true simulation results are verified against their fixed-point simulation models. The characteristics and implementation results of the various FFT architectures on a Xilinx Virtex-7 FPGA are presented. Compared to recently published works, the authors' memory-based FFT architectures use fewer reconfigurable resources while maintaining comparable or higher operating frequencies. ASIC implementation results in a standard 45-nm CMOS technology are also presented for the designed memory-based FFT architectures. The execution times of FFTs on a workstation and a graphics processing unit are compared against the authors' FPGA implementations.
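The comparison of double-precision and fixed-point models described in this abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' custom numerical library: `quantize`, `fft_radix2`, and `max_error` are hypothetical names, and rounding twiddle factors and butterfly outputs to a fixed number of fractional bits is only one assumed way of modelling a fixed-point datapath (no overflow or per-stage scaling is modelled).

```python
import cmath

def quantize(z, frac_bits):
    """Round real and imaginary parts to frac_bits fractional bits (None = no rounding)."""
    if frac_bits is None:
        return z
    s = 1 << frac_bits
    return complex(round(z.real * s) / s, round(z.imag * s) / s)

def fft_radix2(x, frac_bits=None):
    """Recursive radix-2 DIT FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [quantize(x[0], frac_bits)]
    even = fft_radix2(x[0::2], frac_bits)
    odd = fft_radix2(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        # Quantised twiddle factor, as a stored ROM coefficient would be.
        w = quantize(cmath.exp(-2j * cmath.pi * k / n), frac_bits)
        t = w * odd[k]
        # Butterfly outputs are rounded back to the word-length.
        out[k] = quantize(even[k] + t, frac_bits)
        out[k + n // 2] = quantize(even[k] - t, frac_bits)
    return out

def max_error(a, b):
    """Largest absolute difference between two complex sequences."""
    return max(abs(p - q) for p, q in zip(a, b))

x64 = [complex((i % 8) / 8, (i % 3) / 4) for i in range(64)]
ref64 = fft_radix2(x64)          # double-precision model
fix64 = fft_radix2(x64, 16)      # model with 16 fractional bits
err64 = max_error(ref64, fix64)  # deviation introduced by quantisation
```

Sweeping `frac_bits` and plotting `err64` is one simple way to choose a word-length before committing to a hardware description; a bit-true model would additionally need to match the rounding and saturation behaviour of the actual design.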