ISBN: 9781509049677 (print)
This paper presents a new architecture for a high-performance, low-power implementation of an FIR filter using distributed arithmetic (DA). The architecture is faster because coefficient updating and filtering are performed in parallel. Throughput is higher than that of the conventional DA-based adaptive FIR filter because the normal accumulator is replaced with a carry-save accumulator. The architecture also reduces the number of adders needed for the DA-based design. The paper reports the area-delay product (ADP), area, and power for different filter lengths.
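As a rough illustration of why a carry-save accumulator can raise throughput, the sketch below keeps the running total as separate sum and carry words, so each accumulation step needs no carry propagation; a single ordinary addition resolves the two words at the end. This is a generic software model of carry-save accumulation, not the architecture of the paper above. The function names are hypothetical and the operands are assumed to be non-negative integers.

```python
def carry_save_step(s, c, x):
    """One 3:2 compression step: fold operand x into the (sum, carry) pair.

    Each bit position is handled independently, so no carry ripples
    across the word. The invariant s + c + x == new_s + new_c holds.
    """
    new_s = s ^ c ^ x                           # bitwise sum without carries
    new_c = ((s & c) | (s & x) | (c & x)) << 1  # carries, moved one place left
    return new_s, new_c


def carry_save_accumulate(operands):
    """Accumulate non-negative integers in carry-save form.

    Only the final line performs a carry-propagating addition.
    """
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_step(s, c, x)
    return s + c


if __name__ == "__main__":
    values = [13, 7, 255, 1024, 99]
    print(carry_save_accumulate(values) == sum(values))
```

In hardware the benefit comes from the step function having a delay of roughly one full adder regardless of word length; the Python model only demonstrates that the arithmetic is equivalent.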
ISBN: 9781538632444 (print)
The core of many DSP applications is convolution, which has traditionally been implemented with multiply-and-accumulate (MAC) operations. This requires a number of multipliers and accumulators. Using multipliers and accumulators gives faster execution but also increases cost, and the number of multipliers and adders available is limited. Hence a technique known as distributed arithmetic (DA) was proposed. It is essentially a multiplier-less concept that uses a lookup table (LUT), and it applies when one of the operands is fixed. The input enters a serial register, which is used to access the LUT. The LUT address is formed by taking the bits at the same bit position from each input. The output of the lookup table is shifted according to that bit position, and the shifted results are added together to form the final result. The method therefore consists of lookup-table access, shift, and add operations. After implementing the DA logic and comparing it with the convolution scheme, we apply the concept to the Discrete Wavelet Transform (DWT) of an image and of an ECG signal using Haar filter coefficients.
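The lookup, shift, and add steps described in this abstract can be sketched in a few lines. The following is a minimal software model, not the authors' implementation: the function names are hypothetical, the coefficients are assumed to be fixed integers, and the input samples are assumed to be unsigned integers that fit in `bits` bits.

```python
def build_lut(coeffs):
    """Precompute every subset sum of the fixed coefficients.

    Bit i of the address selects whether coeffs[i] is included,
    so the table has 2**len(coeffs) words.
    """
    k = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << k)
    ]


def da_inner_product(coeffs, samples, bits=8):
    """Inner product of coeffs and samples with no multiplications.

    Assumes each sample is an unsigned integer below 2**bits.
    """
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one LUT address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i
        # Weight the looked-up partial sum by its bit position.
        acc += lut[addr] << b
    return acc


if __name__ == "__main__":
    h = [3, -1, 4, 2]          # fixed operand
    x = [5, 200, 17, 255]      # varying operand, 8-bit unsigned
    print(da_inner_product(h, x) == sum(c * v for c, v in zip(h, x)))
```

Signed inputs would need the most significant bit slice to be subtracted rather than added (two's complement weighting); that case is left out to keep the sketch short.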
ISBN: 9781509036462 (print)
Digital signal processing techniques are used in a large number of applications, and digital filters are among their basic elements. Digital filter design involves many multiply-and-accumulate (MAC) operations, which consume a large amount of hardware resources and computation. The distributed arithmetic (DA) approach has been proposed in the literature as an efficient alternative for MAC-based designs. Reconfigurable computing, in turn, combines the flexibility of software with the high performance of hardware by using a flexible high-speed computing fabric such as an FPGA to make efficient use of hardware resources. This paper proposes the design of FIR filters using distributed arithmetic together with reconfigurable computing. Two reconfigurable architectures are proposed and implemented on an SRAM-based Xilinx FPGA board. The performance of the proposed design is evaluated with and without the reconfiguration architectures, and the results are reported. The proposed reconfigurable design saves 41.6-86.9% of hardware resources and 67.92% of power compared with the conventional non-reconfigurable design.
ISBN: 9781467378079 (print)
This brief presents an efficient adaptive Reversible Logic Finite Impulse Response filter (RLFIR) based on distributed arithmetic (DA) and built from reversible gates. Reversible logic is currently an important topic because of its effectiveness in reducing power in circuit design. The delay and logic resources of the proposed design are significantly reduced by using an add-one carry-select adder in the inner product of the adaptive filter. The existing carry-save adder in the adaptive filter is replaced by the proposed add-one carry-select adder, and the logic gates in that adder are replaced by reversible logic gates to reduce power consumption. Logic resources and delay are reduced to half of those of the existing carry-save adder, and power consumption is reduced to half by changing to reversible gates. This paper presents the quantum implementation and combinational circuit of all basic reversible gates, along with their VHDL code. All reversible logic gates are verified and simulated with Xilinx 8.2i.
In this paper, we perform a complexity analysis of fixed-coefficient and variable-coefficient distributed arithmetic (DA)-based finite impulse response (FIR) filter structures to observe the effect of LUT decomposition on the area complexity of the DA structure. The analysis reveals that the area complexity of the different units of a DA FIR filter structure does not increase in proportion to the level of parallelism. An appropriate choice of LUT decomposition factor, combined with a higher level of parallelism in the computation, can improve the area-delay efficiency of both fixed-coefficient and variable-coefficient DA-based FIR structures. Based on these findings, we propose bit-parallel block-based DA structures for fixed-coefficient and variable-coefficient FIR filters. The proposed structures process one block of input samples and produce one block of outputs in every clock cycle. Theoretical estimates show that the proposed fixed-coefficient structure, for block size 8 and filter length 32, involves eight times more ROM-LUT words, eight times more adders, and two fewer registers, and offers eight times higher throughput than the existing similar structure. For the same block size and filter length, the proposed variable-coefficient structure involves 7.2 times more adders, the same number of registers, and eight times more MUXes, and offers eight times higher throughput than the best available similar structure. Synthesis results show that the proposed fixed-coefficient structure for block size 8 and filter length 32 involves 47% less area-delay product (ADP) and 42% less energy per sample (EPS) than the existing structure and offers nearly eight times higher throughput than the others. For the same block size and filter length, the proposed structure for variable-coefficient FIR involves 71% less ADP and 65% less EPS than similar existing structures.
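To make the idea of a LUT decomposition factor concrete, the sketch below splits the coefficient set into groups of `m` taps, gives each group its own small table, and adds the group outputs before the shift-accumulate step. It is a generic bit-serial illustration under stated assumptions, not the bit-parallel block-based structure proposed in the abstract above. The names `m`, `lut_words`, and `da_decomposed` are hypothetical, and samples are assumed to be unsigned integers below `2**bits`.

```python
def build_lut(coeffs):
    """All subset sums of coeffs; bit i of the address selects coeffs[i]."""
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << len(coeffs))
    ]


def lut_words(k, m):
    """Total table words for k taps split into groups of at most m taps."""
    full, rem = divmod(k, m)
    return full * (1 << m) + ((1 << rem) if rem else 0)


def da_decomposed(coeffs, samples, bits=8, m=4):
    """DA inner product using one small LUT per group of m coefficients."""
    luts = [build_lut(coeffs[i:i + m]) for i in range(0, len(coeffs), m)]
    acc = 0
    for b in range(bits):
        partial = 0
        for g, lut in enumerate(luts):
            # Address for this group: bit b of each sample in the group.
            addr = 0
            for j, x in enumerate(samples[g * m:(g + 1) * m]):
                addr |= ((x >> b) & 1) << j
            partial += lut[addr]   # extra adder per additional group
        acc += partial << b
    return acc


if __name__ == "__main__":
    # Filter length 32: one undivided table versus eight 4-input tables.
    print(lut_words(32, 32), lut_words(32, 4))
```

The trade-off is visible in the word counts: decomposition shrinks memory from exponential in the filter length to exponential in `m`, at the cost of one more addition per extra group in every bit cycle.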
ISBN: 9781479968183 (print)
The paper describes the design of a 2-D discrete cosine transform (DCT), which is widely used in image and video compression algorithms. The objective is to design a fully parallel distributed arithmetic (DA) architecture for the 2-D DCT to be implemented on a field programmable gate array (FPGA). The DCT requires a large number of mathematical computations, including multiplications and accumulations. Multipliers increase power and area; hence they are eliminated entirely from the proposed design. Distributed arithmetic is a bit-level rearrangement of a sum of products, or vector dot product, that hides the multiplications. DA is well suited to FPGA designs because it reduces the size of the multiply-and-accumulate hardware. The fully parallel approach increases the speed of the proposed design. In this work, both an existing DA architecture for the 2-D DCT and the proposed area-efficient, fully parallel DA architecture for the 2-D DCT are realized. Simulation and synthesis are performed using Xilinx ISE.
This paper discusses the FPGA implementation of Finite Impulse Response (FIR) filters using distributed arithmetic (DA), which substitutes a series of look-up table (LUT) accesses for multiply-and-accumulate operations. A parallel FIR digital filter can be used for either high-speed or low-power applications. Distributed arithmetic provides a multiplication-free method for calculating inner products of fixed-point data, based on table lookups of precalculated partial products. Implementation results are provided to demonstrate that the proposed architecture is high-speed and low-power. The proposed filter is implemented in very high speed integrated circuit hardware description language (VHDL) and verified via simulation. The proposed method offers average reductions of 60% in the number of LUTs, 40% in occupied slices, and 50% in the number of gates for parallel FIR filter implementation. (C) 2015 The Authors. Published by Elsevier B.V.
ISBN: 9781479971039 (print)
Distributed arithmetic is a technique developed for real-time computation of the inner product of a vector with constant elements and a vector with varying elements. The inner product is computed without being split into separate multiplication and addition operations. Instead, the computation sums and shifts the inner products of the unchanging vector with successive bit-slices of the changing vector. All possible values of these partial inner products are calculated offline and stored in a look-up table (LUT). This paper proposes applying distributed arithmetic to the real-time computation of the product of changing matrices. In this case the LUT contents are computed dynamically, online. The contents of this memory stay fixed while the left matrix is multiplied by one column of the right matrix. Although the LUT contents must be computed, the total number of addition micro-operations is lower than in the classical method of computing a matrix product. The paper analyzes the computational complexity of the proposed approach as a function of matrix order and element word length. The approach is intended for implementing advanced digital signal processing algorithms on FPGAs.
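One plausible reading of the scheme in this abstract is sketched below: for each column of the right matrix a fresh LUT of subset sums is built online, then reused for every row of the left matrix, whose elements are bit-sliced to form the addresses. This is an illustrative interpretation, not the authors' algorithm. The function names are hypothetical, and the left-matrix elements are assumed to be unsigned integers below `2**bits`.

```python
def build_lut_incremental(vec):
    """Subset sums of vec using exactly one addition per table entry.

    Each entry reuses the entry with the lowest set address bit cleared.
    """
    lut = [0] * (1 << len(vec))
    for addr in range(1, len(lut)):
        low = addr & -addr                         # lowest set bit
        lut[addr] = lut[addr ^ low] + vec[low.bit_length() - 1]
    return lut


def da_matmul(a, b, bits=8):
    """Compute a @ b for lists of lists without any multiplications."""
    n_rows, n_inner, n_cols = len(a), len(b), len(b[0])
    c = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        # The LUT is rebuilt per column of b and stays fixed for all rows.
        lut = build_lut_incremental([b[k][j] for k in range(n_inner)])
        for i in range(n_rows):
            acc = 0
            for bit in range(bits):
                addr = 0
                for k in range(n_inner):
                    addr |= ((a[i][k] >> bit) & 1) << k
                acc += lut[addr] << bit
            c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[2, -1, 0], [1, 3, -2], [0, 4, 5]]
    for row in da_matmul(a, b):
        print(row)
```

Building the table incrementally matters here because the LUT cost is paid online for every column; whether the total addition count beats the classical method depends on the matrix order and word length, which is the subject of the complexity analysis mentioned in the abstract.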
ISBN: 9781479968190 (print)
The paper describes the design of a 2-D discrete cosine transform (DCT), which is widely used in image and video compression algorithms. The objective is to design a fully parallel distributed arithmetic (DA) architecture for the 2-D DCT to be implemented on a field programmable gate array (FPGA). The DCT requires a large number of mathematical computations, including multiplications and accumulations. Multipliers increase power and area; hence they are eliminated entirely from the proposed design. Distributed arithmetic is a bit-level rearrangement of a sum of products, or vector dot product, that hides the multiplications. DA is well suited to FPGA designs because it reduces the size of the multiply-and-accumulate hardware. The fully parallel approach increases the speed of the proposed design. In this work, both an existing DA architecture for the 2-D DCT and the proposed area-efficient, fully parallel DA architecture for the 2-D DCT are realized. Simulation and synthesis are performed using Xilinx ISE.