This paper introduces a coarse-grained FPGA architecture specialized for high-performance Finite Impulse Response (FIR) filtering. The proposed architecture provides the flexibility of a DSP processor with performance and area efficiency similar to those of a custom ASIC design, while allowing all of the basic FIR design parameters, including coefficient precision, to be configured. Previous research has already shown that FPGAs can provide a high-performance alternative to DSP processors. Experimental comparisons in this paper show that the performance and area efficiency of the proposed architecture are similar to those of custom approaches across a wide range of filter sizes and configurations.
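As a point of reference for the configurable parameters mentioned above (tap count and coefficient precision), a direct-form FIR filter can be sketched in software as follows. This is only an illustration of the filtering operation, not the proposed hardware architecture; the function names and the fixed-point rounding scheme are assumptions.

```python
def quantize(coeffs, bits):
    """Round each coefficient to a fixed-point value with `bits` fractional bits (assumed scheme)."""
    scale = 1 << bits
    return [round(c * scale) / scale for c in coeffs]


def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum_k coeffs[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out


# A 3-tap filter with 8-bit fractional coefficient precision, driven by a unit impulse.
taps = quantize([0.25, 0.5, 0.25], bits=8)
impulse_response = fir_filter([1.0, 0.0, 0.0, 0.0], taps)
```

Feeding a unit impulse returns the (quantized) coefficients themselves, which is a convenient sanity check when changing the tap count or precision.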
ISBN: 9781450305549 (print)
Monte-Carlo arithmetic is a form of self-validating arithmetic that accounts for the effect of rounding errors. We have implemented a floating-point unit that can perform either IEEE 754 or Monte-Carlo floating-point computation, allowing hardware-accelerated validation of results during execution. Experiments show that our approach has a modest hardware overhead and allows the propagation of rounding error to be accurately estimated.
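The idea behind Monte-Carlo arithmetic can be sketched in software: operands and results are randomly perturbed at a chosen "virtual precision", the computation is repeated, and the spread of the results estimates the rounding error. The sketch below is an assumption-laden illustration, not the paper's hardware unit; the names `perturb`, `mca_sub`, and `estimate`, and the virtual precision `t`, are hypothetical.

```python
import math
import random
import statistics


def perturb(x, t, rng):
    """Add uniform noise at roughly the t-th significant bit of x."""
    if x == 0.0:
        return 0.0
    e = math.frexp(x)[1]  # x = m * 2**e with 0.5 <= |m| < 1
    return x + math.ldexp(rng.uniform(-0.5, 0.5), e - t)


def mca_sub(a, b, t, rng):
    """Subtraction with randomized inputs and a randomized result."""
    return perturb(perturb(a, t, rng) - perturb(b, t, rng), t, rng)


def estimate(a, b, t=24, trials=200, seed=0):
    """Repeat the randomized subtraction and return (mean, standard deviation)."""
    rng = random.Random(seed)
    results = [mca_sub(a, b, t, rng) for _ in range(trials)]
    return statistics.mean(results), statistics.stdev(results)


# Cancellation: subtracting nearly equal values gives a large relative spread.
mean_bad, spread_bad = estimate(1.0000001, 1.0)
# Well-conditioned subtraction: the relative spread stays small.
mean_ok, spread_ok = estimate(3.0, 1.0)
```

A large ratio of spread to mean flags a result whose significant digits have mostly been lost to rounding, which is the kind of validation the abstract describes performing in hardware.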
ISBN: 9780897919784 (print)
This paper presents the application of time-delay sonar beamforming and discusses a multi-board FPGA system for performing several variations of this beamforming method in real time for realistic sonar arrays. Additionally, we show that our proposed FPGA system has a six to twelve times performance advantage over an equivalent system created using currently available, high-performance DSPs designed for multiprocessing systems. This performance advantage is due to the simplicity of the core calculation, the limitations of the DSP's address calculation hardware, and the ability to customize the I/O of the FPGA to the application.
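The core calculation of time-delay beamforming is simple enough to sketch in a few lines: each sensor channel is shifted by a per-sensor delay and the shifted samples are summed. The version below assumes integer sample delays and a hypothetical function name; it illustrates the general delay-and-sum method, not the paper's specific variations.

```python
def delay_and_sum(channels, delays):
    """Shift each sensor channel by its integer sample delay, then average across sensors."""
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for samples, d in zip(channels, delays):
            idx = t - d
            if 0 <= idx < n:  # samples outside the record are treated as zero
                acc += samples[idx]
        out.append(acc / len(channels))
    return out


# A pulse reaches three sensors one sample apart.
channels = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
]
steered = delay_and_sum(channels, [2, 1, 0])    # delays matched to the arrival times
unsteered = delay_and_sum(channels, [0, 0, 0])  # no steering
```

With matched delays the three pulses add coherently into a single peak; without steering the same energy is smeared across three samples.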
ISBN: 9781595930293 (print)
Even with the HiCuts algorithm, one of the most effective algorithms for packet classification, the online search for each input packet still consumes a large share of the main CPU's computation resources when performed in software. An effective alternative is to use a hardware co-processor for the online search. Based on the principle of the HiCuts algorithm, this paper presents the architecture of a hardware online-search co-processor implemented on an FPGA. In particular, the mapping of the decision tree, and of the linear search in each leaf node, onto the memory data structure is described in detail. Thanks to a multiple-pipeline structure, a total of 12 search engines work in parallel to achieve a very high search speed (8M packet headers/second). The simulation results provide useful guidance for optimizing the off-line preprocessing and the co-processor design.
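The two-stage search described above (walk a decision tree, then search the leaf linearly) can be sketched as follows. This is a simplified illustration under stated assumptions: real HiCuts chooses the cut dimension and the number of cuts heuristically, whereas this sketch cycles through dimensions with a fixed number of equal-width cuts. The constants `BINTH` and `NUM_CUTS`, the depth limit, and all function names are hypothetical.

```python
BINTH = 2     # hypothetical leaf threshold: max rules searched linearly
NUM_CUTS = 4  # hypothetical number of equal-width cuts per internal node


def overlaps(ranges, box):
    """True if a rule's per-dimension ranges intersect the node's region."""
    return all(r_lo <= b_hi and b_lo <= r_hi
               for (r_lo, r_hi), (b_lo, b_hi) in zip(ranges, box))


def build(rules, box, depth=0):
    """rules: list of (index, ranges) in priority order; box: region covered by this node."""
    dim = depth % len(box)  # simplification: cycle through dimensions
    lo, hi = box[dim]
    width = (hi - lo + 1) // NUM_CUTS
    if len(rules) <= BINTH or depth >= 8 or width == 0:
        return {"leaf": rules}
    children = []
    for i in range(NUM_CUTS):
        c_lo = lo + i * width
        c_hi = hi if i == NUM_CUTS - 1 else c_lo + width - 1
        child_box = box[:dim] + [(c_lo, c_hi)] + box[dim + 1:]
        subset = [r for r in rules if overlaps(r[1], child_box)]
        children.append(build(subset, child_box, depth + 1))
    return {"dim": dim, "lo": lo, "width": width, "children": children}


def classify(node, packet):
    """Descend to a leaf by index arithmetic, then search its rules linearly."""
    while "leaf" not in node:
        i = min((packet[node["dim"]] - node["lo"]) // node["width"], NUM_CUTS - 1)
        node = node["children"][i]
    for index, ranges in node["leaf"]:  # first match wins (highest priority)
        if all(lo <= v <= hi for v, (lo, hi) in zip(packet, ranges)):
            return index
    return None


# Four two-field rules over an 8-bit by 8-bit header space, highest priority first.
rules = [
    (0, [(0, 63), (0, 255)]),
    (1, [(64, 127), (0, 127)]),
    (2, [(0, 255), (128, 255)]),
    (3, [(0, 255), (0, 255)]),
]
tree = build(rules, [(0, 255), (0, 255)])
```

Because each tree step is a subtraction, a division by a fixed width, and a table lookup, the traversal maps naturally onto pipelined memory accesses, which is presumably what makes a hardware co-processor attractive.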
This paper proposes a new high-level technique for designing fault-tolerant systems in SRAM-based FPGAs without modifying the FPGA architecture. Traditionally, triple modular redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The proposed technique was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments on an emulation board. We present some fault coverage results and a comparison with the TMR approach.
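For reference, the baseline TMR scheme that the abstract compares against can be sketched in software as a bitwise majority vote over three redundant copies. This illustrates TMR only, not the proposed technique, whose details the abstract does not give; the function name is hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority: each output bit follows at least two of the inputs."""
    return (a & b) | (a & c) | (b & c)


# One of the three redundant module outputs suffers a transient bit flip.
golden = 0b1010
upset = golden ^ 0b0100  # single-bit fault injected into one copy
voted = majority_vote(golden, upset, golden)
```

The voter masks any fault confined to a single copy, which is also why TMR roughly triples area and power.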
The routing channels of today's FPGAs consist of wire segments of various types. This routing architecture lets us exploit new techniques to enhance the routability of net segments in channels in order to support engineering change orders (ECO). In this paper we present an optimal greedy algorithm that switches the track to which each net segment is assigned, in order to enhance the routability of newly added nets and enable ECO. We used the routing architecture of Xilinx Virtex-II FPGAs as our target and integrated our algorithm into the VPR FPGA routing tool. The experimental results show that the algorithm reduces the number of tracks by 9% on average. It allows 28.4% more rerouting than the existing router of the VPR tool, which is based on Dijkstra's maze router algorithm.
ISBN: 9781581131932 (print)
In this paper we introduce a new Simulated Annealing-based timing-driven placement algorithm for FPGAs. This paper has three main contributions. First, our algorithm employs a novel method of determining source-sink connection delays during placement. Second, we introduce a new cost function that trades off between wire use and critical path delay, resulting in significant reductions in critical path delay without significant increases in wire use. Finally, we combine connection-based and path-based timing analysis to obtain an algorithm that has the low time complexity of connection-based timing-driven placement, while obtaining the quality of path-based timing-driven placement. A comparison of our new algorithm to a well-known non-timing-driven placement algorithm demonstrates that our algorithm is able to increase the post-place-and-route speed (using a full path-based timing-driven router and a realistic routing architecture) of 20 MCNC benchmark circuits by an average of 42%, while only increasing the minimum wiring requirements by an average of 5%.
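One way to picture a cost function that trades off wire use against critical path delay is a weighted sum of normalized cost changes, evaluated for each proposed swap inside the annealing loop. The sketch below assumes that form; the abstract does not state the exact formulation, so the weight `lam`, the normalization by previous costs, and the function names are assumptions.

```python
import math
import random


def delta_cost(d_timing, prev_timing, d_wiring, prev_wiring, lam=0.5):
    """Weighted sum of normalized timing and wiring cost changes (assumed form)."""
    return lam * d_timing / prev_timing + (1.0 - lam) * d_wiring / prev_wiring


def accept(delta, temperature, rng):
    """Standard annealing rule: always accept improvements, sometimes accept uphill moves."""
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / temperature)


# A swap that shortens timing cost by 20% while lengthening wiring by 1%.
change = delta_cost(d_timing=-2.0, prev_timing=10.0, d_wiring=1.0, prev_wiring=100.0)
taken = accept(change, temperature=1.0, rng=random.Random(0))
```

Normalizing each term by its previous cost keeps the two objectives comparable regardless of their units, so a single weight controls the trade-off.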
ISBN: 9781450305549 (print)
In recent years, the classic method of Coordinate Rotation by Digital Computer (CORDIC) arithmetic has been widely implemented as part of the computational requirements of the well-known QR-RLS (Recursive Least Squares) algorithm. In order to apply Givens rotation to a complex number system, double angle complex rotation (DACR) was adopted to simplify the computational requirement of Complex Givens Rotation. This paper presents a new architecture for a high-speed, CORDIC-based single Processor Element (PE) that can be used to perform complex-valued, QR-update-based RLS. Implementation results on a Xilinx FPGA demonstrate that the proposed structure achieves lower latency and lower cost.
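The basic CORDIC rotation that such a processing element builds on can be sketched as a sequence of micro-rotations by angles atan(2^-i), each of which needs only shifts and adds in hardware. The sketch below is the textbook real-valued rotation mode in floating point, not the DACR-based complex architecture of the paper; the function name and iteration count are assumptions.

```python
import math


def cordic_rotate(x, y, angle, iterations=24):
    """Rotate (x, y) by `angle` radians (|angle| up to about 1.74) using CORDIC micro-rotations."""
    # Accumulated gain of the micro-rotations, removed at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0       # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)     # in hardware this is a small lookup table
    return x / gain, y / gain


cx, cy = cordic_rotate(1.0, 0.0, math.pi / 4)
```

Each iteration contributes roughly one bit of angular accuracy, so the iteration count directly trades latency against precision.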
Modern FPGA architectures provide ample routing resources so that designs can be routed successfully. The routing architecture is designed to handle versatile connection configurations. However, providing such great flexibility comes at a high cost in terms of area, delay and power. We propose a new FPGA routing architecture that utilizes a mixture of hard-wired and traditional flexible switches. The result is a 24% reduction in leakage power consumption, 7% smaller area and 24% shorter delays, which translates to a 30% increase in clock frequency. Despite the increase in clock speeds, the overall power consumption is reduced by 8%. Copyright 2005 ACM.
ISBN: 9781595939340 (print)
Synthesizing common sequential algorithms, captured in a language like C, to FPGA circuits is now well known to provide dramatic speedups for numerous applications, and to provide tremendous portability and adaptability advantages over circuit implementations of an application. However, many applications targeted to FPGAs are still designed and distributed at the circuit level, due in part to tremendous human ingenuity being exercised at that level to achieve exceptional performance and efficiency. A question then arises as to whether applications for FPGAs will have to be distributed as circuits to achieve the desired performance and efficiency, or if instead a more portable language like C might be used. Given a set of common synthesis transformations, we studied the extent to which circuits published in FCCM in the past 6 years could be captured as sequential code and then synthesized back to the published circuit. The study showed that a surprising 82% of the 35 circuits chosen for the study could be rederived from some form of standard C code, suggesting that standard C code, without extensions, may be an effective means for distributing FPGA applications.