Due to their generic and highly programmable nature, FPGAs provide the ability to implement a wide range of applications. However, it is this nonspecific nature that has limited the use of FPGAs in scientific applicat...
详细信息
ISBN:
(纸本)1595932925
Due to their generic and highly programmable nature, FPGAs provide the ability to implement a wide range of applications. However, it is this nonspecific nature that has limited the use of FPGAs in scientific applications that require floating-point arithmetic. Even simple floating-point operations consume a large amount of computational resources. In this paper, we introduce embedding floating-point multiply-add units in an island style FPGA. This has shown to have an average area savings of 55.0% and an average increase of 40.7% in clock rate over existing architectures. Copyright 2006 acm.
RapidSmith is an open-source framework that allows for the exploration of novel approaches to the FPGA CAD flow for Xilinx devices. However, RapidSmith has poor sup- port for manipulating designs below the slice level...
详细信息
We are proposing a shared-memory communication infrastructure that provides a common parallel programming interface for FPGA and CPU components in a heterogeneous system. Our intent is to ease the integration of recon...
详细信息
This paper discusses architectural issues arising from the use of dynamic reconfiguration and shows a possible use of dynamic reconfiguration to extend and accelerate a computation performed in system-on-a-chip design...
详细信息
This paper discusses architectural issues arising from the use of dynamic reconfiguration and shows a possible use of dynamic reconfiguration to extend and accelerate a computation performed in system-on-a-chip designs with microprocessors with fixed instruction sets. Further a sample application is discussed that uses a dynamically reconfigurable FPGA to implement different floating-point calculations in hardware, reconfigured as required by the execution of the user code. The implementation data for two dynamically reconfigurable platforms available on the market - the Xilinx Virtex2 family FPGAs and the Atmel FPSLIC family FPGAs - is compared in terms of resource requirements, operating frequency, and power consumption.
This paper presents an analysis of the potential yield loss in FPGA due to random defects in metal layers. A proven yield model is adapted to target the FPGA interconnect layers in order to predict the manufacturing y...
详细信息
ISBN:
(纸本)9781595930293
This paper presents an analysis of the potential yield loss in FPGA due to random defects in metal layers. A proven yield model is adapted to target the FPGA interconnect layers in order to predict the manufacturing yield. Defect parameters from the 2003 SIA roadmap are used to investigate the trend in yield loss due to defects in interconnect layers in the future. It is shown that the low yield predicted for the 45nm technology node and beyond is a cause for concern. The potential impact on yield using two different approaches, namely redundant circuits and fault tolerant design, is also presented. Copyright 2005 acm.
Advanced Microelectronics Department at Sandia National Laboratories. We present an automatic logic synthesis method targeted for high-performance asynchronous FPGA (AFPGA) architectures. Our method transforms sequent...
详细信息
ISBN:
(纸本)9781595930293
Advanced Microelectronics Department at Sandia National Laboratories. We present an automatic logic synthesis method targeted for high-performance asynchronous FPGA (AFPGA) architectures. Our method transforms sequential programs as well as high-level descriptions of asynchronous circuits into fine-grain asynchronous process netlists suitable for an AFPGA. The resulting circuits are inherently pipelined, and can be physically mapped onto our AFPGA with standard partitioning and place-and-route algorithms. For a wide variety of benchmarks, our automatic synthesis method not only yields comparable logic densities and performance to those achieved by hand placement, but also attains a throughput close to the peak performance of the FPGA. Copyright 2005 acm.
C-slow retiming is a process of automatically increasing the throughput of a design by enabling fine grained pipelining of problems with feedback loops. This transformation is especially appropriate when applied to FP...
详细信息
C-slow retiming is a process of automatically increasing the throughput of a design by enabling fine grained pipelining of problems with feedback loops. This transformation is especially appropriate when applied to FPGA designs because of the large number of available registers. To demonstrate and evaluate the benefits of C-slow retiming, we constructed an automatic tool which modifies designs targeting the Xilinx Virtex family of FPGAs. Applying our tool to three benchmarks: AES encryption. Smith/Waterman sequence matching, and the LEON 1 synthesized microprocessor core, we were able to substantially increase the total throughput. For some parameters, throughput is effectively doubled.
While system reliability is conventionally achieved through component replication, we have developed a fault-tolerance approach for FPGA-based systems that comes at a reduced cost in terms of design time, volume, and ...
详细信息
ISBN:
(纸本)9780897919784
While system reliability is conventionally achieved through component replication, we have developed a fault-tolerance approach for FPGA-based systems that comes at a reduced cost in terms of design time, volume, and weight. We partition the physical design into a set of tiles. In response to a component failure, we capitalize on the unique reconfiguration capabilities of FPGAs and replace the affected tile with a functionally equivalent tile that does not rely on the faulty component. Unlike fixed structure fault-tolerance techniques for ASICs and microprocessors, this approach allows a single physical component to provide redundant backup for several types of components. Experimental results conducted on a subset of the MCNC benchmarks demonstrate a high level of reliability with low timing and hardware overhead.
Division is one of the most complicated and expensive arithmetic operations. Both clock frequency and operation delay are limited by the memory wall, even in LUT-based FPGA devices. To conquer the memory limitation, w...
详细信息
ISBN:
(纸本)1595932925
Division is one of the most complicated and expensive arithmetic operations. Both clock frequency and operation delay are limited by the memory wall, even in LUT-based FPGA devices. To conquer the memory limitation, we propose a hybrid division algorithm which employs Prescaling, Series expansion and Taylor expansion (PST) algorithms. The proposed algorithm boosts very-high radix division efficiently. The algorithm is multiplicative, and feasible for the modern FPGA devices with build-in multipliers. The algorithm is implemented in Altera StratixII FPGA devices and compared with the division IP core generated by Mega Wizard. The result shows that the PST algorithm has higher clock frequency, lower execution time and also lower power consumption. Copyright 2006 acm.
fieldprogrammablegatearrays (FPGAs) are being used to provide fast Internet Protocol (IP) packet routing and advanced queuing in a highly scalable network switch. A new module, called the field-programmable Port Ex...
详细信息
ISBN:
(纸本)9781581131932
fieldprogrammablegatearrays (FPGAs) are being used to provide fast Internet Protocol (IP) packet routing and advanced queuing in a highly scalable network switch. A new module, called the field-programmable Port Extender (FPX), is being built to augment the Washington University Gigabit Switch (WUGS) with reprogrammable logic. FPX modules reside at the edge of the WUGS switching fabric. Physically, the module is inserted between an optical line card and the WUGS gigabit switch back-plane. The hardware used for this project allows ports of the switch populated with an FPX to operate at rates up to 2.4 Gigabits/second. The aggregate throughput of the system scales with the number of switch ports. Logic on the FPX module is implemented with two FPGA devices. The first device is used to interface between the switch and the line card, while the second is used to prototype new networking functions and protocols. The logic on the second FPGA can be re-programmed dynamically via control cells sent over the network.
暂无评论