ISBN: 9781595936004 (print)
Reconfigurable computing can provide a significant speed-up for cryptographic and error-correcting-code algorithms. Finite field arithmetic is essential to both but is difficult to implement efficiently. Finite field instruction set extensions and a reconfiguration framework have been constructed to enable a finite field multiplier to be regenerated under software control. Performance was evaluated by generating a Finite Field Extensions Unit with a MicroBlaze processor in a Xilinx Virtex2Pro FPGA. By utilizing the in-system partial reconfiguration capability, the finite field multiplier can be customized to a particular field size and definition. With a customized GF(2^163) multiplier, a speed-up factor of 1530X has been demonstrated versus execution of the same algorithm on the MicroBlaze processor alone.
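For readers unfamiliar with binary-field arithmetic, the following is a minimal software sketch of the kind of GF(2^m) multiplication that such a multiplier accelerates. It is not taken from the paper: the function name `gf2m_mul` is hypothetical, and the reduction polynomial x^163 + x^7 + x^6 + x^3 + 1 is an assumption (it is the one used for the NIST 163-bit binary curves; the abstract does not say which field definition was used).

```python
def gf2m_mul(a: int, b: int, m: int, poly: int) -> int:
    """Multiply a and b in GF(2^m) using bit-serial shift-and-add.

    Field elements are integers whose bit i is the coefficient of x^i.
    `poly` is the reduction polynomial including its x^m term.
    Both inputs are assumed to be already reduced (less than 2**m).
    """
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^m) is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if (a >> m) & 1:
            a ^= poly        # reduce modulo the field polynomial
    return result


# Assumed field definition (NIST binary-curve polynomial for m = 163).
POLY_163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1

x = (1 << 162) | 0x1234567
y = (1 << 100) | 0xABCDEF
product_163 = gf2m_mul(x, y, 163, POLY_163)
```

In software the loop runs once per bit of the operand, which suggests why a dedicated hardware multiplier sized to the field can be so much faster than a general-purpose soft processor.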
The proceedings contain 64 papers from the ACM/SIGDA Thirteenth ACM International Symposium on Field-Programmable Gate Arrays - FPGA 2005. The topics discussed include: the Stratix II logic and routing architecture; skew-programmable clock design for FPGA and skew-aware placement; sparse matrix-vector multiplication on FPGAs; power modeling and architecture evaluation for FPGA with novel circuits for VDD programmability; architecture adaptive routability-driven placement for FPGAs; energy-efficient FPGA interconnect architecture design; 3D-Softchip: A novel 3D vertically integrated adaptive computing system; dynamic reconfiguration in FPGA-based SoC designs; and rapid prototyping of a test harness for forward error correcting codes.
ISBN: 1595932925 (print)
Due to their generic and highly programmable nature, FPGAs provide the ability to implement a wide range of applications. However, it is this nonspecific nature that has limited the use of FPGAs in scientific applications that require floating-point arithmetic. Even simple floating-point operations consume a large amount of computational resources. In this paper, we introduce embedding floating-point multiply-add units in an island-style FPGA. This has been shown to yield an average area savings of 55.0% and an average increase of 40.7% in clock rate over existing architectures. Copyright 2006 ACM.
ISBN: 1595932925 (print)
The aim of this paper is to propose a real-time reconfigurable (RTR) micro-FPGA using a new non-volatile memory. Magnetic tunneling junctions (MTJ), used in magnetic random access memories (MRAM), are compatible with classical CMOS processes. Moreover, the remanent property of such a memory could limit the configuration time and power consumption required at each power-up of the die. Nevertheless, each configuration memory point has to be readable independently of the others, which is why the approach differs from the classical memory-array one. Copyright 2006 ACM.
ISBN: 1595932925 (print)
This paper examines the tradeoffs between flexibility, area, and power dissipation of programmable clock networks for field-programmable gate arrays (FPGAs). The paper begins by describing a parameterized clock network model that describes a broad range of programmable clock network architectures. Specifically, the model supports architectures with multiple local and global clock domains and varying amounts of flexibility at various levels of the clock network. Using the model, the architectural parameters that control the flexibility of the clock network are varied to determine the cost of this flexibility in terms of area and power dissipation. From these experiments, the study finds that area and power costs are highest for networks with flexibility close to the logic blocks. Furthermore, it finds that clock networks with local clock domains have little overhead and are significantly more efficient than clock networks without local clock domains for applications with multiple clocks. Copyright 2006 ACM.
ISBN: 1595932925 (print)
Division is one of the most complicated and expensive arithmetic operations. Both clock frequency and operation delay are limited by the memory wall, even in LUT-based FPGA devices. To overcome the memory limitation, we propose a hybrid division algorithm that employs Prescaling, Series expansion and Taylor expansion (PST) algorithms. The proposed algorithm supports very-high-radix division efficiently. The algorithm is multiplicative and feasible for modern FPGA devices with built-in multipliers. The algorithm is implemented in Altera Stratix II FPGA devices and compared with the division IP core generated by Mega Wizard. The results show that the PST algorithm has higher clock frequency, lower execution time and also lower power consumption. Copyright 2006 ACM.
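As a rough illustration of what "multiplicative" division means, the sketch below computes a reciprocal using only a small table lookup, multiplications and additions. It is not the PST algorithm itself, whose details are not given in the abstract: the function name `reciprocal_series`, the table size and the number of series terms are all assumptions chosen for illustration. The divisor is prescaled by a coarse reciprocal estimate so that the product is close to 1, and the remaining factor is expanded as the series 1/(1 - e) = 1 + e + e^2 + ...

```python
import math


def reciprocal_series(d: float, table_bits: int = 4, terms: int = 4) -> float:
    """Approximate 1/d for d > 0 with a table lookup plus a truncated series.

    Illustrative only: table_bits and terms are assumed parameters.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    mant, exp = math.frexp(d)               # d = mant * 2**exp, mant in [0.5, 1)
    # Prescaling: coarse reciprocal taken from the top bits of the mantissa,
    # as a small lookup table would provide in hardware.
    idx = int(mant * (1 << table_bits))
    r0 = (1 << table_bits) / (idx + 0.5)
    e = 1.0 - mant * r0                     # small, since mant * r0 is near 1
    # Series expansion of 1 / (1 - e), truncated after `terms` terms.
    series, power = 0.0, 1.0
    for _ in range(terms):
        series += power
        power *= e
    return math.ldexp(r0 * series, -exp)    # undo the exponent scaling


def divide(n: float, d: float) -> float:
    """Quotient n / d formed as n times the approximate reciprocal of d."""
    return n * reciprocal_series(d)
```

With the default 4-bit table, |e| stays below about 0.06, so four series terms leave a relative error on the order of 1e-5; a larger table or more terms trades memory or multiplications for accuracy.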
ISBN: 1595932925 (print)
The paper presents several improvements to the state of the art in FPGA technology mapping, exemplified by a recent advanced technology mapper, DAOmap [Chen and Cong, ICCAD '04]. Improved cut enumeration computes all K-feasible cuts without pruning for up to 7 inputs for the largest MCNC benchmarks. A new technique for on-the-fly cut dropping reduces by orders of magnitude the memory needed to represent cuts for large designs. Improved area recovery leads to mappings with area on average 7% smaller than DAOmap, while preserving delay optimality when starting from the same optimized netlists. Applying mapping with structural choices derived by a synthesis flow on average reduces delay by 7% and area by 14%, compared to DAOmap. Copyright 2006 ACM.
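To make the notion of K-feasible cuts concrete, here is a minimal sketch of the textbook bottom-up cut enumeration, without the paper's improvements (no cut dropping, no dominance filtering). The function name `enumerate_cuts` and the dictionary-based netlist representation are assumptions made for this example. A cut of a node is a set of nodes that separates it from the primary inputs; it is K-feasible if it has at most K members, so the logic between the cut and the node fits in one K-input LUT.

```python
from itertools import product


def enumerate_cuts(fanins: dict, k: int) -> dict:
    """Return all K-feasible cuts of every node.

    `fanins` maps each node to a tuple of its fanin nodes (an empty tuple
    for primary inputs) and must list nodes in topological order.
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}            # the trivial cut
        if ins:
            # Merge one cut from each fanin; keep unions of size <= k.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Tiny example netlist: x = f(a, b), y = g(x, c).
netlist = {"a": (), "b": (), "c": (), "x": ("a", "b"), "y": ("x", "c")}
cuts_k3 = enumerate_cuts(netlist, 3)
```

Because every node stores the cross product of its fanins' cut sets, the number of stored cuts grows quickly with K and with design size, which is the memory cost that cut dropping is meant to reduce.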
ISBN: 1595932925 (print)
Programmable logic devices such as FPGAs are useful for a wide range of applications. However, FPGAs are not commonly used in battery-powered applications because they consume more power than ASICs and lack power management features. In this paper, we describe the design and implementation of Pika, a low-power FPGA core targeting battery-powered applications such as those in consumer and automotive markets. Our design uses the Xilinx Spartan-3 low-cost FPGA as a baseline and achieves substantial power savings through a series of power optimizations. The resulting architecture is compatible with existing commercial design tools. The implementation is done in a 90nm triple-oxide CMOS process. Compared to the baseline design, Pika consumes 46% less active power and 99% less standby power. Furthermore, it retains circuit and configuration state during standby mode, and wakes up from standby mode in approximately 100ns. Copyright 2006 ACM.
ISBN: 1595932925 (print)
While previous research has shown that FPGAs can efficiently implement many types of computations, their flexibility inherently limits their clock rate. Several research groups have attempted to address this by developing new architectures that include registered switchpoints within their interconnect. Unfortunately, this pipelined communication presents a new and difficult problem for detailed routing tools. Known as the N-Delay Routing Problem, it has been proven to be NP-Complete. Although two heuristics have been developed to address this issue, both have certain limitations and neither approach considers timing during the routing process. While timing-driven conventional routing is largely considered to be a solved problem, several issues inherent to the N-Delay Routing Problem make addressing timing particularly difficult. In this paper we discuss the nature of these problems and present a new timing-driven pipeline-aware router that produces as much as 60% better critical path delay than previous efforts. Copyright 2006 ACM.