ISBN: 1595932925 (print)
The proceedings contain 22 papers. The topics discussed include: embedded floating-point units in FPGAs; measuring the gap between FPGAs and ASICs; optimality study of logic synthesis for LUT-based FPGAs; improvements to technology mapping for LUT-based FPGAs; improving performance and robustness of domain-specific CPLDs; design, implementation, and verification of active cache emulator (ACE); modeling and data-dependent performance of pattern-matching architectures; yield enhancements of design-specific FPGAs; FPGA clock network architecture: flexibility vs. area and power; a reconfigurable hardware based embedded scheduler for buffered crossbar switches; and combining module selection and resource sharing for efficient FPGA pipeline synthesis.
The aim of this paper is to propose a real-time reconfigurable (RTR) micro-FPGA using a new non-volatile memory. Magnetic tunneling junctions (MTJs), used in magnetic random access memories (MRAMs), are compatible with classical CMOS processes. Moreover, the remanent property of such a memory could limit the configuration time and power consumption required at each power-up of the die. Nevertheless, each configuration memory point has to be readable independently of the others, which is why the approach differs from the classical memory-array one. Copyright 2006 ACM.
This paper examines the tradeoffs between flexibility, area, and power dissipation of programmable clock networks for field-programmable gate arrays (FPGAs). The paper begins by describing a parameterized clock network model that covers a broad range of programmable clock network architectures. Specifically, the model supports architectures with multiple local and global clock domains and varying amounts of flexibility at various levels of the clock network. Using the model, the architectural parameters that control the flexibility of the clock network are varied to determine the cost of this flexibility in terms of area and power dissipation. From these experiments, the study finds that area and power costs are highest for networks with flexibility close to the logic blocks. Furthermore, it finds that clock networks with local clock domains have little overhead and are significantly more efficient than clock networks without local clock domains for applications with multiple clocks. Copyright 2006 ACM.
Division is one of the most complicated and expensive arithmetic operations. Both clock frequency and operation delay are limited by the memory wall, even in LUT-based FPGA devices. To overcome the memory limitation, we propose a hybrid division algorithm that combines Prescaling, Series expansion, and Taylor expansion (PST) algorithms. The proposed algorithm efficiently supports very-high-radix division. The algorithm is multiplicative and is feasible for modern FPGA devices with built-in multipliers. The algorithm is implemented in Altera Stratix II FPGA devices and compared with the division IP core generated by MegaWizard. The results show that the PST algorithm achieves a higher clock frequency, lower execution time, and lower power consumption. Copyright 2006 ACM.
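The general idea behind multiplicative division with prescaling and series expansion can be illustrated with a minimal sketch. This is not the PST algorithm from the paper, whose details the abstract does not give; it is a generic textbook-style scheme written under stated assumptions. The names `approx_reciprocal` and `divide`, the 6-bit table width, and the number of series terms are all hypothetical choices made for illustration, and floating-point arithmetic stands in for the fixed-point hardware datapath.

```python
def approx_reciprocal(d, table_bits=6):
    """Hypothetical stand-in for a small lookup table.

    Only the leading `table_bits` fractional bits of d select the entry,
    and the entry holds the reciprocal of the midpoint of that interval.
    """
    scale = 1 << table_bits
    truncated = int(d * scale) / scale       # leading bits of the divisor
    return 1.0 / (truncated + 0.5 / scale)   # reciprocal of interval midpoint


def divide(x, d, terms=3):
    """Approximate x / d for a normalized divisor 1 <= d < 2.

    Prescaling: multiply by r ~ 1/d so that d * r = 1 - e with small |e|.
    Series expansion: 1 / (1 - e) = 1 + e + e^2 + ...
    Only multiplications and additions are used after the table lookup.
    """
    if not 1.0 <= d < 2.0:
        raise ValueError("divisor must be normalized to [1, 2)")
    r = approx_reciprocal(d)
    e = 1.0 - d * r          # small residual left after prescaling
    series = 1.0
    power = 1.0
    for _ in range(terms):   # accumulate 1 + e + e^2 + ... + e^terms
        power *= e
        series += power
    return x * r * series
```

Because the residual `e` is small after prescaling, each extra series term adds several correct bits, which is why such schemes map well onto devices with hardware multipliers.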
The paper presents several improvements to the state of the art in FPGA technology mapping, exemplified by a recent advanced technology mapper, DAOmap [Chen and Cong, ICCAD '04]. Improved cut enumeration computes all K-feasible cuts without pruning for up to 7 inputs for the largest MCNC benchmarks. A new technique for on-the-fly cut dropping reduces by orders of magnitude the memory needed to represent cuts for large designs. Improved area recovery leads to mappings with area on average 7% smaller than DAOmap, while preserving delay optimality when starting from the same optimized netlists. Applying mapping with structural choices derived by a synthesis flow reduces delay by 7% and area by 14% on average, compared to DAOmap. Copyright 2006 ACM.
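For readers unfamiliar with K-feasible cuts, the following minimal sketch shows the standard bottom-up enumeration on a small netlist. It is an illustration only, not the paper's improved enumeration or its cut-dropping technique; the function name `enumerate_cuts` and the dictionary-based netlist format are assumptions made for this example.

```python
from itertools import product


def enumerate_cuts(fanins, k):
    """Enumerate all K-feasible cuts of every node, without pruning.

    `fanins` maps each node name to a tuple of its fanin node names and
    must list nodes in topological order; primary inputs map to ().
    A cut of a node is a set of nodes that separates it from the inputs;
    it is K-feasible when it has at most k members.
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}          # the trivial cut {node}
        if ins:
            # Pick one cut per fanin and merge them; keep merges of size <= k.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Hypothetical netlist: g1 = f(a, b), g2 = f(g1, c)
netlist = {"a": (), "b": (), "c": (), "g1": ("a", "b"), "g2": ("g1", "c")}
cuts_k3 = enumerate_cuts(netlist, 3)
# cuts_k3["g2"] holds {g2}, {g1, c}, and {a, b, c}
```

The number of cuts can grow quickly with K and circuit size, which is the memory pressure that motivates techniques such as discarding cuts of nodes once all their fanouts have been processed.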
Due to their generic and highly programmable nature, FPGAs provide the ability to implement a wide range of applications. However, it is this nonspecific nature that has limited the use of FPGAs in scientific applications that require floating-point arithmetic. Even simple floating-point operations consume a large amount of computational resources. In this paper, we introduce embedded floating-point multiply-add units in an island-style FPGA. This is shown to yield an average area savings of 55.0% and an average increase of 40.7% in clock rate over existing architectures. Copyright 2006 ACM.
Programmable logic devices such as FPGAs are useful for a wide range of applications. However, FPGAs are not commonly used in battery-powered applications because they consume more power than ASICs and lack power management features. In this paper, we describe the design and implementation of Pika, a low-power FPGA core targeting battery-powered applications such as those in consumer and automotive markets. Our design uses the Xilinx Spartan-3 low-cost FPGA as a baseline and achieves substantial power savings through a series of power optimizations. The resulting architecture is compatible with existing commercial design tools. The implementation is done in a 90 nm triple-oxide CMOS process. Compared to the baseline design, Pika consumes 46% less active power and 99% less standby power. Furthermore, it retains circuit and configuration state during standby mode and wakes up from standby mode in approximately 100 ns. Copyright 2006 ACM.
The performance benefits of a monolithically stacked 3D-FPGA, in which the programming overhead of an FPGA is stacked on top of a standard CMOS layer containing the logic blocks and interconnects, are investigated. A Virtex-II style 2D-FPGA fabric is used as a baseline for quantifying the relative improvements in logic density, delay, and power consumption achieved by such a 3D-FPGA. It is assumed that only the pass-transistor switches and configuration memory cells can be moved to the top layers and that the 3D-FPGA employs the same logic block and programmable interconnect architecture as the baseline 2D-FPGA. Assuming that a configuration memory cell with ≤ 0.7 times the area of an SRAM cell and pass-transistor switches with the same characteristics as nMOS devices in the CMOS layer are used, it is shown that a monolithically stacked 3D-FPGA can achieve 3.2 times higher logic density, 1.7 times lower critical path delay, and 1.7 times lower total dynamic power consumption than the baseline 2D-FPGA fabricated in the same 65 nm technology node. Copyright 2006 ACM.
Recent breakthroughs in the cryptanalysis of standard hash functions like SHA-1 and MD5 raise the need for alternatives. A credible alternative to, for instance, SHA-1 or the SHA-2 family of hash functions is Whirlpool. Whirlpool is a hash function that has been evaluated and approved by NESSIE and is standardized by ISO/IEC. To the best of our knowledge, only one FPGA implementation of Whirlpool has been published to date. That implementation is designed for high throughput rates and requires a considerable amount of hardware resources. In this article, we present a compact hardware implementation of the hash function Whirlpool. The proposed architecture uses an innovative state representation that makes it possible to reduce the required hardware resources remarkably. The complete implementation requires 1456 CLB slices and, most notably, no block RAMs. Copyright 2006 ACM.