The proceedings contain 26 papers from FPGA 2002, the Tenth ACM International Symposium on Field-Programmable Gate Arrays. Topics discussed include: interconnect enhancements for a high-speed PLD architecture; FPGA switch block layout and evaluation; a faster distributed arithmetic architecture for FPGAs; efficient circuit clustering for area and power reduction in FPGAs; and integrated retiming and placement for field-programmable gate arrays.
ISBN: 9781581134520 (print)
This paper presents a review of some existing architectures for the implementation of Montgomery modular multiplication and exponentiation on FPGAs (field-programmable gate arrays). Some new architectures are presented, including a pipelined architecture that exploits the maximum carry chain length of the FPGA and is used to implement the modular exponentiation operation required for RSA encryption and decryption. Speed and area comparisons are performed on the optimised designs. The issues of targeting a design specifically for a reconfigurable device are considered, taking into account the underlying architecture imposed by the target technology.
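For readers unfamiliar with the operation named in this abstract, the following is a minimal software sketch of textbook Montgomery multiplication and a square-and-multiply exponentiation built on it. It is not the pipelined hardware architecture from the paper; the function names (`montgomery_setup`, `mont_mul`, `mont_pow`) and the choice R = 2^k with k equal to the bit length of the modulus are assumptions made for illustration. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n < 2**k (illustrative helper)."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime == -1 (mod r)
    return r, n_prime


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n and R = 2**k."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # makes t + m*n divisible by R
    u = (t + m * n) >> k               # exact division by R
    return u - n if u >= n else u      # single conditional subtraction


def mont_pow(base, exp, n):
    """Compute base**exp mod n (n odd) using Montgomery multiplications."""
    k = n.bit_length()
    r, n_prime = montgomery_setup(n, k)
    x = mont_mul(base % n, (r * r) % n, n, k, n_prime)  # base in Montgomery form
    acc = r % n                                         # 1 in Montgomery form
    for bit in bin(exp)[2:]:                            # left-to-right binary method
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)
    return mont_mul(acc, 1, n, k, n_prime)              # leave Montgomery form


if __name__ == "__main__":
    # Cross-check against Python's built-in modular exponentiation.
    print(mont_pow(4, 13, 497) == pow(4, 13, 497))
```

The appeal for hardware is that the reduction step needs only masking, shifting, and additions rather than trial division, which is why designs of this kind lean on the FPGA's carry chains.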
ISBN: 9781581134520 (print)
As device densities increase, testing cost is becoming a larger portion of the overall FPGA manufacturing cost. We present an approach to speed up testing of FPGA interconnect by reconfiguring it during the test. Simple additions are made to create feedback shift register structures, which considerably reduce the number of test configurations for the switching matrix interconnect. This new testing architecture reduces switching matrix test time by 66% and diagnosis time by 72%. The additions are transparent to users in terms of both speed and functionality.
ISBN: 9781581134520 (print)
Random number generators (RNGs) based upon neighborhood-of-four cellular automata (CA) with asymmetrical, non-local connections are explored. A number of RNGs that pass Marsaglia's rigorous Diehard suite of random number tests have been discovered. A neighborhood size of four allows a single CA cell to be implemented with a four-input lookup table and a one-bit register, which are common building blocks in popular field-programmable gate arrays (FPGAs). The investigated networks all had periodic (wrap-around) boundary conditions with either 1-d, 2-d, or 3-d interconnection topologies. Trial designs of 64-bit networks using a Xilinx XCV 1000-6 FPGA predict a maximum clock rate of 214 MHz to 230 MHz, depending upon interconnection topology.
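The cell structure described in this abstract (a four-input lookup table feeding a one-bit register, with wrap-around connections) can be sketched in a few lines. The rule value and neighbor offsets below are hypothetical placeholders chosen only to make the example easy to verify by hand; they are not the rules or topologies found in the paper, and nothing here is claimed to pass Diehard.

```python
def ca_step(state, rule, offsets):
    """Advance a 1-d cellular automaton by one clock.

    state   -- list of 0/1 cell values (the one-bit registers)
    rule    -- 16-bit integer used as the truth table of a 4-input LUT
    offsets -- four neighbor offsets per cell; indices wrap around (periodic)
    """
    n = len(state)
    nxt = []
    for i in range(n):
        idx = 0
        for off in offsets:                 # gather the four LUT inputs
            idx = (idx << 1) | state[(i + off) % n]
        nxt.append((rule >> idx) & 1)       # LUT lookup
    return nxt


if __name__ == "__main__":
    XOR4_RULE = 0x6996           # hypothetical rule: XOR of the four inputs
    OFFSETS = (-1, 0, 1, 3)      # hypothetical asymmetric, non-local wiring
    cells = [1, 0, 0, 0, 0, 0, 0, 0]
    for _ in range(4):
        cells = ca_step(cells, XOR4_RULE, OFFSETS)
        print("".join(map(str, cells)))
```

Because every cell updates from the previous state only, all cells can be evaluated in parallel, which matches the one-LUT-plus-one-register mapping the abstract describes.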
ISBN: 9781581134520 (print)
As programmable logic grows more viable for implementing full design systems, performance has become a primary issue for programmable logic device architectures. This paper presents the high-level design of Dali, a PLD architecture specifically aimed at performance-driven applications. We present significant portions of the background research that contributed to our architectural decisions, an overview of the core routing architecture, and the benchmarking experiments used to evaluate the prototype device.
As the capacity of FPGAs increases to millions of equivalent gates, the use of Intellectual Property (IP) cores becomes increasingly important to control design complexity. FPGAs are becoming platforms for integrating a system solution from components supplied by independent vendors, in the same way as printed circuit boards provided a platform for earlier generations of designers. However, the current commercial model for IP cores involves large up-front license fees reminiscent of ASIC NRE charges. In order to match the IP core business model to the low- to medium-volume applications addressed by FPGA customers, it is important to develop cryptographic techniques which allow IP core vendors to sell their product on a pay-per-use basis rather than through up-front license fees.
ISBN: 9781581134520 (print)
This paper analyzes the dynamic power consumption in the fabric of field-programmable gate arrays (FPGAs) by taking advantage of both simulation and measurement. Our target device is the Xilinx Virtex™-II family, which contains the most recent and largest programmable fabric. We identify important resources in the FPGA architecture and obtain their utilization, using a large set of real designs. Then, using a number of representative case studies, we calculate the switching activity corresponding to each resource. Finally, we combine the effective capacitance of each resource with its utilization and switching activity to estimate its share of power consumption. According to our results, the power dissipation shares of routing, logic, and clocking resources are 60%, 16%, and 14%, respectively. We also conclude that the dynamic power dissipation of a Virtex-II CLB is 5.9 μW per MHz for typical designs, but it may vary significantly depending on the switching activity.
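As a rough illustration of how the per-CLB figure quoted above might be used, the sketch below scales 5.9 μW per MHz by a CLB count and a clock frequency. The linear scaling, the function name `clb_dynamic_power_mw`, and the example design size are assumptions for illustration; the abstract itself cautions that the real value varies significantly with switching activity.

```python
CLB_POWER_UW_PER_MHZ = 5.9  # typical Virtex-II CLB figure quoted in the abstract


def clb_dynamic_power_mw(num_clbs, clock_mhz, uw_per_mhz=CLB_POWER_UW_PER_MHZ):
    """Back-of-envelope dynamic power in mW, assuming simple linear scaling.

    Ignores design-specific switching activity, so treat the result as an
    order-of-magnitude estimate only.
    """
    return num_clbs * clock_mhz * uw_per_mhz / 1000.0  # convert uW to mW


if __name__ == "__main__":
    # Hypothetical design: 1000 CLBs clocked at 100 MHz.
    print(f"{clb_dynamic_power_mw(1000, 100):.1f} mW")
```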
ISBN: 9781581134520 (print)
Field-programmable core arrays (FPCAs) will include various computing cores for a wide variety of applications ranging from DSP to general-purpose computing. With the increasing gap between core computing speeds and memory access latency, managing and orchestrating the movement of data across multiple cores will become increasingly important. In this paper we propose data reorganization engines that allow a wide variety of data reorganizations within as well as between memory modules for future FPCAs. We have experimented with a suite of data reorganizations pervasive in DSP applications. Our limited set of experiments reveals that the proposed designs for these engines are flexible and use little design area in current FPGA fabrics, making them easy to integrate in future FPCAs as either soft or hard macros.
ISBN: 9781581134520 (print)
This paper presents abstract layout techniques for a variety of FPGA switch block architectures. We evaluate the relative density of subset, universal, and Wilton switch block architectures. For subset switch blocks of small size, we find the optimal implementations using a simple metric. We also develop a tractable heuristic that returns the optimal results for small switch blocks and good results for large switch blocks. For switch blocks with general connectivity, we develop a representation and a layout evaluation technique. We use these techniques to compare a variety of small switch blocks. We find that the traditional Xilinx-style subset switch block is superior to the other proposed architectures. Finally, we have hand-designed some small switch blocks to confirm our results.
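To make the "subset" pattern concrete: in a subset (also called disjoint) switch block, track i on one side can connect only to track i on each of the other three sides. The sketch below enumerates those connections for a given channel width. The side names and the function `subset_switch_block` are hypothetical, and the code models connectivity only, not the layout techniques or density metric studied in the paper; the universal and Wilton patterns are not shown.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")  # illustrative side labels


def subset_switch_block(width):
    """List the programmable connections of a subset switch block.

    Each connection joins track t on one side to the same track t on
    another side, for every unordered pair of sides.
    """
    return [
        ((a, t), (b, t))
        for a, b in combinations(SIDES, 2)
        for t in range(width)
    ]


if __name__ == "__main__":
    switches = subset_switch_block(4)
    print(len(switches), "switches for channel width 4")
    for endpoint_a, endpoint_b in switches[:3]:
        print(endpoint_a, "<->", endpoint_b)
```

With four sides there are six side pairs, so this pattern uses six switches per track, and a signal that enters on track t stays on track t as it passes through the block.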