Random number generators (RNGs) based upon neighborhood-of-four cellular automata (CA) with asymmetrical, non-local connections are explored. A number of RNGs that pass Marsaglia's rigorous Diehard suite of random...
详细信息
ISBN:
(纸本)9781581134520
Random number generators (RNGs) based upon neighborhood-of-four cellular automata (CA) with asymmetrical, non-local connections are explored. A number of RNGs that pass Marsaglia's rigorous Diehard suite of random number tests have been discovered. A neighborhood size of four allows a single CA cell to be implemented with a four-input lookup table and a one-bit register which are common building blocks in popular fieldprogrammablegatearrays (FPGAs). The investigated networks all had periodic (wrap around) boundary conditions with either 1-d, 2-d, or 3-d interconnection topologies. Trial designs of 64-bit networks using a Xilinx XCV 1000-6 FPGA predict a maximum clock rate of 214 MHz to 230 MHz depending upon interconnection topology.
field-programmable-Core-arrays (FPCA) will include various computing cores for a wide variety of applications ranging from DSP to general purpose computing. With the increasing gap between core computing speeds and me...
详细信息
ISBN:
(纸本)9781581134520
field-programmable-Core-arrays (FPCA) will include various computing cores for a wide variety of applications ranging from DSP to general purpose computing. With the increasing gap between core computing speeds and memory access latency, managing and orchestrating the movement of data across multiple cores will become increasingly important. In this paper we propose data reorganization engines that allow a wide variety of data reorganizations intra- as well as inter-memory modules for future FPCAs. We have experimented with a suite of data reorganizations pervasive in DSP applications. Our limited set of experiments reveals that the proposed designs for these engines are flexile and use little design area in current FPGA fabrics, making them amenable to be easily integrated in future FPCAs as either soft- or hard- macros.
This paper analyzes the dynamic power consumption in the fabric of fieldprogrammablegatearrays (FPGAs) by taking advantage of both simulation and measurement. Our target device is Xilinx Virtex™-II family, which co...
详细信息
ISBN:
(纸本)9781581134520
This paper analyzes the dynamic power consumption in the fabric of fieldprogrammablegatearrays (FPGAs) by taking advantage of both simulation and measurement. Our target device is Xilinx Virtex™-II family, which contains the most recent and largest programmable fabric. We identify important resources in the FPGA architecture and obtain their utilization, using a large set Of real designs. Then, using a number of representative case studies we calculate the switching activity corresponding to each resource. Finally, we combine effective capacitance of each resource with its utilization and switching activity to estimate its share of power consumption. According to our results, the power dissipation share of routing, logic and clocking resources are 60%, 16%, and 14%, respectively. Also, we concluded that dynamic power dissipation of a Virtex-II CLB is 5.9 μW per MHz for typical designs, but it may vary significantly depending on the switching activity.
As the capacity of FPGA's increases to millions of equivalent gates the use of Intellectual Property (IP) cores becomes increasingly important to control design complexity. FPGA's are becoming platforms for in...
详细信息
As the capacity of FPGA's increases to millions of equivalent gates the use of Intellectual Property (IP) cores becomes increasingly important to control design complexity. FPGA's are becoming platforms for integrating a system solution from components supplied by independent vendors in the same way as printed circuit boards provided a platform for earlier generations of designers. However, the current commercial model for IP cores involves large up-front license fees reminiscent of ASIC NRE charges. In order to match the IP core business model to the low to medium volume applications addressed by FPGA customers it is important to develop cryptographic techniques which allow IP core vendors to sell their product on a pay-per-use basis rather than through up-front license fees.
This paper presents abstract layout techniques for a variety of FPGA switch block architectures. We evaluate the relative density of subset, universal, and Wilton switch block architectures. For subset switch blocks o...
详细信息
ISBN:
(纸本)9781581134520
This paper presents abstract layout techniques for a variety of FPGA switch block architectures. We evaluate the relative density of subset, universal, and Wilton switch block architectures. For subset switch blocks of small size, we find the optimal implementations using a simple metric. We also develop a tractable heuristic that returns the optimal results for small switch blocks, and good results for large switch blocks. For switch blocks with general connectivity, we develop a representation and a layout evaluation technique. We use these techniques to compare a variety of small switch blocks. We find that the traditional Xilinx-style, subset switch block is superior to the other proposed architectures. Finally, we have hand-designed some small switch blocks to confirm our results.
We present a routability-driven bottom-up clustering technique for area and power reduction in clustered FPGAs. This technique uses a cell connectivity metric to identify seeds for efficient clustering. Effective seed...
详细信息
ISBN:
(纸本)9781581134520
We present a routability-driven bottom-up clustering technique for area and power reduction in clustered FPGAs. This technique uses a cell connectivity metric to identify seeds for efficient clustering. Effective seed selection, coupled with an interconnect-resource aware clustering and placement, can have a favorable impact on circuit routability. It leads to better device utilization, savings in area, and reduction in power consumption. Routing area reduction of 35% is achieved over previously published results. Power dissipation simulations using a buffered pass-transistor-based FPGA interconnect model are presented. They show that our clustering technique can reduce the overall device power dissipation by an average of 13%.
Video signal processing requires complex algorithms performing many basic operations on a video stream. To perform these calculations in real-time in a FPGA, we must use innovative structures to meet speed requirement...
详细信息
ISBN:
(纸本)9781581134520
Video signal processing requires complex algorithms performing many basic operations on a video stream. To perform these calculations in real-time in a FPGA, we must use innovative structures to meet speed requirements while managing complexity. As part of a project aiming at the development of a video noise reducer, we developed an optimized processing stream that required some floating-point calculations. This paper presents the rationale for developing a floating-point unit, justifies the data representation used, its implementation in a Xilinx VirtexE FPGA and reports the performance we obtained. A divider using this representation is also presented, with its implementation and performances in the same FPGA.
This paper examines circuit design of buffered routing switches in symmetrical, island-style FPGAs. The effects of switch size, tile length, level-restoring, and slow input slew rates are examined. Two new fanin-based...
详细信息
ISBN:
(纸本)9781581134520
This paper examines circuit design of buffered routing switches in symmetrical, island-style FPGAs. The effects of switch size, tile length, level-restoring, and slow input slew rates are examined. Two new fanin-based switch designs are used to eliminate nearly all of the increase in delay that arises from fanout with a previous switch design. Alternating between buffers and pass transistors is shown to improve connection delay without fanout by 25 %. To take advantage of this, we propose schemes to replace some buffers with pass transistors to simultaneously reduce area and delay. Routing a suite of MCNC benchmark circuits shows that 14% in areadelay, or 7% in delay can be saved using the new switch schemes. Alternatively, approximately 13 % in area can be saved with no degradation to delay.
Recent years have seen a tremendous increase in the capacities and capabilities of field-programmablegatearrays (FPGA's). Much of this dramatic improvement has been the result of changes to the FPGAs' intern...
详细信息
ISBN:
(纸本)9781581134520
Recent years have seen a tremendous increase in the capacities and capabilities of field-programmablegatearrays (FPGA's). Much of this dramatic improvement has been the result of changes to the FPGAs' internal architectures. New architectural proposals are routinely generated in both academia and industry. For FPGA's to continue to grow, it is important that these new architectural ideas are fairly and accurately evaluated, so that those worthy ideas can be included in future chips. Typically, this evaluation is done using experimentation. However, the use of experimentation is dangerous, since it requires making assumptions regarding the tools and architecture of the device in question. If these assumptions are not accurate, the conclusions from the experiments may not be meaningful. In this paper, we investigate the sensitivity of FPGA architectural conclusions to experimental variations. To make our study concrete, we evaluate the sensitivity of four previously published and well-known FPGA architectural results: lookup-table size, switch block topology, cluster size, and memory size. It is shown that these experiments are significantly affected by the assumptions, tools, and techniques used in the experiments.
field-programmablegatearrays have become popular ever since their introduction. Compared to other digital circuit implementation media, they have lower NRE cost and rapid turnaround with the penalties of reduced spe...
详细信息
ISBN:
(纸本)0769515622
field-programmablegatearrays have become popular ever since their introduction. Compared to other digital circuit implementation media, they have lower NRE cost and rapid turnaround with the penalties of reduced speed and larger size. Thus better FPGA programmable switch technology is desired in order to gain speed and density advantages. In this paper, Laser-induced MakeLink(TM)* technology is proposed as a programmable switch element. The Electrical resistance is as low as 0.8 Omega to 11 Omega, depending on the size of the link, which is 2-3 orders smaller than that of NMOS transistor in a SRAM based FPGA. Thus the speed improvement for Laser field-programmablegate Array (LFPGA) is significant. Other features of Laser-induced vertical links technology, such as small size and radiation hardness, car, also greatly improve the FPGA performance. The cluster-based LFPGA with 128 by 64 basic logic elements (BLE) is laid out under a 0.5 mum commercialized technology. The chip size is about 138mm(2).
暂无评论