ISBN: 9780897919784 (print)
By tailoring a compiler tree-parsing tool for datapath module mapping, we produce good-quality results for datapath synthesis with very fast run times. Rather than flattening the design to gates, we preserve the datapath structure; this allows exploitation of specialized datapath features in FPGAs, retains regularity, and also results in a smaller problem size. To further achieve high mapping speed, we formulate the problem as tree covering and solve it efficiently with a linear-time dynamic programming algorithm. In a novel extension to the tree-covering algorithm, we perform module placement simultaneously with the mapping, still in linear time. Integrating placement has the potential to increase the quality of the result since we can optimize total delay including routing delays. To our knowledge, this is the first effort to leverage a grammar-based tree-covering tool for datapath module mapping. Further, it is the first work to integrate simultaneous placement with module mapping in a way that preserves linear time complexity.
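As a rough illustration of tree covering by dynamic programming, the sketch below labels each node of an expression tree bottom-up with the cheapest rule that can cover it. The rule set `RULES`, the operator names, and the costs are hypothetical and are not taken from the paper, and the simultaneous placement extension is not modeled. With a fixed rule set, each node is visited once and matched against a bounded number of bounded-size patterns, so the total work is linear in the tree size.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str
    kids: tuple = ()
    best_cost: float = float("inf")
    best_rule: Optional[str] = None


# Hypothetical module library: (name, pattern, cost).
# A pattern is (op, subpattern, ...); None marks a pattern leaf, i.e. a
# subtree that must be covered separately by its own best rule.
RULES = [
    ("ADD", ("add", None, None), 2.0),
    ("MUL", ("mul", None, None), 5.0),
    ("MAC", ("add", ("mul", None, None), None), 6.0),  # fused multiply-add
    ("IN", ("in",), 0.0),
]


def match(pattern, node, leaves):
    """Return True if pattern matches at node; collect uncovered subtrees."""
    if pattern is None:
        leaves.append(node)
        return True
    op, *subs = pattern
    if node.op != op or len(subs) != len(node.kids):
        return False
    return all(match(p, k, leaves) for p, k in zip(subs, node.kids))


def cover(node):
    """Bottom-up DP: children are labeled before their parent."""
    for kid in node.kids:
        cover(kid)
    for name, pattern, cost in RULES:
        leaves = []
        if match(pattern, node, leaves):
            total = cost + sum(leaf.best_cost for leaf in leaves)
            if total < node.best_cost:
                node.best_cost, node.best_rule = total, name
    return node.best_cost


# Example: a*b + c is cheaper as one MAC (6.0) than as MUL + ADD (7.0).
a, b, c = Node("in"), Node("in"), Node("in")
root = Node("add", (Node("mul", (a, b)), c))
total_cost = cover(root)
```

Storing the best cost on each node is what keeps the pass linear: a parent never re-examines how its uncovered subtrees were mapped.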
ISBN: 9780897919784 (print)
Modern field programmable gate arrays (FPGAs) provide embedded memory blocks (EMBs) to be used as on-chip memories. In this paper, we explore the possibility of using EMBs to implement logic functions when they are not used as on-chip memory. We propose a general technology mapping problem for FPGAs with EMBs for area and delay minimization and develop an efficient algorithm, named EMB_Pack, based on the concepts of Maximum Fanout Free Cone (MFFC) and Maximum Fanout Free Subgraph (MFFS). EMB_Pack minimizes the area after or before technology mapping by using EMBs while maintaining the circuit delay. We have tested EMB_Pack on MCNC benchmarks on Altera's FLEX10K device family. The experimental results show that, compared with the original mapped circuits generated by CutMap without using EMBs, EMB_Pack as postprocessing can further reduce the area of the mapped circuits by up to 10% while maintaining the layout delay by making efficient use of available EMB resources. Compared with CutMap-e without using EMBs, EMB_Pack as pre-mapping processing followed by CutMap-e can reduce the area by 6% while maintaining the optimal circuit delay.
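For readers unfamiliar with the MFFC concept, the following minimal sketch computes the maximum fanout free cone of a node using the usual reference-counting idea: a gate belongs to the cone of `root` when all of its fanouts are already inside the cone. The netlist representation (a dict of fanin lists), the gate names, and the helper `fanout_counts` are assumptions made for illustration; this is not the EMB_Pack algorithm itself, and MFFS is not covered.

```python
from collections import Counter


def fanout_counts(fanins, primary_outputs):
    """Count fanouts of every signal; a primary output counts as one fanout."""
    counts = Counter(primary_outputs)
    for ins in fanins.values():
        for fi in set(ins):
            counts[fi] += 1
    return counts


def mffc(root, fanins, counts):
    """Return the gates whose every path to an output passes through root."""
    refs = Counter(counts)  # local copy, decremented as the cone grows
    cone = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        for fi in set(fanins.get(node, ())):
            refs[fi] -= 1
            # All fanouts of fi are now inside the cone. Primary inputs
            # (signals with no fanins) are not counted as cone members.
            if refs[fi] == 0 and fanins.get(fi):
                cone.add(fi)
                stack.append(fi)
    return cone


# Hypothetical netlist: g2 feeds both g3 and g4, so it belongs to neither cone.
fanins = {
    "g1": ["a", "b"],
    "g2": ["b", "c"],
    "g3": ["g1", "g2"],
    "g4": ["g2", "d"],
}
counts = fanout_counts(fanins, ["g3", "g4"])
cone_g3 = mffc("g3", fanins, counts)
cone_g4 = mffc("g4", fanins, counts)
```

A cone found this way can be replaced by a single block without duplicating logic, which is why it is a natural unit to consider when moving logic into a memory block.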
ISBN: 9780897919784 (print)
In this paper, we develop Boolean matching techniques for complex programmable logic blocks (PLBs) in LUT-based FPGAs. A complex PLB can be used not only as a K-input LUT but also to implement some wide functions of more than K variables. We apply existing functional decomposition methods and develop new ones to match wide functions to PLBs. We can determine exactly whether a given wide function can be implemented with an XC4000 CLB or with three other PLB architectures (including the XC5200 CLB). We evaluate the functional capabilities of the four PLB architectures in implementing wide functions in MCNC benchmarks. Experiments show that the XC4000 CLB can be used to implement up to 98% of 6-cuts and 88% of 7-cuts in MCNC benchmarks, while two of the other three PLB architectures have a smaller cost in terms of logic capability per silicon area. Our results are useful for designing future logic unit architectures in LUT-based FPGAs.
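One classical building block of functional decomposition is the column multiplicity test: for a chosen bound set of inputs, a function can be rewritten as g(h(bound), free) with t intermediate functions h exactly when the decomposition chart has at most 2^t distinct columns. The sketch below checks this by brute force. It illustrates the general Ashenhurst-Curtis criterion only; the function names and the example functions are hypothetical, and the structure of any particular PLB (such as the XC4000 CLB) is not modeled.

```python
from itertools import product


def column_multiplicity(f, n, bound):
    """Count distinct columns of the decomposition chart of f.

    f takes a list of n bits and returns 0 or 1; bound lists the indices
    of the bound-set variables (the remaining indices form the free set).
    """
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b_vals in product((0, 1), repeat=len(bound)):
        column = []
        for f_vals in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b_vals):
                x[i] = v
            for i, v in zip(free, f_vals):
                x[i] = v
            column.append(f(x))
        columns.add(tuple(column))
    return len(columns)


def aux_functions_needed(multiplicity):
    """Smallest t with 2**t >= multiplicity."""
    return (multiplicity - 1).bit_length()


# Six-input example: the parity of x0..x3 is the only thing the rest of
# the function needs to know about the bound set, so two columns suffice.
def f_parity(x):
    return (x[0] ^ x[1] ^ x[2] ^ x[3]) & x[4] | x[5]


# Here the bound set {x0, x1} produces four distinct columns, so two
# intermediate functions are required.
def f_mux_like(x):
    return x[0] & x[4] | x[1] & x[5]


mu_parity = column_multiplicity(f_parity, 6, [0, 1, 2, 3])
mu_mux = column_multiplicity(f_mux_like, 6, [0, 1])
```

Brute-force enumeration is exponential in the number of inputs, which is acceptable for the 6- and 7-input cuts discussed above but would not scale to much wider functions.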
ISBN: 9780897919784 (print)
This paper introduces a coarse-grained FPGA architecture that is specialized for high-performance Finite Impulse Response (FIR) filtering. The proposed architecture provides the flexibility of a DSP processor with performance and area efficiency similar to that of a custom ASIC design, while allowing all of the basic FIR design parameters, including coefficient precision, to be configured. Previous research has already shown that FPGAs can provide a high-performance alternative to DSP processors. Experimental comparisons in this paper show that the performance and area efficiency of the proposed architecture are similar to those of custom approaches across a wide range of filter sizes and configurations.
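To make the configurable parameters concrete, here is a minimal software model of a direct-form FIR filter, y[n] = sum over k of h[k] * x[n-k], with the coefficients quantized to a chosen bit width. The function names, the signed fixed-point format, and the example values are assumptions for illustration; the sketch says nothing about how the proposed architecture implements the filter in hardware.

```python
def quantize(coeffs, bits):
    """Round coefficients in [-1, 1) to signed fixed point with `bits` bits."""
    scale = 1 << (bits - 1)
    lo, hi = -scale, scale - 1
    return [max(lo, min(hi, round(c * scale))) for c in coeffs]


def fir(samples, int_coeffs):
    """Direct-form FIR: one multiply-accumulate per tap per input sample."""
    taps = [0] * len(int_coeffs)  # delay line, newest sample first
    out = []
    for s in samples:
        taps = [s] + taps[:-1]
        out.append(sum(c * t for c, t in zip(int_coeffs, taps)))
    return out


# A two-tap filter with 8-bit coefficients; feeding a unit impulse
# returns the quantized coefficients themselves.
h8 = quantize([0.5, 0.25], 8)
impulse_response = fir([1, 0, 0], h8)
```

The number of taps and the coefficient bit width are the two knobs exposed here; fewer coefficient bits reduce multiplier size at the cost of a less accurate frequency response.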
ISBN: 9780897919784 (print)
It has become clear that large embedded configurable memory arrays will be essential in future FPGAs. Embedded arrays provide high-density, high-speed implementations of the storage parts of circuits. Unfortunately, they require the FPGA vendor to partition the device into memory and logic resources at manufacture time. This leads to a waste of chip area for customers who do not use all of the storage provided. This chip area need not be wasted, and can in fact be used very efficiently, if the arrays are configured as large multi-output ROMs and used to implement logic. In order to use the embedded arrays efficiently in this way, a technology mapping algorithm that identifies parts of circuits that can be efficiently mapped to an embedded array is required. In this paper, we describe such an algorithm. The new tool, called SMAP, packs as much circuit information as possible into the available memory arrays, and maps the rest of the circuit into four-input lookup tables. On a set of 29 sequential and combinational benchmarks, the tool is able to map, on average, 60 4-LUTs into a single 2-Kbit memory array. If there are 16 arrays available, it can map, on average, 358 4-LUTs to the 16 arrays.
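The idea of using a memory array as a multi-output ROM can be sketched as follows: the inputs of a piece of logic form the address, and each output function occupies one bit of the stored word. The helper names and the full-adder example are hypothetical, and the sketch only builds the ROM contents for logic that has already been chosen; selecting which part of a circuit to pack, which is the job of SMAP, is not shown.

```python
def build_rom(functions, n_inputs):
    """Tabulate several functions of the same inputs as ROM words.

    Address bit i is input i; output j is stored in bit j of each word.
    """
    rom = []
    for addr in range(1 << n_inputs):
        bits = [(addr >> i) & 1 for i in range(n_inputs)]
        word = 0
        for j, fn in enumerate(functions):
            word |= (fn(bits) & 1) << j
        rom.append(word)
    return rom


def fits(n_inputs, n_outputs, capacity_bits=2048):
    """Check whether 2**n_inputs words of n_outputs bits fit in the array."""
    return (1 << n_inputs) * n_outputs <= capacity_bits


# Hypothetical example: a full adder (sum and carry) stored as an 8 x 2 ROM.
def adder_sum(b):
    return b[0] ^ b[1] ^ b[2]


def adder_carry(b):
    return (b[0] & b[1]) | (b[0] & b[2]) | (b[1] & b[2])


rom = build_rom([adder_sum, adder_carry], 3)
```

Because capacity is the product of depth and width, a 2-Kbit array can trade address inputs for outputs: `fits(8, 8)` and `fits(11, 1)` both hold, while `fits(9, 8)` does not.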
ISBN: 9780897919784 (print)
In the development of new FPGA architectures, a designer must balance speed, density and routing flexibility. In this paper, we discuss a new FPGA architecture based on a patented [1], novel, segmented routing fabric that is targeted to high performance and predictability but does not sacrifice routability or area efficiency. Current segmented architectures allow much flexibility in routing, but incur large delay penalties when a signal has high fanout or must traverse medium to long distances to reach its target. Reducing the number of programmable interconnect points (PIPs) that a signal must traverse to reach its target, while eliminating the RC delay buildup due to signal fanout, improves design performance and offers highly predictable signal delays.
The proceedings contain 22 papers from the 1997 International Symposium on Field Programmable Gate Arrays. Topics discussed include: field programmable gate array (FPGA) architectures; FPGA partitioning and synthesis; rapid prototyping and emulation; reconfigurable computing; and FPGA floorplanning and routing.
ISBN: 9780897918015 (print)
This paper shows that the speed of FPGAs with large embedded memory arrays can be improved by adding direct programmable connections between the memories. Nets that connect to multiple memory arrays are often difficult to route, and are often part of the critical path of circuit implementations. The memory-to-memory connection structure proposed in this paper allows for the efficient implementation of these nets, resulting in a reduction in memory access time of up to 25% and a slight improvement in routability.