To address some of the challenges of asynchronous design, we propose a new, decomposable asynchronous logic block architecture based on our THx2 programmable threshold cell, and we use it to implement common threshold functions found in asynchronous null convention logic circuits. At a minimum, programmable gate arrays require a programmable logic cell that can implement a complete set of logic. It is well known that a NAND function forms a complete set of logic, and in null convention logic, the TH12 and TH22 threshold cells are used to form a basic two-input NAND function. The THx2 threshold cell can perform both TH12 and TH22 operations, so it too forms a complete set of logic. In this paper, we present our eight-transistor mask-programmable gate array logic cell, 16-transistor field-programmable gate array logic cell, and new decomposable field-programmable gate array logic block architecture, all based on the THx2 threshold cell and suitable for implementing null convention logic asynchronous functions. To minimize the THx2 threshold cell area for both TH12 and TH22 modes, we designed a layout with common Euler paths and no diffusion breaks for both modes. The highly compact nature of the THx2 threshold cell, along with the symmetry of the mask- and field-programmable gate array logic cells, made it an ideal candidate for an asynchronous field-programmable logic block structure. This paper is part of an ongoing project, and it addresses only the programmable logic block architecture, not a complete FPGA fabric.
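The TH12 and TH22 behavior that the THx2 cell switches between can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `THx2`, the `threshold` parameter, and the helper `nand_dual_rail` are hypothetical names chosen for illustration, and the dual-rail NAND wiring shown (TH22 on the "1" rails, TH12 on the "0" rails) is one common way to build the basic NAND, not necessarily the exact circuit used in the paper. The model captures the usual null convention logic hysteresis: the output sets when at least `threshold` inputs are asserted and resets only when all inputs return to NULL.

```python
class THx2:
    """Behavioral model of a two-input threshold gate with hysteresis.

    threshold=1 behaves as TH12, threshold=2 as TH22.
    (Hypothetical model for illustration; not the paper's transistor design.)
    """

    def __init__(self, threshold):
        if threshold not in (1, 2):
            raise ValueError("threshold must be 1 or 2")
        self.threshold = threshold
        self.out = 0  # start in the NULL (deasserted) state

    def step(self, a, b):
        if a + b >= self.threshold:
            self.out = 1          # enough inputs asserted: set
        elif a + b == 0:
            self.out = 0          # all inputs NULL: reset
        # otherwise hold the previous output (hysteresis)
        return self.out


def nand_dual_rail(a, b):
    """Basic dual-rail NAND of two Boolean values (assumed wiring).

    Each signal x is encoded as rails (x0, x1); exactly one rail is 1 for DATA.
    """
    a0, a1 = 1 - a, a
    b0, b1 = 1 - b, b
    z0 = THx2(2).step(a1, b1)  # output is logic 0 only when both inputs are 1
    z1 = THx2(1).step(a0, b0)  # output is logic 1 when either input is 0
    assert z0 != z1            # exactly one output rail carries DATA
    return z1


gate = THx2(2)
print([gate.step(1, 0), gate.step(1, 1), gate.step(0, 1), gate.step(0, 0)])
# prints [0, 1, 1, 0]: the third step holds the output at 1
```

Programming the same cell with `threshold=1` instead gives OR-like TH12 behavior, which is why a single programmable cell suffices for both halves of the NAND.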
ISBN: 9798350359114 (print)
The conventional LUT is redundant, since the practical functions in real-world benchmarks make up only a small proportion of all possible functions. For example, only 3881 of the more than 10^14 NPN classes of 6-input functions occur in the mapped netlists of the VTR8 and Koios benchmarks. Therefore, we propose a novel LUT-like architecture, named DSLUT, with asymmetric inputs and programmable bits to efficiently implement the practical functions in domain-specific benchmarks instead of all the functions. The compact structure of the MUX tree in the conventional LUT is preserved, while fewer programmable bits are connected to the MUX tree according to the bit assignment. A 6-input DSLUT with 26 SRAM bits is generated for evaluation, based on the practical functions of 39 circuits from the VTR8 and Koios benchmarks. After the ABC synthesis flow, the post-synthesis results show that the proposed DSLUT6 architecture reduces the number of levels by 10.98% at a cost of 7.25% area overhead compared to the LUT5 architecture, while LUT6 reduces the number of levels by 15.16% at a cost of 51.73% more PLB area.
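To make the notion of NPN classes concrete, the following brute-force sketch groups all n-input Boolean functions into classes that are equivalent under input negation, input permutation, and output negation. It is an illustration only: the function names `npn_canonical` and `count_npn_classes` are hypothetical, the approach is a plain exhaustive search rather than the method used in the paper, and it is practical only for very small n (nowhere near the 6-input case discussed above).

```python
from itertools import permutations


def npn_canonical(tt, n):
    """Return the smallest truth table (as an integer) in the NPN class of tt.

    Bit x of tt is the function value for input assignment x,
    where bit i of x is the value of input i.
    """
    rows = 1 << n
    mask = (1 << rows) - 1
    best = mask
    for perm in permutations(range(n)):
        for neg in range(rows):          # bitmask of inputs to negate
            t = 0
            for x in range(rows):
                y = 0
                for i in range(n):
                    bit = ((x >> i) & 1) ^ ((neg >> i) & 1)
                    y |= bit << perm[i]  # route input i to position perm[i]
                if (tt >> y) & 1:
                    t |= 1 << x
            # also consider the output-negated version
            best = min(best, t, t ^ mask)
    return best


def count_npn_classes(n):
    """Count NPN classes by canonicalizing every n-input truth table."""
    rows = 1 << n
    return len({npn_canonical(tt, n) for tt in range(1 << rows)})


print(count_npn_classes(2))  # prints 4
print(count_npn_classes(3))  # prints 14
```

For two inputs, AND, OR, NAND, and NOR all fall into a single class, which shows how quickly the number of distinct classes shrinks relative to the raw number of truth tables.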
In this brief, a nonlinear integrated circuit to harvest different types of digital computation from complex dynamics is designed and fabricated. This circuit can be dynamically reconfigured to implement different two-input, one-output digital functions. The main advantage of the circuit is the ability to implement different digital functions in each clock cycle without halting for reconfiguration.
ISBN: 9781424481927 (print)
The programmable logic block (PLB) in a modern FPGA features a built-in carry chain (or adder) and a decomposable LUT, which may be decomposed into two or more smaller LUTs. Leveraging decomposable LUTs and underutilized carry chains, we propose to decompose a logic function in a PLB into two subfunctions and to combine the subfunctions via a carry chain to make the circuit more robust against single-event upsets (SEUs). Note that such decomposition can be implemented using the decomposable LUT and carry chain in the original PLB without changing the PLB-level placement and routing. Therefore, it is an in-place decomposition (IPD) with no area or timing overhead at the PLB level, and it has an ideal design closure between logic and physical synthesis. For the 10 largest combinational MCNC benchmark circuits, with a conservative 20% utilization rate for the carry chain, IPD improves MTTF (mean time to failure) by 1.43 and 2.70 times, respectively, for PLBs similar to those in Xilinx Virtex-5 and Altera Stratix-IV.
A novel Fudan programmable logic chip (FDP) was designed and implemented in a SMIC 0.18 μm CMOS logic process. The new 3-LUT based logic cell circuit increases logic density by about 11% compared with a traditional 4-input LUT. The unique hierarchical routing fabric and effective switch box optimize the routing wire segments and make it possible for segments of different lengths to connect directly. The FDP contains 1,600 programmable logic cells, 160 programmable I/Os, and 16 kbit of dual-port block RAM. Its die size is 6.104 mm × 6.620 mm, and it is packaged in a QFP208. Hardware and software cooperation tests indicate that the FDP chip works correctly and efficiently.
As technology scaling exacerbates interconnect resistance in advanced nodes, FPGA architectures demand enhanced programmable logic blocks (PLBs) to minimize global metal routing. However, it is expensive to raise the functionality of LUTs due to exponential area growth with the number of inputs, resulting in poor scalability. Moreover, LUTs are redundant since practical functions in real-world benchmarks only account for an extremely small proportion of all the functions. For example, only 16424 out of more than 100 trillion NPN classes of 6-input functions are used in the mapped netlists of the VTR8 and KOIOS benchmarks. Therefore, we propose a reduced LUT architecture, named RLUT, to efficiently implement most of the frequent functions. The compact structure of the MUX tree in LUTs is preserved and reduced, while the reduced programmable bits are connected to the MUX tree according to the bit assignment generated automatically by the proposed algorithms. Results of evaluations by a full EDA flow show that, compared with the modified Stratix10 baseline, the proposed 8-input PLB with 75 SRAM bits, named Dual-RLUT6, reduces the maximum logic levels significantly by 20.85%, while the critical path delay is improved by 10.11% at the cost of 4.65% area overhead.
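The idea of keeping the MUX tree while connecting fewer programmable bits to it can be sketched as follows. This is a minimal illustration, not the RLUT design itself: the helper names `mux_tree` and `reduced_lut` are hypothetical, the example uses a 3-input tree rather than a 6-input one, and the bit assignment shown is made up for demonstration, whereas the abstract states that the real assignment is generated automatically by the proposed algorithms.

```python
def mux_tree(leaves, inputs):
    """Evaluate a full binary MUX tree.

    leaves: list of 2**k bit values feeding the tree.
    inputs: k select bits; inputs[0] drives the MUX level nearest the leaves.
    """
    level = list(leaves)
    for sel in inputs:
        # each 2:1 MUX picks one of two neighbouring values
        level = [level[2 * i + sel] for i in range(len(level) // 2)]
    return level[0]


def reduced_lut(sram, assignment, inputs):
    """LUT whose 2**k tree leaves share a smaller set of SRAM bits.

    assignment[j] is the index of the SRAM bit wired to leaf j
    (a hypothetical wiring chosen for illustration).
    """
    leaves = [sram[j] for j in assignment]
    return mux_tree(leaves, inputs)


# A 3-input tree has 8 leaves, but only 4 SRAM bits are provided here.
assignment = [0, 1, 2, 3, 3, 2, 1, 0]
sram = [0, 1, 1, 0]

print(reduced_lut(sram, assignment, (1, 0, 0)))  # leaf 1 -> sram[1], prints 1
print(reduced_lut(sram, assignment, (0, 0, 1)))  # leaf 4 -> sram[3], prints 0
```

With fewer SRAM bits than leaves, only a subset of all truth tables is reachable, which is the trade-off the abstract describes: area is saved at the cost of covering only the frequently used functions.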