ISBN: 9781581131338 (print)
This work presents the design of an energy-efficient FPGA architecture. A significant reduction in energy consumption is achieved by tackling circuit design and architecture optimization concurrently. A hybrid interconnect structure incorporating nearest-neighbor connections, a symmetric mesh architecture, and hierarchical connectivity is used. The energy of the interconnect is further reduced by employing low-swing circuit techniques. These techniques have been employed to design and fabricate an FPGA. Preliminary analysis shows an energy improvement of more than an order of magnitude compared to existing commercial architectures.
Multi-field-programmable gate array (FPGA) systems (MFSs) are used as custom computing machines, logic emulators, and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture: the manner in which wires, FPGAs, and field-programmable interconnect devices (FPIDs) are connected. A new routing architecture, called hybrid complete-graph and partial-crossbar (HCGP), is proposed; it has superior speed and cost compared to a partial crossbar. The architecture uses both hard-wired and programmable connections between the FPGAs.
In this paper, we present a new retiming-based technology mapping algorithm for look-up-table-based field-programmable gate arrays. The algorithm is based on a novel iterative procedure for computing all k-cuts of all nodes in a sequential circuit in the presence of retiming. The algorithm completely avoids flow computation, which is the bottleneck of previous algorithms. Because k is very small in practice, the procedure for computing all k-cuts is very fast. Experimental results indicate that the overall algorithm is very efficient in practice.
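To make the notion of a k-cut concrete, the following is a minimal sketch of bottom-up cut enumeration for a purely combinational network. It is not the paper's procedure: it ignores retiming and sequential elements entirely, and the function name `enumerate_k_cuts` and the `fanins` dictionary layout are assumptions made for illustration.

```python
from itertools import product


def enumerate_k_cuts(fanins, k):
    """Enumerate all cuts of size <= k for every node of a combinational DAG.

    `fanins` maps each node to the list of its fanin nodes and must be
    given in topological order (primary inputs, with empty fanin lists,
    come first). This layout is an assumption of the sketch.
    """
    cuts = {}
    for node, ins in fanins.items():
        # The trivial cut {node} is always kept, as is usual in cut enumeration.
        node_cuts = {frozenset([node])}
        if ins:
            # Pick one cut per fanin and merge them; keep the result
            # only if it still fits in a k-input look-up table.
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts


# Hypothetical example network: g1 = f(a, b), g2 = f(g1, c).
example = {"a": [], "b": [], "c": [], "g1": ["a", "b"], "g2": ["g1", "c"]}
cuts3 = enumerate_k_cuts(example, 3)
```

Because each node's cut set is built from the cut sets of its fanins and anything larger than k is discarded immediately, the sets stay small when k is small, which is consistent with the abstract's remark that a small k keeps the computation fast.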
ISBN: 9780897919784 (print)
An algorithm is presented for partitioning a design in time. The algorithm divides a large, technology-mapped design into multiple configurations of a time-multiplexed FPGA. These configurations are rapidly executed in the FPGA to emulate the large design. The tool includes facilities for optimizing the partitioning to improve routability, for fitting the design into more configurations than the depth of the critical path, and for compressing the critical path of the design into fewer configurations, both to fit the design into the device and to improve performance. Scheduling results are shown for mapping designs into an 8-configuration time-multiplexed FPGA and for architecture investigation for a time-multiplexed FPGA.
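As a rough illustration of partitioning in time, the sketch below levelizes a technology-mapped netlist and packs consecutive logic levels into a fixed number of configurations. It is a simplified stand-in, not the algorithm of the paper: it ignores device capacity and routability, and the names `asap_levels` and `assign_configurations` are hypothetical.

```python
def asap_levels(fanins):
    """As-soon-as-possible level of each node.

    `fanins` maps node -> list of fanin nodes, in topological order
    (an assumption of this sketch). Nodes without fanins get level 0.
    """
    levels = {}
    for node, ins in fanins.items():
        levels[node] = 1 + max((levels[i] for i in ins), default=-1)
    return levels


def assign_configurations(fanins, n_configs):
    """Assign every node to one of `n_configs` configurations.

    Consecutive levels are packed together when the critical path is
    deeper than the number of configurations, so a node never lands in
    an earlier configuration than any of its fanins.
    """
    levels = asap_levels(fanins)
    depth = max(levels.values()) + 1
    levels_per_config = -(-depth // n_configs)  # ceiling division
    return {node: lvl // levels_per_config for node, lvl in levels.items()}


# Hypothetical four-node chain a -> b -> c -> d.
chain = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
two_configs = assign_configurations(chain, 2)
eight_configs = assign_configurations(chain, 8)
```

With two configurations the chain is compressed two levels per configuration; with eight, each level gets its own configuration and the remaining ones stay unused.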
Three factors are driving the demand for rapid field-programmable gate array (FPGA) compilation. First, as FPGAs grow in logic capacity, the compile computation grows more quickly than the compute power of the available computers. Second, there exists a subset of users who are willing to accept a decrease in quality of result in exchange for very high-speed compile. Third, very high-speed compile is a long-standing desire of those using FPGA-based custom computing machines, as they want compile times at least closer to those of regular computers. A routing algorithm and routing tool that relates these three unique capabilities to very high-speed compile is presented.
This paper introduces a coarse-grained FPGA architecture that is specialized for high-performance finite impulse response (FIR) filtering. The proposed architecture provides the flexibility of a DSP processor with performance and area efficiency similar to that of a custom ASIC design, while allowing all of the basic FIR design parameters, including coefficient precision, to be configured. Previous research has already shown that FPGAs can provide a high-performance alternative to DSP processors. Experimental comparisons in this paper show that the performance and area efficiency of the proposed architecture are similar to those of custom approaches across a wide range of filter sizes and configurations.
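For readers unfamiliar with the design parameters mentioned above, here is a small software sketch of a direct-form FIR filter with configurable coefficient precision. It only illustrates the arithmetic that such hardware would implement; the helper names `quantize` and `fir` and the fixed-point scaling convention are assumptions, not details from the paper.

```python
def quantize(coeffs, bits):
    """Convert real coefficients to signed fixed-point integers.

    Uses `bits - 1` fractional bits (an assumed convention) and returns
    the integer coefficients together with the scale factor.
    """
    scale = 1 << (bits - 1)
    return [round(c * scale) for c in coeffs], scale


def fir(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += c * samples[n - k]
        out.append(acc)
    return out


# An impulse input reproduces the coefficients (the impulse response).
impulse_response = fir([1, 0, 0, 0], [1, 2, 3])
q_coeffs, q_scale = quantize([0.5, 0.25], 8)
```

The number of taps (the length of `coeffs`) and the coefficient width (`bits`) are the kind of parameters the abstract describes as configurable.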
ISBN: 9780897919784 (print)
Current reconfigurable systems suffer from a significant overhead due to the time it takes to reconfigure their hardware. In order to deal with this overhead, and increase the power of reconfigurable systems, it is important to develop hardware and software systems to reduce or eliminate this delay. In this paper we propose one technique for significantly reducing the reconfiguration latency: the prefetching of configurations. By loading a configuration into the reconfigurable logic in advance of when it is needed, we can overlap the reconfiguration with useful computation. We demonstrate the power of this technique, and propose an algorithm for automatically adding prefetch operations into reconfigurable applications. This results in a significant decrease in the reconfiguration overhead for these applications.
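The benefit of overlapping reconfiguration with computation can be illustrated with a toy timing model. This is a sketch under strong assumptions, not the paper's model or algorithm: the next configuration is always known in advance, loading it does not disturb the computation in progress, and the function names and the `(load_time, compute_time)` phase format are hypothetical.

```python
def total_time_no_prefetch(phases):
    """Each phase first loads its configuration, then computes."""
    return sum(load + compute for load, compute in phases)


def total_time_with_prefetch(phases):
    """The next configuration is loaded while the current phase computes.

    Only the first load is fully exposed; afterwards each step costs
    whichever is longer, the current computation or the next load.
    """
    if not phases:
        return 0
    total = phases[0][0]  # the first configuration cannot be hidden
    for i, (_, compute) in enumerate(phases):
        next_load = phases[i + 1][0] if i + 1 < len(phases) else 0
        total += max(compute, next_load)
    return total


# Hypothetical workload: (load_time, compute_time) per phase, arbitrary units.
workload = [(10, 30), (10, 5), (10, 20)]
baseline = total_time_no_prefetch(workload)
overlapped = total_time_with_prefetch(workload)
```

In this example the second load hides completely behind the first computation, while the third load is only partly hidden because the second computation is shorter than the load.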
ISBN: 9780897919784 (print)
To implement high-density and high-speed FPGA circuits, designers need tight control over the circuit implementation process. However, current design tools are unsuited for this purpose, as they lack fast turnaround times, interactivity, and integration. We present a system for the Xilinx XC6200 FPGA that addresses these issues. It consists of a suite of tightly integrated tools for the XC6200 architecture centered around an architecture-independent tool framework. The system lets the designer easily intervene at various stages of the design process and features design cycle times (from an HDL specification to a complete layout) on the order of seconds.
ISBN: 9780897919784 (print)
While reconfigurable computing promises to deliver incomparable performance, it is still a marginal technology due to the high cost of developing and upgrading applications. Hardware virtualization can be used to significantly reduce both of these costs. In this paper, we describe the benefits of hardware virtualization and show how it can be achieved using a combination of pipeline reconfiguration and run-time scheduling of both configuration streams and data streams. The result is PipeRench, an architecture that supports robust compilation and provides forward compatibility. Our preliminary performance analysis predicts that PipeRench will outperform commercial FPGAs and DSPs both in overall performance and in performance per mm².