ISBN: 9781605584102 (print)
This paper describes a bus-mastering implementation of the PCI Express protocol using a Xilinx FPGA. While the theoretical peak performance of PCI Express is quite high, attaining that performance is a complex endeavor on top of an already complex protocol. The implementation is described and its performance is analyzed. Source code is offered for free download via the web. Copyright 2009 ACM.
The FPGA architectural issue of the effect of logic block functionality on FPGA performance and density is investigated. In particular, in the context of lookup table (LUT) based, cluster-based island-style FPGAs, the effect of LUT size and cluster size on the speed and logic density of an FPGA is analyzed. A fully timing-driven experimental flow is used, in which a set of benchmark circuits is synthesized into different cluster-based logic block architectures, which contain groups of LUTs and flip-flops.
ISBN: 9781595939340 (print)
The Field Programmable Counter Array (FPCA) was introduced to improve FPGA performance for arithmetic circuits. An FPCA is a reconfigurable IP core that can be integrated into an FPGA. To exploit the FPCA, a circuit is transformed by merging disparate addition and multiplication operations into large multi-input addition operations, which are synthesized as compressor trees on the FPCA; the remaining portion of the circuit is synthesized on the FPGA. This paper presents a series of architectural improvements to the FPCA that reduce routing delay, increase flexibility and component utilization, and simplify the integration process. Using an FPGA containing six FPCAs, we observed average and maximum speedups of 1.60x and 2.40x on a set of arithmetic benchmarks.
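To illustrate the idea of a compressor tree for multi-input addition, the following is a minimal software sketch, not the FPCA architecture itself. It assumes non-negative integer operands and uses only 3:2 compressors (carry-save adders); the function names `compress_3_2` and `compressor_tree_sum` are hypothetical and chosen for illustration.

```python
def compress_3_2(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save 3:2 compressor applied bitwise to three non-negative integers.

    Returns (partial_sum, carry) such that partial_sum + carry == a + b + c.
    No carry is propagated across bit positions inside this step.
    """
    partial_sum = a ^ b ^ c                       # per-bit sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, shifted left
    return partial_sum, carry


def compressor_tree_sum(operands: list[int]) -> int:
    """Reduce many operands to at most two values, then do one ordinary addition."""
    values = list(operands)
    while len(values) > 2:
        reduced = []
        # Compress each full group of three operands into two.
        for i in range(0, len(values) - 2, 3):
            reduced.extend(compress_3_2(values[i], values[i + 1], values[i + 2]))
        # Pass any leftover operands (one or two) through unchanged.
        leftover = len(values) % 3
        if leftover:
            reduced.extend(values[-leftover:])
        values = reduced
    # A single carry-propagate addition finishes the computation.
    return sum(values)


if __name__ == "__main__":
    print(compressor_tree_sum([3, 5, 7, 9, 11]))
```

The point of the structure is that every level of the tree avoids carry propagation, so only the final two-operand addition needs a full carry chain.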
ISBN: 9781595936004 (print)
We present an architecture for a synthesizable datapath-oriented Field-Programmable Gate Array (FPGA) core which can be used to provide post-fabrication flexibility to a System-on-Chip (SoC). Our architecture is optimized for bus-based operations that are common in signal processing and computation-intensive applications. It employs a directional routing architecture, which allows it to be synthesized using standard ASIC design tools and flows. We also describe a proof-of-concept layout of our core. It is shown that the proposed architecture is significantly more area efficient than the best previously reported synthesizable programmable logic core.
The proceedings contain 26 papers from FPGA 2002, the Tenth ACM International Symposium on Field-Programmable Gate Arrays. Topics discussed include: interconnect enhancements for a high-speed PLD architecture; FPGA switch block layout and evaluation; a faster distributed arithmetic architecture for FPGAs; efficient circuit clustering for area and power reduction in FPGAs; and integrated retiming and placement for field-programmable gate arrays.
ISBN: 9781450311557 (print)
Good FPGA placement is crucial for obtaining the best Quality of Results (QoR) from FPGA hardware. Although many published global placement techniques place objects in a continuous ASIC-like environment, FPGAs are discrete in nature, and a continuous algorithm cannot always achieve superior QoR by itself. Therefore, discrete FPGA-specific detail placement algorithms are used to improve the global placement results. Unfortunately, most of these detail placement algorithms do not have a global view. This paper presents a discrete "middle" placer that fills the gap between the two placement steps. It works like simulated annealing, but leverages various acceleration techniques. It does not pay the runtime penalty typical of simulated annealing solutions. Experiments show that with this placer, final QoR is significantly better than with the global-detail placer approach.
With the recent release of High Bandwidth Memory (HBM) based FPGA boards, developers can now exploit unprecedented external memory bandwidth. This allows more memory-bounded applications to benefit from FPGA accelerat...
ISBN: 9781605584102 (print)
This article presents the performance evaluation of two new diagonal routing tracks in FPGAs. We discuss the issues of automatic detailed architecture generation and propose changes to conventional placement and routing to better suit these architectures. We conduct a series of experiments on these architectures with the MCNC benchmarks, in which key parameters are varied over practical ranges, and we conclude that the results agree well with the predictions of the theory. Copyright 2009 ACM.
ISBN: 9781450338561 (print)
Glitches are unnecessary transitions on logic signals that needlessly consume dynamic power. Glitches arise from imbalances in the combinational path delays to a signal, which may cause the signal to toggle multiple times in a given clock cycle before settling to its final value. In this paper, we propose a low-cost circuit structure that is able to eliminate a majority of glitches. The structure, which is incorporated into the output buffers of FPGA logic elements, suppresses pulses on buffer outputs whose duration is shorter than a configurable time window (set at the time of FPGA configuration). Glitches are thereby eliminated "at the source", ensuring they do not propagate into the high-capacitance FPGA interconnect, saving power. An experimental study, using Altera commercial tools for power analysis, demonstrates that the proposed technique eliminates 70% of glitches, at a cost of a 1% reduction in speed performance.
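The pulse-suppression behavior described above can be approximated in software. The sketch below is a behavioral model only, not the circuit proposed in the paper: it assumes the signal is a list of 0/1 samples taken at uniform time steps and that the time window is expressed as a number of samples. The names `suppress_short_pulses` and `count_transitions` are hypothetical.

```python
def suppress_short_pulses(samples: list[int], window: int) -> list[int]:
    """Filter a sampled binary signal so that pulses shorter than `window` vanish.

    The output only switches to a new value after the input has held that value
    for at least `window` consecutive samples, so the output is also delayed.
    """
    if not samples:
        return []
    output = []
    current = samples[0]      # value currently driven on the filtered output
    run_value = samples[0]    # value of the input's current run
    run_length = 0            # number of consecutive samples at run_value
    for sample in samples:
        if sample == run_value:
            run_length += 1
        else:
            run_value = sample
            run_length = 1
        # Accept the new value only once it has been stable long enough.
        if run_value != current and run_length >= window:
            current = run_value
        output.append(current)
    return output


def count_transitions(samples: list[int]) -> int:
    """Count value changes; each one would cost dynamic power on a real wire."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


if __name__ == "__main__":
    noisy = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
    clean = suppress_short_pulses(noisy, window=3)
    print(count_transitions(noisy), count_transitions(clean))
```

In this model the one-sample pulse and the final two-sample run are both filtered out, while the four-sample pulse passes through after a delay of `window` samples. That delay is the software analogue of the speed cost mentioned in the abstract.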
The size of configuration bitstreams of field-programmable gate arrays (FPGAs) is increasing rapidly. Compression techniques are used to decrease the size of bitstreams. In this paper, an appropriate bitstream format a...