ISBN: 9781605584102 (print)
In this paper, we present and analyze a sophisticated communication architecture that allows many different modules to be integrated into a system through FPGA reconfiguration at runtime. Furthermore, we examine how this architecture can be implemented on low-cost Spartan-3 devices. We demonstrate that modules can be exchanged in a system without disturbing the communication architecture. The paper shows that the capabilities of Spartan-3 FPGAs are sufficient to build complex reconfigurable systems. Copyright 2009 ACM.
This paper describes an analytical model that relates the architectural parameters of an FPGA to the average prerouting wirelength of an FPGA implementation. Both homogeneous and heterogeneous FPGAs are considered. For homogeneous FPGAs, the model relates the lookup-table size, the cluster size, and the number of inputs per cluster to the expected wirelength. For heterogeneous FPGAs, the number and positioning of the embedded blocks, as well as the number of pins on each embedded block, are considered. Two applications of the model to FPGA architectural design are also presented. Copyright 2009 ACM.
Technology mapping is an important step in the FPGA CAD flow in which a network of simple gates is converted into a network of logic blocks. We consider enhancements to a traditional LUT-based mapping algorithm for an FPGA composed of logic blocks that implement only a subset of the functions of up to k variables; specifically, the logic block is a partial LUT, but it has more inputs than typical LUTs. Numerical results are presented to demonstrate the efficacy of our proposed techniques using real circuits mapped to a commercial FPGA architecture. Copyright 2009 ACM.
PERG is a pattern matching engine designed for locating predefined byte string patterns (rules) from the ClamAV virus signature database in a data stream. This paper presents PERG-Rx, an extension of PERG that adds limited regular expression support for the wildcard patterns used by rules that represent polymorphic viruses. To reduce the amount of state needed to track so many regular expressions, PERG-Rx employs a lossy scheme that increases the rate of false positives as the required state grows. The scalability of the PERG-Rx architecture and its ability to be updated dynamically as the database changes are also evaluated. Copyright 2009 ACM.
Via-programmable gate arrays (VPGAs) offer a middle ground between application-specific integrated circuits and field-programmable gate arrays in terms of flexibility, manufacturing, speed, power and area. In this paper, we present a VPGA logic cell, the complementary universal logic gate (CULG), which can be used to implement both sequential and combinatorial elements. Its performance is compared with a number of other designs including transmission, differential cascode voltage switch with pass gate, and standard cell. The CULG is found to have comparable delay product and process variation sensitivity to the other designs while offering the lowest power consumption. Copyright 2009 ACM.
In this paper, we present an implementation of a Cholesky decomposition core with IEEE 754 single-precision arithmetic. The datapaths are generated using fused datapath synthesis, created with an experimental floating-point compiler tool capable of fitting hundreds of floating-point operators into a single device. We present a scalable architecture for both real and complex matrices, and report results for real matrices of up to 128×128. The concepts of fused datapath synthesis for FPGA floating-point designs are reviewed, and their application to the Cholesky algorithm is detailed. Experimental results show that the accuracy of this method is superior to that expected from a traditional design flow based on IEEE 754 cores. Copyright 2009 ACM.
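For readers unfamiliar with the algorithm named in this abstract, the following is a minimal software sketch of a textbook column-by-column Cholesky decomposition of a real symmetric positive-definite matrix. It is not the fused-datapath FPGA core described above; the function name `cholesky` is hypothetical, and the arithmetic uses Python's double-precision floats rather than IEEE 754 single precision.

```python
import math


def cholesky(a):
    """Return lower-triangular L such that L * L^T equals a.

    `a` is a real symmetric positive-definite matrix given as a list of
    rows. This is a plain reference version (an illustration only), not
    the hardware architecture from the abstract.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: subtract the squares already placed in row j.
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / l[j][j]
    return l


if __name__ == "__main__":
    example = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
    for row in cholesky(example):
        print(row)
```

The square root and division in each column depend on the results of the previous columns, while the dot products within a column are independent of one another, which is the kind of structure a hardware datapath can evaluate in parallel.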
Clock network power in field-programmable gate arrays (FPGAs) is considered and two complementary approaches for power reduction in the Xilinx Virtex-5 FPGA are presented. The approaches are unique in that they leverage specific architectural aspects of Virtex-5 to achieve reductions in dynamic power consumed by the clock network. The first approach comprises a placement-based technique to reduce interconnect resource usage on the clock network, reducing capacitance and power (up to 12%). The second approach borrows the "clock gating" notion from the ASIC domain and applies it to FPGAs. Clock enable signals on flip-flops are selectively migrated to use the dedicated clock enable available on the FPGA's built-in clock, leading to reduced toggling on the clock interconnect and lower power (up to 28%). Power reductions are achieved without any performance penalty, on average. Copyright 2009 ACM.
In this paper, we present the first reported FPGA implementation of the Position Specific Iterated BLAST (PSI-BLAST) algorithm, a heuristic biological sequence alignment algorithm that is widely used in bioinformatics and computational biology to detect weak homologs. The architecture of our FPGA implementation is parameterized in terms of sequence lengths, scoring matrix, gap penalties, and cut-off and threshold values. It is composed of various blocks, each of which performs one step of the algorithm in parallel. This results in high-performance implementations that outperform equivalent software implementations by one order of magnitude or more. Furthermore, the core was captured in an FPGA-platform-independent language, namely Handel-C, to which no specific resource inference or placement constraints were applied. This makes our core portable across different FPGA families and architectures. Copyright 2009 ACM.
The performance of field-programmable gate arrays (FPGAs) in floating-point applications is poor because of the complexity of floating-point arithmetic. Implementing floating-point units on FPGAs consumes a large amount of resources, which makes FPGAs less attractive for floating-point intensive applications. Therefore, there is a need for embedded floating-point units (FPUs) in FPGAs. However, if unutilized, embedded FPUs waste space on the FPGA die. To overcome this issue, we propose a flexible multi-mode embedded FPU for FPGAs that can be configured to perform a wide range of operations. The floating-point adder and multiplier in our embedded FPU can each be configured to perform one double-precision operation or two single-precision operations in parallel. To increase flexibility further, access to the large integer multiplier, adder and shifters in the FPU is provided. Benchmark circuits were implemented on both a standard Xilinx Virtex-II FPGA and on our FPGA with embedded FPU blocks. The results using our embedded FPUs showed a mean area improvement of 5.2 times and a mean delay improvement of 5.8 times for the double-precision benchmarks, and a mean area improvement of 4.4 times and a mean delay improvement of 4.2 times for the single-precision benchmarks. Copyright 2009 ACM.
This paper presents a new architecture for time-to-digital conversion enabling a time resolution of 17 ps over a range of 50 ns with a conversion rate of 20 MS/s. The proposed architecture, implemented in a 65 nm FPGA system, consists of a pipelined interpolating time-to-digital converter (TDC). The TDC comprises a coarse time discriminator and a fine delay line, and is capable of sustained operation at a clock of 300 MHz. A Turbo version of the circuit implements a pipelined interpolating TDC with suppressed dead time to reach a conversion rate of 300 MS/s at the expense of a systematic asymmetry that requires fast error correction. The TDCs proposed in this paper can be compensated for process, voltage, and temperature (PVT) variations using conventional charge-pump-based feedback or a digital technique. Results demonstrate the suitability of the approach for a variety of applications involving precision ultra-fast time discrimination, such as optical sensing, time-of-flight cameras, high-throughput comlinks, RADARs, etc. Copyright 2009 ACM.
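As a rough illustration of how the figures quoted in this abstract relate to one another, the sketch below combines a coarse clock count with a fine delay-line code in the way a generic interpolating TDC might. The formula, the function names and the constant names are assumptions made for illustration; the abstract does not state how the proposed circuit combines its coarse and fine measurements.

```python
import math

# Figures quoted in the abstract.
CLOCK_HZ = 300e6        # coarse counter clock
RESOLUTION_PS = 17.0    # fine delay-line resolution
RANGE_PS = 50_000.0     # measurement range of 50 ns


def clock_period_ps(clock_hz=CLOCK_HZ):
    """Coarse clock period in picoseconds."""
    return 1e12 / clock_hz


def fine_taps_per_period():
    """Fine steps needed to span one coarse clock period (assumed model)."""
    return math.ceil(clock_period_ps() / RESOLUTION_PS)


def output_bits():
    """Bits needed to encode every fine step across the full range."""
    codes = math.ceil(RANGE_PS / RESOLUTION_PS)
    return math.ceil(math.log2(codes))


def timestamp_ps(coarse_count, fine_code):
    """Hypothetical combination: coarse periods plus fine steps.

    Some designs subtract the fine term instead, depending on which
    clock edge the delay line measures against.
    """
    return coarse_count * clock_period_ps() + fine_code * RESOLUTION_PS


if __name__ == "__main__":
    print(fine_taps_per_period(), output_bits(), timestamp_ps(3, 10))
```

Under these assumptions, one 300 MHz clock period of about 3333 ps spans roughly 197 fine steps of 17 ps, and covering 50 ns at that resolution takes a 12-bit output code.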