ISBN: (print) 9781424419609
DMA transfer between a CPU and an FPGA often becomes a bottleneck in current reconfigurable machines. DMA transfer on machines such as the SRC-6 supports streaming processing with on-board memory interleaving, but as a preprocessing step for the interleaving, the CPU must reorder the data for applications with severe FPGA resource constraints. This paper empirically evaluates this overhead to reveal the trade-off point. The results show that interleaved streaming DMA achieves a speedup when data strings of 150 KB or less are transferred.
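The CPU-side reordering mentioned in this abstract can be pictured with a generic word-interleaving sketch. This is only an illustration under assumptions: the function names `interleave` and `deinterleave`, the round-robin layout, and the idea of one input stream per memory bank are hypothetical and are not taken from the paper or from the SRC-6 documentation.

```python
def interleave(streams):
    """Reorder equal-length streams into one word-interleaved buffer.

    Hypothetical layout: word i of stream b lands at index i * n_banks + b,
    so consecutive buffer words go to consecutive banks.
    """
    if not streams:
        return []
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")
    # zip(*streams) yields one word from every stream per step.
    return [word for group in zip(*streams) for word in group]


def deinterleave(buffer, n_banks):
    """Inverse of interleave: recover the per-bank streams."""
    if n_banks <= 0 or len(buffer) % n_banks != 0:
        raise ValueError("buffer length must be a multiple of n_banks")
    return [buffer[b::n_banks] for b in range(n_banks)]
```

The reordering touches every word once, so its cost grows linearly with the transfer size, which is the kind of CPU overhead the abstract weighs against the streaming benefit.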
This study describes the integration of thermally assisted switching magnetic random access memories (TAS-MRAMs) in field-programmable gate array (FPGA) design. Non-volatility is achieved through the use of magnetic tunnelling junctions (MTJs) in an MRAM cell. A TAS scheme is used to write data in the MTJ device, which reduces power consumption during a write operation compared with the conventional writing scheme used in MTJ devices. Furthermore, non-volatility reduces both the power consumption and the configuration time required at each power-up of the circuit compared with classical static random access memory-based FPGAs. An innovative architecture also provides run-time reconfigurable (RTR) support at minimum area overhead. An RTR FPGA element using TAS-MRAM allows dynamic reconfiguration mechanisms while keeping the design architecture simple.
ISBN: (print) 9781424419609
Logical effort (LE) is a linear technique for modelling the delay of a circuit in a technology-independent manner. It offers the potential to simplify delay models for FPGAs and to give more insight into how the parameters affect the result. In this paper, the LE model is introduced and an application to FPGA interconnect driver sizing is described. Simple closed-form equations are given for delay, sensitivity of delay to driver size, and optimal delay. The results are shown to agree closely with SPICE simulations.
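As a rough illustration of the linear delay model named above, the sketch below uses the standard textbook form of logical effort (stage delay d = g·h + p, minimum path delay D = N·F^(1/N) + P with path effort F = G·B·H). It is not the paper's closed-form equations for interconnect drivers; the function names and the inverter values g = 1, p = 1 are assumptions made for the example.

```python
import math


def stage_delay(g, h, p):
    """Delay of one stage in units of tau: logical effort * electrical effort + parasitic."""
    return g * h + p


def min_path_delay(logical_efforts, parasitics, electrical_effort, branching=None):
    """Return (minimum path delay, optimal per-stage effort) for an N-stage path."""
    n = len(logical_efforts)
    if n == 0 or len(parasitics) != n:
        raise ValueError("need one logical effort and one parasitic per stage")
    path_g = math.prod(logical_efforts)            # G: product of logical efforts
    path_b = math.prod(branching) if branching else 1.0  # B: product of branching efforts
    path_f = path_g * path_b * electrical_effort   # F = G * B * H
    f_opt = path_f ** (1.0 / n)                    # equal effort per stage minimises delay
    return n * f_opt + sum(parasitics), f_opt


# Assumed example: three inverters (g = 1, p = 1) driving a load 64 times the input capacitance.
delay, effort = min_path_delay([1, 1, 1], [1, 1, 1], 64)
```

For this assumed three-inverter chain the optimal stage effort is 4 and the minimum delay is about 15 in units of tau, since 64^(1/3) = 4 and 3·4 + 3 = 15.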
ISBN: (print) 9781424419609
This paper describes CHiMPS, a C-based accelerator compiler for hybrid CPU-FPGA computing platforms. CHiMPS's goal is to facilitate FPGA programming for high-performance computing developers. It takes generic ANSI C code as input and automatically generates VHDL blocks for an FPGA. The accelerator architecture is customized with multiple caches that are tuned to the application. Speedups of 2.8x to 36.9x (geometric mean 6.7x) are achieved on a variety of HPC benchmarks with minimal source code changes.
Reduced device-level reliability and increased within-die process variability will become serious issues for future field-programmable gate arrays (FPGAs), and will result in faults developing dynamically during the lifetime of the integrated circuit. Fortunately, FPGAs have the ability to reconfigure in the field and at runtime, thus providing opportunities to overcome such degradation-induced faults. This study provides a comprehensive survey of fault detection methods and fault-tolerance schemes specifically for FPGAs and in the context of device degradation, with the goal of laying a strong foundation for future research in this field. All methods and schemes are quantitatively compared and some particularly promising approaches are highlighted.
ISBN: (print) 9781424419609
Random number generators are one of the basic cryptographic primitives used in cryptographic protocols. Most true random number generators in field-programmable gate arrays (FPGAs) employ the timing jitter of ring oscillator clocks as a source of randomness. The paper analyses the jitter generated in ring oscillators and uses a simple physical model of jitter sources to show that the random jitter accumulates more slowly than the global and manipulable deterministic jitter. This fact, which can be used to attack generators, is not taken into account even in the most recent designs considered to be secure. The paper proposes a simple but efficient countermeasure against these attacks. The method is validated using the proposed behavioral VHDL model and is shown to be efficient in hardware as well.
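The difference in accumulation rates can be illustrated with a small numerical sketch. It assumes a simplified model that is not necessarily the paper's: random jitter is independent and Gaussian in each period (so its accumulated standard deviation grows with the square root of the number of periods), while deterministic jitter is a constant offset per period (so it grows linearly). All function and parameter names are hypothetical.

```python
import math
import random


def accumulated_random_std(sigma, n_periods):
    """Std of the sum of n independent per-period jitters, each with std sigma."""
    return sigma * math.sqrt(n_periods)


def accumulated_deterministic(delta, n_periods):
    """Accumulated offset when every period is shifted by the same delta."""
    return delta * n_periods


def simulate_random_std(sigma, n_periods, trials, seed=0):
    """Monte Carlo estimate of the accumulated random jitter std (seeded for repeatability)."""
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_periods))
        for _ in range(trials)
    ]
    mean = sum(totals) / trials
    variance = sum((t - mean) ** 2 for t in totals) / (trials - 1)
    return math.sqrt(variance)


def deterministic_to_random_ratio(delta, sigma, n_periods):
    """Under this model the ratio grows as sqrt(n_periods)."""
    return accumulated_deterministic(delta, n_periods) / accumulated_random_std(sigma, n_periods)
```

Under these assumptions, a deterministic component one tenth the size of the random one per period already matches it after 100 periods and dominates afterwards, which is why long accumulation windows favour an attacker who can influence the deterministic part.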
ISBN: (print) 9781424419609
It is often desirable to change the logic and/or the connections within an FPGA design on-the-fly without the benefit of a workstation or vendor CAD software. This paper presents a dynamic router for Xilinx FPGAs, designed to run on stand-alone embedded systems. With information obtained from Xilinx's XDL tool, a compact routing database for the Virtex-II/IIP/4 devices is built that requires only 96 KB of storage. A channel routing algorithm is used because of its deterministic execution time and because all routing resources in the channel are available. Sample channels are routed with the router and compared with the Xilinx PAR tool. Improvements of several orders of magnitude in both execution time and memory usage are observed.
ISBN: (print) 9781424419609
This work presents a programmable, configurable motion estimation processor for the H.264 video coding standard, capable of handling the processing requirements of high definition (HD) video and suitable for FPGA implementation. The programmable aspect of the processor follows the ASIP (Application-Specific Instruction-set Processor) approach, with an instruction set targeted at accelerating block matching motion estimation algorithms. Configurability relates to the ability to optimize the microarchitecture for the selected algorithm and performance requirements by varying the number and type of execution units at compile time.
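For readers unfamiliar with block matching, the following is a minimal software sketch of the generic full-search variant using the sum of absolute differences (SAD) as the cost. It is not the processor's instruction set or the authors' algorithm; the function names, the list-of-lists frame representation, and the search-window convention are assumptions for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size, search_range):
    """Try every motion vector within +/- search_range; return (cost, mvx, mvy).

    Frames are lists of rows of integer pixel values. Candidates that fall
    outside the reference frame are skipped. Returns None if none is valid.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = cx + mvx, cy + mvy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, mvx, mvy)
    return best
```

Full search evaluates (2·search_range + 1)² candidates per block, which is the kind of regular, data-parallel workload that dedicated execution units are typically built to accelerate.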
ISBN: (print) 9781424419609
In this paper we present the design and implementation of an FPGA-based floating-point adder with three inputs. The design is based on a 5-level pipeline in order to distribute the critical paths and maximize performance. We examine the data dependencies to minimize the number of pipeline stages and to reduce the resource allocation. Our design is parameterisable in order to cope with different floating-point formats, including the standard IEEE 754 formats and custom configurations. The proposed design, with the single-precision 32-bit floating-point format, can operate at 143 MHz on a Xilinx Virtex2Pro XC2VP30-7.
ISBN: (print) 9781424419609
Traditional design techniques for FPGAs are based on hardware description languages, with functional and post-place-and-route simulation as a means to check design correctness and remove detected errors. As design complexity grows, it is necessary to introduce new design approaches that raise the level of abstraction while maintaining the efficiency of hardware computation that we are used to today. This paper presents one such methodology, which builds upon existing research in multithreading, object composability and encapsulation, partial runtime reconfiguration, and self-adaptation. The methodology is based on currently available FPGA design tools. The efficiency of the methodology is evaluated on basic vector and matrix operations.