ISBN: 9789090304281 (print)
As an alternative to adding more and more instructions to CPU cores in order to address a wide range of applications, this paper examines the use of a mixed-grained CPU interlay fabric to provide reconfigurable instruction set extensions. In detail, we examine replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA fabric. We show that, by applying a set of optimizations, an interlay can emulate original applications that use NEON instructions at the same hardware cost and with very little performance drop. Moreover, we demonstrate examples where special custom instructions running on a CPU-interlay hybrid substantially outperform the original hardened CPU-NEON system, making a strong case for embedding reconfigurability as a beneficial feature in future processors.
ISBN: 9781424438914 (print)
Effectively exploiting the variety of computational and storage resources available in common FPGA architectures for complex applications, such as the real-time implementation of vision algorithms, is often difficult with standard HDL design methodologies. Higher-level design tools can enable a designer to explore a range of different architectures more quickly. In this paper, we apply algorithmic C-to-FPGA synthesis technology in a structured design approach and demonstrate its added value on two relevant vision processing kernels: optical flow and debayering. The impact of the proposed approach on design time, FPGA resource consumption, and throughput is measured.
ISBN: 9782839918442 (print)
With the introduction of the Stratix V family, the FPGA vendor Altera now fully supports partial reconfiguration in all its recent FPGA devices. A distinct feature of the Altera architecture is that reconfigurable regions can be defined arbitrarily, which is made possible by writing a configuration mask before writing the actual configuration data to the FPGA fabric. In this paper, we present details of, and the flow for, implementing partial reconfiguration on Altera FPGAs, as well as a study of configuration bitstream sizes and configuration speeds for various resource and bounding-box aspect ratio variants. The results are used to build a partial reconfiguration controller that features a lightweight but effective bitstream decompression module, which greatly improves configuration speed on a DE5-Net board.
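The mask-then-data idea described in this abstract can be pictured as a masked write: only the bits selected by the mask take new values, so a region of any shape can be updated while everything else is left alone. The following is a minimal sketch under stated assumptions. The function name `masked_write`, the 8-bit toy word, and the convention that a mask bit of 1 means "reconfigure this bit" are all hypothetical and are not taken from Altera's bitstream format.

```python
def masked_write(fabric: int, data: int, mask: int) -> int:
    """Return the new configuration word.

    Assumed convention (hypothetical): bits where mask is 1 take their
    value from data; bits where mask is 0 keep the current fabric value.
    """
    return (fabric & ~mask) | (data & mask)


# Toy 8-bit configuration word: only the four middle bits are reconfigured.
old_state = 0b11110000
new_state = masked_write(old_state, data=0b00001010, mask=0b00111100)
print(f"{new_state:08b}")
```

Because the static bits are preserved by construction, the reconfigurable region does not need to align with any fixed frame or column boundary in this toy model.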
ISBN: 9781728148847 (print)
With the introduction of Zynq FPGAs, which provide an ARM SoC with an attached FPGA fabric, it is possible to build complex software-centric systems that are both software and hardware programmable. To harness the full potential of this approach, we developed FOS, an FPGA Operating System built on open-source FPGA community components and Xilinx vendor components. A distinct feature shown in this demo is a heterogeneous, resource-elastic scheduler that dynamically and automatically adjusts the allocation of tasks to hardware and software resources according to the current load. We will also show the FOS ecosystem, which makes it easy to implement relocatable, partially reconfigurable modules directly from RTL or HLS.
ISBN: 9782839918442 (print)
Over the past few decades, the use of reconfigurable computing for aerospace applications has become increasingly common despite its sensitivity to ionizing radiation. Tools are needed to test and implement fault-mitigation mechanisms that increase the reliability of FPGAs in space. This paper introduces a tool called the JTAG Configuration Manager (JCM), which provides high-speed programmable access to the configuration memory of Xilinx FPGAs using the JTAG serial protocol. The JCM consists of a Linux-based software library running on an embedded ARM processor, paired with a hardware state machine implemented in programmable logic. Two important uses of the JCM are configuration scrubbing and fault injection. The high-speed JTAG interface allows such operations to run at up to 60 MHz, which is several times faster than traditional JTAG FPGA configuration methods. The JCM also has access to the XADC on-chip temperature monitor and the internal boundary scan, making it useful for many testing and debugging applications.
ISBN: 9782839918442 (print)
A Virtual Private Network (VPN) encrypts and decrypts the private traffic it tunnels over a public network. Maximizing the available bandwidth is an important requirement for network applications, but the cryptographic operations add significant computational load to VPN applications, limiting the network throughput. This work presents a coprocessor designed to offer hardware acceleration for these encryption and decryption operations. The open-source SigmaVPN application is used as the base solution, and a coprocessor is designed for the parts of the Networking and Cryptography library (NaCl), which underlies the cryptographic operations of SigmaVPN. The hardware-software codesign is implemented on a Xilinx Zynq-7000 SoC and shows a 93% reduction in the execution time of encrypting a 1024-byte frame. This improves the TCP and UDP communication bandwidths by factors of 4.36 and 5.36, respectively, compared to a pure software solution for a 1024-byte frame.
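To relate the reported 93% reduction in execution time to the bandwidth factors, it helps to convert the reduction into a speed-up factor. The short sketch below does only that arithmetic; the helper name `speedup_from_reduction` is hypothetical and nothing here comes from the SigmaVPN or NaCl code.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional reduction in execution time into a speed-up factor.

    If the new time is (1 - reduction) of the old time, the speed-up is
    old / new = 1 / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# 93% less time for encrypting a 1024-byte frame, as reported in the abstract.
encrypt_speedup = speedup_from_reduction(0.93)
print(f"encryption speed-up: about {encrypt_speedup:.1f}x")
```

The encryption step alone is therefore roughly 14 times faster, while the end-to-end bandwidth gains are 4.36 and 5.36. That gap is consistent with encryption being only one part of the per-frame work, although the abstract does not break the remaining time down.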
ISBN: 9789090304281 (print)
The P4 language provides a way to describe custom network packet processing behavior that involves header parsing, matching, and assembling modified packets. Such an abstraction represents a significant step towards removing the limitations of fixed-function networking devices. Our live demonstration shows the straightforward use of an algorithm and tool that map a P4 program to a general architecture of an FPGA-based networking device. Network traffic is received, parsed, filtered, and modified by the generated circuit at the full line rate of 100 Gbps Ethernet. The results of our ongoing joint research project NFV200 show that FPGA technology can be used to improve network flexibility without the usual burden of tedious and error-prone HDL coding.
This paper presents a field-programmable gate array (FPGA) logic synthesis technique based upon Boolean satisfiability. The paper shows how to map any Boolean function into an arbitrary programmable logic block (PLB) architecture without any custom decomposition techniques. The authors illustrate several useful applications of this technique by showing how it can be used for architecture evaluation and area optimization. When evaluating the FPGA architecture, the authors focus on the basic building block of the FPGA, which they refer to as the PLB. To illustrate the flexibility of their evaluation framework, several unrelated PLB architectures are evaluated in an automated fashion. Furthermore, the authors show that their technique is able to reduce FPGA resource usage by 27% on average in common subcircuits found in digital design.
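The mapping question behind this abstract (is there a PLB configuration that realizes a given function for every input?) can be illustrated on a toy scale. The sketch below is not the authors' method: it replaces the SAT solver with exhaustive enumeration, and the PLB (a 2-input LUT whose output feeds a second 2-input LUT together with a third input) is a hypothetical architecture chosen only to keep the search small. The name `find_plb_config` is likewise an assumption.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Lut = Tuple[int, int, int, int]


def find_plb_config(
    target: Callable[[int, int, int], int]
) -> Optional[Tuple[Lut, Lut]]:
    """Search for LUT contents such that lut_b[lut_a[x0, x1], x2] == target.

    Hypothetical PLB: lut_a sees (x0, x1); lut_b sees (lut_a output, x2).
    Exhaustive enumeration stands in for the SAT solver here.
    """
    inputs = list(product((0, 1), repeat=3))
    for lut_a in product((0, 1), repeat=4):
        for lut_b in product((0, 1), repeat=4):
            if all(
                lut_b[2 * lut_a[2 * x0 + x1] + x2] == target(x0, x1, x2)
                for x0, x1, x2 in inputs
            ):
                return lut_a, lut_b
    return None  # no configuration of this PLB realizes the function


def xor3(a: int, b: int, c: int) -> int:
    return a ^ b ^ c


def maj3(a: int, b: int, c: int) -> int:
    return (a & b) | (a & c) | (b & c)


print("xor3 fits:", find_plb_config(xor3) is not None)
print("maj3 fits:", find_plb_config(maj3) is not None)
```

Three-input XOR fits this toy PLB, whereas three-input majority does not, because no single intermediate signal of x0 and x1 can carry enough information. A SAT formulation encodes the same "exists a configuration, for all inputs" condition as clauses, which lets it scale to PLBs far too large to enumerate.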
ISBN: 9782839918442 (print)
A nonvolatile field-programmable gate array (NVFPGA), in which both magnetic tunnel junction (MTJ) devices and greedy power-saving techniques are utilized, is proposed. Because the circuit components are shared among several MTJ devices through the use of a logic-in-memory (LIM) structure, the number of leakage current paths is reduced, which results in leakage power reduction during power-on. Moreover, the use of a self-termination scheme, which automatically turns off the write current immediately after the desired data is written, makes it possible to minimize power consumption during the backup operation. In fact, the proposed NVFPGA exhibits a 90% power reduction in comparison with a conventional SRAM-based FPGA under typical benchmark-circuit implementations.
ISBN: 9781424438914 (print)
Power consumption in data centres is a growing issue, as the cost of the power for computation and cooling has become dominant. An emerging challenge is the development of "environmentally friendly" systems. In this paper, we present a novel application of FPGAs for the acceleration of Information Retrieval algorithms, specifically, filtering streams or collections of documents against topic profiles. Our results show that FPGA acceleration can result in speed-ups of up to a factor of 20 for large profiles.