ISBN: 9789090304281 (print)
This paper presents the FISH (FPGA-Initiated Software-Handled) framework, which allows FPGA accelerators to make system calls to the Linux operating system in CPU-FPGA systems. A special FISH Linux kernel module running on the CPU provides a system call interface for FPGA accelerators, much like the ABI that exists for software programs. We provide a proof-of-concept implementation of this framework running on the Intel Cyclone V SoC device, and show that an FPGA accelerator can seamlessly make system calls as if it were the host program. We see the FISH framework being especially useful for high-level synthesis (HLS) by making it possible to synthesize software code that contains system calls.
ISBN: 9789090304281 (print)
Implementing convolutional neural networks for scene labelling is a current hot topic in the field of advanced driver assistance systems. The massive computational demands under hard real-time and energy constraints can only be tackled using specialized architectures. Cost-effectiveness is also an important factor when targeting lower quantities. In this PhD thesis, a vector processor architecture optimized for FPGA devices is proposed. Amongst other hardware mechanisms, a novel complex operand addressing mode and an intelligent DMA are used to increase performance. C compiler support for creating applications is also introduced.
ISBN: 9781538685174 (digital)
ISBN: 9781538685174 (print)
Supervised machine learning for data classification is increasingly implemented in hardware to be integrated close to the source of the data. The ability to update a trained machine learning model is the most important property any classification system must fulfill. This is often achieved by implementing the algorithm on reconfigurable hardware, but some applications require speed, size, or power efficiency that only application-specific integrated circuits (ASICs) can offer. Architectures that have proven to be very efficient on reconfigurable hardware are not always suited for custom ASIC designs. We therefore propose to integrate commonly used field-programmable technology in an application-specific architecture to allow updates of the trained model. This design pattern allows deep integration into full custom ASICs while leveraging all advantages of reconfigurable hardware.
ISBN: 9789090304281 (print)
Many CPU design houses, including Intel, IBM, and ARM, have added dedicated support for cryptography in recent processor generations. While adding accelerators and/or dedicated instructions boosts performance on cryptography, we are investigating a different approach that does not add extra silicon area: we study replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA fabric, called an interlay. This will be used for implementing cryptographic instructions in soft logic. We show that this approach can outperform the hardened NEON by up to 7.7x on AES and provide functionality that is not available in the hardened ARM.
ISBN: 9782839918442 (print)
An improved architecture for efficiently computing the sum of absolute differences (SAD) on FPGAs is proposed in this work. It is based on a configurable adder/subtractor implementation in which each adder input can be negated at runtime. The negation of both inputs at the same time is explicitly allowed and used to compute the sum of absolute values in a single adder stage. The architecture can be mapped to modern FPGAs from Xilinx and Altera. An analytic complexity model as well as synthesis experiments yield an average look-up table (LUT) reduction of 17.4% for an input word size of 8 bits compared to the state of the art. As the SAD computation is a resource-demanding part of image processing applications, the proposed circuit can be used to replace the SAD core of many applications to enhance their efficiency.
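The arithmetic idea in the abstract above can be illustrated with a small behavioural model. This is only a software sketch under stated assumptions, not the paper's circuit: the function names (`addsub`, `sum_abs_pair`, `sad`) are hypothetical, and the way the stages are arranged here is one plausible reading of "each adder input can be negated at runtime".

```python
def addsub(x, y, neg_x, neg_y):
    """Behavioural model of an adder whose two inputs can each be negated."""
    return (-x if neg_x else x) + (-y if neg_y else y)


def sum_abs_pair(x, y):
    """Compute |x| + |y| in one adder stage by negating each negative input."""
    return addsub(x, y, x < 0, y < 0)


def sad(a, b):
    """Sum of absolute differences of two equal-length integer sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # Stage 1: signed differences a[i] - b[i] (second input negated).
    diffs = [addsub(p, q, False, True) for p, q in zip(a, b)]
    if len(diffs) % 2:
        diffs.append(0)  # pad so the differences can be paired
    # Stage 2: each pair of differences is reduced to |d0| + |d1| in one stage.
    partial = [sum_abs_pair(diffs[i], diffs[i + 1])
               for i in range(0, len(diffs), 2)]
    # Remaining stages: the partial sums are non-negative, so plain addition suffices.
    return sum(partial)
```

For example, `sad([1, 5, 3], [4, 2, 3])` evaluates to 6, since the absolute differences are 3, 3, and 0.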
ISBN: 9781728199023 (print)
A True Random Number Generator (TRNG) is an essential component for security applications of FPGAs. Its requirements include small logic area, high throughput, sufficient randomness backed by a mathematical model, and feasibility (ease of implementation). This paper focuses on TRNGs based on a Transition Effect Ring Oscillator (TERO) and presents a three-path configurable TERO (TC-TERO), an improved implementation of TERO that achieves high feasibility with a minimal amount of hardware. According to the evaluation with a Xilinx Artix-7 FPGA, a TC-TERO with a 20-bit configurable parameter required only 40 LUTs. By selecting one of the promising parameters, the proposed TRNG passed AIS-31 Procedure A without post-processing and NIST SP 800-22 with simple debiasing.
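The abstract above does not say which debiasing scheme was used. As an illustration of what a simple debiasing step can look like, the sketch below implements the classic von Neumann corrector; it is an assumed stand-in, not the paper's method, and the function name `von_neumann_debias` is hypothetical.

```python
def von_neumann_debias(bits):
    """Von Neumann corrector: read non-overlapping pairs of raw bits,
    emit the first bit of each unequal pair, and discard equal pairs."""
    out = []
    # Step through complete pairs only; a trailing unpaired bit is ignored.
    for i in range(0, len(bits) - 1, 2):
        first, second = bits[i], bits[i + 1]
        if first != second:
            out.append(first)
    return out
```

For independent raw bits with a constant bias, the pairs (0, 1) and (1, 0) are equally likely, so the output is unbiased, at the cost of discarding at least half of the input on average.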
ISBN: 9781479900046 (print)
This paper presents an alternative FPGA design compilation flow that reduces the back-end time required to implement a design. Beginning with the GReasy front-end and proceeding through the TFlow back-end, this flow provides a rapid method for design assembly that is decoupled from the vendor tools. This enables software-like turnaround times for faster prototyping and increased productivity.
ISBN: 9781479900046 (print)
"We will demonstrate a portable FPGA tablet running a spiking neural network for handwriting recognition. The user draws digits on the tablet's touch-screen, and the neural network performs digit recognition."
ISBN: 9782839918442 (print)
FPGAs are used in many long-life systems that serve mission-critical needs. The supply chain and life-cycle management of these devices have long relied on ensuring adequate controls are in place. In this paper, a technique is presented that provides measurement vectors by determining both characteristics of the supply properties of the FPGA and characteristics of aging of the FPGA. Asynchronous ring oscillators are placed throughout the FPGA, and the measurement of these oscillators is compared to other chips both within a manufacturing lot and between other manufacturing lots. Through these non-invasive measurements, the "health history" of the FPGA can be evaluated and utilized in supply chain decisions before and during system operation.