ISBN: 9780769550749 (print)
Partial reconfiguration is one of the most attractive features of FPGAs. It opens up new computing possibilities: for instance, part of the initial functionality can be changed after deployment without a complete reconfiguration, and the total area required is reduced. However, designing partially reconfigurable systems remains a complex task. This work aims to facilitate the design process and proposes a new development flow, which reduces mistakes during the first stages of the design and makes building partial reconfiguration projects easier. In addition, we provide a dedicated hardware component that manages bitstreams and dynamic areas. This component speeds up reconfiguration, achieving a throughput of about 180 MB/s.
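To put the reported throughput in perspective, the following minimal sketch estimates how long loading a partial bitstream would take at about 180 MB/s. The bitstream sizes are hypothetical illustrations, not figures from the paper, and 1 MB is assumed to mean 10^6 bytes.

```python
def reconfiguration_time_ms(bitstream_bytes: int,
                            throughput_mb_per_s: float = 180.0) -> float:
    """Estimate the time to load a partial bitstream, in milliseconds.

    Assumes 1 MB = 10**6 bytes and a sustained throughput with no setup overhead.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = bitstream_bytes / (throughput_mb_per_s * 1e6)
    return seconds * 1e3


if __name__ == "__main__":
    # Hypothetical partial bitstream sizes (not taken from the paper).
    for size_kb in (100, 500, 1000):
        t_ms = reconfiguration_time_ms(size_kb * 1000)
        print(f"{size_kb:5d} kB -> {t_ms:.3f} ms")
```

The estimate scales linearly with bitstream size, which is why keeping dynamic areas small shortens reconfiguration.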
ISBN: 9781538666050 (print)
We propose a rate-adaptive forward error correction (FEC) scheme based on spatially-coupled (SC) LDPC codes derived from quasi-cyclic (QC) LDPC codes, together with its field-programmable gate array (FPGA) architecture. By FPGA emulation, we show that, with comparable computational complexity, the proposed LDPC codes provide a larger coding gain and a lower error floor compared to the QC-LDPC base code. Owing to their hardware-friendly structure, they are a promising candidate for next-generation intelligent optical communication systems such as long-haul optical transmission systems.
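As background on why QC-LDPC codes are considered hardware friendly, the sketch below expands a small base matrix of cyclic shifts into a full parity-check matrix made of circulant permutation blocks. The base matrix, the lifting size `z`, and the function names are hypothetical; the spatially-coupled construction and the rate adaptation of the paper are not reproduced here.

```python
def circulant_permutation(z, shift):
    """Return a z x z identity matrix cyclically shifted by `shift` columns.

    A negative shift denotes an all-zero block (a common convention).
    """
    if shift < 0:
        return [[0] * z for _ in range(z)]
    return [[1 if (r + shift) % z == c else 0 for c in range(z)]
            for r in range(z)]


def expand_base_matrix(base, z):
    """Expand a base matrix of shift values into the binary parity-check matrix."""
    H = []
    for base_row in base:
        blocks = [circulant_permutation(z, s) for s in base_row]
        for r in range(z):
            # Concatenate row r of every block in this block-row.
            H.append([bit for blk in blocks for bit in blk[r]])
    return H


if __name__ == "__main__":
    base = [[0, 1, -1],
            [2, -1, 0]]          # hypothetical shift values
    for row in expand_base_matrix(base, z=3):
        print("".join(map(str, row)))
```

Because every block is a shifted identity, a decoder only needs to store the shift values and can address memory with simple cyclic offsets.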
ISBN: 9789881563958 (print)
This paper proposes an improved decoding algorithm for non-binary low-density parity-check (NB-LDPC) codes, the mixed logarithmic-domain FFT-BP algorithm (Mixed Log-FFT-BP), which has low decoding complexity and is suitable for field-programmable gate array (FPGA) implementation. It addresses the high complexity of existing decoding algorithms for non-binary LDPC codes. The algorithm combines the traditional Log-BP algorithm with the FFT-BP algorithm and simplifies the check-node update in the iterative decoding process. A large number of convolution operations are converted into multiplications in the frequency domain by using the FFT and IFFT. The multiplications of the original FFT-BP algorithm are in turn converted into additions and look-up table operations in the logarithmic domain. The logarithm of the probability information is then computed directly, so that decoding can proceed in the logarithmic domain; this saves the computation of the log-likelihood ratio and thus reduces complexity. Simulation results show that, over the additive white Gaussian noise channel at a bit error rate of 10^-5, the performance of the Mixed Log-FFT-BP algorithm is not degraded compared with the BP, Log-BP, and FFT-BP algorithms, and all of them remain within a range of 0.1-0.2 dB.
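The frequency-domain step can be illustrated with a minimal sketch. It assumes the code is defined over GF(2^p), where the check-node convolution runs over XOR-combined symbol indices and the "FFT" is a Walsh-Hadamard transform; the abstract does not state the field, so this is an assumption, and the logarithmic-domain and look-up table parts of the algorithm are not shown.

```python
def wht(v):
    """Unnormalized Walsh-Hadamard transform; len(v) must be a power of two."""
    out = list(v)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b   # butterfly
        h *= 2
    return out


def xor_convolve_direct(p, q):
    """Direct convolution over XOR of indices: O(n^2) multiplications."""
    n = len(p)
    r = [0.0] * n
    for a in range(n):
        for b in range(n):
            r[a ^ b] += p[a] * q[b]
    return r


def xor_convolve_wht(p, q):
    """Same result via transform, pointwise product, inverse transform."""
    n = len(p)
    prod = [x * y for x, y in zip(wht(p), wht(q))]
    return [x / n for x in wht(prod)]   # the WHT is its own inverse up to 1/n


if __name__ == "__main__":
    p = [0.5, 0.25, 0.125, 0.125]   # hypothetical symbol probabilities
    q = [0.1, 0.2, 0.3, 0.4]
    print(xor_convolve_direct(p, q))
    print(xor_convolve_wht(p, q))
```

Both functions return the same probability vector up to rounding; the transform version replaces the quadratic number of multiplications with one pointwise product per symbol.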
ISBN: 9798331528379; 9798331528362 (print)
The self-timed pipeline (STP) is an attractive circuit architecture because its local data transfer naturally realizes pipeline-stage-level signal gating and thus leads to high power-performance efficiency. A circular STP can realize not only iterative or recursive operations but also program execution. A circuit configuration and design procedure already exist that make it possible to implement the circular STP on widely available commercial FPGAs, which provide flexibility at the circuit level, by using the standard EDA tools provided by the FPGA vendors; however, the existing configuration is premised on a data transfer scheme called four-phase bundled-data, in which the main transfer control signals transition four times when a set of data is transferred between adjacent pipeline stages. In this paper, we propose a circuit configuration for the two-phase bundled-data scheme, in which the main transfer control signals transition only twice per data set, potentially leading to a higher throughput. The proposed configuration realizes the circular STP's basic functions: the merge of two pipelines into one, the branch of one pipeline into two, and the copy and/or erasure of a data set. To make it possible to apply the existing design procedure to the proposed configuration, we exhaustively reveal the intrinsic timing constraints and present a timing information acquisition technique for static timing analysis. Using the proposed configuration, we implemented a circular STP on an AMD Zynq-7000 FPGA. Based on this implementation, we demonstrate that the proposed configuration requires approximately 15% more circuit area and achieves approximately 1.6 times higher maximum pipeline throughput than the existing four-phase bundled-data version.
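The difference between the two handshake schemes can be sketched by counting control-signal transitions per transfer. The signal names `req` and `ack` are assumptions for illustration (the abstract only speaks of "main transfer control signals"), and the model ignores all circuit delays.

```python
def four_phase_transfer(state):
    """Return-to-zero handshake: req rises, ack rises, req falls, ack falls."""
    events = []
    for sig, val in (("req", 1), ("ack", 1), ("req", 0), ("ack", 0)):
        state[sig] = val
        events.append((sig, val))
    return events


def two_phase_transfer(state):
    """Transition signalling: each toggle of req, then of ack, is one event."""
    events = []
    for sig in ("req", "ack"):
        state[sig] ^= 1
        events.append((sig, state[sig]))
    return events


def count_transitions(transfer, n_transfers):
    """Total number of control-signal transitions for n data transfers."""
    state = {"req": 0, "ack": 0}
    return sum(len(transfer(state)) for _ in range(n_transfers))


if __name__ == "__main__":
    n = 5
    print("four-phase:", count_transitions(four_phase_transfer, n))
    print("two-phase: ", count_transitions(two_phase_transfer, n))
```

Halving the transition count does not by itself double the throughput; the abstract reports a gain of approximately 1.6 times on the implemented circuit.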
ISBN: 9781628419542 (print)
High-speed imaging is an indispensable technique, particularly for identifying or analyzing fast-moving objects. The serial time-encoded amplified microscopy (STEAM) technique was proposed to capture images at a frame rate 1,000 times faster than conventional methods such as CCD (charge-coupled device) cameras. Applying this high-speed STEAM imaging technique to a real-time system, such as flow cytometry for a cell-sorting system, requires successively processing a large number of captured images with high throughput in real time. We are now developing a high-speed flow cytometer system that includes a STEAM camera. In this paper, we describe our approach to processing these large amounts of image data in real time. We use an analog-to-digital converter with up to 7.0 Gsamples/s and 8-bit resolution to capture the output voltage signal that carries grayscale images from the STEAM camera. The direct data output from the STEAM camera therefore amounts to a continuous 7.0 GB/s. We used a field-programmable gate array (FPGA) device as a digital signal pre-processor for image reconstruction and for finding objects in a microfluidic channel at high data rates in real time. We also used graphics processing unit (GPU) devices to accelerate the identification of the reconstructed images. We built a prototype system, which includes a STEAM camera, an FPGA device and a GPU device, and evaluated its performance in real-time identification of small particles (beads), serving as virtual biological cells, flowing through a microfluidic channel.
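The 7.0 GB/s figure follows directly from the converter parameters quoted in the abstract, as this short sketch shows. The function name is hypothetical, and 1 GB is taken as 10^9 bytes.

```python
def adc_data_rate_bytes_per_s(sample_rate_hz: float, bits_per_sample: int) -> float:
    """Raw output data rate of an ADC in bytes per second."""
    return sample_rate_hz * bits_per_sample / 8


if __name__ == "__main__":
    rate = adc_data_rate_bytes_per_s(7.0e9, 8)   # 7.0 Gsamples/s at 8 bits
    print(f"{rate / 1e9:.1f} GB/s")
    # Data accumulated over one minute of continuous acquisition.
    print(f"{rate * 60 / 1e9:.0f} GB per minute")
```

At this rate, storing raw data is impractical, which motivates reducing it in the FPGA pre-processor before any further processing.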
ISBN: 9783981926354 (print)
In recent years, IC reverse engineering and IC fabrication supply chain security have grown to become significant economic and security threats for designers, system integrators, and end customers. Many of the existing logic locking and obfuscation techniques have been shown to be vulnerable to attack once the attacker has access to the design netlist, either through reverse engineering or through an untrusted fabrication facility. We introduce soft embedded FPGA redaction, a hardware obfuscation approach that allows the designer to substitute security-critical IP blocks within a design with a synthesizable eFPGA fabric. This method fully conceals the logic and the routing of the critical IP and is compatible with standard ASIC flows for easy integration and process portability. To demonstrate eFPGA redaction, we obfuscate a RISC-V control path and a GPS P-code generator. We also show that the modified netlists are resilient to SAT attacks with moderate VLSI overheads. The secure RISC-V design has 1.89x area and 2.36x delay overhead, while the GPS design has 1.39x area and negligible delay overhead when implemented on an industrial 22nm FinFET CMOS process.
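The intuition behind redaction can be sketched with a single look-up table (LUT), the basic logic element of an FPGA fabric: the netlist contains only the generic LUT, while the function it computes is determined by configuration bits loaded later. This is a simplified illustration with hypothetical names, not the fabric used in the paper, and it omits routing.

```python
def lut_eval(config_bits, inputs):
    """Evaluate a k-input LUT: the inputs form an index into the configuration bits."""
    if len(config_bits) != 1 << len(inputs):
        raise ValueError("a k-input LUT needs 2**k configuration bits")
    index = 0
    for i, bit in enumerate(inputs):
        index |= (bit & 1) << i        # inputs[0] is the least significant bit
    return config_bits[index]


# The same hardware block computes different functions depending on its bitstream.
AND2_CONFIG = [0, 0, 0, 1]   # hypothetical bitstream fragment: 2-input AND
XOR2_CONFIG = [0, 1, 1, 0]   # hypothetical bitstream fragment: 2-input XOR

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, lut_eval(AND2_CONFIG, [a, b]), lut_eval(XOR2_CONFIG, [a, b]))
```

Someone who sees only the unconfigured fabric cannot tell which of the 2^(2^k) possible functions each LUT implements.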
ISBN: 9781509063529 (print)
Three-dimensional (3D) imaging using optical coherence tomography (OCT) is equipped with a field-programmable gate array and graphics processing unit (FPGA-GPU) acquisition and processing architecture, which makes it well suited to parallel computing. To realize the full benefit of the data acquisition and processing capabilities, it is preferable to increase the number of high-speed processing modules capable of running the complicated image processing algorithms at comparable speeds. In this paper, we propose the design of a real-time image acquisition and pre-processing FPGA via LabVIEW (National Instruments (NI)) with GPU-based acceleration that is capable of sustaining the rate of data acquisition. When NI LabVIEW FPGA is used to develop high-speed processing, FPGA cores are modeled as reusable code modules built from subVIs, which are commonly used in the LabVIEW environment. The OCT pre-processing was implemented and performed via subVIs in LabVIEW FPGA. Additionally, we used GPU-based acceleration, which employs a general-purpose GPU (GP-GPU) together with a CPU, to accelerate the processing. Finally, we implemented the LabVIEW FPGA core to perform data acquisition and image pre-processing in the frame grabber. Results showed that, by applying GPU acceleration to the tomographic inspection of biological samples, SD-OCT imaging in excess of 40 frames/s (FPS) becomes feasible on the NVIDIA M6000 GPU-accelerated SD-OCT system with a frame size of 4096 (axial) x 512 (lateral), and volumes of more than 512 x 512 x 500 can be reconstructed at least 7x faster than without a GPU.
ISBN: 9781467367851 (print)
This paper explores the use of the conservative power theory for active shunt compensation and provides experimental validation of reactive compensation and harmonic filtering. It aims to provide a more in-depth review of how the conservative power theory operates as a control algorithm for a shunt compensator. It also discusses some of the challenges associated with the practical implementation of active filters.
ISBN: 9798350330991; 9798350331004 (print)
To use reservoir computing (RC) for practical tasks, both a high memory capacity and nonlinearity are required; however, some RC models suffer from a low memory capacity. We propose a delay mechanism for increasing the memory capacity in RC, as well as a simple and small-scale digital circuit for implementing the delay mechanism. The proposed delay mechanism is integrated into the input layer of the RC model and is expected to be applicable to several RC models, such as material reservoirs and chaotic Boltzmann machine (CBM)-RC. We conducted experiments using a CBM-RC with a delay mechanism (CBM-RC-DL) and evaluated the performance improvement achieved by introducing the delay mechanism. We used CBM-RC as the base model because it is well suited to the hardware implementation of large networks but has a low memory capacity. The experimental results for CBM-RC-DL indicated that the delay mechanism significantly increased the memory capacity of CBM-RC with the addition of a small-scale circuit. Furthermore, the entire synthesized CBM-RC-DL was small enough to be implemented in a field-programmable gate array for edge computing, and it outperformed conventional methods in nonlinear autoregressive moving average 10 (NARMA10), a benchmark task for time-series data processing. The proposed delay mechanism can facilitate the use of many RC models because of its simple structure.
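To show why memory capacity matters for this benchmark, the sketch below generates a NARMA10 target sequence using the recurrence commonly used in the reservoir computing literature, in which each output depends on the input from nine steps earlier. The `delayed_inputs` helper is a generic tapped delay line added as an assumption for illustration; it is not the circuit proposed in the paper.

```python
import random


def narma10(u):
    """NARMA10 target sequence for an input sequence u (commonly used form)."""
    y = [0.0] * len(u)
    for t in range(9, len(u) - 1):
        y[t + 1] = (0.3 * y[t]
                    + 0.05 * y[t] * sum(y[t - 9:t + 1])   # last ten outputs
                    + 1.5 * u[t - 9] * u[t]               # needs 9-step-old input
                    + 0.1)
    return y


def delayed_inputs(u, taps):
    """Tapped delay line: at each step, present u[t], u[t-1], ..., u[t-taps]."""
    return [[u[t - d] if t - d >= 0 else 0.0 for d in range(taps + 1)]
            for t in range(len(u))]


if __name__ == "__main__":
    random.seed(0)
    u = [random.uniform(0.0, 0.5) for _ in range(200)]   # usual input range
    y = narma10(u)
    x = delayed_inputs(u, taps=9)
    print(y[10:15])
    print(len(x), len(x[0]))
```

Feeding delayed copies of the input to the reservoir makes older inputs directly available, so the reservoir itself does not have to retain them.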
ISBN: 9781467349338 (print)
This paper describes a new high-speed Quad Serial Peripheral Interface (QSPI) NOR flash memory controller that can work in a mixed single-data-rate/double-data-rate (SDR/DDR) mode. The proposed controller supports code eXecute In Place (XIP) operation as well as classic demand paging. The operating frequency ranges from above 0 up to 133 MHz for SDR write and read operations, and from above 0 up to 80 MHz for DDR read operations. The controller features an efficient calibration method for maximizing the margin for receiving data. The QSPI memory controller prototype is designed and implemented on a Xilinx Zynq-7020 field-programmable gate array (FPGA).
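The quoted clock limits translate into peak raw transfer rates as sketched below, assuming four data lines and ignoring command, address, and dummy cycles. The function name is hypothetical, and the results are upper bounds rather than measured figures from the paper.

```python
def qspi_peak_bytes_per_s(clock_hz: float, ddr: bool, data_lines: int = 4) -> float:
    """Peak raw data rate of a quad SPI link during the data phase.

    DDR transfers data on both clock edges, SDR on one edge only.
    """
    edges_per_cycle = 2 if ddr else 1
    bits_per_s = clock_hz * data_lines * edges_per_cycle
    return bits_per_s / 8


if __name__ == "__main__":
    sdr = qspi_peak_bytes_per_s(133e6, ddr=False)
    ddr = qspi_peak_bytes_per_s(80e6, ddr=True)
    print(f"SDR at 133 MHz: {sdr / 1e6:.1f} MB/s")
    print(f"DDR at  80 MHz: {ddr / 1e6:.1f} MB/s")
```

Under these assumptions, DDR reads at the lower clock still give a higher peak rate than SDR reads at 133 MHz, which is one reason a mixed SDR/DDR mode is attractive.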