Computer-generated holography (CGH) is a technique to generate holographic interference patterns. One of the major issues related to computer hologram generation is the massive computational power required. Hardware accelerators are used to accelerate this process. Previous publications targeting hardware platforms lack performance comparisons between different architectures and do not provide enough information for the evaluation of the suitability of recent hardware platforms for CGH algorithms. We aim to address these limitations and present a comprehensive review of CGH-related hardware implementations. (C) 2020 Society of Photo-Optical Instrumentation Engineers (SPIE)
An adaptive optical system is developed to correct the wavefront of laser radiation distorted by a turbulent air flow. The use of a field-programmable gate array as the main control element makes it possible to achieve a system bandwidth of 2 kHz. The results of experiments on dynamic correction of the phase of a laser beam distorted by a flow of heated air are presented and analysed.
ISBN: 9781728128610 (print)
This paper presents the architecture of a hardware accelerator for a cellular neural network (CeNN) with an application to real-time edge detection on visible-range and infrared video. The accelerator features fully pipelined processing elements (PEs) that exploit the data parallelism in the algorithm to perform an iteration of the CeNN on a stream of video data with high throughput. The memory architecture exploits the locality of reference in the CeNN, so that each PE uses only 5 line buffers to store pixel, state, and output data, thus achieving low on-chip memory utilization. Implemented on a Xilinx XC7A200T FPGA running at 245 MHz, the accelerator performs edge detection on 1080p video using a single CeNN iteration with a throughput of 118 frames per second (fps), a total latency of 15.7 μs, and 618 mW of power consumption. The architecture features static reconfiguration to store built-in kernels and to add more PEs to support multiple iterations of the CeNN algorithm. More kernels can be added dynamically through a serial interface.
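To make the "single CeNN iteration" concrete, here is a minimal software sketch of one forward-Euler step of a standard Chua-Yang cellular neural network over a 3x3 neighbourhood. It is not the paper's hardware design: the templates `A`, `B`, the bias `z`, the step size `h`, and the replicated-border handling are illustrative assumptions, and all function names are hypothetical. The last function checks the reported figures under the further assumption that the pipeline consumes one pixel per clock and that blanking is ignored.

```python
def saturate(x):
    """Piecewise-linear CeNN output, equal to 0.5 * (|x + 1| - |x - 1|)."""
    return max(-1.0, min(1.0, x))


def cenn_step(u, x, A, B, z, h=1.0):
    """One forward-Euler step of a CeNN with 3x3 templates.

    u: input image, x: current state (lists of rows, same shape).
    A: feedback template, B: feedforward template, z: bias.
    Borders are replicated (an assumption made for this sketch).
    """
    rows, cols = len(u), len(u[0])
    y = [[saturate(v) for v in row] for row in x]
    nxt = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = z
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += A[dr + 1][dc + 1] * y[rr][cc]
                    acc += B[dr + 1][dc + 1] * u[rr][cc]
            nxt[r][c] = x[r][c] + h * (-x[r][c] + acc)
    return nxt


def edge_map(u):
    """Single-iteration edge detection with illustrative templates."""
    A = [[0.0] * 3 for _ in range(3)]          # no feedback (assumed)
    B = [[-1.0, -1.0, -1.0],
         [-1.0, 8.0, -1.0],
         [-1.0, -1.0, -1.0]]                   # Laplacian-like (assumed)
    x0 = [[0.0] * len(u[0]) for _ in u]        # zero initial state
    x1 = cenn_step(u, x0, A, B, z=-1.0)
    return [[saturate(v) for v in row] for row in x1]


def pixel_rate_1080p(fps):
    """Pixels per second for 1920x1080 video, ignoring blanking."""
    return 1920 * 1080 * fps
```

With a 4x4 image whose left half is -1 and right half is +1, `edge_map` marks only the first bright column with +1. `pixel_rate_1080p(118)` evaluates to 244,684,800 pixels per second, which sits just under the 245 MHz clock, so the reported throughput is consistent with one pixel per clock cycle under the stated assumption.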
A novel hardware architecture for the reconstruction of digital holograms with autofocusing is presented in this paper. The architecture is based on a novel autofocusing algorithm operating on a smaller local block located at the center of the source digital holograms. By exploiting global information contained in the local block, an accurate focus distance can be computed with lower computational complexity. Interval search is also adopted to further accelerate the process. The circuits for the fast autofocusing algorithm and subsequent reconstruction operations are effectively integrated in the proposed architecture. Two fast Fourier transform cores are shared by the operations for parallel computations with low area costs. The architecture is implemented on a field-programmable gate array and is used as a hardware accelerator in a network-on-chip system for performance evaluation. Experimental results demonstrate that the proposed circuit exhibits the advantages of high-speed computation, low power dissipation, accurate focus distance search, and hologram reconstruction for three-dimensional rendering applications.
ISBN: 9781538651421 (print)
An important scientific direction is the development and study of computer vision systems (CVS) for mobile robotic complexes. Today, CVS developers most often use convolutional neural networks (CNNs). To speed up object detection in images, there is a trend toward CNNs implemented in hardware on field-programmable gate arrays (FPGAs). This article shows that tiny-YOLO, a CNN from the YOLO class, is a promising candidate for hardware implementation on an FPGA. To reduce the FPGA computing resources this CNN requires, the use of Inception-ResNet modules is proposed. We found that the tiny-YOLO-InceptionResNet2 network architecture provides high object detection accuracy with minimum resource requirements. It is obtained by replacing the fifth convolutional layer of the tiny-YOLO CNN with two sequential Inception-ResNet modules. Results are also given for the object detection accuracy of a CNN with this architecture when the resource-intensive operations, batch normalization and bias, are excluded from the calculations. These studies were performed for different number representation formats in the FPGA.
ISBN: 9781450367257 (print)
When attempting to make a design fit a set of the heterogeneous resources found in field-programmable gate arrays (FPGAs), designers using High-Level Synthesis (HLS) may resort to approximate approaches. However, current FPGA-oriented approximate HLS tools do not allow specifying constraints on heterogeneous resources such as lookup tables, flip-flops, and multipliers, being instead error-oriented. In this work, we propose a resource-oriented HLS methodology with which designers can specify heterogeneous resource constraints and satisfy them while minimizing the output error, attaining average improvements, over error-oriented approaches, of about 34% and 2.2 dB for mean-squared error and peak signal-to-noise ratio error metrics, respectively.
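For readers unfamiliar with the two error metrics named above, the following sketch computes mean-squared error and peak signal-to-noise ratio between an exact output and an approximate one. It uses the textbook definitions only; the function names and the default peak value of 255 (8-bit samples) are assumptions for illustration and say nothing about how the paper's tool measures error.

```python
import math


def mean_squared_error(exact, approx):
    """Average of squared differences between two equal-length sequences."""
    if len(exact) != len(approx) or not exact:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((e - a) ** 2 for e, a in zip(exact, approx)) / len(exact)


def psnr_db(exact, approx, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).

    peak=255 assumes 8-bit samples; identical inputs give infinity.
    """
    mse = mean_squared_error(exact, approx)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the MSE, a lower MSE always yields a higher PSNR for the same peak value, which is why the two metrics are usually reported together.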
ISBN: 9781728111902 (print)
This paper describes a new high-speed combined Octal Serial Peripheral Interface (SPI) and HyperBus flash memory host controller, which can work in a mixed single-data-rate/double-data-rate (SDR/DDR) mode. The proposed controller supports the SPI, Dual/Quad/Octal SPI, and HyperBus protocols. The operating frequency ranges from 5 to 200 MHz for SDR/DDR write and read operations, which lets systems boot from flash at the low frequency and run software at the high frequency. The controller features an efficient calibration method for maximizing the margin for receiving data. The memory controller prototype is designed and implemented on a Xilinx Zynq-7030 field-programmable gate array (FPGA).
ISBN: 9781510631410 (print)
Optical phased arrays (OPAs) are solid-state devices able to manipulate the distribution of optical power without the use of mechanical beam steering systems, and have potential applications in free-space laser communications, target acquisition and tracking, and interferometry. Here we present a scalable OPA and digital control architecture capable of steering a laser beam at MHz frequencies and of arbitrary control over the beam wavefront.
ISBN: 9781728140346 (print)
Convolutional neural networks have become widely used in embedded systems such as automatic driving systems. In these cases, the inference functions of convolutional neural networks are implemented under the resource limitations of embedded systems. Field-programmable gate array implementation is preferable because of its low power consumption and real-time response in embedded systems. Parameters in a convolutional neural network are floating-point numbers, and an enormous amount of floating-point calculation is required. This is a challenge because the field-programmable gate array has limited floating-point processing units and in-memory processing. One way to solve the problem is to approximate the enormous number of parameters and to perform efficient computation. In the approximation, floating-point parameters are represented using smaller-size integers or numbers having fewer bits. However, the inference accuracy decreases under approximation, leading to a tradeoff. In addition, it is not clear which layers should be approximated in order to be effective. To solve these problems, we developed an approximation support system. The developed system approximates the parameters and calculates the accuracy of the parameters and the required memory size. Furthermore, using this system, we carry out experiments to evaluate the effectiveness of several approximation methods for a large-scale network and dataset.
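The tradeoff described above, fewer bits per parameter against approximation error, can be illustrated with a small sketch. It uses symmetric uniform quantization, which is one common approximation scheme and an assumption here; the abstract does not say which methods the support system evaluates. All function names are hypothetical.

```python
import math


def quantize_symmetric(weights, bits=8):
    """Map floats to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]


def approximation_report(weights, bits):
    """Parameter error and memory footprint for a given bit width.

    Memory counts only the packed integer codes, not the scale factor.
    """
    q, scale = quantize_symmetric(weights, bits)
    approx = dequantize(q, scale)
    mse = sum((w - a) ** 2 for w, a in zip(weights, approx)) / len(weights)
    return {
        "bits": bits,
        "mse": mse,
        "bytes": math.ceil(len(weights) * bits / 8),
        "float32_bytes": len(weights) * 4,
    }
```

Running `approximation_report` on the same weight list at, say, 8 and 4 bits shows the pattern the abstract describes: the smaller bit width halves the memory but increases the parameter error. Measuring the effect on inference accuracy would additionally require running the network, which this sketch does not attempt.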
ISBN: 9781728109961 (print)
Functional hardware description languages (FHDLs) provide powerful tools for building new abstractions that enable sophisticated hardware systems to be constructed by composing small reusable parts. Raising the level of abstraction in hardware designs means the programmer can focus on high-level circuit structure rather than mundane low-level details. The language features that facilitate this include higher-order functions, a rich static type system with type inference, and parametric polymorphism. We use hand-written structural and behavioral VHDL, Simulink, and the Kansas Lava FHDL to re-implement several components taken from a Simulink model of an orthogonal frequency-division multiplexing (OFDM) physical layer (PHY). Our development demonstrates that an FHDL can require fewer lines of code than traditional design languages without sacrificing performance.