ISBN: 9781424419609 (print)
Random number generators are among the basic cryptographic primitives used in cryptographic protocols. Most true random number generators in field-programmable gate arrays (FPGAs) employ the timing jitter of ring oscillator clocks as a source of randomness. The paper analyses the jitter generated in ring oscillators and uses a simple physical model of jitter sources to show that random jitter accumulates more slowly than the global, manipulable deterministic jitter. This fact, which can be used to attack generators, is not taken into account even in the most recent designs considered to be secure. The paper proposes a simple but efficient countermeasure against these attacks. The method is validated using the proposed behavioral VHDL model and is shown to be efficient in hardware as well.
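The abstract does not give the paper's physical model, so the following is only a sketch under a common textbook assumption: each oscillator period carries an independent zero-mean Gaussian error (standard deviation `sigma_random`) plus a constant deterministic shift (`delta_deterministic`). Under that assumption the accumulated random jitter grows with the square root of the number of periods, while the deterministic part grows linearly. All names and parameter values here are hypothetical.

```python
import math
import random
import statistics


def accumulated_jitter(n_periods, sigma_random, delta_deterministic):
    """Closed-form accumulation under the assumed model (not the paper's).

    Returns (std of accumulated random jitter, accumulated deterministic jitter).
    Independent per-period errors add in variance, so the std scales as sqrt(n);
    a constant per-period shift adds directly, so it scales as n.
    """
    random_part = sigma_random * math.sqrt(n_periods)
    deterministic_part = delta_deterministic * n_periods
    return random_part, deterministic_part


def simulate_random_std(n_periods, sigma_random, trials=2000, seed=0):
    """Monte Carlo estimate of the accumulated random jitter's std."""
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(0.0, sigma_random) for _ in range(n_periods))
        for _ in range(trials)
    ]
    return statistics.pstdev(totals)


if __name__ == "__main__":
    for n in (1, 100, 10000):
        rnd, det = accumulated_jitter(n, sigma_random=1.0, delta_deterministic=0.1)
        print(n, rnd, det)
```

With these illustrative values the deterministic component is ten times smaller than the random one after a single period, equals it after 100 periods, and dominates beyond that, which is the kind of crossover an attacker could exploit.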
ISBN: 9789090304281 (print)
In this paper, we describe an FPGA system for real-time processing of Poisson Image Editing. Poisson Image Editing is a powerful method for seamlessly overlaying one image on another. In this method, however, a simple equation is applied repeatedly to each pixel, and this repetition makes its computational complexity very high. In our system, a very deep pipeline is used to apply the equation. One repetition is executed in each stage of the pipeline, and all repetitions are applied to all pixels during one scan of the image. In our current implementation, to reduce the circuit size, a color image is scanned three times, once each for the R, G, and B planes, but the processing speed is still fast enough for real-time processing of HD images.
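The abstract does not state the per-pixel equation, so the sketch below assumes the standard Jacobi update for the discrete Poisson equation used in seamless cloning: each masked pixel becomes the average of its four neighbours plus the source-image gradient toward them. It works on one colour plane stored as nested lists; the function name and arguments are hypothetical. Each pass of the outer loop corresponds loosely to one stage of the deep pipeline described above.

```python
def poisson_jacobi(target, source, mask, iterations):
    """Blend `source` into `target` where `mask` is True, via Jacobi sweeps.

    Assumptions (not taken from the paper): single-channel images as lists of
    lists, 4-neighbour stencil, and a mask that is False on the image border.
    """
    height, width = len(target), len(target[0])
    current = [[float(v) for v in row] for row in target]
    for _ in range(iterations):
        # Jacobi: every update in a sweep reads only the previous sweep's values.
        updated = [row[:] for row in current]
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                if not mask[y][x]:
                    continue  # pixels outside the mask keep the target value
                total = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    total += current[ny][nx]                   # neighbour estimate
                    total += source[y][x] - source[ny][nx]     # source gradient
                updated[y][x] = total / 4.0
        current = updated
    return current
```

Because every sweep depends only on the output of the previous one, the sweeps can be chained as pipeline stages, each consuming pixels in scan order with a few lines of buffering.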
ISBN: 9781424419609 (print)
In image processing, FPGAs have shown very high performance in spite of their low operating frequency. This high performance comes from (1) the high parallelism of image processing applications, (2) a high ratio of 8-bit operations, and (3) the large number of internal memory banks on FPGAs, which can be accessed in parallel. Recent microprocessors can execute SIMD instructions on 128-bit data in one clock cycle. Furthermore, these processors support multiple cores and large cache memories that can hold all the image data for each core. In this paper, we compare the performance of FPGAs with that of these processors using three image processing applications: two-dimensional filters, stereo vision, and k-means clustering. We make clear how fast an FPGA is in image processing and how many hardware resources are required to achieve that performance.
ISBN: 9781424419609 (print)
The goal of this PhD project is to devise a way to combat the effect of process variation on propagation delays in modern FPGAs. Through our research, we have devised a novel measurement method that can measure the delays of components on FPGAs with picosecond timing resolution and fine spatial granularity. The method avoids the use of external test equipment and is able to measure stochastic delay variability, which is becoming increasingly significant. The aim is to exhaustively test FPGA components using this method and to use the results to optimise the placement and routing of circuits in FPGAs, maximising performance under the negative influence of process variation.
ISBN: 9781479900046 (print)
The conventional matrix multiplication algorithms that are suitable for dense matrices do not perform well on the corresponding Sparse Matrix-Matrix Multiplication (SMMM) operation. In particular, they do not utilize the sparsity of the matrix. This paper describes a new technique for performing the SMMM operation using a novel storage format for sparse matrices. To demonstrate the feasibility of this technique, the SMMM operation is implemented on an FPGA and various parameters that affect the performance of the design are explored.
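The paper's novel storage format is not described in the abstract, so it cannot be reproduced here. As a stand-in, the sketch below uses the well-known compressed sparse row (CSR) format with a row-wise accumulation scheme, only to illustrate how a sparse format lets the multiplication skip zero entries. The helper names are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row's slice in `values`
    return values, col_idx, row_ptr


def csr_matmul(a, b, n_cols_b):
    """Multiply two CSR matrices row by row; returns a dense result.

    CSR is a swapped-in standard format, not the format proposed in the paper.
    Only nonzero pairs (a[i][k], b[k][j]) are ever touched.
    """
    a_val, a_col, a_ptr = a
    b_val, b_col, b_ptr = b
    result = []
    for i in range(len(a_ptr) - 1):
        acc = {}  # sparse accumulator for output row i, keyed by column
        for a_pos in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_col[a_pos], a_val[a_pos]
            for b_pos in range(b_ptr[k], b_ptr[k + 1]):
                j = b_col[b_pos]
                acc[j] = acc.get(j, 0) + a_ik * b_val[b_pos]
        result.append([acc.get(j, 0) for j in range(n_cols_b)])
    return result


if __name__ == "__main__":
    A = [[1, 0, 2], [0, 0, 3]]
    B = [[0, 4], [5, 0], [0, 6]]
    print(csr_matmul(dense_to_csr(A), dense_to_csr(B), n_cols_b=2))
    # prints [[0, 16], [0, 18]]
```

The work done is proportional to the number of nonzero products rather than to the full matrix dimensions, which is the property a dense algorithm fails to exploit.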
ISBN: 9781479900046 (print)
Conventional FPGA design and implementation processes involve two separate flows. The FPGA architecture is determined by an academic FPGA design flow, whereas in the implementation phase a commercial VLSI design flow is used. In this research, we propose an FPGA design framework to improve the design efficiency of synthesizable FPGA IP. A novel FPGA routing tool, EasyRouter, is developed in this framework to bridge the two flows efficiently. With this design flow, accurate physical information can be reported when a new FPGA IP architecture is evaluated with reliable commercial VLSI CAD tools.
ISBN: 9781424410590 (print)
Structured ASICs have recently emerged as a midpoint between cell-based ASICs, with their high NRE costs, and FPGAs, with their high unit costs. Although the structured ASIC fabric addresses mask and other fixed costs, it does not solve verification issues, particularly physical verification issues with ASICs, or logic errors missed by simulation, which would require re-spins. These can be avoided by testing in-system with an FPGA and migrating the FPGA design to a closely coupled structured ASIC fabric. Here we describe a practical methodology for fast, push-button, and thorough verification that ties an FPGA prototype to a matching structured-ASIC implementation for cost reduction. Our focus is equivalence verification between the respective revisions of a design, including netlist, compiler settings, macro-block parameters, timing constraints, pin layout, and resource count.
ISBN: 9782839918442 (print)
With the introduction of the Stratix V family, the FPGA vendor Altera now fully supports partial reconfiguration in all of its recent FPGA devices. A distinct feature of the Altera architecture is that reconfigurable regions can be defined arbitrarily, which is made possible by writing a configuration mask before writing the actual configuration data to the FPGA fabric. In this paper, we present the details and the flow for implementing partial reconfiguration on Altera FPGAs, as well as a study of configuration bitstream sizes and configuration speeds for various resource and bounding-box aspect-ratio variants. The results are used to build a partial reconfiguration controller that features a lightweight but effective bitstream decompression module, which greatly improves configuration speed on a DE5-net board.
ISBN: 9781479900046 (print)
Domain disparity between the CPU and Hardware Accelerators (HAs) leads to CPU under-utilization and inter-domain data copy overheads. By exposing HA memory to the OS and the host MMU, these overheads can be eliminated. In this paper, we present a real-system shared virtual memory design for PCIe-based HAs that enables parallel heterogeneous execution on the CPU and HAs without driver overheads. We extend Linux with a custom memory manager and a custom scheduler to manage HA memory and application cores, respectively. Our FPGA-based multi-application logic design supports simultaneous execution of multiple heterogeneous applications. We show the advantages of heterogeneous execution and analyze how our design reduces OS overhead.
ISBN: 9781424438914 (print)
Custom operators, working at custom precisions, are a key ingredient in fully exploiting the flexibility advantage of FPGAs for high-performance computing. Unfortunately, such operators are costly to design, and application designers tend to rely on less efficient off-the-shelf operators. To address this issue, an open-source architecture generator framework is introduced. Its salient features are an easy learning curve from VHDL, the ability to embed arbitrary synthesizable VHDL code, portability to mainstream FPGA targets from Xilinx and Altera, automatic management of complex pipelines with support for frequency-directed pipelining, and automatic test-bench generation. The generator is presented through the simple example of a collision detector, which it significantly improves in accuracy, DSP count, logic usage, frequency, and latency compared with an implementation using standard floating-point operators.