ISBN: (print) 9781424438914
This work introduces an FPGA implementation for vessel-tree extraction on retinal images. The retinal vessel tree can be used in disease diagnosis, e.g. for diabetes, or in person authentication. In such cases, a portable, high-performance device may be needed. The FPGA implementation discussed here, although application-oriented, features a fully programmable SIMD architecture, allowing for an efficient realization of low-level image processing algorithms. It is mapped onto a Spartan 3 and comprises 90 processing elements. The design uses 1.4MB of on-chip memory, which stores 8 gray images of 144 x 160 px. The working frequency is 53 MHz, allowing for a 3 x 3 convolution in less than 110 µs.
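The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation: the 8-bit pixel depth and the even split of pixels across processing elements are assumptions, not statements from the abstract.

```python
# Figures quoted in the abstract
CLOCK_MHZ = 53          # working frequency
CONV_TIME_US = 110      # upper bound for one 3 x 3 convolution
WIDTH, HEIGHT = 144, 160
NUM_PES = 90
NUM_IMAGES = 8

# Assumption (not stated in the abstract): 8 bits per gray pixel
BITS_PER_PIXEL = 8

# MHz * microseconds = clock cycles available for one convolution
cycle_budget = CLOCK_MHZ * CONV_TIME_US

# Assumption: pixels are divided evenly among the processing elements
pixels = WIDTH * HEIGHT
pixels_per_pe = pixels // NUM_PES

# Cycles each processing element can spend per pixel within the budget
cycles_per_pixel = cycle_budget / pixels_per_pe

# Storage needed for the eight gray images, in bits
storage_bits = NUM_IMAGES * pixels * BITS_PER_PIXEL

print(cycle_budget, pixels_per_pe, round(cycles_per_pixel, 1), storage_bits)
```

Under these assumptions each processing element handles 256 pixels (for instance a 16 x 16 block) and has roughly 23 cycles per pixel. The eight images occupy 1,474,560 bits, which is close to 1.4 megabits, so the quoted memory figure may be meant in bits rather than bytes.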
This paper advocates the use of 3D integration technology to stack a DRAM on top of an FPGA. The DRAM stores future FPGA contexts. A configuration is read from the DRAM into a latch array on the DRAM layer while the FPGA executes; the new configuration is then loaded from the latch array into the FPGA in 60 ns (5 cycles). The latency between reconfigurations, 8.42 µs, is dominated by the time to read data from the DRAM into the latch array. We estimate that the DRAM can cache 289 FPGA contexts.
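The timing figures above imply a clock period and a ratio between the visible and the hidden part of a context switch. The following is a rough calculation from the quoted numbers only; it assumes the 5 load cycles are cycles of a single uniform clock.

```python
# Figures quoted in the abstract
LOAD_TIME_NS = 60           # latch array -> FPGA
LOAD_CYCLES = 5
RECONFIG_LATENCY_US = 8.42  # minimum time between reconfigurations

# Assumption: the 5 cycles are cycles of one uniform clock
cycle_ns = LOAD_TIME_NS / LOAD_CYCLES
clock_mhz = 1000 / cycle_ns

# The DRAM-to-latch read expressed in the same clock cycles
latency_cycles = RECONFIG_LATENCY_US * 1000 / cycle_ns

# Share of the reconfiguration interval during which the FPGA is stalled
stall_fraction = LOAD_TIME_NS / (RECONFIG_LATENCY_US * 1000)

print(cycle_ns, round(clock_mhz, 1), round(latency_cycles), round(stall_fraction * 100, 2))
```

With these numbers the clock period is 12 ns (about 83 MHz), and the 60 ns load is under 1% of the 8.42 µs interval, which is consistent with the statement that the DRAM read dominates.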
SRAM-based FPGA devices are susceptible to single event effects (SEE), including single event upsets (SEU) within the configuration memory. Configuration scrubbing, along with TMR or other hardware redundancy techniques, is often used to mitigate the effects of these SEUs. However, traditional configuration scrubbing prevents the FPGA from being reconfigured dynamically or partially. This paper presents a novel technique that allows partial reconfiguration to be used with configuration scrubbing. A self-scrubber, utilizing a small portion of the FPGA, performs the necessary operations to reconfigure a portion of the design while continuously scrubbing the entire FPGA.
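As background for the two mitigation techniques named above, here is a toy software model of TMR voting and of conventional scrubbing against a golden copy. It is a minimal sketch for illustration only: the function names and the list-of-integers frame model are hypothetical, and it does not represent the paper's self-scrubber or its handling of partial reconfiguration.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority over three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


def scrub(frames, golden):
    """Rewrite every frame that differs from the golden copy.

    frames and golden are lists of integers standing in for configuration
    frames (a hypothetical model). Returns the number of frames repaired.
    """
    repaired = 0
    for i, reference in enumerate(golden):
        if frames[i] != reference:
            frames[i] = reference
            repaired += 1
    return repaired


# One replica produces a corrupted word; the voter masks the error.
voted = majority_vote(0b1010, 0b1010, 0b0110)

# One configuration frame suffers a bit flip; scrubbing restores it.
golden_frames = [0x11, 0x22, 0x33]
live_frames = [0x11, 0x22 ^ 0x04, 0x33]
repaired_count = scrub(live_frames, golden_frames)
```

The model also shows why plain scrubbing conflicts with partial reconfiguration: a scrubber that always restores a fixed golden copy would overwrite any region that was intentionally reconfigured.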
Many applications in image processing have high inherent parallelism. FPGAs have shown very high performance in spite of their low operational frequency by fully exploiting this parallelism. Recent microprocessors can also exploit the parallelism using multiple cores that support improved SIMD instructions, though programmers have to use them explicitly to achieve high performance. Recent GPUs provide a large number of cores and have the potential for high performance in many applications. However, the cores are grouped, and data transfer between the groups is very limited. Programming tools have been developed for FPGAs, for SIMD instructions on CPUs and for the many cores on GPUs, but it is still difficult to achieve high performance on these platforms. In this paper, we compare the performance of FPGA, GPU and CPU using three image processing applications (two-dimensional filters, stereo vision and k-means clustering) and clarify which platform is faster under which conditions.
Self-organization is a natural concept that helps complex systems to adapt themselves autonomically to their environment. In this paper, we present a self-organizing framework for multi-cue fusion in embedded imaging. This means that several simple image filters are used in combination to achieve more robust system behavior. Human motion tracking serves as a showcase. The system adapts to changes in the environment while tracking a person. In addition, system customization can be simplified: the designer only has to select a desired set of image filters for a given task, and the system then finds the appropriate parameters, e.g., the weighting of the different cues. With the option of partial reconfiguration, FPGAs support this type of customization. An FPGA-based prototype implementation demonstrates the feasibility of this approach. Tracking and adaptation work in real time at 25 FPS and a resolution of 640 x 480.
Fast carry chains featuring dedicated adder circuitry are a distinctive feature of modern FPGAs. The carry chains bypass the general routing network and are embedded in the logic blocks of FPGAs for fast addition. Conventional intuition is that such carry chains can be used only for implementing carry-propagate addition; state-of-the-art FPGA synthesizers can only exploit the carry chains for these specific circuits. This paper demonstrates that the carry chains can be used to build compressor trees, i.e., multi-input addition circuits used for parallel accumulation and for partial product reduction in parallel multipliers implemented in FPGA logic. The key to our technique is to program the lookup tables (LUTs) in the logic blocks to stop the propagation of carry bits along the carry chain at appropriate points. This approach significantly reduces the area of compressor trees compared to previous methods that synthesized compressor trees solely on LUTs, without compromising the performance gain over trees built from ternary carry-propagate adders.
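To make the term "compressor tree" concrete, the sketch below models multi-operand addition with generic 3:2 compressors (carry-save adders) followed by one final carry-propagate addition. It is a behavioral illustration on non-negative Python integers, with hypothetical function names; it does not model the LUT and carry-chain mapping that the paper proposes.

```python
def compress_3_to_2(a, b, c):
    """3:2 compressor: reduce three operands to a sum word and a carry word.

    Every bit position is handled independently, so no carry ripples
    across positions. The invariant is a + b + c == sum_word + carry_word.
    """
    sum_word = a ^ b ^ c
    carry_word = ((a & b) | (a & c) | (b & c)) << 1
    return sum_word, carry_word


def compressor_tree_sum(operands):
    """Add non-negative integers using layers of 3:2 compressors."""
    ops = list(operands)
    while len(ops) > 2:
        full = len(ops) - len(ops) % 3     # operands that form whole triples
        next_layer = []
        for i in range(0, full, 3):
            next_layer.extend(compress_3_to_2(ops[i], ops[i + 1], ops[i + 2]))
        next_layer.extend(ops[full:])      # leftovers pass to the next layer
        ops = next_layer
    return sum(ops)                        # single final carry-propagate add


total = compressor_tree_sum([3, 5, 7, 9, 11])
```

Each layer turns three operands into two, so only the last step needs a full carry-propagate adder. That is the reason compressor trees are attractive for accumulation and partial product reduction.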
Energy-performance tunable circuits enable the user to adjust the energy and performance of a chip after fabrication to suit the particular application, thus increasing the overall power efficiency of the chip. Two tunable interconnect architectures are proposed. Pseudo-static interconnect achieves the same performance as static interconnect while consuming only 65% as much energy, and provides a 2X wider range for adjusting energy and performance. Integrating pseudo-static interconnect into an FPGA architecture does not require any system-level changes. Pulse-mode interconnect provides marginal improvement at comparable power consumption but a considerable performance boost when energy increases. Using pulses enables pulse-mode look-up tables with 2.5X higher speed at 2X higher power consumption, at the cost of significant system-level changes.
This paper presents a new Single Event Upset (SEU), Multiple Bit Upset (MBU) and Single Hardware Error (SHE) mitigation strategy for use in Virtex-4 FPGAs. This strategy aims to increase not only the effectiveness of traditional Triple Module Redundancy (TMR), but also the overall system availability. Frame readback with ECC detection and frame scrubbing are combined in a dynamically reconfigurable TMR architecture, designed under both spatial and implementation diversification premises. Moreover, since the strategy works in the device's bitstream domain, the basis of the Virtex-4 FPGA bitstream definition is also shown.
This paper presents a new efficient method for implementing the AES byte substitution function (S-box). It is aimed at AES implementations in non-volatile FPGAs featuring volatile embedded RAM blocks. The method uses a pair of linear feedback shift registers to generate the substitution tables in embedded RAMs. The proposed solution requires less space and is faster than one implementing whole S-boxes in the logic area, and it is especially suited to a power-aware AES implementation. The complete AES cipher implemented in the Actel Igloo family and employing the proposed solution consumes half the total power and more than 150 times less static power than the same cipher implemented in a competing volatile FPGA technology.
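One well-known way to fill an AES S-box table at start-up, rather than storing it, is to step two 8-bit registers in lockstep: one walks through the powers of the generator 3 in GF(2^8) and the other through the corresponding inverses. Each update is a fixed shift-and-XOR step, similar in spirit to an LFSR step. The sketch below shows this standard generation scheme; whether it matches the pair of LFSRs used in the paper is an assumption.

```python
def rotl8(x, shift):
    """Rotate an 8-bit value left by shift bits."""
    return ((x << shift) | (x >> (8 - shift))) & 0xFF


def generate_aes_sbox():
    """Build the 256-entry AES S-box by stepping two 8-bit registers."""
    sbox = [0] * 256
    p, q = 1, 1  # invariant: p * q == 1 in GF(2^8)
    while True:
        # p <- p * 3, reduced modulo the AES polynomial x^8+x^4+x^3+x+1
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF
        # q <- q / 3, which equals multiplication by 0xF6
        q = (q ^ (q << 1)) & 0xFF
        q = (q ^ (q << 2)) & 0xFF
        q = (q ^ (q << 4)) & 0xFF
        if q & 0x80:
            q ^= 0x09
        # q is now the inverse of p; apply the AES affine transformation
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:  # the generator has cycled through all 255 nonzero values
            break
    sbox[0] = 0x63  # zero has no inverse and is handled separately
    return sbox


SBOX = generate_aes_sbox()
```

The loop runs 255 times and writes one table entry per step, which is the kind of sequential fill that suits a RAM block whose contents are lost at power-down.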
Local contrast enhancement is a technique to enhance the visibility of local details of an image by increasing the contrast in local regions. Adaptive histogram equalization (AHE) is a method for local contrast enhancement. AHE computes several histograms of intensity values, each corresponding to a distinct region of the image, and uses them to redistribute the intensity values. AHE is very computationally intensive, which makes it unacceptable for most applications. In this paper, we propose a method for real-time computation of AHE using an FPGA. In our system, a histogram is generated for each pixel in an image, and the intensity of the pixel is remapped using that histogram. The computational complexity of this approach is very high, but it can generate smooth enhanced images without using interpolation techniques. By reusing partial histograms that are temporarily stored in on-chip memory banks, and by adding/subtracting all bins of the histograms in parallel, we achieve real-time processing of HD images on an FPGA. This high performance is possible because of the very high memory bandwidth of the FPGA's on-chip memory banks.
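The idea of reusing histograms between neighboring pixels can be sketched in software. In the hypothetical model below, each pixel is remapped by the cumulative histogram of a square window around it, and the window histogram is updated incrementally by adding the entering column and subtracting the leaving one. The window shape, the border clipping, the remapping formula and all function names are assumptions for illustration; the paper's partial-histogram storage and parallel bin updates are not modeled.

```python
def remap(hist, value, levels):
    """Map value to its scaled rank within the window histogram."""
    total = sum(hist)
    rank = sum(hist[:value + 1])  # pixels in the window that are <= value
    return rank * (levels - 1) // total


def ahe_bruteforce(img, radius, levels=256):
    """Reference version: rebuild the window histogram for every pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            hist = [0] * levels
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    hist[img[yy][xx]] += 1
            out[y][x] = remap(hist, img[y][x], levels)
    return out


def ahe_sliding(img, radius, levels=256):
    """Same result, but the histogram is updated as the window slides."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        rows = range(max(0, y - radius), min(h, y + radius + 1))
        hist = [0] * levels
        for xx in range(min(w, radius + 1)):  # initial window at x = 0
            for yy in rows:
                hist[img[yy][xx]] += 1
        out[y][0] = remap(hist, img[y][0], levels)
        for x in range(1, w):
            entering, leaving = x + radius, x - radius - 1
            if entering < w:                  # add the new right-hand column
                for yy in rows:
                    hist[img[yy][entering]] += 1
            if leaving >= 0:                  # drop the old left-hand column
                for yy in rows:
                    hist[img[yy][leaving]] -= 1
            out[y][x] = remap(hist, img[y][x], levels)
    return out
```

The sliding version touches two columns per pixel instead of the whole window. In hardware the per-bin additions and subtractions can proceed in parallel, which is where the on-chip memory bandwidth mentioned in the abstract becomes the limiting resource.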