In the multiuser MIMO broadcast channel, the use of precoding techniques is required in order to detect the signal at the users' terminals without any cooperation between them. This contribution presents the desig...
In order to obtain depth information about a scene in computer vision, one needs to process pairs of stereo images. The calculation of dense depth maps in real-time is computationally challenging as it requires search...
Noise often limits the performance of transmitted signals and degrades signal quality. Moreover, the stochastic nature of noise makes it difficult to predict and hence hard to detect. In hardware implementation, the...
Graphics Processing Unit (GPU) architectures are widely used for resource-intensive computation. Initially dedicated to imaging, vision and graphics, these architectures nowadays serve a wide range of multi-purpo...
An integrated smart camera is a single chip composed of a sensor tightly coupled with one or more processing elements. The image processing applications mapped onto such systems can require processing power in the range of a supercomputer. To meet these increasing application needs, we propose in this paper a SIMD-based processor optimized for low- and intermediate-level image processing. The architecture is composed of several SIMD clusters. Each cluster includes a configurable number of 2-way PEs (Processing Elements), ranging from 32 to 256, running at 200 MHz. These cluster configurations provide between 12 and 102 GOPS.
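The quoted throughput range can be reproduced with a short back-of-the-envelope calculation. The sketch below assumes that each way of a PE completes one operation per clock cycle; the abstract does not state this, but the assumption yields 12.8 and 102.4 GOPS, which is consistent with the reported 12 to 102 GOPS. The function name `peak_gops` is hypothetical.

```python
def peak_gops(num_pe, ways=2, clock_hz=200e6):
    """Peak throughput of one SIMD cluster in giga-operations per second.

    Assumption (not stated in the abstract): every way of every PE
    completes one operation per clock cycle.
    """
    return num_pe * ways * clock_hz / 1e9


# Smallest and largest cluster configurations named in the abstract.
smallest = peak_gops(32)
largest = peak_gops(256)
print(f"32 PEs:  {smallest:.1f} GOPS")
print(f"256 PEs: {largest:.1f} GOPS")
```

Under this assumption throughput scales linearly with the PE count, so intermediate configurations fall between the two values.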
An image compression technique for High Dynamic Range (HDR) image sensors is introduced. Compression is performed in two steps: pixel value coding optimization followed by DCT-based (Discrete Cosine Transform) compression. A floating-point coding technique is first used, with a common exponent shared between the pixels of the same block, and then a DCT is applied to each group of pixels. This new concept, while maintaining a low-complexity architecture, achieves a compression ratio of 75% and retains good image quality, with a PSNR of about 40 dB.
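The two steps can be illustrated with a minimal sketch. It is not the authors' design: the 8-bit mantissa width, the four-pixel block, the truncating shift, and the plain unnormalized 1-D DCT-II are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import math


def encode_block(pixels, mantissa_bits=8):
    """Code a block of non-negative integer pixels with one shared exponent.

    The exponent is chosen so that the largest pixel fits in
    `mantissa_bits` bits (assumed width); the low bits are truncated.
    """
    peak = max(pixels)
    exponent = max(0, peak.bit_length() - mantissa_bits)
    mantissas = [p >> exponent for p in pixels]
    return exponent, mantissas


def decode_block(exponent, mantissas):
    """Rebuild approximate pixel values from the shared-exponent code."""
    return [m << exponent for m in mantissas]


def dct_ii(x):
    """Unnormalized 1-D DCT-II, applied here to the block's mantissas."""
    n = len(x)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(x))
        for k in range(n)
    ]


block = [1000, 52000, 300, 47000]      # made-up HDR pixel values
exponent, mantissas = encode_block(block)
coefficients = dct_ii(mantissas)
restored = decode_block(exponent, mantissas)
```

With a shared exponent, the truncation error of every pixel in the block is bounded by `2**exponent`, so dark pixels next to a very bright one lose the most relative precision.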
This paper presents the state of the art of Integrated Streak Camera (ISC) architectures in standard CMOS technology. It focuses on some of the methods required for reconstructing the luminous event profile from the raw chip data. Two main ISC architectures are presented. The first adopts the pixel array configuration that is traditional for most silicon imagers, in which the photocharge-induced signal is processed directly in-pixel. The second approach is based on a single light-detecting vector, comparable to the slit of a Conventional Streak Camera (CSC), coupled to an amplifier stage and an analog sampling and storage unit. For both architectures, depending on the on-chip processing of the photocharges, appropriate signal reconstruction techniques are required in order to restore the luminous signal shape. A novel single-vector ISC front-end architecture with an asynchronous photodiode reset scheme is presented. Algorithms allowing reconstruction of the luminous event are proposed and validated through simulations for all the ISCs considered.
The paper proposes high-speed FPGA implementations of adders and multipliers in F_p. The work shows through experimental results that, owing to the optimized addition chains available in such devices, Karatsuba decomposition up to a particular level improves performance. Further, the paper modifies the existing interleaved multiplier using the Montgomery ladder and the high-speed adder circuits. Extensive experiments have been performed. The results show that the proposed design provides a 70% speedup over the best known designs.
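The idea of applying Karatsuba decomposition only down to a fixed level can be sketched in software. This is a textbook Karatsuba with a depth limit, not the paper's FPGA design: the `levels` parameter, the function names, and the use of Python's built-in multiplication as a stand-in for the base multiplier are all assumptions made for illustration.

```python
def karatsuba(a, b, bits, levels):
    """Multiply non-negative integers with at most `levels` Karatsuba splits.

    `bits` is the nominal operand width and only decides where to split.
    Below the level limit the built-in product stands in for the base
    multiplier (hypothetical stand-in for a hardware block).
    """
    if levels == 0 or bits <= 1:
        return a * b
    half = bits // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = karatsuba(a_lo, b_lo, half, levels - 1)
    hi = karatsuba(a_hi, b_hi, bits - half, levels - 1)
    # The operand sums can be one bit wider than either half.
    mid = karatsuba(a_lo + a_hi, b_lo + b_hi, bits - half + 1, levels - 1)
    # Three sub-products replace the four of the schoolbook method.
    return (hi << (2 * half)) + ((mid - lo - hi) << half) + lo


def mul_mod_p(a, b, p, levels=2):
    """Multiply in F_p: depth-limited Karatsuba product, then reduce mod p."""
    return karatsuba(a, b, p.bit_length(), levels) % p


p = 2**255 - 19                      # example prime, chosen for illustration
x, y = 3**150 % p, 7**90 % p
product = mul_mod_p(x, y, p, levels=3)
```

Each extra level trades one multiplication for several additions, which is why the benefit depends on how cheap additions are on the target device.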
To investigate the on-sensor processing capabilities of FPGAs, this paper presents a bird call recognition system for sensor network applications, based on linear predictive cepstral coefficients (LPCC) and the dynamic time warping (DTW) algorithm, and compares two different implementations on a Xilinx Spartan-3E FPGA with a MicroBlaze soft processor. The experimental results show that, compared to the software-only solution, the software/hardware (SW/HW) implementation with a hardware coprocessor for DTW improves performance by factors of 13.8 and 33.4, respectively, for two example inputs, and achieves about 31.1 times better energy efficiency while using only 7.5% more power.
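For readers unfamiliar with DTW, the classic dynamic-programming formulation is sketched below. It is a generic textbook version and not the coprocessor design from the paper: it compares sequences of scalars with an absolute-difference cost, whereas a recognizer of the kind described would compare LPCC feature vectors, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    The local cost is the absolute difference (an assumption; a vector
    distance would be used for cepstral features).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


template = [1, 2, 3]
stretched = [1, 2, 2, 3]             # same shape, different timing
distance = dtw_distance(template, stretched)
```

The doubly nested loop over a regular grid, with each cell depending only on three neighbours, is the kind of structure that lends itself to a hardware coprocessor.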
This paper presents a high-performance dual-core reconfigurable processor implementation methodology for a demosaicing system that targets next-generation camera systems. The implementation methodology is based on a dual-core architecture with coarse-grained dynamically reconfigurable processors. The demosaicing system adopts Freeman's algorithm, which has been partitioned and mapped onto two customized heterogeneous processor cores. The demosaicing engine's implementation has been optimized through compilation techniques and approaches specific to the target processor. Simulation results demonstrate that the resulting demosaicing system provides a throughput of up to 241.6 Mpixels/s, which represents a 1.82x speedup compared to a single-core implementation.