ISBN: (Print) 9783319304816; 9783319304809
One key issue in the design of real-time image processing and computer vision (IP/CV) systems is the massive volume of data to process. Not only the number of arithmetic and logic operations on the data but also the access to these data is an important issue. In this work, an Application-Specific Instruction Set Processor (ASIP) focused on real-time IP/CV algorithms was developed. Starting from a standard 32-bit Reduced Instruction Set Computer (RISC) as a benchmark, we analyzed the different issues and optimized the processor incrementally. We derived an economical image memory partition and a new data-path concept to speed up processing. RTL models were synthesized for an FPGA, enabling an analysis of power consumption, area, and processing speed that shows the corresponding overheads in comparison with the original processor architecture.
ISBN: (Print) 9781509061082
Numerous applications for mobile devices require 3D vision capabilities, which in turn require depth detection, since this enables the evaluation of an object's distance, position, and shape. Despite the increasing popularity of depth detection algorithms, available solutions need expensive hardware and/or additional ASICs, which are not suitable for low-cost commodity hardware devices. In this paper, we propose a low-cost, low-power embedded solution that provides high-speed depth detection. We extend an existing off-the-shelf VLIW image processor and perform algorithmic and architectural optimizations to achieve the required real-time performance. Experimental results show that by adding different functional units and adjusting the algorithm to take full advantage of them, a 640x480 image pair with 64 disparities can be processed at 36.75 fps on a single processor instance, an improvement of 23% over the best state-of-the-art image processor.
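The core computation in this kind of depth detection is a stereo correspondence search over candidate disparities. The paper's optimized VLIW implementation is not reproduced here; the following minimal sketch shows the underlying idea with a naive sum-of-absolute-differences (SAD) block matcher (function names, window size, and the synthetic image pair are illustrative, not from the paper):

```python
def disparity_map(left, right, max_disp=4, win=1):
    """Naive SAD block matching: for each pixel of the left image, find the
    horizontal shift d in [0, max_disp) that minimizes the sum of absolute
    differences over a (2*win+1)^2 window. Images are 2D lists of gray values."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(win, h - win):
        for x in range(win, w - win):
            best_d, best_cost = 0, float("inf")
            for d in range(min(max_disp, x - win + 1)):  # keep right index >= 0
                cost = sum(abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
                           for dy in range(-win, win + 1)
                           for dx in range(-win, win + 1))
                if cost < best_cost:
                    best_d, best_cost = d, cost
            disp[y][x] = best_d
    return disp

# Synthetic textured pair: the left view is the right view shifted by 2 pixels,
# so the true disparity away from the border is 2.
right = [[(7 * x + 3 * y) % 50 for x in range(10)] for y in range(6)]
left = [[right[y][max(x - 2, 0)] for x in range(10)] for y in range(6)]
disp = disparity_map(left, right)
```

A real-time implementation replaces this O(width x height x disparities x window) brute-force search with incremental cost updates and SIMD/VLIW vectorization, which is where the functional-unit extensions described in the abstract come in.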
Due to their high accuracy, inherent redundancy, and embarrassingly parallel nature, neural networks are fast becoming mainstream machine learning algorithms. However, these advantages come at the cost of high memory and processing requirements, which can be met by GPUs, FPGAs, or ASICs. For embedded systems, the requirements are particularly challenging because of stiff power and timing budgets. Thanks to the availability of efficient mapping tools, GPUs are an appealing platform for implementing neural networks. While there is significant work implementing image recognition (in particular convolutional neural networks) on GPUs, only a few works deal with efficiently implementing speech recognition on GPUs, and the work that does focus on speech recognition does not address embedded systems. To tackle this issue, this paper presents SPEED, an open-source framework to accelerate speech recognition on embedded GPUs. We use the Eesen speech recognition framework because it is considered among the most accurate speech recognition techniques. Experimental results reveal that the proposed techniques offer a 2.6X speedup compared to the state of the art.
The real-time presentation of the huge data rate generated by high-resolution radars is a challenging task. The latest developments in commercial off-the-shelf GPUs for image processing provide a suitable resource to imple...
The discrete Pascal transform (DPT) is a relatively recently introduced spectral transform based on the concept of the Pascal triangle, which has been known for centuries. It is used in digital image processing, digital filtering, pattern recognition, watermarking, and related areas. Its applicability is limited by the O(N^2) asymptotic time complexity of the best current algorithms for its computation, where N is the size of the function to be processed. In this paper, we propose a method for the efficient computation of the DPT in O(N log N) time, based on the factorization of its transform matrix into a product of three matrices with special structure: two diagonal matrices and a Toeplitz matrix. The Toeplitz matrix is further embedded into a circulant matrix of order 2N. The diagonalization of the circulant matrix by the Fourier matrix permits the use of the fast Fourier transform (FFT) for performing the computations, leading to an algorithm with overall computational complexity of O(N log N). Since the entries of the Toeplitz matrix have very different magnitudes, the numerical stability of this algorithm is also discussed. We also consider the issues in implementing the proposed algorithm for highly parallel computation on graphics processing units (GPUs). The experiments show that computing the DPT with the proposed algorithm on GPUs is orders of magnitude faster than the best current approach. As a result, the proposed method can significantly extend the practical applicability of the discrete Pascal transform.
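The diagonal-Toeplitz-diagonal factorization can be checked directly. For the common definition of the DPT matrix, P[i][j] = (-1)^j * C(i, j), one such factorization is D1 * T * D2 with D1 = diag(i!), a lower-triangular Toeplitz matrix T with entries 1/(i-j)!, and D2 = diag((-1)^j / j!). This is an illustrative reconstruction that may differ from the paper's exact sign/scaling convention; the sketch below verifies it with exact rational arithmetic:

```python
from fractions import Fraction
from math import comb, factorial

def dpt_matrix(n):
    # Discrete Pascal transform matrix: P[i][j] = (-1)^j * C(i, j).
    return [[Fraction((-1) ** j * comb(i, j)) for j in range(n)]
            for i in range(n)]

def factored_dpt_matrix(n):
    # P = D1 * T * D2: D1 = diag(i!), T Toeplitz with T[i][j] = 1/(i-j)!
    # for i >= j (else 0), D2 = diag((-1)^j / j!).
    d1 = [Fraction(factorial(i)) for i in range(n)]
    d2 = [Fraction((-1) ** j, factorial(j)) for j in range(n)]
    t = [[Fraction(1, factorial(i - j)) if i >= j else Fraction(0)
          for j in range(n)] for i in range(n)]
    return [[d1[i] * t[i][j] * d2[j] for j in range(n)] for i in range(n)]

assert dpt_matrix(8) == factored_dpt_matrix(8)
```

The diagonal factors cost O(N) to apply; the Toeplitz-by-vector product is then the only expensive step, and embedding T in a 2N circulant diagonalized by the Fourier matrix reduces it to three FFTs, giving the O(N log N) total claimed in the abstract.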
ISBN: (Print) 9781509037971
Modeling of 3D objects and scenes has become a common tool in different applied fields, from simulation-based design in high-end engineering applications (aviation, civil structures, engine components, etc.) to entertainment (computer-based animation, video-game development, etc.). In biology and related fields, 3D object modeling and reconstruction provide valuable tools to support visualization, comparison, and even morphometric analysis in both academic and applied tasks. Such computational tools, usually implemented as web-based virtual reality applications, significantly reduce the manipulation of fragile samples, preventing their damage and even their complete loss. On the other hand, they make it possible to take the morphological properties of physical specimens into the digital domain, supporting common entomology tasks such as characterization, morphological taxonomy, and teaching. This paper addresses the problem of producing reliable 3D point clouds from the surface of entomological specimens, based on a proven approach for multiview 3D reconstruction from high-resolution pictures. Given the traditional issues of macro-photography for small-sized objects (i.e., short depth of field, presence of subtle and complex structures, etc.), a pre-processing protocol based on focus stacking supported the generation of enhanced views obtained with an acquisition device specifically designed for this work. The proposed approach has been tested on a sample of six representative subjects from the Entomological Collection of the Centro de Biosistemas, Universidad Jorge Tadeo Lozano (Colombia). The resulting point clouds exhibit an overall good visual quality for the body structure of the selected specimens, while file sizes remain portable enough to support web-based visualization.
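The focus-stacking pre-processing step can be illustrated with a minimal per-pixel sharpness selection: each output pixel is taken from the slice of the focal stack that is locally sharpest. The abstract does not specify the sharpness measure used; the absolute discrete Laplacian below is a common, illustrative choice, and the two synthetic slices are hypothetical:

```python
def sharpness(img, y, x):
    # Absolute discrete Laplacian as a simple local-contrast (focus) score.
    return abs(4 * img[y][x] - img[y - 1][x] - img[y + 1][x]
               - img[y][x - 1] - img[y][x + 1])

def focus_stack(stack):
    # Per pixel, keep the value from the slice with the highest local
    # sharpness; border pixels are copied from the first slice.
    h, w = len(stack[0]), len(stack[0][0])
    out = [row[:] for row in stack[0]]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            best = max(stack, key=lambda img: sharpness(img, y, x))
            out[y][x] = best[y][x]
    return out

# Slice A is defocused (flat); slice B is in focus near a vertical edge.
A = [[10] * 8 for _ in range(5)]
B = [[0 if x < 4 else 100 for x in range(8)] for _ in range(5)]
fused = focus_stack([A, B])
```

Production focus stacking additionally aligns the slices and smooths the per-pixel selection map to avoid halo artifacts, but the selection principle is the one shown here.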
ISBN: (Print) 9783319304816; 9783319304809
In this work, we present a methodology for designing the next generation of real-time vision processors. These processors are expected to achieve high throughput with complex applications under real-time embedded constraints (time, fault tolerance, silicon area, and power consumption). To achieve these goals, we propose the fusion of two key concepts: Focal-Plane Image Processing (FPIP) and many-core architectures. We show the concepts and ideas needed to build up a methodology able to offer both design-space exploration and a customized programming toolchain for the final architecture. We present implementation details and results for the working parts of the framework, along with partial results and general comments on the work in progress.
ISBN: (Print) 9781479953417
Various optimized coordinate rotation digital computer (CORDIC) designs have been proposed to date. Nonetheless, in the presence of natural faults, such architectures could produce erroneous outputs. In this paper, we propose error detection schemes for CORDIC architectures, which are widely used in applications such as complex-number multiplication and singular value decomposition for signal and image processing. To the best of our knowledge, this work is the first to provide reliable architectures for these variants of CORDIC. We present three variants of recomputing with encoded operands to detect both transient and permanent faults. The overheads and effectiveness of the proposed designs are benchmarked through Xilinx FPGA implementations and error simulations. The proposed approaches can be tailored to the overhead tolerance and reliability constraints to be achieved.
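As background, the circular-rotation CORDIC that such schemes protect reduces sine/cosine to shift-and-add iterations, and "recomputing with encoded operands" re-runs the datapath on a transformed input and checks an invariant. The software sketch below illustrates the idea with a negated-operand recomputation; it is an assumption-laden illustration, not one of the paper's three hardware variants:

```python
import math

def cordic_sincos(theta, iters=32):
    # Circular-rotation CORDIC: start at (1/gain, 0), where the gain is
    # prod(sqrt(1 + 2**-2i)), and rotate toward theta using only shifts
    # (multiplications by 2**-i) and adds. Returns (~cos(theta), ~sin(theta)).
    x = math.prod(math.cos(math.atan(2.0 ** -i)) for i in range(iters))
    y, z = 0.0, theta
    for i in range(iters):
        d = 1.0 if z >= 0 else -1.0       # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x, y

def checked_sincos(theta):
    # Recomputing with a negated (encoded) operand: cos is even and sin is
    # odd, so a mismatch between the two runs flags a datapath fault.
    c1, s1 = cordic_sincos(theta)
    c2, s2 = cordic_sincos(-theta)
    if abs(c1 - c2) > 1e-9 or abs(s1 + s2) > 1e-9:
        raise RuntimeError("CORDIC fault detected")
    return c1, s1
```

In hardware, the second run is typically interleaved on the same functional units, so the scheme detects permanent faults as well as transients at the cost of extra latency rather than duplicated area.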
NASA Technical Reports Server (NTRS) 19800007188: Proceedings of Technical Sessions, Volumes 1 and 2: The LACIE Symposium, by NASA Technical Reports Server (NTRS); published by
According to estimates of the World Health Organization (WHO), in 2014 more than 1.9 billion adults aged 18 years and older were overweight. Overall, about 13% of the world's adult population (11% of men and 15% of women) were obese, and 39% of adults aged 18 and over (38% of men and 40% of women) were overweight; the worldwide prevalence of obesity more than doubled between 1980 and 2014. The purpose of this study is to design a convolutional neural network model and to provide a food dataset collection for distinguishing the nutrition groups that people consume in daily life. To this end, the two pretrained models AlexNet and CaffeNet were fine-tuned, and a similar structure was trained from scratch on the dataset. Food images were gathered from the Food-11, FooDD, and Food100 datasets and from web archives. According to the test results, the fine-tuned models provided better results than the structure trained from scratch, as expected. However, the trained model can be improved with more training examples and can then serve as a task-specific structure for classifying nutrition groups.