The discrete cosine transform (DCT) is a central mathematical operation in several digital signal processing methods and image/video standards. In this paper, we propose a collection of twelve approximations for the 8-point DCT based on integer functions. The functions considered include the floor, ceiling, truncation, and rounding-off functions. The approximations sought are required to meet the following criteria: (i) very low arithmetic complexity, (ii) orthogonality or quasi-orthogonality, and (iii) low-complexity inversion. By varying a scaling parameter, approximations could be obtained systematically, and several existing approximations were identified as particular cases of the proposed methodology. Particular cases include the signed DCT and the rounded DCT. Four new quasi-orthogonal approximations were introduced and their practical relevance was demonstrated. All approximations were given fast algorithms based on matrix factorization methods. The proposed approximations are multiplierless; their computation requires only additions and bit-shifting operations. Additive complexity ranged from 18 to 24 additions. The approximations obtained were compared with the exact DCT and assessed in the context of JPEG-like image compression. As quality assessment measures, we considered the peak signal-to-noise ratio and the structural similarity index. Because of their low complexity and good performance, the proposed approximations are suitable for hardware implementation in dedicated architectures. (C) 2013 Elsevier B.V. All rights reserved.
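The idea of applying an integer function to a scaled DCT matrix can be illustrated with a minimal Python sketch. It is not the paper's implementation: the helper names (`dct_matrix`, `approximate`, `gram`) are hypothetical, and the choice of scaling parameter 2 with round-off, which is assumed here to correspond to the rounded DCT, is an assumption rather than something the abstract states.

```python
import math

def dct_matrix(n=8):
    """Exact orthonormal n-point DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def round_half_up(x):
    """Round-off function; math.floor, math.ceil or math.trunc could be used instead."""
    return math.floor(x + 0.5)

def approximate(matrix, alpha, int_fn):
    """Apply an integer function entrywise to the matrix scaled by alpha (hypothetical helper)."""
    return [[int(int_fn(alpha * v)) for v in row] for row in matrix]

def gram(t):
    """Compute T * T^T; a diagonal result means the rows are mutually orthogonal."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in t] for r1 in t]

C = dct_matrix()
T = approximate(C, 2.0, round_half_up)  # assumed setting for the rounded DCT
G = gram(T)
```

With these settings every entry of `T` is -1, 0 or 1, so applying `T` to a vector needs additions only, and `G` is diagonal, so the inverse is the transpose followed by a per-row scaling. Other values of `alpha` or other integer functions give different candidate matrices, which would have to be checked against the three criteria individually.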
ISBN: 0780367251 (print)
Although FPGA technology offers the potential of designing high-performance systems at low cost for a wide range of applications, its programming model is prohibitively low-level, requiring either a dedicated FPGA-experienced programmer or basic digital design knowledge. To allow a signal/image processing end-user to benefit from this kind of device, the level of design abstraction needs to be raised, even beyond the level of a Hardware Description Language (e.g., VHDL). This approach will help the application developer to focus on signal/image processing algorithms rather than on low-level designs and implementations. This paper aims to present a framework for an FPGA-based coprocessor dedicated to Discrete Wavelet Transforms (DWT). The proposed approach will help the end-user to generate FPGA configurations for DWT at a high level without any knowledge of the low-level design styles and architectures.
ISBN: 0780370414 (print)
Signal processing algorithms and architectures can use dynamic reconfiguration to exploit variations in signal statistics with the objectives of improved performance and reduced power consumption. Parameters provide a simple and formal way to characterize incremental changes to a computation and its computing mechanism. This paper examines five parameterized computations which are typically implemented in hardware for a wireless multimedia terminal: 1) motion estimation, 2) discrete cosine transform, 3) Lempel-Ziv lossless compression, 4) 3D graphics light rendering, and 5) Viterbi decoding. Each computation is examined for the capability of dynamically adapting the algorithm and architecture parameters to variations in their respective input signals. Dynamically reconfigurable low-power implementations of each computation are currently underway.
ISBN: 0780370414 (print)
DSPs with dual memory banks offer high memory bandwidth, which is required for high-performance applications. However, such DSP architectures pose problems for C compilers, which are mostly not capable of partitioning program variables between memory banks. As a consequence, time-consuming assembly programming is required for an efficient coding of time-critical algorithms. This paper presents a new technique for automatic variable partitioning between memory banks in compilers, which leads to a higher utilization of available memory bandwidth in the generated machine code. We present experimental results obtained by integrating the proposed technique into an existing C compiler for the AMS Gepard, an industrial DSP core.
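The abstract does not describe the partitioning technique itself. As a rough illustration of the problem only, the following Python sketch uses a simple greedy heuristic that is not the paper's method: variables that are often accessed together are pushed into different banks so that both accesses could be issued in parallel. The function name, the bank names "X" and "Y", and the access counts are all hypothetical.

```python
def partition_variables(pair_weights):
    """Greedily assign variables to two banks, 'X' and 'Y'.

    pair_weights maps (var_a, var_b) to how often the two variables are
    accessed together (hypothetical profile data).
    """
    # Build a symmetric weighted graph of simultaneous accesses.
    weight = {}
    for (a, b), w in pair_weights.items():
        weight.setdefault(a, {})
        weight.setdefault(b, {})
        weight[a][b] = weight[a].get(b, 0) + w
        weight[b][a] = weight[b].get(a, 0) + w

    # Place the most heavily connected variables first (name breaks ties).
    order = sorted(weight, key=lambda v: (-sum(weight[v].values()), v))

    banks = {}
    for v in order:
        # Cost of a bank = weight of already placed neighbours in that bank.
        cost = {"X": 0, "Y": 0}
        for other, w in weight[v].items():
            if other in banks:
                cost[banks[other]] += w
        banks[v] = "X" if cost["X"] <= cost["Y"] else "Y"
    return banks

# Hypothetical access profile for three variables.
banks = partition_variables({("a", "b"): 3, ("b", "c"): 2, ("a", "c"): 1})
```

In this toy profile the two heaviest pairs, (a, b) and (b, c), end up split across banks, and only the lightest pair (a, c) shares a bank. A production compiler would also need to account for bank capacity and the target's addressing constraints, which this sketch ignores.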
ISBN: 0780370414 (print)
Vector rotation is a key operation employed extensively in many digital signal processing applications. In this paper, we introduce a new design concept called Angle Quantization (AQ). It can be used as a design index for vector rotational operations in which the rotational angle is known in advance. Based on the AQ process, we establish a unified design framework for cost-effective low-latency rotational algorithms and architectures. Several existing works, such as conventional CORDIC, AR-CORDIC, MVR-CORDIC, and EEAS-based CORDIC, can be fitted into the design framework, forming a Vector Rotational CORDIC Family. Based on the new design framework, we can realize high-speed, low-complexity rotational VLSI circuits without degrading the precision performance in fixed-point implementations.
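For orientation, the conventional CORDIC rotation that the framework takes as one member of the family can be sketched in Python as below. This is a floating-point illustration of the standard shift-and-add micro-rotations only; it does not implement Angle Quantization or the other CORDIC variants, and the function name and iteration count are assumptions.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians using conventional CORDIC.

    The angle must lie within the CORDIC convergence range (about +/-1.74 rad).
    """
    z = angle      # residual angle still to be rotated
    gain = 1.0     # accumulated scaling of the micro-rotations
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0   # rotation direction for this step
        shift = 2.0 ** -i             # a right shift by i bits in fixed point
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)     # elementary angle atan(2^-i)
        gain *= math.sqrt(1.0 + shift * shift)
    # Compensate the known constant gain (about 1.6468 for many iterations).
    return x / gain, y / gain

xr, yr = cordic_rotate(1.0, 0.0, math.pi / 4)
```

In hardware the multiplications by `shift` become bit shifts and the gain is a constant that can be folded into a later stage. When the rotation angle is known in advance, the direction sequence `d` could be computed offline rather than at run time.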
ISBN: 0819441848 (print)
This research explores architectures and design principles for monolithic optoelectronic integrated circuits (OEICs) through the implementation of an optical multi-token-ring network testbed system. Monolithic smart pixel CMOS OEICs are of paramount importance to high performance networks, communication switches, computer interfaces, and parallel signal processing for demanding future multimedia applications. The general testbed system is called Reconfigurable Translucent Smart Pixel Array (R-Transpar) and includes a field programmable gate array (FPGA), a transimpedance receiver array, and an optoelectronic very large-scale integrated (OE-VLSI) smart pixel array. The FPGA is an Altera FLEX10K100E chip that performs logic functions and receives inputs from the transimpedance receiver array. A monolithic (OE-VLSI) smart pixel device containing an array of 4 x 4 vertical-cavity surface-emitting lasers (VCSELs) spatially interlaced with an array of 4 x 4 metal-semiconductor-metal (MSM) detectors connects to these devices and performs optical input-output functions. These components are mounted on a printed circuit board for testing and evaluation of integrated monolithic OEIC designs and various optical interconnection techniques. The system moves information between nodes by transferring 3-D optical packets in free space or through fiber image guides. The R-Transpar system is reconfigurable to test different network protocols and signal processing functions. In its operation as a 3-D multi-token-ring network, we use a specific version of the system called Transpar-Token-Ring (Transpar-TR) that uses novel time-division multiplexed (TDM) network node addressing to enhance channel utilization and throughput. Host computers interface with the system via a high-speed digital I/O board that sends commands for networking and application algorithm operations. We describe the system operation and experimental results in detail.
This paper presents an approach to automatically register infrared and millimeter wave images for concealed weapon detection applications. The distortion between the two images is assumed to be a rigid body transformation, and we assume that the scale factor can be calculated from the sensor parameters and the ratio of the two distances from the object to the imagers. Therefore, the only pose parameters that need to be found are the x-displacement and the y-displacement. Our registration procedure involves image segmentation, binary correlation, and some other image processing algorithms. Experimental results indicate that the automatic registration procedure performs fairly well.
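Once both images have been segmented into binary masks at a common scale, finding the two displacements by binary correlation can be sketched as an exhaustive search. The Python below is a minimal illustration under those assumptions, not the authors' procedure; the function names, the search range, and the toy masks are hypothetical.

```python
def binary_correlation(ref, mov, dx, dy):
    """Count pixels that are 1 in both masks after shifting `mov` by (dx, dy)."""
    h, w = len(ref), len(ref[0])
    score = 0
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx   # source pixel in the moving mask
            if 0 <= sy < h and 0 <= sx < w and ref[y][x] and mov[sy][sx]:
                score += 1
    return score

def register_translation(ref, mov, max_shift):
    """Return the (dx, dy) within +/-max_shift that maximizes the overlap."""
    best, best_score = (0, 0), -1
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = binary_correlation(ref, mov, dx, dy)
            if score > best_score:
                best, best_score = (dx, dy), score
    return best

def make_mask(size, pixels):
    """Build a size x size binary mask with 1s at the given (y, x) positions."""
    mask = [[0] * size for _ in range(size)]
    for y, x in pixels:
        mask[y][x] = 1
    return mask

# Toy data: the same L-shaped blob, displaced by 2 pixels in x and 1 in y.
mov = make_mask(8, [(1, 1), (1, 2), (2, 1), (3, 1)])
ref = make_mask(8, [(2, 3), (2, 4), (3, 3), (4, 3)])
shift = register_translation(ref, mov, 3)
```

The search cost grows with the square of `max_shift`, so a practical system might restrict the range using prior knowledge of the sensor geometry.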
ISBN: 0819419230 (print)
The research presented here focuses on the general problem of finding tools and methods to compare and evaluate parallel architectures in one particular field: computer vision. As several different parallel architectures have been proposed for machine vision, some means of comparing them is necessary in order to employ the most suitable architecture for a given application. 'Benchmarks' are the most popular tools for machine speed comparison, but they do not give any information on the most convenient hardware structures for implementing a given vision problem. This paper tries to overcome this weakness by proposing a definition of the concept of a tool for the evaluation of parallel architectures (more general than a benchmark), and provides a characterization of the chosen algorithms. Taking into account the different ways to process data, it is necessary to consider two different classes of machines, MISD and (MIMD, SPMD, SIMD), which offer different programming models and thus lead to two classes of algorithms. Consequently, two algorithms, one for each class, are proposed: 1) the extraction of connected components, and 2) a parallel region growing algorithm with data reorganization. The second algorithm tests the capabilities of the architecture to support the following: i) pyramidal data structures (initial region step), ii) a merge procedure between global and global information (adjacent regions to the growing region), and iii) a parallel merge procedure between local and global information (adjacent points to the growing region).
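As a point of reference for the first proposed algorithm, a sequential extraction of connected components can be sketched in Python as follows. This is only a baseline illustration of the task: it assumes 4-connectivity and a breadth-first flood fill, neither of which is stated in the abstract, and it says nothing about how the paper maps the task onto parallel machines.

```python
from collections import deque

def label_components(image):
    """Label 4-connected foreground regions of a binary image.

    Returns (labels, count); background pixels keep label 0.
    """
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] and not labels[y][x]:
                # New, unlabeled foreground pixel: start a flood fill.
                count += 1
                labels[y][x] = count
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
labels, count = label_components(image)
```

The data-dependent, irregular memory access of the flood fill is what makes this task a demanding test for a parallel architecture.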
ISBN: 081940943X (print)
A tree-structured wavelet transform has been developed for texture classification in our previous work. The new transform, which offers a non-redundant representation, is able to zoom into dominant frequency channels containing significant information of textures and can be interpreted as the decomposition of a 2-D function with the wavelet packet basis. In this research, we extend our work to the texture segmentation problem. A new multiscale texture segmentation algorithm based on the tree-structured wavelet transform and hierarchical fuzzy clustering technique is proposed. Numerical experiments are given to demonstrate the performance of our new algorithm.