This article considers the problem of thickness measurement and flaw detection in concrete with one-sided access. The problem can be solved by ultrasonic echo methods based on multiple-unit antenna arrays, with subsequent data processing in accordance with tomographic algorithms. The article describes the data processing algorithm and the criteria for selecting antenna array parameters, which are based on experimental investigations. Results of inspecting the inner structure of concrete with the proposed method are presented. In practice, it is possible to obtain an image of a concrete cross section at depths of up to one meter and to detect hollow defects with diameters of more than 50 mm.
In video applications, digital encoding techniques are frequently associated with the use of compression algorithms. Compression algorithms such as the MPEG-2 standard allow economies in transmission bandwidth while providing a service with a quality similar to or better than that of current analogue systems. The reduction in the number of bits necessary to represent a given sequence is achieved by exploiting spatial and temporal redundancy in the original video sequences. The number of bits necessary to achieve a given picture quality level is variable because the redundancy in the original sequence is also variable. This means that compression algorithms such as the MPEG-2 standard generate variable bit rate (VBR) encoded bit streams if a constant quality is to be obtained. The goal of the work described in this paper is to develop dynamic multiplexing algorithms to be implemented at the user terminal equipment and to study the potential benefits of jointly controlling the bit rate of several encoders and the channel bit rate. The bit rate of the VBR transmission channel is constrained by leaky bucket policing functions. The source bit rate is determined and controlled on a picture basis by the proposed dynamic bandwidth allocation algorithm, with the objective of maintaining a minimum level of quality among all video sources while always using the least possible bandwidth and without violating the traffic contract.
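The interaction between a leaky bucket policer and a per-picture bit budget can be illustrated with a minimal sketch. It is not the paper's allocation algorithm: the proportional scaling rule, the class and function names, and the parameters `rate_bps`, `capacity_bits`, and `picture_interval_s` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    """Policing function: the bucket drains at rate_bps and holds capacity_bits."""
    rate_bps: float        # hypothetical sustained rate from the traffic contract
    capacity_bits: float   # hypothetical burst tolerance
    level_bits: float = 0.0

    def drain(self, interval_s: float) -> None:
        # The bucket empties at the contracted rate, never below zero.
        self.level_bits = max(0.0, self.level_bits - self.rate_bps * interval_s)

    def headroom(self) -> float:
        # Bits that can still be sent without violating the contract.
        return self.capacity_bits - self.level_bits

    def add(self, bits: float) -> None:
        # Clamp to the capacity to absorb floating-point rounding.
        self.level_bits = min(self.capacity_bits, self.level_bits + bits)


def allocate_picture_budgets(demands_bits, bucket, picture_interval_s):
    """Scale the per-source demands for one picture period so the total conforms.

    Assumed rule (not from the paper): if the summed demand exceeds the
    bucket headroom, every source is scaled down by the same factor.
    """
    bucket.drain(picture_interval_s)
    total = sum(demands_bits)
    allowed = bucket.headroom()
    scale = 1.0 if total <= allowed else allowed / total
    budgets = [d * scale for d in demands_bits]
    bucket.add(sum(budgets))
    return budgets
```

A quality-aware allocator, as targeted by the paper, would replace the uniform scaling with a rule that protects the source closest to the minimum quality level.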
This paper presents a technique for simulating processors based on the principle of compiled simulation. Unlike existing, commercially available instruction set simulators for DSPs, which are of interpretive character, the proposed technique performs instruction decoding and simulation scheduling at compile time. The technique offers up to three orders of magnitude faster simulation. The high speed allows the user to explore algorithms and hardware/software trade-offs before any hardware implementation. Moreover, the user can tailor the compiled simulation to trade speed for more accuracy. In this paper, the sources of the speedup and the limitations of the technique are analyzed and the realization of the simulation compiler is presented.
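The difference between interpretive and compiled simulation can be sketched on a toy example. The two-opcode instruction set (`addi`, `mul`), the tuple encoding, and the function names below are hypothetical and are not taken from the paper; the sketch only shows decoding being moved from run time to a one-off translation step.

```python
def run_interpretive(program, regs):
    """Interpretive simulation: decode every instruction each time it executes."""
    regs = list(regs)
    for op, rd, rs, x in program:
        if op == "addi":          # rd = rs + immediate
            regs[rd] = regs[rs] + x
        elif op == "mul":         # rd = rs * register x
            regs[rd] = regs[rs] * regs[x]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


def compile_program(program):
    """Compiled simulation: decode once and emit host code for the whole program."""
    lines = ["def simulate(regs):", "    r = list(regs)"]
    for op, rd, rs, x in program:
        if op == "addi":
            lines.append(f"    r[{rd}] = r[{rs}] + {x}")
        elif op == "mul":
            lines.append(f"    r[{rd}] = r[{rs}] * r[{x}]")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    lines.append("    return r")
    namespace = {}
    exec("\n".join(lines), namespace)   # translation happens once, ahead of the run
    return namespace["simulate"]


program = [("addi", 0, 0, 3), ("addi", 1, 1, 4), ("mul", 2, 0, 1)]
simulate = compile_program(program)
print(run_interpretive(program, [0, 0, 0]), simulate([0, 0, 0]))
```

The generated `simulate` function contains no opcode dispatch at all, which is the source of speedup this kind of technique relies on. The toy handles straight-line code only; branches would require the scheduling step the abstract mentions.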
We propose an open signal processing system design and implementation environment, BEEHIVE, that allows application developers to rapidly compose and debug functional specifications in a networked, distributed computing environment, and then later migrate the application (transparently) onto an embedded, distributed computing hardware/software platform, with the capability to reconfigure (adaptively) the resources assigned to the application to meet the dynamic real-time requirements of the implementation. This operating environment harnesses recent developments in virtual machines; broker-based, distributed, transportable computing; object-oriented programming methodologies; Java and its real-time extensions; reconfigurable and programmable hardware; approximate algorithms; and adaptive load- and resource-management algorithms.
This paper presents a comparative study of adaptive wavelet design for a speech coder based on wavelet-type multiresolution transforms. It is concerned with the problem of choosing suitable transforms that are adapted to the given speech signal in the sense that they maximize the coding gain at each resolution level. Four adaptive algorithms are reviewed for solving this problem in the framework of lattice realization of multirate lossless filter banks. The resulting speech coding scheme is presented. The performance is compared for each of the algorithms, as well as some non-adapted multiresolution transforms.
This paper describes the architecture and design of a general-purpose parallel image processor chip called the SliM-II Image Processor. The chip has a linear array of 64 processing elements (PEs), operates at 30 MHz in worst-case simulation, and delivers 1.92 GIPS. SliM-II can greatly reduce the inter-PE communication overhead thanks to the idea of sliding, that is, overlapping inter-PE communication with computation. In contrast to existing array processors, each PE has a multiplier, which is quite effective for convolution, template matching, etc. The instruction set can execute an ALU operation, data I/O, and inter-PE communication simultaneously in one instruction cycle. In addition, during the ALU/multiplier operation, SliM-II provides parallel load/store between the register file and on-chip memory, as in DSP chips. The bandwidth of data I/O and inter-PE communication increases due to bit-parallel paths. We developed VHDL models and performed logic synthesis using the COMPASS™ CAD tool. We used the COMPASS™ 3.3 V 0.6 μm standard cell library (v8r4.9.1). The total number of transistors is about 1.5 million. The SliM-II chip is being fabricated at LG Semiconductor Co., Ltd. The performance estimation shows a significant improvement for algorithms requiring multiplications compared with existing array processors.
We show a high throughput implementation of SAR on high performance computing (HPC) platforms. In our implementation, the processors are divided into two groups of size M and N. The first group, consisting of M processors, computes the FDC (frequency domain convolution) in the range dimension, and the second group of N processors computes the FDC in the azimuth dimension. M and N are determined by the computational requirements of FDC in the range and azimuth dimensions respectively. The key contribution of this paper is the development of a general high-throughput M-to-N communication algorithm. The M-to-N communication algorithm is a basic communication primitive used in many signal processing applications when a software task pipeline is employed to obtain high throughput performance. Our algorithm reduces the number of communication steps to lg(N/M + 1) + n(k − 1), where k ≥ 2 and n = [lg_k M]. Implementation results on the IBM SP2 and the Cray T3D based on the MITRE real-time benchmarks are presented. The results show that, given an image of size 1K × 1K, the minimum number of processors required for processing the SAR benchmarks can be reduced by 50% by using the proposed communication algorithm.
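The step count quoted in the abstract can be evaluated numerically with a short sketch. Two readings are assumed here and are not confirmed by the abstract: the unsubscripted logarithm is taken as base 2, and the bracketed term is taken as a ceiling of the base-k logarithm of M. The function names are hypothetical.

```python
import math


def ceil_log(value, base):
    """Smallest integer n with base**n >= value, computed with integer arithmetic."""
    n, power = 0, 1
    while power < value:
        power *= base
        n += 1
    return n


def communication_steps(m, n_procs, k):
    """Evaluate lg(N/M + 1) + n(k - 1) for group sizes m (M) and n_procs (N).

    Assumptions: lg is log base 2, and n is the ceiling of log base k of M.
    """
    if k < 2:
        raise ValueError("the abstract requires k >= 2")
    n = ceil_log(m, k)
    return math.log2(n_procs / m + 1) + n * (k - 1)


# Example with hypothetical group sizes: M = 4 range processors, N = 12 azimuth processors.
print(communication_steps(4, 12, 2))
```

Under these assumptions, increasing k shrinks n but raises the factor (k − 1), so the best k depends on M.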
A mixed-signal array processor used as an image processor for an intelligent cruise control system is presented. The processor's ALU is a programmable analog arithmetic circuit which can perform addition, subtraction, multiplication, and division with 1.3% accuracy. This circuit enables the processor to operate with the low power and low area of a dedicated analog circuit while retaining the programmability of a digital processor. The array processor performed an edge detection algorithm and a sub-pixel resolution algorithm. A 1 cm square array of the mixed-signal processor cells in 0.8 μm CMOS with a 5 V power supply would dissipate 1 W at 420 MIPS.
A new efficient implementation of an IEEE-standard-conformant 8-point discrete cosine transform (DCT) is presented. The architecture is based on different classes of orthogonal 2 × 2 μ-rotations used to approximate the angles of the DCT. Using only orthogonal μ-rotations guarantees that the whole transform remains orthogonal and that perfect reconstruction of the signal can be achieved. It is shown that implementing the DCT with approximated rotation angles (angle quantization) requires about 28% fewer shift-and-add operations than a standard-conformant implementation with coefficient quantization. This leads to a large power benefit due to less adder hardware and a lower capacitive load on the global interconnects. There are also advantages concerning area and delay. To support the full custom design of the layout, module generators for all the different classes of μ-rotations can be used to generate the necessary rotations automatically.
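Why a shift-and-add rotation stays orthogonal can be seen from a minimal sketch. The abstract does not define its classes of μ-rotations, so the CORDIC-style matrix [[1, −2^−k], [2^−k, 1]] used here is an assumption chosen as the simplest familiar example, and the function names are hypothetical.

```python
import math


def mu_rotate(x, y, k):
    """Unscaled rotation built from one shift and one add per output.

    Multiplying by 2**-k stands in for a right shift by k bits in hardware.
    """
    t = 2.0 ** -k
    return x - t * y, y + t * x


def mu_angle(k):
    """Angle realised by the rotation: atan(2**-k)."""
    return math.atan(2.0 ** -k)


def mu_scale(k):
    """Common length scaling of both outputs: sqrt(1 + 2**(-2k))."""
    return math.sqrt(1.0 + 4.0 ** -k)


x, y = mu_rotate(3.0, 4.0, 2)
# The vector is rotated by mu_angle(2) and stretched by mu_scale(2), nothing else.
print(math.hypot(x, y) / math.hypot(3.0, 4.0), mu_scale(2))
```

Because the matrix columns are orthogonal and of equal length, the result is an exact rotation by atan(2^−k) up to a known scale factor. Quantizing the angle this way, rather than quantizing cosine and sine coefficients independently, is what preserves orthogonality.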
A distributed optimization framework and its application to the regulation of the behavior of a network of interacting image processing algorithms are presented. The algorithm parameters used to regulate information extraction are explicitly represented as state variables associated with all network nodes. Nodes are also provided with message-passing procedures to represent dependences between parameter settings at adjacent levels. The regulation problem is defined as a joint-probability maximization of a conditional probabilistic measure evaluated over the space of possible configurations of the whole set of state variables (i.e., parameters). The global optimization problem is partitioned and solved in a distributed way, by considering local probabilistic measures for selecting and estimating the parameters related to specific algorithms used within the network. The problem representation allows a spatially varying tuning of parameters, depending on the different informative contents of the subareas of an image. An application of the proposed approach to an image processing problem is described. The processing chain chosen as an example consists of four modules. The first three algorithms correspond to network nodes. The topmost node is devoted to integrating information derived from applying different parameter settings to the algorithms of the chain. The nodes associated with data-transformation processes to be regulated are represented by an optical sensor and two filtering units (for edge-preserving and edge-extracting filtering), and a straight-segment detection module is used as an integration site. Each module is provided with knowledge concerning the parameters to regulate the related processing phase and with specific criteria to estimate data quality. Messages can be propagated bidirectionally among modules in order to search, in a distributed way, for the "optimum" set of parameters yielding the best solution. Experimental results obtained on indoor