A testbed for investigation of heterogeneous and reconfigurable data network fabrics supporting a parallel DSP computational accelerator is described. The DSP processors are large-grained processors (Analog Devices SHARC DSPs), with a variety of parallel DSP array architectures possible. The network fabric is intended to be reconfigurable (within a rich but necessarily limited set of structures) to adapt to the needs of a sequence of image processing algorithms being executed (e.g., in a medical image processing environment). The testbed will exploit conventional FPGA components to provide reconfigurable network structures and will exploit commercial high-speed interconnect components emerging for uses such as board-to-board interconnection. As a computational accelerator, the testbed is intended to be controlled by a host processor, with the host processor cooperating in the definition of the changes to the network structure as execution of a sequence of image processing algorithms proceeds.
The aim of the paper is to introduce two new multichannel median type filters, obtained by vector extension of their scalar (gray scale image) counterparts. The two filters are typical examples of two situations in image processing: the use of the fuzzy attribute to justify the weighting factors and the fuzzy attribute as descriptor for rule-based processing.
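To make the idea of a "vector extension" of the scalar median concrete, the sketch below shows the classical (non-fuzzy) vector median of a window of multichannel pixels: the output is the input vector whose summed distance to all other vectors in the window is smallest. This is not the pair of fuzzy filters the abstract introduces; the function name `vector_median`, the Euclidean distance, and the sample window are assumptions chosen for illustration.

```python
from math import dist

def vector_median(window):
    """Classical vector median (illustrative, not the paper's fuzzy filters).

    Returns the pixel of `window` that minimizes the sum of Euclidean
    distances to every pixel in the window. The output is always one of
    the input vectors, so no new colors are introduced.
    """
    return min(window, key=lambda p: sum(dist(p, q) for q in window))

# Hypothetical 5-pixel RGB window containing one impulsive outlier.
window = [(10, 12, 11), (11, 13, 10), (250, 3, 7), (9, 12, 12), (10, 11, 11)]
result = vector_median(window)
print(result)
```

A weighted variant would multiply each distance by a per-pixel weight; the abstract suggests that a fuzzy attribute can be used to justify such weights, but the exact scheme is not given there.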
We develop exact algorithms for geometric operations on general circles and circular arcs on the sphere, using integer homogeneous coordinates. The algorithms include testing a point against a circle, computing the intersection of two circles, and ordering three arcs out of the same point. These operations allow robust manipulation of maps on the sphere, providing a reliable framework for GIS, robotics, and other geometric applications.
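As a rough illustration of why integer homogeneous coordinates allow exact predicates, the sketch below tests a point against a circle using only integer arithmetic. The representation is an assumption made for this example and may differ from the paper's: a point is an integer 4-tuple (x, y, z, w) with w > 0 standing for (x/w, y/w, z/w), and a circle is the intersection of the sphere with the plane aX + bY + cZ + d = 0, stored as the integer 4-tuple (a, b, c, d). The function name `side_of_circle` is hypothetical.

```python
def side_of_circle(point, circle):
    """Exact side test of a point against a circle on the sphere.

    Assumed representation (illustrative only):
      point  = (x, y, z, w), integers, w > 0, meaning (x/w, y/w, z/w)
      circle = (a, b, c, d), integers, the plane a*X + b*Y + c*Z + d = 0
    Returns +1, 0, or -1 for the positive side, on the circle, or the
    negative side. Python integers are arbitrary precision, so the sign
    is computed without rounding error.
    """
    x, y, z, w = point
    a, b, c, d = circle
    s = a * x + b * y + c * z + d * w
    return (s > 0) - (s < 0)

# Circle cut from the unit sphere by the plane z = 3/5, i.e. 5Z - 3 = 0.
circle = (0, 0, 5, -3)
print(side_of_circle((0, 0, 1, 1), circle))   # north pole
print(side_of_circle((4, 0, 3, 5), circle))   # the point (4/5, 0, 3/5)
print(side_of_circle((1, 0, 0, 1), circle))   # a point on the equator
```

Because every quantity stays an integer, degenerate cases such as a point lying exactly on the circle are detected reliably, which is the property that robust map manipulation depends on.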
This article considers the problem of thickness measurement and flaw detection in concrete with one-sided access. It can be solved by ultrasonic echo methods based on the application of multiple-unit antenna arrays, with further data processing in accordance with tomographic algorithms. The data processing algorithm and the criteria for selecting antenna array parameters, based on experimental investigations, are described in the article. The results of concrete inner structure inspection obtained by the proposed method are presented. In practice, it is possible to obtain an image of a concrete cross section at depths of up to one meter and to detect hollow defects with diameters of more than 50 mm.
In video applications, digital encoding techniques are frequently associated with the use of compression algorithms. Compression algorithms such as the MPEG-2 standard allow economies in transmission bandwidth, while providing a service with a quality similar to or better than that of current analogue systems. The reduction in the number of bits necessary to represent a given sequence is achieved by exploiting spatial and temporal redundancy in the original video sequences. The number of bits necessary to achieve a given picture quality level will be variable because the redundancy in the original sequence is also variable. This means that compression algorithms such as the MPEG-2 standard will generate variable bit rate (VBR) encoded bit streams if a constant quality is to be obtained. The goal of the work described in this paper is to develop dynamic multiplexing algorithms to be implemented at the user terminal equipment and to study the potential benefits of jointly controlling the bit rate of several encoders and the channel bit rate. The bit rate of the VBR transmission channel is constrained by leaky bucket policing functions. The source bit rate is determined and controlled on a picture basis by the proposed dynamic bandwidth allocation algorithm, with the objective of maintaining a minimum level of quality among all video sources while always using the least possible bandwidth and without violating the traffic contract.
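The leaky bucket constraint mentioned above can be sketched as a simple conformance check applied once per picture. This is a generic textbook-style leaky bucket meter, not the paper's policing function or its bandwidth allocation algorithm; the function name `police`, its parameters, and the sample numbers are assumptions for illustration.

```python
def police(picture_bits, drain_per_picture, capacity):
    """Generic leaky bucket meter evaluated once per picture (illustrative).

    picture_bits      -- bits produced by the encoder for each picture
    drain_per_picture -- bits the bucket leaks during one picture interval
    capacity          -- bucket size in bits (the tolerated burst)
    Returns one boolean per picture: True if it conforms to the contract.
    Non-conforming pictures are not added to the bucket.
    """
    level = 0
    verdicts = []
    for bits in picture_bits:
        level = max(0, level - drain_per_picture)  # leak since last picture
        if level + bits <= capacity:
            level += bits
            verdicts.append(True)
        else:
            verdicts.append(False)
    return verdicts

# Hypothetical picture sizes (bits) with one oversized picture.
verdicts = police([400, 900, 300, 1200, 200], drain_per_picture=500, capacity=1000)
print(verdicts)
```

A rate controller that wants to stay within the traffic contract would pick each picture's bit budget so that this check never fails, which is the constraint the dynamic allocation has to respect.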
This paper presents a technique for simulating processors based on the principle of compiled simulation. Unlike existing, commercially available instruction set simulators for DSPs, which are of interpretive character, the proposed technique performs instruction decoding and simulation scheduling at compile time. The technique offers up to three orders of magnitude faster simulation. The high speed allows the user to explore algorithms and hardware/software trade-offs before any hardware implementation. Moreover, the user can tailor the compiled simulation to trade speed for more accuracy. In this paper, the sources of the speedup and the limitations of the technique are analyzed and the realization of the simulation compiler is presented.
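The difference between interpretive and compiled simulation can be sketched with a toy example. The three-instruction machine below (`li`, `add`, `mul`) is hypothetical and has nothing to do with any real DSP; the point is only that the interpreter decodes every instruction each time it runs, while the simulation compiler decodes once and emits host code. The paper's actual compiler, and the source of its reported speedups, are not reproduced here.

```python
def interpret(program, regs):
    """Interpretive simulation: each instruction is decoded at run time."""
    for op, dst, src in program:
        if op == "li":        # load immediate
            regs[dst] = src
        elif op == "add":
            regs[dst] += regs[src]
        elif op == "mul":
            regs[dst] *= regs[src]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs

def compile_program(program):
    """Compiled simulation: decode once, emit host (Python) source."""
    lines = ["def run(regs):"]
    for op, dst, src in program:
        if op == "li":
            lines.append(f"    regs[{dst}] = {src}")
        elif op == "add":
            lines.append(f"    regs[{dst}] += regs[{src}]")
        elif op == "mul":
            lines.append(f"    regs[{dst}] *= regs[{src}]")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)  # decoding cost is paid here, once
    return namespace["run"]

# Toy program: r0 = 3; r1 = 4; r0 += r1; r0 *= r0
program = [("li", 0, 3), ("li", 1, 4), ("add", 0, 1), ("mul", 0, 0)]
run = compile_program(program)
interpreted = interpret(program, [0, 0])
compiled = run([0, 0])
print(interpreted, compiled)
```

Both paths produce the same register state; the compiled version simply has no opcode dispatch left in its inner loop, which is where a compiled simulator gains its speed.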
We propose an open signal processing system design and implementation environment, BEEHIVE, that allows application developers to rapidly compose and debug functional specifications in a networked, distributed computing environment, and then later migrate the application (transparently) onto an embedded, distributed computing hardware/software platform, with the capability to reconfigure (adaptively) the resources assigned to the application to meet the dynamic real-time requirements of the implementation. Recent developments in the areas of virtual machines; broker-based, distributed, transportable computing; object-oriented programming methodologies, Java and its real-time extensions; reconfigurable and programmable hardware; approximate algorithms; and adaptive load- and resource-management algorithms are harnessed in this operating environment.
This paper presents a comparative study of adaptive wavelet design for a speech coder based on wavelet-type multiresolution transforms. It is concerned with the problem of choosing suitable transforms that are adapted to the given speech signal in the sense that they maximize the coding gain at each resolution level. Four adaptive algorithms are reviewed for solving this problem in the framework of lattice realization of multirate lossless filter banks. The resulting speech coding scheme is presented. The performance is compared for each of the algorithms, as well as some non-adapted multiresolution transforms.
This paper describes the architecture and design of a general-purpose parallel image processor chip called the SliM-II Image Processor. The chip has a linear array of 64 processing elements (PEs), operates at 30 MHz in worst-case simulation, and delivers 1.92 GIPS. SliM-II can greatly reduce inter-PE communication overhead through sliding, that is, overlapping inter-PE communication with computation. In contrast to existing array processors, each PE has a multiplier, which is quite effective for convolution, template matching, etc. A single instruction can execute an ALU operation, data I/O, and inter-PE communication simultaneously in one instruction cycle. In addition, during the ALU/multiplier operation, SliM-II provides parallel load/store between the register file and on-chip memory, as in DSP chips. The bandwidth of data I/O and inter-PE communication increases due to bit-parallel paths. We developed VHDL models and performed logic synthesis using the COMPASS™ CAD tool. We used the COMPASS™ 3.3 V, 0.6 μm standard cell library (v8r4.9.1). The total number of transistors is about 1.5 million. The SliM-II chip is being fabricated at LG Semiconductor Co., Ltd. The performance estimation shows a significant improvement for algorithms requiring multiplications compared with existing array processors.
We show a high-throughput implementation of SAR on high performance computing (HPC) platforms. In our implementation, the processors are divided into two groups of size M and N. The first group, consisting of M processors, computes the FDC (frequency domain convolution) in the range dimension, and the second group of N processors computes the FDC in the azimuth dimension. M and N are determined by the computational requirements of FDC in the range and azimuth dimensions, respectively. The key contribution of this paper is the development of a general high-throughput M-to-N communication algorithm. The M-to-N communication algorithm is a basic communication primitive used in many signal processing applications when a software task pipeline is employed to obtain high throughput performance. Our algorithm reduces the number of communication steps to lg(N/M + 1) + n(k − 1), where k ≥ 2 and n = [lg_k M]. Implementation results on the IBM SP2 and the Cray T3D based on the MITRE real-time benchmarks are presented. The results show that, given an image of size 1K × 1K, the minimum number of processors required for processing the SAR benchmarks can be reduced by 50% by using the proposed communication algorithm.
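For readers unfamiliar with the term, frequency domain convolution (FDC) along one dimension can be sketched as: transform both sequences, multiply them pointwise, and transform back. The pure-Python example below uses a naive O(n²) DFT on a tiny made-up signal and checks the result against direct circular convolution. It is only an illustration of the principle; a real SAR pipeline would use FFTs and a matched-filter reference, and nothing here reflects the paper's M-to-N communication algorithm. All function names are assumptions.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (illustrative, O(n^2))."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def fdc(signal, reference):
    """Circular convolution computed in the frequency domain."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(reference))]
    return dft(spectrum, inverse=True)

def circular_convolution(signal, reference):
    """Direct time-domain circular convolution, used as a cross-check."""
    n = len(signal)
    return [
        sum(signal[m] * reference[(i - m) % n] for m in range(n))
        for i in range(n)
    ]

# Hypothetical 4-sample line of data and reference function.
signal = [1, 2, 3, 4]
reference = [1, 0, -1, 0]
freq_result = fdc(signal, reference)
time_result = circular_convolution(signal, reference)
print([round(v.real, 6) for v in freq_result], time_result)
```

In the pipeline described above, one processor group would apply this operation along the range dimension and the other along the azimuth dimension, with the data redistributed between the two groups in between.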