Generalized adaptive neural filters are a class of nonlinear adaptive filters that includes stack filters as a subset. We further extend this class by using a multi-window approach. In this way we obtain a parallel recursive filtering operation and make better use of the implicit parallelism of the neural network architecture. The proposed neural network structure uses a shared-weight architecture for efficient implementation. Experimental results on real image processing tasks illustrate the efficiency of the approach.
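For readers unfamiliar with stack filters, the subset named in the abstract above, the following is a minimal sketch of the standard threshold-decomposition view: the signal is split into binary slices, one positive Boolean function (here a majority vote, which yields the median) is applied to every slice, and the binary outputs are summed. The function name `stack_filter_median`, the edge-replication padding, and the restriction to non-negative integer samples are assumptions made for illustration; this is not the neural or multi-window filter proposed in the paper.

```python
def stack_filter_median(signal, window=3):
    """Median filter expressed as a stack filter (threshold decomposition).

    Assumes non-negative integer samples and an odd window length.
    """
    half = window // 2
    # Replicate the edge samples so the output has the same length as the input.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    out = [0] * len(signal)
    # One binary slice per threshold level 1..max(signal).
    for level in range(1, max(signal) + 1):
        binary = [1 if s >= level else 0 for s in padded]
        for i in range(len(signal)):
            # Positive Boolean function: majority vote over the window.
            if sum(binary[i:i + window]) > half:
                out[i] += 1  # stacking: add up the binary outputs
    return out
```

Because the majority function obeys the stacking property, summing the per-level outputs gives the same result as sorting each window and taking its middle element.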
In many signal processing applications widely used in new multimedia-based systems, finite-precision representation effects are of great concern. When hardware platforms are involved, such as VLSI designs, these problems are particularly important for obtaining high-performance, compact, and power-conscious architectures. In particular, fixed-point effects have not yet been emphasized enough for the forthcoming JPEG 2000 standard. In this paper, a detailed study of quantization noise effects and their hardware impact is presented. The final objective is to investigate the possibility of exploiting finite-precision data representation to design low-cost, low-power systems.
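The quantization noise discussed above can be illustrated with a small numerical experiment. The sketch below rounds values to a fixed-point grid and compares the measured mean-square error with the textbook uniform-noise estimate of step²/12. The helper names `quantize` and `quantization_mse` and the choice of test samples are assumptions for illustration; the abstract does not describe the paper's own analysis in enough detail to reproduce it.

```python
def quantize(x, frac_bits):
    """Round x to the nearest multiple of 2**-frac_bits (fixed-point grid)."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def quantization_mse(samples, frac_bits):
    """Mean-square error introduced by quantizing the given samples."""
    errors = [quantize(x, frac_bits) - x for x in samples]
    return sum(e * e for e in errors) / len(errors)


# Hypothetical test signal: a uniform sweep over [0, 1).
samples = [k / 10000 for k in range(10000)]
step = 2.0 ** -4                      # quantization step for 4 fractional bits
measured = quantization_mse(samples, 4)
predicted = step * step / 12          # uniform-noise model
```

Each extra fractional bit halves the step and so reduces the noise power by a factor of four (about 6 dB), which is the trade-off a word-length study weighs against area and power.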
This paper presents a high-speed VLSI implementation structure for a multiplier. Four n-bit numbers are generated using the even and odd positions of the two n-bit operands; they are then multiplied pairwise. A parallel addition algorithm is used to add up the partial products. Three k-bit numbers at each level are converted to two (k+1)-bit numbers at the next level using a 3-to-2 adding technique. Carry propagation is left to the last stage of the multiplier, where a fast carry-look-ahead adder adds the final two 2(n-1)-bit numbers. The supply voltage (V_dd) is 3.3 V, which can be lowered to 2.5 V. The multiplier is implemented in 0.8 μm technology. HSPICE simulation shows a total delay of 3.25 ns for a 32-bit multiplier.
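The 3-to-2 reduction described above can be sketched in software as a generic carry-save adder tree. This is a behavioral illustration under stated assumptions, not the paper's circuit: the function names are hypothetical, the partial products are plain shifted copies of one operand rather than the even/odd-position products the abstract mentions, and Python's built-in addition stands in for the final carry-look-ahead adder.

```python
def carry_save_3to2(a, b, c):
    """Compress three integers into two whose sum equals a + b + c."""
    partial_sum = a ^ b ^ c                      # bitwise sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # carries, moved up one position
    return partial_sum, carry


def multiply_carry_save(x, y, n):
    """Multiply two unsigned n-bit integers using 3-to-2 reduction levels."""
    # One shifted copy of x for every set bit of y.
    partials = [x << i for i in range(n) if (y >> i) & 1]
    # Each level turns every group of three numbers into two.
    while len(partials) > 2:
        next_level = []
        for i in range(0, len(partials) - 2, 3):
            s, c = carry_save_3to2(*partials[i:i + 3])
            next_level += [s, c]
        leftover = len(partials) % 3
        if leftover:
            next_level += partials[-leftover:]   # pass ungrouped numbers through
        partials = next_level
    # Only here does a carry-propagating addition occur.
    return sum(partials)
```

No carry ripples inside a reduction level, so the depth of the tree grows only logarithmically with the number of partial products; the single carry-propagating addition is deferred to the end.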
This paper proposes two topologies of radix-2 complex-number multipliers based on distributed arithmetic and the redundant signed-digit representation. The advantage of this approach is twofold: distributed arithmetic reduces the hardware requirements with respect to a direct implementation of complex-number multiplication, and the redundant number system avoids carry propagation and allows the digits to be computed on-line. Two radix-2 architectures are presented. These multipliers have been implemented on an FPGA, and an optimum mapping is proposed. Compared with other complex-number multipliers, the presented circuits achieve more efficient area-time structures and lower latency.
Over the last decades Genetic algorithms (GA) and Genetic Programming (GP) have proven to be efficient tools for a wide range of applications. However, in order to solve human-competitive problems they require large a...
This paper presents a new technique that allows a high baud rate with a low operating speed of the synchronizer. The technique is based on parallel processing: what is done by a single clock operating at the baud rate can be done by two clocks operating at only half that rate. Generalizing, we propose versions of clock recovery circuits operating at a fraction 1/2^n of the data rate. We thus obtain circuits that transmit at a very high data rate but operate at a very low frequency. The proposed circuits, which are transition sensitive (digital), are compared with the traditional level-sensitive (analog) ones.
A parallel version of the evolutionary graph generation (EGG) system, called the distributed EGG (DEGG) system, was developed on a cluster of PCs using a message-passing interface (MPI). To demonstrate the capability of DEGG, it is applied to the search for optimal designs of various multipliers. Experimental results show that DEGG consistently performs better than EGG and known conventional designs.
ISBN: (print) 0780372417
We propose a solution to two problems induced by the growing complexity of machine vision systems: (1) the need for a robust, open, and flexible framework to control various kinds of descriptive and operational knowledge; and (2) the need for an architecture that offers parallel processing and can be easily scaled to evolving underlying hardware. We propose an agent society, implemented in the Java language, that is organized as an irregular pyramid for two reasons: (1) an agent provides an abstraction to encapsulate reactive or cognitive processing; and (2) the pyramid offers a formal graph-based approach to ensure global and distributed goal satisfaction. The evaluation of the architecture, performed on an X-scanner breast image, shows good-quality results and parallel processing abilities.
A new VLSI architecture for the computation of the three-dimensional discrete cosine transform (3D DCT) for compression of integral 3D images is proposed. The 3D DCT is decomposed into 1D DCTs computed in each of the three dimensions. The architecture is a parallel structure that computes an N×N×N-point DCT by computing N N×N 2D DCTs in parallel and feeding each of the computed 2D DCT coefficients into a final 1D DCT block. The architecture uses 5N^2/2 multiplier-accumulators to evaluate N×N×N-point DCTs at a rate of N complete 3D DCT coefficients per clock cycle, where N is even. The architecture is regular and modular, and as such it is suitable for VLSI implementation. The proposed architecture has better area-time performance than previously reported 3D DCT architectures. It also reduces the initial delay by a factor of N.
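The decomposition described above, N independent 2D DCTs followed by a 1D DCT across the slices, can be checked numerically with a short reference model. The sketch below assumes an unnormalized DCT-II and hypothetical function names; it models only the arithmetic, not the multiplier-accumulator structure or timing of the proposed hardware.

```python
import math


def dct_1d(v):
    """Unnormalized DCT-II of a sequence."""
    n = len(v)
    return [sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n)) for k in range(n)]


def dct_2d(m):
    """2D DCT of a square matrix: transform rows, then columns."""
    rows = [dct_1d(r) for r in m]
    cols = [dct_1d(c) for c in zip(*rows)]   # zip(*rows) iterates over columns
    return [list(r) for r in zip(*cols)]     # transpose back to row-major


def dct_3d(x):
    """3D DCT as N independent 2D DCTs followed by 1D DCTs across slices."""
    n = len(x)
    slices = [dct_2d(x[i]) for i in range(n)]        # independent, parallelizable
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for j in range(n):
        for k in range(n):
            line = dct_1d([slices[i][j][k] for i in range(n)])
            for u in range(n):
                out[u][j][k] = line[u]
    return out


def dct_3d_direct(x):
    """Direct triple-sum definition, used only to check the decomposition."""
    n = len(x)

    def c(i, k):
        return math.cos(math.pi * (2 * i + 1) * k / (2 * n))

    return [[[sum(x[i][j][k] * c(i, u) * c(j, v) * c(k, w)
                  for i in range(n) for j in range(n) for k in range(n))
              for w in range(n)] for v in range(n)] for u in range(n)]
```

The N calls to `dct_2d` do not depend on one another, which is the property that lets them run in parallel before the final 1D stage.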
We propose a sampled-analog rank-order filter (ROF) architecture of complexity O(n^2). It yields a very compact structure because the devices used are essentially of minimum geometry. Since its sole active building block is the simple CMOS inverter, the circuit exhibits excellent low-voltage compatibility. Furthermore, it can support a rail-to-rail input range. It is inherently fast because of its fully parallel signal processing, and its speed is expected to increase with technology scaling at the same rate as purely digital circuitry. Finally, it supports full programmability of the rank by means of an analog reference voltage. The ROF is based on a pair of multiple-winners-take-all (mWTA) circuits and a set of AND gates. The paper includes a description of the architecture and a detailed analysis of the mWTA. The most relevant design issues are addressed, and experimental results obtained from a fabricated ROF are presented.
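As a behavioral reference for what a rank-order filter computes, the sketch below sorts each sliding window and returns the sample of the requested rank. It is a plain software model and says nothing about the mWTA-based analog implementation; the function name, the edge-replication padding, and the zero-based rank convention are assumptions for illustration.

```python
def rank_order_filter(signal, window, rank):
    """Return, for each position, the rank-th smallest sample in the window.

    Assumes an odd window length and 0 <= rank < window
    (rank 0 is the minimum, window // 2 the median, window - 1 the maximum).
    """
    half = window // 2
    # Replicate the edge samples so the output has the same length as the input.
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sorted(padded[i:i + window])[rank] for i in range(len(signal))]
```

Changing `rank` is the software counterpart of the programmability the abstract attributes to the analog reference voltage: the same window yields a minimum, median, or maximum filter.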