We present a high-performance implementation of the FFT algorithm on the BOPS ManArray parallel DSP processor. The ManArray we consider for this application consists of an array controller and 2 to 4 fully interconnected processing elements. To expose the parallelism inherent in an FFT algorithm, we use a factorization of the DFT matrix into Kronecker products, permutation matrices, and diagonal matrices. Our implementation utilizes the multiple levels of parallelism that are available on the ManArray. We use the special multiply complex instruction, which calculates the product of two complex 32-bit fixed-point numbers in 2 cycles (pipelinable). Instruction-level parallelism is exploited via the indirect Very Long Instruction Word (iVLIW). With an iVLIW, in the same cycle a complex number is read from memory, another complex number is written to memory, a complex multiplication starts and another finishes, two complex additions or subtractions are done, and a complex number is exchanged with another processing element. Multiple local FFTs are executed in Single Instruction Multiple Data (SIMD) mode, and to avoid a costly data transposition we execute distributed FFTs in Synchronous Multiple Instructions Multiple Data (SMIMD) mode.
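The kind of factorization mentioned above can be illustrated with the textbook Cooley-Tukey splitting, F_N = (F_r ⊗ I_s) · D · (I_r ⊗ F_s) · P with N = r·s. The sketch below is an assumption about the general form only, not the ManArray implementation; all function names are hypothetical, and it uses dense matrices in plain Python purely to check the identity on small sizes.

```python
import cmath

def dft_matrix(n):
    # Dense n x n DFT matrix with entries w^(j*k), w = exp(-2*pi*i/n).
    w = cmath.exp(-2j * cmath.pi / n)
    return [[w ** (j * k) for k in range(n)] for j in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(a, b):
    # Kronecker product: block (i, j) of the result is a[i][j] * b.
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def stride_permutation(r, s):
    # P maps input index b*r + a to position a*s + b (sort by index mod r).
    n = r * s
    p = [[0.0] * n for _ in range(n)]
    for a in range(r):
        for b in range(s):
            p[a * s + b][b * r + a] = 1.0
    return p

def twiddle(r, s):
    # Diagonal matrix D with entry w_N^(a*d) at position a*s + d.
    n = r * s
    w = cmath.exp(-2j * cmath.pi / n)
    d = [[0.0] * n for _ in range(n)]
    for a in range(r):
        for k in range(s):
            d[a * s + k][a * s + k] = w ** (a * k)
    return d

def factored_dft(r, s):
    # Multiply the four factors together, left to right.
    stages = [kron(dft_matrix(r), identity(s)), twiddle(r, s),
              kron(identity(r), dft_matrix(s)), stride_permutation(r, s)]
    out = stages[0]
    for m in stages[1:]:
        out = matmul(out, m)
    return out

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

In this form, the factor I_r ⊗ F_s is r independent length-s FFTs (a natural fit for SIMD execution across processing elements), while F_r ⊗ I_s couples data across the blocks.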
The kernel in Cohen's generalized time-frequency representation (GTFR) is chosen in accordance with certain desired performance attributes. Properties of the kernel are typically expressed as constraints. We establish that many commonly used constraints are convex in the sense that all allowable kernels satisfying a given constraint form a convex set. Thus, for a given set of constraints, the kernel can be designed by alternately projecting among these sets. If there exists a nonempty intersection among the constraint sets, then the theory of projection onto convex sets (POCS) guarantees convergence to a point in the intersection. If the constraints can be partitioned into two sets, each with a nonempty intersection, then POCS guarantees convergence to a kernel that satisfies the inconsistent constraints with minimum mean square error.
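The alternating-projection idea can be sketched on a toy problem. The two sets below (a unit disc and a line in the plane) are illustrative assumptions and have nothing to do with the actual kernel constraints; the function names are hypothetical.

```python
import math

def project_ball(p, radius=1.0):
    # Closest point to p in the closed disc of the given radius.
    norm = math.hypot(p[0], p[1])
    if norm <= radius:
        return p
    return (p[0] * radius / norm, p[1] * radius / norm)

def project_line(p, normal=(1.0, 1.0), offset=1.0):
    # Closest point to p on the line normal . x = offset.
    nn = normal[0] ** 2 + normal[1] ** 2
    t = (offset - (normal[0] * p[0] + normal[1] * p[1])) / nn
    return (p[0] + t * normal[0], p[1] + t * normal[1])

def pocs(start, iterations=200):
    # Alternate the two projections; both sets are convex and intersect,
    # so the iterates approach a point in the intersection.
    p = start
    for _ in range(iterations):
        p = project_line(project_ball(p))
    return p

point = pocs((3.0, -2.0))
```

In a kernel-design setting each projection would instead enforce one kernel property, but the iteration has the same shape.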
We consider the problem of detecting a known Gaussian random transient in the presence of a strong, known, random, Gaussian, narrowband interference. This can be regarded as a special case of the classical problem of detecting a known Gaussian random signal in known Gaussian colored noise. There exists a standard solution for such a problem, based on the classical optimum detector for random signals in noise. However, such a detector does not explicitly use the non-stationary character of the signal as a priori available information. Reformulation of the optimum detection in the time-frequency plane allows one to exploit this distinguishing signal feature and suppress the stationary interference and noise. This is accomplished here by use of the Wigner-Ville signal representation and an optimum signal/noise subspace decomposition that maximizes the transient signal-to-noise ratio. The new detection procedure eliminates the subspace where the major part of the energy of a random noise sample falls, while retaining almost all of the signal energy. In this fashion, a gain in the output signal-to-noise ratio is achieved, as verified by simulations.
In this paper, we introduce a new definition for the instantaneous frequency of a discrete-time analytic signal. Unlike the existing definition, which uses only two data samples around a particular time, this method utilizes all the data samples for estimating the instantaneous frequency. We prove that this quantity is identical to the average frequency evaluated at the particular time in the discrete-time TFD. This property is consistent with the analogous continuous-time property. We also derive requirements on the discrete-time kernel needed to satisfy this property. Using computer-generated signals and real data, performance comparisons are made between the proposed approach and the existing one.
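For context, one common two-sample estimator is the central phase difference, which uses only z[n-1] and z[n+1]. The sketch below assumes this is the kind of "existing definition" meant; the abstract does not give the formula, and the function name is hypothetical. It is unambiguous only for frequencies below 0.25 cycles per sample in magnitude.

```python
import cmath
import math

def central_difference_if(z, n):
    # Instantaneous frequency (cycles/sample) at index n from the phase
    # advance between z[n-1] and z[n+1], i.e. over two sample intervals.
    return cmath.phase(z[n + 1] * z[n - 1].conjugate()) / (4 * math.pi)

# Test signal: complex exponential at 0.1 cycles/sample.
f0 = 0.1
z = [cmath.exp(2j * math.pi * f0 * n) for n in range(16)]
estimates = [central_difference_if(z, n) for n in range(1, 15)]
```

Because only two samples enter each estimate, noise on either sample perturbs the result directly, which is the kind of limitation an all-sample definition is meant to address.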
The problem addressed in this paper is the detection and classification of deterministic objects and random textures in a noisy scene. An energy detector is developed in the cumulant domain by exploiting the noise insensitivity of higher-order statistics. An efficient implementation of this detector is described, using matched filtering. Its performance is analyzed using asymptotic distributions in a binary hypothesis testing framework. Object and texture classifiers are derived using higher-order statistics. They are minimum distance classifiers in the cumulant domain, and can be efficiently implemented using a bank of matched filters. Further, they are robust to additive Gaussian noise and insensitive to object shifts. Extensions that can handle object rotation and scaling are also discussed. An alternative texture classifier, derived from a maximum-likelihood (ML) viewpoint, is more efficient at the expense of complexity. The application of these algorithms to texture modeling is shown and consistent parameter estimators are obtained. Simulations are shown for both the object and the texture classification problems.
ISBN: 0819406945 (print)
A general solution for the problem of time-frequency signal representation of nonlinear FM signals is provided, based on a generalization of the Wigner-Ville distribution. The Wigner-Ville distribution (WVD) is a second-order time-frequency representation. That is, it is able to give ideal energy concentration for quadratic phase signals, and its ensemble average is a second-order time-varying spectrum. The same holds for Cohen's class of time-frequency distributions, which are smoothed versions of the WVD. The WVD may be extended so as to achieve ideal energy concentration for higher-order phase laws, and such that the expectation is a time-varying higher-order spectrum. The usefulness of these generalized Wigner-Ville distributions (GWVD) is twofold. First, because they achieve ideal energy concentration for polynomial phase signals, they may be used for optimal instantaneous frequency estimation. Second, they are useful for discriminating between nonstationary processes of differing higher-order moments. In the same way that the WVD is generalized, we generalize Cohen's class of TFDs by defining a class of generalized time-frequency distributions (GTFDs) obtained by a two-dimensional smoothing of the GWVD. Another result derived from this approach is a method based on higher-order spectra that allows the separation of cross-terms and auto-terms in the WVD.
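The statement about ideal concentration for quadratic phase can be checked with a short worked derivation. The notation below (z, f_0, α, β) is assumed for illustration and does not come from the paper.

```latex
% Standard WVD of an analytic signal z(t):
W_z(t,f) = \int_{-\infty}^{\infty} z\!\left(t+\tfrac{\tau}{2}\right)
           z^{*}\!\left(t-\tfrac{\tau}{2}\right) e^{-j2\pi f\tau}\,d\tau

% Quadratic phase (linear FM): z(t) = e^{j\phi(t)},
% \phi(t) = 2\pi\left(f_0 t + \tfrac{\alpha}{2}t^2\right)
\phi\!\left(t+\tfrac{\tau}{2}\right) - \phi\!\left(t-\tfrac{\tau}{2}\right)
  = 2\pi\,(f_0 + \alpha t)\,\tau
\quad\Longrightarrow\quad
W_z(t,f) = \delta\!\left(f - f_0 - \alpha t\right)

% Cubic phase: \phi(t) = 2\pi\,\tfrac{\beta}{3}\,t^3
\phi\!\left(t+\tfrac{\tau}{2}\right) - \phi\!\left(t-\tfrac{\tau}{2}\right)
  = 2\pi\beta\left(t^2\tau + \tfrac{\tau^3}{12}\right)
```

The leftover τ³ term in the cubic case is what spreads the energy away from the instantaneous frequency law βt², which suggests why a higher-order kernel is needed for such signals.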
ISBN: 0819406945 (print)
A new preconditioner is proposed for the solution of an $N \times N$ Toeplitz system $T_N x = b$, where $T_N$ can be symmetric indefinite or nonsymmetric, by preconditioned iterative methods. The preconditioner $F_N$ is obtained by factorizing the generating function $T(z)$ into the product of two terms corresponding, respectively, to minimum-phase causal and anticausal systems, and is therefore called the minimum-phase LU (MPLU) factorization preconditioner. Due to the minimum-phase property, $F_N^{-1}$ is bounded. For rational Toeplitz $T_N$ with generating function $T(z) = A(z^{-1})/B(z^{-1}) + C(z)/D(z)$, where $A(z)$, $B(z)$, $C(z)$, and $D(z)$ are polynomials of orders $p_1$, $q_1$, $p_2$, and $q_2$, we show that the eigenvalues of $F_N^{-1} T_N$ are repeated exactly at 1 except for at most $\alpha_F$ outliers, where $\alpha_F$ depends on $p_1$, $q_1$, $p_2$, $q_2$, and the number $\omega$ of roots of $A(z^{-1})D(z) + B(z^{-1})C(z)$, the numerator of $T(z)$, outside the unit circle. A preconditioner $K_N$ in circulant form, generalized from the symmetric case, is also presented for comparison.
Estimates for the condition number of a matrix are useful in many areas of scientific computing, including recursive least squares computations, optimization, eigenanalysis, and general nonlinear problems solved by linearization techniques where matrix modification techniques are used. The purpose of this paper is to propose an adaptive Lanczos estimator scheme, which we call ale, for tracking the condition number of the modified matrix over time. Applications to recursive least squares (RLS) computations using the covariance method with sliding data windows are considered. ale is fast for relatively small n-parameter problems arising in RLS methods in control and signal processing, and is adaptive over time, i.e., estimates at time t are used to produce estimates at time t + 1. Comparisons are made with other adaptive and non-adaptive condition estimators for recursive least squares problems. Numerical experiments are reported indicating that ale yields a very accurate recursive condition estimator.
ISBN: 0819416207 (print)
Redundant Residue Number Systems (RRNS) have been proposed as suitable candidates for fault tolerance in compute-intensive applications. The redundancy is based on multiple projections to moduli subsets and conducting a search for results that lie in a so-called illegitimate range. This paper presents RRNS fault-tolerant procedures for a recently introduced finite polynomial ring mapping procedure (modulus replication RNS, MRRNS). The mapping technique dispenses with the need for many relatively prime ring moduli, which is a major drawback of conventional RRNS systems. Although double, triple, and quadruple modular redundancy can be implemented in the polynomial mapping structure, polynomial coefficient circuitry, or the independent direct product ring computational channels, for error detection and/or correction, this paper discusses the implementation of redundant rings which are generated by (1) redundant residues, (2) spare general computational channels, or (3) a combination of the two. The first architecture is suitable for RNS embedding in the MRRNS, and the second for single moduli mappings. The combination architecture allows a trade-off between the two extremes. The application area is in fault-tolerant compute-intensive DSP arrays.
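The illegitimate-range test can be sketched for a conventional integer RRNS. This is an assumed toy configuration (information moduli 3, 5, 7 and one redundant modulus 11), not the polynomial ring mapping of the paper, and all names are hypothetical.

```python
from math import prod

INFO_MODULI = (3, 5, 7)      # carry the data; legitimate range is [0, 105)
REDUNDANT_MODULI = (11,)     # extra residue used only for checking
ALL_MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGITIMATE_RANGE = prod(INFO_MODULI)

def encode(x):
    # Residue representation of x over all moduli.
    return [x % m for m in ALL_MODULI]

def crt(residues, moduli):
    # Chinese remainder reconstruction for pairwise coprime moduli.
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)
    return x % total

def has_error(residues):
    # A value reconstructed over all moduli that lands outside the
    # legitimate range signals a corrupted residue.
    return crt(residues, ALL_MODULI) >= LEGITIMATE_RANGE

word = encode(42)
corrupted = list(word)
corrupted[1] = (corrupted[1] + 1) % 5   # inject a single-residue fault
```

With this choice of moduli any single corrupted residue moves the reconstructed value by at least 105, so it cannot land back in the legitimate range; correcting the error, as opposed to detecting it, would need more redundancy.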
ISBN: 0819416207 (print)
Algorithm-based fault tolerance techniques have been used for systolic processors and general-purpose multiprocessors. In this paper, we apply an algorithm-based fault tolerance technique to high-level synthesis of signal flow graphs. The technique incorporates reliability into synthesized DSP filters. Given a signal flow graph representation for these filters, our schemes synthesize a reliable schedule for the operations in them and allocate functional units to the operations subject to hardware and reliability constraints. We impose reliability constraints to avoid compensating errors in fault detection. Our first scheme uses the well-known duplicate-and-compare approach, while the second uses a novel linearity-based checking approach used in algorithm-based fault tolerance methods for matrix computations. The schemes have been implemented, and results obtained by using them on sample signal flow graphs are presented. These results show the linearity-based scheme to have a low time overhead. For example, this scheme takes about 10% extra time for the reliable synthesis of a 10th-order IIR filter. Our proposed work extends previous work in the area in three directions: (1) use of linearity-based checks, over duplication-based checks, (2) handling cyclic flow graphs, over previous proposals for acyclic graphs, and (3) reliability-constrained hardware mapping. Extensions of the schemes to nonlinear flow graphs are also proposed.
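A linearity-based check of the kind used for matrix computations can be sketched with a checksum on a matrix-vector product. This is a generic illustration under assumed names, not the scheduling and allocation scheme described in the abstract.

```python
def matvec(a, x):
    # Plain matrix-vector product y = A x.
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def column_checksum(a):
    # Row vector of column sums, i.e. (1^T A).
    return [sum(col) for col in zip(*a)]

def passes_check(a, x, y):
    # By linearity, (1^T A) x must equal the sum of the entries of y.
    expected = sum(c * xj for c, xj in zip(column_checksum(a), x))
    return expected == sum(y)

a = [[1, 2], [3, 4]]
x = [5, 6]
y = matvec(a, x)
faulty = [y[0] + 1, y[1]]   # simulate a fault in one output
```

The check costs one extra inner product rather than a full duplicate computation. It would miss two faults whose effects cancel in the sum, which is the compensating-error case the reliability constraints above are meant to avoid.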