Using the proposed factorizations of discrete cosine transform (DCT) matrices, fast and recursive algorithms are stated. In this paper, signal flow graphs for the n-point DCT II and DCT IV algorithms are introduced. The proposed algorithms yield exactly the same results as standard DCT algorithms but are faster. The arithmetic complexity and stability of the algorithms are explored, and the improvements are compared with previously existing fast and stable DCT algorithms. A parallel hardware computing architecture for the DCT II algorithm is proposed. The computing architecture is first designed, simulated, and prototyped using a 40-nm Xilinx Virtex-6 FPGA and thereafter mapped to custom integrated circuit technology using 0.18-µm CMOS standard cells from Austria Micro Systems. A performance trade-off exists between computational precision, chip area, clock speed, and power consumption. This trade-off is explored in both the FPGA and custom CMOS implementation spaces. An example FPGA implementation operates at clock frequencies in excess of 230 MHz for several values of system word size, leading to real-time throughput levels better than 230 million 16-point DCTs per second. Custom CMOS-based results are subject to the synthesis and place-and-route steps of the design flow. Physical silicon fabrication was not conducted due to prohibitive cost.
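The recursive structure mentioned in the abstract above can be illustrated with a well-known splitting: an n-point DCT II reduces to an n/2-point DCT II on the sums x[j] + x[n-1-j] and an n/2-point DCT IV on the differences x[j] - x[n-1-j]. The sketch below is not the paper's factorization. It assumes unnormalized transforms and n a power of two, and it evaluates the DCT IV half directly for brevity, so it shows the recursion without being fast. All function names are hypothetical.

```python
import math

def dct2_direct(x):
    """Unnormalized DCT II: X[k] = sum_j x[j] * cos(pi*(2j+1)*k / (2n))."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                for j in range(n)) for k in range(n)]

def dct4_direct(x):
    """Unnormalized DCT IV: Y[k] = sum_j x[j] * cos(pi*(2j+1)*(2k+1) / (4n))."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * (2 * j + 1) * (2 * k + 1) / (4 * n))
                for j in range(n)) for k in range(n)]

def dct2_recursive(x):
    """DCT II via the even/odd split (assumes len(x) is a power of two)."""
    n = len(x)
    if n == 1:
        return [x[0]]
    half = n // 2
    # Butterfly stage: sums feed the even outputs, differences the odd outputs.
    u = [x[j] + x[n - 1 - j] for j in range(half)]
    v = [x[j] - x[n - 1 - j] for j in range(half)]
    out = [0.0] * n
    out[0::2] = dct2_recursive(u)   # even-indexed coefficients: half-size DCT II
    out[1::2] = dct4_direct(v)      # odd-indexed coefficients: half-size DCT IV
    return out
```

A fast algorithm would also factor the DCT IV block recursively instead of evaluating it directly.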
A novel type of algorithm for the discrete sine transform (DST) is introduced in this paper. By using a basic trigonometric identity, these algorithms successively reduce the summation size in a simple manner and therefore have a very simple structure. The indexing of these algorithms follows the Hadamard order, the generation of which is given in this paper. These algorithms use cosines and sines as multipliers, which causes less computational error than algorithms with secant multipliers. The multipliers can be generated recursively in a simple way, without referring to any trigonometric functions. Fortran subroutines to compute various types of the DST are provided.
Multichannel fast QR decomposition RLS (MC-FQRD-RLS) algorithms are well known for their good numerical properties and low computational complexity. Their main limitation is that they lack an explicit weight vector term, which restricts them to problems seeking an estimate of the output error signal. This paper presents techniques which allow us to use MC-FQRD-RLS algorithms in applications that previously required explicit knowledge of the adaptive filter weights. We first consider a multichannel system identification setup and show how to obtain, at any time, the filter weights associated with the MC-FQRD-RLS algorithm. Thereafter, we turn to problems where the filter weights are periodically updated using training data and then used for fixed filtering of a useful data sequence, e.g., burst-trained equalizers. Finally, we consider a particular control structure, indirect learning, where a copy of the coefficient vector filters a different input sequence from that of the adaptive filter. Simulations are carried out for Volterra system identification, decision feedback equalization, and adaptive predistortion of high-power amplifiers. The results verify our claims that the proposed techniques achieve the same performance as the inverse QRD-RLS algorithm at a much lower computational cost.
This paper presents two new, closely related adaptive algorithms for LS system identification. The starting point for the derivation of the algorithms is the inverse Cholesky factor of the data correlation matrix, obtained via a QR decomposition (QRD). Both algorithms are of O(p) computational complexity, with p being the order of the system. The first algorithm is a fixed-order QRD scheme with enhanced parallelism. The second is an order-recursive lattice-type algorithm based exclusively on orthogonal Givens rotations, with lower complexity than previously derived ones. Both algorithms are derived following a new approach, which exploits efficient time and order updates of a specific state vector quantity.
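For readers unfamiliar with Givens-based QRD updating, the following minimal sketch shows the basic building block that QRD-RLS schemes rely on: folding one new data row into an upper-triangular Cholesky factor with a sequence of Givens rotations. It is the conventional O(p^2) update, not the O(p) fast algorithms of the abstract above. The function name and the forgetting factor `lam` are illustrative assumptions.

```python
import math

def givens_update(R, x, lam=1.0):
    """Return upper-triangular R_new with R_new^T R_new = lam * R^T R + x x^T.

    R is a p-by-p upper-triangular list of lists and x a length-p data row.
    The inputs are copied, not modified.
    """
    p = len(x)
    root = math.sqrt(lam)
    R = [[root * v for v in row] for row in R]   # apply exponential forgetting
    x = list(x)
    for i in range(p):
        r = math.hypot(R[i][i], x[i])
        if r == 0.0:
            continue                              # nothing to rotate in this column
        c, s = R[i][i] / r, x[i] / r              # rotation that zeroes x[i]
        for j in range(i, p):
            rij, xj = R[i][j], x[j]
            R[i][j] = c * rij + s * xj
            x[j] = -s * rij + c * xj
    return R
```

Because each rotation is orthogonal, the Gram matrix of the stacked rows is preserved, which is why the updated factor satisfies the identity stated in the docstring.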
We present a fast algorithm for the construction of a spectral projector. This algorithm allows us to compute the density matrix, as used in, e.g., the Kohn-Sham iteration, and so obtain the electron density. We compute the spectral projector by constructing the matrix sign function through a simple polynomial recursion. We present several matrix representations for fast computation within this recursion, using bases with controlled space-spatial-frequency localization. In particular, we consider wavelet and local cosine bases. Since spectral projectors appear in many contexts, we expect many additional applications of our approach. © 1999 Academic Press.
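The abstract above does not name the polynomial recursion. As an illustration only, the sketch below uses the standard Newton-Schulz iteration S ← (3S − S³)/2, which converges to sign(A) for a symmetric A with no zero eigenvalue once A is scaled so its spectrum lies in [−1, 1]. The projector onto the eigenspace below a threshold `mu` is then (I − sign(H − mu·I))/2. Dense list-of-lists matrices stand in for the wavelet and local cosine representations, and all names are hypothetical.

```python
import math

def matmul(A, B):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spectral_projector(H, mu, iters=40):
    """Projector onto eigenvectors of symmetric H with eigenvalue below mu.

    Assumes mu is not an eigenvalue of H (so H - mu*I is nonsingular).
    """
    n = len(H)
    A = [[H[i][j] - (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Frobenius norm bounds the spectral radius, so S starts with spectrum in [-1, 1].
    norm = math.sqrt(sum(v * v for row in A for v in row))
    S = [[v / norm for v in row] for row in A]
    for _ in range(iters):
        S3 = matmul(matmul(S, S), S)
        S = [[1.5 * S[i][j] - 0.5 * S3[i][j] for j in range(n)] for i in range(n)]
    # S now approximates sign(H - mu*I); map the -1 eigenspace to 1 and +1 to 0.
    return [[((1.0 if i == j else 0.0) - S[i][j]) / 2.0 for j in range(n)]
            for i in range(n)]
```

The trace of the returned matrix counts the eigenvalues below `mu`, which in the density-matrix setting corresponds to the number of occupied states.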
It is shown that an N-point type I odd discrete cosine transform (ODCT-I) can be reformulated as a (2N-1)-point DFT of a real-symmetric sequence, efficiently computed by the real-symmetric PFA-FFT. Using simple index mappings, the type II and III ODCTs are efficiently computed from the ODCT-I of the same length. The ODCT-IV is then computed from the ODCT-II or ODCT-III using simple recurrence formulas.
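The reformulation described above can be checked numerically. The sketch below assumes an unnormalized ODCT-I of the form X[k] = x[0] + 2·Σ_{n=1}^{N-1} x[n]·cos(2πnk/(2N−1)), since the abstract does not fix a normalization. It compares this with the first N outputs of a (2N−1)-point DFT of the symmetric extension of x. A naive O(N²) DFT is used instead of the PFA-FFT, so only the identity is demonstrated, not the speed. Function names are hypothetical.

```python
import cmath
import math

def odct1_direct(x):
    """Assumed unnormalized ODCT-I evaluated straight from its cosine sum."""
    N = len(x)
    M = 2 * N - 1
    return [x[0] + 2.0 * sum(x[n] * math.cos(2.0 * math.pi * n * k / M)
                             for n in range(1, N)) for k in range(N)]

def odct1_via_dft(x):
    """Same transform via a (2N-1)-point DFT of the real-symmetric extension."""
    N = len(x)
    M = 2 * N - 1
    # Symmetric extension: y[n] = x[n] for n < N and y[M - n] = x[n] for n >= 1.
    y = list(x) + [x[M - n] for n in range(N, M)]
    Y = [sum(y[n] * cmath.exp(-2j * math.pi * n * k / M) for n in range(M))
         for k in range(N)]
    # The DFT of a real-symmetric sequence is real; drop rounding-level imaginary parts.
    return [v.real for v in Y]
```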
ISBN: (print) 0819422134
We introduce a general framework for computing the continuous wavelet transform (CWT). Included in this framework is an FFT implementation as well as fast algorithms which achieve O(1) complexity per wavelet coefficient. The general approach that we present allows a straightforward comparison among a large variety of implementations. In our framework, computation of the CWT is viewed as convolving the input signal with wavelet templates that are the oblique projection of the ideal wavelets into one subspace orthogonal to a second subspace. We present this idea and discuss and compare particular implementations.
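As a rough illustration of the FFT route mentioned above, the sketch below computes CWT rows by circular correlation of the signal with sampled, scaled wavelets in the frequency domain. It makes several assumptions that are not from the abstract: a Mexican-hat wavelet, periodic boundary handling, a signal length that is a power of two, and a small textbook radix-2 FFT. It does not implement the oblique-projection templates of the paper.

```python
import cmath
import math

def fft(a, invert=False):
    """Recursive radix-2 FFT (unscaled); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2], invert), fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + w, even[k] - w
    return out

def mexican_hat(t):
    """Unnormalized Mexican-hat wavelet (illustrative choice)."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_fft(x, scales, wavelet=mexican_hat):
    """Rows W[a][b] = sum_m x[m] * wavelet((m - b) / a) / sqrt(a), indices taken circularly."""
    n = len(x)
    X = fft([complex(v) for v in x])
    rows = []
    for a in scales:
        # Sample the scaled wavelet at signed offsets wrapped onto 0..n-1.
        h = [wavelet((((i + n // 2) % n) - n // 2) / a) / math.sqrt(a)
             for i in range(n)]
        H = fft([complex(v) for v in h])
        # Correlation theorem for a real kernel: multiply by the conjugate spectrum.
        w = fft([X[k] * H[k].conjugate() for k in range(n)], invert=True)
        rows.append([v.real / n for v in w])
    return rows
```

Each scale costs O(n log n) here, which is the baseline against which the O(1)-per-coefficient algorithms of the abstract are compared.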
This thesis presents four contributions: first, it develops new techniques to extend the range of applications of fast QR-decomposition least-squares (FQRD-LS) algorithms, which are computationally efficient compared to the recursive least-squares (RLS) algorithm; second, it develops a new version of the FQRD-LS algorithm for widely-linear (WL) input signals; third, it presents a fixed-point analysis of the FQRD-LS algorithm; and finally, it applies the constant modulus algorithm (CMA) framework to the inverse QR-decomposition recursive least-squares (QRD-RLS) algorithm. The main idea in the new techniques is to make the adaptive filter coefficients available using the internal variables of the FQRD-RLS algorithm. Four applications that result from using these techniques are system identification, burst-trained equalization, broad-band beamforming, and predistortion. WL adaptive algorithms are well suited for non-circular input signals, which arise, for example, in adaptive beamforming scenarios when the number of sources is greater than the number of antennas. In the fixed-point analysis of the FQRD-LS algorithm, we present mathematical expressions for the mean square quantization error (MSQE) of all internal variables of the FQRD-LS algorithms and derive the conditions that guarantee the stability of FQRD-LS algorithms in fixed-point implementations. Finally, we show how to apply the CMA framework to the inverse QRD-RLS algorithm and demonstrate the CMA-based IQRD-RLS algorithm in blind equalization of an optical channel.
When a system of first order linear ordinary differential equations has eigenvalues of large magnitude, its solutions generally exhibit complicated behaviour, such as high-frequency oscillations, rapid growth or rapid decay. The cost of representing such solutions using standard techniques grows with the magnitudes of the eigenvalues. As a consequence, the running times of standard solvers for ordinary differential equations also grow with the size of these eigenvalues. The solutions of scalar equations with slowly-varying coefficients, however, can be represented via slowly-varying phase functions at a cost which is bounded independent of the magnitudes of the eigenvalues of the corresponding coefficient matrix. Here we couple an existing solver for scalar equations which exploits this observation with a well-known technique for transforming a system of linear ordinary differential equations into scalar form. The result is a method for solving a large class of systems of linear ordinary differential equations in time independent of the magnitudes of the eigenvalues of their coefficient matrices. We discuss the results of numerical experiments demonstrating the properties of our algorithm.
An effective algorithm of [M. Morf, Ph.D. Thesis, Department of Electrical Engineering, Stanford University, Stanford, CA, 1974; in: Proceedings of the IEEE International Conference on ASSP, IEEE Computer Society Press, Silver Spring, MD, 1980, pp. 954-959; R.R. Bitmead and B.D.O. Anderson, Linear Algebra Appl. 34 (1980) 103-116] computes the solution $\vec{x} = T^{-1}\vec{b}$ to a strongly nonsingular Toeplitz or Toeplitz-like linear system $T\vec{x} = \vec{b}$, a short displacement generator for the inverse $T^{-1}$ of $T$, and $\det T$. We extend this algorithm to the similar computations with $n \times n$ Cauchy and Cauchy-like matrices. Recursive triangular factorization of such a matrix can be computed by our algorithm at the cost of executing $O(n r^2 \log^3 n)$ arithmetic operations, where $r$ is the scaling rank of the input Cauchy-like matrix $C$ ($r = 1$ if $C$ is a Cauchy matrix). Consequently, the same cost bound applies to the computation of the determinant of $C$, a short scaling generator of $C^{-1}$, and the solution to a nonsingular linear system of $n$ equations with such a matrix $C$. (Our algorithm does not use the reduction to Toeplitz-like computations.) We also relax the assumptions of strong nonsingularity and even nonsingularity of the input, not only for computations in the field of complex or real numbers but even where the algorithm runs in an arbitrary field. We achieve this by using randomization, and we also show a certain improvement of the respective algorithm by Kaltofen for Toeplitz-like computations in an arbitrary field. Our subject is closely related to rational tangential (matrix) interpolation under a passivity condition (e.g., to Nevanlinna-Pick tangential interpolation problems) and has further impact on the decoding of algebraic codes. © 2000 Published by Elsevier Science Inc. All rights reserved.
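The remark that r = 1 for a Cauchy matrix can be made concrete under one common reading of "scaling rank", assumed here: the rank of D_s·C − C·D_t, where D_s and D_t are diagonal matrices built from the node vectors s and t. For C[i][j] = 1/(s_i − t_j), every entry of that difference equals 1, so the result is the rank-one all-ones matrix. The sketch below checks this in exact rational arithmetic. Function names are hypothetical.

```python
from fractions import Fraction

def cauchy(s, t):
    """Cauchy matrix C[i][j] = 1 / (s[i] - t[j]); requires s[i] != t[j] for all i, j."""
    return [[Fraction(1) / (si - tj) for tj in t] for si in s]

def scaling_displacement(C, s, t):
    """Entries of D_s*C - C*D_t, with D_s = diag(s) and D_t = diag(t)."""
    return [[s[i] * C[i][j] - C[i][j] * t[j] for j in range(len(t))]
            for i in range(len(s))]
```

A Cauchy-like matrix is one for which this difference has small rank r, and the two rank-r factors form the short generator that structured algorithms operate on instead of the full n × n matrix.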