This paper reports a preliminary study of applying single-channel (scalar) and multichannel (vector) 2-D linear prediction to color image modeling and coding. It also introduces the novel idea of a multi-input single-output 2-D ADPCM coder. The results indicate that texture information in multispectral images can be represented by linear prediction coefficients or matrices, whereas the prediction error conveys edge information. Moreover, by using single-channel edge information, we obtained, from original color images of 24 bits/pixel, reconstructed images of good quality at information rates of 1 bit/pixel or less.
A major source of audible distortion in current low-bit-rate speech coding algorithms is an inaccurate degree of periodicity of the voiced speech signal. If the correlations between neighboring pitch cycles are accurately reproduced, these audible distortions can be reduced significantly. To this end, a novel method of coding voiced speech is introduced, which transmits an encoded prototype waveform at 20-30 ms intervals. The prototype waveform describes a pitch cycle representative of the interval, and is quantized using analysis-by-synthesis methods. The speech signal is reconstructed by concatenation of interpolated prototype waveforms. The short-term and the long-term correlations between pitch cycles can be controlled explicitly. Unquantized reconstructed speech is virtually indistinguishable from the original signal. The method results in excellent speech quality at rates between 3.0 and 4.0 kb/s.
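The reconstruction step described above (concatenating interpolated prototype waveforms) can be illustrated with a minimal sketch. It assumes the two prototypes have the same length and are already time-aligned, and it uses one linear weight per pitch cycle; the abstract does not specify the interpolation rule, so the function name `interpolate_prototypes` and these simplifications are assumptions, not the authors' method.

```python
def interpolate_prototypes(proto_a, proto_b, num_cycles):
    """Concatenate num_cycles pitch cycles that morph linearly from proto_a to proto_b.

    Assumption: both prototypes have equal length and are aligned in phase.
    A real coder must also handle a changing pitch period.
    """
    if len(proto_a) != len(proto_b):
        raise ValueError("prototypes must have the same length in this sketch")
    output = []
    for cycle in range(num_cycles):
        # Weight moves from 0 (pure proto_a) to 1 (pure proto_b) across the interval.
        weight = cycle / (num_cycles - 1) if num_cycles > 1 else 0.0
        output.extend((1.0 - weight) * a + weight * b
                      for a, b in zip(proto_a, proto_b))
    return output
```

For example, `interpolate_prototypes([0.0, 1.0], [2.0, 3.0], 3)` yields three cycles, the middle one being the average of the two prototypes.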
Speaker verification is of great importance, especially in the fields of forensics and security. This paper aims at implementing such a system at the hardware level. The system extracts features from fresh voice samples and verifies the speaker by comparing them with those stored in the database. The features used are the linear predictive coding (LPC) coefficients, which are obtained using the Levinson-Durbin (LD) algorithm. This paper proposes to use Vector Quantization (VQ) to obtain the representative LPC vectors. A simple speaker verification system for a single person is efficiently implemented on an FPGA.
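A software sketch of the comparison step may help make the idea concrete. It assumes a codebook of representative LPC vectors has already been trained for the enrolled speaker, and it accepts a claim when the average squared distance from each fresh feature vector to its nearest codeword is below a threshold. The distance measure, the threshold, and the function names are assumptions chosen for illustration; the abstract does not state them.

```python
def quantization_distortion(features, codebook):
    """Average squared Euclidean distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vec in features:
        total += min(sum((x - c) ** 2 for x, c in zip(vec, code))
                     for code in codebook)
    return total / len(features)


def verify_speaker(features, codebook, threshold):
    """Accept the claimed identity if the VQ distortion is at or below the threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    return quantization_distortion(features, codebook) <= threshold
```

A low distortion means the fresh LPC vectors fall close to the stored representative vectors, which is the basis for accepting the speaker.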
In this work we compare different classification algorithms applied to different numbers of features (linear predictive coding coefficients) in order to detect audio signals from wildlife areas. The final goal is to find the appropriate number of linear predictive coding coefficients to provide the desired accuracy for a given framework. The experimental results show that the best classifier is Logistic Model Trees regardless of the number of features, with a consistent classification accuracy greater than 95%. With a reduced number of features, both Random Forest and Lazy IBk perform well; their classification accuracy is greater than 98%.
Room acoustic response modeling is a challenging problem. Typical applications include speech dereverberation and loudspeaker correction. Traditionally, infinite-duration impulse response (IIR) or finite-duration impulse response (FIR) filters have been used for acoustic response modeling and equalization. The IIR filter, also called a parametric filter, has a bell-shaped magnitude response and is characterized by its center frequency, the gain at the center frequency, and a Q factor (which is inversely related to the bandwidth of the filter). It is easily implemented as a cascade for purposes of room response modeling and equalization. In this paper we present a technique for determining the coefficients of a second-order IIR filter using a linear predictive coding (LPC) model, where the poles or roots of a high-order LPC model dictate the parameters of the parametric filter. Because of the band interactions between the IIR filters that form the cascade modeling the room response, we also present a technique to optimize the Q values so as to better characterize the room response. An accurate model allows for better equalization, correcting the loudspeaker and room acoustics for speech/audio enhancement, particularly at low frequencies. Alternatively, this technique can be used for speech dereverberation applications where the room responses have been estimated a priori. The advantage of the proposed method is the fast computation of the IIR filter parameters from the LPC model, since (i) the LPC model is efficient to compute, as it uses the Levinson-Durbin recursion to solve the normal equations that arise from the least squares formulation, and (ii) a reasonably high-order LPC model is able to accurately model the low-frequency room response modes.
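To illustrate how an LPC pole can dictate parametric filter parameters, the sketch below converts one complex pole into a center frequency, a bandwidth, and a Q factor. It uses the common resonance approximation in which a pole at radius r and angle θ corresponds to a center frequency θ·fs/(2π) and a bandwidth of about −ln(r)·fs/π. Whether the paper uses exactly this mapping is an assumption, the gain is not derived here, and the name `pole_to_parametric` is hypothetical.

```python
import cmath
import math


def pole_to_parametric(pole, sample_rate):
    """Map a stable complex pole to (center_frequency_hz, bandwidth_hz, q).

    Uses the standard approximation bandwidth = -ln(|pole|) * fs / pi,
    which is an assumption about the mapping, not the paper's stated formula.
    """
    radius = abs(pole)
    angle = abs(cmath.phase(pole))  # use the positive-frequency member of a conjugate pair
    if not 0.0 < radius < 1.0 or angle == 0.0:
        raise ValueError("expected a stable complex pole with nonzero angle")
    center_frequency = angle * sample_rate / (2.0 * math.pi)
    bandwidth = -math.log(radius) * sample_rate / math.pi
    return center_frequency, bandwidth, center_frequency / bandwidth
```

Poles close to the unit circle give narrow bandwidths and therefore high Q, which matches the inverse relation between Q and bandwidth noted above.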
A continuously adaptive approach to speech encoding is presented. In contrast with other adaptation methods, it provides reliable modeling of the transition between two phonemes, and, unlike the usual block-stationary techniques, it eliminates the need to detect these transitions. Continuously adaptive linear predictive coding takes into account the inherent nonstationarity of the speech signal by using the expected minimum rate of change of the model parameters as a constraint in the recursive estimation of these parameters. The criterion considered is a constrained-least-squares cost functional which incorporates with equal weight all instantaneous errors up to the time of observation. An appropriate algorithm is given, and simulations are presented to illustrate the basic cost-performance tradeoffs involved in the approach.
In linear predictive coding (LPC) analysis, the linear predictors are computed using the classical Levinson-Durbin algorithm. However, the Levinson-Durbin algorithm is a processing bottleneck, as it involves the accumulation of inner products in the calculation of the reflection coefficients. This paper develops and reports on modifications to the algorithm, as used in various speech processing and coding applications, for efficient implementations.
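For reference, a minimal sketch of the classical (unmodified) Levinson-Durbin recursion is given below; the commented line marks the inner product whose accumulation the abstract identifies as the bottleneck. The sign convention (predictor x̂[n] = −Σ a[k]·x[n−k] with a[0] = 1) and the function names are assumptions for illustration. The paper's modifications are not reproduced here.

```python
def autocorrelation(signal, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a finite signal."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, reflection, error) with a[0] == 1.0, assuming the convention
    x_hat[n] = -sum(a[k] * x[n - k] for k in 1..order).
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    reflection = []
    for m in range(1, order + 1):
        # Inner product over the current coefficients: the serial bottleneck.
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / error
        reflection.append(k_m)
        updated = a[:]
        for k in range(1, m):
            updated[k] = a[k] + k_m * a[m - k]
        updated[m] = k_m
        a = updated
        error *= 1.0 - k_m * k_m
    return a, reflection, error
```

Each stage needs the previous stage's full coefficient vector before its inner product can be formed, which is why the recursion is hard to parallelize in its classical form.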
We present a linear predictive compression approach for time-consistent 3D mesh sequences that supports and exploits scalability. The algorithm decomposes each frame of a mesh sequence into layers by employing patch-based mesh simplification techniques. This layered decomposition is consistent in time. Following the predictive coding paradigm, local temporal and spatial dependencies between layers and frames are exploited for compression. Prediction is performed vertex-wise from coarse to fine layers, exploiting the motion of already encoded 1-ring neighbor vertices to predict the current vertex location. It is shown that a predictive exploitation of the proposed layered configuration of vertices can improve compression performance over other state-of-the-art approaches by up to 16% in domains relevant for applications.
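The vertex-wise prediction idea can be sketched as follows. This is one simple possibility, assuming the predicted position is the vertex's previous-frame position plus the mean displacement of its already encoded 1-ring neighbors; the actual predictor in the paper is not given in the abstract, and the function names are hypothetical.

```python
def predict_vertex(prev_position, neighbors_prev, neighbors_curr):
    """Predict a vertex position from the mean motion of its encoded 1-ring neighbors.

    Assumption: a plain average of neighbor displacements; the paper may weight
    neighbors differently.
    """
    count = len(neighbors_prev)
    motion = [sum(curr[i] - prev[i]
                  for prev, curr in zip(neighbors_prev, neighbors_curr)) / count
              for i in range(3)]
    return tuple(prev_position[i] + motion[i] for i in range(3))


def prediction_residual(actual, predicted):
    """Residual that a predictive coder would quantize and entropy-code."""
    return tuple(a - p for a, p in zip(actual, predicted))
```

When neighboring vertices move coherently, the residual is small, which is what makes the prediction worthwhile for compression.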
A digital audio watermarking algorithm based on the discrete wavelet transform is presented. A visually significant binary image is embedded, after some pre-processing and SS modulation, in the low-to-middle frequency audio coefficients in the wavelet domain. A watermark detection scheme based on linear predictive coding is presented; it does not require the original signal during watermark extraction. The BER is improved by 10%-15% compared with the algorithm in Wang R.D. and Chai P.Q. (2003). Experimental results show that the watermark is imperceptible and that the algorithm is robust to many attacks, such as low-pass filtering, resampling, and MP3 compression.