We present a linear predictive compression approach for time-consistent 3D mesh sequences that supports and exploits scalability. The algorithm decomposes each frame of a mesh sequence into layers using patch-based mesh simplification techniques. This layered decomposition is consistent in time. Following the predictive coding paradigm, local temporal and spatial dependencies between layers and frames are exploited for compression. Prediction is performed vertex-wise from coarse to fine layers, using the motion of already encoded 1-ring neighbor vertices to predict the current vertex location. It is shown that predictive exploitation of the proposed layered configuration of vertices can improve compression performance over other state-of-the-art approaches by up to 16% in domains relevant for applications.
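The vertex-wise prediction step described above can be illustrated with a minimal sketch. The abstract does not give the actual predictor, so the averaging rule and the names `predict_vertex` and `residual` are assumptions made for illustration: the current position is estimated as the previous-frame position plus the mean displacement of the already decoded 1-ring neighbors.

```python
# Hypothetical sketch, not the paper's predictor: estimate a vertex position
# in the current frame from its previous-frame position plus the mean motion
# of its already decoded 1-ring neighbors.

def predict_vertex(prev_pos, neighbor_prev, neighbor_curr):
    """prev_pos: (x, y, z) of the vertex in the previous frame.
    neighbor_prev / neighbor_curr: lists of (x, y, z) positions of the
    already decoded 1-ring neighbors in the previous and current frames."""
    if not neighbor_prev:
        # No decoded neighbors yet: fall back to the previous position.
        return tuple(float(v) for v in prev_pos)
    n = len(neighbor_prev)
    mean_motion = [
        sum(c[i] - p[i] for p, c in zip(neighbor_prev, neighbor_curr)) / n
        for i in range(3)
    ]
    return tuple(prev_pos[i] + mean_motion[i] for i in range(3))


def residual(actual, predicted):
    """Prediction error that an encoder would quantize and entropy-code."""
    return tuple(a - b for a, b in zip(actual, predicted))
```

In a predictive coder only the residual is transmitted, so the better the neighbors' motion explains the vertex's motion, the fewer bits are needed.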
ISBN: 9780769530505 (print)
Accurate vowel recognition forms the backbone of most successful speech recognition systems. A collection of techniques exists to extract the relevant features from the steady-state regions of vowels in both the time and frequency domains. In this paper we present a novel and accurate feature extraction technique for recognizing spoken Malayalam vowels based on the linear predictive coding (LPC) method, and compare the results with the wavelet packet decomposition method. Recognition is performed using a k-NN pattern classifier. The classification is conducted for 5 Malayalam vowel sounds using training and test sets of 50 samples each (10 from each class). The overall recognition accuracy obtained for the vowels using the LPC feature extraction method is 94%. The proposed method is efficient and computationally inexpensive. The experimental results demonstrate the efficiency of the proposed algorithm.
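As a rough illustration of the pipeline named above (LPC features followed by k-NN classification), the sketch below computes LPC coefficients with the autocorrelation method and the Levinson-Durbin recursion, then classifies a feature vector by majority vote. The abstract does not state the LPC order, the distance metric, or the value of k, so those choices and the function names are assumptions.

```python
# Minimal sketch, assuming autocorrelation-method LPC and Euclidean k-NN;
# the paper's actual settings are not given in the abstract.
import math
from collections import Counter


def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lpc(x, order):
    """Predictor coefficients a_1..a_order with x[n] ~ sum_k a_k * x[n-k],
    obtained with the Levinson-Durbin recursion."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # silent or perfectly predicted frame
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def knn_classify(query, train_feats, train_labels, k=3):
    """Majority vote among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_feats, train_labels), key=lambda fl: math.dist(query, fl[0])
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

A typical use would be to call `lpc(frame, order)` on a steady-state frame of each vowel recording and pass the resulting vectors to `knn_classify`.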
The algebraic code excited linear prediction (ACELP) algorithm has been adopted by many speech coding standards because of the low complexity and high quality of its analysis-by-synthesis optimisation. This study proposes the unified generalised pulse replacement (UPR) search algorithm for the stochastic codebook of ACELP speech coders. The proposed UPR algorithm addresses the search breadth, the order of the search direction and the update frequency of the pulse replacement method. Several derivative variants of the UPR algorithm are also discussed. The proposed approaches can achieve the lowest computational complexity with imperceptible degradation of speech quality. Furthermore, a normalised degradation ratio based on the standard subjective quality measurement is proposed to compare performance fairly. Experimental results verify these claims.
Although there are speech coding standards that produce high-quality speech above 4 kbps, transparent quality has not yet been achieved below that rate. There is still room for improvement at lower bit rates, especially at 2.4 kbps and below, which is an area of interest for military and security applications. Strategies for achieving high-quality speech using sinusoidal coding at very low bit rates are discussed. Previous work in the literature on combining several frames into a metaframe and performing variable bit allocation within the metaframe is extended. Experiments have been carried out to find a metaframe size that is an optimum compromise between delay and quantisation gains. Metaframe classification, and quantisation according to the metaframe class, are used for better efficiency. A method for voicing determination from the shape of the linear prediction coefficient (LPC) spectrum is also presented. The proposed techniques have been applied to the SB-LPC vocoder to produce speech at 1.2 and 0.8 kbps, and compared in a listening test with the original SB-LPC vocoder at 2.4/1.2 kbps as well as with an established standard, the Mixed Excitation Linear Predictive (MELP) vocoder, at 2.4/1.2/0.6 kbps. The proposed techniques were found to be effective in reducing the bit rate without compromising speech quality.
A new calibration method that accurately predicts the Research Octane Number (RON) values of gasoline fractions, based on their infrared spectra, is presented. This model combines linear predictive coding (LPC) and multiple linear regression (MLR) as an integrated estimation technique. Spectral information from the 4800-3520 cm⁻¹ range was first encoded into linear predictive (LP) coefficients, which were then used as predictor variables in the MLR model against RON values. The model was trained and tested on an extensive data set (384 gasoline samples) and achieved a prediction accuracy of 0.3 RON Root Mean Squared Error (RMSE). The LPC technique was found to be efficient in capturing spectral features of the entire range related to the RON characteristics of the gasoline samples, without the need for any pretreatment of the raw experimental data. The small number of input variables in the regression model ensures a robust, easy-to-use and highly accurate prediction model. (C) 2009 Elsevier Ltd. All rights reserved.
The aim of this work is to present a method in computer vision for person identification via iris recognition. The method makes essential use of computational geometry and LPC. (C) 2007 Wiley Periodicals, Inc.
ISBN: 9781424464258; 9780769539942 (print)
We consider linear predictive coding and noise shaping for coding and transmission of auto-regressive (AR) sources over lossy networks. We generalize an existing framework to arbitrary filter orders and propose the use of fixed-lag smoothing at the decoder in order to further reduce the impact of transmission failures. We show that fixed-lag smoothing up to a certain delay can be obtained without additional computational complexity by exploiting the state-space structure. We prove that the proposed smoothing strategy strictly improves performance under quite general conditions. Finally, we provide simulations on AR sources and channels with correlated losses, and show that substantial improvements are possible.
ISBN: 9783642038815 (print)
This paper investigates the use of neural networks in recognizing the Malay vowels of children in a speaker-independent manner. The Malay vowels are /a/, /e/, /ə/, /i/, /o/ and /u/. The speech database was collected from 300 Malay children between seven and twelve years old. Each speaker contributes two samples per vowel sound. The speech database is split equally into a training set and a test set. The speech sounds are sampled at 20 kHz with 16-bit resolution. A single frame of cepstral coefficients is extracted around the vowel onset point using linear predictive coding. A Multi-Layer Perceptron (MLP) with one hidden layer is trained to recognize the vowel sounds. The output layer of the MLP consists of 6 neurons, which correspond to the 6 vowel sounds. Experiments are conducted to determine the optimal signal length of the vowels and the number of hidden neurons in the MLP. A maximum recognition rate of 75.00% is achieved at signal lengths of 30 ms and 35 ms.
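One common way to obtain cepstral coefficients from linear predictive coding is the standard LPC-to-cepstrum recursion; whether the paper uses exactly this route is an assumption, as are the function name and the sign convention. For the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k), the recursion is c_n = a_n + sum_{k=1}^{n-1} (k/n) c_k a_{n-k}, with a_n taken as zero for n greater than the predictor order.

```python
# Minimal sketch of the LPC-to-cepstrum recursion, assuming the model
# H(z) = 1 / (1 - sum_k a_k z^-k); gain term c_0 is omitted.

def lpc_to_cepstrum(a, n_ceps):
    """a: list holding a_1..a_p. Returns the cepstral coefficients
    c_1..c_n_ceps of the all-pole model."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        # Only terms with 1 <= n - k <= p contribute.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

For a first-order model with coefficient a, the recursion reproduces the closed form c_n = a^n / n, which is a quick sanity check on the sign convention. The resulting vector would then serve as the input to a classifier such as the MLP described above.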
ISBN: 9781424442966 (print)
In this paper, we describe a new approach to coping with packet loss in speech coders. The idea is to split the information in each speech packet into two components: one that allows the given speech frame to be decoded independently, and one that enhances it by exploiting interframe dependencies. The scheme is based on sparse linear prediction and a redefinition of the analysis-by-synthesis process. We present Mean Opinion Scores for the proposed coder at different degrees of packet loss and show that it performs similarly to frame-dependent coders at low packet loss probability and similarly to frame-independent coders at high packet loss probability. We also present ideas on how to make the coder work synergistically with the channel loss estimate.