Formants describe the basic properties of speech efficiently with a very small parameter set, and are therefore widely used in speech processing applications such as coding, recognition, synthesis and enhancement. Estimating formants is harder than simply tracking the peaks of the spectrum, because the spectral peaks of the vocal tract output depend on the shape of the vocal tract, the excitation and the periodicity in a complex way. For this reason, much past work has addressed formant estimation, and the strengths and weaknesses of the resulting methods have been recognized. In this article we analyze and compare the performance of some popular formant estimation methods. Of the three methods compared, the particle filtering based method performs best. Furthermore, the linear predictive coding method has difficulty with signals sampled at low frequencies, and the cepstrum method produces excess formants during peak picking.
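As a rough illustration of the linear predictive coding approach mentioned in this abstract, the sketch below fits an all-pole model with the autocorrelation method and reads formant candidates off the peaks of the LPC envelope. It is a minimal sketch, not the method evaluated in the article: the function names, the model order of 4, the 512-point frequency grid and the synthetic two-resonator test signal are all assumptions made for illustration.

```python
import cmath
import math

def autocorrelation(x, max_lag):
    """Raw (unnormalized) autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return the predictor polynomial a (a[0] == 1) of A(z) = sum a[k] z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error energy
    return a

def envelope_peaks(a, fs, n_grid=512):
    """Frequencies (Hz) of local maxima of the LPC envelope 1/|A(e^{jw})|."""
    mag = []
    for m in range(n_grid):
        w = math.pi * m / n_grid
        A = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))
        mag.append(1.0 / abs(A))
    return [m * fs / (2.0 * n_grid)
            for m in range(1, n_grid - 1)
            if mag[m - 1] < mag[m] >= mag[m + 1]]

def two_resonator_signal(f1, f2, fs, radius=0.97, n=600):
    """Hypothetical test signal: impulse response of two cascaded resonators."""
    def section(f):
        theta = 2.0 * math.pi * f / fs
        return [1.0, -2.0 * radius * math.cos(theta), radius * radius]
    s1, s2 = section(f1), section(f2)
    a = [0.0] * 5
    for i, u in enumerate(s1):              # multiply the two second-order sections
        for j, v in enumerate(s2):
            a[i + j] += u * v
    y = []
    for i in range(n):
        v = 1.0 if i == 0 else 0.0
        v -= sum(a[k] * y[i - k] for k in range(1, 5) if i - k >= 0)
        y.append(v)
    return y

# Assumed example: resonances at 700 Hz and 1800 Hz, 8 kHz sampling rate.
signal = two_resonator_signal(700.0, 1800.0, fs=8000.0)
coeffs = levinson_durbin(autocorrelation(signal, 4), 4)
formants = envelope_peaks(coeffs, fs=8000.0)
```

On real speech the envelope peaks would still need to be screened (for example by bandwidth) before being accepted as formants, which is one reason plain peak picking is considered insufficient.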
This paper describes a novel approach to fractal modelling and coding of residuals for excitation in the linear predictive coding of speech. This work was motivated by reducing the bit rate to 1200 bps while maintaining good speech quality. Linear prediction based speech coders differ primarily in how they model the residual, where the design trade-off is between quality and bit rate. In this paper fractal modelling is used to model the residual. We show that fractal modelling reduces the bit rate while maintaining quality. A 6 kbps speech coder was implemented using the piecewise self-affine fractal model. The new coder has a signal-to-noise ratio of 10.9 dB. An informal subjective measure found the perceptual quality to be comparable to that of the 13 kbps GSM coder.
The feasibility and performance of an embedded RPE (ERPE) scheme based on multistage coding is investigated. The coding efficiency of second and subsequent stages depends on the spectral envelope difference between the original speech and the error signal at each stage whereas re-use of LPC parameters derived from the original speech depends on the corresponding LPC spectral difference. Suitable measures of spectral difference are defined and simulation shows that both decrease with the perceptual weighting factor. The ERPE system requires little extra coding complexity and can be simplified further by using a partial phase adaptation procedure with marginal loss of SNR performance. The simulated ERPE system shows graceful reduction of reconstructed speech quality for bit rates from 14.8 to 6.4 kb/s in 4.2 kb/s steps.
The covariance analysis of linear predictive coding has wide applications, especially in speech recognition and speech signal processing. Real-time applications demand very high processing speed for linear predictive coding analysis. VLSI technology, which offers low cost, high speed and massive computing capability, is a suitable candidate. In this paper, systolic array processors for the covariance analysis of linear predictive coding are developed. The covariance analysis of linear predictive coding contains a large set of irregular and nested recurrence equations. Systolizing the algorithm is difficult for such a complex problem, and existing systematic design methods for systolic arrays are of little help here. To overcome this, a break-combination method is presented in this paper. The task is first decomposed and then mapped onto several interconnected systolic arrays. The resulting systolic arrays of the subproblems are then combined to form a complete solution.
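For orientation, the computation that such arrays would parallelize can be sketched sequentially. The block below is a plain software sketch of the covariance method, assuming the usual formulation with covariance terms phi(i, k) = sum over m of s[m-i] s[m-k] and normal equations solved by Gaussian elimination. It says nothing about the paper's systolic mapping, and the function name and test signal are illustrative assumptions.

```python
def covariance_lpc(s, order):
    """Predictor coefficients alpha[0..order-1] so that
    s[n] is approximated by sum_k alpha[k-1] * s[n-k], k = 1..order."""
    n = len(s)
    # Covariance terms phi[i][k] = sum_{m=order}^{n-1} s[m-i] * s[m-k]
    phi = [[sum(s[m - i] * s[m - k] for m in range(order, n))
            for k in range(order + 1)]
           for i in range(order + 1)]
    # Augmented system: sum_k phi[i][k] * alpha_k = phi[i][0], i = 1..order
    aug = [[phi[i][k] for k in range(1, order + 1)] + [phi[i][0]]
           for i in range(1, order + 1)]
    p = order
    # Forward elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, p):
            f = aug[row][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[row][c] -= f * aug[col][c]
    # Back substitution
    alpha = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(aug[i][c] * alpha[c] for c in range(i + 1, p))
        alpha[i] = (aug[i][p] - tail) / aug[i][i]
    return alpha

# Assumed example: a noise-free second-order recursion with known coefficients.
s = [1.0, 1.5]
for n in range(2, 50):
    s.append(1.5 * s[n - 1] - 0.8 * s[n - 2])
alpha = covariance_lpc(s, 2)
```

The triple nesting in the covariance terms and the data-dependent elimination steps give a sense of the nested recurrences the abstract refers to.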
The aim of this work is to present a method in computer vision for person identification via iris recognition. The method makes essential use of computational geometry and LPC. (C) 2007 Wiley Periodicals, Inc.
ISBN: (print) 9781467363204
Speech coders are a fundamental component of telecommunication and multimedia infrastructure. Several systems, such as mobile telephony, voice over internet protocol (VoIP) and audio conferencing, rely on efficient speech coding. Speech coders strive to provide a low bit rate while maintaining speech quality and intelligibility. Linear predictive coding uses spectral properties of speech to "optimize" the coder's performance for the human ear. In this paper we perform a comparative assessment of the speech coding performance of some state-space filters to give designers an insight into the capabilities of these filters. The filters considered are the Kalman filter, state-space recursive least-squares (SSRLS) and SSRLS with adaptive memory (SSRLSWAM). The results of RLS and LMS are also quoted. The performance is judged in terms of perceptual evaluation of speech quality (PESQ) and prediction gain.
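Prediction gain, one of the two metrics named above, is commonly defined as the ratio of signal energy to prediction-residual energy, expressed in dB. The sketch below computes it for a normalized LMS (NLMS) one-step predictor, used here only as a simple stand-in; it is not one of the state-space filters assessed in the paper, and the order, step size and sinusoidal test signal are assumptions.

```python
import math

def nlms_residual(x, order=4, mu=0.5, eps=1e-8):
    """One-step-ahead NLMS prediction; returns the prediction residual."""
    w = [0.0] * order
    residual = []
    for n in range(len(x)):
        # Most recent `order` past samples, zero before the start of the signal
        past = [x[n - k] if n - k >= 0 else 0.0 for k in range(1, order + 1)]
        pred = sum(wi * pi for wi, pi in zip(w, past))
        e = x[n] - pred
        norm = eps + sum(p * p for p in past)
        # Normalized LMS weight update
        w = [wi + mu * e * pi / norm for wi, pi in zip(w, past)]
        residual.append(e)
    return residual

def prediction_gain_db(x, residual):
    """10 * log10(signal energy / residual energy)."""
    signal_energy = sum(v * v for v in x)
    residual_energy = sum(e * e for e in residual)
    return 10.0 * math.log10(signal_energy / residual_energy)

# Assumed example: a pure sinusoid, which is highly predictable.
x = [math.sin(0.3 * n) for n in range(1000)]
gain = prediction_gain_db(x, nlms_residual(x))
```

A higher gain means the predictor removes more of the signal's redundancy, leaving less residual energy to be coded.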
ISBN: (print) 9780769530505
Accurate vowel recognition forms the backbone of most successful speech recognition systems. A collection of techniques exists to extract the relevant features from the steady-state regions of vowels in both the time and frequency domains. In this paper we present a novel and accurate feature extraction technique for recognizing spoken Malayalam vowels based on the linear predictive coding method, and compare the results with the wavelet packet decomposition method. Recognition is performed using a k-NN pattern classifier. Classification is conducted for 5 Malayalam vowel sounds using training and test sets of 50 samples each (10 from each class). The overall recognition accuracy obtained for the vowels using the LPC feature extraction method is 94%. The proposed method is efficient and computationally less expensive. The experimental results demonstrate the efficiency of the proposed algorithm.
A description is given of an efficient code-excited linear predictive (CELP) coder for bit rates between 6 and 16 kb/s, and novel effective algorithms for the selection of the excitation signal. The authors then propose a class of binary codebooks having interesting properties in terms of performance, robustness against transmission errors and flexibility in the choice of the bit rate. Due to the optimal structure of the coder and to the fast algorithms for selecting the excitation signal, a real-time implementation has been made possible on one 32-bit floating-point digital signal processor.
This paper presents a segmentation based linear predictive coding (SLPC) method for multispectral images. Given a set of multispectral images, the SLPC method first segments it into statistically distinct regions. It then finds a suitable linear prediction model for each region. Finally, it quantizes the prediction error in each class using a vector quantizer. The original image set is described by the segmentation map, the model parameters for each class, and the quantized prediction errors. The SLPC method can produce very high compression gains, because the specification of the segmentation map and model parameters requires significantly fewer bits than that for the original intensity values. This method has been applied to magnetic resonance head images with three spectral bands (one T1 weighted and two T2 weighted, 256 × 256 × 12 bits/image). Images compressed by a factor of more than 22 have been regarded as indistinguishable from the originals by several radiologists.
Numerous authors have developed digital techniques for enhancing acoustically noisy speech signals. These algorithms are often applied as preprocessors to narrowband digital speech communications systems. Among the drawbacks of the preprocessor approach is the relatively high overall computational complexity. A method, using smoothed spectral estimates, is described for embedding many of these techniques within the autocorrelation approach to linear predictive analysis. This method provides a significant reduction in overall complexity.
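One way to picture how enhancement could sit inside autocorrelation-based analysis: the autocorrelation lags are the inverse transform of the power spectrum, so a spectral modification can be applied before the lags are formed. The sketch below illustrates only that relationship under stated assumptions. It uses a plain DFT, a constant noise-floor subtraction as a hypothetical stand-in for an enhancement step, and illustrative function names; it is not the smoothing procedure described in the paper.

```python
import cmath
import math

def power_spectrum(x, n_fft):
    """|DFT|^2 of x zero-padded to n_fft points (direct O(N^2) DFT)."""
    padded = list(x) + [0.0] * (n_fft - len(x))
    spec = []
    for k in range(n_fft):
        X = sum(v * cmath.exp(-2j * cmath.pi * k * n / n_fft)
                for n, v in enumerate(padded))
        spec.append(abs(X) ** 2)
    return spec

def subtract_noise_floor(power, noise_power):
    """Hypothetical enhancement step: subtract a constant floor, clamp at zero."""
    return [max(p - noise_power, 0.0) for p in power]

def autocorr_from_spectrum(power, max_lag):
    """Autocorrelation lags as the inverse DFT of the power spectrum."""
    n_fft = len(power)
    return [sum(p * cmath.exp(2j * cmath.pi * k * lag / n_fft)
                for k, p in enumerate(power)).real / n_fft
            for lag in range(max_lag + 1)]

# Assumed example frame; n_fft is twice the frame length so that the
# circular autocorrelation equals the linear one for the lags used.
frame = [math.sin(0.4 * n) + 0.05 * n for n in range(16)]
power = power_spectrum(frame, 32)
lags_plain = autocorr_from_spectrum(power, 4)
lags_enhanced = autocorr_from_spectrum(subtract_noise_floor(power, 0.5), 4)
```

Either set of lags could then be passed to a standard Levinson-Durbin recursion, so the enhancement adds no separate time-domain preprocessing pass.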