Accurate quantization of the LPC model is of prime importance for the quality of low bit-rate speech coders. In the literature, the quantization properties of several representations of the LPC model have been studied. The best results have generally been obtained with the line spectrum pair (LSP) frequencies. In scalar quantization schemes, the immittance spectrum pairs (ISP) perform slightly better still. The good quantization performance of LSP and ISP can be attributed to their theoretical statistical properties: unlike the other representations, they are uncorrelated when estimated from stationary autoregressive processes. For small variations in the coefficients of any representation, the spectral distortion can be expressed as a weighted squared distortion measure. The optimal weighting matrix is the inverse of the covariance matrix of the coefficients. For the LSP and ISP this matrix is diagonal, and hence the best weighting factors are the inverses of the theoretical variances. The difference between the LSP and ISP is due to their distributions in speech.
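The weighted squared distortion described above can be sketched as follows. This is a minimal illustration, assuming a diagonal covariance matrix so that the weights reduce to inverse variances; the function names, the variance values, and the toy codebook are hypothetical and not taken from the paper.

```python
import numpy as np

def weighted_sq_distortion(x, y, variances):
    """d(x, y) = sum_i (x_i - y_i)^2 / var_i, i.e. a diagonal inverse-covariance weighting."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    variances = np.asarray(variances, dtype=float)
    return float(np.sum((x - y) ** 2 / variances))

def nearest_codeword(x, codebook, variances):
    """Index of the codebook row with the smallest weighted distortion to x."""
    x = np.asarray(x, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    variances = np.asarray(variances, dtype=float)
    d = np.sum((codebook - x) ** 2 / variances, axis=1)
    return int(np.argmin(d))

# Hypothetical two-coefficient example: the second coefficient has a large
# variance, so errors in it are penalized far less than errors in the first.
codebook = [[1.0, 0.0], [0.0, 2.0]]
print(nearest_codeword([0.0, 0.0], codebook, variances=[1.0, 100.0]))
```

With unit variances the same query selects the first codeword instead, which shows how the weighting changes the quantizer's decision.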
In this paper, we discuss the use of artificial neural learning methods for low bit-rate speech compression, potentially in non-stationary environments. Unsupervised learning algorithms are particularly well suited to vector quantization (VQ), which is used in many speech compression applications. We discuss two unsupervised learning algorithms, frequency-sensitive competitive learning and Kohonen's self-organizing maps, both of which have been investigated for learning the codebook vectors in an adaptive vector quantizer. In contrast with earlier work, we have employed these learning rules in VQ of the linear predictive coding (LPC) prediction residual. The performance of these unsupervised learning algorithms in speaker-dependent and speaker-independent speech compression is presented. Our results compare favourably with those of code-excited linear prediction (CELP), requiring less computational power at the cost of a tolerable reduction in speech quality. We also explore the effects of limited precision on classification and learning in competitive learning algorithms for low-power VLSI implementations.
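As a rough illustration of frequency-sensitive competitive learning for codebook training, the sketch below scales each codeword's distortion by its win count so that rarely used codewords get a better chance of winning. It is a generic textbook-style version under assumed settings (squared-error distortion, fixed learning rate, initialization from training vectors); the function name `fscl_train` and all parameter values are hypothetical, and the paper's exact rule may differ.

```python
import numpy as np

def fscl_train(data, n_codewords, lr=0.05, epochs=5, seed=0):
    """Train a VQ codebook with frequency-sensitive competitive learning."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialize the codebook from distinct, randomly chosen training vectors.
    idx = rng.choice(len(data), size=n_codewords, replace=False)
    codebook = data[idx].copy()
    wins = np.ones(n_codewords)  # win counters, started at 1
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            dist = np.sum((codebook - x) ** 2, axis=1)
            # Frequency-sensitive rule: distortion is scaled by the win count.
            winner = int(np.argmin(wins * dist))
            # Move only the winning codeword toward the input vector.
            codebook[winner] += lr * (x - codebook[winner])
            wins[winner] += 1
    return codebook, wins
```

In a residual-VQ setting, `data` would hold fixed-length segments of the LPC prediction residual; that framing is an assumption about usage, not a detail given in the abstract.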
ISBN: (print) 0780324315
We present a 16 kb/s CELP coder with a complexity as low as 3 MIPS. The main thrust is to reduce the complexity as much as possible while maintaining toll quality. This low-complexity CELP (LC-CELP) coder has the following features: (1) fast LPC quantization, (2) 3-tap pitch prediction with efficient open-loop pitch search and predictor tap quantization, (3) backward-adaptive excitation gain, and (4) a trained excitation codebook with a small vector dimension and a small codebook size. Most CELP coders require one full DSP chip, or even two, for real-time implementation. In contrast, 3 to 6 full-duplex LC-CELP coders can fit into a single DSP chip, since each takes only around 3 MIPS. This coder achieved slightly higher mean opinion scores (MOS) than the CCITT 32 kb/s ADPCM. It also exhibits good performance when tandemed with itself or transcoded with other coders.
A new method for extracting the area, peripheral and corner capacitance components of a bipolar junction transistor, using measurements versus bias on a number of different structures, is presented and validated. The same model with different parameters is used for the three components. Validation has been made using a quasi-2D simulator. Finally, the method is shown to give accurate results, as judged by goodness of fit.
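One common way to separate such components, sketched here only as an illustration, is to assume that at a fixed bias the measured capacitance of each test structure is a linear combination of its area, perimeter, and corner count, and then solve for the three coefficients by least squares. This linear geometric decomposition, the function name `extract_components`, and the example geometries are assumptions; the abstract does not specify the extraction procedure.

```python
import numpy as np

def extract_components(areas, perimeters, corner_counts, c_measured):
    """Least-squares fit of C = c_area*A + c_perim*P + c_corner*N at one bias point.

    Returns (c_area, c_perim, c_corner): capacitance per unit area,
    per unit perimeter length, and per corner.
    """
    geometry = np.column_stack([areas, perimeters, corner_counts]).astype(float)
    coeffs, *_ = np.linalg.lstsq(geometry, np.asarray(c_measured, dtype=float), rcond=None)
    return coeffs

# Hypothetical test structures (arbitrary units): three rectangles and one L-shape.
areas = [100, 200, 400, 300]
perimeters = [40, 60, 80, 80]
corners = [4, 4, 4, 6]
true_coeffs = np.array([0.5, 0.2, 1.5])
c_meas = np.column_stack([areas, perimeters, corners]) @ true_coeffs
print(extract_components(areas, perimeters, corners, c_meas))
```

Repeating the fit at each bias point would yield three bias-dependent curves, to which the same capacitance model with different parameters could then be fitted.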
Use of a bilinear conformal map to achieve a frequency warping nearly identical to the Bark scale is described. Because the map takes the unit circle to itself, its form is that of an allpass transfer function. Since it is a first-order map, it preserves the model order of rational systems. A direct-form expression for computing the optimal allpass coefficient as a function of sampling rate is developed, and a filter design example is presented.
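The frequency warping induced by a first-order allpass map can be sketched numerically as below. The sign convention zeta = (z - rho) / (1 - rho z) and the name `allpass_warp` are assumptions made for illustration; the paper's convention may differ, and its expression for the optimal coefficient as a function of sampling rate is not reproduced here, so `rho` is simply an input.

```python
import numpy as np

def allpass_warp(omega, rho):
    """Warped frequency (radians) under zeta = (z - rho) / (1 - rho*z), z = exp(j*omega).

    For real rho with |rho| < 1 the map sends the unit circle to itself,
    so the warped frequency is the angle of zeta.
    """
    z = np.exp(1j * np.asarray(omega, dtype=float))
    zeta = (z - rho) / (1.0 - rho * z)
    return np.angle(zeta)

# Under this convention a positive rho stretches the low-frequency region,
# which is the qualitative behaviour of a Bark-like scale.
omega = np.linspace(0.01, 3.0, 50)
warped = allpass_warp(omega, rho=0.5)
```

Setting `rho = 0` gives the identity map, which is a convenient sanity check.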
Unsupervised learning algorithms play a central part in models of neural computation. K-means clustering algorithms, a type of unsupervised learning algorithm, have been used in many application areas. We propose an improved K-means algorithm for optimal partition which can achieve better variation equalization than standard binary splitting algorithms. The proposed clustering algorithm was applied to a combined multi-codebook/MLP neural network speech recognition system to train the LPC-based codebooks. It achieved a smaller variation in cluster variances than the standard binary splitting algorithm.
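For reference, the standard binary splitting baseline mentioned above can be sketched as follows. This is not the proposed improved algorithm, whose details the abstract does not give; it is a generic version in which each centroid is split by a small additive perturbation and refined with K-means (Lloyd) iterations. All function names and the perturbation size are hypothetical.

```python
import numpy as np

def assign(data, centroids):
    """Label each vector with the index of its nearest centroid (squared error)."""
    d = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def lloyd(data, centroids, n_iter=20):
    """Refine centroids with plain K-means iterations."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        labels = assign(data, centroids)
        for k in range(len(centroids)):
            members = data[labels == k]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[k] = members.mean(axis=0)
    return centroids

def binary_split_codebook(data, n_splits, eps=0.01):
    """Grow a codebook of size 2**n_splits by repeated splitting and refinement."""
    data = np.asarray(data, dtype=float)
    centroids = data.mean(axis=0, keepdims=True)
    delta = eps * data.std(axis=0)  # small additive perturbation per dimension
    for _ in range(n_splits):
        centroids = np.vstack([centroids + delta, centroids - delta])
        centroids = lloyd(data, centroids)
    return centroids, assign(data, centroids)

def cluster_variances(data, centroids, labels):
    """Mean squared distance to the centroid for each non-empty cluster."""
    data = np.asarray(data, dtype=float)
    return np.array([((data[labels == k] - centroids[k]) ** 2).sum(axis=1).mean()
                     for k in range(len(centroids)) if np.any(labels == k)])
```

The spread of the values returned by `cluster_variances` is one way to quantify how evenly a partition equalizes cluster variance, which is the property the abstract compares.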
An HMM-based continuous Hebrew phoneme recognition system that requires no manual segmentation for its training was developed. A relatively small Hebrew database was acquired for training and recognition of phonemes in continuous speech. One of the main problems in phoneme recognition, that of manual segmentation of the training database, was overcome by a special training algorithm. The Viterbi algorithm was used in the recognition stage, and the results were evaluated with the Levenshtein distance measure. Initial results for speaker-independent, text-dependent recognition of Hebrew phonemes were 69.4% correct phoneme recognition.
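The Levenshtein distance used for evaluation counts the minimum number of insertions, deletions, and substitutions needed to turn one sequence into another. A minimal sketch over phoneme label sequences follows; the phoneme strings are made-up examples, and how the paper derives its percent-correct figure from the distance is not stated in the abstract.

```python
def levenshtein(ref, hyp):
    """Minimum number of insertions, deletions and substitutions turning ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

# Hypothetical reference and recognized phoneme sequences (one phoneme dropped).
reference = ["sh", "a", "l", "o", "m"]
recognized = ["sh", "a", "o", "m"]
print(levenshtein(reference, recognized))  # prints 1
```

Because the function only compares elements for equality, it works on lists of phoneme labels as well as on plain strings.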