ISBN: 0780331923 (print)
In speech coding, several vector quantization (VQ) methods for the LPC (linear predictive coding) parameters have been developed. Because LPC parameters are too dynamic to quantize directly, the LSFs (line spectrum frequencies) are used instead. In this study, we propose the linked split-vector quantizer (LSVQ), in which the lower and upper codebooks are selected according to a preselected middle codevector. Using the ordering property of LSFs, LSVQ links three codebooks to make efficient use of the codebook space. Compared with the conventional split-vector quantizer (SVQ), LSVQ increases the usage of codebook space by 10.84% and shows lower spectral distortion at 23 bits/frame than the SVQ at 24 bits/frame.
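The ordering idea can be illustrated with a minimal sketch. It is not the LSVQ of the abstract, which selects whole lower and upper codebooks per middle codevector; here, as a simplifying assumption, a single lower and a single upper codebook are filtered so that the concatenated result respects the LSF ordering. The function names, the split points and the fallback to the full codebook are all hypothetical.

```python
def sq_dist(a, b):
    """Plain squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(target, codebook):
    """Return the codevector closest to target."""
    return min(codebook, key=lambda c: sq_dist(target, c))


def linked_split_quantize(lsf, lower_cb, middle_cb, upper_cb, splits=(3, 6)):
    """Quantize an LSF vector in three parts, middle part first.

    Assumption (not from the abstract): the 'link' is modelled by keeping
    only lower/upper candidates that preserve ordering against the chosen
    middle codevector. If no candidate qualifies, the full codebook is used.
    """
    a, b = splits
    lo_t, mid_t, up_t = lsf[:a], lsf[a:b], lsf[b:]
    mid = nearest(mid_t, middle_cb)
    lower_ok = [c for c in lower_cb if c[-1] < mid[0]] or lower_cb
    upper_ok = [c for c in upper_cb if c[0] > mid[-1]] or upper_cb
    lo = nearest(lo_t, lower_ok)
    up = nearest(up_t, upper_ok)
    return list(lo) + list(mid) + list(up)
```

In this sketch, an upper codevector that is closest in plain Euclidean terms is rejected when its first element falls below the last element of the chosen middle codevector, so the decoded vector stays monotonically increasing.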
An algorithm for coding 20 Hz to 15 kHz speech signals at 64 kbit/s with a very low delay (frame of 0.16 ms) is presented. To achieve near-transparent quality, the authors propose adapting the Low-Delay CELP coder to the 15 kHz bandwidth and suggest a new noise shaping method based on a psychoacoustic model. In this way they take advantage of both linear predictive coding and the masking properties of the human perception system. Finally, an algebraic codebook is proposed, which substantially reduces the coder's computational complexity without degrading the perceived quality of the signals.
Pseudo-articulatory representations are increasingly being used in work on speech synthesis and recognition. The value of such representations lies in their derivation from linguistic abstractions: they are based on articulatory idealizations used by linguists to describe speech. Iles and Edmondson (1994) demonstrated that, using these representations, it is possible to overcome the many-to-one problem in mapping articulatory configuration to acoustic signal. The authors show how the representations facilitate the details of speech processing, for both synthesis and recognition, and give details of work in progress on recognition. The role of pseudo-articulatory representations in the development of an integrated approach to synthesis and recognition is also discussed.
We present an adaptive path-following method based on the technique of homotopy, which efficiently computes the line spectral pairs by exploiting their natural ordering and low frame-to-frame variation. We first define continuous paths from known roots of the LSP polynomials of a prior speech frame to the unknown roots of the next frame in the sequence. A gradient-search based numerical predictor-corrector procedure is then used for tracing these paths in order to compute the unknown roots. This method uses only scalar operations and allows for all paths to be tracked independently. Conditions guaranteeing the existence of continuously differentiable paths are established by using a transformation between the s and z domains, and simulation results are presented to verify these conditions. Finally, we suggest guidelines for algorithm parameter selection as well as for reducing the computational complexity.
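A minimal sketch of homotopy path-following for polynomial roots may help make the idea concrete. It is not the authors' algorithm: the abstract describes a gradient-search based predictor-corrector, whereas this sketch swaps in an Euler predictor with a Newton corrector and a fixed step count, and assumes the convex homotopy H(x, t) = (1 - t) p_prev(x) + t p_next(x) with real, well-separated roots. All function names and parameters are hypothetical.

```python
def polyval(coeffs, x):
    """Horner evaluation; coefficients are ordered highest degree first."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def polyder(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def track_root(p_prev, p_next, root, steps=50, newton_iters=3):
    """Follow one real root of H(x, t) from t = 0 (p_prev) to t = 1 (p_next).

    Uses only scalar operations, so each root can be tracked independently.
    Assumes dH/dx stays nonzero along the path (no root collisions).
    """
    d_prev, d_next = polyder(p_prev), polyder(p_next)
    x = root
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        # Predictor: Euler step along dx/dt = -(dH/dt) / (dH/dx).
        h_x = (1 - t) * polyval(d_prev, x) + t * polyval(d_next, x)
        h_t = polyval(p_next, x) - polyval(p_prev, x)
        x -= dt * h_t / h_x
        t += dt
        # Corrector: a few Newton iterations on H(., t) at the new t.
        for _ in range(newton_iters):
            h = (1 - t) * polyval(p_prev, x) + t * polyval(p_next, x)
            h_x = (1 - t) * polyval(d_prev, x) + t * polyval(d_next, x)
            x -= h / h_x
    return x
```

When consecutive frames differ only slightly, the start root is already close to the target, which is why a small number of steps can suffice in a sketch like this one.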
We present a simple and efficient new method for encoding LPC parameters based on soft computing (SC). More specifically, using a novel training technique that is able to extract fuzzy knowledge, we propose a method for transforming reflection coefficients into a new set of parameters related to the fuzzy inferential process. Performance evaluations in terms of spectral distortion, together with comparisons of perceptual quality in listening tests, show that this new approach is appropriate for low bit-rate speech coding, as it achieves a good trade-off between quality and bit rate with a very low computational load.
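Spectral distortion, the objective measure mentioned above, is commonly computed as the root-mean-square difference between two LPC log power spectra in dB. The sketch below assumes that common definition, evaluated over the full band on a uniform frequency grid; the abstract does not state which band or grid was used, and the function names and the convention A(z) = 1 + sum a_k z^-k are assumptions.

```python
import cmath
import math


def lpc_log_spectrum_db(a, omega):
    """Log power spectrum in dB of 1/A(z) at radian frequency omega.

    a holds a_1..a_p of A(z) = 1 + a_1 z^-1 + ... + a_p z^-p
    (the leading 1 is implied).
    """
    A = 1 + sum(ak * cmath.exp(-1j * omega * (k + 1)) for k, ak in enumerate(a))
    return -20.0 * math.log10(abs(A))


def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS log-spectral difference in dB over (0, pi), midpoint-sampled."""
    total = 0.0
    for i in range(n_points):
        w = math.pi * (i + 0.5) / n_points
        d = lpc_log_spectrum_db(a_ref, w) - lpc_log_spectrum_db(a_quant, w)
        total += d * d
    return math.sqrt(total / n_points)
```

The measure is zero for identical coefficient sets and symmetric in its two arguments.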
ISBN: 0780331923 (print)
Extracting a good feature set is important to pattern recognition. A new formulation that integrates feature extraction into model training is proposed. The intraframe weighting, interframe weighting and feature reduction schemes can all be obtained from this formulation. According to the dependence of the class model parameters, three types of feature extraction are derived. Experiments on a speaker recognition application are given to show the effectiveness of the newly proposed feature extraction method.
User-friendly human interfaces have received great attention recently. Our goal is to realize a natural human-machine communication environment by giving a face to the computer terminal or communication system. To construct such an interface, a realistic synthesised image must be generated in real time. In this paper, we develop a real-time media conversion system and examine the communication between the user and a virtual agent with a human face on the computer monitor.
Vector quantization (VQ) is usually adopted to quantize parameters of multiple dimensions. It uses certain meaningful vector measures for codevector selection and codebook training to achieve bit reduction, but at the expense of higher computation and memory requirements. Because of these costs, in practical applications scalar quantization (SQ) is still applied to quantize each of the multidimensional parameters individually. In this paper, we propose to apply a vector measure method to the table lookup in scalar quantization so as to improve its coding efficiency without increasing the storage requirement. Two LSF scalar quantizers are investigated in which the weighted Euclidean distance is used as a vector measure to improve the performance of both quantizers.
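As a small illustration of the vector measure itself, the sketch below computes a weighted squared Euclidean distance and uses it to choose among candidate reconstructions. How the paper generates candidates from the scalar tables and how it derives the weights is not stated in the abstract, so the candidate list and weight values here are purely hypothetical inputs.

```python
def weighted_sq_dist(x, y, w):
    """Weighted squared Euclidean distance: sum of w_i * (x_i - y_i)^2."""
    return sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w))


def select_candidate(target, candidates, weights):
    """Index of the candidate vector with the smallest weighted distance.

    candidates and weights are assumed to be supplied by the caller;
    this sketch does not build them from scalar quantizer tables.
    """
    return min(
        range(len(candidates)),
        key=lambda i: weighted_sq_dist(target, candidates[i], weights),
    )
```

With uniform weights the choice reduces to the ordinary nearest neighbour; non-uniform weights can change the selected candidate by penalizing errors in some components more heavily than in others.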
The performance of the recently adopted ITU-T G.729 8 kbps codec standard is evaluated for digital cellular channels characterized by Rayleigh fading. Two forward error correction (FEC) schemes are studied: a convolutional code based FEC and a Nordstrom-Robinson code based FEC. The effects of interleaving depth are investigated for both FEC schemes. Both flat fading and co-channel interference limited models are used in our study.
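To show what interleaving depth controls, here is a minimal sketch of a rectangular block interleaver. The abstract does not say which interleaver structure was used, so this is only one common choice; here "depth" is assumed to mean the number of rows, and the function names are hypothetical.

```python
def interleave(symbols, depth):
    """Write symbols row by row into `depth` rows, read column by column."""
    if len(symbols) % depth:
        raise ValueError("length must be a multiple of depth")
    width = len(symbols) // depth
    return [symbols[r * width + c] for c in range(width) for r in range(depth)]


def deinterleave(symbols, depth):
    """Inverse of interleave for the same depth."""
    if len(symbols) % depth:
        raise ValueError("length must be a multiple of depth")
    width = len(symbols) // depth
    out = [None] * len(symbols)
    i = 0
    for c in range(width):
        for r in range(depth):
            out[r * width + c] = symbols[i]
            i += 1
    return out
```

In this arrangement, a burst of up to `depth` consecutive channel errors lands on symbols that are `width` positions apart after deinterleaving, which is what lets a code designed for scattered errors cope with fading bursts.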