This paper discusses problems of adaptive transform coding schemes at bit rates of 12 kbit/s and below. Objective and subjective performance reductions, such as low-pass filtering effects, which are one of the main sources of perceptual distortion, are investigated, and proposals are made for improving the performance of the coder at low and medium bit rates. Additionally, the required transmission of side information reduces the efficiency of the scheme. Various methods to lower the rate of this supplementary data signal are given, as well as modifications of the scheme which lead to a more easily implemented coder structure.
The problems in using a single-ANN (artificial neural network) predictive system in speech coding are considered and a system consisting of several ANNs, a multi-ANN system, optimised for operation over various bands in the dynamic speech range is proposed. This system shows considerable improvement over the single-ANN system.
Third generation mobile systems are currently receiving a lot of attention in many parts of the world. They are intended to provide a wide range of services to a large number of users on a universal basis. As far as q...
The major obstacle which has limited the use of Vector Quantization (VQ) for real-time speech coding is the computationally demanding codebook-search algorithm. The essential task of this algorithm, pattern matching, has several properties which make it amenable to VLSI realization using a highly concurrent processor architecture. A VLSI pattern-matching chip provides the essential building block for a special-purpose codebook-search processor (CSP). The CSP can serve as a generic architecture for a variety of VQ-based speech coding applications. This paper reports on a working VQ processor for speech coding based on a first-generation VLSI chip that efficiently performs the essential pattern-matching operation needed for the codebook-search process. Furthermore, the CSP architecture, using this chip, has been successfully incorporated into a compact single-board Vector PCM implementation which operates at rates between 7 and 18 kbit/s. A real-time Adaptive Vector Predictive Coder system using the CSP and augmented by a TMS-32010 programmable signal processor has been designed and recently implemented. We describe the structure of these two VQ coders and present experimental results obtained using the single-board Vector PCM coder.
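The pattern-matching task named in the abstract above can be illustrated in software as an exhaustive nearest-neighbour search. The following is a minimal sketch, assuming a squared-error distortion measure and a tiny hypothetical codebook; the function name `nearest_codeword` and the example data are illustrative only and do not describe the chip or the CSP.

```python
def nearest_codeword(x, codebook):
    """Return (index, distortion) of the codeword closest to x.

    Assumption: squared Euclidean distance is the distortion measure.
    """
    best_index, best_dist = -1, float("inf")
    for i, c in enumerate(codebook):
        # Distortion between the input vector and candidate codeword i.
        d = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist


# Hypothetical 2-dimensional codebook with four codewords.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]
index, dist = nearest_codeword([0.9, 1.2], codebook)
```

Each distortion computation is independent of the others, which is what makes this kind of search a natural candidate for a concurrent hardware architecture.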
The generalization of gain adaptation to vector quantization (VQ) is explored in this paper and a comprehensive examination of alternative techniques is presented. We introduce a class of adaptive vector quantizers that can dynamically adjust the "gain" or amplitude scale of code vectors according to the input signal level. The encoder uses a gain estimator to determine a suitable normalization of each input vector prior to VQ encoding. The normalized vectors have reduced dynamic range and can then be more efficiently coded. At the receiver, the VQ decoder output is multiplied by the estimated gain. Both forward and backward adaptation are considered and several different gain estimators are compared and evaluated. Gain-adaptive VQ can be used alone for "vector PCM" coding (i.e., direct waveform VQ) or as a building block in other vector coding schemes. The design algorithm for generating the appropriate gain-normalized VQ codebook is introduced. When applied to speech coding, gain-adaptive VQ achieves significant performance improvement over fixed VQ with a negligible increase in complexity.
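The encode and decode steps outlined in the abstract above (estimate a gain, normalize, quantize, rescale at the receiver) can be sketched as follows. This is a hedged illustration of forward adaptation only: the RMS gain estimator, the unquantized gain sent as side information, and the names `rms_gain`, `encode`, `decode`, and `shape_codebook` are all assumptions, not the estimators or codebook design from the paper.

```python
import math


def rms_gain(x, floor=1e-8):
    """Assumed gain estimator: RMS of the input vector, floored to avoid division by zero."""
    return max(math.sqrt(sum(v * v for v in x) / len(x)), floor)


def encode(x, codebook):
    """Normalize x by its estimated gain, then pick the nearest gain-normalized codeword."""
    g = rms_gain(x)
    normalized = [v / g for v in x]
    index = min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(normalized, codebook[i])),
    )
    # In this forward-adaptive sketch the gain is passed along unquantized.
    return index, g


def decode(index, g, codebook):
    """Receiver side: scale the selected codeword by the gain."""
    return [g * c for c in codebook[index]]


# Hypothetical gain-normalized codebook (each codeword has unit RMS).
shape_codebook = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
index, gain = encode([10.0, -10.0], shape_codebook)
reconstruction = decode(index, gain, shape_codebook)
```

In a backward-adaptive variant the gain would instead be estimated from previously decoded output, so no gain information would need to be transmitted.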
In this paper, we propose a novel multicomponent amplitude and frequency modulated (AFM) signal model for parametric representation of speech phonemes. An efficient technique is developed for parameter estimation of the proposed model. The Fourier-Bessel series expansion is used to separate a multicomponent speech signal into a set of individual components. The discrete energy separation algorithm is used to extract the amplitude envelope (AE) and the instantaneous frequency (IF) of each component of the speech signal. Then, the parameter estimation of the proposed AFM signal model is carried out by analysing the AE and IF parts of the signal component. The developed model is found to be suitable for representation of an entire speech phoneme (voiced or unvoiced) irrespective of its time duration, and the model is shown to be applicable for low bit-rate speech coding. The symmetric Itakura-Saito and the root-mean-square log-spectral distance measures are used for comparison of the original and reconstructed speech signals.
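The abstract above does not say which variant of the discrete energy separation algorithm is used. As an illustration of how amplitude envelope and instantaneous frequency can be recovered from a single component, the sketch below assumes the DESA-2 variant built on the Teager-Kaiser energy operator; the function names and the test signal are hypothetical.

```python
import math


def teager(s, n):
    """Teager-Kaiser energy operator: s[n]^2 - s[n-1] * s[n+1]."""
    return s[n] * s[n] - s[n - 1] * s[n + 1]


def desa2(x, n):
    """Estimate (amplitude, frequency in rad/sample) at sample n using DESA-2.

    Requires 2 <= n <= len(x) - 3 because the symmetric difference
    y(k) = x(k+1) - x(k-1) is needed at k = n-1, n, n+1.
    """
    def y(k):
        return x[k + 1] - x[k - 1]

    psi_x = teager(x, n)
    psi_y = y(n) ** 2 - y(n - 1) * y(n + 1)
    # Clamp to the valid arccos domain to guard against rounding error.
    arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
    freq = 0.5 * math.acos(arg)
    amp = 2.0 * psi_x / math.sqrt(psi_y)
    return amp, freq


# Hypothetical single-component test signal: amplitude 2, frequency 0.3 rad/sample.
x = [2.0 * math.cos(0.3 * k + 0.1) for k in range(20)]
amp, freq = desa2(x, 10)
```

For a constant-amplitude, constant-frequency cosine these estimates are exact up to rounding; for real speech components they are approximations that assume the envelope and frequency vary slowly.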
A two-channel conjugate vector quantizer is proposed in an attempt to reduce quantization distortion for noisy channels. In this quantization, two different codebooks are used. The encoder selects the channel code pair that generates the smallest distortion between the input and the averaged output vectors. These two codebooks are alternately trained by an iterative algorithm which is based on the generalized Lloyd algorithm. Coding experiments show that the proposed scheme has almost the same SNR as a conventional vector quantizer for an error-free channel. On the other hand, it has a significantly higher SNR than the conventional one for a 1% error rate. This scheme also has merits in computational complexity and storage requirements. The scheme is confirmed to be effective for a medium bit-rate speech waveform coder.
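The encoder rule stated in the abstract above, choosing the code pair whose averaged output is closest to the input, can be sketched as an exhaustive pair search. This is only an illustration under stated assumptions: squared-error distortion, a brute-force search over all pairs, and hypothetical codebooks. It does not reproduce any reduced-complexity search or the training procedure from the paper.

```python
def conjugate_search(x, codebook_a, codebook_b):
    """Return (i, j, distortion) minimizing the error between x and (a_i + b_j) / 2.

    Assumption: squared Euclidean distance and a full search over all pairs.
    """
    best = (None, None, float("inf"))
    for i, a in enumerate(codebook_a):
        for j, b in enumerate(codebook_b):
            # The decoder output is the average of the two selected codewords.
            d = sum((xi - 0.5 * (ai + bi)) ** 2 for xi, ai, bi in zip(x, a, b))
            if d < best[2]:
                best = (i, j, d)
    return best


# Hypothetical codebooks for the two channels.
codebook_a = [[0.0, 0.0], [2.0, 2.0]]
codebook_b = [[0.0, 2.0], [2.0, 0.0]]
i, j, d = conjugate_search([1.0, 2.0], codebook_a, codebook_b)
```

Because the output is an average of two codewords, an error in one channel's index corrupts only half of the reconstruction, which gives some intuition for the robustness to channel errors reported in the abstract.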
Low-rate speech coding technology has recently made significant progress with the introduction of new interpolative algorithms. The inherent complexity of these algorithms is, however, too high to be commercially useful for low-cost applications. In this paper we propose new approaches to low-complexity speech coding at coding rates of 1.2 and 2.4 kbps. The proposed methods utilize all the advantages of interpolative coding but greatly simplify the analysis and synthesis operations to a point where low-cost two-way digital speech communication can be easily implemented on DSP or host platforms. At 2.4 kbps, the complexity of the proposed coder is about 7.5 and 2.5 MFLOPS for the encoder and decoder, respectively. At 1.2 kbps, the complexity is about 6 and 2.3 MFLOPS for the encoder and decoder, respectively. The small computational load of these coders makes them suitable for multi-tasking environments and low-cost terminals. Informal subjective evaluation shows that, at 2.4 kbps, good communication quality is obtained. Communication quality is less than toll quality, but the perceived coding effects are not annoying and do not prevent long, sustained two-way conversation with a high degree of intelligibility. The quality does not significantly degrade at 1.2 kbps, and it is considered sufficient for messaging applications.
This paper reviews our recent efforts in the design and implementation of real-time speech coders. We discuss our approach and methodology for real-time hardware for coder techniques ranging from low to high complexity. Examples of realizations are given for each approach. They include adaptive differential PCM coding, subband coding, harmonic scaling with subband coding, and adaptive transform coding. Low to medium complexity techniques are based on the use of the Bell Laboratories digital signal processing (DSP) integrated circuit. High complexity block processing techniques are based on the use of an array processing computer. We conclude with an assessment of the performance versus complexity tradeoffs involved in these coding methods.