Here, we propose speech-coding procedures that achieve high subjective quality while avoiding speech-specific processing and interframe exploitation. The scheme is therefore tractable for packet-based voice communication and is capable of coding generic audio. The architecture is based on a modified discrete cosine transform (MDCT) representation of the signal, and combines efficient vector quantization (VQ) techniques with psychoacoustic principles. Weighted quantization of MDCT coefficients is performed, using a codebook based on a statistical model of the multidimensional NEXT pdf. The weighting and the codebook are adapted for each frame to account for masking thresholds given by a psychoacoustic analysis. The actual quantization is performed using lattices, thereby achieving close to rate-independent complexity. The result is a coding scheme that is operational at a range of rates. Here, a particular instance at 16 kbit/s, using a sampling frequency of 8 kHz, is shown to perform better than an LD-CELP coder operating at the same rate, even though no interframe memory is exploited.
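As a rough illustration of why lattice quantization keeps complexity close to rate independent, the sketch below finds the nearest lattice point by rounding, so the cost depends on the vector dimension rather than on the number of codewords. The abstract does not say which lattice is used; the D_n lattice (integer vectors with an even coordinate sum), the function names, and the single scalar `step` standing in for the per-frame weighting are all assumptions made for this example.

```python
import math


def nearest_zn(x):
    """Nearest point of the integer lattice Z^n (round each coordinate)."""
    return [math.floor(v + 0.5) for v in x]


def nearest_dn(x):
    """Nearest point of the D_n lattice (integer vectors with even sum).

    Round every coordinate; if the sum is odd, re-round the coordinate
    with the largest rounding error in the other direction.
    """
    f = nearest_zn(x)
    if sum(f) % 2 == 0:
        return f
    # Coordinate whose rounding was the least certain.
    k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
    f[k] += 1 if x[k] > f[k] else -1
    return f


def quantize(coeffs, step):
    """Scale by a step size (hypothetical stand-in for the perceptual
    weighting), quantize to D_n, and return indices and reconstruction."""
    scaled = [c / step for c in coeffs]
    idx = nearest_dn(scaled)
    return idx, [i * step for i in idx]


if __name__ == "__main__":
    indices, recon = quantize([1.2, 2.4, -0.8], step=2.0)
    print(indices, recon)
```

A smaller `step` gives a finer lattice and a higher rate, but the search itself performs the same number of operations, which is the property the abstract refers to.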
Robust low bit rate speech coders are essential in commercial and military communication systems. They operate at fixed bit rates, and those bit rates cannot be altered without major modifications to the vocoder design. In this paper we introduce a scaled speech coder, which operates on time-scale modified input speech. The proposed method offers any bit rate from 2400 b/s downwards without modifying the principal vocoder structure, which is the mixed excitation linear prediction (MELP) vocoder. We consider the application of transmitting MELP-encoded speech over noisy communication channels after time-scale compression is applied. Computer simulation results, for both source and channel coding, are presented in terms of objective speech quality metrics and informal subjective listening tests. A statistical tool called the bootstrap is also used to determine the accuracy of these test results. Design parameters such as codec complexity and delay are also investigated. (C) 2003 Elsevier Inc. All rights reserved.
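The rate arithmetic behind a time-scaled coder can be sketched as follows. This assumes that the vocoder runs unchanged at its base rate on the compressed signal and that the only overhead is the scale factor itself, which is ignored here; the function names and the definition of `scale` (compressed duration divided by original duration) are choices made for this example, not taken from the paper.

```python
BASE_RATE_BPS = 2400.0  # fixed MELP rate stated in the abstract


def effective_rate(scale, base_rate=BASE_RATE_BPS):
    """Bits per second of ORIGINAL speech when the input is time-compressed.

    scale = compressed duration / original duration, with 0 < scale <= 1.
    One second of original speech becomes `scale` seconds of vocoder input,
    which costs base_rate * scale bits.
    """
    if not 0.0 < scale <= 1.0:
        raise ValueError("scale must be in (0, 1]")
    return base_rate * scale


def required_scale(target_rate, base_rate=BASE_RATE_BPS):
    """Time-scale factor needed to reach a target rate at or below base_rate."""
    if not 0.0 < target_rate <= base_rate:
        raise ValueError("target rate must be in (0, base_rate]")
    return target_rate / base_rate


if __name__ == "__main__":
    print(effective_rate(0.5))     # half the duration, half the rate
    print(required_scale(1800.0))  # scale needed for 1800 b/s
```

Under these assumptions, any rate at or below 2400 b/s maps to a scale factor in (0, 1], which is consistent with the claim that the vocoder structure itself does not need to change.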
ISBN: 0819422134 (print)
In this paper, we describe the theory and implementation of a variable rate speech coder using the cubic spline wavelet decomposition. In the discrete-time wavelet extrema representation, Cvetkovic et al. implement an iterative projection algorithm to reconstruct the wavelet decomposition from the extrema representation. Based on this model, we previously described a technique for speech coding using the extrema representation, which suggests that the non-decimated extrema representation allows us to exploit the pitch redundancy in speech. A drawback of that scheme is the audible perceptual distortion caused by the iterative algorithm, which fails to converge on some speech frames. This paper attempts to alleviate the problem by showing that, for a particular class of wavelets that implements the ladder of spaces consisting of the splines, the iterative algorithm can be replaced by an interpolation procedure. Conditions under which the interpolation reconstructs the transform exactly are identified. One of the advantages of the extrema representation is its 'denoising' effect. A least squares technique to reconstruct the signal is constructed. The effectiveness of the scheme in reproducing significant details of the speech signal is illustrated using an example.
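To give a feel for what an extrema representation keeps and how interpolation can fill in the rest, the sketch below extracts the local extrema of a sequence and rebuilds the samples between them with a piecewise-cubic Hermite curve that has zero slope at each extremum. This is a simplified stand-in, not the paper's procedure: the paper works on the cubic spline wavelet transform at each scale and derives its own interpolation, whereas this example operates directly on a toy sequence, and both function names are hypothetical.

```python
def local_extrema(x):
    """Indices of the two endpoints and of every strict local max/min."""
    if len(x) < 2:
        raise ValueError("need at least two samples")
    idx = [0]
    for n in range(1, len(x) - 1):
        # The slope changes sign across sample n.
        if (x[n] - x[n - 1]) * (x[n + 1] - x[n]) < 0:
            idx.append(n)
    idx.append(len(x) - 1)
    return idx


def interpolate_from_extrema(idx, vals, length):
    """Rebuild a sequence from (index, value) pairs.

    Between consecutive kept points, a cubic Hermite segment with zero
    slope at both ends is used: v0 + (v1 - v0) * (3t^2 - 2t^3).
    """
    out = [0.0] * length
    for n0, n1, v0, v1 in zip(idx, idx[1:], vals, vals[1:]):
        span = n1 - n0
        for n in range(n0, n1 + 1):
            t = (n - n0) / span
            h = 3 * t * t - 2 * t * t * t
            out[n] = v0 + (v1 - v0) * h
    return out


if __name__ == "__main__":
    signal = [0, 1, 4, 1, 0, -2, 0]
    kept = local_extrema(signal)
    rebuilt = interpolate_from_extrema(kept, [signal[i] for i in kept], len(signal))
    print(kept)
    print(rebuilt)
```

The rebuilt sequence matches the original exactly at the kept points and only approximately elsewhere, which is why the conditions for exact reconstruction identified in the paper matter.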