Code-excited linear prediction coding with generalized pitch prediction (GPP-CELP) requires linear prediction filtering of the stochastic codebook output prior to addition of the adaptive codebook (ACE) component. The ACE component represents a sequence of past reconstructed samples passed through a low-pass filter to reflect the reduced pitch periodicity of the higher speech frequencies. The spectrum of the residual exhibits broad peaks, leading to significantly narrower distributions in the LPC parameter space. Additionally, the quantization error of the residual may be masked by the significantly greater energy of the ACE component. This work compares the quantization requirements for the information required to represent the time-varying LPC filter of the GPP-CELP coder with those of the classical CELP coder. With non-predictive coding of the LPC information, a bit-rate reduction from 20 bits/20 ms to 16 bits/20 ms appears feasible without introducing noticeable degradation due to quantization.
In this work, a robust recursive procedure for identification of a nonstationary AR speech model, based on a frame-based quadratic classifier with a heuristic decision threshold, is considered. A novel heuristic decision threshold is proposed, and its efficiency is compared with that of previously defined heuristic thresholds by analyzing natural speech signals with voiced and mixed excitation segments. The results show that the robust procedure with the proposed heuristic decision threshold achieves more accurate AR speech parameter estimation and provides improved tracking performance.
This paper presents an algebraic vector quantized codebook excited linear prediction (AVQ-CELP) speech codec. The objective is to enhance the half-rate mode of IS-127, the enhanced variable rate codec (EVRC). In the AVQ-CELP scheme, only the perceptually important components are encoded, and the components are selected in a way similar to ACELP. An open-loop procedure is used to select the sub-vectors. The selected sub-vectors are concatenated and vector quantized. An analysis-by-synthesis strategy is used to determine the optimal excitation. The generalized Lloyd algorithm (GLA) is used to optimize the AVQ codebook. In order to improve the synthesis quality of voiced frames, a two-pulse version of ACELP is used in strongly voiced frames. The proposed algorithm was incorporated in the Nokia CDMA handset prototype. In a joint collaboration with SK Telecom, a field test was performed in Korea to evaluate the performance of the proposed AVQ algorithm. The results indicate a considerable improvement relative to the standard EVRC operating at the maximum half-rate.
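As a rough illustration of the generalized Lloyd algorithm mentioned in this abstract, the following minimal Python sketch alternates a nearest-neighbour partition of the training vectors with a centroid update. It is not the authors' procedure: the plain squared-error distortion, the random initialization, and the function names (`generalized_lloyd`, `nearest`, `squared_distance`) are assumptions made for the example, whereas the paper's codebook would presumably be trained on the selected excitation sub-vectors with a perceptually motivated criterion.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest(codebook, vec):
    """Index of the codevector closest to vec (ties go to the lowest index)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], vec))


def generalized_lloyd(training, codebook_size, iterations=20, seed=0):
    """Train a VQ codebook by alternating partition and centroid steps.

    Assumption: squared-error distortion and random initial codevectors
    drawn from the training set (illustrative choices only).
    """
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(training, codebook_size)]
    for _ in range(iterations):
        # Partition step: assign every training vector to its nearest codevector.
        cells = [[] for _ in codebook]
        for vec in training:
            cells[nearest(codebook, vec)].append(vec)
        # Centroid step: replace each codevector by the mean of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codevector
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell)
                               for d in range(dim)]
    return codebook


# Toy training set with two well-separated clusters.
training = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
codebook = generalized_lloyd(training, codebook_size=2)
```

On this toy set the two codevectors settle on the cluster means; with real speech data the result depends on the initialization and on the distortion measure chosen.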
In spectral coding of speech, several different criteria are in use for designing and evaluating quantizers. One measure, spectral distortion (SD), has become dominant for comparisons between coders. At run-time, a coder normally quantizes vectors according to other measures, e.g. line spectrum frequency (LSF) distance, in order to keep computational complexity down. In this study, we adopt the SD criterion both in coder design and for quantizer operation. The quantizer is optimized to give minimal average SD scores. This allows us to address the question: is the average SD measure really a good criterion, one that matches subjective ratings? We perform a few objective and subjective tests based on SD-optimized coding and some variants thereof. Our tests imply that minimizing average SD may not lead to the best subjective scores.
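For readers unfamiliar with the SD measure, the sketch below computes the commonly used root-mean-square log-spectral distortion (in dB) between two all-pole LPC filters. It is only an illustration under stated assumptions: the filter is written as 1/A(z) with A(z) = 1 + a1 z^-1 + ... + ap z^-p, the gain is ignored, the spectrum is sampled uniformly over [0, pi), and the function names are hypothetical. The abstract does not specify the exact SD variant or frequency range used by the authors.

```python
import cmath
import math


def lpc_log_spectrum_db(a, n_points=256):
    """Log power spectrum (dB) of 1/A(z), A(z) = 1 + sum_k a[k-1] z^-k.

    Assumption: unit gain and uniform sampling of [0, pi).
    """
    spectrum = []
    for i in range(n_points):
        w = math.pi * i / n_points
        A = 1 + sum(ak * cmath.exp(-1j * w * (k + 1))
                    for k, ak in enumerate(a))
        # 10*log10(1/|A|^2) = -20*log10(|A|)
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum


def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS difference (dB) between the log spectra of two LPC filters."""
    s_ref = lpc_log_spectrum_db(a_ref, n_points)
    s_quant = lpc_log_spectrum_db(a_quant, n_points)
    mean_sq = sum((x - y) ** 2 for x, y in zip(s_ref, s_quant)) / n_points
    return math.sqrt(mean_sq)


# Hypothetical second-order filters: an "original" and a slightly perturbed copy.
a_original = [-0.9, 0.2]
a_perturbed = [-0.85, 0.2]
sd = spectral_distortion_db(a_original, a_perturbed)
```

Averaging this per-frame value over many frames gives the "average SD" figure that the abstract questions as a predictor of subjective quality.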
This paper presents a method for obtaining numerical estimates of high-rate vector quantization (VQ) performance, suitable for sources for which the PDF is not analytically available. In the proposed method, the VQ point density is described by a Gaussian mixture model optimized for the data. Employing this method for LPC spectrum quantization, we obtain high-rate expressions for both the average spectral distortion (SD) and the distribution function of the SD. We estimate the minimum number of bits required for a quantizer to obtain an average SD of 1 dB, as well as the outlier statistics for that quantizer. We find that approximately 3 bits can be saved compared to a 2-split LSF-based vector quantizer.
Using a recently proposed method based on charge pumping measurements which allows the extraction of the Si-SiO₂ interface trap depth concentration profiles, the trap parameters are studied as a function of Fowler-Nordheim injection. As the stress proceeds, the interface trap layer extends deeper in the direction of the oxide depth, the trap density in the oxide seems to increase faster than that at the interface, and the trap capture cross-sections strongly increase. This induces deeper penetration of the carriers into the oxide depth and a larger contribution of the so-called slow traps to the device electrical properties. This can be viewed as an extension of the Si-SiO₂ interface in the direction of the oxide depth.
This paper presents a technique using artificial neural networks (ANNs) for speaker identification that results in a better success rate compared to other techniques. The technique uses both power spectral densities (PSDs) and linear prediction coefficients (LPCs) as feature inputs to a self-organizing feature map to achieve better identification performance. Results for speaker identification with different methods are presented and compared.
In this paper, we present a novel background noise coding scheme for variable rate speech coders. Existing approaches to noise coding at very low bit rates (i.e. below 1 kbps) fail to faithfully reproduce background noise, resulting in a degradation of the overall perceptual quality. In our approach, classification of the noise type is used to select the type of excitation to be used at the receiver. To illustrate the benefits of our scheme, we have modified the noise coding mode of the CDMA enhanced variable rate codec (EVRC) to include the proposed class-dependent noise excitation model. Evaluation tests have shown that the proposed noise coding scheme improves the overall quality without an increase in bit rate.
The United States government has developed a new Federal Standard 2400 bps vocoding algorithm called MELP (mixed excitation linear prediction). This vocoder has very acceptable voice quality under benign, error-free channel conditions. However, when it is subjected to high error conditions, as could be experienced in tactical vehicular operations, amelioration techniques may be employed which take advantage of the underlying inter-frame residual redundancy of the MELP parameters themselves. This paper describes experiments conducted on the MELP vocoding algorithm in conjunction with Viterbi convolutional error decoding, enhanced with maximum a posteriori techniques which utilize these redundancy statistics. Both hard and soft Viterbi decoding implementations are investigated, in addition to turbo codes.
This paper deals with multi-stage vector quantization of line spectrum pair (LSP) parameters in wideband speech coders and discusses commonly used spectral distortion measures and their relation to the perceptual quality of the coded speech.
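The basic idea of multi-stage vector quantization can be sketched as follows: each stage quantizes the residual left by the previous stages, and the decoder sums the selected codevectors. This is a minimal Python illustration with a greedy sequential search and an unweighted squared-error measure. The function names and the toy codebooks are assumptions for the example and are not taken from the paper, which may use a different search strategy or distortion weighting.

```python
def msvq_encode(vec, stages):
    """Greedy sequential multi-stage VQ encoder.

    `stages` is a list of codebooks, each a list of codevectors.
    Returns one codevector index per stage.
    """
    residual = list(vec)
    indices = []
    for codebook in stages:
        # Pick the codevector closest to the current residual.
        best = min(
            range(len(codebook)),
            key=lambda i: sum((r - c) ** 2
                              for r, c in zip(residual, codebook[i])),
        )
        indices.append(best)
        # Pass the remaining error on to the next stage.
        residual = [r - c for r, c in zip(residual, codebook[best])]
    return indices


def msvq_decode(indices, stages):
    """Reconstruct a vector by summing the selected codevector of each stage."""
    out = [0.0] * len(stages[0][0])
    for idx, codebook in zip(indices, stages):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy two-stage quantizer for 2-dimensional vectors (illustrative values only).
stages = [
    [[0.0, 0.0], [1.0, 1.0]],                  # coarse first stage
    [[-0.25, 0.0], [0.25, 0.0], [0.0, 0.25]],  # refinement stage
]
indices = msvq_encode([1.2, 1.0], stages)
reconstruction = msvq_decode(indices, stages)
```

Splitting the quantizer into stages keeps the storage and search cost proportional to the sum of the stage codebook sizes, not their product, at the price of some loss relative to an unconstrained codebook of the same total rate.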