In this correspondence, a new method of analysis of speech is proposed that will bring out variations in vocal tract system characteristics in short (2-4 ms) segments. In this method, the source and system components of the speech signal are suitably windowed to reduce the effects of truncation of conventional waveform windowing.
A linear predictive coding (LPC) analysis scheme which is applicable to speech coding is proposed. The analysis method, called interpolative LPC (ILPC) analysis, estimates the spectral envelope by incorporating the interpolation characteristics into the LPC analysis. The ILPC analysis reduces average spectral distortion and the percentage of outlier frames, compared with the conventional LPC analysis followed by linear interpolation.
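The two figures of merit named in this abstract, average spectral distortion and percentage of outlier frames, can be illustrated with a small sketch. This is not the ILPC method itself. It only shows one common way to compute a log-spectral distortion between two LPC envelopes. The function names, the 128-point frequency grid, and the 2 dB outlier threshold are assumptions for illustration; the abstract does not state which threshold was used.

```python
import cmath
import math

def lpc_log_spectrum(a, n_freqs=128):
    """Log power spectrum (dB) of the all-pole filter 1/A(z), where
    A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p, sampled on (0, pi)."""
    spectrum = []
    for k in range(n_freqs):
        w = math.pi * (k + 0.5) / n_freqs
        A = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(a))
        # 10*log10(1/|A|^2) = -20*log10|A|
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum

def spectral_distortion(a_ref, a_test, n_freqs=128):
    """RMS difference in dB between two LPC log spectra."""
    s_ref = lpc_log_spectrum(a_ref, n_freqs)
    s_test = lpc_log_spectrum(a_test, n_freqs)
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(s_ref, s_test)) / n_freqs)

def summarize(distortions, outlier_db=2.0):
    """Average distortion and percentage of frames above an assumed threshold."""
    avg = sum(distortions) / len(distortions)
    pct = 100.0 * sum(d > outlier_db for d in distortions) / len(distortions)
    return avg, pct
```

Under these assumptions, `spectral_distortion` would be evaluated per frame between the reference envelope and the interpolated one, and `summarize` would then report the two statistics over all frames.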
The reliable communication of FS 1016 CELP encoded speech over very noisy channels is investigated. Using second-order Markov chains, it is shown that over one-quarter of the CELP bits in every frame of speech are redundant. An unequal error protection coding scheme, which exploits this residual redundancy, is proposed for sending the CELP parameters over Gaussian and Rayleigh fading channels. Simulations indicate substantial coding gains over conventional systems.
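The idea of measuring residual redundancy with a second-order Markov model can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes `symbols` is the sequence of quantizer indices for one coder parameter across successive frames, and it estimates redundancy as the allotted bits minus the empirical conditional entropy given the two previous values. Both function names are hypothetical.

```python
import math
from collections import Counter

def second_order_entropy(symbols):
    """Empirical conditional entropy H(X_n | X_{n-1}, X_{n-2}) in bits."""
    triples = Counter(zip(symbols, symbols[1:], symbols[2:]))
    # Count each two-symbol context only where a successor was observed.
    contexts = Counter()
    for (a, b, _), n in triples.items():
        contexts[(a, b)] += n
    total = sum(triples.values())
    h = 0.0
    for (a, b, _), n in triples.items():
        h -= (n / total) * math.log2(n / contexts[(a, b)])
    return h

def redundancy_bits(symbols, bits_per_symbol):
    """Bits per symbol that the second-order model can predict (assumed measure)."""
    return bits_per_symbol - second_order_entropy(symbols)
```

A perfectly predictable index sequence gives a redundancy equal to the full bit allocation, while a sequence whose next value is independent of its history gives a redundancy near zero. Reliable estimates need far more frames than there are distinct contexts.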
The authors describe an improved pitch detection algorithm for efficient multiband excitation (MBE) coding of speech. The improved algorithm adds a corrective measure to the error measure for spectrum matching employed in conventional MBE pitch analysis. This corrective measure effectively applies equal weighting to all harmonic bands by normalising the error energy in each band. The result is a reduction of gross pitch errors owing to pitch doublings. An additional advantage of using this corrective measure is that, because it is based on a sum-of-products formula, matching scores may be compared using partial sums computed during the evaluation of the measure, which facilitates fast searching for the optimum pitch period. Simulation results show that, with the improved algorithm, a pitch tracking procedure is no longer needed and the coding delay of the MBE coder is significantly reduced.
Previous studies of nonlinear prediction of speech have been mostly focused on short-term prediction. This paper presents long-term nonlinear prediction based on second-order Volterra filters. It will be shown that the presented predictor can outperform conventional linear prediction techniques in terms of prediction gain and "whiter" residuals.
This correspondence deals with spectral modeling in filter banks. It is shown, both theoretically and experimentally, that subspectral modeling is superior to full-spectrum modeling if performed before the rate change. The price paid for this performance improvement is an increase in computation. A few different signal sources were considered in this study. It is shown that the performance of AR and ARMA techniques is comparable in subspectral modeling; the former is preferred because of its simplicity. As an application of this study, we implemented a CELP-based speech codec embedded in a filter bank structure. We found no performance improvement of the subband CELP technique over the fullband case. The theoretical reasoning behind the experimental results is also given in this correspondence.
Konstantinides and Yao have considered the problem of rank determination by use of effective singular values. In this correspondence, we show how to use the minimum description length criterion of Rissanen to provide an alternative means of estimating the index of the smallest nonzero singular value of a matrix when given estimates of the singular values.
The mixed excitation linear prediction (MELP) algorithm has recently been selected as the new federal standard for 2.4 kbit/s coding of speech signals. The authors exploit the average residual inter-frame correlation and the error sensitivities of the bits in a MELP frame to enhance the robustness of the proposed joint MELP turbo coding schemes for operation over Rayleigh fading channels.
Numerous noise-suppression techniques have been developed for operating at the front end of low-bit-rate digital voice terminals. Some of these techniques have been evaluated by standardized intelligibility tests such as the diagnostic rhyme test (DRT). It is well known that the use of a noise suppressor seldom improves the DRT score even though listeners have had the impression that speech quality was enhanced. Unfortunately, noise suppressors have only occasionally been evaluated by standardized quality tests. The authors supplement quality test data for reference purposes. They use the diagnostic acceptability measure (DAM) to evaluate the speech quality of the latest 2400-b/s linear-predictive coder (LPC) with a noise suppressor at the front end. A spectral subtraction technique is used for noise suppression. Ten different sets of noisy speech recorded on actual military platforms (such as a helicopter, tank, turboprop, helicopter carrier, or jeep) served as input sources. The magnitude of the DAM improvement is substantial: as much as six points on average, which is large enough to upgrade speech quality somewhat.
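Spectral subtraction, the noise-suppression technique named in this abstract, can be sketched for a single frame as below. The abstract does not say which variant was used, so this is a generic magnitude-subtraction sketch under stated assumptions: `noise_mag` is a per-bin noise magnitude estimate (for example, averaged over speech-free frames), `floor` is a hypothetical spectral floor that prevents negative magnitudes, and a naive DFT stands in for an FFT to keep the example self-contained.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.02):
    """Subtract a noise magnitude estimate from one frame, keeping the noisy phase."""
    cleaned = []
    for Xk, Nk in zip(dft(frame), noise_mag):
        mag = abs(Xk)
        # Clamp to a fraction of the noisy magnitude instead of going negative.
        new_mag = max(mag - Nk, floor * mag)
        cleaned.append(cmath.rect(new_mag, cmath.phase(Xk)))
    return idft(cleaned)
```

In practice the frames would be windowed and overlap-added, and the noise estimate would be updated during pauses in speech. Those steps are omitted here.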
A new speech synthesis device has been developed. This speech synthesizer is based on a formant synthesizer and uses residual information. Generally, speech synthesized using residual information has high quality. In this speech synthesizer, the residual information is introduced as FIR (finite impulse response) filter coefficients. This implementation provides great flexibility in bit rate. Speech synthesis experiments demonstrated the flexibility and high performance of this speech synthesizer. It can be applied to high-quality text-to-speech synthesis systems and many other systems.