This paper proposes a very low bit-rate speech coder operating at 1.2 kbit/s. Like the LPC vocoder, it requires only gain, pitch, and spectral information, but its quality is far superior. The synthesis method is a form of harmonic coding: it uses sinusoids whose frequencies are multiples of the fundamental frequency, and the amplitudes of the sinusoids are adaptively modulated using gammatone filters as a perceptual weighting filter. The phases of the sinusoids are also adjusted to maximize the perceptual quality. To reduce the total bit rate to 1.2 kbit/s, a new segment coder for the spectral information (LSP coefficients) based on DP matching is also proposed. The quality of the synthesized speech improved by 0.45 in mean opinion score (MOS) over a simple LPC vocoder operating at the same rate, and it was comparable to that of a 2.4 kbit/s MELP coder.
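The harmonic synthesis step described in this abstract can be illustrated with a minimal sketch. It only sums sinusoids at integer multiples of a fundamental frequency; the gammatone-based amplitude modulation, the phase optimization, and the LSP segment coder are not modelled. The function name `synthesize_harmonics`, its parameters, and the example values are assumptions made for illustration, not taken from the paper.

```python
import math

def synthesize_harmonics(f0_hz, amplitudes, phases, sample_rate_hz, num_samples):
    """Sum sinusoids at integer multiples of f0_hz (hypothetical helper).

    amplitudes[k-1] and phases[k-1] belong to the k-th harmonic.
    Harmonics at or above the Nyquist frequency are skipped.
    """
    nyquist_hz = sample_rate_hz / 2.0
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        value = 0.0
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            if k * f0_hz >= nyquist_hz:
                break  # higher harmonics would alias
            value += amp * math.cos(2.0 * math.pi * k * f0_hz * t + phase)
        samples.append(value)
    return samples

# Assumed example: one 20 ms frame at 8 kHz with a 100 Hz fundamental.
frame = synthesize_harmonics(100.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0], 8000.0, 160)
```

In a coder of this kind the amplitudes would be derived from the decoded spectral envelope for each frame; here they are fixed constants so that the periodic structure is easy to check.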
The partial trigonometric moment problem is shown to provide a unifying framework for several speech modelling techniques, such as the classical LPC autoregressive model, the line spectral pairs and composite sinusoidal waves models, and the Toeplitz eigenvector model for formant extraction. From a mathematical viewpoint, this moment problem can be identified with an extension problem in the class of impedance functions or, equivalently, in the class of nonnegative definite Toeplitz matrices.
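As a concrete instance of the link between the LPC autoregressive model and Toeplitz matrices, the following sketch runs the standard Levinson-Durbin recursion on an autocorrelation sequence. It is a textbook routine, not the framework of the paper. The function name `levinson_durbin`, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the example data are assumptions. The recursion assumes a positive definite Toeplitz matrix; in that case every reflection coefficient has magnitude below 1.

```python
def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations for an AR model (hypothetical helper).

    r[0..order] are autocorrelation values.
    Returns (a, reflection, err), where A(z) = 1 + sum_i a[i-1] z^-i,
    reflection holds the reflection coefficients, and err is the final
    prediction error power. Assumes the Toeplitz matrix built from r is
    positive definite, so err never reaches zero.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflection = []
    for m in range(1, order + 1):
        acc = r[m]
        for i in range(1, m):
            acc += a[i] * r[m - i]
        k = -acc / err
        reflection.append(k)
        updated = a[:]
        for i in range(1, m):
            updated[i] = a[i] + k * a[m - i]
        updated[m] = k
        a = updated
        err *= 1.0 - k * k
    return a[1:], reflection, err

# Assumed example: autocorrelation of a first-order AR process with pole 0.5.
coeffs, reflection, err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this example the second-order fit collapses to the first-order model: the first coefficient is -0.5 and the second is zero.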
A new method for coding generic audio signals at 64 kbit/s in the 20-15000 Hz bandwidth with low delay is presented. It combines sub-band coding, the low-delay CELP (LD-CELP) algorithm, and cascaded filterbanks. We show how the different parameters of LD-CELP can be adapted to achieve better quality. Perceptual coding techniques are integrated into the encoder to allocate bits to each sub-band.
An algorithm for LPC (linear predictive coding) parameter optimization in multipulse (MP)-LPC based speech coders is presented. It is shown that, by taking the nature of the MP excitation signal into account in the LPC parameter computation, it is possible to improve the effectiveness of the LPC model. This results in a better quality of the reconstructed signal in terms of both objective and subjective criteria. The implementation details of the algorithm are discussed and experimental results are presented. In particular, a comparison with standard MP-LPC techniques is given.
This paper describes a method for warping the frequency axis of cepstrum coefficients in a way analogous to the preprocessing performed by the human ear. The equations are derived, and the historical background relating to different warping scales is discussed. The calculation is a two-step procedure. In the first step, the bilinear transform is used to represent the LPC coefficients on a warped frequency scale, with a warping constant that determines the degree of transformation. This results in an ARMA representation of the filter transfer function. The second step recursively determines the cepstrum coefficients corresponding to this ARMA transfer function.
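The recursive second step can be sketched with the standard recursion that converts a monic, minimum-phase polynomial into cepstrum coefficients. This is not the paper's derivation, and the bilinear warping step is not shown. The function names `allpole_cepstrum` and `arma_cepstrum`, the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the example values are assumptions. The ARMA case is handled here as the difference of two all-pole cepstra, which holds when both numerator and denominator are monic and minimum phase; the gain term c_0 is omitted.

```python
def allpole_cepstrum(a, num_coeffs):
    """Cepstrum c[1..num_coeffs] of 1/A(z) (hypothetical helper).

    a[k-1] is the coefficient of z^-k in A(z) = 1 + sum_k a[k-1] z^-k.
    Uses c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    with a_n taken as zero for n > len(a).
    """
    p = len(a)
    c = [0.0] * (num_coeffs + 1)  # c[0] (log gain) is not computed
    for n in range(1, num_coeffs + 1):
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k - 1]
        c[n] = -acc / n
        if n <= p:
            c[n] -= a[n - 1]
    return c[1:]

def arma_cepstrum(b, a, num_coeffs):
    """Cepstrum of B(z)/A(z) for monic, minimum-phase B and A (assumed)."""
    ca = allpole_cepstrum(a, num_coeffs)
    cb = allpole_cepstrum(b, num_coeffs)
    return [x - y for x, y in zip(ca, cb)]

# Assumed example: a single pole at 0.5 gives c_n = 0.5**n / n.
cep = allpole_cepstrum([-0.5], 4)
```

The single-pole example has a closed-form cepstrum, which makes the recursion easy to verify by hand.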
The temporal characteristics of the human inductive strength judgment process have not been previously investigated, although some preliminary spatial localization results have been reported. In the present study, Chinese verbal inductive reasoning tasks are used in ERP (event-related potential) experiments to explore the time course of inductive strength judgment. The experimental results confirm our expectation. Inductive strength judgment after the presentation of a conclusion mainly comprises three stages: visual encoding, semantic information integration, and strength evaluation. It can be tentatively concluded that visual encoding may be observed at the frontal P200 and the posterior N200, that the frontal LNC and the posterior LPC may reflect semantic information integration, and that the slow waves after about 650 ms may relate to the strength evaluation process.
This paper proposes a new low bit-rate speech coding algorithm based on a multi-pulse excitation technique. In the algorithm, the excitation signal in a frame, which includes several pitch periods, is represented by pulses during only one pitch period. The excitation signal for the other pitch periods in the frame is reproduced by interpolating the pulses. Several evaluation experiments show that the new coder produces natural-sounding speech at low bit rates. Subjective evaluation results show that the new coder at 4.8 kbit/s attains good speech quality, almost equivalent to that of 6 bit/sample µ-law PCM.
This paper describes an implementation of a MELP (mixed excitation linear prediction) vocoder. The subband division required for implementing the MELP vocoder was performed by the lifting wavelet transform. A new method to generate an appropriate glottal waveform was devised. In addition, three kinds of fluctuations observed in the steady parts of voiced speech were incorporated to enhance the naturalness of the synthesized speech.