We propose a robust recursive procedure, based on a weighted recursive least squares (WRLS) algorithm with a variable forgetting factor (VFF) and a frame-based quadratic classifier, for identifying the nonstationary AR model of a speech production system. Two versions of the frame-based quadratic classifier design procedure are considered: the iterative quadratic classification procedure (CIQC) and its real-time modification (RTQC). A comparative experimental analysis is based on results obtained from speech signals with voiced and mixed excitation segments. The experimental results indicate that two main problems of LPC speech analysis, the nonstationarity of LPC parameters and the inadequacy of AR modeling of speech (particularly on voiced frames), can be solved by applying the proposed robust procedure. Comparing CIQC and RTQC, the better results were obtained with the RTQC algorithm, which is therefore recommended for nonstationary AR speech model identification.
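As a rough illustration of the recursive least squares idea mentioned in this abstract, the following is a minimal sketch and not the authors' procedure. It assumes a constant forgetting factor `lam` (the paper varies it over time), omits the robust weighting and the quadratic classifier, and tests on a synthetic AR(2) signal. The names `rls_ar`, `synth_ar2`, `lam`, and `delta` are hypothetical and chosen only for this example.

```python
import random

def rls_ar(signal, order=2, lam=0.995, delta=100.0):
    """Exponentially weighted RLS estimate of AR coefficients.

    Assumed model: s[n] = sum_k a[k] * s[n-1-k] + e[n].
    lam is a constant forgetting factor (a simplification of a VFF).
    """
    a = [0.0] * order
    # P tracks the inverse of the exponentially weighted correlation matrix.
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    for n in range(order, len(signal)):
        x = [signal[n - 1 - k] for k in range(order)]  # past samples
        Px = [sum(P[i][j] * x[j] for j in range(order)) for i in range(order)]
        denom = lam + sum(x[i] * Px[i] for i in range(order))
        gain = [v / denom for v in Px]
        err = signal[n] - sum(a[k] * x[k] for k in range(order))  # a priori error
        a = [a[k] + gain[k] * err for k in range(order)]
        # P <- (P - gain * x^T P) / lam; x^T P equals Px^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(order)]
             for i in range(order)]
    return a

def synth_ar2(n=2000, a1=1.3, a2=-0.6, seed=0):
    """Generate a stable synthetic AR(2) signal driven by white Gaussian noise."""
    rng = random.Random(seed)
    s = [0.0, 0.0]
    for _ in range(n):
        s.append(a1 * s[-1] + a2 * s[-2] + rng.gauss(0.0, 1.0))
    return s

est = rls_ar(synth_ar2())
```

A forgetting factor closer to 1 averages over more samples and gives smoother estimates. A smaller value tracks parameter changes faster, at the cost of noisier estimates. That trade-off is what a variable forgetting factor is meant to manage.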
We apply the relative entropy functional to sets of line-spectrum pairs (LSPs) and transform-based generalized spectral pmfs of Gibson et al. (1993) and present experimental results for sequence segmentation and vector quantization which show that the relative entropy of these quantities is a useful indicator for variable-rate speech coding.
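The relative entropy used here can be illustrated with a small sketch. It assumes each frame is summarized by a vector of non-negative spectral values that is normalized into a pmf; how the LSP sets or the generalized spectral pmfs of Gibson et al. are actually formed is not specified in this abstract, so the frame values, `to_pmf`, and `relative_entropy` below are purely illustrative.

```python
import math

def to_pmf(values, floor=1e-12):
    """Normalize non-negative values so they sum to 1 (floor avoids log(0))."""
    clipped = [max(v, floor) for v in values]
    total = sum(clipped)
    return [v / total for v in clipped]

def relative_entropy(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical per-frame spectral values: frame_b resembles frame_a,
# frame_c has its energy shifted to the upper bins.
frame_a = to_pmf([4.0, 2.0, 1.0, 1.0])
frame_b = to_pmf([4.1, 1.9, 1.0, 1.0])
frame_c = to_pmf([1.0, 1.0, 2.0, 4.0])

d_small = relative_entropy(frame_a, frame_b)
d_large = relative_entropy(frame_a, frame_c)
```

In a segmentation setting, a large divergence between adjacent frames would suggest a boundary, while a small one would suggest the frames could share a coding mode.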
We have improved G.728 output speech quality for frame erasure channels. Three cases are considered: (1) no change to G.728, (2) change only the G.728 decoder, and (3) change both the encoder and decoder. In case 1, we synthesize a bit-stream during erased frames so that the decoder decodes an excitation with low energy or with characteristics similar to the excitation of previous good frames. In case 2, the gain-scaled excitation and LPC coefficients are extrapolated, and vital operations of backward LPC and gain adaptations are continued. Case 3 adds spectral smoothing and increases bandwidth expansion for the LPC and gain predictors. These techniques are quite effective, as the speech quality degradation due to 1% frame erasures ranges from just slightly noticeable in case 1 to almost unnoticeable in case 3. For case 3, the output speech is still intelligible for frame erasure rates up to 10% or even 20%.
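Bandwidth expansion of LPC coefficients, mentioned for case 3, is commonly done by scaling the k-th coefficient by gamma^k, which moves the poles of the synthesis filter toward the origin. The sketch below shows only that generic operation. The value of `gamma` and the example coefficients are assumptions for illustration, not values taken from G.728 or from this work.

```python
def bandwidth_expand(lpc, gamma):
    """Scale a_k by gamma**k for k = 1..p.

    lpc holds a_1..a_p; the leading coefficient a_0 = 1 is implied.
    """
    return [a * gamma ** (k + 1) for k, a in enumerate(lpc)]

# Hypothetical second-order predictor and expansion factor.
coeffs = [1.3, -0.6]
expanded = bandwidth_expand(coeffs, 0.9)
```

A smaller `gamma` widens the formant bandwidths more, which shortens the filter's impulse response and so limits how long the effect of an erased frame persists.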
This paper presents a new voice coder for applications in future low bit rate communication systems. The emphasis is on speech quality, noise robustness, and complexity. The coder performs a multiband+LPC spectral analysis and synthesis of speech. The transmitted information consists of an LPC10 filter, a set of voicing rates, pitch, energies, the spectral density of the excitation in five subbands, and information about the stationarity of the signal in each half-frame. Depending on this stationarity, the quantization process is adapted to provide more spectral information (stable speech) or more temporal information (transitory speech). To reduce sensitivity to surrounding noise, the pitch and voicing rates are first computed in each subband. The final values of these parameters are obtained from the values in the current frame and its neighbours. The excitation signal used at the synthesis side consists of a mixture of isolated pulses and periodic and aperiodic signals of adjustable spectral composition. Test results are provided.
Most current work in the area of high quality audio coding falls under one of two categories: transform or sub-band coding. Because LPC coders are based on modelling the human voice production system, they are found to be inappropriate for modelling music and other non-speech sounds. A better model for such signals is shown to be the multipulse LPC model. In this paper we propose to improve the quality of the multipulse model by first passing the signal of interest through a filter bank and then extracting the multipulse parameters from each of the bandpass filter outputs. The wavelet decomposition is used to design the filter bank. Both the multipulse model and the wavelet decomposition are well known, but their combination has not yet been exploited. This combination is expected to lead to a new approach to high quality, low bit rate audio coding.
The code excited linear prediction coder (CELP) makes it possible to synthesize good quality speech at low bit rates. In such a case, speech quality mainly depends on spectral envelope design accuracy. Different kinds of parameters belonging to the parametrical domain (linear prediction coefficients.