This paper introduces noncausal all-pole models that are capable of efficiently capturing both the magnitude and phase information of voiced speech. It is shown that noncausal all-pole filter models are better able to match both magnitude and phase information and are particularly appropriate for voiced speech due to the nature of the glottal excitation. By modeling speech in the frequency domain, the standard difficulties that occur when using noncausal all-pole filters are avoided. Several algorithms for determining the model parameters based on frequency-domain information and the masking effects of the ear are described. Our work suggests that high-quality voiced speech can be produced using a 14th-order noncausal all-pole model.
Low-delay techniques are proposed for coding 7 kHz speech using subband code-excited linear predictive coding (CELP). The use of separate and joint index codebooks is compared. Specifically, the joint-index-subband CELP (JISBC) algorithm is found to provide good quality with processing delay in the range 2.375-3.375 ms at corresponding bit rates of 16-8 kbit/s.
The aim of this correspondence is to present a robust representation of speech based on AR modeling of the causal part of the autocorrelation sequence. In noisy speech recognition, this new representation achieves better results than several other related techniques.
We provide a simple proof of the minimum phase property of the optimum linear prediction polynomial. The proof follows directly from the fact that the minimized prediction error has to satisfy the orthogonality principle. Additional insights provided by this proof are also discussed.
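The minimum phase property can be checked numerically. The sketch below is not the proof from the paper; it is an illustration that assumes the autocorrelation method and the standard Levinson-Durbin recursion. It relies on the known equivalence that the prediction-error polynomial A(z) has all its zeros inside the unit circle exactly when every reflection coefficient has magnitude below 1. The test signal and the function names are illustrative choices, not taken from the abstract.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Autocorrelation of a finite signal (zero outside its support)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations.

    Returns (a, ks, err) where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order,
    ks are the reflection coefficients, and err is the final prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k  # error shrinks by (1 - k^2) at every stage
    return a, ks, err


# Illustrative test signal: two sinusoids plus a little seeded noise.
rng = random.Random(1)
x = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.1 * rng.gauss(0.0, 1.0)
     for n in range(200)]
r = autocorrelation(x, 10)
a, ks, err = levinson_durbin(r, 10)

# |k_m| < 1 for every stage is equivalent to A(z) being minimum phase.
is_minimum_phase = all(abs(k) < 1.0 for k in ks)
```

Checking the reflection coefficients avoids polynomial root finding, which keeps the example within the standard library.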
The feasibility and performance of an embedded regular pulse excited (ERPE) speech coder based on multistage coding are investigated. The simulated ERPE system exhibits a graceful reduction of reconstructed speech quality for bit rates from 14.8 to 6.4 kb/s in 4.2 kb/s steps, and carries a very small signal-to-noise ratio (SNR) penalty compared to its conventional version.
Taking the evolution of spectral parameters into consideration in speech coding has been shown to enhance perceptual performance. In this study we examine and compare two methods that are designed for explicit control of spectral dynamics. One method operates on the encoder side of the coding system by incorporating a constraint in the distortion measure, and the other method smooths the trajectory of output vectors at the decoder side. The decoder method, however, requires an additional coding delay of one frame. Listening experiments with three different vector quantizer structures demonstrate that the decoder method in particular gives significant improvements. For noisy channels, the preference for this method is even more pronounced.
ISBN: (print) 0818679190
Efficient quantization methods for line spectrum pair (LSP) parameters that offer good performance with low complexity and low memory requirements are proposed. An adaptive quantization method that exploits the ordering property of LSP parameters is used in a scalar quantizer and in a vector-scalar hybrid quantizer. The maximum quantization range of each LSP parameter is varied adaptively according to the quantized value of the previous order's LSP parameter. The proposed scalar quantization algorithm needs 31 bits/frame to maintain transparent speech quality, which is 3 bits fewer than the conventional scalar quantization method with interframe prediction. The improved vector-scalar quantizer achieves an average spectral distortion of 1 dB using 26 bits/frame. The performance of the proposed quantization methods is also evaluated in the presence of transmission errors.
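The idea of adapting each parameter's quantization range to the previously quantized LSP can be sketched as follows. This is a minimal illustration and not the quantizer from the paper: the uniform step, the `upper_bounds` table, the `min_gap` spacing, and the 3-bit allocation are all hypothetical values chosen for the example. The only property taken from the abstract is that the range for each LSP depends on the quantized value of the previous one, which also guarantees that the decoded values stay in ascending order.

```python
def _range_for(prev_q, hi, min_gap):
    """Adaptive range: the lower edge follows the previous quantized LSP."""
    lo = prev_q + min_gap
    hi = max(hi, lo + min_gap)  # guard against a collapsed range
    return lo, hi


def quantize_lsp(lsp, upper_bounds, bits, min_gap=0.01):
    """Uniform scalar quantization of ascending LSP frequencies (radians)."""
    indices, decoded = [], []
    prev_q = 0.0
    for w, hi, b in zip(lsp, upper_bounds, bits):
        lo, hi = _range_for(prev_q, hi, min_gap)
        levels = 1 << b
        step = (hi - lo) / (levels - 1)
        idx = min(max(round((w - lo) / step), 0), levels - 1)
        prev_q = lo + idx * step
        indices.append(idx)
        decoded.append(prev_q)
    return indices, decoded


def dequantize_lsp(indices, upper_bounds, bits, min_gap=0.01):
    """Decoder: rebuilds the same adaptive ranges from the indices alone."""
    decoded = []
    prev_q = 0.0
    for idx, hi, b in zip(indices, upper_bounds, bits):
        lo, hi = _range_for(prev_q, hi, min_gap)
        step = (hi - lo) / ((1 << b) - 1)
        prev_q = lo + idx * step
        decoded.append(prev_q)
    return decoded


# Hypothetical 10th-order example (values are illustrative only).
lsp = [0.3, 0.5, 0.9, 1.3, 1.7, 2.0, 2.3, 2.6, 2.8, 3.0]
upper_bounds = [0.6, 0.9, 1.3, 1.7, 2.1, 2.4, 2.7, 2.9, 3.05, 3.12]
bits = [3] * 10

indices, decoded = quantize_lsp(lsp, upper_bounds, bits)
reconstructed = dequantize_lsp(indices, upper_bounds, bits)
```

Because the decoder derives each range from values it has already decoded, no side information about the ranges needs to be transmitted. The trade-off is that a transmission error in one index shifts the ranges of all later parameters in the frame.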
We compare neural networks and statistical methods used to identify birds by their songs. Six bird species native to Manitoba were chosen that exhibit overlapping characteristics in terms of frequency content, song components and length of songs. Songs from multiple individuals in each species were employed. These songs were analyzed using backpropagation learning in two-layer perceptrons, as well as methods from multivariate statistics including quadratic discriminant analysis. Preprocessing methods included linear predictive coding and windowed Fourier transforms. Generalization performance ranged from 82% to 93% correct identification, with the lower figures corresponding to smaller networks that employed more preprocessing for dimensionality reduction. Computational requirements were significantly reduced in the latter case.
Mexican Spanish has received little attention so far despite being one of the most widely spoken Spanish dialects in the world, with enormous potential interest. It presents some particular characteristics that differentiate it from the Spanish spoken in Spain, which has been the most studied dialect to date. We present our study of the properties of phones in Mexican Spanish and of the acoustic modeling required for the development of an utterance verification system for Mexican Spanish. Two different approaches for modeling the alternative hypothesis in the subword-level utterance verification system are also presented and compared.
The NLMS algorithm has low computational cost and exhibits optimal performance with excitation by Gaussian noise, but has poor performance with coloured signals such as speech. This paper proposes an acausally-conditioned (AC-NLMS) method for coloured signals which adjusts the correlation matrix governing adaptation behaviour to an LMS approximation of that for Gaussian noise so as to permit near-optimal NLMS performance. The low computational complexity of NLMS is conserved. The technique has potential applications to acoustic echo and noise cancellation.
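For reference, the baseline NLMS update that the abstract builds on can be sketched as below. This is the standard textbook NLMS, not the AC-NLMS method, whose conditioning step is not specified in the abstract. The step size `mu`, the regularizer `eps`, and the three-tap system `h` are assumptions made for the illustration. The input is white Gaussian noise, the case in which the abstract says NLMS performs well.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Standard NLMS adaptive FIR filter.

    x: input samples, d: desired samples.
    Returns (w, errors): final tap weights and the a priori error sequence.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input sample first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        # Normalizing by the input energy makes the step size scale-invariant.
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


# System identification with white Gaussian excitation (illustrative setup).
rng = random.Random(0)
h = [0.5, -0.3, 0.1]  # hypothetical unknown system
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]

w, errors = nlms(x, d, num_taps=len(h))
max_tap_error = max(abs(wi - hi) for wi, hi in zip(w, h))
```

Each update costs on the order of `num_taps` multiplications, which is the low complexity the abstract refers to. With a coloured input such as speech, the same loop converges more slowly because the input correlation matrix is poorly conditioned.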