This paper introduces a modified linear prediction method based on the Karhunen-Loève expansion of the correlation matrix of the speech samples. The method is obtained via a new normalization of the parameters. It is shown that, owing to certain properties of Toeplitz matrices, the poles of the autoregressive (AR) model lie on the unit circle. Consequently, only the formant frequencies are computed, and the result can be interpreted as a special discrete Fourier transform. The application to speech analysis is developed, with a comparison to the usual linear prediction and cepstrum methods.
In ordinary linear prediction the speech spectral envelope is modeled by an all-pole spectrum. The error criterion employed guarantees a uniform fit across the whole frequency range. However, we know from speech perception studies that low frequencies are more important than high frequencies for perception. Therefore, a minimally redundant model would strive to achieve a uniform perceptual fit across the spectrum, which means that it should represent low frequencies more accurately than high frequencies. This is achieved in the LPCW vocoder: an LPC vocoder employing our recently developed method of linear predictive warping (LPW). The result is improved speech quality for the same bit rate.
A highly reliable non-iterative algorithm for the detection of the closed glottis interval of voiced sounds is described. The method is based on an indicator of linear dependence of certain intervals of the speech signal that are considerably shorter than the pitch period. This indicator is evaluated using the normalized prediction error calculated by linear predictive coding applied to the integrated speech signal. The method allows for the decomposition of the speech signal into an excitation function and a signal for which the glottal impedance is constant and infinite. The algorithm has been extensively applied to the detection of a pitch period in the speech signal and to the extraction of the vocal tract area function. The potential applications of this approach include not only speech analysis and synthesis, but equally speech recognition and speaker identification.
The quest for new speaker-dependent features is a constant problem in the design of automatic speaker recognition systems. In speech, information about the speaker usually arises along with the semantic information, which makes its independent use difficult. In this paper, a method based on linear prediction (LP) analysis is described which yields features that are more speaker dependent than the usual linear predictor coefficients (LPC). In this method the LPC contours are obtained through a cascade realization of digital inverse filtering (DIF) for speech signals. A low-order (2-4) DIF removes the gross spectral characteristics, such as the large dynamic range and some significant peaks, which tend to mask the weaker formants. Visual comparison of the contours and a preliminary statistical analysis indicate that the LPC contours obtained by processing the output signal of the first stage contain better features for speaker dependency than the direct LPC contours.
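The cascade idea in the abstract above can be illustrated for a single analysis frame. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the autocorrelation method for both stages, the function names (`lpc`, `inverse_filter`, `cascade_lpc`) are hypothetical, and the second-stage order of 10 is an illustrative choice that the abstract does not specify. Only the low first-stage order (2-4) comes from the abstract.

```python
def autocorr(x, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]


def lpc(x, order):
    """Inverse-filter coefficients [1, a1, ..., ap] via Levinson-Durbin."""
    r = autocorr(x, order)
    a, err = [1.0] + [0.0] * order, r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient for stage m
        a = ([1.0]
             + [a[k] + k_m * a[m - k] for k in range(1, m)]
             + [k_m]
             + [0.0] * (order - m))
        err *= 1.0 - k_m * k_m
    return a


def inverse_filter(x, a):
    """FIR inverse filtering: e[n] = sum_k a[k] * x[n - k]."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def cascade_lpc(x, first_order=2, second_order=10):
    """Two-stage analysis (second_order is an assumed value).

    Stage 1: a low-order inverse filter removes the gross spectral shape.
    Stage 2: LPC analysis of the stage-1 output signal.
    """
    a1 = lpc(x, first_order)
    residual = inverse_filter(x, a1)
    a2 = lpc(residual, second_order)
    return a1, a2
```

Running `cascade_lpc` on successive frames and plotting the second-stage coefficients over time would give contours of the kind the abstract compares with direct LPC contours; the framing and windowing details are left out here because the abstract does not describe them.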
This paper reports the results of an investigation of a computable Quality Comparison Measure (called the QCM) for linear predictive systems. The measure described is easily obtained by a synthesis-analysis procedure. It is a weighted combination of differences between the input and output speech parameters for a series of spoken sentences. Results are presented that demonstrate a high correlation between QCM and listener preference scores. The QCM offers an alternative to costly and time-consuming formal listening procedures.
This book is the first in-depth unified presentation of the important area of linear prediction in speech processing. It covers linear prediction from detailed theoretical considerations through practical applications, including Fortran program implementations of important algorithms. Linear Prediction Formulations, Speech Synthesis Structures, Spectral Analysis, Formant and Fundamental Frequency Estimation, Computational Considerations, and Vocoders are presented with emphasis on interrelating the two most widely used forms (the autocorrelation method and the covariance method). Because of the depth of presentation from theoretical derivations through computer programs, the material should be applicable to a wide range of backgrounds. The book is written mainly for those interested in acoustical speech processing, although certain portions will be of interest to readers with other backgrounds in speech research and digital signal processing.
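For readers unfamiliar with the autocorrelation method named in the abstract above, a minimal sketch follows. It is written in Python rather than Fortran and is not taken from the book; the function names are hypothetical. It uses the standard Levinson-Durbin recursion, with the sign convention that the inverse filter is A(z) = 1 + a1 z^-1 + ... + ap z^-p.

```python
def autocorrelation(x, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one (windowed) frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations of the autocorrelation method.

    Returns (a, err): a[0] == 1 and a[1..order] are the inverse-filter
    coefficients; err is the final prediction residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err  # reflection coefficient for stage m
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        err *= 1.0 - k_m * k_m
    return a, err
```

As a quick check, the autocorrelation sequence 1, 0.5, 0.25 (a first-order process with coefficient 0.5) analyzed at order 2 gives coefficients close to 1, -0.5, 0 and a residual energy of 0.75. The covariance method differs in that its normal equations are not Toeplitz, so this recursion does not apply to it directly.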
A computer system is described in which isolated words, spoken by a designated talker, are recognized through calculation of a minimum prediction residual. A reference pattern for each word to be recognized is stored as a time pattern of linear prediction coefficients (LPC). The total log prediction residual of an input signal is minimized by optimally registering the reference LPC onto the input autocorrelation coefficients using the dynamic programming algorithm (DP). The input signal is recognized as the reference word which produces the minimum prediction residual. A sequential decision procedure is used to reduce the amount of computation in DP. A frequency normalization with respect to the long-time spectral distribution is used to reduce effects of variations in the frequency response of telephone connections. The system has been implemented on a DDP-516 computer for the 200-word recognition experiment. The recognition rate for a designated male talker is 97.3 percent for telephone input, and the recognition time is about 22 times real time.
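The matching step in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the system described: the function names are hypothetical, the local distance is assumed to be the log ratio of the residual energy produced by the reference LPC on the input autocorrelation to the input frame's own minimum residual, and the dynamic programming uses plain unconstrained steps. The sequential decision procedure and the frequency normalization mentioned in the abstract are not modeled.

```python
import math


def residual_energy(a, r):
    """Energy a^T R a left when inverse filter a is applied to a frame
    whose autocorrelation is r (R is the Toeplitz matrix built from r)."""
    p = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)]
               for i in range(p) for j in range(p))


def log_residual_distance(a_ref, r_in, min_err):
    """Assumed local distance: log of the reference-filter residual over
    the input frame's minimum residual min_err (zero for a perfect match)."""
    return math.log(residual_energy(a_ref, r_in) / min_err)


def dp_total(cost):
    """Accumulate local distances along the best alignment path.

    cost[i][j] is the distance between input frame i and reference
    frame j; allowed steps are (i-1, j), (i, j-1) and (i-1, j-1).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    d = [[inf] * m for _ in range(n)]
    d[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                d[i - 1][j] if i > 0 else inf,
                d[i][j - 1] if j > 0 else inf,
                d[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            d[i][j] = cost[i][j] + best
    return d[n - 1][m - 1]
```

Recognition would then amount to computing `dp_total` against each stored reference word and choosing the word with the smallest total, which corresponds to the minimum prediction residual decision in the abstract.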
This paper presents several digital signal processing methods for representing speech. Included among the representations are simple waveform coding methods; time domain techniques; frequency domain representations; nonlinear or homomorphic methods; and finally linear predictive coding techniques. The advantages and disadvantages of each of these representations for various speech processing applications are discussed.