Linear predictive coding (LPC) parameters are widely used in speech processing applications to represent the spectral envelope of speech. For low bit rate speech coding, it is important to quantize these parameters accurately using as few bits as possible. Although vector quantizers are more efficient than scalar quantizers, their use for accurate quantization of LPC information (using 24-26 bits/frame) is impeded by their prohibitively high complexity. In this paper, a split vector quantization approach is used to overcome the complexity problem. Here, the LPC vector consisting of 10 line spectral frequencies (LSFs) is divided into two parts, and each part is quantized separately using vector quantization. Using the localized spectral sensitivity property of the LSF parameters, a weighted LSF distance measure is proposed. Using this distance measure, it is shown that the split vector quantizer can quantize LPC information in 24 bits/frame with an average spectral distortion of 1 dB and with fewer than 2% of frames having spectral distortion greater than 2 dB. The effect of channel errors on the performance of this quantizer is also investigated, and results are reported.
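The split quantization idea described in this abstract can be illustrated with a minimal sketch. It assumes a 4/6 split of the 10 LSFs, unit weights, and tiny two-entry codebooks; none of these values come from the abstract, and the function names are hypothetical. Each part is searched independently under a weighted squared-error distance, so the search cost depends on the two small codebooks rather than on one large one.

```python
def weighted_sq_dist(x, y, w):
    """Weighted squared Euclidean distance between two equal-length vectors."""
    return sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w))


def nearest(codebook, target, weights):
    """Index of the codebook entry closest to target under the weighted distance."""
    return min(range(len(codebook)),
               key=lambda i: weighted_sq_dist(target, codebook[i], weights))


def split_vq_encode(lsf, weights, cb_low, cb_high, split=4):
    """Quantize the lower and upper parts of the LSF vector separately.

    split=4 is an illustrative assumption, not a value stated in the abstract.
    """
    i_low = nearest(cb_low, lsf[:split], weights[:split])
    i_high = nearest(cb_high, lsf[split:], weights[split:])
    return i_low, i_high


def split_vq_decode(indices, cb_low, cb_high):
    """Rebuild the full LSF vector by concatenating the two selected entries."""
    return cb_low[indices[0]] + cb_high[indices[1]]


# Toy codebooks (hypothetical values; real codebooks would be trained on speech).
cb_low = [[0.10, 0.20, 0.30, 0.40], [0.15, 0.25, 0.35, 0.45]]
cb_high = [[0.5, 0.6, 0.7, 0.8, 0.9, 1.0], [0.6, 0.7, 0.8, 0.9, 1.0, 1.1]]

lsf = [0.11, 0.21, 0.31, 0.41, 0.58, 0.68, 0.78, 0.88, 0.98, 1.08]
weights = [1.0] * 10  # placeholder weights; the paper derives its own weighting

indices = split_vq_encode(lsf, weights, cb_low, cb_high)
decoded = split_vq_decode(indices, cb_low, cb_high)
```

If the 24 bits were shared equally between the two parts (again an assumption), each codebook would hold 2^12 = 4096 entries, compared with 2^24 = 16,777,216 entries for a single unsplit codebook.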
Itakura and Saito [1] used the maximum likelihood (ML) method to derive a spectral matching criterion for autoregressive (i.e., all-pole) random processes. In this paper, their results are generalized to periodic processes having arbitrary model spectra. For the all-pole model, Kay's [2] covariance domain solution to the recursive ML (RML) problem is cast into the spectral domain and used to obtain the RML solution for periodic processes. When applied to speech, this leads to a method for solving the joint pitch and spectrum envelope estimation problems. It is shown that if the number of frequency power measurements greatly exceeds the model order, then the RML algorithm reduces to a pitch-directed, frequency domain version of linear predictive (LP) spectral analysis. Experiments on a real-time vocoder reveal that the RML synthetic speech has the quality of being heavily smoothed.
An experiment was designed to compare the formant 1 (F1) and formant 2 (F2) frequency movements of vowels next to /r/ with those of the same vowels next to other consonants. The data for this experiment were based on formant trajectories computed by the linear prediction coefficient (LPC) technique on a digital computer. The results indicate that, with the exception of /i/, the effect of initial /r/ on the following syllable nuclei could be considered minimal. The effect of final /r/ on the syllable nuclei preceding it is appreciable. Algorithms are postulated to define a retroflexed vowel space for vowels preceding /r/ in terms of the nonretroflexed vowel space.
Identification of exon locations in a DNA sequence has been considered the most demanding and challenging research topic in the field of bioinformatics. This work proposes a robust approach that combines trigonometric mapping with adaptively tuned Kaiser windowing to locate the protein coding regions (exons) in a genetic sequence. For better convergence and improved accuracy, the side lobe height control parameter (beta) of the Kaiser window is made adaptive so that it tracks the changing dynamics of the genetic sequence. The beta factor is tuned by a recursive Gauss-Newton method that uses the covariance of the error signal, which gives the proposed adaptive Kaiser algorithm better tracking ability; this is shown through numerous simulation results under a variety of practical test conditions. A detailed comparative analysis with existing mapping schemes, windowing techniques, and other signal processing methods such as SVD, AN, DFT, STDFT, WT, and ST is also included to highlight the strength and efficiency of the proposed approach. Moreover, some critical performance parameters have been computed to investigate the effectiveness and robustness of the algorithm. In addition, the proposed approach has been successfully applied to a number of benchmark gene sets, such as Mus musculus, Homo sapiens, and C. elegans, where it predicted exon locations more efficiently than the other existing mapping methods.
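To make the role of beta concrete, the following minimal sketch evaluates the standard Kaiser window for a fixed beta. The function names are hypothetical, and the recursive Gauss-Newton update of beta is not shown because the abstract does not specify it. The modified Bessel function I0 is computed from its power series.

```python
import math


def bessel_i0(x, terms=50):
    """Zeroth-order modified Bessel function via its power series.

    I0(x) = sum over k of ((x/2)^k / k!)^2; 50 terms is ample for moderate x.
    """
    total = 0.0
    term = 1.0  # holds (x/2)^k / k!
    for k in range(terms):
        if k > 0:
            term *= (x / 2.0) / k
        total += term * term
    return total


def kaiser_window(length, beta):
    """Kaiser window samples w[n] = I0(beta * sqrt(1 - r^2)) / I0(beta),
    where r runs linearly from -1 to 1 across the window."""
    if length == 1:
        return [1.0]
    denom = bessel_i0(beta)
    window = []
    for n in range(length):
        r = 2.0 * n / (length - 1) - 1.0
        arg = beta * math.sqrt(max(0.0, 1.0 - r * r))
        window.append(bessel_i0(arg) / denom)
    return window


flat = kaiser_window(5, 0.0)     # beta = 0 gives a rectangular window
tapered = kaiser_window(5, 6.0)  # larger beta tapers the edges more strongly
```

A larger beta lowers the side lobes of the window's spectrum at the cost of a wider main lobe, which is the trade-off an adaptive scheme would be adjusting.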
The vast majority of commercially available isolated word recognizers use a filter bank analysis as the front end processing for recognition. It is not well understood how the parameters of different filter banks (e.g., number of filters, types of filters, filter spacing) affect recognizer performance. In this paper we present results of a performance evaluation of several types of filter bank analyzers in a speaker-trained isolated word recognition test using dialed-up telephone line recordings. We have studied both DFT (discrete Fourier transform) and direct form implementations of the filter banks. We have also considered uniform and nonuniform filter spacings. The results indicate that the best performance (highest word accuracy) is obtained by both a 15-channel uniform filter bank and a 13-channel nonuniform filter bank (with channels spaced along a critical band scale). The performance of a 7-channel critical band filter bank is almost as good as that of the two best filter banks. Compared with a conventional eighth-order linear predictive coding (LPC) word recognizer, the best filter bank recognizers performed, on average, several percent worse. A discussion as to why some filter banks performed better than others, and why the LPC-based system did the best, is given in this paper.
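As a rough illustration of a DFT-based uniform filter bank front end, the sketch below computes per-channel energies for one frame by grouping DFT power bins into equal-width bands. The frame length, channel count, and function names are assumptions chosen for the example and are not the configurations evaluated in the paper.

```python
import cmath
import math


def dft_power(frame):
    """Power spectrum |X[k]|^2 for bins 0..N/2 of a real-valued frame (direct DFT)."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    return power


def uniform_band_energies(frame, num_channels):
    """Sum DFT power into equal-width channels, skipping the DC bin."""
    power = dft_power(frame)[1:]  # bins 1..N/2
    width = len(power) // num_channels
    return [sum(power[c * width:(c + 1) * width]) for c in range(num_channels)]


# A 64-sample test frame containing a single sinusoid centered on DFT bin 20.
frame = [math.cos(2.0 * math.pi * 20 * t / 64) for t in range(64)]
energies = uniform_band_energies(frame, 4)  # 4 channels of 8 bins each
loudest = max(range(len(energies)), key=lambda c: energies[c])
```

A nonuniform (critical band) bank would use the same power spectrum but group bins into bands whose widths grow with frequency.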
The LPC prediction error provides one measure of the success of linear prediction analysis in modeling a speech signal. Although a great deal is known about the properties of the prediction error, relatively little has been published about its variation as a function of the position of the analysis frame. In this paper it is shown that a fairly substantial variation in the prediction error is obtained within a single frame (i.e., 10 ms), independent of the analysis method (i.e., the covariance, autocorrelation, or lattice method). The implication of this result is that standard methods of LPC analysis may be inadequate for some applications. This is because the error signal is generally sampled uniformly at a low rate (on the order of 100 Hz), which can lead to aliased results because of the variation of the error signal within the frame. For applications such as word recognition with frame-to-frame distance calculations using the prediction error, the errors due to uniform sampling can accrue. For speech synthesis applications, the effect of uniform sampling of the error signal is a small but noticeable roughness in the synthetic speech. Various techniques for reducing the intraframe variation of the prediction error are discussed.
A single-chip speech synthesizer was designed using a switched-capacitor multiplier to implement the LPC algorithm. The chip contains the LPC-10 filter, 20 kbit of ROM, all control logic, a three-pole switched-capacitor low-pass filter, and an audio amplifier capable of driving a speaker directly. The chip was fabricated in 5 µm CMOS technology and is 218 mils on a side.
It is shown that an autoregressive system with stationary and independent stochastic coefficients can be modeled by a constant coefficient equation, where the constants are the stochastic means, if the resulting system is sufficiently low pass, low gain (LPLG). The LPLG requirement can be relaxed as the variations on the random coefficients become small.
By using specially developed microprocessors, and the larger, cheaper read-only memories (ROM's) now available, it is possible to store and reproduce the human voice electronically.
An algorithm for the solution of the linear equations for the "covariance method" of linear prediction is stated and proved. The algorithm requires only O(p^2) arithmetic operations, and in form resembles the Levinson algorithm for solution of the linear equations for the "correlation method" of linear prediction. The structural properties of the problem and its solution are emphasized in the analysis presented.
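For reference, the Levinson recursion that this abstract uses as its point of comparison can be sketched as follows. This is the classic Levinson-Durbin recursion for the correlation (autocorrelation) method, not the covariance-method algorithm the paper introduces. The sign convention A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p and the function name are assumptions of this sketch.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations in O(order^2) operations.

    r     : autocorrelation values r[0..order]
    order : prediction order p
    Returns (a, err), where a[0] == 1.0 and a[1..p] are the coefficients of the
    prediction-error filter, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current residual correlation.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Order update: combine the previous solution with its reversal.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


# Autocorrelation of a first-order process with r[k] = 0.5 ** k.
coeffs, final_err = levinson_durbin([1.0, 0.5, 0.25], 2)
```

Each pass of the outer loop does work proportional to the current order i, so the total cost grows as p^2 rather than the p^3 of general Gaussian elimination.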