A method for recursively calculating the autocorrelation functions for LPC analysis in a vocoder environment is developed theoretically and studied experimentally. The method has three specific advantages: (1) it requires very little memory for its implementation; (2) it is realized by a structure consisting of several identical modules; and (3) the effective window length may be changed without varying the structure. Experimental results showed the speech quality to be comparable to (but slightly superior to) that produced by an autocorrelation LPC using a Hanning window.
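The recursive idea can be illustrated with a minimal sketch. It assumes a first-order exponential window, which is a stand-in chosen for brevity and not necessarily the window used in the paper; the names `recursive_autocorrelation`, `max_lag`, and `alpha` are hypothetical. Each lag keeps a single accumulator updated by the same rule, so the structure is a row of identical modules, and the effective window length (roughly 1 / (1 - alpha) samples) changes with `alpha` alone.

```python
def recursive_autocorrelation(samples, max_lag, alpha=0.99):
    """Running autocorrelation estimate, one identical update per lag.

    Assumed update rule (exponential window, not taken from the paper):
        R_k(n) = alpha * R_k(n - 1) + x(n) * x(n - k),  k = 0..max_lag
    """
    r = [0.0] * (max_lag + 1)        # one accumulator per lag
    delay = [0.0] * (max_lag + 1)    # delay[k] holds x(n - k)
    for x in samples:
        delay = [x] + delay[:-1]     # shift the newest sample in
        for k in range(max_lag + 1):
            r[k] = alpha * r[k] + delay[0] * delay[k]
    return r


# With alpha = 1.0 the recursion reduces to a plain running sum of products.
print(recursive_autocorrelation([1.0, 2.0, 3.0], max_lag=1, alpha=1.0))
```

Memory use is limited to the `max_lag + 1` accumulators and the short delay line, independent of the effective window length.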
In an attempt to develop a more robust vocoder, an adaptive noise-stripping Wiener filter is used to prefilter the noisy speech. To adapt to quasi-stationary noise, a speech classifier is developed that detects the presence of silence (noise alone), unvoiced speech, or voiced speech. During the silent intervals, the noise statistics and the corresponding Wiener filter are updated, resulting in a decision-directed adaptive structure.
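A rough sketch of the decision-directed update follows. It works on per-frame power spectra and replaces the three-way speech classifier with a plain energy threshold, which is an assumption made only to keep the example short. The function name `wiener_gains`, the `silence_threshold` and `smoothing` parameters, and the choice to treat the first frame as noise are all hypothetical.

```python
def wiener_gains(power_frames, silence_threshold, smoothing=0.9):
    """Per-bin Wiener gains with the noise estimate updated only in silence.

    power_frames: list of frames, each a list of per-bin noisy-speech power.
    Assumptions (not from the source): silence is detected by total frame
    energy, and the first frame is noise alone.
    """
    noise = None
    gains = []
    for frame in power_frames:
        if noise is None:
            noise = list(frame)                      # bootstrap the noise estimate
        elif sum(frame) < silence_threshold:
            # Silent interval: refresh the noise statistics recursively.
            noise = [smoothing * n + (1.0 - smoothing) * p
                     for n, p in zip(noise, frame)]
        # Wiener gain per bin: estimated speech power over noisy power.
        gains.append([max(0.0, (p - n) / p) if p > 0.0 else 0.0
                      for p, n in zip(frame, noise)])
    return gains


frames = [[1.0, 1.0], [10.0, 1.0]]
print(wiener_gains(frames, silence_threshold=5.0))
```

Bins dominated by noise receive a gain near zero, while bins where the noisy power well exceeds the noise estimate pass almost unchanged.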
Four values for the number of poles (13, 11, 9, 8) were combined factorially with three values of step size for quantization of the log area ratios (0.5, 1, 2 dB), and with four values of frame rate (100, 67, 50, 33 frames per second), to define 48 LPC vocoder systems with overall bit rates ranging from 8.7 down to 1.3 kbps. Subjects rated the degradation of signal quality caused by each vocoder, for each of seven sentence tokens chosen to challenge LPC vocoders maximally. The results define the combination of LPC parameters yielding the best speech quality for any desired overall bit rate.
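The factorial design can be checked by enumerating the combinations. The variable names are illustrative, and the bit allocation per system is not given in the abstract, so no bit rates are computed here.

```python
from itertools import product

poles = (13, 11, 9, 8)
step_sizes_db = (0.5, 1, 2)          # quantization step for log area ratios
frame_rates = (100, 67, 50, 33)      # frames per second

# Full factorial design: every pole count with every step size and frame rate.
systems = list(product(poles, step_sizes_db, frame_rates))
print(len(systems))  # 4 * 3 * 4 = 48
```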
An improved linear predictive coding scheme is proposed that allows for some adaptation of the analysis time interval according to the signal's stationarity properties. A fixed short time interval is used for analysis and the intervals are concatenated when the signal has stationary statistics over successive intervals. A "speech-like" all pole signal with time varying parameters is analyzed using autocorrelation and covariance lattice methods. The accuracy with which the model matches the actual signal is examined in terms of coefficient variation and Markel and Gray's spectral difference measure.
As part of a Defense Advanced Research Projects Agency program investigating voice transmission over a packet-switched computer network, a real-time variable frame rate LPC vocoder has been implemented at USC-ISI. The vocoder is implemented on a Floating Point Systems AP-120B array processor. A likelihood ratio threshold is used to control when frames of data are transmitted. The average data rate and the speech quality of the vocoder depend on the threshold, but acceptable quality is possible with data rates less than 2,000 bits per second for continuous speech.
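The transmission rule can be sketched as follows. The abstract only states that a likelihood ratio threshold is used. The specific form below (an Itakura-style log likelihood ratio between the current frame and the last transmitted frame) is an assumption, as are all function names and the autocorrelation-method analysis.

```python
import math


def autocorrelation(frame, order):
    """Short-time autocorrelation r[0..order] of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., ap]."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        k = -sum(a[j] * r[i - j] for j in range(i)) / error
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= 1.0 - k * k
    return a


def residual_energy(a, r):
    """a^T R a, with R the Toeplitz matrix built from r."""
    size = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)]
               for i in range(size) for j in range(size))


def select_frames(frames, order, threshold):
    """Indices of frames to transmit (assumed rule, see lead-in)."""
    sent, a_last = [], None
    for index, frame in enumerate(frames):
        r = autocorrelation(frame, order)
        a = levinson_durbin(r, order)
        if a_last is None:
            transmit = True
        else:
            ratio = residual_energy(a_last, r) / residual_energy(a, r)
            transmit = math.log(ratio) > threshold
        if transmit:
            sent.append(index)
            a_last = a
    return sent


low = [1.0, 0.9, 0.7, 0.4, 0.0, -0.4, -0.7, -0.9]
high = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(select_frames([low, low, high], order=1, threshold=0.1))
```

Raising the threshold suppresses more frames and lowers the average data rate at the cost of quality, which matches the trade-off described above. The sketch does not guard against all-zero frames, where the residual energy is zero.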
One of the most difficult problems in speech analysis is reliable discrimination among silence, unvoiced speech, and voiced speech that has been transmitted over a telephone line. Although several methods have been proposed for making this 3-level decision, these schemes have met with only modest success. In this paper a novel approach to the voiced-unvoiced-silence detection problem is proposed in which a spectral characterization of each of the 3 classes of signal is obtained during a training session, and an LPC distance metric and an energy distance are nonlinearly combined to make the final discrimination. This algorithm has been tested over conventional switched telephone lines, across a variety of speakers, and has been found to have an error rate of about 5%, with the majority of the errors (about 2/3) occurring at the boundaries between signal classes. The algorithm is currently being used in a speaker-independent word recognition system.
During training, LPC is used to obtain the reference poles corresponding to the words. During recognition, the filter order is variable and is imposed by the dictionary. The distance between an input speech window and a dictionary speech window is computed with a method close to Itakura's, but using a cascade of second-order inverse filters. An improved dynamic programming procedure is used, allowing parallel computation for several words.
A method for reducing the characteristic buzz from LPC synthetic speech is presented. The method consists of the use of a non-impulse source for exciting the LPC synthesizer during voiced sounds. One novel feature is that the temporal parameters of the source are kept in fixed proportion to the pitch period. An extensive perceptual experiment has shown that the resulting quality of the synthesis is significantly preferred over the quality of the standard LPC synthesis.
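The proportional-timing idea can be illustrated with a small sketch. The pulse shape (a raised cosine) and the `open_fraction` parameter are hypothetical choices, since the abstract does not describe the actual source waveform. Only the principle is shown: the pulse width scales with the pitch period.

```python
import math


def excitation_period(pitch_period, open_fraction=0.4):
    """One pitch period of a non-impulse excitation (assumed pulse shape).

    The pulse occupies a fixed fraction of the period, so its duration
    stretches and shrinks with the pitch period in samples.
    """
    width = max(1, round(open_fraction * pitch_period))
    pulse = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / width))
             for n in range(width)]
    return pulse + [0.0] * (pitch_period - width)


short = excitation_period(40)   # pulse spans 16 of 40 samples
long = excitation_period(80)    # pulse spans 32 of 80 samples
print(len(short), len(long))
```

A standard impulse source would instead place a single nonzero sample at the start of each period, regardless of pitch.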