Mean-square-error minimizing signal compression techniques, such as Autoregressive Analysis or linear predictive coding and Principal Component or Karhunen-Loève Analysis, can be systematically characterized in terms of canonical coordinate or generalized eigenvector procedures. This approach provides considerable insight into the interrelationships between a variety of seemingly different signal compression methods. The approach also provides a convenient mechanism for introducing the types of non-Euclidean error measures that are needed to adjust the signal performance optimization criteria to take into account different types of a priori statistical and dynamical information relating to both the desired signal and to various interference processes.
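As a rough illustration of how a non-Euclidean error measure leads to a generalized eigenvector problem, the sketch below assumes a signal covariance matrix R and a symmetric positive-definite error-weighting matrix W. Both names, and the helper functions, are hypothetical and are not taken from the paper. Minimizing the expected weighted error E[(x - x_hat)^T W (x - x_hat)] over rank-k linear approximations becomes an ordinary Karhunen-Loève problem after the change of coordinates y = L^T x, where W = L L^T; this is equivalent to the generalized eigenproblem (W R W) w = lambda W w. It is one possible reading of the approach, not the paper's own procedure.

```python
import numpy as np

def weighted_kl_projector(R, W, k):
    """Rank-k projector minimizing the W-weighted mean-square error.

    R: signal covariance (assumed symmetric positive definite).
    W: error-weighting matrix (assumed symmetric positive definite).
    Returns the eigenvalues (descending) and the projector P, x_hat = P @ x.
    """
    L = np.linalg.cholesky(W)              # W = L @ L.T, L lower triangular
    C = L.T @ R @ L                        # covariance of y = L.T @ x
    eigvals, U = np.linalg.eigh(C)         # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, U = eigvals[order], U[:, order]
    Q = U[:, :k] @ U[:, :k].T              # Euclidean projector in y coordinates
    P = np.linalg.solve(L.T, Q @ L.T)      # map back: P = L^-T Q L^T
    return eigvals, P

def expected_weighted_error(R, W, P):
    """E[(x - P x)^T W (x - P x)] for zero-mean x with covariance R."""
    E = np.eye(R.shape[0]) - P
    return float(np.trace(W @ E @ R @ E.T))
```

With W set to the identity matrix this reduces to ordinary principal component analysis; for any admissible W the residual weighted error equals the sum of the discarded eigenvalues.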
This paper describes a procedure that reduces the spectral distortion in LPC-encoded speech preprocessed by a CVSD coder. In this type of tandem configuration (wideband/narrowband), the CVSD process introduces extraneous wideband noise and a general broadening of the formant bandwidths. When coupled with the formant distortion introduced by the LPC process, the tandem speech sounds buzzy, muffled, and of lower quality than that of either system considered alone. By low-pass filtering the CVSD speech on a formant-adaptive basis and narrowing the bandwidths of the primary formants, F1 and F2, the input signal to the LPC synthesizer more closely resembles the original unprocessed signal. This spectral enhancement procedure produces a higher quality CVSD/LPC signal than previously realized.
This paper describes the real-time implementation of a linear predictive coding algorithm that has been developed over the past five years. The algorithm chosen for the analyzer is a modification of the Covariance Method introduced by B. S. Atal [1],[2] of Bell Labs. The system for pitch extraction uses a minimum distance function correlation technique. A dynamic programming algorithm [3] is used for pitch smoothing and correction of isolated pitch errors. The synthesizer uses a transversal filter. Considerable time has been devoted to optimizing the running time and integer scaling of the different algorithms for real-time implementation on a 16-bit minicomputer.
A great deal of current research in the area of narrowband digital speech compression makes use of the linear predictive coding (LPC) algorithm to extract the vocal tract spectrum. This paper describes a technique, piecewise LPC (PLPC), that splits the spectrum into two equal halves and performs a separate LPC approximation to each half. By taking advantage of the classical benefits of piecewise approximation, the fidelity is expected to be higher than that of standard LPC. In addition, by making use of undersampling and spectrum folding, computational requirements are reduced by about 40%. PLPC has been implemented in real time on the CSP-30 computer at the Speech Research and Development Facility of the Communications Security Engineering Office (DCW) at ESD.
One of the most difficult problems in speech analysis is reliable discrimination among silence, unvoiced speech, and voiced speech in signals that have been transmitted over a telephone line. Although several methods have been proposed for making this three-level decision, these schemes have met with only modest success. In this paper, a novel approach to the voiced-unvoiced-silence detection problem is proposed in which a spectral characterization of each of the three classes of signal is obtained during a training session, and an LPC distance measure and an energy distance are nonlinearly combined to make the final discrimination. This algorithm has been tested over conventional switched telephone lines, across a variety of speakers, and has been found to have an error rate of about 5 percent, with the majority of the errors (about two-thirds) occurring at the boundaries between signal classes. The algorithm is currently being used in a speaker-independent word recognition system.
A microprocessor realization of a linear predictive vocoder is presented. The goal was a low-power, low-cost, compact special-purpose realization of a narrow-band speech terminal. The resultant design is a general-purpose two-bus structure running at a 150 ns cycle time, using four AMD 2901 CPE chips as the basic signal processing element. This basic structure is augmented by a four-cycle multiplier to provide sufficient signal processing power. The design concessions that mark the linear predictive coding microprocessor (LPCM) as a special-purpose machine designed to be a speech terminal are limited I/O and limited memory. The present design requires 162 dual-in-line packages, dissipates less than 45 W, and occupies about [fraction illegible] ft³.
We show by theoretical argument and by experiment with both synthetic and real data that selection of an undriven segment of voiced speech for analysis by linear predictive coding (LPC) gives more accurate estimates of the poles of the vocal-tract model. In the case of voiced nasal phonemes, this technique provides a simple algorithm for separately determining the poles and the zeros in the model and illustrates the desirability of identifying the portions of the speech wave during which there is a significant driving input. A key problem which remains is the development of a practical algorithm for selecting such segments for analysis.
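The effect described above can be reproduced on synthetic data with a minimal sketch. It assumes a two-pole all-pole model excited by a periodic unit impulse train and uses a plain least-squares (covariance-style) predictor; the function names, the 8 kHz sampling rate, the 700 Hz resonance, and the 40-sample pitch period are illustrative choices, not values from the paper. When no excitation pulse falls on a predicted sample, the pole estimates are essentially exact; when the analysis segment straddles a pulse, they are biased.

```python
import numpy as np

def lpc_covariance(segment, order):
    """Least-squares predictor: s[n] ~ sum_k a[k-1] * s[n-k], n = order..N-1."""
    s = np.asarray(segment, dtype=float)
    n_samples = len(s)
    # Column k-1 holds s[n-k] for every predicted sample n.
    X = np.column_stack([s[order - k : n_samples - k] for k in range(1, order + 1)])
    coeffs, *_ = np.linalg.lstsq(X, s[order:], rcond=None)
    return coeffs

def synth_voiced(a1, a2, period, n_samples):
    """Two-pole filter driven by a unit impulse every `period` samples."""
    s = np.zeros(n_samples)
    for n in range(n_samples):
        excitation = 1.0 if n % period == 0 else 0.0
        s1 = s[n - 1] if n >= 1 else 0.0
        s2 = s[n - 2] if n >= 2 else 0.0
        s[n] = excitation + a1 * s1 + a2 * s2
    return s

# Hypothetical resonance: radius 0.97 at 700 Hz, 8 kHz sampling.
radius, theta = 0.97, 2.0 * np.pi * 700.0 / 8000.0
a_true = np.array([2.0 * radius * np.cos(theta), -radius**2])
signal = synth_voiced(a_true[0], a_true[1], period=40, n_samples=200)

# Pulses occur at n = 80, 120, ...; predicted samples start 2 after the slice start.
a_undriven = lpc_covariance(signal[81:120], 2)   # predicts n = 83..119, no pulse
a_driven = lpc_covariance(signal[100:140], 2)    # predicts n = 102..139, pulse at 120

undriven_err = float(np.max(np.abs(a_undriven - a_true)))
driven_err = float(np.max(np.abs(a_driven - a_true)))
print("undriven coefficient error:", undriven_err)
print("driven coefficient error:  ", driven_err)
```

On real speech the excitation instants are not known in advance, which is the segment-selection problem the abstract identifies as still open.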
Linear prediction is a generally accepted method for obtaining all-pole speech representations. However, in many situations (e.g., nasalization studies) spectral zeros are important and a more general modeling procedure is required. Unfortunately, the need for pitch synchronization has limited the success of available techniques. This paper explores a novel approach to pole-zero analysis, called homomorphic prediction, which seems to avoid the synchronization problem. A minimum-phase estimate of the vocal-tract impulse response is obtained by homomorphic filtering of the speech waveform. Such a signal, by definition, has a known time registration. Linear prediction is applied to this waveform to identify its poles. The LPC "residual" (error signal) is computed by inverse filtering. This signal contains the information about the zeros. Its z-transform is then approximated by a polynomial either through a weighted least-squares procedure (homomorphic prediction, using Shanks' method of finding zeros), or by spectral inversion followed by a second pass of LPC (homomorphic prediction involving "inverse LPC"). Results of a preliminary evaluation on real and synthetic speech are presented.
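The first two stages of this pipeline (minimum-phase estimate, then linear prediction for the poles) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard real-cepstrum folding construction for the minimum-phase sequence, an optional low-time lifter whose cutoff `n_lifter` is a hypothetical parameter, and a plain least-squares predictor. The zero-estimation stages (Shanks' method or inverse LPC) are not covered.

```python
import numpy as np

def minimum_phase_estimate(x, n_fft, n_lifter=None):
    """Minimum-phase sequence with the same magnitude spectrum as x.

    n_fft must be even. If n_lifter is given, cepstral samples at index
    n_lifter and beyond are zeroed (low-time liftering).
    """
    log_mag = np.log(np.abs(np.fft.fft(x, n_fft)) + 1e-12)  # small floor avoids log(0)
    real_cepstrum = np.fft.ifft(log_mag).real
    half = n_fft // 2
    folded = np.zeros(n_fft)
    folded[0] = real_cepstrum[0]
    folded[1:half] = 2.0 * real_cepstrum[1:half]   # fold negative quefrencies forward
    folded[half] = real_cepstrum[half]
    if n_lifter is not None:
        folded[n_lifter:] = 0.0
    return np.fft.ifft(np.exp(np.fft.fft(folded))).real

def lpc_least_squares(s, order):
    """Predictor coefficients a with s[n] ~ sum_k a[k-1] * s[n-k]."""
    s = np.asarray(s, dtype=float)
    n_samples = len(s)
    X = np.column_stack([s[order - k : n_samples - k] for k in range(1, order + 1)])
    coeffs, *_ = np.linalg.lstsq(X, s[order:], rcond=None)
    return coeffs

def two_pole_impulse_response(a1, a2, n_samples):
    """Impulse response of 1 / (1 - a1 z^-1 - a2 z^-2)."""
    h = np.zeros(n_samples)
    for n in range(n_samples):
        h[n] = (1.0 if n == 0 else 0.0)
        h[n] += a1 * h[n - 1] if n >= 1 else 0.0
        h[n] += a2 * h[n - 2] if n >= 2 else 0.0
    return h
```

As a usage example, time-reversing a minimum-phase impulse response destroys its time registration but leaves the magnitude spectrum unchanged, so `minimum_phase_estimate` applied to the reversed sequence should return the original response, from which `lpc_least_squares` recovers the pole coefficients.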
The LPC prediction error provides one measure of the success of linear prediction analysis in modeling a speech signal. Although a great deal is known about the properties of the prediction error, relatively little has been published about its variation as a function of the position of the analysis frame. In this paper it is shown that a fairly substantial variation in the prediction error is obtained within a single frame (i.e., 10 ms), independent of the analysis method (covariance, autocorrelation, or lattice). The implication of this result is that standard methods of LPC analysis may be inadequate for some applications. This is because the error signal is generally sampled uniformly at a low rate (on the order of 100 Hz), which can lead to aliased results because of the variation of the error signal within the frame. For applications such as word recognition with frame-to-frame distance calculations using the prediction error, the errors due to uniform sampling can accrue. For speech synthesis applications, the effect of uniform sampling of the error signal is a small but noticeable roughness in the synthetic speech. Various techniques for reducing the intraframe variation of the prediction error are discussed.
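The dependence of the prediction error on frame position can be observed with a small experiment. The sketch below is an illustration under assumed conditions rather than the paper's experiment: it uses the autocorrelation method with a Hamming window and the Levinson-Durbin recursion on a synthetic two-pole signal, with a hypothetical 8 kHz sampling rate, a 100 Hz pitch (80 samples), a 20 ms analysis window, and the window start moved one sample at a time across a 10 ms span.

```python
import numpy as np

def normalized_prediction_error(frame, order):
    """Autocorrelation-method LPC error divided by the frame energy r[0]."""
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(windowed)
    r = np.array([windowed[: n - k] @ windowed[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Levinson-Durbin step: reflection coefficient, then coefficient update.
        acc = r[i] + a[1:i] @ r[i - 1 : 0 : -1]
        k = -acc / err
        updated = a.copy()
        updated[1:i] = a[1:i] + k * a[i - 1 : 0 : -1]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return err / r[0]

def synth_voiced(a1, a2, period, n_samples):
    """Two-pole filter driven by a unit impulse every `period` samples."""
    s = np.zeros(n_samples)
    for n in range(n_samples):
        excitation = 1.0 if n % period == 0 else 0.0
        s1 = s[n - 1] if n >= 1 else 0.0
        s2 = s[n - 2] if n >= 2 else 0.0
        s[n] = excitation + a1 * s1 + a2 * s2
    return s

radius, theta = 0.97, 2.0 * np.pi * 700.0 / 8000.0
signal = synth_voiced(2.0 * radius * np.cos(theta), -radius**2, period=80, n_samples=800)

frame_len = 160                                   # 20 ms at the assumed 8 kHz rate
errs = np.array([
    normalized_prediction_error(signal[start : start + frame_len], 2)
    for start in range(320, 400)                  # 80 start positions = 10 ms
])
print("min / max normalized error:", errs.min(), errs.max())
```

Sampling `errs` at a single position per 10 ms frame, as a fixed-rate analyzer would, keeps only one of these 80 values, which is the source of the aliasing discussed above.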