A technique is described for recognition of contours in a spectrogram associated with a given string of events by means of dynamic models expressing relationships between frequency as the dependent variable and time as the independent variable.
Although 2400 bps vocoders based upon linear predictive coding have produced speech intelligibility scores as high as 90% in a quiet laboratory setting, few actual system measurements have been made in noisy, stressful military environments. This paper describes LPC vocoder performance in high acoustic noise environments and when the speaker is subjected to stress, vibrations and accelerations. Measurements were made on military platforms which included ships, conventional aircraft, helicopters, tracked vehicles and wheeled vehicles; acoustic noise levels varied from 70 to 125 dB Sound Pressure Level. (1)
A novel vocoder concept is presented which is based on discrete-time equivalents of the uniform acoustic tube. Recently proposed models of lossy sections of different lengths are cascaded to represent the vocal tract. Results comparable to a standard 12-section LPC vocoder are achieved by cascading typically 5 tube sections of different lengths. Major savings are derived in terms of the hardware complexity during speech synthesis and in terms of the data rate during speech transmission. The concept can also be used for modeling the nasal tract by interconnecting a discrete-time acoustic tube system to the vocal tract model via a proper adaptor. So far, only losses of the low-loss type are considered. The analysis part of the vocoder is based on a standard autocorrelation LPC analysis followed by an additional approximation stage yielding a reduction of the number of tube sections. The performance of the proposed vocoder is tested both by subjective evaluation and by comparison in the spectral and the acoustic tube domain.
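The "standard autocorrelation LPC analysis" mentioned above is commonly implemented with the Levinson-Durbin recursion, which yields reflection coefficients alongside the predictor coefficients. The following is a minimal sketch under that assumption, not the paper's implementation: the function names are hypothetical, the sign convention A(z) = 1 + a1 z^-1 + ... is one of several in use, and the paper's lossy, unequal-length tube sections and its approximation stage are not reproduced.

```python
def autocorrelation(signal, max_lag):
    """Short-time autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, k, err) where a[0..order] is the inverse-filter polynomial
    with a[0] == 1, k holds the reflection coefficients of each recursion
    stage, and err is the final prediction error energy.
    Assumes r[0] > 0 (a non-silent frame).
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        # Correlation of the current prediction error with lag m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        km = -acc / err
        k.append(km)
        prev = a[:]
        # Step-up update of the predictor polynomial.
        for i in range(1, m):
            a[i] = prev[i] + km * prev[m - i]
        a[m] = km
        err *= 1.0 - km * km
    return a, k, err
```

For example, `levinson_durbin(autocorrelation(frame, 12), 12)` would give the 12 reflection coefficients of a conventional 12-section model for a hypothetical `frame` of samples; reducing these to a handful of sections of different lengths is the additional step the abstract describes.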
This paper describes a unique design that attacks two problem areas of LPC: noise suppression and input level control, and real-time simulation/test. The noise level design uses algorithms to digitally process speech data before input to the LPC algorithm processor. The LPC processor described in the paper is based on a microprocessor design conceived specifically for speech. The noise suppression and level control algorithms are performed in a separate front-end processor that detects noise patterns and deletes them from the normal voice input. The operational hardware system is shown at the block diagram level, as is the particular simulation/test scheme. Test results are also described in this paper.
Linear predictive coding (LPC) has been successfully applied to the encoding of speech and other time series. It has been widely observed, however, that the performance of an LPC algorithm deteriorates rapidly in the presence of background noise. In this paper, we describe and discuss one approach to the identification of a time series corrupted by additive white noise. A common approach to this problem is to prefilter the noisy time series, and then to apply an estimation algorithm which treats the time series as if it were noise-free. We describe an alternative approach which involves modifying the time-series model at the outset to account for the presence of noise. An estimation algorithm is then developed for this modified model. We discuss the development of the model, the estimation algorithm, and some representative experimental results.
In this paper, a low-cost, compact, multi-user audio response system using microprocessors and LSI circuits is presented. The synthesis strategy is performed by a MOS microprocessor, which provides 12 LPC parameters to a microprogrammable digital filter. The synthetic message is produced by a concatenation of units which range from words to dyad-like elements, depending on the application. The synthesis algorithm is based on two main steps: the segment concatenation rules and the prosodic rules. The fundamental frequency contour is defined by the type of sentence, the punctuation marks and the position of the accents.
In speech analysis and synthesis based on linear prediction, it is a common assumption that predictor coefficients contain all the necessary spectral and phase information for accurate synthesis of the speech signal. However, even under the best circumstances, the synthetic speech sounds unnatural to the critical listener. Subjective tests reveal that spectral errors introduced by the linear prediction analysis techniques are a major source of unnatural sound quality in synthetic speech. This paper describes a modified analysis-synthesis procedure which, although relying on the basic LPC technique for analysis and synthesis, avoids spectral amplitude and phase distortions introduced by these techniques. In the new method, proper reproduction of the speech spectrum at the receiver is ensured by transmitting the short-time spectrum of the prediction residual to the receiver.
In a previous paper, it was shown that the presence of a combinatorial shifter in the data paths of a user-microprogrammable general-purpose computer could be used for fast fixed-point multiplication if a unique microsubroutine was created for each required multiplier. A sixth-order direct form IIR digital filter was implemented via this technique. This work has been extended so as to produce a tenth-order LPC k-parameter lattice synthesizer software system which executes in about 60% of real time. The remaining CPU time may be allocated for k-parameter manipulation, as in synthesis-by-rule algorithms, or for unrelated computation. The approach used is novel since it involves dynamic analysis of k-parameters and creation of the microsubroutines required for synthesis during each pitch period. The results suggest that a high-speed shifter embedded in an otherwise conventional micromachine architecture is useful for practical, real-time digital signal processing applications.
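A k-parameter lattice synthesizer of the kind named above can be sketched as an all-pole lattice filter driven by an excitation signal. This is a floating-point illustration only, assuming the textbook lattice recursion with the sign convention A(z) = 1 + a1 z^-1 + ...; the function name is hypothetical, and the paper's microcoded shift-based fixed-point multiplication and per-pitch-period microsubroutine generation are not modeled.

```python
def lattice_synthesize(excitation, k):
    """All-pole lattice synthesis filter.

    excitation: input samples (e.g. pulses or noise).
    k: reflection coefficients k[0..M-1] for stages 1..M; each |k| < 1
       is required for a stable filter.
    Returns the synthesized output samples.
    """
    order = len(k)
    # delayed[m] holds the backward error b_m[n-1] from the previous sample.
    delayed = [0.0] * order
    out = []
    for e in excitation:
        f = e  # forward error at the top stage, f_M[n]
        new_b = [0.0] * (order + 1)
        for m in range(order, 0, -1):
            # f_{m-1}[n] = f_m[n] - k_m * b_{m-1}[n-1]
            f = f - k[m - 1] * delayed[m - 1]
            # b_m[n] = b_{m-1}[n-1] + k_m * f_{m-1}[n]
            new_b[m] = delayed[m - 1] + k[m - 1] * f
        new_b[0] = f  # b_0[n] equals the output sample
        delayed = new_b[:order]
        out.append(f)
    return out
```

A tenth-order synthesizer as described would call this with ten coefficients per frame. Each stage costs two multiplications per sample, which is where a fast multiplier, such as the shifter-based scheme in the abstract, would matter.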
A speaker-independent, isolated word recognition system is proposed which is based on the use of multiple templates for each word in the vocabulary. The word templates are obtained from a statistical clustering analysis of a large data base consisting of 100 replications of each word (i.e., once by each of 100 talkers). The recognition system, which uses telephone recordings, is based on an LPC analysis of the unknown word, dynamic time warping of each reference template to the unknown word (using the Itakura LPC distance measure), and the application of a K-nearest neighbor (KNN) decision rule to lower the probability of error. Results are presented on two test sets of data which show error rates that are comparable to, or better than, those obtained with speaker-trained, isolated word recognition systems.
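The recognition pipeline described above (dynamic time warping of each template to the unknown word, followed by a KNN rule over multiple templates per word) can be sketched as follows. Several parts are assumptions made for illustration: the local distance is plain Euclidean rather than the Itakura LPC distance, the warping path is unconstrained, the KNN rule is read as "average the K smallest template distances per word and pick the smallest average", and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Placeholder frame distance; the paper uses the Itakura LPC distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(template, unknown, local=euclidean):
    """Accumulated distance along the best time alignment of two frame sequences."""
    n, m = len(template), len(unknown)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local(template[i - 1], unknown[j - 1])
            # Allow a match, or a stretch of either sequence.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_recognize(unknown, templates, k=2):
    """Pick the word whose k nearest templates have the lowest mean DTW distance.

    templates: dict mapping each word to a list of reference frame sequences.
    Words with no templates are skipped.
    """
    best_word, best_score = None, float("inf")
    for word, refs in templates.items():
        nearest = sorted(dtw_distance(ref, unknown) for ref in refs)[:k]
        if not nearest:
            continue
        score = sum(nearest) / len(nearest)
        if score < best_score:
            best_word, best_score = word, score
    return best_word
```

Averaging over the K nearest templates, rather than trusting the single best match, is what makes the decision less sensitive to one outlying template; with K = 1 the rule reduces to ordinary nearest-template matching.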