Linear prediction (LP) and the well-known fast algorithms of the Levinson type have widely proven their efficiency in various fields. However, these algorithms fail when one of the principal minors of the signal covariance matrix is singular (or close to singular). This paper investigates this singular case for both scalar and vector (multichannel) time series. It is shown that the standard lattice may be replaced by a convenient lossless lattice. Various techniques are proposed to leap over the singular case, and they are analysed in terms of generalized Cholesky factors. The paper then deals with the multichannel case, which is attacked by a reduction of redundant outputs leading back to the scalar singular case. The results are applicable to the design of robust LPC algorithms and to the use of ARMA models in antenna array processing.
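To make the failure mode concrete, here is a minimal sketch of the standard Levinson-Durbin recursion, not the lossless-lattice method of the paper. The function name `levinson_durbin` and the tolerance `eps` are illustrative assumptions. The recursion divides by the running prediction error power, which drops to zero exactly when a leading principal minor of the Toeplitz autocorrelation matrix is singular.

```python
def levinson_durbin(r, order, eps=1e-12):
    """Standard Levinson-Durbin recursion (illustrative sketch).

    r     : autocorrelation values r[0], ..., r[order]
    order : prediction order p
    eps   : relative threshold (an assumption) below which the
            prediction error power is treated as zero

    Returns (a, k, err) where a = [1, a1, ..., ap] is the
    prediction-error filter, k the reflection coefficients and
    err the final prediction error power.
    """
    a = [1.0]
    k = []
    err = float(r[0])
    for m in range(1, order + 1):
        # A vanishing error power means the next division is undefined:
        # this is the singular case the abstract refers to.
        if abs(err) <= eps * abs(r[0]):
            raise ValueError(f"prediction error vanished before order {m}")
        acc = sum(a[i] * r[m - i] for i in range(m))
        km = -acc / err
        a_ext = a + [0.0]
        # Order update: a_m[i] = a_{m-1}[i] + k_m * a_{m-1}[m - i]
        a = [a_ext[i] + km * a_ext[m - i] for i in range(m + 1)]
        k.append(km)
        err *= 1.0 - km * km
    return a, k, err
```

For example, the autocorrelation of a pure sinusoid at a quarter of the sampling rate, `[1.0, 0.0, -1.0, 0.0]`, yields a reflection coefficient of magnitude one at order 2, so a call with `order=3` raises `ValueError`.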
Although powerful device simulators are being developed, analytical models are still essential for depicting the underlying physical mechanisms. Recently, attention has been paid to a "unified" approach able to account for continuous MOSFET operation from weak to moderate and strong inversion. In this paper, we propose an original model which applies not only to bulk Si and partially depleted SOI MOSFETs, but also to ultrathin-film SOI transistors.
The final quality of a concatenation synthesis system is directly related to the continuity of the spectrum at the concatenation point. Owing to subjective auditory masking, minimizing the spectral distortion at the formant frequencies increases the quality significantly. We present, along with results concerning pitch marking, an algorithm capable of modifying the LPC envelope in a flexible way. This algorithm is the heart of a spectral smoothing module for a diphone-based linear prediction pitch-synchronous overlap-add (LP-PSOLA) concatenation system.
Experimental results are reported in the area of speech modification for the purpose of correcting vocal tract resonance disorders. The linear predictive coding (LPC) coefficients and residual corresponding to an input speech signal are calculated, and a combination of nonlinear frequency warping and amplitude scaling is applied to produce modified LPC coefficients. An output speech signal is then synthesized from these modified coefficients and the residual signal. The spectral envelope modification parameters are designed to adapt to the first two formant frequencies of the input speech. Cubic polynomial warping functions and amplitude scaling factors are found at vowel formant locations and are interpolated from these points over F1, F2 space. The results are compared with those obtained from frame-by-frame nonlinear spectral warping using a dynamic programming algorithm.
A system for the automatic translation of any Italian text into naturally fluent speech is presented. The system, planned for use in a reading machine for the blind, is built around a Phonological Processor (hence FP) and synthesizes speech by joining LPC-coded diphones. The FP maps the phonological rules of Italian into prosodic structures. Structural information is provided by hierarchical prosodic constituents such as Syllable (S), Metrical Foot (MF), Phonological Word (PW), and Intonational Group (IG). Phonological rules are applied onto these structures, such as the "letter-to-sound" rules, automatic word stress rules, internal stress hierarchy rules indicating secondary stress, external sandhi rules, phonological focus assignment rules, and logical focus assignment rules.
This paper presents a new technique for low-cost, robust voice compression at rates of 9.6 kbps or less. Our approach is based on applying two new concepts in voice compression: state-variable digital biquads and a punctured tree search algorithm. The digital biquads are used for real-time spectral analysis of speech. The simplified multipulse excitation is generated by the punctured tree search algorithm, which combines the conventional (M,L) tree search algorithm and data compression.
The line spectrum pair (LSP) representation of linear predictive coding coefficients is widely used in speech coding, speech recognition and other domains due to its desirable interpolation and quantization properties. Several methods proposed for calculating LSP parameters suffer from high computational complexity. This paper proposes an effective and efficient algorithm, APF, using the Aitken iterative method and polynomial synthetic division. LSP parameters are estimated by first obtaining a root of the Nth-order nonlinear equation with the Aitken iterative method, then reducing the degree with polynomial synthetic division, and finally solving the remaining quartic equation using Ferrari's solution. Theoretical analysis and experimental results show that the proposed algorithm has not only high precision but also low computational complexity.
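As background for what "calculating LSP parameters" involves, the sketch below forms the symmetric and antisymmetric polynomials P(z) and Q(z) from the LPC filter A(z) and locates their unit-circle roots. It deliberately uses a plain grid scan with bisection, not the APF algorithm (Aitken iteration, deflation, Ferrari's formula) of the paper. The name `lpc_to_lsf` and the parameters `grid` and `iters` are assumptions, and A(z) is assumed to be minimum phase so that all roots lie on the unit circle.

```python
import math

def lpc_to_lsf(a, grid=512, iters=60):
    """Line spectral frequencies (radians, in (0, pi)) of A(z).

    a = [1, a1, ..., ap] with A(z) = sum_i a[i] * z**(-i).
    Uses P(z) = A(z) + z**-(p+1) * A(1/z) and
         Q(z) = A(z) - z**-(p+1) * A(1/z).
    """
    n = len(a)                    # n = p + 1, the degree of P and Q
    ext = list(a) + [0.0]         # a_0, ..., a_p, a_{p+1} = 0
    sym = [ext[i] + ext[n - i] for i in range(n + 1)]    # P coefficients
    anti = [ext[i] - ext[n - i] for i in range(n + 1)]   # Q coefficients

    # After removing a linear phase factor, P is real on the unit
    # circle (cosine sum) and Q is purely imaginary (sine sum).
    def f_sym(w):
        return sum(c * math.cos((n / 2 - i) * w) for i, c in enumerate(sym))

    def f_anti(w):
        return sum(c * math.sin((n / 2 - i) * w) for i, c in enumerate(anti))

    def zeros(f):
        found = []
        # Open interval (0, pi): the trivial roots at z = 1 and z = -1
        # are excluded on purpose.
        ws = [math.pi * j / grid for j in range(1, grid)]
        for lo, hi in zip(ws, ws[1:]):
            flo, fhi = f(lo), f(hi)
            if flo == 0.0:
                found.append(lo)
            elif flo * fhi < 0.0:
                for _ in range(iters):        # bisection refinement
                    mid = 0.5 * (lo + hi)
                    fmid = f(mid)
                    if flo * fmid <= 0.0:
                        hi = mid
                    else:
                        lo, flo = mid, fmid
                found.append(0.5 * (lo + hi))
        return found

    return sorted(zeros(f_sym) + zeros(f_anti))
```

A grid scan costs many polynomial evaluations per frame, which is the kind of overhead that root-finding schemes with deflation aim to avoid. For a second-order filter the result can be checked by hand: the two frequencies satisfy cos(w) = (1 - a1 - a2)/2 and cos(w) = -(1 + a1 - a2)/2.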
Voice morphing is the process of gradually transforming the voice of a given speaker to that of another. The ability to change the speaker's individual characteristics and produce high-quality voices can be used in many applications. For example, in multimedia and video entertainment, voice morphing is just like its visual counterpart: while seeing a face gradually changing from one person's to another's, we can simultaneously hear the voice changing as well. Another application could be in forensic voice identification: creating a voice bank of different pitches, rates, and timbres to assist in recognition of the suspect's voice. In this study we present a new technique which enables the production of N intermediate voices that gradually change between the voices of two speakers, or of one voice signal that changes gradually. This technique is based on two components. The first is the creation of a 3D prototype waveform interpolation (PWI) surface from the residual error signal, which is estimated from LPC analysis, to produce a new intermediate excitation signal. The second is a representation of the vocal tract by a lossless tube area function, and interpolation of the two speakers' parameters.
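The vocal tract interpolation step can be illustrated with a small sketch. The abstract does not give the interpolation formula, so the code below is an assumption: it interpolates log area ratios, which are the logarithms of the ratios between adjacent section areas of a lossless tube model, and converts back to reflection coefficients. The function name `morph_reflection` and the weight `alpha` are hypothetical. Because the hyperbolic tangent maps any real value into (-1, 1), every intermediate filter stays stable.

```python
import math

def morph_reflection(k_src, k_tgt, alpha):
    """Blend two sets of reflection coefficients in the log-area-ratio domain.

    k_src, k_tgt : reflection coefficients of the two speakers, each |k| < 1
    alpha        : 0.0 gives the source voice, 1.0 the target voice
    """
    out = []
    for ks, kt in zip(k_src, k_tgt):
        # Log area ratio: log((1 + k) / (1 - k)) = 2 * atanh(k)
        g_src = 2.0 * math.atanh(ks)
        g_tgt = 2.0 * math.atanh(kt)
        g = (1.0 - alpha) * g_src + alpha * g_tgt
        out.append(math.tanh(g / 2.0))   # back to a reflection coefficient
    return out
```

Under this sketch, N intermediate voices would correspond to `alpha = n / (N + 1)` for `n = 1, ..., N`. The sign convention relating reflection coefficients to area ratios varies between texts, but the stability argument does not depend on it.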
The use of a parameter related to the vocal tract length of humans is investigated for a speaker recognition system. This parameter is a warping factor for the frequency spectrum, and it was determined from the vocal tract length normalization module of an automatic speech recognizer. A decision fusion approach was used to combine the output of the stand-alone speaker verification system with a separate classifier for the warping factor. The fused system exhibits a relative improvement of 23% compared to the stand-alone system.
An autoregressive spectral estimation method is developed to reduce the effect of noise in prediction coefficient estimation. The method solves for the prediction coefficients from a generalized Yule-Walker equation which is formed from the data and its generalized autocorrelation sequence. It provides several control parameters that allow the spectral estimator to combat unmodeled additive noise in the linear least-squares sense. Because the method uses the available information efficiently, a larger data size directly helps noise suppression.
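The abstract does not define the generalized autocorrelation sequence, so the sketch below shows a related textbook idea rather than the paper's estimator: the high-order (modified) Yule-Walker equations. Additive white noise only changes the lag-0 autocorrelation, so using lags p+1 to 2p keeps the biased term out of the equations. The names `solve_linear` and `high_order_yule_walker` are illustrative assumptions.

```python
def solve_linear(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(m, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(aug[row][col]))
        if aug[piv][col] == 0.0:
            raise ValueError("singular system")
        aug[col], aug[piv] = aug[piv], aug[col]
        for row in range(col + 1, n):
            f = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def high_order_yule_walker(r, p):
    """Estimate AR coefficients [a1, ..., ap] from noisy autocorrelations.

    r : autocorrelation values r[0], ..., r[2p] of signal plus white noise
    Uses r[k] + sum_i a_i * r[k - i] = 0 for k = p+1, ..., 2p,
    none of which involve the noise-biased r[0].
    """
    lags = range(p + 1, 2 * p + 1)
    rows = [[r[k - i] for i in range(1, p + 1)] for k in lags]
    rhs = [-r[k] for k in lags]
    return solve_linear(rows, rhs)
```

With exact autocorrelations of an AR(1) process plus white noise, the ordinary estimate `-r[1] / r[0]` is pulled toward zero by the inflated `r[0]`, while the high-order form `-r[2] / r[1]` is unaffected. With estimated autocorrelations the higher lags are noisier, which is one reason such methods benefit from longer data records.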