In this paper, we introduce an auto-regressive moving average (ARMA) lattice model for speech modeling. Speech characteristics are expressed as lattice reflection coefficients, which serve as the features for classification. A self-organizing map (SOM) is used to build codebooks for classifying and recognizing the lattice reflection coefficients. Experimental results on an isolated-word speech database of 10 words/names indicate that the ARMA lattice model achieves better recognition performance than the conventional auto-regressive (AR) model.
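As a point of reference for the conventional AR baseline mentioned above, the following is a minimal sketch of how reflection coefficients can be obtained from autocorrelation values with the Levinson-Durbin recursion. It does not implement the ARMA lattice of the paper. The function names and the sign convention (predictor polynomial A(z) = 1 + a1 z^-1 + ... ) are assumptions made for illustration.

```python
def autocorrelation(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, k, err) for an AR model of the given order.

    a   : predictor polynomial coefficients, a[0] == 1 (assumed convention)
    k   : reflection coefficients, one per lattice stage
    err : final prediction error energy
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current forward error and the new lag.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        km = -acc / err
        k.append(km)
        # Order update of the predictor polynomial.
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + km * a[m - i]
        new_a[m] = km
        a = new_a
        err *= 1.0 - km * km
    return a, k, err
```

For a stable AR model every reflection coefficient has magnitude below one, which is one reason lattice parameters are convenient as classification features.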
The linear prediction method is one of the most frequently used methods of speech analysis. The covariance and autocorrelation methods of linear prediction often fail to analyze speech precisely because of the influence of the excitation source or fundamental frequency. To reduce the effect of the excitation source, various difference operations are usually employed as preprocessing. However, such preprocessing does not always work satisfactorily. We propose a new approach to LPC analysis, called the selective linear prediction method, which uses the speech data selectively so as to reject the data disturbed by the excitation source. The method aims to improve the accuracy of analysis. First, linear prediction is formulated using generalized inverse matrices. Then, a successive computation based on Givens' reduction is described. The selective computation, which plays an essential role in the method, owes its efficiency to Givens' reduction. Finally, the advantage of the proposed method is demonstrated by computer simulation using both synthetic and natural speech.
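The idea of solving the prediction equations row by row, while skipping disturbed samples, can be sketched with a sequential least-squares solver built on Givens rotations. This is an illustration under stated assumptions, not the authors' algorithm: the `keep` mask (which samples count as undisturbed) is supplied by the caller, the selected rows are assumed to have full column rank, and the names `givens_update` and `selective_lpc` are hypothetical.

```python
import math


def givens_update(R, row):
    """Rotate one augmented data row into the upper-triangular factor R, in place."""
    row = list(row)
    for i in range(len(row)):
        if row[i] == 0.0:
            continue
        r = math.hypot(R[i][i], row[i])
        c, s = R[i][i] / r, row[i] / r
        for j in range(i, len(row)):
            rij, xj = R[i][j], row[j]
            R[i][j] = c * rij + s * xj
            row[j] = -s * rij + c * xj


def selective_lpc(x, order, keep):
    """Least-squares predictor x[n] ~ sum_k w[k] * x[n-1-k].

    Only equations for samples n with keep[n] true are used.
    Returns (w, residual_energy).
    """
    p = order
    R = [[0.0] * (p + 1) for _ in range(p + 1)]
    for n in range(p, len(x)):
        if keep[n]:
            # Augmented row: past samples followed by the target sample.
            givens_update(R, [x[n - 1 - k] for k in range(p)] + [x[n]])
    # Back-substitution on the triangular system (assumes full rank).
    w = [0.0] * p
    for i in range(p - 1, -1, -1):
        acc = R[i][p] - sum(R[i][j] * w[j] for j in range(i + 1, p))
        w[i] = acc / R[i][i]
    return w, R[p][p] ** 2
```

Because each selected sample only adds one rotation pass over the triangular factor, dropping a sample costs nothing beyond skipping its row, which is consistent with the efficiency argument in the abstract.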
ISBN: 9781424463831; 9781424463855 (print)
In speech coding, segment vocoders offer good intelligibility at low bit rates. A segment vocoder has four basic components: 1) segmentation of input speech, 2) segment quantization, 3) residual quantization, and 4) synthesis of speech. Most segment vocoders use a recognition approach to segment quantization. In this paper, we take a different approach to segment quantization. The segmental unit is a syllable, and the segment codebook stores sequences of LPC vectors. During encoding, the speech segment is quantized using the sequence of LPC vectors that results in the smallest residual energy. PESQ scores indicate that this vocoder achieves better quality than a corresponding vocoder that uses a speech recognition framework.
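The minimum-residual-energy selection rule can be sketched as follows. This is only an illustration: it assumes that each codebook entry has the same number of frames as the input segment (the abstract does not say how length mismatches are handled), that each frame is inverse-filtered with zero history, and that LPC vectors follow the convention e[n] = x[n] + sum_i a[i] x[n-1-i]. The function names are hypothetical.

```python
def residual_energy(frame, lpc):
    """Energy of the inverse-filter output for one frame and one LPC vector."""
    energy = 0.0
    for n in range(len(frame)):
        e = frame[n]
        for i, a in enumerate(lpc):
            if n - 1 - i >= 0:
                e += a * frame[n - 1 - i]
        energy += e * e
    return energy


def quantize_segment(frames, codebook):
    """Return (index, energy) of the codebook entry with the smallest residual energy.

    frames   : list of frames (each a list of samples) forming one segment
    codebook : list of entries, each a list of LPC vectors, one per frame
    """
    best_index, best_energy = -1, float("inf")
    for idx, lpc_sequence in enumerate(codebook):
        if len(lpc_sequence) != len(frames):
            continue  # assumption: only length-matched entries are candidates
        total = sum(residual_energy(f, a) for f, a in zip(frames, lpc_sequence))
        if total < best_energy:
            best_index, best_energy = idx, total
    return best_index, best_energy
```

Selecting by residual energy ties the segment quantizer directly to the residual quantization stage that follows it, rather than to a recognition score.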
Summary form only given. A data compression technique has been developed for efficient communication and storage of seismic waveform data. The technique is also useful with similar kinds of continuous waveform data. It uses a special version of the linear predictive coding method, along with secondary bi-level sequence coding. The principal feature of this two-stage technique is that it provides exact, bit-for-bit recovery of the original data.
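Exact recovery from a predictive coder is possible when the predictor works on integers and the decoder repeats the same prediction. The sketch below shows only that first stage, with a fixed second-order integer predictor chosen for illustration; the abstract does not specify the predictor, and the secondary bi-level sequence coding stage is not shown.

```python
def encode(samples):
    """Integer residuals of the fixed predictor pred[n] = 2*x[n-1] - x[n-2]."""
    residuals = []
    for n, x in enumerate(samples):
        p1 = samples[n - 1] if n >= 1 else 0
        p2 = samples[n - 2] if n >= 2 else 0
        residuals.append(x - (2 * p1 - p2))
    return residuals


def decode(residuals):
    """Rebuild the samples by adding each residual to the same prediction."""
    out = []
    for n, r in enumerate(residuals):
        p1 = out[n - 1] if n >= 1 else 0
        p2 = out[n - 2] if n >= 2 else 0
        out.append(r + (2 * p1 - p2))
    return out
```

For smooth waveforms the residuals are small integers, which a second-stage entropy or sequence coder can then represent in fewer bits without any loss.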
In this paper, we propose a pitch synchronous addition method for LPC analysis that makes use of the periodicity of speech. It is shown that the method overcomes a difficulty of the noise reduction technique that subtracts the noise component from the autocorrelation function of speech, namely keeping the resulting LPC filter stable. The relation between the pitch period of speech and the improvement in signal-to-noise ratio achieved by the method is investigated. The simulation results show the effectiveness of the proposed method, especially for high-pitched speech.
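The basic operation can be illustrated as averaging consecutive pitch periods before LPC analysis. The sketch assumes a known, constant pitch period given as an integer number of samples, which the abstract does not state; the function name is hypothetical.

```python
def pitch_synchronous_add(x, period):
    """Average all complete pitch periods of x into a single period."""
    count = len(x) // period
    if count == 0:
        raise ValueError("signal is shorter than one pitch period")
    return [
        sum(x[m * period + n] for m in range(count)) / count
        for n in range(period)
    ]
```

Under the usual assumption of uncorrelated additive noise, averaging M periods leaves the periodic component unchanged and reduces the noise power by a factor of M. A shorter pitch period gives more periods within a fixed analysis interval, which may be one way to read the reported benefit for high-pitched speech.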
Authors: P. Prandom (LCAV, Ecole Polytechnique Fédérale de Lausanne, Switzerland); M. Goodwin (EECS, University of California, Berkeley, USA); M. Vetterli (LCAV, Ecole Polytechnique Fédérale de Lausanne, Switzerland)
The idea of optimal joint time segmentation and resource allocation for signal modeling is explored with respect to arbitrary segmentations and arbitrary representation schemes. When the chosen signal modeling techniques can be quantified in terms of a cost function which is additive over distinct segments, a dynamic programming approach guarantees the global optimality of the scheme while keeping the computational requirements of the algorithm sufficiently low. Two immediate applications of the algorithm to LPC speech coding and to sinusoidal modeling of musical signals are presented.
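When the cost is additive over segments, the best segmentation ending at sample j depends only on the best segmentations ending earlier, which is what makes dynamic programming applicable. The sketch below is a generic illustration with a toy cost (squared error of a constant fit plus a fixed per-segment penalty standing in for the resource cost); the cost function, the penalty, and all names are assumptions rather than the paper's formulation.

```python
def optimal_segmentation(n, segment_cost, max_len=None):
    """Minimize the sum of segment_cost(i, j) over partitions of [0, n).

    Returns (total_cost, segments), where segments is a list of (start, end)
    pairs with end exclusive.
    """
    inf = float("inf")
    best = [0.0] + [inf] * n   # best[j]: minimal cost of covering [0, j)
    prev = [0] * (n + 1)       # prev[j]: start of the last segment in that cover
    for j in range(1, n + 1):
        first = 0 if max_len is None else max(0, j - max_len)
        for i in range(first, j):
            c = best[i] + segment_cost(i, j)
            if c < best[j]:
                best[j], prev[j] = c, i
    segments = []
    j = n
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    segments.reverse()
    return best[n], segments


def make_cost(signal, penalty):
    """Toy additive cost: squared error of a constant fit plus a per-segment penalty."""
    def cost(i, j):
        seg = signal[i:j]
        mean = sum(seg) / len(seg)
        return sum((v - mean) ** 2 for v in seg) + penalty
    return cost
```

The search evaluates O(n^2) candidate segments (or O(n * max_len) with a length limit) instead of the exponential number of possible partitions, while still returning the global optimum for the given cost.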
The ARMA model provides an effective means for precise representation of the speech production process. A stable and accurate estimation of ARMA parameters from the speech signal has been shown to be possible by the SEARMA method. In this paper we propose an ARMA speech analysis-synthesis system based on the SEARMA method. The validity of the proposed system is investigated by means of both objective and subjective evaluation. The results reveal that the zeros in the spectrum contribute to the reduction of spectral distortion and to the improvement of the quality of synthetic speech.
A novel algorithm for codebook initialization for use with the K-means algorithm for codebook generation is presented. This algorithm was shown to result in better codebooks at considerably reduced generation time compared to Lloyd's algorithm and to other codebook initialization algorithms. The proposed algorithm generated codebooks with smaller maxima and smaller standard deviations of the codebook vector quantization (VQ) distortions. These outcomes were consistently observed with substantial margins for all cases considered.
An end-point detector for LPC speech using squared prediction error look-ahead and automatic/manual threshold determination is described. The detector is algorithmically simple, computationally efficient, and uses only one decision parameter. Preliminary tests indicate that it is relatively immune to transient pulses and various low-level noises, yet preserves low-level speech sounds such as weak fricatives to a significant extent under moderate noise conditions. Tests indicate that 93.8% of automatically determined endpoints agree to within two frames of manually determined endpoints. The detector is especially suitable for use in vector-quantization based LPC systems, where the squared prediction error is easily available.
Multiple pulse excited linear predictive coding (MPLPC) has recently received a great deal of attention in the literature as an attractive means of speech coding at data rates below 10 kbit/s. The existing approaches to MPLPC analysis arrive at the parameters for an all-pole model by minimizing the mean squared modeling error before attempting to find a set of pulses to excite the model. The strategy proposed here instead selects the all-pole parameters so as to concentrate the model excitation in a finite number of locations. The goal is to produce a maximally pulse-like residual as a result of the all-pole parameter estimation.