This paper describes a digital processing method for modifying tone contrast, defined as the difference in frequency between the peaks and valleys of pitch curves in natural utterances. Speech signals with modified tones were presented to hearing-impaired Chinese listeners, who were asked to identify Mandarin words from four alternatives. Speech with enhanced tone contrast yielded moderate gains in the percentage of correctly identified words compared with unmodified speech, while reduced tone contrast generally lowered the percentage of correct identifications. These findings support the assertion that a hearing aid with tone modification is effective for hearing-impaired Chinese listeners.
In this paper, we introduce an auto-regressive moving average (ARMA) lattice model for speech modeling. The speech characteristics are modeled and expressed as lattice reflection coefficients for classification. A self-organizing map (SOM) is used to build codebooks for classifying and recognizing the lattice reflection coefficients. Experimental results on an isolated-word speech database of 10 words/names indicate that the ARMA lattice model achieves better recognition performance than the conventional auto-regressive (AR) model.
We present a new binomial sine pulse (BSP) excitation signal for linear prediction-based speech codecs. The BSP excitation signal is a sine wave whose amplitude is modulated by a binomial signal. The binomial signal describes the various trends of excitation signals within a pitch period, and the pulsatance of the BSP excitation signal coincides with the vibration frequency of the vocal folds. In the experiments, processing proceeds frame by frame, and the same excitation signal is placed at every pitch excitation moment in a frame. Speech codecs based on the BSP excitation have the advantages of low complexity and low delay. Experimental results show that such a codec can provide highly intelligible synthesized speech below 3 kbps.
We present an MPEG slice layer model for VBR encoded video using linear predictive coding (LPC) and generalized periodic Markov chains. Each slice position within an MPEG frame is modeled using an LPC autoregressive function. The selection of the particular LPC function is governed by a generalized periodic Markov chain; one chain is defined for each I, P, and B frame type. The model is sufficiently modular in that sequences which exclude B frames can eliminate the corresponding Markov chain. We show that the model matches the pseudo-periodic autocorrelation function quite well. We present simulation results of an asynchronous transfer mode (ATM) video transmitter using a FIFO queue and measure the average cell delay. Simulation results showed good agreement with results obtained using actual traces as sources.
A linear predictive coding (LPC) analysis scheme which is applicable to speech coding is proposed. The analysis method, called interpolative LPC (ILPC) analysis, estimates the spectral envelope by incorporating the interpolation characteristics into the LPC analysis. The ILPC analysis reduces average spectral distortion and the percentage of outlier frames, compared with the conventional LPC analysis followed by linear interpolation.
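For context, the conventional LPC analysis that ILPC is compared against is commonly computed with the autocorrelation method and the Levinson-Durbin recursion. The sketch below shows only that conventional baseline, not the interpolative variant; the function names and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions made for illustration.

```python
def autocorrelation(frame, order):
    """Return autocorrelation lags r[0..order] of one (already windowed) frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation lags r[0..order].

    Returns (a, reflection, err), where a = [1, a1, ..., ap] are the
    coefficients of A(z), reflection holds the coefficient k_m of each
    stage, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflection = []
    for m in range(1, order + 1):
        # Correlation between the current predictor and lag m.
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        reflection.append(k)
        prev = a[:]
        # Order update of the predictor coefficients.
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a, reflection, err
```

A speech coder would typically run this once per frame and then quantize and interpolate the resulting parameters between frames; that interpolation step is what the ILPC analysis takes into account.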
A method for optimising LPC filters in linear prediction based speech coders is described. The optimisation process compensates for errors incurred through coding the excitation signal, providing an improvement in the quality of the decoded speech, with no increase in bit rate.
This work describes a new version of the decimation-in-degree (DID) transformation used by Wu and Chen [1] as part of their procedure for computing line spectrum pair (LSP) coefficients. This new version eliminates all nontrivial multiplications, requires fewer total operations (additions and multiplications), and is performed in place, eliminating a memory buffer.
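As background on where LSP coefficients come from, the sketch below builds the standard sum and difference polynomials P(z) and Q(z) from an LPC polynomial A(z); the LSP frequencies are the angles of their roots on the unit circle. This is not the DID transformation described above, and the function name is a hypothetical chosen for illustration.

```python
def lsp_polynomials(a):
    """Build the LSP sum and difference polynomials from LPC coefficients.

    a = [1, a1, ..., ap] holds the coefficients of A(z) in powers of z^-1.
    P(z) = A(z) + z^-(p+1) A(1/z) is palindromic and
    Q(z) = A(z) - z^-(p+1) A(1/z) is antipalindromic.
    """
    extended = list(a) + [0.0]   # A(z) padded to degree p + 1
    mirrored = extended[::-1]    # coefficients of z^-(p+1) A(1/z)
    p_poly = [x + y for x, y in zip(extended, mirrored)]
    q_poly = [x - y for x, y in zip(extended, mirrored)]
    return p_poly, q_poly
```

Because of these symmetries, root finding can be reduced to lower-degree polynomials, which is the stage where degree-reducing transformations such as DID apply.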
In this letter, we demonstrate that the commonly assumed Arrhenius law is inconsistent with extrapolation of data-retention time-to-failure of nonvolatile memories in highly accelerated life tests. We argue that the logarithm of the retention time, log(t_R), varies linearly with temperature T rather than with 1/T as commonly assumed, yielding an important reduction in the extrapolated time-to-failure. Extensive experimental results demonstrate the physical consistency of the new model. In particular, data retention of EPROM devices and leakage current of interpoly dielectric and gate oxide have been investigated over a wide range of temperatures. Finally, it is shown that our model reconciles seemingly controversial activation energy data from the literature.
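The effect of the two assumptions on extrapolation can be illustrated with a small sketch. It fits each model through the same two accelerated-test points and extrapolates to a lower operating temperature. The temperatures and retention times are hypothetical values chosen for illustration, not data from the letter, and the function names are assumptions.

```python
import math


def fit_two_points(x1, y1, x2, y2):
    """Return (slope, intercept) of the straight line through two points."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def extrapolate_arrhenius(temp1_k, tr1, temp2_k, tr2, target_k):
    """Extrapolate retention time assuming log(t_R) is linear in 1/T."""
    slope, intercept = fit_two_points(
        1.0 / temp1_k, math.log(tr1), 1.0 / temp2_k, math.log(tr2)
    )
    return math.exp(intercept + slope / target_k)


def extrapolate_linear_t(temp1_k, tr1, temp2_k, tr2, target_k):
    """Extrapolate retention time assuming log(t_R) is linear in T."""
    slope, intercept = fit_two_points(
        temp1_k, math.log(tr1), temp2_k, math.log(tr2)
    )
    return math.exp(intercept + slope * target_k)


# Hypothetical accelerated-test data: (temperature in K, retention time in h).
HOT = (573.0, 100.0)
WARM = (523.0, 1000.0)
OPERATING_K = 398.0  # hypothetical operating temperature

arrhenius_hours = extrapolate_arrhenius(*WARM, *HOT, OPERATING_K)
linear_t_hours = extrapolate_linear_t(*WARM, *HOT, OPERATING_K)
```

Since 1/T is a convex function of T, the Arrhenius fit lies above the straight line in T outside the measured temperature range, so the linear-in-T model gives the shorter extrapolated retention time whenever retention falls as temperature rises.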
This letter proposes a 4-kb/s multimode code-excited linear prediction (CELP) coder with pitch-synchronous extended excitation. Three modes are used for the short-term excitation, namely algebraic, extended, or stochastic excitation, together with an adaptive codebook for the long-term excitation. Comparisons with the FS-1016 and ITU-T G.723.1 coders show a performance level between these standards.
In this paper, a new feature mapping scheme is presented to cope with environmental and target signature changes in underwater target classification. A wavelet packet-based feature extraction scheme is used in conjunction with the linear prediction coding (LPC) scheme as the front-end processor. The core of the system is the adaptive feature mapping subsystem, which minimizes the classification error of the classifier. The extracted feature vector is mapped by the resultant transformation matrix in such a way that the mapped version remains invariant to environmental and sensory changes. The feedback to the adaptation mechanism is provided by a k-nearest neighbor classifier. Test results on 40 kHz linear FM acoustic backscattered data collected for six different objects are presented. The effectiveness of the adaptive system versus a nonadaptive one is demonstrated for several signal-to-noise ratio (SNR) conditions.