Effective rate variation during active speech is a necessary component of sophisticated variable rate speech compression schemes. Here, we use open-loop estimates of spectral shape to roughly determine signal bandwidth and bit allocations for variable rate encoding of spectral parameters. We analyze the application of the relative entropy functional to sets of line-spectrum pairs (LSPs) and transform-based generalized spectral distributions of Gibson et al. (1993). We present experimental results demonstrating that the relative entropy of these quantities can be used to good advantage in developing variable rate vector quantization schemes for the spectral envelope of speech signals.
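As a rough illustration of the relative entropy functional mentioned above, the following sketch treats a power spectrum as a discrete distribution by normalizing it to unit sum, then computes the relative entropy between two such distributions. The function names, the normalization step, and the `eps` guard are assumptions made for this example; the abstract does not say how the authors form their spectral distributions or how the result maps to a bit allocation.

```python
import math

def normalize(power):
    """Scale non-negative power values so they sum to 1 (assumed normalization)."""
    total = sum(power)
    if total <= 0:
        raise ValueError("power spectrum must have positive total energy")
    return [p / total for p in power]

def relative_entropy(p, q, eps=1e-12):
    """D(p || q) = sum_i p_i * log(p_i / q_i), in nats.

    eps is a hypothetical guard against log(0) and division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Toy spectra: a flat one and one with most energy in the lowest band.
flat = normalize([1.0, 1.0, 1.0, 1.0])
peaky = normalize([8.0, 1.0, 0.5, 0.5])

divergence = relative_entropy(peaky, flat)
```

A larger divergence from the flat reference indicates a more strongly shaped spectrum, which is the kind of open-loop cue a variable rate scheme could use.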
This paper considers the application of robust LPC parameter estimation in standard CELP coders for secure communications (USA FED STD 1016, 4.8 kb/s), with the aim of reducing LPC spectral degradation relative to standard LPC methods. The comparative experimental analysis uses three spectral measures related to the RMS log spectral measure: likelihood ratios, the cosh measure, and the cepstral distance. The experimental analysis justifies the use of robust LPC methods in standard CELP speech coders.
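For orientation, the RMS log spectral measure that the three measures above relate to can be sketched for two LPC models as follows. This is a minimal example under stated assumptions: unit gain, the sign convention A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..., a uniform frequency grid, and hypothetical function names. It is not the evaluation procedure used in the paper.

```python
import cmath
import math

def lpc_log_spectrum_db(a, n_freqs=256):
    """Log power spectrum in dB of the all-pole model 1/A(z) on [0, pi).

    Assumes A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ... and unit gain.
    """
    spectrum = []
    for i in range(n_freqs):
        w = math.pi * i / n_freqs
        A = 1 + sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(A)))
    return spectrum

def rms_log_spectral_distance(a_ref, a_test, n_freqs=256):
    """RMS difference in dB between the two log spectra on the same grid."""
    s_ref = lpc_log_spectrum_db(a_ref, n_freqs)
    s_test = lpc_log_spectrum_db(a_test, n_freqs)
    mean_sq = sum((x - y) ** 2 for x, y in zip(s_ref, s_test)) / n_freqs
    return math.sqrt(mean_sq)

# Toy first-order models standing in for a reference and a degraded estimate.
a_reference = [-0.9]
a_estimate = [-0.85]
distance_db = rms_log_spectral_distance(a_reference, a_estimate)
```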
This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and segment boundaries that optimally match the spectrum. The main results of this paper are: (1) modeling formants by digital resonators allows a reliable estimation of formant frequencies; (2) digital resonators can be used efficiently in connection with dynamic programming; and (3) a recognition test with formant frequencies results in a string error rate of 4.8% on the adult corpus of the TI digit string database.
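To make the resonator idea concrete, here is a minimal sketch of the power response of a single textbook second-order digital resonator and a brute-force search for its spectral peak. The abstract does not give the exact resonator form, so the transfer function H(z) = 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2), the bandwidth-to-radius mapping, and all names below are assumptions. The dynamic programming search over segment boundaries is not shown.

```python
import cmath
import math

def resonator_power(freq_hz, center_hz, bandwidth_hz, fs_hz):
    """Power response |H|^2 of an assumed two-pole resonator at freq_hz."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)   # pole radius from bandwidth
    theta = 2.0 * math.pi * center_hz / fs_hz       # pole angle from centre frequency
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    denom = 1 - 2.0 * r * math.cos(theta) * z_inv + (r * r) * z_inv * z_inv
    return 1.0 / abs(denom) ** 2

def peak_frequency(center_hz, bandwidth_hz, fs_hz, step_hz=1.0):
    """Grid search for the frequency where the resonator response is largest."""
    n_steps = int(fs_hz / 2.0 / step_hz)
    grid = [i * step_hz for i in range(n_steps + 1)]
    return max(grid, key=lambda f: resonator_power(f, center_hz, bandwidth_hz, fs_hz))

# Hypothetical formant: 500 Hz centre, 80 Hz bandwidth, 8 kHz sampling rate.
peak_hz = peak_frequency(500.0, 80.0, 8000.0)
```

For a narrow bandwidth the peak lies close to the nominal centre frequency, which is why fitting such a resonator to a spectral segment yields a formant estimate.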
This paper presents an extended tube model of the vocal and nasal tract in which the velum is described by a three-port adaptor that can be reduced to a two-port adaptor. The transfer function is derived from this model and its properties are investigated, so that the model can be used for speech analysis of nasal and non-nasal sounds and can be implemented simply by inverse filtering. The losses in the vocal and nasal tract are also analysed, and the shape of the nasal tract is estimated.
Authors: D.J. Mashao, LEMS, Division of Engineering, Brown University, Providence, RI, USA
This paper is concerned with the search for an optimal feature-set for a speech recognition system. A better acoustic feature analysis that suitably enhances the semantic information in a consistent fashion can reduce the raw-score (no grammar) error rate significantly. A simple two-dimensional parameterized feature-set is proposed. The feature-set is compared against a standard mel-cepstrum, LPC-based feature-set in a talker-independent, connected-alphadigit, HMM-based recognizer. The results show that a particular combination of parameters yields a significantly lower error rate than the baseline mel-cepstrum, LPC-based feature-set.
A low-cost, concatenation-based speech synthesis system for German is described which combines minimal memory requirements with good intelligibility and high segmental and prosodic acceptability. This is achieved by the multiple use of "microsegments", stretches of speech signal varying in length from demi-phone to phone size. All prosodic structuring is carried out in the time domain.
Hot-carrier currents and degradation in MOSFETs with gate lengths from 0.8 μm down to 0.1 μm are studied. The variations of substrate and gate currents are investigated over a wide range of drain, gate, and substrate voltages. The results show that the physical mechanisms responsible for these two currents are substantially different. The substrate current is created by low-energy carriers heated by the pinch-off electric field, whereas the gate current is induced by high-energy carriers due to secondary heating at the drain-substrate junction. Finally, the correlations between photon emission phenomena and substrate current, and between hot-carrier-induced degradation and gate current, are highlighted.
A parametric coder of the Code Excited Linear Prediction (CELP) family is presented, aimed at coding digitised electrocardiograms (ECG). The various parts of an ECG are identified in order to assess the coder performance on a subjective scale. This assessment is used to adjust the CELP parameters to better fit the signal. Variants of CELP designed specifically for ECG coding are presented; these include a "copy" long-term predictor and a recursive search in the VQ codebook, both of which result in a variable bit rate. Using the percent root-mean-square distortion measure, the coder is shown to deliver results around 7.6% at a mean rate of 2.1 bits per sample. This is on par with or better than coders reported elsewhere. The coders presented are shown to be very robust.
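The percent root-mean-square distortion figure quoted above can be illustrated with a short sketch. It assumes the common definition PRD = 100 * sqrt(sum((x - y)^2) / sum(x^2)) with no mean removal; the abstract does not state which variant the authors use, and the function name and sample values are made up for the example.

```python
import math

def percent_rms_difference(original, reconstructed):
    """PRD in percent, assuming no baseline (mean) removal from the original."""
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    error_energy = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    signal_energy = sum(x ** 2 for x in original)
    if signal_energy == 0:
        raise ValueError("original signal has zero energy")
    return 100.0 * math.sqrt(error_energy / signal_energy)

# Toy samples: the last reconstructed value is off by one.
ecg_original = [1.0, 2.0, 3.0, 4.0]
ecg_decoded = [1.0, 2.0, 3.0, 5.0]
prd = percent_rms_difference(ecg_original, ecg_decoded)  # roughly 18.3
```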
A new robust parametric formulation of predictive coding is introduced. The linear prediction filter coefficients are transformed into a set of weighted line frequencies. The positive weights serve a dual role: they form a new set of parameters, and they indicate the relative importance of the associated line frequencies. This new representation is shown to be always stable under quantization.