In the last decade, the statistical approach has found widespread use in machine translation for both written and spoken language and has had a major impact on translation accuracy. The goal of this paper is to cover the state of the art in statistical machine translation. We revisit the underlying principles of the statistical approach to machine translation and summarize the progress that has been made over the last decade.
In this paper, we consider the use of multiple acoustic features of the speech signal for robust speech recognition. We investigate the combination of various auditory-based features (mel frequency cepstrum coefficients, perceptual linear prediction, etc.) and articulatory-based features (voicedness). Features are combined using techniques based on linear discriminant analysis and on log-linear model combination. We describe the two feature combination techniques and compare the experimental results. Experiments performed on the large-vocabulary task VerbMobil II (German conversational speech) show that the accuracy of automatic speech recognition systems can be improved by combining different acoustic features.
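As a rough illustration of the log-linear model combination mentioned in this abstract, the sketch below combines per-stream log-likelihoods with one weight per feature stream and renormalizes over the competing hypotheses. It shows only the generic form of a log-linear combination, not the paper's actual system; the function name `log_linear_combine`, the hypothesis labels, the scores, and the weights are all hypothetical.

```python
import math


def log_linear_combine(log_scores, weights):
    """Combine per-stream log-likelihoods log-linearly and normalize.

    log_scores: dict mapping a hypothesis label to a list of log-likelihoods,
                one per feature stream (hypothetical values, not real data).
    weights:    list of stream weights (lambda_i), one per feature stream.
    Returns a dict mapping each hypothesis to its normalized posterior.
    """
    # Weighted sum of log-likelihoods for every hypothesis.
    combined = {
        hyp: sum(w * s for w, s in zip(weights, scores))
        for hyp, scores in log_scores.items()
    }
    # Log-sum-exp normalization, shifted by the maximum for numerical stability.
    top = max(combined.values())
    log_norm = top + math.log(sum(math.exp(v - top) for v in combined.values()))
    return {hyp: math.exp(v - log_norm) for hyp, v in combined.items()}


# Two hypotheses scored by two feature streams (illustrative numbers only).
scores = {"a": [-1.0, -3.0], "b": [-2.0, -1.0]}
equal_weights = log_linear_combine(scores, [1.0, 1.0])
first_stream_only = log_linear_combine(scores, [1.0, 0.0])
```

With equal weights the second stream tips the decision toward hypothesis `b`, while setting its weight to zero leaves the first stream in charge and `a` wins. In a real recognizer the weights would be tuned on held-out data.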
In this paper, three different voicing features are studied as additional acoustic features for continuous speech recognition. The feature based on the harmonic product spectrum is extracted in the frequency domain, while the methods based on autocorrelation and on the average magnitude difference operate in the time domain. The algorithms produce a measure of voicing for each time frame. The voicing measure was combined with the standard Mel Frequency Cepstral Coefficients (MFCC) using linear discriminant analysis to choose the most relevant features. Experiments have been performed on small- and large-vocabulary tasks. The three voicing measures, each combined with MFCCs, resulted in similar improvements in word error rate: up to 14% on the small-vocabulary task and up to 6% on the large-vocabulary task, relative to using MFCC alone with the same overall number of parameters in the system.
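To make the idea of a per-frame voicing measure concrete, here is a minimal sketch of one common time-domain variant: the peak of the normalized autocorrelation over lags that correspond to a plausible pitch range. It is an assumption-laden illustration rather than the paper's exact algorithm; the function name `autocorr_voicing`, the 50-400 Hz pitch range, and the frame length are all hypothetical choices.

```python
import math
import random


def autocorr_voicing(frame, sample_rate, f0_min=50.0, f0_max=400.0):
    """Return a voicing measure for one frame of samples.

    The measure is the largest normalized autocorrelation found among lags
    that correspond to pitch frequencies between f0_min and f0_max
    (both assumed defaults). Values near 1 suggest a periodic (voiced) frame.
    """
    n = len(frame)
    if n < 2 or not any(frame):
        return 0.0  # empty or silent frame: treat as unvoiced
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(n - 1, int(sample_rate / f0_min))
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        head = frame[: n - lag]  # samples x[t]
        tail = frame[lag:]       # samples x[t + lag]
        corr = sum(a * b for a, b in zip(head, tail))
        # Normalize by the energies of the two overlapping segments.
        denom = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        if denom > 0.0:
            best = max(best, corr / denom)
    return best


# A 100 Hz sine (strongly periodic) versus uniform noise, 50 ms at 8 kHz.
rate = 8000
sine = [math.sin(2.0 * math.pi * 100.0 * t / rate) for t in range(400)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(400)]
voiced_score = autocorr_voicing(sine, rate)
unvoiced_score = autocorr_voicing(noise, rate)
```

The periodic frame scores close to 1 and the noise frame scores much lower. A scalar of this kind could then be appended to the MFCC vector before a transform such as linear discriminant analysis.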
This paper is based on the work carried out in the framework of the Verbmobil project, which is a limited-domain speech translation task (German-English). In the final evaluation, the statistical approach was found to...