Author affiliation: AT&T Labs Research, Florham Park, NJ 07932, USA
Publication: IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING (IEEE Trans Speech Audio Process)
Year, volume, issue: 1999, Vol. 7, No. 6
Pages: 643-655
Subjects: EM algorithm; neural networks; nonlinear compensation; robust speech recognition; stochastic matching
Abstract: The performance of an automatic speech recognizer degrades when there exists an acoustic mismatch between the training and the testing conditions in the data. Though it is certain that the mismatch is nonlinear, its exact form is unknown. Tackling the problem of nonlinear mismatches is a difficult task that has not been adequately addressed before. In this paper, we develop an approach that uses nonlinear transformations in the stochastic matching framework to compensate for acoustic mismatches. The functional form of the nonlinear transformation is modeled by neural networks. We develop a new technique to train neural networks using the generalized EM algorithm. This technique eliminates the need for stereo databases, which are difficult to obtain in practical applications. The new technique is data-driven and hence can be used under a wide variety of conditions without a priori knowledge of the environment. Using this technique, we show that we can provide improvement under various types of acoustic mismatch; in some cases a 72% reduction in word error rate is achieved.