The paper describes an architecture for multi-channel and multi-modal applications. First, the design problem is explored, and a system that can handle multi-modal interaction and delivery of Internet content is proposed. The focus is on selected development aspects and how they are addressed using state-of-the-art tools. The various components are defined and described in detail. Finally, conclusions and a view of future work on the evolution of such systems are given.
We describe a method for automatically learning a parser from labeled, bracketed corpora that results in a fast, robust, lightweight parser that is suitable for real-time dialog systems and similar applications. Unlik...
We investigate the utility of right-context (look-ahead information) in incremental left-to-right language models with word sense disambiguation, and discover somewhat unexpectedly that using right-context in addition...
Over the years, many proposals have been made to incorporate assorted types of feature in language models. However, discrepancies between training sets, evaluation criteria, algorithms, and hardware environments make ...
This paper uses an information-based approach to conduct feature types selection for language modeling in a systematic manner. We describe a quantitative analysis of the information gain and the information redundancy...
In this paper we report results of an investigation into English-Japanese Cross-language Information Retrieval (CLIR) comparing a number of query translation methods. Results from experiments using the standard BMIR-J...
Authors:
Wu, Dekai; Wong, Hongsing (HKUST)
Human Language Technology Center, Department of Computer Science, University of Science and Technology, Clear Water Bay, Hong Kong
We introduce a stochastic grammatical channel model for machine translation that synthesizes several desirable characteristics of both statistical and grammatical machine translation. As with the pure statistical tra...
This paper describes extensions and improvements to IBM's large vocabulary continuous speech recognition (LVCSR) system for transcription of broadcast news. The recognizer uses an additional 35 hours of training data beyond that used in the 1996 Hub4 evaluation. It includes a number of new features: an optimal feature space for acoustic modeling (in training and/or testing), filler-word modeling, Bayesian information criterion (BIC) based segment clustering, an improved implementation of iterative MLLR, and 4-gram language models. Results using the 1996 DARPA Hub4 evaluation data set are presented.
This paper describes some of the main problems and issues specific to the transcription of broadcast news, along with some of the methods for solving them that have been incorporated into the IBM Large Vocabulary Continuous Speech Recognition System.