Search results - Inner Mongolia University Library

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: T. Deselaers, A. Criminisi, J. Winn, A. Agarwal; Microsoft Research Limited, Cambridge, UK; Human Language Technology and Pattern Recognition, RWTH Aachen University of Technology, Germany

A new method for localising and recognising hand poses and objects in real time is presented. This problem is important in vision-driven applications where it is natural for a user to combine hand gestures and real objects when interacting with a machine. Examples include using a real eraser to remove words from a document displayed on an electronic surface. In this paper the task of simultaneously recognising object classes, hand gestures and detecting touch events is cast as a single classification problem. A random forest algorithm is employed which adaptively selects and combines a minimal set of appearance, shape and stereo features to achieve maximum class discrimination for a given image. This minimal set leads to both efficiency at run time and good generalisation. Unlike previous stereo work, which explicitly constructs disparity maps, here the stereo matching costs are used directly as a visual cue and are only computed on demand, i.e. only for pixels where they are necessary for recognition. This leads to improved efficiency. The proposed method is assessed on a database of a variety of objects and hand poses selected for interacting on a flat surface in an office environment.

Keywords: Object detection; Pattern recognition; Shape; Cameras; Humans; Event detection; Costs; Visual databases; Hardware; Object recognition

The RWTH Arabic-to-English spoken language translation system

IEEE Workshop on Automatic Speech Recognition and Understanding

Authors: Oliver Bender, Evgeny Matusov, Stefan Hahn, Sasa Hasan, Shahram Khadivi, Hermann Ney; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, Aachen, Germany

ISBN: (print) 9781424413690; 1424413699

We present the RWTH phrase-based statistical machine translation system designed for the translation of Arabic speech into English text. This system was used in the Global Autonomous Language Exploitation (GALE) Go/No-Go Translation Evaluation 2007. Using a two-pass approach, we first generate n-best translation candidates and then rerank these candidates using additional models. We give a short review of the decoder as well as of the models used in both passes. We stress the difficulties of spoken language translation, i.e. how to combine the recognition and translation systems and how to compensate for missing punctuation. In addition, we cover our work on domain adaptation for the applied language models. We present translation results for the official GALE 2006 evaluation set and the GALE 2007 development set.

Keywords: Natural languages; Automatic speech recognition; Surface-mount technology; Vocabulary; Decoding; Speech analysis; Broadcasting; Humans; Pattern recognition; Computer science

Editorial: Knowledge engineering, semantics, and signal processing in audio-visual information retrieval

IEEE Transactions on Circuits and Systems for Video Technology, 2007, vol. 17, no. 3, pp. 257-258

Authors: Izquierdo, Ebroul; Zhang, Jian; Sikora, Thomas; Huang, Thomas S.; Department of Electrical Engineering, Queen Mary University of London, London E1 4NS, United Kingdom; Sydney, Australia; School of Computer Science and Engineering, University of New South Wales, Australia; Communication Systems Group, Technical University Berlin, Berlin, Germany; ITG, University of Illinois at Urbana-Champaign, United States; Department of Electrical and Computer Engineering, Coordinated Science Laboratory, United States; Image Formation and Processing Group, Beckman Institute for Advanced Science and Technology, United States; Institute's Major Research Theme Human Computer Intelligent Interaction, National Academy of Engineering, Chinese Academies of Engineering and Sciences, China; International Association of Pattern Recognition, Optical Society of America, United States

No abstract available

Keywords: Special issues and sections; Knowledge engineering; Information retrieval; Audio-visual systems

Patch-based object recognition using discriminatively trained Gaussian mixtures

2006 17th British Machine Vision Conference, BMVC 2006

Authors: Hegerath, Andre; Deselaers, Thomas; Ney, Hermann; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, D-52056 Aachen, Germany

ISBN: (print) 1904410146

We present an approach using Gaussian mixture models for part-based object recognition where spatial relationships of the parts are explicitly modeled and parameters of the generative model are tuned discriminatively. These extensions lead to great improvements in classification accuracy. Furthermore, we evaluate several improvements over our baseline system that incrementally improve the results, which compare favorably with other published results for the three Caltech tasks and the PASCAL evaluation 05 tasks.

Keywords: Object recognition

The RWTH Statistical Machine Translation System for the IWSLT 2006 Evaluation

3rd International Workshop on Spoken Language Translation, IWSLT 2006

Authors: Mauser, Arne; Zens, Richard; Matusov, Evgeny; Hasan, Saša; Ney, Hermann; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

We give an overview of the RWTH phrase-based statistical machine translation system that was used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2006. The system was ranked first with respect to the BLEU measure in all language pairs in which it was used. Using a two-pass approach, we first generate the N best translation candidates. The second pass consists of rescoring and reranking these candidates. We will give a description of the search algorithm as well as of the models used in each pass. We will also describe our method for dealing with punctuation restoration, in order to overcome the difficulties of spoken language translation. This work also includes a brief description of the system combination done by the partners participating in the European TC-Star project. © 2006 International Workshop on Spoken Language Translation, IWSLT 2006. All rights reserved.

Keywords: Computer aided language translation

N-Gram posterior probabilities for statistical machine translation

2006 Workshop on Statistical Machine Translation, WMT 2006, collocated with the HLT-NAACL 2006

Authors: Zens, Richard; Ney, Hermann; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior probabilities and show how these can be used to improve translation quality. Additionally, we will introduce a sentence length model based on posterior probabilities. We will show significant improvements on the Chinese-English NIST task. The absolute improvement in the BLEU score is between 1.1% and 1.6%. © HLT-NAACL *** right reserved.
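As a rough illustration of the idea of n-gram posterior probabilities, the sketch below estimates them from an n-best list. It is a minimal example under stated assumptions, not the paper's formulation: the function names, the toy n-best list, and the choice to define an n-gram's posterior as the summed posterior mass of the hypotheses containing it are all hypothetical.

```python
import math
from collections import defaultdict


def sentence_posteriors(log_scores):
    """Normalise hypothesis log scores over the n-best list (softmax)."""
    top = max(log_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def ngram_posteriors(nbest, n):
    """Map each n-gram to the posterior mass of hypotheses that contain it.

    nbest is a list of (tokens, log_score) pairs (an assumed input format).
    """
    posteriors = sentence_posteriors([score for _, score in nbest])
    result = defaultdict(float)
    for (tokens, _), p in zip(nbest, posteriors):
        # Use a set so each hypothesis counts an n-gram at most once.
        ngrams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for gram in ngrams:
            result[gram] += p
    return dict(result)


# Toy n-best list with made-up scores.
nbest = [
    ("the house is small".split(), math.log(0.5)),
    ("the house is little".split(), math.log(0.3)),
    ("a house is small".split(), math.log(0.2)),
]
bigram_post = ngram_posteriors(nbest, 2)
```

An n-gram shared by every hypothesis, such as ("house", "is") here, receives a posterior close to 1, while n-grams found only in low-scoring hypotheses receive little mass.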

Keywords: Machine translation

Discriminative reordering models for statistical machine translation

2006 Workshop on Statistical Machine Translation, WMT 2006, collocated with the HLT-NAACL 2006

Authors: Zens, Richard; Ney, Hermann; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

We present discriminative reordering models for phrase-based statistical machine translation. The models are trained using the maximum entropy principle. We use several types of features: based on words, based on word classes, based on the local context. We evaluate the overall performance of the reordering models as well as the contribution of the individual feature types on a word-aligned corpus. Additionally, we show improved translation performance using these reordering models compared to a state-of-the-art baseline system. © HLT-NAACL *** right reserved.

Keywords: Computer aided language translation

Acoustic feature combination for robust speech recognition

2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05

Authors: Zolnay, András; Schlüter, Ralf; Ney, Hermann; Computer Science Department, Human Language Technology and Pattern Recognition, RWTH Aachen University, 52056 Aachen, Germany

ISBN: (print) 0780388747

In this paper, we consider the use of multiple acoustic features of the speech signal for robust speech recognition. We investigate the combination of various auditory-based (Mel Frequency Cepstrum Coefficients, Perceptual Linear Prediction, etc.) and articulatory-based (voicedness) features. Features are combined using two techniques: one based on Linear Discriminant Analysis and one based on log-linear model combination. We describe the two feature combination techniques and compare the experimental results. Experiments performed on the large-vocabulary task VerbMobil II (German conversational speech) show that the accuracy of automatic speech recognition systems can be improved by the combination of different acoustic features. © 2005 IEEE.

Keywords: Feature extraction

The RWTH Phrase-based Statistical Machine Translation System

2nd International Workshop on Spoken Language Translation, IWSLT 2005

Authors: Zens, Richard; Bender, Oliver; Hasan, Saša; Khadivi, Shahram; Matusov, Evgeny; Xu, Jia; Zhang, Yuqi; Ney, Hermann; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

We give an overview of the RWTH phrase-based statistical machine translation system that was used in the evaluation campaign of the International Workshop on Spoken Language Translation 2005. We use a two-pass approach. In the first pass, we generate a list of the N best translation candidates. The second pass consists of rescoring and reranking this N-best list. We will give a description of the search algorithm as well as the models that are used in each pass. We participated in the supplied data tracks for manual transcriptions for the following translation directions: Arabic-English, Chinese-English, English-Chinese and Japanese-English. For Japanese-English, we also participated in the C-Star track. In addition, we performed translations of automatic speech recognition output for Chinese-English and Japanese-English. For both language pairs, we translated the single-best ASR hypotheses. Additionally, we translated Chinese ASR lattices. © 2005 International Workshop on Spoken Language Translation, IWSLT 2005.

Keywords: Speech recognition
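The second pass described above (rescoring and reranking an N-best list) can be sketched roughly as follows. This is an illustrative example only: the weighted-sum (log-linear) combination, the `rerank` function, the toy length model, and the weights are assumptions made for the sketch, not details given in the abstract.

```python
def rerank(nbest, extra_models, weights):
    """Rescore an n-best list with additional models and sort it.

    nbest: list of (hypothesis, first_pass_log_score) pairs.
    extra_models: functions mapping a hypothesis string to a log score.
    weights: one weight for the first-pass score, then one per extra model.
    """
    rescored = []
    for hyp, base_score in nbest:
        features = [base_score] + [model(hyp) for model in extra_models]
        total = sum(w * f for w, f in zip(weights, features))
        rescored.append((total, hyp))
    # Highest combined score first.
    rescored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in rescored]


def toy_length_model(hyp, expected_len=4):
    """Hypothetical second-pass model: penalise deviation from a target length."""
    return -abs(len(hyp.split()) - expected_len)


# First-pass output with made-up scores: the shorter hypothesis is ranked first.
nbest = [
    ("the house small", -1.0),
    ("the house is small", -1.2),
]
reranked = rerank(nbest, [toy_length_model], weights=[1.0, 0.5])
```

With these toy numbers the length model moves the four-word hypothesis to the top, which shows how a second-pass model can overturn the first-pass ranking.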

Articulatory motivated acoustic features for speech recognition

9th European Conference on Speech Communication and Technology

Authors: Kocharov, Daniil; Zolnay, András; Schlüter, Ralf; Ney, Hermann; Department of Phonetics, Faculty of Philology, Saint-Petersburg State University, 199034 Saint Petersburg, Russia; Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany

In this paper, we consider the use of multiple acoustic features of the speech signal for continuous speech recognition. A novel articulatory motivated acoustic feature is introduced, namely the spectrum derivative feature. The new feature is tested in combination with the standard Mel Frequency Cepstral Coefficients (MFCC) and the voicedness features. Linear Discriminant Analysis is applied to find the optimal combination of different acoustic features. Experiments have been performed on small and large vocabulary tasks. Significant improvements in word error rate have been obtained by combining the MFCC feature with the articulatory motivated voicedness and spectrum derivative features: improvements of up to 25% on the small-vocabulary task and improvements of up to 4% on the large-vocabulary task relative to using MFCC alone with the same overall number of parameters in the system.

Keywords: Speech recognition
