Search results - Inner Mongolia University Library

ICDAR 2009 - 10th International Conference on Document Analysis and Recognition

Authors: Dreuw, Philippe; Heigold, Georg; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University, Ahornstr. 55, D-52056 Aachen, Germany

ISBN: (print) 9780769537252

We present a novel confidence-based discriminative training for model adaptation approach for an HMM based Arabic handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood trained HMM systems and try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer specific data. Discriminative training based on the Maximum Mutual Information criterion is used to train writer independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. Additionally, the training criterion is extended to incorporate a margin term. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database, where the proposed novel adaptation approach can decrease the word-error-rate by 33% relative. © 2009 IEEE.

Keywords: Decoding

Are unaligned words important for machine translation?

13th Annual Conference of the European Association for Machine Translation, EAMT 2009

Authors: Zhang, Yuqi; Matusov, Evgeny; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6, Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

In this paper, we deal with the problem of a large number of unaligned words in automatically learned word alignments for machine translation (MT). These unaligned words are the reason for ambiguous phrase pairs extracted by a statistical phrase-based MT system. In translation, this phrase ambiguity causes deletion and insertion errors. We present hard and optional deletion approaches to remove the unaligned words in the source language sentences. Improvements in translation quality are achieved both on large and small vocabulary tasks with the presented methods. © 2009 European Association for Machine Translation.

Keywords: Machine translation

Demonstration of Joshua: An open source toolkit for parsing-based machine translation

Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009

Authors: Li, Zhifei; Callison-Burch, Chris; Dyer, Chris; Ganitkevitch, Juri; Khudanpur, Sanjeev; Schwartz, Lane; Thornton, Wren N. G.; Weese, Jonathan; Zaidan, Omar F. Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Computational Linguistics and Information Processing Lab, University of Maryland, United States; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Germany; Natural Language Processing Lab, University of Minnesota, United States

ISBN: (print) 9781617382581

We describe Joshua (Li et al., 2009a), an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for translation via synchronous context free grammars (SCFGs): chart-parsing, n-gram language model integration, beam- and cube-pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We also provide a demonstration outline for illustrating the toolkit's features to potential users, whether they be newcomers to the field or power users interested in extending the toolkit. © 2009 ACL and AFNLP.

Keywords: Distributed computer systems

Confidence-Based Discriminative Training for Model Adaptation in Offline Arabic Handwriting Recognition

International Conference on Document Analysis and Recognition

Authors: Philippe Dreuw; Georg Heigold; Hermann Ney. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University, Aachen, Germany

ISBN: (print) 9781424445004

We present a novel confidence-based discriminative training for model adaptation approach for an HMM based Arabic handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood trained HMM systems and try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer specific data. Discriminative training based on the maximum mutual information criterion is used to train writer independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. Additionally, the training criterion is extended to incorporate a margin term. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database, where the proposed novel adaptation approach can decrease the word-error-rate by 33% relative.

Keywords: Adaptation model; Handwriting recognition; Hidden Markov models; Writing; Mutual information; Maximum likelihood decoding; Maximum likelihood estimation; Pattern recognition; Databases; Automatic speech recognition

Writer Adaptive Training and Writing Variant Model Refinement for Offline Arabic Handwriting Recognition

International Conference on Document Analysis and Recognition

Authors: Philippe Dreuw; David Rybach; Christian Gollan; Hermann Ney. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University, Aachen, Germany

We present a writer adaptive training and writer clustering approach for an HMM based Arabic handwriting recognition system to handle different handwriting styles and their variations. Additionally, a writing variant model refinement for specific writing variants is proposed. Current approaches try to compensate the impact of different writing styles during preprocessing and normalization steps. Writer adaptive training with a CMLLR based feature adaptation is used to train writer dependent models. An unsupervised writer clustering with Bayesian information criterion based stopping condition for a CMLLR based feature adaptation during a two-pass decoding process is used to cluster different handwriting styles of unknown test writers. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database.

Keywords: Writing; Handwriting recognition; Maximum likelihood decoding; Hidden Markov models; Automatic speech recognition; Maximum likelihood linear regression; Pattern recognition; Bayesian methods; Databases; Text analysis

Joshua: An Open Source Toolkit for Parsing-based Machine Translation

4th Workshop on Statistical Machine Translation, WMT 2009, immediately preceding the 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2009

Authors: Li, Zhifei; Callison-Burch, Chris; Dyer, Chris; Ganitkevitch, Juri; Khudanpur, Sanjeev; Schwartz, Lane; Thornton, Wren N. G.; Weese, Jonathan; Zaidan, Omar F. Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; Computational Linguistics and Information Processing Lab, University of Maryland, College Park, MD, United States; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Germany; Natural Language Processing Lab, University of Minnesota, Minneapolis, MN, United States

We describe Joshua, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for synchronous context free grammars (SCFGs): chart-parsing, n-gram language model integration, beam- and cube-pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We demonstrate that the toolkit achieves state of the art translation performance on the WMT09 French-English translation task. © 2009 Association for Computational Linguistics.

Keywords: Distributed computer systems

Audio segmentation for speech recognition using segment features

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: David Rybach; Christian Gollan; Ralf Schluter; Hermann Ney. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Germany

Audio segmentation is an essential preprocessing step in several audio processing applications with a significant impact e.g. on speech recognition performance. We introduce a novel framework which combines the advantages of different well known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream in a holistic way by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison to other segmentation techniques in terms of speech recognition performance is presented, showing a promising segmentation quality of our approach.

Keywords: Speech recognition; Streaming media; Decoding; Broadcasting; Loudspeakers; Automatic speech recognition; Bayesian methods; Humans; Natural languages; Pattern recognition

White-space models for offline Arabic handwriting recognition

Authors: Dreuw, Philippe; Jonas, Stephan; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University, Germany

ISBN: (print) 9781424421756

We propose to explicitly model white-spaces for Arabic handwriting recognition within different writing variants. Position-dependent character shapes in Arabic handwriting allow for large white-spaces between characters even within words. Here, a separate character model for white-spaces in combination with a lexicon using different writing variants and character model length adaptation is proposed. Current handwriting recognition systems model the white-spaces implicitly within the character models leading to possibly degraded models, or try to explicitly segment the Arabic words into pieces of Arabic words being prone to segmentation errors. Several white-space modeling approaches are analyzed on the well known IFN/ENIT database and outperform the best reported error rates. © 2008 IEEE.

Keywords: Character recognition

Statistical pattern recognition and machine translation from 1988 to 1998: What has happened?

8th International Workshop on Pattern Recognition in Information Systems, PRIS 2008; in conjunction with ICEIS 2008

Author: Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, RWTH Aachen University, Aachen, Germany

Spoken language processing techniques for sign language recognition and translation

Technology and Disability, 2008, vol. 20, no. 2, pp. 121-133

Authors: Dreuw, Philippe; Stein, Daniel; Deselaers, Thomas; Rybach, David; Zahedi, Morteza; Bungeroth, Jan; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department 6, RWTH Aachen University, Aachen, Germany

We present an approach to automatically recognize sign language and translate it into a spoken language. A system to address these tasks is created based on state-of-the-art techniques from statistical machine translation, speech recognition, and image processing research. Such a system is necessary for communication between deaf and hearing people. The communication is otherwise nearly impossible due to missing sign language skills on the hearing side, and the low reading and writing skills on the deaf side. As opposed to most current approaches, which focus on the recognition of isolated signs only, we present a system that recognizes complete sentences in sign language. Similar to speech recognition, we have to deal with temporal sequences. Instead of the acoustic signal in speech recognition, we process a video signal as input. Therefore, we use a speech recognition system to obtain a textual representation of the signed sentences. This intermediate representation is then fed into a statistical machine translation system to create a translation into a spoken language. To achieve good results, some particularities of sign languages are considered in both systems. We use a publicly available corpus to show the performance of the proposed system and report very promising results. © 2008 IOS Press. All rights reserved.
