The term fine-grained visual classification (FGVC) refers to classification tasks where the classes are very similar and the classification model needs to be able to find subtle differences to make the correct predict...
ISBN (print): 9781622765928
Training the phrase table by force-aligning (FA) the training data with the reference translation has been shown to improve phrasal translation quality while significantly reducing the phrase table size on medium-sized tasks. We apply this procedure to several large-scale tasks, with the primary goal of reducing model sizes without sacrificing translation quality. To deal with the noise in the automatically crawled parallel training data, we introduce on-demand word deletions, insertions, and backoffs to achieve an over 99% successful alignment rate. We also add heuristics to avoid any increase in OOV rates. We are able to reduce already heavily pruned baseline phrase tables by more than 50% with little to no degradation in quality, and occasionally a slight improvement, without any increase in OOVs. We further introduce two global scaling factors for re-estimation of the phrase table via posterior phrase alignment probabilities, and a modified absolute discounting method that can be applied to fractional counts.
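As a rough illustration of the last point, the sketch below applies a fixed absolute discount to fractional phrase counts; the paper's modified discounting and its global scaling factors are not detailed in the abstract, so the function and the discount value here are assumptions.

```python
from collections import defaultdict

# Illustrative sketch only: absolute discounting over fractional phrase counts.
# A fixed discount is subtracted from each count (floored at zero), which frees
# probability mass for smoothing; the paper's exact re-estimation may differ.
def discounted_phrase_probs(frac_counts, discount=0.3):
    """frac_counts: dict mapping (src_phrase, tgt_phrase) -> fractional count."""
    src_totals = defaultdict(float)
    for (src, _tgt), c in frac_counts.items():
        src_totals[src] += c
    probs = {}
    for (src, tgt), c in frac_counts.items():
        probs[(src, tgt)] = max(c - discount, 0.0) / src_totals[src]
    return probs

counts = {("das haus", "the house"): 1.7, ("das haus", "the building"): 0.4}
print(discounted_phrase_probs(counts))
```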
We describe Joshua, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for synchronous context free grammars (SCFGs): chart-parsing, ngram language model integ...
Automatic Speech Recognition (ASR) based on Recurrent Neural Network Transducers (RNN-T) is gaining interest in the speech community. We investigate data selection and preparation choices aimed at improving the robustness of RNN-T ASR to speech disfluencies, with a focus on partial words. For evaluation we use clean data, data with disfluencies, and a separate dataset with speech affected by stuttering. We show that including a small amount of data with disfluencies in the training set improves recognition accuracy on the test sets with disfluencies and stuttering. Increasing the amount of training data with disfluencies gives additional gains without degradation on the clean data. We also show that replacing partial words with a dedicated token yields even better accuracy on utterances with disfluencies and stuttering. The evaluation of our best model shows 22.5% and 16.4% relative WER reductions on those two evaluation sets.
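A minimal sketch of the partial-word substitution described above, assuming partial words are marked with a trailing hyphen in the transcripts; the actual marking convention and the token used in the paper are not given in the abstract.

```python
import re

# Hypothetical preprocessing step: map partial words (marked "wan-") to a
# dedicated token before training the RNN-T on the transcripts.
PARTIAL_TOKEN = "<partial>"

def replace_partial_words(transcript: str) -> str:
    return re.sub(r"\b\w+-(?=\s|$)", PARTIAL_TOKEN, transcript)

print(replace_partial_words("i wan- want to go"))  # -> "i <partial> want to go"
```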
ISBN (print): 9781479903573
In this paper, we present a unified search strategy for open-vocabulary handwriting recognition using weighted finite-state transducers. In addition to a standard word-level language model, we introduce a separate n-gram character-level language model for out-of-vocabulary word detection and recognition. The probabilities assigned by these two models are combined into one Bayes decision rule. We evaluate the proposed method on the IAM database of English handwriting. An improvement from a 22.2% word error rate to 17.3% is achieved compared to the closed-vocabulary scenario and the best published result.
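A plausible form of the combined decision rule, sketched under the assumption that out-of-vocabulary words are scored by the character-level model through an unknown-word class; the exact factorization and any scaling exponents used in the paper are not given in the abstract.

```latex
\hat{w}_1^N = \operatorname*{argmax}_{N,\,w_1^N}
  \Bigl\{ p(x_1^T \mid w_1^N) \prod_{n=1}^{N}
  \begin{cases}
    p_{\mathrm{word}}(w_n \mid w_1^{n-1}) & \text{if } w_n \in \mathcal{V} \\
    p_{\mathrm{word}}(\langle\mathrm{unk}\rangle \mid w_1^{n-1}) \,
      p_{\mathrm{char}}(w_n) & \text{otherwise}
  \end{cases} \Bigr\}
```

Here \(\mathcal{V}\) denotes the recognition vocabulary and \(p_{\mathrm{char}}\) scores the character sequence of a hypothesized out-of-vocabulary word.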
ISBN (print): 9781457705380
Recognizing Broadcast Conversational (BC) speech data is a difficult task, which can be regarded as one of the major challenges beyond the recognition of Broadcast News (BN). This paper presents the automatic speech recognition systems developed by RWTH for English, French, and German, which attained the best word error rates for English and German, and competitive results on the French task, in the 2010 Quaero evaluation for BC and BN data. At the same time, the RWTH German system used the least amount of training data among all participants. Large reductions in word error rate were obtained by incorporating the new Bottleneck Multilayer Perceptron (MLP) features for all three languages. Additional improvements were obtained for the German system by applying a new language modeling technique that decomposes words into sublexical components.
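To illustrate the idea of sublexical decomposition, the sketch below greedily splits a compound into in-vocabulary parts and marks the split point so that full words can be restored after recognition; the decomposition actually used for the German system is not described in the abstract, so this is only an assumed variant.

```python
# Hypothetical compound splitter for LM training: break a word into two
# in-vocabulary parts, marking the first part with "+" so the surface word
# can be rejoined after decoding.
def decompose(word, lexicon, min_len=4):
    for i in range(min_len, len(word) - min_len + 1):
        head, tail = word[:i], word[i:]
        if head in lexicon and tail in lexicon:
            return [head + "+", tail]
    return [word]

lexicon = {"wort", "fehler", "rate"}
print(decompose("fehlerrate", lexicon))  # -> ['fehler+', 'rate']
```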
ISBN (print): 9780769549993
In this paper we describe a novel HMM-based system for off-line handwriting recognition. We adapt successful techniques from the domains of large-vocabulary speech recognition and image object recognition: moment-based image normalization, writer adaptation, discriminative feature extraction and training, and open-vocabulary recognition. We evaluate these methods and examine their cumulative effect on recognition performance. The final system outperforms current state-of-the-art approaches on two standard evaluation corpora for English and French handwriting.
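A rough sketch of moment-based image normalization as commonly used in handwriting preprocessing: the image is re-centered on its intensity centroid and rescaled according to its second-order moments; the exact normalization variant used in the paper may differ in detail.

```python
import numpy as np

# Sketch: normalize position by the first-order moments (centroid) and size by
# the second-order moments, resampling a window of +/- 2 standard deviations
# around the centroid to a fixed output size (nearest-neighbour sampling).
def moment_normalize(img, out_h=32, out_w=128):
    ys, xs = np.nonzero(img)
    w = img[ys, xs].astype(float)
    cy, cx = np.average(ys, weights=w), np.average(xs, weights=w)
    sy = np.sqrt(np.average((ys - cy) ** 2, weights=w)) + 1e-6
    sx = np.sqrt(np.average((xs - cx) ** 2, weights=w)) + 1e-6
    out = np.zeros((out_h, out_w), dtype=img.dtype)
    for oy in range(out_h):
        for ox in range(out_w):
            iy = int(round(cy + (oy - out_h / 2) * (4 * sy / out_h)))
            ix = int(round(cx + (ox - out_w / 2) * (4 * sx / out_w)))
            if 0 <= iy < img.shape[0] and 0 <= ix < img.shape[1]:
                out[oy, ox] = img[iy, ix]
    return out

img = np.zeros((60, 200)); img[20:40, 50:150] = 1.0
print(moment_normalize(img).shape)  # -> (32, 128)
```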
Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results. However, fundamental challenges such as error propagation ...
This work investigates an alternative model for neural machine translation (NMT) and proposes a novel architecture, where we employ a multi-dimensional long short-term memory (MDLSTM) for translation modeling. In the ...
Sparse models require less memory for storage and enable a faster inference by reducing the necessary number of FLOPs. This is relevant both for time-critical and on-device computations using neural networks. The stab...