Search Results - Inner Mongolia University Library

Towards Unsupervised Learning for Handwriting Recognition

International Workshop on Frontiers in Handwriting Recognition

Authors: Michal Kozielski, Malte Nuhn, Patrick Doetsch, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

We present a method for training an off-line handwriting recognition system in an unsupervised manner. For an isolated word recognition task, we are able to bootstrap the system without any annotated data. We then retrain the system iteratively, using the best hypothesis from the previous recognition pass. Our approach relies only on a prior language model and does not depend on an explicit segmentation of words into characters. On a standard dataset, the resulting system shows promising performance in comparison to a system trained in a supervised fashion on the same amount of training data.

Keywords: Hidden Markov models, Training, Vocabulary, Error analysis, Handwriting recognition, Training data, Ciphers

OPEN VOCABULARY HANDWRITING RECOGNITION USING COMBINED WORD-LEVEL AND CHARACTER-LEVEL LANGUAGE MODELS

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: Michal Kozielski, David Rybach, Stefan Hahn, Ralf Schlüter, Hermann Ney; Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Aachen, Germany

ISBN: 9781479903573 (print)

In this paper, we present a unified search strategy for open vocabulary handwriting recognition using weighted finite state transducers. In addition to a standard word-level language model, we introduce a separate n-gram character-level language model for out-of-vocabulary word detection and recognition. The probabilities assigned by these two models are combined into one Bayes decision rule. We evaluate the proposed method on the IAM database of English handwriting. Compared to the closed-vocabulary scenario and the best published result, the word error rate improves from 22.2% to 17.3%.

Keywords: Handwriting Vocabulary modelling languages search strategies Error analysis Word Bayes Decision Rule handwriting recognition
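
The abstract above describes combining a word-level and a character-level language model so that out-of-vocabulary words can still be scored. The following is a minimal sketch of that general idea only, not the authors' weighted finite state transducer implementation: it assumes a toy word unigram table with an `<unk>` entry and a toy character bigram table. All names and probabilities here (`WORD_UNIGRAM`, `CHAR_BIGRAM`, `word_logprob`, the `floor` value) are hypothetical.

```python
import math

# Toy models with made-up probabilities (assumptions for illustration only).
WORD_UNIGRAM = {"the": 0.5, "cat": 0.3, "<unk>": 0.2}
CHAR_BIGRAM = {
    ("<w>", "c"): 0.5,
    ("c", "a"): 0.5,
    ("a", "b"): 0.5,
    ("b", "</w>"): 0.5,
}


def char_logprob(word, char_bigram, floor=1e-6):
    """Log-probability of a word under a character bigram model.

    Unseen character pairs receive the small `floor` probability.
    """
    symbols = ["<w>"] + list(word) + ["</w>"]
    return sum(
        math.log(char_bigram.get((prev, cur), floor))
        for prev, cur in zip(symbols, symbols[1:])
    )


def word_logprob(word, word_unigram, char_bigram):
    """Score in-vocabulary words with the word model.

    Out-of-vocabulary words are scored as p(<unk>) * p_char(word),
    so both models contribute to a single combined probability.
    """
    if word in word_unigram:
        return math.log(word_unigram[word])
    return math.log(word_unigram["<unk>"]) + char_logprob(word, char_bigram)


in_vocab = word_logprob("cat", WORD_UNIGRAM, CHAR_BIGRAM)
out_of_vocab = word_logprob("cab", WORD_UNIGRAM, CHAR_BIGRAM)
```

In a full recognizer these language model scores would be combined with the optical model score during search; the sketch only shows how the two language models can share one probability space.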

Improving unsupervised word-by-word translation with language model and denoising autoencoder

arXiv, 2019

Authors: Yunsu Kim, Jiahui Geng, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences. In this paper, we propose simple yet effective methods to improve word-by-word translation of cross-lingual embeddings, using only monolingual corpora but without any back-translation. We integrate a language model for context-aware search, and use a novel denoising autoencoder to handle reordering. Our system surpasses state-of-the-art unsupervised neural translation systems without costly iterative training. We also analyze the effect of vocabulary size and denoising type on the translation performance, which provides better understanding of learning the cross-lingual word embedding and its usage in translation. Copyright © 2019, The Authors. All rights reserved.

Keywords: Embeddings

Generalizing back-translation in neural machine translation

arXiv, 2019

Authors: Miguel Graça, Yunsu Kim, Julian Schamper, Shahram Khadivi, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

Back-translation - data augmentation by translating target monolingual data - is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. Our statements are investigated on the WMT 2018 German → English news translation task. Copyright © 2019, The Authors. All rights reserved.

Keywords: Neural machine translation

When and Why is Unsupervised Neural Machine Translation Useless?

arXiv, 2020

Authors: Yunsu Kim, Miguel Graça, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

This paper studies the practicality of the current state-of-the-art unsupervised methods in neural machine translation (NMT). In ten translation tasks with various data settings, we analyze the conditions under which the unsupervised methods fail to produce reasonable translations. We show that their performance is severely affected by linguistic dissimilarity and domain mismatch between source and target monolingual data. Such conditions are common for low-resource language pairs, where unsupervised learning works poorly. In all of our experiments, supervised and semi-supervised baselines with 50k-sentence bilingual data outperform the best unsupervised results. Our analyses pinpoint the limits of the current unsupervised NMT and also suggest immediate research directions. Copyright © 2020, The Authors. All rights reserved.

Keywords: Neural machine translation

THE RWTH 2010 QUAERO ASR EVALUATION SYSTEM FOR ENGLISH, FRENCH, AND GERMAN

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: M. Sundermeyer, M. Nussbaum-Thom, S. Wiesler, C. Plahl, A. El-Desoky Mousa, S. Hahn, D. Nolden, R. Schlüter, H. Ney; Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University

ISBN: 9781457705380 (print)

Recognizing Broadcast Conversational (BC) speech data is a difficult task, which can be regarded as one of the major challenges beyond the recognition of Broadcast News (BN). This paper presents the automatic speech recognition systems developed by RWTH for English, French, and German, which attained the best word error rates for English and German and competitive results for the French task in the 2010 Quaero evaluation for BC and BN data. At the same time, the RWTH German system used the least training data of all participants. Large reductions in word error rate were obtained for all three languages by incorporating the new Bottleneck Multilayer Perceptron (MLP) features. Additional improvements were obtained for the German system by applying a new language modeling technique that decomposes words into sublexical components.

Keywords: automatic speech recognition, multilayer perceptrons

Pan, Zoom, Scan - Time-coherent, Trained Automatic Video Cropping

26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), vol. 10

Authors: Thomas Deselaers, Philippe Dreuw, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

We present a method to fully automatically fit videos in 16:9 format on 4:3 screens and vice versa. It can be applied to arbitrary aspect ratios and can be used to make videos suitable for mobile viewing devices with small and possibly uncommonly sized displays. The cropping sequence is optimised over time to create smooth transitions and thus leads to an excellent viewing experience. Current televisions have simple and often disturbing methods which either show the centre region of the image, distort the image, or pad it with black borders. The technique presented here can fully automatically find the "right" viewing area for each image in a video sequence. It works in real-time with only very little time-shift. We employ different low-level features and a log-linear model to learn how to find the right area. The method is able to automatically decide whether padding with black borders is necessary or whether all relevant image areas fit on screen by cropping the image. Evaluation is done on ten videos from five different types of content and the baseline methods are clearly outperformed.

Keywords: Displays, Motion pictures, TV, Video sequences, Crops, Layout, Humans, Pattern recognition, DVD, Image resolution

Efficient nearly error-less LVCSR decoding based on incremental forward and backward passes

IEEE Workshop on Automatic Speech Recognition and Understanding

Authors: David Nolden, Ralf Schlüter, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

ISBN: 9781479927579 (print)

We show that most search errors can be identified by aligning the results of a symmetric forward and backward decoding pass. Based on this knowledge, we introduce an efficient high-level decoding architecture which yields virtually no search errors and requires virtually no manual tuning. We perform an initial forward and backward decoding with tight initial beams, identify search errors, and then recursively increase the beam sizes and perform new forward and backward decodings for erroneous intervals until no more search errors are detected. Consequently, each utterance, and even each single word, is decoded with the smallest beam size required to decode it correctly. On all tested systems we achieve an error rate equal to or very close to that of classical decoding with an ideally tuned beam size, but in an unsupervised manner without specific tuning, and at around twice the speed. An additional speedup by a factor of 2 can be achieved by decoding the forward and backward pass in separate threads.

Keywords: Decoding, Acoustic beams, Hidden Markov models, Context, Acoustics, Error analysis, Runtime
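
The decoding scheme in the abstract above (align a forward and a backward pass, treat disagreements as likely search errors, widen the beam and retry) can be illustrated with a short sketch. This is a simplified illustration under stated assumptions, not the authors' architecture: the decoders are hypothetical callables that take a beam size and return a word list, the alignment uses Python's standard `difflib.SequenceMatcher`, and the whole utterance is re-decoded rather than only the erroneous intervals.

```python
from difflib import SequenceMatcher


def disagreement_spans(forward_words, backward_words):
    """Return (start, end) index ranges in the forward hypothesis where
    the two passes disagree; these are candidate search-error regions."""
    matcher = SequenceMatcher(a=forward_words, b=backward_words, autojunk=False)
    return [
        (i1, i2)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def decode_with_growing_beam(decode_forward, decode_backward, beam=1, max_beam=8):
    """Double the beam until both passes agree or `max_beam` is reached.

    `decode_forward` and `decode_backward` are hypothetical callables
    mapping a beam size to a list of words. For simplicity the whole
    utterance is re-decoded on every iteration.
    """
    while True:
        forward = decode_forward(beam)
        backward = decode_backward(beam)
        if not disagreement_spans(forward, backward) or beam >= max_beam:
            return forward, beam
        beam *= 2


spans = disagreement_spans(["the", "cat", "sat"], ["the", "hat", "sat"])
```

Note that agreement between the two passes is only a heuristic signal: if both passes make the same error, this check does not detect it.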

When and why is document-level context useful in neural machine translation?

arXiv, 2019

Authors: Yunsu Kim, Duc Thanh Tran, Hermann Ney; Human Language Technology and Pattern Recognition Group, RWTH Aachen University, Aachen, Germany

Document-level context has received lots of attention for compensating neural machine translation (NMT) of isolated sentences. However, recent advances in document-level NMT focus on sophisticated integration of the context, explaining its improvement with only a few selected examples or targeted test sets. We extensively quantify the causes of improvements by a document-level model in general test sets, clarifying the limit of the usefulness of document-level context in NMT. We show that most of the improvements are not interpretable as utilizing the context. We also show that a minimal encoding is sufficient for the context modeling and very Copyright © 2019, The Authors. All rights reserved.

Keywords: Neural machine translation