We study the application of active learning techniques to the translation of unbounded data streams via interactive neural machine translation. The main idea is to select, from an unbounded stream of source sentences,...
In this work, multiple hierarchical language modeling strategies for a zero OOV rate large vocabulary continuous speech recognition system are investigated. In our previously proposed hierarchical approach, a full-wor...
Checkpoint averaging is a simple and effective method to boost the performance of converged neural machine translation models. The calculation is cheap to perform and the fact that the translation improvement almost come...
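As a concrete illustration of the averaging step named in the abstract above, here is a minimal sketch in PyTorch, assuming checkpoints were stored with `torch.save(model.state_dict(), path)` and contain floating-point parameters; the helper and file names are illustrative, not the paper's implementation.

```python
# Minimal sketch of checkpoint averaging, assuming PyTorch checkpoints saved
# via torch.save(model.state_dict(), path) with floating-point parameters.
import torch

def average_checkpoints(paths):
    """Element-wise average of the parameters stored in several checkpoints."""
    avg_state = None
    for path in paths:
        state = torch.load(path, map_location="cpu")
        if avg_state is None:
            avg_state = {name: tensor.clone().float() for name, tensor in state.items()}
        else:
            for name, tensor in state.items():
                avg_state[name] += tensor.float()
    return {name: tensor / len(paths) for name, tensor in avg_state.items()}

# Usage (hypothetical file names): load the averaged weights before decoding.
# model.load_state_dict(average_checkpoints(["ckpt_28.pt", "ckpt_29.pt", "ckpt_30.pt"]))
```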
Despite the known limitations, most machine translation systems today still operate on the sentence level. One reason for this is that most parallel training data is only sentence-level aligned, without document-leve...
This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the translation task of the ACL 2013 Eighth Workshop on Statistical Machine Translation (WMT 2013). We par...
We investigate insertion and deletion models for hierarchical phrase-based statistical machine translation. Insertion and deletion models are designed as a means to avoid the omission of content words in the hypothese...
Context-aware neural machine translation (NMT) is a promising direction to improve the translation quality by making use of the additional context, e.g., document-level translation, or having meta-information. Althoug...
This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the translation task of the NAACL 2012 Seventh Workshop on Statistical Machine Translation (WMT 2012). We ...
This work investigates the alignment problem in state-of-the-art multi-head attention models based on the transformer architecture. We demonstrate that alignment extraction in transformer models can be improved by aug...
In this paper, we investigate large-scale lightly-supervised training with a pivot language: We augment a baseline statistical machine translation (SMT) system that has been trained on human-generated parallel training corpora with large amounts of additional unsupervised parallel data; but instead of creating this synthetic data from monolingual source language data with the baseline system itself, or from target language data with a reverse system, we employ a parallel corpus of target language data and data in a pivot language. The pivot language data is automatically translated into the source language, resulting in a trilingual corpus with unsupervised source language side. We augment our baseline system with the unsupervised source-target parallel data. Experiments are conducted for the German-French language pair using the standard WMT newstest sets for development and testing. We obtain the unsupervised data by translating the English side of the English-French 10⁹ corpus to German. With careful system design, we are able to achieve improvements of up to +0.4 points BLEU / -0.7 points TER over the baseline.
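As a rough sketch of the data pipeline this abstract describes (not the authors' actual tooling): the pivot (English) side of an English-French corpus is machine-translated into the source language (German) and paired with the original French target side; the resulting synthetic pairs augment the baseline training data. The function `translate_en_to_de` and the file handling below are placeholders.

```python
# Hedged sketch of assembling the unsupervised source-target data described
# above. translate_en_to_de stands in for any English->German MT system.
def build_synthetic_de_fr(en_fr_pairs, translate_en_to_de):
    for en_sent, fr_sent in en_fr_pairs:
        de_sent = translate_en_to_de(en_sent)  # pivot -> unsupervised source side
        yield de_sent, fr_sent                 # synthetic source, human-generated target

def read_parallel(src_path, tgt_path):
    """Yield sentence pairs from two line-aligned plain-text files."""
    with open(src_path, encoding="utf-8") as src, open(tgt_path, encoding="utf-8") as tgt:
        for s, t in zip(src, tgt):
            yield s.strip(), t.strip()

# The human-generated German-French corpus plus these synthetic pairs form the
# augmented training data for the source-target system.
```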