The preprocessing pipelines in Natural Language Processing usually involve a step that removes sentences consisting of illegal characters. The definition of illegal characters and the specific removal strategy depend on...
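The filtering step described above can be sketched as follows. This is a minimal illustration, not the paper's actual method: the character whitelist and the choice to drop whole sentences are assumptions for the example.

```python
import re

# Assumed definition of "legal": printable ASCII plus tab/newline.
# Real pipelines would adapt this set to the task and language.
LEGAL = re.compile(r"^[\x20-\x7E\t\n]*$")

def remove_illegal_sentences(sentences):
    """Keep only sentences whose every character matches LEGAL."""
    return [s for s in sentences if LEGAL.match(s)]

corpus = ["A clean sentence.", "Broken \x00 bytes here", "Tabs\tare fine"]
print(remove_illegal_sentences(corpus))
# → ['A clean sentence.', 'Tabs\tare fine']
```

A stricter strategy could instead replace or transliterate offending characters rather than discard the sentence; which variant is appropriate depends on the downstream task.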
ISBN: (Print) 9781479903573
This paper investigates the combination of different short-term features and the combination of recurrent and non-recurrent neural networks (NNs) on a Spanish speech recognition task. Several methods exist to combine different feature sets, such as concatenation or linear discriminant analysis (LDA). Even though all these techniques achieve reasonable improvements, feature combination by multi-layer perceptrons (MLPs) outperforms all known approaches. We develop the concept of MLP-based feature combination further using recurrent neural networks (RNNs). The phoneme posterior estimates derived from an RNN lead to a significant improvement over the result of the MLPs and achieve a 5% relative improvement in word error rate (WER) with far fewer parameters. Moreover, we improve the system performance further by combining an MLP and an RNN in a hierarchical framework. The MLP benefits from the preprocessing of the RNN. All NNs are trained on phonemes. Nevertheless, the same concepts could be applied using context-dependent states. In addition to the improvements in recognition performance w.r.t. WER, NN-based feature combination methods reduce both the training and the testing complexity. Overall, the systems are based on a single set of acoustic models, together with the training of different NNs.
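The contrast between plain feature concatenation and a learned, MLP-style combination that the abstract describes can be illustrated schematically. All shapes, layer sizes, and the use of random weights below are assumptions for the sketch; in the paper the network would be trained to estimate phoneme posteriors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two short-term feature streams over the same 100 frames
# (16 MFCC and 13 PLP dimensions are illustrative choices).
mfcc = rng.standard_normal((100, 16))
plp = rng.standard_normal((100, 13))

# Simple combination: concatenate the streams per frame.
concat = np.concatenate([mfcc, plp], axis=1)  # shape (100, 29)

# MLP-style combination: a nonlinear projection of the concatenated
# features into a new feature space (weights untrained here).
W1 = rng.standard_normal((29, 64))
W2 = rng.standard_normal((64, 40))
hidden = np.tanh(concat @ W1)
combined = hidden @ W2  # shape (100, 40)

print(concat.shape, combined.shape)
# → (100, 29) (100, 40)
```

The point of the learned variant is that the projection can weight and mix the streams nonlinearly instead of merely stacking them, which is what gives MLP (and, per the abstract, RNN) combination its edge over concatenation or LDA.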
This paper describes a new method for building compact context-dependency transducers for finite-state transducer-based ASR decoders. Instead of the conventional phonetic decision-tree growing followed by FST compilat...
ISBN: (Print) 9781622765928
This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the translation task of the NAACL 2012 Seventh Workshop on Statistical Machine Translation (WMT 2012). We participated in the evaluation campaign for the French-English and German-English language pairs in both translation directions. Both hierarchical and phrase-based SMT systems are applied. A number of different techniques are evaluated, including an insertion model, different lexical smoothing methods, a discriminative reordering extension for the hierarchical system, reverse translation, and system combination. By applying these methods, we achieve considerable improvements over the respective baseline systems.
The integration of language models for neural machine translation has been extensively studied in the past. It has been shown that an external language model, trained on additional target-side monolingual data, can he...
Compared to sentence-level systems, document-level neural machine translation (NMT) models produce a more consistent output across a document and are able to better resolve ambiguities within the input. There are many...
Data processing is an important step in various natural language processing tasks. As the commonly used datasets in named entity recognition contain only a limited number of samples, it is important to obtain addition...
Encoder-decoder architecture is widely adopted for sequence-to-sequence modeling tasks. For machine translation, despite the evolution from long short-term memory networks to Transformer networks, plus the introductio...
Despite the known limitations, most machine translation systems today still operate on the sentence level. One reason for this is that most parallel training data is only sentence-level aligned, without document-leve...
Document-level context for neural machine translation (NMT) is crucial to improve the translation consistency and cohesion, the translation of ambiguous inputs, as well as several other linguistic phenomena. Many work...