ISBN (digital): 9781728103068
ISBN (print): 9781728103075
While recurrent neural networks can motivate cross-sentence language modeling and its application to automatic speech recognition (ASR), corresponding modifications of the training method for that end are rarely discussed. In fact, even more generally, the impact of the training sequence construction strategy in language modeling under different evaluation conditions is typically ignored. In this work, we revisit this basic but fundamental question. We train language models based on long short-term memory recurrent neural networks and Transformers using various types of training sequences and study their robustness with respect to different evaluation modes. Our experiments on the 300h Switchboard and Quaero English datasets show that models trained with back-propagation over sequences consisting of concatenations of multiple sentences, with state carry-over across sequences, outperform those trained at the sentence level, both in terms of perplexity and word error rate for cross-utterance ASR.
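The state carry-over training described above can be made concrete with a short sketch. The following is a minimal PyTorch illustration, not the paper's implementation: training sequences are slices of a stream of concatenated sentences, back-propagation is truncated at the sequence boundary, and the LSTM state is detached and carried into the next sequence. Model sizes, the optimizer, and the data handling are assumptions for illustration only.

```python
# Minimal sketch (not the paper's code) of LSTM LM training over
# concatenated-sentence sequences with state carry-over.
import torch
import torch.nn as nn

class LstmLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=512, hidden_dim=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        out, state = self.lstm(self.embed(tokens), state)
        return self.proj(out), state

def train_with_carry_over(model, stream, seq_len=256, lr=1e-3):
    # `stream` is a 1-D LongTensor holding consecutive sentences
    # concatenated (e.g. joined by sentence-boundary tokens).
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    state = None
    for start in range(0, stream.numel() - seq_len - 1, seq_len):
        x = stream[start:start + seq_len].unsqueeze(0)
        y = stream[start + 1:start + seq_len + 1].unsqueeze(0)
        logits, state = model(x, state)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), y.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Detach: no gradient flows across sequence boundaries, but the
        # context itself is preserved across them.
        state = tuple(s.detach() for s in state)
```

Evaluating such a model sentence by sentence, without carrying the state, reproduces the mismatch between training and evaluation modes that the abstract refers to.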
ISBN (digital): 9781728103068
ISBN (print): 9781728103075
Recent advances in deep learning show that end-to-end speech-to-text translation models are a promising approach for direct speech translation. In this work, we provide an overview of different end-to-end architectures, as well as the use of an auxiliary connectionist temporal classification (CTC) loss for better convergence. We also investigate pre-training variants, such as initializing different components of a model with pretrained models, and their impact on final performance, which yields gains of up to 4% in BLEU and 5% in TER. Our experiments are performed on 270h of IWSLT TED-talks En→De and 100h of LibriSpeech audio-books En→Fr. We also show improvements over the current end-to-end state-of-the-art systems on both tasks.
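As a rough illustration of the auxiliary CTC loss mentioned above, the sketch below combines the attention decoder's cross-entropy with a CTC term computed on encoder outputs. The interpolation weight `ctc_weight`, the special-token indices, and all tensor shapes are assumptions, not values from the paper.

```python
# Hedged sketch: joint loss = (1 - w) * cross-entropy + w * CTC.
import torch.nn as nn

BLANK_ID = 0  # assumed CTC blank index
PAD_ID = 1    # assumed target padding index

ctc_loss = nn.CTCLoss(blank=BLANK_ID, zero_infinity=True)
ce_loss = nn.CrossEntropyLoss(ignore_index=PAD_ID)

def joint_loss(enc_logits, enc_lens, dec_logits, targets, target_lens,
               ctc_weight=0.3):
    # Decoder branch: standard label cross-entropy.
    ce = ce_loss(dec_logits.reshape(-1, dec_logits.size(-1)),
                 targets.reshape(-1))
    # Encoder branch: CTC over per-frame distributions; nn.CTCLoss
    # expects log-probabilities of shape (T, N, C).
    log_probs = enc_logits.log_softmax(-1).transpose(0, 1)
    ctc = ctc_loss(log_probs, targets, enc_lens, target_lens)
    return (1.0 - ctc_weight) * ce + ctc_weight * ctc
```

The CTC branch forces the encoder toward monotonic frame-to-label alignments early in training, which is the convergence benefit the abstract alludes to.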
The preprocessing pipelines in Natural Language Processing usually involve a step of removing sentences consisting of illegal characters. The definition of illegal characters and the specific removal strategy depend on...
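Since the abstract is cut off before the removal strategy is defined, the following is only a generic sketch of one common approach: whitelisting a character range and dropping any sentence containing characters outside it. The character set chosen here is purely an assumption.

```python
# Illustrative filter: keep a sentence only if every character is in a
# whitelisted range (printable ASCII plus common accented Latin letters).
import re

LEGAL = re.compile(r"^[\x20-\x7E\u00C0-\u017F]+$")

def filter_corpus(sentences):
    return [s for s in sentences if LEGAL.match(s)]

print(filter_corpus(["Hello world.", "bad \x00 byte", "Café déjà vu"]))
# -> ['Hello world.', 'Café déjà vu']
```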
Prominently used in support vector machines and logistic regression, kernel functions (kernels) can implicitly map data points into high-dimensional spaces and make it easier to learn complex decision boundaries. In ...
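The implicit high-dimensional mapping can be demonstrated in a few lines: an RBF kernel k(x, x') = exp(-γ‖x − x'‖²) corresponds to an inner product in an infinite-dimensional feature space, yet only pairwise kernel values are ever computed. The dataset and γ below are illustrative choices, not from the paper.

```python
# Sketch of the kernel trick with a precomputed RBF Gram matrix.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

def rbf(A, B, gamma=1.0):
    # k(a, b) = exp(-gamma * ||a - b||^2), computed pairwise.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# A linear separator in the implicit feature space yields a nonlinear
# boundary (here: concentric circles) in the input space.
clf = SVC(kernel="precomputed").fit(rbf(X, X), y)
print("train accuracy:", clf.score(rbf(X, X), y))
```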
Accessibility to historical documents is mostly limited to scholars. This is due to the language barrier inherent in human language and the linguistic properties of these documents. Given a historical document, modern...
The goal of this paper is to deal with a data scarcity scenario where deep learning techniques tend to fail. We compare the use of two well-established techniques, Restricted Boltzmann Machines and Variational Auto-enc...
Flying ad-hoc networks (FANETs) have many applications in military, industrial and agricultural areas. Due to specific features of FANETs, such as high-speed nodes, low density of nodes in the network, and rapid chang...
Significant performance degradation of automatic speech recognition (ASR) systems is observed when the audio signal contains cross-talk. One of the recently proposed approaches to solve the problem of multi-speaker AS...
This paper describes the unsupervised neural machine translation (NMT) systems of the RWTH Aachen University developed for the English ↔ German news translation task of the EMNLP 2018 Third Conference on Machine Trans...
Embedding and projection matrices are commonly used in neural language models (NLM) as well as in other sequence processing networks that operate on large vocabularies. We examine such matrices in fine-tuned language ...
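For readers unfamiliar with the terminology: the embedding matrix maps token indices to vectors at the input, and the projection matrix maps hidden states back to vocabulary logits at the output. A minimal sketch follows, including the common weight-tying variant; it is a generic illustration, not the setup examined in the paper.

```python
# Generic NLM skeleton showing the two vocabulary-sized matrices.
import torch.nn as nn

class TinyNLM(nn.Module):
    def __init__(self, vocab_size=50000, dim=512, tie_weights=True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # input embedding
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.proj = nn.Linear(dim, vocab_size)      # output projection
        if tie_weights:
            # Both weights are (vocab_size, dim); tying shares one
            # matrix for input and output, halving these parameters.
            self.proj.weight = self.embed.weight

    def forward(self, tokens):
        out, _ = self.rnn(self.embed(tokens))
        return self.proj(out)
```

For large vocabularies these two matrices dominate the parameter count, which is why they are a natural object of study when fine-tuning.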