ISBN (digital): 9781728103068
ISBN (print): 9781728103075
We present competitive results using a Transformer encoder-decoder-attention model for end-to-end speech recognition that needs less training time than a similarly performing LSTM model. We observe that Transformer training is in general more stable than LSTM training, although the Transformer also seems to overfit more and thus shows more problems with generalization. We also find that two initial LSTM layers in the Transformer encoder provide a much better positional encoding. Data augmentation with a variant of SpecAugment improves the Transformer by 33% and the LSTM by 15% relative. We analyze several pretraining and scheduling schemes, which are crucial for both the Transformer and the LSTM models. We improve our LSTM model with additional convolutional layers. We perform our experiments on LibriSpeech 1000h, Switchboard 300h and TED-LIUM-v2 200h, and we show state-of-the-art performance on TED-LIUM-v2 for attention-based end-to-end models. For comparability, we deliberately limit LibriSpeech training to 12.5 epochs of the training data to keep the results of practical interest, although we show that longer training still yields further improvements. We publish all the code and setups needed to run our experiments.
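The idea of letting two initial LSTM layers take the place of an explicit positional encoding can be sketched as follows. This is a hypothetical PyTorch illustration, not the published setup; `feat_dim`, `d_model`, `nhead` and the layer counts are placeholder values:

```python
# Hedged sketch: a Transformer encoder whose first two layers are LSTMs,
# so ordering information comes from the recurrence rather than from an
# explicit positional encoding. Hyper-parameters are illustrative only.
import torch
import torch.nn as nn

class LSTMFrontTransformerEncoder(nn.Module):
    def __init__(self, feat_dim=80, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        # Two initial LSTM layers act as a learned positional encoding.
        self.lstm = nn.LSTM(feat_dim, d_model, num_layers=2, batch_first=True)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, x):            # x: (batch, time, feat_dim)
        x, _ = self.lstm(x)          # recurrence injects position information
        return self.transformer(x)   # self-attention layers on top

enc = LSTMFrontTransformerEncoder()
out = enc(torch.randn(4, 100, 80))  # -> (4, 100, 512)
```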
This paper describes the statistical machine translation systems developed at RWTH Aachen University for the German→English, English→Turkish and Chinese→English translation tasks of the EMNLP 2018 Third Conference ...
A combination of forward and backward long short-term memory (LSTM) recurrent neural network (RNN) language models is a popular model combination approach to improve the estimation of the sequence probability in the s...
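The combination idea can be illustrated with a log-linear interpolation of the two directional models, e.g. when rescoring an N-best list (an assumed use case given the truncated abstract). `score_fwd` and `score_bwd` are hypothetical callables returning sentence-level log-probabilities; the equal-weight default is a sketch, not the paper's exact scheme:

```python
# Hedged sketch of forward/backward LM combination for N-best rescoring.
def combined_score(tokens, score_fwd, score_bwd, lam=0.5):
    lp_fwd = score_fwd(tokens)                   # log p_fwd(w_1 ... w_N)
    lp_bwd = score_bwd(list(reversed(tokens)))   # log p_bwd(w_N ... w_1)
    return lam * lp_fwd + (1.0 - lam) * lp_bwd   # log-linear interpolation

def rescore_nbest(nbest, score_fwd, score_bwd, lam=0.5):
    # nbest: list of (tokens, first-pass model score) pairs
    return max(nbest,
               key=lambda h: h[1] + combined_score(h[0], score_fwd,
                                                   score_bwd, lam))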
This work investigates a simple data augmentation technique, SpecAugment, for end-to-end speech translation. SpecAugment is a low-cost method that is applied directly to the audio input features, and it consists...
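A minimal sketch of the kind of masking SpecAugment applies to the input features is shown below (numpy; the mask counts and maximum widths are illustrative defaults, not the paper's settings):

```python
# Hedged sketch of SpecAugment-style frequency and time masking on a
# (time, freq) log-mel feature matrix. Parameters are placeholders.
import numpy as np

def spec_augment(feats, num_freq_masks=2, max_f=10, num_time_masks=2, max_t=20):
    feats = feats.copy()
    T, F = feats.shape
    for _ in range(num_freq_masks):               # mask random frequency bands
        f = np.random.randint(0, max_f + 1)
        f0 = np.random.randint(0, max(1, F - f))
        feats[:, f0:f0 + f] = 0.0
    for _ in range(num_time_masks):               # mask random time spans
        t = np.random.randint(0, max_t + 1)
        t0 = np.random.randint(0, max(1, T - t))
        feats[t0:t0 + t, :] = 0.0
    return feats

augmented = spec_augment(np.random.randn(200, 80))
```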
Attention-based sequence-to-sequence models have shown promising results in automatic speech recognition. Using these architectures, one-dimensional input and output sequences are related by an attention approach, the...
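The attention step relating an output position to the input sequence can be illustrated by generic scaled dot-product attention; this is a simplification of what such models use, with an assumed single-query interface:

```python
# Hedged sketch: one decoder query attends over encoder states to produce
# a context vector and a soft alignment over input positions.
import numpy as np

def attend(query, keys, values):
    # query: (d,); keys, values: (T, d) encoder states
    scores = keys @ query / np.sqrt(query.shape[0])  # similarity per input step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax over positions
    return weights @ values, weights                 # context, alignment

ctx, align = attend(np.random.randn(4),
                    np.random.randn(10, 4),
                    np.random.randn(10, 4))
```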
We explore deep autoregressive Transformer models in language modeling for speech recognition. We focus on two aspects. First, we revisit Transformer model configurations specifically for language modeling. We show that...
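A minimal autoregressive Transformer language model with a causal self-attention mask might look like the following sketch (PyTorch; vocabulary size and layer hyper-parameters are placeholders, and positional encoding is omitted for brevity):

```python
# Hedged sketch of an autoregressive Transformer LM: each position may only
# attend to itself and earlier positions via a causal (upper-triangular) mask.
import torch
import torch.nn as nn

class TransformerLM(nn.Module):
    def __init__(self, vocab=10000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.layers = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, tokens):                      # tokens: (batch, time)
        T = tokens.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        h = self.layers(self.embed(tokens), mask=mask)  # causal attention
        return self.out(h)                          # next-token logits

lm = TransformerLM()
logits = lm(torch.randint(0, 10000, (2, 16)))       # -> (2, 16, 10000)
```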
We present a demonstration of a neural interactive-predictive system for tackling multimodal sequence-to-sequence tasks. The system generates text predictions for different sequence-to-sequence tasks: machine translation...
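Interactive-predictive systems of this kind rest on prefix-constrained decoding: the model must complete a prefix the user has already validated. A simplified greedy sketch, with a hypothetical `step` callable returning (token, score) pairs for the next position:

```python
# Hedged sketch of prefix-constrained greedy decoding for an
# interactive-predictive loop. `step` and `eos` are assumed interfaces.
def complete(prefix, step, eos, max_len=100):
    out = list(prefix)                 # tokens already accepted by the user
    while len(out) < max_len:
        token = max(step(out), key=lambda kv: kv[1])[0]  # best next token
        if token == eos:
            break
        out.append(token)
    return out
```

After each user correction, the accepted prefix grows and the completion is regenerated from that point, which is the core of the interactive protocol.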
Despite the advances achieved by neural models in sequence-to-sequence learning, which is exploited in a variety of tasks, they still make errors. In many use cases, these are corrected by a human expert in a posterior revision...
Flying ad hoc networks (FANETs) are of particular importance in various military and civilian applications due to their specific features, including frequent topological changes, the movement of drones in a three-dimensional...