This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges (DSTC9 and DSTC10). In both iterations the task consists of three subtasks: first d...
This paper describes a lexical trigger model for statistical machine translation. We present various methods using triplets incorporating long-distance dependencies that can go beyond the local context of phrases or n...
This paper summarizes our entries to both subtasks of the first DialDoc shared task, which focuses on the agent response prediction task in goal-oriented document-grounded dialogs. The task is split into two subtasks: ...
ISBN (digital): 9781509066315
ISBN (print): 9781509066322
We present a complete training pipeline to build a state-of-the-art hybrid HMM-based ASR system on the 2nd release of the TED-LIUM corpus. Data augmentation using SpecAugment is successfully applied to improve performance on top of our best SAT model using i-vectors. By investigating the effect of different maskings, we achieve improvements from SpecAugment on hybrid HMM models without increasing model size and training time. A subsequent sMBR training is applied to fine-tune the final acoustic model, and both LSTM and Transformer language models are trained and evaluated. Our best system achieves a 5.6% WER on the test set, which outperforms the previous state-of-the-art by 27% relative.
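As a rough illustration of the masking variants investigated above, the following is a minimal sketch of SpecAugment-style time and frequency masking on a log-mel feature matrix, assuming numpy only; the mask counts and maximum widths are illustrative placeholders, not the settings used in the paper.

```python
# Minimal SpecAugment-style masking sketch (assumed parameters, numpy only).
import numpy as np

def spec_augment(features, num_freq_masks=2, max_freq_width=8,
                 num_time_masks=2, max_time_width=20, rng=None):
    """features: (T, F) array of log-mel features; returns a masked copy."""
    rng = rng or np.random.default_rng()
    x = features.copy()
    T, F = x.shape
    for _ in range(num_freq_masks):
        w = int(rng.integers(0, max_freq_width + 1))
        f0 = int(rng.integers(0, max(F - w, 1)))
        x[:, f0:f0 + w] = x.mean()          # blank out a frequency band
    for _ in range(num_time_masks):
        w = int(rng.integers(0, max_time_width + 1))
        t0 = int(rng.integers(0, max(T - w, 1)))
        x[t0:t0 + w, :] = x.mean()          # blank out a span of frames
    return x

# Usage: augment one utterance of 300 frames with 80 mel channels.
utt = np.random.randn(300, 80)
augmented = spec_augment(utt)
```

Varying the mask counts and widths corresponds to the "effect of different maskings" studied in the abstract; the model size and training time stay untouched since only the input features change.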
Polish is a synthetic language with a high morpheme-per-word ratio. Its high degree of inflection leads to high out-of-vocabulary (OOV) rates and high language model (LM) perplexities, which poses a challenge for large vocabulary continuous speech recognition (LVCSR) systems. Here, the use of morpheme- and syllable-based units is investigated for building sub-lexical LMs. A new type of sub-lexical unit is proposed that combines morphemic or syllabic units with their corresponding pronunciations. Thereby, a set of grapheme-phoneme pairs called graphones is used for building LMs. A relative reduction of 3.5% in word error rate (WER) is obtained with respect to a traditional system based on full words.
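To make the graphone idea concrete, here is a minimal sketch of fusing a sub-lexical unit and its pronunciation into a single joint LM token. The segmentation and phoneme alignment below are entirely hypothetical; in the actual system they would come from a morphological decomposition or syllabifier and the pronunciation lexicon.

```python
# Sketch: build graphone tokens (joint grapheme-phoneme pairs) for LM training.
def to_graphones(segments, pronunciations):
    """segments: sub-word grapheme units; pronunciations: aligned phoneme
    strings, one per segment. Returns joint tokens usable as LM units."""
    assert len(segments) == len(pronunciations)
    return [f"{g}:{p}" for g, p in zip(segments, pronunciations)]

# Hypothetical Polish example: 'domami' (instrumental plural of 'dom')
# split into stem + inflectional ending, each paired with its phonemes.
tokens = to_graphones(["dom", "ami"], ["d o m", "a m i"])
print(tokens)  # ['dom:d o m', 'ami:a m i']
```

Because the pronunciation is part of the token, two sub-lexical units with identical spelling but different phoneme sequences stay distinct in the LM, which is the point of using graphones rather than plain morphemes or syllables.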
In the tandem approach, the output of a neural network (NN) serves as input features to a Gaussian mixture model (GMM), aiming to improve the emission probability estimates. As shown in our previous work, a GMM with pooled covariance matrix can be integrated into a neural network framework as a softmax layer with hidden variables, which allows for joint estimation of both neural network and Gaussian mixture parameters. Here, this approach is extended to include speaker adaptive training (SAT) by introducing a speaker-dependent neural network layer. Error backpropagation beyond this speaker-dependent layer realizes the adaptive training of the Gaussian parameters and, simultaneously, the optimization of the bottleneck (BN) tandem features of the underlying acoustic model. In this study, after initialization by constrained maximum likelihood linear regression (CMLLR), the speaker-dependent layer itself is kept constant during the joint training. Experiments show that the deeper backpropagation through the speaker-dependent layer is necessary for improved recognition performance. The speaker-adaptively and jointly trained BN-GMM yields a 5% relative improvement over a very strong speaker-independent hybrid baseline on the Quaero English broadcast news and conversations task and on the 300-hour Switchboard task.
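The equivalence this abstract builds on can be checked numerically: for Gaussians sharing a pooled covariance, the class posterior is exactly a softmax of a linear function of the input, so the GMM can be written as a softmax layer. The following self-contained numpy sketch (arbitrary dimensions and parameters, not the paper's models) verifies the identity.

```python
# Numeric check: pooled-covariance Gaussians == softmax layer over states.
import numpy as np

rng = np.random.default_rng(0)
D, S = 5, 3                                  # feature dim, number of states
mu = rng.standard_normal((S, D))             # per-state means
A = rng.standard_normal((D, D))
sigma = A @ A.T + D * np.eye(D)              # pooled covariance (SPD)
prior = np.full(S, 1.0 / S)
x = rng.standard_normal(D)

# Direct Bayes posterior from Gaussian likelihoods (shared normalizer and
# the quadratic term in x cancel across states).
inv = np.linalg.inv(sigma)
loglik = np.array([-0.5 * (x - m) @ inv @ (x - m) for m in mu])
post_gmm = np.exp(loglik) * prior
post_gmm /= post_gmm.sum()

# Equivalent softmax layer: weights and biases derived from the Gaussians.
W = mu @ inv                                 # (S, D)
b = -0.5 * np.einsum('sd,de,se->s', mu, inv, mu) + np.log(prior)
logits = W @ x + b
post_softmax = np.exp(logits - logits.max())
post_softmax /= post_softmax.sum()

assert np.allclose(post_gmm, post_softmax)   # identical posteriors
```

This linear-plus-softmax form is what makes joint gradient training of the Gaussian parameters together with the preceding NN layers possible.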
Log-linear models are a promising approach for speech recognition. Typically, log-linear models are trained according to a strictly convex criterion, so optimization algorithms are guaranteed to converge to the unique global optimum of the objective function from any initialization. For large-scale applications, however, considerations in the limit of infinite iterations are not sufficient. We show that log-linear training can be a highly ill-conditioned optimization problem, resulting in extremely slow convergence. Conversely, the optimization problem can be preconditioned by feature transformations. Making use of our convergence analysis, we improve our log-linear speech recognition system and achieve a substantial reduction of its training time. In addition, we validate our analysis on a continuous handwriting recognition task.
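A toy illustration of the conditioning argument, under synthetic assumptions: gradient descent on a strictly convex log-linear (logistic regression) objective stalls when the features are badly scaled, and converges quickly after a simple per-dimension standardization, one of the feature transformations that act as a preconditioner.

```python
# Toy demo: ill-conditioned vs. preconditioned log-linear training.
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.standard_normal((n, 2)) * np.array([100.0, 0.01])  # badly scaled
w_true = np.array([0.02, 150.0])                           # both dims matter
y = (X @ w_true + 0.1 * rng.standard_normal(n) > 0).astype(float)

def loss_after(X, steps, lr):
    """Run `steps` gradient steps on the negative log-likelihood."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / n
    p = np.clip(1.0 / (1.0 + np.exp(-X @ w)), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

Xs = (X - X.mean(0)) / X.std(0)               # precondition the features

# The raw problem needs a tiny step size to stay stable and barely moves
# along the poorly scaled dimension; the standardized problem tolerates a
# large step and reaches a far lower loss within the same budget.
print(loss_after(X, 500, lr=1e-4), loss_after(Xs, 500, lr=1.0))
```

The same objective and the same number of iterations, but the feature transformation changes the curvature of the problem, which is exactly the effect the convergence analysis exploits.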
ISBN (print): 9781457705380
In this paper, we propose a new method for computing and applying language model look-ahead in a dynamic network decoder, exploiting the sparseness of backing-off n-gram language models. Only partial (sparse) look-ahead tables are computed, whose size depends on the number of words that have an n-gram score in the language model for a specific context, rather than being of constant, vocabulary-dependent size. Since high-order backing-off language models are inherently sparse, this mechanism reduces the runtime and memory effort of computing the look-ahead tables by orders of magnitude. A modified decoding algorithm is required to apply these sparse LM look-ahead tables efficiently. We show that sparse LM look-ahead is much more efficient than the classical method, and that full n-gram look-ahead becomes favorable over lower-order look-ahead even when many distinct LM contexts appear during decoding.
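The following simplified sketch (with a hypothetical toy bigram LM) shows the core idea: the look-ahead table for a context holds entries only for words with an explicit n-gram, and every other word is served via the backoff weight plus the shared lower-order table, so no vocabulary-sized table is ever built per context. A real decoder additionally maximizes these scores over the word ends reachable from each prefix-tree node, which this sketch omits.

```python
# Sketch of sparse LM look-ahead over a toy backing-off bigram LM.
import math

unigram = {"a": 0.5, "b": 0.3, "c": 0.2}       # dense, shared base table
bigram = {("a", "b"): 0.7}                     # sparse explicit entries
backoff = {"a": 0.43}                          # roughly normalized weight

unigram_la = {w: math.log(p) for w, p in unigram.items()}

def sparse_lookahead(context):
    """Partial table: only words with an explicit n-gram for `context`."""
    return {w: math.log(p) for (h, w), p in bigram.items() if h == context}

def lookahead_score(context, word):
    table = sparse_lookahead(context)          # size = #explicit entries
    if word in table:
        return table[word]
    # Backoff path: no vocabulary-sized table is built for this context.
    return math.log(backoff.get(context, 1.0)) + unigram_la[word]

print(lookahead_score("a", "b"))   # explicit bigram score
print(lookahead_score("a", "c"))   # backoff weight + unigram look-ahead
```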
ISBN (print): 9781424445004
We present a novel confidence-based discriminative training for model adaptation approach for an HMM-based Arabic handwriting recognition system to handle different handwriting styles and their variations. Most current approaches are maximum-likelihood trained HMM systems that try to adapt their models to different writing styles using writer adaptive training, unsupervised clustering, or additional writer-specific data. Discriminative training based on the maximum mutual information criterion is used to train writer-independent handwriting models. For model adaptation during decoding, an unsupervised confidence-based discriminative training on a word and frame level within a two-pass decoding process is proposed. Additionally, the training criterion is extended to incorporate a margin term. The proposed methods are evaluated on the IFN/ENIT Arabic handwriting database, where the proposed adaptation approach decreases the word error rate by 33% relative.
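As a rough sketch of the unsupervised, confidence-based selection step in such a two-pass scheme (hypothetical hypothesis data and threshold; the actual system scores on both word and frame level), only first-pass words whose confidence clears a threshold would be kept as supervision for the adaptation update:

```python
# Sketch: confidence gating of first-pass hypotheses before adaptation.
CONF_THRESHOLD = 0.7    # assumed operating point, not the paper's value

first_pass = [          # (word, confidence, frame span) from pass one
    ("word1", 0.95, (0, 40)),
    ("word2", 0.40, (40, 75)),   # low confidence: excluded from adaptation
    ("word3", 0.85, (75, 120)),
]

adaptation_set = [(w, span) for w, conf, span in first_pass
                  if conf >= CONF_THRESHOLD]
print(adaptation_set)   # frames behind confident words feed the update
```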
ISBN (digital): 9781509066315
ISBN (print): 9781509066322
In hybrid HMM-based speech recognition, LSTM language models have been widely applied and have achieved large improvements. Their theoretical capability of modeling unlimited context suggests that no recombination should be applied in decoding. This motivates reconsidering full summation over the HMM-state sequences instead of the Viterbi approximation in decoding. We explore the potential gain from more accurate probabilities in terms of decision making and apply full-sum decoding within a modified prefix-tree search framework. The proposed full-sum decoding is evaluated on both the Switchboard and LibriSpeech corpora. Models trained with both the CE and sMBR criteria are used. Additionally, both MAP and confusion network decoding, as approximated variants of the general Bayes decision rule, are evaluated. Consistent improvements over strong baselines are achieved in almost all cases without extra cost. We also discuss tuning effort, efficiency, and some limitations of full-sum decoding.
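The difference between the Viterbi approximation and full summation can be shown on a toy HMM: the same forward recursion computes the best-path score when the reduction is max, and the total sequence probability when it is logsumexp. The transition and emission values below are arbitrary illustrations, not a model from the paper.

```python
# Toy contrast: Viterbi (max) vs. full-sum (logsumexp) sequence scoring.
import numpy as np
from scipy.special import logsumexp

log_trans = np.log(np.array([[0.7, 0.3],
                             [0.4, 0.6]]))       # (from state, to state)
log_emit = np.log(np.array([[0.9, 0.1],         # state 0 emits obs 0 / 1
                            [0.2, 0.8]]))       # state 1 emits obs 0 / 1
log_init = np.log(np.array([0.5, 0.5]))
obs = [0, 1, 1, 0]

def score(reduce_fn):
    """Forward recursion; reduce_fn decides max (Viterbi) vs. sum."""
    alpha = log_init + log_emit[:, obs[0]]
    for o in obs[1:]:
        alpha = reduce_fn(alpha[:, None] + log_trans, axis=0) + log_emit[:, o]
    return reduce_fn(alpha)

viterbi = score(np.max)          # score of the single best state sequence
full_sum = score(logsumexp)      # score summed over all state sequences
print(viterbi, full_sum)         # full_sum >= viterbi always holds
```

With whole-word or whole-hypothesis scores, the gap between the two values is what full-sum decoding turns into more accurate probabilities for the Bayes decision rule.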