检索结果-内蒙古大学图书馆

50th Annual Meeting of the Association for Computational Linguistics, ACL 2012

作者： Wuebker, Joern Ney, Hermann Zens, Richard Computer Science Department Human Language Technology and Pattern Recognition Group RWTH Aachen University Germany Google Inc. 1600 Amphitheatre Parkway Mountain View CA 94043 United States

ISBN: (纸本)9781937284251

In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions. Our results show that language model based pre-sorting yields a small improvement in translation quality and a speedup by a factor of 2. Two look-ahead methods are shown to further increase translation speed by a factor of 2 without changing the search space and a factor of 4 with the side-effect of some additional search errors. We compare our approach with Moses and observe the same performance, but a substantially better trade-off between translation quality and speed. At a speed of roughly 70 words per second, Moses reaches 17.2% BLEU, whereas our approach yields 20.0% with identical models. © 2012 Association for Computational Linguistics.

关键词： Decoding

来源：评论

学校读者我要写书评

暂无评论

Bag-of-visual-words models for adult image classification and filtering

Bag-of-visual-words models for adult image classification an...

引用

作者： Deselaers, Thomas Pimenidis, Lexi Ney, Hermann Computer Science Department Human Language Technology and Pattern Recognition RWTH Aachen University Aachen Germany Security and Privacy Research RWTH Aachen University Aachen Germany

ISBN: (纸本)9781424421756

We present a method to classify images into different categories of pornographic content to create a system for filtering pornographic images from network traffic. Although different systems for this application were presented in the past, most of these systems are based on simple skin colour features and have rather poor performance. Recent advances in the image recognition field in particular for the classification of objects have shown that bag-of-visual-words-approaches are a good method for many image classification problems. The system we present here, is based on this approach, uses a task-specific visual vocabulary and is trained and evaluated on an image database of 8500 images from different categories. It is shown that it clearly outperforms earlier systems on this dataset and further evaluation on two novel web-traffic collections shows the good performance of the proposed system. © 2008 IEEE.

关键词： Image classification

来源：评论

学校读者我要写书评

暂无评论

Investigations on hessian-free optimization for cross-entropy training of deep neural networks

Investigations on hessian-free optimization for cross-entrop...

引用

14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013

作者： Wiesler, Simon Li, Jinyu Xue, Jian Computer Science Department Human Language Technology and Pattern Recognition RWTH Aachen University 52056 Aachen Germany Microsoft Corporation Redmond WA 98052 United States

Context-dependent deep neural network HMMs have been shown to achieve recognition accuracy superior to Gaussian mixture models in a number of recent works. Typically, neural networks are optimized with stochastic gradient descent. On large datasets, stochastic gradient descent improves quickly during the beginning of the optimization. But since it does not make use of second order information, its asymptotic convergence behavior is slow. In regions with pathological curvature, stochastic gradient descent may almost stagnate and thereby falsely indicate convergence. Another drawback of stochastic gradient descent is that it can only be parallelized within minibatches. The Hessian-free algorithm is a second order batch optimization algorithm that does not suffer from these problems. In a recent work, Hessian-free optimization has been applied to a training of deep neural networks according to a sequence criterion. In that work, improvements in accuracy and training time have been reported. In this paper, we analyze the properties of the Hessian-free optimization algorithm and investigate whether it is suited for cross-entropy training of deep neural networks as well. Copyright © 2013 ISCA.

关键词： Deep neural networks

来源：评论

学校读者我要写书评

暂无评论

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

Controllable Factuality in Document-Grounded Dialog Systems ...

引用

2022 Findings of the Association for Computational Linguistics: EMNLP 2022

作者： Daheim, Nico Thulke, David Dugast, Christian Ney, Hermann Ubiquitous Knowledge Processing Lab Department of Computer Science Technical University of Darmstadt Germany Human Language Technology and Pattern Recognition RWTH Aachen University Germany AppTek GmbH

In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes' theorem. One component is a traditional ungrounded response generation model and the other component models the reconstruction of the grounding document based on the dialog context and generated response. We propose different approximate decoding schemes and evaluate our approach on multiple open-domain and task-oriented document-grounded dialog datasets. Our experiments show that the model is more factual in terms of automatic factuality metrics than the baseline model. Furthermore, we outline how introducing scaling factors between the components allows for controlling the tradeoff between factuality and fluency in the model output. Finally, we compare our approach to a recently proposed method to control factuality in grounded dialog, CTRL (Rashkin et al., 2021), and show that both approaches can be combined to achieve additional improvements. © 2022 Association for Computational Linguistics.

关键词： Computational linguistics

来源：评论

学校读者我要写书评

暂无评论

Revisiting Checkpoint Averaging for Neural Machine Translation

arXiv

引用

arXiv 2022年

作者： Gao, Yingbo Herold, Christian Yang, Zijian Ney, Hermann Human Language Technology and Pattern Recognition Group Computer Science Department RWTH Aachen University Germany

Checkpoint averaging is a simple and effective method to boost the performance of converged neural machine translation models. The calculation is cheap to perform and the fact that the translation improvement almost comes for free, makes it widely adopted in neural machine translation research. Despite the popularity, the method itself simply takes the mean of the model parameters from several checkpoints, the selection of which is mostly based on empirical recipes without many justifications. In this work, we revisit the concept of checkpoint averaging and consider several extensions. Specifically, we experiment with ideas such as using different checkpoint selection strategies, calculating weighted average instead of simple mean, making use of gradient information and fine-tuning the interpolation weights on development data. Our results confirm the necessity of applying checkpoint averaging for optimal performance, but also suggest that the landscape between the converged checkpoints is rather flat and not much further improvement compared to simple averaging is to be obtained. Copyright © 2022, The Authors. All rights reserved.

关键词： Neural machine translation

来源：评论

学校读者我要写书评

暂无评论

OPEN VOCABULARY HANDWRITING recognition USING COMBINED WORD-LEVEL AND CHARACTER-LEVEL language MODELS

OPEN VOCABULARY HANDWRITING RECOGNITION USING COMBINED WORD-...

引用

IEEE International Conference on Acoustics, Speech, and Signal Processing

作者： Michal Kozielski David Rybach Stefan Hahn Ralf Schluter Hermann Ney Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University Aachen Germany Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University Aachen Germany

ISBN: (纸本)9781479903573

In this paper, we present a unified search strategy for open vocabulary handwriting recognition using weighted finite state transducers. Additionally to a standard word-level language model we introduce a separate n-gram character-level language model for out-of-vocabulary word detection and recognition. The probabilities assigned by those two models are combined into one Bayes decision rule. We evaluate the proposed method on the IAM database of English handwriting. An improvement from 22.2% word error rate to 17.3% is achieved comparing to the closed-vocabulary scenario and the best published result.

关键词： Handwriting Vocabulary modelling languages search strategies Error analysis Word Bayes Decision Rule handwriting recognition

来源：评论

学校读者我要写书评

暂无评论

THE RWTH 2010 QUAERO ASR EVALUATION SYSTEM FOR ENGLISH, FRENCH, AND GERMAN

THE RWTH 2010 QUAERO ASR EVALUATION SYSTEM FOR ENGLISH, FREN...

引用

IEEE International Conference on Acoustics, Speech and Signal Processing

作者： M. Sundermeyer M. Nussbaum-Thom S. Wiesler C. Plahl A. El-Desoky Mousa S. Hahn D. Nolden R. Schluter H. Ney Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University

ISBN: (纸本)9781457705380

Recognizing Broadcast Conversational (BC) speech data is a difficult task, which can be regarded as one of the major challenges beyond the recognition of Broadcast News (BN). This paper presents the automatic speech recognition systems developed by RWTH for the English, French, and German language which attained the best word error rates for English and German, and competitive results for the French task in the 2010 Quaero evaluation for BC and BN data. At the same time, the RWTH German system used the least amount of training data among all participants. Large reductions in word error rate were obtained by the incorporation of the new Bottleneck Multilayer Perceptron (MLP) features for all three languages. Additional improvements were obtained for the German system by applying a new language modeling technique, decomposing words into sublexical components.

关键词： automatic speech recognition multilayer perceptrons

来源：评论

学校读者我要写书评

暂无评论

FEATURE COMBINATION AND STACKING OF RECURRENT AND NON-RECURRENT NEURAL NETWORKS FOR LVCSR

FEATURE COMBINATION AND STACKING OF RECURRENT AND NON-RECURR...

引用

IEEE International Conference on Acoustics, Speech, and Signal Processing

作者： Christian Plahl Michael Kozielski Ralf Schluter Hermann Ney Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University

ISBN: (纸本)9781479903573

This paper investigates the combination of different short-term features and the combination of recurrent and non-recurrent neural networks (NNs) on a Spanish speech recognition task. Several methods exist to combine different feature sets such as concatenation or linear discriminant analysis (LDA). Even though all these techniques achieve reasonable improvements, feature combination by multi-layer perceptrons (MLPs) outperforms all known approaches. We develop the concept of MLP based feature combination further using recurrent neural networks (RNNs). The phoneme posterior estimates derived from an RNN lead to a significant improvement over the result of the MLPs and achieve a 5% relative better word error rate (WER) with much less parameters. Moreover, we improve the system performance further by combining an MLP and an RNN in a hierarchical framework. The MLP benefits from the preprocessing of the RNN. All NNs are trained on phonemes. Nevertheless, the same concepts could be applied using context-dependent states. In addition to the improvements in recognition performance w.r.t. WER, NN based feature combination methods reduce both, the training and the testing complexity. Overall, the systems are based on a single set of acoustic models, together with the training of different NNs.

关键词： Feature combination Multi-layer perceptron Recurrent neural networks Long-short-term-memory Speech recognition recurrent neural nets Speech recognition CSRP3 gene Stacking Neural network System performance Training

来源：评论

学校读者我要写书评

暂无评论

Direct construction of compact context-dependency transducers from data

Direct construction of compact context-dependency transducer...

引用

作者： Rybach, David Riley, Michael Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University Germany Google Inc. 76 Ninth Avenue New York NY United States

This paper describes a new method for building compact context-dependency transducers for finite-state transducer-based ASR decoders. Instead of the conventional phonetic decision-tree growing followed by FST compilation, this approach incorporates the phonetic context splitting directly into the transducer construction. The objective function of the split optimization is augmented with a regularization term that measures the number of transducer states introduced by a split. We give results on a large spoken-query task for various n-phone orders and other phonetic features that show this method can greatly reduce the size of the resulting context-dependency transducer with no significant impact on recognition accuracy. This permits using context sizes and features that might otherwise be unmanageable. © 2010 ISCA.

关键词： Transducers

来源：评论

学校读者我要写书评

暂无评论

Audio segmentation for speech recognition using segment features

Audio segmentation for speech recognition using segment feat...

引用

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

作者： David Rybach Christian Gollan Ralf Schluter Hermann Ney Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University Germany

Audio segmentation is an essential preprocessing step in several audio processing applications with a significant impact e.g. on speech recognition performance. We introduce a novel framework which combines the advantages of different well known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream in a holistic way by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison to other segmentation techniques in terms of speech recognition performance is presented, showing a promising segmentation quality of our approach.

关键词： Speech recognition Streaming media Decoding Broadcasting Loudspeakers Automatic speech recognition Bayesian methods humans Natural languages pattern recognition

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：