检索结果-内蒙古大学图书馆

您好，读者！请登录

内蒙古大学图书馆

首页
概况
党建
资源
服务
科研支持
- 论文收录引用证明
- 科技查新
知识产权
档案馆
帮助

咨询与建议

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

您的常用邮箱：*

您的手机号码：*

问题描述：

当前已输入0个字，您还可以输入200个字

全部搜索
期刊论文
图书
学位论文
标准
纸本馆藏
外文资源发现
数据库导航
超星发现

高级检索

分类表

所选分类

>> <<

限定检索结果

标题

标题
作者
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

作者

作者
标题
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

文献类型

148 篇 会议
82 篇 期刊文献

馆藏范围

230 篇 电子文献
0 种 纸本馆藏

日期分布

学科分类号

165 篇 工学
- 120 篇 计算机科学与技术...
- 106 篇 软件工程
- 47 篇 信息与通信工程
- 16 篇 电气工程
- 15 篇 电子科学与技术（可...
- 15 篇 控制科学与工程
- 14 篇 机械工程
- 6 篇 生物工程
- 5 篇 化学工程与技术
- 5 篇 生物医学工程（可授...
- 4 篇 光学工程
- 3 篇 交通运输工程
- 2 篇 动力工程及工程热...
105 篇 理学
- 76 篇 物理学
- 42 篇 数学
- 19 篇 统计学（可授理学、...
- 12 篇 系统科学
- 7 篇 生物学
- 5 篇 化学
19 篇 管理学
- 11 篇 图书情报与档案管...
- 6 篇 管理科学与工程(可...
- 3 篇 工商管理
- 2 篇 公共管理
5 篇 法学
- 4 篇 社会学
- 1 篇 法学
4 篇 医学
- 4 篇 临床医学
- 2 篇 基础医学(可授医学...
- 2 篇 公共卫生与预防医...
2 篇 文学
- 2 篇 新闻传播学
1 篇 经济学
- 1 篇 应用经济学
1 篇 教育学
1 篇 农学
1 篇 艺术学

主题

51 篇 speech recogniti...
15 篇 hidden markov mo...
15 篇 training
13 篇 neural machine t...
12 篇 machine translat...
12 篇 decoding
12 篇 transducers
11 篇 computer aided l...
9 篇 error analysis
9 篇 recurrent neural...
8 篇 speech
8 篇 feature extracti...
8 篇 neural network
7 篇 modelling langua...
7 篇 vocabulary
7 篇 humans
6 篇 optimization
6 篇 handwriting reco...
5 篇 hierarchical sys...
5 篇 modeling languag...

机构

40 篇 human language t...
37 篇 apptek gmbh aach...
32 篇 human language t...
20 篇 human language t...
10 篇 human language t...
9 篇 human language t...
8 篇 computer science...
8 篇 human language t...
7 篇 spoken language ...
7 篇 apptek gmbh aach...
6 篇 human language t...
6 篇 human language t...
6 篇 human language t...
5 篇 pattern recognit...
5 篇 human language t...
5 篇 future technolog...
4 篇 national laborat...
4 篇 human language t...
4 篇 institute of res...
4 篇 school of medici...

作者

141 篇 ney hermann
55 篇 schlüter ralf
37 篇 hermann ney
16 篇 zeyer albert
16 篇 zhou wei
15 篇 ralf schluter
14 篇 gao yingbo
12 篇 ralf schlüter
12 篇 mansour saab
12 篇 zeineldeen moham...
12 篇 michel wilfried
12 篇 zens richard
11 篇 herold christian
10 篇 bahar parnia
10 篇 peitz stephan
9 篇 peter jan-thorst...
9 篇 schluter ralf
9 篇 freitag markus
9 篇 wang weiyue
8 篇 wuebker joern

语言

230 篇 英文

检索条件"机构=Human Language Technology and Pattern Recognition Computer Science"

共 230 条记录，以下是1-10 订阅

全选清除本页清除全部题录导出标记到"检索档案"

详细简洁

排序：

相关度排序

相关度排序
时效性降序
时效性升序

Improving Long Context Document-Level Machine Translation 4

Improving Long Context Document-Level Machine Translation

引用

4th Workshop on Computational Approaches to Discourse, CODI 2023

作者： Herold, Christian Ney, Hermann Human Language Technology and Pattern Recognition Group Computer Science Department RWTH Aachen University AachenD-52056 Germany

ISBN: (纸本)9781959429890

Document-level context for neural machine translation (NMT) is crucial to improve the translation consistency and cohesion, the translation of ambiguous inputs, as well as several other linguistic phenomena. Many works have been published on the topic of document-level NMT, but most restrict the system to only local context, typically including just the one or two preceding sentences as additional information. This might be enough to resolve some ambiguous inputs, but it is probably not sufficient to capture some document-level information like the topic or style of a conversation. When increasing the context size beyond just the local context, there are two challenges: (i) the memory usage increases exponentially (ii) the translation performance starts to degrade. We argue that the widely-used attention mechanism is responsible for both issues. Therefore, we propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption. For evaluation, we utilize targeted test sets in combination with novel evaluation techniques to analyze the translations in regards to specific discourse-related phenomena. We find that our approach is a good compromise between sentence-level NMT vs attending to the full context, especially in low resource scenarios. © 2023 Association for Computational Linguistics.

关键词： Neural machine translation

来源：评论

学校读者我要写书评

暂无评论

Enhancing and Adversarial: Improve ASR with Speaker Labels 48

Enhancing and Adversarial: Improve ASR with Speaker Labels

引用

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

作者： Zhou, Wei Wu, Haotian Xu, Jingjing Zeineldeen, Mohammad Luscher, Christoph Schluter, Ralf Ney, Hermann Rwth Aachen University Human Language Technology and Pattern Recognition Computer Science Department Aachen52074 Germany AppTek GmbH Aachen52062 Germany

ISBN: (纸本)9781728163277

ASR can be improved by multi-task learning (MTL) with domain enhancing or domain adversarial training, which are two opposite objectives with the aim to increase/decrease domain variance towards domain-aware/agnostic ASR, respectively. In this work, we study how to best apply these two opposite objectives with speaker labels to improve conformer-based ASR. We also propose a novel adaptive gradient reversal layer for stable and effective adversarial training without tuning effort. Detailed analysis and experimental verification are conducted to show the optimal positions in the ASR neural network (NN) to apply speaker enhancing and adversarial training. We also explore their combination for further improvement, achieving the same performance as i-vectors plus adversarial training. Our best speaker-based MTL achieves 7% relative improvement on the Switchboard Hub5'00 set. We also investigate the effect of such speaker-based MTL w.r.t. cleaner dataset and weaker ASR NN. © 2023 IEEE.

关键词： Linearization

来源：评论

学校读者我要写书评

暂无评论

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers 48

Lattice-Free Sequence Discriminative Training for Phoneme-Ba...

引用

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

作者： Yang, Zijian Zhou, Wei Schluter, Ralf Ney, Hermann Rwth Aachen University Human Language Technology and Pattern Recognition Computer Science Department Aachen52074 Germany AppTek GmbH Aachen52062 Germany

ISBN: (纸本)9781728163277

Recently, RNN-Transducers have achieved remarkable results on various automatic speech recognition tasks. However, lattice-free sequence discriminative training methods, which obtain superior performance in hybrid models, are rarely investigated in RNN-Transducers. In this work, we propose three lattice-free training objectives, namely lattice-free maximum mutual information, lattice-free segment-level minimum Bayes risk, and lattice-free minimum Bayes risk, which are used for the final posterior output of the phoneme-based neural transducer with a limited context dependency. Compared to criteria using N-best lists, lattice-free methods eliminate the decoding step for hypotheses generation during training, which leads to more efficient training. Experimental results show that lattice-free methods gain up to 6.5% relative improvement in word error rate compared to a sequence-level cross-entropy trained model. Compared to the N-best-list based minimum Bayes risk objectives, lattice-free methods gain 40% - 70% relative training time speedup with a small degradation in performance. © 2023 IEEE.

关键词： neural transducer sequence discriminative training Speech recognition

来源：评论

学校读者我要写书评

暂无评论

Robust Knowledge Distillation from RNN-T Models with Noisy Training Labels Using Full-Sum Loss 48

Robust Knowledge Distillation from RNN-T Models with Noisy T...

引用

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

作者： Zeineldeen, Mohammad Audhkhasi, Kartik Baskar, Murali Karthick Ramabhadran, Bhuvana Rwth Aachen University Human Language Technology and Pattern Recognition Computer Science Department Aachen52074 Germany Google Llc New York United States

ISBN: (纸本)9781728163277

This work studies knowledge distillation (KD) and addresses its constraints for recurrent neural network transducer (RNN-T) models. In hard distillation, a teacher model transcribes large amounts of unlabelled speech to train a student model. Soft distillation is another popular KD method that distills the output logits of the teacher model. Due to the nature of RNN-T alignments, applying soft distillation between RNNT architectures having different posterior distributions is challenging. In addition, bad teachers having high word-error-rate (WER) reduce the efficacy of KD. We investigate how to effectively distill knowledge from variable quality ASR teachers, which has not been studied before to the best of our knowledge. We show that a sequence-level KD, full-sum distillation, outperforms other distillation methods for RNN-T models, especially for bad teachers. We also propose a variant of full-sum distillation that distills the sequence discriminative knowledge of the teacher leading to further improvement in WER. We conduct experiments on public datasets namely SpeechStew and LibriSpeech, and on in-house production data. © 2023 IEEE.

关键词： Recurrent neural networks

来源：评论

学校读者我要写书评

暂无评论

Revisiting Checkpoint Averaging for Neural Machine Translation 2

Revisiting Checkpoint Averaging for Neural Machine Translati...

引用

2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural language Processing, AACL-IJCNLP 2022

作者： Gao, Yingbo Herold, Christian Yang, Zijian Ney, Hermann Human Language Technology and Pattern Recognition Group Computer Science Department Rwth Aachen University AachenD-52056 Germany

ISBN: (纸本)9781959429043

Checkpoint averaging is a simple and effectivemethod to boost the performance of convergedneural machine translation models. The calculation is cheap to perform and the fact thatthe translation improvement almost comes forfree, makes it widely adopted in neural machine translation research. Despite the popularity, the method itself simply takes the mean ofthe model parameters from several checkpoints,the selection of which is mostly based on empirical recipes without many justifications. In thiswork, we revisit the concept of checkpoint averaging and consider several extensions. Specifically, we experiment with ideas such as usingdifferent checkpoint selection strategies, calculating weighted average instead of simplemean, making use of gradient information andfine-tuning the interpolation weights on development data. Our results confirm the necessityof applying checkpoint averaging for optimalperformance, but also suggest that the landscape between the converged checkpoints israther flat and not much further improvementcompared to simple averaging is to be obtained. © AACL-IJCNLP *** rights reserved

关键词： Neural machine translation

来源：评论

学校读者我要写书评

暂无评论

Does Joint Training Really Help Cascaded Speech Translation?

Does Joint Training Really Help Cascaded Speech Translation?

引用

2022 Conference on Empirical Methods in Natural language Processing, EMNLP 2022

作者： Tran, Viet Anh Khoa Thulke, David Gao, Yingbo Herold, Christian Ney, Hermann Human Language Technology and Pattern Recognition Group Computer Science Department RWTH Aachen University AachenD-52056 Germany

Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results. However, fundamental challenges such as error propagation from the automatic speech recognition system still remain. To mitigate these problems, recently, people turn their attention to direct data and propose various joint training methods. In this work, we seek to answer the question of whether joint training really helps cascaded speech translation. We review recent papers on the topic and also investigate a joint training criterion by marginalizing the transcription posterior probabilities. Our findings show that a strong cascaded baseline can diminish any improvements obtained using joint training, and we suggest alternatives to joint training. We hope this work can serve as a refresher of the current speech translation landscape, and motivate research in finding more efficient and creative ways to utilize the direct data for speech translation. © 2022 Association for Computational Linguistics.

关键词： Speech recognition

来源：评论

学校读者我要写书评

暂无评论

Efficient Training of Neural Transducer for Speech recognition 23

Efficient Training of Neural Transducer for Speech Recogniti...

引用

23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022

作者： Zhou, Wei Michel, Wilfried Schlüter, Ralf Ney, Hermann Human Language Technology and Pattern Recognition Computer Science Department RWTH Aachen University 52074 Aachen Germany AppTek GmbH 52062 Aachen Germany

As one of the most popular sequence-to-sequence modeling approaches for speech recognition, the RNN-Transducer has achieved evolving performance with more and more sophisticated neural network models of growing size and increasing training epochs. While strong computation resources seem to be the prerequisite of training superior models, we try to overcome it by carefully designing a more efficient training pipeline. In this work, we propose an efficient 3-stage progressive training pipeline to build highly-performing neural transducer models from scratch with very limited computation resources in a reasonable short time period. The effectiveness of each stage is experimentally verified on both Librispeech and Switchboard corpora. The proposed pipeline is able to train transducer models approaching state-of-the-art performance with a single GPU in just 2-3 weeks. Our best conformer transducer achieves 4.1% WER on Librispeech test-other with only 35 epochs of training. Copyright © 2022 ISCA.

关键词： Speech recognition

来源：评论

学校读者我要写书评

暂无评论

The Conformer Encoder May Reverse the Time Dimension

The Conformer Encoder May Reverse the Time Dimension

引用

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

作者： Robin Schmitt Albert Zeyer Mohammad Zeineldeen Ralf Schlűter Hermann Ney Computer Science Department Human Language Technology and Pattern Recognition RWTH Aachen University Aachen Germany AppTek GmbH Aachen Germany

ISBN: (数字)9798350368741

ISBN: (纸本)9798350368758

We sometimes observe monotonically decreasing cross-attention weights in our Conformer-based global attention-based encoder-decoder (AED) models, negatively affecting performance compared to monotonically increasing attention weights. Further investigation shows that the Conformer encoder reverses the sequence in the time dimension. We analyze the initial behavior of the decoder cross-attention mechanism and find that it encourages the Conformer encoder self-attention to build a connection between the initial frames and all other informative frames. Furthermore, we show that, at some point in training, the self-attention module of the Conformer starts dominating the output over the preceding feed-forward module, which then only allows the reversed information to pass through. We propose methods and ideas of how this flipping can be avoided and investigate a novel method to obtain label-frame-position alignments by using the gradients of the label log probabilities w.r.t. the encoder input frames.

关键词： Training Signal processing Acoustics Decoding Speech processing

来源：评论

学校读者我要写书评

暂无评论

On the Relation Between Internal language Model and Sequence Discriminative Training for Neural Transducers

On the Relation Between Internal Language Model and Sequence...

引用

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

作者： Zijian Yang Wei Zhou Ralf Schlüter Hermann Ney Computer Science Department Human Language Technology and Pattern Recognition RWTH Aachen University Aachen Germany AppTek GmbH Aachen Germany

Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer with external language model (LM) fusion for speech recognition. In this work, we show that sequence discriminative training has a strong correlation with ILM subtraction from both theoretical and empirical points of view. Theoretically, we derive that the global optimum of maximum mutual information (MMI) training shares a similar formula as ILM subtraction. Empirically, we show that ILM subtraction and sequence discriminative training achieve similar effects across a wide range of experiments on Librispeech, including both MMI and minimum Bayes risk (MBR) criteria, as well as neural transducers and LMs of both full and limited context. The benefit of ILM subtraction also becomes much smaller after sequence discriminative training. We also provide an indepth study to show that sequence discriminative training has a minimal effect on the commonly used zero-encoder ILM estimation, but a joint effect on both encoder and prediction + joint network for posterior probability reshaping including both ILM and blank suppression.

关键词：

来源：评论

学校读者我要写书评

暂无评论

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

Controllable Factuality in Document-Grounded Dialog Systems ...

引用

2022 Findings of the Association for Computational Linguistics: EMNLP 2022

作者： Daheim, Nico Thulke, David Dugast, Christian Ney, Hermann Ubiquitous Knowledge Processing Lab Department of Computer Science Technical University of Darmstadt Germany Human Language Technology and Pattern Recognition RWTH Aachen University Germany AppTek GmbH

In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes' theorem. One component is a traditional ungrounded response generation model and the other component models the reconstruction of the grounding document based on the dialog context and generated response. We propose different approximate decoding schemes and evaluate our approach on multiple open-domain and task-oriented document-grounded dialog datasets. Our experiments show that the model is more factual in terms of automatic factuality metrics than the baseline model. Furthermore, we outline how introducing scaling factors between the components allows for controlling the tradeoff between factuality and fluency in the model output. Finally, we compare our approach to a recently proposed method to control factuality in grounded dialog, CTRL (Rashkin et al., 2021), and show that both approaches can be combined to achieve additional improvements. © 2022 Association for Computational Linguistics.

关键词： Computational linguistics

来源：评论

学校读者我要写书评

暂无评论

没有更多数据了...

全选清除本页清除全部题录导出标记到“检索档案”

共23页 << < 1 2 3 4 5 6 7 8 9 10 > >>

检索报告对象比较合并检索0

隐藏清空

合并搜索

回到顶部

执行限定条件

内容：

评分：

请选择保存的检索档案：

请选择收藏分类：

订阅名称：

通借通还

温馨提示：

图书名称：

借书校区：

取书校区：

手机号码：

邮箱地址：

一卡通帐号：

电话和邮箱必须正确填写，我们会与您联系确认。

联系人：

所在院系：

联系邮箱：

联系电话：

内蒙古自治区呼和浩特市赛罕区大学西街235号邮编: 010021

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：