Search Results - Inner Mongolia University Library

arXiv, 2019

Authors: Wang, Yiming; Chen, Tongfei; Xu, Hainan; Ding, Shuoyang; Lv, Hang; Shao, Yiwen; Peng, Nanyun; Xie, Lei; Watanabe, Shinji; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States; Information Sciences Institute, University of Southern California, Los Angeles, CA, United States; ASLP@NPU, School of Computer Science, Northwestern Polytechnical University, Xi'an, China

We present ESPRESSO, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit FAIRSEQ. ESPRESSO supports distributed training across GPUs and computing nodes, and features various decoding approaches commonly employed in ASR, including look-ahead word-based language model fusion, for which a fast, parallelized decoder is implemented. ESPRESSO achieves state-of-the-art ASR performance on the WSJ, Librispeech, and Switchboard data sets among other end-to-end systems without data augmentation, and is 4-11× faster for decoding than similar systems (e.g. ESPNET). Copyright © 2019, The Authors. All rights reserved.

Keywords: speech recognition

Pretraining by Backtranslation for End-to-end ASR in Low-Resource Settings

arXiv, 2018

Authors: Wiesner, Matthew; Renduchintala, Adithya; Watanabe, Shinji; Liu, Chunxi; Dehak, Najim; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

We explore training attention-based encoder-decoder ASR in low-resource settings. These models perform poorly when trained on small amounts of transcribed speech, in part because they depend on having sufficient target-side text to train the attention and decoder networks. In this paper we address this shortcoming by pretraining our network parameters using only text-based data and transcribed speech from other languages. We analyze the relative contributions of both sources of data. Across 3 test languages, our text-based approach resulted in a 20% average relative improvement over a text-based augmentation technique without pretraining. Using transcribed speech from nearby languages gives a further 20-30% relative reduction in character error rate. Copyright © 2018, The Authors. All rights reserved.

Keywords: Signal encoding

Low-resource contextual topic identification on speech

arXiv, 2018

Authors: Liu, Chunxi; Wiesner, Matthew; Watanabe, Shinji; Harman, Craig; Trmal, Jan; Dehak, Najim; Khudanpur, Sanjeev
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploring the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the context-independent models. Copyright © 2018, The Authors. All rights reserved.

Keywords: Recurrent neural networks

A Synthetic Recipe for OCR

International Conference on Document Analysis and Recognition

Authors: David Etter; Stephen Rawls; Cameron Carpenter; Gregory Sell
Affiliations: Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, USA; Information Sciences Institute, University of Southern California; Johns Hopkins University

Synthetic data generation for optical character recognition (OCR) promises unlimited training data at zero annotation cost. With enough fonts and seed text, we should be able to generate data to train a model that approaches or exceeds the performance with real annotated data. Unfortunately, this is not always the reality. Unconstrained image settings, such as internet memes, scanned web pages, or newspapers, present diverse scripts, fonts, layouts, and complex backgrounds, which cause models trained with synthetic data to break down. In this work, we investigate the synthetic image generation problem on a large multilingual set of unconstrained document images. Our work presents a comprehensive evaluation of the impact of synthetic data attributes on model performance. The results provide a recipe for synthetic data generation that will help guide future research.

Keywords: Training; Optical character recognition software; Simultaneous localization and mapping; Training data; Data models; Image color analysis; Layout

Multi-task domain adaptation for sequence tagging

2nd Workshop on Representation Learning for NLP, Rep4NLP 2017 at the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017

Authors: Peng, Nanyun; Dredze, Mark
Affiliations: Human Language Technology Center of Excellence, Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, United States

ISBN: 9781945626623 (print)

Many domain adaptation approaches rely on learning cross domain shared representations to transfer the knowledge learned in one domain to other domains. Traditional domain adaptation only considers adapting for one task. In this paper, we explore multi-task representation learning under the domain adaptation scenario. We propose a neural network framework that supports domain adaptation for multiple tasks simultaneously, and learns shared representations that better generalize for domain adaptation. We apply the proposed framework to domain adaptation for sequence tagging problems considering two tasks: Chinese word segmentation and named entity recognition. Experiments show that multi-task domain adaptation works better than disjoint domain adaptation for each task, and achieves the state-of-the-art results for both tasks in the social media domain. © 2017 Association for Computational Linguistics.

Calibration of Deep Probabilistic Models with Decoupled Bayesian Neural Networks

arXiv, 2019

Authors: Maroñas, Juan; Paredes, Roberto; Ramos, Daniel
Affiliations: PRHLT - Pattern Recognition and Human Language Technology Research Center, Universitat Politècnica de Valencia, Spain; AUDIAS - Audio Data Intelligence and Speech, Universidad Autónoma de Madrid, Spain

Deep Neural Networks (DNNs) have achieved state-of-the-art accuracy performance in many tasks. However, recent works have pointed out that the outputs provided by these models are not well-calibrated, seriously limiting their use in critical decision scenarios. In this work, we propose to use a decoupled Bayesian stage, implemented with a Bayesian Neural Network (BNN), to map the uncalibrated probabilities provided by a DNN to calibrated ones, consistently improving calibration. Our results evidence that incorporating uncertainty provides more reliable probabilistic models, a critical condition for achieving good calibration. We report a generous collection of experimental results using high-accuracy DNNs in standardized image classification benchmarks, showing the good performance, flexibility and robust behavior of our approach with respect to several state-of-the-art calibration methods. Code for reproducibility is provided. Copyright © 2019, The Authors. All rights reserved.

Keywords: Calibration

Predicting asymmetric transitive relations in knowledge bases

1st Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis, KG4IR 2017

Authors: Rastogi, Pushpendre; Van Durme, Benjamin
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

Knowledge Base Completion (KBC), or link prediction, is the task of inferring missing edges in an existing knowledge graph. Although a number of methods have been evaluated empirically on select datasets for KBC, much less attention has been paid to understanding the relationship between the logical properties encoded by a given KB and the KBC method being evaluated. In this paper we study the effect of the logical properties of a relation on the performance of a KBC method, and we present a theorem and empirical results that can guide researchers in choosing the KBC algorithm for a KB. © Copyright by the paper's authors.

Keywords: Knowledge graph

The JHU machine translation systems for WMT 2017

2nd Conference on Machine Translation, WMT 2017

Authors: Ding, Shuoyang; Khayrallah, Huda; Koehn, Philipp; Post, Matt; Kumar, Gaurav; Duh, Kevin
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

ISBN: 9781945626968 (print)

This paper describes the Johns Hopkins University submissions to the shared translation task of EMNLP 2017 Second Conference on Machine Translation (WMT 2017). We set up phrase-based, syntax-based and/or neural machine translation systems for all 14 language pairs of this year's evaluation campaign. We also performed neural rescoring of phrase-based systems for English-Turkish and English-Finnish. © 2017 Association for Computational Linguistics

Keywords: Neural machine translation

Training relation embeddings under logical constraints

1st Workshop on Knowledge Graphs and Semantics for Text Retrieval and Analysis, KG4IR 2017

Authors: Rastogi, Pushpendre; Poliak, Adam; Van Durme, Benjamin
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

We present ways of incorporating logical rules into the construction of embedding based Knowledge Base Completion (KBC) systems. Enforcing "logical consistency" in the predictions of a KBC system guarantees that the predictions comply with logical rules such as symmetry, implication and generalized transitivity. Our method encodes logical rules about entities and relations as convex constraints in the embedding space to enforce the condition that the score of a logically entailed fact must never be less than the minimum score of an antecedent fact. Such constraints provide a weak guarantee that the predictions made by our KBC model will match the output of a logical knowledge base for many types of logical inferences. We validate our method via experiments on a knowledge graph derived from WordNet. © Copyright by the paper's authors.
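
The consistency condition stated in this abstract (an entailed fact must score at least as high as the lowest-scoring antecedent fact) can be illustrated with a small check. This is only a sketch: the function name `is_consistent`, the triple layout, and the toy scores are assumptions made for illustration, and the code verifies the inequality after the fact rather than enforcing it through convex constraints in the embedding space as the abstract describes.

```python
from typing import Callable, Dict, List, Tuple

# A fact is a (head entity, relation, tail entity) triple. Hypothetical layout.
Fact = Tuple[str, str, str]


def is_consistent(score: Callable[[Fact], float],
                  antecedents: List[Fact],
                  entailed: Fact) -> bool:
    """Return True if the entailed fact scores at least as high as the
    lowest-scoring antecedent fact."""
    return score(entailed) >= min(score(fact) for fact in antecedents)


# Toy scores, invented purely for illustration (not taken from the paper).
toy_scores: Dict[Fact, float] = {
    ("dog", "hypernym", "canine"): 0.9,
    ("canine", "hypernym", "mammal"): 0.8,
    ("dog", "hypernym", "mammal"): 0.85,
}

# A transitivity-style rule: the two antecedent facts entail the third.
rule_antecedents: List[Fact] = [
    ("dog", "hypernym", "canine"),
    ("canine", "hypernym", "mammal"),
]
rule_entailed: Fact = ("dog", "hypernym", "mammal")

consistent = is_consistent(toy_scores.__getitem__, rule_antecedents, rule_entailed)
print(consistent)
```

With these toy scores the entailed fact (0.85) is not below the weakest antecedent (0.8), so the condition holds; lowering the entailed score below 0.8 would violate it.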

Keywords: Forecasting

On the evaluation of semantic phenomena in neural machine translation using natural language inference

2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018

Authors: Poliak, Adam; Belinkov, Yonatan; Glass, James; Van Durme, Benjamin
Affiliations: Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218, United States; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States

ISBN: 9781948087292 (print)

We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena. We use these representations as features to train a natural language inference (NLI) classifier based on datasets recast from existing semantic annotations. In applying this process to a representative NMT system, we find its encoder appears most suited to supporting inferences at the syntax-semantics interface, as compared to anaphora resolution requiring world knowledge. We conclude with a discussion on the merits and potential deficiencies of the existing process, and how it may be improved and extended as a broader framework for evaluating semantic coverage. © 2018 Association for Computational Linguistics.

Keywords: Neural machine translation
