Search Results - Inner Mongolia University Library

41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016

Authors: Mallidi, Sri Harish; Hermansky, Hynek; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, United States

ISBN: (print) 9781479999880

Robustness of automatic speech recognition (ASR) to acoustic mismatches can be improved by multistream framework. Frequently used approach to combine decisions from individual streams involve training large number of neural networks, one for each possible stream combination. In this work, we propose to simplify the fusion by replacing the large number of fusion networks with a single fusion network. During training of the proposed fusion network, features from a stream are randomly dropped out. At test time, corrupted streams are identified and dropped out to improve robustness. Using the proposed approach, we were able to achieve significant reduction in number of parameters, while remaining in less than 2.5 % relative degradation of conventional fusion technique. Furthermore, proposed fusion network is also applied in a multistream ASR system to improve noise robustness of Aurora4 speech recognition task. Noticeable improvements were observed over baseline systems (relative improvement of 9.2 % in microphone mismatch and 3.2 % in additive noise conditions). © 2016 IEEE.

Keywords: deep neural networks; multistream ASR; performance monitoring; stream fusion

Multi-speaker conversations, cross-talk, and diarization for speaker recognition

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Gregory Sell; Alan McCree; Human Language Technology Center of Excellence, The Johns Hopkins University, Baltimore, MD 21218, USA

ISBN: (print) 9781509041183

I-vector training and extraction assume that a speech file is spoken by a single speaker. This work considers the effects of violating that assumption with the presence of cross-talk or multi-speaker conversations. First, it is demonstrated that these problematic speech files can be detected using the i-vector representation itself. The impact of these violations of the single-speaker assumption are then explored along with strategies to mitigate it. It is shown that, even in predominantly clean data, the removal of cross-talk can provide modest gains, but that T matrix and PLDA training are largely robust to these types of noise. It is also shown that detection in front of diarization is a reasonable strategy in the presence of data with an unknown proportion of multi-speaker conversations. Finally, in the course of this work, evidence is found that cross-talk detection and multi-speaker detection may in fact be different tasks that require separately trained detectors.

Keywords: speaker diarization; speaker recognition; i-vectors; crosstalk; Conversation; S matrix

Punctuation prediction model for conversational speech

arXiv, 2018

Authors: Zelasko, Piotr; Szymański, Piotr; Mizgajski, Jan; Szymczak, Adrian; Carmiel, Yishay; Dehak, Najim; Intelligent Wire, United States; Department of Computer Science, Electronics and Telecommunications, AGH University of Science and Technology, al. Mickiewicza 30, Kraków, Poland; Department of Computational Intelligence, Wroclaw University of Technology, Wybrzeze Stanislawa Wyspiańskiego 27, Wroclaw 50-370, Poland; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, United States

An ASR system usually does not predict any punctuation or capitalization. Lack of punctuation causes problems in result presentation and confuses both the human reader and off-the-shelf natural language processing algorithms. To overcome these limitations, we train two variants of Deep Neural Network (DNN) sequence labelling models - a Bidirectional Long Short-Term Memory (BLSTM) and a Convolutional Neural Network (CNN), to predict the punctuation. The models are trained on the Fisher corpus which includes punctuation annotation. In our experiments, we combine time-aligned and punctuated Fisher corpus transcripts using a sequence alignment algorithm. The neural networks are trained on Common Web Crawl GloVe embedding of the words in Fisher transcripts aligned with conversation side indicators and word time information. The CNNs yield a better precision and BLSTMs tend to have better recall. While BLSTMs make fewer mistakes overall, the punctuation predicted by the CNN is more accurate - especially in the case of question marks. Our results constitute significant evidence that the distribution of words in time, as well as pre-trained embeddings, can be useful in the punctuation prediction task. Copyright © 2018, The Authors. All rights reserved.

Keywords: Forecasting

Sentential paraphrasing as black-box machine translation

2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016

Authors: Napoles, Courtney; Callison-Burch, Chris; Post, Matt; Center for Language and Speech Processing, Johns Hopkins University, United States; Computer and Information Science Department, University of Pennsylvania, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States

We present a simple, prepackaged solution to generating paraphrases of English sentences. We use the Paraphrase Database (PPDB) for monolingual sentence rewriting and provide machine translation language packs: prepackaged, tuned models that can be downloaded and used to generate paraphrases on a standard Unix environment. The language packs can be treated as a black box or customized to specific tasks. In this demonstration, we will explain how to use the included interactive web-based tool to generate sentential paraphrases. © NAACL-HLT 2016 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Demonstrations Session. All rights reserved.

Keywords: Machine translation

A study of imitation learning methods for semantic role labeling

2016 Workshop on Structured Prediction for Natural Language Processing, NLP 2016 at the Conference on Empirical Methods in Natural Language Processing, EMNLP 2016

Authors: Wolfe, Travis; Dredze, Mark; van Durme, Benjamin; Human Language Technology Center of Excellence, Johns Hopkins University, United States

ISBN: (print) 9781945626296

Global features have proven effective in a wide range of structured prediction problems but come with high inference costs. Imitation learning is a common method for training models when exact inference isn't feasible. We study imitation learning for Semantic Role Labeling (SRL) and analyze the effectiveness of the Violation Fixing Perceptron (VFP) (Huang et al., 2012) and Locally Optimal Learning to Search (LOLS) (Chang et al., 2015) frameworks with respect to SRL global features. We describe problems in applying each framework to SRL and evaluate the effectiveness of some solutions. We also show that action ordering, including easy first inference, has a large impact on the quality of greedy global models. © 2016 Association for Computational Linguistics.

Keywords: Semantics

Generating Politically-Relevant Event Data

EMNLP 2016 1st Workshop on Natural Language Processing and Computational Social Science, NLP + CSS 2016

Authors: Beieler, John; Human Language Technology Center of Excellence, Johns Hopkins University, United States

ISBN: (print) 9781945626265

Automatically generated political event data is an important part of the social science data ecosystem. The approaches for generating this data, though, have remained largely the same for two decades. During this time, the field of computational linguistics has progressed tremendously. This paper presents an overview of political event data, including methods and ontologies, and a set of experiments to determine the applicability of deep neural networks to the extraction of political events from news text. ©2016 Association for Computational Linguistics.

Keywords: Deep neural networks

Fluency detection on communication networks

2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016

Authors: Lippincott, Tom; Van Durme, Benjamin; Human Language Technology Center of Excellence, Johns Hopkins University, United States

ISBN: (print) 9781945626258

When considering a social media corpus, we often have access to structural information about how messages are flowing between people or organizations. This information is particularly useful when the linguistic evidence is sparse, incomplete, or of dubious quality. In this paper we construct a simple model to leverage the structure of Twitter data to help determine the set of languages each user is fluent in. Our results demonstrate that imposing several intuitive constraints leads to improvements in performance and stability. We release the first annotated data set for exploring this task, and discuss how our approach may be extended to other applications. © 2016 Association for Computational Linguistics

Keywords: Social networking (online)

Topic identification of spoken documents using unsupervised acoustic unit discovery

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Santosh Kesiraju; Raghavendra Pappagari; Lucas Ondel; Lukáš Burget; Najim Dehak; Sanjeev Khudanpur; Jan Černocký; Suryakanth V Gangashetty; Brno University of Technology, Speech@Fit and IT4I Center of Excellence, Brno, Czech Republic; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, U.S.A.; Vysoke uceni technicke v Brne, Brno, Moravskoslezský, CZ; International Institute of Information Technology, Hyderabad, India

This paper investigates the application of unsupervised acoustic unit discovery for topic identification (topic ID) of spoken audio documents. The acoustic unit discovery method is based on a non-parametric Bayesian phone-loop model that segments a speech utterance into phone-like categories. The discovered phone-like (acoustic) units are further fed into the conventional topic ID framework. Using multilingual bottleneck features for the acoustic unit discovery, we show that the proposed method outperforms other systems that are based on cross-lingual phoneme recognizer.

Keywords: Hidden Markov models; Acoustics; Speech; Vocabulary; Training; Data models; Bayes methods

Contrasting public opinion dynamics and emotional response during crisis

8th International Conference on Social Informatics, SocInfo 2016

Authors: Volkova, Svitlana; Chetviorkin, Ilia; Arendt, Dustin; van Durme, Benjamin; Pacific Northwest National Laboratory, Richland, WA, United States; Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia; Center for Language and Speech Processing, Johns Hopkins University, Human Language Technology Center of Excellence, Baltimore, MD, United States

ISBN: (print) 9783319478791

We propose an approach for contrasting spatiotemporal dynamics of public opinions expressed toward targeted entities, also known as stance detection task, in Russia and Ukraine during crisis. Our analysis relies on a novel corpus constructed from posts on the VKontakte social network, centered on local public opinion of the ongoing Russian-Ukrainian crisis, along with newly annotated resources for predicting expressions of fine-grained emotions including joy, sadness, disgust, anger, surprise and fear. Akin to prior work on sentiment analysis we align traditional public opinion polls with aggregated automatic predictions of sentiments for contrastive geo-locations. We report interesting observations on emotional response and stance variations across geo-locations. Some of our findings contradict stereotypical misconceptions imposed by media, for example, we found posts from Ukraine that do not support Euromaidan but support Putin, and posts from Russia that are against Putin but in favor USA. Furthermore, we are the first to demonstrate contrastive stance variations over time across geolocations using storyline visualization (Storyline visualization is available at http://***/∼svitlana/) technique. © Springer International Publishing AG 2016.

Keywords: Sentiment analysis

Augmented data training of joint acoustic/phonotactic DNN i-vectors for NIST LRE15

Speaker and Language Recognition Workshop, Odyssey 2016

Authors: McCree, Alan; Sell, Gregory; Garcia-Romero, Daniel; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, United States

This paper presents the JHU HLTCOE submission to the NIST 2015 Language Recognition Evaluation, including critical and novel algorithmic components, use of limited and augmented training data, and additional post-evaluation analysis and improvements. All of our systems used i-vectors based on Deep Neural Networks (DNNs) with discriminatively-trained Gaussian classifiers, and linear fusion was performed with duration-dependent scaling. A key innovation was the use of three different kinds of i-vectors: acoustic, phonotactic, and joint. In addition, data augmentation was used to overcome the limited training data of this evaluation. Post-evaluation analysis shows the benefits of these design decisions as well as further potential improvements. © Odyssey 2016: Speaker and Language Recognition Workshop. All rights reserved.

Keywords: Deep neural networks
