Search Results - Inner Mongolia University Library

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Neville Ryant, Elika Bergelson, Kenneth Church, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Sanjeev Khudanpur, Diana Kowalski, Mahesh Krishnamoorthy, Rajat Kulshreshta, Mark Liberman, Yu-Ding Lu, Matthew Maciejewski, Florian Metze, Jan Profant, Lei Sun, Yu Tsao, Zhou Yu. Affiliations: Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, USA; Department of Psychology and Neuroscience, Duke University, Durham, NC, USA; IBM, Yorktown Heights, NY, USA; Laboratoire de Sciences Cognitives et Psycholinguistique, ENS, Paris, France; University of Science and Technology of China, Hefei, China; Electrical Engineering Department, Indian Institute of Science, Bangalore, India; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA; University of Illinois at Urbana-Champaign, Champaign, IL, USA; Apple, Cupertino, CA, USA; Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan; Brno University of Technology, Brno, Czech Republic; Department of Computer Science, University of California Davis, Davis, CA, USA.

ISBN: (print) 9781538646595

Automatic speech recognition is more and more widely and effectively used. Nevertheless, in some automatic speech analysis tasks the state of the art is surprisingly poor. One of these is "diarization", the task of determining who spoke when. Diarization is key to processing meeting audio and clinical interviews, extended recordings such as police body cam or child language acquisition data, and any other speech data involving multiple speakers whose voices are not cleanly separated into individual channels. Overlapping speech, environmental noise and suboptimal recording techniques make the problem harder. During the JSALT Summer Workshop at CMU in 2017, an international team of researchers worked on several aspects of this problem, including calibration of the state of the art, detection of overlaps, enhancement of noisy recordings, and classification of shorter speech segments. This paper sketches the workshop's results, and announces plans for a "Diarization Challenge" to encourage further progress.

Keywords: diarization; overlap detection; speech enhancement; automatic speech recognition; speech recognition; state of the art; recordings

Phone-aware neural language identification

Oriental COCOSDA International Conference on Speech Database and Assessments

Authors: Zhiyuan Tang, Dong Wang, Yixiang Chen, Ying Shi, Lantian Li. Affiliations: Center for Speech and Language Technologies, RIIT, Tsinghua University; Tsinghua National Laboratory for Information Science and Technology, Tsinghua University; Department of Computer Science, Tsinghua University.

ISBN: (print) 9781538633342

Pure acoustic neural models, particularly the LSTM-RNN model, have shown great potential in language identification (LID). However, phonetic information has been largely overlooked by most existing neural LID models, although this information has been used in conventional phonetic LID systems with great success. We present a phone-aware neural LID architecture, which is a deep LSTM-RNN LID system that accepts output from an RNN-based ASR system. By utilizing the phonetic knowledge, the LID performance can be significantly improved. Interestingly, even if the test language is not involved in the ASR training, the phonetic knowledge still makes a large contribution. Our experiments conducted on four languages within the Babel corpus demonstrated that the phone-aware approach is highly effective.

Keywords: Phonetics; Training; Computational modeling; Databases; Acoustics; Standardization; Integrated circuit modeling

Quick and Reliable Document Alignment via TF/IDF-weighted Cosine Distance

1st Conference on Machine Translation, WMT 2016, held at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016

Authors: Buck, Christian; Koehn, Philipp. Affiliations: University of Edinburgh, Edinburgh, United Kingdom; Center for Language and Speech Processing, Department of Computer Science, Johns Hopkins University, Baltimore, MD, United States.

ISBN: (print) 9781945626104

This work describes our submission to the WMT16 Bilingual Document Alignment task. We show that a very simple distance metric, namely cosine distance of TF/IDF-weighted document vectors, provides a quick and reliable way to align documents. We compare many possible variants for constructing the document vectors. We also introduce a greedy algorithm that runs quicker and performs better in practice than the optimal solution to bipartite graph matching. Our approach shows competitive performance and can be improved even further through combination with URL-based pair matching. © 2016 Association for Computational Linguistics.

Keywords: Pattern matching
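
As a rough illustration of the approach this abstract describes, the sketch below scores document pairs by cosine similarity of TF/IDF-weighted vectors (the complement of the cosine distance named in the abstract) and then pairs documents greedily, best score first. It is not the authors' implementation: the toy tokenized documents, the particular IDF formula, and the function names `tfidf_vectors`, `cosine`, and `greedy_align` are assumptions made for this example, and it presumes both sides already share a vocabulary.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF/IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: raw term count times log(1 + N / df).
        vectors.append({t: c * math.log(1 + n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def greedy_align(source_vecs, target_vecs):
    """Pick the highest-scoring pairs first, using each document at most once."""
    scored = sorted(
        ((cosine(s, t), i, j)
         for i, s in enumerate(source_vecs)
         for j, t in enumerate(target_vecs)),
        reverse=True,
    )
    used_source, used_target, pairs = set(), set(), []
    for _score, i, j in scored:
        if i not in used_source and j not in used_target:
            pairs.append((i, j))
            used_source.add(i)
            used_target.add(j)
    return sorted(pairs)


# Hypothetical toy data: two "source" and two "target" documents.
source = [["cat", "sat", "mat"], ["stock", "market", "fell"]]
target = [["market", "stock", "prices", "fell"], ["the", "cat", "sat"]]

# IDF statistics are computed over the joint collection (an assumption).
vectors = tfidf_vectors(source + target)
pairs = greedy_align(vectors[:len(source)], vectors[len(source):])
print(pairs)  # [(0, 1), (1, 0)]
```

The greedy pass only needs one sort of the score list, which is one plausible reason such a procedure can run faster than an exact bipartite matching; the abstract's claim that it also performs better in practice is the authors' finding and is not demonstrated by this toy example.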

Sentential paraphrasing as black-box machine translation

2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2016

Authors: Napoles, Courtney; Callison-Burch, Chris; Post, Matt. Affiliations: Center for Language and Speech Processing, Johns Hopkins University, United States; Computer and Information Science Department, University of Pennsylvania, United States; Human Language Technology Center of Excellence, Johns Hopkins University, United States.

We present a simple, prepackaged solution to generating paraphrases of English sentences. We use the Paraphrase Database (PPDB) for monolingual sentence rewriting and provide machine translation language packs: prepackaged, tuned models that can be downloaded and used to generate paraphrases in a standard Unix environment. The language packs can be treated as a black box or customized to specific tasks. In this demonstration, we will explain how to use the included interactive web-based tool to generate sentential paraphrases. © NAACL-HLT 2016 - 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Demonstrations Session. All rights reserved.

Keywords: Machine translation

A Hybrid Method of Domain Lexicon Construction for Opinion Targets Extraction Using Syntax and Semantics

Journal of Computer Science & Technology, 2016, Vol. 31, No. 3, pp. 595-603

Authors: Chun Liao, Chong Feng, Sen Yang, He-Yan Huang. Affiliations: Department of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing Institute of Technology, Beijing 100081, China.

Opinion targets extraction of Chinese microblogs plays an important role in opinion mining. There has been significant progress in this area recently, especially with methods based on conditional random fields (CRF). However, such methods only take lexicon-related features into consideration and do not exploit the implied syntactic and semantic knowledge. We propose a novel approach which combines a domain lexicon with groups of syntactic and semantic features. The approach acquires the domain lexicon in a novel way, exploring syntactic and semantic information through part-of-speech, dependency structure, phrase structure, semantic role, and semantic similarity based on word embedding. We then combine the domain lexicon with opinion targets extracted by a CRF with groups of features for opinion targets extraction. Experimental results on the COAE2014 dataset show that the approach outperforms other well-known methods on the task of opinion targets extraction.

Keywords: domain lexicon; opinion targets extraction; syntactic structure; semantic role; word embedding

Automatic Construction of Morphologically Motivated Translation Models for Highly Inflected, Low-Resource Languages

12th Conference of the Association for Machine Translation in the Americas, AMTA 2016

Authors: Hewitt, John; Post, Matt; Yarowsky, David. Affiliations: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, United States; Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21211, United States.

Statistical Machine Translation (SMT) of highly inflected, low-resource languages suffers from the problem of low bitext availability, which is exacerbated by large inflectional paradigms. When translating into English, rich source inflections have a high chance of being poorly estimated or out-of-vocabulary (OOV). We present a source language-agnostic system for automatically constructing phrase pairs from foreign-language inflections and their morphological analyses using manually constructed datasets, including Wiktionary. We then demonstrate the utility of these phrase tables in improving translation into English from Finnish, Czech, and Turkish in simulated low-resource settings, finding substantial gains in translation quality. We report up to +2.58 BLEU in a simulated low-resource setting and +1.65 BLEU in a moderate-resource setting. We release our morphologically-motivated translation models, with tens of thousands of inflections in each of 8 languages. © 2016 The Authors.

Keywords: Computer-aided language translation

Phonetic temporal neural model for language identification

arXiv, 2017

Authors: Tang, Zhiyuan; Wang, Dong; Chen, Yixiang; Li, Lantian; Abel, Andrew. Affiliations: Chengdu Institute of Computer Applications, Chinese Academy of Sciences and University of Chinese Academy of Sciences, Beijing 100049, China; Center for Speech and Language Technologies, Tsinghua University, Beijing 100084, China; Tsinghua National Laboratory for Information Science and Technology and the Center for Speech and Language Technologies, Tsinghua University, Beijing 100084, China; Department of Computer Science and Software Engineering, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China.

Deep neural models, particularly the LSTM-RNN model, have shown great potential for language identification (LID). However, the use of phonetic information has been largely overlooked by most existing neural LID methods, although this information has been used very successfully in conventional phonetic LID systems. We present a phonetic temporal neural model for LID, which is an LSTM-RNN LID system that accepts phonetic features produced by a phone-discriminative DNN as the input, rather than raw acoustic features. This new model is similar to traditional phonetic LID methods, but the phonetic knowledge here is much richer: it is at the frame level and involves compacted information of all phones. Our experiments conducted on the Babel database and the AP16-OLR database demonstrate that the temporal phonetic neural approach is very effective, and significantly outperforms existing acoustic neural models. It also outperforms the conventional i-vector approach on short utterances and in noisy conditions. Copyright © 2017, The Authors. All rights reserved.

Keywords: Deep neural networks

Neural morphological analysis: Encoding-decoding canonical segments

2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016

Authors: Kann, Katharina; Cotterell, Ryan; Schütze, Hinrich. Affiliations: Center for Information and Language Processing, LMU Munich, Germany; Department of Computer Science, Johns Hopkins University, United States.

ISBN: (print) 9781945626258

Canonical morphological segmentation aims to divide words into a sequence of standardized segments. In this work, we propose a character-based neural encoder-decoder model for this task. Additionally, we extend our model to include morpheme-level and lexical information through a neural reranker. We set the new state of the art for the task, improving previous results by up to 21% accuracy. Our experiments cover three languages: English, German and Indonesian. © 2016 Association for Computational Linguistics.

Keywords: Decoding

A joint model of orthography and morphological segmentation

15th Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016

Authors: Cotterell, Ryan; Vieira, Tim; Schütze, Hinrich. Affiliations: Department of Computer Science, Johns Hopkins University, United States; Center for Information and Language Processing, LMU Munich, Germany.

ISBN: (print) 9781941643914

We present a model of morphological segmentation that jointly learns to segment and restore orthographic changes, e.g., funniest ⟼ fun-y-est. We term this form of analysis canonical segmentation and contrast it with the traditional surface segmentation, which segments a surface form into a sequence of substrings, e.g., funniest ⟼ funn-i-est. We derive an importance sampling algorithm for approximate inference in the model and report experimental results on English, German and Indonesian. © 2016 Association for Computational Linguistics.

Keywords: Importance sampling
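
The distinction this abstract draws between surface and canonical segmentation can be made concrete with its own example. In the small sketch below, the helper name `is_surface_segmentation` is an assumption introduced for illustration; it only checks the defining property of a surface segmentation (the segments are substrings that concatenate back to the word) and does not reproduce the authors' model.

```python
def is_surface_segmentation(word, segments):
    """A surface segmentation splits the word into substrings,
    so joining the segments must give back the word unchanged."""
    return "".join(segments) == word


word = "funniest"
surface = ["funn", "i", "est"]    # example from the abstract
canonical = ["fun", "y", "est"]   # orthographic changes restored

print(is_surface_segmentation(word, surface))    # True
print(is_surface_segmentation(word, canonical))  # False
```

The canonical segments fail the check because restoring the underlying forms ("fun", "y") undoes the consonant doubling and the y-to-i change, which is why the task cannot be treated as simple boundary placement over the surface string.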

Speaker segmentation using deep speaker vectors for fast speaker change scenarios

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Renyu Wang, Mingliang Gu, Lantian Li, Mingxing Xu, Thomas Fang Zheng. Affiliations: School of Linguistic Science, Jiangsu Normal University, Xuzhou 221116, China; Center for Speech and Language Technologies, Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Research Institute of Information Technology, Department of Computer Science and Technology.

ISBN: (print) 9781509041183

A novel speaker segmentation approach based on a deep neural network is proposed and investigated. This approach uses deep speaker vectors (d-vectors) to represent speaker characteristics and to find speaker change points. The d-vector is a kind of frame-level speaker-discriminative feature, whose discriminative training process corresponds to the goal of discriminating a speaker change point from a single-speaker speech segment in a short time window. Following traditional metric-based segmentation, each analysis window contains two sub-windows and is shifted along the audio stream to detect speaker change points, where the speaker characteristics are represented by the means of the deep speaker vectors for all frames in each window. Experimental investigations conducted in fast speaker change scenarios show that the proposed method can detect speaker change points more quickly and more effectively than the commonly used segmentation methods.

Keywords: Speaker segmentation; deep neural networks; speaker vector; Loudspeakers
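
The sliding two-sub-window scheme described in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions: the frame-level vectors are hand-made toy values standing in for d-vectors (no neural network is involved), cosine distance is an assumed choice of metric, and the names `mean_vector`, `cosine_distance`, `change_scores`, `detect_changes`, the sub-window length, and the threshold are all hypothetical.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length frame vectors."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def cosine_distance(u, v):
    """1 - cosine similarity; treated as 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def change_scores(frames, sub_window):
    """Slide an analysis window made of two adjacent sub-windows and
    score each boundary t by the distance between the sub-window means."""
    scores = {}
    for t in range(sub_window, len(frames) - sub_window + 1):
        left = mean_vector(frames[t - sub_window:t])
        right = mean_vector(frames[t:t + sub_window])
        scores[t] = cosine_distance(left, right)
    return scores


def detect_changes(frames, sub_window, threshold):
    """Report boundaries whose score is a local peak at or above the threshold."""
    scores = change_scores(frames, sub_window)
    return [
        t for t, s in scores.items()
        if s >= threshold
        and s >= scores.get(t - 1, 0.0)
        and s > scores.get(t + 1, 0.0)
    ]


# Toy stream: six frames of "speaker A" followed by six frames of "speaker B".
frames = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
print(detect_changes(frames, sub_window=3, threshold=0.5))  # [6]
```

Shorter sub-windows react faster to a speaker change but average over fewer frames, so in a fast-change scenario the usable window length depends on how discriminative the individual frame vectors are.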
