Search Results - Inner Mongolia University Library

9th Workshop on Graph-based Methods for Natural Language Processing, Textgraphs 2014, in conjunction with the Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Parveen, Daraksha; Strube, Michael. Affiliation: Heidelberg Institute for Theoretical Studies gGmbH, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany

ISBN: (print) 9781937284961

In this paper, we introduce a novel graph-based technique for topic-based multi-document summarization. We transform documents into a bipartite graph where one set of nodes represents entities and the other set of nodes represents sentences. To obtain the summary, we apply a ranking technique to the bipartite graph, followed by an optimization step. We test the performance of our method on several DUC datasets and compare it to the state of the art. © 2014 Association for Computational Linguistics
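
The entity-sentence bipartite graph described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the ranking technique, so a HITS-style mutual-reinforcement update is assumed here, entity extraction is assumed to have been done already, and the optimization step is omitted. The function names `build_bipartite_graph` and `rank_sentences` are hypothetical.

```python
from collections import defaultdict


def build_bipartite_graph(sentence_entities):
    """Return a mapping entity -> set of sentence indices that mention it.

    `sentence_entities` is a list of sets; element i holds the entities
    found in sentence i (entity extraction itself is assumed, not shown).
    """
    entity_to_sentences = defaultdict(set)
    for idx, entities in enumerate(sentence_entities):
        for entity in entities:
            entity_to_sentences[entity].add(idx)
    return entity_to_sentences


def rank_sentences(sentence_entities, iterations=20):
    """Score sentences with a HITS-style update (an assumed ranking choice)."""
    entity_to_sentences = build_bipartite_graph(sentence_entities)
    sent_score = [1.0] * len(sentence_entities)
    for _ in range(iterations):
        # An entity is important if important sentences mention it.
        ent_score = {
            entity: sum(sent_score[i] for i in sents)
            for entity, sents in entity_to_sentences.items()
        }
        # A sentence is important if it mentions important entities.
        sent_score = [
            sum(ent_score[entity] for entity in entities)
            for entities in sentence_entities
        ]
        # Normalise so the scores sum to 1 and stay bounded.
        norm = sum(sent_score) or 1.0
        sent_score = [score / norm for score in sent_score]
    return sent_score


scores = rank_sentences([{"a", "b"}, {"a"}, {"c"}])
```

In this toy input, the first sentence shares an entity with the second and mentions one more, so it receives the highest score. A summarizer would then select sentences under a length budget, which is where the optimization step mentioned in the abstract would come in.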

Keywords: graphic methods

From visualisation to hypothesis construction for second language acquisition

9th Workshop on Graph-based Methods for Natural Language Processing, Textgraphs 2014, in conjunction with the Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Malmasi, Shervin; Dras, Mark. Affiliation: Centre for Language Technology, Macquarie University, Sydney, NSW, Australia

ISBN: (print) 9781937284961

One research goal in Second Language Acquisition (SLA) is to formulate and test hypotheses about errors and the environments in which they are made, a process which often involves substantial effort; large amounts of data and computational visualisation techniques promise help here. In this paper, we define a new task: finding contexts for errors that vary with the native language of the speaker and are potentially useful for SLA research. We propose four models for approaching this task, and find that one based only on error-feature co-occurrence and another based on determining maximum weight cliques in a feature association graph discover strongly distinguishing contexts, with an apparent trade-off between false positives and very specific contexts. © 2014 Association for Computational Linguistics

Keywords: Errors

Parallel training of DNNs with natural gradient and parameter averaging

3rd International Conference on Learning Representations, ICLR 2015

Authors: Povey, Daniel; Zhang, Xiaohui; Khudanpur, Sanjeev. Affiliation: Center for Language and Speech Processing and Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21218, United States

We describe the neural-network training framework used in the Kaldi speech recognition toolkit, which is geared towards training DNNs with large amounts of training data using multiple GPU-equipped or multi-core machines. In order to be as hardware-agnostic as possible, we needed a way to use multiple machines without generating excessive network traffic. Our method is to average the neural network parameters periodically (typically every minute or two), and redistribute the averaged parameters to the machines for further training. Each machine sees different data. By itself, this method does not work very well. However, we have another method, an approximate and efficient implementation of Natural Gradient for Stochastic Gradient Descent (NG-SGD), which seems to allow our periodic-averaging method to work well, as well as substantially improving the convergence of SGD on a single machine. © 2015 International Conference on Learning Representations, ICLR. All rights reserved.
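
The periodic averaging step described in this abstract amounts to an element-wise mean over the workers' parameter sets. The sketch below illustrates only that step, under the assumption that each worker exposes its parameters as a dictionary of flat lists of floats. It is not Kaldi code, the function name `average_parameters` is hypothetical, and the NG-SGD update that the abstract says is needed for averaging to work well is not shown.

```python
def average_parameters(worker_params):
    """Element-wise average of the parameters held by several workers.

    `worker_params` is a list with one dict per worker, mapping a
    parameter name to a flat list of floats. All workers are assumed
    to share the same names and shapes.
    """
    if not worker_params:
        raise ValueError("need at least one worker")
    n_workers = len(worker_params)
    reference = worker_params[0]
    return {
        name: [
            sum(worker[name][i] for worker in worker_params) / n_workers
            for i in range(len(values))
        ]
        for name, values in reference.items()
    }


# Two workers that trained on different data for one averaging period.
averaged = average_parameters([
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
])
```

After averaging, every worker would be reset to `averaged` and continue training on its own share of the data until the next averaging point.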

Keywords: Speech recognition

Exploiting timegraphs in temporal relation classification

9th Workshop on Graph-based Methods for Natural Language Processing, Textgraphs 2014, in conjunction with the Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Laokulrat, Natsuda; Miwa, Makoto; Tsuruoka, Yoshimasa. Affiliations: University of Tokyo, 3-7-1 Hongo, Bunkyo-ku, Tokyo, Japan; Toyota Technological Institute, 2-12-1 Hisakata, Tempaku-ku, Nagoya, Japan

ISBN: (print) 9781937284961

Most of the recent work on machine learning-based temporal relation classification has been done by considering only a given pair of temporal entities (events or temporal expressions) at a time. Entities that have temporal connections to the pair of temporal entities under inspection are not considered even though they provide valuable clues to the prediction. In this paper, we present a new approach for exploiting knowledge obtained from nearby entities by making use of timegraphs and applying the stacked learning method to the temporal relation classification task. By performing 10-fold cross validation on the Timebank corpus, we achieved an F1 score of 59.61% based on the graph-based evaluation, which is 0.16 percentage points higher than that of the local approach. Our system outperformed the state-of-the-art system that utilizes global information and achieved about 1.4 percentage points higher accuracy. © 2014 Association for Computational Linguistics

Keywords: graphic methods

Language Identification in Code-Switching Scenario

1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Jain, Naman; Bhat, Riyaz Ahmad. Affiliation: LTRC, IIIT-H, Hyderabad, India

ISBN: (print) 9781937284961

This paper describes a CRF-based token-level language identification system, our entry to the Language Identification in Code-Switched (CS) Data task of CodeSwitch 2014. Our system hinges on using conditional posterior probabilities for the individual codes (words) in code-switched data to solve the language identification task. We also experiment with other linguistically motivated language-specific as well as generic features to train the CRF-based sequence labeling algorithm, achieving reasonable results. © 2014 Association for Computational Linguistics

Keywords: natural language processing systems

The Tel Aviv University System for the Code-Switching Workshop Shared Task

1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Bar, Kfir; Dershowitz, Nachum. Affiliation: School of Computer Science, Tel Aviv University, Ramat Aviv, Israel

ISBN: (print) 9781937284961

We describe our entry in the EMNLP 2014 code-switching shared task. Our system is based on a sequential classifier, trained on the shared training set using various character- and word-level features, some calculated using large monolingual corpora. We participated in the Twitter-genre Spanish-English track, obtaining an accuracy of 0.868 when measured on the tweet level and 0.858 on the word level. © 2014 Association for Computational Linguistics

Keywords: natural language processing systems

Parsing clinical text: how good are the state-of-the-art parsers?

BMC Medical Informatics and Decision Making, 2015, Vol. 15, Suppl. 1, p. S2

Authors: Jiang, Min; Huang, Yang; Fan, Jung-wei; Tang, Buzhou; Denny, Josh; Xu, Hua. Affiliations: Univ Texas Houston, Sch Biomed Informat, Houston, TX 77030, USA; Kaiser Permanente, San Diego, CA, USA; Harbin Inst Technol, Shenzhen Grad Sch, Shenzhen, Peoples R China; Vanderbilt Univ, Sch Med, Dept Med, Nashville, TN 37212, USA; Vanderbilt Univ, Sch Med, Dept Biomed Informat, Nashville, TN 37212, USA

Background: Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain. Methods: In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which is developed based on pathology notes and clinical notes, containing 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using the 10-fold cross validation method. Finally we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank. Results: Our results showed that the original parsers achieved lower performance in clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebank, all parsers achieved better performance, with the best performance from the Stanford parser that reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank was used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measur

Keywords: Medical language processing; natural language processing; parsing; clinical text; NLP

Word-level Language Identification using CRF: Code-switching Shared Task Report of MSR India System

1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Chittaranjan, Gokul; Vyas, Yogarshi; Bali, Kalika; Choudhury, Monojit. Affiliations: Microsoft Research India; University of Maryland, United States

ISBN: (print) 9781937284961

We describe a CRF-based system for word-level language identification of code-mixed text. Our method uses lexical, contextual, character n-gram, and special character features, and can therefore easily be replicated across languages. Its performance is benchmarked against the test sets provided by the shared task on code-mixing (Solorio et al., 2014) for four language pairs, namely English-Spanish (En-Es), English-Nepali (En-Ne), English-Mandarin (En-Cn), and Standard Arabic-Arabic Dialects (Ar-Ar). The experimental results show consistent performance across the language pairs. © 2014 Association for Computational Linguistics
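
The four feature families named in this abstract (lexical, contextual, character n-gram, special character) can be illustrated with a small per-token feature extractor of the kind usually fed to a CRF toolkit. This is a sketch under assumptions: the exact feature templates of the system are not given in the abstract, and the function names, feature keys, and boundary markers below are all hypothetical.

```python
def char_ngrams(word, n_min=1, n_max=3):
    """Return character n-grams of `word`, padded with boundary markers."""
    padded = f"^{word}$"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def token_features(tokens, i):
    """Build a feature dict for tokens[i] (illustrative templates only)."""
    word = tokens[i]
    features = {
        # Lexical: the lower-cased word form itself.
        "lower": word.lower(),
        # Contextual: neighbouring word forms, with sentence-edge markers.
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        # Special character: does the token contain a non-alphanumeric symbol?
        "has_special": any(not ch.isalnum() for ch in word),
    }
    # Character n-grams: language-independent cues such as affixes.
    for gram in char_ngrams(word.lower()):
        features["ngram=" + gram] = True
    return features


sentence = ["Hola", "@amigo", "how", "are", "you"]
sequence_features = [token_features(sentence, i) for i in range(len(sentence))]
```

Because none of these templates depends on a particular language, the same extractor can be reused for each language pair, which is the replicability point the abstract makes.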

Keywords: natural language processing systems

DCU-UVT: Word-Level Language Classification with Code-Mixed Data

1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Barman, Utsab; Wagner, Joachim; Chrupala, Grzegorz; Foster, Jennifer. Affiliations: CNGL Centre for Global Intelligent Content, National Centre for Language Technology, School of Computing, Dublin City University, Dublin, Ireland; Tilburg School of Humanities, Department of Communication and Information Sciences, Tilburg University, Tilburg, Netherlands

ISBN: (print) 9781937284961

This paper describes the DCU-UVT team's participation in the Language Identification in Code-Switched Data shared task in the Workshop on Computational Approaches to Code Switching. Word-level classification experiments were carried out using a simple dictionary-based method, linear kernel support vector machines (SVMs) with and without contextual clues, and a k-nearest neighbour approach. Based on these experiments, we select our SVM-based system with contextual clues as our final system and present results for the Nepali-English and Spanish-English datasets. © 2014 Association for Computational Linguistics
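
A dictionary-based method of the kind mentioned in this abstract can be as simple as a lexicon lookup per word. The sketch below is one possible reading, not the team's actual baseline: the abstract does not say how the dictionaries were built or how ties were resolved, so the toy lexicons, the `ambiguous` and `unknown` labels, and the function names are assumptions made for illustration.

```python
def dictionary_label(word, lexicons, unknown="unknown", ambiguous="ambiguous"):
    """Label a word with the single language whose lexicon contains it.

    `lexicons` maps a language code to a set of lower-cased word forms.
    Words found in several lexicons or in none get a fallback label.
    """
    key = word.lower()
    matches = [lang for lang, vocab in lexicons.items() if key in vocab]
    if len(matches) == 1:
        return matches[0]
    return ambiguous if matches else unknown


def label_tokens(tokens, lexicons):
    """Apply the dictionary lookup to every token independently."""
    return [(token, dictionary_label(token, lexicons)) for token in tokens]


# Toy lexicons for illustration only.
toy_lexicons = {
    "en": {"the", "house", "no"},
    "es": {"la", "casa", "no"},
}
labels = label_tokens(["La", "house", "no", "xyz"], toy_lexicons)
```

The lookup treats each word in isolation, so a form such as "no" that exists in both languages cannot be resolved. That limitation is one motivation for classifiers that use contextual clues, such as the SVM variant the abstract reports selecting.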

Keywords: Support vector machines

The CMU Submission for the Shared Task on Language Identification in Code-Switched Data

1st Workshop on Computational Approaches to Code Switching, Switching 2014 at the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014

Authors: Lin, Chu-Cheng; Ammar, Waleed; Levin, Lori; Dyer, Chris. Affiliation: Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, United States

ISBN: (print) 9781937284961

We describe the CMU submission for the 2014 shared task on language identification in code-switched data. We participated in all four language pairs: Spanish-English, Mandarin-English, Nepali-English, and Modern Standard Arabic-Arabic dialects. After describing our CRF-based baseline system, we discuss three extensions for learning from unlabeled data: semi-supervised learning, word embeddings, and word lists. © 2014 Association for Computational Linguistics

Keywords: natural language processing systems
