ISBN (print): 9781945626616
The proceedings contain 12 papers. The topics discussed include: users and data: the two neglected children of bilingual natural language processing research; deep investigation of cross-language plagiarism detection methods; sentence alignment using unfolding recursive autoencoders; acquisition of translation lexicons for historically unwritten languages via bridging loanwords; toward a comparable corpus of Latvian, Russian and English tweets; automatic extraction of parallel speech corpora from dubbed movies; a parallel collection of clinical trials in Portuguese and English; weighted set-theoretic alignment of comparable sentences; and BUCC 2017 shared task: a first attempt toward a deep learning framework for identifying parallel sentences in comparable corpora.
Native Language Identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is typically framed as a classificat...
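For readers unfamiliar with this framing, a minimal sketch of NLI as plain text classification is given below; the character n-gram features, the scikit-learn pipeline and the toy L1 labels are illustrative assumptions, not the system described in the abstract.

```python
# Minimal sketch (assumptions only): NLI framed as text classification
# with character n-gram TF-IDF features and a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy learner-English texts with made-up L1 labels, purely for illustration.
texts = ["I am agree with this opinion", "He explain me the problem",
         "She is knowing the answer", "They discussed about the plan"]
labels = ["ES", "FR", "DE", "HI"]

clf = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(2, 4)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["He explain me the rules"]))  # predicted L1 label
```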
Clinical named entity recognition (CNER), which identifies the boundaries and types of medical entities, is a fundamental and crucial task in clinical natural language processing. Recent years have witnessed considerable progress in deep learning-based algorithms, such as RNNs, CNNs and their integrated variants, which have proven effective for CNER. In this work, we propose a deep learning model for CNER that adopts a bidirectional RNN-CRF architecture with a concatenated n-gram character representation to capture rich context information. We further incorporate word segmentation results, part-of-speech (POS) tags and a medical vocabulary as features in the model, and the final output is obtained by comparing the separate models with the overall model. The proposed framework has been evaluated on the CCKS 2017 Task 2 dataset, achieving a 90.10 F1-score for CNER.
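A minimal sketch of one ingredient of such a model, a bidirectional RNN tagger over a concatenated n-gram character representation, is shown below; it is an assumption-laden illustration in PyTorch, not the authors' implementation, and it omits the CRF layer as well as the segmentation, POS and medical-vocabulary features.

```python
# Sketch only: bidirectional RNN over concatenated unigram + bigram
# character embeddings, emitting per-character tag scores (a CRF layer
# would normally decode these into a consistent BIO sequence).
import torch
import torch.nn as nn

class CharNgramBiRNNTagger(nn.Module):
    def __init__(self, n_unigrams, n_bigrams, n_tags, emb_dim=64, hidden=128):
        super().__init__()
        self.uni_emb = nn.Embedding(n_unigrams, emb_dim)
        self.bi_emb = nn.Embedding(n_bigrams, emb_dim)
        # BiLSTM over the concatenated n-gram character representation.
        self.rnn = nn.LSTM(2 * emb_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, uni_ids, bi_ids):
        x = torch.cat([self.uni_emb(uni_ids), self.bi_emb(bi_ids)], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)  # (batch, seq_len, n_tags)

# Toy usage with made-up vocabulary sizes and a length-6 character sequence.
model = CharNgramBiRNNTagger(n_unigrams=5000, n_bigrams=20000, n_tags=7)
uni = torch.randint(0, 5000, (1, 6))
bi = torch.randint(0, 20000, (1, 6))
print(model(uni, bi).shape)  # torch.Size([1, 6, 7])
```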
We investigate the problem of reader-aware multi-document summarization (RA-MDS) and introduce a new dataset for this problem. To tackle RA-MDS, we extend a variational auto-encoders (VAEs) based MDS framework by jointl...
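Since the abstract builds on a VAE-based framework, a minimal sketch of the reparameterization step at the core of a variational auto-encoder is given below; the dimensions and the PyTorch encoder are illustrative assumptions, not the paper's model.

```python
# Sketch only: a tiny VAE encoder showing the reparameterization trick
# and the KL term used in the VAE objective.
import torch
import torch.nn as nn

class TinyVAEEncoder(nn.Module):
    def __init__(self, in_dim=300, latent_dim=32):
        super().__init__()
        self.mu = nn.Linear(in_dim, latent_dim)
        self.logvar = nn.Linear(in_dim, latent_dim)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        # Reparameterization: sample z = mu + sigma * eps with eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # KL divergence of q(z|x) from the standard normal prior.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return z, kl

z, kl = TinyVAEEncoder()(torch.randn(2, 300))
print(z.shape, kl.shape)  # torch.Size([2, 32]) torch.Size([2])
```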
Sentiment lexicons are widely used as an intuitive and inexpensive way of tackling sentiment classification, often within a simple lexicon word-counting approach or as part of a supervised model. However, it is an ope...
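A minimal sketch of the simple lexicon word-counting approach mentioned here is given below; the tiny hand-written lexicon is an illustrative assumption, not a real sentiment resource.

```python
# Sketch only: classify by counting positive vs. negative lexicon hits.
POSITIVE = {"good", "great", "excellent", "love", "nice"}
NEGATIVE = {"bad", "terrible", "awful", "hate", "poor"}

def lexicon_sentiment(text: str) -> str:
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(lexicon_sentiment("the food was great but the service was terrible"))  # neutral
```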
ISBN (digital): 9781627052955
Neural networks are a family of powerful machine learning models. This book focuses on the application of neural network models to natural language data. The first half of the book (Parts I and II) covers the basics of supervised machine learning and feed-forward neural networks, the basics of working with machine learning over language data, and the use of vector-based rather than symbolic representations for words. It also covers the computation-graph abstraction, which makes it easy to define and train arbitrary neural networks and is the basis behind the design of contemporary neural network software libraries. The second part of the book (Parts III and IV) introduces more specialized neural network architectures, including 1D convolutional neural networks, recurrent neural networks, conditioned-generation models, and attention-based models. These architectures and techniques are the driving force behind state-of-the-art algorithms for machine translation, syntactic parsing, and many other applications. Finally, we also discuss tree-shaped networks, structured prediction, and the prospects of multi-task learning.
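A minimal sketch of the computation-graph abstraction described above is given below, using PyTorch's autograd as one concrete instance; the book itself is library-agnostic, so the specific calls are an assumption.

```python
# Sketch only: operations on tensors build a graph during the forward pass,
# and gradients are obtained by traversing it backwards.
import torch

W = torch.randn(3, 4, requires_grad=True)   # parameters are graph leaves
b = torch.zeros(3, requires_grad=True)
x = torch.randn(4)                          # an input vector

y = torch.tanh(W @ x + b)                   # forward pass builds the graph
loss = (y ** 2).sum()                       # a toy scalar loss
loss.backward()                             # backward pass fills .grad

print(W.grad.shape, b.grad.shape)           # torch.Size([3, 4]) torch.Size([3])
```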
Text summarization has been one of the key research areas in natural language processing (NLP) for a while. The various methods to summarize one or more documents can be broadly classified into extractive and abstract...
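A minimal sketch of the extractive side of this distinction is given below: sentences are scored by the document frequency of their words and the top-scoring ones are kept. It is an illustration only, not any of the surveyed methods.

```python
# Sketch only: frequency-based extractive summarization.
from collections import Counter
import re

def extractive_summary(text: str, k: int = 2) -> str:
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w.lower() for w in re.findall(r"\w+", text))
    # Score each sentence by the corpus frequency of its words, keep the top k.
    scored = sorted(sentences,
                    key=lambda s: -sum(freq[w.lower()] for w in re.findall(r"\w+", s)))
    keep = set(scored[:k])
    return " ".join(s for s in sentences if s in keep)  # preserve original order

doc = ("NLP studies language. Summarization shortens documents. "
       "Extractive methods copy sentences. Abstractive methods generate new text.")
print(extractive_summary(doc, k=2))
```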
Since automatic language generation is a task that can enrich applications across most language-related areas, from machine translation to interactive dialogue, it seems worthwhile to undertake a strategy foc...