Search Results - Inner Mongolia University Library

DegExt: a language-independent keyphrase extractor

Journal of Ambient Intelligence and Humanized Computing, 2013, vol. 4, no. 3, pp. 377-387

Authors: Litvak, Marina; Last, Mark; Kandel, Abraham. Affiliations: Sami Shamoon Acad Coll Engn, Dept Software Engn, IL-84100 Beer Sheva, Israel; Ben Gurion Univ Negev, Dept Informat Syst Engn, IL-84105 Beer Sheva, Israel; Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620, USA

In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in Litvak and Last (Graph-based keyword extraction for single-document summarization. In: Proceedings of the workshop on multi-source multilingual information extraction and summarization, pp 17-24, 2008). We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx (Turney in Inf Retr 2:303-336, 2000) and TextRank (Mihalcea and Tarau in TextRank: bringing order into texts. In: Proceedings of the conference on empirical methods in natural language processing, Barcelona, Spain, 2004). We evaluated DegExt on collections of benchmark summaries in two different languages: English and Hebrew. Our experiments on the English corpus show that DegExt significantly outperforms TextRank and GenEx in terms of precision and area under curve for summaries of 15 keyphrases or more, at the expense of a mostly non-significant decrease in recall and F-measure, when the extracted phrases are matched against the gold standard collection. Because DegExt tends to extract longer phrases than GenEx and TextRank, when single extracted words are considered, DegExt outperforms them both in terms of recall and F-measure. On the Hebrew corpus, DegExt performs the same as TextRank regardless of the number of keyphrases. An additional experiment shows that DegExt applied to the TextRank representation graphs outperforms the other systems in the text classification task. For documents in both languages, DegExt surpasses both GenEx and TextRank in terms of implementation simplicity and computational complexity.

Keywords: Keyphrase extraction; Summarization; Text mining; Graph-based document representation; Node centrality
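
To illustrate the general idea of ranking words by node centrality in a graph-based document representation, the following is a minimal sketch and not the authors' implementation. It assumes an undirected graph that links adjacent words within a sentence and uses plain node degree as the centrality score; those choices, the function name degree_ranked_words, and its parameters are assumptions made for illustration only.

```python
from collections import defaultdict
import re


def degree_ranked_words(text, top_n=5, stopwords=frozenset()):
    """Rank words by their degree in an undirected word-adjacency graph."""
    neighbors = defaultdict(set)
    # Split into sentences so that edges never cross a sentence boundary.
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Stopwords are dropped first, so the words on either side of a
        # removed stopword become adjacent (a simplification of this sketch).
        words = [w for w in re.findall(r"\w+", sentence) if w not in stopwords]
        for left, right in zip(words, words[1:]):
            if left != right:
                neighbors[left].add(right)
                neighbors[right].add(left)
    # Sort by descending degree; break ties alphabetically for determinism.
    ranked = sorted(neighbors, key=lambda w: (-len(neighbors[w]), w))
    return ranked[:top_n]


if __name__ == "__main__":
    sample = (
        "Graph methods rank words. Graph methods extract phrases. "
        "Graph nodes have degree."
    )
    print(degree_ranked_words(sample, top_n=3))
```

Because the tokenizer relies only on the Unicode-aware `\w` pattern, the sketch needs no language-specific resources beyond an optional stopword set, which matches the language-independent spirit of the abstract without claiming to reproduce its method.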


FSMNLP 2012 - Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing


10th International Workshop on Finite State Methods and Natural Language Processing, FSMNLP 2012

The proceedings contain 19 papers. The topics discussed include: effect of language and error models on efficiency of finite-state spell-checking and correction; practical finite state optimality theory; handling unknown words in Arabic FST morphology; Urdu – Roman transliteration via finite state transducers; integrating aspectually relevant properties of verbs into a morphological analyzer for English; finite-state technology in a verse-making tool; DAGGER: a toolkit for automata on directed acyclic graphs; WFST-based grapheme-to-phoneme conversion: open source tools for alignment, model-building and decoding; and finite-state acoustic and translation model composition in statistical speech translation: empirical assessment.


Understanding seed selection in bootstrapping


8th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2013, at the Conference on Empirical Methods in Natural Language Processing, EMNLP 2013

Authors: Ehara, Yo; Sato, Issei; Oiwa, Hidekazu; Nakagawa, Hiroshi. Affiliations: Graduate School of Information Science and Technology United States Information Technology Center University of Tokyo / 7-3-1 Hongo Bunkyo-ku Tokyo Japan JSPS Research Fellow Kojimachi Business Center Building 5-3-1 Kojimachi Chiyoda-ku Tokyo Japan

ISBN: (print) 9781937284978

Bootstrapping has recently become the focus of much attention in natural language processing to reduce labeling cost. In bootstrapping, unlabeled instances can be harvested from the initial labeled "seed" set. The selected seed set affects accuracy, but how to select a good seed set is not yet clear. Thus, an "iterative seeding" framework is proposed for bootstrapping to reduce its labeling cost. Our framework iteratively selects the unlabeled instance that has the best "goodness of seed" and labels the unlabeled instance in the seed set. Our framework deepens understanding of this seeding process in bootstrapping by deriving the dual problem. We propose a method called expected model rotation (EMR) that works well on data that are not well separated, which frequently occur in realistic settings. Experimental results show that EMR can select seed sets that provide significantly higher mean reciprocal rank on realistic data than existing naive selection methods or random seed sets. © 2013 Association for Computational Linguistics

Keywords: Iterative methods
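
The abstract above reports its results as mean reciprocal rank (MRR). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: for each query, take the reciprocal of the rank of the first relevant item (or 0 if none appears), then average over queries. The function name, argument names, and toy data are illustrative assumptions and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average over queries of 1/rank of the first relevant item (0 if none)."""
    if not ranked_lists:
        raise ValueError("need at least one query")
    total = 0.0
    for ranking, relevant in zip(ranked_lists, relevant_sets):
        # Ranks are 1-based: the top item has rank 1.
        for rank, item in enumerate(ranking, start=1):
            if item in relevant:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


if __name__ == "__main__":
    rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
    relevant = [{"a"}, {"y"}, {"missing"}]
    print(mean_reciprocal_rank(rankings, relevant))
```

In the toy data the first relevant items sit at ranks 1 and 2 and the third query has no hit, so the per-query scores are 1, 1/2 and 0.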


ACL HLT 2011 - TextGraphs 2011: Workshop on Graph-Based Methods for Natural Language Processing, Proceedings of the Workshop


6th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2011

ISBN: (print) 9781937284008

The proceedings contain 9 papers. The topics discussed include: a combination of topic models with max-margin learning for relation detection; nonparametric Bayesian word sense induction; invariants and variability of synonymy networks: self mediated agreement by confluence; word sense induction by community detection; using a Wikipedia-based semantic relatedness measure for document clustering; GrawlTCQ: terminology and corpora building by ranking simultaneously terms, queries and documents using graph random walks; simultaneous similarity learning and feature-weight learning for document clustering; unrestricted quantifier scope disambiguation; and from ranked words to dependency trees: two-stage unsupervised non-projective dependency parsing.


ACL 2010 - TextGraphs 2010: 2010 Workshop on Graph-Based Methods for Natural Language Processing, Proceedings of the Workshop


5th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2010

ISBN: (print) 1932432779

The proceedings contain 17 papers. The topics discussed include: graph-based clustering for computational linguistics: a survey; towards the automatic creation of a wordnet from a term-based lexical network; an investigation on the influence of frequency on the lexical organization of verbs; robust and efficient page rank for word sense disambiguation; hierarchical spectral partitioning of bipartite graphs to cluster dialects and identify distinguishing features; a character-based intersection graph approach to linguistic phylogeny; spectral approaches to learning in the graph domain; cross-lingual comparison between distributionally determined word similarity networks; co-occurrence cluster features for lexical substitutions in context; contextually-mediated semantic similarity graphs for topic segmentation; and experiments with CST-based multidocument summarization.


TextGraphs 2010 - 2010 Workshop on Graph-Based Methods for Natural Language Processing at the 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Proceedings of the Workshop


2010 Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2010 at the 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Proceedings of the Workshop

ISBN: (print) 1932432779

The proceedings contain 16 papers. The topics discussed include: graph-based clustering for computational linguistics: a survey; towards the automatic creation of a wordnet from a term-based lexical network; an investigation on the influence of frequency on the lexical organization of verbs; robust and efficient page rank for word sense disambiguation; hierarchical spectral partitioning of bipartite graphs to cluster dialects and identify distinguishing features; a character-based intersection graph approach to linguistic phylogeny; spectral approaches to learning in the graph domain; and cross-lingual comparison between distributionally determined word similarity networks.

Keywords: Computational linguistics


7th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013


7th Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013

The proceedings contain 12 papers. The special focus in this conference is on natural language processing. The topics include: preparing VerbaLex printed edition; web application for semantic network editing; portable lexical analysis for parsing of morphologically-rich languages; acquiring data for textual entailment recognition; semi-automatic theme-rheme identification; intrinsic methods for comparison of corpora; typos in Czech corpora; expanding translation memories; methods for detection of word usage over time; towards the realistic natural language representations and type-based search of idiomatic expression.


Dense semantic graph and its application in single document summarisation


7th International Workshop on Information Filtering and Retrieval, DART 2013 - Workshop of the 13th AI*IA Conference

Authors: Joshi, Monika; Wang, Hui; McClean, Sally. Affiliations: University of Ulster, Co. Antrim BT37 0QB, United Kingdom; University of Ulster, Co. Londonderry BT52 1SA, United Kingdom

Semantic graph representation of text is an important part of natural language processing applications such as text summarisation. We have studied two ways of constructing the semantic graph of a document from dependency parsing of its sentences. The first graph is derived from the subject-object-verb representation of a sentence, and the second graph is derived by considering more dependency relations in the sentence through a shortest distance dependency path calculation, resulting in a dense semantic graph. We have shown through experiments that dense semantic graphs give better performance in semantic graph based unsupervised extractive text summarisation. Copyright © 2013 for the individual papers by the papers' authors.

Keywords: Graphic methods


RANLP 2013 - Proceedings of the Student Research Workshop: Recent Advances in Natural Language Processing, associated with 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013


2013 Recent Advances in Natural Language Processing, RANLP 2013

The proceedings contain 22 papers. The topics discussed include: a dataset for Arabic textual entailment; answering questions from multiple documents – the role of multi-document summarization; multi-document summarization using automatic key-phrase extraction; automatic evaluation of summary using textual entailment; towards a discourse model for knowledge elicitation; detecting negated and uncertain information in biomedical and review texts; cross-language plagiarism detection methods; rule-based named entity extraction for ontology population; towards definition extraction using conditional random fields; and event-centered simplification of news stories.


FudanNLP: A toolkit for Chinese natural language processing


51st Annual Meeting of the Association for Computational Linguistics, ACL 2013

Authors: Qiu, Xipeng; Zhang, Qi; Huang, Xuanjing. Affiliation: Fudan University, 825 Zhangheng Road, Shanghai, China

The need for Chinese natural language processing (NLP) is growing across a range of research and commercial applications. However, most current Chinese NLP tools and components still have a wide range of issues that need further improvement and development. FudanNLP is an open source toolkit for Chinese NLP, which uses statistics-based and rule-based methods to deal with Chinese NLP tasks such as word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, time phrase recognition, and anaphora resolution. © 2013 Association for Computational Linguistics.

Keywords: Natural language processing systems
