Search results - Inner Mongolia University Library

5th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2010

Author: Biemann, Chris (475 Brannan St Ste. 330, San Francisco, CA 94107, United States)

ISBN: (print) 1932432779

This paper examines the influence of features based on clusters of co-occurrences for supervised Word Sense Disambiguation and Lexical Substitution. Co-occurrence cluster features are derived from clustering the local neighborhood of a target word in a co-occurrence graph based on a corpus in a completely unsupervised fashion. Clusters can be assigned in context and are used as features in a supervised WSD system. Experiments fitting a strong baseline system with these additional features are conducted on two datasets, showing improvements. Co-occurrence features are a simple way to mimic Topic Signatures (Martínez et al., 2008) without needing to construct resources manually. Further, a system is described that produces lexical substitutions in context with very high precision. © 2010 The Association for Computational Linguistics.

Keywords: graphic methods


Multi-level association graphs - A new graph-based model for information retrieval


2nd Workshop on Graph-based Algorithms for Natural Language Processing, TextGraphs 2007

Author: Witschel, Hans Friedrich (NLP department, University of Leipzig, P.O. Box 100920, 04009 Leipzig, Germany)

This paper introduces multi-level association graphs (MLAGs), a new graph-based framework for information retrieval (IR). The goal of that framework is twofold: first, it is meant to be a meta model of IR, i.e. it subsumes various IR models under one common representation. Second, it allows modeling different forms of search, such as feedback, associative retrieval and browsing, at the same time. It is shown how the new integrated model gives insights and stimulates new ideas for IR algorithms. One of these new ideas is presented and evaluated, yielding promising experimental results.

Keywords: Information retrieval


DegExt: a language-independent keyphrase extractor


Journal of Ambient Intelligence and Humanized Computing, 2013, vol. 4, no. 3, pp. 377-387

Authors: Litvak, Marina; Last, Mark; Kandel, Abraham (Sami Shamoon Acad Coll Engn, Dept Software Engn, IL-84100 Beer Sheva, Israel; Ben Gurion Univ Negev, Dept Informat Syst Engn, IL-84105 Beer Sheva, Israel; Univ S Florida, Dept Comp Sci & Engn, Tampa, FL 33620, USA)

In this paper, we introduce DegExt, a graph-based language-independent keyphrase extractor, which extends the keyword extraction method described in Litvak and Last (Graph-based keyword extraction for single-document summarization. In: Proceedings of the workshop on multi-source multilingual information extraction and summarization, pp 17-24, 2008). We compare DegExt with two state-of-the-art approaches to keyphrase extraction: GenEx (Turney in Inf Retr 2: 303-336, 2000) and TextRank (Mihalcea and Tarau in TextRank: bringing order into texts. In: Proceedings of the conference on empirical methods in natural language processing. Barcelona, Spain, 2004). We evaluated DegExt on collections of benchmark summaries in two different languages: English and Hebrew. Our experiments on the English corpus show that DegExt significantly outperforms TextRank and GenEx in terms of precision and area under curve for summaries of 15 keyphrases or more, at the expense of a mostly non-significant decrease in recall and F-measure, when the extracted phrases are matched against the gold standard collection. Due to DegExt's tendency to extract bigger phrases than GenEx and TextRank, when the single extracted words are considered, DegExt outperforms them both in terms of recall and F-measure. In the Hebrew corpus, DegExt performs the same as TextRank regardless of the number of keyphrases. An additional experiment shows that DegExt applied to the TextRank representation graphs outperforms the other systems in the text classification task. For documents in both languages, DegExt surpasses both GenEx and TextRank in terms of implementation simplicity and computational complexity.

Keywords: Keyphrase extraction; Summarization; Text mining; Graph-based document representation; Node centrality


Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification


12th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2018 - in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human, NAACL HLT 2018

Authors: Skianis, Konstantinos; Malliaros, Fragkiskos D.; Vazirgiannis, Michalis (École Polytechnique, France; CentraleSupélec and Inria Saclay, France)

ISBN: (print) 9781948087254

Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model in which each document is represented by a graph that encodes relationships between the different terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes by enriching graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria. Code and data are available online. © 2018 Association for Computational Linguistics.

Keywords: Embeddings
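
The Graph-of-Words idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sliding-window graph construction, the window size, and the choice of degree centrality as the node weight are assumptions made for illustration, and the function names `graph_of_words` and `degree_centrality` are hypothetical.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph for one document.

    Assumption: two terms are linked when they occur within `window`
    consecutive tokens of each other. Returns {term: set_of_neighbors}.
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbors[term]  # register the term even if it gets no edges
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def degree_centrality(neighbors):
    """Weight each term by its degree divided by (number of nodes - 1)."""
    n = len(neighbors)
    if n < 2:
        return {term: 0.0 for term in neighbors}
    return {term: len(adj) / (n - 1) for term, adj in neighbors.items()}


if __name__ == "__main__":
    doc = "graph based text classification with graph kernels".split()
    weights = degree_centrality(graph_of_words(doc))
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}: {weight:.2f}")
```

In this toy document the repeated term "graph" ends up connected to every other term, so it receives the highest weight, which is the kind of signal a centrality-based weighting can provide in place of a raw term frequency.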


Measuring aboutness of an entity in a text


1st Workshop on Graph-based Algorithms for Natural Language Processing, TextGraphs 2006 at Human Language Technologies

Authors: Moens, Marie-Francine; Jeuniaux, Patrick; Angheluta, Roxana; Mitra, Rudradeb (Legal Informatics and Information Retrieval, Katholieke Universiteit Leuven, Belgium; Department of Psychology, University of Memphis, United States; Mission Critical IT, Brussels, Belgium)

In many information retrieval and selection tasks it is valuable to score how much a text is about a certain entity and to compute how much the text discusses the entity with respect to a certain viewpoint. In this paper we are interested in giving an aboutness score to a text, when the input query is a person name and we want to measure the aboutness with respect to the biographical data of that person. We present a graph-based algorithm and compare its results with other approaches. © 2006 Association for Computational Linguistics

Keywords: graphic methods


GTN-ED: Event Detection Using Graph Transformer Networks


15th Workshop on Graph-based Methods for Natural Language Processing, TextGraphs 2021

Authors: Dutta, Sanghamitra; Ma, Liang; Saha, Tanay Kumar; Lu, Di; Tetreault, Joel; Jaimes, Alejandro (Carnegie Mellon University, United States; Dataminr, United States)

ISBN: (print) 9781954085381

Recent works show that the graph structure of sentences, generated from dependency parsers, has potential for improving event detection. However, they often only leverage the edges (dependencies) between words, and discard the dependency labels (e.g., nominal-subject), treating the underlying graph edges as homogeneous. In this work, we propose a novel framework for incorporating both dependencies and their labels using a recently proposed technique called Graph Transformer Networks (GTN). We integrate GTNs to leverage dependency relations on two existing homogeneous graph-based models, and demonstrate an improvement in the F1 score on the ACE dataset. © 2021 Association for Computational Linguistics.

Keywords: Syntactics


A novel graph kernel algorithm for improving the effect of text classification


Computer Speech & Language, 2026, vol. 95

Authors: Fan Yang; Tan Zhu; Jing Huang; Zhilin Huang; Guoqi Xie (College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, Hunan 41004, PR China; School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, PR China; Key Laboratory for Embedded and Network Computing of Hunan Province, College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410082, PR China)

Text classification is an important topic in natural language processing. In recent years, both graph kernel methods and deep learning methods have been widely employed in text classification tasks. However, previous graph kernel algorithms focused too much on the graph structure itself, such as the shortest-path subgraph, while paying limited attention to the information in the text itself. Previous deep learning methods have often consumed substantial computational resources. Therefore, we propose a new graph kernel algorithm to address these disadvantages. First, we extract the textual information of the document using a term weighting scheme. Second, we collect the structural information of the document graph. Third, a graph kernel is used to measure similarity for text classification. We compared our algorithm with eight baseline methods, including traditional deep learning methods and graph-based classification methods, on three experimental datasets, and evaluated it on multiple indicators. The experimental results demonstrate that our algorithm outperforms the baseline methods in terms of accuracy. It also reduces memory consumption by at least 69% and runtime by at least 23%. Furthermore, as we decrease the percentage of training data, our algorithm continues to achieve better results than the deep learning methods. These results show that our algorithm can improve the efficiency of text classification and reduce the use of computing resources while maintaining high accuracy.

Keywords: Document similarity measure; Graph kernel; Machine learning; Term weighting; Text classification


Timestamped graphs: Evolutionary models of text for multi-document summarization


2nd Workshop on Graph-based Algorithms for Natural Language Processing, TextGraphs 2007

Authors: Lin, Ziheng; Kan, Min-Yen (School of Computing, National University of Singapore, Singapore 177543, Singapore)

Current graph-based approaches to automatic text summarization, such as LexRank and TextRank, assume a static graph which does not model how the input texts emerge. A suitable evolutionary text graph model may impart a better understanding of the texts and improve the summarization process. We propose a timestamped graph (TSG) model that is motivated by human writing and reading processes, and show how text units in this model emerge over time. In our model, the graphs used by LexRank and TextRank are specific instances of our timestamped graph with particular parameter settings. We apply timestamped graphs on the standard DUC multi-document text summarization task and achieve comparable results to the state of the art.

Keywords: graphic methods


Unigram language models using diffusion smoothing over graphs


2nd Workshop on Graph-based Algorithms for Natural Language Processing, TextGraphs 2007

Authors: Jedynak, Bruno; Karakos, Damianos (Dept. of Appl. Mathematics and Statistics, Center for Imaging Sciences, Johns Hopkins University, Baltimore, MD 21218-2686, United States; Dept. of Electrical and Computer Engineering, Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD 21218-2686, United States)

We propose to use graph-based diffusion techniques with data-dependent kernels to build unigram language models. Our approach entails building graphs, where each vertex corresponds uniquely to a word from a closed vocabulary, and the existence of an edge (with an appropriate weight) between two words indicates some form of similarity between them. In one of our constructions, we place an edge between two words if the number of times these words were seen in a training set differs by at most one count. This graph construction results in a similarity matrix with small intrinsic dimension, since words with the same counts have the same neighbors. Experimental results from a benchmark task from language modeling show that our method is competitive with the Good-Turing estimator.

Keywords: graphic methods
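
The count-based graph construction mentioned in this abstract (an edge between two words whose training counts differ by at most one) can be sketched as follows. This covers only the graph construction, not the diffusion smoothing itself, and it is an illustration rather than the authors' code: the function name `count_similarity_graph` is hypothetical, and treating every word as its own neighbor (a self-loop) is an assumption made here so that words with equal counts get identical neighbor sets.

```python
from collections import Counter


def count_similarity_graph(training_tokens, vocabulary):
    """Link words whose training-set counts differ by at most one.

    Assumption: each word is also linked to itself, so two words with the
    same count share exactly the same neighbor set.
    Returns {word: set_of_neighbors} over the closed vocabulary.
    """
    counts = Counter(training_tokens)
    freq = {word: counts[word] for word in vocabulary}  # unseen words get 0
    return {
        word: {other for other in vocabulary
               if abs(freq[word] - freq[other]) <= 1}
        for word in vocabulary
    }


if __name__ == "__main__":
    training = "a a a b b b c c d".split()
    vocab = ["a", "b", "c", "d", "e"]  # "e" never occurs in training
    graph = count_similarity_graph(training, vocab)
    for word in vocab:
        print(word, sorted(graph[word]))
```

Because neighborhoods depend only on a word's count, the adjacency structure has at most one distinct row per observed count value, which matches the abstract's remark that words with the same counts have the same neighbors.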


Unsupervised large-vocabulary word sense disambiguation with graph-based algorithms for sequence data labeling


Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, HLT/EMNLP 2005, co-located with the 2005 Document Understanding Conference, DUC and the 9th International Workshop on Parsing Technologies, IWPT

Author: Mihalcea, Rada (Department of Computer Science, University of North Texas, United States)

This paper introduces a graph-based algorithm for sequence data labeling, using random walks on graphs encoding label dependencies. The algorithm is illustrated and tested in the context of an unsupervised word sense disambiguation problem, and shown to significantly outperform the accuracy achieved through individual label assignment, as measured on standard sense-annotated data sets. © 2005 Association for Computational Linguistics.

Keywords: graphic methods

