Search results - Inner Mongolia University Library

Improving unsupervised keyphrase extraction by modeling hierarchical multi-granularity features

INFORMATION PROCESSING & MANAGEMENT, 2023, Vol. 60, No. 4

Authors: Zhang, Zhihao; Liang, Xinnian; Zuo, Yuan; Lin, Chenghua. Affiliations: Beihang Univ, Sch Econ & Management, Beijing, Peoples R China; Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China; Univ Sheffield, Dept Comp Sci, Sheffield, England

Existing unsupervised keyphrase extraction methods typically emphasize the importance of the candidate keyphrase itself, ignoring other important factors such as the influence of uninformative sentences. We hypothesize that the salient sentences of a document are particularly important as they are most likely to contain keyphrases, especially for long documents. To our knowledge, our work is the first attempt to exploit sentence salience for unsupervised keyphrase extraction by modeling hierarchical multi-granularity features. Specifically, we propose a novel position-aware graph-based unsupervised keyphrase extraction model, which includes two model variants. The pipeline model first extracts salient sentences from the document, followed by keyphrase extraction from the extracted salient sentences. In contrast to the pipeline model, which models multi-granularity features in a two-stage paradigm, the joint model accounts for both sentence and phrase representations of the source document simultaneously via hierarchical graphs. Concretely, the sentence nodes are introduced as an inductive bias, injecting sentence-level information for determining the importance of candidate keyphrases. We compare our model against strong baselines on three benchmark datasets: Inspec, DUC 2001, and SemEval 2010. Experimental results show that the simple pipeline-based approach achieves promising results, indicating that the keyphrase extraction task benefits from the salient sentence extraction task. The joint model, which mitigates the potential accumulated error of the pipeline model, gives the best performance and achieves new state-of-the-art results while generalizing better on data from different domains and with different lengths. In particular, for the SemEval 2010 dataset consisting of long documents, our joint model outperforms the strongest baseline UKERank by 3.48%, 3.69% and 4.84% in terms of F1@5, F1@10 and F1@15, respectively. We also conduct qualitative experimen

Keywords: Unsupervised keyphrase extraction; graph-based ranking algorithm; Hierarchical Multi-granularity features
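The abstract above reports gains in F1@5, F1@10 and F1@15. As a rough illustration of what such a metric measures, here is a minimal Python sketch. The function name `f1_at_k` and the matching rule (exact match after lowercasing and trimming) are assumptions made for illustration; the record does not say how the paper normalizes phrases, and published evaluations often apply stemming first.

```python
def f1_at_k(predicted, gold, k):
    """F1 of the top-k predicted keyphrases against a gold set.

    Assumption: phrases match exactly after lowercasing and trimming.
    """
    top_k = [p.lower().strip() for p in predicted[:k]]
    gold_set = {g.lower().strip() for g in gold}
    if not top_k or not gold_set:
        return 0.0
    # Count distinct correct predictions so duplicates are not rewarded.
    correct = len(set(top_k) & gold_set)
    precision = correct / len(top_k)
    recall = correct / len(gold_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranked predictions and gold keyphrases for one document.
predicted = ["graph ranking", "keyphrase extraction", "salient sentence",
             "neural network", "dataset"]
gold = ["keyphrase extraction", "salient sentence", "hierarchical graph"]
score_at_5 = f1_at_k(predicted, gold, 5)
score_at_2 = f1_at_k(predicted, gold, 2)
```

Scores for a whole dataset are typically averaged over documents; that step is left out here.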


Query Focused Multi-document Summarization Based on Five-Layered Graph and Universal Paraphrastic Embeddings


6th Computer Science On-Line Conference (CSOC)

Author: Canhasi, Ercan. Affiliation: Gjirafa Inc 28A Prishtine Kosovo

ISBN: (print) 9783319572611

Query focused multi-document summarization is a process of automatic query biased text compression of a document set. Lately, graph-based and ranking methods have attracted intense interest from researchers in the extractive document summarization domain. Uniform sentence connectedness or non-uniform document-sentence connectedness, such as sentence similarity weighted by document importance, were the main features used by work to date. In contrast, in this paper we present a novel five-layered heterogeneous graph model. It emphasizes not only sentence and document level relations but also the influence of lower level relations (e.g. part-of-sentence similarity) and higher level relations (i.e. query to sentence similarity). Based on this model, we developed an iterative sentence ranking algorithm that builds on the well-known PageRank algorithm. Moreover, for text similarity calculations we used universal paraphrastic embeddings that outperform various strong baselines on many text similarity tasks and many domains. Experiments are conducted on the DUC 2005 data sets and the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) evaluation results demonstrate the advantages of the proposed approach.

Keywords: Multidocument summarization; graph-based summarization; graph-based ranking algorithm; PageRank
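To make the idea of PageRank-style, query-biased sentence ranking more concrete, the sketch below ranks sentences on a single-layer similarity graph. It is not the five-layered model from the abstract. The name `query_biased_rank`, the bag-of-words cosine similarity (standing in for paraphrastic embeddings) and the use of query similarity as the teleport distribution are all simplifying assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def query_biased_rank(sentences, query, damping=0.85, iterations=50):
    """Personalized-PageRank-style scores for sentences, biased toward the query."""
    n = len(sentences)
    if n == 0:
        return []
    bags = [Counter(s.lower().split()) for s in sentences]
    q = Counter(query.lower().split())
    # Sentence-to-sentence edge weights, without self-loops.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out = [sum(row) for row in sim]
    # Query relevance acts as the teleport (personalization) distribution.
    rel = [cosine(b, q) for b in bags]
    total = sum(rel)
    bias = [r / total for r in rel] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Isolated sentences (out[j] == 0) pass on no score in this sketch.
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(scores[j] * sim[j][i] / out[j]
                            for j in range(n) if out[j])
            for i in range(n)
        ]
    return scores


# Hypothetical mini document set and query.
sentences = [
    "the storm damaged the coastal town",
    "relief funds reached the coastal town",
    "the football season starts next week",
]
scores = query_biased_rank(sentences, "storm damage coastal")
```

Sentences that both resemble the query and are well connected to other sentences end up with the highest scores; a summarizer would then select top-ranked sentences subject to a length limit.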


Improved Automatic Keyword Extraction Given More Semantic Knowledge


International Workshop on Database Systems for Advanced Applications (DASFAA)

Authors: Yang, Kai; Chen, Zhenhong; Cai, Yi; Huang, DongPing; Leung, Ho-Fung. Affiliations: South China Univ Technol, Sch Software Engn, Guangzhou, Guangdong, Peoples R China; Chinese Univ Hong Kong, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China

ISBN: (digital) 9783319320557

ISBN: (print) 9783319320557; 9783319320540

Graph-based ranking algorithms such as TextRank show a remarkable effect on keyword extraction. However, these algorithms build graphs considering only the lexical sequence of the documents. Hence, graphs generated by these algorithms cannot reflect the semantic relationships between documents. In this paper, we demonstrate that information is lost in the graph-building process from textual documents to graphs. This loss leads the algorithm to misjudge. In order to solve this problem, we propose a new approach called Topic-based TextRank. Unlike the traditional algorithm, our approach takes the lexical meaning of text units (i.e. words and phrases) into account. The results of our experiments show that our proposed algorithm can outperform the state-of-the-art algorithms.

Keywords: Keyword extraction; Topic model; graph-based ranking algorithm; Semantic analysis
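For readers unfamiliar with the baseline that the abstract criticizes, the following is a minimal sketch of plain TextRank keyword ranking, which builds its graph only from word co-occurrence in the token sequence. It is not the Topic-based TextRank proposed in the paper. The function name `textrank_keywords`, the window size and the assumption that `tokens` are already filtered (e.g. stopwords removed) are illustrative choices.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_n=5):
    """Rank words by unweighted TextRank over a co-occurrence graph."""
    # Undirected graph: words within `window` positions of each other are linked.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window + 1]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        # Each word collects score from its neighbors, split by their degree.
        scores = {
            w: (1 - damping)
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            for w in neighbors
        }
    # Sort by score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]


# Hypothetical pre-filtered token sequence.
tokens = ["graph", "ranking", "graph", "keyword",
          "graph", "extraction", "graph", "topic"]
top_words = textrank_keywords(tokens, window=1, top_n=5)
```

Because edges depend only on word adjacency, two words with related meanings that never co-occur stay unconnected, which is the kind of information loss the abstract describes.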


SRRank: Leveraging Semantic Roles for Extractive Multi-Document Summarization


IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, Vol. 22, No. 12, pp. 2048-2058

Authors: Yan, Su; Wan, Xiaojun. Affiliation: Peking Univ, MOE Key Lab Computat Linguist, Inst Comp Sci & Technol, Beijing 100871, Peoples R China

Extractive multi-document summarization systems usually rank sentences in a document set with some ranking strategy and then select a few highly ranked sentences into the summary. One of the most popular ranking algorithms is the graph-based ranking algorithm. In this paper, we investigate making use of semantic role information to enhance the graph-based ranking algorithm for multi-document summarization. We first parse the sentences and obtain the semantic roles, and then propose a novel SRRank algorithm and two extensions to make better use of the semantic role information. Our proposed algorithms can simultaneously rank the sentences, semantic roles and words in a heterogeneous ranking process. Experimental results on two DUC datasets demonstrate that our proposed algorithms significantly outperform a few baselines, and the semantic role information is validated to be very helpful for multi-document summarization.

Keywords: graph-based ranking algorithm; multi-document summarization; semantic roles


Generating summaries by means of synthesis of conceptual graphs


REVISTA SIGNOS, 2014, Vol. 47, No. 86, pp. 463-485

Authors: Miranda Jimenez, Sabino; Gelbukh, Alexander; Sidorov, Grigori. Affiliation: Inst Politecn Nacl, Mexico City 07738, DF, Mexico

In this study, we propose a model for generating single-document abstractive summaries, based on the conceptual representation of the text. Although there are studies that take into account the partial syntactic or semantic representation of the text, so far, a complete semantic representation of texts has not been used for generating summaries. Our model uses a complete semantic representation of text by means of conceptual graph structures. In this context, the task of generating the summary is reduced to summarizing the set of corresponding conceptual graphs. In order to do this, a set of operations on graphs is applied: generalization, join or association, ranking, and pruning. Furthermore, a hierarchy of concepts (WordNet) and heuristic rules based on the semantic patterns from VerbNet are used in order to support such operations. The resulting set of graphs depicts the text summary at the conceptual level. The method was evaluated on the DUC 2003 data collection. The results show that the method is effective for summarizing short texts.

Keywords: Abstractive summarization; weighted conceptual graphs; graph-based ranking algorithm; HITS algorithm


A document-sensitive graph model for multi-document summarization


KNOWLEDGE AND INFORMATION SYSTEMS, 2010, Vol. 22, No. 2, pp. 245-259

Authors: Wei, Furu; Li, Wenjie; Lu, Qin; He, Yanxiang. Affiliations: Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China; Wuhan Univ, Dept Comp Sci & Technol, Wuhan 430072, Peoples R China

In recent years, graph-based models and ranking algorithms have drawn considerable attention from the extractive document summarization community. Most existing approaches take into account sentence-level relations (e.g. sentence similarity) but neglect the difference among documents and the influence of documents on sentences. In this paper, we present a novel document-sensitive graph model that emphasizes the influence of global document set information on local sentence evaluation. By exploiting document-document and document-sentence relations, we distinguish intra-document sentence relations from inter-document sentence relations. In this way, we move towards the goal of truly summarizing multiple documents rather than a single combined document. Based on this model, we develop an iterative sentence ranking algorithm, namely DsR (Document-Sensitive ranking). Automatic ROUGE evaluations on the DUC data sets show that DsR outperforms previous graph-based models in both generic and query-oriented summarization tasks.

Keywords: graph-based summarization model; graph-based ranking algorithm; Inter- and intra-document relation; Generic summarization; Query-oriented summarization

