Search Results - Inner Mongolia University Library

An optimization approach to automatic generic document summarization

COMPUTATIONAL INTELLIGENCE, 2013, Vol. 29, No. 1, pp. 129-155

Authors: Alguliev, Rasim M.; Aliguliyev, Ramiz M.; Mehdiyev, Chingiz A. (Azerbaijan Natl Acad Sci, Inst Informat Technol, Baku 1141 Az, Azerbaijan)

In this paper, we have presented an optimization approach to document summarization. The potential of optimization-based document summarization models has not been well explored to date. This is partly due to the difficulty of formulating the criteria used for objective assessment. We modeled document summarization as linear and nonlinear optimization problems. These models generally attempt to balance coverage and diversity in the summary simultaneously. To solve the optimization problem we developed a novel particle swarm optimization (PSO) algorithm. Experiments showed that our linear and nonlinear models produce very competitive results, which significantly outperform the NIST baselines in both years (DUC2005 and DUC2006). More importantly, although the linear and nonlinear models are comparable to the top three systems S24, S15, and S12 in DUC2006, they are even superior to the best participating system in DUC2005.

Keywords: generic document summarization summary diversity redundancy optimization models PSO with nonlinear decreasing inertia weight PMI-based sentence similarity measure

GenDocSum plus MCLR: generic document summarization based on maximum coverage and less redundancy

EXPERT SYSTEMS WITH APPLICATIONS, 2012, Vol. 39, No. 16, pp. 12460-12473

Authors: Alguliev, Rasim M.; Aliguliyev, Ramiz M.; Hajirahimova, Makrufa S. (Azerbaijan Natl Acad Sci, Inst Informat Technol, AZ-1141 Baku, Azerbaijan)

With the recent rapid growth of information on the Internet and in electronic government, automatic multi-document summarization has become an important task. Multi-document summarization is an optimization problem requiring simultaneous optimization of more than one objective function. In this study, when building summaries from multiple documents, we attempt to balance two objectives, content coverage and redundancy. Our goal is to investigate three fundamental aspects of the problem, i.e. designing an optimization model, solving the optimization problem and finding the solution to the best summary. We model multi-document summarization as a Quadratic Boolean Programming (QBP) problem where the objective function is a weighted combination of the content coverage and redundancy objectives. The objective function measures the possible summaries based on the identified salient sentences and overlap information between selected sentences. An innovative aspect of our model lies in its ability to remove redundancy while selecting representative sentences. The QBP problem has been solved by using a binary differential evolution algorithm. Evaluation of the model has been performed on the DUC2002, DUC2004 and DUC2006 data sets. We have evaluated our model automatically using the ROUGE toolkit and reported the significance of our results through 95% confidence intervals. The experimental results show that the optimization-based approach for document summarization is truly a promising research direction. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: generic document summarization; Maximum coverage; Less redundancy; Optimization model; Differential evolution algorithm
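The weighted coverage/redundancy objective described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the bag-of-words cosine similarity, the centroid-based coverage term, the weight `lam`, and the function names are all assumptions made for illustration, and exhaustive search over 0/1 selection vectors stands in for the binary differential evolution algorithm named in the abstract.

```python
import itertools
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def objective(x, vecs, centroid, lam=0.5):
    """Weighted coverage minus redundancy for a 0/1 selection vector x.

    Assumed form: coverage is similarity of each chosen sentence to the
    collection centroid; redundancy is pairwise similarity among chosen
    sentences (the quadratic term in the Boolean variables).
    """
    chosen = [i for i, xi in enumerate(x) if xi]
    coverage = sum(cosine(vecs[i], centroid) for i in chosen)
    redundancy = sum(
        cosine(vecs[i], vecs[j]) for i, j in itertools.combinations(chosen, 2)
    )
    return lam * coverage - (1 - lam) * redundancy


def best_summary(sentences, k=2, lam=0.5):
    """Pick k sentences by brute force (only feasible for tiny inputs)."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    centroid = sum(vecs, Counter())
    n = len(sentences)
    candidates = (
        tuple(1 if i in combo else 0 for i in range(n))
        for combo in itertools.combinations(range(n), k)
    )
    best = max(candidates, key=lambda x: objective(x, vecs, centroid, lam))
    return [s for s, xi in zip(sentences, best) if xi]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat sat on the mat",
        "dogs chase the ball in the park",
    ]
    print(best_summary(docs, k=2))
```

With a duplicated sentence in the input, the redundancy term penalizes selecting both copies, so the sketch prefers one copy plus the distinct sentence. Raising `lam` toward 1 shifts the balance toward coverage alone.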

Extractive multi-document text summarization based on graph independent sets

EGYPTIAN INFORMATICS JOURNAL, 2020, Vol. 21, No. 3, pp. 145-157

Authors: Uckan, Taner; Karci, Ali (Van Yuzuncu Yil Univ, Comp Programming Dept, TR-65000 Van, Turkey; Inonu Univ, Dept Comp Engn, TR-44000 Malatya, Turkey)

We propose a novel methodology for extractive, generic summarization of text documents. The Maximum Independent Set, which has not been used previously in any summarization study, has been utilized within the context of this study. In addition, a text processing tool, which we named KUSH, is suggested in order to preserve the semantic cohesion between sentences in the representation stage of introductory texts. Our anticipation was that the set of sentences corresponding to the nodes in the independent set should be excluded from the summary. Based on this anticipation, the nodes forming the Independent Set on the graphs are identified and removed from the graph. Thus, prior to quantification of the effect of the nodes on the global graph, a limitation is applied on the documents to be summarized. This limitation prevents repeated word groups from being included in the summary. Performance of the proposed approach on the Document Understanding Conference (DUC-2002 and DUC-2004) datasets was calculated using ROUGE evaluation metrics. The developed model achieved a 0.38072 ROUGE performance value for 100-word summaries, 0.51954 for 200-word summaries, and 0.59208 for 400-word summaries. The values reported throughout the experimental processes of the study reveal the contribution of this innovative method. (C) 2019 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.

Keywords: Graph independent set; Graph-based document summarization; generic document summarization; Extractive text summarization; Multi document text summarization
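The step described in this abstract, identifying an independent set on a sentence graph and removing its nodes before ranking, can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say how the graph is built, so the similarity threshold and the function names here are hypothetical, and the greedy routine returns a maximal independent set rather than the Maximum Independent Set (which is NP-hard to compute in general).

```python
def build_graph(sim, threshold=0.3):
    """Adjacency sets from a symmetric similarity matrix (assumed construction)."""
    n = len(sim)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def greedy_independent_set(adj):
    """Greedy maximal independent set: repeatedly take a minimum-degree node."""
    remaining = {v: set(nb) for v, nb in adj.items()}
    chosen = set()
    while remaining:
        # Lowest degree first; ties broken by node index for determinism.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.add(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for nb in remaining.values():
            nb -= removed
    return chosen


def remove_independent_nodes(adj):
    """Return the independent set and the graph left after removing it."""
    ind = greedy_independent_set(adj)
    kept = {v: nb - ind for v, nb in adj.items() if v not in ind}
    return ind, kept


if __name__ == "__main__":
    # Four sentences forming a chain 0-1-2-3 of similar neighbours.
    sim = [
        [1.0, 0.9, 0.0, 0.0],
        [0.9, 1.0, 0.8, 0.0],
        [0.0, 0.8, 1.0, 0.7],
        [0.0, 0.0, 0.7, 1.0],
    ]
    ind, kept = remove_independent_nodes(build_graph(sim))
    print(ind, kept)
```

The sentences left in `kept` would then be scored and ranked by whatever centrality measure the summarizer uses; that stage is outside this sketch.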
