ISBN (print): 9798400704314
Extracting relevant information from legal documents is a challenging task due to the technical complexity and volume of their content. These factors also increase the costs of annotating large datasets, which are required to train state-of-the-art summarization systems. To address these challenges, we introduce CivilSum, a collection of 23,350 legal case decisions from the Supreme Court of India and other Indian High Courts paired with human-written summaries. Compared to previous datasets such as IN-Abs, CivilSum not only contains more legal decisions but also provides shorter and more abstractive summaries, thus offering a challenging benchmark for legal summarization. Unlike in other domains such as news articles, our analysis shows that the most important content tends to appear at the end of the documents. We measure the effect of this tail bias on summarization performance using strong architectures for long-document abstractive summarization, and the results highlight the importance of long-sequence modeling for the proposed task. CivilSum and the related code are publicly available to the research community to advance text summarization in the legal domain.
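The tail-bias analysis described above can be approximated with a simple positional-overlap heuristic. The sketch below (the function names and token-overlap matcher are illustrative assumptions, not the paper's actual method) locates each summary sentence's best-matching source sentence and reports the average normalized position:

```python
def overlap(a, b):
    """Fraction of tokens in sentence b that also appear in sentence a."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(tb) if tb else 0.0

def mean_source_position(doc_sentences, summary_sentences):
    """For each summary sentence, find the best-matching document
    sentence and return the average normalized position of the matches
    (0 = start of document, 1 = end). Values near 1 indicate tail bias."""
    positions = []
    n = len(doc_sentences)
    for s in summary_sentences:
        best = max(range(n), key=lambda i: overlap(doc_sentences[i], s))
        positions.append(best / (n - 1) if n > 1 else 0.0)
    return sum(positions) / len(positions)

doc = ["the appeal was filed in 2001",
       "arguments were heard at length",
       "the court dismisses the appeal with costs"]
summ = ["court dismisses appeal with costs"]
print(mean_source_position(doc, summ))  # 1.0 — the key content sits at the end
```

A value averaged over a corpus near 0.5 would indicate no positional bias; legal decisions, per the abstract, skew toward 1.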
ISBN (print): 9781728119045
With the growth of available information, there is a need for summarization systems that direct readers to the content they are interested in without wasting their time. In this work, Turkish news headlines have been predicted using an encoder-decoder model from deep learning. An abstraction-based text summarization method has been used to generate the headlines. The system has been trained with recurrent neural networks built on the encoder-decoder model. The word embeddings of the words in the news texts have been generated with FastText, a model widely used in the recent literature. The system has been tested separately by training on the first sentence, the first two sentences, and the full text of each news article. The success of the system is measured by ROUGE score and a semantic similarity score. According to the experimental results, the model trained on the full text of the news outperforms the other models.
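ROUGE, the evaluation metric mentioned above, is easy to illustrate. A minimal ROUGE-1 recall in plain Python (no stemming or stopword handling, unlike the full toolkit) might look like:

```python
from collections import Counter

def rouge1_recall(reference, candidate):
    """ROUGE-1 recall: the fraction of reference unigrams covered by the
    candidate, with counts clipped as in the standard definition."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    covered = sum(min(n, cand[w]) for w, n in ref.items())
    return covered / sum(ref.values()) if ref else 0.0

print(rouge1_recall("the cat sat on the mat",
                    "a cat sat on a mat"))  # 4/6 ≈ 0.667
```

Full ROUGE also reports precision and F1, and ROUGE-2/ROUGE-L variants over bigrams and longest common subsequences; the recall form above is the core idea.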
Abstractive text summarization helps people quickly obtain the key information of an article, and existing models generate fluent summaries but often suffer from factual consistency problems, a key issue that current ...
Abstractive text summarization plays an important role in natural language processing. However, abstractive summarization methods based on deep learning often produce semantically inaccurate or repetitive words when predicting summary text. To address the problem of semantic inaccuracy, we propose the MS-Pointer Network, which introduces a multi-head self-attention mechanism into the basic encoder-decoder model. Multi-head self-attention can relate input words to one another arbitrarily within the encoder-decoder and assign higher weights to semantically related combinations, thereby enhancing the semantic features of the text so that the generated summary is better structured semantically. The multi-head self-attention mechanism also incorporates positional information for the input text, which further strengthens its semantic representation. To handle out-of-vocabulary words, a pointer network is added to the sequence-to-sequence model with multi-head attention; the resulting model is referred to as the MS-Pointer Network. We validate the model on the CNN/Daily Mail and Gigaword datasets, using the ROUGE metric for evaluation. Experiments show that the abstractive summaries generated with the multi-head self-attention mechanism outperform the current state of the art by an average of two points.
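The multi-head self-attention the abstract builds on can be sketched in a few lines. The version below is a simplification for illustration: it omits the learned query/key/value projections a real implementation would include and simply splits each embedding into head slices:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def multi_head_self_attention(x, heads=2):
    """Split each embedding into `heads` slices, attend within each
    slice, and concatenate the results (learned projections omitted)."""
    d = len(x[0]) // heads
    out = [[] for _ in x]
    for h in range(heads):
        slice_ = [v[h * d:(h + 1) * d] for v in x]
        for row, att in zip(out, attention(slice_, slice_, slice_)):
            row.extend(att)
    return out

x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
y = multi_head_self_attention(x)
print(len(y), len(y[0]))  # 2 4 — one output vector per token, shape preserved
```

Each head attends over a different subspace of the embedding, which is what lets the mechanism weight different word combinations independently.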
A novel text summarization framework referred to as Skip-Thought Vector and Bi-encoder Based Automatic Text Summarization (STV-BEATS) is proposed in this paper. STV-BEATS utilizes (a) skip-thought vectors to generate sentence-based embeddings, and (b) a Long Short-Term Memory (LSTM) based deep autoencoder to reduce the dimensionality of the skip-thought vectors. STV-BEATS combines extractive and abstractive summarization models to enhance the overall quality of the results. For each sentence, relevance and novelty metrics are calculated on the intermediate representation of the deep autoencoder to produce a final sentence score, and the highest-scoring sentences are selected to form an extractive summary. The abstractive part is composed of two encoders and a decoder: (a) the first GRU-based bi-directional encoder and the decoder act as a basic sequence-to-sequence model over the extractive summary, and (b) a second GRU-based unidirectional encoder is used for fine encoding. Extensive computational experiments on three standard benchmark datasets, namely CNN/Daily Mail, DUC-2004, and DUC-2002, are conducted to determine the effectiveness of STV-BEATS, with Recall-Oriented Understudy for Gisting Evaluation (ROUGE) used for validation. The results reveal that the proposed STV-BEATS is capable of effective text summarization and achieves substantially better results than state-of-the-art models. (C) 2022 Elsevier B.V. All rights reserved.
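The relevance-and-novelty sentence scoring described above can be illustrated with a greedy selector. The centroid-similarity relevance, the redundancy penalty, and the `alpha` weighting below are simplifying assumptions for illustration, not STV-BEATS's exact formulation:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def select_extractive(sent_vecs, k=2, alpha=0.7):
    """Greedy selection: score = alpha * relevance (similarity to the
    document centroid) minus (1 - alpha) * redundancy (max similarity
    to sentences already picked), i.e. relevance plus novelty."""
    n = len(sent_vecs)
    dim = len(sent_vecs[0])
    centroid = [sum(v[j] for v in sent_vecs) / n for j in range(dim)]
    picked = []
    while len(picked) < k:
        best, best_score = None, float("-inf")
        for i, v in enumerate(sent_vecs):
            if i in picked:
                continue
            relevance = cosine(v, centroid)
            redundancy = max((cosine(v, sent_vecs[j]) for j in picked),
                             default=0.0)
            score = alpha * relevance - (1 - alpha) * redundancy
            if score > best_score:
                best, best_score = i, score
        picked.append(best)
    return sorted(picked)

# sentences 0 and 1 are near-duplicates; sentence 2 is novel
vecs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]]
print(select_extractive(vecs))  # [1, 2] — one of the pair plus the novel sentence
```

The novelty term is what keeps the extract from filling up with near-duplicate sentences, which relevance alone would encourage.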
The application of neural networks in natural language processing, including abstractive text summarization, has become increasingly attractive in recent years. However, teaching a neural network to generate a human-readable summary that reflects the core idea of the original source text (i.e., is semantically similar to it) remains a challenging problem. In this paper, we explore using generative adversarial networks to solve this problem. The proposed model contains three components: a generator that encodes the long input text into a shorter representation; a discriminator that teaches the generator to create human-readable summaries; and another discriminator that constrains the generator's output to reflect the core idea of the input text. The main training is carried out as an adversarial learning process. To solve the non-differentiability introduced by the word-sampling process, we use the policy gradient algorithm to optimize the generator. We evaluate the proposed model on the CNN/Daily Mail summarization task, and the experimental results show that it outperforms previous state-of-the-art models.
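The policy gradient workaround for the non-differentiable sampling step is the classic REINFORCE estimator. A toy sketch follows; the three-token vocabulary and the 0/1 "discriminator" reward are illustrative assumptions, not the paper's setup:

```python
import math, random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reinforce_step(logits, reward_fn, lr=0.5, rng=random):
    """One REINFORCE update: sample an action from the softmax policy,
    query a black-box reward, and move the logits along
    reward * grad log pi(action). This sidesteps backprop through the
    discrete sampling step, which has no gradient."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    reward = reward_fn(action)
    # grad of log softmax w.r.t. logit i: (1 if i == action else 0) - probs[i]
    return [l + lr * reward * ((1.0 if i == action else 0.0) - probs[i])
            for i, l in enumerate(logits)], action, reward

# toy "discriminator": only token 2 earns a reward
reward_fn = lambda a: 1.0 if a == 2 else 0.0
rng = random.Random(0)
logits = [0.0, 0.0, 0.0]
for _ in range(200):
    logits, _, _ = reinforce_step(logits, reward_fn, rng=rng)
print(max(range(3), key=lambda i: logits[i]))  # 2 — the rewarded token dominates
```

In the adversarial setup, the discriminator scores replace the hard-coded reward function, so the generator is pushed toward outputs the discriminators accept.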
Bayesian active learning has had a significant impact on various NLP problems, but its application to text summarization has been explored very little. We introduce Bayesian Active Summarization (BAS), a method for combining active learning with state-of-the-art summarization models. Our findings suggest that BAS achieves better and more robust performance than random selection, particularly for small and very small annotation budgets. More specifically, applying BAS with a summarization model such as PEGASUS, we reached 95% of the performance of the fully trained model using fewer than 150 training samples, and reduced the standard deviation by 18% compared to the conventional random selection strategy. With BAS, we show it is possible to leverage large summarization models to effectively solve real-world problems with very limited annotated data.
Abstractive summarization is flexible, allowing the model to generate new words and phrases. However, frequent words are more likely to be selected as candidate words during abstractive summarization, causing the generated summary to diverge from the reference. We attribute this to representation degeneration in the pre-trained word embeddings. This paper therefore proposes a general abstractive summarization framework with dynamic word-embedding representation correction (RepSum). The representation correction algorithm identifies the embedding dimension most relevant to word frequency and eliminates the word-frequency features, making the distribution of word embeddings more even. As a result, candidate words are selected without frequency bias, improving the quality of the summary. The experimental results show that RepSum outperforms the benchmark model in summary quality, demonstrating our method's effectiveness.
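The frequency-dimension correction can be illustrated directly: find the embedding dimension whose values correlate most strongly with log word frequency and zero it out. The toy embeddings below are fabricated for the example (dimension 0 secretly encodes log frequency), and the single-dimension Pearson heuristic is a simplification of the paper's algorithm:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def remove_frequency_dimension(embeddings, freqs):
    """Find the embedding dimension whose values correlate most (in
    absolute value) with log word frequency and zero it, so candidate
    words can be scored without a frequency bias."""
    logf = [math.log(f) for f in freqs]
    dims = len(next(iter(embeddings.values())))
    words = list(embeddings)
    corrs = [abs(pearson([embeddings[w][d] for w in words], logf))
             for d in range(dims)]
    bad = max(range(dims), key=lambda d: corrs[d])
    return {w: [0.0 if d == bad else v for d, v in enumerate(vec)]
            for w, vec in embeddings.items()}, bad

emb = {"the": [9.2, 0.3], "court": [6.1, -0.5], "adjourn": [2.3, 0.1]}
freqs = [10000, 450, 10]  # corpus counts, aligned with the dict order above
corrected, bad_dim = remove_frequency_dimension(emb, freqs)
print(bad_dim, corrected["the"])  # 0 [0.0, 0.3]
```

After the correction, the remaining dimensions carry the semantic signal, which is the property RepSum exploits to even out the candidate-word distribution.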
ISBN (digital): 9781728132457
ISBN (print): 9781728132464
This age of data-driven innovation has made automated extraction of relevant and important information a necessity. Automated text summarization makes it possible to extract relevant information from large amounts of data without supervision. But extractive output can read as artificial, and that is where abstractive summarization tries to mimic the human way of summarizing by creating coherent summaries with novel words and sentences. Because of the difficulty of this approach, little progress was made before deep learning. In this work, we propose an attention-based sequence-to-sequence network to generate abstractive summaries of Bengali text. We have also built our own large Bengali news dataset and applied our model to it, showing that deep sequence-to-sequence neural networks can indeed achieve good performance summarizing Bengali texts.