
Diluie: constructing diverse demonstrations of in-context learning with large language model for unified information extraction

Authors: Guo, Qian; Guo, Yi; Zhao, Jin

Affiliations: Department of Computer Science and Engineering, East China University of Science and Technology, Shanghai 200237, China; Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai 200433, China

Published in: Neural Computing and Applications (Neural Comput. Appl.)

Year/Volume/Issue: 2024, Vol. 36, No. 22

Pages: 13491-13512

Subject classification: 1205 [Management - Library, Information and Archival Management]; 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degree conferrable in Engineering or Science)]; 081202 [Engineering - Computer Software and Theory]

Funding: This research is financially supported by the Science and Technology Committee of Shanghai Municipality (STCSM) (Science and Technology Program Grants 22511104800 and 22DZ1204903).

Keywords: Demonstrations

Abstract: Large language models (LLMs) have demonstrated promising in-context learning capabilities, especially with instructive prompts. However, recent studies have shown that existing large models still face challenges on specific information extraction (IE) tasks. Moreover, these models could make more effective use of various prompts, such as instruction tuning, diverse demonstrations for in-context learning, and long-range token sequences, to help language modeling understand context. In this study, we propose DILUIE, a unified information extraction framework based on in-context learning with diverse demonstration examples. DILUIE is encoded with an EVA attention mechanism and incremental encoding technology. Based on the constructed diverse demonstrations, we efficiently expand the number of instances in both instruction tuning and in-context learning to gain insights into the potential benefits of utilizing diverse information extraction datasets. To deepen the understanding of context, we further design three auxiliary tasks to assist in aligning contextual semantics. Experimental results demonstrate that DILUIE achieves average improvements of 2.23% and 2.53% in Micro- and Macro-F1, respectively, over the current state-of-the-art baseline, and also significantly outperforms GPT-3.5-turbo in zero-shot settings; the average token length at which the best performance is achieved across tasks is around 15k. Furthermore, we observe that in-context learning shows enhanced performance when provided with more demonstrations during multiple-shot instruction tuning (8k). Additionally, increasing the length of instructions (10k) can result in a more substantial improvement in the upper limits of scaling for in-context learning. Code is available at https://***/Phevos75/DILUIE. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
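For readers unfamiliar with the prompting setup the abstract refers to, the following is a minimal sketch of assembling an in-context learning prompt from diverse demonstrations drawn from different IE tasks. It is not the authors' implementation; the task names, demonstration texts, and linearized output format below are illustrative assumptions only.

```python
# Minimal sketch (assumed format, not DILUIE's actual code) of building an
# in-context learning prompt that mixes demonstrations from several IE tasks.
from dataclasses import dataclass
from typing import List


@dataclass
class Demonstration:
    task: str         # e.g. "NER", "relation extraction" (illustrative labels)
    text: str         # input sentence
    extraction: str   # linearized gold output (assumed format)


def build_prompt(instruction: str, demos: List[Demonstration], query: str) -> str:
    """Concatenate an instruction, diverse task demonstrations, and the query
    input into a single prompt string for an LLM."""
    parts = [instruction.strip(), ""]
    for d in demos:
        parts.append(f"[Task: {d.task}]")
        parts.append(f"Input: {d.text}")
        parts.append(f"Output: {d.extraction}")
        parts.append("")
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


if __name__ == "__main__":
    demos = [
        Demonstration("NER", "Barack Obama visited Paris.",
                      "(Barack Obama, PERSON); (Paris, LOCATION)"),
        Demonstration("relation extraction", "Marie Curie was born in Warsaw.",
                      "(Marie Curie, born_in, Warsaw)"),
    ]
    prompt = build_prompt(
        "Extract the structured information requested by each task.",
        demos,
        "Alan Turing worked at Bletchley Park.",
    )
    print(prompt)
```

The sketch only illustrates the general idea of mixing demonstrations across IE tasks in one prompt; the paper's actual demonstration selection, EVA attention mechanism, and incremental encoding are described in the full text.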
