Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning

Authors: Wang, Jinyuan; Li, Junlong; Zhao, Hai

Author Affiliations: SJTU-Paris Elite Institute of Technology, Shanghai Jiao Tong University, China; Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, China

Publication: arXiv

Year: 2023

Subject: Benchmarking

Abstract: In open-domain question answering (ODQA), most existing questions require only single-hop reasoning over commonsense knowledge. To further extend this task, we formally introduce open-domain multi-hop reasoning (ODMR), which answers multi-hop questions with explicit reasoning steps in an open-domain setting. Recently, large language models (LLMs) have proven highly useful for ODQA without an external corpus. Furthermore, chain-of-thought (CoT) prompting further boosts the reasoning capability of LLMs, using either manual or automated paradigms. However, existing automated methods lack quality assurance, while manual approaches suffer from limited scalability and poor diversity, hindering the capabilities of LLMs. In this paper, we propose Self-prompted Chain-of-Thought (SP-CoT), an automated framework to mass-produce high-quality CoTs of LLMs, by LLMs and for LLMs. SP-CoT introduces an automated generation pipeline for high-quality ODMR datasets, an adaptive sampler for in-context CoT selection, and self-prompted inference via in-context learning. Extensive experiments on four multi-hop question-answering benchmarks show that SP-CoT not only significantly surpasses previous SOTA methods on large-scale (175B) LLMs, but also nearly doubles the zero-shot performance of small-scale (13B) LLMs. Further analysis reveals the remarkable capability of SP-CoT to elicit direct and concise intermediate reasoning steps, recalling ∼50% of intermediate answers on the MuSiQue-Ans dataset. Copyright © 2023, The Authors. All rights reserved.
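
Note: the abstract describes SP-CoT only at a high level. The Python sketch below illustrates, in generic terms, how in-context CoT inference of this kind can be assembled: a handful of CoT demonstrations are selected for a new multi-hop question and concatenated into a prompt for the LLM to continue. The demonstration format, the word-overlap selection heuristic standing in for the paper's adaptive sampler, and all example questions are illustrative assumptions, not the authors' actual pipeline.

# Minimal sketch of in-context CoT prompt assembly for open-domain
# multi-hop reasoning. All details here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CoTDemo:
    # One demonstration: a multi-hop question, its intermediate
    # reasoning steps, and the final answer.
    question: str
    reasoning_steps: list
    final_answer: str

def select_demos(demos, query, k=2):
    # Crude stand-in for the adaptive sampler: rank demonstrations by
    # word overlap between their question and the new query, keep top k.
    query_words = set(query.lower().split())
    scored = sorted(
        demos,
        key=lambda d: len(query_words & set(d.question.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(demos, query):
    # Concatenate the selected CoT demonstrations and the new question
    # into a single prompt for in-context learning.
    blocks = []
    for d in demos:
        steps = "\n".join("- " + s for s in d.reasoning_steps)
        blocks.append(
            "Question: " + d.question + "\nReasoning:\n" + steps
            + "\nAnswer: " + d.final_answer
        )
    blocks.append("Question: " + query + "\nReasoning:")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    demos = [
        CoTDemo(
            question="Who directed the highest-grossing film of 1997?",
            reasoning_steps=[
                "The highest-grossing film of 1997 was Titanic.",
                "Titanic was directed by James Cameron.",
            ],
            final_answer="James Cameron",
        ),
        CoTDemo(
            question="In which country was the author of Frankenstein born?",
            reasoning_steps=[
                "Frankenstein was written by Mary Shelley.",
                "Mary Shelley was born in England.",
            ],
            final_answer="England",
        ),
    ]
    query = "Who directed the film adaptation of the novel written by Mary Shelley?"
    prompt = build_prompt(select_demos(demos, query, k=2), query)
    print(prompt)  # the assembled prompt would be sent to an LLM, which continues the reasoning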
