Model-based reinforcement learning (MBRL) methods hold great promise for achieving excellent sample efficiency by fitting a dynamics model to previously observed data and leveraging it for RL or planning. However, the resulting trajectories may diverge from real-world trajectories due to the accumulation of errors in multi-step model sampling, particularly over longer horizons. This undermines the performance of MBRL and significantly affects sample efficiency. Therefore, we present a trajectory alignment technique capable of aligning simulated trajectories with their real counterparts from any initial random state and with adaptive length, enabling the preparation of paired real-simulated samples to minimize compounding errors. Additionally, we design a Q-function to estimate Q-values for the paired real-simulated samples. Simulated samples whose Q-value difference from their real counterparts surpasses a given threshold are discarded, thus preventing the model from overfitting to erroneous samples. Experimental results demonstrate that both trajectory alignment and Q-function guided sample filtration contribute to improving the policy and sample efficiency. Our method surpasses previous state-of-the-art model-based approaches in both sample efficiency and asymptotic performance across a series of challenging control tasks. The code is open source and available at https://***/duxin0618/***.
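To make the sample-filtration step concrete, below is a minimal sketch of how paired real-simulated transitions could be screened by their Q-value difference, as described in the abstract. The function name `filter_by_q_difference`, the batch layout, and the `q_fn` and `threshold` arguments are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np


def filter_by_q_difference(real_batch, sim_batch, q_fn, threshold):
    """Keep only simulated transitions whose Q-value stays within `threshold`
    of the Q-value of the paired real transition.

    real_batch / sim_batch: dicts with "states" and "actions" arrays,
    paired index-by-index by the trajectory alignment step (assumed layout).
    q_fn: callable (states, actions) -> per-sample Q-value estimates.
    """
    real_q = q_fn(real_batch["states"], real_batch["actions"])
    sim_q = q_fn(sim_batch["states"], sim_batch["actions"])

    # Discard simulated samples whose Q-value deviates too much from the
    # real counterpart, so the model is not trained on erroneous rollouts.
    keep = np.abs(real_q - sim_q) <= threshold
    return {key: value[keep] for key, value in sim_batch.items()}
```

In this sketch, the retained simulated samples would then be added to the model-training or policy-optimization buffer, while the rejected ones are simply dropped; the choice of threshold trades off data quantity against the risk of compounding model error.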