
UAV Path Planning Based on the Average TD3 Algorithm With Prioritized Experience Replay

Authors: Luo, Xuqiong; Wang, Qiyuan; Gong, Hongfang; Tang, Chao

Affiliations: Changsha Univ Sci & Technol, Sch Math & Stat, Changsha 410114, Peoples R China; Changsha Univ Sci & Technol, Sch Comp & Commun Engn, Changsha, Peoples R China

Published in: IEEE Access

Year/Volume: 2024, Vol. 12

Pages: 38017-38029


Funding: Excellent Youth Project of the Education Department of Hunan Province

Keywords: Autonomous aerial vehicles; Path planning; Heuristic algorithms; Training; Approximation algorithms; Machine learning algorithms; Stability analysis; Deep reinforcement learning; Vehicle dynamics; Performance evaluation; UAV path planning; deep reinforcement learning; prioritized experience replay; average TD3 algorithm

Abstract: Path planning is one of the important components of an Unmanned Aerial Vehicle (UAV) mission, and it is a key guarantee for the mission's successful completion. Traditional path planning algorithms have certain limitations and deficiencies in complex dynamic environments. Aiming at dynamic, complex obstacle environments, this paper proposes an improved TD3 algorithm that enables the UAV to complete autonomous path planning through online learning and continuous trial and error. The algorithm replaces the experience pool of TD3 with prioritized experience replay, so that the agent can distinguish the importance of experience samples, improving sampling efficiency and reducing training time. An average TD3 is proposed: the average of Q1 and Q2 is taken when the target value is updated, which mitigates overestimation of the Q value while avoiding underestimation, giving the improved algorithm better stability and allowing it to adapt to various complex obstacle environments. A new reward function is designed so that each UAV action step receives reward feedback, which addresses the sparse-reward problem in deep reinforcement learning. Experimental results show that this method can train the UAV to reach the target safely and quickly in a multi-obstacle environment. Compared with DDPG, SAC, and traditional TD3, the proposed algorithm achieves a higher path planning success rate and a lower collision rate than all three comparison algorithms, demonstrating better path planning performance.
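The two modifications summarized in the abstract, the averaged twin-critic target and proportional prioritized sampling, can be sketched roughly as follows. This is a minimal NumPy illustration under assumed conventions, not the authors' implementation; all function names, hyperparameters, and toy values here are hypothetical.

```python
import numpy as np

def td3_target(rewards, q1_next, q2_next, dones, gamma=0.99):
    """Standard TD3 target: pessimistic min over the twin target critics."""
    return rewards + gamma * (1.0 - dones) * np.minimum(q1_next, q2_next)

def average_td3_target(rewards, q1_next, q2_next, dones, gamma=0.99):
    """Averaged target (as the abstract describes): mean of Q1 and Q2,
    aiming to curb overestimation without the min's underestimation bias."""
    return rewards + gamma * (1.0 - dones) * 0.5 * (q1_next + q2_next)

def per_sample(priorities, batch_size, alpha=0.6, rng=None):
    """Proportional prioritized experience replay: P(i) ∝ priority_i ** alpha."""
    if rng is None:
        rng = np.random.default_rng()
    scaled = np.asarray(priorities, dtype=float) ** alpha
    probs = scaled / scaled.sum()
    return rng.choice(len(probs), size=batch_size, p=probs)

# Toy batch: two non-terminal transitions where the twin critics disagree.
r = np.array([1.0, 0.5])
q1 = np.array([10.0, 4.0])
q2 = np.array([8.0, 6.0])
d = np.zeros(2)

print(td3_target(r, q1, q2, d))          # min-based: [8.92 4.46]
print(average_td3_target(r, q1, q2, d))  # averaged:  [9.91 5.45]
```

The averaged target is strictly no smaller than the min-based one, which is the trade-off the paper describes: less systematic underestimation at the cost of a less pessimistic value estimate.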
