ISBN (digital): 9783030361891
ISBN (print): 9783030361891; 9783030361884
Object tracking, as a low-level vision task, has long been a hot topic in computer vision. It is well known that challenges such as background clutter, fast object motion, and occlusion degrade the robustness and accuracy of existing object tracking methods. This paper proposes a reinforcement learning model based on the twin delayed deep deterministic policy gradient algorithm (TD3) for single object tracking. The model builds on the Actor-Critic (AC) deep reinforcement learning framework, in which the Actor network predicts a continuous action that moves the target bounding box from its position in the previous frame to the object position in the current frame and adapts it to the object size. The Critic network evaluates the confidence of the new bounding box online to determine whether the Critic model needs to be updated or re-initialized. Furthermore, our model uses the TD3 algorithm to further optimize the AC model: two Critic networks jointly predict the bounding-box confidence, and the smaller of the two predicted values is taken as the label for updating the network parameters. This keeps the Critic network from accumulating excessive estimation bias, accelerates the convergence of the loss function, and yields more accurate predictions. In addition, a small amount of bounded random noise is added to the action produced by the Actor model, and the search area is reasonably expanded during offline learning to improve the robustness of the tracker under strong background interference and fast object motion. The Critic model also guides the Actor model to select the best action and continuously update the state of the tracked object. Comprehensive experimental results on the OTB-2013 and OTB-2015 benchmarks demonstrate that our tracker performs best in precision, robustness, and efficiency when compared with state-of-the-art methods.
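The core of the TD3 update described above is the clipped double-Q target: bounded random noise is added to the target action, both Critic networks score the same next state-action pair, and the smaller estimate serves as the regression label, which suppresses the overestimation bias a single critic would accumulate. Below is a minimal PyTorch sketch of that target computation; the critic architecture, the noise scale `noise_std`, and the clip range `noise_clip` are illustrative assumptions, not the paper's exact networks or hyperparameters.

```python
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Hypothetical critic head: scores a (state, action) pair, where the
    action is a continuous bounding-box move such as (dx, dy, dscale)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def td3_target(critic1_tgt, critic2_tgt, actor_tgt, next_state,
               reward, not_done, gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Clipped double-Q target: add bounded noise to the target action,
    then take the element-wise minimum of the two critic estimates."""
    with torch.no_grad():
        next_action = actor_tgt(next_state)
        noise = (torch.randn_like(next_action) * noise_std
                 ).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-1.0, 1.0)
        q1 = critic1_tgt(next_state, next_action)
        q2 = critic2_tgt(next_state, next_action)
        target_q = reward + not_done * gamma * torch.min(q1, q2)
    return target_q
```

Both critics are then regressed toward the same `target_q`; because the label is the pessimistic minimum, occasional overestimates by one critic do not propagate through bootstrapping.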
Prioritized experience replay mechanisms have achieved remarkable success in accelerating the convergence of reinforcement learning algorithms. However, applying a traditional prioritized experience replay mechanism directly to asynchronous reinforcement learning leads to slow convergence, because it is difficult for an agent to exploit the valuable experiences gathered by other agents interacting with the environment. To address this issue, we propose a Multi-pool Prioritized experience replay-based asynchronous Twin Delayed Deep Deterministic policy gradient algorithm (MP-TD3). Specifically, a multi-pool prioritized experience replay mechanism is proposed to strengthen the exchange of experience among different agents and thereby accelerate network convergence. Then, a global-pool self-cleaning mechanism based on sample diversity and a global-pool self-cleaning mechanism based on TD-errors are designed to counter the high redundancy and low information content, respectively, of samples in the global pool. Finally, a multi-batch sampling mechanism is investigated to further reduce the training time. Extensive experiments validate that the proposed MP-TD3 significantly improves convergence speed and performance compared with state-of-the-art methods.
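To make the multi-pool idea above concrete: each asynchronous agent writes to its own local pool, transitions with large TD-errors are additionally promoted to a shared global pool, and each training batch mixes samples from both so that one agent can reuse another agent's high-value experience. The sketch below is a minimal NumPy illustration under assumed details; the pool capacities, the promotion threshold, and the proportional-priority sampling are illustrative choices, not the exact MP-TD3 design (which also includes the self-cleaning and multi-batch mechanisms).

```python
import numpy as np

class PrioritizedPool:
    """Simple proportional prioritized replay pool (priority ~ |TD-error|)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:  # drop the oldest entry when full
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(abs(td_error) + 1e-6)

    def sample(self, batch_size):
        p = np.asarray(self.priorities)
        idx = np.random.choice(len(self.data), batch_size, p=p / p.sum())
        return [self.data[i] for i in idx]

class MultiPoolReplay:
    """One local pool per agent plus a shared global pool; transitions whose
    TD-error exceeds a threshold are also promoted to the global pool."""
    def __init__(self, n_agents, local_cap=10_000, global_cap=50_000,
                 promote_threshold=1.0):
        self.locals = [PrioritizedPool(local_cap) for _ in range(n_agents)]
        self.global_pool = PrioritizedPool(global_cap)
        self.promote_threshold = promote_threshold

    def add(self, agent_id, transition, td_error):
        self.locals[agent_id].add(transition, td_error)
        if abs(td_error) > self.promote_threshold:
            self.global_pool.add(transition, td_error)

    def sample(self, agent_id, batch_size, global_ratio=0.5):
        # Mix the agent's own experience with globally shared experience.
        n_global = int(batch_size * global_ratio) if self.global_pool.data else 0
        batch = self.global_pool.sample(n_global) if n_global else []
        batch += self.locals[agent_id].sample(batch_size - n_global)
        return batch
```

The `global_ratio` mixing parameter (an assumption here) controls how strongly agents lean on shared experience versus their own trajectories, which is the interaction the paper identifies as missing when a traditional single-pool scheme is used asynchronously.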