A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems

Authors: Bhasin, S.; Kamalapurkar, R.; Johnson, M.; Vamvoudakis, K. G.; Lewis, F. L.; Dixon, W. E.

Author affiliations: Indian Inst Technol, Dept Elect Engn, Delhi, India; Univ Florida, Dept Mech & Aerosp Engn, Gainesville, FL, USA; Univ Calif Santa Barbara, Ctr Control Dynam Syst & Computat CCDC, Santa Barbara, CA 93106, USA; Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth, TX 76118, USA

Publication: AUTOMATICA

Year, volume, issue: 2013, Vol. 49, No. 1

Pages: 82-92

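Subject classification: 0711 [Science - Systems Science]; 0808 [Engineering - Electrical Engineering]; 07 [Science]; 08 [Engineering]; 070105 [Science - Operations Research and Cybernetics]; 081101 [Engineering - Control Theory and Control Engineering]; 0811 [Engineering - Control Science and Engineering]; 0701 [Science - Mathematics]; 071101 [Science - Systems Theory]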

Funding: NSF [0547448, 0901491]; Department of Energy, DOE University Research Program in Robotics (URPR) [DE-FG04-86NE37967]; Direct For Computer & Info Scie & Enginr, Div Of Information & Intelligent Systems, Funding Source: National Science Foundation; Directorate For Engineering, Div Of Electrical, Commun & Cyber Sys [1128050, 0901491], Funding Source: National Science Foundation

Subjects: Learning control; Adaptive control; Optimal control; Approximate dynamic programming; Actor-critic-identifier

Abstract: An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor-critic-identifier (ACI) is proposed to approximate the Hamilton-Jacobi-Bellman equation using three neural network (NN) structures: actor and critic NNs approximate the optimal control and the optimal value function, respectively, and a robust dynamic neural network identifier asymptotically approximates the uncertain system dynamics. An advantage of using the ACI architecture is that learning by the actor, critic, and identifier is continuous and simultaneous, without requiring knowledge of system drift dynamics. Convergence of the algorithm is analyzed using Lyapunov-based adaptive control methods. A persistence of excitation condition is required to guarantee exponential convergence to a bounded region in the neighborhood of the optimal control and uniformly ultimately bounded (UUB) stability of the closed-loop system. Simulation results demonstrate the performance of the actor-critic-identifier method for approximate optimal control. © 2012 Elsevier Ltd. All rights reserved.
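
For orientation, the problem class named in the abstract is commonly written as shown here. This is only a sketch of the standard textbook formulation, assuming control-affine dynamics and a cost that is quadratic in the control; the symbols f, g, Q, R, V^* and u^* are notation assumed for illustration and do not appear in this record.

```latex
% Assumed control-affine dynamics; f(x) is the drift term
\dot{x} = f(x) + g(x)\,u

% Infinite-horizon cost, with assumed penalties Q(x) \ge 0 and R > 0
V^*(x_0) = \min_{u(\cdot)} \int_{0}^{\infty}
    \big( Q(x(t)) + u(t)^\top R\, u(t) \big)\, dt

% Hamilton-Jacobi-Bellman equation for the optimal value function V^*
0 = \min_{u} \Big[ Q(x) + u^\top R\, u
    + \nabla V^*(x)^\top \big( f(x) + g(x)\,u \big) \Big]

% Minimizer of the bracket: 2 R u + g(x)^\top \nabla V^*(x) = 0
u^*(x) = -\tfrac{1}{2}\, R^{-1} g(x)^\top \nabla V^*(x)
```

Read against the abstract, the critic NN would approximate V^*, the actor NN would approximate u^*, and the identifier would supply an estimate of the system dynamics, which is how the abstract's claim of not needing the drift dynamics fits in.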
