Search Results - Inner Mongolia University Library

Discrete-Time Stable Generalized Self-learning Optimal Control With Approximation Errors

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, Vol. 29, No. 4, pp. 1226-1238

Authors: Wei, Qinglai; Li, Benkai; Song, Ruizhuo. Affiliations: Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China; Univ Chinese Acad Sci Beijing 100049 Peoples R China; Univ Sci & Technol Beijing Sch Automat & Elect Engn Beijing 100083 Peoples R China

In this paper, a generalized policy iteration (GPI) algorithm with approximation errors is developed for solving infinite horizon optimal control problems for nonlinear systems. The developed stable GPI algorithm provides a general structure of discrete-time iterative adaptive dynamic programming algorithms, by which most of the discrete-time reinforcement learning algorithms can be described using the GPI structure. This is the first time that approximation errors are explicitly considered in the GPI algorithm. The properties of the stable GPI algorithm with approximation errors are analyzed. The admissibility of the approximate iterative control law can be guaranteed if the approximation errors satisfy the admissibility criteria. The convergence of the developed algorithm is established, which shows that the iterative value function converges to a finite neighborhood of the optimal performance index function if the approximation errors satisfy the convergence criterion. Finally, numerical examples and comparisons are presented.

Keywords: adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; generalized policy iteration (GPI); neural networks; neurodynamic programming; nonlinear systems; optimal control; reinforcement learning

Adaptive, Optimal, Virtual Synchronous Generator Control of Three-Phase Grid-Connected Inverters Under Different Grid Conditions-An Adaptive Dynamic Programming Approach

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, Vol. 18, No. 11, pp. 7388-7399

Authors: Wang, Zhongyang; Yu, Yunjun; Gao, Weinan; Davari, Masoud; Deng, Chao. Affiliations: Fuzhou Inst Technol Sch Appl Sci & Engn Fuzhou 350506 Peoples R China; Nanchang Univ Dept Automat Informat Engn Nanchang 330031 Jiangxi Peoples R China; Florida Inst Technol Florida Tech Coll Engn & Sci Dept Mech & Civil Engn Melbourne FL 32901 USA; Georgia Southern Univ Dept Elect & Comp Engn Statesboro Campus Statesboro GA 30460 USA; Nanjing Univ Posts & Telecommun Inst Adv Technol Nanjing 210023 Peoples R China

This article proposes an adaptive, optimal, data-driven control approach based on reinforcement learning and adaptive dynamic programming to the three-phase grid-connected inverter employed in virtual synchronous generators (VSGs). This article takes into account unknown system dynamics and different grid conditions, including balanced/unbalanced grids, voltage drop/sag, and weak grids. The proposed method is based on value iteration, which does not rely on an initial admissible control policy for learning. Considering the premise that the VSG control should stabilize the closed-loop dynamics, the VSG outputs are optimally regulated through the adaptive, optimal control strategy proposed in this article. Comparative simulations and experimental results validate the proposed method's effectiveness and reveal its practicality and implementation.

Keywords: Voltage control; Power system stability; Synchronous generators; Inverters; Damping; Reinforcement learning; Optimal control; adaptive dynamic programming (ADP); adaptive optimal control; value iteration; virtual synchronous generator (VSG)

Integrating Sporadic Imitation in Reinforcement Learning Robots

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

Authors: Richert, Willi; Scheller, Ulrich; Koch, Markus; Kleinjohann, Bernd; Stern, Claudius. Affiliation: Univ Gesamthsch Paderborn Fac Comp Sci Elect Engn & Math D-33102 Paderborn Germany

ISBN: (Print) 9781424427611

Although the combination of reinforcement learning and imitation has already been considered in recent research, it has always revolved around fixed settings where the demonstrator and imitator are fixed and the imitation process is a well-defined period of time. What is missing is the investigation of approaches that also work in scenarios where imitation is only sporadically possible. This means that in a multi-robot scenario a robot is not allowed to interrupt another robot by asking it to repeat certain actions, but can only observe and integrate information bits delivered occasionally. In this paper we present how that can be done in a continuous and noisy environment within an SMDP context.

Keywords: reinforcement learning

Neural-Network-Based Reinforcement Learning Controller for Nonlinear Systems with Non-symmetric Dead-zone Inputs

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

Authors: Zhang, Xin; Zhang, Huaguang; Liu, Derong; Kim, Yongsu. Affiliations: Northeastern Univ Sch Informat Sci & Engn Shenyang 110004 Liaoning Peoples R China; Univ Illinois Dept Elect & Comp Engn Chicago IL 60607 USA

ISBN: (Print) 9781424427611

A novel adaptive-critic-based NN controller using reinforcement learning is developed for a class of nonlinear systems with non-symmetric dead-zone inputs. The adaptive critic NN controller uses two NNs: the critic NN is used to approximate the strategic utility function, and the output of the action NN is used to approximate the unknown nonlinear function and to minimize the strategic utility function. The tuning of the NNs is performed online without an explicit offline learning phase. The uniformly ultimate boundedness of the closed-loop tracking error is derived by using the Lyapunov method. Finally, a numerical example is included to show the effectiveness of the theoretical results.

Keywords: Nonlinear systems

Structure search of probabilistic models and data correction for EDA-RL

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

Author: Handa, Hisashi. Affiliation: Graduate School of Natural Science and Technology Okayama University Tsushima-naka 3-1-1 Okayama 700-8530 Japan

ISBN: (Print) 9781424498888

We have proposed a novel Estimation of Distribution Algorithm for solving reinforcement learning problems: EDA-RL. The EDA-RL can perform well if the complexity of the structure of the probabilistic model is adapted to the difficulty of the given problems. Therefore, this paper proposes a structure search method for the probabilistic model in the EDA-RL that, as in conventional EDAs, takes multivariate dependencies into account. Moreover, a data correction method that eliminates loops of state transitions is also proposed. Computational simulations on maze problems, which have several perceptual aliasing states, show the effectiveness of the proposed method. © 2011 IEEE.

Keywords: reinforcement learning

Neural-Network-Based Robust Control Schemes for Nonlinear Multiplayer Systems With Uncertainties via Adaptive Dynamic Programming

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2019, Vol. 49, No. 3, pp. 579-588

Authors: Jiang, He; Zhang, Huaguang; Luo, Yanhong; Han, Ji. Affiliation: Northeastern Univ Coll Informat Sci & Engn Shenyang 110819 Liaoning Peoples R China

This paper investigates the robust control issues of nonlinear multiplayer systems by utilizing adaptive dynamic programming (ADP) methods and fills a gap in the ADP field, where actuator uncertainties for multiplayer systems are still not addressed. Two types of actuator uncertainties including bounded nonlinear perturbation and unknown constant actuator fault are taken into consideration. First, a data-driven reinforcement learning (RL) approach is derived to learn the optimal solutions of multiplayer nonzero-sum games. Then, based on the obtained optimal control policies, two robust control schemes are developed to handle these two different types of uncertainties, respectively, and the associated stability analysis is also provided. To implement the proposed iterative RL approach, a single neural network (NN) architecture with least-square-based updating law is given, which reduces the computation burden compared with the traditional dual NN architecture. Finally, two numerical examples are shown to test the feasibility of our proposed schemes.

Keywords: adaptive dynamic programming (ADP); approximate dynamic programming; neural network (NN); reinforcement learning (RL)

A novel approach for constructing basis functions in approximate dynamic programming for feedback control

2013 4th IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2013

Authors: Wang, Jian; Huang, Zhenhua; Xu, Xin. Affiliations: College of Mechatronics and Automation National University of Defense Tech Changsha 410073 China; Xi'An Air Force Military Representative Office Xi'an China

ISBN: (Print) 9781467359252

This paper presents a novel approach for constructing basis functions in approximate dynamic programming (ADP) through the locally linear embedding (LLE) process. It considers the experience (sample) data as a high-dimensional space and the basis functions to be solved as a low-dimensional space. By mapping the high-dimensional data into a single global coordinate system of lower dimensionality, the solved basis functions in the low-dimensional space have the property that nearby experience data in the high-dimensional space remain nearby and similarly co-located with respect to one another in the low-dimensional space. Thus, the obtained basis functions can precisely approximate the real value/action-value function. The simulation results show that the basis functions obtained by LLE can represent the final policy with a higher precision. © 2013 IEEE.

Keywords: dynamic programming

Theoretical Analysis of a Reinforcement Learning based Switching Scheme

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Author: Heydari, Ali. Affiliation: South Dakota Sch Mines & Technol Dept Mech Engn Rapid City SD 57701 USA

ISBN: (Print) 9781479945528

A reinforcement learning based scheme for optimal switching with an infinite-horizon cost function is briefly proposed in this paper. Several theoretical questions are shown to arise regarding its convergence, the optimality of the result, and the continuity of the limit function, which is to be uniformly approximated using parametric function approximators. The main contribution of the paper is providing rigorous answers to these questions, in the form of sufficient conditions for convergence, optimality, and continuity.

Keywords: function approximation; learning (artificial intelligence); infinite-horizon cost function; optimal switching; parametric function approximators; reinforcement learning based switching scheme; Approximation methods; Artificial neural networks; Convergence; Cost function; Optimal control; Schedules; Switches; Theoretical analysis

Neural-Network-Based Adaptive Dynamic Surface Control for MIMO Systems with Unknown Hysteresis

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Liu, Lei; Wang, Zhanshan; Shen, Zhengwei. Affiliation: Northeastern Univ Coll Informat Sci & Engn Shenyang Liaoning Peoples R China

ISBN: (Print) 9781479945528

This paper focuses on the composite adaptive tracking control for a class of nonlinear multiple-input-multiple-output (MIMO) systems with unknown backlash-like hysteresis nonlinearities. A dynamic surface control method is incorporated into the proposed control strategy to eliminate the problem of explosion of complexity. Compared with some existing methods, the prediction error between system state and serial-parallel estimation model is combined with compensated tracking error to construct the adaptive laws for neural network (NN) weights. It is shown that the proposed control approach can guarantee that all the signals of the resulting closed-loop systems are semi-globally uniformly ultimately bounded and the tracking error converges to a small neighborhood. Finally, simulation results are provided to confirm the effectiveness of the proposed approaches.

Keywords: dynamic surface control; prediction error; backlash-like hysteresis; adaptive neural network control

Convergent Reinforcement Learning Control with Neural Networks and Continuous Action Search

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Lee, Minwoo; Anderson, Charles W. Affiliation: Colorado State Univ Dept Comp Sci Ft Collins CO 80523 USA

ISBN: (Print) 9781479945528

We combine a convergent TD-learning method and direct continuous action search with neural networks for function approximation to obtain both stability and generalization over inexperienced state-action pairs. We extend linear Greedy-GQ to nonlinear neural networks for convergent learning. Direct continuous action search with back-propagation leads to efficient high-precision control. A high-dimensional continuous state and action problem, octopus arm control, is examined to test the proposed algorithm. Comparing TD, linear Greedy-GQ, and nonlinear Greedy-GQ, we discuss how the correction term contributes to learning with the nonlinear Greedy-GQ algorithm and how continuous action search contributes to learning speed and stability.

Keywords: reinforcement learning
