ISBN (print): 9781509054626
In this paper, a novel discrete-time iterative zero-sum adaptive dynamic programming (ADP) algorithm is developed for solving the optimal control problems of nonlinear systems. Two iteration processes, the lower and upper iterations, are employed to solve the lower and upper value functions, respectively. Arbitrary positive semi-definite functions are acceptable for initializing the upper and lower iterations of the iterative zero-sum ADP algorithm. It is proven that the upper and lower value functions converge to the optimal performance index function if the optimal performance index function exists, and no existence criterion for the optimal performance index function is required. Simulation examples are given to illustrate the effectiveness of the present method.
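As an illustration of the two-iteration idea described above, the following minimal Python sketch runs upper and lower value iterations on a toy discretized scalar system. The dynamics f, utility U, and all grids are hypothetical stand-ins, not the paper's example.

```python
# Illustrative sketch of upper/lower value iteration for a zero-sum problem
# on a discretized scalar system; dynamics, cost, and grids are assumptions.
import numpy as np

xs = np.linspace(-1.0, 1.0, 41)          # state grid
us = np.linspace(-0.5, 0.5, 11)          # control (minimizer) grid
ws = np.linspace(-0.2, 0.2, 5)           # disturbance (maximizer) grid

def f(x, u, w):                          # assumed nonlinear dynamics
    return 0.8 * np.sin(x) + u + w

def U(x, u, w):                          # stage utility of the zero-sum game
    return x**2 + u**2 - 5.0 * w**2

def interp(V, x):                        # value lookup with state clipping
    return np.interp(np.clip(x, xs[0], xs[-1]), xs, V)

V_up = np.zeros_like(xs)                 # arbitrary positive semi-definite
V_lo = np.zeros_like(xs)                 # initializations are allowed
for _ in range(200):
    V_up_new, V_lo_new = np.empty_like(xs), np.empty_like(xs)
    for i, x in enumerate(xs):
        Q = np.array([[U(x, u, w) + interp(V_up, f(x, u, w)) for w in ws] for u in us])
        V_up_new[i] = Q.max(axis=1).min()        # upper value: min_u max_w
        Q = np.array([[U(x, u, w) + interp(V_lo, f(x, u, w)) for w in ws] for u in us])
        V_lo_new[i] = Q.min(axis=0).max()        # lower value: max_w min_u
    V_up, V_lo = V_up_new, V_lo_new
# When the optimal performance index exists, the two iterates approach it.
```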
ISBN (print): 9781509046584
A neural-network-based adaptive critic control method is established for continuous-time input-affine uncertain nonlinear systems to achieve disturbance attenuation. The present problem can be formulated as a two-player zero-sum differential game, and the adaptive critic mechanism is employed to solve the minimax optimization problem. A neural network identifier is developed to reconstruct the unknown dynamical system. The optimal control law and the worst-case disturbance law are designed by introducing and training a critic neural network. The effectiveness of the present self-learning control method is also illustrated by a simulation experiment.
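The adaptive critic mechanism can be sketched as follows: a critic with a small basis is trained on the Hamiltonian residual of the zero-sum game, and the control and worst-case disturbance laws are read off its gradient. This is a simplified scalar illustration with assumed dynamics, basis functions, and gains, not the paper's identifier/critic design.

```python
# Minimal adaptive-critic sketch for a scalar zero-sum differential game;
# all dynamics, gains, and basis functions below are assumptions.
import numpy as np

gamma = 5.0                                  # disturbance attenuation level
def fx(x): return -x + 0.5 * np.sin(x)       # assumed drift dynamics
def gx(x): return 1.0                        # control input gain
def kx(x): return 0.5                        # disturbance input gain

phi  = lambda x: np.array([x**2, x**4])      # critic basis functions
dphi = lambda x: np.array([2*x, 4*x**3])     # their gradients

W = np.zeros(2)                              # critic weights
x, dt, lr = 1.0, 0.01, 0.5
for _ in range(20000):
    dV = dphi(x) @ W
    u = -0.5 * gx(x) * dV                    # approx. optimal control law
    w = (0.5 / gamma**2) * kx(x) * dV        # approx. worst-case disturbance
    xdot = fx(x) + gx(x) * u + kx(x) * w
    # Hamiltonian residual used as the critic training error
    e = x**2 + u**2 - gamma**2 * w**2 + dV * xdot
    # Approximate gradient step on e^2 (dependence of u, w on W neglected)
    W -= lr * dt * e * dphi(x) * xdot
    x = float(np.clip(x + dt * xdot, -2, 2))  # simulate the closed loop
```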
ISBN (print): 9783319590813; 9783319590806
Adaptive dynamic programming (ADP) is currently an active research topic. This paper concerns a new local policy iteration ADP algorithm, designed for discrete-time nonlinear systems to solve infinite-horizon optimal control problems. The characteristic of the new local policy iteration ADP algorithm is that it updates the iterative control law and value function within one subset of the state space. The detailed iteration process of the local policy iteration is then presented. A simulation example is given to show the good performance of the newly developed algorithm.
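A minimal sketch of the locality idea: each iteration improves the control law and updates the value function only on one subset of a gridded state space. The dynamics, cost, and subset schedule below are illustrative assumptions, not the paper's algorithm.

```python
# Local-update sketch: per iteration, only one block of the state grid
# gets a policy improvement and value update; all numbers are assumptions.
import numpy as np

xs = np.linspace(-1, 1, 40)                  # state grid
us = np.linspace(-1, 1, 21)                  # control grid
f = lambda x, u: 0.9 * x + 0.1 * u           # assumed discrete-time dynamics
U = lambda x, u: x**2 + u**2                 # stage cost

V = np.zeros_like(xs)                        # value function table
pi = np.zeros_like(xs)                       # control law table

blocks = np.array_split(np.arange(xs.size), 4)   # partition of the state space
for it in range(100):
    idx = blocks[it % len(blocks)]           # the subset updated this iteration
    for i in idx:
        x = xs[i]
        q = U(x, us) + np.interp(np.clip(f(x, us), -1, 1), xs, V)
        j = int(np.argmin(q))                # local policy improvement
        pi[i], V[i] = us[j], q[j]            # local value update
```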
ISBN (print): 9781538611074
With the development of marine science, aeronautics and astronautics, energy, the chemical industry, biomedicine, and management science, many complex systems face problems of optimization and control. Approximate dynamic programming overcomes the curse of dimensionality of dynamic programming and is a kind of approximate optimization method that has emerged in recent years. Based on an analysis of the optimization system, this paper proposes a nonlinear, multi-input multi-output, online-learning, data-driven approximate dynamic programming structure and its learning algorithm. The method is achieved through the following three aspects: 1) the critic function of the multi-dimensional-input critic module is approximated with a data-driven k-nearest-neighbor method; 2) the multi-output policy iteration of the actor module is computed with exponential convergence performance; 3) the critic and actor modules are learned synchronously to achieve online optimization and control. The optimal control of the longitudinal motion of a thermal underwater glider is used to show the effect of the proposed method. This work lays a foundation for the theory and application of nonlinear data-driven multi-input multi-output approximate dynamic programming, which is needed in the optimization, control, and artificial intelligence problems of many scientific and engineering fields, such as energy conservation, emission reduction, decision support, and operational management.
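Item 1) above, the data-driven k-nearest-neighbor critic, might look roughly like the following sketch. The KNNCritic class, its inverse-distance weighting, and the toy cost-to-go data are assumptions for illustration only.

```python
# Hypothetical k-NN critic: the value at a query state is estimated by
# inverse-distance-weighted regression over stored (state, cost-to-go) samples.
import numpy as np

class KNNCritic:
    def __init__(self, k=5):
        self.k, self.X, self.J = k, [], []   # samples and their cost-to-go

    def add(self, x, j):                     # store one observed sample
        self.X.append(np.asarray(x, float))
        self.J.append(float(j))

    def value(self, x):                      # k-NN estimate of the critic
        X = np.stack(self.X)
        d = np.linalg.norm(X - np.asarray(x, float), axis=1)
        nn = np.argsort(d)[:self.k]
        w = 1.0 / (d[nn] + 1e-8)             # inverse-distance weighting
        return float(np.dot(w, np.asarray(self.J)[nn]) / w.sum())

critic = KNNCritic(k=3)
rng = np.random.default_rng(0)
for _ in range(200):                         # multi-dimensional input samples
    x = rng.uniform(-1, 1, size=2)
    critic.add(x, x @ x)                     # toy cost-to-go: ||x||^2
print(critic.value([0.3, -0.4]))             # close to 0.25
```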
ISBN (print): 9781509054626
ADP is an effective optimization method. However, its optimality depends on the network structure and training algorithm. After a detailed analysis of ADP, this paper adopts RBF neural networks to realize the critic and action networks. The LSM (least-squares method) is introduced as the training algorithm, and a novel basis function is defined, which achieves global optimization and online control. The validity is verified by finding the global optimal point among local minima.
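A minimal sketch of the combination described above, fitting an RBF critic by least squares. The Gaussian basis, centers, and quadratic target values are illustrative assumptions, not the paper's novel basis function.

```python
# Least-squares fit of an RBF network to toy value-function targets;
# basis choice and data are assumptions for illustration.
import numpy as np

centers = np.linspace(-1, 1, 9)              # RBF centers on the state range
sigma = 0.25                                 # common RBF width

def Phi(x):                                  # Gaussian radial basis matrix
    x = np.atleast_1d(x)
    return np.exp(-((x[:, None] - centers[None, :])**2) / (2 * sigma**2))

xs = np.linspace(-1, 1, 200)
targets = xs**2                              # toy critic targets
W, *_ = np.linalg.lstsq(Phi(xs), targets, rcond=None)   # one-shot LS training

V = lambda x: Phi(x) @ W                     # trained critic output
print(float(V(0.5)))                         # close to 0.25
```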
In this paper, the problem of active fault diagnosis for jump Markov nonlinear systems with non-Gaussian noises is considered. The imperfect-state-information formulation is transformed, using sufficient statistics, into a dynamic optimization problem that can be solved by approximate dynamic programming. The sufficient statistics are produced using the Bayesian recursive relations and a particle filter algorithm. A special structure of the approximate Bellman function is chosen to reduce the complexity caused by the high dimension of the statistics obtained from the particle filter. The proposed active fault detector design is compared with an extended-Kalman-filter-based design in a simulation example. (C) 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
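The sufficient statistic produced by the particle filter can be illustrated with a bootstrap-filter sketch like the one below, where the belief over the Markov mode is the statistic handed to the optimization stage. The jump dynamics, noise laws, and measurement sequence are toy assumptions, not the paper's model.

```python
# Bootstrap particle filter over a jump Markov system with non-Gaussian
# (Laplace) process noise; all model ingredients are assumptions.
import numpy as np

rng = np.random.default_rng(1)
P = np.array([[0.95, 0.05], [0.10, 0.90]])   # mode transition matrix
def f(x, m, u):                              # mode-dependent dynamics
    return (0.9 if m == 0 else 0.5) * x + u + rng.laplace(0, 0.05)
def lik(y, x):                               # Gaussian measurement likelihood
    return np.exp(-0.5 * ((y - x) / 0.1)**2)

N = 500
xp = rng.normal(0, 1, N)                     # particle states
mp = rng.integers(0, 2, N)                   # particle modes
for y, u in [(0.2, 0.0), (0.1, -0.1), (0.4, 0.0)]:   # toy measurements/inputs
    mp = np.array([rng.choice(2, p=P[m]) for m in mp])    # propagate modes
    xp = np.array([f(x, m, u) for x, m in zip(xp, mp)])   # propagate states
    w = lik(y, xp)
    w /= w.sum()
    idx = rng.choice(N, N, p=w)              # resample by weight
    xp, mp = xp[idx], mp[idx]
    belief = np.bincount(mp, minlength=2) / N   # statistic fed to the ADP stage
print(belief)                                # posterior mode probabilities
```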
ISBN (print): 9781538627266
This paper is concerned with a novel generalized policy iteration (GPI) algorithm in which approximation errors are explicitly considered. The properties of the stable GPI algorithm with approximation errors are analyzed. The convergence of the developed algorithm is established, showing that the iterative value function converges to a finite neighborhood of the optimal performance index function. Finally, numerical examples and comparisons are presented.
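A rough sketch of a GPI scheme with a bounded evaluation error term might look as follows. The system, cost, sweep count, and error magnitude are all illustrative assumptions, not the paper's setting.

```python
# GPI sketch: a finite number of policy-evaluation sweeps (the GPI knob)
# before each improvement, with a bounded noise term standing in for the
# approximation error; every number here is an assumption.
import numpy as np

xs = np.linspace(-1, 1, 40)
us = np.linspace(-1, 1, 21)
f = lambda x, u: 0.9 * x + 0.1 * u           # assumed dynamics
U = lambda x, u: x**2 + u**2                 # stage cost
interp = lambda V, x: np.interp(np.clip(x, -1, 1), xs, V)

V = np.zeros_like(xs)
pi = np.zeros_like(xs)
eps = 1e-3                                   # modeled approximation-error bound
rng = np.random.default_rng(3)
for _ in range(50):
    for _ in range(3):                       # partial evaluation sweeps
        V = np.array([U(x, p) + interp(V, f(x, p)) for x, p in zip(xs, pi)])
        V += eps * rng.uniform(-1, 1, V.shape)   # inexact evaluation
    for i, x in enumerate(xs):               # policy improvement step
        q = U(x, us) + interp(V, f(x, us))
        pi[i] = us[int(np.argmin(q))]
# With bounded eps, V settles into a finite neighborhood of the optimum.
```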
In this paper, relations between model predictive control and reinforcement learning are studied for discrete-time linear time-invariant systems with state and input constraints and a quadratic value function. The prin...
We assess the potential of the approximate dynamic programming (ADP) approach for process control, especially as a method to complement the model predictive control (MPC) approach. In the artificial intelligence (AI) and operations research (OR) communities, ADP has recently seen significant activity as an effective method for solving Markov decision processes (MDPs), which represent a type of multi-stage decision problem under uncertainty. Process control problems are similar to MDPs, with the key difference being the continuous state and action spaces as opposed to discrete ones. In addition, unlike in other popular ADP application areas like robotics or games, in process control applications the first and foremost concern should be the safety and economics of the ongoing operation rather than efficient learning. We explore different options within the ADP design, such as pre-decision-state vs. post-decision-state value functions, parametric vs. nonparametric value function approximators, batch-mode vs. continuous-mode learning, and exploration vs. robustness. We argue that ADP holds great potential, especially for obtaining effective control policies for stochastic constrained nonlinear or linear systems and continually improving them towards optimality. (C) 2010 Elsevier Ltd. All rights reserved.
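The pre-decision vs. post-decision distinction mentioned above can be illustrated as follows: with a post-decision value function the expectation over the disturbance is absorbed into the value, so the decision step needs no sampling. The dynamics, costs, and quadratic approximators below are toy assumptions.

```python
# Contrast of pre-decision value V(x) and post-decision value V_post(x_u),
# where x_u is the state after the decision but before the random noise;
# all model pieces are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(2)
us = np.linspace(-1, 1, 21)
step = lambda x_u: 0.9 * x_u + rng.normal(0, 0.05)   # exogenous noise only

def act_pre(x, V):
    # Pre-decision: must average over the disturbance inside the argmin.
    q = [x**2 + u**2
         + np.mean([V(step(x + u)) for _ in range(50)]) for u in us]
    return us[int(np.argmin(q))]

def act_post(x, V_post):
    # Post-decision: the expectation is baked into V_post, so the argmin
    # is deterministic and cheap -- the main appeal of this formulation.
    q = [x**2 + u**2 + V_post(x + u) for u in us]
    return us[int(np.argmin(q))]

V_pre = lambda x: x**2 / (1 - 0.81)          # assumed quadratic approximators
V_post = lambda s: s**2 / (1 - 0.81)
print(act_pre(0.5, V_pre), act_post(0.5, V_post))
```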