Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving various optimal control problems. However, the cooperative game problem for discrete-time multiplayer systems with control constraints has rarely been investigated in this field. To address this issue, a novel policy iteration (PI) algorithm is proposed based on the ADP technique, and its convergence analysis is also studied in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function and the constrained and unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is presented to show the effectiveness of the proposed approach. (C) 2019 Elsevier B.V. All rights reserved.
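As a rough illustration of the gradient-descent weight updates mentioned above, the following sketch trains a single-layer critic and a tanh-saturated (constrained) actor on a toy scalar system; the plant, feature functions, learning rates, and discount factor are hypothetical choices, not taken from the paper.

```python
import numpy as np

# Toy discrete-time plant x_{k+1} = f(x) + g(x)u (hypothetical, for illustration only).
def f(x): return 0.9 * x + 0.1 * x**3
def g(x): return 1.0

u_max = 0.5      # control constraint |u| <= u_max
gamma = 0.95     # discount factor

# Single-layer approximators: critic V(x) ~ wc @ phi(x), constrained actor u(x) = u_max*tanh(wa @ psi(x)).
def phi(x): return np.array([x**2, x**4])
def psi(x): return np.array([x, x**3])

wc, wa = np.zeros(2), np.zeros(2)
lr_c, lr_a = 0.05, 0.01
rng = np.random.default_rng(0)

for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    u = u_max * np.tanh(wa @ psi(x))                 # constrained control from the actor
    x_next = f(x) + g(x) * u
    # Critic: gradient descent on the squared Bellman residual e = r + gamma*V(x') - V(x).
    e = x**2 + u**2 + gamma * wc @ phi(x_next) - wc @ phi(x)
    wc -= lr_c * e * (gamma * phi(x_next) - phi(x))
    # Actor: gradient descent on r + gamma*V(x') with respect to the actor weights.
    dVdx_next = wc @ np.array([2 * x_next, 4 * x_next**3])
    dq_du = 2 * u + gamma * dVdx_next * g(x)
    du_dwa = u_max * (1 - np.tanh(wa @ psi(x))**2) * psi(x)
    wa -= lr_a * dq_du * du_dwa
```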
Considering the leader-following consensus problem for nonlinear multi-agent systems with bounded input disturbances under a fixed topology, a novel distributed robust protocol is designed to guarantee that all followers synchronize to the leader by investigating the gain of the Nash solution. The robustness restrictions are given through Lyapunov analysis. To get the Nash solution, critic neural networks are trained based on an adaptive dynamic programming algorithm in an online and forward-in-time manner to solve the coupled Hamilton-Jacobi equations. An additional term is added to the neural network weight tuning law to avoid the requirement for an initial admissible control law.
In this paper, a stable value iteration (SVI) algorithm is developed to solve the discrete-time two-player zero-sum game (TP-ZSG) for nonlinear systems based on adaptive dynamic programming (ADP). In the SVI algorithm, both the optimality and the stability of the nonlinear system are considered, with proofs given. First, an iterative ADP algorithm is presented to obtain the approximate optimal solutions by solving the Hamilton-Jacobi-Isaacs (HJI) equation. Second, a range of the discount factor is derived which guarantees that the HJI equation serves as a Lyapunov equation. Moreover, we prove that if the iteration number reaches a given threshold, then the iterative control inputs make the closed-loop system asymptotically stable. Third, in order to improve the practicability of the developed stability condition, a simple criterion is established based on Lyapunov stability theory. Neural networks (NNs) are used to approximate the system states, the value function, and the control and disturbance inputs. Finally, simulation results are given to illustrate the performance of the developed optimal control method. (C) 2019 Elsevier B.V. All rights reserved.
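For the linear-quadratic special case, the value-iteration recursion behind such a zero-sum formulation can be written down explicitly. The sketch below iterates the quadratic value-function kernel P for a hypothetical plant x_{k+1} = Ax + Bu + Dw with discounted cost x'Qx + u'Ru - beta^2 w'w; it is only a stand-in for the nonlinear, NN-based algorithm of the paper, and all matrices and scalars are illustrative.

```python
import numpy as np

# Hypothetical linear plant and weights; the quadratic value function is V_i(x) = x' P_i x.
A = np.array([[0.98, 0.10], [0.00, 0.90]])
B = np.array([[0.0], [0.1]])
D = np.array([[0.05], [0.0]])
Q, R = np.eye(2), np.array([[1.0]])
beta, gamma = 5.0, 0.98          # attenuation level and discount factor (illustrative values)

P = np.zeros((2, 2))             # value iteration may start from the zero value function
for _ in range(1000):
    # Joint stationarity over (u, w): minimize over the control u, maximize over the disturbance w.
    S = np.block([[R + gamma * B.T @ P @ B, gamma * B.T @ P @ D],
                  [gamma * D.T @ P @ B, -beta**2 * np.eye(1) + gamma * D.T @ P @ D]])
    L = gamma * np.vstack([B.T @ P @ A, D.T @ P @ A])
    P_next = Q + gamma * A.T @ P @ A - L.T @ np.linalg.solve(S, L)
    if np.max(np.abs(P_next - P)) < 1e-12:
        P = P_next
        break
    P = P_next

# Recover the saddle-point gains from the converged kernel.
S = np.block([[R + gamma * B.T @ P @ B, gamma * B.T @ P @ D],
              [gamma * D.T @ P @ B, -beta**2 * np.eye(1) + gamma * D.T @ P @ D]])
L = gamma * np.vstack([B.T @ P @ A, D.T @ P @ A])
K = np.linalg.solve(S, L)        # u = -K[:1] @ x (control), w = -K[1:] @ x (worst-case disturbance)
print(P, K, sep="\n")
```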
This paper studies the connected cruise control problem for a platoon of human-operated and autonomous vehicles. The autonomous vehicles can receive motion data, i.e., headway and velocity information, from other vehicles via wireless vehicle-to-vehicle communication. The use of wireless communication for information exchange between vehicles inevitably introduces input delay into the platooning system. Meanwhile, unpredictable behaviors of the leading vehicle constitute an exogenous disturbance for the system. An adaptive optimal control problem with input delay and disturbance is formulated, and a novel data-driven control solution is proposed such that each vehicle in the platoon can maintain a safe distance and the desired velocity. By combining an adaptive dynamic programming technique with sampled-data system theory, a data-driven adaptive optimal control approach is proposed for the autonomous vehicles using policy-iteration learning strategies, without accurate knowledge of the dynamics of the human drivers and vehicles. The efficacy of the proposed controller is substantiated by rigorous analysis and validated by simulation results in different scenarios.
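The policy-iteration core that such a scheme builds on can be sketched for a simple linear-quadratic case. The model-based version below (Hewer's iteration) uses known, hypothetical car-following error dynamics purely for illustration, whereas the paper's approach learns the same quantities from sampled data without a model.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical discretized car-following error dynamics (spacing error, velocity error).
A = np.array([[0.95, 0.10], [0.00, 0.90]])
B = np.array([[0.0], [0.1]])
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

K = np.zeros((1, 2))          # initial admissible (stabilizing) policy
for _ in range(30):
    Acl = A - B @ K
    # Policy evaluation: P solves the Lyapunov equation P = Acl' P Acl + Q + K' R K.
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Policy improvement.
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.max(np.abs(K_new - K)) < 1e-9:
        K = K_new
        break
    K = K_new
print("converged feedback gain K =", K)
```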
Ascent trajectory tracking in the longitudinal plane is a class of nonaffine, noncascade nonlinear control problems with a single input and multiple outputs, which is difficult to handle directly with nonlinear methods. Adaptive dynamic programming has the advantages of precise control and adaptability for general nonlinear control problems, and the time-varying quadratic adaptive dynamic programming algorithm proposed in this paper improves the convergence and computation speed of the traditional adaptive dynamic programming algorithm. To implement the algorithm effectively, the independent variables of the ascent model are substituted in the launching coordinate frame, and the model is then treated as a discrete nominal trajectory tracking problem. In addition, the heuristic dynamic programming structure is used to train the processed model, so that only the time-varying weight in the designed evaluation network needs to be updated. Simulation shows that the proposed algorithm can update the control variable online after predicting the cost function offline with the dynamic equations, which is faster than the general adaptive dynamic programming algorithm. Moreover, compared with the linear quadratic regulator algorithm, the proposed algorithm can effectively and accurately track the nominal trajectory under parameter uncertainty.
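A minimal picture of the "time-varying quadratic weight" idea is the finite-horizon backward recursion below, which computes quadratic cost-to-go weights P_k along a nominal trajectory so that, online, only the current weight is needed to form the tracking correction. The linearized error dynamics, cost weights, and horizon are hypothetical placeholders rather than the paper's ascent model.

```python
import numpy as np

N = 200                                           # horizon length (illustrative)
Q, R, Qf = np.diag([10.0, 1.0]), np.array([[1.0]]), np.diag([50.0, 5.0])

def A_k(k):   # slowly time-varying linearization along the nominal trajectory (hypothetical)
    return np.array([[1.0, 0.01], [-0.02 * (1 + k / N), 0.98]])

def B_k(k):
    return np.array([[0.0], [0.01 * (1 + 0.5 * k / N)]])

P = [None] * (N + 1)
K = [None] * N
P[N] = Qf
for k in range(N - 1, -1, -1):                    # backward recursion for the time-varying weights
    A, B = A_k(k), B_k(k)
    K[k] = np.linalg.solve(R + B.T @ P[k + 1] @ B, B.T @ P[k + 1] @ A)
    Acl = A - B @ K[k]
    P[k] = Q + K[k].T @ R @ K[k] + Acl.T @ P[k + 1] @ Acl

# Online, the tracking correction at step k would be u_k = u_nominal_k - K[k] @ (x_k - x_nominal_k).
```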
In this paper, we propose a novel noncausal control framework to address the energy maximization problem of wave energy converters (WECs) subject to constraints. The energy maximization problem of WECs is a constrained optimal control problem. The proposed control framework converts this problem into a reference trajectory tracking problem through the Fourier pseudo-spectral method (FPSM) and utilizes the online tracking adaptive dynamic programming (OTADP) algorithm to realize real-time trajectory tracking for practical use in the ocean environment. With the wave prediction technique, the optimal trajectory is generated online through a receding horizon (RH) implementation. A critic neural network (NN) is applied to approximate the optimal cost value function and calculate the error-tracking control by solving the associated Hamilton-Jacobi-Bellman (HJB) equation. The proposed WEC control framework improves computational efficiency and makes online control feasible in practice. Simulation results show the effects of the receding horizon implementation of FPSM with different window lengths and window functions, while verifying the performance of tracking control and energy absorption of WECs in two different sea conditions.
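To give a flavour of the receding-horizon Fourier pseudo-spectral step discussed above, the sketch below fits a windowed, truncated Fourier representation to a synthetic wave signal over a sliding window and evaluates it at the window centre. The signal, window length, window function, and number of retained harmonics are all illustrative assumptions, not the paper's settings.

```python
import numpy as np

dt = 0.1
t = np.arange(0.0, 200.0, dt)
wave = np.sin(0.6 * t) + 0.4 * np.sin(1.1 * t + 0.3)   # synthetic wave excitation (hypothetical)

W, half = 256, 128        # receding window length in samples
n_harm = 12               # number of Fourier harmonics retained by the pseudo-spectral fit
win = np.hanning(W)       # window function applied before the fit

recon = np.zeros_like(wave)
for k in range(half, len(t) - half):
    seg = wave[k - half:k + half]            # past samples plus predicted future samples
    c = np.fft.rfft(seg * win)
    c[n_harm + 1:] = 0.0                     # keep only the leading harmonics
    seg_hat = np.fft.irfft(c, n=W)
    recon[k] = seg_hat[half] / win[half]     # evaluate at the window centre, where the window is ~1

err = np.max(np.abs(recon[half:-half] - wave[half:-half]))
print("max reconstruction error at the window centre:", err)
```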
This paper is concerned with the design of distributed optimal coordination control for nonlinear multi-agent systems (NMASs) based on an event-triggered adaptive dynamic programming (ETADP) method. The method is first introduced to design the distributed coordination controllers for NMASs, which not only avoids the transmission of redundant data compared with the traditional time-triggered adaptive dynamic programming (TTADP) strategy but also minimizes the performance function of each agent. The event-triggered conditions are proposed based on the Lyapunov functional method and are derived by guaranteeing the stability of the NMASs. Then a new adaptive policy iteration algorithm is presented to obtain the online solutions of the Hamilton-Jacobi-Bellman (HJB) equations. In order to implement the proposed ETADP method, fuzzy-hyperbolic-model-based critic neural networks (NNs) are utilized to approximate the value functions and help calculate the control policies. In the critic NNs, the weight estimates are updated at the event-triggered instants, leading to aperiodic weight tuning laws so that the computational cost is reduced. It is proved that the weight estimation errors and the local neighborhood coordination errors are uniformly ultimately bounded (UUB). Finally, two simulation examples are provided to show the effectiveness of the proposed ETADP method. (C) 2019 ISA. Published by Elsevier Ltd. All rights reserved.
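The sketch below isolates the event-triggering mechanism itself on a toy linear system: the transmitted state, the control input, and (in the full method) the critic weights are refreshed only when a measurement-gap threshold is crossed, rather than at every sampling instant. The system, the fixed feedback gain standing in for the learned policy, and the threshold are hypothetical.

```python
import numpy as np

A = np.array([[0.98, 0.05], [0.00, 0.95]])
B = np.array([[0.0], [0.1]])
K = np.array([[0.4, 1.2]])        # stabilizing gain used as a stand-in for the learned policy

x = np.array([1.0, -0.5])
x_event = x.copy()                # state sampled at the last event instant
u = -K @ x_event
events, steps = 0, 400
thresh = 0.05                     # triggering threshold on the measurement gap

for _ in range(steps):
    gap = np.linalg.norm(x - x_event)
    if gap > thresh:              # event: transmit the state, recompute the control (and, in the
        x_event = x.copy()        # full ETADP scheme, update the critic NN weights)
        u = -K @ x_event
        events += 1
    x = A @ x + (B @ u).ravel()   # the control is held constant between events

print(f"{events} events over {steps} steps "
      f"({100 * events / steps:.1f}% of a time-triggered scheme's updates)")
```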
This study explores a new robust consensus control strategy for uncertain multiagent systems and provides an event-based solution to adaptive dynamic programming (ADP)-based optimal control. Rather than the control function, the feedback system, established symmetrically to the physical system, allows the optimal consensus control issue to be handled by the optimal control protocol of an augmented affine system. The feedback system focuses on an auxiliary variable formed in light of the optimality principle and a virtual control input built on a critic neural network (NN). Analysis reveals that the auxiliary variable helps reduce the influence of uncertainty on control performance, while the proposed approach is implemented with fewer communication resources since the critic NN is updated only as events occur. Finally, evidence from simulation findings validates the theoretical results.
This paper is concerned with a class of discrete-time linear nonzero-sum games with a partially observable system state. As is known, the optimal control policy for nonzero-sum games relies on full state measurement, which is hard to fulfil in a partially observable environment. Moreover, to achieve optimal control, one needs to know the accurate system model. To overcome these deficiencies, this paper develops a data-driven adaptive dynamic programming method via Q-learning using measurable input/output data without any system knowledge. First, a representation of the unmeasurable inner system state is built using historical input/output data. Then, based on the representation state, a Q-function-based policy iteration approach with convergence analysis is introduced to approximate the optimal control policy iteratively. A neural network (NN)-based actor-critic framework is applied to implement the developed data-driven approach. Finally, two simulation examples are provided to demonstrate the effectiveness of the developed approach.
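The first step above, building a state representation from past inputs and outputs, can be illustrated numerically: for an observable system the state x_k is an exact linear function of the last N inputs and outputs, so the map can be identified from data by least squares. The system matrices and history length below are hypothetical and serve only to generate data.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
N = 2                                     # history length (at least the observability index)

def simulate(T):
    x = rng.standard_normal(2)
    X, Z = [], []
    u_hist, y_hist = [], []
    for k in range(T):
        u = rng.standard_normal(1)
        y = C @ x
        u_hist.append(u); y_hist.append(y)
        if k >= N:                        # stack the last N inputs and outputs as the representation z_k
            Z.append(np.concatenate(u_hist[k - N:k] + y_hist[k - N:k]))
            X.append(x)
        x = A @ x + (B @ u).ravel()
    return np.array(X), np.array(Z)

X, Z = simulate(400)
M, *_ = np.linalg.lstsq(Z, X, rcond=None)        # fit x_k ~ M.T @ z_k by least squares
print("max reconstruction error:", np.max(np.abs(Z @ M - X)))
```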
In this study, a new approach based on adaptive dynamic programming (ADP) is proposed to control single-phase uninterruptible power supply inverters. The control scheme uses a single function approximator, called the critic, to evaluate the optimal cost and determine the optimal switching. After offline training of the critic, which is a function of the system states and the elapsed time, the resulting optimal weights are used in online control to obtain a smooth AC output voltage in a feedback form. Simulations show the desirable performance of this controller with linear and non-linear loads and its relative robustness to parameter uncertainty and disturbances. Furthermore, the proposed controller is extended so that the inverter is suitable for single-phase variable-frequency drives. Finally, as one of the few studies in the field of ADP, the proposed controllers are implemented on a physical prototype to show their performance in practice.
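As a rough sketch of the online stage only, the code below performs a one-step lookahead over a discrete set of inverter switch levels and picks the level minimizing the stage cost plus a critic estimate of the cost-to-go. The Euler-discretized LC-filter model, the parameters, and the hand-picked quadratic "critic" standing in for the offline-trained approximator are all illustrative assumptions, not the paper's design.

```python
import numpy as np

dt, L, Cf, Rload = 1e-4, 2e-3, 50e-6, 10.0
Vdc = 200.0
levels = np.array([-Vdc, 0.0, Vdc])       # available bridge output voltages (two-level illustration)

# Euler-discretized LC filter with resistive load; state x = [inductor current, capacitor voltage].
A = np.array([[1.0, -dt / L], [dt / Cf, 1.0 - dt / (Rload * Cf)]])
B = np.array([dt / L, 0.0])

def v_ref(t):                              # desired 50 Hz output voltage
    return 120.0 * np.sin(2 * np.pi * 50.0 * t)

def critic(x, t):                          # hypothetical "trained" critic: quadratic in the voltage error
    return 5.0 * (x[1] - v_ref(t)) ** 2

x = np.zeros(2)
for k in range(2000):
    t = k * dt
    costs = []
    for v in levels:                       # one-step lookahead over the discrete switch levels
        x_next = A @ x + B * v
        stage = (x_next[1] - v_ref(t + dt)) ** 2 + 1e-4 * v**2
        costs.append(stage + critic(x_next, t + dt))
    x = A @ x + B * levels[int(np.argmin(costs))]
```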