In this paper, we propose an adaptive dynamic programming (ADP) approach to solve the infinite-horizon linear quadratic (LQ) Stackelberg game problem for unknown stochastic discrete-time systems with multiple decision makers. First, the stochastic LQ Stackelberg game problem is converted into a deterministic problem by system transformation. Next, a value iteration ADP approach is put forward and its convergence is established. Third, to implement the iterative method, back propagation neural networks (BPNNs) are used to design the model network, critic network and action network, which approximate the unknown system, the objective functions and the Stackelberg strategies, respectively. Finally, simulation results show that the algorithm is effective.
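As a point of reference for the value-iteration step described above, the following is a minimal sketch of value iteration for a deterministic discrete-time LQ problem with a single decision maker. The known matrices A, B, Q, R are illustrative assumptions standing in for the paper's unknown stochastic system and its leader-follower structure, which the paper handles with neural networks.

```python
# Minimal value-iteration sketch for a discrete-time LQ problem (single
# decision maker).  The dynamics (A, B) and weights (Q, R) are assumed known
# here purely to show the shape of the Bellman backup.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # assumed example dynamics
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))                      # V_0(x) = 0
for k in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # greedy feedback gain
    P_next = Q + A.T @ P @ A - A.T @ P @ B @ K           # Bellman backup
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

print("converged value matrix P:\n", P)
print("optimal feedback gain K:\n", np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
```

The fixed point of this recursion is the stabilizing solution of the discrete-time algebraic Riccati equation, which is the convergence target the value-iteration scheme approximates.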
This paper addresses decentralized tracking control (DTC) problems for input-constrained unknown nonlinear interconnected systems via event-triggered adaptive dynamic programming. To reconstruct the system dynamics, a neural-network-based local observer is established by using local input-output data and the desired trajectories of all other subsystems. By employing a nonquadratic value function, the DTC problem of the input-constrained nonlinear interconnected system is transformed into an optimal control problem. Using the observer-critic architecture, the DTC policy is obtained by solving the local Hamilton-Jacobi-Bellman equation through the local critic neural network, whose weights are tuned by the experience replay technique to relax the persistence of excitation condition. Under the event-triggering mechanism, the DTC policy is updated only at the event-triggering instants, so computational resources and communication bandwidth are saved. The stability of the closed-loop system under the event-triggered DTC policy is guaranteed via Lyapunov's direct method. Finally, simulation examples are provided to demonstrate the effectiveness of the proposed scheme. (C) 2022 Elsevier Ltd. All rights reserved.
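The two mechanisms emphasized above, event-triggered updates and experience replay for the critic, can be sketched as follows. This is a minimal illustration rather than the paper's observer-critic design: the dynamics, triggering threshold, learning rate and polynomial critic features are all assumed for the example.

```python
# Event-triggered critic updates with an experience-replay buffer.
# All numbers and the simple TD(0)-style fit are illustrative assumptions.
import numpy as np
from collections import deque

replay = deque(maxlen=200)                # experience replay buffer

def features(x):                          # simple quadratic critic features
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

w = np.zeros(3)                           # critic weights
x = np.array([1.0, -0.5])
x_event = x.copy()                        # state at the last triggering instant
threshold = 0.05

for step in range(300):
    u = -0.5 * x_event[1]                 # control held constant between events
    x_next = np.array([0.95*x[0] + 0.1*x[1], 0.8*x[1] + 0.1*u])
    cost = x @ x + u**2
    replay.append((features(x), cost, features(x_next)))

    if np.linalg.norm(x_next - x_event) > threshold:   # event-triggering rule
        x_event = x_next.copy()
        for phi, c, phi_next in replay:                 # replay-based critic fit
            td = c + w @ phi_next - w @ phi
            w += 0.01 * td * phi
    x = x_next

print("critic weights:", w)
```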
In this paper, the distributed optimal consensus problem is investigated for a class of continuous-time nonlinear multi-agent systems with input saturation. Non-quadratic cost functions are introduced to handle the input constraints, and a novel distributed optimal consensus protocol is derived based on an event-triggered adaptive dynamic programming method. An online implementation scheme is designed under the actor-critic network framework in order to obtain the solutions of the Hamilton-Jacobi-Bellman equations online. The computation and communication loads are effectively reduced since the weight estimation vectors and controllers are updated only at event-triggered instants. Detailed analysis based on Lyapunov stability theory guarantees that the weight estimation errors and local consensus errors are uniformly ultimately bounded. Furthermore, it is proven that Zeno behaviour is effectively avoided. Finally, simulation examples are presented to validate the proposed strategy.
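A common way to realize a non-quadratic cost of the kind mentioned above is the penalty U(u) = 2 * integral_0^u lam * atanh(v/lam) * R dv for the bound |u| <= lam, which yields a control law that saturates smoothly. The snippet below evaluates this penalty and the resulting policy form; the bound lam, weight r and gradient value are illustrative assumptions, and the paper's specific cost function may differ.

```python
# Non-quadratic input penalty for |u| <= lam and the saturated policy it induces.
import numpy as np

lam, r = 1.0, 1.0                          # saturation bound and input weight (assumed)

def input_penalty(u):
    # U(u) = 2*r*lam * integral_0^u atanh(v/lam) dv, in closed form:
    # integral of atanh(v/lam) dv = u*atanh(u/lam) + (lam/2)*ln(1 - u^2/lam^2)
    return 2.0 * r * lam * (u * np.arctanh(u / lam)
                            + 0.5 * lam * np.log(1.0 - (u / lam) ** 2))

def saturated_policy(grad_term):
    # minimizing control under the non-quadratic penalty: stays within (-lam, lam)
    return -lam * np.tanh(grad_term / (2.0 * lam * r))

for u in (0.2, 0.6, 0.9):
    print(f"U({u}) = {input_penalty(u):.4f}")
print("policy output for a large gradient:", saturated_policy(50.0))
```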
In this paper, a finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems is proposed. Through system transformation, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error dynamics. Then, with convergence analysis in terms of the cost function and control law, an iterative adaptive dynamic programming (ADP) algorithm based on the heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller, which brings the cost function to within an epsilon-error bound of its optimal value. Three neural networks are used as parametric structures to implement the algorithm, approximating the cost function, the control law, and the error dynamics, respectively. Two simulation examples are included to complement the theoretical discussions. (C) 2011 Elsevier B.V. All rights reserved.
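The system-transformation step mentioned above, turning tracking into regulation of the error dynamics, can be written down concretely. The sketch below uses an assumed scalar system, input gain and reference purely for illustration; in the paper these maps are unknown and approximated by the model network.

```python
# Tracking-to-regulation transformation: the tracking problem for
# x_{k+1} = f(x_k) + g(x_k) u_k with reference r_k becomes a regulation
# problem for the error e_k = x_k - r_k.  f, g and r are assumed examples.
import numpy as np

def f(x):  return 0.9 * np.sin(x)          # assumed drift
def g(x):  return 1.0                      # assumed input gain
def r(k):  return np.cos(0.1 * k)          # assumed reference trajectory

def u_desired(k):
    # feedforward control that keeps x on the reference:
    # r_{k+1} = f(r_k) + g(r_k) * u_d(k)
    return (r(k + 1) - f(r(k))) / g(r(k))

def error_dynamics(e, v, k):
    # e_{k+1} = f(e_k + r_k) + g(e_k + r_k) * (v_k + u_d(k)) - r_{k+1}
    x = e + r(k)
    u = v + u_desired(k)
    return f(x) + g(x) * u - r(k + 1)

# With v = 0 the error stays at 0: the regulator only has to handle e.
print(error_dynamics(0.0, 0.0, 5))          # -> 0.0 (up to rounding)
```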
In this paper, we propose an output-based tracking control scheme for a class of continuous-time nonlinear systems via the adaptive dynamic programming (ADP) technique. A neural network (NN) observer is constructed to reconstruct the immeasurable information of the nonlinear systems, and, by introducing a new state vector and an appropriate coordinate transformation, the tracking control problem is converted into an optimal regulation problem, for which a critic-actor neural network structure is developed to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with the tracking errors. In addition, a robust term is introduced to eliminate the effects of approximation errors. It is proven via the Lyapunov approach that all signals in the closed-loop system are uniformly ultimately bounded (UUB). Finally, simulation examples are provided to illustrate the theoretical claims. (C) 2018 Elsevier Inc. All rights reserved.
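The observer idea, driving a state estimate from the measured output so that the controller never needs the full state, can be sketched as below. The drift, output map, observer gain and feedback law are assumed for the example, and the NN approximator of the unknown dynamics is replaced by a fixed function.

```python
# Output-feedback observer sketch: only y = C x is measured, and x_hat is
# propagated from the input and the output-estimation error.
import numpy as np

dt = 0.01
C = np.array([[1.0, 0.0]])
L = np.array([[2.0], [1.0]])               # observer gain (assumed)

def f(x):                                  # true drift (unknown to the controller)
    return np.array([x[1], -x[0] - 0.5 * x[1]])

def f_hat(x):                              # stand-in for the NN approximation
    return np.array([x[1], -x[0] - 0.5 * x[1]])

x = np.array([1.0, 0.0])
x_hat = np.zeros(2)
for _ in range(2000):
    u = -0.5 * x_hat[1]                    # control uses only the estimate
    y, y_hat = C @ x, C @ x_hat
    x = x + dt * (f(x) + np.array([0.0, 1.0]) * u)
    x_hat = x_hat + dt * (f_hat(x_hat) + np.array([0.0, 1.0]) * u
                          + L @ (y - y_hat))

print("estimation error after 20 s:", np.linalg.norm(x - x_hat))
```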
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable energy resources combine into a nonlinear, time-varying, uncertain and complex system that is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. In order to enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, a weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy. Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves the load balancing process. The comparison with the particle swarm optimization algorithm further proves that the presented method has a promising effect on energy management to save cost.
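The resource-priority idea mentioned above can be illustrated with a simple dispatch rule: solar generation serves the load first, the battery covers or absorbs the remainder within its limits, and the public grid handles what is left. The function below encodes only this priority ordering under assumed power and capacity limits; the ADHDP networks and real-time prices from the paper are not modeled.

```python
# Priority-based dispatch sketch: solar first, battery second, grid last.
# Limits and timestep are illustrative assumptions.
def dispatch(load_kw, solar_kw, soc_kwh, batt_power_kw=3.0, batt_cap_kwh=10.0,
             dt_h=1.0):
    net = load_kw - solar_kw                       # >0: deficit, <0: surplus
    if net >= 0:                                   # discharge battery first
        batt = min(net, batt_power_kw, soc_kwh / dt_h)
        grid = net - batt
    else:                                          # charge battery with surplus
        batt = -min(-net, batt_power_kw, (batt_cap_kwh - soc_kwh) / dt_h)
        grid = net - batt                          # leftover surplus exported
    soc_kwh -= batt * dt_h                         # batt < 0 raises the SOC
    return grid, batt, soc_kwh

print(dispatch(load_kw=4.0, solar_kw=1.0, soc_kwh=5.0))   # deficit case
print(dispatch(load_kw=1.0, solar_kw=4.0, soc_kwh=5.0))   # surplus case
```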
In this paper, a novel control scheme is developed to solve an optimal containment control problem for unknown continuous-time multi-agent systems. Different from traditional adaptive dynamic programming (ADP) algorithms, the proposed internal reinforcement ADP (IR-ADP) algorithm adds internal reinforcement signals to facilitate the learning process. A distributed containment control law incorporating the internal reinforcement signal is then designed for each agent. The convergence of the IR-ADP algorithm and the stability of the closed-loop multi-agent system are analyzed theoretically. For the implementation of the optimal controllers, three neural networks (NNs), namely internal reinforcement NNs, critic NNs and actor NNs, are utilized to approximate the internal reinforcement signals, the performance indices and the optimal control laws, respectively. Finally, simulation results are provided to demonstrate the effectiveness of the proposed algorithm. (C) 2020 Elsevier B.V. All rights reserved.
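One way to read the internal-reinforcement layering described above is as a two-level accumulation: the raw stage cost feeds an intermediate signal, and that signal, rather than the raw cost, feeds the critic. The short computation below shows this layering exactly on a fixed cost sequence; the discount-like factors and this interpretation are assumptions for illustration, since the paper realizes both levels with neural networks.

```python
# Two-level accumulation: stage costs -> internal reinforcement -> value.
import numpy as np

def internal_reinforcement(costs, alpha=0.8):
    # s_k = r_k + alpha * s_{k+1}, computed backwards
    s, acc = np.zeros(len(costs)), 0.0
    for k in reversed(range(len(costs))):
        acc = costs[k] + alpha * acc
        s[k] = acc
    return s

def value_from_signal(signal, gamma=0.95):
    # V_k = s_k + gamma * V_{k+1}, computed backwards
    v, acc = np.zeros(len(signal)), 0.0
    for k in reversed(range(len(signal))):
        acc = signal[k] + gamma * acc
        v[k] = acc
    return v

costs = np.array([1.0, 0.8, 0.5, 0.2, 0.0])
s = internal_reinforcement(costs)
print("internal reinforcement:", np.round(s, 3))
print("value function:        ", np.round(value_from_signal(s), 3))
```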
Renewable energy is an advisable choice to reduce fuel consumption and CO2 emissions, and wind and solar energy are the most promising contributors to this goal. Although the hybrid wind/solar system has been widely studied, real-time current sharing based on the maximum capacities of the sources is rarely achieved on a timescale of seconds. Motivated by this, this paper proposes an accurate current sharing and voltage regulation approach for hybrid wind/solar systems based on distributed adaptive dynamic programming. Firstly, the equivalent wind/solar model is built, an indispensable preprocessing step for achieving complementarity between wind and solar energy. With this model, the wind and solar sources output current according to their respective capacity ratios, which ensures the maximum utilization of the renewable energy sources. Furthermore, the current sharing and voltage regulation problem is converted into an optimal control problem, in which each source agent aims to obtain the optimal control variable and achieve accurate current sharing and voltage regulation. Moreover, an adaptive dynamic programming approach based on the Bellman principle is proposed to achieve accurate current sharing and voltage regulation. Finally, simulation results are provided to illustrate the performance of the proposed adaptive dynamic programming approach.
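The current-sharing target itself, each source supplying load current in proportion to its capacity while a secondary term restores the bus voltage, can be written in a few lines. The capacities, load current, voltage values and gain below are illustrative assumptions; the distributed ADP controller from the paper, which drives the sources to this target, is not reproduced.

```python
# Capacity-proportional current references plus a shared voltage correction.
import numpy as np

capacities = np.array([2.0, 1.0, 1.0])     # assumed max currents of 3 sources (A)
i_load = 6.0                               # total load current (A)
v_ref, v_bus = 48.0, 47.6                  # reference and measured bus voltage (V)

# capacity-proportional current references
i_ref = capacities / capacities.sum() * i_load
print("per-source current references:", i_ref)        # [3.0, 1.5, 1.5]

# simple secondary voltage correction shared in the same ratio
k_v = 0.5
i_correction = k_v * (v_ref - v_bus) * capacities / capacities.sum()
print("voltage-regulation correction:", i_correction)
```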
In this paper, a stable value iteration (SVI) algorithm is developed to solve the discrete-time two-player zero-sum game (TP-ZSG) for nonlinear systems based on adaptive dynamic programming (ADP). In the SVI algorithm, both the optimality and the stability of the nonlinear systems are considered, with proofs given. First, an iterative ADP algorithm is presented to obtain approximate optimal solutions by solving the Hamilton-Jacobi-Isaacs (HJI) equation. Second, a range of the discount factor is derived that guarantees the HJI equation serves as a Lyapunov equation. Moreover, we prove that once the iteration number reaches a given value, the iterative control inputs render the closed-loop system asymptotically stable. Third, in order to improve the practicability of the developed stability condition, a simple criterion is established based on Lyapunov stability theory. Neural networks (NNs) are used to approximate the system states, the value function, and the control and disturbance inputs. Finally, simulation results are given to illustrate the performance of the developed optimal control method. (C) 2019 Elsevier B.V. All rights reserved.
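For a sense of what the value-iteration backup looks like in a zero-sum setting, the sketch below runs value iteration on a linear-quadratic surrogate of the game, where the control player minimizes and the disturbance player maximizes a soft-constrained cost. The matrices and the attenuation level gamma are illustrative assumptions; the paper works with unknown nonlinear dynamics, a discount factor and neural-network approximators instead.

```python
# Value iteration for a linear zero-sum game surrogate:
# x_{k+1} = A x + B u + D w, stage cost x'Qx + u'Ru - gamma^2 w'w.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.95]])
B = np.array([[0.0], [0.1]])
D = np.array([[0.05], [0.0]])
Q, R = np.eye(2), np.array([[1.0]])
gamma = 2.0                                 # assumed attenuation level

P = np.zeros((2, 2))
for _ in range(1000):
    M = np.block([[R + B.T @ P @ B,              B.T @ P @ D],
                  [D.T @ P @ B,  D.T @ P @ D - gamma**2 * np.eye(1)]])
    N = np.vstack([B.T @ P @ A, D.T @ P @ A])
    P_next = (Q + A.T @ P @ A
              - np.hstack([A.T @ P @ B, A.T @ P @ D]) @ np.linalg.solve(M, N))
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

print("game value matrix P:\n", P)
```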
In this paper, we consider the problem of developing a controller for continuous-time nonlinear systems whose governing equations are unknown. Using measurements only, two new online implementation schemes based on adaptive dynamic programming (ADP) are presented for synthesizing a controller without building or assuming a model of the system. To circumvent the requirement of prior knowledge of the system, a precompensator is introduced to construct an augmented system. The corresponding Hamilton-Jacobi-Bellman (HJB) equation is solved by adaptive dynamic programming, which consists of the least-squares technique, a neural network approximator and the policy iteration (PI) algorithm. The main idea of our method is to sample the state, state derivative and input in order to update the weights of the neural networks by the least-squares technique, with the update process implemented in the framework of PI. Finally, several examples are given to illustrate the effectiveness of our schemes. (C) 2014 ISA. Published by Elsevier Ltd. All rights reserved.
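The sampling-plus-least-squares idea described above can be illustrated for the policy-evaluation step: under a fixed policy, samples of the state, its derivative and the input determine a quadratic value function V(x) = x'Px through the relation dV/dt + x'Qx + u'Ru = 0, without the system matrices appearing in the fit. The linear system, policy and weights below are assumptions used only to generate data; the paper's precompensator and policy-improvement step are omitted.

```python
# Least-squares policy evaluation from sampled (x, xdot, u) data.
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # used only to generate data
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
K = np.array([[0.5, 0.5]])                  # fixed stabilizing policy u = -Kx

Phi, y = [], []
for _ in range(200):
    x = rng.uniform(-1, 1, size=2)
    u = -(K @ x)
    xdot = A @ x + B @ u
    # d/dt (p1*x1^2 + p2*x1*x2 + p3*x2^2) written linearly in (p1, p2, p3)
    Phi.append([2 * x[0] * xdot[0],
                x[0] * xdot[1] + x[1] * xdot[0],
                2 * x[1] * xdot[1]])
    y.append(-(x @ Q @ x + u @ R @ u))

p, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
print("fitted value parameters (p1, p2, p3):", p)
```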