In this paper, we propose an adaptive dynamic programming (ADP) approach to solve the infinite-horizon linear quadratic (LQ) Stackelberg game problem for unknown stochastic discrete-time systems with multiple decision makers. First, the stochastic LQ Stackelberg game problem is converted into a deterministic problem by system transformation. Next, a value iteration ADP approach is put forward and its convergence is established. Third, to implement the iterative method, back propagation neural networks (BPNNs) are used to design the model network, critic network and action network, which approximate the unknown system, the objective functions and the Stackelberg strategies, respectively. Finally, simulation results show that the algorithm is effective.
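As a point of reference for the value-iteration step described above, the following is a minimal sketch of value iteration for a deterministic discrete-time LQ problem with a single decision maker. The known matrices A, B, Q, R are illustrative assumptions standing in for the paper's unknown stochastic system and its leader-follower structure, which the paper handles with neural networks.

```python
# Minimal value-iteration sketch for a discrete-time LQ problem (single
# decision maker).  The dynamics (A, B) and weights (Q, R) are assumed known
# here purely to show the shape of the Bellman backup.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # assumed example dynamics
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))                      # V_0(x) = 0
for k in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # greedy feedback gain
    P_next = Q + A.T @ P @ A - A.T @ P @ B @ K           # Bellman backup
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

print("converged value matrix P:\n", P)
print("optimal feedback gain K:\n", np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
```

The fixed point of this recursion is the stabilizing solution of the discrete-time algebraic Riccati equation, which is the convergence target the value-iteration scheme approximates.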
This paper addresses decentralized tracking control (DTC) problems for input-constrained unknown nonlinear interconnected systems via event-triggered adaptive dynamic programming. To reconstruct the system dynamics, a neural-network-based local observer is established by using local input-output data and the desired trajectories of all other subsystems. By employing a nonquadratic value function, the DTC problem of the input-constrained nonlinear interconnected system is transformed into an optimal control problem. Using the observer-critic architecture, the DTC policy is obtained by solving the local Hamilton-Jacobi-Bellman equation through the local critic neural network, whose weights are tuned by the experience replay technique to relax the persistence of excitation condition. Under the event-triggering mechanism, the DTC policy is updated only at the event-triggering instants, so computational resources and communication bandwidth are saved. The stability of the closed-loop system under the event-triggered DTC policy is guaranteed via Lyapunov's direct method. Finally, simulation examples are provided to demonstrate the effectiveness of the proposed scheme. (C) 2022 Elsevier Ltd. All rights reserved.
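The two mechanisms emphasized above, event-triggered updates and experience replay for the critic, can be sketched as follows. This is a minimal illustration rather than the paper's observer-critic design: the dynamics, triggering threshold, learning rate and polynomial critic features are all assumed for the example.

```python
# Event-triggered critic updates with an experience-replay buffer.
# All numbers and the simple TD(0)-style fit are illustrative assumptions.
import numpy as np
from collections import deque

replay = deque(maxlen=200)                # experience replay buffer

def features(x):                          # simple quadratic critic features
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

w = np.zeros(3)                           # critic weights
x = np.array([1.0, -0.5])
x_event = x.copy()                        # state at the last triggering instant
threshold = 0.05

for step in range(300):
    u = -0.5 * x_event[1]                 # control held constant between events
    x_next = np.array([0.95*x[0] + 0.1*x[1], 0.8*x[1] + 0.1*u])
    cost = x @ x + u**2
    replay.append((features(x), cost, features(x_next)))

    if np.linalg.norm(x_next - x_event) > threshold:   # event-triggering rule
        x_event = x_next.copy()
        for phi, c, phi_next in replay:                 # replay-based critic fit
            td = c + w @ phi_next - w @ phi
            w += 0.01 * td * phi
    x = x_next

print("critic weights:", w)
```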
In this paper, the distributed optimal consensus problem is investigated for a class of continuous-time nonlinear multi-agent systems with input saturation. Non-quadratic cost functions are introduced to handle the input constraints, and a novel distributed optimal consensus protocol is derived based on an event-triggered adaptive dynamic programming method. An online implementation scheme is designed under the actor-critic network framework in order to obtain the solutions of the Hamilton-Jacobi-Bellman equations online. The computation and communication loads are effectively reduced since the weight estimation vectors and controllers are updated only at event-triggered instants. Detailed analysis based on Lyapunov stability theory guarantees that the weight estimation errors and local consensus errors are uniformly ultimately bounded. Furthermore, it is proven that Zeno behaviour is effectively avoided. Finally, simulation examples are presented to validate the proposed strategy.
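A common way to realize a non-quadratic cost of the kind mentioned above is the penalty U(u) = 2 * integral_0^u lam * atanh(v/lam) * R dv for the bound |u| <= lam, which yields a control law that saturates smoothly. The snippet below evaluates this penalty and the resulting policy form; the bound lam, weight r and gradient value are illustrative assumptions, and the paper's specific cost function may differ.

```python
# Non-quadratic input penalty for |u| <= lam and the saturated policy it induces.
import numpy as np

lam, r = 1.0, 1.0                          # saturation bound and input weight (assumed)

def input_penalty(u):
    # U(u) = 2*r*lam * integral_0^u atanh(v/lam) dv, in closed form:
    # integral of atanh(v/lam) dv = u*atanh(u/lam) + (lam/2)*ln(1 - u^2/lam^2)
    return 2.0 * r * lam * (u * np.arctanh(u / lam)
                            + 0.5 * lam * np.log(1.0 - (u / lam) ** 2))

def saturated_policy(grad_term):
    # minimizing control under the non-quadratic penalty: stays within (-lam, lam)
    return -lam * np.tanh(grad_term / (2.0 * lam * r))

for u in (0.2, 0.6, 0.9):
    print(f"U({u}) = {input_penalty(u):.4f}")
print("policy output for a large gradient:", saturated_policy(50.0))
```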
In this paper, a finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems is proposed. Through system transformation, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error dynamics. Then, with convergence analysis in terms of the cost function and control law, an iterative adaptive dynamic programming (ADP) algorithm based on the heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller, which brings the cost function to within an epsilon-error bound of its optimal value. Three neural networks are used as parametric structures to implement the algorithm, approximating the cost function, the control law, and the error dynamics, respectively. Two simulation examples are included to complement the theoretical discussions. (C) 2011 Elsevier B.V. All rights reserved.
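The system-transformation step mentioned above, turning tracking into regulation of the error dynamics, can be written down concretely. The sketch below uses an assumed scalar system, input gain and reference purely for illustration; in the paper these maps are unknown and approximated by the model network.

```python
# Tracking-to-regulation transformation: the tracking problem for
# x_{k+1} = f(x_k) + g(x_k) u_k with reference r_k becomes a regulation
# problem for the error e_k = x_k - r_k.  f, g and r are assumed examples.
import numpy as np

def f(x):  return 0.9 * np.sin(x)          # assumed drift
def g(x):  return 1.0                      # assumed input gain
def r(k):  return np.cos(0.1 * k)          # assumed reference trajectory

def u_desired(k):
    # feedforward control that keeps x on the reference:
    # r_{k+1} = f(r_k) + g(r_k) * u_d(k)
    return (r(k + 1) - f(r(k))) / g(r(k))

def error_dynamics(e, v, k):
    # e_{k+1} = f(e_k + r_k) + g(e_k + r_k) * (v_k + u_d(k)) - r_{k+1}
    x = e + r(k)
    u = v + u_desired(k)
    return f(x) + g(x) * u - r(k + 1)

# With v = 0 the error stays at 0: the regulator only has to handle e.
print(error_dynamics(0.0, 0.0, 5))          # -> 0.0 (up to rounding)
```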
In this paper, we propose an output-based tracking control scheme for a class of continuous-time nonlinear systems via the adaptive dynamic programming (ADP) technique. A neural network (NN) observer is constructed to reconstruct the immeasurable information of the nonlinear systems, and, by introducing a new state vector and an appropriate coordinate transformation, the tracking control problem is converted into an optimal regulation problem, for which a critic-actor neural network structure is developed to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with the tracking errors. In addition, a robust term is introduced to eliminate the effects of approximation errors. It is proven via the Lyapunov approach that all signals in the closed-loop system are uniformly ultimately bounded (UUB). Finally, simulation examples are provided to illustrate the theoretical claims. (C) 2018 Elsevier Inc. All rights reserved.
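The observer idea, driving a state estimate from the measured output so that the controller never needs the full state, can be sketched as below. The drift, output map, observer gain and feedback law are assumed for the example, and the NN approximator of the unknown dynamics is replaced by a fixed function.

```python
# Output-feedback observer sketch: only y = C x is measured, and x_hat is
# propagated from the input and the output-estimation error.
import numpy as np

dt = 0.01
C = np.array([[1.0, 0.0]])
L = np.array([[2.0], [1.0]])               # observer gain (assumed)

def f(x):                                  # true drift (unknown to the controller)
    return np.array([x[1], -x[0] - 0.5 * x[1]])

def f_hat(x):                              # stand-in for the NN approximation
    return np.array([x[1], -x[0] - 0.5 * x[1]])

x = np.array([1.0, 0.0])
x_hat = np.zeros(2)
for _ in range(2000):
    u = -0.5 * x_hat[1]                    # control uses only the estimate
    y, y_hat = C @ x, C @ x_hat
    x = x + dt * (f(x) + np.array([0.0, 1.0]) * u)
    x_hat = x_hat + dt * (f_hat(x_hat) + np.array([0.0, 1.0]) * u
                          + L @ (y - y_hat))

print("estimation error after 20 s:", np.linalg.norm(x - x_hat))
```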
The residential energy scheduling of solar energy is an important research area of the smart grid. On the demand side, factors such as household loads, storage batteries, the outside public utility grid and renewable energy resources combine into a nonlinear, time-varying, uncertain and complex system that is difficult to manage or optimize. Many nations have already applied residential real-time pricing to balance the burden on their grids. In order to enhance the electricity efficiency of the residential microgrid, this paper presents an action-dependent heuristic dynamic programming (ADHDP) method to solve the residential energy scheduling problem. The highlights of this paper are listed below. First, a weather-type classification is adopted to establish three types of programming models based on the features of the solar energy. In addition, the priorities of different energy resources are set to reduce the loss of electrical energy. Second, three ADHDP-based neural networks, which can update themselves during applications, are designed to manage the flows of electricity. Third, simulation results show that the proposed scheduling method effectively reduces the total electricity cost and improves the load balancing process. The comparison with the particle swarm optimization algorithm further proves that the presented method has a promising effect on energy management to save cost.
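The resource-priority idea mentioned above can be illustrated with a simple dispatch rule: solar generation serves the load first, the battery covers or absorbs the remainder within its limits, and the public grid handles what is left. The function below encodes only this priority ordering under assumed power and capacity limits; the ADHDP networks and real-time prices from the paper are not modeled.

```python
# Priority-based dispatch sketch: solar first, battery second, grid last.
# Limits and timestep are illustrative assumptions.
def dispatch(load_kw, solar_kw, soc_kwh, batt_power_kw=3.0, batt_cap_kwh=10.0,
             dt_h=1.0):
    net = load_kw - solar_kw                       # >0: deficit, <0: surplus
    if net >= 0:                                   # discharge battery first
        batt = min(net, batt_power_kw, soc_kwh / dt_h)
        grid = net - batt
    else:                                          # charge battery with surplus
        batt = -min(-net, batt_power_kw, (batt_cap_kwh - soc_kwh) / dt_h)
        grid = net - batt                          # leftover surplus exported
    soc_kwh -= batt * dt_h                         # batt < 0 raises the SOC
    return grid, batt, soc_kwh

print(dispatch(load_kw=4.0, solar_kw=1.0, soc_kwh=5.0))   # deficit case
print(dispatch(load_kw=1.0, solar_kw=4.0, soc_kwh=5.0))   # surplus case
```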
In this paper, a novel control scheme is developed to solve an optimal containment control problem for unknown continuous-time multi-agent systems. Different from traditional adaptive dynamic programming (ADP) algorithms, the proposed internal reinforcement ADP (IR-ADP) algorithm adds internal reinforcement signals to facilitate the learning process. A distributed containment control law incorporating the internal reinforcement signal is then designed for each agent. The convergence of the IR-ADP algorithm and the stability of the closed-loop multi-agent system are analyzed theoretically. For the implementation of the optimal controllers, three neural networks (NNs), namely internal reinforcement NNs, critic NNs and actor NNs, are utilized to approximate the internal reinforcement signals, the performance indices and the optimal control laws, respectively. Finally, simulation results are provided to demonstrate the effectiveness of the proposed algorithm. (C) 2020 Elsevier B.V. All rights reserved.
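One way to read the internal-reinforcement layering described above is as a two-level accumulation: the raw stage cost feeds an intermediate signal, and that signal, rather than the raw cost, feeds the critic. The short computation below shows this layering exactly on a fixed cost sequence; the discount-like factors and this interpretation are assumptions for illustration, since the paper realizes both levels with neural networks.

```python
# Two-level accumulation: stage costs -> internal reinforcement -> value.
import numpy as np

def internal_reinforcement(costs, alpha=0.8):
    # s_k = r_k + alpha * s_{k+1}, computed backwards
    s, acc = np.zeros(len(costs)), 0.0
    for k in reversed(range(len(costs))):
        acc = costs[k] + alpha * acc
        s[k] = acc
    return s

def value_from_signal(signal, gamma=0.95):
    # V_k = s_k + gamma * V_{k+1}, computed backwards
    v, acc = np.zeros(len(signal)), 0.0
    for k in reversed(range(len(signal))):
        acc = signal[k] + gamma * acc
        v[k] = acc
    return v

costs = np.array([1.0, 0.8, 0.5, 0.2, 0.0])
s = internal_reinforcement(costs)
print("internal reinforcement:", np.round(s, 3))
print("value function:        ", np.round(value_from_signal(s), 3))
```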
Renewable energy is an advisable choice to reduce fuel consumption and CO2 emissions, and wind and solar energy are the most promising contributors to this goal. Although the hybrid wind/solar system has been widely studied, real-time current sharing based on the maximum capacities of the sources is rarely achieved on a timescale of seconds. Motivated by this, this paper proposes an accurate current sharing and voltage regulation approach for hybrid wind/solar systems based on distributed adaptive dynamic programming. Firstly, the equivalent wind/solar model is built, an indispensable preprocessing step for achieving complementarity between wind and solar energy. With this model, the wind and solar sources output current according to their respective capacity ratios, which ensures the maximum utilization of the renewable energy sources. Furthermore, the current sharing and voltage regulation problem is converted into an optimal control problem, in which each source agent aims to obtain the optimal control variable and achieve accurate current sharing and voltage regulation. Moreover, an adaptive dynamic programming approach based on the Bellman principle is proposed to achieve accurate current sharing and voltage regulation. Finally, simulation results are provided to illustrate the performance of the proposed adaptive dynamic programming approach.
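The current-sharing target itself, each source supplying load current in proportion to its capacity while a secondary term restores the bus voltage, can be written in a few lines. The capacities, load current, voltage values and gain below are illustrative assumptions; the distributed ADP controller from the paper, which drives the sources to this target, is not reproduced.

```python
# Capacity-proportional current references plus a shared voltage correction.
import numpy as np

capacities = np.array([2.0, 1.0, 1.0])     # assumed max currents of 3 sources (A)
i_load = 6.0                               # total load current (A)
v_ref, v_bus = 48.0, 47.6                  # reference and measured bus voltage (V)

# capacity-proportional current references
i_ref = capacities / capacities.sum() * i_load
print("per-source current references:", i_ref)        # [3.0, 1.5, 1.5]

# simple secondary voltage correction shared in the same ratio
k_v = 0.5
i_correction = k_v * (v_ref - v_bus) * capacities / capacities.sum()
print("voltage-regulation correction:", i_correction)
```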
In this paper, a stable value iteration (SVI) algorithm is developed to solve the discrete-time two-player zero-sum game (TP-ZSG) for nonlinear systems based on adaptive dynamic programming (ADP). In the SVI algorithm, both the optimality and the stability of the nonlinear systems are considered, with proofs given. First, an iterative ADP algorithm is presented to obtain approximate optimal solutions by solving the Hamilton-Jacobi-Isaacs (HJI) equation. Second, a range of the discount factor is derived that guarantees the HJI equation serves as a Lyapunov equation. Moreover, we prove that once the iteration number reaches a given value, the iterative control inputs render the closed-loop system asymptotically stable. Third, in order to improve the practicability of the developed stability condition, a simple criterion is established based on Lyapunov stability theory. Neural networks (NNs) are used to approximate the system states, the value function, and the control and disturbance inputs. Finally, simulation results are given to illustrate the performance of the developed optimal control method. (C) 2019 Elsevier B.V. All rights reserved.
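For a sense of what the value-iteration backup looks like in a zero-sum setting, the sketch below runs value iteration on a linear-quadratic surrogate of the game, where the control player minimizes and the disturbance player maximizes a soft-constrained cost. The matrices and the attenuation level gamma are illustrative assumptions; the paper works with unknown nonlinear dynamics, a discount factor and neural-network approximators instead.

```python
# Value iteration for a linear zero-sum game surrogate:
# x_{k+1} = A x + B u + D w, stage cost x'Qx + u'Ru - gamma^2 w'w.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.95]])
B = np.array([[0.0], [0.1]])
D = np.array([[0.05], [0.0]])
Q, R = np.eye(2), np.array([[1.0]])
gamma = 2.0                                 # assumed attenuation level

P = np.zeros((2, 2))
for _ in range(1000):
    M = np.block([[R + B.T @ P @ B,              B.T @ P @ D],
                  [D.T @ P @ B,  D.T @ P @ D - gamma**2 * np.eye(1)]])
    N = np.vstack([B.T @ P @ A, D.T @ P @ A])
    P_next = (Q + A.T @ P @ A
              - np.hstack([A.T @ P @ B, A.T @ P @ D]) @ np.linalg.solve(M, N))
    if np.max(np.abs(P_next - P)) < 1e-10:
        P = P_next
        break
    P = P_next

print("game value matrix P:\n", P)
```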
In this paper, we consider the problem of developing a controller for continuous-time nonlinear systems whose governing equations are unknown. Using measurements only, two new online implementation schemes based on adaptive dynamic programming (ADP) are presented for synthesizing a controller without building or assuming a model of the system. To circumvent the requirement of prior knowledge of the system, a precompensator is introduced to construct an augmented system. The corresponding Hamilton-Jacobi-Bellman (HJB) equation is solved by adaptive dynamic programming, which consists of the least-squares technique, a neural network approximator and the policy iteration (PI) algorithm. The main idea of our method is to sample the state, state derivative and input in order to update the weights of the neural networks by the least-squares technique, with the update process implemented in the framework of PI. Finally, several examples are given to illustrate the effectiveness of our schemes. (C) 2014 ISA. Published by Elsevier Ltd. All rights reserved.
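The sampling-plus-least-squares idea described above can be illustrated for the policy-evaluation step: under a fixed policy, samples of the state, its derivative and the input determine a quadratic value function V(x) = x'Px through the relation dV/dt + x'Qx + u'Ru = 0, without the system matrices appearing in the fit. The linear system, policy and weights below are assumptions used only to generate data; the paper's precompensator and policy-improvement step are omitted.

```python
# Least-squares policy evaluation from sampled (x, xdot, u) data.
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # used only to generate data
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
K = np.array([[0.5, 0.5]])                  # fixed stabilizing policy u = -Kx

Phi, y = [], []
for _ in range(200):
    x = rng.uniform(-1, 1, size=2)
    u = -(K @ x)
    xdot = A @ x + B @ u
    # d/dt (p1*x1^2 + p2*x1*x2 + p3*x2^2) written linearly in (p1, p2, p3)
    Phi.append([2 * x[0] * xdot[0],
                x[0] * xdot[1] + x[1] * xdot[0],
                2 * x[1] * xdot[1]])
    y.append(-(x @ Q @ x + u @ R @ u))

p, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
print("fitted value parameters (p1, p2, p3):", p)
```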