ISBN: 9781467314909 (print)
In this paper, the Bellman equation is used to solve the stochastic optimal control problem for an unknown linear discrete-time system subject to communication imperfections, including random delays, packet losses, and quantization. A dynamic quantizer for the sensor measurements is proposed, which in effect supplies the system states to the controller. To eliminate the effect of the quantization error, the dynamics of the quantization error bound and an update law for tuning its range are derived. Subsequently, using adaptive dynamic programming with a Q-function and reinforcement learning, the infinite-horizon optimal regulation problem for the uncertain networked control system (NCS) is solved forward in time, without value or policy iterations. The asymptotic stability of the closed-loop system is verified by standard Lyapunov stability theory. Finally, simulation results verify the effectiveness of the proposed method.
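The idea of a quantizer whose range is tuned online can be illustrated with a minimal sketch. This is not the quantizer or update law from the paper: the uniform mid-point scheme, the function names `quantize` and `update_bound`, and the `shrink`/`grow` factors are all assumptions chosen for illustration.

```python
import numpy as np

def quantize(x, bound, bits):
    """Uniform mid-point quantizer on [-bound, bound] with 2**bits cells."""
    levels = 2 ** bits
    step = 2.0 * bound / levels
    clipped = np.clip(x, -bound, bound)
    # Index of the cell containing each sample; the top edge maps to the last cell.
    idx = np.minimum(np.floor((clipped + bound) / step), levels - 1)
    # Reconstruct at the cell mid-point, so unsaturated error is at most step / 2.
    return -bound + (idx + 0.5) * step

def update_bound(bound, x, shrink=0.9, grow=2.0):
    """Hypothetical range update: zoom out on saturation, otherwise zoom in."""
    if np.max(np.abs(x)) >= bound:
        return bound * grow
    return bound * shrink
```

For inputs inside the range, the quantization error of this sketch is bounded by `bound / 2**bits`, so shrinking the range as the state converges also shrinks the error bound.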
ISBN: 9781665440899 (print)
In this paper, an approximate optimal control method with input constraints based on adaptive dynamic programming (ADP) is proposed for a class of special linear systems. The method is based on an actor/critic framework. The critic approximator is used to approximate the optimal cost function, and the actor approximator is used to approximate the bounded control input. Because the algorithm requires a persistence of excitation (PE) condition, previous data are used together with current data to relax this requirement. When the Lyapunov method is used to prove stability, the error between the optimal control and the bounded control is taken into account. It is proved that the closed-loop system is guaranteed to be uniformly ultimately bounded (UUB). On this basis, a robustness term is added to compensate for the effect of the approximation error. A simulation example demonstrates the effectiveness of the method.
In this paper, a novel adaptive Fault-Tolerant Control (FTC) strategy is proposed for non-minimum phase Hypersonic Vehicles (HSVs) that are affected by actuator faults and parameter *** strategy is based on the output redefinition method and adaptive dynamic programming (ADP). The intelligent FTC scheme consists of two main parts: a basic fault-tolerant and stable controller and an ADP-based supplementary *** the basic FTC part, an output redefinition approach is designed to make the zero dynamics stable with respect to the new ***, the Ideal Internal Dynamics (IID) are obtained using an optimal bounded inversion approach, and a tracking controller is designed for the new output to realize output tracking of the non-minimum phase HSV *** the ADP-based compensation control part, an Action-Dependent Heuristic Dynamic Programming (ADHDP) scheme adopting an actor-critic learning structure is utilized to further optimize the tracking performance of the HSV control ***, simulation results are provided to verify the effectiveness and efficiency of the proposed FTC algorithm.
This paper presents a hybrid adaptive dynamic programming (hybrid-ADP) approach for determining the optimal continuous and discrete control laws of a switched system online, solely from state observations. The new hybrid-ADP recurrence relationships are applicable to model-free control of switched hybrid systems that are possibly nonlinear. The computational complexity and convergence of the hybrid-ADP approach are analyzed, and the method is validated numerically, showing that the optimal controller and value function can be learned iteratively online from state observations.
This paper presents an original decentralized fault-tolerant control approach for modular manipulators based on the adaptive dynamic programming (ADP) algorithm. First, the dynamic model of the modular manipulators is established via the joint torque feedback technique. Then, a fault-tolerant controller is designed, composed of a model-based compensation controller, an observer-based fault-tolerant controller, and an ADP-based optimal controller. Following the ADP algorithm, the Hamilton-Jacobi-Bellman (HJB) equation is solved approximately by a critic neural network (NN). The closed-loop modular manipulator system is guaranteed to be asymptotically stable based on Lyapunov theory. Experiments are performed to verify the proposed method, and the results confirm its effectiveness.
Although the robust regulation problem has been well studied, robust tracking control via online learning has not been fully solved, in particular for nonlinear systems. This paper develops an online adaptive learning technique to complete the robust tracking control design for nonlinear uncertain systems, using the ideas of adaptive dynamic programming (ADP) proposed for optimal control. An augmented system is first constructed from the tracking error and the reference trajectory, so as to reformulate the tracking control problem as a modified robust regulation problem. Then, an equivalence between the robust control and the optimal control is established by using a constructive discounted cost function, which allows the robust control to be designed by solving the optimal control problem of the nominal system. Next, the derived Hamilton-Jacobi-Bellman (HJB) equation is solved by training a critic neural network (NN). Finally, an adaptive learning algorithm is adopted to directly update the unknown NN weights online, with guaranteed convergence. The closed-loop system stability is rigorously proved and extensive simulation results are given to show the effectiveness of the developed learning algorithm. (c) 2021 Elsevier B.V. All rights reserved.
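For orientation, the generic form of a discounted optimal control problem and its HJB equation can be written out as follows. This is a textbook-style sketch, not the paper's exact formulation: the augmented state $z$, the nominal dynamics $\dot z = F(z) + G(z)u$, the weights $Q$ and $R$, the discount factor $\gamma$, and the critic basis $\phi$ are assumed notation, and the extra term the paper uses to bound the uncertainty is omitted.

```latex
% Discounted cost along the nominal augmented dynamics \dot z = F(z) + G(z) u
V(z(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)}
          \left[ z^{\top} Q z + u^{\top} R u \right] \mathrm{d}\tau

% Discounted HJB equation for the optimal value function V^*
0 = \min_{u} \left[ z^{\top} Q z + u^{\top} R u - \gamma V^*(z)
    + \nabla V^*(z)^{\top} \bigl( F(z) + G(z) u \bigr) \right]

% Minimizing control, and a critic NN approximation of V^*
u^*(z) = -\tfrac{1}{2} R^{-1} G(z)^{\top} \nabla V^*(z), \qquad
V^*(z) \approx W^{\top} \phi(z)
```

In this form, training the critic amounts to adjusting the weights $W$ so that the HJB residual is driven toward zero along the measured trajectories.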
In this article, an event-triggered robust adaptive dynamic programming (ETRADP) algorithm is developed to solve a class of multiplayer Stackelberg-Nash games (MSNGs) for uncertain nonlinear continuous-time systems. Considering the different roles of players in the MSNG, the hierarchical decision-making process is described by the designed value functions for the leader and all followers, which help transform the robust control problem of the uncertain nonlinear system into an optimal regulation problem of the nominal system. Then, an online policy iteration algorithm is formulated to solve the derived coupled Hamilton-Jacobi equation. Meanwhile, an event-triggered mechanism is designed to alleviate computational and communication burdens. Moreover, critic neural networks (NNs) are constructed to obtain the event-triggered approximate optimal control policies for all players, which constitute the Stackelberg-Nash equilibrium of the MSNG. By using Lyapunov's direct method, the stability of the closed-loop uncertain nonlinear system is guaranteed under the ETRADP-based control scheme in the sense of uniform ultimate boundedness. Finally, a numerical simulation is provided to demonstrate the effectiveness of the present ETRADP-based control scheme.
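How an event-triggered mechanism reduces transmissions can be seen in a minimal sketch. It is not the ETRADP scheme: the scalar discrete-time plant, the fixed gain `k`, the relative threshold `sigma`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def should_trigger(x, x_held, sigma):
    """True when the gap to the last transmitted state exceeds sigma * ||x||."""
    return np.linalg.norm(x - x_held) > sigma * np.linalg.norm(x)

def simulate(x0, steps=200, a=1.0, b=1.0, k=0.1, sigma=0.3):
    """Run x[n+1] = a*x[n] + b*u[n] with u computed from the held state."""
    x = np.array([x0], dtype=float)
    x_held = x.copy()   # state last sent to the controller
    events = 1          # the initial transmission counts as one event
    for _ in range(steps):
        if should_trigger(x, x_held, sigma):
            x_held = x.copy()   # transmit only when the condition fires
            events += 1
        u = -k * x_held         # control is held constant between events
        x = a * x + b * u
    return float(abs(x[0])), events
```

With these assumed parameters the state still converges while the controller receives a new sample on only a fraction of the steps, which is the saving such mechanisms aim for.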
In this paper, the multi-missile cooperative guidance system is formulated as a general nonlinear multi-agent *** save the limited communication resources, an adaptive event-triggered optimal guidance law is proposed by designing a synchronization-error-driven triggering condition, which combines consensus control with adaptive dynamic programming (ADP) ***, the developed event-triggered distributed control law can be employed by finding an approximate solution of event-triggered coupled Hamilton-Jacobi-Bellman (HJB) *** address this issue, the critic network architecture is constructed, in which an adaptive weight updating law is designed for estimating the cooperative optimal cost function ***, the event-triggered closed-loop system is decomposed into two subsystems: the system with flow dynamics and the system with jump *** using Lyapunov method, the stability of this closed-loop system is guaranteed and all signals are ensured to be Uniformly Ultimately Bounded (UUB). Furthermore, the Zeno behavior is *** results are finally provided to demonstrate the effectiveness of the proposed method.
As human beings, people coordinate movements and interact with the environment through sensory information and motor adaptation in the daily *** characteristics of these interactions can be studied using optimization-based models, which assume that the precise knowledge of both the sensorimotor system and its interactive environment is available for the central nervous system (CNS). However, both static and dynamic uncertainties occur inevitably in the daily *** these uncertainties are taken into consideration, the previously developed models based on optimization theory may fail to explain how the CNS can still coordinate human movements which are also robust with respect to the *** order to address this problem, this paper presents a novel computational mechanism for sensorimotor control from a perspective of robust adaptive dynamic programming (RADP). Sharing some essential features of reinforcement learning, which was originally observed in mammals, the RADP model for sensorimotor control suggests that, instead of identifying the system dynamics of both the motor system and the environment, the CNS iteratively computes a robust optimal control policy using the real-time sensory *** online learning algorithm is provided in this paper, with rigorous convergence and stability ***, it is applied to simulate several experiments reported from the past *** comparing the proposed numerical results with these experimentally observed data, the authors show that the proposed model can reproduce movement trajectories which are consistent with experimental *** addition, the RADP theory provides a unified framework that connects optimality and robustness properties in the sensorimotor system.
Although the optimal regulation problem has been well studied, optimal tracking control via adaptive dynamic programming (ADP) has not been completely resolved, particularly for nonlinear uncertain systems. In this paper, an online adaptive learning method is developed to realize the optimal tracking control design for nonlinear motor driven systems (NMDSs), combining the concepts of ADP, an unknown system dynamic estimator (USDE), and a prescribed performance function (PPF). To this end, a USDE in a simple form is first proposed to handle NMDSs with bounded disturbances. Then, based on the estimated unknown dynamics, an optimal cost function is defined and the optimal tracking control is derived. The derived optimal tracking control is divided into two parts: a steady-state control and an optimal feedback control. The steady-state control can be obtained directly from the tracking commands. The optimal feedback control is obtained via ADP based on the PPF; this improves the convergence of the critic neural network (CNN) weights and the tracking accuracy of the NMDSs. Simulations are provided to demonstrate the feasibility of the designed control method.