ISBN (digital): 9783319466873
ISBN (print): 9783319466873; 9783319466866
In this paper, we develop an event-based adaptive robust stabilization method for continuous-time nonlinear systems with uncertain terms via a self-learning technique called neural dynamic programming. Through system transformation, it is proven that the robustness of the uncertain system can be achieved by designing an event-triggered optimal controller with respect to the nominal system under a suitable triggering condition. Then, the idea of neural dynamic programming is adopted to perform the main controller design task by building and training a critic network. Finally, the effectiveness of the present adaptive robust control strategy is illustrated via a simulation example.
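The triggering idea above can be illustrated with a minimal sketch: control updates occur only when the gap between the current state and the last sampled state exceeds a threshold. The scalar dynamics, feedback gain, and static threshold below are illustrative assumptions, not the paper's HJB-derived quantities.

    # Minimal event-triggered control loop, assuming a scalar nominal
    # system dx/dt = -x + u; the feedback law and the static trigger
    # threshold are stand-ins for the critic-derived versions.
    def f(x, u):
        return -x + u          # hypothetical nominal dynamics

    def controller(x):
        return -0.5 * x        # stand-in for the optimal control

    dt, steps = 0.01, 500
    x, x_hat, updates = 2.0, 2.0, 0   # true state, last sampled state
    threshold = 0.05                  # illustrative trigger bound
    for _ in range(steps):
        if abs(x - x_hat) > threshold:   # triggering condition violated
            x_hat = x                    # sample state, refresh control
            updates += 1
        u = controller(x_hat)            # control held between events
        x += f(x, u) * dt                # explicit Euler step
    print(f"state {x:.4f} after {updates} control updates")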
Autonomous vehicles are considered to have great potential for improving transportation safety and efficiency. Autonomous follow driving is one of the most probable application forms of autonomous vehicles in the near future. In this article, we focus on the basic autonomous following configuration with one follower and one leader. Proper longitudinal regulation of the follower vehicle is essential for the driving quality of the two-vehicle platoon. Focusing on this problem, a novel longitudinal control method composed of a learning-based acceleration decision phase and an internal model-based acceleration tracking phase is proposed for the follower vehicle. In the acceleration decision phase, acceleration commands that drive the following distance to converge to the target value are determined by a near-optimal acceleration policy, which is obtained through an online reinforcement learning algorithm named neural dynamic programming. In the acceleration tracking phase, throttle and brake control commands that make the vehicle track the decided acceleration are derived through an internal model control structure. The performance of the proposed method is verified by simulation experiments conducted with CarSim, an industry-recognized vehicle dynamics simulator.
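A rough sketch of the two-phase structure follows; the linear decision rule stands in for the NDP-learned policy, a first-order lag stands in for the internal-model throttle/brake loop, and all gains, the time constant, and the constant-speed leader are assumptions for illustration.

    import numpy as np

    dt, tau = 0.1, 0.5              # step size and actuator lag (assumed)
    d_target = 30.0                 # desired following distance [m]
    d, v_rel, a = 40.0, -2.0, 0.0   # gap, leader-minus-follower speed, accel

    for _ in range(600):
        # Phase 1: acceleration decision (stand-in for the learned policy)
        a_cmd = np.clip(0.1 * (d - d_target) + 0.5 * v_rel, -3.0, 2.0)
        # Phase 2: acceleration tracking (stand-in for internal model control)
        a += (a_cmd - a) * dt / tau
        # Point-mass kinematics with a constant-speed leader
        v_rel -= a * dt
        d += v_rel * dt
    print(f"final gap {d:.2f} m, relative speed {v_rel:.3f} m/s")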
ISBN (print): 9781509023967
Owing to the nonlinear properties of autonomous land vehicles (ALVs) and the time-varying relationship between the ego-vehicle and the desired path, it is difficult to tune the parameters of a path tracking controller for the autonomous driving of ALVs. Aiming at this problem, a novel learning-based path tracking method is proposed in this paper, composed of the Stanley control structure and a learning-based module. The input of the learning module is the relationship between the current vehicle state and the desired path, and its output is the parameter k of the Stanley control structure; the goal is to adaptively tune k according to the current vehicle state. A near-optimal policy is obtained by neural dynamic programming (NDP), an online and model-free algorithm, and the learning-based module tunes the parameter k of the Stanley control structure online. Simulation results show that the proposed path tracking method achieves attractive performance.
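The Stanley structure referenced above can be sketched as follows; here the gain k is held constant for illustration, whereas the paper's learning module would tune it online, and the straight reference path, speed, wheelbase, and steering limit are assumed values.

    import math

    def stanley_steer(heading_err, cross_track_err, v, k, eps=1e-3):
        # Stanley law: heading correction plus cross-track correction
        return heading_err + math.atan2(k * cross_track_err, v + eps)

    # Kinematic bicycle model tracking the path y = 0 (heading 0)
    x, y, yaw = 0.0, 2.0, 0.0
    v, L, dt, k = 5.0, 2.5, 0.05, 1.0
    for _ in range(200):
        delta = stanley_steer(-yaw, -y, v, k)    # errors w.r.t. y = 0
        delta = max(-0.5, min(0.5, delta))       # steering limit [rad]
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
        yaw += v / L * math.tan(delta) * dt
    print(f"cross-track error after 10 s: {y:.4f} m")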
Parameter estimation of static friction torques in servo control systems is of great significance to their robust control. Many researchers have pursued solutions for estimating the coefficients of static friction torques. To tackle this problem more effectively, in this paper we propose a neural dynamic programming inspired particle swarm search algorithm. We call the algorithm direct BP neural dynamic programming inspired PSO (NDPSO), since it incorporates direct back propagation (BP) and neural dynamic programming (NDP) into particle swarm optimization (PSO). In NDPSO, a critic BP neural network is trained to balance the Bellman equation, while an action BP neural network is used to tune the inertia weight, the cognitive coefficient, and the social coefficient of the PSO algorithm. The training target is to drive the critic network output toward the ultimate success objective. NDPSO, together with standard PSO (SPSO) and a genetic algorithm (GA), is then applied to parameter identification of the static friction torque in a single-input single-output (SISO) servo control system. The experimental results clearly demonstrate that NDPSO is effective and outperforms SPSO and GA in identifying the parameters of the static friction torque in the servo control system.
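The PSO layer of NDPSO can be sketched as below; the action_net placeholder (a linearly decaying inertia with fixed coefficients) merely marks where the trained action BP network would supply w, c1, and c2, and the sphere objective stands in for the friction-identification error.

    import numpy as np

    rng = np.random.default_rng(0)

    def sphere(p):                      # toy objective standing in for
        return float(np.sum(p ** 2))    # the identification error

    def action_net(it, max_iter):
        # Placeholder for the action BP network's output (w, c1, c2)
        return 0.9 - 0.5 * it / max_iter, 2.0, 2.0

    n, dim, max_iter = 20, 3, 100
    x = rng.uniform(-5, 5, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pbest_f = np.array([sphere(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()

    for it in range(max_iter):
        w, c1, c2 = action_net(it, max_iter)
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        f = np.array([sphere(p) for p in x])
        better = f < pbest_f                 # update personal bests
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()   # update global best
    print("best objective:", sphere(g))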
This research is dedicated to developing a min-max robust control strategy for a dynamic game involving pursuers, evaders, and defenders in a multiple-missile scenario. The approach employs neural dynamic programming, utilizing multiple continuous differential neural networks (DNNs). The competitive controller devised addresses the robust optimization of a joint cost function that relies on the trajectories of the pursuer-evader-defender system, accommodating an uncertain mathematical model while adhering to control restrictions. The dynamic programming min-max formulation facilitates robust control by accounting for bounded modeling uncertainties and external disturbances for each game component. The value function of the Hamilton-Jacobi-Bellman (HJB) equation is approximated by a DNN, enabling the estimation of the closed-loop formulation for the joint dynamic game with state restrictions. The controller's design is grounded in estimating the state trajectory under the worst possible uncertainties and perturbations, providing a robustness factor through the robust neural controller. The learning law class for the time-varying weights in the DNN is generated by studying the HJB partial differential equation for the missile motion for each player in the dynamic game. The controller incorporates the solution of the obtained learning laws and a time-varying Riccati equation, offering an online solution to the control implementation. A recurrent algorithm, based on the Kiefer-Wolfowitz method, adjusts the initial conditions for the weights to satisfy the final condition of the given cost function for the dynamic game. A numerical example is presented to validate the proposed robust control methodology, confirming the optimization solution based on the DNN approximation for Bellman's value function.
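As a toy illustration of the min-max game geometry only (not the DNN controller itself), the sketch below runs a simple-motion pursuit game in which each player's saddle-point strategy is to move along the instantaneous line of sight; the speeds, initial positions, and capture radius are assumed values.

    import numpy as np

    p = np.array([0.0, 0.0])    # pursuer position
    e = np.array([10.0, 5.0])   # evader position
    vp, ve, dt = 1.5, 1.0, 0.1  # speeds and time step (assumed)

    for k in range(500):
        los = e - p
        dist = float(np.linalg.norm(los))
        if dist < 0.2:          # assumed capture radius
            print(f"capture at step {k}, distance {dist:.3f}")
            break
        u = los / dist          # minimizing player: close the distance
        w = los / dist          # maximizing player: flee along the LOS
        p += vp * u * dt
        e += ve * w * dt
    else:
        print(f"no capture, final distance {np.linalg.norm(e - p):.3f}")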
This article reviews the recent development of adaptive dynamic programming (ADP) with applications in control. First, its applications in optimal regulation are introduced, and several well-developed and efficient algorithms are presented. Next, the use of ADP to solve game problems, mainly nonzero-sum game problems, is elaborated, followed by applications in large-scale systems. Note that although the methods presented in this article are based on continuous-time systems, various applications of ADP in discrete-time systems are also analyzed. Moreover, in each section, not only are existing techniques discussed, but possible directions for future work are also pointed out. Finally, some overall prospects for the future are given, followed by the conclusions of this article. Through a comprehensive investigation of its applications in many existing fields, this article demonstrates that the ADP intelligent control method is promising in today's artificial intelligence era and plays a significant role in promoting economic and social development.
One of the challenging problems in sensor network systems is to estimate and track the state of a target point mass with unknown dynamics. Recent improvements in deep learning (DL) have renewed interest in applying DL techniques to state estimation problems. However, process noise is often absent from these formulations, which implicitly assumes that the point-mass target is non-maneuvering, even though process noise is typically as significant as the measurement noise when tracking maneuvering targets. In this paper, we propose a continuous-time (CT) model-free or model-building distributed reinforcement learning estimator (DRLE) using an integral value function in sensor networks. The DRLE algorithm is capable of learning an optimal policy from a neural value function that provides the state estimate of the target point mass. The proposed estimator consists of two high-pass consensus filters, in terms of weighted measurements and inverse-covariance matrices, and a critic reinforcement learning mechanism for each node in the network. The efficiency of the proposed DRLE is shown by a simulation experiment on a network of underactuated vertical takeoff and landing aircraft with strong input coupling. The experiment highlights two advantages of DRLE: i) it does not require the dynamic model to be known, and ii) it is an order of magnitude faster than the state-dependent Riccati equation (SDRE) baseline.
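Only the measurement-fusion ingredient is sketched below: a discrete consensus iteration that drives each node's weighted-measurement estimate toward the network average, standing in for the paper's high-pass consensus filters. The ring graph, noise level, and step size are assumptions, and the critic learning mechanism is omitted entirely.

    import numpy as np

    rng = np.random.default_rng(1)
    truth = 3.0                           # scalar quantity to estimate
    A = np.array([[0., 1., 0., 1.],       # 4-node ring graph adjacency
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 0., 1., 0.]])
    z = truth + 0.3 * rng.standard_normal(4)   # noisy local measurements
    x = z.copy()                               # per-node estimates
    eps = 0.2                                  # consensus step size

    for _ in range(50):
        x = x + eps * (A @ x - A.sum(axis=1) * x)   # x <- x - eps * L x
    print("node estimates:", np.round(x, 4), "truth:", truth)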
This article introduces the state-of-the-art development of adaptive dynamic programming and reinforcement learning (ADPRL). First, algorithms in reinforcement learning (RL) are introduced and their roots in dynamic programming are identified. Adaptive dynamic programming (ADP) is then introduced following a brief discussion of dynamic programming. Research in ADP and RL has enjoyed the fast developments of the past decade, from algorithms, to convergence and optimality analyses, and to stability results. Several key steps in the recent theoretical developments of ADPRL are mentioned together with some future perspectives. In particular, convergence and optimality results of value iteration and policy iteration are reviewed, followed by an introduction to the most recent results on stability analysis of value iteration algorithms.
This study solves a finite-horizon optimal control problem for linear systems with parametric uncertainties and bounded perturbations. The control solution accounts for the uncertain part of the system in the sub-optimal control design by proposing a min-max problem solved through a dynamic neural programming approximate solution. The structure of the neural network is proposed so as to satisfy the characteristics of the value function, including positivity and continuity. The impact of the bounded perturbation on the Hamiltonian maximization is analyzed in detail. The explicit learning law used to adjust the weights is obtained directly from the approximate solution of the Hamilton-Jacobi-Bellman (HJB) equation. The weight adjustment in the proposed algorithm is based on an online state-dependent Riccati-like equation. A numerical simulation illustrates the results of the sub-optimal algorithm, including a comparison against the classical linear regulator designed for the non-perturbed system. (C) 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
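The Riccati-like adjustment can be related to the standard finite-horizon recursion sketched below for an assumed nominal discrete-time pair (A, B); the paper's version is state dependent, runs online, and embeds the min-max handling of uncertainties, all of which is omitted here.

    import numpy as np

    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed nominal dynamics
    B = np.array([[0.0], [0.1]])
    Q, R, P = np.eye(2), np.array([[1.0]]), np.eye(2)   # P starts at P_N
    N = 50                                   # horizon length (assumed)

    gains = []
    for _ in range(N):                       # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()                          # gains[0] applies at t = 0

    x = np.array([[1.0], [0.0]])
    for K in gains:                          # forward rollout, u = -K x
        x = A @ x - B @ (K @ x)
    print("terminal state norm:", float(np.linalg.norm(x)))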