ISBN (digital): 9783319466873
ISBN (print): 9783319466873; 9783319466866
In this paper, we develop an event-based adaptive robust stabilization method for continuous-time nonlinear systems with uncertain terms via a self-learning technique called neural dynamic programming. Through system transformation, it is proven that the robustness of the uncertain system can be achieved by designing an event-triggered optimal controller with respect to the nominal system under a suitable triggering condition. Then, the idea of neural dynamic programming is adopted to perform the main controller design task by building and training a critic network. Finally, the effectiveness of the present adaptive robust control strategy is illustrated via a simulation example.
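The triggering idea above can be illustrated with a minimal sketch: control updates occur only when the gap between the current state and the last sampled state exceeds a threshold. The scalar dynamics, feedback gain, and static threshold below are illustrative assumptions, not the paper's HJB-derived quantities.

    # Minimal event-triggered control loop, assuming a scalar nominal
    # system dx/dt = -x + u; the feedback law and the static trigger
    # threshold are stand-ins for the critic-derived versions.
    def f(x, u):
        return -x + u          # hypothetical nominal dynamics

    def controller(x):
        return -0.5 * x        # stand-in for the optimal control

    dt, steps = 0.01, 500
    x, x_hat, updates = 2.0, 2.0, 0   # true state, last sampled state
    threshold = 0.05                  # illustrative trigger bound
    for _ in range(steps):
        if abs(x - x_hat) > threshold:   # triggering condition violated
            x_hat = x                    # sample state, refresh control
            updates += 1
        u = controller(x_hat)            # control held between events
        x += f(x, u) * dt                # explicit Euler step
    print(f"state {x:.4f} after {updates} control updates")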
Autonomous vehicles are considered to have great potential for improving transportation safety and efficiency. Autonomous follow driving is one of the most probable application forms of autonomous vehicles in the near future. In this article, we focus on the basic autonomous following configuration with one follower and one leader. Proper longitudinal regulation of the follower vehicle is essential for the driving quality of the two-vehicle platoon. Focusing on this problem, a novel longitudinal control method composed of a learning-based acceleration decision phase and an internal model-based acceleration tracking phase is proposed for the follower vehicle. In the acceleration decision phase, acceleration commands that drive the following distance to converge to the target value are determined by a near-optimal acceleration policy, which is obtained through an online reinforcement learning algorithm named neural dynamic programming. In the acceleration tracking phase, throttle and brake control commands that make the vehicle track the decided acceleration are derived through an internal model control structure. The performance of the proposed method is verified by simulation experiments conducted with CarSim, an industry-recognized vehicle dynamics simulator.
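A rough sketch of the two-phase structure follows; the linear decision rule stands in for the NDP-learned policy, a first-order lag stands in for the internal-model throttle/brake loop, and all gains, the time constant, and the constant-speed leader are assumptions for illustration.

    import numpy as np

    dt, tau = 0.1, 0.5              # step size and actuator lag (assumed)
    d_target = 30.0                 # desired following distance [m]
    d, v_rel, a = 40.0, -2.0, 0.0   # gap, leader-minus-follower speed, accel

    for _ in range(600):
        # Phase 1: acceleration decision (stand-in for the learned policy)
        a_cmd = np.clip(0.1 * (d - d_target) + 0.5 * v_rel, -3.0, 2.0)
        # Phase 2: acceleration tracking (stand-in for internal model control)
        a += (a_cmd - a) * dt / tau
        # Point-mass kinematics with a constant-speed leader
        v_rel -= a * dt
        d += v_rel * dt
    print(f"final gap {d:.2f} m, relative speed {v_rel:.3f} m/s")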
ISBN (print): 9781509023967
Owing to the nonlinear properties of autonomous land vehicles (ALVs) and the time-varying relationship between the ego-vehicle and the desired path, it is difficult to tune the parameters of a path tracking controller for the autonomous driving of ALVs. Aiming at this problem, a novel learning-based path tracking method is proposed in this paper, composed of the Stanley control structure and a learning-based module. The input of the learning module is the relationship between the current vehicle state and the desired path, and its output is the parameter k of the Stanley control structure; the goal is to adaptively tune k according to the current vehicle state. A near-optimal policy is obtained by neural dynamic programming (NDP), an online and model-free algorithm, and the learning-based module tunes the parameter k of the Stanley control structure online. Simulation results show that the proposed path tracking method achieves attractive performance.
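The Stanley structure referenced above can be sketched as follows; here the gain k is held constant for illustration, whereas the paper's learning module would tune it online, and the straight reference path, speed, wheelbase, and steering limit are assumed values.

    import math

    def stanley_steer(heading_err, cross_track_err, v, k, eps=1e-3):
        # Stanley law: heading correction plus cross-track correction
        return heading_err + math.atan2(k * cross_track_err, v + eps)

    # Kinematic bicycle model tracking the path y = 0 (heading 0)
    x, y, yaw = 0.0, 2.0, 0.0
    v, L, dt, k = 5.0, 2.5, 0.05, 1.0
    for _ in range(200):
        delta = stanley_steer(-yaw, -y, v, k)    # errors w.r.t. y = 0
        delta = max(-0.5, min(0.5, delta))       # steering limit [rad]
        x += v * math.cos(yaw) * dt
        y += v * math.sin(yaw) * dt
        yaw += v / L * math.tan(delta) * dt
    print(f"cross-track error after 10 s: {y:.4f} m")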
Parameter estimation of static friction torques in servo control systems is of great significance to their robust control. Many researchers have pursued solutions for estimating the coefficients of static friction torques. To tackle this problem more effectively, in this paper we propose a neural dynamic programming inspired particle swarm search algorithm. We call the algorithm direct BP neural dynamic programming inspired PSO (NDPSO), since it incorporates direct back propagation (BP) and neural dynamic programming (NDP) into particle swarm optimization (PSO). In NDPSO, a critic BP neural network is trained to balance the Bellman equation, while an action BP neural network is used to tune the inertia weight, the cognitive coefficient, and the social coefficient of the PSO algorithm. The training target is to drive the critic network output toward the ultimate success objective. NDPSO, together with standard PSO (SPSO) and a genetic algorithm (GA), is then applied to parameter identification of the static friction torque in a single-input single-output (SISO) servo control system. The experimental results clearly demonstrate that NDPSO is effective and outperforms SPSO and GA in identifying the parameters of the static friction torque in the servo control system.
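The PSO layer of NDPSO can be sketched as below; the action_net placeholder (a linearly decaying inertia with fixed coefficients) merely marks where the trained action BP network would supply w, c1, and c2, and the sphere objective stands in for the friction-identification error.

    import numpy as np

    rng = np.random.default_rng(0)

    def sphere(p):                      # toy objective standing in for
        return float(np.sum(p ** 2))    # the identification error

    def action_net(it, max_iter):
        # Placeholder for the action BP network's output (w, c1, c2)
        return 0.9 - 0.5 * it / max_iter, 2.0, 2.0

    n, dim, max_iter = 20, 3, 100
    x = rng.uniform(-5, 5, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pbest_f = np.array([sphere(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()

    for it in range(max_iter):
        w, c1, c2 = action_net(it, max_iter)
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        f = np.array([sphere(p) for p in x])
        better = f < pbest_f                 # update personal bests
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[pbest_f.argmin()].copy()   # update global best
    print("best objective:", sphere(g))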
This research is dedicated to developing a min-max robust control strategy for a dynamic game involving pursuers, evaders, and defenders in a multiple-missile scenario. The approach employs neural dynamic programming, utilizing multiple continuous differential neural networks (DNNs). The competitive controller devised addresses the robust optimization of a joint cost function that relies on the trajectories of the pursuer-evader-defender system, accommodating an uncertain mathematical model while adhering to control restrictions. The dynamic programming min-max formulation facilitates robust control by accounting for bounded modeling uncertainties and external disturbances for each game component. The value function of the Hamilton-Jacobi-Bellman (HJB) equation is approximated by a DNN, enabling the estimation of the closed-loop formulation for the joint dynamic game with state restrictions. The controller's design is grounded in estimating the state trajectory under the worst possible uncertainties and perturbations, providing a robustness factor through the robust neural controller. The learning law class for the time-varying weights in the DNN is generated by studying the HJB partial differential equation for the missile motion for each player in the dynamic game. The controller incorporates the solution of the obtained learning laws and a time-varying Riccati equation, offering an online solution to the control implementation. A recurrent algorithm, based on the Kiefer-Wolfowitz method, adjusts the initial conditions for the weights to satisfy the final condition of the given cost function for the dynamic game. A numerical example is presented to validate the proposed robust control methodology, confirming the optimization solution based on the DNN approximation for Bellman's value function.
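As a toy illustration of the min-max game geometry only (not the DNN controller itself), the sketch below runs a simple-motion pursuit game in which each player's saddle-point strategy is to move along the instantaneous line of sight; the speeds, initial positions, and capture radius are assumed values.

    import numpy as np

    p = np.array([0.0, 0.0])    # pursuer position
    e = np.array([10.0, 5.0])   # evader position
    vp, ve, dt = 1.5, 1.0, 0.1  # speeds and time step (assumed)

    for k in range(500):
        los = e - p
        dist = float(np.linalg.norm(los))
        if dist < 0.2:          # assumed capture radius
            print(f"capture at step {k}, distance {dist:.3f}")
            break
        u = los / dist          # minimizing player: close the distance
        w = los / dist          # maximizing player: flee along the LOS
        p += vp * u * dt
        e += ve * w * dt
    else:
        print(f"no capture, final distance {np.linalg.norm(e - p):.3f}")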
This article reviews the recent development of adaptive dynamic programming (ADP) with applications in control. First, its applications in optimal regulation are introduced, and several well-developed and efficient algorithms are presented. Next, the use of ADP to solve game problems, mainly nonzero-sum game problems, is elaborated, followed by applications in large-scale systems. Note that although the methods presented in this article are based on continuous-time systems, various applications of ADP in discrete-time systems are also analyzed. Moreover, in each section, not only are existing techniques discussed, but possible directions for future work are also pointed out. Finally, some overall prospects for the future are given, followed by the conclusions of this article. Through a comprehensive investigation of its applications in many existing fields, this article demonstrates that the ADP intelligent control method is promising in today's artificial intelligence era and plays a significant role in promoting economic and social development.
One of the challenging problems in sensor network systems is to estimate and track the state of a target point mass with unknown dynamics. Recent improvements in deep learning (DL) have renewed interest in applying DL techniques to state estimation problems. However, process noise is often absent from these formulations, which implicitly assumes that the point-mass target is non-maneuvering, even though process noise is typically as significant as the measurement noise when tracking maneuvering targets. In this paper, we propose a continuous-time (CT) model-free or model-building distributed reinforcement learning estimator (DRLE) using an integral value function in sensor networks. The DRLE algorithm is capable of learning an optimal policy from a neural value function that provides the state estimate of the target point mass. The proposed estimator consists of two high-pass consensus filters, in terms of weighted measurements and inverse-covariance matrices, and a critic reinforcement learning mechanism for each node in the network. The efficiency of the proposed DRLE is shown by a simulation experiment on a network of underactuated vertical takeoff and landing aircraft with strong input coupling. The experiment highlights two advantages of DRLE: i) it does not require the dynamic model to be known, and ii) it is an order of magnitude faster than the state-dependent Riccati equation (SDRE) baseline.
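Only the measurement-fusion ingredient is sketched below: a discrete consensus iteration that drives each node's weighted-measurement estimate toward the network average, standing in for the paper's high-pass consensus filters. The ring graph, noise level, and step size are assumptions, and the critic learning mechanism is omitted entirely.

    import numpy as np

    rng = np.random.default_rng(1)
    truth = 3.0                           # scalar quantity to estimate
    A = np.array([[0., 1., 0., 1.],       # 4-node ring graph adjacency
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [1., 0., 1., 0.]])
    z = truth + 0.3 * rng.standard_normal(4)   # noisy local measurements
    x = z.copy()                               # per-node estimates
    eps = 0.2                                  # consensus step size

    for _ in range(50):
        x = x + eps * (A @ x - A.sum(axis=1) * x)   # x <- x - eps * L x
    print("node estimates:", np.round(x, 4), "truth:", truth)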
This article introduces the state-of-the-art development of adaptive dynamic programming and reinforcement learning (ADPRL). First, algorithms in reinforcement learning (RL) are introduced and their roots in dynamic programming are identified. Adaptive dynamic programming (ADP) is then introduced following a brief discussion of dynamic programming. Research in ADP and RL has enjoyed the fast developments of the past decade, from algorithms, to convergence and optimality analyses, and to stability results. Several key steps in the recent theoretical developments of ADPRL are mentioned together with some future perspectives. In particular, convergence and optimality results of value iteration and policy iteration are reviewed, followed by an introduction to the most recent results on stability analysis of value iteration algorithms.
This study solves a finite-horizon optimal control problem for linear systems with parametric uncertainties and bounded perturbations. The control solution accounts for the uncertain part of the system in the sub-optimal control design by proposing a min-max problem solved through a dynamic neural programming approximate solution. The structure of the neural network is proposed so as to satisfy the characteristics of the value function, including positivity and continuity. The impact of the bounded perturbation on the Hamiltonian maximization is analyzed in detail. The explicit learning law used to adjust the weights is obtained directly from the approximate solution of the Hamilton-Jacobi-Bellman (HJB) equation. The weight adjustment in the proposed algorithm is based on an online state-dependent Riccati-like equation. A numerical simulation illustrates the results of the sub-optimal algorithm, including a comparison against the classical linear regulator designed for the non-perturbed system. (C) 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
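The Riccati-like adjustment can be related to the standard finite-horizon recursion sketched below for an assumed nominal discrete-time pair (A, B); the paper's version is state dependent, runs online, and embeds the min-max handling of uncertainties, all of which is omitted here.

    import numpy as np

    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # assumed nominal dynamics
    B = np.array([[0.0], [0.1]])
    Q, R, P = np.eye(2), np.array([[1.0]]), np.eye(2)   # P starts at P_N
    N = 50                                   # horizon length (assumed)

    gains = []
    for _ in range(N):                       # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains.reverse()                          # gains[0] applies at t = 0

    x = np.array([[1.0], [0.0]])
    for K in gains:                          # forward rollout, u = -K x
        x = A @ x - B @ (K @ x)
    print("terminal state norm:", float(np.linalg.norm(x)))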