Search Results - Inner Mongolia University Library

32nd Chinese Control And Decision Conference (CCDC)

Authors: Zhou, Tianmin; Hou, Jiaxu; Li, Handong; Di, Zengru; Zhao, Bo. Affiliation: Beijing Normal Univ, Sch Syst Sci, Beijing 100875, Peoples R China

ISBN (print): 9781728158556

This paper is concerned with neuro-control for continuous-time nonlinear systems subject to stochastic disturbance. Because of the stochastic disturbance, the traditional value function used in the existing literature cannot handle stochastic control problems, since mixed second partial derivatives are needed to construct a modified value function based on conditional expectation. To solve the Hamilton-Jacobi-Bellman equation, a novel online policy iteration algorithm with an Ito correction term is developed, in which a critic neural network is established to approximate the optimal value function. Thus, the online optimal control can be obtained in closed-loop form. The closed-loop system is guaranteed to be stable in probability via Lyapunov's direct method. Finally, a numerical example is provided to illustrate the effectiveness of the developed control method.

Keywords: adaptive dynamic programming; reinforcement learning; policy iteration; stochastic; nonlinear
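For orientation, the role of an Ito correction term in the abstract above can be seen in a generic stochastic Hamilton-Jacobi-Bellman equation. This is a textbook-style sketch, not the paper's own formulation: the dynamics, the quadratic running cost with weights Q and R, and every symbol below are assumptions introduced here.

```latex
% Assumed dynamics (Ito SDE), with w a standard Wiener process:
%   dx = (f(x) + g(x) u) dt + h(x) dw
% Assumed cost, taken to be finite:
%   J = E[ \int_0^\infty (x^T Q x + u^T R u) dt ]
\begin{align}
0 &= \min_{u}\Big[ x^{\top} Q x + u^{\top} R u
      + \nabla V^{*}(x)^{\top}\big(f(x) + g(x)\,u\big)
      + \tfrac{1}{2}\operatorname{tr}\!\big(h(x)^{\top}\,\nabla^{2}V^{*}(x)\,h(x)\big) \Big], \\
u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top}\, \nabla V^{*}(x).
\end{align}
```

The trace term comes from Ito's lemma and is absent in the deterministic case. It involves the full Hessian of the value function, which is where mixed second partial derivatives enter.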


Event-Triggered Optimal Neuro-Controller Design With Reinforcement Learning for Unknown Nonlinear Systems

IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2019, vol. 49, no. 9, pp. 1866-1878

Authors: Yang, Xiong; He, Haibo; Liu, Derong. Affiliations: Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA; Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China; Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China

This paper develops an optimal control scheme for continuous-time unknown nonlinear systems using the event-triggering mechanism. Unlike a controller designed with the time-triggering mechanism, the event-triggered controller is updated only when the system state deviates by more than a certain threshold from a prescribed value. To obtain the event-triggered optimal controller, we develop an identifier-critic architecture under the framework of reinforcement learning. The identifier network, composed of a feedforward neural network (FNN), aims to learn the unknown system dynamics, and the critic network, also composed of an FNN, is used to derive the event-triggered optimal controller. The identifier network is tuned via the combination of a standard back-propagation algorithm and an e-modification method, and the critic network is updated using a modification of the gradient descent method. By introducing an additional stability term to update the critic network, the initial admissible control is no longer required. Meanwhile, by using historical and instantaneous state data together, the persistence of excitation condition is relaxed. A stability analysis of the closed-loop system is provided based on the Lyapunov method. The effectiveness of the proposed designs is illustrated through simulations of a nonlinear example and a single-link robot arm system.

Keywords: adaptive dynamic programming (ADP); event-triggered control; neural networks (NNs); nonlinear systems; optimal control; reinforcement learning (RL)
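The event-triggering idea in the abstract above (update the control only when the state has drifted past a threshold) can be illustrated with a minimal sketch. It is not the paper's identifier-critic design: the scalar linear plant, the fixed feedback gain, the fixed threshold, and all names and values are assumptions chosen for illustration.

```python
def simulate_event_triggered(a=1.0, b=1.0, gain=3.0, threshold=0.05,
                             x0=1.0, dt=0.001, steps=5000):
    """Simulate x' = a*x + b*u with u = -gain * (last sampled state).

    All parameters are hypothetical; the plant is open-loop unstable (a > 0).
    """
    x = x0
    x_held = x0              # state sample the controller is currently using
    u = -gain * x_held
    events = 1               # the initial sample counts as the first event
    for _ in range(steps):
        # Event condition: gap between the true and the held state is too large.
        if abs(x - x_held) > threshold:
            x_held = x
            u = -gain * x_held
            events += 1
        x += dt * (a * x + b * u)   # forward-Euler integration step
    return x, events


final_state, n_events = simulate_event_triggered()
print(f"final |x| = {abs(final_state):.4f}, control updates = {n_events} of 5000 steps")
```

With a fixed threshold the state is only driven into a small neighbourhood of the origin rather than exactly to zero, while the control is recomputed on a small fraction of the integration steps.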


Adaptive Optimal Control via Continuous-Time Q-Learning for Unknown Nonlinear Affine Systems

58th IEEE Conference on Decision and Control (CDC)

Authors: Chen, Anthony Siming; Herrmann, Guido. Affiliations: Univ Bristol, Dept Mech Engn, Bristol BS8 1QU, Avon, England; Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England

ISBN (digital): 9781728113982

ISBN (print): 9781728113982

This paper proposes two novel adaptive optimal control algorithms for continuous-time nonlinear affine systems based on reinforcement learning: i) generalised policy iteration (GPI) and ii) Q-learning. As a result, the a priori knowledge of the system drift f(x) is not needed via GPI, which gives us a partially model-free and online solution. We then for the first time extend the idea of Q-learning to the nonlinear continuous time optimal control problem in a noniterative manner. This leads to a completely model-free method where neither the system drift f(x) nor the input gain g(x) is needed. For both methods, the adaptive critic and actor are continuously and simultaneously updating each other without iterative steps, which effectively avoids the hybrid structure and the need for an initial stabilising control policy. Moreover, finite-time convergence is guaranteed by using a sliding mode technique in the new adaptive approach, where the persistent excitation (PE) condition can be directly verified online. We also prove the overall Lyapunov stability and demonstrate the effectiveness of the proposed algorithms using numerical examples.

Keywords: adaptive optimal control; Q-learning; non-linear systems; reinforcement learning; approximate dynamic programming; adaptive critic


Adaptive Data Replication Optimization Based on Reinforcement Learning

IEEE Symposium Series on Computational Intelligence (SSCI)

Authors: Chee Keong Wee; Richi Nayak. Affiliations: Digital Application Services Business Applications Technology Services, eHealth Queensland, Queensland, Australia; School of Electrical Engineering & Computer Science, Science & Engineering Faculty, Queensland University of Technology, Brisbane, Queensland, Australia

ISBN (digital): 9781728125473

ISBN (print): 9781728125480

Data replication plays an important role in enterprise IT landscapes, where data is shared among multiple IT systems. IT administrators need to tune the replicating software's configuration setting for it to perform at its optimum level. It is a challenge to continue optimizing the software's configuration to keep up with the fluctuating workload in a dynamic business environment. We propose a novel approach of using reinforcement learning with meta-heuristics to create an adaptive optimization method for data replication software. The experimental results show the replicating software managed by the proposed approach can perform at an optimum level despite consistently working under changing workloads.

Keywords: Software; Business; Throughput; Reinforcement learning; Tuning; Machine learning algorithms; Adaptation models


Reinforcement Q-Learning Incorporated With Internal Model Method for Output Feedback Tracking Control of Unknown Linear Systems

IEEE Access, 2020, vol. 8, pp. 134456-134467

Authors: Chen, Cong; Sun, Weijie; Zhao, Guangyue; Peng, Yunjian. Affiliation: South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510640, Peoples R China

This paper investigates the output feedback (OPFB) tracking control problem for discrete-time linear (DTL) systems with unknown dynamics. Using an augmented-system approach, the tracking control problem is first turned into a regulation problem with a discounted performance function, the solution of which relies on the Q-function based Bellman equation. Then, a novel value iteration (VI) scheme based on a reinforcement Q-learning mechanism is proposed for solving the Q-function Bellman equation without knowing the system dynamics. Moreover, the convergence of the VI-based Q-learning is proved by showing that it converges to the solution of the Q-function Bellman equation and that it introduces no bias in the solution, even under probing noise satisfying the persistent excitation (PE) condition. As a result, the OPFB tracking controller can be learned online by using the past input, output, and reference trajectory data of the augmented system. The proposed scheme removes the requirement of an initial admissible policy in the policy iteration (PI) method. Finally, the effectiveness of the proposed scheme is demonstrated through a simulation example.

Keywords: Mathematical model; Heuristic algorithms; Optimal control; System dynamics; Trajectory; Linear systems; Convergence; adaptive dynamic programming (ADP); optimal control; Bellman equation; on-policy; internal model
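The value-iteration recursion on a Q-function mentioned in the abstract above can be sketched for the simplest possible case. This sketch is model-based and scalar, whereas the paper's scheme is data-driven and uses output feedback, so it only illustrates the Bellman recursion itself. The plant x_next = a*x + b*u, the stage cost q*x^2 + r*u^2, the discount factor gamma, and all names and values are assumptions.

```python
def q_value_iteration(a, b, q, r, gamma, iterations=500):
    """Value iteration on the Q-function kernel of a scalar discounted LQ problem.

    Q_j(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2 and V_j(x) = p*x^2.
    """
    p = 0.0  # start from V_0 = 0, so no initial admissible policy is needed
    for _ in range(iterations):
        # Q_{j+1}(x, u) = q*x^2 + r*u^2 + gamma * V_j(a*x + b*u)
        h_xx = q + gamma * a * a * p
        h_xu = gamma * a * b * p
        h_uu = r + gamma * b * b * p
        # Minimising over u gives u = -(h_xu / h_uu) * x and the next value kernel.
        p = h_xx - h_xu * h_xu / h_uu
    gain = h_xu / h_uu
    return p, gain


# Hypothetical open-loop unstable plant (|a| > 1).
p_star, k_star = q_value_iteration(a=1.1, b=1.0, q=1.0, r=1.0, gamma=0.9)
print(f"value kernel p = {p_star:.4f}, feedback gain k = {k_star:.4f}")
print(f"closed-loop pole a - b*k = {1.1 - 1.0 * k_star:.4f}")
```

Starting the recursion from a zero value function is what lets value iteration drop the initial stabilising policy that policy iteration needs.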


Adaptive Critic Designs for Event-Triggered Robust Control of Nonlinear Systems With Unknown Dynamics

IEEE Transactions on Cybernetics, 2019, vol. 49, no. 6, pp. 2255-2267

Authors: Yang, Xiong; He, Haibo. Affiliations: Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China; Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA

This paper develops a novel event-triggered robust control strategy for continuous-time nonlinear systems with unknown dynamics. To begin with, the event-triggered robust nonlinear control problem is transformed into an event-triggered nonlinear optimal control problem by introducing an infinite-horizon integral cost for the nominal system. Then, a recurrent neural network (RNN) and adaptive critic designs (ACDs) are employed to solve the derived event-triggered nonlinear optimal control problem. The RNN is applied to reconstruct the system dynamics based on collected system data. After acquiring the knowledge of system dynamics, a unique critic network is proposed to obtain the approximate solution of the event-triggered Hamilton-Jacobi-Bellman equation within the framework of ACDs. The critic network is updated using historical and instantaneous state data simultaneously. An advantage of the present critic network update law is that it can relax the persistence of excitation condition. Meanwhile, under a newly developed event-triggering condition, the proposed critic network tuning rule not only guarantees that the critic network weights converge to their optimal values but also ensures that the nominal system states are uniformly ultimately bounded. Moreover, by using the Lyapunov method, it is proved that the derived optimal event-triggered control (ETC) guarantees uniform ultimate boundedness of all the signals in the original system. Finally, a nonlinear oscillator and an unstable power system are provided to validate the developed robust ETC scheme.

Keywords: adaptive critic designs (ACDs); adaptive dynamic programming (ADP); event-triggered control (ETC); neural networks (NNs); reinforcement learning (RL); robust control


Reinforcement Learning for Robotic Safe Control with Force Sensing

2nd World Robot Conference (WRC) / Symposium on Advanced Robotics and Automation (WRC SARA)

Authors: Lin, Nan; Zhang, Linrui; Chen, Yuxuan; Zhu, Yujun; Chen, Ruoxi; Wu, Peichen; Chen, Xiaoping. Affiliations: Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230026M, Peoples R China; Sch Informat Sci & Technol China, Hefei 230026, Peoples R China

ISBN (print): 9781728155524

For tasks with complicated manipulation in unstructured environments, traditional hand-coded methods are ineffective, while reinforcement learning can provide a more general and useful policy. Although reinforcement learning is able to obtain impressive results, its stability and reliability are hard to guarantee, which can cause potential safety threats. In addition, the transfer from simulation to the real world can lead to unpredictable situations. To enhance the safety and reliability of robots, we introduce force and haptic perception into reinforcement learning. Force and tactile sensation play key roles in robotic dynamic control and human-robot interaction. We demonstrate that the force-based reinforcement learning method can be more adaptive to the environment, especially in sim-to-real transfer. Experimental results show that, in an object pushing task, our strategy is safer and more efficient in both simulation and the real world; thus it holds promise for a wide variety of robotic applications.

Keywords: reinforcement learning


Adaptive Slope Locomotion with Deep Reinforcement Learning

IEEE/SICE International Symposium on System Integration

Authors: William Jones; Tamir Blum; Kazuya Yoshida. Affiliation: Space Robotics Laboratory of the Department of Aerospace Engineering, Graduate School of Engineering, Tohoku University, Sendai, Japan

ISBN (digital): 9781728166674

ISBN (print): 9781728166681

In this paper we present a model-free, deep reinforcement learning-based approach to the motion planning problem of a quadruped moving from a flat to an inclined plane. In our implementation, we do not provide any prior information about the location of the inclined plane, nor do we pass any vision data during the training process. With this approach, we train a 12-degree-of-freedom quadruped robot to traverse up and down a variety of simulated sloped environments, in the process demonstrating that deep reinforcement learning is able to generate highly dynamic and adaptable solutions.

Keywords: Legged locomotion; Learning (artificial intelligence); Animals; Computational modeling; Neural networks; Logic gates


Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

IEEE Intelligent Transportation Systems Conference (IEEE-ITSC)

Authors: Lin, Yuan; McPhee, John; Azad, Nasser L. Affiliation: Univ Waterloo, Syst Design Engn Dept, Waterloo, ON N2L 3G1, Canada

ISBN (print): 9781538670248

The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) utilize point-mass kinematic models, neglecting vehicle dynamics, which include acceleration delay and acceleration command dynamics. The acceleration delay, which results from sensing and actuation delays, leads to delayed execution of the control inputs. The acceleration command dynamics dictate that the actual vehicle acceleration does not rise to the desired command acceleration instantaneously. In this work, we investigate the feasibility of applying DRL controllers trained using vehicle kinematic models to more realistic driving control with vehicle dynamics. We consider a particular longitudinal car-following control problem, namely Adaptive Cruise Control (ACC), solved via DRL using a point-mass kinematic model. When such a controller is applied to car following with vehicle dynamics, we observe significantly degraded car-following performance. Therefore, we redesign the DRL framework to accommodate the acceleration delay and acceleration command dynamics by adding the delayed control inputs and the actual vehicle acceleration, respectively, to the reinforcement learning environment state. The training results show that the redesigned DRL controller achieves near-optimal car-following performance with vehicle dynamics considered, when compared with dynamic programming solutions.

Keywords: Acceleration
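The state augmentation described in the abstract above (adding the delayed control inputs and the actual acceleration to the environment state) can be sketched as follows. The abstract does not give the model equations, so the pure transport delay of a fixed number of steps, the first-order lag with time constant tau, and every name and value below are assumptions made for illustration.

```python
from collections import deque


class LaggedAcceleration:
    """Hypothetical longitudinal model: command delay plus first-order lag."""

    def __init__(self, delay_steps=2, tau=0.5, dt=0.1):
        self.tau = tau      # assumed lag time constant [s]
        self.dt = dt        # simulation step [s]
        # Commands that have been issued but not yet applied (oldest first).
        self.pending = deque([0.0] * delay_steps)
        self.accel = 0.0    # actual acceleration [m/s^2]
        self.speed = 0.0    # speed [m/s]

    def step(self, command):
        # The new command joins the queue; the oldest one reaches the actuator.
        self.pending.append(command)
        delayed = self.pending.popleft()
        # First-order lag: the actual acceleration moves toward the delayed command.
        self.accel += self.dt / self.tau * (delayed - self.accel)
        self.speed += self.dt * self.accel
        return self.observation()

    def observation(self):
        # Augmented state: kinematics, actual acceleration, and pending commands.
        return (self.speed, self.accel, *self.pending)


model = LaggedAcceleration()
for k in range(5):
    print(k, model.step(1.0))
```

Including the pending commands and the actual acceleration in the observation gives a learning agent the information it would need to account for the delay and lag, which a purely kinematic state would hide.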


Neural Network Tracking Control of Unknown Servo System with Approximate Dynamic Programming

38th Chinese Control Conference (CCC)

Authors: Lv, Yongfeng; Ren, Xuemei; Zeng, Tianyi; Li, Linwei; Na, Jing. Affiliations: Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China; Kunming Univ Sci & Technol, Fac Mech & Elect Engn, Kunming 650500, Yunnan, Peoples R China

ISBN (print): 9789881563972

Although the adaptive dynamic programming (ADP) scheme has been widely studied for optimal control problems in recent years, it has not been applied to servo systems. In this paper, a simplified reinforcement learning (RL) based ADP scheme is developed to obtain the optimal tracking control of the servo system, where the unknown system dynamics are approximated with a three-layer neural network (NN) identifier. First, the servo system model is constructed and a three-layer NN identifier is used to approximate the unknown servo system. The NN weights of both the hidden layer and the output layer are synchronously tuned with an adaptive gradient law. An RL-based critic NN is then used to learn the optimal cost function, and its weights are updated by minimizing the squared Hamilton-Jacobi-Bellman (HJB) error. The optimal tracking control of the servomechanism is obtained based on the three-layer NN identifier and the RL scheme, which makes the motor speed track the predefined command. Moreover, the convergence of the identifier and NN weights is proved. Finally, a servomechanism model is provided to illustrate the proposed methods.

Keywords: reinforcement learning; adaptive dynamic programming; Optimal Control; Neural Networks; Servomechanisms
