Search Results - Inner Mongolia University Library

Adaptive fault-tolerant control for affine non-linear systems based on approximate dynamic programming

IET CONTROL THEORY AND APPLICATIONS, 2016, Vol. 10, No. 6, pp. 655-663

Authors: Fan, Quan-Yong; Yang, Guang-Hong. Affiliations: Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China; Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Liaoning, Peoples R China

This study investigates the fault-tolerant control problem for affine nonlinear systems with time-varying actuator gain and bias faults. In order to handle the actuator faults and guarantee the approximate optimal performance of the nominal non-linear dynamics, the approximate dynamic programming method is used to design a sliding mode fault-tolerant control policy. First, the actuator faults are estimated using a disturbance observer and a novel adaptive scheme. Based on the fault estimations, an integral sliding function is constructed and the reachability condition is derived. Then, an actor-critic algorithm with new weight tuning laws is given to learn the bounded nearly optimal control policy for the nominal dynamics. The convergence of the neural network weights is presented based on a Lyapunov analysis method. Finally, the simulation results are given to verify the efficacy of the developed method.

Keywords: adaptive control; fault tolerant control; affine transforms; nonlinear control systems; dynamic programming; time-varying systems; actuators; nonlinear dynamical systems; variable structure systems; observers; reachability analysis; optimal control; neurocontrollers; Lyapunov methods; adaptive fault-tolerant control problem; affine nonlinear systems; time-varying actuator gain; bias faults; approximate optimal performance; nonlinear dynamics; approximate dynamic programming method; sliding mode fault-tolerant control policy; actuator faults; disturbance observer; integral sliding function; reachability condition; actor-critic algorithm; optimal control policy; neural network weights; Lyapunov analysis method


Actor-Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems


IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, Vol. 26, No. 1, pp. 140-151

Authors: Kiumarsi, Bahare; Lewis, Frank L. Affiliation: Univ Texas Arlington, UTA Res Inst, Ft Worth, TX 76118, USA

This paper presents a partially model-free adaptive optimal control solution to the deterministic nonlinear discrete-time (DT) tracking control problem in the presence of input constraints. The tracking error dynamics and reference trajectory dynamics are first combined to form an augmented system. Then, a new discounted performance function based on the augmented system is presented for the optimal nonlinear tracking problem. In contrast to the standard solution, which finds the feedforward and feedback terms of the control input separately, the minimization of the proposed discounted performance function gives both feedback and feedforward parts of the control input simultaneously. This enables us to encode the input constraints into the optimization problem using a nonquadratic performance function. The DT tracking Bellman equation and tracking Hamilton-Jacobi-Bellman (HJB) equation are derived. An actor-critic-based reinforcement learning algorithm is used to learn the solution to the tracking HJB equation online without requiring knowledge of the system drift dynamics. That is, two neural networks (NNs), namely, actor NN and critic NN, are tuned online and simultaneously to generate the optimal bounded control policy. A simulation example is given to show the effectiveness of the proposed method.

Keywords: actor-critic algorithm; discrete-time (DT) nonlinear optimal tracking; input constraints; neural network (NN); reinforcement learning (RL)
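
The construction described in this abstract can be sketched in generic notation. This is only an illustration under stated assumptions, not the paper's own formulation: the symbols $x_k$ (system state), $r_k$ (reference), $e_k$ (tracking error), $X_k$ (augmented state), $U$ (stage cost, which the abstract says is nonquadratic in the input), and $\gamma$ (discount factor) are all introduced here and do not come from the record.

```latex
\begin{align}
X_k &= \begin{bmatrix} e_k \\ r_k \end{bmatrix}, \qquad e_k = x_k - r_k, \\
V(X_k) &= \sum_{i=k}^{\infty} \gamma^{\,i-k}\, U(X_i, u_i), \qquad 0 < \gamma \le 1, \\
V(X_k) &= U(X_k, u_k) + \gamma\, V(X_{k+1}).
\end{align}
```

The last line is the Bellman recursion obtained by splitting the first term off the discounted sum. Because the cost is defined on the augmented state, a single minimization over $u_k$ would yield the whole control input rather than separate feedback and feedforward terms.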


An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes


SYSTEMS & CONTROL LETTERS, 2010, Vol. 59, No. 12, pp. 760-766

Author: Bhatnagar, Shalabh. Affiliation: Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India

We develop in this article the first actor-critic reinforcement learning algorithm with function approximation for a problem of control under multiple inequality constraints. We consider the infinite horizon discounted cost framework in which both the objective and the constraint functions are suitable expected policy-dependent discounted sums of certain sample path functions. We apply the Lagrange multiplier method to handle the inequality constraints. Our algorithm makes use of multi-timescale stochastic approximation and incorporates a temporal difference (TD) critic and an actor that makes a gradient search in the space of policy parameters using efficient simultaneous perturbation stochastic approximation (SPSA) gradient estimates. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal policy. (C) 2010 Elsevier B.V. All rights reserved.

Keywords: Constrained Markov decision processes; Infinite horizon discounted cost criterion; Function approximation; actor-critic algorithm; Simultaneous perturbation stochastic approximation
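
The Lagrange multiplier approach mentioned in this abstract can be written out in a standard form. This is a hedged sketch in generic notation, not the article's own: the single-stage cost $c$, constraint costs $g_i$, thresholds $\alpha_i$, discount factor $\gamma$, and number of constraints $N$ are symbols assumed here for illustration.

```latex
\begin{align}
J(\pi) &= \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, c(s_k, a_k)\right], \qquad
G_i(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, g_i(s_k, a_k)\right] \le \alpha_i,
\quad i = 1, \dots, N, \\
L(\pi, \lambda) &= J(\pi) + \sum_{i=1}^{N} \lambda_i \bigl(G_i(\pi) - \alpha_i\bigr), \qquad \lambda_i \ge 0 .
\end{align}
```

Under this relaxation the constrained problem becomes a search for a saddle point: the policy parameters descend on $L$ while the multipliers $\lambda_i$ ascend, which is one natural reading of why the abstract refers to several timescales.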


Transfer Reinforcement Learning Framework for Energy Saving in Next Generation Wireless Networks


Author: Shreyata Sharma. Affiliation: Indraprastha Institute of Information Technology Delhi

Degree level: Master's

The recent upsurge in data-intensive applications over wireless communication networks is stimulating rapid expansion of such networks and thus presenting new research challenges pertaining to their efficient deployment. In present communication networks, the increased traffic load compels network operators to expand their networks by deploying a large number of base stations (BSs) and access points (APs). Studies have reported that a major portion of energy consumption occurs at the access network entities. This means that the massive data traffic is being served at the expense of increased carbon footprint and huge energy consumption. Therefore, energy saving has emerged as one of the major aspects in such data-intensive and high-traffic communication networks. Considering this, the energy-efficient operation of BSs and APs has become a major research problem, and it is taken up in this thesis for the case of wireless networks. In this work, the research aim of energy saving has been considered for both cellular BSs and Wi-Fi APs to cover the major part of the wireless communication networks. An actor-critic (AC) reinforcement learning (RL) framework is used to enable traffic-based ON/OFF switching of BSs and APs. Furthermore, previously estimated traffic statistics are exploited through transfer learning to further improve energy savings and speed up the learning process. Herein, this novel approach is used for three cases: realization of a transfer learning framework for Wi-Fi networks, implementation of a three-state RL-based BS switching scheme for existing cellular networks, and application of RL in heterogeneous networks (HetNets) consisting of macro and femto BSs. The use of a practical scenario and real-time data collected from the institute's Wi-Fi network to validate the adopted scheme is an important feature of this study. The superiority of the proposed framework is depicted through simulations and relevant mathematic

Keywords: Reinforcement learning; transfer learning; actor-critic algorithm; energy efficient access networks; moderate traffic; heterogeneous networks


A Transfer Learning Framework for Energy Efficient Wi-Fi Networks and Performance Analysis Using Real Data


IEEE International Conference on Advanced Networks and Telecommunications Systems

Authors: Shreyata Sharma; S. J. Darak; Anand Srivastava; Honggang Zhang. Affiliations: Department of Electronics and Communication Engineering, IIIT Delhi; College of Information Science & Electronic Engineering (ISEE), Zhejiang University

ISBN: (print) 9781509021949

In the recent past, there has been an exponential increase in data-intensive services over communication networks. This trend is expected to continue in future communication networks as well, especially in Wi-Fi networks. This can be attributed to the rapid growth of business and institutional entities and the need for cellular data off-loading, for which localized Wi-Fi networks are preferred due to their higher offered data rate. In such networks, a major portion of energy consumption occurs at the access network entities, making energy-efficient operation of Wi-Fi access points (APs) extremely crucial. In this paper, an actor-critic (AC) reinforcement learning (RL) framework is designed to enable traffic-based ON/OFF switching of APs in a Wi-Fi network. Furthermore, previously estimated traffic statistics are exploited in future scenarios, which speeds up the learning process and provides additional improvement in energy saving. The important feature of the present study is the validation of the proposed framework on real data collected from an institute's Wi-Fi network. The simulation results for 20 APs of a Wi-Fi network show that the proposed framework can lead to around 75% saving in energy consumption as compared to the case when AP switching is not considered.

Keywords: Reinforcement learning; Energy saving in Wi-Fi networks; Transfer learning; actor-critic algorithm; Wi-Fi; Transfer (Psychology) Learning networks(communications); observational data; learning (artificial intelligence); Apus; Energy efficiency


TACT: A Transfer Actor-Critic Learning Framework for Energy Saving in Cellular Radio Access Networks


IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2014, Vol. 13, No. 4, pp. 2000-2011

Authors: Li, Rongpeng; Zhao, Zhifeng; Chen, Xianfu; Palicot, Jacques; Zhang, Honggang. Affiliations: Zhejiang Univ, Dept Informat Sci & Elect Engn, Hangzhou 310027, Zhejiang, Peoples R China; Univ Europeenne Bretagne, Rennes, France; Supelec, F-35576 Cesson Sevigne, France; VTT Tech Res Ctr Finland, FI-90571 Oulu, Finland

Recent works have validated the possibility of improving energy efficiency in radio access networks (RANs), achieved by dynamically turning on/off some base stations (BSs). In this paper, we extend the research over BS switching operations, which should match up with traffic load variations. Instead of depending on the dynamic traffic loads which are still quite challenging to precisely forecast, we firstly formulate the traffic variations as a Markov decision process. Afterwards, in order to foresightedly minimize the energy consumption of RANs, we design a reinforcement learning framework based BS switching operation scheme. Furthermore, to speed up the ongoing learning process, a transfer actor-critic algorithm (TACT), which utilizes the transferred learning expertise in historical periods or neighboring regions, is proposed and provably converges. In the end, we evaluate our proposed scheme by extensive simulations under various practical configurations and show that the proposed TACT algorithm contributes to a performance jumpstart and demonstrates the feasibility of significant energy efficiency improvement at the expense of tolerable delay performance.

Keywords: Radio access networks; base stations; sleeping mode; green communications; energy saving; reinforcement learning; transfer learning; actor-critic algorithm


A Neuro-fuzzy Learning System for Adaptive Swarm Behaviors Dealing with Continuous State Space


4th International Conference on Intelligent Computing

Authors: Kuremoto, Takashi; Obayashi, Masanao; Kobayashi, Kunikazu; Adachi, Hirotaka; Yoneda, Kentaro. Affiliations: Yamaguchi Univ, Grad Sch Sci & Engn, Tokiwadai 2-16-1, Yamaguchi 7558611, Japan; Fac Sci & Engn, Yamaguchi, Japan

ISBN: (print) 9783540859833

Swarm intelligence has brought a new paradigm for function optimization, structural optimization, multi-agent systems and other fields of study. In our previous work, we proposed a neuro-fuzzy system using a reinforcement learning algorithm (an actor-critic method with a TD error learning algorithm) to acquire optimized swarm behaviors. This paper improves the conventional learning system, which only deals with discrete state and action spaces, so that a swarm can learn and obtain adaptive behaviors in a continuous state space. The improved system adopts a new action policy function that can yield continuous actions corresponding to continuous states. The effectiveness of the proposed system is investigated by computer simulations of the goal-exploration problem in a wider range of environments.

Keywords: neuro-fuzzy net; swarm behavior; reinforcement learning; multi-agent system; actor-critic algorithm; goal-exploration problem


The actor-critic algorithm as multi-time-scale stochastic approximation


SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 1997, Vol. 22, No. 4, pp. 525-543

Authors: Borkar, VS; Konda, VR. Affiliation: Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India

The actor-critic algorithm of Barto and others for simulation-based optimization of Markov decision processes is cast as a two-time-scale stochastic approximation. Convergence analysis, approximation issues and an exa...

Keywords: actor-critic algorithm; stochastic approximation; Markov decision processes; simulation-based algorithms; policy iteration
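
The two-time-scale idea named in this record can be illustrated with a minimal tabular sketch. It is not the algorithm analysed in the paper: the two-state toy MDP, the function names `softmax` and `train`, and the constant step sizes are all assumptions made here for illustration. The sketch follows the common arrangement in which the critic uses the larger step size and the actor the smaller one; convergence analyses of this kind typically use diminishing step sizes instead of the constants used below.

```python
import math
import random


def softmax(prefs):
    """Return softmax probabilities for a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=20000, gamma=0.9, critic_lr=0.1, actor_lr=0.02, seed=0):
    """Tabular actor-critic on a hypothetical 2-state, 2-action MDP.

    Toy dynamics (an assumption for illustration only): the reward equals
    the chosen action (0 or 1) and the next state equals the chosen action.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0]                # critic: state-value estimates
    prefs = [[0.0, 0.0], [0.0, 0.0]]   # actor: action preferences per state
    state = 0
    for _ in range(steps):
        probs = softmax(prefs[state])
        action = 0 if rng.random() < probs[0] else 1
        reward, next_state = float(action), action
        # The temporal-difference error drives both updates.
        td_error = reward + gamma * values[next_state] - values[state]
        # Critic update with the larger step size (fast timescale).
        values[state] += critic_lr * td_error
        # Actor update with the smaller step size (slow timescale),
        # using the gradient of the log-softmax policy.
        for a in range(2):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[state][a] += actor_lr * td_error * grad
        state = next_state
    policy = [softmax(p) for p in prefs]
    return policy, values


if __name__ == "__main__":
    policy, values = train()
    print("policy:", policy)
    print("values:", values)
```

In this toy problem action 1 always pays more, so the learned policy should come to favor action 1 in both states, while the critic's estimates stay between 0 and 1 / (1 - gamma).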
