Search Results - Inner Mongolia University Library

International Joint Conference on Neural Networks (IJCNN)

Authors: Zhao, Qiming; Xu, Hao; Jagannathan, S. (Missouri Univ S&T, Dept Elect & Comp Engn, Rolla, MO 65409, USA)

ISBN: (print) 9781467314909

In this paper, the Bellman equation is used to solve the stochastic optimal control of unknown linear discrete-time system with communication imperfections including random delays, packet losses and quantization. A dynamic quantizer for the sensor measurements is proposed which essentially provides system states to the controller. To eliminate the effect of the quantization error, the dynamics of the quantization error bound and an update law for tuning its range are derived. Subsequently, by using adaptive dynamic programming technique, the infinite horizon optimal regulation of the uncertain NCS is solved in a forward-in-time manner without using value and/or policy iterations by using Q-function and reinforcement learning. The asymptotic stability of the closed-loop system is verified by standard Lyapunov stability theory. Finally, the effectiveness of the proposed method is verified by simulation results.

Keywords: Networked Control System; Quantization; adaptive dynamic programming; Optimal Control

Learning Control of a Bioreactor System Using Kernel-based Heuristic Dynamic Programming

10th World Congress on Intelligent Control and Automation (WCICA)

Authors: Lian, Chuanqiang; Xu, Xin; Zuo, Lei; Huang, Zhenhua (Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China)

ISBN: (print) 9781467313988

To solve the learning control problem of a bioreactor system, a novel framework of heuristic dynamic programming (HDP) with sparse kernel machines is presented, which integrates kernel methods into critic learning of HDP. As a class of adaptive critic designs (ACDs), HDP has been used to realize online learning control of dynamical systems, where neural networks are commonly employed to approximate the value functions or policies. However, there are still some difficulties in the design and implementation of HDP such as that the learning efficiency and convergence of HDP greatly rely on the empirical design of the critic and so on. In this paper, by using the sparse kernel machines, Kernel HDP (KHDP) is proposed and its performance is analyzed both theoretically and empirically. Due to the representation learning and nonlinear approximation ability of sparse kernel machines, KHDP can obtain better performance than previous HDP method with manually designed neural networks. Simulation results demonstrate the effectiveness of the proposed method.

Keywords: heuristic dynamic programming; reinforcement learning; kernel machines; bioreactor; Markov decision processes

Achieving Quality of Service with Adaptation-based Programming for Medium Access Protocols

2012 IEEE Global Communications Conference, GLOBECOM 2012

Authors: Zhu, Pingan; Pinto, Jervis; Nguyen, Thinh; Fern, A. (School of Electrical Engineering and Computer Science, Oregon State University, Corvallis 97331, United States)

ISBN: (print) 9781467309219

Designing network protocols that work well under a variety of network conditions typically involves a large amount of manual tuning and guesswork, particularly when choosing dynamic update strategies for numeric parameters. The situation is made more complex by adding the Quality of Service (QoS) requirements to a network protocol. A fundamentally different approach for designing protocols is via reinforcement learning (RL) algorithms which allow protocols to be automatically optimized through network simulation. However, getting RL to work well in practice requires considerable expertise and carries a significant implementation overhead. To help overcome this challenge, recent work has developed the programming paradigm of Adaptation-Based Programming (ABP), which allows programmers who are not RL-experts to write self-optimizing 'adaptive programs'. In this work, we study the potential of applying ABP to the problem of designing network protocols via simulation. We demonstrate the flexibility of our design method via a number of case studies, each of which investigates the performance of an adaptive program written for the backoff mechanism of the MAC layer in the 802.11 standard. Our results show that the learned protocols typically outperform 802.11 on a number of evaluation metrics and network conditions. © 2012 IEEE.

Keywords: reinforcement learning

Reinforcement Learning Control Based on Multi-Goal Representation Using Hierarchical Heuristic Dynamic Programming

International Joint Conference on Neural Networks (IJCNN)

Authors: Ni, Zhen; He, Haibo; Zhao, Dongbin; Prokhorov, Danil V. (Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100864, Peoples R China; Toyota Res Inst NA TTC, Ann Arbor, MI 48105, USA)

ISBN: (print) 9781467314909

We are interested in developing a multi-goal generator to provide detailed goal representations that help to improve the performance of the adaptive critic design (ACD). In this paper we propose a hierarchical structure of goal generator networks to cascade external reinforcement into more informative internal goal representations in the ACD. This is in contrast with previous designs in which the external reward signal is assigned to the critic network directly. The ACD control system performance is evaluated on the ball-and-beam balancing benchmark under noise-free and various noisy conditions. Simulation results in the form of a comparative study demonstrate effectiveness of our approach.

Keywords: reinforcement learning

Dynamic Switching and Real-time Machine Learning for Improved Human Control of Assistive Biomedical Robots

4th IEEE RAS and EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob) / Symposium on Surgical Robotics

Authors: Pilarski, Patrick M.; Dawson, Michael R.; Degris, Thomas; Carey, Jason P.; Sutton, Richard S. (Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada; Univ Alberta, Dept Mech Engn, Edmonton, AB T6G 2G8, Canada)

ISBN: (print) 9781457712005

A general problem for human-machine interaction occurs when a machine's controllable dimensions outnumber the control channels available to its human user. In this work, we examine one prominent example of this problem: amputee switching between the multiple functions of a powered artificial limb. We propose a dynamic switching approach that learns during ongoing interaction to anticipate user behaviour, thereby presenting the most effective control option for a given context or task. Switching predictions are learned in real time using temporal difference methods and reinforcement learning, and demonstrated within the context of a robotic arm and a multi-function myoelectric controller. We find that a learned, dynamic switching order is able to out-perform the best fixed (non-adaptive) switching regime on a standard prosthetic proficiency task, increasing the number of optimal switching suggestions by 23%, and decreasing the expected transition time between degrees of freedom by more than 14%. These preliminary results indicate that real-time machine learning, specifically online prediction and anticipation, may be an important tool for developing more robust and intuitive controllers for assistive biomedical robots. We expect these techniques will transfer well to near-term use by patients. Future work will describe clinical testing of this approach with a population of amputee patients.

Keywords: Controllers

2012 IEEE Multi-Conference on Systems and Control, MSC 2012

2012 6th IEEE Multi-Conference on Systems and Control, MSC 2012

ISBN: (print) 9781467345989

The proceedings contain 41 papers. The topics discussed include: local asymptotic convergence of a cycle-free persistent formation of double-integrators in three-dimensional space; several performance measures for the obstacle detection of an overlapped ultrasonic sensor ring; fuel flow control of a PEM fuel cell with MPPT; control of inter-agent distances in cyclic polygon formations; an online integral reinforcement learning algorithm to solve N-player Nash games; adaptive reduced-order control of discrete repetitive processes with iteration-varying reference signals; optimal iterative learning control with uncertain reference points; a dynamic T-S fuzzy systems identification algorithm based on sparsity regularization; optimal design of an overlapped ultrasonic sensor ring using a new composite design index; and robust fault isolation using dynamically extended observers.

2012 IEEE Southwest Symposium on Image Analysis and Interpretation, SSIAI 2012, Proceedings

2012 IEEE Southwest Symposium on Image Analysis and Interpretation, SSIAI 2012

ISBN: (print) 9781467318303

The proceedings contain 53 papers. The topics discussed include: adaptive kernel learning for detection of clustered microcalcifications in mammograms; using segmentation in CT metal artifact reduction; automated nuclei tracking in C. elegans based on spherical model fitting with multiple target tracking; a hybrid watershed method for cell image segmentation; cell splitting using dynamic programming; combining multiple visual processing streams for locating and classifying objects in video; automated detection of dust clouds and sources in NOAA-AVHRR satellite imagery; integrated multiple behavior models for abnormal crowd behavior detection; detection of spectrally sparse anomalies in hyperspectral imagery; a novel background subtraction method to detect microcalcifications; a conservative scene model update policy; illumination-invariant representation for natural color images and its application; and curvature oriented clustering of sparse motion vector fields.

Reinforcement Learning Controller Design for Affine Nonlinear Discrete-Time Systems Using Online Approximators

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2012, Vol. 42, No. 2, pp. 377-390

Authors: Yang, Qinmin; Jagannathan, Sarangapani (Zhejiang Univ, Dept Control Sci & Engn, State Key Lab Ind Control Technol, Hangzhou 310027, Zhejiang, Peoples R China; Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65409, USA)

In this paper, reinforcement learning state- and output-feedback-based adaptive critic controller designs are proposed by using the online approximators (OLAs) for a general multi-input and multioutput affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. The proposed controller design has two entities, an action network that is designed to produce optimal signal and a critic network that evaluates the performance of the action network. The critic estimates the cost-to-go function which is tuned online using recursive equations derived from heuristic dynamic programming. Here, neural networks (NNs) are used both for the action and critic whereas any OLAs, such as radial basis functions, splines, fuzzy logic, etc., can be utilized. For the output-feedback counterpart, an additional NN is designated as the observer to estimate the unavailable system states, and thus, separation principle is not required. The NN weight tuning laws for the controller schemes are also derived while ensuring uniform ultimate boundedness of the closed-loop system using Lyapunov theory. Finally, the effectiveness of the two controllers is tested in simulation on a pendulum balancing system and a two-link robotic arm system.

Keywords: adaptive critic; dynamic programming (DP); Lyapunov method; neural networks (NNs); online approximators (OLAs); online learning; reinforcement learning

Reinforcement Learning and Approximate Dynamic Programming for Feedback Control

Series: IEEE Press Series on Computational Intelligence

Year: 2012

Authors: Lewis, Frank L.; Liu, Derong

ISBN: (digital) 9781118453988

ISBN: (print) 9781118104200

Reinforcement learning (RL) and adaptive dynamic programming (ADP) has been one of the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human engineered systems, covering both single player decision and control and multi-player games. Edited by the pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides an important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.

Automated and Agile Server Parameter Tuning with Learning and Control

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Yanfei Guo; Palden Lama; Xiaobo Zhou (Department of Computer Science, University of Colorado, Colorado Springs, USA)

Server parameter tuning in virtualized data centers is crucial to performance and availability of hosted Internet applications. It is challenging due to high dynamics and burstiness of workloads, multi-tier service architecture, and virtualized server infrastructure. In this paper, we investigate automated and agile server parameter tuning for maximizing effective throughput of multi-tier Internet applications. A recent study proposed a reinforcement learning based server parameter tuning approach for minimizing average response time of multi-tier applications. Reinforcement learning is a decision making process determining the parameter tuning direction based on trial-and-error, instead of quantitative values for agile parameter tuning. It relies on a predefined adjustment value for each tuning action. However, it is nontrivial or even infeasible to find an optimal value under highly dynamic and bursty workloads. We design a neural fuzzy control based approach that combines the strengths of fast online learning and self-adaptiveness of neural networks and fuzzy control. Due to the model independence, it is robust to highly dynamic and bursty workloads. It is agile in server parameter tuning due to its quantitative control outputs. We implement the new approach on a test bed of virtualized HP ProLiant blade servers hosting RUBiS benchmark applications. Experimental results demonstrate that the new approach significantly outperforms the reinforcement learning based approach for both improving effective system throughput and minimizing average response time.

Keywords: Servers; Tuning; Time factors; Fuzzy control; Throughput; Neurons; Learning
