ISBN: 9781467314909 (print)
In this paper, the Bellman equation is used to solve the stochastic optimal control problem for an unknown linear discrete-time system with communication imperfections, including random delays, packet losses, and quantization. A dynamic quantizer for the sensor measurements is proposed, which essentially provides the system states to the controller. To eliminate the effect of the quantization error, the dynamics of the quantization error bound and an update law for tuning its range are derived. Subsequently, using an adaptive dynamic programming technique based on the Q-function and reinforcement learning, the infinite-horizon optimal regulation of the uncertain networked control system (NCS) is solved in a forward-in-time manner without value or policy iterations. The asymptotic stability of the closed-loop system is verified by standard Lyapunov stability theory. Finally, the effectiveness of the proposed method is verified by simulation results.
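For orientation, the Q-function idea can be written out for the simpler deterministic, delay-free case. This is the textbook linear-quadratic form, not the paper's stochastic formulation. All symbols are assumed here and do not appear in the abstract: $x_{k+1} = A x_k + B u_k$ is the plant, $x_k^\top S x_k + u_k^\top R u_k$ is the stage cost, $V(x) = x^\top P x$ is the optimal value function, and $z_k = [x_k^\top \; u_k^\top]^\top$ stacks state and input.

```latex
\begin{aligned}
Q(x_k, u_k) &= x_k^\top S x_k + u_k^\top R u_k + V(x_{k+1}) \\
 &= \begin{bmatrix} x_k \\ u_k \end{bmatrix}^\top
    \begin{bmatrix} S + A^\top P A & A^\top P B \\ B^\top P A & R + B^\top P B \end{bmatrix}
    \begin{bmatrix} x_k \\ u_k \end{bmatrix}
  = z_k^\top H z_k, \\
u_k^\ast &= -H_{uu}^{-1} H_{ux}\, x_k .
\end{aligned}
```

In this simplified setting the optimal gain depends only on the blocks $H_{uu}$ and $H_{ux}$, so estimating $H$ from measured data gives a controller without explicit knowledge of $A$ and $B$.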
ISBN: 9781467313988 (print)
To solve the learning control problem of a bioreactor system, a novel framework of heuristic dynamic programming (HDP) with sparse kernel machines is presented, which integrates kernel methods into the critic learning of HDP. As a class of adaptive critic designs (ACDs), HDP has been used to realize online learning control of dynamical systems, where neural networks are commonly employed to approximate the value functions or policies. However, difficulties remain in the design and implementation of HDP; for example, its learning efficiency and convergence rely heavily on the empirical design of the critic. In this paper, by using sparse kernel machines, Kernel HDP (KHDP) is proposed and its performance is analyzed both theoretically and empirically. Owing to the representation learning and nonlinear approximation ability of sparse kernel machines, KHDP can obtain better performance than previous HDP methods with manually designed neural networks. Simulation results demonstrate the effectiveness of the proposed method.
Designing network protocols that work well under a variety of network conditions typically involves a large amount of manual tuning and guesswork, particularly when choosing dynamic update strategies for numeric param...
ISBN: 9781467314909 (print)
We are interested in developing a multi-goal generator that provides detailed goal representations to improve the performance of the adaptive critic design (ACD). In this paper, we propose a hierarchical structure of goal generator networks that cascades external reinforcement into more informative internal goal representations in the ACD. This is in contrast with previous designs, in which the external reward signal is assigned to the critic network directly. The ACD control system performance is evaluated on the ball-and-beam balancing benchmark under noise-free and various noisy conditions. Simulation results in the form of a comparative study demonstrate the effectiveness of our approach.
ISBN: 9781457712005 (print)
A general problem for human-machine interaction occurs when a machine's controllable dimensions outnumber the control channels available to its human user. In this work, we examine one prominent example of this problem: amputee switching between the multiple functions of a powered artificial limb. We propose a dynamic switching approach that learns during ongoing interaction to anticipate user behaviour, thereby presenting the most effective control option for a given context or task. Switching predictions are learned in real time using temporal difference methods and reinforcement learning, and are demonstrated within the context of a robotic arm and a multi-function myoelectric controller. We find that a learned, dynamic switching order is able to outperform the best fixed (non-adaptive) switching regime on a standard prosthetic proficiency task, increasing the number of optimal switching suggestions by 23% and decreasing the expected transition time between degrees of freedom by more than 14%. These preliminary results indicate that real-time machine learning, specifically online prediction and anticipation, may be an important tool for developing more robust and intuitive controllers for assistive biomedical robots. We expect these techniques will transfer well to near-term use by patients. Future work will describe clinical testing of this approach with a population of amputee patients.
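As a rough illustration of what learning predictions with temporal difference methods involves, the following is a minimal tabular TD(0) sketch in Python. It is a generic textbook update, not the method used in the paper. The three-context loop, the reward signal, and the names `td0_update`, `alpha`, and `gamma` are hypothetical choices made only for this example.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical setting: three contexts visited in a fixed loop 0 -> 1 -> 2 -> 0,
# with a reward of 1 whenever context 2 is entered (an assumption for illustration).
values = [0.0, 0.0, 0.0]
state = 0
for _ in range(5000):
    next_state = (state + 1) % 3
    reward = 1.0 if next_state == 2 else 0.0
    td0_update(values, state, reward, next_state)
    state = next_state

# Each entry now approximates the discounted future reward from that context.
print([round(v, 3) for v in values])
```

The update is applied after every transition, so the predictions are refined online while interaction continues. The context immediately before the rewarded transition ends up with the highest predicted value.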
ISBN: 9781467345989 (print)
The proceedings contain 41 papers. The topics discussed include: local asymptotic convergence of a cycle-free persistent formation of double-integrators in three-dimensional space; several performance measures for the obstacle detection of an overlapped ultrasonic sensor ring; fuel flow control of a PEM fuel cell with MPPT; control of inter-agent distances in cyclic polygon formations; an online integral reinforcement learning algorithm to solve N-player Nash games; adaptive reduced-order control of discrete repetitive processes with iteration-varying reference signals; optimal iterative learning control with uncertain reference points; a dynamic T-S fuzzy systems identification algorithm based on sparsity regularization; optimal design of an overlapped ultrasonic sensor ring using a new composite design index; and robust fault isolation using dynamically extended observers.
ISBN: 9781467318303 (print)
The proceedings contain 53 papers. The topics discussed include: adaptive kernel learning for detection of clustered microcalcifications in mammograms; using segmentation in CT metal artifact reduction; automated nuclei tracking in C. elegans based on spherical model fitting with multiple target tracking; a hybrid watershed method for cell image segmentation; cell splitting using dynamic programming; combining multiple visual processing streams for locating and classifying objects in video; automated detection of dust clouds and sources in NOAA-AVHRR satellite imagery; integrated multiple behavior models for abnormal crowd behavior detection; detection of spectrally sparse anomalies in hyperspectral imagery; a novel background subtraction method to detect microcalcifications; a conservative scene model update policy; illumination-invariant representation for natural color images and its application; and curvature oriented clustering of sparse motion vector fields.
In this paper, reinforcement learning state- and output-feedback-based adaptive critic controller designs are proposed by using online approximators (OLAs) for general multi-input and multi-output affine unknown nonlinear discrete-time systems in the presence of bounded disturbances. The proposed controller design has two entities: an action network that is designed to produce an optimal signal and a critic network that evaluates the performance of the action network. The critic estimates the cost-to-go function, which is tuned online using recursive equations derived from heuristic dynamic programming. Here, neural networks (NNs) are used for both the action and the critic, although any OLAs, such as radial basis functions, splines, or fuzzy logic, can be utilized. For the output-feedback counterpart, an additional NN is designated as the observer to estimate the unavailable system states, and thus the separation principle is not required. The NN weight tuning laws for the controller schemes are also derived while ensuring uniform ultimate boundedness of the closed-loop system using Lyapunov theory. Finally, the effectiveness of the two controllers is tested in simulation on a pendulum balancing system and a two-link robotic arm system.
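One common way to write the recursion against which a heuristic dynamic programming critic is tuned is sketched below. The discount factor $\gamma$, the stage cost $r$, and the critic estimate $\hat{J}$ are assumed notation for illustration; the paper's own cost definition and tuning laws may differ.

```latex
\begin{aligned}
J(x_k) &= r(x_k, u_k) + \gamma\, J(x_{k+1}), \\
e_c(k) &= \hat{J}(x_k) - \bigl[ r(x_k, u_k) + \gamma\, \hat{J}(x_{k+1}) \bigr].
\end{aligned}
```

In schemes of this kind, the critic weights are adjusted online to drive $e_c(k)$ toward zero, while the action network is adjusted to reduce the estimated cost-to-go.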
ISBN: 9781118453988 (digital)
ISBN: 9781118104200 (print)
Reinforcement learning (RL) and adaptive dynamic programming (ADP) have been among the most critical research fields in science and engineering for modern complex systems. This book describes the latest RL and ADP techniques for decision and control in human-engineered systems, covering both single-player decision and control and multi-player games. Edited by the pioneers of RL and ADP research, the book brings together ideas and methods from many fields and provides important and timely guidance on controlling a wide variety of systems, such as robots, industrial processes, and economic decision-making.
Server parameter tuning in virtualized data centers is crucial to the performance and availability of hosted Internet applications. It is challenging due to the high dynamics and burstiness of workloads, the multi-tier service architecture, and the virtualized server infrastructure. In this paper, we investigate automated and agile server parameter tuning for maximizing the effective throughput of multi-tier Internet applications. A recent study proposed a reinforcement learning based server parameter tuning approach for minimizing the average response time of multi-tier applications. Reinforcement learning is a decision-making process that determines the parameter tuning direction by trial and error, rather than producing quantitative values for agile parameter tuning. It relies on a predefined adjustment value for each tuning action. However, it is nontrivial or even infeasible to find an optimal value under highly dynamic and bursty workloads. We design a neural fuzzy control based approach that combines the fast online learning and self-adaptiveness of neural networks with the strengths of fuzzy control. Because it is model-independent, it is robust to highly dynamic and bursty workloads. It is agile in server parameter tuning due to its quantitative control outputs. We implement the new approach on a test bed of virtualized HP ProLiant blade servers hosting RUBiS benchmark applications. Experimental results demonstrate that the new approach significantly outperforms the reinforcement learning based approach in both improving effective system throughput and minimizing average response time.