ISBN (print): 9781479945535
The proceedings contain 42 papers. The topics discussed include: approximate real-time optimal control based on sparse Gaussian process models; subspace identification for predictive state representation by nuclear norm minimization; active learning for classification: an optimistic approach; convergent reinforcement learning control with neural networks and continuous action search; theoretical analysis of a reinforcement learning based switching scheme; an analysis of optimistic, best-first search for minimax sequential decision making; information-theoretic stochastic optimal control via incremental sampling-based algorithms; policy gradient approaches for multi-objective sequential decision making: a comparison; and cognitive control in cognitive dynamic systems: a new way of thinking inspired by the brain.
ISBN (print): 9781424498888
The proceedings contain 45 papers. The topics discussed include: active learning for personalizing treatment; active exploration by searching for experiments that falsify the computed control policy; optimistic planning for sparsely stochastic systems; adaptive sample collection using active learning for kernel-based approximate policy iteration; tree-based variable selection for dimensionality reduction of large-scale control systems; high-order local dynamic programming; safe reinforcement learning in high-risk tasks through policy improvement; agent self-assessment: determining policy quality without execution; reinforcement learning algorithms for solving classification problems; reinforcement learning in multidimensional continuous action spaces; grounding subgoals in information transitions; and directed exploration of policy space using support vector classifiers.
ISBN (print): 9781467359252
The proceedings contain 28 papers. The topics discussed include: local stability analysis of high-order recurrent neural networks with multi-step piecewise linear activation functions; finite-horizon optimal control design for uncertain linear discrete-time systems; adaptive optimal control for nonlinear discrete-time systems; optimal control for a class of nonlinear system with controller constraints based on finite-approximation-errors ADP algorithm; finite horizon stochastic optimal control of uncertain linear networked control system; real-time tracking on adaptive critic design with uniformly ultimately bounded condition; a novel approach for constructing basis functions in approximate dynamic programming for feedback control; and a combined hierarchical reinforcement learning based approach for multi-robot cooperative target searching in complex unknown environments.
ISBN (print): 9781424427611
The proceedings contain 34 papers. The topics discussed include: a unified framework for temporal difference methods;efficient data reuse in value function approximation;constrained optimal control of affine nonlinear discrete-time systems using GHJB method;algorithm and stability of ATC receding horizon control;online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem;real-time motor control using recurrent neural networks;hierarchical optimal control of a 7-DOF Arm model;coupling perception and action using minimax optimal control;a convergent recursive least squares policy iteration algorithm for multi-dimensional Markov Decision Process with continuous state and action spaces;basis function adaptation methods for cost approximation in MDP;and executing concurrent actions with multiple Markov Decision Processes.
ADPRL 2011 is the third IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning. The area of approximate dynamic programming and reinforcement learning is a fusion of a number of research areas in engineering, mathematics, artificial intelligence, operations research, and systems and control theory. This symposium brings together researchers from different disciplines and provides a remarkable opportunity for the academic and industrial communities to address new challenges, share innovative yet practical solutions, and define promising future research directions.
ISBN (print): 1424407060
The proceedings contain 49 papers. The topics discussed include: fitted Q iteration with CMACs; reinforcement-learning-based magneto-hydrodynamic control of hypersonic flows; a novel fuzzy reinforcement learning approach in two-level intelligent control of 3-DOF robot manipulators; knowledge transfer using local features; particle swarm optimization adaptive dynamic programming; discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof; dual representations for dynamic programming and reinforcement learning; an optimal ADP algorithm for a high-dimensional stochastic control problem; convergence of model-based temporal difference learning for control; the effect of bootstrapping in multi-automata reinforcement learning; and a theoretical analysis of cooperative behavior in multi-agent Q-learning.
Wireless Sensor Networks (WSNs) play a pivotal role in enabling Internet of Things (IoT) devices with sensing and actuation capabilities. Operating in remote and resource-constrained environments, these IoT devices fac...
ISBN (print): 9781479945528
An online reinforcement learning algorithm is proposed in this paper that directly and efficiently utilizes online data for continuous deterministic systems without known system parameters. Dependence on specific approximation structures limits the wide applicability of online reinforcement learning algorithms; we remove this limitation by utilizing the online data directly through the kd-tree technique. Moreover, we design the algorithm under the Probably Approximately Correct (PAC) principle. Two simulated examples verify its good performance.
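The abstract does not specify how the kd-tree is used; a minimal sketch, assuming nearest-neighbor value estimation over stored online samples (all names and numbers here are illustrative, not from the paper), looks like this:

```python
import math

class KDNode:
    """Node of a simple kd-tree holding one sample point and its value."""
    def __init__(self, point, value, axis):
        self.point, self.value, self.axis = point, value, axis
        self.left = self.right = None

def kd_insert(node, point, value, depth=0):
    """Insert (point, value) into the subtree at node; return the new root."""
    if node is None:
        return KDNode(point, value, depth % len(point))
    if point[node.axis] < node.point[node.axis]:
        node.left = kd_insert(node.left, point, value, depth + 1)
    else:
        node.right = kd_insert(node.right, point, value, depth + 1)
    return node

def kd_nearest(node, query, best=None):
    """Return (distance, value) of the stored point nearest to query."""
    if node is None:
        return best
    d = math.dist(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.value)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = kd_nearest(near, query, best)
    if abs(diff) < best[0]:  # splitting plane closer than current best: check far side
        best = kd_nearest(far, query, best)
    return best

# Store observed (state, action) samples with their estimated returns, then
# answer value queries by nearest neighbor instead of a fixed basis expansion.
root = None
for sample, q_value in [((0.0, 0.0), 1.0), ((1.0, 1.0), 2.0), ((0.2, 0.1), 3.0)]:
    root = kd_insert(root, sample, q_value)
```

A real implementation would average over several nearest neighbors and interleave queries with exploration; this sketch only shows how raw online data can stand in for a parametric approximation structure.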
ISBN (print): 9781479945528
In this paper, an approximate optimal control method based on adaptive dynamic programming (ADP) is discussed for completely unknown nonlinear systems. An online critic-action-identifier algorithm is developed using neural networks, where the critic and action networks approximate the optimal value function and optimal control, and two further neural networks approximate the unknown system. Furthermore, adaptive tuning laws are given based on a Lyapunov approach, which ensures uniformly ultimately bounded stability of the closed-loop system. Finally, the effectiveness is demonstrated by a simulation example.
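The abstract names the critic-action-identifier structure but not the tuning laws. As a hedged sketch (not the paper's method), the neural networks can be replaced by linear-in-parameter counterparts on a scalar plant: a normalized-gradient identifier estimates the unknown dynamics from online data, and a critic iteration on the identified model then yields the actor gain. All plant numbers and step sizes below are illustrative:

```python
import random

random.seed(0)
a_true, b_true = 0.9, 0.5   # "unknown" plant x_{t+1} = a*x + b*u, used only to simulate
a_hat = b_hat = 0.0         # identifier estimates of the dynamics

# Identifier: normalized-gradient (NLMS) step on the one-step prediction error.
x = 0.0
for _ in range(2000):
    u = random.uniform(-1.0, 1.0)              # exploratory input
    x_next = a_true * x + b_true * u
    err = x_next - (a_hat * x + b_hat * u)     # prediction error
    denom = 1.0 + x * x + u * u                # normalization keeps the step stable
    a_hat += 0.5 * err * x / denom
    b_hat += 0.5 * err * u / denom
    x = x_next

# Critic: iterate the scalar Riccati map for stage cost x^2 + u^2 on the
# identified model, so V(x) ~ w * x^2 at the fixed point.
w = 0.0
for _ in range(200):
    w = 1.0 + a_hat**2 * w - (a_hat * b_hat * w) ** 2 / (1.0 + b_hat**2 * w)

# Actor: the gain of the control u = -k * x read off from the critic weight.
k = a_hat * b_hat * w / (1.0 + b_hat**2 * w)
```

The paper's online algorithm tunes all three approximators simultaneously with Lyapunov-based laws; this offline two-stage version only illustrates the division of labor among identifier, critic, and actor.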