ISBN:
(Print) 1424407060
The proceedings contain 49 papers. The topics discussed include: fitted Q iteration with CMACs; reinforcement-learning-based magneto-hydrodynamic control of hypersonic flows; a novel fuzzy reinforcement learning approach in two-level intelligent control of 3-DOF robot manipulators; knowledge transfer using local features; particle swarm optimization adaptive dynamic programming; discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof; dual representations for dynamic programming and reinforcement learning; an optimal ADP algorithm for a high-dimensional stochastic control problem; convergence of model-based temporal difference learning for control; the effect of bootstrapping in multi-automata reinforcement learning; and a theoretical analysis of cooperative behavior in multi-agent Q-learning.
ISBN:
(Print) 9781424407064
This paper describes backpropagation through an LSTM recurrent neural network model/critic for reinforcement learning tasks in partially observable domains. This combines LSTM's strength at learning long-term temporal dependencies, used to infer state in partially observable tasks, with the ability to learn high-dimensional and/or continuous actions through backpropagation's focused credit-assignment mechanism.
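The action-selection idea can be sketched as follows. This is a minimal illustration that substitutes a hypothetical closed-form quadratic critic for the paper's LSTM network, so the gradient-ascent action search stays self-contained; the critic shape, learning rate, and step count are assumptions, not the paper's settings:

```python
# Sketch: selecting actions by following the critic's action gradient,
# the role backpropagation plays through the LSTM critic in the paper.
def critic(s, a):
    # Hypothetical critic Q(s, a) = -(a - 2s)^2, maximized at a = 2s.
    return -(a - 2.0 * s) ** 2

def critic_grad_a(s, a):
    # dQ/da: the quantity backpropagation would deliver through a network.
    return -2.0 * (a - 2.0 * s)

def select_action(s, a0=0.0, lr=0.1, steps=100):
    """Gradient-ascent action search starting from an initial action a0."""
    a = a0
    for _ in range(steps):
        a += lr * critic_grad_a(s, a)
    return a

print(select_action(s=1.5))  # approaches the critic's maximizer 2*s = 3.0
```

With a neural critic, `critic_grad_a` would be replaced by a backward pass with respect to the action input, which is what makes continuous, high-dimensional actions tractable.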
ISBN:
(Print) 9781424407064
This paper proposes an approximate dynamic programming strategy for responsive traffic signal control. It is the first attempt to optimize the signal control objective dynamically through adaptive approximation of the value function. The proposed value function approximation is separable and independent of exogenous factors. The algorithm updates the approximate value function progressively during operation while preserving the structural properties of the control problem. The convergence and performance of the algorithm have been tested in a range of experiments. The results show that the new strategy matches the best existing control strategies while remaining computationally efficient and simple. It also has the potential to be extended to multi-phase signal control at isolated junctions and to decentralized network operation.
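A separable value function of the form V(s) ≈ Σᵢ vᵢ·sᵢ, with one slope per queue updated progressively from observations, might be sketched as follows; the slope-update rule, the noise model, and all constants here are illustrative assumptions rather than the paper's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
n_queues = 4
slopes = np.zeros(n_queues)   # estimated marginal delay per queued vehicle
alpha = 0.1                   # stochastic-approximation step size

for _ in range(1000):
    # Each update observes a noisy marginal delay around a true value of 2.0
    # (a stand-in for measurements gathered during signal operation).
    observed = 2.0 + rng.normal(0.0, 0.5, size=n_queues)
    slopes += alpha * (observed - slopes)   # progressive, in-operation update

state = np.array([3, 1, 0, 2])   # current queue lengths per movement
print(float(slopes @ state))     # separable value, roughly 2.0 * 6 = 12
```

Separability keeps each update cheap: one slope per queue, with no coupling terms to refit when a single measurement arrives.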
ISBN:
(Print) 9781424407064
We are interested in finding the most effective combination of off-line and on-line/real-time training in approximate dynamic programming. We introduce our approach of combining proven off-line methods of training for robustness with a group of on-line methods. Training for robustness is carried out on reasonably accurate models with the multi-stream Kalman filter method [1], whereas on-line adaptation is performed either with the help of a critic or by methods resembling reinforcement learning. We also illustrate the importance of using recurrent neural networks for both the controller/actor and the critic.
ISBN:
(Print) 9781424407064
We describe an approach to reducing the curse of dimensionality in deterministic dynamic programming with continuous actions by randomly sampling actions while computing a steady-state value function and policy. This approach yields globally optimized actions without searching over a discretized multidimensional grid. We present results on finding time-invariant control laws for two-, four-, and six-dimensional deterministic swing-up problems with up to 480 million discretized states.
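The random-action-sampling idea can be sketched on a toy one-dimensional problem rather than the swing-up tasks; the dynamics, cost, discount, and grid sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
xs = np.linspace(-1.0, 1.0, 41)   # discretized state grid
V = np.zeros_like(xs)             # value function over the grid
gamma = 0.9

def interp_V(x):
    # Linear interpolation of the value function between grid points.
    return np.interp(x, xs, V)

for sweep in range(200):          # value-iteration sweeps to steady state
    new_V = np.empty_like(V)
    for i, x in enumerate(xs):
        # Sample continuous actions instead of enumerating an action grid.
        actions = rng.uniform(-0.2, 0.2, size=32)
        nx = np.clip(x + actions, -1.0, 1.0)        # deterministic dynamics
        costs = x**2 + 0.1 * actions**2 + gamma * interp_V(nx)
        new_V[i] = costs.min()                      # best sampled action
    V = new_V

print(float(interp_V(0.0)), float(interp_V(1.0)))   # near zero at the goal
```

Because fresh actions are drawn at every backup, good actions are eventually found at every state without ever forming a multidimensional action grid, which is the source of the dimensionality savings.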
ISBN:
(Print) 9781424407064
We propose a provably optimal approximate dynamic programming algorithm for a class of multistage stochastic problems, taking into account that the probability distribution of the underlying stochastic process is not known and that the state space is too large to be explored entirely. The algorithm and its proof of convergence rely on the fact that the optimal value functions of the problems within this class are concave and piecewise linear. The algorithm combines Monte Carlo simulation, pure exploitation, stochastic approximation, and a projection operation. Several applications, in areas such as energy, control, inventory, and finance, fall under this framework.
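The structural property the proof relies on, a concave piecewise-linear value function maintained as the lower envelope of tangent cuts, can be sketched as follows; the target function and sample points are illustrative stand-ins, not a problem from the paper:

```python
# Sketch: represent a concave value function as min over affine cuts.
def f(x):
    # True concave value function (known here only for illustration).
    return -(x - 2.0) ** 2

def f_grad(x):
    return -2.0 * (x - 2.0)

cuts = []   # list of (intercept, slope) pairs defining affine cuts
for p in [0.0, 1.0, 2.0, 3.0, 4.0]:   # Monte Carlo sampling would pick these
    slope = f_grad(p)
    # Tangent cut at p: f(p) + slope * (x - p), an upper bound on f by concavity.
    cuts.append((f(p) - slope * p, slope))

def V_hat(x):
    # Piecewise-linear approximation: lower envelope of the collected cuts.
    return min(a + b * x for a, b in cuts)

print(V_hat(2.0), V_hat(0.5))   # exact at sampled points, >= f elsewhere
```

Each simulated trajectory adds a cut, and the envelope tightens monotonically toward the true concave value function, which is the mechanism behind this style of convergence proof.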
ISBN:
(Print) 9781424407064
In this paper, we suggest and analyze the use of approximate reinforcement learning techniques for a new category of challenging benchmark problems from the field of Operations Research. We demonstrate that interpreting and solving the task of job-shop scheduling as a multi-agent learning problem is beneficial for obtaining near-optimal solutions and competes well with alternative solution approaches. The evaluation of our algorithms focuses on numerous established Operations Research benchmark problems.
ISBN:
(Print) 9781424407064
Considerable research has been done on reinforcement learning in continuous environments, but research on problems where the actions can also be chosen from a continuous space is much more limited. We present a new class of algorithms, named Continuous Actor Critic Learning Automaton (CACLA), that can handle continuous states and actions. The resulting algorithm is straightforward to implement. An experimental comparison is made between this algorithm and other algorithms that can handle continuous action spaces. These experiments show that CACLA performs much better than the other algorithms, especially when combined with a Gaussian exploration method.
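The core CACLA update, moving the actor toward an executed action only when the temporal-difference error is positive, can be sketched on a single-state continuous-action problem; the reward function, constants, and episodic simplification are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
actor = 0.0                   # actor output: mean action (single-state task)
V = 0.0                       # critic: value estimate for that state
alpha, beta, sigma = 0.1, 0.1, 0.3

for _ in range(2000):
    a = actor + rng.normal(0.0, sigma)   # Gaussian exploration around actor
    r = -(a - 0.7) ** 2                  # reward, maximal at action a = 0.7
    delta = r - V                        # TD error (episodic, no next state)
    V += beta * delta                    # the critic always learns
    if delta > 0:                        # CACLA rule: move the actor only
        actor += alpha * (a - actor)     # toward better-than-expected actions

print(round(actor, 2))                   # drifts toward the optimum 0.7
```

Updating only on positive TD errors is what distinguishes CACLA from gradient-style actor-critic methods: the magnitude of the error is ignored, and only the sign decides whether the explored action pulls the actor.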
There are fundamental difficulties in using only a supervised learning philosophy to predict short-term movements of financial stocks. We present a reinforcement-oriented forecasting framework in which the solution is c...