Refine Results

Document Type

  • 140 conference papers
  • 7 journal articles

Collection

  • 147 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 71 Engineering
    • 66 Computer Science and Technology...
    • 15 Software Engineering
    • 11 Electrical Engineering
    • 9 Control Science and Engineering
    • 2 Instrument Science and Technology
    • 2 Information and Communication Engineering
    • 1 Mechanics (degrees in engineering, sci...
    • 1 Mechanical Engineering
    • 1 Architecture
  • 11 Science
    • 10 Mathematics
    • 2 Systems Science
    • 2 Statistics (degrees in science,...
  • 5 Management
    • 4 Management Science and Engineering (de...
    • 3 Business Administration
    • 1 Library, Information and Archival Mana...
  • 3 Economics
    • 3 Applied Economics

Topics

  • 76 dynamic programm...
  • 39 learning
  • 26 optimal control
  • 25 reinforcement le...
  • 15 function approxi...
  • 15 control systems
  • 14 approximation al...
  • 14 equations
  • 13 neural networks
  • 13 stochastic proce...
  • 12 convergence
  • 10 state-space meth...
  • 10 cost function
  • 9 mathematical mod...
  • 8 trajectory
  • 8 approximation me...
  • 7 approximate dyna...
  • 7 algorithm design...
  • 7 adaptive control
  • 7 heuristic algori...

Institutions

  • 4 school of inform...
  • 4 department of in...
  • 3 department of el...
  • 3 northeastern uni...
  • 3 univ texas autom...
  • 3 arizona state un...
  • 3 robotics institu...
  • 3 univ illinois de...
  • 2 princeton univ d...
  • 2 national science...
  • 2 college of mecha...
  • 2 key laboratory o...
  • 2 univ utrecht dep...
  • 2 department of op...
  • 1 inria
  • 1 computational le...
  • 1 school of automa...
  • 1 univ cincinnati ...
  • 1 toyota technol c...
  • 1 neuroinformatics...

Authors

  • 5 liu derong
  • 4 xu xin
  • 4 martin riedmille...
  • 4 huaguang zhang
  • 4 marco a. wiering
  • 4 zhang huaguang
  • 4 si jennie
  • 4 derong liu
  • 3 hado van hasselt
  • 3 lewis frank l.
  • 3 dongbin zhao
  • 3 powell warren b.
  • 3 warren b. powell
  • 3 riedmiller marti...
  • 2 manuel loth
  • 2 van hasselt hado
  • 2 preux philippe
  • 2 hu dewen
  • 2 jennie si
  • 2 philippe preux

Language

  • 142 English
  • 5 Other

Search query: "Any field = 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007"
147 records; results 111-120 shown below

Using ADP to Understand and Replicate Brain Intelligence: the Next Level Design
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Author: Paul J. Werbos National Science Foundation Arlington VA USA
Since the 1960s, the author has proposed that we could understand and replicate the highest level of intelligence seen in the brain by building ever more capable and general systems for adaptive dynamic programming (...

Two Novel On-policy Reinforcement Learning Algorithms based on TD(λ)-methods
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering Hado van Hasselt Department of Information and Computing Sciences University of Utrecht Utrecht Netherlands
This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value-function using TD(λ)-methods. The ...

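The QV update rule is easy to state in its simplest form. Below is a minimal tabular sketch of the QV-learning idea the abstract refers to, using TD(0) rather than full TD(λ) traces for brevity; the table sizes, learning rates, and discount factor are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Minimal tabular QV-learning sketch (TD(0) for brevity; the paper uses
# TD(lambda) traces). The state value function V is learned by plain TD,
# and Q bootstraps on V(s') instead of max_a Q(s', a) as in Q-learning.

n_states, n_actions = 10, 2           # illustrative sizes
alpha, beta, gamma = 0.1, 0.2, 0.95   # assumed step sizes and discount

Q = np.zeros((n_states, n_actions))
V = np.zeros(n_states)

def qv_update(s, a, r, s_next, done):
    """One QV-learning step for the transition (s, a, r, s')."""
    target = r if done else r + gamma * V[s_next]
    Q[s, a] += alpha * (target - Q[s, a])  # Q bootstraps on V
    V[s] += beta * (target - V[s])         # ordinary TD(0) update for V
```

ACLA, the paper's second algorithm, instead drives an actor from the sign of the critic's TD error; only the shared value-learning core is sketched here.
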
Model-Based Reinforcement Learning in Factored-State MDPs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Author: Alexander L. Strehl Department of Computer Science Rutgers University Piscataway NJ USA
We consider the problem of learning in a factored-state Markov decision process that is structured to allow a compact representation. We show that the well-known algorithm, factored Rmax, performs near-optimally on al...

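For context, here is a sketch of the flat Rmax principle that the factored variant builds on: state-action pairs visited fewer than m times are treated optimistically, as if they led to a fictitious maximally rewarding state, which is what drives systematic exploration. Everything below (sizes, threshold, reward bound) is illustrative, and the factored representation itself is omitted:

```python
import numpy as np

# Flat Rmax sketch: optimism in the face of uncertainty. Under-visited
# (s, a) pairs are valued as if they reached an absorbing max-reward state.

n_states, n_actions = 5, 2
m, gamma, r_max = 5, 0.95, 1.0                     # visit threshold, discount, reward bound
counts = np.zeros((n_states, n_actions))           # visit counts
r_sum = np.zeros((n_states, n_actions))            # summed rewards
trans = np.zeros((n_states, n_actions, n_states))  # transition counts

def record(s, a, r, s_next):
    counts[s, a] += 1
    r_sum[s, a] += r
    trans[s, a, s_next] += 1

def optimistic_q(n_sweeps=200):
    """Value iteration on the empirical model, optimistic where data is thin."""
    q = np.zeros((n_states, n_actions))
    for _ in range(n_sweeps):
        v = q.max(axis=1)
        for s in range(n_states):
            for a in range(n_actions):
                if counts[s, a] < m:                 # unknown: optimistic value
                    q[s, a] = r_max / (1.0 - gamma)
                else:                                # known: empirical estimates
                    p = trans[s, a] / counts[s, a]
                    q[s, a] = r_sum[s, a] / counts[s, a] + gamma * p @ v
    return q
```

Factored Rmax keeps the same optimism but learns the transition model factor by factor over a dynamic Bayesian network, which is what makes the representation compact.
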
Efficient Learning in Cellular Simultaneous Recurrent Neural Networks - The Case of Maze Navigation Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Roman Ilin Robert Kozma Paul J. Werbos Department of Mathematical Sciences University of Memphis Memphis TN USA National Science Foundation Arlington VA USA
Cellular simultaneous recurrent neural networks (SRN) show great promise in solving complex function approximation problems. In particular, approximate dynamic programming is an important application area where SRNs h...

A Theoretical Analysis of Cooperative Behavior in Multi-agent Q-learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ludo Waltman Uzay Kaymak Erasmus University Rotterdam Rotterdam Netherlands
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This paper provides a theore...

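The setting can be made concrete with a toy experiment: two independent, stateless Q-learners repeatedly playing a prisoner's-dilemma-style matrix game. The payoff numbers, exploration rate, and step size below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Prisoner's dilemma payoffs for the row player: rows = my action
# (0 = cooperate, 1 = defect), columns = opponent's action.
PAYOFF = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

alpha, eps = 0.1, 0.1
q = [np.zeros(2), np.zeros(2)]  # one stateless Q-vector per agent

def choose(qi):
    return int(rng.integers(2)) if rng.random() < eps else int(qi.argmax())

for t in range(50_000):
    a0, a1 = choose(q[0]), choose(q[1])
    r0, r1 = PAYOFF[a0, a1], PAYOFF[a1, a0]
    q[0][a0] += alpha * (r0 - q[0][a0])  # stateless Q-learning updates
    q[1][a1] += alpha * (r1 - q[1][a1])

print("agent 0:", q[0], "agent 1:", q[1])
```

With these payoffs defection dominates, so independent learners typically settle on mutual defection; whether and when cooperation can nonetheless emerge is exactly the kind of question such a theoretical analysis addresses.
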
Using Reward-weighted Regression for Reinforcement Learning of Task Space Control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Jan Peters Stefan Schaal University of Southern California Los Angeles CA USA
Many robot control problems of practical importance, including task or operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or rein...

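The computational core of reward-weighted regression is a weighted least-squares fit of the policy parameters, with each sampled action weighted by (a transformation of) the reward it earned. A minimal numpy sketch under simplifying assumptions (linear policy features, weights given directly; in the paper the weighting is derived from an EM-style bound on expected reward):

```python
import numpy as np

rng = np.random.default_rng(0)

# Reward-weighted regression sketch: fit linear policy parameters theta by
# weighted least squares, weighting each (features, executed action) pair
# by its nonnegative reward weight. All shapes and data here are synthetic.

n, d = 500, 4
Phi = rng.normal(size=(n, d))  # state features for n sampled steps
u = rng.normal(size=n)         # executed (exploratory) actions
w = rng.random(size=n)         # reward-derived weights, assumed given

# theta = argmin_theta  sum_i w_i * (u_i - phi_i @ theta)^2
W = np.diag(w)
theta = np.linalg.solve(Phi.T @ W @ Phi, Phi.T @ W @ u)
```

High-reward actions thus pull the fitted policy toward themselves, turning the control problem into a sequence of supervised regressions.
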
Dynamic optimization of the strength ratio during a terrestrial conflict
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Alexandre Sztykgold Gilles Coppin Olivier Hudry GET/ENST-Bretagne LUSSI Department France GET/ENST Computer Science Department France
The aim of this study is to assist a military decision maker during his decision-making process when applying tactics on the battlefield. For that, we have decided to model the conflict as a game, on which we will see...

Fitted Q Iteration with CMACs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stephan Timmer Martin Riedmiller Department of Computer Science University of Osnabrück Osnabrück Germany
A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important in the case of continuous, high dimensional state spaces, since ...

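Here is a compact sketch of the fitted Q iteration loop, with a tiny one-dimensional tile coding standing in for the CMAC (tile coding is the CMAC family of approximators); the batch format, sizes, and the least-squares refit are simplifying assumptions:

```python
import numpy as np

# Fitted Q iteration with a 1-D tile-coding (CMAC-style) linear
# approximator. batch = list of (s, a, r, s_next) with s in [0, 1).

n_tilings, n_tiles, n_actions = 4, 8, 2
gamma, n_iters = 0.95, 50
n_feat = n_tilings * n_tiles
w = np.zeros((n_actions, n_feat))  # one weight vector per action

def features(s):
    """Binary feature vector: one active tile per (offset) tiling."""
    phi = np.zeros(n_feat)
    for t in range(n_tilings):
        offset = t / (n_tilings * n_tiles)       # staggered tilings
        tile = int((s + offset) * n_tiles) % n_tiles
        phi[t * n_tiles + tile] = 1.0
    return phi

def q_value(s, a):
    return float(features(s) @ w[a])

def fitted_q_iteration(batch):
    """Repeated Bellman backup plus supervised refit over the fixed batch.
    Assumes every action occurs at least once in the batch."""
    for _ in range(n_iters):
        targets = [r + gamma * max(q_value(sn, b) for b in range(n_actions))
                   for (_, _, r, sn) in batch]
        w_new = np.zeros_like(w)
        for a in range(n_actions):
            idx = [i for i, (_, ai, _, _) in enumerate(batch) if ai == a]
            X = np.stack([features(batch[i][0]) for i in idx])
            y = np.array([targets[i] for i in idx])
            w_new[a] = np.linalg.lstsq(X, y, rcond=None)[0]
        w[:] = w_new
```

Because the batch is fixed, each sweep is an ordinary regression problem, which is what lets fitted methods reuse exploration data far more efficiently than incremental updates.
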
Opposition-Based Reinforcement Learning in the Management of Water Resources
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: M. Mahootchi H. R. Tizhoosh K. Ponnambalam Systems Design Engineering University of Waterloo Waterloo ON Canada
Opposition-based learning (OBL) is a new scheme in machine intelligence. In this paper, an OBL version of Q-learning, which exploits opposite quantities to accelerate learning, is used for management of single reservoi...

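The opposition-based idea can be sketched as an ordinary Q-learning update plus a second update for the opposite action, so each interaction informs two table entries. What counts as the "opposite" action and its reward is problem-specific; the definitions below (a discrete release-amount action set with opposite(a) = n_actions - 1 - a) are illustrative assumptions:

```python
import numpy as np

# Opposition-based Q-learning sketch: after each real transition, also
# update the Q-value of the "opposite" action using an estimate of its
# reward and next state supplied by domain knowledge or a simple model.

n_states, n_actions = 20, 5
alpha, gamma = 0.1, 0.95
Q = np.zeros((n_states, n_actions))

def opposite(a):
    # Illustrative: for release amounts 0..4, the opposite of releasing
    # a units is releasing (n_actions - 1 - a) units.
    return n_actions - 1 - a

def step_update(s, a, r, s_next, r_opp, s_next_opp):
    """One real update plus one opposite-action update.

    r_opp and s_next_opp are the estimated outcome of the opposite
    action -- an extra input that plain Q-learning does not need."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    ao = opposite(a)
    Q[s, ao] += alpha * (r_opp + gamma * Q[s_next_opp].max() - Q[s, ao])
```

When the opposite outcome can be computed cheaply, as from a mass-balance model of a reservoir, the second update comes almost for free, which is where the claimed acceleration comes from.
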
Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Andras Antos Csaba Szepesvari Remi Munos Computer and Automation Research Inst. Hungarian Academy of Sciences Budapest Hungary University of Alberta Edmonton Canada SequeL team INRIA Futurs University of Lille (USTL) Villeneuve d'Ascq France
We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian decision problems when the training data is composed of the trajectory of some fixed behaviour policy. ...

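To make the data regime concrete: the single behaviour-policy trajectory is cut into one-step transitions, which then form the regression sample for each fitted policy-evaluation step. The trajectory format and the generic-regressor interface below are simplifying assumptions, not the paper's construction:

```python
import numpy as np

# Slice one behaviour-policy trajectory s0, a0, r0, s1, a1, r1, ... into
# the (s, a, r, s') batch that fitted policy iteration trains on.

def trajectory_to_batch(states, actions, rewards):
    """states has length T + 1; actions and rewards have length T."""
    return [(states[t], actions[t], rewards[t], states[t + 1])
            for t in range(len(actions))]

def fitted_evaluation_step(batch, q_predict, pi, gamma, fit):
    """One fitted policy-evaluation backup for policy pi.

    q_predict(s, a) is the current Q-estimate; fit(X, y) is any
    regressor returning the next estimate (both assumed interfaces)."""
    X = np.array([[s, a] for (s, a, _, _) in batch])
    y = np.array([r + gamma * q_predict(sn, pi(sn)) for (_, _, r, sn) in batch])
    return fit(X, y)
```

Policy iteration then alternates this evaluation step with greedy improvement against the fitted Q; the paper's analysis concerns the finite-sample behaviour of such a scheme when all data comes from one fixed trajectory.
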