
Refine Search Results

Document Type

  • 140 conference papers
  • 7 journal articles

Collection

  • 147 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 71 papers: Engineering
    • 66 papers: Computer Science and Technology...
    • 15 papers: Software Engineering
    • 11 papers: Electrical Engineering
    • 10 papers: Control Science and Engineering
    • 2 papers: Instrument Science and Technology
    • 2 papers: Information and Communication Engineering
    • 1 paper: Mechanics (degree in Engineering or Sci...
    • 1 paper: Mechanical Engineering
    • 1 paper: Architecture
  • 11 papers: Science
    • 10 papers: Mathematics
    • 2 papers: Systems Science
    • 2 papers: Statistics (degree in Science or...
  • 7 papers: Management
    • 6 papers: Management Science and Engineering (de...
    • 3 papers: Business Administration
    • 1 paper: Library, Information and Archival Man...
  • 3 papers: Economics
    • 3 papers: Applied Economics

Topics

  • 76 papers: dynamic programm...
  • 39 papers: learning
  • 26 papers: optimal control
  • 25 papers: reinforcement le...
  • 15 papers: function approxi...
  • 15 papers: control systems
  • 14 papers: approximation al...
  • 14 papers: equations
  • 13 papers: neural networks
  • 13 papers: stochastic proce...
  • 12 papers: convergence
  • 10 papers: state-space meth...
  • 10 papers: cost function
  • 9 papers: mathematical mod...
  • 8 papers: trajectory
  • 8 papers: approximation me...
  • 7 papers: approximate dyna...
  • 7 papers: algorithm design...
  • 7 papers: adaptive control
  • 7 papers: heuristic algori...

Institutions

  • 4 papers: school of inform...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: northeastern uni...
  • 3 papers: univ texas autom...
  • 3 papers: arizona state un...
  • 3 papers: robotics institu...
  • 3 papers: univ illinois de...
  • 2 papers: princeton univ d...
  • 2 papers: national science...
  • 2 papers: college of mecha...
  • 2 papers: key laboratory o...
  • 2 papers: univ utrecht dep...
  • 2 papers: department of op...
  • 1 paper: inria
  • 1 paper: computational le...
  • 1 paper: school of automa...
  • 1 paper: univ cincinnati ...
  • 1 paper: toyota technol c...
  • 1 paper: neuroinformatics...

Authors

  • 5 papers: liu derong
  • 4 papers: xu xin
  • 4 papers: martin riedmille...
  • 4 papers: huaguang zhang
  • 4 papers: marco a. wiering
  • 4 papers: zhang huaguang
  • 4 papers: si jennie
  • 4 papers: derong liu
  • 3 papers: hado van hasselt
  • 3 papers: lewis frank l.
  • 3 papers: dongbin zhao
  • 3 papers: powell warren b.
  • 3 papers: warren b. powell
  • 3 papers: riedmiller marti...
  • 2 papers: manuel loth
  • 2 papers: van hasselt hado
  • 2 papers: preux philippe
  • 2 papers: hu dewen
  • 2 papers: jennie si
  • 2 papers: philippe preux

Language

  • 142 papers: English
  • 5 papers: Other

Search query: "Any field = 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007"
147 records; showing 101-110
Toward effective combination of off-line and on-line training in ADP framework
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Danil Prokhorov (Toyota Technical Center, Ann Arbor, MI, USA)
We are interested in finding the most effective combination between off-line and on-line/real-time training in approximate dynamic programming. We introduce our approach of combining proven off-line methods of trainin...
An Optimal ADP Algorithm for a High-Dimensional Stochastic Control Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Juliana Nascimento, Warren Powell (Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA)
We propose a provably optimal approximate dynamic programming algorithm for a class of multistage stochastic problems, taking into account that the probability distribution of the underlying stochastic process is not ...
Reinforcement Learning in Continuous Action Spaces
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hado van Hasselt, Marco A. Wiering (Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands)
A fair amount of research has been done on reinforcement learning in continuous environments, but research on problems where the actions can also be chosen from a continuous space is much more limited. We present a new ...
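The continuous-action setting described in the abstract above can be illustrated with a small actor-critic loop: a continuous action is drawn with Gaussian exploration around the actor's output, and the actor is nudged toward actions that produced a positive temporal-difference error. This is a minimal toy sketch under assumptions of ours (linear features, a hypothetical 1-D point-mass task), not the algorithm from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D toy task: a continuous action nudges the state; the
# reward is the negative squared next state, so the origin is optimal.
def step(state, action):
    next_state = state + 0.1 * action
    return next_state, -next_state ** 2

def features(s):
    return np.array([s, 1.0])          # tiny linear feature vector

v_w = np.zeros(2)                      # critic weights: V(s) = v_w . phi(s)
a_w = np.zeros(2)                      # actor weights: mean action = a_w . phi(s)
alpha, gamma, sigma = 0.05, 0.95, 0.3

state = 1.0
for _ in range(2000):
    phi = features(state)
    mean_action = float(a_w @ phi)
    # Gaussian exploration around the actor's mean, clipped to a safe range
    action = float(np.clip(mean_action + sigma * rng.normal(), -1.0, 1.0))
    next_state, reward = step(state, action)
    delta = reward + gamma * (v_w @ features(next_state)) - v_w @ phi
    v_w += alpha * delta * phi         # TD(0) critic update
    if delta > 0:                      # the explored action beat expectations:
        a_w += alpha * (action - mean_action) * phi   # move the mean toward it
    state = next_state
```

Updating the actor only on positive TD errors is one simple way to handle continuous actions without differentiating through the critic.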
The Effect of Bootstrapping in Multi-Automata Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Maarten Peeters, Katja Verbeeck, Ann Nowe (Computational Modeling Laboratory, Vrije Universiteit Brussel, Brussels, Belgium)
Learning automata have been shown to be an excellent tool for creating learning multi-agent systems. Most algorithms used in current automata research expect the environment to end in an explicit end-stage. In this end-stag...
Kernelizing LSPE(λ)
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Tobias Jung (University of Mainz, Germany), Daniel Polani (University of Hertfordshire, UK)
We propose the use of kernel-based methods as the underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular we present the 'kernelization' of m...
Optimal control applied to Wheeled Mobile Vehicles
IEEE International Symposium on Intelligent Signal Processing
Authors: Gomez, M.; Martinez, T.; Sanchez, S.; Meziat, D. (Univ Alcala, Escuela Politecn Super, Dept Automat, Alcala De Henares, Spain; Univ Alicante, Escuela Politecn Super, Ingn Sistemas Teoria Señal, Dept Fis, Alicante, Spain)
The goal of the work described in this paper is to develop a particular optimal control technique based on a Cell Mapping technique in combination with the Q-learning reinforcement learning method to control wheeled ...
Particle Swarm Optimized Adaptive Dynamic Programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Dongbin Zhao, Jianqiang Yi, Derong Liu (Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL, USA)
Particle swarm optimization is used for the training of the action network and critic network of the adaptive dynamic programming approach. The typical structures of the adaptive dynamic programming and particle swarm...
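The abstract above pairs particle swarm optimization with the action and critic networks of ADP. The core mechanic, a swarm searching a weight space while tracking personal and global bests, can be sketched on a stand-in least-squares objective; the data, weights, and hyperparameters here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective standing in for a critic-network training loss:
# find weights w so that a linear model matches target outputs y.
X = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true

def loss(w):
    return float(np.mean((X @ w - y) ** 2))

# Basic global-best particle swarm optimization over the weight space.
n_particles, dim, iters = 20, 3, 200
pos = rng.normal(size=(n_particles, dim))   # particle positions = weight vectors
vel = np.zeros_like(pos)
pbest = pos.copy()                          # each particle's personal best
pbest_val = np.array([loss(p) for p in pos])
gbest = pbest[np.argmin(pbest_val)].copy()  # swarm's global best

w_inertia, c1, c2 = 0.7, 1.5, 1.5
for _ in range(iters):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    vel = w_inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([loss(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()
```

Because PSO needs only loss evaluations, not gradients, the same loop applies unchanged when the "network" being trained is not differentiable.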
Identifying trajectory classes in dynamic tasks
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stuart O. Anderson, Siddhartha S. Srinivasa (Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Intel Research Pittsburgh, Intel Corporation, Pittsburgh, PA, USA)
Using domain knowledge to decompose difficult control problems is a widely used technique in robotics. Previous work has automated the process of identifying some qualitative behaviors of a system, finding a decomposi...
Model-Based Reinforcement Learning in Factored-State MDPs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Alexander L. Strehl (Department of Computer Science, Rutgers University, Piscataway, NJ, USA)
We consider the problem of learning in a factored-state Markov decision process that is structured to allow a compact representation. We show that the well-known algorithm, factored Rmax, performs near-optimally on al...
Q-Learning with Continuous State Spaces and Finite Decision Set
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kengy Barty, Pierre Girardeau, Jean-Sebastien Roy, Cyrille Strugarek (EDF Research and Development, Clamart, France)
This paper aims to present an original technique to compute the optimal policy of a Markov decision problem with a continuous state space and discrete decision variables. We propose an extension of the Q-learni...