
Refine Results

Document Type

  • 229 conference papers
  • 18 journal articles

Collection

  • 247 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 113 Engineering
    • 103 Computer Science and Technology...
    • 42 Software Engineering
    • 38 Electrical Engineering
    • 23 Control Science and Engineering
    • 5 Information and Communication Engineering
    • 3 Mechanical Engineering
    • 2 Mechanics (conferrable in Engineering or Sci...
    • 1 Instrument Science and Technology
    • 1 Architecture
    • 1 Chemical Engineering and Technology
    • 1 Transportation Engineering
  • 27 Science
    • 25 Mathematics
    • 7 Systems Science
    • 6 Statistics (conferrable in Science or...
    • 1 Physics
    • 1 Chemistry
    • 1 Atmospheric Science
  • 10 Management
    • 8 Management Science and Engineering (confer...
    • 3 Business Administration
    • 2 Library, Information and Archives Manag...
  • 2 Economics
    • 2 Applied Economics
  • 1 Law
    • 1 Sociology

Subjects

  • 95 dynamic programm...
  • 54 optimal control
  • 51 learning
  • 44 reinforcement le...
  • 35 learning (artifi...
  • 27 equations
  • 25 neural networks
  • 22 heuristic algori...
  • 20 convergence
  • 20 control systems
  • 18 function approxi...
  • 18 mathematical mod...
  • 16 approximation al...
  • 15 vectors
  • 15 cost function
  • 14 markov processes
  • 14 nonlinear system...
  • 14 artificial neura...
  • 13 stochastic proce...
  • 12 adaptive dynamic...

Institutions

  • 10 chinese acad sci...
  • 5 school of inform...
  • 4 northeastern uni...
  • 4 department of el...
  • 4 department of in...
  • 3 department of el...
  • 3 automation and r...
  • 3 department of el...
  • 3 robotics institu...
  • 3 key laboratory o...
  • 3 natl univ def te...
  • 3 univ illinois de...
  • 2 department of ar...
  • 2 school of electr...
  • 2 univ groningen i...
  • 2 univ texas autom...
  • 2 colorado state u...
  • 2 guangxi univ sch...
  • 2 national science...
  • 2 informatics inst...

Authors

  • 13 liu derong
  • 7 hado van hasselt
  • 7 marco a. wiering
  • 7 dongbin zhao
  • 6 zhao dongbin
  • 5 xu xin
  • 5 lewis frank l.
  • 5 huaguang zhang
  • 5 wei qinglai
  • 5 derong liu
  • 5 warren b. powell
  • 4 haibo he
  • 4 jagannathan s.
  • 4 frank l. lewis
  • 4 zhang huaguang
  • 4 ni zhen
  • 4 yanhong luo
  • 4 wang ding
  • 4 he haibo
  • 4 damien ernst

Language

  • 246 English
  • 1 Other
Search query: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 records; showing 201-210
Identifying trajectory classes in dynamic tasks
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stuart O. Anderson; Siddhartha S. Srinivasa (Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Intel Research Pittsburgh, Intel Corporation, Pittsburgh, PA, USA)
Using domain knowledge to decompose difficult control problems is a widely used technique in robotics. Previous work has automated the process of identifying some qualitative behaviors of a system, finding a decomposi...
An Optimal ADP Algorithm for a High-Dimensional Stochastic Control Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Juliana Nascimento; Warren Powell (Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA)
We propose a provably optimal approximate dynamic programming algorithm for a class of multistage stochastic problems, taking into account that the probability distribution of the underlying stochastic process is not ...
The QV family compared to other reinforcement learning algorithms
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering; Hado van Hasselt (Department of Artificial Intelligence, University of Groningen, Netherlands; Intelligent Systems Group, Utrecht University, Netherlands)
This paper describes several new online model-free reinforcement learning (RL) algorithms. We designed three new reinforcement learning algorithms, namely: QV2, QVMAX, and QVMAX2, that are all based on the QV-learning algorith...
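The QV-learning update that QV2, QVMAX, and QVMAX2 build on is simple enough to sketch. Below is a minimal tabular version, assuming a gym-style environment whose step() returns (next_state, reward, done); the function name and interface are illustrative, not the paper's code:

```python
import random
from collections import defaultdict

def qv_learning_episode(env, Q, V, alpha=0.1, beta=0.1, gamma=0.99,
                        epsilon=0.1, actions=(0, 1)):
    """Run one episode of tabular QV-learning, the base algorithm
    that QV2, QVMAX, and QVMAX2 extend.

    V(s) is learned by ordinary TD(0); Q(s, a) is then updated toward
    the same target r + gamma * V(s'), so action values bootstrap from
    the state-value estimate instead of from Q itself.
    """
    s = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection from the current Q estimate
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda b: Q[(s, b)])
        s_next, r, done = env.step(a)
        target = r if done else r + gamma * V[s_next]
        V[s] += beta * (target - V[s])              # TD(0) update of V
        Q[(s, a)] += alpha * (target - Q[(s, a)])   # Q chases the same target
        s = s_next
    return Q, V
```

Because V(s) averages over every action tried in s, the Q targets can be less noisy than Q-learning's own bootstrapped values, which is the intuition the QV variants exploit.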
Online Reinforcement Learning Neural Network Controller Design for Nanomanipulation
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Qinmin Yang; S. Jagannathan (Department of Electrical & Computer Engineering, University of Missouri, Rolla, MO, USA)
In this paper, a novel reinforcement learning neural network (NN)-based controller, referred to as an adaptive critic controller, is proposed for affine nonlinear discrete-time systems with applications to nanomanipulation...
Q-learning with Continuous State Spaces and Finite Decision Set
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kengy Barty; Pierre Girardeau; Jean-Sebastien Roy; Cyrille Strugarek (EDF Research and Development, Clamart, France)
This paper aims to present an original technique in order to compute the optimal policy of a Markov decision problem with continuous state space and discrete decision variables. We propose an extension of the Q-learni...
A theoretical and empirical analysis of Expected Sarsa
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Harm van Seijen; Hado van Hasselt; Shimon Whiteson; Marco Wiering (Integrated Systems Group, TNO Defence, Safety and Security, The Hague, Netherlands; Intelligent Systems Group, University of Utrecht, Utrecht, Netherlands; Intelligent Autonomous Systems Group, University of Amsterdam, Amsterdam, Netherlands; Department of Artificial Intelligence, University of Groningen, Groningen, Netherlands)
This paper presents a theoretical and empirical analysis of Expected Sarsa, a variation on Sarsa, the classic on-policy temporal-difference method for model-free reinforcement learning. Expected Sarsa exploits knowled...
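Expected Sarsa replaces Sarsa's sampled next action with the expectation of Q(s', ·) under the behavior policy, i.e. its target is r + γ Σ_a' π(a'|s') Q(s', a'). A minimal tabular sketch under an ε-greedy policy (the array layout and function name are illustrative assumptions):

```python
import numpy as np

def expected_sarsa_update(Q, s, a, r, s_next, done,
                          alpha=0.1, gamma=0.99, epsilon=0.1):
    """One Expected Sarsa backup on a tabular Q (n_states x n_actions).

    The target averages Q(s', .) under the epsilon-greedy behavior
    policy instead of using the sampled next action (Sarsa) or the max
    (Q-learning), removing the variance of the next-action draw.
    """
    n_actions = Q.shape[1]
    if done:
        expected_next = 0.0
    else:
        # epsilon-greedy probabilities over the next state's actions
        probs = np.full(n_actions, epsilon / n_actions)
        probs[int(np.argmax(Q[s_next]))] += 1.0 - epsilon
        expected_next = float(probs @ Q[s_next])
    Q[s, a] += alpha * (r + gamma * expected_next - Q[s, a])
    return Q
```

With a greedy target policy (ε = 0) the same update reduces to Q-learning, which is why Expected Sarsa is often presented as a bridge between the two methods.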
Kalman Temporal Differences: The deterministic case
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Matthieu Geist; Olivier Pietquin; Gabriel Fricout (IMS Research Group, Supélec, Metz, France; MC Cluster, ArcelorMittal Research, Maizieres-Les-Metz, France)
This paper deals with value function and Q-function approximation in deterministic Markovian decision processes. A general statistical framework based on the Kalman filtering paradigm is introduced. Its principle is t...
A Theoretical Analysis of Cooperative Behavior in Multi-agent Q-learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ludo Waltman; Uzay Kaymak (Erasmus University Rotterdam, Rotterdam, Netherlands)
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This paper provides a theore...
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Martin Riedmiller; Jan Peters; Stefan Schaal (NeuroInformatics Group, University of Osnabrück, Germany; Computational Learning and Motor Control, University of Southern California, USA)
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each o...
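Of the three families compared, finite-difference gradients are the simplest to sketch: perturb the policy parameters, roll out, and regress the return differences on the perturbations. A hedged sketch; the `rollout_return` interface is an assumption for illustration, not the paper's code:

```python
import numpy as np

def finite_difference_gradient(rollout_return, theta, delta=1e-2,
                               n_samples=20, rng=None):
    """Finite-difference policy gradient estimate.

    rollout_return(theta) is assumed to return the (possibly noisy)
    return of one rollout under policy parameters theta. We draw random
    perturbations d_i, measure dJ_i = J(theta + d_i) - J(theta), and
    solve the least-squares system D g = dJ for the gradient g.
    """
    rng = rng or np.random.default_rng(0)
    J0 = rollout_return(theta)
    # each row of D is one parameter perturbation
    D = rng.normal(scale=delta, size=(n_samples, theta.size))
    dJ = np.array([rollout_return(theta + d) - J0 for d in D])
    g, *_ = np.linalg.lstsq(D, dJ, rcond=None)
    return g
```

The estimator needs no knowledge of the policy's internal structure, which is why finite-difference methods serve as the baseline against which 'vanilla' and natural policy gradients are usually compared.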
Efficient learning in Cellular Simultaneous Recurrent Neural Networks - The Case of Maze Navigation Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Roman Ilin; Robert Kozma; Paul J. Werbos (Department of Mathematical Sciences, University of Memphis, Memphis, TN, USA; National Science Foundation, Arlington, VA, USA)
Cellular simultaneous recurrent neural networks (SRN) show great promise in solving complex function approximation problems. In particular, approximate dynamic programming is an important application area where SRNs h...