Refine search results

Document type

  • 228 conference papers
  • 4 journal articles

Collection scope

  • 232 electronic documents
  • 0 print holdings

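Subject classification

  • 98 papers: Engineering
    • 93 papers: Computer Science and Technology...
    • 40 papers: Software Engineering
    • 25 papers: Electrical Engineering
    • 14 papers: Control Science and Engineering
    • 4 papers: Mechanical Engineering
    • 1 paper: Mechanics (may confer degrees in Engineering, ...
    • 1 paper: Information and Communication Engineering
    • 1 paper: Architecture
    • 1 paper: Chemical Engineering and Technology
    • 1 paper: Transportation Engineering
  • 23 papers: Science
    • 23 papers: Mathematics
    • 6 papers: Statistics (may confer degrees in Science, ...
    • 4 papers: Systems Science
    • 1 paper: Chemistry
    • 1 paper: Atmospheric Science
  • 9 papers: Management
    • 7 papers: Management Science and Engineering (may...
    • 3 papers: Business Administration
    • 2 papers: Library, Information and Archives Man...
  • 2 papers: Economics
    • 2 papers: Applied Economics
  • 1 paper: Law
    • 1 paper: Sociology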

Topics

  • 95 papers: dynamic programm...
  • 52 papers: learning
  • 46 papers: optimal control
  • 37 papers: reinforcement le...
  • 34 papers: learning (artifi...
  • 27 papers: equations
  • 22 papers: heuristic algori...
  • 21 papers: control systems
  • 20 papers: convergence
  • 19 papers: neural networks
  • 18 papers: function approxi...
  • 17 papers: mathematical mod...
  • 16 papers: approximation al...
  • 15 papers: vectors
  • 14 papers: markov processes
  • 14 papers: artificial neura...
  • 14 papers: cost function
  • 13 papers: stochastic proce...
  • 12 papers: algorithm design...
  • 12 papers: adaptive control

Institutions

  • 5 papers: school of inform...
  • 4 papers: northeastern uni...
  • 4 papers: department of el...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: automation and r...
  • 3 papers: northeastern uni...
  • 3 papers: robotics institu...
  • 3 papers: key laboratory o...
  • 3 papers: univ illinois de...
  • 2 papers: department of ar...
  • 2 papers: school of electr...
  • 2 papers: univ groningen i...
  • 2 papers: univ texas autom...
  • 2 papers: colorado state u...
  • 2 papers: guangxi univ sch...
  • 2 papers: national science...
  • 2 papers: informatics inst...
  • 2 papers: college of infor...
  • 2 papers: school of automa...

Authors

  • 7 papers: hado van hasselt
  • 7 papers: lewis frank l.
  • 7 papers: marco a. wiering
  • 7 papers: dongbin zhao
  • 6 papers: liu derong
  • 5 papers: huaguang zhang
  • 5 papers: zhang huaguang
  • 5 papers: derong liu
  • 5 papers: warren b. powell
  • 4 papers: xu xin
  • 4 papers: vrabie draguna
  • 4 papers: jagannathan s.
  • 4 papers: frank l. lewis
  • 4 papers: yanhong luo
  • 4 papers: damien ernst
  • 4 papers: jan peters
  • 4 papers: peters jan
  • 4 papers: zhao dongbin
  • 3 papers: xu hao
  • 3 papers: martin riedmille...

Language

  • 232 papers: English
Search criteria: "Any field = 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009"
232 records; showing 111-120
Iterative local dynamic programming
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Emanuel Todorov, Yuval Tassa
Affiliations: Department of Cognitive Science, University of California San Diego, USA; Center of Neural Computation, Hebrew University of Jerusalem, Israel
We develop an iterative local dynamic programming method (iLDP) applicable to stochastic optimal control problems in continuous high-dimensional state and action spaces. Such problems are common in the control of biol...
Neural-network-based reinforcement learning controller for nonlinear systems with non-symmetric dead-zone inputs
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Xin Zhang, Huaguang Zhang, Derong Liu, Yongsu Kim
Affiliations: School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China; Department of Electrical and Computer Engineering, University of Illinois Chicago, Chicago, IL, USA
A novel adaptive-critic-based NN controller using reinforcement learning is developed for a class of nonlinear systems with non-symmetric dead-zone inputs. The adaptive critic NN controller uses two NNs: the critic NN...
A unified framework for temporal difference methods
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Dimitri P. Bertsekas
Affiliations: Laboratory of Information and Decision Systems (LIDS), Massachusetts Institute of Technology, MA, USA
We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis f...
Near-optimality bounds for greedy periodic policies with application to grid-level storage
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Yuhai Hu, Boris Defourny
Affiliations: Department of Industrial & Systems Engineering, Lehigh University, USA
This paper is concerned with periodic Markov Decision Processes, as a simplified but already rich model for nonstationary infinite-horizon problems involving seasonal effects. Considering the class of policies greedy ...
A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional Markov decision process with continuous state and action spaces
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Jun Ma, Warren B. Powell
Affiliations: Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA
In this paper, we present a recursive least squares approximate policy iteration (RLSAPI) algorithm for infinite-horizon multi-dimensional Markov decision processes in continuous state and action spaces. Under certain p...
Integrating sporadic imitation in reinforcement learning robots
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Willi Richert, Ulrich Scheller, Markus Koch, Bernd Kleinjohann, Claudius Stern
Affiliations: Faculty of Computer Science, Electrical Engineering and Mathematics, University of Paderborn, Paderborn, Germany
Although the combination of reinforcement learning and imitation has already been considered in recent research, it always revolved around fixed settings where demonstrator and imitator are fixed and the imitation pro...
Finite-horizon optimal control design for uncertain linear discrete-time systems
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Qiming Zhao, Hao Xu, S. Jagannathan
Affiliations: Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, USA
In this paper, the finite-horizon optimal adaptive control design for linear discrete-time systems with unknown system dynamics by using adaptive dynamic programming (ADP) is presented. In the presence of full state f...
High-order local dynamic programming
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Yuval Tassa, Emanuel Todorov
Affiliations: Interdisciplinary Center of Neural Computation, Hebrew University, Jerusalem, Israel; Applied Mathematics and Computer Science & Engineering, University of Washington, Seattle, USA
We describe a new local dynamic programming algorithm for solving stochastic continuous optimal control problems. We use cubature integration to both propagate the state distribution and perform the Bellman backup. Th...
An adaptive dynamic programming algorithm to solve optimal control of uncertain nonlinear systems
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Xiaohong Cui, Yanhong Luo, Huaguang Zhang
Affiliations: School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China
In this paper, an approximate optimal control method based on adaptive dynamic programming (ADP) is discussed for completely unknown nonlinear systems. An online critic-action-identifier algorithm is developed using neu...
A data-based online reinforcement learning algorithm with high-efficient exploration
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
Authors: Yuanheng Zhu, Dongbin Zhao
Affiliations: The State Key Laboratory of Management and Control for Complex Systems, Chinese Academy of Sciences, Beijing, China
An online reinforcement learning algorithm is proposed in this paper to directly and efficiently utilize online data for continuous deterministic systems without system parameters. The dependence on some specific approxi...