Refine search results

Document type

  • 229 conference papers
  • 18 journal articles

Collection scope

  • 247 electronic documents
  • 0 print holdings

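Subject classification

  • 113 papers: Engineering
    • 103 papers: Computer Science and Technology...
    • 42 papers: Software Engineering
    • 38 papers: Electrical Engineering
    • 23 papers: Control Science and Engineering
    • 5 papers: Information and Communication Engineering
    • 3 papers: Mechanical Engineering
    • 2 papers: Mechanics (degree conferrable in Engineering, Sci...
    • 1 paper: Instrument Science and Technology
    • 1 paper: Architecture
    • 1 paper: Chemical Engineering and Technology
    • 1 paper: Transportation Engineering
  • 27 papers: Science
    • 25 papers: Mathematics
    • 7 papers: Systems Science
    • 6 papers: Statistics (degree conferrable in Science, ...
    • 1 paper: Physics
    • 1 paper: Chemistry
    • 1 paper: Atmospheric Science
  • 10 papers: Management
    • 8 papers: Management Science and Engineering (may...
    • 3 papers: Business Administration
    • 2 papers: Library, Information and Archives Manag...
  • 2 papers: Economics
    • 2 papers: Applied Economics
  • 1 paper: Law
    • 1 paper: Sociology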

Topics

  • 95 papers: dynamic programm...
  • 54 papers: optimal control
  • 51 papers: learning
  • 44 papers: reinforcement le...
  • 35 papers: learning (artifi...
  • 27 papers: equations
  • 25 papers: neural networks
  • 22 papers: heuristic algori...
  • 20 papers: convergence
  • 20 papers: control systems
  • 18 papers: function approxi...
  • 18 papers: mathematical mod...
  • 16 papers: approximation al...
  • 15 papers: vectors
  • 15 papers: cost function
  • 14 papers: markov processes
  • 14 papers: nonlinear system...
  • 14 papers: artificial neura...
  • 13 papers: stochastic proce...
  • 12 papers: adaptive dynamic...

Institutions

  • 10 papers: chinese acad sci...
  • 5 papers: school of inform...
  • 4 papers: northeastern uni...
  • 4 papers: department of el...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: automation and r...
  • 3 papers: department of el...
  • 3 papers: robotics institu...
  • 3 papers: key laboratory o...
  • 3 papers: natl univ def te...
  • 3 papers: univ illinois de...
  • 2 papers: department of ar...
  • 2 papers: school of electr...
  • 2 papers: univ groningen i...
  • 2 papers: univ texas autom...
  • 2 papers: colorado state u...
  • 2 papers: guangxi univ sch...
  • 2 papers: national science...
  • 2 papers: informatics inst...

Authors

  • 13 papers: liu derong
  • 7 papers: hado van hasselt
  • 7 papers: marco a. wiering
  • 7 papers: dongbin zhao
  • 6 papers: zhao dongbin
  • 5 papers: xu xin
  • 5 papers: lewis frank l.
  • 5 papers: huaguang zhang
  • 5 papers: wei qinglai
  • 5 papers: derong liu
  • 5 papers: warren b. powell
  • 4 papers: haibo he
  • 4 papers: jagannathan s.
  • 4 papers: frank l. lewis
  • 4 papers: zhang huaguang
  • 4 papers: ni zhen
  • 4 papers: yanhong luo
  • 4 papers: wang ding
  • 4 papers: he haibo
  • 4 papers: damien ernst

Language

  • 246 papers: English
  • 1 paper: Other
Search query: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 records; showing 51-60
The QV Family Compared to Other Reinforcement Learning Algorithms
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Authors: Wiering, Marco A.; van Hasselt, Hado. Affiliations: Univ Groningen Dept Artificial Intelligence NL-9700 AB Groningen Netherlands; Univ Utrecht Intelligent Syst Grp NL-3508 TC Utrecht Netherlands
This paper describes several new online model-free reinforcement learning (RL) algorithms. We designed three new reinforcement algorithms, namely: QV2, QVMAX, and QV-MAX2, that are all based on the QV-learning algorit...

Reinforcement Learning-based Optimal Control Considering L Computation Time Delay of Linear Discrete-time Systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Fujita, Taishi; Ushio, Toshimitsu
In embedded control systems, the control input is computed based on sensing data of a plant in a processor and there is a delay, called the computation time delay, due to the computation and the data transmission. Whe...

Supervised adaptive dynamic programming based adaptive cruise control
Symposium Series on Computational Intelligence, IEEE SSCI2011 - 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2011
Authors: Zhao, Dongbin; Hu, Zhaohui. Affiliation: Key Laboratory of Complex Systems and Intelligence Science Institute of Automation Chinese Academy of Sciences Beijing 100190 China
This paper proposes a supervised adaptive dynamic programming (SADP) algorithm for the full range adaptive cruise control (ACC) system. The full range ACC system considers both the ACC situation in highway system and ...

An approximate dynamic programming strategy for responsive traffic signal control
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Author: Cai, Chen. Affiliation: Univ Coll London Ctr Transport Studies London WC1E 6BT England
This paper proposes an approximate dynamic programming strategy for responsive traffic signal control. It is the first attempt that optimizes signal control objective dynamically through adaptive approximation of valu...

Iterative Local Dynamic Programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Authors: Todorov, Emanuel; Tassa, Yuval. Affiliations: Univ Calif San Diego Dept Cognit Sci La Jolla CA 92093 USA; Hebrew Univ Jerusalem Ctr Neural Computat IL-91905 Jerusalem Israel
We develop an iterative local dynamic programming method (iLDP) applicable to stochastic optimal control problems in continuous high-dimensional state and action spaces. Such problems are common in the control of biol...

Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ahmadzadeh, Seyed Reza; Kormushev, Petar; Caldwell, Darwin G. Affiliation: Ist Italiano Tecnol Dept Adv Robot Via Morego 30 I-16163 Genoa Italy
This paper investigates learning approaches for discovering fault-tolerant control policies to overcome thruster failures in Autonomous Underwater Vehicles (AUV). The proposed approach is a model-based direct policy s...

Beyond Exponential Utility Functions: A Variance-Adjusted Approach for Risk-Averse Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Gosavi, Abhijit A.; Das, Sajal K.; Murray, Susan L. Affiliations: Missouri Univ Sci & Technol Dept Engn Management & Syst Engn Rolla MO 65409 USA; Missouri Univ Sci & Technol Dept Comp Sci Rolla MO 65409 USA
Utility theory has served as a bedrock for modeling risk in economics. Where risk is involved in decision-making, for solving Markov decision processes (MDPs) via utility theory, the exponential utility (EU) function ...

Using reward-weighted imitations for robot reinforcement learning
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009
Authors: Peters, Jan; Kober, Jens. Affiliation: Department of Empirical Inference and Machine Learning Max Planck Institute for Biological Cybernetics Spemannstr. 38 72076 Tübingen Germany
Reinforcement learning is an essential ability for robots to learn new motor skills. Nevertheless, few methods scale into the domain of anthropomorphic robotics. In order to improve in terms of efficiency, the problem...

Efficient Data Reuse in Value Function Approximation
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Authors: Hachiya, Hirotaka; Akiyama, Takayuki; Sugiyama, Masashi; Peters, Jan. Affiliations: Tokyo Inst Technol Dept Comp Sci Meguro Ku 2-12-1 O Okayama Tokyo 1528552 Japan; Max Planck Inst Biol Cybernet Dept Scholkopf D-72076 Tubingen Germany
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy that is different from the currently optimized policy. A common approach is to use importance sampling techniques for...

Bayesian active learning with basis functions
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Authors: Ryzhov, Ilya O.; Powell, Warren B. Affiliation: Operations Research and Financial Engineering Princeton University Princeton NJ 08544 United States
A common technique for dealing with the curse of dimensionality in approximate dynamic programming is to use a parametric value function approximation, where the value of being in a state is assumed to be a linear com...