Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,023 records; showing 821-830

Efficient data reuse in value function approximation
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiyama, Jan Peters
Affiliations: Department of Computer Science, Tokyo Institute of Technology, Meguro, Tokyo, Japan; Department Schölkopf, Max-Planck Institute of Biological Cybernetics, Tübingen, Germany
Off-policy reinforcement learning is aimed at efficiently using data samples gathered from a policy that is different from the currently optimized policy. A common approach is to use importance sampling techniques for...

Learning continuous-action control policies
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Jason Pazis, Michail G. Lagoudakis
Affiliations: Department of Electronic and Computer Engineering, Technical University of Crete, Crete, Greece
Reinforcement learning for control in stochastic processes has received significant attention in the last few years. Several data-efficient methods, even for continuous state spaces, have been proposed, however most o...

A theoretical and empirical analysis of Expected Sarsa
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco Wiering
Affiliations: Integrated Systems Group, TNO Defence, Safety and Security, The Hague, Netherlands; Intelligent Systems Group, University of Utrecht, Utrecht, Netherlands; Intelligent Autonomous Systems Group, University of Amsterdam, Amsterdam, Netherlands; Department of Artificial Intelligence, University of Groningen, Groningen, Netherlands
This paper presents a theoretical and empirical analysis of Expected Sarsa, a variation on Sarsa, the classic on-policy temporal-difference method for model-free reinforcement learning. Expected Sarsa exploits knowled...

Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration
6th International Symposium on Neural Networks
Authors: Zhang, Pengchen; Xu, Xin; Liu, Chunming; Yuan, Qiping
Affiliations: Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China
Machine learning for mobile robots has attracted lots of research interests in recent years. However, there are still many challenges to apply learning techniques in real mobile robots, e.g., generalization in contin...

Kalman Temporal Differences: The deterministic case
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Matthieu Geist, Olivier Pietquin, Gabriel Fricout
Affiliations: IMS Research Group, Supélec, Metz, France; IMS Research Group, Metz, France; MC cluster, ArcelorMittal Research, Maizieres-Les-Metz, France
This paper deals with value function and Q-function approximation in deterministic Markovian decision processes. A general statistical framework based on the Kalman filtering paradigm is introduced. Its principle is t...

ADHDP(λ) strategies based coordinated ramps metering with queuing consideration
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xuerui Bai, Dongbin Zhao, Jianqiang Yi
Affiliations: Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Ramp metering has been developed as a traffic management strategy to alleviate congestion on freeways. Most ramp metering control algorithms are concerned without queuing consideration, because it's still a tough job t...

Algorithm and stability of ATC receding horizon control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hongwei Zhang, Jie Huang, Frank L. Lewis
Affiliations: Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, New Territories, Hong Kong, China; Automation and Robotics Research Institute, University of Texas Arlington, Fort Worth, TX, USA
Receding horizon control (RHC), also known as model predictive control (MPC), is a suboptimal control scheme that solves a finite horizon open-loop optimal control problem in an infinite horizon context and yields a m...

Neuro-controller of cement rotary kiln temperature with adaptive critic designs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiaofeng Lin, Tangbo Liu, Shaojian Song, Chunning Song
Affiliations: College of Electrical Engineering, Guangxi University, Nanning, China; College of Electrical Engineering, Guangxi University, China
The production process of the cement rotary kiln is a typical engineering thermodynamics with large inertia, lagging and nonlinearity. So it is very difficult to control this process accurately using traditional contr...

Algorithms for variance reduction in a policy-gradient based actor-critic framework
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yogesh P. Awate
Affiliations: Department of Industrial Engineering and Operations Research, Indian Institute of Technology, Bombay, India
We consider the framework of a set of recently proposed two-timescale actor-critic algorithms for reinforcement-learning (RL) using the long-run average-reward criterion and linear feature-based value-function approxi...

Path integral-based stochastic optimal control for rigid body dynamics
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: E. A. Theodorou, J. Buchli, S. Schaal
Affiliations: Computer Science, Neuroscience & Biomedical Engineering, University of Southern California, CA, USA; ATR Computational Neuroscience Laboratories, Kyoto, Japan
Recent advances on path integral stochastic optimal control [1],[2] provide new insights in the optimal control of nonlinear stochastic systems which are linear in the controls, with state independent and time invaria...