Refine Search Results

Document Type

  • 748 conference papers
  • 271 journal articles
  • 4 books

Collection

  • 1,023 electronic documents
  • 1 print holding

Subject Classification

  • 712 Engineering
    • 520 Computer Science and Technology...
    • 381 Electrical Engineering
    • 278 Control Science and Engineering
    • 153 Software Engineering
    • 79 Information and Communication Engineering
    • 40 Transportation Engineering
    • 23 Instrument Science and Technology
    • 20 Mechanical Engineering
    • 9 Biological Engineering
    • 8 Electronic Science and Technology (...
    • 7 Mechanics (...
    • 7 Civil Engineering
    • 6 Power Engineering and Engineering Therm...
    • 6 Petroleum and Natural Gas Engineering
    • 4 Biomedical Engineering (...
    • 3 Materials Science and Engineering (...
    • 3 Chemical Engineering and Technology
    • 3 Aeronautical and Astronautical Science and Tech...
    • 3 Safety Science and Engineering
  • 118 Science
    • 98 Mathematics
    • 32 Systems Science
    • 22 Statistics (...
    • 10 Biology
    • 8 Physics
    • 4 Chemistry
  • 66 Management
    • 63 Management Science and Engineering (...
    • 14 Business Administration
    • 5 Library, Information and Archives Man...
  • 5 Economics
    • 4 Applied Economics
  • 3 Law
    • 3 Sociology
  • 2 Medicine
  • 1 Education

Topics

  • 313 reinforcement le...
  • 216 dynamic programm...
  • 206 optimal control
  • 107 adaptive dynamic...
  • 104 adaptive dynamic...
  • 97 learning
  • 88 neural networks
  • 78 heuristic algori...
  • 68 reinforcement le...
  • 58 learning (artifi...
  • 54 nonlinear system...
  • 53 convergence
  • 51 control systems
  • 51 mathematical mod...
  • 48 approximate dyna...
  • 44 approximation al...
  • 43 equations
  • 42 adaptive control
  • 41 artificial neura...
  • 41 cost function

Institutions

  • 41 chinese acad sci...
  • 27 univ rhode isl d...
  • 17 tianjin univ sch...
  • 16 univ sci & techn...
  • 16 univ illinois de...
  • 15 northeastern uni...
  • 14 beijing normal u...
  • 13 northeastern uni...
  • 13 guangdong univ t...
  • 12 northeastern uni...
  • 9 natl univ def te...
  • 8 ieee
  • 8 univ chinese aca...
  • 7 univ chinese aca...
  • 7 cent south univ ...
  • 7 southern univ sc...
  • 7 beijing univ tec...
  • 6 chinese acad sci...
  • 6 missouri univ sc...
  • 5 nanjing univ pos...

Authors

  • 54 liu derong
  • 37 wei qinglai
  • 29 he haibo
  • 22 wang ding
  • 21 xu xin
  • 19 jiang zhong-ping
  • 17 lewis frank l.
  • 17 yang xiong
  • 17 zhang huaguang
  • 17 ni zhen
  • 16 zhao bo
  • 15 gao weinan
  • 14 zhao dongbin
  • 13 derong liu
  • 13 zhong xiangnan
  • 12 si jennie
  • 10 jagannathan s.
  • 10 dongbin zhao
  • 10 song ruizhuo
  • 9 abouheaf mohamme...

Language

  • 992 English
  • 25 Other
  • 6 Chinese

Search query: Any field = "IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,023 records; showing 661-670
Exponential moving average Q-learning algorithm
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Mostafa D. Awheda, Howard M. Schwartz (Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada)
A multi-agent policy iteration learning algorithm is proposed in this work. The Exponential Moving Average (EMA) mechanism is used to update the policy for a Q-learning agent so that it converges to an optimal policy ...
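The EMA mechanism mentioned in the abstract above can be illustrated on a toy problem: after each Q-value update, the policy is nudged toward the current greedy action by an exponential moving average. This is a minimal sketch of the general EMA idea only, not the paper's multi-agent policy iteration algorithm; the bandit rewards, step sizes, and epsilon-exploration below are invented for the example.

```python
import numpy as np

def ema_q_bandit(rewards, steps=500, alpha=0.1, eta=0.05, eps=0.2, seed=0):
    """Q-learning on a deterministic multi-armed bandit where the policy is
    pulled toward the current greedy action by an exponential moving average
    (EMA). Illustrative sketch only -- not the paper's algorithm."""
    rng = np.random.default_rng(seed)
    n = len(rewards)
    Q = np.zeros(n)                            # action-value estimates
    pi = np.full(n, 1.0 / n)                   # stochastic policy over actions
    for _ in range(steps):
        behavior = (1 - eps) * pi + eps / n    # keep exploring every arm
        a = rng.choice(n, p=behavior / behavior.sum())
        Q[a] += alpha * (rewards[a] - Q[a])    # one-step Q update
        greedy = np.zeros(n)
        greedy[np.argmax(Q)] = 1.0
        pi = (1 - eta) * pi + eta * greedy     # EMA policy update
        pi /= pi.sum()
    return pi

pi = ema_q_bandit([1.0, 0.2, 0.0])
print(pi.round(3))    # probability mass concentrates on the best arm (index 0)
```

With a fixed EMA rate eta, the probability of non-greedy actions decays geometrically, so the policy converges to the greedy one once the value estimates stabilize.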
An integrated design for intensified direct heuristic dynamic programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiong Luo, Jennie Si, Yuchao Zhou (School of Computer and Communication Engineering, University of Science and Technology Beijing (USTB), Beijing, China; Arizona State University, Tempe, AZ, USA)
There has been a growing interest in the study of adaptive/approximate dynamic programming (ADP) in recent years. The ADP technique provides a powerful tool to understand and improve the principled technologies of mac...
A novel approach for constructing basis functions in approximate dynamic programming for feedback control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Jian Wang, Zhenhua Huang, Xin Xu (College of Mechatronics and Automation, National University of Defense Tech, Changsha, P. R. China)
This paper presents a novel approach for constructing basis functions in approximate dynamic programming (ADP) through the locally linear embedding (LLE) process. It considers the experience (sample) data as a high-di...
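The core step of locally linear embedding, which the abstract above builds on, is a small constrained least-squares solve per sample: reconstruct each point from its neighbors with weights that sum to one. A minimal sketch of that step (the regularization constant and the test points are arbitrary choices for the example, and this is standard LLE, not the paper's full basis-construction scheme):

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """LLE reconstruction weights of point x from its neighbors:
    minimize ||x - sum_j w_j n_j||^2 subject to sum_j w_j = 1,
    solved via the regularized local Gram system as in standard LLE."""
    Z = neighbors - x                            # neighbors in x's local frame
    G = Z @ Z.T                                  # local Gram matrix
    G = G + reg * np.trace(G) * np.eye(len(G))   # regularize (G is often singular)
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                           # enforce the sum-to-one constraint

# A point on a segment is reconstructed by its endpoints with barycentric weights.
x = np.array([0.25, 0.25])
nbrs = np.array([[0.0, 0.0], [1.0, 1.0]])
w = lle_weights(x, nbrs)
print(w.round(3))    # ~ [0.75, 0.25]
```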
Bias-corrected Q-learning to control max-operator bias in Q-learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Donghun Lee, Boris Defourny, Warren B. Powell (Department of Computer Science, Princeton University, Princeton, NJ, USA; Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA)
We identify a class of stochastic control problems with highly random rewards and high discount factor which induce high levels of statistical error in the estimated action-value function. This produces significant le...
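The max-operator bias targeted above is easy to demonstrate: when every action has the same true value, taking the max over noisy estimates is biased upward, since E[max_a Q̂(a)] ≥ max_a E[Q̂(a)] by Jensen's inequality. A quick Monte Carlo check (the noise scale and action count are arbitrary; this shows the bias, not the paper's correction):

```python
import numpy as np

# All actions have true value 0; each estimate carries zero-mean noise.
# The max over noisy estimates is nevertheless positive on average.
rng = np.random.default_rng(1)
n_actions, n_trials = 10, 20000
noisy_q = rng.normal(0.0, 1.0, size=(n_trials, n_actions))
bias = noisy_q.max(axis=1).mean()    # empirical E[max_a Qhat(a)]
print(f"estimated max-operator bias: {bias:.3f}")   # positive, though all true values are 0
```

The bias grows with both the number of actions and the noise level, which is why high-variance rewards and a large discount factor compound the overestimation in plain Q-learning.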
Free energy based policy gradients
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Evangelos A. Theodorou, Jiri Najemnik, Emo Todorov (Department of Computer Science and Engineering; Departments of Computer Science and Engineering and Applied Math, University of Washington, Seattle)
Despite the plethora of reinforcement learning algorithms in machine learning and control, the majority of the work in this area relies on discrete time formulations of stochastic dynamics. In this work we present a n...
Proceedings of the 2013 IEEE Conference on Evolving and Adaptive Intelligent Systems, EAIS 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
2013 IEEE Conference on Evolving and Adaptive Intelligent Systems, EAIS 2013 - 2013 IEEE Symposium Series on Computational Intelligence, SSCI 2013
The proceedings contain 20 papers. The topics discussed include: resolving global and local drifts in data stream regression using evolving rule-based models; fuzzy decision trees for dynamic data; dynamic and evolving ...
Adaptive optimal control for nonlinear discrete-time systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Chunbin Qin, Huaguang Zhang, Yanhong Luo (School of Information Science and Engineering, Northeastern University, Shenyang, China; Basic Experiment Teaching Center, Henan University, Kaifeng, China)
This paper proposes an on-line near-optimal control scheme based on the capabilities of neural networks (NNs) in function approximation, to attain the on-line solution of the optimal control problem for nonlinear discrete-ti...
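Stripped of the neural-network approximation, the value-iteration idea behind such ADP schemes can be shown exactly on a scalar linear-quadratic problem, where repeated Bellman backups on a quadratic value function V(x) = P·x² converge to the discrete Riccati fixed point. A sketch under invented system parameters (the paper itself treats general nonlinear discrete-time systems with NN approximators):

```python
# Scalar discrete-time LQR: x_{k+1} = a*x + b*u, stage cost q*x^2 + r*u^2.
# ADP-style value iteration: iterate the Bellman backup on P until it
# reaches the discrete algebraic Riccati fixed point.
a, b, q, r = 1.2, 1.0, 1.0, 1.0   # unstable open loop (|a| > 1); invented values

P = 0.0
for _ in range(200):
    # Bellman backup: P <- q + a^2*P - (a*b*P)^2 / (r + b^2*P)
    P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)

K = a * b * P / (r + b * b * P)   # resulting state-feedback gain, u = -K*x
print(round(P, 4), round(K, 4))   # closed loop a - b*K is stable
```

The same backup, with P replaced by an NN critic and K by an NN actor trained from sampled data, is the pattern the on-line ADP literature follows for nonlinear systems.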
Optimistic planning for belief-augmented Markov Decision Processes
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Raphael Fonteneau, Lucian Buşoniu, Rémi Munos (Department of Electrical Engineering and Computer Science, University of Liège, Belgium; Université de Lorraine, CRAN, France; SequeL Team, Inria Lille, France)
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Proc...
A combined hierarchical reinforcement learning based approach for multi-robot cooperative target searching in complex unknown environments
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yifan Cai, Simon X. Yang, Xin Xu (School of Engineering, University of Guelph, Guelph, Ontario, Canada; College of Mechatronics and Automation, National University of Defense Technology, Changsha, Hunan Province, China)
Effective cooperation of multi-robots in unknown environments is essential in many robotic applications, such as environment exploration and target searching. In this paper, a combined hierarchical reinforcement learn...
The second order temporal difference error for Sarsa(λ)
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Qiming Fu, Quan Liu, Fei Xiao, Guixin Chen (Department of Computer Science and Technology, Soochow University, Suzhou, China)
Traditional reinforcement learning algorithms, such as Q-learning, Q(λ), Sarsa, and Sarsa(λ), update the action value function using the temporal difference (TD) error, which is computed from the last action value functio...
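For reference, the first-order TD error and eligibility-trace update of classical Sarsa(λ), which the second-order error above extends, look like this on a toy chain MDP. The chain, step sizes, and exploration scheme are invented for illustration; this is the textbook algorithm, not the paper's variant.

```python
import numpy as np

def sarsa_lambda_chain(n_states=5, episodes=300, alpha=0.2, gamma=0.9,
                       lam=0.8, eps=0.1, seed=0):
    """Classical Sarsa(lambda) with accumulating eligibility traces on a chain:
    actions 0/1 move left/right, reward 1 on reaching the rightmost state."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))

    def policy(s):
        if rng.random() < eps:
            return int(rng.integers(2))
        q = Q[s]
        return int(rng.choice(np.flatnonzero(q == q.max())))  # random tie-break

    for _ in range(episodes):
        e = np.zeros_like(Q)                  # eligibility traces
        s = 0
        a = policy(s)
        for _step in range(1000):             # cap episode length
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            done = (s2 == n_states - 1)
            r = 1.0 if done else 0.0
            a2 = policy(s2)
            # first-order temporal-difference error
            delta = r + (0.0 if done else gamma * Q[s2, a2]) - Q[s, a]
            e[s, a] += 1.0                    # accumulating trace
            Q += alpha * delta * e            # credit all recently visited pairs
            e *= gamma * lam                  # decay traces
            if done:
                break
            s, a = s2, a2
    return Q

Q = sarsa_lambda_chain()
print(np.argmax(Q, axis=1))   # greedy action per non-terminal state
```

The single TD error delta drives every trace-weighted update; a second-order scheme would additionally use the error of the preceding step when forming the target.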