Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,012 records; showing results 51-60
A Combined Hierarchical Reinforcement Learning Based Approach For Multi-robot Cooperative Target Searching in Complex Unknown Environments
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Cai, Yifan; Yang, Simon X.; Xu, Xin
Affiliations: Univ Guelph Sch Engn Guelph ON N1G 2W1 Canada; Natl Univ Def Technol Coll Mechatron & Automat Changsha 410073 Hunan Peoples R China
Effective cooperation of multi-robots in unknown environments is essential in many robotic applications, such as environment exploration and target searching. In this paper, a combined hierarchical reinforcement learn...
Adaptive Critic Designs
IEEE Transactions on Neural Networks, 1997, vol. 8, no. 5, pp. 997-1007
Authors: Prokhorov, DV; Wunsch, DC
Affiliations: Laboratory of Chromatography DEPg.Fac.Quimica Universidad Nacional Autonoma de Mexico Circuito interior Cd Universitaria/CP 04510 Mexico D.F.Mexico
We discuss a variety of adaptive critic designs (ACD's) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic p...
Delayed Insertion and Rule Effect Moderation of Domain Knowledge for Reinforcement Learning
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Teng, Teck-Hou; Tan, Ah-Hwee
Affiliations: Nanyang Technol Univ Sch Comp Engn Ctr Computat Intelligence Singapore Singapore; Nanyang Technol Univ Sch Comp Engn Singapore Singapore
Though not a fundamental pre-requisite to efficient machine learning, insertion of domain knowledge into adaptive virtual agent is nonetheless known to improve learning efficiency and reduce model complexity. Conventi...
Bias-Corrected Q-Learning to Control Max-Operator Bias in Q-Learning
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Lee, Donghun; Defourny, Boris; Powell, Warren B.
Affiliations: Princeton Univ Dept Comp Sci Princeton NJ 08540 USA; Princeton Univ Dept Operat Res & Financial Engn Princeton NJ 08540 USA
We identify a class of stochastic control problems with highly random rewards and high discount factor which induce high levels of statistical error in the estimated action-value function. This produces significant le...
Scalarized Multi-Objective Reinforcement Learning: Novel Design Techniques
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Van Moffaert, Kristof; Drugan, Madalina M.; Nowe, Ann
Affiliations: Vrije Univ Brussel Dept Comp Sci B-1050 Brussels Belgium
In multi-objective problems, it is key to find compromising solutions that balance different objectives. The linear scalarization function is often utilized to translate the multi-objective nature of a problem into a ...
Adaptive Optimal Control of CVCF Inverters With Uncertain Load: An Adaptive Dynamic Programming Approach
IEEE Access, 2021, vol. 9, pp. 89276-89286
Authors: Wang, Zhongyang; Yu, Yunjun
Affiliations: Fuzhou Inst Technol Sch Appl Sci & Engn Fuzhou 350506 Peoples R China; Nanchang Univ Sch Informat Engn Nanchang 330031 Jiangxi Peoples R China; Nanchang Univ AI Inst Nanchang 330031 Jiangxi Peoples R China
This paper proposed a data-driven adaptive optimal control approach for CVCF (constant voltage, constant frequency) inverter based on reinforcement learning and adaptive dynamic programming (ADP). Different from exist...
A Study on the Efficiency of Learning a Robot Controller in Various Environments
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Soga, Sachiko; Kobayashi, Ichiro
Affiliations: Ochanomizu Univ Grad Sch Humanities & Sci Bunkyo Ku Tokyo 1128610 Japan
In the case that a robot controller is trained by means of evolutionary computation, the robot will be able to behave sufficiently in the environment where the robot has been trained. However, if the robot is put in a...
Reinforcement Learning to Train Ms. Pac-Man Using Higher-order Action-relative Inputs
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Bom, Luuk; Henken, Ruud; Wiering, Marco
Affiliations: Univ Groningen Inst Artificial Intelligence & Cognit Engn Fac Math & Nat Sci NL-9700 AB Groningen Netherlands
Reinforcement learning algorithms enable an agent to optimize its behavior from interacting with a specific environment. Although some very successful applications of reinforcement learning algorithms have been develo...
Optimistic Planning for Continuous-Action Deterministic Systems
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Busoniu, Lucian; Daniels, Alexander; Munos, Remi; Babuska, Robert
Affiliations: Univ Lorraine CRAN UMR 7039 Nancy France; CNRS CRAN UMR 7039 Nancy France; Delft Univ Technol DCSC Delft Netherlands; INRIA Lille Nord Europe Team SequeL Lille France
We consider the class of online planning algorithms for optimal control, which compared to dynamic programming are relatively unaffected by large state dimensionality. We introduce a novel planning algorithm called SO...
The Second Order Temporal Difference Error for Sarsa(λ)
4th IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Fu, Qiming; Liu, Quan; Xiao, Fei; Chen, Guixin
Affiliations: Soochow Univ Dept Comp Sci & Technol Suzhou Peoples R China
Traditional reinforcement learning algorithms, such as Q-learning, Q(lambda), Sarsa, and Sarsa(lambda), update the action value function using temporal difference (TD) error, which is computed by the last action value...