
Refine Search Results

Document Type

  • 751 conference papers
  • 272 journal articles
  • 4 books

Collection Scope

  • 1,027 electronic documents
  • 1 print holding

Date Distribution

Subject Classification

  • 719 Engineering
    • 523 Computer Science and Technology...
    • 385 Electrical Engineering
    • 284 Control Science and Engineering
    • 153 Software Engineering
    • 83 Information and Communication Engineering
    • 41 Transportation Engineering
    • 24 Instrument Science and Technology
    • 21 Mechanical Engineering
    • 9 Electronic Science and Technology (...
    • 9 Bioengineering
    • 7 Mechanics (...
    • 7 Civil Engineering
    • 7 Petroleum and Natural Gas Engineering
    • 6 Power Engineering and Engineering Therm...
    • 4 Materials Science and Engineering (...
    • 4 Biomedical Engineering (...
    • 4 Safety Science and Engineering
    • 3 Chemical Engineering and Technology
    • 3 Aeronautical and Astronautical Science and Tech...
  • 120 Science
    • 98 Mathematics
    • 31 Systems Science
    • 22 Statistics (...
    • 10 Biology
    • 9 Physics
    • 5 Chemistry
  • 68 Management
    • 65 Management Science and Engineering (...
    • 14 Business Administration
    • 7 Library, Information and Archives Man...
  • 5 Economics
    • 4 Applied Economics
  • 3 Law
    • 3 Sociology
  • 2 Medicine
  • 1 Education

Topics

  • 315 篇 reinforcement le...
  • 216 篇 dynamic programm...
  • 206 篇 optimal control
  • 110 篇 adaptive dynamic...
  • 105 篇 adaptive dynamic...
  • 97 篇 learning
  • 88 篇 neural networks
  • 79 篇 heuristic algori...
  • 67 篇 reinforcement le...
  • 58 篇 learning (artifi...
  • 54 篇 nonlinear system...
  • 52 篇 convergence
  • 52 篇 control systems
  • 51 篇 mathematical mod...
  • 48 篇 approximate dyna...
  • 44 篇 approximation al...
  • 43 篇 equations
  • 42 篇 adaptive control
  • 41 篇 cost function
  • 40 篇 artificial neura...

Institutions

  • 41 篇 chinese acad sci...
  • 27 篇 univ rhode isl d...
  • 17 篇 tianjin univ sch...
  • 16 篇 northeastern uni...
  • 16 篇 univ sci & techn...
  • 16 篇 univ illinois de...
  • 14 篇 beijing normal u...
  • 13 篇 northeastern uni...
  • 13 篇 guangdong univ t...
  • 12 篇 northeastern uni...
  • 9 篇 natl univ def te...
  • 8 篇 ieee
  • 8 篇 univ chinese aca...
  • 7 篇 univ chinese aca...
  • 7 篇 cent south univ ...
  • 7 篇 southern univ sc...
  • 7 篇 beijing univ tec...
  • 6 篇 chinese acad sci...
  • 6 篇 missouri univ sc...
  • 5 篇 nanjing univ pos...

Authors

  • 55 篇 liu derong
  • 37 篇 wei qinglai
  • 29 篇 he haibo
  • 22 篇 wang ding
  • 21 篇 xu xin
  • 19 篇 jiang zhong-ping
  • 17 篇 lewis frank l.
  • 17 篇 yang xiong
  • 17 篇 zhang huaguang
  • 17 篇 ni zhen
  • 16 篇 zhao bo
  • 16 篇 gao weinan
  • 14 篇 zhao dongbin
  • 13 篇 zhong xiangnan
  • 12 篇 si jennie
  • 12 篇 derong liu
  • 11 篇 song ruizhuo
  • 10 篇 jagannathan s.
  • 10 篇 dongbin zhao
  • 9 篇 abouheaf mohamme...

Language

  • 970 English
  • 51 Other
  • 6 Chinese

Search query: Any field = "IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,027 records; showing 891-900
A scalable model-free recurrent neural network framework for solving POMDPs
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Liu, Zhenzhen; Elhanany, Itamar (Univ Tennessee, Dept Elect & Comp Engn, Knoxville, TN 37996, USA)
This paper presents a framework for obtaining an optimal policy in model-free Partially Observable Markov Decision Problems (POMDPs) using a recurrent neural network (RNN). A Q-function approximation approach is taken...
Coordinated reinforcement learning for decentralized optimal control
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Yagan, Daniel; Tham, Chen-Khong (Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117548, Singapore)
We consider a multi-agent system where the overall performance is affected by the joint actions or policies of agents. However, each agent only observes a partial view of the global state condition. This model is know...
Identifying trajectory classes in dynamic tasks
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Anderson, Stuart O.; Srinivasa, Siddhartha S. (Carnegie Mellon Univ, Inst Robot, 5000 Forbes Ave, Pittsburgh, PA 15213, USA; Intel Res Pittsburgh, Pittsburgh, PA 15213, USA)
Using domain knowledge to decompose difficult control problems is a widely used technique in robotics. Previous work has automated the process of identifying some qualitative behaviors of a system, finding a decomposi...
Dynamic optimization of the strength ratio during a terrestrial conflict
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Sztykgold, Alexandre; Coppin, Gilles; Hudry, Olivier (GET ENST Bretagne, LUSSI Dept, CNRS TAMCIC UMR 2872, Bretagne, France; GET ENST Bretagne, Dept Comp Sci, CNRS LTCI UMR 5141, Bretagne, France)
The aim of this study is to assist a military decision maker during his decision-making process when applying tactics on the battlefield. For that, we have decided to model the conflict by a game, on which we will see...
A recurrent control neural network for data efficient reinforcement learning
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Schaefer, Anton Maximilian; Udluft, Steffen; Zimmermann, Hans-Georg (Univ Ulm, Dept Optimisat & Operat Res, D-89069 Ulm, Germany; Siemens AG, Corp Technol, Dept Learning Syst, Informat & Commun, D-81739 Munich, Germany)
In this paper we introduce a new model-based approach for a data-efficient modelling and control of reinforcement learning problems in discrete time. Our architecture is based on a recurrent neural network (RNN) with ...
Robust dynamic programming for discounted infinite-horizon Markov decision processes with uncertain stationary transition matrices
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Li, Baohua; Si, Jennie (Arizona State Univ, Dept Elect Engn, Tempe, AZ 85287, USA)
In this paper, finite-state, finite-action, discounted infinite-horizon-cost Markov decision processes (MDPs) with uncertain stationary transition matrices are discussed in the deterministic policy space. Uncertain sta...
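The robust Bellman recursion this abstract describes can be illustrated with a worst-case value iteration, where "nature" picks the least favourable transition row from an uncertainty set at every backup. The 2-state, 2-action MDP, rewards, discount, and candidate transition rows below are all invented for illustration and are not taken from the paper:

```python
import numpy as np

gamma = 0.9
rewards = np.array([[1.0, 0.0],   # r[s, a], hypothetical values
                    [0.0, 2.0]])

# Uncertainty set: candidates[s][a] is a finite list of possible
# transition rows (probability vectors over next states) for (s, a).
candidates = [
    [[np.array([0.9, 0.1]), np.array([0.7, 0.3])],
     [np.array([0.5, 0.5]), np.array([0.4, 0.6])]],
    [[np.array([0.2, 0.8]), np.array([0.3, 0.7])],
     [np.array([0.1, 0.9]), np.array([0.05, 0.95])]],
]

def robust_value_iteration(tol=1e-8, max_iter=1000):
    """Max-min value iteration: the agent maximises over actions while
    nature minimises over the candidate transition rows."""
    V = np.zeros(2)
    for _ in range(max_iter):
        Q = np.empty((2, 2))
        for s in range(2):
            for a in range(2):
                worst = min(p @ V for p in candidates[s][a])  # nature's pick
                Q[s, a] = rewards[s, a] + gamma * worst
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V
```

Because nature re-optimises inside every backup, the iteration remains a gamma-contraction and converges to the robust (worst-case optimal) value function.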
Evolutionary computation on multitask reinforcement learning problems
IEEE International Conference on Networking, Sensing and Control
Authors: Handa, Hisashi (Okayama Univ, Grad Sch Nat Sci & Technol, Okayama 7008530, Japan)
Recently, multitask learning, which can cope with several tasks, has attracted much attention. Multitask reinforcement learning, introduced by Tanaka et al., is a problem class where a number of problem instances of Markov...
Value-iteration based fitted policy iteration: learning with a single trajectory
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Antos, Andras; Szepesvari, Csaba; Munos, Remi (Hungarian Acad Sci, Comp & Automat Res Inst, Kende u. 13-17, H-1111 Budapest, Hungary; Univ Alberta, Dept Comput Sci, Edmonton, AB, Canada)
We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems when the training data is composed of the trajectory of some fixed behaviour policy. ...
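The batch setting this abstract describes, learning from one fixed-behaviour trajectory, can be sketched with fitted Q-iteration: collect transitions under a random behaviour policy, then repeatedly regress Bellman targets. The 2-state chain MDP and all constants below are invented for illustration; the paper analyses continuous spaces with general function approximators, which a tabular "regressor" (the least-squares fit reduces to a per-cell mean) only stands in for:

```python
import random
import numpy as np

GAMMA = 0.9

def step(s, a, rng):
    # Toy dynamics: action 1 in state 1 pays off and tends to stay there.
    if s == 1 and a == 1:
        return (1 if rng.random() < 0.9 else 0), 2.0
    return (1 if rng.random() < 0.3 else 0), 0.0

def collect_trajectory(n=5000, seed=0):
    """One long trajectory under a fixed uniformly random behaviour policy."""
    rng = random.Random(seed)
    s, data = 0, []
    for _ in range(n):
        a = rng.randrange(2)
        s2, r = step(s, a, rng)
        data.append((s, a, r, s2))
        s = s2
    return data

def fitted_q_iteration(data, iters=50):
    Q = np.zeros((2, 2))
    for _ in range(iters):
        targets = {(s, a): [] for s in range(2) for a in range(2)}
        for s, a, r, s2 in data:
            targets[(s, a)].append(r + GAMMA * Q[s2].max())  # Bellman target
        # "Fitting" the targets: for a table, least squares is the mean.
        Q = np.array([[np.mean(targets[(s, a)]) for a in range(2)]
                      for s in range(2)])
    return Q
```

Note that no fresh data is gathered between iterations; all Bellman backups reuse the single stored trajectory, which is exactly the data regime the paper studies.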
A theoretical analysis of cooperative behavior in multi-agent Q-learning
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Waltman, Ludo; Kaymak, Uzay (Erasmus Univ, Erasmus Sch Econ, POB 1738, NL-3000 DR Rotterdam, Netherlands)
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This paper provides a theoret...
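The kind of experiment this abstract refers to can be sketched with two independent epsilon-greedy Q-learners in a repeated 2x2 coordination game, where cooperation means both agents settling on the same action. The payoff matrix, hyperparameters, and seed below are invented for illustration, not taken from the paper:

```python
import random

# Shared payoff: both agents are rewarded only when their actions match.
PAYOFF = {(0, 0): 5.0, (1, 1): 5.0, (0, 1): 0.0, (1, 0): 0.0}

def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]; the game is stateless
    for _ in range(episodes):
        acts = []
        for agent in range(2):
            if rng.random() < epsilon:                 # explore
                acts.append(rng.randrange(2))
            else:                                      # greedy (ties -> 0)
                acts.append(0 if q[agent][0] >= q[agent][1] else 1)
        r = PAYOFF[(acts[0], acts[1])]
        for agent in range(2):
            a = acts[agent]
            q[agent][a] += alpha * (r - q[agent][a])   # one-step Q update
    return q
```

Each learner treats the other as part of the environment, so whether the pair locks onto a coordinated joint action depends on exploration and initial conditions, which is the phenomenon the paper analyses theoretically.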
Evaluation of policy gradient methods and variants on the cart-pole benchmark
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Riedmiller, Martin; Peters, Jan; Schaal, Stefan (Univ Osnabruck, Neuroinformat Grp, D-4500 Osnabruck, Germany; Univ Southern Calif, Computat Learning & Motor Control, Los Angeles, CA 90007, USA)
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each o...
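The first family this abstract mentions, finite-difference policy gradients, estimates the gradient of the expected return by perturbing the policy parameters, rolling out each perturbation, and regressing the return differences onto the perturbations. The sketch below substitutes a smooth toy function (with a known maximum) for an actual cart-pole rollout; the function, step sizes, and seed are all illustrative assumptions:

```python
import numpy as np

def rollout_return(theta):
    # Hypothetical stand-in for an episode return: a smooth objective
    # with its maximum at theta = (1, -2).
    target = np.array([1.0, -2.0])
    return -np.sum((theta - target) ** 2)

def fd_policy_gradient(theta, n_perturb=20, delta=1e-2, seed=0):
    """Finite-difference gradient estimate of J(theta)."""
    rng = np.random.default_rng(seed)
    J0 = rollout_return(theta)
    D = rng.normal(scale=delta, size=(n_perturb, theta.size))  # perturbations
    dJ = np.array([rollout_return(theta + d) - J0 for d in D])
    # Least-squares regression of return differences on perturbations:
    # solve D g ~ dJ for the gradient estimate g.
    g, *_ = np.linalg.lstsq(D, dJ, rcond=None)
    return g

# Simple gradient ascent on the estimated gradient.
theta = np.zeros(2)
for _ in range(100):
    theta += 0.1 * fd_policy_gradient(theta)
```

The estimator needs one rollout per perturbation and makes no use of the policy's internal structure, which is why the paper can compare it directly against likelihood-ratio ('vanilla') and natural gradients on the same benchmark.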