Search query: any field = "IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"
307 records; showing results 181-190

Call admission control in wireless DS-CDMA systems using actor-critic reinforcement learning
2nd International Symposium on Wireless Pervasive Computing
Authors: Chanloha, Pitipong; Usaha, Wipawee
Affiliations: Suranaree Univ Technol, Sch Telecommun Engn, Nakhon Ratchasima 30000, Thailand
This paper addresses the call admission control (CAC) problem for multiple services in the uplink of a cellular system using direct sequential code division multiple access (DS-CDMA) when taking into account the physi...

Kernelizing LSPE(λ)
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Tobias Jung; Daniel Polani
Affiliations: University of Mainz, Germany; University of Hertfordshire, UK
We propose the use of kernel-based methods as underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular we present the 'kernelization' of m...

Fitted Q Iteration with CMACs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stephan Timmer; Martin Riedmiller
Affiliations: Department of Computer Science, University of Osnabrück, Osnabrück, Germany
A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important in case of continuous, high dimensional state spaces, since ...

Two Novel On-policy Reinforcement Learning Algorithms based on TD(λ)-methods
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering; Hado van Hasselt
Affiliations: Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands
This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value-function using TD(λ)-methods. The ...

Q-Learning with Continuous State Spaces and Finite Decision Set
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kengy Barty; Pierre Girardeau; Jean-Sebastien Roy; Cyrille Strugarek
Affiliations: EDF Research and Development, Clamart, France
This paper aims to present an original technique in order to compute the optimal policy of a Markov decision problem with continuous state space and discrete decision variables. We propose an extension of the Q-learni...

Coordinated Reinforcement Learning for Decentralized Optimal Control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Daniel Yagan; Chen-Khong Tham
Affiliations: Department of Electrical and Computer Engineering, National University of Singapore, Singapore
We consider a multi-agent system where the overall performance is affected by the joint actions or policies of agents. However, each agent only observes a partial view of the global state condition. This model is know...

A Theoretical Analysis of Cooperative Behavior in Multi-agent Q-learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ludo Waltman; Uzay Kaymak
Affiliations: Erasmus University Rotterdam, Rotterdam, Netherlands
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This paper provides a theore...

Dynamic optimization of the strength ratio during a terrestrial conflict
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Alexandre Sztykgold; Gilles Coppin; Olivier Hudry
Affiliations: GET/ENST-Bretagne, LUSSI Department, France; GET/ENST, Computer Science Department, France
The aim of this study is to assist a military decision maker during his decision-making process when applying tactics on the battlefield. For that, we have decided to model the conflict by a game, on which we will see...

Safe Adaptive Dynamic Programming Method for Nonlinear Safety-Critical Systems with Disturbance
6th International Conference on Robotics and Automation Engineering, ICRAE 2021
Authors: Wang, Jinguang; Zhang, Dehua; Zhang, Jishi; Zhu, Heyang; Hu, Shaolin; Qin, Chunbin
Affiliations: Henan University, School of Artificial Intelligence, Kaifeng, China; Guangdong University of Petrochemical Technology, School of Automation, Maoming, China
In this paper, a safe adaptive dynamic programming (SADP) method based on the barrier function (BF) is proposed for the optimal control problem of nonlinear safety-critical systems with the safety constraints and exte...

Identifying trajectory classes in dynamic tasks
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stuart O. Anderson; Siddhartha S. Srinivasa
Affiliations: Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Intel Research Pittsburgh, Intel Corporation, Pittsburgh, PA, USA
Using domain knowledge to decompose difficult control problems is a widely used technique in robotics. Previous work has automated the process of identifying some qualitative behaviors of a system, finding a decomposi...