Refine Results

Document Type

  • 228 conference papers
  • 4 journal articles

Holdings

  • 232 electronic documents
  • 0 print holdings

Subject Classification

  • 98 Engineering
    • 93 Computer Science and Technology...
    • 40 Software Engineering
    • 25 Electrical Engineering
    • 14 Control Science and Engineering
    • 4 Mechanical Engineering
    • 1 Mechanics (may be conferred in Engineering or Sci...
    • 1 Information and Communication Engineering
    • 1 Architecture
    • 1 Chemical Engineering and Technology
    • 1 Transportation Engineering
  • 23 Science
    • 23 Mathematics
    • 6 Statistics (may be conferred in Science or...
    • 4 Systems Science
    • 1 Chemistry
    • 1 Atmospheric Science
  • 9 Management
    • 7 Management Science and Engineering (may...
    • 3 Business Administration
    • 2 Library, Information and Archival Manage...
  • 2 Economics
    • 2 Applied Economics
  • 1 Law
    • 1 Sociology

Topics

  • 95 dynamic programm...
  • 52 learning
  • 46 optimal control
  • 37 reinforcement le...
  • 34 learning (artifi...
  • 27 equations
  • 22 heuristic algori...
  • 21 control systems
  • 20 convergence
  • 19 neural networks
  • 18 function approxi...
  • 17 mathematical mod...
  • 16 approximation al...
  • 15 vectors
  • 14 markov processes
  • 14 artificial neura...
  • 14 cost function
  • 13 stochastic proce...
  • 12 algorithm design...
  • 12 adaptive control

Institutions

  • 5 school of inform...
  • 4 northeastern uni...
  • 4 department of el...
  • 4 department of in...
  • 3 department of el...
  • 3 automation and r...
  • 3 northeastern uni...
  • 3 robotics institu...
  • 3 key laboratory o...
  • 3 univ illinois de...
  • 2 department of ar...
  • 2 school of electr...
  • 2 univ groningen i...
  • 2 univ texas autom...
  • 2 colorado state u...
  • 2 guangxi univ sch...
  • 2 national science...
  • 2 informatics inst...
  • 2 college of infor...
  • 2 school of automa...

Authors

  • 7 hado van hasselt
  • 7 lewis frank l.
  • 7 marco a. wiering
  • 7 dongbin zhao
  • 6 liu derong
  • 5 huaguang zhang
  • 5 zhang huaguang
  • 5 derong liu
  • 5 warren b. powell
  • 4 xu xin
  • 4 vrabie draguna
  • 4 jagannathan s.
  • 4 frank l. lewis
  • 4 yanhong luo
  • 4 damien ernst
  • 4 jan peters
  • 4 peters jan
  • 4 zhao dongbin
  • 3 xu hao
  • 3 martin riedmille...

Language

  • 232 English
Search query: "Any field = 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009"
232 records; showing 201–210
Q-learning with Continuous State Spaces and Finite Decision Set
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kengy Barty, Pierre Girardeau, Jean-Sebastien Roy, Cyrille Strugarek (EDF Research and Development, Clamart, France)
This paper presents an original technique for computing the optimal policy of a Markov decision problem with a continuous state space and discrete decision variables. We propose an extension of the Q-learni...
Editorial: Special Issue on Adaptive Dynamic Programming and Reinforcement Learning
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, vol. 50, no. 11, pp. 3944–3947
Authors: Liu, Derong; Lewis, Frank L.; Wei, Qinglai (School of Automation, Guangdong University of Technology, Guangzhou 510006, China; UTA Research Institute, University of Texas at Arlington, Fort Worth, TX 76118, United States; State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China)
The past decade has witnessed a surge in research activities related to adaptive dynamic programming (ADP) and reinforcement learning (RL), particularly for control applications. Several books [item 1)–5) in the Appe...
Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration
6th International Symposium on Neural Networks
Authors: Zhang, Pengchen; Xu, Xin; Liu, Chunming; Yuan, Qiping (Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China)
Machine learning for mobile robots has attracted considerable research interest in recent years. However, there are still many challenges in applying learning techniques to real mobile robots, e.g., generalization in contin...
A Theoretical Analysis of Cooperative Behavior in Multi-agent Q-learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ludo Waltman, Uzay Kaymak (Erasmus University Rotterdam, Rotterdam, Netherlands)
A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge; in others it did not. This paper provides a theore...
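As a toy illustration of the setting studied above (not the paper's analysis), the sketch below runs two independent, myopic Q-learners in the iterated Prisoner's Dilemma. The payoff matrix, learning rate, and exploration rate are assumptions for the demo; under these conditions defection typically comes to dominate both agents' value estimates.

```python
import random

random.seed(1)

# Stateless, myopic Q-learning in the iterated Prisoner's Dilemma
# (payoffs and rates are assumptions for this demo): each agent keeps
# one Q-value per action, 0 = cooperate, 1 = defect.
PAYOFF = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}

q1, q2 = [0.0, 0.0], [0.0, 0.0]
alpha, eps = 0.05, 0.1

def pick(q):
    # epsilon-greedy action selection; ties favor cooperation
    if random.random() < eps:
        return random.randrange(2)
    return 0 if q[0] >= q[1] else 1

for _ in range(20000):
    a1, a2 = pick(q1), pick(q2)
    r1, r2 = PAYOFF[(a1, a2)]
    # gamma = 0: each agent treats the repeated game as a bandit problem
    q1[a1] += alpha * (r1 - q1[a1])
    q2[a2] += alpha * (r2 - q2[a2])

print(q1, q2)  # defection's Q-value typically ends up highest for both agents
```

Whether cooperation can survive instead depends on details such as discounting, exploration schedules, and the learners' ability to condition on the opponent's past moves, which is exactly the kind of question a theoretical analysis must settle.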
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Martin Riedmiller, Jan Peters, Stefan Schaal (NeuroInformatics Group, University of Osnabrück, Germany; Computational Learning and Motor Control, University of Southern California, USA)
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each o...
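Of the three families compared above, the finite-difference approach is the simplest to sketch: perturb each parameter, measure the change in return, and ascend the estimated gradient. The toy objective J below stands in for an episodic return and is an assumption for the demo; in an RL setting it would be an average over sampled rollouts.

```python
import numpy as np

# A deterministic stand-in for the expected return of a parameterized
# policy, with its optimum at theta = [2, -1] (an assumption for the demo).
def J(theta):
    return -np.sum((theta - np.array([2.0, -1.0])) ** 2)

def fd_gradient(theta, delta=1e-4):
    # Central finite differences: perturb one parameter at a time.
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = delta
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * delta)
    return grad

theta = np.zeros(2)
for _ in range(200):
    theta += 0.1 * fd_gradient(theta)  # gradient ascent on the return

print(np.round(theta, 3))  # → [ 2. -1.]
```

With noisy rollout returns, each difference would have to be averaged over many episodes, which is why the paper's comparison against likelihood-ratio ('vanilla') and natural gradients is informative.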
Efficient Learning in Cellular Simultaneous Recurrent Neural Networks - The Case of Maze Navigation Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Roman Ilin, Robert Kozma, Paul J. Werbos (Department of Mathematical Sciences, University of Memphis, Memphis, TN, USA; National Science Foundation, Arlington, VA, USA)
Cellular simultaneous recurrent neural networks (SRN) show great promise in solving complex function approximation problems. In particular, approximate dynamic programming is an important application area where SRNs h...
A Novel Fuzzy Reinforcement Learning Approach in Two-Level Intelligent Control of 3-DOF Robot Manipulators
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Nasser Sadati, Mohammad Mollaie Emamzadeh (Electrical Engineering Department, Sharif University of Technology, Tehran, Iran)
In this paper, a fuzzy coordination method based on the interaction prediction principle (IPP) and reinforcement learning is presented for the optimal control of robot manipulators with three degrees of freedom. For this ...
Strategy Generation with Cognitive Distance in Two-Player Games
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kosuke Sekiyama, Ricardo Carnieri, Toshio Fukuda (Department of Micro-Nano Systems Engineering, University of Nagoya, Nagoya, Japan)
In game-theoretical approaches to multi-agent systems, a payoff matrix is often given a priori and used by agents in action selection. By contrast, in this paper we approach the problem of decision making by use of th...
Two Novel On-policy Reinforcement Learning Algorithms Based on TD(λ)-methods
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering, Hado van Hasselt (Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands)
This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value-function using TD(λ)-methods. The ...
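The common machinery behind such algorithms is a state value-function learned with TD(λ) eligibility traces. The sketch below is plain tabular TD(λ) on the classic 5-state random-walk chain, not QV(λ) or ACLA themselves; the chain task and step sizes are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tabular TD(lambda) state-value learning on the 5-state random walk:
# start in the middle, move left/right uniformly at random, reward +1
# only on the right exit. True values under this policy are i/6 for
# state i = 1..5.
N, alpha, gamma, lam = 5, 0.05, 1.0, 0.8
V = np.zeros(N + 2)  # states 0..6, with 0 and 6 terminal

for _ in range(5000):
    s = 3
    z = np.zeros(N + 2)  # accumulating eligibility traces
    while s not in (0, 6):
        s_next = s + (1 if rng.random() < 0.5 else -1)
        r = 1.0 if s_next == 6 else 0.0
        td = r + gamma * V[s_next] - V[s]   # one-step TD error
        z *= gamma * lam                    # decay all traces
        z[s] += 1.0                         # bump the visited state
        V += alpha * td * z                 # credit all recent states
        s = s_next

print(np.round(V[1:6], 2))  # should be close to [1/6, 2/6, 3/6, 4/6, 5/6]
```

QV(λ)-learning and ACLA both build on exactly this kind of V-function update, then derive action preferences or Q-values from it.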
Dynamic optimization of the strength ratio during a terrestrial conflict
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Alexandre Sztykgold, Gilles Coppin, Olivier Hudry (GET/ENST-Bretagne, LUSSI Department, France; GET/ENST, Computer Science Department, France)
The aim of this study is to assist a military decision maker during his decision-making process when applying tactics on the battlefield. For that, we have decided to model the conflict as a game, on which we will see...