Search query: Any field = "2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009"
232 records; showing records 31-40
Multi-Objective Reinforcement Learning for AUV Thruster Failure Recovery
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ahmadzadeh, Seyed Reza; Kormushev, Petar; Caldwell, Darwin G.
Affiliation: Ist Italiano Tecnol, Dept Adv Robot, Via Morego 30, I-16163 Genoa, Italy
This paper investigates learning approaches for discovering fault-tolerant control policies to overcome thruster failures in Autonomous Underwater Vehicles (AUV). The proposed approach is a model-based direct policy s...
Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Parisi, Simone; Pirotta, Matteo; Smacchia, Nicola; Bascetta, Luca; Restelli, Marcello
Affiliation: Politecn Milan, Dept Elect Informat & Bioengn, Piazza Leonardo da Vinci 32, I-20133 Milan, Italy
This paper investigates the use of policy gradient techniques to approximate the Pareto frontier in Multi-Objective Markov Decision Processes (MOMDPs). Despite the popularity of policy-gradient algorithms and the fact...
Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorithm
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yahyaa, Saba Q.; Drugan, Madalina M.; Manderick, Bernard
Affiliation: Vrije Univ Brussel, Dept Comp Sci, Pl Laan 2, B-1050 Brussels, Belgium
In the stochastic multi-objective multi-armed bandit (or MOMAB), arms generate a vector of stochastic rewards, one per objective, instead of a single scalar reward. As a result, there is not only one optimal arm, but ...
Theoretical analysis of a reinforcement learning based switching scheme
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Author: Ali Heydari
Affiliation: Mechanical Engineering Department, South Dakota School of Mines and Technology, Rapid City, SD
A reinforcement learning based scheme for optimal switching with an infinite-horizon cost function is briefly proposed in this paper. Several theoretical questions are shown to arise regarding its convergence, optimal...
An adaptive dynamic programming algorithm to solve optimal control of uncertain nonlinear systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiaohong Cui; Yanhong Luo; Huaguang Zhang
Affiliation: School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China
In this paper, an approximate optimal control method based on adaptive dynamic programming (ADP) is discussed for completely unknown nonlinear systems. An online critic-action-identifier algorithm is developed using neu...
Near-optimality bounds for greedy periodic policies with application to grid-level storage
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yuhai Hu; Boris Defourny
Affiliation: Department of Industrial & Systems Engineering, Lehigh University, USA
This paper is concerned with periodic Markov Decision Processes, as a simplified but already rich model for nonstationary infinite-horizon problems involving seasonal effects. Considering the class of policies greedy ...
Adaptive dynamic programming-based optimal tracking control for nonlinear systems using general value iteration
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiaofeng Lin; Qiang Ding; Weikai Kong; Chunning Song; Qingbao Huang
Affiliation: School of Electrical Engineering, Guangxi University, Nanning, China
For the optimal tracking control problem of affine nonlinear systems, a general value iteration algorithm based on adaptive dynamic programming is proposed in this paper. By system transformation, the optimal tracking...
A data-based online reinforcement learning algorithm with high-efficient exploration
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yuanheng Zhu; Dongbin Zhao
Affiliation: The State Key Laboratory of Management and Control for Complex Systems, Chinese Academy of Sciences, Beijing, China
An online reinforcement learning algorithm is proposed in this paper to directly utilize online data efficiently for continuous deterministic systems without system parameters. The dependence on some specific approxi...
ADP-based optimal control for a class of nonlinear discrete-time systems with inequality constraints
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yanhong Luo; Geyang Xiao
Affiliation: College of Information Science and Engineering, Northeastern University
In this paper, the adaptive dynamic programming (ADP) approach is utilized to design a neural-network-based optimal controller for a class of nonlinear discrete-time (DT) systems with inequality constraints. To begin ...
Using supervised training signals of observable state dynamics to speed-up and improve reinforcement learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Daniel L Elliott; Charles Anderson
Affiliation: Dept of Computer Science, Colorado State University
A common complaint about reinforcement learning (RL) is that it is too slow to learn a value function which gives good performance. This issue is exacerbated in continuous state spaces. This paper presents a straight-...