
Refine Results

Document Type

  • 228 conference papers
  • 4 journal articles

Holdings

  • 232 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 98 Engineering
    • 93 Computer Science and Technology...
    • 40 Software Engineering
    • 25 Electrical Engineering
    • 14 Control Science and Engineering
    • 4 Mechanical Engineering
    • 1 Mechanics (may be conferred in Engineering, Sci...
    • 1 Information and Communication Engineering
    • 1 Architecture
    • 1 Chemical Engineering and Technology
    • 1 Transportation Engineering
  • 23 Science
    • 23 Mathematics
    • 6 Statistics (may be conferred in Science,...
    • 4 Systems Science
    • 1 Chemistry
    • 1 Atmospheric Science
  • 9 Management
    • 7 Management Science and Engineering (may...
    • 3 Business Administration
    • 2 Library, Information and Archives Manage...
  • 2 Economics
    • 2 Applied Economics
  • 1 Law
    • 1 Sociology

Topics

  • 95 篇 dynamic programm...
  • 52 篇 learning
  • 46 篇 optimal control
  • 37 篇 reinforcement le...
  • 34 篇 learning (artifi...
  • 27 篇 equations
  • 22 篇 heuristic algori...
  • 21 篇 control systems
  • 20 篇 convergence
  • 19 篇 neural networks
  • 18 篇 function approxi...
  • 17 篇 mathematical mod...
  • 16 篇 approximation al...
  • 15 篇 vectors
  • 14 篇 markov processes
  • 14 篇 artificial neura...
  • 14 篇 cost function
  • 13 篇 stochastic proce...
  • 12 篇 algorithm design...
  • 12 篇 adaptive control

Institutions

  • 5 篇 school of inform...
  • 4 篇 northeastern uni...
  • 4 篇 department of el...
  • 4 篇 department of in...
  • 3 篇 department of el...
  • 3 篇 automation and r...
  • 3 篇 northeastern uni...
  • 3 篇 robotics institu...
  • 3 篇 key laboratory o...
  • 3 篇 univ illinois de...
  • 2 篇 department of ar...
  • 2 篇 school of electr...
  • 2 篇 univ groningen i...
  • 2 篇 univ texas autom...
  • 2 篇 colorado state u...
  • 2 篇 guangxi univ sch...
  • 2 篇 national science...
  • 2 篇 informatics inst...
  • 2 篇 college of infor...
  • 2 篇 school of automa...

Authors

  • 7 篇 hado van hasselt
  • 7 篇 lewis frank l.
  • 7 篇 marco a. wiering
  • 7 篇 dongbin zhao
  • 6 篇 liu derong
  • 5 篇 huaguang zhang
  • 5 篇 zhang huaguang
  • 5 篇 derong liu
  • 5 篇 warren b. powell
  • 4 篇 xu xin
  • 4 篇 vrabie draguna
  • 4 篇 jagannathan s.
  • 4 篇 frank l. lewis
  • 4 篇 yanhong luo
  • 4 篇 damien ernst
  • 4 篇 jan peters
  • 4 篇 peters jan
  • 4 篇 zhao dongbin
  • 3 篇 xu hao
  • 3 篇 martin riedmille...

Language

  • 232 English

Search query: "Any field = 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009"
232 records; showing results 91–100
Continuous-time ADP for linear systems with partially unknown dynamics
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Vrabie, Draguna; Abu-Khalaf, Murad; Lewis, Frank L.; Wang, Youyi. Univ Texas Automat & Robot Res Inst, Ft Worth TX 76118, USA; Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore
Approximate dynamic programming has been formulated and applied mainly to discrete-time systems. Expressing the ADP concept for continuous-time systems raises difficult issues related to sampling time and system model...
Agent self-assessment: Determining policy quality without execution
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning
Authors: Hans, Alexander; Duell, Siegmund; Udluft, Steffen. Neuroinformatics and Cognitive Robotics Lab, Ilmenau University of Technology, Ilmenau, Germany; Machine Learning Group, Berlin Institute of Technology, Berlin, Germany; Intelligent Systems and Control, Siemens AG Corporate Technology, Munich, Germany
With the development of data-efficient reinforcement learning (RL) methods, a promising data-driven solution for optimal control of complex technical systems has become available. For the application of RL to a techni...
Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Parisi, Simone; Pirotta, Matteo; Smacchia, Nicola; Bascetta, Luca; Restelli, Marcello. Politecn Milan, Dept Elect Informat & Bioengn, Piazza Leonardo da Vinci 32, I-20133 Milan, Italy
This paper investigates the use of policy gradient techniques to approximate the Pareto frontier in Multi-Objective Markov Decision Processes (MOMDPs). Despite the popularity of policy-gradient algorithms and the fact...
Online adaptive learning of optimal control solutions using integral reinforcement learning
Authors: Vamvoudakis, Kyriakos G.; Vrabie, Draguna; Lewis, Frank L. Automation and Robotics Research Institute, University of Texas at Arlington, Fort Worth TX 76118, United States
In this paper we introduce an online algorithm that uses integral reinforcement knowledge for learning the continuous-time optimal control solution for nonlinear systems with infinite horizon costs and partial knowled...
Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorithm
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yahyaa, Saba Q.; Drugan, Madalina M.; Manderick, Bernard. Vrije Univ Brussel, Dept Comp Sci, Pl Laan 2, B-1050 Brussels, Belgium
In the stochastic multi-objective multi-armed bandit (or MOMAB), arms generate a vector of stochastic rewards, one per objective, instead of a single scalar reward. As a result, there is not only one optimal arm, but ...
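The annealing-Pareto algorithm itself is not detailed in this excerpt. As a hedged illustration of the MOMAB setting only, the sketch below computes the set of Pareto-optimal arms from their mean-reward vectors, i.e. the dominance test such methods build on; the arm values are made up, not from the paper:

```python
def dominates(u, v):
    """True if vector u Pareto-dominates v: >= in every objective, > in at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def pareto_front(arms):
    """Indices of arms whose mean-reward vectors no other arm dominates."""
    return [i for i, u in enumerate(arms)
            if not any(dominates(v, u) for j, v in enumerate(arms) if j != i)]

# Illustrative two-objective mean rewards for four arms (made-up numbers):
means = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]
front = pareto_front(means)  # arm 3 is dominated by arm 1, so it is excluded
```

Because several arms can be mutually non-dominated, the front generally contains more than one arm, which is why MOMAB algorithms must trade off exploring the whole front rather than a single best arm.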
Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Al-Tamimi, Asma; Lewis, Frank. Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth TX 76118, USA
In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely heuristic dynamic programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman equation (HJB...
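The abstract names heuristic dynamic programming (HDP), a greedy value-iteration scheme for the discrete-time HJB equation. As a hedged sketch (not the paper's algorithm; the scalar system values a, b, q, r below are illustrative), value iteration for a scalar linear-quadratic problem reduces to the Riccati difference recursion:

```python
# HDP-style value iteration for a scalar discrete-time LQR problem:
#   x_{k+1} = a*x_k + b*u_k,  cost = sum of q*x^2 + r*u^2.
# With V_k(x) = p_k * x^2, the greedy update
#   V_{k+1}(x) = min_u [ q*x^2 + r*u^2 + V_k(a*x + b*u) ]
# reduces to a recursion on the scalar p_k.

def hdp_value_iteration(a, b, q, r, iters=200):
    p = 0.0  # start from V_0 = 0
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p

p = hdp_value_iteration(a=0.9, b=1.0, q=1.0, r=1.0)
# At convergence p satisfies the scalar discrete algebraic Riccati equation,
# so this residual should be near zero:
residual = p - (1.0 + 0.81 * p - (0.9 * p) ** 2 / (1.0 + p))
```

In this linear-quadratic special case the HJB equation collapses to the discrete algebraic Riccati equation, so convergence of p_k to its positive root mirrors the kind of convergence result the abstract refers to, without reproducing the paper's nonlinear proof.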
A Retrospective on Adaptive Dynamic Programming for Control
International Joint Conference on Neural Networks
Authors: Lendaris, George G. Portland State Univ, Syst Sci Grad Program, Portland OR 97201, USA
Some three decades ago, certain computational intelligence methods of reinforcement learning were recognized as implementing an approximation of Bellman's dynamic programming method, which is known in the controls...
Pattern Driven Dynamic Scheduling Approach Using Reinforcement Learning
IEEE International Conference on Automation and Logistics
Authors: Wei Yingzi; Jiang Xinli; Hao Pingbo; Gu Kanfeng. Shenyang Ligong Univ, Shenyang 110168, Peoples R China; Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Peoples R China
Production scheduling is critical for manufacturing systems. Dispatching rules are usually applied dynamically to schedule jobs in a dynamic job shop. The paper presents an adaptive iterative scheduling algorithm ...
Adaptive Dynamic Programming for Feedback Control
7th Asian Control Conference (ASCC 2009)
Authors: Lewis, Frank L.; Vrabie, Draguna. Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth TX 76118, USA
Living organisms learn by acting on their environment, observing the resulting reward stimulus, and adjusting their actions accordingly to improve the reward. This action-based or reinforcement learning can capture no...
Adaptive computation of optimal nonrandomized policies in constrained average-reward MDPs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Eugene A. Feinberg. Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
This paper deals with computation of optimal nonrandomized nonstationary policies and mixed stationary policies for average-reward Markov decision processes with multiple criteria and constraints. We consider problems...