Search criteria: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 records; showing 171-180

Dynamic lead time promising
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Matthew J. Reindorp, Michael C. Fu. Affiliations: Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, Netherlands; Robert H. Smith School of Business and Institute of Systems Research, University of Maryland, USA
We consider a make-to-order business that serves customers in multiple priority classes. Orders from customers in higher classes bring greater revenue, but they expect shorter lead times than customers in lower classe...

Analyzing collective behavior in evolutionary swarm robotic systems based on an ethological approach
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Toshiyuki Yasuda, Nanami Wada, Kazuhiro Ohkura, Yoshiyuki Matsumura. Affiliations: Graduate School of Engineering, Hiroshima University, Higashi-Hiroshima, Japan; Faculty of Textile Science and Technology, Shinshu University, Ueda, Nagano, Japan
Swarm robotic systems are a type of multi-robot systems which generally consist of many homogeneous autonomous robots without any type of global controllers. Swarm robotics aims at designing desired collective behavio...

Using reward-weighted imitation for robot Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Jan Peters, Jens Kober. Affiliations: Department of Empirical Inference and Machine Learning, Max-Planck Institute of Biological Cybernetics, Tübingen, Germany
Reinforcement learning is an essential ability for robots to learn new motor skills. Nevertheless, few methods scale into the domain of anthropomorphic robotics. In order to improve in terms of efficiency, the problem...

Safe reinforcement learning in high-risk tasks through policy improvement
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Francisco Javier Garcia Polo, Fernando Fernandez Rebollo. Affiliations: Computer Science Department, Universidad Carlos III de Madrid, Madrid, Spain
Reinforcement learning (RL) methods are widely used for dynamic control tasks. In many cases, these are high risk tasks where the trial and error process may select actions which execution from unsafe states can be ca...

Event-Triggered Reinforcement Learning Approach for Unknown Nonlinear Continuous-Time System
International Joint Conference on Neural Networks (IJCNN)
Authors: Zhong, Xiangnan; Ni, Zhen; He, Haibo; Xu, Xin; Zhao, Dongbin. Affiliations: Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA; Natl Univ Def Technol, Coll Mechatron & Automat, Changsha 410073, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
This paper provides an adaptive event-triggered method using adaptive dynamic programming (ADP) for the nonlinear continuous-time system. Comparing to the traditional method with fixed sampling period, the event-trigg...

Feature discovery in approximate dynamic programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Philippe Preux, Sertan Girgin, Manuel Loth. Affiliations: Laboratoire d'Informatique Fondamentale de Lille (Computer Science Laboratory associated with the CNRS) and INRIA, Université de Lille, France
Feature discovery aims at finding the best representation of data. This is a very important topic in machine learning, and in reinforcement learning in particular. Based on our recent work on feature discovery in the ...

The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Peter Frazier, Warren Powell. Affiliations: Department of Operations Research and Financial Engineering, Princeton University Engineering, Princeton, NJ, USA
We define a new type of policy, the knowledge gradient policy, in the context of an offline learning problem. We show how to compute the knowledge gradient policy efficiently and demonstrate through Monte Carlo simula...

Toward effective combination of off-line and on-line training in ADP framework
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Danil Prokhorov. Affiliations: Toyota Technical Center, Ann Arbor, MI, USA
We are interested in finding the most effective combination between off-line and on-line/real-time training in approximate dynamic programming. We introduce our approach of combining proven off-line methods of trainin...

Inferring bounds on the performance of a control policy from a sample of trajectories
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Raphael Fonteneau, Susan Murphy, Louis Wehenkel, Damien Ernst. Affiliations: Department of Electrical Engineering and Computer Science, University of Liège, Belgium; University of Michigan, USA
We propose an approach for inferring bounds on the finite-horizon return of a control policy from an off-policy sample of trajectories collecting state transitions, rewards, and control actions. In this paper, the dyn...

Using ADP to Understand and Replicate Brain Intelligence: the Next Level Design
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Paul J. Werbos. Affiliations: National Science Foundation, Arlington, VA, USA
Since the 1960's the author proposed that we could understand and replicate the highest level of intelligence seen in the brain, by building ever more capable and general systems for adaptive dynamic programming (...