咨询与建议

限定检索结果

文献类型

  • 229 篇 会议
  • 18 篇 期刊文献

馆藏范围

  • 247 篇 电子文献
  • 0 种 纸本馆藏

日期分布

学科分类号

  • 113 篇 工学
    • 103 篇 计算机科学与技术...
    • 42 篇 软件工程
    • 38 篇 电气工程
    • 23 篇 控制科学与工程
    • 5 篇 信息与通信工程
    • 3 篇 机械工程
    • 2 篇 力学(可授工学、理...
    • 1 篇 仪器科学与技术
    • 1 篇 建筑学
    • 1 篇 化学工程与技术
    • 1 篇 交通运输工程
  • 27 篇 理学
    • 25 篇 数学
    • 7 篇 系统科学
    • 6 篇 统计学(可授理学、...
    • 1 篇 物理学
    • 1 篇 化学
    • 1 篇 大气科学
  • 10 篇 管理学
    • 8 篇 管理科学与工程(可...
    • 3 篇 工商管理
    • 2 篇 图书情报与档案管...
  • 2 篇 经济学
    • 2 篇 应用经济学
  • 1 篇 法学
    • 1 篇 社会学

主题

  • 95 篇 dynamic programm...
  • 54 篇 optimal control
  • 50 篇 learning
  • 44 篇 reinforcement le...
  • 35 篇 learning (artifi...
  • 27 篇 equations
  • 25 篇 neural networks
  • 22 篇 heuristic algori...
  • 20 篇 convergence
  • 20 篇 control systems
  • 18 篇 function approxi...
  • 18 篇 mathematical mod...
  • 16 篇 approximation al...
  • 15 篇 vectors
  • 15 篇 cost function
  • 14 篇 markov processes
  • 14 篇 nonlinear system...
  • 14 篇 artificial neura...
  • 13 篇 stochastic proce...
  • 12 篇 adaptive dynamic...

机构

  • 10 篇 chinese acad sci...
  • 5 篇 school of inform...
  • 4 篇 northeastern uni...
  • 4 篇 department of el...
  • 4 篇 department of in...
  • 3 篇 department of el...
  • 3 篇 automation and r...
  • 3 篇 department of el...
  • 3 篇 robotics institu...
  • 3 篇 key laboratory o...
  • 3 篇 natl univ def te...
  • 3 篇 univ illinois de...
  • 2 篇 department of ar...
  • 2 篇 school of electr...
  • 2 篇 univ groningen i...
  • 2 篇 univ texas autom...
  • 2 篇 colorado state u...
  • 2 篇 guangxi univ sch...
  • 2 篇 national science...
  • 2 篇 informatics inst...

作者

  • 13 篇 liu derong
  • 7 篇 hado van hasselt
  • 7 篇 marco a. wiering
  • 7 篇 dongbin zhao
  • 6 篇 zhao dongbin
  • 5 篇 xu xin
  • 5 篇 lewis frank l.
  • 5 篇 huaguang zhang
  • 5 篇 wei qinglai
  • 5 篇 derong liu
  • 5 篇 warren b. powell
  • 4 篇 haibo he
  • 4 篇 jagannathan s.
  • 4 篇 frank l. lewis
  • 4 篇 zhang huaguang
  • 4 篇 ni zhen
  • 4 篇 yanhong luo
  • 4 篇 wang ding
  • 4 篇 he haibo
  • 4 篇 damien ernst

语言

  • 246 篇 英文
  • 1 篇 其他
检索条件"任意字段=2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 条 记 录,以下是31-40 订阅
排序:
Closed-Loop Control of Anesthesia and Mean Arterial Pressure Using reinforcement learning
Closed-Loop Control of Anesthesia and Mean Arterial Pressure...
收藏 引用
ieee symposium on adaptive dynamic programming and reinforcement learning (adprl)
作者: Padmanabhan, Regina Meskin, Nader Haddad, Wassim M. Qatar Univ Dept Elect Engn Doha Qatar Georgia Inst Technol Sch Aerosp Engn Atlanta GA 30332 USA
General anesthesia is required for patients undergoing surgery as well as for some patients in the intensive care units with acute respiratory distress syndrome. However, most anesthetics affect cardiac and respirator... 详细信息
来源: 评论
Multi-Objective reinforcement learning for AUV Thruster Failure Recovery
Multi-Objective Reinforcement Learning for AUV Thruster Fail...
收藏 引用
ieee symposium on adaptive dynamic programming and reinforcement learning (adprl)
作者: Ahmadzadeh, Seyed Reza Kormushev, Petar Caldwell, Darwin G. Ist Italiano Tecnol Dept Adv Robot Via Morego 30 I-16163 Genoa Italy
This paper investigates learning approaches for discovering fault-tolerant control policies to overcome thruster failures in Autonomous Underwater Vehicles (AUV). The proposed approach is a model-based direct policy s... 详细信息
来源: 评论
Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison
Policy Gradient Approaches for Multi-Objective Sequential De...
收藏 引用
ieee symposium on adaptive dynamic programming and reinforcement learning (adprl)
作者: Parisi, Simone Pirotta, Matteo Smacchia, Nicola Bascetta, Luca Restelli, Marcello Politecn Milan Dept Elect Informat & Bioengn Piazza Leonardo da Vinci 32 I-20133 Milan Italy
This paper investigates the use of policy gradient techniques to approximate the Pareto frontier in Multi-Objective Markov Decision Processes (MOMDPs). Despite the popularity of policy-gradient algorithms and the fact... 详细信息
来源: 评论
Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorithm
Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorith...
收藏 引用
ieee symposium on adaptive dynamic programming and reinforcement learning (adprl)
作者: Yahyaa, Saba Q. Drugan, Madalina M. Manderick, Bernard Vrije Univ Brussel Dept Comp Sci Pl Laan 2 B-1050 Brussels Belgium
In the stochastic multi-objective multi-armed bandit (or MOMAB), arms generate a vector of stochastic rewards, one per objective, instead of a single scalar reward. As a result, there is not only one optimal arm, but ... 详细信息
来源: 评论
Clipping in Neurocontrol by adaptive dynamic programming
收藏 引用
ieee TRANSACTIONS ON NEURAL NETWORKS AND learning SYSTEMS 2014年 第10期25卷 1909-1920页
作者: Fairbank, Michael Prokhorov, Danil Alonso, Eduardo City Univ London Sch Informat Dept Comp Sci London EC1V OHB England Toyota Res Inst NA Ann Arbor MI 48105 USA
In adaptive dynamic programming, neurocontrol, and reinforcement learning, the objective is for an agent to learn to choose actions so as to minimize a total cost function. In this paper, we show that when discretized... 详细信息
来源: 评论
Policy Iteration adaptive dynamic programming Algorithm for Discrete-Time Nonlinear Systems
收藏 引用
ieee TRANSACTIONS ON NEURAL NETWORKS AND learning SYSTEMS 2014年 第3期25卷 621-634页
作者: Liu, Derong Wei, Qinglai Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
This paper is concerned with a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite horizon optimal control problem of nonlinear systems. The idea is to use an iterativ... 详细信息
来源: 评论
A Novel Iterative θ-adaptive dynamic programming for Discrete-Time Nonlinear Systems
收藏 引用
ieee TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING 2014年 第4期11卷 1176-1190页
作者: Wei, Qinglai Liu, Derong Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
This paper is concerned with a new iterative theta-adaptive dynamic programming (ADP) technique to solve optimal control problems of infinite horizon discrete-time nonlinear systems. The idea is to use an iterative AD... 详细信息
来源: 评论
Finite-Approximation-Error-Based Discrete-Time Iterative adaptive dynamic programming
收藏 引用
ieee TRANSACTIONS ON CYBERNETICS 2014年 第12期44卷 2820-2833页
作者: Wei, Qinglai Wang, Fei-Yue Liu, Derong Yang, Xiong Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
In this paper, a new iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for infinite horizon discrete-time nonlinear systems with finite approximation errors. First, ... 详细信息
来源: 评论
Theoretical analysis of a reinforcement learning based switching scheme
Theoretical analysis of a reinforcement learning based switc...
收藏 引用
ieee symposium on adaptive dynamic programming and reinforcement learning, (adprl)
作者: Ali Heydari Mechanical Engineering Department South Dakota School of Mines and Technology Rapid City SD
A reinforcement learning based scheme for optimal switching with an infinite-horizon cost function is briefly proposed in this paper. Several theoretical questions are shown to arise regarding its convergence, optimal... 详细信息
来源: 评论
Data-Driven Neuro-Optimal Temperature Control of Water-Gas Shift Reaction Using Stable Iterative adaptive dynamic programming
收藏 引用
ieee TRANSACTIONS ON INDUSTRIAL ELECTRONICS 2014年 第11期61卷 6399-6408页
作者: Wei, Qinglai Liu, Derong Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
In this paper, a novel data-driven stable iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal temperature control problems for water-gas shift (WGS) reaction systems. According to the ... 详细信息
来源: 评论