Search query: Any field = "IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,020 records; showing results 571-580
Pseudo-MDPs and factored linear action models
2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014
Authors: Yao, Hengshuai; Szepesvári, Csaba; Pires, Bernardo Ávila; Zhang, Xinhua
Affiliations: Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada; Machine Learning Research Group, National ICT Australia, Sydney, Australia
In this paper we introduce the concept of pseudo-MDPs to develop abstractions. Pseudo-MDPs relax the requirement that the transition kernel has to be a probability kernel. We show that the new framework captures many ...

Approximate Real-Time Optimal Control Based on Sparse Gaussian Process Models
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Boedecker, Joschka; Springenberg, Jost Tobias; Wuelfing, Jan; Riedmiller, Martin
Affiliations: Univ Freiburg, Dept Comp Sci, Machine Learning Lab, D-79110 Freiburg, Germany
In this paper we present a fully automated approach to (approximate) optimal control of non-linear systems. Our algorithm jointly learns a non-parametric model of the system dynamics - based on Gaussian Process Regres...

Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Parisi, Simone; Pirotta, Matteo; Smacchia, Nicola; Bascetta, Luca; Restelli, Marcello
Affiliations: Politecn Milan, Dept Elect Informat & Bioengn, Piazza Leonardo da Vinci 32, I-20133 Milan, Italy
This paper investigates the use of policy gradient techniques to approximate the Pareto frontier in Multi-Objective Markov Decision Processes (MOMDPs). Despite the popularity of policy-gradient algorithms and the fact...

Neural-network-based optimal tracking control scheme for a class of unknown discrete-time nonlinear systems using iterative ADP algorithm
Neurocomputing, 2014, vol. 125, pp. 46-56
Authors: Huang, Yuzhu; Liu, Derong
Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
In this paper, an optimal tracking control scheme is proposed for a class of unknown discrete-time nonlinear systems using iterative adaptive dynamic programming (ADP) algorithm. First, in order to obtain the dynamics...

A Two Stage Learning Technique for Dual Learning in the Pursuit-Evasion Differential Game
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Al-Talabi, Ahmad A.; Schwartz, Howard M.
Affiliations: Carleton Univ, Dept Syst & Comp Engn, 1125 Colonel By Dr, Ottawa, ON K1S 5B6, Canada; Univ Baghdad, Al Khwarizmi Coll Engn, Mechatron Engn Dept, Baghdad, Iraq
This paper addresses the case of dual learning in the pursuit-evasion (PE) differential game and examines how fast the players can learn their default control strategies. The players should learn their default control...

Longitudinal Control of Hypersonic Vehicles Based on Direct Heuristic Dynamic Programming Using ANFIS
International Joint Conference on Neural Networks (IJCNN)
Authors: Luo, Xiong; Chen, Yi; Si, Jennie; Liu, Feng
Affiliations: USTB, Sch Comp & Commun Engn, Beijing 100083, Peoples R China; Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287, USA
Since the launch of the scramjet, recent years have witnessed a growing interest in the study of airbreathing hypersonic vehicles. Due to its strong coupling characteristics, high nonlinearity, and uncertain parameter...

Pareto Upper Confidence Bounds algorithms: an empirical study
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Drugan, Madalina M.; Nowe, Ann; Manderick, Bernard
Affiliations: Vrije Univ Brussel, Artificial Intelligence Lab, Ixelles, Belgium
Many real-world stochastic environments are inherently multi-objective environments with conflicting objectives. The multi-objective multi-armed bandits (MOMAB) are extensions of the classical, i.e. single objective, ...

Annealing-Pareto Multi-Objective Multi-Armed Bandit Algorithm
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Yahyaa, Saba Q.; Drugan, Madalina M.; Manderick, Bernard
Affiliations: Vrije Univ Brussel, Dept Comp Sci, Pl Laan 2, B-1050 Brussels, Belgium
In the stochastic multi-objective multi-armed bandit (or MOMAB), arms generate a vector of stochastic rewards, one per objective, instead of a single scalar reward. As a result, there is not only one optimal arm, but ...

Adaptive dynamic programming for terminally constrained finite-horizon optimal control problems
53rd IEEE Annual Conference on Decision and Control (CDC)
Authors: Andrews, L.; Klotz, J. R.; Kamalapurkar, R.; Dixon, W. E.
Affiliations: Univ Florida, Dept Mech & Aerosp Engn, Gainesville, FL, USA
Adaptive dynamic programming is applied to control-affine nonlinear systems with uncertain drift dynamics to obtain a near-optimal solution to a finite-horizon optimal control problem with hard terminal constraints. A...

Self-learning Cruise Control Using Kernel-Based Least Squares Policy Iteration
IEEE Transactions on Control Systems Technology, 2014, vol. 22, no. 3, pp. 1078-1087
Authors: Wang, Jian; Xu, Xin; Liu, Daxue; Sun, Zhenping; Chen, Qingyang
Affiliations: Natl Univ Def Technol, Coll Mechatron & Automat, Changsha 410073, Hunan, Peoples R China
This paper presents a novel learning-based cruise controller for autonomous land vehicles (ALVs) with unknown dynamics and external disturbances. The learning controller consists of a time-varying proportional-integra...