
Refine Search Results

Document Type

  • 299 conference papers
  • 8 journal articles

Collection Scope

  • 307 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 180 papers in Engineering
    • 158 papers in Computer Science and Technology...
    • 56 papers in Electrical Engineering
    • 48 papers in Software Engineering
    • 47 papers in Control Science and Engineering
    • 13 papers in Information and Communication Engineering
    • 10 papers in Mechanical Engineering
    • 6 papers in Instrument Science and Technology
    • 4 papers in Mechanics (degree awardable in Engineering or Sci...
    • 4 papers in Bioengineering
    • 3 papers in Power Engineering and Engineering Therm...
    • 2 papers in Transportation Engineering
    • 2 papers in Nuclear Science and Technology
    • 2 papers in Biomedical Engineering (degree awardable...
    • 1 paper in Architecture
    • 1 paper in Chemical Engineering and Technology
    • 1 paper in Aerospace Science and Tech...
    • 1 paper in Food Science and Engineering (degree...
  • 40 papers in Science
    • 35 papers in Mathematics
    • 9 papers in Systems Science
    • 8 papers in Statistics (degree awardable in Science...
    • 4 papers in Physics
    • 4 papers in Biology
    • 1 paper in Chemistry
    • 1 paper in Astronomy
    • 1 paper in Atmospheric Science
    • 1 paper in Geophysics
    • 1 paper in Geology
  • 18 papers in Management
    • 17 papers in Management Science and Engineering (degree...
    • 7 papers in Business Administration
  • 4 papers in Economics
    • 4 papers in Applied Economics
  • 1 paper in Medicine

Topics

  • 115 papers on dynamic programm...
  • 76 papers on reinforcement le...
  • 67 papers on learning
  • 47 papers on optimal control
  • 30 papers on neural networks
  • 27 papers on control systems
  • 21 papers on approximate dyna...
  • 21 papers on approximation al...
  • 20 papers on function approxi...
  • 20 papers on equations
  • 17 papers on convergence
  • 16 papers on adaptive dynamic...
  • 16 papers on state-space meth...
  • 16 papers on heuristic algori...
  • 14 papers on mathematical mod...
  • 13 papers on stochastic proce...
  • 12 papers on learning (artifi...
  • 12 papers on adaptive control
  • 12 papers on cost function
  • 11 papers on algorithm design...

Institutions

  • 5 papers from arizona state un...
  • 4 papers from department of el...
  • 4 papers from school of inform...
  • 4 papers from department of in...
  • 4 papers from univ sci & techn...
  • 4 papers from chinese acad sci...
  • 4 papers from department of el...
  • 3 papers from princeton univ d...
  • 3 papers from northeastern uni...
  • 3 papers from national science...
  • 3 papers from robotics institu...
  • 3 papers from univ illinois de...
  • 3 papers from univ utrecht dep...
  • 2 papers from univ groningen i...
  • 2 papers from sharif univ tech...
  • 2 papers from univ texas autom...
  • 2 papers from pengcheng labora...
  • 2 papers from guangxi univ sch...
  • 2 papers from chinese acad sci...
  • 2 papers from cemagref lisc au...

Authors

  • 14 papers by liu derong
  • 9 papers by wei qinglai
  • 8 papers by si jennie
  • 7 papers by xu xin
  • 5 papers by derong liu
  • 4 papers by lewis frank l.
  • 4 papers by martin riedmille...
  • 4 papers by huaguang zhang
  • 4 papers by jennie si
  • 4 papers by marco a. wiering
  • 4 papers by xin xu
  • 4 papers by zhang huaguang
  • 4 papers by dongbin zhao
  • 4 papers by lei yang
  • 4 papers by powell warren b.
  • 4 papers by riedmiller marti...
  • 3 papers by hado van hasselt
  • 3 papers by van hasselt hado
  • 3 papers by jagannathan s.
  • 3 papers by munos remi

Language

  • 305 papers in English
  • 1 paper in other languages
  • 1 paper in Chinese
Search condition: "Any field = IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"
307 records; showing 91-100
Approximate Dynamic Programming Solutions of Multi-Agent Graphical Games Using Actor-Critic Network Structures
International Joint Conference on Neural Networks (IJCNN)
Authors: Abouheaf, Mohammed I.; Lewis, Frank L. (Univ Texas Arlington Res Inst, Ft Worth, TX 76118, USA)
This paper studies a new class of multi-agent discrete-time dynamical graphical games, where interactions between agents are restricted by a communication graph structure. The paper brings together discrete Hamiltonia...
Development of Reinforcement Learning Algorithm for 2-DOF Helicopter Model
27th IEEE International Symposium on Industrial Electronics, ISIE 2018
Authors: Fandel, Andrew; Birge, Anthony; Miah, Suruz (Department of Electrical and Computer Engineering, Bradley University, Peoria, IL, United States)
This paper examines a reinforcement learning strategy for controlling a two-degree-of-freedom (2-DOF) helicopter. The pitch and yaw angles are regulated to their corresponding reference angles by applying appropriate ...
Approximate Dynamic Programming of Continuous Annealing Process
IEEE International Conference on Automation and Logistics
Authors: Zhang, Yingwei; Guo, Chao; Chen, Xue; Teng, Yongdong (Northeastern Univ, Minist Educ Key Lab Integrated Automat Proc Ind, Shenyang 110004, Liaoning, Peoples R China)
The approximate dynamic programming method is a combination of neural networks, reinforcement learning, and the idea of dynamic programming. It is an online control method that is based on actual data rather than a p...
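As a rough sketch of how those three ingredients can fit together (the scalar toy plant, quadratic running cost, step sizes, and a single quadratic basis standing in for the critic network are all illustrative assumptions, not the annealing process studied in this paper), an online ADP-style loop might look like the following:

```python
import numpy as np

# Hypothetical sketch: a quadratic critic V(x) ~= w * x^2 is fit from observed
# transitions with temporal-difference updates (the reinforcement-learning part),
# and the control is chosen greedily through a one-step lookahead with the
# current critic (the dynamic-programming part).

def plant(x, u):
    return 0.8 * x + u          # assumed discrete-time dynamics

def stage_cost(x, u):
    return x**2 + 0.1 * u**2    # assumed quadratic running cost

gamma, alpha, w = 0.95, 0.05, 0.0          # discount, critic step size, critic weight
candidates = np.linspace(-2.0, 2.0, 41)    # discretized set of candidate controls

x = 2.0
for k in range(500):
    # one-step lookahead with the current critic
    u = min(candidates, key=lambda a: stage_cost(x, a) + gamma * w * plant(x, a)**2)
    x_next = plant(x, u)
    # temporal-difference update of the critic from the observed transition
    td_target = stage_cost(x, u) + gamma * w * x_next**2
    w += alpha * (td_target - w * x**2) * x**2
    x = x_next

print(f"critic weight w = {w:.3f}, final state x = {x:.4f}")
```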
Near Optimal Output Feedback Control of Nonlinear Discrete-Time Systems Based on Reinforcement Neural Network Learning
IEEE/CAA Journal of Automatica Sinica, 2014, Vol. 1, No. 4, pp. 372-384
Authors: Qiming Zhao; Hao Xu; Sarangapani Jagannathan (DENSO International America Inc.; College of Science and Engineering, Texas A&M University; Department of Electrical & Computer Engineering, Missouri University of Science and Technology)
In this paper, the output-feedback-based finite-horizon near optimal regulation of nonlinear affine discrete-time systems with unknown system dynamics is considered by using neural networks (NNs) to approximate Hamilton-...
Discrete-Time Generalized Policy Iteration ADP Algorithm With Approximation Errors
IEEE Symposium Series on Computational Intelligence (IEEE SSCI)
Authors: Wei, Qinglai; Li, Benkai; Song, Ruizhuo (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China; Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing, Peoples R China)
This paper is concerned with a novel generalized policy iteration (GPI) algorithm with approximation errors. Approximation errors are explicitly considered in the GPI algorithm. The properties of the stable GPI algorithm ...
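As a rough, hypothetical illustration of the general GPI scheme (not the authors' algorithm or error analysis), the following sketch performs only a few policy-evaluation sweeps per iteration on a small randomly generated MDP, so the improvement step always works from an inexact value estimate:

```python
import numpy as np

# Minimal GPI sketch on a random finite MDP (an illustrative stand-in, not the
# paper's setting): truncated policy evaluation followed by greedy improvement.

rng = np.random.default_rng(0)
n_states, n_actions, gamma, eval_sweeps = 5, 3, 0.9, 2

P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # expected rewards

policy = np.zeros(n_states, dtype=int)
V = np.zeros(n_states)

for it in range(50):
    # partial policy evaluation: only `eval_sweeps` Bellman sweeps (inexact)
    for _ in range(eval_sweeps):
        P_pi = P[np.arange(n_states), policy]              # (S, S) under the policy
        V = R[np.arange(n_states), policy] + gamma * P_pi @ V
    # policy improvement: greedy with respect to the approximate value function
    Q = R + gamma * P @ V                                  # (S, A) action values
    new_policy = Q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy

print("greedy policy:", policy, "approximate values:", np.round(V, 3))
```

With eval_sweeps set to 1 this reduces to value iteration, and with a very large value it approaches exact policy iteration; GPI covers the spectrum in between.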
Reinforcement Learning Control of a Real Mobile Robot Using Approximate Policy Iteration
6th International Symposium on Neural Networks
Authors: Zhang, Pengchen; Xu, Xin; Liu, Chunming; Yuan, Qiping (Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China)
Machine learning for mobile robots has attracted much research interest in recent years. However, there are still many challenges in applying learning techniques to real mobile robots, e.g., generalization in contin...
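A hedged sketch of one common approximate policy iteration scheme (an LSPI/LSTDQ-style least-squares fit with linear features on a 1-D toy task; the task, features, and parameters below are assumptions, not the robot experiment reported in this paper):

```python
import numpy as np

# A linear Q-function over per-action polynomial features is fit by least squares
# from a fixed batch of transitions, and the greedy policy is re-fit repeatedly.

rng = np.random.default_rng(1)
actions = np.array([-0.1, 0.0, 0.1])
gamma, n_feat = 0.95, 3                     # features per action: [1, s, s^2]

def phi(s, a_idx):
    f = np.zeros(n_feat * len(actions))
    f[a_idx * n_feat:(a_idx + 1) * n_feat] = [1.0, s, s * s]
    return f

def greedy(s, w):
    return int(np.argmax([phi(s, i) @ w for i in range(len(actions))]))

# collect a batch of random-exploration transitions from the toy dynamics
batch, s = [], 0.8
for _ in range(2000):
    a_idx = rng.integers(len(actions))
    s_next = float(np.clip(s + actions[a_idx] + rng.normal(0, 0.02), -1.0, 1.0))
    batch.append((s, a_idx, -s_next**2, s_next))   # reward penalizes distance from 0
    s = s_next if rng.random() > 0.05 else rng.uniform(-1.0, 1.0)

w = np.zeros(n_feat * len(actions))
for _ in range(10):                                 # approximate policy iteration loop
    A = 1e-3 * np.eye(len(w))                       # small ridge term for stability
    b = np.zeros(len(w))
    for s0, a_idx, r, s1 in batch:                  # LSTDQ fit under the greedy policy
        f, f_next = phi(s0, a_idx), phi(s1, greedy(s1, w))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    w = np.linalg.solve(A, b)

print("greedy action at s=0.5:", actions[greedy(0.5, w)])
```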
Optimal Control Applied to Wheeled Mobile Vehicles
IEEE International Symposium on Intelligent Signal Processing
Authors: Gomez, M.; Martinez, T.; Sanchez, S.; Meziat, D. (Univ Alcala, Escuela Politecn Super, Dept Automat, Alcala De Henares, Spain; Univ Alicante, Escuela Politecn Super, Ingn Sistemas Teoria Señal, Dept Fis, Alicante, Spain)
The goal of the work described in this paper is to develop a particular optimal control technique based on a Cell Mapping technique in combination with the Q-learning reinforcement learning method to control wheeled ...
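To illustrate the general idea of pairing a cell-mapping discretization with Q-learning (the 1-D toy plant, grid resolution, and learning parameters below are illustrative assumptions, not the authors' wheeled-vehicle setup), a minimal sketch could look like this:

```python
import numpy as np

# A continuous state is mapped to a cell index, and ordinary tabular Q-learning
# is run over the resulting finite cell space.

rng = np.random.default_rng(2)
edges = np.linspace(-1.0, 1.0, 21)            # cell boundaries over the state space
actions = np.array([-0.1, 0.0, 0.1])
alpha, gamma, epsilon = 0.2, 0.95, 0.1

def cell(s):                                  # map a continuous state to its cell index
    return int(np.digitize(s, edges))

Q = np.zeros((len(edges) + 1, len(actions)))

s = 0.9
for step in range(20000):
    c = cell(s)
    a = rng.integers(len(actions)) if rng.random() < epsilon else int(Q[c].argmax())
    s_next = float(np.clip(s + actions[a] + rng.normal(0, 0.01), -1.0, 1.0))
    r = -abs(s_next)                          # reward: stay close to the origin
    Q[c, a] += alpha * (r + gamma * Q[cell(s_next)].max() - Q[c, a])   # Q-learning backup
    s = s_next if step % 200 else rng.uniform(-1.0, 1.0)               # occasional reset

print("greedy action in the cell containing s=0.5:", actions[int(Q[cell(0.5)].argmax())])
```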
Stable Iterative Optimal Control for Discrete-Time Nonlinear Systems Using Numerical Controller
IEEE International Conference on Vehicular Electronics and Safety (ICVES)
Authors: Wei, Qinglai; Liu, Derong (Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Inst Automat, Beijing 100190, Peoples R China)
This paper is concerned with a new iterative adaptive dynamic programming (ADP) algorithm to solve optimal control problems for infinite-horizon discrete-time nonlinear systems using a numerical controller. The conver...
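An illustrative value-iteration-style sketch of iterative ADP on a state grid (the plant, cost, grid, and interpolation choices below are assumptions; this is not the paper's algorithm or its convergence analysis):

```python
import numpy as np

# The value function is held on a state grid (a simple numerical controller),
# and each iteration applies the cost-minimizing Bellman backup with linear
# interpolation between grid nodes.

x_grid = np.linspace(-2.0, 2.0, 81)
u_grid = np.linspace(-1.0, 1.0, 41)

def f(x, u):                                  # assumed discrete-time nonlinear plant
    return 0.9 * x + 0.5 * np.tanh(u)

def U(x, u):                                  # assumed utility (stage cost)
    return x**2 + u**2

V = np.zeros_like(x_grid)
for i in range(200):
    X, Uc = np.meshgrid(x_grid, u_grid, indexing="ij")         # all (x, u) pairs
    X_next = np.clip(f(X, Uc), x_grid[0], x_grid[-1])
    V_next = np.interp(X_next, x_grid, V)                      # interpolated V_i(f(x, u))
    V_new = (U(X, Uc) + V_next).min(axis=1)                    # Bellman backup, min over u
    if np.max(np.abs(V_new - V)) < 1e-8:                       # stop once the iteration settles
        break
    V = V_new

def controller(x):                            # greedy control from the converged grid values
    x_next = np.clip(f(x, u_grid), x_grid[0], x_grid[-1])
    return u_grid[np.argmin(U(x, u_grid) + np.interp(x_next, x_grid, V))]

print("V(1.0) ~=", float(np.interp(1.0, x_grid, V)), " u*(1.0) ~=", float(controller(1.0)))
```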
A performance gradient perspective on approximate dynamic programming and its application to partially observable Markov decision processes
IEEE International Symposium on Intelligent Control
Authors: Dankert, James; Yang, Lei; Si, Jennie (Arizona State Univ, Dept Elect Engn, Tempe, AZ 85287, USA)
This paper presents an approach to integrating common approximate dynamic programming (ADP) algorithms into a theoretical framework to address both analytical characteristics and algorithmic features. Several important i...
High-order local dynamic programming
Authors: Tassa, Yuval; Todorov, Emanuel (Interdisciplinary Center for Neural Computation, Hebrew University, Jerusalem, Israel; Applied Mathematics and Computer Science and Engineering, University of Washington, Seattle, United States)
We describe a new local dynamic programming algorithm for solving stochastic continuous optimal control problems. We use cubature integration to both propagate the state distribution and perform the Bellman backup. Th...
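A minimal sketch of one ingredient mentioned here, a cubature-based Bellman backup at a single state (the linear-Gaussian toy dynamics and the quadratic value estimate are assumptions; the paper's local dynamic programming algorithm is considerably more involved):

```python
import numpy as np

# Approximate the expectation inside a stochastic Bellman backup with the
# third-degree spherical-radial cubature rule: 2n equally weighted sigma points
# for an n-dimensional Gaussian.

def cubature_points(mean, cov):
    n = mean.shape[0]
    L = np.linalg.cholesky(cov)                   # matrix square root of the covariance
    offsets = np.sqrt(n) * np.hstack([L, -L])     # columns +sqrt(n)*L_i and -sqrt(n)*L_i
    return mean[:, None] + offsets                # shape (n, 2n); weights are all 1/(2n)

def expected_value(V, mean, cov):
    pts = cubature_points(mean, cov)
    return np.mean([V(pts[:, i]) for i in range(pts.shape[1])])

# toy stochastic Bellman backup at one state for one candidate control:
# Q(x, u) = cost(x, u) + E[ V(A x + B u + w) ],  w ~ N(0, W)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
W = 0.01 * np.eye(2)
V = lambda x: x @ np.diag([2.0, 1.0]) @ x         # assumed quadratic value estimate
cost = lambda x, u: x @ x + 0.1 * float(u) ** 2

x, u = np.array([1.0, 0.5]), -0.8
mean_next = A @ x + (B * u).ravel()
q_value = cost(x, u) + expected_value(V, mean_next, W)
print("cubature-approximated Q(x, u) =", round(q_value, 4))
```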