
Refine Search Results

Document Type

  • 61 journal articles
  • 21 conference papers

Collection Scope

  • 82 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 74 Engineering
    • 47 Computer Science and Technology...
    • 37 Control Science and Engineering
    • 31 Electrical Engineering
    • 6 Software Engineering
    • 5 Mechanical Engineering
    • 3 Information and Communication Engineering
    • 2 Instrument Science and Technology
    • 2 Aerospace Science and Tech...
    • 1 Electronic Science and Technology (...
    • 1 Chemical Engineering and Technology
    • 1 Transportation Engineering
    • 1 Environmental Science and Engineering (...
  • 15 Science
    • 6 Mathematics
    • 6 Systems Science
    • 3 Physics
    • 2 Chemistry
    • 1 Biology
    • 1 Ecology
  • 10 Management
    • 10 Management Science and Engineering (...
    • 2 Business Administration
  • 2 Economics
    • 2 Applied Economics
  • 1 Law
    • 1 Law
  • 1 Education
    • 1 Education
  • 1 Military Science

Topics

  • 82 neuro-dynamic pr...
  • 28 optimal control
  • 24 reinforcement le...
  • 20 approximate dyna...
  • 19 adaptive critic ...
  • 18 neural networks
  • 15 adaptive dynamic...
  • 12 nonlinear system...
  • 11 dynamic programm...
  • 9 adaptive dynamic...
  • 6 function approxi...
  • 6 policy iteration
  • 5 scheduling
  • 4 markov chains
  • 4 generalized poli...
  • 3 value iteration
  • 3 temporal-differe...
  • 3 q-learning
  • 2 plug-in hybrid e...
  • 2 differential gam...

Institutions

  • 21 chinese acad sci...
  • 10 univ sci & techn...
  • 8 guangdong univ t...
  • 4 beijing normal u...
  • 3 alphatech inc bu...
  • 2 guangdong univ t...
  • 2 mit informat & d...
  • 2 georgia inst tec...
  • 2 school of automa...
  • 2 mit dept elect e...
  • 2 northeastern uni...
  • 2 univ texas arlin...
  • 2 southern univ sc...
  • 2 univ illinois de...
  • 2 changchun univ t...
  • 2 rzeszow univ tec...
  • 1 univ sci & techn...
  • 1 princeton univ d...
  • 1 univ chinese aca...
  • 1 chinese acad sci...

Authors

  • 19 liu derong
  • 18 wei qinglai
  • 7 song ruizhuo
  • 5 zhao bo
  • 5 wang ding
  • 3 bertsekas dp
  • 3 tsitsiklis jn
  • 3 jay h. lee
  • 3 yang xiong
  • 3 lee jh
  • 3 lee jm
  • 3 yan pengfei
  • 2 burghardt andrze...
  • 2 lewis frank l.
  • 2 li yuanchun
  • 2 an tianjiao
  • 2 niket s. kaisare
  • 2 vanroy b
  • 2 szuster marcin
  • 2 lin hanquan

Language

  • 74 English
  • 4 Other
  • 4 Chinese

Search criteria: Subject = "neuro-dynamic programming"
82 records in total; showing 51-60
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
ARTIFICIAL INTELLIGENCE, 2008, Vol. 172, No. 4-5, pp. 454-482
Authors: Salles Barreto, Andre da Motta; Anderson, Charles W.
Affiliations: Univ Fed Rio de Janeiro COPPE Programa Engn Civil BR-21945 Rio De Janeiro Brazil; Colorado State Univ Dept Comp Sci Ft Collins CO 80523 USA
This work presents the restricted gradient-descent (RGD) algorithm, a training method for local radial-basis function networks specifically developed to be used in the context of reinforcement learning. The RGD algori...
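The abstract above is truncated. Purely as orientation for the general setting it describes, namely gradient-descent training of a local radial-basis-function value approximator from sampled transitions, here is a minimal Python sketch of a plain semi-gradient TD(0) update on RBF features. It does not reproduce the paper's specific "restricted" update rules, and every name, constant, and the toy state space below are placeholder assumptions.

```python
import numpy as np

# Plain semi-gradient TD(0) on a local RBF network: V(s) ~ w . phi(s).
# Illustrates the general setting only; the RGD algorithm's specific
# restrictions on the gradient step are not reproduced here.

def rbf_features(state, centers, width):
    """Gaussian RBF activations for a (vector-valued) state."""
    diffs = centers - state                        # (n_centers, state_dim)
    return np.exp(-np.sum(diffs ** 2, axis=1) / (2.0 * width ** 2))

def td0_step(w, transition, centers, width, alpha=0.05, gamma=0.99):
    """One semi-gradient TD(0) update of the output weights w."""
    s, r, s_next, done = transition
    phi = rbf_features(s, centers, width)
    phi_next = rbf_features(s_next, centers, width)
    target = r + (0.0 if done else gamma * np.dot(w, phi_next))
    td_error = target - np.dot(w, phi)
    return w + alpha * td_error * phi              # descend the squared TD error

# Hypothetical usage on a 1-D state space with 11 evenly spaced centers.
centers = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
w = np.zeros(len(centers))
w = td0_step(w, (np.array([0.3]), 1.0, np.array([0.4]), False), centers, width=0.1)
```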
Relative value function approximation for the capacitated re-entrant line scheduling problem
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2005, Vol. 2, No. 3, pp. 285-299
Authors: Choi, JY; Reveliotis, S
Affiliation: Georgia Inst Technol Sch Ind & Syst Engn Atlanta GA 30332 USA
The problem addressed in this study is that of determining how to allocate the workstation processing and buffering capacity in a capacitated re-entrant line to the job instances competing for it, in order to maximize...
Reinforcement-Learning-Based Robust Controller Design for Continuous-Time Uncertain Nonlinear Systems Subject to Input Constraints
IEEE TRANSACTIONS ON CYBERNETICS, 2015, Vol. 45, No. 7, pp. 1372-1385
Authors: Liu, Derong; Yang, Xiong; Wang, Ding; Wei, Qinglai
Affiliation: Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
The design of a stabilizing controller for uncertain nonlinear systems with control constraints is a challenging problem. The input constraints, coupled with the inability to accurately identify the uncertainties, motivat...
Neuro-Optimal Control for Discrete Stochastic Processes via a Novel Policy Iteration Algorithm
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, Vol. 50, No. 11, pp. 3972-3985
Authors: Liang, Mingming; Wang, Ding; Liu, Derong
Affiliations: Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China; Beijing Univ Technol Fac Informat Technol Beijing 100124 Peoples R China; Beijing Univ Technol Beijing Key Lab Computat Intelligence & Intellige Beijing 100124 Peoples R China; Guangdong Univ Technol Sch Automat Guangzhou 510006 Peoples R China
In this paper, a novel policy iteration adaptive dynamic programming (ADP) algorithm, called the "local policy iteration ADP algorithm," is presented to obtain the optimal control for discrete stochastic ...
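For orientation, the classical policy iteration scheme that the paper's local ADP variant builds on alternates exact policy evaluation with greedy policy improvement. The tabular sketch below is a generic textbook version under assumed finite state and action sets; the paper itself works with neural-network approximations of stochastic processes, which are not reproduced here.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95):
    """Classical policy iteration on a finite MDP.

    P: (A, S, S) transition probabilities, R: (S, A) expected rewards.
    Returns an optimal deterministic policy and its value function.
    """
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(n_states), :]
        r_pi = R[np.arange(n_states), policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily with respect to the evaluated value.
        q = R.T + gamma * P @ v                # (A, S) action values
        new_policy = np.argmax(q, axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy

# Hypothetical 2-state, 2-action MDP for illustration.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],       # transitions under action 0
              [[0.5, 0.5], [0.1, 0.9]]])      # transitions under action 1
R = np.array([[1.0, 0.0],                     # rewards R[s, a]
              [0.0, 2.0]])
pi, v = policy_iteration(P, R)
```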
Rollout algorithms for stochastic scheduling problems
JOURNAL OF HEURISTICS, 1999, Vol. 5, No. 1, pp. 89-108
Authors: Bertsekas, DP; Castañon, DA
Affiliations: MIT Dept Elect Engn & Comp Sci Cambridge MA 02139 USA; Boston Univ Dept Elect Engn Burlington MA 01803 USA; Alphatech Inc Burlington MA 01803 USA
Stochastic scheduling problems are difficult stochastic control problems with combinatorial decision spaces. In this paper we focus on a class of stochastic scheduling problems, the quiz problem and its variations. We...
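In the basic quiz problem referenced in the abstract, questions are attempted one at a time, question i is answered correctly with probability p[i] for reward v[i], and the quiz stops at the first miss. The sketch below illustrates only the rollout idea: each candidate first question is scored by the exact expected reward of completing the sequence with a simple index heuristic as the base policy. The index rule, the toy instance, and the use of exact expectations instead of simulation are assumptions for illustration; the stochastic variations studied in the paper are not reproduced.

```python
def expected_reward(order, p, v):
    """Expected total reward of attempting questions in 'order';
    v[i] is collected for each correct answer and the quiz stops
    at the first incorrect one."""
    total, survive = 0.0, 1.0
    for i in order:
        survive *= p[i]              # probability of reaching and passing question i
        total += survive * v[i]
    return total

def index_heuristic(remaining, p, v):
    """Base policy: attempt questions in decreasing p*v/(1-p) order."""
    return sorted(remaining, key=lambda i: p[i] * v[i] / (1.0 - p[i] + 1e-12), reverse=True)

def rollout_order(p, v):
    """One-step lookahead using the index heuristic as the rollout policy."""
    remaining, order = set(range(len(p))), []
    while remaining:
        best = max(
            remaining,
            key=lambda i: expected_reward([i] + index_heuristic(remaining - {i}, p, v), p, v),
        )
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical instance: success probabilities and rewards per question.
p, v = [0.9, 0.6, 0.8], [1.0, 5.0, 2.0]
order = rollout_order(p, v)
print(order, expected_reward(order, p, v))
```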
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
MACHINE LEARNING, 2006, Vol. 63, No. 2, pp. 107-133
Author: Tadic, VB
Affiliation: Univ Sheffield Dept Automat Control & Syst Engn Sheffield S1 3JD S Yorkshire England
The mean-square asymptotic behavior of temporal-difference learning algorithms with constant step-sizes and linear function approximation is analyzed in this paper. The analysis is carried out for the case of discoun...
Online Synchronous Approximate Optimal Learning Algorithm for Multiplayer Nonzero-Sum Games With Unknown Dynamics
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2014, Vol. 44, No. 8, pp. 1015-1027
Authors: Liu, Derong; Li, Hongliang; Wang, Ding
Affiliation: Chinese Acad Sci Inst Automat State Key Lab Management & Control Complex Syst Beijing 100190 Peoples R China
In this paper, we develop an online synchronous approximate optimal learning algorithm based on policy iteration to solve a multiplayer nonzero-sum game without the requirement of exact knowledge of dynamical systems....
A structure property of optimal policies for maintenance problems with safety-critical components
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2008, Vol. 5, No. 3, pp. 519-531
Authors: Xia, Li; Zhao, Qianchuan; Jia, Qing-Shan
Affiliations: Tsinghua Univ Dept Automat Ctr Intelligent & Networked Syst CFINS Beijing 100084 Peoples R China; Tsinghua Univ TNLIST Beijing 100084 Peoples R China
The maintenance problem with safety-critical components is significant for the economic benefit of companies. Motivated by a practical asset maintenance project, a new joint replacement maintenance problem is introd...
A single front genetic algorithm for parallel multi-objective optimization in dynamic environments
NEUROCOMPUTING, 2009, Vol. 72, No. 16-18, pp. 3570-3579
Authors: Camara, Mario; Ortega, Julio; de Toro, Francisco
Affiliations: Univ Granada Dept Comp Technol & Architecture E-18071 Granada Spain; Univ Granada Dept Signal Theory Telemat & Commun E-18071 Granada Spain
This paper proposes a new parallel evolutionary procedure to solve multi-objective dynamic optimization problems, along with some measures to evaluate multi-objective optimization in dynamic environments. These dynamic...
An analysis of temporal-difference learning with function approximation
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1997, Vol. 42, No. 5, pp. 674-690
Authors: Tsitsiklis, JN; VanRoy, B
Affiliation: Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze updates parameters of a linear functi...
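The rule analyzed in this paper is TD(lambda) with a linear approximation of the cost-to-go, driven by an eligibility trace. Below is a minimal sketch of that update; the five-state random-walk chain, one-hot features, and step-size values are hypothetical placeholders rather than anything taken from the paper.

```python
import numpy as np

def td_lambda_linear(sample_next, features, costs, theta0, n_steps,
                     alpha=0.01, gamma=0.95, lam=0.7):
    """TD(lambda) with linear function approximation V(x) ~ features(x) @ theta.

    sample_next(x): draws the next state of the Markov chain.
    costs(x): per-stage cost g(x) accumulated in the cost-to-go.
    """
    theta = np.array(theta0, dtype=float)
    z = np.zeros_like(theta)                     # eligibility trace
    x = 0                                        # arbitrary initial state
    for _ in range(n_steps):
        x_next = sample_next(x)
        phi, phi_next = features(x), features(x_next)
        delta = costs(x) + gamma * phi_next @ theta - phi @ theta   # temporal difference
        z = gamma * lam * z + phi
        theta += alpha * delta * z
        x = x_next
    return theta

# Hypothetical 5-state chain with random-walk dynamics and one-hot features.
rng = np.random.default_rng(0)
theta = td_lambda_linear(
    sample_next=lambda x: (x + rng.choice([-1, 1])) % 5,
    features=lambda x: np.eye(5)[x],
    costs=lambda x: float(x == 0),
    theta0=np.zeros(5),
    n_steps=10_000,
)
```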