Search criteria: Subject term = "Neuro-dynamic Programming"
82 records; showing 41-50
Average cost temporal-difference learning
AUTOMATICA, 1999, vol. 35, no. 11, pp. 1799-1808
Authors: Tsitsiklis, JN; Van Roy, B
Affiliations: MIT Informat & Decis Syst Lab, Cambridge, MA 02139, USA
We propose a variant of temporal-difference learning that approximates average and differential costs of an irreducible aperiodic Markov chain. Approximations are comprised of linear combinations of fixed basis functi...

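As a rough illustration of the idea in the abstract above, the following Python sketch runs a TD(0)-style update on a two-state Markov chain. It tracks a running estimate of the average cost and a linear approximation of the differential costs. The chain `P`, the costs `g`, the single basis function `phi`, the step-size rule, and the name `average_cost_td0` are all assumptions made for illustration; this is not the authors' exact algorithm or analysis.

```python
import random


def average_cost_td0(P, g, phi, steps, seed=0, c=20.0):
    """TD(0)-style sketch: estimate the average cost mu and weights r such that
    sum_j r[j] * phi[x][j] approximates the differential cost of state x
    (up to an additive constant)."""
    rng = random.Random(seed)
    r = [0.0] * len(phi[0])
    mu = 0.0
    x = 0
    for t in range(steps):
        # Sample the next state from row x of the transition matrix.
        u, acc, nxt = rng.random(), 0.0, len(P) - 1
        for j, p in enumerate(P[x]):
            acc += p
            if u < acc:
                nxt = j
                break
        step = c / (t + c)  # diminishing step size (an assumed schedule)
        v_x = sum(w * f for w, f in zip(r, phi[x]))
        v_next = sum(w * f for w, f in zip(r, phi[nxt]))
        # Temporal difference for the average-cost setting.
        d = g[x] - mu + v_next - v_x
        r = [w + step * d * f for w, f in zip(r, phi[x])]
        mu += step * (g[x] - mu)
        x = nxt
    return mu, r


# Hypothetical two-state example.
P = [[0.9, 0.1], [0.5, 0.5]]   # irreducible, aperiodic chain
g = [1.0, 7.0]                 # per-state costs
phi = [[0.0], [1.0]]           # one basis function: indicator of state 1

mu_hat, r_hat = average_cost_td0(P, g, phi, steps=200_000)
print(mu_hat, r_hat)
```

For this chain the stationary distribution is (5/6, 1/6), so the exact average cost is 2 and the differential costs satisfy h(1) - h(0) = 10; the estimates should settle near those values.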
Stochastic approximation for nonexpansive maps: Application to Q-learning algorithms
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2002, vol. 41, no. 1, pp. 1-22
Authors: Abounadi, J; Bertsekas, DP; Borkar, V
Affiliations: MIT Dept Elect Engn & Comp Sci, Cambridge, MA 02139, USA; Tata Inst Fundamental Res, Sch Technol & Comp Sci, Bombay 400005, Maharashtra, India
We discuss synchronous and asynchronous iterations of the form x(k+1) = x(k) + gamma(k)(h(x(k)) + w(k)), where h is a suitable map and {w(k)} is a deterministic or stochastic sequence satisfying suitable conditions. I...

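The iteration quoted in the abstract above, x(k+1) = x(k) + gamma(k)(h(x(k)) + w(k)), can be tried out on a scalar toy problem. The sketch below assumes h(x) = T(x) - x with T = cos (a nonexpansive map on the real line), zero-mean Gaussian noise for w(k), and step sizes gamma(k) = 1/(k+1). These choices and the function names are illustrative assumptions, not taken from the paper, which treats more general synchronous and asynchronous schemes.

```python
import math
import random


def stochastic_approximation(h, x0, steps, noise_std=0.1, seed=0):
    """Run x(k+1) = x(k) + gamma(k) * (h(x(k)) + w(k)) with gamma(k) = 1/(k+1)."""
    rng = random.Random(seed)
    x = x0
    for k in range(steps):
        gamma_k = 1.0 / (k + 1)            # diminishing step size
        w_k = rng.gauss(0.0, noise_std)    # zero-mean noise term
        x = x + gamma_k * (h(x) + w_k)
    return x


def h_cos(x):
    """h(x) = T(x) - x for the assumed map T = cos; h vanishes at the fixed point of T."""
    return math.cos(x) - x


x_hat = stochastic_approximation(h_cos, x0=0.0, steps=5000)
print(x_hat)
```

The fixed point of cos is approximately 0.739, so the iterate should end up close to that value despite the noise.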
Analysis and optimization of service availability in an HA cluster with load-dependent machine availability
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2007, vol. 18, no. 9, pp. 1307-1319
Authors: Ang, Chee-Wei; Tham, Chen-Khong
Affiliations: Inst Infocomm Res, Singapore 119613, Singapore; Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 119260, Singapore
Calculations of service availability of a High-Availability (HA) cluster are usually based on the assumption of load-independent machine availabilities. In this paper, we study the issues and show how the service avai...

Markov decision processes with delays and asynchronous cost collection
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2003, vol. 48, no. 4, pp. 568-574
Authors: Katsikopoulos, KV; Engelbrecht, SE
Affiliations: Univ Massachusetts, Dept Mech & Ind Engn, Amherst, MA 01003, USA; Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003, USA
Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect...

New rollout algorithms for combinatorial optimization problems
OPTIMIZATION METHODS & SOFTWARE, 2002, vol. 17, no. 4, pp. 627-654
Authors: Guerriero, F; Mancini, M; Musmanno, R
Affiliations: Univ Calabria, Dipartimento Elettron Informat & Sistemist, I-87030 Arcavacata Di Rende, CS, Italy
Rollout algorithms are new computational approaches used to determine near-optimal solutions for deterministic and stochastic combinatorial optimization problems. They are built on a generic base heuristic with the ai...

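To make the "base heuristic" idea from the abstract above concrete, here is a minimal Python sketch of a generic rollout scheme on a small travelling-salesman instance. At each step it tries every unvisited city, completes the tour with a nearest-neighbor base heuristic, and keeps the city whose completed tour is shortest. The problem, the choice of nearest neighbor as base heuristic, and all function names are assumptions for illustration; the paper's new rollout variants are not reproduced here.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def nearest_neighbor_completion(path, pts):
    """Base heuristic: extend a partial path by always moving to the closest unvisited city."""
    tour = list(path)
    remaining = [c for c in range(len(pts)) if c not in tour]
    while remaining:
        last = tour[-1]
        nxt = min(remaining, key=lambda c: math.dist(pts[last], pts[c]))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour


def rollout_tour(pts, start=0):
    """Rollout: score each candidate next city by the base heuristic's completed tour."""
    path = [start]
    while len(path) < len(pts):
        candidates = [c for c in range(len(pts)) if c not in path]
        best = min(
            candidates,
            key=lambda c: tour_length(nearest_neighbor_completion(path + [c], pts), pts),
        )
        path.append(best)
    return path


# Hypothetical instance: 12 random points in the unit square.
rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(12)]

base = nearest_neighbor_completion([0], pts)
rollout = rollout_tour(pts, start=0)
print(tour_length(base, pts), tour_length(rollout, pts))
```

Because this nearest-neighbor rule with fixed tie-breaking behaves consistently along its own path, the rollout tour is never longer than the tour produced by the base heuristic alone, at the price of many more heuristic runs.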
ADP-based optimal sensor scheduling for target tracking in energy harvesting wireless sensor networks
NEURAL COMPUTING & APPLICATIONS, 2016, vol. 27, no. 6, pp. 1543-1551
Authors: Song, Ruizhuo; Wei, Qinglai; Xiao, Wendong
Affiliations: Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
This paper proposes a novel sensor scheduling scheme based on adaptive dynamic programming, which makes the sensor energy consumption and tracking error optimal over the system operational horizon for wireless sensor ...

Valuation of American options via basis functions
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, vol. 49, no. 3, pp. 374-385
Authors: Lai, TL; Wong, SPS
Affiliations: Stanford Univ, Dept Stat, Stanford, CA 94305, USA; Hong Kong Univ Sci & Technol, Dept Informat & Syst Management, Hong Kong, Hong Kong, Peoples R China
After a brief review of recent developments in the pricing and hedging of American options, this paper modifies the basis function approach to adaptive control and neuro-dynamic programming, and applies it to develop:...

Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis
IEEE TRANSACTIONS ON CYBERNETICS, 2017, vol. 47, no. 5, pp. 1224-1237
Authors: Wei, Qinglai; Lewis, Frank L.; Sun, Qiuye; Yan, Pengfei; Song, Ruizhuo
Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118, USA; Northeastern Univ, Shenyang 110036, Peoples R China; Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110036, Peoples R China; Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
In this paper, a novel discrete-time deterministic Q-learning algorithm is developed. In each iteration of the developed Q-learning algorithm, the iterative Q function is updated for all the state and control spaces, ...

Continuous-Time Time-Varying Policy Iteration
IEEE TRANSACTIONS ON CYBERNETICS, 2020, vol. 50, no. 12, pp. 4958-4971
Authors: Wei, Qinglai; Liao, Zehua; Yang, Zhanyu; Li, Benkai; Liu, Derong
Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China; Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
A novel policy iteration algorithm, called the continuous-time time-varying (CTTV) policy iteration algorithm, is presented in this paper to obtain the optimal control laws for infinite horizon CTTV nonlinear systems....

A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward
NEUROCOMPUTING, 2021, vol. 424, pp. 23-34
Authors: Liang, Mingming; Wei, Qinglai
Affiliations: Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
This paper constructs a partial policy iteration adaptive dynamic programming (ADP) algorithm to solve the optimal control problem of nonlinear systems with discounted total reward. Compared with traditional policy it...