Consultation & Suggestions

Refine Search Results

Document Type

  • 748 conference papers
  • 271 journal articles
  • 4 books

Collection

  • 1,023 electronic documents
  • 1 print holding

Date Distribution

Subject Classification

  • 712 Engineering
    • 520 Computer Science and Technology...
    • 381 Electrical Engineering
    • 278 Control Science and Engineering
    • 153 Software Engineering
    • 79 Information and Communication Engineering
    • 40 Transportation Engineering
    • 23 Instrument Science and Technology
    • 20 Mechanical Engineering
    • 9 Bioengineering
    • 8 Electronic Science and Technology (...
    • 7 Mechanics (...
    • 7 Civil Engineering
    • 6 Power Engineering and Engineering Therm...
    • 6 Petroleum and Natural Gas Engineering
    • 4 Biomedical Engineering (...
    • 3 Materials Science and Engineering (...
    • 3 Chemical Engineering and Technology
    • 3 Aeronautical and Astronautical Science and Tech...
    • 3 Safety Science and Engineering
  • 118 Science
    • 98 Mathematics
    • 32 Systems Science
    • 22 Statistics (...
    • 10 Biology
    • 8 Physics
    • 4 Chemistry
  • 66 Management
    • 63 Management Science and Engineering (...
    • 14 Business Administration
    • 5 Library, Information and Archives Manag...
  • 5 Economics
    • 4 Applied Economics
  • 3 Law
    • 3 Sociology
  • 2 Medicine
  • 1 Education

Topics

  • 313 reinforcement le...
  • 216 dynamic programm...
  • 206 optimal control
  • 107 adaptive dynamic...
  • 104 adaptive dynamic...
  • 97 learning
  • 88 neural networks
  • 78 heuristic algori...
  • 68 reinforcement le...
  • 58 learning (artifi...
  • 54 nonlinear system...
  • 53 convergence
  • 51 control systems
  • 51 mathematical mod...
  • 48 approximate dyna...
  • 44 approximation al...
  • 43 equations
  • 42 adaptive control
  • 41 artificial neura...
  • 41 cost function

Institutions

  • 41 chinese acad sci...
  • 27 univ rhode isl d...
  • 17 tianjin univ sch...
  • 16 univ sci & techn...
  • 16 univ illinois de...
  • 15 northeastern uni...
  • 14 beijing normal u...
  • 13 northeastern uni...
  • 13 guangdong univ t...
  • 12 northeastern uni...
  • 9 natl univ def te...
  • 8 ieee
  • 8 univ chinese aca...
  • 7 univ chinese aca...
  • 7 cent south univ ...
  • 7 southern univ sc...
  • 7 beijing univ tec...
  • 6 chinese acad sci...
  • 6 missouri univ sc...
  • 5 nanjing univ pos...

Authors

  • 54 liu derong
  • 37 wei qinglai
  • 29 he haibo
  • 22 wang ding
  • 21 xu xin
  • 19 jiang zhong-ping
  • 17 lewis frank l.
  • 17 yang xiong
  • 17 zhang huaguang
  • 17 ni zhen
  • 16 zhao bo
  • 15 gao weinan
  • 14 zhao dongbin
  • 13 derong liu
  • 13 zhong xiangnan
  • 12 si jennie
  • 10 jagannathan s.
  • 10 dongbin zhao
  • 10 song ruizhuo
  • 9 abouheaf mohamme...

Language

  • 992 English
  • 25 Other
  • 6 Chinese
Search query: Any field = "IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,023 records; showing 871-880
Convergence of model-based temporal difference learning for control
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Van Hasselt, Hado; Wiering, Marco A. (Univ Utrecht, Dept Informat & Comp Sci, Intelligent Syst Grp, Padualaan 14, NL-3508 TB Utrecht, Netherlands)
A theoretical analysis of model-based temporal difference learning for control is given, leading to a proof of convergence. This work differs from earlier work on the convergence of temporal difference learning b...
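The abstract above concerns convergence of model-based temporal difference learning for control. As a generic point of reference (not the authors' construction), a tabular TD(0) evaluation loop over a hypothetical two-state chain can be sketched as:

```python
import random
from collections import defaultdict

def td0_evaluate(transitions, start_states, alpha=0.1, gamma=0.9,
                 episodes=2000, seed=0):
    """Tabular TD(0) policy evaluation on a small Markov chain.

    `transitions` maps a state to a list of (next_state, reward, done)
    outcomes sampled uniformly -- a toy stand-in for a learned model.
    """
    rng = random.Random(seed)
    values = defaultdict(float)
    for _ in range(episodes):
        state = rng.choice(start_states)
        done = False
        while not done:
            next_state, reward, done = rng.choice(transitions[state])
            target = reward + (0.0 if done else gamma * values[next_state])
            # TD(0): move the estimate toward the bootstrapped target.
            values[state] += alpha * (target - values[state])
            state = next_state
    return dict(values)

# Hypothetical two-state chain: A -> B (reward 0), B -> terminal (reward 1).
chain = {"A": [("B", 0.0, False)], "B": [("T", 1.0, True)]}
values = td0_evaluate(chain, ["A", "B"])
# values["B"] approaches 1.0 and values["A"] approaches gamma * 1.0 = 0.9
```

The environment, hyperparameters, and function name here are illustrative; the paper's contribution is the convergence analysis, not this loop.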
Reinforcement learning neural-network-based controller for nonlinear discrete-time systems with input constraints
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2007, Vol. 37, No. 2, pp. 425-436
Authors: He, Pingan; Jagannathan, S. (Univ Missouri, Dept Elect & Comp Engn, Rolla, MO 65409, USA)
A novel adaptive-critic-based neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems in the presence of actuator constraints. The constra...
An optimal ADP algorithm for a high-dimensional stochastic control problem
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Nascimento, Juliana; Powell, Warren (Princeton Univ, Dept Operat Res & Financial Engn, Princeton, NJ 08544, USA)
We propose a provably optimal approximate dynamic programming algorithm for a class of multistage stochastic problems, taking into account that the probability distribution of the underlying stochastic process is not ...
Two novel on-policy reinforcement learning algorithms based on TD(λ)-methods
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Wiering, Marco A.; van Hasselt, Hado (Univ Utrecht, Dept Informat & Comp Sci, Intelligent Syst Grp, Padualaan 14, NL-3508 TB Utrecht, Netherlands)
This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value function using TD(λ)-metho...
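QV-learning's distinguishing idea is that the action values bootstrap from a separately learned state value function V rather than from max Q. A minimal one-step (λ = 0) sketch on a hypothetical corridor task; the environment, parameters, and names are illustrative, not taken from the paper:

```python
import random

def qv_learn(n_states=4, alpha=0.2, beta=0.2, gamma=0.9,
             epsilon=0.3, episodes=2000, seed=1):
    """One-step QV-learning on a toy corridor: start at state 0;
    stepping into the last state yields reward 1 and ends the episode.
    V is learned by TD; Q bootstraps from V instead of from max Q."""
    rng = random.Random(seed)
    actions = (-1, 1)
    goal = n_states - 1
    Q = {(s, a): 0.0 for s in range(goal) for a in actions}
    V = [0.0] * n_states
    for _ in range(episodes):
        s = 0
        for _ in range(30):                            # step cap per episode
            if rng.random() < epsilon:                 # epsilon-greedy
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda a_: Q[(s, a_)])
            s2 = min(max(s + a, 0), goal)
            done = s2 == goal
            r = 1.0 if done else 0.0
            target = r + (0.0 if done else gamma * V[s2])
            V[s] += beta * (target - V[s])             # TD update for V
            Q[(s, a)] += alpha * (target - Q[(s, a)])  # same target for Q
            if done:
                break
            s = s2
    return Q, V
```

After training, the greedy policy moves right in every state and V increases toward the goal, which is what the shared V-based target is meant to achieve.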
A novel fuzzy reinforcement learning approach in two-level intelligent control of 3-DOF robot manipulators
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Sadati, Nasser; Emamzadeh, Mohammad Mollaie (Sharif Univ Technol, Dept Elect Engn, Intelligent Syst Lab, Tehran, Iran)
In this paper, a fuzzy coordination method based on the Interaction Prediction Principle (IPP) and reinforcement learning is presented for the optimal control of robot manipulators with three degrees of freedom. For this ...
Sparse temporal difference learning using LASSO
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Loth, Manuel; Davy, Manuel; Preux, Philippe (Univ Lille, CNRS, LIFL, SequeL, INRIA Futurs, Villeneuve, France)
We consider the problem of on-line value function estimation in reinforcement learning, concentrating on the choice of function approximator. To try to break the curse of dimensionality, we focus on nonparametric funct...
Fitted Q iteration with CMACs
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Timmer, Stephan; Riedmiller, Martin (Univ Osnabruck, Dept Comp Sci, D-4500 Osnabruck, Germany)
A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important in the case of continuous, high-dimensional state spaces, since ...
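Fitted Q iteration repeatedly regresses bootstrapped targets r + γ·max Q onto a fixed batch of stored transitions. A minimal sketch of that generic loop, with a tabular averager standing in for the paper's CMAC approximator (the batch and values are illustrative, not from the paper):

```python
def fitted_q_iteration(transitions, gamma=0.9, iters=50):
    """Fitted Q iteration over a batch of (s, a, r, s2, done) tuples.

    Each iteration builds targets from the previous Q estimate, then
    "fits" a new Q; a per-key mean target is the exact least-squares
    fit for a tabular basis (a CMAC would generalize across keys).
    """
    actions = sorted({a for _, a, _, _, _ in transitions})
    Q = {}
    for _ in range(iters):
        targets = {}
        for s, a, r, s2, done in transitions:
            best = 0.0 if done else max(Q.get((s2, a2), 0.0) for a2 in actions)
            targets.setdefault((s, a), []).append(r + gamma * best)
        Q = {k: sum(v) / len(v) for k, v in targets.items()}
    return Q

# Hypothetical batch from a 3-state corridor: 0 -> 1 -> 2 (goal, reward 1).
batch = [(0, +1, 0.0, 1, False), (1, +1, 1.0, 2, True),
         (0, -1, 0.0, 0, False), (1, -1, 0.0, 0, False)]
# Q[(1, 1)] converges to 1.0 and Q[(0, 1)] to gamma * 1.0 = 0.9
```

With a discount of 0.9 the iteration is a contraction, so a few dozen sweeps suffice on this toy batch.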
Kernelizing LSPE(λ)
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Jung, Tobias; Polani, Daniel (Johannes Gutenberg Univ Mainz, D-6500 Mainz, Germany; Univ Hertfordshire, Hatfield, Herts, England)
We propose the use of kernel-based methods as the underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular we present the 'kernelization' ...
Continuous-time adaptive critics
IEEE Transactions on Neural Networks, 2007, Vol. 18, No. 3, pp. 631-647
Authors: Hanselmann, Thomas; Noakes, Lyle; Zaknich, Anthony (Univ Melbourne, Dept Elect & Elect Engn, Parkville, Vic 3010, Australia; Univ Western Australia, Sch Math & Stat, Crawley, WA 6009, Australia; Murdoch Univ, Sch Engn Sci, Perth, WA 6150, Australia)
A continuous-time formulation of an adaptive critic design (ACD) is investigated. Connections to the discrete case are made, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are preval...
The effect of bootstrapping in multi-automata reinforcement learning
IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning
Authors: Peeters, Maarten; Verbeeck, Katja; Nowe, Ann (Vrije Univ Brussel, Computat Modeling Lab, Pleinlaan 2, B-1050 Brussels, Belgium)
Learning automata are shown to be an excellent tool for creating learning multi-agent systems. Most algorithms used in current automata research expect the environment to end in an explicit end-stage. In this end-stag...