
Refine Results

Document Type

  • 140 conference papers
  • 7 journal articles

Collection Scope

  • 147 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 71 papers: Engineering
    • 66 papers: Computer Science and Technology...
    • 15 papers: Software Engineering
    • 11 papers: Electrical Engineering
    • 9 papers: Control Science and Engineering
    • 2 papers: Instrument Science and Technology
    • 2 papers: Information and Communication Engineering
    • 1 paper: Mechanics (degree may be awarded in Engineering or Sci...
    • 1 paper: Mechanical Engineering
    • 1 paper: Architecture
  • 11 papers: Science
    • 10 papers: Mathematics
    • 2 papers: Systems Science
    • 2 papers: Statistics (degree may be awarded in Science or...
  • 5 papers: Management
    • 4 papers: Management Science and Engineering (degree may...
    • 3 papers: Business Administration
    • 1 paper: Library, Information and Archives Manage...
  • 3 papers: Economics
    • 3 papers: Applied Economics

Topics

  • 76 papers: dynamic programm...
  • 39 papers: learning
  • 26 papers: optimal control
  • 25 papers: reinforcement le...
  • 15 papers: function approxi...
  • 15 papers: control systems
  • 14 papers: approximation al...
  • 14 papers: equations
  • 13 papers: neural networks
  • 13 papers: stochastic proce...
  • 12 papers: convergence
  • 10 papers: state-space meth...
  • 10 papers: cost function
  • 9 papers: mathematical mod...
  • 8 papers: trajectory
  • 8 papers: approximation me...
  • 7 papers: approximate dyna...
  • 7 papers: algorithm design...
  • 7 papers: adaptive control
  • 7 papers: heuristic algori...

Institutions

  • 4 papers: school of inform...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: northeastern uni...
  • 3 papers: univ texas autom...
  • 3 papers: arizona state un...
  • 3 papers: robotics institu...
  • 3 papers: univ illinois de...
  • 2 papers: princeton univ d...
  • 2 papers: national science...
  • 2 papers: college of mecha...
  • 2 papers: key laboratory o...
  • 2 papers: univ utrecht dep...
  • 2 papers: department of op...
  • 1 paper: inria
  • 1 paper: computational le...
  • 1 paper: school of automa...
  • 1 paper: univ cincinnati ...
  • 1 paper: toyota technol c...
  • 1 paper: neuroinformatics...

Authors

  • 5 papers: liu derong
  • 4 papers: xu xin
  • 4 papers: martin riedmille...
  • 4 papers: huaguang zhang
  • 4 papers: marco a. wiering
  • 4 papers: zhang huaguang
  • 4 papers: si jennie
  • 4 papers: derong liu
  • 3 papers: hado van hasselt
  • 3 papers: lewis frank l.
  • 3 papers: dongbin zhao
  • 3 papers: powell warren b.
  • 3 papers: warren b. powell
  • 3 papers: riedmiller marti...
  • 2 papers: manuel loth
  • 2 papers: van hasselt hado
  • 2 papers: preux philippe
  • 2 papers: hu dewen
  • 2 papers: jennie si
  • 2 papers: philippe preux

Language

  • 142 papers: English
  • 5 papers: Other
Search query: Any field = "2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007"
147 records; showing results 121-130
Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Asma Al-Tamimi, Frank Lewis (Automation & Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX, USA)
In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely heuristic dynamic programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman equation (HJB...
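For orientation, a minimal sketch of this kind of HDP iteration, written for a generic affine discrete-time system x_{k+1} = f(x_k) + g(x_k) u_k with quadratic stage cost (notation assumed here, not taken from the paper): starting from V_0 \equiv 0, iterate for i = 0, 1, 2, ...

    u_i(x_k) = \arg\min_{u_k} \{ x_k^\top Q x_k + u_k^\top R u_k + V_i(f(x_k) + g(x_k) u_k) \}
    V_{i+1}(x_k) = x_k^\top Q x_k + u_i(x_k)^\top R u_i(x_k) + V_i(f(x_k) + g(x_k) u_i(x_k))

The convergence question is whether V_i tends to the value function solving the discrete-time HJB equation as i grows.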
Approximate Optimal Control-Based Neurocontroller with a State Observation System for Seedlings Growth in Greenhouse
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: H. D. Patino, J. A. Pucheta, C. Schugurensky, R. Fullana, B. Kuchen (Universidad Nacional de San Juan, San Juan, Argentina)
In this paper, an approximate optimal control-based neurocontroller for guiding seedling growth in a greenhouse is presented. The main goal of this approach is to obtain closed-loop operation with a state neurocon...
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Martin Riedmiller, Jan Peters, Stefan Schaal (NeuroInformatics Group, University of Osnabrück, Germany; Computational Learning and Motor Control, University of Southern California, USA)
In this paper, we evaluate different versions of the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients, and natural policy gradients. Each o...
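For background on the three families compared above, the gradient estimators are usually written as follows for a parameterized policy \pi_\theta with expected return J(\theta) (generic notation, assumed rather than quoted from the paper):

    finite difference:   \nabla_\theta J \approx (\Delta\Theta^\top \Delta\Theta)^{-1} \Delta\Theta^\top \Delta J   (from rollouts at perturbed parameters \theta + \delta\theta_i)
    'vanilla' (likelihood ratio):   \nabla_\theta J = E[ \sum_t \nabla_\theta \log \pi_\theta(a_t | s_t) (R_t - b) ]
    natural:   \tilde\nabla_\theta J = F(\theta)^{-1} \nabla_\theta J,   with F(\theta) the Fisher information matrix of \pi_\theta

The choice of baseline b and of the Fisher-matrix estimate is what mainly distinguishes the variants evaluated on the cart-pole benchmark.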
A Novel Fuzzy Reinforcement Learning Approach in Two-Level Intelligent Control of 3-DOF Robot Manipulators
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Nasser Sadati, Mohammad Mollaie Emamzadeh (Electrical Engineering Department, Sharif University of Technology, Tehran, Iran)
In this paper, a fuzzy coordination method based on the interaction prediction principle (IPP) and reinforcement learning is presented for the optimal control of robot manipulators with three degrees of freedom. For this ...
Coordinated Reinforcement Learning for Decentralized Optimal Control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Daniel Yagan, Chen-Khong Tham (Department of Electrical and Computer Engineering, National University of Singapore, Singapore)
We consider a multi-agent system where the overall performance is affected by the joint actions or policies of agents. However, each agent only observes a partial view of the global state condition. This model is know...
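The term 'coordinated reinforcement learning' usually denotes a factored decomposition of the global action-value function across agents; as a rough sketch under that assumption (the paper's own formulation is only summarized above):

    Q(s, a) \approx \sum_j Q_j(s_j, a_j)

Each agent j learns its local component from its partial view s_j, and a joint greedy action is recovered by coordinating over the shared variables (e.g., by variable elimination or message passing).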
Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering, Edwin D. de Jong (Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands)
This paper describes a novel algorithm called CON-MODP for computing Pareto optimal policies for deterministic multi-objective sequential decision problems. CON-MODP is a value iteration based multi-objective dynamic ...
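For reference, the multi-objective value iteration that CON-MODP builds on maintains, for each state, a set of Pareto-nondominated value vectors; a sketch under assumed notation (the paper's added consistency operator for stationary policies is not reproduced here):

    V_{i+1}(s) = ND( \bigcup_a \{ r(s, a) + \gamma v : v \in V_i(T(s, a)) \} )

where r(s, a) is the vector-valued reward, T(s, a) the deterministic successor state, and ND(\cdot) discards dominated vectors.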
Sparse Temporal Difference Learning Using LASSO
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Manuel Loth, Manuel Davy, Philippe Preux (SequeL, INRIA-Futurs, LIFL CNRS, University of Lille (USTL), France; SequeL, INRIA-Futurs, Lagis CNRS, Ecole Centrale de Lille, France)
We consider the problem of on-line value function estimation in reinforcement learning. We concentrate on the choice of function approximator. To try to break the curse of dimensionality, we focus on nonparametric funct...
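A generic way to write the L1-penalized objective such an approach targets, for a linear architecture V_\theta(s) = \theta^\top \phi(s) over a large (possibly kernel-based) feature dictionary (assumed notation, not the paper's exact formulation):

    \min_\theta \sum_t ( r_t + \gamma \theta^\top \phi(s_{t+1}) - \theta^\top \phi(s_t) )^2 + \lambda ||\theta||_1

The L1 term drives most coefficients to zero, which is what keeps the approximator sparse as candidate features are added on-line.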
Performance analysis of direct heuristic dynamic programming using control-theoretic measures
International Joint Conference on Neural Networks
Authors: Yang, Lei; Si, Jennie; Tsakalis, Konstantinos S.; Rodriguez, Armando A. (Arizona State Univ, Dept Elect Engn, Tempe, AZ 85287, USA)
Approximate dynamic programming (ADP) has been widely studied from several important perspectives: algorithm development, learning efficiency measured by success or failure statistics, convergence rate, and learning e...
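As a reminder of the structure under analysis, direct HDP is typically an on-line actor-critic scheme: the critic approximates the discounted cost-to-go J(t) \approx \sum_{k \ge 0} \alpha^k r(t + k) and is trained to reduce a temporal-difference-style error commonly written as (generic form, assumed here)

    e_c(t) = \alpha J(t) - [ J(t-1) - r(t) ]

while the actor is updated by backpropagating the critic output through both networks; the paper then evaluates the resulting closed loop with control-theoretic measures rather than success/failure statistics alone.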
Strategy Generation with Cognitive Distance in Two-Player Games
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kosuke Sekiyama, Ricardo Carnieri, Toshio Fukuda (Department of Micro-Nano Systems Engineering, University of Nagoya, Nagoya, Japan)
In game theoretical approaches to multi-agent systems, a payoff matrix is often given a priori and used by agents in action selection. By contrast, in this paper we approach the problem of decision making by use of th...
Discrete-Time Adaptive Dynamic Programming Using Wavelet Basis Function Neural Networks
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Ning Jin, Derong Liu, Ting Huang, Zhongyu Pang (Department of Electrical and Computer Engineering, University of Illinois, Chicago, IL, USA)
Dynamic programming for discrete-time systems is difficult due to the "curse of dimensionality": one has to find a series of control actions that must be taken in sequence, hoping that this sequence will lea...
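The 'curse of dimensionality' remark refers to the discrete-time Bellman optimality recursion that ADP schemes approximate rather than solve exactly (generic notation assumed):

    J^*(x_k) = \min_{u_k} \{ U(x_k, u_k) + J^*(x_{k+1}) \}

Here the cost-to-go is represented by a neural network whose hidden units are wavelet basis functions, e.g. \hat{J}(x; w) = \sum_j w_j \psi_j(x), so only the weights w are learned instead of a table over all states.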