
Refine search results

Document type

  • 229 conference papers
  • 18 journal articles

Collection

  • 247 electronic documents
  • 0 print holdings

Date distribution

Subject classification

  • 113 papers: Engineering
    • 103 papers: Computer Science and Technology...
    • 42 papers: Software Engineering
    • 38 papers: Electrical Engineering
    • 23 papers: Control Science and Engineering
    • 5 papers: Information and Communication Engineering
    • 3 papers: Mechanical Engineering
    • 2 papers: Mechanics (conferrable in Engineering, Sci...
    • 1 paper: Instrument Science and Technology
    • 1 paper: Architecture
    • 1 paper: Chemical Engineering and Technology
    • 1 paper: Transportation Engineering
  • 27 papers: Science
    • 25 papers: Mathematics
    • 7 papers: Systems Science
    • 6 papers: Statistics (conferrable in Science,...
    • 1 paper: Physics
    • 1 paper: Chemistry
    • 1 paper: Atmospheric Science
  • 10 papers: Management
    • 8 papers: Management Science and Engineering (con...
    • 3 papers: Business Administration
    • 2 papers: Library, Information and Archives Manag...
  • 2 papers: Economics
    • 2 papers: Applied Economics
  • 1 paper: Law
    • 1 paper: Sociology

Topics

  • 95 papers: dynamic programm...
  • 54 papers: optimal control
  • 51 papers: learning
  • 44 papers: reinforcement le...
  • 35 papers: learning (artifi...
  • 27 papers: equations
  • 25 papers: neural networks
  • 22 papers: heuristic algori...
  • 20 papers: convergence
  • 20 papers: control systems
  • 18 papers: function approxi...
  • 18 papers: mathematical mod...
  • 16 papers: approximation al...
  • 15 papers: vectors
  • 15 papers: cost function
  • 14 papers: markov processes
  • 14 papers: nonlinear system...
  • 14 papers: artificial neura...
  • 13 papers: stochastic proce...
  • 12 papers: adaptive dynamic...

Institutions

  • 10 papers: chinese acad sci...
  • 5 papers: school of inform...
  • 4 papers: northeastern uni...
  • 4 papers: department of el...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: automation and r...
  • 3 papers: department of el...
  • 3 papers: robotics institu...
  • 3 papers: key laboratory o...
  • 3 papers: natl univ def te...
  • 3 papers: univ illinois de...
  • 2 papers: department of ar...
  • 2 papers: school of electr...
  • 2 papers: univ groningen i...
  • 2 papers: univ texas autom...
  • 2 papers: colorado state u...
  • 2 papers: guangxi univ sch...
  • 2 papers: national science...
  • 2 papers: informatics inst...

Authors

  • 13 papers: liu derong
  • 7 papers: hado van hasselt
  • 7 papers: marco a. wiering
  • 7 papers: dongbin zhao
  • 6 papers: zhao dongbin
  • 5 papers: xu xin
  • 5 papers: lewis frank l.
  • 5 papers: huaguang zhang
  • 5 papers: wei qinglai
  • 5 papers: derong liu
  • 5 papers: warren b. powell
  • 4 papers: haibo he
  • 4 papers: jagannathan s.
  • 4 papers: frank l. lewis
  • 4 papers: zhang huaguang
  • 4 papers: ni zhen
  • 4 papers: yanhong luo
  • 4 papers: wang ding
  • 4 papers: he haibo
  • 4 papers: damien ernst

Language

  • 246 papers: English
  • 1 paper: Other
Search query: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 records in total; showing 231-240
Robust Dynamic Programming for Discounted Infinite-Horizon Markov Decision Processes with Uncertain Stationary Transition Matrices
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Baohua Li, Jennie Si (Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA)
In this paper, finite-state, finite-action, discounted infinite-horizon-cost Markov decision processes (MDPs) with uncertain stationary transition matrices are discussed in the deterministic policy space. Uncertain st...
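The abstract is cut off before the authors' algorithm appears, so the sketch below is only a generic illustration of robust value iteration for a discounted-cost MDP with an uncertain transition matrix: a worst-case expectation is taken over a simple box uncertainty set around each nominal transition row before the Bellman backup. The uncertainty model, the toy numbers, and all function names are assumptions for illustration, not the paper's construction.

```python
import numpy as np

def worst_case_expectation(p_nom, v, delta):
    """Worst-case (largest) expected cost-to-go over a box uncertainty set:
    each transition probability may deviate from its nominal value by at most
    +/- delta while the row still sums to one. Greedily shifting probability
    mass toward the most expensive successor states attains the maximum."""
    lo = np.clip(p_nom - delta, 0.0, 1.0)
    hi = np.clip(p_nom + delta, 0.0, 1.0)
    p = lo.copy()
    budget = 1.0 - lo.sum()            # probability mass still to distribute
    for s_next in np.argsort(-v):      # most expensive successors first
        add = min(hi[s_next] - lo[s_next], budget)
        p[s_next] += add
        budget -= add
        if budget <= 1e-12:
            break
    return float(p @ v)

def robust_value_iteration(P, C, gamma=0.9, delta=0.05, iters=200):
    """Robust VI: V(s) = min_a max_{p in U(s,a)} [ c(s,a) + gamma * p.V ]."""
    n_actions, n_states = C.shape
    v = np.zeros(n_states)
    for _ in range(iters):
        q = np.empty((n_actions, n_states))
        for a in range(n_actions):
            for s in range(n_states):
                q[a, s] = C[a, s] + gamma * worst_case_expectation(P[a, s], v, delta)
        v = q.min(axis=0)              # greedy (deterministic) policy improvement
    return v, q.argmin(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.dirichlet(np.ones(3), size=(2, 3))   # nominal P[a, s, s'], 2 actions, 3 states
    C = rng.uniform(0.0, 1.0, size=(2, 3))       # stage costs c(s, a)
    v, policy = robust_value_iteration(P, C)
    print("robust values:", np.round(v, 3), "policy:", policy)
```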
Algorithm and stability of ATC receding horizon control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hongwei Zhang, Jie Huang (Department of Mechanical and Automation Engineering, Chinese University of Hong Kong, New Territories, Hong Kong, China); Frank L. Lewis (Automation and Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX, USA)
Receding horizon control (RHC), also known as model predictive control (MPC), is a suboptimal control scheme that solves a finite-horizon open-loop optimal control problem in an infinite-horizon context and yields a m...
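The generic receding-horizon loop the abstract refers to (solve a finite-horizon problem, apply only the first control, shift the window, repeat) is sketched below for a linear-quadratic problem solved by a backward Riccati recursion. This is standard RHC background, not the paper's ATC formulation or its stability analysis; the system matrices, horizon, and function names are illustrative assumptions.

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, N):
    """Backward Riccati recursion for the N-step finite-horizon LQ problem.
    Returns the first-step feedback gain, which receding-horizon control
    applies before re-solving the whole problem at the next time step."""
    P = Q.copy()
    K0 = None
    for _ in range(N):
        K0 = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K0)
    return K0

def receding_horizon_run(A, B, Q, R, x0, horizon=10, steps=30):
    """RHC/MPC loop: plan over a finite horizon, apply only the first control,
    shift the window, and repeat from the newly measured state."""
    x = x0.astype(float)
    traj = [x.copy()]
    for _ in range(steps):
        K0 = finite_horizon_lqr(A, B, Q, R, horizon)
        u = -K0 @ x                   # first element of the open-loop plan
        x = A @ x + B @ u             # plant update (no disturbance in this toy run)
        traj.append(x.copy())
    return np.array(traj)

if __name__ == "__main__":
    A = np.array([[1.0, 1.0], [0.0, 1.0]])   # discrete-time double integrator
    B = np.array([[0.0], [1.0]])
    Q, R = np.eye(2), np.array([[0.1]])
    traj = receding_horizon_run(A, B, Q, R, x0=np.array([5.0, 0.0]))
    print("final state:", np.round(traj[-1], 4))
```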
Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Andras Antos (Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary); Csaba Szepesvari (University of Alberta, Edmonton, Canada); Remi Munos (SequeL team, INRIA Futurs, University of Lille (USTL), Villeneuve d'Ascq, France)
We consider batch reinforcement learning problems in continuous space (expected total discounted-reward Markovian decision problems) where the training data is composed of the trajectory of some fixed behaviour policy. ...
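The abstract describes batch RL from a single behaviour-policy trajectory. As background only (not the paper's sampling-based fitted policy iteration or its analysis), here is a minimal fitted Q-iteration sketch on a batch of transitions collected from one random-policy trajectory, using linear least squares on radial-basis features; the toy environment, the features, and the hyperparameters are invented for illustration.

```python
import numpy as np

def features(s):
    """Simple radial-basis features over the 1-D state space [0, 1] (illustrative choice)."""
    centers = np.linspace(0.0, 1.0, 7)
    return np.exp(-((s - centers) ** 2) / (2 * 0.15 ** 2))

def fitted_q_iteration(batch, n_actions, gamma=0.95, iters=50):
    """Fitted Q-iteration on a fixed batch of transitions (s, a, r, s'):
    regress Q_{k+1} onto the targets r + gamma * max_a' Q_k(s', a'),
    with one linear least-squares fit per action."""
    dim = features(0.0).size
    W = np.zeros((n_actions, dim))                       # one weight vector per action
    S, A, R, S2 = (np.array([t[i] for t in batch]) for i in range(4))
    Phi = np.stack([features(s) for s in S])
    Phi2 = np.stack([features(s) for s in S2])
    for _ in range(iters):
        targets = R + gamma * (Phi2 @ W.T).max(axis=1)   # Bellman backup under current fit
        for a in range(n_actions):
            mask = A == a
            W[a], *_ = np.linalg.lstsq(Phi[mask], targets[mask], rcond=None)
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # One behaviour-policy trajectory on a toy "move toward 1.0" task.
    batch, s = [], 0.2
    for _ in range(500):
        a = int(rng.integers(2))                          # behaviour policy: uniform random
        s2 = float(np.clip(s + (0.05 if a == 1 else -0.05) + rng.normal(0, 0.01), 0.0, 1.0))
        r = 1.0 if s2 > 0.9 else 0.0
        batch.append((s, a, r, s2))
        s = s2
    W = fitted_q_iteration(batch, n_actions=2)
    q_at = lambda x: features(x) @ W.T
    print("greedy action at s=0.5 (expect 1 = move right):", int(np.argmax(q_at(0.5))))
```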
Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Asma Al-Tamimi, Frank Lewis (Automation & Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX, USA)
In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely heuristic dynamic programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman equation (HJB...
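HDP value iteration repeats the backup V_{k+1}(x) = min_u [ x'Qx + u'Ru + V_k(f(x, u)) ]. The sketch below specialises this to a linear system, where the recursion with V_0 = 0 becomes a Riccati-like iteration, and checks the discrete-time algebraic Riccati residual at convergence. It illustrates the recursion only; the paper's neural-network critic/actor implementation and its convergence proof are not reproduced here, and the toy system is an assumption.

```python
import numpy as np

def hdp_value_iteration(A, B, Q, R, iters=200):
    """HDP-style value iteration for a linear system with quadratic cost:
    V_k(x) = x' P_k x, and the backup
        V_{k+1}(x) = min_u [ x'Qx + u'Ru + V_k(Ax + Bu) ]
    reduces to the recursion below, starting from P_0 = 0."""
    P = np.zeros_like(Q)
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # minimising control gain
        P = Q + K.T @ R @ K + (A - B @ K).T @ P @ (A - B @ K)
    return P, K

if __name__ == "__main__":
    A = np.array([[0.9, 0.2], [0.0, 1.05]])   # mildly unstable but controllable toy system
    B = np.array([[0.0], [1.0]])
    Q, R = np.eye(2), np.array([[1.0]])
    P, K = hdp_value_iteration(A, B, Q, R)
    # At convergence P should satisfy the discrete-time algebraic Riccati equation.
    K_inf = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    residual = Q + A.T @ P @ A - A.T @ P @ B @ K_inf - P
    print("DARE residual norm:", np.linalg.norm(residual))
```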
Neuro-controller of cement rotary kiln temperature with adaptive critic designs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiaofeng Lin, Tangbo Liu, Shaojian Song, Chunning Song (College of Electrical Engineering, Guangxi University, Nanning, China)
The production process of the cement rotary kiln is a typical engineering thermodynamic process with large inertia, lag, and nonlinearity, so it is very difficult to control this process accurately using traditional contr...
A Dynamic Programming Approach to Viability Problems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Pierre-Arnaud Coquelin (Centre de Mathématiques Appliquées, Ecole Polytechnique, Palaiseau, France); Sophie Martin (Laboratoire d'Ingénierie pour les Systèmes Complexes, Cemagref de Clermont-Ferrand, Aubière, France); Remi Munos (INRIA Futurs, Université de Lille 3, France)
Viability theory considers the problem of maintaining a system under a set of viability constraints. The main tool for solving viability problems lies in the construction of the viability kernel, defined as the set of...
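The viability kernel is the set of initial states from which the system can be kept inside the constraint set forever. As a crude illustration of the dynamic-programming viewpoint (not the paper's method), the sketch below approximates the kernel for a double integrator on a grid by repeatedly discarding states from which every admissible control leaves the current viable set; the grid size, dynamics, and constraint bounds are assumptions.

```python
import numpy as np
from itertools import product

def viability_kernel(dt=0.1, n=41, u_set=(-1.0, 0.0, 1.0)):
    """Grid approximation of a viability kernel for the double integrator
    x+ = x + v*dt, v+ = v + u*dt under the constraint set K = [0,1] x [-1,1].
    States from which every admissible control maps outside the current viable
    set are removed until a fixed point (the approximate kernel) is reached."""
    xs = np.linspace(0.0, 1.0, n)
    vs = np.linspace(-1.0, 1.0, n)
    viable = np.ones((n, n), dtype=bool)          # start from the whole constraint set

    def snap(grid, value):
        return int(np.argmin(np.abs(grid - value)))

    changed = True
    while changed:
        changed = False
        for i, j in product(range(n), range(n)):
            if not viable[i, j]:
                continue
            ok = False
            for u in u_set:
                x2, v2 = xs[i] + vs[j] * dt, vs[j] + u * dt
                if 0.0 <= x2 <= 1.0 and -1.0 <= v2 <= 1.0 and viable[snap(xs, x2), snap(vs, v2)]:
                    ok = True
                    break
            if not ok:                             # no control keeps the state viable
                viable[i, j] = False
                changed = True
    return xs, vs, viable

if __name__ == "__main__":
    xs, vs, viable = viability_kernel()
    print("viable fraction of the constraint set:", round(float(viable.mean()), 3))
```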
Approximate Optimal Control-Based Neurocontroller with a State Observation System for Seedlings Growth in Greenhouse
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: H. D. Patino, J. A. Pucheta, C. Schugurensky, R. Fullana, B. Kuchen (Universidad Nacional de San Juan, San Juan, Argentina)
In this paper, an approximate optimal control-based neurocontroller for guiding seedling growth in a greenhouse is presented. The main goal of this approach is to obtain closed-loop operation with a state neurocon...
Leader-Follower Semi-Markov Decision Problems: Theoretical Framework and Approximate Solution
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kurian Tharakunnel, Siddhartha Bhattacharyya (Department of Information and Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA)
Leader-follower problems are hierarchical decision problems in which a leader uses incentives to induce certain desired behavior among a set of self-interested followers. Dynamic leader-follower problems extend this s...
Opposition-Based Q(λ) with Non-Markovian Update
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Maryam Shokri, Hamid R. Tizhoosh, Mohamed S. Kamel (Pattern Analysis and Machine Intelligence Laboratory, Department of Systems Design Engineering, University of Waterloo, ON, Canada; Department of Electrical and Computer Engineering, University of Waterloo, ON, Canada)
The OQ(λ) algorithm benefits from an extension of eligibility traces introduced as the opposition trace. This new technique combines the idea of opposition with eligibility traces to deal with large state space...
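The opposition-trace update itself is not visible in this snippet, so the sketch below covers only the standard ingredient it extends: tabular Watkins-style Q(λ) with replacing eligibility traces on a toy chain. The environment and hyperparameters are illustrative assumptions, and no opposition mechanism is implemented.

```python
import numpy as np

def q_lambda_chain(n_states=10, episodes=300, alpha=0.1, gamma=0.95,
                   lam=0.8, eps=0.1, seed=0):
    """Tabular Watkins-style Q(lambda) on a toy chain: action 1 moves right,
    action 0 moves left, and reaching the right end gives reward 1.
    Eligibility traces spread each TD error over recently visited
    state-action pairs; traces are cut after exploratory (non-greedy) moves."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, 2))
    for _ in range(episodes):
        E = np.zeros_like(Q)                                   # eligibility traces
        s = 0
        while s < n_states - 1:
            greedy = int(rng.choice(np.flatnonzero(Q[s] == Q[s].max())))
            a = int(rng.integers(2)) if rng.random() < eps else greedy
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            delta = r + gamma * Q[s2].max() - Q[s, a]          # off-policy TD error
            E[s, a] = 1.0                                      # replacing trace
            Q += alpha * delta * E
            # Watkins' cut: keep decaying traces only after greedy moves.
            E = E * gamma * lam if a == greedy else np.zeros_like(E)
            s = s2
    return Q

if __name__ == "__main__":
    Q = q_lambda_chain()
    print("greedy policy (1 = move right):", Q.argmax(axis=1))
```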
Coupling perception and action using minimax optimal control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Tom Erez, William D. Smart (Washington University, Saint Louis, MO, USA)
This paper proposes a novel approach for coupling perception and action through minimax dynamic programming. We tackle domains where the agent has some control over the observation process (e.g. via the manipulation o...
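As a generic illustration of minimax dynamic programming (not the paper's perception-action coupling), the sketch below runs minimax value iteration on a tiny zero-sum setting in which a controller minimises and an adversary maximises the backed-up cost; the random problem data and all names are assumptions for illustration.

```python
import numpy as np

def minimax_value_iteration(P, C, gamma=0.9, iters=300):
    """Minimax value iteration for a small zero-sum setting: the controller
    picks a to minimise and an adversary (e.g. a worst-case disturbance or
    observation) picks d to maximise, so
        V(s) = min_a max_d [ C[s, a, d] + gamma * sum_s' P[s, a, d, s'] V(s') ].
    P has shape (S, A, D, S) and C has shape (S, A, D)."""
    v = np.zeros(C.shape[0])
    q = np.zeros_like(C)
    for _ in range(iters):
        q = C + gamma * P @ v            # backed-up cost, shape (S, A, D)
        v = q.max(axis=2).min(axis=1)    # adversary maximises, controller minimises
    policy = q.max(axis=2).argmin(axis=1)
    return v, policy

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    S, A, D = 4, 2, 2
    P = rng.dirichlet(np.ones(S), size=(S, A, D))   # transition kernel P[s, a, d, s']
    C = rng.uniform(0.0, 1.0, size=(S, A, D))       # stage cost
    v, policy = minimax_value_iteration(P, C)
    print("minimax values:", np.round(v, 3), "controller policy:", policy)
```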