Search query: "Any field = IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"
307 records; showing 201-210

A Recurrent Control Neural Network for Data Efficient Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Anton Maximilian Schaefer, Steffen Udluft, Hans-Georg Zimmermann
Affiliations: Department of Optimisation and Operations Research, University of Ulm (EBS), Germany; Department of Learning Systems, Information & Communications, Siemens AG, Munich, Germany
In this paper we introduce a new model-based approach for data-efficient modelling and control of reinforcement learning problems in discrete time. Our architecture is based on a recurrent neural network (RNN) with ...

SVM Viability Controller Active Learning: Application to Bike Control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Laetitia Chapel, Guillaume Deffuant
Affiliations: Cemagref LISC, Aubiere, France
It was shown recently that SVMs are particularly adequate to define action policies to keep a dynamical system inside a given constraint set (in the framework of viability theory). However, the training set of the SVM...

Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Martin Riedmiller, Jan Peters, Stefan Schaal
Affiliations: NeuroInformatics Group, University of Osnabrück, Germany; Computational Learning and Motor Control, University of Southern California, USA
In this paper, we evaluate different versions from the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients and natural policy gradients. Each o...

Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Andras Antos, Csaba Szepesvari, Remi Munos
Affiliations: Computer and Automation Research Inst., Hungarian Academy of Sciences, Budapest, Hungary; University of Alberta, Edmonton, Canada; SequeL team, INRIA Futurs, University of Lille (USTL), Villeneuve d'Ascq, France
We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian decision problems when the training data is composed of the trajectory of some fixed behaviour policy. ...

2021 IEEE/ACM 29th International Symposium on Quality of Service, IWQOS 2021
29th IEEE/ACM International Symposium on Quality of Service, IWQOS 2021
The proceedings contain 105 papers. The topics discussed include: designing approximate and deployable SRPT scheduler: a unified framework; automated quality of service monitoring for 5G and beyond using distributed le...

Strategy Generation with Cognitive Distance in Two-Player Games
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kosuke Sekiyama, Ricardo Carnieri, Toshio Fukuda
Affiliations: Department of Micro-Nano Systems Engineering, University of Nagoya, Nagoya, Japan
In game theoretical approaches to multi-agent systems, a payoff matrix is often given a priori and used by agents in action selection. By contrast, in this paper we approach the problem of decision making by use of th...

A Dynamic Programming Approach to Viability Problems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Pierre-Arnaud Coquelin, Sophie Martin, Remi Munos
Affiliations: Centre de Mathématiques Appliquées, Ecole Polytechnique, Palaiseau, France; Laboratoire d'Ingénierie pour les Systèmes Complexes, Cemagref de Clermont-Ferrand, Aubiere, France; INRIA Futurs, Université de Lille 3, France
Viability theory considers the problem of maintaining a system under a set of viability constraints. The main tool for solving viability problems lies in the construction of the viability kernel, defined as the set of...

Optimal Sliding Mode Control of ROV Fixed Depth Attitude Based on Reinforcement Learning
11th IEEE Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2021
Authors: Fule, Wang; Qiuxia, Qu; Baolong, Yuan; Liangliang, Sun; Yupeng, Li; Guanyan, Guo; Zupeng, Xiao; Liang, Sun; Zhigang, Li
Affiliations: School of Information and Control Engineering, Shenyang Jianzhu University, Shenyang 110168, China; Shenyang Institute of Automation, Chinese Academy of Science, Shenyang 1101669, China
In this paper, an integral sliding mode control algorithm based on reinforcement learning is proposed for an underwater vehicle depth determination control system. Since it is difficult for nonlinear continuous systems t...

Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering, Edwin D. de Jong
Affiliations: Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands
This paper describes a novel algorithm called CON-MODP for computing Pareto optimal policies for deterministic multi-objective sequential decision problems. CON-MODP is a value iteration based multi-objective dynamic ...

Robust Dynamic Programming for Discounted Infinite-Horizon Markov Decision Processes with Uncertain Stationary Transition Matrices
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Baohua Li, Jennie Si
Affiliations: Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA
In this paper, finite-state, finite-action, discounted infinite-horizon-cost Markov decision processes (MDPs) with uncertain stationary transition matrices are discussed in the deterministic policy space. Uncertain st...