Search query: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"
247 records; showing 221-230

Short-term Stock Market Timing Prediction under Reinforcement Learning Schemes
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hailin Li, Cihan H. Dagli, David Enke
Affiliation: Department of Engineering Management and Systems Engineering, University of Missouri Rolla, Rolla, MO, USA
There are fundamental difficulties when using only a supervised learning philosophy to predict short-term financial stock movements. We present a reinforcement-oriented forecasting framework in which the solution is c...

A Recurrent Control Neural Network for Data Efficient Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Anton Maximilian Schaefer, Steffen Udluft, Hans-Georg Zimmermann
Affiliations: Department of Optimisation and Operations Research, University of Ulm (EBS), Germany; Department of Learning Systems, Information & Communications, Siemens AG, Munich, Germany
In this paper we introduce a new model-based approach for data-efficient modelling and control of reinforcement learning problems in discrete time. Our architecture is based on a recurrent neural network (RNN) with ...

A Scalable Model-Free Recurrent Neural Network Framework for Solving POMDPs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Zhenzhen Liu, Itamar Elhanany
Affiliation: Department of Electrical & Computer Engineering, University of Tennessee, Knoxville, TN, USA
This paper presents a framework for obtaining an optimal policy in model-free partially observable Markov decision problems (POMDPs) using a recurrent neural network (RNN). A Q-function approximation approach is taken...

Opposition-Based Reinforcement Learning in the Management of Water Resources
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: M. Mahootchi, H. R. Tizhoosh, K. Ponnambalam
Affiliation: Systems Design Engineering, University of Waterloo, Waterloo, ONT, Canada
Opposition-based learning (OBL) is a new scheme in machine intelligence. In this paper, an OBL version of Q-learning, which exploits opposite quantities to accelerate learning, is used for management of single reservoi...

Coordinated Reinforcement Learning for Decentralized Optimal Control
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Daniel Yagan, Chen-Khong Tham
Affiliation: Department of Electrical and Computer Engineering, National University of Singapore, Singapore
We consider a multi-agent system where the overall performance is affected by the joint actions or policies of agents. However, each agent only observes a partial view of the global state condition. This model is know...

DHP Adaptive Critic Motion Control of Autonomous Wheeled Mobile Robot
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Wei-Song Lin, Ping-Chieh Yang
Affiliation: Department and Institute of Electrical Engineering, National Taiwan University, Taipei, Taiwan
Autonomous driving of a wheeled mobile robot (WMR) requires implementing velocity and path tracking control subject to complex dynamical constraints. Conventionally, this control design is obtained by analysis and synthesis ...

Dual Representations for Dynamic Programming and Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Tao Wang, Michael Bowling, Dale Schuurmans
Affiliation: Department of Computing Science, University of Alberta, Edmonton, Canada
We investigate the dual approach to dynamic programming and reinforcement learning, based on maintaining an explicit representation of stationary distributions as opposed to value functions. A significant advantage of...

Continuous-Time ADP for Linear Systems with Partially Unknown Dynamics
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Draguna Vrabie, Murad Abu-Khalaf, Frank L. Lewis, Youyi Wang
Affiliations: Automation and Robotics Research Institute, University of Texas Arlington, Fort Worth, TX, USA; School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Approximate dynamic programming has been formulated and applied mainly to discrete-time systems. Expressing the ADP concept for continuous-time systems raises difficult issues related to sampling time and system model...

An Approximate Dynamic Programming Approach for Job Releasing and Sequencing in a Reentrant Manufacturing Line
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Jose A. Ramirez-Hernandez, Emmanuel Fernandez
Affiliation: Department of Electrical & Computer Engineering, University of Cincinnati, OH, USA
This paper presents the application of an approximate dynamic programming (ADP) algorithm to the problem of job releasing and sequencing in a benchmark reentrant manufacturing line (RML). The ADP approach is based on ...

Computing Optimal Stationary Policies for Multi-Objective Markov Decision Processes
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Marco A. Wiering, Edwin D. de Jong
Affiliation: Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands
This paper describes a novel algorithm called CON-MODP for computing Pareto optimal policies for deterministic multi-objective sequential decision problems. CON-MODP is a value iteration based multi-objective dynamic ...