Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"
1,020 records; showing records 601-610
Data-driven partially observable dynamic processes using adaptive dynamic programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Xiangnan Zhong, Zhen Ni, Yufei Tang, Haibo He
Affiliation: Department of Electrical, University of Rhode Island, Kingston, RI, USA
Adaptive dynamic programming (ADP) has been widely recognized as one of the “core methodologies” to achieve optimal control for intelligent systems in Markov decision process (MDP). Generally, ADP control design req...
Optimal self-learning battery control in smart residential grids by iterative Q-learning algorithm
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Qinglai Wei, Derong Liu, Guang Shi, Yu Liu, Qiang Guan
Affiliation: The State Key Laboratory of Management and Control for Complex Systems, Chinese Academy of Sciences
In this paper, a novel dual iterative Q-learning algorithm is developed to solve the optimal battery management and control problems in smart residential environments. The main idea is to use adaptive dynamic programm...
Heuristics for multiagent reinforcement learning in decentralized decision problems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Martin W. Allen, David Hahn, Douglas C. MacFarland
Affiliations: Computer Science Department, University of Wisconsin-La Crosse, La Crosse, Wisconsin; Computer Science Department, Worcester Polytechnic Institute, Worcester, Massachusetts
Decentralized partially observable Markov decision processes (Dec-POMDPs) model cooperative multiagent scenarios, providing a powerful general framework for team-based artificial intelligence. While optimal algorithms...
Active learning for classification: An optimistic approach
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Timothé Collet, Olivier Pietquin
Affiliations: GeorgiaTech-CNRS UMI, France; Supelec, MaLIS Research group, France; LIFL (UMR 8022 CNRS / Lille 1), IUF (Institut Universitaire de France), France; University Lille 1, France
In this paper, we propose to reformulate the active learning problem occurring in classification as a sequential decision making problem. We particularly focus on the problem of dynamically allocating a fixed budget o...
Model-free Q-learning over finite horizon for uncertain linear continuous-time systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hao Xu, S. Jagannathan
Affiliations: College of Science and Engineering, Texas A&M University-Corpus Christi, Corpus Christi, TX, USA; Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, USA
In this paper, a novel optimal control over finite horizon has been introduced for linear continuous-time systems by using adaptive dynamic programming (ADP). First, a new time-varying Q-function parameterization and ...
Adaptive fault identification for a class of nonlinear dynamic systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Li-Bing Wu, Dan Ye, Xin-Gang Zhao
Affiliations: College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, P. R. China; College of Sciences, University of Science and Technology Liaoning, Anshan, Liaoning, P. R. China; State Key Laboratory of Robotics and Shenyang Institute of Automation, CAS, Shenyang, Liaoning, P. R. China
This paper is concerned with the diagnosis problem of actuator faults for a class of nonlinear systems. It is assumed that the upper bound of the Lipschitz constant of the nonlinearity in the faulty system is unknown....
Cognitive control in cognitive dynamic systems: A new way of thinking inspired by the brain
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Simon Haykin, Ashkan Amiri, Mehdi Fatemi
Affiliation: Cognitive Systems Laboratory, McMaster University, Hamilton, Ontario, Canada
Briefly, the main purpose of the paper is fourfold: a) Cognitive perception, which consists of two functional blocks: improved sparse-coding under the influence of perceptual attention for extracting relevant information ...
A Multi-Agent Q-Learning-based Framework for Achieving Fairness in HTTP Adaptive Streaming
14th IEEE/IFIP Network Operations and Management Symposium (NOMS)
Authors: Petrangeli, Stefano; Claeys, Maxim; Latre, Steven; Famaey, Jeroen; De Turck, Filip
Affiliations: Univ Ghent, IMinds, Dept Informat Technol (INTEC), B-9050 Ghent, Belgium; Univ Antwerp, IMinds, Dept Math & Comp Sci, B-2020 Antwerp, Belgium
HTTP Adaptive Streaming (HAS) is quickly becoming the de facto standard for Over-The-Top video streaming. In HAS, each video is temporally segmented and stored in different quality levels. Quality selection heuristics...
Reinforcement learning-based optimal control considering L computation time delay of linear discrete-time systems
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Taishi Fujita, Toshimitu Ushio
Affiliation: Department of Cybernetics, Czech Technical University, Prague, Czech Republic
In embedded control systems, the control input is computed based on sensing data of a plant in a processor and there is a delay, called the computation time delay, due to the computation and the data transmission. Whe...
Closed-loop control of anesthesia and mean arterial pressure using reinforcement learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Regina Padmanabhan, Nader Meskin, Wassim M. Haddad
Affiliations: Department of Electrical Engineering, Qatar University, Qatar; School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA, USA
General anesthesia is required for patients undergoing surgery as well as for some patients in the intensive care units with acute respiratory distress syndrome. However, most anesthetics affect cardiac and respirato...