Search criteria: subject term = "Value Function Approximation"
112 records; showing 51-60

The dynamic shortest path problem with time-dependent stochastic disruptions
TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2018, vol. 92, issue Jul., pp. 42-57
Authors: Sever, Derya; Zhao, Lei; Dellaert, Nico; Demir, Emrah; Van Woensel, Tom; De Kok, Ton
Affiliations: Eindhoven Univ Technol Sch Ind Engn Eindhoven Netherlands; Tsinghua Univ Dept Ind Engn Beijing Peoples R China; Cardiff Univ Panalpina Ctr Mfg & Logist Res Cardiff Business Sch Cardiff S Glam Wales
The dynamic shortest path problem with time-dependent stochastic disruptions consists of finding a route with a minimum expected travel time from an origin to a destination using both historical and real-time informat...

Horizontal combinations of online and offline approximate dynamic programming for stochastic dynamic vehicle routing
CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2020, vol. 28, no. 1, pp. 279-308
Author: Ulmer, Marlin W.
Affiliation: Tech Univ Carolo Wilhelmina Braunschweig Braunschweig Germany
Stochastic and dynamic vehicle routing problems gain increasing attention in the research community. In these problems, routing plans are dynamically updated based on realizations of stochastic information. Due to the...

Optimal Energy Management of a Residential Prosumer: A Robust Data-Driven Dynamic Programming Approach
IEEE SYSTEMS JOURNAL, 2022, vol. 16, no. 1, pp. 1548-1557
Authors: Guo, Zhongjie; Wei, Wei; Chen, Laijun; Wang, Zhaojian; Catalao, Joao P. S.; Mei, Shengwei
Affiliations: Tsinghua Univ Dept Elect Engn State Key Lab Power Syst Beijing 100084 Peoples R China; Qinghai Univ New Energy Ind Res Ctr Xining 810016 Peoples R China; Univ Porto Fac Engn P-4200465 Porto Portugal; INESC TEC P-4200465 Porto Portugal
Prosumers are agents that both consume and produce energy. This article studies the optimal energy management of a residential prosumer which consists of a renewable power plant and an energy storage unit. Energy coul...

Proto-value functions: A Laplacian framework for learning representation and control in Markov decision processes
JOURNAL OF MACHINE LEARNING RESEARCH, 2007, vol. 8, no. 10, pp. 2169-2231
Authors: Mahadevan, Sridhar; Maggioni, Mauro
Affiliations: Univ Massachusetts Dept Comp Sci Amherst MA 01003 USA; Duke Univ Dept Math & Comp Sci Durham NC 27708 USA
This paper introduces a novel spectral framework for solving Markov decision processes (MDPs) by jointly learning representations and optimal policies. The major components of the framework described in this paper inc...

Hamiltonian-Driven Adaptive Dynamic Programming for Continuous Nonlinear Dynamical Systems
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, vol. 28, no. 8, pp. 1929-1940
Authors: Yang, Yongliang; Wunsch, Donald; Yin, Yixin
Affiliations: Univ Sci & Technol Beijing Sch Automat & Elect Engn Beijing 100083 Peoples R China; Missouri Univ Sci & Technol Dept Elect & Comp Engn Rolla MO 65401 USA
This paper presents a Hamiltonian-driven framework of adaptive dynamic programming (ADP) for continuous time nonlinear systems, which consists of evaluation of an admissible control, comparison between two different a...

Gaussian Process Approximate Dynamic Programming for Energy-Optimal Supervisory Control of Parallel Hybrid Electric Vehicles
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, vol. 71, no. 8, pp. 8367-8380
Authors: Bae, Jin Woo; Kim, Kwang-Ki K.
Affiliations: Texas A&M Univ Dept Comp Sci & Engn College Stn TX 77843 USA; Inha Univ Dept Elect & Comp Engn Incheon 22212 South Korea
We propose an energy-efficient supervisory control method for the power management of parallel hybrid electric vehicles (HEVs) to improve the fuel economy and reduce exhaust gas emissions. Plug-in HEVs ((P)HEVs) have ...

High-Speed Finite Control Set Model Predictive Control for Power Electronics
IEEE TRANSACTIONS ON POWER ELECTRONICS, 2017, vol. 32, no. 5, pp. 4007-4020
Authors: Stellato, Bartolomeo; Geyer, Tobias; Goulart, Paul J.
Affiliations: Univ Oxford Oxford OX1 3PJ England; ABB Corp Res Ctr CH-5405 Baden Switzerland
Common approaches for direct model predictive control (MPC) for current reference tracking in power electronics suffer from the high computational complexity encountered when solving integer optimal control problems o...

Technical update: Least-squares temporal difference learning
MACHINE LEARNING, 2002, vol. 49, no. 2-3, pp. 233-246
Author: Boyan, JA
Affiliation: ITA Software Cambridge MA 02139 USA
TD(lambda) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(lambda) works by incrementally updating the value function after each observed transition. It has two major drawbacks: i...

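The abstract of this record notes that TD(lambda) updates the value function incrementally after each observed transition. A minimal sketch of that kind of update follows, assuming linear function approximation with accumulating eligibility traces. The toy three-state chain, the function names (`td_lambda`, `one_hot`), and the step-size, discount, and trace parameters are illustrative assumptions, not taken from the cited paper. The sketch shows plain TD(lambda) only, not the least-squares variant the paper's title refers to.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def td_lambda(episodes, n_features, features, alpha=0.1, gamma=1.0, lam=0.8):
    """Estimate linear value-function weights with TD(lambda).

    episodes: iterable of episodes; each episode is a list of
              (state, reward, next_state) tuples, with next_state=None
              marking a terminal transition (hypothetical convention).
    features: function mapping a state to a feature list of length n_features.
    """
    w = [0.0] * n_features
    for episode in episodes:
        e = [0.0] * n_features  # eligibility trace, reset at episode start
        for s, r, s_next in episode:
            phi = features(s)
            v = dot(w, phi)
            v_next = 0.0 if s_next is None else dot(w, features(s_next))
            delta = r + gamma * v_next - v  # temporal-difference error
            # Decay the trace, then add the current features (accumulating trace).
            e = [gamma * lam * ei + pi for ei, pi in zip(e, phi)]
            # Incremental weight update after this single transition.
            w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w


def one_hot(s, n=3):
    """Tabular features: one indicator per state (assumed for the toy chain)."""
    return [1.0 if i == s else 0.0 for i in range(n)]


# Toy deterministic chain 0 -> 1 -> 2 -> terminal, reward 1 on the last step.
chain_episode = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
weights = td_lambda([chain_episode] * 500, 3, one_hot)
```

With an undiscounted return and a single reward of 1 at the end of the chain, the true value of every state is 1, so the learned weights should approach 1 for all three states.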
Two-loop reinforcement learning algorithm for finite-horizon optimal control of continuous-time affine nonlinear systems
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, vol. 32, no. 1, pp. 393-420
Authors: Chen, Zhe; Xue, Wenqian; Li, Ning; Lewis, Frank L.
Affiliations: Shanghai Jiao Tong Univ Dept Automat Shanghai 200240 Peoples R China; Minist Educ China Key Lab Syst Control & Informat Proc Shanghai Peoples R China; Shanghai Engn Res Ctr Intelligent Control & Manag Shanghai Peoples R China; Northeastern Univ State Key Lab Synthet Automat Proc Ind Shenyang Peoples R China; Northeastern Univ Int Joint Res Lab Integrated Automat Shenyang Peoples R China; Univ Texas Arlington UTA Res Inst Arlington TX 76019 USA
This article proposes three novel time-varying policy iteration algorithms for finite-horizon optimal control problem of continuous-time affine nonlinear systems. We first propose a model-based time-varying policy ite...

Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives
ROBOTICS AND AUTONOMOUS SYSTEMS, 2011, vol. 59, no. 11, pp. 910-922
Authors: Tamosiunaite, Minija; Nemec, Bojan; Ude, Ales; Woergoetter, Florentin
Affiliations: Univ Gottingen Inst Phys Biophys 3 Bernstein Ctr Computat Neurosci D-37077 Gottingen Germany; Vytautas Magnus Univ Dept Informat LT-44404 Kaunas Lithuania; Jozef Stefan Inst Dept Automat Biocybernet & Robot Ljubljana 1000 Slovenia
When describing robot motion with dynamic movement primitives (DMPs), goal (trajectory endpoint), shape and temporal scaling parameters are used. In reinforcement learning with DMPs, usually goals and temporal scaling...