
Refine search results

Document type

  • 81 journal articles
  • 29 conference papers
  • 2 theses

Holdings

  • 112 electronic documents
  • 0 print holdings

Date distribution

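Subject classification

  • 88 papers: Engineering
    • 54 papers: Computer Science and Technology...
    • 37 papers: Electrical Engineering
    • 30 papers: Control Science and Engineering
    • 8 papers: Transportation Engineering
    • 7 papers: Oil and Natural Gas Engineering
    • 5 papers: Information and Communication Engineering
    • 5 papers: Software Engineering
    • 3 papers: Power Engineering and Engineering Thermo...
    • 2 papers: Instrument Science and Technology
    • 2 papers: Civil Engineering
    • 1 paper: Electronic Science and Technology (may...
    • 1 paper: Chemical Engineering and Technology
    • 1 paper: Naval Architecture and Ocean Engineering
    • 1 paper: Environmental Science and Engineering (may...
  • 28 papers: Management
    • 28 papers: Management Science and Engineering (may...
    • 3 papers: Business Administration
  • 25 papers: Science
    • 23 papers: Mathematics
    • 4 papers: Systems Science
    • 1 paper: Physics
    • 1 paper: Statistics (may be awarded in Science, ...
  • 11 papers: Economics
    • 7 papers: Theoretical Economics
    • 3 papers: Applied Economics
  • 3 papers: Medicine
    • 3 papers: Clinical Medicine
    • 2 papers: Basic Medicine (may be awarded in Medicine...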

Topics

  • 112 papers: value function a...
  • 37 papers: reinforcement le...
  • 18 papers: approximate dyna...
  • 12 papers: dynamic programm...
  • 7 papers: dynamic vehicle ...
  • 7 papers: temporal differe...
  • 6 papers: q-learning
  • 5 papers: function approxi...
  • 5 papers: markov decision ...
  • 4 papers: markov decision ...
  • 4 papers: neural networks
  • 4 papers: optimal control
  • 4 papers: policy iteration
  • 3 papers: rate of converge...
  • 3 papers: actor-critic
  • 3 papers: policy evaluatio...
  • 3 papers: polynomial basis...
  • 3 papers: reinforcement le...
  • 3 papers: energy managemen...
  • 3 papers: off-policy learn...

Institutions

  • 2 papers: beijing univ che...
  • 2 papers: hefei univ techn...
  • 2 papers: missouri univ sc...
  • 2 papers: univ massachuset...
  • 2 papers: tokyo inst techn...
  • 2 papers: northeastern uni...
  • 2 papers: univ sci & techn...
  • 2 papers: tech univ carolo...
  • 2 papers: natl univ def te...
  • 2 papers: georgia inst tec...
  • 2 papers: chinese acad sci...
  • 2 papers: otto von guerick...
  • 2 papers: rice univ dept e...
  • 1 paper: polish acad sci ...
  • 1 paper: shanghai engn re...
  • 1 paper: tsinghua univ de...
  • 1 paper: univ sydney sch ...
  • 1 paper: inria nancy gran...
  • 1 paper: univ southern ca...
  • 1 paper: univ twente ind ...

Authors

  • 6 papers: ulmer marlin w.
  • 5 papers: song tianheng
  • 5 papers: li dazi
  • 4 papers: xu xin
  • 4 papers: mattfeld dirk c.
  • 3 papers: soeffker ninja
  • 3 papers: hachiya hirotaka
  • 2 papers: tutsoy onder
  • 2 papers: huang zhenhua
  • 2 papers: savelsbergh mart...
  • 2 papers: montoya juan m.
  • 2 papers: lewis frank l.
  • 2 papers: pietquin olivier
  • 2 papers: jin qibing
  • 2 papers: sickles robin c.
  • 2 papers: geist matthieu
  • 2 papers: li ping
  • 2 papers: chapman archie c...
  • 2 papers: zuo lei
  • 2 papers: cervellera crist...

Language

  • 110 papers: English
  • 1 paper: Other

Search query: "Subject term = Value Function Approximation"
112 records; showing 101-110
Optimal intra-day operations of behind-the-meter battery storage for primary frequency regulation provision: A hybrid lookahead method
ENERGY, 2022, Vol. 247, p. 123482
Authors: Wen, Kerui; Li, Weidong; Yu, Samson Shenglong; Li, Ping; Shi, Peng
Affiliations: Dalian Univ Technol, Fac Elect Informat & Elect Engn, Dalian 116024, Peoples R China; Deakin Univ, Sch Engn, 75 Pigdons Rd, Waurn Ponds, Vic 3216, Australia; State Grid Liaoning Elect Power Supply Co Ltd, Elect Power Res Inst, Shenyang 110006, Peoples R China; Univ Adelaide, Sch Elect & Elect Engn, North Terrace, Adelaide, SA 5000, Australia
Battery energy storage systems (BESSs) are being widely installed behind-the-meter to reduce electricity bills. By providing grid ancillary services, behind-the-meter BESSs can increase potential revenue streams. This ...
Value-gradient iteration with quadratic approximate value functions
ANNUAL REVIEWS IN CONTROL, 2023, Vol. 56
Authors: Yang, Alan; Boyd, Stephen
Affiliations: Stanford Univ, Dept Elect Engn, Stanford, CA 94305, USA
We propose a method for designing policies for convex stochastic control problems characterized by random linear dynamics and convex stage cost. We consider policies that employ quadratic approximate value functions a...
The Dynamic Freight Routing Problem for Less-Than-Truckload Carriers
TRANSPORTATION SCIENCE, 2023, Vol. 57, No. 3, pp. 717-740
Authors: Baubaid, Ahmad; Boland, Natashia; Savelsbergh, Martin
Affiliations: King Fahd Univ Petr & Minerals, Ind & Syst Engn Dept, Dhahran 31261, Saudi Arabia; King Fahd Univ Petr & Minerals, Interdisciplinary Res Ctr Smart Mobil & Logist, Dhahran 31261, Saudi Arabia; Georgia Inst Technol, H Milton Stewart Sch Ind & Syst Engn, Atlanta, GA 30332, USA
Less-than-truckload (LTL) carriers transport freight shipments from origins to destinations by consolidating freight using a network of terminals. As daily freight quantities are uncertain, carriers dynamically decide...
Approximate dynamic programming for constrained linear systems: A piecewise quadratic approximation approach
AUTOMATICA, 2024, Vol. 160
Authors: He, Kanghui; Shi, Shengling; van den Boom, Ton; De Schutter, Bart
Affiliations: Delft Univ Technol, Delft Ctr Syst & Control, Delft, Netherlands
Approximate dynamic programming (ADP) faces challenges in dealing with constraints in control problems. Model predictive control (MPC) is, in comparison, well-known for its accommodation of constraints and stability g...
F-Discrepancy for Efficient Sampling in Approximate Dynamic Programming
IEEE TRANSACTIONS ON CYBERNETICS, 2016, Vol. 46, No. 7, pp. 1628-1639
Authors: Cervellera, Cristiano; Maccio, Danilo
Affiliations: CNR, Inst Intelligent Syst Automat, I-16149 Genoa, Italy
In this paper, we address the problem of generating efficient state sample points for the solution of continuous-state finite-horizon Markovian decision problems through approximate dynamic programming. It is known th...
Attitude control for hypersonic reentry vehicles: An efficient deep reinforcement learning method
APPLIED SOFT COMPUTING, 2022, Vol. 123
Authors: Liu, Yiheng; Wang, Honglun; Wu, Tiancai; Lun, Yuebin; Fan, Jiaxuan; Wu, Jianfa
Affiliations: Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China; Beihang Univ, Shenyuan Honors Coll, Beijing 100191, Peoples R China; Beihang Univ, Sci & Technol Aircraft Control Lab, Beijing 100191, Peoples R China
Aiming at the attitude control problem of hypersonic reentry vehicles (HRVs), a deep reinforcement learning (DRL) based anti-disturbance control method is proposed. First, a compound control framework consisting of a ...
Distributed Gradient Temporal Difference Off-policy Learning With Eligibility Traces: Weak Convergence
IFAC-PapersOnLine, 2020, Vol. 53, No. 2, pp. 1563-1568
Authors: Miloš S. Stanković; Marko Beko; Srdjan S. Stanković
Affiliations: Innovation Center, School of Electrical Engineering, University of Belgrade, Vlatacom Institute, Belgrade, and Singidunum University, Belgrade, Serbia; COPELABS, Universidade Lusófona de Humanidades e Tecnologias, Lisboa, Portugal, and UNINOVA, Caparica, Portugal; School of Electrical Engineering, University of Belgrade, Serbia, and Vlatacom Institute, Belgrade, Serbia
In this paper we propose two novel distributed algorithms for multi-agent off-policy learning of linear approximation of the value function in Markov decision processes. The algorithms differ in the way of how distrib...
Sarsa(lambda)-based Logistics Planning Approximated by Value Function With Policy Iteration
JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2015, Vol. 9, No. 4, pp. 449-466
Authors: Tang, Yu
Affiliations: Taizhou Univ, Taizhou Chunhui Rd 100, Taizhou 225300, Jiangsu, Peoples R China
The logistics planning problem has been extensively investigated for a long time. However, with the increasing number of stochastic events occurred in road, increasing number of stochastic factors should be taken into...
Sarsa(λ)-Based Logistics Planning Approximated by Value Function with Policy Iteration
Journal of Algorithms & Computational Technology, 2015, Vol. 9, No. 4, pp. 449-466
Authors: Yu Tang
Affiliations: Taizhou University, Taizhou, Jiangsu 225300, China
The logistics planning problem has been extensively investigated for a long time. However, with the increasing number of stochastic events occurred in road, increasing number of stochastic factors should be taken into...
An Exemplar Test Problem on Parameter Convergence Analysis of Temporal Difference Algorithms
World Congress on Intelligent Control and Automation
Authors: Martin Brown; Onder Tutsoy
Affiliations: Control Systems Group, School of Electrical and Electronic Engineering, The University of Manchester
Reinforcement learning techniques have been developed to solve difficult learning control problems having small amount of a priori knowledge about the system dynamics. In this paper, a simple unstable exemplar test pr...