Search query: "Subject term = Value function approximation"
111 records; showing results 21-30
Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation  18
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems
Authors: Thomy Phan, Lenz Belzner, Thomas Gabor, Kyrill Schmid
Affiliations: LMU Munich, Munich, Germany
Making decisions is a great challenge in distributed autonomous environments due to enormous state spaces and uncertainty. Many online planning algorithms rely on statistical sampling to avoid searching the whole stat...
Improving Value Function Approximation in Factored POMDPs by Exploiting Model Structure  15
International Conference on Autonomous Agents and Multiagent Systems
Authors: Tiago S. Veiga, Matthijs T. J. Spaan, Pedro U. Lima
Affiliations: Institute for Systems Robotics, Instituto Superior Tecnico, Universidade de Lisboa; Delft University of Technology, Delft, The Netherlands
Linear value function approximation in Markov decision processes (MDPs) has been studied extensively, but there are several challenges when applying such techniques to partially observable MDPs (POMDPs). Furthermore, ...
NUMERICAL REALIZATION OF THE MORTENSEN OBSERVER VIA A HESSIAN-AUGMENTED POLYNOMIAL APPROXIMATION OF THE VALUE FUNCTION
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2025, Vol. 47, No. 1, pp. A181-A206
Authors: Breiten, Tobias; Kunisch, Karl K.; Schroeder, Jesper
Affiliations: Tech Univ Berlin, Inst Math, MA 44, D-10623 Berlin, Germany; Karl Franzens Univ Graz, Inst Math & Sci Comp, A-8010 Graz, Austria; Austrian Acad Sci, Johann Radon Inst, A-4040 Linz, Austria
Two related numerical schemes for the realization of the Mortensen observer or minimum energy estimator for the state reconstruction of nonlinear dynamical systems subject to deterministic disturbances are proposed an...
Controlled approximation of the value function in stochastic dynamic programming for multi-reservoir systems
COMPUTATIONAL MANAGEMENT SCIENCE, 2015, Vol. 12, No. 4, pp. 539-557
Authors: Zephyr, Luckny; Lang, Pascal; Lamond, Bernard F.
Affiliations: Univ Laval, Operat & Decis Syst Dept, Pavillon Palasis Prince, 2325 Rue Terrasse, Quebec City, PQ G1V 0A6, Canada
We present a new approach for adaptive approximation of the value function in stochastic dynamic programming. Under convexity assumptions, our method is based on a simplicial partition of the state space. Bounds on the...
Tacit knowledge-informed approximate dynamic programming for last-mile delivery operations in online-to-offline pharmacies
INDUSTRIAL MANAGEMENT & DATA SYSTEMS, 2025, Vol. 125, No. 3, pp. 1078-1109
Authors: Yang, Xuan; Luo, Hao; Nie, Xinyao; Kong, Xiangtianrui
Affiliations: Shenzhen Univ, Coll Econ, Postdoctoral Res Stn Theoret Econ, Shenzhen, Peoples R China; Shenzhen Univ, Coll Econ, Dept Supply Chain Management, Shenzhen, Peoples R China
Purpose: Tacit knowledge in frontline operations is primarily reflected in the holders' intuition about dynamic systems. Despite the implicit nature of tacit knowledge, the understanding of complex systems it encaps...
A comparison of reinforcement learning policies for dynamic vehicle routing problems with stochastic customer requests
COMPUTERS & INDUSTRIAL ENGINEERING, 2025, Vol. 200
Authors: Akkerman, Fabian; Mes, Martijn; van Jaarsveld, Willem
Affiliations: Univ Twente, Ind Engn & Business Informat Syst, NL-7500 AE Enschede, Netherlands; Eindhoven Univ Technol, Operat Planning Accounting & Control, NL-5600 MB Eindhoven, Netherlands
This paper presents directions for using reinforcement learning with neural networks for dynamic vehicle routing problems (DVRPs). DVRPs involve sequential decision-making under uncertainty where the expected future c...
Value Function Approximations via Kernel Embeddings for No-Regret Reinforcement Learning  14
Asian Conference on Machine Learning (ACML)
Authors: Chowdhury, Sayak Ray; Oliveira, Rafael
Affiliations: Microsoft Res, Bengaluru, India; Univ Sydney, Sydney, NSW, Australia
We consider the regret minimization problem in reinforcement learning (RL) in the episodic setting. In many real-world RL environments, the state and action spaces are continuous or very large. Existing approaches est...
Turning from crime: A dynamic perspective
JOURNAL OF ECONOMETRICS, 2008, Vol. 145, No. 1-2, pp. 158-173
Authors: Sickles, Robin C.; Williams, Jenny
Affiliations: Rice Univ, Dept Econ, Houston, TX 77251, USA; Univ Melbourne, Melbourne, Vic 3010, Australia
This paper examines criminal choice using a variant of the human capital model. The innovation of our approach is that it attempts to disaggregate individual capital, not unlike production-based studies which disaggre...
Incentivized self-rebalancing fleet in electric vehicle sharing
IISE TRANSACTIONS, 2021, Vol. 54, No. 2, pp. 173-185
Authors: Wu, Yuguang; Chen, Minmin; Wang, Xin
Affiliations: Univ Wisconsin Madison, Dept Ind & Syst Engn, Madison, WI 53706, USA; Amazon Com Serv LLC, Seattle, WA, USA; Univ Wisconsin Madison, Grainger Inst Engn, Madison, WI 53706, USA
With the rising need for efficient and flexible short-distance urban transportation, more vehicle sharing companies are offering one-way car-sharing services. Electrified vehicle sharing systems are even more effectiv...
Applying unweighted least-squares based techniques to stochastic dynamic programming: theory and application
IET CONTROL THEORY AND APPLICATIONS, 2019, Vol. 13, No. 15, pp. 2387-2398
Authors: Forootani, Ali; Iervolino, Raffaele; Tipaldi, Massimo
Affiliations: Univ Sannio, Dept Engn, Piazza Roma 21, I-82100 Benevento, Italy; Univ Naples Federico II, Dept Elect Engn & Informat Technol, Via Claudio 21, I-80125 Naples, Italy
Big data and the curse of dimensionality are common vocabularies that researchers in different communities have recently been dealing with, e.g. dynamic programming (DP) in automatic control system society. A novel un...