Search query: "Subject term = Value Function Approximation"
112 records; showing 61-70

Transfer reinforcement learning for multi-agent pursuit-evasion differential game with obstacles in a continuous environment
ASIAN JOURNAL OF CONTROL, 2024, vol. 26, no. 4, pp. 2125-2140
Authors: Hu, Penglin; Pan, Quan; Zhao, Chunhui; Guo, Yaning
Affiliations: Northwestern Polytech Univ, Sch Automat, Xian 710130, Peoples R China
In this paper, we study the multi-pursuer single-evader pursuit-evasion (MSPE) differential game in a continuous environment with the consideration of obstacles. We propose a novel pursuit-evasion algorithm based on r...

A generalized energy management framework for hybrid construction vehicles via model-based reinforcement learning
ENERGY, 2022, vol. 260
Authors: Zhang, Wei; Wang, Jixin; Xu, Zhenyu; Shen, Yuying; Gao, Guangzong
Affiliations: Jilin Univ, Sch Mech & Aerosp Engn, Changchun 130012, Peoples R China; State Key Lab Smart Mfg Special Vehicles & Transm, Baotou 014030, Peoples R China; Huazhong Univ Sci & Technol, State Key Lab Digital Mfg Equipment & Technol, Wuhan, Peoples R China
Hybrid construction vehicles (HCVs) have more specific tasks and highly repetitive patterns than on-road vehicles. Consequently, they are more suitable for model-based energy management. However, distinctions betwee...

Sparse online kernelized actor-critic Learning in reproducing kernel Hilbert space
ARTIFICIAL INTELLIGENCE REVIEW, 2022, vol. 55, no. 1, pp. 23-58
Authors: Yang, Yongliang; Zhu, Hufei; Zhang, Qichao; Zhao, Bo; Li, Zhenning; Wunsch, Donald C.
Affiliations: Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China; Beijing Normal Univ, Sch Syst Sci, Beijing 100875, Peoples R China; Univ Macau, State Key Lab Internet Things Smart City, Taipa 59193, Macao, Peoples R China; Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65401, USA
In this paper, we develop a novel non-parametric online actor-critic reinforcement learning (RL) algorithm to solve optimal regulation problems for a class of continuous-time affine nonlinear dynamical systems. To dea...

Deep reinforcement learning using least-squares truncated temporal-difference
CAAI Transactions on Intelligence Technology, 2024, vol. 9, no. 2, pp. 425-439
Authors: Junkai Ren; Yixing Lan; Xin Xu; Yichuan Zhang; Qiang Fang; Yujun Zeng
Affiliations: College of Intelligence Science and Technology, National University of Defense Technology, Changsha, China; State Key Laboratory of Astronautic Dynamics, Xi'an Satellite Control Center, Xi'an, China
Policy evaluation (PE) is a critical sub-problem in reinforcement learning, which estimates the value function for a given policy and can be used for policy ***, there still exist some limitations in current PE methods, su...

A Fast Technique for Smart Home Management: ADP With Temporal Difference Learning
IEEE TRANSACTIONS ON SMART GRID, 2018, vol. 9, no. 4, pp. 3291-3303
Authors: Keerthisinghe, Chanaka; Verbic, Gregor; Chapman, Archie C.
Affiliations: Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
This paper presents a computationally efficient smart home energy management system (SHEMS) using an approximate dynamic programming (ADP) approach with temporal difference learning for scheduling distributed energy r...

Value function gradient learning for large-scale multistage stochastic programming problems
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, vol. 308, no. 1, pp. 321-335
Authors: Lee, Jinkyu; Bae, Sanghyeon; Kim, Woo Chang; Lee, Yongjae
Affiliations: Korea Adv Inst Sci & Technol (KAIST), Dept Ind & Syst Engn, Daejeon, South Korea; Ulsan Natl Inst Sci & Technol (UNIST), Dept Ind Engn, Ulsan, South Korea
A stagewise decomposition algorithm called "value function gradient learning" (VFGL) is proposed for large-scale multistage stochastic convex programs. VFGL finds the parameter values that best fit the grad...

Adaptive dynamic programming for robust neural control of unknown continuous-time non-linear systems
IET CONTROL THEORY AND APPLICATIONS, 2017, vol. 11, no. 14, pp. 2307-2316
Authors: Yang, Xiong; He, Haibo; Liu, Derong; Zhu, Yuanheng
Affiliations: Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China; Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA; Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
The design of robust controllers for continuous-time (CT) non-linear systems with completely unknown non-linearities is a challenging task. The inability to accurately identify the non-linearities online or offline mo...

Link Analysis for Solving Multiple-Access MDPs With Large State Spaces
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, vol. 71, pp. 947-962
Authors: Bozkus, Talha; Mitra, Urbashi
Affiliations: Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90007, USA
Wireless communication networks can be well-modeled by Markov Decision Processes (MDPs). While traditional dynamic programming algorithms such as value and policy iteration have lower complexity than brute force strat...

Incremental Sparse Bayesian Method for Online Dialog Strategy Learning
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2012, vol. 6, no. 8, pp. 903-916
Authors: Lee, Sungjin; Eskenazi, Maxine
Affiliations: Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213, USA
This paper proposes an incremental sparse Bayesian learning method to allow continuous dialog strategy learning from the interactions with real users. Since conventional reinforcement learning (RL) methods require a h...

The Control of Invasive Species on Private Property with Neighbor-to-Neighbor Spillovers
ENVIRONMENTAL & RESOURCE ECONOMICS, 2014, vol. 59, no. 2, pp. 231-255
Authors: Fenichel, Eli P.; Richards, Timothy J.; Shanafelt, David W.
Affiliations: Yale Univ, Sch Forestry & Environm Studies, New Haven, CT 06511, USA; Arizona State Univ, Morrison Sch Agribusiness & Resource Management, Mesa, AZ 85212, USA; Arizona State Univ, Sch Life Sci, Tempe, AZ 85287, USA
Invasive pests cross property boundaries. Property managers may have private incentives to control invasive species despite not having sufficient incentive to fully internalize the external costs of their role in spre...