Search criteria: subject term = "Value Function Approximation"
113 records; showing 91-100

Reinforcement learning with automatic basis construction based on isometric feature mapping
INFORMATION SCIENCES, 2014, vol. 286, pp. 209-227
Authors: Huang, Zhenhua; Xu, Xin; Zuo, Lei
Affiliations: Natl Univ Def Technol Coll Mechatron & Automat Changsha 410073 Hunan Peoples R China
Value function approximation (VFA) has been a major research topic in reinforcement learning. Although various reinforcement learning algorithms with VFA have been proposed, the performance of most previous algorithms...

The Control of Invasive Species on Private Property with Neighbor-to-Neighbor Spillovers
ENVIRONMENTAL & RESOURCE ECONOMICS, 2014, vol. 59, no. 2, pp. 231-255
Authors: Fenichel, Eli P.; Richards, Timothy J.; Shanafelt, David W.
Affiliations: Yale Univ Sch Forestry & Environm Studies New Haven CT 06511 USA; Arizona State Univ Morrison Sch Agribusiness & Resource Management Mesa AZ 85212 USA; Arizona State Univ Sch Life Sci Tempe AZ 85287 USA
Invasive pests cross property boundaries. Property managers may have private incentives to control invasive species despite not having sufficient incentive to fully internalize the external costs of their role in spre...

Methods for approximating value functions for the Dominion card game
EVOLUTIONARY INTELLIGENCE, 2014, vol. 6, no. 4, pp. 195-204
Authors: Winder, Ransom K.
Affiliations: Mitre Corp 7525 Colshire Dr Mclean VA 22102 USA
Artificial neural networks have been successfully used to approximate value functions for tasks involving decision making. In domains where decisions require a shift in judgment as the overall state changes, it is hyp...

Q-learning in Continuous State-Action Space with Redundant Dimensions by Using a Selective Desensitization Neural Network
Joint 7th International Conference on Soft Computing and Intelligent Systems (SCIS) and 15th International Symposium on Advanced Intelligent Systems (ISIS)
Authors: Kobayashi, Takaaki; Shibuya, Takeshi; Morita, Masahiko
Affiliations: Univ Tsukuba Grad Sch Syst & Informat Engn Tsukuba Ibaraki 3058573 Japan; Univ Tsukuba Fac Engn Informat & Syst Tsukuba Ibaraki 3058573 Japan
When applying reinforcement learning algorithms such as Q-learning to real world problems, we must consider the high and redundant dimensions and continuity of the state-action space. The continuity of state-action sp...

The Operation Optimization Model of Pumped-Hydro Power Storage Station Based on Approximate Dynamic Programming
International Conference on Power System Technology (PowerCon)
Authors: Liang, Zhencheng; Li, Yu; Wei, Hua
Affiliations: Guangxi Univ Sch Elect Engn Nanning 530004 Peoples R China; Guangxi Key Lab Power Syst Optimizat & Energy Tec Nanning Peoples R China
Based on the hypothesis that pumped-hydro power storage (PHPS) station is available for multi-day optimization and adjustment, the paper has proposed a long-term operation optimization model of PHPS station based on a...

Geodesic Gaussian kernels for value function approximation
AUTONOMOUS ROBOTS, 2008, vol. 25, no. 3, pp. 287-304
Authors: Sugiyama, Masashi; Hachiya, Hirotaka; Towell, Christopher; Vijayakumar, Sethu
Affiliations: Tokyo Inst Technol Dept Comp Sci Meguro Ku Tokyo 1528552 Japan; Univ Edinburgh Sch Informat Edinburgh EH9 3JZ Midlothian Scotland
The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basi...

DYNAMIC PRODUCT POSITIONING IN DIFFERENTIATED PRODUCT MARKETS: THE EFFECT OF FEES FOR MUSICAL PERFORMANCE RIGHTS ON THE COMMERCIAL RADIO INDUSTRY
ECONOMETRICA, 2013, vol. 81, no. 5, pp. 1763-1803
Authors: Sweeting, Andrew
Affiliations: Univ Maryland Dept Econ College Pk MD 20742 USA; Duke Univ Durham NC 27706 USA; NBER Cambridge MA 02138 USA
This article predicts how radio station formats would change if, as was recently proposed, music stations were made to pay fees for musical performance rights. It does so by estimating and solving, using parametric ap...

EXPERT-BASED REWARD SHAPING AND EXPLORATION SCHEME FOR BOOSTING POLICY LEARNING OF DIALOGUE MANAGEMENT
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
Authors: Ferreira, Emmanuel; Lefevre, Fabrice
Affiliations: Univ Avignon LIA F-84911 Avignon 9 France
This paper investigates the conditions under which expert knowledge can be used to accelerate the policy optimization of a learning agent. Recent works on reinforcement learning for dialogue management allowed to devi...

Incremental Sparse Bayesian Method for Online Dialog Strategy Learning
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2012, vol. 6, no. 8, pp. 903-916
Authors: Lee, Sungjin; Eskenazi, Maxine
Affiliations: Carnegie Mellon Univ Language Technol Inst Pittsburgh PA 15213 USA
This paper proposes an incremental sparse Bayesian learning method to allow continuous dialog strategy learning from the interactions with real users. Since conventional reinforcement learning (RL) methods require a h...

An Exemplar Test Problem on Parameter Convergence Analysis of Temporal Difference Algorithms
10th World Congress on Intelligent Control and Automation (WCICA)
Authors: Brown, Martin; Tutsoy, Onder
Affiliations: Univ Manchester Control Syst Grp Sch Elect & Elect Engn Manchester M13 9PL Lancs England
Reinforcement learning techniques have been developed to solve difficult learning control problems having small amount of a priori knowledge about the system dynamics. In this paper, a simple unstable exemplar test pr...