Author Affiliations: McGill Univ, Sch Comp Sci, Montreal, PQ H3A 0E9, Canada; Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213, USA; Mitsubishi Elect Res Labs, Cambridge, MA 02139, USA; Natl Lab Sci Comp LNCC, BR-25651075 Petropolis, Brazil
Publication: IEEE TRANSACTIONS ON AUTOMATIC CONTROL (IEEE Trans Autom Control)
Year/Volume/Issue: 2015, Vol. 60, No. 11
Pages: 2989-2993
Subject Classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0811 [Engineering - Control Science and Engineering]
Funding: Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords: approximate dynamic programming; approximate policy iteration; classification; finite-sample analysis; reinforcement learning
Abstract: Tackling large approximate dynamic programming or reinforcement learning problems requires methods that can exploit the regularities of the problem at hand. Most current methods are geared towards exploiting the regularities of either the value function or the policy. We introduce a general classification-based approximate policy iteration (CAPI) framework that can exploit regularities of both. We establish theoretical guarantees on the sample complexity of CAPI-style algorithms, which allow the policy evaluation step to be performed by a wide variety of algorithms and can handle nonparametric representations of policies. Our bounds on the estimation error of the performance loss are tighter than existing results.
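Illustrative note: the abstract refers to the classification-based approximate policy iteration (CAPI) family of methods. The Python sketch below shows only the generic classification-based policy iteration loop (rollout-based action-value estimates at sampled states, followed by fitting a classifier to the greedy action labels); it is not the authors' CAPI algorithm. The environment interface env.sample(state, action), the rollout horizon, and the decision-tree policy representation are all illustrative assumptions, not details from the paper.

    # Minimal sketch of generic classification-based approximate policy iteration.
    # Assumptions (not from the source): a generative model `env.sample(s, a)` that
    # returns (next_state, reward); states given as 1-D feature vectors; a
    # scikit-learn decision tree as the (nonparametric) policy representation.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def rollout_q(env, state, action, policy, gamma=0.99, horizon=50, n_rollouts=5):
        """Monte Carlo estimate of Q^pi(state, action) via truncated rollouts."""
        returns = []
        for _ in range(n_rollouts):
            s, a, total, discount = state, action, 0.0, 1.0
            for _ in range(horizon):
                s, r = env.sample(s, a)      # hypothetical generative-model call
                total += discount * r
                discount *= gamma
                a = policy(s)                # follow the current policy thereafter
            returns.append(total)
        return float(np.mean(returns))

    def classification_api(env, states, actions, n_iterations=10, gamma=0.99):
        """Each iteration: evaluate the current policy at sampled states, then
        perform policy improvement by fitting a classifier to the greedy actions."""
        policy = lambda s: actions[0]        # arbitrary initial policy
        for _ in range(n_iterations):
            X, y = [], []
            for s in states:
                q_values = [rollout_q(env, s, a, policy, gamma) for a in actions]
                X.append(s)
                y.append(int(np.argmax(q_values)))   # greedy action as class label
            clf = DecisionTreeClassifier().fit(np.array(X), np.array(y))
            policy = lambda s, clf=clf: actions[int(clf.predict(np.array([s]))[0])]
        return policy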