Classification-Based Approximate Policy Iteration

Authors: Farahmand, Amir-massoud; Precup, Doina; Barreto, Andre M. S.; Ghavamzadeh, Mohammad

Affiliations: McGill Univ, Sch Comp Sci, Montreal, PQ H3A 0E9, Canada; Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213, USA; Mitsubishi Elect Res Labs, Cambridge, MA 02139, USA; Natl Lab Sci Comp LNCC, BR-25651075 Petropolis, Brazil

Publication: IEEE TRANSACTIONS ON AUTOMATIC CONTROL (IEEE Trans Autom Control)

Year, volume, issue: 2015, Vol. 60, No. 11

Pages: 2989-2993


Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0811 [Engineering - Control Science and Engineering]

Funding: Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords: approximate dynamic programming; approximate policy iteration; classification; finite-sample analysis; reinforcement learning

Abstract: Tackling large approximate dynamic programming or reinforcement learning problems requires methods that can exploit regularities of the problem at hand. Most current methods are geared towards exploiting the regularities of either the value function or the policy. We introduce a general classification-based approximate policy iteration (CAPI) framework that can exploit regularities of both. We establish theoretical guarantees for the sample complexity of CAPI-style algorithms, which allow the policy evaluation step to be performed by a wide variety of algorithms and can handle nonparametric representations of policies. Our bounds on the estimation error of the performance loss are tighter than existing results.
