Author affiliations: Indian Institute of Technology Bombay, Department of Electrical Engineering, Bombay 400076, Maharashtra, India; Graviton Research Capital LLP, 14th Floor, Tower C, Building 8, Gurugram 122002, Haryana, India
Publication: SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES (Sadhana)
Year/Volume/Issue: 2018, Vol. 43, No. 8
Pages: 1-11
Subject classification: 12 [Management] 1201 [Management - Management Science and Engineering (degree in Management or Engineering)] 08 [Engineering]
Funding: J C Bose Fellowship, Department of Science and Technology, Government of India
Keywords: nonlinear filtering and smoothing; sequential Monte Carlo; reinforcement learning; importance sampling; EM algorithm
Abstract: Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a dynamic-programming-like backward recursion for the filter. This is combined with ideas from reinforcement learning and a conditional version of importance sampling to develop a stochastic-approximation-based scheme for estimating the desired conditional expectation. The scheme is then extended to a smoothing problem. Applying these ideas to the EM algorithm, a reinforcement learning scheme is developed to estimate the partially observed log-likelihood function, while a stochastic approximation scheme maximizes this function over the unknown parameter. The two procedures run on two different time scales, emulating the alternating expectation and maximization steps of the EM algorithm. We also extend the approach to a continuous state space. Numerical results are presented in support of our schemes.
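The two-timescale idea described in the abstract can be illustrated with a minimal sketch: a fast stochastic-approximation iterate averages a noisy surrogate for the gradient of the partially observed log-likelihood (the "E-step" role), while a slower iterate performs gradient ascent on the parameter (the "M-step" role). This is not the paper's algorithm; the scalar Gaussian observation model, the step-size exponents, and the averaged-score surrogate below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative model (an assumption, not the paper's HMM):
    # observations y_k ~ N(theta_true, 1), unknown parameter theta.
    theta_true = 2.0
    theta = 0.0      # slow-timescale iterate (parameter estimate)
    q_grad = 0.0     # fast-timescale iterate (averaged score estimate)

    for n in range(1, 20001):
        y = theta_true + rng.standard_normal()  # fresh noisy observation
        a = 1.0 / n ** 0.6                      # fast step size
        b = 1.0 / n                             # slow step size; b/a -> 0
        # Fast timescale: running average of the score (y - theta),
        # a noisy estimate of the gradient of the log-likelihood surrogate.
        q_grad += a * ((y - theta) - q_grad)
        # Slow timescale: gradient ascent on the averaged surrogate.
        theta += b * q_grad

    print(f"estimated theta = {theta:.3f} (true value {theta_true})")

Because the slow step size b is negligible relative to the fast step size a, the fast iterate effectively sees a frozen theta and tracks the expectation it is averaging, which is what lets the coupled updates emulate the alternating E and M operations of the EM algorithm.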