Reinforcement learning, Sequential Monte Carlo and the EM algorithm

Authors: Borkar, Vivek S.; Jain, Ankush V.

Author Affiliations: Indian Institute of Technology, Department of Electrical Engineering, Bombay 400076, Maharashtra, India; Graviton Research Capital LLP, 14th Floor, Tower C, Building 8, Gurugram 122002, Haryana, India

Published in: SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES (Sadhana)

Year/Volume/Issue: 2018, Vol. 43, Issue 8

Pages: 1-11

Subject Classification: 12 [Management]; 1201 [Management - Management Science and Engineering (degrees awardable in Management or Engineering)]; 08 [Engineering]

Funding: J C Bose Fellowship, Department of Science and Technology, Government of India

Keywords: Nonlinear filtering and smoothing; Sequential Monte Carlo; reinforcement learning; importance sampling; EM algorithm

Abstract: Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a dynamic-programming-like backward recursion for the filter. This is combined with ideas from reinforcement learning and a conditional version of importance sampling to develop a stochastic-approximation-based scheme for estimating the desired conditional expectation. This is then extended to a smoothing problem. Applying these ideas to the EM algorithm, a reinforcement learning scheme is developed for estimating the partially observed log-likelihood function, and a stochastic approximation scheme maximizes this function over the unknown parameter. The two procedures run on two different time scales, emulating the alternating expectation and maximization operations of the EM algorithm. We also extend this to a continuous state space problem. Numerical results are presented in support of our schemes.
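The two-timescale idea in the abstract, a fast iterate tracking a conditional expectation (the E-step analogue) while a slower stochastic approximation climbs the estimated log-likelihood (the M-step analogue), can be illustrated with a toy loop. The sketch below is a minimal illustration, assuming a scalar parameter, a concave surrogate objective, and simple polynomial step sizes; the samplers noisy_q_sample and noisy_grad_sample are hypothetical placeholders, not the authors' filter-based construction.

```python
import numpy as np

# Toy two-timescale stochastic approximation emulating EM.
# Fast timescale: track Q(theta), a stand-in for the partially
# observed log-likelihood estimate. Slow timescale: stochastic
# gradient ascent on theta. All model details are placeholders.

rng = np.random.default_rng(0)

def noisy_q_sample(theta):
    """One noisy evaluation of the surrogate objective Q(theta)."""
    return -(theta - 1.0) ** 2 + 0.1 * rng.standard_normal()

def noisy_grad_sample(theta):
    """One noisy gradient of the surrogate objective at theta."""
    return -2.0 * (theta - 1.0) + 0.1 * rng.standard_normal()

theta = 0.0   # unknown parameter (slow iterate, M-step analogue)
q_est = 0.0   # running objective estimate (fast iterate, E-step analogue)

for n in range(1, 20001):
    a_n = n ** -0.6   # fast step size
    b_n = 1.0 / n     # slow step size; b_n / a_n -> 0 separates the scales
    # E-step analogue: track the conditional expectation at the current theta.
    q_est += a_n * (noisy_q_sample(theta) - q_est)
    # M-step analogue: ascend the (estimated) objective in theta.
    theta += b_n * noisy_grad_sample(theta)

print(f"theta estimate: {theta:.3f} (toy optimum at 1.0), Q estimate: {q_est:.3f}")
```

Because b_n / a_n tends to zero, the fast iterate sees theta as quasi-static and equilibrates first, which is what lets the coupled loop mimic the alternating E and M steps without running either to completion separately.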
