Author affiliations: Indian Institute of Technology Bombay, Department of Electrical Engineering, Bombay 400076, Maharashtra, India; Graviton Research Capital LLP, 14th Floor, Tower C, Building 8, Gurugram 122002, Haryana, India
Publication: SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES (Sadhana)
Year/Volume/Issue: 2018, Vol. 43, No. 8
Pages: 1-11
Subject classification: 12 [Management] 1201 [Management - Management Science and Engineering (degree in Management or Engineering)] 08 [Engineering]
Funding: J C Bose Fellowship, Department of Science and Technology, Government of India
Keywords: nonlinear filtering and smoothing; sequential Monte Carlo; reinforcement learning; importance sampling; EM algorithm
Abstract: Using the expression for the unnormalized nonlinear filter for a hidden Markov model, we develop a dynamic-programming-like backward recursion for the filter. This is combined with ideas from reinforcement learning and a conditional version of importance sampling to develop a stochastic-approximation-based scheme for estimating the desired conditional expectation. The scheme is then extended to a smoothing problem. Applying these ideas to the EM algorithm, a reinforcement learning scheme is developed to estimate the partially observed log-likelihood function, while a stochastic approximation scheme maximizes this function over the unknown parameter. The two procedures run on two different time scales, emulating the alternating expectation and maximization steps of the EM algorithm. We also extend the approach to a continuous state space. Numerical results are presented in support of our schemes.
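The two-timescale idea described in the abstract can be illustrated with a minimal sketch: a fast stochastic-approximation iterate averages a noisy surrogate for the gradient of the partially observed log-likelihood (the "E-step" role), while a slower iterate performs gradient ascent on the parameter (the "M-step" role). This is not the paper's algorithm; the scalar Gaussian observation model, the step-size exponents, and the averaged-score surrogate below are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative model (an assumption, not the paper's HMM):
    # observations y_k ~ N(theta_true, 1), unknown parameter theta.
    theta_true = 2.0
    theta = 0.0      # slow-timescale iterate (parameter estimate)
    q_grad = 0.0     # fast-timescale iterate (averaged score estimate)

    for n in range(1, 20001):
        y = theta_true + rng.standard_normal()  # fresh noisy observation
        a = 1.0 / n ** 0.6                      # fast step size
        b = 1.0 / n                             # slow step size; b/a -> 0
        # Fast timescale: running average of the score (y - theta),
        # a noisy estimate of the gradient of the log-likelihood surrogate.
        q_grad += a * ((y - theta) - q_grad)
        # Slow timescale: gradient ascent on the averaged surrogate.
        theta += b * q_grad

    print(f"estimated theta = {theta:.3f} (true value {theta_true})")

Because the slow step size b is negligible relative to the fast step size a, the fast iterate effectively sees a frozen theta and tracks the expectation it is averaging, which is what lets the coupled updates emulate the alternating E and M operations of the EM algorithm.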