The EM algorithm is a popular method for parameter estimation in situations where the data can be viewed as being incomplete. As each E-step visits each data point on a given iteration, the EM algorithm requires consi...
详细信息
The EM algorithm is a popular method for parameter estimation in situations where the data can be viewed as being incomplete. As each E-step visits each data point on a given iteration, the EM algorithm requires considerable computation time in its application to large data sets. Two versions, the incremental EM (iem) algorithm and a sparse version of the EM algorithm, were proposed recently by Neal R.M. and Hinton G.E. in Jordan M.I. (Ed.), Learning in Graphical Models, Kluwer, Dordrecht, 1998, pp. 355- 368 to reduce the computational cost of applying the EM algorithm. With the iemalgorithm, the available n observations are divided into B (B less than or equal to n) blocks and the E-step is implemented for only a block of observations at a time before the next M-step is performed. With the sparse version of the EM algorithm for the fitting of mixture models, only those posterior probabilities of component membership of the mixture that are above a specified threshold are updated;the remaining component-posterior probabilities are held fixed. In this paper, simulations are performed to assess the relative performances of the iemalgorithm with various number of blocks and the standard EM algorithm. In particular, we propose a simple rule for choosing the number of blocks with the iemalgorithm. For the iemalgorithm in the extreme case of one observation per block, we provide efficient updating formulas, which avoid the direct calculation of the inverses and determinants of the component-covariance matrices. Moreover, a sparse version of the iemalgorithm (SPiem) is formulated by combining the sparse E-step of the EM algorithm and the partial E-step of the iemalgorithm. This SPiemalgorithm can further reduce the computation time of the iemalgorithm.
暂无评论