Author affiliations: INRA, UMR MIA 518, F-75231 Paris, France; AgroParisTech, UMR MIA, F-75231 Paris, France; INRA, URGV UMR1165, F-91057 Evry, France; UEVE, UMR URGV, F-91057 Evry, France; CNRS, UMR URGV ERL8196, F-91057 Evry, France
Publication: Statistics and Computing
Year, volume, issue: 2014, Vol. 24, No. 4
Pages: 493-504
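Subject classification: 0202 [Economics - Applied Economics]; 02 [Economics]; 020208 [Economics - Statistics]; 07 [Science]; 0714 [Science - Statistics (degrees may be conferred in Science or Economics)]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]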
Keywords: Hidden Markov models; Model-based clustering; Mixture model; Hierarchical algorithm
Abstract: In unsupervised classification, Hidden Markov Models (HMMs) are used to account for a neighborhood structure between observations. The emission distributions are often assumed to belong to some parametric family. In this paper, a semiparametric model in which each emission distribution is a mixture of parametric distributions is proposed to gain flexibility. We show that the standard EM algorithm can be adapted to infer the model parameters. For the initialization step, a hierarchical method is proposed that starts from a large number of components and combines them into the hidden states. Three likelihood-based criteria for selecting the components to be combined are discussed. To estimate the number of hidden states, BIC-like criteria are derived. A simulation study is carried out both to determine the best pairing of combining criterion and model selection criterion and to evaluate the accuracy of classification. The proposed method is also illustrated using a biological dataset from the model plant Arabidopsis thaliana. An R package, HMMmix, is freely available on CRAN.
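Illustrative sketch (not taken from the paper or from the HMMmix package): the abstract describes an HMM in which each hidden state emits from a mixture of parametric distributions. The Python below shows one way such a model's log-likelihood could be evaluated with a scaled forward recursion, assuming univariate Gaussian components. All function names and the parameter layout are hypothetical, chosen only for illustration; the EM updates and the hierarchical combining step are not shown.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate Gaussian (assumed component family)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_emission(x, weights, means, sds):
    """Emission density of one hidden state: a weighted sum of components."""
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))


def forward_loglik(obs, init, trans, states):
    """Log-likelihood of obs under an HMM with mixture emissions.

    init:   initial state probabilities, length K
    trans:  K x K transition matrix, trans[j][k] = P(next = k | current = j)
    states: list of K tuples (weights, means, sds), one mixture per state
    """
    n_states = len(init)
    # Initial step: prior times emission, then rescale to avoid underflow.
    alpha = [init[k] * mixture_emission(obs[0], *states[k]) for k in range(n_states)]
    scale = sum(alpha)
    loglik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for x in obs[1:]:
        # Propagate through the transition matrix, then weight by the emission.
        alpha = [
            sum(alpha[j] * trans[j][k] for j in range(n_states))
            * mixture_emission(x, *states[k])
            for k in range(n_states)
        ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Hypothetical two-state example: each state is a two-component mixture.
example_states = [
    ([0.5, 0.5], [-1.0, 0.0], [1.0, 0.5]),
    ([0.3, 0.7], [3.0, 4.0], [1.0, 1.0]),
]
example_loglik = forward_loglik(
    [0.2, -0.4, 3.1, 3.8],
    [0.5, 0.5],
    [[0.9, 0.1], [0.2, 0.8]],
    example_states,
)
```

With a single component per state, this reduces to the usual parametric HMM likelihood, which is the special case the abstract contrasts with.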