The EM algorithm is a widely used tool for maximum-likelihood estimation in incomplete-data problems. Existing theoretical work has focused on conditions under which the iterates or likelihood values converge, and the associated rates of convergence. Such guarantees do not distinguish whether the ultimate fixed point is a near-global optimum or a bad local optimum of the sample likelihood, nor do they relate the obtained fixed point to the global optima of the idealized population likelihood (obtained in the limit of infinite data). This paper develops a theoretical framework for quantifying when and how quickly EM-type iterates converge to a small neighborhood of a given global optimum of the population likelihood. For correctly specified models, such a characterization yields rigorous guarantees on the performance of certain two-stage estimators in which a suitable initial pilot estimator is refined with iterations of the EM algorithm. Our analysis is divided into two parts: a treatment of the EM and first-order EM algorithms at the population level, followed by results that apply to these algorithms on a finite set of samples. Our conditions allow for a characterization of the region of convergence of EM-type iterates to a given population fixed point, that is, the region of the parameter space over which convergence to a point within a small neighborhood of the specified population fixed point is guaranteed. We verify our conditions and give tight characterizations of the region of convergence for three canonical problems of interest: symmetric mixture of two Gaussians, symmetric mixture of two regressions, and linear regression with covariates missing completely at random.
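For the first canonical problem, the symmetric mixture 0.5·N(θ, 1) + 0.5·N(−θ, 1), the EM update has a simple closed form. The sketch below illustrates the two-stage idea (a crude pilot value refined by EM); the toy data, sample size, and initialization are our own, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true, n = 2.0, 5000
# sample from the symmetric mixture 0.5*N(theta, 1) + 0.5*N(-theta, 1)
y = theta_true * (2 * rng.integers(0, 2, n) - 1) + rng.standard_normal(n)

theta = 0.5                     # crude pilot estimate inside the basin of attraction
for _ in range(200):
    # E-step: posterior probability that y_i was drawn from the N(theta, 1) component
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * y))
    # M-step: closed-form update theta <- mean((2w - 1) * y)
    theta_new = np.mean((2.0 * w - 1.0) * y)
    if abs(theta_new - theta) < 1e-12:
        break
    theta = theta_new
```

The iterates stay within the basin of the positive population fixed point because the pilot value has the correct sign; starting near zero instead would illustrate the stagnation region the paper characterizes.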
We address the problem of Bayesian variable selection for high-dimensional linear regression. We consider a generative model that uses a spike-and-slab-like prior distribution obtained by multiplying a deterministic binary vector, which encodes the sparsity pattern of the problem, with a random Gaussian parameter vector. The originality of the work is to perform inference by relaxing the model and maximizing a type-II log-likelihood with an EM algorithm. Model selection is performed afterwards, relying on Occam's razor and on a path of models found by the EM algorithm. Numerical comparisons between our method, called spinyReg, and state-of-the-art high-dimensional variable selection algorithms (such as the lasso, adaptive lasso, stability selection, and spike-and-slab procedures) are reported. Competitive variable selection results and predictive performance are achieved on both simulated and real benchmark data sets. An original regression data set involving the prediction of the number of visitors to the Orsay museum in Paris using bike-sharing system data is also introduced, illustrating the efficiency of the proposed approach. The R package spinyReg implementing the method proposed in this paper is available on CRAN.
In linear mixed models, the assumption of normally distributed random effects is often inappropriate and unnecessarily restrictive. The proposed approximate Dirichlet process mixture assumes a hierarchical Gaussian mixture based on a truncated version of the stick-breaking representation of the Dirichlet process. In addition to weakening distributional assumptions, the specification makes it possible to identify clusters of observations with a similar random-effects structure. An Expectation-Maximization (EM) algorithm is given that solves the estimation problem and that, in certain respects, may exhibit advantages over Markov chain Monte Carlo approaches when modelling with Dirichlet processes. The method is evaluated in a simulation study and applied to the dynamics of unemployment in Germany as well as to lung function growth data.
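The truncated stick-breaking construction underlying such an approximate Dirichlet process can be sketched in a few lines; the concentration parameter and truncation level below are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, K = 1.0, 20                 # concentration parameter, truncation level
v = rng.beta(1.0, alpha, size=K)   # stick-breaking proportions v_k ~ Beta(1, alpha)
v[-1] = 1.0                        # truncation: the last stick takes the remainder
# mixture weights w_k = v_k * prod_{j<k} (1 - v_j); they sum to one exactly
w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
```

Setting the last proportion to one is what makes the mixture finite, so the hierarchical Gaussian mixture has exactly K components and a standard EM treatment becomes available.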
A new acceleration scheme for optimization procedures is defined through geometric considerations and applied to the EM algorithm. In many cases it is able to circumvent the problem of stagnation. No modification of the original algorithm is required; it is simply used as a software component. The new scheme can therefore be easily implemented to accelerate any fixed-point algorithm maximizing an objective function. Practical examples and simulations demonstrate its ability to accelerate slowly converging EM-type algorithms.
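As an illustration of wrapping an unmodified fixed-point iteration in an acceleration layer, here is a squared-extrapolation (SQUAREM-style) scheme. This is not the paper's geometric scheme, only a well-known accelerator in the same software-component spirit:

```python
import numpy as np

def accelerate(F, x0, tol=1e-10, max_iter=500):
    """Squared-extrapolation (SQUAREM-style) wrapper around a fixed-point map F.

    F itself is used unchanged, purely as a software component: each outer
    step calls it three times and extrapolates geometrically.
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    for _ in range(max_iter):
        x1 = F(x)
        x2 = F(x1)
        r = x1 - x                       # first difference
        v = (x2 - x1) - r                # second difference (curvature)
        alpha = -np.linalg.norm(r) / (np.linalg.norm(v) + 1e-300)
        x_acc = x - 2.0 * alpha * r + alpha ** 2 * v   # extrapolated point
        x_new = F(x_acc)                 # stabilizing map evaluation
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# example: accelerate the slowly converging iteration x <- cos(x)
root = accelerate(np.cos, 0.5)
```

The final application of F after extrapolating keeps the accelerated iterate inside the map's domain, which is what lets the wrapper accelerate EM-type algorithms without touching their internals.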
This paper deals with an empirical Bayes approach for spatial prediction of a Gaussian random field, in which the hyperparameters of the prior distribution are estimated by maximum likelihood. To maximize the marginal distribution of the data, the EM algorithm is used. Since this algorithm requires the evaluation of analytically intractable, high-dimensional integrals, a Monte Carlo method based on discretizing the parameter space is proposed to estimate the relevant integrals. The approach is illustrated by its application to a spatial data set. Finally, we compare the predictive performance of this approach with that of the reference prior method.
Let Y = (Y_t), t ≥ 0, be an unobserved random process which influences the distribution of a random variable T that can be interpreted as the time to failure. When the conditional hazard rate corresponding to T is a quadratic function of the covariates Y, the marginal survival function may be represented by the first two moments of the conditional distribution of Y among survivors. Such a representation may not have an explicit parametric form, which makes it difficult to use standard maximum likelihood procedures to estimate parameters, especially for censored survival data. In this paper a generalization of the EM algorithm for survival problems with unobserved, stochastically changing covariates is suggested. It is shown that, for a general stochastic failure model, the smoothing estimates of the first two moments of Y take a specific form that facilitates the EM-type calculations. Properties of the algorithm are discussed.
The maximum likelihood estimation of the parameters of the Poisson binomial distribution, based on a sample with both exact and grouped observations, is considered by applying the EM algorithm (Dempster et al., 1977). The results of Louis (1982) are used to obtain the observed information matrix and to accelerate the convergence of the EM algorithm substantially. Maximum likelihood estimation from samples consisting entirely of complete (Sprott, 1958) or grouped observations is treated as a special case of the estimation problem above. A brief account is given of the implementation of the EM algorithm when the sampling distribution is the Neyman Type A, since the latter is a limiting form of the Poisson binomial. Numerical examples based on real data are included.
A frequently occurring problem is to find the maximum likelihood estimate (MLE) of p subject to p ∈ C, where C is a subset of P, the set of probability vectors in R^k. The problem has been discussed by many authors, who mainly focused on the case where p is restricted by linear or log-linear constraints. In this paper, we establish the relationship between the maximum likelihood estimation of p restricted by p ∈ C and the EM algorithm, and demonstrate that the maximum likelihood estimator can be computed through the EM algorithm (Dempster et al. in J R Stat Soc Ser B 39:1-38, 1977). Several examples are analyzed by the proposed method.
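A classical instance of computing a constrained multinomial MLE through EM is the genetic-linkage example of Dempster et al. (1977), where the admissible p form the curve ((2+θ)/4, (1−θ)/4, (1−θ)/4, θ/4) inside the probability simplex. A minimal sketch with the well-known counts:

```python
# Observed multinomial counts from the classical genetic-linkage example,
# with cell probabilities ((2+theta)/4, (1-theta)/4, (1-theta)/4, theta/4).
y = [125, 18, 20, 34]

theta = 0.5                               # any interior starting value works
for _ in range(200):
    # E-step: expected count of the latent split of the first cell
    x = y[0] * theta / (2.0 + theta)
    # M-step: closed-form MLE given the completed counts
    theta = (x + y[3]) / (x + y[1] + y[2] + y[3])
```

The iteration converges to the constrained MLE θ ≈ 0.627, and the corresponding p is recovered by plugging θ back into the cell probabilities.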
To deal with very large datasets, a mini-batch version of the Markov chain Monte Carlo Stochastic Approximation Expectation-Maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential-family models, the algorithm is shown to be convergent under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling results in a substantial speed-up of the convergence of the sequence of estimators generated by the algorithm. Moreover, insights on the effect of the mini-batch size on the limit distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when a constraint on the computing time is given.
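The mechanics of mini-batch stochastic-approximation EM can be shown on a toy two-component mixture. This is a simplification, not the paper's MCMC-SAEM: the E-step here is exact on the batch, and the step-size schedule and batch size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n, batch = 100_000, 256
# toy data: symmetric two-component Gaussian mixture with theta_true = 1.5
y = 1.5 * (2 * rng.integers(0, 2, n) - 1) + rng.standard_normal(n)

theta = 0.3                                       # rough starting value
for k in range(1, 2001):
    yb = y[rng.integers(0, n, batch)]             # draw a mini-batch
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * yb))   # E-step on the batch only
    gamma = k ** -0.6                             # Robbins-Monro step size
    # stochastic-approximation smoothing of the batch M-step
    theta = (1 - gamma) * theta + gamma * np.mean((2 * w - 1) * yb)
```

Each update touches only 256 of the 100,000 points, which is the source of the speed-up; the decreasing step size averages out the mini-batch noise so the iterates settle near the full-data estimate.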
Interval-censored data arise in a wide variety of application and research areas, such as AIDS studies (Kim et al., 1993) and cancer research (Finkelstein, 1986; Becker & Melbye, 1991). Peto (1973) proposed a Newton-Raphson algorithm for obtaining a generalized maximum likelihood estimate (GMLE) of the survival function with interval-censored observations. Turnbull (1976) proposed a self-consistent algorithm for interval-censored data and obtained the same GMLE. Groeneboom & Wellner (1992) used the convex minorant algorithm to construct an estimator of the survival function with "case 2" interval-censored data. However, as is known, the GMLE is not uniquely defined on the interval [0, ∞). In addition, Turnbull's algorithm leads to a self-consistent equation which is not in the form of an integral equation. Large sample properties of the GMLE have not been examined previously, owing, we believe, among other things, to the lack of such an integral equation. In this paper, we present an EM algorithm for constructing a GMLE on [0, ∞). The GMLE is expressed as the solution of an integral equation. More recently, with the help of this integral equation, Yu et al. (1997a, b) have shown that the GMLE is consistent and asymptotically normally distributed. An application of the proposed GMLE is presented.
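The self-consistency (EM) iteration for the nonparametric MLE with interval-censored data is compact. The sketch below places candidate mass on the raw interval endpoints rather than on Turnbull's maximal intersection intervals, a simplification; the data are illustrative:

```python
import numpy as np

# interval-censored observations: the event time lies in (L_i, R_i]
intervals = [(0, 2), (1, 3), (2, 4), (0, 1), (3, 5), (1, 4)]
support = sorted({t for pair in intervals for t in pair})   # candidate mass points
# alpha_ij = 1 if support point s_j is contained in interval i
A = np.array([[1.0 if (l < s <= r) else 0.0 for s in support]
              for (l, r) in intervals])

m, n = len(support), len(intervals)
p = np.full(m, 1.0 / m)                 # start from the uniform distribution
for _ in range(1000):
    denom = A @ p                       # P(T in (L_i, R_i]) under current p
    # self-consistency: p_j <- (1/n) * sum_i alpha_ij * p_j / denom_i
    p_new = (A * p).T @ (1.0 / denom) / n
    if np.max(np.abs(p_new - p)) < 1e-12:
        break
    p = p_new
```

Each pass redistributes every observation's unit of mass over the support points its interval covers, so the iterates remain probability vectors throughout; the estimated survival function is the complement of the cumulative sums of p.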