Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
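As a concrete illustration of the kind of algorithm this abstract describes, the following is a minimal sketch of EM for a univariate two-component t mixture with the degrees of freedom held fixed. It is not the paper's ECM procedure (which also updates the degrees of freedom and handles the multivariate case); the latent scale weights u in the E-step are what gives the fit its robustness to atypical observations.

```python
import numpy as np
from scipy.stats import t as t_dist

def fit_t_mixture(x, K=2, nu=4.0, n_iter=200):
    """EM for a univariate mixture of K t components with fixed df nu.
    Sketch only: the article's ECM algorithm also estimates nu and
    covers multivariate data."""
    n = len(x)
    pi = np.full(K, 1.0 / K)
    mu = np.quantile(x, (np.arange(K) + 0.5) / K)  # spread-out starting means
    sig = np.full(K, x.std())
    for _ in range(n_iter):
        # E-step: responsibilities tau and latent precision weights u;
        # u downweights points far from a component's centre.
        dens = np.stack([pi[k] * t_dist.pdf(x, nu, loc=mu[k], scale=sig[k])
                         for k in range(K)])
        tau = dens / dens.sum(axis=0)
        delta = np.stack([((x - mu[k]) / sig[k]) ** 2 for k in range(K)])
        u = (nu + 1.0) / (nu + delta)
        # M-step: weighted updates of mixing proportions, means, scales.
        pi = tau.mean(axis=1)
        mu = (tau * u * x).sum(axis=1) / (tau * u).sum(axis=1)
        sig = np.sqrt((tau * u * (x - mu[:, None]) ** 2).sum(axis=1)
                      / tau.sum(axis=1))
    return pi, mu, sig
```

With well-separated components the recovered means converge to the component locations even though the t(4) data throw occasional extreme observations.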
The well-known EM algorithm is an optimization transfer algorithm that depends on the notion of incomplete or missing data. By invoking convexity arguments, one can construct a variety of other optimization transfer algorithms that do not involve missing data. These algorithms all rely on a majorizing or minorizing function that serves as a surrogate for the objective function. Optimizing the surrogate function drives the objective function in the correct direction. This article illustrates this general principle by a number of specific examples drawn from the statistical literature. Because optimization transfer algorithms often exhibit the slow convergence of EM algorithms, two methods of accelerating optimization transfer are discussed and evaluated in the context of specific problems.
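A textbook instance of the optimization transfer principle (not one of the article's own examples) is computing a median by majorizing each |x_i - theta| with a quadratic at the current iterate; minimizing the surrogate is then a weighted least-squares step, and each step drives the L1 objective downhill:

```python
import numpy as np

def mm_median(x, n_iter=100, eps=1e-8):
    """Optimization transfer (MM) for the median: at the current iterate,
    majorize |x_i - theta| by (x_i - theta)^2 / (2|x_i - theta_m|) + const,
    then minimize the quadratic surrogate in closed form.
    eps guards the weights against division by zero."""
    theta = x.mean()                          # any starting value
    for _ in range(n_iter):
        w = 1.0 / (np.abs(x - theta) + eps)   # surrogate curvature weights
        theta = np.sum(w * x) / np.sum(w)     # exact minimizer of surrogate
    return theta
```

Each iteration is a weighted mean, so the surrogate is trivial to optimize even though the original objective is nonsmooth; this is exactly the trade the article describes, and also why such schemes can inherit EM's slow convergence near the solution.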
In the context of structural equation modeling, a general interaction model with multiple latent interaction effects is introduced. A stochastic analysis represents the nonnormal distribution of the joint indicator vector as a finite mixture of normal distributions. The Latent Moderated Structural Equations (LMS) approach is a new method developed for the analysis of the general interaction model that utilizes the mixture distribution and provides ML estimation of model parameters by adapting the EM algorithm. The finite sample properties and the robustness of LMS are discussed. Finally, the applicability of the new method is illustrated by an empirical example.
In this paper we discuss a general model framework within which manifest variables with different distributions in the exponential family can be analyzed with a latent trait model. A unified maximum likelihood method for estimating the parameters of the generalized latent trait model will be presented. We discuss in addition the scoring of individuals on the latent dimensions. The general framework presented allows not only the analysis of manifest variables all of one type, but also the simultaneous analysis of a collection of variables with different distributions. The approach used analyzes the data as they are by making assumptions about the distribution of the manifest variables directly.
Author: Dunson, DB (NIEHS, Biostat Branch, Res Triangle Pk, NC 27709, USA)
In cancer studies that use transgenic or knockout mice, skin tumour counts are recorded over time to measure tumorigenicity. In these studies cancer biologists are interested in the effect of endogenous and/or exogenous factors on papilloma onset, multiplicity and regression. In this paper an analysis of data from a study conducted by the National Institute of Environmental Health Sciences on the effect of genetic factors on skin tumorigenesis is presented. Papilloma multiplicity and regression are modelled by using Bernoulli, Poisson and binomial latent variables, each of which can depend on covariates and previous outcomes. An EM algorithm is proposed for parameter estimation, and generalized estimating equations adjust for extra dependence between outcomes within individual animals. A Cox proportional hazards model is used to describe covariate effects on the onset of tumours.
Suppose that when a unit operates in a certain environment, its lifetime has distribution G, and when the unit operates in another environment, its lifetime has a different distribution, say F. Moreover, suppose the unit is operated for a certain period of time in the first environment and is then transferred to the second environment. Thus we observe a censored lifetime in the first environment and a failure time of a "used" unit in the second environment. We propose an EM algorithm approach for obtaining a self-consistent estimator of F using observations from both environments. The case where failure times are subject to right censoring is considered as well. We also establish the maximum likelihood estimator of F when the unit is repairable. Application and simulation studies are presented to illustrate the methods derived.
Selective genotyping is a cost-saving strategy in mapping quantitative trait loci (QTLs). When the proportion of individuals selected for genotyping is low, the majority of the individuals are not genotyped, but their phenotypic values, if available, are still included in the data analysis to correct the bias in parameter estimation. These ungenotyped individuals do not contribute much information about linkage analysis and their inclusion can substantially increase the computational burden. For multiple trait analysis, ungenotyped individuals may not have a full array of phenotypic measurements. In this case, unbiased estimation of QTL effects using current methods seems to be impossible. In this study, we develop a maximum likelihood method of QTL mapping under selective genotyping using only the phenotypic values of genotyped individuals. Compared with the full data analysis (using all phenotypic values), the proposed method performs well. We derive an expectation-maximization (EM) algorithm that appears to be a simple modification of the existing EM algorithm for standard interval mapping. The new method can be readily incorporated into standard QTL mapping software, e.g. MAPMAKER. A general recommendation is that whenever full data analysis is possible, the full maximum likelihood analysis should be performed. If it is impossible to analyse the full data, e.g. sample sizes are too large, phenotypic values of ungenotyped individuals are missing or composite interval mapping is to be performed, the proposed method can be applied.
A problem arising from the study of the spread of a viral infection among potato plants by aphids appears to involve a mixture of two linear regressions on a single predictor variable. The plant scientists studying the problem were particularly interested in obtaining a 95% confidence upper bound for the infection rate. We discuss briefly the procedure for fitting mixtures of regression models by means of maximum likelihood, effected via the EM algorithm. We give general expressions for the implementation of the M-step and then address the issue of conducting statistical inference in this context. A technique due to T. A. Louis may be used to estimate the covariance matrix of the parameter estimates by calculating the observed Fisher information matrix. We develop general expressions for the entries of this information matrix. Having the complete covariance matrix permits the calculation of confidence and prediction bands for the fitted model. We also investigate the testing of hypotheses concerning the number of components in the mixture via parametric and 'semiparametric' bootstrapping. Finally, we develop a method of producing diagnostic plots of the residuals from a mixture of linear regressions.
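The fitting procedure this abstract summarizes can be sketched for the two-component case as follows. This is an illustrative implementation, not the authors' code: the E-step computes posterior component probabilities and the M-step is a pair of weighted least-squares fits, with the starting split taken from the sign of the residuals of an overall fit (an assumed, simple initialization).

```python
import numpy as np

def em_mix_reg(x, y, n_iter=200):
    """EM for a two-component mixture of simple linear regressions.
    Sketch under assumed initialization: split by residual sign of a
    single overall least-squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    z = resid > 0                              # crude initial split
    beta = np.stack([np.linalg.lstsq(X[m], y[m], rcond=None)[0]
                     for m in (z, ~z)])
    sig2 = np.array([1.0, 1.0])
    lam = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability tau that each point belongs
        # to each regression line.
        r = y[:, None] - X @ beta.T
        dens = lam * np.exp(-0.5 * r**2 / sig2) / np.sqrt(2 * np.pi * sig2)
        tau = dens / dens.sum(axis=1, keepdims=True)
        # M-step: mixing weights, then weighted least squares per component.
        lam = tau.mean(axis=0)
        for k in range(2):
            w = np.sqrt(tau[:, k])
            beta[k] = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]
            sig2[k] = np.sum(tau[:, k] * (y - X @ beta[k])**2) / tau[:, k].sum()
    return lam, beta, sig2
```

The per-component weighted least-squares step is the closed-form M-step whose general expressions the article derives; the responsibilities tau are also the quantities entering Louis's method for the observed information matrix.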
Many papers (including most of the papers in this issue of Computational Statistics) deal with Markov Chain Monte Carlo (MCMC) methods. This paper will give an introduction to the augmented Gibbs sampler (a special case of MCMC), illustrated using the random intercept model. A 'nonstandard' application of the augmented Gibbs sampler will be discussed to give an illustration of the power of MCMC methods. Furthermore, it will be illustrated that the posterior sample resulting from an application of MCMC can be used for more than determination of convergence and the computation of simple estimators like the a posteriori expectation and standard deviation. Posterior samples give access to many other inferential possibilities. Using a simulation study, the frequency properties of some of these possibilities will be evaluated.
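For the random intercept model used as the running illustration, an augmented Gibbs sampler can be sketched as below. The prior choices (flat prior on the grand mean, weak inverse-gamma priors on both variances) are assumptions for this sketch, not taken from the paper; the key point is that the random intercepts b_i are sampled as augmented data alongside the parameters, and the retained draws form the posterior sample whose further uses the paper discusses.

```python
import numpy as np

def gibbs_random_intercept(y, n_iter=2000, burn=500, seed=0):
    """Augmented Gibbs sampler for y[i, j] = mu + b_i + e_ij with
    b_i ~ N(0, tau2) and e_ij ~ N(0, sig2), for a balanced (m x n)
    data array. Flat prior on mu; InvGamma(0.01, 0.01) on sig2, tau2.
    Illustrative sketch, not the paper's implementation."""
    rng = np.random.default_rng(seed)
    m, n = y.shape
    mu, sig2, tau2 = y.mean(), y.var(), 1.0
    b = np.zeros(m)
    a0 = b0 = 0.01
    draws = []
    for it in range(n_iter):
        # b_i | rest: normal, precision n/sig2 + 1/tau2 (data augmentation)
        prec = n / sig2 + 1.0 / tau2
        mean = (y - mu).mean(axis=1) * (n / sig2) / prec
        b = rng.normal(mean, np.sqrt(1.0 / prec))
        # mu | rest: normal around the grand mean of y - b
        mu = rng.normal((y - b[:, None]).mean(), np.sqrt(sig2 / (m * n)))
        # sig2, tau2 | rest: inverse-gamma full conditionals
        resid = y - mu - b[:, None]
        sig2 = 1.0 / rng.gamma(a0 + m * n / 2,
                               1.0 / (b0 + 0.5 * (resid**2).sum()))
        tau2 = 1.0 / rng.gamma(a0 + m / 2,
                               1.0 / (b0 + 0.5 * (b**2).sum()))
        if it >= burn:
            draws.append((mu, sig2, tau2))
    return np.array(draws)          # columns: mu, sig2, tau2
```

Because the whole posterior sample is returned, one can go beyond posterior means and standard deviations, e.g. compute intervals for arbitrary functions of (mu, sig2, tau2), which is the broader inferential use the paper emphasizes.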
ISBN (print): 0819437646
Recently the authors introduced a general Bayesian statistical method for modeling and analysis in linear inverse problems involving certain types of count data. Emission-based tomography in medical imaging is a particularly important and common example of this type of problem. In this paper we provide an overview of the methodology and illustrate its application to problems in emission tomography through a series of simulated and real-data examples. The framework rests on the special manner in which a multiscale representation of recursive dyadic partitions (essentially an unnormalized Haar analysis) interacts with the statistical likelihood of data with Poisson noise characteristics. In particular, the likelihood function permits a factorization, with respect to location-scale indexing, analogous to the manner in which, say, an arbitrary signal allows a wavelet transform. Recovery of an object from tomographic data is then posed as a problem involving the statistical estimation of a multiscale parameter vector. A type of statistical shrinkage estimation is used, induced by careful choice of a Bayesian prior probability structure for the parameters. Finally, the ill-posedness of the tomographic imaging problem is accounted for by embedding the above-described framework within a larger, but simpler statistical estimation problem, via the so-called Expectation-Maximization (EM) approach. The resulting image reconstruction algorithm is iterative in nature, entailing the calculation of two closed-form algebraic expressions at each iteration. Convergence of the algorithm to a unique solution, under appropriate choice of Bayesian prior, can be assured.
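The EM framework this abstract embeds its multiscale estimator in reduces, without the Bayesian prior, to the classical ML-EM (Shepp-Vardi / Richardson-Lucy) iteration for Poisson linear inverse problems. The sketch below shows only that unpenalized baseline, to make the "two closed-form expressions per iteration" concrete; it is not the paper's multiscale algorithm.

```python
import numpy as np

def mlem(A, y, n_iter=100):
    """Classical ML-EM iteration for y ~ Poisson(A @ lam), lam >= 0:
        lam <- lam * (A.T @ (y / (A @ lam))) / A.sum(axis=0)
    Each iteration is two closed-form algebraic expressions: a forward
    projection ratio and a multiplicative back-projected update."""
    lam = np.full(A.shape[1], y.sum() / A.shape[1])  # flat positive start
    s = A.sum(axis=0)                                # per-pixel sensitivity
    for _ in range(n_iter):
        ratio = y / np.maximum(A @ lam, 1e-12)       # guard zero projections
        lam = lam * (A.T @ ratio) / s
    return lam
```

The multiplicative form keeps the estimate nonnegative and monotonically increases the Poisson likelihood; the paper's contribution is the multiscale Bayesian shrinkage layered onto this EM structure.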