Background and Objectives: Missing outcome data are common in most clinical research trials. 'Complete case analysis' is a widely adopted method for dealing with missing observations. However, it reduces the sample size of the study and thus affects statistical power. Hence every effort should be made to reduce the amount of missing data. The objective of this work is to illustrate the application of different analytical tools for handling missing data through imputation techniques. Methods: We used imputation techniques such as the EM algorithm, MCMC, regression, and predictive mean matching, and compared the results on hepatitis C virus-induced hepatocellular carcinoma (HCV-HCC) data. Statistical models based on Generalized Estimating Equations, Time-dependent Cox Regression, and Joint Modeling were applied to obtain statistical inference on the imputed data. The missing data handling technique compatible with Principal Component Analysis (PCA) was found suitable for high-dimensional data. Results: Joint Modeling provides a slightly lower standard error than the other analytical methods for each imputation. According to our methodology, Joint Modeling analysis with the EM algorithm imputation method appears to be the most appropriate method for the HCV-HCC data. However, the Generalized Estimating Equations and Time-dependent Cox Regression methods were relatively easy to run. Conclusion: Multiple imputation methods are efficient for providing inference with missing data. They are technically more robust than any ad hoc approach to working with missing data. (c) 2021 Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).
The expectation-maximization (EM) algorithm has been used to maximize the likelihood function or posterior when the model contains unobserved latent variables. One important application of the EM algorithm is finding the maximum likelihood estimator for mixture models. In this article, we propose an EM-type algorithm to maximize a class of mixture-type objective functions. In addition, we prove the monotone ascent property of the proposed algorithm and discuss some of its applications. (c) 2012 Elsevier B.V. All rights reserved.
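For readers unfamiliar with the mixture-model use case mentioned in this abstract, the following is a minimal sketch of the standard EM algorithm for a two-component one-dimensional Gaussian mixture. It is not the EM-type algorithm proposed in the article. The function name `em_gaussian_mixture`, the initialisation scheme and the synthetic data are all assumptions made for illustration. The sketch records the observed-data log-likelihood at each iteration so that the monotone ascent behaviour of classical EM can be inspected.

```python
import numpy as np


def em_gaussian_mixture(x, n_iter=200, seed=0):
    """Standard EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Assumed initialisation: two distinct data points as means,
    # the overall sample variance for both components, equal weights.
    mu = rng.choice(x, size=2, replace=False)
    var = np.array([x.var(), x.var()])
    weights = np.array([0.5, 0.5])
    loglik_history = []

    for _ in range(n_iter):
        # E-step: weighted component densities and responsibilities.
        dens = (weights
                * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2.0 * np.pi * var))
        total = dens.sum(axis=1)
        # Log-likelihood under the parameters used in this E-step.
        loglik_history.append(np.log(total).sum())
        resp = dens / total[:, None]

        # M-step: closed-form updates of weights, means and variances.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

    return weights, mu, var, np.array(loglik_history)


# Synthetic data (an assumption for the demo): two well-separated clusters.
data_rng = np.random.default_rng(1)
x = np.concatenate([data_rng.normal(-2.0, 1.0, 300),
                    data_rng.normal(3.0, 1.0, 200)])
weights, mu, var, loglik_history = em_gaussian_mixture(x)
```

Plotting `loglik_history` against the iteration index should show a non-decreasing curve, which is the property that the article extends to its broader class of objective functions. The sketch has no safeguard against a component variance collapsing toward zero, which a production implementation would need.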
In recent years, several mixtures of skew factor analyzers have been proposed. These models adopt various skew distributions for either the factors or the errors, but not both. This paper examines the connections between these formulations and introduces a unified model that allows for skewness in both the factors and errors. (C) 2020 Elsevier B.V. All rights reserved.
In this paper, a hidden Markov model for modelling matrix-variate time series data is developed. It relies on matrix-variate distributions and presents a promising alternative to the existing methods. A simulation study is carefully conducted using benchmark tests with pre-specified overlap values. Compared with the existing methods, the proposed model demonstrates rather high accuracy in state classification. The results suggest that such an approach is indeed competitive. Interesting applications to real-life data are presented for illustration.
Rubin and Thayer (Psychometrika, 47:69-76, 1982) proposed the EM algorithm for exploratory and confirmatory maximum likelihood factor analysis. In this paper, we prove the following fact: the EM algorithm always gives a proper solution, with positive unique variances and factor correlations whose absolute values do not exceed one, when the covariance matrix to be analyzed and the initial matrices, including unique variances and inter-factor correlations, are positive definite. We further demonstrate numerically that the EM algorithm yields proper solutions for data that lead the prevailing gradient algorithms for factor analysis to produce improper solutions. The numerical studies also show that, in real computations with limited numerical precision, Rubin and Thayer's (Psychometrika, 47:69-76, 1982) original formulas for confirmatory factor analysis can make factor correlation matrices asymmetric, so that the EM algorithm fails to converge. However, this problem can be overcome by using an EM algorithm in which the original formulas are replaced by those guaranteeing the symmetry of factor correlation matrices, or by the formulas used to prove the above fact.
Background: Pooling is a cost-effective way to collect data for genetic association studies, particularly for rare genetic variants. It is of interest to estimate the haplotype frequencies, which contain more information than single-locus statistics. By viewing the pooled genotype data as incomplete data, the expectation-maximization (EM) algorithm is the natural algorithm to use, but it is computationally intensive. A recent proposal to reduce the computational burden is to use database information to form a list of frequently occurring haplotypes, and to restrict the haplotypes to this list when implementing the EM algorithm. There is, however, the danger of using an incorrect list, and in some applications there may not be enough database information to form a list externally. Results: We investigate the possibility of creating an internal list from the data at hand. One way to form such a list is to collapse the observed total minor allele frequencies to "zero" or "at least one", which is shown to have the desirable effect of amplifying the haplotype frequencies. To improve coverage, we propose ways to add and remove haplotypes from the list, and a benchmarking method to determine the frequency threshold for removing haplotypes. Simulation results show that the EM estimates based on a suitably augmented and trimmed collapsed data list (ATCDL) perform satisfactorily. In two scenarios involving 25 and 32 loci respectively, the EM-ATCDL estimates outperform the EM estimates based on other lists as well as the collapsed data maximum likelihood estimates. Conclusions: The proposed augmented and trimmed CD list is a useful list on which to base the EM algorithm when estimating the haplotype distributions of rare variants. It can handle more markers and larger pool sizes than existing methods, and the resulting EM-ATCDL estimates are more efficient than the EM estimates based on other lists.
This paper proposes a variant of the EM (expectation-maximization) algorithm for Markovian arrival process (MAP) and phase-type distribution (PH) parameter estimation. In particular, we derive the deterministic annealing EM (DAEM) algorithm for MAP/PH parameter estimation. The DAEM algorithm is one of the methods for overcoming the local maxima problem associated with the conventional EM algorithm. This paper derives concrete E- and M-step formulas for MAP parameter estimation from inter-arrival time data and PH parameter estimation from point samples in the framework of the DAEM algorithm. Numerical examples demonstrate the DAEM algorithm for the Markov-modulated Poisson process (MMPP) and several classes of PH distributions.
Convergence of the expectation-maximization (EM) algorithm to a global optimum of the marginal log likelihood function for unconstrained latent variable models with categorical indicators is presented. The sufficient conditions under which global convergence of the EM algorithm is attainable are provided in an information-theoretic context by interpreting the EM algorithm as alternating minimization of the Kullback-Leibler divergence between two convex sets. It is shown that these conditions are satisfied by an unconstrained latent class model, yielding an optimal bound against which more highly constrained models may be compared.
The first hitting time of a Wiener process to a boundary naturally leads to a lifetime model with cures and is known to follow an inverse Gaussian distribution. This thesis focuses on a first hitting time regression model for lifetime data with cures based on the defective inverse Gaussian distribution. Maximum likelihood estimation (MLE) of the model parameters is performed using the EM algorithm on the incomplete likelihood, which is written as a mixture model between those cured and those susceptible. Confidence intervals are obtained using two different methods: (i) delta method on the cure rate directly; and (ii) delta method on the log odds of being cured. Through simulation, the performance of the MLE and the confidence intervals is evaluated. The study results demonstrate that model parameters and the cure rate can be estimated with low bias, but confidence intervals for the cure rate had coverage probabilities below the nominal level of confidence.
Degradation modeling and parameter estimation of products based on performance degradation data is the basis of prediction and health management (PHM) technology, which has attracted much attention from scholars at home and *** at the nonlinear degradation process with observed noise that usually exists in products and equipment in practical engineering applications, the nonlinear Wiener process is analyzed and a nonlinear degradation model is ***, based on this model, the parameter estimation of the degradation model is realized by combining Kalman smoothing and the EM algorithm to accomplish both denoising and parameter ***, the accuracy of the parameter estimation in this paper is verified by simulation data and the rotating bearing dataset provided by the FEMTO-ST institute for simulation validation and example validation, respectively.