In real life, experimental units are often susceptible to more than one risk factor. Moreover, some experimental units may not fail even if they are observed over a long period of time. In statistical analysis, competing risks models handle the first scenario, while cure rate models have been introduced to analyze the long-term survivors in the population. In this paper, we consider a cure rate model in which the failure of a unit can be due to either of two competing causes. To analyze the competing risks data in the presence of long-term survivors, we adopt the latent failure times approach introduced by Cox (J R Stat Soc Ser B (Methodol) 21(2):411-421, 1959). The latent failure times are assumed to be independently exponentially distributed. Under this setup, a random censoring scheme is applied, and the observed data consist of either censored times or actual failure times along with the cause of failure. We derive the maximum likelihood estimators (MLEs) using the expectation-maximization (EM) algorithm based on the missing value principle. As the overall survival function is not a proper survival function, the asymptotic behavior of the MLEs is not immediate. We provide sufficient conditions for the existence, uniqueness, consistency and asymptotic normality of the MLEs. Monte Carlo simulations are performed to support the theoretical results numerically. For illustrative purposes, we analyze one real dataset, and the results are quite satisfactory.
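For orientation, here is a minimal sketch of the kind of improper population survival function such a setup produces, assuming the standard mixture cure formulation with cured proportion $p_0$ and two independent exponential latent failure times with rates $\lambda_1$ and $\lambda_2$ (the paper's exact parameterization may differ):

$$ S_{\mathrm{pop}}(t) \;=\; p_0 + (1 - p_0)\, e^{-(\lambda_1 + \lambda_2)\,t}, \qquad t \ge 0. $$

Because $S_{\mathrm{pop}}(t) \to p_0 > 0$ as $t \to \infty$, this is not a proper survival function, which is why the asymptotic behavior of the MLEs requires separate treatment.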
In this paper, we first illustrate the restricted empirical likelihood function as an alternative to the usual empirical likelihood. Then, we use this quasi-empirical likelihood function as a basis for Bayesian analysis of AR(r) time series models. The efficiency of both the posterior computation algorithm, when the estimating equations are linear functions of the parameters, and the EM algorithm for estimating hyper-parameters is an appealing property of our proposed approach. Moreover, the competitive finite-sample performance of the proposed method is illustrated via both a simulation study and the analysis of a real dataset. © 2021 The Authors. Published by Atlantis Press B.V.
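As background for the contrast drawn above, the usual empirical likelihood for a parameter $\theta$ identified by estimating equations $g$ is

$$ L_{EL}(\theta) \;=\; \max\Big\{ \prod_{i=1}^{n} p_i \;:\; p_i \ge 0,\; \sum_{i=1}^{n} p_i = 1,\; \sum_{i=1}^{n} p_i\, g(X_i, \theta) = 0 \Big\}; $$

for an AR model the $g$ would typically be least-squares or Yule-Walker-type moment conditions, which is what makes the estimating equations linear in the parameters. The restricted empirical likelihood proposed in the paper modifies this construction; its exact form is not reproduced here.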
In healthcare economics, count datasets often exhibit excessive zeros or right-skewed tails. When covariates are available, such datasets are typically modelled using zero-inflated (ZI) or finite mixture (FM) regression models. However, neither model performs adequately when the dataset has both excessive zeros and a long tail, which is often the case in practice. In this paper we combine these two models to create a more flexible, versatile class of ZIFM models. With this model we perform a comprehensive analysis of the number of visits to a physician's office using the US healthcare demand dataset that has been used in numerous healthcare studies in the literature. After comparison with existing models that have been reported to perform well on this dataset, we find that the ZIFM model substantially outperforms the alternatives. In addition, the model offers a new interpretation that contrasts with previous empirical findings regarding the factors associated with the demand for physicians, which can shed fresh light on healthcare utilisation policies.
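A hedged sketch of the general form such a combined model takes, writing $\pi_i$ for the covariate-dependent zero-inflation probability, $w_k$ for the mixing weights and $f_k$ for the $K$ count-component densities (the component families and link functions used in the paper may differ):

$$ P(Y_i = y \mid x_i) \;=\; \begin{cases} \pi_i + (1 - \pi_i) \sum_{k=1}^{K} w_k\, f_k(0 \mid x_i), & y = 0,\\[4pt] (1 - \pi_i) \sum_{k=1}^{K} w_k\, f_k(y \mid x_i), & y = 1, 2, \ldots \end{cases} $$

The zero-inflation part absorbs the excess zeros while the finite mixture accommodates the long right tail.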
The problem of fitting a folded normal distribution by maximum likelihood has been described as 'not straightforward', and alternatives such as the EM algorithm have been proposed. We suggest here that it is in fact straightforward to fit such a distribution by direct numerical maximization of the likelihood. We demonstrate this in an example. The relevant R code is included.
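The paper's R code is not reproduced here; the following is a hedged Python analogue of the same idea, maximizing the folded normal log-likelihood directly with a general-purpose optimizer on simulated data.

```python
# Direct numerical maximization of the folded normal likelihood (illustrative sketch).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
x = np.abs(rng.normal(loc=2.0, scale=1.5, size=200))   # simulated folded normal sample

def neg_log_lik(params, x):
    mu, sigma = params
    # folded normal density on x >= 0: (1/sigma)[phi((x-mu)/sigma) + phi((x+mu)/sigma)]
    dens = norm.pdf(x, mu, sigma) + norm.pdf(x, -mu, sigma)
    return -np.sum(np.log(dens))

res = minimize(neg_log_lik, x0=[x.mean(), x.std()], args=(x,),
               method="L-BFGS-B", bounds=[(None, None), (1e-6, None)])
print("MLE (mu, sigma):", res.x)
```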
In this paper, the maximum likelihood estimates (MLEs) of the parameters of a finite mixture of modified Weibull (MW(alpha, beta, gamma)) distributions are obtained based on type-I and type-II censored samples using the EM algorithm. A simulation study is carried out to study the behavior of the mean squared errors. A real data set is introduced and analyzed using a mixture of two MW distributions and also using a mixture of two Weibull(alpha, beta) distributions. A comparison is carried out between these mixtures based on the corresponding Kolmogorov-Smirnov (K-S) test statistic to show that the MW mixture model fits the data better than the other mixture model.
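For orientation, the generic censored-data likelihood that such an EM fit maximizes, written with a censoring indicator $\delta_i$ (1 for an observed failure, 0 for a censored time) and a two-component mixture density; the specific MW parameterization and the type-I/type-II censoring details follow the paper:

$$ L(\Theta) \;=\; \prod_{i=1}^{n} f(t_i; \Theta)^{\delta_i}\, S(t_i; \Theta)^{1-\delta_i}, \qquad f(t; \Theta) = p\, f_{MW}(t; \theta_1) + (1-p)\, f_{MW}(t; \theta_2), $$

where $S$ is the corresponding survival function. The EM algorithm augments each observation with its latent component membership, which makes the M-step separate into component-wise updates.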
In lifetime studies, on many occasions, a proportion of individuals may experience the event of interest at the beginning of the study itself, while another group of individuals may not experience the event of interes...
Owing to the strong control bedrock geology may exert on the chemical composition of stream sediments, the determination of stream sediment geochemical anomalies is always affected by the lithological background in areas with variable lithologies. In this study, the expectation-maximization (EM) algorithm was used to separate lithologies of different chemical compositions in a 1:200 000 scale regional geochemical data set of stream sediments in a lithologically complex region of Hunan province, SE China. The data set included 1024 minerogenic stream sediment samples which were analysed for Cu, La, Li, Be, Cr, Ni, Sr, V, Th, Ti and Zr. A comparison was carried out between Cu anomalies determined with and without taking the separation of lithologies into account. The results show that stream sediment geochemical anomalies in lithologically complex regions can be determined more reasonably by applying the EM clustering method: strong but false or meaningless anomalies can be eliminated, and weak but important or meaningful anomalies are revealed more clearly.
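An illustrative sketch of the general workflow (not the authors' exact procedure): EM-based Gaussian mixture clustering of log-transformed multi-element data, followed by per-cluster anomaly thresholds. The synthetic data, number of components and 2-standard-deviation threshold below are assumptions.

```python
# Cluster multi-element stream sediment chemistry with an EM-fitted Gaussian mixture,
# then flag Cu anomalies relative to each cluster's own background rather than a
# single survey-wide threshold (illustrative sketch on synthetic data).
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
elements = ["Cu", "La", "Li", "Be", "Cr", "Ni", "Sr", "V", "Th", "Ti", "Zr"]

# stand-in for the analysed samples: two synthetic "lithologies" with different backgrounds
bg1 = rng.lognormal(mean=3.0, sigma=0.4, size=(600, len(elements)))
bg2 = rng.lognormal(mean=3.8, sigma=0.4, size=(424, len(elements)))
df = pd.DataFrame(np.vstack([bg1, bg2]), columns=elements)

# EM clustering on log-transformed concentrations; each component proxies one lithology
X = np.log10(df[elements].to_numpy())
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
df["cluster"] = gmm.predict(X)

# Cu anomalies defined per cluster: beyond mean + 2 s.d. of that cluster's background
stats = df.groupby("cluster")["Cu"].agg(["mean", "std"])
threshold = df["cluster"].map(stats["mean"] + 2 * stats["std"])
df["Cu_anomaly"] = df["Cu"] > threshold
print(df.groupby("cluster")["Cu_anomaly"].mean())
```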
Conditional correlation networks, within Gaussian Graphical Models (GGM), are widely used to describe the direct interactions between the components of a random vector. In the case of an unlabelled heterogeneous popul...
Hidden Markov chain or Markov field models with observations in a Euclidean space play a major role across signal and image processing. The present work provides a statistical framework which can be used to extend these models, along with related, popular algorithms (such as the Baum-Welch algorithm), to the case where the observations lie in a Riemannian manifold. It is motivated by the potential use of hidden Markov chains and fields, with observations in Riemannian manifolds, as models for complex signals and images. Copyright © 2021 The Authors.
ISBN (print): 9781665403450
Incomplete data sets are a problem in most studies; however, few studies have realised that imputation is a solution to this problem. Incomplete data can have a significant effect on the conclusions drawn and decisions made. To solve the problem of incomplete data, one should use techniques to recover the missing values, depending on how much data is missing, how large the data set is, how the data went missing, and so on. In this report, we aimed to compare the performance of the EM algorithm and matrix completion when imputing missing values for varying degrees of missingness. Kullback-Leibler (KL) divergence was used as an evaluation metric to assess the performance of the expectation-maximization (EM) algorithm and matrix completion when estimating missing values relative to the ground-truth distribution. The findings of this research show that the EM algorithm outperformed matrix completion in both the theoretical model (simulated scenarios of learning from varying degrees of missing data) and the application model (application of the theoretical model to real-world data on credit card fraud). A few similarities between the algorithms were observed when recovering missing values, such as the increasing trend of error as the proportion of missing values increases and the impact of an increasing number of variables in a data set. Matrix completion only performed better when the proportion of missing values exceeded approximately 75%. Therefore, from our findings, we conclude that when less than 50% of the data is missing, the EM algorithm produces accurate predictions. The EM algorithm performed better than matrix completion because it first learned the data distribution and used maximum likelihood procedures to estimate the parameters of the model, whereas matrix completion analysed the existing patterns in rows and columns and imputed the missing values using the patterns learned from the data.
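A hedged sketch of the kind of comparison described above, restricted to the EM side: EM imputation under a multivariate Gaussian working model, evaluated with a histogram-based KL divergence against the ground truth. The data, missingness level and KL discretisation below are illustrative assumptions, and the matrix-completion baseline is not reproduced.

```python
# EM imputation for data assumed multivariate Gaussian, evaluated with a
# histogram-based KL divergence against the ground-truth data (illustrative sketch).
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(0)

def em_gaussian_impute(X, n_iter=50):
    """Impute NaNs assuming rows are i.i.d. multivariate Gaussian."""
    X = X.copy()
    miss = np.isnan(X)
    n, d = X.shape
    # crude start: column means for missing cells, empirical mean/covariance
    col_means = np.nanmean(X, axis=0)
    X[miss] = np.take(col_means, np.where(miss)[1])
    mu, cov = X.mean(axis=0), np.cov(X, rowvar=False)
    for _ in range(n_iter):
        corr = np.zeros((d, d))          # accumulated conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            coo_inv = np.linalg.pinv(cov[np.ix_(o, o)])
            # E-step: conditional mean of the missing block given the observed block
            X[i, m] = mu[m] + cov[np.ix_(m, o)] @ coo_inv @ (X[i, o] - mu[o])
            cmm = cov[np.ix_(m, m)] - cov[np.ix_(m, o)] @ coo_inv @ cov[np.ix_(o, m)]
            corr[np.ix_(m, m)] += cmm
        # M-step: update mean and covariance, adding the conditional covariance term
        mu = X.mean(axis=0)
        diff = X - mu
        cov = (diff.T @ diff + corr) / n
    return X

def kl_to_truth(truth, imputed, bins=30):
    """Histogram-based KL divergence, averaged over columns (illustrative only)."""
    kls = []
    for j in range(truth.shape[1]):
        lo, hi = truth[:, j].min(), truth[:, j].max()
        p, edges = np.histogram(truth[:, j], bins=bins, range=(lo, hi), density=True)
        q, _ = np.histogram(imputed[:, j], bins=edges, density=True)
        kls.append(entropy(p + 1e-12, q + 1e-12))
    return float(np.mean(kls))

# toy experiment: 25% of entries missing completely at random
truth = rng.multivariate_normal([0, 1, -1],
                                [[1, .5, .2], [.5, 1, .3], [.2, .3, 1]], size=500)
X = truth.copy()
X[rng.random(X.shape) < 0.25] = np.nan
print("KL(truth || EM-imputed) ~", kl_to_truth(truth, em_gaussian_impute(X)))
```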