Tweedie's compound Poisson model is a popular method for modeling insurance claims with a probability mass at zero and a nonnegative, highly right-skewed distribution. In particular, it is not uncommon to have extremely unbalanced data with an excessively large proportion of zero claims, and even the traditional Tweedie model may not be satisfactory for fitting such data. In this paper, we propose a boosting-assisted zero-inflated Tweedie model, called EMTboost, that allows the zero probability mass to exceed that of a traditional Tweedie model. We make a nonparametric assumption on its Tweedie model component, which, unlike a linear model, is able to capture nonlinearities, discontinuities, and complex higher-order interactions among predictors. A specialized expectation-maximization (EM) algorithm is developed that integrates a blockwise coordinate descent strategy and a gradient tree-boosting algorithm to estimate key model parameters. We use extensive simulation and data analysis on synthetic zero-inflated auto-insurance claim data to illustrate our method's prediction performance.
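As a rough illustration of the EM structure behind such zero-inflated models (not the EMTboost procedure itself, whose M-step fits the Tweedie component by gradient tree boosting), the sketch below runs EM for a zero-inflated Poisson stand-in: the E-step computes the responsibility that each observed zero is a structural zero, and the M-step updates the zero-mass probability and the component mean in closed form. The Poisson simplification and all settings are illustrative assumptions.

```python
import numpy as np
from scipy.stats import poisson

def em_zero_inflated_poisson(y, n_iter=200, tol=1e-8):
    """EM for a zero-inflated Poisson: pi is the structural-zero probability,
    lam the Poisson mean (a simplified stand-in for the Tweedie component)."""
    pi, lam = 0.5, max(y.mean(), 1e-3)
    for _ in range(n_iter):
        # E-step: responsibility that a zero observation is a structural zero
        p0 = poisson.pmf(0, lam)
        r = np.where(y == 0, pi / (pi + (1 - pi) * p0), 0.0)
        # M-step: closed-form updates given the responsibilities
        pi_new, lam_new = r.mean(), ((1 - r) * y).sum() / (1 - r).sum()
        if abs(pi_new - pi) + abs(lam_new - lam) < tol:
            return pi_new, lam_new
        pi, lam = pi_new, lam_new
    return pi, lam

rng = np.random.default_rng(0)
z = rng.random(5000) < 0.3                       # structural zeros
y = np.where(z, 0, rng.poisson(2.0, 5000))       # observed "claims"
print(em_zero_inflated_poisson(y))               # roughly (0.3, 2.0)
```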
ISBN (print): 9780735412415
The Burr Type III distribution has been applied in the study of income, wage, and wealth. It is suitable for fitting lifetime data since it has flexible shape and controllable scale parameters. The Burr Type III distribution is popular in part because it encompasses the characteristics of other distributions such as the logistic and exponential. It comes in two forms: a two-parameter distribution with two shape parameters, and a three-parameter distribution with one scale and two shape parameters. The expectation-maximization (EM) algorithm is selected in this paper to estimate the two- and three-parameter Burr Type III distributions. Complete and censored data are simulated based on the derived parametric forms of the pdf and cdf of the Burr Type III distribution. The EM estimates are then compared with estimates from the maximum likelihood estimation (MLE) approach through the mean square error, to determine which approach yields estimates closer to the true parameters. The results show that the EM estimates perform better than the MLE estimates for the two- and three-parameter Burr Type III distributions for both complete and censored data.
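For context on the MLE side of such a comparison, here is a hedged sketch that fits the two-parameter Burr Type III distribution, with cdf F(x) = (1 + x^(-c))^(-k), to right-censored data by directly maximizing the censored log-likelihood; failures contribute log f(x) and censored points log(1 - F(x)). The sample size, censoring rule, and starting values are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy.optimize import minimize

def burr3_logpdf(x, c, k):
    # Two-parameter Burr Type III: F(x) = (1 + x**(-c))**(-k), x > 0
    return np.log(c) + np.log(k) - (c + 1) * np.log(x) - (k + 1) * np.log1p(x ** (-c))

def burr3_logsf(x, c, k):
    # log survival function, log(1 - F(x)), used for right-censored points
    return np.log1p(-(1.0 + x ** (-c)) ** (-k))

def negloglik(theta, x, censored):
    c, k = np.exp(theta)                          # log scale keeps c, k positive
    return -(burr3_logpdf(x[~censored], c, k).sum()
             + burr3_logsf(x[censored], c, k).sum())

# Simulate by inverse-CDF sampling, then right-censor at a fixed time
rng = np.random.default_rng(1)
c_true, k_true = 2.0, 1.5
u = rng.random(2000)
x = (u ** (-1.0 / k_true) - 1.0) ** (-1.0 / c_true)
t_cens = np.quantile(x, 0.8)                      # censor the largest 20%
censored = x > t_cens
x_obs = np.minimum(x, t_cens)

fit = minimize(negloglik, x0=np.log([1.0, 1.0]), args=(x_obs, censored),
               method="Nelder-Mead")
print(np.exp(fit.x))                              # should be near (2.0, 1.5)
```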
We study the strong consistency of the maximum likelihood estimator under a special finite mixture of two-parameter Gamma distributions. Somewhat surprisingly, the likelihood function under a Gamma mixture with a set of independent and identically distributed observations is unbounded. There exist many sets of nonsensical parameter values at which the likelihood value is arbitrarily large. This leads to an inconsistent, or arguably undefined, maximum likelihood estimator. Interestingly, when the scale or shape parameter in the finite Gamma mixture model is structural, the maximum likelihood estimator of the mixing distribution is well defined and strongly consistent. Establishing consistency when the shape parameter is structural is technically less challenging and has already been given in the literature. In this paper, we prove the consistency when the scale parameter is structural and provide some illustrative simulation experiments. We further include an application of the model with a structural scale parameter to salary potential data. We conclude that the Gamma mixture distribution with a structural scale parameter provides another flexible yet relatively parsimonious model for observations with intrinsic positive values.
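The unboundedness can be illustrated numerically: letting one Gamma component "spike" at a single observation (shape a growing while the scale is fixed at x1/a) drives the mixture likelihood arbitrarily high, which is exactly the degeneracy that a structural (common) scale parameter rules out. A minimal sketch, with all settings chosen purely for illustration:

```python
import numpy as np
from scipy.stats import gamma

def mixture_loglik(x, w, a1, s1, a2, s2):
    # Two-component Gamma mixture: shapes a1, a2; scales s1, s2; weight w
    comp = w * gamma.pdf(x, a1, scale=s1) + (1 - w) * gamma.pdf(x, a2, scale=s2)
    return np.log(comp).sum()

rng = np.random.default_rng(2)
x = gamma.rvs(2.0, scale=1.0, size=100, random_state=rng)

# Spike one component at the first observation: as the shape grows with
# scale = x[0]/shape, that component's density at x[0] grows without bound,
# so the mixture log-likelihood can be made arbitrarily large.
for a_spike in [1e2, 1e4, 1e6]:
    print(a_spike, mixture_loglik(x, 0.5, a_spike, x[0] / a_spike, 2.0, 1.0))
```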
Problem definition: We address the problem of how to estimate lost sales for substitutable products when there is no reliable on-shelf availability (OSA) information. Academic/practical relevance: We develop a novel approach to estimating lost sales using only sales data, a market share estimate, and an estimate of overall availability. We use the method to illustrate the negative consequences of using potentially inaccurate inventory records as indicators of availability. Methodology: We suggest a partially hidden Markov model of OSA to generate probabilistic choice sets and incorporate these probabilistic choice sets into the estimation of a multinomial logit demand model using a nested expectation-maximization (EM) algorithm. We highlight the importance of considering inventory reliability problems, first through simulation and then by applying the procedure to a data set from a major U.S. retailer. Results: The simulations show that the method converges in seconds and produces estimates with similar or lower bias than state-of-the-art benchmarks. For the product category under consideration at the retailer, our procedure finds lost sales of around 3.0%, compared with 0.2% when relying on the inventory record as an indicator of availability. Managerial implications: The method efficiently computes estimates that can be used to improve inventory management and guide managers on how to use their scarce resources to improve stocking execution. The research also shows that ignoring inventory record inaccuracies when estimating lost sales can produce substantially inaccurate estimates, which leads to incorrect parameters in supply chain planning.
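As a simplified sketch of how probabilistic choice sets can enter a multinomial logit demand model (leaving aside the paper's partially hidden Markov model of OSA and the nested EM estimation), the example below computes expected MNL choice probabilities by averaging over possible on-shelf availability patterns; the utilities and availability probabilities are hypothetical.

```python
import numpy as np
from itertools import product

def choice_probs_probabilistic_sets(v, p_avail):
    """Expected MNL choice probabilities when product j is on shelf independently
    with probability p_avail[j]; option 0 is the always-available no-purchase
    alternative with utility 0."""
    n = len(v)
    probs = np.zeros(n + 1)                                    # index 0 = no purchase
    for avail in product([0, 1], repeat=n):                    # enumerate choice sets
        a = np.array(avail)
        w = np.prod(np.where(a == 1, p_avail, 1 - p_avail))    # P(this choice set)
        expv = np.concatenate(([1.0], a * np.exp(v)))          # unavailable -> weight 0
        probs += w * expv / expv.sum()
    return probs

v = np.array([0.5, 0.2, -0.1])            # hypothetical MNL utilities
p_avail = np.array([0.95, 0.80, 0.60])    # hypothetical availability probabilities
print(choice_probs_probabilistic_sets(v, p_avail))   # sums to 1
```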
In field reliability analyses, a data collection period is given to monitor the failure events from the field. Left-truncation arises due to early failures occurring before the data collection period, and right-censoring arises for late failures occurring beyond the monitoring period. Naive analyses of left-truncated and right-censored data lead to biased estimation of the population lifetime of interest. A variety of models and methods have been developed to analyze the left-truncated and right-censored data for field reliability analyses. The goal of the paper is to review the existing models and methods for fitting left-truncated and right-censored data. Our review includes the existing statistical models, such as the exponential, Weibull, lognormal, gamma, Gompertz, Lomax, and spline models. We comprehensively review the statistical issues of maximum likelihood estimation, model selection, residual lifetime prediction, and Bayesian methods. Some of these methods are illustrated through the field reliability analysis of the electric power transformer dataset.
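As a minimal parametric example of the likelihood construction these methods share: with left truncation at tau, a failure at t contributes f(t)/S(tau) and a right-censored unit contributes S(t)/S(tau). The Weibull sketch below fits simulated left-truncated, right-censored data by direct maximization; the simulation settings are assumptions for illustration, not any of the reviewed case studies.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_logpdf(t, k, lam):
    return np.log(k) - k * np.log(lam) + (k - 1) * np.log(t) - (t / lam) ** k

def weibull_logsf(t, k, lam):
    return -(t / lam) ** k

def negloglik_ltrc(theta, t, trunc, event):
    """Failures contribute log f(t) - log S(trunc); censored units contribute
    log S(t) - log S(trunc), i.e. everything is conditioned on surviving past
    the left-truncation time."""
    k, lam = np.exp(theta)
    ll = np.where(event, weibull_logpdf(t, k, lam), weibull_logsf(t, k, lam))
    return -(ll - weibull_logsf(trunc, k, lam)).sum()

rng = np.random.default_rng(3)
k_true, lam_true = 1.8, 10.0
t_all = lam_true * rng.weibull(k_true, size=20000)
trunc = rng.uniform(0.0, 5.0, size=20000)        # time already in service at study start
keep = t_all > trunc                             # early failures are never observed
t, trunc = t_all[keep], trunc[keep]
end = trunc + 8.0                                # fixed monitoring window
event = t <= end
t_obs = np.minimum(t, end)

fit = minimize(negloglik_ltrc, np.log([1.0, 5.0]), args=(t_obs, trunc, event),
               method="Nelder-Mead")
print(np.exp(fit.x))                             # close to (1.8, 10.0)
```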
Doubly censored data often arise in medical and epidemiological studies when observations are subject to both left censoring and right censoring. In this article, based on doubly censored data, we consider maximum likelihood estimation for the Cox-Aalen model with fixed covariates. By treating left-censored observations as missing, we propose expectation-maximization (EM) algorithms for obtaining the maximum likelihood estimators (MLE) of the regression coefficients of the Cox-Aalen model. We establish the asymptotic properties of the MLE. Simulation studies show that the MLE obtained via the EM algorithms performs well.
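A toy parametric analogue of the "treat left-censored observations as missing" idea, far simpler than the semiparametric Cox-Aalen estimator studied in the article: for an exponential lifetime, the E-step replaces each censored lifetime by its conditional expectation and the M-step updates the rate in closed form. All quantities below are illustrative assumptions.

```python
import numpy as np

def em_exponential_doubly_censored(t, status, n_iter=500, tol=1e-10):
    """status: 1 = exact time, 2 = right-censored at t, 3 = left-censored at t.
    EM treats the unobserved lifetimes as missing data."""
    lam = 1.0 / t.mean()
    for _ in range(n_iter):
        # E-step: conditional expectations of the latent lifetimes
        e_t = np.where(status == 2, t + 1.0 / lam, t)                   # E[T | T > t]
        e_left = 1.0 / lam - t * np.exp(-lam * t) / (1.0 - np.exp(-lam * t))
        e_t = np.where(status == 3, e_left, e_t)                        # E[T | T < t]
        lam_new = len(t) / e_t.sum()                                    # M-step
        if abs(lam_new - lam) < tol:
            return lam_new
        lam = lam_new
    return lam

rng = np.random.default_rng(6)
T = rng.exponential(2.0, 3000)                    # true mean lifetime 2.0
L, R = 0.5, 5.0                                   # left- and right-censoring limits
t = np.clip(T, L, R)
status = np.where(T < L, 3, np.where(T > R, 2, 1))
print(1.0 / em_exponential_doubly_censored(t, status))   # roughly 2.0
```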
Multivariate interval-censored data arise when each subject under study can potentially experience multiple events and the onset time of each event is not observed exactly but is known to lie in a certain time interval formed by adjacent examination times with changed statuses of the event. This type of incomplete and complex data structure poses a substantial challenge in practical data analysis. In addition, many potential risk factors exist in numerous studies. Thus, conducting variable selection for event-specific covariates simultaneously becomes useful in identifying important variables and assessing their effects on the events of interest. In this paper, we develop a variable selection technique for multivariate interval-censored data under a general class of semiparametric transformation frailty models. The minimum information criterion (MIC) method is embedded in the optimization step of the proposed expectation-maximization (EM) algorithm to obtain the parameter estimator. The proposed EM algorithm greatly reduces the computational burden of maximizing the observed likelihood function, and the MIC naturally avoids selecting the optimal tuning parameter as needed by many other popular penalties, making the proposed algorithm promising and reliable. The proposed method is evaluated through extensive simulation studies and illustrated by an analysis of patient data from the Aerobics Center Longitudinal Study.
ISBN (print): 9781479948604
In this paper, we study techniques for blast wave field reconstruction based on tomography. The overpressure field is reconstructed by inverting the velocity field in the process of shock wave transmission. Since the reconstruction is difficult due to the insufficient number of excitation sources and detectors, we propose an EM algorithm based on prior information. Appropriate models are constructed using the proposed methods, and a simulation example is presented. The results reveal that, compared with traditional methods, this method has higher precision and converges faster. They also show the validity and practicality of the developed algorithm in solving the problem of incomplete data reconstruction.
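For reference, the classical ML-EM (MLEM) iteration, a standard EM approach to tomographic reconstruction with Poisson-distributed measurements, can be sketched in a few lines. This is not the authors' prior-informed algorithm; the toy system matrix below is an assumption chosen only to mimic an under-determined (incomplete-data) setup with fewer rays than pixels.

```python
import numpy as np

def mlem(A, y, n_iter=200):
    """MLEM update for tomographic reconstruction with Poisson counts:
    x <- x / (A^T 1) * A^T (y / (A x)), with A the (rays x pixels) system matrix."""
    x = np.ones(A.shape[1])
    sens = A.T @ np.ones(A.shape[0])                  # sensitivity image A^T 1
    for _ in range(n_iter):
        proj = A @ x
        ratio = np.divide(y, proj, out=np.zeros_like(y), where=proj > 0)
        x *= (A.T @ ratio) / np.maximum(sens, 1e-12)
    return x

# Toy under-determined problem: 15 ray measurements for a 5x5 field (25 unknowns)
rng = np.random.default_rng(4)
A = rng.random((15, 25))
x_true = rng.gamma(2.0, 1.0, size=25)
y = rng.poisson(A @ x_true).astype(float)
x_hat = mlem(A, y, 500)
# With so few rays the field is only loosely recovered, illustrating why extra
# prior information is needed for incomplete-data reconstruction.
print(np.round(x_hat[:5], 2), np.round(x_true[:5], 2))
```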
Objectives: Oral cancer, also called oral squamous cell carcinoma (OSCC), has been one of the serious cancers affecting South Asian countries. A range of diagnostic strategies are available, including biopsy of the affected part. The Wnt/beta-catenin pathway plays important roles in morphogenesis, normal physiological functions, and tumor formation. This study examined the accumulation of beta-catenin in the nuclei and cytoplasm of oral cancer cells. Methods: The accuracy of histopathological results is hampered by considerable inter- and intra-reader variability, even among expert pathologists. In order to obtain both qualitative and quantitative results, we developed a system for the diagnosis of oral cancer using the expectation-maximization (EM) algorithm. Results: The microscopic images of immunohistochemical staining of beta-catenin expression were segmented using an iterative EM algorithm to extract the cellular and extracellular components of an image. The segmentation process uses unitone conversion to obtain the single-channel image with the highest contrast via Principal Component Analysis (PCA). Finally, the unitone image is normalized to the [0, 1] range. Conclusion: Based on the segmentation process, we conclude that analyzing beta-catenin expression with the EM algorithm is an efficient technique to help the pathologist evaluate the histological changes in microscopic images of oral cancer.
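A hedged sketch of the described pipeline, using scikit-learn as a stand-in implementation: PCA reduces the RGB pixels to a single "unitone" channel with the highest variance, the channel is normalized to [0, 1], and a Gaussian mixture fitted by EM segments the pixels into classes. The synthetic image and the number of classes are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def segment_em(rgb_image, n_classes=3):
    """PCA 'unitone' conversion to one channel, [0, 1] normalization, then
    EM-based (Gaussian mixture) clustering of the pixel intensities."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    unitone = PCA(n_components=1).fit_transform(pixels)        # highest-variance channel
    unitone = (unitone - unitone.min()) / (unitone.max() - unitone.min() + 1e-12)
    gmm = GaussianMixture(n_components=n_classes, random_state=0).fit(unitone)
    return gmm.predict(unitone).reshape(h, w)

# Synthetic stand-in for a stained microscopy image
rng = np.random.default_rng(5)
img = rng.random((64, 64, 3))
labels = segment_em(img)
print(np.bincount(labels.ravel(), minlength=3))    # pixel counts per segment class
```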
In this article, we discuss the problem of point estimation of the three unknown parameters of a bivariate new extended Weibull distribution under complete and randomly right-censored samples. The expectation-maximization (EM) algorithm is used to estimate the unknown parameters. Simulation experiments are performed to assess the effectiveness of the estimators for complete and censored data. One dataset is considered to illustrate the practical utility of the article.