We propose a heteroscedastic replicated measurement error model based on the class of scale mixtures of skew-normal distributions, which allows the variances of measurement errors to vary across subjects. We develop EM algorithms to calculate maximum likelihood estimates for the model with and without equation error. An empirical Bayes approach is applied to estimate the true covariate and predict the response. Simulation studies show that the proposed models provide reliable results and that the inference is not unduly affected by outliers or distribution misspecification. The method is also applied to real data on plant root decomposition.
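The empirical Bayes step can be illustrated with a minimal normal-theory sketch — a simplification to normal (rather than scale-mixture-of-skew-normal) errors, with illustrative names: with replicates W_ij = x_i + u_ij and subject-specific error variances estimated from the replicate spread, the posterior mean of x_i is a precision-weighted average of the prior mean and the subject mean.

```python
import numpy as np

def empirical_bayes_x(W, mu, tau2):
    """Empirical-Bayes estimate of the true covariate x_i from replicated
    measurements W[i, :], assuming x_i ~ N(mu, tau2) and W_ij = x_i + u_ij
    with heteroscedastic, subject-specific error variance sigma2_i
    (estimated here from the replicate spread)."""
    n, k = W.shape
    wbar = W.mean(axis=1)                    # subject-level mean
    sigma2 = W.var(axis=1, ddof=1)           # per-subject error variance
    # posterior mean: precision-weighted average of mu and wbar
    post_prec = 1.0 / tau2 + k / sigma2
    return (mu / tau2 + k * wbar / sigma2) / post_prec

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200)                    # true covariate
sig = rng.uniform(0.3, 1.5, size=200)                 # per-subject error sd
W = x[:, None] + rng.normal(0.0, 1.0, (200, 4)) * sig[:, None]
xhat = empirical_bayes_x(W, mu=0.0, tau2=1.0)
```

Subjects with noisier replicates are shrunk more strongly toward the prior mean, which is the point of letting the error variance vary across subjects.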
Models for dealing with survival data in the presence of a cured fraction of individuals have attracted the attention of many researchers and practitioners in recent years. In this paper, we propose a cure rate model under the competing risks scenario. For the number of causes that can lead to the event of interest, we assume the polylogarithm distribution. The model is flexible in the sense that it encompasses some well-known models, which can be tested using large-sample test statistics applied to nested models. Maximum likelihood estimation based on the EM algorithm and hypothesis testing are investigated. Results of simulation studies designed to gauge the performance of the estimation method and of two test statistics are reported. The methodology is applied to the analysis of a real data set.
In this paper, we propose a fast image denoising method based on discrete Markov random fields and the fast Fourier transform. The purpose of image denoising is to infer the original noiseless image from a noise-corrupted image. We consider the case where several noisy images are available for inferring the original image, and the Bayesian approach is adopted to obtain the posterior probability distribution of the denoised image. In the proposed method, the estimate of the denoised image is computed using belief propagation and an expectation-maximization (EM) algorithm. We numerically verify the performance of the proposed method using several standard images.
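A hedged sketch of the multi-image Bayesian setting: substituting a Gaussian MRF (smoothness) prior for the paper's discrete MRF makes the posterior mean available in closed form in the Fourier domain, so no belief propagation is needed — this shows where the FFT enters, not the paper's actual estimator.

```python
import numpy as np

def gmrf_denoise(stack, lam, sigma2):
    """Posterior-mean estimate of one image from K noisy copies under a
    Gaussian MRF prior; minimises sum_k ||y_k - x||^2 / sigma2 + lam ||grad x||^2,
    solved exactly in the Fourier domain."""
    K = stack.shape[0]
    ybar = stack.mean(axis=0)
    H, Wd = ybar.shape
    fy = np.fft.fftfreq(H)[:, None]
    fx = np.fft.fftfreq(Wd)[None, :]
    # eigenvalues of the periodic 2-D Laplacian (the MRF precision)
    lap = 4 - 2 * np.cos(2 * np.pi * fy) - 2 * np.cos(2 * np.pi * fx)
    Xhat = K * np.fft.fft2(ybar) / (K + lam * sigma2 * lap)
    return np.fft.ifft2(Xhat).real

rng = np.random.default_rng(1)
true = np.add.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 3, 64)))
stack = true[None] + rng.normal(0.0, 0.5, (8, 64, 64))   # 8 noisy copies
den = gmrf_denoise(stack, lam=2.0, sigma2=0.25)
```

Averaging the K copies already reduces the noise by a factor of K; the prior term then smooths what remains.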
Background: Modeling thousands of markers simultaneously has been of great interest for testing association between genetic biomarkers and disease or disease-related quantitative traits. Recently, an expectation-maximization (EM) approach to Bayesian variable selection (EMVS), facilitating the Bayesian computation, was developed for continuous or binary outcomes using a fast EM algorithm. However, it is not suitable for analyses of time-to-event outcomes in many public databases such as The Cancer Genome Atlas (TCGA).
Results: We extended EMVS to a high-dimensional parametric survival regression framework (SurvEMVS). A variant of the cyclic coordinate descent (CCD) algorithm was used for efficient iteration in the M-step, and the extended Bayesian information criterion (EBIC) was employed for hyperparameter tuning. We evaluated the performance of SurvEMVS using numerical simulations and illustrated its effectiveness on two real datasets. The results of the numerical simulations and the two real data analyses show that SurvEMVS performs well in terms of accuracy and computation. Some potential markers associated with survival of lung or stomach cancer were identified. These results suggest that our model is effective and can cope with high-dimensional omics data.
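The EBIC used here for hyperparameter tuning has a simple closed form, EBIC_γ = -2ℓ + k·log(n) + 2γ·k·log(p), where k is the number of selected markers and p the size of the candidate set. A minimal sketch (the function name and the γ = 0.5 default are illustrative assumptions):

```python
import math

def ebic(loglik, k, n, p, gamma=0.5):
    """Extended Bayesian information criterion: standard BIC plus an
    extra 2*gamma*k*log(p) penalty for searching a size-p candidate set;
    smaller is better."""
    return -2.0 * loglik + k * math.log(n) + 2.0 * gamma * k * math.log(p)
```

With k = 0 the criterion reduces to -2ℓ, and for fixed k the penalty grows with p, which is what guards against spurious selections in very high dimensions.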
The maximum tolerated dose (MTD) is commonly used for dose selection in oncology, higher doses being taken as the most effective treatment. Conventionally, doses for phase 2 are selected from an earlier-phase trial. However, quality of life may be compromised by the high doses of chemotherapeutic regimens in the early phase of the trial. An alternative mode of chemotherapy administration is Metronomic Chemotherapy (MC), in which very low doses are administered to avoid high toxicity. This work addresses the handling of missing data on circulating endothelial cells (CEC) and supports the optimal biological dose (OBD) for MC. It is performed with mimicked data, following a data simulation strategy implemented in R, which helps to identify a suitable technique for handling the missing data. The results indicate that MC is efficacious, improving Progression-Free Survival (PFS) and Overall Survival (OS) through controlled toxicity. The illustrated example can be extended to explore the impact of CEC on OS. This is a preliminary attempt to address some critical issues of MC.
The Conway-Maxwell-Poisson (COM-Poisson) distribution is useful for accounting for a cure proportion in survival data. For this model, two computational approaches for calculating maximum likelihood estimates have been developed in the literature: one based on the method in the gamlss R package, which employs the first-order derivatives of the log-likelihood, and the other based on the EM algorithm, which employs the complete-data likelihood. In this paper, we propose a robust version of the Newton-Raphson (NR) algorithm, where the robustness is introduced by random perturbations of the initial values and by log-transformations of the positive parameters. We provide the expressions for the derivatives of the log-likelihood under the Bernoulli cure model and computer codes for implementation. Since the NR algorithm employs the first- and second-order derivatives of the log-likelihood, it converges more quickly than the method of the gamlss R package. We also review the EM algorithms and compare the computational performance of the NR and EM algorithms via simulations. We also fit a novel dataset to the COM-Poisson cure model and discuss the consequences of applying the two algorithms.
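The two robustness devices — a log-transformation of a positive parameter and randomly perturbed starting values — can be sketched on a toy problem where the answer is known in closed form (a Poisson rate MLE, not the COM-Poisson cure model itself; the step-clipping guard is an added assumption for numerical safety):

```python
import math
import random

def robust_nr_poisson(y, n_starts=5, tol=1e-10, seed=0):
    """Newton-Raphson MLE of a Poisson rate lambda, working on
    theta = log(lambda) so every iterate stays in the positive domain,
    with several randomly perturbed starts; the run with the best
    log-likelihood is kept."""
    rng = random.Random(seed)
    n, S = len(y), sum(y)
    best_theta, best_ll = None, -math.inf
    for _ in range(n_starts):
        theta = math.log(max(S / n, 1e-8)) + rng.gauss(0.0, 1.0)  # perturbed start
        for _ in range(100):
            grad = S - n * math.exp(theta)      # d loglik / d theta
            hess = -n * math.exp(theta)         # second derivative (< 0)
            step = grad / hess
            step = max(-5.0, min(5.0, step))    # clip huge early steps
            theta -= step
            if abs(step) < tol:
                break
        ll = S * theta - n * math.exp(theta)    # loglik up to a constant
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return math.exp(best_theta)                 # back-transform to lambda

lam_hat = robust_nr_poisson([2, 3, 1, 4, 2, 3])
```

For this toy target the MLE is the sample mean (2.5), so every perturbed start should converge to the same optimum — which is exactly the behaviour the perturbations are meant to check.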
In this article we propose a multiple-inflation Poisson regression to model count response data containing excessive frequencies at more than one non-negative integer value. To handle multiple excessive count responses, we generalize the zero-inflated Poisson regression by replacing its binary regression with a multinomial regression, whereas Su et al. [Statist. Sinica 23 (2013) 1071-1090] proposed a multiple-inflation Poisson model for consecutive count responses with excessive frequencies. We derive several properties of the proposed model and carry out statistical inference under a fully Bayesian framework. We perform simulation studies and, using our methodology, analyze data on the number of infections collected in five major hospitals in Turkey.
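The mixture structure can be sketched as follows — a hypothetical parameterisation for illustration, in which each inflated count carries an extra point mass and the remaining probability follows a Poisson (the paper models these masses through a multinomial regression on covariates):

```python
import math

def mip_pmf(y, lam, inflate):
    """pmf of a multiple-inflation Poisson: `inflate` maps each inflated
    count v_j to its extra mass p_j, and the remaining mass
    1 - sum(p_j) follows a Poisson(lam)."""
    base = math.exp(-lam) * lam ** y / math.factorial(y)
    extra = inflate.get(y, 0.0)
    return extra + (1.0 - sum(inflate.values())) * base

# inflation at both 0 and 1, as in "more than one inflated value"
p0 = mip_pmf(0, lam=2.0, inflate={0: 0.2, 1: 0.1})
```

Setting `inflate={0: p}` recovers the ordinary zero-inflated Poisson, which is the special case the multinomial generalisation extends.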
Judgment post-stratification is used to supplement observations taken from finite mixture models with additional, easy-to-obtain rank information and to incorporate it in the estimation of model parameters. To do this, sampled units are post-stratified on ranks by randomly selecting comparison sets for each unit from the underlying population and assigning ranks to them using available auxiliary information or judgment ranking. This results in a set of independent order statistics from the underlying model, where the number of units in each rank class is random. We consider cases where one or more rankers with different ranking abilities are used to provide judgment ranks. The judgment ranks are then combined to produce a strength-of-agreement measure for each observation. This strength measure is incorporated into the maximum likelihood estimation of the model parameters via a suitable expectation-maximization algorithm. Simulation studies are conducted to evaluate the performance of the estimators with and without the extra rank information. The results are applied to bone mineral density data from the Third National Health and Nutrition Examination Survey to estimate the prevalence of osteoporosis in adult women aged 50 and over.
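The "without extra rank information" baseline in the comparison above is a plain EM fit of a finite mixture; a minimal two-component normal sketch (illustrative only — the rank-augmented E-step of the paper is not shown):

```python
import numpy as np

def em_mixture(x, iters=200):
    """Plain EM for a two-component normal mixture; the mixing weight pi
    plays the role of a prevalence estimate (as for osteoporosis in the
    application)."""
    x = np.asarray(x, float)
    pi, mu1, mu2 = 0.5, x.min(), x.max()
    s1 = s2 = x.std()
    for _ in range(iters):
        # E-step: responsibilities (normal constant cancels in the ratio)
        d1 = pi * np.exp(-0.5 * ((x - mu1) / s1) ** 2) / s1
        d2 = (1 - pi) * np.exp(-0.5 * ((x - mu2) / s2) ** 2) / s2
        r = d1 / (d1 + d2)
        # M-step: weighted updates
        pi = r.mean()
        mu1 = (r * x).sum() / r.sum()
        mu2 = ((1 - r) * x).sum() / (1 - r).sum()
        s1 = np.sqrt((r * (x - mu1) ** 2).sum() / r.sum())
        s2 = np.sqrt(((1 - r) * (x - mu2) ** 2).sum() / (1 - r).sum())
    return pi, (mu1, s1), (mu2, s2)

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 1.0, 700)])
pi, c1, c2 = em_mixture(x)
```

The rank information enters this same EM through reweighted responsibilities, which is where the strength-of-agreement measure is used.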
In the framework of cluster analysis based on Gaussian mixture models, it is usually assumed that all the variables provide information about the clustering of the sample units. Several variable selection procedures are available for detecting the structure of interest for the clustering when this structure is contained in a sub-vector of the variables. Currently, in these procedures a variable is assumed to play one of (up to) three roles: (1) informative; (2) uninformative and correlated with some informative variables; (3) uninformative and uncorrelated with any informative variable. A more general approach to modelling the role of a variable is proposed, which takes into account the possibility that the variable vector provides information about more than one structure of interest for the clustering. This approach is developed by assuming that such information is given by non-overlapping and possibly correlated sub-vectors of variables; it is also assumed that the model for the variable vector equals a product of conditionally independent Gaussian mixture models (one for each variable sub-vector). Details about model identifiability, parameter estimation and model selection are provided. The usefulness and effectiveness of the described methodology are illustrated using simulated and real datasets.
Bond rating Transition Probability Matrices (TPMs) are built over a one-year time frame, but for many practical purposes, such as the assessment of risk in portfolios or the computation of banking Capital Requirements (e.g. under the new IFRS 9 regulation), one needs to compute the TPM and probabilities of default over a shorter time interval. In the context of continuous-time Markov chains (CTMCs), several deterministic and statistical algorithms have been proposed to estimate the generator matrix. For this estimation we focus on the Expectation-Maximization (EM) algorithm of Bladt and Sørensen [J. R. Stat. Soc. Ser. B (Stat. Methodol.), 2005, 67, 395-410] for a CTMC with an absorbing state. This work's contribution is threefold. Firstly, we provide directly computable closed-form expressions for quantities appearing in the EM algorithm and the associated information matrix, allowing easy approximation of confidence intervals; previously, these quantities had to be estimated numerically, and the closed forms yield considerable computational speedups. Secondly, we prove convergence to a single set of parameters under very weak conditions (for the TPM problem). Finally, we benchmark our results numerically against other known algorithms, in particular on several problems related to credit risk. The EM algorithm we propose, equipped with the new formulas (and error criteria), outperforms the other algorithms in several metrics, in particular with much less overestimation of probabilities of default in the higher ratings than other statistical algorithms.
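The sub-annual step itself is straightforward once a generator estimate is available: the TPM over horizon t is exp(tQ). A minimal sketch with a hypothetical 3-state generator (two ratings plus an absorbing default state; the matrix exponential is a plain scaling-and-squaring routine, not the paper's EM estimation of Q):

```python
import numpy as np

def expm(A, squarings=20, terms=20):
    """Matrix exponential via scaling and squaring with a truncated
    Taylor series (adequate for small generator matrices)."""
    B = A / 2.0 ** squarings
    E, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ B / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

# hypothetical generator: rows sum to 0, last state (default) absorbing
Q = np.array([[-0.11,  0.10, 0.01],
              [ 0.05, -0.25, 0.20],
              [ 0.00,  0.00, 0.00]])
P_year = expm(Q)           # one-year TPM
P_month = expm(Q / 12.0)   # one-month TPM: the sub-annual matrix
```

Consistency holds by construction: compounding the one-month matrix twelve times recovers the one-year TPM, since exp(Q/12)^12 = exp(Q).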