Mixtures of regression models are among the most important statistical tools for analyzing data from a heterogeneous population. Just as the variance parameter is modeled in a homogeneous population, we apply the idea of joint mean and variance models to mixtures of regression models and propose a new class of models, mixtures of joint mean and variance models, for analyzing heteroscedastic normal data from a heterogeneous population. The problem of variable selection for the proposed models is considered. In particular, a modified expectation-maximization (EM) algorithm for estimating the model parameters is developed. The consistency and the oracle property of the penalized estimators are established. Properties of the estimators of the regression coefficients are evaluated through Monte Carlo simulations. Finally, the proposed methodology is illustrated with a real data analysis.
In some situations, the failure time of interest is defined as the gap time between two related events, and the observations on both event times can suffer either right or interval *** data are usually referred to as doubly censored data and frequently encountered in many clinical and observational ***, there may also exist a cured subgroup in the whole population, which means that not every individual under study will experience the failure time of interest *** this paper, we consider regression analysis of doubly censored data with a cured subgroup under a wide class of flexible transformation cure ***, we consider marginal likelihood estimation and develop a two-step approach by combining multiple imputation and a new expectation-maximization (EM) algorithm for its *** resulting estimators are shown to be consistent and asymptotically *** finite sample performance of the proposed method is investigated through simulation *** proposed method is also applied to a real dataset arising from an AIDS cohort study for illustration.
This study addresses the identification of switched linear systems (SLSs) with Box-Jenkins models from input and output data. Identification is difficult because the process involves an unknown switching signal, unknown intermediate variables, and colored noise terms. To address these issues, the proposed identification method proceeds in two stages: estimating the switching signal of the SLS and identifying the parameters of all subsystems. First, a Gaussian mixture model is established to represent the distribution of the input and output data of the SLS. Then, the posterior probability is calculated by the expectation-maximization (EM) algorithm and the naive Bayes classifier, and the switching signal is estimated according to the maximum probability criterion. Next, the auxiliary model based multi-innovation generalized extended least squares (AM-MI-GELS) algorithm is used to estimate the parameters of all subsystems. Finally, the effectiveness of the proposed method is verified through a simulation example.
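The first stage described above (fit a Gaussian mixture by EM, then assign each sample to the mode with the largest posterior probability) can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the paper's procedure: the data are scalar rather than input-output vectors, there are exactly two modes, the naive Bayes classifier is not included, and the names `fit_gmm_em`, `map_labels`, and `simulate_two_modes` are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances, responsibilities), where the
    responsibilities come from the last E-step that was run.
    """
    n = len(data)
    # Crude initialisation (an assumption): means at the data extremes,
    # both variances equal to the overall sample variance.
    means = [min(data), max(data)]
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    variances = [spread, spread]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([j / total for j in joint])
        # M-step: responsibility-weighted updates of the parameters.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # keep the variance strictly positive
    return weights, means, variances, resp


def map_labels(resp):
    """Maximum-probability rule: pick the component with the largest posterior."""
    return [0 if r[0] >= r[1] else 1 for r in resp]


def simulate_two_modes(n_per_mode=200, seed=0):
    """Toy data: n_per_mode draws from N(-2, 0.5^2), then from N(3, 0.5^2)."""
    rng = random.Random(seed)
    low = [rng.gauss(-2.0, 0.5) for _ in range(n_per_mode)]
    high = [rng.gauss(3.0, 0.5) for _ in range(n_per_mode)]
    return low + high
```

Calling `fit_gmm_em(simulate_two_modes())` and passing the returned responsibilities to `map_labels` gives one estimated mode index per sample, which plays the role of the estimated switching signal in this toy setting.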
Identifying disease-associated changes in DNA methylation can help us gain a better understanding of disease etiology. Bisulfite sequencing allows the generation of high-throughput methylation profiles at single-base resolution of DNA. However, optimally modeling and analyzing these sparse and discrete sequencing data is still very challenging due to variable read depth, missing data patterns, long-range correlations, data errors, and confounding from cell type mixtures. We propose a regression-based hierarchical model that allows covariate effects to vary smoothly along genomic positions, and we have built a specialized EM algorithm, which explicitly allows for experimental errors and cell type mixtures, to make inference about smooth covariate effects in the model. Simulations show that the proposed method provides accurate estimates of covariate effects and captures the major underlying methylation patterns with excellent power. We also apply our method to analyze data from rheumatoid arthritis patients and controls. The method has been implemented in the R package SOMNiBUS.
Population genetic theory has been well developed for diploid species, but its extension to study genetic diversity, variation and evolution in autopolyploids, a class of polyploids derived from the genome doubling of a single ancestral species, requires the incorporation of multisomic inheritance. Double reduction, which is characteristic of autopolyploidy, has long been believed to shape the evolutionary consequences of organisms in changing environments. Here, we develop a computational model for testing and estimating double reduction and its genomic distribution in autotetraploids. The model is implemented with the expectation-maximization (EM) algorithm to dissect unobservable allelic recombinations among multiple chromosomes, enabling the simultaneous estimation of allele frequencies and double reduction in natural populations. The framework fills an important gap in the population genetic theory of autopolyploids.
This paper investigates a data-generating structure that can be represented as a mixture of single-index panel data models with heterogeneous link functions. Switching between the states is governed by a hidden variable. We also offer an expectation-maximization (EM) algorithm for estimating the parameters numerically. The performance of the proposed mixture model is illustrated with both simulations and empirical applications.
Binarizing unevenly lit images with thresholding techniques is still a challenging task. Adaptive thresholding methods are the most widely adopted approach for such images. However, the efficacy of these methods is highly sensitive to the criterion function used to measure the bimodality of the gray-level distribution of a local region. In this paper, we propose adaptive thresholding based on a Gaussian mixture model (GMM) for binarizing unevenly lit images. The proposed GMM-based criterion function efficiently partitions an unevenly lit image into bimodal and unimodal subimages with a low uneven-lighting effect. First, the bimodal subimages are binarized using Otsu's thresholding approach; the unimodal subimages are then thresholded using bilinear interpolation of the thresholds of neighbouring bimodal subimages. Next, a fast expectation-maximization (EM) algorithm is developed to reduce the computational complexity of the GMM. Experimental results on different unevenly lit images demonstrate that the proposed adaptive thresholding outperforms the other considered methods, with an average misclassification error of 1.68% and an average computation time of 1.50 seconds. The computation time can be further reduced by special-purpose hardware and parallel processing of the subimages for real-time applications.
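Of the steps named above, Otsu's thresholding of a single bimodal subimage is the easiest to illustrate in isolation. The sketch below is a textbook version of Otsu's method and assumes 8-bit gray levels supplied as a flat list of integers; it does not cover the GMM-based criterion function, the bilinear interpolation for unimodal subimages, or the fast EM step, and the function names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form the dark class, the rest the bright class.
    """
    # Gray-level histogram of the subimage.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue        # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break           # bright class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 (bright) if it exceeds the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

For a subimage with two well-separated gray-level clusters, `binarize(pixels, otsu_threshold(pixels))` separates the dark pixels from the bright ones. In an adaptive scheme this would be applied per subimage rather than to the whole image.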
In the complex market environment, it is difficult for enterprises to innovate only by their own internal product resources. In order to improve the performance level of new product development, this paper puts forwar...
Mixed models play an important role in describing data in various fields, and selecting the most appropriate mixed model is accordingly an appealing topic in the model selection literature. To this end, we propose a procedure that jointly selects the fixed and random effects by applying the adaptive Lasso (Zou 2006) penalty with cross-validation. Cross-validation effectively lowers the risk of selecting overfitted models. The data are divided into training and test sets: the training set is used to construct candidate models, and the test set is used to choose the most appropriate mixed model. To improve computational efficiency in estimating and selecting mixed models, we adopt the EM algorithm to optimize the penalized likelihood. Theoretical results establish that the proposed approach possesses the consistency and oracle properties. Simulations and a real data example are provided to demonstrate the validity of the procedure.
In this paper, we propose a novel hierarchical Bayesian model and an efficient estimation method for the problem of joint estimation of multiple graphical models, which have similar but different sparsity structures and signal strengths. Our proposed hierarchical Bayesian model is well suited for sharing sparsity structures, and our procedure, called GemBag, is shown to enjoy optimal theoretical properties in terms of l-infinity norm estimation accuracy and correct recovery of the graphical structure, even when some of the signals are weak. Although obtaining our proposed estimator requires optimizing the posterior distribution, which is a non-convex problem, we show that the problem is convex in a large constrained space, which facilitates the use of computationally efficient algorithms. Through extensive simulation studies and an application to a bike sharing data set, we demonstrate that the proposed GemBag procedure has strong empirical performance in comparison with alternative methods.