We propose algorithms for approximate filtering and smoothing in high-dimensional Factorial hidden Markov models. The approximation involves discarding, in a principled way, likelihood factors according to a notion of locality in a factor graph associated with the emission distribution. This allows the exponential-in-dimension cost of exact filtering and smoothing to be avoided. We prove that the approximation accuracy, measured in a local total variation norm, is "dimension-free" in the sense that as the overall dimension of the model increases the error bounds we derive do not necessarily degrade. A key step in the analysis is to quantify the error introduced by localizing the likelihood function in a Bayes' rule update. The factorial structure of the likelihood function which we exploit arises naturally when data have known spatial or network structure. We demonstrate the new algorithms on synthetic examples and a London Underground passenger flow problem, where the factor graph is effectively given by the train network.
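The Bayes-rule localization step can be illustrated with a minimal, self-contained sketch. The toy model below (K binary chains, Gaussian emission factors tied to small neighbourhoods, a crude plug-in for the other chains, radius r) is an assumption made for illustration and is not the paper's construction; it only shows the shape of a localized filter update.

```python
import numpy as np

# Toy sketch (not the paper's algorithm): a factorial HMM with K binary chains
# in which emission factor j observes the sum of chain states in the
# neighbourhood {j-1, j, j+1} plus Gaussian noise.  We keep a product of
# per-chain filtering marginals and, when updating chain k, apply Bayes' rule
# using only the likelihood factors within graph distance r of k (the
# "localization" idea described above), plugging in current means for the
# other chains.  All names, radii, and distributions are assumptions.

K, r = 10, 1
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.2, 0.8]])      # shared per-chain transition matrix
filt = np.full((K, 2), 0.5)                 # current per-chain filter marginals
x_true = rng.integers(0, 2, K)
y = np.array([x_true[max(0, j - 1):j + 2].sum() for j in range(K)]) \
    + rng.normal(0, 0.5, K)                 # one observation vector

pred = filt @ A                             # prediction step, chain by chain
means = pred[:, 1]                          # predictive mean of each binary chain
new_filt = np.empty_like(pred)
for k in range(K):
    lik = np.ones(2)
    for j in range(max(0, k - r), min(K, k + r + 1)):   # only local factors
        nb = list(range(max(0, j - 1), min(K, j + 2)))
        others = sum(means[i] for i in nb if i != k)
        for xk in (0, 1):                   # factor j evaluated at chain k's states
            lik[xk] *= np.exp(-0.5 * ((y[j] - (others + xk)) / 0.5) ** 2)
    post = pred[k] * lik
    new_filt[k] = post / post.sum()

print(new_filt.round(3))
```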
Social network services (SNSs) such as Twitter and Facebook have emerged as a new medium for communication. They offer a unique mechanism of sharing information by allowing users to receive all messages posted by those whom they "follow". As information in today's SNSs often spreads in the form of hashtags, detecting rapidly spreading hashtags in SNSs has recently attracted much attention. In this paper, we propose realistic epidemic models to describe the probabilistic process of hashtag propagation. Our models take into account the way users communicate in SNSs; moreover, the models consider the influence of external media and separate it from internal diffusion within networks. Based on the proposed models, we develop efficient inference algorithms that measure the propagation rates of hashtags in social networks. With real-life social network data including hashtags and synthetic data obtained by simulating information diffusion, we show that the proposed algorithms find fast-spreading hashtags more accurately than existing algorithms. Moreover, our in-depth case study demonstrates that our algorithms correctly find internal diffusion rates of hashtags as well as external media influences.
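A minimal sketch of the internal-plus-external diffusion idea, under assumed parameter names (mu for the external-media rate, beta for the per-followee internal rate) and a random follower graph, is given below; the grid-search fit stands in for the paper's inference algorithms.

```python
import numpy as np

# Toy sketch (not the paper's model): in each time step a non-adopting user
# picks up a hashtag with probability 1 - exp(-(mu + beta * k)), where k is
# the number of adopting users they follow (internal diffusion) and mu is an
# external-media rate.  Parameters and the graph are illustrative assumptions.

rng = np.random.default_rng(1)
n, T = 200, 30
mu_true, beta_true = 0.01, 0.05
follows = rng.random((n, n)) < 0.03            # follows[i, j]: user i follows j

adopted = np.zeros(n, dtype=bool)
ks, ys = [], []                                # at-risk records: exposure, adopted?
for _ in range(T):
    k = (follows & adopted[None, :]).sum(axis=1)   # adopting followees per user
    p = 1 - np.exp(-(mu_true + beta_true * k))     # internal + external hazard
    new = (~adopted) & (rng.random(n) < p)
    ks.extend(k[~adopted])
    ys.extend(new[~adopted])
    adopted |= new

ks, ys = np.array(ks, float), np.array(ys, float)

def loglik(mu, beta):                          # Bernoulli log-likelihood of the records
    x = mu + beta * ks
    return np.sum(ys * np.log1p(-np.exp(-x)) + (1 - ys) * (-x))

grid = np.linspace(1e-3, 0.2, 60)              # crude grid-search MLE of (mu, beta)
mu_hat, beta_hat = max(((m, b) for m in grid for b in grid),
                       key=lambda mb: loglik(*mb))
print(f"estimated mu={mu_hat:.3f}, beta={beta_hat:.3f}")
```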
Dynamical reliability assessment and failure prediction are effective tools for ensuring the efficiency, availability, and safety of repairable systems. To achieve better assessment performance, accurately modeling failure recurrence data is at the core of prediction approaches. However, because of the uncertainties arising from environmental conditions and repair activities, the failure counting model is usually not well established. To solve this problem, in this paper, we propose an adaptive recursive-filter-based dynamical failure prediction approach for complex repairable systems. First, based on the framework of the state space model, a fusion model that fuses Brownian motion into a nonhomogeneous Poisson process is proposed to characterize the failure process under multiple uncertainty conditions. Then, an adaptive statistical inference method based on a Bayesian recursive filter and the EM algorithm is derived to update the model parameters and estimate the initial states adaptively. To verify the effectiveness of the proposed approach, it is applied to a real reliability prediction problem for gas pipeline compressors.
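A heavily simplified sketch of the recursive-filter ingredient is shown below: a latent intensity that drifts like Brownian motion is tracked with a standard Kalman update. The Gaussian observation model is a stand-in for the nonhomogeneous Poisson fusion model, and the EM re-estimation of the noise parameters is omitted.

```python
import numpy as np

# Minimal sketch of a Bayesian recursive (Kalman-style) filter for a latent
# failure intensity that drifts like Brownian motion.  All variances below
# are assumed values, and the observation model is a Gaussian stand-in.

rng = np.random.default_rng(2)
T, q, r = 50, 0.05, 0.5                         # steps, process noise, obs noise
truth = np.cumsum(rng.normal(0, np.sqrt(q), T)) + 1.0   # latent intensity path
obs = truth + rng.normal(0, np.sqrt(r), T)              # noisy observations

m, p = 0.0, 1.0                                 # prior mean / variance of the state
for y in obs:
    p_pred = p + q                              # predict: Brownian-motion transition
    gain = p_pred / (p_pred + r)                # Kalman gain
    m = m + gain * (y - m)                      # measurement update
    p = (1 - gain) * p_pred

print(f"filtered intensity at final step: {m:.3f} (truth {truth[-1]:.3f})")
```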
Image segmentation is a fundamental research topic in image processing and computer vision. In recent decades, researchers have developed a large number of segmentation algorithms for various applications. Among these algorithms, the normalized cut (Ncut) segmentation method is widely applied due to its good performance. The Ncut segmentation model is an optimization problem whose energy is defined on a specifically designed graph. Thus, the segmentation results of the existing Ncut method depend largely on a preconstructed similarity measure on the graph, since this measure is usually given empirically by users. This flaw can lead to undesirable segmentation results. In this paper, we propose an Ncut-based segmentation algorithm by integrating an adaptive similarity measure and spatial regularization. The proposed model combines the Parzen-Rosenblatt window method, nonlocal weights entropy, Ncut energy, and a phase-field regularizer in a variational framework. Our method can adaptively update the similarity measure function by estimating some parameters. This adaptive procedure enables the proposed algorithm to find a better similarity measure for classification than the Ncut method. We provide mathematical interpretations of the proposed adaptive similarity from multiple viewpoints, such as statistics and convex optimization. In addition, the phase-field regularizer guarantees that the proposed algorithm is robust in the presence of noise, and it can also rectify the similarity measure with a spatial prior. Well-posedness results, such as the existence of a minimizer for the proposed model, are given in the paper. Compared with some existing segmentation methods, such as the traditional Ncut-based model and the classical Chan-Vese model, the numerical experiments show that our method can provide promising segmentation results.
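The underlying Ncut step can be sketched in a few lines: a two-way partition is read off the second-smallest generalized eigenvector of (D - W)y = λDy. The fixed-bandwidth Gaussian similarity below is an assumption for illustration; the method above instead learns the similarity adaptively and adds the phase-field regularizer.

```python
import numpy as np

# Minimal two-way normalized-cut sketch on a small point cloud: build a
# fixed-bandwidth similarity matrix W, solve the Ncut relaxation via the
# symmetric normalized Laplacian, and threshold the resulting eigenvector.

rng = np.random.default_rng(3)
pts = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / 0.5)                           # fixed-bandwidth Gaussian similarity
deg = W.sum(axis=1)
d_isqrt = np.diag(1.0 / np.sqrt(deg))
L_sym = d_isqrt @ (np.diag(deg) - W) @ d_isqrt  # symmetric normalized Laplacian
vals, vecs = np.linalg.eigh(L_sym)
fiedler = d_isqrt @ vecs[:, 1]                  # generalized eigenvector of (D-W)y = lam*Dy
labels = (fiedler > np.median(fiedler)).astype(int)
print(labels)
```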
In the study of multiple failure time data with recurrent clinical endpoints, the classical independent censoring assumption in survival analysis can be violated when the evolution of the recurrent events is correlated with a censoring mechanism such as death. Moreover, in some situations, a cure fraction appears in the data because a tangible proportion of the study population benefits from treatment and becomes recurrence-free and insusceptible to death related to the disease. A bivariate joint frailty mixture cure model is proposed to allow for dependent censoring and a cure fraction in recurrent event data. The latency part of the model consists of two intensity functions for the hazard rates of recurrent events and death, wherein a bivariate frailty is introduced by means of the generalized linear mixed model methodology to adjust for dependent censoring. The model allows covariates and frailties in both the incidence and the latency parts, and it further accounts for the possibility of cure after each recurrence. It includes the joint frailty model and other related models as special cases. An expectation-maximization (EM)-type algorithm is developed to provide residual maximum likelihood estimation of model parameters. Through simulation studies, the performance of the model is investigated under different magnitudes of dependent censoring and cure rate. The model is applied to data sets from two colorectal cancer studies to illustrate its practical value.
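For intuition, here is an EM sketch for a much simpler mixture cure model (exponential latency, constant uncured probability, independent censoring, no frailty); it is a toy stand-in, not the bivariate joint frailty model, and all parameter names and values are assumptions.

```python
import numpy as np

# Toy mixture cure model: with probability pi a subject is uncured and has an
# exponential(lam) event time; cured subjects never experience the event.
# The E-step computes the posterior probability of being uncured; the M-step
# updates pi and lam from the weighted data.

rng = np.random.default_rng(4)
n, pi_true, lam_true = 500, 0.6, 0.4
uncured = rng.random(n) < pi_true
event_t = rng.exponential(1 / lam_true, n)
cens_t = rng.exponential(2.0, n)
t = np.where(uncured, np.minimum(event_t, cens_t), cens_t)   # cured: always censored
d = uncured & (event_t <= cens_t)                            # event indicator

pi, lam = 0.5, 1.0
for _ in range(200):
    # E-step: posterior probability of being uncured (1 for observed events)
    s = np.exp(-lam * t)
    w = np.where(d, 1.0, pi * s / ((1 - pi) + pi * s))
    # M-step: update the uncured fraction and the exponential event rate
    pi = w.mean()
    lam = (w * d).sum() / (w * t).sum()

print(f"pi = {pi:.3f} (true {pi_true}), lam = {lam:.3f} (true {lam_true})")
```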
In this article, we consider step-stress accelerated life testing (SSALT) models assuming that the time-to-event distribution belongs to the proportional hazard family and the underlying population consists of long-term survivors. Further, since the mean time to the event of interest naturally shortens as stress levels increase, a method of obtaining order-restricted maximum likelihood estimators (MLEs) of the model parameters is proposed based on the expectation-maximization (EM) algorithm coupled with a reparametrization technique. To illustrate the effectiveness of the proposed method, extensive simulation experiments are performed and a real-life data example is analyzed in detail.
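The reparametrization trick for the order restriction can be sketched as follows: with exponential lifetimes at two stress levels, writing mu1 = exp(a) and mu2 = exp(a) * exp(-exp(b)) enforces mu2 < mu1 automatically, so an unconstrained optimizer can be used. The complete-data, two-level setup below is a simplified assumption, not the full SSALT model with long-term survivors.

```python
import numpy as np
from scipy.optimize import minimize

# Order-restricted MLE via reparametrization: the exponential means satisfy
# mu2 < mu1 by construction, so no explicit constraint is needed.

rng = np.random.default_rng(5)
x1 = rng.exponential(5.0, 40)        # lifetimes at stress level 1 (larger mean)
x2 = rng.exponential(2.0, 40)        # lifetimes at stress level 2 (smaller mean)

def neg_loglik(theta):
    a, b = theta
    mu1 = np.exp(a)
    mu2 = mu1 * np.exp(-np.exp(b))   # reparametrization guarantees mu2 < mu1
    ll = -len(x1) * np.log(mu1) - x1.sum() / mu1
    ll += -len(x2) * np.log(mu2) - x2.sum() / mu2
    return -ll

res = minimize(neg_loglik, x0=[1.0, 0.0], method="Nelder-Mead")
a, b = res.x
print("mu1 =", round(np.exp(a), 3), "mu2 =", round(np.exp(a) * np.exp(-np.exp(b)), 3))
```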
Recently, the progressive censoring scheme has been extended to two or more populations. In this article we consider the joint Type-II progressive censoring (JPC) scheme for two populations when the lifetime distributions of the experimental units of the two populations follow two-parameter generalized exponential distributions with the same scale parameter but different shape parameters. The maximum likelihood estimators of the unknown parameters cannot be obtained in explicit form. We propose to use the expectation-maximization (EM) algorithm to compute the maximum likelihood estimators. The observed information matrix based on the missing value principle is derived. We study Bayesian inference of the unknown parameters based on a beta-gamma prior for the shape parameters and an independent gamma prior for the common scale parameter. The Bayes estimators with respect to the squared error loss function cannot be obtained in explicit form. We propose to use the importance sampling technique to compute the Bayes estimates and the associated credible intervals of the unknown parameters. Extensive simulation experiments have been performed to study the performance of the different methods. Finally, a real data set is analyzed for illustrative purposes.
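The importance-sampling step can be illustrated generically: draw from a tractable proposal, weight by the unnormalized posterior over the proposal density, and form self-normalized estimates and credible intervals. The exponential likelihood and gamma prior/proposal below are illustrative assumptions, not the generalized exponential JPC posterior.

```python
import numpy as np

# Self-normalized importance sampling for a posterior mean and credible
# interval of an exponential rate with a Gamma(2, 1) prior (toy setup).

rng = np.random.default_rng(6)
data = rng.exponential(1 / 1.5, 30)              # observations with true rate 1.5

def log_post(lam):
    # unnormalized log posterior: Gamma(2, 1) prior times exponential likelihood
    return (2 - 1) * np.log(lam) - lam + len(data) * np.log(lam) - lam * data.sum()

draws = rng.gamma(2.0, 1.0, 20000)               # Gamma(2, rate 1) proposal draws
log_q = (2.0 - 1) * np.log(draws) - draws        # proposal log density (no constant)
log_w = log_post(draws) - log_q
w = np.exp(log_w - log_w.max())                  # stabilized importance weights
w /= w.sum()

post_mean = np.sum(w * draws)
order = np.argsort(draws)
cdf = np.cumsum(w[order])
lo, hi = draws[order][np.searchsorted(cdf, [0.025, 0.975])]
print(f"posterior mean {post_mean:.3f}, 95% credible interval ({lo:.3f}, {hi:.3f})")
```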
Recent advances in sequencing and genotyping technologies are contributing to a data revolution in genome-wide association studies that is characterized by the challenging large p small n problem in statistics. That is, given these advances, many such studies now consider evaluating an extremely large number of genetic markers (p) genotyped on a small number of subjects (n). Given the dimension of the data, a joint analysis of the markers is often fraught with many challenges, while a marginal analysis is not sufficient. To overcome these obstacles, herein, we propose a Bayesian two-phase methodology that can be used to jointly relate genetic markers to binary traits while controlling for confounding. The first phase of our approach makes use of a marginal scan to identify a reduced set of candidate markers that are then evaluated jointly via a hierarchical model in the second phase. Final marker selection is accomplished through identifying a sparse estimator via a novel and computationally efficient maximum a posteriori estimation technique. We evaluate the performance of the proposed approach through extensive numerical studies, and consider a genome-wide application involving colorectal cancer.
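A minimal screen-then-select sketch in the spirit of the two-phase idea is given below: phase 1 ranks markers by a marginal association score, phase 2 fits a joint sparse model on the survivors. The L1-penalized logistic fit is a generic stand-in for the hierarchical MAP estimator, and the simulated genotypes and cutoffs are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Two-phase sketch: marginal scan to shortlist markers, then a joint sparse
# fit on the shortlist (L1 logistic regression as a MAP stand-in).

rng = np.random.default_rng(7)
n, p = 150, 2000
X = rng.binomial(2, 0.3, (n, p)).astype(float)         # genotype dosages 0/1/2
logits = X[:, :5].sum(axis=1) - 3.0                     # first 5 markers are causal
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

# phase 1: marginal scan -- rank markers by |correlation| with the trait
Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
yc = (y - y.mean()) / (y.std() + 1e-12)
score = np.abs(Xc.T @ yc) / n
candidates = np.sort(np.argsort(score)[-50:])           # keep 50 candidate markers

# phase 2: joint sparse fit on the candidates
fit = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
fit.fit(X[:, candidates], y)
selected = candidates[np.abs(fit.coef_[0]) > 1e-8]
print("selected markers:", selected)
```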
In this article, we propose two classes of semiparametric mixture regression models with a single index for model-based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low-dimensional predictors, the new semiparametric models can easily incorporate high-dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models are indeed special cases of the new models. Backfitting estimates and the corresponding modified EM algorithms are proposed to achieve optimal convergence rates for both parametric and nonparametric parts. We establish the identifiability results of the proposed two models and investigate the asymptotic properties of the proposed estimation procedures. Simulation studies are conducted to demonstrate the finite sample performance of the proposed models. Two real data applications using the new models reveal some interesting findings.
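As a simplified illustration, the sketch below runs EM for a two-component Gaussian mixture of linear regressions, the fully parametric special case of the models above; the single-index and nonparametric components, and the backfitting between them, are omitted.

```python
import numpy as np

# EM for a two-component mixture of linear regressions with a shared noise
# scale.  Starting values and component labels are arbitrary assumptions.

rng = np.random.default_rng(8)
n = 400
x = np.column_stack([np.ones(n), rng.uniform(-2, 2, n)])   # intercept + predictor
z = rng.random(n) < 0.5
y = np.where(z, x @ [1.0, 2.0], x @ [-1.0, -1.5]) + rng.normal(0, 0.3, n)

pi, b1, b2, sig = 0.5, np.zeros(2), np.ones(2), 1.0
for _ in range(100):
    # E-step: responsibilities under the two regression lines
    d1 = np.exp(-0.5 * ((y - x @ b1) / sig) ** 2)
    d2 = np.exp(-0.5 * ((y - x @ b2) / sig) ** 2)
    r = pi * d1 / (pi * d1 + (1 - pi) * d2 + 1e-300)
    # M-step: weighted least squares per component, shared noise scale
    b1 = np.linalg.solve(x.T @ (r[:, None] * x), x.T @ (r * y))
    b2 = np.linalg.solve(x.T @ ((1 - r)[:, None] * x), x.T @ ((1 - r) * y))
    pi = r.mean()
    resid = r * (y - x @ b1) ** 2 + (1 - r) * (y - x @ b2) ** 2
    sig = np.sqrt(resid.mean())

print("pi:", round(pi, 3), "beta1:", b1.round(2), "beta2:", b2.round(2))
```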
In this paper, a new flexible approach to modeling data with multiple partial right-censoring points is proposed. This method is based on finite mixture models, a flexible tool for modeling heterogeneity in data. A general framework to accommodate partial censoring is considered. In this setting, it is assumed that a certain portion of data points are censored and the rest are not. This situation occurs in many insurance loss data sets. A novel probability function is proposed to be used as a mixture component, and the expectation-maximization algorithm is employed for estimating model parameters. The Bayesian information criterion is used for model selection. Additionally, an approach for assessing the variability of parameter estimates as well as computing quantiles commonly known as risk measures is considered. The proposed model is evaluated in a simulation study based on four common probability distribution functions used to model right-skewed loss data, and it is applied to a real data set with good results.
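A minimal EM sketch for a two-component exponential mixture with a portion of right-censored observations is shown below; censored points contribute survival functions in the E-step and exposures in the M-step. The censoring point, component distributions, and starting values are assumptions, and the BIC-based model selection is not reproduced.

```python
import numpy as np

# EM for a two-component exponential mixture where some observations are
# right-censored at a known point (a toy version of partial censoring).

rng = np.random.default_rng(9)
n, cpoint = 1000, 4.0
comp = rng.random(n) < 0.4
raw = np.where(comp, rng.exponential(1.0, n), rng.exponential(5.0, n))
t = np.minimum(raw, cpoint)                 # observed times, censored at cpoint
d = raw <= cpoint                           # True = fully observed, False = censored

pi, lam = 0.5, np.array([0.5, 2.0])
for _ in range(300):
    # E-step: density for events, survival function for censored points
    like = np.where(d[:, None],
                    lam * np.exp(-lam * t[:, None]),
                    np.exp(-lam * t[:, None]))
    num = np.column_stack([pi * like[:, 0], (1 - pi) * like[:, 1]])
    r = num / num.sum(axis=1, keepdims=True)
    # M-step: weighted event counts over weighted exposures, mixing weight
    pi = r[:, 0].mean()
    lam = (r * d[:, None]).sum(axis=0) / (r * t[:, None]).sum(axis=0)

print("pi:", round(pi, 3), "rates:", lam.round(3))
```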