We develop a new extension to the mean-field approximation for inference in graphical models that has advantages over previously proposed approximation schemes. The method is economical in its use of variational parameters, and the approximating conditional distribution can be specified with direct reference to the dependence structure of the variables in the graphical model. We apply the method to sigmoid belief networks.
Multiple outcomes are often used to properly characterize an effect of interest. This paper proposes a latent variable model for the situation where repeated measures over time are obtained on each outcome. These outcomes are assumed to measure an underlying quantity of main interest from different perspectives. We relate the observed outcomes using regression models to a latent variable, which is then modeled as a function of covariates by a separate regression model. Random effects are used to model the correlation due to repeated measures of the observed outcomes and the latent variable. An EM algorithm is developed to obtain maximum likelihood estimates of model parameters. Unit-specific predictions of the latent variables are also calculated. This method is illustrated using data from a national panel study on changes in methadone treatment practices.
A receiver operating characteristic (ROC) curve is commonly used to measure the accuracy of a medical test. It is a plot of the true positive fraction (sensitivity) against the false positive fraction (1 - specificity) for an increasingly stringent positivity criterion. Bias can occur in estimation of an ROC curve if only some of the tested patients are selected for disease verification and if analysis is restricted only to the verified cases. This bias is known as verification bias. In this paper, we address the problem of correcting for verification bias in estimation of an ROC curve when the verification process and efficacy of the diagnostic test depend on covariates. Our method applies the EM algorithm to ordinal regression models to derive ML estimates for ROC curves as a function of covariates, adjusted for covariates affecting the likelihood of being verified. Asymptotic variance estimates are obtained using the observed information matrix of the observed data. These estimates are derived under the missing-at-random assumption, which means that selection for disease verification depends only on the observed data, i.e., the test result and the observed covariates. We also address the issues of model selection and model checking. Finally, we illustrate the proposed method on data from a two-phase study of dementia disorders, where selection for verification depends on the screening test result and age.
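To make the definition of the ROC curve concrete, the following minimal sketch computes the empirical (false positive fraction, true positive fraction) pairs for each positivity threshold of an ordinal test. It is the naive estimator that uses only cases with known disease status, so it is exactly the kind of estimate that verification bias can distort; it does not implement the EM-based correction described in the abstract. The function name `roc_points` and the toy data are hypothetical.

```python
def roc_points(scores, diseased):
    """Return one (FPF, TPF) pair per threshold, from most lenient to most stringent.

    scores   -- ordinal test results, higher meaning more suspicious
    diseased -- 1 if the subject truly has the disease, 0 otherwise
    Assumes at least one diseased and one non-diseased subject.
    """
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        # A result counts as positive when the score is at or above the threshold.
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= threshold)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


if __name__ == "__main__":
    # Hypothetical verified cases: six subjects rated on a 1-5 scale.
    for fpf, tpf in roc_points([1, 2, 3, 4, 5, 3], [0, 0, 1, 1, 1, 0]):
        print(f"FPF={fpf:.2f}  TPF={tpf:.2f}")
```

Raising the threshold can only lower both fractions, which is why the points trace a curve from (1, 1) toward (0, 0).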
This paper treats a multiresolution hidden Markov model for classifying images. Each image is represented by feature vectors at several resolutions, which are statistically dependent as modeled by the underlying state process, a multiscale Markov mesh. Unknowns in the model are estimated by maximum likelihood, in particular by employing the expectation-maximization algorithm. An image is classified by finding the optimal set of states with maximum a posteriori probability. States are then mapped into classes. The multiresolution model enables multiscale information about context to be incorporated into classification. Suboptimal algorithms based on the model provide progressive classification that is much faster than the algorithm based on single-resolution hidden Markov models.
It is always difficult to train multi-layer feedforward neural networks (FNN) based on the cumulants match criterion because cumulants are nonlinear and implicit functions of the FNN parameters. In this work, two new cumulant-based training methods for two-layer FNNs are developed. In the first method, the hidden units of the two-layer FNN are approximated with multiple linear systems, and the total FNN is then modeled with a "mixture of experts" (ME) architecture. With the ME model, the FNN parameters are estimated with the expectation-maximization (EM) algorithm. The second method, to simplify the two-layer FNN statistical model, proposes a simplified two-level hierarchical ME to remodel the FNN, in which hidden variables are introduced to decompose training the total FNN into training a set of single neurons. Based on training single neurons, the total FNN is trained in a simplified version with a faster convergence speed. (C) 2000 Elsevier Science B.V. All rights reserved.
Some failure time data come from a population that consists of some subjects who are susceptible to and others who are nonsusceptible to the event of interest. The data typically have heavy censoring at the end of the follow-up period, and a standard survival analysis would not always be appropriate. In such situations where there is good scientific or empirical evidence of a nonsusceptible population, the mixture or cure model can be used (Farewell, 1982, Biometrics 38, 1041-1046). It assumes a binary distribution to model the incidence probability and a parametric failure time distribution to model the latency. Kuk and Chen (1992, Biometrika 79, 531-541) extended the model by using Cox's proportional hazards regression for the latency. We develop maximum likelihood techniques for the joint estimation of the incidence and latency regression parameters in this model using the nonparametric form of the likelihood and an EM algorithm. A zero-tail constraint is used to reduce the near nonidentifiability of the problem. The inverse of the observed information matrix is used to compute the standard errors. A simulation study shows that the methods are competitive with the parametric methods under ideal conditions and are generally better when censoring from loss to follow-up is heavy. The methods are applied to a data set of tonsil cancer patients treated with radiation therapy.
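The mixture (cure) model described above can be written out as follows. This is a sketch of the usual formulation, not a transcription from the paper: the symbols $\pi$, $\gamma$, $\beta$, $S_0$, $x$, and $z$ are notation assumed here, and the logistic form for the incidence is an assumption, since the abstract only says "binary distribution".

```latex
% Population survival: nonsusceptible subjects never fail,
% susceptible subjects fail according to the latency distribution S_u.
S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x)

% Incidence: probability of being susceptible (assumed logistic).
\pi(z) = \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)}

% Latency: Cox proportional hazards with baseline survival S_0.
S_u(t \mid x) = S_0(t)^{\exp(\beta^{\top} x)}
```

As $t$ grows, $S_u(t \mid x) \to 0$ and the population survival curve levels off at $1 - \pi(z)$, the nonsusceptible fraction. Heavy censoring late in follow-up makes that plateau hard to distinguish from a long-tailed latency distribution, which is the near nonidentifiability the abstract refers to.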
We consider the problem of estimating and suppressing many unknown independent and time-varying interferers in a spread-spectrum communication system. The interferers are assumed to be present in a wide frequency range. In order to detect, estimate, and track the interference, we use a bank of hidden Markov model filters operating in the frequency domain. The hidden Markov model filters' outputs are then used to suppress the existing interference. The computational complexity of our scheme is only linear in the number of interferers. The simulation studies show that our proposed novel schemes adapt quickly in tracking the time-varying nature of the interference.
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
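The two mixture distributions named above can be sketched in a few lines. This is an illustration only: it assumes that p is the weight on the point mass at zero, it treats p, lambda, and the binomial success probability as fixed numbers rather than functions of covariates, and it omits the random effects. The function names `zip_pmf` and `zib_pmf` and the parameter name `success_prob` are hypothetical.

```python
import math


def zip_pmf(y, p, lam):
    """P(Y = y) for a zero-inflated Poisson: zero with probability p, Poisson(lam) otherwise."""
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    # The point mass contributes only at y == 0.
    return p * (y == 0) + (1 - p) * poisson


def zib_pmf(y, p, n, success_prob):
    """P(Y = y) for a zero-inflated binomial with upper bound n on the count."""
    binom = math.comb(n, y) * success_prob**y * (1 - success_prob) ** (n - y)
    return p * (y == 0) + (1 - p) * binom


if __name__ == "__main__":
    # Excess zeros: compare the probability of a zero count with and without inflation.
    print("Poisson P(0):", math.exp(-2.0))
    print("ZIP P(0):    ", zip_pmf(0, 0.3, 2.0))
    print("ZIB P(0):    ", zib_pmf(0, 0.3, 10, 0.2))
```

In both cases the probability of a zero count is p plus (1 - p) times the zero probability of the underlying count distribution, which is what lets the model absorb more zeros than a plain Poisson or binomial can.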
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
The expectation-maximization (EM) algorithm is popular in estimating parameters of various statistical models. In this paper, we consider applications of the EM algorithm to maximum a posteriori (MAP) sequence decoding, assuming that sources and channels are described by hidden Markov models (HMMs). HMMs can accurately approximate a large variety of communication channels with memory and, in particular, wireless fading channels with noise. The direct maximization of the a posteriori probability (APP) is too complex. The EM algorithm allows us to obtain the MAP sequence estimate iteratively. Since each step of the EM algorithm increases the APP, the algorithm can improve the performance of any decoding procedure.