Chain-of-events data are longitudinal observations on a succession of events that can only occur in a prescribed order. One goal in an analysis of this type of data is to determine the distribution of times between the successive events. This is difficult when individuals are observed periodically rather than continuously because the event times are then interval censored. Chain-of-events data may also be subject to truncation when individuals can only be observed if a certain event in the chain (e.g., the final event) has occurred. We provide a nonparametric approach to estimate the distributions of times between successive events in discrete time for data such as these under the semi-Markov assumption that the times between events are independent. This method uses a self-consistency algorithm that extends Turnbull's algorithm (1976, Journal of the Royal Statistical Society, Series B 38, 290-295). The quantities required to carry out the algorithm can be calculated recursively for improved computational efficiency. Two examples using data from studies involving HIV disease are used to illustrate our methods.
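As a point of reference for the self-consistency idea, the following is a minimal sketch of the basic Turnbull-style update for a single interval-censored event time in discrete time. It is not the chain-of-events extension described in the abstract, and it ignores truncation. The function name `turnbull_self_consistency`, the `(lo, hi)` interval encoding, and the convergence tolerance are assumptions made for illustration; the sketch also assumes every interval contains at least one support point.

```python
def turnbull_self_consistency(intervals, support, max_iter=1000, tol=1e-10):
    """Estimate a discrete distribution on `support` from interval-censored data.

    intervals: list of (lo, hi) pairs; the event time is known only to lie
               in the closed interval [lo, hi] (hypothetical encoding).
    support:   list of candidate event times.
    """
    n = len(intervals)
    # compat[i][j] is True when support point j lies inside interval i.
    compat = [[lo <= t <= hi for t in support] for lo, hi in intervals]
    p = [1.0 / len(support)] * len(support)  # uniform starting value
    for _ in range(max_iter):
        new_p = [0.0] * len(support)
        for row in compat:
            # Probability the current estimate assigns to this interval.
            denom = sum(pj for pj, ok in zip(p, row) if ok)
            for j, ok in enumerate(row):
                if ok:
                    # Expected share of this observation falling at point j.
                    new_p[j] += p[j] / denom
        new_p = [v / n for v in new_p]
        if max(abs(a - b) for a, b in zip(p, new_p)) < tol:
            return new_p
        p = new_p
    return p
```

When every interval is a single point, the update reproduces the empirical frequencies after one pass; with overlapping intervals the mass drifts toward the points shared by the most observations.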
The purpose of this paper is to present and evaluate a heuristic algorithm for learning Bayesian networks for clustering. Our approach is based upon improving the Naive-Bayes model by means of constructive induction. A key idea in this approach is to treat expected data as real data. This allows us to complete the database and to take advantage of factorable closed forms for the marginal likelihood. In order to get such an advantage, we search for parameter values using the EM algorithm or an alternative approach that we have developed: a hybridization of the Bound and Collapse method and the EM algorithm, which results in a method that exhibits a faster convergence rate and a more effective behaviour than the EM algorithm. Also, we consider the possibility of interleaving runs of these two methods after each structural change. We evaluate our approach on synthetic and real-world databases. (C) 1999 Elsevier Science B.V. All rights reserved.
In this paper we examine the problem of estimating a stochastic signal from noise-corrupted, linearly distorted samples of the original. Due to the ill-posedness caused by the blurring function, we are motivated to examine an inversion method in which the statistics of the underlying process are modeled as a 1/f-type fractal process. In particular, we explore two issues with the use of such a model: the effects of model mismatch and parameter estimation. Our analysis demonstrates that the mean-square-error performance of the estimator is quite insensitive to the choice of prior model parameters used in the recovery of the signal. Such robustness is shown to hold even when the underlying process is not of the 1/f variety. We then introduce an expectation-maximization technique for jointly extracting the best parameters for use in an inversion along with the reconstructed signal. Here, Monte Carlo and Cramér-Rao bound results demonstrate that we are able to determine accurate model parameters in precisely those situations where the model mismatch analysis shows that such fidelity is required to ensure low mean-square error in the recovery of the underlying signal. (C) 1999 Elsevier Science B.V. All rights reserved.
In this paper a method to automatically generate a Gaussian mixture classifier is presented. The growing process is based on the iterative addition of Gaussian nodes. Each iteration takes place in two sequential steps: first, using the EM algorithm, we maximize the likelihood of the data under the current configuration of the classifier; then, a new Gaussian node is added to the class which most improves the discriminant capabilities of the network. Growth control is imposed by means of a complexity penalizing term and a discriminant MMI condition. The classical EM algorithm for Gaussian mixtures is also extended to jointly include labeled and unlabeled data. We report some artificial experiments that show the utility of this extension and the reliability of the proposed growing technique. We also report results of the Growing Gaussian Mixtures Network on terrain classification over a Landsat-TM image using different restrictions on the covariance matrix of the Gaussian mixtures. Comparisons in classification performance with a set of MLP neural networks are provided. (C) 1999 Elsevier Science B.V. All rights reserved.
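One common way to let EM use labeled and unlabeled data together is to fix the responsibilities of labeled points at their known class and to compute posterior responsibilities only for unlabeled points. The sketch below does this for a one-dimensional model with a single Gaussian per class. It is an illustration of that general idea, not the paper's algorithm: it has no node growing, no complexity penalty, and no MMI condition. The names `semi_supervised_em`, `normal_pdf`, and `var_floor` are hypothetical, and the sketch assumes every class has at least one labeled point.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def semi_supervised_em(labeled, unlabeled, n_classes, n_iter=50, var_floor=1e-6):
    """EM for one Gaussian per class using labeled (x, y) pairs and unlabeled x values."""
    n_total = len(labeled) + len(unlabeled)
    by_class = [[x for x, y in labeled if y == c] for c in range(n_classes)]
    # Initialise each class from its labeled points alone.
    weights = [len(xs) / len(labeled) for xs in by_class]
    means = [sum(xs) / len(xs) for xs in by_class]
    variances = [
        max(sum((x - m) ** 2 for x in xs) / len(xs), var_floor)
        for xs, m in zip(by_class, means)
    ]
    for _ in range(n_iter):
        # E-step: posterior class probabilities for the unlabeled points only.
        resp = []
        for x in unlabeled:
            joint = [weights[c] * normal_pdf(x, means[c], variances[c])
                     for c in range(n_classes)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: labeled points count with weight 1 in their own class.
        for c, xs in enumerate(by_class):
            n_c = len(xs) + sum(r[c] for r in resp)
            mean_c = (sum(xs) + sum(r[c] * x for r, x in zip(resp, unlabeled))) / n_c
            sq = sum((x - mean_c) ** 2 for x in xs)
            sq += sum(r[c] * (x - mean_c) ** 2 for r, x in zip(resp, unlabeled))
            weights[c] = n_c / n_total
            means[c] = mean_c
            variances[c] = max(sq / n_c, var_floor)
    return weights, means, variances
```

The variance floor is only a guard against degenerate components in this toy setting; a full mixture classifier would need several components per class and a multivariate covariance model.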
Mixed effects models are often used for estimating fixed effects and variance components in longitudinal studies of continuous data. When the outcome being modelled is a laboratory measurement, however, it may be subject to lower and upper detection limits (i.e., censoring). In this paper, the usual EM estimation procedure for mixed effects models is modified to account for left and/or right censoring.
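The effect of censoring on an EM procedure can be seen in a much simpler setting than a mixed effects model: a single normal sample in which some values fall below a lower detection limit. In the sketch below, the E-step replaces each censored value's first and second moments with those of a normal distribution truncated above at the limit, and the M-step applies the usual complete-data formulas. This is an illustrative simplification, not the paper's procedure; the function name `censored_normal_em` and its arguments are hypothetical, and the sketch assumes the limit is not so far below the mean that the normal CDF underflows.

```python
import math

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_normal_em(observed, n_censored, limit, n_iter=200):
    """EM estimates of the mean and variance of a normal sample that is
    left-censored at `limit`.

    observed:   values measured above the detection limit.
    n_censored: number of values known only to lie below the limit.
    """
    n = len(observed) + n_censored
    sum_obs = sum(observed)
    sum_sq_obs = sum(y * y for y in observed)
    # Start from the observed values alone.
    mu = sum_obs / len(observed)
    sigma2 = sum((y - mu) ** 2 for y in observed) / len(observed)
    for _ in range(n_iter):
        sigma = math.sqrt(sigma2)
        beta = (limit - mu) / sigma
        lam = std_normal_pdf(beta) / std_normal_cdf(beta)
        # E-step: moments of y given y < limit (upper-truncated normal).
        e1 = mu - sigma * lam
        var_trunc = sigma2 * (1.0 - beta * lam - lam * lam)
        e2 = var_trunc + e1 * e1
        # M-step: complete-data maximum likelihood with expected moments.
        mu = (sum_obs + n_censored * e1) / n
        sigma2 = (sum_sq_obs + n_censored * e2) / n - mu * mu
    return mu, sigma2
```

With no censored values the routine returns the ordinary sample mean and the maximum likelihood variance; with censored values the estimated mean is pulled below the mean of the observed values, as expected.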
The occurrence of different forms of asymmetry complicates the analysis and interpretation of patterns in asymmetry. Furthermore, between-individual heterogeneity in developmental stability (DS), and thus in fluctuating asymmetry (FA), is required to find relationships between DS and other factors. Separating directional asymmetry (DA) and antisymmetry (AS) from real FA and understanding between-individual heterogeneity in FA is therefore crucial in the analysis and interpretation of patterns in asymmetry. In this paper we introduce and explore mixture analysis to (i) identify FA, DA and AS from the distribution of the signed asymmetry, and (ii) model and quantify between-individual heterogeneity in developmental stability and FA. In addition, we expand mixtures to the estimation of the proportion of variation in the unsigned FA that can be attributed to between-individual heterogeneity in the presumed underlying developmental stability (the so-called hypothetical repeatability). Finally, we construct weighted normal probability plots to investigate the assumption of underlying normality of the different components. We specifically show that (i) model selection based on the likelihood ratio test has the potential to yield models that incorporate nearly all heterogeneity in FA; (ii) mixtures appear to be a powerful and sensitive statistical technique to identify the different forms of asymmetry; (iii) restricted measurement accuracy and the occurrence of many zero observations result in an overestimation of the hypothetical repeatability on the basis of the model parameters; and (iv) as judged from the high correlation coefficients of the normal probability plots, the underlying normality assumption appears to hold for the empirical data we analysed. In conclusion, mixtures provide a useful statistical tool to study patterns in asymmetry.
Authors: Marchette, DJ; Poston, WL
USN, Computat Stat Grp, Ctr Surface Warfare, Dahlgren, VA 22448 USA
USN, Adv Processors Grp, Ctr Surface Warfare, Dahlgren, VA 22448 USA
In automatic pattern recognition applications, numerous features that describe the classes are obtained in an attempt to ensure accurate classification of unknown observations. These features or dimensions must be reduced to a smaller number before classification schemes can be applied, because classifiers become computationally and analytically unmanageable in high dimensions. Principal components and Fisher's Linear Discriminant offer global dimensionality reduction within the framework of linear algebra applied to covariance matrices. This report describes local methods that use both mixture models and nearest-neighbor calculations to construct local versions of these methods. These new versions for local dimensionality reduction will provide increased classification accuracy in lower dimensions.
We describe a semiparametric mixture model for human fertility studies. The probability of conception is a product of two components. The mixing distribution, the component that introduces the heterogeneity among the menstrual cycles that come from different couples, is characterized nonparametrically by a finite number of moments. The second component, the intercourse-related probability, is modeled parametrically to assess the possible exposure effects. We discuss an EM-algorithm-based estimation procedure that incorporates the natural order in the moments. (C) 1999 Elsevier Science B.V. All rights reserved.
Purpose. To develop a pharmacokinetic model for tenidap and to identify important relationships between the pharmacokinetic parameters and available covariates. Methods. Plasma concentration data from several phase I and phase II studies were used to develop a pharmacokinetic model for tenidap, a novel anti-rheumatic drug. An appropriate pharmacokinetic model was selected on the basis of individual nonlinear regression analyses, and an EM algorithm was used to perform a nonlinear mixed-effects analysis. Scatter plots of posterior individual pharmacokinetic parameters were used to identify possible covariate effects. Results. Predicted responses were in good agreement with the observed data. A biexponential model with zero-order absorption was subsequently used to develop the mixed-effects model. Covariate relationships selected on the basis of differences in the objective function, although statistically significant, were not particularly strong. Conclusions. The pharmacokinetics of tenidap can be described by a biexponential model with zero-order absorption. Based on differences in the log-likelihood, significant covariate-parameter relationships were identified between smoking and CL, and between gender and Vss and CLd. Simulated sparse data analyses indicated that the model would be robust for the analysis of sparse data generated in observational studies.
An important inferential objective in state space modelling is to recover unobserved states using fixed-interval smoothing. Thus, the identification of cases which have a substantial influence on the smoothers is a relevant practical problem. To facilitate this identification, we propose a case-deletion diagnostic which can be easily computed using the outputs of the standard filtering and smoothing algorithms. Our diagnostic is defined as the Kullback-Leibler directed divergence between two versions of the conditional density which determines the smoothers, one based on all the data, the other based on all the data except for the case or cases in question. We investigate the detection performance of the diagnostic in a practical application.
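To make the divergence concrete, the sketch below computes the Kullback-Leibler directed divergence between two univariate Gaussian densities, which is the form a smoothing density takes at a single time point in a linear Gaussian state space model. The abstract does not state the model class, the direction of the divergence, or whether the diagnostic is computed per time point or jointly, so the Gaussian assumption, the argument order, and the names `kl_gaussian` and `pointwise_divergence` are all hypothetical choices made for illustration.

```python
import math

def kl_gaussian(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for univariate normals p = N(mean_p, var_p), q = N(mean_q, var_q)."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )

def pointwise_divergence(full_means, full_vars, deleted_means, deleted_vars):
    """Per-time-point divergence between smoothed state densities.

    The first two lists hold smoothed means and variances computed from all
    the data; the last two hold the same quantities with a case deleted.
    """
    return [
        kl_gaussian(m_full, v_full, m_del, v_del)
        for m_full, v_full, m_del, v_del
        in zip(full_means, full_vars, deleted_means, deleted_vars)
    ]
```

A value of zero at a time point means deleting the case leaves that smoothed density unchanged; larger values flag time points where the deleted case moved the mean, the variance, or both.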