Twin studies are essential for assessing disease inheritance. Data generated from twin studies are traditionally analyzed using specialized computational programs. For many researchers, especially those who are new to twin studies, understanding and using those specialized programs can be a daunting task. Given that SAS (Statistical Analysis System) is the most popular software for statistical analysis, we suggest that SAS procedures may be a helpful alternative for twin data, and we demonstrate that SAS produces results similar to those of the specialized programs. This numerical validation is practically useful, because a natural concern with general statistical software is whether it can deal with data generated from special study designs such as twin studies and whether it can test a particular hypothesis. We conclude from extensive simulations that SAS procedures can be used easily as a convenient alternative to specialized programs for twin data analysis.
Binary data are often of interest in many small area applications. The use of standard small area estimation methods based on linear mixed models becomes problematic for such data. An empirical plug-in predictor (EPP) under a unit-level generalized linear mixed model with logit link function is often used for the estimation of a small area proportion. However, this EPP requires unit-level population information on the auxiliary variables, which may not always be accessible. As a consequence, in many practical situations, the EPP approach cannot be applied. Based on the level of auxiliary information available, different small area predictors for the estimation of proportions are proposed. Analytic and bootstrap approaches to estimating the mean squared error of the proposed small area predictors are also developed. Monte Carlo simulations based on both simulated and real data show that the proposed small area predictors work well for generating small area estimates of proportions and represent a practical alternative to the EPP approach. The developed predictor is applied to generate district-level estimates of the proportions of indebted farm households using debt investment survey data from India.
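To make the data requirement concrete, here is a minimal sketch of one common form of plug-in predictor for an area proportion under a logit-link unit-level model. It is not the estimator proposed in the paper: the function name `plug_in_proportion`, its arguments, and the assumption that the coefficients `beta` and the area effect `u_area` have already been estimated are all illustrative.

```python
import math

def expit(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def plug_in_proportion(y_sampled, x_nonsampled, beta, u_area):
    """Plug-in estimate of an area proportion (illustrative form).

    y_sampled    : observed 0/1 responses for the sampled units in the area
    x_nonsampled : covariate rows for every NON-sampled unit in the area
    beta, u_area : fixed effects and area random effect, assumed already estimated
    """
    # Predicted success probability for each non-sampled unit.
    fitted = [
        expit(sum(b * x for b, x in zip(beta, row)) + u_area)
        for row in x_nonsampled
    ]
    n_total = len(y_sampled) + len(fitted)
    # Observed successes plus predicted successes, over the area population size.
    return (sum(y_sampled) + sum(fitted)) / n_total
```

The `x_nonsampled` argument is the point of the sketch: the predictor needs a covariate row for every unit outside the sample, which is exactly the unit-level population information that is often unavailable.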
We investigate the spread of Nectria canker of beech, a chronic fungal disease caused by Nectria ditissima Tul. et C. Tul. Data are available from a beech provenance trial. A possible influential factor on the proportion of infected trees per plot is the wind dispersal zone (wdz), a categorical variable describing the distance and wind direction from diseased shelterwood, the source of infection. We investigate the effect of wdz, and whether the disease incidence in the regeneration can be explained by the wdz alone, using different approaches that account for spatial correlation in the data. One method uses generalized estimating equations (GEE) where, through specification of a general variance-covariance matrix allowing for nonindependence, spatial correlation can be accounted for in the model. The second method uses generalized additive models (GAM), in which the spatial autocorrelation is modeled as a spatial trend. The third method uses generalized linear mixed models (GLMM) with a random effect accounting for spatial correlation and heterogeneity. We show that, in the beech data, some spatial correlation is present over and above that accounted for by the wdz. Therefore, methods not accounting for this correlation are inappropriate. The GLMM is the most appropriate model because it models the biological process best: it explains the variation in disease incidence by the wdz and by secondary infection. Hence it yields the most precise estimates.
Statistical methods for analyzing spatial count data are often based on random fields, so that a latent variable can be used to specify the spatial dependence. In this article, we introduce two frequentist approaches for estimating the parameters of model-based spatial count variables. The two approaches are compared in a simulation study, and their performance is also evaluated on a real dataset. The simulation results suggest that the maximum likelihood estimator has the better sampling properties.
Often, the functional form of covariate effects in an additive model varies across groups defined by levels of a categorical variable. This structure represents a factor-by-curve interaction. This article presents penalized spline models that incorporate factor-by-curve interactions into additive models. A mixed model formulation for penalized splines allows for straightforward model fitting and smoothing parameter selection. We illustrate the proposed model by applying it to ragweed pollen data in which seasonal trends vary by year.
We explore a hierarchical generalized latent factor model for discrete and bounded response variables and in particular, binomial responses. Specifically, we develop a novel two-step estimation procedure and the corresponding statistical inference that is computationally efficient and scalable for the high dimension in terms of both the number of subjects and the number of features per subject. We also establish the validity of the estimation procedure, particularly the asymptotic properties of the estimated effect size and the latent structure, as well as the estimated number of latent factors. The results are corroborated by a simulation study and for illustration, the proposed methodology is applied to analyze a dataset in a gene-environment association study.
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.
A multilevel model for ordinal data in the generalized linear mixed model (GLMM) framework is developed to account for the inherent dependencies among observations within clusters. Motivated by a data set from the British Social Attitudes Panel Survey (BSAPS), random district effects and respondent effects are incorporated into the linear predictor to accommodate the nested clustering. The fixed (random) effects are estimated (predicted) by maximizing the penalized quasi-likelihood (PQL) function, whereas the variance component parameters are obtained via restricted maximum likelihood (REML) estimation. The model is employed to analyze the BSAPS data. Simulation studies are conducted to assess the performance of the estimators. (C) 2015 Elsevier B.V. All rights reserved.
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
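The mixture structure described in this abstract can be written down directly. The sketch below evaluates the ZIP and ZIB probability mass functions for fixed parameter values, assuming that p is the weight on the point mass at zero. The function names and the binomial parameters `n` and `pi` are illustrative, and the covariate links and random effects that the paper adds are deliberately left out.

```python
import math

def zip_pmf(y, lam, p):
    """P(Y = y) under a zero-inflated Poisson: zero with probability p,
    otherwise a draw from Poisson(lam)."""
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    return p * (y == 0) + (1 - p) * poisson

def zib_pmf(y, n, pi, p):
    """P(Y = y) under a zero-inflated binomial: zero with probability p,
    otherwise a draw from Binomial(n, pi)."""
    binom = math.comb(n, y) * pi**y * (1 - pi) ** (n - y)
    return p * (y == 0) + (1 - p) * binom
```

In both cases a zero can arise from either component, so P(Y = 0) exceeds what the plain Poisson or binomial model would give whenever p > 0. That is the "excess zeros" feature the models are built to capture.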
Disease mapping is an important area of statistical research. Contributions to the area over the last twenty years have been instrumental in helping to pinpoint potential causes of mortality and to provide a strategy for effective allocation of health funding. Because of the complexity of spatial analyses, new developments in methodology have not generally found application at Vital Statistics agencies. Inference for spatio-temporal analyses remains computationally prohibitive for the routine preparation of mortality atlases. This paper considers whether approximate methods of inference are reliable for mapping studies, especially in terms of providing accurate estimates of relative risks, ranks of regions and standard errors of risks. These approximate methods lie in the broader realm of approximate inference for generalized linear mixed models. Penalized quasi-likelihood is specifically considered here. The main focus is on assessing how close the penalized quasi-likelihood estimates are to target values, by comparison with the more rigorous and widespread Bayesian Markov Chain Monte Carlo methods. No previous studies have compared these two methods. The quantities of prime interest are small-area relative risks and the estimated ranks of the risks, which are often used for ordering the regions. It will be shown that penalized quasi-likelihood is a reasonably accurate method of inference and can be recommended as a simple, yet quite precise, method for initial exploratory studies. (C) 2005 Elsevier B.V. All rights reserved.