To ensure that a study can properly address its research aims, the sample size and power must be determined appropriately. Covariate adjustment via regression modeling permits more precise estimation of the effect of a primary variable of interest at the expense of increased complexity in sample size/power calculation. The presence of correlation between the main variable and other covariates, commonly seen in observational studies and non-randomized clinical trials, further complicates this process. Though sample size and power specification methods have been developed to accommodate specific covariate distributions and models, most existing approaches rely on either simple approximations lacking theoretical support or complex procedures that are difficult to apply at the design stage. The current literature lacks a general, coherent theory applicable to a broader class of regression models and covariate distributions. We introduce succinct formulas for sample size and power determination with the generalized linear, Cox, and Fine-Gray models that account for correlation between a main effect and other covariates. Extensive simulations demonstrate that this method produces studies that are appropriately sized to meet their type I error rate and power specifications, particularly offering accurate sample size/power estimation in the presence of correlated covariates.
Vector generalized linear models (VGLMs) as implemented in the VGAM R package permit multiple parameters to depend (via inverse link functions) on linear predictors. However, it is often the case that one wishes different parameters to be related to each other in some way (i.e., to jointly satisfy certain constraints). Prominent and important examples of such cases include the normal or Gaussian family where one wishes to model the variance as a function of the mean, e.g., variance proportional to the mean raised to some power. Another example is the negative binomial family whose variance is approximately proportional to the mean raised to some power. It is shown that such constraints can be implemented in a straightforward manner via reduced rank regression (RRR) and easily used via the rrvglm() function. To this end, RRR is briefly described and applied so as to impose parameter constraints in VGLMs with two parameters. The result is a rank-1 RR-VGLM. Numerous examples are given, some new, of the use of this technique. The implication here is that RRR offers hitherto undiscovered potential usefulness to many statistical distributions. (C) 2013 Elsevier B.V. All rights reserved.
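As a rough illustration of how a rank-1 constraint can tie the variance to a power of the mean, consider the following sketch. It assumes log links for both parameters and a corner constraint fixing the first loading at 1; the symbols $\nu$, $a_{21}$, $\beta_{(1)1}$, $\beta_{(2)1}$ and $K$ are notation chosen here for illustration and are not taken from the abstract.

```latex
% Two linear predictors sharing one latent variable \nu = c^\top x (rank 1):
\eta_1 = \log \mu      = \beta_{(1)1} + \nu, \qquad
\eta_2 = \log \sigma^2 = \beta_{(2)1} + a_{21}\,\nu .
% Eliminating \nu = \log\mu - \beta_{(1)1} gives
\log \sigma^2 = \bigl(\beta_{(2)1} - a_{21}\beta_{(1)1}\bigr) + a_{21}\log\mu
\quad\Longrightarrow\quad
\sigma^2 = K\,\mu^{a_{21}}, \qquad K = \exp\bigl(\beta_{(2)1} - a_{21}\beta_{(1)1}\bigr).
```

Under these assumptions the estimated loading $a_{21}$ plays the role of the power in "variance proportional to the mean raised to some power".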
Generalized linear models play an essential role in a wide variety of statistical applications. This paper discusses an approximation of the likelihood in these models that can greatly facilitate computation. The basic idea is to replace a sum that appears in the exact log-likelihood by an expectation over the model covariates; the resulting "expected log-likelihood" (EL) can in many cases be computed significantly faster than the exact log-likelihood. In many neuroscience experiments the distribution over model covariates is controlled by the experimenter and the expected log-likelihood approximation becomes particularly useful; for example, estimators based on maximizing this expected log-likelihood (or a penalized version thereof) can often be obtained with orders of magnitude computational savings compared to the exact maximum likelihood estimators. A risk analysis establishes that these maximum EL estimators often come with little cost in accuracy (and in some cases even improved accuracy) compared to standard maximum likelihood estimates. Finally, we find that these methods can significantly decrease the computation time of marginal likelihood calculations for model selection and of Markov chain Monte Carlo methods for sampling from the posterior parameter distribution. We illustrate our results by applying these methods to a computationally challenging dataset of neural spike trains obtained via large-scale multi-electrode recordings in the primate retina.
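The idea of replacing a sum by an expectation can be sketched in a few lines. The example below is only an illustration under assumptions that are not stated in the abstract: a Poisson GLM with an exponential nonlinearity, unit time bins, and covariates drawn from a standard normal distribution, for which E[exp(x·θ)] = exp(|θ|²/2) in closed form. All function names are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def simulate(theta, n, rng):
    """Draw covariates x ~ N(0, I) and Poisson counts with rate exp(x . theta)."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in theta]
        rate = math.exp(dot(x, theta))
        # Knuth's multiplication method; adequate for the small rates used here.
        limit, k, p = math.exp(-rate), 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        X.append(x)
        y.append(k)
    return X, y


def response_weighted_sum(X, y):
    """Sufficient statistic sum_i y_i * x_i, computed once from the data."""
    return [sum(yi * x[j] for x, yi in zip(X, y)) for j in range(len(X[0]))]


def exact_loglik(theta, X, y):
    """Exact Poisson log-likelihood (up to a constant): an O(N) sum per evaluation."""
    etas = [dot(x, theta) for x in X]
    return sum(yi * e for yi, e in zip(y, etas)) - sum(math.exp(e) for e in etas)


def expected_loglik(theta, s, n):
    """Expected log-likelihood: the sum of exp(x_i . theta) is replaced by
    n * E[exp(x . theta)], which is exp(|theta|^2 / 2) under the assumed
    standard normal covariates."""
    return dot(theta, s) - n * math.exp(0.5 * dot(theta, theta))


rng = random.Random(0)
theta = [0.3, -0.2, 0.1]
X, y = simulate(theta, 20000, rng)
s = response_weighted_sum(X, y)
exact = exact_loglik(theta, X, y)
approx = expected_loglik(theta, s, len(X))
print(exact, approx)
```

Once the statistic `s` has been computed, each evaluation of `expected_loglik` costs a number of operations proportional to the dimension of θ rather than to the number of observations, which is the source of the savings in this toy setting.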
Epidemiologic studies use outcome-dependent sampling (ODS) schemes in which, in addition to a simple random sample, a number of supplemental samples are collected based on the outcome variable. The ODS scheme is a cost-effective way to improve study efficiency. We develop a maximum semiparametric empirical likelihood estimator (MSELE) for data from a two-stage ODS scheme under the assumption that, given the covariates, the outcome follows a general linear model. Information from both the validation and nonvalidation samples is used. Furthermore, we prove the asymptotic properties of the proposed MSELE.
Hidden Markov and semi-Markov models (H(S)MMs) constitute useful tools for modeling observations subject to certain dependency structures. The hidden states render these models very flexible and allow them to capture many different types of latent patterns and dynamics present in the data. This has led to the increased popularity of these models, which have been applied to a variety of problems in various domains and settings, including longitudinal data. In many longitudinal studies, the response variable is categorical or count-type. Generalized linear mixed models (GLMMs) can be used to analyze a wide range of variables, including categorical and count variables. The present study proposes a model that combines HSMMs with GLMMs, leading to generalized linear mixed hidden semi-Markov models (GLM-HSMMs). These models can account for time-varying unobserved heterogeneity and handle different response types. Parameter estimation is achieved using a Monte Carlo Newton-Raphson (MCNR)-like algorithm. In our proposed model, the distribution of the random effects depends on hidden states. We illustrate the applicability of GLM-HSMMs with an example in the field of occupational health, where the response variable consists of count values. Furthermore, we assess the performance of our MCNR-like algorithm through a simulation study.
While developing a prior distribution for any Bayesian analysis, it is important to check whether the corresponding posterior distribution becomes degenerate in the limit to the true parameter value as the sample size increases. In the same vein, it is also important to understand the asymptotic behavior of posterior distributions in more detail. This is particularly relevant in the development of many nonsubjective priors. The present paper focuses on asymptotic expansions of posteriors for generalized linear models with canonical link functions when the number of regressors grows to infinity at a certain rate relative to the growth of the sample size. These expansions are then used to derive moment matching priors in the generalized linear model setting. (C) 2014 Elsevier Inc. All rights reserved.
The current article explores whether generalized linear models (GLM) and generalized estimating equations (GEE) can be used in place of conventional statistical analyses in the study of ordinal data that code an underlying continuous variable, like entheseal changes. The analysis of artificial data and ordinal data expressing entheseal changes in archaeological North African populations gave the following results. Parametric and nonparametric tests give convergent results, particularly for P values <0.1, irrespective of whether the underlying variable is normally distributed or not, under the condition that the samples involved in the tests exhibit approximately equal sizes. If this prerequisite is valid and provided that the samples are of equal variances, analysis of covariance may be adopted. GLM are not subject to constraints and give results that converge to those obtained from all nonparametric tests. Therefore, they can be used instead of traditional tests, as they give the same amount of information, with the added advantage of allowing the study of the simultaneous impact of multiple predictors and their interactions and the modeling of the experimental data. However, GLM should be replaced by GEE for the study of bilateral asymmetry and in general when paired samples are tested, because GEE are appropriate for correlated data. Am J Phys Anthropol 153:473-483, 2014. (c) 2013 Wiley Periodicals, Inc.
In this article, parametric robust regression approaches are proposed for making inferences about regression parameters in the setting of generalized linear models (GLMs). The proposed methods are able to test hypotheses on the regression coefficients in misspecified GLMs. More specifically, it is demonstrated that with large samples, the normal and gamma regression models can be properly adjusted to become asymptotically valid for inferences about regression parameters under model misspecification. These adjusted regression models can provide the correct type I and II error probabilities and the correct coverage probability for continuous data, as long as the true underlying distributions have finite second moments.
Generalized linear models have been more widely used than linear models, which exclude categorical variables. Penalized methods have become an effective tool for studying ultrahigh-dimensional generalized linear models. In this paper, we study theoretical results for the adaptive Lasso in generalized linear models with a diverging number of parameters and ultrahigh dimensionality. The asymptotic results are examined through several simulation studies. (c) 2014 Elsevier B.V. All rights reserved.
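For orientation, the adaptive Lasso in a GLM is usually written as a weighted $\ell_1$-penalized likelihood problem. The form below is the standard textbook definition, not a formula taken from this abstract; the initial estimator $\tilde\beta$, the exponent $\gamma$ and the tuning sequence $\lambda_n$ are generic symbols, and the specific choices made in the paper are not given here.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p_n}}
    \Bigl\{ -\ell_n(\beta) + \lambda_n \sum_{j=1}^{p_n} \hat{w}_j \,\lvert \beta_j \rvert \Bigr\},
\qquad
\hat{w}_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}}, \quad \gamma > 0,
```

where $\ell_n$ denotes the GLM log-likelihood and $p_n$ is allowed to grow with the sample size $n$. Coefficients with a small initial estimate receive a large weight and are therefore penalized more heavily.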
Adjusted responses, adjusted fitted values and adjusted residuals are known to play in generalized linear models the role played in linear models by observations, fitted values and ordinary residuals. We think this parallelism, which was widely recognized and used in the early literature on generalized linear models, has been somewhat overlooked in more recent presentations. We revisit this parallelism, systematizing and proving some results that are either scattered or not satisfactorily spelled out in the literature. In particular, we formally derive the asymptotic dispersion matrix of the (scaled) adjusted residuals, by proving that in generalized linear models the fitted values are asymptotically uncorrelated with the raw residuals and hence deriving the asymptotic dispersion matrix of these latter residuals. Also, we show that an orthogonal decomposition of the error vector between adjusted response and true linear predictor, parallel to the familiar decomposition in linear models, holds approximately. Finally, we provide some new perspective, both in linear and generalized linear models, on adjusted residuals for model comparison, and their relationships with test statistics used to compare the fit of nested models.
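The parallel with linear models can be made concrete with iteratively reweighted least squares (IRLS), where the adjusted response is regressed on the covariates by weighted least squares. The sketch below assumes that "adjusted response" means the usual working response z = η + (y − μ)g′(μ), takes a Poisson model with log link and a single covariate, and uses unscaled adjusted residuals z − η; the function name and the toy data are hypothetical and are not taken from the paper.

```python
import math


def irls_poisson(x, y, n_iter=25):
    """Fit log(mu) = b0 + b1 * x by iteratively reweighted least squares."""
    # Start from the data; the 0.5 offset avoids log(0) for zero counts.
    eta = [math.log(yi + 0.5) for yi in y]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Adjusted response: z = eta + (y - mu) * g'(mu), with g'(mu) = 1 / mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        # Weights 1 / (V(mu) * g'(mu)^2) reduce to mu for the Poisson log link.
        w = mu
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        det = sw * swxx - swx ** 2
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
        # Adjusted fitted values are the updated linear predictor.
        eta = [b0 + b1 * xi for xi in x]
    mu = [math.exp(e) for e in eta]
    z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
    adj_resid = [zi - e for zi, e in zip(z, eta)]
    return b0, b1, eta, z, adj_resid, mu


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]
b0, b1, eta, z, adj_resid, mu = irls_poisson(x, y)
print(b0, b1)
```

At convergence the adjusted residuals are orthogonal to the columns of the design matrix in the weighted inner product, mirroring the orthogonality of ordinary residuals and fitted values in a linear model.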