Often, the functional form of covariate effects in an additive model varies across groups defined by levels of a categorical variable. This structure represents a factor-by-curve interaction. This article presents penalized spline models that incorporate factor-by-curve interactions into additive models. A mixed-model formulation for penalized splines allows for straightforward model fitting and smoothing parameter selection. We illustrate the proposed model by applying it to ragweed pollen data in which seasonal trends vary by year.
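A minimal sketch of the kind of model described, in my own notation rather than the article's: with a truncated-line spline basis and knots $\kappa_1 < \dots < \kappa_K$, a factor-by-curve interaction lets the smooth term depend on the group $g_i$ of observation $i$,

$$
y_i = f_{g_i}(x_i) + \varepsilon_i, \qquad
f_g(x) = \beta_{0g} + \beta_{1g}\,x + \sum_{k=1}^{K} u_{gk}\,(x - \kappa_k)_+,
$$

with $u_{gk} \sim N(0, \sigma^2_{u,g})$ and $\varepsilon_i \sim N(0, \sigma^2_\varepsilon)$. Treating the spline coefficients as random effects turns each group's smoothing parameter into the variance ratio $\sigma^2_\varepsilon / \sigma^2_{u,g}$, so smoothing parameter selection reduces to (RE)ML estimation in standard mixed-model software.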
We explore a hierarchical generalized latent factor model for discrete and bounded response variables, in particular binomial responses. Specifically, we develop a novel two-step estimation procedure, and the corresponding statistical inference, that is computationally efficient and scalable to high dimensions in terms of both the number of subjects and the number of features per subject. We also establish the validity of the estimation procedure, particularly the asymptotic properties of the estimated effect size and the latent structure, as well as of the estimated number of latent factors. The results are corroborated by a simulation study, and, for illustration, the proposed methodology is applied to analyze a dataset from a gene-environment association study.
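One concrete form such a model can take (an assumed specification for illustration, not necessarily the authors' exact one): for subject $i$ and feature $j$,

$$
Y_{ij} \sim \mathrm{Binomial}(n_{ij}, \pi_{ij}), \qquad
\mathrm{logit}(\pi_{ij}) = x_i^\top \beta_j + u_i^\top v_j,
$$

where $\beta_j$ collects the feature-specific effect sizes, $u_i \in \mathbb{R}^K$ is the subject's latent factor vector, $v_j$ is the corresponding loading, and the number of factors $K$ is itself to be estimated.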
A multilevel model for ordinal data in the generalized linear mixed models (GLMM) framework is developed to account for the inherent dependencies among observations within clusters. Motivated by a data set from the British Social Attitudes Panel Survey (BSAPS), random district effects and respondent effects are incorporated into the linear predictor to accommodate the nested clustering. The fixed (random) effects are estimated (predicted) by maximizing the penalized quasi-likelihood (PQL) function, whereas the variance component parameters are obtained via the restricted maximum likelihood (REML) estimation method. The model is employed to analyze the BSAPS data. Simulation studies are conducted to assess the performance of the estimators. (C) 2015 Elsevier B.V. All rights reserved.
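A cumulative-logit sketch of such a model, in my own notation: for respondent $j$ nested in district $i$, with the response at occasion $k$ falling in ordered category $c$,

$$
\mathrm{logit}\, P(Y_{ijk} \le c) = \theta_c - \big(x_{ijk}^\top \beta + u_i + v_{ij}\big), \qquad
u_i \sim N(0, \sigma_u^2), \quad v_{ij} \sim N(0, \sigma_v^2),
$$

with ordered thresholds $\theta_1 < \dots < \theta_{C-1}$. PQL maximizes a Laplace-type approximation to the integrated likelihood jointly in $\beta$ and the random effects, and REML is then applied to the working linear mixed model to estimate $\sigma_u^2$ and $\sigma_v^2$.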
In the Bayesian stochastic search variable selection framework, a common prior distribution for the regression coefficients is the g-prior of Zellner. However, there are two standard cases where the associated covariance matrix does not exist and the conventional prior of Zellner cannot be used: if the number of observations is lower than the number of variables (the large p, small n paradigm), or if some variables are linear combinations of others. In such situations, a prior distribution derived from the prior of Zellner can be considered by introducing a ridge parameter. This prior is a flexible and simple adaptation of the g-prior, and its influence on the selection of variables is studied. A simple way to choose the associated hyper-parameters is proposed. The method is valid for any generalized linear mixed model, and particular attention is paid to the study of probit mixed models when some variables are linear combinations of others. The method is applied to both simulated and real datasets obtained from Affymetrix microarray experiments. Results are compared to those obtained with the Bayesian Lasso. (c) 2011 Elsevier B.V. All rights reserved.
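A sketch of the ridge-type adaptation being described, under my own parameterization: for a candidate model $\gamma$ with design matrix $X_\gamma$, Zellner's g-prior $\beta_\gamma \sim N\big(0,\; g\,\sigma^2 (X_\gamma^\top X_\gamma)^{-1}\big)$ is unavailable when $X_\gamma^\top X_\gamma$ is singular, and one can instead use

$$
\beta_\gamma \sim N\!\big(0,\; g\,\sigma^2 (X_\gamma^\top X_\gamma + \lambda I)^{-1}\big), \qquad \lambda > 0,
$$

so that the covariance matrix always exists; the conventional g-prior is recovered as $\lambda \to 0$ whenever the inverse is defined.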
A method for modeling survival data with multilevel clustering is described. The Cox partial likelihood is incorporated into the generalized linear mixed model (GLMM) methodology. Parameter estimation is achieved by maximizing a log likelihood analogous to the likelihood associated with the best linear unbiased prediction (BLUP) at the initial step of estimation and is extended to obtain residual maximum likelihood (REML) estimators of the variance component. Estimating equations for a three-level hierarchical survival model are developed in detail, and such a model is applied to analyze a set of chronic granulomatous disease (CGD) data on recurrent infections as an illustration with both hospital and patient effects being considered as random. Only the latter gives a significant contribution. A simulation study is carried out to evaluate the performance of the REML estimators. Further extension of the estimation procedure to models with an arbitrary number of levels is also discussed.
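A sketch of the three-level proportional hazards structure, in my own notation: for a recurrence time of patient $j$ in hospital $i$,

$$
h_{ij}(t) = h_0(t)\,\exp\!\big(x_{ij}^\top \beta + u_i + v_{ij}\big), \qquad
u_i \sim N(0, \sigma_u^2), \quad v_{ij} \sim N(0, \sigma_v^2),
$$

where $u_i$ is the hospital-level and $v_{ij}$ the patient-level random effect (log-frailty). The BLUP-type objective at the initial estimation step is then the Cox partial log likelihood penalized by the normal log densities of the random effects, maximized jointly over $\beta$, $u$, and $v$.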
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical-link generalized linear models. In this paper, we adapt Lambert's methodology to an upper-bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed-effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper-bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
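To make the ZIB mixture concrete, here is a minimal fixed-effects sketch of its likelihood fitted by direct optimization (my own illustration, not the authors' code; the random-effects extension is omitted and all variable names are hypothetical):

```python
# Minimal fixed-effects zero-inflated binomial (ZIB) sketch -- illustrative only.
# logit(pi) = X beta governs the binomial counts; logit(p) = Z gamma governs the
# probability of a structural zero.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def zib_negloglik(theta, y, n, X, Z):
    """Negative log-likelihood of a ZIB model with parameters theta = (beta, gamma)."""
    k = X.shape[1]
    beta, gamma = theta[:k], theta[k:]
    pi = np.clip(expit(X @ beta), 1e-10, 1 - 1e-10)   # binomial success probability
    p = np.clip(expit(Z @ gamma), 1e-10, 1 - 1e-10)   # structural-zero probability
    log_binom = (gammaln(n + 1) - gammaln(y + 1) - gammaln(n - y + 1)
                 + y * np.log(pi) + (n - y) * np.log1p(-pi))
    ll = np.where(
        y == 0,
        np.log(p + (1 - p) * (1 - pi) ** n),  # zeros arise from either component
        np.log1p(-p) + log_binom,             # positive counts come from the binomial
    )
    return -ll.sum()

# Toy usage with simulated data.
rng = np.random.default_rng(0)
m, n_trials = 500, 10
X = np.column_stack([np.ones(m), rng.normal(size=m)])
Z = X.copy()
pi_true = expit(X @ np.array([-0.5, 1.0]))
p_true = expit(Z @ np.array([-1.0, 0.5]))
y = np.where(rng.random(m) < p_true, 0, rng.binomial(n_trials, pi_true))
n = np.full(m, n_trials)
fit = minimize(zib_negloglik, x0=np.zeros(4), args=(y, n, X, Z), method="BFGS")
print(np.round(fit.x, 2))  # estimates of (beta0, beta1, gamma0, gamma1)
```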
In epidemiological research, outcomes are frequently non-normal, sample sizes may be large, and effect sizes are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. I focus on binary outcomes, with the risk surface a smooth function of space, but the development herein is relevant for non-normal data in general. I compare penalized likelihood (PL) models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis (SB) representation of the spatial surface via the Fourier basis provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being reasonably computationally efficient. One of the contributions of this work is further development of this underused representation. The SB model outperforms the PL methods, which are prone to overfitting, but is slower to fit and not as easily implemented. A Bayesian Markov random field model performs less well statistically than the SB model, but is very computationally efficient. We illustrate the methods on a real data set of cancer cases in Taiwan. The success of the SB with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models. (c) 2006 Elsevier B.V. All rights reserved.
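One way to write the spectral-basis construction being advocated, as I understand it (a sketch, not the paper's exact notation): for a binary outcome at location $s_i$,

$$
\mathrm{logit}\, P(Y_i = 1) = x_i^\top \beta + g(s_i), \qquad
g(s) = \sum_{m} u_m \psi_m(s),
$$

where the $\psi_m$ are two-dimensional Fourier basis functions on a grid covering the study region and the coefficients $u_m$ have independent mean-zero normal priors with variances taken from the spectral density of a smooth covariance function, so the prior on $g$ approximates a stationary Gaussian process while FFT-based computations keep the MCMC updates fast.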
Disease mapping is an important area of statistical research. Contributions to the area over the last twenty years have been instrumental in helping to pinpoint potential causes of mortality and to provide a strategy for effective allocation of health funding. Because of the complexity of spatial analyses, new developments in methodology have not generally found application at Vital Statistics agencies. Inference for spatio-temporal analyses remains computationally prohibitive for the routine preparation of mortality atlases. This paper considers whether approximate methods of inference are reliable for mapping studies, especially in terms of providing accurate estimates of relative risks, ranks of regions, and standard errors of risks. These approximate methods lie in the broader realm of approximate inference for generalized linear mixed models. Penalized quasi-likelihood is specifically considered here. The main focus is on assessing how close the penalized quasi-likelihood estimates are to target values, by comparison with the more rigorous and widespread Bayesian Markov chain Monte Carlo methods. No previous studies have compared these two methods. The quantities of prime interest are small-area relative risks and the estimated ranks of the risks, which are often used for ordering the regions. It will be shown that penalized quasi-likelihood is a reasonably accurate method of inference and can be recommended as a simple yet quite precise method for initial exploratory studies. (C) 2005 Elsevier B.V. All rights reserved.
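For orientation, the generic small-area model that both PQL and MCMC are used to fit in this setting typically has the form (standard notation, not quoted from the paper)

$$
O_i \sim \mathrm{Poisson}(E_i\,\rho_i), \qquad
\log \rho_i = \mu + u_i + v_i,
$$

where $O_i$ and $E_i$ are the observed and expected counts in region $i$, $\rho_i$ is the relative risk, $v_i$ is an unstructured heterogeneity effect, and $u_i$ is a spatially structured (e.g. CAR) effect. PQL avoids integrating over $(u, v)$ by iterating a working linear mixed model, which is what makes it attractive for routine atlas production.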
The conditional autoregressive (CAR) model is widely used to describe the geographical distribution of a specific disease risk in lattice mapping. Successful developments based on frequentist and Bayesian procedures have been extensively applied to obtain two-stage disease risk predictions at the subregional level. Bayesian procedures are preferred for making inferences, as the posterior standard errors (SE) of the two-stage prediction account for the variability in the variance component estimates; however, some recent work based on frequentist procedures and the use of bootstrap adjustments for the SE has been undertaken. In this article we investigate the suitability of an analytical adjustment for disease risk inference that provides accurate interval predictions by using the penalized quasi-likelihood (PQL) technique to obtain model parameter estimates. The method is a first-order approximation of the naive SE based on a Taylor expansion and is interpreted as a conditional measure of variability, providing conditional calibrated prediction intervals given the data. We conduct a simulation study to demonstrate how the method can be used to estimate the specific subregion risk by interval. We evaluate the proposed methodology by analyzing the commonly used example data set of lip cancer incidence in the 56 counties of Scotland for the period 1975-1980. This evaluation reveals a close similarity between the solutions provided by the method proposed here and those of its fully Bayesian counterpart.
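For reference, a common (intrinsic) form of the CAR prior in lattice disease mapping is, in standard notation,

$$
u_i \mid u_{-i} \sim N\!\Big(\frac{1}{m_i}\sum_{j \sim i} u_j,\ \frac{\sigma_u^2}{m_i}\Big),
$$

where $j \sim i$ runs over the $m_i$ neighbouring counties of county $i$ and $u_i$ enters the log relative risk of county $i$. The analytical adjustment discussed above concerns the SE attached to the predicted $u_i$, inflating the naive value to reflect the fact that $\sigma_u^2$ itself has been estimated.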
Count data are usually modeled using the Poisson generalized linear model. The Poisson model requires that the variance be a deterministic function of the mean. This assumption may not be met for a particular data set, that is, the model may not adequately capture the variability in the data. The extra variability in the data may be accommodated using overdispersion models, such as the negative binomial distribution. In addition to the overdispersion, outliers may be present in the data, as indicated by the model residuals or some functions of the model residuals. A variance shift outlier model (VSOM) for count data is introduced. The model is used to detect potential outliers in the data and to down-weight them in the analysis if desired. In this model the overdispersion is modeled using an observation-specific random effect. The status of a given observation as an outlier is indicated by the size of the associated shift in variance for that observation. The model is then extended to longitudinal count data for the detection of outliers at the subject level. We illustrate the methodology using a real data set taken from the literature. Extensions of the VSOM for count data to other non-normal responses are discussed. (C) 2014 Elsevier B.V. All rights reserved.
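A sketch of the variance-shift idea for counts, in my own notation: for observation $i$,

$$
Y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = x_i^\top \beta + b_i, \qquad
b_i \sim N(0, \omega_i\,\sigma_b^2),
$$

where the observation-specific random effect $b_i$ captures the overdispersion and $\omega_i \ge 1$ is the variance-shift factor for observation $i$. An estimated $\omega_i$ well above 1 flags the observation as a potential outlier, and conditioning on the fitted shift down-weights its influence on $\beta$.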