Search results - Inner Mongolia University Library

MODELING HYBRID TRAITS FOR COMORBIDITY AND GENETIC STUDIES OF ALCOHOL AND NICOTINE CO-DEPENDENCE

ANNALS OF APPLIED STATISTICS, 2018, vol. 12, no. 4, pp. 2359-2378

Authors: Zhang, Heping; Liu, Dungang; Zhao, Jiwei; Bi, Xuan. Affiliations: Yale Sch Publ Hlth, Dept Biostat, New Haven, CT 06520, USA; Univ Cincinnati, Dept Operat Business Analyt & Informat Syst, Linder Coll Business, Cincinnati, OH 45221, USA; SUNY Buffalo, Dept Biostat, Buffalo, NY 14214, USA

We propose a novel multivariate model for analyzing hybrid traits and identifying genetic factors for comorbid conditions. Comorbidity is a common phenomenon in mental health in which an individual suffers from multiple disorders simultaneously. For example, in the Study of Addiction: Genetics and Environment (SAGE), alcohol and nicotine addiction were recorded through multiple assessments that we refer to as hybrid traits. Statistical inference for studying the genetic basis of hybrid traits has not been well developed. Recent rank-based methods have been utilized for conducting association analyses of hybrid traits but do not inform the strength or direction of effects. To overcome this limitation, a parametric modeling framework is imperative. Although such parametric frameworks have been proposed in theory, they are neither well developed nor extensively used in practice due to their reliance on complicated likelihood functions that have high computational complexity. Many existing parametric frameworks tend to instead use pseudo-likelihoods to reduce computational burdens. Here, we develop a model fitting algorithm for the full likelihood. Our extensive simulation studies demonstrate that inference based on the full likelihood can control the type-I error rate, and gains power and improves the effect size estimation when compared with several existing methods for hybrid models. These advantages remain even if the distribution of the latent variables is misspecified. After analyzing the SAGE data, we identify three genetic variants (rs7672861, rs958331, rs879330) that are significantly associated with the comorbidity of alcohol and nicotine addiction at the chromosome-wide level. Moreover, our approach has greater power in this analysis than several existing methods for hybrid *** the analysis of the SAGE data motivated us to develop the model, it can be broadly applied to analyze any hybrid responses.

Keywords: Comorbidity; association; EM algorithm; latent variable; ordinal outcome


Inverse regression approach to robust nonlinear high-to-low dimensional mapping


JOURNAL OF MULTIVARIATE ANALYSIS, 2018, vol. 163, pp. 1-14

Authors: Perthame, Emeline; Forbes, Florence; Deleforge, Antoine. Affiliations: Univ Grenoble Alpes, CNRS, Inria, LJK, F-38000 Grenoble, France; Inria, Rennes, France

The goal of this paper is to address the issue of nonlinear regression with outliers, possibly in high dimension, without specifying the form of the link function and under a parametric approach. Nonlinearity is handled via an underlying mixture of affine regressions. Each regression is encoded in a joint multivariate Student distribution on the responses and covariates. This joint modeling allows the use of an inverse regression strategy to handle the high dimensionality of the data, while the heavy tail of the Student distribution limits the contamination by outlying data. The possibility to add a number of latent variables similar to factors to the model further reduces its sensitivity to noise or model misspecification. The mixture model setting has the advantage of providing a natural inference procedure using an EM algorithm. The tractability and flexibility of the algorithm are illustrated in simulations and real high-dimensional data with good performance that compares favorably with other existing methods. (C) 2017 Elsevier Inc. All rights reserved.

Keywords: EM algorithm; Inverse regression; Mixture of regressions; Nonlinear regression; High dimension; Robust regression; Student distribution


Finding the Number of Normal Groups in Model-Based Clustering via Constrained Likelihoods


JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2018, vol. 27, no. 2, pp. 404-416

Authors: Cerioli, Andrea; Angel Garcia-Escudero, Luis; Mayo-Iscar, Agustin; Riani, Marco. Affiliations: Univ Parma, Dipart Sci Econ & Aziendali, Parma, Italy; Univ Valladolid, Dept Estadist & IO, Valladolid, Spain; Univ Valladolid, IMUVA, Valladolid, Spain

Deciding the number of clusters k is one of the most difficult problems in cluster analysis. For this purpose, complexity-penalized likelihood approaches have been introduced in model-based clustering, such as the well-known Bayesian information criterion and integrated complete likelihood criteria. However, the classification/mixture likelihoods considered in these approaches are unbounded without any constraint on the cluster scatter matrices. Constraints also prevent traditional EM and CEM algorithms from being trapped in (spurious) local maxima. Controlling the maximal ratio between the eigenvalues of the scatter matrices to be smaller than a fixed constant c >= 1 is a sensible idea for setting such constraints. A new penalized likelihood criterion which takes into account the higher model complexity that a higher value of c entails is proposed. Based on this criterion, a novel and fully automated procedure, leading to a small ranked list of optimal (k, c) couples is provided. A new plot called "car-bike," which provides a concise summary of the solutions, is introduced. The performance of the procedure is assessed both in empirical examples and through a simulation study as a function of cluster overlap. Supplementary materials for the article are available online.

Keywords: BIC; CEM algorithm; Clustering; EM algorithm; ICL; Mixtures


Numerical approximation of the observed information matrix with Oakes' identity


BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY, 2018, vol. 71, no. 3, pp. 415-436

Authors: Chalmers, R. Philip. Affiliation: Univ Georgia, Dept Educ Psychol, 323 Aderhold Hall, Athens, GA 30602, USA

An efficient and accurate numerical approximation methodology useful for obtaining the observed information matrix and subsequent asymptotic covariance matrix when fitting models with the EM algorithm is presented. The numerical approximation approach is compared to existing algorithms intended for the same purpose, and the computational benefits and accuracy of this new approach are highlighted. Instructive and real-world examples are included to demonstrate the methodology concretely, properties of the estimator are discussed in detail, and a Monte Carlo simulation study is included to investigate the behaviour of a multi-parameter item response theory model using three competing finite-difference algorithms.

Keywords: EM algorithm; supplemented EM; observed information; Oakes's identity; item response theory; finite differences


Improved mixed model for longitudinal data analysis using shrinkage method


MATHEMATICAL SCIENCES, 2018, vol. 12, no. 4, pp. 305-312

Authors: Rahmani, M.; Arashi, M.; Mamode Khan, N.; Sunecher, Y. Affiliations: Shahrood Univ Technol, Shahrood, Iran; Univ Mauritius, Reduit, Mauritius; Univ Technol Mauritius, Pointe Aux Sables, Mauritius

The problem of multicollinearity among predictor variables is a frequent issue in longitudinal data analysis. In this context, this paper proposes a mixed ridge regression model via shrinkage methods to analyze such data. Furthermore, in view of obtaining more efficient estimators, we propose preliminary and Stein-type estimators using prior information for fixed-effects parameters. The model parameters are estimated via the EM algorithm. A simulation study is also presented to assess the performance of the estimators under different estimation methods. An application to the HIV data is also illustrated.

Keywords: EM algorithm; Longitudinal data; Mixed model; Preliminary test; Stein estimation; Ridge regression; 65C60; 62J12; 62H12; 62J20; 62J10


Simultaneous estimation of QTL parameters for mapping multiple traits


JOURNAL OF GENETICS, 2018, vol. 97, no. 1, pp. 267-274

Authors: Tong, Liang; Sun, Xiaoxia; Zhou, Ying. Affiliations: Heilongjiang Univ, Sch Math Sci, Harbin 150080, Heilongjiang, Peoples R China; Suihua Univ, Sch Informat Engn, Suihua 152061, Peoples R China

The analysis of quantitative trait loci (QTLs) aims at mapping and estimating the positions and effects of the genes that may affect the quantitative trait, and evaluating the relationship between the gene variation and the phenotype. In existing studies, most methods mainly focus on the association/linkage between multiple gene loci and one trait, in which some useful joint information of multiple traits may be ignored. In this paper, we propose a method of simultaneously estimating all QTL parameters in the framework of multiple-trait multiple-interval mapping. Simulation results show that, in terms of accuracy, the proposed method outperforms an existing method for mapping multiple traits. A real example is also provided to validate the performance of the new method.

Keywords: EM algorithm; estimation; multiple-interval mapping; recombination rate


Multivariate measurement error models based on Student-t distribution under censored responses


STATISTICS, 2018, vol. 52, no. 6, pp. 1395-1416

Authors: Matos, Larissa A.; Castro, Luis M.; Cabral, Celso R. B.; Lachos, Victor H. Affiliations: Univ Estadual Campinas, IMECC, Dept Stat, BR-13083859 Campinas, SP, Brazil; Pontificia Univ Catolica Chile, Dept Stat, Santiago, Chile; Univ Fed Amazonas, Dept Stat, Manaus, Amazonas, Brazil; Univ Connecticut, Dept Stat, Storrs, CT 06269, USA

Measurement error models constitute a wide class of models that include linear and nonlinear regression models. They are very useful to model many real-life phenomena, particularly in the medical and biological areas. The great advantage of these models is that, in some sense, they can be represented as mixed effects models, allowing us to implement well-known techniques, like the EM algorithm, for the parameter estimation. In this paper, we consider a class of multivariate measurement error models where the observed response and/or covariate are not fully observed, i.e., the observations are subject to certain threshold values below or above which the measurements are not quantifiable. Consequently, these observations are considered censored. We assume a Student-t distribution for the unobserved true values of the mismeasured covariate and the error term of the model, providing a robust alternative for parameter estimation. Our approach relies on a likelihood-based inference using an EM-type algorithm. The proposed method is illustrated through some simulation studies and the analysis of an AIDS clinical trial dataset.

Keywords: Censored responses; EM algorithm; measurement error models; Student-t distribution


A change-point model for detecting heterogeneity in ordered survival responses


STATISTICAL METHODS IN MEDICAL RESEARCH, 2018, vol. 27, no. 12, pp. 3595-3611

Authors: Bouaziz, Olivier; Nuel, Gregory. Affiliations: Univ Paris 05, Lab MAP5, Paris, France; CNRS, Sorbonne Paris Cite, Paris, France; Univ Paris 06, Sorbonne Univ, CNRS 7599, LPMA, Paris, France

In this article, we suggest a new statistical approach considering survival heterogeneity as a breakpoint model in an ordered sequence of time-to-event variables. The survival responses need to be ordered according to a numerical covariate. Our estimation method will aim at detecting heterogeneity that could arise through the ordering covariate. We formally introduce our model as a constrained Hidden Markov Model, where the hidden states are the unknown segmentation (breakpoint locations) and the observed states are the survival responses. We derive an efficient Expectation-Maximization framework for maximizing the likelihood of this model for a wide range of baseline hazard forms (parametric or nonparametric). The posterior distribution of the breakpoints is also derived, and the selection of the number of segments using a penalized likelihood criterion is discussed. The performance of our survival breakpoint model is finally illustrated on a diabetes dataset where the observed survival times are ordered according to the calendar time of disease onset.

Keywords: Constrained HMM; Cox model; EM algorithm; heterogeneity; survival analysis


Estimating a network from multiple noisy realizations


ELECTRONIC JOURNAL OF STATISTICS, 2018, vol. 12, no. 2, pp. 4697-4740

Authors: Le, Can M.; Levin, Keith; Levina, Elizaveta. Affiliations: Univ Calif Davis, Dept Stat, Davis, CA 95616, USA; Univ Michigan, Dept Stat, Ann Arbor, MI 48109, USA

Complex interactions between entities are often represented as edges in a network. In practice, the network is often constructed from noisy measurements and inevitably contains some errors. In this paper we consider the problem of estimating a network from multiple noisy observations where edges of the original network are recorded with both false positives and false negatives. This problem is motivated by neuroimaging applications where brain networks of a group of patients with a particular brain condition could be viewed as noisy versions of an unobserved true network corresponding to the disease. The key to optimally leveraging these multiple observations is to take advantage of network structure, and here we focus on the case where the true network contains communities. Communities are common in real networks in general and in particular are believed to be present in brain networks. Under a community structure assumption on the truth, we derive an efficient method to estimate the noise levels and the original network, with theoretical guarantees on the convergence of our estimates. We show on synthetic networks that the performance of our method is close to an oracle method using the true parameter values, and apply our method to fMRI brain data, demonstrating that it constructs stable and plausible estimates of the population network.

Keywords: Noisy networks; stochastic block model; brain networks; EM algorithm


Joint regression modeling for missing categorical covariates in generalized linear models


JOURNAL OF APPLIED STATISTICS, 2018, vol. 45, no. 15, pp. 2741-2759

Authors: Carlos Perez-Ruiz, Luis; Escarela, Gabriel. Affiliation: Univ Autonoma Metropolitana Iztapalapa, Dept Matemat, Av San Rafael Atlixco 186, Mexico City 09340, DF, Mexico

Missing covariate data is a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-cases and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and non-missing at random mechanism scenarios. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

Keywords: Copula; missing data; vines; pair copula constructions; EM algorithm; multivariate distribution

