Search results - Inner Mongolia University Library

Incomplete graphical model inference via latent tree aggregation

STATISTICAL MODELLING, 2019, vol. 19, no. 5, pp. 545-568

Authors: Robin, Genevieve; Ambroise, Christophe; Robin, Stephane. Affiliations: Ecole Polytech Ctr Math Appl UMR 7641 X POP INRIA Palaiseau France; Univ Evry Val dEssonne Lab Math & Modelisat Evry Univ Paris Saclay Evry France; Univ Paris Saclay INRA Math & Informat Appl Paris AgroParisTech Paris France

Graphical network inference is used in many fields such as genomics or ecology to infer the conditional independence structure between variables, from measurements of gene expression or species abundances for instance. In many practical cases, not all variables involved in the network have been observed, and the samples are actually drawn from a distribution where some variables have been marginalized out. This challenges the sparsity assumption commonly made in graphical model inference, since marginalization yields locally dense structures, even when the original network is sparse. We present a procedure for inferring Gaussian graphical models when some variables are unobserved, that accounts both for the influence of missing variables and the low density of the original network. Our model is based on the aggregation of spanning trees, and the estimation procedure on the expectation-maximization algorithm. We treat the graph structure and the unobserved nodes as missing variables and compute posterior probabilities of edge appearance. To provide a complete methodology, we also propose several model selection criteria to estimate the number of missing nodes. A simulation study and an illustration on flow cytometry data reveal that our method has favourable edge detection properties compared to existing graph inference techniques. The methods are implemented in an R package.

Keywords: Gaussian graphical model; latent variables; EM algorithm; model selection

A hierarchical independent component analysis model for longitudinal neuroimaging studies

NEUROIMAGE, 2019, vol. 189, pp. 380-400

Authors: Wang, Yikai; Guo, Ying. Affiliation: Emory Univ Dept Biostat & Bioinformat Rollins Sch Publ Hlth Atlanta GA 30322 USA

In recent years, longitudinal neuroimaging studies have become increasingly popular in neuroscience research to investigate disease-related changes in brain functions, to study neurodevelopment or to evaluate treatment effects on neural processing. One of the important goals in longitudinal imaging analysis is to study changes in brain functional networks across time and how the changes are modulated by subjects' clinical or demographic variables. In the current neuroscience literature, one of the most commonly used tools to extract and characterize brain functional networks is independent component analysis (ICA), which separates multivariate signals into a linear mixture of independent components. However, existing ICA methods are only applicable to cross-sectional studies and are not suited for modeling repeatedly measured imaging data. In this paper, we propose a novel longitudinal independent component model (L-ICA) which provides a formal modeling framework for extending ICA to longitudinal studies. By incorporating subject-specific random effects and visit-specific covariate effects, L-ICA is able to provide more accurate estimates of changes in brain functional networks on both the population and individual level, borrow information across repeated scans within the same subject to increase statistical power in detecting covariate effects on the networks, and allow for model-based prediction of brain network changes caused by disease progression, treatment or neurodevelopment. We develop a fully traceable exact EM algorithm to obtain maximum likelihood estimates of L-ICA. We further develop a subspace-based approximate EM algorithm which greatly reduces the computation time while still retaining high accuracy. Moreover, we present a statistical testing procedure for examining covariate effects on brain network changes. Simulation results demonstrate the advantages of our proposed methods. We apply L-ICA to ADNI2 study to investigate changes in brain functional networks

Keywords: fMRI; ICA; Longitudinal imaging study; Blind source separation; Brain functional networks; EM algorithm

Mixtures of Gaussian copula factor analyzers for clustering high dimensional data

JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2019, vol. 48, no. 3, pp. 480-492

Authors: Zhang, Lili; Baek, Jangsun. Affiliation: Chonnam Natl Univ Dept Stat Gwangju 61186 South Korea

Mixtures of factor analyzers is a useful model-based clustering method which can avoid the curse of dimensionality in high-dimensional clustering. However, this approach is sensitive to both diverse non-normalities of marginal variables and outliers, which are commonly observed in multivariate experiments. We propose mixtures of Gaussian copula factor analyzers (MGCFA) for high-dimensional clustering. This model has two advantages: (1) it allows different marginal distributions, which makes the mixture model more flexible to fit, and (2) it can avoid the curse of dimensionality by embedding the factor-analytic structure in the component-correlation matrices of the mixture distribution. An EM algorithm is developed for the fitting of MGCFA. The proposed method is free of the curse of dimensionality and allows any parametric marginal distribution which fits best to the data. It is applied to both synthetic data and a microarray gene expression data set for clustering and shows better performance than several existing methods. (C) 2018 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.

Keywords: Copula; EM algorithm; Factor analyzers; Gaussian copula factor analyzers; Model-based clustering
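
As a generic illustration of the EM iteration that the abstracts in this list rely on, here is a minimal sketch for a two-component one-dimensional Gaussian mixture. It is not the MGCFA fitting procedure from the article above: there are no copulas and no factor-analytic structure, and the function names, the initialisation and the variance floor are assumptions made only for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (weight of component 0, means, variances, log-likelihood trace).
    """
    n = len(data)
    # Crude initialisation (an assumption of this sketch): means at the extremes.
    means = [min(data), max(data)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [var_all, var_all]
    weight = 0.5
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 0.
        resp = []
        loglik = 0.0
        for x in data:
            p0 = weight * normal_pdf(x, means[0], variances[0])
            p1 = (1.0 - weight) * normal_pdf(x, means[1], variances[1])
            resp.append(p0 / (p0 + p1))
            loglik += math.log(p0 + p1)
        trace.append(loglik)
        # M-step: responsibility-weighted maximum likelihood updates.
        n0 = sum(resp)
        n1 = n - n0
        weight = n0 / n
        means = [
            sum(r * x for r, x in zip(resp, data)) / n0,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n1,
        ]
        # The small floor guards against a collapsing component (assumption).
        variances = [
            max(sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n0, 1e-6),
            max(sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n1, 1e-6),
        ]
    return weight, means, variances, trace


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(5.0, 1.0) for _ in range(150)]
    w, mu, var, ll = em_two_gaussians(sample)
    print("weight:", w, "means:", mu, "variances:", var)
```

The log-likelihood trace is returned so that the monotone increase expected from EM can be checked on simulated data.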

Inference and optimal design of multiple constant-stress testing for generalized half-normal distribution under type-II progressive censoring

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, vol. 89, no. 16, pp. 3075-3104

Author: Abd El-Raheem, A. M. Affiliation: Ain Shams Univ Fac Educ Dept Math Cairo Egypt

The generalized half-normal (GHN) distribution and progressive type-II censoring are considered in this article for studying some statistical inferences of constant-stress accelerated life testing. The EM algorithm is used to calculate the maximum likelihood estimates. The Fisher information matrix is formed from the missing information law and is used to construct asymptotic confidence intervals. Further, interval estimation is discussed through bootstrap intervals. The Tierney and Kadane method, an importance sampling procedure and the Metropolis-Hastings algorithm are used to compute Bayesian estimates. Furthermore, predictive estimates for censored data and the related prediction intervals are obtained. We consider three optimality criteria to find the optimal stress level. A real data set is used to illustrate the importance of the GHN distribution as an alternative lifetime model to well-known distributions. Finally, a simulation study is provided with discussion.

Keywords: Accelerated life testing; Bayes estimation; Bayesian prediction; EM algorithm; generalized half-normal distribution; optimal stress level; simulation study; type-II progressive censoring

Semiparametric estimation for the non-mixture cure model in case-cohort and nested case-control studies

COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, vol. 144, article 106874

Authors: Han, Bo; Wang, Xiaoguang. Affiliation: Dalian Univ Technol Sch Math Sci Dalian 116024 Liaoning Peoples R China

Case-cohort and nested case-control designs are widely used strategies to reduce the costs of covariate measurements in epidemiological cohort studies. A unified likelihood framework for the two cohort designs is constructed and two statistical procedures are presented for making inference about the effects of incomplete covariates on the cumulative incidence of clinical event time. A pseudo-maximum likelihood estimation based on the sieve method is developed for the semiparametric non-mixture cure model, which can handle missing covariates and a cure fraction occurring in censored survival data. The resulting estimators are shown to be consistent and asymptotically normal in both case-cohort and nested case-control studies. In addition, for the two cohort designs, an expectation-maximization (EM) algorithm is developed to simplify the maximization of the likelihood function with the Bernstein-based smoothing technique. Such a procedure would allow one to estimate the nonparametric component of the semiparametric model in closed form and relieve the computational burden. Simulation studies demonstrate that the proposed estimators have good properties in practical situations, and a motivating application to real data is provided to illustrate the methodology. (C) 2019 Elsevier B.V. All rights reserved.

Keywords: Case-cohort; Nested case-control; Non-mixture cure model; Pseudo-maximum likelihood estimation; EM algorithm

Simultaneous Variable and Covariance Selection With the Multivariate Spike-and-Slab LASSO

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2019, vol. 28, no. 4, pp. 921-931

Authors: Deshpande, Sameer K.; Rockova, Veronika; George, Edward I. Affiliations: MIT Comp Sci & Artificial Intelligence Lab 77 Massachusetts Ave Cambridge MA 02139 USA; Univ Chicago Booth Sch Business Dept Econometr & Stat Chicago IL 60637 USA; Univ Penn Dept Stat Philadelphia PA 19104 USA

We propose a Bayesian procedure for simultaneous variable and covariance selection using continuous spike-and-slab priors in multivariate linear regression models where q possibly correlated responses are regressed onto p predictors. Rather than relying on a stochastic search through the high-dimensional model space, we develop an ECM algorithm similar to the EMVS procedure of Rockova and George targeting modal estimates of the matrix of regression coefficients and residual precision matrix. Varying the scale of the continuous spike densities facilitates dynamic posterior exploration and allows us to filter out negligible regression coefficients and partial covariances gradually. Our method is seen to substantially outperform regularization competitors on simulated data. We demonstrate our method with a re-examination of data from a recent observational study of the effect of playing high school football on several later-life cognitive, psychological, and socio-economic outcomes. An R package, scripts for replicating examples in this article, and results from further simulation studies are available online.

Keywords: Bayesian shrinkage; EM algorithm; Gaussian graphical modeling; Multivariate regression; Nonconvex optimization

Heteroscedastic and heavy-tailed regression with mixtures of skew Laplace normal distributions

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, vol. 89, no. 17, pp. 3213-3240

Authors: Dogru, Fatma Zehra; Yu, Keming; Arslan, Olcay. Affiliations: Giresun Univ Dept Stat Fac Arts & Sci TR-28200 Giresun Turkey; Brunel Univ Coll Engn Design & Phys Sci Dept Math London England; Ankara Univ Dept Stat Fac Sci Ankara Turkey

Jointly modelling skewness and heterogeneity is challenging in data analysis, particularly in regression analysis which allows a random probability distribution to change flexibly with covariates. This paper, based on a skew Laplace normal (SLN) mixture of location, scale, and skewness, introduces a new regression model which provides flexible modelling of location, scale and skewness parameters simultaneously. The maximum likelihood (ML) estimators of all parameters of the proposed model are obtained via the expectation-maximization (EM) algorithm, and their asymptotic properties are derived. Numerical analyses via a simulation study and a real data example are used to illustrate the performance of the proposed model.

Keywords: EM algorithm; joint location, scale and skewness models; mixture model; ML estimation; SLN; SN

Estimating the earthquake occurrence rates in Corinth Gulf (Greece) through Markovian arrival process modeling

JOURNAL OF APPLIED STATISTICS, 2019, vol. 46, no. 6, pp. 995-1020

Authors: Bountzis, P.; Papadimitriou, E.; Tsaklidis, G. Affiliations: Aristotle Univ Thessaloniki Dept Geophys Thessaloniki Greece; Aristotle Univ Thessaloniki Dept Math Thessaloniki Greece

The Markovian Arrival Process (MAP) is applied as a candidate model to describe the time-varying earthquake activity in Corinth Gulf, Greece. To the best of our knowledge, this is the first attempt to study the earthquake temporal evolution with the specific class of MAPs. A complete catalogue is used for the earthquake temporal distribution investigation, along with data sets of different magnitude cutoffs. The study area is divided into its western and eastern subareas, and possible variations in the earthquake occurrence times were sought. Hidden states of MAPs correspond to different levels of seismicity, and hence various numbers of states are examined. Akaike and Bayes information criteria are implemented for identifying the best model, and comparison to the most known and broadly accepted theoretical interevent time distributions is provided. In all cases, the fitted MAPs with phase-type distributed interarrival times outperform the models with other distributions. Important indicators of the underlying Markov process are computed, and the earthquake frequency is approximated by the counting process. The analysis demonstrates a high index of burstiness for the earthquake generation in the eastern part, i.e. long quiescent periods alternate with short ones of intense seismic activity.

Keywords: Markov arrival process; earthquake occurrence rates; background seismic activity; Corinth Gulf; EM algorithm
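
The abstract above selects the number of hidden states with the Akaike and Bayes information criteria. A minimal sketch of that kind of comparison follows, using the standard definitions AIC = 2k - 2 log L and BIC = k log n - 2 log L. The candidate names, log-likelihoods, parameter counts and sample size are made-up values for illustration only and are not taken from the article; `select_model` is a hypothetical helper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayes information criterion: k log n - 2 log L (smaller is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_model(candidates, n_obs):
    """Return the names of the best candidates under AIC and under BIC.

    candidates maps a model name to (maximised log-likelihood, number of free parameters).
    """
    best_aic = min(candidates, key=lambda name: aic(*candidates[name]))
    best_bic = min(candidates, key=lambda name: bic(*candidates[name], n_obs))
    return best_aic, best_bic


# Hypothetical fitted models (illustrative numbers only).
candidates = {
    "2 states": (-1250.0, 6),
    "3 states": (-1240.0, 15),
    "4 states": (-1238.5, 28),
}

if __name__ == "__main__":
    print(select_model(candidates, n_obs=500))
```

With these made-up inputs the two criteria disagree, which shows why both are usually reported: BIC penalises additional parameters more heavily than AIC once the sample size is moderately large.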

Estimation of the additive hazards model with case K interval-censored failure time data in the presence of informative censoring

COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, vol. 144, article 106891

Authors: Wang, Shuying; Wang, Chunjie; Wang, Peijie; Sun, Jianguo. Affiliations: Changchun Univ Technol Sch Math & Stat Changchun 130012 Peoples R China; Jilin Univ Sch Math Ctr Appl Stat Res Changchun 130012 Peoples R China; Univ Missouri Dept Stat Columbia MO 65211 USA

The additive hazards model is one of the most commonly used models in regression analysis of failure time data, and many estimation procedures have been developed for its inference under various situations (Kalbfleisch and Prentice, 2002; Lin and Ying, 1994; Sun, 2006). In this paper, we consider a situation, case K interval-censored data with informative interval censoring, that often occurs in practice, for example in medical follow-up studies, but has not been discussed much in the literature due to the difficulties involved. For this problem, a joint model is proposed to describe the correlation between the failure time of interest and the underlying censoring or observation process, and a sieve maximum likelihood approach is developed. In particular, an EM algorithm is presented for the implementation of the proposed estimation procedure and the asymptotic properties of the resulting estimators are established. A simulation study is conducted to assess the finite sample performance of the proposed method and suggests that it works well for practical situations. The method is also applied to the AIDS study that motivated this work. (C) 2019 Elsevier B.V. All rights reserved.

Keywords: Case K interval-censored data; EM algorithm; Informative censoring; Sieve maximum likelihood estimation

Influence measures for the Waring regression model

BRAZILIAN JOURNAL OF PROBABILITY AND STATISTICS, 2019, vol. 33, no. 2, pp. 402-424

Authors: Rivas, Luisa; Galea, Manuel. Affiliations: Univ Concepcion Dept Estadist Concepcion Chile; Pontificia Univ Catolica Chile Dept Estadist Santiago Chile

In this paper, we present a regression model where the response variable is a count that follows a Waring distribution. The Waring regression model allows for analysis of phenomena where the Geometric regression model is inadequate, because the probability of success on each trial, p, is different for each individual and p has an associated distribution. Estimation is performed by maximum likelihood, through the maximization of the Q-function using the EM algorithm. Diagnostic measures are calculated for this model. To illustrate the results, an application to real data is presented. Some specific details are given in the Appendix of the paper.

Keywords: EM algorithm; beta-geometric distribution; generalized Cook's distance; appropriate perturbation; global and local influence
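
The mixing idea in the abstract above, a geometric count whose success probability p itself has a distribution, can be made concrete with a short worked equation. The sketch below assumes the geometric law counts failures before the first success and that p follows a Beta(a, b) distribution; the article may use a different parameterization of the Waring distribution, so the symbols a and b are assumptions of this example.

```latex
% Assumed setup: Y | p ~ Geometric(p) on y = 0, 1, 2, ..., and p ~ Beta(a, b).
\begin{align*}
P(Y = y \mid p) &= p\,(1-p)^{y}, \\
P(Y = y) &= \int_0^1 p\,(1-p)^{y}\,
            \frac{p^{a-1}(1-p)^{b-1}}{B(a,b)}\,dp
          = \frac{B(a+1,\; b+y)}{B(a,b)}, \qquad y = 0, 1, 2, \dots
\end{align*}
```

Here B denotes the beta function. Treating each individual's p as unobserved is what makes an EM formulation natural for this kind of model.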
