Search Results - Inner Mongolia University Library

A LATENT MIXTURE MODEL FOR HETEROGENEOUS CAUSAL MECHANISMS IN MENDELIAN RANDOMIZATION

ANNALS OF APPLIED STATISTICS, 2024, Vol. 18, No. 2, pp. 966-990

Authors: Long, Daniel; Zhao, Qingyuan; Chen, Yang. Affiliations: Univ Michigan, Dept Stat, Ann Arbor, MI 48109, USA; Univ Cambridge, Stat Lab, Cambridge, England

Mendelian randomization (MR) is a popular method in epidemiology and genetics that uses genetic variation as instrumental variables for causal inference. Existing MR methods usually assume most genetic variants are valid instrumental variables that identify a common causal effect. There is a general lack of awareness that this effect homogeneity assumption can be violated when there are multiple causal pathways involved, even if all the instrumental variables are valid. In this article we introduce a latent mixture model, MR-Path, that groups instruments that yield similar causal effect estimates together. We develop a Monte Carlo EM algorithm to fit this mixture model, derive approximate confidence intervals for uncertainty quantification, and adopt a modified Bayesian Information Criterion (BIC) for model selection. We verify the efficacy of the Monte Carlo EM algorithm, confidence intervals, and model selection criterion using numerical simulations. We identify potential mechanistic heterogeneity when applying our method to estimate the effect of high-density lipoprotein cholesterol on coronary heart disease and the effect of adiposity on type II diabetes.

Keywords: Causal inference; instrumental variables; EM algorithm; Monte Carlo sampling; HDL cholesterol; diabetes

Robust mixture of linear mixed modeling via multivariate Laplace distribution

COMPUTATIONAL STATISTICS, 2025, pp. 1-22

Authors: Li, Xiongya; Bai, Xiuqin; Song, Weixing. Affiliations: Kansas State Univ, Dept Stat, Manhattan, KS 66506, USA

The assumption of normality in random effects and regression errors is the primary cause of the lack of robustness in the maximum likelihood estimation procedure for linear mixed models. In this paper, we introduce a robust method for estimating regression parameters in these models by positing that the random effects and regression errors follow a multivariate Laplace distribution. This new methodology, implemented via an EM algorithm, is computationally more efficient than the existing robust t procedure in the literature. Simulation studies suggest that the performance of the proposed estimation method in finite samples either surpasses or is at least on par with the robust t procedure.

Keywords: Mixture of linear mixed models; Robustness; Multivariate Laplace distribution; EM algorithm

Penalized composite likelihood estimation for hidden Markov models with unknown number of states

STATISTICS & PROBABILITY LETTERS, 2025, Vol. 216

Authors: Lin, Yong; Huang, Mian. Affiliations: Shanghai Univ Finance & Econ, Sch Stat & Management, Shanghai 200433, Peoples R China

Estimating hidden Markov models (HMMs) with an unknown number of states is a challenging task. In this paper, we propose a new penalized composite likelihood approach for simultaneously estimating both the number of states and the parameters in an overfitted HMM. We prove the order selection consistency and asymptotic normality of the resultant estimator. Simulation studies and an application demonstrate the finite sample performance of the proposed method.

Keywords: Hidden Markov models; Order selection; Penalized composite likelihood; EM algorithm

A latent class pattern mixture model for nonignorable nonresponses in multivariate categorical data

COMPUTATIONAL STATISTICS, 2025, pp. 1-31

Authors: Lee, Jungwun; Sieger, Margaret Lloyd; Phillips, Jon D. Affiliations: Boston Univ, Sch Publ Hlth, Dept Biostat, 801 Massachusetts Ave, Boston, MA 02118, USA; Univ Kansas, Sch Med, Dept Populat Hlth, 3901 Rainbow Blvd, Kansas City, KS 66160, USA; Univ Connecticut, Sch Social Work, 38 Prospect St, Hartford, CT 06103, USA

Survey data using categorical item variables are widely used in applied research such as psychology, education, and behavioral studies. Unfortunately, survey data are highly susceptible to nonignorable missing values that may threaten the validity of statistical inference if naively ignored or inappropriately treated. This paper proposes a novel latent pattern mixture model for nonignorable missing values in multivariate categorical outcomes. The proposed model posits the existence of two categorical latent variables: one latent variable represents a nonresponse pattern, and the other represents a response pattern conditional on the nonresponse pattern. We propose two parameter estimation strategies: maximum-likelihood (ML) estimation using the expectation-maximization algorithm and Bayesian estimation using a Markov chain Monte Carlo algorithm. Simulation studies revealed that ML estimation is preferred to Bayesian estimation with noninformative priors in terms of standardized biases when the sample size is large, whereas Bayesian estimation can be preferred when the sample size is small. Finally, our real data example analyzed a data set on parental substance use disorder and revealed six latent classes of participants that are distinguished by response and missingness patterns.

Keywords: EM algorithm; Bayesian inference; Missing not at random; Latent class analysis; Parental substance use disorder

Markov-switching decision trees

ASTA-ADVANCES IN STATISTICAL ANALYSIS, 2024, Vol. 108, No. 2, pp. 461-476

Authors: Adam, Timo; Oetting, Marius; Michels, Rouven. Affiliations: Univ Copenhagen, Dept Math Sci, Univ Pk 5, DK-2100 Copenhagen, Denmark; Bielefeld Univ, Fac Business Adm & Econ, Univ str 25, D-33615 Bielefeld, Germany

Decision trees constitute a simple yet powerful and interpretable machine learning tool. While tree-based methods are designed only for cross-sectional data, we propose an approach that combines decision trees with time series modeling and thereby bridges the gap between machine learning and statistics. In particular, we combine decision trees with hidden Markov models where, for any time point, an underlying (hidden) Markov chain selects the tree that generates the corresponding observation. We propose an estimation approach that is based on the expectation-maximisation algorithm and assess its feasibility in simulation experiments. In our real-data application, we use eight seasons of National Football League (NFL) data to predict play calls conditional on covariates, such as the current quarter and the score, where the model's states can be linked to the teams' strategies. R code that implements the proposed method is available on GitHub.

Keywords: Decision trees; EM algorithm; Hidden Markov models; Time series modeling

Bayesian mixture modelling with ranked set samples

STATISTICS IN MEDICINE, 2024, Vol. 43, No. 19, pp. 3723-3741

Authors: Alvandi, Amirhossein; Omidvar, Sedigheh; Hatefi, Armin; Jozani, Mohammad Jafari; Ozturk, Omer; Nematollahi, Nader. Affiliations: Univ Massachusetts, Dept Math & Stat, Amherst, MA, USA; Allameh Tabatabai Univ, Dept Stat, Tehran, Iran; Mem Univ Newfoundland, Dept Math & Stat, St John, NF, Canada; Univ Manitoba, Dept Stat, Winnipeg, MB, Canada; Ohio State Univ, Dept Stat, Columbus, OH, USA

We consider the Bayesian estimation of the parameters of a finite mixture model from independent order statistics arising from imperfect ranked set sampling designs. As a cost-effective method, ranked set sampling enables us to incorporate easily attainable characteristics, as ranking information, into data collection and Bayesian estimation. To handle the special structure of the ranked set samples, we develop a Bayesian estimation approach that uses the Expectation-Maximization (EM) algorithm to estimate the ranking parameters and Metropolis-within-Gibbs sampling to estimate the parameters of the underlying mixture model. Our findings show that the proposed RSS-based Bayesian estimation method outperforms the commonly used Bayesian counterpart using simple random sampling. The developed method is finally applied to estimate the bone disorder status of women aged 50 and older.

Keywords: bone mineral data; EM algorithm; finite mixture models; Gibbs sampling; imperfect ranking; Metropolis-Hastings; misplacement probability model; ranked set sampling

Flexible Multivariate Mixture Models: A Comprehensive Approach for Modeling Mixtures of Non-Identical Distributions

INTERNATIONAL STATISTICAL REVIEW, 2024, Issue 0

Authors: Pal, Samyajoy; Heumann, Christian. Affiliations: Ludwig Maximilians Univ Munchen, Dept Stat, Munich, Germany

Mixture models are widely used to analyze data with cluster structures, and the mixture of Gaussians is the most common in practical applications. The use of mixtures involving other multivariate distributions, like the multivariate skew normal and multivariate generalised hyperbolic, is also found in the literature. However, in all such cases, only mixtures of identical distributions are used to form a mixture model. We present an innovative and versatile approach for constructing mixture models involving identical and non-identical distributions combined in all conceivable permutations (e.g. a mixture of multivariate skew normal and multivariate generalised hyperbolic). We also establish any conventional mixture model as a distinctive particular case of our proposed framework. The practical efficacy of our model is shown through its application to both simulated and real-world data sets. Our comprehensive and flexible model excels at recognising inherent patterns and accurately estimating parameters.

Keywords: EM algorithm; maximum likelihood estimates; mixture model; mixture of non-identical distributions; multivariate generalised hyperbolic distribution; multivariate skew normal distribution

A powerful approach to identify replicable variants in genome-wide association studies

AMERICAN JOURNAL OF HUMAN GENETICS, 2024, Vol. 111, No. 5, pp. 966-978

Authors: Li, Yan; Lei, Haochen; Wen, Xiaoquan; Cao, Hongyuan. Affiliations: Changchun Univ Sci & Technol, Sch Comp Sci & Technol, Changchun 130022, Jilin, Peoples R China; Jilin Univ, Sch Math, Changchun 130012, Jilin, Peoples R China; Florida State Univ, Dept Stat, Tallahassee, FL 32306, USA; Univ Michigan, Dept Biostat, Ann Arbor, MI 48109, USA

Replicability is the cornerstone of modern scientific research. Reliable identifications of genotype-phenotype associations that are significant in multiple genome-wide association studies (GWASs) provide stronger evidence for the findings. Current replicability analysis relies on the independence assumption among single-nucleotide polymorphisms (SNPs) and ignores the linkage disequilibrium (LD) structure. We show that such a strategy may produce either overly liberal or overly conservative results in practice. We develop an efficient method, ReAD, to detect replicable SNPs associated with the phenotype from two GWASs accounting for the LD structure. The local dependence structure of SNPs across two heterogeneous studies is captured by a four-state hidden Markov model (HMM) built on two sequences of p values. By incorporating information from adjacent locations via the HMM, our approach provides more accurate SNP significance rankings. ReAD is scalable, platform independent, and more powerful than existing replicability analysis methods with effective false discovery rate control. Through analysis of datasets from two asthma GWASs and two ulcerative colitis GWASs, we show that ReAD can identify replicable genetic loci that existing methods might otherwise miss.

Keywords: EM algorithm; false discovery rate; GWAS; hidden Markov model; linkage disequilibrium; pool-adjacent-violator algorithm; replicability

Finite mixtures of mean-parameterized Conway-Maxwell-Poisson models

STATISTICAL PAPERS, 2024, Vol. 65, No. 3, pp. 1469-1492

Authors: Zhan, Dongying; Young, Derek S. Affiliations: Univ Kentucky, Dr Bing Zhang Dept Stat, 725 Rose St, Lexington, KY 40536, USA

For modeling count data, the Conway-Maxwell-Poisson (CMP) distribution is a popular generalization of the Poisson distribution due to its ability to characterize data over- or under-dispersion. While the classic parameterization of the CMP has been well-studied, its main drawback is that it does not directly model the mean of the counts. This is mitigated by using a mean-parameterized version of the CMP distribution. In this work, we are concerned with the setting where count data may be comprised of subpopulations, each possibly having varying degrees of data dispersion. Thus, we propose a finite mixture of mean-parameterized CMP distributions. An EM algorithm is constructed to perform maximum likelihood estimation of the model, while bootstrapping is employed to obtain estimated standard errors. A simulation study is used to demonstrate the flexibility of the proposed mixture model relative to mixtures of Poissons and mixtures of negative binomials. An analysis of dog mortality data is presented.

Keywords: Bootstrapping; Count data; Data dispersion; EM algorithm; Negative binomial

Traffic count data analysis using mixtures of Kato-Jones distributions

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2024, Vol. 74, No. 2, pp. 352-372

Authors: Nagasaki, Kota; Kato, Shogo; Nakanishi, Wataru; Jones, M. C. Affiliations: Inst Sci Tokyo, Dept Civil & Environm Engn, W6-92-12-1 Ookayama, Meguro Ku, Tokyo 1528550, Japan; Inst Stat Math, Tokyo, Japan; Kanazawa Univ, Sch Geosci & Civil Engn, Kanazawa, Ishikawa, Japan; Open Univ, Sch Math & Stat, Milton Keynes, England

We discuss the modelling of traffic count data that show the variation of traffic volume within a day. For the modelling, we apply mixtures of Kato-Jones distributions in which each component is unimodal and affords a wide range of skewness and kurtosis. We consider two methods for parameter estimation, namely, a modified method of moments and the maximum-likelihood method. These methods were seen to be useful for fitting the proposed mixtures to our data. As a result, the variation in traffic volume was classified into the morning and evening traffic whose distributions have different shapes, particularly different degrees of skewness and kurtosis.

Keywords: circular data; directional statistics; EM algorithm; maximum-likelihood estimation; method of moments estimation; traffic counter data
