While advances continue to be made in model-based clustering, challenges persist in modeling various data types such as panel data. Multivariate panel data present difficulties for clustering algorithms because they are often plagued by missing data and dropout, which complicate estimation. This research presents a family of hidden Markov models that compensate for these issues. A modified expectation-maximization (EM) algorithm capable of handling data that are missing not at random, as well as dropout, is presented and used to perform model estimation.
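To make the kind of computation involved concrete, the following is a minimal sketch of the forward recursion for a discrete hidden Markov model in which missing observations are marginalized out (a missing value contributes an emission factor of 1). The function name and the use of `None` to flag a missing value are illustrative conventions, not the authors' implementation; the models in the paper are multivariate and additionally model the dropout mechanism.

```python
import numpy as np

def forward_loglik(log_pi, log_A, log_B, obs):
    """Log-likelihood of an observation sequence under a discrete HMM
    via the forward recursion; None marks a missing observation, which
    is marginalized out (emission factor of 1)."""
    alpha = log_pi + (0.0 if obs[0] is None else log_B[:, obs[0]])
    for t in range(1, len(obs)):
        # logsumexp over previous states, then add the emission term
        trans = alpha[:, None] + log_A            # (K, K): from-state x to-state
        alpha = np.logaddexp.reduce(trans, axis=0)
        if obs[t] is not None:
            alpha = alpha + log_B[:, obs[t]]
    return np.logaddexp.reduce(alpha)
```

Summing (in log space) over the terminal forward variables gives the log-likelihood, which is the quantity that an EM fit increases at each iteration.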
A common approach to approximating Gaussian log-likelihoods at scale exploits the fact that precision matrices can be well approximated by sparse matrices in some circumstances. This strategy is motivated by the screening effect, the phenomenon in which the linear prediction of a process Z at a point x0 depends primarily on the measurements nearest to x0. But simple perturbations, such as iid measurement noise, can significantly reduce the degree to which this exploitable phenomenon occurs. While strategies to cope with this issue already exist and are certainly improvements over ignoring the problem, in this work we present a new one, based on the EM algorithm, that offers several advantages. While we focus on the application to Vecchia's approximation, a particularly popular and powerful framework in which we can demonstrate true second-order optimization of the M steps, the method can also be implemented using entirely matrix-vector products, making it applicable to a very wide class of precision matrix-based approximation methods. Supplementary materials for this article are available online.
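As background, here is a minimal sketch of a Vecchia-type approximate log-likelihood for a zero-mean Gaussian vector, conditioning each entry on the `m` preceding entries in a fixed ordering (real implementations choose nearest-neighbor conditioning sets, and the article's contribution concerns the noisy case). All names are illustrative; with `m = n - 1` the factorization recovers the exact log-likelihood, which makes the sketch easy to check.

```python
import numpy as np

def vecchia_loglik(y, cov, m):
    """Vecchia-type approximate Gaussian log-likelihood for a zero-mean
    vector y with covariance matrix cov: each y[i] is conditioned on at
    most the m preceding entries in the given ordering (a simplification
    of nearest-neighbor conditioning sets)."""
    n = len(y)
    ll = 0.0
    for i in range(n):
        c = list(range(max(0, i - m), i))      # conditioning set
        if not c:
            mean, var = 0.0, cov[i, i]
        else:
            Scc = cov[np.ix_(c, c)]
            Sic = cov[i, c]
            w = np.linalg.solve(Scc, Sic)      # kriging weights
            mean = w @ y[c]                    # conditional mean
            var = cov[i, i] - Sic @ w          # conditional variance
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return ll
```

Adding iid noise to the covariance diagonal weakens the screening effect that makes small conditioning sets accurate, which is the failure mode the EM-based strategy above is designed to address.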
In the literature on modeling heterogeneous data via mixture models, it is generally assumed that the samples are drawn from the underlying population using the simple random sampling (SRS) technique. This study exploits the bivariate ranked set sampling (BVRSS) technique to learn finite mixture models. We generalize the expectation-maximization (EM) algorithm under univariate RSS to the bivariate case. Through a simulation study under a noisy setting, we compare the performance of the proposed rank-based estimators with that of the SRS-based competitors in estimating unknown parameters and cluster assignments. The proposed methodology is applied to a breast cancer data set to diagnose malignant or benign tumors in patients. The results show that the extra rank information in BVRSS samples leads to better inference about the unknown features of mixture models.
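For reference, a minimal EM fit of a two-component univariate Gaussian mixture to an SRS sample — the baseline competitor in the comparison above. A BVRSS version would modify the E-step responsibilities using the rank information; the function name and initialization here are illustrative.

```python
import numpy as np
from scipy.stats import norm

def em_gmm2(x, iters=200):
    """EM for a two-component univariate Gaussian mixture fit to a
    simple random sample (illustrative initialization)."""
    mu = np.quantile(x, [0.25, 0.75])          # crude starting means
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: posterior responsibility of each component per point
        dens = w * norm.pdf(x[:, None], mu, sd)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted means, standard deviations, mixing weights
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        w = nk / len(x)
    return w, mu, sd
```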
Proportional data arise frequently in a wide variety of fields of study. Such data often exhibit extra variation such as over/under-dispersion, sparseness, and zero inflation. For example, the hepatitis data exhibit both sparseness and zero inflation: of 83 annual age groups, 19 contribute non-zero denominators of 5 or less, and 36 have zero seropositive cases. The whitefly data consist of 640 observations, of which 339 (53%) are zeros, demonstrating zero inflation. The catheter management data involve excessive zeros, with over 60% zeros on average across the outcomes of 193 urinary tract infections, 194 catheter blockages, and 193 catheter displacements. However, existing models cannot always address such features appropriately. In this paper, a new two-parameter probability distribution called the Lindley-binomial (LB) distribution is proposed to analyze proportional data with such features. Probabilistic properties of the distribution, such as its moments and moment generating function, are derived. The Fisher scoring algorithm and the EM algorithm are presented for computing estimates of the parameters in the proposed LB regression model. Goodness of fit for the LB model is discussed. A limited simulation study is also performed to evaluate the performance of the derived EM algorithms for estimating the parameters of the model with and without covariates. The proposed model is illustrated on the three aforementioned proportional data sets.
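Fisher scoring for the LB regression follows the same template as for an ordinary GLM. As a generic sketch (an ordinary binomial-logit model, not the LB model itself), here is Fisher scoring — equivalently, iteratively reweighted least squares — for proportional data given as y successes out of n trials:

```python
import numpy as np

def fisher_scoring_logit(X, y, n, iters=25):
    """Fisher scoring (IRLS) for a binomial-logit GLM: y successes out
    of n trials per row of X.  Generic algorithm only; the LB regression
    in the paper replaces the binomial likelihood with the LB one."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        W = n * p * (1 - p)                  # variance function weights
        score = X.T @ (y - n * p)            # gradient of the log-likelihood
        info = X.T @ (W[:, None] * X)        # expected (Fisher) information
        beta = beta + np.linalg.solve(info, score)
    return beta
```

Each update is a Newton step with the expected rather than observed information, which is what distinguishes Fisher scoring from plain Newton-Raphson.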
The paper Zhao et al. (Ann Oper Res 226:727-739, 2015) shows that mean-CVaR-skewness portfolio optimization problems based on asymmetric Laplace (AL) distributions can be transformed into quadratic optimization problems for which closed form solutions can be found. In this note, we show that such a result also holds for mean-risk-skewness portfolio optimization problems when the underlying distribution belongs to a larger class of normal mean-variance mixture (NMVM) models than the class of AL distributions. We then study the value at risk (VaR) and conditional value at risk (CVaR) risk measures of portfolios of returns with NMVM distributions. These measures have closed form expressions for portfolios of normal and, more generally, elliptically distributed returns, as discussed in Rockafellar and Uryasev (J Risk 2:21-42, 2000) and Landsman and Valdez (N Am Actuar J 7:55-71, 2003), but when the returns have general NMVM distributions these risk measures do not admit closed form expressions. In this note, we give approximate closed form expressions for the VaR and CVaR of portfolios of returns with NMVM distributions. Numerical tests show that our closed form formulas give accurate values for VaR and CVaR and considerably shorten the computational time for portfolio optimization problems associated with VaR and CVaR.
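For the normal special case cited above (Rockafellar and Uryasev), the closed forms are VaR_a = -mu + sigma * z_a and CVaR_a = -mu + sigma * phi(z_a) / (1 - a) under the loss = -return convention, where z_a is the standard normal a-quantile and phi its density. A sketch (function name illustrative):

```python
import numpy as np
from scipy.stats import norm

def normal_var_cvar(mu, sigma, alpha=0.95):
    """Closed-form VaR and CVaR at level alpha for a normally
    distributed return with mean mu and std sigma, using the
    loss = -return convention."""
    z = norm.ppf(alpha)
    var = -mu + sigma * z
    cvar = -mu + sigma * norm.pdf(z) / (1 - alpha)
    return var, cvar
```

CVaR always exceeds VaR at the same level, since it averages the losses beyond the VaR threshold; the NMVM case studied in the note replaces these exact formulas with approximate ones.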
Population size estimation has long been a key area of interest across various fields. The Schnabel census, a widely applied capture-recapture method, is commonly used for population estimation. However, the topic of sampling effort in Schnabel census studies remains insufficiently explored. This study aims to determine the required sampling effort in Schnabel census studies, considering different levels of capture success rate and population heterogeneity. To address this, the number of capture occasions, T, is adjusted to achieve different probabilities of an individual never being observed, p(0), with the goal of maintaining an appropriate confidence interval width. Specifically, maintaining p(0) < 0.5 could limit uncertainty to within 20% of the true population size for N >= 100. Zero-truncated count distributions were applied by fitting three models: binomial, beta-binomial, and binomial mixture. The findings reveal an exponential relationship between the desired capture success rate and the required number of capture occasions. Additionally, lower detectability requires more capture occasions to achieve the same capture success rate than higher detectability does. This methodological approach provides robust and efficient estimation strategies, ensuring the sustainability and feasibility of population monitoring programs.
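The exponential relationship can be made explicit in the simplest homogeneous-binomial case: with per-occasion detection probability p, the probability of never being captured over T occasions is p(0) = (1 - p)^T, so the effort needed to push p(0) below a target is T = ceil(log p(0) / log(1 - p)). A sketch (function name illustrative; heterogeneity, as in the beta-binomial and mixture models above, requires more effort than this bound suggests):

```python
import math

def occasions_needed(p, p0_target):
    """Smallest number of capture occasions T such that the miss
    probability p0 = (1 - p)**T of a homogeneous individual with
    per-occasion detection probability p falls below p0_target."""
    return math.ceil(math.log(p0_target) / math.log(1.0 - p))
```

For example, p = 0.3 needs only 2 occasions to reach p(0) < 0.5, while p = 0.05 needs 14 — the steep growth in required effort at low detectability noted in the findings.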
Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have these two challenges been addressed simultaneously. In this article, we develop a novel method to simultaneously estimate network interactions and associations with relevant covariates for count data, and specifically for compositional data, which have a fixed-sum constraint. We use a hierarchical Bayesian model with latent layers and employ spike-and-slab priors for both edge and covariate selection. For posterior inference, we develop a novel variational inference scheme with an expectation-maximization step to enable efficient estimation. Through simulation studies, we demonstrate that the proposed model outperforms existing methods in its accuracy of network recovery. We show the practical utility of our model via an application to microbiome data. The human microbiome has been shown to contribute to many of the functions of the human body, and also to be linked with a number of diseases. In our application, we seek to better understand the interactions of microbes with relevant covariates, as well as with each other. We call our algorithm simultaneous inference for networks and covariates and provide a Python implementation, which is available online.
In this study, constant-stress accelerated life testing is investigated using type-II censoring of failure data from a truncated normal distribution. Various classical approaches are discussed for estimating the model parameters, hazard rates, and reliability functions, among them maximum likelihood estimation, the EM algorithm, and maximum product of spacings estimation. Interval estimation is also introduced in the form of asymptotic confidence intervals and bootstrap intervals. Furthermore, the missing information principle is employed to compute the observed Fisher information matrix. Three optimality criteria linked with the Fisher information matrix are considered to determine the optimal value of each stress level. To illustrate the proposed techniques, Monte Carlo simulations are run in conjunction with a real data analysis.
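To fix ideas on the likelihood structure: under type-II censoring, only the r smallest of n failure times are observed, so the log-likelihood adds a survival term for the n - r censored units, log L = sum_{i<=r} log f(x_(i)) + (n - r) log(1 - F(x_(r))). The following is a sketch for a plain (untruncated) normal — the truncated-normal and accelerated-stress structure of the study only changes the density — with all names illustrative:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def type2_normal_mle(x_obs, n):
    """ML estimation of (mu, sigma) from a type-II censored normal
    sample: x_obs holds the r smallest order statistics of n units;
    the remaining n - r units are only known to exceed max(x_obs)."""
    r = len(x_obs)
    c = max(x_obs)                             # censoring point x_(r)
    def negloglik(theta):
        mu, log_sigma = theta                  # log-parameterize sigma > 0
        sigma = np.exp(log_sigma)
        ll = norm.logpdf(x_obs, mu, sigma).sum()
        ll += (n - r) * norm.logsf(c, mu, sigma)   # censored contribution
        return -ll
    res = minimize(negloglik, x0=[np.mean(x_obs), np.log(np.std(x_obs))])
    return res.x[0], np.exp(res.x[1])
```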
We propose a novel frailty model with change points, applying random effects to a Cox proportional hazards model to adjust for heterogeneity between clusters. In the eight Empowered Action Group (EAG) states in India, on which we focus, survival curves for children up to the age of five differ from state to state. Therefore, when analyzing survival times for the eight EAG states, we need to adjust for effects among states (clusters). Because the frailty model includes random effects, the parameters are estimated using the expectation-maximization (EM) algorithm. Additionally, our model needs to estimate change points; we thus propose a new algorithm that extends the conventional estimation algorithm to the frailty model with change points. We show a practical example to demonstrate how to estimate the change point and the parameters of the random-effects distribution. Our proposed model can be analyzed easily using an existing R package. We conducted simulation studies with three scenarios to confirm the performance of our proposed model. We re-analyzed the survival time data of the eight EAG states in India to show the difference in results with and without the random effect. In conclusion, we confirmed that the frailty model with change points is more accurate than the model without a random effect. Our proposed model is useful when heterogeneity needs to be taken into account. Additionally, the absence of heterogeneity did not affect estimation of the regression parameters.
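The change-point component can be sketched in a stripped-down form: a piecewise-constant (exponential) hazard with one change point, estimated by profiling the likelihood over a grid of candidate change points. This omits the frailty term, censoring, and covariates of the actual model, and all names are illustrative:

```python
import numpy as np

def changepoint_exponential(t, tau_grid):
    """Profile-likelihood estimate of a single change point in a
    piecewise-constant hazard from uncensored survival times t:
    rate lam1 on [0, tau], rate lam2 after tau."""
    best_ll, best_tau = -np.inf, None
    for tau in tau_grid:
        d1 = (t <= tau).sum()                  # events before tau
        d2 = len(t) - d1                       # events after tau
        e1 = np.minimum(t, tau).sum()          # exposure before tau
        e2 = np.maximum(t - tau, 0.0).sum()    # exposure after tau
        if d1 == 0 or d2 == 0:
            continue
        lam1, lam2 = d1 / e1, d2 / e2          # segment-wise MLEs
        ll = d1 * np.log(lam1) - lam1 * e1 + d2 * np.log(lam2) - lam2 * e2
        if ll > best_ll:
            best_ll, best_tau = ll, tau
    return best_tau
```

In the full model, this profiling step would sit inside each EM iteration, alternating with updates of the frailty (random-effects) distribution.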
The discrete Pareto (DP) distribution studied in this paper is a probability model with a power-law tail, which provides a convenient alternative to the well-known Zipf distribution. While basic characteristics of the DP model are available explicitly, it is not an exponential family, and parameter estimation for this model is a challenging task. With this in mind, we develop a computational approach to the problem based on the expectation-maximization (EM) algorithm. In the process, we discover an interesting new probability distribution, a certain tilted version of the standard gamma model, and provide a short account of its basic properties, which play a crucial role in our EM algorithm. Our computational approach to DP parameter estimation is illustrated by simulations, while a real data example from finance illustrates potential applications of the DP stochastic model.