Search results - Inner Mongolia University Library

Zero-and-one-inflated Poisson regression model

STATISTICAL PAPERS, 2021, vol. 62, no. 2, pp. 915-934

Authors: Liu, Wenchen; Tang, Yincai; Xu, Ancha. Affiliations: East China Normal Univ Sch Stat KLATASDS MOE Shanghai 200241 Peoples R China; Wenzhou Univ Dept Math Wenzhou 325035 Zhejiang Peoples R China

In this paper, a zero-and-one-inflated Poisson (ZOIP) regression model is proposed. The maximum likelihood estimation (MLE) and Bayesian estimation for this model are investigated. Three estimation methods for the ZOIP regression model are obtained based on the data augmentation method: the expectation-maximization (EM) algorithm, the generalized expectation-maximization (GEM) algorithm, and Gibbs sampling. A simulation study is conducted to assess the performance of the proposed estimation methods for various sample sizes. Finally, an accidental deaths data set is analyzed to illustrate the practicability of the proposed method.

Keywords: Zero-and-one-inflated Poisson regression model; Data augmentation; Gibbs sampling; EM algorithm; GEM algorithm
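
As a rough illustration of the data-augmentation idea named in the abstract above, the following Python sketch runs EM for an intercept-only ZOIP mixture with no covariates. It is not the authors' regression algorithm: the mixture form (point masses at 0 and 1 with weights p0 and p1, plus a Poisson component with mean lam), the function name zoip_em, the starting values, and the toy counts are all assumptions made for this example.

```python
import math


def zoip_em(y, p0=0.1, p1=0.1, lam=1.0, n_iter=200):
    """EM for an intercept-only zero-and-one-inflated Poisson mixture.

    Assumed model: P(Y=0) = p0 + p2*exp(-lam),
                   P(Y=1) = p1 + p2*lam*exp(-lam),
                   P(Y=k) = p2*Poisson(k; lam) for k >= 2, with p2 = 1 - p0 - p1.
    Returns (p0, p1, lam, log-likelihood history).
    """
    n = len(y)
    n0 = sum(1 for v in y if v == 0)
    n1 = sum(1 for v in y if v == 1)
    total = sum(y)
    # The log(k!) terms do not depend on the parameters.
    const = -sum(math.lgamma(v + 1) for v in y if v >= 2)
    history = []
    for _ in range(n_iter):
        p2 = 1.0 - p0 - p1
        m0 = p0 + p2 * math.exp(-lam)        # marginal P(Y = 0)
        m1 = p1 + p2 * lam * math.exp(-lam)  # marginal P(Y = 1)
        # Observed-data log-likelihood at the current parameters.
        ll = n0 * math.log(m0) + n1 * math.log(m1) + const
        ll += sum(math.log(p2) + v * math.log(lam) - lam for v in y if v >= 2)
        history.append(ll)
        # E-step: expected numbers of structural zeros and structural ones.
        e0 = n0 * p0 / m0
        e1 = n1 * p1 / m1
        # M-step: closed-form updates of the mixing weights and Poisson mean.
        p0 = e0 / n
        p1 = e1 / n
        lam = (total - e1) / (n - e0 - e1)
    return p0, p1, lam, history


# Toy counts (made up for illustration only).
counts = [0] * 30 + [1] * 25 + [2] * 10 + [3] * 12 + [4] * 9 + [5] * 6 + [6] * 4 + [7] * 2 + [8] * 2
p0_hat, p1_hat, lam_hat, loglik = zoip_em(counts)
print(p0_hat, p1_hat, lam_hat)
```

The latent labels here only say whether a 0 or a 1 came from a point mass or from the Poisson component, so both EM steps have closed forms. A regression version would replace the constant weights and mean with functions of covariates and would generally need a numerical M-step.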

Estimating error rate of classification into several normal populations under equal mean restriction

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023

Authors: Jana, Nabakumar; Chakraborty, Ankur. Affiliations: Indian Inst Technol ISM Dhanbad Dept Math & Comp Dhanbad Jharkhand India

Classification of an observation into one of several univariate normal populations is considered when the population means are unknown but equal. Plug-in Bayes classification rules based on different estimators of the common mean are proposed for k populations. When the variances are ordered, the rule based on the Graybill-Deal estimator is compared with another rule. We prove the consistency property of the classification rules. Confidence intervals for the conditional error rate are derived for two and three populations. Under the assumption of ordered variances, a Bayes estimator of the ratio of variances is derived for use as a plug-in estimator for classification. We derive estimators of the parameters of mixture densities associated with two normal populations with a common mean and propose classification rules for the mixture distribution. An extensive simulation is performed to compare different rules and interval estimators of the conditional error rates.

Keywords: Bayes estimator; Variance ratio; Confidence interval; Conditional error rate; Probability of misclassification; EM algorithm
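
The Graybill-Deal estimator mentioned in the abstract above is the precision-weighted average of the sample means, with weights n_i / s_i^2. A minimal Python sketch of a plug-in classification rule built on it follows. It assumes equal prior probabilities and plugs in the ordinary sample variances; the function names and the toy samples are hypothetical, and the paper's own rules (including those for ordered variances) may differ.

```python
import math
from statistics import mean, variance


def graybill_deal(samples):
    """Common-mean estimate: sample means weighted by n_i / s_i^2."""
    weights = [len(s) / variance(s) for s in samples]
    means = [mean(s) for s in samples]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


def classify(x, samples):
    """Return the index of the population with the largest plug-in normal density at x.

    Assumes equal priors; every population shares the estimated common mean,
    so the populations differ only through their estimated variances.
    """
    mu = graybill_deal(samples)

    def log_density(s):
        var = variance(s)
        return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

    return max(range(len(samples)), key=lambda i: log_density(samples[i]))


# Toy training samples (made up): same mean, very different spread.
narrow = [9.8, 10.1, 10.0, 9.9, 10.2]
wide = [4.0, 16.0, 10.0, 7.0, 13.0]
print(graybill_deal([narrow, wide]))
print(classify(10.05, [narrow, wide]), classify(15.0, [narrow, wide]))
```

Because the means are equal, an observation near the common mean is assigned to the low-variance population and an observation far from it to the high-variance population.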

Efficient semiparametric inference for two-phase studies with outcome and covariate measurement errors

STATISTICS IN MEDICINE, 2021, vol. 40, no. 3, pp. 725-738

Authors: Tao, Ran; Lotspeich, Sarah C.; Amorim, Gustavo; Shaw, Pamela A.; Shepherd, Bryan E. Affiliations: Vanderbilt Univ Dept Biostat Med Ctr Nashville TN 37232 USA; Vanderbilt Univ Vanderbilt Genet Inst Med Ctr Nashville TN 37232 USA; Univ Penn Perelman Sch Med Dept Biostat Epidemiol & Informat Philadelphia PA 19104 USA

In modern observational studies using electronic health records or other routinely collected data, both the outcome and covariates of interest can be error-prone and their errors often correlated. A cost-effective solution is the two-phase design, under which the error-prone outcome and covariates are observed for all subjects during the first phase and that information is used to select a validation subsample for accurate measurements of these variables in the second phase. Previous research on two-phase measurement error problems largely focused on scenarios where there are errors in covariates only or the validation sample is a simple random sample of study subjects. Herein, we propose a semiparametric approach to general two-phase measurement error problems with a quantitative outcome, allowing for correlated errors in the outcome and covariates and arbitrary second-phase selection. We devise a computationally efficient and numerically stable expectation-maximization algorithm to maximize the nonparametric likelihood function. The resulting estimators possess desired statistical properties. We demonstrate the superiority of the proposed methods over existing approaches through extensive simulation studies, and we illustrate their use in an observational HIV study.

Keywords: data audits; electronic health records; EM algorithm; HIV AIDS; missing data; sieve approximation

The multivariate mixed Negative Binomial regression model with an application to insurance a posteriori ratemaking

INSURANCE MATHEMATICS & ECONOMICS, 2021, vol. 101, Part B, pp. 602-625

Authors: Tzougas, George; di Cerchiara, Alice Pignatelli. Affiliations: Heriot Watt Univ Dept Actuarial Math & Stat Edinburgh Midlothian Scotland; London Sch Econ & Polit Sci Dept Stat London England

This paper is concerned with introducing a family of multivariate mixed Negative Binomial regression models in the context of a posteriori ratemaking. The multivariate mixed Negative Binomial regression model can be considered as a candidate model for capturing overdispersion and positive dependencies in multi-dimensional claim count data settings, which all recent studies suggest are the norm when the ratemaking consists of pricing different types of claim counts arising from the same policy. For expository purposes, we consider the bivariate Negative Binomial-Gamma and Negative Binomial-Inverse Gaussian regression models. An Expectation-Maximization type algorithm is developed for maximum likelihood estimation of the parameters of the models for which the definition of a joint probability mass function in closed form is not feasible when the marginal means are modelled in terms of covariates. In order to illustrate the versatility of the proposed estimation procedure a numerical illustration is performed on motor insurance data on the number of claims from third party liability bodily injury and property damage. Finally, the a posteriori, or Bonus-Malus, premium rates resulting from the bivariate Negative Binomial-Gamma and Negative Binomial-Inverse Gaussian regression model are compared to those determined by the bivariate Negative Binomial and Poisson-Inverse Gaussian regression models. (C) 2021 Elsevier B.V. All rights reserved.

Keywords: Multivariate mixed Negative Binomial regression model; EM algorithm; A posteriori ratemaking; Nonlife insurance; Bodily injury and property damage; MTPL claim counts

Improvement of the Gaussian mixture models' unsupervised learning method through the inclusion of dynamical systems for various types of nonlinear data

HELIYON, 2024, vol. 10, no. 13, e33605

Authors: Mahjoub, Rahim. Affiliations: Farhangian Univ Dept Labor & Technol Educ Qazvin Iran

The Gaussian mixture model (GMM) with a modulating dynamical system (DS) approach is an unsupervised learning method that can estimate the distribution of given data or encode trajectories in the input space. In this paper, a series of trajectories is considered for simulation, and the role of the algorithm's tuning parameters in both the Gaussian function encoding and the behavior of the dynamical system is obtained and compared. The algorithm divides the input space of the data into presupposed local regions and then, in each local region, employs a dynamical system approach to track the major trajectories of the data. The influence of the number of Gaussian functions in the GMM approach is investigated and simulated in depth. Furthermore, the influence of local statistical characteristics of the data, such as the mean or covariance, on the training process is discussed, and under these conditions the effect of tuning parameters such as the number of Gaussian functions is explained. The characteristics of the DS also depend on these tuning parameters; when the data have more variance or noise, this adjustment should be checked more carefully. The simulation results show that the behavior and location of attractor points in the DS on the data distributions, and accordingly the stability of the DS, improve drastically when the number of Gaussian functions is tuned accurately.

Keywords: Dynamical system; Nonlinear trajectories; EM algorithm; Point-to-point motions; GMM approach

Covariate adaptive familywise error rate control for genome-wide association studies

BIOMETRIKA, 2021, vol. 108, no. 4, pp. 915-931

Authors: Zhou, Huijuan; Zhang, Xianyang; Chen, Jun. Affiliations: Renmin Univ China Inst Stat & Big Data Beijing 100872 Peoples R China; Texas A&M Univ Dept Stat College Stn TX 77843 USA; Mayo Clin Div Biomed Stat & Informat 200 First St SW Rochester MN 55905 USA

The familywise error rate has been widely used in genome-wide association studies. With the increasing availability of functional genomics data, it is possible to increase detection power by leveraging these genomic functional annotations. Previous efforts to accommodate covariates in multiple testing focused on false discovery rate control, while covariate-adaptive procedures controlling the familywise error rate remain underdeveloped. Here, we propose a novel covariate-adaptive procedure to control the familywise error rate that incorporates external covariates which are potentially informative of either the statistical power or the prior null probability. An efficient algorithm is developed to implement the proposed method. We prove its asymptotic validity and obtain the rate of convergence through a perturbation-type argument. Our numerical studies show that the new procedure is more powerful than competing methods and maintains robustness across different settings. We apply the proposed approach to the UK Biobank data and analyse 27 traits with 9 million single-nucleotide polymorphisms tested for associations. Seventy-five genomic annotations are used as covariates. Our approach detects more genome-wide significant loci than other methods in 21 out of the 27 traits.

Keywords: EM algorithm; External covariate; Familywise error rate; Multiple testing

A robust Birnbaum-Saunders regression model based on asymmetric heavy-tailed distributions

METRIKA, 2021, vol. 84, no. 7, pp. 1049-1080

Authors: Maehara, Rocio; Bolfarine, Heleno; Vilca, Filidor; Balakrishnan, N. Affiliations: Univ Pacifico Dept Ingn Lima Peru; Univ Estadual Sao Paulo Dept Estat Sao Paulo Brazil; Univ Estadual Campinas Dept Estat BR-6065 Campinas SP Brazil; McMaster Univ Dept Math & Stat Hamilton ON Canada

Skew-normal/independent distributions provide an attractive class of asymmetric heavy-tailed alternatives to the usual symmetric normal distribution. We use this class of distributions here to derive a robust generalization of sinh-normal distributions (Rieck in Statistical analysis for the Birnbaum-Saunders fatigue life distribution, 1989). We then propose robust nonlinear regression models, generalizing the Birnbaum-Saunders regression models proposed by Rieck and Nedelman (Technometrics 33:51-60, 1991) that have been studied extensively. The proposed regression models have a nice hierarchical representation that facilitates easy implementation of an EM algorithm for the maximum likelihood estimation of model parameters, and they provide a robust alternative for the estimation of parameters. Simulation studies as well as applications to a real dataset are presented to illustrate the usefulness of the proposed model as well as all the inferential methods developed here.

Keywords: Nonlinear regression models; Birnbaum-Saunders distribution; EM algorithm; Robust estimation; Skew-normal independent distribution; Sinh-normal distribution

An additive hazards cure model with informative interval censoring

LIFETIME DATA ANALYSIS, 2021, vol. 27, no. 2, pp. 244-268

Authors: Wang, Shuying; Wang, Chunjie; Sun, Jianguo. Affiliations: Changchun Univ Technol Sch Math & Stat Changchun 130012 Peoples R China; Jilin Univ Sch Math Ctr Appl Stat Res Changchun 130012 Peoples R China

The existence of a cured subgroup happens quite often in survival studies and many authors considered this under various situations (Farewell in Biometrics 38:1041-1046, 1982; Kuk and Chen in Biometrika 79:531-541, 1992; Lam and Xue in Biometrika 92:573-586, 2005; Zhou et al. in J Comput Graph Stat 27:48-58, 2018). In this paper, we discuss the situation where only interval-censored data are available and, furthermore, the censoring may be informative, for which there does not seem to exist an established estimation procedure. For the analysis, we present a three-component model consisting of a logistic model for describing the cure rate, an additive hazards model for the failure time of interest and a nonhomogeneous Poisson model for the observation process. For estimation, we propose a sieve maximum likelihood estimation procedure and the asymptotic properties of the resulting estimators are established. Furthermore, an EM algorithm is developed for the implementation of the proposed estimation approach, and extensive simulation studies are conducted and suggest that the proposed method works well for practical situations. The approach is also applied to a cardiac allograft vasculopathy study that motivated this investigation.

Keywords: Cure model; EM algorithm; Informative interval censoring; Sieve estimation

Initialization of Hidden Markov and Semi-Markov Models: A Critical Evaluation of Several Strategies

INTERNATIONAL STATISTICAL REVIEW, 2021, vol. 89, no. 3, pp. 447-480

Authors: Maruotti, Antonello; Punzo, Antonio. Affiliations: Libera Univ Ss Maria Assunta Dipartimento Giurisprudenza Econ Polit & Lingue M Rome Italy; Univ Bergen Dept Math Bergen Norway; Univ Catania Dipartimento Econ & Impresa Catania Italy

The expectation-maximization (EM) algorithm is a familiar tool for computing the maximum likelihood estimate of the parameters in hidden Markov and semi-Markov models. This paper carries out a detailed study of the influence that the initial values of the parameters have on the results produced by the algorithm. We compare random starts and partitional and model-based strategies for choosing the initial values for the EM algorithm in the case of multivariate Gaussian emission distributions (EDs) and assess the performance of each strategy with different assessment criteria. Several data generation settings are considered, varying the number of latent states, the number of variables and the level of fuzziness in the data, and a discussion of how each factor influences the obtained results is provided. Simulation results show that different initialization strategies may lead to different log-likelihood values and, accordingly, to different estimated partitions. A clear indication of which strategies should be preferred is given. We further include two real-data examples, widely analysed in the hidden semi-Markov model literature.

Keywords: EM algorithm; hidden Markov models; hidden semi-Markov models; initialization; simulation

A mixed-model approach for powerful testing of genetic associations with cancer risk incorporating tumor characteristics

BIOSTATISTICS, 2021, vol. 22, no. 4, pp. 772-788

Authors: Zhang, Haoyu; Zhao, Ni; Ahearn, Thomas U.; Wheeler, William; Garcia-Closas, Montserrat; Chatterjee, Nilanjan. Affiliations: Johns Hopkins Bloomberg SPH Dept Biostat 615 N Wolfe St Baltimore MD 21205 USA; NCI Div Canc Epidemiol & Genet 9609 Med Ctr Dr Rockville MD 20850 USA; NCI Informat Management Serv Inc 11730 Plaza Amer Dr Reston VA 20190 USA; Johns Hopkins Univ Sch Med SPH Dept Oncol 733 N Broadway Baltimore MD 21205 USA; Johns Hopkins Bloomberg SPH Dept Epidemiol 615 N Wolfe St Baltimore MD 21205 USA

Cancers are routinely classified into subtypes according to various features, including histopathological characteristics and molecular markers. Previous genome-wide association studies have reported heterogeneous associations between loci and cancer subtypes. However, it is not evident what is the optimal modeling strategy for handling correlated tumor features, missing data, and increased degrees-of-freedom in the underlying tests of associations. We propose to test for genetic associations using a mixed-effect two-stage polytomous model score test (MTOP). In the first stage, a standard polytomous model is used to specify all possible subtypes defined by the cross-classification of the tumor characteristics. In the second stage, the subtype-specific case-control odds ratios are specified using a more parsimonious model based on the case-control odds ratio for a baseline subtype, and the case-case parameters associated with tumor markers. Further, to reduce the degrees-of-freedom, we specify case-case parameters for additional exploratory markers using a random-effect model. We use the Expectation-Maximization algorithm to account for missing data on tumor markers. Through simulations across a range of realistic scenarios and data from the Polish Breast Cancer Study (PBCS), we show MTOP outperforms alternative methods for identifying heterogeneous associations between risk loci and tumor subtypes. The proposed methods have been implemented in a user-friendly and high-speed R statistical package called TOP (https://***/andrewhaoyu/TOP).

Keywords: Cancer subtypes; EM algorithm; Etiologic heterogeneity; Susceptibility variants; Score tests; Two-stage polytomous model
