Optimal decorrelated score subsampling for generalized linear models with massive data

Authors: Junzhuo Gao, Lei Wang, Heng Lian

Affiliations: School of Statistics and Data Science & LPMC, Nankai University, Tianjin 300071, China; Department of Mathematics, City University of Hong Kong, Hong Kong, China

Publication: Science China Mathematics (中国科学(数学), English edition)

Year, volume, issue: 2024, Vol. 67, No. 2

Pages: 405-430

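Subject classification: 12 [Management]; 1201 [Management: Management Science and Engineering (degrees awardable in Management or Engineering)]; 07 [Science]; 070105 [Science: Operations Research and Cybernetics]; 0701 [Science: Mathematics]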

Funding: This work was supported by the Fundamental Research Funds for the Central Universities, the National Natural Science Foundation of China (Grant No. 12271272), and the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin.

Keywords: A-optimality; decorrelated score subsampling; high-dimensional inference; L-optimality; massive data

Abstract: In this paper, we consider the unified optimal subsampling estimation and inference on the low-dimensional parameter of main interest in the presence of the nuisance parameter for low/high-dimensional generalized linear models (GLMs) with massive data. We first present a general subsampling decorrelated score function to reduce the influence of the less accurate nuisance parameter estimation with the slow convergence rate. The consistency and asymptotic normality of the resultant subsample estimator from a general decorrelated score subsampling algorithm are established, and two optimal subsampling probabilities are derived under the A- and L-optimality criteria to downsize the data volume and reduce the computational burden. The proposed optimal subsampling probabilities provably improve the asymptotic efficiency of the subsampling schemes in the low-dimensional GLMs and perform better than the uniform subsampling scheme in the high-dimensional GLMs. A two-step algorithm is further proposed for implementation, and the asymptotic properties of the corresponding estimators are also given. Simulations show satisfactory performance of the proposed estimators, and two applications to census income and Fashion-MNIST datasets also demonstrate their practical applicability.
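
The general shape of a two-step subsampling procedure, as mentioned in the abstract, can be illustrated with a minimal Python sketch. This is not the paper's method: it does not implement the decorrelated score function or the paper's A-/L-optimal probabilities, whose formulas are not given in this record. As a stand-in, it assumes a plain low-dimensional logistic regression and sampling probabilities proportional to |y_i - p_i| * ||x_i||, an L-optimality-style choice that is an assumption here. The function names `fit_weighted_logistic` and `two_step_subsample` and the parameters `r0` (pilot size) and `r` (second-step size) are hypothetical.

```python
import numpy as np


def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-8):
    """Maximize the weighted logistic log-likelihood with Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                       # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.linalg.norm(step) < tol:
            break
    return beta


def two_step_subsample(X, y, r0, r, rng):
    """Illustrative two-step subsampling (assumed form, not the paper's)."""
    n = X.shape[0]

    # Step 1: uniform pilot subsample gives a rough estimate beta0.
    idx0 = rng.choice(n, size=r0, replace=True)
    beta0 = fit_weighted_logistic(X[idx0], y[idx0], np.ones(r0))

    # Step 2: data-dependent probabilities based on the pilot fit.
    # Assumed L-optimality-style scores: |y_i - p_i(beta0)| * ||x_i||.
    p = 1.0 / (1.0 + np.exp(-X @ beta0))
    scores = np.abs(y - p) * np.linalg.norm(X, axis=1)
    probs = scores / scores.sum()

    # Sample with replacement, then fit with inverse-probability weights.
    idx = rng.choice(n, size=r, replace=True, p=probs)
    w = 1.0 / probs[idx]
    w = w / w.mean()  # rescaling the weights does not change the maximizer
    beta_hat = fit_weighted_logistic(X[idx], y[idx], w)
    return beta_hat, probs
```

Calling `two_step_subsample(X, y, r0=500, r=3000, rng=np.random.default_rng(0))` on an `(n, d)` design matrix `X` and a 0/1 response vector `y` returns the subsample estimate and the full vector of sampling probabilities. The inverse-probability weights are what keep the second-step estimator targeting the full-data solution despite the non-uniform sampling.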
