Facilitating high-dimensional transparent classification via empirical Bayes variable selection

Authors: Bar, Haim; Booth, James; Wells, Martin T.; Liu, Kangyan

Author affiliations: Univ Connecticut, Dept Stat, Storrs, CT 06269, USA; Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY, USA; Cornell Univ, Dept Stat Sci, Ithaca, NY, USA

Publication: APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY

Year, volume, issue: 2018, Vol. 34, No. 6

Pages: 949-961

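Subject classification: 1201 [Management: Management Science and Engineering (degrees may be conferred in management or engineering)]; 07 [Science]; 070104 [Science: Applied Mathematics]; 0714 [Science: Statistics (degrees may be conferred in science or economics)]; 0701 [Science: Mathematics]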

Funding: National Science Foundation [DMS 1612625, DMS 1611893]; National Institutes of Health [U19 AI111143]

Keywords: EM algorithm; generalized linear models; random forest; support vector machines; variable selection

Abstract: We present a two-step approach to classification problems in the large P, small N setting, where the number of predictors may be larger than the sample size. We assume that the association between the predictors and the class variable has an approximate linear-logistic form, but we allow the class boundaries to be nonlinear. We further assume that the number of true predictors is relatively small. In the first step, we use a binomial generalized linear model to identify which predictors are associated with each class and then restrict the data set to these predictors and run a nonlinear classifier, such as a random forest or a support vector machine. We show that, without the variable screening step, the classification performance of both the random forest and support vector machine is degraded when many among the P predictors are not related to the class.
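
The screen-then-classify idea described in the abstract can be illustrated with a short plain-Python sketch. It is a rough illustration under stated assumptions, not the authors' method: a marginal score test from a univariate logistic model stands in for the empirical Bayes variable selection, and a k-nearest-neighbour vote stands in for the random forest or support vector machine. All function names, the cutoff `z_cut`, and the toy data are hypothetical.

```python
import math
import random


def score_z(x, y):
    """Rao score statistic for the slope of a univariate logistic GLM (H0: slope = 0)."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    u = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    v = y_bar * (1.0 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
    return u / math.sqrt(v) if v > 0 else 0.0


def screen(X, y, z_cut=4.0):
    """Step 1 (assumed stand-in): keep columns whose marginal |z| exceeds z_cut."""
    p = len(X[0])
    return [j for j in range(p)
            if abs(score_z([row[j] for row in X], y)) > z_cut]


def knn_predict(X_train, y_train, x_new, cols, k=5):
    """Step 2 (assumed stand-in): k-nearest-neighbour vote using only the columns in cols."""
    dists = sorted(
        (sum((row[j] - x_new[j]) ** 2 for j in cols), label)
        for row, label in zip(X_train, y_train)
    )
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0


def accuracy(X_train, y_train, X_test, y_test, cols, k=5):
    """Fraction of test rows whose predicted class matches the true class."""
    hits = sum(knn_predict(X_train, y_train, x, cols, k) == t
               for x, t in zip(X_test, y_test))
    return hits / len(y_test)


def make_toy_data(n, p, rng):
    """Hypothetical data: only columns 0 and 1 determine the class."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


# Toy setting with more predictors (200) than training samples (150).
rng = random.Random(0)
X, y = make_toy_data(210, 200, rng)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

selected = screen(X_train, y_train)
acc_screened = accuracy(X_train, y_train, X_test, y_test, selected)
acc_all = accuracy(X_train, y_train, X_test, y_test, list(range(200)))
print("selected columns:", selected)
print("accuracy with screened vs. all predictors:", acc_screened, acc_all)
```

Comparing the two printed accuracies gives a rough feel for the abstract's claim that an unscreened classifier degrades when most predictors are unrelated to the class. The sketch makes no attempt to reproduce the paper's results.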
