Algorithmic randomness based feature selection for traditional Chinese chronic gastritis diagnosis

Authors: Wang, Huazhen; Lv, Bing; Yang, Fan; Zheng, Kai; Li, Xuan; Hu, Xueqin

Affiliations: Huaqiao Univ, Coll Comp Sci & Technol, Xiamen 361021, Peoples R China; Xiamen Univ, Sch Informat Sci & Technol, Xiamen 361005, Peoples R China; China Acad Chinese Med Sci, Inst Informat Tradit Chinese Med, Beijing 100700, Peoples R China

Publication: NEUROCOMPUTING

Year, volume: 2014, Vol. 140

Pages: 252-264

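Subject classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (engineering or science degrees may be conferred)]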

Funding: National Natural Science Foundation of China [61202144, 61203282]; Natural Science Foundation of Fujian Province in China [2012J01274, 2012J05125]; Key Laboratory of System Control and Information Processing Ministry of Education of Shanghai Jiao Tong University [SCIP2012007]; Research Grant Council of Huaqiao University [09BS515]

Keywords: Feature selection; Feature importance; Algorithmic randomness; Conformal predictor; Chronic gastritis; Random forests

Abstract: Machine learning methods involving multivariate interacting effects have become mainstream in feature selection. However, the feature importance score generated by machine learning methods is not statistically interpretable, which hampers its application in practical settings such as medical diagnosis. In this study, a framework of Algorithmic Randomness based Feature Selection (ARFS) is proposed to measure the feature importance score using a p-value derived from combining an algorithmic randomness test with machine learning methods. In ARFS, a machine learning algorithm, such as random forest (RF), support vector machine (SVM), or naive Bayes classifier (NB), is used to compute the nonconformity score of each example with respect to the data distribution, and the p-value of the algorithmic randomness test is then obtained from the nonconformity scores. ARFS evaluates the importance of each feature by the reduction in p-value between the dataset before and after random permutation of that feature, which makes the score statistically interpretable. To demonstrate its effectiveness, three ARFS models, i.e. ARFS-RF, ARFS-SVM and ARFS-NB, were compared with several feature selection approaches, i.e. RF-ACC, RF-Gini, KNNpermute, SMFS, ANOVA and SNR. The results showed that ARFS-RF performed better on both the synthetic and benchmark datasets. Further study on a chronic gastritis dataset in Traditional Chinese Medicine (TCM) showed that the symptom sets given by ARFS-RF perform substantially better than symptom sets of the same size chosen by TCM experts. The symptom ranking list generated by ARFS-RF can guide physicians in designing, selecting, and interpreting the symptoms used in chronic gastritis diagnosis. (C) 2014 Elsevier B.V. All rights reserved.
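
The permutation-and-p-value idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several pieces are assumptions made for illustration: the nonconformity score comes from a simple nearest-centroid rule rather than RF, SVM or NB; the p-value is the standard conformal form (share of calibration scores at least as large as the example's score); and the importance of a feature is taken as the drop in mean p-value on a held-out set after shuffling that feature. All function names and the synthetic data are hypothetical.

```python
import random
from statistics import mean

def centroids(X, y):
    """Per-class mean vector (stand-in for the underlying learner)."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {c: [mean(col) for col in zip(*rows)] for c, rows in groups.items()}

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nonconformity(row, label, cents):
    """Assumed score: distance to own-class centroid minus distance
    to the nearest other-class centroid (larger means stranger)."""
    own = sq_dist(row, cents[label])
    other = min(sq_dist(row, c) for k, c in cents.items() if k != label)
    return own - other

def p_value(score, calib_scores):
    """Conformal p-value: share of calibration scores at least as strange."""
    at_least = sum(1 for s in calib_scores if s >= score)
    return (at_least + 1) / (len(calib_scores) + 1)

def mean_p(X, y, cents, calib_scores):
    return mean(p_value(nonconformity(r, l, cents), calib_scores)
                for r, l in zip(X, y))

def importance(X_tr, y_tr, X_cal, y_cal, X_te, y_te, seed=0):
    """Drop in mean p-value after shuffling each feature in the test set."""
    rng = random.Random(seed)
    cents = centroids(X_tr, y_tr)
    calib = [nonconformity(r, l, cents) for r, l in zip(X_cal, y_cal)]
    base = mean_p(X_te, y_te, cents, calib)
    scores = []
    for j in range(len(X_te[0])):
        col = [r[j] for r in X_te]
        rng.shuffle(col)  # break the link between feature j and the label
        X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X_te, col)]
        scores.append(base - mean_p(X_perm, y_te, cents, calib))
    return scores

def make_data(n, rng):
    """Synthetic data: feature 0 separates the classes, feature 1 is noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

rng = random.Random(42)
X_tr, y_tr = make_data(200, rng)
X_cal, y_cal = make_data(200, rng)
X_te, y_te = make_data(200, rng)
scores = importance(X_tr, y_tr, X_cal, y_cal, X_te, y_te)
print(scores)  # one importance score per feature
```

In this toy setup the informative feature should receive a clearly larger score than the noise feature, because shuffling it makes many examples look nonconforming for their labels and so lowers their p-values. Swapping the nearest-centroid rule for a random forest would bring the sketch closer to the ARFS-RF variant named in the abstract, though the exact nonconformity measure used there is not given in this record.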
