Author affiliations: Univ Engn & Technol, Dept Comp Sci & Engn, Lahore 54890, Pakistan; Ryerson Univ, Res Lab Adv Syst Modelling, Toronto, ON M5B 2K3, Canada; Univ Toronto, Dalla Lana Sch Publ Hlth, Toronto, ON M5S, Canada; Ryerson Univ, Ted Rogers Sch Informat Technol Management, Toronto, ON M5B 2K3, Canada; York Univ, Dept Math & Stat, Toronto, ON M3J 1P3, Canada
Publication: IEEE Access
Year/Volume: 2019, Vol. 7
Pages: 1365-1375
Keywords: Metabolic syndrome; decision tree; National Heart, Lung, and Blood Institute (NHLBI) and American Heart Association (AHA); diagnostic prediction; data sampling methods; K-medoids; random under-sampling; over-sampling
Abstract: The objective of this inductive research was to investigate: 1) the relationship between diabetes mellitus and individual risk factors of metabolic syndrome (MetS) in a non-conservative setting; 2) the prediction of future onset of diabetes using relevant risk factors of MetS; and 3) the relative performance of machine learning methods when data sampling techniques are used to generate balanced training sets. The dataset used in this research contains 667 907 records covering the period from 2003 to 2013. To quantify the contribution of individual risk factors of MetS to the development of diabetes in a non-conservative setting, logistic regression analysis was performed. Our analyses contradict the view that diabetes is commonly associated with low levels of high-density lipoprotein (HDL). Instead, our results demonstrate that increased levels of HDL are positively correlated with diabetes onset, particularly in women. We also proposed J48 decision tree and Naive Bayes methods for predicting the future onset of diabetes, using the relevant risk factors obtained from the logistic regression analysis, over balanced and unbalanced datasets. The results demonstrated the superiority of Naive Bayes with the K-medoids under-sampling technique over random under-sampling, oversampling, and no sampling. It achieved, on average, 79% receiver operating characteristic performance with an increased true positive rate. The results of this paper suggest further research to clarify the pathophysiological significance of HDL and pathways in the development of diabetes.
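Illustrative note: the abstract names K-medoids under-sampling but does not describe its procedure. The following is a minimal sketch of one common way such a step could work, not the authors' implementation. It assumes that the majority class is clustered into as many groups as there are minority records and that only the cluster medoids are kept. The function names (`k_medoids`, `undersample_majority`), the squared Euclidean distance, the choice of `k`, and the toy data are all hypothetical. The sketch also assumes the majority points are distinct.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_medoids(points, k, n_iter=20, seed=0):
    """Return indices of k medoids found by a simple alternating scheme.

    Assumption: points are distinct, so every medoid keeps at least itself.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: sq_dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the smallest
        # total distance to the other members of its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # guard for duplicate points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(sq_dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # converged
        medoids = new_medoids
    return sorted(medoids)


def undersample_majority(X, y, majority_label, seed=0):
    """Balance a two-class set by keeping only medoids of the majority class."""
    majority = [x for x, lbl in zip(X, y) if lbl == majority_label]
    minority = [(x, lbl) for x, lbl in zip(X, y) if lbl != majority_label]
    # Hypothetical choice: one medoid per minority record, giving a 1:1 ratio.
    keep = k_medoids(majority, k=len(minority), seed=seed)
    balanced = [(majority[i], majority_label) for i in keep] + minority
    X_bal, y_bal = zip(*balanced)
    return list(X_bal), list(y_bal)


# Toy data (made up): six records without diabetes (0), two with diabetes (1).
X = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5), (6, 5)]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = undersample_majority(X, y, majority_label=0)
```

Unlike random under-sampling, every retained majority record here is an actual observation chosen to represent a cluster of similar records. A classifier such as Naive Bayes would then be trained on `X_bal` and `y_bal`.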