Performance Comparison of Machine Learning Platforms

Authors: Roy, Asim; Qureshi, Shiban; Pande, Kartikeya; Nair, Divitha; Gairola, Kartik; Jain, Pooja; Singh, Suraj; Sharma, Kirti; Jagadale, Akshay; Lin, Yi-Yang; Sharma, Shashank; Gotety, Ramya; Zhang, Yuexin; Tang, Ji; Mehta, Tejas; Sindhanuru, Hemanth; Okafor, Nonso; Das, Santak; Gopal, Chidambara N.; Rudraraju, Srinivasa B.; Kakarlapudia, Avinash, V

Affiliation: Arizona State University, Department of Information Systems, Tempe, AZ 85287, USA

Publication: INFORMS Journal on Computing (INFORMS J. Comput.)

Year, volume, issue: 2019, Vol. 31, No. 2

Pages: 207-225

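Subject classification: 1201 [Management: Management Science and Engineering (degrees in Management or Engineering may be awarded)]; 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (degrees in Engineering or Science may be awarded)]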

Keywords: machine learning platforms; classification algorithms; comparison of algorithms; comparison of platforms

Abstract: In this paper, we present a method for comparing and evaluating different collections of machine learning algorithms on the basis of a given performance measure (e.g., accuracy, area under the curve (AUC), F-score). Such a method can be used to compare standard machine learning platforms such as SAS, IBM SPSS, and Microsoft Azure ML. A recent trend in automation of machine learning is to exercise a collection of machine learning algorithms on a particular problem and then use the best performing algorithm. Thus, the proposed method can also be used to compare and evaluate different collections of algorithms for automation on a certain problem type and find the best collection. In the study reported here, we applied the method to compare six machine learning platforms: R, Python, SAS, IBM SPSS Modeler, Microsoft Azure ML, and Apache Spark ML. We compared the platforms on the basis of predictive performance on classification problems because a significant majority of the problems in machine learning are of that type. The general question that we addressed is the following: Are there platforms that are superior to others on some particular performance measure? For each platform, we used a collection of six classification algorithms from the following six families of algorithms: support vector machines, multilayer perceptrons, random forest (or variant), decision trees/gradient boosted trees, Naive Bayes/Bayesian networks, and logistic regression. We compared their performance on the basis of classification accuracy, F-score, and AUC. We used F-score and AUC measures to compare platforms on two-class problems only. For testing the platforms, we used a mix of data sets from (1) the University of California, Irvine (UCI) library, (2) the Kaggle competition library, and (3) high-dimensional gene expression problems. We performed some hyperparameter tuning on algorithms wherever possible.
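
The abstract's idea of running a collection of algorithms and using the best performer can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say how per-data-set results are aggregated or tested statistically, so averaging the best score per data set is an assumption made here. The function names, platform names, data set names, and all scores are hypothetical.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_score(y_true, y_pred, positive=1):
    """F1 score for a two-class problem, with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_per_dataset(scores):
    """Map {dataset: {algorithm: score}} to {dataset: best score in the collection}."""
    return {ds: max(algos.values()) for ds, algos in scores.items()}


def rank_platforms(platform_scores):
    """Rank platforms by the mean of their best score on each data set.

    Assumption: a simple mean is used as the summary; the paper may
    aggregate or test differences in another way.
    """
    summary = {
        platform: mean(best_per_dataset(scores).values())
        for platform, scores in platform_scores.items()
    }
    return sorted(summary.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical scores for illustration only; these are not results from the paper.
toy_scores = {
    "platform_a": {
        "dataset_1": {"svm": 0.90, "random_forest": 0.92},
        "dataset_2": {"svm": 0.80, "random_forest": 0.78},
    },
    "platform_b": {
        "dataset_1": {"svm": 0.88, "random_forest": 0.91},
        "dataset_2": {"svm": 0.85, "random_forest": 0.83},
    },
}

ranking = rank_platforms(toy_scores)
for platform, score in ranking:
    print(f"{platform}: mean best score = {score:.3f}")
```

Taking the maximum within each collection mirrors the "use the best performing algorithm" step; any measure (accuracy, F-score, or AUC) could be placed in the score tables, with F-score restricted to two-class problems as the abstract notes.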
