Author affiliations: Department of Computers and Systems Engineering, Faculty of Engineering, Al-Azhar University, Cairo, Egypt; Department of Computers and Systems Engineering, Faculty of Engineering, Minia University, Egypt
Publication: Journal of Electrical Systems and Information Technology
Year, volume, issue: 2018, Vol. 5, No. 3
Pages: 363-370
Subjects: Arabic tweets; Preprocessing; Classifications; Classifier algorithms; Ensemble methods
Abstract: Tweet classification has become a topic of interest in recent years, especially for the Arabic language. In this paper, Arabic tweets are classified automatically into one of several predetermined categories (sport, culture, politics, technology and general) based on their linguistic characteristics and content. Classification accuracy is then improved by applying ensemble methods, namely bagging, boosting and stacking, to the same dataset used in the earlier single-classifier experiments, in order to verify the results and identify the classifier that gives the highest accuracy. The experimental results show that ensemble methods achieve better classification accuracy than individual classifiers. Compared with using J48, NB or SMO as a single classifier, accuracy increased by 1.6% for Naïve Bayes (NB), by 2.2% for Sequential Minimal Optimization (SMO) and by up to 3.2% for the Decision Tree (J48) classifier.
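To make the bagging idea mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the toy English tokens stand in for preprocessed Arabic tweets, the data and all function names (`train_nb`, `predict_nb`, `bagging_fit`, `bagging_predict`) are hypothetical, and only Naïve Bayes is used as the base classifier. Each model is trained on a bootstrap resample of the training set, and the ensemble predicts by majority vote.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy data: (tokens, label) pairs standing in for preprocessed tweets.
TOY_TWEETS = [
    ("match goal team win".split(), "sport"),
    ("team player goal league".split(), "sport"),
    ("election vote parliament minister".split(), "politics"),
    ("minister government vote law".split(), "politics"),
    ("phone app software update".split(), "technology"),
    ("software laptop app chip".split(), "technology"),
]


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return the most probable label using Laplace-smoothed log-probabilities."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def bagging_fit(docs, n_models=25, seed=0):
    """Train n_models Naive Bayes classifiers, each on a bootstrap resample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Sample len(docs) items with replacement from the training set.
        sample = [rng.choice(docs) for _ in docs]
        models.append(train_nb(sample))
    return models


def bagging_predict(models, tokens):
    """Combine the base models' predictions by majority vote."""
    votes = Counter(predict_nb(m, tokens) for m in models)
    return votes.most_common(1)[0][0]


models = bagging_fit(TOY_TWEETS)
print(bagging_predict(models, ["goal", "team"]))
```

Boosting and stacking follow the same pattern of combining several base models, but boosting reweights the training examples between rounds and stacking trains a second-level classifier on the base models' outputs. Neither is shown here.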