Search Results - Inner Mongolia University Library

Arabic Sentiment Analysis Using optuna hyperparameter optimization and Metaheuristics Feature Selection to Improve Performance of LightGBM

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2025, Vol. 16, No. 2, pp. 553-568

Authors: Nazier, Mostafa Medhat; Gomaa, Mamdouh M.; Abdallah, Mohamed M.; Sayed, Awny. Affiliations: Minia Univ, Fac Sci, Comp Sci Dept, Al Minya 61519, Egypt; King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Technol Dept, Jeddah 21589, Saudi Arabia

Sentiment Analysis (SA) effectively examines big data, such as customer reviews, market research, social media posts, online discussions, and customer feedback evaluation. Arabic is a complex and rich language. The main reason Arabic resources need to be enhanced is the existence of numerous dialects alongside Modern Standard Arabic (MSA). This study investigates the impact of stemming and lemmatization methods on Arabic sentiment analysis (ASA) using machine learning techniques, specifically the LightGBM classifier. It employs metaheuristic feature selection algorithms, such as particle swarm optimization, dragonfly optimization, grey wolf optimization, the Harris hawks optimizer, and a genetic optimization algorithm, to identify the most relevant features and improve LightGBM's performance. It also employs the Optuna hyperparameter optimization framework to determine the optimal set of hyperparameter values for LightGBM. The different stemming and lemmatization methods, metaheuristic feature selection algorithms, and Optuna hyperparameter optimization are applied to eleven datasets covering different Arabic dialects. The study underscores the importance of preprocessing strategies in ASA and highlights the effectiveness of metaheuristic approaches and Optuna hyperparameter optimization in improving LightGBM's performance. The findings indicate that metaheuristic feature selection with the LightGBM classifier, using suitable stemming, lemmatization, or a combination of the two, enhances LightGBM's accuracy by between 0 and 8%. Optuna hyperparameter optimization with the LightGBM classifier, using suitable stemming, lemmatization, or a combination of the two, depending on data characteristics, improves LightGBM's accuracy by between 2 and 11%, and achieves better results than metaheuristic feature selection in more than 90% of cases. This study is of significant importance in the field of ASA, providing valuable insights and d

Keywords: Arabic Sentiment Analysis (ASA); big data; Light Gradient Boosting Machine (LightGBM); optuna hyperparameter optimization; metaheuristics feature selection; machine learning
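
The abstract above lists a genetic algorithm among its metaheuristic feature selectors. The following is a minimal sketch of that general idea, not the authors' implementation: each candidate is a 0/1 mask over features, and the best half of the population is carried over unchanged each generation. The `fitness` function is a toy stand-in (an assumption) for what would be a LightGBM validation score, and all names and parameter values here are hypothetical.

```python
import random


def fitness(mask, relevant, penalty=0.1):
    """Toy stand-in for a validation score (assumption, not a trained model):
    reward each selected relevant feature, penalize each selected extra one."""
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in relevant)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in relevant)
    return hits - penalty * extras


def ga_select(n_features, relevant, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve 0/1 feature masks; return the best mask and the best-score history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda m: fitness(m, relevant), reverse=True)
        history.append(fitness(population[0], relevant))
        elite = population[: pop_size // 2]  # best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)          # pick two distinct parents
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = elite + children
    best = max(population, key=lambda m: fitness(m, relevant))
    history.append(fitness(best, relevant))
    return best, history


# Hypothetical setup: 12 features, of which indices 0, 3 and 7 are informative.
best_mask, scores = ga_select(n_features=12, relevant={0, 3, 7})
print(best_mask, scores[-1])
```

Because the elite half is never modified, the best score cannot decrease from one generation to the next. In a real pipeline the toy `fitness` would be replaced by a cross-validated score of the classifier trained on the masked feature set.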

Financial distress prediction based on ensemble feature selection and improved stacking algorithm

KYBERNETES, 2024, Vol. 54, No. 7, pp. 3712-3735

Authors: Wu, Chong; Chen, Xiaofang; Jiang, Yongjie. Affiliation: Harbin Inst Technol, Sch Management, Harbin, Peoples R China

Purpose: While the Chinese securities market is booming, more listed companies are falling into financial distress, which affects the operation and development of enterprises and jeopardizes the interests of investors. Therefore, it is important to understand how to accurately and reasonably predict the financial distress of ***. Design/methodology/approach: In the present study, ensemble feature selection (EFS) and improved stacking were used for financial distress prediction (FDP). Mutual information, analysis of variance (ANOVA), random forest (RF), genetic algorithms, and recursive feature elimination (RFE) were chosen for EFS to select features. Because information may be lost when the results of the base learners are fed directly into the meta-learner, the features with high importance were fed into the meta-learner together with those results. A screening layer was added to select the meta-learner with better performance. Finally, Optuna hyperparameter optimization was used for parameter tuning by the ***. Findings: An empirical study was conducted with a sample of A-share listed companies in China. The F1-score of the model constructed using the features screened by EFS reached 84.55%, representing an improvement of 4.37% compared to the original features. To verify the effectiveness of improved stacking, benchmark model comparison experiments were conducted. Compared to the original stacking model, the accuracy of the improved stacking model was improved by 0.44%, and the F1-score was improved by 0.51%. In addition, the improved stacking model had the highest area under the curve (AUC) value (0.905) among all the compared ***. Originality/value: Compared to previous models, the proposed FDP model has better performance, thus bridging the research gap of feature selection. The present study provides new ideas for stacking improvement research and a reference for subsequent research in this field.

Keywords: Financial distress prediction; Stacking algorithm; Ensemble feature selection; optuna hyperparameter optimization
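
The abstract above combines five feature selectors into one ensemble but does not state how their outputs are merged. The sketch below assumes simple majority voting, which is only one plausible aggregation rule and may differ from the paper's. The function name, the feature names, and the vote threshold are all hypothetical.

```python
from collections import Counter


def ensemble_select(selections, min_votes):
    """Keep the features chosen by at least `min_votes` individual selectors.

    `selections` maps a selector name to the list of features it picked.
    Voting is an assumed aggregation rule, not one taken from the paper.
    """
    votes = Counter(
        name for chosen in selections.values() for name in set(chosen)
    )
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Made-up outputs of the five selector types named in the abstract.
selections = {
    "mutual_information": ["roa", "debt_ratio", "current_ratio"],
    "anova": ["roa", "debt_ratio", "cash_flow"],
    "random_forest": ["roa", "current_ratio", "cash_flow"],
    "genetic_algorithm": ["debt_ratio", "roa", "asset_turnover"],
    "rfe": ["roa", "debt_ratio", "current_ratio"],
}

selected = ensemble_select(selections, min_votes=3)
print(selected)  # ['current_ratio', 'debt_ratio', 'roa']
```

Raising `min_votes` trades recall for agreement: with `min_votes=5` only features picked by every selector survive.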

Quantifying scattering characteristics of mangrove species from optuna-based optimal machine learning classification using multi-scale feature selection and SAR image time series

INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, Vol. 122

Authors: Fu, Bolin; Liang, Yiyin; Lao, Zhinan; Sun, Xidong; Li, Sunzhe; He, Hongchang; Sun, Weiwei; Fan, Donglin. Affiliations: Guilin Univ Technol, Coll Geomat & Geoinformat, Guilin 541006, Peoples R China; Ningbo Univ, Dept Geog & Spatial Informat Tech, Ningbo 315211, Peoples R China

Mangroves play a significant role in carbon sequestration and storage. Mapping mangrove species and monitoring their condition are crucial for achieving sustainable development goals. Combining multidimensional optical and SAR images with machine learning has become an important approach for mangrove species classification, but challenges remain in feature selection and hyperparameter optimization. In this study, we proposed a novel classification framework combining a multi-scale variable selection algorithm (MUVR) with a state-of-the-art machine learning hyperparameter optimization method (Optuna) for mapping mangrove species in the Beilun Estuary and Maowei Sea nature reserves using optical and dual-polarization SAR images, and further quantified the scattering characteristics of mangrove species using SAR image time series. We found that: (1) The MUVR algorithm could determine the optimal scale features for different scenarios and mangrove species, and improved the classification performance of machine learning, with an overall accuracy (OA) improvement of 12.85%; (2) The Optuna-based optimal CatBoost outperformed the LightGBM and NGBoost algorithms in mapping mangrove species, achieving the highest OA (93.18%). LightGBM was suitable for identifying Aegiceras corniculatum, while the CatBoost algorithm was suitable for discriminating Avicennia marina, Bruguiera gymnorrhiza, Cyperus malaccensis, Kandelia candel and Sonneratia apetala; (3) SAR images and their derivatives improved the identification of mangrove species, and combining multispectral images with SAR-derived features produced better classification; (4) From 2018 to 2020, the backscattering coefficients of mangrove species in VV and VH polarization were concentrated in the ranges 0.053-0.327 and 0.015-0.062, respectively. The coherence coefficients of mangroves displayed a seasonal trend, with large variations in summer and small variations in

Keywords: Mangrove species classification; Machine learning; MUVR variable selection; optuna hyperparameter optimization; Scattering characteristics evaluation; Dual-polarization SAR images
