
Leveraging machine learning to proactively identify phishing campaigns before they strike

Authors: Zhang, Kun; Wang, Haifeng; Chen, Meiyi; Chen, Xianglin; Liu, Long; Geng, Qiang; Zhou, Yu

Author affiliations: Hainan Normal Univ, Sch Informat Sci & Technol, Haikou 571158, Hainan, Peoples R China; Hainan Trop Ocean Univ, Sch Comp Sci & Technol, Sanya 572022, Hainan, Peoples R China; Hainan Trop Ocean Univ, Sch Ocean Informat Engn, Sanya 572022, Hainan, Peoples R China; Sanya Informat Infrastruct Investment & Construct, Sanya 572022, Hainan, Peoples R China; Haikou Univ Econ, TJ YZ Sch Network Sci, Haikou 571127, Peoples R China

Publication: Journal of Big Data (J. Big Data)

Year, volume, issue: 2025, Vol. 12, No. 1

Pages: 1-55


Funding: Hainan Province Science and Technology Special Fund [ZDYF2024GXJS034]; Innovation Platform for Academicians of Hainan Province [YSPTZX202036]; Hainan Provincial Natural Science Foundation of China [621MS0787, 621QN271]; Education Department of Hainan Province [Hnjg2019-50, Hnky2024ZD-24]; Sanya Science and Technology Special Fund [2022KJCX30]

Subjects: Phishing cybercrimes; Uniform resource locator; Machine learning classification; Shapley additive explanations; Recursive feature elimination; Hyperparameter optimization

Abstract: With the increasing reliance on digital platforms for shopping, communication, and meetings, users are more exposed to cyber threats like phishing. These attacks often involve fraudulent websites designed to steal sensitive information, such as passwords and credit card details, by mimicking legitimate sites. Attackers use various deceptive techniques, including link manipulation, filter evasion, covert redirection, website forgery, and social engineering. This study introduces an advanced phishing detection framework using machine learning (ML) models. A dataset of 1,353 URLs (702 legitimate, 103 suspicious, and 548 phishing) was compiled, with nine key features extracted for classification. Four ML classifiers, namely Categorical Boosting, Random Forest (RF), Decision Tree (DT), and Extreme Gradient Boosting (XGB), were employed, with cross-validation ensuring robust model evaluation. Feature selection was conducted using SHapley Additive Explanations (SHAP) and Recursive Feature Elimination (RFE) to enhance interpretability and computational efficiency. To further refine classification accuracy across legitimate, suspicious, and phishing categories, hyperparameter tuning was performed using four nature-inspired optimization algorithms: Golden Jackal Optimization, Dandelion Optimization, Coati Optimization, and Puma Optimization. These algorithms were chosen for their strong global search capabilities and adaptability to complex datasets, ensuring optimal parameter selection for improved model performance. The study's main contribution lies in integrating these optimization techniques with ML classifiers, significantly improving phishing detection accuracy while reducing computational complexity. Experimental results demonstrated that XGB-based models, particularly XGPO, achieved the highest performance across two feature-selection scenarios. In Scenario 1, Accuracy = 0.980, Precision = 0.981, Recall = 0.980, F1-score = 0.980, MCC = 0.965, AUC = 0.985.
In Scenario 2, Accur
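
The metrics named in the abstract (accuracy, precision, recall, F1-score, MCC) can be illustrated for a three-class problem with a short sketch. This is not the authors' code. The function names, the toy labels, and the use of support-weighted averaging are assumptions made for illustration, since the abstract does not say which averaging scheme was used. The MCC here is the standard multiclass generalization computed from the confusion matrix.

```python
from math import sqrt

# Class names taken from the abstract; their order here is arbitrary.
LABELS = ["legitimate", "suspicious", "phishing"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Build a square matrix: rows are true classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def weighted_metrics(matrix):
    """Accuracy, support-weighted precision/recall/F1, and multiclass MCC."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    support = [sum(matrix[i]) for i in range(k)]  # true count per class
    predicted = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

    precision = recall = f1 = 0.0
    for i in range(k):
        p = matrix[i][i] / predicted[i] if predicted[i] else 0.0
        r = matrix[i][i] / support[i] if support[i] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support[i] / total  # assumption: weight classes by support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    # Multiclass Matthews correlation coefficient from the confusion matrix.
    numerator = correct * total - sum(s * q for s, q in zip(support, predicted))
    denominator = sqrt(total ** 2 - sum(q * q for q in predicted)) * sqrt(
        total ** 2 - sum(s * s for s in support)
    )
    mcc = numerator / denominator if denominator else 0.0

    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = ["legitimate"] * 4 + ["suspicious"] * 2 + ["phishing"] * 4
    y_pred = (
        ["legitimate"] * 3 + ["phishing"]
        + ["suspicious", "legitimate"]
        + ["phishing"] * 4
    )
    for name, value in weighted_metrics(confusion_matrix(y_true, y_pred)).items():
        print(f"{name}: {value:.3f}")
```

With support-weighted averaging, recall always equals accuracy, because each class's recall is weighted by its share of the true labels.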
