ISBN:
(Print) 9798350324136
The development of 5G networks and beyond has led to an explosion of data generation. It is therefore crucial to have an intrusion detection system (IDS) to detect and prevent malicious packets from entering the network. This paper therefore presents an IDS based on a Feature Selection approach which applies Recursive Feature Elimination with a Random Forest Classifier and 10-fold Cross Validation to classify malicious and benign traffic on the publicly available UNSW-NB15 dataset. Most existing Feature Selection approaches on this dataset are directed at enhancing the performance of a limited number of algorithms. Our proposed Feature Selection approach was tested on six well-known supervised machine learning (ML) algorithms performing binary classification: Artificial Neural Network (ANN), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Logistic Regression (LR). In addition, we performed hyperparameter tuning to obtain the best possible parameters for each ML algorithm. Unlike hyperparameter tuning in most studies, we perform both Manual Search and Grid Search. The performance of the selected ML algorithms is evaluated based on Accuracy, Recall, Precision, and F1 score. The results from our experiments indicate that the most robust algorithm is ANN, whereas the weakest-performing algorithm is LR. RF is the second-best performing algorithm; however, its runtime is much lower than that of ANN. In particular, ANN excels with (testing accuracy, F1 score) of (88.62%, 96.473%), RF with (87.40%, 89.60%), DT with (87.266%, 89.414%), KNN with (87.11%, 88.7%), SVM with (81.835%, 86.959%), and LR with (81.835%, 85.632%). In addition, over-fitting problems are eliminated by our proposed Feature Selection and Hyperparameter tuning. Compared with existing works using the same ML algorithms on the UNSW-NB15 dataset, our proposed Feature Selection approach achieved better results in most cases and was more stable among different
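As a rough illustration of the pipeline this abstract describes, the following is a minimal sketch of Recursive Feature Elimination driven by a Random Forest, evaluated with 10-fold cross-validation. The file name, column layout, and feature count are assumptions for illustration, not taken from the paper:

```python
# Hypothetical sketch: RFE with a Random Forest estimator and 10-fold CV
# for binary traffic classification on UNSW-NB15.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumed CSV layout: numeric features plus a binary "label" column.
df = pd.read_csv("UNSW_NB15_training-set.csv")
X = df.drop(columns=["label"]).select_dtypes("number")
y = df["label"]

# Recursive Feature Elimination ranked by Random Forest importances,
# followed by the same classifier on the reduced feature set.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(RandomForestClassifier(n_estimators=100, random_state=42),
                n_features_to_select=20)),   # target size is an assumption
    ("clf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="f1")
print(f"10-fold CV F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```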
Computer network security and integrity are severely impacted by network attacks. The ability to predict and prevent these attacks is crucial for maintaining a secure network environment. Supervised ML (Machine Learni...
Supervised machine learning algorithms are powerful classification techniques commonly used to build prediction models that help diagnose disease early. However, some challenges like overfitting and underfitting need to be overcome while building the model. This paper introduces hybrid classifiers using an ensembled model with a majority voting technique to improve prediction accuracy. Furthermore, a proposed preprocessing technique and feature selection based on a genetic algorithm are suggested to enhance prediction performance and reduce overall time consumption. In addition, the 10-fold cross-validation technique is used to overcome the overfitting problem. Experiments were performed on a dataset of cardiovascular patients from the UCI Machine Learning Repository. Through a comparative analytical approach, the study results indicated that the proposed ensemble classifier model achieved a classification accuracy of 98.18%, higher than the rest of the relevant developments in the study.
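A minimal sketch of the majority-voting ensemble with 10-fold cross-validation described above. The paper's genetic-algorithm feature selection is stood in for here by SelectKBest for brevity, and the base classifiers, file name, and column names are assumptions:

```python
# Hedged sketch: hard (majority) voting ensemble with a simple
# feature-selection stand-in and 10-fold cross-validation.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heart.csv")            # assumed UCI heart-disease export
X, y = df.drop(columns=["target"]), df["target"]

# Majority voting over three heterogeneous base classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=7)),
    ],
    voting="hard",
)

model = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),  # stand-in for GA selection
    ("vote", ensemble),
])

scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"10-fold CV accuracy: {scores.mean():.4f}")
```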
Today, cancer has become a common disease that can afflict the life of one in every three people. Breast cancer is also one of the cancer types for which early diagnosis and detection are especially important. The earlier breast cancer is detected, the higher the chances of the patient being treated. Therefore, many early detection or prediction methods are being investigated and used in the fight against breast cancer. In this paper, the aim was to predict and detect breast cancer early with non-invasive and painless methods that use data mining algorithms. All the data mining classification algorithms in Weka were run and compared against a data set obtained from the measurements of an antenna, consisting of frequency bandwidth, dielectric constant of the antenna's substrate, electric field, and tumor information for breast cancer detection and prediction. Results indicate that the Bagging, IBk, Random Committee, Random Forest, and SimpleCART algorithms were the most successful, with over 90% detection accuracy. This comparative study of several classification algorithms for breast cancer diagnosis, using a data set from the measurements of an antenna with a 10-fold cross-validation method, provided a perspective on the relative predictive ability of the data mining methods. From the data obtained in this study, it can be said that if a patient has a breast cancer tumor, detection of the tumor is possible.
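The study itself ran Weka (Java) classifiers; as a hedged sketch, the comparison loop can be approximated with scikit-learn analogues of the named algorithms (IBk ≈ k-NN, SimpleCART ≈ a CART decision tree, Random Committee ≈ an ensemble of randomized trees such as ExtraTrees). The antenna data file and its column names are assumptions:

```python
# Hypothetical re-creation of the 10-fold CV comparison with sklearn
# analogues of the Weka classifiers named in the abstract.
import pandas as pd
from sklearn.ensemble import (BaggingClassifier, ExtraTreesClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("antenna_measurements.csv")   # hypothetical file name
X, y = df.drop(columns=["tumor"]), df["tumor"]

candidates = {
    "Bagging": BaggingClassifier(random_state=1),
    "IBk (k-NN)": KNeighborsClassifier(),
    "RandomCommittee (ExtraTrees)": ExtraTreesClassifier(random_state=1),
    "RandomForest": RandomForestClassifier(random_state=1),
    "SimpleCART (DecisionTree)": DecisionTreeClassifier(random_state=1),
}

for name, clf in candidates.items():
    acc = cross_val_score(clf, X, y, cv=10, scoring="accuracy").mean()
    print(f"{name:30s} 10-fold accuracy: {acc:.3f}")
```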
As the amount of available digital documents keeps growing rapidly, extracting useful information from them has become a major challenge. Data mining, natural language processing, and machine learning are powerful techniques that can be used together to deal with this problem. Depending on the task at hand, there are many different approaches that can be used. The methods available are continuously improved, but not all of them have been tested and compared in a set of coherent problems using supervised machine learning algorithms. For example, what happens to the quality of the methods if we increase the training data size from, say, 100 MB to over 1 GB? Moreover, are quality gains worth it when the rate of data processing diminishes? Can we trade quality for time efficiency and recover the quality loss by just being able to process more data? We attempt to answer these questions in a general way for text processing tasks, considering the trade-offs involving training data size, learning time, and quality obtained. For this, we propose a performance trade-off framework and apply it to three important tasks: Named Entity Recognition, Sentiment Analysis, and Document Classification. These problems were also chosen because they have different levels of object granularity: words, paragraphs, and documents. For each problem, we selected several supervised machine learning algorithms and evaluated their trade-offs on large publicly available data sets (news, reviews, patents). To explore these trade-offs, we use different data subsets of increasing size, ranging from 50 MB to several GB. For the last two tasks, we also consider similar algorithms with two different data sets and two evaluation techniques, to study their impact on the resulting trade-offs. We find that the results do not change significantly and that most of the time the best algorithms are the ones with the fastest processing time. However, we also show that the results for small data (say less than 1
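The trade-off measurement the abstract describes amounts to training the same model on growing data subsets while recording training time against held-out quality. A minimal sketch, assuming a hypothetical labelled text corpus loaded by `load_corpus()` and illustrative subset sizes:

```python
# Sketch of a quality-vs-time trade-off loop: fit on increasing training
# subsets, record wall-clock training time and held-out macro-F1.
import time
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

texts, labels = load_corpus()   # hypothetical loader for a labelled corpus
X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.2,
                                          random_state=0)

for n in (10_000, 50_000, 100_000, len(X_tr)):  # growing training sizes
    vec = TfidfVectorizer(max_features=50_000)
    Xn = vec.fit_transform(X_tr[:n])
    clf = LogisticRegression(max_iter=1000)
    t0 = time.perf_counter()
    clf.fit(Xn, y_tr[:n])
    elapsed = time.perf_counter() - t0
    f1 = f1_score(y_te, clf.predict(vec.transform(X_te)), average="macro")
    print(f"n={n:>7d}  train_time={elapsed:6.1f}s  macro-F1={f1:.3f}")
```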
This study examined the biological, social, and clinical risk factors for mortality in coronavirus disease of 2019 (COVID-19) hospitalised patients. The population of the study is prone to COVID-19, thus understanding the most common traits and comorbidities of people who were affected is crucial in reducing its consequences. In this study, four supervised machine learning algorithms were implemented and compared to predict the mortality rate based on the explanatory variables across the five districts of Limpopo Province in South Africa. The data was obtained from the Limpopo Department of Health. Predictions about the chances of dying from COVID-19 disease were made using logistic regression, random forest, support vector machine, and decision tree algorithms on a dataset of 20,592 records with twenty-one attributes. Due to the imbalanced nature of the data, Random Over-Sampling Examples (ROSE) was employed to balance the data for more accurate classification. The ROSE package provides functions to deal with binary classification problems in the presence of imbalanced classes. We used 70% of the data for training, while 30% was selected for testing the predictive algorithms. A technique called Step Akaike's Information Criterion (StepAIC) was deployed to remove insignificant variables from the full logistic regression model. According to the findings of the study, among the four algorithms tested, random forest had the highest recall rate for predicting mortality, at roughly 79 percent, compared to the other three algorithms. Accordingly, we conclude that the random forest algorithm is appropriate for predicting the chances of patients dying from COVID-19 based on the attributes of the five districts of Limpopo Province. In terms of the features and their importance, a function called Variable Importance (VarImp) was used to check which of the attributes have predictive power on the outcome variable (discharged status). The findings revealed that
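ROSE, StepAIC, and VarImp are R-language tools; as a hedged Python sketch of the same workflow, imbalanced-learn's RandomOverSampler stands in for ROSE and Random Forest's impurity-based importances stand in for VarImp. The file name and column names are assumptions:

```python
# Hypothetical sketch: 70/30 split, oversample the training fold only,
# then report recall and a VarImp-style feature ranking.
import pandas as pd
from imblearn.over_sampling import RandomOverSampler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("limpopo_covid19.csv")          # hypothetical file name
X = df.drop(columns=["discharged_status"])
y = df["discharged_status"]                      # assumed binary outcome

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)
X_bal, y_bal = RandomOverSampler(random_state=42).fit_resample(X_tr, y_tr)

rf = RandomForestClassifier(n_estimators=300, random_state=42)
rf.fit(X_bal, y_bal)
print("Recall:", recall_score(y_te, rf.predict(X_te)))

# Analogue of R's VarImp: rank features by impurity-based importance.
importance = pd.Series(rf.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False).head(10))
```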