To address the need for summarizing and extracting information efficiently, this paper highlights the growing challenge posed by the increasing number of PDF files. Reading lengthy documents is a tedious and time-cons...
Software systems are essential in modern life and require rigorous testing to ensure reliability. Pairwise testing, a combinatorial testing methodology, optimizes test case generation by minimizing redundancy while ma...
This study applies single-valued neutrosophic sets, which extend the frameworks of fuzzy and intuitionistic fuzzy sets, to graph theory. We introduce a new category of graphs called Single-Valued Heptapartitioned Neut...
Imbalanced datasets are a persistent challenge in machine learning, particularly for classification methods. Underrepresented minority classes can result in biased and inaccurate models. The Synthetic Minority Over-Sampling Technique (SMOTE) was developed to address the problem of imbalanced data. Over time, several weaknesses have been identified in the way SMOTE generates synthetic minority class data, such as overlapping, noise, and small disjuncts. However, existing studies generally focus on only one of these weaknesses: noise or overlapping. This study therefore addresses both simultaneously in SMOTE-generated data. It proposes a combined approach of filtering, clustering, and distance modification to reduce the noise and overlapping produced by SMOTE. Filtering uses the k-NN method to remove minority class data (noise) located in majority class regions. Applying this Noise Reduction (NR) step before SMOTE has a positive impact in overcoming data imbalance. Clustering establishes decision boundaries by partitioning data into clusters, allowing SMOTE with modified distance metrics to generate minority class data within each cluster. This combination of clustering and distance modification aims to minimize overlap in synthetic minority data that could introduce noise. The proposed method, called “NR-Clustering SMOTE,” balances data in three stages: (1) filtering, which removes minority class data close to majority classes (noise) using the k-NN method; (2) clustering with K-means, which establishes decision boundaries by partitioning the data into several clusters; (3) SMOTE oversampling with Manhattan distance within each cluster. Test results indicate that the proposed NR-Clustering SMOTE method achieves the best performance across all evaluation metrics for classification methods such as Random Forest, SVM, and Naïve Bayes, compared t...
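The three stages listed in this abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation, and several details are assumptions because the abstract does not specify them: a minority point is treated as noise when all of its `k` nearest neighbours are majority points, K-means is run on the filtered minority points only, the deficit is split evenly across clusters, and all function names (`knn_noise_filter`, `kmeans`, `smote_in_cluster`, `nr_clustering_smote`) are hypothetical.

```python
import random

def manhattan(a, b):
    """Manhattan (L1) distance between two points given as tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def sq_euclid(a, b):
    """Squared Euclidean distance, used only by the K-means step."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_noise_filter(minority, majority, k=3):
    """Stage 1 (assumed rule): drop a minority point if all k nearest neighbours are majority."""
    kept = []
    for i, p in enumerate(minority):
        # Label 0 marks a minority neighbour, label 1 a majority neighbour.
        dists = [(manhattan(p, q), 0) for j, q in enumerate(minority) if j != i]
        dists += [(manhattan(p, q), 1) for q in majority]
        if any(label == 0 for _, label in sorted(dists)[:k]):
            kept.append(p)
    return kept

def kmeans(points, n_clusters, iters=20, rng=None):
    """Stage 2: plain Lloyd's K-means; returns the non-empty clusters as lists of points."""
    if not points or n_clusters < 1:
        return []
    rng = rng or random.Random(0)
    centers = rng.sample(points, n_clusters)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: sq_euclid(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old center if the group is empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[c]
                   for c, g in enumerate(groups)]
    return [g for g in groups if g]

def smote_in_cluster(cluster, n_new, k=3, rng=None):
    """Stage 3: SMOTE-style interpolation, with neighbours chosen by Manhattan distance."""
    rng = rng or random.Random(0)
    if len(cluster) < 2:
        return []
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(cluster))
        p = cluster[i]
        others = [q for j, q in enumerate(cluster) if j != i]
        neighbours = sorted(others, key=lambda q: manhattan(p, q))[:k]
        q = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from p to q
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic

def nr_clustering_smote(minority, majority, n_clusters=2, k=3, seed=0):
    """Filter, cluster, then oversample until the minority class matches the majority size."""
    rng = random.Random(seed)
    clean = knn_noise_filter(minority, majority, k)
    clusters = kmeans(clean, min(n_clusters, len(clean)), rng=rng)
    usable = [c for c in clusters if len(c) >= 2]
    deficit = max(len(majority) - len(clean), 0)
    synthetic = []
    for idx, cluster in enumerate(usable):
        # Even split of the deficit; the first few clusters absorb the remainder.
        share = deficit // len(usable) + (1 if idx < deficit % len(usable) else 0)
        synthetic += smote_in_cluster(cluster, share, k, rng)
    return clean + synthetic
```

Calling `nr_clustering_smote(minority, majority)` with two lists of coordinate tuples returns the filtered minority points followed by the synthetic ones. Because interpolation happens only between points of the same cluster, synthetic samples stay inside the region that cluster occupies, which is the overlap-reduction idea the abstract describes.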
This study proposes an innovative diabetes prediction chatbot that utilizes large language models (LLMs) to determine the likelihood of diabetes based on specific patient inputs. Unlike conventional machine learning m...
This paper demonstrates the feasibility of using an electronic nose to assess fish quality by analyzing air quality and examining volatile organic compounds (VOCs) alongside physical variables, with pH, protein conten...
In recent years, low-light image enhancement techniques have made significant progress in generating reasonable visual details. However, current methods have not yet fully exploited the semantic prior of visual el...
The rapid proliferation of Internet of Things (IoT) devices has led to a substantial increase in network packet traffic, raising significant privacy concerns. Although traffic encryption is employed to protect the privacy of IoT devices, attackers can still leverage Machine Learning (ML) and Deep Learning (DL) techniques to classify device types by analyzing packet characteristics, such as size and timing. The main challenges in the state of the art are the lack of effective methods for exposing privacy violations in encrypted IoT traffic, and the absence of robust defense mechanisms to mitigate privacy breaches caused by network traffic analysis. Considering these challenges, this study presents two key contributions: (i) a novel vector-based classification method that enhances device-type identification from encrypted IoT traffic using advanced ML and DL techniques, and (ii) a robust defense mechanism based on Differential Privacy (DP) and advanced padding techniques against traffic analysis attacks. The study examines privacy risks associated with sequential IoT device data and evaluates the effectiveness of ML algorithms using two datasets. The results demonstrate that the proposed vector-based classification method significantly improves the attacker’s classification accuracy, even when privacy-preserving techniques, such as padding, are used to obscure device-type classification. The study evaluates eXtreme Gradient Boosting (XGBoost), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) for IoT traffic classification, achieving accuracies of 99.61% with XGBoost, 96.74% with LSTM, and 96.94% with GRU. The Decision Tree (DT), Random Forest (RF), k-Nearest Neighbors (kNN), and GRU classification algorithms are evaluated as well and compared with the XGBoost and LSTM classifiers for the proposed attack model. As a defense mechanism, DP is applied using the Fourier Perturbation Algorithm (FPA) to optimize padd
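To make the padding idea mentioned in this abstract concrete, here is a minimal sketch of the simplest form of packet padding: rounding every packet size up to the next multiple of a fixed bucket. This is only a baseline illustration and not the DP/FPA-based defense the study proposes, whose details are not given here. The function names `pad_to_bucket` and `pad_trace` and the 128-byte default are hypothetical.

```python
def pad_to_bucket(size, bucket=128):
    """Round a packet size (in bytes) up to the next multiple of `bucket`."""
    return -(-size // bucket) * bucket  # ceiling division, then scale back up

def pad_trace(sizes, bucket=128):
    """Pad every packet in a trace; return the padded sizes and the total byte overhead."""
    padded = [pad_to_bucket(s, bucket) for s in sizes]
    overhead = sum(padded) - sum(sizes)
    return padded, overhead
```

For a trace with packet sizes 60, 200, and 1400 bytes, `pad_trace([60, 200, 1400])` returns `([128, 256, 1408], 132)`. The padded sizes reveal only which bucket each packet fell into, at the cost of 132 extra bytes. Larger buckets hide more but cost more bandwidth, which is the trade-off any padding defense has to tune.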
This paper considers the security of non-minimum phase systems, a typical kind of cyber-physical systems. Non-minimum phase systems are characterized by unstable zeros in their transfer functions, making them particul...
Cloud-based Intelligence of Things is significant for Augmented Enterprise Management Systems. Data integrity auditing is challenging in the intelligence of things environment, mainly when the newer versions in the pu...