The gaming industry produces vast amounts of user-generated feedback, making it challenging for developers to efficiently analyze and respond to real-time reviews. This study addresses the problem of classifying large...
An imbalanced dataset often challenges machine learning, particularly classification methods: underrepresented minority classes can result in biased and inaccurate models. The Synthetic Minority Over-Sampling Technique (SMOTE) was developed to address this problem. Over time, several weaknesses have been identified in how SMOTE generates synthetic minority class data, such as overlapping, noise, and small disjuncts. However, existing studies generally focus on only one of these weaknesses, either noise or overlapping. This study addresses both simultaneously, proposing a combined approach of filtering, clustering, and distance modification to reduce the noise and overlapping produced by SMOTE. Filtering uses the k-NN method to remove minority class data (noise) located in majority class regions. This Noise Reduction (NR) step, which removes data considered noise before SMOTE is applied, has a positive impact in overcoming data imbalance. Clustering establishes decision boundaries by partitioning data into clusters, allowing SMOTE with a modified distance metric to generate minority class data within each cluster. Combining clustering with distance modification aims to minimize overlap in the synthetic minority data that could introduce noise. The proposed method, called “NR-Clustering SMOTE,” balances data in three stages: (1) filtering, which removes minority class data close to majority classes (noise) using the k-NN method; (2) clustering with K-means, which establishes decision boundaries by partitioning the data into several clusters; (3) SMOTE oversampling with Manhattan distance within each cluster. Test results indicate that the proposed NR-Clustering SMOTE method achieves the best performance across all evaluation metrics for classification methods such as Random Forest, SVM, and Naïve Bayes, compared t
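The three stages named in the abstract above (k-NN filtering, K-means clustering, SMOTE with Manhattan distance inside each cluster) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the rule that a minority point counts as noise when all of its k nearest neighbours belong to the majority, the use of Manhattan distance in the filter, clustering only the minority points, and the proportional allocation of synthetic samples across clusters are all assumptions made here for illustration; the abstract does not specify them.

```python
import numpy as np

def manhattan(A, B):
    """Pairwise Manhattan (L1) distances between rows of A and rows of B."""
    return np.abs(A[:, None, :] - B[None, :, :]).sum(axis=2)

def filter_noise(X, y, minority, k=5):
    """Stage 1 (assumed rule): drop minority points whose k nearest
    neighbours all belong to other classes."""
    d = manhattan(X, X)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]
    all_majority = (y[nn] != minority).all(axis=1)
    keep = ~((y == minority) & all_majority)
    return X[keep], y[keep]

def kmeans(X, n_clusters, rng, n_iter=50):
    """Stage 2: plain Lloyd's K-means, returning one cluster label per row."""
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        for c in range(n_clusters):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def smote_in_cluster(P, n_new, k, rng):
    """Stage 3: interpolate between a point and one of its k
    Manhattan-nearest neighbours from the same cluster."""
    if len(P) < 2 or n_new <= 0:
        return np.empty((0, P.shape[1]))
    d = manhattan(P, P)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :min(k, len(P) - 1)]
    base = rng.integers(0, len(P), n_new)
    nbr = nn[base, rng.integers(0, nn.shape[1], n_new)]
    gap = rng.random((n_new, 1))         # interpolation factor in [0, 1)
    return P[base] + gap * (P[nbr] - P[base])

def nr_clustering_smote(X, y, minority, n_clusters=3, k=5, seed=0):
    """Hypothetical end-to-end sketch: filter, cluster, oversample."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    X, y = filter_noise(X, y, minority, k)
    P = X[y == minority]
    deficit = int((y != minority).sum() - len(P))
    if deficit <= 0 or len(P) < 2:
        return X, y
    nc = min(n_clusters, len(P))
    labels = kmeans(P, nc, rng)          # assumption: cluster minority points only
    sizes = np.bincount(labels, minlength=nc)
    eligible = np.flatnonzero(sizes >= 2)
    if len(eligible) == 0:
        return X, y
    # Assumption: split the deficit in proportion to cluster size.
    quota = deficit * sizes[eligible] // sizes[eligible].sum()
    quota[np.argmax(sizes[eligible])] += deficit - quota.sum()
    synth = [smote_in_cluster(P[labels == c], q, k, rng)
             for c, q in zip(eligible, quota)]
    X_new = np.vstack([X] + synth)
    y_new = np.concatenate([y, np.full(deficit, minority, dtype=y.dtype)])
    return X_new, y_new
```

Calling `nr_clustering_smote(X, y, minority=1)` on a feature matrix `X` and label vector `y` returns a rebalanced copy in which every synthetic point is a convex combination of two minority points from the same cluster. That is one way to keep generated data from straddling cluster boundaries, which is the overlap concern the abstract describes.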
Accessing complex medical data, especially temporal information, presents a significant challenge for non-technical users, including healthcare professionals not versed in technology or query languages like SPARQL. Th...
Recently, theory-guided neural networks have attracted significant attention in solving partial differential equations due to their minimal data requirements and alignment with physical laws. However, selecting the pe...
The paper generalizes the direct method of moving planes to the Logarithmic Laplacian ***, some key ingredients of the method are discussed, for example, Narrow region principle and Decay at ***, the radial symmetry of the solution of the Logarithmic Laplacian system is obtained.
Peer and self-assessment open opportunities to scale assessments in online classrooms. This article reports our experiences of using AsPeer, a peer assessment system, in two iterations of a university online class. We ...
Global visual localization is critical for UAVs operating in environments where global navigation satellite systems (GNSS) are unreliable or unavailable. While many methods, such as visual-inertial odometry (VIO), rely on opti...
This paper presents a method of backscatter communication. Backscatter communication is a technology whereby a signal reflected back to the source is utilized to transmit information. An FMCW radar operating in the 60...
Relation extraction is a key task in biomedical natural language processing (NLP), aiming to identify and extract relationships between entities across various sources, including clinical texts, research papers, and o...
This paper examines a cutting-edge approach that melds backscatter modulation principles with a frequency modulated continuous wave (FMCW) radar system. Building upon radio-frequency identification (RFID) backscatter ...