An imbalanced dataset often challenges machine learning, particularly classification methods. Underrepresented minority classes can result in biased and inaccurate models. The Synthetic Minority Over-Sampling Technique (SMOTE) was developed to address the problem of imbalanced data. Over time, several weaknesses of SMOTE have been identified in how it generates synthetic minority class data, such as overlapping, noise, and small disjuncts. However, existing studies generally focus on only one of these weaknesses: noise or overlapping. This study therefore addresses both at once, tackling noise and overlapping in SMOTE-generated data. It proposes a combined approach of filtering, clustering, and distance modification to reduce the noise and overlapping produced by SMOTE. Filtering uses the k-NN method to remove minority class data (noise) located in majority class regions. Applying Noise Reduction (NR), which removes data considered to be noise before SMOTE is run, has a positive impact in overcoming data imbalance. Clustering establishes decision boundaries by partitioning the data into clusters, allowing SMOTE with a modified distance metric to generate minority class data within each cluster. This combination of clustering and distance modification aims to minimize overlap in the synthetic minority data that could introduce noise. The proposed method, called “NR-Clustering SMOTE,” balances data in several stages: (1) filtering, which removes minority class data close to the majority class (noise) using the k-NN method; (2) clustering the data with K-means to establish decision boundaries by partitioning the data into several clusters; (3) applying SMOTE oversampling with Manhattan distance within each cluster. Test results indicate that the proposed NR-Clustering SMOTE method achieves the best performance across all evaluation metrics for classification methods such as Random Forest, SVM, and Naïve Bayes, compared t
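The three stages named in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the noise rule (a minority point is dropped when all of its k nearest neighbours are majority points), the choice to cluster only the filtered minority points, the use of Manhattan distance in the filter, and every function and parameter name are assumptions made for the example.

```python
import random

def manhattan(a, b):
    """Manhattan (L1) distance between two points given as tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_filter(minority, majority, k=3):
    """Stage 1 (assumed rule): drop minority points whose k nearest
    neighbours all belong to the majority class."""
    kept = []
    for i, p in enumerate(minority):
        scored = [(manhattan(p, q), 0) for j, q in enumerate(minority) if j != i]
        scored += [(manhattan(p, q), 1) for q in majority]
        nearest = sorted(scored)[:k]
        if not all(label == 1 for _, label in nearest):
            kept.append(p)
    return kept

def kmeans(points, n_clusters, n_iter=20, rng=None):
    """Stage 2: plain K-means (squared Euclidean); returns non-empty clusters."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, n_clusters)
    clusters = []
    for _ in range(n_iter):
        clusters = [[] for _ in range(n_clusters)]
        for p in points:
            idx = min(range(n_clusters),
                      key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            clusters[idx].append(p)
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return [cl for cl in clusters if cl]

def smote_in_cluster(cluster, n_new, k=3, rng=None):
    """Stage 3: SMOTE-style interpolation restricted to one cluster,
    with neighbours ranked by Manhattan distance."""
    rng = rng or random.Random(0)
    synthetic = []
    if len(cluster) < 2:
        return synthetic
    for _ in range(n_new):
        i = rng.randrange(len(cluster))
        p = cluster[i]
        others = [q for j, q in enumerate(cluster) if j != i]
        neighbours = sorted(others, key=lambda q: manhattan(p, q))[:k]
        q = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic

def nr_clustering_smote(minority, majority, n_clusters=2, k=3, seed=0):
    """Run the three stages and return (filtered minority, synthetic points)."""
    rng = random.Random(seed)
    clean = knn_filter(minority, majority, k)
    clusters = kmeans(clean, min(n_clusters, len(clean)), rng=rng)
    usable = [cl for cl in clusters if len(cl) >= 2]
    needed = max(0, len(majority) - len(clean))
    synthetic = []
    for i, cl in enumerate(usable):
        # Spread the required number of new points evenly over the clusters.
        share = needed // len(usable) + (1 if i < needed % len(usable) else 0)
        synthetic += smote_in_cluster(cl, share, k, rng)
    return clean, synthetic
```

Because each synthetic point is interpolated between two members of the same cluster, it cannot fall in the gap between clusters, which is the intuition behind using clustering to limit overlap.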
Data is the lifeblood of the modern world, forming a fundamental part of AI, decision-making, and research advances. With growing interest in data, governments have taken important steps towards a regulated data world, drastically impacting data sharing and data usability and leaving massive amounts of data confined within the walls of organizations. While synthetic data generation (SDG) is an appealing way to break down these walls and enable data sharing, the main drawback of existing solutions is that they assume a trusted aggregator for generative model training. Given that many data holders may not want to, or may not be legally allowed to, entrust a central entity with their raw data, we propose a framework for collaborative and private generation of synthetic tabular data from distributed data holders. Our solution is general, applicable to any marginal-based SDG, and provides input privacy by replacing the trusted aggregator with secure multi-party computation (MPC) protocols and output privacy via differential privacy (DP). We demonstrate the applicability and scalability of our approach for the state-of-the-art select-measure-generate SDG algorithms MWEM+PGM and AIM. Copyright 2024 by the author(s)
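To make the idea of measuring a marginal without a trusted aggregator more concrete, the sketch below aggregates a one-way marginal from several data holders using additive secret sharing and then perturbs it with Gaussian noise. It is an illustration under strong simplifying assumptions and not the protocol from the paper: the modulus, the function names, and the noise scale are hypothetical, and the noise is added after reconstruction purely for readability, whereas a protocol aiming at both input and output privacy would have to add it before any exact count is revealed. `random.gauss` is also not a secure noise sampler.

```python
import random
import secrets
from collections import Counter

PRIME = 2**61 - 1  # modulus for the additive shares (illustrative choice)

def local_marginal(rows, column, domain):
    """Counts of each domain value in one holder's local rows."""
    counts = Counter(row[column] for row in rows)
    return [counts.get(value, 0) for value in domain]

def share(vector, n_parties):
    """Split every count into n_parties additive shares modulo PRIME."""
    shares = [[secrets.randbelow(PRIME) for _ in vector]
              for _ in range(n_parties - 1)]
    last = [(v - sum(s[i] for s in shares)) % PRIME
            for i, v in enumerate(vector)]
    return shares + [last]

def aggregate(all_shares):
    """all_shares[h][p] is the share vector holder h sent to party p.
    Each party sums what it received; the partial sums are then combined."""
    n_parties = len(all_shares[0])
    party_sums = [
        [sum(col) % PRIME for col in zip(*(holder[p] for holder in all_shares))]
        for p in range(n_parties)
    ]
    return [sum(col) % PRIME for col in zip(*party_sums)]

def noisy_marginal(counts, sigma, rng=None):
    """Gaussian perturbation of the aggregated counts (simplified DP step)."""
    rng = rng or random.Random()
    return [c + rng.gauss(0.0, sigma) for c in counts]

# Hypothetical data: two holders, one categorical column.
domain = ["a", "b", "c"]
holder_1 = [{"col": "a"}, {"col": "a"}, {"col": "b"}]
holder_2 = [{"col": "b"}, {"col": "c"}, {"col": "c"}, {"col": "c"}]

all_shares = [share(local_marginal(rows, "col", domain), n_parties=3)
              for rows in (holder_1, holder_2)]
total = aggregate(all_shares)
released = noisy_marginal(total, sigma=1.0)
```

No single computing party sees a holder's local counts, since each individual share is uniformly random modulo `PRIME`; only the sum over all parties reconstructs the joint marginal.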
In the basic vehicle routing problem (VRP), a vehicle must deliver goods from one centralized warehouse to multiple customers efficiently. Several VRP variants and constraints exist, including different product types,...
Situation awareness is the cognitive capability of human and artificial agents to perceive, understand and predict the status of the situation in an environment. Situation awareness systems aim at supporting the situa...
Some of the applications in the field of machine learning and computer vision may require a higher level of privacy and security, better cost efficiency, offline operation, and greater scalability and efficiency. This...
The power sector is an important factor in ensuring the development of the national *** simulation and prediction of power consumption help achieve the balance between power generation and power *** this paper, a Multi-strategy Hybrid Coati Optimizer (MCOA) is used to optimize the parameters of the three-parameter combinatorial optimization model TDGM(1,1,r,ξ,Csz) to realize the simulation and prediction of China's daily electricity ***, a novel MCOA is proposed in this paper, by making the following improvements to the Coati Optimization Algorithm (COA): (i) Introduce improved circle chaotic mapping strategy. (ii) Fusing Aquila Optimizer, to enhance MCOA's exploration capabilities. (iii) Adopt an adaptive optimal neighborhood jitter learning *** improve MCOA escape from local optimal solutions. (iv) Incorporating Differential Evolution to enhance the diversity of the ***, the superiority of the MCOA algorithm is verified by comparing it with the newly proposed algorithm, the improved optimization algorithm, and the hybrid algorithm on the CEC2019 and CEC2020 test ***, in this paper, MCOA is used to optimize the parameters of TDGM(1,1,r,ξ,Csz), and this model is applied to forecast the daily electricity consumption in China and compared with the predictions of 14 models, including seven intelligent algorithm-optimized TDGM(1,1,r,ξ,Csz), and seven forecasting *** experimental results show that the error of the proposed method is minimized, which verifies the validity of the proposed method.
This research examines the transmission dynamics of the Omicron variant of COVID-19 using SEIQIcRVW and SQIRV models, considering the delay in converting susceptible individuals into infected *** significant delays eventually resulted in the pandemic’s *** ensure the safety of the host population, this concept integrates quarantine and the COVID-19 *** investigate the stability of the proposed *** fundamental reproduction number influences stability *** to our findings, asymptomatic cases considerably impact the prevalence of Omicron infection in the *** real data of the Omicron variant from Chennai, Tamil Nadu, India, is used to validate the outputs.
Recently, nano-systems based on molecular communications via diffusion (MCvD) have been implemented in a variety of nanomedical applications, most notably in targeted drug delivery system (TDDS) ***, because the MCvD is unreliable and there exists molecular noise and inter symbol interference (ISI), cooperative nano-relays can acquire the reliability for drug delivery to targeted diseased cells, especially if the separation distance between the nano transmitter and nano receiver is *** this work, we propose an approach for optimizing the performance of the nano system using cooperative molecular communications with a nano relay scheme, while accounting for blood flow effects in terms of drift *** fractions of the molecular drug that should be allocated to the nano transmitter and nano relay positioning are computed using a collaborative optimization problem solved by the Modified Central Force Optimization (MCFO) *** the previous work, the probability of bit error is expressed in a closed-form *** is used as an objective function to determine the optimal velocity of the drug molecules and the detection threshold at the nano *** simulation results show that the probability of bit error can be dramatically reduced by optimizing the drift velocity, detection threshold, location of the nano-relay in the proposed nano system, and molecular drug budget.
As communication technologies and equipment evolve, smart assets become smarter. The agricultural industry is also evolving in line with the implementation of modern communication protocols, intelligent sensors, and e...
The general goal of the research is to reduce loss of productivity, failure and downtime and the percentage of waste in production in the company, which is already committed to and successful in applying standards, ra...