Research on class imbalance problems is primarily concerned with generating high-quality data for minority classes. Prior work has proposed various approaches to synthesizing such data, with varying quality in the results. This study introduces a novel oversampling framework, termed the optimal oversampling framework (OOF), which takes a different perspective. OOF uses optimization algorithms to guide the data generation process, ensuring that new samples are not only similar to the minority class but also sufficiently diverse. Specifically, the method combines initialization and evolutionary strategies to refine the generated samples, and evaluates their similarity to the minority and majority classes using distance and cosine similarity measures. In addition, OOF prevents premature convergence and keeps the generated samples unique through diversity judgments and the design of the fitness function. Finally, OOF ranks the optimized samples and selects the highest-quality ones for oversampling. To demonstrate the effectiveness of OOF, we integrated the Particle Swarm Optimization (PSO) algorithm with OOF and conducted comparative experiments involving nine different oversampling methods across 21 datasets characterized by high class imbalance ratios. The experimental results support the effectiveness of the OOF approach.
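The abstract does not give the exact fitness function, initialization scheme, or diversity rule, so the following is only a minimal sketch of how the described pipeline might look under stated assumptions. The names `fitness`, `is_diverse`, and `pso_oversample`, the equal weighting of distance and cosine terms, the noise-based initialization, and the PSO coefficients (0.7, 1.5, 1.5) are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def cosine_sim(x, Y):
    """Cosine similarity between vector x and each row of Y."""
    den = np.linalg.norm(Y, axis=1) * np.linalg.norm(x) + 1e-12
    return (Y @ x) / den


def fitness(x, minority, majority):
    """Hypothetical fitness (higher is better): close to the minority class,
    far from the majority class, in both distance and cosine similarity."""
    d_min = np.linalg.norm(minority - x, axis=1).mean()
    d_maj = np.linalg.norm(majority - x, axis=1).mean()
    c_min = cosine_sim(x, minority).mean()
    c_maj = cosine_sim(x, majority).mean()
    return (d_maj - d_min) + (c_min - c_maj)


def is_diverse(x, accepted, min_gap=1e-3):
    """Assumed diversity rule: reject near-duplicates of accepted samples."""
    if not accepted:
        return True
    return np.linalg.norm(np.asarray(accepted) - x, axis=1).min() > min_gap


def pso_oversample(minority, majority, n_new, n_particles=30, n_iter=50, seed=0):
    """Generate up to n_new synthetic minority samples with a basic PSO loop."""
    rng = np.random.default_rng(seed)
    dim = minority.shape[1]
    # Assumed initialization: minority points plus small Gaussian noise.
    idx = rng.integers(0, len(minority), size=n_particles)
    noise = rng.standard_normal((n_particles, dim))
    pos = minority[idx] + 0.1 * minority.std(axis=0) * noise
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, minority, majority) for p in pos])

    for _ in range(n_iter):
        gbest = pbest[pbest_fit.argmax()]
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO velocity and position update.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([fitness(p, minority, majority) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]

    # Rank personal bests by fitness and keep the best diverse candidates.
    accepted = []
    for i in np.argsort(-pbest_fit):
        if is_diverse(pbest[i], accepted):
            accepted.append(pbest[i])
        if len(accepted) == n_new:
            break
    return np.array(accepted)
```

A call such as `pso_oversample(X_min, X_maj, n_new=10)` would return at most 10 synthetic rows with the same number of features as `X_min`; fewer rows come back if the diversity check rejects near-duplicates or if `n_new` exceeds `n_particles`.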