Author affiliations: Tianjin Univ Technol, Sch Comp Sci & Engn, Tianjin 300386, Peoples R China; Aalborg Univ, Dept Comp Sci, DK-9220 Aalborg, Denmark; Tech Univ Denmark, Dept Technol Management & Econ, DK-2800 Kongens Lyngby, Denmark
Publication: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (IEEE Trans Knowl Data Eng)
Year, volume, issue: 2025, Vol. 37, No. 3
Pages: 1167-1181
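Core indexing:
Subject classification: 0808 [Engineering: Electrical Engineering]; 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]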
Funding: National Natural Science Foundation of China [62306212, T2422015, 62020106004]; Tianjin Natural Science Foundation [23JCJQJC000070]; Marie Sklodowska-Curie Actions (MSCA)
Subject terms: Data models; Noise; Classification algorithms; Machine learning; Neural networks; Training; Noise measurement; Machine learning algorithms; Federated learning; Fans; Class imbalance learning; feature rectification; latent space; learning paradigm
Abstract: Class imbalance learning is a challenging task in machine learning applications. To balance training data, traditional class imbalance learning approaches, such as class resampling or reweighting, are commonly applied in the literature. However, these methods can have significant limitations, particularly in the presence of noisy data or missing values, or when applied to advanced learning paradigms such as semi-supervised or federated learning. To address these limitations, this paper proposes a novel and theoretically ensured latent Feature Rectification method for clAss iMbalance lEarning (FRAME). FRAME automatically learns multiple centroids for each class in the latent space and then performs class balancing. Unlike data-level methods, FRAME balances features in the latent space rather than in the original space. Unlike algorithm-level methods, FRAME distinguishes classes by distance, without the need to adjust the learning algorithm. Through latent feature rectification, FRAME can effectively mitigate the effects of noise and missing values without being affected by structural variations in the data. To accommodate a wider range of applications, this paper extends FRAME to three main learning paradigms: fully supervised learning, semi-supervised learning, and federated learning. Extensive experiments on 10 binary-class datasets demonstrate that FRAME achieves performance competitive with state-of-the-art methods and is robust to noise and missing values.
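
The general idea of representing each class by several centroids and classifying by distance can be illustrated with a minimal sketch. This is not the FRAME algorithm: the abstract does not say how the centroids are learned or how balancing is performed, so the sketch assumes plain k-means run separately per class on vectors that are already in some latent space, and it gives every class the same number of centroids as a crude stand-in for balancing. The function names (`kmeans`, `fit_class_centroids`, `predict`) and the toy data are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumption, not the paper's procedure)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def fit_class_centroids(latents, labels, k=2):
    """Learn up to k centroids per class, regardless of class size."""
    by_class = {}
    for z, y in zip(latents, labels):
        by_class.setdefault(y, []).append(z)
    return {y: kmeans(zs, min(k, len(zs))) for y, zs in by_class.items()}


def predict(class_centroids, z):
    """Assign z to the class that owns the nearest centroid."""
    return min(
        class_centroids,
        key=lambda y: min(math.dist(z, c) for c in class_centroids[y]),
    )


# Toy imbalanced data: 40 majority points in two modes, 4 minority points.
rng = random.Random(1)
majority = [[rng.gauss(0, 0.3), rng.gauss(0, 0.3)] for _ in range(20)]
majority += [[rng.gauss(4, 0.3), rng.gauss(0, 0.3)] for _ in range(20)]
minority = [[rng.gauss(2, 0.3), rng.gauss(3, 0.3)] for _ in range(4)]
latents = majority + minority
labels = [0] * len(majority) + [1] * len(minority)

centroids = fit_class_centroids(latents, labels, k=2)
print(predict(centroids, [2.0, 3.0]))  # query near the minority mode
print(predict(centroids, [0.0, 0.0]))  # query near a majority mode
```

Because the decision depends only on distances to a fixed number of centroids per class, the 10:1 imbalance in sample counts does not directly bias the prediction in this sketch; how FRAME achieves its balancing and its theoretical guarantees is described in the paper itself.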