IEEE Transactions on Artificial Intelligence

Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks

Authors: Vahidian, Saeed; Morafah, Mahdi; Chen, Chen; Shah, Mubarak; Lin, Bill

Affiliations: Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA 92093, United States; Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816-2365, United States; Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816, United States

Publication: IEEE Transactions on Artificial Intelligence (IEEE Trans. Artif. Intell.)

Year, volume, issue: 2024, Vol. 5, No. 3

Pages: 1386-1397


Funding: This paper was recommended for publication by Associate Editor Keeley Alexandra Crockett upon evaluation of the reviewers' comments. This work was supported in part by NSF under Grant 1956339.

Subject: Artificial intelligence

Abstract: Though successful, federated learning (FL) presents new challenges for machine learning, especially when data heterogeneity, also known as Non-IID data, arises. To cope with statistical heterogeneity, previous works incorporated a proximal term in local optimization, modified the model aggregation scheme on the server side, or advocated clustered federated learning approaches, in which the central server groups the agent population into clusters with jointly trainable data distributions to take advantage of a certain level of personalization. While effective, these approaches lack a deep elaboration on what kind of data heterogeneity is present and how it affects the accuracy of the participating clients. In contrast to many prior FL approaches, we demonstrate that data heterogeneity in current setups is not only not necessarily a problem but can in fact be beneficial for *** are intuitive: 1) dissimilar labels of clients (label skew) are not necessarily considered data heterogeneity, and 2) the principal angle between the clients' data subspaces, spanned by their corresponding principal vectors of data, is a better estimate of the data heterogeneity.

Impact Statement: FL is becoming a compelling learning paradigm in the artificial intelligence (AI) area. However, FL suffers from a notorious issue: the existence of statistical Non-IID data across different distributed clients. Because participants are diverse, severe data heterogeneity can be present in different clients' data, which has been demonstrated to result in unstable and slow convergence. One example is training a global model across hospitals to identify brain/cancer tumors, where every hospital's images come from a different domain/distribution. To simulate this statistical data heterogeneity, data heterogeneity has been simply modeled as Non-IID label skew, which tends to be a rigid data partitioning and is hardly representative and th
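To illustrate the principal-angle notion named in the abstract, here is a minimal sketch. It is not the authors' implementation, and this record does not give their exact procedure. It assumes each client's data is a matrix of shape (features, samples), takes the top-p left singular vectors as that client's subspace, and reads the principal angles off the singular values of the product of two bases. The function names, the toy clients, and the choice p = 1 are hypothetical.

```python
import numpy as np


def principal_subspace(data, p):
    """Orthonormal basis (n_features x p) from the top-p left singular vectors.

    Assumption: `data` has shape (n_features, n_samples) and is used uncentered.
    """
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :p]


def principal_angles_deg(basis_a, basis_b):
    """Principal angles in degrees (ascending) between two orthonormal bases."""
    # Singular values of A^T B are the cosines of the principal angles.
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding above 1
    return np.degrees(np.arccos(cosines))


# Hypothetical toy clients in a 3-dimensional feature space.
rng = np.random.default_rng(0)
direction_x = np.array([[1.0], [0.0], [0.0]])
direction_y = np.array([[0.0], [1.0], [0.0]])
client_a = direction_x * rng.normal(size=(1, 50))  # varies along x only
client_b = direction_x * rng.normal(size=(1, 50))  # same direction as client A
client_c = direction_y * rng.normal(size=(1, 50))  # orthogonal direction

basis_a = principal_subspace(client_a, p=1)
basis_b = principal_subspace(client_b, p=1)
basis_c = principal_subspace(client_c, p=1)

angle_ab = principal_angles_deg(basis_a, basis_b)[0]
angle_ac = principal_angles_deg(basis_a, basis_c)[0]
print(f"A vs B: {angle_ab:.4f} degrees")
print(f"A vs C: {angle_ac:.4f} degrees")
```

Under this reading, a small angle suggests that two clients' data occupy nearly the same subspace even if their labels differ, while an angle near 90 degrees suggests largely unrelated data.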
