
IEEE Transactions on Artificial Intelligence

Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks

Authors: Vahidian, Saeed; Morafah, Mahdi; Chen, Chen; Shah, Mubarak; Lin, Bill

Author affiliations: Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA 92093, United States; Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816-2365, United States; Center for Research in Computer Vision, University of Central Florida, Orlando, FL 32816, United States

Publication: IEEE Transactions on Artificial Intelligence (IEEE Trans. Artif. Intell.)

Year, volume, issue: 2024, Vol. 5, No. 3

Pages: 1386-1397


Funding: This paper was recommended for publication by Associate Editor Keeley Alexandra Crockett upon evaluation of the reviewers' comments. This work was supported in part by NSF under Grant 1956339.

Subject: Artificial intelligence

Abstract: Though successful, federated learning (FL) presents new challenges for machine learning, especially when the issue of data heterogeneity, also known as Non-IID data, arises. To cope with statistical heterogeneity, previous works incorporated a proximal term in local optimization, modified the model aggregation scheme at the server side, or advocated clustered federated learning approaches, where the central server groups the agent population into clusters with jointly trainable data distributions to take advantage of a certain level of personalization. While effective, they lack a deep elaboration on what kind of data heterogeneity is present and how it impacts the accuracy of the participating clients. In contrast to many of the prior FL approaches, we demonstrate not only that the issue of data heterogeneity in current setups is not necessarily a problem but also that in fact it can be beneficial for *** are intuitive: 1) Dissimilar labels of clients (label skew) are not necessarily considered data heterogeneity, and 2) the principal angle between the clients' data subspaces spanned by their corresponding principal vectors of data is a better estimate of the data heterogeneity. Impact Statement: FL is becoming a compelling learning paradigm in the artificial intelligence (AI) area. However, FL suffers from a notorious issue, which is the existence of statistical Non-IID data across different distributed clients. Due to diverse participants, severe data heterogeneity can be present in different clients' data, which has been demonstrated to result in unstable and slow convergence. One example is training a global model across hospitals to identify brain/cancer tumors, where every hospital's images come from a different domain/distribution. To simulate this statistical data heterogeneity, data heterogeneity has been simply modeled as Non-IID label skew, which tends to be a rigid data partitioning and is hardly representative and th
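The abstract's second observation, that the principal angle between clients' data subspaces can serve as an estimate of data heterogeneity, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the subspace dimension `p`, the use of uncentered data matrices, and the synthetic clients are all assumptions made for illustration. The sketch relies on the standard fact that, for two orthonormal bases, the singular values of the product of one basis transposed with the other are the cosines of the principal angles between the subspaces they span.

```python
import numpy as np


def principal_basis(data, p):
    """Top-p left singular vectors of a (features x samples) data matrix.

    Assumption: the data matrix is used as-is (no centering).
    """
    u, _, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :p]


def smallest_principal_angle_deg(data_a, data_b, p=2):
    """Smallest principal angle (degrees) between two clients' p-dim subspaces."""
    ua = principal_basis(data_a, p)
    ub = principal_basis(data_b, p)
    # Singular values of ua^T ub are the cosines of the principal angles.
    cosines = np.linalg.svd(ua.T @ ub, compute_uv=False)
    cosines = np.clip(cosines, -1.0, 1.0)  # guard against rounding error
    return float(np.degrees(np.arccos(cosines.max())))


# Hypothetical clients: A and B draw samples from the same 2-D subspace of
# R^6, while C draws samples from an orthogonal 2-D subspace.
rng = np.random.default_rng(0)
identity = np.eye(6)
shared_basis = identity[:, :2]   # spans coordinates 0 and 1
other_basis = identity[:, 2:4]   # spans coordinates 2 and 3

client_a = shared_basis @ rng.normal(size=(2, 50))
client_b = shared_basis @ rng.normal(size=(2, 50))
client_c = other_basis @ rng.normal(size=(2, 50))

angle_ab = smallest_principal_angle_deg(client_a, client_b)
angle_ac = smallest_principal_angle_deg(client_a, client_c)
print(f"A vs B: {angle_ab:.2f} degrees")
print(f"A vs C: {angle_ac:.2f} degrees")
```

In this toy setup, clients whose samples lie in the same subspace yield an angle near 0 degrees, and clients whose samples lie in orthogonal subspaces yield an angle near 90 degrees, regardless of how their labels are distributed.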
