Learning over inherently distributed data

Authors: Yan, Donghui; Xu, Ying

Affiliations: Department of Mathematics and Program in Data Science, University of Massachusetts Dartmouth, MA, United States; Indigo Agriculture Inc, Boston, MA, United States

Publication: arXiv

Year: 2019

Keywords: Big data

Abstract: Recent decades have seen a surge of interest in distributed computing. Existing work focuses primarily on distributed computing platforms, data query tools, or algorithms that divide big data and conquer it at individual machines. Increasingly often, however, the data of interest are inherently distributed, i.e., stored at multiple distributed sites due to diverse collection channels, business operations, etc. We propose to enable learning and inference in such a setting via a general framework based on distortion-minimizing local transformations. This framework requires only a small number of local signatures to be shared among the distributed sites, eliminating the need to transmit big data. Computation can be carried out very efficiently via parallel local computation, and the error incurred by distributed computing vanishes as the size of the local signatures increases. Since the shared data need not be in their original form, data privacy may also be preserved. Experiments on linear (logistic) regression and Random Forests have shown the promise of this approach. The framework is expected to apply to a general class of learning and inference tools with the continuity property. Copyright © 2019, The Authors. All rights reserved.
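The abstract describes the framework only at a high level. As a concrete illustration, the following is a minimal sketch, assuming vector quantization (k-means) as the distortion-minimizing local transformation and a weighted logistic regression fitted on the pooled signatures; the helper local_signature, the number of codewords, and the simulated site data are hypothetical and not taken from the paper.

```python
# A minimal, hypothetical sketch (not the authors' exact algorithm) of the
# "local signature" idea: each distributed site compresses its data with
# k-means, a distortion-minimizing transformation, and shares only the
# centroids, their majority labels, and their counts. A coordinator then
# fits a weighted model on the pooled signatures, so raw data never leave
# the sites.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def local_signature(X, y, n_codewords=50, seed=0):
    """Summarize one site's data as (centroid, majority label, count) triples."""
    km = KMeans(n_clusters=n_codewords, n_init=10, random_state=seed).fit(X)
    centers, labels, counts = [], [], []
    for k in range(n_codewords):
        mask = km.labels_ == k
        if not mask.any():
            continue
        centers.append(km.cluster_centers_[k])
        labels.append(int(round(y[mask].mean())))  # majority vote for binary y
        counts.append(int(mask.sum()))
    return np.array(centers), np.array(labels), np.array(counts)

# Simulated "inherently distributed" data: three sites, none of which
# ever transmits its raw observations.
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
sites = []
for _ in range(3):
    X = rng.normal(size=(2000, 5))
    y = (X @ true_beta + rng.normal(size=2000) > 0).astype(int)
    sites.append((X, y))

# Each site computes its signature locally; this step parallelizes trivially.
signatures = [local_signature(X, y) for X, y in sites]

# The coordinator pools only the small signatures and fits a weighted model.
C = np.vstack([c for c, _, _ in signatures])
L = np.concatenate([l for _, l, _ in signatures])
W = np.concatenate([w for _, _, w in signatures])
model = LogisticRegression(max_iter=1000).fit(C, L, sample_weight=W)
print("estimated coefficients:", model.coef_.round(2))
```

In this sketch only the codewords and their weights travel over the network, so communication grows with the number of codewords rather than with the size of each site's data, which is the property the abstract emphasizes.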
