咨询与建议

看过本文的还看了

相关文献

该作者的其他文献

文献详情 >Optimal Algorithms for Augment... 收藏
arXiv

Optimal Algorithms for Augmented Testing of Discrete Distributions

作     者:Aliakbarpour, Maryam Indyk, Piotr Rubinfeld, Ronitt Silwal, Sandeep 

作者机构:Rice University United States MIT United States UW-Madison United States 

出 版 物:《arXiv》 (arXiv)

年 卷 期:2024年

核心收录:

主  题:Consensus algorithm 

摘      要:We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution p, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor s quality, measured by its total variation distance from p. A key advantage of our algorithms is their adaptability to the precision of the prediction. Specifically, our algorithms can self-adjust their sample complexity based on the accuracy of the available prediction, operating without any prior knowledge of the estimation s accuracy (i.e. they are consistent). Additionally, we never use more samples than the standard approaches require, even if the predictions provide no meaningful information (i.e. they are also robust). We provide lower bounds to indicate that the improvements in sample complexity achieved by our algorithms are information-theoretically optimal. Furthermore, experimental results show that the performance of our algorithms on real data significantly exceeds our worst-case guarantees for sample complexity, demonstrating the practicality of our approach. © 2024, CC BY-NC-ND.

读者评论 与其他读者分享你的观点

用户名:未登录
我的评分