ISBN: (print) 9783319997704; 9783319997711
We explore a setting in which a number of subjects want to compute on their pooled data while keeping the statistical confidentiality of their input. Statistical confidentiality is different from the cryptographic confidentiality guaranteed by cryptographic multiparty secure computation: whereas in the latter nothing is disclosed about the input, in statistical input confidentiality a noise-added version of the input is disclosed, which allows more flexible computations. We propose a protocol based on local anonymization via randomized response, whereby the empirical distribution of the data of the subjects is approximated. From that distribution, most statistical calculations can be approximated as well. Regarding the accuracy of the approximation, ceteris paribus it improves with the number of subjects. Large dimensionality (that is, a large number of attributes) decreases accuracy and we propose a strategy to mitigate the dimensionality problem. We show how to characterize the privacy guarantee for each subject in terms of differential privacy. Experimental work is reported on the attained accuracy as a function of the number of respondents, number of attributes and randomized response parameters.
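As a rough illustration of the idea, the sketch below applies randomized response to a single categorical attribute and then recovers an unbiased estimate of the empirical distribution from the noisy reports. The abstract does not specify the protocol's details, so everything here is an assumption: the function names (`randomize`, `estimate_distribution`, `epsilon`), the single-attribute setting, and the choice of a symmetric design in which each subject keeps the true value with probability `p` and otherwise reports one of the other `k - 1` categories uniformly at random.

```python
import math
import random
from collections import Counter


def randomize(value, categories, p, rng):
    """Report the true value with probability p; otherwise report one of
    the other categories chosen uniformly at random (assumed design)."""
    if rng.random() < p:
        return value
    others = [c for c in categories if c != value]
    return rng.choice(others)


def estimate_distribution(reports, categories, p):
    """Estimate the true category frequencies from randomized reports.

    With q = (1 - p) / (k - 1), the expected reported frequency of a
    category with true frequency pi is q + (p - q) * pi, so we invert
    that relation. Estimates can fall slightly outside [0, 1] for small
    samples; they always sum to 1.
    """
    k = len(categories)
    q = (1 - p) / (k - 1)
    if p <= q:
        raise ValueError("p must exceed 1/k for the estimate to be usable")
    n = len(reports)
    counts = Counter(reports)
    return {c: (counts[c] / n - q) / (p - q) for c in categories}


def epsilon(k, p):
    """Differential privacy parameter of this symmetric design:
    the log of the largest ratio between report probabilities."""
    q = (1 - p) / (k - 1)
    return math.log(p / q)


if __name__ == "__main__":
    rng = random.Random(42)
    categories = ["a", "b", "c"]
    true_values = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 1000
    reports = [randomize(v, categories, 0.7, rng) for v in true_values]
    print(estimate_distribution(reports, categories, 0.7))
    print(epsilon(len(categories), 0.7))
```

In this sketch, raising `p` increases accuracy but also increases `epsilon` (weaker privacy), while adding more subjects reduces the sampling error of the estimate at a fixed `p`, in line with the accuracy behavior the abstract describes.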