Any institution that disseminates data in aggregated form has the duty to ensure that individual confidential information is not disclosed, either by not releasing data or by perturbing the released data while maintaining data utility. Controlled tabular adjustment (CTA) is a promising technique of the second type, in which a protected table is constructed that is close to the original one in some chosen distance. The choice of the specific distance involves a trade-off: although the Euclidean distance has been shown (and is confirmed here) to produce tables with greater "utility," it gives rise to mixed integer quadratic problems (MIQPs) with pairs of linked semi-continuous variables that are more difficult to solve than the mixed integer linear problems corresponding to linear norms. We provide a novel analysis of perspective reformulations (PRs) for this special structure; in particular, we devise a projected PR (P²R), which is piecewise-conic but simplifies to a (nonseparable) MIQP when the instance is symmetric. We then compare different formulations of the CTA problem, showing that the ones based on P²R most often obtain better computational results.
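For orientation, the CTA problem described in this abstract is often written along the following lines. This is only a sketch of a commonly used formulation, not necessarily the exact model of the paper, and every symbol is an assumption introduced here: z is the vector of cell deviations from the original table, A is the matrix of table (additivity) relations, S is the set of sensitive cells, l_i <= 0 <= u_i are bounds on the deviation of cell i, lpl_i and upl_i are the lower and upper protection levels of a sensitive cell, and y_i is a binary variable choosing the direction in which that cell is perturbed.

```latex
% Sketch of a generic CTA model (all symbols are illustrative assumptions).
% \ell = 1 leads to a mixed integer linear problem after linearization;
% \ell = 2 (usually with the squared norm) leads to an MIQP.
\begin{align*}
\min_{z,\,y}\quad & \lVert z \rVert_{\ell} \\
\text{s.t.}\quad & A z = 0, \\
& l_i \le z_i \le u_i, && i \notin S, \\
& \mathit{upl}_i\, y_i + l_i (1 - y_i) \le z_i \le u_i\, y_i - \mathit{lpl}_i (1 - y_i), && i \in S, \\
& y_i \in \{0, 1\}, && i \in S.
\end{align*}
```

With y_i = 1 a sensitive cell is pushed up by at least upl_i, and with y_i = 0 it is pushed down by at least lpl_i. If z is split into nonnegative upward and downward parts, each switched on by the same binary variable, one obtains pairs of semi-continuous variables, which is presumably the kind of structure the abstract refers to.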
ISBN: (print) 9783319112572; 9783319112565
Statistical agencies collect input data from individuals and deliver output information to society based on these data. A fundamental feature of output information is the "protection" of sensitive information, since too many details could disclose private information about individuals and therefore violate their rights. Another feature of output information is its "utility" to data users, as a scientist may use this output for research or a politician for making decisions. Clearly, the more details the output contains, the more useful it is, but also the less protected. There are several methodologies based on Mathematical Optimization for finding "good" solutions that are both protected and useful. While the literature on algorithms that apply them is extensive, statisticians have major concerns about using them in practice, because these algorithms may run into numerical trouble on frequency tables and may produce biased solutions. This article discusses these observations and describes how to overcome them using a modern technique called Enhanced Controlled Tabular Adjustment. Computational experiments show the effectiveness of the approach on benchmark instances.
ISBN: (print) 0769516327
Statistical database protection is a part of information security that tries to prevent published statistical information (tables, individual records) from disclosing the contribution of specific respondents. This paper shows how to use information-theoretic concepts to measure disclosure risk for tabular data. The proposed disclosure risk measure is compatible with a broad class of disclosure protection methods and can be extended to compute disclosure risk for a set of linked tables.