ISBN: (print) 9781479967865
In this paper we propose a social network anonymization algorithm for the security and confidentiality of health data sets transmitted and shared across a social network. The growing need to address privacy concerns when social network data is released for mining purposes has recently led to considerable interest in various anonymization techniques. In the proposed scheme, the data owners interact over the Internet using a social network application, where the algorithm determines which records cannot be disclosed and satisfies a formal privacy protection requirement. Furthermore, the algorithm integrates a secure encryption function, so the anonymity of the health data set is guaranteed during the session. Finally, we present an evaluation based on mathematical analysis to prove the correctness of the anonymity function in the algorithm.
With the rapid growth of social network applications, more and more people are participating in social networks, and privacy protection in online social networks has become an important issue. The illegal disclosure or improper use of users' private information can lead to unacceptable or unexpected consequences in people's lives. In this paper, we focus on the disclosure of authentic popularity in online social networks. To protect users' privacy, social networks need to be anonymized. However, existing anonymization algorithms for social networks may lead to nontrivial utility loss, because the anonymization process changes the social network's structure. As a result, the social network's utility, such as retrieving data files, reading data files, and sharing data files among different users, decreases. It is therefore a challenge to develop an effective anonymization algorithm that protects the privacy of users' authentic popularity in online social networks without decreasing their utility. In this paper, we first design a hierarchical authorization and capability delegation (HACD) model. Based on this model, we propose a novel utility-based popularity anonymization (UPA) scheme, which integrates proxy re-encryption with keyword search techniques, to tackle this issue. We demonstrate that the proposed scheme not only protects the privacy of users' authentic popularity but also preserves the full utility of the social network. Extensive experiments on large real-world online social networks confirm the efficacy and efficiency of our scheme. (C) 2016 Elsevier B.V. All rights reserved.
Background: Knowledge of the geographical locations of individuals is fundamental to the practice of spatial epidemiology. One approach to preserving the privacy of individual-level addresses in a data set is to de-identify the data using a non-deterministic blurring algorithm that shifts the geocoded values. We investigate a vulnerability in this approach that enables an adversary to re-identify individuals using multiple anonymized versions of the original data set. If several such versions are available, each can be used to incrementally refine estimates of the original geocoded location. Results: We produce multiple anonymized data sets from a single set of addresses and then progressively average the anonymized results for each address, characterizing the steep decline in distance from the re-identified point to the original location (and the corresponding reduction in privacy). With ten anonymized copies of an original data set, the average distance between the estimated, re-identified address and the original address decreases substantially, from 0.7 km to 0.2 km. With fifty anonymized copies, the average distance decreases from 0.7 km to 0.1 km. Conclusion: We demonstrate that multiple versions of the same data, each anonymized by non-deterministic Gaussian skew, can be used to ascertain original geographic locations. We explore solutions to this problem that include infrastructure to support the safe disclosure of anonymized medical data, so as to prevent inference or re-identification of original address data, and the use of a Markov-process-based algorithm to mitigate this risk.
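The averaging attack described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes planar coordinates in kilometres, independent zero-mean Gaussian noise on each axis, and a hypothetical noise scale `sigma_km` chosen only for illustration. The function names `blur`, `estimate_from_copies`, and `mean_error_km` are likewise hypothetical. Under these assumptions, averaging n independently blurred copies shrinks the expected error roughly in proportion to 1/sqrt(n).

```python
import math
import random


def blur(point, sigma_km, rng):
    """Return one anonymized copy: the point shifted by Gaussian noise.

    Assumption: independent zero-mean noise on each axis, in km.
    """
    x, y = point
    return (x + rng.gauss(0.0, sigma_km), y + rng.gauss(0.0, sigma_km))


def estimate_from_copies(copies):
    """Adversary's estimate: the coordinate-wise mean of all copies."""
    n = len(copies)
    mean_x = sum(x for x, _ in copies) / n
    mean_y = sum(y for _, y in copies) / n
    return (mean_x, mean_y)


def mean_error_km(n_copies, sigma_km=0.56, n_addresses=2000, seed=0):
    """Average distance between the estimate and the true location.

    sigma_km, n_addresses and the 50 km x 50 km region are illustrative
    assumptions, not values taken from the study.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_addresses):
        true_pt = (rng.uniform(0.0, 50.0), rng.uniform(0.0, 50.0))
        copies = [blur(true_pt, sigma_km, rng) for _ in range(n_copies)]
        total += math.dist(estimate_from_copies(copies), true_pt)
    return total / n_addresses


if __name__ == "__main__":
    # Compare the re-identification error for 1, 10 and 50 released copies.
    for n in (1, 10, 50):
        print(n, round(mean_error_km(n), 3))
```

The sketch shows why releasing several independently blurred versions of the same data set is risky: each extra copy gives the adversary another noisy sample of the same underlying location, and the sample mean converges on it.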