ISBN: (Print) 9781450360999
Despite extensive research on cryptography, secure and efficient query processing over outsourced data remains an open challenge. This poster continues along the emerging trend in secure data processing that recognizes that the entire dataset may not be sensitive, and hence, the non-sensitivity of data can be exploited to overcome some of the limitations of existing encryption-based approaches. In particular, this poster outlines a new secure keyword search approach, called query keyword binning (QB), that allows non-sensitive parts of the data to be outsourced in clear-text while guaranteeing that no information is leaked by joint processing of non-sensitive data (in clear-text) and sensitive data (in encrypted form). QB improves the performance and strengthens the security of the underlying cryptographic technique by preventing size, frequency-count, and workload-skew attacks.
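A minimal sketch of the binning idea in Python. The hash-based bin assignment, function names, and bin count are illustrative assumptions; the actual QB technique additionally balances bins to defeat the size and frequency-count attacks mentioned above.

```python
import hashlib
from collections import defaultdict

def bin_index(keyword, num_bins):
    """Deterministic bin assignment by hashing (illustrative choice)."""
    return int(hashlib.sha256(keyword.encode()).hexdigest(), 16) % num_bins

def build_bins(sensitive, non_sensitive, num_bins):
    """Mix sensitive and non-sensitive keywords into shared bins.
    Sensitive entries would be stored encrypted; non-sensitive ones in clear-text."""
    bins = defaultdict(lambda: {"sensitive": [], "clear": []})
    for kw in sensitive:
        bins[bin_index(kw, num_bins)]["sensitive"].append(kw)
    for kw in non_sensitive:
        bins[bin_index(kw, num_bins)]["clear"].append(kw)
    return bins

def query(bins, keyword, num_bins):
    """A query fetches the entire bin, so the server cannot tell which
    keyword in the bin was actually searched."""
    b = bins[bin_index(keyword, num_bins)]
    return b["sensitive"] + b["clear"]
```

Because every lookup touches a whole bin, the server's view is the same whether the searched keyword was sensitive or not.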
ISBN: (Print) 9781450360999
High-dimensional data is particularly useful for data analytics research. In the healthcare domain, for instance, high-dimensional data analytics has been used successfully for drug discovery. Yet, in order to adhere to privacy legislation, data analytics service providers must guarantee anonymity for data owners. In the context of high-dimensional data, ensuring privacy is challenging because increased data dimensionality must be matched by an exponential growth in the size of the data to avoid sparse datasets. Syntactically anonymising sparse datasets with methods that rely on statistical significance makes obtaining sound and reliable results a challenge. As such, strong privacy is only achievable at the cost of high information loss, rendering the data unusable for data analytics. In this paper, we make two contributions to addressing this problem, from both the privacy and the information loss perspectives. First, we show that by identifying dependencies between attribute subsets we can eliminate privacy-violating attributes from the anonymised dataset. Second, to minimise information loss, we employ a greedy search algorithm to determine and eliminate maximal partial unique attribute combinations. Thus, one only needs to find the minimal set of identifying attributes to prevent re-identification. Experiments on a health cloud based on the SAP HANA platform, using a semi-synthetic medical history dataset comprising 109 attributes, demonstrate the effectiveness of our approach.
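The greedy elimination of identifying attribute combinations could look roughly like the following sketch; the record format, helper names, and tie-breaking rule are assumptions made for illustration, not the paper's algorithm.

```python
from collections import Counter

def unique_count(records, attrs):
    """Number of records uniquely identifiable by their projection onto attrs."""
    proj = Counter(tuple(r[a] for a in attrs) for r in records)
    return sum(1 for r in records if proj[tuple(r[a] for a in attrs)] == 1)

def greedy_suppress(records, attrs):
    """Greedily drop the attribute whose removal most reduces the number
    of uniquely identifiable records, until no record is re-identifiable."""
    attrs = list(attrs)
    while attrs and unique_count(records, attrs) > 0:
        best = min(attrs,
                   key=lambda a: unique_count(records, [x for x in attrs if x != a]))
        attrs.remove(best)
    return attrs  # attributes that can be kept safely
```

On a toy dataset where one record is uniquely identified by the pair (a, b), dropping a single attribute already removes the unique combination, so only the remaining attribute is suppressed-free to keep.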
ISBN: (Print) 9781450360999
Advances in sensing, networking, and actuation technologies have resulted in the IoT wave that is expected to revolutionize all aspects of modern society. This paper focuses on the new privacy challenges that arise in IoT in the context of smart homes. Specifically, the paper focuses on protecting the user's privacy against inferences drawn from channel and in-home device activities. We propose a method for securely scheduling the devices while decoupling the device and channel activities. The proposed solution avoids any attacks that may reveal the coordinated schedule of the devices, and hence also ensures that inferences that may compromise an individual's privacy cannot be drawn from device- and channel-level activities. Our experiments validate the proposed approach, confirming that an adversary cannot infer device and channel activities by just observing the network traffic.
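One way to decouple observable channel activity from device activity is constant-rate scheduling with cover traffic; the sketch below is a simplification under that assumption, not the paper's scheduler.

```python
def schedule_slots(device_events, num_slots, payload_size=64):
    """Emit a constant-rate transmission schedule: every slot carries
    exactly one fixed-size payload, real if a device event is pending,
    dummy (cover traffic) otherwise, so the wire trace is independent
    of actual device activity."""
    pending = list(device_events)
    trace = []
    for slot in range(num_slots):
        if pending:
            trace.append(("real", pending.pop(0), payload_size))
        else:
            trace.append(("dummy", None, payload_size))
    # what an eavesdropper sees: only (slot, size) pairs, identical every slot
    observed = [(i, size) for i, (_, _, size) in enumerate(trace)]
    return trace, observed
```

Whether the home is busy or idle, the observable (slot, size) sequence is identical, which is the decoupling property the paper targets.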
ISBN: (Print) 9781450360999
Data Stream Processing Systems (DSPSs) execute long-running, continuous queries over transient streaming data, often making use of outsourced, third-party computational platforms. However, third-party outsourcing can lead to unwanted violations of data providers' access controls or privacy policies, as data potentially flows through untrusted infrastructure. To address these types of violations, data providers can elect to use stream processing techniques based upon computation-enabling encryption. Unfortunately, this class of solutions can leak information about underlying plaintext values, reduce the possible set of queries that can be executed, and come with detrimental performance overheads. To alleviate the concerns with cryptographically-enforced access controls in DSPSs, we have developed Sanctuary, a DSPS that makes use of Intel's Software Guard Extensions (SGX) to protect data being processed on untrusted infrastructure. We show that Sanctuary can execute arbitrary queries while leaking no more information than an idealized Trusted Infrastructure system. At the same time, an extensive evaluation shows that the overheads associated with stream processing in Sanctuary are comparable to its computation-enabling encryption counterparts for many queries.
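The SGX-specific machinery cannot be reproduced in a few lines, but the workload Sanctuary protects, a long-running continuous query, can be sketched. Inside an enclave, this same plaintext computation would run over data invisible to the untrusted host; the windowed-average query below is an assumed example workload, not one from the paper.

```python
from collections import deque

def sliding_avg(stream, window):
    """Continuous query: emit the average of the last `window` readings
    for every arriving tuple. In an SGX-based DSPS this plaintext logic
    would execute inside the trusted enclave boundary."""
    buf = deque(maxlen=window)  # oldest reading evicted automatically
    out = []
    for x in stream:
        buf.append(x)
        out.append(sum(buf) / len(buf))
    return out
```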
ISBN: (Print) 9781450362856
We consider the problem of implementing a Linear Quadratic Gaussian (LQG) controller on a distributed system, while maintaining the privacy of the measurements, state estimates, control inputs, and system model. The component sub-systems and actuator outsource the LQG computation to a cloud controller and encrypt their signals and matrices. The encryption scheme used is Labeled Homomorphic Encryption, which supports the evaluation of degree-2 polynomials on encrypted data by attaching a unique label to each piece of data and using the fact that the outsourced computation is known by the actuator. We write the state estimate update and control computation as multivariate polynomials in the encrypted data and propose an extension to the Labeled Homomorphic Encryption scheme that achieves the evaluation of low-degree polynomials on encrypted data with degree larger than two. We showcase the numerical results of the proposed protocol for a temperature control application, which indicate competitive online times.
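The degree-2 trick in Labeled Homomorphic Encryption rests on the identity m1*m2 = (a1 + b1)(a2 + b2), where a = m - b and b is a mask derived from the data's label. The toy below keeps only that arithmetic; in the real scheme the masks b are themselves encrypted under an additively homomorphic scheme (e.g. Paillier), which this sketch omits, so it is a stand-in for the structure, not a secure implementation.

```python
import hmac, hashlib

MOD = 2 ** 61 - 1  # a prime modulus (illustrative choice)

def prf(key, label):
    """Pseudorandom mask b = F(K, label), recomputable from the label."""
    digest = hmac.new(key, label.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % MOD

def lhe_encrypt(key, label, m):
    """Labeled 'ciphertext' (a, label) with a = m - b mod p. The real
    scheme also ships Enc(b) under an additively homomorphic scheme."""
    return ((m - prf(key, label)) % MOD, label)

def lhe_mult_decrypt(key, c1, c2):
    """Recover m1*m2 = (a1 + b1)(a2 + b2) mod p. The evaluator can compute
    a1*a2 and, homomorphically, a1*Enc(b2) + a2*Enc(b1); the decryptor
    adds b1*b2, which it recomputes from the labels."""
    (a1, l1), (a2, l2) = c1, c2
    b1, b2 = prf(key, l1), prf(key, l2)
    return (a1 * a2 + a1 * b2 + a2 * b1 + b1 * b2) % MOD
```

Because the actuator knows which polynomial is being evaluated, it knows which labels enter each product, which is exactly the fact the scheme exploits.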
ISBN: (Print) 9781450360999
Deep Neural Networks (DNNs) have overtaken classic machine learning algorithms due to their superior performance in big data analysis in a broad range of applications. On the other hand, in recent years Machine Learning as a Service (MLaaS) has become more widespread, in which a client uses cloud services for analyzing its data. However, the client's data may be sensitive, which raises privacy concerns. In this paper, we address the issue of privacy-preserving classification in an MLaaS setting and focus on convolutional neural networks (CNNs). To achieve this goal, we develop new techniques to run CNNs over encrypted data. First, we design methods to approximate commonly used activation functions in CNNs (i.e., ReLU, Sigmoid, and Tanh) with low-degree polynomials, which is essential for a practical and efficient solution. Then, we train CNNs with the approximation polynomials instead of the original activation functions and implement CNN classification over encrypted data. We evaluate the performance of our modified models at each step. The results of our experiments using several CNNs with a varying number of layers and structures are promising. When applied to the MNIST optical character recognition task, our approach achieved 99.25% accuracy, which significantly outperforms state-of-the-art solutions and is close to the accuracy of the best non-private version. Furthermore, it can make up to 164,000 predictions per hour. These results show that our approach provides accurate, efficient, and scalable privacy-preserving predictions in CNNs.
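To give a flavor of the activation-approximation step: the degree-3 Taylor expansion of the sigmoid around 0 uses only additions and multiplications, the operations homomorphic encryption schemes support. The paper fits its own low-degree polynomials; the Taylor expansion here stands in purely for illustration.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_poly(x):
    """Degree-3 Taylor approximation of sigmoid around 0; only additions
    and multiplications, so it is evaluable under (leveled) HE."""
    return 0.5 + x / 4 - x ** 3 / 48

# maximum approximation error on [-1, 1]
max_err = max(abs(sigmoid(x / 10) - sigmoid_poly(x / 10)) for x in range(-10, 11))
```

On the interval [-1, 1] the error stays below one percent; networks trained directly with the polynomial (as the paper does) tolerate this gap far better than networks whose activations are swapped only at inference time.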
ISBN: (Print) 9781450367264
A large class of biometric template protection algorithms assumes that feature vectors are integer-valued. However, biometric data is generally represented through real-valued feature vectors. Therefore, secure template constructions are not immediately applicable when feature vectors are composed of real numbers. We propose a generic transformation that extends the domain of biometric template protection algorithms from integer-valued feature vectors to real-valued feature vectors. We show that our transformation is accuracy-preserving and verify our theoretical findings by reporting implementation results on a public keystroke dynamics dataset.
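The simplest instance of such a real-to-integer transformation is fixed-point quantization; the sketch below is an assumption-laden stand-in, as the paper's accuracy-preserving construction is more careful than plain rounding.

```python
def quantize(features, precision=4):
    """Map real-valued features to integers by fixed-point scaling, so
    integer-domain template protection schemes can be applied. Distances
    are preserved up to the scale factor and rounding error."""
    scale = 10 ** precision
    return [round(f * scale) for f in features]
```

After quantization, an integer-domain construction (e.g. a fuzzy commitment over the quantized vector) can be applied unchanged; the accuracy question is whether the rounding error stays below the scheme's error-correction radius.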
ISBN: (Print) 9781450360999
Webpage fingerprinting methods infer the webpages visited in a traffic trace and are serious threats to the privacy of web users. Prior work evaluates webpage fingerprinting methods using traffic samples from a single client and does not consider the client diversity factor: webpages can be visited using different browsers, operating systems, and devices. In this paper, we study the impact of client diversity on HTTPS webpage fingerprinting. First, we evaluate 5 prominent fingerprinting methods using traffic samples from 19 different clients. We show that the best-performing methods overfit to the traffic patterns of a single client and do not generalize when they are evaluated using the samples from a different client (even if the clients use the same browser and operating system and only differ in device). Then, we investigate the traffic patterns of the clients and find differences in the HTTP messages generated, the servers communicated with, and the implementation of HTTP/2 across the clients. Finally, we show that the robustness of the methods can be increased by training them using samples from a diverse set of clients. This study guides the community towards a realistic threat model for HTTPS webpage fingerprinting and presents an analysis of modern HTTPS traffic.
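The cross-client evaluation can be sketched as leave-one-client-out validation: train on all clients except one, test on the held-out client. The nearest-centroid classifier and the toy two-page feature vectors below are illustrative assumptions, not the paper's fingerprinting methods.

```python
def centroid(samples):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(s[i] for s in samples) / n for i in range(len(samples[0]))]

def nearest(c0, c1, x):
    """Predict the class whose centroid is closer (squared Euclidean)."""
    d0 = sum((a - b) ** 2 for a, b in zip(c0, x))
    d1 = sum((a - b) ** 2 for a, b in zip(c1, x))
    return 0 if d0 <= d1 else 1

def leave_one_client_out(data):
    """data: {client: [(feature_vec, page_label), ...]}. Train on every
    client but one, test on the held-out client, so reported accuracy
    reflects cross-client generalisation rather than per-client overfit."""
    accs = {}
    for held in data:
        train = [s for c in data if c != held for s in data[c]]
        c0 = centroid([f for f, y in train if y == 0])
        c1 = centroid([f for f, y in train if y == 1])
        test = data[held]
        accs[held] = sum(nearest(c0, c1, f) == y for f, y in test) / len(test)
    return accs
```

A method that scores well under this protocol but poorly when trained and tested on the same client is exactly the overfitting failure mode the paper reports.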