Conventional matrixfactorization relies on centralized collection of users' data for recommendation, which might introduce an increased risk of privacy leakage especially when the recommender is untrusted. Existi...
详细信息
ISBN:
(纸本)9798350381993;9798350382006
Conventional matrixfactorization relies on centralized collection of users' data for recommendation, which might introduce an increased risk of privacy leakage especially when the recommender is untrusted. Existing differentially private matrixfactorization methods either assume the recommender is trusted, or can only provide a uniform level of privacy protection for all ratings in untrusted recommender scenario. In this paper, we propose a novel Heterogeneous Differentially Private matrixfactorization algorithm (HDPMF) for untrusted recommender. To the best of our knowledge, we are the first to achieve heterogeneous differential privacy for decentralized matrix factorization in untrusted recommender scenario. Specifically, our framework uses modified stretching mechanism with an innovative rescaling scheme to achieve better trade-off between privacy and accuracy. Meanwhile, by allocating privacy budget properly, we can capture homogeneous privacy preference within a user/item but heterogeneous privacy preference across different users/items. Theoretical analysis confirms that HDPMF renders rigorous privacy guarantee, and exhaustive experiments demonstrate its superiority especially in strong privacy guarantee, high dimension model and sparse dataset scenario.
Network Coordinate Systems (NCS) are promising techniques to predict unknown network distances from a limited number of measurements Most NCS algorithms are based on metric space embedding and suffer from the inabilit...
详细信息
ISBN:
(纸本)9783642129629
Network Coordinate Systems (NCS) are promising techniques to predict unknown network distances from a limited number of measurements Most NCS algorithms are based on metric space embedding and suffer from the inability to represent, distance asymmetries and Triangle Inequality Violations (TIVs) To overcome these drawbacks, we formulate the problem of network distance prediction as guessing the missing elements of a distance matrix and solve it by matrixfactorization. A distinct feature of our approach, called decentralized matrix factorization (DMF), is that it is fully decentralized The factorization of the incomplete distance matrix is collaboratively and iteratively done at, all nodes with each node retrieving only a small number of distance measurements There are no special nodes such as landmarks nor a central node where the distance measurements are collected and stored We compare DMF with two popular NCS algorithms Vivaldi and IDES. The former is based on metric space embedding, while the latter is also based on matrixfactorization but uses landmarks Experimental results show that DMF achieves competitive accuracy with the double advantage of having no landmarks and of being able to represent distance asymmetries and TIVs.
Hybrid cloud computing combines private clouds with geographically-distributed resources from public clouds, desktop grids or in-house gateways to provide the most flexibility of each kind of cloud platforms. Service ...
详细信息
Hybrid cloud computing combines private clouds with geographically-distributed resources from public clouds, desktop grids or in-house gateways to provide the most flexibility of each kind of cloud platforms. Service provisioning for wide-area applications such as cloud backup or cloud network games is sensitive to wide-area network metrics such as round trip time, bandwidth, or loss rates. In order to optimize the quality of the service provision in hybrid clouds, it is highly valuable for the hybrid clouds to collect detailed network metrics between participating nodes of the hybrid clouds. However, since nodes can be large-scale and dynamic, the network metrics may be diverse for different cloud services, it is challenging to increase the generality, scalability, accuracy, and the robustness of the measurement process. We propose a novel distributed level monitoring method HPM (Hierarchical Performance Measurement) satisfying these requirements. For each kind of network metric, HPM represents the degree of pairwise closeness with discrete level values inspired by the hierarchical clustering tree. HPM maps probed metric to discrete levels based on an existing distributed K-means clustering method that helps maximize the similarity of the network metric in the same level, which therefore optimizes the matching between pairwise levels and the real-world pairwise proximity. Furthermore, for scalability reasons, HPM computes the pairwise levels with decentralized coordinates. Each node independently maintains its low-dimensional coordinate based on a novel decentralized implementation of the Maximum Margin matrixfactorization method, which optimizes the mapping between the network metrics and the level values. Simulation results for the round trip time, bandwidth, loss, and hop count metric confirm that HPM converges fast, is robust to parameter settings, scales well with increasing levels or system size, and adapts well to diverse metrics. A prototype deployment on
Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated o...
详细信息
Low-rank matrix approximation is an important tool in data mining with a wide range of applications, including recommender systems, clustering, and identifying topics in documents. When the matrix to be approximated originates from a large distributed system, such as a network of mobile phones or smart meters, a challenging problem arises due to the strongly conflicting yet essential requirements of efficiency, robustness, and privacy preservation. We argue that although collecting sensitive data in a centralized fashion may be efficient, it is not an option when considering privacy and efficiency at the same time. Thus, we do not allow any sensitive data to leave the nodes of the network. The local information at each node (personal attributes, documents, media ratings, etc.) defines one row in the matrix. This means that all computations have to be performed at the edge of the network. Known parallel methods that respect the locality constraint, such as synchronized parallel gradient search or distributed iterative methods, require synchronized rounds or have inherent issues with load balancing, and thus they are not robust to failure. Our distributed stochastic gradient descent algorithm overcomes these limitations. During the execution, any sensitive information remains local, whereas the global features (e.g., the factor model of movies) converge to the correct value at all nodes. We present a theoretical derivation and a thorough experimental evaluation of our algorithm. We demonstrate that the convergence speed of our method is competitive while not relying on synchronization and being robust to extreme and realistic failure scenarios. To demonstrate the feasibility of our approach, we present trace-based simulations, real smartphone user behavior analysis, and tests over real movie recommender system data.
暂无评论