ISBN: 9781956792041 (print)
SimRank is one of the most fundamental measures of the structural similarity between two nodes in a graph and has been applied in a plethora of data mining and machine learning tasks. These tasks often involve single-source SimRank computation, which evaluates the SimRank values between a source node u and all other nodes. Due to its high computational complexity, single-source SimRank computation on large graphs is notoriously challenging, and hence recent studies resort to distributed processing. To our surprise, although SimRank has been widely adopted for two decades, theoretical aspects of distributed SimRank with provable results have rarely been studied. In this paper, we conduct a theoretical study of single-source SimRank computation in the Massively Parallel Computation (MPC) model, the standard theoretical framework for modeling distributed systems. Existing distributed SimRank algorithms require either Ω(log n) rounds of communication or Ω(n) machine space for a graph of n nodes. We overcome this barrier. In particular, given a graph of n nodes, for any query node v and constant error ε > 3/n, we show that O(log² log n) rounds of communication among machines are enough to compute single-source SimRank values with at most ε absolute error, while each machine needs only space sublinear in n. To the best of our knowledge, this is the first single-source SimRank algorithm in MPC that overcomes the Θ(log n) round complexity barrier with provable result accuracy.
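For readers unfamiliar with the measure, the following is a minimal single-machine sketch of what "single-source SimRank" computes, assuming the standard Jeh-Widom definition: s(a, a) = 1, and for a ≠ b, s(a, b) is the decay factor c times the average SimRank over all pairs of in-neighbors of a and b. It is not the MPC algorithm described in the abstract; the function names, the toy graph, and the choices c = 0.6 and 10 iterations are illustrative assumptions.

```python
from itertools import product


def simrank(in_neighbors, c=0.6, iterations=10):
    """Naive all-pairs SimRank by fixed-point iteration.

    in_neighbors maps each node to the list of its in-neighbors.
    c (decay factor) and iterations are illustrative defaults.
    """
    nodes = list(in_neighbors)
    # Initialisation: a node is fully similar to itself, 0 otherwise.
    sim = {(a, b): 1.0 if a == b else 0.0
           for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # A node without in-neighbors is similar only to itself.
                new_sim[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in in_a for y in in_b)
            new_sim[(a, b)] = c * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim


def single_source_simrank(in_neighbors, source, c=0.6, iterations=10):
    """SimRank values between `source` and every node in the graph."""
    sim = simrank(in_neighbors, c, iterations)
    return {v: sim[(source, v)] for v in in_neighbors}


# Toy graph (hypothetical): a -> b, a -> c, b -> d, c -> d
toy_in_neighbors = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
scores = single_source_simrank(toy_in_neighbors, "b")
```

In this toy graph, b and c share their only in-neighbor a, so their score equals the decay factor. The naive iteration touches every node pair and every in-neighbor pair in each round, which is why work on large graphs turns to approximate and distributed methods.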
This paper studies the localization problem of wireless sensor networks in indoor environments. To meet mobile users' demand for location services in indoor environments, an indoor distributed localizatio...
In this paper, we propose a reconfigurable framework optimized for resource-constrained platforms to accelerate CNNs using the high concurrency and data-proximate characteristics of edge computing devices. The framewo...
Multi-node computation, also known as distributed computing, is a paradigm that efficiently uses multiple interconnected nodes or machines to perform complex computational tasks. By dividing the...
Spatial data analysis is a technique used to analyze large amounts of spatial data generated by on-demand cab services such as Uber, Lyft, and Grab. This type of data includes information on the pickup and drop-off lo...
To address the lack of effective data and the high service latency of intrusion detection in smart substations, this paper proposes a lightweight intrusion detection method for smart substations. Throug...
This research work describes a framework for secure and effective group interaction that incorporates classical cryptography and quantum communication techniques. This framework employs a classical cryptographic me...
To cope with the problems of insufficient intelligence in combat missions under tactical edge conditions, high command-and-control delay, and difficulty in sharing data across domains, the joint i...
Broadcasting is one of the fundamental information dissemination primitives in interconnection networks, where a message is passed from one node (called originator) to all other nodes in the network. Following the inc...
To solve the problems of centralized anti-counterfeiting systems and of illegal merchants copying authentic commodities at low cost and with little difficulty, this paper proposes an anti-counterfeiting system design b...