We present an algorithm for approximating the diameter of massive weighted undirected graphs on distributed platforms supporting a MapReduce-like abstraction. To be efficient in both time and space, our algorithm is based on a decomposition strategy that partitions the graph into disjoint clusters of bounded radius. Theoretically, our algorithm uses linear space and yields a polylogarithmic approximation guarantee; most importantly, for a large family of graphs, it features a round complexity asymptotically smaller than that of a natural approximation algorithm based on the state-of-the-art Delta-stepping SSSP algorithm, which is its only practical, linear-space competitor in the distributed setting. We complement our theoretical findings with a proof-of-concept experimental analysis on large benchmark graphs, which suggests that our algorithm may attain substantial improvements in running time over the aforementioned competitor while featuring, in practice, a similar approximation ratio.
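The decomposition idea can be illustrated with a small sequential sketch. It is not the authors' distributed algorithm: centers are picked uniformly at random, clusters are grown with a multi-source Dijkstra, and the names `multi_source_dijkstra`, `approx_diameter`, and `num_centers` are hypothetical. The graph is assumed to be connected, undirected, and given as an adjacency dictionary with non-negative weights. The estimate returned is the diameter of the quotient graph on cluster centers plus twice the largest cluster radius, which is an upper bound on the true diameter.

```python
import heapq
import random


def multi_source_dijkstra(adj, sources):
    """Distance from every vertex to its nearest source, plus that source."""
    dist = {s: 0.0 for s in sources}
    owner = {s: s for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                owner[v] = owner[u]  # v joins the cluster that reached it first
                heapq.heappush(heap, (nd, v))
    return dist, owner


def approx_diameter(adj, num_centers, seed=0):
    """Upper bound on the diameter of a connected graph (illustrative only)."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(adj), min(num_centers, len(adj)))

    # 1. Partition the graph into disjoint clusters around the centers.
    dist, owner = multi_source_dijkstra(adj, centers)
    radius = max(dist.values())

    # 2. Quotient graph: one node per cluster; an edge's weight is the length
    #    of the shortest center-to-center path crossing a single boundary edge.
    quotient = {c: {} for c in centers}
    for u in adj:
        for v, w in adj[u]:
            cu, cv = owner[u], owner[v]
            if cu != cv:
                length = dist[u] + w + dist[v]
                if length < quotient[cu].get(cv, float("inf")):
                    quotient[cu][cv] = length
                    quotient[cv][cu] = length
    q_adj = {c: list(nbrs.items()) for c, nbrs in quotient.items()}

    # 3. Exact diameter of the (much smaller) quotient graph.
    q_diam = 0.0
    for c in centers:
        d, _ = multi_source_dijkstra(q_adj, [c])
        q_diam = max(q_diam, max(d.values()))

    # Any u-v path is at most: u to its center, center to center, center to v.
    return q_diam + 2 * radius
```

With more centers the clusters shrink and the estimate tightens, at the cost of a larger quotient graph; with every vertex as a center the sketch degenerates to an exact all-pairs computation.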
Task-based execution of graph workloads allows various ordered and unordered implementations, with tasks representing dependencies between graph vertices and edges. This work explores graph algorithms in the context of ordered and unordered task-based implementations, which trade off work efficiency against parallelism. This trade-off arises from the monotonicity of convergent graph solutions. It results in performance-dependent choices within and across different machines (CPUs and GPUs), graph algorithms, and implementations (ordered, relaxed, and unordered). Input graphs further enlarge this choice space: this work analyzes temporally changing graphs in addition to the static graphs explored by prior works. This work first explores these algorithmic and architectural choices and finds that different workload-input combinations perform optimally on different architectural configurations. It then analyzes the resulting choice space and represents it by characteristic variables that correlate with each choice. Using these characteristic variables, this work proposes analytical and neural network models that predict the best-performing implementation. The variables and prediction models are also integrated with a state-of-the-art performance predictor on a multi-accelerator setup, and show geometric performance gains of 54% on a CPU, 14% on a GPU, and 31.5% in a multi-accelerator setup over baseline implementations without performance prediction.
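The trade-off between work efficiency and parallelism can be made concrete with a small sequential sketch for single-source shortest paths. It is an illustration under stated assumptions, not the implementations evaluated in the work: `sssp_ordered` and `sssp_unordered` are hypothetical names, edge weights are assumed non-negative, and the number of edge relaxations stands in for work. The ordered variant processes tasks by priority and touches each edge once; the unordered variant processes tasks in arrival order, which a parallel runtime could execute concurrently, but may relax the same edge repeatedly. Both converge to the same distances because a relaxation can only lower a distance.

```python
import heapq
from collections import deque


def sssp_ordered(adj, src):
    """Priority-ordered tasks (Dijkstra-style): work-efficient, little parallelism."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    heap = [(0.0, src)]
    relaxations = 0
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale task, superseded by a shorter path
        for v, w in adj[u]:
            relaxations += 1
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist, relaxations


def sssp_unordered(adj, src):
    """Tasks in arrival order (worklist-style): more work, no ordering constraint."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    work = deque([src])
    relaxations = 0
    while work:
        u = work.popleft()
        for v, w in adj[u]:
            relaxations += 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # monotone update: distances only decrease
                work.append(v)         # v must be reprocessed with its new distance
    return dist, relaxations


# Small directed example: the direct edge 0 -> 1 is longer than the detour via 2.
graph = {0: [(1, 10.0), (2, 1.0)], 1: [(3, 1.0)], 2: [(1, 1.0)], 3: []}
d_ord, r_ord = sssp_ordered(graph, 0)
d_unord, r_unord = sssp_unordered(graph, 0)
```

On this example the unordered variant first propagates the too-long distance to vertex 1 and must later redo that work, which is the extra cost it pays for dropping the ordering constraint.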