There is a growing need for distributed graph processing systems to have many more compute nodes processinggraph-based Big Data applications, which, however, increases the chance of node failures. To address the issu...
详细信息
There is a growing need for distributed graph processing systems to have many more compute nodes processinggraph-based Big Data applications, which, however, increases the chance of node failures. To address the issue, we propose a novel recovery scheme to accelerate the recovery process by parallelizing the recomputation. Once a failure occurs, all recomputations are confined to subgraphs that originally reside in the failed compute nodes. When the recovery starts, these subgraphs are reassigned to another set of compute nodes, where the recomputation over these subgraphs are conducted in parallel. To minimize the recovery latency, we also develop a reassignment strategy, from these subgraphs to the replaced compute nodes, by properly leveraging the computation and communication cost. We integrate the proposed recovery scheme into Giraph system, a widely used graphprocessing system. The experimental results over a variety of real graph datasets demonstrate that our proposed recovery scheme outperforms existing recovery methods by up to 30x on a cluster of 40 compute nodes.
A large number of specialized graphprocessingsystems have been developed to cope with the increasing demand of graph analytics. Most of them require users to deploy a new framework in the cluster for graph processin...
详细信息
A large number of specialized graphprocessingsystems have been developed to cope with the increasing demand of graph analytics. Most of them require users to deploy a new framework in the cluster for graphprocessing and switch to other systems to execute non-graph algorithms. This increases the complexity of cluster management and results in unnecessary data movement and duplication. In this paper, we propose our graphprocessing engine, named epiCG, which is built on top of epiC, an elastic data processing system. The core of epiCG is a new unit called graphUnit, which is able to not only perform iterative graphprocessing efficiently, but also collaborate with other types of units to accomplish any complex/multi-stage data analytics. epiCG supports both edge-cut and vertex-cut partitioning methods, and for the latter method, we propose a novel light-weight greedy strategy that enables all the graphUnits to generate vertex-cut partitioning in parallel. Furthermore, unlike existing graphprocessingsystems, failure recovery in epiCG is completely automatic. We compare epiCG with several prevalent graphprocessingsystems via extensive experiments with real-life dataset and applications. The results show that epiCG possesses high efficiency and scalability, and performs exceptionally well in large dataset settings, showcasing its suitability for large-scale graphprocessing. (C) 2016 Elsevier Inc. All rights reserved.
Dealing with large-scale graphs requires an efficient graph partitioner that produces balanced partitions with fewer cut edges/vertices in a reasonable amount of time. Despite several algorithms that have been propose...
详细信息
Dealing with large-scale graphs requires an efficient graph partitioner that produces balanced partitions with fewer cut edges/vertices in a reasonable amount of time. Despite several algorithms that have been proposed, it is still insufficient. Even with the continuous growth of graph volume, they do not consider the graph volume during graph partitioning. Therefore, these algorithms generate an imbalanced workload. We propose a graph partitioner algorithm VSCT based essentially on four key metrics: Volume, Size, Cuts, and Time to maintain high-quality graph partitioning. Using real-world datasets, we show that VSCT performs an efficient partitioning quality against the existing graph partitioning algorithms.
Efficiently processing large graphs is challenging, since parallel graph algorithms suffer frompoor scalability and performance due to many factors, including heavy communication and load-imbalance.Furthermore, it is ...
详细信息
Efficiently processing large graphs is challenging, since parallel graph algorithms suffer from
poor scalability and performance due to many factors, including heavy communication and load-imbalance.
Furthermore, it is difficult to express graph algorithms, as users need to understand
and effectively utilize the underlying execution of the algorithm on the distributed system. The
performance of graph algorithms depends not only on the characteristics of the system (such as
latency, available RAM, etc.), but also on the characteristics of the input graph (small-world scalefree,
mesh, long-diameter, etc.), and characteristics of the algorithm (sparse computation vs. dense
communication). The best execution strategy, therefore, often heavily depends on the combination
of input graph, system and algorithm.
Fine-grained expression exposes maximum parallelism in the algorithm and allows the user to
concentrate on a single vertex, making it easier to express parallel graph algorithms. However,
this often loses information about the machine, making it difficult to extract performance and
scalability from fine-grained algorithms.
To address these issues, we present a model for expressing parallel graph algorithms using a
fine-grained expression. Our model decouples the algorithm-writer from the underlying details
of the system, graph, and execution and tuning of the algorithm. We also present various graph
paradigms that optimize the execution of graph algorithms for various types of input graphs and
systems. We show our model is general enough to allow graph algorithms to use the various graph
paradigms for the best/fastest execution, and demonstrate good performance and scalability for
various different graphs, algorithms, and systems to 100,000+ cores.
暂无评论