Search Results - Inner Mongolia University Library

Fast Failure Recovery in Vertex-Centric Distributed Graph Processing Systems

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, Vol. 31, No. 4, pp. 733-746

Authors: Lu, Wei; Shen, Yanyan; Wang, Tongtong; Zhang, Meihui; Jagadish, H. V.; Du, Xiaoyong
Affiliations: Renmin Univ China, MOE DEKE, Beijing 100872, Peoples R China; Renmin Univ China, Sch Informat, Beijing 100872, Peoples R China; Shanghai Jiao Tong Univ, Xuhui Qu 200000, Shanghai Shi, Peoples R China; Beijing Inst Technol, Beijing 100081, Peoples R China; Univ Michigan, Elect Engn & Comp Sci, Ann Arbor, MI 48109, USA

There is a growing need for distributed graph processing systems to have many more compute nodes processing graph-based Big Data applications, which, however, increases the chance of node failures. To address the issue, we propose a novel recovery scheme to accelerate the recovery process by parallelizing the recomputation. Once a failure occurs, all recomputations are confined to subgraphs that originally reside in the failed compute nodes. When the recovery starts, these subgraphs are reassigned to another set of compute nodes, where the recomputation over these subgraphs is conducted in parallel. To minimize the recovery latency, we also develop a reassignment strategy, from these subgraphs to the replaced compute nodes, by properly leveraging the computation and communication cost. We integrate the proposed recovery scheme into the Giraph system, a widely used graph processing system. The experimental results over a variety of real graph datasets demonstrate that our proposed recovery scheme outperforms existing recovery methods by up to 30x on a cluster of 40 compute nodes.
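
The reassignment idea in the abstract above can be illustrated with a small sketch. This is not the paper's algorithm: it is a generic greedy heuristic, and the function name `reassign_partitions`, the cost dictionaries, and the node labels are all hypothetical. It assumes each failed subgraph has a known recomputation cost and a known communication cost per candidate node, and it places the most expensive subgraphs first on whichever surviving node would finish earliest.

```python
def reassign_partitions(failed_partitions, healthy_nodes, comp_cost, comm_cost):
    """Greedy sketch (an assumption, not the paper's strategy).

    failed_partitions: list of partition ids that lived on failed nodes
    healthy_nodes:     list of surviving node ids
    comp_cost[p]:      estimated recomputation cost of partition p
    comm_cost[(p, n)]: estimated communication cost if p runs on node n
    Returns (assignment dict, estimated recovery latency).
    """
    load = {node: 0.0 for node in healthy_nodes}
    assignment = {}
    # Place the most expensive partitions first so they spread across nodes.
    for part in sorted(failed_partitions, key=lambda p: comp_cost[p], reverse=True):
        # Pick the node whose total load would be smallest after taking this partition.
        best = min(
            healthy_nodes,
            key=lambda n: load[n] + comp_cost[part] + comm_cost[(part, n)],
        )
        load[best] += comp_cost[part] + comm_cost[(part, best)]
        assignment[part] = best
    # Recovery runs in parallel, so latency is bounded by the busiest node.
    return assignment, max(load.values())
```

Because recomputation proceeds in parallel, the sketch reports the load of the busiest node as the estimated recovery latency; a real system would also have to account for log replay and checkpoint loading.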

Keywords: distributed graph processing systems failure recovery checkpoint log compression partition-based recovery

epiCG: A GraphUnit Based Graph Processing Engine on epiC

BIG DATA RESEARCH, 2016, Vol. 4, pp. 59-69

Authors: Shen, Yanyan; Cai, Qingchao; Lu, Wei; Sun, Dalie; Xie, Zhongle
Affiliations: Shanghai Jiao Tong Univ, Shanghai 200030, Peoples R China; Natl Univ Singapore, Singapore, Singapore; Harbin Inst Technol, Harbin, Peoples R China; Renmin Univ China, Beijing, Peoples R China

A large number of specialized graph processing systems have been developed to cope with the increasing demand for graph analytics. Most of them require users to deploy a new framework in the cluster for graph processing and switch to other systems to execute non-graph algorithms. This increases the complexity of cluster management and results in unnecessary data movement and duplication. In this paper, we propose our graph processing engine, named epiCG, which is built on top of epiC, an elastic data processing system. The core of epiCG is a new unit called graphUnit, which is able to not only perform iterative graph processing efficiently, but also collaborate with other types of units to accomplish any complex/multi-stage data analytics. epiCG supports both edge-cut and vertex-cut partitioning methods, and for the latter method, we propose a novel light-weight greedy strategy that enables all the graphUnits to generate vertex-cut partitioning in parallel. Furthermore, unlike existing graph processing systems, failure recovery in epiCG is completely automatic. We compare epiCG with several prevalent graph processing systems via extensive experiments with real-life datasets and applications. The results show that epiCG possesses high efficiency and scalability, and performs exceptionally well in large dataset settings, showcasing its suitability for large-scale graph processing. (C) 2016 Elsevier Inc. All rights reserved.
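
For readers unfamiliar with vertex-cut partitioning, the following is a minimal sketch of a generic, sequential greedy heuristic. It is not epiCG's parallel strategy, which the abstract does not detail; the function names `greedy_vertex_cut` and `replication_factor` are hypothetical. Edges are assigned to partitions, vertices are replicated wherever their edges land, and each edge prefers a partition that already holds replicas of its endpoints.

```python
from collections import defaultdict


def greedy_vertex_cut(edges, num_parts):
    """Assign each edge to a partition, replicating vertices as needed.

    Illustrative sequential heuristic (an assumption, not epiCG's method).
    Assumes the edge list contains no duplicate (u, v) pairs.
    """
    replicas = defaultdict(set)  # vertex -> partitions holding a replica
    load = [0] * num_parts       # number of edges placed on each partition
    placement = {}               # edge -> partition
    for u, v in edges:
        shared = replicas[u] & replicas[v]  # partitions holding both endpoints
        either = replicas[u] | replicas[v]  # partitions holding at least one
        # Prefer partitions that avoid creating new replicas.
        candidates = shared or either or set(range(num_parts))
        # Break ties by load, then by partition id, to keep results deterministic.
        target = min(candidates, key=lambda p: (load[p], p))
        placement[(u, v)] = target
        load[target] += 1
        replicas[u].add(target)
        replicas[v].add(target)
    return placement, replicas


def replication_factor(replicas):
    """Average number of replicas per vertex (lower means less communication)."""
    return sum(len(parts) for parts in replicas.values()) / len(replicas)
```

This simplified rule favors low replication over load balance; practical heuristics usually add a balance term so that one partition does not absorb an entire connected component.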

Keywords: epiCG epiC Vertex-cut partitioning distributed graph processing systems Big Data

VSCT algorithm for graph partitioning based on volume, size, cuts and time

INTERNATIONAL JOURNAL OF PARALLEL EMERGENT AND DISTRIBUTED SYSTEMS, 2023, Vol. 38, No. 3, pp. 181-197

Authors: Sakouhi, Chayma; Khaldi, Abir; Ghezala, Henda Ben
Affiliations: Univ Manouba, Natl Sch Comp Sci, RIADI Lab, Manouba, Tunisia; Univ Manouba, Comp Sci, Natl Sch Comp Sci, RIADI Lab, Manouba 8100, Tunisia

Dealing with large-scale graphs requires an efficient graph partitioner that produces balanced partitions with fewer cut edges/vertices in a reasonable amount of time. Although several algorithms have been proposed, they remain insufficient. Despite the continuous growth of graph volume, they do not consider the graph volume during graph partitioning. Therefore, these algorithms generate an imbalanced workload. We propose a graph partitioning algorithm, VSCT, based essentially on four key metrics: Volume, Size, Cuts, and Time, to maintain high-quality graph partitioning. Using real-world datasets, we show that VSCT achieves efficient, high-quality partitioning compared with existing graph partitioning algorithms.

Keywords: graph partitioning algorithms distributed graph processing systems volume size cuts time

Algorithm-Level Optimizations for Scalable Parallel Graph Processing

Author: Harshvardhan
Affiliation: Texas A&M University

Efficiently processing large graphs is challenging, since parallel graph algorithms suffer from poor scalability and performance due to many factors, including heavy communication and load imbalance. Furthermore, it is difficult to express graph algorithms, as users need to understand and effectively utilize the underlying execution of the algorithm on the distributed system. The performance of graph algorithms depends not only on the characteristics of the system (such as latency, available RAM, etc.), but also on the characteristics of the input graph (small-world, scale-free, mesh, long-diameter, etc.), and characteristics of the algorithm (sparse computation vs. dense communication). The best execution strategy, therefore, often heavily depends on the combination of input graph, system and algorithm. Fine-grained expression exposes maximum parallelism in the algorithm and allows the user to concentrate on a single vertex, making it easier to express parallel graph algorithms. However, this often loses information about the machine, making it difficult to extract performance and scalability from fine-grained algorithms. To address these issues, we present a model for expressing parallel graph algorithms using a fine-grained expression. Our model decouples the algorithm-writer from the underlying details of the system, graph, and execution and tuning of the algorithm. We also present various graph paradigms that optimize the execution of graph algorithms for various types of input graphs and systems. We show our model is general enough to allow graph algorithms to use the various graph paradigms for the best/fastest execution, and demonstrate good performance and scalability for various graphs, algorithms, and systems to 100,000+ cores.
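
The fine-grained, single-vertex style of expression mentioned in the abstract above can be illustrated with a toy sketch. This is not the thesis's model: it is a generic, single-process imitation of synchronous vertex-centric execution, and the names `run_vertex_centric` and `min_label` are hypothetical. The user writes only a per-vertex function, and a small driver handles message exchange between rounds.

```python
def run_vertex_centric(graph, init, compute, max_supersteps=100):
    """Run a per-vertex function in synchronous rounds (illustrative sketch).

    graph:   dict mapping each vertex to a list of neighbors; every neighbor
             must also appear as a key
    init:    function vertex -> initial value
    compute: function (current value, list of messages) -> new value
    """
    values = {v: init(v) for v in graph}
    # Round 0: every vertex announces its initial value to its neighbors.
    inbox = {v: [] for v in graph}
    for v, neighbors in graph.items():
        for n in neighbors:
            inbox[n].append(values[v])
    for _ in range(max_supersteps):
        outbox = {v: [] for v in graph}
        changed = False
        for v in graph:
            if not inbox[v]:
                continue  # vertices without messages stay idle this round
            new_value = compute(values[v], inbox[v])
            if new_value != values[v]:
                values[v] = new_value
                changed = True
                # Messages are delivered only in the next round.
                for n in graph[v]:
                    outbox[n].append(new_value)
        if not changed:
            break
        inbox = outbox
    return values


def min_label(value, messages):
    """Per-vertex logic for connected components: keep the smallest label seen."""
    return min(value, min(messages))
```

With an undirected adjacency list, calling `run_vertex_centric(graph, lambda v: v, min_label)` labels each vertex with the smallest vertex id in its connected component. The user-supplied function never refers to partitions, machines, or scheduling, which is the decoupling the abstract describes in general terms.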

Keywords: graph processing distributed systems graph Algorithms High Performance Computing Parallel graph Algorithms Scalable graph Algorithms distributed graph processing systems Large-scale graph processing Thesis
