Details
ISBN:
(Print) 9781665440660
Graph analytics is at the heart of a broad range of applications such as drug discovery, page ranking, transportation systems, and recommendation models. When the graph size exceeds the available memory of a computing node, out-of-core graph processing is needed. In widely used out-of-core graph processing systems, the graph is stored on and accessed from long-latency SSD storage, which becomes a significant performance bottleneck. To tackle this long latency, this work exploits the key insight that nearly all graph algorithms have a dynamically varying number of active vertices that must be processed in each iteration. However, existing graph processing frameworks, such as GraphChi, load the entire graph in each iteration even if only a small fraction of it is active. This limitation stems from the structure of the graph storage these systems use. In this work, we propose a compressed sparse row (CSR) based graph storage that is more amenable to selectively loading only the few active vertices in each iteration. However, CSR-based graph processing suffers from random update propagation to many target vertices. To solve this challenge, we propose a multi-log update mechanism that logs updates separately, rather than directly updating the active edges and vertices in the graph. The multi-log system maintains a separate log per vertex interval (a group of vertices). This separation enables efficient processing of all updates bound for each vertex interval by loading just the corresponding log. Further, by logging all the updates associated with a vertex interval in one contiguous log, this approach reduces read amplification, since all the pages in the log will be processed in the next iteration without wasted page reads. Over the current state-of-the-art out-of-core graph processing framework, our evaluation results show that the MultiLogVC framework improves performance by up to 17.84x, 1.19x, 1.65x, 1.38x, 3.15x, and 6.00x for the widely
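As a rough illustration of the two ideas in this abstract, the sketch below builds a CSR structure and routes one iteration of push-style updates into per-interval logs instead of scattering them directly to target vertices. All names are illustrative, not taken from MultiLogVC, and the in-memory lists stand in for the on-SSD arrays and logs.

```python
from collections import defaultdict

class CSRGraph:
    """Compressed sparse row storage: an offsets array plus a flat
    targets array, so the out-edges of any vertex form a contiguous
    slice that can be loaded selectively."""
    def __init__(self, num_vertices, edges):
        self.n = num_vertices
        counts = [0] * num_vertices
        for src, _ in edges:
            counts[src] += 1
        # offsets[v] .. offsets[v+1] delimit vertex v's out-edges.
        self.offsets = [0] * (num_vertices + 1)
        for v in range(num_vertices):
            self.offsets[v + 1] = self.offsets[v] + counts[v]
        self.targets = [0] * len(edges)
        cursor = list(self.offsets[:-1])
        for src, dst in edges:
            self.targets[cursor[src]] = dst
            cursor[src] += 1

    def out_neighbors(self, v):
        return self.targets[self.offsets[v]:self.offsets[v + 1]]

def push_updates(graph, active_vertices, values, interval_size):
    """One iteration: instead of randomly updating target vertices,
    append each (target, value) pair to the contiguous log of the
    target's vertex interval, to be applied interval by interval."""
    logs = defaultdict(list)              # interval id -> update log
    for v in active_vertices:
        for dst in graph.out_neighbors(v):
            logs[dst // interval_size].append((dst, values[v]))
    return logs

# A 4-vertex example with two active vertices and intervals of size 2.
g = CSRGraph(4, [(0, 1), (0, 3), (2, 3)])
logs = push_updates(g, [0, 2], [5, 0, 7, 0], interval_size=2)
```

Processing then iterates over each interval's log alone, so every page read from a log contains only updates that will actually be applied.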
Details
ISBN:
(Print) 9781728195865
A graph engine should possess adaptability to ensure efficient processing across a variety of graph data and algorithms. For out-of-core graph engines, which exploit a hierarchical memory structure, an adaptive caching scheme is necessary to sustain effective memory usage. A caching policy selectively stores data likely to be used in the upper-layer memory, based on its own expectation of the future workload. However, graph workloads exhibit complex memory-access behavior that depends on the graph data, the algorithm, and the configuration, which makes it difficult for a static caching policy to respond to changes in the workload. In this paper, we propose a graph-adaptive caching scheme that remains consistently effective under changing workloads. Our caching scheme employs an adaptive policy that responds to changes in the real-time workload. To detect these changes, we adopt a competition procedure between two contrasting properties, locality and regularity, that appear in graph workloads. In addition, we combine two window adjustment techniques to alleviate the overhead of the competition procedure. The proposed caching scheme is applicable to different types of graph engines, achieving better efficiency in memory usage. Our experimental results show that our scheme improves graph processing performance by up to 65% compared to existing schemes.
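The core idea of competition-driven adaptation can be sketched as follows. This is a deliberately simplified illustration, not the paper's scheme: two eviction policies (LRU for locality-heavy phases, MRU for scan-like regular phases) accrue scores over a fixed window, and the cache adopts the window's winner. The class and scoring rule are assumptions for illustration only.

```python
from collections import OrderedDict

class CompetitiveCache:
    """Toy adaptive cache: a hit rewards the locality policy, a miss
    rewards the regularity policy; every `window` requests the cache
    switches to whichever policy scored higher, then resets scores."""
    def __init__(self, capacity, window=100):
        self.capacity = capacity
        self.window = window
        self.store = OrderedDict()        # page id -> data, recency-ordered
        self.policy = "locality"
        self.scores = {"locality": 0, "regularity": 0}
        self.requests = 0

    def _evict(self):
        if self.policy == "locality":
            self.store.popitem(last=False)   # LRU: drop least recently used
        else:
            self.store.popitem(last=True)    # MRU: favors one-pass scans

    def access(self, page, load):
        self.requests += 1
        if page in self.store:
            self.scores["locality"] += 1     # reuse observed: locality wins
            self.store.move_to_end(page)
            data = self.store[page]
        else:
            self.scores["regularity"] += 1   # no reuse: regularity wins
            if len(self.store) >= self.capacity:
                self._evict()
            data = self.store[page] = load(page)
        if self.requests % self.window == 0:
            self.policy = max(self.scores, key=self.scores.get)
            self.scores = {"locality": 0, "regularity": 0}
        return data
```

The paper additionally adjusts the competition window itself; here the window is fixed to keep the sketch short.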
Details
ISBN:
(Print) 9798400701795
Partitioning and processing large graphs on a single machine with limited memory is a challenge. While many custom solutions for out-of-core processing have been developed, limited work has been done on out-of-core partitioning, which can be far more memory intensive than processing. In this paper, we present the OMRGx system, whose programming interface allows the programmer to rapidly prototype existing as well as new partitioning and processing strategies with minimal programming effort, oblivious to the graph size. The OMRGx engine transparently implements these strategies in an out-of-core manner while hiding the complexities of managing limited memory, parallel computation, and parallel I/O from the programmer. The execution model allows multiple partitions to be constructed simultaneously and processed simultaneously by dividing the machine memory among the partitions; in contrast, existing systems process partitions one at a time. Using OMRGx, we developed the first out-of-core implementation of the popular MtMetis partitioner. OMRGx implementations of the existing GridGraph and GraphChi out-of-core processing frameworks deliver performance better than their standalone optimized implementations. The runtimes of implementations produced by OMRGx decrease with the number of partitions requested and increase linearly with the graph size. Finally, the OMRGx default implementation performs best of all.
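The execution model described here, dividing the machine memory budget among partitions and working on all of them at once, can be sketched as below. The function and its parameters are hypothetical illustrations, not OMRGx's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partitions(partitions, total_memory_pages, worker):
    """Divide the machine's memory budget evenly among the partitions
    and process them simultaneously, rather than one at a time.
    `worker(partition, budget)` is a hypothetical per-partition task
    that must stay within its share of memory pages."""
    budget = total_memory_pages // len(partitions)
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        futures = [pool.submit(worker, p, budget) for p in partitions]
        return [f.result() for f in futures]

# Three partitions share a 90-page budget, 30 pages each.
results = process_partitions(["p0", "p1", "p2"], 90,
                             lambda p, b: (p, b))
```

With more partitions requested, each worker's share shrinks but more work proceeds concurrently, consistent with the reported runtimes decreasing as the partition count grows.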