graph algorithms currently play increasingly important roles, especially in social networks and language modeling scenarios. Recently, accelerating graph algorithms by heterogeneous high performance computers with the...
详细信息
ISBN:
(纸本)9781509048250
graph algorithms currently play increasingly important roles, especially in social networks and language modeling scenarios. Recently, accelerating graph algorithms by heterogeneous high performance computers with the integrated cores and expanded SIMD lanes has been becoming the mainstream. However, the existing methods, restricted by the low-efficiency grouping strategy and the non-optimized selection mechanism of tile size of a graph, are far below our expectations in many ways. Moreover, there are few convenient integrated tools provided for deploying the graph algorithms on MIC architecture. In this paper, we propose a high-efficiency framework MicRun, which is flexible to be used for graph algorithms on SIMD architecture of the Xeon Phi. There are two key components in MicRun, the Bucket Grouping module and Auto-tuning module. In the Grouping module, an optimization algorithm is designed for splitting graph tiles into contlict-free groups, which can be directly processed on SIMD parallelism. In the Auto-tuning module, a novel strategy is proposed for optimizing the tile size to boost execution efficiency of the graph computation. MicRun currently supports Bellman-Ford and PageRank algorithms, we also conduct extensive validation experiments on MicRun. Experimental results show that MicRun outperforms existing mechanisms in terms of storage and time overhead. As a consequence, both graph algorithms achieve an average speedup of 1.1x by MicRun, compared with the state-of-the-art.
Existing state-of-the-art CPU graph frameworks take advantage of multiple cores, but not the SIMD capability within each core. In this work, we retarget an existing GPU graph algorithm compiler to obtain the first gra...
详细信息
ISBN:
(纸本)9781728186139
Existing state-of-the-art CPU graph frameworks take advantage of multiple cores, but not the SIMD capability within each core. In this work, we retarget an existing GPU graph algorithm compiler to obtain the first graph framework that uses SIMD extensions on CPUs to efficiently execute graph algorithms. We evaluate this compiler on 10 benchmarks and 3 graphs on 3 different CPUs and also compare to the GPU. Evaluation results show that on a 8-core machine, enabling SIMD on a naive multi-core implementation achieves an additional 7.48x speedup, averaged across 10 benchmarks and 3 inputs. Applying our SIMD-targeted optimizations improves the plain SIMD implementation by 1.67x, outperforming a serial implementation by 12.46x. On average, the optimized multi-core SIMD version also outperforms the state-of-the-art graph framework, graphIt, by 1.53x, averaged across 5 (common) benchmarks. SIMD execution on CPUs closes the gap between the CPU and GPU to 1.76x, but the CPU virtual memory performs better when graphs are much bigger than available physical memory.
Tower of Hanoi is a classical tutorial example traditionally for introducing recursive algorithms in CS1. This poster describes a lecture of teaching graph algorithms using the graphical representation of the game and...
详细信息
ISBN:
(纸本)9781450394338
Tower of Hanoi is a classical tutorial example traditionally for introducing recursive algorithms in CS1. This poster describes a lecture of teaching graph algorithms using the graphical representation of the game and its variants in upper-level courses. More specifically, this poster provides a totally different perspective on solving the tower of Hanoi using DFS, BFS, A*, greedy, and other search algorithms on graphs. Students will not only solve the game and its variants in an inspiring way, but also learn a modeling method to convert problems into graphs.
We consider scenarios in which two parties, each in possession of a graph, wish to compute some algorithm on their joint graph in a privacy-preserving manner, that is, without leaking any information about their input...
详细信息
ISBN:
(纸本)3540306846
We consider scenarios in which two parties, each in possession of a graph, wish to compute some algorithm on their joint graph in a privacy-preserving manner, that is, without leaking any information about their inputs except that revealed by the algorithm's output. Working in the standard secure multi-party computation paradigm, we present new algorithms for privacy-preserving computation of AND (all pairs shortest distance) and SSSD (single source shortest distance), as well as two new algorithms for privacy-preserving set union. Our algorithms are significantly more efficient than generic constructions. As in previous work on privacy-preserving data mining, we prove that our algorithms are secure provided the participants are "honest, but curious."
graph problems are common across many fields, from scientific computing to social sciences. Despite their importance and the attention received, implementing graph algorithms effectively on modern computing systems re...
详细信息
ISBN:
(纸本)9798350342543
graph problems are common across many fields, from scientific computing to social sciences. Despite their importance and the attention received, implementing graph algorithms effectively on modern computing systems remains a challenging task that requires significant programming effort and generally results in customized implementations. Current computing and memory hierarchies are not architected for irregular computations, resulting performance that is far from the theoretical architectural peak. In this paper, we propose a compiler framework to simplify the development of graph algorihtm implementations that can achieve high performance on modern computing systems. We provide a high-level domain specific language (DSL) to represent graph algorithms through sparse linear algebra expressions and graph primitives including semiring and masking. The compiler leverages the semantics information expressed through the DSL during the optimization and code transformation passes, resulting in more efficient IR passed to the compiler backend. In particular, we introduce an Index Tree Dialect that preserves the semantic information of the graph algorithm to perform high-level, domain-specific optimizations, including workspace transformation, two-phase computation, and automatic parallelization. We demonstrate that this work outperforms state-of-the-art graph libraries LAgraph by up to 3.7x speedup in semiring operations, 2.19x speedup in an important sparse computational kernel, and 9.05x speedup in graph processing algorithms.
Distributed graph platforms like Pregel have used vertex-centric programming models to process the growing corpus of graph datasets using commodity clusters. However, the irregular structure of graphs causes load imba...
详细信息
ISBN:
(纸本)9781509024537
Distributed graph platforms like Pregel have used vertex-centric programming models to process the growing corpus of graph datasets using commodity clusters. However, the irregular structure of graphs causes load imbalances across machines, and this is exacerbated for non-stationary graph algorithms where not all parts of the graph are active at the same time. As a result, such graph platforms do not make efficient use of distributed resources. In this paper, we decouple graph partitioning from placement on hosts, and introduce strategies for elastic placement of graph partitions on Cloud VMs to reduce the cost of execution compared to a static placement, even as we minimize the increase in makespan. These strategies are innovative in modeling the graph algorithm's non-stationary behavior a priori using a metagraph sketch. We validate our strategies for several real-world graphs, using runtime traces for approximate Betweenness Centrality (BC) algorithm on our subgraph-centric GoFFish graph platform. Our strategies are able to reduce the cost of execution by up to 54%, compared to a static placement, while achieving a makespan that is within 25% of the optimal.
In this thesis we give new algorithms for two fundamental graph problems. We develop novel ways of using linear programming formulations, even exponential-sized ones, to extract structure from problem instances and to...
详细信息
In this thesis we give new algorithms for two fundamental graph problems. We develop novel ways of using linear programming formulations, even exponential-sized ones, to extract structure from problem instances and to guide algorithms in making progress. Somewhat surprisingly, similar polyhedral techniques can be harnessed in the two seemingly disparate settings. In the first part of the thesis we address a benchmark problem in combinatorial optimization: the asymmetric traveling salesman problem (ATSP). It consists in finding the shortest tour that visits all vertices of a given directed graph with weights on edges. Due to its NP-hardness, the theoretical study of algorithms for ATSP has focused on approximation algorithms: ones that are provably both efficient and give solutions competitive with the optimum. Specifically, a ρ-approximation algorithm for ATSP is one that runs in polynomial time and always outputs a tour that is at most ρ times longer than the shortest tour. Finding such an approximation algorithm with ρ bounded (i. e., a constant factor) had been a long-standing open problem. In this thesis, we give such an algorithm. Our approximation guarantee is analyzed with re- spect to the standard linear programming relaxation, and thus our result also confirms the conjec- tured constant integrality gap of that relaxation. Our techniques build upon the constant-factor approximation algorithm for the special case of node-weighted metrics due to Svensson [Sve15]. In particular, we give a generic reduction to structured instances that resemble but are more general than those arising from node-weighted metrics. This reduction takes advantage of a laminar family of vertex sets that arises from the linear programming relaxation. In the second part of the thesis we address the perfect matching problem. The first polynomial- time algorithm for it, given by Edmonds in 1965 [Edm65b], is historically associated with the in- troduction of the class P and our notion that
Combinatorial problems such as those from graph theory pose serious challenges for parallel machines due to non-contiguous, concurrent accesses to global data structures with low degrees of locality. The hierarchical ...
详细信息
ISBN:
(纸本)0769523803
Combinatorial problems such as those from graph theory pose serious challenges for parallel machines due to non-contiguous, concurrent accesses to global data structures with low degrees of locality. The hierarchical memory systems of symmetric multiprocessor (SMP) clusters optimize for local, contiguous memory accesses, and so are inefficient platforms for such algorithms. Few parallel graph algorithms outperform their best sequential implementation on SMP clusters due to long memory latencies and high synchronization costs. In this paper, we consider the performance and scalability of two graph algorithms, list ranking and connected components, on two classes of shared-memory computers: symmetric multiprocessors such as the Sun Enterprise servers and multithreaded architectures (MTA) such as the Cray MTA-2. While previous studies have shown that parallel graph algorithms can speedup on SMPs, the systems' reliance on cache microprocessors limits performance. The MTA's latency tolerant processors and hardware support for fine-grain synchronization makes performance a function of parallelism. Since parallel graph algorithms have an abundance of parallelism, they perform and scale significantly better on the MTA. We describe and give a performance model for each architecture. We analyze the performance of the two algorithms and discuss how the features of each architecture affects algorithm development, ease of programming, performance, and scalability.
We reconsider basic algorithmic graph problems in a setting where an n-vertex input graph is read-only and the computation must take place in a working memory of O(n) bits or little more than that. For computing conne...
详细信息
ISBN:
(纸本)9783939897781
We reconsider basic algorithmic graph problems in a setting where an n-vertex input graph is read-only and the computation must take place in a working memory of O(n) bits or little more than that. For computing connected components and performing breadth-first search, we match the running times of standard algorithms that have no memory restrictions, for depth-first search and related problems we come within a factor of Theta(log log n), and for computing minimum spanning forests and single-source shortest-paths trees we come close for sparse input graphs.
Recommendation Systems play a very important role in our lives in the modern era. With the advent of Big Data in recent years, an enormous amount of information (in both structured and un structured data formats) is b...
详细信息
Recommendation Systems play a very important role in our lives in the modern era. With the advent of Big Data in recent years, an enormous amount of information (in both structured and un structured data formats) is being generated every second from various data sources. Recommendation Systems are very helpful for generating meaningful insights from massive amounts of data. Slower batch approaches are enhanced by real time recommendations enabled by storing data in a graph database platform. This dissertation aims to build and implement a real time recommendation system using different graph algorithms in Neo4j, the current leader in graph operational database management systems. The requirements or use cases for this research were proposed by the AI and Analytics Team of Neo4j, for their ongoing research and development activities. Various research papers were studied to get an overview of various graph algorithms currently used in recommendation systems in graph databases. A customized graph data model was implemented to provide solutions for the research questions. Cypher, the Neo4j query language was used to implement a selection of recommendation graph algorithms on this data model. The graph algorithms used were Overlap Similarity, Cosine Similarity and PageRank. For providing a comparison between traditional and graph databases, FP-Growth, a traditional Association Rule Algorithm, was implemented using Rapid Miner, a leading Data Mining Tool, showcasing a particular use case using a traditional approach. A Python Script was developed to prepare the data for loading to the customized graph data model. Also, a data profiling or statistical analysis was performed on the loaded data providing a thorough analysis of the structure, contents and meta data of the loaded data. graph analytics were only been introduced for an operational graph database management system in the fourth quarter of 2017. The results obtained from this research highlight the enormous potent
暂无评论