ISBN: (print) 9798400704062
In vertex-centric programming, users express a graph algorithm as a vertex program and specify the iterative behavior of a vertex in a compute function, which all vertices in the graph execute in parallel and synchronously over a sequence of supersteps. This programming model is straightforward for simple algorithms in which vertices behave the same way in every superstep. For complex vertex programs in which behavior differs across supersteps, however, a vertex must frequently dispatch on the superstep number inside compute, which incurs unnecessary interpretation overhead and complicates the control flow. We address this using meta-programming: instead of branching on the superstep number, users separate the instructions that should run in different supersteps with a staging-time wait() instruction. When a superstep starts, the computation in a vertex program resumes from its last execution point and continues until the next wait(). We implement this in the programming model of CloudCity, an agent-based simulation framework, and show that avoiding the interpretation overhead of dispatching on the superstep number can improve performance by up to 25% and lead to more robust performance.
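The contrast between dispatching on the superstep number and resuming from a wait() point can be sketched in Python. This is only a runtime analogy: the abstract describes a staging-time (meta-programming) wait(), whereas the sketch below uses a generator's `yield` to stand in for it. All names here (`vertex_program`, `dispatching_compute`, `run_staged`, `run_dispatching`) and the three-phase behavior are hypothetical and are not taken from CloudCity.

```python
from typing import Dict, Generator, List


def dispatching_compute(vertex_id: int, superstep: int, log: List[str]) -> bool:
    """Baseline style: branch on the superstep number on every call.

    Returns True while the vertex still has work to do.
    """
    if superstep == 0:
        log.append(f"v{vertex_id}: init")
        return True
    elif superstep == 1:
        log.append(f"v{vertex_id}: exchange")
        return True
    else:
        log.append(f"v{vertex_id}: finalize")
        return False


def vertex_program(vertex_id: int, log: List[str]) -> Generator[None, None, None]:
    """wait()-style program: each `yield` ends the current superstep.

    `yield` is an assumption standing in for the staging-time wait().
    """
    log.append(f"v{vertex_id}: init")
    yield  # wait(): resume here in the next superstep
    log.append(f"v{vertex_id}: exchange")
    yield  # wait()
    log.append(f"v{vertex_id}: finalize")


def run_dispatching(num_vertices: int) -> List[str]:
    """Drive the baseline: call compute on every active vertex per superstep."""
    log: List[str] = []
    active = set(range(num_vertices))
    superstep = 0
    while active:
        for v in sorted(active):  # sorted() copies, so discarding is safe
            if not dispatching_compute(v, superstep, log):
                active.discard(v)
        superstep += 1
    return log


def run_staged(num_vertices: int) -> List[str]:
    """Drive the wait()-style programs: resume each one until its next yield."""
    log: List[str] = []
    active: Dict[int, Generator[None, None, None]] = {
        v: vertex_program(v, log) for v in range(num_vertices)
    }
    while active:
        for v in sorted(active):  # sorted() copies, so deleting is safe
            try:
                next(active[v])  # run until the next wait() or the end
            except StopIteration:
                del active[v]  # program finished; vertex becomes inactive
    return log
```

Both drivers produce the same per-superstep trace, but the wait()-style program reads as straight-line code with no branch on the superstep number.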
Pattern matching in big graphs is important for many modern applications. Recently, this problem has been defined in terms of multiple extensions of graph simulation, to reduce complexity and capture more meaningful results. These results were achieved by relaxing the constraints commonly used in subgraph-isomorphism pattern matching. Nevertheless, these variants of graph simulation are still too strict to return results in many cases, especially when the analyzed graphs contain anomalies and incomplete information. To deal with this issue, we introduce a new graph pattern matching (GPM) method, called partial simulation, which can retrieve matches even when parts of the pattern graph, such as vertices and/or edges, are missing. Furthermore, considering the number of outputs and their unequal quality, we define a relevance function that computes a value expressing how well each matched vertex respects the pattern graph. Similarly, we define partial dual simulation GPM, which returns vertices that satisfy part of the dual simulation constraints and assigns a relevance value to each of them. Additionally, we provide scalable distributed algorithms to evaluate the proposed partial simulation methods, based on the distributed vertex-centric programming paradigm. Finally, our experiments on real-world data graphs demonstrate the effectiveness of the proposed models and the efficiency of their associated algorithms.
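For context, the strict baseline that partial simulation relaxes can be sketched as follows. This is a minimal, single-machine sketch of standard graph simulation, not the partial simulation method or the relevance function described above, whose definitions the abstract does not give. The function name `graph_simulation` and the toy graphs are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def graph_simulation(
    pattern_labels: Dict[Hashable, str],
    pattern_edges: Iterable[Edge],
    data_labels: Dict[Hashable, str],
    data_edges: Iterable[Edge],
) -> Dict[Hashable, Set[Hashable]]:
    """Compute the maximum graph simulation by fixpoint refinement.

    A data vertex v stays a candidate for pattern vertex u only if it has
    the same label and, for every pattern edge (u, u2), v has at least one
    successor that is still a candidate for u2.
    """
    pattern_edges = list(pattern_edges)

    # Successor sets of the data graph.
    succ: Dict[Hashable, Set[Hashable]] = {}
    for a, b in data_edges:
        succ.setdefault(a, set()).add(b)

    # Start from all label-compatible candidates.
    sim = {
        u: {v for v, label in data_labels.items() if label == pattern_labels[u]}
        for u in pattern_labels
    }

    # Remove candidates that violate a pattern edge until nothing changes.
    changed = True
    while changed:
        changed = False
        for u, u2 in pattern_edges:
            keep = {v for v in sim[u] if succ.get(v, set()) & sim[u2]}
            if keep != sim[u]:
                sim[u] = keep
                changed = True
    return sim


# Toy example: the pattern is a single edge from an "A" vertex to a "B" vertex.
pattern_labels = {"a": "A", "b": "B"}
pattern_edges = [("a", "b")]
# Data vertex 3 is labeled "A" but has no "B" successor.
data_labels = {1: "A", 2: "B", 3: "A"}
data_edges = [(1, 2)]

result = graph_simulation(pattern_labels, pattern_edges, data_labels, data_edges)
```

In this toy example, strict simulation discards vertex 3 entirely because its outgoing edge is missing. This is the kind of case with incomplete information that the abstract says partial simulation is meant to handle, by keeping such vertices and scoring them with a relevance value.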