ISBN: 9783642051173 (print)
Given the non-determinism and race conditions in distributed programs, the ability to provide assurance about them is crucial. Our work focuses on incremental synthesis, where we modify a distributed program to add self-stabilization. We concentrate on reducing the time complexity of such synthesis using parallelism. We apply these techniques in the context of constraint satisfaction. In particular, incremental synthesis of self-stabilizing programs requires adding recovery actions to satisfy the constraints that are true in the legitimate states. We consider two approaches to speed up the synthesis algorithm: first, the use of the multiple constraints that have to be satisfied during synthesis; second, the use of the distributed nature of the programs being synthesized. We show that our approaches provide significant reductions in the synthesis time.
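To make "adding recovery actions" concrete, here is a minimal toy sketch in Java (an illustration of the general idea only, not the paper's synthesis algorithm; all names and the tiny state space are hypothetical): given a finite state space and an invariant describing the legitimate states, recovery actions are transitions added from each illegitimate state so that every execution converges back to the invariant and stays there.

import java.util.*;

// Toy illustration (not the paper's algorithm): states are integers 0..9,
// the invariant ("legitimate states") is "state is even", and the original
// program only ever moves between even states.  Adding self-stabilization
// means adding recovery transitions from every odd (illegitimate) state.
public class RecoveryAdditionSketch {
    static boolean legitimate(int s) { return s % 2 == 0; }

    public static void main(String[] args) {
        // Transitions of the original fault-intolerant program: s -> s+2 (mod 10).
        Map<Integer, Integer> transitions = new HashMap<>();
        for (int s = 0; s < 10; s += 2) transitions.put(s, (s + 2) % 10);

        // Recovery actions: from each odd state, step to the next even state.
        // A real synthesizer must also verify that recovery cannot cycle
        // outside the invariant; this toy domain recovers in one step.
        for (int s = 1; s < 10; s += 2) transitions.put(s, (s + 1) % 10);

        int s = 7;  // start in an arbitrary state, as if after a fault
        for (int step = 0; step < 4; step++) {
            System.out.println(s + (legitimate(s) ? "  (legitimate)" : "  (recovering)"));
            s = transitions.get(s);
        }
    }
}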
The Java(TM) developers kit requires a size() operation for all objects, tracking the number of elements in the object. Unfortunately, the best known solution, available in the Java concurrency package, has a blocking concurrent implementation that does not scale. This paper presents a highly scalable wait-free implementation of a concurrent size() operation based on a new lock-free interrupting snapshots algorithm. The key idea behind the new algorithm is to allow snapshot scan methods to interrupt each other until they agree on a shared linearization point with respect to update methods. This contrasts sharply with past approaches to the classical atomic snapshot problem, which have had threads coordinate the collecting of a shared global view. As we show empirically, the new algorithm scales well, significantly outperforming existing implementations. (C) 2012 Elsevier Inc. All rights reserved.
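For intuition about the problem setting, here is a much-simplified Java sketch (all names hypothetical): element counts are striped across per-thread slots so that updates scale, and size() takes a snapshot using the classic lock-free double collect over versioned slots. This is not the paper's algorithm; its wait-free interrupting-snapshots scheme exists precisely to bound the retries that the double collect below may perform.

import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Simplified sketch: the element count is striped across slots so that
// add/remove operations contend on separate memory, and size() sums an
// atomic snapshot of the slots.  The snapshot is the classic lock-free
// "double collect": retry until two consecutive scans see the same
// per-slot version, which guarantees a consistent view.
public class StripedSizeSketch {
    // Each slot is an immutable (version, count) pair, replaced atomically.
    private record Slot(long version, long count) {}

    private final AtomicReferenceArray<Slot> slots;

    public StripedSizeSketch(int stripes) {
        slots = new AtomicReferenceArray<>(stripes);
        for (int i = 0; i < stripes; i++) slots.set(i, new Slot(0, 0));
    }

    // Called by the data structure on insert (+1) / delete (-1).
    public void adjust(int threadId, long delta) {
        int i = threadId % slots.length();
        while (true) {
            Slot s = slots.get(i);
            if (slots.compareAndSet(i, s, new Slot(s.version() + 1, s.count() + delta)))
                return;
        }
    }

    // Lock-free snapshot scan: unchanged versions across two collects mean
    // no slot was updated in between, so the summed view is consistent.
    public long size() {
        long[] prev = collectVersions();
        while (true) {
            long sum = 0;
            long[] cur = new long[slots.length()];
            for (int i = 0; i < slots.length(); i++) {
                Slot s = slots.get(i);
                cur[i] = s.version();
                sum += s.count();
            }
            if (Arrays.equals(prev, cur)) return sum;
            prev = cur;  // interference detected: retry the scan
        }
    }

    private long[] collectVersions() {
        long[] v = new long[slots.length()];
        for (int i = 0; i < v.length; i++) v[i] = slots.get(i).version();
        return v;
    }
}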
ISBN: 9781450383431 (print)
SCAN (Structural Clustering Algorithm for Networks) is a well-studied, widely used graph clustering algorithm. For large graphs, however, sequential SCAN variants are prohibitively slow, and parallel SCAN variants do not effectively share work among queries with different SCAN parameter settings. Since users of SCAN often explore many parameter settings to find good clusterings, it is worthwhile to precompute an index that speeds up queries. This paper presents a practical and provably efficient parallel index-based SCAN algorithm based on GS*-Index, a recent sequential algorithm. Our parallel algorithm improves upon the asymptotic work of the sequential algorithm by using integer sorting. It is also highly parallel, achieving logarithmic span (parallel time) for both index construction and clustering queries. Furthermore, we apply locality-sensitive hashing (LSH) to design a novel approximate SCAN algorithm and prove guarantees for its clustering behavior. We present an experimental evaluation of our algorithms on large real-world graphs. On a 48-core machine with two-way hyper-threading, our parallel index construction achieves 50-151x speedup over the construction of GS*-Index. In fact, even on a single thread, our index construction algorithm is faster than GS*-Index. Our parallel index query implementation achieves 5-32x speedup over GS*-Index queries across a range of SCAN parameter values, and our implementation is always faster than ppSCAN, a state-of-the-art parallel SCAN algorithm. Moreover, our experiments show that applying LSH results in faster index construction while maintaining good clustering quality.
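For context, SCAN's core primitive is the structural similarity of an edge, sigma(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| * |N[v]|) over closed neighborhoods; an index in the spirit of GS*-Index precomputes sigma for every edge and keeps each vertex's neighbors sorted by it, so a clustering query for any (epsilon, mu) only scans sorted prefixes instead of recomputing similarities. Below is a minimal Java sketch of the similarity computation itself (the standard SCAN definition, not the paper's parallel code; names are hypothetical).

import java.util.Arrays;

// Structural similarity of an edge (u, v):
//   sigma(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| * |N[v]|),
// where N[x] is the closed neighborhood of x.
public class ScanSimilaritySketch {
    // Assumes u and v are adjacent (SCAN only evaluates sigma on edges);
    // adj[x] must be sorted ascending and must not contain x itself.
    static double similarity(int u, int v, int[][] adj) {
        int common = 2;            // u and v each belong to both closed neighborhoods
        int i = 0, j = 0;          // two-pointer intersection of sorted lists
        while (i < adj[u].length && j < adj[v].length) {
            if (adj[u][i] == adj[v][j]) { common++; i++; j++; }
            else if (adj[u][i] < adj[v][j]) i++;
            else j++;
        }
        double du = adj[u].length + 1, dv = adj[v].length + 1;  // closed degrees
        return common / Math.sqrt(du * dv);
    }

    public static void main(String[] args) {
        // Tiny triangle 0-1-2 plus a pendant edge 2-3.
        int[][] adj = { {1, 2}, {0, 2}, {0, 1, 3}, {2} };
        System.out.printf("sigma(0,1) = %.3f%n", similarity(0, 1, adj)); // 1.000: identical neighborhoods
        System.out.printf("sigma(2,3) = %.3f%n", similarity(2, 3, adj)); // 0.707: pendant edge
    }
}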
ISBN: 9780769552071 (print)
Finding the strongly connected components (SCCs) of a directed graph is a fundamental graph-theoretic problem. Tarjan's algorithm is an efficient serial algorithm to find SCCs, but relies on the hard-to-parallelize depth-first search (DFS). We observe that implementations of several parallel SCC detection algorithms show poor parallel performance on modern multicore platforms and large-scale networks. This paper introduces the Multistep method, a new approach that avoids work inefficiencies seen in prior SCC approaches. It does not rely on DFS, but instead uses a combination of breadth-first search (BFS) and a parallel graph coloring routine. We show that the Multistep method scales well on several real-world graphs, with performance fairly independent of topological properties such as the size of the largest SCC and the total number of SCCs. On a 16-core Intel Xeon platform, our algorithm achieves a 20x speedup over the serial approach on a 2 billion edge graph, fully decomposing it in under two seconds. For our collection of test networks, we observe that the Multistep method is 1.92x faster (mean speedup) than the state-of-the-art Hong et al. SCC method. In addition, we modify the Multistep method to find connected and weakly connected components, as well as introduce a novel algorithm for determining articulation vertices of biconnected components. These approaches all utilize the same underlying BFS and coloring routines.
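As a rough illustration of DFS-free SCC detection, the following Java sketch shows the BFS-based pivot step that approaches like the Multistep method build on (a simplification, not the full Multistep algorithm; names are hypothetical): the SCC containing a pivot vertex is exactly the intersection of its forward-reachable and backward-reachable sets, each computable by a highly parallelizable BFS. Multistep runs this step once to peel off the giant SCC, then switches to trimming and graph coloring for the many small remaining components.

import java.util.*;

// Pivot-based SCC step: SCC(pivot) = FW(pivot) ∩ BW(pivot), found with two
// breadth-first searches and no depth-first search.
public class PivotSccSketch {
    static Set<Integer> bfs(int src, List<List<Integer>> adj) {
        Set<Integer> seen = new HashSet<>(List.of(src));
        Deque<Integer> queue = new ArrayDeque<>(List.of(src));
        while (!queue.isEmpty())
            for (int w : adj.get(queue.poll()))
                if (seen.add(w)) queue.add(w);
        return seen;
    }

    // adj: adjacency lists of the graph; radj: adjacency lists of its transpose.
    static Set<Integer> sccOf(int pivot, List<List<Integer>> adj, List<List<Integer>> radj) {
        Set<Integer> scc = bfs(pivot, adj);   // forward-reachable set
        scc.retainAll(bfs(pivot, radj));      // intersect with backward-reachable set
        return scc;
    }

    public static void main(String[] args) {
        // Cycle 0 -> 1 -> 2 -> 0 with a tail 2 -> 3.
        List<List<Integer>> adj  = List.of(List.of(1), List.of(2), List.of(0, 3), List.of());
        List<List<Integer>> radj = List.of(List.of(2), List.of(0), List.of(1), List.of(2));
        System.out.println(sccOf(0, adj, radj));  // prints [0, 1, 2]; vertex 3 is excluded
    }
}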
ISBN: 9781538672327 (print)
The purpose of trajectory segmentation algorithms is to replace an input trajectory by a sub-trajectory with fewer points than the input, but that is also a good approximation to the original trajectory. As such, trajectory segmentation is an essential pre-processing step for trajectory mining algorithms, such as clustering. Among the segmentation strategies that are commonly used for trajectory clustering is Minimum Description Length (MDL)-based segmentation, which consists of finding a sub-trajectory such that the sum of its distance to the input trajectory and its overall length is minimum. However, there are no efficient algorithms for optimal MDL-based segmentation; there are only approximate algorithms. In this work we fill this gap by proposing a parallel multicore algorithm for MDL-based trajectory segmentation. We use three real-life datasets to show that our algorithm achieves optimal MDL, and compare its performance against Traclus, the state-of-the-art approximate Description Length (DL) segmentation algorithm.
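To illustrate what optimal MDL-based segmentation computes, here is a small serial Java sketch of an exact dynamic program over a simplified MDL cost (assumptions: the per-segment cost below uses only chord length and perpendicular error, whereas the TraClus-style MDL cost also includes an angular term, and the paper's contribution is a parallel multicore algorithm, not this serial one; names are hypothetical).

import java.util.*;

// Optimal segmentation by dynamic programming: replacing points i..j by
// the single chord (i, j) costs
//   L(H)   = log2(1 + chord length)              -- model cost
//   L(D|H) = log2(1 + sum of perpendicular errors) -- data-given-model cost
// and dp[j] is the cheapest segmentation of the prefix ending at point j.
public class MdlSegmentationSketch {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    static double perpDist(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0], dy = b[1] - a[1];
        double len = Math.hypot(dx, dy);
        if (len == 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
        return Math.abs(dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / len;
    }

    static double cost(double[][] pts, int i, int j) {
        double err = 0;
        for (int k = i + 1; k < j; k++) err += perpDist(pts[k], pts[i], pts[j]);
        return log2(1 + Math.hypot(pts[j][0] - pts[i][0], pts[j][1] - pts[i][1]))
             + log2(1 + err);
    }

    // Returns the indices of the chosen characteristic points.
    static List<Integer> segment(double[][] pts) {
        int n = pts.length;
        double[] dp = new double[n];
        int[] from = new int[n];
        for (int j = 1; j < n; j++) {
            dp[j] = Double.POSITIVE_INFINITY;
            for (int i = 0; i < j; i++) {
                double c = dp[i] + cost(pts, i, j);
                if (c < dp[j]) { dp[j] = c; from[j] = i; }
            }
        }
        List<Integer> picks = new ArrayList<>();
        for (int j = n - 1; ; j = from[j]) { picks.add(0, j); if (j == 0) break; }
        return picks;
    }

    public static void main(String[] args) {
        // Nearly straight run along x, then a turn straight up.
        double[][] traj = { {0,0}, {1,0.05}, {2,-0.04}, {3,0}, {3,1}, {3,2} };
        System.out.println(segment(traj));  // prints [0, 3, 5]: collinear runs collapse
    }
}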
ISBN: 9781510810594 (print)
Designing efficient algorithms for many-core and multicore architectures requires using different strategies to allow for the best exploitation of the hardware resources on those architectures. This work describes efficient many-core and multicore large-scale energy calculations for the Monte Carlo Gibbs ensemble using cell lists. Designing Monte Carlo molecular simulations is challenging, as they have less computation and parallelism when compared to similar molecular dynamics applications. Our modified cell list allows for greater speedups in energy calculations on both many-core and multicore architectures when compared to implementations that do not use conventional cell lists. We present our results and analysis of the cell list algorithms for each of the parallel architectures using top-of-the-line GPUs, CPUs, and Intel Xeon Phi coprocessors. We evaluate the performance of the cell list algorithms for different problem sizes and different radial cutoffs.
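For background, the following serial Java sketch shows the conventional cell-list technique that the paper's many-core variant builds on (an illustration only, with hypothetical names and a non-periodic box for simplicity): the simulation box is divided into cells at least as wide as the interaction cutoff, so each particle's energy only needs contributions from its own and the 26 surrounding cells, reducing the O(N^2) pair sum to roughly O(N).

import java.util.*;

// Cell-list energy sum with a truncated Lennard-Jones potential.
public class CellListSketch {
    static final double BOX = 10.0, RCUT = 2.5;
    static final int NC = (int) (BOX / RCUT);   // cells per axis (width >= RCUT)

    static int cellIdx(double x) { return Math.min((int) (x / (BOX / NC)), NC - 1); }
    static int flat(int ix, int iy, int iz) { return (ix * NC + iy) * NC + iz; }

    // Lennard-Jones pair energy with sigma = epsilon = 1, as a function of r^2.
    static double lj(double r2) {
        double inv6 = 1.0 / (r2 * r2 * r2);
        return 4.0 * (inv6 * inv6 - inv6);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double[][] p = new double[500][3];
        for (double[] q : p) for (int d = 0; d < 3; d++) q[d] = rnd.nextDouble() * BOX;

        // Bin particle indices into their cells.
        List<List<Integer>> cells = new ArrayList<>();
        for (int c = 0; c < NC * NC * NC; c++) cells.add(new ArrayList<>());
        for (int i = 0; i < p.length; i++)
            cells.get(flat(cellIdx(p[i][0]), cellIdx(p[i][1]), cellIdx(p[i][2]))).add(i);

        // Total energy: scan only the 27 neighboring cells per particle
        // (non-periodic walls here), counting each pair once.
        double energy = 0;
        for (int i = 0; i < p.length; i++) {
            int cx = cellIdx(p[i][0]), cy = cellIdx(p[i][1]), cz = cellIdx(p[i][2]);
            for (int dx = -1; dx <= 1; dx++)
                for (int dy = -1; dy <= 1; dy++)
                    for (int dz = -1; dz <= 1; dz++) {
                        int nx = cx + dx, ny = cy + dy, nz = cz + dz;
                        if (nx < 0 || ny < 0 || nz < 0 || nx >= NC || ny >= NC || nz >= NC)
                            continue;
                        for (int j : cells.get(flat(nx, ny, nz))) {
                            if (j <= i) continue;            // each pair once
                            double r2 = 0;
                            for (int d = 0; d < 3; d++) {
                                double dr = p[i][d] - p[j][d];
                                r2 += dr * dr;
                            }
                            if (r2 < RCUT * RCUT) energy += lj(r2);
                        }
                    }
        }
        System.out.printf("total pair energy = %.4f%n", energy);
    }
}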