We show that isomorphism of trees and outerplanar graphs can be tested in O(log n) time with n/log n processors on a CRCW PRAM and in O(log^2 n) time with n/log^2 n processors on an EREW PRAM. This gives the first optima...
ISBN: 3540642307 (print)
In this paper we describe deterministic parallel algorithms for planar point location and for building the Voronoi diagram of n coplanar points. These algorithms are designed for BSP-like models of computation, where p processors, each with O(n/p) ≫ O(1) local memory, communicate through an arbitrary interconnection network. They are communication-efficient: they require, respectively, O(1) and O(log p) communication steps and O((n log n)/p) local computation per step. Both algorithms require O(n/p) = Ω(p) local memory.
ISBN: 9781479907298 (print)
Graph algorithms play a prominent role in several fields of science and engineering. Notable among them are graph traversal, finding the connected components of a graph, and computing shortest paths. There are several efficient implementations of these problems on a variety of modern multiprocessor architectures. In recent times, the graphs that correspond to real-world data sets have been growing in size. Parallelism offers only limited relief, as current parallel architectures have severe shortcomings when deployed for most graph algorithms. At the same time, these graphs are also becoming very sparse. This calls for work-efficient solutions aimed specifically at processing large, sparse graphs on modern parallel architectures. In this paper, we introduce graph pruning as a technique that aims to reduce the size of the graph. Certain elements of the graph can be pruned depending on the nature of the computation. Once a solution is obtained for the pruned graph, the solution is extended to the entire graph. We apply this technique to three fundamental graph algorithms: Breadth First Search (BFS), Connected Components (CC), and All Pairs Shortest Paths (APSP). To validate our technique, we implement our algorithms on a heterogeneous platform consisting of a multicore CPU and a GPU. On this platform, we achieve an average improvement of 35% compared to state-of-the-art solutions. Such an improvement has the potential to speed up other applications that rely on these algorithms.
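The prune, solve, extend pattern described in this abstract can be illustrated with a minimal sequential sketch. The abstract does not say which elements the authors prune, so the choice below (repeatedly removing degree-1 vertices) and all function names are assumptions made for illustration, not the paper's method. The graph is assumed to be simple and undirected, given as a symmetric adjacency dictionary.

```python
from collections import deque


def prune_pendants(adj, keep):
    """Repeatedly remove degree-1 vertices, never removing `keep`.

    Returns (core_adj, removed), where `removed` lists (vertex, attachment)
    pairs in removal order. Assumed pruning rule, for illustration only.
    """
    deg = {v: len(ns) for v, ns in adj.items()}
    alive = set(adj)
    queue = deque(v for v in adj if deg[v] == 1 and v != keep)
    removed = []
    while queue:
        v = queue.popleft()
        if v not in alive or deg[v] != 1:
            continue  # degree changed since the vertex was queued
        u = next(w for w in adj[v] if w in alive)  # the single live neighbour
        alive.remove(v)
        removed.append((v, u))
        deg[u] -= 1
        if deg[u] == 1 and u != keep:
            queue.append(u)
    core = {v: [w for w in adj[v] if w in alive] for v in alive}
    return core, removed


def bfs(adj, source):
    """Plain BFS returning hop distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def bfs_with_pruning(adj, source):
    """BFS on the pruned core, then extend distances to pruned vertices."""
    core, removed = prune_pendants(adj, keep=source)
    dist = bfs(core, source)
    # A pruned vertex is reachable only through its attachment vertex, which
    # was removed later (or kept), so reverse removal order resolves it first.
    for v, u in reversed(removed):
        if u in dist:
            dist[v] = dist[u] + 1
    return dist


# A 4-cycle (0-1-2-3) with a tail 2-4-5 and a leaf 6 attached to 1.
example = {
    0: [1, 3], 1: [0, 2, 6], 2: [1, 3, 4], 3: [0, 2],
    4: [2, 5], 5: [4], 6: [1],
}
distances = bfs_with_pruning(example, 0)
```

Here the BFS itself only visits the four cycle vertices; the distances of vertices 4, 5, and 6 are filled in during the extension step.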
ISBN: 9781424444564 (print)
The main objective of this paper is the development of a new parallel integration algorithm for solving boundary value problems (BVPs) in ordinary differential equations (ODEs). This algorithm is suitable for running on MIMD computing systems. We analyze the stability and error control of the developed algorithm. The treatment of stiff boundary value problems by the developed technique is also considered. Finally, we generalize the method to higher-order BVPs.
ISBN: 3540309357 (print)
In this paper, we propose two efficient parallel algorithms for constructing a k-tree center and a k-tree core of a tree network, respectively. Both algorithms take O(log n) time using O(n) work on the EREW PRAM. Our algorithms improve on the algorithms previously proposed by Wang (IEEE Trans. Par. Dist. Sys. 1998) and Peng et al. (J. Algorithms 1993).
ISBN: 0818675829 (print)
The rise of explicit parallel programming involves new problems: lack of structure for parallel algorithms and the ad hoc development of parallel algorithms. We use skeletons to characterize and design parallel algorithms and define a process to refine the designs step-by-step into programs. This paper introduces a high-level library on top of MPI which is derived from the skeleton concept to achieve better programmability and obtain portability. We conclude with a CFD application to demonstrate our idea.
This paper presents efficient sequential and parallel algorithms for the minimum cost flow problem on planar networks. Our algorithms are based on the interior point method for linear programming, and make full use of...
ISBN: 9780889866386 (print)
The memory consistency model is crucial to the performance of shared-memory multiprocessors, and current architectures adopt several different models. In this paper, using graph algorithms for illustrative purposes, we consider the impact of the memory model on the implementation and performance of parallel algorithms on shared-memory multiprocessors. We show that the implementation of PRAM algorithms is largely "oblivious" to the underlying memory model and performs well on relaxed models. More importantly, we show that different memory models can favor drastically different algorithm designs.
ISBN: 9781450390682 (print)
The longest common subsequence (LCS) problem on a pair of strings is a classical problem in string algorithms. Its extension, the semi-local LCS problem, provides a more detailed comparison of the input strings, without any increase in asymptotic running time. Several semi-local LCS algorithms have been proposed previously; however, to the best of our knowledge, none have yet been implemented. In this paper, we explore a new hybrid approach to the semi-local LCS problem. We also propose a novel bit-parallel LCS algorithm. In the experimental part of the paper, we present an implementation of several existing and new parallel LCS algorithms and evaluate their performance.
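For readers unfamiliar with bit-parallel LCS, the sketch below shows the classical bit-vector recurrence from the earlier literature, in which one machine-word update processes a whole column of the dynamic-programming table. It is not the novel algorithm proposed in this paper, whose details the abstract does not give; the function name is an assumption, and Python's arbitrary-precision integers stand in for machine words.

```python
def lcs_length_bitparallel(a: str, b: str) -> int:
    """Length of the LCS of `a` and `b` via the classical bit-vector method.

    Bit i of `v` is 0 exactly when the DP column value increases at row i,
    so the LCS length is the number of zero bits among the low len(a) bits.
    """
    m = len(a)
    if m == 0:
        return 0
    mask = (1 << m) - 1

    # match[ch] has bit i set iff a[i] == ch.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    v = mask  # all ones: no matches recorded yet
    for ch in b:
        u = v & match.get(ch, 0)
        # One addition and one subtraction update the whole column at once;
        # the mask discards the carry out of the top bit.
        v = ((v + u) | (v - u)) & mask
    return m - bin(v).count("1")


length = lcs_length_bitparallel("ABCBDAB", "BDCABA")
```

Each character of `b` costs a constant number of word operations per machine word of `a`, which is the source of the speedup over the cell-by-cell dynamic program.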
ISBN: 9781595939746 (print)
In this article we propose parallel algorithms for the construction of conforming finite-element discretizations on linear octrees. Existing octree-based discretizations scale to billions of elements, but the complexity constants can be high. In our approach we use several techniques to minimize overhead: a novel bottom-up tree construction and 2:1 balance constraint enforcement; a Golomb-Rice encoding for compression by representing the octree and element connectivity as a Uniquely Decodable Code (UDC); overlapping communication and computation; and byte alignment for cache efficiency. The cost of applying the Laplacian is comparable to that of applying it using a direct-indexing regular-grid discretization with the same number of elements. Our algorithm has scaled up to four billion octants on 4096 processors on a Cray XT3 at the Pittsburgh Supercomputing Center. The overall tree construction time is under a minute, in contrast to previous implementations that required several minutes; the evaluation of the discretization of a variable-coefficient Laplacian takes only a few seconds.
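As background for the compression step, the following is a minimal sketch of a Rice code (the Golomb code with a power-of-two divisor) applied to a list of non-negative integers. How the authors map octree and connectivity data onto such integers is not stated in the abstract, so the function names, the bit-string representation, and the parameter `k` are assumptions chosen for readability rather than efficiency.

```python
def rice_encode(values, k):
    """Encode non-negative integers with a Rice code of parameter k.

    Each value n is split into q = n >> k (written in unary as q ones and a
    terminating zero) and the low k bits of n (written in binary).
    """
    pieces = []
    for n in values:
        q, r = n >> k, n & ((1 << k) - 1)
        pieces.append("1" * q + "0")
        if k:
            pieces.append(format(r, f"0{k}b"))
    return "".join(pieces)


def rice_decode(bits, k):
    """Invert rice_encode; the code is uniquely decodable without separators."""
    values, i = [], 0
    while i < len(bits):
        q = 0
        while bits[i] == "1":  # unary quotient
            q += 1
            i += 1
        i += 1  # skip the terminating zero
        r = int(bits[i:i + k], 2) if k else 0
        i += k
        values.append((q << k) | r)
    return values


encoded = rice_encode([5, 0, 9], 2)
decoded = rice_decode(encoded, 2)
```

Small values get short codewords, so the scheme pays off when most of the encoded integers are small relative to 2^k.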