The queue-read, queue-write (QRQW) parallel random access machine ( PRAM) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one...
详细信息
The queue-read, queue-write (QRQW) parallel random access machine ( PRAM) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The QRQW PRAM model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied CRCW PRAM or EREW PRAM models, and can be efficiently emulated with only logarithmic slowdown on hypercube-type noncombining networks. This paper describes fast, low-contention, work-optimal, randomized QRQW PRAM algorithms for the fundamental problems of load balancing, multiple compaction, generating a random permutation, parallel hashing, and distributive sorting, These logarithmic or sublogarithmic time algorithms considerably improve upon the best known EREW PRAM algorithms for these problems, while avoiding the high-contention steps typical of CRCW PRAM algorithms. An illustrative experiment demonstrates the performance advantage of a new QRQW random permutation algorithm when compared with the popular EREW algorithm. Finally, this paper presents new randomized algorithms for integer sorting and general sorting. (C) 1996 Academic Press, Inc.
The strong link between matroids and matching is used to extend the ideas that resulted in the design of random NC (RNC) algorithms for matching to obtain RNC algorithms for the matroid union, intersection, and matchi...
详细信息
The strong link between matroids and matching is used to extend the ideas that resulted in the design of random NC (RNC) algorithms for matching to obtain RNC algorithms for the matroid union, intersection, and matching problems, and for linearly representable matroids. As a consequence, RNC algorithms for the well-known problems of finding an arborescence and a maximum cardinality set of edge-disjoint spanning trees in a graph are obtained. The key tools used are linear algebra and randomization.
PANDA is a future hadron and nuclear physics experiment at the FAIR facility in construction in Darmstadt, Germany. Unlike the majority of current experiments, PANDA's strategy for data acquisition is based on onl...
详细信息
beta-skeletons, prominent members of the neighborhood graph family, have interesting geometric properties and various applications ranging from geographic networks to archeology. This paper focuses on computing the be...
详细信息
beta-skeletons, prominent members of the neighborhood graph family, have interesting geometric properties and various applications ranging from geographic networks to archeology. This paper focuses on computing the beta-spectrum, a labeling of the edges of the Delaunay triangulation, DT(V), which makes it possible to quickly find the lune-ased beta-skeleton of V for any query value beta is an element of [1,2]. We consider planar n-point sets V with L-p metric, 1 < p < infinity. We present an O (n log(2) n) time sequential, and an O (log(4) n) time parallel, beta-spectrum labeling. We also show a parallel algorithm, which for a given beta is an element of [1,2] finds the lune-based beta-skeleton in O (log(2) n) time. The parallel algorithms use O(n) processors in the CREW-PRAM model. (C) 2015 Elsevier B.V. All rights reserved.
We describe the design and implementation of efficient parallel algorithms, and a software library for the parallel implementation of compressed octree data structures. Octrees are widely used in supporting hierarchic...
详细信息
We describe the design and implementation of efficient parallel algorithms, and a software library for the parallel implementation of compressed octree data structures. Octrees are widely used in supporting hierarchical methods for scientific applications such as the N-body problem, molecular dynamics and smoothed particle hydrodynamics. The primary goal of our work is to identify and abstract the commonalities present in various hierarchical methods using octrees, design efficient parallel algorithms for them, and encapsulate them in a software library. We designed provably efficient parallel algorithms and implementation strategies that perform well irrespective of the spatial distribution of data in the computational domain. The library will enable rapid development of applications, allowing application developers to use efficient parallel algorithms developed for this purpose, without the necessity of having detailed knowledge of the algorithms or of implementing them. The software is developed in C using the Message Passing Interface (MPI). We report experimental results on an IBM xSeries parallel computer. (c) 2005 Elsevier B.V. All rights reserved.
For any fixed1 k > 0, we obtain (a) an O(alpha2 log alpha log(k)n) time parallel algorithm for (alpha + 1) coloring an alpha-degree graph with n/log(k)n processors on an EREW PRAM, (b) an O(alpha2 log alpha log(k)n...
详细信息
For any fixed1 k > 0, we obtain (a) an O(alpha2 log alpha log(k)n) time parallel algorithm for (alpha + 1) coloring an alpha-degree graph with n/log(k)n processors on an EREW PRAM, (b) an O(alpha2 log alpha log(k)n) time parallel algorithm for 3-coloring an alpha-degree rooted tree with n/log(k)n processors on a CREW PRAM, (c) an O(log(k)n) time parallel algorithm for finding a Maximal Independent Set of a rooted tree on an ARBITRARY CRCW PRAM.
The intelligence of traditional database systems can be improved by recursion. Using recursion, relational database systems are extended into knowledge-base systems (deductive database systems). Linear recursion is th...
详细信息
The intelligence of traditional database systems can be improved by recursion. Using recursion, relational database systems are extended into knowledge-base systems (deductive database systems). Linear recursion is the most frequently found type of recursion in deductive databases. Deductive databases queries are computationally intensive and lend themselves naturally to parallelization to speed up the solution of such queries. In this paper, parallel algorithms to solve the generalized fully and partially instantiated forms of the same generation query in deductive databases are presented. The algorithms use special data structures, namely, a special matrix that stores paths from source nodes of the graph representing a two-attribute normalized database relation to all nodes reachable from these source nodes, and a reverse matrix that stores paths from any node to all source nodes related to that node.
Combinational logic synthesis is a very important but computationally expensive phase of VLSI system design. parallel processing offers an attractive solution to reduce this design cycle time. In this paper, we descri...
详细信息
Combinational logic synthesis is a very important but computationally expensive phase of VLSI system design. parallel processing offers an attractive solution to reduce this design cycle time. In this paper, we describe ProperMIS, a portable parallel algorithm for logic synthesis based on the MIS multi-level logic synthesis system. As part of this work, we have developed novel parallel algorithms for the different logic transformations of the MIS system. Our algorithm uses an asynchronous message-driven computing model with no synchronizing barriers separating phases of parallel computation. The algorithm is portable across a wide variety of parallel architectures, and is built around a well-defined sequential algorithm interface, so that we can benefit from future expansion of the sequential algorithm. We present results on several MCNC and ISCAS benchmark circuits for a variety of shared memory and distributed processing architectures. Our implementation produces speedups of an average of 4 on 8 processors.
An O(n) time sequential algorithm is presented for computing the diameter and the center of an interval graph with n vertices given the interval representation of the graph as a sorted list of endpoints. Two parallel ...
详细信息
An O(n) time sequential algorithm is presented for computing the diameter and the center of an interval graph with n vertices given the interval representation of the graph as a sorted list of endpoints. Two parallel algorithms are also presented for the same problems. These algorithms work in O(n/P + log n) time on the EREW PRAM using P processors and O (n) space.
In this paper, the design and implementation of a single shared bus, shared memory multiprocessing system using Intel's single board computers is presented. The hardware configuration and the operating system deve...
详细信息
In this paper, the design and implementation of a single shared bus, shared memory multiprocessing system using Intel's single board computers is presented. The hardware configuration and the operating system developed to execute the parallel algorithms are discussed. The performance evaluation studies carried out on shamp are outlined.
暂无评论