The proceedings contain 7 papers from the symposium on parallelism in algorithms and architectures, SPAA 2003: 15th annualsymposium on parallelism in algorithms and architectures. The topics discussed include: a prac...
详细信息
The proceedings contain 7 papers from the symposium on parallelism in algorithms and architectures, SPAA 2003: 15th annualsymposium on parallelism in algorithms and architectures. The topics discussed include: a practical algorithm for constructing oblivious routing schemes;novel architectures for P2P applications: the continuous-discrete approach;quantifying instruction criticality for shared memory multiprocessors;relaxing the problem-size bound for out-of-core columnsort;the complexity of verifying memory coherence;a near optimal scheduler for switch-memory-switch routers;and on local algorithms for topology control and routing in ad hoc networks.
This paper addresses the question of the future applicability of bus-based multiprocessors, by studying the execution characteristics of a number of data parallel numerical and scientific applications, as the system p...
详细信息
ISBN:
(纸本)9780897917179
This paper addresses the question of the future applicability of bus-based multiprocessors, by studying the execution characteristics of a number of data parallel numerical and scientific applications, as the system parameters of processor speed, bus bandwidth, and cache size are scaled (holding the number of processors fixed). This study focuses on 'asymptotic' results. Specifically, it addresses whether the application eventually become communication bound as the machine and the application are scaled. Essentially, it addresses the question of whether the machine becomes unsuitable as a parallel computing platform for the application. A transient analysis on the rate which the asymptotic results take hold is also performed.
This paper addresses fast parallel methods for the computation of the Radon (Hough) Transform. The Radon Transform (RT) of an image is a set of projections of the image taken at different angles. Its computation is ex...
详细信息
ISBN:
(纸本)089791483X
This paper addresses fast parallel methods for the computation of the Radon (Hough) Transform. The Radon Transform (RT) of an image is a set of projections of the image taken at different angles. Its computation is extremely important in image processing and computer vision, for problems such as pattern recognition and reconstruction of CAT scan images. We present a unique new method for combining partial results which allows us to construct a parallel algorithm which computes an approximation to the RT in O(logN) time for an N by N input array, using O(N2) processors. This is a lower processor-time product than all previous techniques. The method appears to be quite simple and practical. In addition, this algorithm appears to be quite well-suited to implementation on strict SIMD array-type computers. We discuss SIMD-parallel mappings of the algorithm for mesh, butterfly, and hypercube architectures.
An HMM is a very simple parallel machine consisting of finite state devices that can manipulate pointers to each other. The more commonly studied PRAM is a much richer parallel machine with a shared global memory and ...
Partitioning a graph into blocks of roughly equal weight while cutting only few edges is a fundamental problem in computer science with numerous practical applications. While shared-memory parallel partitioners have r...
详细信息
ISBN:
(纸本)9798400704161
Partitioning a graph into blocks of roughly equal weight while cutting only few edges is a fundamental problem in computer science with numerous practical applications. While shared-memory parallel partitioners have recently matured to achieve the same quality as widely used sequential partitioners, there is still a pronounced quality gap between distributed partitioners and their sequential counterparts. In this work, we shrink this gap considerably by describing the engineering of an unconstrained local search algorithm suitable for distributed partitioners. We integrate the proposed algorithm in a distributed multilevel partitioner. Our extensive experiments show that the resulting algorithm scales to thousands of PEs while computing cuts that are, on average, only 3.5% larger than those of a state-of-the-art high-quality shared-memory partitioner. Compared to previous distributed partitioners, we obtain on average 6.8% smaller cuts than the best-performing competitor while being more than 9 times faster.
We present the first near-linear work and poly-logarithmic depth algorithm for computing a minimum cut in an undirected graph. Previous parallelalgorithms with poly-logarithmic depth required at least quadratic work ...
详细信息
We present the first near-linear work and poly-logarithmic depth algorithm for computing a minimum cut in an undirected graph. Previous parallelalgorithms with poly-logarithmic depth required at least quadratic work in the number of vertices. In a graph with n vertices and m edges, our randomized algorithm computes the minimum cut with high probability in O(m log(4) n) work and O(log(3) n) depth. This result is obtained by parallelizing a data structure that aggregates weights along paths in a tree, in addition exploiting the connection between minimum cuts and approximate maximum packings of spanning trees. In addition, our algorithm improves upon bounds on the number of cache misses incurred to compute a minimum cut.
Techniques for quickly executing lengthy computations by the use of molecular parallelism are described. It is demonstrated that molecular computations can be done using short DNA strands by more or less conventional ...
详细信息
ISBN:
(纸本)9780897917179
Techniques for quickly executing lengthy computations by the use of molecular parallelism are described. It is demonstrated that molecular computations can be done using short DNA strands by more or less conventional biotechnology engineering techniques within a small number of laboratory steps. Two abstract models of molecular computation are proposed. The first, the parallel Associative Memory Model, is a very high level model which includes a parallel Associative Matching operation, that appears to improve the power of molecular parallelism beyond the operations previously considered by Lipton (1994). A Recombinant DNA Model is also proposed which is a low level model that allows operations that are abstractions of very well understood recombinant DNA operations and provides a representation, herein called complex, for the relevant structural properties of DNA.
we compare the Paraleap FPCA computer, a 64-processor hardware prototype of the PRAM-driven XMT architecture, with an Intel Core 2 Duo processor and show that Paraleap outperforms the Intel processor by up to 13.89x i...
详细信息
ISBN:
(纸本)9781605586069
we compare the Paraleap FPCA computer, a 64-processor hardware prototype of the PRAM-driven XMT architecture, with an Intel Core 2 Duo processor and show that Paraleap outperforms the Intel processor by up to 13.89x in terms of cycle counts. The comparison favors the Intel design, since the silicon area of an ASIC implementation of the 64-processor XMT design is the same as that of a single core.
We present parallel greedy approximation algorithms for set cover and related problems. These algorithms build on an algorithm for solving a graph problem we formulate and study called Maximal Nearly Independent Set (...
详细信息
ISBN:
(纸本)9781450307437
We present parallel greedy approximation algorithms for set cover and related problems. These algorithms build on an algorithm for solving a graph problem we formulate and study called Maximal Nearly Independent Set (MANIS)-a graph abstraction of a key component in existing work on parallel set cover. We derive a randomized algorithm for MANIS that has O(m) work and O(log(2)m) depth on input with m edges. Using MANIS, we obtain RNC algorithms that yield a (1 + epsilon)H-n-approximation for set cover, a (1 - 1/e - epsilon)-approximation for max cover and a (4 + epsilon)-approximation for min-sum set cover all in linear work;and an O(log*n)-approximation for asymmetric k-center for k <= log(O(1)) n and a (1.861 + epsilon)-approximation for metric facility location both in essentially the same work bounds as their sequential counterparts.
暂无评论