This paper investigates the parallel time and processor complexities of several searching problems involving Monge and Monge-composite arrays. We present array-searching algorithms for concurrent-read-concurrent-write...
详细信息
ISBN:
(纸本)0897913701
This paper investigates the parallel time and processor complexities of several searching problems involving Monge and Monge-composite arrays. We present array-searching algorithms for concurrent-read-concurrent-write (CRCW) PRAMs, concurrent-read-exclusive-write (CREW) PRAMs, hypercubes, cube-connected-cycles, and shuffle-exchange networks. All these algorithms run in optimal time, and their processor-time products are all within an O(lg n) factor of the worst-case sequential bounds. Several applications of these algorithms are also given. Two applications improve previous results substantially, and the others provide novel parallelalgorithms for problems not previously considered.
The design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. The author argues that the chal...
详细信息
ISBN:
(纸本)0897916131
The design and use of scalable parallel computers is discussed. Much of the knowledge gained in the study of distributd systems is relevant, although it needs to be used in new manners. The author argues that the challenge is to achieve maximum reuse of existing technologies, microprocessor architectures, I/O systems, programming languages, operating systems, application codes, without compromising too much parallel performance and scalability. Such skillful compromises are essential to the success of the parallel processing industry, and pose even more intellectual challenges than the 'start from scratch' approach.
This paper presents a comparison of the pragmatic aspects of some parallelalgorithms for finding connected components, together with optimizations on these algorithms. The algorithms being compared are two similar al...
详细信息
Dynamic programming is an efficient technique to solve combinatorial search and optimization problem. There have been many parallel dynamic programming algorithms. The purpose of this paper is to study a family of dyn...
详细信息
ISBN:
(纸本)9781595936677
Dynamic programming is an efficient technique to solve combinatorial search and optimization problem. There have been many parallel dynamic programming algorithms. The purpose of this paper is to study a family of dynamic programming algorithm where data dependence appear between non-consecutive stages, in other words, the data dependence is non-uniform. This kind of dynamic programming is typically called nonserial polyadic dynamic programming. Owing to the: non-uniform data dependence;it is harder to optimize this problem for parallelism and locality on parallelarchitectures. In this paper, we address the chanllenge of exploiting fine grain parallelism and locality of nonserial polyadic dynamic programming on a multi-core architecture. We present a programming and execution model for multi-core architectures with memory hierarchy. In the framework of the new model, the parallelism and locality benifit from a data dependence transformation. We propose a parallel pipelined algorithm for filling the dynamic programming matrix by decomposing the computation operators. The new parallel algorithm tolerates the memory access latency using multi-thread and is easily improved with the technique. We formulate and analytically solve the optimization problem determing the the size that minimizes the total execution time. The experiments on a simulator give a validation of the proposed model and show that the fine grain parallel algorithm achieves sub-linear speedup and that a potential high scalability on multi-core arichitecture.
This paper deals with the problem of parallel construction of trees with optimal weighted path length. We study both the unordered case, known as the Huffman coding problem and the ordered case known as the optimal al...
详细信息
For a sequence of n values, each with an associated integer label, the multiprefix operation calculates a partial sum for each value summing all preceding values with the same label, and for each label, a reduction va...
详细信息
This paper introduces the serial-parallel decision problem. Consider an online scheduler that receives a series of tasks, where each task has both a parallel and a serial implementation. The parallel implementation ha...
详细信息
ISBN:
(纸本)9798400704161
This paper introduces the serial-parallel decision problem. Consider an online scheduler that receives a series of tasks, where each task has both a parallel and a serial implementation. The parallel implementation has the advantage that it can make progress concurrently on multiple processors, but the disadvantage that it is (potentially) work-inefficient. As tasks arrive, the scheduler must decide for each task which implementation to use. We begin by studying total awake time. We give a simple decide-on-arrival scheduler that achieves a competitive ratio of 3 for total awake time-this scheduler makes serial/parallel decisions immediately when jobs arrive. Our second result is an parallel-work-oblivious scheduler that achieves a competitive ratio of 6 for total awake time-this scheduler makes all of its decisions based only on the size of each serial job and without needing to know anything about the parallel implementations. Finally, we prove a lower bound showing that, if a scheduler wishes to achieve a competitive ratio of O(1), it can have at most one of the two aforementioned properties (decide-on-arrival or parallel-work-oblivious). We also prove lower bounds of the form 1 + Omega(1) on the optimal competitive ratio for any scheduler. Next, we turn our attention to optimizing mean response time. Here, we show that it is possible to achieve an O(1) competitive ratio with O(1) speed augmentation. This is the most technically involved of our results. We also prove that, in this setting, it is not possible for a parallel-work-oblivious scheduler to do well. In addition to these results, we present tight bounds on the optimal competitive ratio if we allow for arrival dependencies between tasks (e.g., tasks are components of a single parallel program), and we give an in-depth discussion of the remaining open questions.
parallel randomized algorithms are presented that solve n-dimensional systems of linear equations and compute inverses of n × n non-singular matrices over a field in O((log n)2) time, where each time unit represe...
详细信息
The proceedings contain 40 papers. The topics discussed include: time vs. space trade-offs for rendezvous in trees;allowing each node to communicate only once in a distributed system: shared whiteboard models;optimal ...
ISBN:
(纸本)9781450312134
The proceedings contain 40 papers. The topics discussed include: time vs. space trade-offs for rendezvous in trees;allowing each node to communicate only once in a distributed system: shared whiteboard models;optimal and competitive runtime bounds for continuous, local gathering of mobile robots;online multi-robot exploration of grid graphs with rectangular obstacles;in search of parallel dimensions;delegation and nesting in best-effort hardware transactional memory;design, verification and applications of a new read-write lock algorithm;a lock-free B+tree;brief announcement: the problem based benchmark suite;brief announcement: subgraph isomorphism on a multithreaded shared memory architecture;efficient cache oblivious algorithms for randomized divide-and-conquer on the multicore model;a scalable framework for heterogeneous GPU-based clusters;and faster and simpler width-independent parallelalgorithms for positive semidefinite programming.
The proceedings contain 44 papers. The topics discussed include: deterministic distributed sparse and ultra-sparse spanners and connectivity certificates;fully polynomial-time distributed computation in low-treewidth ...
ISBN:
(纸本)9781450391467
The proceedings contain 44 papers. The topics discussed include: deterministic distributed sparse and ultra-sparse spanners and connectivity certificates;fully polynomial-time distributed computation in low-treewidth graphs;adaptive massively parallelalgorithms for cut problems;preparing for disaster: leveraging precomputation to efficiently repair graph structures upon failures;the energy complexity of Las Vegas leader election;a fully-distributed peer-to-peer protocol for byzantine-resilient distributed hash tables;brief announcement: the (limited) power of multiple identities: asynchronous byzantine reliable broadcast with improved resilience through collusion;brief announcement: composable dynamic secure emulation;and robust and optimal contention resolution without collision detection.
暂无评论