ISBN: (Print) 9780897917179
An efficient design and implementation of the collective communication part of a Message Passing Interface (MPI), optimized for clusters of workstations, is described. The system, which consists of two main components, the MPI-CCL layer and a User-level Reliable Transport Protocol (URTP), is integrated with the operating system via an efficient kernel extension mechanism. The system is implemented on a collection of IBM RS/6000 workstations connected via a 10 Mbit/s Ethernet LAN. Results indicate that the MPI Broadcast (on top of Ethernet) is about twice as fast as a recently published software implementation of broadcast on top of ATM.
ISBN: (Print) 9780897917179
An n × m (0,1)-matrix is said to satisfy the consecutive-ones property if there is a permutation of the rows of the matrix such that in each column all non-zero entries are adjacent. The problem of determining such a permutation, if one exists, is the consecutive-ones property problem. Previously, Klein and Reif [13] gave a parallel solution for the consecutive-ones property problem with an algorithm based on complicated parallel PQ-tree manipulations. The work complexity of this algorithm was improved in [14] to run in time O(log² n) with a linear number of CRCW processors. We present a new algorithm for this problem, based on a less sophisticated data structure, that improves upon the processor bounds of the previous algorithms by a factor of log n/log log n in general, and by a factor of log n for sufficiently dense problem instances. Our algorithm uses a novel divide-and-conquer approach, and uses as its fundamental data structure the decomposition of graphs into tri-connected components. Solutions to the consecutive-ones problem have important applications to a variety of problems in computational molecular biology, databases, distributed computing, VLSI placement and routing, and graph and network theory.
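To make the definition concrete, the following is a minimal brute-force sketch of the consecutive-ones property test. It is not the parallel algorithm of the abstract (nor the PQ-tree approach); it simply tries every row permutation, so it is only usable on tiny matrices. The function names `find_c1p_order` and `column_is_consecutive` are hypothetical, chosen for illustration.

```python
from itertools import permutations


def column_is_consecutive(column):
    """True if all non-zero entries of the column are adjacent."""
    ones = [i for i, value in enumerate(column) if value]
    # Either no ones at all, or the span between first and last one
    # contains exactly as many positions as there are ones.
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def find_c1p_order(rows):
    """Return a row ordering (tuple of row indices) under which every
    column has its ones adjacent, or None if no such ordering exists.

    Exhaustive search over all n! row permutations: illustration only.
    """
    n = len(rows)
    m = len(rows[0]) if rows else 0
    for order in permutations(range(n)):
        if all(
            column_is_consecutive([rows[i][j] for i in order])
            for j in range(m)
        ):
            return order
    return None
```

For example, the matrix with rows `[1,0]`, `[0,1]`, `[1,1]` has the property (placing the third row between the other two works), whereas a 4 × 3 matrix in which one row shares a distinct column with each of the three other rows does not, because that row would need three neighbours in a linear order.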
ISBN: (Print) 9780897917179
In this paper we study the question: How useful is randomization in speeding up Exclusive Write PRAM computations? Our results give further evidence that randomization is of limited use in these types of computations. First we examine a compaction problem on both the CREW and EREW PRAM models, and we present randomized lower bounds which match the best deterministic lower bounds known. (For the CREW PRAM model, the lower bound is asymptotically optimal.) These are the first non-trivial randomized lower bounds known for the compaction problem on these models. We show that our lower bounds also apply to the problem of approximate compaction. Next we examine the problem of computing Boolean functions on the CREW PRAM model, and we present a randomized lower bound which improves on the previous best randomized lower bound for many Boolean functions, including the OR function. (The previous lower bounds for these functions were asymptotically optimal, but we improve the constant multiplicative factor.) We also give an alternate proof for the randomized lower bound on PARITY, which was already optimal to within a constant additive factor. Lastly, we give a randomized lower bound for integer merging on an EREW PRAM which matches the best deterministic lower bound known. In all our proofs, we use the Random Adversary method, which has previously been used only for proving lower bounds on models with Concurrent Write capabilities. Thus this paper also serves to illustrate the power and generality of this method for proving parallel randomized lower bounds.
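For readers unfamiliar with the compaction problem, here is a small sequential sketch under one common formulation, which is an assumption and not stated in the abstract: given an array in which some cells are empty, move the non-empty items to the front while preserving their order. The prefix-sum ranking used below is a standard technique, not anything taken from the paper (which proves lower bounds rather than giving an algorithm); the name `compact` and the use of `None` for empty cells are illustrative choices.

```python
from itertools import accumulate


def compact(cells):
    """Move the non-empty items of `cells` to the front, keeping order.

    Empty cells are represented by None (an assumption of this sketch).
    Each item's destination is its rank among the non-empty cells,
    obtained from an inclusive prefix sum over 0/1 occupancy flags.
    """
    flags = [0 if cell is None else 1 for cell in cells]
    ranks = list(accumulate(flags))  # ranks[i] = number of items in cells[0..i]
    out = [None] * (ranks[-1] if ranks else 0)
    for cell, flag, rank in zip(cells, flags, ranks):
        if flag:
            out[rank - 1] = cell  # each item writes to its own distinct slot
    return out
```

Because every item writes to a distinct output position, the final scatter step involves no concurrent writes, which is why prefix-sum ranking is a natural fit for exclusive-write models; the cost lies in computing the prefix sums themselves.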