Tree search algorithms play an important role in many applications in the field of artificial intelligence. When playing board games like chess etc., computers use game tree search algorithms to evaluate a position. I...
详细信息
ISBN:
(纸本)9781581134094
Tree search algorithms play an important role in many applications in the field of artificial intelligence. When playing board games like chess etc., computers use game tree search algorithms to evaluate a position. In this paper, we present a procedure that we call parallel Controlled Conspiracy Number Search (parallel CCNS). Shortly, we describe the principles of the sequential CCNS algorithm, which bases its approximation-results on irregular subtrees of the entire game tree. We have parallelized CCNS and implemented it in our chess program ***, which now is the first in the world, that could win a high-ranked Grandmaster chess-tournament. We add experiments that show a speedup of about 50 on 159 processors running on an SCI-workstation-cluster.
A hierarchical communication model was presented to study the scheduling tasks with small communication delays for cluster of processors. The model captured hierarchical nature of communication in parallel computers. ...
详细信息
ISBN:
(纸本)9781581134094
A hierarchical communication model was presented to study the scheduling tasks with small communication delays for cluster of processors. The model captured hierarchical nature of communication in parallel computers. An approximation algorithm was proposed for the small communication time (SCT) hierarchical model with an unbounded number of bi-processor machines. It was found that the communication overhead did not interfere with the availability of processors and allowed all processors to execute other tasks.
The prefetch schedulng problem and output scheduling problem with respect to parallel disk input and output, respectively, are considered. A fixed size pool of empty memory buffers and the sequence of block write requ...
详细信息
ISBN:
(纸本)9781581134094
The prefetch schedulng problem and output scheduling problem with respect to parallel disk input and output, respectively, are considered. A fixed size pool of empty memory buffers and the sequence of block write requests are taken as the input by the output scheduling problem. The prefetch scheduling problem takes the pool of empty memory buffers and the sequence of block read requests as the input. The duality between the two problems and the analysis of applications in one domain with respect to the applications in an another domain is also illusttrated.
This paper presents a work-optimal CGM algorithm that solves the Longest Increasing Subsequence Problem. It can be implemented in the CGM with P processors in O(N2/P) time and O(P) communication steps. It is the first...
详细信息
This paper presents a work-optimal CGM algorithm that solves the Longest Increasing Subsequence Problem. It can be implemented in the CGM with P processors in O(N2/P) time and O(P) communication steps. It is the first CGM algorithm for this problem and it is work-optimal since the sequential algorithm has a complexity of O(N2).
A basic problem in hypergraphs is that of finding a large independent set-one of guaranteed size-in a given hypergraph. Understanding the parallel complexity of this and related independent set problems on hypergraphs...
详细信息
ISBN:
(纸本)9781581134094
A basic problem in hypergraphs is that of finding a large independent set-one of guaranteed size-in a given hypergraph. Understanding the parallel complexity of this and related independent set problems on hypergraphs is a fundamental open issue in parallel computation. Caro and Tuza (J. Graph Theory, Vol. 15, pp. 99-107, 1991) have shown a certain lower bound αk(H) on the size of a maximum independent set in a given k-uniform hypergraph H, and have also presented an efficient sequential algorithm to find an independent set of size αk(H). They also show that αk(H) is the size of the maximum independent set for various hypergraph families. Here, we develop the first RNC algorithm to find an independent set of size αk (H), and also derandomize it for various special cases. We also present lower bounds on independent set size and corresponding RNC algorithms for non-uniform hypergraphs.
Discrete ordinates methods are commonly used to simulate radiation transport for fire or weapons modeling. The computation proceeds by sweeping the flux across a grid. A particular cell cannot be computed until all th...
详细信息
ISBN:
(纸本)9781581134094
Discrete ordinates methods are commonly used to simulate radiation transport for fire or weapons modeling. The computation proceeds by sweeping the flux across a grid. A particular cell cannot be computed until all the cells immediately upwind of it are finished. If the directed dependence graph for the grid cells contains a cycle then sweeping methods will deadlock. This can happen in unstructured grids and time stepped problems where the grid is allowed to deform. In this paper we present a parallel algorithm to detect cycles in the dependence graphs present in these grids as well as an implementation and experimental results on shared and distributed memory machines.
For 2-D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time under the BSP model. We consider uniform dependency computations, tiled so that (at leas...
详细信息
ISBN:
(纸本)9781581134094
For 2-D iteration space tiling, we address the problem of determining the tile parameters that minimize the total execution time under the BSP model. We consider uniform dependency computations, tiled so that (at least) one of the tile boundaries is parallel to the domain boundary. We determine the optimal tile size as a closed form solution. In addition, we determine the optimal number of processors and also the optimal slope of the oblique tile boundary. Our predictions are validated, among other examples, on a sequence alignment problem specialized to similar sequences using Ficket's "k-band" algorithm, for which, our optimal semi-oblique tiling yields an improvement over orthogonal tiling by a factor of 2.5. Our optimal solution requires a block-cyclic distribution of tiles to processors. The best one can obtain with only block distribution (as many authors require) is 3 times slower.
We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm for optimal parallel-disk scheduling. Traditional buffer management algorithms that minimize the number of I/O dis...
详细信息
We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm for optimal parallel-disk scheduling. Traditional buffer management algorithms that minimize the number of I/O disk accesses, are substantially suboptimal in a parallel I/O system where multiple I/Os can proceed simultaneously. We present a new algorithm SUPERVISOR for parallel-disk I/O scheduling. We show that in the off-line case, where apriori knowledge of all the requests is available, SUPERVISOR performs the minimum number of I/Os to service the given I/O requests. This is the first parallel I/O scheduling algorithm that is provably offiine optimal. In the on-line case, we study SUPERVISOR in the context of global L-block lookahead, which gives the buffer management algorithm a lookahead consisting of L distinct requests. We show that the competitive ratio of SUPERVISOR, with global L-block lookahead, is θ(M - L + D), when L &le M, and θ(MD/L), when L > M, where the number of disks is D and buffer size is M.
Von zur Gathen proposed an efficient parallel exponentiation algorithm in finite fields using normal basis representations. In this paper we present a processor-efficient parallel exponentiation algorithm in GF(2n) wh...
详细信息
ISBN:
(纸本)9781581134094
Von zur Gathen proposed an efficient parallel exponentiation algorithm in finite fields using normal basis representations. In this paper we present a processor-efficient parallel exponentiation algorithm in GF(2n) which improves upon von zur Gathen's algorithm. We also show that exponentiation in GF(2n) can be done in O(log n) time using n/(log n)2 processors. Hence we get processor × time bound O(n/log n), which is optimal. Finally, we present an on-line processor assignment scheme which was missing in von zur Gathen's algorithm, and show that its time complexity is negligible.
We present the design and implementation of a parallel out-of-core sorting algorithm, which is based on Leighton's colunmsort algorithm. We show how to relax some of the steps of the original columnsort algorithm ...
详细信息
ISBN:
(纸本)9781581134094
We present the design and implementation of a parallel out-of-core sorting algorithm, which is based on Leighton's colunmsort algorithm. We show how to relax some of the steps of the original columnsort algorithm to permit a faster out-of-core implementation. Our algorithm requires only 4 passes over the data, and a 3-pass implementation is possible. Although there is a limit on the number of records that can be sorted-as a function of the memory used per processor-this upper limit need not be a severe restriction, and it increases superlinearly with the per-processor memory. To the best of our knowledge, our implementation is the first out-of-core multiprocessor sorting algorithm whose output is in the order assumed by the parallel Disk Model. We define several measures of sorting efficiency and demonstrate that our implementation's sorting efficiency is competitive with that of NOW-Sort, a sorting algorithm developed to sort large amounts of data quickly on a cluster of workstations.
暂无评论