this paper introduces the APHID (Asynchronous parallel Hierarchical Iterative Deepening) game-tree search algorithm. APHID represents a departure from the approaches used in practice. Instead of parallelism based on t...
详细信息
this paper introduces the APHID (Asynchronous parallel Hierarchical Iterative Deepening) game-tree search algorithm. APHID represents a departure from the approaches used in practice. Instead of parallelism based on the minimal search tree, APHID uses a truncated game-tree and all of the leaves of that tree are searched in parallel. APHID has been programmed as an easy to implement, game-independent αβ library, and has been tested on several game-playing programs. Results for an Othello program are presented here. the algorithm yields good parallel performance on a network of workstations, without using a shared transposition table.
Indexing multidimensional data is inherently complex leading to slow query processing. this behavior becomes more pronounced withthe increase in database size and/or number of dimensions. In this paper, we address th...
详细信息
Indexing multidimensional data is inherently complex leading to slow query processing. this behavior becomes more pronounced withthe increase in database size and/or number of dimensions. In this paper, we address this issue by processing an index structure in parallel. First, we study different ways of partitioning an index structure. We then propose efficient algorithms for processing each query in parallel on the index structure. Using these strategies, we parallelized two multidimensional index structures - R* and LIB and evaluated the performance gains for the Gazetteer and the Catalog data of the Alexandria Digital Library on the Meiko CS-2.
Homing sequences play an important role in the testing of finite state systems and have been used in a number of applications such as hardware fault-detection, protocol verification, and learning algorithms etc. Recen...
详细信息
Homing sequences play an important role in the testing of finite state systems and have been used in a number of applications such as hardware fault-detection, protocol verification, and learning algorithms etc. Recent applications of homing sequences involve large DFA's withthousands of states. Such applications motivate the design of a parallel algorithm for this problem. Here we present a deterministic parallel algorithm of time complexity O(√nlog2n) using a polynomial number of processors on CREW PRAM model. No faster deterministic parallel algorithm is known for this problem. We also discuss the parallel complexity of some related problems.
this paper identifies a framework for modeling applications as distributed active objects, which can either be used to design a new active distributed multi-database architecture or can be applied to an existing multi...
详细信息
this paper identifies a framework for modeling applications as distributed active objects, which can either be used to design a new active distributed multi-database architecture or can be applied to an existing multidatabase architectural design to make it active. A system whose design follows a model such as the one proposed in this paper is referred to as an active multi-database system, for it is equipped with mechanisms to detect and propagate anticipated changes as distributed active objects.
We propose a family of regular Cayley network graphs of degree three based on permutation groups for design of massively parallel systems. these graphs are shown to be based on the shuffle exchange operations, to have...
详细信息
We propose a family of regular Cayley network graphs of degree three based on permutation groups for design of massively parallel systems. these graphs are shown to be based on the shuffle exchange operations, to have logarithmic diameter in the number of vertices, and to be maximally fault tolerant. We investigate different algebraic properties of these networks (including fault tolerance) and propose a simple routing algorithm. these graphs are shown to be able to efficiently simulate other permutation group based graphs;thus they seem to be very attractive for VLSI implementation and for applications requiring bounded number of I/O ports as well as to run existing applications for other permutation group based architectures.
Performance penalties due to synchronization are a common concern in parallel programming. Traditional approaches enforce the correct ordering of write operations using locks, but this can be time-consuming and drasti...
详细信息
Performance penalties due to synchronization are a common concern in parallel programming. Traditional approaches enforce the correct ordering of write operations using locks, but this can be time-consuming and drastically reduce the benefits of using a parallel machine. Instead, for certain classes or programs we propose using an optimistic approach where the solution is calculated without any locks. this approach detects data races by maintaining statistics on memory writes and correcting potentially inappropriate data values by repeating selected computations and write operations. this scheme is evaluated with a novel parallel implementation of the Moller-Plesset perturbation theory energy calculation for closed-shell molecules.
In this paper we propose user-controllable I/O operations and explore the effects of them with some synthetic access patterns. the operations allow users to determine a file structure matching the access patterns, con...
详细信息
In this paper we propose user-controllable I/O operations and explore the effects of them with some synthetic access patterns. the operations allow users to determine a file structure matching the access patterns, control the layout and distribution of data blocks on physical disks, and present various access patterns with a minimum number of I/O operations. the operations do not use a file pointer to access data as in typical file systems, which eliminates the overhead of managing the offset of the file, making it easy to share data and reducing the number of I/O operations.
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. the traditional consistent checkpointing algorithms require different processes to save their st...
详细信息
A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. the traditional consistent checkpointing algorithms require different processes to save their state at about the same time. this causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. this paper presents a simple approach to arbitrarily stagger the checkpoints. Our approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.
We considered the load-balanced multiplication of a large sparse matrix with a large sequence of vectors, on parallel computers. Due to the associated computational and inter-node communication challenges, we propose ...
详细信息
We considered the load-balanced multiplication of a large sparse matrix with a large sequence of vectors, on parallel computers. Due to the associated computational and inter-node communication challenges, we propose a method that combines fast load-balanced work allocation with efficient message passing implementations. the performance of the proposed method was evaluated on benchmark matrices as well as on synthetically generated matrix data. We compare our proposed allocation solution with previous research work. It will be shown that, by using our approach, a tangible improvement over prior work can be obtained, particularly for very sparse and skewed matrices.
Most parallel databases exploit two types of parallelism: intra-query parallelism and inter-transaction concurrency. Between these two cases lies another type of parallelism: inter-query parallelism within a transacti...
详细信息
Most parallel databases exploit two types of parallelism: intra-query parallelism and inter-transaction concurrency. Between these two cases lies another type of parallelism: inter-query parallelism within a transaction or application. Exploiting inter-query parallelism requires either compiler support to automatically parallelize the existing embedded query programs, or programming support to write explicitly parallel query programs. In this paper, we present compiler analysis to automatically detect parallelism in the embedded query programs. We present compiler algorithms for detecting dependences in such programs. We show that the properties of some aggregate functions such as MIN and MAX can help reduce statically computed dependences.
暂无评论