parallelalgorithms for lossless data compression via dictionary compression using the longest fragment first (LFF) and greedy parsing strategies are described. Dictionary compression removes redundancy by replacing s...
详细信息
ISBN:
(纸本)0818656026
parallelalgorithms for lossless data compression via dictionary compression using the longest fragment first (LFF) and greedy parsing strategies are described. Dictionary compression removes redundancy by replacing substrings of the input by references to strings stored in a dictionary. Given a static dictionary stored as a suffix tree, we present PRAM CREW algorithms for LFF compression which run in O(log2n) time with O(n/log n) processors where it is assumed that the maximum dictionary entry is of length O(log n). We also describe a logarithmic time and linear processor algorithm for greedy parsing.
Non-wave shaping parallel bidirectional heuristic search algorithms have been reported to suffer of the bidirectional search anomaly. Although wave-shaping is considered as the most natural approach, parallel bidirect...
详细信息
ISBN:
(纸本)0818656026
Non-wave shaping parallel bidirectional heuristic search algorithms have been reported to suffer of the bidirectional search anomaly. Although wave-shaping is considered as the most natural approach, parallel bidirectional wave-shaping algorithms are extremely scare. We introduce a wave-shaping algorithm for parallel bidirectional heuristic search in distributed memory environments. the method is inspired by the celebrated uniprocessor bidirectional wave-shaping algorithm of DeChampeaux-Sint. Our performance evaluation shows that the proposed method scales well, maintains good performance for increasing problem sizes, and attains close to linear speedup over the sequential DeChampeaux-Sint algorithm.
In this paper we develop algorithms for performing total exchange (all-to-all broadcast) in an n-dimensional faulty SIMD hypercube, Qn, with up to n - 1 node faults. In an SIMD hypercube, during a communication step, ...
详细信息
ISBN:
(纸本)0818656026
In this paper we develop algorithms for performing total exchange (all-to-all broadcast) in an n-dimensional faulty SIMD hypercube, Qn, with up to n - 1 node faults. In an SIMD hypercube, during a communication step, nodes can exchanges information withtheir neighbors only across a specific dimension. We describe a sequence of algorithms which take N log N, N log n, 3N - 7, 2.5N--7 steps for this problem. By carefully analyzing these algorithms and a property of certain ordering of dimensions, we obtain an improved algorithm which takes 2N steps.
Scalable parallel computers require I/O balanced with computational power to solve data-intensive problems. Distributed memory architectures call for I/O hardware and software beyond those of conventional scalar syste...
详细信息
ISBN:
(纸本)0818656026
Scalable parallel computers require I/O balanced with computational power to solve data-intensive problems. Distributed memory architectures call for I/O hardware and software beyond those of conventional scalar systems. this paper introduces the MasPar I/O system, designed to provide balanced and scalable data-parallel Unix I/O. the architecture and implementation of the I/O hardware and software are described. Key elements include parallel access to conventional Unix file descriptors and a self-routing multistage network coupled with a buffer memory for flexible parallel I/O transfers. Performance measurements are presented for parallel Unix I/O with a scalable RAID disk array, a RAM disk, and a HIPPI interconnect.
AVL trees are efficient data structures for implementing dictionaries. We present a parallel dictionary, using AVL trees, on the EREW-PRAM by proposing optimal algorithms to perform k operations with p (1 LSEQ p &...
详细信息
ISBN:
(纸本)0818656026
AVL trees are efficient data structures for implementing dictionaries. We present a parallel dictionary, using AVL trees, on the EREW-PRAM by proposing optimal algorithms to perform k operations with p (1 LSEQ p &le k) processors. An explicit processor scheduling is devised to avoid simultaneous reads in our parallel algorithm to perform k searches, which avoids the need for any additional memory in the parallelization. To perform multiple insertions and deletions, we identify rotations (in addition to AVL tree rotations) required to restore balance and present parallelalgorithms to perform p insertions/deletions in O(log n + log p) time with p processors.
Although several strategies have been developed for designing efficient parallelalgorithms, there is still a need for new strategies so that efficient or simple solutions can be obtained for broader classes of proble...
详细信息
ISBN:
(纸本)0818656026
Although several strategies have been developed for designing efficient parallelalgorithms, there is still a need for new strategies so that efficient or simple solutions can be obtained for broader classes of problems. In this paper, we propose to establish a new design strategy, called parallel parentheses matching (PPM), by solving a number of problems related to trees. Withthis strategy, a given problem is first converted into an equivalent parentheses matching problem, the solution of which finally leads to the desired solution of the original problem. the PPM strategy is applied to solve various bottom-up tree computation problems such as heights of the nodes of a tree, extreme values in subtrees, and lowest common ancestor. Furthermore, elegant algorithms are presented for the tree contraction problem and balancing binary trees, thus demonstrating the versatility of our approach.
three Adjoining Grammar (TAG) is a powerful grammatical formalism for large-scale natural language processing. However, the computational complexity of parsing algorithms for TAG is high. We introduce a new parallel T...
详细信息
ISBN:
(纸本)0818656026
three Adjoining Grammar (TAG) is a powerful grammatical formalism for large-scale natural language processing. However, the computational complexity of parsing algorithms for TAG is high. We introduce a new parallel TAG parsing algorithm for MIMD hypercube multicomputers, using large granularity grammar partitioning, asynchronous communication, and distributed termination detection. We describe our implementation on the nCUBE/2 parallel computer, and provide experimental results on both random and English grammars. Our algorithm delivers the best performance of any TAG parsing algorithm to date, yielding an almost two order-of-magnitude speedup and good efficiency on up to 256 processors.
the All-Nearest Neighbor problem (ANN, for short) is stated as follows: given a set S of points in the plane, determine for every point in S, a point that lies closest to it. the ANN problem is central to VLSI design,...
详细信息
ISBN:
(纸本)0818656026
the All-Nearest Neighbor problem (ANN, for short) is stated as follows: given a set S of points in the plane, determine for every point in S, a point that lies closest to it. the ANN problem is central to VLSI design, computer graphics, pattern recognition, and image processing, among others. In this paper we propose time-optimal algorithms to solve the ANN problem for an arbitrary set of points in the plane and also for the special case in which the points are vertices of a convex polygon. Both our algorithms run on meshes with multiple broadcasting. We first establish an Ω(log n) time lower bound for the task of solving an arbitrary n-point instance of the ANN problem, even if the points are the vertices of a convex polygon. this lower bound holds for boththe CREW-PRAM and for the mesh with multiple broadcasting.
Path planning is an essential step in any mobile robot system. In this paper, we propose a parallel algorithm and architecture for this computationally intensive problem. the parallel implementation is based on the tr...
详细信息
ISBN:
(纸本)0818656026
Path planning is an essential step in any mobile robot system. In this paper, we propose a parallel algorithm and architecture for this computationally intensive problem. the parallel implementation is based on the trulla algorithm proposed earlier by the authors in [5]. the algorithms uses a wavefront like propagation technique and computes the near-optimal paths from every point in the grid space to the specified destination. the algorithm is implemented on a linear systolic array architecture which is simple and regular in structure and can be implemented as a VLSI system. Currently, a prototype VLSI chip with 5 processor cells in being implemented at the University of South Florida.
Testing is a critical and expensive part of software development. One testing methodology, data flow testing, uses the flow of data in a program to determine whether the program is adequately tested. We present a new ...
详细信息
ISBN:
(纸本)0818656026
Testing is a critical and expensive part of software development. One testing methodology, data flow testing, uses the flow of data in a program to determine whether the program is adequately tested. We present a new approach that partitions the data flow testing workload into an appropriately sized granularity. the workload can be scheduled either statically or dynamically, and can be adapted to either a shared memory or a distributed memory environment. We implemented both uniprocessor and multiprocessor versions of our data flow tester in C for the Data General Corporation's AViiOn 5000 machine-experimented with a variety of programs, and obtained good speedup using the multiprocessor version over the uniprocessor version. Currently, we are expanding our tester to handle larger programs, and experimenting on machines with different architectures and a greater number of processors.
暂无评论