Given a set of $n$ intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting) intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the general case of the problem, our algorithms compute a maximum matching in $O(\log^3 n)$ time using $O(n/\log^2 n)$ processors on the EREW PRAM and using $n$ processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in $O(\log n)$ time using $O(n)$ processors if the input intervals are not given already sorted and using $O(n/\log n)$ processors otherwise, on the EREW PRAM. On $n$-processor hypercubes, our algorithm for the proper interval case takes $O(\log n \log\log n)$ time for unsorted input and $O(\log n)$ time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping intervals in proper interval graphs.
We have implemented sample sort and a parallel version of Quicksort on a cache-coherent shared address space multiprocessor: the SUN ENTERPRISE 10000. Our computational experiments show that parallel Quicksort outperforms sample sort. Sample sort has long been thought to be the best general parallel sorting algorithm, especially for larger data sets. On 32 processors of the ENTERPRISE 10000, the speedup of parallel Quicksort is more than six units higher than the speedup of sample sort, resulting in execution times that were more than 50% faster than those of sample sort. On one processor, parallel Quicksort achieved 15% faster execution times than sample sort. Moreover, because of its low memory requirements, parallel Quicksort could sort data sets twice the size of those that sample sort could handle under the same system memory restrictions.
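For readers unfamiliar with sample sort, the following is a minimal sequential sketch of the general technique, not the implementation evaluated in the abstract. Each bucket stands in for one processor; the function name `sample_sort` and the parameters `num_buckets`, `oversample`, and `seed` are illustrative assumptions.

```python
import bisect
import random


def sample_sort(data, num_buckets=4, oversample=8, seed=0):
    """Sequential simulation of sample sort (hypothetical helper, for illustration)."""
    # Tiny inputs are not worth partitioning.
    if len(data) <= num_buckets * oversample:
        return sorted(data)
    rng = random.Random(seed)
    # 1. Draw a random sample and pick evenly spaced splitters from it.
    sample = sorted(rng.sample(data, num_buckets * oversample))
    splitters = [sample[i * oversample] for i in range(1, num_buckets)]
    # 2. Route every key to the bucket whose splitter range contains it.
    buckets = [[] for _ in range(num_buckets)]
    for key in data:
        buckets[bisect.bisect_right(splitters, key)].append(key)
    # 3. Sort each bucket independently (the step that would run in parallel)
    #    and concatenate the buckets in order.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

In this sketch every key is copied into a bucket before it is sorted, so the buckets together occupy as much space as the input. That extra copy is one plausible source of the higher memory use the abstract attributes to sample sort, although the abstract itself does not state the cause.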
ISBN: 0780377028 (print)
This paper analyzes the characteristics of the BSP parallel computing model and of networks of workstations (NOWs). The analysis indicates that, when a parallel algorithm is designed appropriately, the parallel computing model fits the NOW environment well. An improved NOW-based parallel algorithm for linear programming in standard form, the simplex method, obtained the best results, which supports this conclusion.
In this paper, we report on 2-D and 3-D versions of parallelization and on efforts to simplify the reconstruction of data during post-processing. The parallelization scheme uses single-coordinate domain decomposition with 1-cell overlap and asynchronous time-stepping. We have performed a series of calibrations on a sheet-beam klystron cavity with a long drift tube using the MAGIC software.
ISBN: 9780769520353 (print)
We propose a practical parallel on-the-fly algorithm for enumerative LTL (linear temporal logic) model checking. The algorithm is designed for a cluster of workstations communicating via MPI (message passing interface). The detection of cycles (faulty runs) makes effective use of so-called back-level edges. In particular, a parallel level-synchronized breadth-first search of the graph is performed to discover back-level edges. For each level, the back-level edges are checked in parallel by a nested depth-first search to confirm or refute the presence of a cycle. Several optimizations of the basic algorithm are presented, and the advantages and drawbacks of applying them to distributed LTL model checking are discussed. An experimental implementation of the algorithm shows promising results.
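The two phases described above can be illustrated with a small sequential sketch. It assumes that a back-level edge is an edge whose target lies on the same or an earlier BFS level than its source; the abstract does not spell out the definition. It also ignores acceptance conditions and MPI distribution, and the function names `back_level_edges` and `closes_cycle` are hypothetical.

```python
def back_level_edges(graph, start):
    """Level-by-level BFS that records edges not leading to a deeper level."""
    level = {start: 0}
    frontier = [start]
    back_edges = []
    while frontier:
        next_frontier = []
        # In a parallel setting, the nodes of one level would be split among workers.
        for u in frontier:
            for v in graph.get(u, ()):
                if v not in level:
                    level[v] = level[u] + 1
                    next_frontier.append(v)
                elif level[v] <= level[u]:
                    # Assumed definition: target is on the same or an earlier level.
                    back_edges.append((u, v))
        frontier = next_frontier
    return level, back_edges


def closes_cycle(graph, edge):
    """Depth-first search from the edge's target; a cycle exists if it reaches the source."""
    source, target = edge
    stack, seen = [target], {target}
    while stack:
        node = stack.pop()
        if node == source:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False
```

For the graph `{"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["b"]}` the only back-level edge is `("d", "b")`, and the depth-first check confirms that it closes a cycle.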
The processing of three-dimensional (3-D) objects from 3-D digital image data is an important task in the image processing and computer vision fields. The distance transform (DT) is extensively applied in both areas as a key operation, and computing it for a two- or three-dimensional image array is an important task. With the increasing application of 3-D voxel images, it is useful to consider the distance transform of a 3-D digital image array. To make the transform computation efficient, parallelism is employed. We develop parallel algorithms for the three-dimensional Euclidean distance transform (3D-EDT) on the SIMD hypercube computer. The time complexity of our parallel algorithm is $O(\log^2 N)$ for an $N \times N \times N$ image array using $N^3$ processors. A generalized parallel algorithm for the 3D-EDT is also proposed; it runs in $O((N/p)^3 \log N + (N/p)^2 \log^2 p)$ time for an $N \times N \times N$ binary image array on the SIMD hypercube computer using $p^3$ PEs, where $1 \le p \le N$.
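To make the quantity being computed concrete, here is a brute-force reference sketch of a 3-D Euclidean distance transform. It is not the hypercube algorithm from the abstract. It assumes the convention that each voxel receives the distance to the nearest voxel of value 1 (conventions differ between papers), and the function name `edt3d_bruteforce` is hypothetical.

```python
import itertools
import math


def edt3d_bruteforce(image):
    """Distance from every voxel to the nearest voxel of value 1 (assumed convention).

    `image` is a cubic nested list image[x][y][z] of 0/1 values. The cost is
    O(N^6) for an N x N x N array, so this is useful only as a correctness reference.
    """
    n = len(image)
    voxels = list(itertools.product(range(n), repeat=3))
    features = [v for v in voxels if image[v[0]][v[1]][v[2]] == 1]
    # Voxels keep an infinite distance if the image has no feature voxel at all.
    dist = [[[math.inf] * n for _ in range(n)] for _ in range(n)]
    for x, y, z in voxels:
        if features:
            dist[x][y][z] = min(math.dist((x, y, z), f) for f in features)
    return dist
```

A reference of this kind is mainly useful for checking a faster or parallel implementation on small arrays.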
ISBN: 0769519229 (print)
The principle and steps of the algorithm for mining fuzzy association rules are studied, and a parallel algorithm for mining fuzzy association rules is presented. In this parallel mining algorithm, quantitative attributes are partitioned into several fuzzy sets by the parallel fuzzy c-means algorithm, and the fuzzy sets are applied to soften the partition boundaries of the attributes. Then, the parallel algorithm for mining Boolean association rules is improved to discover frequent fuzzy attributes. Finally, the fuzzy association rules that meet the fuzzy confidence threshold are generated on all processors. The parallel mining algorithm is implemented on a distributed system of linked PCs/workstations. The experimental results show that the parallel mining algorithm has good scaleup, sizeup, and speedup.
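The abstract does not define fuzzy support or fuzzy confidence, so the following is only a sketch under one common convention: the support of a set of fuzzy attributes is the average, over all records, of the product of their membership degrees, and confidence is the ratio of two supports. The function names and the example attribute labels are hypothetical.

```python
import math


def fuzzy_support(records, itemset):
    """Average over records of the product of membership degrees (assumed convention)."""
    if not records:
        return 0.0
    total = 0.0
    for memberships in records:
        # A fuzzy attribute missing from a record counts as membership 0.
        total += math.prod(memberships.get(item, 0.0) for item in itemset)
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, tuple(antecedent) + tuple(consequent)) / base
```

With the two records `{"age=young": 0.8, "income=high": 0.5}` and `{"age=young": 0.2, "income=high": 1.0}`, the rule from `age=young` to `income=high` has support of about 0.3 and confidence of about 0.6 under this convention. In a parallel setting each processor could sum the per-record products for its own share of the records, with the partial sums combined afterwards.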
In this paper, we summarize recent advances in improved Krylov subspace methods for solving large, sparse linear systems of equations with unsymmetric coefficient matrices. The proposed methods combine elements of numerical stability and parallel algorithm design without much increase in computational cost. Their common feature is that all matrix-vector multiplications, inner products, and vector updates of a single iteration step are independent, so that the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. The cost of global communication, which is the performance bottleneck, can therefore be reduced significantly. The bulk synchronous parallel (BSP) model is used to design efficient, scalable, and portable parallel versions of the proposed algorithms and to provide accurate performance predictions for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by Ethernet. The performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to give useful insight into the time complexity of the methods. The theoretical performance predictions are compared with preliminary measured timings from a numerical application in ocean flow simulation.
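As a rough illustration of how a BSP performance model works with only a few machine parameters, the sketch below uses the standard BSP cost of a superstep, `w + h*g + l`, where `w` is the local work, `h` the largest number of words any processor sends or receives, `g` the per-word communication cost, and `l` the synchronization cost. The function names and the numbers in the usage note are made up for illustration and are not taken from the paper.

```python
def bsp_superstep_cost(work, h, g, l):
    """Standard BSP cost of one superstep: computation + communication + synchronization."""
    return work + h * g + l


def bsp_program_cost(supersteps, g, l):
    """Sum the superstep costs; `supersteps` is a list of (work, h) pairs."""
    return sum(bsp_superstep_cost(work, h, g, l) for work, h in supersteps)
```

For example, `bsp_program_cost([(100, 10), (50, 0)], g=2, l=5)` gives 180 time units. Under this model, measuring `g` and `l` once per machine would be enough to turn the same `(work, h)` profile into predictions for different architectures.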
ISBN: 0780378040 (print)
This paper proposes an extended version of a previously developed low-cost parallel computation platform called Para Worker. The new system, termed Para Worker 2, differs from the earlier system by adding improved dynamic object reallocation, adaptive consistency protocols, and location transparency. The proposal is particularly useful for the implementation and execution of computational intelligence techniques, such as evolutionary computing, for engineering applications.
In this work we present a wide spectrum of results from analyzing the behavior of parallel heuristics for solving optimization problems. We focus on evolutionary algorithms as well as on simulated annealing. Our goal is to offer a first study of the possible changes in the search mechanics when shifting from a LAN-distributed algorithm to a WAN environment. We address six optimization tasks of considerable complexity. The results show that, despite the expected slower execution time, the WAN versions of our algorithms consistently solve the problems. We even report some interesting results in which WAN algorithms outperform LAN ones. We also extend the study to include hybrid versions to check the scope of our conclusions.