Polygonal geometric operations are fundamental in domains such as Computer Graphics, Computer-Aided Design, and Geographic Information Systems. Handling degenerate cases in such operations is important when real-world...
详细信息
ISBN:
(纸本)9798350301199
Polygonal geometric operations are fundamental in domains such as Computer Graphics, Computer-Aided Design, and Geographic Information Systems. Handling degenerate cases in such operations is important when real-world spatial data are used. The popular Greiner-Hormann (GH) clipping algorithm does not handle such cases properly without perturbing vertices leading to inaccuracies and ambiguities. In this work, we parallelize the O(n(2))-time general polygon clipping algorithm by Foster et al., which can handle degenerate cases without perturbation. Our CREW pram algorithm can perform clipping in O(log n) time using n + k number of processors with simple polygons, where n is the number of input edges and k is the number of edge intersections. For efficient GPU implementation, we employ three effective filters which have not been used in prior work on polygon clipping: 1) Common-minimum-bounding-rectangle filter, 2) Count-based filter, and 3) Line-segment-minimum-bounding-rectangle filter. They drastically reduce O( n2) candidate edge pairs comparisons by 80%-99%, leading to significantly faster parallel execution. In our experiments, C++ CUDA-based implementation yields up to 40X speedup over real-world datasets, processing two polygons with a total of 174K vertices on an Nvidia Quadro RTX 5000 GPU compared to the sequential Foster's algorithm running on an Intel Xeon Silver 4210R CPU.
Given a set of n entities to be classified, and a matric of dissimilarities between pairs of them. This article considers the problem called Minimum Sum of Diameters Clustering Problem, where a partition of the set of...
详细信息
Given a set of n entities to be classified, and a matric of dissimilarities between pairs of them. This article considers the problem called Minimum Sum of Diameters Clustering Problem, where a partition of the set of entities into k clusters such that the sum of the diameters of these clusters is minimized. In sequential, Brucker showed that the problem is NP-hard, when k ≥ 3 [1]. For the case of k = 2, Hansen and Jaumard gave an O(n^3logn) algorithm [2], which Ramnath later improved the running time to O(n^3) [3]. In this article, we discuss parallel algorithms for the Minimum Sum of Diameters Clustering Problem, for the case of k = 2. In particular, we present an NC algorithm that runs in O(logn) parallel time and n^7 processors on the Common CRCW pram model. Additionally, we propose the parallel algorithmic technique which can be applied to improve the processor bound by a factor of n. As a result, our algorithm can be implemented in O(logn) parallel time using n^6 processors on the Common CRCW pram model. In addition, regarding the issue of high processor complexity, we also propose a more practical NC algorithm which can be implemented in O(log^3n) parallel time using n^(3.376) processors on the EREW pram model.
Clipping arbitrary polygons is one of the complex operations in computer graphics and computational geometry. It is applied in many fields such as Geographic Information Systems (GIS) and VLSI CAD. We have two signifi...
详细信息
ISBN:
(纸本)9781479980062
Clipping arbitrary polygons is one of the complex operations in computer graphics and computational geometry. It is applied in many fields such as Geographic Information Systems (GIS) and VLSI CAD. We have two significant results to report. Our first result is the effective parallelization of the classic, highly sequential Greiner-Hormann algorithm, which yields the first output-sensitive CREW pram algorithm for a pair of simple polygons, and can perform clipping in O(logn) time using O(n + k) processors, where n is the total number of vertices and k is the number of edge intersections. This improves upon our previous clipping algorithm based on the parallelization of Vatti's sweepline algorithm, which requires O(n + k + k') processors to achieve logarithmic time complexity where k' can be O(n(2)) [1]. This also improves upon another O(logn) time algorithm by Karinthi, Srinivas, and Almasi which unlike our algorithm does not handle self-intersecting polygons, is not output-sensitive, and must employ Theta(n(2)) processors to achieve O(logn) time [2]. We also study multi-core and many-core implementations of our parallel Greiner-Hormann algorithm. Our second result is a practical, parallel GIS system, namely MPI-GIS, for polygon overlay processing of two GIS layers containing large number of polygons over a cluster of compute nodes. It employs R-tree for efficient indexing and identification of potentially intersecting set of polygons across two input GIS layers. Spatial data files tend to be large in size (in GBs) and the underlying overlay computation is highly irregular and compute intensive. This system achieves 44X speedup on a 32-node NERSC's CARVER cluster while processing about 600K polygons in two GIS layers within 19 seconds which takes over 13 minutes on state-of-art ArcGIS system.
Given a set of n entities to be classified, and a matric of dissimilarities between pairs of them. This paper considers the problem called MINIMUM SUM OF DIAMETERS CLUSTERING PROBLEM, where a partition of the set of e...
详细信息
ISBN:
(纸本)9781467378253
Given a set of n entities to be classified, and a matric of dissimilarities between pairs of them. This paper considers the problem called MINIMUM SUM OF DIAMETERS CLUSTERING PROBLEM, where a partition of the set of entities into k clusters such that the sum of the diameters of these clusters is minimized. Brucker showed that the complexity of the problem is NP-hard, when k >= 3 [1]. For the case of k = 2, Hansen and Jaumard gave an O(n(3) log n) algorithm [2], which Ramnath later improved the running time to O(n(3)) [3]. This paper discusses the parallel complexity of the MINIMUM SUM OF DIAMETERS CLUSTERING PROBLEM. For the case of k = 2, we show that the problem in parallel in fact belongs in class NC.(1) In particular, we show that the parallel complexity of the problem is O(log n) parallel time and n(7) processors on the COMMON CRCW pram model. Additionally, we propose the parallel algorithmic technique which can be applied to improve the processor bound by a factor of n. As a result, we show that the problem can be quickly solved in O(log n) parallel time using n(6) processors on the COMMON CRCW pram model. In addition, regarding the issue of high processor complexity, we also propose a more practical NC algorithm which can be implemented in O(log(3) n) parallel time using n(3.376) processors on the EREW pram model.
Polygon clipping is one of the complex operations in computational geometry. It is a primitive operation in many fields such as Geographic Information Systems (GIS), Computer Graphics and VLSI CAD. Sequential algorith...
详细信息
ISBN:
(纸本)9781479956180
Polygon clipping is one of the complex operations in computational geometry. It is a primitive operation in many fields such as Geographic Information Systems (GIS), Computer Graphics and VLSI CAD. Sequential algorithms for this problem are in abundance in literature but there are very few parallel algorithms solving it in its most general form. We present the first output-sensitive CREW pram algorithm, which can perform polygon clipping in O(logn) time using (n + k + k') processors, where n is the number of vertices, k is the number of edge intersections and k' is the additional temporary vertices introduced due to the partitioning of polygons. The current best algorithm by Karinthi, Srinivas, and Almasi [1] does not handle self-intersecting polygons, is not output-sensitive and must employ Theta(n(2)) processors to achieve O(logn) time. Our algorithm is developed from the first principles and it is superior to [1] in cost. It yields a practical implementation on multicores and demonstrates 30x speedup for real-world dataset. Our algorithm can perform the typical clipping operations including intersection, union, and difference.
We present parallel algorithms on the BSP/CGM model, with p processors, to count and generate all the maximal cliques of a circle graph with n vertices and m edges. To count the number of all the maximal cliques, with...
详细信息
We present parallel algorithms on the BSP/CGM model, with p processors, to count and generate all the maximal cliques of a circle graph with n vertices and m edges. To count the number of all the maximal cliques, without actually generating them, our algorithm requires O(log p) communication rounds with O(nm/p) local computation time. We also present an algorithm to generate the first maximal clique in O(log p) communication rounds with O(nm/p) local computation, and to generate each one of the subsequent maximal cliques this algorithm requires O(log p) communication rounds with O(m/p) local computation. The maximal cliques generation algorithm is based on generating all maximal paths in a directed acyclic graph, and we present an algorithm for this problem that uses O(log p) communication rounds with O(m/p) local computation for each maximal path. We also show that the presented algorithms can be extended to the CREW pram model.
We consider the Renaming Problem, a basic processing step in string algorithms, for which we give a simultaneously work and time optimal Las Vegas type pram algorithm. The Renaming Problem is closely related to the Mu...
详细信息
We consider the Renaming Problem, a basic processing step in string algorithms, for which we give a simultaneously work and time optimal Las Vegas type pram algorithm. The Renaming Problem is closely related to the Multiset Sorting Problem.
Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n(2)) algorithms are known for this problem [3,4,11,19]. This paper reviews important results ...
详细信息
Hierarchical clustering is a common method used to determine clusters of similar data points in multidimensional spaces. O(n(2)) algorithms are known for this problem [3,4,11,19]. This paper reviews important results for sequential algorithms and describes previous work on parallel algorithms for hierarchical clustering. Parallel algorithms to perform hierarchical clustering using several distance metrics are then described. Optimal pram algorithms using n/log n processors are given for the average link, complete link, centroid, median, and minimum variance metrics. Optimal butterfly and tree algorithms using n/log n processors are given for the centroid, median, and minimum variance metrics. Optimal asymptotic speedups are achieved for the best practical algorithm to perform clustering using the single link metric on a n/log n processor pram, butterfly, or tree.
We define a novel scheduling problem; it is solved in parallel by repeated, rapid, approximate reschedulings. This leads to the first optimal logarithmic time pram algorithm for list ranking. Companion papers show how...
详细信息
We define a novel scheduling problem; it is solved in parallel by repeated, rapid, approximate reschedulings. This leads to the first optimal logarithmic time pram algorithm for list ranking. Companion papers show how to apply these results to obtain improved pram upper bounds for a variety of problems on graphs, including the following: connectivity, biconnectivity, Euler tour and st-numbering, and a number of problems on trees.
The problem considered is that of finding the n-bit result in dividing one n-bit number by another. Circuits are presented with asymptotically small size and depth for this problem; from them, efficient parallel rand...
详细信息
The problem considered is that of finding the n-bit result in dividing one n-bit number by another. Circuits are presented with asymptotically small size and depth for this problem; from them, efficient parallel random access memory (pram) algorithms are derived for division in the bit model. The primary result, which is a size-efficient implementation of the circuits, translates into a uniform parallel algorithm on a shared memory machine with bit operations and exclusive memory writes with parallel time D(n) using O(S(n)) processors. A significant improvement is obtained in size bounds for division circuits of small depth. However, there is scope for improvement in the size bounds.
暂无评论