ISBN: 9781728126074 (print)
Voronoi diagram construction is a common and fundamental problem in computational geometry and spatial computing. Numerous sequential and parallel algorithms for Voronoi diagram construction exist in the literature. This paper presents a multi-threaded approach in which we augment an existing sequential implementation of Fortune's plane sweep algorithm with compiler directives. The novelty of our fine-grained parallel algorithm lies in exploiting the concurrency available at each event point encountered during the sweep. On an Intel Xeon E5 CPU, our shared-memory parallelization with OpenMP achieves around 2x speedup over the sequential implementation on datasets containing 2k-128k sites.
ISBN: 9783030122744; 9783030122737 (print)
Line segment intersection is one of the elementary operations in computational geometry. Complex problems in Geographic Information Systems (GIS), such as finding map overlays or spatial joins using polygonal data, require solving segment intersections. The plane sweep paradigm is used to find geometric intersections efficiently. However, it is difficult to parallelize because it processes spatial events in order. We present a new fine-grained parallel algorithm for geometric intersection and its CPU and GPU implementations using OpenMP and OpenACC. To the best of our knowledge, this is the first work demonstrating an effective parallelization of plane sweep on GPUs. We chose a compiler directive-based approach for the implementation because it makes parallelizing sequential code simple. On an Nvidia Tesla P100 GPU, our implementation achieves around 40x speedup for the line segment intersection problem on 40K and 80K data sets compared to the sequential CGAL library.
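As a rough illustration of the directive-based style the abstract describes, the following C++ sketch counts intersecting segment pairs with a single OpenMP pragma. It is not the paper's plane sweep algorithm: it is a brute-force pairwise test, shown only because it keeps the segment-intersection primitive and the directive visible in a few lines. The names `Point`, `Segment`, `segments_intersect`, and `count_intersections` are hypothetical and come from neither the paper nor CGAL.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration; not taken from the paper or from CGAL.
struct Point { double x, y; };
struct Segment { Point a, b; };

// Sign of the cross product (b - a) x (c - a):
// +1 for a left turn, -1 for a right turn, 0 when a, b, c are collinear.
int orientation(const Point& a, const Point& b, const Point& c) {
    const double cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (cross > 0.0) - (cross < 0.0);
}

// True if c lies inside the bounding box of segment ab.
// Only meaningful when c is already known to be collinear with a and b.
bool on_segment(const Point& a, const Point& b, const Point& c) {
    return std::min(a.x, b.x) <= c.x && c.x <= std::max(a.x, b.x) &&
           std::min(a.y, b.y) <= c.y && c.y <= std::max(a.y, b.y);
}

// Standard orientation-based test; touching endpoints count as intersecting.
bool segments_intersect(const Segment& s, const Segment& t) {
    const int o1 = orientation(s.a, s.b, t.a);
    const int o2 = orientation(s.a, s.b, t.b);
    const int o3 = orientation(t.a, t.b, s.a);
    const int o4 = orientation(t.a, t.b, s.b);
    if (o1 != o2 && o3 != o4) return true;            // proper crossing
    if (o1 == 0 && on_segment(s.a, s.b, t.a)) return true;
    if (o2 == 0 && on_segment(s.a, s.b, t.b)) return true;
    if (o3 == 0 && on_segment(t.a, t.b, s.a)) return true;
    if (o4 == 0 && on_segment(t.a, t.b, s.b)) return true;
    return false;
}

// Brute-force O(n^2) count of intersecting pairs (assumption: NOT plane sweep).
// The pragma splits the outer loop across threads and sums per-thread counts.
long count_intersections(const std::vector<Segment>& segs) {
    long count = 0;
    const long n = static_cast<long>(segs.size());
    #pragma omp parallel for schedule(dynamic) reduction(+:count)
    for (long i = 0; i < n; ++i) {
        for (long j = i + 1; j < n; ++j) {
            if (segments_intersect(segs[i], segs[j])) ++count;
        }
    }
    return count;
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC or Clang), the outer loop runs in parallel; without that flag the pragma is ignored and the same code runs sequentially. This is the sense in which directives leave the sequential source intact.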
ISBN: 9783319655789; 9783319655772 (print)
Integrating multiple types of compute elements and memories in a single system requires proper support at the system-software level, including the operating system (OS), compilers, and drivers. The OS schedules work on the different compute elements and manages memory operations across multiple memory pools, including page migration. Compilers and programming languages provide tools for taking advantage of advanced architectural features. In this paper, we encourage code developers to work with experimental versions of compilers and with OpenMP standard extensions designed for hybrid OpenPOWER nodes. Specifically, we focus on nested parallelism and Unified Memory as key elements for efficient system-wide programming of the CPU and GPU resources of OpenPOWER. We give implementation details using code samples and discuss the limitations of the presented approaches.
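For readers unfamiliar with the term, nested parallelism in OpenMP means opening a parallel region inside another one. The C++ sketch below is a minimal CPU-only example under stated assumptions: it does not use the experimental compilers, the OpenPOWER-specific extensions, GPU offload, or Unified Memory discussed in the paper, and the function name `row_sums` is hypothetical.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper for illustration; not code from the paper.
// Sums each row of a ragged matrix using two nested levels of parallelism.
std::vector<double> row_sums(const std::vector<std::vector<double>>& rows) {
    std::vector<double> sums(rows.size(), 0.0);
#ifdef _OPENMP
    // Allow two active nested parallel levels; by default the inner region
    // may be serialized by the runtime.
    omp_set_max_active_levels(2);
#endif
    const long n = static_cast<long>(rows.size());
    #pragma omp parallel for                      // outer level: rows
    for (long i = 0; i < n; ++i) {
        const long m = static_cast<long>(rows[i].size());
        double s = 0.0;                           // private to the outer thread
        #pragma omp parallel for reduction(+:s)   // inner level: row elements
        for (long j = 0; j < m; ++j) {
            s += rows[i][j];
        }
        sums[i] = s;                              // each row has one writer
    }
    return sums;
}
```

Whether the inner region actually gets additional threads depends on the runtime and its settings; without OpenMP enabled at compile time, both pragmas are ignored and the function runs sequentially with the same result.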