We present a parallel algorithm for computing the Voronoi diagram of a set of spheres, S in R3. The spheres have varying radii and do not intersect. We compute each Voronoi cell independently using a two-stage iterati...
详细信息
We present a parallel algorithm for computing the Voronoi diagram of a set of spheres, S in R3. The spheres have varying radii and do not intersect. We compute each Voronoi cell independently using a two-stage iterative procedure, assuming the input spheres are in general position. In the first stage, an initial Voronoi cell for a sphere si is computed using an iterative lower envelope approach restricted to a subset of spheres Li subset of S. This helps to avoid defining the bisectors between all pairs of input spheres and develop a distributed memory parallel algorithm. We use the Delaunay graph of sample points from the input spheres to select the subset Li for computing each Voronoi cell. In the second stage, Voronoi cells obtained from the first stage are matched for updating the subsets. If additional spheres are added to a subset Li, the correctness of the computed vertices is verified with the bisectors of spheres newly added to Li. Results and performance of the algorithm show robustness and speed of the algorithm in handling a large set of spheres.
While constructing a Voronoi diagram V(P) for a set of P of n points on a mesh-connected computer (MCC), it is necessary to find a set B of edges which are intersected by the dividing chain C during the merge process ...
详细信息
While constructing a Voronoi diagram V(P) for a set of P of n points on a mesh-connected computer (MCC), it is necessary to find a set B of edges which are intersected by the dividing chain C during the merge process of two Voronoi diagrams V(L) and V(R), where L and R contain the leftmost [n/2] points and the rightmost [n/2] points of P respectively. The computation of B requires two operations: First decide for each edge e in V(L) and V(R) whether its end vertices are closer to L or R, and then from that information, determine whether e is intersected by C. However, in the previous parallel algorithm each of the former and latter operations requires planar point location which takes O(square-root n) time on square-root n x square-root n MCC, and in addition the former operation needs to compute convex hulls of L and R. In this paper, we shall show that the latter operation can be done in O(1) time without executing planar point location and the former operation can be executed without the computation of convex hulls. Therefore, the computation of B is reduced to only one planar point location.
Computed tomographic imaging spectrometers capture hyperspectral images in real-time. However, postprocessing the imagery can require enormous computational resources;thus, limiting its application to nonrealtime scen...
详细信息
Computed tomographic imaging spectrometers capture hyperspectral images in real-time. However, postprocessing the imagery can require enormous computational resources;thus, limiting its application to nonrealtime scenarios. To overcome these challenges, we developed a highly parallelizable algorithm that exploits spatial shift-invariance. To demonstrate the versatility of our algorithm, we developed implementations on a desktop and an embedded graphics processing unit. To our knowledge, our results show the fastest image reconstruction times reported. (C) 2020 Society of Photo-Optical Instrumentation Engineers (SPIE)
In order to solve poor fine searching capacity of artificial fish swarm algorithm and artificial bee colony swarm algorithm in late state to result in insufficient local optimization, hybrid swarm intelligent parallel...
详细信息
In order to solve poor fine searching capacity of artificial fish swarm algorithm and artificial bee colony swarm algorithm in late state to result in insufficient local optimization, hybrid swarm intelligent parallel algorithm research based on multi-core clusters is proposed;Then, reverse learning mechanism is introduced in early stage of algorithm, initialized swarms are evenly distributed, and swarms are randomly divided into two groups to make interactive learning strategy accelerates rate of convergence, and basic artificial fish swarm algorithm and artificial bee colony swarm algorithm are used to make global searching. In late stage of algorithm, niches artificial fish swarm algorithm and Random Perturbation Artificial Bee Colony are used to make local fine searching to the solution obtained in early stage;On this basis, MPI+OpenMP+STM parallel programming model based on multi-core clusters is established for parallel design and analysis. Finally, stimulation experiment indicates optimizing efficiency of this algorithm is higher than single artificial fish swarm algorithm and artificial bee colony swarm algorithm. (C) 2016 Elsevier B.V. All rights reserved.
The image will be contaminated by noise during the imaging process, which severely degrades the image quality. It is necessary to filter the collected image. With the increasing amount of image data, the traditional s...
详细信息
The image will be contaminated by noise during the imaging process, which severely degrades the image quality. It is necessary to filter the collected image. With the increasing amount of image data, the traditional single-processor or multiprocessor computing equipment has been unable to meet the requirements of real-time data processing. In this paper, the computational model of weighted mean filtering and the characteristics of high performance computer architecture are studied. An efficient hierarchical image weighted mean filtering parallel algorithm for Open Computing Language (OpenCL) is designed and implemented, which can fully express the parallelism of the computing model. The parallel algorithm takes full account of the characteristics of image discrete convolution computing and the multi-layer logic architecture of high performance computer, deeply excavates the parallelism of the computing platform and computing model, and realizes the efficient task mapping from computing model to computing resources. The model is implemented in parallel with the two levels of work-group and work-item. The experimental results show that compared with the serial algorithm based on CPU, the parallel algorithm based on Open Multi-Processing (OpenMP) and the parallel algorithm based on Compute Unified Device Architecture (CUDA), the parallel algorithm of weighted mean filtering achieves 20.88 times, 18.52 times and 1.26 times acceleration ratio on the NVIDIA GPU computing platform based on OpenCL architecture, respectively. It realizes better computing performance and runs on different Graphic Processing Unit (GPU) computing platforms, and has good portability and scalability.
We present an O((log log N)(2)) -time algorithm for computing the distance transform of an N x N binary image. Our algorithm is designed for the common concurrent read concurrent write parallel random access machine (...
详细信息
We present an O((log log N)(2)) -time algorithm for computing the distance transform of an N x N binary image. Our algorithm is designed for the common concurrent read concurrent write parallel random access machine (CRCW PRAM) and requires O(N2+epsilon / log log N) processors, for any epsilon such that 0 < E < 1. Our algorithm is based on a novel deterministic sampling scheme and can be used for computing distance transforms for a very general class of distance functions. We also present a scalable version of our algorithm when the number of processors is available p(2+epsilon) / log log p for some p < N. In this case, our algorithm runs in O((N-2/p(2)) + (N/p) log log p + (log log p)(2)) time. This scalable algorithm is more practical since usually the number of available processors is much less than the size of the image.
Merging is the process of constructing a sorted array C from two sorted arrays A and B of lengths nA and nB, respectively, such that array C contains the elements of arrays A and B. The problem is a fundamental subrou...
详细信息
Merging is the process of constructing a sorted array C from two sorted arrays A and B of lengths nA and nB, respectively, such that array C contains the elements of arrays A and B. The problem is a fundamental subroutine in many applications such as tree reconstruction, database design, and sorting. In this paper, we present a constant-time integer merging algorithm on a shared memory model with a concurrent read operation. No constant-time algorithm for integer merging on this model was designed previously. The algorithm, which is based on cross-rank and address strategy, works when the elements of the inputs belong to the integer domain. The presented algorithm has the following advantages:- (1) running in constant time;(2) stable;(3) simple;(4) optimal when the domain of integers is less than or equal to the size of the inputs and the number of processors is equal to size of the inputs;and (5) deterministic.
A micro-digital holographic particle tracking velocimetry with high-speed system is constructed by a PC grid environment that employs Windows XP with AD-POWERs as parallel tool. Two algorithms for high-speed system ar...
详细信息
A micro-digital holographic particle tracking velocimetry with high-speed system is constructed by a PC grid environment that employs Windows XP with AD-POWERs as parallel tool. Two algorithms for high-speed system are evaluated under the same PC grid environment. Both methods are based on a computer-generated hologram algorithm. One method is a division algorithm based on time development for the measurements, while the other is a division algorithm based on spatial reconstruction for the measurement. In case of the former, the performance is increased by a factor of 3.3 by using 4 PCs. The present system can compute huge hologram images and output them "on-site" at an experimental facility. (c) 2007 Elsevier B.V. All fights reserved.
A parallel randomized algorithm to find a maximal matching is presented. Its expected running time on a CRCW-PRAM with vbEvb processors in O(logvbEvb). The expected time is independent of the structure of the input gr...
详细信息
A parallel randomized algorithm to find a maximal matching is presented. Its expected running time on a CRCW-PRAM with vbEvb processors in O(logvbEvb). The expected time is independent of the structure of the input graph. This improves the best known deterministic algorithm by a factor of log 2 vbEvb.
Suppose than 0 = n - k. (Thus for eta = 0 we get the well-known Chvata graphs.) An NC(4)-algorithm is presented which accepts as input an eta-Chvatal graph and produces a Hamiltonian cycle in G as an output. This is a...
详细信息
Suppose than 0 < eta < 1 is given. We call a graph, G, on n vertices an eta-Chvatal graph if its degree sequences d(1) <= d(2) <= ... <= d(n) satisfies: for k < n/2, d(k) <= min {k + eta n. n/2} implies d(n-k-eta n) >= n - k. (Thus for eta = 0 we get the well-known Chvata graphs.) An NC(4)-algorithm is presented which accepts as input an eta-Chvatal graph and produces a Hamiltonian cycle in G as an output. This is a significant improvement on the previous best NC-algorithm for the problem, which finds a Hamiltonian cycle only in Dirac graphs (delta(G) >= n/2 where delta(G) is the minimum degree in G). (C) 2008 Elsevier B.V. All rights reserved.
暂无评论