We provide optimal parallel solutions to several shortest path and visibility problems set in triangulated simple polygons. Let P be a triangulated simple polygon with n vertices, preprocessed to support shortest path queries. We can find the shortest path tree from any point inside P in O(log n) time using O(n/log n) processors. In the same bounds, we can preprocess P for shooting queries (a query can be answered in O(log n) time by a uniprocessor). Given a set S of m points inside P, we can find an implicit representation of the relative convex hull of S in O(log(nm)) time with O(m) processors. If the relative convex hull has k edges, we can explicitly produce these edges in O(log(nm)) time with O(k/log(nm)) processors. All of these algorithms are deterministic and use the CREW PRAM model.
Some of the most widely used algorithms for two-point boundary value ordinary differential equations, namely, finite-difference and collocation methods and standard multiple shooting, proceed by setting up and solving a structured system of linear equations. It is well known that the linear system can be set up efficiently in parallel; we show here that a structured orthogonal factorization technique can be used to solve this system, and hence the overall problem, in an efficient, parallel, and stable way.
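For concreteness, the structured system referred to above, for standard multiple shooting applied to a linear problem $y' = A(t)y + q(t)$ with boundary condition $B_a y(a) + B_b y(b) = \beta$, has the familiar block-bidiagonal form below (a sketch under common sign conventions, not necessarily the paper's notation); $\Phi_i$ and $r_i$ are the fundamental and particular solutions computed on the $i$-th shooting subinterval and $s_i$ are the unknown states at the shooting nodes:

```latex
\begin{pmatrix}
\Phi_0 & -I     &        &            &     \\
       & \Phi_1 & -I     &            &     \\
       &        & \ddots & \ddots     &     \\
       &        &        & \Phi_{N-1} & -I  \\
B_a    &        &        &            & B_b
\end{pmatrix}
\begin{pmatrix} s_0 \\ s_1 \\ \vdots \\ s_{N-1} \\ s_N \end{pmatrix}
=
\begin{pmatrix} -r_0 \\ -r_1 \\ \vdots \\ -r_{N-1} \\ \beta \end{pmatrix}
```

A structured orthogonal (QR-type) factorization eliminates these blocks pairwise, which is the step the abstract identifies as parallelizable and stable.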
Bang-Jensen, Huang and Prisner [4] defined the in-tournament digraph (in-tournament for short), a superclass of tournament digraphs, and investigated a number of its nice properties. An in-tournament is a directed graph in which the set of in-neighbors of every vertex induces a tournament. In other words, the presence of the arcs (x,z) and (y,z) implies that exactly one of (x,y) or (y,x) exists. In this paper, we propose, for in-tournaments, parallel algorithms for deciding the existence of a Hamiltonian path and a Hamiltonian cycle and for constructing them if they exist.
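As a concrete illustration of the defining property only (not of the paper's parallel algorithms), the following sketch checks whether a digraph, given as a vertex set and a set of arcs, is an in-tournament; the function name is ours.

```python
from itertools import combinations

def is_in_tournament(vertices, arcs):
    """Check the in-tournament property: for every vertex v, any two distinct
    in-neighbors x, y of v must be joined by exactly one of the arcs (x, y), (y, x)."""
    arcs = set(arcs)
    in_nbrs = {v: {u for (u, w) in arcs if w == v} for v in vertices}
    for v in vertices:
        for x, y in combinations(in_nbrs[v], 2):
            if ((x, y) in arcs) == ((y, x) in arcs):
                return False
    return True

# Example: a directed 3-cycle is an in-tournament (every vertex has one in-neighbor).
print(is_in_tournament({1, 2, 3}, {(1, 2), (2, 3), (3, 1)}))  # True
```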
The convex hull is one of the most fundamental and interesting geometric constructs in computational geometry. Considerable research effort has focused on developing algorithms, both serial and parallel, for computing convex hulls. In particular, few problems have had their parallel algorithms studied as thoroughly as convex hull problems. In this paper, we review parallel convex hull algorithms and their paradigms. We provide a summary of results and introduce several interesting topics, including typical techniques, output-size-sensitive methods, randomized approaches, and robust algorithms for convex hull problems, through which the highlights of the research on parallel algorithms can be seen. Most of our discussion uses the PRAM (Parallel Random Access Machine) computational model, but we also glance at results for other parallel computational models such as the mesh, mesh-of-trees, hypercube, and reconfigurable array, and at coarse-grained multicomputer models such as BSP and LogP.
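To make the divide-and-conquer paradigm concrete in a coarse-grained setting (a sketch of ours, not one of the PRAM algorithms surveyed): each worker hulls a chunk of the input independently, and a final hull of the union of the partial hulls gives the answer, since the hull of a point set equals the hull of the union of the hulls of any partition of it.

```python
from concurrent.futures import ProcessPoolExecutor

def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain, O(n log n); points are (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def parallel_convex_hull(points, workers=4):
    """Hull each chunk in parallel, then hull the union of the partial hulls."""
    chunks = [points[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as ex:
        partial = ex.map(convex_hull, chunks)
    return convex_hull([p for hull in partial for p in hull])
```

On platforms that spawn worker processes, the parallel call should sit under an `if __name__ == "__main__":` guard.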
ISBN:
(Print) 9781479904624
Determinants have been used intensively in a variety of applications throughout history and have influenced many fields of mathematics, such as linear algebra. The determinant of a square matrix can be computed using a variety of methods, including the well-known Leibniz formula and Laplace expansion, which compute the determinant of any NxN matrix in O(n!) time. However, decomposition methods, such as LU decomposition, Cholesky decomposition, and QR decomposition, have replaced the naive methods with a significantly reduced complexity of O(n^3). In this paper, we introduce two parallel algorithms, for Laplace expansion and for LU decomposition. We then analyze them and compare them with their respective sequential algorithms in terms of run time, speed-up, and efficiency; the new algorithms provide better results. At best, the parallel Laplace expansion is 129% faster, and the parallel LU decomposition is 44% faster.
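For reference, here are sequential baselines (our own illustrative sketches, not the paper's parallel algorithms) of the two approaches being parallelized: cofactor (Laplace) expansion along the first row with O(n!) work, and elimination with partial pivoting, i.e. an LU-style factorization, with O(n^3) work.

```python
import numpy as np

def det_laplace(A):
    """O(n!) Laplace (cofactor) expansion along the first row."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_laplace(minor)
    return total

def det_lu(A):
    """O(n^3) determinant via elimination with partial pivoting:
    the determinant is the signed product of the pivots."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    det = 1.0
    for k in range(n):
        p = k + np.argmax(np.abs(U[k:, k]))
        if U[p, k] == 0.0:
            return 0.0
        if p != k:
            U[[k, p]] = U[[p, k]]
            det = -det
        det *= U[k, k]
        U[k + 1:, k:] -= np.outer(U[k + 1:, k] / U[k, k], U[k, k:])
    return det

A = np.random.rand(6, 6)
print(det_laplace(A), det_lu(A), np.linalg.det(A))  # all three should agree
```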
Authors:
SAOUDI, A.; NIVAT, M.
L.I.P.N., Université Paris XIII, Institut Galilée, Av. J. B. Clément, 93400 Villetaneuse, France
L.I.T.P., Université Paris VII, 2 Place Jussieu, 75251 Paris Cedex 05, France
This paper presents efficient and optimal parallel algorithms for multidimensional image template matching on the CREW PRAM model. For an N^d image and M^d window, we present an optimal (resp. efficient) algorithm which run...
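Since the abstract above is truncated, only the problem itself is illustrated here: a brute-force baseline for 2-D template matching (our sketch, not the paper's PRAM algorithm), which tests every placement of an M x M window inside an N x N image and therefore performs O(N^2 M^2) work.

```python
import numpy as np

def template_match_2d(image, window):
    """Report every (row, col) position at which the window exactly matches the image."""
    N, M = image.shape[0], window.shape[0]
    hits = []
    for i in range(N - M + 1):
        for j in range(N - M + 1):
            if np.array_equal(image[i:i + M, j:j + M], window):
                hits.append((i, j))
    return hits
```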
We present efficient and scalable parallel algorithms for performing mathematical operations for low-rank tensors represented in the tensor train (TT) format. We consider algorithms for addition, elementwise multiplication, computing norms and inner products, orthonormalization, and rounding (rank truncation). These are the kernel operations for applications such as iterative Krylov solvers that exploit the TT structure. The parallel algorithms are designed for distributed-memory computation, and we propose a data distribution and strategy that parallelizes computations for individual cores within the TT format. We analyze the computation and communication costs of the proposed algorithms to show their scalability, and we present numerical experiments that demonstrate their efficiency on both shared-memory and distributed-memory parallel systems. For example, we observe better single-core performance than the existing MATLAB TT-Toolbox in rounding a 2GB TT tensor, and our implementation achieves a 34x speedup using all 40 cores of a single node. We also show nearly linear parallel scaling on larger TT tensors up to over 10,000 cores for all mathematical operations.
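To illustrate one of the kernel operations listed above, the following sketch forms the TT cores of a sum X + Y using the standard construction: interior cores are block-diagonal concatenations of the operands' cores, and the first and last cores are concatenated along their free rank dimension only (our illustrative code, not the paper's distributed implementation).

```python
import numpy as np

def tt_add(cores_x, cores_y):
    """TT cores of X + Y, given cores of shape (r_{k-1}, n_k, r_k)."""
    d = len(cores_x)
    out = []
    for k, (A, B) in enumerate(zip(cores_x, cores_y)):
        ra0, n, ra1 = A.shape
        rb0, _, rb1 = B.shape
        if k == 0:                      # 1 x n x (ra1 + rb1)
            C = np.concatenate([A, B], axis=2)
        elif k == d - 1:                # (ra0 + rb0) x n x 1
            C = np.concatenate([A, B], axis=0)
        else:                           # block-diagonal in the rank indices
            C = np.zeros((ra0 + rb0, n, ra1 + rb1), dtype=np.result_type(A, B))
            C[:ra0, :, :ra1] = A
            C[ra0:, :, ra1:] = B
        out.append(C)
    return out
```

Because the ranks add under this construction, repeated additions are typically followed by the rounding (rank truncation) operation mentioned in the abstract.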
ISBN:
(Print) 354043786X
Clusters of SMPs using fast networks, such as Myricom's Myrinet, have emerged as important platforms for high-performance computing. Although their advertised peak performance is very high, their real performance may be much lower for many applications. To achieve high performance, we need to take advantage of both the SMP and the cluster architecture. Based on the HPM model for parallel computing, the performance of clusters of SMP systems is analyzed, and principles for optimizing parallel algorithms (from both the parallelism and the locality point of view) are proposed. The influence of memory hierarchies on performance is strongly emphasized. Practical examples on the commercial clusters of SMPs Dawning D2000-2 and D3000 are also given.
ISBN:
(Digital) 9798350352917
ISBN:
(Print) 9798350352924; 9798350352917
We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, TS-SpGEMM, has important applications in multi-source breadth-first search, influence maximization, sparse graph embedding, and algebraic multigrid solvers. Unfortunately, popular distributed algorithms like sparse SUMMA deliver suboptimal performance for TS-SpGEMM. To address this limitation, we develop a novel distributed-memory algorithm tailored for TS-SpGEMM. Our approach employs customized 1D partitioning for all matrices involved and leverages sparsity-aware tiling for efficient data transfers. In addition, it minimizes communication overhead by incorporating both local and remote computations. On average, our TS-SpGEMM algorithm attains 5x performance gains over 2D and 3D SUMMA. Furthermore, we use our algorithm to implement multi-source breadth-first search and sparse graph embedding algorithms and demonstrate their scalability up to 512 nodes (65,536 cores) on NERSC Perlmutter.
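To show how the tall-skinny shape arises in one of the applications named above, here is a sequential SciPy sketch (ours; variable names are illustrative) of multi-source breadth-first search whose inner kernel is precisely a product of the square adjacency matrix with a tall-skinny frontier matrix.

```python
import numpy as np
import scipy.sparse as sp

def multi_source_bfs(A, sources):
    """Level-synchronous BFS from several sources at once.
    A: n x n sparse adjacency matrix (CSR), A[i, j] != 0 meaning edge i -> j.
    The frontier F is an n x s sparse matrix, so each expansion A.T @ F is a
    square-times-tall-skinny SpGEMM."""
    n = A.shape[0]
    s = len(sources)
    cols = np.arange(s)
    F = sp.csr_matrix((np.ones(s, dtype=np.int8), (sources, cols)), shape=(n, s))
    level = np.full((n, s), -1, dtype=np.int64)
    level[sources, cols] = 0
    depth = 0
    while F.nnz:
        depth += 1
        F = (A.T @ F).tocsr()              # TS-SpGEMM: expand every frontier at once
        rows, cs = F.nonzero()
        new = level[rows, cs] == -1        # keep only newly discovered vertices
        level[rows[new], cs[new]] = depth
        F = sp.csr_matrix((np.ones(int(new.sum()), dtype=np.int8),
                           (rows[new], cs[new])), shape=(n, s))
    return level
```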
ISBN:
(Print) 0818677430
We present simple parallel algorithms to recognize hypergraphs with the property that the vertices can be ordered so that the hyperedges form intervals. The algorithm circumvents PQ-trees and is a simplification of the algorithm of Klein and Reif [12]. This yields a simple parallel algorithm for interval graph recognition and for the recognition of convex bipartite graphs. The basic ideas are quite similar to the interval graph recognition algorithm of Hsu; the major idea is that we can circumvent the lexicographic breadth-first search procedure. With similar techniques, interval hypergraphs can also be recognized. With a little less work, interval hypergraphs can be recognized if a representation of the hyperedges as subtrees of a tree is known.
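As a small illustration of the property being recognized (not of the parallel recognition algorithm itself), the following sketch verifies that, under a given vertex ordering, every hyperedge occupies a contiguous block of positions; names are ours.

```python
def hyperedges_are_intervals(order, hyperedges):
    """Given a vertex ordering, check that every hyperedge occupies a
    contiguous block of positions, i.e. the hyperedges form intervals."""
    pos = {v: i for i, v in enumerate(order)}
    for edge in hyperedges:
        idx = sorted(pos[v] for v in edge)
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True

# Example: under the order [1, 2, 3, 4], {2, 3} is an interval but {1, 3} is not.
print(hyperedges_are_intervals([1, 2, 3, 4], [{2, 3}, {1, 2, 3}]))  # True
print(hyperedges_are_intervals([1, 2, 3, 4], [{1, 3}]))             # False
```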