ISBN: 3540258620 (print)
A parallel algorithm for the longest common subsequence (LCS) problem on LARPBS is presented. For two sequences of lengths m and n, the algorithm uses p processors and costs O(mn/p) computation time, where 1 <= p <= max {m, n}. The time-area cost of the algorithm is O(mn/p) and the memory space required is O((m+n)/p), all of which are optimal. We also show that this algorithm is scalable when the number of processors p satisfies 1 <= p <= max {m, n}. To the best of our knowledge, this is the fastest and cost-optimal parallel algorithm for the LCS problem on array architectures.
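For orientation, the O(mn) work that such parallel algorithms divide among p processors comes from the textbook LCS recurrence. The sketch below is that sequential recurrence only, kept to two rows of the table; it is not the LARPBS algorithm from the abstract, and the function name `lcs_length` is chosen here for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    if len(b) > len(a):
        a, b = b, a  # iterate rows over the longer sequence, so each row has min(m, n) + 1 cells
    prev = [0] * (len(b) + 1)  # DP row for the previous element of a
    for x in a:
        curr = [0]  # column 0: LCS with an empty prefix of b is 0
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)  # extend the diagonal match
            else:
                curr.append(max(prev[j], curr[j - 1]))  # drop one symbol
        prev = curr
    return prev[-1]
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that independence is the usual source of parallelism for this problem.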
ISBN: 9781611977929 (print)
The densest subgraph problem has received significant attention, both in theory and in practice, due to its applications in problems such as community detection, social network analysis, and spam detection. Due to the high cost of obtaining exact solutions, much attention has focused on designing approximate densest subgraph algorithms. However, existing approaches are not able to scale to massive graphs with billions of edges. In this paper, we introduce a new framework that combines approximate densest subgraph algorithms with a pruning optimization. We design new parallel variants of the state-of-the-art sequential Greedy++ algorithm, and plug them into our framework in conjunction with a parallel pruning technique based on k-core decomposition to obtain parallel (1+epsilon)-approximate densest subgraph algorithms. On a single thread, our algorithms achieve 2.6-34x speedup over Greedy++, and obtain up to 22.37x self-relative parallel speedup on a 30-core machine with two-way hyper-threading. Compared with the state-of-the-art parallel algorithm by Harb et al. [NeurIPS'22], we achieve up to a 114x speedup on the same machine. Finally, against the recent sequential algorithm of Xu et al. [PACMMOD'23], we achieve up to a 25.9x speedup. The scalability of our algorithms enables us to obtain near-optimal density statistics on the hyperlink2012 (with roughly 113 billion edges) and clueweb (with roughly 37 billion edges) graphs for the first time in the literature.
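As background, the classic single-pass greedy peeling heuristic (repeatedly delete a minimum-degree vertex and remember the densest intermediate subgraph, with density measured as edges per vertex) can be sketched as below. This is only the well-known sequential baseline that yields a 1/2-approximation; it is not Greedy++, nor the parallel framework described in the abstract. The function name and the edge-list input format are assumptions made for this illustration.

```python
import heapq

def greedy_peeling_density(n, edges):
    """Return (best_density, vertex_set) found by min-degree peeling.

    n: number of vertices, labelled 0..n-1 (assumed input convention).
    edges: iterable of undirected (u, v) pairs; self-loops are ignored.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    deg = [len(s) for s in adj]
    m = sum(deg) // 2  # edges in the current (remaining) subgraph
    heap = [(deg[v], v) for v in range(n)]
    heapq.heapify(heap)
    removed = [False] * n
    order = []  # vertices in the order they are peeled
    remaining = n
    best_density = m / n if n else 0.0
    best_prefix = 0  # number of peeled vertices at the best point
    while remaining > 1:
        d, v = heapq.heappop(heap)
        if removed[v] or d != deg[v]:
            continue  # stale heap entry, a fresher one exists
        removed[v] = True
        order.append(v)
        m -= deg[v]  # all edges from v to remaining vertices disappear
        remaining -= 1
        for w in adj[v]:
            if not removed[w]:
                deg[w] -= 1
                heapq.heappush(heap, (deg[w], w))
        density = m / remaining
        if density > best_density:
            best_density, best_prefix = density, len(order)
    gone = set(order[:best_prefix])
    return best_density, {v for v in range(n) if v not in gone}
```

For example, on a 4-clique with one pendant vertex attached, the sketch peels the pendant vertex first and reports the clique as the densest subgraph found.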
We give an overview of graph decomposition techniques to obtain fast algorithms for optimization problems on graphs. Based on the observation that most algorithmic problems can be solved easily on trees, these method...
ISBN: 9781611977967 (print)
In this work, we design, analyze, and optimize sequential and shared-memory parallel algorithms for partitioned local depths (PaLD). Given a set of data points and pairwise distances, PaLD is a method for identifying strength of pairwise relationships based on relative distances, enabling the identification of strong ties within dense and sparse communities even if their sizes and within-community absolute distances vary greatly. We design two algorithmic variants that perform community structure analysis through triplet comparisons of pairwise distances. We present theoretical analyses of computation and communication costs and prove that the sequential algorithms are communication optimal, up to constant factors. We introduce performance optimization strategies that yield sequential speedups of up to 29x over a baseline sequential implementation and parallel speedups of up to 26.2x over optimized sequential implementations using up to 32 threads on an Intel multicore CPU.
ISBN: 9783031498145; 9783031498152 (print)
Maximizing a non-negative, monotone, submodular function f over n elements under a cardinality constraint k (SMCC) is a well-studied NP-hard problem. It has important applications in, e.g., machine learning and influence maximization. Though the theoretical problem admits polynomial-time approximation algorithms, solving it in practice often involves frequently querying submodular functions that are expensive to compute. This has motivated significant research into designing parallel approximation algorithms in the adaptive complexity model; adaptive complexity (adaptivity) measures the number of sequential rounds of poly(n) function queries an algorithm requires. The state-of-the-art algorithms can achieve (1 - 1/e - ε)-approximate solutions with O((1/ε^2) log n) adaptivity, which approaches the known adaptivity lower bounds. However, the O((1/ε^2) log n) adaptivity only applies to maximizing worst-case functions that are unlikely to appear in practice. Thus, in this paper, we consider the special class of p-superseparable submodular functions, which places a reasonable constraint on f, based on the parameter p, and is more amenable to maximization, while also having real-world applicability. Our main contribution is the algorithm LS+GS, a finer-grained version of the existing LS+PGB algorithm, designed for instances of SMCC when f is p-superseparable; it achieves an expected (1 - 1/e - ε)-approximate solution with O((1/ε^2) log(pk)) adaptivity independent of n. Additionally, unrelated to p-superseparability, our LS+GS algorithm uses only O(ε^(-1) n + ε^(-2) log n) oracle queries, which has an improved dependence on ε^(-1) over the state-of-the-art LS+PGB; this is achieved through the design of a novel thresholding subroutine.
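To make the notion of adaptivity concrete, the classic sequential greedy algorithm for SMCC can be sketched on a coverage function (a standard example of a monotone submodular function). This is the textbook (1 - 1/e)-approximation baseline, not LS+GS or LS+PGB; the function name and the list-of-sets input are assumptions made for this illustration.

```python
def greedy_max_coverage(sets, k):
    """Pick at most k sets greedily to maximize the size of their union.

    sets: list of Python sets (assumed input format).
    Returns (chosen_indices, covered_elements).
    """
    chosen, covered = [], set()
    for _ in range(min(k, len(sets))):
        best_i, best_gain = None, 0
        # One round of queries: marginal gain of every unchosen set.
        for i, s in enumerate(sets):
            if i in chosen:
                continue
            gain = len(s - covered)
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:
            break  # no remaining set adds anything new
        chosen.append(best_i)
        covered |= sets[best_i]
    return chosen, covered
```

The marginal-gain queries inside one round are independent and could be issued in parallel, but each round needs the result of the previous one, so this baseline uses k adaptive rounds. Low-adaptivity algorithms aim to reduce that number of sequential rounds.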
ISBN: 3540662286 (print)
Parallel algorithms for solving almost linear systems are studied. A non-stationary parallel algorithm based on the multi-splitting technique and its extension to an asynchronous model are considered. Convergence properties of these methods are studied for M-matrices and H-matrices. We implemented these algorithms on two distributed memory multiprocessors, where we studied their performance in relation to overlapping of the splittings at each iteration.
ISBN: 9780769539393 (print)
This work analyses two different approaches to parallelise an exact algorithm for the solution of the Constrained Two-Dimensional Cutting Stock Problem. A fine-grained model based on the parallel execution of the generation loops is implemented through a shared-memory model using the OpenMP tool. Also, a coarse-grained model based on the parallel execution of the search loop and on the introduction of efficient synchronisation and load balancing schemes is implemented through a distributed-memory model using MPI. As a novelty, we have incorporated into the models the checking of dominance and duplication rules, thus affecting the search space and, consequently, the operation of the parallelisations. The experimental evaluation demonstrates that, even when the dominance/duplication tests are applied to the parallel algorithms, they are able to obtain an important improvement over the sequential approach.
In this paper, fast time- and space-parallel algorithms for solution of parabolic PDEs are developed. It is shown that the seemingly strictly serial time-stepping procedures for solution of problem can be completely ...
ISBN: 9780897915175 (print)
We provide optimal parallel solutions to several shortest path and visibility problems set in triangulated simple polygons. Let P be a triangulated simple polygon with n vertices, preprocessed to support shortest path queries. We can find the shortest path tree from any point inside P in O(log n) time using O(n/log n) processors. In the same bounds, we can preprocess P for shooting queries (a query can be answered in O(log n) time by a uniprocessor). Given a set S of m points inside P, we can find an implicit representation of the relative convex hull of S in O(log(nm)) time with O(m) processors. If the relative convex hull has k edges, we can explicitly produce these edges in O(log(nm)) time with O(k/log(nm)) processors. All of these algorithms are deterministic and use the CREW PRAM model.
We show that isomorphism of trees and outerplanar graphs can be tested in O(log n) time with n/log n processors on a CRCW PRAM and in O(log^2 n) time with n/log^2 n processors on an EREW PRAM. This gives the first optima...