ISBN (digital): 9798350355543
ISBN (print): 9798350355550
Large-scale graphs with billions or trillions of vertices and edges require efficient parallel algorithms for common graph problems, one of which is single-source shortest paths (SSSP). Bulk-synchronous parallel algorithms such as ∆-stepping incur large synchronization costs at the scale of many nodes, so asynchronous approaches are needed for scalability. However, asynchronous approaches are susceptible to wasteful speculative execution. We introduce ACIC, a highly asynchronous approach modulated by continuous concurrent introspection and adaptation. Using message-driven concurrent reductions and broadcasts, task-based scheduling, and an adaptive aggregation library, we explore techniques such as evolving windows and the generation and prioritized flow of optimal updates (edge relaxations), aimed at reducing speculative loss without constraining parallelism. Our results, while preliminary, demonstrate the promise of these ideas, with the potential to impact a wider class of graph algorithms.
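For readers unfamiliar with the bulk-synchronous baseline named in this abstract, the following is a minimal sequential sketch of ∆-stepping. It is not ACIC and not the authors' code; the function name `delta_stepping`, the adjacency-dict graph format, and the choice of `delta` are assumptions made for illustration. Each pass over a bucket corresponds to one synchronized phase in a parallel implementation, which is where the synchronization cost mentioned above comes from.

```python
import math
from collections import defaultdict


def delta_stepping(graph, source, delta):
    """Sequential sketch of delta-stepping SSSP.

    graph: dict mapping vertex -> list of (neighbor, non-negative weight).
    Returns a dict of shortest distances for reachable vertices.
    """
    dist = defaultdict(lambda: math.inf)
    buckets = defaultdict(set)  # bucket index -> vertices with tentative dist in that range

    def relax(v, d):
        # Move v to the bucket matching its improved tentative distance.
        if d < dist[v]:
            if dist[v] != math.inf:
                buckets[int(dist[v] // delta)].discard(v)
            dist[v] = d
            buckets[int(d // delta)].add(v)

    relax(source, 0)
    while True:
        live = [i for i, b in buckets.items() if b]
        if not live:
            break
        i = min(live)
        settled = set()
        # Light edges (weight <= delta) may re-insert vertices into bucket i,
        # so the bucket is processed repeatedly until it stays empty.
        while buckets[i]:
            frontier = buckets[i]
            buckets[i] = set()
            settled |= frontier
            for u in frontier:
                for v, w in graph.get(u, ()):
                    if w <= delta:
                        relax(v, dist[u] + w)
        # Heavy edges are relaxed once, after distances in bucket i are final.
        for u in settled:
            for v, w in graph.get(u, ()):
                if w > delta:
                    relax(v, dist[u] + w)
    return dict(dist)
```

A small `delta` makes the procedure behave like Dijkstra's algorithm (little wasted work, many phases), while a large `delta` approaches Bellman-Ford (few phases, more speculative relaxations).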
ISBN (digital): 9798350383027
ISBN (print): 9798350383034
Sequence alignment is a central method in bioinformatics, used to determine the degree of similarity between sequences such as DNA, RNA, and amino acids. Among the techniques used for sequence alignment, the Longest Common Subsequence (LCS) is one of the most prominent. This paper improves LCS efficiency through thread parallelization. Both the sequential and the parallel techniques build on the work of Wagner and Fischer (1974), and both consistently identify the same maximum LCS length. The parallel technique additionally applies multithreading, resource synchronization, and task decomposition. Together these yield a notable improvement in running time over the conventional iterative sequential approach. Both approaches were implemented and evaluated in NetBeans, an Integrated Development Environment (IDE) for the Java programming language. The findings show that the thread-parallel strategy outperforms the sequential one in LCS execution time.
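As a hedged illustration of the kind of task decomposition this abstract refers to, the sketch below computes the LCS length with the standard dynamic-programming table in two traversal orders: row by row, and by anti-diagonals. The abstract does not say which decomposition the paper uses, so the wavefront order here is an assumption, and the function names are hypothetical. The point is that every cell on one anti-diagonal depends only on earlier anti-diagonals, so those cells could be handed to separate threads.

```python
def lcs_length(a, b):
    """Row-by-row dynamic programming for the LCS length."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


def lcs_length_wavefront(a, b):
    """Same recurrence, visited by anti-diagonals (i + j = d).

    Cells on one anti-diagonal read only from diagonals d - 1 and d - 2,
    so the inner loop is the part a thread pool could split up.
    """
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for d in range(2, m + n + 1):
        for i in range(max(1, d - n), min(m, d - 1) + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]
```

Both functions return the same value for any pair of inputs; only the visiting order differs.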
We present an O(1)-round fully-scalable deterministic massively parallel algorithm for computing the min-plus matrix multiplication of unit-Monge matrices. We use this to derive an O(log n)-round fully-scalable massive...
Given a text and a pattern over an alphabet, the pattern matching problem searches for all occurrences of the pattern in the text. An equivalence relation ≈ is called a substring consistent equivalence relation (SCER...
Aiming at the complex structure of the existing deniable authentication image encryption methods based on public key cryptography and the high computational cost caused by many bilinear and modular power operations, a...
In 1988, Vazirani gave an NC algorithm for computing the number of perfect matchings in K_{3,3}-minor-free graphs by building on Kasteleyn's scheme for planar graphs, and stated that this "opens up the possibility of obtaining an NC algorithm for finding a perfect matching in K_{3,3}-free graphs." In this paper, we finally settle this 30-year-old open problem. Building on recent NC algorithms for planar and bounded-genus perfect matching by Anari and Vazirani and later by Sankowski, we obtain NC algorithms for perfect matching in any minor-closed graph family that forbids a one-crossing graph. This family includes several well-studied graph families including the K_{3,3}-minor-free graphs and K_5-minor-free graphs. Graphs in these families not only have unbounded genus, but can have genus as high as O(n). Our method applies as well to several other problems related to perfect matching. In particular, we obtain NC algorithms for the following problems in any family of graphs (or networks) with a one-crossing forbidden minor: (1) Determining whether a given graph has a perfect matching and, if so, finding one. (2) Finding a minimum-weight perfect matching in the graph, assuming that the edge weights are polynomially bounded. (3) Finding a maximum st-flow in the network, with arbitrary capacities. The main new idea enabling our results is the definition and use of matching-mimicking networks, small replacement networks that behave the same with respect to matching problems involving a fixed set of terminals, as the larger network they replace.
ISBN (digital): 9798350377675
ISBN (print): 9798350377682
Large-scale Internet-of-Things (IoT) networks enable intelligent applications and services, such as autonomous driving. As many users generate various datasets, federated learning in distributed IoT networks emerges as a way to learn from distinct datasets. To realize efficient and reliable communications in distributed networks, we propose a collaborative optimization model for resource-constrained federated learning using a joint design of wireless resource allocation and expected learning losses. Specifically, we first formulate a learning-oriented power allocation problem. Then, we derive a convergence bound and build the relationship between communications and learning. Finally, we develop an optimal algorithm based on majorization-minimization frameworks. The proposed algorithm is highly parallelizable, and extensive experimental results corroborate that optimal power allocation in distributed networks benefits efficient federated learning compared to the state-of-the-art benchmark algorithms.
ISBN (digital): 9781665497473
ISBN (print): 9781665497480
Querying the existence of an edge in a given graph or hypergraph is a building block in several algorithms. Hashing-based methods can be used for this purpose: the given edges are stored in a hash table in a preprocessing step, and the queries are then answered using lookup operations. While general hashing methods have fast lookup times in the average case, the worst-case run time is much higher. Perfect hashing methods take advantage of the fact that the items to be stored are all available in advance and construct a collision-free hash function for the given input, resulting in an optimal lookup time even in the worst case. We investigate an efficient shared-memory parallel implementation of a recently proposed perfect hashing method for hypergraphs. We experimentally compare the resulting parallel algorithms with the state of the art and demonstrate better run time and scalability on a set of hypergraphs corresponding to real-life sparse tensors.
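To make the idea of a collision-free table for a fixed edge set concrete, here is a minimal sketch of the classic two-level (FKS-style) perfect hashing scheme applied to ordinary graph edges. This is a stand-in technique, not the hypergraph method or the parallel implementation studied in the paper. The class name `PerfectEdgeSet`, the edge encoding, and the assumption of non-negative integer vertex IDs are all illustrative choices.

```python
import random

P = (1 << 61) - 1  # Mersenne prime; encoded keys are assumed to be smaller than P


def _h(key, a, b, size):
    # Universal hash family: ((a * key + b) mod P) mod size.
    return ((a * key + b) % P) % size


class PerfectEdgeSet:
    """Static edge set with worst-case constant-time membership queries.

    Assumes vertex IDs are non-negative integers (illustrative restriction).
    """

    def __init__(self, edges, seed=0):
        edges = set(edges)
        rng = random.Random(seed)
        self.stride = 1 + max((max(u, v) for u, v in edges), default=0)
        keys = [u * self.stride + v for u, v in edges]  # injective for v < stride
        assert all(0 <= k < P for k in keys)
        self.n = max(1, len(keys))
        # Level 1: retry until the bucket sizes keep total space linear.
        while True:
            self.a, self.b = rng.randrange(1, P), rng.randrange(P)
            buckets = [[] for _ in range(self.n)]
            for k in keys:
                buckets[_h(k, self.a, self.b, self.n)].append(k)
            if sum(len(b) ** 2 for b in buckets) <= 4 * self.n:
                break
        self.tables = [self._build_bucket(b, rng) for b in buckets]

    @staticmethod
    def _build_bucket(bucket, rng):
        # Level 2: a table of size len(bucket)^2, retried until collision-free.
        size = len(bucket) ** 2
        while True:
            a, b = rng.randrange(1, P), rng.randrange(P)
            slots = [None] * size
            for k in bucket:
                s = _h(k, a, b, size)
                if slots[s] is not None:
                    break  # collision: draw new parameters
                slots[s] = k
            else:
                return a, b, slots

    def contains(self, u, v):
        if not (0 <= u and 0 <= v < self.stride):
            return False
        key = u * self.stride + v
        a, b, slots = self.tables[_h(key, self.a, self.b, self.n)]
        return bool(slots) and slots[_h(key, a, b, len(slots))] == key
```

A query touches exactly one first-level entry and one second-level slot, which is the worst-case constant lookup the abstract contrasts with general hashing.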
It has been widely observed that there exists no universal best Multi-objective Evolutionary Algorithm (MOEA) dominating all other MOEAs on all possible Multi-objective Optimization Problems (MOPs). In this work, we a...
We present the first GPU-based parallel algorithm to efficiently update vertex coloring on large dynamic networks. For a single GPU, we introduce the concept of loosely maintained vertex color update, which reduces computation and memory requirements. For multiple GPUs in distributed environments, we propose priority-based ordering of vertices to reduce the communication time. We prove the correctness of our algorithms and experimentally demonstrate that for graphs of over 16 million vertices and over 134 million edges on a single GPU, our dynamic algorithm is as much as 20x faster than the state-of-the-art algorithm on static graphs. For larger graphs with over 130 million vertices and over 260 million edges, our distributed implementation with 8 GPUs produces updated color assignments within 160 milliseconds. In all cases, the proposed parallel algorithms produce comparable or fewer colors than state-of-the-art algorithms.
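The basic problem addressed here, repairing a coloring after the graph changes instead of recoloring from scratch, can be illustrated with a small sequential sketch. This is a plain greedy repair on edge insertion, not the loosely maintained update or the priority-based ordering proposed in the paper. The helper names and the rule of recoloring the endpoint with the larger ID are assumptions made only for this example.

```python
def smallest_free_color(v, adj, color):
    """Smallest non-negative color not used by any colored neighbor of v."""
    used = {color[u] for u in adj[v] if u in color}
    c = 0
    while c in used:
        c += 1
    return c


def insert_edge(u, v, adj, color):
    """Add edge (u, v) and repair the coloring locally.

    adj: dict vertex -> set of neighbors; color: dict vertex -> int.
    Only the endpoints of the new edge are ever recolored.
    """
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
    # Newly seen vertices get the smallest color free among their neighbors.
    for x in (u, v):
        if x not in color:
            color[x] = smallest_free_color(x, adj, color)
    # If the new edge created a conflict, recolor one endpoint.
    if color[u] == color[v]:
        w = max(u, v)  # arbitrary tie-break, chosen for illustration
        color[w] = smallest_free_color(w, adj, color)
```

Because a recolored endpoint always takes a color absent from its whole neighborhood, the coloring stays proper after every insertion, and the work per update is proportional to the endpoint's degree.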