The paper proposes dynamic parallel algorithms for connectivity and bipartiteness of undirected graphs that require constant time and O(n^{1/2+ε}) work on the CRCW PRAM model. The work of these algorithms almost matches ...
ISBN (print): 9781450369763
Higher order spectra (HOS) are a powerful tool in nonlinear time series analysis and they have been extensively used as feature representations in data mining, communications and cosmology domains. However, HOS estimation suffers from high computational cost and memory consumption. Any algorithm for computing the k-th order spectra on a dataset of size n needs Ω(n^{k-1}) time since the output size will be Ω(n^{k-1}) as well, which makes direct HOS analysis difficult for long time series, and further prohibits its direct deployment to resource-limited and time-sensitive applications. Existing algorithms for computing HOS are either inefficient or have been implemented on obsolete architectures. Thus it is essential to develop efficient generic algorithms for HOS estimations. In this paper, we present a package of generic sequential and parallel algorithms for computationally and memory efficient HOS estimations which can be employed on any parallel machine or platform. Our proposed algorithms largely reduce the HOS computational cost and memory usage in the spectrum multiplication and smoothing steps through carefully designed prefix sum operations. Moreover, we employ a matrix partitioning technique and design algorithms with optimal memory usage and present the parallel approaches on the PRAM and the mesh models. Furthermore, we implement our algorithms for both bispectrum and trispectrum estimations. We conduct extensive experiments and cross-compare the proposed algorithms' performance. Results show that our algorithms achieve state-of-the-art computational and memory efficiency, and our parallel algorithms achieve close to linear speedups.
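The abstract does not give the authors' prefix-sum design, but the work-efficient parallel primitive it builds on can be sketched with the classic Blelloch up-sweep/down-sweep scan (an illustrative sketch, not the paper's implementation; on a PRAM each level's inner loop runs in parallel, giving O(log n) time and O(n) work):

```python
def blelloch_scan(a):
    """Exclusive prefix sum via the up-sweep/down-sweep scan.

    Each level's loop body is independent across i, so on a PRAM the
    whole scan takes O(log n) time with O(n) total work.
    Assumes a power-of-two length for simplicity.
    """
    n = len(a)
    assert n and (n & (n - 1)) == 0, "power-of-two length for simplicity"
    x = list(a)
    # Up-sweep: build partial sums in place.
    d = 1
    while d < n:
        for i in range(2 * d - 1, n, 2 * d):
            x[i] += x[i - d]
        d *= 2
    # Down-sweep: propagate prefixes back down.
    x[n - 1] = 0
    d = n // 2
    while d >= 1:
        for i in range(2 * d - 1, n, 2 * d):
            x[i - d], x[i] = x[i], x[i] + x[i - d]
        d //= 2
    return x
```

For example, `blelloch_scan([3, 1, 7, 0, 4, 1, 6, 3])` yields the exclusive prefix sums `[0, 3, 4, 11, 11, 15, 16, 22]`.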
ISBN (print): 9781611975482
The Lovász Local Lemma (LLL) is a probabilistic tool which shows that, if a collection of "bad" events B in a probability space are not too likely and not too interdependent, then there is a positive probability that no bad events in B occur. Moser & Tardos (2010) gave sequential and parallel algorithms which transformed most applications of the variable-assignment LLL into efficient algorithms. A framework of Harvey & Vondrák (2015) based on "resampling oracles" extended this to give very general sequential algorithms for other probability spaces satisfying the Lopsided Lovász Local Lemma (LLLL). We describe a new structural property of resampling oracles which holds for all known resampling oracles, which we call "obliviousness." Essentially, it means that the interaction between two bad events B, B' depends only on the randomness used to resample B, and not on the precise state within B itself. This property has two major consequences. First, it is the key to achieving a unified parallel LLLL algorithm, which is faster than the previous, problem-specific algorithms of Harris (2016) for the variable-assignment LLLL and of Harris & Srinivasan (2014) for permutations. This new algorithm extends a framework of Kolmogorov (2016), and gives the first RNC algorithms for rainbow perfect matchings and rainbow Hamiltonian cycles of K_n. Second, this property allows us to build LLLL probability spaces out of a relatively simple "atomic" set of events. It was intuitively clear that existing LLLL spaces were built in this way; but the obliviousness property formalizes this and gives a way of automatically turning a resampling oracle for atomic events into a resampling oracle for conjunctions of them. Using this framework, we get the first sequential resampling oracle for rainbow perfect matchings on the complete s-uniform hypergraph K_n^(s) and the first commutative resampling oracle for Hamiltonian cycles of K_n.
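The Moser-Tardos resampling loop underlying these results can be sketched in the variable-assignment setting. Below is a minimal toy, with hypergraph 2-coloring as a hypothetical example (the function name and instance are mine, not the authors'): while any bad event (a monochromatic edge) occurs, resample only the variables it depends on.

```python
import random

def moser_tardos_coloring(n, edges, seed=0):
    """Moser-Tardos resampling for the variable-assignment LLL.

    Bad event B_e: hyperedge e is monochromatic under a 2-coloring of
    the n vertices.  While some bad event occurs, resample exactly the
    variables (vertex colors) that event depends on.  Under the LLL
    condition this terminates after O(#events) expected resamplings.
    """
    rng = random.Random(seed)
    color = [rng.randint(0, 1) for _ in range(n)]
    while True:
        bad = next((e for e in edges if len({color[v] for v in e}) == 1), None)
        if bad is None:
            return color  # no bad event occurs: a valid coloring
        for v in bad:  # resample only the variables of the bad event
            color[v] = rng.randint(0, 1)
```

The parallel versions discussed above resample a maximal independent set of occurring bad events per round instead of one at a time.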
A parallel system to solve complex computational problems involves multiple instruction, simultaneous flows, communication structures, synchronisation and competition conditions between processes, as well as mapping an...
Recently, the predicate detection problem was shown to be in the parallel complexity class NC. In this paper, we give some more work efficient parallel algorithms to solve the predicate detection problem on a distribu...
ISBN (digital): 9781728192192
ISBN (print): 9781728192208
Existing work-efficient parallel algorithms for floating-point prefix sums exhibit either good performance or good numerical accuracy, but not both. Consequently, prefix-sum algorithms cannot easily be used in scientific-computing applications that require both high performance and accuracy. We have designed and implemented two new algorithms, called CAST_BLK and PAIR_BLK, whose accuracy is significantly higher than that of the high-performing prefix-sum algorithm from the Problem Based Benchmark Suite, while running with comparable performance on modern multicore machines. Specifically, the root mean squared error of the PBBS code on a large array of uniformly distributed 64-bit floating-point numbers is 8 times higher than that of CAST_BLK and 5.8 times higher than that of PAIR_BLK. These two codes employ the PBBS three-stage strategy for performance, but they are designed to achieve high accuracy, both theoretically and in practice. A vectorization enhancement to these two scalar codes trades off a small amount of accuracy to match or outperform the PBBS code while still maintaining lower error.
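The abstract does not give CAST_BLK's or PAIR_BLK's internals, but the accuracy idea suggested by the name PAIR_BLK can be illustrated with generic pairwise (cascade) summation, whose rounding error grows like O(log n) rather than the O(n) of naive left-to-right accumulation (an illustrative sketch only, not the paper's code):

```python
import math

def pairwise_sum(xs):
    """Pairwise (cascade) summation.

    Recursively splitting the array halves the depth of the summation
    tree, so each value passes through only O(log n) roundings instead
    of the O(n) of a left-to-right running sum.
    """
    n = len(xs)
    if n <= 2:
        return sum(xs)
    mid = n // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])
```

On `[0.1] * 1000`, the pairwise sum is at least as close to the exactly rounded result (`math.fsum`) as the naive running sum is.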
All pairwise computation is defined as performing computation between every pair of the elements in a given dataset. It is often a necessary first step in a number of bioinformatics applications. Many of such applicat...
ISBN (print): 9781577358008
Multi-valued Decision Diagrams (MDDs) have been extensively studied in the last ten years. Recently, efficient algorithms implementing operators such as reduction, union, intersection, difference, etc., have been designed. They directly deal with the graph structure of the MDD, and a time reduction of several orders of magnitude in comparison to other existing algorithms has been observed. These operators have permitted a new look at MDDs, because extremely large MDDs can finally be manipulated, as shown by the models used to solve complex applications in music generation. However, MDDs become so large (50 GB) that minutes are sometimes required to perform some operations. In order to accelerate the manipulation of MDDs, parallel algorithms are required. In this paper, we introduce such algorithms. We carefully design them in order to overcome inherent difficulties of the parallelization of sequential algorithms, such as data dependencies, software lock-out, false sharing, and load balancing. As a result, we observe a speed-up, i.e., the ratio between sequential and parallel runtimes, growing linearly with the number of cores.
ISBN (digital): 9781728119441
ISBN (print): 9781728119458
In this article we consider using random mappings to solve sparse binary subset sums via collision search. A mapping is constructed that suits our purpose and two parallel algorithms are proposed based on known collision-finding techniques. Following the applicability of binary subset sums, results of this paper are relevant to learning parities with noise, decoding random codes and related problems.
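The generic primitive behind such collision search can be sketched with Floyd's cycle-finding on the functional graph of a mapping f (an illustrative sequential sketch; the paper's mapping construction and parallel variants are not reproduced here, and the function name is mine):

```python
def floyd_collision(f, x0):
    """Find a, b with a != b and f(a) == f(b) by Floyd cycle detection
    on the functional graph of f, using O(1) memory.

    Assumes x0 lies on a nonempty tail of the rho shape (i.e., x0 is
    not already on the cycle), so a genuine collision exists.
    """
    # Phase 1: tortoise/hare meet somewhere on the cycle.
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(f(hare))
    # Phase 2: walk one pointer from x0 and one from the meeting point;
    # they reach the cycle entry simultaneously, arriving from two
    # different predecessors -- that pair is the collision.
    a, b = x0, hare
    while f(a) != f(b):
        a, b = f(a), f(b)
    return a, b
```

For instance, on the mapping 0→1→2→3→4→5→3 (tail of length 3, cycle of length 3), the collision found is the pair of predecessors of the cycle entry 3, namely 2 and 5.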
ISBN (print): 9781450392648
This paper presents near-optimal deterministic parallel and distributed algorithms for computing (1+ε)-approximate single-source shortest paths in any undirected weighted graph. On a high level, we deterministically reduce this and other shortest-path problems to Õ(1) Minor-Aggregations. A Minor-Aggregation computes an aggregate (e.g., max or sum) of node-values for every connected component of some subgraph. Our reduction immediately implies:
- Optimal deterministic parallel (PRAM) algorithms with Õ(1) depth and near-linear work.
- Universally-optimal deterministic distributed (CONGEST) algorithms, whenever deterministic Minor-Aggregate algorithms exist. For example, an optimal Õ(hopDiameter_G)-round deterministic CONGEST algorithm for excluded-minor networks.
Several novel tools developed for the above results are interesting in their own right:
- A local iterative approach for reducing shortest path computations "up to distance D" to computing low-diameter decompositions "up to distance D/2". Compared to the recursive vertex-reduction approach of [Li20], our approach is simpler, suitable for distributed algorithms, and eliminates many derandomization barriers.
- A simple graph-based Õ(1)-competitive ℓ1-oblivious routing based on low-diameter decompositions that can be evaluated in near-linear work. The previous such routing [ZGY+20] was n^o(1)-competitive and required n^o(1) more work.
- A deterministic algorithm to round any fractional single-source transshipment flow into an integral tree solution.
- The first distributed algorithms for computing Eulerian orientations.
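What a single Minor-Aggregation computes can be shown with a small sequential sketch (a toy illustration only: the function name and the union-find realization are mine; the paper implements this primitive as a parallel/distributed graph operation):

```python
def minor_aggregate(n, subgraph_edges, values, op=max):
    """One Minor-Aggregation, sequentially: for every connected
    component of the given subgraph, combine the node values with `op`
    and report the result at every node of that component.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union the endpoints of every subgraph edge.
    for u, v in subgraph_edges:
        parent[find(u)] = find(v)

    # Fold each node's value into its component root's aggregate.
    agg = {}
    for v in range(n):
        r = find(v)
        agg[r] = values[v] if r not in agg else op(agg[r], values[v])
    return [agg[find(v)] for v in range(n)]
```

For example, with 5 nodes, subgraph edges (0,1), (1,2), (3,4) and values [5, 1, 9, 2, 7], every node of the component {0, 1, 2} reports the max 9 and every node of {3, 4} reports 7.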