Search results - Inner Mongolia University Library

12th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2021

Authors: Catalán, Sandra; Igual, Francisco D.; Rodríguez-Sánchez, Rafael; Herrero, José R.; Quintana-Ortí, Enrique S. (Universidad Complutense de Madrid, Madrid, Spain; Universitat Politècnica de Catalunya, Barcelona, Spain; Universitat Politècnica de València, Valencia, Spain)

ISBN: (print) 9781450383486

We take advantage of the new tasking features in OpenMP to propose advanced task-parallel algorithms for the inversion of dense matrices via Gauss-Jordan elimination. Our algorithms perform a partitioning of the matrix operand into two levels of tasks: The matrix is first divided vertically, by column blocks (or panels), in order to accommodate the standard partial pivoting scheme that ensures the numerical stability of the method. In addition, depending on the particular kernel to be applied, each panel is partitioned either horizontally by row blocks (tiles) or vertically by μ-panels (of columns), in order to extract sufficient task parallelism to feed a many-threaded general purpose processor (CPU). The results of the experimental evaluation show the performance benefits of the advanced tasking algorithms on an Intel Xeon Gold processor with 20 cores. © 2021 ACM.

Keywords: parallel algorithms
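As a rough illustration of the underlying numerical method only (not the blocked, task-parallel OpenMP algorithms the abstract describes), the following sketch inverts a small dense matrix by unblocked Gauss-Jordan elimination with partial pivoting. The function name `gauss_jordan_inverse` and the list-of-lists matrix representation are assumptions made for this example.

```python
def gauss_jordan_inverse(a):
    """Invert a square matrix (list of lists) by Gauss-Jordan elimination
    with partial pivoting. Unblocked and sequential: illustration only."""
    n = len(a)
    # Augment to [A | I]; row operations turn the left half into I
    # and the right half into the inverse of A.
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest magnitude in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot element becomes 1.
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row (above and below).
        for r in range(n):
            if r != col:
                f = m[r][col]
                if f != 0.0:
                    m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]
```

In a panel-based variant such as the one the abstract outlines, the pivot search would stay confined to one column block, while the elimination updates on the remaining rows and columns are the natural source of independent tasks.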

FEPAC: A Framework for Evaluating Parallel Algorithms on Cluster Architectures

2021 Australasian Computer Science Week Multiconference, ACSW 2021

Authors: Warade, Mehul; Schneider, Jean-Guy; Lee, Kevin (Deakin University, Australia)

ISBN: (print) 9781450389563

For many years, computer scientists have explored the computing power of so-called computing clusters to address performance requirements of computationally intensive tasks. Historically, computing clusters have been optimized with run-time performance in mind, but increasingly energy consumption has emerged as a second dimension that needs to be considered when optimizing cluster configurations. However, there is a lack of generally available tool support to experiment with cluster and algorithm configurations in order to identify "sweet spots" with regard to both run-time performance and energy consumption. In this work, we introduce FEPAC, a framework for the automated evaluation of parallel algorithms on different cluster architectures and different deployments of software processes to hardware nodes, allowing users to explore the impact of different configurations on run-time properties of their computations. As proof of concept, the utility of the framework is demonstrated on a custom-built Raspberry Pi 3B+ cluster using different types of parallel algorithms as benchmarks. The experiments evaluate matrix multiplication, k-means, and OpenCV on varying cluster sizes, and show that although a larger cluster improves performance, there is often a trade-off between energy and computation time. © 2021 ACM.

Keywords: parallel algorithms

Sequential and parallel algorithms for all-pair k-mismatch maximal common substrings

Journal of Parallel and Distributed Computing, 2020, vol. 144, pp. 68-79

Authors: Chockalingam, Sriram P.; Thankachan, Sharma V.; Aluru, Srinivas (Georgia Inst Technol, Inst Data Engn & Sci, 756 W Peachtree St NW, 12th Floor, Atlanta, GA 30308, USA; Georgia Inst Technol, Dept Computat Sci & Engn, 756 W Peachtree St NW, 13th Floor, Atlanta, GA 30308, USA; Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816, USA)

Identifying long pairwise maximal common substrings among a large set of sequences is a frequently used construct in computational biology, with applications in DNA sequence clustering and assembly. Due to errors made by sequencers, algorithms that can accommodate a small number of differences are of particular interest. Formally, let D be a collection of n sequences of total length N, phi be a length threshold, and k be a mismatch threshold. The goal is to identify and report all k-mismatch maximal common substrings of length at least phi over all pairs of strings in D. Heuristics based on seed-and-extend style filtering techniques are often employed in such applications. However, such methods cannot provide any provably efficient run time guarantees. To this end, we present a sequential algorithm with an expected run time of O(N log^k N + occ), where occ is the output size. We then present a distributed memory parallel algorithm with an expected run time of O((N/p log N + occ) log^k N) using O(log^(k+1) N) expected rounds of global communications, under some realistic assumptions, where p is the number of processors. Finally, we demonstrate the performance and scalability of our algorithms using experiments on large high throughput sequencing data. (C) 2020 Elsevier Inc. All rights reserved.

Keywords: Approximate sequence matching; String algorithms; Suffix trees; Hamming distance; parallel algorithms
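To make the problem statement concrete, here is a brute-force sketch for a single pair of strings. It is not the suffix-tree-based algorithm of the paper and carries none of its run-time guarantees. It assumes that "maximal" means the aligned pair cannot be extended by one character on either side without exceeding k mismatches. The function name and the `(i, j, length)` output format are hypothetical.

```python
def k_mismatch_maximal_common_substrings(x, y, k, phi):
    """Return sorted (i, j, length) triples such that x[i:i+length] and
    y[j:j+length] differ in at most k positions, cannot be extended in
    either direction without exceeding k mismatches, and length >= phi."""
    results = []
    # Each diagonal d = j - i is one alignment of x against y.
    for d in range(-(len(x) - 1), len(y)):
        i0, j0 = (-d, 0) if d < 0 else (0, d)
        span = min(len(x) - i0, len(y) - j0)
        # Mismatch offsets along the diagonal, with sentinels at both ends.
        mism = [t for t in range(span) if x[i0 + t] != y[j0 + t]]
        s = [-1] + mism + [span]
        if len(mism) <= k:
            # The whole diagonal fits within the mismatch budget.
            windows = [(0, span - 1)]
        else:
            # A maximal window holds exactly k mismatches and is bounded by
            # a mismatch (or the string boundary) on each side.
            windows = [(s[t] + 1, s[t + k + 1] - 1)
                       for t in range(len(mism) - k + 1)]
        for lo, hi in windows:
            length = hi - lo + 1
            if length >= phi:
                results.append((i0 + lo, j0 + lo, length))
    return sorted(results)
```

For a collection D, the same function could be applied to every pair of strings, for example by iterating over `itertools.combinations(D, 2)`. That quadratic pairing is the cost the paper's algorithms are designed to avoid.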

Massively Parallel Algorithms for b-Matching

arXiv, 2022

Authors: Ghaffari, Mohsen; Grunau, Christoph; Mitrović, Slobodan (ETH Zurich, Switzerland; UC Davis, United States)

This paper presents an O(log log d̄)-round massively parallel algorithm for (1 + ε)-approximation of maximum weighted b-matchings, using near-linear memory per machine. Here d̄ denotes the average degree in the graph and ε is an arbitrarily small positive constant. Recall that b-matching is the natural and well-studied generalization of the matching problem where different vertices are allowed to have multiple (and differing numbers of) incident edges in the matching. Concretely, each vertex v is given a positive integer budget b_v and it can have up to b_v incident edges in the matching. Previously, there were known algorithms with round complexity O(log log n), or O(log log Δ) where Δ denotes the maximum degree, for (1 + ε)-approximation of weighted matching and for maximal matching [Czumaj et al., STOC'18; Ghaffari et al., PODC'18; Assadi et al., SODA'19; Behnezhad et al., FOCS'19; Gamlath et al., PODC'19], but these algorithms do not extend to the more general b-matching problem. Copyright © 2022, The Authors. All rights reserved.

Keywords: parallel algorithms
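The budget constraint in the definition of b-matching can be illustrated with a short sequential sketch. This is a plain greedy heuristic, not the massively parallel algorithm of the paper, and it makes no claim to the (1 + ε) guarantee. The function names and the `(u, v, weight)` edge format are assumptions made for this example.

```python
def greedy_b_matching(edges, budget):
    """edges: iterable of (u, v, weight); budget: dict vertex -> b_v.
    Returns a list of chosen edges in which every vertex v has at most
    budget[v] incident edges (a valid b-matching, not necessarily optimal)."""
    remaining = dict(budget)
    chosen = []
    # Consider heavier edges first; take an edge when both endpoints
    # still have budget left.
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if u != v and remaining[u] > 0 and remaining[v] > 0:
            chosen.append((u, v, w))
            remaining[u] -= 1
            remaining[v] -= 1
    return chosen


def is_b_matching(chosen, budget):
    """Check the defining constraint: degree of v in the matching <= b_v."""
    degree = {v: 0 for v in budget}
    for u, v, _ in chosen:
        degree[u] += 1
        degree[v] += 1
    return all(degree[v] <= budget[v] for v in budget)
```

Setting every budget to 1 reduces the constraint to that of an ordinary matching, which is the special case the earlier algorithms cited in the abstract address.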

Exploring Statistics Around Kahn-Kalai Conjecture by Using Parallel Algorithms

International Conference of the Chilean Computer Science Society (SCCC)

Authors: Christopher A. Torres; Pablo E. Román (Departamento de Ingeniería Informática, Universidad de Santiago de Chile, Santiago, Chile)

ISBN: (print) 9781665456753

There are many questions about the statistical properties of random graphs, particularly those related to cyclic structures. However, theoretical advances have been made in the sparse connection regime. Recent results on the Kahn-Kalai conjecture show that there is a limiting connection probability beyond which it is very likely to find Hamiltonian cycles. It is shown that this probability is $P \sim \log(n)/n$, where $n$ is the number of nodes. We explore experimentally around this limit by showing its empirical statistical behavior. These results are useful in configuring various engineering problems based on sparse graphs.

Keywords: Backtracking; Markov processes; Very large scale integration; Software; Silicon; Reactive power; parallel algorithms

Performance Evaluation of Parallel Algorithms

arXiv, 2022

Authors: Ene, Donald; Anireh, Vincent Ike (Department of Computer Science, Rivers State University, Port Harcourt, Nigeria)

Evaluating how well a whole system or set of subsystems performs is one of the primary objectives of performance testing. We can tell via performance assessment if the architecture implementation meets the design objectives. Performance evaluations of several parallel algorithms are compared in this study. Both theoretical and experimental methods are used in performance assessment as a subdiscipline in computer science. The parallel method outperforms its sequential counterpart in terms of throughput. The parallel algorithm's performance (speedup) is examined, as shown in the result. © 2022, CC BY-NC-ND.

Keywords: parallel algorithms

Dynamic Constant Time Parallel Graph Algorithms with Sub-Linear Work

48th International Symposium on Mathematical Foundations of Computer Science, MFCS 2023

Authors: Schmidt, Jonas; Schwentick, Thomas (TU Dortmund University, Germany)

ISBN: (print) 9783959772921

The paper proposes dynamic parallel algorithms for connectivity and bipartiteness of undirected graphs that require constant time and O(n^(1/2+ε)) work on the CRCW PRAM model. The work of these algorithms almost matches the work of the O(log n) time algorithm for connectivity by Kopelowitz et al. (2018) on the EREW PRAM model and the time of the sequential algorithm for bipartiteness by Eppstein et al. (1997). In particular, we show that the sparsification technique, which has been used in both mentioned papers, can in principle also be used for constant time algorithms in the CRCW PRAM model, despite the logarithmic depth of sparsification trees. © Jonas Schmidt and Thomas Schwentick

Keywords: parallel algorithms

Work-Efficient Parallel Algorithms for Accurate Floating-Point Prefix Sums

IEEE High Performance Extreme Computing Conference (HPEC)

Authors: Fraser, Sean; Xu, Helen; Leiserson, Charles E. (MIT Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139, USA)

ISBN: (print) 9781728192192

Existing work-efficient parallel algorithms for floating-point prefix sums exhibit either good performance or good numerical accuracy, but not both. Consequently, prefix-sum algorithms cannot easily be used in scientific-computing applications that require both high performance and accuracy. We have designed and implemented two new algorithms, called CAST_BLK and PAIR_BLK, whose accuracy is significantly higher than that of the high-performing prefix-sum algorithm from the Problem Based Benchmark Suite, while running with comparable performance on modern multicore machines. Specifically, the root mean squared error of the PBBS code on a large array of uniformly distributed 64-bit floating-point numbers is 8 times higher than that of CAST_BLK and 5.8 times higher than that of PAIR_BLK. These two codes employ the PBBS three-stage strategy for performance, but they are designed to achieve high accuracy, both theoretically and in practice. A vectorization enhancement to these two scalar codes trades off a small amount of accuracy to match or outperform the PBBS code while still maintaining lower error.

Keywords: floating-point arithmetic; parallel algorithms; parallelism; prefix sums; span; summation; sum-depth; vectorization; work
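The abstract does not spell out the three-stage strategy. The sketch below assumes it is the common blocked scheme: reduce each block, scan the block sums, then scan within each block from its offset. It runs sequentially and uses plain floating-point addition, so it shows only the structure and none of the accuracy techniques of CAST_BLK or PAIR_BLK. The function name and the `block_size` parameter are hypothetical.

```python
from itertools import accumulate


def blocked_prefix_sum(values, block_size=4):
    """Inclusive prefix sum computed in three stages over fixed-size blocks.
    Stages 1 and 3 are independent per block, which is what a parallel
    implementation would exploit; here they run sequentially."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]
    if not blocks:
        return []
    # Stage 1: reduce each block to a single sum.
    block_sums = [sum(b) for b in blocks]
    # Stage 2: exclusive scan of the block sums gives each block's offset.
    offsets = [0.0] + list(accumulate(block_sums))[:-1]
    # Stage 3: scan inside each block, seeded with that block's offset.
    out = []
    for offset, block in zip(offsets, blocks):
        running = offset
        for x in block:
            running += x
            out.append(running)
    return out
```

Because floating-point addition is not associative, the blocked result can differ in the last bits from a straight left-to-right scan. That sensitivity to summation order is the accuracy concern the abstract raises.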

Parallel Algorithms for Predicate Detection

20th International Conference on Distributed Computing and Networking (ICDCN)

Authors: Garg, Vijay K.; Garg, Rohan (Univ Texas Austin, Austin, TX 78712, USA)

ISBN: (print) 9781450360944

Given a trace of a distributed computation and a desired predicate, the predicate detection problem is to find a consistent global state that satisfies the given predicate. The predicate detection problem has many applications in the testing and runtime verification of parallel and distributed systems. We show that many problems related to predicate detection are in the parallel complexity class NC, the set of decision problems decidable in polylogarithmic time on a parallel computer with a polynomial number of processors. Given a computation on n processes with at most m local states per process, our parallel algorithm to detect a given conjunctive predicate takes O(log mn) time and O(m^3 n^3 log mn) work. The sequential algorithm takes O(mn^2) time. For data race detection, we give a parallel algorithm that takes O(log mn log n) time, also placing that problem in NC. This is the first work, to the best of our knowledge, that places the parallel complexity of such predicate detection problems in the class NC.

Keywords: Predicate Detection; parallel algorithms; Data Race Detection

Parallel Algorithms for Evaluating Matrix Polynomials

48th International Conference on Parallel Processing (ICPP)

Authors: Toledo, Sivan; Waisel, Amit (Tel Aviv Univ, Blavatnik Sch Comp Sci, Tel Aviv, Israel)

ISBN: (print) 9781450362955

We develop and evaluate parallel algorithms for a fundamental problem in numerical computing, namely the evaluation of a polynomial of a matrix. The algorithm consists of many building blocks that can be assembled in several ways. We investigate parallelism in individual building blocks, develop parallel implementations, and assemble them into an overall parallel algorithm. We analyze the effects of both the dimension of the matrix and the degree of the polynomial on both arithmetic complexity and on parallelism, and we consequently propose which variants to use in different cases. Our theoretical results indicate that one variant of the algorithm, based on applying the Paterson-Stockmeyer method to the entire matrix, parallelizes very effectively on virtually any matrix dimension and polynomial degree. However, it is not the most efficient from the arithmetic complexity viewpoint. Another algorithm, based on the Davies-Higham block recurrence, is much more efficient from the arithmetic complexity viewpoint, but one of its building blocks is serial. Experimental results on a dual-socket 28-core server show that the first algorithm can effectively use all the cores, but that on high-degree polynomials the second algorithm is often faster, in spite of the sequential phase. This indicates that our parallel algorithms for the other phases are indeed effective.

Keywords: Matrix Polynomials; Polynomial Evaluation; parallel algorithms
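For readers unfamiliar with the Paterson-Stockmeyer method named in the abstract, the following is a minimal sequential sketch in pure Python. It is not the authors' parallel implementation. It assumes the polynomial is given as a coefficient list `coeffs` with `coeffs[i]` multiplying A^i, and all helper names are hypothetical. The idea is to precompute the powers A^0 through A^s with s close to the square root of the number of coefficients, then combine coefficient chunks of length s by Horner's rule in A^s.

```python
import math


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in row] for row in a]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def paterson_stockmeyer(coeffs, a):
    """Evaluate p(A) = sum(coeffs[i] * A**i) for a square matrix A."""
    n = len(a)
    if not coeffs:
        return [[0.0] * n for _ in range(n)]
    # Chunk length s = ceil(sqrt(number of coefficients)).
    s = math.isqrt(len(coeffs) - 1) + 1
    # Powers A^0 .. A^s; the first s of them are reused by every chunk.
    powers = [identity(n)]
    for _ in range(s):
        powers.append(mat_mul(powers[-1], a))
    a_s = powers[s]
    # Split the coefficients into chunks of s and combine the chunk
    # polynomials by Horner's rule in A^s, highest chunk first.
    chunks = [coeffs[i:i + s] for i in range(0, len(coeffs), s)]
    result = [[0.0] * n for _ in range(n)]
    for chunk in reversed(chunks):
        block = [[0.0] * n for _ in range(n)]
        for c, p in zip(chunk, powers):
            block = mat_add(block, mat_scale(c, p))
        result = mat_add(mat_mul(result, a_s), block)
    return result
```

With d + 1 coefficients this needs on the order of 2·sqrt(d) matrix multiplications instead of d. The chunk polynomials are mutually independent, which is consistent with the abstract's remark that this variant parallelizes well.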
