Search Results - Inner Mongolia University Library

6th International IEEE/ACM Workshop on Extreme Scale Programming Models and Middleware, ESPM2 2021

Authors: Yadav, Srinivas; Gupta, Nikunj; Reverdell, Auriane; Kaiser, Hartmut. Affiliations: Keshav Memorial Institute of Technology, Hyderabad, India; University of Illinois at Urbana-Champaign, Illinois, United States; Swiss National Supercomputing Centre, Zurich, Switzerland; Louisiana State University, Center for Computation Technology, Baton Rouge, United States

ISBN: (print) 9781665411400

Recent additions to the C++ standard and ongoing standardization efforts aim to add data-parallel types to the C++ standard library. This enables the use of vectorization techniques in existing C++ codes without having to rely on the C++ compiler's abilities to auto-vectorize the code's execution. The integration of the existing parallel algorithms with these new data-parallel types opens up a new way of speeding up existing codes with minimal effort. Today, only very little implementation experience exists for potential data-parallel execution of the standard parallel algorithms. In this paper, we report on experiences and performance analysis results for our implementation of two new data-parallel execution policies usable with HPX's parallel algorithms module: simd and par_simd. We utilize the new experimental implementation of data-parallel types provided by recent versions of the GCC and Clang C++ standard libraries. The benchmark results collected from artificial tests and real-world codes presented in this paper are very promising. Compared to sequenced execution, we report on speed-ups of more than three orders of magnitude when executed using the newly implemented data-parallel execution policy par_simd with HPX's parallel algorithms. We also report that our implementation is performance portable across different compute architectures (x64-Intel and AMD, and Arm), using different vectorization extensions (AVX2, AVX512, and NEON128). © 2021 IEEE.

Keywords: parallel algorithms

Parallel Minimum Cuts in O(m log^2 n) Work and Low Depth

33rd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2021

Authors: Anderson, Daniel; Blelloch, Guy E. Affiliation: Carnegie Mellon University, Pittsburgh, PA, United States

ISBN: (print) 9781450380706

We present a randomized O(m log^2 n) work, O(polylog n) depth parallel algorithm for minimum cut. This algorithm matches the work bounds of a recent sequential algorithm by Gawrychowski, Mozes, and Weimann [ICALP'20], and improves on the previously best parallel algorithm by Geissmann and Gianinazzi [SPAA'18], which performs O(m log^4 n) work in O(polylog n) depth. Our algorithm makes use of three components that might be of independent interest. Firstly, we design a parallel data structure that efficiently supports batched mixed queries and updates on trees. It generalizes and improves the work bounds of a previous data structure of Geissmann and Gianinazzi and is work efficient with respect to the best sequential algorithm. Secondly, we design a parallel algorithm for approximate minimum cut that improves on previous results by Karger and Motwani. We use this algorithm to give a work-efficient procedure to produce a tree packing, as in Karger's sequential algorithm for minimum cuts. Lastly, we design an efficient parallel algorithm for solving the minimum 2-respecting cut problem. © 2021 Owner/Author.

Keywords: parallel algorithms

FLEXIBLE PARALLEL ALGORITHMS FOR BIG DATA OPTIMIZATION

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Francisco Facchinei; Simone Sagratella; Gesualdo Scutari. Affiliations: Dpt. of Computer Control and Management Eng., University of Rome "La Sapienza", Roma, Italy; Dpt. of Electrical Eng., State University of New York (SUNY) at Buffalo, Buffalo, NY 14260, USA

ISBN: (print) 9781479928941

We propose a decomposition framework for the parallel optimization of the sum of a differentiable function and a (block) separable nonsmooth, convex one. The latter term is typically used to enforce structure in the solution as, for example, in LASSO problems. Our framework is very flexible and includes both fully parallel Jacobi schemes and Gauss-Seidel (Southwell-type) ones, as well as virtually all possibilities in between (e.g., gradient- or Newton-type methods) with only a subset of variables updated at each iteration. Our theoretical convergence results improve on existing ones, and numerical results show that the new method compares favorably to existing algorithms.
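To make the problem class concrete, here is a minimal sketch of one member of the family the abstract describes: a fully parallel (Jacobi-style) gradient-type scheme for the LASSO problem, minimizing 0.5*||Ax - b||^2 + lam*||x||_1. This is plain proximal gradient with soft-thresholding, not the authors' framework; the function names, the fixed step size derived from the Frobenius norm, and the iteration count are all assumptions chosen for illustration.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.| for a single coordinate."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_objective(A, b, x, lam):
    """Evaluate 0.5 * ||Ax - b||^2 + lam * ||x||_1."""
    residual = [sum(a * xj for a, xj in zip(row, x)) - bi
                for row, bi in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def jacobi_prox_gradient(A, b, lam, iterations=200):
    """Illustrative Jacobi-style proximal gradient iteration for LASSO.

    Assumes A is a non-zero matrix given as a list of rows.
    """
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm is an upper bound on the Lipschitz
    # constant of the gradient of the smooth term, so 1/bound is a safe step.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i]
                    for i in range(rows)]
        gradient = [sum(A[i][j] * residual[i] for i in range(rows))
                    for j in range(cols)]
        # Every coordinate is computed from the same old iterate, so the
        # updates are independent and could run in parallel.
        x = [soft_threshold(x[j] - step * gradient[j], step * lam)
             for j in range(cols)]
    return x
```

Updating only a chosen subset of coordinates per iteration, instead of all of them, would move this sketch toward the Gauss-Seidel end of the spectrum mentioned in the abstract.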

Keywords: Optimization Convergence Jacobian matrices Minimization Approximation methods parallel algorithms Approximation algorithms

DPTL+: Efficient parallel triangle listing on batch-dynamic graphs

37th IEEE International Conference on Data Engineering, ICDE 2021

Authors: Yu, Michael; Qin, Lu; Zhang, Ying; Zhang, Wenjie; Lin, Xuemin. Affiliations: University of New South Wales, Australia; University of Technology Sydney, AAII, Australia

ISBN: (print) 9781728191843

Triangle listing is an important topic in many practical applications. We have observed that this problem has not yet been studied systematically in the context of batch-dynamic graphs. In this paper, we aim to fill this gap by developing novel and efficient parallel solutions. Specifically, given a graph G and a batch-update of edges B, we report the updated triangles (deleted triangles and new triangles) resulting from the batch of updates. We notice that it is expensive to directly apply state-of-the-art triangle listing algorithms because they are designed to enumerate the complete set of triangles from a given graph, whereas only the updated ones are the relevant output for our problem setting. In this paper, we develop an efficient algorithm, namely DPTL, based on a newly designed orientation technique, which only outputs the updated triangles while ensuring that each triangle is identified without duplicates. We follow up by taking advantage of a graph's degree distribution and design a more sophisticated algorithm, namely DPTL+. We show that DPTL+ achieves the best performance in terms of both practical performance and theoretical time complexity. Our comprehensive experiments over 28 real-life large graphs show the superior performance of the DPTL+ algorithm when compared against DPTL and two baseline solutions. Theoretically, we also show that DPTL+ has a time complexity of Θ(∑_{⟨u,v⟩∈B} min{deg(u), deg(v)} + m), where deg(x) is the degree of a vertex x, and m is the number of edges adjacent to the vertices in the batch-update. This time complexity compares favorably with that of other solutions. © 2021 IEEE.
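The problem statement can be sketched in a few lines: for each updated edge, intersect the two endpoint neighborhoods while scanning only the smaller one, which is where a min{deg(u), deg(v)} term comes from. The code below is a sequential illustration under stated assumptions, not DPTL or DPTL+: it removes duplicates with a set of sorted tuples rather than the paper's orientation technique, and the names `apply_batch`, `inserted`, and `deleted` are hypothetical. It assumes a simple undirected graph, that deleted edges exist in the graph, and that no edge is both inserted and deleted in one batch.

```python
def common_neighbors(adj, u, v):
    """Return neighbors shared by u and v, scanning the smaller set only."""
    small, large = adj.get(u, set()), adj.get(v, set())
    if len(small) > len(large):
        small, large = large, small
    return [w for w in small if w in large]


def triangles_touching(adj, edges):
    """Collect triangles in adj that contain at least one of the given edges."""
    found = set()
    for u, v in edges:
        for w in common_neighbors(adj, u, v):
            # Sorting gives a canonical form, so a triangle hit through
            # several updated edges is stored once.
            found.add(tuple(sorted((u, v, w))))
    return found


def apply_batch(adj, inserted, deleted):
    """Apply a batch update in place; return (new_triangles, deleted_triangles)."""
    # Deleted triangles are looked up in the graph before the update.
    removed = triangles_touching(adj, deleted)
    for u, v in deleted:
        adj.get(u, set()).discard(v)
        adj.get(v, set()).discard(u)
    for u, v in inserted:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # New triangles are looked up in the graph after the update.
    added = triangles_touching(adj, inserted)
    return added, removed
```

For example, starting from the path 1-2-3 and inserting the edge (1, 3) reports the single new triangle (1, 2, 3); deleting the edge (1, 2) afterwards reports that same triangle as deleted.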

Keywords: Conferences Data engineering Time complexity parallel algorithms

Prime Number Sieving-A Systematic Review with Performance Analysis

Algorithms, 2024, vol. 17, no. 4, p. 157

Authors: Ghidarcea, Mircea; Popescu, Decebal. Affiliation: Univ Politehn Bucuresti, Comp Sci, Splaiul Independentei 313, Bucharest 060042, Romania

The systematic generation of prime numbers has been almost ignored since the 1990s, when most of the IT research resources related to prime numbers migrated to studies on the use of very large primes for cryptography, and little effort was made to further the knowledge regarding techniques like sieving. At present, sieving techniques are mostly used for didactic purposes, and no real advances seem to be made in this domain. This systematic review analyzes the theoretical advances in sieving that have occurred up to the present. The research followed the PRISMA 2020 guidelines and was conducted using three established databases: Web of Science, IEEE Xplore and Scopus. Our methodical review aims to provide an extensive overview of the progress in prime sieving; unfortunately, no significant advancements in this field were identified in the last 20 years.
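For readers unfamiliar with the term, sieving generates all primes up to a bound by crossing out multiples instead of testing each number separately. The sketch below is the classic Sieve of Eratosthenes, given only as a baseline illustration; the abstract does not say which sieves the review covers, and the function name is an assumption.

```python
def sieve_of_eratosthenes(limit):
    """Return all primes less than or equal to limit, in increasing order."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller
            # primes, so crossing out can start at p * p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]
```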

Keywords: prime numbers prime number sieving prime number generation algorithms algorithm optimization parallel algorithms

Lower Bounds for Parallel and Randomized Convex Optimization

Journal of Machine Learning Research, 2020, vol. 21, no. 1, pp. 1-31

Authors: Diakonikolas, Jelena; Guzman, Cristobal. Affiliations: Univ Wisconsin Madison, Dept Comp Sci, 1210 W Dayton St, Madison, WI 53706, USA; Pontificia Univ Catolica Chile, Inst Math & Computat Engn, Fac Math, Vicunna Mackenna 4860, Santiago 7820436, Chile; Pontificia Univ Catolica Chile, Sch Engn, Millennium Nucleus Ctr Discovery Structures Compl, Vicunna Mackenna 4860, Santiago 7820436, Chile

We study the question of whether parallelization in the exploration of the feasible set can be used to speed up convex optimization, in the local oracle model of computation and in the high-dimensional regime. We show that the answer is negative for both deterministic and randomized algorithms applied to essentially any of the interesting geometries and nonsmooth, weakly-smooth, or smooth objective functions. In particular, we show that it is not possible to obtain a polylogarithmic (in the sequential complexity of the problem) number of parallel rounds with a polynomial (in the dimension) number of queries per round. In the majority of these settings and when the dimension of the space is polynomial in the inverse target accuracy, our lower bounds match the oracle complexity of sequential convex optimization, up to at most a logarithmic factor in the dimension, which makes them (nearly) tight. Another conceptual contribution of our work is in providing a general and streamlined framework for proving lower bounds in the setting of parallel convex optimization. Prior to our work, lower bounds for parallel convex optimization algorithms were only known in a small fraction of the settings considered in this paper, mainly applying to Euclidean (ℓ_2) and ℓ_∞ spaces.

Keywords: lower bounds convex optimization parallel algorithms randomized algorithms non-Euclidean optimization

Hierarchical Model of Parallel Metaheuristic Optimization Algorithms

13th International Symposium on Intelligent Systems (INTELS)

Authors: Seliverstov, E. Y.; Karpenko, A. P. Affiliation: Bauman Moscow State Univ, Ul Baumanskaya 2 Ya5, Moscow 105005, Russia

The paper introduces a novel model of parallel metaheuristic optimization algorithms. The hierarchical graph model of a parallel optimization algorithm is proposed. It consists of the model for a parallel optimization algorithm at the top level of the hierarchy and the model for a sequential optimization algorithm at the bottom level. The unified representation of a metaheuristic optimization algorithm, which allows representing a class of metaheuristic algorithms, is used. The extension of the proposed model to the parametric hierarchical model is proposed. Graph model transformations for a parallel algorithm analysis and synthesis are introduced. The representation of several metaheuristic algorithms with the proposed model is discussed. (C) 2019 The Authors. Published by Elsevier B.V.

Keywords: evolutionary algorithms metaheuristic optimization parametric optimization parallel algorithms particle swarm optimization

A robust Delaunay-AFT based parallel method for the generation of large-scale fully constrained meshes

Computers & Structures, 2020, vol. 228, p. 1

Authors: Yu, Fei; Zeng, Yan; Guan, Z. Q.; Lo, S. H. Affiliations: Dalian Univ Technol, Dept Engn Mech, State Key Lab Struct Anal Ind Equipment, Dalian 116024, Peoples R China; Univ Hong Kong, Dept Civil Engn, Pokfulam Rd, Hong Kong, Peoples R China

Making full use of a sequential Delaunay-AFT mesher, a parallel method for the generation of large-scale tetrahedral meshes on distributed-memory machines is developed. To generate meshes with the required and the preserved properties, a Delaunay-AFT based domain decomposition (DD) technique is employed. Starting from the Delaunay triangulation (DT) covering the problem domain, this technique creates a layer of elements dividing the domain into several zones. The initially coarsely meshed domain is partitioned into DTs of subdomains which can be meshed in parallel. When the size of a subdomain is smaller than a user-specified threshold, it will be meshed with the standard Delaunay-AFT mesher. A two-level DD strategy is designed to improve the parallel efficiency of this algorithm. A dynamic load balancing scheme is also implemented using the Message Passing Interface (MPI). Out-of-core meshing is introduced to accommodate excessively large meshes that cannot be handled by the available memory of the computer (RAM). Numerical tests are performed for various complex geometries with thousands of surface patches. Ultra-large-scale meshes with more than ten billion tetrahedral elements have been created. Moreover, the meshes generated with different numbers of DD operations are nearly identical in quality, showing the consistency and the stability of the automatic decomposition algorithm. (C) 2019 Elsevier Ltd. All rights reserved.

Keywords: Finite element mesh generation parallel algorithms Domain decomposition Delaunay triangulations Delaunay-AFT Out-of-core

Solving Black-Scholes Equation Based on Time Domain Decomposition and Meshless Method

2022 International Joint Conference on Information and Communication Engineering, JCICE 2022

Authors: Duan, Yong; Zhu, Dongyuan. Affiliation: School of Mathematical Sciences, University of Electronic Science and Technology, Chengdu, China

ISBN: (digital) 9781665460675

ISBN: (print) 9781665460675

This paper solves the Black-Scholes equation for European options using a time-parallel algorithm combined with the Kansa method. First, the partial differential equation for the price of a derivative based on the stock price is obtained using efficient market theory, the no-arbitrage principle, and Itô's lemma. Then, the general heat conduction equation is solved by time domain decomposition coupled with a meshless method. Finally, a numerical example verifies that this computational scheme has high accuracy and validity. © 2022 IEEE.
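A numerical solver for the Black-Scholes equation is usually checked against the well-known closed-form price of a European option. The sketch below evaluates that closed form for a call and derives the put from put-call parity. It is not the paper's time-parallel Kansa scheme, the abstract does not state which reference the authors compare against, and all function and parameter names are assumptions.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, maturity):
    """Closed-form European call price with no dividends.

    Assumes spot, strike, volatility, and maturity are all positive.
    """
    sigma_sqrt_t = volatility * math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * volatility ** 2) * maturity) / sigma_sqrt_t
    d2 = d1 - sigma_sqrt_t
    discounted_strike = strike * math.exp(-rate * maturity)
    return spot * normal_cdf(d1) - discounted_strike * normal_cdf(d2)


def black_scholes_put(spot, strike, rate, volatility, maturity):
    """European put price obtained from the call via put-call parity."""
    call = black_scholes_call(spot, strike, rate, volatility, maturity)
    return call - spot + strike * math.exp(-rate * maturity)
```

With spot 100, strike 100, rate 0.05, volatility 0.2, and one year to maturity, the call evaluates to about 10.45, a common sanity value for this formula.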

Keywords: parallel algorithms

A GENERIC FINITE ELEMENT FRAMEWORK ON PARALLEL TREE-BASED ADAPTIVE MESHES

SIAM Journal on Scientific Computing, 2020, vol. 42, no. 6, pp. C436-C468

Authors: Badia, Santiago; Martin, Alberto F.; Neiva, Eric; Verdugo, Francesc. Affiliations: Monash Univ, Sch Math Sci, Clayton, Vic 3800, Australia; UPC, CIMNE, Ctr Int Metodes Numer Engn, Castelldefels 08860, Spain; Univ Politecn Cataluna, Dept Civil & Environm Engn, ES-08034 Barcelona, Spain

In this work we formally derive and prove the correctness of the algorithms and data structures in a parallel, distributed-memory, generic finite element framework that supports h-adaptivity on computational domains represented as forest-of-trees. The framework is grounded on a rich representation of the adaptive mesh suitable for generic finite elements that is built on top of a low-level, light-weight forest-of-trees data structure handled by a specialized, highly parallel adaptive meshing engine, for which we have identified the requirements it must fulfill to be coupled into our framework. Atop this two-layered mesh representation, we build the rest of the data structures required for the numerical integration and assembly of the discrete system of linear equations. We consider algorithms that are suitable for both subassembled and fully assembled distributed data layouts of linear system matrices. The proposed framework has been implemented within the FEMPAR scientific software library, using p4est as a practical forest-of-octrees demonstrator. A strong scaling study of this implementation when applied to Poisson and Maxwell problems reveals remarkable scalability up to 32.2K CPU cores and 482.2M degrees of freedom. Besides, a comparative performance study of FEMPAR and the state-of-the-art deal.II finite element software shows at least comparable performance, and at most a factor of 2-3 improvement in the h-adaptive approximation of a Poisson problem with first- and second-order Lagrangian finite elements, respectively.

Keywords: partial differential equations finite elements adaptive mesh refinement forest of trees parallel algorithms scientific software
