Search Results - Inner Mongolia University Library

A Multi-GPU Aggregation-Based AMG Preconditioner for Iterative Linear Solvers

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, Vol. 34, No. 8, pp. 2365-2376

Authors: Bernaschi, Massimo; Celestini, Alessandro; Vella, Flavio; D'Ambra, Pasqua. Affiliations: Inst Appl Comp IAC CNR, I-00185 Rome, Italy; Univ Trento, I-38122 Trento, Italy

We present and release in open source format a sparse linear solver which efficiently exploits heterogeneous parallel computers. The solver can be easily integrated into scientific applications that need to solve large and sparse linear systems on modern parallel computers made of hybrid nodes hosting Nvidia Graphics Processing Unit (GPU) accelerators. The work extends previous efforts of some of the authors in the exploitation of a single GPU accelerator and proposes an implementation, based on the hybrid MPI-CUDA software environment, of a Krylov-type linear solver relying on an efficient Algebraic MultiGrid (AMG) preconditioner already available in the BootCMatchG library. Our design for the hybrid implementation has been driven by the best practices for minimizing data communication overhead when multiple GPUs are employed, while preserving the efficiency of the GPU kernels. Strong and weak scalability results of the new version of the library on well-known benchmark test cases are discussed. Comparisons with the Nvidia AmgX solution show a speedup of up to 2.0x in the solve phase.

Keywords: GPU accelerators; heterogeneous computing; iterative sparse linear solvers; parallel numerical algorithms; scalability

Block row projection method based on M-matrix splitting

JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2018, Vol. 340, pp. 731-744

Authors: Zhang, Zhengyi; Sameh, Ahmed H. Affiliation: Purdue Univ, Dept Comp Sci, 305 N Univ St, W Lafayette, IN 47907, USA

We propose a hybrid sparse linear system solver based on M-matrix splitting and block-row projection (BRP). We split the sparse coefficient matrix A into two (nonsingular) M-matrices, and construct an augmented larger linear system which we solve using a BRP method. The robustness of BRP is compared with that of ILUT-preconditioned GMRES and of the sparse direct solver Pardiso. We also demonstrate the parallel scalability of BRP on a cluster of multicore nodes. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: numerical linear algebra; Krylov subspace methods; Preconditioners; Block row projection; M-matrix splitting; parallel numerical algorithms
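
As a rough illustration of the block-row projection idea named in the abstract above, the following minimal sketch applies a generic block Kaczmarz sweep to a small consistent system. It is not the authors' solver: the M-matrix splitting and the augmented system are not reproduced, and the function name `block_row_projection`, the equal-size row partition, and the stopping rule are assumptions made only for this example.

```python
import numpy as np

def block_row_projection(A, b, block_size, tol=1e-10, max_sweeps=1000):
    """Generic block Kaczmarz iteration for a consistent system A x = b."""
    m, n = A.shape
    # Assumed partition: consecutive row blocks of (at most) block_size rows.
    blocks = [slice(i, min(i + block_size, m)) for i in range(0, m, block_size)]
    # Pseudoinverse of each row block, computed once and reused in every sweep.
    pinvs = [np.linalg.pinv(A[s]) for s in blocks]
    x = np.zeros(n)
    for sweep in range(1, max_sweeps + 1):
        for s, P in zip(blocks, pinvs):
            # Project x onto the solution set of the block equations A[s] x = b[s].
            x = x + P @ (b[s] - A[s] @ x)
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, sweep
    return x, max_sweeps

# Small tridiagonal test matrix (an assumption, chosen because it is well conditioned).
A = 4.0 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1)
b = np.ones(4)
x, sweeps = block_row_projection(A, b, block_size=2)
```

Each block projection involves only the rows of that block, which is what makes schemes of this family attractive for parallel execution.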

A Comparison of Accuracy and Efficiency of Parallel Solvers for Fractional Power Diffusion Problems

12th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Ciegis, Raimondas; Starikovicius, Vadimas; Margenov, Svetozar; Kriauziene, Rima. Affiliations: Vilnius Gediminas Tech Univ, Sauletekis Ave 11, LT-10223 Vilnius, Lithuania; Bulgarian Acad Sci, Inst Informat & Commun Technol, Acad G Bonchev StBl 25A, BU-1113 Sofia, Bulgaria; Vilnius Univ, Inst Math & Informat, Akad St 4, LT-08663 Vilnius, Lithuania

ISBN: (print) 9783319780245; 9783319780238

In this paper, we construct and investigate parallel solvers for three-dimensional problems described by fractional powers of elliptic operators. The main aim is to make a scalability analysis of parallel versions of several state-of-the-art solvers. The originality of this work is that we also consider the accuracy of the selected numerical algorithms. For comparison of accuracy, we use solutions obtained by solving the test problem with the Fourier algorithm. Such analysis makes it possible to compare the efficiency of the proposed parallel algorithms depending on the required accuracy of the solution and on the number of processes used in the computations.

Keywords: Fractional diffusion; Finite volume method; parallel numerical algorithms; MPI; Scalability; Multigrid

Scalability analysis of different parallel solvers for 3D fractional power diffusion problems

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, Vol. 31, No. 19

Authors: Ciegis, Raimondas; Starikovicius, Vadimas; Margenov, Svetozar; Kriauziene, Rima. Affiliations: Vilnius Gediminas Tech Univ, Math Modelling Dept, Sauletekio Al 11, LT-10223 Vilnius, Lithuania; Bulgarian Acad Sci, Inst Informat & Commun Technol, Sofia, Bulgaria; Vilnius Univ, Inst Math & Informat, Vilnius, Lithuania

In this paper, we develop and investigate parallel numerical algorithms for three different state-of-the-art numerical methods for solving the non-local problems described by fractional powers of elliptic operators. These methods transform the non-local problem into some local differential problems of elliptic or parabolic type. To deal with the increase in computational complexity, a two-level parallelization approach is applied to construct efficient parallel algorithms using the domain decomposition and master-slave methods. We show and compare the serial and parallel solution times that are required to achieve similar accuracy of the solution using different algorithms. Results of extensive convergence tests are presented for a three-dimensional test problem with a known decrease of the solution's convergence rate depending on the fractional power coefficient. We analyze and discuss the non-trivial question of which parallel algorithm is recommended to achieve a certain accuracy for a given fractional power coefficient.

Keywords: convergence; fractional diffusion; fractional Laplacian; multigrid; parallel numerical algorithms; parallel scalability

Execution Behavior Analysis of Parallel Schemes for Implicit Solution Methods for ODEs

17th Annual International Symposium on Parallel and Distributed Computing (ISPDC)

Authors: Kalinnik, Natalia; Rauber, Thomas. Affiliation: Univ Bayreuth, Dept Comp Sci, Bayreuth, Germany

ISBN: (print) 9781538653302

In this article, we consider diagonal-implicitly iterated Runge-Kutta (DIIRK) methods for the numerical solution of stiff ordinary differential equations (ODEs) and investigate their performance behavior on a modern cluster system using MPI. DIIRK methods are implicit methods and require the solution of non-linear equation systems in each iteration step. In particular, we are interested in the parallel execution behavior when using different basis Newton methods for solving the resulting non-linear equation systems of different versions of the DIIRK method. We explore the use of direct solution methods based on LU factorization for the resulting linear equation systems as well as the use of Krylov subspace methods and investigate the resulting performance and accuracy.

Keywords: parallel numerical algorithms; implicit ODE methods; predictor-corrector methods

Supercomputer simulation of nonlinear problems of fluid dynamics in cores

LOBACHEVSKII JOURNAL OF MATHEMATICS, 2017, Vol. 38, No. 5, pp. 958-963

Authors: Podryga, V. O.; Polyakov, S. V.; Puzyrkov, D. V. Affiliation: Russian Acad Sci, Keldysh Inst Appl Math, Moscow 125047, Russia

This report focuses on the technology of supercomputer simulation of nonlinear processes in cores extracted from oil and gas production wells in order to study the properties of hydrocarbon reservoirs. One modern approach to solving this kind of problem is to create a multiphysical mathematical model of the core for study by computer methods. This approach minimizes the number of physical experiments and predicts the evolution of layer properties. It also allows the oil and gas recovery of layers to be predicted over a long time period. However, implementation of this technology, called "virtual core", requires the following: 1) creating a multiparametric model of the core that is as close as possible to reality; 2) taking into account the multicomponent and multiphase composition and the complex real geometry of the core; 3) developing a computational framework for modeling the seepage of multicomponent liquid and gas mixtures through the core; 4) carrying out large-scale calibration calculations. In this paper, an attempt is made to create such a multifactor mathematical model and the computational foundations for its computing and supercomputing analysis.

Keywords: Fluid dynamics; nonlinear processes in cores; mathematical modeling; parallel numerical algorithms; supercomputer simulations

A parallel sparse linear system solver based on Hermitian/skew-Hermitian splitting

COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2016, Vol. 72, No. 8, pp. 2000-2007

Authors: Zhang, Zhengyi; Sameh, Ahmed H. Affiliation: Purdue Univ, Dept Comp Sci, 305 N Univ St, W Lafayette, IN 47907, USA

In this paper we describe a parallel algorithm for solving large sparse nonsingular linear systems Ax = f, of order n, using the Hermitian/skew-Hermitian splitting approach for handling the augmented linear system, of order 2n, that arises from the linear least squares problem of minimizing the 2-norm of (f-Ax). We use restarted GMRES as the outer iteration with the Hermitian/skew-Hermitian splitting (HSS) preconditioner. In solving systems involving this preconditioner, the most time-consuming part deals with handling shifted skew-symmetric systems. We solve such systems using successive overrelaxation (SOR). Theoretical analysis shows that our solver always converges to the unique solution of Ax = f. We present several numerical experiments that demonstrate the robustness of our solver compared to other schemes, and show its parallel scalability on a single multicore node. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: numerical linear algebra; Krylov subspace methods; Preconditioners; Hermitian/Skew-Hermitian splitting; parallel numerical algorithms
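
To make the splitting named in the abstract above concrete, here is a minimal sketch of the classical stationary HSS iteration, which alternates solves with the shifted Hermitian part H = (A + A^*)/2 and the shifted skew-Hermitian part S = (A - A^*)/2. It is not the paper's method: the paper applies HSS as a preconditioner inside restarted GMRES on an augmented system and uses SOR for the shifted skew-symmetric solves, whereas this sketch uses dense direct solves. The function name `hss_solve`, the shift `alpha`, and the test matrix are assumptions, and the iteration is only expected to converge when the Hermitian part of A is positive definite.

```python
import numpy as np

def hss_solve(A, b, alpha=1.0, tol=1e-10, max_iter=500):
    """Plain stationary HSS iteration for A x = b (illustrative only)."""
    n = A.shape[0]
    I = np.eye(n)
    H = 0.5 * (A + A.conj().T)   # Hermitian part of A
    S = 0.5 * (A - A.conj().T)   # skew-Hermitian part of A
    x = np.zeros(n, dtype=A.dtype)
    for k in range(1, max_iter + 1):
        # Half step: solve with the shifted Hermitian part.
        x_half = np.linalg.solve(alpha * I + H, (alpha * I - S) @ x + b)
        # Full step: solve with the shifted skew-Hermitian part.
        x = np.linalg.solve(alpha * I + S, (alpha * I - H) @ x_half + b)
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, k
    return x, max_iter

# Assumed 2x2 example whose symmetric part diag(4, 3) is positive definite.
A = np.array([[4.0, 1.0], [-1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iters = hss_solve(A, b, alpha=3.5)
```

The choice of `alpha` affects the contraction rate; a value near the geometric mean of the extreme eigenvalues of H is a common heuristic.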

Parallel solvers for fractional power diffusion problems

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, Vol. 29, No. 24, pp. e4216.1-e4216.12

Authors: Ciegis, Raimondas; Starikovicius, Vadimas; Margenov, Svetozar; Kriauziene, Rima. Affiliations: Vilnius Gediminas Tech Univ, Sauletekio Ave 11, Vilnius LT-10223, Vilnius, Lithuania; Bulgarian Acad Sci, Inst Informat & Commun Technol, Acad G Bonchev StrBl 25A, BU-1113 Sofia, Bulgaria; Vilnius Univ, Inst Math & Informat, Akademijos Str 4, LT-08663 Vilnius, Lithuania

Mathematical models with fractional-order differential operators are computationally expensive due to the non-local nature of these operators. In this work, we construct and investigate parallel solvers for problems described by fractional powers of elliptic operators, like fractional diffusion. Three state-of-the-art approaches are used to transform the non-local fractional-order differential problem into local partial differential equation problems formulated in a space of higher dimension. Numerical schemes and parallel algorithms are developed for all three approaches. The resulting parallel algorithms have very different properties. We investigate the weak and strong scalability of the developed parallel algorithms and compare their parallel performance.

Keywords: fractional diffusion; fractional Laplacian; multigrid; parallel efficiency and scalability; parallel numerical algorithms

Parallel simulations for Fractional-Order Systems

18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)

Authors: Baban, Andrada; Bonchis, Cosmin; Fikl, Alexandru; Rosu, Florin. Affiliations: West Univ Timisoara, Bd V Parvan 4Cam 045B, RO-300223 Timisoara, Romania; eAustria Res Inst, Bd V Parvan 4Cam 045B, RO-300223 Timisoara, Romania; Univ Illinois, Dept Aerosp Engn, Champaign, IL, USA

ISBN: (print) 9781509057078

In this paper, we explore how numerical calculations can be accelerated by implementing several numerical methods for fractional-order systems using parallel computing techniques. We investigate the feasibility of parallel computing algorithms and their efficiency in reducing the computational costs over a large time interval. In particular, we present the case of the Adams-Bashforth-Moulton predictor-corrector method and measure the speedup of two parallel approaches by using GPU and HPC cluster implementations.

Keywords: Fractional-order systems; parallel numerical algorithms; GPU processing; HPC processing
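
The predictor-corrector scheme mentioned in the abstract above can be sketched serially as follows. This is a minimal version of the standard fractional Adams-Bashforth-Moulton method for a scalar Caputo problem D^alpha y = f(t, y), y(0) = y0, with 0 < alpha <= 1. It is not the authors' GPU or cluster code, and the function name `fractional_abm` and the test problems are assumptions made for illustration.

```python
import math

def fractional_abm(f, y0, alpha, t_end, n_steps):
    """Fractional Adams-Bashforth-Moulton scheme for 0 < alpha <= 1 (serial sketch)."""
    h = t_end / n_steps
    t = [j * h for j in range(n_steps + 1)]
    y = [y0]
    fv = [f(t[0], y0)]                      # stored history of f(t_j, y_j)
    c_pred = h**alpha / math.gamma(alpha + 1)
    c_corr = h**alpha / math.gamma(alpha + 2)
    for n in range(n_steps):
        # Predictor: fractional rectangle rule over the whole history.
        pred_sum = sum(((n + 1 - j)**alpha - (n - j)**alpha) * fv[j]
                       for j in range(n + 1))
        y_pred = y0 + c_pred * pred_sum
        # Corrector: fractional trapezoidal rule over the whole history.
        corr_sum = (n**(alpha + 1) - (n - alpha) * (n + 1)**alpha) * fv[0]
        for j in range(1, n + 1):
            corr_sum += ((n - j + 2)**(alpha + 1) + (n - j)**(alpha + 1)
                         - 2 * (n - j + 1)**(alpha + 1)) * fv[j]
        y_next = y0 + c_corr * (f(t[n + 1], y_pred) + corr_sum)
        y.append(y_next)
        fv.append(f(t[n + 1], y_next))
    return t, y

# alpha = 1 reduces to a classical trapezoidal predictor-corrector: y' = -y, y(0) = 1.
t1, y1 = fractional_abm(lambda t, y: -y, 1.0, 1.0, 1.0, 100)

# alpha = 0.5 with right-hand side chosen so that the exact solution is y(t) = t.
t2, y2 = fractional_abm(lambda t, y: math.sqrt(t) / math.gamma(1.5), 0.0, 0.5, 1.0, 100)
```

Every step sums over the full history, so the serial cost grows quadratically with the number of steps; these history sums are the natural target for parallelization over long time intervals.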

SCALABLE HETEROGENEOUS CPU-GPU COMPUTATIONS FOR UNSTRUCTURED TETRAHEDRAL MESHES

IEEE MICRO, 2015, Vol. 35, No. 4, pp. 6-15

Authors: Langguth, Johannes; Sourouri, Mohammed; Lines, Glenn Terje; Baden, Scott B.; Cai, Xing. Affiliation: Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92103, USA

Multicore CPUs can be combined with GPUs to perform computations over 3D unstructured meshes on heterogeneous CPU-GPU clusters. The authors explain how to unlock the CPUs' computing power without slowing down other tasks related to data movement. By solving the representative diffusion equation using the cell-centered finite volume method, the authors demonstrate that combining the computing capacity of CPUs and GPUs delivers a performance advantage over the GPU-only approach.

Keywords: Computer programs; emerging technologies; GPUs; Graphics processing units; Instruction sets; irregular meshes; Mathematical model; multicore processors; parallel numerical algorithms; Particle separators; Performance evaluation; performance optimization; sparse linear algebra; Three-dimensional displays
