Search Results - Inner Mongolia University Library

Hybrid parallelization of Euler-Lagrange simulations based on MPI-3 shared memory

ADVANCES IN ENGINEERING SOFTWARE, 2022, Vol. 174

Authors: Kopper, Patrick; Copplestone, Stephen M.; Pfeiffer, Marcel; Koch, Christian; Fasoulas, Stefanos; Beck, Andrea. Affiliations: Univ Stuttgart, Inst Aircraft Prop Syst, Pfaffenwaldring 6, Stuttgart, Germany; Boltzpl Numer Plasma Dynam GmbH, Schelmenwasenstr 34, Stuttgart, Germany; Univ Stuttgart, Inst Space Syst, Pfaffenwaldring 29, Stuttgart, Germany; Univ Stuttgart, Inst Aerodynam & Gas Dynam, Pfaffenwaldring 21, Stuttgart, Germany

The use of Euler-Lagrange methods on unstructured grids extends their application area to more versatile setups. However, the lack of a regular topology limits the scalability of distributed parallel methods, especially for routines that perform a physical search in space. One of the most prominent slowdowns is the search for halo elements in physical space for the purpose of runtime communication avoidance. In this work, we present a new communication-free halo element search algorithm utilizing the MPI-3 shared memory model. This novel method eliminates the severe performance bottleneck of many-to-many communication during initialization compared to the distributed parallelization approach and extends the possible applications beyond those achievable with the previous approach. Building on these data structures, we then present methods for efficient particle emission, scalable deposition schemes for particle-field coupling, and latency hiding approaches. The scaling performance of the proposed algorithms is validated through plasma dynamics simulations of an open-source framework on a massively parallel system, demonstrating an efficiency of up to 80% on 131072 cores.

Keywords: High-performance computing; Hybrid parallel programming; Shared memory; Particle-In-Cell; Discontinuous Galerkin spectral element; Halo region

PERFORMANCE EVALUATION OF PROGRAMMING MODELS FOR SMP-BASED CLUSTERS

JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, 2008, Vol. 31, No. 7, pp. 1181-1188

Authors: Lee, Myungho; Park, Neungsoo; Ro, Won W.; Li, Kuan-Ching. Affiliations: Providence Univ, Dept Comp Sci & Informat Engn, Taichung, Taiwan; Myong Ji Univ, Dept Comp Software, Yongin, Kyung Ki Do, South Korea; Konkuk Univ, Dept Comp Sci & Engn, Seoul, South Korea; Yonsei Univ, Dept Elect Engn, Seoul 120749, South Korea

Recently, computing clusters based on shared-memory multiprocessors (SMPs) have become popular for high performance computing (HPC) applications. With the recent prevalence of multi-core CPUs, which are small-scale SMPs themselves, SMP clusters will become increasingly popular in the near future. SMP clusters have characteristics of both SMPs and MPPs. Therefore, developing parallel programs that efficiently exploit the characteristics of both SMP and MPP in SMP clusters is a challenging task. Standard parallel programming models such as MPI, OpenMP, or hybrid (a combination of the two former models) are commonly used for SMP clusters. Depending on the characteristics of the application, however, some programming models are better than others. Identifying and selecting a suitable programming model for an application on SMP clusters requires a considerable amount of analysis of the application's behavior and performance. In this paper, we conduct experimental studies to evaluate the benefits and limits of MPI and OpenMP on three SMP-based systems using standard HPC applications parallelized using MPI, OpenMP, and the hybrid model. The performance results and final analysis may lead to an optimal programming model for the applications.

Keywords: high performance computing; MPI; OpenMP; hybrid parallel programming; cluster of SMPs

Hybrid parallel iterative sparse linear solver framework for reservoir geomechanical and flow simulation

JOURNAL OF COMPUTATIONAL SCIENCE, 2021, Vol. 51

Authors: Gasparini, Leonardo; Rodrigues, Jose R. P.; Augusto, Douglas A.; Carvalho, Luiz M.; Conopoima, Cesar; Goldfeld, Paulo; Panetta, Jairo; Ramirez, Joao P.; Souza, Michael; Figueiredo, Mateus O.; Leite, Victor M. D. M. Affiliations: PETROBRAS R&D Ctr CENPES, Ave Horacio Macedo 950, BR-21941915 Rio De Janeiro, RJ, Brazil; Fundacao Oswaldo Cruz, Ave Brasil 4365, BR-21040900 Rio De Janeiro, RJ, Brazil; Univ Estado Rio De Janeiro, Rua Sao Francisco Xavier 524, BR-20559900 Rio De Janeiro, RJ, Brazil; Rio de Janeiro Fed Univ, Ave Athos da Silveira Ramos 149, BR-21941909 Rio De Janeiro, RJ, Brazil; Aeronaut Inst Technol ITA, Praca Marechal Eduardo Gomes 50, BR-12228900 Sao Jose Dos Campos, SP, Brazil; Univ Fed Ceara, Ave Humberto Monte S-N, Campus Pici, Bloco 914, BR-60455760 Fortaleza, Ceara, Brazil; SENAI CIMATEC Supercomp Ctr Ind Innovat, Ave Orlando Gomes 1845, BR-41650010 Salvador, BA, Brazil

We discuss new developments of a hybrid parallel iterative sparse linear solver framework focused on petroleum reservoir flow and geomechanical simulation. It runs efficiently on several platforms, from desktop workstations to clusters of multicore nodes, with or without multiple GPUs, using a two-tier hierarchical architecture for distributed matrices and vectors. Results show good parallel scalability. Comparisons with a well-established library and a proprietary commercial solver indicate that our solver is competitive with the best available tools. We present results of the solver's application to simulations of real and synthetic reservoir models of up to billions of unknowns, running on CPUs and GPUs on up to 2000 processes.

Keywords: Iterative sparse linear solver; Hybrid parallel programming; GPU; Reservoir simulation; Geomechanical simulation

A hybrid CPU-GPU-MIC algorithm for minimal hitting set enumeration

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, Vol. 31, No. 18

Authors: Carastan-Santos, Danilo; Martins, David C., Jr.; Song, Siang W.; Rozante, Luiz C. S.; de Camargo, Raphael Y. Affiliations: Univ Fed ABC, Ctr Math Comp & Cognit, Santo Andre, Brazil; Univ Grenoble Alpes, CNRS, INRIA, Grenoble INP, LIG, Grenoble, France; Univ Sao Paulo, Inst Matemat & Estat, Sao Paulo, Brazil

We present a hybrid exact algorithm for the Minimal Hitting Set (MHS) Enumeration Problem for highly heterogeneous CPU-GPU-MIC platforms. With several techniques that permit an efficient exploitation of each architecture, low communication cost, and effective load balancing, we were able to enumerate MHSs for large instances in reasonable time, achieving good performance and scalability. We obtained speedups of up to 25.32 in comparison with using two six-core CPUs and we also enumerated MHSs for instances with tens of thousands of variables in less than 5 hours. We also evaluated our algorithm with a real-world driven dataset, and with a large CPU-GPU cluster, we unprecedentedly enumerated in parallel large minimal hitting sets of this dataset in less than 8 hours. These results reinforce the statement that heterogeneous clusters of CPUs, GPUs, and MICs can be used efficiently for high-performance computing.

Keywords: GPU; high performance computing; hitting set; hybrid parallel programming; MIC
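
For readers unfamiliar with the problem named in this record, the following is a minimal sequential sketch of minimal hitting set enumeration. It is not the hybrid CPU-GPU-MIC algorithm from the abstract; it is a brute-force illustration that assumes a small, non-empty family of sets, and the function name `minimal_hitting_sets` is hypothetical.

```python
from itertools import combinations


def minimal_hitting_sets(sets):
    """Enumerate all minimal hitting sets of a non-empty family of sets.

    Brute-force illustration only: candidates are tried in order of
    increasing size, so a candidate that hits every set and contains no
    previously found hitting set must itself be minimal.
    """
    universe = sorted(set().union(*sets))
    found = []
    for size in range(1, len(universe) + 1):
        for candidate in combinations(universe, size):
            cand = set(candidate)
            # A superset of a known hitting set cannot be minimal.
            if any(h <= cand for h in found):
                continue
            # A hitting set intersects every set in the family.
            if all(cand & s for s in sets):
                found.append(cand)
    return found


if __name__ == "__main__":
    family = [{1, 2}, {2, 3}, {3, 4}]
    print(minimal_hitting_sets(family))
```

The running time grows exponentially with the size of the universe, which is why instances with tens of thousands of variables call for the parallel techniques the abstract describes.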

A Hybrid Parallel Implementation for the Maximum Flow Problem

Euromicro International Conference on Parallel, Distributed and Network-Based Processing

Authors: Marco A. Stefanes; Luiz F. Alvino. Affiliations: Computing College, Federal University of Mato Grosso do Sul, Campo Grande, Brazil; Federal Institute of Mato Grosso do Sul, Campo Grande, Brazil

The maximum flow problem is a classical combinatorial problem with many applications. In this work, a hybrid parallel algorithm using both multi-core and many-core technologies for computing the maximum flow in a network is presented. The proposed implementation is applicable in an OpenMP/CUDA-enabled computing environment. To improve the performance, two strategies were implemented: an adaptive approach where the algorithm alternates between GPU and CPU processing according to the number of active nodes, and implementations of the global relabeling and gap relabeling heuristics in the multi-core approach. When compared against the best sequential implementation, the speedups range from 2.36 to 5.38 across several kinds of graphs. Results show that the proposed algorithm is faster than previous parallel implementations on CPU/GPUs for all kinds of tested graphs.

Keywords: Maximum flow; Push-relabel method; Hybrid parallel programming; Multi/many-core architectures
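
As background for the push-relabel method listed in the keywords, here is a minimal sequential sketch using a FIFO queue of active nodes. It is not the authors' OpenMP/CUDA implementation and omits the global relabeling and gap relabeling heuristics mentioned in the abstract; the function name `max_flow_push_relabel` and the dense-matrix representation are assumptions made for brevity.

```python
from collections import deque


def max_flow_push_relabel(n, edges, s, t):
    """Sequential FIFO push-relabel on nodes 0..n-1.

    `edges` is a list of (u, v, capacity) triples. Illustrative sketch:
    residual capacities are kept in a dense n x n matrix.
    """
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c
    height = [0] * n
    excess = [0] * n
    height[s] = n
    active = deque()
    # Initial preflow: saturate every edge leaving the source.
    for v in range(n):
        if v != s and cap[s][v] > 0:
            f = cap[s][v]
            cap[s][v] -= f
            cap[v][s] += f
            excess[v] += f
            if v != t:
                active.append(v)
    while active:
        u = active.popleft()
        # Discharge u: push along admissible edges, relabel when stuck.
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                if cap[u][v] > 0 and height[u] == height[v] + 1:
                    d = min(excess[u], cap[u][v])
                    cap[u][v] -= d
                    cap[v][u] += d
                    excess[u] -= d
                    # v becomes active only if it had no excess before.
                    if v != s and v != t and excess[v] == 0:
                        active.append(v)
                    excess[v] += d
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbor.
                height[u] = 1 + min(
                    height[v] for v in range(n) if cap[u][v] > 0
                )
    return excess[t]


if __name__ == "__main__":
    g = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
    print(max_flow_push_relabel(4, g, 0, 3))
```

The set of active nodes maintained in the queue is the quantity the abstract's adaptive strategy monitors when deciding whether to process on the GPU or the CPU.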
