ISBN (Print): 9781479984480
Inverse problems arise in various areas of science and engineering. These problems are not only difficult to solve numerically, but they also require a large amount of computer resources, both in time and in memory. It is therefore not surprising that inverse problems are often solved using techniques from high-performance computing. We consider the parallelization of an inverse problem in the field of geothermal reservoir engineering. In this particular scientific application, the underlying software package is already parallelized using the shared-memory programming paradigm OpenMP. Here, we present an extension of this parallelization to distributed memory, enabling a hybrid OpenMP/MPI parallelization. The situation differs from the standard approach to hybrid parallel programming because the data structures of the OpenMP-parallelized code differ from those in the serial implementation. We exploit this transformation of the data structures in our distributed-memory strategy for parallelizing an ensemble Kalman filter, a particular method for the solution of inverse problems. We describe this novel parallelization strategy, introduce a performance model, and present timing results on a compute cluster using two-socket nodes, each socket equipped with a 6-core Intel Xeon X5675 (Westmere EP) processor. All timing results are obtained with a pure MPI parallelization without using any OpenMP threads.
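As a rough illustration of how ensemble members can be distributed across MPI ranks in an ensemble Kalman filter update (a minimal sketch, not the parallelization strategy of the paper above; the toy forward model, problem sizes, and mpi4py usage are assumptions):

# Hypothetical sketch: distribute EnKF ensemble members over MPI ranks.
# Assumes mpi4py and numpy; the forward model is a stand-in, not the reservoir simulator.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_state, n_obs, n_ens = 100, 10, 64          # toy problem sizes
per_rank = n_ens // size                     # assumes n_ens is divisible by size
lo, hi = rank * per_rank, (rank + 1) * per_rank

def forward_model(x):
    # stand-in for the expensive geothermal reservoir simulation
    return x[:n_obs] + 0.01 * np.sin(x[:n_obs])

rng = np.random.default_rng(rank)
X_local = rng.standard_normal((n_state, per_rank))                       # locally owned members
Y_local = np.column_stack([forward_model(X_local[:, j]) for j in range(per_rank)])

# Gather the full ensembles on every rank to form the ensemble covariances.
X = np.hstack(comm.allgather(X_local))        # (n_state, n_ens)
Y = np.hstack(comm.allgather(Y_local))        # (n_obs,  n_ens)

A = X - X.mean(axis=1, keepdims=True)
B = Y - Y.mean(axis=1, keepdims=True)
R = 0.01 * np.eye(n_obs)                                   # observation error covariance
K = (A @ B.T) @ np.linalg.inv(B @ B.T + (n_ens - 1) * R)   # Kalman gain

d = np.random.default_rng(0).standard_normal(n_obs)        # synthetic observation, identical on all ranks
# Each rank updates only the members it owns (perturbed-observation EnKF).
for j in range(lo, hi):
    X[:, j] += K @ (d + 0.1 * rng.standard_normal(n_obs) - Y[:, j])
print("rank", rank, "updated members", lo, "to", hi - 1)

Run with, e.g., mpirun -np 4 python enkf_sketch.py; the point is only that the expensive forward-model evaluations stay local to each rank while the analysis step uses collectively gathered ensemble statistics.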
Numerical investigation of compressible flows faces two main challenges. In order to accurately describe the flow characteristics, high-resolution nonlinear numerical schemes are needed to capture discontinuities and resolve wide convective, acoustic, and interfacial scale ranges. The simulation of realistic three-dimensional (3D) problems with a state-of-the-art finite-volume method (FVM) based on approximate Riemann solvers with weighted nonlinear reconstruction schemes requires the use of high-performance computing (HPC) architectures. Efficient compression algorithms reduce the computational and memory load. Fully adaptive multiresolution (MR) algorithms with local time stepping (LTS) have proven their potential for such applications. While modern central processing units (CPUs) require multiple levels of parallelism to achieve peak performance, the fine-grained MR mesh adaptivity results in challenging compute/communication patterns. Moreover, LTS incurs strong data dependencies which challenge a parallelization strategy. We address these challenges with a block-based MR algorithm, where arbitrary cuts in the underlying octree are possible. This allows for a parallelization on distributed-memory machines via the Message Passing Interface (MPI). We obtain neighbor relations by simple bit logic in a modified Morton order. The block-based concept allows for a modular setup of the source-code framework in which the building blocks of the algorithm, such as the choice of the Riemann solver or the reconstruction stencil, are interchangeable without loss of parallel performance. We present the capabilities of the modular framework with a range of test cases and scaling analyses with effective resolutions beyond one billion cells using O(10^4) cores.
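To give an idea of how neighbor relations can be obtained by bit logic in a Morton (Z-order) index, here is a small, simplified sketch; it uses plain Morton order for same-level octree blocks, not the modified Morton order of the framework above, and the function names and level handling are illustrative:

# Hypothetical sketch of Morton (Z-order) indexing for octree blocks.
# Same-level face neighbors only; not the paper's modified Morton order.

def encode_morton3(x, y, z):
    """Interleave the bits of three 21-bit block coordinates into one Morton code."""
    def spread(v):
        code = 0
        for i in range(21):
            code |= ((v >> i) & 1) << (3 * i)
        return code
    return spread(x) | (spread(y) << 1) | (spread(z) << 2)

def decode_morton3(code):
    """Recover the (x, y, z) block coordinates from a Morton code."""
    def compact(c):
        v = 0
        for i in range(21):
            v |= ((c >> (3 * i)) & 1) << i
        return v
    return compact(code), compact(code >> 1), compact(code >> 2)

def face_neighbor(code, axis, step, level):
    """Morton code of the same-level neighbor one block away along axis,
    or None if it falls outside the [0, 2**level) domain."""
    coords = list(decode_morton3(code))
    coords[axis] += step
    if not 0 <= coords[axis] < (1 << level):
        return None
    return encode_morton3(*coords)

# Example: +x neighbor of block (3, 5, 2) on refinement level 3.
c = encode_morton3(3, 5, 2)
print(decode_morton3(face_neighbor(c, axis=0, step=+1, level=3)))   # (4, 5, 2)

In a production octree code the same idea is typically implemented with branch-free bit tricks and extended to handle differing refinement levels across block faces.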
To date, there has been a lack of efficient and practical distributed- and shared-memory parallelizations of the data association problem for multitarget tracking. Filling this gap is one of the primary focuses of the present work. We begin by describing our data association algorithm in terms of an Interacting Multiple Model (IMM) state estimator embedded in an optimization framework, namely, a two-dimensional (2D) assignment problem (i.e., weighted bipartite matching). Contrary to conventional wisdom, we show that the data association (or optimization) problem is not the major computational bottleneck; instead, the interface to the optimization problem, namely, computing the rather numerous gating tests, IMM state estimates, covariance calculations, and likelihood function evaluations (used as cost coefficients in the 2D assignment problem), is the primary source of the workload. Hence, for both a general-purpose shared-memory MIMD (Multiple Instruction Multiple Data) multiprocessor system and a distributed-memory Intel Paragon high-performance computer, we developed parallelizations of the data association problem that focus on this interface problem. For the former, a coarse-grained dynamic parallelization was developed that realizes excellent performance (i.e., superlinear speedups) independent of numerous factors influencing problem size (e.g., many models in the IMM, dense/cluttered environments, contentious target-measurement data, etc.). For the latter, an SPMD (Single Program Multiple Data) parallelization was developed that realizes near-linear speedups using relatively simple dynamic task allocation algorithms. Using a real measurement database based on two FAA air traffic control radars, we show that the parallelizations developed in this work offer great promise in practice.
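The claim that building the cost matrix, not solving the assignment, dominates the workload can be illustrated with a small sketch; the Gaussian gating/likelihood costs, the process pool, and SciPy's assignment solver below are stand-ins for the IMM-based cost coefficients and the 2D assignment algorithm of the paper:

# Illustrative sketch: parallelize the gating/likelihood "interface" that
# fills the assignment cost matrix; the assignment step itself is cheap.
# Assumes numpy and scipy; the Gaussian gate stands in for IMM-based costs.
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from scipy.optimize import linear_sum_assignment
from scipy.stats import multivariate_normal

GATE = 9.21          # chi-square gate, 2 dof, ~99% probability mass
BIG = 1e6            # cost for gated-out (infeasible) pairings

def track_costs(args):
    """Costs of assigning every measurement to one track (one row of the matrix)."""
    pred, cov, measurements = args
    inv_cov = np.linalg.inv(cov)
    row = np.full(len(measurements), BIG)
    for j, z in enumerate(measurements):
        d = z - pred
        if d @ inv_cov @ d <= GATE:                              # gating test
            row[j] = -multivariate_normal.logpdf(z, pred, cov)   # likelihood cost
    return row

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    preds = rng.uniform(0, 100, size=(50, 2))             # predicted track positions
    covs = [np.eye(2) * 4.0] * 50                         # innovation covariances
    meas = preds + rng.normal(0, 1.5, size=preds.shape)   # noisy measurements

    # Distribute the expensive cost computation over worker processes.
    with ProcessPoolExecutor() as pool:
        rows = list(pool.map(track_costs, [(p, c, meas) for p, c in zip(preds, covs)]))
    cost = np.vstack(rows)

    # The 2D assignment (weighted bipartite matching) on the finished matrix is comparatively cheap.
    track_idx, meas_idx = linear_sum_assignment(cost)
    print(int((cost[track_idx, meas_idx] < BIG).sum()), "tracks associated")

Each worker process fills one row of the cost matrix independently, mirroring a coarse-grained task decomposition over tracks, while the final call to linear_sum_assignment takes a negligible fraction of the runtime.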