Performant numerical solving of differential equations is required for large-scale scientific modeling. In this manuscript we focus on two questions: (1) how can researchers empirically verify theoretical advances and consistently compare methods in production software settings and (2) how can users (scientific domain experts) keep up with the state-of-the-art methods to select those which are most appropriate? Here we describe how the confederated modular API of *** addresses these concerns. We detail the package-free API which allows numerical methods researchers to readily utilize and benchmark any compatible method directly in full-scale scientific applications. In addition, we describe how the complexity of the method choices is abstracted via a polyalgorithm. We show how scientific tooling built on top of ***, such as packages for dynamical systems quantification and quantum optics simulation, both benefit from this structure and provide themselves as convenient benchmarking tools.
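The abstract does not show the package's actual API. As a minimal, hypothetical sketch of the polyalgorithm idea it describes (dispatching to a solver based on problem traits; the method names and trait thresholds below are invented for illustration, not taken from the package):

```python
# Hypothetical polyalgorithm for ODE solver selection. The traits, thresholds,
# and method names are illustrative only, not the package's actual choices.

def choose_method(stiff: bool, tolerance: float, size: int) -> str:
    """Return the name of a suitable integrator for the given problem traits."""
    if stiff:
        # Implicit methods pay off for stiff problems; multistep BDF-style
        # schemes tend to win on large systems, Rosenbrock on small ones.
        return "implicit-bdf" if size > 1000 else "rosenbrock"
    # Non-stiff: tight tolerances favor a higher-order explicit Runge-Kutta.
    return "rk-high-order" if tolerance < 1e-8 else "rk-adaptive"
```

The point of such a dispatch layer is that the domain expert states the problem and a tolerance, and the library, not the user, tracks which state-of-the-art method fits.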
Despite extensive research, optimal performance has not easily been available previously for matrix multiplication (especially for large matrices) on most architectures because of the lack of a structured approach and the limitations imposed by matrix storage formats. A simple but effective framework is presented here that lays the foundation for building high-performance matrix-multiplication codes in a structured, portable and efficient manner. The resulting codes are validated on three different representative RISC and CISC architectures on which they significantly outperform highly optimized libraries such as ATLAS and other competing methodologies reported in the literature. The main component of the proposed approach is a hierarchical storage format that efficiently generalizes the applicability of the memory-hierarchy-friendly Morton ordering to arbitrary-sized matrices. The storage format supports polyalgorithms, which are shown here to be essential for obtaining the best possible performance for a range of problem sizes. Several algorithmic advances are made in this paper, including an oscillating iterative algorithm for matrix multiplication and a variable recursion cutoff criterion for Strassen's algorithm. The authors expose the need to standardize linear algebra kernel interfaces, distinct from the BLAS, for writing portable high-performance code. These kernel routines operate on small blocks that fit in the L1 cache. The performance advantages of the proposed framework can be effectively delivered to new and existing applications through the use of object-oriented or compiler-based approaches. Copyright (C) 2002 John Wiley & Sons, Ltd.
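The paper's hierarchical format generalizes Morton (Z-order) indexing to arbitrary matrix sizes; that generalization is not reproducible from the abstract, but the underlying classic Morton index, which interleaves the bits of the row and column coordinates so that nearby blocks stay nearby in memory, can be sketched directly:

```python
# Standard Morton (Z-order) index: interleave row and column bits.
# This is the textbook construction the paper builds on, not the paper's
# generalized arbitrary-size format.

def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Map a (row, col) pair to its Z-order curve position."""
    z = 0
    for i in range(bits):
        z |= ((row >> i) & 1) << (2 * i + 1)  # row bit goes to odd position
        z |= ((col >> i) & 1) << (2 * i)      # col bit goes to even position
    return z
```

Walking a matrix in increasing Morton index visits it in recursive quadrants, which is why the ordering interacts well with cache hierarchies and with recursive algorithms such as Strassen's.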
Collective communication operations are widely used in MPI applications and play an important role in their performance. However, the network heterogeneity inherent to grid environments represents a great challenge to developing efficient high-performance computing applications. In this work we propose a generic framework based on communication models and adaptive techniques for dealing with collective communication patterns on grid platforms. Toward this goal, we address the hierarchical organization of the grid, selecting the most efficient communication algorithms at each network level. Our framework is also adaptive to grid load dynamics, since it considers transient network characteristics when dividing the nodes into clusters. Our experiments with the broadcast operation on a real-grid setup indicate that an adaptive framework allows significant performance improvements on MPI collective communications. (C) 2007 Elsevier Inc. All rights reserved.
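The framework itself is not specified in the abstract. A toy sketch of the hierarchical idea it describes, a two-level broadcast that crosses the slow wide-area links once per cluster (root to a per-cluster leader) and then fans out over the fast local network, might look like this (the node numbering and list-of-lists cluster representation are assumptions for illustration):

```python
# Toy two-level broadcast plan for a hierarchical (grid) topology.
# clusters: list of clusters, each a list of node ids; clusters[0][0] is root.

def hierarchical_broadcast(clusters):
    """Return the point-to-point sends as (src, dst) pairs, in phase order."""
    root = clusters[0][0]
    sends = []
    # Phase 1: root crosses the slow inter-cluster links once per cluster,
    # reaching one designated leader in each remote cluster.
    for cluster in clusters[1:]:
        sends.append((root, cluster[0]))
    # Phase 2: each leader fans out over its fast local network.
    for cluster in clusters:
        for node in cluster[1:]:
            sends.append((cluster[0], node))
    return sends
```

The payoff is that each expensive wide-area link carries the message exactly once; a flat broadcast tree oblivious to the hierarchy may cross those links many times.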
The authors investigate the performance of several preconditioned conjugate gradient-like algorithms and a standard stationary iterative method (block-line successive overrelaxation (SOR)) on linear systems of equations that arise from a nonlinear elliptic flame sheet problem simulation. The nonlinearity forces a pseudotransient continuation process that makes the problem parabolic and thus compacts the spectrum of the Jacobian matrix so that simple relaxation methods are viable in the initial stages of the solution process. However, because of the transition from parabolic to elliptic character as the timestep is increased in pursuit of the steady-state solution, the performance of the candidate linear solvers spreads as the domain of convergence of Newton's method is approached. In numerical experiments over the course of a full nonlinear solution trajectory, short-recurrence or optimal Krylov algorithms combined with Gauss-Seidel (GS) preconditioning yield better execution times than the standard block-line SOR techniques, but SOR performs competitively at a smaller storage cost until the final stages. Block-incomplete factorization preconditioned methods, on the other hand, require nearly a factor of two more storage than SOR and are uniformly less effective during the pseudotransient stages. The advantage of GS preconditioning is partly attributable to the exploitation of a dominant convection direction in the examples; nevertheless, a multidomain version of GS with streamwise coupling lagged at rows between adjacent subdomains incurs only a modest penalty.
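Pseudotransient continuation, as described, shifts the Newton iteration by 1/Δt and relaxes that shift as the steady state is approached, which is exactly the parabolic-to-elliptic transition the abstract discusses. A scalar toy sketch (the initial timestep, growth factor, and step count are illustrative choices, not values from the paper):

```python
# Scalar pseudotransient continuation for F(u) = 0: the 1/dt shift damps the
# Newton step early on (parabolic character), and growing dt recovers plain
# Newton near the steady state (elliptic character).

def pseudo_transient(F, dF, u0, dt0=0.1, growth=2.0, steps=30):
    """Drive F(u) = 0 to its root via timestep-damped Newton updates."""
    u, dt = u0, dt0
    for _ in range(steps):
        du = -F(u) / (1.0 / dt + dF(u))  # shifted (damped) Newton step
        u += du
        dt *= growth                     # relax the damping each step
    return u
```

In the PDE setting the scalar shift 1/dt becomes a diagonal shift of the Jacobian, which is what compacts its spectrum and keeps simple relaxation-based solvers viable in the early stages.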
In the iterative linear solver package LINSOL, several generalized conjugate gradient (CG) methods (or, briefly, CG-type methods) with quite different properties are implemented. With these methods, polyalgorithms with automatic method switching are constructed. The "emergency exit" that is taken in the worst case is the ATPRES method (which is very robust, but very slow). In this paper we investigate whether (I)LU preconditioning would be a better emergency exit and how the drop tolerance for small elements in ILU affects the convergence behavior. The answer will be: it depends. (C) 2001 IMACS. Published by Elsevier Science B.V. All rights reserved.
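LINSOL's actual switching logic is not given in the abstract. The "emergency exit" pattern it describes, trying fast but fragile CG-type methods first and falling back to a slow, robust method only when they fail, can be sketched generically (the solver callable interface below is an assumption made for illustration, not LINSOL's API):

```python
# Toy polyalgorithm with an "emergency exit": each solver is a callable
# returning (solution, converged); the last entry is the robust fallback.

def polyalgorithm_solve(solvers, A, b):
    """Try solvers in order of expected speed; return (solution, method name)."""
    for solve in solvers[:-1]:
        x, converged = solve(A, b)
        if converged:
            return x, solve.__name__
    # Worst case: the robust (but slow) emergency exit is assumed to succeed.
    x, _ = solvers[-1](A, b)
    return x, solvers[-1].__name__
```

The paper's question then becomes: which method should sit in that last slot, ATPRES or an ILU-preconditioned variant, and at which ILU drop tolerance; its answer is that no single choice wins on all systems.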