Search Results - Inner Mongolia University Library

Efficient deterministic parallel simulation of 2D semiconductor devices based on WENO-Boltzmann schemes

COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2009, Vol. 198, No. 5-8, pp. 693-704

Authors: Mantas, Jose M.; Caceres, Maria J. Affiliations: Univ Granada, Dept Lenguajes & Sistemas Informat, ETS Ing Informat & Telecomunicac, E-18071 Granada, Spain; Univ Granada, Fac Ciencias, Dept Matemat Aplicada, Granada 18002, Spain

A flexible parallel deterministic solver of the Boltzmann-Poisson system for 2D semiconductor device simulation on computer clusters is presented. The simulator is obtained by parallelizing a previously proposed numerical scheme based on high order finite difference weighted essentially non-oscillatory (WENO) schemes. Although the underlying numerical scheme presents important advantages over direct simulation Monte Carlo methods, it places very high demands on computing power. For this reason, the parallelization of the different calculation phases in the numerical scheme has been tackled. The data subdomain which demands most of the computational workload has been suitably distributed among the processors, and several parallel design decisions have been taken in order to achieve good performance. Moreover, the resultant parallel application can be easily adjusted to simulate a wide range of devices and could be easily used by engineers without a mathematical background in the underlying numerical scheme. The parallel algorithm has been implemented in C++ augmented with calls to MPI functions and functions of optimized linear algebra libraries. Several experiments have been performed by simulating particular MOSFET and DG-MOSFET devices on an SMP cluster in order to show its efficiency. (C) 2008 Elsevier B.V. All rights reserved.

Keywords: semiconductor simulation; parallel numerical algorithms; finite difference weighted essentially non-oscillatory schemes; high performance cluster computing
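
Note on the WENO building block: the abstract does not say which WENO order or variant the solver uses, so the following is only a minimal sketch of the classic fifth-order reconstruction (Jiang-Shu weights) at one cell interface. The function name `weno5` and the parameter `eps` are illustrative assumptions, not taken from the paper.

```python
def weno5(v, eps=1e-6):
    """Fifth-order WENO value at the interface x_{i+1/2}.

    v holds five neighbouring values (v[i-2], v[i-1], v[i], v[i+1], v[i+2]).
    Illustrative sketch only; not the code of the cited paper.
    """
    a, b, c, d, e = v
    # Three third-order candidate reconstructions, one per sub-stencil.
    q0 = (2 * a - 7 * b + 11 * c) / 6
    q1 = (-b + 5 * c + 2 * d) / 6
    q2 = (2 * c + 5 * d - e) / 6
    # Smoothness indicators: large where the sub-stencil crosses a jump.
    b0 = 13 / 12 * (a - 2 * b + c) ** 2 + 0.25 * (a - 4 * b + 3 * c) ** 2
    b1 = 13 / 12 * (b - 2 * c + d) ** 2 + 0.25 * (b - d) ** 2
    b2 = 13 / 12 * (c - 2 * d + e) ** 2 + 0.25 * (3 * c - 4 * d + e) ** 2
    # Nonlinear weights built from the linear weights 0.1, 0.6, 0.3.
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    return (a0 * q0 + a1 * q1 + a2 * q2) / (a0 + a1 + a2)


if __name__ == "__main__":
    print(weno5([0.0, 1.0, 2.0, 3.0, 4.0]))  # smooth (linear) data
    print(weno5([0.0, 0.0, 0.0, 1.0, 1.0]))  # data with a jump
```

On smooth data the three candidates agree and the weights are irrelevant; next to a jump the weights shift almost entirely to the sub-stencil that does not cross it, which is what suppresses oscillations.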

Fault tolerant algorithms for heat transfer problems

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2008, Vol. 68, No. 5, pp. 663-677

Authors: Ltaief, Hatem; Gabriel, Edgar; Garbey, Marc. Affiliation: Univ Houston, Dept Comp Sci, Houston, TX 77204, USA

With the emergence of new massively parallel systems in the high performance computing area allowing scientific simulations to run on thousands of processors, the mean time between failures of large machines is decreasing from several weeks to a few minutes. The ability of hardware and software components to handle these singular events, called process failures, is therefore becoming increasingly important. In order for a scientific code to continue despite a process failure, the application must be able to retrieve the lost data items. The recovery procedure after failures might be fairly straightforward for elliptic and linear hyperbolic problems. However, the reversibility in time for parabolic problems appears to be the most challenging part because it is an ill-posed problem. This paper focuses on new fault-tolerant numerical schemes for the time integration of parabolic problems. The new algorithm allows the application to recover from process failures and to reconstruct numerically the lost data of the failed process(es), avoiding the expensive roll-back operation required in most checkpoint/restart schemes. As a fault tolerant communication library, we use the fault tolerant message passing interface developed by the Innovative Computing Laboratory at the University of Tennessee. Experimental results show promising performance. Indeed, the three-dimensional parabolic benchmark code is able to recover and to keep on running after failures, adding only a very small penalty to the overall time of execution. (C) 2007 Elsevier Inc. All rights reserved.

Keywords: parallel numerical algorithms; process fault tolerance; parabolic problems

Computational forces in the Linpack benchmark

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2008, Vol. 68, No. 9, pp. 1283-1290

Author: Numrich, Robert W. Affiliation: Univ Minnesota, Minnesota Supercomp Inst, Minneapolis, MN 55455, USA

Dimensional analysis reduces a complicated ten-parameter formula for the execution time of the Linpack benchmark to a simpler two-parameter formula. These two parameters are ratios of software forces and hardware forces that determine a self-similarity surface. Machines move along paths on this surface as the problem size and the number of processors change. Two machines scale the same way, that is, they move along the same path, if they have the same hardware forces. To design efficient algorithms, the programmer must produce software forces large enough to overcome the hardware forces. Modern machines have larger hardware forces than older machines and are harder to program. (C) 2008 Elsevier Inc. All rights reserved.

Keywords: scalability; dimensional analysis; performance analysis; computational intensity; computational force; parallel numerical algorithms; Linpack benchmark

Dimensional analysis applied to a parallel QR algorithm

7th International Conference on Parallel Processing and Applied Mathematics

Author: Numrich, Robert W. Affiliation: Univ Minnesota, Minnesota Supercomp Inst, Minneapolis, MN 55455, USA

ISBN: (print) 9783540681052

We apply dimensional analysis to a formula for execution time for a QR algorithm from a paper by Henry and van de Geijn. We define a single efficiency surface that reduces performance analysis for this algorithm to an exercise in differential geometry. As the problem size and the number of processors change, different machines move along different paths on the surface determined by two computational forces specific to each machine. We show that computational force, also called computational intensity, is a unifying concept for understanding the performance of parallel numerical algorithms.

Keywords: scalability; performance analysis; computational intensity; computational force; parallel numerical algorithms; dimensional analysis

A parallel hybrid banded system solver: the SPIKE algorithm

PARALLEL COMPUTING, 2006, Vol. 32, No. 2, pp. 177-194

Authors: Polizzi, E; Sameh, AH. Affiliation: Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907, USA

This paper describes an efficient and robust hybrid parallel solver, the SPIKE algorithm, for narrow-banded linear systems. Two versions of SPIKE with their built-in options are described in detail: the Recursive SPIKE version for handling non-diagonally dominant systems and the Truncated SPIKE version for diagonally dominant ones. These SPIKE schemes can be used either as direct solvers or as preconditioners for outer iterative schemes. Both versions are faster than the direct solvers in ScaLAPACK on parallel computing platforms, and quite competitive in terms of achieved accuracy for handling systems that are dense within the band. (c) 2005 Elsevier B.V. All rights reserved.

Keywords: banded linear systems; iterative schemes; numerical linear algebra; parallel numerical algorithms; preconditioners; ScaLAPACK; SPIKE
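
Note on the SPIKE idea: the abstract names the Recursive and Truncated variants but does not spell out the steps, so the following is only a serial sketch of the basic SPIKE factorization, assuming the simplest case of a tridiagonal matrix split into two partitions. The names `thomas` and `spike_two_partitions` and the split index `m` are illustrative assumptions; in a real implementation the two partition solves would run on different processes.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def spike_two_partitions(lower, diag, upper, rhs, m):
    """Solve a tridiagonal system by SPIKE with partitions [0, m) and [m, n)."""
    n = len(diag)
    l1, d1, u1 = lower[:m], diag[:m], upper[:m]
    l2, d2, u2 = lower[m:], diag[m:], upper[m:]
    # Independent block solves (the parallel part): modified right-hand sides.
    g1 = thomas(l1, d1, u1, rhs[:m])
    g2 = thomas(l2, d2, u2, rhs[m:])
    # Spikes: block solves against the two coupling entries.
    v = thomas(l1, d1, u1, [0.0] * (m - 1) + [upper[m - 1]])
    w = thomas(l2, d2, u2, [lower[m]] + [0.0] * (n - m - 1))
    # Reduced 2x2 system for the unknowns on either side of the interface.
    det = 1.0 - v[-1] * w[0]
    xb = (g1[-1] - v[-1] * g2[0]) / det  # last unknown of partition 1
    xt = (g2[0] - w[0] * g1[-1]) / det   # first unknown of partition 2
    # Retrieve the full solution from the spikes.
    x1 = [g1[i] - v[i] * xt for i in range(m)]
    x2 = [g2[i] - w[i] * xb for i in range(n - m)]
    return x1 + x2
```

Only the small reduced system couples the partitions; everything else is local to a partition, which is the property that makes the approach attractive on parallel platforms.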

A parallel hybrid banded system solver: the SPIKE algorithm

3rd Workshop on Parallel Matrix Algorithms and Applications (PMAA 2004)

Authors: Polizzi, E; Sameh, AH. Affiliation: Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907, USA

Keywords: banded linear systems; iterative schemes; numerical linear algebra; parallel numerical algorithms; preconditioners; ScaLAPACK; SPIKE

Performance of preconditioners for the distributed vector finite-element time-domain algorithm

IEEE TRANSACTIONS ON MAGNETICS, 2005, Vol. 41, No. 5, pp. 1716-1719

Authors: Nicolas, A; Nicolas, L; Vollaire, C; Butrylo, B. Affiliations: Ecole Cent Lyon, CNRS UMR 5005, Ctr Genie Elect Lyon, F-69134 Ecully, France; Bialystok Tech Univ, Fac Elect Engn, PL-15351 Bialystok, Poland

This paper deals with some aspects of performance of the symmetric successive over-relaxation preconditioner in a distributed environment. The details of distributed formulation of the preconditioner are presented. Some performance metrics are compared and discussed for the message passing interface implementation of the algorithm. The properties of the solver are estimated for concurrent three-dimensional formulation of the finite-element time-domain method. The analyzed benchmark models are approximated by tetrahedral first order Whitney elements.

Keywords: edge elements; finite-element (FE) method; iterative solver; parallel numerical algorithms; time-domain algorithm
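
Note on the SSOR preconditioner: the abstract concerns a distributed MPI formulation whose details are not given here, so the following is only a serial, dense-matrix sketch of applying the textbook symmetric successive over-relaxation preconditioner, M = (D + ωL) D⁻¹ (D + ωU) / (ω(2 − ω)), to a residual. The function name `ssor_apply` and its parameters are illustrative assumptions.

```python
def ssor_apply(A, r, omega=1.0):
    """Return z = M^{-1} r for the SSOR preconditioner M of the matrix A.

    A is a dense list of rows with nonzero diagonal; 0 < omega < 2.
    Serial illustrative sketch only, not the distributed code of the paper.
    """
    n = len(A)
    # Forward sweep: solve (D + omega * L) y = r.
    y = [0.0] * n
    for i in range(n):
        s = sum(A[i][j] * y[j] for j in range(i))
        y[i] = (r[i] - omega * s) / A[i][i]
    # Diagonal scaling: y <- omega * (2 - omega) * D * y.
    scale = omega * (2.0 - omega)
    y = [scale * A[i][i] * y[i] for i in range(n)]
    # Backward sweep: solve (D + omega * U) z = y.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (y[i] - omega * s) / A[i][i]
    return z
```

The two triangular sweeps are inherently sequential along the unknowns, which is why a distributed formulation of this preconditioner is a design question in its own right.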

Parallel Algorithms for Large-Scale Nanoelectronics Simulations Using NESSIE

JOURNAL OF COMPUTATIONAL ELECTRONICS, 2004, Vol. 3, No. 3-4, pp. 363-366

Authors: Polizzi, Eric; Sameh, Ahmed. Affiliation: Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907, USA

Large-scale computational problems are encountered when one attempts to realize a high degree of detail and realism in the simulation of quantum transport in nanodevices. These problems can be addressed using novel parallel algorithms that are ideally suited for high-end computing platforms. This article has two objectives: (i) the description of the transport model and the associated computational challenges within the multidimensional finite element simulator NESSIE, and (ii) the presentation of a new strategy for handling the transport problem and solving the banded linear systems that arise from the Green (or wave) function approach.

Keywords: NEGF; quantum transport modeling; computational nanoelectronics; parallel numerical algorithms

Computational challenges in nanoscale device modeling

Nanotechnology Conference and Trade Show (Nanotech 2004)

Authors: Polizzi, E; Sameh, A; Sun, H. Affiliation: Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907, USA

ISBN: (print) 0972842284

The development of new simulation tools is critical for the exploration of quantum transport in nanoscale devices. Such simulation is commonly performed by solving self-consistently the transport problem using the Non-Equilibrium Green's Functions (NEGF) formalism and Poisson's equation to account for the space charge effects. The quest for ever higher levels of detail and realism in such simulations, such as the modeling of multidimensional devices with detailed band structure calculations with (or without) the inclusion of scattering effects, requires huge computational effort. Hence the need for an active research effort in developing novel numerical techniques and parallel algorithms that are ideally suited for high-end computing platforms. In this article, we will identify the challenging numerical problems which arise from the NEGF/Poisson procedure and we will present new efficient parallel schemes for computing the problem.

Keywords: nanoscale devices; Green's function; NEGF-Poisson; parallel numerical algorithms; linear systems; generalized eigenvalue problems

Parallel integration of hydrodynamical approximations of the Boltzmann equation for rarefied gases on a cluster of computers

Journal of Computational Methods in Sciences and Engineering, 2004, Vol. 4, No. 1-2, pp. 33-41

Authors: Mantas Ruiz, José Miguel; Pareschi, Lorenzo; Carrillo, José Antonio; Lopera, Julio Ortega. Affiliations: Software Engineering Department, University of Granada, C/P. Daniel de Saucedo s/n, 18071 Granada, Spain; Department of Mathematics, University of Ferrara, Via Machiavelli 35, I-44100, Italy; ICREA, Depto. Matemàtiques, University Autònoma Barcelona, Bellaterra E-08193, Spain; Computer Architecture and Technology Department, University of Granada, C/P. Daniel de Saucedo s/n, 18071 Granada, Spain

The relaxed Burnett system, recently introduced as a hydrodynamical approximation of the Boltzmann equation, is numerically solved. Due to the stiffness of this system and the severe CFL condition for large Mach numbers, a fully implicit Runge-Kutta method has been used. In order to reduce computing time, we apply a parallel stiff ODE solver based on 4-stage Radau IIA IRK. The ODE solver is combined with suitable first order upwind and second order MUSCL relaxation schemes for the spatial derivatives. Speedup results and comparisons to DSMC and Navier-Stokes approximations are reported for a 1D shock profile.

Keywords: parallel numerical algorithms; Boltzmann equation; Burnett equations; parallel stiff ODE solvers; relaxation; implicit Runge-Kutta methods
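
Note on implicit time stepping with upwind differences: as a much simplified stand-in for the scheme in the abstract, the sketch below uses the scalar advection equation u_t + a u_x = 0 (a > 0) instead of the relaxed Burnett system, and backward Euler (the one-stage member of the Radau IIA family) instead of the 4-stage method. The function name `implicit_upwind_step` and its parameters are illustrative assumptions.

```python
def implicit_upwind_step(u, nu, inflow):
    """One backward-Euler step of u_t + a u_x = 0 with first-order upwind.

    nu = a * dt / dx is the CFL number and inflow is the boundary value to
    the left of u[0]. The implicit equations
        (1 + nu) * new[i] - nu * new[i - 1] = u[i]
    form a lower bidiagonal system, solved here by forward substitution.
    """
    new = []
    left = inflow
    for value in u:
        left = (value + nu * left) / (1.0 + nu)
        new.append(left)
    return new


if __name__ == "__main__":
    profile = [1.0] * 5 + [0.0] * 15  # step profile moving to the right
    for _ in range(10):
        profile = implicit_upwind_step(profile, nu=5.0, inflow=1.0)
    print(min(profile), max(profile))
```

An explicit upwind step would need nu ≤ 1 to remain stable; the implicit step stays bounded for nu = 5, which illustrates why an implicit integrator is attractive when the CFL restriction is severe, at the cost of solving a system at every step.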
