We study the parallel implementation of two diagonalization methods for solving dense linear systems: the well-known Gauss-Jordan method and a new one introduced by Huard. The number of arithmetic operations performed by the Huard method is the same as for Gaussian elimination, namely 2n^3/3, which is less than for the Jordan method, namely n^3. We introduce parallel versions of these methods, compare their performance, and study their complexity. We assume a shared-memory computer with a number of processors p of the order of n, the size of the problem to be solved. We show that the best parallel version of Jordan's method is by rows, whereas the best one for Huard's method is by columns. Our main result states that for a small number of processors the parallel Huard method is faster than the parallel Jordan method, and slower otherwise. The separation is obtained for p = 0.44n.
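For orientation only, here is a minimal sequential sketch of Gauss-Jordan elimination with partial pivoting. It is not the parallel row-wise version the abstract analyses, and it does not implement Huard's method; the function name `gauss_jordan_solve` and the dense list-of-lists representation are assumptions made for illustration. It shows why the Jordan variant costs more than Gaussian elimination: each pivot step updates every other row, above as well as below the pivot.

```python
def gauss_jordan_solve(a, b):
    """Solve a x = b by sequential Gauss-Jordan elimination (illustrative sketch)."""
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for k in range(n):
        # Partial pivoting: bring the largest entry of column k onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Normalize the pivot row so the pivot becomes 1.
        pivot = m[k][k]
        for j in range(k, n + 1):
            m[k][j] /= pivot
        # Eliminate column k from every other row, above and below the pivot.
        for i in range(n):
            if i != k and m[i][k] != 0.0:
                factor = m[i][k]
                for j in range(k, n + 1):
                    m[i][j] -= factor * m[k][j]
    # The matrix part is now the identity, so the last column holds the solution.
    return [m[i][n] for i in range(n)]
```

In this sketch the updates of different rows `i` within one pivot step are independent of one another, which is the kind of parallelism a row-wise distribution can exploit.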
This paper explores the macro data flow approach for solving numerical applications on distributed memory systems. We discuss the problems of this approach with a sophisticated ‘real life’ algorithm—the adaptive full multigrid method.
It is shown that the nonnumeric parts of the algorithm—the initialization, the termination and the mapping of processes to processors—are very important for the overall performance.
To avoid unnecessary global synchronization points, we propose the use of distributed supervisors. We compare this solution with more centralized algorithms. The performance evaluation is done for nearest-neighbour and bus-connected multiprocessors using a simulation system.
This paper deals with some aspects of the performance of the symmetric successive over-relaxation preconditioner in a distributed environment. The details of the distributed formulation of the preconditioner are presented. Some performance metrics are compared and discussed for the Message Passing Interface implementation of the algorithm. The properties of the solver are estimated for a concurrent three-dimensional formulation of the finite-element time-domain method. The analyzed benchmark models are approximated by tetrahedral first-order Whitney elements.
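As a point of reference, the following serial sketch applies the textbook SSOR preconditioner M = (D + ωL) D^{-1} (D + ωU) / (ω(2 − ω)) to a residual vector, where D, L and U are the diagonal and the strict lower and upper triangles of the matrix. It is not the distributed, message-passing formulation the abstract describes; the function name `ssor_apply`, the dense storage, and the default `omega` are assumptions made for illustration.

```python
def ssor_apply(a, r, omega=1.0):
    """Return z solving M z = r for the SSOR preconditioner M (serial sketch)."""
    n = len(a)
    scale = omega * (2.0 - omega)
    # Forward sweep: solve (D + omega*L) y = omega*(2 - omega) * r.
    y = [0.0] * n
    for i in range(n):
        s = sum(a[i][j] * y[j] for j in range(i))
        y[i] = (scale * r[i] - omega * s) / a[i][i]
    # Backward sweep: solve (D + omega*U) z = D y.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (a[i][i] * y[i] - omega * s) / a[i][i]
    return z
```

Both sweeps are triangular solves with sequential dependencies between unknowns, which is one reason a distributed formulation needs care.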
Large-scale computational problems are encountered when one attempts to realize a high degree of detail and realism in the simulation of quantum transport in nanodevices. These problems can be addressed using novel parallel algorithms that are ideally suited for high-end computing platforms. This article has two objectives: (i) the description of the transport model and the associated computational challenges within the multidimensional finite element simulator NESSIE, and (ii) the presentation of a new strategy for handling the transport problem and solving the banded linear systems that arise from the Green (or wave) function approach.
In this paper, we present results of parallel numerical simulations of Maxwell's equations. The parallel code is used to study the effect of the instantaneous focusing nonlinearity upon dispersionless pulse propagation in a bulk dielectric. Indications are given of the development of shocks on the optical carrier wave and upon the pulse envelope. We then use the code to study focusing and collapse of optical pulses at anomalously dispersive frequencies. We examine the effect of varying the focusing of the light by varying the intensity as a way to compensate for linear dispersion. We demonstrate blow-up of sufficiently intense short pulses at finite propagation distances, and we show numerically that the location of blow-up depends nontrivially upon the intensity of the light. (C) 2003 Elsevier B.V. All rights reserved.
This paper describes an efficient and robust hybrid parallel solver, the SPIKE algorithm, for narrow-banded linear systems. Two versions of SPIKE with their built-in options are described in detail: the Recursive SPIKE version for handling non-diagonally dominant systems and the Truncated SPIKE version for diagonally dominant ones. These SPIKE schemes can be used either as direct solvers or as preconditioners for outer iterative schemes. Both versions are faster than the direct solvers in ScaLAPACK on parallel computing platforms, and quite competitive in terms of achieved accuracy for handling systems that are dense within the band. (c) 2005 Elsevier B.V. All rights reserved.
ISBN: (print) 9783319780245; 9783319780238
In this paper, we construct and investigate parallel solvers for three-dimensional problems described by fractional powers of elliptic operators. The main aim is to carry out a scalability analysis of parallel versions of several state-of-the-art solvers. The originality of this work is that we also consider the accuracy of the selected numerical algorithms. For the comparison of accuracy, we use solutions obtained by solving the test problem with the Fourier algorithm. Such an analysis makes it possible to compare the efficiency of the proposed parallel algorithms depending on the required accuracy of the solution and on the number of processes used in the computations.
This report focuses on the technology of supercomputer simulation of nonlinear processes in cores extracted from oil and gas production wells in order to study the properties of hydrocarbon reservoirs. One modern approach to solving problems of this kind is to create a multiphysics mathematical model of the core and study it by computer methods. This approach minimizes the number of physical experiments and predicts the evolution of the properties of the layers. It also makes it possible to predict the oil and gas recovery of the layers over a long time period. However, implementing this technology, called "virtual core", requires the following: 1) creating a multiparameter model of the core that is as close as possible to reality; 2) taking into account the multicomponent and multiphase composition and the complex real geometry of the core; 3) developing a computational framework for modeling the seepage of multicomponent liquid and gas mixtures through the core; 4) carrying out large-scale calibration calculations. In this paper, an attempt is made to create such a multifactor mathematical model and the computational foundations for its computer and supercomputer analysis.