We describe and test a software approach to fault detection in common numerical algorithms. Such result-checking or algorithm-based fault tolerance (ABFT) methods may be used, for example, to overcome single-event upsets in computational hardware or to detect errors in complex, high-efficiency implementations of the algorithms. Following earlier work, we use checksum methods to validate results returned by a numerical subroutine operating subject to unpredictable errors in data. We consider common matrix and Fourier algorithms which return results satisfying a necessary condition having a linear form; the checksum tests compliance with this condition. We discuss the theory and practice of setting numerical tolerances to separate errors caused by a fault from those inherent in finite-precision floating-point calculations. We concentrate on comprehensively defining and evaluating tests having various accuracy/computational burden tradeoffs, and we emphasize average-case algorithm behavior rather than using worst-case upper bounds on error.
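A minimal sketch of such a linear checksum test, here for a matrix product C = AB, may help make the idea concrete. The check compares Cw against A(Bw) for a check vector w. The all-ones vector, the helper names, and the tolerance (machine epsilon times the infinity norms of A and B, times an arbitrary safety factor) are assumptions made for illustration; in particular this tolerance is a simple norm-based choice, not the average-case tolerance the abstract refers to.

```python
import sys

def matmul(a, b):
    """Plain matrix product of two lists of lists (the routine being checked)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def matvec(a, x):
    """Matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def inf_norm(a):
    """Infinity norm: largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def passes_checksum(a, b, c, tol_factor=100.0):
    """Return True if c satisfies the linear condition c w = a (b w).

    w is an all-ones check vector and tol_factor is an arbitrary safety
    margin; both are illustrative assumptions, not values from the paper.
    """
    w = [1.0] * len(b[0])
    lhs = matvec(c, w)             # checksum of the returned result
    rhs = matvec(a, matvec(b, w))  # same quantity computed from the inputs
    tol = tol_factor * sys.float_info.epsilon * inf_norm(a) * inf_norm(b)
    return max(abs(l - r) for l, r in zip(lhs, rhs)) <= tol
```

The check costs three matrix-vector products, which is small next to the matrix product itself; a fault that perturbs an element of C by more than the tolerance changes the left-hand checksum and is flagged.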
In this paper, we present results of parallel numerical simulations of Maxwell's equations. The parallel code is used to study the effect of the instantaneous focusing nonlinearity upon dispersionless pulse propagation in bulk dielectric. Indications are given of the development of shocks on the optical carrier wave and upon the pulse envelope. We then use the code to study focusing and collapse of optical pulses at anomalously dispersive frequencies. We examine the effect of varying the focusing of the light by varying the intensity as a way to compensate linear dispersion. We demonstrate blow-up of sufficiently intense short pulses at finite propagation distances, and we show numerically that the location of blow-up depends nontrivially upon the intensity of the light. (C) 2003 Elsevier B.V. All rights reserved.
In this paper, we describe various methods of deriving a parallel version of Stone's Strongly Implicit Procedure (SIP) for solving sparse linear equations arising from finite difference approximations to partial differential equations (PDEs). Sequential versions of this algorithm have been very successful in solving semiconductor, heat conduction and flow simulation problems, and an efficient parallel version would enable much larger simulations to be run. An initial investigation of various parallelizing strategies was undertaken using a version of High Performance Fortran (HPF), and the best methods were reprogrammed using the MPI message passing libraries for increased efficiency. Early attempts concentrated on developing a parallel version of the characteristic wavefront computation pattern of the existing sequential SIP code. However, a red-black ordering of grid points, similar to that used in parallel versions of the Gauss-Seidel algorithm, is shown to be far more efficient. The results of both the wavefront and red-black MPI-based algorithms are reported for problems of various sizes and various numbers of processors on a sixteen-node IBM SP2. Copyright (C) 2001 John Wiley & Sons, Ltd.
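The red-black ordering mentioned above can be sketched on the Gauss-Seidel iteration that the abstract itself cites as the analogy. The sketch below is not SIP: it is a serial Gauss-Seidel sweep for the five-point discretization of the Poisson equation, with a hypothetical function name and grid layout, included only to show why the ordering parallelizes well. Every point of one colour depends only on points of the other colour, so all points of a colour could be updated concurrently.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for -laplace(u) = f on a square grid.

    u and f are lists of lists of equal shape, h is the grid spacing, and
    the outermost rows and columns of u hold fixed boundary values.
    """
    n = len(u)
    for color in (0, 1):  # 0 = "red" points, 1 = "black" points
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == color:
                    # The four neighbours all have the opposite colour, so
                    # updates within one colour are mutually independent.
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])
    return u
```

In a message-passing setting, each processor would own a block of the grid and exchange boundary values once per colour, in contrast to a wavefront pattern where the number of points available for concurrent update varies from one diagonal to the next.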
We investigate several iterative numerical schemes for nonlinear variational image smoothing and segmentation implemented in parallel. A general iterative framework subsuming these schemes is suggested for which global convergence irrespective of the starting point can be shown. We characterize various edge-preserving regularization methods from the recent image processing literature involving auxiliary variables as special cases of this general framework. As a by-product, global convergence can be proven under conditions slightly weaker than those stated in the literature. Efficient Krylov subspace solvers for the linear parts of these schemes have been implemented on a multi-processor machine. The performance of these parallel implementations has been assessed and empirical results concerning convergence rates and speed-up factors are reported.
In this paper we discuss numerical methods and algorithms for the solution of NLTE stellar atmosphere problems involving expanding atmospheres, e.g., those found in novae, supernovae and stellar winds. We show how a scheme of nested iterations can be used to reduce the high dimension of the problem to a number of problems with smaller dimensions. As examples of these sub-problems, we discuss the numerical solution of the radiative transfer equation for relativistically expanding media with spherical symmetry, the solution of the multi-level non-LTE statistical equilibrium problem for extremely large model atoms, and our temperature correction procedure. Although modern iteration schemes are very efficient, parallel algorithms are essential in making large-scale calculations feasible; we therefore discuss some parallelization schemes that we have developed. (C) 1999 Elsevier Science B.V. All rights reserved.
In this paper we study the implementation, on a shared-memory MIMD computer, of a variant of the classic Gauss-Jordan (GJ) method recently introduced by Huard [8]. Two parallel versions are derived by dividing the sequential Huard method into noninterfering tasks. Taking into consideration the computation as well as the communication complexity, we present a parallel scheduling algorithm for each task graph. Next, in an attempt to reduce the communication cost, we introduce block versions and follow a similar approach for their study.
Algorithm-based fault-tolerance (ABFT) is an inexpensive method of incorporating fault-tolerance into existing applications. Applications are modified to operate on encoded data and produce encoded results which may then be checked for correctness. An attractive feature of the scheme is that it requires little or no modification to the underlying hardware or system software. Previous algorithm-based methods for developing reliable versions of numerical programs for general-purpose multicomputers have mostly concerned themselves with error detection. A truly fault-tolerant algorithm, however, needs to locate errors and recover from them once they are located. In a parallel processing environment, this corresponds to locating the faulty processors and recovering the data corrupted by the faulty processors. In this paper, we first present a general scheme for performing fault-location and recovery under the ABFT framework. Our fault model assumes that a faulty processor can corrupt all the data it possesses. The fault-location scheme is an application of system-level diagnosis theory to the ABFT framework, while the fault-recovery scheme uses ideas from coding theory to maintain redundant data and uses it to recover corrupted data in the event of processor failures. Results are presented on implementations of three numerical algorithms on a 16-processor Intel iPSC/2 hypercube multicomputer, which demonstrate acceptably low overheads for the single and double fault location and recovery cases.
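The step from detection to location and recovery can be illustrated with a much simpler setting than the one in the abstract: a matrix protected by row and column checksums, in which a single corrupted element is located by the one failing row and the one failing column and then rebuilt from the row checksum. This is a sketch under stated assumptions, not the paper's scheme, which handles faulty processors that may corrupt all of their data; the function names and the fixed tolerance are hypothetical.

```python
def encode(m):
    """Return the redundant data: (row sums, column sums) of matrix m."""
    row_sums = [sum(row) for row in m]
    col_sums = [sum(col) for col in zip(*m)]
    return row_sums, col_sums

def locate_and_correct(m, row_sums, col_sums, tol=1e-9):
    """Locate and repair one corrupted element of m in place.

    Returns None if every checksum agrees, or the (row, column) index of
    the repaired element. Raises ValueError if the pattern of failing
    checksums is not consistent with a single corrupted element.
    """
    bad_rows = [i for i, row in enumerate(m)
                if abs(sum(row) - row_sums[i]) > tol]
    bad_cols = [j for j, col in enumerate(zip(*m))
                if abs(sum(col) - col_sums[j]) > tol]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        raise ValueError("fault pattern is not a single corrupted element")
    i, j = bad_rows[0], bad_cols[0]
    # Recovery: the row checksum minus the intact elements of row i.
    m[i][j] = row_sums[i] - (sum(m[i]) - m[i][j])
    return i, j
```

Detection alone needs only one checksum; location needs the second, independent set of checksums, and recovery relies on the redundant data having survived the fault.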
The Parallel Diagonal Dominant (PDD) algorithm is an efficient tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First, the PDD algorithm is extended to solve periodic tridiagonal systems and its scalability is studied. Then the reduced PDD algorithm, which has a smaller operation count than that of the conventional sequential algorithm for many applications, is proposed. Accuracy analysis is provided for a class of tridiagonal systems, the symmetric and skew-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and that the PDD and reduced PDD algorithms are good candidates for emerging massively parallel machines.
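For reference, the sequential baseline against which such operation counts are usually compared can be sketched as follows, assuming that "the conventional sequential algorithm" means the Thomas algorithm (Gaussian elimination without pivoting specialized to tridiagonal matrices). The abstract does not name it, so this identification, the function name, and the array layout are assumptions. The sketch does not implement PDD itself.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system by forward elimination and back substitution.

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Both loops carry a dependence from one row to the next, which is why the algorithm is inherently sequential and why partitioned methods such as PDD are of interest on parallel machines.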
We consider the parallel computation of flows of integral fluids on a heterogeneous network of workstations. The proposed methodology is relevant to computational mechanics problems which involve a compute-intensive treatment of internal variables (e.g. fibre suspension flow and deformation of viscoplastic solids). The main parallel computing issue in such applications is that of load balancing. Both static and dynamic allocation of work to processors are considered in the present paper. The proposed parallel algorithms have been implemented in an experimental, parallel version of the commercial POLYFLOW package developed in Louvain-la-Neuve. The implementation uses the public domain PVM software library (Parallel Virtual Machine), which we have extended in order to ease porting to heterogeneous networks. We describe parallel efficiency results obtained with three PVM configurations, involving up to seven workstations with maximum relative processing speeds of five. The physical problems are the stick/slip and abrupt contraction flows of a K.B.K.Z. integral fluid. Using static allocation, parallel efficiencies in the range 67%-85% were obtained on a PVM network with four workstations having relative speeds of 2:1:1:1. Parallel efficiencies higher than 90% were obtained on the three PVM configurations using the dynamic load-balancing schemes.
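One plausible form of static allocation on a heterogeneous network is to split the work items in proportion to the relative speeds of the workstations. The abstract does not say how its static scheme assigns work, so the rule below, the function name, and the restriction to integer speeds are all assumptions made for illustration.

```python
def static_allocation(n_tasks, speeds):
    """Split n_tasks among processors in proportion to integer relative speeds.

    Each processor first gets the floor of its proportional share; leftover
    tasks go to the processors with the largest fractional remainders.
    """
    total = sum(speeds)
    shares = [n_tasks * s // total for s in speeds]
    leftover = n_tasks - sum(shares)
    # Rank processors by the fractional part of their ideal share.
    order = sorted(range(len(speeds)),
                   key=lambda k: (n_tasks * speeds[k]) % total,
                   reverse=True)
    for k in order[:leftover]:
        shares[k] += 1
    return shares
```

For relative speeds of 2:1:1:1 and 100 work items this gives 40, 20, 20 and 20 items. Such a split is fixed before the computation starts, so it cannot react if the cost per item varies or a workstation slows down, which is the situation dynamic load balancing is meant to handle.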