ISBN: (print) 0769515126
A universal parallelized numerical approach for solving the three-dimensional (3D) convection-diffusion equation with variable coefficients is proposed. It combines the implicit Crank-Nicolson difference method with alternating bar parallelization and can be used to solve any variant of the 3D convection-diffusion equation numerically. By virtue of bar parallelization and a multistep iteration technique, the approach trades off parallelism against accuracy. Its main merits are generality, absolute stability, acceptable space demand, and retained second-order accuracy. A parallel implementation, named Codie4D, runs on a network of workstations using the popular MPI library, which gives it portability and applicability. Experimental results show that Codie4D has good runtime performance.
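The Crank-Nicolson scheme named in this abstract can be illustrated on a much simpler problem. The sketch below is not Codie4D and not the authors' bar-parallel method: it assumes a serial, one-dimensional pure diffusion equation u_t = D u_xx with constant D and zero Dirichlet boundaries. The function names (`thomas`, `crank_nicolson_step`, `solve_heat`) are hypothetical and chosen only for this example.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] do not affect the result."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def crank_nicolson_step(u, r):
    """Advance the interior values u by one time step, with r = D*dt/dx**2.

    Assumes zero Dirichlet boundary values (an assumption of this sketch).
    """
    n = len(u)
    rhs = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        # Explicit half of the scheme: (I + r/2 * A) u^n
        rhs.append(0.5 * r * left + (1.0 - r) * u[i] + 0.5 * r * right)
    # Implicit half of the scheme: (I - r/2 * A) u^{n+1} = rhs
    return thomas([-0.5 * r] * n, [1.0 + r] * n, [-0.5 * r] * n, rhs)

def solve_heat(n_interior=49, steps=100, dt=0.001, D=1.0):
    """Integrate u_t = D*u_xx on (0, 1) starting from u(x, 0) = sin(pi*x)."""
    dx = 1.0 / (n_interior + 1)
    r = D * dt / dx ** 2
    u = [math.sin(math.pi * (i + 1) * dx) for i in range(n_interior)]
    for _ in range(steps):
        u = crank_nicolson_step(u, r)
    return u

if __name__ == "__main__":
    u = solve_heat()
    exact_mid = math.exp(-math.pi ** 2 * 0.1)  # analytic value at x = 0.5, t = 0.1
    print(u[24], exact_mid)
```

With the default parameters r = 2.5, a value at which the explicit forward-Euler scheme would be unstable, which illustrates the unconditional stability the abstract refers to.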
ISBN: (print) 0769515126
In this paper, an improved version of the BiCGStab method (IBiCGStab) for solving large, sparse linear systems of equations with unsymmetric coefficient matrices is proposed. The method combines elements of numerical stability and parallel algorithm design without increasing the computational costs. The algorithm is derived such that all inner products of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which is the bottleneck of parallel performance, can be significantly reduced. The resulting IBiCGStab algorithm maintains the favorable properties of the original method. A data distribution suitable for both irregularly and regularly structured matrices, based on an analysis of the non-zero matrix elements, is presented. The communication scheme is supported by overlapping the execution of computation and communication to reduce waiting times. The efficiency of the method is demonstrated by numerical experiments carried out on a massively parallel distributed-memory system.
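For orientation, the following is a minimal serial sketch of the standard (unimproved) BiCGStab iteration, not the IBiCGStab variant proposed in the paper. It assumes a small dense matrix stored as a list of rows, and the helper names `dot` and `matvec` are hypothetical. The comments mark the inner products, which on a distributed-memory machine would each require a global reduction; these are the operations the abstract says are restructured.

```python
def dot(a, b):
    # In a parallel setting each call would be a global reduction.
    return sum(x * y for x, y in zip(a, b))

def matvec(A, x):
    return [dot(row, x) for row in A]

def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Standard BiCGStab for A x = b, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual for the zero initial guess
    r_hat = r[:]                  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)               # inner product 1
        if rho_new == 0.0 or omega == 0.0:
            break                             # breakdown: give up
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        denom = dot(r_hat, v)                 # inner product 2
        if denom == 0.0:
            break
        alpha = rho_new / denom
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 / b_norm < tol:
            return [xi + alpha * pi for xi, pi in zip(x, p)]
        t = matvec(A, s)
        tt = dot(t, t)                        # inner products 3 and 4
        if tt == 0.0:
            break
        omega = dot(t, s) / tt
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 / b_norm < tol:
            return x
        rho = rho_new
    return x

if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]  # unsymmetric
    b = [6.0, 15.0, 11.0]         # equals A applied to [1, 2, 3]
    print(bicgstab(A, b))
```

In this standard form the inner products depend on one another through the intermediate vector updates, so each one acts as a synchronization point.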
ISBN: (print) 9810475241
We suggest a new approach to the analysis of Petri nets. It consists of extracting pairs of independent (parallel) transitions, constructing the set of auxiliary objects of those pairs, and treating it on the basis of natural and obvious rules. This allows one to extract all possible scenarios in the behavior of Petri nets and to evaluate their probability-like characteristics in the case of several scenarios.
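The first step, extracting pairs of independent transitions, can be sketched as follows. The abstract does not define independence, so this example assumes a purely structural criterion: two transitions are treated as independent when they share no input or output place. The net encoding (a dict mapping a transition name to its input and output place sets) and the function name `independent_pairs` are hypothetical, and the auxiliary objects and scenario rules of the paper are not modeled.

```python
from itertools import combinations

def independent_pairs(transitions):
    """Return pairs of transitions whose place neighborhoods are disjoint.

    transitions maps a name to (input_places, output_places).
    The disjointness criterion is an assumption of this sketch.
    """
    pairs = []
    for a, b in combinations(sorted(transitions), 2):
        pre_a, post_a = transitions[a]
        pre_b, post_b = transitions[b]
        places_a = set(pre_a) | set(post_a)
        places_b = set(pre_b) | set(post_b)
        if places_a.isdisjoint(places_b):
            pairs.append((a, b))
    return pairs

if __name__ == "__main__":
    net = {
        "t1": ({"p1"}, {"p2"}),
        "t2": ({"p3"}, {"p4"}),
        "t3": ({"p2"}, {"p5"}),   # consumes what t1 produces
    }
    print(independent_pairs(net))
```

Here `t1` and `t3` are not paired because both touch place `p2`, while `t2` is paired with each of them.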
ISBN: (print) 0769517609
Efficient determination of processing termination at barrier synchronization points can play an important role in the overall throughput of parallel and distributed computing systems. Even though relatively efficient termination detection techniques have been proposed for certain environments, no effective performance analysis methodology has been introduced to determine the application attributes that favor the use of a particular termination detection technique. This has hindered the adoption and development of termination detection schemes. This paper addresses the problem by developing a methodology based on communication patterns that improves the precision of theoretical performance estimates for termination detection techniques, in lieu of laborious experiments or potentially subjective benchmarking studies. By measuring message complexity with respect to the idle period, it provides a simple and effective way to evaluate existing termination detection techniques or to design new termination detection algorithms.
This paper discusses the parallel solution of symmetric eigenproblems, which include standard and generalized eigenvalue problems. For standard eigenvalue problem and tridiagonal eigenvalue problem is not the key point...
ISBN: (print) 0769515126
Creating portable and automatically scalable parallel software has been a goal for researchers and practitioners since the advent of parallel computing. In this paper we present a programming methodology that reduces parallel programming complexity while creating portable and automatically scalable parallel software. To support this methodology, two separate tools have been developed: the PARSA Software Development Environment and an accompanying thread manager. The development environment addresses programming issues via an object-based graphical programming methodology that automatically transforms a project into portable and scalable source code. The generated source code makes calls to the user-level thread manager, which manages the run-time execution of the parallel software. Two sample applications that contain various forms of parallelism have been developed and compiled on three different systems with diverse native threading mechanisms to demonstrate portability. Finally, automatic scalability is demonstrated with the run-time performance of the applications on multiprocessor systems.
As a classical method of image segmentation in mathematical morphology, the watershed transform has been applied successfully in fields such as remote sensing image processing, biomedical and computer vision appli...
In this paper, based on the advantages of both optical transmission and electronic computation, we first provide a parallel algorithm that takes O(log log N) bus cycles for the medial axis transform of an N×N binary image o...
This paper presents several strategies for parallel implementations of the greedy randomized adaptive search procedure (GRASP) and the variable neighborhood search (VNS) applied to a combinatorial optimization problem...
In this paper some implicit domain decomposition procedures for solving parabolic problems are proposed. In these methods, the classic implicit scheme is used in each sub-domain, and Dirichlet boundary values at the (...