Improvements are presented to an operator-splitting method on multiple grids for the simulation of transport, and possibly reaction, in highly heterogeneous porous media. The method captures the effect of fine-scale heterogeneity on the fluid flow with low numerical dispersion, and has great potential for distributed-memory parallel processing of large-scale simulations on the order of a million grid cells. The geological description of the reservoir can be incorporated down to a length scale of about one to two feet. Results are discussed for an areal displacement problem with real field data over a domain that extends over several square miles. Performance results on an iPSC/860 hypercube for simple problems show the high parallel content of the algorithm.
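As a rough illustration of the operator-splitting idea only, the sketch below advances a 1D concentration profile by one transport sub-step followed by one reaction sub-step. It is not the multigrid method of the paper: the first-order upwind scheme, the first-order decay reaction, and all function and parameter names are assumptions chosen for brevity, and plain upwinding is itself numerically diffusive.

```python
import math


def advect_upwind(conc, courant):
    """One explicit upwind step for positive velocity and zero inflow.

    courant = velocity * dt / dx and must lie in [0, 1] for stability.
    """
    return [
        conc[i] - courant * (conc[i] - (conc[i - 1] if i > 0 else 0.0))
        for i in range(len(conc))
    ]


def react_decay(conc, rate, dt):
    """Exact update for first-order decay dc/dt = -rate * c over one step."""
    factor = math.exp(-rate * dt)
    return [value * factor for value in conc]


def split_step(conc, courant, rate, dt):
    """Sequential splitting: transport sub-step, then reaction sub-step."""
    return react_decay(advect_upwind(conc, courant), rate, dt)


profile = [1.0, 0.0, 0.0, 0.0]
# With a Courant number of 1 and no reaction, the pulse shifts by one cell.
shifted = split_step(profile, courant=1.0, rate=0.0, dt=0.1)
# With no transport, only the decay acts on the pulse.
decayed = split_step(profile, courant=0.0, rate=1.0, dt=1.0)
```

Treating each process in its own sub-step lets a different scheme, or a different grid, be used for each one, which is the property the abstract relies on.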
ISBN: 0818626720 (print)
Many elementary numerical algorithms involve not only vector operations but also matrix operations. Today's vector processors support only vector operations and execute matrix operations in terms of vector operations, because they cannot access matrix operands in one instruction. This leads to poor sustained performance on vector machines. In this paper we discuss how to support both vector operations and matrix operations in vector architectures. First, subarray patterns for vector and matrix operations are introduced. Then we present a set of accessing modes that allow vector architectures to access both vector and matrix operands. Finally, the performance improvement for matrix multiplication and the FFT is demonstrated.
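The conventional approach the abstract criticizes, executing a matrix operation as a sequence of vector operations, can be sketched as follows. This is a minimal illustration assuming row-major storage in a flat list; the helper names are hypothetical, and the paper's own subarray patterns and accessing modes are not described in the abstract, so they are not reproduced here.

```python
def strided(flat, start, stride, count):
    """Gather `count` elements of `flat`, beginning at `start`, `stride` apart."""
    return [flat[start + k * stride] for k in range(count)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matmul_strided(a, b, n):
    """Multiply two n x n row-major matrices stored as flat lists.

    Every operand fetch is a vector access: rows use stride 1,
    columns use stride n. No matrix is ever fetched as a whole.
    """
    c = [0.0] * (n * n)
    for i in range(n):
        row = strided(a, i * n, 1, n)
        for j in range(n):
            col = strided(b, j, n, n)
            c[i * n + j] = dot(row, col)
    return c


a = [1.0, 2.0, 3.0, 4.0]   # [[1, 2], [3, 4]]
b = [5.0, 6.0, 7.0, 8.0]   # [[5, 6], [7, 8]]
product = matmul_strided(a, b, 2)
```

An n x n product needs n rows and n*n column gathers here, which shows why vector-only access can limit sustained performance.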
ISBN: 0780306058 (print)
The proceedings contain 104 papers. The topics discussed include: incrementally extensible hypercube (IEH) graphs; efficient distributed routing algorithms for a synchronous circuit-switched hypercube; distributed algorithms for shortest-path, deadlock-free routing and broadcasting in Fibonacci cubes; predicting the limits of multiple processor performance using job profiles; performance modeling of a shared-memory multiprocessor system; the performance of local and global scheduling strategies in multi-programmed parallel systems; partially shared variables and hierarchical shared memory multiprocessor architectures; and throughput enhancement in multiprocessor architectures for pipelining and digital signal processing applications.
ISBN: 0818626720 (print)
This paper explores the use of Proteus, an architecture-independent language suitable for prototyping parallel and distributed programs. Proteus is a high-level imperative notation based on sets and sequences with a single construct for the parallel composition of processes communicating through shared memory. Several different parallel algorithms for N-body simulation are presented in Proteus, illustrating how Proteus provides a common foundation for expressing the various parallel programming models. This common foundation allows prototype parallel programs to be tested and evolved without the use of machine-specific languages. To transform prototypes into implementations on specific architectures, program refinement techniques are used. Refinement strategies are illustrated that target broad-spectrum parallel intermediate languages, and their viability is demonstrated by refining an N-body algorithm to data-parallel CVL code.
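For readers unfamiliar with the N-body problem used as the running example, a direct pairwise formulation over sequences might look like the sketch below. It is written in Python, not Proteus, and it is sequential; the function name, the 2D setting, the gravitational constant of 1, and the softening term are all assumptions made for illustration.

```python
def accelerations(positions, masses, g=1.0, softening=1e-9):
    """Direct O(N^2) gravitational accelerations for bodies in 2D.

    Each body's acceleration depends only on the current positions,
    so the outer loop over bodies is the part a data-parallel
    refinement could evaluate concurrently.
    """
    def pull(i, j):
        # Acceleration of body i caused by body j.
        dx = positions[j][0] - positions[i][0]
        dy = positions[j][1] - positions[i][1]
        r2 = dx * dx + dy * dy + softening
        scale = g * masses[j] / (r2 * r2 ** 0.5)
        return (scale * dx, scale * dy)

    n = len(positions)
    result = []
    for i in range(n):
        pulls = [pull(i, j) for j in range(n) if j != i]
        result.append((sum(p[0] for p in pulls), sum(p[1] for p in pulls)))
    return result


bodies = [(0.0, 0.0), (1.0, 0.0)]
weights = [1.0, 2.0]
acc = accelerations(bodies, weights)
```

In this two-body case the lighter body is pulled toward the heavier one twice as strongly as the reverse, so total momentum change is zero.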
ISBN: 0872628671 (print)
Computer architectures may be classified by the relationship between their instruction and data streams (SISD, SIMD, MISD, MIMD) and the locality of memory (shared, local). Distributed computing is an emerging parallel technology centering around the LM-MIMD hardware model. Computational platforms falling in this category are network-connected workstations and bus-connected local-memory processors. A methodology for implementing nonlinear finite element analysis on a homogeneous distributed processing network is discussed. The method can also be extended to heterogeneous networks composed of different machine architectures, provided that they have a mutual communication interface. The development environment for the present prototype comprised two distributed processing platforms, each used individually: a Sun SPARCstation network and an 8-node Intel Touchstone i860 MIMD machine. The domain is decomposed serially in a preprocessor. Separate input files are written for each subdomain. These files are read in by local copies of the program executable operating in parallel. Communication between processors is handled using synchronous message passing. The basic kernel of message passing is the exchange of internal forces, which is analogous to the interaction that physical bodies undergo when subjected to internal forces. Results show a best-case speedup of 1.95 (97.5% efficiency) for the Intel hypercube using 2 i860 processors and 1.93 (96.5% efficiency) using 2 Sun SPARCstations on the Ethernet network.
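The efficiency figures quoted above follow from the usual definition, efficiency = speedup / number of processors. A minimal check, with hypothetical function names:

```python
def speedup(serial_time, parallel_time):
    """Ratio of single-processor run time to multi-processor run time."""
    return serial_time / parallel_time


def parallel_efficiency(speedup_value, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup_value / processors


# Reported best-case speedups on two processors.
hypercube = parallel_efficiency(1.95, 2)     # Intel i860 hypercube
workstations = parallel_efficiency(1.93, 2)  # Sun workstations on Ethernet
```

Both values are fractions of 1, so they correspond to the 97.5% and 96.5% stated in the abstract.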
Window-based parallel architectures are considered as target structures for the computation of low- and medium-level image processing algorithms. Their definition stems from a general reformulation of algorithms based on local data processing. A methodology for high-level global evaluation of such architectures is presented, taking the achievable performance of the structures as the main parameter.
This paper investigates use of the CM-2 Connection Machine (CM) for simulations of flow in randomly heterogeneous, variably saturated porous media. Heterogeneity is synthesized by assuming saturated hydraulic conductivity to be a realization of a spatially uncorrelated random function. The finite difference solution methodology leads to nonlinear equations, which are linearized by an iterative Picard scheme. The resulting system of linear equations is solved iteratively using either the diagonally preconditioned conjugate gradient (DPCG) method or the Jacobi method. Comparisons of the Connection Machine Variably Saturated Flow Simulator (CMVSFS) with scalar, engineering-practice-oriented, and highly vectorized/optimized numerical codes are encouraging and point to the importance of continued research in hydrological applications of massively parallel computing.
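The combination of an outer Picard linearization with an inner DPCG solve can be sketched on a much simpler problem than the one in the paper. The example below assumes a steady 1D nonlinear diffusion equation with fixed end values and a made-up conductivity law `1 + u*u`; none of these choices, nor the function names, come from CMVSFS, and the code is sequential.

```python
import math


def dpcg(a, b, tol=1e-10, max_iter=500):
    """Conjugate gradient with diagonal (Jacobi) preconditioning.

    `a` is a dense symmetric positive definite matrix (list of rows).
    """
    n = len(b)
    x = [0.0] * n
    r = b[:]                                  # residual for x = 0
    z = [r[i] / a[i][i] for i in range(n)]    # apply inverse diagonal
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if math.sqrt(sum(v * v for v in r)) < tol:
            break
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        z = [r[i] / a[i][i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


def conductivity(u):
    """Hypothetical state-dependent conductivity (an assumption)."""
    return 1.0 + u * u


def picard_solve(n=20, left=1.0, right=0.0, tol=1e-8, max_outer=100):
    """Solve -(K(u) u')' = 0 on n interior nodes by Picard iteration."""
    u = [0.0] * n
    for sweep in range(1, max_outer + 1):
        full = [left] + u + [right]
        # Freeze conductivities at the previous iterate (Picard step).
        k = [0.5 * (conductivity(full[i]) + conductivity(full[i + 1]))
             for i in range(n + 1)]
        a = [[0.0] * n for _ in range(n)]
        b = [0.0] * n
        for i in range(n):
            a[i][i] = k[i] + k[i + 1]
            if i > 0:
                a[i][i - 1] = -k[i]
            else:
                b[i] += k[0] * left
            if i < n - 1:
                a[i][i + 1] = -k[i + 1]
            else:
                b[i] += k[n] * right
        u_new = dpcg(a, b)
        change = max(abs(u_new[i] - u[i]) for i in range(n))
        u = u_new
        if change < tol:
            break
    return u, sweep


heads, sweeps = picard_solve()
```

The diagonal preconditioner needs only element-wise division, which is one reason this solver maps naturally onto a machine with one processor per grid cell.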
Rank order filters form an important class of low level image operations that have widespread applications in image smoothing, texture analysis, etc. In this paper, we study several ways of computing rank order filter...
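As a reference point for what a rank order filter computes, here is a minimal sequential sketch. The square window, the edge-replication border rule, and the function name are assumptions; the parallel implementations studied in the paper are not shown in the excerpt above.

```python
def rank_order_filter(image, rank, size=3):
    """Replace each pixel by the rank-th smallest value in its window.

    rank 0 gives a minimum filter, size*size - 1 a maximum filter,
    and the middle rank a median filter. Borders replicate edge pixels.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            )
            out[r][c] = window[rank]
    return out


noisy = [[1, 1, 1],
         [1, 9, 1],
         [1, 1, 1]]
median = rank_order_filter(noisy, rank=4)    # removes the isolated spike
maximum = rank_order_filter(noisy, rank=8)   # spreads the spike instead
```

Every output pixel depends only on a small input window, so the pixels can be processed independently.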
The paper concerns computational models and languages for iterative cellular automata for image processing applications. The authors present some formal models, with different computational resources, that can be useful in the solution of certain algorithmic problems. They introduce concepts such as memory splitting, conditional functions, dynamic neighborhood, and supervisor automaton. The models defined lead to a parallel language structure that can express low-level image processing algorithms in a clear and concise way. The language allows a transparent description of the algorithms and can easily be extended to reflect the needs of people working in different branches of image processing.
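A plain cellular automaton step applied to a binary image can be sketched as below. This shows only the basic model (a fixed four-cell neighborhood and one local rule applied synchronously); the extensions named in the abstract, such as memory splitting or a supervisor automaton, are not modeled, and all names are hypothetical.

```python
def ca_step(grid, rule):
    """Apply `rule` to every cell at once, using the old grid as input.

    Cells outside the image are treated as background (0).
    """
    rows, cols = len(grid), len(grid[0])

    def cell(r, c):
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [
        [
            rule(cell(r, c),
                 [cell(r - 1, c), cell(r + 1, c), cell(r, c - 1), cell(r, c + 1)])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def erode(center, neighbors):
    """Keep a foreground cell only if all four neighbors are foreground."""
    return 1 if center == 1 and all(n == 1 for n in neighbors) else 0


image = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
eroded = ca_step(image, erode)
```

Passing the rule as a function keeps the update loop separate from the image operation, so other local operations reuse the same step.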
This paper proposes a novel approach to program development for highly parallel architectures, primarily as far as debugging is concerned. The visual nature of the debugging stage, when dealing with image-processing algorithms, is heavily supported so that all the relevant information, which is generally either hidden or presented without its logical structure, is made available to programmers. The authors present the modular and portable software system built at Pavia University for the PAPIA2 machine.