Matrix partitioning problems that arise in the efficient estimation of sparse Jacobians and Hessians can be modeled using variants of graph coloring problems. In a previous work [6], we argue that distance-2 and distanc...
Data and control parallelism algorithms are described for a matrix method which detects and locates the presence of logic hazards in combinational logic circuits. Examples are given for illustration.
The method of discrete ordinates is commonly used to solve the Boltzmann radiation transport equation for applications ranging from simulations of fires to weapons effects. The equations are most efficiently solved by sweeping the radiation flux across the computational grid. For unstructured grids this poses several interesting challenges, particularly when implemented on distributed-memory parallel machines where the grid geometry is spread across processors. We describe an asynchronous, parallel, message-passing algorithm that performs sweeps simultaneously from many directions across unstructured grids. We identify key factors that limit the algorithm’s parallel scalability and discuss two enhancements we have made to the basic algorithm: one to prioritize the work within a processor’s subdomain and the other to better decompose the unstructured grid across processors. Performance results are given for the basic and enhanced algorithms implemented within a radiation solver running on hundreds of processors of Sandia’s Intel Tflops machine and DEC-Alpha CPlant cluster.
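A sweep in one direction must process each grid cell only after the cells upstream of it, so a valid sweep order is a topological order of the cell dependency graph. The following is a minimal serial sketch of that ordering idea using Kahn's algorithm. It is not the asynchronous message-passing algorithm of the abstract, and the names `sweep_order`, `cells`, and `downstream` are hypothetical.

```python
from collections import deque


def sweep_order(cells, downstream):
    """Return an order in which every cell follows all of its upstream cells.

    cells      -- iterable of cell identifiers (hypothetical input format)
    downstream -- dict mapping a cell to the cells that depend on it
                  for one fixed sweep direction
    """
    cells = list(cells)
    # Count how many upstream neighbours each cell is still waiting for.
    indegree = {c: 0 for c in cells}
    for c in cells:
        for d in downstream.get(c, ()):
            indegree[d] += 1

    # Cells with no upstream neighbours can be processed immediately.
    ready = deque(c for c in cells if indegree[c] == 0)
    order = []
    while ready:
        c = ready.popleft()
        order.append(c)
        # Finishing c releases the cells directly downstream of it.
        for d in downstream.get(c, ()):
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)

    if len(order) != len(cells):
        raise ValueError("cyclic dependency: no valid sweep order exists")
    return order
```

In a parallel setting, one would expect the `ready` queue to be the natural place to apply a priority rule, since several cells (and several directions) can be ready at once.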
The increasing interest in product networks (PNs) as a method of combining desirable properties of component networks has prompted a need for the general study of the algorithmic issues related to this important class of interconnection networks. In this paper we present unified parallel algorithms for Gaussian elimination, with partial and complete pivoting, on product networks. A parallel algorithm for backward substitution is also presented. The proposed algorithms are network independent and are also independent of the matrix distribution methods employed. These algorithms can be used on a wide range of PNs including hypercube, mesh, and k-ary n-cube. Unified models for estimating computation time and interprocessor communication time are also presented. These models are then used to measure the performance of the proposed algorithms on several product networks.
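For orientation, the serial computation that such parallel algorithms distribute can be sketched as follows: Gaussian elimination with partial pivoting, followed by backward substitution. This is a plain single-processor illustration, not the product-network algorithm of the abstract, and the function name `solve_partial_pivot` is hypothetical.

```python
def solve_partial_pivot(a, b):
    """Solve a x = b for a square matrix a (list of rows) and vector b."""
    n = len(a)
    # Build an augmented matrix of floats so the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        pivot = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[pivot] = m[pivot], m[k]

        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]

    # Backward substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

Complete pivoting would differ only in the pivot search, which would scan the whole remaining submatrix and also swap columns.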
Presents a class of modified parallel Rosenbrock methods (MPROW) that have more free parameters, which can be used to further improve various properties of the methods. Covers background on parallel Rosenbrock methods, convergence and stability analysis, and a discussion of two-stage third-order methods.
We consider a parallel Algol-like language, combining procedures with shared-variable parallelism. Procedures permit encapsulation of common parallel programming idioms. Local variables provide a way to restrict interference between parallel commands. The combination of local variables, procedures, and parallelism supports a form of concurrent object-oriented programming. We provide a denotational semantics for this language, simultaneously adapting possible worlds to the parallel setting and generalizing transition traces to the procedural setting. This semantics supports reasoning about safety and liveness properties of parallel programs, and validates a number of natural laws of program equivalence based on noninterference properties of local variables. The semantics also validates familiar laws of functional programming. We also provide a relationally parametric semantics. This semantics supports standard methods of reasoning about representational independence, adapted to shared-variable programs. The clean design of the programming language and its semantics shows that procedures and shared-variable parallelism can be combined smoothly. (C) 2002 Elsevier Science (USA).
The backtrack search problem involves visiting all the nodes of an arbitrary binary tree given a pointer to its root, subject to the constraint that the children of a node are revealed only after their parent is visited. We present a fast, deterministic backtrack search algorithm for a p-processor COMMON CRCW-PRAM, which visits any n-node tree of height h in time O((n/p + h)(log log log p)^2). This upper bound compares favourably with a natural Omega(n/p + h) lower bound for this problem. Our approach embodies novel, efficient techniques for dynamically assigning tree-nodes to processors to ensure that the work is shared equitably among them. (C) 2002 Elsevier Science B.V. All rights reserved.
A parallel procedure based on a single-program, multiple-data (SPMD) algorithm is presented for parallel computing of turbulent combustion and flame spread in fires. The computation is based on modeling of radiative turbulent reacting flow and pyrolysis of solid fuel. With angular domain decomposition applied to the parallel computing of radiation and spatial domain decomposition to the computation of nonradiative turbulent reacting flow and solid fuel pyrolysis, the whole computation is distributed among a group of concurrent tasks, which communicate with each other through a message-passing interface library. Using this procedure, a self-developed computational combustion code has been parallelized on both a multiprocessor PC and a symmetric multiprocessor (SMP) system, SGI Origin 2000. The parallelization was verified by comparing the parallel results with sequential results. The performance of the parallel procedure was evaluated using various test cases. As expected, the efficiency of parallelism varies with both computer architecture and case scenario. In general, good efficiency was obtained.
The problem of computing a matching of maximum weight in a given edge-weighted graph is not known to be P-hard or in RNC. This paper presents two parallel approximation algorithms for this problem. The first is an RNC-approximation scheme, i.e., an RNC algorithm that computes a matching of weight at least 1 - epsilon times the maximum for any fixed constant epsilon > 0. The other is an NC approximation algorithm achieving an approximation ratio of 1/(2 + epsilon) for any fixed constant epsilon > 0. (C) 2000 Elsevier Science B.V. All rights reserved.
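To make the notion of an approximation ratio concrete, here is a sketch of the classical sequential greedy heuristic, which is known to return a matching of weight at least 1/2 of the maximum. It is a different and much simpler technique than the RNC and NC algorithms of the abstract, and the name `greedy_matching` and the `(u, v, weight)` edge format are assumptions made for illustration.

```python
def greedy_matching(edges):
    """Greedy 1/2-approximation for maximum-weight matching.

    edges -- iterable of (u, v, weight) tuples (hypothetical input format)
    Returns the list of chosen edges.
    """
    matched = set()   # vertices already covered by a chosen edge
    matching = []
    # Consider edges from heaviest to lightest.
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        # Keep the edge only if both endpoints are still free.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching
```

On the path a-b-c-d with weights 2, 3, 2, the heuristic takes only the middle edge (weight 3), while the optimum takes the two outer edges (weight 4), so the result is within the factor-1/2 guarantee but not optimal.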
Computing the frequent subsets of large multi-attribute data is a key component of local pattern detection data mining algorithms. It is both computation- and data-intensive. The standard parallel algorithms require multiple passes through the data. The cost of data access may easily outweigh any performance gained by parallelizing the computational part. We address two opportunities for performance improvement: using a parallel approximate algorithm that requires only a single pass over the data; and using a probabilistic technique to avoid generating most of the lattice of subsets implied by each object's data. The computation required is only slightly greater than that of levelwise algorithms, but the amount of data access is much smaller. (C) 2002 Published by Elsevier Science B.V.