ISBN: (print) 0769515126
This paper investigates the parallel computation of non-ideal detonation wave propagation in 3-D space on a high-performance computer based on the CC-NUMA architecture. After analyzing and testing the existing serial program, the computation of the curvature and of the first-order and second-order differences was identified as the main target for parallelization. Techniques such as a divide-and-conquer strategy and load balancing were applied to convert the serial program into a parallel one. Numerical simulations show that the parallel program greatly increases the computing speed for non-ideal 3-D detonation wave propagation.
This paper proposes an algorithm that solves cooperative concurrent computing tasks by using the idle cycles of a number of high-performance heterogeneous workstations interconnected by a high-speed network. To obtain better parallel performance, it gives a model and an algorithm for task scheduling among heterogeneous workstations that account for the costs of loading data, computing, communication, and collecting results. Using this algorithm, an optimal subset of the heterogeneous workstations, namely the one with the shortest parallel execution time for the tasks, can be selected.
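The idea of selecting a subset of workstations with the shortest parallel execution time can be illustrated with a minimal sketch. The cost model here is an assumption made for illustration and is not the model from the paper: each workstation is a hypothetical tuple `(name, speed, overhead)`, where `overhead` lumps together its data loading, communication, and result collection costs, the overheads of the chosen workstations are paid one after another, and the computation is divided in proportion to speed. The search is a brute-force enumeration, which is only practical for a handful of machines.

```python
from itertools import combinations


def parallel_time(subset, work):
    """Estimated finishing time under the assumed (hypothetical) cost model.

    subset: tuple of (name, speed, overhead) workstations
    work:   total amount of computation, in the same units as speed * time
    """
    # Assumption: load/communication/collection overheads are serialized.
    total_overhead = sum(overhead for _, _, overhead in subset)
    # Assumption: work is split in proportion to speed, so everyone
    # finishes the compute phase at the same moment.
    total_speed = sum(speed for _, speed, _ in subset)
    return total_overhead + work / total_speed


def best_subset(workstations, work):
    """Try every non-empty subset and return (time, names) of the fastest."""
    best_time, best_names = None, None
    for size in range(1, len(workstations) + 1):
        for subset in combinations(workstations, size):
            t = parallel_time(subset, work)
            if best_time is None or t < best_time:
                best_time = t
                best_names = tuple(name for name, _, _ in subset)
    return best_time, best_names
```

Under this toy model, adding a slow workstation with a high overhead can lengthen the total time, which is why the best subset is not always the full set of machines.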
Heterogeneous parallel systems are becoming increasingly common, especially with the growing use of cluster computers, such as PCs and networks of workstations, for parallel computing. The main concern of this paper is measuring and evaluating the performance of such systems, based on a dynamic load balancing algorithm for parallel depth-first search (DFS). The dynamic load balancing implementation runs under MPI (Message Passing Interface), which allows parallel execution on a cluster of 6 heterogeneous SUN workstations (COHW) running the Solaris operating system and on a cluster of 10 PCs running the Linux operating system. The parallel dynamic load balancing program is written in C.
This paper presents the design and analysis of finite difference domain decomposition algorithms for the two-dimensional heat equation. Numerical results demonstrate the stability and accuracy of the algorithms, which extend those developed by Dawson and others [6].
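For background, the following is a minimal sketch of a finite difference time step for the two-dimensional heat equation. It is the standard explicit (forward-time, centered-space) update on a uniform grid with fixed boundary values, not the domain decomposition algorithms of the paper. The function name `heat_step` and its parameters (`alpha` for diffusivity, `dt` for the time step, `h` for the grid spacing) are chosen here for illustration.

```python
def heat_step(u, alpha, dt, h):
    """Advance u_t = alpha * (u_xx + u_yy) by one explicit time step.

    u is a list of rows (a rectangular grid); boundary values are kept fixed.
    """
    r = alpha * dt / (h * h)
    # The explicit scheme in 2-D with equal spacing is stable only for r <= 1/4.
    if r > 0.25:
        raise ValueError("unstable step: need alpha * dt / h**2 <= 0.25")
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]  # copy, so boundary values carry over
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Five-point discrete Laplacian (times h**2).
            lap = (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
                   - 4.0 * u[i][j])
            new[i][j] = u[i][j] + r * lap
    return new
```

Because each interior point depends only on its four neighbors from the previous step, the grid can be split into subdomains that exchange only their interface values, which is the property that domain decomposition methods build on.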
A type of incomplete decomposition preconditioner based on local block factorization is considered for the matrices derived from discretizing 2-D or 3-D elliptic partial differential equations. We prove that the condition numbers of the preconditioned matrices are small, which means that the constructed preconditioners are effective. We further consider an efficient parallel version of the preconditioner that depends on a single integer parameter. When this parameter is small, the number of iterations needed to converge on multiple processors is much larger than on a single processor, but the difference shrinks steadily as the parameter increases. Finally, we report experiments on a cluster of 6 PCs with clock frequencies of 1.8 GHz. The results show that the local block factorizations are efficient in serial implementation compared with some well-known effective preconditioners, and that the parallel versions are efficient as well.
Improving computational efficiency is a key issue in image processing, especially in edge detection, which is very computationally intensive. With the development of real-time applications of image processing, fast processing response is becoming more critical. In this paper, a technique for distributed image processing on Spiral Architecture is proposed, which provides a platform for speeding up image processing on clusters.
We study parallel solutions to the problem of weighted multiselection: selecting $r$ elements at given weighted ranks from a set $S$ of $n$ weighted elements, where an element has weighted rank $k$ if it is the smallest element such that the aggregated weight of all elements not greater than it in $S$ is not smaller than $k$. We propose efficient algorithms on two of the most popular parallel architectures, hypercube and mesh. For a hypercube with $p < n$ processors, we present a parallel algorithm running in $O(n^{\epsilon} \min\{r, \log p\})$ time for $p = n^{1-\epsilon}$, $0 < \epsilon < 1$, which is cost optimal when $r \ge p$. Our algorithm on a $\sqrt{p} \times \sqrt{p}$ mesh runs in $O(\sqrt{p} + \frac{n}{p} \log^{3} p)$ time, which is the same as multiselection on a mesh when $r \ge \log p$, and thus has the same optimality as multiselection in this case.
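The definition of weighted rank above can be made concrete with a small sequential reference sketch. This is not the hypercube or mesh algorithm of the paper. It sorts the elements, builds prefix sums of the weights, and uses binary search to find, for each requested rank, the smallest element whose cumulative weight reaches that rank. The function name `weighted_multiselect` and the `(value, weight)` input format are assumptions made for illustration, and weights are assumed to be positive.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_multiselect(elements, ranks):
    """Return the element value at each requested weighted rank.

    elements: iterable of (value, weight) pairs with positive weights
    ranks:    iterable of weighted ranks k, with 0 < k <= total weight
    """
    ordered = sorted(elements)  # ascending by value
    # prefix[i] = total weight of ordered[0..i]
    prefix = list(accumulate(weight for _, weight in ordered))
    selected = []
    for k in ranks:
        # First position whose cumulative weight is not smaller than k.
        i = bisect_left(prefix, k)
        if k <= 0 or i == len(ordered):
            raise ValueError(f"weighted rank {k} is out of range")
        selected.append(ordered[i][0])
    return selected
```

For example, with elements `[(5, 2), (1, 1), (3, 4), (9, 3)]` the cumulative weights in sorted order are 1, 5, 7, and 10, so weighted ranks 2 through 5 all select the element 3.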
Recent research on parallel processing on non-dedicated clusters has focused on high execution performance, parallelism management, transparent access to resources, and making clusters easy to use. However, as a collection of independent computers used by multiple users, clusters are susceptible to failure. This paper describes the development of a coordinated checkpointing facility for the GENESIS cluster operating system. The facility was developed by exploiting existing operating system services. High performance and low overhead are achieved by allowing the processes of a parallel application to continue executing while checkpoints are created, and the use of coordinated checkpointing keeps the demands on cluster resources low.
The external selection problem is to select the record with the $K$-th smallest key from $N$ given records that are distributed and stored evenly on the $D$ disks of a parallel machine with $D$ processors. Each processor has its own primary memory of size $M$ records and one disk, where $N/D > M$. The processors are connected in a $\sqrt{D} \times \sqrt{D}$ mesh architecture. Based on a two-stage approach, this paper presents an efficient parallel external selection algorithm for distributed-memory parallel systems. First, all the processors execute local external sorting in parallel, with each processor sorting the $N/D$ records on its own disk. Next, they execute parallel external selection from the $D$ sorted subfiles on the $D$ disks. The algorithm is asymptotically optimal and has a small constant factor in its time complexity.
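The two-stage structure described above can be sketched sequentially and in memory. This sketch makes several simplifying assumptions: each "disk" is a plain Python list, the local external sort is replaced by an ordinary in-memory sort, and the second stage uses a lazy k-way merge as a simple stand-in for the paper's parallel external selection step on the mesh. The function name `select_kth` is hypothetical.

```python
import heapq
from itertools import islice


def select_kth(disks, k):
    """Return the k-th smallest key (1-based) among records spread over disks.

    disks: list of lists, one list of keys per (simulated) disk
    """
    # Stage 1: every processor sorts the records on its own disk.
    # (In the paper this is an external sort, run in parallel.)
    runs = [sorted(records) for records in disks]

    total = sum(len(run) for run in runs)
    if not 1 <= k <= total:
        raise ValueError(f"k must be between 1 and {total}")

    # Stage 2: select from the D sorted runs. A lazy k-way merge is used
    # here instead of the paper's selection algorithm; it stops after
    # producing k keys and never materializes the full merged sequence.
    merged = heapq.merge(*runs)
    return next(islice(merged, k - 1, None))
```

With three simulated disks holding `[9, 2, 7]`, `[4, 1, 8]`, and `[6, 3, 5]`, the call `select_kth(disks, 4)` returns 4. The merge-based second stage costs time proportional to K, whereas a dedicated selection algorithm can avoid scanning all of the first K keys.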
This paper presents a two-level parallel evolutionary algorithm for solving function optimization problems with multiple solutions. The algorithm combines the characteristics of global search and local search: the global search draws individuals closer to each optimal solution while keeping the genetic diversity of the individuals. Different individuals are then selected for local evolution in their appropriate neighborhoods. Numerical experiments indicate that this simple, easy-to-handle algorithm is very practical: all optimal solutions can be found in a single run of the algorithm within a fairly short period of time.