ISBN: (print) 0769515126
Recent research on parallel processing on non-dedicated clusters has focused on high execution performance, parallelism management, transparent access to resources, and making clusters easy to use. However, as collections of independent computers used by multiple users, clusters are susceptible to failure. This paper presents the development of a coordinated checkpointing facility for the GENESIS cluster operating system. The facility was developed by exploiting existing operating system services. High performance and low overhead are achieved by allowing the processes of a parallel application to continue executing while checkpoints are created, and demands on cluster resources are kept low by using coordinated checkpointing.
The external selection problem is to select the record with the K-th smallest key from N given records that are distributed evenly across the D disks of a parallel machine with D processors. Each processor has its own primary memory of size M records and one disk, where N/D > M. The processors are connected in a √D × √D mesh architecture. Based on a two-stage approach, this paper presents an efficient parallel external selection algorithm for distributed-memory parallel systems. First, all processors execute local external sorting in parallel: each processor sorts the N/D records on its own disk. Next, they execute parallel external selection from the D sorted subfiles on the D disks. The algorithm is asymptotically optimal and has a small constant factor in its time complexity.
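The two-stage structure described above can be illustrated with a small serial sketch. It is not the paper's mesh algorithm: the in-memory lists stand in for the per-disk subfiles, and the second stage uses a plain lazy merge instead of a parallel selection. The function names `local_sort` and `select_kth` are hypothetical.

```python
import heapq


def local_sort(partitions):
    """Stage 1: every 'processor' sorts its own partition.

    Assumption: an in-memory sort stands in for the external sort on disk.
    """
    return [sorted(p) for p in partitions]


def select_kth(sorted_runs, k):
    """Stage 2: return the k-th smallest key (1-indexed) across the sorted runs."""
    total = sum(len(run) for run in sorted_runs)
    if not 1 <= k <= total:
        raise ValueError("k must be between 1 and the total number of records")
    # heapq.merge walks the sorted runs lazily, so we stop after k keys.
    for position, key in enumerate(heapq.merge(*sorted_runs), start=1):
        if position == k:
            return key


# Three partitions play the role of D = 3 disks.
runs = local_sort([[9, 1, 5], [7, 3, 8], [2, 6, 4]])
fourth_smallest = select_kth(runs, 4)
```

The merge visits K keys in the worst case, so this only shows the data flow; an efficient parallel selection would avoid scanning that many records.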
Heterogeneous parallel systems are becoming increasingly common, especially with the growing use of cluster computers, such as PCs and networks of workstations, for parallel computing. The main concern of this paper is measuring and evaluating the performance of such parallel systems, based on a dynamic load balancing algorithm for a parallel depth-first search (DFS) algorithm. The dynamic load balancing implementation runs under MPI (Message Passing Interface), which allows parallel execution on a cluster of 6 heterogeneous SUN workstations (COHW) running the Solaris operating system and on a cluster of 10 PCs running the Linux operating system. The parallel dynamic load balancing program is written in C.
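The abstract does not describe the balancing policy, so the following is only a sketch of one common scheme for parallel DFS: an idle worker takes half of the most loaded worker's stack. It is a single-process, round-based simulation in Python, not the MPI/C program from the paper, and `balanced_dfs` and its parameters are hypothetical names.

```python
def balanced_dfs(root, children, num_workers):
    """Simulate DFS over an implicit tree with workers taking turns.

    `children(node)` returns the list of child nodes.
    Returns how many nodes each worker expanded.
    """
    stacks = [[] for _ in range(num_workers)]
    stacks[0].append(root)  # all work starts on worker 0
    expanded = [0] * num_workers
    while any(stacks):
        for w in range(num_workers):
            if not stacks[w]:
                # Idle worker: take half of the most loaded worker's stack.
                donor = max(range(num_workers), key=lambda d: len(stacks[d]))
                half = len(stacks[donor]) // 2
                if half > 0:
                    # Transfer the oldest entries (bottom of the donor's stack).
                    stacks[w] = stacks[donor][:half]
                    del stacks[donor][:half]
            if stacks[w]:
                node = stacks[w].pop()
                expanded[w] += 1
                stacks[w].extend(children(node))
    return expanded


def binary_children(n):
    """Implicit complete binary tree with nodes 1..15."""
    return [2 * n, 2 * n + 1] if n < 8 else []


per_worker = balanced_dfs(1, binary_children, 4)
```

In a message-passing setting the transfer would be a request and reply between processes, and termination detection would need its own protocol; the simulation sidesteps both.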
This paper presents a two-level parallel evolutionary algorithm for solving function optimization problems that have multiple solutions. The algorithm combines the characteristics of global search and local search: the former enables individuals to draw closer to each optimal solution while keeping the genetic diversity of the individuals. Different individuals are then selected for local evolution in their appropriate neighborhoods. Numerical experiments show that this simple, easy-to-handle algorithm is very practical: all optimal solutions can be found in a single run of the algorithm within a fairly short period of time.
This paper concerns parallel computing and its application to computing the full Lyapunov exponents of general nonlinear parameter-dependent continuous ordinary differential equations. Based on the standard serial algorithm developed by Wolf et al. [1], we present a parallel algorithm using the block-cyclic decomposition method and then apply it to computing the Lyapunov exponents of a continuous differential equation. Performance tests on the DAWNING-2000II supercomputer show that the parallel algorithm has a high level of parallelism, needs no message passing (little communication cost), and performs little I/O. In addition, the algorithm can be extended to ordinary differential equations of any high dimension.
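Block-cyclic decomposition itself is a standard index mapping, sketched below. The abstract does not say which items are distributed, so treating them as independent work units (for example, parameter values) is an assumption, and the function names are hypothetical.

```python
def block_cyclic_owner(index, block_size, num_procs):
    """Return the rank that owns item `index` under a block-cyclic layout."""
    return (index // block_size) % num_procs


def block_cyclic_indices(rank, num_items, block_size, num_procs):
    """List the item indices assigned to `rank`."""
    return [
        i
        for i in range(num_items)
        if block_cyclic_owner(i, block_size, num_procs) == rank
    ]


# 10 items, blocks of 2, dealt round-robin to 3 processes.
layout = [block_cyclic_indices(rank, 10, 2, 3) for rank in range(3)]
```

Because every process can compute its own index list from its rank alone, no communication is needed to agree on the assignment, which is consistent with the low communication cost reported above.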
In this paper, we present the design and implementation of a new cluster file system, th-CluFS, which is based on the standard NFS protocol and is implemented entirely in user space. Such an open-platform file system is important as clusters become larger and more heterogeneous. To take advantage of the accumulated resources and high-speed network in clusters, th-CluFS adopts a serverless architecture, hybrid distributed metadata management, and file-granularity data distribution, and it uses a distributed metadata cache and a unique cache to optimize performance. For flexibility, we plan to employ file migration to balance I/O load across nodes dynamically. Based on the experimental results, we conclude that th-CluFS meets the requirements of a consistent file system view, performance, and scalability.
Biological sequence comparison is an important tool for researchers in molecular biology. There are several algorithms for sequence comparison. The Smith-Waterman algorithm, based on dynamic programming, is one of the...
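For readers unfamiliar with the Smith-Waterman recurrence mentioned above, here is a minimal serial sketch that computes only the best local alignment score. The scoring values (match 2, mismatch -1, linear gap -1) are illustrative assumptions, and nothing here reflects the approach of the listed paper, whose abstract is truncated.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman_score("GGACGTCC", "TTACGTAA")
```

Keeping only two rows reduces memory to the length of one sequence, at the cost of not being able to trace back the alignment itself.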
We propose an improved version of the CGS method for solving large, sparse linear systems of equations with unsymmetric coefficient matrices. The proposed method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all matrix-vector multiplications, inner products, and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which is the performance bottleneck, can be significantly reduced. The Bulk Synchronous Parallel (BSP) model is used to design an efficient, scalable, and portable parallel version of the proposed algorithm and to provide accurate performance prediction for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by Ethernet. The performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to provide useful insight into the time complexity of the method. The theoretical performance predictions are compared with some preliminary measured timing results of a numerical application from ocean flow simulation.
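The abstract does not give its cost formulas. As background, the textbook BSP model charges each superstep the maximum local work, plus the largest number of words any processor sends or receives (the h-relation) times the per-word cost g, plus the synchronization latency l. The sketch below encodes only that generic formula; the function names and sample numbers are hypothetical.

```python
def bsp_superstep_cost(work, h_relation, g, l):
    """Cost of one superstep: max local work + g * max h-relation + l.

    `work[i]` is the local computation of processor i and `h_relation[i]`
    the maximum number of words it sends or receives in this superstep.
    """
    return max(work) + g * max(h_relation) + l


def bsp_program_cost(supersteps, g, l):
    """Sum the costs of a sequence of (work, h_relation) supersteps."""
    return sum(bsp_superstep_cost(w, h, g, l) for w, h in supersteps)


# Two processors, two supersteps, with made-up machine parameters g=2, l=7.
total = bsp_program_cost([([10, 12], [3, 5]), ([4, 4], [0, 0])], g=2, l=7)
```

Only g and l depend on the machine, which is why a model of this kind can be re-evaluated for different architectures by swapping two numbers.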
A universal parallelized numerical approach for solving the three-dimensional (3D) convection-diffusion equation with variable coefficients is proposed. It combines the implicit Crank-Nicolson difference method with alternating bar parallelization and can be used to numerically solve any variation of the 3D convection-diffusion equation. By virtue of bar parallelization and a multistep iteration technique, the approach trades off parallelism against accuracy. Its main merits are generality, absolute stability, acceptable space demand, and second-order accuracy. A parallel implementation, named Codie4D, built on networks of workstations with the popular MPI library, offers portability and applicability. Experimental results show that Codie4D has good runtime performance.
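To make the Crank-Nicolson ingredient concrete, the sketch below advances a much simpler problem by one time step: the 1D diffusion equation with a constant coefficient and zero boundary values. It is not Codie4D and omits convection, variable coefficients, and the bar parallelization. Here `r` denotes D·Δt/Δx², and all names are hypothetical.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system (a: sub-, b: main, c: super-diagonal)."""
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):  # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def crank_nicolson_step(u, r):
    """One Crank-Nicolson step for u_t = D u_xx with r = D*dt/dx**2.

    Assumption: u[0] and u[-1] are boundary points held at zero.
    """
    if len(u) < 3:
        raise ValueError("need at least one interior point")
    m = len(u) - 2  # number of interior unknowns
    a = [-r / 2] * m
    b = [1 + r] * m
    c = [-r / 2] * m
    # Explicit half of the scheme applied to the current values.
    d = [
        r / 2 * u[i - 1] + (1 - r) * u[i] + r / 2 * u[i + 1]
        for i in range(1, m + 1)
    ]
    return [0.0] + thomas(a, b, c, d) + [0.0]
```

Averaging the explicit and implicit discretizations is what gives the scheme its second-order accuracy in time and its unconditional stability for this model problem.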
In this paper, an improved version of the BiCGStab method (IBiCGStab) for solving large, sparse linear systems of equations with unsymmetric coefficient matrices is proposed. The method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all inner products of a single iteration step are independent and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which is the bottleneck of parallel performance, can be significantly reduced. The resulting IBiCGStab algorithm maintains the favorable properties of the original method without increasing computational costs. A data distribution suitable for both irregularly and regularly structured matrices, based on an analysis of the non-zero matrix elements, is presented. The communication scheme is supported by overlapping execution of computation and communication to reduce waiting times. The efficiency of the method is demonstrated by numerical experiments carried out on a massively parallel distributed-memory system.
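For reference, the original (unimproved, unpreconditioned) BiCGStab iteration can be sketched serially as follows. This is the standard method, not IBiCGStab: the inner products appear in their usual dependent order, which is exactly what the improved variant restructures. Dense lists of lists stand in for a sparse matrix, and the helper names are hypothetical.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b with standard BiCGStab, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x for the zero initial guess
    r_hat = list(r)      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            raise ArithmeticError("BiCGStab breakdown: rho is zero")
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:
            # Converged after the first half of the step.
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = matvec(A, s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
    return x


# Small unsymmetric system with exact solution [0.1, 0.6].
solution = bicgstab([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Each call to `dot` would be a global reduction on a distributed-memory machine, which is why the ordering of inner products matters for parallel performance.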