We describe a general algorithm suitable for executing and coupling components of a software framework on a parallel computer. The requirements of a flexible, efficient and robust algorithm are defined precisely, and the motivation for the requirements is demonstrated on several examples. In short, the requirements are the following: (i) the algorithm should allow an arbitrary distribution of processors among the components, (ii) it should allow an arbitrary coupling schedule between the components, (iii) it should not use any inter-processor communication other than that already required by the components and their couplings, and (iv) it should never deadlock. We show that the proposed algorithm, based on the Temporal and Predefined Ordering of Tasks (TPOT), satisfies all these requirements. The TPOT algorithm has been implemented in the Space Weather Modeling Framework. The flexibility and efficiency of the algorithm are demonstrated with several examples. (c) 2006 Elsevier B.V. All rights reserved.
We present a model-based parallel algorithm for origin and orientation refinement for 3D reconstruction in cryoTEM. The algorithm is based upon the Projection Theorem of the Fourier Transform. Rather than projecting the current 3D model and searching for the best match between an experimental view and the calculated projections, the algorithm computes the Discrete Fourier Transform (DFT) of each projection and searches for the central section ("cut") of the 3D DFT that best matches the DFT of the projection. Factors that affect the efficiency of a parallel program are first reviewed, and then the performance and limitations of the proposed algorithm are discussed. The parallel program that implements this algorithm, called pO(2)R, has been used for the refinement of several virus structures, including those of the 500 Å diameter dengue virus (to 9.5 Å resolution), the 850 Å mammalian reovirus (to better than 7 Å), and the 1800 Å Paramecium bursaria chlorella virus (to 15 Å). (c) 2005 Elsevier Inc. All rights reserved.
Exploiting the special structure of near tridiagonal Toeplitz matrices, a new fast algorithm is presented for solving near tridiagonal Toeplitz equations. Based on the near LU factorization of a tridiagonal Toeplitz matrix and the divide-and-conquer principle, a fast distributed parallel algorithm is put forward for near tridiagonal Toeplitz equations. By introducing the Qin Jiushao algorithm (Horner's scheme) and a special mathematical technique, the new parallel algorithm avoids redundant operations. It is also proved in theory that the algorithm's speedup is close to linear. Finally, numerical experiments show that the new parallel algorithm has a high parallel efficiency; above all, if n is large enough, the speedup is approximately linear.
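For orientation, the sequential problem that the abstract parallelizes can be sketched as follows. This is not the paper's near LU factorization or its distributed scheme. It is the standard Thomas algorithm applied to a strictly tridiagonal Toeplitz matrix. The function name and the parameters `a`, `b`, `c` (constant sub-diagonal, diagonal, and super-diagonal entries) are assumptions made for illustration.

```python
def solve_tridiagonal_toeplitz(a, b, c, rhs):
    """Solve T x = rhs, where T is tridiagonal Toeplitz with constant
    sub-diagonal a, diagonal b and super-diagonal c.

    Sequential Thomas algorithm (LU elimination without pivoting).
    Assumption: T is safe to factor without pivoting, for example
    because it is diagonally dominant.
    """
    n = len(rhs)
    if n == 0:
        return []
    c_prime = [0.0] * n  # modified super-diagonal after elimination
    d_prime = [0.0] * n  # modified right-hand side after elimination
    c_prime[0] = c / b
    d_prime[0] = rhs[0] / b
    # Forward elimination.
    for i in range(1, n):
        denom = b - a * c_prime[i - 1]
        c_prime[i] = c / denom
        d_prime[i] = (rhs[i] - a * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

The forward sweep is inherently sequential, since each step depends on the previous one. That dependence is the reason a parallel method needs some form of partitioning, such as the divide-and-conquer approach the abstract mentions.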
Nearest neighbor query is a basic problem of computational geometry. As an extension of nearest neighbor query, k-nearest-neighbor is widely applied in the fields of VLSI design, data retrieval, pattern matching, graph processing, etc. A parallel algorithm is presented on a reconfigurable mesh of size N × N for k-nearest-neighbor search in a planar point set S of N points. The time complexity of this algorithm is O(k). It attains the lower bound of this problem.
ISBN: 9781424401109 (print)
With the interconnection of power systems, inter-area low-frequency oscillation occurs frequently, and insufficient damping of weakly interconnected power grids has become a serious issue affecting power system stability. Considering the high order of the algebraic and differential equations and the geographically distributed data of interconnected power systems, it is necessary to analyze small-signal stability problems in parallel on distributed computers. Based on the 'Multi-port Inverse Matrix parallel algorithm' proposed by China EPRI, this paper describes in detail parallel implementations of the Simultaneous Iteration algorithm and the Inverse Iteration/Rayleigh Quotient Iteration algorithm. Because they require limited communication time and achieve good load balancing, the proposed parallel algorithms run well on distributed PC clusters. Numerical simulation results on actual large-scale power systems show that the proposed algorithms are correct and efficient.
ISBN: 9780889866386 (print)
To keep up with the fast development of the Internet, a cluster architecture has been proposed for next-generation core routers. In a cluster router, parallel computation is expected. Computing the shortest path tree (SPT) is a fundamental problem in implementing OSPF, which is one of the most popular routing protocols. This paper presents a parallel algorithm, BPA (Branching Parallel Algorithm), for computing the SPT, analyzes the performance of BPA, and finally validates that performance by experiments.
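The abstract does not describe how BPA works, so the sketch below only shows the sequential baseline that an SPT computation for OSPF is commonly built on: Dijkstra's algorithm with a binary heap. The function name and the adjacency-list layout are assumptions for illustration, and link costs are assumed to be non-negative.

```python
import heapq


def shortest_path_tree(graph, root):
    """Return (dist, parent) describing the shortest path tree from root.

    graph: dict mapping node -> list of (neighbor, cost) pairs.
    Assumption: all costs are non-negative, as Dijkstra's algorithm requires.
    """
    dist = {root: 0}
    parent = {root: None}  # parent[v] is the predecessor of v in the tree
    heap = [(0, root)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry, u was already finalized
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent
```

Following `parent` from any node back to `root` gives that node's shortest path, which is the information a router needs to derive next hops.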
ISBN: 0387344020 (print)
Sequence alignment is one of the most important fundamental operations in bioinformatics. It has been successfully applied to predict the function, structure and evolution of biological sequences. In this paper, sequence alignment algorithms based on dynamic programming are analyzed and compared. We present a parallel algorithm for pairwise alignment and implement it on a cluster system with MPI. The experimental results demonstrate that it improves performance effectively. We encapsulate the algorithm in a grid service for practical use.
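As a reference point for the dynamic-programming approach mentioned above, here is a minimal sequential sketch of global pairwise alignment scoring (Needleman-Wunsch). The abstract does not specify which dynamic-programming variant or scoring scheme the paper uses, so the function name and the `match`, `mismatch` and `gap` values are assumptions chosen for illustration.

```python
def global_alignment_score(s, t, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings s and t.

    Needleman-Wunsch dynamic programming with a linear gap penalty.
    The scoring parameters are illustrative defaults.
    """
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align s[i-1] with t[j-1]
                score[i - 1][j] + gap,       # gap in t
                score[i][j - 1] + gap,       # gap in s
            )
    return score[-1][-1]
```

Each cell depends only on its upper, left and upper-left neighbors, so cells on the same anti-diagonal are independent of one another. Parallel versions commonly exploit this property, though whether the paper's MPI implementation does so is not stated in the abstract.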
In this paper, we present a general survey of parallel computing. The main contents include parallel computer systems, which are the hardware platform of parallel computing; parallel algorithms, which are its theoretical basis; and parallel programming, which is its software support. We then introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture-algorithm-programming-application". Only in this way can parallel computing research develop continuously and become more realistic.
Based on the two-list algorithm and the parallel three-list algorithm, an improved parallel three-list algorithm for the knapsack problem is proposed, which adopts divide and conquer together with parallel merging without memory conflicts. To find a solution for the n-element knapsack problem, the proposed algorithm needs O(2^(3n/8)) time when O(2^(3n/8)) shared memory units and O(2^(n/4)) processors are available. Comparisons between the proposed algorithm and 10 existing algorithms show that the improved parallel three-list algorithm is the first exclusive-read exclusive-write (EREW) parallel algorithm that can solve knapsack instances in less than O(2^(n/2)) time when the available hardware resources are smaller than O(2^(n/2)), and hence it improves on previous results.
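The sequential two-list algorithm that this work builds on can be sketched as follows. The sketch assumes the subset-sum form of the knapsack problem (decide whether some subset of the weights sums exactly to a target), which the abstract does not state explicitly. The function names are hypothetical, and this is not the improved parallel three-list algorithm itself.

```python
def subset_sums(items):
    """Return the sums of all subsets of items (2**len(items) values)."""
    sums = [0]
    for w in items:
        sums += [s + w for s in sums]
    return sums


def two_list_subset_sum(weights, target):
    """Decide whether some subset of weights sums exactly to target.

    Two-list (meet-in-the-middle) search: enumerate the subset sums of
    each half, then scan an ascending list against a descending one.
    """
    half = len(weights) // 2
    left = sorted(subset_sums(weights[:half]))                 # ascending
    right = sorted(subset_sums(weights[half:]), reverse=True)  # descending
    i = j = 0
    while i < len(left) and j < len(right):
        total = left[i] + right[j]
        if total == target:
            return True
        if total < target:
            i += 1  # need a larger sum: advance in the ascending list
        else:
            j += 1  # need a smaller sum: advance in the descending list
    return False
```

Each list holds about 2^(n/2) sums, which is the source of the O(2^(n/2)) bound the abstract compares against. The explicit `sorted` calls here add a factor of n that the classical algorithm avoids by generating the lists in order through merging.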
A parallel algorithm for solving the coupled-perturbed MCSCF (CPMCSCF) equations and computing analytic nuclear second derivatives of CASSCF wave functions is presented. A parallel scheme for evaluating derivative integrals and their subsequent use in constructing other derivative quantities is described. The task of solving the CPMCSCF equations is approached using a parallelization scheme that partitions the electronic Hessian matrix over all processors, as opposed to simply partitioning the 3N solution vectors among the processors. The scalability of the current algorithm, up to 128 processors, is demonstrated. Results from three test cases indicate that parallelizing derivative integral evaluation through a simple scheme is highly effective regardless of the size of the basis set employed in the CASSCF energy calculation. Parallelization of the construction of the MCSCF electronic Hessian during solution of the CPMCSCF equations varies quantitatively depending on the nature of the Hessian itself, but is highly scalable in all cases. (c) 2005 Wiley Periodicals, Inc.