A parallel algorithm for efficient calculation of the second derivatives (Hessian) of the conformational energy in internal coordinates is proposed. This parallel algorithm is based on the master/slave model. A master processor distributes the calculations of components of the Hessian to one or more slave processors that, after finishing their calculations, send the results to the master processor, which assembles all the components of the Hessian. Our previously developed molecular analysis system for conformational energy optimization, normal mode analysis, and Monte Carlo simulation for internal coordinates is extended to use this parallel algorithm for Hessian calculation on a massively parallel computer. The implementation of our algorithm uses the Message Passing Interface and works effectively on both distributed-memory parallel computers and shared-memory parallel computers. We applied this system to the Newton-Raphson energy optimization of the structures of glutaminyl transfer RNA (Gln-tRNA) with 74 nucleotides and glutaminyl-tRNA synthetase (GlnRS) with 540 residues to analyze the performance of our system. The parallel speedups for the Hessian calculation were 6.8 for Gln-tRNA with 24 processors and 11.2 for GlnRS with 54 processors. The parallel speedups for the Newton-Raphson optimization were 6.3 for Gln-tRNA with 30 processors and 12.0 for GlnRS with 62 processors. (C) 1998 John Wiley & Sons, Inc.
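The master/worker pattern described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a Python thread pool in place of the Message Passing Interface, a toy polynomial `energy` in place of a conformational energy, and central finite differences in place of whatever derivative formulas the authors use. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def energy(x):
    """Toy stand-in for a conformational energy: E = x0^2 * x1 + 3 * x1 * x2."""
    return x[0] ** 2 * x[1] + 3.0 * x[1] * x[2]


def hessian_component(f, x, i, j, step=1e-4):
    """Worker task: central-difference estimate of d2f / (dx_i dx_j)."""
    def shifted(di, dj):
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)

    return (shifted(step, step) - shifted(step, -step)
            - shifted(-step, step) + shifted(-step, -step)) / (4.0 * step * step)


def parallel_hessian(f, x, workers=4):
    """Master: hand out upper-triangle components, then assemble the full matrix."""
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(i, n)]
    hess = [[0.0] * n for _ in range(n)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        values = pool.map(lambda ij: hessian_component(f, x, ij[0], ij[1]), pairs)
        # The master collects each result and fills both symmetric entries.
        for (i, j), value in zip(pairs, values):
            hess[i][j] = hess[j][i] = value
    return hess


if __name__ == "__main__":
    for row in parallel_hessian(energy, [1.0, 2.0, 3.0]):
        print(row)
```

Only the upper triangle is distributed because the Hessian is symmetric. In a real message-passing setting the cost of sending results back to the master would limit the achievable speedup, which this thread-based sketch does not model.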
In this paper, based upon the divide-and-conquer strategy, we propose a parallel algorithm for solving circulant tridiagonal systems, which is simpler than the two previous algorithms proposed by Agui and Jimenez (1995) and Chung et al. (1995). Our algorithm can be easily generalized to solve tridiagonal systems, block-tridiagonal systems, and circulant block-tridiagonal systems. (C) 1998 Elsevier Science B.V.
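For orientation, a circulant (cyclic) tridiagonal system couples the first and last unknowns in addition to the usual three diagonals. The sketch below is a serial reference solver, not the divide-and-conquer algorithm of the abstract. It assumes the well-known combination of the Thomas algorithm with a Sherman-Morrison rank-one correction, and the function names are hypothetical.

```python
def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm for sub-diagonal a, diagonal b, super-diagonal c.

    a[0] and c[-1] are ignored. No pivoting is done, so the matrix is
    assumed to be well conditioned (for example diagonally dominant).
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def solve_circulant_tridiagonal(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] with indices mod n.

    The corner entries a[0] and c[-1] are moved into a rank-one term u*v^T,
    leaving an ordinary tridiagonal matrix (Sherman-Morrison). Needs n >= 3.
    """
    n = len(d)
    gamma = -b[0]
    bb = list(b)
    bb[0] = b[0] - gamma
    bb[-1] = b[-1] - c[-1] * a[0] / gamma
    u = [0.0] * n
    u[0] = gamma
    u[-1] = c[-1]
    y = solve_tridiagonal(a, bb, c, d)
    q = solve_tridiagonal(a, bb, c, u)
    # v = (1, 0, ..., 0, a[0] / gamma), so v.y and v.q touch only two entries.
    v_dot_y = y[0] + a[0] * y[-1] / gamma
    v_dot_q = q[0] + a[0] * q[-1] / gamma
    factor = v_dot_y / (1.0 + v_dot_q)
    return [yi - factor * qi for yi, qi in zip(y, q)]
```

A parallel method would replace the two sequential sweeps with independent sub-problems. This version is mainly useful as a correctness check for such an implementation.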
This paper presents a parallel algorithm for solving the implicit diffusion difference equations. The basic idea is to vectorize the tridiagonal Toeplitz difference equations. This method is superior to the algorithm presented by H. Stone [8]. We computed some examples on an NEC SX-3/44R supercomputer with our method. The results showed good parallelism for this algorithm.
Authors: Wei, YM; Wu, HB; Wei, JY
Fudan Univ, Dept Math, Shanghai 200433, Peoples R China
Fudan Univ, Lab Math Nonlinear Sci, Shanghai 200433, Peoples R China
Fudan Univ, Inst Math, Shanghai 200433, Peoples R China
Fudan Univ, Dept Comp Sci, Shanghai 200433, Peoples R China
Fudan Univ, Parallel Proc Inst, Shanghai 200433, Peoples R China
We derive a successive matrix squaring (SMS) algorithm to approximate the weighted generalized inverse, which can be expressed in the form of successive squaring of a composite matrix T. Given an m by n matrix A with m ≈ n, we show that the weighted generalized inverse of A can be computed in parallel time ranging from O(log n) to O(log² n), provided that there are enough processors to support matrix multiplication in time O(log n). (C) 2000 Elsevier Science Inc. All rights reserved.
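The idea of successive matrix squaring can be illustrated with a small sketch. It assumes the unweighted special case (identity weights, so the result approximates the ordinary Moore-Penrose inverse) and a composite matrix of the block form T = [[P, Q], [0, I]] with P = I − βAᵀA and Q = βAᵀ. Squaring T k times leaves the sum of PⁱQ for i < 2ᵏ in the top-right block. The exact blocks used in the paper for the weighted case are not given in the abstract, and the step size `beta` and all function names here are hypothetical.

```python
def matmul(x, y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def sms_pseudoinverse(a, beta, squarings):
    """Approximate the Moore-Penrose inverse of a real m x n matrix `a`.

    Builds T = [[I - beta*A^T*A, beta*A^T], [0, I]] and squares it
    `squarings` times; the top-right n x m block is the approximation.
    Convergence assumes 0 < beta < 2 / (largest eigenvalue of A^T*A).
    """
    m, n = len(a), len(a[0])
    at = [[a[i][j] for i in range(m)] for j in range(n)]  # transpose, n x m
    ata = matmul(at, a)                                    # n x n
    size = n + m
    t = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            t[i][j] = (1.0 if i == j else 0.0) - beta * ata[i][j]
        for j in range(m):
            t[i][n + j] = beta * at[i][j]
    for i in range(m):
        t[n + i][n + i] = 1.0
    for _ in range(squarings):
        t = matmul(t, t)  # each squaring doubles the number of series terms
    return [row[n:] for row in t[:n]]


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
    for row in sms_pseudoinverse(a, beta=0.4, squarings=6):
        print(row)
```

Because each step is a single matrix product, the parallel running time is governed by how quickly the available processors can multiply matrices, which is the dependence the abstract describes.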
A simple parallel algorithm for the evaluation of polynomials written in the Chebyshev form is introduced. By this method only 2⌈log₂(p − 2)⌉ + ⌈log₂ p⌉ + 4⌈N/p⌉ − 7 steps on p processors are needed to evaluate a Chebyshev series of degree N. Theoretical analysis of the efficiency is performed and some numerical examples on a CRAY T3D are shown.
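To get a feel for the step count quoted above, read as 2⌈log₂(p − 2)⌉ + ⌈log₂ p⌉ + 4⌈N/p⌉ − 7, the following sketch evaluates it for given N and p. The function names are hypothetical. The requirement p ≥ 3 is an assumption made only so that log₂(p − 2) is defined; the abstract does not state the range of validity.

```python
def ceil_log2(k):
    """Ceiling of log2(k) for an integer k >= 1, using exact integer arithmetic."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (k - 1).bit_length()


def chebyshev_parallel_steps(degree, processors):
    """Step count 2*ceil(log2(p-2)) + ceil(log2(p)) + 4*ceil(N/p) - 7."""
    if processors < 3:
        raise ValueError("the formula needs at least 3 processors")
    blocks = -(-degree // processors)  # integer ceiling of N / p
    return (2 * ceil_log2(processors - 2) + ceil_log2(processors)
            + 4 * blocks - 7)


if __name__ == "__main__":
    for p in (4, 8, 16, 32):
        print(p, chebyshev_parallel_steps(1024, p))
```

For large N the 4⌈N/p⌉ term dominates, so the count falls roughly in proportion to 1/p until the logarithmic terms become comparable.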
The performance of the standard Monte Carlo method is compared with the performance obtained through the use of (t, m, s)-nets in base b in the approximation of several high-dimensional integral problems in valuing derivatives and other securities. The (t, m, s)-nets are generated by a parallel algorithm, where particular consideration is given to scalability of dynamic adaptive routing and load balancing in the design and implementation of the algorithm. From the numerical evidence it appears that such nets can be powerful tools for valuing such securities. (C) 2000 Elsevier Science B.V. All rights reserved.
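As a small illustration of what a (t, m, s)-net is, the sketch below builds the two-dimensional Hammersley point set in base 2, a textbook (0, m, 2)-net, and checks the defining property that every elementary box of area 2⁻ᵐ contains exactly one point. This is not the construction or the parallel generator of the paper, which the abstract does not describe. The function names and the test integrand are hypothetical.

```python
def radical_inverse_base2(i, bits):
    """Reverse the lowest `bits` binary digits of i and read them as a fraction."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (i & 1)
        i >>= 1
    return result / (1 << bits)


def hammersley_net(m):
    """2**m points (i / 2**m, radical inverse of i) in the unit square."""
    count = 1 << m
    return [(i / count, radical_inverse_base2(i, m)) for i in range(count)]


def is_zero_m_two_net(points, m):
    """True if every base-2 elementary box of area 2**-m holds exactly one point.

    Assumes `points` contains exactly 2**m points in [0, 1) x [0, 1).
    """
    for k in range(m + 1):
        cols, rows = 1 << k, 1 << (m - k)
        occupied = {(int(x * cols), int(y * rows)) for x, y in points}
        if len(occupied) != cols * rows:
            return False
    return True


def qmc_estimate(f, points):
    """Equal-weight average of f over the point set."""
    return sum(f(x, y) for x, y in points) / len(points)


if __name__ == "__main__":
    net = hammersley_net(10)
    print(is_zero_m_two_net(net, 10))
    print(qmc_estimate(lambda x, y: x * y, net))  # exact integral is 0.25
```

The same average taken over pseudo-random points would give a standard Monte Carlo estimate, which is the comparison the abstract reports for higher-dimensional pricing integrals.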
In this paper, it is supposed that the branch-and-bound (B&B) algorithm finds the first optimal solution after h nodes have been expanded and m active nodes have been created in the state-space tree. Then the lower bound Ω(m + h log h) on the running time of the general sequential B&B algorithm and the lower bound Ω(m/p + h log p) for the general parallel best-first B&B algorithm on the PRAM-CREW model are proposed, where p is the number of processors available. Moreover, the lower bound Ω(M/p + H + (H/p) log (H/p)) is presented for parallel algorithms on distributed-memory systems, where M and H represent the total number of active nodes and of expanded nodes processed by the p processors, respectively. In addition, a nearly fastest general parallel best-first B&B algorithm is put forward. The parallel algorithm is the fastest one when p = max{h^ε, r}, where ε = 1/√(log h) and r is the largest branching number of the nodes in the state-space tree.
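To make the quantities h (expanded nodes) and m (created nodes) concrete, here is a minimal sequential best-first branch-and-bound sketch that counts both. The 0/1 knapsack problem, the fractional upper bound, and all names are assumptions chosen for illustration; the paper treats general B&B and its parallel variants, which this sketch does not implement.

```python
import heapq


def knapsack_best_first(values, weights, capacity):
    """Best-first B&B for 0/1 knapsack with positive weights.

    Returns (best_value, expanded, created), where `expanded` plays the
    role of h and `created` the role of m in the abstract.
    """
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)

    def bound(level, value, room):
        """Upper bound: fill greedily, taking a fraction of the first misfit."""
        for i in range(level, n):
            if w[i] <= room:
                room -= w[i]
                value += v[i]
            else:
                return value + v[i] * room / w[i]
        return value

    best, expanded, created = 0, 0, 1
    heap = [(-bound(0, 0, capacity), 0, 0, capacity)]  # max-heap via negation
    while heap:
        neg_bound, level, value, room = heapq.heappop(heap)
        if -neg_bound <= best:
            break  # best-first order: no remaining node can beat `best`
        expanded += 1
        if w[level] <= room:  # child that takes item `level`
            taken = value + v[level]
            best = max(best, taken)
            heapq.heappush(heap, (-bound(level + 1, taken, room - w[level]),
                                  level + 1, taken, room - w[level]))
            created += 1
        # child that skips item `level`
        heapq.heappush(heap, (-bound(level + 1, value, room),
                              level + 1, value, room))
        created += 1
    return best, expanded, created


if __name__ == "__main__":
    print(knapsack_best_first([60, 100, 120], [10, 20, 30], 50))
```

Each expansion pops the most promising node from a priority queue, an operation whose cost grows logarithmically with the queue size. That is one informal way to see why a term like h log h appears alongside the cost of creating m nodes.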
ISBN:
(print) 0780359356
Real-time transient stability analysis is a challenging computing problem. In order to speed up the solution of this problem, parallel processing technologies have been applied. In this paper, the implementation of parallel algorithms for transient stability analysis on a message-passing multicomputer is described. Both parallelism-in-space and parallelism-in-time are exploited. Test simulations are performed for two large-scale power systems using an IBM SP2 parallel computer. Speedup results are presented to show the performance of the proposed algorithms.
ISBN:
(print) 0769509363
Matrix operations are the core of many linear systems. Efficient matrix multiplication is critical to many numerical applications, such as climate modeling, molecular dynamics, and computational fluid dynamics. Much research has been done to improve the performance of matrix operations. However, the majority of this work focuses on two-dimensional (2D) matrices. Very little research has been done on matrices of three or more dimensions. Recently, a new structure called Extended Karnaugh Map Representation (EKMR) for n-dimensional (nD) matrix representation has been proposed, which provides better matrix operation performance than the traditional matrix representation (TMR). The main idea of EKMR is to represent any nD matrix by 2D matrices. Hence, designing efficient algorithms for nD matrices becomes less complicated. Parallel matrix operation algorithms based on EKMR and TMR are presented. Analysis and experiments are conducted to assess their performance. Both our analysis and experimental results show that parallel algorithms based on EKMR outperform those based on TMR.
ISBN:
(print) 7312012035
As problems in engineering and technology grow in scale and complexity, control theories are required to meet ever higher demands. The rapid development of computer technology, especially the trend toward parallelization, provides new opportunities for the development of control theories. This paper discusses the necessity and prospects of applying parallel algorithms to control, and illustrates the influence of parallel algorithms on control with a parallel algorithm for solving the inverse dynamics problem of six-DOF platform parallel manipulators.