Authors:
Salah, Ahmad; Li, Kenli
Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
Zagazig Univ, Dept Comp Sci, Coll Comp & Informat, Zagazig, Egypt
Hunan Univ, Natl Supercomp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
Protein structure comparison is vital to tasks such as predicting protein structures and functions and detecting evolutionary relationships among proteins. The growth of both parallel computing hardware and the number of known protein structures motivates parallel tools that can handle this massive proteome data. Here, we present a parallel tool, parallel 3D-BLAST (PAR-3D-BLAST), which lists the structures similar to a query protein. Each protein in the result list has a structural similarity score and an alignment to the query structure. The tool is implemented to run on both standalone multi-core computers and clusters of multi-core nodes. The achieved speedup is linear and scalable. The experimental results show that the speedup increases with the size of the database. Using a cluster of 35 computing cores, the tool constructs the database of the entire Structural Classification of Proteins dataset, 108,116 protein entries, in less than 6 min, with an average query time of 1.45 s. The obtained speedup is 20 times for database construction and 17 times for searching the query. The tool is open source and free to use, distribute, and share; it is available at http://***/par3dblast. Copyright (c) 2013 John Wiley & Sons, Ltd.
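To put the reported figures in context, parallel efficiency is commonly taken as speedup divided by the number of cores. The short sketch below applies that standard definition to the numbers quoted in the abstract (35 cores, 20x and 17x speedup). The function name is illustrative and is not part of PAR-3D-BLAST.

```python
def parallel_efficiency(speedup, cores):
    """Standard definition: fraction of ideal (linear) speedup achieved."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: 35 cores, 20x (build) and 17x (query).
build_efficiency = parallel_efficiency(20, 35)
query_efficiency = parallel_efficiency(17, 35)

print(f"database construction: {build_efficiency:.2f}")  # 0.57
print(f"query search:          {query_efficiency:.2f}")  # 0.49
```

An efficiency of 1.0 would correspond to ideal speedup on all 35 cores.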
ISBN: 9781424436927 (print)
A parallel algorithm for deriving binary tree traversal sequences based on coding is proposed in this paper. The method is easy to understand and master. It simplifies the traversal process, makes deriving a binary tree traversal sequence quick and intuitive, suits classroom demonstrations of traversal, and improves computing efficiency and accuracy by avoiding the complex description that arises from recursive calls taken directly from the definition. The algorithm embodies a deductive idea that moves from abstract to concrete and from special to general. The traversal process of the parallel algorithm is described in detail and verified with an application instance.
ISBN: 9781424474547 (print)
Parallel computing is an important method in high-performance computing. This paper briefly introduces a new SIMD architecture named ESCA (Engineering and Science Computing Accelerator). As a coprocessor, it aims to accelerate the most critical scientific workloads by virtue of its architecture and a flexible parallel algorithm. Because dense matrix multiplication is a widely used operation that parallel computing can accelerate, we map its algorithm onto ESCA and estimate the performance. The results suggest that ESCA has some advantages and potential.
As a synchronous parallel framework, the parallel variable transformation (PVT) algorithm is effective for solving unconstrained optimization problems. In this paper, based on the idea that a constrained optimization problem can be made equivalent to a differentiable unconstrained optimization problem by introducing the Fischer function, we propose an asynchronous PVT algorithm for solving large-scale linearly constrained convex minimization problems. The new algorithm can terminate as soon as some processor satisfies the termination condition, without waiting for the other processors, which enhances practical efficiency for large-scale optimization problems. Global convergence of the new algorithm is established under suitable assumptions. In particular, the linear rate of convergence does not depend on the number of processors. Crown Copyright (c) 2009 Published by Elsevier Inc. All rights reserved.
Suppose that 0 < eta < 1 is given. We call a graph G on n vertices an eta-Chvatal graph if its degree sequence d(1) <= d(2) <= ... <= d(n) satisfies: for k < n/2, d(k) <= min{k + eta n, n/2} implies d(n - k - eta n) >= n - k. (Thus for eta = 0 we get the well-known Chvatal graphs.) An NC(4)-algorithm is presented which accepts an eta-Chvatal graph G as input and produces a Hamiltonian cycle in G as output. This is a significant improvement on the previous best NC-algorithm for the problem, which finds a Hamiltonian cycle only in Dirac graphs (delta(G) >= n/2, where delta(G) is the minimum degree in G). (C) 2008 Elsevier B.V. All rights reserved.
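The degree-sequence condition in this abstract can be checked directly. The sketch below is only an illustration of the definition, not the NC algorithm from the paper. Two points are assumptions made here because the abstract does not specify them: the index shift eta n is rounded down to an integer, and an index n - k - eta n that falls below 1 is treated as a violation. The function name `is_eta_chvatal` is hypothetical.

```python
import math


def is_eta_chvatal(degrees, eta=0.0):
    """Check the eta-Chvatal degree condition quoted in the abstract.

    ASSUMPTION: eta * n is floored when used as an index shift.
    ASSUMPTION: an out-of-range index (below 1) counts as a violation.
    """
    d = sorted(degrees)          # d[0] plays the role of d(1)
    n = len(d)
    shift = math.floor(eta * n)
    for k in range(1, n):
        if k >= n / 2:           # the condition only concerns k < n/2
            break
        if d[k - 1] <= min(k + eta * n, n / 2):
            idx = n - k - shift  # 1-based index into the degree sequence
            if idx < 1 or d[idx - 1] < n - k:
                return False
    return True


# Path on 4 vertices: fails the classical (eta = 0) Chvatal condition.
print(is_eta_chvatal([1, 2, 2, 1]))        # False
# Complete graph K4: the premise never holds, so the condition is satisfied.
print(is_eta_chvatal([3, 3, 3, 3]))        # True
# Cycle C4: satisfies eta = 0 but not eta = 0.25 under the assumptions above.
print(is_eta_chvatal([2, 2, 2, 2]))        # True
print(is_eta_chvatal([2, 2, 2, 2], 0.25))  # False
```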
This paper deals with the convergence and stability of a new parallel algorithm and the error estimates for a particular case of the new parallel algorithm, which is used to solve the incompressible nonstationary Navier-Stokes equations. The theoretical results show that the scheme is (at least) conditionally stable and convergent. (c) 2007 Elsevier Ltd. All rights reserved.
In this paper we extend some well-known notions of combinatorics on multi-sets, such as iterative permutation, multi-subset, and iterative combination, and then construct new efficient algorithms for generating all iterative permutations, multi-subsets, and iterative combinations of a multi-set. We parallelize the algorithms using a method based on output decomposition. Furthermore, we use these algorithms to solve an optimization problem of work arrangement and an extended knapsack problem.
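As a rough illustration of output decomposition, the sketch below splits the distinct permutations of a multi-set into independent blocks by first element, so each block could be generated by a separate worker. This is a standard textbook generator swapped in for illustration. It is not the paper's algorithm, and the paper's "iterative permutation" may be defined differently. The function names are hypothetical.

```python
from collections import Counter


def permutations_with_prefix(counts, prefix, length):
    """Yield each distinct arrangement that extends prefix using counts."""
    if len(prefix) == length:
        yield tuple(prefix)
        return
    for item in sorted(counts):
        if counts[item] > 0:
            counts[item] -= 1
            prefix.append(item)
            yield from permutations_with_prefix(counts, prefix, length)
            prefix.pop()
            counts[item] += 1


def output_blocks(multiset):
    """Partition the output by first element; blocks share no state."""
    counts = Counter(multiset)
    n = len(multiset)
    blocks = {}
    for first in sorted(counts):
        remaining = counts.copy()  # each block works on its own copy
        remaining[first] -= 1
        blocks[first] = list(permutations_with_prefix(remaining, [first], n))
    return blocks


blocks = output_blocks("aab")
print(blocks["a"])  # [('a', 'a', 'b'), ('a', 'b', 'a')]
print(blocks["b"])  # [('b', 'a', 'a')]
```

Because each block depends only on its own copy of the counts, the loop in `output_blocks` could be handed to a process pool without any coordination between workers.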
Numerical simulations that solve the transport equations are computationally expensive and require parallel calculation. On unstructured grids, communication delays, sorting algorithms, and insertion algorithms limit the performance of current algorithms and reduce their parallel scalability. This paper presents an effective way to implement scalable parallel numerical simulation on clusters by combining energy-group decomposition with spatial domain decomposition. Based on list scheduling, we first design a multi-group parallel method to solve the load imbalance caused by the energy-group parallel decomposition. After describing the priority algorithm that orders all meshes, we present a parallel algorithm based on geometric domain decomposition. A parallel code combining these two algorithms was designed. Using the code, we solved two-dimensional particle transport equations on a cluster; the performance results show that the algorithms scale well.
A planar fuel cell stack is a layered structure consisting of repeated modules: membrane electrode assemblies (MEAs) separated by bipolar plates (BPs). Generally, the distributions of voltage and temperature over the BP volume are described by three-dimensional Laplace equations. However, the thickness of a BP is much smaller than its in-plane size. This enables us to reduce a three-dimensional Laplace equation to a two-dimensional Poisson equation and to develop an efficient parallel algorithm for stack simulation. In the simplest variant, each individual module "MEA + BP" is solved on a separate processor. Typically, the number of cells in a stack is 10 to 100; this algorithm is thus most suitable for small- and medium-scale parallel machines. A much faster method is to cut every module into a number of "stripes" and to solve each stripe on a separate processor. Numerical tests with this method show that with eight stripes per module the solution of the electric problem is obtained roughly ten times faster than expected. Evidently, the striping algorithm provides much faster convergence of the iterative Poisson solver. The effect is presumably due to fast damping of high-frequency modes of potential in the iteration process. This algorithm may open up possibilities for fast simulation of real 100-cell stacks using massively parallel machines.
Authors:
Zhu, Xiangyuan; Li, Kenli; Salah, Ahmad
Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
Hunan Univ, Natl Super Comp Ctr Changsha, Changsha 410082, Hunan, Peoples R China
Zhaoqing Univ, Educ Technol & Comp Ctr, Zhaoqing 516061, Guangdong, Peoples R China
Zagazig Univ, Dept Comp Sci, Zagazig 44519, Sharkia, Egypt
In this paper, we address the large-scale biological sequence alignment problem, which is in increasing demand in computational biology. We employ the data parallelism paradigm, which is suitable for large-scale processing on multi-core computers, to achieve a high degree of parallelism. Using this paradigm, we propose a general strategy that can be used to speed up any multiple sequence alignment method. We applied five different clustering algorithms in our strategy and ran rigorous tests on an 8-core computer using four traditional benchmarks and artificially generated sequences. The results show that our multi-core-based implementations can achieve up to 151-fold improvements in execution time while losing 2.19% accuracy on average. The source code of the proposed strategy, together with the test sets used in our analysis, is available on request. (c) 2013 Elsevier Ltd. All rights reserved.