Detecting binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene r...
详细信息
Detecting binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene regulations. Although a number of motif-finding algorithms have been presented, most are either time consuming or have sub-optimal accuracy for processing large-scale datasets. In this article, we present a fully parallelized algorithm for detecting combinatorial motifs from ChIP-seq datasets by using Fisher combined method and OpenMP parallel design. Large scale validations on both synthetic data and 350 ChIP-seq datasets from the ENCODE database showed that FisherMP has not only super speeds on large datasets, but also has high accuracy when compared with multiple popular methods. By using FisherMP, we successfully detected combinatorial motifs of CTCF, YY1, MAZ, STAT3 and USF2 in chromosome X, suggesting that they are functional co-players in gene regulation and chromosomal organization. Integrative and statistical analysis of these TF-binding peaks clearly demonstrate that they are not only highly coordinated with each other, but that they are also correlated with histone modifications. FisherMP can be applied for integrative analysis of binding motifs and for predicting cis-regulatory modules from a large number of ChIP-seq datasets.
Three-dimensional magnetotelluric modeling algorithm of high accuracy and high efficiency is required for data interpretation and inversion. In this paper, edge-based finite element method with unstructured mesh is us...
详细信息
Three-dimensional magnetotelluric modeling algorithm of high accuracy and high efficiency is required for data interpretation and inversion. In this paper, edge-based finite element method with unstructured mesh is used to solve 3D magnetotelluric problem. Two boundary conditions-Dirichlet boundary condition and Neumann boundary condition-are set for cross-validation and comparison. We propose an efficient parallel algorithm to speed up computation and improve efficiency. The algorithm is based on distributed matrix storage and has three levels of parallelism. The first two are process level parallelization for frequencies and matrix solving, and the last is thread-level parallelization for loop unrolling. The algorithm is validated by several model studies. Scalability tests have been performed on two distributed-memory HPC platforms, one consists of IntelE5-2660 microprocessors and the other consists of Phytium FT2000 Plus microprocessors. On Intel platform, computation time of our algorithm solving Dublin Test Model-1 with 3,756,373 edges at 21 frequencies is 365 s on 2520 cores. The speedup and efficiency are 1609 and 60% compared to 100 cores. On Phytium platform, scalability test shows that the speedup from 256 cores to 86,016 cores has been increased to 11,255.
The numerical solution of differential equations of fractional order is known to be a computationally very expensive problem due to the non-local nature of the fractional differential operators. We demonstrate that pa...
详细信息
The numerical solution of differential equations of fractional order is known to be a computationally very expensive problem due to the non-local nature of the fractional differential operators. We demonstrate that parallelization may be used to overcome these difficulties. To this end we propose to implement the fractional version of the second-order Adams-Bashforth-Moulton method on a parallel computer. According to many recent publications, this algorithm has been successfully applied to a large number of fractional differential equations arising from a variety of application areas. The precise nature of the parallelization concept is discussed in detail and some examples are given to show the viability of our approach.
For finding a common fixed point of a finite family of G-nonexpansive mappings, we implement a new parallel algorithm based on the Ishikawa iteration process with the inertial technique. We obtain the weak convergence...
详细信息
For finding a common fixed point of a finite family of G-nonexpansive mappings, we implement a new parallel algorithm based on the Ishikawa iteration process with the inertial technique. We obtain the weak convergence theorem of this algorithm in Hilbert spaces endowed with a directed graph by assuming certain control conditions. Furthermore, numerical experiments on the diffusion problem demonstrate that the proposed approach outperforms well-known approaches.
A parallel algorithm for transonic flow calculations is presented, and its performance on an array processor is investigated. The emphasis of this paper is on the mapping strategy, namely, sliced mapping, that is, a f...
详细信息
A parallel algorithm for transonic flow calculations is presented, and its performance on an array processor is investigated. The emphasis of this paper is on the mapping strategy, namely, sliced mapping, that is, a fine-grain parallelism of a technique on an SIMD computer. A brief description of the transonic small perturbation (TSP) equation is given followed by a model problem to be solved on the array processing parallel environment. Particular attention is paid to the various iteration methods within the mixed subsonic and supersonic region using a linearization of the dicretized TSP operator.
Given a text of length n and a pattern of length m, a three-phase randomized parallel algorithm for string matching is presented. The parallel algorithm allows the string-matching problem to be solved in O(log n) time...
详细信息
Given a text of length n and a pattern of length m, a three-phase randomized parallel algorithm for string matching is presented. The parallel algorithm allows the string-matching problem to be solved in O(log n) time on the hypercube network with O(n/log n) processors. It is cost-optimal in the sense that the number of processors times execution time is minimized and the speed-up is maximal. The probability that at least a false match occurs is less than (1.883)(k)n1-k/2, where k is the number of repeated times that the derived parallel solver runs on the same input strings.
The article is devoted to the construction of a parallel algorithm for calculating plasma dynamics by a particle-in-cell method using a semi-implicit scheme that conserves energy and charge. This is a two-stage predic...
详细信息
The article is devoted to the construction of a parallel algorithm for calculating plasma dynamics by a particle-in-cell method using a semi-implicit scheme that conserves energy and charge. This is a two-stage predictor-corrector scheme. At the prediction stage a semi-implicit Lapenta-type method is used in which an energy-conserving linear current does not satisfy the local Gauss law. At the correction stage the currents, electromagnetic fields, and particle velocities are corrected so that difference laws of energy and charge conservation are satisfied exactly. This approach turns out to be efficient in modeling of multi-scale phenomena with a sufficiently large time step. However, the method is computer time-consuming, since it requires not only solving two systems of linear equations per step, but also reconstructing the entire matrix of the system. The authors have developed a matrix-operator software implementation algorithm for this scheme, which allows efficient paralleling of the calculations and using the various available libraries for work with matrices and solvers for systems of linear equations. To construct the matrix, a row-by-row storage algorithm is used with search for the elements via a hash table, which reduces the memory capacity required, the number of thread synchronizations, and can significantly speed up the calculations. This algorithm has been successfully applied in a computer code, Beren3D.
In this paper an efficient parallel algorithm to solve a three-dimensional problem of subsidence above exploited gas reservoirs is presented. The parallel program is developed on a cluster of workstations. The paralle...
详细信息
In this paper an efficient parallel algorithm to solve a three-dimensional problem of subsidence above exploited gas reservoirs is presented. The parallel program is developed on a cluster of workstations. The parallel virtual machine (PVM) system is used to handle communications among networked workstations. The method has advantages such as numbering of the finite element mesh in an arbitrary manner, simple programming organization, smaller core requirements and computation times. An implementation of this parallel method on workstations is discussed, the speed-up and efficiency of this method being demonstrated by a numerical example. Copyright (C) 1999 John Wiley & Sons, Ltd.
In this paper, we propose an algorithm for solving the maximum common subgraph problem. The sequential and parallel versions of the algorithm and their software implementation are described, and their effectiveness is...
详细信息
In this paper, we propose an algorithm for solving the maximum common subgraph problem. The sequential and parallel versions of the algorithm and their software implementation are described, and their effectiveness is experimentally studied. This problem is one of the most famous NP-complete problems. Its solution may be required when solving many practical problems related to the study of complex structures. We solve it in a formulation in which we need to find all possible isomorphisms of the found common subgraph. Due to the extremely high complexity of the problem, the desire to speed up its solution by parallelizing the algorithm is quite natural. To organize parallel computing, the RPM_ParLib library is used, which makes it possible to create parallel applications running on a local computer network under the control of the .NET Framework runtime environment. The library supports a recursive-parallel programming style and ensures efficient distribution of work and dynamic load balancing of computing modules during program execution. It can be used for applications written in any programming language supported by the .NET Framework. The purpose of the numerical experiment is to study the acceleration achieved through the recursive-parallel organization of calculations. For the experiment, a special application in C# that is designed to generate various sets of initial data with specified parameters is developed. Here, we describe the characteristics of the generated initial graph pairs and the results obtained during the experiment.
We present a parallel algorithm for the solution of tridiagonal linear systems based on the method of partitioning and backward elimination. A given N x N system is partitioned into r blocks each of size n x n (N = rn...
详细信息
We present a parallel algorithm for the solution of tridiagonal linear systems based on the method of partitioning and backward elimination. A given N x N system is partitioned into r blocks each of size n x n (N = rn, n even). Within each block the system is rewritten by collecting pairs of equations from the top and bottom (written backwards);this makes it suitable to solve the subsystem by a UL-factorization of its coefficient matrix. Once a core system of size 2r x 2r has been solved, solutions of the r subsystems can be computed in parallel. Interestingly, the serial arithmetical operations count for the present algorithm is O(l6 1/2N);this makes it faster than the quadrant interlocking factorization method of Chawla and Passi [1], the partitioning method of Wang [7] and cyclic reduction (see Hockney and Jesshope [6, p. 479]), each of which has a serial count of O(17N).
暂无评论