ISBN:
(print) 9783319654829;9783319654812
We describe our approach to augmenting the BEAGLE library for high-performance statistical phylogenetic inference to support concurrent computation of independent partial likelihoods arrays. Our solution involves identifying independent likelihood estimates in analyses of partitioned datasets and in proposed tree topologies, and configuring concurrent computation of these likelihoods via the CUDA and OpenCL frameworks. We evaluate the effect of each increase in concurrency on throughput performance for our partial likelihoods kernel for a four-state nucleotide substitution model on a variety of parallel computing hardware, such as NVIDIA and AMD GPUs and Intel multicore CPUs, observing up to 16-fold speedups over our previous implementation. Finally, we evaluate the effect of these gains on a domain application program, MrBayes. For a partitioned nucleotide-model analysis we observe an average speedup for the overall run time of 2.1-fold over our previous parallel implementation, and 10-fold over the native MrBayes with SSE.
ISBN:
(print) 9783319654829;9783319654812
Multicore clusters are widely used to solve combinatorial optimization problems, which require high computing power and a large amount of memory. In this context, Hash Distributed A* (HDA*) parallelizes A*, a combinatorial optimization algorithm, using the MPI library. HDA* scales well on multicore clusters and on multicore machines. Additionally, there exist several versions of HDA* that were adapted for multicore machines using the Pthreads library. In this paper, we present Hybrid HDA* (HHDA*), a hybrid parallel search algorithm based on HDA* that combines message passing (MPI) with shared-memory programming (Pthreads) to better exploit the computing power and memory of multicore clusters. We evaluate the performance and memory consumption of HHDA* on a multicore cluster, using the 15-puzzle as a case study. The results reveal that HHDA* achieves a slightly higher average performance and uses considerably less memory than HDA*. These improvements allowed HHDA* to solve one of the hardest 15-puzzle instances.
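The hash-based work distribution that gives HDA* its name can be illustrated with a minimal sketch: each search state is assigned to an owner process by hashing it, so any process can compute locally where a newly generated state must be sent. The helper name `owner_of`, the tuple encoding of the 15-puzzle board, and the use of CRC32 as the hash function are assumptions made for illustration, not details taken from the paper.

```python
import zlib

def owner_of(state, num_procs):
    """Return the rank of the process that owns `state` (hypothetical helper).

    `state` is a tuple of 16 tile numbers (0 = blank). CRC32 is used here
    only as a simple deterministic hash; it is an assumption, not the
    paper's choice.
    """
    return zlib.crc32(bytes(state)) % num_procs

goal = tuple(range(16))
# Swap the blank with its right-hand neighbour to get a successor state.
succ = (1, 0) + goal[2:]
ranks = [owner_of(s, 4) for s in (goal, succ)]
```

Because ownership depends only on the state itself, duplicate states generated by different processes always arrive at the same owner, which can then detect the duplicate in its local closed list.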
ISBN:
(print) 9783319654829;9783319654812
DNA methylation (mC) and hydroxymethylation (hmC) can have a significant effect on normal human development, health and disease status. Hydroxymethylation studies require specific treatment of DNA, as well as software tools for their analysis. In this paper, we propose a parallel software tool for analyzing the DNA hydroxymethylation data obtained by TAB-seq. The software is based on the use of binary trees for searching the different occurrences of methylation and hydroxymethylation in DNA samples. The binary trees make it possible to efficiently store and access the methylation information of each methylated/hydroxymethylated cytosine in the samples. Evaluation results show that the performance of the application is limited only by the computer input/output bandwidth, even for very long samples.
ISBN:
(print) 9783319654829;9783319654812
Many studies have shown that there is a direct relationship between Single Nucleotide Polymorphisms (SNPs) and the appearance of complex diseases, such as Alzheimer's or Parkinson's. However, recent advances in genome-wide association studies indicate that the relationship between SNPs and these diseases goes beyond a simple one-to-one relationship; that is, the interaction of multiple SNPs (epistasis) influences the appearance of these diseases. To this end, this work proposes the application of the NSGA-II multi-objective algorithm to the detection of multi-locus epistasis in a database with 31,341 SNPs. Moreover, a parallel study has been performed to reduce the execution time of this problem. Our implementation not only achieves reasonably good parallel performance and scalability, but also yields results whose biological significance surpasses that of other approaches published in the literature.
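NSGA-II ranks candidate solutions by Pareto dominance rather than by a single score. A minimal sketch of that comparison follows, assuming every objective is minimised; the objective values are made-up numbers, and the paper's actual objective functions for scoring SNP combinations are not given in the abstract.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimisation)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(points):
    """Return the points not dominated by any other point (the first front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical two-objective scores for four candidate SNP combinations.
candidates = [(0.2, 0.9), (0.4, 0.4), (0.5, 0.5), (0.9, 0.1)]
front = non_dominated(candidates)
```

Here `(0.5, 0.5)` is dropped because `(0.4, 0.4)` is better on both objectives, while the other three points represent different trade-offs and all remain in the first front.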
ISBN:
(print) 9783319654829;9783319654812
This article presents massively parallel execution of the BLAST algorithm on supercomputers and HPC clusters using thousands of processors. Our work is based on optimally splitting up the set of queries, which are run with the unmodified NCBI-BLAST package for sequence alignment. The work distribution and search management have been implemented in Java using the PCJ (Parallel Computing in Java) library. The PCJ-BLAST package is responsible for reading the sequences for comparison, splitting them up, and starting multiple NCBI-BLAST executables. We also investigated the problem of parallel I/O, and thanks to the PCJ library we deliver high-throughput execution of BLAST. The presented results show that using Java and the PCJ library we achieved very good performance and efficiency. As a result, we have significantly reduced the time required for sequence analysis. We have also shown that the PCJ library can be used as an efficient tool for the fast development of scalable applications.
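The query-splitting step can be sketched as follows. This is an illustration in Python rather than the Java/PCJ code the abstract describes, and it uses plain round-robin assignment instead of the optimal splitting the authors mention; the function name `split_fasta` and the sample sequences are hypothetical.

```python
def split_fasta(text, num_workers):
    """Split FASTA text into `num_workers` round-robin chunks of whole records.

    Simplification: assumes '>' appears only at the start of header lines.
    """
    records = [">" + r for r in text.split(">") if r.strip()]
    chunks = [[] for _ in range(num_workers)]
    for i, rec in enumerate(records):
        chunks[i % num_workers].append(rec)
    return ["".join(c) for c in chunks]

queries = ">q1\nACGT\n>q2\nGGCC\n>q3\nTTAA\n"
parts = split_fasta(queries, 2)
```

Each chunk is itself valid FASTA, so it could be handed unchanged to a separate instance of an unmodified alignment executable, which is the property the approach relies on.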
ISBN:
(print) 9783319654829;9783319654812
The increasing gap between plentiful computing elements and limited memory bandwidth makes it increasingly difficult, and sometimes even infeasible, for the HPC community to port more applications onto many-core processor architectures. The Sunway many-core processor SW26010, used to build the Sunway TaihuLight system, contains a total of 260 heterogeneous cores. These cores are divided into 4 core groups (CGs). Each CG includes a Management Processing Element (MPE) core and 64 Computing Processing Element (CPE) cores. In this paper, we refactor an important molecular dynamics (MD) application, GROMACS, on the Sunway TaihuLight system. By rewriting the compute-intensive kernel of GROMACS, we exploit suitable parallelism for the CPE cluster and implement pipelined computation between the MPE and the CPE cluster. Optimization strategies including the efficient use of scratchpad memory, a software-emulated cache, and a hybrid parallel algorithm are adopted to address the challenging memory bandwidth limitation. When comparing the refactored version using the MPE and 64 CPEs with the original ported version using only the MPE, we achieve a 16x speedup for the compute-intensive kernel. For simulating a molecule with 3 million atoms, we have so far managed to scale to 798,720 cores. Moreover, we analyze the adaptability of our mapping and optimization strategies for overcoming the memory bandwidth limitation when refactoring a real-world application on the Sunway heterogeneous many-core processor system.
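The core counts quoted above are mutually consistent, as a short calculation shows. The per-processor figures come from the abstract; the number of processors implied by 798,720 cores is an inference from those figures, not something the abstract states.

```python
CORE_GROUPS = 4    # CGs per SW26010 (from the abstract)
MPES_PER_CG = 1    # one MPE per CG (from the abstract)
CPES_PER_CG = 64   # 64 CPEs per CG (from the abstract)

cores_per_processor = CORE_GROUPS * (MPES_PER_CG + CPES_PER_CG)

# Inferred, not stated in the abstract: how many whole processors
# the reported 798,720 cores correspond to.
processors, remainder = divmod(798_720, cores_per_processor)
```

The product gives the 260 cores per processor stated in the abstract, and the reported scale divides evenly into 3,072 such processors.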
ISBN:
(print) 9783319654829;9783319654812
This paper aims to test the impact of inter-city investment on enterprise performance. Using a gravity model and a panel dataset of Chinese firms that invested in 43 countries and regions over the period 2003-2009, we find that institutional distance is favorable to China's outward direct investment (ODI). This implies that Chinese multinationals do not seem willing to enter countries whose institutions are similar to those of their home country; in this sense, Chinese enterprises' ODI can be interpreted as being driven by the motivation of institutional escape. Technology distance displays an inverted-U shape, which suggests that some technical distance is a premise for ODI and may reflect the simultaneous existence of both technology-utilization ODI and technology-seeking ODI from China. Geographical distance has no significant impact on China's ODI, which supports the "death of distance" proposition. These findings point to the importance of going beyond the firm boundary to consider the various distances between home and host countries when making investment decisions. They not only overcome defects of existing studies, but also offer new theoretical explanations for the phenomenon that Chinese enterprises are still capable of ODI even when ownership advantages are missing. According to the results of this paper, Chinese enterprises should choose to invest in countries with large institutional distance, small economic distance, and medium technical distance from the home country, and should not be overly concerned about geographical distance.
ISBN:
(digital) 9783319111940
ISBN:
(print) 9783319111940;9783319111933
In this paper, embeddings of a family of 3D meshes in locally twisted cubes are studied. Let LTQ_n(V, E) denote the n-dimensional locally twisted cube. We obtain two major results in this paper: (1) For any integer n >= 4, two node-disjoint 3D meshes of size 2 x 2 x 2^(n-3) can be embedded into LTQ_n with dilation 1 and expansion 2. (2) For any integer n >= 6, four node-disjoint 3D meshes of size 4 x 2 x 2^(n-5) can be embedded into LTQ_n with dilation 1 and expansion 4. Further, an embedding algorithm can be constructed based on our embedding method. The obtained results are optimal in the sense that the dilations of the embeddings are 1.
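The expansion figures follow from node counts alone. The sketch below assumes the standard definition of expansion as the ratio of host nodes to guest nodes, takes LTQ_n to have 2^n nodes, and reads the mesh sizes as 2 x 2 x 2^(n-3) and 4 x 2 x 2^(n-5); the function name `expansion` is a hypothetical helper.

```python
def expansion(n, mesh_dims):
    """Ratio of host nodes (2**n in LTQ_n) to nodes in one guest mesh."""
    guest_nodes = 1
    for d in mesh_dims:
        guest_nodes *= d
    return 2**n // guest_nodes

# Check the two stated expansions for a range of cube dimensions.
results = [
    (expansion(n, (2, 2, 2**(n - 3))), expansion(n, (4, 2, 2**(n - 5))))
    for n in range(6, 12)
]
```

Every entry of `results` is `(2, 4)`: each mesh in the first family has 2^(n-1) nodes and each in the second has 2^(n-2), so two (respectively four) node-disjoint copies together use every node of the cube.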
Computer simulations with the first-principle (kinetic) model are essential for studying multi-scale processes in space plasma. We develop numerical schemes for Vlasov simulations for practical use on currently-existi...
ISBN:
(digital) 9783319654829
ISBN:
(print) 9783319654812
This book constitutes the proceedings of the 17th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2017, held in Helsinki, Finland, in August 2017. The 25 full papers presented were carefully reviewed and selected from 117 submissions. They cover topics such as parallel and distributed architectures; software systems and programming models; distributed and network-based computing; big data and its applications; parallel and distributed algorithms; applications of parallel and distributed computing; service dependability and security in distributed and parallel systems; and performance modeling and evaluation. This volume also includes 41 papers of four workshops, namely: the 4th International Workshop on Data, Text, Web, and Social Network Mining (DTWSM 2017), the 5th International Workshop on Parallelism in Bioinformatics (PBio 2017), the First International Workshop on Distributed Autonomous Computing in Smart City (DACSC 2017), and the Second International Workshop on Ultrascale Computing for Early Researchers (UCER 2017).