Search Results - Inner Mongolia University Library

Using GPU to accelerate the pairwise structural RNA alignment with base pair probabilities

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, Vol. 32, No. 10

Authors: Sundfeld, Daniel; Teodoro, George; Havgaard, Jakob H.; Gorodkin, Jan; Melo, Alba C. M. A. Affiliations: Brasilia Fed Inst Informat & Commun Technol, SGAN 610 Modulos DEFG, Asa Norte, BR-70830450 Brasilia, DF, Brazil; Univ Brasilia, Dept Comp Sci, Brasilia, DF, Brazil; Univ Copenhagen, Ctr Noncoding RNA Technol & Hlth, Dept Vet & Anim Sci, Copenhagen, Denmark

Structural alignments of ribonucleic acid (RNA) sequences solved by the Sankoff algorithm are computationally expensive and often require constraints to be used in practice. Modern Graphics Processing Units (GPUs) contain more than 1000 cores, which compute in parallel to speed up applications. Here, we present a GPU-based solution to the RNA structural alignment problem that makes use of precalculated base pair probabilities on the individual sequences. We designed and developed an unconstrained version of the Sankoff algorithm, obtaining the optimal result and calculating the entire four-dimensional dynamic programming (4D DP) matrix. Our approach uses a two-level wavefront strategy to exploit parallelism. The 4D DP matrix is divided into one external matrix (EM) and several internal matrices (IMs). We applied wavefront strategies on the EM and IMs in a two-level hierarchical way. At the first level, the wavefront is applied to the EM, calculating the cells that belong to the same diagonal in parallel. At the second level, since each cell in the EM is itself an IM, the cells that belong to the same IM diagonal are calculated in parallel. The results obtained with real RNA sequences show that our GPU version outperforms a multicore CPU implementation of the unconstrained Sankoff algorithm. Compared with the CPU-based version running on 32 cores, our approach achieves a speedup of 7.81x on the NVidia Tesla P100. In this case, the execution time was reduced from 6 hours and 18 minutes (32 cores) to 48 minutes and 20 seconds (GPU).
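The wavefront idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' GPU code: it uses a plain 2D longest-common-subsequence table as a stand-in for a single matrix level, and the function names `antidiagonals` and `lcs_wavefront` are hypothetical. Cells on the same anti-diagonal depend only on earlier diagonals, so each group could be computed in parallel; here the groups are simply visited in that order.

```python
def antidiagonals(rows, cols):
    """Yield lists of (i, j) cells, one list per anti-diagonal i + j = d."""
    for d in range(rows + cols - 1):
        lo = max(0, d - cols + 1)
        hi = min(rows - 1, d)
        yield [(i, d - i) for i in range(lo, hi + 1)]


def lcs_wavefront(a, b):
    """LCS length filled diagonal by diagonal instead of row by row."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for cells in antidiagonals(n, m):
        # Every cell in `cells` reads only from earlier diagonals,
        # so this inner loop is the part that could run in parallel.
        for i0, j0 in cells:
            i, j = i0 + 1, j0 + 1
            if a[i0] == b[j0]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

In a two-level scheme such as the one the abstract describes, the same traversal would presumably be applied once over the outer matrix and again inside each outer cell; the sketch shows only one level.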

Keywords: base-pairing probabilities; GPUs; high-performance computing; RNA; Sankoff algorithm

A Genetic Algorithm for Character State Live Phylogeny

11th Brazilian Symposium on Bioinformatics (BSB)

Authors: Fernandes, Rafael L.; Guths, Rogerio; Telles, Guilherme P.; Almeida, Nalvo F.; Walter, Maria Emilia M. T. Affiliations: Univ Brasilia, Dept Ciencia Comp, Brasilia, DF, Brazil; Univ Fed Mato Grosso do Sul, Fac Comp, Campo Grande, MS, Brazil; Univ Estadual Campinas, Inst Comp, Campinas, Brazil

ISBN (digital): 9783030017224

ISBN (print): 9783030017224; 9783030017217

Character state live phylogeny generalizes character state phylogeny in that it relates taxonomic units based on their similarities over a set of characters but allows live ancestors. One approach to character state live phylogeny reconstruction is parsimony, where one tries to minimize the total number of character state changes along the edges of the tree. The problem of finding a tree that minimizes this number is known as the large live parsimony problem. When the tree topology is also given as input, the problem is known as the small live parsimony problem. We propose a genetic algorithm to solve the large live problem, which uses extended versions of the algorithms of Fitch and Sankoff to solve the small live problem, both devised in this work. In addition, we performed two experiments. In the first one, a multiple alignment of H1N1 and H3N2 viruses from different countries, taken as input, allowed us to obtain interesting live phylogenies representing alternative evolutionary hypotheses. The second experiment took as input a multiple alignment of the HIV virus env gene from one patient, read on different dates over 12 years. The generated live phylogenies were similar to the ones generated by PAUP, where dates close to each other were grouped into clusters, but suggested new evolutionary stories.
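For orientation, the small parsimony problem mentioned in this abstract can be sketched with the classic Fitch bottom-up pass. This is the standard algorithm for a single character on a rooted binary tree, not the extended live-ancestor version devised in the paper. The nested-tuple tree encoding and the function name `fitch` are assumptions made for this illustration.

```python
def fitch(tree):
    """Return (candidate_states, changes) for a rooted binary tree.

    A leaf is a one-character string giving its observed state;
    an internal node is a tuple (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    common = left_states & right_states
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # No shared state: take the union and count one state change.
    return left_states | right_states, left_cost + right_cost + 1
```

For example, `fitch((("A", "C"), ("A", "G")))` yields the root state set `{"A"}` with two state changes, one in each subtree.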

Keywords: Live phylogeny; Genetic algorithms; Parsimony; Sankoff algorithm; Fitch algorithm

Pareto optimization in algebraic dynamic programming

ALGORITHMS FOR MOLECULAR BIOLOGY, 2015, Vol. 10, No. 1, pp. 1-20

Authors: Saule, Cedric; Giegerich, Robert. Affiliations: Univ Bielefeld, Fac Technol, D-33615 Bielefeld, Germany; Univ Bielefeld, Ctr Biotechnol, D-33615 Bielefeld, Germany

Pareto optimization combines independent objectives by computing the Pareto front of its search space, defined as the set of all solutions for which no other candidate solution scores better under all objectives. This gives, in a precise sense, better information than an artificial amalgamation of different scores into a single objective, but is more costly to compute. Pareto optimization naturally occurs with genetic algorithms, albeit in a heuristic fashion. Non-heuristic Pareto optimization so far has been used only with a few applications in bioinformatics. We study exact Pareto optimization for two objectives in a dynamic programming framework. We define a binary Pareto product operator $*_{Par}$ on arbitrary scoring schemes. Independent of a particular algorithm, we prove that for two scoring schemes $A$ and $B$ used in dynamic programming, the scoring scheme $A *_{Par} B$ correctly performs Pareto optimization over the same search space. We study different implementations of the Pareto operator with respect to their asymptotic and empirical efficiency. Without artificial amalgamation of objectives, and with no heuristics involved, Pareto optimization is faster than computing the same number of answers separately for each objective. For RNA structure prediction under the minimum free energy versus the maximum expected accuracy model, we show that the empirical size of the Pareto front remains within reasonable bounds. Pareto optimization lends itself to the comparative investigation of the behavior of two alternative scoring schemes for the same purpose. For the above scoring schemes, we observe that the Pareto front can be seen as a composition of a few macrostates, each consisting of several microstates that differ in the same limited way. We also study the relationship between abstract shape analysis and the Pareto front, and find that they extract information of a different nature from the folding space and can be meaningfully combined.
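A Pareto front for two objectives can be computed from a flat list of score pairs, as in the minimal sketch below. It is not the paper's Pareto product operator inside dynamic programming. It assumes both objectives are maximized and uses the common dominance definition (at least as good in both objectives, strictly better in one). The function name `pareto_front` is hypothetical.

```python
def pareto_front(candidates):
    """Return the non-dominated (a, b) pairs, both objectives maximized.

    A pair is dominated if another pair is at least as good in both
    objectives and strictly better in at least one.
    """
    front = []
    best_b = float("-inf")
    # Sort by first objective descending, ties by second descending.
    for a, b in sorted(set(candidates), key=lambda p: (-p[0], -p[1])):
        # Everything seen so far has a >= current a, so the current pair
        # survives only if it strictly improves the second objective.
        if b > best_b:
            front.append((a, b))
            best_b = b
    return front
```

Sorting first means a single linear scan suffices, which is one reason a two-objective front is cheap to extract once the candidate scores are known.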

Keywords: Pareto optimization; Dynamic programming; Algebraic dynamic programming; RNA structure; Sankoff algorithm

RNA Structural Alignments, Part I: Sankoff-Based Approaches for Structural Alignments

Authors: Jakob Hull Havgaard; Jan Gorodkin

Simultaneous alignment and secondary structure prediction of RNA sequences is often referred to as “RNA structural alignment.” One class of methods for structural alignment is based on the principles proposed by Sankoff more than 25 years ago. The Sankoff algorithm simultaneously folds and aligns two or more sequences. The advantage of this algorithm over those that separate the folding and alignment steps is that it makes better predictions. The disadvantage is that it is slower and requires more computer memory to run. The amount of computational resources needed to run the Sankoff algorithm is so high that it took more than a decade before the first implementation of a Sankoff-style algorithm was published. However, with the faster computers available today and the improved heuristics used in the implementations, the Sankoff-based methods have become practical. This chapter describes the methods based on the Sankoff algorithm. All the practical implementations of the algorithm use heuristics to make them run in reasonable time and memory. These heuristics are also described in this chapter.

Keywords: Structure Prediction; Structural RNA alignment; Simultaneous folding and alignment of RNA sequences; Sankoff algorithm

ACCELERATED METHOD FOR COMPARING AMINO-ACID-SEQUENCES WITH ALLOWANCE FOR POSSIBLE GAPS - PLOTTING OPTIMUM CORRESPONDENCE PATHS

INTERNATIONAL JOURNAL OF PEPTIDE AND PROTEIN RESEARCH, 1981, Vol. 17, No. 3, pp. 284-291

Authors: POZDNYAKOV, VI; PANKOV, YA. Affiliation: Institute of Experimental Endocrinology and Hormone Chemistry of the Academy of Medical Sciences of the USSR, Moscow, USSR

An accelerated method is suggested which enables an effective comparison to be made of amino acid (nucleotide) sequences of great length with due regard to a large number of possible gaps. The method consists in limiting the area of complete similarity charts, calculated in accordance with the algorithm suggested by Sankoff, to a certain specially selected diagonal band. The application of the Monte-Carlo method permits a statistical evaluation of the certainty of the similarity of the compared sequences and the choice, within such a comparison band, of an optimum correspondence path which can readily be transformed into a sequence alignment. Using this approach, prolactin and somatotropin families of sequences were found to be homologous at a high level of significance and their optimum alignment with 2 gaps was suggested. In contrast, 2 regions of assumed partial gene duplication in the β-galactosidase sequence, suggested by Hood et al., were not statistically significantly similar.
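Restricting a dynamic programming table to a diagonal band, as this abstract describes, can be sketched as follows. The sketch uses unit-cost edit distance as a stand-in for the paper's similarity charts and scoring. The function name `banded_edit_distance` and the `band` half-width parameter are assumptions for illustration. The result equals the true edit distance whenever an optimal path stays inside the band; otherwise it is an upper bound, or infinity if the band cannot reach the final cell.

```python
def banded_edit_distance(a, b, band):
    """Edit distance computed only for cells with |i - j| <= band."""
    inf = float("inf")
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return inf  # the final cell lies outside the band
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo, hi = max(0, i - band), min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i
                continue
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or substitution
                prev[j] + 1,                            # gap in b
                cur[j - 1] + 1,                         # gap in a
            )
        prev = cur
    return prev[m]
```

Each row touches at most `2 * band + 1` cells, so the work grows with sequence length times band width rather than with the product of the two lengths.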

Keywords: comparison correspondence band paths gaps Sankoff algorithm sequence comparison total similarity charts
