版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Univ Leipzig Competence Ctr Scalable Data Serv & Solut Dresden Augustuspl 12 D-04107 Leipzig Germany Univ Leipzig Bioinformat Grp Dept Comp Sci Hartelstr 16 D-04107 Leipzig Germany Univ Leipzig Interdisciplinary Ctr Bioinformat Hartelstr 16-18 D-04107 Leipzig Germany Univ Leipzig Dept Comp Sci Nat Language Proc Grp Augustuspl 12 D-04107 Leipzig Germany Max Planck Inst Math Sci Inselstr 22 D-04103 Leipzig Germany Fraunhofer Inst Cell Therapy & Immunol Perlickstr 1 D-04103 Leipzig Germany Univ Vienna Dept Theoret Chem Wahringer Str 17 A-1090 Vienna Austria Ctr Noncoding RNA Technol & Hlth Gronegardsvej 3 DK-1870 Frederiksberg C Denmark Santa Fe Inst 1399 Hyde Pk Rd Santa Fe NM 87501 USA
出 版 物:《ALGORITHMS FOR MOLECULAR BIOLOGY》 (分子生物学)
年 卷 期:2018年第13卷第1期
页 面:16-16页
核心收录:
学科分类:0710[理学-生物学] 071010[理学-生物化学与分子生物学] 07[理学] 0836[工学-生物工程]
基 金:German Federal Ministry of Education and Research within the project Competence Center for Scalable Data Services and Solutions (ScaDS) Dresden/Leipzig [BMBF 01IS14014B] German Research Foundation (DFG) Universitat Leipzig
主 题:Superbubble de Bruijn graph Genome assembly Linear time algorithm
摘 要:Background: Superbubbles are distinctive subgraphs in direct graphs that play an important role in assembly algorithms for high- throughput sequencing ( HTS) data. Their practical importance derives from the fact they are connected to their host graph by a single entrance and a single exit vertex, thus allowing them to be handled independently. Efficient algorithms for the enumeration of superbubbles are therefore of important for the processing of HTS data. Superbubbles can be identified within the strongly connected components of the input digraph after transforming them into directed acyclic graphs. The algorithm by Sung et al. ( IEEE ACM Trans Comput Biol Bioinform 12: 770- 777, 2015) achieves this task in O( m log( m))- time. The extraction of superbubbles from the transformed components was later improved to by Brankovic et al. ( Theor Comput Sci 609: 374- 383, 2016) resulting in an overall O( m + n)- time algorithm. Results: A re- analysis of the mathematical structure of superbubbles showed that the construction of auxiliary DAGs from the strongly connected components in the work of Sung et al. missed some details that can lead to the reporting of false positive superbubbles. We propose an alternative, even simpler auxiliary graph that solved the problem and retains the linear running time for general digraph. Furthermore, we describe a simpler, space- efficient O( m + n) - time algorithm for detecting superbubbles in DAGs that uses only simple data structures. Implementation: We present a reference implementation of the algorithm that accepts many commonly used formats for the input graph and provides convenient access to the improved algorithm. https :// githu b. com/ Fabia nexe/ Super bubble