版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Univ Warsaw Inst Informat Banacha 2 PL-02097 Warsaw Poland PSL Res Univ DI ENS 45 Rue Ulm F-75005 Paris France
出 版 物:《INFORMATION AND COMPUTATION》 (信息与计算)
年 卷 期:2020年第271卷第0期
页 面:104513-000页
核心收录:
学科分类:12[管理学] 1201[管理学-管理科学与工程(可授管理学、工学学位)] 08[工学] 0701[理学-数学] 0812[工学-计算机科学与技术(可授工学、理学学位)]
基 金:"Algorithms for text processing with errors and uncertainties" within the HOMING programme of the Foundation for Polish Science [POIR.04.04.00-00-24BA/16-00] European Union under the European Regional Development Fund
主 题:Streaming algorithm k-Mismatch problem Weighted string Position weight matrix
摘 要:We present a new streaming algorithm for the k-MISMATCH problem, one of the most basic problems in pattern matching. Given a pattern and a text, the task is to find all substrings of the text that are at the Hamming distance at most k from the pattern. Our algorithm is enhanced with an important new feature called ERROR Correcting, and its complexities for k = 1 and for a general k are comparable to those of the solutions for the k-MISMATCH problem by Porat and Porat (FOCS 2009) and Clifford et al. (SODA 2016). In parallel to our research, a yet more efficient algorithm for the k-MISMATCH problem with the ERROR CORRECTING feature was developed by Clifford et al. (SODA 2019). Using the new feature and recent work on streaming MULTIPLE PATTERN MATCHING we develop a series of streaming algorithms for pattern matching on weighted strings, which are a commonly used representation of uncertain sequences in molecular biology. We also show that these algorithms are space-optimal up to polylog factors. A preliminary version of this work was published at DCC 2017 conference[24]. (C) 2019 Elsevier Inc. All rights reserved.