Search results - Inner Mongolia University Library

New tabulation and sparse dynamic programming based techniques for sequence similarity problems

DISCRETE APPLIED MATHEMATICS, 2016, Vol. 212, Issue 0, pp. 96-103

Author: Grabowski, Szymon (Lodz Univ Technol, Inst Appl Comp Sci, Al Politech 11, PL-90924 Lodz, Poland)

Calculating the length l of a longest common subsequence (LCS) of two strings, A of length n and B of length m, is a classic research topic, with many known worst-case oriented results. We present three algorithms for LCS length calculation with respectively O(mn lg lg n / lg^2 n), O(mn / lg^2 n + r) and O(n + r) time complexity, where the second one works for r = o(mn / (lg n lg lg n)), and the third one for r = Θ(mn / lg^k n), for a real constant 1 ≤ k ≤ 3, and l = O(n / (lg^(k-1) n (lg lg n)^2)), where r is the number of matches in the dynamic programming matrix. We also describe conditions for a given problem sufficient to apply our techniques, with several concrete examples presented, namely the edit distance, the longest common transposition-invariant subsequence (LCTS) and the merged longest common subsequence (MerLCS) problems. © 2015 Elsevier B.V. All rights reserved.

Keywords: Sequence similarity; Longest common subsequence; sparse dynamic programming; Tabulation
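
The abstract above measures running time in terms of r, the number of matching cells in the dynamic programming matrix. As a rough illustration of why r matters, here is a minimal sketch of the classic match-driven (Hunt-Szymanski style) approach, which visits only the r matches and runs in roughly O((n + r) log n) time. It is not one of the algorithms from the paper; the function name `lcs_length_sparse` is an assumption made for this example.

```python
from bisect import bisect_left
from collections import defaultdict


def lcs_length_sparse(a: str, b: str) -> int:
    """LCS length computed from matches only (classic sparse DP sketch).

    Illustrative only: this is the textbook match-driven technique,
    not the tabulation-based algorithms described in the abstract.
    """
    # For every symbol, list its positions in b in DECREASING order.
    occ = defaultdict(list)
    for j in range(len(b) - 1, -1, -1):
        occ[b[j]].append(j)

    # thresh[k] = smallest position in b at which a common subsequence
    # of length k + 1 can end, given the prefix of a processed so far.
    thresh = []
    for ch in a:
        # Decreasing j prevents chaining two matches from the same row.
        for j in occ.get(ch, ()):
            k = bisect_left(thresh, j)
            if k == len(thresh):
                thresh.append(j)
            else:
                thresh[k] = j
    return len(thresh)


print(lcs_length_sparse("ABCBDAB", "BDCABA"))
```

The inner loop runs once per match, so inputs with few matches (for example, large alphabets) are handled far faster than by filling the whole n-by-m matrix.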

New tabulation and sparse dynamic programming based techniques for sequence similarity problems

18th Prague Stringology Conference, PSC 2014

Author: Grabowski, Szymon (Lodz University of Technology, Institute of Applied Computer Science, Al. Politechniki 11, Lódź 90-924, Poland)

ISBN: (print) 9788001055472

Calculating the length of a longest common subsequence (LCS) of two strings, A of length n and B of length m, is a classic research topic, with many worst-case oriented results known. We present two algorithms for LCS length calculation with respectively O(mn log log n / log^2 n) and O(mn / log^2 n + r) time complexity, the latter working for r = o(mn / (log n log log n)), where r is the number of matches in the dynamic programming matrix. We also describe conditions for a given problem sufficient to apply our techniques, with several concrete examples presented, namely the edit distance, LCTS and MerLCS problems. © Czech Technical University in Prague, Czech Republic.

Keywords: Longest common subsequence; Sequence similarity; sparse dynamic programming; Tabulation

Chaining of Maximal Exact Matches in Graphs

30th International Symposium on String Processing and Information Retrieval (SPIRE) / 18th Workshop on Compression, Text, and Algorithms (WCTA)

Authors: Rizzo, Nicola; Caceres, Manuel; Makinen, Veli (Univ Helsinki, Dept Comp Sci, POB 68, Pietari Kalmin Katu 5, Helsinki 00014, Finland)

ISBN: (print) 9783031439797; 9783031439803

We show how to chain maximal exact matches (MEMs) between a query string Q and a labeled directed acyclic graph (DAG) G = (V, E) to solve the longest common subsequence (LCS) problem between Q and G. We obtain our result via a new symmetric formulation of chaining in DAGs that we solve in O(m + n + k^2 |V| + |E| + kN log N) time, where m = |Q|, n is the total length of node labels, k is the minimum number of paths covering the nodes of G, and N is the number of MEMs between Q and node labels, which we show encode full MEMs.

Keywords: sequence to graph alignment; longest common subsequence; sparse dynamic programming

The longest common subsequence problem for small alphabets in the word RAM model

INFORMATION PROCESSING LETTERS, 2025, Vol. 190

Author: Campos, Rodrigo Alexander Castro (Univ Autonoma Metropolitana, Dept Sistemas, San Pablo 420, Mexico City 02200, Mexico)

Given two strings of lengths m and n, with m ≤ n, the longest common subsequence problem consists of computing a common subsequence of maximum length by deleting symbols from both strings. While the O(mn) algorithm devised in 1974 is optimal in the most general setting, algorithms that depend on parameters other than m and n have been proposed since then. In the word RAM model, let w be the word size, s be the alphabet size, d be the number of dominant symbol matches between the strings, and p be the length of the longest common subsequence. Fast algorithms for this problem have complexities O(mn / log n), O(mn / w), O(ns + min(p(n - p), pm)), O(n log s + d log log min(d, mn/d)), O(ns + min(ds, pm)), and O(ns + s!2s + d log s). In this work, we present an O(n(s + log* n) + min(d log s, pm)) algorithm when s ∈ O(w), and also an O(n(s + log* n) + d) algorithm when s ≤ w, which uses bitwise instructions that recently became available in modern processors.

Keywords: Longest common subsequence; Small alphabet; Word RAM model; sparse dynamic programming
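
Among the bounds listed in the abstract above, O(mn / w) comes from packing one column of the dynamic programming matrix into machine words. The sketch below shows one well-known bit-vector formulation of that idea. It is not the new algorithm from the paper, and it uses a Python integer as a stand-in for an m-bit word, so it illustrates the recurrence rather than real word-RAM performance. The name `lcs_length_bitparallel` is an assumption made for this example.

```python
def lcs_length_bitparallel(a: str, b: str) -> int:
    """LCS length via a bit-vector recurrence (illustrative sketch).

    A Python int emulates an m-bit register; on real hardware the
    vector would be split into ceil(m / w) machine words.
    """
    m = len(a)
    mask = (1 << m) - 1  # keeps the vector at exactly m bits

    # match[c] has bit i set iff a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    # v starts as all ones; each zero bit will mark one unit of LCS length.
    v = mask
    for ch in b:
        u = v & match.get(ch, 0)
        v = ((v + u) | (v - u)) & mask

    # The LCS length equals the number of zero bits among the m bits of v.
    return m - bin(v).count("1")


print(lcs_length_bitparallel("ABCBDAB", "BDCABA"))
```

Each symbol of b costs a constant number of word operations per machine word, which is where the division by w in the bound comes from.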

Efficient algorithms for the longest common subsequence in k-length substrings

INFORMATION PROCESSING LETTERS, 2014, Vol. 114, Issue 11, pp. 634-638

Authors: Deorowicz, Sebastian; Grabowski, Szymon (Silesian Tech Univ, Inst Informat, PL-44100 Gliwice, Poland; Lodz Univ Technol, Inst Appl Comp Sci, PL-90924 Lodz, Poland)

Finding the longest common subsequence in k-length substrings (LCSk) is a recently proposed problem motivated by computational biology. This is a generalization of the well-known LCS problem in which matching symbols from two sequences A and B are replaced with matching non-overlapping substrings of length k from A and B. We propose several algorithms for LCSk, being non-trivial incarnations of the major concepts known from LCS research (dynamic programming, sparse dynamic programming, tabulation). Our algorithms make use of a linear-time and linear-space preprocessing finding the occurrences of all the substrings of length k from one sequence in the other sequence. (C) 2014 Elsevier B.V. All rights reserved.

Keywords: Algorithms; Combinatorial problems; LCSk; sparse dynamic programming
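
The LCSk definition in the abstract above (matching non-overlapping substrings of length k instead of single symbols) can be made concrete with the plain quadratic recurrence. The sketch below is that baseline only, assuming the usual formulation in which the answer counts matched k-length substring pairs; it is not one of the faster algorithms the paper proposes, and the name `lcsk_length` is an assumption made for this example.

```python
def lcsk_length(a: str, b: str, k: int) -> int:
    """Number of k-length substring pairs in an optimal LCSk (baseline DP).

    Illustrative O(nm) time and space sketch of the problem definition,
    not the sparse or tabulated algorithms from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    n, m = len(a), len(b)

    # run[i][j] = length of the longest common suffix of a[:i] and b[:j].
    run = [[0] * (m + 1) for _ in range(n + 1)]
    # dp[i][j] = LCSk value for the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                run[i][j] = run[i - 1][j - 1] + 1
            best = max(dp[i - 1][j], dp[i][j - 1])
            # A k-length match may end here; it must not overlap earlier ones,
            # so it extends the solution for the prefixes ending k symbols back.
            if run[i][j] >= k:
                best = max(best, dp[i - k][j - k] + 1)
            dp[i][j] = best
    return dp[n][m]


print(lcsk_length("ABCDXEFGH", "ABCDYEFGH", 2))
```

With k = 1 the recurrence reduces to the ordinary LCS length, which is a convenient sanity check.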

Parallel Chaining Algorithms

17th European MPI Users Group Meeting

Authors: Abouelhoda, Mohamed; Mohamed, Hisham (Nile Univ, Ctr Informat Sci, Giza, Egypt)

ISBN: (print) 9783642156458

Given a set of weighted hyper-rectangles in a k-dimensional space, the chaining problem is to identify a set of colinear and non-overlapping hyper-rectangles of total maximal weight. This problem is used in a number of applications in bioinformatics, string processing, and VLSI design. In this paper, we present parallel versions of the chaining algorithm for bioinformatics applications, running on multi-core and computer cluster architectures. Furthermore, we present experimental results of our implementations on both architectures.

Keywords: Chaining Algorithms; Bioinformatics; sparse dynamic programming
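
To make the chaining problem in the abstract above concrete, here is a minimal sequential sketch: a quadratic dynamic program over k-dimensional boxes. It assumes that one box may precede another only when its upper corner is strictly below the other's lower corner in every dimension, and that weights are non-negative. The paper's parallel algorithms are not reproduced here; `Box`, `precedes` and `max_chain_weight` are names invented for this example.

```python
from typing import NamedTuple, Sequence, Tuple


class Box(NamedTuple):
    lo: Tuple[int, ...]   # lower corner, one coordinate per dimension
    hi: Tuple[int, ...]   # upper corner, one coordinate per dimension
    weight: int


def precedes(a: Box, b: Box) -> bool:
    """Assumed chaining rule: a ends strictly before b starts in every dimension."""
    return all(a_hi < b_lo for a_hi, b_lo in zip(a.hi, b.lo))


def max_chain_weight(boxes: Sequence[Box]) -> int:
    """Maximum total weight of a chain of boxes (O(N^2) baseline DP)."""
    # If a precedes b then a.lo sorts before b.lo, so every possible
    # predecessor of a box appears earlier in this order.
    order = sorted(boxes, key=lambda box: box.lo)
    best = []  # best[i] = heaviest chain ending with order[i]
    for i, cur in enumerate(order):
        prev_best = max(
            (best[j] for j in range(i) if precedes(order[j], cur)),
            default=0,
        )
        best.append(cur.weight + prev_best)
    return max(best, default=0)


fragments = [
    Box((0, 0), (1, 1), 2),
    Box((2, 2), (3, 3), 3),
    Box((2, 0), (3, 1), 10),
    Box((4, 4), (5, 5), 1),
]
print(max_chain_weight(fragments))
```

Sparse dynamic programming replaces the inner scan over all earlier boxes with a range-maximum query, which is the step that practical chaining tools optimize.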

Fast algorithms for computing tree LCS

THEORETICAL COMPUTER SCIENCE, 2009, Vol. 410, Issue 43, pp. 4303-4314

Authors: Mozes, Shay; Tsur, Dekel; Weimann, Oren; Ziv-Ukelson, Michal (Ben Gurion Univ Negev, IL-84105 Beer Sheva, Israel; Brown Univ, Providence, RI 02912, USA; MIT, Cambridge, MA 02139, USA)

The LCS of two rooted, ordered, and labeled trees F and G is the largest forest that can be obtained from both trees by deleting nodes. We present algorithms for computing tree LCS which exploit the sparsity inherent to the tree LCS problem. Assuming G is smaller than F, our first algorithm runs in time O(r · height(F) · height(G) · lg lg |G|), where r is the number of pairs (v ∈ F, w ∈ G) such that v and w have the same label. Our second algorithm runs in time O(Lr lg r · lg lg |G|), where L is the size of the LCS of F and G. For this algorithm we present a novel three-dimensional alignment graph. Our third algorithm is intended for the constrained variant of the problem in which only nodes with zero or one children can be deleted. For this case we obtain an O(rh lg lg |G|) time algorithm, where h = height(F) + height(G). © 2009 Elsevier B.V. All rights reserved.

Keywords: Tree LCS; Tree edit distance; Ordered trees; Largest common subforest; sparse dynamic programming

Efficient algorithms for pattern matching with general gaps, character classes, and transposition invariance

INFORMATION RETRIEVAL, 2008, Vol. 11, Issue 4, pp. 335-357

Authors: Fredriksson, Kimmo; Grabowski, Szymon (Univ Kuopio, Dept Comp Sci, FIN-70211 Kuopio, Finland; Tech Univ Lodz, Dept Comp Engn, PL-90924 Lodz, Poland)

We develop efficient dynamic programming algorithms for pattern matching with general gaps and character classes. We consider patterns of the form p_0 g(a_0, b_0) p_1 g(a_1, b_1) ... p_{m-1}, where p_i ⊂ Σ, Σ is some finite alphabet, and g(a_i, b_i) denotes a gap of length a_i ... b_i between symbols p_i and p_{i+1}. The text symbol t_j matches p_i iff t_j ∈ p_i. Moreover, we require that if p_i matches t_j, then p_{i+1} should match one of the text symbols t_{j+a_i+1} ... t_{j+b_i+1}. Either or both of a_i and b_i can be negative. We also consider transposition invariant matching, i.e., the matching condition becomes t_j ∈ p_i + τ, for some constant τ determined by the algorithms. We give algorithms that have efficient average and worst case running times. The algorithms have important applications in music information retrieval and computational biology. We give experimental results showing that the algorithms work well in practice.

Keywords: string matching; sparse dynamic programming; bounded length gaps; character classes; transposition invariance
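
The matching rule in the abstract above (character classes p_i, with p_{i+1} required to match somewhere in t_{j+a_i+1} ... t_{j+b_i+1}) can be traced with a naive position-set propagation. The sketch below is only a direct reading of that definition, without transposition invariance and without any of the paper's efficiency techniques. The function name `gapped_match_ends` and its argument layout are assumptions made for this example.

```python
from typing import List, Sequence, Set, Tuple


def gapped_match_ends(
    text: str,
    classes: Sequence[Set[str]],
    gaps: Sequence[Tuple[int, int]],
) -> List[int]:
    """Text positions where the last pattern symbol can match.

    classes[i] is the character class p_i; gaps[i] = (a_i, b_i) is the
    allowed gap between p_i and p_{i+1}, so len(gaps) == len(classes) - 1
    is assumed. Gap bounds may be negative, as in the abstract.
    """
    n = len(text)
    # Positions j where p_0 matches t_j.
    active = {j for j in range(n) if text[j] in classes[0]}

    for i, (a, b) in enumerate(gaps):
        nxt = set()
        for j in active:
            # p_{i+1} may match any of t_{j+a+1} ... t_{j+b+1}.
            for q in range(j + a + 1, j + b + 2):
                if 0 <= q < n and text[q] in classes[i + 1]:
                    nxt.add(q)
        active = nxt
    return sorted(active)


print(gapped_match_ends("abxxcd", [{"a"}, {"b"}, {"c", "d"}], [(0, 0), (2, 3)]))
```

Only positions that are still reachable are carried forward, which is the same sparsity that the paper's dynamic programming algorithms exploit more carefully.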

Efficient algorithms for (δ, γ, α) and (δ, k_Δ, α)-matching

INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2008, Vol. 19, Issue 1, pp. 163-183

Authors: Fredriksson, Kimmo; Grabowski, Szymon (Univ Joensuu, Dept Comp Sci & Stat, FIN-80101 Joensuu, Finland; Tech Univ Lodz, Dept Comp Engn, PL-90924 Lodz, Poland)

We propose new algorithms for (δ, γ, α)-matching. In this string matching problem we are given a pattern P = p_0 p_1 ... p_{m-1} and a text T = t_0 t_1 ... t_{n-1} over some integer alphabet Σ = {0 ... σ - 1}. The pattern symbol p_i δ-matches the text symbol t_j iff |p_i - t_j| ≤ δ. The pattern P (δ, γ)-matches some text substring t_j ... t_{j+m-1} iff for all i it holds that |p_i - t_{j+i}| ≤ δ and Σ_i |p_i - t_{j+i}| ≤ γ. Finally, in (δ, γ, α)-matching we also permit at most α-symbol gaps between each matching text symbol. The only known previous algorithm runs in O(nm) time. We give several algorithms that improve the average case up to O(n) for small α, and the worst case to O(min{nm, |M|α}) or O(nm log(γ)/w), where M = {(i, j) : |p_i - t_j| ≤ δ} and w is the number of bits in a machine word. The proposed algorithms can be easily modified to solve several other related problems; we explicitly consider, e.g., character classes (instead of δ-matching), (Δ-limited) k-mismatches (instead of γ-matching) and more general gaps, including negative ones. These find important applications in computational biology. We conclude with experimental results showing that the algorithms are very efficient in practice.

Keywords: approximate string matching; music information retrieval; computational biology; bit-parallelism; sparse dynamic programming; bounded gaps
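
The (δ, γ)-matching condition defined in the abstract above is easy to check directly. The sketch below is the naive O(nm) scan for the gap-free case only (α gaps are not handled), intended to make the two thresholds concrete rather than to reflect the paper's algorithms. The function name `delta_gamma_matches` is an assumption made for this example.

```python
from typing import List, Sequence


def delta_gamma_matches(
    pattern: Sequence[int],
    text: Sequence[int],
    delta: int,
    gamma: int,
) -> List[int]:
    """Start positions j where pattern (delta, gamma)-matches text[j : j + m].

    Naive check of the definition without gaps: every symbol may differ
    by at most delta, and the differences may sum to at most gamma.
    """
    m, n = len(pattern), len(text)
    hits = []
    for j in range(n - m + 1):
        total = 0
        ok = True
        for i in range(m):
            diff = abs(pattern[i] - text[j + i])
            total += diff
            if diff > delta or total > gamma:
                ok = False
                break
        if ok:
            hits.append(j)
    return hits


# Pitch values as integers, as in music retrieval applications.
print(delta_gamma_matches([60, 62, 64], [59, 62, 65, 70, 61, 62, 63], 1, 2))
```

The set M from the abstract corresponds to the (i, j) pairs that pass the `diff <= delta` test; sparse approaches work on that set instead of scanning every alignment.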

Peak alignment using restricted edit distances

BIOMOLECULAR ENGINEERING, 2007, Vol. 24, Issue 3, pp. 337-342

Author: Makinen, Veli (Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland)

A peak is a pair of real values (x, y), where x is the time when a peak of height y is registered. In the peak alignment problem, we are given two sequences of peaks, and our task is to align the sequences allowing some basic edit operations on the peaks. We study an instance of the peak alignment problem that arises in the analysis of Mass Spectrometry data in Systems Biology. There the measurement technique guarantees that two peaks (x, y), (x', y') can only be considered the same if x is close enough to x', and y is close enough to y'. We review some methods to do alignment under such restrictions on matches. © 2007 Elsevier B.V. All rights reserved.

Keywords: mass spectrometry; gel electrophoresis; peak alignment; edit distance; sparse dynamic programming
