In the constructive programming community it is commonplace to see formal developments of textbook algorithms. In the algorithm design community, on the other hand, it may be well known that the textbook solution to a problem is not the most efficient possible. However, in presenting the more efficient solution, the algorithm designer will usually omit some of the implementation details, thus creating an algorithm gap between the abstract algorithm and its concrete implementation. This is in contrast to the formal development, which usually proceeds all the way to the complete concrete implementation of the less efficient solution. We claim that the algorithm designer is forced to omit some of the details by the relative expressive poverty of the Pascal-like languages typically used to present the solution. The greater expressiveness provided by a functional language would allow the whole story to be told in a reasonable amount of space. In this paper we use a functional language to present the development of a sophisticated algorithm all the way to the final code. We hope to bridge the algorithm gap between abstract and concrete implementations, and thereby facilitate communication between the constructive programming and algorithm design communities. (C) 1999 Elsevier Science B.V. All rights reserved.
We propose new algorithms for $(\delta, \gamma, \alpha)$-matching. In this string matching problem we are given a pattern $P = p_0 p_1 \ldots p_{m-1}$ and a text $T = t_0 t_1 \ldots t_{n-1}$ over some integer alphabet $\Sigma = \{0, \ldots, \sigma - 1\}$. The pattern symbol $p_i$ $\delta$-matches the text symbol $t_j$ iff $|p_i - t_j| \le \delta$. The pattern $P$ $(\delta, \gamma)$-matches some text substring $t_j \ldots t_{j+m-1}$ iff for all $i$ it holds that $|p_i - t_{j+i}| \le \delta$ and $\sum_i |p_i - t_{j+i}| \le \gamma$. Finally, in $(\delta, \gamma, \alpha)$-matching we also permit gaps of at most $\alpha$ symbols between consecutive matching text symbols. The only previously known algorithm runs in $O(nm)$ time. We give several algorithms that improve the average case up to $O(n)$ for small $\alpha$, and the worst case to $O(\min\{nm, |M|\alpha\})$ or $O(nm \log(\gamma)/w)$, where $M = \{(i, j) : |p_i - t_j| \le \delta\}$ and $w$ is the number of bits in a machine word. The proposed algorithms can be easily modified to solve several other related problems; we explicitly consider, e.g., character classes (instead of $\delta$-matching), ($\Delta$-limited) $k$-mismatches (instead of $\gamma$-matching) and more general gaps, including negative ones. These find important applications in computational biology. We conclude with experimental results showing that the algorithms are very efficient in practice.
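To make the definition concrete, the following is a minimal Python sketch of the gap-free $(\delta, \gamma)$-matching check, written as the straightforward $O(nm)$ scan. It is not one of the algorithms proposed in the abstract and it does not handle the $\alpha$-bounded gaps; the function name `delta_gamma_matches` and the sample data are illustrative assumptions.

```python
def delta_gamma_matches(pattern, text, delta, gamma):
    """Return all start positions j where pattern (delta, gamma)-matches
    text[j : j + m]. Naive O(nm) scan, no gaps (alpha is not modelled)."""
    m, n = len(pattern), len(text)
    hits = []
    for j in range(n - m + 1):
        total = 0          # running sum of |p_i - t_{j+i}|
        ok = True
        for i in range(m):
            diff = abs(pattern[i] - text[j + i])
            total += diff
            # Fail as soon as either the per-symbol (delta) or the
            # cumulative (gamma) bound is violated.
            if diff > delta or total > gamma:
                ok = False
                break
        if ok:
            hits.append(j)
    return hits


if __name__ == "__main__":
    # Hypothetical integer sequences, e.g. pitch values.
    print(delta_gamma_matches([3, 5, 4], [1, 3, 6, 4, 9, 2, 5, 5], delta=1, gamma=2))
```

Tightening `gamma` while keeping `delta` fixed removes windows whose individual symbols are all close but whose total deviation is too large.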
ISBN: (print) 9783031439797; 9783031439803
We show how to chain maximal exact matches (MEMs) between a query string $Q$ and a labeled directed acyclic graph (DAG) $G = (V, E)$ to solve the longest common subsequence (LCS) problem between $Q$ and $G$. We obtain our result via a new symmetric formulation of chaining in DAGs that we solve in $O(m + n + k^2|V| + |E| + kN \log N)$ time, where $m = |Q|$, $n$ is the total length of node labels, $k$ is the minimum number of paths covering the nodes of $G$ and $N$ is the number of MEMs between $Q$ and node labels, which we show encode full MEMs.
ISBN: (print) 9783642156458
Given a set of weighted hyper-rectangles in a k-dimensional space, the chaining problem is to identify a set of colinear and non-overlapping hyper-rectangles of total maximal weight. This problem is used in a number of applications in bioinformatics, string processing, and VLSI design. In this paper, we present parallel versions of the chaining algorithm for bioinformatics applications, running on multi-core and computer cluster architectures. Furthermore, we present experimental results of our implementations on both architectures.
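As an illustration of the chaining problem stated above, here is a minimal sequential sketch for the two-dimensional case using a quadratic dynamic program. It is not the parallel algorithm of the paper; the tuple layout `(x_lo, y_lo, x_hi, y_hi, weight)`, the strict-precedence rule, the assumption of positive weights, and the name `chain_fragments` are all assumptions made for this example.

```python
def chain_fragments(fragments):
    """Maximum-weight chain of 2D rectangles (x_lo, y_lo, x_hi, y_hi, weight).

    Rectangle a may precede b if a ends strictly before b starts in both
    dimensions (collinear and non-overlapping). Quadratic-time DP;
    weights are assumed positive. Returns (total_weight, chain_indices).
    """
    # Sorting by x_lo is a valid processing order: any predecessor of b
    # has x_lo <= x_hi < b.x_lo, so it appears earlier in the order.
    order = sorted(range(len(fragments)), key=lambda i: fragments[i][0])
    best, prev = {}, {}
    for pos, i in enumerate(order):
        x_lo, y_lo, _, _, w = fragments[i]
        best[i], prev[i] = w, None
        for j in order[:pos]:
            _, _, px_hi, py_hi, _ = fragments[j]
            if px_hi < x_lo and py_hi < y_lo and best[j] + w > best[i]:
                best[i], prev[i] = best[j] + w, j
    if not best:
        return 0, []
    # Trace back from the best chain end.
    cur = max(best, key=best.get)
    total, chain = best[cur], []
    while cur is not None:
        chain.append(cur)
        cur = prev[cur]
    chain.reverse()
    return total, chain


if __name__ == "__main__":
    frags = [(0, 0, 2, 2, 3), (3, 3, 5, 5, 4), (1, 4, 2, 6, 5),
             (6, 1, 8, 2, 2), (6, 6, 7, 7, 1)]
    print(chain_fragments(frags))
```

The quadratic inner loop is kept for clarity; practical chaining implementations typically replace it with a range-maximum data structure.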
Here we report a new method for large-scale sequence *** it, we can obtain an approximate but sufficiently accurate global or local alignment *** the method, sparse dynamic programming is used to refine the alignment space; thus, computational time is *** also use a hashing technique to search for short gap-free alignment fragments, which are the basic elements of sparse dynamic *** examples have been aligned by our program using this method.
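The hashing step mentioned above can be illustrated with a common k-mer index, sketched here in Python. This is a generic technique and not necessarily the scheme used by the authors; the function name `kmer_seeds` and the parameter `k` are assumptions made for this example.

```python
from collections import defaultdict


def kmer_seeds(a, b, k):
    """Find all exact, gap-free matches of length k between strings a and b.

    Returns a list of (i, j) pairs with a[i:i+k] == b[j:j+k]. Such seeds
    are the kind of fragments a sparse dynamic program could then chain.
    """
    # Hash every k-mer of a to the list of positions where it occurs.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    # Look up every k-mer of b in the index.
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    print(kmer_seeds("ACGTAC", "TACGT", 3))
```

Larger values of `k` give fewer but more specific seeds, which shrinks the search space left for the dynamic program.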
Given a pattern string $P = p_1 p_2 \ldots p_m$ and $K$ parallel text strings $T = \{T^k = t^k_1 \ldots t^k_n \mid 1 \le k \le K\}$ over an integer alphabet S, our task is to find the smallest integer $\kappa > 0$ such that $P$ can be split into $\kappa$ pieces $P = P^1 \ldots P^\kappa$, where each $P^i$ has an occurrence in some text track $T^{k_i}$ and these partial occurrences retain the order. We study some variations of this minimum splitting problem, such as splittings with limited gaps and transposition invariance, and show how to use sparse dynamic programming to solve the variations efficiently. In particular, we show that the minimum splitting problem can be interpreted as a shortest path problem on line segments. (C) 2004 Elsevier B.V. All rights reserved.