Search results - Inner Mongolia University Library

Transposition invariant string matching

JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2005, vol. 56, no. 2, pp. 124-153

Authors: Mäkinen, V; Navarro, G; Ukkonen, E (Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland; Univ Chile, Dept Comp Sci, Ctr Web Res, Santiago, Chile)

Given strings $A = a_1 a_2 \cdots a_m$ and $B = b_1 b_2 \cdots b_n$ over an alphabet $\Sigma \subset U$, where $U$ is some numerical universe closed under addition and subtraction, and a distance function $d(A, B)$ that gives the score of the best (partial) matching of $A$ and $B$, the transposition invariant distance is $\min_{t \in U} \{d(A + t, B)\}$, where $A + t = (a_1 + t)(a_2 + t) \cdots (a_m + t)$. We study the problem of computing the transposition invariant distance for various distance (and similarity) functions $d$, including Hamming distance, longest common subsequence (LCS), Levenshtein distance, and their versions where the exact matching condition is replaced by an approximate one. For all these problems we give algorithms whose time complexities are close to the known upper bounds without transposition invariance, and for some we achieve these upper bounds. In particular, we show how sparse dynamic programming can be used to solve transposition invariant problems, and its connection with multidimensional range-minimum search. As a byproduct, we give improved sparse dynamic programming algorithms to compute LCS and Levenshtein distance. (C) 2004 Elsevier Inc. All rights reserved.

Keywords: edit distance; music sequence comparison; transposition invariance; sparse dynamic programming; range-minimum searching
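
To make the definition in the abstract above concrete, the following Python sketch covers only the simplest case: Hamming distance between two equal-length integer sequences. In that case the best transposition t is the most frequent difference b_i - a_i, so one pass over the sequences suffices. This is an illustration under those assumptions, not the algorithm of the paper, which also handles LCS, Levenshtein distance and approximate matching. The function names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Plain Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def transposition_invariant_hamming(a, b):
    """Return (distance, t) minimizing hamming(a + t, b) over all shifts t.

    Assumes a and b are equal-length sequences of integers (for example
    MIDI pitches). Position i matches after shifting by t exactly when
    b[i] - a[i] == t, so the best t is the most common difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0, 0
    diffs = Counter(y - x for x, y in zip(a, b))
    best_t, matches = diffs.most_common(1)[0]
    return len(a) - matches, best_t


if __name__ == "__main__":
    melody = [60, 62, 64, 65]
    shifted = [62, 64, 66, 60]  # melody + 2, with the last note altered
    print(hamming(melody, shifted))
    print(transposition_invariant_hamming(melody, shifted))
```

The plain distance treats the two sequences as entirely different, while the transposition invariant version recognizes that three of the four notes agree after a shift of 2.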

Efficient computation of gapped substring kernels on large alphabets

JOURNAL OF MACHINE LEARNING RESEARCH, 2005, vol. 6, no. 9, pp. 1323-1344

Authors: Rousu, J; Shawe-Taylor, J (Univ London, Royal Holloway & Bedford New Coll, Dept Comp Sci, Egham TW20 0EX, Surrey, England; Univ Southampton, Sch Elect & Comp Sci, Southampton SO17 1BJ, Hants, England)

We present a sparse dynamic programming algorithm that, given two strings $s$ and $t$, a gap penalty $l$, and an integer $p$, computes the value of the gap-weighted length-$p$ subsequences kernel. The algorithm works in time $O(p |M| \log |t|)$, where $M = \{(i,j) \mid s_i = t_j\}$ is the set of matches of characters in the two sequences. The algorithm is easily adapted to handle bounded length subsequences and different gap-penalty schemes, including penalizing by the total length of gaps and the number of gaps as well as incorporating character-specific match/gap penalties. The new algorithm is empirically evaluated against a full dynamic programming approach and a trie-based algorithm both on synthetic and newswire article data. Based on the experiments, the full dynamic programming approach is the fastest on short strings, and on long strings if the alphabet is small. On large alphabets, the new sparse dynamic programming algorithm is the most efficient. On medium-sized alphabets the trie-based approach is best if the maximum number of allowed gaps is strongly restricted.

Keywords: kernel methods; string kernels; text categorization; sparse dynamic programming
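
The role of the match set M in the abstract above can be sketched with a deliberately naive Python version that iterates only over matching position pairs. It assumes the common weighting convention in which each occurrence of a subsequence contributes the gap penalty raised to the length of the span it covers in each string; the abstract does not spell out the convention, so treat this as an assumption. The sketch takes time quadratic in |M| per level, whereas the paper reports O(p |M| log |t|). The function name and the parameter name `lam` are hypothetical.

```python
def gap_weighted_kernel(s, t, p, lam):
    """Gap-weighted length-p subsequence kernel, computed over matches only.

    Assumed convention: a common subsequence occurring at positions
    i_1 < ... < i_p in s and j_1 < ... < j_p in t contributes
    lam ** ((i_p - i_1 + 1) + (j_p - j_1 + 1)).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    # M = {(i, j) | s[i] == t[j]}, the set of matching position pairs.
    matches = [(i, j) for i in range(len(s))
               for j in range(len(t)) if s[i] == t[j]]
    # d[(i, j)]: total weight of common subsequences of the current
    # length whose last symbol is the match (i, j).
    d = {m: lam ** 2 for m in matches}
    for _ in range(p - 1):
        nxt = {}
        for (i, j) in matches:
            # Extend every shorter subsequence ending strictly before
            # (i, j) in both strings; the span grows by the two offsets.
            nxt[(i, j)] = sum(
                w * lam ** ((i - i2) + (j - j2))
                for (i2, j2), w in d.items()
                if i2 < i and j2 < j
            )
        d = nxt
    return sum(d.values())


if __name__ == "__main__":
    print(gap_weighted_kernel("cat", "cart", 2, 0.5))
```

Replacing the inner sum with a range-sum query over the earlier matches is what removes the quadratic factor; that data structure is omitted here.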

On minimizing pattern splitting in multi-track string matching

JOURNAL OF DISCRETE ALGORITHMS, 2005, vol. 3, no. 2-4, pp. 248-266

Authors: Lemstrom, Kjell; Makinen, Veli (Univ Helsinki, Dept Comp Sci, POB 68, Gustav Hallstromin Katu 2b, FIN-00014 Helsinki, Finland)

Given a pattern string $P = p_1 p_2 \cdots p_m$ and $K$ parallel text strings $T = \{T^k = t_1^k \cdots t_n^k \mid 1 \le k \le K\}$ over an integer alphabet $\Sigma$, our task is to find the smallest integer $\kappa > 0$ such that $P$ can be split into $\kappa$ pieces $P = P^1 \cdots P^\kappa$, where each $P^i$ has an occurrence in some text track $T^{k_i}$ and these partial occurrences retain the order. We study some variations of this minimum splitting problem, such as splittings with limited gaps and transposition invariance, and show how to use sparse dynamic programming to solve the variations efficiently. In particular, we show that the minimum splitting problem can be interpreted as a shortest path problem on line segments. (C) 2004 Elsevier B.V. All rights reserved.

Keywords: String matching; sparse dynamic programming; Shortest paths; Transposition invariance; Music retrieval

Sparse dynamic programming for evolutionary-tree comparison

SIAM JOURNAL ON COMPUTING, 1997, vol. 26, no. 1, pp. 210-230

Authors: Farach, M; Thorup, M (Univ Copenhagen, Dept Comp Sci, DK-2100 Copenhagen, Denmark)

Constructing evolutionary trees for species sets is a fundamental problem in biology. Unfortunately, there is no single agreed-upon method for this task, and many methods are in use. Current practice dictates that trees be constructed using different methods and that the resulting trees should be compared for consensus. It has become necessary to automate this process as the number of species under consideration has grown. We study one formalization of the problem: the maximum agreement-subtree (MAST) problem. The MAST problem is as follows: given a set $A$ and two rooted trees $T_0$ and $T_1$ leaf-labeled by the elements of $A$, find a maximum-cardinality subset $B$ of $A$ such that the topological restrictions of $T_0$ and $T_1$ to $B$ are isomorphic. In this paper, we will show that this problem reduces to unary weighted bipartite matching (UWBM) with an $O(n^{1+o(1)})$ additive overhead. We also show that UWBM reduces linearly to MAST. Thus our algorithm is optimal unless UWBM can be solved in near linear time. The overall running time of our algorithm is $O(n^{1.5} \log n)$, improving on the previous best algorithm, which runs in $O(n^2)$. We also derive an $O(n c^{\sqrt{\log n}})$-time algorithm for the case of bounded degrees, whereas the previously best algorithm runs in $O(n^2)$, as in the unbounded case.

Keywords: sparse dynamic programming; computational biology; evolutionary trees

Bridging the algorithm gap: A linear-time functional program for paragraph formatting

SCIENCE OF COMPUTER PROGRAMMING, 1999, vol. 35, no. 1, pp. 3-27

Authors: de Moor, O; Gibbons, J (Univ Oxford, Programming Res Grp, Oxford OX1 3QD, England; Oxford Brookes Univ, Sch Comp & Math Sci, Oxford OX3 0BP, England)

In the constructive programming community it is commonplace to see formal developments of textbook algorithms. In the algorithm design community, on the other hand, it may be well known that the textbook solution to a problem is not the most efficient possible. However, in presenting the more efficient solution, the algorithm designer will usually omit some of the implementation details, thus creating an algorithm gap between the abstract algorithm and its concrete implementation. This is in contrast to the formal development, which usually proceeds all the way to the complete concrete implementation of the less efficient solution. We claim that the algorithm designer is forced to omit some of the details by the relative expressive poverty of the Pascal-like languages typically used to present the solution. The greater expressiveness provided by a functional language would allow the whole story to be told in a reasonable amount of space. In this paper we use a functional language to present the development of a sophisticated algorithm all the way to the final code. We hope to bridge the algorithm gap between abstract and concrete implementations, and thereby facilitate communication between the constructive programming and algorithm design communities. (C) 1999 Elsevier Science B.V. All rights reserved.

Keywords: paragraph formatting; functional programming; algorithm design; sparse dynamic programming; transformational programming

Large-Scale Comparison Analysis of Genome Sequences

Journal of Biomathematics (生物数学学报), 1997, vol. 12, no. 2, pp. 97-103

Authors: Tang Haixu; Ding Dafu (Shanghai Institute of Biochemistry, Academia Sinica, Shanghai 200031)

Here we report a new method for large-scale sequence *** it, we could get an approximate but accurate enough global or local alignment *** the method, sparse dynamic programming was used to refine the alignment space; thus, computational time is *** also used a hashing technique to search for short gap-free alignment fragments, which form the basic ensemble of sparse dynamic *** examples have been aligned by the program using our method.

Keywords: Large-scale sequence alignment; hashing technique; sparse dynamic programming
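
The hashing step mentioned in the abstract above, which searches for short gap-free fragments shared by two sequences, might look roughly like the following Python sketch. It indexes every length-k substring of one sequence in a hash table and scans the other for exact hits. The function name, the fixed seed length `k` and the use of exact matches are all assumptions made for illustration; the abstract does not give the details of the authors' method.

```python
from collections import defaultdict


def find_seeds(a, b, k):
    """Return all (i, j) with a[i:i+k] == b[j:j+k], ordered by j then i.

    Each pair is a gap-free fragment of length k shared by both
    sequences. A sparse dynamic programming step would then work on
    these fragments only, instead of the full len(a) x len(b) table.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Hash table from each k-mer of a to the positions where it starts.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    print(find_seeds("ACGTAC", "GTACG", 3))
```

Building the index takes time linear in the length of the first sequence for fixed k, and the number of seeds, rather than the product of the sequence lengths, then bounds the work of any later chaining step.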
