The Pattern Matching problern with Swaps consists in finding all occurrences of a pattern P in a text T, when disjoint local swaps in the pattern are allowed. In the Approximate Pattern Matching problem with Swaps one...
详细信息
ISBN:
(纸本)9788001044032
The Pattern Matching problern with Swaps consists in finding all occurrences of a pattern P in a text T, when disjoint local swaps in the pattern are allowed. In the Approximate Pattern Matching problem with Swaps one seeks to compute, for every text location with a swapped match of P the number of swaps necessary to obtain a match at the location. in this paper, we present new efficient algorithms for the Approximatc, Swap Matching problem. In particular, we first present a O(nm(2)) algorithm, where fn is the length of the pattern and n is the length of the text, which is a variation of the BACKWARD-CROSSSAMPLING algorithm, a recent solution to the swap matching problem. Subsequently, we propose an efficient implementation of our algorithm, based on the bit-parallelism technique. The latter solution achieves a O(mn)-time and O(sigma)-space complexity, where a is the dimension of the alphabet. From an extensive comparison with some of the most recent and effective algorithms for the approximate swap matching problem, it turns out that our algorithms are very flexible and achieve very good results in practice.
The Pattern Matching problem with Swaps consists in finding all occurrences of a, pattern P in a text T, when disjoint local swaps in the pattern are allowed. In the Approximate Pattern Matching problem with Swaps one...
详细信息
ISBN:
(纸本)9783540958901
The Pattern Matching problem with Swaps consists in finding all occurrences of a, pattern P in a text T, when disjoint local swaps in the pattern are allowed. In the Approximate Pattern Matching problem with Swaps one seeks, for every text location with a swapped match of P, the number of swaps necessary to obtain a match at the location. In this paper, we present a new approach for solving both the Swap Matching problem and the Approximate Swap Matching problem in linear time, in the case of short, patterns. In particular, we devise a O(nm) general algorithm, named CROSS-SAMPLINGING, and show an efficient implementation of it, based on bit-parallelism, which achieves O(n) worst-case time and O(sigma)-space complexity, with patterns whose length is comparable to the word-size of the target machine.
The Pattern Matching problem with Swaps consists in finding all occurrences of a pattern P in a text;T, when disjoint local swaps in the pattern are allowed. In this paper, we present a new efficient algorithm for the...
详细信息
ISBN:
(纸本)9783642102165
The Pattern Matching problem with Swaps consists in finding all occurrences of a pattern P in a text;T, when disjoint local swaps in the pattern are allowed. In this paper, we present a new efficient algorithm for the Swap Matching problem with short patterns. In particular, we devise a O(nm(2)) general algorithm, named BACKWARD-CROSS-SAMPLING, and show an efficient implementation of it, based on bit-parallelism, which achieves O(nm) worst-case time and O(sigma)-space complexity, with patterns whose length m is comparable to the word-size of the target machine (n and sigma are respectively the size of the text and of the alphabet). From an extensive comparison with some of the most recent and effective algorithms for the swap matching problem, it turns out that our algorithm is very flexible and achieves very good results in practice.
We give a time-randomness tradeoff for the quasi-random rumour spreading protocol proposed by Doerr et al [Doerr, B., T. Friedrich, and T. Sauerwald. Quasirandom rumor spreading. In Proc. of the 19th Annual ACM-SIAM S...
详细信息
Molecular biology has posed a number of fascinating and sometimes daunting computational problems, which came naturally expressed in its native language of character strings. Through the years, some such problems have...
详细信息
Molecular biology has posed a number of fascinating and sometimes daunting computational problems, which came naturally expressed in its native language of character strings. Through the years, some such problems have found elegant and even useful solutions in response to the needs that originally motivated them. What is perhaps even more remarkable, several of the ideas inspired by computational molecular biology have found application in remote and diverse domains, so that it may be argued that molecular biology did more for computing than the latter did for it. As a modest tribute, this paper reviews a small sample of these cases drawing from the personal exposure of the author.
作者:
Halman, NirMIT
Dept Civil & Environm Engn Cambridge MA 02139 USA
Helly's theorem says that, if every d+1 elements of a given finite set of convex objects in R-d have a common point, there is a point common to all of the objects in the set. In discrete Helly theorems the common ...
详细信息
Helly's theorem says that, if every d+1 elements of a given finite set of convex objects in R-d have a common point, there is a point common to all of the objects in the set. In discrete Helly theorems the common point should belong to an a priori given set. In lexicographic Helly theorems the common point should not be lexicographically greater than a given point. Using discrete and lexicographic Helly theorems we get linear time solutions for various optimization problems. For this, we introduce the DLP-type (discrete linear programming-type) model, and provide new algorithms that solve in randomized linear time fixed-dimensional DLP-type problems. For variable-dimensional DLP-type problems, our algorithms run in time subexponential in the combinatorial dimension. Finally, we use our results in order to solve in randomized linear time problems such as the discrete p-center on the real line, the discrete weighted 1-center problem in R-d with either l(1) or l(infinity) norm, the standard (continuous) problem of finding a line transversal for a totally separable set of planar convex objects, a discrete version of the problem of finding a line transversal for a set of axis-parallel planar rectangles, and the (planar) lexicographic rectilinear p-center problem for p = 1, 2, 3. These are the first known linear time algorithms for these problems. Moreover, we use our algorithms to solve in randomized subexponential time various problems in game theory, improving upon the best known algorithms for these problems.
Motif patterns consisting of sequences of intermixed solid and don't-care characters have been introduced and studied in connection with pattern discovery problems of computational biology and other domains. In or...
详细信息
Motif patterns consisting of sequences of intermixed solid and don't-care characters have been introduced and studied in connection with pattern discovery problems of computational biology and other domains. In order to alleviate the exponential growth of such motifs, notions of maximal saturation and irredundancy have been formulated, whereby more or less compact subsets of the set of all motifs can be extracted, that are capable of expressing all others by suitable combinations. In this paper, we introduce the notion of maximal irredundant motifs in a two-dimensional array and develop initial properties and a combinatorial argument that poses a linear bound on the total number of such motifs. The remainder of the paper presents approaches to the discovery of irredundant motifs both by offline and incremental algorithms. (C) 2007 Elsevier B.V. All rights reserved.
Compact bases formed by motifs called "irredundant" and capable of generating all other motifs in a sequence have been proposed in recent years and successfully tested in tasks of biosequence analysis and cl...
详细信息
Compact bases formed by motifs called "irredundant" and capable of generating all other motifs in a sequence have been proposed in recent years and successfully tested in tasks of biosequence analysis and classification. Given a sequence s of n characters drawn from an alphabet Sigma, the problem of extracting such a base from s had been previously solved in time O(n(2) log n log vertical bar Sigma vertical bar) and O(vertical bar Sigma vertical bar n(2) log(2) n log log n), respectively, using the FFT-based string searching by Fischer and Paterson. More recently, a solution on binary strings taking time O(n(2)) without resorting to the FFT was also proposed. In the present paper, we considered the problem of incrementally extracting the bases of all suffixes of a string. This problem was solved in a previous work in time O(n(3)). A much faster incremental algorithm is described here, which takes time O(n(2) log n) for binary strings. Although this algorithm does not make use of the FFT, its performance is comparable to the one exhibited by the previous FFT-based algorithms involving the computation of only one base. The implicit representation of a single base requires O(n) space, whence for finite alphabets the proposed solution is within a log n factor from optimality. (C) 2008 Elsevier B.V. All rights reserved.
The NP-complete POWER DOMINATING SET problem is an "electric power networks variant" of the classical domination problem in graphs: Given an undirected graph G = (V, E), find a minimum-size set P. V such tha...
详细信息
The NP-complete POWER DOMINATING SET problem is an "electric power networks variant" of the classical domination problem in graphs: Given an undirected graph G = (V, E), find a minimum-size set P. V such that all vertices in V are "observed" by the vertices in P. Herein, a vertex observes itself and all its neighbors, and if an observed vertex has all but one of its neighbors observed, then the remaining neighbor becomes observed as well. We show that POWER DOMINATING SET can be solved by "bounded-treewidth dynamic programs." For treewidth being upper-bounded by a constant, we achieve a linear-time algorithm. In particular, we present a simplified linear-time algorithm for POWER DOMINATING SET in trees. Moreover, we simplify and extend several NP-completeness results, particularly showing that POWER DOMINATING SET remains NP-complete for planar graphs, for circle graphs, and for split graphs. Specifically, our improved reductions imply that POWER DOMINATING SET parameterized by vertical bar P vertical bar is W[2]-hard and it cannot be better approximated than DOMINATING SET.
Let T = (V, E) be a tree with n nodes such that each node v is associated with a value-weight pair (val(v), w(v)), where the value val(v) is a real number and the weight w(v) is a positive integer. The density of a pa...
详细信息
Let T = (V, E) be a tree with n nodes such that each node v is associated with a value-weight pair (val(v), w(v)), where the value val(v) is a real number and the weight w(v) is a positive integer. The density of a path P = (v(1), v(2).....v(k)) is defined as Sigma(k)(i=1) val(i)/Sigma(k)(i=1)w(i). The weight of P, denoted by w(P), is Sigma(k)(i=1)w(i). Given a tree of n nodes, two integers w(min) and w(max), and a length lower bound L, we present a pseudo-polynomial O(w(max)nL)-time algorithm to find a maximum-density path P in T such that W-min <= w(P) <= w(max) and the length of P is at least L. (c) 2007 Elsevier B.V. All rights reserved.
暂无评论