Given a text T and a pattern P, the order-preserving pattern matching (OPPM) problem is to find all substrings in T which have the same relative orders as P. The OPPM has been studied in the fields of finding some pat...
详细信息
Given a text T and a pattern P, the order-preserving pattern matching (OPPM) problem is to find all substrings in T which have the same relative orders as P. The OPPM has been studied in the fields of finding some patterns affected by relative orders, not by their absolute values. In this paper, we present a method of deciding the order-isomorphism between two strings even when there are same characters. Then, we show that the bad character rule of the horspool algorithm for generic pattern matching problems can be applied to the OPPM problem and we present a space-efficient algorithm for computing shift tables for text search. Finally, we combine our bad character rule with the KMP-based algorithm to improve the worst-case running time. We give experimental results to show that our algorithm is about 2 to 6 times faster than the KMP-based algorithm in reasonable cases. (C) 2014 Published by Elsevier B.V.
A multi-track string is a tuple of strings of the same length. The full permuted pattern matching problem is, given two multi-track strings T = (t(1), t(2),..., t(N)) and P = (p(1), p(2),..., p(N)) such that vertical ...
详细信息
ISBN:
(纸本)9788001059968
A multi-track string is a tuple of strings of the same length. The full permuted pattern matching problem is, given two multi-track strings T = (t(1), t(2),..., t(N)) and P = (p(1), p(2),..., p(N)) such that vertical bar p(1)vertical bar = ... = vertical bar p(N)vertical bar <= vertical bar t(1)vertical bar = ... = vertical bar t(N)vertical bar, to find all positions i such that P = (tr(1) [i : i + m-1],..., t(rN) [i : i + m-1]) for some permutation (r(1),..., r(N)) of (1,..., N), where m = vertical bar p(1)vertical bar and t[i : j] denotes the substring of t from position i to j. We propose new algorithms that perform full permuted pattern matching practically fast. The first and second algorithms are based on the Boyer-Moore algorithm and the horspool algorithm, respectively. The third algorithm is based on the Aho-Corasick algorithm where we use a multi-track character instead of a single character in the so-called goto function. The fourth algorithm is an improvement of the multi-track Knuth-Morris-Pratt algorithm that uses an automaton instead of the failure function of the original algorithm. Our experiment results demonstrate that those algorithms perform permuted pattern matching faster than existing algorithms.
We study exact string matching in special texts, which consist of consecutive fixed-length chunks where each position of a chunk has a character distribution of its own. This kind of setting can also be interpreted so...
详细信息
ISBN:
(纸本)9783540763352
We study exact string matching in special texts, which consist of consecutive fixed-length chunks where each position of a chunk has a character distribution of its own. This kind of setting can also be interpreted so that a chunk represents a character of a larger alphabet. If texts and patterns are of this kind, it may ruin the efficiency of common algorithms. We examine anomalies related to the horspool and Sunday algorithms in this setting. In addition we present two new algorithms.
暂无评论