Given a string W, the longest previous factor (LPF) problem is to determine the maximum length of a previously occurring factor for each suffix occurring in W. The LPF problem is defined for traditional strings exclus...
详细信息
Given a string W, the longest previous factor (LPF) problem is to determine the maximum length of a previously occurring factor for each suffix occurring in W. The LPF problem is defined for traditional strings exclusively from the constant alphabet Sigma. A parameterized string (p-string) is a string composed of symbols from a constant alphabet Sigma and a parameter alphabet Pi. We formulate the LPF problem in terms of p-strings by defining the parameterized longest previous factor (pLPF) problem. Subsequently, we present an expected linear time solution to construct the parameterized longest previous factor (pLPF) array. Given our pLPF solution, we show how to construct the pLCP (parameterized longest common prefix) array with the same general algorithm. We exploit the properties of the pLPF data structure to also construct the standard LPF (longest previous factor) and LCP (longest common prefix) arrays all in linear time. Further, we provide insight into the practicality of our construction algorithms. (c) 2012 Elsevier B.V. All rights reserved.
The challenge of direct parameterizedsuffix sorting (p-suffix sorting) for a parameterized string (p-string) is the dynamic nature of parameterizedsuffixes (p-suffixes). In this work, we propose transformative appro...
详细信息
ISBN:
(纸本)9783642250101
The challenge of direct parameterizedsuffix sorting (p-suffix sorting) for a parameterized string (p-string) is the dynamic nature of parameterizedsuffixes (p-suffixes). In this work, we propose transformative approaches to direct p-suffix sorting by generating and sorting lexicographically numeric fingerprints and arithmetic codes that correspond to individual p-suffixes. Our algorithm to p-suffix sort via fingerprints is the first theoretical linear time algorithm for p-suffix sorting for non-binary parameter alphabets, which assumes that each code is represented by a practical integer. We eliminate the key problems of fingerprints by introducing an algorithm that exploits the ordering of arithmetic codes to sort p-suffixes in linear time on average.
The longest previous factor (LPF) problem is defined for traditional strings exclusively from the constant alphabet Sigma. A parameterized string (p-string) is a sophisticated string composed of symbols from a constan...
详细信息
ISBN:
(纸本)9783642250101
The longest previous factor (LPF) problem is defined for traditional strings exclusively from the constant alphabet Sigma. A parameterized string (p-string) is a sophisticated string composed of symbols from a constant alphabet Sigma and a parameter alphabet Pi. We generalize the LPF problem to the parameterized longest previous factor (pLPF) problem defined for p-strings. Subsequently, we present a linear time solution to construct the pLPF array. Given our pLPF algorithm, we show how to construct the pLCP (parameterized longest common prefix) array in linear time. Our algorithm is further exploited to construct the standard LPF and LCP arrays all in linear time.
The parameterized longest previous factor (pLPF) problem as defined for parameterized strings (p-strings) adds a level of parameterization to the longest previous factor (LPF) problem originally defined for traditiona...
详细信息
The parameterized longest previous factor (pLPF) problem as defined for parameterized strings (p-strings) adds a level of parameterization to the longest previous factor (LPF) problem originally defined for traditional strings. In this work, we consider the construction of the pLPF data structure and identify the strong relationship between the pLPF linear time construction and several variations of the problem. Initially, we propose a taxonomy of classes for longest factor problems. Using this taxonomy, we show an interesting connection between the pLPF and popular data structures. It is shown that a subset of longest factor problems may be created with the pLPF construction. More specifically, the pLPF problem is used as a foundation to achieve the linear time construction of popular data structures such as the LCP, parameterized-LCP (pLCP), parameterized-border (p-border) array, and border array. We further generalize the permuted-LCP for p-strings and provide a linear time construction. A number of new variations of the pLPF problem are proposed and addressed in linear time for both p-strings and traditional strings, including the longest not-equal factor (LneF), longest reverse factor (LrF), and longest factor (LF). The framework of the pLPF construction is exploited to efficiently address a multitude of data structures with prospects in various applications. Finally, we implement our algorithms and perform various experiments to confirm theoretical results. (C) 2012 Elsevier B.V. All rights reserved.
The challenge of direct parameterizedsuffix sorting (p-suffix sorting) for a parameterized string (p-string), say T of length-n, is the dynamic nature of the n parameterizedsuffixes (p-suffixes) of T. In this work, ...
详细信息
The challenge of direct parameterizedsuffix sorting (p-suffix sorting) for a parameterized string (p-string), say T of length-n, is the dynamic nature of the n parameterizedsuffixes (p-suffixes) of T. In this work, we propose transformative approaches to direct p-suffix sorting by generating and sorting lexicographically numeric fingerprints and arithmetic codes that correspond to individual p-suffixes. Our algorithm to p-suffix sort via fingerprints is the first theoretical linear time algorithm for p-suffix sorting for nonbinary parameter alphabets, which assumes that, in practice, all codes are within the range of an integral data type. We eliminate the key problems of fingerprints by introducing an algorithm that exploits the ordering of arithmetic codes to sort p-suffixes in linear time on average. The arithmetic coding approach is further extended to handle p-strings in the worst case. This algorithm is the first direct p-suffix sorting approach in theory to execute in o(n(2)) time in the worst case, which improves on the best known theoretical result on this problem that sorts p-suffixes based on p-suffix classifications in O(n(2)) time. We show that, based on the algorithmic parameters and the input data, our algorithm does indeed execute in linear time in various cases, which is confirmed with experimental results. (C) 2012 Elsevier B. V. All rights reserved.
暂无评论