We explore the regular-expression matching problem with respect to prefix-freeness of the pattern. We prove that a prefix-free regular expression yields only a linear number of matching substrings in the size of a given text. Based on this observation, we propose an efficient algorithm for the prefix-free regular-expression matching problem. Furthermore, we suggest an algorithm to determine whether or not a given regular language is prefix-free. (c) 2007 Elsevier B.V. All rights reserved.
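As an illustration of the prefix-freeness test mentioned in this abstract, the following is a minimal sketch, not the paper's algorithm. It assumes the regular language is already given as a deterministic finite automaton encoded as a transition dictionary; the function names and the encoding are hypothetical. It relies on the standard observation that the language is prefix-free exactly when no reachable accepting state can reach an accepting state by a path of one or more transitions.

```python
from collections import deque


def states_after_steps(sources, succ):
    """Return the states reachable from `sources` using one or more transitions."""
    seen = set()
    queue = deque(sources)
    while queue:
        q = queue.popleft()
        for r in succ.get(q, ()):
            if r not in seen:
                seen.add(r)
                queue.append(r)
    return seen


def is_prefix_free(start, accepting, delta):
    """Check prefix-freeness of the language of a (possibly partial) DFA.

    delta maps (state, symbol) -> state. This encoding is an assumption
    made for the sketch, not something taken from the paper.
    """
    # Forget the symbols: only the successor relation matters here.
    succ = {}
    for (q, _symbol), r in delta.items():
        succ.setdefault(q, set()).add(r)

    # Accepting states that can actually be reached from the start state.
    reachable = {start} | states_after_steps({start}, succ)
    live_accepting = set(accepting) & reachable

    # A word w and a proper extension wx are both accepted exactly when
    # some live accepting state reaches an accepting state in >= 1 step.
    return not (states_after_steps(live_accepting, succ) & set(accepting))
```

For example, a DFA for `a*b` passes the test, while a DFA for `a+` does not, because `a` is a proper prefix of `aa`.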
This paper presents a Boyer-Moore-type algorithm for regular expression pattern matching, answering an open problem posed by Aho in 1980 (Pattern matching in strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular expressions, which makes it a generalization of the Boyer-Moore and Commentz-Walter algorithms. Like those algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation algorithms are derived, and it is shown that the required shift functions can be precomputed from Commentz-Walter's d(1) and d(2) shift functions. In certain cases, the Boyer-Moore (respectively Commentz-Walter) algorithm has greatly outperformed the Knuth-Morris-Pratt (respectively Aho-Corasick) algorithm (as discussed by Watson in his Ph.D. thesis, Eindhoven University of Technology, September 1995, and in: N. Ziviani, R. Baeza-Yates, K. Guimarães (Eds.), Proc. Third South American Workshop on String Processing, International Informatics Series, vol. 4, Carleton University Press, Recife, Brazil, 1996, pp. 280-294). In testing, the algorithm presented in this paper also frequently outperforms the regular expression generalization of the Aho-Corasick algorithm. (C) 2003 Elsevier B.V. All rights reserved.
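To make the idea of a precomputed, tabulated shift function concrete, here is a minimal sketch of the Boyer-Moore-Horspool variant for a single fixed keyword. It is not the regular-expression algorithm of the paper, and it does not compute Commentz-Walter's d(1) and d(2) functions; it only shows, under that simplifying assumption, how a shift table lets the search skip over parts of the text. The function name is hypothetical.

```python
def horspool_search(text, pattern):
    """Return the start indices of all occurrences of `pattern` in `text`."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Precomputed shift table: for every character of the pattern except
    # the last one, the distance from its last occurrence to the pattern end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the table entry of the text character aligned with the
        # last pattern position; characters not in the table allow a full
        # shift of m positions.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

On the text `abracadabra` with the keyword `abra`, the search inspects only three alignments (positions 0, 3 and 7) instead of all eight.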
This paper describes an efficient multi-attribute pattern matching machine to locate all occurrences of any of a finite number of sequences of rule structures in a series of input structures. The matching operation of the proposed machine is similar to the method of Aho-Corasick or to retrieval using a trie. However, the proposed machine has the following distinctive features: (1) it enables us to match set representations containing multiple attributes; (2) it enables us to match separate components; (3) it enables us to match a rule consisting of an exclusive set. In this paper, these features are described in detail. Moreover, the pattern matching algorithm is evaluated by theoretical and experimental observations, supported by simulation results for a variety of document-processing rules such as text proofreading, text reduction, and examining a relation between sentences.
Aho and Corasick presented a string pattern matching machine to locate multiple keywords. However, the AC machine could not match multi-attribute information. This paper describes an efficient multi-attribute pattern matching machine to locate all occurrences of any of a finite number of sequences of rule structures (called matching rules) in a sequence of input structures. The proposed algorithm enables us to match set representations containing multiple attributes. Therefore, a transition is confirmed by checking whether or not the input structure includes the rule structure. Finally, the pattern matching algorithm is evaluated by theoretical analysis, and the evaluation is supported by simulation results with rules for the extraction of keywords. (C) 1998 Elsevier Science Inc. All rights reserved.
Aho and Corasick presented a string pattern matching machine (hereafter called machine AC) to locate multiple keywords. However, the machine AC must be reconstructed from scratch whenever a keyword is appended. This paper proposes an efficient algorithm to append a keyword to the machine AC and compares its time efficiency with the original algorithm using simulation results. The simulation results show a speed-up factor of between 25 and 270 for the proposed algorithm compared with the original algorithm by Aho and Corasick, which requires the reconstruction of the entire machine AC.
In the subtree isomorphism problem, given two rooted trees T1 and T2, a determination is made as to whether T1 is isomorphic to any subtree of T2. A tree is considered ordered if the relative order of its subtrees in each node is fixed. It is shown that an O(m+n) time algorithm can be obtained when dealing only with ordered trees. The algorithm is based on tree encoding and on string pattern matching. The subtree isomorphism problem has connections with the tree pattern matching problem, in which nodes are labeled and the trees may contain special symbols that stand for arbitrary trees. While several algorithms already exist to reduce the tree pattern matching problem to string matching, algorithms with a time complexity of less than O(mn) are obtained only in some special cases.
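The combination of tree encoding and string matching described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact construction: trees are unlabeled and ordered, a tree is represented as a nested Python list of its children, "subtree" is taken to mean a node together with all of its descendants, and the encoding is a balanced-parentheses string. All function names are hypothetical. The substring test uses Knuth-Morris-Pratt so that the whole check runs in time linear in the two encodings.

```python
def encode(tree):
    """Encode an ordered tree (nested lists of children) as balanced parentheses."""
    parts = []

    def visit(node):
        parts.append("(")
        for child in node:
            visit(child)
        parts.append(")")

    visit(tree)
    return "".join(parts)


def kmp_contains(text, pattern):
    """Return True if `pattern` occurs in `text` (Knuth-Morris-Pratt)."""
    if not pattern:
        return True
    # fail[i] = length of the longest proper border of pattern[:i + 1].
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    k = 0
    for ch in text:
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return True
    return False


def is_ordered_subtree(t1, t2):
    """True if t1 equals the subtree rooted at some node of t2 (ordered, unlabeled)."""
    # The first "(" of encode(t1) is matched by its last ")", so every
    # occurrence inside encode(t2) spans exactly one complete subtree.
    return kmp_contains(encode(t2), encode(t1))
```

For instance, a root with two leaf children, `[[], []]`, is found inside `[[], [[], []]]` but not inside the path `[[[]]]`.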
Author: Atallah, M. J., Department of Computer Sciences, Purdue University
A straight line is an axis of symmetry of a planar figure if the figure is invariant to reflection with respect to that line. The purpose of this correspondence is to describe an O(n log n) time algorithm for enumerating all the axes of symmetry of a planar figure which is made up of (possibly intersecting) segments, circles, points, etc. The solution involves a reduction of the problem to a combinatorial question on words. Our algorithm is optimal since we can establish an Ω(n log n) time lower bound for this problem.
Two planar figures are similar if a scaled version of one of them can be moved so that it coincides with the second figure. The problem of checking whether two planar figures are similar is relevant to both computational geometry and pattern recognition. An efficient algorithm is known for checking whether two polygons P and Q are similar (1). The purpose of this note is to give an efficient algorithm for checking whether two planar figures P and Q are similar when the figures are no longer constrained to be polygons. We give an O(n log n) time algorithm for solving this problem when each figure consists of a collection of (possibly intersecting) straight line segments, circles, and ellipses. Our algorithm can easily be modified for figures which include other geometric patterns as well. We also prove that our algorithm is optimal.
This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern matching machine takes time proportional to the sum of the lengths of the keywords. The number of state transitions made by the pattern matching machine in processing the text string is independent of the number of keywords. The algorithm has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
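The two phases described in this abstract, building a finite state machine from the keywords and then scanning the text in a single pass, can be sketched as follows. This is a compact illustration of the Aho-Corasick construction with goto, failure, and output functions. The data layout (lists of dictionaries) and the function names are assumptions made for this sketch, not details taken from the paper.

```python
from collections import deque


def build_machine(keywords):
    """Build goto, failure, and output tables for the given keywords."""
    goto = [{}]      # goto[state][char] -> next state (a trie of the keywords)
    output = [[]]    # output[state] -> keywords recognized on entering state
    for word in keywords:
        if not word:
            continue  # empty keywords are ignored in this sketch
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(word)

    # Failure links, computed breadth-first so that shallower states are done first.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure state.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def search(text, keywords):
    """Return (start_index, keyword) pairs for every occurrence in `text`."""
    goto, fail, output = build_machine(keywords)
    hits = []
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in output[state]:
            hits.append((i - len(word) + 1, word))
    return hits
```

With the keywords `he`, `she`, `his`, and `hers`, scanning the text `ushers` reports `she` at index 1 and both `he` and `hers` at index 2, all in one left-to-right pass.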