Search results - Inner Mongolia University Library

An incremental algorithm for string pattern matching machines

INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 1995, vol. 58, no. 1-2, pp. 33-42

Authors: Tsuda, K; Fuketa, M; Aoe, JI. Department of Information Science & Intelligent Systems, The University of Tokushima, Minami-Josanjima-Cho, Tokushima-Shi, Japan

Aho and Corasick presented a string pattern matching machine (hereafter called machine AC) to locate multiple keywords. However, machine AC must be rebuilt from scratch whenever a keyword is appended. This paper proposes an efficient algorithm for appending a keyword to machine AC and compares its time efficiency with the original algorithm using simulation results. The simulations show a speed-up factor of between 25 and 270 over the original algorithm by Aho and Corasick, which requires reconstructing the entire machine AC.

Keywords: string pattern matching; append a keyword; bibliographic search; text-editing; information retrieval
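To make the abstract's baseline concrete, here is a minimal sketch of the classic Aho-Corasick construction (goto, failure, and output tables) with the "rebuild everything" way of appending a keyword. It is not the incremental algorithm proposed in the paper; the function names `build_ac`, `search`, and `append_by_rebuild` are hypothetical and chosen only for illustration.

```python
from collections import deque

def build_ac(keywords):
    """Build goto, failure, and output tables for the given keywords."""
    goto = [{}]       # goto[state][char] -> next state
    out = [set()]     # out[state] -> keywords recognized on entering state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    # Breadth-first pass: depth-1 states keep failure 0, deeper states
    # inherit from the failure chain of their parent.
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def search(machine, text):
    """Return (start_index, keyword) for every keyword occurrence in text."""
    goto, fail, out = machine
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits

def append_by_rebuild(keywords, new_keyword):
    """Baseline append: discard the old machine and rebuild it entirely."""
    return build_ac(list(keywords) + [new_keyword])

machine = build_ac(["he", "she", "his"])
machine = append_by_rebuild(["he", "she", "his"], "hers")
print(sorted(search(machine, "ushers")))
```

Every call to `append_by_rebuild` redoes the whole trie and the whole breadth-first failure computation, which is the cost the abstract says an incremental append avoids.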


Inverted Lists string pattern matching


2nd IEEE International Conference on Computer Science and Information Technology

Authors: Khancome, Chouvalit; Boonjing, Veera. KMITL, Fac Sci, Dept Math & Comp Sci, Ladkrabang, Bangkok, Thailand

ISBN: (print) 9781424445189

This paper presents two algorithms for string pattern matching. Both employ inverted lists to hold the string pattern to be searched for. The first solution scans the text in a single pass for all occurrences of the pattern. The second solution improves on the first: its number of comparisons equals the length of the pattern plus the number of comparisons that end in a mismatch.

Keywords: string pattern matching; inverted lists (IVL); inverted index
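The abstract does not spell out the authors' data structure, so the following is only one plausible illustration of the idea: an inverted list maps each pattern character to the positions where it occurs, and a single left-to-right pass over the text advances a set of partial matches. The names `build_inverted_list` and `find_all` are hypothetical, and this sketch should not be read as the paper's algorithm.

```python
def build_inverted_list(pattern):
    """Map each character of the pattern to the set of its positions."""
    ivl = {}
    for pos, ch in enumerate(pattern):
        ivl.setdefault(ch, set()).add(pos)
    return ivl

def find_all(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    ivl = build_inverted_list(pattern)
    m = len(pattern)
    active = set()   # pattern positions that partial matches expect next
    hits = []
    for i, ch in enumerate(text):
        positions = ivl.get(ch, set())
        active.add(0)  # a new match may start at every text position
        # Keep only the partial matches whose expected position holds ch.
        active = {p + 1 for p in active if p in positions}
        if m in active:
            hits.append(i - m + 1)
            active.discard(m)
    return hits

print(find_all("aba", "ababa"))
```

Each text character is read once; the work per character depends on how many partial matches are alive, which is where a more careful scheme such as the second solution in the abstract would save comparisons.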


A pre-processing algorithm for string pattern matching


International Conference on Algorithmic Mathematics and Computer Science (AMCS 05)

Authors: Boxer, L. Niagara Univ, Dept Comp & Informat Sci, Niagara Univ, NY 14109, USA

We introduce an algorithm that can be run in sublinear time and has, under certain circumstances, a high probability of greatly reducing the amount of data from the text that must be considered in order to solve strin...

ISBN: (print) 1932415637

Keywords: string pattern matching; analysis of algorithms


Concluding pattern of Web Page Based on string pattern matching


International Conference on Web Information Systems and Mining (WISM 2011)

Authors: Cai, Yiqing; Wang, Xinjun; Lu, Chunsheng; Yan, Zhongmin; Peng, Zhaohui. Shandong Univ, Sch Comp Sci & Technol, Jinan 250100, Peoples R China; Informat Ctr, Minist Human Resources & Social Secu, Beijing, Peoples R China

ISBN: (print) 9783642239816

Presently, each Web site has its own topics and formats for arranging page structure and presenting information. There is therefore a great need for value-added services that extract information from multiple sources. Data extraction from HTML is usually performed by software modules called wrappers. In many studies on constructing wrappers, concluding the pattern of the Web site is an important first task. This paper studies the problem of concluding the pattern of a Web page that contains several nested and repeated structures. In our method, an algorithm based on string pattern matching discovers the nested and repeated structures in a Web page. A regular expression is then generated as the pattern of the Web site.

Keywords: hierarchical preorder traversal string pattern of Web site string pattern matching nested structure the repeated structure concluding pattern


A Boyer-Moore-style algorithm for regular expression pattern matching


SCIENCE OF COMPUTER PROGRAMMING, 2003, vol. 48, no. 2-3, pp. 99-117

Authors: Watson, BW; Watson, RE. Univ Pretoria, Dept Comp Sci, ZA-0002 Pretoria, South Africa; Eindhoven Univ Technol, Dept Comp Sci, NL-5600 MB Eindhoven, Netherlands

This paper presents a Boyer-Moore-type algorithm for regular expression pattern matching, answering an open problem posed by Aho in 1980 (Pattern matching in strings, Academic Press, New York, 1980, p. 342). The new algorithm handles patterns specified by regular expressions, which makes it a generalization of the Boyer-Moore and Commentz-Walter algorithms. Like those algorithms, the new algorithm makes use of shift functions which can be precomputed and tabulated. The precomputation algorithms are derived, and it is shown that the required shift functions can be precomputed from Commentz-Walter's d1 and d2 shift functions. In certain cases, the Boyer-Moore (respectively Commentz-Walter) algorithm has greatly outperformed the Knuth-Morris-Pratt (respectively Aho-Corasick) algorithm (as discussed by Watson in his Ph.D. Thesis, Eindhoven University of Technology, September 1995, and in: N. Ziviani, R. Baeza-Yates, K. Guimarães (Eds.), Proc. Third South American Workshop on String Processing, International Informatics Series, vol. 4, Carleton University Press, Recife, Brazil, 1996, pp. 280-294). In testing, the algorithm presented in this paper also frequently outperforms the regular expression generalization of the Aho-Corasick algorithm. (C) 2003 Elsevier B.V. All rights reserved.

Keywords: string pattern matching; regular expressions; Boyer-Moore algorithms; Commentz-Walter algorithms; algorithm generalizations


Efficient implementation of Aho-Corasick pattern matching automata using Unicode


SOFTWARE-PRACTICE & EXPERIENCE, 2007, vol. 37, no. 6, pp. 669-690

Authors: Nieminen, Janne; Kilpelainen, Pekka. Univ Kuopio, Dept Comp Sci, FI-70211 Kuopio, Finland

We study different efficient implementations of an Aho-Corasick pattern matching automaton when searching for patterns in Unicode text. Much of the previous research has been based on the assumption of a relatively small alphabet, for example 7-bit ASCII. Our aim is to examine the differences in performance arising from the use of a large alphabet, such as Unicode, which is widely used today. The main concern is the representation of the transition function of the pattern matching automaton. We examine and compare array, linked list, hashing, balanced tree, perfect hashing, hybrid, triple-array, and double-array representations. For perfect hashing, we present an algorithm that constructs the hash tables in expected linear time and linear space. We implement the Aho-Corasick automaton in Java using the different transition function representations, and we evaluate their performance. Triple-array and double-array performed best in our experiments, with perfect hashing, hashing, and balanced tree coming next. We discovered that the array implementation has a slow preprocessing time when using the Unicode alphabet. It seems that the use of a large alphabet can slow down the preprocessing time of the automaton considerably, depending on the transition function representation used. Copyright (C) 2006 John Wiley & Sons, Ltd.

Keywords: string pattern matching; Aho-Corasick; implementation; transition function
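The trade-off described in the abstract can be illustrated with two of the listed representations, a dense array row per state versus a hash table per state. The paper's implementations are in Java; the sketch below uses Python purely for illustration, and the class names `ArrayTransitions` and `HashTransitions` are hypothetical.

```python
from array import array

UNICODE_SIZE = 0x110000  # number of Unicode code points

class ArrayTransitions:
    """Dense representation: one full-alphabet row per state."""
    def __init__(self):
        self.rows = []

    def add_state(self):
        # Every new state allocates and initializes UNICODE_SIZE entries,
        # which is why preprocessing is slow for a large alphabet.
        self.rows.append(array("i", [-1]) * UNICODE_SIZE)
        return len(self.rows) - 1

    def set(self, state, ch, target):
        self.rows[state][ord(ch)] = target

    def get(self, state, ch):
        return self.rows[state][ord(ch)]

class HashTransitions:
    """Sparse representation: one hash table per state."""
    def __init__(self):
        self.rows = []

    def add_state(self):
        self.rows.append({})
        return len(self.rows) - 1

    def set(self, state, ch, target):
        self.rows[state][ch] = target

    def get(self, state, ch):
        return self.rows[state].get(ch, -1)

for table in (ArrayTransitions(), HashTransitions()):
    s0, s1 = table.add_state(), table.add_state()
    table.set(s0, "é", s1)
    print(type(table).__name__, table.get(s0, "é"), table.get(s0, "x"))
```

Both classes answer lookups identically; they differ in how much memory and initialization time each state costs, which matches the abstract's observation about the array implementation under Unicode.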


Prefix-free regular languages and pattern matching


THEORETICAL COMPUTER SCIENCE, 2007, vol. 389, no. 1-2, pp. 307-317

Authors: Han, Yo-Sub; Wang, Yajun; Wood, Derick. Korea Inst Sci & Technol, Intelligence & Interact Res Ctr, Seoul 130650, South Korea; Hong Kong Univ Sci & Technol, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China

We explore the regular-expression matching problem with respect to prefix-freeness of the pattern. We prove that a prefix-free regular expression gives only a linear number of matching substrings in the size of a given text. Based on this observation, we propose an efficient algorithm for the prefix-free regular-expression matching problem. Furthermore, we suggest an algorithm to determine whether or not a given regular language is prefix-free. (c) 2007 Elsevier B.V. All rights reserved.

Keywords: string pattern matching; regular-expression matching; prefix-free regular languages; pruned prefix-free languages
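As a rough illustration of what "prefix-free" means operationally, the sketch below tests the property on a deterministic finite automaton given explicitly: the language is prefix-free exactly when no reachable accepting state can reach an accepting state by a nonempty path. This is a standard DFA-based check under the assumption that the language is supplied as a DFA; it is not claimed to be the algorithm suggested in the paper, and `is_prefix_free` is a hypothetical name.

```python
def is_prefix_free(start, finals, delta):
    """Check prefix-freeness of a DFA's language.

    start  -- the initial state
    finals -- set of accepting states
    delta  -- dict mapping (state, symbol) -> state (may be partial)
    """
    succ = {}
    for (state, _symbol), target in delta.items():
        succ.setdefault(state, set()).add(target)

    def reachable(source):
        """States reachable from source by a path of length >= 1."""
        seen, stack = set(), [source]
        while stack:
            state = stack.pop()
            for target in succ.get(state, ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen

    live = reachable(start) | {start}
    # A violation is an accepted word w that can be extended by a
    # nonempty x so that wx is also accepted.
    return not any(reachable(f) & finals for f in finals & live)

# a*b : no accepted word is a proper prefix of another accepted word.
print(is_prefix_free(0, {1}, {(0, "a"): 0, (0, "b"): 1}))
# a+  : "a" is a proper prefix of "aa".
print(is_prefix_free(0, {1}, {(0, "a"): 1, (1, "a"): 1}))
```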


Efficient multi-attribute pattern matching


INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 1998, vol. 66, no. 1-2, pp. 21-38

Authors: Ando, K; Mizobuchi, S; Shishibori, M; Aoe, J. Univ Tokushima, Dept Informat Sci & Intelligent Syst, Tokushima 770, Japan

This paper describes an efficient multi-attribute pattern matching machine to locate all occurrences of any of a finite number of sequences of rule structures in a series of input structures. The matching operation of the proposed machine is similar to the method of Aho-Corasick or to retrieval using a trie. However, the proposed machine has the following distinctive features: (1) it enables us to match set representations containing multiple attributes; (2) it enables us to match separate components; (3) it enables us to match a rule consisting of an exclusive set. In this paper, these features are described in detail. Moreover, the pattern matching algorithm is evaluated by theoretical and experimental observations, supported by simulation results for a variety of document-processing rules such as text proofreading, text reduction, and examining relations between sentences.

Keywords: information retrieval; string pattern matching; multi-attribute pattern matching; set representation; separate components; exclusive set


An improvement of the Aho-Corasick machine


INFORMATION SCIENCES, 1998, vol. 111, no. 1-4, pp. 139-151

Authors: Ando, K; Kinoshita, T; Shishibori, M; Aoe, J. Univ Tokushima, Dept Informat Sci & Intelligent Syst, Tokushima 770, Japan

Aho and Corasick presented a string pattern matching machine to locate multiple keywords. However, the AC machine could not match multi-attribute information. This paper describes an efficient multi-attribute pattern matching machine to locate all occurrences of any of a finite number of sequences of rule structures (called matching rules) in a sequence of input structures. The proposed algorithm enables us to match set representations containing multiple attributes. A transition is therefore confirmed according to whether or not the input structure includes the rule structure. Finally, the pattern matching algorithm is evaluated by theoretical analysis, and the evaluation is supported by simulation results with rules for the extraction of keywords. (C) 1998 Elsevier Science Inc. All rights reserved.

Keywords: string pattern matching; multi-attribute pattern matching; set representation; finite state pattern matching machine; matching algorithm; constructing algorithm
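The inclusion test mentioned in the abstract (a transition is taken when the input structure includes the rule structure) can be illustrated with plain Python sets. The sketch below is a naive sliding comparison, not the finite-state machine the paper constructs; the function name `match_rules`, the rule name, and the attribute strings are all made up for the example.

```python
def match_rules(rules, inputs):
    """Return sorted (start_index, rule_name) pairs.

    rules  -- dict mapping a rule name to a list of attribute sets
    inputs -- list of attribute sets, one per input structure
    A rule matches at index i when every rule structure is a subset of
    the input structure at the corresponding position.
    """
    hits = []
    for name, rule in rules.items():
        m = len(rule)
        for i in range(len(inputs) - m + 1):
            if all(rule[j] <= inputs[i + j] for j in range(m)):
                hits.append((i, name))
    return sorted(hits)

# Hypothetical attributes: each input token carries a surface form and tags.
inputs = [
    {"word:the", "pos:det"},
    {"word:fast", "pos:adj"},
    {"word:machine", "pos:noun"},
]
rules = {"adj+noun": [{"pos:adj"}, {"pos:noun"}]}
print(match_rules(rules, inputs))
```

This naive version rescans the input for every rule; an Aho-Corasick-style machine as described in the abstract would share that work across rules.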


In-place algorithms for exact and approximate shortest unique substring problems


THEORETICAL COMPUTER SCIENCE, 2017, vol. 690, pp. 12-25

Authors: Hon, Wing-Kai; Thankachan, Sharma V.; Xu, Bojian. Natl Tsing Hua Univ, Dept Comp Sci, Hsinchu 300, Taiwan; Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816, USA; Eastern Washington Univ, Dept Comp Sci, Cheney, WA 99004, USA

We revisit the exact shortest unique substring (SUS) finding problem, and propose its approximate version where mismatches are allowed, due to its applications in subfields such as computational biology. We design a generic in-place framework that solves both the exact and approximate k-mismatch SUS finding, using the minimum 2n memory words, each of ⌈log_2 n⌉ bits, plus n bytes of space, where n is the input string size. By using the in-place framework, we can find the exact and approximate k-mismatch SUS for every string position using a total of O(n) and O(n^2) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and thus is practical and easy to implement. Experimental study shows that the peak memory usage of our proposal is consistently 9n bytes for any string of size n, validating the claim that our solution is in-place. Further, our proposal uses much less memory and is much faster than the currently best work that has an implementation for exact SUS finding. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: string pattern matching; Shortest unique substring; In-place algorithms
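For readers unfamiliar with the problem, the brute-force sketch below states the exact SUS query for one position: find a shortest substring that covers the position and occurs exactly once in the string. It is deliberately naive and is neither in-place nor linear-time, so it should not be mistaken for the framework in the paper; the function names are hypothetical, and ties are broken by taking the leftmost candidate, which is an assumption of this sketch.

```python
def count_occurrences(s, sub):
    """Count (possibly overlapping) occurrences of sub in s."""
    return sum(1 for i in range(len(s) - len(sub) + 1) if s.startswith(sub, i))

def shortest_unique_substring(s, k):
    """Return (start, substring) of a shortest unique substring covering s[k].

    Candidates are tried by increasing length, then by increasing start,
    so the leftmost shortest one is returned. Returns None for empty s.
    """
    n = len(s)
    for length in range(1, n + 1):
        first = max(0, k - length + 1)   # leftmost start that still covers k
        last = min(k, n - length)        # rightmost start that fits in s
        for start in range(first, last + 1):
            sub = s[start:start + length]
            if count_occurrences(s, sub) == 1:
                return start, sub
    return None

print(shortest_unique_substring("abab", 1))
```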

