ISBN: (print) 9781450366069
The inference of regular expressions from a finite number of samples has important applications in fields such as information extraction, XML schema learning, and biological sequence analysis. In this paper, we present an algorithm for learning regular expressions based on repeated string detection. The algorithm can learn a subclass of regular expressions in which unary operators such as Kleene star and Kleene plus can be applied to sequences of multiple characters. Preliminary experimental results demonstrate the effectiveness and efficiency of the proposed algorithm.
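The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea of repeated string detection, not the authors' method. It assumes a single sample string and a greedy left-to-right scan; the function name `collapse_repeats` is hypothetical. Each run of consecutive repeats of a substring is generalized to that substring under Kleene plus, so the operator can apply to several characters at once.

```python
import re


def collapse_repeats(sample: str) -> str:
    """Generalize one sample into a regex by collapsing consecutive repeats.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    out = []
    i = 0
    n = len(sample)
    while i < n:
        best_len, best_count = 0, 1
        # Try every unit length that could repeat at least twice from position i.
        for unit_len in range(1, (n - i) // 2 + 1):
            unit = sample[i:i + unit_len]
            count = 1
            # Count how many times the unit repeats back to back.
            while sample.startswith(unit, i + count * unit_len):
                count += 1
            # Keep the unit whose repeats cover the most characters;
            # on ties the shorter unit found first is kept.
            if count >= 2 and unit_len * count > best_len * best_count:
                best_len, best_count = unit_len, count
        if best_len:
            unit = re.escape(sample[i:i + best_len])
            # Single characters need no group; longer units are parenthesized.
            out.append(f"{unit}+" if best_len == 1 else f"({unit})+")
            i += best_len * best_count
        else:
            # No repeat starts here, so emit the character as a literal.
            out.append(re.escape(sample[i]))
            i += 1
    return "".join(out)


print(collapse_repeats("abababc"))  # (ab)+c
print(collapse_repeats("aaab"))     # a+b
```

A practical learner would also have to merge the expressions obtained from multiple samples and decide between Kleene star and Kleene plus; this sketch leaves both out.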