In parameterized string matching, a pattern P is said to match a substring t of the text T if there exists a bijection from the symbols of P to the symbols of t. This problem has an important applicatio...
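The bijection condition in this definition can be illustrated with a minimal sketch. It is a naive window-by-window check, not the algorithm from the paper, and it assumes every symbol is a parameter (no fixed symbols). The names `p_match` and `p_search` are hypothetical.

```python
def p_match(pattern: str, window: str) -> bool:
    """Return True if a bijection maps the symbols of pattern onto those of window."""
    if len(pattern) != len(window):
        return False
    forward = {}   # pattern symbol -> window symbol
    backward = {}  # window symbol -> pattern symbol
    for a, b in zip(pattern, window):
        # setdefault records the first pairing; any later conflict breaks the bijection.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def p_search(pattern: str, text: str) -> list[int]:
    """Naive scan: test every window of len(pattern) in text, O(n * m) time."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if p_match(pattern, text[i:i + m])]
```

For example, `p_match("abab", "xyxy")` holds because a maps to x and b maps to y, while `p_match("ab", "xx")` fails because two pattern symbols would map to the same text symbol.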
The Boyer-Moore algorithm uses two pre-computed tables for searching a string: skip, which utilizes the occurrence heuristic of symbols in a pattern, and shift, which utilizes the match heuristic of the pattern. Resear...
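As a rough illustration of the occurrence heuristic, the sketch below builds a skip table and uses it in the Horspool simplification of Boyer-Moore. This is a swapped-in simplification: it omits the shift (match heuristic) table entirely, and the names `build_skip` and `horspool_search` are hypothetical rather than taken from the paper.

```python
def build_skip(pattern: str) -> dict[str, int]:
    """Occurrence table: distance from a symbol's last occurrence
    (excluding the final position) to the end of the pattern."""
    m = len(pattern)
    # Later occurrences overwrite earlier ones, so the smallest safe skip is kept.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(pattern: str, text: str) -> list[int]:
    """Return all start positions of pattern in text using only the skip table."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    skip = build_skip(pattern)
    hits = []
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        # Slide by the skip of the text symbol aligned with the pattern's last
        # position; symbols absent from the table allow a full-length jump.
        i += skip.get(text[i + m - 1], m)
    return hits
```

The full Boyer-Moore algorithm would take the larger of the skip and shift values at each mismatch; using skip alone keeps the sketch short at the cost of smaller jumps on repetitive patterns.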
ISBN: 9783540690665 (print)
Suffix trees are by far the most important data structure in stringology, with myriad applications in fields like bioinformatics, data compression and information retrieval. Classical representations of suffix trees require O(n log n) bits of space for a string of size n. This is considerably more than the n log_2 σ bits needed for the string itself, where σ is the alphabet size. The size of suffix trees has been a barrier to their wider adoption in practice. A recent so-called fully-compressed suffix tree (FCST) requires asymptotically only the space of the text entropy. FCSTs, however, have the disadvantage of being static, not supporting updates to the text. In this paper we show how to support dynamic FCSTs within the same optimal space of the static version and executing all the operations in polylogarithmic time. In particular, we are able to build the suffix tree within optimal space.
ISBN: 9781424419777 (print)
In this paper, we describe a metasearch tool resulting from experiments in aggregating the results of different name matching algorithms on a knowledge-intensive multicultural name matching task. Three retrieval engines that match Romanized names were tested on a noisy and predominantly Arabic dataset. One is based on a genetic string matching algorithm; another is designed specifically for Arabic names; and the third makes use of culturally-specific matching strategies for multiple cultures. We show that even a relatively naive method for aggregating results significantly increased effectiveness over each of the individual algorithms, nearly tripling the F-score of the worst-performing algorithm included in the aggregate and yielding a 6-point improvement in F-score over the single best-performing algorithm included.
ISBN: 9783540699040 (print)
A string matching approach is proposed to find a region correspondence between two images. Regions and their spatial relationships are represented by two combinatorial pyramids encoding two segmentation hierarchies. Our matching algorithm is decomposed into two steps. We first require that the features of the two matched regions be similar. This threshold on the similarity of the regions to be matched is used as a pruning step. We then require that at least one cut can be determined in each hierarchy such that the cyclic sequences of neighbors of the two matched regions have similar features. This distance is based on a circular string matching algorithm which uses both the orientability of the plane and the hierarchical encoding of the two regions to reduce the computational cost of the matching and improve its robustness.
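To make the idea of comparing cyclic sequences concrete, here is a brute-force sketch that takes the smallest edit distance between one sequence and every rotation of the other. It is an assumption-laden illustration only: it ignores the hierarchical encoding and the cost reductions the abstract describes, it uses unit edit costs on plain strings in place of region features, and the names `edit_distance` and `min_rotation_distance` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # substitute or match
        prev = cur
    return prev[-1]


def min_rotation_distance(a: str, b: str) -> int:
    """Smallest edit distance between b and any rotation of a (brute force)."""
    if not a:
        return len(b)
    return min(edit_distance(a[k:] + a[:k], b) for k in range(len(a)))
```

Because the neighbors of a region form a cycle with no distinguished starting point, `min_rotation_distance("abcd", "cdab")` is 0 even though the plain edit distance between the two strings is not. The brute-force loop costs one full edit-distance computation per rotation, which is the kind of overhead a dedicated circular matching algorithm aims to avoid.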
ISBN: 9780615248493 (print)
The analysis of expressive performance, an important research topic in Computer Music, is almost exclusively devoted to the study of Western Classical piano music. Instruments like the acoustic guitar and styles like Bossa Nova and Samba have been little studied, despite their harmonic and rhythmic richness. This paper describes some experimental results obtained with the extraction of rhythmic patterns from the guitar accompaniment of Bossa Nova songs. The songs, played by two different performers and recorded with the help of a MIDI guitar, were represented as strings and processed by FlExPat, a string matching algorithm. The results obtained were then compared to a previously acquired catalogue of "good" patterns.
Exact matching of single patterns in DNA and amino acid sequences is studied. We performed an extensive experimental comparison of algorithms presented in the literature. In addition, we introduce new variations of ea...
ISBN: 9781424419685 (print)
The proceedings contain 175 papers. The topics discussed include: a multipattern matching algorithm using sampling and bit index; an evolutionary gait generator with online parameter adjustment for humanoid robots; solving MEC model of haplotype reconstruction using information fusion, single greedy and parallel clustering approaches; methodology for evaluating string matching algorithms on multiprocessor; a novel approach for solving the multiple sequence alignment problem; a framework for predicting proteins 3D structures; intelligent heart disease prediction system using data mining techniques; DMPML - data mining preparation markup language; enumeration of maximum clique for mining spatial co-location patterns; a methodology for discovering spatial co-location patterns; a cellular automata approach to detecting concept drift and dealing with noise; and rapid and robust ranking of text documents in a dynamically changing corpus.
ISBN: 9781424421718 (print)
This paper introduces an efficient string matching algorithm for in-place reconstructible delta compression. Some existing in-place reconstruction algorithms enable embedded systems with limited memory to create the new file version in the memory space that the current file version occupies. However, these algorithms increase the size of the delta file transmitted over low-bandwidth channels, or increase the decoding complexity. To solve these problems, we developed an alleviated greedy algorithm that creates the delta file for in-place reconstruction. The algorithm lowers decoding complexity and produces a smaller delta file than the existing in-place reconstruction algorithm. Delta size and the memory required for decoding are compared experimentally to an existing delta compression technique.
There are numerous exact string matching algorithms that have similar performance characteristics. Which algorithm is best depends on the length of the pattern being searched for, the number of letters in the alphabet...