Here we introduce two new notions of approximate matching with applications in computer-assisted music analysis. We present algorithms for each notion of approximation: for approximate string matching and for computing approximate squares.
We present in this article a linear time and space method for computing the length of a repeated suffix for each prefix of a given word p. Our method is based on the factor oracle of p, a new and very compact structure introduced in [1] for representing all the factors of p. We exhibit applications where our method speeds up the computation of repetitions in words.
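As a rough illustration of the structure this abstract relies on, the sketch below builds a factor oracle with the usual on-line construction (one state per prefix, plus a supply link per state). It is a minimal sketch, not the authors' repeated-suffix method: the function names `build_factor_oracle` and `accepts` are hypothetical, and the repeated-suffix lengths themselves are not computed here.

```python
def build_factor_oracle(p):
    """Build the factor oracle of p.

    Returns (transitions, supply), where transitions[i] maps a letter to
    the target state and supply[i] is the supply link of state i.
    State i corresponds to the prefix p[:i].
    """
    m = len(p)
    transitions = [dict() for _ in range(m + 1)]
    supply = [-1] * (m + 1)  # the initial state has no supply link
    for i, c in enumerate(p):
        # Internal transition along the word itself.
        transitions[i][c] = i + 1
        # Walk the supply links, adding external transitions where missing.
        k = supply[i]
        while k != -1 and c not in transitions[k]:
            transitions[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else transitions[k][c]
    return transitions, supply


def accepts(transitions, w):
    """Return True if w labels a path from the initial state."""
    state = 0
    for c in w:
        if c not in transitions[state]:
            return False
        state = transitions[state][c]
    return True
```

Every factor of p labels a path from state 0, but the converse need not hold: for p = "abbbaab" this sketch also accepts "aba", which is not a factor. The automaton has len(p) + 1 states and at most 2·len(p) − 1 transitions, which is what makes it compact.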
ISBN: 0769512844 (print)
In this paper, we examine a number of methods for text processing, principally coming from computational biology, and examine how they can be applied to musical analysis. We then propose a number of modifications to these methods that make them better suited to musical analysis. To this end, we first examine the practices of musical analysis. We focus on the extraction of motives. A common approach to this problem is to consider repetitions: whenever some part of the musical text is repeated, it can be considered a motive. Detecting motives can be based either on perfect matching or on inexact matching. To this end, the concept of similarity will be introduced and analysed, and its meaning will be defined in the scope of musical analysis. We also deal with the problem of the representation (or encoding) of the musical text. The role of encoding and its consequences for the application of algorithms will be investigated.
This is a theoretical study of partially occluded one-dimensional images. Here, we consider "valid" images composed from a given set of objects, where some objects appearing in the image may be partially obstructed by others. A CRCW PRAM algorithm is presented here for validating a one-dimensional image x of length n over a set of k objects of equal length in O(log log n) time with linear work, where k is a fixed integer. (C) 2000 Published by Elsevier Science B.V. All rights reserved.
Two linear time algorithms are presented: one for determining, for every position in a given square matrix, the longest prefix of a given pattern (also a square matrix) that occurs at that position, and one for computing all square covers of a given two-dimensional square matrix.
We present a new sublinear-size index structure for finding all occurrences of a given q-gram in a text. Such a q-gram index is needed in many approximate pattern matching algorithms. All earlier q-gram indexes require at least O(n) space, where n is the length of the text. The new Lempel-Ziv index needs only O(n/log n) space while being as fast as previous methods. The new method takes advantage of repetitions in the text found by Lempel-Ziv parsing.
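For orientation, the kind of query such an index answers can be sketched with a plain dictionary-based q-gram index. This is the straightforward linear-space baseline, not the Lempel-Ziv index described in the abstract; the names `build_qgram_index` and `occurrences` are hypothetical, and q is assumed to be a positive integer.

```python
from collections import defaultdict


def build_qgram_index(text, q):
    """Map every q-gram of text to the sorted list of its start positions.

    Baseline structure only: it stores one entry per text position,
    so its size grows linearly with len(text).
    """
    index = defaultdict(list)
    for i in range(len(text) - q + 1):
        index[text[i:i + q]].append(i)
    return index


def occurrences(index, gram):
    """Return all start positions of gram, or an empty list if absent."""
    return index.get(gram, [])
```

For example, with `index = build_qgram_index("abracadabra", 3)`, the call `occurrences(index, "abr")` returns `[0, 7]`.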
We present an almost linear time method of inductive synthesis restoring simple regular expressions from one representative (good) example. In particular, we consider synthesis of expressions of star-height one, where we allow one union operation under each iteration, and synthesis of expressions without union operations from examples that may contain mistakes. In both cases we provide sufficient conditions defining precisely the class of target expressions and the notion of good examples under which the synthesis algorithm works correctly, and present the proof of correctness. In the case of expressions with unions the proof is based on novel results in the combinatorics of words. A generalized algorithm that can synthesize simple expressions containing unions from noisy examples is implemented as a computer program. Computer experiments show that the algorithm is quite practical and may have applications in genome informatics.
We consider the problem of finding the repetitive structures of a given string x. The period u of the string x grasps the repetitiveness of x, since x is a prefix of a string constructed by concatenations of u. We generalize the concept of repetitiveness as follows: a string w covers a string x if there is a superstring of x which is constructed by concatenations and superpositions of w. A substring w of x is called a seed of x if w covers x. We present an O(n log n)-time algorithm for finding all the seeds of a given string of length n.
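The notion of period that the abstract starts from can be made concrete with a short sketch. It computes the shortest period of x from the classical border (failure function) table; it does not implement the O(n log n) seed-finding algorithm, and the function name `smallest_period` is hypothetical.

```python
def smallest_period(x):
    """Return the length of the shortest period of x (0 for the empty string).

    Uses the fact that the shortest period equals len(x) minus the length
    of the longest proper border of x.
    """
    n = len(x)
    if n == 0:
        return 0
    border = [0] * n  # border[i]: longest proper border of x[:i + 1]
    k = 0
    for i in range(1, n):
        while k > 0 and x[i] != x[k]:
            k = border[k - 1]
        if x[i] == x[k]:
            k += 1
        border[i] = k
    return n - border[-1]
```

For x = "abaabaab" the result is 3: the period u = "aba" repeated three times gives "abaabaaba", of which x is a prefix.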
We present an optimal O(log log n) time algorithm on the CRCW PRAM which tests whether a square array, A, of size n × n, is superprimitive. If A is not superprimitive, the algorithm returns the quasiperiod, i.e.,...
ISBN: 3540600442 (print)
Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base-pairing between A-U and C-G pairs in the string. Edges are drawn between two bases that are paired in the secondary structure and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the secondary structure of RNA strings. In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and secondary base-pairing structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA string with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings.
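To make the noncrossing base-pairing concrete, the sketch below recovers the paired positions from a sequence and a structure string. It assumes the structure is given in dot-bracket notation ("(" and ")" for paired bases, "." for unpaired ones), which is an assumption of this example and not something the abstract specifies; the function name `parse_pairs` is hypothetical, and none of the paper's matching or alignment algorithms are implemented.

```python
def parse_pairs(sequence, structure):
    """Return the sorted list of paired positions (i, j) with i < j.

    Raises ValueError if the structure is unbalanced, contains unknown
    symbols, or pairs bases other than A-U or C-G.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    allowed = ({"A", "U"}, {"C", "G"})
    stack = []   # positions of currently open brackets
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            i = stack.pop()
            if {sequence[i], sequence[j]} not in allowed:
                raise ValueError(f"invalid pair at positions {i}, {j}")
            pairs.append((i, j))
        elif symbol != ".":
            raise ValueError(f"unknown symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return sorted(pairs)
```

Because a stack matches each closing bracket with the most recent open one, the resulting pairs are nested or disjoint and never cross, which is what gives the tree-like representation: a pair (i, j) is the parent of the pairs directly enclosed by it.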