Casual mutations and natural selection have driven the evolution of protein amino acid sequences that we observe at present in nature. The question about which is the dominant force of proteins evolution is still lack...
详细信息
Casual mutations and natural selection have driven the evolution of protein amino acid sequences that we observe at present in nature. The question about which is the dominant force of proteins evolution is still lacking of an unambiguous answer. Casual mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable to have a well tuned amino acid sequence indistinguishable from a random one. In order to study the possibility to discriminate between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. (C) 2015 Elsevier Ltd. All rights reserved.
Let phi = (1+root 5)/2 denote the golden section. We investigate relationships between unbounded iterations of the floor function applied to various combinations of phi and phi(2). We use them to formulate an algebrai...
详细信息
Let phi = (1+root 5)/2 denote the golden section. We investigate relationships between unbounded iterations of the floor function applied to various combinations of phi and phi(2). We use them to formulate an algebraic polynomial-time winning strategy for a new four-pile take-away game Flora, which is motivated by partitioning the set of games into subsets CompGames and PrimGames. We further formulate recursive, arithmetic, and word-mapping winning strategies for it. The arithmetic one is based on the Fibonacci numeration system. We further show how to generate the floor words induced by the iterations using word-mappings and characterize them using the Fibonacci numeration system. We also exhibit an infinite array of such sequences.
The block reversal of a word is a generalization of the concept of reversal of a word where in place of reversing individual letters, we take the blocks of the word in the reverse order. Since there can be multiple wa...
详细信息
The block reversal of a word is a generalization of the concept of reversal of a word where in place of reversing individual letters, we take the blocks of the word in the reverse order. Since there can be multiple ways in which a word can be divided into blocks, the block reversal of a word forms a set. We study some properties of the block reversal of a word. We characterize words that have the same block reversal set. We find a relation between the block reversal and the non-overlapping inversion defined by Schoniger and Waterman in the year 1992. We characterize words with the minimum and the maximum number of elements in the block reversal. We then study the distribution of palindromes in the block reversal of a word. (C) 2021 Elsevier B.V. All rights reserved.
We relate binary words with a given number of subsequences to continued fractions of rational numbers with a given denominator. We deduce that there are binary strings of length O(log n log log n) with exactly n subse...
详细信息
We relate binary words with a given number of subsequences to continued fractions of rational numbers with a given denominator. We deduce that there are binary strings of length O(log n log log n) with exactly n subsequences;this can be improved to O(log n) under assumption of Zaremba's conjecture. (C) 2022 Elsevier B.V. All rights reserved.
In this paper we study a generalization of the classical notions of bordered and unbordered words, motivated by DNA computing. DNA strands can be viewed as finite strings over the alphabet {A, G, C, T}, and are used i...
详细信息
In this paper we study a generalization of the classical notions of bordered and unbordered words, motivated by DNA computing. DNA strands can be viewed as finite strings over the alphabet {A, G, C, T}, and are used in DNA computing to encode information. Due to the fact that A is Watson-Crick complementary to T and G to C, DNA single strands that are. Watson-Crick complementary can bind to each other or to themselves in either intended or unintended ways. One of the structures that is usually undesirable for biocomputation, since it makes the affected DNA string unavailable for future interactions, is the hairpin: If some subsequences of a DNA single string are Complementary to each other, the string will bind to itself forming a hairpin-like structure. This paper studies a mathematical formalization of a particular case of hairpins, the Watson-Crick bordered words. A Watson-Crick bordered word is a word with the property that it has a prefix that is Watson-Crick complementary to its suffix. More generally, we investigate the notion of theta-bordered words, where theta is a morphic or antimorphic involution. We show that the set of all theta-bordered words is regular, when theta is an antimorphic involution and the set of all theta-bordered words is context-sensitive when theta is a morphic involution. We study the properties of theta-bordered and theta-unbordered words and also the relation between theta-bordered and theta-unbordered words and certain type of involution codes.
An extension of the problem of reconstruction of words from a given multiset of its subwords supposedly generated by unit shifts of a window of fixed length along such words. In the extension, feasible solutions must ...
详细信息
An extension of the problem of reconstruction of words from a given multiset of its subwords supposedly generated by unit shifts of a window of fixed length along such words. In the extension, feasible solutions must satisfy additional constraints. The case is considered when these constraints are specified by forbidden words. A solution to the problem is obtained as a result of searching for Euler paths or Euler cycles in a de Bruijn multidigraph with the additional operation of reduction of its edges and subsequent application of special algebraic operations of multiplication of adjacency matrices as in the first part of this article.
Among Sturmian words, some of them are morphic, i.e. fixed point of a non-identical morphism on words. Berstel and Seebold (1993) have shown that if a characteristic Sturmian word is morphic, then it can be extended b...
详细信息
Among Sturmian words, some of them are morphic, i.e. fixed point of a non-identical morphism on words. Berstel and Seebold (1993) have shown that if a characteristic Sturmian word is morphic, then it can be extended by the left with one or two letters in such a way that it remains morphic and Sturmian. Yasutomi (1997) has proved that these were the sole possible additions and that, if we cut the first letters of such a word, it didn't remain morphic. In this paper, we give an elementary and combinatorial proof of this result.
A language which has only a finite number of primitive roots is like a finite-dimensional vector space having a finite basis. Such a language is called a local language. The purpose of this paper is to study some of t...
详细信息
A language which has only a finite number of primitive roots is like a finite-dimensional vector space having a finite basis. Such a language is called a local language. The purpose of this paper is to study some of the algebraic properties of local languages. We show that every local context-free language is regular which generalizes the well-known fact that every context-free language over one letter alphabet is regular. We also derive that a skinny context-free language is local or not is decidable.
A word v is said to be a proper d-factor of a word u if v not equal u and u = vx = yv for some words x, y. The family of words which have i distinct proper d-factors is denoted by D(i). According to the number of dist...
详细信息
A word v is said to be a proper d-factor of a word u if v not equal u and u = vx = yv for some words x, y. The family of words which have i distinct proper d-factors is denoted by D(i). According to the number of distinct proper d-factors of words, the free semigroup X+ generated by X can be expressed as the disjoint union of D(i)'s. words in D(1) are called d-minimal words. d-Minimal words are often called non-overlapping words, dipolar words or unborded words. In this paper, we study the relationship between D-i(I) and D(i) concerning the basic properties of decompositions and catenations of words. We give characterizations of words in D-2(1)boolean AND D(I) and D(2). We also show that sets D-i(I)\D(j) and D(j)\D-i(1) are disjunctive. It is known that every disjunctive language is dense but not regular. We obtain the results that X+D(1) and X+D(2) are regular but X+D(i) is disjunctive for every i greater than or equal to 4. Served as an example of disjunctive d-minimal context-free languages, a disjunctive d-minimal context-free language is constructed. Moreover, we show that the well-known Dyck language is a free semigroup generated by a d-minimal bifix code. The languages of which the catenations consist of d-minimal words are studied in this paper too. That is, some properties of d-minimality-annihilators of languages are investigated. (C) 1998 Elsevier Science B.V. All rights reserved.
As every non-empty word is a power of a unique primitive word, a set of primitive roots of a language is like an independent subset of a vector space. A language having finitely many primitive roots is called a local ...
详细信息
As every non-empty word is a power of a unique primitive word, a set of primitive roots of a language is like an independent subset of a vector space. A language having finitely many primitive roots is called a local language. The purpose of this paper is to characterize local regular languages. We show that whether a regular language is local or not is decidable. In the meanwhile, two characterizations of local regular languages are derived. (C) 1998 Elsevier Science B.V. All rights reserved.
暂无评论