Background: Regulation of gene transcription is crucial for the function and development of all organisms. While gene prediction programs that identify protein coding sequence are used with remarkable success in the annotation of genomes, the development of computational methods to analyze noncoding regions and to delineate transcriptional control elements is still in its infancy. Results: Here we present novel algorithms to detect cis-regulatory modules through genome-wide scans for clusters of transcription factor binding sites using three levels of prior information. When binding sites for the factors are known, our statistical segmentation algorithm, Ahab, yields about 150 putative gap gene regulated modules, with no adjustable parameters other than a window size. If one or more related modules are known, but no binding sites, repeated motifs can be found by a customized Gibbs sampler and input to Ahab, to predict genes with similar regulation. Finally, using only the genome, we developed a third algorithm, Argos, that counts and scores clusters of overrepresented motifs in a window of sequence. Argos recovers many of the known modules, upstream of the segmentation genes, with no training data. Conclusions: We have demonstrated, in the case of body patterning in the Drosophila embryo, that our algorithms allow the genome-wide identification of regulatory modules. We believe that Ahab overcomes many problems of recent approaches and we estimated the false positive rate to be about 50%. Argos is the first successful attempt to predict regulatory modules using only the genome without training data. Complete results and module predictions across the Drosophila genome are available at [http://***/~siggia/].
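The window-based idea in this abstract, counting binding-site matches in a sliding window of sequence, can be illustrated with a minimal sketch. This is not Ahab or Argos: it uses exact string matching rather than the statistical segmentation or overrepresentation scoring described above, and the function names, window size, and step size are hypothetical choices made for illustration only.

```python
def count_sites_in_windows(sequence, motifs, window=500, step=100):
    """Return (start, count) pairs: exact motif matches per sliding window.

    Illustrative only; real module finders score weight matrices
    against a background model instead of counting exact matches.
    """
    results = []
    last_start = max(len(sequence) - window, 0)
    for start in range(0, last_start + 1, step):
        segment = sequence[start:start + window]
        count = 0
        for motif in motifs:
            # Count overlapping exact occurrences of this motif.
            pos = segment.find(motif)
            while pos != -1:
                count += 1
                pos = segment.find(motif, pos + 1)
        results.append((start, count))
    return results


def densest_window(sequence, motifs, window=500, step=100):
    """Return the (start, count) pair of the window with the most matches."""
    counts = count_sites_in_windows(sequence, motifs, window, step)
    return max(counts, key=lambda pair: pair[1])
```

Windows with unusually many matches would be candidate clusters; deciding what counts as "unusually many" is exactly the statistical question the algorithms in the abstract address.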
The Gnaphalieae are a group of sunflowers that have their greatest diversity in South America, Southern Africa, and Australia. The objective of this study was to reconstruct a phylogeny of the South African Gnaphalieae using sequence data from two noncoding chloroplast DNA sequences, the trnL intron and trnL/trnF intergenic spacer. Included in this investigation are the genera of the Gnaphalieae from the African basal groups, members of the subtribes Cassiniinae, Gnaphaliinae, and Relhaniinae, and African representatives from the large Old World genus Helichrysum. Results indicate that two Gnaphaloid genera, Printzia and Callilepis, should be excluded from the Gnaphalieae. In most trees the Relhaniinae s.s. (sensu stricto) and some of the basal taxa comprise a clade that is sister to the remainder of the tribe Gnaphalieae. The Relhaniinae, which are restricted to Africa, are not a monophyletic group as presently circumscribed, nor are the South African members of Helichrysum, the Cassiniinae and Gnaphaliinae. There is general agreement between our molecular analysis and that of morphology, particularly in the terminal branches of the trees.
About 1.6 kb of the noncoding region upstream of the muscular dystrophin gene was sequenced in human and other primates. The alignment showed many stretches of sequence conserved among the compared species, distributed all along the investigated fragment, including the 5' end. At positions corresponding to these conserved boxes, we identified several new putative cis-acting elements that have similarity to known control regions of other muscle-specific genes. In some cases, however, the conserved sequences did not correspond to any known transcription factor binding sites. The rate of evolution, estimated site by site along the investigated region, revealed a nonhomogeneous distribution of the substitution rate; several sequences exhibited a very slow rate of evolution, suggesting that evolutionary forces of different nature may be at work. On the basis of the sequence alignment, we reconstructed the phylogenetic relationships within the hominoid lineage. In addition, we estimated the relative rate of evolution between hominoids and Old World monkeys, confirming the existence of an evolutionary slowdown in the hominoid lineage.
It is proposed that proteins can bind with relatively low affinity and specificity to multiple sites, defined as sequence motifs, on polynucleotide chains, and that such binding can collectively be turned into high-affinity, high-specificity binding through cooperative effects, especially when the sequence motifs recur periodically. The selection of individual nucleotides has in general been thought to be a condition for the existence and conservation of function in most noncoding sequences. This condition seems unnecessary. Calculations are presented as a step in the direction of giving credibility to a model of stable gene repression.
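The abstract does not give its calculations, so the following is only a rough sketch of why many weak sites can add up to strong binding. It assumes that binding free energies are additive, so that per-site association constants multiply, and that each adjacent pair of bound proteins contributes one cooperativity factor. The function name and the numerical values are hypothetical.

```python
import math


def combined_association_constant(site_constants, cooperativity=1.0):
    """Association constant for occupying every site at once.

    Assumes additive free energies: per-site constants multiply, and
    each of the n - 1 neighbouring pairs adds one cooperativity factor.
    """
    n = len(site_constants)
    return math.prod(site_constants) * cooperativity ** max(n - 1, 0)


# Three weak sites (1e4 each) with a tenfold cooperative bonus per pair.
weak_sites = [1e4, 1e4, 1e4]
combined = combined_association_constant(weak_sites, cooperativity=10.0)
```

Under these assumptions the fully occupied state is far more stable than any single weak site would suggest, which is the qualitative point of the proposal.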