Cornelia de Lange syndrome (CdLS; MIM# 122470) is a rare developmental disorder. Pathogenic variants in 5 genes explain approximately 50% of cases, leaving the other 50% unsolved. We performed whole genome sequencing (WGS) +/- RNA sequencing (RNA-seq) in 5 unsolved trios fulfilling the following criteria: (i) clinical diagnosis of classic CdLS, (ii) negative gene panel sequencing from blood and saliva-isolated DNA, (iii) unaffected parents' DNA samples available and (iv) proband's blood-isolated RNA available. A pathogenic de novo mutation (DNM) was observed in a CdLS differential diagnosis gene in 3/5 patients, namely POU3F3, SPEN, and TAF1. In the other two, we identified two distinct deep intronic DNMs in NIPBL predicted to create a novel splice site. RT-PCR and RNA-seq showed aberrant transcripts leading to the creation of a novel frameshift exon. Our findings suggest the relevance of WGS in unsolved suspected CdLS cases and that deep intronic variants may account for a proportion of them.
Palaeognathae consists of five groups of extant species: flighted tinamous (1) and four flightless groups: kiwi (2), cassowaries and emu (3), rheas (4), and ostriches (5). Molecular studies supported the groupings of extinct moas with tinamous and elephant birds with kiwi, as well as ostriches as the group that diverged first among the five groups. However, phylogenetic relationships among the five groups are still controversial. Previous studies showed extensive heterogeneity in estimated gene tree topologies from conserved nonexonic elements, introns, and ultraconserved elements. Using the noncoding loci together with protein-coding loci, this study investigated the factors that affected gene tree estimation error and the relationships among the five groups. Using closely related ostrich rather than distantly related chicken as the outgroup, concatenated and gene tree-based approaches supported rheas as the group that diverged first among groups (1)-(4). Whereas gene tree estimation error increased using loci with low sequence divergence and short length, topological bias in estimated trees occurred using loci with high sequence divergence and/or nucleotide composition bias and heterogeneity, which occurred more often in trees estimated from coding loci than from noncoding loci. Regarding the relationships of (1)-(4), the site patterns by the parsimony criterion appeared less susceptible to the bias than tree construction assuming a stationary time-homogeneous model, and suggested that the clustering of kiwi with cassowaries and emu was the most likely, with approximately 40% support, rather than the clustering of kiwi with rheas or that of kiwi with tinamous, with 30% support each.
The majority of the DNA sequence in our genome is noncoding and not intended for synthesizing proteins. Nonetheless, genome-wide mapping of ribosome footprints has revealed widespread translation in annotated noncodin...
Gene expression divergence and chromosomal rearrangements have been put forward as major contributors to phenotypic differences between closely related species. It has also been established that duplicated genes show enhanced rates of positive selection in their amino acid sequences. If functional divergence is largely due to changes in gene expression, it follows that regulatory sequences in duplicated loci should also evolve rapidly. To investigate this hypothesis, we performed likelihood ratio tests (LRTs) on all noncoding loci within 5 kb of every transcript in the human genome and identified sequences with increased substitution rates in the human lineage since divergence from Old World monkeys. The fraction of rapidly evolving loci is significantly higher near genes that duplicated in the common ancestor of humans and chimps than near nonduplicated genes. We also conducted a genome-wide scan for nucleotide substitutions predicted to affect transcription factor binding. Rates of binding site divergence are elevated in noncoding sequences of duplicated loci with accelerated substitution rates. Many of the genes associated with these fast-evolving genomic elements belong to functional categories identified in previous studies of positive selection on amino acid sequences. In addition, we find enrichment for accelerated evolution near genes involved in establishment and maintenance of pregnancy, processes that differ significantly between humans and monkeys. Our findings support the hypothesis that adaptive evolution of the regulation of duplicated genes has played a significant role in human evolution.
Background: Detecting conserved noncoding sequences (CNSs) across species highlights the functional elements. Alignment procedures combined with computational prediction of transcription factor binding sites (TFBSs) can narrow down key regulatory elements. Repeat masking processes are often performed before alignment to mask insertion sequences such as transposable elements (TEs). However, recently such TEs have been reported to influence the gene regulatory network evolution. Therefore, an alignment approach that is robust to TE insertions is meaningful for finding novel conserved TFBSs in TEs. Results: We constructed a web server 'ReAlignerV' for complex alignment of genomic sequences. ReAlignerV returns ladder-like schematic alignments that integrate predicted TFBSs and the location of TEs. It also provides pair-wise alignments in which the predicted TFBS sites and their names are shown alongside each sequence. Furthermore, we evaluated false positive aligned sites by focusing on the species-specific TEs (SSTEs), and found that ReAlignerV has a higher specificity and robustness to insertions for sequences having more than 20% TE content, compared to LAGAN, AVID, MAVID and BLASTZ. Conclusion: ReAlignerV can be applied successfully to TE-insertion-rich sequences without prior repeat masking, and this increases the chances of finding regulatory sequences hidden in TEs, which are important sources of the regulatory network evolution. ReAlignerV can be accessed through and downloaded from http://***/.
Background: The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite the great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately it is limited by the computational requirements needed to perform genome-wide comparisons and by the problem of discriminating between conserved coding and non-coding sequences. This discrimination is often based (thus dependent) on the availability of annotated proteins. Results: In this paper we present the results of a comprehensive comparison of human and mouse genomes performed with a new high throughput grid-based system which allows the rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences the system is also suitable to accurately identify potential gene loci. Following this analysis we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly we were able to detect several potential gene loci supported by EST sequences but not corresponding to as yet annotated genes. Conclusion: Here we present a new system which allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and the identification of potential gene loci. Our system does not require the availability of any annotated sequence thus is suitable for the analysis of new or poorly annotated genomes.
We have established a cartilaginous fish cell line [Squalus acanthias embryo cell line (SAE)], a mesenchymal stem cell line derived from the embryo of an elasmobranch, the spiny dogfish shark S. acanthias. Elasmobranchs (sharks and rays) first appeared > 400 million years ago, and existing species provide useful models for comparative vertebrate cell biology, physiology, and genomics. Comparative vertebrate genomics among evolutionarily distant organisms can provide sequence conservation information that facilitates identification of critical coding and noncoding regions. Although these genomic analyses are informative, experimental verification of functions of genomic sequences depends heavily on cell culture approaches. Using ESTs defining mRNAs derived from the SAE cell line, we identified lengthy and highly conserved gene-specific nucleotide sequences in the noncoding 3' UTRs of eight genes involved in the regulation of cell growth and proliferation. Conserved noncoding 3' mRNA regions detected by using the shark nucleotide sequences as a starting point were found in a range of other vertebrate orders, including bony fish, birds, amphibians, and mammals. Nucleotide identity of shark and human in these regions was remarkably well conserved. Our results indicate that highly conserved gene sequences dating from the appearance of jawed vertebrates and representing potential cis-regulatory elements can be identified through the use of cartilaginous fish as a baseline. Because the expression of genes in the SAE cell line was prerequisite for their identification, this cartilaginous fish culture system also provides a physiologically valid tool to test functional hypotheses on the role of these ancient conserved sequences in comparative cell biology.
Mammal-fish-conserved-sequence 1 (MFCS1) is a highly conserved sequence that acts as a limb-specific cis-acting regulator of Sonic hedgehog (Shh) expression, residing 1 Mb away from the Shh coding sequence in mouse. Using gene-driven screening of an ENU-mutagenized mouse archive, we obtained mice with three new point mutations in MFCS1: M101116, M101117, and M101192. Phenotype analysis revealed that M101116 mice exhibit preaxial polydactyly and ectopic Shh expression at the anterior margin of the limb buds like a previously identified mutant, M100081. In contrast, M101117 and M101192 show no marked abnormalities in limb morphology. Furthermore, transgenic analysis revealed that the M101116 and M100081 sequences drive ectopic reporter gene expression at the anterior margin of the limb bud, in addition to the normal posterior expression. Such ectopic expression was not observed in the embryos carrying a reporter transgene driven by M101117. These results suggest that M101116 and M100081 affect the negative regulatory activity of MFCS1, which suppresses anterior Shh expression in developing limb buds. Thus, this study shows that gene-driven screening for ENU-induced mutations is an effective approach for exploring the function of conserved, noncoding sequences and potential cis-regulatory elements. (c) 2006 Elsevier Inc. All rights reserved.
Background: Several studies have investigated the relationships between selective constraints in introns and their length, GC content and location within genes. To date, however, no such investigation has been done in plants. Studies of selective constraints in noncoding DNA have generally involved interspecific comparisons, under the assumption of the same selective pressures acting in each lineage. Such comparisons are limited to cases in which the noncoding sequences are not too strongly diverged so that reliable sequence alignments can be obtained. Here, we investigate selective constraints in a recent segmental duplication in rice (O. sativa) that occurred about 7 million years ago and includes 605 paralogous intron pairs. Results: Our principal findings are: (1) intronic divergence is negatively correlated with intron length, a pattern that has previously been described in Drosophila and mammals; (2) there is a signature of strong purifying selection at splice control sites; (3) first introns are significantly longer and have a higher GC content than other introns; (4) the divergences of first and non-first introns are not significantly different from one another, a pattern that differs from Drosophila and mammals; and (5) short introns are more diverged than four-fold degenerate sites, suggesting that selection reduces divergence at four-fold sites. Conclusion: Our observation of stronger selective constraints in long introns suggests that functional elements subject to purifying selection may be concentrated within long introns. Our results are consistent with the presence of strong purifying selection at splicing control sites. Selective constraints are not significantly stronger in first introns of rice, as they are in other species.