Hidden stop codons are the nucleotide triples TAA, TAG, and TGA that appear in the second and third reading frames of a protein-coding gene. Recent studies reported biological evidence suggesting that hidden stop codons are important in preventing misreading of mRNA, which is often detrimental to the cell. We study the problem of designing protein-encoding genes with a large number of hidden stop codons under biological constraints, including the GC content and codon usage of individual organisms. In simpler models, we obtained provably optimal results. In more complex models, the designed genes have many more hidden stop codons than wild-type genes do, as observed in an experiment with 8 genomes with a wide range of GC content and codon usage.
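As an illustration of the quantity being maximized, the following minimal sketch counts stop codons in the two shifted reading frames of a coding sequence. The function name `count_hidden_stops` and the example sequence are assumptions made for illustration; the sketch covers only the counting step, not the gene-design method itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_hidden_stops(cds):
    """Count stop codons in the +1 and +2 shifted frames of a coding sequence.

    `cds` is assumed to be a DNA string whose frame 0 is the coding frame.
    Returns a dict mapping the frame shift (1 or 2) to the number of
    stop-codon triples found in that shifted frame.
    """
    seq = cds.upper()
    counts = {}
    for shift in (1, 2):
        # Step through complete triples only, starting at the shifted offset.
        counts[shift] = sum(
            1
            for i in range(shift, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        )
    return counts


# Hypothetical example: frame 0 reads ATG ATA GCT TAA.
example = count_hidden_stops("ATGATAGCTTAA")
```

For this example sequence, the +1 frame reads TGA TAG CTT and therefore contains two hidden stop codons, while the +2 frame contains none.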
Genome-wide association studies (GWAS) test hundreds of thousands of single-nucleotide polymorphisms (SNPs) for association with a trait, treating each marker equally and ignoring prior evidence of association with specific regions. Typically, promising regions are selected for further investigation based on p-values obtained from simple tests of association. However, loci that exert only a weak, low-penetrance effect on the trait, producing modest evidence of association, are not detectable in the context of a GWAS. Incorporating prior knowledge of association in GWAS could increase power, help distinguish between false and true positives, and identify better sets of SNPs for follow-up. We performed a GWAS on rheumatoid arthritis (RA) patients and controls (Problem 1, Genetic Analysis Workshop 16). To include prior information in the analysis, we applied four methods that deal distinctively with markers in candidate genes in the context of GWAS. SNPs were divided into a random and a candidate subset; we then applied empirical correction by permutation, false-discovery rate, false-positive report probability, and posterior odds of association using different prior probabilities. We repeated the same analyses on two different sets of candidate markers, defined on the basis of previously reported association with RA following two different approaches. The four methods showed similar relative behavior when applied to the two sets, with the proportion of candidate SNPs ranked among the top 2,000 varying from 0 to 100%. The use of different prior probabilities changed the stringency of the methods, but not their relative performance.
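To make the role of the prior concrete, here is a minimal sketch of one of the four methods, the false-positive report probability (FPRP), in its standard form: FPRP = α(1 − π) / (α(1 − π) + (1 − β)π), where α is the observed p-value, 1 − β is the statistical power, and π is the prior probability of a true association. The function name `fprp`, the default power of 0.8, and the example priors are assumptions for illustration and are not taken from the study.

```python
def fprp(p_value, prior, power=0.8):
    """False-positive report probability for an observed p-value.

    p_value: observed p-value, used as the significance level alpha.
    prior:   assumed prior probability that the SNP is truly associated.
    power:   assumed power (1 - beta) to detect a true association.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    false_pos = p_value * (1.0 - prior)  # chance of a false positive report
    true_pos = power * prior             # chance of a true positive report
    return false_pos / (false_pos + true_pos)


# Hypothetical priors: a higher one for a candidate-gene SNP,
# a lower one for a SNP from the random subset.
candidate = fprp(p_value=0.001, prior=0.1)
random_snp = fprp(p_value=0.001, prior=0.0001)
```

With the same p-value, the SNP given the higher prior receives a much lower FPRP, which is how prior evidence changes the ranking of markers for follow-up.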
Background. The processes involved in the somatic assembly of antigen receptor genes are unique to the immune system and are driven largely by random events. Subtle biases, however, may exist and provide clues to the molecular mechanisms involved in their assembly and selection. Large-scale efforts to provide baseline data about the genetic characteristics of immunoglobulin (Ig) genes and the mechanisms involved in their assembly have recently become possible due to the rapid growth of genetic databases. Results. We gathered and analyzed nearly 6,500 productive human Ig heavy chain genes and compared them with 325 non-productive Ig genes that were originally rearranged out of frame and therefore incapable of being biased by selection. We found evidence for differences in n-nucleotide tract length distributions which have interesting interpretations for the mechanisms involved in n-nucleotide polymerization. Additionally, we found striking statistical evidence for pairing preferences among D and J segments. We present a statistical model to support our hypothesis that these pairing biases are due to multiple sequential D-to-J rearrangements. Conclusion. We present here the most precise estimates of gene segment usage frequencies currently available along with analyses regarding n-nucleotide distributions and D-J segment pair preferences. Additionally, we provide the first statistical evidence that sequential D-J recombinations occur at the human heavy chain locus during B-cell ontogeny with an approximate frequency of 20%.
Subnetworks can reveal the complex patterns of the whole-genome network by extracting the interactions that depend on temporal, spatial, or condition-specific context. In this paper, we present an optimization framework to identify condition-specific subnetworks. This framework allows us to identify the most coherent subnetwork by integrating the information from both nodes and edges in the graph. Importantly, we design an algorithm to solve the optimization problem efficiently. It is very fast and can extract subnetworks from a large-scale network with about 10,000 nodes. As a pilot study, we apply our method to identify type 2 diabetes-related subnetworks in the human protein-protein interaction network.
SIRE1 is a 2,000-copy member of the Ty1/copia retroelement family found in the soybean genome and is closely related to sireviruses found in the genomes of other legumes. Although these elements closely resemble typic...
Protein-protein interactions (PPIs) play an extremely important role in performing a variety of biological functions. The interactomes of several model organisms, including the budding yeast Saccharomyces cerevisiae, have recently been studied using experimental techniques such as the yeast two-hybrid assay. However, these techniques are generally biased against integral membrane proteins due to their intrinsic limitations. Given that the interactions between integral membrane proteins cover a large fraction of the whole interactome, we report a study of predicting interactions between integral membrane proteins in yeast with a quantitative model. We integrate protein-protein interaction and domain-domain interaction (DDI) data from disparate sources and apply a log-likelihood scoring method to all putative integral membrane proteins in yeast to predict their interactions based on a cut-off threshold. We show that our approach improves on other predictive approaches when tested on a "gold-standard" data set and achieves a 74.6% true positive rate at the expense of a 0.43% false positive rate. Furthermore, we find that two integral membrane proteins are more likely to interact with each other if they share more common interaction partners. This study allows us to reach a more extensive understanding of the yeast integral membrane proteins from a network view, which also complements the previous prediction approaches based on the genomic context.
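The abstract does not give the exact scoring formula, so the following is only a sketch of one common form of log-likelihood scoring with a cut-off: each evidence source contributes the log of the ratio between its probability among interacting pairs and among non-interacting pairs, and the contributions are summed under an assumption that sources are independent. The function names, source labels, likelihood values, and cut-off are all hypothetical.

```python
import math


def log_likelihood_score(evidence, likelihoods):
    """Sum log-likelihood ratios over the evidence sources seen for a protein pair.

    evidence:    iterable of source names observed for the pair.
    likelihoods: dict mapping source name to a tuple
                 (P(source | interacting), P(source | non-interacting)).
    Assumes the sources are conditionally independent.
    """
    score = 0.0
    for source in evidence:
        p_pos, p_neg = likelihoods[source]
        score += math.log(p_pos / p_neg)
    return score


def predict_interaction(evidence, likelihoods, cutoff):
    """Predict an interaction when the summed score reaches the cut-off."""
    return log_likelihood_score(evidence, likelihoods) >= cutoff


# Hypothetical likelihoods for two evidence types (not from the paper).
likelihoods = {"ppi": (0.6, 0.06), "ddi": (0.4, 0.1)}
both = log_likelihood_score(["ppi", "ddi"], likelihoods)
```

Raising the cut-off trades true positives for fewer false positives, which is the kind of trade-off reflected in the reported true and false positive rates.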
Researchers at the Department of Energy's (DOE) Pacific Northwest National Laboratory (PNNL) in Richland, WA, are creating computing environments for biologists that seamlessly integrate collections of data and computational resources. MeDICi is an evolving middleware platform for building complex, high-performance analytical applications. Components of the MeDICi Integration Framework (MIF) are constructed using Java programming interfaces that support inter-component communication through asynchronous messaging. Local components execute inside the MIF container. Remote components create distributed solutions and integrate with non-Java code. Mule provides the MIF container environment. MIF extends the Mule interface to make component and pipeline construction easier and to provide an encapsulation device for component creation. The MIF interface is agnostic of the underlying Java messaging platform, which allows deployments to configure MIF applications using technologies that meet individual quality-of-service requirements.
Nuclear Overhauser effects (NOE) distance constraints and torsion angle constraints are major conformational constraints for nuclear magnetic resonance (NMR) structure refinement. In particular, the number of NOE cons...