Achieving the best balance between accuracy and efficiency is always an important issue when searching the tree space of large data sets. In the fifth issue of 2009, Rodrigo et al. used bootstrapped topologies as ...
Background. Francisella tularensis is the etiologic agent of tularemia and is classified as a select agent by the Centers for Disease Control and Prevention. Currently four known subspecies of F. tularensis that diffe...
Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors is quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases such as 180° flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on χ angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run on 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ, and the results were analyzed. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average. Summary Sentences: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. A fully automated method called "Autofix" identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement.
Much of modern machine learning and statistics research consists of extracting information from high-dimensional patterns. Often, the many features that make up such a pattern are themselves vector-valued, corresponding to sampled values in a time-series. Here, we present a classification methodology that accommodates multiple time-series using boosting. This method constructs an additive model by adaptively selecting basis functions, each consisting of a discriminating feature's full time-series. We present the modifications to Fisher linear discriminant analysis and least-squares, used as base learners, that are needed to accommodate the weighted data in the proposed boosting procedure. We conclude by evaluating the proposed method on a synthetic stochastic differential equation data set and on a real-world data set involving prediction of cancer patient susceptibility to a particular chemoradiotherapy.
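The idea of boosting over whole time-series features with a weighted least-squares base learner can be illustrated with a minimal sketch. It is not the authors' implementation: the discrete AdaBoost weight update, the tiny ridge term, and all function names (`weighted_least_squares`, `boost_time_series`, `classify`) are assumptions made for illustration. Each round fits one weighted least-squares classifier per feature, using that feature's full time-series as input, and keeps the feature with the lowest weighted error.

```python
import math

def weighted_least_squares(X, y, w, ridge=1e-8):
    """Minimise sum_i w_i * (y_i - b0 - x_i . b)^2 via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    d = len(rows[0])
    # Augmented system [A | c], A = X^T W X + ridge * I, c = X^T W y.
    A = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows))
          + (ridge if j == k else 0.0) for k in range(d)]
         + [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y))]
         for j in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for i in range(d):
            if i != col:
                factor = A[i][col] / A[col][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[col])]
    return [A[j][d] / A[j][j] for j in range(d)]

def predict(beta, series):
    """Sign of the fitted linear score for one time-series."""
    score = beta[0] + sum(b * x for b, x in zip(beta[1:], series))
    return 1 if score >= 0 else -1

def boost_time_series(features, y, rounds=5):
    """features[f][i] is the time-series of feature f for sample i; y in {-1, +1}."""
    n = len(y)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        best = None
        for f, X in enumerate(features):
            beta = weighted_least_squares(X, y, w)
            preds = [predict(beta, x) for x in X]
            err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
            if best is None or err < best[0]:
                best = (err, f, beta, preds)
        err, f, beta, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, beta))
        # AdaBoost update (assumed): up-weight misclassified samples.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def classify(model, sample):
    """sample[f] is the time-series of feature f for one new observation."""
    score = sum(alpha * predict(beta, sample[f]) for alpha, f, beta in model)
    return 1 if score >= 0 else -1
```

Solving the normal equations directly keeps the sketch dependency-free; for long time-series a numerical library and a stronger regulariser would be the more practical choice.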
We are correcting the abstract of our published article ([1]). The sentence that starts "We observe that 4.5% of MPSS tags...." was not scientifically complete in the original abstract, having only two of the four numbers required to describe a comparison of two technologies in two different organisms. The abstract below more accurately describes our findings, as documented in Figure 1 of the manuscript.
This report summarizes the proceedings of the second workshop of the 'Minimum Information for Biological and Biomedical Investigations' (MIBBI) consortium held on Dec 1-2, 2010 in Rüdesheim, Germany through the sponsorship of the Beilstein-Institute. MIBBI is an umbrella organization uniting communities developing Minimum Information (MI) checklists to standardize the description of data sets, the workflows by which they were generated and the scientific context for the work. This workshop brought together representatives of more than twenty communities to present the status of their MI checklists and plans for future development. Shared challenges and solutions were identified and the role of MIBBI in MI checklist development was discussed. The meeting featured some thirty presentations, wide-ranging discussions and breakout groups. The top outcomes of the two-day workshop as defined by the participants were: 1) the chance to share best practices and to identify areas of synergy; 2) defining a series of tasks for updating the MIBBI Portal; 3) reemphasizing the need to maintain independent MI checklists for various communities while leveraging common terms and workflow elements contained in multiple checklists; and 4) revision of the concept of the MIBBI Foundry to focus on the creation of a core set of MIBBI modules intended for reuse by individual MI checklist projects while maintaining the integrity of each MI project. Further information about MIBBI and its range of activities can be found at http://***/.
Background: Pseudogenes provide a record of the molecular evolution of genes. As glycolysis is such a highly conserved and fundamental metabolic pathway, the pseudogenes of glycolytic enzymes comprise a standardized genomic measuring stick and an ideal platform for studying molecular evolution. One of the glycolytic enzymes, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), has already been noted to have one of the largest numbers of associated pseudogenes, among all proteins. Results: We assembled the first comprehensive catalog of the processed and duplicated pseudogenes of glycolytic enzymes in many vertebrate model-organism genomes, including human, chimpanzee, mouse, rat, chicken, zebrafish, pufferfish, fruitfly, and worm (available at http://***/glycolysis/). We found that glycolytic pseudogenes are predominantly processed, i.e. retrotransposed from the mRNA of their parent genes. Although each glycolytic enzyme plays a unique role, GAPDH has by far the most pseudogenes, perhaps reflecting its large number of non-glycolytic functions or its possession of a particularly retrotranspositionally active sub-sequence. Furthermore, the number of GAPDH pseudogenes varies significantly among the genomes we studied: none in zebrafish, pufferfish, fruitfly, and worm, 1 in chicken, 50 in chimpanzee, 62 in human, 331 in mouse, and 364 in rat. Next, we developed a simple method of identifying conserved syntenic blocks (consistently applicable to the wide range of organisms in the study) by using orthologous genes as anchors delimiting a conserved block between a pair of genomes. This approach showed that few glycolytic pseudogenes are shared between primate and rodent lineages. Finally, by estimating pseudogene ages using Kimura's two-parameter model of nucleotide substitution, we found evidence for bursts of retrotranspositional activity approximately 42, 36, and 26 million years ago in the human, mouse, and rat lineages, respectively. Conclusion: Overall, we pe
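For reference, the distance underlying Kimura's two-parameter model is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The sketch below computes it for a pair of aligned DNA strings. The function name and the gap handling are illustrative assumptions, not the authors' pipeline. Converting a distance into an age also requires a substitution rate, which the abstract does not state, so that step is left out.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

A pseudogene would typically be compared with its aligned parent gene; identical sequences give a distance of zero, and the correction grows faster than the raw mismatch fraction as divergence increases.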
Researchers at the Department of Energy's (DOE) Pacific Northwest National Laboratory (PNNL) in Richland, WA, are creating computing environments for biologists that seamlessly integrate collections of data and computational resources. MeDICi is an evolving middleware platform for building complex, high-performance analytical applications. Components of the MeDICi Integration Framework (MIF) are constructed using Java programming interfaces that support inter-component communication through asynchronous messaging. Local components execute inside the MIF container, while remote components create distributed solutions and integrate with non-Java code. Mule provides the MIF container environment. MIF extends the Mule interface to make component and pipeline construction easier and to create an encapsulation device for component creation. The MIF interface is agnostic of the underlying Java messaging platform, which allows deployments to configure MIF applications using technologies that meet individual quality-of-service requirements.