Recent research has demonstrated the utility of using supervised classification systems for automatic identification of low quality microarray data. However, this approach requires annotation of a large training set by a qualified expert. In this paper we demonstrate the utility of an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and naive Bayes classification. On our test set, this system exhibits performance comparable to that of an analogous supervised learner constructed from the same training data.
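The combination of EM and naive Bayes named in this abstract can be illustrated with a small sketch. The abstract does not state which quality features or likelihood model were used, so everything here is an assumption: the function name `em_naive_bayes` is hypothetical, each feature is modeled as an independent Gaussian per class, and the data in the usage note are synthetic. It shows the general technique, not the authors' system.

```python
import numpy as np

def em_naive_bayes(X, n_classes=2, n_iter=100, seed=0):
    """Unsupervised naive Bayes fitted with EM (Gaussian features assumed).

    X is an (n_samples, n_features) array of hypothetical quality metrics.
    Returns hard labels plus the fitted priors, means, and variances.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Initialization (an assumption): means from distinct random samples,
    # shared global variance, uniform class priors.
    mean = X[rng.choice(n, size=n_classes, replace=False)].astype(float)
    var = np.tile(X.var(axis=0), (n_classes, 1)) + 1e-6
    prior = np.full(n_classes, 1.0 / n_classes)

    for _ in range(n_iter):
        # E-step: posterior class probabilities under the naive Bayes
        # assumption (features independent given the class).
        diff = X[:, None, :] - mean[None, :, :]
        log_lik = -0.5 * (np.log(2 * np.pi * var)[None] + diff**2 / var[None]).sum(axis=2)
        log_post = np.log(prior)[None, :] + log_lik
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate parameters from the soft assignments.
        weight = resp.sum(axis=0) + 1e-12
        prior = weight / weight.sum()
        mean = (resp.T @ X) / weight[:, None]
        var = np.maximum((resp.T @ X**2) / weight[:, None] - mean**2, 0.0) + 1e-6

    return resp.argmax(axis=1), prior, mean, var
```

Calling `labels, prior, mean, var = em_naive_bayes(X)` on a feature matrix returns one cluster label per array. The cluster indices are arbitrary, so deciding which cluster corresponds to "low quality" still requires inspecting the fitted means or a handful of labeled examples.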
The RCSB Protein Data Bank has developed a portal for structural genomics resources at http://***. Reports about the worldwide contributing centers are available, including summary reports for target lists, target status progress, targets in the PDB, and sequence redundancy analyses, as well as links to each center's resources.
Low-cost whole-genome assembly has enabled the collection of haplotype-resolved pangenomes for numerous organisms. In turn, this technological change is encouraging the development of methods that can precisely address the sequence and variation described in large collections of related genomes. These approaches often use graphical models of the pangenome to support algorithms for sequence alignment, visualization, functional genomics, and association studies. The additional information provided to these methods by the pangenome allows them to achieve superior performance on a variety of bioinformatic tasks, including read alignment, variant calling, and genotyping. Pangenome graphs stand to become a ubiquitous tool in genomics. Although it is unclear whether they will replace linear reference genomes, their ability to harmoniously relate multiple sequence and coordinate systems will make them useful irrespective of which pangenomic models become most common in the future.
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) infection is silent or benign in most infected individuals, but causes hypoxemic COVID-19 pneumonia in about 10% of cases. We review studies of the human genetics of life-threatening COVID-19 pneumonia, focusing on both rare and common variants. Large-scale genome-wide association studies have identified more than 20 common loci robustly associated with COVID-19 pneumonia with modest effect sizes, some implicating genes expressed in the lungs or leukocytes. The most robust association, on chromosome 3, concerns a haplotype inherited from Neanderthals. Sequencing studies focusing on rare variants with a strong effect have been particularly successful, identifying inborn errors of type I interferon (IFN) immunity in 1–5% of unvaccinated patients with critical pneumonia, and their autoimmune phenocopy, autoantibodies against type I IFN, in another 15–20% of cases. Our growing understanding of the impact of human genetic variation on immunity to SARS-CoV-2 is enabling health systems to improve protection for individuals and populations.
The rapid increase in volume and complexity of biomedical data requires changes in research, communication, and clinical practices. This includes learning how to effectively integrate automated analysis with high–data density visualizations that clearly express complex phenomena. In this review, we summarize key principles and resources from data visualization research that help address this difficult challenge. We then survey how visualization is being used in a selection of emerging biomedical research areas, including three-dimensional genomics, single-cell RNA sequencing (RNA-seq), the protein structure universe, phosphoproteomics, augmented reality–assisted surgery, and metagenomics. While specific research areas need highly tailored visualizations, there are common challenges that can be addressed with general methods and strategies. Also common, however, are poor visualization practices. We outline ongoing initiatives aimed at improving visualization practices in biomedical research via better tools, peer-to-peer learning, and interdisciplinary collaboration with computer scientists, science communicators, and graphic designers. These changes are revolutionizing how we see and think about our data.