Predicting the putative enzymatic function of uncharacterized proteins is a major problem in metagenomic research, where large numbers of sequences can be determined rapidly. In this work, a machine-learning approach was developed that attempts to predict enzymatic activity based on three protein domain databases, PFAM, CATH and SCOP, which contain functional and structural information about proteins as Hidden Markov Models. Separate and combined classifiers were trained on well-annotated data, and their performance was assessed in order to compare the predictive power of the attribute sets corresponding to the three protein domain databases. All classifiers performed well, with an average accuracy of ~96% and an average AUC score of 0.84. In conclusion, the classification procedure can be integrated into more extended metagenomic analysis workflows.
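The abstract reports both accuracy and AUC, which measure different things: accuracy depends on a decision threshold, while AUC summarizes how well the scores rank positives above negatives. The following is a minimal sketch of both metrics in plain Python. The labels, scores, and the 0.5 threshold are made-up illustrative values, not data or settings from the study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct calls when scores >= threshold are called positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 2 enzymes (label 1) among 10 proteins.
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.2, 0.1, 0.1, 0.2, 0.05, 0.3, 0.15]

print(accuracy(toy_labels, toy_scores))
print(auc(toy_labels, toy_scores))
```

On imbalanced data such as this toy set, a classifier can miss half of the positives and still score high on accuracy, which is one reason to report both numbers.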
StRAnGER is a web application for the automated statistical analysis of annotated experiments, exploiting controlled biological vocabularies such as Gene Ontology or KEGG pathway terms. In the first version, StRAnGER featured various gene profiling platforms for functional analysis of genomic datasets, starting from a list of significant genes derived from statistical and empirical thresholds. The current version implements several major improvements, namely a new ranking algorithm, the expansion of background distributions with protein annotations, the addition of a mode for batch experiments and a noise-control analysis that evaluates the robustness of the prioritized terms through iterative addition of random genes. Overall, StRAnGER enables a systems-level functional interpretation through the utilization of bootstrapping techniques and the detection of distribution-independent term enrichments.
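The noise-control idea described above (re-scoring a term as random genes are added to the gene list) can be sketched as follows. This is not StRAnGER's ranking algorithm, which the abstract does not specify: a standard one-sided hypergeometric test is swapped in as the enrichment score, and the function names, gene identifiers, and parameters are all hypothetical.

```python
import math
import random


def enrichment_pvalue(hits, drawn, annotated, population):
    """One-sided hypergeometric p-value: P(X >= hits) when `drawn` genes are
    sampled without replacement from `population`, `annotated` of which
    carry the term. (Assumed scoring method, not taken from the source.)"""
    upper = min(drawn, annotated)
    favourable = sum(
        math.comb(annotated, i) * math.comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return favourable / math.comb(population, drawn)


def noise_control(selected, term_genes, background, steps, genes_per_step, seed=0):
    """Return the term's p-value before and after each round of adding
    `genes_per_step` random background genes to the selected list."""
    rng = random.Random(seed)
    background = set(background)
    term_genes = set(term_genes) & background
    current = set(selected) & background
    pool = sorted(background - current)  # sorted for reproducible shuffling
    rng.shuffle(pool)
    pvalues = []
    for _ in range(steps + 1):
        hits = len(current & term_genes)
        pvalues.append(
            enrichment_pvalue(hits, len(current), len(term_genes), len(background))
        )
        current.update(pool[:genes_per_step])
        del pool[:genes_per_step]
    return pvalues


# Hypothetical example: 100 background genes, 10 annotated with one term,
# and a 10-gene list containing 5 of the annotated genes.
background = [f"g{i}" for i in range(100)]
term_genes = background[:10]
selected = background[:5] + background[50:55]
print(noise_control(selected, term_genes, background, steps=3, genes_per_step=5))
```

A term whose p-value stays small as noise is added would be considered robust under this sketch; a term that loses significance after a few random genes would not.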
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.
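The low-frequency band used above (MAF 0.1-5%) can be made concrete with a short calculation. The sketch below assumes diploid autosomal sites, so the 2,657 whole-genome sequenced individuals contribute 5,314 alleles. The allele counts, the inclusive boundaries, and the "rare" and "common" labels for variants outside the band are illustrative assumptions, not definitions from the study.

```python
def minor_allele_frequency(alt_count, total_alleles):
    """Frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0 or not 0 <= alt_count <= total_alleles:
        raise ValueError("alt_count must lie between 0 and total_alleles")
    return min(alt_count, total_alleles - alt_count) / total_alleles


def frequency_class(maf, low=0.001, high=0.05):
    """Bin a variant by MAF; 0.1-5% is the low-frequency band from the text.
    Boundary handling and the other two labels are assumptions."""
    if maf < low:
        return "rare"
    if maf <= high:
        return "low-frequency"
    return "common"


TOTAL_ALLELES = 2 * 2657  # diploid assumption: 5,314 alleles

for alt_count in (5, 53, 1000):  # hypothetical alternate-allele counts
    maf = minor_allele_frequency(alt_count, TOTAL_ALLELES)
    print(alt_count, round(maf, 5), frequency_class(maf))
```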
One known challenge in analyzing gene expression data is to combine analysis outcomes obtained disparately by applying multiple, independent meta-analysis methods. Here we present an integrative computational system that narrows down biological hypotheses by integrating gene expression patterns, transcription factor (TF) binding site analysis outcomes, and Gene Ontology (GO) enrichment analysis outcomes. This system identifies regulated genes from microarray experiments through statistical processes, categorizes similarly behaving groups of genes and then carries out binding site analysis and gene function enrichment analysis based on some significant clusters. The output is an ordered set of "putative" pair-wise relationships between TFs and their potential target genes. The relationships are ranked based on their closeness to the experimental context. We demonstrate the effectiveness of our framework using two independent microarray data sets.
Atherosclerosis is a multifactorial disease involving a lot of genes and proteins recruited throughout its manifestation. The present study aims to exploit bioinformatic tools in order to analyze microarray data of at...
O1 Regulation of genes by telomere length over long distances Jerry W. Shay
O2 The microtubule destabilizer KIF2A regulates the postnatal establishment of neuronal circuits in addition to prenatal cell survival, cell migration, and axon elongation, and its loss leading to malformation of cortical development and severe epilepsy Noriko Homma, Ruyun Zhou, Muhammad Imran Naseer, Adeel G. Chaudhary, Mohammed Al-Qahtani, Nobutaka Hirokawa
O3 Integration of metagenomics and metabolomics in gut microbiome research Maryam Goudarzi, Albert J. Fornace Jr.
O4 A unique integrated system to discern pathogenesis of central nervous system tumors Saleh Baeesa, Deema Hussain, Mohammed Bangash, Fahad Alghamdi, Hans-Juergen Schulten, Angel Carracedo, Ishaq Khan, Hanadi Qashqari, Nawal Madkhali, Mohamad Saka, Kulvinder S. Saini, Awatif Jamal, Jaudah Al-Maghrabi, Adel Abuzenadah, Adeel Chaudhary, Mohammed Al Qahtani, Ghazi Damanhouri
O5 RPL27A is a target of miR-595 and deficiency contributes to ribosomal dysgenesis Heba Alkhatabi
O6 Next generation DNA sequencing panels for haemostatic and platelet disorders and for Fanconi anaemia in routine diagnostic service Anne Goodeve, Laura Crookes, Nikolas Niksic, Nicholas Beauchamp
O7 Targeted sequencing panels and their utilization in personalized medicine Adel M. Abuzenadah
O8 International biobanking in the era of precision medicine Jim Vaught
O9 Biobank and biodata for clinical and forensic applications Bruce Budowle, Mourad Assidi, Abdelbaset Buhmeida
O10 Tissue microarray technique: a powerful adjunct tool for molecular profiling of solid tumors Jaudah Al-Maghrabi
O11 The CEGMR biobanking unit: achievements, challenges and future plans Abdelbaset Buhmeida, Mourad Assidi, Leena Merdad
O12 Phylomedicine of tumors Sudhir Kumar, Sayaka Miura, Karen Gomez
O13 Clinical implementation of pharmacogenomics for colorectal cancer treatment Angel Carracedo, Mahmood Rasool
O14 From association to causality: translation of GWAS findings for genomic me