Author affiliation: Indian Institute of Technology Patna, Department of Computer Science and Engineering, Patna 801103, Bihar, India
Publication: IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM Trans. Comput. Biol. Bioinf.)
Year, volume, issue: 2022, Vol. 19, No. 5
Pages: 2770-2781
Core indexing:
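Subject classification: 0710 [Science - Biology]; 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0714 [Science - Statistics (degrees may be conferred in Science or Economics)]; 0701 [Science - Mathematics]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]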
Funding: Visvesvaraya PhD Scheme for Electronics and IT, an initiative of the Ministry of Electronics and Information Technology (MeitY), Government of India; Young Faculty Research Fellowship (YFRF) Award; Visvesvaraya PhD Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India
Topics: Gene expression; Diseases; Three-dimensional displays; Prognostics and health management; Computer architecture; Proteins; Feature extraction; Multi-modal architecture; gene prognosis; gene expression profile; protein 3D structure; nucleotide sequence; attention mechanism; deep learning
Abstract: An in-depth exploration of gene prognosis using different methodologies aids in understanding the biological regulation of genes in disease pathobiology and molecular functions. Interpreting gene functions at the biological and molecular levels remains a daunting yet crucial task in domains such as drug design, personalized medicine, and next-generation diagnostics. Recent advancements in omics technologies have produced diverse heterogeneous genomic datasets, such as microarray gene expression, miRNA expression, DNA sequences, and 3D structures, which are significant resources for understanding gene functions. In this paper, we propose a novel self-attention-based deep multi-modal model, named DeePROG, for the prognosis of disease-affected genes based on heterogeneous omics data. We use three NCBI datasets covering three modalities, namely the gene expression profile, the underlying DNA sequence, and the 3D protein structures. To extract useful features from each modality, we develop several context-specific deep learning models. In addition, we develop three attention-based deep bi-modal architectures along with DeePROG to leverage the prognosis of the underlying biomedical data. We assess the performance of the models in terms of computational assessment of function annotation (CAFA2) metrics. Moreover, we analyze the results using receiver operating characteristic (ROC) curves in a highly class-imbalanced data setting and test statistical significance with Welch's t-test. Experimental results show that DeePROG significantly outperforms the baseline models across the performance metrics. The source code and all preprocessed datasets used in this study are available at https://***/duttaprat/DeePROG.
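Note: The abstract mentions Welch's t-test for significance testing but does not say how it was applied. The following is a minimal sketch of the standard test (unequal-variance t statistic with Welch-Satterthwaite degrees of freedom), not the authors' code. The function name `welch_t` and the two score lists are hypothetical illustrations and do not come from the paper.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Sample variances (n - 1 in the denominator).
    va, vb = variance(a), variance(b)
    # Squared standard error of the difference in means.
    se2 = va / na + vb / nb
    t = (mean(a) - mean(b)) / se2 ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical toy scores for two models over five runs (not from the paper).
model_a = [1.0, 2.0, 3.0, 4.0, 5.0]
model_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(model_a, model_b)
print(f"t = {t_stat:.4f}, df = {dof:.4f}")
```

A p-value would then be obtained from the Student t distribution with `dof` degrees of freedom. Unlike the pooled-variance t-test, this form does not assume that the two groups share the same variance.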