As an extremely significant class of biocatalysts, enzymes play an important role in biological reproduction and metabolism. The prediction of enzyme function is therefore of great significance in biomedical fields. Recently, computational methods for predicting enzyme function have been proposed, and they effectively reduce the cost of enzyme function prediction. However, existing methods still fall short in mining the discriminative information needed for enzyme function recognition. In this study, we present MVDINET, a novel method for multi-level enzyme function prediction. First, initial multi-view feature data are extracted from the enzyme sequence. Then, these initial views are fed into separate deep view-specific network modules to learn deep view-specific information. Further, a deep view interaction network is designed to extract the interaction information. Finally, the specificity information and interaction information are fed into a multi-view adaptively weighted classifier. We comprehensively evaluate MVDINET on benchmark datasets and demonstrate that MVDINET is superior to existing methods.
This special section of IEEE/ACM Transactions on Computational Biology and Bioinformatics presents extended versions of some of the best papers accepted at the Eighth International Conference on Algorithms for Computa...
ISBN: 9780738132020 (print)
The proceedings contain 32 papers. The topics discussed include: construction of Chinese freshwater fish information database; identification of potentially therapeutic target genes in ovarian cancer via bioinformatic approach; ITGAX: a potential biomarker of acute myeloid leukemia (AML) through bioinformatic analysis; using machine learning models to study medication adherence in hypertensive patients based on national stroke screening data; investigating spatio-temporal cellular interactions in embryonic morphogenesis by 4D nucleus tracking and systematic comparative analysis, taking nematodes C. elegans and C. briggsae as examples; and drug-target interaction identification via dual-graph regularized robust PCA in heterogeneous networks.
ISBN: 9781665484626 (print)
An evolutionary algorithm is used to evolve personal contact networks representing the individuals in a population and the interactions between them. Such networks can be used to track the progress of an epidemic, as it passes from infected individuals to others. Two fitness functions are used: epidemic duration and epidemic spread. Each of these is evaluated in the context of new variants being introduced during the course of the epidemic. Individuals infected with one variant obtain immunity to that variant and possible partial immunity to future variants. Two frameworks for epidemic variants are presented. In the first, infectivity is coupled directly to how well an individual's immunity covers the variant. In the second, infectivity is decoupled, causing a much higher number of infections but with many of lessened severity due to immunity.
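The two fitness functions named above, epidemic duration and epidemic spread, can be illustrated with a minimal sketch. It assumes a simple SIR-style model with a single variant on an adjacency-list contact network; the function name `simulate_epidemic`, the transmission probability `alpha`, and the one-step infectious period are illustrative assumptions, not details taken from the paper, and the variant and immunity mechanics are omitted.

```python
import random


def simulate_epidemic(adjacency, patient_zero, alpha=0.5, seed=0, max_steps=1000):
    """Run one SIR-style epidemic on a contact network.

    adjacency maps each node to a list of its contacts.
    Returns (duration, total_infected): the two quantities an
    evolutionary algorithm could use as fitness values.
    """
    rng = random.Random(seed)
    state = {node: "S" for node in adjacency}  # S = susceptible
    state[patient_zero] = "I"                  # I = infected
    duration = 0
    total_infected = 1

    while duration < max_steps and any(s == "I" for s in state.values()):
        infected = [n for n, s in state.items() if s == "I"]
        newly_infected = set()
        for node in infected:
            for contact in adjacency[node]:
                # Each infected-susceptible contact transmits with probability alpha.
                if state[contact] == "S" and rng.random() < alpha:
                    newly_infected.add(contact)
        for node in infected:
            state[node] = "R"                  # R = removed (assumed one-step infection)
        for node in newly_infected:
            state[node] = "I"
        total_infected += len(newly_infected)
        duration += 1

    return duration, total_infected


# A four-person chain: 0 - 1 - 2 - 3
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
duration, spread = simulate_epidemic(chain, patient_zero=0, alpha=1.0)
```

An evolutionary search over networks would call such a simulation repeatedly and keep the networks that maximise either return value.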
A transcription factor (TF) is a sequence-specific DNA-binding protein that plays key roles in cell-fate decisions by regulating gene expression. Predicting TFs is important for the tea plant research community, as they regulate gene expression, influencing plant growth, development, and stress responses. Identifying TFs through wet-lab experimental validation is challenging because of their rarity and the high cost and time required. As a result, computational methods are increasingly popular. The pre-training strategy has been applied to many tasks in natural language processing (NLP) and has achieved impressive performance. In this paper, we present a novel recognition algorithm named TeaTFactor that uses pre-training to train a TF prediction model. The model is built upon the BERT architecture and is initially pre-trained using protein data from UniProt. Subsequently, the model is fine-tuned using the collected TF data of tea plants. We evaluated four different word segmentation methods and compared the model with the existing state-of-the-art prediction tools. According to the comprehensive experimental results and a case study, our model is superior to existing models and achieves the goal of accurate identification. In addition, we have developed a web server at http://***, which we believe will facilitate future studies on tea transcription factors and advance the field of crop synthetic biology.
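The abstract does not say which four word segmentation methods were compared. As an illustration only, one common way to split a protein sequence into "words" for a BERT-style model is overlapping or non-overlapping k-mers. The function name `kmer_tokens` and its parameters are hypothetical and are not taken from TeaTFactor.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a protein sequence into k-mer "words".

    stride=1 gives overlapping k-mers; stride=k gives non-overlapping ones.
    Trailing residues that do not fill a whole k-mer are dropped.
    """
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]


overlapping = kmer_tokens("MKVLA", k=3)               # ['MKV', 'KVL', 'VLA']
non_overlapping = kmer_tokens("MKVLAA", k=3, stride=3)  # ['MKV', 'LAA']
```

With `k=1` the same function reduces to single-residue tokens, which is another segmentation a comparison of this kind might include.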
ISBN: 9798400704901 (print)
The goal of the 22nd International Workshop on Data Mining in Bioinformatics (BIOKDD 2023) is to encourage KDD researchers to solve the numerous problems and challenges in bioinformatics using data mining technologies. Based on the organizers' expertise and communities, BIOKDD 2023 features the theme "Large-Scale Data-Driven Methods for Bioinformatics". This theme encourages the use of high-performance computing (HPC) to support the training of large machine learning models for problems in bioinformatics and computational biology. The key goal is to accelerate the convergence between the data mining and bioinformatics communities to expedite discoveries in basic biology, medicine and healthcare.
The goal of the 23rd International Workshop on Data Mining in Bioinformatics (BIOKDD 2024) is to encourage KDD researchers to solve the numerous problems and challenges in bioinformatics using data mining technologies. Based on the organizers' expertise and communities, BIOKDD 2024 features the theme "Advancing Bioinformatics with LLMs and GenAI". This theme encourages the use of large language models and generative artificial intelligence to solve problems in bioinformatics and computational biology. The key goal is to accelerate the convergence between the data mining and bioinformatics communities to expedite discoveries in basic biology, medicine and healthcare.
Circular RNAs (circRNAs) play a significant role in cancer development and therapy resistance. There is substantial evidence that the expression of circRNAs affects the sensitivity of cells to drugs. Identifying circRNA-drug sensitivity associations (CDAs) is helpful for disease treatment and drug discovery. However, identifying CDAs through conventional biological experiments is both time-consuming and costly, so computational methods for predicting them are urgently needed. In this study, we propose a new computational method, the subgraph-aware graph convolutional network (SAGCN), for predicting CDAs. SAGCN first constructs a heterogeneous network composed of a circRNA similarity network, a drug similarity network, and a circRNA-drug bipartite network. Then, a subgraph extractor is proposed to learn the latent subgraph structure of the heterogeneous network using a graph convolutional network. The extractor captures 1-hop and 2-hop information, and a fusing attention mechanism is designed to integrate them adaptively. Simultaneously, a novel subgraph-aware attention mechanism is proposed to detect intrinsic subgraph structure. The final node feature representation is then used to make the CDA prediction. Experimental results demonstrate that SAGCN obtained an average AUC of 0.9120 and AUPR of 0.8693 under 10-fold cross-validation, exceeding the performance of the most advanced models. Case studies have demonstrated the potential of SAGCN in identifying associations between circRNAs and drug sensitivity.
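The idea of capturing 1-hop and 2-hop information and fusing them with attention can be sketched in plain Python. This is a simplified illustration, not the SAGCN implementation: it replaces the learned graph convolution with unweighted neighbour averaging, and the `query` vector stands in for attention parameters that a real model would learn. All function names are hypothetical.

```python
import math


def mean_aggregate(adjacency, features):
    """Average each node's neighbour features (one simplified propagation step)."""
    dim = len(next(iter(features.values())))
    aggregated = {}
    for node, neighbours in adjacency.items():
        if not neighbours:
            aggregated[node] = [0.0] * dim
            continue
        aggregated[node] = [
            sum(features[n][d] for n in neighbours) / len(neighbours)
            for d in range(dim)
        ]
    return aggregated


def fuse_hops(hop1, hop2, query):
    """Combine two hop representations with softmax attention weights."""
    scores = [sum(q * x for q, x in zip(query, h)) for h in (hop1, hop2)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * a + weights[1] * b for a, b in zip(hop1, hop2)]
    return fused, weights


# Toy bipartite network: two circRNAs linked to one drug.
adjacency = {"c1": ["d1"], "c2": ["d1"], "d1": ["c1", "c2"]}
features = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "d1": [1.0, 1.0]}

hop1 = mean_aggregate(adjacency, features)  # information from direct neighbours
hop2 = mean_aggregate(adjacency, hop1)      # information from neighbours of neighbours
fused_c1, weights_c1 = fuse_hops(hop1["c1"], hop2["c1"], query=[0.0, 0.0])
```

With a zero `query` both hops receive equal weight; a trained query would shift the weights toward whichever hop is more informative for a given node.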
Artificial intelligence (AI) has revolutionized various fields, including bioinformatics and genomics, by offering powerful tools and techniques to analyse and interpret complex biological data. This research paper ex...
Computational methods for predicting gene function annotations aim to automatically find associations between a gene and a set of Gene Ontology (GO) terms describing its functions. Since the manual curation of novel annotations and the corresponding wet-lab experimental validation are very time-consuming and costly, there is a need for computational tools that can reliably predict likely annotations and boost the discovery of new gene functions. This work proposes a novel method, exp2GO, for predicting annotations based on the inference of GO similarities from expression similarities. The method was benchmarked against other methods on several public biological datasets, obtaining the best comparative results. exp2GO effectively improved the prediction of GO annotations in comparison to state-of-the-art methods. Furthermore, the proposal was validated with a full-genome case, where it was capable of predicting relevant and accurate biological functions. The repository of this project with full data and code is available at https://***/sinc-lab/exp2GO.
ISBN: 9781665484626 (digital)
ISBN: 9781665484626 (print)
RNA Design is a crucial bioinformatics problem: tailoring specific RNA sequences so that they fold into structures that guide our biology and medicine. Because these computer-generated sequences are often statistically different from observed RNA sequences and do not fold as intended in real lab conditions, methods are developed to ensure more biological consistency. One approach is substructure restriction, where every solution is composed of observed RNA substructures recorded in a database. However, studies on this restricted problem are limited: while our Applied Research Lab's Simulated Annealing solution (SIMARD) uses substructures, it only considers a single mutation policy and a single method of generating substructure-consistent sequences. We therefore propose two new mutation policies: uniform, which randomly modifies any structure, and length-proportional, which swaps substructures randomly in proportion to their RNA length to target the substructures that cover the most bases of a problem. In experiments on roughly fifty RNA Design problems, these substructure-based mutation methods show potential value, yielding solutions potentially hundreds of bases closer to the target folded structure than previous Simulated Annealing solutions.
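The difference between the uniform and length-proportional policies can be shown with a short sketch. It assumes a solution is represented as a list of non-empty substructure sequences and that replacements come from a pool keyed by length; this representation, the same-length swap rule, and all function names are illustrative assumptions rather than details of SIMARD.

```python
import random


def pick_uniform(solution, rng):
    """Uniform policy: every substructure is equally likely to be mutated."""
    return rng.randrange(len(solution))


def pick_length_proportional(solution, rng):
    """Length-proportional policy: longer substructures are chosen more often."""
    weights = [len(sub) for sub in solution]
    return rng.choices(range(len(solution)), weights=weights, k=1)[0]


def mutate(solution, pool, pick, rng):
    """Swap one substructure for an alternative of the same length.

    pool maps a length to candidate sequences (a stand-in for the
    database of observed substructures; hypothetical format).
    """
    index = pick(solution, rng)
    options = pool.get(len(solution[index]), [])
    if not options:
        return list(solution)  # nothing to swap in; leave the solution unchanged
    mutated = list(solution)
    mutated[index] = rng.choice(options)
    return mutated


rng = random.Random(0)
solution = ["GC", "GGGAAACCC"]  # a 2-base and a 9-base substructure
pool = {2: ["AU", "GC"], 9: ["GGGAAACCC", "GCGAAAGCC"]}
neighbour = mutate(solution, pool, pick_length_proportional, rng)
```

In this example the 9-base substructure is selected with probability 9/11 under the length-proportional policy, compared with 1/2 under the uniform policy. A simulated annealing loop would generate such neighbours and accept or reject them based on distance to the target structure.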