Coronavirus disease 2019 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We have used bioinformatics to investigate seventeen mutations in the spike protein of SARS-CoV-2, as this protein mediates infection of human cells and is the target of most vaccine strategies and antibody-based therapies. Two mutations, H146Y and S221W, were identified as being most pathogenic. The mutations D614G, A829T, and P1263L might also have deleterious effects on protein function. We hypothesized that candidate small molecules may be repurposed to combat viral infection. We investigated changes in the binding energies between the ligands and the mutant proteins using molecular docking. Protein-protein interactions are also critical for understanding cellular function and organization. Protein-protein docking of naive and mutated structures of the SARS-CoV-2 S protein was evaluated for binding energy with the angiotensin-converting enzyme 2 (ACE2). These interactions might limit the binding of the SARS-CoV-2 spike protein to the ACE2 receptor or may have a deleterious effect on protein function that may limit infection. These results may have important implications for the transmission of SARS-CoV-2, its pathogenesis, and the potential for drug repurposing and immune therapies.
ISBN (print): 9781467343589; 9781467343572
This paper presents an in-depth look at how FPGA computing can offer substantial speedups in the execution of bioinformatics algorithms, with specific results achieved to date for a broad range of algorithms. Examples and case studies are presented for sequence comparison (BLAST, CAST), multiple sequence alignment (MAFFT, T-Coffee), RNA and protein secondary structure prediction (Zuker, Predator), gene prediction (Glimmer/GlimmerHMM) and phylogenetic tree computation (RAxML), running on mainstream FPGA technologies as well as high-end FPGA-based systems (Convey HC1, BeeCube). This work also presents technological and other obstacles that need to be overcome in order for FPGA computing to become a mainstream technology in bioinformatics.
Many parallel and distributed strategies have been created to reduce the execution time of bioinformatics algorithms. One well-known bioinformatics algorithm is Smith-Waterman, which may be parallelized using the wavefront method. When the wavefront is distributed across many heterogeneous nodes, it must be balanced to create a synchronous data flow. This is a very challenging problem if the nodes have variable computational power. This paper presents an agent-based solution for parallel biological sequence comparison applications that use the multi-node wavefront method. In our approach, autonomous agents are able to identify unbalanced computations and dynamically rebalance the load among the nodes. Two strategies were developed for the balancer agent to identify whether the computations are balanced, one using global information and the other using only local information. The global strategy demands a huge amount of data transfer, incurring more communication, whereas the local strategy can decide on the balancing status using only local information. The results show that the balancing gains of the two strategies are very close. Thus, the local strategy is preferred, since it can be implemented in real wavefront balancers with almost the same benefits as the global strategy. (C) 2014 Elsevier Ltd. All rights reserved.
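The wavefront method mentioned in this abstract can be illustrated with a minimal single-process sketch. It is not the paper's agent-based balancer: it only shows why the Smith-Waterman matrix can be filled one anti-diagonal at a time. The function name and the linear gap scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def smith_waterman_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score of a and b.

    The DP matrix is filled one anti-diagonal (wavefront) at a time.
    Scoring values are illustrative assumptions, with a linear gap penalty.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # All cells with the same i + j = d form one wavefront. Each cell reads
    # only cells from wavefronts d - 1 and d - 2, so the cells of a single
    # wavefront are independent and could be computed in parallel.
    for d in range(2, n + m + 1):
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # gap in b
                H[i][j - 1] + gap,    # gap in a
            )
            best = max(best, H[i][j])
    return best
```

In a multi-node setting, each node would own a block of columns and pass its border cells to its neighbour after every wavefront step, which is why nodes of unequal speed stall the whole pipeline unless the blocks are rebalanced.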
Background: Recent efforts in HIV-1 vaccine design have focused on immunogens that evoke potent neutralizing antibody responses to a broad spectrum of viruses circulating worldwide. However, the development of effective vaccines will depend on the identification and characterization of the neutralizing antibodies and their epitopes. We developed bioinformatics methods to predict epitope networks and antigenic determinants using structural information, as well as corresponding genotypes and phenotypes generated by a highly sensitive and reproducible neutralization assay. A total of 282 clonal envelope sequences from a multiclade panel of HIV-1 viruses were tested in viral neutralization assays with an array of broadly neutralizing monoclonal antibodies (mAbs: b12, PG9, PG16, PGT121-128, PGT130-131, PGT135-137, PGT141-145, and PGV04). We correlated IC50 titers with the envelope sequences, and used this information to predict antibody epitope networks. Structural patches were defined as amino acid groups based on solvent accessibility, radius, atomic depth, and interaction networks within 3D envelope models. We applied a boosted algorithm consisting of multiple machine-learning and statistical models to evaluate these patches as possible antibody epitope regions, evidenced by strong correlations with the neutralization response for each antibody. Results: We identified patch clusters with significant correlation to IC50 titers as sites that impact neutralization sensitivity and therefore are potentially part of the antibody binding sites. Predicted epitope networks were mostly located within the variable loops of the envelope glycoprotein (gp120), particularly in V1/V2. Site-directed mutagenesis experiments involving residues identified as epitope networks across multiple mAbs confirmed association of these residues with loss or gain of neutralization sensitivity. Conclusions: Computational methods were implemented to rapidly survey protein structures and predict epitope networks ass
RNA structures are widely distributed across all life forms. The global conformation of these structures is defined by a variety of constituent structural units such as helices, hairpin loops, kissing-loop motifs and pseudoknots, which often behave in a modular way. Their ubiquitous distribution is associated with a variety of functions in biological processes. The location of these structures in the genomes of RNA viruses is often coordinated with specific processes in the viral life cycle, where the presence of the structure acts as a checkpoint for deciding the eventual fate of the process. These structures have been found to adopt complex conformations and exert their effects by interacting with ribosomes, multiple host translation factors and small RNA molecules like miRNA. A number of such RNA structures have also been shown to regulate translation in viruses at the level of initiation, elongation or termination. The role of various computational studies in the preliminary identification of such sequences and/or structures and subsequent functional analysis has not been fully appreciated. This review aims to summarize the processes in which viral RNA structures have been found to play an active role in translational regulation, their global conformational features and the bioinformatics/computational tools available for the identification and prediction of these structures.
A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been reported in original research papers, yet this often neglected property has not previously been reviewed in a systematic manner and for a wider audience. We provide a review of the space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we face challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. (C) 2017 Elsevier B.V. All rights reserved.
Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation.
ISBN (print): 9781509030507
It is critical to be able to identify longitudinally changing genes in temporal data so that studies can be focused on how gene expression changes in a dynamic way. While biological networks continue to play a significant role in modeling and characterizing complex relationships in biological systems, most network modeling studies in biomedical research focus on snapshot or "static" network-based analysis to identify genes of interest. In this study, we use a temporal non-sampling network-based approach to identify and rank genes that exhibit significant co-expression variation over time. We use the C. elegans gene correlation network obtained from mRNA expression profiles to illustrate the value of the proposed approach. We compare the results of this method to results obtained from traditional statistical analysis that focuses on identifying simple differentially expressed genes. We show that rank-based temporal network analysis can identify genes that contribute to changes in the network structure and consequently contribute to changes in the genetic regulatory machine.
ISBN (print): 9789814335058
High-throughput sequencing is currently a major transforming technology in biology. In this paper, we study a population genomics problem motivated by the newly available short-read data from high-throughput sequencing. In this problem, we are given short reads collected from individuals in a population. The objective is to infer haplotypes from the given reads. We first formulate the computational problem of haplotype inference with short reads. Based on a simple probabilistic model of short reads, we present a new approach for inferring haplotypes directly from the given reads (i.e. without first calling genotypes). Our method finds the most likely haplotypes whose local genealogical history can be approximately modeled as a perfect phylogeny. We show that, when there is no recombination, the optimal haplotypes under this objective can be found using integer linear programming for many modest-sized data sets. We then develop a related heuristic method which can work with larger data and also allows recombination. Simulation shows that the performance of our method is competitive with alternative approaches.
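The perfect-phylogeny condition that this abstract relies on can be checked with the classical four-gamete test. The sketch below is not the paper's integer linear programming formulation; it only tests whether a given set of binary haplotypes is compatible with a perfect phylogeny (no recombination, at most one mutation per site). The function name and the encoding of haplotypes as equal-length strings of `0` and `1` are assumptions made for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on equal-length binary haplotype strings.

    Returns False as soon as some pair of sites shows all four
    combinations 00, 01, 10 and 11 across the haplotypes.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct allele pairs observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True
```

A candidate haplotype set that fails this test cannot be explained by a perfect phylogeny, which is the situation the abstract's heuristic method handles by allowing recombination.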
ISBN (print): 9783540693833
The paper describes issues related to network traffic analysis. The scope of this article includes a discussion of the problem of network traffic identification and classification. Furthermore, the paper presents two bioinformatics methods: Clustal and Center Star. Both methods were adapted specifically for network security purposes. Both methods use the concept of extracting a common subsequence, based on multiple sequence alignment of more than two network attack signatures. This concept was inspired by bioinformatics solutions to problems of finding similarities in a set of DNA, RNA or amino acid sequences. Additionally, the paper includes a detailed description of the test procedures and their results. Finally, some relevant evaluations and conclusions regarding both methods are presented.
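As a rough illustration of the Center Star method named in this abstract, the sketch below performs its first step in the textbook form: choosing the sequence with the smallest total distance to all others as the center against which the rest are aligned. The abstract does not describe how the method was adapted to attack signatures, so the unit-cost edit distance, the function names, and the sample strings are all assumptions.

```python
def edit_distance(s, t):
    """Unit-cost Levenshtein distance, computed row by row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(
                prev[j] + 1,               # delete cs
                cur[j - 1] + 1,            # insert ct
                prev[j - 1] + (cs != ct),  # substitute or match
            ))
        prev = cur
    return prev[-1]


def pick_center(seqs):
    """Index of the sequence minimizing the summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return min(range(len(seqs)), key=totals.__getitem__)


# Hypothetical signature fragments, for illustration only.
signatures = ["aaaa", "aaab", "aabb"]
center = signatures[pick_center(signatures)]
```

In the full method, every other sequence would then be pairwise aligned to the center and the alignments merged, after which the columns shared by all sequences give the common subsequence.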