Background: Insertions and deletions (indels) are a common type of sequence variation that remains comparatively little studied and raises many important biological questions. Recent research has shown that the presence of sizable indels in protein sequences may be indicative of protein essentiality and of the proteins' role in protein interaction networks. Examples of the use of indels for structure-based drug design have also recently been demonstrated. Nonetheless, many structural and functional characteristics of indels remain under-researched or unknown. Description: We have created a web-based resource, Indel PDB, a structural database of insertions/deletions identified from the sequence alignments of highly similar proteins found in the Protein Data Bank (PDB). Indel PDB uses the large amount of available structural information to characterize 1-, 2- and 3-dimensional features of indel sites. Indel PDB contains 117,266 non-redundant indel sites extracted from 11,294 indel-containing proteins. Unlike loop databases, Indel PDB features more indel sequences with secondary structures, including alpha-helices and beta-sheets in addition to loops. The insertion fragments have been characterized by their sequences, lengths, locations, secondary structure composition, solvent accessibility, protein domain association and three-dimensional structures. Conclusion: Using the data available in Indel PDB, we have studied and present here several sequence and structural features of indels. We anticipate that Indel PDB will not only enable future functional studies of indels, but will also assist protein modeling efforts and the identification of indel-directed drug binding sites.
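The abstract above says that indel sites are identified from sequence alignments of highly similar proteins. A minimal sketch of that idea follows, assuming the two sequences are already aligned and use `-` as the gap character. The function name `find_indels`, the `Indel` record and the example sequences are hypothetical and are not taken from Indel PDB, whose actual pipeline is not described here.

```python
from typing import List, NamedTuple


class Indel(NamedTuple):
    kind: str      # "insertion" or "deletion", relative to the first sequence
    start: int     # 0-based alignment column where the gap run starts
    length: int    # number of alignment columns in the gap run
    fragment: str  # residues found opposite the gap


def find_indels(aligned_a: str, aligned_b: str, gap: str = "-") -> List[Indel]:
    """Report each run of gap columns in a pairwise alignment as one indel site."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    indels: List[Indel] = []
    n = len(aligned_a)
    i = 0
    while i < n:
        a_gap = aligned_a[i] == gap
        b_gap = aligned_b[i] == gap
        if a_gap == b_gap:
            # Either both columns hold residues or both hold gaps: not an indel.
            i += 1
            continue
        # Extend the run while the same sequence (and only that one) is gapped.
        j = i
        while j < n and (aligned_a[j] == gap) == a_gap and (aligned_b[j] == gap) == b_gap:
            j += 1
        if b_gap:
            # Residues present in A but missing in B: insertion in A.
            indels.append(Indel("insertion", i, j - i, aligned_a[i:j]))
        else:
            # Residues present in B but missing in A: deletion in A.
            indels.append(Indel("deletion", i, j - i, aligned_b[i:j]))
        i = j
    return indels


# Made-up aligned sequences, for illustration only.
sites = find_indels("MKTAYIAKQR--LG", "MKT--IAKQRGGLG")
```

Each reported site carries its length and the inserted fragment, which is the kind of record that could then be annotated with secondary structure or solvent accessibility.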
Background: Entry of HIV-1 into human lymphoid cells requires the activities of the viral envelope glycoproteins gp120 and gp41 and of two host-cell proteins, the primary receptor CD4 and a chemokine co-receptor. In addition, a third cell-surface protein, protein disulfide isomerase (PDI), has been found to play a major role in HIV-1 entry. PDI is capable of mediating thiol-disulfide interchange reactions and could enable the reduction of gp120 disulfide bonds, which triggers the major conformational changes in gp120 and gp41 required for virus entry. In this scenario, HIV-1 entry can be inhibited by introducing agents that block the thiol-disulfide interchange reaction of cell-surface PDI. There have been studies with agents that inhibit PDI activity, but the exact mode of binding remains to be elucidated; this might provide insights for developing new drugs that target PDI. This study attempts to characterize the mode of binding of dithionitrobenzoic acid (DTNB) and its structurally related compounds to the PDI enzyme. Results: We performed molecular docking simulations of six different inhibitors (ligands), namely DTNB, NSC695265, thionitrobenzoic acid, 2-nitro-5-thiocyanobenzoic acid, 2-nitro-5-sulfo-sulfonyl-benzoic acid and NSC517871, into the redox-active site [C37-G38-H39-C40] of the PDI enzyme, and the activity was inferred from redox inhibitory models. All ligands showed favorable interactions, and most of them seemed to bind to the hydrophobic amino acids Ala34, Trp36, Cys37, Cys40, His39, Thr68 and Phe80. The redox inhibitory conformations were energetically and statistically favored and supported the evidence from wet-laboratory experiments reported in the literature. Conclusion: We demonstrated that in silico docking experiments can be effectively carried out to recognize the redox inhibitory models of PDI with inhibitor molecules. Interestingly, we found that the number of docked clusters with each ligand varies in the range of five to eight and conveys that the binding s
Background: Accurate small molecule binding site information for a protein can facilitate studies in drug docking, drug discovery and function prediction, but small molecule binding site annotation of protein sequences is sparse. The Small Molecule Interaction Database (SMID), a database of protein domain-small molecule interactions, was created using structural data from the Protein Data Bank (PDB). More importantly, it provides a means to predict small molecule binding sites on proteins with a known or unknown structure and, unlike prior approaches, removes large numbers of false positive hits arising from transitive alignment errors, non-biologically significant small molecules and crystallographic conditions that overpredict binding sites. Description: Using a set of co-crystallized protein-small molecule structures as a starting point, SMID interactions were generated by identifying protein domains that bind to small molecules, using NCBI's Reverse Position Specific BLAST (RPS-BLAST) algorithm. SMID records are available for viewing at http://smid.blueprint.org. The SMID-BLAST tool provides accurate transitive annotation of small-molecule binding sites for proteins not found in the PDB. Given a protein sequence, SMID-BLAST identifies domains using RPS-BLAST and then lists potential small molecule ligands based on SMID records, as well as their aligned binding sites. A heuristic ligand score is calculated from the E-value, ligand residue identity and domain entropy to assign a level of confidence to the hits found. SMID-BLAST predictions were validated against a set of 793 experimental small molecule interactions from the PDB; 472 (60%) of the predicted interactions identically matched the experimental small molecule, and of these, 344 had greater than 80% of the binding site residues correctly identified. Further, we estimate that 45% of the predictions that were not observed in the PDB validation set may be true positives. Conclusion: By focusing on protein doma
Background: Proteins that are similar in sequence or structure may perform different functions in nature. In such cases, function cannot be inferred from sequence or structural similarity. Results: We analyzed experimental structures belonging to the Structural Classification of Proteins (SCOP) database and showed that about half of them belong to multi-functional fold families, for which protein similarity alone is not adequate to assign function. We also analyzed predicted structures from the LiveBench and the PDB-CAFASP experiments and showed that accurate homology-based functional assignments cannot be achieved approximately one third of the time when the protein is a member of a multi-functional fold family. We then conducted an extended performance evaluation and comparison on both experimental and predicted structures using our previously developed Functional Signatures from Structural Alignments (FSSA) algorithm, which was designed to classify proteins belonging to multi-functional fold families. Conclusion: The results indicate that the FSSA algorithm is more accurate than homology-based approaches for the functional classification of both experimental and predicted protein structures, in part because it uses local, as opposed to global, information for classifying function. The FSSA algorithm has also been implemented as a webserver and is available at http://***/fssa.
Background: The number of protein structures from structural genomics centers in the Protein Data Bank (PDB) is increasing dramatically. Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. Results: Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties based on structure-function relationships. The assignments were made for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that four hypothetical proteins (PDB accession codes 1VH0, 1NS5, 1O6D and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. Conclusion: We suggest that the structure-based prediction of an EC number should be conducted with a different similarity score cutoff for each protein fold. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures whose functions are marked as "unknown" in the PDB file.
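The two recommendations in the conclusion above (a fold-specific score cutoff, and agreement between two different algorithms) can be combined into one decision rule. The sketch below is only an illustration under stated assumptions: higher scores are assumed to mean greater similarity, and the function `consensus_ec`, the `Hit` record, the fold label and all cutoff values are hypothetical rather than taken from PDB-UF.

```python
from typing import Dict, NamedTuple, Optional


class Hit(NamedTuple):
    ec_number: str  # EC number of the best structural match
    score: float    # similarity score reported by the method (higher = more similar)


def consensus_ec(
    fold: str,
    hit_a: Hit,
    hit_b: Hit,
    cutoffs_a: Dict[str, float],
    cutoffs_b: Dict[str, float],
) -> Optional[str]:
    """Return an EC number only if both methods pass their fold-specific cutoff and agree."""
    if fold not in cutoffs_a or fold not in cutoffs_b:
        return None  # no calibrated cutoff for this fold: make no assignment
    if hit_a.score < cutoffs_a[fold] or hit_b.score < cutoffs_b[fold]:
        return None  # at least one method is below its cutoff for this fold
    if hit_a.ec_number != hit_b.ec_number:
        return None  # the two methods disagree: treat as a likely false positive
    return hit_a.ec_number


# Hypothetical cutoffs and hits, for illustration only.
cutoffs_method_a = {"example_fold": 0.60}
cutoffs_method_b = {"example_fold": 4.0}
assigned = consensus_ec(
    "example_fold",
    Hit("2.1.1.-", 0.72),
    Hit("2.1.1.-", 5.1),
    cutoffs_method_a,
    cutoffs_method_b,
)
```

Returning `None` rather than a low-confidence guess reflects the stated goal of reducing false positive assignments; a real system would need cutoffs calibrated per fold and per method.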
The fold and biochemical activity of a protein are tightly coupled. Once a protein is characterized, it is usual to determine its structure in order to derive an atomic description of its molecular mechanism. The fold reveals interaction surfaces, ligand-binding pockets, and the precise juxtaposition of functional groups.
Background: Two largely unsolved but increasingly urgent problems for modern biologists are a) to analyse protein structures quickly and easily and b) to comprehensively mine the wealth of information that is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools that address these problems need to be highly flexible and powerful, but at the same time must be freely available and easy to learn. Results: We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk, implemented in Objective-C, and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of easy-to-learn, programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given, and one such application, PDBChainSaw, is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility for PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page http://*** following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its coordinates and provides additional information from the PDB-file header section, such as the organism's scientific name, compound name, and EC code. Conclusion: MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications:
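As an illustration of the PDBChainSaw step described in the abstract above (extracting each chain's sequence from its coordinates), here is a minimal Python sketch. It is not MolTalk code and does not reproduce PDBChainSaw; it assumes the fixed-column legacy PDB format, reads only C-alpha `ATOM` records of the first model, and maps non-standard residues to `X`. The function name `chain_sequences` is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines: Iterable[str]) -> Dict[str, str]:
    """Build one sequence per chain from the C-alpha ATOM records of the first model."""
    residues: Dict[str, List[str]] = {}
    seen: Set[Tuple[str, str]] = set()
    for raw in pdb_lines:
        line = raw.rstrip("\n").ljust(80)   # pad so column slices never run short
        record = line[0:6].strip()
        if record == "ENDMDL":
            break                           # only the first model is read
        if record != "ATOM" or line[12:16].strip() != "CA":
            continue
        chain_id = line[21]                 # column 22: chain identifier
        residue_key = (chain_id, line[22:27])  # columns 23-27: number + insertion code
        if residue_key in seen:
            continue                        # skip alternate locations of the same residue
        seen.add(residue_key)
        res_name = line[17:20].strip()      # columns 18-20: residue name
        residues.setdefault(chain_id, []).append(THREE_TO_ONE.get(res_name, "X"))
    return {chain: "".join(codes) for chain, codes in residues.items()}
```

A typical call would pass an open file, for example `chain_sequences(open(path))` with `path` pointing at a local PDB-format file. Residues without resolved coordinates do not appear in a sequence built this way, so it can differ from the sequence given in the file's `SEQRES` header records.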
Background: The integration of many aspects of protein/DNA structure analysis is an important requirement for software products in the general area of structural bioinformatics. In fact, there are too few software packages on the internet that can be described as successful in this respect. What is still missing is publicly available, web-based software for interactive analysis of the sequence/structure/function of proteins and their complexes with DNA and ligands. Some existing software packages do have a certain level of integration and do offer analysis of several structure-related parameters, but not to the extent generally demanded by users. Results: We report here a new version of the Sting Millennium Suite (SMS), which is fully accessible (including for local files at the client end), web-based software for molecular structure and sequence/structure/function analysis. The new SMS client version is now also operational on Linux, and it works with non-public PDB-formatted files (structures not deposited at the RCSB/PDB), eliminating the earlier requirement for registration if SMS components were to be used with a user's local files. At the same time, the new SMS offers some important additions and improvements, such as a link to ProTherm and a significant re-engineering of the SMS component ConSSeq. We have also added three new SMS mirror sites to the existing network of global SMS servers: Argentina, Japan and Spain. Conclusion: SMS is an already established software package, and many key database and software servers worldwide either link to or host SMS. SMS (Sting Millennium Suite) is web-based, publicly available software developed to aid researchers in their quest to translate information about the structures of macromolecules into knowledge. SMS allows a user to interactively analyze molecular structures, cross-referencing visualized information with a correlated one, available acro