Search results - Inner Mongolia University Library

A method of comparing protein molecular surface based on normal vectors with attributes and its application to function identification

INFORMATION SCIENCES, 2002, vol. 146, no. 1-4, pp. 41-54

Authors: Kaneta, Y; Shoji, N; Ohkawa, T; Nakamura, H. Affiliations: Osaka Univ, Grad Sch Informat Sci & Technol, Engn Dept Multimedia Engn, Suita, Osaka 5650871, Japan; Osaka Univ, Inst Prot Res, Suita, Osaka 5650871, Japan

Recent research has clarified that the function of a protein depends on its molecular surface. This suggests that protein function can be identified by molecular surface comparison, in which the molecular surface of an unknown protein is compared with many surfaces of known active sites used as reference templates. This paper presents an effective surface comparison method that uses normal vectors with attributes describing the curvature and the physical properties of projections and depressions. To reduce computational complexity, the vectors to be matched are limited by extracting pairs of vectors at similar relative positions and with similar surface attributes. The proposed method was applied to 11 surface data sets. As a result, the mean calculation time was about 3 min, and comparison at an approximately optimal location was possible. The method was then applied to 103 surface data sets, and the identification accuracy was 95.2%. (C) 2002 Elsevier Science Inc. All rights reserved.

Keywords: protein active site; molecular surface comparison; protein function identification; normal vector

FASTA-SWAP and FASTA-PAT: Pattern database searches using combinations of aligned amino acids, and a novel scoring theory

JOURNAL OF MOLECULAR BIOLOGY, 1996, vol. 259, no. 4, pp. 840-854

Authors: Ladunga, I; Wiese, BA; Smith, RF. Affiliation: EOTVOS LORAND UNIV, DEPT GENET, H-1088 BUDAPEST, HUNGARY

We introduce two new pattern database search tools that utilize statistical significance and information theory to improve protein function identification. Both the general pattern scoring theory with the specific matrices introduced here and the low redundancy of pattern databases increase search sensitivity and selectivity. Pattern scoring preferentially rewards matches at conserved positions in a pattern with higher scores than matches at variable positions, and assigns more negative scores to mismatches at conserved positions than to mismatches at variable positions. The theory of pattern scoring can be used to create log-odds pattern scores for patterns derived from any set of multiple alignments. This theoretical framework can be used to adapt existing sequence database search tools to pattern analysis. Our FASTA-SWAP and FASTA-PAT tools are extensions of the FASTA program that search a sequence query against a pattern database. In the first step, FASTA-SWAP searches the diagonals of the query sequence and the library pattern for high-scoring segments, while FASTA-PAT performs an extended version of hashing. In the second step, both methods refine the alignments and the scores using dynamic programming. The tools utilize an extremely compact binary representation of all possible combinations of amino acid residues in aligned positions. Our FASTA-SWAP and FASTA-PAT tools are well suited for functional identification of distant relatives that may be missed by sequence database search methods. FASTA-SWAP and FASTA-PAT searches can be performed using our World-Wide Web Server (http://***:9331/seq-search/Options/***1). (C) 1996 Academic Press Limited

Keywords: protein function identification; amino acid sequence pattern; protein database search; scoring theory; FASTA
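
The scoring behaviour described in the abstract above (larger rewards for matches and larger penalties for mismatches at conserved positions than at variable ones) can be illustrated with a small log-odds sketch. This is not the authors' scoring matrices or the FASTA-SWAP/FASTA-PAT implementation: the uniform background frequencies, the pseudocount rule (total pseudocount equal to the number of distinct residues in a column), the toy alignment, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: uniform background frequencies (real tools use observed ones).
BACKGROUND = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}


def column_scores(column):
    """Log-odds (base 2) score of each residue for one alignment column.

    Assumption: the total pseudocount equals the number of distinct
    residues seen in the column, so conserved columns are smoothed less.
    """
    counts = Counter(column)
    n = len(column)
    b = len(counts)  # diversity-scaled pseudocount mass
    return {
        aa: math.log2(((counts[aa] + b * BACKGROUND[aa]) / (n + b)) / BACKGROUND[aa])
        for aa in AMINO_ACIDS
    }


def pattern_scores(alignment):
    """Build one score table per aligned position (hypothetical helper)."""
    return [column_scores(col) for col in zip(*alignment)]


def score_segment(pattern, segment):
    """Sum the per-position scores of an ungapped segment."""
    return sum(col[aa] for col, aa in zip(pattern, segment))


# Toy alignment: position 0 is fully conserved, position 1 is variable.
alignment = ["CAH", "CGH", "CSH", "CTH", "CLH", "CVH", "CKH", "CEH"]
pattern = pattern_scores(alignment)
conserved, variable = pattern[0], pattern[1]

print(f"match    conserved={conserved['C']:.2f}  variable={variable['A']:.2f}")
print(f"mismatch conserved={conserved['W']:.2f}  variable={variable['W']:.2f}")
```

Under these assumptions, a match at the conserved position scores higher than a match at the variable position, and a mismatch at the conserved position scores lower than a mismatch at the variable one, which is the qualitative property the abstract describes.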

Aligning Discovered Patterns from Protein Family Sequences

7th International-Association-for-Pattern-Recognition (IAPR) International Conference on Pattern Recognition in Bioinformatics (PRIB)

Authors: Lee, En-Shiun Annie; Zhuang, Dennis; Wong, Andrew K. C. Affiliation: Univ Waterloo, Ctr Pattern Anal & Machine Intelligence, Waterloo, ON N2L 3G1, Canada

ISBN: (print) 9783642341236

A basic task in protein analysis is to discover a set of sequence patterns that characterizes the function of a protein family. To address this task, we introduce a synthesized pattern representation called Aligned Pattern (AP) Cluster to discover potential functional segments in protein sequences. We apply our algorithm to identify and display the binding segments for the Cytochrome C and Ubiquitin protein families. The resulting AP Clusters correspond to protein binding segments that surround the binding residues. When compared to the results from the protein annotation databases PROSITE and Pfam, ours are more efficient in computation and comprehensive in quality. The significance of the AP Cluster is that it is able to capture subtle variations of the binding segments in protein families. It could thus help to reduce time-consuming simulations and experimentation in protein analysis.

Keywords: Protein Analysis; Protein Function Identification; Pattern Discovery; Pattern Clustering; Hierarchical Clustering; Motif Finding; Local Alignment; Approximate String Matching

Factors limiting the performance of prediction-based fold recognition methods

PROTEIN SCIENCE, 1999, vol. 8, no. 4, pp. 750-759

Authors: de la Cruz, X; Thornton, JM. Affiliation: Univ London, Univ Coll, Dept Biochem & Mol Biol, London WC1E 6BT, England

In the past few years, a new generation of fold recognition methods has been developed, in which the classical sequence information is combined with information obtained from secondary structure and, sometimes, accessibility predictions. The results are promising, indicating that this approach may compete with potential-based methods (Rost B et al., 1997, J Mol Biol 270:471-480). Here we present a systematic study of the different factors contributing to the performance of these methods, in particular when applied to the problem of fold recognition of remote homologues. Our results indicate that secondary structure and accessibility prediction methods have reached an accuracy level where they are not the major factor limiting the accuracy of fold recognition. The pattern degeneracy problem is confirmed as the major source of error of these methods. On the basis of these results, we study three different options to overcome these limitations: normalization schemes, mapping of the coil state into the different zones of the Ramachandran plot, and post-threading graphical analysis.

Keywords: fold recognition; protein function identification; protein structure prediction; remote homologues; secondary structure and accessibility predictions; sequence annotation; threading

Ranking and compacting binding segments of protein families using aligned pattern clusters

PROTEOME SCIENCE, 2013, vol. 11, issue 1-Sup, p. S8

Authors: Lee, En-Shiun Annie; Wong, Andrew K. C. Affiliation: Univ Waterloo, Dept Syst Design Engn, Waterloo, ON N2L 3G1, Canada

Background: Discovering sequence patterns with variation can unveil functions of a protein family that are important for drug discovery. Exploring protein families using existing methods such as multiple sequence alignment is computationally expensive, thus pattern search, called motif finding in Bioinformatics, is used. However, at present, combinatorial algorithms result in large sets of solutions, and probabilistic models require a richer representation of the amino acid associations. To overcome these shortcomings, we present a method for ranking and compacting these solutions in a new representation referred to as Aligned Pattern Clusters (APCs). To tackle the problem of a large solution set, our method reveals a reduced set of candidate solutions without losing any information. To address the problem of representation, our method captures the amino acid associations and conservations of the aligned patterns. Our algorithm renders a set of APCs in which a set of patterns is discovered, pruned, aligned, and synthesized from the input sequences of a protein family. Results: Our algorithm identifies the binding or other functional segments and their embedded residues, which are important drug targets, from the cytochrome c and the ubiquitin protein families taken from UniProt. The results are independently confirmed by Pfam's multiple sequence alignment. For the cytochrome c protein, the number of resulting patterns with variations is reduced by 76.62% from the number of original patterns without variations. Furthermore, all of the top four candidate APCs correspond to the binding segments, each with one of its conserved amino acids as the binding residue. The discovered proximal APCs agree with Pfam and PROSITE results. Surprisingly, the distal binding site discovered by our algorithm is discovered by neither Pfam nor PROSITE, but is confirmed by the three-dimensional cytochrome c structure. When applied to the ubiquitin protein family, our results agree with Pfam and rev

Keywords: Protein Analysis; Protein Function Identification; Pattern Discovery; Pattern Clustering; Hierarchical Clustering; Pattern Search; Motif Finding; Local Alignment; Drug Discovery
