Search results - Inner Mongolia University Library

Learning Quantitative Sequence-Function Relationships from Massively Parallel Experiments

JOURNAL OF STATISTICAL PHYSICS, 2016, vol. 162, no. 5, pp. 1203-1243

Authors: Atwal, Gurinder S.; Kinney, Justin B. Affiliation: Cold Spring Harbor Lab, Simons Ctr Quantitat Biol, POB 100, Cold Spring Harbor, NY 11724, USA

A fundamental aspect of biological information processing is the ubiquity of sequence-function relationships: functions that map the sequence of DNA, RNA, or protein to a biochemically relevant activity. Most sequence-function relationships in biology are quantitative, but only recently have experimental techniques for effectively measuring these relationships been developed. The advent of such "massively parallel" experiments presents an exciting opportunity for the concepts and methods of statistical physics to inform the study of biological systems. After reviewing these recent experimental advances, we focus on the problem of how to infer parametric models of sequence-function relationships from the data produced by these experiments. Specifically, we retrace and extend recent theoretical work showing that inference based on mutual information, not the standard likelihood-based approach, is often necessary for accurately learning the parameters of these models. Closely connected with this result is the emergence of "diffeomorphic modes": directions in parameter space that are far less constrained by data than likelihood-based inference would suggest. Analogous to Goldstone modes in physics, diffeomorphic modes arise from an arbitrarily broken symmetry of the inference problem. An analytically tractable model of a massively parallel experiment is then described, providing an explicit demonstration of these fundamental aspects of statistical inference. This paper concludes with an outlook on the theoretical and computational challenges currently facing studies of quantitative sequence-function relationships.

Keywords: sequence-function relationships; Mutual information; Likelihood; Diffeomorphic modes; Sort-Seq
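
The mutual-information objective mentioned in the abstract above can be illustrated with a minimal sketch. It assumes (this is not stated in the record) that model predictions have already been discretized into bins and that each sequence carries a discrete measurement label, such as a sorting bin. The function name `mutual_information_bits` and the toy inputs are hypothetical; this is a plain plug-in estimator, not the estimator used by the authors.

```python
import math
from collections import Counter


def mutual_information_bits(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    xs: discretized model predictions (assumption: binned beforehand)
    ys: discrete measurement labels, e.g. sorting bins (assumption)
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written in terms of counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

Because the estimate depends only on how samples are grouped, any invertible relabeling of the predictions leaves it unchanged. That invariance is the kind of symmetry the abstract associates with parameter directions that mutual information leaves unconstrained.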

2D-RNA-coupling numbers: A new computational chemistry approach to link secondary structure topology with biological function

JOURNAL OF COMPUTATIONAL CHEMISTRY, 2007, vol. 28, no. 6, pp. 1049-1056

Authors: Gonzalez-Diaz, Humberto; Agueero-Chapin, Guillermin; Varona, Javier; Molina, Reinaldo; Delogu, Giovanna; Santana, Lourdes; Uriarte, Eugenio; Podda, Gianni. Affiliations: Univ Cagliari, Dipartimento Farmaco Chim Tecnol, I-09124 Cagliari, Italy; Univ Nacl Autonoma Mexico, Biomed Unit, FES Istacala, Tlalnepantla 54090, DF, Mexico; Univ Porto, Fac Ciencias, REQUIMTE, P-4169007 Oporto, Portugal

Methods for predicting the function of proteins, DNA, or RNA and mapping it onto sequence often rely on bioinformatics alignment approaches instead of chemical structure. Consequently, it is interesting to develop computational chemistry approaches based on molecular descriptors. In this sense, many researchers have used sequence-coupling numbers, and our group extended them to 2D protein representations. However, no coupling numbers have been reported for 2D-RNA topology graphs, which are highly branched and contain useful information. Here, we use a computational chemistry scheme: (a) transforming sequences into RNA secondary structures, (b) defining and calculating new 2D-RNA-coupling numbers, (c) seeking a structure-function model, and (d) mapping biological function onto the folded RNA. As an example, we studied 1-aminocyclopropane-1-carboxylic acid (ACC) oxidases, known as ACO, which control fruit ripening and are important for the biotechnology industry. First, we calculated tau(k)(2D-RNA) values for a set of 90 folded RNAs, including 28 transcripts of ACO and control sequences. Afterwards, we compared the classification performance of 10 different classifiers implemented in the software WEKA. In particular, the logistic equation ACO = 23.8 · tau(1)(2D-RNA) + 41.4 predicts ACOs with accuracies of 98.9%, 98.0%, and 97.8% in training, leave-one-out, and 10-fold cross-validation, respectively. With this equation we then predict ACO function for a sequence isolated in this work from Coffea arabica (GenBank accession DQ218452). The tau(1)(2D-RNA) descriptor also compares favorably with other descriptors. This equation allows us to map the codification of ACO activity onto different mRNA topology features. The present computational-chemistry approach is general and could be extended to connect RNA secondary structure topology to other functions. (C) 2007 Wiley Periodicals, Inc.

Keywords: RNA secondary structure; molecular descriptors; sequence-function relationships; coupling numbers; linear classifiers; machine learning algorithms
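
The logistic equation quoted in the abstract above, ACO = 23.8 · tau(1)(2D-RNA) + 41.4, can be read as a one-descriptor linear score. The sketch below takes the two coefficients from the abstract. Everything else is an assumption made for illustration: the standard logistic link, the 0.5 probability threshold, the convention that a positive score means "ACO", and the function names. The record does not specify how the score maps to a class, and no real tau(1) values are used here.

```python
import math

# Coefficients as quoted in the abstract.
SLOPE = 23.8
INTERCEPT = 41.4


def aco_score(tau1: float) -> float:
    """Linear score from the single descriptor tau(1)(2D-RNA)."""
    return SLOPE * tau1 + INTERCEPT


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_aco(tau1: float, threshold: float = 0.5) -> bool:
    """Assumed rule: label as ACO when the logistic probability meets the threshold."""
    return sigmoid(aco_score(tau1)) >= threshold


# Descriptor value at which the score is zero (probability 0.5 under the assumed link).
BOUNDARY = -INTERCEPT / SLOPE
```

Under these assumptions the decision boundary sits at tau(1) = -41.4 / 23.8, so the classifier reduces to a single cut on the descriptor.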

Characterization of proteins from the 3N5M family reveals an operationally stable amine transaminase

APPLIED MICROBIOLOGY AND BIOTECHNOLOGY, 2022, vol. 106, no. 17, pp. 5563-5574

Authors: Kollipara, Manideep; Matzel, Philipp; Sowa, Miriam; Brott, Stefan; Bornscheuer, Uwe; Hoehne, Matthias. Affiliations: Univ Greifswald, Inst Biochem, Prot Biochem, Felix Hausdorff Str 4, D-17489 Greifswald, Germany; Justus Liebig Univ Giessen, Inst Food Chem & Food Technol, Heinrich Buff Ring 17, D-35392 Giessen, Germany; Univ Greifswald, Inst Biochem, Dept Biotechnol & Enzyme Catalysis, Felix Hausdorff Str 4, D-17489 Greifswald, Germany

Amine transaminases (ATA) convert ketones into optically active amines and are used to prepare active pharmaceutical ingredients and building blocks. Novel ATA can be identified in protein databases thanks to the extensive knowledge of sequence-function relationships. However, predicting thermo- and operational stability from the amino acid sequence is a persisting challenge and a vital step towards identifying efficient ATA biocatalysts for industrial applications. In this study, we performed database mining and characterized selected putative enzymes of the beta-alanine:pyruvate transaminase cluster (3N5M), a subfamily with so far only a few described members, whose tetrameric structure was suggested to positively affect operational stability. Four putative transaminases (TA-1: Bilophila wadsworthia, TA-5: Halomonas elongata, TA-9: Burkholderia cepacia, and TA-10: Burkholderia multivorans) were obtained in soluble form as tetramers in E. coli. Comparing these tetrameric transaminases with known dimeric ones, we found that novel ATA with high operational stability can indeed be identified in this protein subfamily, but we also found exceptions to the hypothesized correlation that a tetrameric assembly leads to increased stability. The discovered ATA from Burkholderia multivorans features a broad substrate specificity, including isopropylamine acceptance, is highly active (6 U/mg) in the conversion of 1-phenylethylamine with pyruvate, and shows a thermostability of up to 70 °C under both storage and operating conditions. In addition, 50% (v/v) of isopropanol or DMSO can be employed as co-solvents without a destabilizing effect on the enzyme during an incubation time of 16 h at 30 °C.

Keywords: Amine transaminase; sequence-function relationships; Thermostability; Operational stability; Enzyme discovery

Predicting protein functional sites with phylogenetic motifs

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, vol. 58, no. 2, pp. 309-320

Authors: La, D; Sutch, B; Livesay, DR. Affiliations: Calif State Polytech Univ Pomona, Dept Chem, Pomona, CA 91768, USA; Calif State Polytech Univ Pomona, Dept Biol Sci, Pomona, CA 91768, USA

In this report, we demonstrate that phylogenetic motifs, sequence regions conserving the overall familial phylogeny, represent a promising approach to protein functional site prediction. Across our structurally and functionally heterogeneous data set, phylogenetic motifs consistently correspond to functional sites defined by both surface loops and active site clefts. Additionally, the partially buried prosthetic group regions of cytochrome P450 and succinate dehydrogenase are identified as phylogenetic motifs. In nearly all instances, phylogenetic motifs are structurally clustered, despite little overall sequence proximity, around key functional site features. Based on calculated false-positive expectations and standard motif identification methods, we show that phylogenetic motifs are generally conserved in sequence. This result implies that they can be considered motifs in the traditional sense as well. However, there are instances where phylogenetic motifs are not (overall) well conserved in sequence. This point is enticing, because it implies that phylogenetic motifs are able to identify key sequence regions that traditional motif-based approaches would not. Further, phylogenetic motif results are also shown to be consistent with evolutionary trace results, and bootstrapping is used to demonstrate tree significance. (C) 2004 Wiley-Liss, Inc.

Keywords: phylogenetic motif; phylogenetic tree; phylogenomics; functional site prediction; sequence-function relationships

A unified statistical potential reveals that amino acid stickiness governs nonspecific recruitment of client proteins into condensates

PROTEIN SCIENCE, 2022, vol. 31, no. 7, e4361

Authors: Villegas, Jose A.; Levy, Emmanuel D. Affiliations: Weizmann Inst Sci, Dept Chem & Struct Biol, IL-7610001 Rehovot, Israel; Univ Illinois, Coll Pharm, Dept Pharmaceut Sci, Chicago, IL 60612, USA

Membraneless organelles are cellular compartments that form by liquid-liquid phase separation of one or more components. Other molecules, such as proteins and nucleic acids, will distribute between the cytoplasm and the liquid compartment in accordance with the thermodynamic drive to lower the free energy of the system. The resulting distribution colocalizes molecular species to carry out a diversity of functions. Two factors could drive this partitioning: the difference in solvation between the dilute versus dense phase and intermolecular interactions between the client and scaffold proteins. Here, we develop a set of knowledge-based potentials that allow for the direct comparison between stickiness, which is dominated by desolvation energy, and pairwise residue contact propensity terms. We use these scales to examine experimental data from two systems: protein cargo dissolving within phase-separated droplets made from FG repeat proteins of the nuclear pore complex and client proteins dissolving within phase-separated FUS droplets. These analyses reveal a close agreement between the stickiness of the client proteins and the experimentally determined values of the partition coefficients (R > 0.9), while pairwise residue contact propensities between client and scaffold show weaker correlations. Hence, the stickiness of client proteins is sufficient to explain their differential partitioning within these two phase-separated systems without taking into account the composition of the condensate. This result implies that selective trafficking of client proteins to distinct membraneless organelles requires recognition elements beyond the client sequence composition. Statement Empirical potentials for amino acid stickiness and pairwise residue contact propensities are derived. These scales are unique in that they enable direct comparison of desolvation versus contact terms. We find that partitioning of a client protein to a condensate is best explained by amino acid sticki

Keywords: amino acid stickiness; biomolecular condensates; contact potential; interface propensity; sequence-function relationships; statistical energy

Rugged fitness landscapes minimize promiscuity in the evolution of transcriptional repressors

Cell Systems, 2024, vol. 15, no. 4, pp. 374-387.e6

Authors: Meger, Anthony T.; Spence, Matthew A.; Sandhu, Mahakaran; Matthews, Dana; Chen, Jackie; Jackson, Colin J.; Raman, Srivatsan. Affiliations: Department of Biochemistry, University of Wisconsin-Madison, Madison 53706, WI, United States; Research School of Chemistry, Australian National University, Canberra 2601, ACT, Australia; Research School of Biology, Australian National University, Canberra 2601, ACT, Australia; ARC Centre of Excellence for Innovations in Peptide & Protein Science, Research School of Chemistry, Australian National University, Canberra 2601, ACT, Australia; ARC Centre of Excellence for Innovations in Synthetic Biology, Research School of Chemistry, Australian National University, Canberra 2601, ACT, Australia; Department of Bacteriology, University of Wisconsin-Madison, Madison 53706, WI, United States; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison 53706, WI, United States

How a protein's function influences the shape of its fitness landscape, smooth or rugged, is a fundamental question in evolutionary biochemistry. Smooth landscapes arise when incremental mutational steps lead to a progressive change in function, as commonly seen in enzymes and binding proteins. On the other hand, rugged landscapes are poorly understood because of the inherent unpredictability of how sequence changes affect function. Here, we experimentally characterize the entire sequence phylogeny, comprising 1,158 extant and ancestral sequences, of the DNA-binding domain (DBD) of the LacI/GalR transcriptional repressor family. Our analysis revealed an extremely rugged landscape with rapid switching of specificity, even between adjacent nodes. Further, the ruggedness arises from the need for the repressor to simultaneously evolve specificity for asymmetric operators and disfavor potentially adverse regulatory crosstalk. Our study provides fundamental insight into evolutionary, molecular, and biophysical rules of genetic regulation through the lens of fitness landscapes. © 2024 Elsevier Inc.

Keywords: ancestral sequence reconstruction; ASR; deep mutational scanning; DMS; epistasis; fitness landscape; protein evolution; sequence-function relationships; transcription regulators
