Search Results - Inner Mongolia University Library

Author: Sun, Bingjun (PennState University Libraries)

Degree level: Doctor of Philosophy

Traditional generic search engines that rely on textual keyword matching do not support domain-specific searches, yet different domains have different search needs. For example, chemical research is molecule-centric rather than document-centric. A chemical molecule can usually be represented in multiple ways, e.g., as textual chemical entities such as chemical names and formulae, or as chemical structures such as 2D and 3D graphs. Thus, in chemoinformatics, chemical entity searches and chemical structure searches are more important than simple document searches using keyword matching. In this work, we show how to build a domain-specific search engine that enables both entity and 2D graph searches for chemical molecules. First, documents are collected from the Web and then preprocessed using document classification and segmentation. We apply Support Vector Machines for classification and propose a novel method of text segmentation. Then chemical entities in the documents are tagged and indexed to provide fast searches. Simultaneously, chemical structure information is collected, processed, and indexed for fast graph *** issues exist to support textual chemical entity searches. Chemical names and formulae usually appear in chemical documents when the corresponding molecules are mentioned, but a chemical molecule can have different textual representations. A simple keyword search would retrieve only the exact match and not the others. Additionally, ambiguous non-chemical terms such as "He" are retrieved. We show how chemical entity searches can improve the relevance of returned documents by avoiding those ambiguous terms. Our search engine first extracts chemical entities from text, performs novel indexing suitable for chemical names and formulae, and supports different query models that a scientist may require. We propose a model of hierarchical conditional random fields for entity tagging that considers long-term dependencies at the

Keywords: information retrieval; information extraction; entity tagging; entity search; query model; graph mining; graph indexing; graph search; feature selection; chemoinformatics
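
The abstract above notes that one molecule can be written in several textual forms and that exact keyword matching finds only one of them. The following is a minimal sketch of that idea for formulae only, not the thesis's actual indexing method: it assumes flat formulae (no parentheses, charges, or isotopes) and maps each one to a canonical Hill-order key. The helper names `element_counts`, `hill_key`, and `build_index`, and the sample documents, are hypothetical.

```python
import re
from collections import Counter, defaultdict

# One element symbol (capital letter, optional lowercase letter) and an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def element_counts(formula):
    """Count atoms in a flat formula such as 'CH3COOH' (assumption: no parentheses or charges)."""
    counts = Counter()
    pos = 0
    for match in _TOKEN.finditer(formula):
        if match.start() != pos:  # a character the pattern could not consume
            raise ValueError(f"cannot parse {formula!r}")
        symbol, number = match.groups()
        counts[symbol] += int(number) if number else 1
        pos = match.end()
    if not formula or pos != len(formula):
        raise ValueError(f"cannot parse {formula!r}")
    return counts

def hill_key(formula):
    """Canonical key in Hill order: C, then H, then the rest alphabetically
    (fully alphabetical when the formula has no carbon)."""
    counts = element_counts(formula)
    if "C" in counts:
        rest = sorted(e for e in counts if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] > 1 else "") for e in order)

def build_index(docs):
    """Map each canonical key to the set of document ids that mention it."""
    index = defaultdict(set)
    for doc_id, formulae in docs.items():
        for formula in formulae:
            index[hill_key(formula)].add(doc_id)
    return index

# Hypothetical documents whose formulae have already been tagged.
docs = {"doc1": ["CH3COOH"], "doc2": ["C2H4O2", "H2O"], "doc3": ["NaCl"]}
index = build_index(docs)
hits = sorted(index[hill_key("C2H4O2")])
print(hits)
```

A query for C2H4O2 here also finds the document that writes CH3COOH. The trade-off is that an element-count key cannot tell isomers apart, and it does nothing about ambiguous words such as "He", which need context-aware entity tagging.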

Chemoinformatics and Machine Learning Approaches for Identifying Antiviral Compounds

MOLECULAR INFORMATICS, 2022, Vol. 41, No. 4, pp. e2100190-e2100190

Authors: John, Lijo; Soujanya, Yarasi; Mahanta, Hridoy Jyoti; Sastry, G. Narahari (CSIR Indian Inst Chem Technol, Ctr Mol Modeling, Hyderabad 500007, India; CSIR North East Inst Sci & Technol, Adv Computat & Data Sci Div, Jorhat 785006, Assam, India; Acad Sci & Innovat Res AcSIR, Ghaziabad 201002, Uttar Pradesh, India)

Current pandemics have propelled research efforts in an unprecedented fashion, primarily triggering computational efforts towards new vaccine and drug development as well as drug repurposing. There is an urgent need to design novel drugs with targeted biological activity and minimum adverse reactions that may be useful to manage viral outbreaks. Hence an attempt has been made to develop Machine Learning based predictive models that can be used to assess whether a compound has the potency to be antiviral or not. To this end, a set of 2358 antiviral compounds was compiled from the CAS COVID-19 antiviral SAR dataset, whose activity was reported based on IC50 value. A total of 1157 two-dimensional molecular descriptors were computed, among which the most highly correlated descriptors were selected using Tree-based, Correlation-based and Mutual information-based feature selection methods. Seven Machine Learning algorithms, i.e., Random Forest, XGBoost, Support Vector Machine, KNN, Decision Tree, MLP Classifier and Logistic Regression, were benchmarked. The best performance was achieved by the models developed using Random Forest and XGBoost algorithms in all the feature selection methods. The maximum predictive accuracy of both these models was 88% with internal validation, whereas with an external dataset a maximum accuracy of 93.10% for the XGBoost model and 100% for the Random Forest based model was achievable. Furthermore, the study demonstrated scaffold analysis of the molecules as a pragmatic approach to explore the importance of structurally diverse compounds in data driven studies.

Keywords: SARS-COVID-19; Antivirals; Chemoinformatics; Molecular Descriptors; Machine Learning; Feature Selection; MCC
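
The record above reports accuracy figures and lists MCC among its keywords, which presumably refers to the Matthews correlation coefficient. As a rough illustration of how the two metrics differ, here is a small sketch that computes both from binary labels (1 = antiviral, 0 = not). The function names and the label vectors are made up for illustration and are not data from the study.

```python
from math import sqrt

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Balanced toy example: 3 of 4 actives and 3 of 4 inactives are predicted correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(accuracy(y_true, y_pred), mcc(y_true, y_pred))

# Imbalanced toy example: always predicting "active" looks accurate but has MCC 0.
skewed_true = [1] * 9 + [0]
skewed_pred = [1] * 10
print(accuracy(skewed_true, skewed_pred), mcc(skewed_true, skewed_pred))
```

The second example shows why MCC is often reported next to accuracy on imbalanced compound sets: a classifier that ignores the minority class can still score high accuracy.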

Two New Graph Kernels and Applications to Chemoinformatics

8th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition (GbRPR)

Authors: Gauezere, Benoit; Brun, Luc; Villemin, Didier (CNRS GREYC UMR 6072, F-6072 Caen, France; CNRS LCMT UMR 6507, F-6507 Caen, France)

ISBN: (print) 9783642208447

Chemoinformatics is a well-established research field concerned with the discovery of molecules' properties through informational techniques. The computer science research fields most relevant to chemoinformatics are machine learning and graph theory. From this point of view, graph kernels provide a nice framework combining machine learning techniques with graph theory. Such kernels have proven efficient on several chemoinformatics problems. This paper presents two new graph kernels applied to regression and classification problems within the chemoinformatics field. The first kernel is based on the notion of edit distance, while the second is based on subtree enumeration. Several experiments show the complementarity of both approaches.

Keywords: edit-distance; graph kernel; chemoinformatics

Chemoinformatics and structural bioinformatics in OCaml

JOURNAL OF CHEMINFORMATICS, 2019, Vol. 11, No. 1, pp. 10-10

Authors: Berenger, Francois; Zhang, Kam Y. J.; Yamanishi, Yoshihiro (Kyushu Inst Technol, Dept Biosci & Bioinformat, Fac Comp Sci & Syst Engn, Iizuka, Fukuoka, Japan; RIKEN Ctr Biosyst Dynam Res, Lab Struct Bioinformat, Yokohama, Kanagawa, Japan; Japan Sci & Technol Agcy, PRESTO, Kawaguchi, Saitama, Japan)

Background: OCaml is a functional programming language with strong static types, Hindley-Milner type inference and garbage collection. In this article, we share our experience in prototyping chemoinformatics and structural bioinformatics software in ***, we introduce the language, list entry points for chemoinformaticians who would be interested in OCaml and give code examples. Then, we list some scientific open source software written in OCaml. We also present recent open source libraries useful in chemoinformatics. The parallelization of OCaml programs and their performance are also shown. Finally, tools and methods useful when prototyping scientific software in OCaml are *** our experience, OCaml is a programming language of choice for method development in chemoinformatics and structural bioinformatics.

Keywords: Chemoinformatics; Structural bioinformatics; Bisector tree; Scientific software; Software prototyping; Open source; Functional programming; OCaml

Chemoinformatics in Drug Discovery. Edited by Tudor Oprea (University of New Mexico). Wiley-VCH, Weinheim. 2005. xxii + 493 pp. 17 × 25 cm. $149.00. ISBN 3-527-30753-2.

Journal of Natural Products, 2005, Vol. 68, No. 8, pp. 1306-1307

Author: David G. Covell (National Cancer Institute, Frederick, Maryland)

Chemoinformatics-An Introduction for Computer Scientists

ACM COMPUTING SURVEYS, 2009, Vol. 41, No. 2, pp. 8:1-8:38

Author: Brown, Nathan (Inst Canc Res, 15 Cotswold Rd, Sutton SM2 5NG, Surrey, England)

Chemoinformatics is an interface science aimed primarily at discovering novel chemical entities that will ultimately result in the development of novel treatments for unmet medical needs, although these same methods are also applied in other fields that ultimately design new molecules. The field combines expertise from, among others, chemistry, biology, physics, biochemistry, statistics, mathematics, and computer science. In this general review of chemoinformatics, the emphasis is placed on describing the general methods that are routinely applied in molecular discovery, in a context that makes the article easily accessible to computer scientists as well as scientists from other numerate disciplines.

Keywords: Algorithms; Design; Experimentation; Measurement; Theory; chemoinformatics; chemometrics; docking; drug discovery; molecular modeling; QSAR

Chemoinformatics-based enumeration of chemical libraries: a tutorial

JOURNAL OF CHEMINFORMATICS, 2020, Vol. 12, No. 1, pp. 64-64

Authors: Saldivar-Gonzalez, Fernanda I.; Huerta-Garcia, C. Sebastian; Medina-Franco, Jose L. (Univ Nacl Autonoma Mexico, DIFACQUIM Res Grp, Sch Chem, Dept Pharm, Ave Univ 3000, Mexico City 04510, DF, Mexico; Univ Nacl Autonoma Mexico, Sch Chem, Dept Pharm, Ave Univ 3000, Mexico City 04510, DF, Mexico)

Virtual compound libraries are increasingly being used in computer-assisted drug discovery applications and have led to numerous successful cases. This paper aims to examine the fundamental concepts of library design and describe how to enumerate virtual libraries using open source tools. To exemplify the enumeration of chemical libraries, we emphasize the use of pre-validated or reported reactions and accessible chemical reagents. This tutorial shows a step-by-step procedure for anyone interested in designing and building chemical libraries with or without chemoinformatics experience. The aim is to explore various methodologies proposed by synthetic organic chemists and explore affordable chemical space using open-access chemoinformatics tools. As part of the tutorial, we discuss three examples of design: a Diversity-Oriented-Synthesis library based on lactams, a bis-heterocyclic combinatorial library, and a set of target-oriented molecules: isoindolinone based compounds as potential acetylcholinesterase inhibitors. This manuscript also seeks to contribute to the critical task of teaching and learning chemoinformatics.

Keywords: Chemical enumeration; Chemoinformatics; Combinatorial libraries; DOS synthesis; Drug design; Education; KNIME; Python
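
To give a feel for what enumerating a virtual library means, the sketch below combines every choice of two R-groups on a fixed scaffold. It is not the tutorial's workflow: the tutorial emphasizes reaction-based enumeration with open source tools, whereas this example swaps in plain SMILES string substitution so that it runs with the standard library alone. The scaffold, the fragment tables, and the function name `enumerate_library` are all hypothetical, and nothing here checks chemical validity.

```python
from itertools import product

# Hypothetical scaffold: a secondary amide with two attachment points.
SCAFFOLD = "O=C({r1})N{r2}"

# Hypothetical R-group fragments written as SMILES substrings.
ACYL_GROUPS = {"methyl": "C", "ethyl": "CC", "phenyl": "c1ccccc1"}
AMINE_GROUPS = {"methyl": "C", "ethyl": "CC"}

def enumerate_library(scaffold, r1_groups, r2_groups):
    """Return {label: SMILES} for every R1 x R2 combination on the scaffold."""
    library = {}
    for (name1, smi1), (name2, smi2) in product(r1_groups.items(), r2_groups.items()):
        library[f"{name1}/{name2}"] = scaffold.format(r1=smi1, r2=smi2)
    return library

library = enumerate_library(SCAFFOLD, ACYL_GROUPS, AMINE_GROUPS)
for label, smiles in library.items():
    print(label, smiles)
```

Three acyl fragments and two amine fragments give six products. In practice a cheminformatics toolkit would be used to apply validated reactions to real reagents and to sanitize the resulting structures, which string substitution cannot do.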

2016

Author: Nikolay T. Kochev

Charge-Related Topological Index - Various Chemoinformatics Applications. By Kochev, Nikolay T; Bangov, Ivan; Petrov, Emil; Moskovkina, Marina; Stoyanov, Borislav; published by

Keywords: applications; charge-related; chemoinformatics; index; topological

Chemoinformatics-driven classification of Angiosperms using sulfur-containing compounds and machine learning algorithm

PLANT METHODS, 2022, Vol. 18, No. 1, pp. 118-118

Authors: Abdullah-Zawawi, Muhammad-Redha; Govender, Nisha; Karim, Mohammad Bozlul; Altaf-Ul-Amin, Md; Kanaya, Shigehiko; Mohamed-Hussein, Zeti-Azura (Univ Kebangsaan Malaysia, Inst Syst Biol INBIOSIS, Ukm Bangi 43600, Malaysia; UKM Med Mol Biol Inst UMBI, Jalan Yaacob Latif, Kuala Lumpur 56000, Malaysia; Nara Inst Sci & Technol, Grad Sch Informat Sci, 8916-5 Takayama Cho, Ikoma, Nara 6300192, Japan; Univ Kebangsaan Malaysia, Fac Sci & Technol, Dept Appl Phys, Ukm Bangi 43600, Malaysia)

Background: Phytochemicals or secondary metabolites are low molecular weight organic compounds with little function in plant growth and development. Nevertheless, metabolite diversity not only governs the phenetics of an organism but may also inform the evolutionary pattern and adaptation of green plants to the changing environment. Plant chemoinformatics analyzes the chemical system of natural products using computational tools and robust mathematical algorithms. It has been a powerful approach for species-level differentiation and is widely employed for species classifications and reinforcement of previous classifications. Results: This study attempts to classify Angiosperms using plant sulfur-containing compound (SCC) or sulphated compound information. The SCC dataset of 692 plant species was collected from the comprehensive species-metabolite relationship family (KNApSAck) database. The structural similarity scores of metabolite pairs under all possible combinations (plant species-metabolite) were determined, and metabolite pairs with a Tanimoto coefficient value > 0.85 were selected for clustering using a machine learning algorithm. Metabolite clustering showed an association between the structurally similar metabolite clusters and metabolite content among the plant species. Phylogenetic tree construction of Angiosperms displayed three major clades, of which clade 1 and clade 2 represented the eudicots only, and clade 3 a mixture of both eudicots and monocots. The SCC-based construction of Angiosperm phylogeny is a subset of the existing monocot-dicot classification. The majority of eudicots present in clades 1 and 2 were represented by glucosinolate compounds. These clades with SCC may have been a mixture of ancestral species, whilst the combinatorial presence of monocot-dicot in clade 3 suggests sulphated-chemical structure diversification in the event of adaptation during evolutionary change. Conclusions: Sulphated chemoinformatics informs classification of Angiospe

Keywords: Angiosperms; Chemoinformatics; KNApSAck database; Sulfur-containing compounds; Molecular fingerprints; Monocot-dicot
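
The record above selects metabolite pairs whose Tanimoto coefficient exceeds 0.85 before clustering. A minimal sketch of that filtering step follows, assuming each molecular fingerprint is stored as the set of its on-bit indices. The fingerprint type and clustering algorithm used in the study are not given here, so the bit sets, metabolite names, and the helper `similar_pairs` are hypothetical.

```python
from itertools import combinations

def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

def similar_pairs(fingerprints, threshold=0.85):
    """Return (name_a, name_b, score) for every pair scoring above the threshold."""
    pairs = []
    for (name_a, fp_a), (name_b, fp_b) in combinations(fingerprints.items(), 2):
        score = tanimoto(fp_a, fp_b)
        if score > threshold:
            pairs.append((name_a, name_b, score))
    return pairs

# Hypothetical fingerprints: met_A and met_B share 19 of 21 distinct bits.
fingerprints = {
    "met_A": set(range(20)),
    "met_B": set(range(1, 20)) | {42},
    "met_C": set(range(10)),
}
pairs = similar_pairs(fingerprints)
print(pairs)
```

Only the met_A/met_B pair passes the 0.85 cut-off in this toy data; pairs retained this way would then be handed to whatever clustering method is chosen.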

Chemoinformatics and Library Design

Author: Joe Zhongxiang Zhou

This chapter provides a brief overview of chemoinformatics and its applications to chemical library design. It is meant to be a quick starter and to serve as an invitation to readers for more in-depth exploration of the field. The topics covered in this chapter are chemical representation, chemical data and data mining, molecular descriptors, chemical space and dimension reduction, quantitative structure–activity relationship, similarity, diversity, and multiobjective optimization.

Keywords: Virtual Screening; Data Mining; library design; chemoinformatics; QSAR; QSPR; chemical space; multiobjective optimization; similarity; diversity; chemical representation
