Author affiliations: IBM Thomas J. Watson Research Center, 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA; Zhejiang University, 688 Yuhangtang Rd, Hangzhou 310027, People's Republic of China; Oak Ridge National Laboratory, 1 Bethel Valley Rd, Oak Ridge, TN 37830, USA
Publication: BMC Bioinformatics
Year, volume, issue: 2021, Vol. 22, No. 1
Pages: 1-21
Core indexing:
Subject classification: 0710 [Science - Biology]; 0836 [Engineering - Bioengineering]; 10 [Medicine]
Funding: IBM Bluegene Science Program [W125859, W1464125, W1464164]
Keywords: Lead optimization; Drug discovery; Molecular dynamics simulation; Machine learning; Variational autoencoder; Clustering
Abstract: Background: Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that partially automates the lead optimization workflow in silico, providing suggestions for modification hot spots. Results: The initial data are collected with physics-based molecular dynamics simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information in the simulations, we enhanced the contact matrix data with a temporal dynamism representation, which is then modeled with an unsupervised convolutional variational autoencoder (CVAE). Finally, conventional and CVAE-based clustering methods are compared using metrics to rank the submolecular structures and propose potential candidates for lead optimization. Conclusion: With no need for extensive structure-activity data, our method provides new hints for drug modification hot spots, which can be used to improve drug potency and reduce lead optimization time. It can potentially become a valuable tool for medicinal chemists.
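Illustrative note: the abstract names contact matrices as the preliminary features extracted from the simulations. The sketch below shows one common way such a matrix might be computed per trajectory frame, using NumPy. It is not taken from the paper: the function names, the 8 angstrom cutoff, the choice of one coordinate per site, and the toy trajectory are all assumptions made for illustration, and the paper's temporal enhancement and CVAE steps are not reproduced here.

```python
import numpy as np


def contact_matrix(coords, cutoff=8.0):
    """Binary contact matrix for a single frame.

    coords: (n_sites, 3) array of positions in angstroms (assumed layout).
    cutoff: distance threshold in angstroms (assumed value, not from the paper).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise displacement vectors via broadcasting: shape (n, n, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    # Euclidean distance between every pair of sites: shape (n, n).
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Two sites are "in contact" when closer than the cutoff.
    return (dist < cutoff).astype(np.uint8)


def contact_matrices(trajectory, cutoff=8.0):
    """Stack per-frame contact matrices: (n_frames, n_sites, n_sites)."""
    return np.stack([contact_matrix(frame, cutoff) for frame in trajectory])


# Toy trajectory (hypothetical data): 2 frames, 3 sites on a line.
traj = np.array([
    [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [20.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [12.0, 0.0, 0.0], [15.0, 0.0, 0.0]],
])
maps = contact_matrices(traj)
print(maps.shape)
```

The stacked array keeps the frame axis first, so frame-to-frame changes in contacts remain available to any later step that uses temporal information.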