Nonlinear embeddings are central in machine learning (ML). However, they often suffer from insufficient interpretability because access to the latent space is restricted. To improve interpretability, elements of the latent space need to be represented in the input space. Finding such an inverse transformation is known as the pre-image problem. This task is especially difficult when dealing with complex, discrete data represented by graphs. In this paper, we propose a framework for defining ML models that do not suffer from the pre-image problem. The framework is based on normalizing flows (NF), which generate the latent space by learning both the forward and the inverse transformations. From this framework, we derive two specifications for designing models for predictive tasks, namely classification and regression. As a result, our approaches achieve good predictive performance and can generate the pre-image of any element in the latent space. Our experimental results highlight both their predictive capabilities and their proficiency in generating graph pre-images, thereby emphasizing the versatility and effectiveness of our approaches for graph machine learning.
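To make the idea of "learning both the forward and the inverse transformations" concrete, the following is a minimal sketch of a single affine coupling layer, a common building block of normalizing flows. It is not the model proposed in the paper: it operates on continuous vectors rather than graphs, the class name `AffineCoupling` is hypothetical, and fixed random weights stand in for learned networks. It only illustrates why a flow can return the pre-image of any latent point.

```python
import numpy as np


class AffineCoupling:
    """Toy affine coupling layer (illustrative; invertible by construction)."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        rest = dim - self.half
        # Assumption: fixed random weights stand in for learned networks.
        self.w_scale = rng.normal(scale=0.1, size=(self.half, rest))
        self.w_shift = rng.normal(scale=0.1, size=(self.half, rest))

    def _params(self, x_a):
        # Scale and shift depend only on the untouched half of the input.
        log_s = np.tanh(x_a @ self.w_scale)
        t = x_a @ self.w_shift
        return log_s, t

    def forward(self, x):
        """Map input x to latent z and return log|det Jacobian| per sample."""
        x_a, x_b = x[:, :self.half], x[:, self.half:]
        log_s, t = self._params(x_a)
        z_b = x_b * np.exp(log_s) + t
        log_det = log_s.sum(axis=1)
        return np.concatenate([x_a, z_b], axis=1), log_det

    def inverse(self, z):
        """Map latent z back to the input space (the pre-image of z)."""
        z_a, z_b = z[:, :self.half], z[:, self.half:]
        log_s, t = self._params(z_a)  # z_a equals x_a, so params are identical
        x_b = (z_b - t) * np.exp(-log_s)
        return np.concatenate([z_a, x_b], axis=1)


layer = AffineCoupling(dim=4)
x = np.random.default_rng(1).normal(size=(3, 4))
z, log_det = layer.forward(x)
x_rec = layer.inverse(z)
max_error = float(np.max(np.abs(x - x_rec)))  # reconstruction error
```

Because the first half of the vector passes through unchanged, the inverse can recompute the same scale and shift and undo the transformation in closed form, so every latent point has a pre-image without any separate optimization step.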
ISBN: 9781450379984 (print)
Generating molecular graphs with desired chemical properties using deep graph generative models is a very promising way to accelerate the drug discovery process. Such graph generative models usually consist of two steps: learning latent representations and generating molecular graphs. However, generating novel and chemically valid molecular graphs from latent representations is very challenging because of the chemical constraints and combinatorial complexity of molecular graphs. In this paper, we propose Moflow, a flow-based graph generative model that learns invertible mappings between molecular graphs and their latent representations. To generate molecular graphs, Moflow first generates bonds (edges) through a Glow-based model, then generates atoms (nodes) given the bonds using a novel graph conditional flow, and finally assembles them into a chemically valid molecular graph with a post hoc validity correction. Moflow's merits include exact and tractable likelihood training, efficient one-pass embedding and generation, chemical validity guarantees, 100% reconstruction of training data, and good generalization ability. We validate our model on four tasks: molecular graph generation and reconstruction, visualization of the continuous latent space, property optimization, and constrained property optimization. Moflow achieves state-of-the-art performance, which implies its potential efficiency and effectiveness in exploring large chemical spaces for drug discovery.
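The phrase "exact and tractable likelihood training" refers to a general property of flow-based models. As a sketch, using symbols that do not appear in the abstract (an invertible map $f$ from data $x$ to latent $z = f(x)$, and a simple base density $p_Z$ such as a standard Gaussian), the standard change-of-variables formula reads:

```latex
\log p_X(x) \;=\; \log p_Z\bigl(f(x)\bigr) \;+\; \log \left| \det \frac{\partial f(x)}{\partial x} \right|
```

Training maximizes this quantity over the data. Flow architectures are typically designed so that the Jacobian determinant is cheap to compute, and generation amounts to sampling $z \sim p_Z$ and applying $f^{-1}$ in a single pass. How the abstract's model adapts this to discrete bonds and atoms is not specified here.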