Dual-view Molecular Pre-training

Abstract

Molecular pre-training, which aims to learn effective representations of molecules from large amounts of data, has attracted substantial attention in cheminformatics and bioinformatics. A molecule can be viewed either as a graph (where atoms are connected by bonds) or as a SMILES sequence (obtained by applying depth-first search to the molecular graph under specific rules). The Transformer and graph neural networks (GNNs) are two representative methods for handling sequential and graph data, respectively; the former models molecules globally and the latter locally, so the two are expected to be complementary. In this work, we propose to leverage both representations and design a new pre-training algorithm, dual-view molecule pre-training (DVMP for short), that effectively combines the strengths of both types of molecule representations. DVMP has a Transformer branch and a GNN branch, and the two branches are pre-trained to maintain the semantic consistency of molecules. After pre-training, we can use the Transformer branch (recommended according to empirical results), the GNN branch, or both for downstream tasks. DVMP is tested on 11 molecular property prediction tasks and outperforms strong baselines. Furthermore, we test DVMP on three retrosynthesis tasks, where it achieves state-of-the-art results. Our code is released at https://github.com/microsoft/DVMP.
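To make the two views concrete, the following is a minimal sketch of how a depth-first traversal turns a molecular graph into a SMILES-like string. It is an illustration only and is not taken from the DVMP code: the function name `graph_to_smiles` and the input layout (a list of atom symbols plus a list of bonded index pairs) are assumptions, and the sketch handles only acyclic molecules with single bonds and implicit hydrogens. Real SMILES also has rules for rings, bond orders, charges, and stereochemistry.

```python
from collections import defaultdict


def graph_to_smiles(atoms, bonds, start=0):
    """Write an acyclic, single-bonded molecule as a SMILES-like string.

    atoms: list of element symbols, indexed by atom id (hypothetical layout).
    bonds: list of (i, j) index pairs, one per bond.
    """
    # Graph view: an adjacency list built from the bond list.
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def visit(node, parent):
        # Depth-first search; the parent is skipped so we never walk back.
        children = [n for n in neighbors[node] if n != parent]
        out = atoms[node]
        # Every child except the last is written as a parenthesized branch.
        for child in children[:-1]:
            out += "(" + visit(child, node) + ")"
        # The last child continues the main chain without parentheses.
        if children:
            out += visit(children[-1], node)
        return out

    return visit(start, None)


# Ethanol: a three-atom chain C-C-O.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
# Isobutane: a central carbon bonded to three other carbons.
isobutane = (["C", "C", "C", "C"], [(0, 1), (1, 2), (1, 3)])

print(graph_to_smiles(*ethanol))    # CCO
print(graph_to_smiles(*isobutane))  # CC(C)C
```

In this sketch the adjacency list stands in for the input a GNN branch would consume, while the returned string stands in for the token sequence a Transformer branch would consume; both describe the same molecule.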

Cite

Zhu, J., Xia, Y., Wu, L., Xie, S., Zhou, W., Qin, T., … Liu, T. Y. (2023). Dual-view Molecular Pre-training. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 3615–3627). Association for Computing Machinery. https://doi.org/10.1145/3580305.3599317
