Chemformer: A pre-trained transformer for computational chemistry

Abstract

Transformer models coupled with the simplified molecular-input line-entry system (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed for a single application and can be very resource-intensive to train. In this work we present Chemformer, a Transformer-based model that can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. We also show that self-supervised pre-training improves performance and significantly speeds up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we report state-of-the-art top-1 accuracy. We further improve on existing approaches for a molecular optimisation task and show that Chemformer can be optimised on multiple discriminative tasks simultaneously. Models, datasets, and code will be made available after publication.
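To make the pattern the abstract describes concrete, the following is a minimal illustrative sketch of a BART-style encoder-decoder Transformer operating on tokenised SMILES for a sequence-to-sequence task such as reaction prediction. This is not the authors' Chemformer implementation: the character-level tokeniser, model size, and example reaction are assumptions made for illustration, and Hugging Face's transformers library stands in for the paper's own code.

```python
# Illustrative sketch only -- NOT the authors' Chemformer code.
# Shows the general setup the abstract describes: an encoder-decoder
# Transformer over SMILES strings, trained as a seq2seq model.

import torch
from transformers import BartConfig, BartForConditionalGeneration

# Toy character-level SMILES tokeniser (a stand-in; real work would
# use a proper SMILES tokeniser and a pre-training corpus).
specials = ["<pad>", "<s>", "</s>"]
chars = list("BCFHINOPSbcnops1234567890()[]=#@+-/\\.%")
vocab = {tok: i for i, tok in enumerate(specials + chars)}

def encode(smiles: str) -> list[int]:
    """Map a SMILES string to token ids, wrapped in <s> ... </s>."""
    return [vocab["<s>"]] + [vocab[c] for c in smiles] + [vocab["</s>"]]

# A small, untrained BART-style model. Chemformer itself is first
# pre-trained self-supervised on large SMILES corpora; only the
# wiring of the seq2seq task is shown here.
config = BartConfig(
    vocab_size=len(vocab),
    d_model=128,
    encoder_layers=2,
    decoder_layers=2,
    encoder_attention_heads=4,
    decoder_attention_heads=4,
    pad_token_id=vocab["<pad>"],
    bos_token_id=vocab["<s>"],
    eos_token_id=vocab["</s>"],
    decoder_start_token_id=vocab["<s>"],
    max_position_embeddings=256,
)
model = BartForConditionalGeneration(config)

# Example reactants -> product pair (purely illustrative).
src = torch.tensor([encode("CCO.CC(=O)O")])  # reactant SMILES
tgt = torch.tensor([encode("CC(=O)OCC")])    # product SMILES

# Fine-tuning step: cross-entropy loss over target SMILES tokens.
loss = model(input_ids=src, labels=tgt).loss
loss.backward()

# Inference: beam search decodes candidate product SMILES.
pred_ids = model.generate(src, num_beams=5, max_length=32)
```

In the paper's setting, the same encoder-decoder is first pre-trained with self-supervised objectives on large SMILES corpora before fine-tuning, which is what the abstract credits for the improved performance and faster downstream convergence.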

Cite

APA

Irwin, R., Dimitriadis, S., He, J., & Bjerrum, E. J. (2022). Chemformer: A pre-trained transformer for computational chemistry. Machine Learning: Science and Technology, 3(1). https://doi.org/10.1088/2632-2153/ac3ffb
