Transfer learning for a foundational chemistry model


Abstract

Data-driven chemistry has garnered much interest concurrent with improvements in hardware and the development of new machine learning models. However, obtaining sufficiently large, accurate datasets of a desired chemical outcome for data-driven chemistry remains a challenge. The community has made significant efforts to democratize and curate available information for more facile machine learning applications, but the limiting factor is usually the laborious nature of generating large-scale data. Transfer learning has been noted in certain applications to alleviate some of the data burden, but this protocol is typically carried out on a case-by-case basis, with the transfer learning task expertly chosen to fit the finetuning. Herein, I develop a machine learning framework capable of accurate chemistry-relevant prediction amid general sources of low data. First, a chemical “foundational model” is trained using a dataset of ∼1 million experimental organic crystal structures. A task-specific module is then stacked atop this foundational model and subjected to finetuning. This approach achieves state-of-the-art performance on a diverse set of tasks: toxicity prediction, yield prediction, and odor prediction.
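The general pattern the abstract describes (a pretrained foundational model with a task-specific module stacked on top and finetuned) can be sketched in miniature. This is an illustrative toy, not the paper's actual architecture: the class names, the single-parameter "models", and the training loop are all assumptions made for clarity.

```python
# Toy sketch of the transfer-learning pattern from the abstract:
# a pretrained "foundation" encoder stays frozen while a small
# task-specific head is finetuned on top of its embeddings.
# All names and numbers here are illustrative, not from the paper.

class FoundationEncoder:
    """Stands in for a model pretrained on a large corpus
    (in the paper, ~1M experimental organic crystal structures)."""
    def __init__(self, weight=2.0):
        self.weight = weight          # pretrained parameter, kept frozen

    def encode(self, x):
        return self.weight * x        # embedding of the input


class TaskHead:
    """Task-specific module stacked atop the frozen encoder."""
    def __init__(self):
        self.w = 0.0                  # trainable parameter

    def predict(self, z):
        return self.w * z


def finetune(encoder, head, data, lr=0.01, epochs=200):
    """Gradient descent on the head only; the encoder is never updated."""
    for _ in range(epochs):
        for x, y in data:
            z = encoder.encode(x)          # frozen forward pass
            err = head.predict(z) - y      # prediction error
            head.w -= lr * err * z         # update the head parameter only
    return head
```

The design choice this illustrates is the separation of concerns: the expensive pretraining is done once, and each downstream task (toxicity, yield, odor) only needs to train its own lightweight head.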

Citation (APA)
King-Smith, E. (2023). Transfer learning for a foundational chemistry model. Chemical Science, 15(14), 5143–5151. https://doi.org/10.1039/d3sc04928k
