Transfer learning for a foundational chemistry model


Abstract

Data-driven chemistry has garnered much interest concurrent with improvements in hardware and the development of new machine learning models. However, obtaining sufficiently large, accurate datasets of a desired chemical outcome for data-driven chemistry remains a challenge. The community has made significant efforts to democratize and curate available information for more facile machine learning applications, but the limiting factor is usually the laborious nature of generating large-scale data. Transfer learning has been noted in certain applications to alleviate some of the data burden, but this protocol is typically carried out on a case-by-case basis, with the transfer learning task expertly chosen to fit the finetuning. Herein, I develop a machine learning framework capable of accurate chemistry-relevant prediction amid general sources of low data. First, a chemical “foundational model” is trained using a dataset of ∼1 million experimental organic crystal structures. A task-specific module is then stacked atop this foundational model and subjected to finetuning. This approach achieves state-of-the-art performance on a diverse set of tasks: toxicity prediction, yield prediction, and odor prediction.
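The general pattern the abstract describes (a pretrained foundational model with a task-specific module stacked on top and finetuned) can be sketched in miniature. This is an illustrative toy, not the paper's actual architecture: the class names, the single-parameter "models", and the training loop are all assumptions made for clarity.

```python
# Toy sketch of the transfer-learning pattern from the abstract:
# a pretrained "foundation" encoder stays frozen while a small
# task-specific head is finetuned on top of its embeddings.
# All names and numbers here are illustrative, not from the paper.

class FoundationEncoder:
    """Stands in for a model pretrained on a large corpus
    (in the paper, ~1M experimental organic crystal structures)."""
    def __init__(self, weight=2.0):
        self.weight = weight          # pretrained parameter, kept frozen

    def encode(self, x):
        return self.weight * x        # embedding of the input


class TaskHead:
    """Task-specific module stacked atop the frozen encoder."""
    def __init__(self):
        self.w = 0.0                  # trainable parameter

    def predict(self, z):
        return self.w * z


def finetune(encoder, head, data, lr=0.01, epochs=200):
    """Gradient descent on the head only; the encoder is never updated."""
    for _ in range(epochs):
        for x, y in data:
            z = encoder.encode(x)          # frozen forward pass
            err = head.predict(z) - y      # prediction error
            head.w -= lr * err * z         # update the head parameter only
    return head
```

The design choice this illustrates is the separation of concerns: the expensive pretraining is done once, and each downstream task (toxicity, yield, odor) only needs to train its own lightweight head.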

Citation (APA)
King-Smith, E. (2023). Transfer learning for a foundational chemistry model. Chemical Science, 15(14), 5143–5151. https://doi.org/10.1039/d3sc04928k
