Discovery of structure–property relations for molecules via hypothesis-driven active learning over the chemical space

  • Ghosh A
  • Kalinin S
  • Ziatdinov M
N/ACitations
Citations of this article
5Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

The discovery of the molecular candidates for application in drug targets, biomolecular systems, catalysts, photovoltaics, organic electronics, and batteries necessitates the development of machine learning algorithms capable of rapid exploration of chemical spaces targeting the desired functionalities. Here, we introduce a novel approach for active learning over the chemical spaces based on hypothesis learning. We construct the hypotheses on the possible relationships between structures and functionalities of interest based on a small subset of data followed by introducing them as (probabilistic) mean functions for the Gaussian process. This approach combines the elements from the symbolic regression methods, such as SISSO and active learning, into a single framework. The primary focus of constructing this framework is to approximate physical laws in an active learning regime toward a more robust predictive performance, as traditional evaluation on hold-out sets in machine learning does not account for out-of-distribution effects which may lead to a complete failure on unseen chemical space. Here, we demonstrate it for the QM9 dataset, but it can be applied more broadly to datasets from both domains of molecular and solid-state materials sciences.

Cite

CITATION STYLE

APA

Ghosh, A., Kalinin, S. V., & Ziatdinov, M. A. (2023). Discovery of structure–property relations for molecules via hypothesis-driven active learning over the chemical space. APL Machine Learning, 1(4). https://doi.org/10.1063/5.0157644

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free