Exploring chemical and conformational spaces by batch mode deep active learning

Abstract

The development of machine-learned interatomic potentials requires generating sufficiently expressive atomistic data sets. Active learning algorithms select data points on which labels, i.e., energies and forces, are calculated for inclusion in the training set. However, for batch mode active learning, i.e., when multiple data points are selected at once, conventional active learning algorithms can perform poorly. Therefore, we investigate algorithms specifically designed for this setting and show that they can outperform conventional algorithms. We investigate selection based on the informativeness, diversity, and representativeness of the resulting training set. We propose using gradient features specific to atomistic neural networks to evaluate the informativeness of queried samples, including several approximations allowing for their efficient evaluation. To avoid selecting similar structures, we present several methods that enforce the diversity and representativeness of the selected batch. Finally, we apply the proposed approaches to several molecular and periodic bulk benchmark systems and argue that they can be used to generate highly informative atomistic data sets by running any atomistic simulation.
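The idea of enforcing diversity within a selected batch can be illustrated with a minimal sketch. It assumes each candidate structure is already represented by a feature vector (for example, a gradient-based feature, whose computation is not shown here) and uses greedy farthest-point selection, a generic strategy chosen for illustration rather than the specific algorithms evaluated in the article. The function name `select_batch_max_dist` and the toy data are hypothetical.

```python
import numpy as np


def select_batch_max_dist(pool_feats, train_feats, batch_size):
    """Greedily pick pool points that are far from the training set and from each other.

    Hypothetical helper: a generic farthest-point heuristic, not the article's code.
    """
    pool_feats = np.asarray(pool_feats, dtype=float)
    train_feats = np.asarray(train_feats, dtype=float)

    # Squared distance from each pool point to its nearest training point.
    if len(train_feats) > 0:
        diffs = pool_feats[:, None, :] - train_feats[None, :, :]
        min_sq_dist = (diffs ** 2).sum(axis=-1).min(axis=1)
    else:
        min_sq_dist = np.full(len(pool_feats), np.inf)

    selected = []
    for _ in range(min(batch_size, len(pool_feats))):
        # Take the pool point that is farthest from everything covered so far.
        idx = int(np.argmax(min_sq_dist))
        selected.append(idx)

        # Treat the new point as covered and update nearest distances.
        new_sq_dist = ((pool_feats - pool_feats[idx]) ** 2).sum(axis=1)
        min_sq_dist = np.minimum(min_sq_dist, new_sq_dist)
        min_sq_dist[idx] = -np.inf  # never pick the same point twice

    return selected


# Toy one-dimensional features (assumed values, for illustration only).
pool = [[0.0], [0.1], [5.0], [10.0]]
train = [[0.0]]
print(select_batch_max_dist(pool, train, batch_size=2))  # [3, 2]
```

In this toy example the two near-duplicates of the training point are skipped in favor of the distant candidates, which is the behavior a diversity criterion is meant to produce when many similar structures appear in the pool.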

Cite

Zaverkin, V., Holzmüller, D., Steinwart, I., & Kästner, J. (2022). Exploring chemical and conformational spaces by batch mode deep active learning. Digital Discovery, 1(5), 605–620. https://doi.org/10.1039/d2dd00034b
