Small data materials design with machine learning: When the average model knows best

Abstract

Machine learning is quickly becoming an important tool in modern materials design. Where many of its successes are rooted in huge datasets, the most common applications in academic and industrial materials design deal with datasets of at best a few tens of data points. Harnessing the power of machine learning in this context is, therefore, of considerable importance. In this work, we investigate the intricacies introduced by these small datasets. We show that individual data points introduce a significant chance factor in both model training and quality measurement. This chance factor can be mitigated by the introduction of an ensemble-averaged model. This model presents the highest accuracy, while at the same time, it is robust with regard to changing the dataset size. Furthermore, as only a single model instance needs to be stored and evaluated, it provides a highly efficient model for prediction purposes, ideally suited for the practical materials scientist.
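The abstract does not spell out how the ensemble-averaged model is built, so the following is only a minimal sketch under stated assumptions: the model is linear (ordinary least squares), the ensemble is generated by refitting on random subsets of a small dataset, and "averaging" means taking the mean of the fitted coefficients. The synthetic data and the names `fit_linear`, `predict`, and `ensemble_average` are hypothetical and not taken from the paper. For a model that is linear in its coefficients, the averaged coefficients give exactly the averaged prediction of all ensemble members, which is why a single stored model instance can suffice.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate a linear model (intercept first) on the rows of X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def ensemble_average(X, y, n_models=200, train_fraction=0.8, seed=0):
    """Fit one model per random training subset and average the coefficients.

    Assumption: subsets are drawn without replacement; the paper may use a
    different resampling scheme.
    """
    rng = np.random.default_rng(seed)
    n_train = max(2, int(round(train_fraction * len(X))))
    coefs = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=n_train, replace=False)
        coefs.append(fit_linear(X[idx], y[idx]))
    coefs = np.array(coefs)
    return coefs.mean(axis=0), coefs


# Hypothetical small dataset: 20 noisy points on a straight line.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(20, 1))
y = 1.5 + 2.0 * X[:, 0] + rng.normal(0.0, 0.3, size=20)

avg_coef, all_coefs = ensemble_average(X, y)

# Spread of the individual fits illustrates the "chance factor" of each split.
print("coefficient std across ensemble:", all_coefs.std(axis=0))
print("ensemble-averaged coefficients: ", avg_coef)
```

Only `avg_coef` has to be kept for later predictions via `predict(avg_coef, X_new)`; the individual ensemble members are needed only to inspect how much the fit depends on which points end up in the training subset.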

Cite

Vanpoucke, D. E. P., Van Knippenberg, O. S. J., Hermans, K., Bernaerts, K. V., & Mehrkanoon, S. (2020). Small data materials design with machine learning: When the average model knows best. Journal of Applied Physics, 128(5). https://doi.org/10.1063/5.0012285
