A strategy to apply machine learning to small datasets in materials science

387Citations
Citations of this article
869Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

There is growing interest in applying machine learning techniques in the research of materials science. However, although it is recognized that materials datasets are typically smaller and sometimes more diverse compared to other fields, the influence of availability of materials data on training machine learning models has not yet been studied, which prevents the possibility to establish accurate predictive rules using small materials datasets. Here we analyzed the fundamental interplay between the availability of materials data and the predictive capability of machine learning models. Instead of affecting the model precision directly, the effect of data size is mediated by the degree of freedom (DoF) of model, resulting in the phenomenon of association between precision and DoF. The appearance of precision-DoF association signals the issue of underfitting and is characterized by large bias of prediction, which consequently restricts the accurate prediction in unknown domains. We proposed to incorporate the crude estimation of property in the feature space to establish ML models using small sized materials data, which increases the accuracy of prediction without the cost of higher DoF. In three case studies of predicting the band gap of binary semiconductors, lattice thermal conductivity, and elastic properties of zeolites, the integration of crude estimation effectively boosted the predictive capability of machine learning models to state-of-Art levels, demonstrating the generality of the proposed strategy to construct accurate machine learning models using small materials dataset.

Cite

CITATION STYLE

APA

Zhang, Y., & Ling, C. (2018). A strategy to apply machine learning to small datasets in materials science. Npj Computational Materials, 4(1). https://doi.org/10.1038/s41524-018-0081-z

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free