Theory-guided machine learning in materials science

Nicholas Wagner; James M. Rondinelli

Journal Article (Open Access)

Frontiers in Materials (2016) 3

DOI: 10.3389/fmats.2016.00028

136 Citations

279 Readers

Abstract

Materials scientists are increasingly adopting the use of machine learning tools to discover hidden trends in data and make predictions. Applying concepts from data science without foreknowledge of their limitations and the unique qualities of materials data, however, could lead to errant conclusions. The differences that exist between various kinds of experimental and calculated data require careful choices of data processing and machine learning methods. Here, we outline potential pitfalls involved in using machine learning without robust protocols. We address some problems of overfitting to training data using decision trees as an example, rational descriptor selection in the field of perovskites, and preserving physical interpretability in the application of dimensionality reducing techniques such as principal component analysis. We show how proceeding without the guidance of domain knowledge can lead to both quantitatively and qualitatively incorrect predictive models.

Citation (APA)

Wagner, N., & Rondinelli, J. M. (2016). Theory-guided machine learning in materials science. Frontiers in Materials, 3. https://doi.org/10.3389/fmats.2016.00028
