Benchmark AFLOW Data Sets for Machine Learning

Conrad L. Clement; Steven K. Kauwe; Taylor D. Sparks

Journal Article, Open Access

Integrating Materials and Manufacturing Innovation (2020) 9(2) 153-156

DOI: 10.1007/s40192-020-00174-4

22 Citations

22 Readers

Abstract

Materials informatics is increasingly finding ways to exploit machine learning algorithms. Techniques such as decision trees, ensemble methods, support vector machines, and a variety of neural network architectures are used to predict likely material characteristics and property values. Supplemented with laboratory synthesis, applications of machine learning to compound discovery and characterization represent one of the most promising research directions in materials informatics. A shortcoming of this trend, in its current form, is a lack of standardized materials data sets on which to train, validate, and test model effectiveness. Applied machine learning research depends on benchmark data to make sense of its results. Fixed, predetermined data sets allow for rigorous model assessment and comparison. Machine learning publications that do not refer to benchmarks are often hard to contextualize and reproduce. In this data descriptor article, we present a collection of data sets of different material properties taken from the AFLOW database. We describe them, the procedures that generated them, and their use as potential benchmarks. We provide a compressed ZIP file containing the data sets and a GitHub repository of associated Python code. Finally, we discuss opportunities for future work incorporating the data sets and creating similar benchmark collections.
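
The point about fixed, predetermined data sets can be illustrated with a minimal sketch of a deterministic train/validation/test assignment. This is not the authors' procedure or the code in their repository; the function names, the `formula` key field, the split fractions, and the toy records are all hypothetical, chosen only to show how hashing a stable record identifier yields the same partition on every run and in any record order.

```python
import hashlib


def assign_split(key: str, val_frac: float = 0.15, test_frac: float = 0.15) -> str:
    """Map a record key to 'train', 'val', or 'test' deterministically.

    Hypothetical helper: the fractions are illustrative defaults,
    not values taken from the article.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as a number in [0, 1].
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


def split_records(records, key_field="formula"):
    """Group records by the split assigned to their key field."""
    splits = {"train": [], "val": [], "test": []}
    for rec in records:
        splits[assign_split(rec[key_field])].append(rec)
    return splits


# Toy records: the target values are placeholders, not real property data.
records = [
    {"formula": "NaCl", "target": 1.0},
    {"formula": "MgO", "target": 2.0},
    {"formula": "SiO2", "target": 3.0},
    {"formula": "TiO2", "target": 4.0},
]
splits = split_records(records)
```

Because the assignment depends only on the key, reshuffling or extending the records never moves an existing entry between splits, which is the property that makes results comparable across publications.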

Citation (APA)

Clement, C. L., Kauwe, S. K., & Sparks, T. D. (2020). Benchmark AFLOW Data Sets for Machine Learning. Integrating Materials and Manufacturing Innovation, 9(2), 153–156. https://doi.org/10.1007/s40192-020-00174-4
