Abstract
Machine learning and data-driven methods have been adopted in materials science research in recent years, yet textual data have not been fully embraced by the materials and physics communities. In this work, we aim to have computers learn, in an unsupervised manner and with minimal human intervention, the latent information on solar cell materials contained in textual data, and to use that information to predict solar cell materials. An unsupervised machine learning model is constructed by automatically extracting information from a materials literature database using word embeddings, and it successfully establishes the hidden relationships between materials formulas and their photovoltaic applications. Uncommon solar cell materials predicted by the natural language processing (NLP)-based machine learning method are further evaluated with first-principles methods to reveal the optoelectronic properties of the predicted candidate, demonstrating the validity of the NLP-assisted machine learning model. This study highlights text-based machine learning methods for solar cell materials and calls for wider deployment of NLP methods in materials research.
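To illustrate the general idea of relating material formulas to an application through word embeddings, the following is a minimal sketch, not the authors' implementation. It assumes that embeddings have already been trained on a literature corpus and that candidates are ranked by cosine similarity to an application word, which is a common choice but is not confirmed by the abstract. The names `cosine`, `rank_candidates`, `toy_embeddings`, and the `material_A`/`material_B`/`material_C` labels are hypothetical, and the vectors are invented purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(embeddings, query, candidates):
    """Sort candidate tokens by similarity to the query token, highest first."""
    q = embeddings[query]
    scored = [(name, cosine(embeddings[name], q)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings would be learned from the literature corpus.
toy_embeddings = {
    "photovoltaic": [0.9, 0.1, 0.2],
    "material_A": [0.8, 0.2, 0.1],
    "material_B": [0.1, 0.9, 0.3],
    "material_C": [0.5, 0.5, 0.5],
}

ranking = rank_candidates(
    toy_embeddings,
    "photovoltaic",
    ["material_A", "material_B", "material_C"],
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a workflow like the one described, the highest-ranked but rarely studied formulas would be the ones passed on to first-principles evaluation.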
Zhang, L., & He, M. (2022). Unsupervised machine learning for solar cell materials from the literature. Journal of Applied Physics, 131(6). https://doi.org/10.1063/5.0064875