Abstract
New discoveries in chemistry and materials science demand an ever-expanding volume of requisite knowledge and experimental work, which gives machine learning (ML) a unique opportunity to play a critical role in accelerating research. This work demonstrates 1) the use of large language models (LLMs) for automated literature reviews; and 2) the training of an ML model to predict chemical knowledge (thermodynamic parameters). The LLM-based literature review tool (LMExt) successfully extracted chemical information, and information beyond chemistry, into a machine-readable structure, including stability constants for metal cation–ligand interactions, thermodynamic properties, and other, broader data types (medical research papers and financial reports), effectively overcoming the challenges inherent in each domain. Using the autonomously acquired thermodynamic data, an ML model is trained with the CatBoost algorithm to accurately predict thermodynamic parameters (e.g., enthalpy of formation) of minerals. This work highlights the transformative potential of integrated ML approaches to reshape chemistry and materials science research.
Cite
Liu, J., Anderson, H., Waxman, N. I., Kovalev, V., Fisher, B., Li, E., & Guo, X. (2026). Predicting Materials Thermodynamics Enabled by Large Language Model-Driven Dataset Building and Machine Learning. Advanced Intelligent Systems. https://doi.org/10.1002/aisy.202500857