Chemical mixtures have recently become a focus of open standards and data structures that capture machine-readable descriptions for informatics use. At present, essentially all information about mixtures is transmitted as short text descriptions that are readable only by trained scientists, and there are no accessible repositories of marked-up mixture data. We have designed a machine learning tool that can interpret mixture descriptions and upgrade them to the high-level Mixfile format, which can in turn be used to generate Mixtures InChI notation. The interpretation achieves a high success rate and can be used at scale to mark up large catalogs and inventories, with some expert checking to catch edge cases. The training data accumulated during the project are made openly available, along with previously released mixture editing tools and utilities.
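To make the idea of upgrading a text description to a structured record concrete, the following is a minimal illustrative sketch in Python. It is not the machine learning method described above: it uses a single hand-written regular expression as a stand-in, and the function name `parse_mixture_description`, the pattern, and the output keys (`name`, `quantity`, `units`, `contents`) are assumptions chosen for illustration, loosely modelled on a hierarchical component layout rather than taken from the Mixfile specification.

```python
import json
import re

# Hypothetical pattern covering only descriptions shaped like
# "<solute> <number><units> in <solvent>", e.g. "n-butyllithium 2.5 M in hexanes".
PATTERN = re.compile(
    r"^(?P<solute>.+?),?\s+"
    r"(?P<quantity>\d+(?:\.\d+)?)\s*(?P<units>%|mM|M)\s+"
    r"in\s+(?P<solvent>.+)$"
)


def parse_mixture_description(text):
    """Return a nested dict for a simple description, or None if it does not fit."""
    cleaned = text.strip()
    match = PATTERN.match(cleaned)
    if match is None:
        # Anything outside the toy pattern is left for expert review.
        return None
    return {
        "name": cleaned,
        "contents": [
            {
                "name": match["solute"],
                "quantity": float(match["quantity"]),
                "units": match["units"],
            },
            # The solvent carries no explicit quantity in this sketch.
            {"name": match["solvent"]},
        ],
    }


if __name__ == "__main__":
    record = parse_mixture_description("n-butyllithium 2.5 M in hexanes")
    print(json.dumps(record, indent=2))
```

A description that falls outside the toy pattern, such as "mixed xylenes", returns `None`, which mirrors the need for expert checking of edge cases.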
CITATION STYLE
Clark, A. M., Gedeck, P., Cheung, P. P., & Bunin, B. A. (2021). Using Machine Learning to Parse Chemical Mixture Descriptions. ACS Omega, 6(34), 22400–22409. https://doi.org/10.1021/acsomega.1c03311