A comparative study of machine learning models on molecular fingerprints for odor decoding

Jinyoung Suh; Yeonju Hong; Chunho Park

Journal Article, Open Access

Communications Chemistry (2025) 8(1)

DOI: 10.1038/s42004-025-01651-7

Abstract

Understanding how molecular structure relates to odor perception is a longstanding problem, with important implications for fragrance development and sensory science. In this study, we present an advanced comparative analysis of machine learning approaches for predicting fragrance odors, examining both individual descriptor-based models and integrated frameworks. Using a curated dataset of 8681 compounds from ten expert sources, we benchmark functional group fingerprints, classical molecular descriptors, and Morgan structural fingerprints across Random Forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). The Morgan-fingerprint-based XGBoost model achieves the highest discrimination (AUROC 0.828, AUPRC 0.237), outperforming descriptor-based models. Our findings highlight the superior representational capacity of molecular fingerprints to capture olfactory cues, not only achieving high predictive performance but also revealing a continuous, interpretable scent space that aligns with perceptual and chemical relationships. This paves the way for data-driven research into olfactory mechanisms, alongside the next generation of in silico odor prediction.
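The abstract reports two discrimination metrics, AUROC and AUPRC. As a rough illustration of how the two can differ when positives are rare, the following minimal sketch computes both from scratch on a toy ranking. The data, the function names `auroc` and `average_precision`, and the use of average precision as the AUPRC estimate are assumptions made for illustration only; this is not the authors' evaluation code, and the sketch assumes there are no tied scores when computing average precision.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision measured at the rank of each positive.

    Used here as a step-wise estimate of the area under the
    precision-recall curve. Assumes all scores are distinct.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Hypothetical toy data: 20 compounds, one carries the odor label,
# and the model ranks it fourth from the top.
scores = [(20 - i) / 20 for i in range(20)]
labels = [1 if i == 3 else 0 for i in range(20)]

toy_auroc = auroc(labels, scores)
toy_auprc = average_precision(labels, scores)
```

On this toy ranking the single positive outranks 16 of the 19 negatives, so `toy_auroc` is 16/19 (about 0.842), while `toy_auprc` is 1/4 because three negatives are ranked above it. The example only shows that a high AUROC and a much lower AUPRC can coexist under class imbalance; it says nothing about the label distribution in the study's dataset.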

Cite

Suh, J., Hong, Y., & Park, C. (2025). A comparative study of machine learning models on molecular fingerprints for odor decoding. Communications Chemistry, 8(1). https://doi.org/10.1038/s42004-025-01651-7
