Machine Learning of Raman Spectroscopic Data: Comparison of Different Validation Strategies

Abstract

Machine learning (ML) techniques are valuable for analyzing complex biological SERS spectra, allowing for the detection of minor differences in cell composition. However, several challenges arise in the data analysis process, such as selecting appropriate preprocessing methods, machine learning algorithms, and validation strategies to avoid under- and overfitting and to ensure reliable performance estimates. This study systematically compared various validation strategies and their impact on multiple ML classifiers using four biological datasets of varying complexity in terms of class overlap and sample variability. To this end, a machine learning workflow was established, incorporating more than 10 classifiers and using nested cross-validation (CV) for hyperparameter tuning and performance estimation. Five CV strategies were compared: Leave-One-Group-Out, stratified K-Fold, unstratified K-Fold, Leave-One-Out, and nested CV. Our results demonstrate that stratified K-Fold CV yielded performance nearly equivalent to that of nested CV in terms of accuracy and efficiency, but at a reduced computational cost. The Leave-One-Group-Out strategy produced lower performance estimates than the other four methods, and these lower estimates may be more representative of real-world performance. In conclusion, this work shows that simpler CV strategies can effectively replace computationally expensive nested CV in certain cases while maintaining comparable performance. Nonetheless, careful consideration of overfitting remains crucial when employing these more efficient methods.
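
How the splitting strategies named in the abstract differ can be illustrated with a minimal sketch in plain Python. This is not the authors' workflow: the function names, the toy labels, and the group assignment (three spectra per biological sample) are assumptions made purely for illustration, and no classifier is trained. The sketch only generates train/test index splits and counts the folds in which spectra from the same group land on both sides of the split.

```python
from collections import Counter, defaultdict


def k_fold(n_samples, k):
    """Unstratified K-Fold: contiguous index blocks, labels are ignored."""
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def stratified_k_fold(labels, k):
    """Stratified K-Fold: deal each class round-robin into k folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        for pos, idx in enumerate(members):
            folds[pos % k].append(idx)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, sorted(test)


def leave_one_out(n_samples):
    """Leave-One-Out is K-Fold with one sample per fold."""
    return k_fold(n_samples, n_samples)


def leave_one_group_out(groups):
    """Hold out every sample of one group (e.g. one biological sample) per fold."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield train, test


def leaky_folds(splits, groups):
    """Count folds whose test groups also occur in the training set."""
    return sum(
        1
        for train, test in splits
        if {groups[i] for i in train} & {groups[i] for i in test}
    )


# Toy data (assumed): 12 spectra, 2 classes, 4 groups with 3 spectra each.
labels = ["A"] * 6 + ["B"] * 6
groups = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
n = len(labels)

leak_kfold = leaky_folds(k_fold(n, 3), groups)
leak_strat = leaky_folds(stratified_k_fold(labels, 3), groups)
leak_loo = leaky_folds(leave_one_out(n), groups)
leak_logo = leaky_folds(leave_one_group_out(groups), groups)

# Class composition of the first test fold for both K-Fold variants.
first_plain = Counter(labels[i] for i in next(k_fold(n, 3))[1])
first_strat = Counter(labels[i] for i in next(stratified_k_fold(labels, 3))[1])

print("leaky folds:", leak_kfold, leak_strat, leak_loo, leak_logo)
print("first test fold classes:", dict(first_plain), dict(first_strat))
```

In this toy setup, every fold of the K-Fold variants and of Leave-One-Out places spectra from the same group in both the training and the test set, whereas Leave-One-Group-Out never does. This kind of group leakage is one plausible reason for the lower, possibly more realistic, Leave-One-Group-Out estimates mentioned above. The example also shows that unstratified K-Fold can produce single-class test folds on label-sorted data, which stratification avoids.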

Cite

Lilek, D., Zimmermann, D., Steininger, L., Musso, M., Wilts, B. D., Gamsjaeger, S., … Prohaska, K. (2025). Machine Learning of Raman Spectroscopic Data: Comparison of Different Validation Strategies. Journal of Raman Spectroscopy, 56(9), 867–877. https://doi.org/10.1002/jrs.6842
