Pretreatment Methods for Enhancing Machine Learning Performance on Metabolomics Data



Abstract

Pretreatment methods are critical for metabolomics data analysis, yet their impact on machine learning performance remains insufficiently explored. This paper systematically evaluates eight pretreatment methods (Centering, Autoscaling, Range Scaling, Pareto Scaling, Vast Scaling, Level Scaling, Log Transformation, and Power Transformation) across four diverse metabolomics datasets (MTBLS161, MTBLS547, ST001000, and ST001047). Its novelty lies in a comprehensive assessment of how these methods influence model-specific performance for four classifiers: Gradient Boosting Classifier, Multi-Layer Perceptron, Support Vector Classifier, and Random Forest. The findings reveal that Vast Scaling and Autoscaling consistently outperform the other methods, enhancing classification accuracy and robustness by effectively normalizing metabolite intensities and preserving variance, whereas simpler methods such as Centering and Log Transformation offer limited improvements on high-dimensional datasets. This study establishes a framework for designing tailored preprocessing pipelines, advancing metabolomics data analysis and enabling the extraction of meaningful biological insights through machine learning.
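The eight pretreatment methods named in the abstract have standard column-wise definitions in the metabolomics literature (e.g., van den Berg et al., 2006). The sketch below applies them to a samples-by-metabolites matrix with NumPy; it follows those standard formulas, not the paper's own implementation, which the abstract does not detail.

```python
import numpy as np

def pretreat(X, method):
    """Apply a column-wise pretreatment to a samples-by-metabolites matrix.

    Formulas follow the conventional definitions; they are assumptions here,
    since the abstract does not specify the paper's exact implementation.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    if method == "centering":                      # remove the per-metabolite mean
        return X - mean
    if method == "autoscaling":                    # unit variance per metabolite
        return (X - mean) / sd
    if method == "range":                          # scale by the metabolite's range
        return (X - mean) / (X.max(axis=0) - X.min(axis=0))
    if method == "pareto":                         # sqrt of the standard deviation
        return (X - mean) / np.sqrt(sd)
    if method == "vast":                           # autoscale weighted by mean/sd
        return (X - mean) / sd * (mean / sd)
    if method == "level":                          # scale by the mean intensity
        return (X - mean) / mean
    if method == "log":                            # requires strictly positive values
        return np.log10(X)
    if method == "power":                          # square root, then center
        Xs = np.sqrt(X)
        return Xs - Xs.mean(axis=0)
    raise ValueError(f"unknown pretreatment method: {method}")
```

Each method is applied per metabolite (per column), so metabolites measured on very different intensity scales become comparable before a classifier is trained.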

Citation (APA)

Rustam. (2025). Pretreatment Methods for Enhancing Machine Learning Performance on Metabolomics Data. IEEE Access, 13, 80133–80148. https://doi.org/10.1109/ACCESS.2025.3567153
