Pretreatment Methods for Enhancing Machine Learning Performance on Metabolomics Data



Abstract

Pretreatment methods are critical for metabolomics data analysis, yet their impact on machine learning performance remains insufficiently explored. This paper systematically evaluates eight pretreatment methods (Centering, Autoscaling, Range Scaling, Pareto Scaling, Vast Scaling, Level Scaling, Log Transformation, and Power Transformation) across four diverse metabolomics datasets (MTBLS161, MTBLS547, ST001000, and ST001047). Its novelty lies in a comprehensive assessment of how these methods influence model-specific performance for four classifiers: Gradient Boosting Classifier, Multi-Layer Perceptron, Support Vector Classifier, and Random Forest. The findings reveal that Vast Scaling and Autoscaling consistently outperform the other methods, enhancing classification accuracy and robustness by effectively normalizing metabolite intensities and preserving variance, whereas simpler methods such as Centering and Log Transformation offer limited improvements on high-dimensional datasets. This study establishes a framework for designing tailored preprocessing pipelines, advancing metabolomics data analysis and enabling the extraction of meaningful biological insights through machine learning.
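The eight pretreatment methods named in the abstract have standard column-wise definitions in the metabolomics literature (e.g., van den Berg et al., 2006). The sketch below applies them to a samples-by-metabolites matrix with NumPy; it follows those standard formulas, not the paper's own implementation, which the abstract does not detail.

```python
import numpy as np

def pretreat(X, method):
    """Apply a column-wise pretreatment to a samples-by-metabolites matrix.

    Formulas follow the conventional definitions; they are assumptions here,
    since the abstract does not specify the paper's exact implementation.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    if method == "centering":                      # remove the per-metabolite mean
        return X - mean
    if method == "autoscaling":                    # unit variance per metabolite
        return (X - mean) / sd
    if method == "range":                          # scale by the metabolite's range
        return (X - mean) / (X.max(axis=0) - X.min(axis=0))
    if method == "pareto":                         # sqrt of the standard deviation
        return (X - mean) / np.sqrt(sd)
    if method == "vast":                           # autoscale weighted by mean/sd
        return (X - mean) / sd * (mean / sd)
    if method == "level":                          # scale by the mean intensity
        return (X - mean) / mean
    if method == "log":                            # requires strictly positive values
        return np.log10(X)
    if method == "power":                          # square root, then center
        Xs = np.sqrt(X)
        return Xs - Xs.mean(axis=0)
    raise ValueError(f"unknown pretreatment method: {method}")
```

Each method is applied per metabolite (per column), so metabolites measured on very different intensity scales become comparable before a classifier is trained.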

Citation (APA)

Rustam. (2025). Pretreatment Methods for Enhancing Machine Learning Performance on Metabolomics Data. IEEE Access, 13, 80133–80148. https://doi.org/10.1109/ACCESS.2025.3567153
