Abstract
Eye-tracking data provides valuable insights into human behavior, yet its high variability and sensitivity to noise require robust preprocessing to ensure meaningful analysis. This study introduces and evaluates a systematic preprocessing pipeline designed to improve machine learning classifier performance on eye-tracking data, using a dataset for academic cheating detection. Unlike prior work focusing on isolated preprocessing steps, our approach explores 193 configurations that combine techniques for missing value imputation, outlier handling, normalization, smoothing, feature limiting, and filtering. A Random Forest classifier is used consistently across all configurations due to its robustness and prior success in similar domains. Our results demonstrate that well-designed preprocessing pipelines can substantially improve classification accuracy. Additionally, a feature importance analysis reveals that static spatial and camera-based metrics outperform traditional gaze dynamics in predictive power. This research aims to create a reusable framework for preprocessing eye-tracking data.
Cite
Landes, J., Klettke, M., & Köppl, S. (2025). Impact of Preprocessing on Classification Results of Eye-Tracking-Data. Datenbank-Spektrum, 25(3), 153–166. https://doi.org/10.1007/s13222-025-00518-4