ML-PipeDebugger: A Debugging Tool for Data Processing Pipelines

Abstract

Data pre-processing for data analysis usually requires a considerable number of interdependent steps, many of which are prone to errors or may introduce unwanted biases. Such errors can lead to cases where predictions for similar data instances differ by an unexpectedly large amount. An important question is then where in the data processing pipeline the deviation was introduced. We present a tool that helps identify critical data processing steps, making it possible to “debug” or improve data pre-processing and model generation. More generally, the tool gives a view of how different data instances behave in relation to each other throughout a pipeline. Identifying critical steps turns out to be rather complex, mostly because features of different types and ranges have to be compared, because the required statistical measures must often be obtained from small samples, and because time series can be involved.
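The core idea of the abstract, following two similar instances through a pipeline and locating the step at which they drift apart, can be illustrated with a small sketch. This is not the authors' implementation: the function names (normalized_distance, trace_divergence, most_critical_step), the toy pipeline, and the choice of range-scaled mean absolute difference as the distance are all assumptions made for illustration. Range scaling is used here only as a simple way to compare features with different ranges. It does not address mixed feature types, small samples, or time series, which the abstract names as the hard parts.

```python
from typing import Callable, List

Row = List[float]
Step = Callable[[List[Row]], List[Row]]


def normalized_distance(a: Row, b: Row, data: List[Row]) -> float:
    """Mean absolute difference of a and b, each feature scaled by its range in data.

    Assumption: range scaling stands in for whatever statistical
    measure a real tool would use to make features comparable.
    """
    total = 0.0
    for j in range(len(a)):
        column = [row[j] for row in data]
        span = max(column) - min(column)
        if span > 0:  # constant features contribute nothing
            total += abs(a[j] - b[j]) / span
    return total / len(a)


def trace_divergence(data: List[Row], i: int, k: int, steps: List[Step]) -> List[float]:
    """Distance between instances i and k before the pipeline and after each step."""
    distances = [normalized_distance(data[i], data[k], data)]
    for step in steps:
        data = step(data)
        distances.append(normalized_distance(data[i], data[k], data))
    return distances


def most_critical_step(distances: List[float]) -> int:
    """Index of the step with the largest increase in distance."""
    jumps = [distances[s + 1] - distances[s] for s in range(len(distances) - 1)]
    return max(range(len(jumps)), key=jumps.__getitem__)


# Hypothetical two-step pipeline on toy data (both steps are made up).
def scale_first_feature(data: List[Row]) -> List[Row]:
    return [[row[0] * 2.0, row[1]] for row in data]


def threshold_first_feature(data: List[Row]) -> List[Row]:
    return [[1.0 if row[0] > 2.1 else 0.0, row[1]] for row in data]


data = [[1.0, 10.0], [1.1, 10.0], [5.0, 20.0], [9.0, 30.0]]
steps = [scale_first_feature, threshold_first_feature]

# Instances 0 and 1 start out nearly identical.
distances = trace_divergence(data, 0, 1, steps)
critical = most_critical_step(distances)
print(distances, "-> critical step:", steps[critical].__name__)
```

In this toy case the scaling step leaves the relative distance unchanged, while the thresholding step places the two near-identical instances on opposite sides of the cut, so it is reported as the critical step.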

Cite

Kossak, F., & Zwick, M. (2019). ML-PipeDebugger: A Debugging Tool for Data Processing Pipelines. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11707 LNCS, pp. 263–272). Springer. https://doi.org/10.1007/978-3-030-27618-8_20
