ML-PipeDebugger: A Debugging Tool for Data Processing Pipelines

Felix Kossak; Michael Zwick

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11707 LNCS 263-272

DOI: 10.1007/978-3-030-27618-8_20

Abstract

Data pre-processing for data analysis usually requires a considerable number of interdependent steps, many of which are prone to errors or may introduce unwanted biases. Such errors can lead to cases where predictions for similar data instances differ by an unexpectedly large amount. An important question is then where in the data processing pipeline the deviation was caused. We present a tool that can help identify critical data processing steps, making it possible to “debug” or improve data pre-processing and model generation. More generally, the tool gives a view of how different data instances behave in relation to each other throughout a pipeline. The task of identifying critical steps turns out to be rather complex, mostly because features of different types and ranges have to be compared, because the required statistical measures must often be obtained from small samples, and because time series can be involved.
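To make the idea of locating a critical step concrete, the following is a minimal illustrative sketch and not the algorithm used by ML-PipeDebugger, which the abstract does not describe. It assumes a purely numeric feature matrix and a pipeline given as a list of named functions; the names `normalized_distance`, `trace_divergence`, and `largest_jump`, the toy data, and the two toy steps are all hypothetical. It tracks a range-normalised distance between two instances after every step and reports the step after which that distance grew the most. Mixed feature types, small-sample statistics, and time series, which the abstract names as the hard parts, are deliberately left out.

```python
import numpy as np


def normalized_distance(matrix, i, j):
    """Mean per-feature distance between rows i and j, each feature scaled by its range."""
    span = matrix.max(axis=0) - matrix.min(axis=0)
    span = np.where(span == 0, 1.0, span)  # constant features contribute zero distance
    return float(np.mean(np.abs(matrix[i] - matrix[j]) / span))


def trace_divergence(data, steps, i, j):
    """Record the distance between instances i and j on the input and after each step."""
    current = np.asarray(data, dtype=float)
    trace = [("input", normalized_distance(current, i, j))]
    for name, step in steps:
        current = np.asarray(step(current), dtype=float)
        trace.append((name, normalized_distance(current, i, j)))
    return trace


def largest_jump(trace):
    """Return (step name, increase) for the step with the largest distance increase."""
    jumps = [(trace[k][0], trace[k][1] - trace[k - 1][1]) for k in range(1, len(trace))]
    return max(jumps, key=lambda pair: pair[1])


# Hypothetical toy data: rows 0 and 1 are nearly identical instances.
data = [[1.0, 10.0], [1.1, 10.0], [5.0, 50.0], [9.0, 90.0]]

# Hypothetical pipeline: a harmless rescaling followed by a hard threshold
# that happens to fall between the two similar instances.
steps = [
    ("scale", lambda x: x * 2.0),
    ("threshold", lambda x: (x > 2.1).astype(float)),
]

trace = trace_divergence(data, steps, 0, 1)
critical_step, increase = largest_jump(trace)
```

In this toy setup the rescaling leaves the normalised distance unchanged, while the threshold separates the two near-identical rows, so `critical_step` is `"threshold"`. A realistic tool would need per-type distance measures and more robust statistics than a simple range.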

Cite

Kossak, F., & Zwick, M. (2019). ML-PipeDebugger: A Debugging Tool for Data Processing Pipelines. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11707 LNCS, pp. 263–272). Springer. https://doi.org/10.1007/978-3-030-27618-8_20
