On Studying the Effect of Data Quality on Classification Performances

Abstract

During the last decade, data have played a key role in learning and decision-making models. Unfortunately, data quality has often been ignored, or only partially investigated as a pre-processing step. Motivated by applications in various fields, we propose to study data quality and its impact on the performance of several learning models. In this work, we first study the difficulty of repairing errors by introducing a list of elementary repairing tasks, ordered by increasing difficulty from easy to complex. Then, we group state-of-the-art cleaning and repairing methods into categories. We also investigate whether it is always efficient to repair data. Because our study relies on standard classification models and public datasets, it can be reused in different contexts and extended to other machine learning applications.
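The kind of experiment the abstract describes can be illustrated with a minimal sketch. It is not the authors' protocol: the synthetic two-class data (a stand-in for a public dataset), the nearest-centroid classifier, the missing-value error model, and all function names below are assumptions chosen to keep the example self-contained. It injects missing values into the training set at increasing rates, applies two elementary treatments (dropping incomplete rows, or filling with the column mean), and measures accuracy on a clean test set.

```python
import random
import statistics


def make_dataset(n, rng):
    """Synthetic stand-in for a public dataset: two Gaussian classes in 2-D."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 * label  # class 0 around (0, 0), class 1 around (3, 3)
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


def inject_missing(rows, rate, rng):
    """Assumed error model: each feature value becomes None with probability `rate`."""
    return [([None if rng.random() < rate else v for v in x], y) for x, y in rows]


def drop_incomplete(rows):
    """Treatment 1: discard every row that has at least one missing value."""
    return [(x, y) for x, y in rows if None not in x]


def repair_mean(rows):
    """Treatment 2: fill each missing value with the mean of its column."""
    means = []
    for j in range(len(rows[0][0])):
        observed = [x[j] for x, _ in rows if x[j] is not None]
        means.append(statistics.mean(observed) if observed else 0.0)
    return [([means[j] if v is None else v for j, v in enumerate(x)], y)
            for x, y in rows]


def fit_centroids(rows):
    """Nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for label in {y for _, y in rows}:
        members = [x for x, y in rows if y == label]
        centroids[label] = [statistics.mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))


def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)


rng = random.Random(0)          # fixed seed so the run is reproducible
train = make_dataset(400, rng)
test = make_dataset(200, rng)   # the test set is kept clean

results = {"clean": accuracy(fit_centroids(train), test)}
for rate in (0.1, 0.3, 0.5):
    dirty = inject_missing(train, rate, rng)
    dropped = drop_incomplete(dirty)
    repaired = repair_mean(dirty)
    results[("drop", rate)] = accuracy(fit_centroids(dropped), test)
    results[("mean", rate)] = accuracy(fit_centroids(repaired), test)

for name, acc in results.items():
    print(name, round(acc, 3))
```

Comparing the "drop" and "mean" rows at each error rate gives a small-scale view of the question raised in the abstract, namely whether repairing is worth it compared with a cheaper treatment. On a problem this simple the differences may be small; other error types, datasets, and classifiers would have to be substituted to draw any conclusion.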

Cite

Jouseau, R., Salva, S., & Samir, C. (2022). On Studying the Effect of Data Quality on Classification Performances. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13756 LNCS, pp. 82–93). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-21753-1_9
