Noisy data set identification

Luís Paulo F. García; André C.P.L.F. De Carvalho; Ana C. Lorena

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8073 LNAI 629-638

DOI: 10.1007/978-3-642-40846-5_63

19 Citations

42 Readers

Abstract

Real data are often corrupted by noise, which can arise from errors in data collection, storage and processing. The presence of noise hampers the induction of Machine Learning models from data: it can impair their predictive or descriptive performance and lengthen training time. Moreover, these models can become overly complex in order to accommodate such errors. Thus, the identification and reduction of noise in a data set may benefit the learning process. In this paper, we investigate the use of data complexity measures to identify the presence of noise in a data set. This identification can support the decision on whether noise reduction techniques need to be applied. © 2013 Springer-Verlag.

Cite

APA

García, L. P. F., De Carvalho, A. C. P. L. F., & Lorena, A. C. (2013). Noisy data set identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8073 LNAI, pp. 629–638). https://doi.org/10.1007/978-3-642-40846-5_63
