Survey on Preprocessing Techniques for Big Data Projects †

Abstract

In the era of big data, vast amounts of data are being produced. This raises two main issues when trying to discover knowledge from these data: much of the information is irrelevant to the problem at hand, and the data contain many imperfections and errors. Preprocessing is therefore a key step before applying any learning algorithm. Two of the main preprocessing techniques are feature selection, which reduces the number of features to a relevant subset, and discretisation, which reduces the possible values of continuous variables. This paper reviews methods for both steps, focusing on the big data context and giving examples of projects where they have been applied.
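
As a rough illustration of the two techniques named in the abstract, the following minimal sketch applies a variance-based feature filter and equal-width discretisation to a toy dataset. Both methods, the function names (`select_by_variance`, `equal_width_bins`), and the sample data are assumptions chosen for brevity; they are not necessarily among the methods the paper reviews, and a real big data project would use a distributed implementation.

```python
from statistics import pvariance


def equal_width_bins(values, n_bins):
    """Discretisation: map each continuous value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so a single bin is enough.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def select_by_variance(columns, threshold):
    """Feature selection: keep names of columns whose population variance exceeds threshold."""
    return [name for name, vals in columns.items() if pvariance(vals) > threshold]


# Hypothetical toy data: one informative-looking column, one constant column, one skewed column.
data = {
    "age": [23.0, 35.0, 47.0, 59.0, 71.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],
    "income": [20.0, 22.0, 21.0, 80.0, 85.0],
}

kept = select_by_variance(data, threshold=0.0)       # drops the constant column
binned_age = equal_width_bins(data["age"], n_bins=3)  # three equal-width intervals

print(kept)
print(binned_age)
```

The variance filter is unsupervised and only removes features that carry no variation; methods that score features against a target variable would need a label column, which this sketch omits.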

Cite

Lopez-Miguel, I. D. (2021). Survey on Preprocessing Techniques for Big Data Projects †. Engineering Proceedings, 7(1). https://doi.org/10.3390/engproc2021007014
