Data preprocessing is a data mining technique that makes data ready for further processing according to the requirements of the task. Preprocessing is required because the data may be incomplete or redundant, or may come from different sources that require aggregation. Data can be processed either sequentially or in parallel, and several parallel frameworks, such as Hadoop, MPI, and CUDA, are available for this purpose. A survey of these parallel frameworks is presented, and the sequential and parallel approaches are compared in terms of efficiency.
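To make the sequential versus parallel distinction concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a toy cleaning step (dropping rows with missing fields and exact duplicates) and uses the standard library's process pool as a stand-in for frameworks such as Hadoop, MPI, or CUDA. The function names `clean_chunk`, `preprocess_sequential`, and `preprocess_parallel`, and the sample data, are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor


def clean_chunk(rows):
    """Drop rows with a missing (None) field and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row):
            continue  # incomplete record
        if row in seen:
            continue  # redundant record
        seen.add(row)
        cleaned.append(row)
    return cleaned


def preprocess_sequential(rows):
    """Sequential approach: one pass over the whole dataset."""
    return clean_chunk(rows)


def preprocess_parallel(rows, workers=2, executor_cls=ProcessPoolExecutor):
    """Parallel approach: clean chunks independently, then merge the results."""
    size = max(1, -(-len(rows) // workers))  # ceiling division
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with executor_cls(max_workers=workers) as pool:
        partials = list(pool.map(clean_chunk, chunks))
    # Duplicates can span chunk boundaries, so a final merge pass is still needed.
    merged = [row for part in partials for row in part]
    return clean_chunk(merged)


if __name__ == "__main__":
    # Hypothetical sample data: tuples so that rows are hashable.
    data = [(1, "a"), (2, None), (1, "a"), (3, "c"), (3, "c"), (4, "d")]
    assert preprocess_sequential(data) == preprocess_parallel(data)
    print(preprocess_parallel(data))
```

Both functions return the same cleaned rows; they differ only in how the work is scheduled. On small inputs the cost of starting worker processes and merging partial results will likely outweigh any gain, which is the kind of trade-off an efficiency comparison between the two approaches has to account for.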
CITATION STYLE
Rai, S., Geetha, M., & Kumar, P. (2022). Preprocessing of Datasets Using Sequential and Parallel Approach: A Comparison. In Lecture Notes in Networks and Systems (Vol. 209, pp. 311–320). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-16-2126-0_27