Processing massive data is a challenging problem in the age of big data. Traditional attribute reduction algorithms are generally time-consuming on massive data. To speed up processing, we introduce a parallel fast approximate attribute reduction algorithm based on MapReduce. We divide the original data into many small blocks and apply a reduction algorithm, based on attribute significance, to each block. To select the best reduction, we compute the dependency of each block's reduction on testing data. We run experiments on data sets of different sizes. The experimental results show that the proposed algorithm can efficiently process large-scale data on the Hadoop platform. In particular, on high-dimensional data, the algorithm runs significantly faster than other recent parallel reduction methods. © 2013 Springer-Verlag.
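The block-wise scheme described in the abstract can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the function names (`dependency`, `greedy_reduct`, `split_blocks`, `best_block_reduct`), the tuple-based row layout, and the use of the standard rough-set dependency degree are assumptions made for illustration, and the MapReduce "map" step is imitated by an ordinary list comprehension rather than a Hadoop job.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Rough-set dependency degree: the fraction of rows whose equivalence
    class under `attrs` carries a single decision value."""
    if not rows:
        return 0.0
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        classes[key].append(row[decision])
    consistent = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, cond_attrs, decision):
    """Forward selection by attribute significance: keep adding the attribute
    that raises the dependency most, until it matches the dependency of the
    full condition attribute set on this block."""
    target = dependency(rows, cond_attrs, decision)
    reduct = []
    current = dependency(rows, reduct, decision)
    while current < target:
        remaining = [a for a in cond_attrs if a not in reduct]
        best = max(remaining,
                   key=lambda a: dependency(rows, reduct + [a], decision))
        reduct.append(best)
        current = dependency(rows, reduct, decision)
    return reduct


def split_blocks(rows, block_size):
    """Cut the data into consecutive blocks (stand-in for input splits)."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def best_block_reduct(train, test, cond_attrs, decision, block_size):
    """Compute one candidate reduct per block, then keep the candidate with
    the highest dependency on the testing data."""
    candidates = [greedy_reduct(block, cond_attrs, decision)
                  for block in split_blocks(train, block_size)]
    return max(candidates, key=lambda r: dependency(test, r, decision))


# Hypothetical toy data: columns 0-2 are condition attributes, column 3 is
# the decision. The second block makes attribute 0 look sufficient by chance.
TRAIN = [
    (0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 0, 0),
    (0, 0, 0, 0), (0, 1, 0, 0), (1, 0, 1, 1), (1, 1, 1, 1),
]
TEST = [(0, 0, 1, 1), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 1, 1)]

if __name__ == "__main__":
    print(best_block_reduct(TRAIN, TEST, [0, 1, 2], 3, block_size=4))
```

Because each block is reduced independently, the per-block calls have no shared state, which is what makes the step suitable for parallel execution. The final selection on testing data guards against a block whose reduct only fits that block.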
CITATION STYLE
Li, P., Wu, J., & Shang, L. (2013). Fast approximate attribute reduction with MapReduce. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8171 LNAI, pp. 271–278). https://doi.org/10.1007/978-3-642-41299-8_26