MiSeRe-Hadoop: A large-scale robust sequential classification rules mining framework

Elias Egho; Dominique Gay; Romain Trinquart; Marc Boullé; Nicolas Voisine; Fabrice Clérot

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017), Vol. 10440 LNCS, pp. 105-119

DOI: 10.1007/978-3-319-64283-3_8

Abstract

Sequence classification has become a fundamental problem in data mining and machine learning. Feature-based classification is a widely used technique for sequence classification, and mining sequential classification rules plays an important role in it. Despite the abundant literature in this area, mining sequential classification rules remains a challenge: few of the available methods are sufficiently scalable to handle large-scale datasets. MapReduce is an ideal framework for distributed computing on large datasets across clusters of computers. In this paper, we propose a distributed version of the MiSeRe algorithm on MapReduce, called MiSeRe-Hadoop. MiSeRe-Hadoop retains the valuable properties of MiSeRe: (i) it is a robust, user-parameter-free, anytime algorithm, and (ii) it employs an instance-based randomized strategy to promote diversity during mining. We applied our method to two large real-world datasets: a marketing dataset and a text dataset. Our results confirm that the method scales to large-scale sequential data analysis.

Cite

Egho, E., Gay, D., Trinquart, R., Boullé, M., Voisine, N., & Clérot, F. (2017). MiSeRe-Hadoop: A large-scale robust sequential classification rules mining framework. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10440 LNCS, pp. 105–119). Springer Verlag. https://doi.org/10.1007/978-3-319-64283-3_8
