StrDip: A fast data stream clustering algorithm using the dip test of unimodality

Yonghong Luo; Ying Zhang; Xiaoke Ding; Xiangrui Cai; Chunyao Song; Xiaojie Yuan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11234 LNCS 193-208

DOI: 10.1007/978-3-030-02925-8_14


Abstract

Data stream clustering is an important problem in data mining. As the length of a data stream grows without bound, the sheer volume of data makes storage increasingly difficult. A number of algorithms have been proposed for data stream clustering, such as CluStream, DenStream, DStream and StrAP. With the arrival of the Big Data era, the amount of data per timestamp is growing rapidly, so the time efficiency of data stream clustering algorithms is drawing considerable attention from researchers; some state-of-the-art algorithms achieve excellent cluster purity but are unacceptably slow. In this paper, we propose StrDip, a fast data stream clustering algorithm that combines the Dip Test of Unimodality with the online/offline two-stage stream clustering framework. StrDip also adopts a novel clustering feature vector and several microcluster pruning methods. Experiments on synthetic and real-world datasets show that, compared with other algorithms, StrDip gains a large advantage in time efficiency while keeping good clustering purity and quality.
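
For readers unfamiliar with the online/offline two-stage framework, the following Python sketch illustrates the online stage only, using a generic additive clustering feature (count, linear sum, squared sum) in the style of BIRCH and CluStream. It is not StrDip's feature vector, which the abstract does not specify, and it leaves out the dip test and the microcluster pruning. The names `MicroCluster`, `online_update` and `max_dist` are hypothetical and chosen only for illustration.

```python
import math


class MicroCluster:
    """Generic additive summary of the points absorbed so far (an assumption,
    not the feature vector proposed in the paper)."""

    def __init__(self, dim):
        self.n = 0                 # number of absorbed points
        self.ls = [0.0] * dim      # per-dimension linear sum
        self.ss = [0.0] * dim      # per-dimension sum of squares

    def absorb(self, point):
        """Update the summary in O(dim) without storing the point."""
        self.n += 1
        for i, x in enumerate(point):
            self.ls[i] += x
            self.ss[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        """Root of the summed per-dimension variance around the centroid."""
        var = sum(
            self.ss[i] / self.n - (self.ls[i] / self.n) ** 2
            for i in range(len(self.ls))
        )
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors


def online_update(clusters, point, max_dist):
    """Assign `point` to the nearest micro-cluster, or open a new one
    when no centroid lies within `max_dist` (a hypothetical threshold)."""
    best, best_d = None, float("inf")
    for mc in clusters:
        d = math.dist(mc.centroid(), point)
        if d < best_d:
            best, best_d = mc, d
    if best is None or best_d > max_dist:
        best = MicroCluster(len(point))
        clusters.append(best)
    best.absorb(point)
    return best


# Usage: two well-separated groups end up in two micro-clusters.
clusters = []
for p in [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]:
    online_update(clusters, p, max_dist=1.0)
```

Because the summaries are additive, an offline stage can later merge or cluster them without revisiting the raw stream, which is what keeps memory bounded as the stream grows.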

Cite

Luo, Y., Zhang, Y., Ding, X., Cai, X., Song, C., & Yuan, X. (2018). StrDip: A fast data stream clustering algorithm using the dip test of unimodality. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11234 LNCS, pp. 193–208). Springer Verlag. https://doi.org/10.1007/978-3-030-02925-8_14
