Parallel and distributed solutions are essential for clustering data streams because of the large volumes of data involved. This paper first examines a direct adaptation of a recently developed prototype-based algorithm to three existing parallel frameworks. Based on an evaluation of their performance, the paper then presents a customised pipeline framework that combines incremental and two-phase learning into a balanced approach that dynamically allocates the available processing resources. This new framework is evaluated on a collection of synthetic datasets. The experimental results show that the framework not only produces correct final clusters but also significantly improves clustering efficiency.
CITATION STYLE
Alazeez, A. A. A., Jassim, S., & Du, H. (2019). TPICDS: A two-phase parallel approach for incremental clustering of data streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11339 LNCS, pp. 5–16). Springer Verlag. https://doi.org/10.1007/978-3-030-10549-5_1