The least sample size essential for detecting changes in clustering solutions of streaming datasets


Abstract

The clustering analysis approach treats multivariate data tuples as objects and groups them into clusters based on their similarities or dissimilarities within the dataset. However, in the modern world, a significant volume of data is generated continuously from diverse sources over time. In these dynamic scenarios, the data is not static but continually evolves. Consequently, the interesting patterns and inherent subgroups within the datasets also change and develop over time. Researchers have paid special attention to monitoring changes in the cluster solutions of evolving streams, and several algorithms have been proposed in the literature for this purpose. However, to date, no study has examined the effect of variability in cluster sizes on the evolution of cluster solutions. Moreover, no guidance is available on how cluster sizes affect the types of changes that clusters experience in the streams. In the present simulation study using artificial datasets, the evolution of clusters is examined with respect to the variability in cluster sizes. The findings are significant because tracing and monitoring changes in clustering solutions has a wide range of applications across fields of research. This study determines the minimum sample size required for clustering time-stamped datasets.

Cite

Atif, M., Farooq, M., Abiad, M., & Shafiq, M. (2024). The least sample size essential for detecting changes in clustering solutions of streaming datasets. PLoS ONE, 19(2). https://doi.org/10.1371/journal.pone.0297355
