An online modified X-means method is proposed for clustering data streams when the number of clusters is not known a priori. The approach is based on an ensemble of clustering neural networks built from T. Kohonen's self-organizing maps. Each network in the ensemble has a different number of neurons, and the number of clusters is linked to the quality of the clustering process. All ensemble members process the sequentially fed data in parallel. The quality of the clustering is assessed with the Caliński-Harabasz index. The self-learning algorithm uses a similarity measure of a special type. A main feature of the proposed method is the absence of a competition step, i.e., no winning neuron is determined. A number of experiments were conducted to investigate the properties of the proposed system. The experimental results confirm that the system can be used for a wide range of Data Mining tasks in which data sets are processed online. The proposed ensemble system is computationally simple, and data sets are processed faster because the ensemble members can be tuned in parallel.
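The overall scheme (several clusterers of different sizes fed by the same stream, then ranked with the Caliński-Harabasz index) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the similarity measure or the learning rule, so the Gaussian-type similarity, the `width` parameter, the count-based learning rate, the sliding evaluation window, and the names `SoftClusterer` and `run_ensemble` are all assumptions made for illustration. Members are updated one after another here, although they are independent and could run in parallel.

```python
import math
import random
from collections import deque
from itertools import islice

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class SoftClusterer:
    """One ensemble member: prototypes updated without selecting a winner."""
    def __init__(self, init_points, width=1.0):
        self.w = [list(p) for p in init_points]
        self.counts = [1.0] * len(self.w)
        self.width = width  # assumed parameter of the similarity measure

    def update(self, x):
        # Assumed Gaussian-type similarity; every prototype moves toward x
        # in proportion to its normalized similarity (no competition step).
        sims = [math.exp(-sq_dist(x, w) / (2 * self.width ** 2)) for w in self.w]
        total = sum(sims) or 1.0
        for j, w in enumerate(self.w):
            s = sims[j] / total
            self.counts[j] += s
            rate = s / self.counts[j]
            for i in range(len(w)):
                w[i] += rate * (x[i] - w[i])

    def assign(self, x):
        # Hard labels are used only to evaluate clustering quality.
        return min(range(len(self.w)), key=lambda j: sq_dist(x, self.w[j]))

def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion."""
    n, dim = len(points), len(points[0])
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)
    k = len(groups)  # only non-empty clusters are counted
    if k < 2 or n <= k:
        return 0.0
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    between = within = 0.0
    for members in groups.values():
        c = [sum(p[i] for p in members) / len(members) for i in range(dim)]
        between += len(members) * sq_dist(c, mean)
        within += sum(sq_dist(p, c) for p in members)
    if within == 0.0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))

def run_ensemble(stream, sizes=(2, 3, 4), window=150, width=1.0):
    """Feed one stream to all members; score each on the most recent points."""
    stream = iter(stream)
    warmup = list(islice(stream, max(sizes)))  # first points seed the prototypes
    members = {m: SoftClusterer(warmup[:m], width) for m in sizes}
    recent = deque(warmup, maxlen=window)
    for x in stream:
        recent.append(x)
        for member in members.values():
            member.update(x)
    points = list(recent)
    scores = {m: calinski_harabasz(points, [c.assign(p) for p in points])
              for m, c in members.items()}
    return max(scores, key=scores.get), scores

# Usage on synthetic data: three Gaussian blobs, interleaved in the stream.
rng = random.Random(0)
centers = [(0.0, 0.0), (8.0, 0.0), (4.0, 7.0)]
data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for _ in range(100) for cx, cy in centers]
best, scores = run_ensemble(data)
print(best, scores)
```

The member with the highest index is taken as the current estimate of the number of clusters. Which size wins depends on the assumed similarity width and on the data, so the sketch makes no claim about the result.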
CITATION STYLE
Zhernova, P., Deyneko, A., Deyneko, Z., Pliss, I., & Ahafonov, V. (2019). Data stream clustering in conditions of an unknown amount of classes. In Advances in Intelligent Systems and Computing (Vol. 754, pp. 410–418). Springer Verlag. https://doi.org/10.1007/978-3-319-91008-6_41