Statistical σ-partition clustering over data streams

Nam Hun Park; Won Suk Lee

Conference Proceedings (Open Access)

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2003) 2838 387-398

DOI: 10.1007/978-3-540-39804-2_35

Abstract

This paper proposes a grid-based clustering method that dynamically partitions the range of a grid cell based on the distribution statistics of the data elements it receives from a data stream. Initially, the multi-dimensional space of the data domain is partitioned into a set of mutually exclusive, equal-size initial cells. As new data elements are continuously generated, each cell monitors the distribution statistics of the data elements within its range. When the support of the data elements in a cell becomes high enough, the cell is dynamically divided into two mutually exclusive smaller cells, called intermediate cells, under the assumption that the data elements follow a normal distribution. The dense sub-range of an initial cell is thus recursively partitioned until it reaches the smallest cell size, called a unit cell. To minimize the number of cells, a sparse intermediate or unit cell can be pruned when its support falls well below a minimum support. The performance of the proposed method is comparatively analyzed through a series of experiments.
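The splitting behaviour described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' algorithm: the abstract does not state the exact partition rule, so the choice to cut a cell at its running mean, the even division of the count between the two children (motivated by the symmetry of a normal distribution), and all names and parameters (`Cell`, `StreamGrid`, `split_support`, `unit_width`, `warmup`) are assumptions made for illustration. Only the count and mean are tracked; a rule based on σ would also need a running variance. Pruning of sparse cells is omitted.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    lo: float            # inclusive lower bound of the cell's range
    hi: float            # exclusive upper bound of the cell's range
    count: float = 0.0   # elements seen (an estimate right after a split)
    mean: float = 0.0    # running mean of the elements in this range

    def add(self, x: float) -> None:
        # Incremental update of the count and the running mean.
        self.count += 1
        self.mean += (x - self.mean) / self.count


class StreamGrid:
    """Hypothetical 1-D grid whose dense cells split as the stream arrives."""

    def __init__(self, lo, hi, n_initial, split_support, unit_width, warmup=1):
        width = (hi - lo) / n_initial
        # Mutually exclusive, equal-size initial cells covering [lo, hi).
        self.cells = [Cell(lo + i * width, lo + (i + 1) * width)
                      for i in range(n_initial)]
        self.split_support = split_support  # support needed to split a cell
        self.unit_width = unit_width        # no child may be narrower than this
        self.warmup = warmup                # elements required before any split
        self.total = 0                      # elements accepted so far

    def insert(self, x: float) -> None:
        for i, cell in enumerate(self.cells):
            if cell.lo <= x < cell.hi:
                self.total += 1
                cell.add(x)
                if self._can_split(cell):
                    self.cells[i:i + 1] = self._split(cell)
                return
        # Elements outside the data domain are ignored in this sketch.

    def _can_split(self, cell: Cell) -> bool:
        return (self.total >= self.warmup
                and cell.count / self.total >= self.split_support
                and cell.hi - cell.lo >= 2 * self.unit_width)

    def _split(self, cell: Cell) -> list:
        # Assumption: cut at the running mean, clamped so that both
        # children stay at least unit_width wide.
        cut = min(max(cell.mean, cell.lo + self.unit_width),
                  cell.hi - self.unit_width)
        half = cell.count / 2  # symmetric distribution: half the mass per side
        return [Cell(cell.lo, cut, half, (cell.lo + cut) / 2),
                Cell(cut, cell.hi, half, (cut + cell.hi) / 2)]

    def dense_cells(self, min_support: float) -> list:
        # Cells whose support (count / total) reaches min_support.
        if self.total == 0:
            return []
        return [c for c in self.cells if c.count / self.total >= min_support]


grid = StreamGrid(0.0, 8.0, n_initial=2, split_support=0.5,
                  unit_width=1.0, warmup=4)
for value in [1.0, 1.5, 1.0, 1.5]:
    grid.insert(value)
for cell in grid.cells:
    print(cell)
```

Because a split only replaces one cell with two adjacent children, the cells always remain mutually exclusive and cover the whole domain, and the summed counts stay equal to the number of accepted elements.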

Cite

Park, N. H., & Lee, W. S. (2003). Statistical σ-partition clustering over data streams. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2838, pp. 387–398). Springer Verlag. https://doi.org/10.1007/978-3-540-39804-2_35
