Abstract
Text clustering is an overlooked area of text mining that deserves more attention. Several applications require automatic text organisation, which relies on an information retrieval system based on organised search results. Spherical k-means is a successful adaptation of the classic k-means algorithm for text clustering. However, conventional methods for accelerating k-means may not apply to spherical k-means because of the different nature of text document data. This work introduces an iterative feature filtering technique that reduces the data size during clustering and thereby produces more feature-relevant clusters in less time than classic spherical k-means. The novelty of the proposed method is that feature assessment is distinct from the objective function of clustering and is derived from the cluster structure. Experimental results on popular text corpora show that the proposed scheme speeds up computation without sacrificing cluster quality. The results are satisfactory and outperform recent work in this domain.
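The idea of interleaving spherical k-means with feature filtering can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not state the scoring criterion, so the rule used here (variance of each feature's centroid weight across clusters) is an assumption chosen only because it is derived from the cluster structure rather than from the clustering objective. The function name `filtered_spherical_kmeans` and the parameters `init`, `keep_ratio`, and `min_features` are likewise hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def filtered_spherical_kmeans(docs, init, n_iter=5, keep_ratio=0.8, min_features=2):
    """Illustrative sketch: spherical k-means that prunes features after each pass.

    docs: equal-length term-weight vectors (for example TF-IDF rows).
    init: indices of the documents used as initial centroids.
    Returns (labels, active); active lists the surviving feature indices.
    """
    k = len(init)
    active = list(range(len(docs[0])))
    centroids = [normalize(docs[i]) for i in init]
    labels = [0] * len(docs)
    for _ in range(n_iter):
        # Restrict documents to the active features and project onto the unit sphere.
        X = [normalize([d[j] for j in active]) for d in docs]
        # Assignment step: pick the centroid with the highest cosine similarity.
        labels = [max(range(k), key=lambda c: dot(x, centroids[c])) for x in X]
        # Update step: normalised sum of members; empty clusters keep their centroid.
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:
                centroids[c] = normalize([sum(col) for col in zip(*members)])
        # Filtering step (ASSUMED rule, not taken from the paper): keep the
        # features whose centroid weights vary most across clusters.
        n_keep = max(min_features, int(keep_ratio * len(active)))
        if n_keep >= len(active):
            continue
        scores = []
        for p in range(len(active)):
            weights = [cen[p] for cen in centroids]
            mean = sum(weights) / k
            scores.append(sum((w - mean) ** 2 for w in weights) / k)
        ranked = sorted(range(len(active)), key=lambda p: scores[p], reverse=True)
        keep = sorted(ranked[:n_keep])
        active = [active[p] for p in keep]
        centroids = [normalize([cen[p] for p in keep]) for cen in centroids]
    return labels, active
```

Calling `filtered_spherical_kmeans(docs, init=[0, 3])` on a small term-weight matrix returns the cluster labels together with the indices of the features that survived filtering. Because each pass works on fewer columns, later iterations do less work per document, which is the kind of saving the abstract describes.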
Cite
Sharma, I., Sharma, A., Chaturvedi, R., Rajpurohit, J., & Kumar, M. (2025). SKIFF: Spherical K-means with iterative feature filtering for text document clustering. Journal of Information Science, 51(5), 1204–1216. https://doi.org/10.1177/01655515231165230