This study addresses the clustering of high-dimensional text data. K-means handles high-dimensional data poorly, requires the number of clusters to be specified in advance, and selects its initial centers at random. We propose a Stacked-Random Projection dimensionality reduction framework and an enhanced K-means algorithm, DPC-K-means, based on an improved density peaks algorithm. The improved density peaks algorithm determines both the number of clusters and the initial cluster centers for K-means. The proposed algorithm is validated on seven text datasets. Experimental results show that it is well suited to clustering text data because it corrects these defects of K-means.
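The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved density peaks algorithm, so the sketch assumes the classic density-peak scores (local density rho and distance delta to the nearest denser point) and a simple "largest ratio gap in rho * delta" rule for choosing the number of centers. The Gaussian projection matrices, the function names (`stacked_random_projection`, `density_peak_centers`, `kmeans`, `dpc_kmeans`), and all parameters (`dims`, `dc_percentile`, `max_k`) are likewise hypothetical.

```python
import numpy as np


def stacked_random_projection(X, dims, seed=0):
    """Apply Gaussian random projections one after another (assumed form)."""
    rng = np.random.default_rng(seed)
    Z = np.asarray(X, dtype=float)
    for d in dims:
        # Entries ~ N(0, 1/d), so squared norms are preserved in expectation.
        R = rng.normal(scale=1.0 / np.sqrt(d), size=(Z.shape[1], d))
        Z = Z @ R
    return Z


def density_peak_centers(Z, dc_percentile=2.0, max_k=10):
    """Return indices of candidate centers from classic density-peak scores.

    Assumption: the number of centers is taken at the largest ratio gap in
    the sorted gamma = rho * delta values. Needs at least two points.
    """
    n = Z.shape[0]
    # Full pairwise distance matrix; O(n^2 * d) memory, fine for a sketch.
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    dc = np.percentile(D[np.triu_indices(n, k=1)], dc_percentile)
    if dc <= 0:
        raise ValueError("cutoff distance is zero (too many duplicate points)")
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0  # subtract the self term
    order = np.argsort(-rho)  # densest point first
    delta = np.empty(n)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()  # nearest point of higher density
    gamma = rho * delta
    top = np.argsort(-gamma)[: max_k + 1]
    ratios = gamma[top[:-1]] / (gamma[top[1:]] + 1e-12)
    k = int(np.argmax(ratios)) + 1
    return top[:k]


def kmeans(Z, init_centers, n_iter=100):
    """Plain Lloyd iterations started from the given centers."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers


def dpc_kmeans(X, dims=(64, 32), seed=0):
    """Reduce dimensionality, pick centers by density peaks, then run K-means."""
    Z = stacked_random_projection(X, dims, seed)
    idx = density_peak_centers(Z)
    return kmeans(Z, Z[idx])
```

Calling `dpc_kmeans(X)` on a dense document-term matrix `X` returns the cluster labels and the centers in the reduced space. Neither the number of clusters nor a random initialization is supplied by the caller, which is the point the abstract makes about correcting K-means.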
CITATION
Sun, Y., & Platoš, J. (2020). High-Dimensional Text Clustering by Dimensionality Reduction and Improved Density Peak. Wireless Communications and Mobile Computing, 2020. https://doi.org/10.1155/2020/8881112