Many clustering algorithms have been developed, but most cannot process objects described in a hybrid numerical/nominal attribute space or objects with missing values. Most also require the number of clusters to be specified manually, and their results are sensitive to the order in which the objects are input. These shortcomings limit the applicability of clustering and reduce the quality of its results. To address these problems, an improved clustering algorithm based on rough set (RS) theory and entropy theory is presented. It avoids the need to prespecify the number of clusters, and it clusters objects in both numerical and nominal attribute spaces by using a similarity measure in place of a distance index. RS theory also enables the algorithm to deal with vagueness and uncertainty in data analysis. Shannon's entropy is used to refine the clustering results by assigning relative weights to the attributes according to their mutual entropy values. A novel measure of clustering quality is also presented to evaluate the clusters. The algorithm is analyzed and then applied to cluster a data set of an industrial product. The experimental results confirm that the efficiency and clustering quality of the algorithm are improved. © 2006 Elsevier Ireland Ltd. All rights reserved.
Chen, C. B., & Wang, L. Y. (2006). Rough Set-Based Clustering with Refinement Using Shannon’s Entropy Theory. Computers and Mathematics with Applications, 52(10–11), 1563–1576. https://doi.org/10.1016/j.camwa.2006.03.033
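The abstract does not give the formulas, so the following is only a minimal sketch of two ideas it names: a similarity measure that works across numerical and nominal attributes while tolerating missing values, and attribute weights derived from Shannon's entropy. Everything here is an illustrative assumption rather than the authors' method: the function names, the `MISSING` marker, the range-normalized numerical similarity, and the rule that a weight is proportional to an attribute's entropy (the paper refers to mutual entropy values, which may be computed differently).

```python
import math
from collections import Counter

MISSING = None  # hypothetical marker for a missing attribute value


def shannon_entropy(values):
    """Shannon entropy (in bits) of the observed, non-missing values.

    Values are treated as categories, so a numerical attribute would
    need to be discretized before calling this.
    """
    observed = [v for v in values if v is not MISSING]
    if not observed:
        return 0.0
    counts = Counter(observed)
    total = len(observed)
    # sum of p * log2(1/p) over the observed categories
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_weights(columns):
    """Turn per-attribute entropies into weights that sum to 1.

    Assumption: weight is proportional to entropy. Falls back to
    uniform weights when every attribute has zero entropy.
    """
    entropies = [shannon_entropy(col) for col in columns]
    total = sum(entropies)
    if total == 0:
        return [1.0 / len(columns)] * len(columns)
    return [h / total for h in entropies]


def mixed_similarity(a, b, kinds, ranges, weights):
    """Weighted similarity in [0, 1] over attributes present in both objects."""
    num = 0.0
    den = 0.0
    for x, y, kind, rng, w in zip(a, b, kinds, ranges, weights):
        if x is MISSING or y is MISSING:
            continue  # skip attributes that either object lacks
        if kind == "numerical":
            # range-normalized closeness; a zero range means all values agree
            s = 1.0 - abs(x - y) / rng if rng else 1.0
        else:
            # nominal attribute: exact match or no match
            s = 1.0 if x == y else 0.0
        num += w * s
        den += w
    return num / den if den else 0.0


if __name__ == "__main__":
    kinds = ("numerical", "nominal", "nominal")
    ranges = (4.0, None, None)  # value range of each numerical attribute
    weights = (0.5, 0.25, 0.25)
    a = (1.0, "red", "A")
    b = (3.0, "red", MISSING)
    print(mixed_similarity(a, b, kinds, ranges, weights))
```

Because the similarity is renormalized over the attributes that both objects actually have, an object with missing values can still be compared with every other object instead of being discarded. A similarity threshold, rather than a preset cluster count, could then decide which objects are grouped together, which is one plausible reading of how the number of clusters is left unspecified.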