Comparative study of K-means and fuzzy C-means algorithms on the breast cancer data

Ashutosh Kumar Dubey; Umesh Gupta; Sonal Jain

Journal Article, Open Access

International Journal on Advanced Science, Engineering and Information Technology (2018) 8(1) 18-29

DOI: 10.18517/ijaseit.8.1.3490

69 Citations

129 Readers

Abstract

Breast cancer is one of the most common cancers worldwide. Research on early detection continues because the likelihood of a cure is very high at an early stage. This work had two main objectives: first, to compare the performance of the k-means and fuzzy c-means (FCM) clustering algorithms; and second, to examine, from multiple points of view, how combinations of different computational measures for k-means and FCM might achieve better clustering accuracy. Both algorithms were applied to understand the impact of clustering on breast cancer data. The k-means algorithm was executed with varying centroid, distance, split method, threshold, epoch, attributes, and number of iterations, while FCM was executed with varying fuzziness value and termination condition. The Breast Cancer Wisconsin (BCW) dataset was used for the experiments and the comparison. For k-means, the combination of variance and same centroid gave the better outcome. The highest and lowest clustering accuracies were (94.7%, 77.1%) for foggy centroid and (94.4%, 88.5%) for random centroid. The overall average positive prediction accuracy obtained by this approach was approximately 92%. For FCM, the highest and lowest clustering accuracies were (97.2%, 91.1%), (97.2%, 90.9%), (97.8%, 90.4%), and (97.1%, 90.2%) for different combinations of fuzziness and termination criteria. The average highest and lowest clustering accuracies for the same combinations were (95.7%, 94.7%), (95.9%, 93.6%), (95.3%, 94.2%), and (95.6%, 93.7%). K-means performed better and more consistently in terms of computation time, as FCM required more time to carry out its fuzzy calculations and iterations.
The findings of this work provide a detailed understanding of the computational parameters used with the k-means and FCM algorithms. The computational results indicate that FCM was more prominent and consistent than k-means when executed with different iterations, fuzziness values, and termination criteria. FCM is therefore potentially more capable of clustering the BCW dataset, since clustering accuracy is more important than computation time.
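
The two FCM controls named in the abstract, the fuzziness value and the termination condition, can be illustrated with a minimal sketch of the standard fuzzy c-means update. This is not the authors' implementation, which the abstract does not describe. The function names, the default values (`m=2.0`, `eps=1e-5`), the majority-label definition of clustering accuracy, and the six toy points are all assumptions made for illustration; the points are not taken from the BCW dataset.

```python
import math
import random


def fuzzy_c_means(points, c=2, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Standard FCM on a list of equal-length numeric tuples.

    m is the fuzziness value (> 1); eps is the termination threshold on the
    largest change in any membership between two iterations.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Each center is the mean of all points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / wsum
                for d in range(dim)
            ))
        # New memberships depend on the relative distance to every center.
        new_u = []
        for p in points:
            dist = [max(math.dist(p, v), 1e-12) for v in centers]
            new_u.append([
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(c))
                for j in range(c)
            ])
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < eps:  # termination condition reached
            break
    return centers, u


def clustering_accuracy(memberships, labels):
    """Assumed accuracy measure: assign each point to its highest-membership
    cluster, map each cluster to its majority label, count the matches."""
    hard = [row.index(max(row)) for row in memberships]
    correct = 0
    for j in set(hard):
        members = [labels[i] for i in range(len(labels)) if hard[i] == j]
        correct += max(members.count(lab) for lab in set(members))
    return correct / len(labels)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
          (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
labels = ["benign"] * 3 + ["malignant"] * 3

centers, u = fuzzy_c_means(points, c=2, m=2.0, eps=1e-5)
accuracy = clustering_accuracy(u, labels)
print(f"clustering accuracy on toy data: {accuracy:.3f}")
```

Raising `m` makes the memberships softer, and tightening `eps` generally means more iterations before the loop stops, which is consistent with the abstract's remark that FCM needs more computation time than k-means. The sketch uses only the standard library to stay self-contained; a real experiment on several hundred records would more likely use an array library.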

Cite

Dubey, A. K., Gupta, U., & Jain, S. (2018). Comparative study of K-means and fuzzy C-means algorithms on the breast cancer data. International Journal on Advanced Science, Engineering and Information Technology, 8(1), 18–29. https://doi.org/10.18517/ijaseit.8.1.3490
