Optimizing Clustering Algorithms for Anti-Microbial Evaluation Data: A Majority Score-Based Evaluation of K-Means, Gaussian Mixture Model, and Multivariate T-Distribution Mixtures

This article is free to access.

Abstract

This study analyzes the performance of the majority score clustering algorithm on three anti-microbial evaluation datasets: the minimum inhibitory concentration (MIC) of bacteria, and the antibacterial and antifungal activity of chemical compounds against 4 bacteria (E. coli, P. aeruginosa, S. aureus, S. pyogenes) and 2 fungi (C. albicans, A. fumigatus). Clustering is an unsupervised machine learning method, used here to group chemical compounds by similarity. We apply k-means clustering, the Gaussian mixture model (GMM), and mixtures of multivariate t-distributions to these datasets. To determine the optimal number of clusters and the best-performing clustering algorithm, we use several clustering validation indices (CVIs), including the within-cluster sum of squares (to be minimized), connectivity (to be minimized), silhouette width (to be maximized), and the Dunn index (to be maximized). Based on the majority score clustering algorithm, we conclude that k-means and the mixture of multivariate t-distributions perform best on the CVIs to be maximized, while GMM performs best on the CVIs to be minimized. K-means and the mixture of multivariate t-distributions yield 3 optimal clusters for the antibacterial activity dataset and 5 optimal clusters for the MIC bacteria dataset. K-means, the mixture of multivariate t-distributions, and GMM all yield 3 optimal clusters for both the antibacterial and antifungal activity datasets. Overall, k-means performs best according to the majority score clustering algorithm. These results may be useful to the pharmaceutical industry, chemists, and medical professionals.
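The abstract does not spell out how the majority score is computed, so the following is only a minimal sketch of one plausible reading: each CVI casts one vote for the candidate it ranks best, and the candidate with the most votes wins. The function name `majority_score`, the short CVI keys, the candidate labels, and all numeric values are hypothetical placeholders for illustration and are not taken from the paper.

```python
from collections import Counter

# Direction of each clustering validation index (CVI), as stated in the abstract.
# The short key names are assumptions made for this sketch.
CVI_DIRECTION = {
    "wss": "min",           # within-cluster sum of squares
    "connectivity": "min",
    "silhouette": "max",    # silhouette width
    "dunn": "max",          # Dunn index
}


def majority_score(cvi_table):
    """Return (winner, votes), where each CVI votes for the candidate it ranks best.

    cvi_table maps a candidate label, e.g. ("kmeans", 3), to a dict of CVI values.
    Ties, within a CVI or in the final count, go to the candidate listed first.
    """
    candidates = list(cvi_table)
    votes = Counter()
    for cvi, direction in CVI_DIRECTION.items():
        pick = min if direction == "min" else max
        best = pick(candidates, key=lambda c: cvi_table[c][cvi])
        votes[best] += 1
    winner = max(candidates, key=lambda c: votes[c])
    return winner, dict(votes)


# Placeholder CVI values, chosen arbitrarily to exercise the function.
example_table = {
    ("kmeans", 3): {"wss": 12.0, "connectivity": 9.0, "silhouette": 0.61, "dunn": 0.30},
    ("gmm", 3): {"wss": 11.0, "connectivity": 10.0, "silhouette": 0.55, "dunn": 0.22},
    ("tmix", 3): {"wss": 13.0, "connectivity": 11.0, "silhouette": 0.58, "dunn": 0.25},
}

winner, votes = majority_score(example_table)
print(winner, votes)
```

In practice the CVI values would come from fitting each algorithm at each candidate number of clusters and evaluating the indices on the resulting partitions. The authors' actual scoring rule may differ from this one-vote-per-index scheme.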

Cite

Mahmood, H., Mehmood, T., & Al-Essa, L. A. (2023). Optimizing Clustering Algorithms for Anti-Microbial Evaluation Data: A Majority Score-Based Evaluation of K-Means, Gaussian Mixture Model, and Multivariate T-Distribution Mixtures. IEEE Access, 11, 79793–79800. https://doi.org/10.1109/ACCESS.2023.3288344
