Abstract
Gaussian mixture models (GMMs) are widely used for modelling stochastic problems. Indeed, a wide diversity of packages have been developed in R. However, no recent review describing the main features offered by these packages and comparing their performances has been performed. In this article, we first introduce GMMs and the EM algorithm used to retrieve the parameters of the model and analyse the main features implemented among seven of the most widely used R packages. We then empirically compare their statistical and computational performances in relation with the choice of the initialisation algorithm and the complexity of the mixture. We demonstrate that the best estimation with well-separated components or with a small number of components with distinguishable modes is obtained with REBMIX initialisation, implemented in the rebmix package, while the best estimation with highly overlapping components is obtained with k-means or random initialisation. Importantly, we show that implementation details in the EM algorithm yield differences in the parameters’ estimation. Especially, packages mixtools (Young et al. 2020) and Rmixmod (Langrognet et al. 2021) estimate the parameters of the mixture with smaller bias, while the RMSE and variability of the estimates is smaller with packages bgmm (Ewa Szczurek 2021), EMCluster (W.-C. Chen and Maitra 2022), GMKMcharlie (Liu 2021), flexmix (Gruen and Leisch 2022) and mclust (Fraley, Raftery, and Scrucca 2022). The comparison of these packages provides R users with useful recommendations for improving the computational and statistical performance of their clustering and for identifying common deficiencies. Additionally, we propose several improvements in the development of a future, unified mixture model package.
Cite
CITATION STYLE
Chassagnol, B., Bichat, A., Boudjeniba, C., Wuillemin, P. H., Guedj, M., Gohel, D., … Becht, E. (2023). Gaussian Mixture Models in R. R Journal, 15(2), 56–76. https://doi.org/10.32614/RJ-2023-043
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.