Criteria for mixture-model clustering with side-information

Abstract

The estimation of mixture models is a well-known approach to cluster analysis, and several criteria have been proposed to select the number of clusters. In this paper, we consider mixture models using side-information, which imposes the constraint that some data in a group originate from the same source. In this setting the usual criteria are not suitable. An EM (Expectation-Maximization) algorithm has previously been developed to jointly estimate the model parameters and the data labelling for a given number of clusters. In this work we adapt three usual criteria, namely the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the normalized entropy criterion (NEC), so that they take the side-information into account. One simulated problem and two real data sets are used to show the relevance of the modified criteria and to compare them. The effectiveness of both the EM algorithm and the criteria, in selecting the right number of clusters while producing a good clustering, depends on the amount of side-information. Since side-information is mainly useful when the clusters overlap, the best criterion is the modified BIC.
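
For context, the three criteria can be stated in their standard (i.i.d.) form as follows. This is a minimal sketch using the usual definitions, where ln L_K is the maximized log-likelihood for K components, nu_K the number of free parameters, n the sample size, and t_ik the posterior membership probability of point i in component k; the paper's modified versions replace the i.i.d. likelihood with a side-information-aware one in which all points sharing a source label are constrained to the same component.

    BIC(K) = -2 \ln L_K + \nu_K \ln n
    AIC(K) = -2 \ln L_K + 2 \nu_K
    NEC(K) = \frac{E(K)}{\ln L_K - \ln L_1}, \quad E(K) = -\sum_{i=1}^{n} \sum_{k=1}^{K} t_{ik} \ln t_{ik}

With these sign conventions, each criterion is minimized over candidate values of K. For the unconstrained case, the selection procedure can be sketched with scikit-learn, which provides the standard (not the modified, side-information-aware) BIC and AIC; the toy data here is purely illustrative:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # Toy data: two overlapping 2-D Gaussian clusters (illustrative only).
    X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
                   rng.normal(2.0, 1.0, size=(100, 2))])

    # Fit a mixture for each candidate K and record the criterion values;
    # the selected K is the one that minimizes the criterion.
    scores = {}
    for k in range(1, 6):
        gm = GaussianMixture(n_components=k, random_state=0).fit(X)
        scores[k] = (gm.bic(X), gm.aic(X))

    best_k_bic = min(scores, key=lambda k: scores[k][0])
    print("K selected by BIC:", best_k_bic)

The modified criteria of the paper would follow the same selection loop, but with the log-likelihood computed under the grouping constraint via the adapted EM algorithm.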

Citation (APA)

Grall-Maës, E., & Dao, D. T. (2017). Criteria for mixture-model clustering with side-information. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10163 LNCS, pp. 43–59). Springer Verlag. https://doi.org/10.1007/978-3-319-53375-9_3
