Gaussian clusters and noise: An approach based on the minimum description length principle

1Citations
Citations of this article
2Readers
Mendeley users who have this article in their library.
Get full text

Abstract

We introduce a well-grounded minimum description length (MDL) based quality measure for a clustering consisting of either spherical or axis-aligned normally distributed clusters and a cluster with a uniform distribution in an axis-aligned rectangular box. The uniform component extends the practical usability of the model e.g. in the presence of noise, and using the MDL principle for the model selection makes comparing the quality of clusterings with a different number of clusters possible. We also introduce a novel search heuristic for finding the best clustering with an unknown number of clusters. The heuristic is based on the idea of moving points from the Gaussian clusters to the uniform one and using MDL for determining the optimal amount of noise. Tests with synthetic data having a clear cluster structure imply that the search method is effective in finding the intuitively correct clustering. © 2010 Springer-Verlag.

Cite

CITATION STYLE

APA

Luosto, P., Kivinen, J., & Mannila, H. (2010). Gaussian clusters and noise: An approach based on the minimum description length principle. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6332 LNAI, pp. 251–265). https://doi.org/10.1007/978-3-642-16184-1_18

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free