Model-based clustering of high-dimensional data: A review

  • Bouveyron C
  • Brunet C

Abstract

Model-based clustering is a popular tool renowned for its probabilistic foundations and its flexibility. However, high-dimensional data are increasingly common and, unfortunately, classical model-based clustering techniques perform poorly in high-dimensional spaces. This is mainly because model-based clustering methods are dramatically over-parametrized in this setting. Nevertheless, high-dimensional spaces have specific characteristics that are useful for clustering, and recent techniques exploit those characteristics. After recalling the basics of model-based clustering, this review covers dimension reduction approaches, regularization-based techniques, parsimonious modeling, subspace clustering methods, and clustering methods based on variable selection. Existing software for model-based clustering of high-dimensional data is also reviewed, and its practical use is illustrated on real-world data sets.
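
The over-parametrization mentioned in the abstract can be made concrete by counting free parameters. The sketch below is an illustration and is not taken from the article: it assumes a Gaussian mixture with K components in p dimensions, and the function name `gmm_free_parameters` and the choice K = 4 are hypothetical. With unconstrained covariance matrices the count grows quadratically in p, whereas a diagonal constraint (one simple form of parsimonious modeling) grows only linearly.

```python
def gmm_free_parameters(n_components: int, dim: int, covariance: str = "full") -> int:
    """Count the free parameters of a Gaussian mixture model.

    n_components is K (number of clusters), dim is p (number of variables).
    """
    proportions = n_components - 1      # mixing proportions sum to 1
    means = n_components * dim          # one mean vector per component
    if covariance == "full":
        # one symmetric p x p covariance matrix per component
        cov = n_components * dim * (dim + 1) // 2
    elif covariance == "diagonal":
        # one variance per variable and per component
        cov = n_components * dim
    elif covariance == "spherical":
        # a single variance per component
        cov = n_components
    else:
        raise ValueError(f"unknown covariance structure: {covariance}")
    return proportions + means + cov


if __name__ == "__main__":
    # Illustrative setting only: K = 4 components, increasing dimension p.
    for p in (2, 10, 100, 1000):
        full = gmm_free_parameters(4, p, "full")
        diag = gmm_free_parameters(4, p, "diagonal")
        print(f"p={p:5d}  full={full:9d}  diagonal={diag:6d}")
```

For K = 4 and p = 100, the unconstrained model already has 20603 free parameters against 803 for the diagonal one, so a sample of a few hundred observations cannot estimate the former reliably.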

Author-supplied keywords

  • dimension reduction
  • high-dimensional data
  • model-based clustering
  • parsimonious models
  • R package
  • regularization
  • software
  • subspace clustering
  • variable selection
