Subspace metric ensembles for semi-supervised clustering of high dimensional data

Abstract

A critical problem in clustering research is the definition of a proper metric to measure distances between points. Semi-supervised clustering uses information provided by the user, usually expressed as constraints, to guide the search for clusters. Learning effective metrics from constraints in high-dimensional spaces remains an open challenge, because the number of parameters to be estimated is quadratic in the number of dimensions and there is seldom enough side-information to estimate them accurately. In this paper, we address the high-dimensionality problem by learning an ensemble of subspace metrics. This is achieved by projecting the data and the constraints onto multiple subspaces and learning a positive semi-definite similarity matrix in each. This methodology makes it possible to leverage the given side-information while solving lower-dimensional problems. Experiments on high-dimensional data (e.g., microarray data) demonstrate that our method achieves higher accuracy than competing approaches. © Springer-Verlag Berlin Heidelberg 2006.
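The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how subspaces are chosen, how each metric is learned, or how the ensemble is combined. The sketch assumes random feature subsets as subspaces, a simple stand-in metric learner (the inverse of the regularized covariance of must-link difference vectors), and plain averaging of the per-subspace distances. All function and parameter names (`n_metric_params`, `learn_subspace_metric`, `ensemble_distances`, `n_subspaces`, `subspace_dim`, `reg`) are hypothetical.

```python
import numpy as np

def n_metric_params(d):
    """Free parameters in a symmetric d x d metric matrix (quadratic in d)."""
    return d * (d + 1) // 2

def learn_subspace_metric(X_sub, must_link, reg=1e-3):
    """Assumed stand-in learner, not the paper's method: invert the
    regularized covariance of must-link difference vectors."""
    diffs = np.array([X_sub[i] - X_sub[j] for i, j in must_link])
    k = X_sub.shape[1]
    cov = diffs.T @ diffs / len(diffs) + reg * np.eye(k)
    # cov is symmetric positive definite, so its inverse is as well
    return np.linalg.inv(cov)

def ensemble_distances(X, must_link, n_subspaces=10, subspace_dim=5, seed=0):
    """Average Mahalanobis-style distances over several random subspaces."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    D = np.zeros((n, n))
    for _ in range(n_subspaces):
        # Assumption: a subspace is a random subset of the original features
        feats = rng.choice(d, size=subspace_dim, replace=False)
        X_sub = X[:, feats]
        A = learn_subspace_metric(X_sub, must_link)
        # Pairwise squared distances (x_i - x_j)^T A (x_i - x_j)
        diff = X_sub[:, None, :] - X_sub[None, :, :]
        sq = np.einsum("ijk,kl,ijl->ij", diff, A, diff)
        D += np.sqrt(np.maximum(sq, 0.0))
    # Assumption: the ensemble is combined by simple averaging
    return D / n_subspaces

# Toy data: 20 points in 50 dimensions with three must-link pairs
X = np.random.default_rng(1).normal(size=(20, 50))
must_link = [(0, 1), (2, 3), (4, 5)]
D = ensemble_distances(X, must_link)
```

With these settings, each subspace metric has `n_metric_params(5)` = 15 free parameters, whereas a single metric over all 50 features would have `n_metric_params(50)` = 1275. This is the motivation the abstract gives for working in lower-dimensional subspaces when constraints are scarce. The resulting matrix `D` could be passed to any distance-based clustering routine.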

Cite

Yan, B., & Domeniconi, C. (2006). Subspace metric ensembles for semi-supervised clustering of high dimensional data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4212 LNAI, pp. 509–520). Springer Verlag. https://doi.org/10.1007/11871842_48
