Semi-supervised clustering using heterogeneous dissimilarities

Abstract

The performance of many clustering algorithms, such as k-means, depends strongly on the dissimilarity used to evaluate sample proximities. Choosing a good dissimilarity is difficult because each dissimilarity reflects different features of the data. Therefore, different dissimilarities should be integrated in order to reflect more accurately what is similar for the user and the problem at hand. In many applications, user feedback or a priori knowledge about the problem provides pairs of similar and dissimilar examples. This side information may be used to learn a distance metric and to improve the clustering results. In this paper, we address the problem of learning a linear combination of dissimilarities using side information in the form of equivalence constraints. The minimization of the error function is based on a quadratic optimization algorithm. A smoothing term is included that penalizes the complexity of the family of distances and avoids overfitting. The experimental results suggest that the proposed method outperforms a standard metric learning algorithm and improves classical k-means clustering based on a single dissimilarity. © 2010 Springer-Verlag Berlin Heidelberg.
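The idea described in the abstract can be sketched in code. The snippet below is an illustrative approximation, not the paper's exact formulation: it learns nonnegative weights for a linear combination of several dissimilarity matrices so that must-link pairs get small combined distances and cannot-link pairs get large ones, with an L2 smoothing term penalizing the complexity of the weight vector. The data, the margin value, and the projected-gradient solver are all assumptions for the sake of the example (the paper uses a quadratic optimization algorithm).

```python
import numpy as np

# Illustrative sketch: learn nonnegative weights w for a combined
# dissimilarity D(w) = sum_k w_k * D_k using equivalence constraints.
rng = np.random.default_rng(0)
n, K = 20, 3  # number of samples and of base dissimilarities (assumed)

# Three synthetic dissimilarity matrices (symmetric, zero diagonal).
Ds = []
for _ in range(K):
    A = rng.random((n, n))
    D = (A + A.T) / 2
    np.fill_diagonal(D, 0.0)
    Ds.append(D)

# Side information: similar (must-link) and dissimilar (cannot-link) pairs.
similar = [(0, 1), (2, 3), (4, 5)]
dissimilar = [(0, 10), (1, 15), (3, 12)]

def loss(w, lam=0.1, margin=1.0):
    """Push similar pairs toward zero combined distance and dissimilar
    pairs beyond a margin; lam * ||w||^2 is the smoothing term."""
    Dw = sum(wk * Dk for wk, Dk in zip(w, Ds))
    l_sim = sum(Dw[i, j] ** 2 for i, j in similar)
    l_dis = sum(max(0.0, margin - Dw[i, j]) ** 2 for i, j in dissimilar)
    return l_sim + l_dis + lam * np.dot(w, w)

# Projected gradient descent (numerical gradient) onto w >= 0;
# the paper instead solves a quadratic program.
w = np.full(K, 1.0 / K)
eps, lr = 1e-6, 0.05
for _ in range(200):
    g = np.array([(loss(w + eps * np.eye(K)[k]) - loss(w)) / eps
                  for k in range(K)])
    w = np.maximum(w - lr * g, 0.0)

print("learned weights:", w)
```

The learned weights can then be used to build a single combined dissimilarity matrix for k-means (e.g. via kernelized or medoid-based variants that accept precomputed distances).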

APA

Martín-Merino, M. (2010). Semi-supervised clustering using heterogeneous dissimilarities. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6218 LNCS, pp. 375–384). https://doi.org/10.1007/978-3-642-14980-1_36
