Semi-supervised fuzzy c-means clustering of biological data

M. Ceccarelli; A. Maratea

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 3849 LNAI 259-266

DOI: 10.1007/11676935_32


Abstract

Semi-supervised methods use a small amount of labeled data as a guide to unsupervised techniques. Recent literature shows that these methods outperform fully unsupervised ones even with a small amount of side-information. This suggests that semi-supervised methods may be especially useful in very difficult and noisy tasks where little a priori information is available, as is the case for the classification of biological datasets. The two most frequently used paradigms for including side-information in clustering are Constrained Clustering and Metric Learning. In this paper we use a Metric Learning approach as a preliminary step to fuzzy clustering and we show that Semi-Supervised Fuzzy Clustering (SSFC) can be an effective tool for classification of biological datasets. We used three real biological datasets and a generalized version of the Partition Entropy index to validate our results. In all cases tested, the metric learning step highlighted the datasets' clustering structure more clearly. © Springer-Verlag Berlin Heidelberg 2006.
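The pipeline the abstract describes (a metric-learning step followed by fuzzy clustering, scored with a partition-entropy index) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the metric-learning algorithm, so the matrix `L` below is a hypothetical placeholder for a learned linear transform, and `partition_entropy` is the classical partition entropy rather than the generalized version used in the paper. The function names, parameters, and toy data are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means with Euclidean distance (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Point-to-center distances, floored to avoid division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def partition_entropy(U):
    """Classical partition entropy: lower values mean a crisper partition."""
    n = U.shape[0]
    return float(-(U * np.log(np.fmax(U, 1e-12))).sum() / n)

# Toy data (assumption): two well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
               rng.normal(10.0, 0.5, (20, 2))])

# Hypothetical placeholder for a learned metric: the identity transform.
# A real metric-learning step would estimate L from the labeled side-information.
L = np.eye(2)
X_transformed = X @ L.T

centers, U = fuzzy_c_means(X_transformed, c=2)
labels = U.argmax(axis=1)
pe = partition_entropy(U)
```

Clustering in the transformed space `X @ L.T` is equivalent to clustering with a Mahalanobis-type distance defined by `L.T @ L`, which is one common way a learned metric can be plugged in ahead of an otherwise unchanged clustering routine.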


Ceccarelli, M., & Maratea, A. (2006). Semi-supervised fuzzy c-means clustering of biological data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3849 LNAI, pp. 259–266). https://doi.org/10.1007/11676935_32
