The three steps of clustering in the post-genomic era: A synopsis

R. Giancarlo; G. Lo Bosco; L. Pinello; F. Utro

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6685 LNBI 13-30

DOI: 10.1007/978-3-642-21946-7_2

17 Citations

12 Readers

Abstract

Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three-step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention if inferences made from cluster analysis are to be of some relevance to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particularly difficult. This paper highlights new findings that seem to address a few relevant problems in each of the three mentioned steps, with regard both to the intrinsic predictive power of methods and algorithms and to their time performance. Inclusion of this latter aspect in the evaluation process is quite novel, since it is hardly considered in genomic data analysis. © 2011 Springer-Verlag Berlin Heidelberg.
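
The three-step process named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: Euclidean distance for step (a), single-linkage agglomerative clustering for step (b), average silhouette width for step (c), and the toy data and function names. The paper may discuss quite different choices for each step.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Step (a): a distance function (Euclidean is an illustrative choice)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage(points, k, dist):
    """Step (b): merge the two closest clusters until k remain.

    Single linkage is an illustrative choice, not the paper's method.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Pick the pair of clusters with the smallest inter-point distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(
                dist(points[i], points[j])
                for i in clusters[ab[0]]
                for j in clusters[ab[1]]
            ),
        )
        clusters[a].extend(clusters[b])  # a < b, so deleting b is safe
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def mean_silhouette(points, labels, dist):
    """Step (c): internal validation by average silhouette width.

    Assumes at least two clusters; singleton clusters score 0 by convention.
    """
    n = len(points)
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, points[j]) for j in range(n)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to own cluster
        b = min(                 # mean distance to nearest other cluster
            sum(dist(p, points[j]) for j in range(n) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels = single_linkage(points, k=2, dist=euclidean)
score = mean_silhouette(points, labels, euclidean)
print(labels, round(score, 3))
```

Keeping the distance function as a parameter of both the clustering and the validation routines reflects the separation of the three choices: each step can be swapped independently and the resulting partitions compared, both for quality and for running time.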

Cite (APA)

Giancarlo, R., Lo Bosco, G., Pinello, L., & Utro, F. (2011). The three steps of clustering in the post-genomic era: A synopsis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6685 LNBI, pp. 13–30). https://doi.org/10.1007/978-3-642-21946-7_2
