Computational cluster validation in post-genomic data analysis

Abstract

Motivation: The discovery of novel biological knowledge from the ab initio analysis of post-genomic data relies upon the use of unsupervised processing methods, in particular clustering techniques. Much recent research in bioinformatics has therefore been focused on the transfer of clustering methods introduced in other scientific fields and on the development of novel algorithms specifically designed to tackle the challenges posed by post-genomic data. The partitions returned by a clustering algorithm are commonly validated using visual inspection and concordance with prior biological knowledge - whether the clusters actually correspond to the real structure in the data is somewhat less frequently considered. Suitable computational cluster validation techniques are available in the general data-mining literature, but have been given only a fraction of the same attention in bioinformatics.

Results: This review paper aims to familiarize the reader with the battery of techniques available for the validation of clustering results, with a particular focus on their application to post-genomic data analysis. Synthetic and real biological datasets are used to demonstrate the benefits, and also some of the perils, of analytical cluster validation.

© The Author 2005. Published by Oxford University Press. All rights reserved.

Cite

Handl, J., Knowles, J., & Kell, D. B. (2005, August 1). Computational cluster validation in post-genomic data analysis. Bioinformatics. Oxford University Press. https://doi.org/10.1093/bioinformatics/bti517
