Accounting for noise when clustering biological data

Roman Sloutsky; Nicolas Jimenez; S. Joshua Swamidass; Kristen M. Naegle

Journal Article (Open Access)

Briefings in Bioinformatics (2013) 14(4) 423-436

DOI: 10.1093/bib/bbs057

Abstract

Clustering is a powerful and commonly used technique that organizes and elucidates the structure of biological data. Clustering data from gene expression, metabolomics and proteomics experiments has proven useful for deriving a variety of insights, such as the shared regulation or function of biochemical components within networks. However, experimental measurements of biological processes are subject to substantial noise, stemming from both technical and biological variability, and most clustering algorithms are sensitive to this noise. In this article, we explore several methods of accounting for noise when analyzing biological data sets through clustering. Using a toy data set and two different case studies (gene expression and protein phosphorylation), we demonstrate the sensitivity of clustering algorithms to noise. Several methods of accounting for this noise can be used to establish when clustering results can be trusted. These methods span a range of assumptions about the statistical properties of the noise and can therefore be applied to virtually any biological data source. © The Author 2012. Published by Oxford University Press.

Citation (APA)

Sloutsky, R., Jimenez, N., Swamidass, S. J., & Naegle, K. M. (2013). Accounting for noise when clustering biological data. Briefings in Bioinformatics, 14(4), 423–436. https://doi.org/10.1093/bib/bbs057