VSClust: Feature-based variance-sensitive clustering of omics data

Abstract

Motivation: Data clustering is indispensable for identifying biologically relevant molecular features in large-scale omics experiments with thousands of measurements across multiple conditions. Optimal clustering results yield groups of functionally related features, which may include genes, proteins and metabolites in biological processes and molecular networks. Omics experiments typically include replicated measurements of each feature within a given condition to statistically assess feature-specific variation. Current clustering approaches ignore this variation by averaging over the replicates, which often leads to incorrect cluster assignments.

Results: We present VSClust, a clustering method that accounts for feature-specific variance. Based on an algorithm derived from fuzzy clustering, VSClust unifies statistical testing with pattern recognition to cluster the data into feature groups that more accurately reflect the underlying molecular and functional behavior. We apply VSClust to artificial and experimental datasets comprising hundreds to more than 80 000 features across 6–20 different conditions, including genomics, transcriptomics, proteomics and metabolomics experiments. VSClust avoids arbitrary averaging methods, outperforms standard fuzzy c-means clustering and simplifies the data analysis workflow in large-scale omics studies.
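
The following is a minimal sketch, not the VSClust implementation, of why averaging replicates can mislead a clustering step. The abstract does not say how VSClust incorporates variance, so the mapping from replicate spread to the fuzzifier (`hypothetical_fuzzifier`), the example data and the centroids are all assumptions made for illustration. The only established part is the standard fuzzy c-means membership formula.

```python
import math
from statistics import mean, stdev


def replicate_summary(replicates):
    """Return the per-condition means and the average per-condition
    standard deviation for one feature (one list of replicates per condition)."""
    means = [mean(cond) for cond in replicates]
    avg_sd = mean(stdev(cond) for cond in replicates)
    return means, avg_sd


def fcm_memberships(profile, centroids, fuzzifier):
    """Standard fuzzy c-means membership of one profile in each cluster:
    u_k is proportional to d_k ** (-2 / (fuzzifier - 1))."""
    dists = [math.dist(profile, c) for c in centroids]
    if any(d == 0.0 for d in dists):
        # The profile coincides with at least one centroid.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (fuzzifier - 1.0)
    weights = [d ** -exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def hypothetical_fuzzifier(avg_sd):
    """ASSUMPTION: an illustrative mapping in which noisier features get a
    larger fuzzifier. This is not taken from the source."""
    return 1.5 + avg_sd


# Two made-up features with the same mean profile but different replicate spread.
tight = [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 2.9, 3.1]]
noisy = [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]

# Made-up cluster centroids (an increasing and a decreasing pattern).
centroids = [[0.5, 2.0, 3.5], [3.0, 2.0, 1.0]]

tight_means, tight_sd = replicate_summary(tight)
noisy_means, noisy_sd = replicate_summary(noisy)

# After averaging, both features look alike, so a fixed fuzzifier treats them the same.
u_fixed_tight = fcm_memberships(tight_means, centroids, 2.0)
u_fixed_noisy = fcm_memberships(noisy_means, centroids, 2.0)

# With a spread-dependent fuzzifier, the noisy feature gets a less confident assignment.
u_tight = fcm_memberships(tight_means, centroids, hypothetical_fuzzifier(tight_sd))
u_noisy = fcm_memberships(noisy_means, centroids, hypothetical_fuzzifier(noisy_sd))

print("fixed fuzzifier:", u_fixed_tight, u_fixed_noisy)
print("spread-dependent fuzzifier:", u_tight, u_noisy)
```

In this toy setup the two features receive essentially identical memberships once their replicates are averaged, even though one is far less reliable. Letting the fuzzifier grow with the replicate spread lowers the maximum membership of the noisy feature, which is one simple way a clustering step could reflect measurement uncertainty.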

Cite

Schwämmle, V., & Jensen, O. N. (2018). VSClust: Feature-based variance-sensitive clustering of omics data. Bioinformatics, 34(17), 2965–2972. https://doi.org/10.1093/bioinformatics/bty224
