Finding biologically accurate clusterings in hierarchical tree decompositions using the variation of information

Saket Navlakha; James White; Niranjan Nagarajan; Mihai Pop; Carl Kingsford

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5541 LNBI, pp. 400-417 (2009)

DOI: 10.1007/978-3-642-02008-7_29

Abstract

Hierarchical clustering is a popular method for grouping together similar elements based on a distance measure between them. In many cases, annotations for some elements are known beforehand, which can aid the clustering process. We present a novel approach for decomposing a hierarchical clustering into the clusters that optimally match a set of known annotations, as measured by the variation of information metric. Our approach is general and does not require the user to enter the number of clusters desired. We apply it to two biological domains: finding protein complexes within protein interaction networks and identifying species within metagenomic DNA samples. For these two applications, we test the quality of our clusters by using them to predict complex and species membership, respectively. We find that our approach generally outperforms the commonly used heuristic methods. © Springer-Verlag Berlin Heidelberg 2009.
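As a point of reference for the metric named in the abstract, the variation of information between two clusterings A and B of the same elements is commonly defined as VI(A, B) = H(A|B) + H(B|A), which is zero when the two clusterings coincide up to relabeling. The following is a minimal Python sketch of that definition only; it is not the authors' tree-decomposition procedure, and the function name, the flat label-list input format, and the use of natural logarithms are assumptions made for illustration.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Variation of information between two flat clusterings.

    Each argument is a sequence of cluster labels, one per element, with
    both sequences ordered over the same elements (an assumed input format).
    Uses natural logarithms, so the result is in nats.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n == 0:
        return 0.0

    count_a = Counter(labels_a)               # cluster sizes in A
    count_b = Counter(labels_b)               # cluster sizes in B
    joint = Counter(zip(labels_a, labels_b))  # overlap sizes between A and B

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # Each overlap contributes to H(A|B) + H(B|A):
        # -p_ab * [log(p_ab / p_b) + log(p_ab / p_a)]
        vi -= p_ab * (log(n_ab / count_b[b]) + log(n_ab / count_a[a]))
    return vi


if __name__ == "__main__":
    known = ["c1", "c1", "c2", "c2"]   # hypothetical known annotations
    candidate = [0, 0, 1, 1]           # hypothetical clusters cut from a tree
    print(variation_of_information(known, candidate))
```

In a setting like the one the abstract describes, a score of this kind could be used to compare candidate clusterings obtained from a hierarchy against the known annotations, with lower values indicating closer agreement.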

Citation (APA)

Navlakha, S., White, J., Nagarajan, N., Pop, M., & Kingsford, C. (2009). Finding biologically accurate clusterings in hierarchical tree decompositions using the variation of information. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5541 LNBI, pp. 400–417). https://doi.org/10.1007/978-3-642-02008-7_29
