Integrative clustering of high-dimensional data with joint and individual clusters

Kristoffer H. Hellton; Magne Thoresen

Journal Article, Open Access


Biostatistics (2016) 17(3) 537-548

DOI: 10.1093/biostatistics/kxw005


Abstract

When a range of genomic, epigenomic, and transcriptomic variables is measured for the same tissue sample, an integrative approach to analysis can strengthen inference and lead to new insights. This is also the case when clustering patient samples, and several integrative clustering procedures have been proposed. Common to these methodologies is the restriction to a joint cluster structure, equal in all data layers. We instead present a clustering extension of the Joint and Individual Variance Explained algorithm (JIVE), Joint and Individual Clustering (JIC), enabling the construction of both joint and data type-specific clusters simultaneously. The procedure builds on the connection between k-means clustering and principal component analysis, and hence the number of clusters can be determined by the number of relevant principal components. The proposed procedure is compared with iCluster, a method restricted to joint clusters only, and simulations show that JIC is clearly advantageous when both individual and joint clusters are present. The procedure is illustrated using gene expression and miRNA levels measured in breast cancer tissue from The Cancer Genome Atlas. The analysis suggests a division into three joint clusters common to both data types and two expression-specific clusters.
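The connection between k-means clustering and principal component analysis mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the JIC or JIVE procedure; it only shows the simplest case, assuming two well-separated clusters in a single data layer, where the sign of the first principal component score recovers the cluster membership. All function names, the simulated data, and the parameter values (cluster sizes, number of features, mean shift) are hypothetical choices made for illustration.

```python
import random


def simulate(n_per=20, p=50, n_signal=10, shift=2.0, seed=1):
    """Simulate two clusters (hypothetical data): rows are samples, columns are features.

    The clusters differ by a mean shift of +/- `shift` in the first
    `n_signal` features; all other features are pure noise.
    """
    rng = random.Random(seed)
    X, labels = [], []
    for label, sign in ((0, 1.0), (1, -1.0)):
        for _ in range(n_per):
            row = [rng.gauss(0.0, 1.0) + (sign * shift if j < n_signal else 0.0)
                   for j in range(p)]
            X.append(row)
            labels.append(label)
    return X, labels


def first_pc_scores(X, n_iter=200, seed=0):
    """Return the first principal component scores, up to sign and scale.

    Columns are centered, then power iteration is run on the n x n Gram
    matrix; its leading eigenvector is proportional to the first PC scores.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    G = [[sum(a * b for a, b in zip(Xc[i], Xc[k])) for k in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_iter):
        v = [sum(G[i][k] * u[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        u = [x / norm for x in v]
    return u


def agreement(pred, truth):
    """Fraction of samples assigned consistently, allowing for label swapping."""
    matches = sum(a == b for a, b in zip(pred, truth))
    return max(matches, len(truth) - matches) / len(truth)


X, truth = simulate()
scores = first_pc_scores(X)
# Two clusters: assign each sample by the sign of its first PC score.
pred = [0 if s > 0 else 1 for s in scores]
acc = agreement(pred, truth)
print(f"agreement with simulated clusters: {acc:.2f}")
```

With more than two clusters, more principal components would be needed, which is the sense in which the number of relevant components relates to the number of clusters. How JIC applies this idea to the joint and individual components of several data layers is described in the article itself and is not reproduced here.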


Citation (APA)

Hellton, K. H., & Thoresen, M. (2016). Integrative clustering of high-dimensional data with joint and individual clusters. Biostatistics, 17(3), 537–548. https://doi.org/10.1093/biostatistics/kxw005
