Integrative clustering of high-dimensional data with joint and individual clusters, with an application to the Metabric study

Abstract

When measuring a range of different genomic, epigenomic, transcriptomic and other variables, an integrative approach to analysis can strengthen inference and give new insights. This is also the case when clustering patient samples, and several integrative cluster procedures have been proposed. Common for these methodologies is the restriction of a joint cluster structure, which is equal for all data layers. We instead present Joint and Individual Clustering (JIC), which estimates both joint and data type-specific clusters simultaneously, as an extension of the JIVE algorithm (Lock et. al, 2013). The method is compared to iCluster, another integrative clustering method, and simulations show that JIC is clearly advantageous when both individual and joint clusters are present. The method is used to cluster patients in the Metabric study, integrating gene expression data and copy number aberrations (CNA). The analysis suggests a division into three joint clusters common for both data types and seven independent clusters specific for CNA. Both the joint and CNA-specific clusters are significantly different with respect to survival, also when adjusting for age and treatment.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…