Abstract
Any knowledge discovery task could in principle benefit from the fusion of directly or even indirectly related data sources. In this paper we explore whether data fusion by simultaneous matrix factorization can be adapted for survival regression. We propose a new method that jointly infers latent data factors from a number of heterogeneous data sets and estimates the regression coefficients of a survival model. We applied the method to the CAMDA 2014 large-scale Cancer Genomes Challenge and modeled survival time as a function of gene, protein and miRNA expression data, together with data on methylated and mutated regions. We find that both the joint inference of data factors and regression coefficients and the data fusion procedure are crucial for performance. Our approach is substantially more accurate than the baseline, Aalen's additive model. Latent factors inferred by our approach can be mined further; for the CAMDA challenge, we found that the most informative factors are related to known cancer processes.
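The idea of inferring shared latent factors and regression coefficients together can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a plain squared-error regression term in place of a survival model (so it ignores censoring), a single sample factor `U` shared by all data matrices, and alternating ridge-regularized least squares as the optimizer. The function names `joint_factorize` and `objective`, the weights `lam` and `reg`, and the synthetic data are all hypothetical.

```python
import numpy as np

def objective(Xs, y, U, Vs, b, lam, reg):
    # Reconstruction error over all data sets + regression error + ridge penalty.
    rec = sum(np.sum((X - U @ V.T) ** 2) for X, V in zip(Xs, Vs))
    fit = lam * np.sum((y - U @ b) ** 2)
    pen = reg * (np.sum(U ** 2) + sum(np.sum(V ** 2) for V in Vs) + np.sum(b ** 2))
    return rec + fit + pen

def joint_factorize(Xs, y, k=3, lam=1.0, reg=0.1, n_iter=50, seed=0):
    """Alternating least squares for X_i ~ U V_i^T and y ~ U b with shared U.

    Assumption: squared-error stand-in for a survival loss (no censoring).
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(y.shape[0], k))
    I = np.eye(k)
    history = []
    for _ in range(n_iter):
        G = U.T @ U
        # Per-data-set factors: ridge solution given the shared sample factor U.
        Vs = [np.linalg.solve(G + reg * I, U.T @ X).T for X in Xs]
        # Regression coefficients on the latent factors, given U.
        b = np.linalg.solve(lam * G + reg * I, lam * (U.T @ y))
        # Shared factor U: uses every data set and the response at once.
        A = sum(V.T @ V for V in Vs) + lam * np.outer(b, b) + reg * I
        B = sum(X @ V for X, V in zip(Xs, Vs)) + lam * np.outer(y, b)
        U = np.linalg.solve(A, B.T).T
        history.append(objective(Xs, y, U, Vs, b, lam, reg))
    return U, Vs, b, history

# Synthetic example: two data sets describing the same 40 samples.
rng = np.random.default_rng(1)
U_true = rng.normal(size=(40, 3))
X1 = U_true @ rng.normal(size=(3, 25)) + 0.1 * rng.normal(size=(40, 25))
X2 = U_true @ rng.normal(size=(3, 15)) + 0.1 * rng.normal(size=(40, 15))
y = U_true @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=40)

U, Vs, b, history = joint_factorize([X1, X2], y, k=3)
print("objective, first and last iteration:", history[0], history[-1])
```

Because each of the three updates exactly minimizes the objective over its own block, the recorded objective should not increase between iterations. A model closer to the one described above would presumably replace the squared-error term with a loss that accounts for censored survival times.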
Cite
Žitnik, M., & Zupan, B. (2015). Survival regression by data fusion. Systems Biomedicine, 2(3), 47–53. https://doi.org/10.1080/21628130.2015.1016702