Effectively integrating and mining multi-view, high-dimensional omics data is instrumental to precision medicine. Numerous methods have been proposed to address this problem. However, they largely neglect the challenges of the integration process (e.g., interpretability, stability and consistency), and consequently suffer from unstable or inconsistent variable selection and deteriorating prediction accuracy. In this paper, we introduce a novel Fusion Lasso (FL) framework in which variable selection and data integration are formulated as a weighted constrained optimization problem. Specifically, four regularization constraints, i.e., sparsity, fusion penalty, instability and inconsistency, are simultaneously taken into account in the fusion model using multi-view data, while sparse features are revealed from the data of each individual view through ℓ1-norm minimization. We use the alternating direction method of multipliers (ADMM) and accelerated ADMM (AADMM) schemes to solve this optimization problem, yielding scalable convergence with solid theoretical guarantees. By applying FL to five multi-omics cancer datasets collected by The Cancer Genome Atlas (TCGA), we demonstrate that FL outperforms popular variable selection and data integration approaches, such as Elastic Net, Precision Lasso, B-RAIL and MDBN, in cancer subtype and/or stage prediction. The proposed method can be further adapted to systems biology and other advanced clinical research areas where multi-view data integration is a necessity.
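To make the role of ADMM in ℓ1-norm minimization concrete, the following is a minimal sketch of ADMM applied to the standard single-view lasso problem, min 0.5·||Ax − b||² + λ·||x||₁. It is not the Fusion Lasso objective, which adds fusion, instability and inconsistency terms not specified in this abstract. The function names `soft_threshold` and `lasso_admm`, the penalty parameter `rho`, and the stopping rule are illustrative assumptions.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def lasso_admm(A, b, lam, rho=1.0, n_iter=1000, tol=1e-8):
    """Illustrative ADMM for the plain lasso (not the authors' FL model).

    Splits the problem as f(x) + g(z) subject to x = z, with
    f(x) = 0.5 * ||Ax - b||^2 and g(z) = lam * ||z||_1.
    """
    p = A.shape[1]
    x = np.zeros(p)
    z = np.zeros(p)
    u = np.zeros(p)  # scaled dual variable

    # The x-update solves the same linear system every iteration,
    # so factor (A^T A + rho * I) once up front.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(p))
    Atb = A.T @ b

    for _ in range(n_iter):
        # x-update: ridge-like least-squares step.
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))

        # z-update: soft-thresholding enforces sparsity.
        z_old = z
        z = soft_threshold(x + u, lam / rho)

        # Dual update: accumulate the constraint violation x - z.
        u = u + x - z

        # Stop when primal and dual residuals are both small.
        primal = np.linalg.norm(x - z)
        dual = np.linalg.norm(rho * (z - z_old))
        if primal < tol and dual < tol:
            break

    # z is exactly sparse because it comes out of the shrinkage step.
    return z
```

Calling `lasso_admm(A, b, lam)` with a design matrix `A` of shape (samples, features) and a response vector `b` returns a coefficient vector whose zero entries mark unselected variables. Larger values of `lam` select fewer variables.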
CITATION STYLE
Chen, Z., Edwards, A., & Zhang, K. (2020). Fusion Lasso and Its Applications to Cancer Subtype and Stage Prediction. In Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2020. Association for Computing Machinery, Inc. https://doi.org/10.1145/3388440.3412461