Data integration strategy for robust classification of biomedical data

Abstract

This paper presents a protocol for integrating the two most common types of biomedical data (clinical and molecular) to classify patients with cancer more effectively. In this protocol, the most informative features are identified with statistical and information-theory-based selection methods for molecular data and with the Boruta algorithm for clinical data. Predictive models are built with the Random Forest classification algorithm. Data integration consists of combining the most informative clinical features with synthetic features obtained from genetic marker models and using them together as input variables for the classifier. We applied this classification protocol to METABRIC breast cancer samples. Clinical data, gene expression data, and somatic copy number aberration data were used to predict the clinical endpoint. We tested various methods for combining information from the different datasets. Our research shows that both types of molecular data contain features that are relevant for clinical endpoint prediction. The best model was obtained by using ten clinical features and two synthetic features obtained from biomarker models. In the examined cases, the method used to filter molecular markers had only a small impact on the predictive power of the models, even though the resulting lists of top informative biomarkers diverge.
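The integration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are randomly generated toy arrays rather than METABRIC samples, the univariate F-test filter (`f_classif`) stands in for the paper's statistical and information-theory-based filters, the Boruta selection of clinical features is omitted (the toy clinical features are assumed to be already selected), and all variable and function names (`marker_model`, `synthetic_feature`, and so on) are hypothetical. Each molecular dataset gets its own filter plus Random Forest marker model, whose predicted class probability becomes one synthetic feature that is then combined with the clinical features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict, train_test_split
from sklearn.pipeline import make_pipeline

# Toy data (assumption): 400 patients, binary clinical endpoint.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)

clinical = rng.normal(size=(n, 5))        # assumed already-selected clinical features
clinical[:, 0] += 1.0 * y                 # one informative clinical feature
expression = rng.normal(size=(n, 200))    # stand-in for gene expression
expression[:, :5] += 0.8 * y[:, None]     # five informative genes
cna = rng.normal(size=(n, 100))           # stand-in for copy number aberrations
cna[:, :3] += 0.8 * y[:, None]            # three informative regions

idx_train, idx_test = train_test_split(
    np.arange(n), test_size=0.25, stratify=y, random_state=0
)


def marker_model(k=20):
    """Filter the top-k markers, then fit a Random Forest on them."""
    return make_pipeline(
        SelectKBest(f_classif, k=k),  # stand-in filter, not the paper's method
        RandomForestClassifier(n_estimators=100, random_state=0),
    )


def synthetic_feature(X):
    """Return (train, test) class-1 probabilities from one marker model.

    Training values are out-of-fold predictions, so the final classifier
    never sees a probability produced by a model trained on that same sample.
    """
    X_tr, X_te, y_tr = X[idx_train], X[idx_test], y[idx_train]
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    oof = cross_val_predict(marker_model(), X_tr, y_tr, cv=cv,
                            method="predict_proba")[:, 1]
    test = marker_model().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
    return oof, test


expr_tr, expr_te = synthetic_feature(expression)
cna_tr, cna_te = synthetic_feature(cna)

# Integration: clinical features plus one synthetic feature per molecular dataset.
X_train = np.column_stack([clinical[idx_train], expr_tr, cna_tr])
X_test = np.column_stack([clinical[idx_test], expr_te, cna_te])

final = RandomForestClassifier(n_estimators=200, random_state=0)
final.fit(X_train, y[idx_train])
auc = roc_auc_score(y[idx_test], final.predict_proba(X_test)[:, 1])
print(f"test AUC of integrated model: {auc:.3f}")
```

Using out-of-fold probabilities for the training rows is a design choice of this sketch, made to keep the synthetic features from leaking label information into the final model; the abstract does not say how the original protocol handles this.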

Cite

Polewko-Klim, A., & Rudnicki, W. R. (2020). Data integration strategy for robust classification of biomedical data. In Advances in Intelligent Systems and Computing (Vol. 1160 AISC, pp. 596–606). Springer. https://doi.org/10.1007/978-3-030-45691-7_56
