Data-Driven Supervised Learning for Life Science Data

Maximilian Münch; Christoph Raab; Michael Biehl; Frank Michael Schleif

Journal Article, Open Access

Frontiers in Applied Mathematics and Statistics (2020) 6

DOI: 10.3389/fams.2020.553000

Abstract

Life science data are often encoded in non-standard ways, for example as alphanumeric sequences, graph representations, numerical vectors of variable length, or other formats. Domain-specific or data-driven similarity measures, such as alignment functions, have been employed with great success. However, the vast majority of more complex data analysis algorithms require fixed-length vectorial input, which demands substantial preprocessing of life science data, and data-driven measures are widely ignored in favor of simple encodings. These preprocessing steps are neither always easy to perform nor particularly effective, and they risk a loss of information and interpretability. We present strategies and concepts for employing data-driven similarity measures in the life sciences and for other complex biological systems. In particular, we show how to use data-driven similarity measures effectively in standard learning algorithms.

Citation (APA)

Münch, M., Raab, C., Biehl, M., & Schleif, F. M. (2020). Data-Driven Supervised Learning for Life Science Data. Frontiers in Applied Mathematics and Statistics, 6. https://doi.org/10.3389/fams.2020.553000
