Protein sequence analysis by proximities

Frank Michael Schleif

Book Chapter

Humana Press Inc., (2016), 185-195

DOI: 10.1007/978-1-4939-3106-4_12

Abstract

Sequence data are widely used to gain deeper insight into biological systems. From a data analysis perspective, they are given as a set of symbol sequences of varying length and are generally compared using nonmetric score functions. In this form the data are nonstandard: they do not provide an immediate metric vector space, which complicates their analysis with standard methods. In this chapter we provide various strategies for analyzing this type of data in a mathematically accurate way, instead of the often-seen ad hoc solutions. Our approach is based on the scoring values from protein sequence data, although it is applicable in a broader sense. We discuss potential recoding concepts for the scores and algorithms for solving clustering, classification, and embedding tasks on score data in a protein sequence application.
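As a rough illustration of why such scores are called nonmetric, the sketch below recodes a small similarity-score matrix into dissimilarities and lists the triples that violate the triangle inequality. The score values are made up, the conversion d(i, j) = sqrt(s(i, i) + s(j, j) - 2 s(i, j)) is one common choice assumed here rather than taken from the chapter, and the function names are hypothetical.

```python
import math
from itertools import permutations


def scores_to_dissimilarities(scores):
    """Recode a symmetric similarity-score matrix as dissimilarities.

    Assumed conversion: d(i, j) = sqrt(s(i, i) + s(j, j) - 2 * s(i, j)).
    Negative values under the root are clamped to 0 (also an assumption).
    """
    n = len(scores)
    return [
        [
            math.sqrt(max(scores[i][i] + scores[j][j] - 2 * scores[i][j], 0.0))
            for j in range(n)
        ]
        for i in range(n)
    ]


def triangle_violations(dissim, tol=1e-12):
    """Return all triples (i, j, k) with d(i, k) > d(i, j) + d(j, k)."""
    n = len(dissim)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if dissim[i][k] > dissim[i][j] + dissim[j][k] + tol
    ]


# Hypothetical pairwise alignment scores for three sequences (made-up values).
toy_scores = [
    [10, 8, 1],
    [8, 10, 8],
    [1, 8, 10],
]

toy_dissim = scores_to_dissimilarities(toy_scores)
violations = triangle_violations(toy_dissim)
print(violations)
```

A non-empty list means the recoded scores cannot be treated as distances in a metric space without some further correction, which is the situation the abstract describes.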

Cite

Schleif, F. M. (2016). Protein sequence analysis by proximities. In Methods in Molecular Biology (Vol. 1362, pp. 185–195). Humana Press Inc. https://doi.org/10.1007/978-1-4939-3106-4_12
