Protein sequence analysis by proximities

Abstract

Sequence data are widely used to gain deeper insight into biological systems. From a data analysis perspective, they are given as a set of symbol sequences of varying length and are generally compared using nonmetric score functions. In this form the data are nonstandard: they do not provide an immediate metric vector space, which complicates their analysis with standard methods. In this chapter we provide various strategies for analyzing this type of data in a mathematically accurate way, instead of the ad hoc solutions that are often seen. Our approach is based on the scoring values from protein sequence data, although it is applicable in a broader sense. We discuss potential recoding concepts for the scores and algorithms that solve clustering, classification, and embedding tasks for score data in a protein sequence application.
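
To make the "nonmetric score" issue concrete, here is a minimal Python sketch of one common way to recode similarity scores as dissimilarities, followed by a triangle-inequality check. It is not taken from the chapter: the recoding d(i, j) = sqrt(s(i, i) + s(j, j) - 2 s(i, j)) is a standard textbook conversion, the function names are hypothetical, and the 3x3 score matrix is invented for illustration rather than derived from real protein alignments.

```python
from itertools import permutations
from math import sqrt


def scores_to_dissimilarities(S):
    """Recode a symmetric similarity matrix S (list of lists) as
    d(i, j) = sqrt(S[i][i] + S[j][j] - 2 * S[i][j]).

    If S were a positive semidefinite kernel, d would be a Euclidean
    distance. For general alignment-style scores this is not guaranteed.
    """
    n = len(S)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            squared = S[i][i] + S[j][j] - 2 * S[i][j]
            # Clamp at zero so sqrt is defined even for odd score matrices.
            D[i][j] = sqrt(max(squared, 0.0))
    return D


def triangle_violations(D, tol=1e-12):
    """Return all index triples (i, j, k) with d(i, j) > d(i, k) + d(k, j)."""
    n = len(D)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if D[i][j] > D[i][k] + D[k][j] + tol
    ]


# Invented toy scores (assumption): items 0 and 1 are both similar to
# item 2, yet strongly dissimilar to each other.
S_toy = [
    [10, -5, 9],
    [-5, 10, 9],
    [9, 9, 10],
]
D_toy = scores_to_dissimilarities(S_toy)
print(triangle_violations(D_toy))
```

In this toy case the direct dissimilarity between items 0 and 1 exceeds the detour through item 2, so the recoded values do not form a metric. Standard distance-based clustering or embedding methods would therefore need a further correction step or an algorithm designed for such proximities.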

Cite

Schleif, F. M. (2016). Protein sequence analysis by proximities. In Methods in Molecular Biology (Vol. 1362, pp. 185–195). Humana Press Inc. https://doi.org/10.1007/978-1-4939-3106-4_12