Information content of sets of biological sequences revisited

Abstract

To analyze the information contained in a pool of amino acid sequences, a first approach is to align the sequences, estimate the probability that each amino acid occurs within each column of the alignment, and combine these values through an “entropy” function whose minimum corresponds to the absence of information, that is, to the case where every amino acid is equally likely to occur. An alternative is to construct a distance tree of the sequences, derived from the alignment and based on sequence similarity, and to interpret the tree topology properly so as to model the evolutionary property of residue conservation. We introduce the concept of the “evolutionary content” of a tree of sequences, and demonstrate to what extent the more classical notion of “information content” of sequences approximates the new measure and in what manner tree topology contributes sharper information for the detection of protein binding sites.
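
The column-based approach described in the abstract can be illustrated with a minimal sketch. It assumes the common Shannon formulation, log2(20) minus the column entropy, which is zero when all twenty amino acids are equally likely; the abstract does not give the authors' exact function, so this is one plausible reading rather than their definition. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids (one-letter codes); gaps and other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column):
    """Information content in bits of one alignment column.

    Assumed formulation: log2(20) - H, where H is the Shannon entropy of
    the observed residue frequencies. The value is 0 for a uniform column
    and log2(20) for a fully conserved one.
    """
    residues = [r for r in column if r in AMINO_ACIDS]
    if not residues:
        return 0.0  # an all-gap column carries no information
    counts = Counter(residues)
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(sequences):
    """Per-column information content of a list of aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_information(col) for col in zip(*sequences)]


# Hypothetical toy alignment: two conserved columns, two variable ones.
toy_alignment = ["ACDA", "ACEA", "ACFC", "ACGC"]
profile = alignment_information(toy_alignment)
```

In this toy alignment the two conserved columns score highest, the column split between two residues scores lower, and the column with four distinct residues scores lowest. Frequencies are raw counts here; a real analysis would typically add pseudocounts or sequence weighting, which this sketch omits.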

Cite

Carbone, A., & Engelen, S. (2009). Information content of sets of biological sequences revisited. Natural Computing Series, (9783540888680), 31–42. https://doi.org/10.1007/978-3-540-88869-7_3