Log-odds sequence logos


This article is free to access.

Abstract

Motivation: DNA and protein patterns are usefully represented by sequence logos. However, the methods for logo generation in common use lack a proper statistical basis, and are non-optimal for recognizing functionally relevant alignment columns.

Results: We redefine the information at a logo position as a per-observation multiple alignment log-odds score. Such scores are positive or negative, depending on whether a column's observations are better explained as arising from relatedness or chance. Within this framework, we propose distinct normalized maximum likelihood and Bayesian measures of column information. We illustrate these measures on High Mobility Group B (HMGB) box proteins and a dataset of enzyme alignments. Particularly in the context of protein alignments, our measures improve the discrimination of biologically relevant positions.
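To make the idea of a signed, per-observation column score concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a single Dirichlet prior over letter frequencies for the "relatedness" model and independent draws from background frequencies for the "chance" model; the measures in the paper may differ (for example in the choice of prior or normalization). The function name and the parameters `background` and `pseudocounts` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def bayesian_log_odds_per_observation(column, background, pseudocounts):
    """Per-observation log-odds score (in bits) for one alignment column.

    Illustrative assumptions (not taken from the paper):
      - relatedness model: letters drawn from an unknown distribution
        with a single Dirichlet prior given by `pseudocounts`;
      - chance model: letters drawn independently from `background`.
    """
    counts = Counter(column)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("column must contain at least one observation")

    alpha_total = sum(pseudocounts.values())

    # Log marginal likelihood of the observed letters under the Dirichlet prior.
    log_related = math.lgamma(alpha_total) - math.lgamma(alpha_total + n)
    for letter, c in counts.items():
        a = pseudocounts[letter]
        log_related += math.lgamma(a + c) - math.lgamma(a)

    # Log likelihood of the same letters under the background (chance) model.
    log_chance = sum(c * math.log(background[letter]) for letter, c in counts.items())

    # Convert from nats to bits and divide by the number of observations.
    return (log_related - log_chance) / (n * math.log(2))


# Hypothetical DNA example: uniform background, one pseudocount per letter.
dna_background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
dna_pseudocounts = {"A": 1.0, "C": 1.0, "G": 1.0, "T": 1.0}

conserved_score = bayesian_log_odds_per_observation("AAAAAAAA", dna_background, dna_pseudocounts)
diverse_score = bayesian_log_odds_per_observation("ACGT", dna_background, dna_pseudocounts)
```

Under these assumptions the fully conserved column scores above zero and the column with one of each letter scores below zero, which matches the abstract's description of scores that are positive when relatedness explains the observations better than chance, and negative otherwise.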

Citation

Yu, Y. K., Capra, J. A., Stojmirović, A., Landsman, D., & Altschul, S. F. (2015). Log-odds sequence logos. Bioinformatics, 31(3), 324–331. https://doi.org/10.1093/bioinformatics/btu634
