Background: Conservation and variation scores are used to evaluate sites in a multiple sequence alignment in order to identify residues critical for structure or function. Many such scores are available, but it is not clear how they relate to one another.

Results: We applied 25 conservation and variation scores to alignments from the Catalytic Site Atlas (CSA). We calculated distances among the scores based on correlation coefficients and constructed a dendrogram of the scores by average linkage cluster analysis. The cluster analysis showed that most scores fall into one of two groups: a substitution-matrix-based group and a frequency-based group. We also evaluated how well the scores predict catalytic sites and found that frequency-based scores generally perform best.

Conclusions: Conservation and variation scores can be classified into two main groups. For predicting catalytic sites, frequency-based scores that also take a background distribution into account are the most successful.

© 2010 Johansson and Toh; licensee BioMed Central Ltd.
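As an illustration of the procedure summarized in the Results, the following Python sketch scores each column of a toy alignment with three simple frequency-based measures, converts pairwise Pearson correlations into distances, and merges the scores by average linkage. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy alignment, the background frequencies, the choice of the three scores, and the distance 1 - |r| are all assumptions made for the example, and the abstract does not specify how correlation was turned into a distance.

```python
import math
from collections import Counter

def shannon_entropy(column):
    """Variation score: Shannon entropy (bits) of the residues in one column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())

def relative_entropy(column, background):
    """Conservation score: relative entropy of the column vs. a background."""
    n = len(column)
    return sum((c / n) * math.log2((c / n) / background[a])
               for a, c in Counter(column).items())

def majority_fraction(column):
    """Conservation score: frequency of the most common residue."""
    return Counter(column).most_common(1)[0][1] / len(column)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    """Assumed distance: 1 - |r|, so anticorrelated scores count as close."""
    return 1.0 - abs(pearson(x, y))

def average_linkage(names, dist):
    """Agglomerative clustering; dist maps frozenset({a, b}) to a distance.
    Returns the list of merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                d = sum(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Toy alignment and background frequencies: hypothetical, for illustration only.
alignment = ["ACDAK", "ACEAK", "ACDGK", "ACKCE", "GCDAD"]
background = {"A": 0.4, "C": 0.1, "D": 0.2, "E": 0.1, "G": 0.1, "K": 0.1}
columns = ["".join(col) for col in zip(*alignment)]

scores = {
    "entropy": [shannon_entropy(c) for c in columns],
    "relative_entropy": [relative_entropy(c, background) for c in columns],
    "majority": [majority_fraction(c) for c in columns],
}
names = list(scores)
dist = {frozenset((a, b)): correlation_distance(scores[a], scores[b])
        for i, a in enumerate(names) for b in names[i + 1:]}
merges = average_linkage(names, dist)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

The merge list plays the role of the dendrogram: each entry records which two groups of scores were joined and at what average distance. With real data, each score vector would hold one value per alignment site pooled over many alignments rather than over five toy columns.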
CITATION STYLE
Johansson, F., & Toh, H. (2010). A comparative study of conservation and variation scores. BMC Bioinformatics, 11. https://doi.org/10.1186/1471-2105-11-388