That amino acid properties are responsible for the way protein molecules evolve is natural and is also reasonably well supported both by the structure of the genetic code and, to a large extent, by the experimental measures of the amino acid similarity. Nevertheless, there remains a significant gap between observed similarity matrices and their reconstructions from amino acid properties. Therefore, we introduce a simple theoretical model of amino acid similarity matrices, which allows splitting the matrix into two parts - one that depends only on mutabilities of amino acids and another that depends on pairwise similarities between them. Then the new synthetic amino acid properties are derived from the pairwise similarities and used to reconstruct similarity matrices covering a wide range of information entropies. Our model allows us to explain up to 94% of the variability in the BLOSUM family of the amino acids similarity matrices in terms of amino acid properties. The new properties derived from amino acid similarity matrices correlate highly with properties known to be important for molecular evolution such as hydrophobicity, size, shape and charge of amino acids. This result closes the gap in our understanding of the influence of amino acids on evolution at the molecular level. The methods were applied to the single family of similarity matrices used often in general sequence homology searches, but it is general and can be used also for more specific matrices. The new synthetic properties can be used in analyzes of protein sequences in various biological applications. © 2014 Rudnicki et al.
Rudnicki, W. R., Mroczek, T., & Cudek, P. (2014). Amino acid properties conserved in molecular evolution. PLoS ONE, 9(6). https://doi.org/10.1371/journal.pone.0098983