Methods for analysing correlated mutations in proteins are becoming an increasingly powerful tool for predicting contacts within and between proteins. Nevertheless, limitations remain due to the requirement for large multiple sequence alignments (MSAs) and the fact that, in general, only a relatively small number of top-ranking predictions are reliable. To date, methods for analysing correlated mutations have relied exclusively on amino acid MSAs as inputs. Here, we describe a new approach for analysing correlated mutations that is based on combined analysis of amino acid and codon MSAs. We show that a direct contact is more likely to be present when the correlation between the positions is strong at the amino acid level but weak at the codon level. The performance of different methods for analysing correlated mutations in predicting contacts is shown to be enhanced significantly when amino acid and codon data are combined.

Genes contain instructions to make proteins from building blocks called amino acids. The instructions are encoded in units called codons that each specify a single amino acid in the chain. A small mutation in a particular codon can change the amino acid found at the corresponding position in the protein. Some amino acids interact with other amino acids in the chain, thereby enabling the protein to adopt the three-dimensional shape it needs to work properly. Therefore, a mutation that affects one of these amino acids may have a large impact on the ability of the protein to work.

A mutation at one position in the protein may, however, have little effect if it is accompanied by a ‘compensatory’ mutation at another position. Such compensatory mutations are more likely to occur when the two positions in the protein are close to each other. To identify such mutations, the amino acid sequences of similar proteins from different organisms are aligned and compared.

A computational method called ‘correlated mutation analysis’ searches for pairs of positions in the alignment that display co-variation, i.e. where particular mutations at one position tend to be accompanied by certain mutations at the second position. These pairs are then ranked according to the strength of their correlation and those with the highest ranking are predicted to be in close contact. Such predictions are, however, far from perfect and can give false results.

Jacob et al. developed and tested a new technique of correlated mutation analysis by examining codon sequences as well as amino acid sequences. The rationale behind the technique relies on the fact that several different codons can encode the same amino acid, so that a mutation in a codon does not always change the amino acid it encodes. Therefore, a strong correlation at the amino acid level can be accompanied by a weak correlation at the codon level. In such cases the positions are more likely to be in contact than in cases where the correlation is also strong at the codon level, since the correlation can then be due to constraints at the DNA or RNA level.

Jacob et al. tested their approach using different methods for analysing correlated mutations that were proposed in previous studies. This showed that the predictions obtained using both amino acid and codon data are significantly more accurate than those obtained by comparing amino acid sequences only. Future work will test whether combining amino acid and codon data can also be used to predict interactions between different proteins.
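
The idea of scoring column pairs at both levels can be illustrated with a minimal Python sketch. It uses plain mutual information as the correlation measure, which is only one of several possible choices, and everything in it is an assumption made for illustration: the toy alignment, the partial codon table, the function names, and in particular the way the two scores are combined (amino acid score squared divided by codon score). It is not the scoring scheme of Jacob et al.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Toy subset of the standard genetic code (only the codons used below).
CODON_TO_AA = {
    "GCT": "A", "GCC": "A", "GGT": "G", "GGC": "G",
    "AAA": "K", "AAG": "K", "GAA": "E", "GAG": "E",
    "TCT": "S", "TCC": "S", "ACT": "T", "ACC": "T",
}


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a, freq_b = Counter(col_a), Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (freq_a[a] * freq_b[b]))
    return mi


def combined_scores(aa_alignment, codon_alignment):
    """Score every column pair at both levels and rank by a combined score."""
    aa_cols = list(zip(*aa_alignment))
    codon_cols = list(zip(*codon_alignment))
    results = []
    for i, j in combinations(range(len(aa_cols)), 2):
        aa_mi = mutual_information(aa_cols[i], aa_cols[j])
        codon_mi = mutual_information(codon_cols[i], codon_cols[j])
        # ASSUMPTION (hypothetical combination, not the published method):
        # reward a strong amino acid score, penalise extra codon-level coupling.
        combined = aa_mi * aa_mi / codon_mi if codon_mi > 0 else 0.0
        results.append(((i, j), aa_mi, codon_mi, combined))
    return sorted(results, key=lambda r: r[3], reverse=True)


# Hypothetical codon alignment: 8 sequences, 3 positions.
codon_msa = [
    ("GCT", "AAA", "TCT"), ("GCT", "AAG", "TCT"),
    ("GCC", "AAA", "TCC"), ("GCC", "AAG", "TCC"),
    ("GGT", "GAA", "ACT"), ("GGT", "GAG", "ACT"),
    ("GGC", "GAA", "ACC"), ("GGC", "GAG", "ACC"),
]
# The amino acid alignment is obtained by translating each codon.
aa_msa = [tuple(CODON_TO_AA[c] for c in seq) for seq in codon_msa]

ranked = combined_scores(aa_msa, codon_msa)
for pair, aa_mi, codon_mi, combined in ranked:
    print(pair, round(aa_mi, 3), round(codon_mi, 3), round(combined, 3))
```

In this toy alignment all three pairs are equally correlated at the amino acid level, but positions 0 and 2 also co-vary in their synonymous codon choice, so their codon-level score is higher and the pair drops to the bottom of the combined ranking. With real data the alignment would be far larger and a more robust correlation measure would normally replace raw mutual information.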
Jacob, E., Unger, R., & Horovitz, A. (2015). Codon-level information improves predictions of inter-residue contacts in proteins by correlated mutation analysis. eLife, 4. https://doi.org/10.7554/elife.08932