A likelihood-based index of protein-protein binding affinities with application to influenza HA escape from antibodies.
- PubMed: 17478433
Abstract
In many biological systems, proteins interact with other organic molecules to produce indispensable functions, in which molecular recognition phenomena are essential. Proteins have kept or gained their functions during molecular evolution. Their functions seem to be flexible, and a few amino acid substitutions sometimes cause drastic changes in function. In order to monitor and predict such drastic changes in the early stages in target populations, we need to identify patterns of structural changes during molecular evolution causing decreases or increases in the binding affinity of protein complexes. In previous work, we developed a likelihood-based index to quantify the degree to which a sequence fits a given structure. This index was named the sequence-structure fitness (SSF) and is calculated empirically based on amino acid preferences and pairwise interactions in the structural environment present in template structures. In the present work, we used the SSF to develop an index to measure the binding affinity of protein-protein complexes defined as the log likelihood ratio, contrasting the fitness of the sequences to the structure of the complex and that of the uncomplexed proteins. We applied the developed index to the complexes formed between influenza A hemagglutinin (HA) and four antibodies. The antibody-antigen binding region of HA is under strong selection pressure by the host immune system. Hence, examination of the long-term adaptation of HA to the four antibodies could reveal the strategy of the molecular evolution of HA. Two antibodies cover the HA receptor-binding region, while the other two bind away from the receptor-binding region. By focusing on branches with a significant decline in binding ability, we could detect key amino acid replacements and investigate the mechanism via conditional probabilities. 
The contrast between the adaptations to the two types of antibodies suggests that the virus adapts to the immune system at the cost of structural change.
Teruaki Watabe,* Hirohisa Kishino, Leonardo de Oliveira Martins, and Yasuhiro Kitazoe*
*Center of Medical Information Science, Kochi University, Kochi, Japan; and Laboratory of Biometrics, Graduate School of
Agriculture and Life Science, University of Tokyo, Tokyo, Japan
Introduction
Many biological functions are predominantly controlled by protein–protein interactions, and molecular recognition phenomena are essential to biological systems. These recognition phenomena involve the association of proteins with ligands or substrates. One of the most important molecular recognitions is that performed between lymphocytes and major histocompatibility complex (MHC) class I molecules. Natural killer (NK) cells, one of the lymphocyte classes, detect downregulation of MHC class I molecules by means of specific membrane receptors. A main category of these receptors is the killer cell immunoglobulin-like receptor (KIR) family. These KIR genes have evolved in primates to generate a diverse family of receptors with unique structures that enable them to recognize MHC class I molecules with locus and allele specificity (Vilches and Parham 2002). Their combinatorial expression creates a repertoire of NK cells that antagonize the spread of pathogens and tumors.
Proteins have evolved, keeping their functions or newly gaining other functions. For example, globins arose very early in evolution and are found in a wide range of organisms. The globins have maintained the ability to interact with each other to enable cooperative oxygen binding, and they show functional flexibility and realize oxygen binding in a number of ways. While vertebrate hemoglobins have evolved, conserving their ability to bind ligands and deliver oxygen molecules bound at the heme sites, many functionally important structural differences were found to exist among vertebrate hemoglobins (Naoi et al. 2001). Functional flexibility appears to be a distinctive feature of protein evolution.
A few amino acid substitutions sometimes cause drastic changes to the protein function or to its influence in a system. The severe acute respiratory syndrome (SARS) coronavirus is one of the best-studied viruses. It was caught in the act of adapting to humans, and the viral spike glycoprotein was identified as a major determinant of the species specificity of coronavirus infection. Only four amino acids in the receptor-binding domain differ between the human epidemic and the civet strains, but they cause more than a 1,000-fold difference in the binding affinity to the human angiotensin-converting enzyme 2, a specific receptor glycoprotein on the surface of host cells (Li et al. 2005). Adaptation of a virus to a homologous receptor of a new host species appears to require very few amino acid substitutions at the large receptor-binding interface (Holmes 2005).
Measuring the binding affinity of complexes formed between biological molecules is indispensable to monitoring and predicting adaptive evolution in a target population. Particularly in the arms race of the host–parasite system, binding affinity plays a crucial role. Watabe et al. (2006) developed a likelihood-based index, named sequence-structure fitness (SSF), which quantifies the degree to which a sequence fits a given structure. The SSF is calculated empirically based on amino acid preferences and pairwise interactions in the structural environment present in template structures. In the present study, we used the SSF to develop an index to measure the binding affinity of protein–protein complexes defined as the log likelihood ratio, contrasting the fitness of the sequences to the structure of the complex and that of the uncomplexed proteins. This index enables us to quantify the binding affinity between proteins. We applied the developed index to systems of complexes between virus proteins containing epitopes (influenza A hemagglutinin [HA]) and antibodies of the host immune system.
Key words: protein structure, affinity of complex, influenza virus,
antigenic drift, functional redundancy.
E-mail: twatabe-mi@umin.ac.jp.
Mol. Biol. Evol. 24(8):1627–1638. 2007
doi:10.1093/molbev/msm079
Advance Access publication May 2, 2007
The Author 2007. Published by Oxford University Press on behalf of
the Society for Molecular Biology and Evolution. All rights reserved.
For permissions, please e-mail: journals.permissions@oxfordjournals.org
HA from antibodies and analyzed the binding ability of the antibodies to HA at each node of the phylogenetic dendrogram of the HA sequence. On the one hand, we found that the binding ability to the HA surface of antibodies whose neutralization abilities are supplied by the indirect mechanism of blocking virus attachment decreased but was never completely lost. On the other hand, antibodies covering the receptor-binding region of HA by direct binding to the receptor binding sites completely lost their binding abilities along the HA evolutionary pathway. We also found that the binding ability of these antibodies to HA started to increase after a couple of years in an impotent stage, and this binding ability was restored to a magnitude comparable to the original, prompting HA to evade them again.
Materials and Methods
A Likelihood-based Index of the Binding Ability of
Protein–Protein Complexes
To trace long-term changes in interactions between proteins, measuring the affinity of the complexes for many sequences along the evolutionary tree is indispensable. High-throughput experiments to directly measure affinity are currently impractical. Therefore, we estimated affinity by using structural information for the templates and information on amino acid preferences and pairwise interactions in the structural environment in the Protein Data Bank (PDB).
The binding ability of the complexes formed between proteins A and B is measured as the inverse of the dissociation constant of the complex:

$$\frac{1}{K_d} = \frac{[A \cdot B]}{[A][B]},$$

where $[A]$, $[B]$, and $[A \cdot B]$ are the concentrations of proteins A and B and their complexes in equilibrium. To predict the ratio using information from the protein database, we interpret the ratio as reflecting the likelihood ratio

$$\frac{P(A \cdot B)}{P(A)\,P(B)}.$$
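As a small numerical illustration of the definition of $1/K_d$ above, the following sketch computes the association constant from equilibrium concentrations. The function name and the concentrations are invented for this example and are not taken from the paper.

```python
import math


def association_constant(conc_a, conc_b, conc_complex):
    """Return 1/Kd = [A.B] / ([A][B]) for equilibrium concentrations in mol/L."""
    if conc_a <= 0 or conc_b <= 0:
        raise ValueError("free concentrations must be positive")
    return conc_complex / (conc_a * conc_b)


# Hypothetical equilibrium concentrations (mol/L), chosen only for illustration.
k_assoc = association_constant(conc_a=1e-6, conc_b=1e-6, conc_complex=1e-4)
k_d = 1.0 / k_assoc

print(f"1/Kd = {k_assoc:.3g} per M")
print(f"Kd = {k_d:.3g} M")
print(f"-log10(Kd) = {-math.log10(k_d):.2f}")
```

A larger value of $1/K_d$ (equivalently, a smaller $K_d$) corresponds to tighter binding.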
The binding ability can be characterized by the amino acid sequences in the binding region and the structure of the complexes. Hence, the likelihood is explained by the likelihood of the amino acid sequences given the structure:

$$\frac{P(A \cdot B)}{P(A)\,P(B)} = \frac{P(\mathrm{seq}_A, \mathrm{seq}_B \mid \mathrm{str}_{A+B})\,P(\mathrm{str}_{A+B})}{P(\mathrm{seq}_A \mid \mathrm{str}_A)\,P(\mathrm{str}_A)\,P(\mathrm{seq}_B \mid \mathrm{str}_B)\,P(\mathrm{str}_B)}.$$
If the structures are stable during the evolution of the proteins, the likelihoods $P(\mathrm{str}_{A+B})$, $P(\mathrm{str}_A)$, and $P(\mathrm{str}_B)$ can be regarded as constant. Under this assumption, we defined the index of the affinity of the complex as the log likelihood ratio of the amino acid sequences in the binding region of the complex to those in the binding region of the uncomplexed proteins (fig. 1),

$$[\text{affinity}] \equiv \log \frac{P(\mathrm{seq}_A, \mathrm{seq}_B \mid \mathrm{str}_{A+B})}{P(\mathrm{seq}_A \mid \mathrm{str}_A)\,P(\mathrm{seq}_B \mid \mathrm{str}_B)}, \tag{1}$$

which would be interpreted as $-\log K_d$, up to the additive constant contributed by the structure terms.
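The affinity index is a difference of three log likelihoods, so tracking it along a phylogeny reduces to simple arithmetic once the sequence-structure log likelihoods are available. The sketch below illustrates this; the function name and all numerical values are hypothetical, and it assumes the three log likelihoods have already been computed by some other means.

```python
def affinity_index(loglik_complex, loglik_a_alone, loglik_b_alone):
    """Log likelihood ratio of the affinity index (Equation 1).

    loglik_complex : log P(seqA, seqB | strA+B)
    loglik_a_alone : log P(seqA | strA)
    loglik_b_alone : log P(seqB | strB)
    """
    return loglik_complex - loglik_a_alone - loglik_b_alone


# Hypothetical log likelihoods at an ancestral node and at a descendant node
# of an HA phylogeny; the antibody sequence (protein B) is held fixed.
ancestral = affinity_index(-120.0, -80.0, -45.0)
descendant = affinity_index(-126.0, -81.0, -45.0)

# A negative change along the branch indicates a decline in predicted binding.
change_on_branch = descendant - ancestral
print(ancestral, descendant, change_on_branch)
```

Because only differences between nodes matter when looking for branches with a decline in binding, any constant offset shared by all nodes cancels in `change_on_branch`.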
The sequence distribution of a structure $P(\mathrm{seq} \mid \mathrm{str})$ in Equation 1 expresses the fitness of the amino acids given the structural environment. To predict the protein structure of an amino acid sequence, Simons et al. (1999a, 1999b) proposed to maximize:

$$P(a_1, \ldots, a_n \mid X) \cong \prod_i P(a_i \mid X) \prod_{i<j} \frac{P(a_i, a_j \mid X)}{P(a_i \mid X)\,P(a_j \mid X)}. \tag{2}$$
Here, an amino acid sequence and the positions of the Cα atoms of amino acid residues are denoted by $A = (a_1, \ldots, a_n)$ and $X = (x_1, \ldots, x_n)$, respectively. The first term represents the amino acid preference and the second term is the pairwise interaction given the structure. Expecting that the fitness of an amino acid depends mostly on the local structure surrounding each of the amino acid residues, Simons et al. (1999a, 1999b) categorized local environments and calculated amino acid frequencies in the proteins registered in PDB for each of the categories (see below for more detail).
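To make the structure of Equation 2 concrete, the following sketch evaluates its logarithm for a toy three-residue sequence. It is an illustration under simplifying assumptions, not the procedure used in the paper: the probability tables are invented, they are indexed directly by position rather than by categorized local environment, and residue pairs missing from the pairwise table are treated as non-interacting (ratio of 1).

```python
import math
from itertools import combinations


def log_sequence_fitness(seq, site_prob, pair_prob):
    """Logarithm of Equation 2 for the sequence `seq`.

    site_prob[i][a]           ~ P(a_i = a | X)
    pair_prob[(i, j)][(a, b)] ~ P(a_i = a, a_j = b | X), with i < j
    Pairs absent from pair_prob contribute nothing (assumed independent).
    """
    # First product: single-site amino acid preferences.
    total = sum(math.log(site_prob[i][a]) for i, a in enumerate(seq))
    # Second product: pairwise interaction terms over i < j.
    for i, j in combinations(range(len(seq)), 2):
        if (i, j) in pair_prob:
            joint = pair_prob[(i, j)][(seq[i], seq[j])]
            independent = site_prob[i][seq[i]] * site_prob[j][seq[j]]
            total += math.log(joint / independent)
    return total


# Hypothetical tables: positions 0 and 1 are in contact and favour
# oppositely charged (K-D) or matching (E-L) pairs in this toy example.
site_prob = [
    {"K": 0.5, "E": 0.5},
    {"D": 0.5, "L": 0.5},
    {"G": 1.0},
]
pair_prob = {
    (0, 1): {("K", "D"): 0.4, ("K", "L"): 0.1,
             ("E", "D"): 0.1, ("E", "L"): 0.4},
}

favourable = log_sequence_fitness("KDG", site_prob, pair_prob)
unfavourable = log_sequence_fitness("EDG", site_prob, pair_prob)
print(favourable, unfavourable)
```

In this toy setting a single substitution at position 0 (K to E) lowers the score only through the pairwise term, since both residues have the same single-site preference.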
Watabe et al. (2006) obtained evidence that implies that the structure of the crown region of the HIV V3 loop varies more in the patient where the viral population experienced larger sequence evolution. They traced the value of
FIG. 1. (A) The complex formed between HA and fragment antigen binding (Fab) HC45, and (B) the uncomplexed individual proteins. In the analysis, the individual uncomplexed proteins were considered to be far from each other.