Directed Evolution of phiC31 Integrase Toward a Single Location at Human Chromosome Xq22.1
Portions of this study have been modified from a published abstract
Christopher L. Chavez, Jason J. Hoyt, and Michele P. Calos. Directed evolution of phage
phiC31 integrase toward a single locus on human chromosome Xq22.1
2008. Molecular Therapy 16:S123
In this study, I wrote up all results and performed all experiments except for qPCR, which was
performed with Christopher Chavez.
ABSTRACT
Site-specific gene targeting, without prior enzyme recognition site placement, is
a highly sought after genetic tool. Previously we showed that the serine recombinase,
phiC31, is capable of stably integrating a donor gene into several “hotspots” of the
human genome. These hotspots are genomic loci termed “pseudo attP sites,” which are
observed more than once and have partial homology with phiC31’s native recognition
site, attP. There were 19 hotspots found, which accounted for 56% of all integration
sites. Still, this leaves a large portion of the integration sites as rare occurrences among
an estimated 370 integration sites throughout the human genome. Decreasing the total
number of integration sites through directed evolution of phiC31 was the focus of this
study. We demonstrate that only three rounds of mutagenesis were required to engineer
phiC31 toward human chromosome Xq22.1 in extrachromosomal assays. This site was
the fourth most frequent integration site among the 19 hotspots previously found.
Furthermore, we provide evidence to suggest that the majority of this improvement was
the result of synonymous (silent) codon changes. By using the strategies reported here, we
present an engineering platform for evolving phiC31 toward integration sites in a
variety of species.
INTRODUCTION
Genetic engineering of recombinases for improved efficiency and site-
specificity has used a variety of approaches. Recombinases such as Cre and Flp, with
simple sequence target sites, loxP and FRT respectively, have served as good models
for how effective directed evolution can be. The drawback to engineering Cre or Flp is
that homologous target sequences do not natively exist in the human genome and
targets with symmetrical partial similarity are rare. Early efforts to alter specificity
towards native genomic sites in yeast involved relaxing the specificity of Cre and Flp
(1). This approach, however, has the potential to create off-target integration events. Cre
variants have been evolved using an iterative screening process of positive selection
followed by negative selection, but so far this has been limited to bacterial assays (2).
Both Cre and Flp normally require two 13 base pair (b.p.) symmetrical recognition sites
flanking an 8 b.p. target. Statistically, such targets are extremely unlikely in genomes of
most species. To address this, a heterologous mixture of Cre variants has been
engineered through directed evolution to target asymmetrical sites. Heterologous Cre
variants working in tandem have only been demonstrated in vitro thus far (3). More
recently, after 126 rounds of DNA shuffling, Buchholz and colleagues found a Cre
variant capable of excising an HIV-1 provirus in a HeLa cell line (4).
Another approach is to take advantage of the modular structure of recombinases
by substituting an enzyme’s native DNA binding domain with another. As has been
shown in the case of zinc finger nucleases, site-specificity can be altered resulting in
high frequencies of recombination (5). A major advantage of zinc finger nucleases is
that, in theory, they can be designed to target any site in the human genome. Such zinc
finger-recombinase chimeras have been demonstrated with the transposase Tn3 (6),
several invertases (7), and as shown in the previous chapter of this thesis, phiC31. In the
case of the transposase and invertase chimeras, a major drawback is that excision
readily occurs in these systems. To overcome possible back reactions, zinc finger-
invertase chimeras have been co-evolved and delivered as a heterologous mixture. This
then requires engineering and screening of not only the zinc finger domain, but of the
recombinase domain as well. While phiC31 does not excise its own integration products,
the chimeras developed so far are proof of principle only. Zinc finger DNA binding
domains will still need to be created and screened to recognize specific targets in the
human genome.
As an alternative to these approaches, we sought to develop a directed evolution
strategy utilizing phiC31’s wild-type ability to recognize target sites in the human
genome (8). The phiC31 serine-recombinase is capable of recombining two attachment
sites, attB and attP, in the Streptomyces genome. Sites with partial identity to attP,
termed pseudo attP, exist in the human genome and have been shown to be the
substrates for phiC31-mediated integration of donor plasmids bearing the wild-type attB
site (9). In a large integration profile study in several human cell lines, we previously
identified 196 integration events (10). Of those analyzed, 56% of the events were
recurrent integrations representing 19 “hotspots.” This was in contrast to a previous
study, which concluded that integration at human chromosome 8p22 occurred in 32/67
of all rescued events. Chalberg et al. demonstrated that this discrepancy arose because
the earlier study pooled colonies and used too few restriction enzymes in the plasmid
rescue technique to detect integration (10). Prior to the Chalberg study, an attempt at
evolving phiC31 to further recognize the chromosome 8p22 site was made (11). A DNA
shuffling technique was employed and mutants were screened in a bacterial assay.
While this led to mutants with improved specificity towards 8p22, it reduced the overall
frequency of integration in human cell lines to levels too low for practical use in gene
therapy. While reduced efficiency is expected in specificity mutants, it is now believed
that 8p22 may have additional chromosomal features that limit phiC31-mediated
integration, which make it an undesirable target (10).
We reason that engineering site-specific targeting would be faster than
alternative approaches by using an enzyme such as phiC31, which already preferentially
targets native genomic sites in a variety of species (8, 12, 13, 14, 15). With a greater
understanding of phiC31 pseudo sites in the human genome, we hypothesized that new
site-specific mutants could be developed, which did not have the efficiency problems
associated with the 8p22 phiC31 specificity integrase mutant. For this study, we used a
directed evolution strategy to mutagenize wild-type phiC31 and increase site-specificity
towards a target on chromosome Xq22.1. This locus was the fifth most frequent hotspot
and found to represent ~5% of all integration sites studied by Chalberg et al. Despite
being the fifth most frequent site, Xq22.1 was chosen because integration events were
intergenic and found in multiple cell lines. It is also 71% identical with attP, which was
the fourth highest among the 19 hotspots ranging in homology from 46-79%. We
describe mutagenesis with both degenerate oligos and error-prone PCR to develop
libraries for screening. From this strategy, we have isolated two phiC31 mutants with
improved site-specificity toward Xq22.1. We also demonstrate a loss in recombination
with the native attP target site, which indicates that these are not relaxed preference
mutants.
MATERIALS AND METHODS
Plasmid construction. Plasmid pFC-Xq was used in the initial bacterial screen
and was derived from pBCPB (8). An expression cassette containing the CMV
promoter, SV40 polyA and a multiple cloning site was cloned into the AgeI site of
pBCPB. Genomic DNA was prepared from 293 human embryonic kidney cells, and ~450 b.p. of the
Xq22.1 integration site was PCR amplified with the primer set 5’-
AATTCGCCCTTGCTCTAGACCTAGGTCCATTTTCGCTAATATTGTC and 5’-
AATTCGCCCTTGCTCTAGAAGATCTGGCACCCAAATGTAGCTTTA. The attP
from pBCPB was removed with SpeI, blunted, and ligated to the Xq22.1 PCR insert.
Plasmid pBCXB was created similarly to pFC-Xq, except that the CMV expression
cassette was not cloned into the construct. Plasmid pBXG was derived from pBPGreen,
which has been described (16). Briefly, the CMV promoter was inverted to prevent
expression of the GFP gene. The attP in pBPGreen was replaced by ~450 b.p. of the
human Xq22.1 integration site with the same PCR product from the primers above.
Plasmid pDB2 was derived from pEGFP-C1 (Stratagene, Cedar Creek, TX) by placing
~300 b.p. of phiC31 attB into the MluI site. The 613 amino acid version of wild-type
phiC31 was encoded by pCMVInt as previously described (8).
Cell culture and transfection. HeLa cells (ATCC, Manassas, VA) were grown
in Dulbecco’s Modified Eagle medium (Invitrogen, Carlsbad, CA) supplemented
with 9% fetal bovine serum and 1% penicillin/streptomycin. Transfection of HeLa cells
was carried out with FuGENE6 (Roche, Indianapolis, IN) at 3 µl of FuGENE
reagent per 1 µg of DNA, in either 96-well plates or 60 mm dishes for verification. For a
96-well plate, ~70-80% confluent HeLa cells were transfected with 25 ng of the
pBXG/pBPGreen donor and 25 ng of a single candidate mutant per well. Transfection
into 60 mm plates with 980 ng of mutant candidates (or pCMVInt) and 20 ng pDB2 to
select for integration was used to count colonies and obtain pooled phiC31-modified
genomic DNA. Briefly, after 24 h, cells were split 1:40 into three 10 cm plates. At 48 h
after transfection, cells were selected in medium containing 350 µg/ml of G418
(Invitrogen). Selection was continued for 10–14 days, at which time colonies became
visible. This gives an estimate of the total number of integration events, which includes
random integration of the pDB2 donor, integration into the Xq22.1 pseudo site, and
integration into other pseudo sites.
Mutagenesis and library generation. Libraries containing mutants derived from
phiC31 were created with two different methods. First, degenerate oligos were created
with random substitutions in either the first or second base of a codon. We created two
sets. The first, 2n5n9n, contained substitutions only in the second, fifth, or ninth amino acid. The
second, 2n7n9n10n, had substitutions only in the second, seventh, ninth, or tenth amino acid. This
created a maximum library size of 64 unique candidates from 2n5n9n and 958 unique
mutants from 2n7n9n10n. Second, error-prone PCR (GeneMorphII, Stratagene, La Jolla,
CA) was used to randomly mutagenize the entire length of phiC31. We used 100 ng of
pCSI-28Xq from the degenerate oligo screening as PCR template. The 6230 b.p.
plasmid meant that ~30 ng of the target gene (1839 b.p.) was mutagenized. The third
round of mutagenesis used ~50 ng of starting template. Products were cloned into the
5’-KpnI and 3’-XhoI sites of the vector pCS, which is the same backbone as pCMVInt.
We sequenced several mutants to determine the frequency of mutations and, in general,
used PCR conditions to keep the number of amino acid substitutions below 5 per round.
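The ~30 ng figure quoted above follows from the length fraction of the plasmid occupied by the phiC31 coding sequence. A minimal sketch of that arithmetic is given below; the function and constant names are illustrative only, and the calculation assumes that template mass is distributed in proportion to sequence length.

```python
# Mass of target gene contained in an error-prone PCR template, by length fraction.
# Lengths are taken from the text; the names are illustrative assumptions.
PLASMID_BP = 6230   # pCSI-28Xq plasmid length
GENE_BP = 1839      # phiC31 coding length (613 codons x 3 bases)


def target_mass_ng(plasmid_ng: float) -> float:
    """Return the ng of target gene present in plasmid_ng of plasmid template."""
    return plasmid_ng * GENE_BP / PLASMID_BP


# 100 ng of plasmid template carries roughly 29.5 ng of target gene,
# which matches the ~30 ng stated for the second round.
second_round_target_ng = target_mass_ng(100)
```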
Mammalian screen for improved mutants. Initially, a previously described blue/white
bacterial screen (16) was employed to identify functional mutants before proceeding to the
more laborious mammalian screen. For the
mammalian screen, bacterial colonies were picked and plasmid DNA was purified using
either the standard Qiaprep Spin Miniprep kit or the Qiaprep 8 Turbo Miniprep kit
(Qiagen, CA). Mutants were tested in replicates of eight, along with a negative control
receiving no DNA and the positive control wild-type phiC31 encoded by pCMVInt.
This meant that 10 mutants per 96-well plate could be screened. Approximately 100
mutants were screened from the degenerate oligo mutant library and more than 500 in
total from the error-prone PCR libraries. This represented just a fraction of the total
mutants derived from error-prone PCR, estimated to number ~10^6 unique mutants. The
recombination between the attB and Xq22.1 sites on pBXG resulted in GFP expression.
Seventy-two hours after transfection into HeLa cells, GFP expression was quantified
with the Guava PCA-96 analyzer (Guava Technologies Inc, Hayward, CA). We used
the mean fluorescent intensity of GFP+ cells to determine which mutants were more
specific to the Xq22.1 site. Improved mutants were then verified in larger 60 mm dishes
with the same assay, using 500 ng of pBXG/pBPGreen and 500 ng of a mutant
candidate plasmid. 500 ng of wild-type pCMVInt was used as the standard control (Fig.
5-1).
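The plate capacity described above (eight replicates per sample, two control samples per plate) can be checked with a short calculation. This is only a sketch: the helper names are hypothetical, and the plate count assumes every plate is filled completely, which the text does not state.

```python
# Screening capacity of a 96-well plate under the layout described in the text.
WELLS_PER_PLATE = 96
REPLICATES = 8   # wells per sample
CONTROLS = 2     # no-DNA negative control and wild-type pCMVInt positive control


def mutants_per_plate(wells=WELLS_PER_PLATE, replicates=REPLICATES, controls=CONTROLS):
    """Number of mutant candidates that fit once controls are accounted for."""
    return wells // replicates - controls


def plates_needed(n_mutants):
    """Plates required for n_mutants, assuming each plate is filled (an assumption)."""
    per_plate = mutants_per_plate()
    return -(-n_mutants // per_plate)  # ceiling division
```

With these values, `mutants_per_plate()` gives 10, in agreement with the figure stated above, and a screen of roughly 600 candidates would occupy about 60 plates under the full-plate assumption.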
Excision assay and sequencing analysis. HeLa cells were co-transfected with 500 ng of
pBCPB or pBCXB and 500 ng of a mutant candidate or pCMVInt.
Twenty-four hours prior to harvesting, cells were supplemented with 50 U/ml DNase I
(Invitrogen) to reduce the background of any untransfected DNA. At 72 hours after
transfection, low molecular weight DNA was recovered as described by Hirt (17). An
aliquot of this DNA was electroporated into competent DH10B E. coli cells and spread
onto plates containing chloramphenicol and Xgal to select for the excision plasmid. The
intramolecular recombination frequency was then determined by dividing the number of
white colonies by the total number of colonies (white + blue) and multiplying by 100. Cells transfected with
either pBCPB or pBCXB alone were used as controls to determine the number of
background white colonies. All sequencing was performed with a primer matching the
M13 Reverse site in pBCXB by Elim Biopharmaceuticals (Hayward, CA). Sequencher
software (Gene Codes Corp., Ann Arbor, MI) was used to align sequences with vector
and genomic DNA, and recombination junctions were identified by sequence matching
to attB and Xq22.1.
PCR analysis for Xq22.1 integration. After two weeks of G418 selection, genomic DNA
was prepared from pooled transfected HeLa colonies with the DNeasy kit
(Qiagen). Approximately 200 ng of genomic DNA was amplified with primers for the attB
site in pDB2 and genomic Xq22.1 DNA as described previously (10). Genomic DNA
from cells that did not undergo selection was used as a negative control.
RESULTS
First generation mutants show increased human chromosome Xq22.1
specificity. In a previous effort to improve catalytic efficiency in phiC31, mutagenesis
was limited to the catalytic domain and the resulting mutant library was screened in a
bacterial assay (16). When these first round mutations were combined, two mutants
showed significant site-specificity toward a placed attP target site over other pseudo
attP targets in 293 human embryonic kidney cells. Sequence analysis showed two of the
first round parent mutants had changes within the first ten amino acids in the 613 amino
acid phiC31 integrase. This result was unexpected and suggested that mutations need
not occur in the DNA binding domain to alter specificity. Using this as a model for
developing a mutant capable of recombining the human Xq22.1 site more frequently
than other pseudo sites, we made an initial mutant library with degenerate oligos for
amino acids 2-10 of the 613 residue long phiC31.
We initially used two screens in order to detect improved mutants. The first was
a blue/white colony assay in E. coli. In this screen, the mutant library was cloned into
the assay plasmid pFC-Xq. Plasmid pFC-Xq also carried the lacZ gene, which was
flanked by the native attB site and ~450 b.p. of the Xq22.1 pseudo attP site. This was
merely a qualitative screen to detect a functional integrase capable of recombining attB
with Xq22.1 and excising the lacZ gene. Positive mutants would then be identified by
white bacterial colonies and DNA harvested for the secondary mammalian cell screen.
In parallel, we performed the same screen with mutants cloned into a second plasmid,
pFC1, as a control. Plasmid pFC1 is similar to pFC-Xq, except that the Xq22.1 pseudo
site has been replaced with wild-type attP. Screening with the wild-type attB/attP sites
showed very few false positives. However, 95% of the pFC-Xq screening plasmids
were false positives. Colonies were deemed false positive if a plasmid prep and
subsequent analytical digest and sequence did not reveal the presence of a mutant
phiC31 insert. In the majority of cases, it appeared that catalysis at the att cores did occur,
but the resulting plasmid only contained the origin of replication and antibiotic
resistance marker. This was likely due to an extreme difficulty for phiC31 to complete
recombination between attB and the Xq22.1 pseudo site in a bacterial context.
We chose to perform subsequent screenings directly in the secondary
mammalian screen in HeLa cervical cancer cells (Fig. 5-1). In this extrachromosomal
inversion assay, the ability of a mutant enzyme to recombine att sites is quantitatively
measured by GFP expression. Plasmid pBXG contains the eGFP gene downstream
of an inverted CMV promoter flanked by attB and the Xq22.1 pseudo site. This plasmid
is co-transfected with a plasmid encoding a mutant phiC31 candidate or wild-type
phiC31 as the standard. Figure 5-2 shows GFP expression of the best mutant candidates
from the initial degenerate oligo library. Candidate pCSI-28Xq was able to recombine
attB and the Xq22.1 pseudo site in this extrachromosomal inversion assay at a level ~1.22-fold
higher than wild-type phiC31 (pCMVInt). Sequencing results showed that a T2I amino
acid mutation had occurred. Interestingly, this mutation was also found, in combination
with other mutations, in improved attP specificity mutants (16).
Two more rounds of mutagenesis evolve phiC31 away from attP. We used
whole-gene error-prone PCR on pCSI-28Xq and screened for improved mutants via the
extrachromosomal inversion assay for two additional rounds. To be sure that relaxed
specificity mutants were not isolated, we performed a negative selection version of this
assay on a plasmid bearing wild-type attB and attP sites flanking an inverted CMV
promoter driving eGFP. Wild-type phiC31 (pCMVInt) was used as a positive control.
Figure 5-3 shows the GFP expression levels of the best third round mutant candidates
and wild-type phiC31 with both the positive and negative selection assays. Also
included is the second round parent, C05, from which the two best candidates, E04 and
G04, were derived. These results show that in the second round, C05 has increased
recombination frequency on both wild-type attB/attP and attB/Xq22.1 sites. This
suggests that a mutation arose in C05 that either relaxed its specificity or
increased its overall efficiency. Round three mutants show reduced recombination
frequency with wild-type attB/attP and in contrast have increased recombination
between attB and Xq22.1. This demonstrated that round three mutants have increased
specificity toward Xq22.1 binding and recombination.
Two silent mutations alter phiC31 specificity. Sequencing results with 3X
coverage of C05, E04, and G04 revealed no amino acid changes, other than retaining
the T2I mutation. This suggested that changes in recombination activity resulted from
silent mutations. Candidate C05 had a silent mutation at residue Glu24 (gag>gaa), while
E04 had a silent mutation at Gly176 (ggc>ggt), and G04 a silent mutation at Gly268
(ggc>ggt). Candidates E04 and G04 both had the Glu24 silent mutation as they are both
children of C05. That an advantageous splice site arose due to these silent SNPs without
destroying phiC31 integrase functionality seems unlikely. An alternative view is that
codon usage bias was determining specificity. Recently, Kimchi-Sarfaty and colleagues
provided evidence that a silent mutation in the multidrug resistance 1 (MDR1) gene
created a protein with altered conformation and substrate specificity (18). The silent
mutations that arose in both the phiC31 mutants and in the MDR1 gene resulted in
changes from frequent to infrequent human codons. If these codon changes alter
translational kinetics, then it is conceivable that phiC31 enzyme folding and function
would be changed as a result.
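The claim that these three changes are silent can be verified directly against the standard genetic code. The sketch below uses only the four codons named above; the dictionary and function names are illustrative, and it does not attempt to model codon usage frequencies or translational kinetics.

```python
# Minimal slice of the standard genetic code: only the codons named in the text.
CODON_TO_AA = {"GAG": "Glu", "GAA": "Glu", "GGC": "Gly", "GGT": "Gly"}

# Silent changes reported for C05, E04, and G04: residue -> (codon before, codon after).
SILENT_CHANGES = {
    "Glu24": ("gag", "gaa"),
    "Gly176": ("ggc", "ggt"),
    "Gly268": ("ggc", "ggt"),
}


def changed_positions(before, after):
    """Return the 1-based codon positions at which the two codons differ."""
    pairs = zip(before.upper(), after.upper())
    return [i + 1 for i, (b, a) in enumerate(pairs) if b != a]


def is_synonymous(before, after):
    """True if both codons encode the same amino acid."""
    return CODON_TO_AA[before.upper()] == CODON_TO_AA[after.upper()]


def summarize(changes=SILENT_CHANGES):
    """Map each residue to (is_synonymous, changed codon positions)."""
    return {res: (is_synonymous(b, a), changed_positions(b, a))
            for res, (b, a) in changes.items()}
```

Calling `summarize()` reports each change as synonymous and confined to the third (wobble) position of its codon, which is consistent with the sequencing interpretation given above.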
Excision assay confirms GFP expression results. We sought to confirm the
GFP expression results before continuing with more laborious mammalian cell
integration assays. We transfected pBCXB along with wild-type phiC31, E04, or G04
into HeLa cells. Functional enzymes should successfully recombine attB and Xq22.1
and excise the lacZ gene centered between the two att sites. After 72 hours, plasmid
DNA was obtained through a Hirt extraction (17) and transformed into E. coli for
scoring of white colonies over blue + white. Table 5-1 shows the results of this
intramolecular integration assay. We determined that wild-type phiC31 integrase
catalyzed recombination between attB and Xq22.1 at a frequency of 1.45%. In contrast,
recombination frequency for the improved enzymes was 8.53% and 7.20% for E04 and
G04, respectively. For wild-type attP/attB recombination (pBCPB),
recombination frequency dropped from 38% for wild-type phiC31 to ~14% for E04 and
~18% for G04. The drop in recombination frequency for attP/attB indicated that this
screen was not limited to deriving hyperactive mutants or mutants with relaxed
specificity. Rather, these results provided evidence that just three rounds of mutagenesis
were enough to alter phiC31 site-specificity.
Finally, to confirm that site-specific recombination had occurred and that loss of
lacZ was attributable to the mutant-mediated recombination, we picked 10 white
colonies from the E04 + pBCXB plate, prepared minipreps of the DNA and sequenced
the attB/Xq22.1 junctions. All of the aligned sequences, except one, showed that
recombination had occurred within 2 bases of the dinucleotide core where phiC31
catalyzes recombination (Fig 5-4). The lone exception was a plasmid with a
recombination within 3 bases of the Xq22.1 core and a 79 base pair insert between
Xq22.1 and the attB core. The 79 base pair insert corresponded to a region of the lacZ
gene that lay 723 bases downstream of the attB core. The tenth sequence read could not
be aligned. Similar sequencing showed recombination within 1-20 bases and 0-19 bases of the core
for wild-type phiC31 integrase and G04, respectively. See Table 5-2 for sequencing
reads.
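The junction statistics quoted in this paragraph can be tallied from the offsets listed in Table 5-2. The sketch below transcribes the wild-type and E04 rows only, and it assumes that the two numeric columns are signed distances, in bases, of the crossover from the core dinucleotide on the attB and Xq22.1 sides; that reading of the table is an interpretation, not something stated in the caption.

```python
# (attB offset, Xq22.1 offset) for each aligned read, transcribed from Table 5-2.
# Interpreting these as signed distances from the core is an assumption.
JUNCTION_OFFSETS = {
    "WT": [(1, 2), (-2, -1), (-20, 1)],
    "E04": [(1, 2), (1, 2), (0, 2), (1, 2), (0, -1),
            (1, 2), (1, 2), (1, 2), (-8, 3)],
}


def reads_within(offsets, max_distance=2):
    """Count reads whose crossover lies within max_distance bases on both sides."""
    return sum(1 for attb, xq in offsets
               if max(abs(attb), abs(xq)) <= max_distance)


def largest_offset(offsets):
    """Largest absolute offset observed on either side across all reads."""
    return max(max(abs(attb), abs(xq)) for attb, xq in offsets)
```

Under this reading, eight of the nine aligned E04 reads fall within 2 bases of the core, with the remaining read being the one that carries the 79 base pair insert, and the largest wild-type offset is 20 bases.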
Genomic integration frequency of phiC31 Xq22.1 mutant candidates. One of
the concerns with improving specificity is that it comes at the cost of enzyme
efficiency. To quantify efficiency of overall integration, which includes any genomic
pseudo attP sites, we performed a G418 drug selection assay. The donor plasmid,
pDB2, encodes the neomycin drug-resistance gene and contains the phiC31 attB
recognition site. We transfected HeLa cells with this donor plasmid, along with wild-
type phiC31 or one of the mutant candidates (E04 and G04). Cells then underwent
G418 selection for two weeks, after which colonies were counted. As shown in Figure 5-5,
wild-type phiC31 was 2.1-fold higher in efficiency than E04 and ~3-fold higher than
G04. Cells that were transfected with the pDB2 donor alone resulted in just a few
colonies. As the mammalian extrachromosomal assay (Fig. 5-1) was designed to screen
for mutants more specific toward Xq22.1, it is not surprising that E04 and G04 had
reduced colony numbers. The lower colony numbers obtained with E04 and G04
suggested that increased pseudo attP specificity, rather than relaxed specificity, had arisen after
three rounds of directed evolution.
Confirmation of integration at chromosome Xq22.1. In order to confirm
chromosomal integration at Xq22.1, we repeated the G418 selection assay and allowed
cells to grow to confluency. Genomic DNA preps were made from the harvested cells,
and we amplified any integration events with PCR using primers matching the attB site
in the donor pDB2 and in the genomic Xq22.1 region. Figure 5-6 shows the results of
this PCR for pCMVInt, E04, G04 and a negative control that received no DNA. While
this was a crude non-quantitative PCR without an internal control, it suggested that both
E04 and G04 may have higher Xq22.1 integration frequencies than wild-type phiC31
integrase. We are currently pursuing qPCR methods to measure integration frequency
of pDB2 into Xq22.1 using candidate phiC31 integrase mutants.
DISCUSSION
This study documents the initial developments to evolve phiC31 to integrate
transgenes at higher specificity into human chromosome Xq22.1. We reasoned that by
starting with a known “hotspot” for phiC31-mediated integration, site-specificity
could be rapidly developed. This is in contrast to most directed evolution strategies,
which aim to screen for mutants capable of targeting sites without any prior sequence
preference. We have demonstrated that only three rounds of mutagenesis were required
in order to significantly alter innate specificity. Furthermore, phiC31 mutants were
isolated with decreased recombination frequencies at the attP site in extrachromosomal
assays. This suggests that phiC31 mutants had not simply become more “promiscuous”
toward either attP or Xq22.1, but rather more specific toward the Xq22.1 site.
Integration frequency into other pseudo attP sites is currently not known.
Lacking in this study is a definitive measurement for integration at the
chromosomal Xq22.1 site. While donor gene integration at this site has been observed
in pooled HeLa colonies transfected with E04 and G04 (Fig. 5-6), the exact frequency at
Xq22.1 compared to wild-type phiC31 is unknown. We are currently employing two
methods to answer this question. The first is picking HeLa colonies after two weeks of
G418 selection and expanding them individually. They will then be tested for the
presence of Xq22.1 integration with the same set of primers used in the pooled colony
assay above. The second method is qPCR, which does not require the laborious
procedure of picking single colony isolates and purifying genomic DNA for each clone.
We are in the process of developing the proper controls for qPCR at the Xq22.1
integration site.
In their phiC31 DNA shuffling study, Sclimenti et al. (11) were able to
improve specificity toward a pseudo site on chromosome 8p22. However, the overall
integration efficiency of the improved mutants had dropped to more than 10-fold below
wild-type levels. It cannot be stated with certainty why such a drop occurred, but their
initial screening took place in E. coli, which as shown here can alter phiC31
recombination activity compared to human cells. We initially performed screens in
bacteria to filter out non-functional mutants before continuing on to the more tedious
mammalian cell studies. This bacterial screen was plagued with false-positives when
trying to recombine the attB site with Xq22.1. While it has been shown that phiC31-
mediated recombination in mammalian cells does not require host co-factors (8), this
may not be true for all pseudo sites, such as Xq22.1. If so, then it is conceivable that in
bacteria, certain pseudo sites cannot be recombined properly, which would lead to the
observed false-positives. Performing a bacterial screen, although high-throughput,
means mutants that work particularly well in eukaryotic cells may be lost. Despite
relying upon a more laborious mammalian cell screen in this study, only ~600 mutants
needed to be assayed to find candidates with greater Xq22.1 specificity. To avoid false-
positives and increase throughput, we are currently exploring the use of the yeast S.
cerevisiae to perform a lacZ excision screen, followed by bacterial transformation to
score for improved mutants. Functional mutants, screened as white colonies, could then
be tested in mammalian cells, which we typically assay ten mutants at a time, in replicates of
eight, in a 96-well plate.
It was surprising that the top third round mutants, E04 and G04, had only three
mutations, two of which were silent base changes. If these silent changes are
contributing to directed evolution, then it suggests that a rationally designed approach for
targeted genetics is not always the best strategy. Our first round mutations arose from a
semi-rational approach, in which degenerate oligos were designed with random
nucleotides in the first two bases of each codon. This first round of mutagenesis therefore
excluded the possibility of silent mutations. It was not until we employed a non-rational
strategy, error-prone PCR, in the second round of mutagenesis that the
silent mutations in phiC31 arose. The exact contribution of these silent changes is not
known. Changes in mRNA hairpin formation, codon usage bias, codon pair bias, or a combination
of these could each plausibly lead to altered recombinases (18, 19, 20). Less
likely is the formation of a fortuitous splice site, but this cannot be ruled out either.
The Chalberg et al phiC31 integration analysis from 2006 indicated that the
nearest gene to the Xq22.1 integration site was DRP2 at 23 Kb away (10). More
recently, a new mRNA encoding for the 14-3-3 tau splice variant protein has been
annotated at just 5 Kb away from Xq22.1 (2006 BLAT human genome assembly, 21).
The group of 14-3-3 proteins has been shown to interact with cytoskeletal proteins,
including altered neurofilaments responsible for amyotrophic lateral sclerosis (ALS).
The Xq22.1 integration site is downstream of the proposed transcriptional region for the
14-3-3 tau splice variant. It is therefore possible that integration at Xq22.1 would have no
effect on the transcription of its upstream neighbor. While the proximity of this gene raises concern about
continuing directed evolution toward the Xq22.1 pseudo site, the methods outlined in
this study should still be applicable to other phiC31 pseudo sites.
The recent annotation of the 14-3-3 tau splice variant protein highlights the importance
of developing a genetic engineering approach that can quickly evolve enzyme specificity
to different target sequences. As the annotation of the human genome continues to
become more detailed, it is certain that many sites once deemed safe for donor gene
integration will no longer be considered acceptable for therapeutic gene targeting.
Directed evolution of phiC31 has an advantage over viral and transposon systems in
that known integration sites are mostly intergenic. The phiC31 pseudo sites found in
gene regions are usually downstream of the transcribed gene product, in contrast
to viral insertions, which usually favor the 5’ end of genes (22, 23). Additionally, the overall
number of estimated phiC31 integration sites in the human genome is only ~370 (10).
This number is small enough to reduce the chance of insertional mutagenesis, but large
enough that a great number of alternative integration options are left open when
deciding where to evolve phiC31 specificity toward. Compared to zinc finger nuclease
engineering strategies (24, 25, 26), the steps shown in this study offer a simple and fast
throughput of directed mutants. We envision a phiC31 directed evolution platform
capable of rapid engineering toward specific loci in human and non-human genomes.
Table 5-1. Intramolecular integration frequencies of candidate phiC31 Xq22.1 mutants
Plasmid             N   White   Blue   Intramolecular integration frequency, %   Standard error, %
pBCXB               1       1    257    0.38                                      NA
pCMVInt + pBCPB     3      81    130   38.39                                      1.27
pCSI-E04 + pBCPB    3     130    799   13.99                                      1.67
pCSI-G04 + pBCPB    2     147    657   18.28                                      1.53
pCMVInt + pBCXB     3       7    476    1.45                                      0.3
pCSI-E04 + pBCXB    3      39    418    8.53                                      2.15
pCSI-G04 + pBCXB    2       9    116    7.20                                      1.42
Shown are the numbers of white colonies and blue colonies following co-transfection into HeLa cells of
pBCXB with wild-type phiC31 or a candidate mutant. To confirm evolution away from the native attP
recognition site, the experiment was repeated with pBCPB. Three independent experiments were
performed for each plasmid. Intramolecular integration frequency is calculated as white colonies over
total white + blue. N, number of independent experiments; pCMVInt is wild-type phiC31. NA, data not
available.
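The frequency calculation described in the legend (white colonies over total white + blue) can be sketched as follows. This is a minimal illustration, not the analysis script used in the study: the function and variable names are hypothetical, and the counts are the pooled pBCXB colony counts listed in Table 5-1. The fold change relative to wild-type is an added convenience and is not a value reported in the table.

```python
def integration_frequency(white: int, blue: int) -> float:
    """Percent of scored colonies that are white (lacZ excised)."""
    total = white + blue
    if total == 0:
        raise ValueError("no colonies scored")
    return 100.0 * white / total


# Pooled (white, blue) colony counts for pBCXB, taken from Table 5-1.
counts = {
    "pCMVInt + pBCXB": (7, 476),
    "pCSI-E04 + pBCXB": (39, 418),
    "pCSI-G04 + pBCXB": (9, 116),
}

frequencies = {
    name: integration_frequency(white, blue)
    for name, (white, blue) in counts.items()
}

# Fold change over wild-type phiC31 (pCMVInt) on the same substrate.
wild_type = frequencies["pCMVInt + pBCXB"]
fold_over_wild_type = {name: freq / wild_type for name, freq in frequencies.items()}

for name, freq in frequencies.items():
    print(f"{name}: {freq:.2f}% ({fold_over_wild_type[name]:.1f}x wild-type)")
```

The standard error column is not reproduced here because the per-experiment colony counts are not given in the table.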
Table 5-2. Recombination junctions from pBCXB intramolecular excision assay.
Sequence   attL                               attB   Xq22.1
Perfect    tgccagggcgtgcccTTaggttctccttgttc      0        0
WT-1       gtgccagggcgtgccctTGgttctccttgttc     +1       +2
WT-2       gtgccagggcgtgcCAtaggttctccttgttc     -2       -1
WT-3       gtctcgaagccgcggtGA-gttctccttgttc    -20       +1
WT-4 to 6  Could not align
E04-1      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-2      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-3      ggtgccagggcgtgcccTGgttctccttgttc      0       +2
E04-4      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-5      ccagggcgtgcccTTataggttctccttgttc      0       -1
E04-6      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-7      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-8      gtgccagggcgtgccctTGgttctccttgttc     +1       +2
E04-9      gtgcgggtgccaggG(79)Gttctccttgttc     -8       +3
E04-10     Could not align
G04-1      tgccagggcgtgcccTTaggttctccttgttc      0        0
G04-2      gggtgccagggcgtgccCGgttctccttgttc     -1       +2
G04-3      ggtctcgaagccgcggtgCGttctccttgttc    -19       +3
G04-4      cgcggtgcgggtgccagggcgtgCCcttgttc     -3       +8
G04-5      cgcggtgcgggtgccATaggttctccttgttc    -10        0
Shown are sequences from recombination between attB and Xq22.1 on plasmid pBCXB with wild-type
phiC31 integrase (WT), mutant candidate E04, or G04. For convenience, only the first 32 bases flanking
the junction are shown. The crossover core is capitalized and in bold. Deleted bases are shown with a dash
(-). The left half of attB is shown followed by the right half of the Xq22.1 pseudo site. See figure 5-4 for a
detailed explanation of the excision assay. The two columns on the far right indicate how accurate the
crossover was relative to a perfect recombination. A plus (+) indicates the crossover occurred downstream
of the perfect TT core and minus (-) indicates the crossover occurred upstream of the perfect core.
Sequence E04-9 had a 79 base pair insert between the attB and Xq22.1 junction, which corresponded to
part of the lacZ gene 723 bases downstream from attB.
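As a reading aid, the crossover offsets in Table 5-2 can be tallied per integrase. The sketch below is illustrative only: the offset pairs are transcribed from the two right-hand columns of the table (attB offset, Xq22.1 offset) for the junctions that could be aligned, and the function name and summary fields are hypothetical.

```python
from collections import Counter

# (attB offset, Xq22.1 offset) for each aligned junction in Table 5-2.
# Junctions reported as "Could not align" are omitted.
offsets = {
    "WT": [(1, 2), (-2, -1), (-20, 1)],
    "E04": [(1, 2), (1, 2), (0, 2), (1, 2), (0, -1),
            (1, 2), (1, 2), (1, 2), (-8, 3)],
    "G04": [(0, 0), (-1, 2), (-19, 3), (-3, 8), (-10, 0)],
}


def summarize(pairs):
    """Count perfect crossovers and find the most frequent offset pair."""
    tally = Counter(pairs)
    most_common_pair, most_common_count = tally.most_common(1)[0]
    return {
        "aligned": len(pairs),
        "perfect": tally[(0, 0)],
        "most_common": most_common_pair,
        "most_common_count": most_common_count,
    }


for integrase, pairs in offsets.items():
    print(integrase, summarize(pairs))
```

With these data, the (+1, +2) junction recurs among the E04 clones, whereas each aligned G04 junction is distinct.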
Figure 5-1. Mammalian cell inversion assay for screening phiC31 specificity
mutants. In this extrachromosomal assay, the ability of a mutant enzyme to recombine
attB and Xq22.1 sites is quantitatively measured by GFP expression. Plasmid pBXGreen
contains the eGFP gene downstream of an inverted CMV promoter flanked by attB and
the Xq22.1 pseudo site. This plasmid is co-transfected into HeLa cells with a plasmid
encoding a mutant phiC31 candidate or wild-type phiC31 as the standard. Mutant or
wild-type phiC31 products from the co-transfected plasmid mediate recombination
between att sites on pBXG. This results in the inversion of CMV back into its correct
orientation. After 72 hours, GFP expression is measured by flow cytometry. A more
specific mutant for Xq22.1 will have higher mean fluorescence than wild-type. The
assay can be used for negative selection with pBPGreen to test for mutants evolved
away from the attP recognition site. The attR and attL sites are recombined attB and
Xq22.1 (attP) half sites.
[Figure 5-2 bar graph. Y-axis: Avg. Fluorescence GFP+ (0 to 100). X-axis: Naïve, pCMVInt, pCSI-28Xq, pCSI-42Xq, pCSI-43Xq.]
Figure 5-2. GFP expression of first round Xq22.1 specific mutant phiC31
candidates. Degenerate oligos for amino acids 2-10 of phiC31 were used to create a
library of mutant candidates. These were screened using HeLa cells in a 96-well plate
with 8 replicates per candidate. This graph shows the results for promising mutants in a
96-well screen, which were then verified for improvement in 60 mm plates. For this
assay, plasmid pBXG was co-transfected with pCMVInt or the mutant library
candidates. Higher mean fluorescence of the GFP positive fraction represents mutants
with greater Xq22.1 specificity. Error bars are the standard error for three independent
transfections.
[Figure 5-3 bar graph. Y-axis: Avg. Fluorescence of GFP+ (0 to 180). X-axis: pBPGreen alone, pBXG alone, pCMVInt, C05 (R2), E04 (R3), G04 (R3). Legend: Xq22.1, attP.]
Figure 5-3. Relative GFP expression of candidate mutants demonstrates evolution
away from native recognition site. We performed a mammalian extrachromosomal
assay to screen for Xq22.1 specific phiC31 mutants. The results shown here represent
three independent transfections performed in 60 mm plates of sub-confluent HeLa cells
to confirm initial 96-well plate screens. In this assay, as explained in figure 5-1, a
functional mutant inverts a backwards CMV promoter to turn on eGFP expression.
Quantification of GFP expression by flow cytometry can identify mutants with higher
Xq22.1 specificity. Another plasmid, pBPGreen, is similar to pBXG except that wild-
type attP is in place of the Xq22.1 pseudo site. Together with table 5-1, after three
rounds of mutagenesis and screening, mutants E04 and G04 demonstrated decreased
specificity to attP while concomitantly improving specificity toward the Xq22.1 site.
Candidate C05 is the parental mutant of E04 and G04. The hatched bars show
recombination on pBPGreen and the solid bars show recombination on pBXG.
Figure 5-4. Schematic diagram of recombined attB and Xq22.1 in pBCXB. A
perfect crossover recombination would bisect the dinucleotide core of each att site (bold
capitalized). With att sites in direct orientation, the lacZ gene is excised causing
bacterial colonies to grow white when plated on X-gal agar plates. Only 28 bases of each
core are shown, though pBCXB contains 450 b.p. of the Xq22.1 core and ~230 b.p. of
the attB core. Plasmid pBCPB is similar to pBCXB, except that attP is located in the
place of Xq22.1. Table 5-1 shows the colony results from this assay. Sequencing results
of recombined plasmids on pBCXB are shown in table 5-2.
[Figure 5-5 bar graph. Y-axis: # G418 Resistant Colonies (0 to 450). X-axis: Donor Only, pCMVInt, E04, G04.]
Figure 5-5. Chromosomal integration frequency of mutant phiC31 candidates. This
graph shows the number of G418 resistant HeLa colonies after co-transfection with
pDB2 (“donor”, which encodes the neomycin resistance gene) and either wild-type phiC31
(pCMVInt) or one of the mutant candidates. Overall integration at all human pseudo
sites indicates pCMVInt is 2.1-fold above E04 and ~3-fold above G04. It is not known
at this time if the reduction in efficiency is due to less integrase being made. Number of
colonies represents the average of three independent transfections. Standard error bars
are shown.
Figure 5-6. Pooled genomic DNA verifies Xq22.1 integration. Genomic DNA from
pooled HeLa colonies was purified two weeks after G418 selection and co-transfection
of pDB2 and either wild-type or mutant phiC31. 200 ng of genomic template was used to
PCR amplify integrations at the Xq22.1 attL junction. This PCR does not reliably
predict integration frequencies, but shows that integration has not been lost at the
chromosomal Xq22.1 site after three rounds of directed evolution. pCMVInt is the wild-
type phiC31, E04 and G04 are third round mutant candidates, and ‘Neg’ is prepped
HeLa DNA which did not undergo transfection or selection. The visible bands are
~1000 b.p.
REFERENCES
1. Bolusani S, Ma CH, Paek A, Konieczka JH, Jayaram M, and Voziyanov Y.
2006. Evolution of variants of yeast site-specific recombinase Flp that utilize native
genomic sequences as recombination target sites. Nucleic Acids Res. 34:5259-69.
2. Santoro SW and Schultz PG. 2002. Directed evolution of the site specificity of
Cre recombinase. Proc. Natl. Acad. Sci. USA. 99:4185-90.
3. Saraf-Levy T, Santoro SW, Volpin H, Kushnirsky T, Eyal Y, Schultz PG,
Gidoni D, and Carmi N. 2006. Site-specific recombination of asymmetric lox sites
mediated by a heterotetrameric Cre recombinase complex. Bioorg Med Chem.
14:3081-9.
4. Sarkar I, Hauber I, Hauber J, and Buchholz F. 2007. HIV-1 proviral DNA
excision using an evolved recombinase. Science 316: 1912-5.
5. Urnov FD, Miller JC, Lee YL, Beausejour CM, Rock JM, Augustus S,
Jamieson AC, Porteus MH, Gregory PD, and Holmes MC. 2005. Highly
efficient endogenous human gene correction using designed zinc-finger nucleases.
Nature 435:646-51.
6. Akopian A, He J, Boocock MR, and Stark WM. 2003. Chimeric recombinases
with designed DNA sequence recognition. Proc. Natl. Acad. Sci. USA. 100:8688-91.
7. Gordley RM, Smith JD, Graslund T, and Barbas CF. 2007. Evolution of
programmable zinc finger-recombinases with activity in human cells. J. Mol. Biol.
367:802-13.
8. Groth AC, Olivares EC, Thyagarajan B, and Calos MP. 2000. A phage integrase
directs efficient site-specific integration in human cells. Proc. Natl. Acad. Sci. USA
97:5995-6000.
9. Thyagarajan B, Olivares EC, Hollis RP, Ginsburg DS, and Calos MP. 2001.
Site-specific genomic integration in mammalian cells mediated by phage phiC31
integrase. Mol. and Cell Biol. 21:3926-34.
10. Chalberg TW, Portlock JL, Olivares EC, Thyagarajan B, Kirby PJ, Hillman
RT, Hoelters J, and Calos MP. 2006. Integration specificity of phage phiC31
integrase in the human genome. J. Mol. Biol. 357:28-48.
11. Sclimenti CR, Thyagarajan B, and Calos MP. 2001. Directed evolution of a
recombinase for improved genomic integration at a native human sequence.
Nucleic Acids Res. 29:5044-51.
12. Olivares EC, Hollis RP, Chalberg TW, Meuse L, Kay MA, and Calos MP. 2002.
Site-specific genomic integration produces therapeutic Factor IX levels in mice.
Nat. Biotechnol. 20:1124-8.
13. Chalberg TW, Genise HL, Vollrath D, and Calos MP. 2005. phiC31 integrase
confers genomic integration and long-term transgene expression in rat retina. Invest.
Ophthalmol. Vis. Sci. 46:2140-6.
14. Groth AC, Fish M, Nusse R, and Calos MP. 2004. Construction of transgenic
Drosophila by using the site-specific integrase from phage phiC31. Genetics
166:1775-82.
15. Allen BG and Weeks DL. 2005. Transgenic Xenopus laevis embryos can be
generated using phiC31 integrase. Nat. Methods 2:975-9.
16. Keravala A, Lee S, Thyagarajan B, Olivares E, Gabrovsky V, Woodard LE,
and Calos MP. 2008. Mutational derivatives of phiC31 integrase with enhanced
efficiency and specificity. Mol. Therapy [in press].
17. Hirt B. 1967. Selective extraction of polyoma DNA from infected mouse cultures.
J. Mol. Biol. 26:365-9.
18. Kimchi-Sarfaty C, Oh JM, Kim I, Sauna ZE, Calcagno AM, Ambudkar SV,
and Gottesman MM. 2007. A “silent” polymorphism in the MDR1 gene changes
substrate specificity. Science 315:525-8.
19. Gutman GA and Hatfield GW. 1989. Nonrandom utilization of codon pairs in
Escherichia coli. Proc. Natl. Acad. Sci. USA 86:3699-703.
20. Coleman JR, Papamichail D, Skiena S, Futcher B, Wimmer E, and Mueller S.
2008. Virus Attenuation by Genome-Scale Changes in Codon Pair Bias. Science
320:1784-7.
21. Ge W, Volkening K, Leystra-Lantz C, Jaffe H, and Strong MJ. 2007. 14-3-3
protein binds to the low molecular weight neurofilament (NFL) mRNA 3’ UTR.
Mol. Cell. Neurosci. 34:80-7.
22. Mitchell RS, Beitzel BF, Schroder AR, Shinn P, Chen H, Berry CC, Ecker JR,
and Bushman FD. 2004. Retroviral DNA integration: ASLV, HIV, and MLV show
distinct target site preferences. PLoS Biol. 2:E234.
23. Wu X, Li Y, Crise B, and Burgess SM. 2003. Transcription start regions in the
human genome are favored targets for MLV integration. Science 300:1749–52.
24. Mani M, Kandavelou K, Dy FJ, Durai S, and Chandrasegaran S. 2005. Design,
engineering, and characterization of zinc finger nucleases. Biochem. Biophys. Res.
Commun. 335:447-57.
25. Mani M, Smith J, Kandavelou K, Berg JM, and Chandrasegaran S. 2005.
Binding of two zinc finger nuclease monomers to two specific sites is required for
effective double-strand DNA cleavage. Biochem. Biophys. Res. Commun. 334:1191-
7.
26. Durai S, Mani M, Kandavelou K, Wu J, Porteus MH, and Chandrasegaran S.
2005. Zinc finger nucleases: custom-designed molecular scissors for genome
targeting of plant and mammalian cells. Nucleic Acids Res. 33:5978-90.