Rewritable digital data storage in live cells via engineered control of recombination directionality

by J Bonnet, P Subsoontorn, D Endy
Proceedings of the National Academy of Sciences

Abstract

The use of synthetic biological systems in research, healthcare, and manufacturing often requires autonomous history-dependent behavior and therefore some form of engineered biological memory. For example, the study or reprogramming of aging, cancer, or development would benefit from genetically encoded counters capable of recording up to several hundred cell division or differentiation events. Although genetic material itself provides a natural data storage medium, tools that allow researchers to reliably and reversibly write information to DNA in vivo are lacking. Here, we demonstrate a rewriteable recombinase addressable data (RAD) module that reliably stores digital information within a chromosome. RAD modules use serine integrase and excisionase functions adapted from bacteriophage to invert and restore specific DNA sequences. Our core RAD memory element is capable of passive information storage in the absence of heterologous gene expression for over 100 cell divisions and can be switched repeatedly without performance degradation, as is required to support combinatorial data storage. We also demonstrate how programmed stochasticity in RAD system performance arising from bidirectional recombination can be achieved and tuned by varying the synthesis and degradation rates of recombinase proteins. The serine recombinase functions used here do not require cell-specific cofactors and should be useful in extending computing and control methods to the study and engineering of many biological systems.

Available from www.pnas.org

Jerome Bonnet, Pakpoom Subsoontorn, and Drew Endy¹

Department of Bioengineering, Room 269B, Y2E2 Building, 473 Via Ortega, Stanford University, Stanford, CA 94305

Edited by David Baker, University of Washington, Seattle, WA, and approved April 6, 2012 (received for review February 8, 2012)

DNA inversion ∣ synthetic biology ∣ genetic engineering ∣ standard biological parts

Most engineered genetic data storage systems use auto- or cross-regulating bistable systems of transcription repressors or activators to define and hold state via continuous gene expression (1–4). Such epigenetic storage systems can be subject to evolutionary counter selection due to resource burdens placed on the host cell or spontaneous switching due to putatively stochastic fluctuations in cellular processes, including gene expression. Moreover, heterologous expression-based systems are difficult to redeploy given differences in gene regulatory mechanisms across organisms.

Another approach for storing data inside organisms is to code extrinsic information within genetic material (5). Nucleic acids have undergone natural selection to serve as heritable data storage material in organismal lineages. Moreover, DNA provides attractive features in terms of data storage robustness, scalability, and stability (6). In addition, engineered transmission of DNA molecules could support data exchange between organisms as needed to implement higher-order multicellular behaviors within programmed consortia (6, 7).

Practically, researchers have begun to use enzymes that modify DNA, typically site-specific recombinases, to study and control engineered genetic systems. For example, recombinases can catalyze strand exchange between specific DNA sequences and enable precise manipulation of DNA in vitro and in vivo (8). Depending on the relative location or orientation of recombination sites, three distinct recombination outcomes (integration, excision, or inversion) can be realized.

From such knowledge, several natural recombination systems have been reapplied to support research in cell and developmental biology (9, 10). However, all in vivo DNA-based control or data storage systems implemented to date are “single-write” systems (11–13).
Consequently, the amount of information such systems are able to store is linearly proportional to the number of implemented elements (for example, a “thermometer-code” counter capable of recording N events given N data storage elements) (13).

Single-write architectures are limiting if many of the uses for genetic data storage are considered in detail. For example, studies of replicative aging in yeast or human fibroblasts typically track at least 25 or 45 cell division events prior to the onset of senescence, respectively (14). Lineage mapping during worm development frequently tracks at least 10 differentiation events (15), while research with mouse and human systems considers up to several hundred cell divisions (16). In situations where the same signal is being recorded over multiple occurrences (for example, a series of cell division events), reliably rewritable elements are needed to realize geometric increases in data storage capacity (for example, combinatorial counters capable of recording 2^N events given N storage elements).

Among the recombinase family of DNA-modifying enzymes, phage integrases are unique in that the directionality of the recombination reaction can be influenced by an excisionase cofactor (17). In natural systems, a phage integrase alone typically catalyzes site-specific recombination between an attachment site on the infecting phage chromosome (attP) and an attachment site encoded within the host chromosome (attB). The resulting integration reaction inserts the phage genome within the host chromosome bracketed by newly formed attL and attR (LR) sites. Upon induction leading to lytic growth, the prophage coexpresses integrase and excisionase that together restore an independent phage genome and the original attB and attP (BP) sites (18).
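
The capacity argument made earlier in this section (N events for a single-write thermometer code versus 2^N for rewritable elements) can be made concrete with a short calculation. The following is a minimal Python sketch, not part of the paper; the function names are hypothetical, and the event counts are simply the figures quoted in the text (10, 25, and 45 events).

```python
from typing import Tuple


def thermometer_capacity(n_elements: int) -> int:
    """Single-write elements: each element can record one event."""
    return n_elements


def combinatorial_capacity(n_elements: int) -> int:
    """Rewritable binary elements: N elements give 2^N distinct states."""
    return 2 ** n_elements


def elements_needed(events: int) -> Tuple[int, int]:
    """Return (single-write, rewritable) element counts for `events` events.

    The rewritable count is the smallest n with 2**n >= events,
    following the text's "2^N events given N storage elements".
    """
    if events < 1:
        raise ValueError("events must be a positive integer")
    rewritable = (events - 1).bit_length()
    return events, rewritable


if __name__ == "__main__":
    # Event counts quoted in the text: worm lineage, yeast, human fibroblasts.
    for events in (10, 25, 45):
        single, rewritable = elements_needed(events)
        print(f"{events} events: {single} single-write vs {rewritable} rewritable elements")
```

Under these assumptions, tracking 45 divisions would take 45 single-write elements but only 6 rewritable ones, which is the geometric gain the text refers to.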
Early work with the r32 polar mutations of bacteriophage lambda revealed that integrase-mediated recombination of antiparallel BP sites could also lead to the inversion of the intervening DNA (19, 20). Subsequent studies on DNA supercoiling used phage integrases to invert recombinant DNA sequences flanked by opposing BP sites (11, 21). Further in vitro work has since demonstrated that an integrase-excisionase complex can revert a DNA sequence flanked by opposing LR sites (22). We thus sought to develop a stable data register that could invert and restore a target DNA sequence in vivo by appropriately controlling the conditional heterologous expression of integrase and excisionase.

Phage integrases are thought to represent two evolutionarily and mechanistically distinct recombinase families (23). Tyrosine integrases, such as the bacteriophage lambda integrase, often have relatively long attachment sites (∼200 bp), use a Holliday junction mechanism during strand exchange, and require host-specific cofactors. By contrast, serine integrases use a double-strand break mechanism during recombination and can have shorter attachment sites (∼50 bp). In addition, some serine integrases do not require host cofactors, a feature that has led to their successful reuse across a range of organisms (24). We thus chose to explore the engineering of rewritable genetic data storage systems using a bacteriophage serine integrase.

Bacteriophage Bxb1 now provides the best-characterized serine integrase-excisionase system (25–28). Bxb1 gp35 is a serine integrase that catalyzes integration of the Bxb1 genome into the GroEL1 gene of Mycobacterium smegmatis (25). Bxb1 gp47 is an excisionase that mediates excision in vivo and has been shown to control recombination directionality in vitro with high efficiency (27). Minimal attB, attP, attL, and attR sites have been defined for the Bxb1 system (25–27). The Bxb1 excisionase does not bind DNA independently and, from in vitro studies, is thought to control integrase directionality in a stoichiometric manner (27). From these and other studies several models have been proposed for how Bxb1 excisionase controls integrase directionality, but it is not yet clear how excisionase-mediated recombination proceeds or is regulated in vivo (27, 29).

Author contributions: J.B., P.S., and D.E. designed research; J.B. and P.S. performed research; J.B., P.S., and D.E. analyzed data; and J.B., P.S., and D.E. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
¹To whom correspondence should be addressed. E-mail: endy@stanford.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1202344109/-/DCSupplemental.
PNAS Early Edition. www.pnas.org/cgi/doi/10.1073/pnas.1202344109. Engineering; Applied Biological Sciences.

Results

Architecture and Model for a RAD Module. We developed a RAD module based on a two-state latch architecture that switches between states in response to distinct inputs and stores the last state recorded in the absence of either input signal. Here, our RAD module consists of an inducible “set” generator producing integrase, an inducible “reset” generator producing integrase and excisionase, and a DNA data register (Fig. 1A). Briefly, production of integrase alone should set a DNA register sequence flanked by oppositional attB and attP sites, thereby producing an inverted sequence flanked by attL and attR sites (State “1”). A second independent transcriptional input drives the simultaneous production of integrase and excisionase and should reset the register sequence to its original orientation and flanking sequences (State “0”).

We built a chemical kinetic model to better understand the potential behavior and failure modes of a DNA inversion RAD module (Fig. 1B). Our model reflects available knowledge of the mechanics and kinetics of the Bxb1 recombinase system, specifically (27, 30, 31). We used the model to estimate the operational phase diagram of our latch at pseudoequilibrium (SI Appendix). We found three distinct latch operating regions as a function of integrase and excisionase expression levels, corresponding to expected “set,” “reset,” or “hold” operations (Fig. 1C). One complete latch cycle requires the dynamic adjustment of integrase and excisionase expression through a “set, hold, reset, hold” pattern. These operations are realized in practice by cycling the transcription signals that define latch set and reset inputs and by tuning the specific genetic elements that provide fine control over integrase and excisionase synthesis and degradation.

Unidirectional DNA Inversion and Data Storage. We first implemented a data storage register via a DNA fragment encoding fluorescent reporter proteins and Bxb1 recombinase recognition sites flanking a constitutive promoter on the chromosome of Escherichia coli DH5αZ1 (32) (Fig. 1A). We then confirmed via microscopy and cytometry that the state of the register could be assayed reliably (Fig. 2A). We next established that the register could set and hold state via a pulse of integrase expression within cells containing a single coding sequence for integrase. To do this, we built integrase-driven “set” switches by cloning Bxb1 integrase under the control of an inducible promoter (32, 33) and a ribosome binding site library (Materials and Methods). We transformed the set-encoding vectors into cells containing the chromosomal BP register and isolated cells that only switched when induced; many variants switch spontaneously in the absence of an input signal or do not switch when induced (SI Appendix, Fig. S4; Table 1). We were able to isolate set functions that switch with greater than 95% efficiency at the single-cell level and that hold state following inducer removal (Fig. 2B).

Bidirectionality of Excisionase-Mediated DNA Inversion in Vivo. We next determined if Bxb1 integrase and excisionase could mediate DNA inversion from an LR to BP state efficiently and unidirectionally in vivo. Previous in vitro experiments show that Bxb1 integrase and excisionase can catalyze LR to BP recombination to near completion (27). Unexpectedly, we found that a reset function mediated by integrase plus excisionase is reversible in vivo. For example, using constructs (15–20 copies per cell) expressing both integrase and excisionase, we observed that upon induction using a reset signal (arabinose), both DNA register states are sampled across a mixed population and then a split population arises following reset signaling (Fig. 2C).

We observed bidirectional behavior starting from either initial register state, suggesting that, in the context of our system, expression of integrase plus excisionase results in repeated cycles of register inversion between BP and LR states (Fig. 2C, Right). We postulated that the system enters a bidirectional regime if the concentration of excisionase is too low relative to integrase, as might be needed to completely reverse recombination direction-

Fig. 1. Architecture, mechanisms, and operation of a recombinase addressable data (RAD) module. (A) The DNA inversion RAD module is driven by two generic transcription input signals, set and reset. A set signal drives expression of integrase that inverts a DNA element serving as a genetic data register. Flipping the register converts flanking attB and attP sites to attL and attR sites, respectively. A reset signal drives expression of integrase and excisionase and restores both register orientation and the original flanking attB and attP sites. The register itself encodes a constitutive promoter which initiates strand-specific transcription. Following successful set or reset operations, mutually exclusive transcription outputs “1” or “0” are activated, respectively. For the RAD module developed here, a “1” or “0” register state produces red or green fluorescent protein, respectively. (B) Elementary chemical reactions, molecular species, and kinetic parameters used to model the RAD module. Molecular concentrations are normalized to the integrase dimer dissociation constant (Ki). Kinetic rates are normalized to the integrase-mediated recombination rate (kc⁻¹). (C) Simulated phase diagram detailing pseudoequilibrium operating regimes for a RAD module experiencing sustained integrase and excisionase expression levels for 200/kc. The red, green, and gray lines represent, with decreasing intensity, 95, 75, and 55% switching (or hold) efficiencies (main text and SI Appendix).
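
The intended set, reset, and hold logic described for Fig. 1 can be sketched as a small state machine. This is an idealized Python illustration, not the authors' kinetic model: the class and function names are hypothetical, every operation is assumed to run to completion, and excisionase alone is assumed to leave the register unchanged. It deliberately ignores the bidirectional switching the paper reports for integrase plus excisionase in vivo.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Register:
    """DNA data register, identified by its flanking attachment sites."""
    sites: str = "BP"  # "BP" = original orientation, "LR" = inverted

    @property
    def state(self) -> int:
        # State "1" is the inverted (attL/attR) register, state "0" the original.
        return 1 if self.sites == "LR" else 0

    @property
    def reporter(self) -> str:
        # Per the Fig. 1 caption: state "1" gives red, state "0" gives green.
        return "red" if self.state == 1 else "green"


def step(register: Register, integrase: bool, excisionase: bool) -> Register:
    """Idealized latch update; assumes each operation goes to completion."""
    if integrase and not excisionase and register.sites == "BP":
        return Register("LR")  # set: integrase alone inverts a BP register
    if integrase and excisionase and register.sites == "LR":
        return Register("BP")  # reset: integrase plus excisionase restores BP
    return register            # hold: no productive input for this state


def run_cycle(register: Register = Register()) -> List[Tuple[str, int, str]]:
    """Apply one "set, hold, reset, hold" cycle and record each resulting state."""
    inputs = [
        ("set", True, False),
        ("hold", False, False),
        ("reset", True, True),
        ("hold", False, False),
    ]
    trace = []
    for name, integrase, excisionase in inputs:
        register = step(register, integrase, excisionase)
        trace.append((name, register.state, register.reporter))
    return trace


if __name__ == "__main__":
    for name, state, reporter in run_cycle():
        print(f"{name:>5}: state {state} ({reporter})")
```

In this idealization a second set pulse on an LR register is a no-op, which mirrors the text's description of integrase alone acting on BP sites; modeling the observed BP/LR cycling under reset would require rate terms like those in the paper's kinetic model.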
