Journal of Biotechnology 115 (2005) 113–128

Advanced genetic strategies for recombinant protein expression in Escherichia coli

Hans Peter Sørensen, Kim Kusk Mortensen*

Laboratory of BioDesign, Department of Molecular Biology, Aarhus University, Gustav Wieds Vej 10 C, DK-8000 Aarhus C, Denmark

Received 29 March 2004; received in revised form 26 August 2004; accepted 30 August 2004

Abstract

Preparations enriched in a specific protein are rarely easily obtained from natural host cells. Hence, recombinant protein production is frequently the sole applicable procedure. The ribosomal machinery, located in the cytoplasm, is an outstanding catalyst of recombinant protein biosynthesis. Escherichia coli facilitates protein expression by its relative simplicity, its inexpensive and fast high-density cultivation, its well-known genetics and the large number of compatible tools available for biotechnology. In particular, the variety of available plasmids, recombinant fusion partners and mutant strains has advanced the possibilities with E. coli. Although expression is often simple for soluble proteins, major obstacles are encountered in the expression of many heterologous proteins and proteins lacking relevant interaction partners in the E. coli cytoplasm. Here we review the currently most important strategies for recombinant expression in E. coli. Issues addressed include expression systems in general, selection of host strain, mRNA stability, codon bias, inclusion body formation and prevention, fusion protein technology and site-specific proteolysis, compartment-directed secretion and finally co-overexpression technology. The macromolecular background for a variety of obstacles and genetic state-of-the-art solutions are presented.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Escherichia coli; Recombinant protein expression systems; Inclusion bodies; Fusion proteins; Rare codon tRNAs

* Corresponding author. Fax: +45 86 12 31 78. E-mail address: email@example.com (K.K. Mortensen).

0168-1656/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.jbiotec.2004.08.004

1. The modern recombinant expression system

A number of central elements are essential in the design of recombinant expression systems (Baneyx, 1999; Jonasson et al., 2002). Expression is normally induced from a plasmid harboured by a system-compatible genetic background. The genetic elements of the expression plasmid include an origin of replication (ori), an antibiotic resistance marker, transcriptional promoters, translation initiation regions (TIRs) as well as transcriptional and translational terminators.

1.1. The replicon

The replicon of a plasmid contains the origin of replication and, in some cases, associated cis-acting elements (del Solar et al., 1998). Most plasmid vectors used in recombinant protein expression replicate by the ColE1 or the p15A replicon. Plasmid copy number is controlled by the origin of replication, which preferably replicates in a relaxed fashion (Baneyx, 1999). The ColE1 replicon present in modern expression plasmids is derived from the pBR322 (copy number 15–20) or the pUC (copy number 500–700) family of plasmids, whereas the p15A replicon is derived from pACYC184 (copy number 10–12). These multi-copy plasmids are stably replicated and maintained under selective conditions, and plasmid-free daughter cells are rare (Summers, 1998). Plasmid incompatibility is defined as the inability of two plasmids to be stably maintained in the same cell (Hardy, 1987). Different replicon incompatibility groups and drug resistance markers are required when multiple plasmids are employed for the co-expression of gene products. Derivatives containing ColE1 and p15A replicons are often combined in this context since they are compatible plasmids (Mayer, 1995).

1.2. Resistance markers

The most common drug resistance markers in recombinant expression plasmids confer resistance to ampicillin, kanamycin, chloramphenicol or tetracycline. Plasmid-mediated resistance to ampicillin is accomplished by expression of β-lactamase from the bla gene. This enzyme is secreted to the periplasm, where it catalyses hydrolysis of the β-lactam ring. Ampicillin present in the cultivation medium is especially susceptible to degradation, either by secreted β-lactamase or by acidic conditions in high-density cultures. The latter effect can be alleviated by using carbenicillin, an ampicillin analog that is less susceptible to degradation. Kanamycin, chloramphenicol and tetracycline interfere with protein synthesis by binding to critical areas of the ribosome.
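The co-expression requirement described above (different replicons and different resistance markers) can be sketched as a simple check. This is a minimal Python sketch, not a validated design tool: the `Plasmid` record, the `can_coexist` helper and the three example vectors are hypothetical names introduced here for illustration, and the marker assigned to each example vector is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    """Hypothetical minimal description of an expression plasmid."""
    name: str
    replicon: str  # replicon family, e.g. "ColE1" or "p15A"
    marker: str    # drug resistance marker, e.g. "ampicillin"


def can_coexist(a: Plasmid, b: Plasmid) -> bool:
    """Return True when two plasmids differ in both replicon and marker.

    This encodes only the rule stated in the text; real compatibility
    also depends on factors not modelled here.
    """
    return a.replicon != b.replicon and a.marker != b.marker


# Illustrative vectors; names and marker choices are assumptions.
vector_a = Plasmid("vector_a", "ColE1", "ampicillin")
vector_b = Plasmid("vector_b", "p15A", "chloramphenicol")
vector_c = Plasmid("vector_c", "ColE1", "kanamycin")

print(can_coexist(vector_a, vector_b))  # ColE1 with p15A, distinct markers
print(can_coexist(vector_a, vector_c))  # both ColE1, so rejected
```

The check treats the replicon family as a proxy for the incompatibility group, which matches the ColE1 and p15A pairing mentioned in the text but is a simplification in general.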
Kanamycin is inactivated in the periplasm by aminoglycoside phosphotransferases and chloramphenicol by the cat gene product, chloramphenicol acetyl transferase. Various genes confer resistance to tetracycline (Connell et al., 2003).

1.3. Promoters

Recombinant expression plasmids require a strong transcriptional promoter to control high-level gene expression. Basal transcription in the absence of inducer is minimized through the presence of a suitable repressor. Minimization of basal transcription is especially important when the expression target introduces a cellular stress situation and thereby selects for plasmid loss. Promoter induction is either thermal or chemical, and the most common inducer is the sugar molecule isopropyl-β-D-thiogalactopyranoside (IPTG) (Hannig and Makrides, 1998).

1.4. Messenger RNA

Translation initiation from the translation initiation region (TIR) of the transcribed messenger RNA requires a ribosomal binding site (RBS) including the Shine–Dalgarno (SD) sequence and a translation initiation codon (Sørensen et al., 2002). The Shine–Dalgarno sequence is located 7 ± 2 nucleotides upstream from the initiation codon, which is the canonical AUG in efficient recombinant expression systems (Ringquist et al., 1992). Optimal translation initiation is obtained from mRNAs with the SD sequence UAAGGAGG. The RBS secondary structure is highly important for translation initiation, and efficiency is improved by high contents of adenine and thymine (Laursen et al., 2002). Translation initiation efficiency is in particular influenced by the codon following the initiation codon, and adenine is abundant in highly expressed genes (Stenstrom et al., 2001).

A transcription terminator placed downstream from the sequence encoding the target gene enhances plasmid stability by preventing transcription through the origin of replication and from irrelevant promoters located in the plasmid.
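The translation initiation features described in this section (an SD sequence placed 7 ± 2 nucleotides upstream of an AUG initiation codon) can be checked on a candidate sequence. The sketch below is a minimal Python illustration under stated assumptions: it matches only the GGAGG core of the UAAGGAGG sequence, it counts the spacing as the number of nucleotides between the last base of that core and the A of AUG, and the function name `sd_spacings` and the toy sequence are hypothetical.

```python
import re

START_CODON = "AUG"
# Core of the SD sequence UAAGGAGG; matching only this core is a
# simplifying assumption of the sketch.
SD_CORE = "GGAGG"


def sd_spacings(mrna: str, min_gap: int = 5, max_gap: int = 9):
    """Return (sd_start, aug_start, gap) for each SD core followed by AUG.

    gap is the number of nucleotides between the last base of the SD core
    and the A of AUG (an assumed convention for the 7 +/- 2 spacing).
    DNA input is accepted and converted to RNA.
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    # Zero-width lookahead so that overlapping SD cores are all found.
    for match in re.finditer(f"(?={SD_CORE})", mrna):
        sd_end = match.start() + len(SD_CORE)
        for gap in range(min_gap, max_gap + 1):
            aug = sd_end + gap
            if mrna[aug:aug + 3] == START_CODON:
                hits.append((match.start(), aug, gap))
    return hits


# Toy leader: SD sequence, a 7-nucleotide spacer, then AUG and an
# adenine-rich second codon.
example = "GCUAAGGAGG" + "AAUAAUU" + "AUGAAAGCU"
print(sd_spacings(example))  # [(5, 17, 7)]
```

The sketch ignores RBS secondary structure, which the text identifies as highly important, so a hit here indicates only that the linear spacing rule is satisfied.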
Transcription terminators stabilize the mRNA by forming a stem loop at the 3′ end (Newbury et al., 1987). Translation termination is preferably mediated by the stop codon UAA in Escherichia coli. Increased efficiency of translation termination is achieved by insertion of consecutive stop codons or the prolonged UAAU stop codon (Poole et al., 1995).

1.5. Current expression systems

A wealth of expression systems designed for various applications and compatibilities is available. Approximately 80% of the proteins used to solve three-dimensional structures submitted to the Protein Data Bank (PDB) in 2003 were prepared in an E. coli expression system. The T7-based pET expression system (commercialized by Novagen) is by far the most used in recombinant protein preparation (pET represents more than 90% of the 2003 PDB protein preparation systems). Systems using the PL promoter/cI repressor (e.g., Invitrogen pLEX), Trc promoter (e.g., Amersham Biosciences pTrc), Tac promoter (e.g., Amersham Biosciences pGEX) and hybrid lac/T5 (e.g., Qiagen pQE) promoters are common (Hannig and Makrides, 1998). A radically different system is based on the araBAD promoter (e.g., Invitrogen pBAD). Here we review two particular systems that illustrate the most general mechanisms in current recombinant expression systems. Various expression systems and promoters have been reviewed elsewhere (Baneyx, 1999; Hannig and Makrides, 1998; Jonasson et al., 2002).

2. The pET expression system

Studier and colleagues first described the pET expression system, which has been developed for a variety of expression applications (Dubendorff and Studier, 1991; Studier et al., 1990). More than 40 different pET plasmids are commercially available. The system includes hybrid promoters, multiple cloning sites for the incorporation of different fusion partners and protease cleavage sites, along with a high number of genetic backgrounds modified for various expression purposes. Expression requires a host strain lysogenized by a DE3 phage fragment, encoding the T7 RNA polymerase (bacteriophage T7 gene 1), under the control of the IPTG-inducible lacUV5 promoter (Fig. 1A). LacI represses the lacUV5 promoter and the T7/lac hybrid promoter encoded by the expression plasmid. A copy of the lacI gene is present on the E. coli genome and on the plasmid in a number of pET configurations. The lacI gene is weakly expressed, and a 10-fold enhancement of repression is achieved when the overexpressing promoter mutant lacIq is employed (Calos, 1978).
The T7 RNA polymerase gene is transcribed when IPTG binds LacI and triggers the release of tetrameric LacI from the lac operator. Transcription of the target gene from the T7/lac hybrid promoter (repressed by LacI as well) is subsequently initiated by T7 RNA polymerase (Fig. 1A). The T7 promoter is a 20-nucleotide sequence not recognized by the E. coli RNA polymerase. T7 RNA polymerase transcribes maximally 230 nucleotides per second and is thus roughly five times faster than E. coli RNA polymerase (50 nucleotides per second). Background expression from pET expression plasmids is diminished by the presence of T7 lysozyme (bacteriophage T7 gene 3.5 amidase), which is a natural inhibitor of T7 RNA polymerase. Co-expression of T7 lysozyme is achieved by either plasmid pLysS or pLysE. These plasmids harbour the T7 lysozyme gene in silent (pLysS) and expressed (pLysE) orientations with respect to the cognate tetracycline-responsive (Tc) promoter (Studier, 1991). The lacUV5 promoter is less sensitive to regulation by the cAMP-CRP (cAMP receptor protein) complex than the lac promoter. However, incorporation of 1% glucose in the cultivation medium reduces cAMP levels and enhances repression of the promoter significantly (cAMP is produced as a response to low glucose levels). Graded induction of pET vectors has recently been added to the pET system repertoire (Novagen Tuner strains). Host strains deficient in the lacY gene product, lactose permease, offer precise control of target protein expression (Khlebnikov and Keasling, 2002).

3. The pBAD expression system

Expression plasmids based on the araBAD promoter are designed for tight control of background expression and L-arabinose-dependent graded expression of the target protein (Guzman et al., 1995). The latter property is in contrast to the all-or-nothing induction experienced by most other bacterial expression systems (Morgan-Kiss et al., 2002).
A linear increase in gene expression with increasing inducer concentration is seen at the population level when the araBAD system is employed. Unfortunately, induction is all-or-nothing in individual cells, which are either fully induced or uninduced (Siegele and Hu, 1997). Autocatalytic mechanisms related to the natural inducer transport systems, in concert with arabinose degradation, are responsible for all-or-nothing induction of the araBAD promoter. The autocatalytic effect occurs because the arabinose transporters (araE and araFGH) are under arabinose-inducible control. Homogeneous gene expression has been achieved in strains deficient in arabinose transport and degradation, by facilitated diffusion of arabinose, catalyzed by arabinose-independent transporters supplied in trans