Inference of plasmid-copy-number mean and noise from single-cell gene expression data.

by Stéphane Ghozzi, Jérôme Wong Ng, Didier Chatenay, Jérôme Robert
Physical Review E - Statistical, Nonlinear and Soft Matter Physics 82, 051916 (2010)

Abstract

Plasmids are extrachromosomal DNA molecules which code for their own replication. We previously reported a setup using genes coding for fluorescent proteins of two colors that allowed us, using a simple model, to extract the plasmid-copy-number noise in a monoclonal population of bacteria [J. Wong Ng et al., Phys. Rev. E 81, 011909 (2010)]. Here we present a detailed calculation relating this noise to the measured levels of fluorescence, taking into account all sources of fluorescence fluctuations: not only the fluctuation of gene expression as in the simple model but also the growth and division of bacteria, the nonuniform distribution of their ages, the random partition of proteins at divisions, and the replication and partition of plasmids and chromosome. We show how to use the chromosome as a reference, which helps extract the plasmid-copy-number noise in a self-consistent manner.


Available from link.aps.org

Stéphane Ghozzi* and Jérôme Wong Ng†
Laboratoire de Physique Statistique, École Normale Supérieure, UPMC Univ Paris 06, Université Paris Diderot, CNRS, 24 rue Lhomond, 75005 Paris, France

Didier Chatenay and Jérôme Robert
Laboratoire Jean Perrin, FRE 3231, CNRS-UPMC, 24 rue Lhomond, 75005 Paris, France

(Received 22 July 2010; revised manuscript received 5 October 2010; published 11 November 2010)

DOI: 10.1103/PhysRevE.82.051916
PACS number(s): 87.18.Tt, 87.16.-b

I. INTRODUCTION

Plasmids are highly common in natural bacterial strains and are widely used in studies of gene expression [1]. They have been seen as a model for genomic replication and partition [1,2] and studied as genetic control systems, possibly subject to noise [3].

A number of techniques have been used to measure plasmid-copy numbers (PCNs). DNA titration is the simplest, but least precise. Quantitative polymerase chain reaction (qPCR) [4] is often used and gives access to mean PCN in a population.
Two in vivo labeling techniques may a priori give access to PCN distributions when applied to single cells: fusions of a fluorescent protein with a transcription factor that binds the plasmids [5,6] or insertion of a gene coding for a fluorescent protein into the plasmids [7]. However, both have limitations that prevent them from giving access to more than the mean PCN [8].

In the remainder of this Introduction we briefly recall the setup of the experiments reported previously, making use of dual fluorescence reporters, which allowed us to infer the second moment of PCN distributions [8]. In Sec. II we derive the expression for PCN mean and noise in a simple case, where only fluctuations of gene expression are considered. The realistic case, taking into account all sources of fluctuations of the actual experiment, is presented in Sec. III. Section IV presents the values obtained for PCN mean and noise when one uses the experimentally measured quantities. These results and the principle of this work are then discussed. Appendixes A–E present some computations in greater detail.

The gene egfp [9], coding for the green fluorescent protein (EGFP), was fused to the inducible strong promoter PtacI [10] and then inserted in the chromosome of an E. coli strain. The bacteria were then transformed with one of the four plasmids studied here, which contained the fusion PtacI-mOrange [11]: we thus obtained strains expressing EGFP and the orange fluorescent protein mOrange at the same time under the same transcriptional control. After 1 h induction with isopropyl-thio-beta-galactoside (IPTG), all protein expression was blocked. Cells were incubated overnight so that all fluorescent proteins acquire their mature form. For each of the four strains, green and orange fluorescence intensities of individual cells were then measured. In each experiment at least 10 000 cells were observed and at least three experiments were done in each condition.
In general, disentangling the various contributions to the final distribution of fluorescence would be a difficult problem. However, making some assumptions on the gene expression processes, we will be able to express the first and second moments of the number of fluorescent proteins as functions of those of copy numbers and to invert these relations to find how to relate the experimental measurements to the distribution of PCN. Section II presents this strategy in a simple case.

*Present address: Institut für Theoretische Physik, Universität zu Köln, Zülpicherstrasse 77, 50937 Cologne, Germany; ghozzi@thp.uni-koeln.de
†Present address: Physics of Biological Systems, CNRS URA 2171, Institut Pasteur, 25-28, rue du Dr Roux, 75015 Paris, France.
PHYSICAL REVIEW E 82, 051916 (2010). 1539-3755/2010/82(5)/051916(8). ©2010 The American Physical Society

II. SIMPLE MODEL

We suppose here that during the induction, bacteria do not grow, the plasmids and chromosomes do not replicate, the protein production does not depend on time [12], and the age distribution of bacteria is uniform.

We note $P_a^i$ as the contribution of the copy $i$ of the gene $a$ ($a = O$ or $G$ for the genes mOrange or egfp) to the total number of proteins $P_a$ at the end of induction in one cell and $n_a$ as the number of copies of the gene $a$ in that cell (see Fig. 1). One can write

$$P_a = \sum_{i=1}^{n_a} P_a^i .$$

The average over the population of $P_a$ can thus be written as

$$\langle P_a \rangle = \sum_{n_a} \sum_{i=1}^{n_a} \sum_{P_a^i} p(n_a, P_a^i)\, P_a^i ,$$

where $p(n_a, P_a^i)$ is the joint probability of $n_a$ and $P_a^i$. We can suppose that the distribution of the number of proteins produced by each copy does not depend on the particular copy considered nor on the number of copies (we measured the same distributions of green fluorescence, i.e., of expression from the chromosome, for strains bearing both high and low copy number plasmids [13]). Thus,

$$\langle P_a \rangle = \sum_{n_a} p(n_a)\, n_a \sum_{P_a^1} p(P_a^1)\, P_a^1 = \langle n_a \rangle \langle P_a^1 \rangle .$$

Moreover, we can suppose that on average the number of proteins produced by a copy of a gene does not depend on the gene (both genes are under the same promoter). Hence, as expected,

$$\frac{\langle n_O \rangle}{\langle n_G \rangle} = \frac{\langle P_O \rangle}{\langle P_G \rangle} . \tag{1}$$

The moments of order 2 can similarly be written as

$$\langle P_a P_b \rangle = \sum_{n_a, n_b} \sum_{i=1}^{n_a} \sum_{j=1}^{n_b} \sum_{P_a^i, P_b^j} p(n_a, n_b, P_a^i, P_b^j)\, P_a^i P_b^j ,$$

where $P_a$ and $P_b$ are evaluated in the same cell.

In the case of different genes, we can suppose that the correlation does not depend on the particular copies considered or on their numbers. Thus,

$$\langle P_O P_G \rangle = \sum_{n_O, n_G} p(n_O, n_G)\, n_O n_G \sum_{P_O^1, P_G^1} p(P_O^1, P_G^1)\, P_O^1 P_G^1 = \langle n_O n_G \rangle \langle P_O^1 P_G^1 \rangle .$$

In the case of the same gene, we can suppose that two different copies correlate like two copies of different genes,

$$\langle P_a^i P_a^j \rangle = \langle P_O^1 P_G^1 \rangle , \quad \forall\, i \neq j ,$$

and that the autocorrelation of one copy does not depend on the particular copy or gene considered,

$$\langle (P_a^i)^2 \rangle = \langle (P^1)^2 \rangle , \quad \forall\, a, i .$$

Then,

$$\langle P_a^2 \rangle = \langle n_a \rangle \langle (P^1)^2 \rangle + \langle n_a (n_a - 1) \rangle \langle P_O^1 P_G^1 \rangle .$$

Combining those two last expressions with Eq. (1), we obtain

$$\langle n_O^2 \rangle = \frac{\langle P_O \rangle}{\langle P_G \rangle} \langle n_G^2 \rangle + \frac{1}{\langle P_O P_G \rangle} \left( \langle P_O^2 \rangle - \frac{\langle P_O \rangle}{\langle P_G \rangle} \langle P_G^2 \rangle \right) \langle n_O n_G \rangle .$$

Since the replication of the chromosome is well controlled [2,14], we can suppose that the variance of the chromosome copy number vanishes ($\langle n_G^2 \rangle \simeq \langle n_G \rangle^2$) and that the plasmid and chromosome copy numbers are uncorrelated ($\langle n_O n_G \rangle \simeq \langle n_O \rangle \langle n_G \rangle$). Let $\eta$ be the PCN noise, defined by $\eta^2 = (\langle n_O^2 \rangle - \langle n_O \rangle^2) / \langle n_O \rangle^2$.
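The algebra behind the expression for $\langle n_O^2 \rangle$ is compressed in the text. The block below is a reconstruction of the intermediate steps from the assumptions stated above, not the authors' own presentation; the shorthands $A$, $C$, and $r$ are introduced here only for readability.

```latex
% Shorthands (assumed notation, not in the original):
%   A = <(P^1)^2>,  C = <P_O^1 P_G^1>,  r = <P_O>/<P_G> = <n_O>/<n_G>  [Eq. (1)]
\begin{align*}
\langle P_O^2\rangle - r\,\langle P_G^2\rangle
  &= A\,\bigl(\langle n_O\rangle - r\,\langle n_G\rangle\bigr)
   + C\,\bigl[\langle n_O(n_O-1)\rangle - r\,\langle n_G(n_G-1)\rangle\bigr] \\
  &= C\,\bigl(\langle n_O^2\rangle - r\,\langle n_G^2\rangle\bigr)
   \qquad\text{since } \langle n_O\rangle = r\,\langle n_G\rangle , \\
C &= \frac{\langle P_O P_G\rangle}{\langle n_O n_G\rangle}
   \quad\Longrightarrow\quad
\langle n_O^2\rangle = r\,\langle n_G^2\rangle
   + \frac{\langle n_O n_G\rangle}{\langle P_O P_G\rangle}
     \bigl(\langle P_O^2\rangle - r\,\langle P_G^2\rangle\bigr) , \\
\frac{\langle n_O^2\rangle}{\langle n_O\rangle^2}
  &\simeq \frac{1}{r}
   + \frac{1}{r\,\langle P_O P_G\rangle}
     \bigl(\langle P_O^2\rangle - r\,\langle P_G^2\rangle\bigr)
   \qquad\text{using } \langle n_G^2\rangle \simeq \langle n_G\rangle^2,\;
   \langle n_O n_G\rangle \simeq \langle n_O\rangle\langle n_G\rangle .
\end{align*}
```

Subtracting 1 from the last line and writing $1/r = \langle P_G \rangle / \langle P_O \rangle$ gives the noise in terms of measured moments only.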
Then,

$$\eta^2 = \frac{\langle P_G \rangle}{\langle P_O \rangle} + \frac{1}{\langle P_O P_G \rangle} \left( \frac{\langle P_G \rangle}{\langle P_O \rangle} \langle P_O^2 \rangle - \langle P_G^2 \rangle \right) - 1 , \tag{2}$$

which, as it turns out, does not depend on the chromosome copy number or any other external inputs but solely on quantities directly measured in this experiment.

III. COMPLETE MODEL

We now also want to take into account sources of fluorescence fluctuation other than gene expression.

We assume that all cells have exactly the same division time $T$. (Two studies report a small variability of division times, with a standard deviation of the growth time constant of 10% of the average [15,16].) We note $t_0$ the age of a cell at the beginning of induction. Under this hypothesis, the distribution of ages $t_0$ is exponential [17]: $p(t_0) = (2 \ln 2 / T)\, 2^{-t_0/T}$. We will also consider that the induction time (1 h) is a multiple of the division time. This is true at 30 and 37 °C, where we measured cell cycles of 1 h and 30 min, respectively, but not for intermediate temperatures (this is discussed in Sec. IV). We will present calculations with cells dividing twice during the induction, i.e., a cell cycle of 30 min; more or fewer divisions only change the numerical prefactors [18].

At each cell division, fluorescent proteins are randomly inherited by one of the two daughter cells, thus adding to the fluorescence fluctuations. As discussed in Appendix A, this contribution turns out to be small: to a good precision, half of the fluorescent proteins are inherited by each daughter cell.

Following one lineage during the induction, we can now express the number of fluorescent proteins at the end of induction in a given cell,

$$P_a = \left[ \frac{1}{4} \int_{t_0}^{T} + \frac{1}{2} \int_{T}^{2T} + \int_{2T}^{2T+t_0} \right] \sum_{i=1}^{n_a(t)} \alpha_a(i,t)\, dt ,$$

where we took the age of the cell at the beginning of induction $t_0$ as the initial time and introduced $\alpha_a(i,t)$, the rate of protein production at time $t$ from the copy $i$ of the gene $a$ [19].

FIG. 1. (Color online) Cartoon of the lineage of a bacterium during protein production induction, here depicted with one division (only one of the two final cells is shown). Fluorescence intensities of single cells are measured at the end of induction. The orange (green) intensities are proportional to the number of orange proteins $P_O$ (green proteins $P_G$) in the observed cell, shown as orange [dark gray] (green [light gray]) dots. These proteins were produced during all the induction by a varying number of mOrange or egfp copies ($n_O$ and $n_G$) and randomly distributed among daughter cells at each division.
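As an illustration of how Eq. (2) of the simple model turns single-cell measurements into a noise estimate, the following is a minimal sketch in Python with NumPy. It is not the authors' analysis code. The function names, the synthetic-data model (a shared per-cell factor multiplying gamma-distributed per-copy outputs, a fixed chromosome copy number, a shifted-Poisson plasmid copy number), and all parameter values are assumptions chosen only so that the hypotheses of Sec. II hold. The inputs are assumed to be protein numbers, or fluorescence intensities already calibrated to a common unit.

```python
import numpy as np


def pcn_noise_squared(p_orange, p_green):
    """Squared PCN noise from single-cell protein numbers, following Eq. (2)."""
    p_orange = np.asarray(p_orange, dtype=float)
    p_green = np.asarray(p_green, dtype=float)
    ratio = p_green.mean() / p_orange.mean()        # <P_G> / <P_O>
    cross = (p_orange * p_green).mean()             # <P_O P_G>
    bracket = ratio * (p_orange ** 2).mean() - (p_green ** 2).mean()
    return ratio + bracket / cross - 1.0


def simulate_cells(n_cells, rng, lam=9.0, n_green=2, shape=4.0, scale=25.0):
    """Synthetic cells satisfying the simple-model assumptions (hypothetical model).

    - plasmid copy number: 1 + Poisson(lam), independent of the chromosome
    - chromosome copy number: fixed at n_green (zero variance)
    - each gene copy produces extrinsic * Gamma(shape, scale) proteins, so all
      pairs of distinct copies share the same correlation
    """
    n_orange = 1 + rng.poisson(lam, size=n_cells)
    extrinsic = rng.gamma(25.0, 1.0 / 25.0, size=n_cells)  # mean 1 per-cell factor
    # A sum of n independent Gamma(shape, scale) variables is Gamma(n * shape, scale).
    p_orange = extrinsic * rng.gamma(shape * n_orange, scale)
    p_green = extrinsic * rng.gamma(shape * n_green, scale, size=n_cells)
    return n_orange, p_orange, p_green


rng = np.random.default_rng(0)
n_orange, p_orange, p_green = simulate_cells(200_000, rng)

true_noise_sq = n_orange.var() / n_orange.mean() ** 2
estimated_noise_sq = pcn_noise_squared(p_orange, p_green)
print(f"true eta^2      = {true_noise_sq:.4f}")
print(f"estimated eta^2 = {estimated_noise_sq:.4f}")
```

In this synthetic setting the estimate should agree with the copy-number noise computed directly from the simulated `n_orange`, up to sampling error. The sketch covers only the simple model; the corrections for growth, division, and age distribution introduced in Sec. III are not included.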
