
Dependence on temperature and GC content of bubble length distributions in DNA

by G Kalosakas, S Ares
Dna Sequenc (2009)

Abstract

We present numerical results on the temperature dependence of the distribution of bubble lengths in DNA segments of various guanine-cytosine (GC) concentrations. Base-pair openings are described by the Peyrard-Bishop-Dauxois model and the corresponding thermal equilibrium distributions of bubbles are obtained through Monte Carlo calculations for bubble sizes up to the order of a hundred base pairs. The dependence of the parameters of bubble length distribution on temperature and the GC content is investigated. We provide simple expressions which approximately describe these relations. The variation of the average bubble length is also presented. We find a temperature dependence of the exponent c that appears in the distribution of bubble lengths. If an analogous dependence exists in the loop entropy exponent of real DNA, it may be relevant to understand overstretching in force-extension experiments.


Available from arxiv.org

arXiv:0906.4683v1 [q-bio.BM] 25 Jun 2009
G. Kalosakas∗1 and S. Ares∗2
1Department of Materials Science, University of Patras, GR-26504 Rio, Greece
2Max Planck Institute for the Physics of Complex Systems,
Nöthnitzer Str. 38, D-01187 Dresden, Germany,
and Grupo Interdisciplinar de Sistemas Complejos (GISC)
We report numerical results on the temperature dependence of the distribution of bubble lengths
in DNA segments of various GC concentrations. Base-pair openings are described by the Peyrard-
Bishop-Dauxois model and the corresponding thermal equilibrium distributions of bubbles are ob-
tained through Monte Carlo calculations for bubble sizes up to the order of a hundred base-pairs.
The dependence of the parameters of bubble length distribution on temperature and the GC content
is investigated. We provide simple expressions which approximately describe these relations. The
variation of the average bubble length is also presented. Finally, we find a temperature dependence
of the exponent c in the probability distribution of bubble lengths. If an analogous dependence
exists in the loop entropy exponent of real DNA, it may be relevant to understand overstretching
in force-extension experiments.
I. INTRODUCTION
Local openings of the DNA double helix are required
in several biological functions, for instance transcription
and replication. These local separations of the two DNA
strands are mediated by a specific machinery in the cell.
In order to deal with such complex processes, it is necessary
first to understand the interactions that hold together
the two complementary strands within a single DNA duplex,
as well as the properties of fluctuating DNA openings
in thermal equilibrium. This was realized long ago, and
base-pairing interactions and the thermal stability of the
double helix have conventionally been probed by increasing
the temperature up to the DNA denaturation transition [1, 2].
Base-pair openings (bubbles) occur in DNA due to
thermal fluctuations even at temperatures well below the
melting transition. It has been speculated that they may
play a role in the recognition of specific DNA sites by
DNA-binding proteins [3, 4, 5, 6]. Recent experiments
using a novel hairpin quenching technique have been able
to show this bubble formation in the pre-melting regime
and characterize it quantitatively [7, 8, 9]. By increas-
ing temperature these bubbles grow and more bubbles
are nucleated, thus leading to the complete separation of
the two strands at the denaturation transition. There-
fore statistical properties of DNA bubbles in a wide tem-
perature regime, extending from biological temperatures
up to the melting transition, are of particular interest.
The purpose of this work is the investigation of the
distribution of bubble lengths and its temperature dependence
for DNA sequences containing different percentages
of guanine-cytosine (GC) base-pairs. The study
is performed in the framework of the Peyrard-Bishop-Dauxois
(PBD) model [10], from which conjectures are
drawn about the actual distributions of bubbles in DNA.
If our results on the temperature dependence of this
distribution, obtained for finite bubble lengths, remain
qualitatively valid in the asymptotic regime (for very
large values of bubble lengths), then there would be a
connection with the ongoing discussion about the
interpretation of the DNA overstretching observed in
force-extension experiments. In this case the possibilities of
either a force-induced melting or the existence of a
double-stranded elongated DNA phase (the so-called S-DNA)
have been proposed to explain the abrupt elongation of
DNA at forces ∼ 65 pN [11]. This is briefly discussed in
the concluding section.
∗Both authors contributed equally to this work.
In a recent study we have presented the distribution of
bubble lengths in the PBD model of DNA at 310K and
we found that it can be described by a power-law mod-
ified exponential [12]. Anharmonic interactions between
complementary bases forming base-pairs are responsible
for the observed non-exponential distribution. The same
form of distribution has been derived in the framework
of the Poland-Scheraga model [13, 14, 15], which represents
a completely different theoretical approach to DNA
denaturation from the PBD model that we use in our
calculations. This distribution is also found for a primitive
version of the PBD model, viz. the Peyrard-Bishop
model [16] with linear stacking interactions, but in this
case the characteristic values of the parameters of the
distribution are different [17].
Here we examine how the bubble length distribution
varies with temperature and present its complete dependence
on both temperature and the GC fraction of the
DNA segment. The PBD model [10] is used for the description
of base-pair openings, where a set of continuous
variables $y_n$ represents the base-pair displacements from
the equilibrium distance and the index $n$ labels the base-pairs
along the DNA chain. The potential energy of the system
consists of two parts: the on-site interaction $V(y_n)$ within
each base-pair and the stacking interaction $U(y_n, y_{n+1})$
between adjacent base-pairs. A Morse potential is used
for the on-site energy,
$$V(y_n) = D_n \left(e^{-a_n y_n} - 1\right)^2, \tag{1}$$
where the parameters $D_n$ and $a_n$ distinguish between GC
and AT base-pairs ($D_{GC} = 0.075$ eV, $a_{GC} = 6.9$ Å$^{-1}$ for
a GC base-pair and $D_{AT} = 0.05$ eV, $a_{AT} = 4.2$ Å$^{-1}$ for
an AT pair), while a nonlinear potential describes the
stacking interaction,
$$U(y_n, y_{n+1}) = \frac{K}{2} \left(1 + \rho\, e^{-b(y_n + y_{n+1})}\right) (y_n - y_{n+1})^2, \tag{2}$$
with $K = 0.025$ eV/Å$^2$, $\rho = 2$, and $b = 0.35$ Å$^{-1}$. We
use parameter values from previous works [4, 5, 18, 19].
These parameters have been originally obtained from em-
pirical fits to experimental data [18] and are also able
to successfully describe other experimental situations
[4, 19]. Entropic effects of DNA are described from the
PBD Hamiltonian when studying its thermodynamics,
leading to an entropy driven melting transition [10, 20].
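As a minimal illustration of Eqs. (1) and (2), the two potential terms can be written as plain functions using the parameter values quoted above. The function and variable names are hypothetical and chosen only for this sketch; energies are in eV and displacements in Å.

```python
import math

# PBD parameters quoted in the text (energies in eV, lengths in angstrom).
MORSE = {"GC": (0.075, 6.9), "AT": (0.05, 4.2)}  # (D_n, a_n) per base-pair type
K, RHO, B = 0.025, 2.0, 0.35                     # stacking parameters of Eq. (2)


def morse(y, pair):
    """On-site Morse potential of Eq. (1) for a base pair of type 'GC' or 'AT'."""
    d, a = MORSE[pair]
    return d * (math.exp(-a * y) - 1.0) ** 2


def stacking(y1, y2):
    """Nonlinear stacking potential of Eq. (2) between adjacent base pairs."""
    return 0.5 * K * (1.0 + RHO * math.exp(-B * (y1 + y2))) * (y1 - y2) ** 2
```

The Morse term saturates at $D_n$ for large positive displacements, which is the upper bound of the on-site potential, while the stacking term weakens from $K(1+\rho)/2$ to $K/2$ as the neighboring base pairs open.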
The efficiency of the rather simple PBD model in describing
base-pair openings in DNA has led to its extensive
use in the literature [4, 5, 6, 12, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32]. In particular, our choice of
the PBD model for the study of bubble length distributions
is motivated by the success of the model [19] in
reproducing experimental measurements of bubble formation
[7, 8], as well as the accurate quantitative description
of melting curves in short DNA segments [18].
The coarse-grained description of the model makes it
possible to perform calculations with long sequences
of up to tens of thousands of base pairs [30]. Moreover,
the popular Nearest Neighbor model [33], which is very
successful in describing the melting of short oligomers,
does not reproduce well experimental data on intermedi-
ate states for longer sequences [8, 9, 34, 35]. Compared to
the more conventional modified Ising type models used in
the study of DNA melting [2, 13], the PBD model is qual-
itatively similar: the on-site potential of the PBD model
is an extension of the magnetic field in the Ising type
models, while the nonlinear stacking potential in Eq. (2)
corresponds to the interaction between neighboring spins
and loop entropies in Ising models. We expect our results
for the PBD model to hold qualitatively also for the Ising
type models, since both models yield the same shape for
the distribution of bubble sizes [12].
II. RESULTS
Considering a random DNA sequence of a given GC
percentage, xGC, at equilibrium at temperature T, we
calculate the distribution per base-pair of bubble lengths
l, P(l), by counting during Monte Carlo simulations
of the PBD model the average occurrences of openings
(base-pair displacements) larger than a fixed threshold
ythres at l successive base-pairs. In each simulation a random
sequence of 1000 base pairs is used in which the AT
or GC identity of each base pair is generated randomly
under the constraint of a specified GC percentage. The
properties of such long random sequences are not different
from those of natural sequences, as we showed in previous
work [12] by comparing the results for a segment from the
Escherichia coli gal promoter with those from a randomly
generated sequence with the same GC percentage.
The Monte Carlo simulation is performed using the
Metropolis algorithm [36], which is used first for ther-
malization and then for measurement runs. Results are
averaged over several realizations using the same random
sequence and different initial conditions for the inter-base
distance within each base-pair and for the random num-
ber generator. Other details of the simulation (number
of realizations and Monte Carlo steps) are as in Ref. [12].
Since the sequences used are so long, the results are
independent of the precise realization of the random
sequence. We have run further simulations with different
random sequences to confirm this point. There is an
issue about the dissociation observed in the PBD model in
Monte Carlo simulations. In particular, due to the upper
bound of the on-site potential for large positive displace-
ments, for long enough simulations a complete dissoci-
ation would eventually occur at any temperature, even
below the melting transition [20, 37]. However, because
of the large DNA segments considered here, the probabil-
ity of a complete dissociation during the simulation time
is so low that no such events were observed at any of the
studied temperatures below the melting temperature.
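The simulation procedure described above can be sketched as a single-site Metropolis update on a ring of base pairs. This is only an illustrative sketch, not the authors' code: the trial-move size `step`, the initial condition, and all function names are assumptions introduced here, and the numbers of thermalization and measurement sweeps (given in Ref. [12]) are not reproduced.

```python
import math
import random

KB = 8.617333e-5  # Boltzmann constant in eV/K
MORSE = {"GC": (0.075, 6.9), "AT": (0.05, 4.2)}  # (D in eV, a in 1/angstrom)
K, RHO, B = 0.025, 2.0, 0.35                     # stacking parameters of Eq. (2)


def random_sequence(n, gc_fraction, rng):
    """Random AT/GC sequence constrained to a specified GC fraction."""
    n_gc = round(n * gc_fraction)
    seq = ["GC"] * n_gc + ["AT"] * (n - n_gc)
    rng.shuffle(seq)
    return seq


def local_energy(y, seq, n, value):
    """Energy terms involving site n when its displacement equals `value`."""
    size = len(y)
    d, a = MORSE[seq[n]]
    energy = d * (math.exp(-a * value) - 1.0) ** 2
    # Periodic boundary conditions: the chain is treated as a ring.
    for nb in (y[(n - 1) % size], y[(n + 1) % size]):
        energy += 0.5 * K * (1.0 + RHO * math.exp(-B * (value + nb))) * (value - nb) ** 2
    return energy


def metropolis_sweep(y, seq, temperature, rng, step=0.5):
    """One sweep of single-site trial moves; returns the acceptance rate.

    `step` (angstrom) is an assumed trial-move amplitude, not a value from the paper.
    """
    beta = 1.0 / (KB * temperature)
    accepted = 0
    for n in range(len(y)):
        trial = y[n] + rng.uniform(-step, step)
        delta = local_energy(y, seq, n, trial) - local_energy(y, seq, n, y[n])
        if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
            y[n] = trial
            accepted += 1
    return accepted / len(y)
```

A run would first call `metropolis_sweep` repeatedly for thermalization and then continue calling it while recording configurations `y` for the bubble statistics.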
The threshold value considered for the openings is
ythres = 1.5 Å. Results for different values of this threshold
are qualitatively similar. We use periodic boundary
conditions in DNA segments of length 1000 base-pairs
(sufficiently larger than the studied bubble sizes)
and therefore our results refer to internal bubbles in long
DNA chains. The results are independent of the length
of the studied sequence, provided that it is longer than
the longest observed bubbles [12]. This has been verified
by simulations on DNA segments of different length, re-
sulting in the same distributions P (l) for the bubble sizes
studied here, i.e. with l up to about a hundred base-pairs.
For short molecules, the actual sequence does play a role,
since it can trigger local openings in different regions at
the end or in the middle of the sequence [7, 8]. In this
case, the associated border effects cannot be characterized
by the GC percentage of the short sequence alone,
but this problem is beyond the scope of the present work.
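The counting of bubbles with a threshold and periodic boundary conditions can be sketched as follows. A bubble is taken here as a maximal run of consecutive base pairs with displacement above `threshold`; the normalization by the total number of sampled base pairs is an assumption about what "distribution per base-pair" means, and the function names are hypothetical.

```python
from collections import Counter


def bubble_lengths(y, threshold=1.5):
    """Lengths of maximal runs of open base pairs (y > threshold) on a ring."""
    open_sites = [v > threshold for v in y]
    size = len(open_sites)
    if all(open_sites):
        return [size]  # fully open ring: one bubble spanning the whole chain
    # Start right after a closed site so that no bubble is split by the wrap-around.
    start = open_sites.index(False) + 1
    lengths, run = [], 0
    for k in range(size):
        if open_sites[(start + k) % size]:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    return lengths


def bubble_distribution(configs, threshold=1.5):
    """Average number of bubbles of each length per base pair over sampled configurations."""
    counts = Counter()
    sites = 0
    for y in configs:
        counts.update(bubble_lengths(y, threshold))
        sites += len(y)
    return {length: n / sites for length, n in sorted(counts.items())}
```

For example, the ring `[2, 2, 0, 0, 2, 0, 2]` contains one bubble of length 1 and one of length 3, the latter wrapping around the end of the list.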
In Figure 1 we show bubble length distributions at various
temperatures for a GC percentage of xGC = 50% and
87.5%. Similar plots have been obtained for nine different
values of xGC: 0%, 12.5%, 25%, 37.5%, ..., 100%. Note
here that although the PBD model takes into account
cooperativity in bubble formation through the nonlinear
stacking interaction, it does not contain any parameter
describing a bubble nucleation size. Thus the distributions
in Figure 1 show results even for bubbles of length
l = 1, implying that the nucleation size is 1. This should
not be confused with the minimum length of a short DNA
sequence necessary to sustain bubble states, which has
been shown experimentally [7, 8] and theoretically within
the PBD model [19] to be greater than one.
[Figure 1: log-log plots of P(l) (10^-7 to 10^-1) versus bubble length l (1 to 100). Panel (a): 50% GC, T = 270 K to 340 K in steps of 10 K. Panel (b): 87.5% GC, T = 270 K to 350 K in steps of 10 K.]
FIG. 1: (Color online) Distribution per base-pair of bubble
lengths l (in number of base-pairs), P(l), for different
temperatures (points, as indicated in the plots) at random
DNA sequences with a GC content of (a) 50% and (b) 87.5%.
Continuous lines are fits with the distribution of Eq. (3).
ythres = 1.5 Å.
Except for the cases of l = 1 at relatively higher temperatures
(closer to the melting transition), the numerical
results obtained for the distributions can be rather
well described by the power-law modified exponential [12]
$$P(l) = W \frac{e^{-l/\xi}}{l^{c}}, \quad \text{for } l > 1. \tag{3}$$
In the following we characterize the dependence of the
parameters of the distribution on T and xGC, and provide
approximate expressions for the relations ξ(T, xGC),
c(T, xGC), and W(T, xGC). The distribution parameters
are obtained through fitting of plots like those of Figure
1 with Eq. (3), using a weight proportional to $1/P(l)^2$.
We note here that Eq. (3) can be derived from poly-
mer physics [13, 14, 15] as an asymptotic expression for
very large bubble sizes, l ≫ 1, in DNA. However, we
find that the same expression describes well the bubble
length distribution in the PBD model of DNA, even for
small bubble sizes l. We emphasize that we are interested
here in bubble lengths up to about a hundred, or a few
hundred at most, base-pairs (which are relevant for any
practical purpose) and not for the asymptotic behavior
of the distribution. In this context we use Eq. (3) as
an empirical formula valid for finite bubble lengths l (in
the non-asymptotic regime), and describe the variation
of its parameters with T and xGC. If one is interested in
the asymptotic behavior of the distribution, this can be
obtained by proper scaling analysis and, as has been
shown in Ref. [29], it may be described by different values
of the parameter c. Therefore we are not concerned with
the order of the melting transition, and the exponent c
presented here is not indicative of the kind of
transition [14, 15, 22, 29, 35], as it also depends on the
somewhat arbitrary value of the threshold chosen to consider
a base-pair open.
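To illustrate how the three parameters of Eq. (3) can be extracted from a measured distribution, the sketch below swaps in a simpler technique than the one used in the paper: instead of a direct fit with weights proportional to $1/P(l)^2$, it performs an ordinary linear least-squares fit of $\ln P = \ln W - l/\xi - c \ln l$. Both approaches penalize relative rather than absolute deviations, so they should agree when the residuals are small, but they are not identical. The function name is hypothetical.

```python
import math


def fit_bubble_distribution(lengths, probs):
    """Log-space least-squares fit of Eq. (3); returns (W, xi, c).

    Solves ln P = ln W - l / xi - c * ln l for the unknowns (ln W, 1/xi, c).
    """
    rows = [(1.0, -float(l), -math.log(l)) for l in lengths]
    rhs = [math.log(p) for p in probs]
    # Augmented normal equations (A^T A | A^T b) as a 3x4 matrix.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * b for r, b in zip(rows, rhs))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    ln_w, inv_xi, c = (m[i][3] / m[i][i] for i in range(3))
    return math.exp(ln_w), 1.0 / inv_xi, c
```

Since Eq. (3) is stated for l > 1, the input would typically contain only bubble lengths l ≥ 2 with nonzero P(l).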
Figure 2 presents the dependence of the decay length
ξ of Eq. (3). In 2a the variation of ξ with temperature
is shown (points) for different values of xGC. The
T-dependence is accurately described by the divergent
function
$$\xi(T) = \xi_0 + \frac{\xi_1}{T_c - T}, \tag{4}$$
where Tc is the denaturation temperature. Such a re-
lation is also valid for the Poland-Scheraga model [15].
Lines in Figure 2a show fittings of the ξ(T) data with
Eq. (4), using a weight proportional to $1/\xi^2$. The critical
temperature Tc as obtained from the fitting at different
values of xGC is presented with circles in Figure 2b.
A linear dependence of Tc on the GC content is found,
in accordance with known experimental results [38] and
calculations from simplified models [39]. A least square
fitting of the Tc(xGC) data results in the continuous line
shown in Figure 2b. Regarding the homopolymer cases of
poly(dA)-poly(dT) (xGC = 0%) and poly(dG)-poly(dC)
(xGC = 100%), the critical temperatures for the tran-
sition can be independently calculated through the nu-
merically exact transfer integral technique [40, 41, 42],
and the corresponding results are Tc(xGC = 0) = 325.2K
and Tc(xGC = 100) = 366.0K. These values are shown
with squares joined by a dashed line in Figure 2b, where
the line lies inside the error interval of the Monte Carlo
results. The actual denaturation temperatures obtained
through the Monte Carlo simulations are in agreement
with those presented in Fig. 2b. Therefore, the critical
temperatures Tc shown in Fig. 2b provide a guide of how
far from the melting transition are the data at various
temperatures presented in this work.
The other parameters ξ1 and ξ0 resulting from the fitting
of the ξ(T) data with Eq. (4) are shown with circles
in Figures 2c and 2d, respectively. Their dependence on
xGC can be approximately considered as linear (continuous
lines in Figs. 2c and 2d). Therefore, the relation
$$\xi(T, x_{GC}) = a_1 + a_2 x_{GC} + \frac{a_3 + a_4 x_{GC}}{a_5 + a_6 x_{GC} - T}, \tag{5}$$
where a1, a2, ..., a6 are constants, can approximately
provide the dependence of the decay length ξ on T and
xGC.
[Figure 2: (a) ξ (logarithmic axis, 1 to 100) versus temperature (K) for xGC = 0%, 12.5%, 25%, ..., 100%. (b) Tc (K) versus xGC (GC percentage %), with fit 324.4(0.6) + 0.407(0.009)·xGC. (c) ξ1 versus xGC, with fit 226(6) − 0.28(0.10)·xGC. (d) ξ0 versus xGC, with fit −1.09(0.09) + 0.0035(0.0016)·xGC.]
FIG. 2: (Color online) (a) Dependence of the decay length ξ
of the distribution (3) on the temperature, for different values
of the GC content of the DNA sequence (points). Lines
show fits with the function of Eq. (4). (b) Dependence of
the critical temperature Tc, as obtained from the fitting of
the ξ(T) data with Eq. (4), on the GC content of the DNA
sequence (circles). Solid line represents a least square fit
according to a linear dependence. Squares show exact results
of the critical temperatures for the homogeneous cases of 0%
GC and 100% GC, obtained from transfer integral calculations,
while the dashed line connects these two points. (c)
and (d) Dependence of the parameters ξ1 and ξ0, respectively,
of the fit of the ξ(T) data with Eq. (4), on the GC content
of the sequence (circles). Solid lines represent linear fits of
the corresponding data. Equations of straight lines resulting
from the corresponding fittings are shown in (b), (c), and (d),
where the values in parentheses represent errors of the fitting
parameters.
[Figure 3: (a) exponent c (1.6 to 2.1) versus temperature (K) for xGC = 0%, 25%, 50%, 75%, 100%. (b) c1 (K^-1) versus xGC (GC percentage %), with fit 0.0064(0.0001) − 2.9(0.2)·10^-5·xGC. (c) c0 versus xGC, with fit −0.04(0.03) + 0.0066(0.0005)·xGC.]
FIG. 3: (Color online) (a) Dependence of the exponent c of the
distribution (3) on the temperature, for different values of the
GC content of the DNA sequence (points). Lines show linear
fits with equation (6). (b) and (c) Dependence of the parameters
c1 and c0, respectively, of the fit of the c(T) data with Eq.
(6), on the GC content of the sequence (filled points). Error
bars are standard errors resulting from the fitting procedure.
Solid lines represent linear fits of the corresponding data and
the resulting equations are also shown.
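As a worked illustration of Eq. (5), the sketch below evaluates the decay length using the central values of the linear fits printed in Figure 2 (for ythres = 1.5 Å), with xGC expressed in percent and T in kelvin. The identification of a1 to a6 with these fit coefficients follows the structure of Eqs. (4) and (5); the fit uncertainties are ignored and the function names are hypothetical.

```python
# Central values of the linear fits printed in Figure 2 (x_GC in percent, T in K).
A1, A2 = -1.09, 0.0035   # xi_0 = A1 + A2 * x_GC   (Fig. 2d)
A3, A4 = 226.0, -0.28    # xi_1 = A3 + A4 * x_GC   (Fig. 2c)
A5, A6 = 324.4, 0.407    # T_c  = A5 + A6 * x_GC   (Fig. 2b)


def critical_temperature(x_gc):
    """Fitted denaturation temperature T_c (K) for a GC percentage x_gc."""
    return A5 + A6 * x_gc


def decay_length(temperature, x_gc):
    """Decay length xi(T, x_GC) of Eq. (5), defined only below T_c."""
    t_c = critical_temperature(x_gc)
    if temperature >= t_c:
        raise ValueError("Eq. (5) diverges at T_c and is not meaningful above it")
    return A1 + A2 * x_gc + (A3 + A4 * x_gc) / (t_c - temperature)
```

With these rounded constants, a 50% GC sequence at 310 K gives a decay length of roughly five base pairs.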
The variation of the exponent c of the distribution
(3) is presented in Figure 3. Points in 3a show the
temperature dependence of c for different GC percentages
(for clarity of the plot the corresponding results for
xGC = 12.5%, 37.5%, 62.5%, 87.5% have been omitted).
This dependence may approximately be described by a
linear function
$$c(T) = c_0 + c_1 T. \tag{6}$$
Lines in Figure 3a show fittings of the numerical results
with the above formula. The parameters c1 and c0 obtained
from the fitting at different values of GC content
are shown with points in Figures 3b and 3c, respectively,
while the corresponding error bars are derived from the
fitting procedure. The latter plots indicate a linear dependence
of c1 and c0 on xGC, implying the approximate
relation
$$c(T, x_{GC}) = b_1 + b_2 x_{GC} + (b_3 + b_4 x_{GC})\, T, \tag{7}$$
where b1, b2, b3, and b4 are constants independent of T
and xGC.
[Figure 4: (a) W (about 0.01 to 0.02) versus temperature (K) for xGC = 0%, 12.5%, 25%, ..., 100%. (b) W2 (K^-2) versus xGC (GC percentage %), with fit 2.7(0.1)·10^-6 − 1.6(0.2)·10^-8·xGC. (c) W1 (K^-1) versus xGC, with fit −0.00136(0.00006) + 7.9(1.0)·10^-6·xGC. (d) W0 versus xGC, with fit 0.176(0.008) − 0.0010(0.0001)·xGC.]
FIG. 4: (Color online) (a) Dependence of the pre-exponential
coefficient W of the distribution (3) on the temperature,
for different values of the GC content of the DNA sequence
(points). Lines show quadratic fits with equation (8). (b),
(c), and (d) Dependence of the parameters W2, W1, and W0,
respectively, of the fit of the W(T) data with Eq. (8), on the
GC content of the sequence (circles). Solid lines represent linear
fits of the corresponding data and the resulting equations
are also shown.
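Equation (7) can be evaluated in the same spirit, taking b1 to b4 as the central values of the linear fits printed in Figures 3b and 3c (ythres = 1.5 Å, xGC in percent, T in kelvin). This is a sketch that ignores the quoted fit uncertainties; the function name is hypothetical.

```python
# Central values of the linear fits printed in Figure 3 (x_GC in percent, T in K).
B1, B2 = -0.04, 0.0066     # c_0 = B1 + B2 * x_GC   (Fig. 3c)
B3, B4 = 0.0064, -2.9e-5   # c_1 = B3 + B4 * x_GC   (Fig. 3b, in 1/K)


def exponent_c(temperature, x_gc):
    """Exponent c(T, x_GC) of Eq. (7), linear in T at fixed GC percentage."""
    return B1 + B2 * x_gc + (B3 + B4 * x_gc) * temperature
```

For a 50% GC sequence at 310 K this gives c ≈ 1.82, inside the range of values plotted in Figure 3a, and c increases with temperature for every GC percentage between 0 and 100 because the slope B3 + B4·xGC stays positive.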
Figure 4 shows the dependence of the coefficient
W of the distribution (3). Points in 4a display the variation
of W with T for various GC contents. A quadratic
function
$$W(T) = W_0 + W_1 T + W_2 T^2 \tag{8}$$
can describe rather well the temperature dependence of
W, at least in the studied temperature regime. The
corresponding fittings with Eq. (8) are shown by lines in
Figure 4a. The resulting fitting parameters at different
GC percentages are plotted by points in Figures 4b, 4c,
and 4d, respectively. These plots can approximately be
described by linear dependences of W2, W1, and W0 on
xGC (solid lines). As a result the expression
$$W(T, x_{GC}) = d_1 + d_2 x_{GC} + (d_3 + d_4 x_{GC})\, T + (d_5 + d_6 x_{GC})\, T^2 \tag{9}$$
can approximate the dependence of W on T and xGC,
where d1, d2, ..., d6 are constants.
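A corresponding sketch for Eq. (9) uses the central values of the linear fits printed in Figure 4 as d1 to d6 (ythres = 1.5 Å, xGC in percent, T in kelvin). The three terms of the quadratic nearly cancel, so the result is sensitive to the rounding of the printed coefficients and should be read as an order-of-magnitude illustration only; the function name is hypothetical.

```python
# Central values of the linear fits printed in Figure 4 (x_GC in percent, T in K).
D1, D2 = 0.176, -0.0010      # W_0 = D1 + D2 * x_GC   (Fig. 4d)
D3, D4 = -0.00136, 7.9e-6    # W_1 = D3 + D4 * x_GC   (Fig. 4c, in 1/K)
D5, D6 = 2.7e-6, -1.6e-8     # W_2 = D5 + D6 * x_GC   (Fig. 4b, in 1/K^2)


def coefficient_w(temperature, x_gc):
    """Pre-exponential coefficient W(T, x_GC) of Eq. (9)."""
    w0 = D1 + D2 * x_gc
    w1 = D3 + D4 * x_gc
    w2 = D5 + D6 * x_gc
    return w0 + w1 * temperature + w2 * temperature ** 2
```

For a 50% GC sequence at 310 K the three terms are about 0.126, −0.299, and 0.183, which sum to roughly 0.009.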
The detailed results of this investigation do not con-
firm a bilinear dependence of the parameters of the dis-
tribution on the GC content at a fixed temperature [12].
Instead, a linear dependence of the exponent c and the
coefficient W on xGC arises at constant T . We note that
the results presented in Figure 2 of Ref. [12] can be described
well by Eqs. (5), (7), and (9) for T = 310 K,
using the values of the constants provided in Figures
2b,c,d (for ai), 3b,c (for bi), and 4b,c,d (for di).
We finally present the variation of the average bubble
length, LB, with the two parameters of interest, viz.
temperature and GC percentage. LB is obtained through
the total number of base-pairs in bubble states divided
by the total number of bubbles [12]:
$$L_B = \frac{\sum_{l} l\, P(l)}{\sum_{l \ge 1} P(l)}. \tag{10}$$
Substituting P(l) by the power-law modified exponential
function of Eq. (3), the sums in Eq. (10) yield:
$$L_B(c, \xi) = \frac{\mathrm{Li}_{c-1}(e^{-1/\xi})}{\mathrm{Li}_{c}(e^{-1/\xi})}, \tag{11}$$
where $\mathrm{Li}_s(z)$ is the first branch of the polylogarithm
function [43], defined as $\mathrm{Li}_s(z) = \sum_{k=1}^{\infty} z^k / k^s$. The average
bubble length for a given temperature and GC percentage,
LB(T, xGC), can be obtained by substituting in Eq. (11)
the parameters ξ and c, calculated for the corresponding
T and xGC through Eqs. (5) and (7), respectively, using
the values of constants determined in Figures 2 and 3.
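Equation (11) can be evaluated numerically without a special-function library, because $z = e^{-1/\xi}$ lies strictly between 0 and 1 for positive ξ and the defining series of the polylogarithm then converges. The sketch below sums the series directly; the tolerance, the term limit, and the function names are assumptions of this illustration, and convergence becomes slow when ξ is large (z close to 1).

```python
import math


def polylog(s, z, tol=1e-13, max_terms=2_000_000):
    """Li_s(z) = sum_{k>=1} z**k / k**s by direct summation, for 0 < z < 1."""
    if not 0.0 < z < 1.0:
        raise ValueError("this series evaluation requires 0 < z < 1")
    total, power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        power *= z
        term = power / k ** s
        total += term
        # Stop once the current term is negligible compared with the partial sum.
        if term < tol * total:
            break
    return total


def average_bubble_length(c, xi):
    """Average bubble length L_B(c, xi) of Eq. (11)."""
    z = math.exp(-1.0 / xi)
    return polylog(c - 1.0, z) / polylog(c, z)
```

A simple consistency check is the case c = 0, where the distribution is purely exponential and the ratio reduces to the closed form $1/(1 - e^{-1/\xi})$.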
Figure 5a depicts the dependence of LB on xGC for
different temperatures, obtained directly from our numerical
simulations (points). Eq. (11), shown by dotted
lines in the plot, reproduces correctly the behavior of the
data. However, the errors in the estimated parameters
in Figures 2 and 3 add up to produce a small deviation
between the numerical points and the values obtained by
Eq. (11). We also show, using solid lines in the plot,
more accurate fits of the numerical data with a simpler
phenomenological expression, namely the exponentially
decaying function
$$L_B = \Gamma_0 + \Gamma_1 \exp(-\Gamma_2 x_{GC}). \tag{12}$$
These empirical fits generalize the apparent exponential
decay seen earlier at 310 K [12]. Presenting these data
in a different way, more appropriate for experimental
investigations, the temperature dependence of LB for fixed
GC contents is shown in Figure 5b (points), for temperatures
sufficiently below the melting transition. Here for
clarity of the plot we do not show again the prediction
of Eq. (11), but instead we show with lines fits with a
simple phenomenological function, namely a cubic polynomial,
which describes approximately the data in the
investigated temperature regime. The coefficients of the
cubic polynomials show an exponential dependence, of
the form (12), on the GC percentage.
[Figure 5: (a) LB (base pairs, 1 to 3) versus xGC (GC percentage %) for T = 270 K to 350 K in steps of 10 K. (b) LB (base pairs, 1 to 3) versus temperature (K) for xGC = 0%, 12.5%, 25%, ..., 100%.]
FIG. 5: (Color online) (a) Dependence of the average bubble
length LB, Eq. (10), on the GC content, for different
temperatures (points). Dotted lines show the analytical result
of Eq. (11), using the values of ξ and c from Eqs. (5)
and (7). Solid lines show empirical fits with exponential
functions, Eq. (12). (b) Dependence of the average bubble length
LB on temperature for different GC contents (points). Lines
represent empirical fits with cubic functions.
III. CONCLUSIONS
In conclusion, we have presented the dependence of
bubble length distributions in the PBD model of DNA
on temperature and the GC content. The investigated
temperature regime extends from biologically relevant
values up to values below the melting transition.
Approximate expressions have been obtained for the
parameters of the power-law modified exponential distribution
in the non-asymptotic regime (for bubble lengths up
to about a hundred base-pairs). The exponent c behaves
linearly both in temperature and GC content, Eq. (7),
while the coefficient W shows a quadratic dependence on
temperature and a linear dependence on the GC fraction,
Eq. (9). The decay length ξ is described by the
relatively simple equation (5). The constants ai, bi, and
di appearing in these expressions depend on the amplitude
ythres of the considered base-pair openings, and for
ythres = 1.5 Å are given by the values shown in Figures
2, 3, and 4. Using the expressions of the exponent c
and the decay length ξ, the average bubble length is
analytically given by Eq. (11). Our results may be useful
in biotechnological applications which involve thermally
induced DNA denaturation, as well as in recent hairpin
quenching experiments [7, 8, 9] for studying bubble formation.
[Figure 6: cc (1.8 to 2.1) versus xGC (GC percentage %, 0 to 100).]
FIG. 6: Extrapolated dependence of the exponent c at the
critical temperature (cc) on the GC content.
An important result of this work is the prediction that
the exponent c of the bubble distribution (3) depends
on both temperature and genetic sequence. This prediction
can be experimentally verified using the method
proposed in Ref. [44]: based on the measurement of certain
correlation functions of base-pair openings using
fluorescence correlation spectroscopy [45], the exponent c
can be determined experimentally.
The sequence dependence of c is especially relevant at
the melting temperature. At this temperature very large
bubbles appear and the asymptotic behavior of the bubble
length distribution is necessary. Assuming that the
behavior revealed in our findings qualitatively holds in
the asymptotic regime (which needs to be investigated,
since there is no a priori reason for this to be true),
the particular dependence of c can be extrapolated.
Combining an expression for c such as that given in Eq. (7) and
the dependence of the critical temperature on the GC
percentage presented in Fig. 2b, we show in Fig. 6 estimated
values of c at the melting temperature, cc, as a function
of the GC percentage. We see that cc is not constant,
but decreases as the GC fraction increases. Although,
in view of the above assumptions, this result should be
taken with caution, such a behavior is consistent with the
observation, coming from simulations of lattice models that
consider DNA strands as self-avoiding random walks, of
a decrease of cc with increasing stiffness of the strands
[46].
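The extrapolation behind Fig. 6 can be illustrated by inserting the fitted critical temperature of Fig. 2b into the linear form of Eq. (7). The sketch below uses the central values of the fits printed in Figures 2b, 3b, and 3c (xGC in percent), ignores their uncertainties, and inherits the assumption stated above that the finite-length behavior carries over to the melting point; the function name is hypothetical.

```python
def exponent_at_melting(x_gc):
    """Extrapolated exponent c_c = c(T_c(x_GC), x_GC) for a GC percentage x_gc."""
    t_c = 324.4 + 0.407 * x_gc    # fitted T_c in K (Fig. 2b)
    c0 = -0.04 + 0.0066 * x_gc    # intercept of c(T) (Fig. 3c)
    c1 = 0.0064 - 2.9e-5 * x_gc   # slope of c(T) in 1/K (Fig. 3b)
    return c0 + c1 * t_c
```

With these rounded constants, cc falls from about 2.04 for a pure AT sequence to about 1.90 for a pure GC sequence, within the range spanned by the axis of Fig. 6.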
Under the same assumption, namely that our results for the
temperature dependence of c qualitatively hold in the
asymptotic regime, they may help to understand
results of force-extension experiments.
In single molecule experiments,
force-extension curves of double stranded DNA display
hysteresis [47, 48]. The hysteresis observed is more
pronounced at higher temperatures. The existence of an
elongated double-stranded form of DNA, called S-DNA,
has been proposed to explain the abrupt elongation of
DNA at a force of ∼ 65 pN [49]. Numerical studies of the
force-extension problem [50] suggest that the existence
of S-DNA is necessary to explain the temperature dependence
of the observed hysteresis. However, an alternative
explanation without the appearance of an S-DNA phase
is possible if the exponent c is temperature dependent
[50]. The variation of c we find here is in the
right direction (c increases with temperature, implying
larger hysteresis loops at high temperatures, in accordance with
the experiments) and of the necessary magnitude [50] to
explain the hysteresis effects observed. Hence, if the
qualitative behavior of our results is valid in the asymptotic
regime, this would have an effect on the ongoing debate
about S-DNA, favoring force-induced melting as an
explanation for the DNA elongation over the formation of
an S-DNA phase. Interestingly, recent experimental
results support the force-induced melting explanation [51],
in agreement with the expectation from the temperature
dependence of c.
Acknowledgments. We thank J. Bois for critical read-
ing of the manuscript and J. Bois and N. Theodorakopou-
los for enlightening discussions. G.K. acknowledges the
hospitality of MPI-PKS in Dresden and support from the
C. Caratheodori program C155 of University of Patras.
S.A. acknowledges financial support from Ministerio de
Educación y Ciencia (Spain) through grant MOSAICO.
[1] R.M. Wartell and A.S. Benight, Phys. Rep. 126, 67
(1985).
[2] D. Poland and H.A. Scheraga, Theory of helix coil tran-
sition in biopolymers, Academic Press (1970).
[3] A. Banerjee and H.M. Sobell, J. Biomol. Struct. Dyn.
1, 253 (1983); H.M. Sobell, Proc. Natl. Acad. Sci. USA
82, 5328 (1985).
[4] C.H. Choi, G. Kalosakas, K.Ø. Rasmussen, M. Hiromura,
A.R. Bishop, and A. Usheva, Nucleic Acids Res. 32, 1584
(2004).
[5] G. Kalosakas, K.Ø. Rasmussen, A.R. Bishop, C.H. Choi,
and A. Usheva, Europhys. Lett. 68, 127 (2004).
[6] C.H. Choi, Z. Rapti, V. Gelev et al., Biophys. J. 95, 597
(2008).
[7] A. Montrichok, G. Gruner, and G. Zocchi, Europhys.
Lett. 62, 452 (2003); Y. Zeng, A. Montrichok, and G.
Zocchi, Phys. Rev. Lett. 91, 148101 (2003).
[8] Y. Zeng, A. Montrichok, and G. Zocchi, J. Mol. Biol.
339, 67 (2004).
[9] Y. Zeng and G. Zocchi, Biophys. J. 90, 4522 (2006).
[10] T. Dauxois, M. Peyrard, and A.R. Bishop, Phys. Rev.
E 47, 44 (1993).
[11] M.C. Williams and I. Rouzina, Curr. Opin. Struct. Biol.
12, 330 (2002).
[12] S. Ares and G. Kalosakas, Nano Lett. 7, 307 (2007).
[13] D. Poland and H.A. Scheraga, J. Chem. Phys. 45, 1464
(1966).
[14] Y. Kafri, D. Mukamel, and L. Peliti, Eur. Phys. J. B 27,
135 (2002).
[15] B. Coluzzi and E. Yeramian, Eur. Phys. J. B 56, 349
(2007).
[16] M. Peyrard and A.R. Bishop, Phys. Rev. Lett. 62, 2755
(1989).
[17] W. Sung and J.-H. Jeon, Phys. Rev. E 69, 031902 (2004);
J.-H. Jeon, W. Sung, and F.H. Ree, J. Chem. Phys. 124,
164905 (2006); J.-H. Jeon, P.J. Park, and W. Sung, J.
Chem. Phys. 125, 164901 (2006).
[18] A. Campa and A. Giansanti, Phys. Rev. E 58, 3585
(1998).
[19] S. Ares, N.K. Voulgarakis, K.Ø. Rasmussen, and A.R.
Bishop, Phys. Rev. Lett. 94, 035504 (2005).
[20] T. Dauxois and M. Peyrard, Phys. Rev. E 51, 4027
(1995).
[21] D. Cule and T. Hwa, Phys. Rev. Lett. 79, 2375 (1997).
[22] N. Theodorakopoulos, T. Dauxois, and M. Peyrard,
Phys. Rev. Lett. 85, 6 (2000).
[23] N.K. Voulgarakis, G. Kalosakas, K.Ø. Rasmussen, and
A.R. Bishop, Nano Lett. 4, 629 (2004).
[24] T.S. van Erp, S. Cuesta-López, J.-G. Hagmann, and M.
Peyrard, Phys. Rev. Lett. 95, 218104 (2005); C.H. Choi,
A. Usheva, G. Kalosakas, K.Ø. Rasmussen, and A.R.
Bishop, Phys. Rev. Lett. 96, 239801 (2006); T.S. van
Erp, S. Cuesta-López, J.-G. Hagmann, and M. Peyrard,
Phys. Rev. Lett. 96, 239802 (2006).
[25] Z. Rapti, A. Smerzi, K.Ø. Rasmussen, A.R. Bishop, C.H.
Choi, and A. Usheva, Europhys. Lett. 74, 540 (2006).
[26] B.S. Alexandrov, L.T. Wille, K.Ø. Rasmussen, A.R.
Bishop, and K.B. Blagoev, Phys. Rev. E 74, 050901
(2006).
[27] G. Kalosakas, K.Ø. Rasmussen, and A.R. Bishop, Chem.
Phys. Lett. 432, 291 (2006).
[28] N.K. Voulgarakis, A. Redondo, A.R. Bishop, and K.Ø.
Rasmussen, Phys. Rev. Lett. 96, 248101 (2006).
[29] N. Theodorakopoulos, Phys. Rev. E 77, 031919 (2008).
[30] F. de los Santos, O. Al Hammal, and M.A. Muñoz, Phys.
Rev. E 77, 032901 (2008).
[31] T. Das and S. Chakraborty, Europhys. Lett. 83, 48003
(2008).
[32] B. Alexandrov, N.K. Voulgarakis, K.Ø. Rasmussen, A.
Usheva, and A.R. Bishop, J. Phys.: Condens. Matter
21, 034107 (2009).
[33] J. SantaLucia Jr, Proc. Natl. Acad. Sci. USA, 95, 1460
(1998).
[34] R. Gonzalez, Y. Zeng, V. Ivanov, and G. Zocchi, J.
Phys.: Condens. Matter 21, 034102 (2009).
[35] R. Everaers, S. Kumar, and C. Simm, Phys. Rev. E 75,
041918 (2007).
[36] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H.
Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953).
[37] Y.-L. Zhang, W.-M. Zheng, J.-X. Liu, Y.Z. Chen, Phys.
Rev. E 56, 7100 (1997).
[38] J. Marmur and P. Doty, J. Mol. Biol. 5, 109 (1962).
[39] S. Ares and A. Sánchez, Eur. Phys. J. B 56, 253 (2007).
[40] D.J. Scalapino, M. Sears, and R.A. Ferrell, Phys. Rev.
B 6, 3409 (1972).
[41] S. Aubry, J. Chem. Phys. 62, 3217 (1975); J.A.
Krumhansl and J.R. Schrieffer, Phys. Rev. B 11, 3535
(1975).
[42] S. Ares and A. Sánchez, Phys. Rev. E 70, 061607 (2004).
[43] L. Lewin, Polylogarithms and Associated Functions,
North-Holland, New York (1981).
[44] A. Bar, Y. Kafri, and D. Mukamel, Phys. Rev. Lett. 98,
038103 (2007).
[45] G. Altan-Bonnet, A. Libchaber, and O. Krichevsky,
Phys. Rev. Lett. 90, 138101 (2003).
[46] E. Carlon, E. Orlandini, and A.L. Stella, Phys. Rev.
Lett. 88, 198101 (2002).
[47] M. Rief, H. Clausen-Schaumann, and H.E. Gaub, Nat.
Struct. Biol. 6, 346 (1999).
[48] H. Mao, J.R. Arias-Gonzalez, S.B. Smith, I. Tinoco, and
C. Bustamante, Biophys. J. 89, 1308 (2005).
[49] P. Cluzel, A. Lebrun, C. Heller, R. Lavery, J.L. Viovy,
D. Chatenay, and F. Caron, Science 271, 792 (1996).
[50] S. Whitelam, S. Pronk, and P.L. Geissler, Biophys. J.
94, 2452 (2008).
[51] L. Shokri, M.J. McCauley, I. Rouzina, and M.C.
Williams, Biophys. J. 95, 1248 (2008).
