Structural Phase Behaviour Via Monte Carlo Techniques
Abstract
There are few reliable computational techniques applicable to the problem of structural phase behaviour. This is starkly emphasised by the fact that there are still a number of unanswered questions concerning the solid state of some of the simplest models of matter. To determine the phase behaviour of a given system we invoke the machinery of statistical physics, which identifies the equilibrium phase as that which minimises the free-energy. This type of problem can only be dealt with fully via numerical simulation, as any less direct approach will involve making some uncontrolled approximation. In particular, a numerical simulation can be used to evaluate the free-energy difference between two phases if the simulation is free to visit them both. However, it has proven very difficult to find an algorithm which is capable of efficiently exploring two different phases, particularly when one or both of them is a crystalline solid. This thesis builds on previous work (Physical Review Letters 79 p.3002), exploring a new Monte Carlo approach to this class of problem. This new simulation technique uses a global coordinate transformation to switch between two different crystalline structures. Generally, this `lattice switch' is found to be extremely unlikely to succeed in a normal Monte Carlo simulation. To overcome this, extended-sampling techniques are used to encourage the simulation to visit `gateway' microstates where the switch will be successful. After compensating for this bias in the sampling, the free-energy difference between the two structures can be evaluated directly from their relative probabilities. As concrete examples on which to base the research, the lattice-switch Monte Carlo method is used to determine the free-energy difference between the face-centred cubic (fcc) and hexagonal close-packed (hcp) phases of two generic model systems the hard-sphere and Lennard-Jones potentials. The structural phase behaviour of the hard-sphere solid is determined at densities near melting and in the close-packed limit. The factors controlling the efficiency of the lattice-switch approach are explored, as is the character of the `gateway' microstates. The face-centred cubic structure is identified as the thermodynamically stable phase, and the free-energy difference between the two structures is determined with high precision. These results are shown to be in complete agreement with the results of other authors in the field (published during the course of this work), some of whom adopted the lattice-switch method for their calculations. Also, the results are favourably compared against the experimentally observed structural phase behaviour of sterically-stabilised colloidal dispersions, which are believed to behave like systems of hard spheres. The logical extension of the hard sphere work is to generalise the lattice-switch technique to deal with `softer' systems, such as the Lennard-Jones solid. The results in the literature for the structural phase behaviour of this relatively simple system are found to be completely inconsistent. A number of different approaches to this problem are explored, leading to the conclusion that these inconsistencies arise from the way in which the potential is truncated. Using results for the ground-state energies and from the harmonic approximation, we develop a new truncation scheme which allows this system to be simulated accurately and efficiently. Lattice-switch Monte Carlo is then used to determine the fcc-hcp phase boundary of the Lennard-Jones solid in its entirety. These results are compared against the experimental results for the Lennard-Jones potential's closest physical analogue, the rare-gas solids. While some of the published rare-gas observations are in approximate agreement with the lattice-switch results, these findings contradict the widely held belief that fcc is the equilibrium structure of the heavier rare-gas solids for all pressures and temperatures. The possible reasons for this disagreement are discussed. Finally, we examine the pros and cons of the lattice-switch technique, and explore ways in which it can be extended to cover an even wider range of structures and interactions.
Author-supplied keywords
Structural Phase Behaviour Via Monte Carlo Techniques
Via Monte Carlo Techniques
Andrew N Jackson
Doctor of Philosophy
The University of Edinburgh
August 2001
There are few reliable computational techniques applicable to the problem of structural phase behaviour.
This is starkly emphasised by the fact that there are still a number of unanswered questions concerning
the solid state of some of the simplest models of matter. To determine the phase behaviour of a given sys-
tem we invoke the machinery of statistical physics, which identifies the equilibrium phase as that which
minimises the free-energy. This type of problem can only be dealt with fully via numerical simulation, as
any less direct approach will involve making some uncontrolled approximation. In particular, a numer-
ical simulation can be used to evaluate the free-energy difference between two phases if the simulation
is free to visit them both. However, it has proven very difficult to find an algorithm which is capable of
efficiently exploring two different phases, particularly when one or both of them is a crystalline solid.
This thesis builds on previous work [1] (Physical Review Letters 79 p.3002), exploring a new Monte
Carlo approach to this class of problem. This new simulation technique uses a global coordinate trans-
formation to switch between two different crystalline structures. Generally, this ‘lattice switch’ is found
to be extremely unlikely to succeed in a normal Monte Carlo simulation. To overcome this, extended-
sampling techniques are used to encourage the simulation to visit ‘gateway’ microstates where the switch
will be successful. After compensating for this bias in the sampling, the free-energy difference between
the two structures can be evaluated directly from their relative probabilities. As concrete examples on
which to base the research, the lattice-switch Monte Carlo method is used to determine the free-energy
difference between the face-centred cubic (fcc) and hexagonal close-packed (hcp) phases of two generic
model systems — the hard-sphere and Lennard-Jones potentials.
The structural phase behaviour of the hard-sphere solid is determined at densities near melting and in
the close-packed limit. The factors controlling the efficiency of the lattice-switch approach are explored,
as is the character of the ‘gateway’ microstates. The face-centred cubic structure is identified as the
thermodynamically stable phase, and the free-energy difference between the two structures is determined
with high precision. These results are shown to be in complete agreement with the results of other
authors in the field (published during the course of this work), some of whom adopted the lattice-switch
method for their calculations. Also, the results are favourably compared against the experimentally
iii
observed structural phase behaviour of sterically-stabilised colloidal dispersions, which are believed to
behave like systems of hard spheres.
The logical extension of the hard sphere work is to generalise the lattice-switch technique to deal with
‘softer’ systems, such as the Lennard-Jones solid. The results in the literature for the structural phase
behaviour of this relatively simple system are found to be completely inconsistent. A number of different
approaches to this problem are explored, leading to the conclusion that these inconsistencies arise from
the way in which the potential is truncated. Using results for the ground-state energies and from the
harmonic approximation, we develop a new truncation scheme which allows this system to be simulated
accurately and efficiently. Lattice-switch Monte Carlo is then used to determine the fcc–hcp phase
boundary of the Lennard-Jones solid in its entirety. These results are compared against the experimental
results for the Lennard-Jones potential’s closest physical analogue, the rare-gas solids. While some of
the published rare-gas observations are in approximate agreement with the lattice-switch results, these
findings contradict the widely held belief that fcc is the equilibrium structure of the heavier rare-gas
solids for all pressures and temperatures. The possible reasons for this disagreement are discussed.
Finally, we examine the pros and cons of the lattice-switch technique, and explore ways in which it can
be extended to cover an even wider range of structures and interactions.
An electronic (PDF) version of this thesis can be found at:
http://www.ph.ed.ac.uk/cmatter/links/anj-thesis/.
good at riddles, being a Bear of Very Little Brain. So he
sang Cottleston Pie instead:
Cottleston, Cottleston, Cottleston Pie,
A fly cant bird, but a bird can fly.
Ask me a riddle, and I reply:
Cottleston, Cottleston, Cottleston Pie.
Cottleston, Cottleston, Cottleston Pie,
A fish cant whistle, and neither can I.
Ask me a riddle, and I reply:
Cottleston, Cottleston, Cottleston Pie.
Cottleston, Cottleston, Cottleston Pie,
Why does a chicken, I dont know why.
Ask me a riddle, and I reply:
Cottleston, Cottleston, Cottleston Pie.
Edward Bear.
(Winnie-the-Pooh, A.A. Milne)
This thesis has been composed by myself and has not been submitted in any previous application for a
degree. The work reported herein was executed by me, unless otherwise stated. Elements of this work
appear in references [2] and [3].
A. N. Jackson
August 2001
vii
...continued from previous page
Acronym Description Section Page
SS ‘Strong’ Sampling 3.6.2 50
SSM Site-to-Site Mapping 4.3.1 60
TH Top-Hat algorithm 3.2.1 37
TP Transition Probability weight-function estimator 3.6.2 48
TSSM Transformed Site-to-Site Mapping 4.3.1 60
UVM Unconstrained Volume Moves 3.2.2 39
VS Visited States 3.6.2 46
Title Page i
Abstract iii
Quote v
Declaration vii
Dedication ix
Acknowledgements xi
Contents viii
Contents xix
1 Introduction 1
2 Background 5
2.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1 The hard-sphere potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 The Lennard-Jones potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 What do we want to know? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Phase behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Statistical physics of phase transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Techniques & solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.1 Pen & paper theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Ground-state energies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Dynamics & oscillations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
The harmonic approximation & higher-order schemes . . . . . . . . . . . . . . 13
2.4.2 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Modelling the kinetics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
xv
5.2.2 Isothermal-isobaric SHE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.2.3 Application & error analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.3 Newton-Raphson technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.3.1 The NVT ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.3.2 Measurement & error analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3.3 The NPT ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.3.4 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4 Stage II: Tracing the coexistence curve . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.5 Single-histogram extrapolation (reprise) . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.6 Gibbs-Duhem integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.6.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.6.2 Integration procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.6.3 Error analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
i. Systematic errors from the integration procedure . . . . . . . . . . . . . . . . 106
ii. Errors involving the initial state-point . . . . . . . . . . . . . . . . . . . . . . 107
iii. Stochastic errors in the gradient estimation . . . . . . . . . . . . . . . . . . 107
5.7 Predictor-corrector LSMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6 The Lennard-Jones Solid 111
6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.2 Truncation schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.3 Ground-state energies & lattice-summation . . . . . . . . . . . . . . . . . . . . . . . . 117
6.4 The harmonic approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.4.1 Implementation details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.4.2 Stability of the crystalline structures . . . . . . . . . . . . . . . . . . . . . . . . 126
6.4.3 Truncation-scheme dependence . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.4.4 Verifying the harmonic calculation . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.4.5 The harmonic phase diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.4.6 Estimating the anharmonic effects . . . . . . . . . . . . . . . . . . . . . . . . . 130
6.5 Lattice-switch Monte Carlo for soft potentials . . . . . . . . . . . . . . . . . . . . . . . 132
6.5.1 The lattice-switch order parameter . . . . . . . . . . . . . . . . . . . . . . . . . 133
Multicanonical weighting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.5.2 The constant volume ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.5.3 The constant pressure ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.5.4 The consequences of truncation . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.5.5 Implementation details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
The question is, when the hard-sphere system freezes, what structure will it form? To the physicist it is
intuitively clear that the most efficient way to pack together a number of spheres is by stacking hexagonal
layers of these one on top of another, in much the same way one might arrange cannonballs.1 Of these,
however, there are any number of possible stacking patterns (for example hexagonal close-packed or
face-centred cubic). As all these stackings have precisely the same density, the differences in the relative
stabilities of the structures are determined entirely by many-body effects, and tend to be very small and
difficult to determine. This a long standing problem in statistical physics [7], and will be attacked in
chapter 4.
Figure 2.1: (a) The hard-sphere potential. Particles are prevented from overlapping by an infinite
energy barrier for rij < , where is the diameter of the spheres. (b) The Lennard-Jones potential,
which is composed of a steep repulsive term and a softer attractive term, leading to a well of depth .
2.1.2 The Lennard-Jones potential
This potential is characterised by a steeply repulsive core, describing the mutual repulsion of electrons as
atoms are pushed together, and a weaker van der Waals attraction. When combined these opposing forces
produce a minimum in the potential where, at low pressure, one might expect neighbouring particles to
reside (see figure 2.1(b)). It has the form,
(rij) = 4
"
rij
12
rij
6
#
: (2.4)
It is a ‘soft’ potential, because in contrast to hard-core potentials (such as the hard spheres introduced
above) there is no finite range of rij for which the potential is infinite. As this system has both an energy
scale () and a length scale (), the properties of a system of Lennard-Jones particles will depend on
1To the mathematician this piece of guesswork, known as Kepler’s conjecture [5], presents a formidable technical challenge
and has only recently been proven [6].
both the density and the temperature. The Lennard-Jones system is also expected to favour close-packed
structures, but the precise balance between the face-centred cubic and hexagonal close-packed phases
remains unclear. This system’s structural phase behaviour will form the subject matter of chapter 6.
2.1.3 What do we want to know?
Quite simply, we want to take our microscopic models of particle interactions and use them to predict
the structure and properties of our model systems in the thermodynamic limit (i.e. as the number of
particles, N , tends towards O(1023) 1). The definition of the model indicates the macroscopic
conditions that the system will be sensitive to, and we wish to predict how the properties of the system
change as those macroscopic conditions are altered.
2.2 Phase behaviour
Despite the gigantic number of degrees of freedom that a macroscopic system possesses, it is found that
real materials have well defined bulk properties (for example the observed density or the structure) that
only depend on a few macroscopic control parameters, such as temperature and pressure. The different
phases of a system are identified by those macroscopic bulk properties, and a phase transition simply
refers to a point in the macroscopic-constraint space where a small change in those control parameters
causes the properties of the system to change sharply.
It is generally possible to characterise a phase transition by the use of an order parameter, often de-
fined to be zero in one phase and finite in the other. As well as identifying which side of the phase
boundary a given system is on for a given set of macroscopic constraints, it also allows the phase tran-
sition to be classified as either first-order or continuous. First-order transitions are characterised by
step-discontinuity in the order parameter, accompanied by a latent heat of transformation. For example,
as we increase the pressure of gas, we may find that the density will jump from a low to a high value
as the gas condenses into a liquid, while simultaneously warming its surroundings. In the case of con-
tinuous transitions, the step discontinuities are present in the derivatives of the order parameter, and no
latent heat of transformation is observed.
The overall phase behaviour of a system can be described using a phase diagram, a schematic example
of which is presented in figure 2.2. The lines indicate the location of first-order phase transitions, and
the regions between them indicate which phase one would observe for any pair of values of the pressure
and temperature. The liquid-gas coexistence curve of figure 2.2 terminates in a critical point, where a
continuous transition from gas-liquid coexistence to a fluid state occurs. Continuous phase transitions
Figure 2.3: Schematic of configuration space, where the darker
regions correspond to the microstate configurations with the
greatest probability. This figure shows three phases (, and
) which are separated by regions of unlikely configurations, and
of which has the greatest total weight.
The value of W (~q) depends on both the positions of the constituent particles (f~rg) and on the nature
of the macroscopic constraints. For example in the canonical (NV T ) ensemble, where the number of
particles, the density and the temperature is fixed, the weight is defined by the configurational energy of
the particles,
W (f~qg) = WNV T (f~rg) = exp
E(f~rg)
kT
; (2.5)
where k is Boltzmann’s constant and T is the temperature. If one moves to the isobaric-isothermal
(NPT ) ensemble, where pressure, temperature and number are fixed (but the volume V can fluctuate),
it can be shown that
W (f~qg) = WNPT (f~rg; V ) = exp
E(f~rg)
kT
pV
kT
: (2.6)
Clearly, the probability of a system being in any individual microstate can be determined as the weight
of that microstate divided by the total weights of all the possible microstates. That is,
P (f~qg) =
1
Z
W (f~qg); (2.7)
where the constant of normalisation (Z) is the partition function of the entire system, defined as
Z =
Y
i
Z
d~qi
W (f~qg): (2.8)
In order to determine the preferred phase, we must construct a ‘total phase probability’ from our mi-
crostates. This can be achieved by breaking the system’s configuration space up into sections, each
corresponding to a different phase. This is analogous to cutting along the dotted lines of figure 2.3, and
re-expressing the total partition function as no more that the sum of its (now separated) parts,
Z = Z + Z + Z
: (2.9)
The process of dividing up the configuration space is based on restricting the summation of Z to include
only those configurations ‘belonging to’ a given phase, and is written as
Z =
Y
i
Z
d~qi
W (f~qg): (2.10)
Thus, the probability of any given phase can be written as
P =
Z
Z
: (2.11)
This process is rather like taking the pieces cut from figure 2.3, and then comparing the amounts of
laser-printer toner on each. The most likely phase would be determined from the piece of paper with the
most toner on it.
These ‘total phase probabilities’ are easier to deal with when cast in terms of thermodynamic potentials.
In the canonical ensemble, we construct the Helmholtz free-energy,
F = kBT lnZ; (2.12)
and express the relative probabilities of two phases as a difference in free-energies,
F = F F = kT ln
Z
Z
= kT ln
P
P
: (2.13)
In the case of hard-spheres, the energy of the system can be only zero or infinity, and so in the canonical
ensemble the microstate weight (eqn. 2.5), can be only 1 or 0 respectively. Therefore in this case, our
partition function is simply a count of the number of ways of arranging the spheres (
) and the free-
energy becomes the familiar Boltzmann entropy,
S = k ln
: (2.14)
In the case of the isobaric-isothermal ensemble, the thermodynamic potential is the Gibbs free-energy,
and has the same form as in the case of the canonical ensemble,
G = kT lnZ; (2.15)
where the partition function now includes a sum over all possible volumes of the system,
Z =
Y
i
Z
d~ri
Z
dV WNPT (f~rg; V ): (2.16)
2.4.1 Pen & paper theory
Ground-state energies
When wishing to compare a number of different candidate structures without recourse to numerical
simulation, one might proceed as follows. Firstly, one may attempt to evaluate the energy of each of the
structures when all particles are sitting exactly on their lattice sites (i.e. at zero Kelvin). If this ground-
state energy difference is large (compared with kT in the range of temperatures of interest) one may
decide to consider this as sufficient. For the pair-potentials considered here, the ground-state energy is
trivial to evaluate: For hard-spheres, all valid configurations have the same energy, and for the Lennard-
Jones interaction one can calculate the sum of the pair-potentials over each lattice one wishes to consider
(see x6.3, and also p.400 of [9]).
Unfortunately, while this approach can yield useful results, it clearly offers no way to distinguish the
structures for the hard-sphere case. Furthermore, for the Lennard-Jones system the ground-state energy
difference between fcc and hcp is so small that one cannot expect it to dominate the phase behaviour in
the finite-temperature regime. Therefore in both cases the full free-energy must be considered, including
the entropic contribution.
Dynamics & oscillations
Much has been learned about crystalline solids from the Einstein and Debye models. In the Einstein
model [9, p.462], one assumes that the system can be represented as a set of independent harmonic
oscillators with identical frequencies. In the case of the Debye model [9, p.458], a simple distribution
of frequencies is constructed by considering the elastic behaviour of a continuous system, and adding in
a cut-off at high frequencies corresponding the maximum frequency of the discrete, finite system. Both
of these models have a single free parameter, and give valuable insight into the thermodynamics of the
solid state. However, neither of them can be applied to all problems of structural phase behaviour as
they stand. The assumption of a trivial frequency distribution removes all notion of geometry from the
problem, and when comparing structures of identical density, like fcc and hcp, we are fundamentally
concerned with geometry as this is the only way in which the structures differ. Any applicable theory
must therefore include these geometrical differences, and a popular example of such a theory is the
harmonic approximation.
The harmonic approximation & higher-order schemes
The harmonic approximation operates by assuming that soft-potentials can be accurately represented by
simple spring-like interactions. That is, the interaction potential is truncated to second-order, in terms of
the displacements of the particles from their ground-state (‘perfect-lattice’) positions. The resulting set
of equations is exactly soluble, and accurately represents the original system when the displacements of
particles from their sites is small, i.e. in limit of low temperature [9, chapter 22].
The harmonic approximation was first proposed in 1912 by Born & von Ka´rma´n [11, 12], when the
concept of a crystal lattice was still in the process of being confirmed as an experimental reality via X-
ray diffraction. Since that time the harmonic approximation has been modified in many different ways,
all with the aim of extending the range of physical conditions over which one may produce accurate
results. For example, the quasi-harmonic approximation [9] attempts to model the (more experimentally
realistic) constant pressure behaviour by including the effects of thermal expansion. Many extensions
of the harmonic theory have also been proposed in order to extend the range of temperatures for which
the theory holds true (see [9, chapter 25] & [13, 14]). These are largely peturbative methods, where the
effect of the 3rd and 4th order terms of the displacement-expansion upon the free-energy is estimated
under the assumption that the amount by which these terms alter the free-energy is small.
However, comparisons of these methods (for example [14]) have illustrated the fundamental problem
with this class of approach; the uncontrolled nature of the approximation. One cannot know, a priori,
whether a given approximate theory will be able to predict successfully the behaviour of the true system
over the temperature range of interest. The harmonic approximation can be used to predict the behaviour
of a system in the low temperature limit (a fact which will be used to check our Lennard-Jones results,
see x6.4), but that is all one can be sure of. Also, the harmonic approximation is not well suited to dealing
with the hard-sphere solid, as the interaction is strongly anharmonic (i.e. very badly approximated by
a quadratic potential well) and so cannot be expected to fit within this framework.2 To ensure accuracy
and generality, one is required to work without uncontrolled assumptions, and that can only be achieved
by attacking the problem numerically.
2.4.2 Simulation
Before examining the ways in which simulation can be applied to structural phase behaviour, some
general concepts common to all condensed-matter simulation work should be outlined. Simulations of
this kind can be described quite aptly by the phrase ‘balls in a box’, as that is essentially what one
2Although, it is interesting to note that there are striking similarities between the two systems [15], and in one dimension
these two system can be shown to be equivalent [16].
creates in mind of the machine. We define our set of positions for the particles (~ri), within some (usually
Cartesian) coordinate system, apply some appropriate boundary conditions, and then allow the system
to explore its configuration space so that we might determine the properties of interest.
One might imagine choosing the boundary conditions in order to represent the containers used for real
experiments, for example one could simulate inside a hard-walled box. However, simulations that use
this type of boundary condition tend to be dominated by surface effects. In general, the bulk behaviour
can be simulated much more effectively using periodic boundary conditions (figure 2.4). This arrange-
ment means that the simulation cell no longer has any true surfaces, and rather it sees copies of itself in
all directions. As long as this does not allow particles to interact directly with themselves, one finds that
the thermodynamic limit can be better represented by a smaller system than in the case of a surface-
dominated simulation. Unfortunately, this does not mean that the effects of finite system size disappear
entirely, and care must still be taken to ensure that the simulated system is ‘big enough’ to accurately
represent the behaviour in the thermodynamic limit.
Figure 2.4: While a hard-walled container (left) can be used as the
simulation box, periodic boundary conditions (right) give a more reliable
representation of the bulk.
As well as the system size and boundary conditions, one must also choose the rules under which our
simulation will evolve. There are two main options here, known as the Molecular Dynamics and Monte
Carlo techniques.3
In classical Molecular Dynamics (MD) [17, chapter 4], the system is evolved deterministically by inte-
grating Newton’s equations of motion for the system. That is, one calculates the forces on every particle,
and then updates the velocities and positions of those particles according to Newton’s laws. As these
laws conserve energy, this approach would seem to restrict the range of macroscopic environments one
may consider. However, the MD scheme can be extended relatively easily: for example the simula-
3There are other approaches, such as hybrid Molecular Dynamics + Monte Carlo schemes like Brownian Dynamics [17],
but these will not be considered here.
the latter case, a number of simulations are used to compute free-energy differences, using numerical
integration to sum the derivatives of the free-energy along some inter-system pathway. The pathway
usually lies between the system of interest and an ideal system for which the free-energy is known.
These multi-stage integration methods will be considered in x2.4.4.
2.4.3 Direct differences
Any simulation which is free to visit two different phases can determine the free-energy difference
between them from their relative probabilities (eqn. 2.13). This idea can be traced back to the seminal
publication of Bennett [25], and underpins almost all of the work performed in this research. The
algorithm presented in [25] is quite general, allowing the free-energy difference between two different
systems to be estimated by measuring the energy cost of changing from one Hamiltonian to the other,
but without actually performing that ‘move’. The ratio of the partition functions for systems operating
under two different Hamiltonians can be expressed as
Z2
Z1
=
P
exp [ H2]
P
exp [ H1]
(2.17)
P
exp [ (H2 H1)] exp [ H1]
P
exp [ H1]
(2.18)
(exp [ (H2 H1)])1 ; (2.19)
where the sum notation
P
has been used as a shorthand for a sum over all microstates of the system,
and where (exp [ (H2 H1)])1 denotes the mean value of exp( (H2 H1)), as determined while
operating under Hamiltonian 1. Using this, the free-energy difference between the two systems can be
determined as
F1;2 = F2 F1
= kT ln
Z2
Z1
= kT ln
(exp [ (H2 H1)])1
: (2.20)
Unfortunately, as it stands, this technique is not practically useful in the majority of cases. This is
because the particle configurations generated using Hamiltonian 1 must be very close to those which
would be generated by Hamiltonian 2. If this is not the case, there will be large statistical fluctuations in
the estimated value of Z2Z1 , and the timescale required for a sufficiently accurate measurement of F1;2
will become prohibitively large. In general, the configurations associated with two distinct phases are so
different that this approach cannot be used to compare them. For example, Moody et al. [26] attempted
to use this style of calculation to determine the free-energy difference between the fcc and hcp phases of
low density high density
Figure 2.6: The single-occupancy cell model. At low den-
sities (left) one has a relatively simple weakly-interacting
system, and at high densities (right) the particles no
longer ‘see’ the walls, and the true system is recovered.
The Einstein crystal
This approach, developed more recently than the previous two, is the most flexible and powerful integra-
tion technique to appear so far. For this reason, the method (as first described in [32, 33], and discussed
in [17]) will be explored in some detail here, in the context of the hard-sphere solid.
Figure 2.7: Einstein crystal integration. As the spring
constant is increased, one moves from a system dom-
inated by hard-sphere interactions (left) to an Einstein
crystal (right).
The Einstein crystal is simply a system of particles that interact only with their lattice site, as if attached
there by a spring. If one combines this interaction with the inter-particle hard-sphere repulsion, then one
can see (fig. 2.7) that while at low spring-strength (small ) the effect of the springs will be negligible, at
high enough ( ! max, say) the particles will cease to interact with each other and we will recover
the Einstein crystal, for which the free-energy Feinstein is known analytically. The two systems are
combined via a modified configurational energy,
EIM = E +
NX
i=1
(~ri ~Ri)
2; (2.25)
where the original expression has been augmented by a harmonic term based on the distance each particle
is from its lattice site (~Ri). In contrast to the temperature and density integrations considered earlier,
our integration variable is no longer a thermodynamic variable but a model parameter. To determine the
free-energy of the system we must evaluate the integral,
F =
Z max
=0
dF ()
d
d: (2.26)
As shown in ref. [17, pp. 154,216] (see also eqn. 2.23) the value of dFd
can be estimated as the average
of the square of the particle displacements, and eqn. 2.26 becomes,
F =
Z max
=0
(~ri ~Ri)2
d: (2.27)
So by measuring the fluctuations in particle positions for a series of values of (from 0 up to some max)
one can determine the free-energy of the system.7 The question remains, however, as to how large should
max be? Unfortunately, this question has no easy answer and leads to a series of complications to the
technique (see fig. 2.8).
This complexity arises from the fact that one cannot actually take max very close to1 in a numerical
calculation. Thus, the ‘ideal’ system against which the hard-sphere solid is actually being compared is
in fact a weakly-interacting Einstein crystal. In [33], the amount these weak interactions contribute to
the free-energy is estimated from a ‘virial-like’ expansion (Finteraction) and an empirically determined
correction to this expansion (Fcorrection). Example values of all of the contributions to the total free-
energy are given in table 2.1.
Free-energy Contributions (per particle) fcc(N = 63) hcp(N = 63)
Feinstein 7:9568 7:9568
Finteraction 0:0183 0:0183
Fcorrection 0:0024 0:0008
F 2:9754(10) 2:9761(9)
Fidealgas 0:9432 0:9432
FC:o:M: 0:0246 0:0246
Ftotal 5:9159(10) 5:9168(09)
Ffcc Fhcp 0:0009(14)
Table 2.1: Example values of the components of the free-energy from the Einstein crystal
calculation in ref. [33]. The errors associated with these estimates are given in parenthesis,
indicating the uncertainty in the final digit. This convention will be used throughout this thesis.
So, having performed this long and complex calculation for both the fcc and hcp structures, we find that
while the total free-energy has been very accurately determined (to within 0:02%), this is not enough to
7Note that as for SOC model, this is only true if the calculation is carried out in the centre-of-mass frame. In this case,
this is because the unconstrained hard-sphere solid is free to move away from the lattice-sites used for the Einstein crystal
calculation. This centre-of-mass diffusion means that the value of
D
(~ri ~Ri)2
E
0
will diverge unless the centre-of-mass is
fixed.
some detail.
3.1.3 Random sampling
In this most basic form of Monte Carlo integration, one samples the configuration space of a system at
random. That is, we generate the trial states by choosing random values for all the degrees of freedom
of the system, and we always accept the move to this new state. The macroscopic control parameters
(such as the temperature) do not affect the evolution of the system, and are simply used to calculate the
system’s properties from the simulation data. To concretize this idea, we shall examine the case of the
toy model and how one can use random sampling to evaluate its partition function,
Z =
Z +1
1
W (x)dx ; (3.4)
where
W (x) = exp
x2
: (3.5)
This ‘integration over all microstates’ can be approximated by an integral over a finite range of x,
Z
Z +xmax
xmax
W (x)dx ; (3.6)
which becomes exact as xmax !1. This approximate expression can be rewritten (via the mean value
theorem [35]) as,
Z 2xmaxW (x) ; (3.7)
whereW (x) is the mean value of the integrand over the range xmax to +xmax. This can be interpreted
as a statement that the area under the W (x)-curve (i.e. the value of Z) is equal to the area of a rectangle
of width 2xmax and height W (x) (see fig. 3.2(a)).
Equation 3.7 tells us that if we can estimate the value of W (x) this will in turn produce an estimate
of the value of Z . This mean value can be calculated by randomly sampling W (x) over the range
xmax < x < +xmax. That is, we generate i = 1:::Ns values of x which are randomly and uniformly
distributed between xmax and xmax, then estimate the average value of the Boltzmann weight, which
can be calculated using,
hW (x)ixmax =
1
Ns
NsX
i=1
W (x): (3.8)
The xmax subscript of the left-hand side is used to indicate that the average depends on the chosen value
of xmax. The value of hW (x)ixmax can be used to produce an estimate Z via a slightly modified version
of eqn. 3.7,
Z
EB
= 2xmax hW (x)ixmax ; (3.9)
where we have used W (x) EB= hW (x)ixmax , meaning that the true mean value of W (x) is estimated by
(hence EB=) the average value ofW (x), as measured in a Monte Carlo simulation. The same hOi notation,
as defined in eqn. 3.8, will be used throughout this thesis to indicate a simulation-based estimate of the
mean value of an observable (O).
Figure 3.2: Comparison of the evolution of a simulation of the toy model for the two MC algorithms.
(a) Random sampling: A broad range of x is sampled uniformly, and the Boltzmann factor does not
affect the evolution of the simulation. (b) Importance sampling: By controlling the transition between
states in a way that incorporates the Boltzmann weights, the microstates are automatically sampled
in proportion to their importance.
While this process will work for this model, allowing the numerical value of Z to be estimated for any
temperature, it is somewhat unsatisfying. It would be preferable if the integration process dedicated more
time to the values of x for which the Boltzmann weight is high, as opposed to uniformly sampling some
large and arbitrary range of x. Although the related problem of how to choose an appropriate value
for xmax is something of an artifact of this particular toy model, for the kind of many-body systems
we are really interested in it is generally true that only very small regions of configuration space are
physically significant (i.e. of high statistical weight). For this reason, random sampling will invariably
fail us when studying many-body systems, but it does provide a useful stepping stone on the way to the
more powerful MC techniques at our disposal.
3.1.4 Importance sampling
The essence of this powerful technique lies in controlling the simulation’s transitions between mi-
crostates in a way that ensures that those microstates are sampled according to their importance (see
fig. 3.2(b)). In other words, the ways in which trial moves are generated and the rules under which a trial
move is accepted or rejected are chosen such that each microstate of the system occurs with a frequency
proportional to its Boltzmann weight. If we consider measuring the average energy of our toy model,
E =
Z +1
1
E(x)P (x)dx ; (3.10)
where
P (x) =
W (x)
Z
(3.11)
then it is clear that if we can explore x with the correct (Boltzmann) probabilities P (x), then the energy
of the system can be estimated via a simple average of the observed energy of the simulation,
E
EB
= hEi =
1
Ns
NsX
i=1
Ei ; (3.12)
and indeed any other observable of the system can be treated in the same way.
The key feature of importance sampling is that the procedure does not rely on any prior knowledge of the
actual value of Z , but will nonetheless produce a sequence of microstates with the correct probabilities.
To see how this is achieved, we must first re-express the evolution of the simulation in terms of a Markov
chain.
3.1.5 The Markov chain
A simulation will form a Markov chain if the sequence of microstates it produces are generated such that
the next microstate depends only on the current microstate, and not on any previous microstates in the
sequence. We shall write the probability of observing any given microstate as P (i), which is shorthand
for the probability of the system occupying a particular point in configuration space, P (f~qgi)
Q
~q d~q (for
the toy model this is simply P (x)dx). As the simulation’s evolution is Markovian, the change in P (i)
(per unit time, i.e. per simulation step), P (i), can be written as,
P (i) =
X
j 6=i
P (j)P (j ! ijj)
X
j 6=i
P (i)P (i! jji); (3.13)
where P (i ! jji) is the probability of a transition from i to j given that the system is in microstate i.
The first sum on the right-hand side of equation 3.13 describes the flux of probability into microstate i
from all the others, and the second sum describes the flux out of state i.3 The concept of a probability
flux is perhaps most easily grasped in this context by considering the ensemble corresponding to the toy
model. Imagine we have a very large number (an ensemble) of toy-model simulations, each of which
is exploring its configuration space independently from all the others. In this case, the probability flux
from one microstate to another is essentially the number of systems which make this transition per unit
time.
3Of course, in all the model systems considered here the configuration space is continuous, and so these sums in fact
correspond to integrations over this space. However, for the sake of clarity the discrete sum notation will be used here.
microstates, respectively). As well as these, a recent publication [40] has provided a more physicist-
friendly proof that detailed balance does indeed ensure convergence to equilibrium.
The work in [40] also shows that any scheme which satisfies eqn. 3.14 will converge properly, and so as
implied above detailed balance is indeed a stricter condition than is necessary. A similar argument was
presented in reference [41], in order to justify the use of certain MC move generation schemes which
had been shown to break detailed balance in its strictest form. The work in this thesis, however, will
‘play safe’ by ensuring that detailed balance is strictly observed at all times.
To make use of the detailed balance condition we substitute the Boltzmann probability of each microstate
(P (i), eqn. 2.7) into eqn. 3.15. We find that the troublesome partition functions cancel, leaving us with
the requirement that
P (i! jji)
P (j ! ijj)
=
W (f~qgj)
W (f~qgi)
: (3.16)
From the arguments presented above, it should be clear that if transition probabilities are chosen such
that equation 3.16 is satisfied, then the simulation should automatically converge onto the Boltzmann
distribution.
3.1.6 The Metropolis algorithm
The original and most widely used algorithm in Monte Carlo simulation was introduced by Metropolis
et al. [42]. To describe this algorithm, it is first necessary to split the transition probability into its
two constituent parts: Pgen, the probability of generating any given move, and Pacc, the probability of
accepting that move,
P (i! jji) = Pgen(i! jji)Pacc(i! jji): (3.17)
The Metropolis algorithm requires that there is no bias in the way in which moves are generated, i.e.
that
Pgen(i! jji) = Pgen(j ! ijj): (3.18)
For the toy model, we might imagine generating new microstates by randomly changing x by a small
amount. If this were done such that (for example) moves to the left were generated more frequently
than moves to the right, then eqn. 3.18 would not be satisfied. The simplest way to ensure that 3.18 is
satisfied in this case would be to generate moves to the left and to the right in a uniform and unbiased
manner.5 This property (eqn. 3.18) is referred to as the underlying symmetry of the Markov chain, and
shifts the implementation of detailed balance entirely into Pacc. Metropolis et al. [42] showed that a
5An alternative to this condition is that any bias that is introduced into the generation of moves is removed by an additional
acceptance condition so that eqn. 3.18 is recovered. Roughly speaking, if moves to the left are generated twice as often as
moves to the right, then moves to the left should only be accepted half the time.
3.2.2 Isobaric-isothermal ensemble algorithms
In the case of the NPT ensemble, the simulation must sample the configurations associated with all
the possible simulation-cell volumes, as well as those associated with the different possible particle
configurations. To do this, it is possible to add simple volume dilations, which alter the volume of the
simulation cell without modifying the particle positions (figure 3.6, left). The decision to accept this
move would then be based upon the Boltzmann weight WNPT given by equation 2.6. However, there
is a more efficient method with which to implement volume moves (see [17, sec. 5.4] for details of this
and other types of volume move). The idea is that instead of just changing the volume by some scaling
factor, all particle positions within the periodic cell are also scaled (see figure 3.6, right).
Figure 3.6: Volume moves. While just changing the volume of the periodic cell (left)
leads to a simpler acceptance condition, the simulation takes some time to re-adjust to
the new volume. If the particle positions are scaled with the volume, the resulting con-
figuration is nearer the equilibrium configurations for that volume, and so the simulation
is more efficient.
This all-scaling scheme requires the particle positions to be re-expressed as,
~ri = L~si (3.22)
where the deformation matrix L has zero off-diagonal elements, and has parameters a, b and c along
the diagonal. Under this constraint, the L matrix describes a mapping from a cuboid simulation cell of
dimensions a b c onto a unit cube. Substituting this form into the equation for the partition function
(2.16) we find,
Z =
Z
dV
"
Y
i
Z
d(L~si)
#
exp [ E(f~rg) PV ] ; (3.23)
=
Z
dV
"
Y
i
Z
d~si
#
V N exp [ E(f~rg) PV ] ; (3.24)
=
Z
dV
"
Y
i
Z
d~si
#
exp [ E(f~rg) PV +N lnV ] ; (3.25)
(3.26)
Acceptance rates and attempt frequencies
As in the case of particle moves, we have the freedom to tune the size of the volume moves (and so
alter the associated acceptance rate) in order to maximise the efficiency of the simulation. This issue is
explored in x4.10.1 & x6.6.1. We also have the freedom to choose how often to attempt a volume move,
but for this we choose to use the standard prescription [17, x3.4] [34, x4.5], where volume moves are
attempted at random intervals, with an average frequency of once per Monte Carlo sweep.
3.3 The random factor
For any of the algorithms outlined here to work, we need a source of random numbers with which to
carry out the Monte Carlo moves. However, computers are strictly deterministic machines (excepting
hardware faults, which are extremely rare in modern computers) and so there can be no such thing as
a perfect random number generator (RNG). The sequence of numbers generated by a computational
algorithm will be well defined, and will always (eventually) repeat. Many years of research have been
dedicated to the problem of pseudo-random number generators (see ref. [34, x6] for an introduction),
but the best we can hope for is that, for the volumes of random numbers we require, the sequence will
fail all known tests designed to detect ordered behaviour.
Unfortunately, some simulation algorithms can be extremely sensitive to any correlations in a random
number generator’s output, and the only way to ensure that all is well is to check that we recover statis-
tically indistinguishable results whatever RNG we choose. So, while most of this work was performed
using one random number generator (RAN250/521), occasionally a simpler algorithm (RAN250) was
employed in order to check the reliability of our results. Briefly, the RAN250 algorithm is a straight-
forward linear congruential algorithm, whereas the RAN250/521 generator combines the results of two
independent (congruential) algorithms to produce an overall sequence of significantly greater statistical
fidelity than the RAN250 alone. For more information on these algorithms, see refs. [44, 45, 46] and
references therein. No RNG dependence was detected during this research.
3.4 Error analysis
Given that we have a successful importance sampling simulation, the question remains as to how we
might estimate any properties of interest from the simulation data. We have already seen that the energy
can be estimated by periodically measuring the simulated values and finding the average, and this is true
for any observable
O
EB
= hOi =
1
Ns
NsX
i=1
Oi ; (3.29)
where Ns is the total number of times the value of O was sampled. However, we need to be able
to estimate our statistical uncertainty in this average, that is we need to compute the standard error
associated with the mean value of O.6 It is well known (see e.g. [48]) that if the Ns observations of O
are uncorrelated, then the standard error of the mean can be calculated as
(O) =
1
p
Ns
h
O2
hOi2
i
: (3.30)
Of course, as a Markov chain is generated by making a long series of (usually small) changes to the
system, the sequence of microstates will be correlated to some degree. But how can we ensure that these
correlations are not skewing the averages, and that the statistical errors are reliable? One option is to
calculate the autocorrelation function of the data, and then estimate the autocorrelation time from that
function. One can then ensure that the data is only sampled on a timescale significantly greater than
the autocorrelation time. However, this process is rather time consuming and slightly arbitrary (as it
depends on the particular form chosen to fit to the autocorrelation function data). A more straightforward
approach is known as block averaging (see [17, Appendix D.3]), which avoids the explicit calculation
of the autocorrelation time. The individual Ns measurements are gathered into Nb blocks of length Lb,
so we have B = 1:::Nb individual block averages,
hOiB =
1
Lb
LbBX
i=1+(B 1)Lb
Oi: (3.31)
From this set of averages, we can compute the overall average,
hOi =
1
Nb
NbX
B=1
hObiB ; (3.32)
and the associated variance,
2b (O) =
1
Nb
NbX
B=1
hOi2B hOi
2 : (3.33)
It can be shown that the following error estimate (based on the assumption that the blocks are uncorre-
lated),
(O) =
b(O)p
Nb
(3.34)
will be a constant for all block sizes which are sufficiently large that the individual block averages are
uncorrelated. Therefore, we can check that the block size is sufficiently large by calculating (O) for
a number of different block sizes (a range of values of Nb) and then adopt the value of (O) from the
range of Nb for which it is constant.
6For an excellent introduction to error-handling techniques in general, see Taylor [47].
3.5 Successes & failures
Importance sampling Monte Carlo, as outlined so far, is a widely used and powerful tool for investigating
many-body systems. However, there are a number of important limitations. For example, as mentioned
in x2.2, continuous phase transitions present some serious difficulties for the computational physicist.
A Monte Carlo simulation performed near a critical point tends to suffer from ‘critical slowing down’,
meaning that the progress of the simulation through configuration space becomes extremely slow. Also,
such a simulation will suffer from serious finite size effects, because at a critical point the spatial corre-
lation length (i.e. the size of the fluctuations) diverges, and so only an infinite system can give a truly
accurate representation of the system’s behaviour. However, much progress has been made by using
‘smart’ MC moves to speed up the simulation’s progress (e.g. cluster moves [17, Chapter 12]), and by
using finite-size scaling techniques to determine what a series of finite (differently) sized simulations
can tell us about the infinite system [49].
However, for the work under consideration here, the core limitation of an importance sampling MC
simulation is that it cannot determine the absolute free-energy of a system. However, an MC simulation
can estimate the difference in free-energy between two phases if it is free to visit them both (x2.4.2).
Unfortunately, for the cold, solid-state systems under consideration here, the configuration space consists
of a number of distinct fragments, corresponding to different phases, each of which is not only (formally)
metastable, but on any reasonable simulation timescale is effectively stable.
An analogous problem can be invented by making a small alteration to our toy model. Instead of a
single harmonic well, imagine that there are two minima in the configurational energy (see fig. 3.7). An
importance sampling simulation (using ’physical’ moves) will tend to linger in one well or the other, only
crossing the barrier occasionally (see fig. 3.7(a)). If we make the barrier higher, the system spends less
and less time moving between the wells, i.e. the probability of crossing from one well to the other gets
smaller. With a high enough barrier, the simulation will become stuck (on the computational timescale)
in one of the two wells, and never ‘see’ the other one(see fig. 3.7(b). Note that lowering the temperature
of the simulation shown in figure 3.7(a) would have the same effect as increasing the barrier height.
3.6 Extended sampling
The essential idea behind extended sampling is to determine the behaviour of the system for some
microstate weighting scheme other than the Boltzmann distribution, i.e. the simulation is biased. This
biased simulation can overcome the probability barrier between two regions of configuration space, and
so the simulation can visit both of them. The trick, however, is to do this in such a way as to allow us
Figure 3.7: The toy model with a double-well potential: (a) If the barrier is relatively small, the
simulation will move between the two on a reasonable timescale. (b) As the barrier is made taller, the
simulation tends to get stuck in whichever basin it started off in.
to calculate what would have happened if the Boltzmann distribution had been used, so that we may
remove the bias after the simulation has run its course.
This idea has been around for some time, and there are a number of different biased sampling techniques
available for a number of different contexts (see [17, 24] and references therein). However, this work will
focus on one particular technique, generally referred to as multicanonical Monte Carlo (MCMC). The
genesis of this method can be traced back to the “umbrella sampling” of Torrie & Valleau [50], which
was later generalised by Berg & Neuhaus [51] to create the multicanonical sampling algorithm. The
name, incidentally, comes from the idea of a single simulation that visits many different temperatures,
i.e. it samples many different canonical ensembles. The implementation of MCMC used in this work
follows the form given in the Smith & Bruce review article [24].
3.6.1 Multicanonical Monte Carlo
The essence of the kind of problem which multicanonical Monte Carlo is well suited to dealing with
is presented in figure 3.8(a). We have some macroscopic order parameter M for which there are two
distinct sets of values, corresponding to two separate regions of configuration space in which the Boltz-
mann weight is high. In the case of our double-well toy model (fig. 3.7), this order parameter could be
the energy. The order parameter space is assumed to be discretized, such that for each microstate i, its
associated macrostate valueM belongs to a discrete macrostate binMm where there are m = 1:::NM
macrostates altogether. The probability of visiting the macrostates of M in between the two phases
becomes (exponentially) small as one moves away from the (approximately Gaussian) peaks, and so the
probability of crossing this region in an importance-sampling simulation is very small.
Figure 3.8: Illustration of the essential ideas of multicanonical Monte Carlo. (a) We have some
double-peaked distribution, which we wish to explore in full. The discreteM states are shown as a
grey histogram. (b) A multicanonical simulation can visit all relevantM-values uniformly (grey-filled
plot), using an appropriate weight function (thick line). The true distribution can be determined by
removing the bias introduced by the weights. The curves have been drawn as continuous functions
for reasons of clarity.
In multicanonical Monte Carlo, the original Boltzmann weight is augmented by a ‘weight function’
which depends onM (via the index m),
W(f~qg) = W (f~qg) exp [m] : (3.35)
This can be used to modulate the probability of visiting each macrostate ofM, and so potentially allows
the full range ofM to be explored under any distribution which seems appropriate. The usual choice
[24] is for the macrostate space to be explored approximately uniformly,7 as illustrated in figure 3.8(b).
The bias introduced by the weight function can be removed from the simulation data as follows. Any
measurements of an observable taken in the m’th macrostate have had their weighting augmented by
a factor exp [m], and so by reweighting each observation by a factor of exp [ m], the bias will be
removed. Our simple averages for observables (eqn. 3.29) become,
hOi =
PNs
i=1Oi exp [ m]
PNs
i=1 exp [ m]
; (3.36)
7The question of what the optimal augmented distribution should be is by no means trivial to answer, but the work presented
in reference [52] suggests that the gains over using a uniform distribution are small, because “...multi-canonical sampling is
never bad in the way that Boltzmann sampling can be bad”.
3.6.2 Generating the weight function
The simplest approach, which is suitable for many cases, is the visited states (VS) method. This is
an iterative method by which a revised weight function ((n+1)) is derived from the previous weight
function ((n)) combined with a histogram ofM obtained by using those weights. The update scheme
is written as,
(n+1)m =
(n)
m ln
h
H(n)(mjm) + 1
i
+ k; (3.41)
where the arbitrary constant k is fixed by the convention that the lowest point on the function will be
placed at zero (minm(m) = 0). This arbitrary constant arises because the dynamics of the system are
determined only by the change in the weight function as one moves between microstates, and so the
absolute values of are unimportant.
This iterative process may be initialised by assuming a flat weight function, thus reducing the initial
simulation to normal importance sampling. The histogram thus obtained can then be used to modify
the weight function (through eqn. 3.41) so that the probabilities of macrostates which were infrequently
visited will be enhanced, and the more popular macrostates will find their popularity diminished. Even-
tually, this iterative process will converge; the measured histogram will become uniform, and the weight
function will be unchanged from one iteration to the next. At this point, the weight function precisely de-
scribes the distribution P (M), as the bias now exactly opposes this distribution. However, this method
tends to be very slow to converge, and a large number of iterations may be required when the twin peaks
of P (M) are narrow in comparison with the distance inM between them. This poor performance can
be improved by block-wise evaluation, where a set of separate simulations are used, each of which is
constrained to explore different regions ofM. The results of these Blocked Visited States (BVS) simu-
lations can be used to estimate different sections of the weight function (via eqn. 3.41), which can then
be spliced together to provide an estimate of the overall weight function. Once a reasonable weight
function has been determined, the standard VS algorithm can be used to refine it further.
A second blocking technique is based on using a slowly moving barrier, which allows the system to
explore only theM = 0 macrostate at first, then theM = 1:::+ 1 states, and then theM = 2:::+ 2
states, and so on. The barrier is moved outwards at regular intervals of the order of the equilibration
time. This allows each side of the double-peaked distribution to be evaluated separately, after which the
two estimates can be combined to produce an estimate of the overall distribution. This will be referred
to as the Mobile Barrier Visited State (MBVS) technique.
The macrostate transition probability matrix
While the VS methods have the advantage of being relatively simple, a significantly more efficient ap-
proach has been invented, based on monitoring the rate of transitions between macrostates [24]. During
this work, many different algorithms have been employed for the purposes of collecting and utilising
the transition data. However, before describing them, we must first outline the formal mathematical
structure upon which all of these methods are based.
In a Monte Carlo simulation, the sequence of microstates forms a Markov chain (eqn. 3.13). Conse-
quently, the sequence of macrostates generated by a Monte Carlo simulation can be also described in
those terms [24],
Pm =
X
l 6=m
PmlPm
X
l 6=m
PlmPl; (3.42)
where (to distinguish this equation from its microscopic equivalent) Pml is used as a shorthand for
P (m! ljm). The matrix Pml is the macrostate transition probability matrix (MTPM). The macrostate
Markov chain can be shown to obey a detailed balance condition,
PmlPm = PlmPl: (3.43)
It should be noted, however, that this is true only under the assumption of fast local equilibration. As
the simulation moves between microstates, it takes some time for the history of the simulation to be
forgotten (the autocorrelation time). If this time is significant on the scale of the time it takes to move
from macrostate to macrostate then the measured transition rates will depend on the recent history of
the system, not just the current state, and the macrostate chain becomes non-Markovian. Fortunately,
each macrostate will usually correspond to a large number of microstates, and so the rate of transitions
between macrostates tends to be sufficiently slow. Even if the macrostate transitions are found to occur
too rapidly, this can be circumvented by ‘locking’ the simulation into each macrostate for a small period
of time.
The actual transition rates (Pml) themselves represent the total probability of all possible transitions
between two macrostates. Therefore, to measure the transition matrix, we should count the number
of accepted transitions from one macrostate to another, remembering to include the rejected moves as
transitions from the macrostate in question unto itself. These quantities are recorded in a matrix, Tml,
thus allowing the transition rates to be estimated as,
Pml
EB
=
Tml
P
k Tmk
: (3.44)
In other words, the probability of a transition fromm to l (P (m! l jm)) is estimated using the number
of observed transitions from m to l divided by the total number of transitions observed for state m.
MTPM approach the data from different runs is easy to combine: the Boltzmann macrostate transition
probabilities do not depend on the probability of visiting each state in the simulation (i.e. one records the
Boltzmann probability of a transition from i given that we are in state i). This means that the information
we collect is not dependent on the algorithm used to explore the macrostate space (under the condition of
fast local equilibration). Reference [53] includes a proof of this fact in the context of MCMC simulation.
It shows that the data from many different runs can be combined by a simple summation of the individual
elements, even if different weight functions were used. This allows the MTPM to be steadily built up,
while the MCMC weight function is slowly improved.
Artificial dynamics
The TP weight-generation algorithms considered so far have all been based in the ‘natural’ dynamics of
an importance or MCMC sampling simulation. To speed up the evaluation of the transition probabili-
ties, a number of ‘artificial’ dynamics can be used. For example, we have used the Multiple Initialisation
(MITP) method (as used in [24]), where the data from many simulations, each of which has been ini-
tialised somewhere in between the two equilibrium peaks, are combined to produce an estimate of the
weight function. Another example of this concerns block-wise evolution (BTP), similar to the blocked
VS algorithm mentioned above. Under the BTP scheme, a number of simulations can be used, each
locked into separate ranges of macrostate space, and the transition matrix elements from each of these
can be combined to produce the overall transition matrix. The mobile-barrier MBVS scheme mentioned
previously can also be used to control the evolution of the simulation while the transition rates are mea-
sured (MBTP). As well as these techniques, which have been taken from the literature [24, 49] a related
technique, referred to here as “strong sampling”, was invented during the course of this work.
Strong sampling
This new (unpublished) TP method has been designed to sample the whole range ofM automatically
and so remove the ‘black art’ from weight function determination. The algorithm is based around a
four step process intended to force the simulation to visit the entire range of macrostates of M in a
near uniform manner. It is not dissimilar to the BTP technique mentioned above, but the ‘block’ into
which the simulation is locked is only one macrostate wide. This is implemented by using two ‘barriers’,
each of which forces any microstate transitions which would cause the simulation to cross the barrier
to be rejected. For example, if the order parameter was chosen to be the energy, and if the macrostates
were naturally discrete, then this would simply amount to implementing a microcanonical (fixed energy)
simulation in the MC context. Although the system is locked in a single macrostate, the simulation is
3.7 Discussion
The greatest freedom intrinsic to the Monte Carlo approach is the freedom to invent ‘special moves’.
Instead of employing only ‘physical’ moves, such as moving single particles around by small amounts,
one can invent composite moves, where many degrees of freedom are updated simultaneously. This is
the idea behind the cluster moves used to efficiently simulate the Ising model, where the speed of the
simulation’s progress through configuration space is boosted by flipping many spins at once [17, Chapter
12].
The aim of the work in this thesis is to design a special move that will map a configuration of one
crystalline structure onto some configuration of another. Large moves in configuration space are gener-
ally found to be somewhat unlikely to occur, and so we will probably have to apply extended sampling
techniques in order to encourage this phase-switch to occur. The work presented here explores the de-
velopment of just such a technique, making use of a global coordinate transformation to perform the
switch, and defining a suitable order parameter so that this move may be multicanonically encouraged.
This interface-avoiding dual-phase simulation algorithm was first developed for the hard-sphere solid,
and this forms the subject matter of the next chapter.
The different phase regimes are shown in figure 4.1, both schematically and through the form of the
pressure-density curve of the system. At low densities (~ . 0:663), the hard-sphere fluid is the stable
phase. As the density is increased above ~ 0:663, the equilibrium state consists of coexisting fluid
and crystal regions. Beyond ~ 0:733, the equilibrium structure is a pure crystalline solid. The dashed
region of the solid-phase curve shows the behaviour of the metastable crystal, terminating in a spinodal
point below which the solid will always spontaneously melt. Also, if the system is quickly ‘quenched’ by
rapidly increasing the density (or the pressure), then the metastable (glassy) fluid phase may be observed.
This metastable branch is believed to terminate at ~ 0:86, at which point the pressure will diverge
as the structure becomes randomly close-packed (rcp) [67, 68]. For the crystalline phase, the pressure
increases with the density and finally diverges in the crystalline close-packing limit (by definition, at
~ = 1:0).
Figure 4.1: Illustrations of phase behaviour and the phase diagram of the hard-sphere system.
At low densities, a fluid is observed for which the semi-empirical Carnahan-Starling equation [69]
provides an accurate equation of state. At higher densities, the fluid branch (leading to random close-
packing) is metastable with respect to the coexisting fluid-crystal or the ordered crystal phase. The
fluid-crystal tie-line and the random close-packed data/equations are taken from [67]. The crystalline
solids’ equations of state were taken from [70]. Inset shows a close-up of the spinodal region of the
crystalline phase, where the difference between the fcc and hcp equations of state (taken from [70])
can be clearly seen.
As explained in x2.1.1, the candidates for the equilibrium structure of the hard-sphere solid are the close-
packed crystals and of these structures the face-centred cubic and hexagonal close-packed are the most
important (the other possible close-packed structures will be introduced in x4.2). In the early (1960’s)
work on this problem, the only measurable difference between the fcc and hcp structures was the pressure
difference [71, 7] (as shown by the inset graph on the right-hand side of fig. 4.1). However, the pressure
difference on its own cannot be used to determine which is the thermodynamically favoured phase; this
can only be achieved by explicitly evaluating the total free-energy difference. Of the publications during
this period, only reference [71] contains a calculation of the entropy difference: they estimated that
4.3 Lattice-switch Monte Carlo for hard spheres
Before we can consider how to switch between two crystalline structures, we must be able to identify a
configuration as ‘belonging to’ a given crystalline phase. This can be done quite naturally, and in the
traditions of lattice dynamics (see x2.4.1 and x6.4), by decomposing the particle position coordinates
into a sum of ‘lattice’ and ‘displacement’ vectors (see fig. 4.5),
~ri = ~R
i + ~ui : (4.3)
Here, f~Rg f~Ri ; i = 1:::Ng is a set of fixed (configuration-independent) vectors describing the
crystalline structure (labelled ). The set of positions of all N sites will be referred to as the ‘lattice’,
although this is not strictly the usual crystallographic meaning of the word.1 The f~ug vectors represent
displacements with respect to those lattice sites.
Figure 4.5: The decomposition of particle positions ~r into
a lattice-site position ~R (marked with a cross) and a dis-
placement ~u from that site.
This framework provides a number of ways of identifying a configuration as being associated with a
particular structure. One might adopt the criterion that all particle displacements, with respect to the
associated lattice sites, should lie within some spatial cut-off, chosen to be sufficiently large that the
results are independent of its specific value. The bounded random-walk and top-hat move generation
schemes (x3.2.1) are implementations of this constraint. Alternatively, we might identify the relevant
configurations as the set that are accessible from some member of the set (e.g. the perfect crystalline
state) within some given temporal cut-off. The actual length of the temporal cut-off should not be
important as long as it is ‘long enough’, meaning that it allows the configurations associated with a
particular structure to be sampled efficiently. This has the merit of being what, in practice, computer
simulations actually do: if a computer simulation is begun in fcc, say, then it will tend to stay in that
structure over any and all reasonable simulation timescales. This style of cut-off is invoked implicitly
by choosing to implement the (unbounded) random-walk move generation mechanism (x3.2.1). Of
course, if a simulation is able to change structure spontaneously, then the lattice-switch approach will
1A crystalline structure is usually described in terms of the crystallographic lattice (defining the periodic unit) and the basis
(which defines the positions of the lattice sites within the repeating cell). By convolving the lattice with the basis, the positions
of all the sites in the system are determined, and it is this that we refer to as the ‘lattice’.
This representation is significantly more complex than the simple local mappings, and introduces a
number of computational overheads. Also, the problem of identifying the most efficient NLM is very
difficult to tackle (it is, effectively, a 3N -dimensional minimisation problem), and the work presented
in x4.10.4 suggests that (for hard-spheres at least) there is little to gain here by this kind of tuning. For
these reasons, all the mappings used in this thesis will be one-to-one (SSM) mappings. The possible
advantages of using a complex mapping will be considered in x7.2.
While a number of different site-site mapping have been tested (see x4.10.4), one particular mapping
was used during almost all of this work. As illustrated schematically in fig. 4.6, this mapping exploits
the fact that fcc and hcp differ only by their stacking pattern (ABC versus AB), by simply translating
the appropriate close-packed planes. By ‘translate’ we mean, more precisely, ‘relocate at a position
defined by an appropriate translation vector’; the planes should not be thought of as ‘sliding through’
the intermediate space.
Figure 4.6: The main LS transformation, shown for the perfect-crystal configuration. Six
close-packed (x-y) layers are shown, with an additional [bracketed] layer at the bottom
showing the position of the periodic image of the top (z = 0) layer. The circles show the
boundaries of hard spheres located at the sites of the two close-packed structures. In this
realization of the fcc!hcp lattice switch, the top pair of planes are left unaltered, while
the other pairs of planes are relocated by translations, specified by the vectors ~t (white
arrows) and ~t (black arrows). The vector ~t is identified in fig. 4.2.
Fig. 4.6 shows the lattice-switch as applied to the perfect-crystal configurations, where all particles
are on their lattice-sites (~ui = ~0 8 i). In this case, the two structures are clearly ‘energy matched’
because the lattice switch cannot cause any particles to overlap, and thus will be accepted. However,
for a ‘typical’ equilibrium microstate (see fig. 4.7), the two configurations related by the LS operation
will not be energy matched. While the planes which are displaced together (type (i) in fig. 4.6) cannot
produce any overlaps between themselves during the switch, the planes which are translated differently
(type (ii) of fig. 4.6) may, and indeed with overwhelming probability will, map a realizable (no overlaps)
configuration of one structure onto an unrealizable (overlapping) configuration of the other. A MC
lattice-switch move will, therefore, be rejected for most configurations, but not for all. A number of
configurations for which the lattice-switch move may be accepted (i.e. ‘gateway’ states) must exist. For
example, from the definition of our lattice-switch operation, configurations for which all of the particles
are ‘close enough’ to the perfect-lattice positions must fall into this category. One could choose these
‘small-displacement’ configurations to act as the gateway states, and design a multicanonical procedure
accordingly. However, we can avoid having to make this explicit choice and, instead, let the system find
gateway configurations itself. To do this, we first define a measure of the mismatch between the energies
of the configurations linked by the lattice-switch.
4.3.2 The overlap order parameter
For a system of hard-spheres, the configurational energy cost of the lattice switch is always either zero or
infinity, and using this to measure the ‘energy mismatch’ will not produce an effective MCMC weighting
procedure. The algorithm must be able to tell if the configurations are ‘getting closer’ to the gateway
states, instead of simply whether the LS produces any overlaps or not. For this reason, the mismatch is
best quantified by the number of pairs of overlapping spheres created by the lattice switch. To this end,
we let M(f~ug; ) denote the number of overlapping pairs associated with the displacements f~ug within
the structure . From this, we define the overlap order parameter,
M(f~ug) = M(f~ug; hcp) M(f~ug; fcc) : (4.6)
As M(f~ug; ) is necessarily non-negative, and must be zero for any realizable set of displacements
for the structure , thenM will be 0 ( 0) for realizable configurations of the fcc (hcp) structure.
Figure 4.7 provides a concrete example. The gateway states may then be identified (without prejudging
their microscopic character) as the set of configurations for whichM = 0, as these configurations are
realizable in both structures.
4.3.3 Multicanonical weighting
As mentioned above, the lattice switch will usually cause overlaps for a ‘typical’ configuration of a given
phase. However, we can use multicanonical sampling (x3.6.1) to enhance the probability of visiting
the unlikely M = 0 gateway states. The MCMC algorithm supplied in x3.6.1 can be applied to this
problem by simply identifying the multicanonical order parameter (the M of x3.6.1) as the overlap
order parameter (M of eqn. 4.6). This order parameter is naturally discretized (the number of pairs
of overlapping spheres can only be an integer), and the MCMC tools presented in x3.6.1 can be used
‘straight out of the box’. The combination of the LS transformation, the overlap order parameter and
Figure 4.7: Schematic representation of the LS transformation applied to a ‘typical’ configuration of
hard spheres. The crosses identify the lattice sites, while the small circles locate the sphere centres
for this configuration of displacements f~ug. This configuration is realizable (gives no overlaps) in the
fcc structure, but under the LS transformation it is mapped onto an (unrealizable) hcp configuration
with three overlapping pairs of hard spheres (shown with broad, dashed boundaries). Thus, for this
configuration, the overlap order parameterM(f~ug) = 3 (eqn. 4.6).
the MCMC algorithm can now be used to evaluate the distribution P (MjC) for any set of conditions
C (referring to, for example, the constant volume or constant pressure ensembles), via eqn. 3.38. By
breaking the distribution into its two (fcc and hcp) parts, the relative probabilities of the two phases can
be estimated (c.f. eqns. 2.11 & 3.36) using,
P (fccjC)
P (hcpjC)
EB
=
P
M>0H(MjC; fg) exp [ m]P
M<0H(MjC; fg) exp [ m]
: (4.7)
4.4 The constant-volume ensemble
The weight of any configuration in the NV T ensemble is simply exp [ E] (eqn. 2.5). The energy
E is zero for any realizable configuration of hard-sphere, and so each of these configurations has unit
weight. Configurations with overlapping spheres are prevented by the infinite energy barrier, i.e. the
configurational weight of this energy is zero. As the weight can only be zero or one, the value of the
temperature does not affect the partition function. We have chosen to set T = 1, implying that the
Helmholtz free-energy is equal in magnitude (although opposite in sign) to the entropy,
f(N;V; ; ) s(N;V; ) =
1
N
lnZ(N;V; ; ) ; (4.8)
In other words, we expect the Gibbs and Helmholtz free-energies to differ by terms that are second order
in the pressure difference between the two phases (at least in the thermodynamic limit). That pressure
difference is extremely small, as is the difference between the measured densities of the two structures
(see fig. 4.1 and ref. [70]). Therefore, we expect the magnitude of g to be close to that of the entropy
density s.
4.6 The close-packed limit
As well as simulating at constant density (~ < 1:0) or at finite pressure, it is also possible to simulate the
close-packed limit (~ ! 1:0, ~p ! 1) directly. A full treatment of this limit can be found in reference
[72], and only an outline of the crucial elements is supplied here. In this limit, the configurational
integral (eqn. 2.8) may be rewritten as the product of two terms:
Z(N;V; ) = Z0(N;V )Zcp(N;) : (4.17)
The first term is defined by
Z0(N;V ) =
1
3N
; (4.18)
with
= 1 3
p
~ : (4.19)
The contribution to the entropy due to this term diverges in the close-packed limit, but is independent
of the phase. In this limit, the sphere separation becomes infinitely small (embodied in ), the sphere
surfaces are essentially flat (fig. 4.8(a)), and the system can be visualised as a set of hard dodecahedra2
(fig. 4.8(b)).
Clearly, then, in this limit only the parallel separation between neighbouring particles can be of impor-
tance, because any finite perpendicular displacement will be insignificant on the scale of the particle
size. The difference between the displacements of every pair of particles from their associated lattice
sites (~uij = ~uj ~ui) is re-expressed in terms of the parallel and perpendicular components of that vector
with respect to the direction of each nearest-neighbour site (fig. 4.8(a)),
~uij = u
k
ijn^
ij + ~u
?
ij : (4.20)
The unit vector n^ij describes the direction from a lattice-site i to a nearest-neighbour site j, and therefore
the system is dependent on the phase () via the geometry of the nearest-neighbour environment. The
2Which should be thought of as large in comparison with their separation.
Figure 4.8: The close-packed limit. (a) As the mean separation of the spheres tends to zero, their
curvature becomes effectively negligible and only the separation parallel to the nearest-neighbour
vectors is important. (b) This limit can be thought of as a system of hard dodecahedra, or in two
dimensions, a system of hard hexagons.
displacement coordinates are then rescaled by an -dependent factor, producing the following expression
for the second contribution to the configurational entropy (eqn. 4.17),
Zcp(N;) =
Y
i
Z
d~ui
nnY
hi;ji
cp(u
k
ij) [1 +O()] ; (4.21)
where the particle interaction is now described by
cp(u
k
ij) = 0 if u
k
ij < 1; (4.22)
= 1 otherwise. (4.23)
Note that all length-scales have now been removed from the problem (the diameter and the density no
longer feature in the calculation). However, the actual simulation algorithm remains essentially the same
as in the hard-sphere case, but the particle interactions are now made to obey eqn. 4.22; an ‘overlapping’
pair of particles is identified as those for which ukij < 1.
4.7 Polydispersity
Simulations were also performed for a simple model of a polydisperse hard-sphere system, where each
particle now has a unique diameter. The diameter of each sphere, i, is drawn randomly from a uniform
distribution over the range (1 2 ) < i < (1 +
2 ). The algorithm is altered so as to implement
the modified interaction potential:
(rij) = 1 if rij <
i + j
2
; (4.24)
= 0 otherwise. (4.25)
However, it is not necessary to apply the periodic boundary conditions to the particle positions them-
selves if the calculation of the particle-particle interactions take them fully into account. That is, when
determining the particle-particle separation (~rij) required by the pair-potential calculation, the bound-
ary conditions are applied to the components of rij (x, y and z) so that the nearest image of the
‘neighbouring’ particle is always used. Originally [1], this was done using the ‘standard’ formula [34]
a! a Laint
2
a
La
; (4.27)
where a refers to each of the x, y and z directions, and La refers the side-length of the simulation cell
in that direction. Unfortunately, this scheme fails when particles are not constrained to stay within the
same simulation cell-image [34, p. 326], because the separation between two particles may become
greater than the cell-size (see fig. 4.10(b)). If this happens, then eqn. 4.27 will calculate the position
of the second nearest image instead of the first. This can mean that neighbouring spheres do not ‘see’
each other, and so pass through one another. When using an (unbounded) random-walk algorithm, this
mechanism lets the particles move far from their lattice sites, and causes the simulation to break down
in a manner that is almost indistinguishable from the crystal actually melting.3 However, this flaw can
be remedied by using the following sequence of coordinate transformations
a ! a Laint
a
La
a ! a Laint
2
a
La
; (4.28)
which will always calculate the position of the first-nearest image, whichever simulation-cell image(s)
the particles lie in.
4.8.2 Truncated interactions
To implement the calculation of the hard-sphere configurational energy (eqn. 2.2), the simulation should
calculate the inter-particle potential energy for all distinct pairs of particles. However, this order N2
calculation is extremely computationally expensive and indeed, unnecessary. In a short-ranged and
strongly-caged system such as a hard-sphere crystal, each particle can usually be associated with the
same lattice-site for the entire duration of even a long simulation.4 Thus, particles which are separated
by more than a few lattice-spacings will never interact, and so this interaction need not be included in
the configurational-energy summation.
3This unphysical ‘melting’ was occasionally observed during the research published in [1], and a failure to trace this to
the boundary conditions forced the original authors to use the top-hat move-generation scheme. This kind of algorithm forces
particles to stay near their associated sites, and so prevents the ‘melting’ process.
4As long as the centre-of-mass motion is taken into account (see x4.8.4).
For a crystalline system, it is easiest to phrase the truncation of the interaction in terms of the number
of nearest-neighbour shells for which the interaction is calculated. For the fcc and hcp lattices there
are 12 nearest-neighbours (at a distance of 1rnn, see fig. 4.3), and 6 second-nearest neighbours (at
p
2rnn). As these hard particles only interact upon touching, it should be sufficient to include just these
18 neighbour particles in the summation; more distant particles are extremely unlikely to touch one
another. Nevertheless, such approximations should not be made carelessly, and so the consistency of
this approximation was tested using the direct-difference method of Bennett [25] (see x2.4.3) to evaluate
the free-energy difference between two different ranges of truncation. Although the direct-difference
approach is not generally statistically reliable, we can reasonably expect the particle configurations
associated with two different levels of truncation to be very similar and so it should be possible to
apply this technique more or less as it stands. Having said that, the binary nature of the hard-sphere
potential mean that it makes more sense to re-express the Bennett algorithm in terms of the number of
configurations associated with a given set of constraints.
We use
(N;V; ; C) to denote the number of configurations of a structure , composed of N hard-
spheres in a volume V , that satisfy a constraint C. We choose to consider a ‘liberal’ constraint that
allows 2nd nearest neighbours to overlap (Cl), and a tighter constraint that disallows this (Ct). The
entropy difference between the two systems operating under constraints Ct and Cl can be written as
sc(N;V; ) =
1
N
[S(N;V; ; Ct) S(N;V; ; Cl)]
=
1
N
[ln
(N;V; ; Ct) ln
(N;V; ; Cl)]
=
1
N
ln
(N;V; ; Ct)
(N;V; ; Cl)
: (4.29)
Now, under the assumption that the set of configurations that satisfy the tight constraint Ct form a
consistent subset of the configurations under the looser constraint Cl, we may write
P (CtjN;V;Cl) =
(N;V; ; Ct)
(N;V; ; Cl)
=
(CtjN;V; ; Cl)
(N;V; ; Cl)
: (4.30)
By consistent we mean that the subset of the configurations visited under the looser constraint which
satisfy the tighter constraint must be identical to the configurations which would be explored using only
the tighter constraint. This assumption is illustrated schematically in fig. 4.11, and in mathematical
terms corresponds to the statement that
(CtjN;V; ; Cl)
(N;V; ; Ct) : (4.31)
As long as this assumption holds, then the probability of satisfying the tighter constraint (eqn. 4.30) can
be estimated as,
P (CtjN;V;Cl)
EB
=
Nt
Nl
; (4.32)
Figure 4.11: Configuration-space overlaps of the tight and liberal constraint mi-
crostates. (a) In our calculation, we assume that the tighter constraint forms a consis-
tent (unique) subset of the more loosely constrained system. (b) If isolated fragments
of configuration space associated the tight-constraint microstates can be explored as
the loosely constrained system evolves, the calculation will over-estimate the size of
the Ct configuration space, and so underestimate the free-energy difference between
the two systems.
whereNt andNl are the number of configurations satisfying constraints t and l which have been gener-
ated during a simulation which demands only that the looser constraint Cl is obeyed. To determine Nt
andNl, we simply count the number of microstates satisfying each constraint generated by a simulation
using ‘ghost’ 2nd nearest-neighbour interactions (see fig. 4.12). The entropy difference,
sc(N;V; ) =
1
N
ln
Nt
Nl
; (4.33)
can be calculated for both structures ( = fcc and = hcp), and this difference between sc(N;V; fcc)
and sc(N;V; hcp) can be used to estimate free-energy cost associated with the truncation of the inter-
action with respect to the overall free-energy difference.
Figure 4.12: Simple illustrations of the two types of configuration to be counted in
the estimation of the truncation error. When all particles satisfy the (tighter) constraint
that no 2nd neighbours should overlap (left), it is counted as a type t (tight-constraint
microstate. However, if any particle in the system is overlapping one of its ‘ghost’
second nearest-neighbours (right), then the configuration is counted as a type l (liberal-
constraint) microstate.
4.8.3 Monte Carlo moves
The most basic MC move simply concerns changing the individual sphere displacements ~ui. All of the
different move generation schemes described in x3.2.1 were tested, and the results of this comparison
are presented in x4.10.1. However, before moving on to the other classes of MC move, there is one issue
concerning the implementation of all of the move-generation schemes which should be mentioned here.
In order to carry out the multicanonical weighting, it is necessary to know the current and trial values of
the weight function, and the algorithm must therefore ‘know’ the value of the order parameterM at all
times. To minimise the time spent calculating how a change in the displacement of one of the spheres
affects the value ofM, we used a ‘local’ order parameter array. This array stores information on which
neighbours currently overlap with each sphere in the conjugate phase. The changes to this array due to
each change in the sphere displacements are calculated and stored, so that the associated change inM
can be determined. Note also that for the random-walk algorithm, checks were put in place to test for
the breakdown of the ‘temporal cut-off’ (described in x4.3). At regular intervals, the distance from each
particle to its neighbouring sites was calculated. If any particle was found to be nearer another particle’s
lattice site that its own, the simulation run would be automatically halted.
In the canonical ensemble the only other type of Monte Carlo move is the lattice switch. After every
sphere move, the value ofM is checked. If we are in the gateway macrostate (M = 0), then the lattice
switch move is accepted.
In the isothermal-isobaric ensemble, the simulation implements volume dilations as well as particle and
lattice-switch moves. The unconstrained aspect-ratio algorithm (UVM, see x3.2.2) was used for all of
the work presented in this chapter. Note that the value ofM depends on the size of our simulation cell as
well as the local sphere displacements, and so the global nature of a volume move requires that both the
global overlap order parameterM and the local overlap array be re-calculated. Therefore, the volume
moves are computationally expensive, and so attempts to change the volume of the simulation cell were
performed only once per sweep (on average). The volume-move parameter (l) was chosen such that
the autocorrelation time ofM was at a minimum (x4.10.1).
4.8.4 The centre of mass
When implementing the spatial cut-off as a means of associating a configuration with a lattice, it is
important to understand the role of the centre-of-mass diffusion. Over time, the entire simulation cell
will diffuse through space, and so the system will ‘drift into the walls’ of any spatial cut-off that is
associated with the lattice. To investigate the importance of this effect the simulation code was designed
imise the efficiency of the simulation, we minimise the autocorrelation time ofM with respect to this
parameter. As can be seen from fig. 4.13, the minimum in the autocorrelation time of the order parameter
occurs at a RW acceptance rate of about 30%, corresponding to a maximum step size of r = 0:13
at our chosen density (~ = 0:7778). As a convenient scale on which to consider ‘small’ lengths such as
r, we define the minimum sphere separation,
= (~ 1=3 1) ; (4.37)
which is the shortest distance between two sphere surfaces in the perfect-crystal configuration. For the
chosen density, the value of is 0:09, and so the best RW move parameter is 1:4 times greater than
the minimum sphere separation.
In the NPT ensemble, a range of pressures were explored, and the density for each pressure was deter-
mined from the simulation data. A simple linear interpolation was then used to estimate the pressure for
which the corresponding density is ~ 0:7778. This produced a value for the pressure of ~p = 18:74,
for which the optimum value of l was found to be 0.005, corresponding to an acceptance ratio of
approximately 50%.
For the simulation of the close-packed limit, the optimum value of r was found to be 0:5, and as
in the finite-pressure case, the optimal acceptance rate was approximately 30%. For the polydisperse
system, exactly the same parameters were used as for the monodisperse NV T simulations.
4.10.2 Neighbour-interaction entropy difference
The entropy difference between a simulation of N = 63 hard-spheres with only first nearest-neighbour
interactions and the same simulation using both 1st & 2nd neighbours (sc, eqn. 4.33) is shown in figure
4.14. While only a small range of densities have been investigated, it is clear that as one moves away
from the melting density ( 0:733), the entropy difference diminishes very rapidly. In fact the value
of sc must tend to zero in the close-packed limit, as only 1st nearest-neighbours can possibly interact
when the particle separation becomes small.
However, the results shown in fig. 4.14 are not in agreement with the results presented in ref. [77]. For a
system of N = 83 hard-spheres, ref. [77] estimates the value of sc to be 8(2) 10 5 for both the fcc
and hcp structures when =cp = 0:739. In this work, the estimated entropy differences at that density
are an order of magnitude smaller; sc = 5:0(2) 10 6 for hcp and 6:0(2) 10 6 for fcc.
The reason for this disagreement is unclear, and insufficient information about the details of their calcu-
lation is supplied by the authors of ref. [77] to allow any firm conclusions to be drawn. If the estimates
presented here are incorrect, this must imply that the calculation’s central assumption (eqn. 4.31) has
N Phase Simulation Time [MCS] ~ of [70] ~ c=a
63 fcc 3:5 106 0.77757 0.7775(1) 1.6332(5)
63 hcp 3:5 106 0.77753 0.7776(1) 1.6323(7)
123 fcc 9 5 106 0.77757 0.7773(1) 1.6333(3)
123 hcp 9 5 106 0.77753 0.7770(3) 1.6332(3)
Table 4.1: Densities and c=a-ratios measured in the NPT ensemble, at ~p = 18:74, for the
fcc and hcp crystals using two different system sizes, along with the density predicted by the
equations of state from ref. [70]. The N = 123 single phase information was determined from
a LS double-phase simulation by reweighting each volume measurement so as to unfold the
bias introduced by the MCMC weight function.
in ref [70]. Our results agree with the predictions from the equations of state given in ref. [70], but the
densities have not been determined accurately enough to resolve the difference between the fcc and hcp
densities. Table 4.1 also shows the value of the c=a ratio measured during these simulations. None of
these measurements is significantly different from the ideal value (c=a = 1:63299). Evidently, whatever
difference there is between the true values of these quantities in each of the phases is not easily resolved
using these methods.
Figure 4.15: Measurements of uk and h[u?]2i in the close-packed limit, for N = 63 hard
dodecahedra arranged on an fcc lattice. (a) The distribution P (uk), with a Gaussian fitted
curve (solid line). The errors are significantly smaller that the symbol-size. (b) Plot of the
average value of [u?]2 as a function of uk. The value of this property at uk = 1 is in agreement
with the value given by ref. [70] (solid line, denoted as [RJS]).
The close-packed limit simulations were tested in two ways, based on the distribution of uk and on the
value of h[u?]2i as a function of uk, and the results are shown in figure 4.15. As asserted in x4.6, the
value of P (uk) when uk = 1 should be 0.5, and this is found to be the case (P (uk) = 0:494(5)). Also,
the average value of [u?]2 at contact (uk = 1) was compared against the results provided in ref. [70].
Our measurements indicate that h[u?(uk = 1)]2i is 1:130(1), in agreement with the value of 1:131 given
in [70].
4.10.4 Variations of the lattice switch
The fundamental barrier to accepting the lattice-switch move is the number of overlapping spheres that
would be created by it. The average number of overlapping pairs of particles that would be created by
the switch in a normal importance sampling simulation is termed the ‘equilibrium overlap count’, and
it makes sense to use any freedom we have concerning the definition of the lattice switch to minimise
this number. As mentioned in x4.3.1, there are any number of different possible choices of lattice-switch
mapping. Here, a range of different site-site mappings have been compared, using the equilibrium
overlap count as an (inverse) measure of their efficiency. Table 4.2 shows the results for a variety of
mappings, chosen to expose the different factors that control the mapping efficiency. Mapping number 1
is the one described in fig. 4.6: the notation (0; ~t;+~t) signifies that the three pairs of planes (counting
from the top of fig. 4.6) are translated by 0, ~t and +~t respectively. A similar convention is used to
label mappings 2 and 3. In mapping 4 (‘random-plane’) a hcp configuration is generated by taking an
fcc configuration and re-stacking its close-packed planes in a random order, but in a hcp pattern. In
mapping 5 (‘random-site’) a hcp configuration is generated by mapping the particle displacements in an
fcc configuration randomly onto the sites of an hcp lattice.
mapping description effect m = hMi=N
1 (0; ~t;+~t) fcc! hcp 0.150(1)
2 (0; 2~t; 2~t) fcc! hcp 0.183(1)
3 (0; 3~t; 3~t) fcc! fcc 0.194(1)
4 random-plane fcc! hcp 0.373(2)
5 random-site fcc! hcp 0.820(3)
Table 4.2: The efficiency of different mappings (for N = 123 and ~ =
0:7778), as measured by the average number of overlaps (per sphere)
that they generate. See main text for details.
The random-site mapping (#5) shows the largest overlap count, which is perhaps unsurprising given that
this transformation preserves the least information about how neighbouring sphere were arranged in the
fcc phase. Using the random-plane mapping (#4) cuts the overlap count by a factor of (a little more
than) 2 with respect to the random-site transformation. This efficiency gain simply reflects the fact that
of the 6N possible overlaps between nearest neighbours, only the 3N associated with neighbours in
different (but adjacent) planes can now contribute. Mapping 3 simply generates one fcc configuration
from another (it is useful only because it is informative): its overlap count is cut by a further factor of
2. This reflects the fact that this mapping (as with mappings 1 and 2) moves the close-packed planes in
Figure 4.16: Weight evolution for N = 63 hard spheres, comparing the visited states
(VS) and transition probability (TP) methods. The points marked VS are the results
of the first 3 iterations of the visited-states algorithm, initiated from an fcc equilibrium
state. The points marked TP emerge from one application of the transition probability
method. The solid line shows a refined (usable) set of weights.
−0.4 −0.2 0 0.2 0.4d − d00
2
4
6
8
10
P(d)
FCCgatewayHCP
Figure 4.17: Distribution of the separation d between adjacent close-packed planes in a sys-
tem of N = 63 spheres at ~ = 0:7778, in the equilibrium hcp and fcc macrostates, and in the
gateway (M = 0) macrostate. The separation is measured with respect to the equilibrium
separation d0 and is expressed in units of the equilibrium sphere separation .
Sign up today - FREE
Mendeley saves you time finding and organizing research. Learn more
- All your research in one place
- Add and import papers easily
- Access it anywhere, anytime



