\sum_{i=1}^{N} \alpha_i k_i \qquad (2.17)
where the approximations $k_i$ are given by
$$
k_i = \Delta t\, f\Big(y_n + \sum_{j=1}^{N-1} \nu_{ij} k_j,\; t_n + \sum_{j=1}^{N-1} \nu_{ij}\,\Delta t\Big) \tag{2.18}
$$
and the parameters $\alpha_i$ and $\nu_{ij}$ are chosen to obtain an $N$th order method. Note that
this choice is usually not unique.
The most widely used Runge-Kutta algorithm is the fourth order method:
$$
\begin{aligned}
k_1 &= \Delta t\, f(t_n, y_n) \\
k_2 &= \Delta t\, f(t_n + \Delta t/2,\; y_n + k_1/2) \\
k_3 &= \Delta t\, f(t_n + \Delta t/2,\; y_n + k_2/2) \\
k_4 &= \Delta t\, f(t_n + \Delta t,\; y_n + k_3) \\
y_{n+1} &= y_n + \frac{k_1}{6} + \frac{k_2}{3} + \frac{k_3}{3} + \frac{k_4}{6} + O(\Delta t^5)
\end{aligned} \tag{2.19}
$$
in which two estimates at the intermediate point $t_n + \Delta t/2$ are combined with one
estimate at the starting point $t_n$ and one estimate at the end point $t_{n+1} = t_n + \Delta t$.
Exercise: check the order of these two Runge-Kutta algorithms.
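One way to approach the exercise numerically is sketched below. It implements the fourth order step for a scalar equation and compares the error at two step sizes. The test problem dy/dt = -y and the names rk4_step, integrate and decay are choices made for this illustration only. Halving the step should reduce the global error by roughly 2^4 = 16.

```python
import math

def rk4_step(f, t, y, dt):
    """One fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = dt * f(t, y)
    k2 = dt * f(t + dt / 2, y + k1 / 2)
    k3 = dt * f(t + dt / 2, y + k2 / 2)
    k4 = dt * f(t + dt, y + k3)
    return y + k1 / 6 + k2 / 3 + k3 / 3 + k4 / 6

def integrate(f, t0, y0, t_end, n_steps):
    """Apply n_steps equal RK4 steps from t0 to t_end."""
    dt = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = rk4_step(f, t, y, dt)
        t += dt
    return y

def decay(t, y):
    # Test problem (an assumption of this sketch): exact solution exp(-t).
    return -y

err_coarse = abs(integrate(decay, 0.0, 1.0, 1.0, 10) - math.exp(-1.0))
err_fine = abs(integrate(decay, 0.0, 1.0, 1.0, 20) - math.exp(-1.0))
ratio = err_coarse / err_fine   # close to 16 for a fourth order method
print(ratio)
```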
2.2 Integrating the classical equations of motion
The most common ordinary differential equations you will encounter are Newton's equations
for the classical motion of N point particles:
$$
m_i \frac{d\vec v_i}{dt} = \vec F_i(t, \vec x_1, \ldots, \vec x_N, \vec v_1, \ldots, \vec v_N) \tag{2.20}
$$
$$
\frac{d\vec x_i}{dt} = \vec v_i, \tag{2.21}
$$
where $m_i$, $\vec v_i$ and $\vec x_i$ are the mass, velocity and position of the $i$th particle and $\vec F_i$ the
force acting on this particle.
For simplicity in notation we will restrict ourselves to a single particle in one
dimension, before discussing applications to the classical few-body and many-body
problem. We again label the time steps $t_{n+1} = t_n + \Delta t$, and denote by $x_n$ and $v_n$ the
approximate solutions for $x(t_n)$ and $v(t_n)$ respectively. The accelerations are given by
$a_n = a(t_n, x_n, v_n) = F(t_n, x_n, v_n)/m$.
The simplest method is again the forward-Euler method
$$
\begin{aligned}
v_{n+1} &= v_n + a_n \Delta t \\
x_{n+1} &= x_n + v_n \Delta t,
\end{aligned} \tag{2.22}
$$
which is however unstable for oscillating systems, as can be seen in the Mathematica
notebook on the web page. For a simple harmonic oscillator the errors will increase
exponentially over time no matter how small the time step $\Delta t$ is chosen, and the
forward-Euler method should thus be avoided!
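The instability can be checked in a few lines. The sketch below assumes a harmonic oscillator with unit mass and unit spring constant, so that a = -x; the function name forward_euler_energy is chosen for this illustration. For this system each forward-Euler step multiplies the energy by exactly (1 + dt^2), so the energy grows for any dt > 0.

```python
def forward_euler_energy(dt, n_steps, x=1.0, v=0.0):
    """Integrate a = -x with forward Euler and return the final energy."""
    for _ in range(n_steps):
        a = -x
        # Both updates use the old values of x and v.
        x, v = x + v * dt, v + a * dt
    return 0.5 * v * v + 0.5 * x * x

e0 = 0.5                                      # exact (conserved) energy
e_coarse = forward_euler_energy(0.1, 1000)    # integrate up to t = 100
e_fine = forward_euler_energy(0.01, 10000)    # same final time, smaller step
print(e0, e_fine, e_coarse)
```

A smaller step slows the growth but does not remove it: over a fixed time interval the energy is multiplied by (1 + dt^2)^n with n = t/dt.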
For velocity-independent forces a surprisingly simple trick is sufficient to stabilize
the Euler method. Using the backward difference $v_{n+1} \approx (x_{n+1} - x_n)/\Delta t$ instead of a
forward difference $v_n \approx (x_{n+1} - x_n)/\Delta t$ we obtain the stable backward-Euler method:
$$
\begin{aligned}
v_{n+1} &= v_n + a_n \Delta t \\
x_{n+1} &= x_n + v_{n+1} \Delta t,
\end{aligned} \tag{2.23}
$$
where the new velocity $v_{n+1}$ is used in calculating the positions $x_{n+1}$.
A related stable algorithm is the mid-point method, using a central difference:
$$
\begin{aligned}
v_{n+1} &= v_n + a_n \Delta t \\
x_{n+1} &= x_n + \tfrac{1}{2}(v_n + v_{n+1}) \Delta t.
\end{aligned} \tag{2.24}
$$
Equally simple, but surprisingly of second order, is the leap-frog method, which is
one of the commonly used methods. It evaluates positions and velocities at different
times:
$$
\begin{aligned}
v_{n+1/2} &= v_{n-1/2} + a_n \Delta t \\
x_{n+1} &= x_n + v_{n+1/2} \Delta t.
\end{aligned} \tag{2.25}
$$
As this method is not self-starting, the Euler method is used for the first half step:
$$
v_{1/2} = v_0 + \tfrac{1}{2} a_0 \Delta t. \tag{2.26}
$$
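For velocity-dependent forces the second-order Euler-Richardson algorithm can be
used:
$$
\begin{aligned}
a_{n+1/2} &= a\left(x_n + \tfrac{1}{2} v_n \Delta t,\; v_n + \tfrac{1}{2} a_n \Delta t,\; t_n + \tfrac{1}{2} \Delta t\right) \\
v_{n+1} &= v_n + a_{n+1/2} \Delta t \\
x_{n+1} &= x_n + v_n \Delta t + \tfrac{1}{2} a_{n+1/2} \Delta t^2.
\end{aligned} \tag{2.27}
$$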
The most commonly used algorithm is the following form of the Verlet algorithm
(velocity Verlet):
$$
\begin{aligned}
x_{n+1} &= x_n + v_n \Delta t + \tfrac{1}{2} a_n (\Delta t)^2 \\
v_{n+1} &= v_n + \tfrac{1}{2}(a_n + a_{n+1}) \Delta t.
\end{aligned} \tag{2.28}
$$
It is third order in the positions and second order in the velocities.
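A minimal sketch of the velocity-Verlet update for a velocity-independent force follows. The harmonic oscillator a = -x with unit mass and the names velocity_verlet and energy are assumptions of this example. In contrast to the forward-Euler method, the energy error stays bounded (of order dt^2) instead of growing.

```python
def velocity_verlet(accel, x, v, dt, n_steps):
    """Advance (x, v) by n_steps velocity-Verlet steps; accel(x) is the acceleration."""
    a = accel(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = accel(x)                  # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

def energy(x, v):
    return 0.5 * v * v + 0.5 * x * x

x, v = velocity_verlet(lambda pos: -pos, 1.0, 0.0, 0.1, 1000)
drift = abs(energy(x, v) - 0.5)   # stays small even after 1000 steps
print(drift)
```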
2.3 Boundary value problems and shooting
So far we have considered only the initial value problem, where we specified both the
initial position and velocity. Another type of problem is the boundary value problem,
where instead of two initial conditions we specify one initial and one final condition.
Examples can be:
• We launch a rocket from the surface of the earth and want it to enter space (defined
  as an altitude of 100 km) after one hour. Here the initial and final positions are
  specified and the question is to estimate the required power of the rocket engine.
• We fire a cannon ball from ETH Hönggerberg and want it to hit the tower of the
  University of Zürich. The initial and final positions as well as the initial speed of
  the cannon ball are specified. The question is to determine the angle of the cannon
  barrel.
Such boundary value problems are solved by the shooting method, which should
be familiar to Swiss students from their army days. In the second example we guess an
angle for the cannon, fire a shot, and then iteratively adjust the angle until we hit our
target.
More formally, let us again consider a simple one-dimensional example, but instead of
specifying the initial position $x_0$ and velocity $v_0$ we specify the initial position $x(0) = x_0$
and the final position after some time $t$ as $x(t) = x_f$. To solve this problem we
1. guess an initial velocity $v_0 = \alpha$
2. define $x(t; \alpha)$ as the numerically integrated value for the final position as a
   function of $\alpha$
3. numerically solve the equation $x(t; \alpha) = x_f$
We thus have to combine one of the above integrators for the equations of motion
with a numerical root solver.
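The three steps can be sketched as follows for a deliberately simple case: a particle under constant gravity that should return to x_f = 0 after t = 2. The function names, the value g = 9.81, the step count and the use of a secant iteration as the root solver are all assumptions of this example. The problem is chosen so that the result can be checked against the analytic answer alpha = g t / 2.

```python
def final_position(alpha, x0=0.0, t_final=2.0, n_steps=200, g=9.81):
    """x(t; alpha): integrate dv/dt = -g starting from x0 with initial velocity alpha."""
    dt = t_final / n_steps
    x, v = x0, alpha
    for _ in range(n_steps):
        # Velocity-Verlet step; the acceleration is constant here.
        x += v * dt - 0.5 * g * dt * dt
        v -= g * dt
    return x

def shoot(x_f, alpha0, alpha1, tol=1e-10, max_iter=50):
    """Adjust the initial velocity by a secant iteration until x(t; alpha) = x_f."""
    f0 = final_position(alpha0) - x_f
    f1 = final_position(alpha1) - x_f
    for _ in range(max_iter):
        if abs(f1) < tol:
            break
        alpha0, alpha1 = alpha1, alpha1 - f1 * (alpha1 - alpha0) / (f1 - f0)
        f0, f1 = f1, final_position(alpha1) - x_f
    return alpha1

alpha = shoot(x_f=0.0, alpha0=0.0, alpha1=1.0)
print(alpha)   # analytic value: g * t / 2 = 9.81
```

For realistic forces (friction, position-dependent gravity) only final_position changes; the root-finding loop stays the same.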
2.4 Numerical root solvers
The purpose of a root solver is to find a solution (a root) to the equation
$$
f(x) = 0, \tag{2.29}
$$
or in general to a multi-dimensional equation
$$
\vec f(\vec x) = 0. \tag{2.30}
$$
Numerical root solvers should be well known from the numerics courses and we will
just review three simple root solvers here. Keep in mind that in any serious calculation
it is usually best to use a well optimized and tested library function over a hand-coded
root solver.
2.4.1 The Newton and secant methods
The Newton method is one of the best known root solvers; however, it is not guaranteed to
converge. The key idea is to start from a guess $x_0$, linearize the equation around that
guess
$$
f(x_0) + (x - x_0) f'(x_0) = 0 \tag{2.31}
$$
and solve this linearized equation to obtain a better estimate $x_1$. Iterating this procedure
we obtain the Newton method:
$$
x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{2.32}
$$
If the derivative $f'$ is not known analytically, it can be approximated by the difference
quotient through the last two iterates:
$$
f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}. \tag{2.33}
$$
Substituting this into the Newton method (2.32) we obtain the secant method:
$$
x_{n+1} = x_n - (x_n - x_{n-1}) \frac{f(x_n)}{f(x_n) - f(x_{n-1})}. \tag{2.34}
$$
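Both iterations are short enough to write down directly. The sketch below is an illustration only: the function names, the tolerance, the iteration cap and the test function f(x) = x^2 - 2 are assumptions, and a library routine remains preferable in serious calculations.

```python
def newton(f, df, x, tol=1e-12, max_iter=50):
    """Newton iteration; df is the analytic derivative of f."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

def secant(f, x_prev, x, tol=1e-12, max_iter=50):
    """Secant iteration; needs two starting values instead of a derivative."""
    f_prev, f_cur = f(x_prev), f(x)
    for _ in range(max_iter):
        step = (x - x_prev) * f_cur / (f_cur - f_prev)
        x_prev, x = x, x - step
        f_prev, f_cur = f_cur, f(x)
        if abs(step) < tol:
            return x
    raise RuntimeError("secant iteration did not converge")

def f(x):
    return x * x - 2.0

root_newton = newton(f, lambda x: 2.0 * x, 1.0)
root_secant = secant(f, 1.0, 2.0)
print(root_newton, root_secant)
```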
The Newton method can easily be generalized to higher dimensional equations, by
defining the matrix of derivatives
$$
A_{ij}(\vec x) = \frac{\partial f_i(\vec x)}{\partial x_j} \tag{2.35}
$$
to obtain the higher dimensional Newton method
$$
\vec x_{n+1} = \vec x_n - A^{-1} \vec f(\vec x_n), \tag{2.36}
$$
where $A$ is evaluated at $\vec x_n$.
If the derivatives $A_{ij}(\vec x)$ are not known analytically they can be estimated through finite
differences:
$$
A_{ij}(\vec x) = \frac{f_i(\vec x + h_j \vec e_j) - f_i(\vec x)}{h_j} \quad \text{with} \quad h_j \approx x_j \sqrt{\varepsilon}, \tag{2.37}
$$
where $\varepsilon$ is the machine precision (about $10^{-16}$ for double precision floating point
numbers on most machines).
2.4.2 The bisection method and regula falsi
Both the bisection method and the regula falsi require two starting values $x_0$ and $x_1$
surrounding the root, with $f(x_0) < 0$ and $f(x_1) > 0$, so that under the assumption of a
continuous function $f$ there exists at least one root between $x_0$ and $x_1$.
The bisection method performs the following iteration
1. define a mid-point $x_m = (x_0 + x_1)/2$.
2. if $\mathrm{sign} f(x_m) = \mathrm{sign} f(x_0)$ replace $x_0 \leftarrow x_m$, otherwise replace $x_1 \leftarrow x_m$
until a root is found.
The regula falsi works in a similar fashion:
1. estimate the function $f$ by a straight line from $x_0$ to $x_1$ and calculate the root of
   this linearized function: $x_2 = (f(x_0) x_1 - f(x_1) x_0)/(f(x_0) - f(x_1))$
2. if $\mathrm{sign} f(x_2) = \mathrm{sign} f(x_0)$ replace $x_0 \leftarrow x_2$, otherwise replace $x_1 \leftarrow x_2$
In contrast to the Newton method, both of these two methods will always find a
root.
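The two bracketing iterations can be sketched as below, assuming a continuous f with f(x0) < 0 < f(x1). The function names, tolerances and the test function f(x) = x^2 - 2 on [0, 2] are choices made for this example.

```python
def bisection(f, x0, x1, tol=1e-12):
    """Halve the bracket [x0, x1] until it is shorter than tol."""
    while abs(x1 - x0) > tol:
        xm = 0.5 * (x0 + x1)
        if f(xm) < 0:        # same sign as f(x0): root lies in [xm, x1]
            x0 = xm
        else:
            x1 = xm
    return 0.5 * (x0 + x1)

def regula_falsi(f, x0, x1, tol=1e-12, max_iter=200):
    """Replace one end of the bracket by the root of the connecting straight line."""
    f0, f1 = f(x0), f(x1)
    x2 = x0
    for _ in range(max_iter):
        x2 = (f0 * x1 - f1 * x0) / (f0 - f1)
        f2 = f(x2)
        if abs(f2) < tol:
            break
        if f2 < 0:
            x0, f0 = x2, f2
        else:
            x1, f1 = x2, f2
    return x2

def f(x):
    return x * x - 2.0

root_bisection = bisection(f, 0.0, 2.0)
root_falsi = regula_falsi(f, 0.0, 2.0)
print(root_bisection, root_falsi)
```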
2.4.3 Optimizing a function
These root solvers can also be used for finding an extremum (minimum or maximum)
of a function $f(\vec x)$, by looking for a root of
$$
\nabla f(\vec x) = 0. \tag{2.38}
$$
While this is efficient for one-dimensional problems, better algorithms exist for
multi-dimensional ones.
In the following discussion we assume, without loss of generality, that we want to
minimize a function. The simplest algorithm for a multi-dimensional optimization is
steepest descent, which always looks for a minimum along the direction of steepest
gradient: starting from an initial guess $\vec x_n$ a one-dimensional minimization is applied
to determine the value of $\lambda$ which minimizes
$$
f(\vec x_n + \lambda \nabla f(\vec x_n)) \tag{2.39}
$$
and then the next guess $\vec x_{n+1}$ is determined as
$$
\vec x_{n+1} = \vec x_n + \lambda \nabla f(\vec x_n). \tag{2.40}
$$
While this method is simple it can be very inefficient if the landscape of the
function $f$ resembles a long and narrow valley: the one-dimensional minimization will
mainly improve the estimate transverse to the valley but takes a long time to traverse
down the valley to the minimum. A better method is the conjugate gradient
algorithm, which approximates the function locally by a paraboloid and uses the minimum
of this paraboloid as the next guess. This algorithm can find the minimum of a long
and narrow parabolic valley in one iteration! For this and other, even better, algorithms
we recommend the use of library functions.
One final word of warning is that all of these minimizers will only find a local
minimum. Whether this local minimum is also the global minimum can never be
decided purely numerically. A necessary but never sufficient check is thus to start
the minimization not only from one initial guess but to try many initial points and
check for consistency in the minimum found.
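The slow progress of steepest descent in a narrow valley can be seen on a small model problem. The sketch below assumes the quadratic function f(x, y) = (x^2 + q y^2)/2, for which the line minimization along the gradient can be done in closed form; the function name, the starting points and the tolerance are choices made for this illustration.

```python
def steepest_descent(q, start, tol=1e-8, max_iter=10000):
    """Minimize f(x, y) = (x**2 + q*y**2)/2 by steepest descent with exact line search.

    Returns the final point and the number of iterations used.
    """
    px, py = start
    for it in range(max_iter):
        gx, gy = px, q * py                       # gradient of f
        g2 = gx * gx + gy * gy
        if g2 < tol * tol:
            return (px, py), it
        # lambda minimizing f(x + lambda * grad f); it is negative (downhill).
        lam = -g2 / (gx * gx + q * gy * gy)
        px, py = px + lam * gx, py + lam * gy
    return (px, py), max_iter

_, n_round = steepest_descent(1.0, (1.0, 1.0))        # circular valley
_, n_narrow = steepest_descent(100.0, (100.0, 1.0))   # long, narrow valley
print(n_round, n_narrow)
```

For the circular valley a single step reaches the minimum, while the narrow valley needs on the order of a thousand steps from this starting point.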
2.5 Applications
In the last section of this chapter we will mention a few interesting problems that can be
solved by the methods discussed above. This list is by no means complete and should
just be a starting point to get you thinking about which other interesting problems you
will be able to solve.
2.5.1 The one-body problem
The one-body problem was already discussed in some examples above and is well known
from the introductory classical mechanics courses. Here are a few suggestions that go
beyond the analytical calculations performed in the introductory mechanics classes:
Friction
Friction is very easy to add to the equations of motion by including a velocity-dependent
term such as:
$$
\frac{d\vec v}{dt} = \vec F - \gamma |\vec v|^2 \tag{2.41}
$$
While this term usually makes the problem impossible to solve analytically, you will see
in the exercise that this poses no problem for the numerical simulation.
Another interesting extension of the problem is adding the effects of spin to a thrown
ball. Spinning the ball causes the velocity of the airflow to differ on opposing sides. This in
turn leads to differing friction forces, and the trajectory of the ball curves. Again
the numerical simulation remains simple.
Relativistic equations of motion
It is equally simple to go from classical Newtonian equations of motion to Einstein's
equation of motion in the special theory of relativity:
$$
\frac{d\vec p}{dt} = \vec F \tag{2.42}
$$
where the main change is that the momentum $\vec p$ is no longer simply $m\vec v$ but now
$$
\vec p = \gamma m_0 \vec v \tag{2.43}
$$
where $m_0$ is the mass at rest of the body,
$$
\gamma = \sqrt{1 + \frac{|\vec p|^2}{m_0^2 c^2}} = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}, \tag{2.44}
$$
and $c$ the speed of light.
These equations of motion can again be discretized, for example in a forward-Euler
fashion, either by using the momenta and positions:
$$
\vec x_{n+1} = \vec x_n + \frac{\vec p_n}{\gamma m_0} \Delta t \tag{2.45}
$$
$$
\vec p_{n+1} = \vec p_n + \vec F_n \Delta t \tag{2.46}
$$
or using velocities and positions
$$
\vec x_{n+1} = \vec x_n + \vec v_n \Delta t \tag{2.47}
$$
$$
\vec v_{n+1} = \vec v_n + \frac{\vec F_n}{\gamma m_0} \Delta t \tag{2.48}
$$
The only change in the program is a division by $\gamma$, but this small change has large
consequences, one of which is that the velocity can never exceed the speed of light $c$.
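The speed limit is easy to observe with the momentum form of the update. The sketch below is one-dimensional and assumes a constant force and units with m0 = c = 1; the function name and parameter values are chosen for this example. The momentum grows without bound while the velocity approaches, but never reaches, c.

```python
import math

def relativistic_euler(force, dt, n_steps, m0=1.0, c=1.0):
    """Forward-Euler integration of dp/dt = F with p = gamma * m0 * v, starting at rest."""
    x, p = 0.0, 0.0
    for _ in range(n_steps):
        gamma = math.sqrt(1.0 + p * p / (m0 * m0 * c * c))
        x += p / (gamma * m0) * dt     # position update uses v = p / (gamma * m0)
        p += force * dt                # momentum update
    gamma = math.sqrt(1.0 + p * p / (m0 * m0 * c * c))
    return x, p, p / (gamma * m0)

x, p, v = relativistic_euler(force=1.0, dt=0.01, n_steps=10000)
print(p, v)   # p is about 100, v stays just below c = 1
```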
2.5.2 The two-body (Kepler) problem
While the generalization of the integrators for equations of motion to more than one
body is trivial, the two-body problem does not even require such a generalization in
the case of forces that depend only on the relative distance of the two bodies, such as
gravity. For the equations of motion
$$
m_1 \frac{d^2 \vec x_1}{dt^2} = \vec F(\vec x_2 - \vec x_1) \tag{2.49}
$$
$$
m_2 \frac{d^2 \vec x_2}{dt^2} = \vec F(\vec x_1 - \vec x_2) \tag{2.50}
$$
where $\vec F(\vec x_2 - \vec x_1) = -\vec F(\vec x_1 - \vec x_2)$, we can perform a transformation of coordinates to
center of mass and relative motion. The important relative motion gives a single body
problem:
$$
m \frac{d^2 \vec x}{dt^2} = \vec F(\vec x) = -\nabla V(|\vec x|), \tag{2.51}
$$
where $\vec x = \vec x_2 - \vec x_1$ is the relative position, $m = m_1 m_2/(m_1 + m_2)$ the reduced mass, and $V$ the
potential
$$
V(r) = -\frac{Gm}{r}. \tag{2.52}
$$
In the case of gravity the above problem is called the Kepler problem with a force
$$
\vec F(\vec x) = -Gm \frac{\vec x}{|\vec x|^3} \tag{2.53}
$$
and can be solved exactly, giving the famous solutions as either circles, ellipses,
parabolas or hyperbolas.
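As a quick numerical check, a circular Kepler orbit can be reproduced with the velocity-Verlet integrator. The sketch assumes two-dimensional motion and units in which the acceleration is simply -x/|x|^3, so that the circular orbit of radius 1 has speed 1 and period 2 pi; the function name and step count are choices made for this example.

```python
import math

def kepler_orbit(x, y, vx, vy, dt, n_steps):
    """Velocity Verlet for the acceleration a = -r/|r|**3 in the plane."""
    def accel(px, py):
        r3 = (px * px + py * py) ** 1.5
        return -px / r3, -py / r3

    ax, ay = accel(x, y)
    for _ in range(n_steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = accel(x, y)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
    return x, y, vx, vy

n = 2000
x, y, vx, vy = kepler_orbit(1.0, 0.0, 0.0, 1.0, 2.0 * math.pi / n, n)
print(x, y)   # after one period the particle is back near (1, 0)
```

Changing the initial speed gives ellipses (below escape speed) or open orbits, and extra force terms can be added inside accel.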
Numerically we can easily reproduce these orbits but can again go further by adding
terms that make an analytical solution impossible. One possibility is to consider a
satellite in orbit around the earth and add friction due to the atmosphere. We can
calculate how the satellite spirals down to earth and crashes.
Another extension is to consider effects of Einstein's theory of general relativity.
In a lowest order expansion its effect on the Kepler problem is a modified potential:
$$
V(r) = -\frac{Gm}{r} \left(1 + \frac{\vec L^2}{r^2}\right), \tag{2.54}
$$
where $\vec L = m \vec x \times \vec v$ is the angular momentum and a constant of motion. When plotting
the orbits including the extra $1/r^3$ term we can observe a rotation of the main axis of
the elliptical orbit. The experimental observation of this effect on the orbit of Mercury
was the first confirmation of Einstein's theory of general relativity.
2.5.3 The three-body problem
Next we go to three bodies and discuss a few interesting facts that can be checked by
simulations.
Stability of the three-body problem
Stability, i.e. that a small perturbation of the initial condition leads only to a small
change in orbits, is easy to prove for the Kepler problem. There are 12 degrees of
freedom (6 positions and 6 velocities), but 11 integrals of motion:
• total momentum: 3 integrals of motion
• angular momentum: 3 integrals of motion
• center of mass: 3 integrals of motion
• energy: 1 integral of motion
• Lenz vector: 1 integral of motion
There is thus only one degree of freedom, the initial position on the orbit, and stability
can easily be shown.
In the three-body problem there are 18 degrees of freedom but only 10 integrals
of motion (no Lenz vector), resulting in 8 degrees of freedom for the orbits. Even
restricting the problem to planar motions in two dimensions does not help much: 12
degrees of freedom and 6 integrals of motion result in 6 degrees of freedom for the orbits.
Progress can be made only for the restricted three-body problem, where the mass
of the third body $m_3 \to 0$ is assumed to be too small to influence the first two bodies,
which are assumed to be on circular orbits. This restricted three-body problem has four
degrees of freedom for the third body and one integral of motion, the energy. For the
resulting problem with three degrees of freedom for the third body the famous KAM
(Kolmogorov-Arnold-Moser) theorem can be used to prove stability of moon-like orbits.
Lagrange points and Trojan asteroids
In addition to moon-like orbits, other (linearly) stable orbits are around two of the
Lagrange points. We start with two bodies on circular orbits and go into a rotating
reference frame in which these two bodies are at rest. There are then five positions, the
five Lagrange points, at which a third body is also at rest. Three of these are collinear
solutions and are unstable. The other two stable solutions form equilateral triangles.
Astronomical observations have indeed found a group of asteroids, the Trojan
asteroids on the orbit of Jupiter, 60 degrees before and behind Jupiter. They form an
equilateral triangle with the sun and Jupiter.
Numerical simulations can be performed to check how long bodies close to the perfect
location remain in stable orbits.
Kirkwood gaps in the rings of Saturn
Going farther away from the sun we next consider the Kirkwood gaps in the rings of
Saturn. Simulating a system consisting of Saturn, a moon of Saturn, and a very light
ring particle we find that orbits where the ratio of the period of the ring particle to that
of the moon is a rational number are unstable, while irrational ratios are stable.
The moons of Saturn
Saturn is also home to an even stranger phenomenon. The moons Janus and Epimetheus
share the same orbit of 151472 km, separated by only 50 km. Since this separation is
less than the diameter of the moons (ca. 100-150 km) one would expect that the moons
would collide.
Since these moons still exist something else must happen and indeed a simulation
clearly shows that the moons do not collide but instead switch orbits when they
approach each other!
2.5.4 More than three bodies
Having seen these unusual phenomena for three bodies we can expect even stranger
behavior for four or five bodies, and we encourage you to start exploring them with
your programs.
Especially noteworthy is that for five bodies there are extremely unstable orbits
that diverge in finite time: five bodies starting with the right initial positions and finite
velocities can be infinitely far apart, and flying with infinite velocities, after finite time!
For more information see http://www.ams.org/notices/199505/saari2.pdf
Chapter 3
Partial Differential Equations
In this chapter we will present algorithms for the solution of some simple but widely used
partial differential equations (PDEs), and will discuss approaches for general partial
differential equations. Since we cannot go into deep detail, interested students are
referred to the lectures on numerical solutions of differential equations offered by the
mathematics department.
3.1 Finite differences
As in the solution of ordinary differential equations the first step in the solution of a
PDE is to discretize space and time and to replace differentials by differences, using
the notation $x_n = n\Delta x$. We already saw that a first order differential $\partial f/\partial x$ can be
approximated in first order by
$$
\frac{\partial f}{\partial x} = \frac{f(x_{n+1}) - f(x_n)}{\Delta x} + O(\Delta x) = \frac{f(x_n) - f(x_{n-1})}{\Delta x} + O(\Delta x) \tag{3.1}
$$
or to second order by the symmetric version
$$
\frac{\partial f}{\partial x} = \frac{f(x_{n+1}) - f(x_{n-1})}{2\Delta x} + O(\Delta x^2). \tag{3.2}
$$
From these first order derivatives we can get a second order derivative as
$$
\frac{\partial^2 f}{\partial x^2} = \frac{f(x_{n+1}) + f(x_{n-1}) - 2f(x_n)}{\Delta x^2} + O(\Delta x^2). \tag{3.3}
$$
To derive a general approximation for an arbitrary derivative to any given order use
the ansatz
$$
\sum_{k=-l}^{l} a_k f(x_{n+k}), \tag{3.4}
$$
insert the Taylor expansion
$$
f(x_{n+k}) = f(x_n) + k\Delta x\, f'(x_n) + \frac{(k\Delta x)^2}{2} f''(x_n) + \frac{(k\Delta x)^3}{6} f'''(x_n) + \frac{(k\Delta x)^4}{24} f^{(4)}(x_n) + \ldots \tag{3.5}
$$
and choose the values of $a_k$ so that all terms but the desired derivative vanish.
As an example we give the fourth-order estimator for the second derivative
$$
\frac{\partial^2 f}{\partial x^2} = \frac{-f(x_{n-2}) + 16 f(x_{n-1}) - 30 f(x_n) + 16 f(x_{n+1}) - f(x_{n+2})}{12\Delta x^2} + O(\Delta x^4) \tag{3.6}
$$
and the second order estimator for the third derivative:
$$
\frac{\partial^3 f}{\partial x^3} = \frac{-f(x_{n-2}) + 2 f(x_{n-1}) - 2 f(x_{n+1}) + f(x_{n+2})}{2\Delta x^3} + O(\Delta x^2). \tag{3.7}
$$
Extensions to higher dimensions are straightforward, and these will be all the
differential quotients we will need in this course.
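The recipe of inserting the Taylor expansion into the ansatz and matching coefficients can be automated. The sketch below solves the resulting linear conditions in exact rational arithmetic; the function name fd_weights and the use of Gauss-Jordan elimination are choices made for this illustration. The weights are returned in units of 1/dx^m, for the points k = -l, ..., l.

```python
from fractions import Fraction
from math import factorial

def fd_weights(l, m):
    """Weights a_k (k = -l..l) approximating the m-th derivative, in units of 1/dx**m."""
    ks = list(range(-l, l + 1))
    n = len(ks)
    # Taylor condition for power p: sum_k a_k * k**p / p! equals 1 if p == m, else 0.
    rows = [[Fraction(k ** p, factorial(p)) for k in ks] + [Fraction(int(p == m))]
            for p in range(n)]
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [value / rows[col][col] for value in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

second = fd_weights(2, 2)   # five-point second derivative
third = fd_weights(2, 3)    # five-point third derivative
print(second, third)
```

The five-point results are (-1, 16, -30, 16, -1)/12 for the second derivative and (-1, 2, 0, -2, 1)/2 for the third derivative.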
3.2 Solution as a matrix problem
By replacing differentials by differences we convert the (non-)linear PDE to a system
of (non-)linear equations. The first example to demonstrate this is determining an
electrostatic or gravitational potential $\Phi$ given by the Poisson equation
$$
\nabla^2 \Phi(\vec x) = -4\pi\rho(\vec x), \tag{3.8}
$$
where $\rho$ is the charge or mass density respectively and units have been chosen such that
the coupling constants are all unity.
Discretizing space we obtain the system of linear equations
$$
\begin{aligned}
& \Phi(x_{n+1}, y_n, z_n) + \Phi(x_{n-1}, y_n, z_n) \\
& + \Phi(x_n, y_{n+1}, z_n) + \Phi(x_n, y_{n-1}, z_n) \\
& + \Phi(x_n, y_n, z_{n+1}) + \Phi(x_n, y_n, z_{n-1}) \\
& - 6\Phi(x_n, y_n, z_n) = -4\pi\rho(x_n, y_n, z_n)\Delta x^2,
\end{aligned} \tag{3.9}
$$
where the density $\rho(x_n, y_n, z_n)$ is defined to be the average density in the cube with
linear extension $\Delta x$ around the point $(x_n, y_n, z_n)$.
The general method to solve a PDE is to formulate this linear system of equations
as a matrix problem and then to apply a linear equation solver to solve the system of
equations. For small linear problems Mathematica can be used, or the dsysv function
of the LAPACK library.
For larger problems it is essential to realize that the matrices produced by the
discretization of PDEs are usually very sparse, meaning that only $O(N)$ of the $N^2$
matrix elements are nonzero. For these sparse systems of equations, optimized iterative
numerical algorithms exist [1] and are implemented in numerical libraries such as the
ITL library [2].
[1] R. Barrett, M. Berry, T.F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C.
Romine, and H. van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for
Iterative Methods (SIAM, 1993)
[2] J.G. Siek, A. Lumsdaine and Lie-Quan Lee, Generic Programming for High Performance Numerical
Linear Algebra, in Proceedings of the SIAM Workshop on Object Oriented Methods for Interoperable
Scientific and Engineering Computing (OO'98) (SIAM, 1998); the library is available on the web at:
http://www.osl.iu.edu/research/itl/
This is the most general procedure and can be used in all cases, including boundary
value problems and eigenvalue problems. The PDE eigenvalue problem maps to a
matrix eigenvalue problem, and an eigensolver needs to be used instead of a linear
solver. Again there exist efficient implementations [3] of iterative algorithms for sparse
matrices [4].
For non-linear problems iterative procedures can be used to linearize them, as we
will discuss below.
Instead of this general and flexible but brute-force method, many common PDEs
allow for optimized solvers that we will discuss below.
3.3 The relaxation method
For the Poisson equation a simple iterative method exists that can be obtained by
rewriting the above equation as
$$
\begin{aligned}
\Phi(x_n, y_n, z_n) = {} & \frac{1}{6} [\Phi(x_{n+1}, y_n, z_n) + \Phi(x_{n-1}, y_n, z_n) + \Phi(x_n, y_{n+1}, z_n) \\
& + \Phi(x_n, y_{n-1}, z_n) + \Phi(x_n, y_n, z_{n+1}) + \Phi(x_n, y_n, z_{n-1})] \\
& + \frac{2\pi}{3} \rho(x_n, y_n, z_n) \Delta x^2.
\end{aligned} \tag{3.10}
$$
The potential is just the average over the potential on the six neighboring sites plus
a term proportional to the density $\rho$.
A solution can be obtained by iterating equation 3.10:
$$
\begin{aligned}
\Phi(x_n, y_n, z_n) \leftarrow {} & \frac{1}{6} [\Phi(x_{n+1}, y_n, z_n) + \Phi(x_{n-1}, y_n, z_n) + \Phi(x_n, y_{n+1}, z_n) \\
& + \Phi(x_n, y_{n-1}, z_n) + \Phi(x_n, y_n, z_{n+1}) + \Phi(x_n, y_n, z_{n-1})] \\
& + \frac{2\pi}{3} \rho(x_n, y_n, z_n) \Delta x^2.
\end{aligned} \tag{3.11}
$$
This iterative solver will be implemented in the exercises for two examples:
1. Calculate the potential between two concentric metal squares of size $a$ and $2a$.
   The potential difference between the two squares is $V$. Starting with a potential
   0 on the inner square, $V$ on the outer square, and arbitrary values in-between,
   a two-dimensional variant of equation 3.11 is iterated until the differences drop
   below a given threshold. Since there are no charges the iteration is simply:
$$
\Phi(x_n, y_n) \leftarrow \frac{1}{4} [\Phi(x_{n+1}, y_n) + \Phi(x_{n-1}, y_n) + \Phi(x_n, y_{n+1}) + \Phi(x_n, y_{n-1})]. \tag{3.12}
$$
2. Calculate the potential of a distribution of point charges: starting from an
   arbitrary initial condition, e.g. $\Phi(x_n, y_n, z_n) = 0$, equation 3.11 is iterated until
   convergence.
[3] http://www.compphys.org/software/ietl/
[4] Z. Bai, J. Demmel and J. Dongarra (Eds.), Templates for the Solution of Algebraic Eigenvalue
Problems: A Practical Guide (SIAM, 2000).
Since these iterations are quite slow it is important to improve them by one of two
methods discussed below.
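A minimal sketch of the first example (two concentric squares, no charges) is given below. The grid size, the placement of the inner square on the central half of the grid, the stopping threshold and the choice to update values in place (so that new values are used as soon as they are available) are assumptions of this illustration, not prescriptions of the exercise.

```python
def relax_concentric_squares(n=17, v_outer=1.0, tol=1e-8, max_sweeps=10000):
    """Relax the potential between an inner square at 0 and an outer square at v_outer.

    Returns the potential on the n x n grid and the number of sweeps used.
    """
    lo, hi = n // 4, n - 1 - n // 4     # index range covered by the inner square

    def on_outer(i, j):
        return i in (0, n - 1) or j in (0, n - 1)

    def on_inner(i, j):
        return lo <= i <= hi and lo <= j <= hi

    phi = [[v_outer if on_outer(i, j) else 0.0 for j in range(n)] for i in range(n)]
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if on_inner(i, j):
                    continue            # conductor: potential stays fixed
                new = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                              + phi[i][j + 1] + phi[i][j - 1])
                change = max(change, abs(new - phi[i][j]))
                phi[i][j] = new
        if change < tol:
            return phi, sweep
    return phi, max_sweeps

phi, sweeps = relax_concentric_squares()
print(sweeps, phi[2][8])
```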
3.3.1 Gauss-Seidel Overrelaxation
Gauss-Seidel overrelaxation determines the change in potential according to equation
3.11 but then changes the value by a multiple of this proposed change:
$$
\begin{aligned}
\Delta\Phi(x_n, y_n, z_n) = {} & \frac{1}{6} [\Phi(x_{n+1}, y_n, z_n) + \Phi(x_{n-1}, y_n, z_n) + \Phi(x_n, y_{n+1}, z_n) \\
& + \Phi(x_n, y_{n-1}, z_n) + \Phi(x_n, y_n, z_{n+1}) + \Phi(x_n, y_n, z_{n-1})] \\
& + \frac{2\pi}{3} \rho(x_n, y_n, z_n) \Delta x^2 - \Phi(x_n, y_n, z_n) \\
\Phi(x_n, y_n, z_n) \leftarrow {} & \Phi(x_n, y_n, z_n) + w\, \Delta\Phi(x_n, y_n, z_n)
\end{aligned} \tag{3.13}
$$
with an overrelaxation factor of $1 < w < 2$. You can easily convince yourself, by
considering a single charge and initial values of $\Phi(x_n, y_n, z_n) = 0$, that choosing a value
$w \ge 2$ is unstable.
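The gain from overrelaxation can be measured by counting sweeps. The sketch below uses a two-dimensional charge-free problem on a square grid with one edge held at potential 1 and the others at 0; the grid size, the tolerance and the value w = 1.7 are assumptions of this example, and the best w depends on the grid.

```python
def sor_sweeps(w, n=21, tol=1e-8, max_sweeps=20000):
    """Count the sweeps needed until the largest update drops below tol."""
    phi = [[0.0] * n for _ in range(n)]
    phi[0] = [1.0] * n                  # one edge held at potential 1
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Proposed change: neighbor average minus current value.
                delta = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                + phi[i][j + 1] + phi[i][j - 1]) - phi[i][j]
                phi[i][j] += w * delta
                change = max(change, abs(w * delta))
        if change < tol:
            return sweep
    return max_sweeps

sweeps_plain = sor_sweeps(1.0)   # w = 1: no overrelaxation
sweeps_over = sor_sweeps(1.7)    # overrelaxed
print(sweeps_plain, sweeps_over)
```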
3.3.2 Multigrid methods
Multigrid methods dramatically accelerate the convergence of many iterative solvers.
We start with a very coarse grid spacing $\Delta x = \Delta x_0$ and iterate
• solve the Poisson equation on the grid with spacing $\Delta x$
• refine the grid $\Delta x \leftarrow \Delta x/2$
• interpolate the potential at the new grid points
and repeat until the desired final fine grid spacing $\Delta x$ is reached.
Initially convergence is fast since we have a very small lattice. In the later steps
convergence remains fast since we always start with a very good guess.
3.4 Solving time-dependent PDEs by the method of lines
3.4.1 The diffusion equation
Our next problem will include a first order time-derivative, with a partial differential
equation of the form
$$
\frac{\partial f(\vec x, t)}{\partial t} = F(f, t) \tag{3.14}
$$
where $F$ contains only spatial derivatives and the initial condition at time $t_0$ is given by
$$
f(\vec x, t_0) = u(\vec x). \tag{3.15}
$$
One common equation is the diffusion equation, e.g. for heat transport
$$
\frac{\partial T(\vec x, t)}{\partial t} = \frac{K}{C\rho} \nabla^2 T(\vec x, t) + \frac{1}{C\rho} W(\vec x, t) \tag{3.16}
$$
where $T$ is the temperature, $C$ the specific heat, $\rho$ the density and $K$ the thermal
conductivity. External heat sources or sinks are specified by $W(\vec x, t)$.
This and similar initial value problems can be solved by the method of lines: after
discretizing the spatial derivatives we obtain a set of coupled ordinary differential
equations which can be evolved for each point along the time line (hence the name)
by standard ODE solvers. In our example we obtain, specializing for simplicity to the
one-dimensional case:
$$
\frac{\partial T(x_n, t)}{\partial t} = \frac{K}{C\rho\Delta x^2} [T(x_{n+1}, t) + T(x_{n-1}, t) - 2T(x_n, t)] + \frac{1}{C\rho} W(x_n, t). \tag{3.17}
$$
Using a forward Euler algorithm we finally obtain
$$
T(x_n, t + \Delta t) = T(x_n, t) + \frac{K\Delta t}{C\rho\Delta x^2} [T(x_{n+1}, t) + T(x_{n-1}, t) - 2T(x_n, t)] + \frac{\Delta t}{C\rho} W(x_n, t). \tag{3.18}
$$
This will be implemented in the exercises and used in the supercomputing examples.
3.4.2 Stability
Great care has to be taken in choosing appropriate values of $\Delta x$ and $\Delta t$, as too long
time steps $\Delta t$ immediately lead to instabilities. By considering the case where the
temperature is 0 everywhere except at one point it is seen immediately, like in the case
of overrelaxation, that a choice of $K\Delta t/C\rho\Delta x^2 > 1/2$ is unstable. A detailed analysis,
which is done e.g. in the lectures on numerical solutions of differential equations, shows
that this heat equation solver is only stable for
$$
\frac{K\Delta t}{C\rho\Delta x^2} < \frac{1}{4}. \tag{3.19}
$$
We see that, for this PDE with second order spatial and first order temporal derivatives,
it is not enough to just adjust $\Delta t$ proportional to $\Delta x$, but $\Delta t \propto O(\Delta x^2)$ is needed.
Here it is even more important to check for instabilities than in the case of ODEs!
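The single-hot-point argument can be tried out directly. In the sketch below r stands for the combination K dt/(C rho dx^2); the grid size, the number of steps, the absence of heat sources and the fixed temperature 0 at both ends are assumptions of this example. It compares a small and a too large value of r.

```python
def explicit_diffusion(r, n_points=21, n_steps=200):
    """Forward-Euler heat equation in one dimension without sources.

    Starts from a single hot point and returns the largest |T| after n_steps.
    """
    temp = [0.0] * n_points
    temp[n_points // 2] = 1.0
    for _ in range(n_steps):
        new = temp[:]                   # end points keep their value 0
        for i in range(1, n_points - 1):
            new[i] = temp[i] + r * (temp[i + 1] + temp[i - 1] - 2.0 * temp[i])
        temp = new
    return max(abs(t) for t in temp)

peak_small = explicit_diffusion(0.2)   # heat spreads out smoothly
peak_large = explicit_diffusion(0.6)   # oscillations grow without bound
print(peak_small, peak_large)
```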
3.4.3 The Crank-Nicolson method
The simple solver above can be improved by replacing the forward Euler method for
the time integration by a midpoint method:
$$
T(\vec x, t + \Delta t) = T(\vec x, t) + \frac{K\Delta t}{2C\rho} \left[\nabla^2 T(\vec x, t) + \nabla^2 T(\vec x, t + \Delta t)\right] + \frac{\Delta t}{2C\rho} \left[W(\vec x, t) + W(\vec x, t + \Delta t)\right]. \tag{3.20}
$$
Discretizing space and introducing the linear operator $A$ defined by
$$
A T(x_n, t) = \frac{K\Delta t}{C\rho\Delta x^2} [T(x_{n+1}, t) + T(x_{n-1}, t) - 2T(x_n, t)] \tag{3.21}
$$
to simplify the notation we obtain an implicit algorithm:
$$
(2 \cdot \mathbf{1} - A)\, \vec T(t + \Delta t) = (2 \cdot \mathbf{1} + A)\, \vec T(t) + \frac{\Delta t}{C\rho} \left[\vec W(t) + \vec W(t + \Delta t)\right], \tag{3.22}
$$
where $\mathbf{1}$ is the unit matrix and
$$
\vec T(t) = (T(x_1, t), \ldots, T(x_N, t)) \tag{3.23}
$$
$$
\vec W(t) = (W(x_1, t), \ldots, W(x_N, t)) \tag{3.24}
$$
are vector notations for the values of the temperature and heat source at each point.
In contrast to the explicit solver (3.18) the values at time $t + \Delta t$ are not given explicitly
on the right hand side but only as a solution to a linear system of equations. After
evaluating the right hand side, a linear equation still needs to be solved. This extra
effort, however, gives us greatly improved stability and accuracy.
Note that while we have discussed the Crank-Nicolson method here in the context
of the diffusion equation, it can be applied to any time-dependent PDE.
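In one dimension the linear system of the implicit step is tridiagonal and can be solved directly. The sketch below assumes no heat sources, temperature 0 at both ends, and writes r for K dt/(C rho dx^2); the function names and the hand-written tridiagonal elimination (Thomas algorithm) are choices made for this illustration, where a library solver would normally be used. It runs with r = 2, a step far too large for the explicit solver.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are not part of the matrix."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def crank_nicolson_step(temp, r):
    """One implicit step for the interior points; both boundary values are 0."""
    n = len(temp)
    padded = [0.0] + temp + [0.0]
    # Right-hand side (2*1 + A) T with A T_n = r * (T_{n+1} + T_{n-1} - 2 T_n).
    rhs = [2.0 * padded[i] + r * (padded[i + 1] + padded[i - 1] - 2.0 * padded[i])
           for i in range(1, n + 1)]
    # Matrix (2*1 - A): diagonal 2 + 2r, off-diagonals -r.
    return solve_tridiagonal([-r] * n, [2.0 + 2.0 * r] * n, [-r] * n, rhs)

temp = [0.0] * 19
temp[9] = 1.0                      # single hot point
for _ in range(200):
    temp = crank_nicolson_step(temp, r=2.0)
peak = max(abs(t) for t in temp)
print(peak)                        # decays instead of blowing up
```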
3.5 The wave equation
3.5.1 A vibrating string
Another simple PDE is the wave equation, which we will study for the case of a string
running along the x-direction and vibrating transversely in the y-direction:
$$
\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2}. \tag{3.25}
$$
The wave velocity $c = \sqrt{T/\mu}$ is a function of the string tension $T$ and the mass density
$\mu$ of the string.
As you can easily verify, analytic solutions of this wave equation are of the form
$$
y = f_+(x + ct) + f_-(x - ct). \tag{3.26}
$$
To solve the wave equation numerically we again discretize time and space in the
usual manner and obtain, using the second order difference expressions for the
second derivative:
$$
\frac{y(x_i, t_{n+1}) + y(x_i, t_{n-1}) - 2y(x_i, t_n)}{(\Delta t)^2} \approx c^2\, \frac{y(x_{i+1}, t_n) + y(x_{i-1}, t_n) - 2y(x_i, t_n)}{(\Delta x)^2}. \tag{3.27}
$$
This can be transformed to
$$
y(x_i, t_{n+1}) = 2(1 - \kappa^2)\, y(x_i, t_n) - y(x_i, t_{n-1}) + \kappa^2 [y(x_{i+1}, t_n) + y(x_{i-1}, t_n)], \tag{3.28}
$$
with $\kappa = c\Delta t/\Delta x$.
Again, we have to choose the values of $\Delta t$ and $\Delta x$ carefully. Surprisingly, for the
wave equation when choosing $\kappa = 1$ we obtain the exact solution without any error! To
check this, insert the exact solution (3.26) into the difference equation (3.27).
Decreasing both $\Delta x$ and $\Delta t$ does not increase the accuracy but only the spatial and
temporal resolution. This is a very special feature of the linear wave equation.
Choosing smaller time steps and thus $\kappa < 1$ there will be solutions propagating
faster than the speed of light, but since they decrease with the square of the distance
$r^{-2}$ this does not cause any major problems.
On the other hand, choosing a slightly larger time step and thus $\kappa > 1$ has
catastrophic consequences: these unphysical numerical solutions increase and diverge rapidly,
as can be seen in the Mathematica Notebook posted on the web page.
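The exactness at c dt/dx = 1 can also be checked numerically. The sketch below writes kappa for the ratio c dt/dx (a name chosen for the code) and assumes a string with fixed ends on the interval [0, 1], c = 1, and the standing wave sin(pi x) cos(pi c t) as the exact solution; the two starting time levels are taken from that exact solution.

```python
import math

def wave_step(y_prev, y_cur, kappa):
    """One time step of the discretized wave equation; both ends stay at y = 0."""
    k2 = kappa * kappa
    y_next = [0.0] * len(y_cur)
    for i in range(1, len(y_cur) - 1):
        y_next[i] = (2.0 * (1.0 - k2) * y_cur[i] - y_prev[i]
                     + k2 * (y_cur[i + 1] + y_cur[i - 1]))
    return y_next

n_points, c = 51, 1.0
dx = 1.0 / (n_points - 1)
dt = dx / c                                   # gives kappa = 1
xs = [i * dx for i in range(n_points)]

def exact(t):
    return [math.sin(math.pi * x) * math.cos(math.pi * c * t) for x in xs]

y_prev, y_cur = exact(0.0), exact(dt)         # two exact starting levels
n_steps = 200
for _ in range(n_steps):
    y_prev, y_cur = y_cur, wave_step(y_prev, y_cur, kappa=1.0)
max_error = max(abs(a - b) for a, b in zip(y_cur, exact((n_steps + 1) * dt)))
print(max_error)                              # only rounding errors remain
```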
3.5.2 More realistic models
Real strings and musical instruments cause a number of changes and complications to
the simple wave equation discussed above:
• Real strings vibrate in both the y and z direction, which is easy to implement.
• Transverse vibrations of the string cause the length of the string and consequently
  the string tension $T$ to increase. This leads to an increase of the velocity $c$ and
  the value of $\kappa$. The nice fact that $\kappa = 1$ gives the exact solution can thus no
  longer be used and special care has to be taken to make sure that $\kappa < 1$ even for
  the largest elongations of the string.
• Additionally there will be longitudinal vibrations of the string, with a much higher
  velocity $c_\parallel \gg c$. Consequently the time step for longitudinal vibrations $\Delta t_\parallel$ has
  to be chosen much smaller than for transverse vibrations. Instead of simulating
  both transverse and longitudinal vibrations with the same small time step $\Delta t_\parallel$
  one still uses the larger time step $\Delta t$ for the transverse vibrations but updates the
  transverse positions only every $\Delta t/\Delta t_\parallel$ iterations of the longitudinal positions.
• Finally the string is not in vacuum and infinitely long, but in air and attached to
  a musical instrument. Both the friction of the air and forces exerted by the body
  of the instrument cause damping of the waves and a modified sound.
For more information about applications to musical instruments I refer to the article
by N. Giordano in Computers in Physics 12, 138 (1998). This article also discusses
numerical approaches to the following problems
• How is sound created in an acoustic guitar, an electric guitar and a piano?
• What is the sound of these instruments?
• How are the strings set into motion in these instruments?
The simulation of complex instruments such as pianos still poses substantial unsolved
challenges.
3.6 The finite element method
3.6.1 The basic finite element method
While the finite difference method used so far is simple and straightforward for regular
mesh discretizations it becomes very hard to apply to more complex problems such as:
• spatially varying constants, such as spatially varying dielectric constants in the
  Poisson equation.
• irregular geometries such as airplanes or turbines.
• dynamically adapting geometries such as moving pistons.
In such cases the finite element method has big advantages over finite differences
since it does not rely on a regular mesh discretization. We will discuss the finite element
method using the one-dimensional Poisson equation
$$
\phi''(x) = -4\pi\rho(x) \tag{3.29}
$$
on the interval $[0, 1]$ with boundary conditions
$$
\phi(0) = \phi(1) = 0. \tag{3.30}
$$
The solution is expanded in terms of a set of basis functions $v_i$:
$$
\phi(x) = \sum_{i=1}^{\infty} a_i v_i(x). \tag{3.31}
$$
For our numerical calculation the infinite basis set needs to be truncated, choosing a
finite subset $\{u_i\}$, $i = 1, \ldots, N$ of $N$ linearly independent, but not necessarily orthogonal,
functions:
$$
\phi_N(x) = \sum_{i=1}^{N} a_i u_i(x). \tag{3.32}
$$
The usual choice are functions localized around some mesh points $x_i$, which in contrast
to the finite difference method do not need to form a regular mesh.
The coefficients $\vec a = (a_1, \ldots, a_N)$ are chosen to minimize the residual
$$
\phi_N''(x) + 4\pi\rho(x) \tag{3.33}
$$
over the whole interval. Since we can choose $N$ coefficients we can impose $N$ conditions
$$
0 = g_i \equiv \int_0^1 \left[\phi_N''(x) + 4\pi\rho(x)\right] w_i(x)\, dx, \tag{3.34}
$$
where the weight functions $w_i(x)$ are often chosen to be the same as the basis functions
$w_i(x) = u_i(x)$. This is called the Galerkin method.
In the current case of a linear PDE this results in a linear system of equations
$$
A \vec a = \vec b \tag{3.35}
$$
27
with
$$A_{ij} = -\int_0^1 u_i''(x)\, w_j(x)\,dx = \int_0^1 u_i'(x)\, w_j'(x)\,dx$$
$$b_i = 4\pi \int_0^1 \rho(x)\, w_i(x)\,dx, \qquad (3.36)$$
where in the first line we have used integration by parts to circumvent problems with
functions that are not twice differentiable.
A good and simple choice of local basis functions fulfilling the boundary conditions
(3.30) are local triangles centered over the points $x_i = i\Delta x$ with $\Delta x = 1/(N + 1)$:
$$u_i(x) = \begin{cases} (x - x_{i-1})/\Delta x & \text{for } x \in [x_{i-1}, x_i] \\ (x_{i+1} - x)/\Delta x & \text{for } x \in [x_i, x_{i+1}] \\ 0 & \text{otherwise} \end{cases} \qquad (3.37)$$
but other choices such as local parabolas are also possible.
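With the above choice we obtain
$$A_{ij} = -\int_0^1 u_i''(x)\, u_j(x)\,dx = \int_0^1 u_i'(x)\, u_j'(x)\,dx = \begin{cases} 2/\Delta x & \text{for } i = j \\ -1/\Delta x & \text{for } i = j \pm 1 \\ 0 & \text{otherwise} \end{cases} \qquad (3.38)$$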
and, choosing a charge density $\rho(x) = (\pi/4)\sin(\pi x)$,
$$b_i = 4\pi \int_0^1 \rho(x)\, u_i(x)\,dx = \frac{1}{\Delta x}\left(2\sin \pi x_i - \sin \pi x_{i-1} - \sin \pi x_{i+1}\right). \qquad (3.39)$$
In the one-dimensional case the matrix A is tridiagonal and efficient linear solvers for
this tridiagonal matrix can be found in the LAPACK library. In higher dimensions the
matrices will usually be sparse band matrices and iterative solvers will be the methods
of choice.
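The one-dimensional example can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions: the function names are hypothetical, and a hand-written Thomas algorithm stands in for a LAPACK tridiagonal solver. It assembles the matrix of Eq. (3.38) and the right-hand side of Eq. (3.39) and solves Eq. (3.35). For the chosen charge density the exact solution is sin(πx), so the coefficients can be compared with the values sin(πx_i) at the mesh points.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[i] multiplies x[i-1], upper[i] multiplies x[i+1];
    lower[0] and upper[-1] are not used in the solution.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def fem_poisson(n_basis):
    """Galerkin solution with triangle basis functions on [0, 1].

    Returns the interior mesh points x_i and the coefficients a_i.
    """
    dx = 1.0 / (n_basis + 1)
    nodes = [(i + 1) * dx for i in range(n_basis)]
    diag = [2.0 / dx] * n_basis    # A_ii, Eq. (3.38)
    off = [-1.0 / dx] * n_basis    # A_{i,i+1} = A_{i,i-1}, Eq. (3.38)
    rhs = [
        (2.0 * math.sin(math.pi * x)
         - math.sin(math.pi * (x - dx))
         - math.sin(math.pi * (x + dx))) / dx   # b_i, Eq. (3.39)
        for x in nodes
    ]
    return nodes, solve_tridiagonal(off, diag, off, rhs)
```

Since a triangle basis function equals one at its own mesh point and zero at all others, the coefficient a_i is directly the value of the approximate solution at x_i.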
3.6.2 Generalizations to arbitrary boundary conditions
Our example assumed boundary conditions $\phi(0) = \phi(1) = 0$. These boundary conditions
were implemented by ensuring that all basis functions $u_i(x)$ were zero on the
boundary. Generalizations to arbitrary boundary conditions $\phi(0) = \phi_0$ and $\phi(1) = \phi_1$
are possible either by adding additional basis functions that are non-zero at the boundary
or by starting from a generalized ansatz that automatically ensures the correct
boundary conditions, such as
$$\phi_N(x) = \phi_0 (1 - x) + \phi_1 x + \sum_{i=1}^{N} a_i u_i(x). \qquad (3.40)$$
3.6.3 Generalizations to higher dimensions
Generalizations to higher dimensions are done by
• creating higher-dimensional meshes
• and providing higher-dimensional basis functions, such as pyramids centered on
a mesh point.
While the basic principles remain the same, stability problems can appear at sharp
corners and edges and for time-dependent geometries. The creation of appropriate
meshes and basis functions is an art in itself, and an important area of industrial
research. Interested students are referred to advanced courses on the subject of finite
element methods.
3.6.4 Nonlinear partial differential equations
The finite element method can also be applied to nonlinear partial differential equations
without any big changes. Let us consider a simple example
$$\phi(x)\frac{d^2\phi}{dx^2}(x) = -4\pi\rho(x). \qquad (3.41)$$
Using the same ansatz, Eq. (3.32), as before and minimizing the residuals
$$g_i = \int_0^1 \left[\phi(x)\phi''(x) + 4\pi\rho(x)\right] w_i(x)\,dx \qquad (3.42)$$
as before, we now end up with a nonlinear equation instead of a linear equation:
$$\sum_{i,j} A_{ijk}\, a_i a_j = b_k \qquad (3.43)$$
with
$$A_{ijk} = -\int_0^1 u_i(x)\, u_j''(x)\, w_k(x)\,dx \qquad (3.44)$$
and $b_k$ defined as before.
The only difference between the case of linear and nonlinear partial differential
equations is that the former gives a set of coupled linear equations, while the latter
requires the solution of a set of coupled nonlinear equations.
Often, a Picard iteration can be used to transform the nonlinear problem into a
linear one. In our case we can start with a crude guess $\phi_0(x)$ for the solution and use
that guess to linearize the problem as
$$\phi_0(x)\frac{d^2\phi_1}{dx^2}(x) = -4\pi\rho(x) \qquad (3.45)$$
to obtain a better solution $\phi_1$. Replacing $\phi_0$ by $\phi_1$ and iterating the procedure by solving
$$\phi_n(x)\frac{d^2\phi_{n+1}}{dx^2}(x) = -4\pi\rho(x) \qquad (3.46)$$
for ever better solutions $\phi_{n+1}$, we converge to the solution of the nonlinear partial
differential equation by solving a series of linear partial differential equations.
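The Picard iteration can be illustrated with the following Python sketch. It rests on several assumptions of our own that are not part of the text: a finite difference discretization is used instead of finite elements to keep the code short, the boundary values are set to 1 rather than 0 so that the division by the previous iterate is well defined, and the source term 4πρ(x) = π² sin(πx)(1 + sin(πx)) is chosen such that the exact solution is 1 + sin(πx). All function names are hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def example_source(x):
    """4*pi*rho(x), chosen so that the exact solution is 1 + sin(pi x)."""
    s = math.sin(math.pi * x)
    return math.pi ** 2 * s * (1.0 + s)


def picard_poisson(source, n_points, boundary=1.0, tol=1e-10, max_iter=200):
    """Solve phi * phi'' = -source(x) on [0, 1] with phi(0) = phi(1) = boundary.

    Each iteration solves the linear problem phi_n * phi_{n+1}'' = -source.
    Returns the interior grid points, the solution and the iteration count.
    """
    dx = 1.0 / (n_points + 1)
    xs = [(i + 1) * dx for i in range(n_points)]
    phi = [boundary] * n_points          # crude initial guess phi_0(x)
    diag = [2.0] * n_points              # matrix of -dx^2 * d^2/dx^2
    off = [-1.0] * n_points
    for iteration in range(1, max_iter + 1):
        # linearized right-hand side: source / phi_n
        rhs = [dx * dx * source(x) / p for x, p in zip(xs, phi)]
        rhs[0] += boundary               # known boundary values
        rhs[-1] += boundary
        new_phi = solve_tridiagonal(off, diag, off, rhs)
        change = max(abs(a - b) for a, b in zip(new_phi, phi))
        phi = new_phi
        if change < tol:
            return xs, phi, iteration
    raise RuntimeError("Picard iteration did not converge")
```

Whether such an iteration converges depends on the problem and on the initial guess; for this particular example the positive source term keeps all iterates at or above the boundary value.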
3.7 Maxwell's equations
The last linear partial differential equations we will consider in this section are Maxwell's
equations for the electromagnetic field. We will first calculate the field created by a
single charged particle and then solve Maxwell's equations for the general case.
3.7.1 Fields due to a moving charge
The electric potential at the location $\vec{R}$ due to a single static charge q at the position
$\vec{r}$ can directly be written as
$$V(\vec{R}) = \frac{q}{|\vec{r} - \vec{R}|}, \qquad (3.47)$$
and the electric field calculated from it by taking the gradient $\vec{E} = -\nabla V$.
When calculating the fields due to moving charges one needs to take into account
that the electromagnetic waves only propagate with the speed of light. It is thus
necessary to find the retarded position
$$\vec{r}_{ret} = \vec{R} - \vec{r}(t_{ret}) \qquad (3.48)$$
and time
$$t_{ret} = t - \frac{r_{ret}(t_{ret})}{c} \qquad (3.49)$$
so that the distance $r_{ret}$ of the particle at time $t_{ret}$ was just $c(t - t_{ret})$. Given the path of the
particle this just requires a root solver. Next, the potential can be calculated as
$$V(\vec{R}, t) = \frac{q}{r_{ret}\left(1 - \hat{r}_{ret}\cdot\vec{v}_{ret}/c\right)} \qquad (3.50)$$
with the retarded velocity given by
$$\vec{v}_{ret} = \left.\frac{d\vec{r}(t)}{dt}\right|_{t=t_{ret}}. \qquad (3.51)$$
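The electric and magnetic field then work out as
$$\vec{E}(\vec{R}, t) = \frac{q\, r_{ret}}{(\vec{r}_{ret}\cdot\vec{u}_{ret})^3}\left[\vec{u}_{ret}\left(c^2 - v_{ret}^2\right) + \vec{r}_{ret}\times\left(\vec{u}_{ret}\times\vec{a}_{ret}\right)\right] \qquad (3.52)$$
$$\vec{B}(\vec{R}, t) = \hat{r}_{ret}\times\vec{E}(\vec{R}, t) \qquad (3.53)$$
with
$$\vec{a}_{ret} = \left.\frac{d^2\vec{r}(t)}{dt^2}\right|_{t=t_{ret}} \qquad (3.54)$$
and
$$\vec{u}_{ret} = c\,\hat{r}_{ret} - \vec{v}_{ret}. \qquad (3.55)$$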
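The root-finding step for the retarded time, Eq. (3.49), can be sketched as follows. This is a minimal illustration under stated assumptions: the particle path is passed in as a hypothetical Python function `path(t)` returning a position tuple, the particle is assumed to move slower than light (so that the root is unique), and simple bisection is used where any other root solver would do.

```python
import math


def retarded_time(path, observer, t, c=1.0, tol=1e-12):
    """Solve t_ret = t - |R - r(t_ret)| / c for t_ret by bisection.

    path     -- function t -> position of the charge (tuple of floats)
    observer -- position R at which the field is wanted
    """
    def g(t_ret):
        # g is positive for times earlier than the retarded time and
        # negative for later times (subluminal motion assumed)
        return t - t_ret - math.dist(observer, path(t_ret)) / c

    # Step back in time until the root is bracketed in [t - gap, t].
    gap = 1.0
    while g(t - gap) < 0.0:
        gap *= 2.0
    lo, hi = t - gap, t
    for _ in range(200):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With the retarded time known, the retarded position, velocity and acceleration follow by evaluating the path and its derivatives at that time, after which the potential and the fields can be computed from the formulas above.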
Figure 3.1: Definition of charges and currents for the Yee-Vischen algorithm (a cube with axes x, y, z and the currents j_x, j_y, j_z through its faces).
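3.7.2 The Yee-Vischen algorithm
For the case of a single moving charge solving Maxwell's equations just required a root
solver to determine the retarded position and time. In the general case of many particles
it will be easier to directly solve Maxwell's equations, which (setting $\epsilon_0 = \mu_0 = c = 1$)
read
$$\frac{\partial\vec{B}}{\partial t} = -\nabla\times\vec{E} \qquad (3.56)$$
$$\frac{\partial\vec{E}}{\partial t} = \nabla\times\vec{B} - 4\pi\vec{j} \qquad (3.57)$$
$$\frac{\partial\rho}{\partial t} = -\nabla\cdot\vec{j} \qquad (3.58)$$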
The numerical solution starts by dividing the volume into cubes of side length $\Delta x$,
as shown in figure 3.1, and defining by $\rho(\vec{x})$ the total charge inside the cube.
Next we need to define the currents flowing between cubes. They are most naturally
defined as flowing perpendicular to the faces of the cube. Defining as $j_x(\vec{x})$ the current
flowing into the box from the left, $j_y(\vec{x})$ the current from the front and $j_z(\vec{x})$ the current
from the bottom we can discretize the continuity equation (3.58) using a half-step
method
$$\rho(\vec{x}, t + \Delta t/2) = \rho(\vec{x}, t - \Delta t/2) - \frac{\Delta t}{\Delta x}\sum_{f=1}^{6} j_f(\vec{x}, t). \qquad (3.59)$$
The currents through the faces $j_f(\vec{x}, t)$ are defined as
$$j_1(\vec{x}, t) = -j_x(\vec{x}, t)$$
$$j_2(\vec{x}, t) = -j_y(\vec{x}, t)$$
$$j_3(\vec{x}, t) = -j_z(\vec{x}, t) \qquad (3.60)$$
$$j_4(\vec{x}, t) = j_x(\vec{x} + \Delta x\,\hat{e}_x, t)$$
$$j_5(\vec{x}, t) = j_y(\vec{x} + \Delta x\,\hat{e}_y, t)$$
$$j_6(\vec{x}, t) = j_z(\vec{x} + \Delta x\,\hat{e}_z, t).$$
Be careful with the signs when implementing this.
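Figure 3.2: Definition of electric field and its curl for the Yee-Vischen algorithm (face fields E_x, E_y, E_z; the four fields E_1, ..., E_4 around an edge define the curl component (∇×E)_z).
Figure 3.3: Definition of magnetic field and its curl for the Yee-Vischen algorithm (edge fields B_x, B_y, B_z; the four fields B_1, ..., B_4 define the curl component (∇×B)_y).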
Next we observe that equation (3.57) for the electric field $\vec{E}$ contains a term proportional
to the currents $\vec{j}$ and we define the electric field also perpendicular to the faces,
but offset by a half time step. The curl of the electric field, needed in equation (3.56), is
then most easily defined on the edges of the cube, as shown in figure 3.2, again taking
care of the signs when summing the electric fields through the faces around an edge.
Finally, by noting that equation (3.56) for the magnetic field contains terms proportional
to the curl of the electric field we also define the magnetic field on the edges of the
cubes, as shown in figure 3.3. We then obtain for the last two equations:
$$\vec{E}(\vec{x}, t + \Delta t/2) = \vec{E}(\vec{x}, t - \Delta t/2) + \frac{\Delta t}{\Delta x}\left[\sum_{e=1}^{4}\vec{B}_e(\vec{x}, t) - 4\pi\vec{j}(\vec{x}, t)\right] \qquad (3.61)$$
$$\vec{B}(\vec{x}, t + \Delta t) = \vec{B}(\vec{x}, t) - \frac{\Delta t}{\Delta x}\sum_{f=1}^{4}\vec{E}_f(\vec{x}, t + \Delta t/2) \qquad (3.62)$$
which are stable if $\Delta t/\Delta x \le 1/\sqrt{3}$.
3.8 Hydrodynamics and the Navier-Stokes equation
3.8.1 The Navier-Stokes equation
The Navier-Stokes equations are one of the most famous, if not the most famous, set of
partial differential equations. They describe the flow of a classical Newtonian fluid.
The first equation describing the flow of the fluid is the continuity equation, describing
conservation of mass:
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\vec{v}) = 0 \qquad (3.63)$$
where $\rho$ is the local mass density of the fluid and $\vec{v}$ its velocity. The second equation is
the famous Navier-Stokes equation describing the conservation of momentum:
$$\frac{\partial}{\partial t}(\rho\vec{v}) + \nabla\cdot\Pi = \rho\vec{g} \qquad (3.64)$$
where $\vec{g}$ is the force vector of the gravitational force coupling to the mass density, and
$\Pi_{ij}$ is the momentum tensor
$$\Pi_{ij} = \rho v_i v_j - \Gamma_{ij} \qquad (3.65)$$
with
$$\Gamma_{ij} = \eta\left[\partial_i v_j + \partial_j v_i\right] + \left[\left(\zeta - \frac{2\eta}{3}\right)\nabla\cdot\vec{v} - P\right]\delta_{ij}. \qquad (3.66)$$
The constants $\eta$ and $\zeta$ describe the shear and bulk viscosity of the fluid, and P is the
local pressure.
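The third and final equation is the energy transport equation, describing conservation
of energy:
$$\frac{\partial}{\partial t}\left(\rho\epsilon + \frac{1}{2}\rho v^2\right) + \nabla\cdot\vec{j}_e = 0 \qquad (3.67)$$
where $\epsilon$ is the local energy density, and the energy current is defined as
$$\vec{j}_e = \vec{v}\left(\rho\epsilon + \frac{1}{2}\rho v^2\right) - \vec{v}\cdot\Gamma - \kappa\nabla k_B T, \qquad (3.68)$$
where T is the temperature and $\kappa$ the heat conductivity.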
The only nonlinearity arises from the momentum tensor $\Pi_{ij}$ in equation (3.65).
In contrast to the linear equations studied so far, where we had nice and smoothly
propagating waves with no big surprises, this nonlinearity causes the fascinating and
complex phenomenon of turbulent flow.
Despite decades of research and the big importance of turbulence in engineering
it is still not completely understood. Turbulence causes problems not only in engineering
applications but also for the numerical solution, with all known numerical
solvers becoming unstable in highly turbulent regimes. It is then hard to distinguish
the chaotic effects caused by turbulence from chaotic effects caused by an instability
of the numerical solver. In fact the question of finding solutions to the Navier-Stokes
equations, and whether it is even possible at all, has been nominated as one of
the seven millennium challenges in mathematics, and the Clay Mathematics Institute
(http://www.claymath.org/) has offered prize money of one million US dollars for solving
the Navier-Stokes equations or for proving that they cannot be solved.
Just keep these convergence problems and the resulting unreliability of numerical
solutions in mind the next time you hit a zone of turbulence when flying in an airplane,
or read up on what happened to American Airlines flight AA 587.
3.8.2 Isothermal incompressible stationary flows
For the exercises we will look at a simplified problem, the special case of an isothermal
(constant T) static ($\partial/\partial t = 0$) flow of an incompressible fluid (constant $\rho$). In this
case the Navier-Stokes equations simplify to
$$\rho\,\vec{v}\cdot\nabla\vec{v} + \nabla P - \eta\nabla^2\vec{v} = \rho\vec{g} \qquad (3.69)$$
$$\nabla\cdot\vec{v} = 0 \qquad (3.70)$$
In this stationary case there are no problems with instabilities, and the Navier-Stokes
equations can be solved by a linear finite-element or finite-differences method combined
with a Picard iteration for the nonlinear part.
3.8.3 Computational Fluid Dynamics (CFD)
Given the importance of solving the Navier-Stokes equation for engineering, the numerical
solution of these equations has become an important field of engineering called
Computational Fluid Dynamics (CFD). For further details we thus refer to the special
courses offered in CFD.
3.9 Solitons and the Korteweg-de Vries equation
As the final application of partial differential equations for this semester (quantum
mechanics and the Schrödinger equation will be discussed in the summer semester) we
will discuss the Korteweg-de Vries equation and solitons.
3.9.1 Solitons
John Scott Russell, a Scottish engineer working on boat design, made a remarkable
discovery in 1834:
I was observing the motion of a boat which was rapidly drawn along a narrow
channel by a pair of horses, when the boat suddenly stopped - not so the mass
of water in the channel which it had put in motion; it accumulated round
the prow of the vessel in a state of violent agitation, then suddenly leaving
it behind, rolled forward with great velocity, assuming the form of a large
solitary elevation, a rounded, smooth and well-defined heap of water, which
continued its course along the channel apparently without change of form or
diminution of speed. I followed it on horseback, and overtook it still rolling
on at a rate of some eight or nine miles an hour, preserving its original
figure some thirty feet long and a foot to a foot and a half in height. Its
height gradually diminished, and after a chase of one or two miles I lost it
in the windings of the channel. Such, in the month of August 1834, was my
first chance interview with that singular and beautiful phenomenon which I
have called the Wave of Translation.
John Scott Russell's "wave of translation" is nowadays called a soliton and is a wave
with special properties. It is a time-independent stationary solution of special nonlinear
wave equations, and remarkably, two solitons pass through each other without
interacting.
Nowadays solitons are far from being just a mathematical curiosity but can be used
to transport signals in specially designed glass fibers over long distances without a loss
due to dispersion.
3.9.2 The Korteweg-de Vries equation
The Korteweg-de Vries (KdV) equation is famous for being the first equation found
which shows soliton solutions. It is a nonlinear wave equation
$$\frac{\partial u(x,t)}{\partial t} + \epsilon u\frac{\partial u(x,t)}{\partial x} + \mu\frac{\partial^3 u(x,t)}{\partial x^3} = 0 \qquad (3.71)$$
where the spreading of wave packets due to dispersion (from the third term) and the
sharpening due to shock waves (from the nonlinear second term) combine to lead to
time-independent solitons for certain parameter values.
Let us first consider these two effects separately. First, looking at a linear wave
equation with a higher order derivative
$$\frac{\partial u(x,t)}{\partial t} + c\frac{\partial u(x,t)}{\partial x} + \beta\frac{\partial^3 u(x,t)}{\partial x^3} = 0 \qquad (3.72)$$
and solving it by the usual ansatz $u(x,t) = \exp(i(kx - \omega t))$ we find dispersion due to
wave vector dependent velocities:
$$\omega = ck - \beta k^3 \qquad (3.73)$$
Any wave packet will thus spread over time.
Next let us look at the nonlinear term separately:
$$\frac{\partial u(x,t)}{\partial t} + \epsilon u\frac{\partial u(x,t)}{\partial x} = 0 \qquad (3.74)$$
The amplitude dependent derivative causes taller waves to travel faster than smaller
ones, thus passing them and piling up to a large shock wave, as can be seen in the
Mathematica Notebook provided on the web page.
Balancing the dispersion caused by the third order derivative with the sharpening
due to the nonlinear term we can obtain solitons!
3.9.3 Solving the KdV equation
The KdV equation can be solved analytically by making the ansatz $u(x,t) = f(x - ct)$.
Inserting this ansatz we obtain an ordinary differential equation
$$\mu f^{(3)} + \epsilon f f' - c f' = 0, \qquad (3.75)$$
which can be solved analytically in a long and cumbersome calculation, giving e.g. for
$\mu = 1$ and $\epsilon = -6$:
$$u(x,t) = -\frac{c}{2}\,\mathrm{sech}^2\left[\frac{1}{2}\sqrt{c}\,(x - ct - x_0)\right] \qquad (3.76)$$
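In this course we are more interested in numerical solutions, and proceed to solve
the KdV equation by a finite difference method
$$u(x_i, t + \Delta t) = u(x_i, t - \Delta t) \qquad (3.77)$$
$$\quad - \frac{\epsilon}{3}\frac{\Delta t}{\Delta x}\left[u(x_{i+1},t) + u(x_i,t) + u(x_{i-1},t)\right]\left[u(x_{i+1},t) - u(x_{i-1},t)\right]$$
$$\quad - \mu\frac{\Delta t}{\Delta x^3}\left[u(x_{i+2},t) - 2u(x_{i+1},t) + 2u(x_{i-1},t) - u(x_{i-2},t)\right].$$
Since this integrator requires the wave at two previous time steps we start with an
initial step of
$$u(x_i, t_0 + \Delta t) = u(x_i, t_0) \qquad (3.78)$$
$$\quad - \frac{\epsilon}{6}\frac{\Delta t}{\Delta x}\left[u(x_{i+1},t_0) + u(x_i,t_0) + u(x_{i-1},t_0)\right]\left[u(x_{i+1},t_0) - u(x_{i-1},t_0)\right]$$
$$\quad - \frac{\mu}{2}\frac{\Delta t}{\Delta x^3}\left[u(x_{i+2},t_0) - 2u(x_{i+1},t_0) + 2u(x_{i-1},t_0) - u(x_{i-2},t_0)\right].$$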
This integrator is stable for
$$\frac{\Delta t}{\Delta x}\left[|\epsilon u| + 4\frac{|\mu|}{\Delta x^2}\right] \le 1 \qquad (3.79)$$
Note that as in the case of the heat equation, a progressive decrease of space steps or
even of space and time steps by the same factor will lead to instabilities!
Using this integrator, also provided on the web page as a Mathematica Notebook,
you will be able to observe:
• The decay of a wave due to dispersion
• The creation of shock waves due to the nonlinearity
• The decay of a step into solitons
• The crossing of two solitons
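For readers without Mathematica, the integrator can also be sketched in Python. The code below is our own minimal illustration, not the notebook from the web page. It assumes periodic boundary conditions (a choice not made in the text), uses the third-difference stencil u(x_{i+2}) - 2u(x_{i+1}) + 2u(x_{i-1}) - u(x_{i-2}) for the dispersive term, and takes the single soliton for μ = 1, ε = -6 as a test profile. All function names are hypothetical.

```python
import math


def kdv_increment(u, eps, mu, dt, dx):
    """Increment of the two-step (leapfrog) scheme on a periodic grid."""
    n = len(u)
    inc = [0.0] * n
    for i in range(n):
        up1, up2 = u[(i + 1) % n], u[(i + 2) % n]
        um1, um2 = u[i - 1], u[i - 2]   # negative indices wrap around
        nonlinear = (up1 + u[i] + um1) * (up1 - um1)
        dispersive = up2 - 2.0 * up1 + 2.0 * um1 - um2
        inc[i] = (-(eps / 3.0) * (dt / dx) * nonlinear
                  - mu * (dt / dx ** 3) * dispersive)
    return inc


def kdv_evolve(u0, eps, mu, dt, dx, n_steps):
    """Advance the profile u0 by n_steps time steps of size dt."""
    # Initial step: forward Euler, i.e. half of the leapfrog increment.
    inc = kdv_increment(u0, eps, mu, dt, dx)
    previous = list(u0)
    current = [u + 0.5 * d for u, d in zip(u0, inc)]
    for _ in range(n_steps - 1):
        inc = kdv_increment(current, eps, mu, dt, dx)
        previous, current = current, [p + d for p, d in zip(previous, inc)]
    return current


def soliton(x, t, c=1.0, x0=0.0):
    """Single-soliton profile for mu = 1 and eps = -6."""
    arg = 0.5 * math.sqrt(c) * (x - c * t - x0)
    return -0.5 * c / math.cosh(arg) ** 2
```

As a usage example, a soliton with c = 1 on a grid with Δx = 0.4 and Δt = 0.01 satisfies the stability condition (3.79) and should travel a distance of about ct while roughly keeping its shape. Smaller Δx requires a much smaller Δt, since the dispersive term scales as Δt/Δx³.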
Chapter 4
The classical N-body problem
4.1 Introduction
In this chapter we will discuss algorithms for classical N-body problems, whose length
scales span many orders of magnitude:
• the universe ($\approx 10^{26}$ m)
• galaxy clusters ($\approx 10^{24}$ m)
• galaxies ($\approx 10^{21}$ m)
• clusters of stars ($\approx 10^{18}$ m)
• solar systems ($\approx 10^{13}$ m)
• stellar dynamics ($\approx 10^{9}$ m)
• climate modeling ($\approx 10^{6}$ m)
• gases, liquids and plasmas in technical applications ($\approx 10^{-3} \ldots 10^{2}$ m)
On smaller length scales quantum effects become important. We will deal with them
later.
The classical N-body problem is defined by the following system of ordinary differential
equations:
$$m_i\frac{d\vec{v}_i}{dt} = \vec{F}_i = -\nabla_i V(\vec{x}_1, \ldots, \vec{x}_N)$$
$$\frac{d\vec{x}_i}{dt} = \vec{v}_i, \qquad (4.1)$$
where $m_i$, $\vec{v}_i$ and $\vec{x}_i$ are the mass, velocity and position of the i-th particle.
The potential $V(\vec{x}_1, \ldots, \vec{x}_N)$ is often the sum of an external potential and a two-body
interaction potential:
$$V(\vec{x}_1, \ldots, \vec{x}_N) = \sum_i V_{ext}(\vec{x}_i) + \sum_{i<j} U_{ij}(|\vec{x}_i - \vec{x}_j|) \qquad (4.2)$$
The special form $U(|\vec{x}_i - \vec{x}_j|)$ of the two-body potential follows from translational and
rotational symmetry.
4.2 Applications
There are many different forms of the potential U:
1. In astrophysical problems gravity is usually sufficient, except in dense plasmas,
interiors of stars and close to black holes:
$$U^{(gravity)}_{ij}(r) = -G\frac{m_i m_j}{r}. \qquad (4.3)$$
2. The simplest model for non-ideal gases are hard spheres with radius $a_i$:
$$U^{(hard\ sphere)}_{ij}(r) = \begin{cases} 0 & \text{for } r \ge a_i + a_j \\ \infty & \text{for } r < a_i + a_j \end{cases} \qquad (4.4)$$
3. Covalent crystals and liquids can be modeled by the Lennard-Jones potential
$$U^{(LJ)}_{ij}(r) = 4\epsilon_{ij}\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]. \qquad (4.5)$$
The $r^{-6}$ term describes the correct asymptotic behavior of the covalent van der
Waals forces. The $r^{-12}$ term models the hard core repulsion between atoms.
The special form $r^{-12}$ is chosen to allow for a fast and efficient calculation as
square of the $r^{-6}$ term. Parameters for liquid argon are $\epsilon = 1.65 \times 10^{-21}$ J and
$\sigma = 3.4 \times 10^{-10}$ m.
4. In ionic crystals and molten salts the electrostatic forces are dominant:
$$U^{(ionic)}_{ij}(r) = b_{ij}\, r^{-n} + e^2\frac{Z_i Z_j}{r}, \qquad (4.6)$$
where $Z_i$ and $Z_j$ are the formal charges of the ions.
5. The simulation of large biomolecules such as proteins or even DNA is a big
challenge. For non-bonded atoms often the 1-6-12 potential, a combination of
Lennard-Jones and electrostatic potential, is used:
$$U^{(1-6-12)}_{ij}(r) = e^2\frac{Z_i Z_j}{r} + 4\epsilon_{ij}\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right]. \qquad (4.7)$$
For bonded atoms there are two ways to model the bonding. Either the distances
between two atoms can be fixed, or the bonding can be described by a harmonic
oscillator:
$$U^{(bond)}_{ij}(r) = \frac{1}{2}K_{ij}(r - b_{ij})^2. \qquad (4.8)$$
The modeling of fixed angles between chemical bonds (like in water molecules) is
a slightly more complex problem. Again, either the angle can be fixed, or modeled
by a harmonic oscillator in the angle $\theta$. Note that the angle $\theta$ is determined by the
location of three atoms, and that this is thus a three-body interaction! Students
who are interested in such biomolecules are referred to the research group of Prof.
van Gunsteren in the chemistry department.
6. More complex potentials are used in the simulation of dense plasmas and of collisions
of heavy atomic nuclei.
7. The Car-Parrinello method combines a classical simulation of the molecular dynamics
of the motion of atomic nuclei with a quantum chemical ab-initio calculation
of the forces due to electronic degrees of freedom. This gives more accurate
forces than a Lennard-Jones potential but is possible only on rather small systems
due to the large computational requirements. If you are interested in the
Car-Parrinello method consider the research group of Prof. Parrinello in Lugano.
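As an example for one of the potentials in this list, the Lennard-Jones potential of Eq. (4.5) can be coded as follows. This is a small sketch with hypothetical function names; the default parameters are the liquid argon values quoted above, and the r^-12 term is computed as the square of the r^-6 term as suggested in the text.

```python
def lennard_jones(r, epsilon=1.65e-21, sigma=3.4e-10):
    """Lennard-Jones potential U(r) in joules for a distance r in meters."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lennard_jones_force(r, epsilon=1.65e-21, sigma=3.4e-10):
    """Radial force -dU/dr in newtons (positive means repulsive)."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r
```

The potential vanishes at r = σ and has its minimum of depth ε at r = 2^(1/6) σ, where the force changes sign from repulsive to attractive.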
4.3 Solving the many-body problem
The classical many-body problem can be tackled with the same numerical methods that
we used for the few-body problems, but we will encounter several additional difficulties,
such as
• the question of boundary conditions
• measuring thermodynamic quantities such as pressure
• performing simulations at constant temperature or pressure instead of constant
energy or volume
• reducing the scaling of the force calculation for long-range forces from $O(N^2)$ to
$O(N \ln N)$
• overcoming the slowing down of simulations at phase transitions
4.4 Boundary conditions
Open boundary conditions are natural for simulations of solar systems or for collisions of
galaxies, molecules or atomic nuclei. For simulations of crystals, liquids or gases on the
other hand, effects from open boundaries are not desired, except for the investigation
of surface effects. For these systems periodic boundary conditions are better. As we
discussed earlier, they remove all boundary effects.
In the calculation of forces between two particles all periodic images of the simulation
volume have to be taken into account. For short range forces, like a Lennard-Jones force,
the "minimum image" method is the method of choice. Here the distance between a particle
and the nearest of all periodic images of a second particle is chosen for the calculation
of the forces between the two particles.
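A minimal Python sketch of the minimum image convention, assuming a rectangular periodic box given by its side lengths (the function name and interface are our own illustrative choices):

```python
def minimum_image(ri, rj, box):
    """Displacement from particle j to particle i, using the nearest
    periodic image of particle j in a rectangular box."""
    displacement = []
    for a, b, length in zip(ri, rj, box):
        delta = a - b
        # shift by the integer number of box lengths that brings
        # delta into the interval [-length/2, length/2]
        delta -= length * round(delta / length)
        displacement.append(delta)
    return displacement
```

For example, in a box of side length 10 two particles at x = 9.5 and x = 0.5 are separated by a distance of 1 through the periodic boundary, not by 9.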
For long range forces on the other hand (forces that decay as $r^{-d}$ or slower) the minimum
image method is not a good approximation because of large finite size effects. Then the
forces caused by all the periodic images of the second particle have to be summed over.
The electrostatic potential acting on a particle caused by other particles with charge $q_i$
at sites $\vec{r}_i$ is
$$\Phi_p = \sum_{\vec{n}}\sum_i \frac{q_i}{|\vec{r}_{\vec{n}} - \vec{r}_i|}, \qquad (4.9)$$
where $\vec{n}$ is an integer vector denoting the periodic translations of the root cell and $\vec{r}_{\vec{n}}$
is the position of the particle in the corresponding image of the root cell.
This direct summation converges very slowly. It can be calculated faster by the
Ewald summation technique (see footnote 1), which replaces the sum by two faster converging sums:
$$\Phi_p = \sum_{\vec{n}}\sum_i q_i\frac{\mathrm{erfc}(\alpha|\vec{r}_{\vec{n}} - \vec{r}_i|)}{|\vec{r}_{\vec{n}} - \vec{r}_i|} + \frac{1}{\pi L}\sum_i\sum_{\vec{h}\neq 0}\frac{q_i}{|\vec{h}|^2}\exp\left(-\frac{\pi^2|\vec{h}|^2}{\alpha^2 L^2}\right)\cos\left(\frac{2\pi}{L}\vec{h}\cdot(\vec{r}_o - \vec{r}_i)\right). \qquad (4.10)$$
In this sum the $\vec{h}$ are integer reciprocal lattice vectors. The parameter $\alpha$ is arbitrary
and can be chosen to optimize convergence.
Still the summation is time-consuming. Typically one tabulates the differences
between Ewald sums and minimum image values on a grid laid over the simulation
cell and interpolates for distances between the grid points. For details we refer to the
detailed discussion in M.J. Sangster and M. Dixon, Adv. in Physics 25, 247 (1976).
4.5 Molecular dynamics simulations of gases, liquids and crystals
4.5.1 Ergodicity, initial conditions and equilibration
In scattering problems or in the simulation of cosmological evolution the initial conditions
are usually given. The simulation then follows the time evolution of these initial
conditions. In molecular dynamics simulations on the other hand one is interested in
thermodynamic averages $\langle A\rangle$. In an ergodic system the phase space average is equivalent
to the time average:
$$\langle A\rangle := \frac{\int A(\Gamma)P[\Gamma]\,d\Gamma}{\int P[\Gamma]\,d\Gamma} = \lim_{\tau\to\infty}\frac{1}{\tau}\int_0^{\tau} A(t)\,dt. \qquad (4.11)$$
Initial conditions are best chosen as a regular crystal lattice. The velocities are
picked randomly for each component, according to a Maxwell distribution
$$P[v_\alpha] \propto \exp\left(-\frac{m v_\alpha^2}{2 k_B T}\right). \qquad (4.12)$$
Finally the velocities are all rescaled by a constant factor to obtain the desired total
energy.
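The velocity initialization can be sketched as follows. The sketch makes a few assumptions of its own: all particles have the same mass, the center-of-mass velocity is removed after sampling (a common practice that is not mentioned in the text), and the rescaling targets a given kinetic energy, which for a desired total energy would be the total energy minus the potential energy of the initial lattice. The function names are hypothetical.

```python
import math
import random


def maxwell_velocities(n_particles, mass, kT, dim=3, seed=0):
    """Draw each velocity component from a Gaussian of variance kT/mass
    and subtract the center-of-mass velocity."""
    rng = random.Random(seed)
    sigma = math.sqrt(kT / mass)
    v = [[rng.gauss(0.0, sigma) for _ in range(dim)]
         for _ in range(n_particles)]
    for k in range(dim):
        mean = sum(vi[k] for vi in v) / n_particles
        for vi in v:
            vi[k] -= mean
    return v


def kinetic_energy(v, mass):
    """Total kinetic energy of equal-mass particles."""
    return 0.5 * mass * sum(c * c for vi in v for c in vi)


def rescale_to_kinetic_energy(v, mass, target):
    """Rescale all velocities by one constant factor so that the
    kinetic energy equals the target value."""
    factor = math.sqrt(target / kinetic_energy(v, mass))
    return [[factor * c for c in vi] for vi in v]
```

Since all velocities are multiplied by the same factor, the rescaling changes neither the directions of motion nor the vanishing total momentum.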
An important issue is that the system has to be equilibrated (thermalized) for some
time before thermal equilibrium is reached and measurements can be started. This
thermalization time is best determined by observing time series of physical observables,
such as the kinetic energy (temperature) or other quantities of interest.
1 P.P. Ewald, Ann. Physik 64, 253 (1921).
4.5.2 Measurements
A simple measurement is the self-diffusion constant D. In a liquid or gaseous system it
can be determined from the time dependence of the positions:
$$\Delta^2(t) = \frac{1}{N}\sum_{i=1}^{N}\left[\vec{r}_i(t) - \vec{r}_i(0)\right]^2 = 2dDt + \Delta_0^2 \qquad (4.13)$$
In a crystal the atoms remain at the same location in the lattice and thus D = 0.
A measurement of D is one way to observe melting of a crystal.
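A possible way to extract D from stored positions is sketched below. The function names are hypothetical, and the straight-line least-squares fit of the mean square displacement against time is our own choice of how to read off the slope 2dD of Eq. (4.13). With periodic boundary conditions the positions must be the unwrapped ones, not folded back into the simulation cell.

```python
def mean_square_displacement(r_t, r_0):
    """Mean square displacement between positions r_t and r_0,
    each a list of position tuples (one per particle)."""
    n = len(r_0)
    total = 0.0
    for ri, r0i in zip(r_t, r_0):
        total += sum((a - b) ** 2 for a, b in zip(ri, r0i))
    return total / n


def diffusion_constant(times, msd_values, dim):
    """Least-squares slope of msd(t) = 2 d D t + const, divided by 2 d."""
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd_values) / n
    num = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd_values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den / (2.0 * dim)
```

In practice only times after the initial ballistic regime should enter the fit.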
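Another quantity that is easy to measure is the mean kinetic energy
$$\langle E_k\rangle = \frac{1}{2}\left\langle\sum_{i=1}^{N} m_i\vec{v}_i^{\,2}\right\rangle. \qquad (4.14)$$
$\langle E_k\rangle$ is proportional to the mean temperature
$$\langle E_k\rangle = \frac{G}{2}k_B T, \qquad (4.15)$$
where $G = d(N - 1) \approx dN$ is the number of degrees of freedom.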
In a system with fixed boundaries the particles are reflected at the boundaries. The
pressure P is just the force per area acting on the boundary walls of the system. In the
case of periodic boundary conditions there are no walls. The pressure P can then be
measured using the following equation, derived from the virial theorem:
$$P = \frac{N k_B T}{V} + \frac{1}{dV}\sum_{i<j}\vec{r}_{ij}\cdot\vec{F}_{ij}(t), \qquad (4.16)$$
where $\vec{F}_{ij}$ denotes the force between particles i and j and $\vec{r}_{ij}$ is their distance vector.
The first term of equation (4.16) is the kinetic pressure, due to the kinetic energy of
the particles. This term alone gives the ideal gas law. The second term is the pressure
(force per area) due to the interaction forces.
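More information can usually be extracted from the pair correlation function
$$g(\vec{r}) = \frac{1}{\rho(N - 1)}\left\langle\sum_{i\neq j}\delta(\vec{r} + \vec{r}_i - \vec{r}_j)\right\rangle \qquad (4.17)$$
or its Fourier transform, the static structure factor $S(\vec{k})$
$$g(\vec{r}) - 1 = \frac{1}{(2\pi)^d\rho}\int\left[S(\vec{k}) - 1\right]\exp(i\vec{k}\cdot\vec{r})\,d\vec{k} \qquad (4.18)$$
$$S(\vec{k}) - 1 = \rho\int\left[g(\vec{r}) - 1\right]\exp(-i\vec{k}\cdot\vec{r})\,d\vec{r} \qquad (4.19)$$
If the angular dependence is of no interest, a radial pair correlation function
$$g(r) = \frac{1}{4\pi}\int g(\vec{r})\sin\theta\,d\theta\,d\phi \qquad (4.20)$$
and corresponding structure factor
$$S(k) - 1 = 4\pi\rho\int_0^{\infty}\frac{\sin kr}{kr}\left[g(r) - 1\right]r^2\,dr \qquad (4.21)$$
can be used instead.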
This structure factor can be measured in X-ray or neutron scattering experiments.
In a perfect crystal the structure factor shows sharp $\delta$-function like Bragg peaks and a
periodic long range structure in $g(\vec{r})$. Liquids still show broad maxima at distances of
nearest neighbors, second nearest neighbors, etc., but these features decay rapidly with
distance.
The specific heat at constant volume $c_V$ can in principle be calculated as a temperature
derivative of the internal energy. Since such numerical derivatives are numerically
unstable the preferred method is a calculation from the energy fluctuations
$$c_V = \frac{\langle E^2\rangle - \langle E\rangle^2}{k_B T^2}. \qquad (4.22)$$
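Given a time series of measured energies, Eq. (4.22) is straightforward to evaluate. The following sketch uses a hypothetical function name and sets k_B = 1 by default; it assumes that the energies were sampled at a fixed temperature T after equilibration.

```python
def specific_heat(energies, temperature, k_B=1.0):
    """Specific heat from the energy fluctuations, Eq. (4.22)."""
    n = len(energies)
    mean = sum(energies) / n
    mean_sq = sum(e * e for e in energies) / n
    return (mean_sq - mean * mean) / (k_B * temperature ** 2)
```

Successive measurements in a simulation are correlated, so a reliable estimate needs a time series much longer than the correlation time of the energy.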
4.5.3 Simulations at constant energy
The equations of motion of a dissipationless system conserve the total energy and the
simulation is thus done in the microcanonical ensemble. Discretization of the time
evolution however introduces errors in the energy conservation, and as a consequence
the total energy will slowly change over time. To remain in the microcanonical ensemble
energy corrections are necessary from time to time. These are best done by a rescaling
of all the velocities with a constant factor. The equations are easy to derive and will
not be listed here.
4.5.4 Constant temperature
The canonical ensemble at constant temperature is usually of greater relevance than the
microcanonical ensemble at constant energy. The crudest, ad-hoc method for obtaining
constant temperature is a rescaling like we discussed for constant energy. This time
however we want to rescale the velocities to achieve a constant kinetic energy and thus,
by equation (4.15), constant temperature. Again the equations can easily be derived.
A better method is the Nosé-Hoover thermostat. In this algorithm the system is
coupled reversibly to a heat bath by a friction term $\eta$:
$$m_i\frac{d\vec{v}_i}{dt} = \vec{F}_i - \eta\vec{v}_i \qquad (4.23)$$
$$\frac{d\vec{r}_i}{dt} = \vec{v}_i \qquad (4.24)$$
The friction term $\eta$ is chosen such that constant temperature is achieved on average.
We want this term to heat up the system if the temperature is too low and to cool it
down if the temperature is too high. One way of doing this is by setting
$$\frac{d\eta}{dt} = \frac{1}{m_s}\left(E_k - \frac{1}{2}G k_B T\right), \qquad (4.25)$$
where $m_s$ is the coupling constant to the heat bath.
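The right-hand sides of the thermostat equations can be written down directly. The sketch below is only an illustration with a hypothetical function name; it assumes the friction force has the form -ηv as in Eq. (4.23) and uses G = d(N - 1) degrees of freedom. It returns the accelerations and the time derivative of η, which can then be passed to any of the integrators discussed earlier.

```python
def nose_hoover_derivatives(velocities, forces, masses, eta, kT, m_s, dim=3):
    """Return (dv/dt for every particle, d eta/dt), Eqs. (4.23) and (4.25)."""
    n = len(masses)
    e_kin = 0.5 * sum(m * sum(c * c for c in v)
                      for m, v in zip(masses, velocities))
    dof = dim * (n - 1)   # number of degrees of freedom G
    dv_dt = [[(f_c - eta * v_c) / m for f_c, v_c in zip(f, v)]
             for f, v, m in zip(forces, velocities, masses)]
    deta_dt = (e_kin - 0.5 * dof * kT) / m_s
    return dv_dt, deta_dt
```

If the kinetic energy is above its target value, η grows and the friction cools the system; if it is below, η decreases and can become negative, which heats the system.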
4.5.5 Constant pressure
Until now we always worked at fixed volume. To perform simulations at constant
pressure we need to allow the volume to change. This can be done by rescaling the
coordinates with the linear size L of the system:
$$\vec{r} = L\vec{x}. \qquad (4.26)$$
The volume of the system is denoted by $\Omega = L^D$. We extend the Lagrangian by
including an external pressure $P_0$ and an inertia M for pressure changes (e.g. the mass
of a piston):
$$\mathcal{L} = \sum_{i=1}^{N}\frac{m_i}{2}L^2\left(\frac{d\vec{x}_i}{dt}\right)^2 - \sum_{i<j}V(L(\vec{x}_i - \vec{x}_j)) + \frac{M}{2}\left(\frac{d\Omega}{dt}\right)^2 - P_0\Omega \qquad (4.27)$$
The Euler-Lagrange equations applied to the above Lagrangian give the equations of motion:
$$\frac{d^2\vec{x}_i}{dt^2} = \frac{1}{m_i L}\vec{F}_i - \frac{2}{D\Omega}\frac{d\Omega}{dt}\frac{d\vec{x}_i}{dt}$$
$$\frac{d^2\Omega}{dt^2} = \frac{P - P_0}{M}, \qquad (4.28)$$
where P turns out to be just the pressure defined in equation (4.16). These equations
of motion are integrated with generalizations of the Verlet algorithm.
Generalizations of this algorithm allow changes not only of the total volume but
also of the shape of the simulation volume.
4.6 Scaling with system size
The time intensive part of a classical N-body simulation is the calculation of the forces.
The updating of positions and velocities according to the forces is rather fast and scales
linearly with the number of particles N.
For short range forces the number of particles within the interaction range is limited,
and the calculation of the forces, while it might still be a formidable task, scales with
O(N) and thus poses no big problems.
Rapidly decaying potentials, like the Lennard-Jones potential, can be cut off at a
distance $r_c$. The error thus introduced into the estimates for quantities like the pressure
can be estimated from equation (4.16) using equation (4.17) as:
$$\Delta P = -\frac{2\pi\rho^2}{3}\int_{r_c}^{\infty}\frac{\partial V}{\partial r}\,g(r)\,r^3\,dr \qquad (4.29)$$
where a common two-body potential V(r) between all particle pairs was assumed. If
V(r) decays faster than $r^{-3}$ (in general $r^{-d}$, where d is the dimensionality) this correction
becomes small as $r_c$ is increased.
Long range forces, like Coulomb forces or gravity, on the other hand, pose a big
problem. No finite cut-off may be introduced without incurring substantial errors.
Each particle exerts a force onto every other particle, thus requiring $(N - 1)N/2 \sim O(N^2)$
force calculations. This is prohibitive for large scale simulations. In numerical
simulations there is a solution to this problem. Due to discrete time steps $\Delta t$ we cannot
avoid making errors in the time integration. Thus we can live with a small error in the
force calculations and use one of a number of algorithms that, while introducing small
controllable errors in the forces, need only O(N log N) computations.
4.6.1 The Particle-Mesh (PM) algorithm
The Particle-Mesh (PM) algorithm maps the force calculation to the solution of a
Poisson equation which can be done in a time proportional to O(N log N). It works as
follows:
1. A regular mesh with $M \approx N$ mesh points is introduced in the simulation volume.
2. The masses of the particles are assigned in a clever way to nearby mesh points.
3. The potential equation (often a Poisson equation) is solved on the mesh using a
fast solver in $O(M \log M) \sim O(N \log N)$ steps.
4. The potential at the position of each particle is interpolated, again in a clever
way, from the potential at the nearby mesh points and the force upon the particle
calculated from the gradient of the potential.
The charge assignment and potential interpolation are the tricky parts. They should
be done such that errors are minimized. The standard textbook "Computer Simulations
Using Particles" by R.W. Hockney and J.W. Eastwood discusses the PM method in
great detail.
We want to fulfill at least the following conditions:
1. At large particle separations the errors should become negligible.
2. The charge assigned to the mesh points and the forces interpolated from mesh
points should vary smoothly as the particle position changes.
3. Total momentum should be conserved, i.e. the force $\vec{F}_{ij}$ acting on particle i from
particle j should fulfill $\vec{F}_{ij} = -\vec{F}_{ji}$.
The simplest scheme is the NGP scheme (nearest grid point), where the full particle
mass is assigned to the nearest grid point and the force is also evaluated at the nearest
grid point. More elaborate schemes like the CIC (cloud in cell) scheme assign the charge
to the $2^d$ nearest grid points and also interpolate the forces from these grid points. The
algorithm becomes more accurate but also more complex as more points are used. For
detailed discussions read the book by Hockney and Eastwood.
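To make the assignment step concrete, here is a minimal one-dimensional sketch of the CIC scheme. It assumes a periodic box with mesh points at x_j = j h and linear weights; the function name and interface are our own, and the book by Hockney and Eastwood should be consulted for the details of a production implementation. In one dimension each particle contributes to its 2^d = 2 neighboring mesh points.

```python
import math


def cic_assign(positions, masses, n_mesh, box_length):
    """Cloud-in-cell mass assignment on a periodic 1D mesh.

    Returns the mass collected at each of the n_mesh mesh points.
    """
    h = box_length / n_mesh
    mesh_mass = [0.0] * n_mesh
    for x, m in zip(positions, masses):
        s = (x % box_length) / h          # position in units of the spacing
        left = int(math.floor(s)) % n_mesh
        right = (left + 1) % n_mesh       # periodic neighbor
        frac = s - math.floor(s)          # fractional distance from left point
        mesh_mass[left] += m * (1.0 - frac)
        mesh_mass[right] += m * frac
    return mesh_mass
```

The weights vary linearly with the particle position, so the assigned mass changes smoothly when a particle crosses a mesh point, and the total mass on the mesh equals the total particle mass. Using the same weights for the force interpolation is what allows momentum conservation.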
In periodic systems and for forces which have a Green's function $g(\vec{r})$ (e.g. solutions
of a Poisson equation) one of the best methods is the fast Fourier method, which we
will describe for the case of gravity, where the potential can be calculated as
$$\Phi(\vec{r}) = \int d^3\vec{r}\,'\,\rho(\vec{r}\,')\,g(\vec{r} - \vec{r}\,'), \qquad (4.30)$$
where $\rho(\vec{r})$ is the charge distribution and the Green's function is
$$g(\vec{r}) = -\frac{G}{\|\vec{r}\|} \qquad (4.31)$$
in three space dimensions. The convolution in equation (4.30) is best performed by
Fourier transforming the equation, which reduces it to a multiplication
$$\hat{\Phi}(\vec{k}) = \hat{\rho}(\vec{k})\,\hat{g}(\vec{k}) \qquad (4.32)$$
On the finite mesh of the PM method, the discrete charge distribution is Fourier transformed
in O(M log M) steps using the Fast Fourier Transform (FFT) algorithm. The
Fourier transform of the Green's function in equation (4.31) is
$$\hat{g}(\vec{k}) = -\frac{4\pi G}{\|\vec{k}\|^2}. \qquad (4.33)$$
Using this equation gives the "poor man's Poisson solver". A suitably modified Green's
function, such as
$$\hat{g}(\vec{k}) \propto \frac{1}{\sin^2(k_x L/2) + \sin^2(k_y L/2) + \sin^2(k_z L/2)}, \qquad (4.34)$$
where L is the linear dimension of the simulation volume, can reduce the discretization
errors caused by the finite mesh. In contrast to equation (4.33) this form is differentiable
also at the Brillouin zone boundary, e.g. when $k_x = \pm\pi/L$ or $k_y = \pm\pi/L$ or $k_z = \pm\pi/L$.
Before writing any program yourself we recommend that you study the textbooks and
literature in detail as there are many subtle issues you have to take care of.
The PM algorithm is very efficient but has problems with
• nonuniform particle distributions, such as clustering of stars and galaxies.
• strong correlation effects between particles. Bound states, such as binary stars,
will never be found in PM simulations of galaxies.
• complex geometries.
The first two of these problems can be solved by the P$^3$M and AP$^3$M algorithms.
4.6.2 The P$^3$M and AP$^3$M algorithms
The PM method is good for forces due to far away particles but bad for short ranges.
The P$^3$M method solves this problem by splitting the force $\vec F$ into a long range force
$\vec F_l$ and a short range force $\vec F_s$:
$\vec F = \vec F_l + \vec F_s$ (4.35)
The long range force $\vec F_l$ is chosen to be small and smoothly varying for short
distances. It can be efficiently computed using the particle-mesh (PM) method. The
short range force $\vec F_s$ has a finite interaction radius R and is calculated exactly, summing
up the particle-particle forces. Thus the name particle-particle/particle-mesh (P$^3$M)
algorithm.
For nearly uniform particle distributions the number of particles within the range
of $\vec F_s$ is small and independent of N. The P$^3$M algorithm then scales as O(N) +
O(M log M) with $M \propto N$.
Attractive long range forces, like gravity, tend to clump particles and lead to
extremely nonuniform particle distributions. Just consider solar systems, star clusters,
galaxies and galaxy clusters to see this effect. Let us consider what happens to the
P$^3$M method in this case. With a mesh of $M \approx N$ points it will happen that almost all
particles (e.g. a galaxy) clump within the range R of the short range force $\vec F_s$. Then the
PP part scales like $O(N^2)$. Alternatively we can increase the number of mesh points
M to about $M \gg N$, which again is nonoptimal.
The solution to this problem is to refine the mesh in the regions of space with a high
particle density. In a simulation of a collision between two galaxies we will use a fine
mesh at the location of the galaxies and a coarse mesh in the rest of the simulation
space. The adaptive P$^3$M method (AP$^3$M) automatically refines the mesh in regions of
space with high particle densities and is often used, besides tree codes, for cosmological
simulations.
4.6.3 The tree codes
Another approach to speeding up the force calculation is by collecting clusters of far
away particles into effective pseudo-particles. The mass of these pseudo-particles is
the total mass of all particles in the cluster they represent. To keep track of these
clusters a tree is constructed. Details of this method are explained very well in the
book "Many-Body Tree Methods in Physics" by Susanne Pfalzner and Paul Gibbon.
4.6.4 The multipole expansion
In more or less homogeneous systems another algorithm can be used which at first sight
scales like O(N). The Fast Multipole Method (FMM) calculates a high order multipole
expansion for the potential due to the particles and uses this potential to calculate the
forces. The calculation of the high order multipole moments is a big (programming)
task, but scales only like O(N).
Is this the optimal method? No! There are two reasons. First of all, while the
calculation of the multipole moments takes only O(N) time we need to go to about O(log N)-th
order to achieve good accuracy, and the overall scaling is thus also O(N log N).
Secondly, the calculation of the multipole moments is a computationally intensive task and
the prefactor of the N log N term is much larger than in tree codes.
The multipole method is still useful in combination with tree codes. Modern tree
codes calculate not only the total mass of a cluster, but also higher order multipole
moments, up to the hexadecapole. This improves the accuracy and efficiency of tree
codes.
4.7 Phase transitions
In the molecular dynamics simulations of a Lennard-Jones liquid in the exercises we
can see the first example of a phase transition: a first order phase transition between
a crystal and a liquid. Structural phase transitions in continuous systems are usually of
first order, and second order transitions occur only at special points, such as the critical
point of a fluid. We will discuss second order phase transitions in more detail in the
later chapter on magnetic simulations and will focus here on the first order melting
transition.
In first order phase transitions both phases (e.g. ice and water) can coexist at the
same time. There are several characteristic features of first order phase transitions
that can be used to distinguish them from second order ones. One such feature is the
latent heat for melting and evaporation. If the internal energy is increased, e.g. by
a heat source, the temperature first increases until the phase transition. Then it
stays constant as more and more of the crystal melts. The temperature will rise again
only once enough energy is added to melt the whole crystal. Alternatively this can be
seen as a jump at the transition temperature of the internal energy as a function of
temperature. Similarly, at constant pressure a volume change can be observed at the
melting transition. Another indication is a jump in the self-diffusion constant at $T_c$.
A more direct observation is the measurement of a quantity like the structure factor
in different regions of the simulation volume. At first order phase transitions regions of
both phases (e.g. crystal and liquid or liquid and gas) can be observed at the same time.
In second order phase transitions (like in crystal structure changes from tetragonal to
orthorhombic), on the other hand, a smooth change as a function of temperature is
observed, and the whole system is always either in one phase or in the other.
When simulating first order phase transitions one encounters a problem which is
actually a well-known phenomenon. To trigger the phase transition a domain of the new
phase has to be formed. As the formation of the domain can cost energy proportional
to its boundary, the formation of such new domains can be suppressed, resulting in
undercooled or overheated liquids. The huge time scales for melting (just watch an ice
cube melt!) are a big problem for molecular dynamics simulations of first order phase
transitions. Later we will learn how Monte Carlo simulations can be used to introduce
a faster dynamics, speeding up the simulation of phase transitions.
4.8 From fluid dynamics to molecular dynamics
Depending on strength and type of interaction different algorithms are used for the
simulation of classical systems.
1. Ideal or nearly ideal gases with weak interaction can be modeled by the
Navier-Stokes equations.
2. If the forces are stronger, the Navier-Stokes equations are no longer appropriate.
In that case particle-in-cell (PIC) algorithms can be used:
   • Like in the finite element method the simulation volume is split into cells.
   Here the next step is not the solution of a partial differential equation on
   this mesh, but instead the fluid volume in each cell is replaced by a
   pseudo-particle. These pseudo-particles, which often correspond to millions of real
   fluid particles, carry the total mass, charge and momentum of the fluid cell.
   • The pseudo-particles are then propagated using molecular dynamics.
   • Finally, the new mass, charge and momentum densities on the mesh are
   interpolated from the new positions of the pseudo-particles.
3. If interactions or correlations are even stronger, each particle has to be simulated
explicitly, using the methods discussed in this chapter.
4. In astrophysical simulations there are huge differences in density and length scales,
from interstellar gases to neutron stars or even black holes. For these simulations
hybrid methods are needed. Parts of the system (e.g. interstellar gases and dark
matter) are treated as fluids and simulated using fluid dynamics. Other parts (e.g.
galaxies and star clusters) are simulated as particles. The border line between fluid
and particle treatment is fluid and determined only by the fact that currently not
more than $10^8$ particles can be treated.
4.9 Warning
Tree codes and the (A)P$^3$M methods accept a small error in exchange for a large
speedup. However, when simulating for several million time steps these errors will
add up. Are the results reliable? This is still an open question.
Chapter 5
Integration methods
In thermodynamics, as in many other fields of physics, often very high dimensional
integrals have to be evaluated. Even in a classical N-body simulation the phase space
has dimension 6N, as there are three coordinates each for the position and momentum
of each particle. In a quantum mechanical problem of N particles the phase space is
even exponentially large as a function of N. We will now review what we learned last
semester about integration methods and Monte Carlo integrators.
5.1 Standard integration methods
A Riemann integral of f(x) over an interval [a, b] can be evaluated by replacing it by
a finite sum:
$\int_a^b f(x)\,dx = \sum_{i=1}^{N} f(a + i\Delta x)\,\Delta x + O(\Delta x)$, (5.1)
where $\Delta x = (b - a)/N$. The discretization error decreases as 1/N for this simple
formula. Better approximations are the trapezoidal rule
$\int_a^b f(x)\,dx = \Delta x \left[ \frac{1}{2} f(a) + \sum_{i=1}^{N-1} f(a + i\Delta x) + \frac{1}{2} f(b) \right] + O(\Delta x^2)$, (5.2)
or the Simpson rule
$\int_a^b f(x)\,dx = \frac{\Delta x}{3} \left[ f(a) + \sum_{i=1}^{N/2} 4 f(a + (2i-1)\Delta x) + \sum_{i=1}^{N/2-1} 2 f(a + 2i\Delta x) + f(b) \right] + O(\Delta x^4)$, (5.3)
which scales like $N^{-4}$.
For more elaborate schemes like the Romberg method or Gaussian integration we
refer to textbooks.
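The trapezoidal rule (5.2) and the Simpson rule (5.3) translate almost line by line into code. The following is a minimal sketch in plain Python; the function names are chosen here for illustration, and N is assumed to be even for the Simpson rule.

```python
import math


def trapezoid(f, a, b, n):
    """Trapezoidal rule, equation (5.2), with n intervals of width dx."""
    dx = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * dx)
    return total * dx


def simpson(f, a, b, n):
    """Simpson rule, equation (5.3); n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even for the Simpson rule")
    dx = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n // 2 + 1):        # odd mesh points carry weight 4
        total += 4.0 * f(a + (2 * i - 1) * dx)
    for i in range(1, n // 2):            # even interior mesh points carry weight 2
        total += 2.0 * f(a + 2 * i * dx)
    return total * dx / 3.0


if __name__ == "__main__":
    # The exact value of the integral of sin(x) over [0, pi] is 2.
    for n in (10, 20, 40):
        print(n, trapezoid(math.sin, 0.0, math.pi, n) - 2.0,
              simpson(math.sin, 0.0, math.pi, n) - 2.0)
```

Doubling N should reduce the trapezoidal error by roughly a factor of 4 and the Simpson error by roughly a factor of 16, which is a quick way to check the stated orders.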
In higher dimensions the convergence is much slower though. With N points in
d dimensions the linear distance between two points scales only as $N^{-1/d}$. Thus the
Simpson rule in d dimensions converges only as $N^{-4/d}$, which is very slow for large d.
The solution is to use Monte Carlo integrators.
5.2 Monte Carlo integrators
With randomly chosen points the convergence does not depend on dimensionality. Using
N randomly chosen points $x_i$ the integral can be approximated by
$\frac{1}{\Omega}\int f(x)\,dx \approx \overline{f} := \frac{1}{N}\sum_{i=1}^{N} f(x_i)$, (5.4)
where $\Omega := \int dx$ is the integration volume. As we saw in the previous chapter the errors
of such a Monte Carlo estimate scale as $N^{-1/2}$. In $d \ge 9$ dimensions Monte
Carlo methods are thus preferable to a Simpson rule.
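As an illustration of equation (5.4), the sketch below estimates the mean of a function over the unit hypercube $[0,1]^d$ (so that $\Omega = 1$) together with a statistical error of the form $\sqrt{(\overline{f^2} - \overline{f}^2)/(N-1)}$. The test function and all names are assumptions made for this example.

```python
import math
import random


def mc_integrate(f, dim, n, rng):
    """Plain Monte Carlo estimate of the mean of f over [0,1]^dim.

    Returns (mean, error) where error = sqrt((mean of f^2 - mean^2) / (n - 1)).
    """
    s = 0.0
    s2 = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]
        fx = f(x)
        s += fx
        s2 += fx * fx
    mean = s / n
    variance = max(s2 / n - mean * mean, 0.0)
    return mean, math.sqrt(variance / (n - 1))


def sum_of_squares(x):
    """Test integrand: its exact mean over the unit hypercube is dim / 3."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    rng = random.Random(12345)
    for n in (1000, 4000, 16000):
        mean, err = mc_integrate(sum_of_squares, 10, n, rng)
        print(n, mean, err)   # the error should roughly halve when n is quadrupled
```

The cost of one sample grows only linearly with the dimension, while the $N^{-1/2}$ decay of the error is independent of it.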
5.2.1 Importance Sampling
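This simple Monte Carlo integration is however not the ideal method. The reason is
the variance of the function
$\mathrm{Var} f = \frac{1}{\Omega}\int f(x)^2\,dx - \left[ \frac{1}{\Omega}\int f(x)\,dx \right]^2 \approx \frac{N}{N-1}\left( \overline{f^2} - \overline{f}^2 \right)$. (5.5)
The error of the Monte Carlo simulation is
$\Delta = \sqrt{\frac{\mathrm{Var} f}{N}} \approx \sqrt{\frac{\overline{f^2} - \overline{f}^2}{N-1}}$. (5.6)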
In phase space integrals the function is often strongly peaked in a small region of
phase space and has a large variance. The solution to this problem is importance
sampling, where the points $x_i$ are chosen not uniformly but according to a probability
distribution p(x) with
$\int p(x)\,dx = 1$. (5.7)
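Using these p-distributed random points the sampling is done according to
$\langle f \rangle = \frac{1}{\Omega}\int f(x)\,dx = \frac{1}{\Omega}\int \frac{f(x)}{p(x)}\,p(x)\,dx \approx \frac{1}{\Omega}\,\frac{1}{N}\sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)}$ (5.8)
and the error is
$\Delta = \frac{1}{\Omega}\sqrt{\frac{\mathrm{Var}(f/p)}{N}}$. (5.9)
The factor $1/\Omega$ on the right hand side appears because p(x) is normalized to one
by equation (5.7); for uniform sampling, $p(x) = 1/\Omega$, equations (5.4) and (5.6) are recovered.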
It is ideal to choose the distribution function p as similar to f as possible. Then the
ratio f/p is nearly constant and the variance small.
As an example, the function $f(x) = \exp(-x^2)$ is much better integrated using
exponentially distributed random numbers with $p(x) = \exp(-x)$ instead of uniformly
distributed random numbers.
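The example can be checked numerically. The sketch below assumes that the integral is taken over $[0, \infty)$, where its exact value is $\sqrt{\pi}/2$, and that the uniform sampling is truncated to $[0, 10]$ (the neglected tail is of order $e^{-100}$). Both estimators return the integral itself, i.e. without the $1/\Omega$ normalization; the function names are illustrative.

```python
import math
import random


def estimate(samples):
    """Mean and statistical error sqrt((mean of s^2 - mean^2) / (n - 1)) of the samples."""
    n = len(samples)
    mean = sum(samples) / n
    mean_sq = sum(s * s for s in samples) / n
    return mean, math.sqrt(max(mean_sq - mean * mean, 0.0) / (n - 1))


def integrate_uniform(n, rng, cutoff=10.0):
    """Uniform sampling on [0, cutoff]: each sample is cutoff * f(x)."""
    samples = []
    for _ in range(n):
        x = cutoff * rng.random()
        samples.append(cutoff * math.exp(-x * x))
    return estimate(samples)


def integrate_importance(n, rng):
    """Importance sampling with p(x) = exp(-x): each sample is f(x) / p(x)."""
    samples = []
    for _ in range(n):
        x = rng.expovariate(1.0)              # exponentially distributed, rate 1
        samples.append(math.exp(x - x * x))   # exp(-x^2) / exp(-x)
    return estimate(samples)


if __name__ == "__main__":
    rng = random.Random(1)
    print("exact     ", math.sqrt(math.pi) / 2.0)
    print("uniform   ", integrate_uniform(20000, rng))
    print("importance", integrate_importance(20000, rng))
```

For the same number of points the importance-sampled estimate should show a clearly smaller error, because f/p varies much less than f does over the sampled region.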
A natural choice for the weighting function p is often given in the case of phase
space integrals or sums, where an observable A is averaged over all configurations x in
phase space where the probability of a configuration is p(x). The phase space average
$\langle A \rangle$ is:
$\langle A \rangle = \frac{\int A(x)\,p(x)\,dx}{\int p(x)\,dx}$. (5.10)
5.3 Pseudo random numbers
The most important ingredient for a Monte Carlo calculation is a source of random
numbers. The problem is: how can a deterministic computer calculate true random
numbers? One possible source of random numbers is to go to a casino (not necessarily
in Monte Carlo) and obtain random numbers from the roulette wheel.
Since this is not a useful suggestion the best remaining solution is to calculate pseudo
random numbers using a numerical algorithm. Despite being deterministic these pseudo
random number generators can produce sequences of numbers that look random if one
does not know the underlying algorithm. As long as they are sufficiently random (you
might already see a problem appearing here), these pseudo random numbers can be
used instead of true random numbers.
5.3.1 Uniformly distributed random numbers
A popular type of random number generator producing uniformly distributed numbers is the
linear congruential generator (LCG)
$x_n = (a x_{n-1} + c) \bmod m$, (5.11)
with positive integer numbers a, c and m. The quality of the pseudo random numbers
depends sensitively on the choice of these parameters. A common and good choice is
a = 16807, c = 0, $m = 2^{31} - 1$ and $x_0 = 667790$. The main problem of LCG generators
is that, because the next number $x_{n+1}$ depends only on one previous number $x_n$, the
period of the sequence of numbers produced is at most m. Current computers with gigahertz clock
rates can easily exhaust such a sequence in seconds. LCG generators should thus no
longer be used.
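For illustration only, and keeping the warning above in mind, equation (5.11) with the quoted parameters can be written as a Python generator. The function names are chosen for this sketch.

```python
def lcg(seed=667790, a=16807, c=0, m=2**31 - 1):
    """Linear congruential generator, equation (5.11): yields x_1, x_2, ..."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def lcg_uniform(seed=667790, a=16807, c=0, m=2**31 - 1):
    """The same sequence mapped to floating point numbers in (0, 1)."""
    for x in lcg(seed, a, c, m):
        yield x / m


if __name__ == "__main__":
    gen = lcg_uniform()
    print([next(gen) for _ in range(5)])
```

Since the state is a single integer below m, the sequence has to repeat after at most m steps, which is the period limitation described above.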
Modern generators are usually based on lagged Fibonacci methods, such as the
generator
$x_n = (x_{n-607} + x_{n-253}) \bmod m$. (5.12)
The first 607 numbers need to be produced by another generator, e.g. an LCG generator.
Instead of the shifts (607, 253) other good choices can be (2281, 1252), (9689, 5502) or
(44497, 23463). By calculating the next number from more than one previous number
these generators can have extremely long periods.
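A minimal sketch of the lagged Fibonacci recursion (5.12) is given below. The modulus $m = 2^{32}$ and the LCG used for seeding are assumptions made for this example, as the text does not fix them.

```python
from collections import deque


def lcg_seed_values(count, seed=667790, a=16807, m=2**31 - 1):
    """Produce the initial values with a simple LCG (assumed seeding strategy)."""
    values = []
    x = seed
    for _ in range(count):
        x = (a * x) % m
        values.append(x)
    return values


def lagged_fibonacci(seed_values, p=607, q=253, m=2**32):
    """Yield x_n = (x_{n-p} + x_{n-q}) mod m, given the first p values."""
    if len(seed_values) != p:
        raise ValueError("exactly p seed values are required")
    # history[0] is x_{n-p} and history[-1] is x_{n-1}
    history = deque(seed_values, maxlen=p)
    while True:
        x = (history[0] + history[p - q]) % m   # history[p - q] is x_{n-q}
        history.append(x)                        # drops the oldest value x_{n-p}
        yield x


if __name__ == "__main__":
    gen = lagged_fibonacci(lcg_seed_values(607))
    print([next(gen) for _ in range(5)])
```

The state now consists of 607 numbers instead of one, which is what allows periods far longer than m.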
One of the most recent generators is the Mersenne Twister, a combination of a
lagged Fibonacci generator with a bit twister, shuffling around the bits of the random
numbers to improve the quality of the generator.
Instead of coding a pseudo random number generator yourself, use one of the libraries
such as the SPRNG library in Fortran or C or the Boost random number library in C++.
5.3.2 Testing pseudo random numbers
Before using a pseudo random number generator you will have to test the generator to
determine whether the numbers are sufficiently random for your application. Standard
tests that are applied to all new generators include the following:
• The period has to be longer than the number of pseudo random numbers required.
• The numbers have to be uniformly distributed. This can be tested by a $\chi^2$ or a
Kolmogorov-Smirnov test.
• Successive numbers should not be correlated. This can be tested by a $\chi^2$ test or
by a simple graphical test: fill a square with successive points at the coordinates
$(x_{2i-1}, x_{2i})$ and look for structures.
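The uniformity test in the list above can be sketched as follows. The statistic is the standard $\chi^2$ sum over equal-width bins; the number of bins and the use of Python's built-in generator as the test subject are choices made for this example.

```python
import random


def chi_square_uniform(numbers, n_bins):
    """Chi-square statistic for numbers in [0, 1) against a uniform distribution."""
    counts = [0] * n_bins
    for u in numbers:
        index = min(int(u * n_bins), n_bins - 1)   # guard against u == 1.0
        counts[index] += 1
    expected = len(numbers) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    rng = random.Random(7)
    sample = [rng.random() for _ in range(100000)]
    # For a good generator the value should be comparable to n_bins - 1.
    print(chi_square_uniform(sample, 100))
```

A value far above the number of degrees of freedom (here 99) indicates a non-uniform distribution; a value far below it is also suspicious, since the numbers are then more regular than random numbers should be.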
All of these tests, and a large number of similar tests, are necessary but by no
means sufficient, as Landau, Ferrenberg and Wong have demonstrated. They showed
that standard and well tested generators gave very wrong results when applied to a
simulation of the Ising model. The reason was long-range correlations in the generators
that had not been picked up by any test. The consequence is that no matter how many
tests you make, you can never be sure about the quality of the pseudo random number
generator. After all, the numbers are not really random. The only reliable test is to
rerun your simulation with another random number generator and test whether the
result changes or not.
5.3.3 Nonuniformly distributed random numbers
Nonuniformly distributed random numbers can be obtained from uniformly distributed
random numbers in the following way. Consider the probability that a random number
y, distributed with a distribution f, is less than x. This probability is just the integral
$P_f[y < x] = \int^{x} \ldots$
[...]
$\sum_y W_{xy} = 1$ (5.18)
A consequence is that the Markov process conserves the total probability. Another
consequence is that the largest eigenvalue of the transition matrix W is 1 and the
corresponding eigenvector with only positive entries is the equilibrium distribution which
is reached after a large number of Markov steps.
We want to determine the transition matrix W so that we asymptotically reach the
desired probability $p_x$ for a configuration x. A set of sufficient conditions is:
1. Ergodicity: It has to be possible to reach any configuration x from any other
configuration y in a finite number of Markov steps. This means that for all x and
y there exists a positive integer $n < \infty$ such that $(W^n)_{xy} \ne 0$.
2. Detailed balance: The probability distribution $p_x^{(n)}$ changes at each step of the
Markov process:
$\sum_x p_x^{(n)} W_{xy} = p_y^{(n+1)}$, (5.19)
but converges to the equilibrium distribution $p_x$. This equilibrium distribution $p_x$
is a left eigenvector with eigenvalue 1 and the equilibrium condition
$\sum_x p_x W_{xy} = p_y$ (5.20)
must be fulfilled. It is easy to see that the detailed balance condition
$\frac{W_{xy}}{W_{yx}} = \frac{p_y}{p_x}$ (5.21)
is sufficient.
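The simplest Monte Carlo algorithm is the Metropolis algorithm:
• Starting with a point $x_i$ choose randomly one of a fixed number N of changes $\Delta x$,
and propose a new point $x' = x_i + \Delta x$.
• Calculate the ratio of the probabilities $P = p_{x'}/p_{x_i}$.
• If P > 1 the next point is $x_{i+1} = x'$.
• If P < 1 then $x_{i+1} = x'$ with probability P, otherwise $x_{i+1} = x_i$. This is done by
drawing a random number r, uniformly distributed in the interval [0, 1), and setting
$x_{i+1} = x'$ if r < P.
• Measure the quantity A at the new point $x_{i+1}$.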
The algorithm is ergodic if one ensures that the N possible random changes allow all
points in the integration domain to be reached in a finite number of steps. If additionally
for each change $\Delta x$ there is also an inverse change $-\Delta x$ we also fulfill detailed balance:
$\frac{W_{ij}}{W_{ji}} = \frac{\frac{1}{N}\min(1, p(j)/p(i))}{\frac{1}{N}\min(1, p(i)/p(j))} = \frac{p(j)}{p(i)}$. (5.22)
As an example let us consider summation over integers i. We choose N = 2 possible
changes $\Delta i = \pm 1$ and fulfill both ergodicity and detailed balance as long as p(i) is nonzero
only over a finite contiguous subset of the integers.
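The integer example can be turned into a short program. The sketch below follows the Metropolis steps listed above with the two changes $\Delta i = \pm 1$; the weight function used in the demonstration (p(i) proportional to i for i = 1, ..., 4 and zero elsewhere) is an arbitrary choice for illustration. Only ratios of p enter, so p does not need to be normalized.

```python
import random


def metropolis_integers(p, start, n_steps, rng):
    """Metropolis sampling over the integers with proposed changes of +1 or -1.

    p(i) returns an unnormalized weight; p(start) must be nonzero.
    """
    i = start
    chain = []
    for _ in range(n_steps):
        j = i + rng.choice((-1, 1))          # propose one of N = 2 changes
        ratio = p(j) / p(i)
        if ratio >= 1.0 or rng.random() < ratio:
            i = j                            # accept the proposed point
        chain.append(i)                      # a rejected move repeats the old point
    return chain


def linear_weight(i):
    """Example weight: proportional to i on 1..4, zero elsewhere."""
    return float(i) if 1 <= i <= 4 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    chain = metropolis_integers(linear_weight, 1, 200000, rng)
    for value in (1, 2, 3, 4):
        # The frequencies should approach 0.1, 0.2, 0.3 and 0.4.
        print(value, chain.count(value) / len(chain))
```

Note that a rejected proposal still contributes a measurement at the old point; dropping these repeated points would bias the sampled distribution.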
To integrate a one-dimensional function we take the limit $N \to \infty$ and pick any
change $\delta \in [-\Delta, \Delta]$ with equal probability. The detailed balance equation (5.22) is
only modified in minor ways:
$\frac{W_{ij}}{W_{ji}} = \frac{\frac{d\delta}{2\Delta}\min(1, p(j)/p(i))}{\frac{d\delta}{2\Delta}\min(1, p(i)/p(j))} = \frac{p(j)}{p(i)}$. (5.23)
Again, as long as p(x) is nonzero only on a finite interval detailed balance and ergodicity
are fulfilled.
5.5 Autocorrelations, equilibration and Monte Carlo error estimates
5.5.1 Autocorrelation effects
In the determination of statistical errors of the Monte Carlo estimates we have to take
into account correlations between successive points $x_i$ in the Markov chain. These
correlations between configurations manifest themselves in correlations between the
measurements of a quantity A measured in the Monte Carlo process. Denote by A(t)
the measurement of the observable A evaluated at the t-th Monte Carlo point $x_t$. The
autocorrelations decay exponentially for large time differences $\Delta$:
$\langle A_t A_{t+\Delta} \rangle - \langle A \rangle^2 \propto \exp(-\Delta/\tau_A^{(exp)})$ (5.24)
Note that the autocorrelation time $\tau_A$ depends on the quantity A.
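An alternative definition is the integrated autocorrelation time $\tau_A^{(int)}$, defined by
$\tau_A^{(int)} = \frac{\sum_{\Delta=1}^{\infty} \left( \langle A_t A_{t+\Delta} \rangle - \langle A \rangle^2 \right)}{\langle A^2 \rangle - \langle A \rangle^2}$ (5.25)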
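As usual the expectation value of the quantity A can be estimated by the mean
$\overline{A}$, using equation (5.4). The error estimate [equation (5.6)] has to be modified. The
error estimate $(\Delta A)^2$ is the expectation value of the squared difference between sample
average and expectation value:
$(\Delta A)^2 = \langle (\overline{A} - \langle A \rangle)^2 \rangle = \left\langle \left( \frac{1}{N}\sum_{t=1}^{N} A(t) - \langle A \rangle \right)^2 \right\rangle$
$= \frac{1}{N^2}\sum_{t=1}^{N} \left( \langle A(t)^2 \rangle - \langle A \rangle^2 \right) + \frac{2}{N^2}\sum_{t=1}^{N}\sum_{\Delta=1}^{N-t} \left( \langle A(t) A(t+\Delta) \rangle - \langle A \rangle^2 \right)$
$\approx \frac{1}{N}\,\mathrm{Var} A\,(1 + 2\tau_A^{(int)})$
$\approx \frac{1}{N-1} \langle \overline{A^2} - \overline{A}^2 \rangle (1 + 2\tau_A^{(int)})$ (5.26)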
In going from the second to third line we assumed $\tau_A^{(int)} \ll N$ and extended the
summation over $\Delta$ to infinity. In the last line we replaced the variance by an estimate obtained
from the sample. We see that the number of statistically uncorrelated samples is reduced
from N to $N/(1 + 2\tau_A^{(int)})$.
In many Monte Carlo simulations the error analysis is unfortunately not done
accurately. Thus we wish to discuss this topic here and in the exercises.
5.5.2 The binning analysis
The binning analysis is a reliable way to estimate the integrated autocorrelation times.
Starting from the original series of measurements $A_i^{(0)}$ with i = 1, . . . , N we iteratively
create "binned" series by averaging over two consecutive entries:
$A_i^{(l)} := \frac{1}{2}\left( A_{2i-1}^{(l-1)} + A_{2i}^{(l-1)} \right), \quad i = 1, \ldots, N_l \equiv N/2^l$. (5.27)
These bin averages $A_i^{(l)}$ are less correlated than the original values $A_i^{(0)}$. The mean value
is still the same.
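The errors $\Delta A^{(l)}$, estimated incorrectly using equation (5.6),
$\Delta A^{(l)} = \sqrt{\frac{\mathrm{Var} A^{(l)}}{N_l - 1}} \approx \frac{1}{N_l}\sqrt{\sum_{i=1}^{N_l} \left( A_i^{(l)} - \overline{A^{(l)}} \right)^2}$ (5.28)
however increase as a function of bin size $2^l$. For $2^l \gg \tau_A^{(int)}$ the bins become uncorrelated
and the errors converge to the correct error estimate:
$\Delta A = \lim_{l \to \infty} \Delta A^{(l)}$. (5.29)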
This binning analysis gives a reliable recipe for estimating errors and autocorrelation
times. One has to calculate the error estimates for different bin sizes $2^l$ and check if they
converge to a limiting value. If convergence is observed the limit $\Delta A$ is a reliable error
estimate, and $\tau_A^{(int)}$ can be obtained from equation (5.26) as
$\tau_A^{(int)} = \frac{1}{2}\left[ \left( \frac{\Delta A}{\Delta A^{(0)}} \right)^2 - 1 \right]$ (5.30)
If however no convergence of the $\Delta A^{(l)}$ is observed we know that $\tau_A^{(int)}$ is longer than
the simulation time and we have to perform much longer simulations to obtain reliable
error estimates.
To be really sure about convergence and autocorrelations it is very important to start
simulations always on tiny systems and check convergence carefully before simulating
larger systems.
For the projects we will provide you with a simple observable class that implements
this binning analysis.
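Independently of the observable class mentioned above, the binning recipe of equations (5.27) to (5.30) can be sketched in a few lines. The test series used here, a first order autoregressive process $x_{t+1} = a x_t + \eta_t$ with Gaussian noise, is an assumption made for the demonstration; its integrated autocorrelation time is $a/(1-a)$. The minimum number of bins is likewise an arbitrary cutoff.

```python
import math
import random


def naive_error(values):
    """Error estimate that ignores correlations: sqrt(Var / (N - 1))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return math.sqrt(variance / (n - 1))


def binning_errors(values, min_bins=32):
    """Naive errors for binning levels l = 0, 1, 2, ... (bin size 2^l)."""
    series = list(values)
    errors = []
    while len(series) >= min_bins:
        errors.append(naive_error(series))
        # Average pairs of consecutive entries, equation (5.27).
        series = [0.5 * (series[2 * i] + series[2 * i + 1])
                  for i in range(len(series) // 2)]
    return errors


def integrated_autocorrelation_time(errors, level):
    """Equation (5.30), using the error at the given level as the converged value."""
    return 0.5 * ((errors[level] / errors[0]) ** 2 - 1.0)


def correlated_series(n, a, rng):
    """Test data: x_{t+1} = a * x_t + Gaussian noise."""
    x = 0.0
    series = []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


if __name__ == "__main__":
    rng = random.Random(1)
    errors = binning_errors(correlated_series(2 ** 16, 0.9, rng))
    for level, err in enumerate(errors):
        print(level, 2 ** level, err)
    print("tau_int estimate:", integrated_autocorrelation_time(errors, 8))
```

In the printed table the error should first grow with the bin size and then level off; the fluctuations at the highest levels come from the small number of remaining bins.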
5.5.3 Jackknife analysis
The binning procedure is a straightforward way to determine errors and autocorrelation
times for Monte Carlo measurements. For functions of measurements like $U = \langle A \rangle / \langle B \rangle$
it becomes difficult because of error propagation and cross-correlations.
Then the jackknife procedure can be used. We again split the measurements into M
bins of size $N/M \gg \tau^{(int)}$ that should be much larger than any of the autocorrelation
times.
We could now evaluate the complex quantity U in each of the M bins and obtain
an error estimate from the variance of these estimates. As each of the bins contains
only a rather small number of measurements N/M the statistics will not be good. The
jackknife procedure instead works with M + 1 evaluations of U. $U_0$ is the estimate using
all bins, and $U_i$ for i = 1, . . . , M is the value when all bins except the i-th bin are used.
That way we always work with a large data set and obtain good statistics.
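The resulting estimate for U will be:
$U = U_0 - (M - 1)(\overline{U} - U_0)$ (5.31)
with a statistical error
$\Delta U = \sqrt{M - 1} \left[ \frac{1}{M}\sum_{i=1}^{M} (U_i)^2 - (\overline{U})^2 \right]^{1/2}$, (5.32)
where
$\overline{U} = \frac{1}{M}\sum_{i=1}^{M} U_i$. (5.33)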
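Equations (5.31) to (5.33) can be sketched for the ratio $U = \langle A \rangle / \langle B \rangle$ as follows. The function name and the convention of dropping measurements that do not fill a complete bin are choices made for this example.

```python
import math


def jackknife_ratio(a, b, n_bins):
    """Jackknife estimate and error of <A>/<B> from paired measurement series."""
    bin_len = len(a) // n_bins
    a_sums = [sum(a[k * bin_len:(k + 1) * bin_len]) for k in range(n_bins)]
    b_sums = [sum(b[k * bin_len:(k + 1) * bin_len]) for k in range(n_bins)]
    total_a = sum(a_sums)
    total_b = sum(b_sums)

    u0 = total_a / total_b                       # estimate using all bins
    # U_i: estimate with bin i left out (ratios of sums equal ratios of means)
    u_i = [(total_a - sa) / (total_b - sb) for sa, sb in zip(a_sums, b_sums)]
    u_bar = sum(u_i) / n_bins                    # equation (5.33)

    estimate = u0 - (n_bins - 1) * (u_bar - u0)  # equation (5.31)
    spread = sum(u * u for u in u_i) / n_bins - u_bar * u_bar
    error = math.sqrt((n_bins - 1) * max(spread, 0.0))   # equation (5.32)
    return estimate, error


if __name__ == "__main__":
    print(jackknife_ratio([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0], 4))
```

In the small example B is constant, so U is a linear function of the measurements and the jackknife error coincides with the ordinary standard error of the mean of A/2, which provides a simple consistency check.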
5.5.4 Equilibration
Thermalization is as important as autocorrelations. The Markov chain converges only
asymptotically to the desired distribution. Consequently, Monte Carlo measurements
should be started only after a large number $N_{eq}$ of equilibration steps, when the
distribution is sufficiently close to the asymptotic distribution. $N_{eq}$ has to be much larger
than the thermalization time, which is defined similarly to the autocorrelation time as:
$\tau_A^{(eq)} = \frac{\sum_{\Delta=1}^{\infty} \left( \langle A_0 A_\Delta \rangle - \langle A \rangle^2 \right)}{\langle A_0 \rangle \langle A \rangle - \langle A \rangle^2}$ (5.34)
It can be shown that the thermalization time is the maximum of all autocorrelation
times for all observables and is related to the second largest eigenvalue $\lambda_2$ of the Markov
transition matrix W by $\tau^{(th)} = -1/\log \lambda_2$. It is recommended to thermalize the system
for at least ten times the thermalization time before starting measurements.
Chapter 6
Percolation
6.1 Introduction
While the molecular dynamics and Monte Carlo methods can, in principle, be used
to study any physical system, difficulties appear close to phase transitions. In our
investigation of phase transitions we will start with percolation, a very simple and
purely geometric topic.
Although there is no dynamics of any kind involved, percolation nevertheless exhibits
complex behavior and a phase transition. This simple model will allow us to introduce
many concepts that we will need later for the simulation of dynamic systems. These
concepts include:
• Phase transitions
• Scaling
• Finite size effects and finite size scaling
• Monte Carlo simulations
• Analytic and numeric renormalization group methods
• Series expansion methods
Percolation models can be applied to a variety of problems:
• Oil fields: The "Swiss cheese model" can be used as a simplified model for the
storage and flow of liquids in porous media. The porous stone is modeled by
spherical cavities distributed randomly in the surrounding medium. If the density
of cavities gets larger they start to overlap and form large clusters of cavities.
Important questions that can be asked include:
   • For oil drilling we want to know how the amount of oil stored varies with the
   size of the oil field.
   • Fluids can only flow through the medium if there is a cluster connecting
   opposite sides. Such a cluster is called a "percolating" cluster. What density
   of cavities do we need to create a percolating cluster?
   • Next we want to know how the speed of fluid flow through the medium
   depends on the density.
• Forest fires: We model forest fires by assuming that a tree neighboring a burning
tree catches fire with a probability p. Fire fighters will want to know:
   • How much of the forest will burn?
   • Will the fire spread throughout the whole forest?
• Spread of diseases: The spread of diseases can be modeled in a simplified way
similar to the forest fire model. Now the important question is: what part of the
population will fall ill? Will the disease be an epidemic, spreading throughout
the whole country, or even a pandemic, spreading over the whole world?
• Conductance of wire meshes:
   • How many links can be cut in a wire mesh so that it is still conducting?
   • What is the resistivity as a function of the ratio of cut links?
• Vulnerability of the internet: The internet was designed in 1964 to make
computer networks reliable even in the case of attacks in war time. The most
urgent question is:
   • What portion of the internet is still connected if a fraction p of switches fails?
• Gelation of liquids: Gelation of liquids can be modeled by allowing a molecule to
form a bond with a neighboring molecule with probability p. Again the interesting
questions are:
   • What is the average size of molecule clusters as a function of p?
   • What is the critical concentration at which the largest molecule cluster
   percolates and the liquid solidifies?
• Baking of cookies: A nice household example is given in the textbook by Gould
and Tobochnik. Distribute several drops of dough randomly on a cookie tray.
When baking, the dough will spread and cookies that are close will bake together.
If the number of drops is too large we will obtain a percolating cookie, spanning
the baking tray from one edge to the other.
For a detailed discussion we refer to the book by Stauffer and Aharony.
6.2 Site percolation on a square lattice
In this lecture we will discuss in detail the simplest of percolation models: site
percolation on a two dimensional square lattice. Each square shall be occupied with probability
p. Occupied squares that share edges form a cluster.
Figure 6.1: Examples of site percolation on a 16 × 16 square lattice for several
probabilities: a) p = 0.2, b) $p = p_c = 0.5927$, and c) p = 0.8. It can be seen that for $p \ge p_c$
there is a spanning cluster, reaching from the top to the bottom edge.
As can be seen in figure 6.1 or by playing with the Java applets on our web page, for
large p there is a percolating cluster, i.e. a cluster that spans the lattice from one
edge to the other. In the infinite lattice limit there is a sharp transition at a critical
density $p_c$. For $p < p_c$ there is never a percolating cluster and for $p > p_c$ there is always
a percolating cluster.
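A minimal sketch of site percolation on a square lattice is given below. It fills an L × L lattice with probability p and uses a breadth-first search from the top row to decide whether a cluster spans to the bottom row; the function names and the choice of a top-to-bottom spanning criterion with open boundaries are assumptions of this example.

```python
import random
from collections import deque


def random_lattice(size, p, rng):
    """Occupy each site of a size x size lattice independently with probability p."""
    return [[rng.random() < p for _ in range(size)] for _ in range(size)]


def spans_top_to_bottom(lattice):
    """True if occupied sites connected via shared edges link the top and bottom rows."""
    size = len(lattice)
    seen = [[False] * size for _ in range(size)]
    queue = deque()
    for col in range(size):                  # start from all occupied top-row sites
        if lattice[0][col]:
            seen[0][col] = True
            queue.append((0, col))
    while queue:
        row, col = queue.popleft()
        if row == size - 1:
            return True
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < size and 0 <= c < size and lattice[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue.append((r, c))
    return False


def spanning_probability(size, p, n_samples, rng):
    """Fraction of random lattices that contain a spanning cluster."""
    hits = sum(spans_top_to_bottom(random_lattice(size, p, rng))
               for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    rng = random.Random(3)
    for p in (0.4, 0.5, 0.55, 0.6, 0.65, 0.7):
        print(p, spanning_probability(32, p, 200, rng))
```

On a finite lattice the spanning probability rises smoothly from 0 to 1 around $p_c$; the rise becomes steeper as the lattice size is increased.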
We will ask the following questions:
• What is the critical concentration $p_c$?
• How do the number of clusters and average cluster size S depend on p?
• What is the probability P that a site is on the percolating cluster?
• What is the resistivity/conductance of the percolating cluster?
• How does a finite lattice size L change the results?
• How do the results depend on the lattice structure and on the specific percolation
model used?
The answer to the last question will be that close to the critical concentration $p_c$ the
properties depend only on the dimensionality d and on the type (continuum or lattice)
but not on the lattice structure or specific percolation model. Thus our results obtained
for the percolation transition on the square lattice will be "universal" in the sense that
they will apply also to all other two-dimensional percolation models, like the forest fire
model, the spread of diseases and the problems encountered when baking cookies.
6.3 Exact solutions
6.3.1 One dimension
In one dimension the percolation problem can be solved exactly. We will use this exactly
soluble case to introduce some quantities of interest.
A percolating cluster spans the whole chain and is not allowed to contain any empty
site, thus $p_c = 1$.
The probability that a site is the left edge of a cluster of finite size s is simply
$n_s = (1 - p)^2 p^s$ (6.1)
as the cluster consists of s occupied sites neighbored by two empty ones. The probability
that a random site is anywhere on an s-site cluster is $s n_s$. The sum over all cluster sizes
s leads to a sum rule which is valid not only in one dimension. The probability that a
site is on a cluster of any size is just p:
$P + \sum_s s n_s = p$, (6.2)
where P is the probability that a site is on the infinite percolating cluster.
In one dimension P = 1 for $p = p_c$ and P = 0 otherwise.
The average cluster size is
$S = \frac{\sum_s s^2 n_s}{\sum_s s n_s} = \frac{1 + p}{1 - p} \propto (p_c - p)^{-1}$, (6.3)
where the last expression is the universal asymptotic form near $p_c$.
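The one-dimensional results can be verified numerically by truncating the sums at a large cluster size. This is a small check under that assumption; the function names are illustrative.

```python
def cluster_numbers_1d(p, s_max):
    """Cluster numbers n_s = (1 - p)^2 p^s of equation (6.1) for s = 1..s_max."""
    return {s: (1.0 - p) ** 2 * p ** s for s in range(1, s_max + 1)}


def occupied_fraction(n_s):
    """Sum rule (6.2) for p < p_c, where P = 0: sum over s of s * n_s."""
    return sum(s * n for s, n in n_s.items())


def mean_cluster_size(n_s):
    """Average cluster size S of equation (6.3)."""
    return sum(s * s * n for s, n in n_s.items()) / occupied_fraction(n_s)


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        n_s = cluster_numbers_1d(p, 20000)
        print(p, occupied_fraction(n_s), mean_cluster_size(n_s), (1 + p) / (1 - p))
```

The truncation at `s_max` is harmless as long as $p^{s_{max}}$ is negligible, which requires larger and larger cutoffs as p approaches $p_c = 1$.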
The pair correlation function g(r) gives the probability that for an occupied site a
site at distance r belongs to the same cluster. In one dimension it is
$g(r) = p^{|r|} = \exp(-|r|/\xi)$, (6.4)
where
$\xi = -\frac{1}{\log p} \approx (p_c - p)^{-1}$ (6.5)
is called the correlation length. Again there is a sum rule for the pair correlation
function:
$\sum_{r=-\infty}^{\infty} g(r) = S$ (6.6)
6.3.2 Infinite dimensions
Another exact solution is available in infinite dimensions on the Bethe lattice, shown in
figure 6.2. It is a tree which, starting from one site, branches out, with every site having
$z \ge 3$ neighbors.
Why is this lattice infinite dimensional? In d dimensions the volume scales like
$L^d$ and the surface like $L^{d-1}$, thus surface $\propto$ volume$^{1-1/d}$. The Bethe lattice with R
generations of sites has a boundary of $z(z-1)^{R-1}$ sites and a volume of
$1 + z[(z-1)^R - 1]/(z-2)$ sites. For large R we have surface $\approx$ volume $\times\,(z-2)/(z-1)$ and thus
$d = \infty$.
As the Bethe lattice contains no loops everything can again be calculated exactly.
Let us follow a branch in a cluster. At each site it is connected to z sites and branches
out to z − 1 new branches. At each site the cluster branches into p(z − 1) occupied
branches. If p(z − 1) < 1 the number of branches decreases and the cluster will be
finite. If p(z − 1) > 1 the number of branches increases and the cluster will be infinite.
Thus
$p_c = \frac{1}{z - 1}$ (6.7)
Figure 6.2: Part of the infinite dimensional Bethe lattice for z = 3. In this lattice each
site is connected to z neighbors.
P can be calculated exactly for z = 3 using simple arguments that are discussed
in detail in the book by Aharony and Stauffer. We define by Q the probability that a
cluster starting at the root of one of the branches does not connect to infinity. This is
the case if either (i) the root of the branch is empty or (ii) it is occupied but neither of
the branches connected to it extends to infinity. Thus for z = 3
$Q = 1 - p + pQ^2$ (6.8)
This equation has two solutions: Q = 1 and Q = (1 − p)/p. The probability that a site
is not on the percolating cluster is then $1 - p + pQ^3$, as it is either empty or occupied
but not connected to infinity by any of the z = 3 branches connected to the site. The
solution Q = 1 gives P = 0, corresponding to $p < p_c$, while Q = (1 − p)/p gives
$P = 1 - (1 - p + pQ^3) = p\left[ 1 - \left( \frac{1 - p}{p} \right)^3 \right] \propto (p - p_c)$ (6.9)
for $p > p_c = 1/2$. Similar calculations can also be performed
analytically for larger z, always leading to the same power law.
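The choice between the two solutions of equation (6.8) can be illustrated numerically. Iterating the map $Q \to 1 - p + pQ^2$ from Q = 0 converges to the smaller of the two fixed points, which is Q = 1 below $p_c = 1/2$ and Q = (1 − p)/p above it. This fixed-point iteration is a device chosen for the sketch, not part of the argument in the text, and it converges only slowly very close to $p_c$.

```python
def bethe_q(p, n_iter=2000):
    """Smallest solution of Q = 1 - p + p Q^2 (z = 3), by fixed-point iteration."""
    q = 0.0
    for _ in range(n_iter):
        q = 1.0 - p + p * q * q
    return q


def bethe_percolation_probability(p):
    """P = 1 - (1 - p + p Q^3) = p (1 - Q^3), equation (6.9)."""
    q = bethe_q(p)
    return p * (1.0 - q ** 3)


def bethe_exact(p):
    """Closed form of equation (6.9) for z = 3, with P = 0 below p_c = 1/2."""
    if p <= 0.5:
        return 0.0
    return p * (1.0 - ((1.0 - p) / p) ** 3)


if __name__ == "__main__":
    for p in (0.3, 0.45, 0.55, 0.6, 0.75, 0.9):
        print(p, bethe_percolation_probability(p), bethe_exact(p))
```

Just above $p_c$ the printed values grow linearly in $p - p_c$, in line with the power law stated in equation (6.9).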
A similar argument can be found for the mean cluster size for $p < p_c$. Let us
call the size of a cluster on a branch T. This is 0 if the root of the branch is empty
and 1 + (z − 1)T if it is occupied. Thus: T = p(1 + (z − 1)T), with the solution
T = p/(1 + p − pz). The size of the total cluster is the root plus the z branches:
$S = 1 + zT = \frac{1 + p}{1 - (z - 1)p} \propto (p_c - p)^{-1}$ (6.10)
What about the cluster probabilities $n_s$? A cluster of $s$ sites has $2 + (z-2)s$ neighboring empty sites. By looking at the ratio $n_s(p)/n_s(p_c)$ we get rid of prefactors and obtain:
$$\frac{n_s(p)}{n_s(p_c)} = \left(\frac{p}{p_c}\right)^s \left(\frac{1-p}{1-p_c}\right)^{2+(z-2)s} \tag{6.11}$$
For $z = 3$ this is asymptotically
$$\frac{n_s(p)}{n_s(p_c)} \propto \exp(-cs) \quad \text{with} \quad c = -\log[1 - 4(p-p_c)^2] \propto (p-p_c)^2. \tag{6.12}$$
All that is missing now is an expression for $n_s(p_c)$. Unfortunately this cannot be obtained exactly for arbitrary $s$. Instead we make an educated guess. As $S = \sum_s s^2 n_s(p) / \sum_s s\,n_s(p)$ must diverge at $p = p_c$ and $p = \sum_s s\,n_s(p) + P$ is finite, $n_s(p_c)$ must decay faster than $s^{-2}$ but slower than $s^{-3}$. We make the ansatz first suggested by M. E. Fisher:
$$n_s(p_c) \propto s^{-\tau} \tag{6.13}$$
with $2 < \tau \le 3$.
From the summation
$$S = \frac{\sum_s s^2 n_s(p)}{\sum_s s\,n_s(p)} \propto (p_c - p)^{2\tau - 6} \tag{6.14}$$
we obtain by comparing with equation (6.10) that
$$\tau = 5/2. \tag{6.15}$$
By using equations (6.12) and (6.13) we can rederive the asymptotic power laws of equations (6.9) and (6.10).
6.4 Scaling
We have seen that in both exactly solvable cases the interesting quantities have power law singularities at $p_c$. The generalization of this to arbitrary lattices is the "scaling ansatz", which cannot be proven. However, it can be motivated from the fractal behavior of the percolating cluster, from renormalization group arguments and from the good agreement of this ansatz with numerical results.
6.4.1 The scaling ansatz
We generalize the scaling for the average cluster size $S$ to:
$$S \propto |p - p_c|^{-\gamma} \tag{6.16}$$
where $\gamma$ need not be equal to 1 in general. In the calculation of the average cluster size we omit, as done before, any existing percolating cluster, as that would give an infinite contribution.
The correlation length $\xi$ can be defined as
$$\xi^2 = \frac{\sum_r r^2 g(r)}{\sum_r g(r)}, \tag{6.17}$$
which is equivalent to the previous definition from the exponential decay of the correlation function. The sum over all distances can be split into sums over all clusters, adding the contribution of each cluster:
$$\xi^2 = \frac{2\sum_{\rm cluster} R^2_{\rm cluster}\, n({\rm cluster})^2}{\sum_{\rm cluster} n({\rm cluster})^2}, \tag{6.18}$$
where
$$R^2_{\rm cluster} = \frac{1}{2n({\rm cluster})^2} \sum_{i,j \in {\rm cluster}} |\mathbf{r}_i - \mathbf{r}_j|^2 \tag{6.19}$$
is the mean squared radius of the cluster, $n({\rm cluster})$ the size of the cluster and $\mathbf{r}_i$ the location of the $i$-th site. The definition (6.18) through the cluster "moment of inertia" is the most natural one for simulations.
Also for the correlation length we define an exponent $\nu$:
$$\xi \propto |p - p_c|^{-\nu} \tag{6.20}$$
The pair correlation function $g(r)$ for $p \ne p_c$ decays exponentially with the correlation length $\xi$. At the critical concentration $p = p_c$, however, the correlation length $\xi$ diverges and we assume a power law:
$$g(r) \propto r^{-(d-2+\eta)} \tag{6.21}$$
For the probability $P$ that a site is on the percolating cluster we make the ansatz:
$$P \propto (p - p_c)^{\beta} \tag{6.22}$$
Finally we define an exponent for the cluster density $M_0 = \sum_s n_s$, which scales as
$$M_0 \propto |p - p_c|^{2-\alpha} \tag{6.23}$$
Fortunately we do not have to calculate all five exponents, as there exist scaling relations between them. To derive them we start from an ansatz for the cluster numbers $n_s$ which is motivated from the previous exact results:
$$n_s(p) = s^{-\tau} f[(p - p_c)s^{\sigma}], \tag{6.24}$$
where $f$ is a scaling function that needs to be determined and $\sigma$ and $\tau$ are two universal exponents. Both the one-dimensional and the infinite-dimensional results can be cast in that scaling form.
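Starting from that ansatz we can, using some series summation tricks presented in the exercises, calculate $P$, $S$, and $M_0$ similar to what we did in one dimension. We obtain:
$$\beta = \frac{\tau - 2}{\sigma} \tag{6.25}$$
$$\gamma = \frac{3 - \tau}{\sigma} \tag{6.26}$$
$$2 - \alpha = \frac{\tau - 1}{\sigma} \tag{6.27}$$
from which we can derive the scaling law:
$$2 - \alpha = 2\beta + \gamma \tag{6.28}$$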
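As a quick consistency check (a worked example, not part of the original derivation), the Bethe lattice results can be inserted into these relations. Equation (6.15) gives $\tau = 5/2$. Writing the exponential decay of equation (6.12) in the scaling form (6.24) suggests $\sigma = 1/2$, since $cs \propto [(p-p_c)s^{1/2}]^2$; this value of $\sigma$ is inferred here and should be read as an assumption of the sketch.

```latex
\begin{align*}
\beta    &= \frac{\tau-2}{\sigma} = \frac{1/2}{1/2} = 1, \\
\gamma   &= \frac{3-\tau}{\sigma} = \frac{1/2}{1/2} = 1, \\
2-\alpha &= \frac{\tau-1}{\sigma} = \frac{3/2}{1/2} = 3 = 2\beta+\gamma .
\end{align*}
```

These values agree with the linear vanishing of $P$ in equation (6.9) and with the $(p_c - p)^{-1}$ divergence of $S$ in equation (6.10).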
6.4.2 Fractals
For the further discussion we need to introduce the concept of fractal dimensions, which should be familiar to all of you. As you can see by looking at pictures or playing with the applet on the web page, the percolating cluster at criticality is a complex object and self-similar on all length scales.
This self-similarity follows naturally from the divergence of the dominant length scale at the critical point and is reflected in the power-law behavior of all properties at the critical point. Power laws are the only scale-invariant functions $f(r/l) \propto f(r)$, as $r^{\zeta} \propto (r/l)^{\zeta}$.
Thus self-similarity and fractal behavior are intimately related to the scaling ansatz. Whether you use the scaling ansatz to motivate fractal behavior, or use apparent fractal behavior to motivate the scaling ansatz, is a matter of taste.
Self-similar objects like the percolating cluster at criticality are called fractals, since their dimension $D$, defined by the relationship of volume to linear dimension
$$V(R) \propto R^D, \tag{6.29}$$
is a non-integral fraction. This is in contrast to simple objects like lines, squares or cubes, which have integer dimension.
Applying these ideas to clusters we make the ansatz
$$R_s \propto s^{1/D} \tag{6.30}$$
for the average radius of a cluster of $s$ sites. This allows us to evaluate equation (6.18) as:
$$\xi^2 = \frac{2\sum_s R_s^2 s^2 n_s}{\sum_s s^2 n_s}, \tag{6.31}$$
since a) $R_s$ is the mean distance between two sites in a cluster, b) a site is connected to $s$ other sites, and c) $n_s s$ is the probability of the site belonging to an $s$-site cluster. The summation can again be performed analogously to the ones used to obtain Eqs. (6.25)-(6.27). Using these equations to replace $\sigma$ and $\tau$ by the other exponents we find:
$$D = (\beta + \gamma)/\nu = d - \beta/\nu, \tag{6.32}$$
where the last equality will be derived later using finite size scaling.
By integrating equation (6.21) one can also obtain the fractal dimension, leading to another scaling law:
$$d - 2 + \eta = 2\beta/\nu \tag{6.33}$$
6.4.3 Hyperscaling and upper critical dimension
Finally, we wish to mention the "hyperscaling" law:
$$d\nu = \gamma + 2\beta = 2 - \alpha \tag{6.34}$$
which is obtained by combining several of the previously derived scaling relations. The scaling laws involving dimensions are usually called "hyperscaling" laws. While the other scaling laws hold for all dimensions, this one will eventually break down for large $d$, as it is clearly not valid for the Bethe lattice. It holds up to the "upper critical dimension" $d_u = 6$ in the case of percolation. For $d \ge d_u$ the exponents are always the same as those of the infinite-dimensional Bethe lattice, but hyperscaling breaks down for $d > d_u = 6$. As the most interesting physical problems are in dimensions below $d_u$, we will not discuss this issue further but instead concentrate on methods to calculate properties of percolation models below $d_u$.
6.5 Renormalization group
The renormalization group method is intricately linked to the self-similarity of the percolating cluster and to the scaling ansatz. It also provides a motivation for the scaling ansatz.
The idea is to ignore unimportant microscopic details and to concentrate on the important physics on large scales. We do that by replacing a $b \times b$ square of our square lattice by a single square. We choose it to be filled if a percolating cluster exists on the $b \times b$ square and empty otherwise. This process is iterated until we are left with just one single square which is either filled or empty.
6.5.1 The square lattice
As the simplest example let us choose $b = 2$. On a $2 \times 2$ square there are two vertically spanning clusters with two occupied sites, four spanning clusters with three occupied sites and one spanning cluster with four occupied sites. The total probability $R(p)$ for a vertically spanning cluster on a $2 \times 2$ square is thus
$$R(p) = 2p^2(1-p)^2 + 4p^3(1-p) + p^4 \tag{6.35}$$
The renormalization group transformation
$$p \leftarrow R(p) = 2p^2(1-p)^2 + 4p^3(1-p) + p^4 \tag{6.36}$$
has two trivial fixed points $p = 0$ and $p = 1$ as well as one nontrivial fixed point $p_2 = (\sqrt{5} - 1)/2 \approx 0.6180$. This is surprisingly close to the correct result $p_c = 0.5927$.
To obtain better accuracy one needs to work with larger cell sizes $b$. In the limit $b \to \infty$ we have $p_b \to p_c$ and the renormalization group calculation becomes exact.
It is important to note that the choice of percolation criterion is ambiguous. Here we have chosen to define a percolating cluster as one that spans the lattice vertically. We can as well choose that the lattice should be spanned horizontally, obtaining identical results. The other choices, while giving the same results in the limit $b \to \infty$, have larger finite size corrections. If we define a cluster as percolating if it spans horizontally or vertically, the renormalization group transformation is
$$R(p) = 4p^2(1-p)^2 + 4p^3(1-p) + p^4 \tag{6.37}$$
with a fixed point at $(3 - \sqrt{5})/2 \approx 0.382$. [...]
$$p' = R(p) \tag{6.39}$$
$$\xi' = \xi/b \tag{6.40}$$
At the same time the scaling law (6.20) is valid and we obtain:
$$(p' - p_b)^{-\nu} = \frac{\xi}{b} = \frac{1}{b}(p - p_b)^{-\nu} \tag{6.41}$$
By expanding $R(p)$ in a Taylor series around $p_b$ we obtain after a few algebraic transformations:
$$\nu = \frac{\log b}{\log \left.\frac{dR}{dp}\right|_{p_b}}. \tag{6.42}$$
For $b = 2$ this gives the very crude estimate $\nu \approx 1.635$, compared to the exact value $\nu = 4/3$.
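The numbers quoted for the $2 \times 2$ cell can be reproduced in a few lines. The sketch below finds the nontrivial fixed point of a transformation $R(p)$ by bisection and evaluates equation (6.42) with a numerical derivative. The function names, the bisection bracket $[0.1, 0.9]$ and the finite-difference step are illustrative choices, not part of the original notes; the same helper is also applied to the three-site triangular cell with $b = \sqrt{3}$.

```python
import math


def R_square(p):
    """Vertical spanning probability of a 2 x 2 cell, eq. (6.36)."""
    return 2 * p**2 * (1 - p)**2 + 4 * p**3 * (1 - p) + p**4


def R_triangular(p):
    """Three-site cell on the triangular lattice: at least two sites occupied."""
    return 3 * p**2 * (1 - p) + p**3


def nontrivial_fixed_point(R, lo=0.1, hi=0.9, tol=1e-12):
    """Bisection on R(p) - p, assumed negative at lo and positive at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if R(mid) - mid < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nu_estimate(R, b, h=1e-6):
    """Exponent nu from eq. (6.42), using a central-difference derivative."""
    p_star = nontrivial_fixed_point(R)
    slope = (R(p_star + h) - R(p_star - h)) / (2 * h)
    return math.log(b) / math.log(slope)


if __name__ == "__main__":
    print("square:    ", nontrivial_fixed_point(R_square), nu_estimate(R_square, 2))
    print("triangular:", nontrivial_fixed_point(R_triangular),
          nu_estimate(R_triangular, math.sqrt(3)))
```

Bisection is used instead of iterating $p \leftarrow R(p)$ because the nontrivial fixed point is unstable under the iteration: any starting value flows to 0 or 1.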
6.5.2 The triangular lattice
A better estimate is obtained on the triangular lattice. There we replace three sites by one (thus $b^2 = 3$), obtaining the transformation
$$p \leftarrow R(p) = 3p^2(1-p) + p^3 \tag{6.43}$$
with fixed points at 0, 1/2 and 1. In this case, surprisingly, the value $p_{\sqrt{3}} = p_c$ is exact.
The estimate for $\nu$ is also much better: $\nu \approx 1.355$. It will be necessary to go to larger values of $b$ if we want to improve the accuracy of our results. Up to $b = 6$ we might be able to determine the RG equations exactly. For larger $b$ we have to use numerical methods, which will be the topic of the next section.
6.6 Monte Carlo simulation
Monte Carlo simulations are the simplest method to investigate percolation and can be done, for example, by visual inspection using our applet or by a more extended simulation. We will focus on three types of questions that can be answered by Monte Carlo simulations:
1. What is the $p$-dependence of an arbitrary quantity $X$?
2. What is the critical probability $p_c$?
3. What are the values of the universal critical exponents?
$X$ can be a quantity like $\xi$, $S$, $P$ or any other interesting observable.
For now we treat only finite lattices of linear dimension $L < \infty$. The problem of extrapolation to $L \to \infty$ will be discussed later.
The expectation value of the quantity $X$ can be calculated exactly for small lattices by a sum over all possible configurations on a system with $N = L^d$ sites:
$$\langle X \rangle = \sum_{n=0}^{N} \sum_{c \in \mathcal{C}_{N,n}} p^n (1-p)^{N-n} X(c) \tag{6.44}$$
where $\mathcal{C}_{N,n}$ is the set of configurations of $n$ occupied sites in a lattice with $N$ sites, and $X(c)$ is the value of the quantity $X$ measured in the configuration $c$. It is obvious that this sum can be performed exactly only for small lattice sizes up to $N \approx 30$ sites, as the number of terms increases exponentially like $2^N$.
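For very small lattices the sum (6.44) can be carried out literally. The sketch below enumerates all $2^N$ configurations of an $L \times L$ square lattice and averages an arbitrary observable; as an example observable it uses a vertical spanning test, so that for $L = 2$ the result can be compared with $R(p) = 2p^2 - p^4$ from equation (6.36). The function names and the nearest-neighbor flood fill are illustrative assumptions.

```python
from itertools import product


def spans_vertically(occ, L):
    """True if occupied sites connect the top row to the bottom row."""
    stack = [(0, x) for x in range(L) if occ[0][x]]
    seen = set(stack)
    while stack:
        y, x = stack.pop()
        if y == L - 1:
            return True
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if (0 <= ny < L and 0 <= nx < L and occ[ny][nx]
                    and (ny, nx) not in seen):
                seen.add((ny, nx))
                stack.append((ny, nx))
    return False


def exact_average(L, p, observable):
    """Eq. (6.44): sum over all 2**(L*L) configurations with weight p^n (1-p)^(N-n)."""
    N = L * L
    total = 0.0
    for bits in product((0, 1), repeat=N):
        n = sum(bits)
        occ = [bits[row * L:(row + 1) * L] for row in range(L)]
        total += p**n * (1 - p)**(N - n) * observable(occ, L)
    return total


if __name__ == "__main__":
    p = 0.3
    print(exact_average(2, p, spans_vertically), 2 * p**2 - p**4)
```

The cost grows like $2^N$, so this is practical only up to roughly $L = 4$ or $5$ in two dimensions.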
6.6.1 Monte Carlo estimates
Larger lattice sizes with $N$ up to $10^6$ can no longer be done exactly, but Monte Carlo summation can provide estimates of this average to any desired accuracy. The average over the complete set of configurations is replaced by a random sample of $M \approx 10^6 \ldots 10^{12}$ configurations $c_i$ drawn randomly with the correct probabilities $p^n(1-p)^{N-n}$ for a configuration with $n$ occupied sites.
To create configurations with the correct probabilities we draw a pseudo-random number $u$, uniformly distributed in $[0, 1[$, for each site $j$. We set the site occupied if $u < p$ and empty otherwise. This corresponds to importance sampling, where each configuration is created with the correct probability.
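The site-by-site sampling rule can be written down directly. This is a minimal sketch for a two-dimensional $L \times L$ lattice; the function name and the list-of-lists representation are illustrative assumptions. Python's `random.random()` returns a float in $[0, 1)$, matching the interval used in the text.

```python
import random


def random_configuration(L, p, rng=random):
    """Return an L x L grid of booleans; each site is occupied with probability p."""
    return [[rng.random() < p for _ in range(L)] for _ in range(L)]


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed only to make the run reproducible
    config = random_configuration(8, 0.6, rng)
    for row in config:
        print("".join("#" if site else "." for site in row))
```

Passing an explicit `random.Random` instance keeps independent simulations from sharing one global generator state.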
6.6.2 Cluster labeling
Next we identify clusters using, for example, the Hoshen-Kopelman cluster labeling algorithm, which works the following way:
• Allocate an array of size $N$ to store the cluster label for each site.
• Loop through all sites $i$ in the lattice. For occupied sites check if a cluster label has already been assigned to any occupied neighboring sites.
  - If no neighboring site is occupied or has a label assigned, assign a new cluster label to this site.
  - If one neighboring site is occupied and has a cluster label assigned, assign the same label to this site.
  - If more than one neighboring site is occupied and all have the same label assigned, assign this same label to the current site.
  - If more than one neighboring site is occupied and the sites have different labels assigned, we have a problem. The current site connects two cluster parts which until now were disconnected and labeled as different clusters. We take the smallest of the label numbers as the proper label and assign this to the current site. For the other, larger, labels we would need to relabel all wrongly labeled sites.
• We use the following trick to avoid relabeling all wrongly labeled sites. For each label we keep a list of the "proper" labels, initialized originally to the label itself. To obtain the cluster label of a site we first obtain the label number assigned to the site. Next we check its proper label. If the two agree we are done. Otherwise we replace the label by the stored proper label and repeat this process until the label agrees with its proper label. This way we avoid relabeling all wrongly labeled sites.
Figure 6.3: Probability $P$ for a site to be on the percolating cluster, as a function of concentration $p$ for different lattice sizes $L$ = 4, 8, 12, 16, 24, 32, 48 and 64.
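The labeling steps above can be sketched as follows for a two-dimensional lattice with open boundaries. The function name, the use of label 0 for empty sites and the final pass that replaces every label by its proper label are illustrative choices of this sketch. Because sites are visited row by row, only the upper and left neighbors can already carry a label.

```python
def hoshen_kopelman(occ):
    """Label clusters of an occupancy grid (list of rows); returns a grid of labels.

    Empty sites get label 0. proper[k] stores the proper label of label k,
    initialized to k itself when the label is created.
    """
    rows, cols = len(occ), len(occ[0])
    labels = [[0] * cols for _ in range(rows)]
    proper = [0]

    def find(k):
        # Follow the chain of proper labels until a label agrees with itself.
        while proper[k] != k:
            k = proper[k]
        return k

    for y in range(rows):
        for x in range(cols):
            if not occ[y][x]:
                continue
            up = find(labels[y - 1][x]) if y > 0 and occ[y - 1][x] else 0
            left = find(labels[y][x - 1]) if x > 0 and occ[y][x - 1] else 0
            if up == 0 and left == 0:
                proper.append(len(proper))        # new label, proper to itself
                labels[y][x] = len(proper) - 1
            elif up and left:
                small, large = min(up, left), max(up, left)
                proper[large] = small             # merge: smaller label is proper
                labels[y][x] = small
            else:
                labels[y][x] = up or left
    # Final pass: report the proper label of every occupied site.
    return [[find(k) if k else 0 for k in row] for row in labels]


if __name__ == "__main__":
    grid = [[1, 0, 1],
            [1, 1, 1],
            [0, 0, 0],
            [1, 0, 0]]
    for row in hoshen_kopelman(grid):
        print(row)
```

In the example the two parts in the first row are initially labeled 1 and 2 and are merged when the site at the end of the second row connects them.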
Having thus created a configuration and identified the existing clusters we can measure the value of any quantity of interest. Care must be taken that in the averages over clusters for $S$ and other quantities any existing percolating cluster is always excluded, since its contribution is infinite in the thermodynamic limit $L \to \infty$.
6.6.3 Finite size effects
In figure 6.3 we show Monte Carlo results obtained for $P$ in a series of Monte Carlo simulations. We plot $P$ as a function of $p$ for several different system sizes $L$. It can be seen that for $p \ll p_c$ $P$ converges rapidly to zero as $L$ is increased. For $p \gg p_c$, on the other hand, it converges rapidly to a finite value, as expected since $P$ is the probability of a site belonging to the percolating cluster, which exists for $p > p_c$. We can also see that as $L$ is increased a singularity starts to develop at $p_c$ and it is plausible that $P$ follows a power law $P \propto (p - p_c)^{\beta}$ in the infinite system, while at $p = p_c$ it vanishes with system size as
$$P \propto L^{-\beta/\nu}. \tag{6.45}$$
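More formally we can make the following scaling ansatz. Close to criticality the only important length scale is $\xi$. The effects of finite system sizes are thus determined only by the ratio $L/\xi$. For a quantity $X$, which diverges as $(p - p_c)^{-\chi}$ for $L \gg \xi$, we make the ansatz:
$$X(L, p) = (p - p_c)^{-\chi} \mathcal{F}_1(L/\xi) = (p - p_c)^{-\chi} \tilde{\mathcal{F}}_1((p - p_c)L^{1/\nu}) \tag{6.46}$$
or equivalently
$$X(L, \xi) = \xi^{\chi/\nu} \mathcal{F}_2(L/\xi) \propto \begin{cases} \xi^{\chi/\nu} & \text{for } L \gg \xi \\ L^{\chi/\nu} & \text{for } L \ll \xi \end{cases} \tag{6.47}$$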
Applying this scaling ansatz to the size of the percolating cluster $L^d P$ we immediately derive the second expression for the fractal dimension $D$ in equation (6.32):
$$L^D = L^d P(L, \xi) = L^d \xi^{-\beta/\nu} f(L/\xi) \propto L^{d - \beta/\nu} \tag{6.48}$$
where we have chosen $L = \text{const} \times \xi$, allowing us to replace $\xi$ by $L$, and giving a constant value for the scaling function $f(L/\xi)$.
Thus we see that finite size scaling allows us to determine the ratio of exponents like $\beta/\nu$ by calculating the $L$-dependence of $P$ at $p = p_c$.
One problem however still remains: how can we determine $p_c$ on a finite lattice? The answer is again finite size scaling. Consider the probability $\Pi(p)$ for the existence of a percolating cluster. In the infinite system it is
$$\Pi(p) = \Theta(p - p_c). \tag{6.49}$$
In a finite system the step function is smeared out. We can make the usual finite size scaling ansatz using equation (6.46):
$$\Pi(p, L) = \Phi((p - p_c)L^{1/\nu}). \tag{6.50}$$
The derivative $d\Pi/dp$ gives the probability for first finding a percolating cluster at concentrations in the interval $[p, p + dp[$. The probability $p$ at which a percolating cluster appears can easily be measured in a Monte Carlo simulation. Its average
$$p_{\rm av} = \int p\, \frac{d\Pi}{dp}\, dp \tag{6.51}$$
is slightly different from the exact value $p_c$ on any finite lattice, but converges like
$$p_{\rm av} - p_c \propto L^{-1/\nu} \tag{6.52}$$
as can be seen by integrating the scaling ansatz (6.50). Similarly the variance
$$\Delta^2 = \int (p - p_{\rm av})^2\, \frac{d\Pi}{dp}\, dp \tag{6.53}$$
decreases as
$$\Delta \propto L^{-1/\nu} \tag{6.54}$$
Thus we can obtain $\nu$ and $p_c$ from the finite size scaling of the average $p$ at which a percolating cluster appears on a finite lattice.
We do this by drawing a pseudo-random number $u_j$, uniformly distributed in $[0, 1[$, for each site $j$. Next we use a binary search to determine the probability $p$ where a percolating cluster appears for this realization of the $u_j$s. We start by checking $p = 0.5$. We occupy all squares with $u_j < p$. If a percolating cluster exists we set $p = 0.25$, otherwise $p = 0.75$. Next we check if there is a percolating cluster for the same $u_j$s and the new $p$. By repeating this bisection we improve the accuracy by a factor of two in each step and obtain the desired value in a logarithmic number of steps $-\log \epsilon / \log 2$, where $\epsilon$ is the desired accuracy. This gives us one sample for $p$. By repeating the whole procedure with new sets of $u_j$s we can obtain good estimates for $p_{\rm av}$ and $\Delta$.
Finally we fit the results obtained for several lattice sizes $L$ to obtain $p_c$ and $\nu$ as fitting parameters.
Once we have a good estimate for $p_c$ we can also use equation (6.45) to get estimates for further exponents. By measuring $P(p_c)$ as a function of lattice size $L$ and fitting to equation (6.45) we estimate the ratio of exponents $\beta/\nu$. Similarly, by fitting $S(p_c)$ as a function of $L$ we estimate $\gamma/\nu$. The scaling laws which relate the exponents can be used as a check on our simulations and as a tool to increase the accuracy of our estimates.
6.7 Monte Carlo renormalization group
Higher precision than in simple Monte Carlo simulations can be obtained by using Monte Carlo estimates to perform renormalization group calculations for huge block sizes $b$, often $b \approx 500$.
To determine the renormalization group transformation $R(p)$ in a Monte Carlo simulation we write it as:
$$R(p) = \sum_{n=0}^{N} \binom{N}{n} p^n (1-p)^{N-n} P(n), \tag{6.55}$$
where $N = b^2$ is the number of sites in the block and $P(n)$ is the probability that a percolating cluster exists if $n$ sites are occupied. The easiest way to calculate $P(n)$ is to start from an empty lattice and to add occupied sites until at a certain number of $s$ sites a percolating cluster is formed. Then we set $P(n) = 1$ for $n \ge s$ and $P(n) = 0$ for $n < s$. Averaging over many configurations provides us with a good Monte Carlo estimate for $P(n)$. The critical point $p_b$ is found by looking for the fixed point with $R(p_b) = p_b$. The exponents, like $\nu$, are obtained from the derivatives of $R(p)$ using equation (6.42), as we did in the analytic RG calculation.
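A minimal sketch of this procedure is given below. Sites are added in random order and a union-find structure with two virtual nodes (one attached to the top row, one to the bottom row) detects the first vertically spanning cluster. The union-find bookkeeping, the vertical spanning criterion and all function names are assumptions of this sketch. `math.comb` requires Python 3.8 or later.

```python
import math
import random


def spanning_count(b, rng):
    """Add sites of a b x b block in random order; return s at first vertical spanning."""
    N = b * b
    parent = list(range(N + 2))   # indices N and N+1 are virtual top/bottom nodes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    occupied = [False] * N
    order = list(range(N))
    rng.shuffle(order)
    for s, site in enumerate(order, start=1):
        occupied[site] = True
        y, x = divmod(site, b)
        if y == 0:
            union(site, N)
        if y == b - 1:
            union(site, N + 1)
        for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
            if 0 <= ny < b and 0 <= nx < b and occupied[ny * b + nx]:
                union(site, ny * b + nx)
        if find(N) == find(N + 1):
            return s
    return N   # not reached: the full block always spans


def estimate_P(b, samples, rng):
    """Monte Carlo estimate of P(n) for n = 0 .. b*b."""
    N = b * b
    counts = [0] * (N + 1)
    for _ in range(samples):
        s = spanning_count(b, rng)
        for n in range(s, N + 1):
            counts[n] += 1
    return [c / samples for c in counts]


def R(p, P):
    """Eq. (6.55) evaluated with a table P[n]."""
    N = len(P) - 1
    return sum(math.comb(N, n) * p**n * (1 - p)**(N - n) * P[n]
               for n in range(N + 1))


if __name__ == "__main__":
    P = estimate_P(8, 2000, random.Random(7))
    for p in (0.5, 0.6, 0.7):
        print(p, R(p, P))
```

For very large blocks the binomial weights should be evaluated in logarithmic form to avoid overflow and underflow; that refinement is omitted here.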
The corrections due to finite block sizes $b$ can be extrapolated using the following equations, again determined by finite size scaling:
$$\nu(b) - \nu \propto 1/\log b, \tag{6.56}$$
$$p_b - p_c \propto b^{-1/\nu} \tag{6.57}$$
Using $b \approx 500$ one can obtain estimates for the exponents that are accurate to four digits!
6.8 Series expansion
Series expansion is a very different method, trying to make maximum use of analytically known results for small systems to obtain high precision extrapolations to large systems. The basis is an exact calculation of the probability $n_s(p)$ of small clusters. For example, on a square lattice we have
$$n_1(p) = p(1-p)^4 \tag{6.58}$$
$$n_2(p) = 2p^2(1-p)^6 \tag{6.59}$$
$$n_3(p) = 2p^3(1-p)^8 + 4p^3(1-p)^7 \tag{6.60}$$
$$\vdots \tag{6.61}$$
In the calculation of these probabilities we needed to enumerate the different geometries of $s$-site clusters and calculate the number of times they can be embedded into a square lattice. Once we have calculated the cluster numbers $n_s$ for clusters up to size $s$ we can calculate the first $s$ terms of a Taylor expansion in $p$ for any quantity of interest. This Taylor series can be used as the basis of several analysis procedures, of which we mention only two often used approaches.
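The enumeration behind equations (6.58)-(6.60) can be automated for small $s$. The sketch below grows all distinct clusters on the square lattice site by site, identifies clusters that differ only by a translation, and counts the empty neighbor sites $t$ of each, so that $n_s(p) = \sum_t g_{s,t}\, p^s (1-p)^t$. The brute-force growth strategy and the function names are illustrative; dedicated enumeration algorithms are needed for large $s$.

```python
from collections import Counter

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def normalize(cells):
    """Translate a cluster so that its minimal x and y coordinates are zero."""
    min_x = min(x for x, y in cells)
    min_y = min(y for x, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)


def clusters_of_size(s):
    """All distinct s-site clusters on the square lattice, up to translation."""
    current = {frozenset({(0, 0)})}
    for _ in range(s - 1):
        grown = set()
        for cluster in current:
            for x, y in cluster:
                for dx, dy in NEIGHBORS:
                    cell = (x + dx, y + dy)
                    if cell not in cluster:
                        grown.add(normalize(cluster | {cell}))
        current = grown
    return current


def perimeter(cluster):
    """Number of empty sites adjacent to the cluster."""
    return len({(x + dx, y + dy) for x, y in cluster for dx, dy in NEIGHBORS}
               - cluster)


def cluster_number_terms(s):
    """Counter {t: g} such that n_s(p) = sum over t of g * p**s * (1 - p)**t."""
    return Counter(perimeter(c) for c in clusters_of_size(s))


if __name__ == "__main__":
    for s in range(1, 5):
        print(s, dict(cluster_number_terms(s)))
```

For $s = 3$ the result contains two straight clusters with eight empty neighbors and four bent clusters with seven, in agreement with equation (6.60).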
The ratio method determines critical points and exponents from ratios of successive expansion coefficients. Look, for example, at the Taylor series expansion for a function like $\xi \propto (p_c - p)^{-\nu}$:
$$(p_c - p)^{-\nu} = \sum_{i=0}^{\infty} a_i p^i = p_c^{-\nu}\left[1 + \frac{\nu}{p_c}\, p + \frac{\nu(\nu+1)}{2p_c^2}\, p^2 + \ldots\right] \tag{6.62}$$
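Table 6.1: Critical exponents for percolation in any dimension

functional form                      exponent   d = 1   d = 2   d = 3   d = 4   d = 5   d >= 6
$M_0 \propto |p - p_c|^{2-\alpha}$   $\alpha$   1       -2/3    -0.62   -0.72   -0.86   -1
$n_s(p = p_c) \propto s^{-\tau}$     $\tau$     [...]

[...]
$$\frac{d}{dp}\log\xi = -\frac{\nu}{p - p_c} \tag{6.64}$$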
The logarithmic derivative of $\xi$ with respect to $p$ will thus have a pole at $p_c$ with residuum $\nu$, reflecting the singularity at $p_c$. To estimate the pole and residuum we use Padé approximants to represent the series for $\xi$ as a rational function:
$$\sum_{i=0}^{N} a_i p^i = \frac{\sum_{j=0}^{L} b_j p^j}{\sum_{k=0}^{M} c_k p^k} \tag{6.65}$$
with $L + M = N$. The coefficients are determined by matching the first $N$ terms in the Taylor series of the left and right hand sides.
The zeroes of the polynomial in the denominator give the poles, one of which will be a good estimate for $p_c$. By using different values of $L$ and $M$ the accuracy of the resulting estimates can be checked. This was the method of choice before large computers became available and is still in use today, when computers are used to obtain $n_s$ symbolically for large values of $s$.
6.9 Listing of the universal exponents
In table 6.1 we give an overview of the various exponents calculated for dimensions $d < d_u = 6$ as well as the Bethe lattice results, valid for any $d \ge d_u$.
We wish to point out that besides the exponents there exist further universal quantities, like the universal amplitude ratios
$$\Gamma = \frac{S(p_c - \epsilon)}{S(p_c + \epsilon)} \tag{6.66}$$
Finally we wish to repeat that these exponents are universal in the sense that they depend only on dimensionality, but not on the specific percolation model used. Thus all the examples mentioned in the introduction to this chapter (forest fires, oil reservoirs, gelation, spreading of diseases, etc.) share the same exponents as long as the dimensionality is the same.
Thus, although we have considered only toy models, we get widely applicable results. This is a common thread that we will encounter again. While performing numerical simulations for specific models is nice, being able to extract universally valid results from specific model simulations is the high art of computer simulations.
Chapter 7
Magnetic systems
In this chapter we move away from the static problem of percolation to the more
dynamic problem of evaluating thermodynamic averages through phase space integrals.
In this context we will encounter many of the same problems as in percolation.
7.1 The Ising model
The Ising model is the simplest model for a magnetic system and a prototype statistical system. We will use it for our discussion of thermodynamic phase transitions. It consists of an array of classical spins $\sigma_i = \pm 1$ that can point either up ($\sigma_i = +1$) or down ($\sigma_i = -1$). The Hamiltonian is
$$H = -J \sum_{\langle i,j \rangle} \sigma_i \sigma_j, \tag{7.1}$$
where the sum goes over nearest neighbor spin pairs.
Two parallel spins contribute an energy of $-J$ while two antiparallel ones contribute $+J$. In the ferromagnetic case ($J > 0$) the state of lowest energy is the fully polarized state where all spins are aligned, either pointing up or down.
At finite temperatures the spins start to fluctuate and states of higher energy also contribute to thermal averages. The average magnetization thus decreases from its full value at zero temperature. At a critical temperature $T_c$ there is a second order phase transition to a disordered phase, similar to the one we discussed in the percolation problem.
The Ising model is the simplest magnetic model exhibiting such a phase transition and is often used as a prototype model for magnetism. To discuss the phase transition the scaling hypothesis introduced in the context of percolation will again be used.
As is known from the statistical mechanics course, the thermal average of a quantity $A$ at a finite temperature $T$ is given by a sum over all states:
$$\langle A \rangle = \frac{1}{Z} \sum_i A_i \exp(-\beta E_i), \tag{7.2}$$
where $\beta = 1/k_B T$ is the inverse temperature, $A_i$ is the value of the quantity $A$ in the configuration $i$, and $E_i$ is the energy of that configuration.
The partition function ("Zustandssumme")
$$Z = \sum_i \exp(-\beta E_i) \tag{7.3}$$
normalizes the probabilities $p_i = \exp(-\beta E_i)/Z$.
For small systems it is possible to evaluate these sums exactly. As the number of states grows like $2^N$, a straightforward summation is possible only for very small $N$. For large higher-dimensional systems Monte Carlo summation/integration is the method of choice.
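For a handful of spins the sums (7.2) and (7.3) can be evaluated by brute force, which gives a useful reference for testing Monte Carlo code. The sketch below does this for an $L \times L$ lattice with periodic boundaries and $k_B = 1$; the function names and the flat tuple representation of a configuration are assumptions of this sketch. Note that for $L = 2$ periodic boundaries count every bond twice, so $L = 3$ is the smallest unambiguous case.

```python
import math
from itertools import product


def energy(spins, L, J=1.0):
    """H = -J * sum over nearest-neighbor pairs, periodic L x L lattice."""
    E = 0.0
    for y in range(L):
        for x in range(L):
            s = spins[y * L + x]
            right = spins[y * L + (x + 1) % L]
            down = spins[((y + 1) % L) * L + x]
            E -= J * s * (right + down)   # each bond is counted exactly once
    return E


def exact_thermal_average(L, beta, observable):
    """Eq. (7.2): Boltzmann-weighted average over all 2**(L*L) configurations."""
    Z = 0.0
    total = 0.0
    for spins in product((-1, 1), repeat=L * L):
        weight = math.exp(-beta * energy(spins, L))
        Z += weight
        total += weight * observable(spins)
    return total / Z


if __name__ == "__main__":
    L = 3
    for beta in (0.2, 0.4, 0.6):
        abs_m = exact_thermal_average(L, beta, lambda s: abs(sum(s)) / (L * L))
        print(beta, abs_m)
```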
7.2 The single spin flip Metropolis algorithm
As was discussed in connection with integration it is usually not efficient to estimate the average (7.2) using simple sampling. The optimal method is importance sampling, where the states $i$ are not chosen uniformly but with the correct probability $p_i$, which we can again do using the Metropolis algorithm.
The simplest Monte Carlo algorithm for the Ising model is the single spin flip Metropolis algorithm, which defines a Markov chain through phase space.
• Starting with a configuration $c_i$ propose to flip a single spin, leading to a new configuration $c'$.
• Calculate the energy difference $\Delta E = E[c'] - E[c_i]$ between the configurations $c'$ and $c_i$.
• If $\Delta E < 0$ the next configuration is $c_{i+1} = c'$.
• If $\Delta E > 0$ then $c_{i+1} = c'$ if $r < \exp(-\beta \Delta E)$, where $r$ is a random number drawn uniformly from $[0, 1[$; otherwise $c_{i+1} = c_i$.
• Measure all the quantities of interest in the new configuration.
This algorithm is ergodic since any configuration can be reached from any other in a finite number of spin flips. It also fulfills the detailed balance condition.
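The steps above can be sketched as follows for a two-dimensional $L \times L$ lattice with periodic boundaries and $k_B = 1$. Flipping the spin $\sigma$ at a site whose four neighbors sum to $n$ changes the energy by $\Delta E = 2J\sigma n$. The function name, the random choice of the site and the definition of a sweep as $L^2$ attempts are assumptions of this sketch.

```python
import math
import random


def metropolis_sweep(spins, L, beta, rng, J=1.0):
    """Perform L*L single spin flip attempts in place on a periodic L x L lattice."""
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        neighbors = (spins[y][(x + 1) % L] + spins[y][(x - 1) % L]
                     + spins[(y + 1) % L][x] + spins[(y - 1) % L][x])
        delta_E = 2.0 * J * spins[y][x] * neighbors
        # Accept if the energy does not increase, else with probability exp(-beta*dE).
        if delta_E <= 0 or rng.random() < math.exp(-beta * delta_E):
            spins[y][x] = -spins[y][x]


if __name__ == "__main__":
    L, beta = 16, 0.5
    rng = random.Random(2024)
    spins = [[1] * L for _ in range(L)]
    for sweep in range(200):
        metropolis_sweep(spins, L, beta, rng)
    m = sum(map(sum, spins)) / (L * L)
    print("magnetization per spin after 200 sweeps:", m)
```

Measurements would be taken after each sweep, once an initial equilibration period has been discarded.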
7.3 Systematic errors: boundary and finite size effects
In addition to statistical errors due to the Monte Carlo sampling, our simulations suffer from systematic errors due to boundary effects and the finite size of the system.
In contrast to the percolation problems, boundary effects can be avoided completely by using periodic boundary conditions. The lattice is continued periodically, forming a torus. The left neighbor of the leftmost spin is just the rightmost boundary spin, etc.
Although we can thus avoid boundary effects, finite size effects remain since now all correlations are periodic with the linear system size as period. In the context of the percolation problem we have already learned how to deal with finite size effects:
• Away from phase transitions the correlation length $\xi$ is finite and finite size effects are negligible if the linear system size $L \gg \xi$. Usually $L > 6\xi$ is sufficient, but this should be checked for each simulation.
• In the vicinity of continuous phase transitions we encounter the same problem as in percolation: the correlation length $\xi$ diverges. Again finite size scaling comes to the rescue and we can obtain the critical behavior as discussed in the chapter on percolation.
7.4 Critical behavior of the Ising model
Close to the phase transition at $T_c$ scaling laws again characterize the behavior of all physical quantities. The average magnetization scales as
$$m(T) = \langle |M|/V \rangle \propto (T_c - T)^{\beta}, \tag{7.4}$$
where $M$ is the total magnetization and $V$ the system volume (number of spins).
The magnetic susceptibility $\chi = \left.\frac{dm}{dh}\right|_{h=0}$ can be calculated from magnetization fluctuations and diverges with the exponent $\gamma$:
$$\chi(T) = \frac{\langle M^2/V \rangle - \langle |M| \rangle^2/V}{T} \propto |T_c - T|^{-\gamma}. \tag{7.5}$$
The correlation length $\xi$ is defined by the asymptotically exponential decay of the two-spin correlations:
$$\langle \sigma_0 \sigma_r \rangle - \langle |m| \rangle^2 \propto \exp(-r/\xi). \tag{7.6}$$
It is best calculated from the structure factor $S(q)$, defined as the Fourier transform of the correlation function. For small $q$ the structure factor has a Lorentzian shape:
$$S(q) = \frac{1}{1 + q^2\xi^2} + O(q^4). \tag{7.7}$$
The correlation length diverges as
$$\xi(T) \propto |T - T_c|^{-\nu}. \tag{7.8}$$
At the critical point the correlation function again follows the same power law as in the percolation problem:
$$\langle \sigma_0 \sigma_r \rangle \propto r^{-(d-2+\eta)} \tag{7.9}$$
where $\eta = 2\beta/\nu - d + 2$, derived from the same scaling laws as in percolation.
The specific heat $C(T)$ diverges logarithmically in two dimensions:
$$C(T) \propto \ln |T - T_c| \propto |T - T_c|^{-\alpha} \tag{7.10}$$
and the critical exponent $\alpha = 0$.
Like in percolation, finite size scaling is the method of choice for the determination of these exponents.
Table 7.1: Critical exponents for the Ising model in two and three dimensions.

quantity                 functional form                       exponent   Ising d = 2                    Ising d = 3
magnetization            $m \propto (T_c - T)^{\beta}$         $\beta$    1/8                            0.3258(44)
susceptibility           $\chi \propto |T - T_c|^{-\gamma}$    $\gamma$   7/4                            1.2390(25)
correlation length       $\xi \propto |T - T_c|^{-\nu}$        $\nu$      1                              0.6294(2)
specific heat            $C(T) \propto |T - T_c|^{-\alpha}$    $\alpha$   0
inverse critical temp.   $1/T_c$                                          $\frac{1}{2}\ln(1+\sqrt{2})$   0.221657(2)
A good estimate of $T_c$ is obtained from the Binder cumulant
$$U = 1 - \frac{\langle M^4 \rangle}{3\langle M^2 \rangle^2}. \tag{7.11}$$
Just as the function $\Pi(L, p)$ in the percolation problem has a universal value at $p_c$, the Binder cumulant also has a universal value at $T_c$. The curves of $U(T)$ for different system sizes $L$ all cross in one point at $T_c$. This is a consequence of the finite size scaling ansatz:
$$\langle M^4 \rangle = (T - T_c)^{4\beta} u_4((T - T_c)L^{1/\nu})$$
$$\langle M^2 \rangle = (T - T_c)^{2\beta} u_2((T - T_c)L^{1/\nu}). \tag{7.12}$$
Thus
$$U(T, L) = 1 - \frac{u_4((T - T_c)L^{1/\nu})}{3u_2((T - T_c)L^{1/\nu})^2}, \tag{7.13}$$
which for $T = T_c$ is universal and independent of system size $L$:
$$U(T_c, L) = 1 - \frac{u_4(0)}{3u_2(0)^2} \tag{7.14}$$
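Given a time series of magnetization measurements at fixed $T$ and $L$, the Binder cumulant (7.11) is a simple ratio of sample moments. The following sketch shows the estimator; the function name and the list-based input are assumptions. Two limiting cases help as sanity checks: a fully ordered system with $M = \pm M_0$ gives $U = 2/3$, while Gaussian fluctuations around zero give $U = 0$.

```python
import random


def binder_cumulant(magnetizations):
    """U = 1 - <M^4> / (3 <M^2>^2), estimated from a list of samples of M."""
    n = len(magnetizations)
    m2 = sum(m ** 2 for m in magnetizations) / n
    m4 = sum(m ** 4 for m in magnetizations) / n
    return 1.0 - m4 / (3.0 * m2 * m2)


if __name__ == "__main__":
    rng = random.Random(5)
    ordered = [rng.choice((-1.0, 1.0)) for _ in range(1000)]
    disordered = [rng.gauss(0.0, 1.0) for _ in range(100000)]
    print("ordered:   ", binder_cumulant(ordered))
    print("disordered:", binder_cumulant(disordered))
```

In practice $U$ is computed on a grid of temperatures for several $L$, and the crossing of the curves is located by interpolation.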
High precision Monte Carlo simulations actually show that not all lines cross exactly at the same point, but that due to higher order corrections to finite size scaling the crossing point moves slightly, proportional to $L^{-1/\nu}$, allowing a high precision estimate of $T_c$ and $\nu$.^1
In table 7.1 we show the exponents and critical temperature of the Ising model in two and three dimensions.
7.5 Critical slowing down and cluster Monte Carlo methods
The importance of autocorrelation becomes clear when we wish to simulate the Ising model at low temperatures. The mean magnetization $\langle m \rangle$ is zero on any finite cluster, as there is a degeneracy between a configuration and its spin-reversed counterpart. If, however, we start at low temperatures with a configuration with all spins aligned up it will take an extremely long time for all spins to be flipped by the single spin flip algorithm. This problem appears as soon as we get close to the critical temperature, where it was observed that the autocorrelation times diverge as
$$\tau \propto [\min(\xi, L)]^z \tag{7.15}$$
with a dynamical critical exponent $z \approx 2$ for all local update methods like the single spin flip algorithm.
The reason is that at low temperatures it is very unlikely that even one spin gets flipped, and even more unlikely for a large cluster of spins to be flipped.
The solution to this problem was found in 1987 and 1989 by Swendsen and Wang^2 and by Wolff.^3 Instead of flipping single spins they propose to flip big clusters of spins and choose them in a clever way so that the probability of flipping these clusters is large.

^1 See, e.g., A. M. Ferrenberg and D. P. Landau, Phys. Rev. B 44, 5081 (1991); K. Chen, A. M. Ferrenberg and D. P. Landau, Phys. Rev. B 48, 3249 (1993).
7.5.1 Kandel-Domany framework
We use the Fortuin-Kasteleyn representation of the Ising model, as generalized by Kandel and Domany. The phase space of the Ising model is enlarged by assigning a set $\mathcal{G}$ of possible "graphs" to each configuration $C$ in the set of configurations $\mathcal{C}$. We write the partition function as
$$Z = \sum_{C \in \mathcal{C}} \sum_{G \in \mathcal{G}} W(C, G) \tag{7.16}$$
where the new weights $W(C, G) > 0$ are chosen such that $Z$ is the partition function of the original model by requiring
$$\sum_{G \in \mathcal{G}} W(C, G) = W(C) := \exp(-\beta E[C]), \tag{7.17}$$
where $E[C]$ is the energy of the configuration $C$.
The algorithm now proceeds as follows. First we assign a graph $G \in \mathcal{G}$ to the configuration $C$, chosen with the correct probability
$$P_C(G) = W(C, G)/W(C). \tag{7.18}$$
Then we choose a new configuration $C'$ with probability $p[(C, G) \to (C', G)]$, keeping the graph $G$ fixed; next a new graph $G'$ is chosen:
$$C \to (C, G) \to (C', G) \to C' \to (C', G') \to \ldots \tag{7.19}$$
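What about detailed balance? The procedure for choosing graphs with probabilities $P_C(G)$ obeys detailed balance trivially. The nontrivial part is the probability of choosing a new configuration $C'$. There detailed balance requires:
$$W(C, G)\, p[(C, G) \to (C', G)] = W(C', G)\, p[(C', G) \to (C, G)], \tag{7.20}$$
which can be fulfilled using either the heat bath algorithm
$$p[(C, G) \to (C', G)] = \frac{W(C', G)}{W(C, G) + W(C', G)} \tag{7.21}$$
or by again using the Metropolis algorithm:
$$p[(C, G) \to (C', G)] = \min(W(C', G)/W(C, G), 1) \tag{7.22}$$
[...] with $W(C', G) \ne 0$.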
7.5.2 The cluster algorithms for the Ising model
Let us now show how this abstract and general algorithm can be applied to the Ising model. Our graphs will be bond-percolation graphs on the lattice. Spins pointing in the same direction can be connected or disconnected. Spins pointing in opposite directions will always be disconnected. In the Ising model we can write the weights $W(C)$ and $W(C, G)$ as products over all bonds $b$:
$$W(C) = \prod_b w(C_b) \tag{7.25}$$
$$W(C, G) = \prod_b w(C_b, G_b) = \prod_b \Delta(C_b, G_b) V(G_b) \tag{7.26}$$
where the local bond configurations $C_b$ can be one of $\{\uparrow\uparrow, \downarrow\uparrow, \uparrow\downarrow, \downarrow\downarrow\}$ and the local graphs can be "connected" or "disconnected". The graph selection can thus be done locally on each bond.
Table 7.2 shows the local bond weights $w(c, g)$, $w(c)$, $\Delta(c, g)$ and $V(g)$. It can easily be checked that the sum rule (7.17) is satisfied.
The probability of a connected bond is $[\exp(\beta J) - \exp(-\beta J)]/\exp(\beta J) = 1 - \exp(-2\beta J)$ if two spins are aligned and zero otherwise. These connected bonds group the spins into clusters of aligned spins.
A new configuration $C'$ [...] $\langle \sigma_i \rangle$ of a single spin. As the cluster to which the spin belongs can be freely flipped, and the flipped cluster has the same probability as the original one, the improved estimator is
$$\langle m \rangle = \left\langle \tfrac{1}{2}(\sigma_i - \sigma_i) \right\rangle = 0. \tag{7.27}$$
This result is obvious because of symmetry, but we saw that at low temperatures a
single spin ip algorithm will fail to give this correct result since it takes an enormous
time to ip all spins. Thus it is encouraging that the cluster algorithms automatically
give the exact result in this case.
Correlation functions are not much harder to measure:
$$\langle\sigma_i\sigma_j\rangle = \begin{cases} 1 & \text{if } i \text{ and } j \text{ are on the same cluster} \\ 0 & \text{otherwise} \end{cases} \tag{7.28}$$
To derive this result consider the two cases and write down the improved estimators by
considering all possible cluster flips.
Using this simple result for the correlation functions the mean square of the
magnetization is
$$\langle m^2\rangle = \frac{1}{N^2}\sum_{i,j}\langle\sigma_i\sigma_j\rangle = \frac{1}{N^2}\left\langle\sum_{\text{cluster}} S(\text{cluster})^2\right\rangle, \tag{7.29}$$
where $S(\text{cluster})$ is the number of spins in a cluster. The susceptibility above $T_c$ is
simply given by $\beta\langle m^2\rangle$ and can also easily be calculated by the above sum over
the squares of the cluster sizes.
In the Wolff algorithm only a single cluster is built. The above sum (7.29) can be
rewritten to be useful also in the case of the Wolff algorithm:
$$\begin{aligned}
\langle m^2\rangle &= \frac{1}{N^2}\left\langle\sum_{\text{cluster}} S(\text{cluster})^2\right\rangle \\
&= \frac{1}{N^2}\left\langle\sum_i \frac{1}{S(\text{cluster containing } i)}\, S(\text{cluster containing } i)^2\right\rangle \\
&= \frac{1}{N^2}\left\langle\sum_i S(\text{cluster containing } i)\right\rangle = \frac{1}{N}\langle S(\text{cluster})\rangle.
\end{aligned} \tag{7.30}$$
The expectation value for $m^2$ is thus simply the mean cluster size divided by $N$. In this
derivation we replaced the sum over all clusters by a sum over all sites and had to divide the
contribution of each cluster by the number of sites in the cluster. Next we can replace
the average over all lattice sites by the expectation value for the cluster on a randomly
chosen site, which in the Wolff algorithm will be just the one Wolff cluster we build.
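The improved estimator (7.30) can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the text: a square $L \times L$ lattice with periodic boundaries, $J = 1$, a stack-based cluster growth, and the hypothetical function names `wolff_update` and `mean_m2`. Bonds between aligned spins are activated with the probability $1 - \exp(-2\beta J)$ quoted earlier.

```python
import math
import random

def wolff_update(spins, L, beta, J=1.0, rng=random):
    """Build and flip one Wolff cluster on an L x L periodic lattice.

    spins is a flat list of +1/-1 values; the cluster size is returned."""
    p_add = 1.0 - math.exp(-2.0 * beta * J)   # bond probability for aligned spins
    seed = rng.randrange(L * L)
    s0 = spins[seed]
    spins[seed] = -s0                         # flip sites as soon as they join
    stack = [seed]
    size = 1
    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        neighbors = (
            (x + 1) % L + y * L,
            (x - 1) % L + y * L,
            x + ((y + 1) % L) * L,
            x + ((y - 1) % L) * L,
        )
        for n in neighbors:
            # only still-unflipped spins aligned with the seed can join
            if spins[n] == s0 and rng.random() < p_add:
                spins[n] = -s0
                stack.append(n)
                size += 1
    return size

def mean_m2(L, beta, n_updates=2000, seed=1):
    """Improved estimator <m^2> = <S(cluster)> / N from Eq. (7.30)."""
    rng = random.Random(seed)
    spins = [1] * (L * L)
    total = 0
    for _ in range(n_updates):
        total += wolff_update(spins, L, beta, rng=rng)
    return total / n_updates / (L * L)
```

At $\beta = 0$ every cluster consists of a single spin and the estimator gives $1/N$; at low temperatures the cluster covers almost the whole lattice and the estimator approaches 1.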
Generalizations to other quantities, like the structure factor $S(q)$, are straightforward.
While the calculation of $S(q)$ by Fourier transform needs at least $O(N \log N)$
steps, it can be done much faster using improved estimators, here derived for the Wolff
algorithm:
$$\begin{aligned}
\langle S(q)\rangle &= \frac{1}{N^2}\sum_{r,r'}\langle\sigma_r\sigma_{r'}\rangle \exp(iq(r - r')) \\
&= \left\langle\frac{1}{N S(\text{cluster})}\sum_{r,r'\in\text{cluster}} \sigma_r\sigma_{r'}\exp(iq(r - r'))\right\rangle \\
&= \left\langle\frac{1}{N S(\text{cluster})}\left|\sum_{r\in\text{cluster}}\exp(iqr)\right|^2\right\rangle.
\end{aligned} \tag{7.31}$$
This needs only $O(S(\text{cluster}))$ operations and can be measured directly when
constructing the cluster.
Care must be taken for higher order correlation functions. Improved estimators
for quantities like $m^4$ contain terms of the form
$\langle S(\text{cluster}_1)S(\text{cluster}_2)\rangle$, which need
at least two clusters and cannot be measured in an improved way using the Wolff
algorithm.
7.7 Generalizations of cluster algorithms
Cluster algorithms can be used not only for the Ising model but for a large class of
classical, and even quantum, spin models. The quantum version is the loop algorithm,
which will be discussed later in the course. In this section we discuss generalizations to
other classical spin models.
Before discussing specific models we remark that generalizations to models with
different coupling constants on different bonds, or even random couplings, are
straightforward. All decisions are made locally, individually for each spin or bond, and the
couplings can thus be different at each bond.
7.7.1 Potts models
$q$-state Potts models are the generalization of the Ising model to more than two states.
The Hamilton function is
$$H = -J\sum_{\langle i,j\rangle}\delta_{s_i,s_j}, \tag{7.32}$$
where the states $s_i$ can take any integer value in the range $1,\ldots,q$. The 2-state Potts
model is just the Ising model with some trivial rescaling.
The cluster algorithms for the Potts models connect spins with probability $1 - e^{-\beta J}$
if the spins have the same value. The clusters are then "flipped" to any arbitrarily
chosen value in the range $1,\ldots,q$.
7.7.2 O(N) models
Another, even more important generalization are the $O(N)$ models. Well-known
examples are the XY model with $N = 2$ and the Heisenberg model with $N = 3$. In contrast
to the Ising model the spins can point in any direction on the unit sphere in $N$ dimensions.
The spins in the XY model can point in any direction in the plane and can be
characterized by a phase. The spins in the Heisenberg model point in any direction on a
sphere.
The Hamilton function is:
$$H = -J\sum_{\langle i,j\rangle}\vec S_i\cdot\vec S_j, \tag{7.33}$$
where the states $\vec S_i$ are $N$-dimensional unit vectors.
Cluster algorithms are constructed by projecting all spins onto a random direction
$\vec e$. The cluster algorithm for the Ising model can then be used for this projection. Two
spins $\vec S_i$ and $\vec S_j$ are connected with probability
$$1 - \exp\left\{\min\left[0,\, -2\beta J(\vec e\cdot\vec S_i)(\vec e\cdot\vec S_j)\right]\right\}. \tag{7.34}$$
The spins are flipped by inverting the projection onto the $\vec e$-direction:
$$\vec S_i \to \vec S_i - 2(\vec e\cdot\vec S_i)\,\vec e. \tag{7.35}$$
In the next update step a new direction $\vec e$ on the unit sphere is chosen.
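The two ingredients (7.34) and (7.35) translate directly into code. The following sketch assumes NumPy arrays for the spins; the function names are hypothetical, and drawing the direction from normalized Gaussian components is one common way to sample the unit sphere uniformly, not something prescribed by the text.

```python
import numpy as np

def random_direction(n, rng):
    """Uniformly distributed unit vector in n dimensions."""
    e = rng.normal(size=n)
    return e / np.linalg.norm(e)

def bond_probability(e, Si, Sj, beta, J=1.0):
    """Probability (7.34) of connecting the spins Si and Sj."""
    return 1.0 - np.exp(min(0.0, -2.0 * beta * J * np.dot(e, Si) * np.dot(e, Sj)))

def reflect(e, S):
    """Flip (7.35): invert the component of S along e."""
    return S - 2.0 * np.dot(e, S) * e
```

The reflection keeps the spin on the unit sphere and only changes the sign of its projection onto $\vec e$, so a spin and its reflected image are never connected.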
The Heisenberg model
Table 7.3 lists the critical exponents and temperature for the three-dimensional
Heisenberg model.
In two dimensions models with a continuous symmetry, like the $O(N)$ models with
$N \geq 2$, do not exhibit a phase transition at a finite critical temperature, as is proven by
the Mermin-Wagner theorem. The reason is the Goldstone modes: spin waves whose
excitation energies vanish at long wavelengths. As a
consequence the two-dimensional Heisenberg model has $T_c = 0$ and an exponentially
growing correlation length at low temperatures, $\xi \propto \exp(2\pi J/T)$. We will learn more
about this model and finite size scaling from one of the projects.

Table 7.3: Critical properties of the three-dimensional classical Heisenberg model.

quantity                      functional form                      exponent
magnetization                 $m \propto (T_c - T)^{\beta}$        $\beta = 0.3639(35)$
susceptibility                $\chi \propto |T - T_c|^{-\gamma}$   $\gamma = 1.3873(85)$
correlation length            $\xi \propto |T - T_c|^{-\nu}$       $\nu = 0.7048(30)$
specific heat                 $C(T) \propto |T - T_c|^{-\alpha}$   $\alpha = -0.1144(90)$
inverse critical temperature  simple cubic                         $1/T_c = 0.693035(37)$
                              body-centered cubic                  $1/T_c = 0.486798(12)$
The XY model
The only exception to the rule just stated, that models with $N \geq 2$ do not exhibit any
finite-temperature phase transition in two dimensions, is the XY model, which has a
finite-temperature Kosterlitz-Thouless transition.
This is a very special kind of phase transition. In accordance with the Mermin-Wagner
theorem there is no finite magnetization at any finite temperature. However
the helicity modulus (spin stiffness) remains finite up to a critical temperature $T_c > 0$.
At $T_c$ it jumps from the universal value $2T_c/\pi$ to 0. This model will be investigated
in another of the projects.
7.7.3 Generic implementation of cluster algorithms
The cluster algorithms for many models, including the models discussed above, can
be implemented in a very generic way using template constructs in C++. This generic
program, as well as performance comparisons which show that the generic program is on
average as fast as (and sometimes even faster than) a specific optimized Fortran program,
will be presented at the end of the course, after you have written your versions of the
algorithms.
7.8 The Wang-Landau algorithm
7.8.1 Flat histograms
While the cluster algorithms discussed in this section solve the problem of critical
slowing down at second order (continuous) phase transitions they do not help at all
at first order phase transitions. At first order phase transitions there is no divergence
of any quantity and both the disordered and the ordered phase remain (meta)stable
throughout the transition. The most famous example is the solid/liquid transition,
where you can observe the coexistence of both (meta)stable phases every day, e.g.
water and ice. Water remains liquid even when cooled below the freezing temperature,
until some ice crystal nucleates in the supercooled water and it starts freezing. There
is a large free energy barrier, the surface energy of the first ice crystal, which has to be
overcome before freezing sets in. This leads to macroscopically long tunneling times
between the two coexisting phases at the phase transition.
[Figure 7.1: plot of $P(E, T_c)$ on a logarithmic scale versus $E/N$ for system sizes $L = 60$, 100, 150 and 200.]
Figure 7.1: The probability $P(E,T) = \rho(E)\exp(-E/k_BT)$ of visiting a state with
energy $E$ in a $q = 10$-state Potts model at the critical temperature. The tunneling
probability between the two phases (the dip between the two maxima) becomes
exponentially small for large systems. This figure is taken from the paper: F. Wang and
D.P. Landau, Phys. Rev. Lett. 86, 2050 (2001).
The simplest lattice model showing such a first order thermal phase transition is
a two-dimensional Potts model with large $q$, e.g. $q = 10$. For this model we show in
Fig. 7.1 the probability $P(E,T)$ of visiting a configuration with energy $E$. This is:
$$P(E,T) = \rho(E)p(E) = \rho(E)e^{-E/k_BT}, \tag{7.36}$$
where the density of states $\rho(E)$ counts the number of states with energy $E$. At the
critical temperature there are two coexisting phases, the ordered and the disordered one,
with different energies. In order to tunnel from one to the other one has to continuously
change the energy and thus go through the probability minimum between the two
peaks. This probability decreases exponentially with system size and we thus have a
real problem!
The Wang-Landau algorithm is the latest in a series of attempts at solving this
problem [4]. It starts from the simple observation that the probability minimum vanishes
if we choose $p(E) \propto 1/\rho(E)$ instead of the Boltzmann weight $p(E) \propto \exp(-E/k_BT)$:
$$P(E,T) = \rho(E)p(E) \propto \rho(E)/\rho(E) = \text{const.} \tag{7.37}$$
During our sampling we thus get a "flat" histogram in the energy.
7.8.2 Determining $\rho(E)$
The only problem now is that $\rho(E)$ is not known, since it requires a solution of the full
problem. The approach by Wang and Landau is crude but simple. They start with a
(very bad) guess $\rho(E) = 1$ for all energies and iteratively improve it:
• Start with $\rho(E) = 1$ and $f = e$
• Repeat
    • Reset a histogram of energies $H(E) = 0$
    • Perform simulations until the histogram of energies $H(E)$ is "flat":
        • pick a random site
        • attempt a local Metropolis update using $p(E) = 1/\rho(E)$
        • increase the histogram at the current energy $E$: $H(E) \leftarrow H(E) + 1$
        • increase the estimate for $\rho(E)$ at the current energy $E$: $\rho(E) \leftarrow \rho(E)\cdot f$
    • once $H(E)$ is "flat" (e.g. the minimum is at least 80% of the mean), reduce
      $f \leftarrow \sqrt{f}$
• stop once $f \approx 1 + 10^{-8}$
[4] F. Wang and D.P. Landau, Phys. Rev. Lett. 86, 2050 (2001); Phys. Rev. E 64, 056101 (2001).
As you can see, only a few lines of code need to be changed in your local update
algorithm for the Ising model, but a few remarks are necessary:
1. Check for flatness of the histogram not at every step but only after a reasonable
   number of sweeps $N_\text{sweeps}$. One sweep is defined as one attempted update per site.
2. The initial value for $f$ needs to be carefully chosen; $f = e$ is only a rough guide.
   As discussed in the papers a good choice is picking the initial $f$ such that
   $f^{N_\text{sweeps}}$ is approximately the total number of states (e.g. $q^N$ for a $q$-state
   Potts model with $N$ sites).
3. The flatness criterion is quite arbitrary and some research is still necessary to find
   the optimal choice.
4. The density of states $\rho(E)$ can become very large and easily exceed $10^{10000}$. In
   order to obtain such large numbers the multiplicative increase $\rho(E) \leftarrow \rho(E)\cdot f$ is
   essential. A naive additive guess $\rho(E) \leftarrow \rho(E) + f$ would never be able to reach
   the large numbers needed.
5. Since $\rho(E)$ is so large, we only store its logarithm. The update step is thus
   $\ln\rho(E) \leftarrow \ln\rho(E) + \ln f$. The Metropolis acceptance probability will be
   $$P = \min[1, \exp(\ln\rho(E_\text{old}) - \ln\rho(E_\text{new}))] \tag{7.38}$$
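The whole procedure, including remarks 1 and 5, fits into a short Python function. The sketch below makes several assumptions that go beyond the text: it uses a periodic Ising chain with $J = 1$ as the test system (its exact density of states is known), stores $\ln\rho(E)$ in a dictionary, and stops at a much larger final $\ln f$ than the $10^{-8}$ quoted above so that it runs quickly. The function name and its parameters are hypothetical.

```python
import math
import random

def wang_landau_ising_ring(n_sites, ln_f_final=1e-3, sweeps_per_check=100,
                           flatness=0.8, seed=0):
    """Estimate ln rho(E) for a periodic Ising chain with E = -sum_i s_i s_{i+1}."""
    rng = random.Random(seed)
    spins = [1] * n_sites
    energy = -n_sites
    ln_rho = {energy: 0.0}                 # ln rho(E) = 0, i.e. rho(E) = 1
    ln_f = 1.0                             # f = e
    while ln_f > ln_f_final:
        hist = {}                          # reset the histogram H(E)
        flat = False
        while not flat:
            for _ in range(sweeps_per_check * n_sites):
                i = rng.randrange(n_sites)
                left, right = spins[i - 1], spins[(i + 1) % n_sites]
                e_new = energy + 2 * spins[i] * (left + right)
                # Metropolis step with p(E) = 1/rho(E), Eq. (7.38);
                # energies seen for the first time start with ln rho = 0
                delta = ln_rho[energy] - ln_rho.setdefault(e_new, 0.0)
                if delta >= 0 or rng.random() < math.exp(delta):
                    spins[i] = -spins[i]
                    energy = e_new
                hist[energy] = hist.get(energy, 0) + 1
                ln_rho[energy] += ln_f     # rho(E) <- rho(E) * f
            counts = [hist.get(e, 0) for e in ln_rho]
            flat = min(counts) >= flatness * sum(counts) / len(counts)
        ln_f /= 2.0                        # f <- sqrt(f)
    return ln_rho
```

Only differences of $\ln\rho(E)$ are meaningful; for the chain with $N$ sites they can be compared with the exact result $\rho(E) = 2\binom{N}{k}$, where $k = (E + N)/2$ is the (even) number of domain walls.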
7.8.3 Calculating thermodynamic properties
Another advantage of the Wang-Landau algorithm is that, once we know the density
of states $\rho(E)$, we can directly calculate the partition function
$$Z = \sum_c e^{-E_c/k_BT} = \sum_E \rho(E)e^{-E/k_BT} \tag{7.39}$$
and the free energy
$$F = -k_BT\ln Z = -k_BT\ln\sum_E \rho(E)e^{-E/k_BT} \tag{7.40}$$
which are both not directly accessible in any other Monte Carlo algorithm. All other
thermodynamic properties such as the susceptibility or the specific heat can now be
calculated simply as derivatives of the free energy.
In evaluating these sums one has to be careful to avoid overflow
in the exponentials. More details on how to do that will be given in the exercises.
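One standard way to avoid the overflow, shown here only as a sketch and not necessarily the approach of the exercises, is to factor out the largest term of the sum before exponentiating. The code assumes that $\ln\rho(E)$ is stored in a dictionary keyed by energy and uses $\beta = 1/k_BT$; the function names are hypothetical.

```python
import math

def ln_partition_function(ln_rho, beta):
    """ln Z = ln sum_E rho(E) exp(-beta E), from a dict mapping E to ln rho(E)."""
    exponents = [lr - beta * e for e, lr in ln_rho.items()]
    shift = max(exponents)                 # largest term, factored out of the sum
    return shift + math.log(sum(math.exp(x - shift) for x in exponents))

def free_energy(ln_rho, beta):
    """F = -k_B T ln Z, Eq. (7.40)."""
    return -ln_partition_function(ln_rho, beta) / beta
```

After the shift every exponential is at most 1, so the sum can only underflow harmlessly for negligible terms. For $n$ independent two-level units with energies 0 and 1 the result can be checked against $\ln Z = n\ln(1 + e^{-\beta})$, even when $\rho(E)$ itself is far too large for a floating-point number.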
7.8.4 Optimized ensembles
Instead of choosing a flat histogram, any arbitrary ensemble can actually be
simulated using generalizations of the Wang-Landau algorithm. And indeed, as realized by
Prakash Dayal [5], the flat histogram ensemble is not yet optimal. Simon Trebst [6] has
then derived an optimal ensemble and an algorithm to determine it. He will talk about
this algorithm later during the semester.
7.9 The transfer matrix method
7.9.1 The Ising chain
As a last method we wish to discuss the transfer matrix method, which gives exact
numerical results for Ising models on infinite strips of small width $W$. The partition
function of a periodic Ising chain of length $L$ can be written as:
$$Z = \sum_{\{(\sigma_1,\ldots,\sigma_L)\}}\ \prod_{i=1}^{L}\exp(\beta J\sigma_i\sigma_{i+1}), \tag{7.41}$$
where the sum goes over all configurations $(\sigma_1,\ldots,\sigma_L)$ and periodic boundary
conditions are implemented by defining $\sigma_{L+1} \equiv \sigma_1$. The partition function can be
rewritten as a trace over a product of transfer matrices $U$
$$U = \begin{pmatrix}\exp(\beta J) & \exp(-\beta J)\\ \exp(-\beta J) & \exp(\beta J)\end{pmatrix}. \tag{7.42}$$
Using these transfer matrices the partition function becomes:
$$Z = \sum_{\sigma_1=\uparrow,\downarrow}\ \sum_{\sigma_2=\uparrow,\downarrow}\cdots\sum_{\sigma_L=\uparrow,\downarrow}\ \prod_{i=1}^{L}\exp(\beta J\sigma_i\sigma_{i+1}) = \mathrm{Tr}\,U^L \tag{7.43}$$
For strips of finite width $W$ transfer matrices of dimension $2^W$ can be constructed in a
similar fashion, as we will see in the next section.
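The identity $Z = \mathrm{Tr}\,U^L$ of Eq. (7.43) is easy to check numerically for short chains. The following sketch, with hypothetical function names and NumPy as an assumed dependency, compares the trace with a brute-force sum over all $2^L$ configurations.

```python
import itertools
import numpy as np

def z_transfer(L, beta, J=1.0):
    """Partition function of the periodic Ising chain as Tr U^L."""
    U = np.array([[np.exp(beta * J), np.exp(-beta * J)],
                  [np.exp(-beta * J), np.exp(beta * J)]])
    return np.trace(np.linalg.matrix_power(U, L))

def z_brute_force(L, beta, J=1.0):
    """Direct sum of Eq. (7.41) over all 2^L spin configurations."""
    z = 0.0
    for s in itertools.product((1, -1), repeat=L):
        z += np.exp(beta * J * sum(s[i] * s[(i + 1) % L] for i in range(L)))
    return z
```

The brute-force sum costs $O(L\,2^L)$ operations and is only feasible for very short chains, whereas the trace needs a handful of $2 \times 2$ matrix products.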
[5] P. Dayal et al., Phys. Rev. Lett. 92, 097201 (2004)
[6] S. Trebst, D. Huse and M. Troyer, Phys. Rev. E 70, 046701 (2004)
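The free energy density is:
$$f = -\frac{1}{\beta L}\ln Z = -\frac{1}{\beta L}\ln\mathrm{Tr}\,U^L \tag{7.44}$$
In the limit $L \to \infty$ only the largest eigenvalues of the transfer matrix $U$ will be
important. We label the eigenvalues $\lambda_i$, with $i = 1,\ldots,D := 2^W$, with
$|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \ldots$. Then the free energy density can be written as
$$\begin{aligned}
f &= -\frac{1}{\beta L}\ln\mathrm{Tr}\,U^L = -\frac{1}{\beta L}\ln\Big(\sum_{i=1}^{D}\lambda_i^L\Big) \\
&= -\frac{1}{\beta L}\ln\Big(\lambda_1^L\sum_{i=1}^{D}\lambda_i^L/\lambda_1^L\Big) = -\frac{1}{\beta L}\Big[\ln\lambda_1^L + \ln\Big(\sum_{i=1}^{D}\lambda_i^L/\lambda_1^L\Big)\Big]
\end{aligned} \tag{7.45}$$
In the limit $L \to \infty$ all $\lambda_i^L/\lambda_1^L$ converge to zero except for the $i = 1$ term, which gives
1. Thus we obtain
$$f = -\frac{1}{\beta}\ln\lambda_1. \tag{7.46}$$
In a similar way we can show that correlations decay like $(\lambda_2/\lambda_1)^r$, which gives a
correlation length
$$\xi = \frac{1}{\ln\lambda_1 - \ln\lambda_2} \tag{7.47}$$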
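For the single chain both quantities follow from the two eigenvalues of the matrix (7.42). A minimal sketch, assuming NumPy and a ferromagnetic coupling so that both eigenvalues are positive (the function name is hypothetical):

```python
import numpy as np

def chain_free_energy_and_xi(beta, J=1.0):
    """Free energy density (7.46) and correlation length (7.47) of the Ising chain."""
    U = np.array([[np.exp(beta * J), np.exp(-beta * J)],
                  [np.exp(-beta * J), np.exp(beta * J)]])
    lam = np.sort(np.linalg.eigvalsh(U))[::-1]   # lam[0] > lam[1] > 0 for beta * J > 0
    f = -np.log(lam[0]) / beta
    xi = 1.0 / (np.log(lam[0]) - np.log(lam[1]))
    return f, xi
```

The eigenvalues of this $2 \times 2$ matrix are $\lambda_1 = 2\cosh\beta J$ and $\lambda_2 = 2\sinh\beta J$, so the numerical result can be compared with $f = -\ln(2\cosh\beta J)/\beta$ and $\xi = -1/\ln\tanh\beta J$.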
7.9.2 Coupled Ising chains
The same procedure also works for infinitely long strips of $W$ coupled Ising chains. In
that case the transfer matrix has dimension $N = 2^W$ and is constructed just as before.
To obtain the eigenvalues of the transfer matrix we will use three tricks:
1. Representation of the dense transfer matrix $U$ as a product of sparse matrices.
2. Calculation of the eigenvalues of the transfer matrix by iterative methods.
3. Calculation of the matrix elements by multispin coding.
The Hamilton function is
$$H = \sum_{x=-\infty}^{\infty} H^{(x)} \tag{7.48}$$
with
$$H^{(x)} = -J\sum_{y=1}^{W}\sigma_{x,y}\sigma_{x+1,y} - J\sum_{y=1}^{W-1}\sigma_{x,y}\sigma_{x,y+1} \tag{7.49}$$
The corresponding transfer matrix $U$ is a dense matrix, with all elements nonzero.
We can however write it as a product of sparse matrices:
$$U^{(x)} = \prod_{y=1}^{W-1}U^{1/2}_{(x,y)-(x,y+1)}\ \prod_{y=1}^{W}U_{(x,y)-(x+1,y)}\ \prod_{y=1}^{W-1}U^{1/2}_{(x,y)-(x,y+1)} \tag{7.50}$$
where the partial transfer matrices $U_{(x_1,y_1)-(x_2,y_2)}$ arise from the terms
$\sigma_{x_1,y_1}\sigma_{x_2,y_2}$ in the Hamiltonian. The square roots $U^{1/2}_{(x,y)-(x,y+1)}$ are used
to make the matrix $U$ Hermitian, which will be necessary for the Lanczos algorithm.
All of these transfer matrices are sparse. The matrices $U_{(x,y)-(x,y+1)}$ are diagonal
as they act only on the configuration at fixed $x$. The matrices $U_{(x,y)-(x+1,y)}$ contain
a diagonal entry for $\sigma_{x,y} = \sigma_{x+1,y}$ and additionally an off-diagonal entry when
$\sigma_{x,y} = -\sigma_{x+1,y}$.
We can thus replace a matrix-vector multiplication $Uv$, which would need $N^2 = 2^{2W}$
operations, by $W - 1$ multiplications with diagonal matrices $U_{(x,y)-(x,y+1)}$ and $W$
multiplications with sparse matrices $U_{(x,y)-(x+1,y)}$. This requires only $(4W - 2)N$ operations,
significantly less than $N^2$!
In the next section we will discuss how the matrix-vector multiplications can be
coded very efficiently. Finally we will explain the Lanczos algorithm, which calculates
the extreme eigenvalues of a sparse matrix.
7.9.3 Multispin coding
Multispin coding is an efficient technique to map a configuration of Ising spins to an
array index. We interpret the bit pattern of an integer number as a spin configuration.
A bit set to 1 corresponds to an up spin, a bit set to 0 corresponds to a down spin.
Let us now consider the matrices $U_{(x,y)-(x,y+1)}$. As mentioned before, they are
diagonal as they do not propagate in the $x$-direction, but just give the Boltzmann weights of
the current configuration. For a given $y$ we loop over all $N$ configurations $c = 0 \ldots N-1$.
In each configuration $c$ we consider the bits $y$ and $y + 1$. If they are the same, the spins
are aligned and the diagonal matrix element is $\exp(\beta J)$, otherwise the matrix element
is $\exp(-\beta J)$.
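A vectorized sketch of multispin coding with NumPy follows. The diagonal factor implements the bit test just described, using the square root of the weights because the product (7.50) contains $U^{1/2}$. The factor for the bonds in the $x$-direction is included under an explicit assumption: the off-diagonal element is taken to connect $c$ with the configuration in which bit $y$ is flipped, `c ^ (1 << y)`, which reproduces the matrix (7.42) for $W = 1$. All function names are hypothetical, and bits are numbered from 0.

```python
import numpy as np

def apply_vertical_sqrt(v, W, y, beta, J=1.0):
    """Multiply v by the diagonal matrix U^{1/2} for the bond between bits y and y+1."""
    c = np.arange(1 << W)
    aligned = ((c >> y) & 1) == ((c >> (y + 1)) & 1)
    return v * np.where(aligned, np.exp(0.5 * beta * J), np.exp(-0.5 * beta * J))

def apply_horizontal(v, W, y, beta, J=1.0):
    """Multiply v by U for the bond (x,y)-(x+1,y).

    Assumption: the off-diagonal element connects c with c ^ (1 << y)."""
    c = np.arange(1 << W)
    return np.exp(beta * J) * v + np.exp(-beta * J) * v[c ^ (1 << y)]

def apply_transfer_matrix(v, W, beta, J=1.0):
    """Matrix-vector product with the sparse factorization (7.50)."""
    for y in range(W - 1):
        v = apply_vertical_sqrt(v, W, y, beta, J)
    for y in range(W):
        v = apply_horizontal(v, W, y, beta, J)
    for y in range(W - 1):
        v = apply_vertical_sqrt(v, W, y, beta, J)
    return v
```

Each factor touches every vector element once or twice, so one application costs $O(WN)$ operations instead of the $N^2$ of a dense product.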
The matrices $U_{(x,y)-(x+1,y)}$ are not much harder to compute. For each configuration
$c$ there is a diagonal element $\exp(\beta J)$. This corresponds to the case when the spin
configuration does not change and the two spins at $(x,y)$ and $(x+1,y)$ are aligned. In
addition there is one off-diagonal element $\exp(-\beta J)$, corresponding to a change in spin
orientation. It connects the state $c$ with a state $c'$ ...
$$\beta_{n+1}v_{n+1} = Av_n - \alpha_n v_n - \beta_n v_{n-1}, \tag{7.52}$$
where
$$\alpha_n = v_n^\dagger A v_n, \qquad \beta_n = |v_n^\dagger A v_{n-1}|. \tag{7.53}$$
As the orthogonality condition
$$v_i^\dagger v_j = \delta_{ij} \tag{7.54}$$
does not determine the phases of the basis vectors, the $\beta_i$ can be chosen to be real and
positive. As can be seen, we only need to keep three vectors of size $N$ in memory, which
makes the Lanczos algorithm very efficient when compared to dense matrix eigensolvers,
which require storage of order $N^2$.
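In the Krylov basis the matrix $A$ is tridiagonal
$$T^{(n)} := \begin{pmatrix}
\alpha_1 & \beta_2 & 0 & \cdots & 0 \\
\beta_2 & \alpha_2 & \ddots & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \beta_n \\
0 & \cdots & 0 & \beta_n & \alpha_n
\end{pmatrix}. \tag{7.55}$$
The eigenvalues $\{\tau_1,\ldots,\tau_M\}$ of $T$ are good approximations of the eigenvalues of $A$.
The extreme eigenvalues converge very fast. Thus $M \ll N$ iterations are sufficient to
obtain the extreme eigenvalues.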
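The recursion (7.52) and the tridiagonal matrix (7.55) can be sketched as follows. This is an illustration for real symmetric matrices only, without reorthogonalization or ghost detection; the function names are hypothetical and the small tridiagonal problem is solved with NumPy's dense `eigvalsh`.

```python
import numpy as np

def lanczos_tridiagonal(matvec, v1, n_iter):
    """Run n_iter Lanczos steps; return the diagonal and off-diagonal of T."""
    v_prev = np.zeros_like(v1)
    v = v1 / np.linalg.norm(v1)
    alpha, beta = [], [0.0]                 # beta[0] is a dummy for the first step
    for _ in range(n_iter):
        w = matvec(v)
        a = np.dot(v, w)                    # alpha_n = v_n^T A v_n
        w = w - a * v - beta[-1] * v_prev   # right-hand side of Eq. (7.52)
        b = np.linalg.norm(w)               # beta_{n+1}, real and positive
        alpha.append(a)
        if b < 1e-14:                       # invariant subspace found, stop early
            break
        beta.append(b)
        v_prev, v = v, w / b                # only three vectors are kept
    return np.array(alpha), np.array(beta[1:len(alpha)])

def extreme_eigenvalues(matvec, v1, n_iter):
    """Smallest and largest eigenvalue of the tridiagonal matrix T."""
    alpha, beta = lanczos_tridiagonal(matvec, v1, n_iter)
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    tau = np.linalg.eigvalsh(T)
    return tau[0], tau[-1]
```

The matrix enters only through `matvec`, so the sparse transfer-matrix product can be passed in directly without ever storing a dense $N \times N$ array.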
Eigenvectors
It is no problem to compute the eigenvectors of $T$. They are however given in the
Krylov basis $\{v_1, v_2, \ldots, v_M\}$. To obtain the eigenvectors in the original basis we need
to perform a basis transformation.
Due to memory constraints we usually do not store all the $v_i$, but only the last three
vectors. To transform the eigenvector to the original basis we have to do the Lanczos
iterations a second time. Starting from the same initial vector $v_1$ we construct the
vectors $v_i$ iteratively and perform the basis transformation as we go along.
Roundoff errors and ghosts
In exact arithmetic the vectors $\{v_i\}$ are orthogonal and the Lanczos iterations stop after
at most $N - 1$ steps. The eigenvalues of $T$ are then the exact eigenvalues of $A$.
Roundoff errors in finite precision cause a loss of orthogonality. There are two ways
to deal with that:
• Reorthogonalization of the vectors after every step. This requires storing all of
  the vectors $\{v_i\}$ and is memory intensive.
• Control of the effects of roundoff.
We will discuss the second solution as it is faster and needs less memory. The main
effect of roundoff errors is that the matrix $T$ contains extra spurious eigenvalues, called
"ghosts". These ghosts are not real eigenvalues of $A$. However they converge towards
real eigenvalues of $A$ over time and increase their multiplicities.
A simple criterion distinguishes ghosts from real eigenvalues. Ghosts are caused by
roundoff errors. Thus they do not depend on the starting vector $v_1$. As a consequence
these ghosts are also eigenvalues of the matrix $\tilde T$, which can be obtained from $T$ by
deleting the first row and column:
$$\tilde T^{(n)} := \begin{pmatrix}
\alpha_2 & \beta_3 & 0 & \cdots & 0 \\
\beta_3 & \alpha_3 & \ddots & \ddots & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \beta_n \\
0 & \cdots & 0 & \beta_n & \alpha_n
\end{pmatrix}. \tag{7.56}$$
From these arguments we derive the following heuristic criterion to distinguish ghosts
from real eigenvalues:
• All multiple eigenvalues are real, but their multiplicities might be too large.
• All single eigenvalues of $T$ which are not eigenvalues of $\tilde T$ are also real.
Numerically stable and efficient implementations of the Lanczos algorithm can be
obtained from netlib. As usual, do not start coding your own algorithm but use existing
optimal implementations.
7.11 Renormalization group methods for classical
spin systems
Just as in the percolation problem renormalization group methods can also be applied
to spin systems. A block of b spins is replaced by a single spin, and the interaction
is renormalized by requiring that physical expectation values are invariant under the
transformation.
Again, Monte Carlo renormalization group can be used for larger block sizes b. As
we have already learned the basics of renormalization in the percolation problem we
will not discuss any details here. Interested students are referred to the text book by
Tao Pang and to references therein.
Chapter 8
The quantum one-body problem
8.1 The time-independent one-dimensional Schrödinger equation
We will start the discussion of quantum problems with the time-independent one-dimensional
Schrödinger equation for a particle with mass $m$ in a potential $V(x)$. For this problem
the time-dependent Schrödinger equation
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x)\psi, \tag{8.1}$$
can be simplified to an ordinary differential equation using the ansatz
$\psi(x,t) = \psi(x)\exp(-iEt/\hbar)$:
$$E\psi = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x)\psi. \tag{8.2}$$
8.1.1 The Numerov algorithm
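After rewriting this second order differential equation as a coupled system of two first
order differential equations, any ODE solver such as the Runge-Kutta method could be
applied, but there exist better methods.
For the special form
$$\psi''(x) + k(x)\psi(x) = 0 \tag{8.3}$$
of the Schrödinger equation, with $k(x) = 2m(E - V(x))/\hbar^2$, we can derive the Numerov
algorithm by starting from the Taylor expansion of $\psi_n = \psi(n\Delta x)$:
$$\psi_{n\pm1} = \psi_n \pm \Delta x\,\psi_n' + \frac{\Delta x^2}{2}\psi_n'' \pm \frac{\Delta x^3}{6}\psi_n^{(3)} + \frac{\Delta x^4}{24}\psi_n^{(4)} \pm \frac{\Delta x^5}{120}\psi_n^{(5)} + O(\Delta x^6) \tag{8.4}$$
Adding $\psi_{n+1}$ and $\psi_{n-1}$ we obtain
$$\psi_{n+1} + \psi_{n-1} = 2\psi_n + (\Delta x)^2\psi_n'' + \frac{(\Delta x)^4}{12}\psi_n^{(4)} + O(\Delta x^6). \tag{8.5}$$
Replacing the fourth derivative by a finite difference second derivative of the second
derivatives
$$\psi_n^{(4)} = \frac{\psi_{n+1}'' + \psi_{n-1}'' - 2\psi_n''}{\Delta x^2} \tag{8.6}$$
and substituting $-k(x)\psi(x)$ for $\psi''(x)$ we obtain the Numerov algorithm
$$\left(1 + \frac{(\Delta x)^2}{12}k_{n+1}\right)\psi_{n+1} = 2\left(1 - \frac{5(\Delta x)^2}{12}k_n\right)\psi_n - \left(1 + \frac{(\Delta x)^2}{12}k_{n-1}\right)\psi_{n-1} + O(\Delta x^6), \tag{8.7}$$
which is locally of sixth order!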
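Equation (8.7) is a three-term recursion and takes only a few lines of code. The sketch below is a plain-Python illustration with a hypothetical function name; it takes $k(x)$ as a callable and the two starting values $\psi(x_0)$ and $\psi(x_0 + \Delta x)$.

```python
import math

def numerov(k, psi0, psi1, x0, dx, n_steps):
    """Integrate psi'' + k(x) psi = 0 with the Numerov recursion (8.7).

    Returns the list [psi(x0), psi(x0 + dx), ..., psi(x0 + n_steps * dx)]."""
    c = dx * dx / 12.0
    psi = [psi0, psi1]
    for n in range(1, n_steps):
        x = x0 + n * dx
        rhs = (2.0 * (1.0 - 5.0 * c * k(x)) * psi[n]
               - (1.0 + c * k(x - dx)) * psi[n - 1])
        psi.append(rhs / (1.0 + c * k(x + dx)))
    return psi
```

A simple check is the case $k(x) = 1$ with starting values $\psi(0) = 0$ and $\psi(\Delta x) = \sin\Delta x$, for which the exact solution is $\sin x$.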
Initial values
To start the Numerov algorithm we need the wave function not just at one but at two
initial points and will now present several ways to obtain these.
For potentials $V(x)$ with reflection symmetry $V(x) = V(-x)$ the wave functions
need to be either even, $\psi(x) = \psi(-x)$, or odd, $\psi(x) = -\psi(-x)$, under reflection, which
can be used to find initial values:
• For the even solution we use a half-integer mesh with mesh points
  $x_{n+1/2} = (n + 1/2)\Delta x$ and pick initial values $\psi(x_{-1/2}) = \psi(x_{1/2}) = 1$.
• For the odd solution we know that $\psi(0) = -\psi(0)$ and hence $\psi(0) = 0$, specifying
  the first starting value. Using an integer mesh with mesh points $x_n = n\Delta x$ we
  pick $\psi(x_1) = 1$ as the second starting value.
For general potentials we need to use other approaches. If the potential vanishes at
large distances, $V(x) = 0$ for $|x| \geq a$, we can use the exact solution of the Schrödinger
equation at large distances to define starting points, e.g. for a bound state with $E < 0$,
which decays exponentially outside the potential region:
$$\psi(-a) = 1 \tag{8.8}$$
$$\psi(-a - \Delta x) = \exp(-\Delta x\sqrt{-2mE}/\hbar). \tag{8.9}$$
Finally, if the potential never vanishes we need to begin with a single starting value
$\psi(x_0)$ and obtain the second starting value $\psi(x_1)$ by performing an integration over the
first step with an Euler or Runge-Kutta algorithm.
8.1.2 The onedimensional scattering problem
The scattering problem is the numerically easiest quantum problem since solutions
exist for all energies $E > 0$, if the potential vanishes at large distances ($V(x) \to 0$ for
$|x| \to \infty$). The solution becomes particularly simple if the potential is nonzero only
on a finite interval $[0, a]$. For a particle approaching the potential barrier from the left
($x < 0$) we can make the following ansatz for the free propagation when $x < 0$:
$$\psi_L(x) = A\exp(ikx) + B\exp(-ikx) \tag{8.10}$$
where $A$ is the amplitude of the incoming wave and $B$ the amplitude of the reflected
wave. On the right-hand side, once the particle has left the region of finite potential
($x > a$), we can again make a free propagation ansatz,
$$\psi_R(x) = C\exp(ikx) \tag{8.11}$$
The coefficients $A$, $B$ and $C$ have to be determined self-consistently by matching to a
numerical solution of the Schrödinger equation in the interval $[0, a]$. This is best done
in the following way:
• Set $C = 1$ and use the two points $a$ and $a + \Delta x$ as starting points for a Numerov
  integration.
• Integrate the Schrödinger equation numerically backwards in space, from $a$ to 0,
  using the Numerov algorithm.
• Match the numerical solution of the Schrödinger equation for $x < 0$ to the free
  propagation ansatz (8.10) to determine $A$ and $B$.
Once $A$ and $B$ have been determined the reflection and transmission probabilities $R$
and $T$ are given by
$$R = |B|^2/|A|^2 \tag{8.12}$$
$$T = 1/|A|^2 \tag{8.13}$$
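The three steps can be sketched as follows, in units with $\hbar = m = 1$ so that $k = \sqrt{2E}$ and $k(x) = 2(E - V(x))$. The function name, the number of mesh points and the choice to match at the two points $x = 0$ and $x = -\Delta x$ are assumptions of this illustration, not prescriptions of the text.

```python
import cmath
import math

def reflection_transmission(V, a, E, n_steps=1000):
    """Return (R, T) for a potential V(x) that vanishes outside [0, a]; hbar = m = 1."""
    dx = a / n_steps
    k0 = math.sqrt(2.0 * E)
    kfun = lambda x: 2.0 * (E - V(x))
    c = dx * dx / 12.0
    # psi_R = exp(ikx) with C = 1 at the starting points a + dx and a
    psi_next = cmath.exp(1j * k0 * (a + dx))
    psi = cmath.exp(1j * k0 * a)
    for j in range(n_steps + 1):           # Numerov backwards, down to x = -dx
        x = a - j * dx
        psi_prev = (2.0 * (1.0 - 5.0 * c * kfun(x)) * psi
                    - (1.0 + c * kfun(x + dx)) * psi_next) / (1.0 + c * kfun(x - dx))
        psi_next, psi = psi, psi_prev
    # now psi = psi(-dx) and psi_next = psi(0); match to A exp(ikx) + B exp(-ikx)
    phase = cmath.exp(1j * k0 * dx)
    A = (psi - psi_next * phase) / (1.0 / phase - phase)
    B = psi_next - A
    return abs(B) ** 2 / abs(A) ** 2, 1.0 / abs(A) ** 2
```

For a real potential the probability current is conserved, so $R + T = 1$ provides a simple consistency check of the integration.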
8.1.3 Bound states and solution of the eigenvalue problem
While there exist scattering states for all energies $E > 0$, bound state solutions of the
Schrödinger equation with $E < 0$ exist only for discrete energy eigenvalues. Integrating
the Schrödinger equation from $-\infty$ to $+\infty$ the solution will diverge to $\pm\infty$ as $x \to \infty$
for almost all values of $E$. These functions cannot be normalized and thus do not constitute
solutions to the Schrödinger equation. Only for some special eigenvalues $E$ will the
solution go to zero as $x \to \infty$.
A simple eigensolver can be implemented using the following shooting method, where
we again will assume that the potential is zero outside an interval $[0, a]$:
• Start with an initial guess $E$.
• Integrate the Schrödinger equation for $\psi_E(x)$ from $x = 0$ to $x_f \gg a$ and determine
  the value $\psi_E(x_f)$.
• Use a root solver, such as a bisection method, to look for an energy $E$ with
  $\psi_E(x_f) \approx 0$.
This algorithm is not ideal since the divergence of the wave function for $x \to \infty$ will
cause roundoff error to proliferate.
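The simple shooting method can be sketched as below. To have an exactly known answer, the illustration departs from the setting above in one respect: it uses the harmonic oscillator $V(x) = x^2/2$ (with $\hbar = m = 1$), starts the even solution on a half-integer mesh with $\psi(x_{-1/2}) = \psi(x_{1/2}) = 1$, and integrates outwards with the Numerov recursion. The function names and default parameters are hypothetical.

```python
def psi_at_xf(E, V, x_f, dx):
    """Even solution integrated with Numerov from x = dx/2 outwards; hbar = m = 1."""
    k = lambda x: 2.0 * (E - V(x))
    c = dx * dx / 12.0
    n_steps = int(round(x_f / dx))
    x = 0.5 * dx
    psi_prev, psi = 1.0, 1.0               # psi(x_{-1/2}) = psi(x_{1/2}) = 1
    for _ in range(n_steps):
        psi_next = (2.0 * (1.0 - 5.0 * c * k(x)) * psi
                    - (1.0 + c * k(x - dx)) * psi_prev) / (1.0 + c * k(x + dx))
        psi_prev, psi = psi, psi_next
        x += dx
    return psi

def shoot(V, e_lo, e_hi, x_f=5.0, dx=0.01, tol=1e-10):
    """Bisection for an energy with psi_E(x_f) = 0; needs a sign change in [e_lo, e_hi]."""
    f_lo = psi_at_xf(e_lo, V, x_f, dx)
    while e_hi - e_lo > tol:
        e_mid = 0.5 * (e_lo + e_hi)
        f_mid = psi_at_xf(e_mid, V, x_f, dx)
        if f_lo * f_mid > 0:
            e_lo, f_lo = e_mid, f_mid
        else:
            e_hi = e_mid
    return 0.5 * (e_lo + e_hi)
```

Bracketing the ground state between 0.3 and 0.7 reproduces the exact value $E_0 = 1/2$ to within the discretization error.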
A better solution is to integrate the Schrödinger equation from both sides towards
the center:
• We search for a point $b$ with $V(b) = E$.
• Starting from $x = 0$ we integrate the left-hand solution $\psi_L(x)$ to the chosen point
  $b$ and obtain $\psi_L(b)$ and a numerical estimate for
  $\psi_L'(b) = (\psi_L(b) - \psi_L(b - \Delta x))/\Delta x$.
• Starting from $x = a$ we integrate the right-hand solution $\psi_R(x)$ down to the same
  point $b$ and obtain $\psi_R(b)$ and a numerical estimate for
  $\psi_R'(b) = (\psi_R(b + \Delta x) - \psi_R(b))/\Delta x$.
• At the point $b$ the wave functions and their first two derivatives have to match,
  since solutions to the Schrödinger equation have to be twice continuously
  differentiable. Keeping in mind that we can multiply the wave functions by an arbitrary
  factor $\alpha$ we obtain the conditions
  $$\psi_L(b) = \alpha\psi_R(b) \tag{8.14}$$
  $$\psi_L'(b) = \alpha\psi_R'(b) \tag{8.15}$$
  $$\psi_L''(b) = \alpha\psi_R''(b) \tag{8.16}$$
  The last condition is automatically fulfilled since by the choice $V(b) = E$ the
  Schrödinger equation at $b$ reduces to $\psi''(b) = 0$. The first two conditions can be
  combined into the condition that the logarithmic derivatives match:
  $$\frac{d\log\psi_L}{dx}\Big|_{x=b} = \frac{\psi_L'(b)}{\psi_L(b)} = \frac{\psi_R'(b)}{\psi_R(b)} = \frac{d\log\psi_R}{dx}\Big|_{x=b} \tag{8.17}$$
• This last equation has to be solved for in a shooting method, e.g. using a bisection
  algorithm.
8.2 The time-independent Schrödinger equation in higher dimensions
The time-independent Schrödinger equation in more than one dimension is a partial
differential equation and cannot, in general, be solved by a simple ODE solver such as
the Numerov algorithm. Before employing a PDE solver we should thus always first try
to reduce the problem to a one-dimensional problem. This can be done if the problem
factorizes.
A first example is a three-dimensional Schrödinger equation in a cubic box with a
potential that is a sum of one-dimensional terms, $V(\vec r) = V_x(x) + V_y(y) + V_z(z)$, with
$\vec r = (x, y, z)$. Using the product ansatz
$$\psi(\vec r) = \psi_x(x)\psi_y(y)\psi_z(z) \tag{8.18}$$
the PDE factorizes into three ODEs which can be solved as above.
Another famous trick is possible for spherically symmetric potentials with
$V(\vec r) = V(|\vec r|)$, where an ansatz using spherical harmonics
$$\psi_{l,m}(\vec r) = \psi_{l,m}(r,\theta,\phi) = \frac{u(r)}{r}Y_{lm}(\theta,\phi) \tag{8.19}$$
can be used to reduce the three-dimensional Schrödinger equation to a one-dimensional
one for the radial wave function $u(r)$:
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dr^2} + \frac{\hbar^2 l(l+1)}{2mr^2} + V(r)\right]u(r) = Eu(r) \tag{8.20}$$
in the interval $[0, \infty[$. Given the singular character of the potential for $r \to 0$, a
numerical integration should start at large distances $r$ and integrate towards $r = 0$, so
that the largest errors are accumulated only at the last steps of the integration.
8.2.1 Variational solutions using a finite basis set
In the case of general potentials, or for more than two particles, it will not be possible to
reduce the Schrödinger equation to a one-dimensional problem and we need to employ
a PDE solver. One approach will again be to discretize the Schrödinger equation on a
discrete mesh using a finite difference approximation. A better solution is to expand
the wave functions in terms of a finite set of basis functions
$$|\phi\rangle = \sum_{i=1}^{N}a_i|u_i\rangle, \tag{8.21}$$
as we did in the finite element method in section 3.6.
To estimate the ground state energy we want to minimize the energy of the
variational wave function
$$E^* = \frac{\langle\phi|H|\phi\rangle}{\langle\phi|\phi\rangle}. \tag{8.22}$$
Keep in mind that, since we only chose a finite basis set $\{|u_i\rangle\}$, the variational estimate
$E^*$ is an upper bound for the true ground state energy $E_0$. We denote by
$$H_{ij} = \langle u_i|H|u_j\rangle = \int d\vec r\, u_i(\vec r)^*\left(-\frac{\hbar^2}{2m}\nabla^2 + V\right)u_j(\vec r) \tag{8.23}$$
the matrix elements of the Hamilton operator $H$ and by
$$S_{ij} = \langle u_i|u_j\rangle = \int d\vec r\, u_i(\vec r)^* u_j(\vec r) \tag{8.24}$$
the overlap matrix. Note that for an orthonormal basis set, $S_{ij}$ is the identity matrix
$\delta_{ij}$.
Minimizing equation (8.22) we obtain a generalized eigenvalue problem
$$\sum_j H_{ij}a_j = E\sum_k S_{ik}a_k, \tag{8.25}$$
or in a compact notation with $\vec a = (a_1,\ldots,a_N)$
$$H\vec a = ES\vec a. \tag{8.26}$$
If the basis set is orthonormal this reduces to an ordinary eigenvalue problem and we can
use the Lanczos algorithm.
In the general case we have to find a matrix $U$ (in general not an orthogonal one) such
that $U^TSU$ is the identity matrix. Introducing a new vector $\vec b = U^{-1}\vec a$ we can then
rearrange the problem into
$$H\vec a = ES\vec a \;\Rightarrow\; HU\vec b = ESU\vec b \;\Rightarrow\; U^THU\vec b = EU^TSU\vec b = E\vec b \tag{8.27}$$
and we end up with a standard eigenvalue problem for $U^THU$. Mathematica and
LAPACK both contain eigensolvers for such generalized eigenvalue problems.
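The transformation (8.27) can be sketched with NumPy for real symmetric $H$ and a positive definite overlap matrix $S$. Obtaining $U$ from a Cholesky factorization $S = LL^T$, with $U = (L^{-1})^T$, is one possible choice and an assumption of this illustration; the function name is hypothetical.

```python
import numpy as np

def solve_generalized(H, S):
    """Solve H a = E S a for symmetric H and positive definite S."""
    L = np.linalg.cholesky(S)              # S = L L^T
    U = np.linalg.inv(L).T                 # then U^T S U is the identity
    E, b = np.linalg.eigh(U.T @ H @ U)     # standard problem for U^T H U
    return E, U @ b                        # back-transform: a = U b
```

The columns of the returned matrix are the coefficient vectors $\vec a$; they satisfy $H\vec a = ES\vec a$ and are orthonormal with respect to $S$.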
The final issue is the choice of basis functions. While a general finite element basis
can be used it is advantageous to make use of known solutions to a similar problem, as
we will illustrate in the case of an anharmonic oscillator with Hamilton operator
$$H = H_0 + \lambda q^4, \qquad H_0 = \frac{1}{2}(p^2 + q^2), \tag{8.28}$$
where the momentum operator is $p = -i\hbar\,\partial/\partial q$. The eigenstates $|n\rangle$ and eigenvalues
$\epsilon_n = (n + 1/2)\omega_0$ of $H_0$ are known from the basic quantum mechanics lectures. In real
space the eigenstates are given by
$$\phi_n(q) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}}}\exp\left(-\frac{1}{2}q^2\right)H_n(q), \tag{8.29}$$
where the $H_n$ are the Hermite polynomials. Using these eigenstates as a basis set, the
operator $H_0$ becomes a diagonal matrix. The position operator becomes
$$q = \frac{1}{\sqrt{2}}(a^\dagger + a), \tag{8.30}$$
where the raising and lowering operators $a^\dagger$ and $a$ only have the following nonzero
matrix elements:
$$\langle n+1|a^\dagger|n\rangle = \langle n|a|n+1\rangle = \sqrt{n+1}. \tag{8.31}$$
The matrix representation of the anharmonic term $\lambda q^4$ is a banded matrix. After
truncation of the basis set to a finite number of states $N$, a sparse eigensolver such as
the Lanczos algorithm can again be used to calculate the spectrum. Note that since we
use the orthonormal eigenstates of $H_0$ as basis elements, the overlap matrix $S$ here is
the identity matrix and we have to deal only with a standard eigenvalue problem. A
solution to this problem is provided in a Mathematica notebook on the web page.
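As an alternative illustration to the Mathematica notebook, the truncated problem can be set up in a few lines of NumPy. The sketch assumes units with $\hbar = \omega_0 = 1$, uses a dense eigensolver instead of Lanczos because the matrices are small, and forms $q^4$ as the fourth power of the truncated matrix $q$, which is only accurate for states well below the cutoff. The function name is hypothetical.

```python
import numpy as np

def anharmonic_spectrum(lam, n_basis=60):
    """Eigenvalues of H = H0 + lam * q^4 in the lowest n_basis oscillator states."""
    n = np.arange(n_basis)
    a = np.diag(np.sqrt(n[1:]), 1)         # lowering operator, <n|a|n+1> = sqrt(n+1)
    q = (a + a.T) / np.sqrt(2.0)           # position operator, Eq. (8.30)
    H = np.diag(n + 0.5) + lam * np.linalg.matrix_power(q, 4)
    return np.linalg.eigvalsh(H)
```

For $\lambda = 0$ the spectrum is $n + 1/2$; for small $\lambda$ the ground state energy follows first-order perturbation theory, $E_0 \approx 1/2 + 3\lambda/4$, since $\langle 0|q^4|0\rangle = 3/4$.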
8.3 The time-dependent Schrödinger equation
Finally we will reintroduce the time dependence to study dynamics in nonstationary
quantum systems.
8.3.1 Spectral methods
By introducing a basis and solving for the complete spectrum of energy eigenstates we
can directly solve the time-dependent problem in the case of a stationary Hamiltonian.
This is a consequence of the linearity of the Schrödinger equation.
To calculate the time evolution of a state $|\psi(t_0)\rangle$ from time $t_0$ to $t$ we first solve
the stationary eigenvalue problem $H|\phi\rangle = E|\phi\rangle$ and calculate the eigenvectors
$|\phi_n\rangle$ and eigenvalues $\epsilon_n$. Next we represent the initial wave function $|\psi\rangle$ by a
spectral decomposition
$$|\psi(t_0)\rangle = \sum_n c_n|\phi_n\rangle. \tag{8.32}$$
Since each of the $|\phi_n\rangle$ is an eigenvector of $H$, the time evolution $e^{-iH(t-t_0)/\hbar}$ is trivial
and we obtain at time $t$:
$$|\psi(t)\rangle = \sum_n c_n e^{-i\epsilon_n(t-t_0)/\hbar}|\phi_n\rangle. \tag{8.33}$$
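For a Hamiltonian given as a Hermitian matrix, Eqs. (8.32) and (8.33) amount to one diagonalization and two matrix-vector products. A minimal NumPy sketch, with a hypothetical function name and $t$ denoting the elapsed time $t - t_0$:

```python
import numpy as np

def spectral_evolve(H, psi0, t, hbar=1.0):
    """Evolve psi0 over a time t using the spectral decomposition of H."""
    eps, phi = np.linalg.eigh(H)            # columns of phi are the eigenvectors
    c = phi.conj().T @ psi0                 # c_n = <phi_n|psi(t0)>, Eq. (8.32)
    return phi @ (np.exp(-1j * eps * t / hbar) * c)   # Eq. (8.33)
```

As a check, for $H = \sigma_x$ and the initial state $(1, 0)$ the state after $t = \pi/2$ is $(0, -i)$, because $e^{-i\sigma_x t} = \cos t - i\sigma_x\sin t$.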
8.3.2 Direct numerical integration
If the number of basis states is too large to perform a complete diagonalization of
the Hamiltonian, or if the Hamiltonian changes over time, we need to perform a direct
integration of the Schrödinger equation.
The main novelty, compared to the integration of classical wave equations, is that
the exact quantum mechanical time evolution conserves the normalization
$$\int|\psi(x,t)|^2dx = 1 \tag{8.34}$$
of the wave function and the numerical algorithm should also have this property: the
approximate time evolution needs to be unitary.
We first approximate the time evolution
$$\psi(x,t+\Delta t) = e^{-iH\Delta t/\hbar}\psi(x,t) \tag{8.35}$$
by a forward Euler approximation
$$e^{-iH\Delta t/\hbar} \approx 1 - \frac{i}{\hbar}H\Delta t + O(\Delta t^2). \tag{8.36}$$
This is neither unitary nor stable and we need a better integrator. The simplest stable
and unitary integrator can be obtained by the following reformulation
$$e^{-iH\Delta t/\hbar} = \left(e^{iH\Delta t/2\hbar}\right)^{-1}e^{-iH\Delta t/2\hbar} \approx \left(1 + \frac{i}{\hbar}H\frac{\Delta t}{2}\right)^{-1}\left(1 - \frac{i}{\hbar}H\frac{\Delta t}{2}\right) + O(\Delta t^3). \tag{8.37}$$
This gives an algorithm
$$\psi(x,t+\Delta t) = \left(1 + \frac{i}{\hbar}H\frac{\Delta t}{2}\right)^{-1}\left(1 - \frac{i}{\hbar}H\frac{\Delta t}{2}\right)\psi(x,t) \tag{8.38}$$
or equivalently
$$\left(1 + \frac{i}{\hbar}H\frac{\Delta t}{2}\right)\psi(x,t+\Delta t) = \left(1 - \frac{i}{\hbar}H\frac{\Delta t}{2}\right)\psi(x,t). \tag{8.39}$$
After introducing a basis as above, we realize that we need to solve a linear system
of equations. For onedimensional problems the matrix representation of H is often
tridiagonal and a tridiagonal solver can be used.
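One step of (8.39) can be sketched as follows for a single particle on a uniform one-dimensional grid. This is only an illustration under stated assumptions: units with $\hbar = m = 1$, the three-point finite-difference Laplacian, a wave function that vanishes outside the grid, and the hypothetical names `crank_nicolson_step` and `grid_norm2`. The linear system is solved with the standard Thomas algorithm for tridiagonal matrices.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// One step of (8.39) for H = -(1/2) d^2/dx^2 + V(x), hbar = m = 1 (assumed units),
// with psi = 0 outside the grid (assumed boundary condition).
std::vector<cplx> crank_nicolson_step(const std::vector<cplx>& psi,
                                      const std::vector<double>& V,
                                      double dx, double dt)
{
  const std::size_t n = psi.size();
  assert(n >= 2 && V.size() == n);
  const cplx I(0.0, 1.0);
  const double hoff = -0.5 / (dx * dx);     // off-diagonal element of H
  const cplx off = 0.5 * dt * I * hoff;     // off-diagonal of 1 + i H dt/2

  std::vector<cplx> diag(n), rhs(n);
  for (std::size_t j = 0; j < n; ++j) {
    const double hdiag = 1.0 / (dx * dx) + V[j];   // diagonal element of H
    cplx Hpsi = hdiag * psi[j];
    if (j > 0)     Hpsi += hoff * psi[j - 1];
    if (j + 1 < n) Hpsi += hoff * psi[j + 1];
    diag[j] = 1.0 + 0.5 * dt * I * hdiag;          // diagonal of 1 + i H dt/2
    rhs[j]  = psi[j] - 0.5 * dt * I * Hpsi;        // (1 - i H dt/2) psi
  }

  // Thomas algorithm: forward elimination ...
  std::vector<cplx> cp(n), dp(n), out(n);
  cp[0] = off / diag[0];
  dp[0] = rhs[0] / diag[0];
  for (std::size_t j = 1; j < n; ++j) {
    const cplx denom = diag[j] - off * cp[j - 1];
    cp[j] = off / denom;
    dp[j] = (rhs[j] - off * dp[j - 1]) / denom;
  }
  // ... followed by back substitution
  out[n - 1] = dp[n - 1];
  for (std::size_t j = n - 1; j-- > 0; )
    out[j] = dp[j] - cp[j] * out[j + 1];
  return out;
}

// Discretized version of the normalization integral (8.34).
double grid_norm2(const std::vector<cplx>& psi, double dx)
{
  double s = 0.0;
  for (std::size_t j = 0; j < psi.size(); ++j) s += std::norm(psi[j]);
  return s * dx;
}
```

Since the discretized $H$ is a real symmetric matrix, the update is unitary up to rounding errors, so `grid_norm2` should stay constant from step to step.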
In higher dimensions the matrix $H$ will no longer be simply tridiagonal but still very
sparse and we can use iterative algorithms, similar to the Lanczos algorithm for the
eigenvalue problem. For details about these algorithms we refer to the nice summary at
http://mathworld.wolfram.com/topics/Templates.html and especially the biconjugate
gradient (BiCG) algorithm. Implementations of this algorithm are available, e.g.
in the Iterative Template Library (ITL).
Chapter 9
The quantum N body problem:
quantum chemistry methods
After treating classical problems in the first chapters, we will turn to quantum problems
in the last two chapters. In the winter semester we saw that the solution of the classical
one-body problem reduced to an ordinary differential equation, while that of the quantum
one-body problem was a partial differential equation.
Many-body problems are intrinsically harder to simulate. In the last chapter we
saw how the computational complexity of the classical N-body problem can be reduced
from $O(N^2)$ to $O(N \ln N)$, thus making even large systems with $10^8$ particles accessible
to simulations.
The quantum many-body problem is much harder than the classical one. While the
dimension of the classical phase space grew linearly with the number of particles, that of
the quantum problem grows exponentially. Thus the complexity is usually $O(\exp(N))$.
Exponential algorithms are the nightmare of any computational scientist, but they
naturally appear in quantum many-body problems. In this chapter we will discuss
approximations used in quantum chemistry that reduce the problem to a polynomial
one, typically scaling like $O(N^4)$. These methods map the problem to a single-particle
problem and work only as long as correlations between electrons are weak. In the
next chapter we will thus discuss exact methods, without any approximation, which
are needed for the simulation of strongly correlated systems, such as high temperature
superconductors, heavy electron materials and quantum magnets.
9.1 Basis functions
All approaches to solutions of the quantum mechanical N-body problem start by choosing
a suitable basis set for the wave functions. The Schrödinger equation then maps to
an eigenvalue equation, which is solved using standard linear algebra methods. Before
attempting to solve the many-body problem we will discuss basis sets for single-particle
wave functions.
9.1.1 The electron gas
For the free electron gas with Hamilton operator
$$ H = -\sum_{i=1}^{N}\frac{\hbar^2}{2m}\nabla^2 + e^2\sum_{i<j} v_{ee}(\vec r_i,\vec r_j) \tag{9.1} $$
$$ v_{ee}(\vec r,\vec r') = \frac{1}{|\vec r - \vec r'|} \tag{9.2} $$
the ideal choice for basis functions are plane waves
$$ \psi_{\vec k}(\vec r) = \exp(-i\vec k\vec r). \tag{9.3} $$
At low temperatures the electron gas forms a Wigner crystal. Then a better choice of
basis functions are eigenfunctions of harmonic oscillators centered around the classical
equilibrium positions.
9.1.2 Electronic structure of molecules and atoms
The Hamilton operator for molecules and atoms contains extra terms due to the atomic
nuclei:
$$ H = \sum_{i=1}^{N}\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\vec r_i)\right) + e^2\sum_{i<j} v_{ee}(\vec r_i,\vec r_j) \tag{9.4} $$
where the potential of the $M$ atomic nuclei with charges $Z_i$ at the locations $\vec R_i$ is given
by
$$ V(\vec r) = -e^2\sum_{i=1}^{M}\frac{Z_i}{|\vec R_i - \vec r|}. \tag{9.5} $$
Here we use the Born-Oppenheimer approximation and consider the nuclei as stationary
classical particles. This approximation is valid since the nuclei are three to five orders of
magnitude heavier than the electrons. The Car-Parrinello method for molecular dynamics,
which we will discuss later, moves the nuclei classically according to electronic forces
that are calculated quantum mechanically.
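As single particle basis functions we choose $L$ atomic basis functions $f_i$, centered on
the nuclei. In general these functions are not orthogonal. Their scalar products define
a matrix
$$ S_{ij} = \int d^3\vec r\, f_i^*(\vec r) f_j(\vec r), \tag{9.6} $$
which is in general not the identity matrix. The associated annihilation operators $a_{i\sigma}$
are defined formally as scalar products
$$ a_{i\sigma} = \sum_j (S^{-1})_{ij}\int d^3\vec r\, f_j^*(\vec r)\,\psi_\sigma(\vec r), \tag{9.7} $$
where $\sigma = \uparrow,\downarrow$ is the spin index and $\psi_\sigma(\vec r)$ is the fermion field operator. They obey
the anticommutation relations
$$ \{a^\dagger_{i\sigma}, a_{j\sigma'}\} = \delta_{\sigma\sigma'}(S^{-1})_{ij}, \qquad \{a^\dagger_{i\sigma}, a^\dagger_{j\sigma'}\} = \{a_{i\sigma}, a_{j\sigma'}\} = 0. \tag{9.8} $$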
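Due to the non-orthogonality the adjoint $a^\dagger_{i\sigma}$ does not create a state with wave function
$f_i$. This is done by the operator $\hat a^\dagger_{i\sigma}$, defined through
$$ \hat a^\dagger_{i\sigma} = \sum_j S_{ji}\, a^\dagger_{j\sigma}, \tag{9.9} $$
which has the following simple anticommutation relation with $a_{j\sigma}$:
$$ \{\hat a^\dagger_{i\sigma}, a_{j\sigma}\} = \delta_{ij}. \tag{9.10} $$
The anticommutation relations of the $\hat a^\dagger_{i\sigma}$ and the $\hat a_{j\sigma'}$ are:
$$ \{\hat a^\dagger_{i\sigma}, \hat a_{j\sigma'}\} = \delta_{\sigma\sigma'} S_{ij}, \qquad \{\hat a^\dagger_{i\sigma}, \hat a^\dagger_{j\sigma'}\} = \{\hat a_{i\sigma}, \hat a_{j\sigma'}\} = 0. \tag{9.11} $$
When performing calculations with these basis functions extra care must be taken
to account for this non-orthogonality of the basis.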
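In this basis the Hamilton operator (9.4) is written as
$$ H = \sum_{ij\sigma} t_{ij}\, a^\dagger_{i\sigma}a_{j\sigma} + \frac{1}{2}\sum_{ijkl\sigma\sigma'} V_{ijkl}\, a^\dagger_{i\sigma}a^\dagger_{k\sigma'}a_{l\sigma'}a_{j\sigma}. \tag{9.12} $$
The matrix elements are
$$ t_{ij} = \int d^3\vec r\, f_i^*(\vec r)\left(-\frac{\hbar^2}{2m}\nabla^2 + V(\vec r)\right) f_j(\vec r) \tag{9.13} $$
$$ V_{ijkl} = e^2\int d^3\vec r\int d^3\vec r'\, f_i^*(\vec r) f_j(\vec r)\,\frac{1}{|\vec r - \vec r'|}\, f_k^*(\vec r') f_l(\vec r') \tag{9.14} $$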
Which functions should be used as basis functions? Slater proposed the Slater-Type-Orbitals (STO):
$$ f_i(r,\theta,\phi) \propto r^{n-1} e^{-\zeta_i r}\, Y_{lm}(\theta,\phi). \tag{9.15} $$
The values $\zeta_i$ are optimized so that the eigenstates of isolated atoms are reproduced
as well as possible. The main advantage of STOs is that they exhibit the correct
asymptotic behavior for small distances $r$. Their main disadvantage is in the evaluation
of the matrix elements in equation (9.14).
The Gaussian-Type-Orbitals (GTO)
$$ f_i(\vec r) \propto x^l y^m z^n e^{-\zeta_i r^2} \tag{9.16} $$
simplify the evaluation of matrix elements, as products of Gaussian functions centered
at two different nuclei are again Gaussian functions and can be integrated easily. In
addition, the term $1/|\vec r - \vec r'|$ can also be rewritten as an integral over a Gaussian function
$$ \frac{1}{|\vec r - \vec r'|} = \frac{2}{\sqrt{\pi}}\int_0^\infty dt\, e^{-t^2(\vec r - \vec r')^2}. \tag{9.17} $$
Then the six-dimensional integral (9.14) is changed to a seven-dimensional one, albeit
with purely Gaussian terms, and this can be carried out analytically as we have seen in
the exercises.
As there are $O(L^4)$ integrals of the type (9.14), quantum chemistry calculations
typically scale as $O(N^4)$. Modern methods
9.2 Pseudopotentials
The electrons in inner, fully occupied shells do not contribute to chemical bonding.
To simplify the calculations they can be replaced by pseudopotentials, modeling the
inner shells. Only the outer shells (including the valence shells) are then modeled using
basis functions. The pseudopotentials are chosen such that calculations for isolated
atoms are as accurate as possible.
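9.3 Hartree-Fock
The Hartree-Fock approximation is based on the assumption of independent electrons.
It starts from an ansatz for the N-particle wave function as a Slater determinant of $N$
single-particle wave functions:
$$ \Phi(\vec r_1,\sigma_1;\ldots;\vec r_N,\sigma_N) = \frac{1}{\sqrt{N!}}\begin{vmatrix} \phi_1(\vec r_1,\sigma_1) & \cdots & \phi_N(\vec r_1,\sigma_1) \\ \vdots & & \vdots \\ \phi_1(\vec r_N,\sigma_N) & \cdots & \phi_N(\vec r_N,\sigma_N) \end{vmatrix}. \tag{9.18} $$
The single particle wave functions $\phi_\mu$ are chosen so that the energy
$$ \langle\Phi|H|\Phi\rangle = \sum_{\mu=1}^{N}\langle\phi_\mu|{-\frac{\hbar^2}{2m}\nabla^2} + V|\phi_\mu\rangle + \frac{1}{2}e^2\sum_{\nu,\mu=1}^{N}\left(\langle\phi_\mu\phi_\nu|v_{ee}|\phi_\mu\phi_\nu\rangle - \langle\phi_\mu\phi_\nu|v_{ee}|\phi_\nu\phi_\mu\rangle\right) \tag{9.19} $$
is minimized, constrained by the normalization of the wave functions. Using Lagrange
multipliers $\epsilon_\mu$ we obtain the variational condition
$$ \delta\left(\langle\Phi|H|\Phi\rangle - \sum_\mu \epsilon_\mu\langle\phi_\mu|\phi_\mu\rangle\right) = 0. \tag{9.20} $$
Performing the variation we end up with the Hartree-Fock equation
$$ F|\phi_\mu\rangle = \epsilon_\mu|\phi_\mu\rangle, \tag{9.21} $$
where the matrix elements $f_{\nu\mu}$ of the Fock matrix $F$ are given by
$$ f_{\nu\mu} = \langle\phi_\nu|{-\frac{\hbar^2}{2m}\nabla^2} + V|\phi_\mu\rangle + e^2\sum_{\tau=1}^{N}\left(\langle\phi_\nu\phi_\tau|v_{ee}|\phi_\mu\phi_\tau\rangle - \langle\phi_\nu\phi_\tau|v_{ee}|\phi_\tau\phi_\mu\rangle\right). \tag{9.22} $$
The Hartree-Fock equation looks like a one-particle Schrödinger equation. However,
the potential depends on the solution. The equation is used iteratively, always using
the new solution for the potential, until convergence to a fixed point is achieved.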
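The eigenvalues $\epsilon_\mu$ of $F$ count the electron-electron interaction twice when they are
summed. The ground state energy is therefore obtained by subtracting the double-counted part:
$$ E_0 = \sum_{\mu=1}^{N}\epsilon_\mu - \frac{1}{2}e^2\sum_{\nu,\mu=1}^{N}\left(\langle\phi_\mu\phi_\nu|v_{ee}|\phi_\mu\phi_\nu\rangle - \langle\phi_\mu\phi_\nu|v_{ee}|\phi_\nu\phi_\mu\rangle\right). \tag{9.23} $$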
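How do we calculate the functions $\phi_\mu$ in our finite basis sets, introduced in the
previous section? To simplify the discussion we assume closed-shell conditions, where
each orbital is occupied by both an electron with spin $\uparrow$ and one with spin $\downarrow$.
We start by writing the Hartree-Fock wave function (9.18) in second quantized form:
$$ |\Phi\rangle = \prod_{\mu,\sigma} c^\dagger_{\mu\sigma}|0\rangle, \tag{9.24} $$
where the creation operators $c^\dagger_{\mu\sigma}$ of the Hartree-Fock orbitals are expanded in the finite basis set,
$$ c^\dagger_{\mu\sigma} = \sum_{n=1}^{L} d_{\mu n}\,\hat a^\dagger_{n\sigma} \tag{9.25} $$
and find that
$$ a_{j\sigma}|\Phi\rangle = a_{j\sigma}\prod_{\mu,\sigma'} c^\dagger_{\mu\sigma'}|0\rangle = \sum_\nu d_{\nu j}\prod_{\mu\sigma'\neq\nu\sigma} c^\dagger_{\mu\sigma'}|0\rangle. \tag{9.26} $$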
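In order to evaluate the matrix elements $\langle\Phi|H|\Phi\rangle$ of the Hamiltonian (9.12) we introduce
the bond-order matrix
$$ P_{ij} = \sum_\sigma\langle\Phi|a^\dagger_{i\sigma}a_{j\sigma}|\Phi\rangle = 2\sum_\nu d^*_{\nu i}\, d_{\nu j}, \tag{9.27} $$
where we have made use of the closed-shell conditions to sum over the spin degrees of
freedom. The kinetic term of $H$ is now simply $\sum_{ij} P_{ij}t_{ij}$. Next we rewrite the interaction
part $\langle\Phi|a^\dagger_{i\sigma}a^\dagger_{k\sigma'}a_{l\sigma'}a_{j\sigma}|\Phi\rangle$ in terms of the $P_{ij}$. We find that if $\sigma = \sigma'$
$$ \langle\Phi|a^\dagger_{i\sigma}a^\dagger_{k\sigma}a_{l\sigma}a_{j\sigma}|\Phi\rangle = \langle\Phi|a^\dagger_{i\sigma}a_{j\sigma}|\Phi\rangle\langle\Phi|a^\dagger_{k\sigma}a_{l\sigma}|\Phi\rangle - \langle\Phi|a^\dagger_{i\sigma}a_{l\sigma}|\Phi\rangle\langle\Phi|a^\dagger_{k\sigma}a_{j\sigma}|\Phi\rangle \tag{9.28} $$
and if $\sigma \neq \sigma'$:
$$ \langle\Phi|a^\dagger_{i\sigma}a^\dagger_{k\sigma'}a_{l\sigma'}a_{j\sigma}|\Phi\rangle = \langle\Phi|a^\dagger_{i\sigma}a_{j\sigma}|\Phi\rangle\langle\Phi|a^\dagger_{k\sigma'}a_{l\sigma'}|\Phi\rangle \tag{9.29} $$
Then the energy is (again summing over the spin degrees of freedom):
$$ E_0 = \sum_{ij} t_{ij}P_{ij} + \frac{1}{2}\sum_{ijkl}\left(V_{ijkl} - \frac{1}{2}V_{ilkj}\right)P_{ij}P_{kl}. \tag{9.30} $$
We can now repeat the variational arguments, minimizing $E_0$ under the condition that
the $|\phi_\mu\rangle$ are normalized:
$$ 1 = \langle\phi_\mu|\phi_\mu\rangle = \sum_{i,j} d^*_{\mu i}\, d_{\mu j}\, S_{ij}. \tag{9.31} $$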
Using Lagrange multipliers to enforce this constraint we finally end up with the
Hartree-Fock equations for a finite basis set:
$$ \sum_{j=1}^{L}\left(f_{ij} - \epsilon_\mu S_{ij}\right) d_{\mu j} = 0, \tag{9.32} $$
where
$$ f_{ij} = t_{ij} + \sum_{kl}\left(V_{ijkl} - \frac{1}{2}V_{ilkj}\right)P_{kl}. \tag{9.33} $$
This is a generalized eigenvalue problem of the form $Ax = \lambda Bx$ for which library
routines exist. They first diagonalize $B$ and then solve a standard eigenvalue problem
in the basis which diagonalizes $B$.
9.4 Density functional theory
Another commonly used method, for which the Nobel prize in chemistry was awarded to
Walter Kohn in 1998, is density functional theory. It is based on two fundamental theorems
by Hohenberg and Kohn. The first theorem states that the ground state energy $E_0$ of
an electronic system in an external potential $V$ is a functional of the electron density
$\rho(\vec r)$:
$$ E_0 = E[\rho] = \int d^3\vec r\, V(\vec r)\rho(\vec r) + F[\rho], \tag{9.34} $$
with a universal functional $F$. The second theorem states that the density of the ground
state wave function minimizes this functional. I will prove both theorems in the lecture.
Until now everything is exact. The problem is that, while the functional $F$ is universal,
it is also unknown! Thus we need to find good approximations for the functional.
One usually starts from the ansatz:
$$ F[\rho] = E_h[\rho] + E_k[\rho] + E_{xc}[\rho]. \tag{9.35} $$
The Hartree term $E_h$ is given by the Coulomb repulsion between two electrons:
$$ E_h[\rho] = \frac{e^2}{2}\int d^3\vec r\, d^3\vec r'\,\frac{\rho(\vec r)\rho(\vec r')}{|\vec r - \vec r'|}. \tag{9.36} $$
The kinetic energy $E_k[\rho]$ is that of a non-interacting electron gas with the same density.
The exchange and correlation term $E_{xc}[\rho]$ contains the remaining unknown contribution.
To determine the ground state wave function we again make an ansatz for the wave
function, using $N/2$ single-electron wave functions $\chi_\mu$, which we occupy with spin $\uparrow$ and
spin $\downarrow$ electrons:
$$ \rho(\vec r) = 2\sum_{\mu=1}^{N/2}|\chi_\mu(\vec r)|^2. \tag{9.37} $$
The single-electron wave functions are again normalized. Variation of the functional,
taking into account this normalization, gives an equation which looks like a
single-electron Schrödinger equation
$$ \left(-\frac{\hbar^2}{2m}\nabla^2 + V_{\mathrm{eff}}(\vec r)\right)\chi_\mu(\vec r) = \epsilon_\mu\chi_\mu(\vec r), \tag{9.38} $$
with an effective potential
$$ V_{\mathrm{eff}}(\vec r) = V(\vec r) + e^2\int d^3\vec r'\,\frac{\rho(\vec r')}{|\vec r - \vec r'|} + v_{xc}(\vec r), \tag{9.39} $$
and an exchange-correlation potential defined by
$$ v_{xc}(\vec r) = \frac{\delta E_{xc}[\rho]}{\delta\rho(\vec r)}. \tag{9.40} $$
The form (9.38) arises because we have separated the kinetic energy of the non-interacting
electron system from the functional. The variation of this kinetic energy just gives the
kinetic term of this Schrödinger-like equation.
This nonlinear equation is again solved iteratively, where in the ansatz (9.37) the
$\chi_\mu$ with the lowest eigenvalues $\epsilon_\mu$ are chosen.
In finite (and non-orthogonal) basis sets the same techniques as for the Hartree-Fock
method are used to derive a finite eigenvalue problem.
9.4.1 Local Density Approximation
Apart from the restricted basis set everything was exact up to this point. As the
functional $E_{xc}[\rho]$ and thus the potential $v_{xc}(\vec r)$ are not known, we need to introduce
approximations.
The simplest approximation is the local density approximation (LDA), which
replaces $v_{xc}$ by that of a uniform electron gas with the same density:
$$ v_{xc} = -\frac{0.611}{r_s}\,\beta(r_s) \quad [\mathrm{a.u.}] $$
$$ \beta(r_s) = 1 + 0.0545\, r_s \ln(1 + 11.4/r_s) $$
$$ r_s^{-1} = a_B\left(\frac{4\pi}{3}\rho\right)^{1/3} \tag{9.41} $$
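The parametrization (9.41) translates directly into a few lines of code. The sketch below assumes atomic units, so that $a_B = 1$ and the density is given in units of $a_B^{-3}$; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Wigner-Seitz radius r_s from (9.41) in atomic units (a_B = 1, an assumption):
// 1/r_s = (4 pi rho / 3)^(1/3).
double wigner_seitz_radius(double rho)
{
  assert(rho > 0.0);
  const double pi = 3.14159265358979323846;
  return std::pow(3.0 / (4.0 * pi * rho), 1.0 / 3.0);
}

// LDA exchange-correlation potential (9.41) in atomic units.
double lda_vxc(double rho)
{
  const double rs = wigner_seitz_radius(rho);
  const double beta = 1.0 + 0.0545 * rs * std::log(1.0 + 11.4 / rs);
  return -0.611 / rs * beta;
}
```

In a self-consistency loop such a function would be evaluated at every grid point (or folded into the matrix elements) with the current density $\rho(\vec r)$.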
9.4.2 Improved approximations
Improvements over the LDA have been an intense field of research in quantum chemistry.
I will just mention two improvements. The local spin density approximation (LSDA)
uses separate densities for electrons with spin $\uparrow$ and $\downarrow$. The generalized gradient
approximation (GGA) and its variants use functionals depending not only on the
density, but also on its derivatives.
9.5 Car-Parrinello method
Roberto Car and Michele Parrinello have combined density functional theory with
molecular dynamics to improve the approximations involved in using only Lennard-Jones
potentials. This method allows much better simulations of molecular vibration spectra
and of chemical reactions.
The atomic nuclei are propagated using classical molecular dynamics, but the
electronic forces which move them are estimated using density functional theory:
$$ M_n\frac{d^2\vec R_n}{dt^2} = -\frac{\partial E[\rho(\vec r,t),\vec R_n]}{\partial\vec R_n}. \tag{9.42} $$
Here $M_n$ and $\vec R_n$ are the masses and locations of the atomic nuclei.
As the solution of the full electronic problem at every time step is a very time
consuming task it is performed only once at the start. The electronic degrees of freedom
are then updated using an artificial dynamics:
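$$ m\frac{d^2\chi_\mu(\vec r,t)}{dt^2} = -\frac{1}{2}\frac{\delta E[\rho(\vec r,t),\vec R_n]}{\delta\chi_\mu^*(\vec r,t)} + \ldots $$
$$ \ldots \prod_{\mu=1} c^\dagger_\mu|0\rangle \tag{9.44} $$
one or two of the $c^\dagger_i$:
$$ |\psi_0\rangle = \left(1 + \sum_{i,\mu}\alpha^i_\mu\, c^\dagger_i c_\mu + \sum_{i<j,\,\mu<\nu}\alpha^{ij}_{\mu\nu}\, c^\dagger_i c^\dagger_j c_\mu c_\nu\right)|\psi_{HF}\rangle. \tag{9.45} $$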
The energies are then minimized using this variational ansatz. In a problem with
$N$ occupied and $M$ empty orbitals this leads to a matrix eigenvalue problem with
dimension $1 + NM + N^2M^2$. Using the Lanczos algorithm the low lying eigenstates can
then be calculated in $O((N+M)^2)$ steps.
Further improvements are possible by allowing more than only double substitutions.
The optimal method treats the full quantum problem of dimension $(N+M)!/N!M!$.
Quantum chemists call this method full-CI. Physicists simplify the Hamilton operator
slightly to obtain simpler models with fewer matrix elements, and call that method
exact diagonalization. This method will be discussed in the final chapter.
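To get a feeling for how quickly the full-CI dimension $(N+M)!/N!M!$ grows, it can be evaluated with a short helper. The function name is hypothetical, and the sketch assumes that the result and the intermediate products fit into an `unsigned long long`, which holds only for moderate $N$ and $M$.

```cpp
#include <cassert>

// Dimension (N+M)!/(N! M!) of the full-CI problem with N occupied and
// M empty orbitals, computed without forming the factorials.
unsigned long long full_ci_dimension(unsigned int N, unsigned int M)
{
  unsigned long long dim = 1;
  for (unsigned int k = 1; k <= N; ++k)
    // after this step dim equals binomial(M + k, k), an exact integer
    dim = dim * (M + k) / k;
  return dim;
}
```

For $N = M = 10$ this already gives 184756 states, which is why symmetries and iterative eigensolvers become essential.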
9.7 Program packages
As the model Hamiltonian and the types of basis sets are essentially the same for all
quantum chemistry applications, flexible program packages have been written. There
is thus usually no need to write your own programs unless you want to implement a
new algorithm.
Chapter 10
The quantum N body problem:
exact algorithms
The quantum chemical approaches discussed in the previous chapter simplify the
problem of solving a quantum many-body problem. The complexity of a problem with
$N$ particles in $M = O(N)$ orbitals is then only $O(N^4)$ or often better, instead of
$O(\exp(N))$.
This enormous reduction in complexity is however paid for by a crude approximation
of electron correlation effects. This is acceptable for normal metals, band insulators and
semiconductors but fails in materials with strong electron correlations, such as almost
all transition metal ceramics.
In the last category many new materials have been synthesized over the past twenty
years, including the famous high temperature superconductors. In these materials the
electron correlations play an essential role and lead to interesting new properties that are
not completely understood yet. Here we leave quantum chemistry and enter quantum
physics again.
10.1 Models
To understand the properties of these materials the Hamilton operator of the full
quantum chemical problem (9.4) is usually simplified to generic models, which still contain
the same important features, but which are easier to investigate. They can be used to
understand the physics of these materials, but not directly to quantitatively fit experiments.
10.1.1 The tight-binding model
The simplest model is the tight-binding model, which concentrates on the valence bands.
All matrix elements $t_{ij}$ in equation (9.13), apart from the ones between nearest neighbor
atoms, are set to zero. The others are simplified, as in:
$$ H = \sum_{\langle i,j\rangle,\sigma}\left(t_{ij}\, c^\dagger_{i,\sigma}c_{j,\sigma} + \mathrm{H.c.}\right). \tag{10.1} $$
This model is easily solvable by Fourier transforming it, as there are no interactions.
10.1.2 The Hubbard model
To include effects of electron correlations, the Hubbard model includes only the often
dominant intra-orbital repulsion $V_{iiii}$ of the $V_{ijkl}$ in equation (9.14):
$$ H = \sum_{\langle i,j\rangle,\sigma}\left(t_{ij}\, c^\dagger_{i,\sigma}c_{j,\sigma} + \mathrm{H.c.}\right) + \sum_i U_i\, n_{i,\uparrow}n_{i,\downarrow}. \tag{10.2} $$
The Hubbard model is a long-studied, but except for the 1D case still not completely
understood model for correlated electron systems.
In contrast to band insulators, which are insulators because all bands are either
completely filled or empty, the Hubbard model at large $U$ is insulating at half filling,
when there is one electron per orbital. The reason is the strong Coulomb repulsion $U$
between the electrons, which prohibits any electron movement in the half-filled case at
low temperatures.
10.1.3 The Heisenberg model
In this insulating state the Hubbard model can be simplified to a quantum Heisenberg
model, containing exactly one spin per site.
$$ H = \sum_{\langle i,j\rangle} J_{ij}\,\vec S_i\vec S_j \tag{10.3} $$
For large $U/t$ the perturbation expansion gives $J_{ij} = 2t_{ij}^2(1/U_i + 1/U_j)$. The Heisenberg
model is the relevant effective model at temperatures $T \ll t_{ij}, U$ ($\sim 10^4$ K in copper
oxides).
10.1.4 The t-J model
The t-J model is the effective model for large $U$ at low temperatures away from half
filling. Its Hamiltonian is
$$ H = \sum_{\langle i,j\rangle,\sigma}\left[(1 - n_{i,-\sigma})\, t_{ij}\, c^\dagger_{i,\sigma}c_{j,\sigma}\,(1 - n_{j,-\sigma}) + \mathrm{H.c.}\right] + \sum_{\langle i,j\rangle} J_{ij}\left(\vec S_i\vec S_j - n_in_j/4\right). \tag{10.4} $$
As double occupancy is prohibited in the t-J model there are only three instead of four
states per orbital, greatly reducing the Hilbert space size.
10.2 Algorithms for quantum lattice models
10.2.1 Exact diagonalization
The most accurate method is exact diagonalization of the Hamiltonian matrix using the
Lanczos algorithm which was discussed in section 7.10. The size of the Hilbert space of
an N-site system [$4^N$ for a Hubbard model, $3^N$ for a t-J model and $(2S+1)^N$ for a
spin-S model] can be reduced by making use of symmetries. Translational symmetries can
be employed by using Bloch waves with fixed momentum as basis states. Conservation
of particle number and spin allows us to restrict a calculation to subspaces of fixed particle
number and magnetization.
As an example I will sketch how to implement exact diagonalization for a simple
one-dimensional spinless fermion model with nearest neighbor hopping $t$ and nearest
neighbor repulsion $V$:
$$ H = -t\sum_{i=1}^{L-1}\left(c^\dagger_i c_{i+1} + \mathrm{H.c.}\right) + V\sum_{i=1}^{L-1} n_i n_{i+1}. \tag{10.5} $$
The first step is to construct a basis set. We describe a basis state as an unsigned
integer where bit i set to one corresponds to an occupied site i. As the Hamiltonian
conserves the total particle number we thus want to construct a basis of all states with
N particles on L sites (or N bits set to one in L bits). The function state(i) returns
the state corresponding to the i-th basis state, and the function index(s) returns the
number of a basis state s.
#include <vector>
#include <alps/bitops.h>
#include <limits>
#include <valarray>

class FermionBasis {
public:
  typedef unsigned int state_type;
  typedef unsigned int index_type;
  FermionBasis (int L, int N);
  // bit pattern of the i-th basis state
  state_type state(index_type i) const {return states_[i];}
  // position of the bit pattern s in the basis
  index_type index(state_type s) const {return index_[s];}
  unsigned int dimension() const { return states_.size();}
private:
  std::vector<state_type> states_;
  std::vector<index_type> index_;
};

FermionBasis::FermionBasis(int L, int N)
{
  index_.resize(1<<L); // 2^L entries
  for (state_type s=0;s<index_.size();++s)
    if(alps::popcnt(s)==N) {
      // correct number of particles
      states_.push_back(s);
      index_[s]=states_.size()-1;
    }
    else
      // invalid state
      index_[s]=std::numeric_limits<index_type>::max();
}
Next we have to implement a matrix-vector multiplication v = Hw for the Hamiltonian:
#include <cassert>

class HamiltonianMatrix : public FermionBasis {
public:
  HamiltonianMatrix(int L, int N, double t, double V)
    : FermionBasis(L,N), t_(t), V_(V), L_(L) {}
  void multiply(std::valarray<double>& v, const std::valarray<double>& w);
private:
  double t_, V_;
  int L_;
};

void HamiltonianMatrix::multiply(std::valarray<double>& v,
                                 const std::valarray<double>& w)
{
  // check dimensions
  assert(v.size()==dimension());
  assert(w.size()==dimension());
  // do the V-term
  for (unsigned int i=0;i<dimension();++i)
  {
    state_type s = state(i);
    // count number of neighboring fermion pairs
    v[i]=w[i]*V_*alps::popcnt(s&(s>>1));
  }
  // do the t-term
  for (unsigned int i=0;i<dimension();++i)
  {
    state_type s = state(i);
    for (int r=0;r<L_-1;++r) {
      // flip both sites of the bond (r,r+1): a hop if exactly one was occupied
      state_type shop = s^(3<<r);
      index_type idx = index(shop); // get the index
      // states with the wrong particle number are marked invalid and skipped
      if(idx!=std::numeric_limits<index_type>::max())
        v[idx]-=w[i]*t_; // the hopping enters (10.5) with amplitude -t
    }
  }
}
This class can now be used together with the Lanczos algorithm to calculate the
energies and wave functions of the low lying states of the Hamiltonian.
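To check such a multiplication routine on very small systems it helps to have a version without external dependencies. The following sketch mirrors the idea of the classes above but is not the lecture's code: the names `Basis`, `make_basis`, `popcount` and `multiply` are hypothetical, a hand-written bit count stands in for `alps::popcnt`, and the hopping is applied with amplitude $-t$ as in (10.5). On an open chain with nearest neighbor hopping no fermionic sign arises, since no occupied site lies between the two sites of a bond.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef unsigned int state_type;

// Portable bit count, standing in for alps::popcnt.
int popcount(state_type s)
{
  int n = 0;
  while (s) { s &= s - 1; ++n; }
  return n;
}

struct Basis {
  std::vector<state_type> states;   // bit patterns with N bits set
  std::vector<std::size_t> index;   // lookup table with 2^L entries
};

const std::size_t invalid = std::numeric_limits<std::size_t>::max();

// All states with N particles on L sites, in increasing order of the bit pattern.
Basis make_basis(int L, int N)
{
  assert(L >= 1 && L <= 30);
  Basis b;
  b.index.assign(std::size_t(1) << L, invalid);
  for (state_type s = 0; s < b.index.size(); ++s)
    if (popcount(s) == N) {
      b.index[s] = b.states.size();
      b.states.push_back(s);
    }
  return b;
}

// v = H w for the Hamiltonian (10.5) on an open chain of L sites.
std::vector<double> multiply(const Basis& b, int L, double t, double V,
                             const std::vector<double>& w)
{
  assert(w.size() == b.states.size());
  std::vector<double> v(w.size(), 0.0);
  for (std::size_t i = 0; i < b.states.size(); ++i) {
    const state_type s = b.states[i];
    // V term: count occupied nearest neighbor pairs
    v[i] += V * popcount(s & (s >> 1)) * w[i];
    // t term: hop across bond (r, r+1) if exactly one of the two sites is occupied
    for (int r = 0; r < L - 1; ++r) {
      const state_type bond = state_type(3) << r;
      const state_type occ = s & bond;
      if (occ != 0 && occ != bond)
        v[b.index[s ^ bond]] -= t * w[i];
    }
  }
  return v;
}
```

For two sites and one particle the matrix reduces to a 2x2 matrix with zero diagonal and $-t$ off the diagonal, which gives a simple check by hand.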
In production codes one uses all symmetries to reduce the dimension of the Hilbert
space as much as possible. In this example translational symmetry can be used if
periodic boundary conditions are applied. The implementation gets much harder then.
In order to make the implementation of exact diagonalization much easier we have
generalized the expression templates technique developed by Todd Veldhuizen for array
expressions to expressions including quantum operators. Using this expression template
library we can write a multiplication
$$ |\psi\rangle = H|\phi\rangle = \left(-t\sum_{i=1}^{L-1}\left(c^\dagger_i c_{i+1} + \mathrm{H.c.}\right) + V\sum_{i=1}^{L-1} n_i n_{i+1}\right)|\phi\rangle \tag{10.6} $$
simply as:
Range i(1,L-1);
psi = sum(i,(-t*(cdag(i)*c(i+1)+HC)+V*n(i)*n(i+1))*phi);
The advantage of the above on-the-fly calculation of the matrix in the multiplication
routine is that the matrix need not be stored in memory, which is an advantage
for the biggest systems where just a few vectors of the Hilbert space will fit into
memory. If one is not as demanding and wants to simulate a slightly smaller system,
where the (sparse) matrix can be stored in memory, then a less efficient but more
flexible function can be used to create the matrix, since it will be called only once at
the start of the program. Such a program is available through the ALPS project at
http://alps.comp-phys.org/.
10.2.2 Quantum Monte Carlo
Path-integral representation in terms of world lines
All quantum Monte Carlo algorithms are based on a mapping of a d-dimensional
quantum system to a (d+1)-dimensional classical system using a path-integral formulation.
We then perform classical Monte Carlo updates on the world lines of the particles. I will
introduce one modern algorithm for lattice models, the loop algorithm, which is a
generalization of the cluster algorithms to lattice models. Other (non-cluster) algorithms
are available also for continuum models. The interested student will find descriptions
of those algorithms in many textbooks on computational physics.
I will discuss the loop algorithm for a spin-1/2 quantum XXZ model with the
Hamiltonian
$$ H = -\sum_{\langle i,j\rangle}\left(J_z S^z_i S^z_j + J_{xy}\left(S^x_i S^x_j + S^y_i S^y_j\right)\right) = -\sum_{\langle i,j\rangle}\left(J_z S^z_i S^z_j + \frac{J_{xy}}{2}\left(S^+_i S^-_j + S^-_i S^+_j\right)\right). \tag{10.7} $$
For $J \equiv J_z = J_{xy}$ we have the Heisenberg model ($J > 0$ is ferromagnetic, $J < 0$
antiferromagnetic). $J_{xy} = 0$ is the (classical) Ising model and $J_z = 0$ the quantum XY
model.
In the quantum Monte Carlo simulation we want to evaluate thermodynamic averages
such as
$$ \langle A\rangle = \frac{\mathrm{Tr}\,A e^{-\beta H}}{\mathrm{Tr}\,e^{-\beta H}}. \tag{10.8} $$
The main problem is the calculation of the exponential $e^{-\beta H}$. The straightforward
calculation would require a complete diagonalization, which is just what we want to
avoid. We thus discretize the imaginary time (inverse temperature) direction¹ and
subdivide $\beta = M\Delta\tau$:
$$ e^{-\beta H} = \left(e^{-\Delta\tau H}\right)^M = (1 - \Delta\tau H)^M + O(\Delta\tau) \tag{10.9} $$
In the limit $M \to \infty$ ($\Delta\tau \to 0$) this becomes exact. We will take the limit later, but
stay at finite $\Delta\tau$ for now.
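The $O(\Delta\tau)$ error in (10.9) can be made concrete for a single eigenvalue $E$ of $H$, for which the discretized weight is just $(1 - \Delta\tau E)^M$. The helper below is a minimal sketch with a hypothetical name; it only illustrates the convergence with $M$ and is not part of a Monte Carlo code.

```cpp
#include <cassert>
#include <cmath>

// (1 - dtau*E)^M with dtau = beta/M: the discretized form (10.9) of the
// Boltzmann factor exp(-beta*E) for a single eigenvalue E of H.
double discretized_boltzmann(double E, double beta, unsigned int M)
{
  assert(M > 0);
  const double dtau = beta / M;
  double w = 1.0;
  for (unsigned int m = 0; m < M; ++m)
    w *= 1.0 - dtau * E;
  return w;
}
```

Comparing the result with `std::exp(-beta * E)` for increasing `M` shows the deviation shrinking roughly in proportion to $\Delta\tau = \beta/M$.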
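The next step is to insert the identity matrix, represented by a sum over all basis
states $1 = \sum_i |i\rangle\langle i|$, between all operators $(1 - \Delta\tau H)$:
$$ Z = \mathrm{Tr}\,e^{-\beta H} = \mathrm{Tr}\,(1 - \Delta\tau H)^M + O(\Delta\tau) $$
$$ = \sum_{i_1,\ldots,i_M}\langle i_1|1 - \Delta\tau H|i_2\rangle\langle i_2|1 - \Delta\tau H|i_3\rangle\cdots\langle i_M|1 - \Delta\tau H|i_1\rangle + O(\Delta\tau) $$
$$ =: \sum_{i_1,\ldots,i_M} P_{i_1,\ldots,i_M} \tag{10.10} $$
and similarly for the measurement, obtaining
$$ \langle A\rangle = \sum_{i_1,\ldots,i_M}\frac{\langle i_1|A(1 - \Delta\tau H)|i_2\rangle}{\langle i_1|1 - \Delta\tau H|i_2\rangle}\, P_{i_1,\ldots,i_M} + O(\Delta\tau). \tag{10.11} $$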
If we choose the basis states $|i\rangle$ to be eigenstates of the local $S^z$ operators we end
up with an Ising-like spin system in one higher dimension. Each choice $i_1,\ldots,i_M$
corresponds to one of the possible configurations of this classical spin system. The
trace is mapped to periodic boundary conditions in the imaginary time direction of this
classical spin system. The probabilities are given by matrix elements $\langle i_n|1 - \Delta\tau H|i_{n+1}\rangle$.
We can now sample this classical system using classical Monte Carlo methods.
However, most of the matrix elements $\langle i_n|1 - \Delta\tau H|i_{n+1}\rangle$ are zero, and thus nearly all
configurations have vanishing weight. The only non-zero configurations are those where
neighboring states $|i_n\rangle$ and $|i_{n+1}\rangle$ are either equal or differ by one of the off-diagonal
matrix elements in $H$, which are nearest neighbor exchanges of two opposite spins. We
can thus uniquely connect spins on neighboring time slices and end up with world
lines of the spins, sketched in Fig. 10.1. Instead of sampling over all configurations of
local spins we thus have to sample only over all world line configurations (the others
have vanishing weight). Our update moves are not allowed to break world lines but
have to lead to new valid world line configurations.
¹ Time evolution in quantum mechanics is $e^{-itH}$. The Boltzmann factor $e^{-\beta H}$ thus corresponds to
an evolution in imaginary time $t = -i\beta$.
[Figure 10.1: world line sketch; horizontal axis: space, vertical axis: imaginary time, starting at 0]
Figure 10.1: Example of a world line configuration for a spin-1/2 quantum Heisenberg
model. Drawn are the world lines for up-spins only. Down-spin world lines occupy the
rest of the configuration.
Table 10.1: The six local configurations for an XXZ model and their weights. Each
configuration lists the spins $S_i(\tau)\,S_j(\tau) \to S_i(\tau+d\tau)\,S_j(\tau+d\tau)$.
configuration                 weight
↑↑ → ↑↑ ,  ↓↓ → ↓↓            1 + (J_z/4) dτ
↑↓ → ↑↓ ,  ↓↑ → ↓↑            1 - (J_z/4) dτ
↑↓ → ↓↑ ,  ↓↑ → ↑↓            (J_xy/2) dτ
The loop algorithm
Until 1993 only local updates were used, which suffered from critical slowing down like in
the classical case. The solution came as a generalization of the cluster algorithms to
quantum systems.²
This algorithm is best described by first taking the continuous time limit $M \to \infty$
($\Delta\tau \to d\tau$) and by working with infinitesimals. Similar to the Ising model we look at
two spins on neighboring sites $i$ and $j$ at two neighboring times $\tau$ and $\tau + d\tau$, as sketched
in Tab. 10.1. There are a total of six possible configurations, having three different
probabilities. The total probabilities are the products of all local probabilities, like in
the classical case. This is obvious for different time slices. For the same time slice it
is also true since, denoting by $H_{ij}$ the term in the Hamiltonian $H$ acting on the bond
between sites $i$ and $j$, we have $\prod_{\langle i,j\rangle}(1 - d\tau H_{ij}) = 1 - d\tau\sum_{\langle i,j\rangle} H_{ij} = 1 - d\tau H$. In
the following we focus only on such local four-spin plaquettes. Next we again use the
² H. G. Evertz et al., Phys. Rev. Lett. 70, 875 (1993); B. B. Beard and U.-J. Wiese, Phys. Rev.
Lett. 77, 5130 (1996); B. Ammon, H. G. Evertz, N. Kawashima, M. Troyer and B. Frischmuth, Phys.
Rev. B 58, 4304 (1998).
[Figure 10.2: four panels a) to d)]
Figure 10.2: The four local graphs: a) vertical, b) horizontal, c) crossing and d) freezing
(connects all four corners).
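Table 10.2: The graph weights for the quantum XY model and the Δ function specifying
whether the graph is allowed. The dash (-) denotes a graph that is not possible for
a configuration because of spin conservation and has to be zero. The three Δ columns
refer to the three kinds of configurations of Table 10.1: parallel spins, antiparallel
spins without exchange, and antiparallel spins with exchange.
G              Δ(parallel, G)   Δ(antiparallel, G)   Δ(exchange, G)   graph weight
vertical       1                1                    -                1 - (J_xy/4) dτ
horizontal     -                1                    1                (J_xy/4) dτ
crossing       1                -                    1                (J_xy/4) dτ
freezing       0                0                    0                0
total weight   1                1                    (J_xy/2) dτ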
Kandel-Domany framework and assign graphs. As the updates are not allowed to break
world lines only four graphs, sketched in Fig. 10.2, are allowed. Finally we have to find
Δ functions and graph weights that give the correct probabilities. The solution for the
XY model, ferromagnetic and antiferromagnetic Heisenberg model and the Ising model
is shown in Tables 10.2 - 10.5.
Let us first look at the special case of the Ising model. As the exchange term is
absent in the Ising model all world lines run straight and can be replaced by classical
spins. The only non-trivial graph is the freezing graph, connecting two neighboring world
lines. Integrating the probability that two neighboring sites are nowhere connected
along the time direction we obtain:
$$ \prod_{\tau=0}^{\beta}(1 - d\tau J/2) = \lim_{M\to\infty}(1 - \Delta\tau J/2)^M = \exp(-\beta J/2) \tag{10.12} $$
Taking into account that the spin is $S = 1/2$ and the corresponding classical coupling
is $J_{cl} = S^2 J = J/4$ we find for the probability that two spins are connected:
$1 - \exp(-2\beta J_{cl})$. We end up exactly with the cluster algorithm for the classical Ising
model!
Table 10.3: The graph weights for the ferromagnetic quantum Heisenberg model and
the Δ function specifying whether the graph is allowed. The dash (-) denotes a graph
that is not possible for a configuration because of spin conservation and has to be zero.
Columns as in Table 10.1: parallel spins, antiparallel spins without exchange, and
antiparallel spins with exchange.
G              Δ(parallel, G)   Δ(antiparallel, G)   Δ(exchange, G)   graph weight
vertical       1                1                    -                1 - (J/4) dτ
horizontal     -                0                    0                0
crossing       1                -                    1                (J/2) dτ
freezing       0                0                    0                0
total weight   1 + (J/4) dτ     1 - (J/4) dτ         (J/2) dτ
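Table 10.4: The graph weights for the antiferromagnetic quantum Heisenberg model
and the Δ function specifying whether the graph is allowed. The dash (-) denotes a
graph that is not possible for a configuration because of spin conservation and has to
be zero. To avoid the sign problem (see next subsection) we change the sign of $J_{xy}$,
which is allowed only on bipartite lattices. Here $|J|$ denotes the magnitude of the
antiferromagnetic coupling. Columns as in Table 10.1: parallel spins, antiparallel spins
without exchange, and antiparallel spins with exchange.
G              Δ(parallel, G)   Δ(antiparallel, G)   Δ(exchange, G)   graph weight
vertical       1                1                    -                1 - (|J|/4) dτ
horizontal     -                1                    1                (|J|/2) dτ
crossing       0                -                    0                0
freezing       0                0                    0                0
total weight   1 - (|J|/4) dτ   1 + (|J|/4) dτ       (|J|/2) dτ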
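Table 10.5: The graph weights for the ferromagnetic Ising model and the Δ function
specifying whether the graph is allowed. The dash (-) denotes a graph that is not possible
for a configuration because of spin conservation and has to be zero. Columns as in
Table 10.1: parallel spins, antiparallel spins without exchange, and antiparallel spins
with exchange.
G              Δ(parallel, G)   Δ(antiparallel, G)   Δ(exchange, G)   graph weight
vertical       1                1                    -                1 - (J_z/4) dτ
horizontal     -                0                    0                0
crossing       0                -                    0                0
freezing       1                0                    0                (J_z/2) dτ
total weight   1 + (J_z/4) dτ   1 - (J_z/4) dτ       0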
The other cases are special. Here each graph connects two spins. As each of these
spins is again connected to only one other, all spins connected by a cluster form a
closed loop, hence the name loop algorithm. Only one issue remains to be explained:
how do we assign a horizontal or crossing graph with infinitesimal probability, such
as $(J/2)d\tau$? This is easily done by comparing the assignment process with radioactive
decay. For each segment the graph runs vertical, except for occasional decay processes
occurring with probability $(J/2)d\tau$. Instead of asking at every infinitesimal time step
whether a decay occurs we simply calculate an exponentially distributed decay time $t$
using an exponential distribution with decay constant $J/2$. Looking up the equation
in the lecture notes of the winter semester we have $t = -(2/J)\ln(1 - u)$ where $u$ is a
uniformly distributed random number.
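The decay-time construction can be sketched in a few lines. The function below is an illustration only: its name, the use of `std::mt19937` and the restriction to $J > 0$ are assumptions of this example. It returns the candidate times on one bond at which a graph could be inserted, by adding up exponentially distributed waiting times $t = -(2/J)\ln(1 - u)$ until the extent $\beta$ is exceeded.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Candidate insertion times for graphs on one bond in [0, beta).
// Successive gaps follow an exponential distribution with decay constant J/2.
std::vector<double> decay_times(double J, double beta, std::mt19937& rng)
{
  assert(J > 0.0 && beta > 0.0);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);  // u in [0,1)
  std::vector<double> times;
  double tau = 0.0;
  for (;;) {
    const double u = uniform(rng);
    tau += -(2.0 / J) * std::log(1.0 - u);   // t = -(2/J) ln(1-u)
    if (tau >= beta) break;                  // reached the extent beta
    times.push_back(tau);
  }
  return times;
}
```

On average this produces $\beta J/2$ candidate times per bond; whether a graph is actually inserted at a given time still depends on the local spin configuration there.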
The algorithm now proceeds as follows (see Fig. 10.3): for each bond we start at
time 0 and calculate a decay time. If the spins at that time are oriented properly and an
exchange graph is possible we insert one. Next we advance by another randomly chosen
decay time along the same bond and repeat the procedure until we have reached the
extent $\beta$. This assigns graphs to all infinitesimal time steps where spins do not change.
Next we assign a graph to all of the (finite number of) time steps where two spins
are exchanged. In the case of the Heisenberg models there is always only one possible
graph to assign and this is very easy. In the next step we identify the loop clusters and
then flip them each with probability 1/2. Alternatively a Wolff-like algorithm can be
constructed that only builds one loop cluster.
Improved estimators for measurements can be constructed like in classical models.
The derivation is similar to the classical models. I will just mention two simple ones
[Figure 10.3, three panels: world lines; world lines + decay graphs; world lines after flips of some loop clusters.]
Figure 10.3: Example of a loop update. In a first step decay paths are inserted where
possible at positions drawn randomly according to an exponential distribution and
graphs are assigned to all exchange terms (hoppings of world lines). In a second stage
(not shown) the loop clusters are identified. Finally each loop cluster is flipped with
probability 1/2 and one ends up with a new configuration.
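for the ferromagnetic Heisenberg model. The spin-spin correlation is
\[
S^z_i(\tau)\, S^z_j(\tau') =
\begin{cases}
1 & \text{if } (i,\tau) \text{ and } (j,\tau') \text{ are on the same cluster} \\
0 & \text{otherwise}
\end{cases}
\]
and the uniform susceptibility is
\[
\chi = \frac{1}{N\beta} \sum_c S(c)^2, \tag{10.14}
\]
where the sum goes over all loop clusters and $S(c)$ is the length of all the loop segments
in the loop cluster $c$.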
The negative sign problem
Now that we have an algorithm with no critical slowing down, we might think that
we have completely solved the quantum many-body problem. Indeed, the
scaling of the loop algorithm is $O(N\beta)$, where $N$ is the number of lattice sites and
$\beta$ the inverse temperature; this is optimal scaling.
There is however the negative sign problem, which destroys our dreams. We need to
interpret the matrix elements $\langle i_n | 1 - \Delta\tau H | i_{n+1} \rangle$ as probabilities, which requires them
to be positive. However, every positive off-diagonal matrix element of $H$ gives rise to a
negative probability!
The simplest example is the exchange term $-(J_{xy}/2)(S^+_i S^-_j + S^-_i S^+_j)$ in the
Hamiltonian (10.7) in the case of an antiferromagnet with $J_{xy} < 0$. For any bipartite lattice,
such as chains, square lattices or cubic lattices, there is always an even number
of such exchanges and we get rescued once more. For non-bipartite lattices (such as a
triangular lattice), on which the antiferromagnetic order is frustrated, there is no way
around the sign problem. Similarly, a minus sign occurs in all configurations where two
fermions are exchanged.
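Even when there is a sign problem we can still do a simulation. Instead of sampling
\[
\langle A \rangle_p := \frac{\int A(x)\, p(x)\, dx}{\int p(x)\, dx} \tag{10.15}
\]
we rewrite this equation as
\[
\langle A \rangle_p
= \frac{\int A(x)\, \mathrm{sign}(p(x))\, |p(x)|\, dx \,\big/ \int |p(x)|\, dx}
       {\int \mathrm{sign}(p(x))\, |p(x)|\, dx \,\big/ \int |p(x)|\, dx}
= \frac{\langle A \cdot \mathrm{sign}\, p \rangle_{|p|}}{\langle \mathrm{sign}\, p \rangle_{|p|}}. \tag{10.16}
\]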
We sample with the absolute values $|p|$ and include the sign in the observable. The sign
problem is the fact that the errors get blown up by an additional factor $1/\langle \mathrm{sign}\, p \rangle_{|p|}$,
which grows exponentially with volume and inverse temperature $\beta$, as
$\langle \mathrm{sign}\, p \rangle_{|p|} \propto \exp(-\mathrm{const} \times \beta N)$. Then we are unfortunately back to exponential scaling.
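The reweighting with the sign can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions: a handful of discrete states with made-up signed weights `w` and observable values `A` (neither comes from the lecture), sampled directly from $|w|$ rather than by a Markov chain. The function name `reweighted_average` is hypothetical.

```python
import numpy as np

def reweighted_average(A, w, n_samples, rng):
    """Estimate <A>_w for signed weights w by sampling states from |w|.

    Returns the ratio <A * sign>_{|w|} / <sign>_{|w|} and the average sign.
    """
    A = np.asarray(A, dtype=float)
    w = np.asarray(w, dtype=float)
    prob = np.abs(w) / np.abs(w).sum()           # sample with absolute values
    idx = rng.choice(len(w), size=n_samples, p=prob)
    sign = np.sign(w[idx])                       # the sign goes into the observable
    mean_sign = sign.mean()
    return (A[idx] * sign).mean() / mean_sign, mean_sign

# Toy data (assumptions): four states, two of them with negative weight
A = np.array([1.0, 2.0, 3.0, 4.0])
w = np.array([0.4, -0.1, 0.5, -0.2])
exact = (A * w).sum() / w.sum()                  # direct evaluation for comparison

rng = np.random.default_rng(0)
estimate, mean_sign = reweighted_average(A, w, 200_000, rng)
```

In this toy case the average sign is of order one, so the estimate converges quickly. In a real simulation the average sign decays exponentially in $\beta N$, and the statistical error of the ratio grows accordingly.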
The sign problem occurs not only for frustrated magnets, but for any fermionic
quantum system in more than one dimension: the wave function changes sign when
two fermions are exchanged and hence any world line configuration where two fermions
exchange their positions during the propagation from imaginary time 0 to $\beta$ will
contribute with a negative weight. Many people have tried to solve the sign problem using
basis changes or clever reformulations, but except for special cases nobody has
succeeded yet. If you want you can try your luck: the person who finds a general solution
to the sign problem will surely get a Nobel Prize. Unfortunately it was recently shown
that the negative sign problem is NP-hard and thus almost certainly unsolvable in the
general case [3].
10.2.3 Density Matrix Renormalization Group methods
The density matrix renormalization group (DMRG) method uses a clever trick of
reducing the Hilbert space size by selecting important states. The idea is to grow the lattice
by just a few sites in each renormalization step. If all basis states are kept, this leads to
the well-known exponential increase in size. Instead of keeping all basis functions, the
DMRG method selects a number $m$ of "important" states according to their density
matrix eigenvalue. A good reference is the paper by S.R. White in Phys. Rev. B 48,
10345 (1993), as well as the doctoral thesis of B. Ammon.
10.3 Lattice field theories
We did not talk much about field theories since a discussion of algorithms for lattice
field theories requires a good knowledge of analytical approaches for field theories first.
Here I just want to sketch how a classical or quantum field theory can be simulated.
[3] M. Troyer and U.-J. Wiese, Phys. Rev. Lett. 94, 170201 (2005).
10.3.1 Classical field theories
As an example I choose the classical O(N) nonlinear sigma model in $d$ dimensions, with
an action:
\[
S = g \int d^d x\, |\nabla \vec{\Omega}(x)|^2 \quad \text{with } |\vec{\Omega}(x)| = 1 \tag{10.17}
\]
living in a finite box of size $L^d$. To simulate this field theory we introduce a finite lattice
spacing $a$ and obtain a grid of dimension $M^d$ with $M = L/a$. Let us denote the value of
the field $\vec{\Omega}$ on the grid points by $\vec{\Omega}_i$. Discretizing the action $S$ by replacing derivatives
by differences we obtain, making use of $|\vec{\Omega}(x)| = 1$ and dropping a constant,
\[
S^a = -\frac{2g}{a^2} \sum_{\langle i,j \rangle} \vec{\Omega}_i \cdot \vec{\Omega}_j. \tag{10.18}
\]
This is nothing but the action of a classical O(N) lattice model in $d$ dimensions with
$J = 2g/a^2$ and we can again use the cluster algorithms. $N = 1$ is the Ising model,
$N = 2$ the XY model and $N = 3$ the Heisenberg model. One calls the model (10.17)
the effective field theory of the O(N) lattice models.
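As a small illustration of the discretized model, the following sketch evaluates the lattice action with coupling $J = 2g/a^2$ for a given field configuration. It is not from the lecture: the function names, the array layout (last axis holds the N field components) and the periodic boundary conditions are assumptions made for the example.

```python
import numpy as np

def random_unit_field(shape, N, rng):
    """Random O(N) field on a grid: one unit vector per grid point."""
    v = rng.normal(size=tuple(shape) + (N,))
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def lattice_action(omega, g, a):
    """-(2g/a^2) * sum over nearest-neighbour pairs of omega_i . omega_j.

    omega has shape (M, ..., M, N); periodic boundaries are assumed.
    """
    J = 2.0 * g / a**2
    d = omega.ndim - 1                     # number of lattice dimensions
    total = 0.0
    for axis in range(d):
        # dot product of each site with its neighbour in direction +axis
        total += np.sum(omega * np.roll(omega, -1, axis=axis))
    return -J * total

# Fully aligned N = 3 field on a 4 x 4 grid: 2 * 16 bonds, each contributing 1
aligned = np.zeros((4, 4, 3))
aligned[..., 2] = 1.0
S_aligned = lattice_action(aligned, g=1.0, a=0.5)

rng = np.random.default_rng(0)
S_random = lattice_action(random_unit_field((4, 4), 3, rng), g=1.0, a=0.5)
```

The aligned configuration gives the minimum of the action, so any other configuration, such as the random one, yields a larger value.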
There is however a subtle but important difference. In statistical mechanics we had
a fixed lattice spacing $a$ and let $L \to \infty$ to approach the thermodynamic limit. In the
field theory we keep $L$ fixed and let $a \to 0$, while scaling $J$ like $2g/a^2$. This leads to
different interpretations of the results.
10.3.2 Quantum field theories
Quantum field theories can be treated similarly. Using a path integral formulation, as
we introduced it for quantum lattice models, the $d$-dimensional quantum field theory is
mapped to a $(d+1)$-dimensional classical one. Let us consider the quantum nonlinear
sigma model, a Lorentz invariant field theory with an effective action:
\[
S = g \int_0^\beta d\tau \int d^d x \left[ |\nabla \vec{\Omega}(x,\tau)|^2 + \frac{1}{c^2} \left| \frac{\partial \vec{\Omega}(x,\tau)}{\partial \tau} \right|^2 \right] \quad \text{with } |\vec{\Omega}(x,\tau)| = 1, \tag{10.19}
\]
where $\beta$ is the inverse temperature and the velocity $c$ relates the spatial and temporal
scales. Introducing a lattice spacing $a$ in the spatial direction and a lattice spacing
$a/c$ in the temporal direction we obtain a classical O(N) lattice model with couplings
$\beta_{cl} J = 2g/a^2$. The extent in the space direction is $L/a$ and in the temporal direction
$\beta c/a$. Thus we see that the coupling constant of the quantum model gets mapped to the
classical temperature and the temperature gets mapped to the extent in imaginary time.
Again we can use the classical cluster algorithms to study this quantum field theory.
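The mapping of parameters can be written down in a few lines. This is only a bookkeeping sketch of the relations stated above; the function name and the dictionary keys are hypothetical.

```python
def classical_lattice_parameters(g, c, beta, L, a):
    """Parameters of the (d+1)-dimensional classical O(N) lattice model.

    Spatial lattice spacing a, temporal lattice spacing a/c.
    """
    return {
        "coupling": 2.0 * g / a**2,       # beta_cl * J = 2g / a^2
        "spatial_sites": L / a,           # extent in each space direction
        "temporal_sites": beta * c / a,   # extent in imaginary time
    }

# Illustrative numbers (assumptions)
params = classical_lattice_parameters(g=1.0, c=2.0, beta=4.0, L=8.0, a=0.5)
```

Lowering the temperature of the quantum model (larger $\beta$) only lengthens the lattice in the temporal direction, while changing $g$ changes the effective classical temperature.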
For details and simulation algorithms for other field theories the interested student
is referred to the book by J.M. Thijssen.
Enjoy your holidays!