S Friedlander, University of Illinois-Chicago, Chicago, IL, USA
© 2006 Elsevier Ltd. All rights reserved.

(Friedlander and Serre 2003) and the compendium of articles on hydrodynamics and nonlinear instabilities in Godreche and Manneville (1998).

removes the highest-derivative term from the equations, the nature of the Euler equations is fundamentally different from that of the Navier–Stokes equations and the limit of vanishing viscosity (or infinite Reynolds number) is a very singular limit. Since all real fluids are at least very weakly viscous, it could be argued that only the Navier–Stokes equations are physically relevant. However, many important physical phenomena, such as turbulence, involve flows at very high Reynolds numbers ($10^4$ or higher). Hence, an understanding of turbulence is likely to involve the asymptotics of the Navier–Stokes equations as $R \to \infty$. The first step towards the construction of such asymptotics is the study of inviscid fluids governed by the Euler equations:

$$\frac{\partial q}{\partial t} + (q \cdot \nabla) q = -\nabla P \qquad [4a]$$

moderately large Reynolds numbers. Fully developed turbulence is a phenomenon associated with very high Reynolds numbers. These are parameter regimes basically inaccessible in current numerical investigations of the Navier–Stokes equations and turbulent models. The Euler equations lie at the limit as $R \to \infty$. It is an interesting observation that results at the limit of infinite Reynolds number are sometimes also applicable and consistent with experiments for flows with only moderate Reynolds number.

There is a huge diversity of forces that couple with fluid motion to produce instability. We will merely mention a few of these which an interested reader could pursue in consultation with texts listed in the “Further reading” section and references therein.
the instabilities that may occur in the evolution of the magnetic field is called the kinematic dynamo problem. This gives rise to interesting problems in dynamical systems and actually is closely analogous to the topic of vorticity generation in the three-dimensional (3D) fluid equations in the absence of MHD effects.

In the next section we discuss certain mathematical results that have been rigorously proved for particular problems in the stability of fluid flows. We restrict our attention to the “basic” equations, that is, [2a] and [2b], [4a] and [4b], observing that even in rather simple configurations there are still more open problems than precise rigorous results.

The Navier–Stokes Equations: Mathematical Definitions of Stability/Instability

Instability occurs when there is some disturbance of the internal or external forces acting on the fluid and, loosely speaking, the question of stability or instability considers whether there exist disturbances that grow with time. There are many mathematical definitions of stability of a solution to a PDE. Most of these definitions are closely related but they may not be equivalent. Because of the distinctly different nature of the Navier–Stokes equations for a viscous fluid and the Euler equations for an inviscid fluid, we will adopt somewhat different precise definitions of stability for the two systems of PDEs. Both definitions are related to the concept known as Lyapunov stability. A steady state described by a velocity field $U_0(x)$ is called Lyapunov stable if every state $q(x, t)$ “close” to $U_0(x)$ at $t = 0$ stays close for all $t > 0$. In mathematical terms, “closeness” is defined by considering metrics in a normed space $X$. While in finite-dimensional systems the choice of norm is not significant because all Banach norms are equivalent, in infinite-dimensional systems, such as a fluid configuration, this choice is crucial. The point was emphasized by Yudovich (1989) and it is a version of the definition of stability given in this book that we will adopt in connection with the parabolic Navier–Stokes equations.

Definitions for a General Nonlinear Evolution Equation

Consider an evolution equation for $u(x, t)$ whose phase space is a Banach space $X$:

$$\frac{\partial u}{\partial t} = Lu + N(u, u)$$

We assume that if the initial value $u(x, 0) \in X$ is given, the future evolution $u(x, t)$, $t > 0$, of the equation is uniquely defined (at least for sufficiently small initial data). Without loss of generality, we can assume that zero is a steady state.

We define a version of Lyapunov (nonlinear) stability and its converse instability.

Definition 1 Let $(X, Z)$ be a pair of Banach spaces. The zero steady state is called $(X, Z)$ nonlinearly stable if, no matter how small $\varepsilon > 0$, there exists $\delta > 0$ so that $u(x, 0) \in X$ and

$$\|u(x, 0)\|_Z < \delta$$

imply the following two assertions:
(i) there exists a global in time solution such that $u(x, t) \in C([0, \infty); X)$;
(ii) $\|u(x, t)\|_Z < \varepsilon$ for a.e. $t \in [0, \infty)$.

The zero state is called nonlinearly unstable if either of the above assertions is violated. We note that under this strong definition of stability, loss of existence of a solution is a particular case of instability. The concept of existence that we will invoke in considering the Navier–Stokes equations is the existence of “mild” solutions introduced by Kato and Fujita (1962). Local-in-time existence of mild solutions is known in $X = L^q$ for $q \ge n$, where $n$ denotes the space dimension. ($L^q$ denotes the usual Lebesgue space.)

We now state two theorems for the Navier–Stokes equations [2a] and [2b]. The theorems are valid in any space dimension $n$ and in finite or infinite domains. Of course, the most physically relevant cases are $n = 3$ or $2$. Both theorems relate properties of the spectrum of the linearized Navier–Stokes equations to stability or instability of the full nonlinear system. Let $U_0(x)$, $P_0(x)$ be a steady state flow:

$$(U_0 \cdot \nabla) U_0 = -\nabla P_0 + \frac{1}{R} \nabla^2 U_0 + \frac{1}{R} F \qquad [5a]$$

$$\nabla \cdot U_0 = 0 \qquad [5b]$$

where $U_0 \in C^\infty$ vanishes on the boundary of the domain $D$ and $F$ is a suitable external force. We write [2a] and [2b] in perturbation form as

$$q(x, t) = U_0(x) + u(x, t) \qquad [6]$$

where

$$\frac{\partial u}{\partial t} = L_{NS} u + N(u, u) \qquad [7a]$$

$$\nabla \cdot u = 0 \qquad [7b]$$
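The operators $L_{NS}$ and $N$ appearing in [7a] can be made explicit by a short calculation. The following is only a sketch under stated assumptions, not a quotation of the article: it assumes that the time-dependent Navier–Stokes equations have the same form as the steady equation [5a] with $\partial q/\partial t$ added on the left-hand side, and it splits the pressure perturbation as $P - P_0 = P_1 + P_2$, with $P_1$ linear and $P_2$ quadratic in $u$ (the names $P_1$, $P_2$ follow the Euler operators [15] and [16]).

```latex
% Substitute q = U_0 + u and P = P_0 + P_1 + P_2, then subtract the steady equation [5a].
% The external force cancels because it is the same in both equations.
\begin{align*}
\frac{\partial u}{\partial t}
  &= -(U_0\cdot\nabla)u - (u\cdot\nabla)U_0 - (u\cdot\nabla)u
     - \nabla(P_1 + P_2) + \frac{1}{R}\nabla^2 u, \\
L_{NS}\,u &\equiv -(U_0\cdot\nabla)u - (u\cdot\nabla)U_0
     + \frac{1}{R}\nabla^2 u - \nabla P_1, \\
N(u,u) &\equiv -(u\cdot\nabla)u - \nabla P_2 .
\end{align*}
```

Dropping the viscous term $\frac{1}{R}\nabla^2 u$ in this sketch gives operators of the same form as the Euler operators [15] and [16].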
4 Stability of Flows
this can be done only for constant-coefficient equations. In the case $U(z) = \sin mz$, a Fourier series representation for the eigenfunctions leads to a tridiagonal infinite matrix for the algebraic system satisfied by the Fourier coefficients. This is amenable to examination using continued fractions. Analysis of the characteristic equation shows that there exist real eigenvalues $\lambda > 0$ provided $R$ is larger than some critical value for each wave number $k$ with $k^2 < m^2$.

The Euler Equation: Linear and Nonlinear Stability/Instability

We conclude this brief article with some discussion of instabilities in the inviscid Euler equations whose existence is likely to be important as a “trigger” for the development of instabilities in high-Reynolds-number viscous flows. As we mentioned, the Euler equations are very different from the Navier–Stokes equations in their mathematical structure. The Euler equations are degenerate and nonelliptic. As such, the spectrum of the linearized operator $L_E$ is not amenable to standard spectral theory of elliptic operators. For example, unlike the Navier–Stokes operator, the spectrum of $L_E$ is not purely discrete even in bounded domains. To define $L_E$ we consider a steady Euler flow $\{U_0(x), P_0(x)\}$, where

$$U_0 \cdot \nabla U_0 = -\nabla P_0 \qquad [12a]$$

$$\nabla \cdot U_0 = 0 \qquad [12b]$$

We assume that $U_0 \in C^\infty$. For the Euler equations, appropriate boundary conditions include zero normal component of $U_0$ on a rigid boundary, or periodicity conditions (i.e., flow on a torus) or suitable decay at infinity in an unbounded domain. The theorems that we will be describing have been proved mainly in the cases of the second and third conditions stated above. There are many classes of vector fields $U_0(x)$, in two and three dimensions, that satisfy [12a] and [12b]. We write [4a] and [4b] in perturbation form as

$$q(x, t) = U_0(x) + u(x, t) \qquad [13]$$

with

$$\frac{\partial u}{\partial t} = L_E u + N(u, u) \qquad [14a]$$

$$\nabla \cdot u = 0 \qquad [14b]$$

Here

$$L_E u \equiv -(U_0 \cdot \nabla) u - (u \cdot \nabla) U_0 - \nabla P_1 \qquad [15]$$

$$N(u, u) \equiv -(u \cdot \nabla) u - \nabla P_2 \qquad [16]$$

Linear (spectral) instability of a steady Euler flow $U_0(x)$ concerns the structure of the spectrum of $L_E$. Assuming $U_0 \in C^\infty(T^n)$, the linear equation

$$\frac{\partial u}{\partial t} = L_E u, \qquad \nabla \cdot u = 0 \qquad [17]$$

defines a strongly continuous group in every Sobolev space $W^{s,p}$ with generator $L_E$. We denote this group by $\exp\{L_E t\}$. For the issue of spectral instability of the Euler equation it proves useful to study not only the spectrum of $L_E$ but also the spectrum of the evolution operator $\exp\{L_E t\}$. This permits the development of an explicit formula for the growth rate of a small perturbation due to the essential (or continuous) spectrum. It was proved by Vishik (1996) that a quantity $\Lambda$, referred to as a “fluid Lyapunov exponent,” gives the maximum growth rate of the essential spectrum of $\exp\{L_E t\}$. This quantity is obtained by computing the exponential growth rate of a certain vector that satisfies a specific system of ODEs over the trajectories of the flow $U_0(x)$. This proves to be an effective mechanism for detecting instabilities in the essential spectrum that result from high-spatial-frequency perturbations. For example, for this reason any flow $U_0(x)$ with a hyperbolic fixed point is linearly unstable with growth in the sense of the $L^2$-norm. In two dimensions, $\Lambda$ is equal to the maximal classical Lyapunov exponent (i.e., the exponential growth of a tangent vector over the ODE $\dot{x} = U_0(x)$). In three dimensions, the existence of a nonzero classical Lyapunov exponent implies that $\Lambda > 0$. However, in three dimensions there are also examples where the classical Lyapunov exponent is zero and yet $\Lambda > 0$. We note that the delicate issue of the unstable essential spectrum is strongly dependent on the function space for the perturbations and that $\Lambda$, for a given $U_0$, will vary with this function space. More details and examples of instabilities in the essential spectrum can be found in references in the bibliography.

In contrast with instabilities in the essential spectrum, the existence of discrete unstable eigenvalues is independent of the norm in which growth is measured. From this point of view, such instabilities can be considered as “strong.” However, for most flows $U_0(x)$ we do not know the existence of such unstable eigenvalues. For fully 3D flows there are no examples, to our knowledge, where such unstable eigenvalues have been proved to exist for flows with standard metrics. The case that has received the most attention in the literature is the “relatively simple” case of plane parallel shear flow. The eigenvalue problem is governed by the Rayleigh
equation (which is the inviscid version of the Orr–Sommerfeld equation [11]):

$$\left[ U - \frac{i\lambda}{k} \right] \left[ \frac{d^2}{dz^2} - k^2 \right] \phi - U'' \phi = 0, \qquad \phi = 0 \ \text{at} \ z = \pm 1 \qquad [18]$$

The celebrated Rayleigh stability criterion says that a sufficient condition for the eigenvalues $\lambda$ to be pure imaginary is the absence of an inflection point in the shear profile $U(z)$. It is more difficult to prove the converse; however, there have been several recent results that show that oscillating profiles indeed produce unstable eigenvalues. For example, if $U(z) = \sin mz$ the continued fraction proof of Meshalkin and Sinai can be adapted to exhibit the full unstable spectrum for [18]. We note the “fluid Lyapunov exponent” is zero for all shear flows; thus the only way the unstable spectrum can be nonempty for shear flows is via discrete unstable eigenvalues.

As we have discussed, it is possible to show that many classes of steady Euler flows are linearly unstable, either due to a nonempty unstable essential spectrum (i.e., cases where $\Lambda > 0$) or due to unstable eigenvalues or possibly for both reasons. It is natural to ask what this means about the stability/instability of the full nonlinear Euler equations [14]–[16]. The issue of nonlinear stability is complex and there are several natural precise definitions of nonlinear stability and its converse instability.

One definition is to consider nonlinear stability in the energy norm $L^2$ and the enstrophy norm $H^1$, which are natural function spaces to measure growth of disturbances but are not “correct” spaces for the Euler equations in terms of proven properties of existence and uniqueness of solutions to the nonlinear equation. Falling under this definition is the most frequently employed method to prove nonlinear stability, which is an elegant technique developed by Arnol’d (cf. Arnol’d and Khesin (1998) and references therein). This is based on the existence of the so-called energy-Casimirs. The vorticity $\operatorname{curl} q$ is transported by the motion of the fluid so that at time $t$ it is obtained from the vorticity at time $t = 0$ by a volume-preserving diffeomorphism. In the terminology of Arnol’d, the velocity fields obtained in this manner at any two times are called isovortical. For a given field $U_0(x)$, the class of isovortical fields is an infinite-dimensional manifold $M$, which is the orbit of the group of volume-preserving diffeomorphisms in the space of divergence-free vector fields. The steady flows are exactly the critical points of the energy functional $E$ restricted to $M$. If a critical point is a strict local maximum or minimum of $E$, then the steady flow is nonlinearly stable in the space $J_1$ of divergence-free vectors $u(x, t)$ (satisfying the boundary conditions) that have finite norm,

$$\|u\|_{J_1} \equiv \|u\|_{L^2} + \|\operatorname{curl} u\|_{L^2} \qquad [19]$$

This theory can be applied, for example, to show that any shear flow with no inflection points in the profile $U(z)$ is nonlinearly stable in the function space $J_1$, that is, the classical Rayleigh criterion implies not only spectral stability but also nonlinear stability.

We note that Arnol’d’s stability method cannot be applied to the Euler equations in three dimensions because the second variation of the energy defined on the tangent space to $M$ is never definite at a critical point $U_0(x)$. This result is suggestive, but does not prove, that most Euler flows in three dimensions are nonlinearly unstable in the Arnol’d sense. To quote Arnol’d, in the context of the Euler equations “there appear to be an infinitely great number of unstable configurations.”

In recent years, there have been a number of results concerning nonlinear instability for the Euler equation. Most of these results prove nonlinear instability under certain assumptions on the structure of the spectrum of the linearized Euler operator. To date, none of the approaches proves the definitive result that in general linear instability implies nonlinear instability. As we have remarked, this is a much more delicate issue for Euler than for Navier–Stokes because of the existence, for a generic Euler flow, of a nonempty essential unstable spectrum. To give a flavor of the mathematical treatment of nonlinear instability for the Euler equations, we present one recent result and refer the interested reader to articles listed in the “Further reading” section for further results and discussions.

In the context of Euler equations in two dimensions, we adopt the following definition of Lyapunov stability.

Definition 4 An equilibrium solution $U_0(x)$ is called Lyapunov stable if for every $\varepsilon > 0$ there exists $\delta > 0$ so that for any divergence-free vector $u(x, 0) \in W^{1+s,p}$, $s > 2/p$, such that $\|u(x, 0)\|_{L^2} < \delta$ the unique solution $u(x, t)$ to [14]–[16] satisfies

$$\|u(x, t)\|_{L^2} < \varepsilon \quad \text{for } t \in [0, \infty)$$

We note that we require the initial value $u(x, 0)$ to be in the Sobolev space $W^{1+s,p}$, $s > 2/p$, since it is known that the 2D Euler equations are globally in time well posed in this function space.
Definition 5 Any steady flow $U_0(x)$ for which the conditions of Definition 4 are violated is called nonlinearly unstable in $L^2$.

Observe that the open issues (in three dimensions) of nonuniqueness or nonexistence of solutions to [14]–[16] would, under Definition 5, be scenarios for instability.

Theorem 6 (Nonlinear instability for 2D Euler flows). Let $U_0(x) \in C^\infty(T^2)$ satisfy [12]. Let $\Lambda$ be the maximal Lyapunov exponent to the ODE $\dot{x} = U_0(x)$. Assume that there exists an eigenvalue $\lambda$ in the $L^2$ spectrum of the linear operator $L_E$ given by [15] with $\operatorname{Re} \lambda > \Lambda$. Then in the sense of Definition 5, $U_0(x)$ is Lyapunov unstable with respect to growth in the $L^2$-norm.

The proof of this result is given in Vishik and Friedlander (2003) and uses a so-called “bootstrap” argument whose origins can be found in references in that article. We remark that the above result gives nonlinear instability with respect to growth of the energy of a perturbation, which seems to be a physically reasonable measure of instability.

In order to apply Theorem 6 to a specific 2D flow it is necessary to know that the linear operator $L_E$ has an eigenvalue with $\operatorname{Re} \lambda > \Lambda$. As we have discussed, such knowledge is lacking for a generic flow $U_0(x)$. Once again, we turn to shear flows. Since, as we noted, $\Lambda = 0$ for shear flows, any shear profile for which unstable eigenvalues have been proved to exist provides an example of nonlinear instability with respect to growth in the energy.

We conclude with the observation that it is tempting to speculate that, given the complexity of flows in three dimensions, most, if not all, such inviscid flows are nonlinearly unstable. It is clear from the concept of the fluid Lyapunov exponent that stretching in a flow is associated with instabilities and there are more mechanisms for stretching in three, as opposed to two, dimensions. However, to date there are virtually no mathematical results for the nonlinear stability problem for fully 3D flows and many challenging issues remain entirely open.

Acknowledgments

The author is very grateful to IHES and ENS-Cachan for their hospitality during the writing of this paper. She thanks Misha Vishik for much helpful advice. The work is partially supported by NSF grant DMS-0202767.

See also: Compressible Flows: Mathematical Theory; Incompressible Euler Equations: Mathematical Theory; Magnetohydrodynamics; Newtonian Fluids and Thermohydraulics; Non-Newtonian Fluids; Topological Knot Theory and Macroscopic Physics.

Further Reading

Arnol’d VI and Khesin B (1998) Topological Methods in Hydrodynamics. New York: Springer.
Chandrasekhar S (1961) Hydrodynamic and Hydromagnetic Stability. Oxford: Oxford University Press.
Chossat P and Iooss G (1994) The Couette–Taylor Problem. Berlin: Springer.
Drazin PG and Reid WH (1981) Hydrodynamic Stability. Cambridge: Cambridge University Press.
Friedlander S, Pavlovic N, and Shvydkoy R (2006) Nonlinear instability for the Navier–Stokes equations (to appear in Communications in Mathematical Physics).
Friedlander S and Serre D (eds.) (2003) Handbook of Mathematical Fluid Dynamics, vol. 2. Amsterdam: Elsevier.
Friedlander S and Yudovich VI (1999) Instabilities in fluid motion. Notices of the American Mathematical Society 46: 1358–1367.
Gershuni GZ and Zhukovitiskii EM (1976) Convective Instability of Incompressible Fluids. Jerusalem: Keter Publishing House.
Godreche C and Manneville P (eds.) (1998) Hydrodynamics and Nonlinear Instabilities. Cambridge: Cambridge University Press.
Joseph DD (1976) Stability of Fluid Motions, 2 vols. Berlin: Springer.
Kato T and Fujita H (1962) On the nonstationary Navier–Stokes system. Rend. Sem. Mat. Univ. Padova 32: 243–260.
Lin CC (1967) The Theory of Hydrodynamic Stability. Cambridge: Cambridge University Press.
Meshalkin LD and Sinai IaG (1961) Investigation of stability for a system of equations describing the plane movement of an incompressible viscous liquid. App. Math. Mech. 25: 1700–1705.
Swinney H and Gollub L (eds.) (1985) Hydrodynamic Instabilities and Transition to Turbulence. New York: Springer.
Vishik MM (1996) Spectrum of small oscillations of an ideal fluid and Lyapunov exponents. Journal de Mathématiques Pures et Appliquées 75: 531–557.
Vishik M and Friedlander S (2003) Nonlinear instability in 2 dimensional ideal fluids: the case of a dominant eigenvalue. Communications in Mathematical Physics 243: 261–273.
Yudovich VI (1989, US translation) Linearization Method in Hydrodynamical Stability Theory, Transl. Math. Monog. vol. 74. Providence: American Mathematical Society.
Stability of Matter
J P Solovej, University of Copenhagen, Copenhagen, Denmark
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The theorem on stability of matter is one of the most celebrated results in mathematical physics. It is one of the rare cases where a result of such great importance to our understanding of the world around us appeared first in a completely rigorous formulation.

Issues of stability are, of course, extremely important in physics. One of the major triumphs of the theory of quantum mechanics is the explanation it gives of the stability of the hydrogen atom (and the complete description of its spectrum). Quantum mechanics or, more precisely, the uncertainty principle explains not only the stability of tiny microscopic objects, but also the stability of gigantic stellar objects such as white dwarfs. Chandrasekhar’s famous theory on the stability of white dwarfs required, however, not only the usual uncertainty principle, but also the Pauli exclusion principle for the fermionic electrons.

Whereas both the stability of atoms and the stability of white dwarfs were early triumphs of quantum mechanics, it, surprisingly, took nearly 40 years before the question of stability of everyday macroscopic objects was even raised (Fisher and Ruelle 1966). The rigorous answer to the question came shortly thereafter in what came to be known as the “theorem on stability of matter” proved first by Dyson and Lenard (1967).

Both the stability of hydrogen and the stability of white dwarfs simply mean that the total energy of the system cannot be arbitrarily negative. If there were no such lower bound to the energy, one would have a system from which it would be possible, in principle, to extract an infinite amount of energy. One often refers to this kind of stability as stability of the first kind.

Stability of matter is somewhat different. Stability of the first kind for atoms generalizes, as noted later, to objects of macroscopic size. The question arises as to how the lowest possible energy depends on the size or, more precisely, on the (macroscopic) number of particles in the object. Stability of matter in its precise mathematical formulation is the requirement that the lowest possible energy depends at most linearly on the number of particles. Put differently, the lowest possible energy calculated per particle cannot be arbitrarily negative as the number of particles increases. This is often referred to as “stability of the second kind.” If stability of the second kind does not hold, one would be able to extract an arbitrarily large amount of energy by adding a single atomic particle to a sufficiently large macroscopic object.

A perhaps more intuitive notion of stability is related to the volume occupied by a macroscopic object. More precisely, the volume of the object, when its total energy is close to the lowest possible energy, grows at least linearly in the number of particles. This volume dependence is a fairly simple consequence of stability of matter as formulated above.

The first mention of stability of the second kind for a charged system is perhaps by Onsager (1939), who studied a system of charged classical particles with a hard core and proved the stability of the second kind. The proof of stability of matter by Dyson and Lenard, which does not rely on any hard-core assumption, but rather on the properties of fermionic quantum particles, used results from Onsager’s paper.

The real relevance of the notion of stability of the second kind was first realized by Fisher and Ruelle (1966) in an attempt to understand the thermodynamic properties of matter and to give meaning to thermodynamic quantities such as the energy density (energy per volume). Stability of matter is a necessary ingredient in explaining the existence of thermodynamics, that is, that the energy per volume has a well-defined limit as the volume and number of particles tend to infinity, with the ratio (i.e., the density of particles) kept fixed. The existence of this limit is, however, not just a simple consequence of stability of matter. The existence of the thermodynamic limit for ordinary charged matter was proved rigorously by Lieb and Lebowitz (1972) using the result on stability of matter as an input.

After the original proof of stability of matter by Dyson and Lenard, several other proofs were given (see, e.g., reviews by Lieb (1976, 1990, 2004) for detailed references). Lieb and Thirring (1975) in particular presented an elegant and simple proof relying on an uncertainty principle for fermions. As explained in a later section, the best mathematical formulation of the usual uncertainty principle is in terms of a Sobolev inequality. The method of Lieb and Thirring is related to a Sobolev type inequality for antisymmetric functions. The Lieb–Thirring inequality is discussed later. The proof by Dyson
and Lenard gave a very poor bound on the lowest possible energy per particle. The proof by Lieb and Thirring gave a much more realistic bound on this quantity (see below). Two proofs of stability of matter will be sketched here. Both proofs rely on the Lieb–Thirring inequality. The first proof described is mathematically simple to explain, whereas the second proof (Lieb–Thirring) is based on the Thomas–Fermi theory. It is mathematically somewhat more involved but, from a physical point of view, more intuitive.

As in the case of white dwarfs, stability of matter relies on the fermionic property of electrons. Dyson (1967) proved that the stability of the second kind fails if we ignore the Pauli exclusion principle. In physics textbooks, the importance of the Pauli exclusion principle for the stability of white dwarfs is often emphasized. Its importance for the stability of everything around us is usually ignored.

As mentioned above the result on stability of matter appeared from the beginning as a completely

Since we consider only electrostatic interactions, the quantum Hamiltonian describing this system is

$$H_N = \sum_{i=1}^{N} T_i - \sum_{k=1}^{K} \sum_{i=1}^{N} \frac{z_k}{|x_i - r_k|} + \sum_{1 \le i < j \le N} \frac{1}{|x_i - x_j|} + \sum_{1 \le k < \ell \le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \qquad [1]$$

The kinetic energy operator $T_i$ is (half) the Laplacian in the variable $x_i$, i.e., $T_i = -(1/2)\Delta_i$. Atomic units are used, where not only the electron charge is 1, but the mass of the electron is also 1 and $\hbar = 1$. The unit of energy is then 2 Ry.

The Hamiltonian $H_N$ depends on the parameters $z = (z_1, \ldots, z_K)$ and $r = (r_1, \ldots, r_K)$. It acts on the Hilbert space of fermionic, that is, antisymmetric wave functions. More precisely, the fermionic Hilbert space is

$\bigwedge^{N}$
self-adjoint operator with the property that the second equality in [2] holds.

In the definition of $E^F$, we have minimized over all the positions $r$ of the nuclei. Even though the nuclear dynamics is not considered, one is still interested in finding the lowest possible energy independent of where the nuclei are located.

Theorem 1 (Stability of the first kind). For all $N$, $K$, and $z$, we have

$$E(z, N, K) > -\infty$$

Theorem 2 (Stability of matter). There exists a constant $C_{|z|} > 0$ depending only on $|z| = \max\{z_1, \ldots, z_K\}$ such that

$$E^F(z, N, K) \ge -C_{|z|} (N + K)$$

The constant $C_{|z|}$ bounds the binding energy per particle. In the case of hydrogen atoms, when $|z| = 1$, Dyson and Lenard arrived at a bound with $C_1 \sim 10^{14}$ Ry. Lieb and Thirring arrive at $C_1 \le 5 = 10$ Ry. Since the binding energy of a single hydrogen atom is 1 Ry (that is, 1/2 in the atomic units used here), $N = K$ widely separated hydrogen atoms have energy about $-N/2$ with $N + K = 2N$ particles, so it is easy to see that one must have $C_1 \ge 1/4$. Over the years, there have been some improvements on the estimated value of this constant in the theory of stability of matter.

That the Pauli exclusion principle, that is, the fermionic character of the electrons, is necessary for stability of matter is a consequence of the next theorem.

Theorem 3 (the $N^{5/3}$ law for bosons). If $N = K$ and $z_1 = \cdots = z_K = z > 0$, then there exist constants $C_\pm > 0$ depending on $z$ such that

$$-C_- N^{5/3} < E(z, N, N) < -C_+ N^{5/3}$$

It is the superlinear (exponent 5/3) behavior in $N$ of the upper bound that violates stability of matter. This upper bound was proved by Lieb (1979) by a fairly simple variational argument. The lower bound above, which shows that the exponent 5/3 is optimal, was proved by Dyson and Lenard (1968) in their original paper on stability of matter.

This theorem leaves open the possibility that the stability of matter could be recovered by introducing finite nuclear masses. That this, indeed, is not the case was proved by Dyson (1967) by a complicated variational argument based on the Bogolubov pair theory for superfluid helium. We now add the kinetic energy $\sum_{k=1}^{K} -(1/2)\Delta_{r_k}$ of the nuclei (assuming, for simplicity, that they have the same mass as the electrons) to the Hamiltonian $H_N$ and consider the case where $z_1 = z_2 = \cdots = z_K = 1$. We denote the ground-state energy over the space $L^2(\mathbb{R}^{3(N+K)})$ (ignoring spin) by $\tilde{E}(N, K)$. Then, Dyson proved that

$$\min_{N+K=M} \tilde{E}(N, K) \le -C M^{7/5}$$

for some constant $C > 0$. It was later shown by Conlon et al. (1988) that the exponent 7/5 is indeed optimal. Dyson (1967) made a conjecture for the precise asymptotic behavior of this energy. This conjecture, which was proved by Lieb and Solovej (2005) and Solovej (2004), is given in the next theorem.

Theorem 4 (Dyson’s 7/5-law for the charged Bose gas).

$$\lim_{M \to \infty} \min_{N+K=M} \frac{\tilde{E}(N, K)}{M^{7/5}} = \inf \left\{ \frac{1}{2} \int |\nabla \Phi|^2 - J \int \Phi^{5/2} \;\middle|\; \Phi \ge 0, \ \int \Phi^2 = 1 \right\} \qquad [3]$$

where

$$J = \left( \frac{4}{\pi} \right)^{3/4} \frac{\Gamma(1/2)\,\Gamma(3/4)}{5\,\Gamma(5/4)}$$

Generalizations of Stability of Matter

Over the years, generalizations of stability of matter including relativistic effects and interactions with the electromagnetic field have been attempted. Since the relativistic Dirac operator is not bounded below, we cannot simply replace the standard nonrelativistic kinetic energy operator $T_j = -(1/2)\Delta_j$ by the free Dirac operator.

Relativistic effects have been included by considering the (pseudo) relativistic kinetic energy

$$T_j^{\mathrm{Rel}} = \sqrt{-c^2 \Delta_j + c^4} - c^2$$

In the units used in this article, the physical value of the speed of light $c$ is approximately 137 or, more precisely, the reciprocal of the fine-structure constant $\alpha$.

For this relativistic kinetic energy, Lieb and Yau (1988) proved that stability of matter holds in the sense formulated in Theorem 2 if $\alpha$ ($= c^{-1}$) is small enough and $\max_j \{z_j\} \le 2/(\pi\alpha)$. It is known here that the value $2/(\pi\alpha)$ is the best possible, since it is so for the one-atom case. The one-atom case had been studied by Herbst. The corresponding case of a one-electron molecule was studied by Lieb and Daubechies. Less optimal results on the stability of matter with relativistic kinetic energy had been
obtained prior to the work of Lieb and Yau by Conlon and later by Fefferman and de la Llave. References to these works can be found in the work of Lieb and Yau (1988).

The relativistic kinetic energy $T_j^{\mathrm{Rel}}$ agrees with the free Dirac operator on the positive spectral subspace of the free Dirac operator (i.e., a subspace of $L^2(\mathbb{R}^3; \mathbb{C}^4)$). Therefore, the stability of matter follows if $T_j$ is replaced by the free Dirac operator and if one restricts to the Hilbert space obtained as in [2] but with $L^2(\mathbb{R}^3; \mathbb{C}^2)$ replaced by the positive spectral subspace of the free Dirac operator. This formulation is often referred to as the "no-pair" model. In the usual Dirac picture, the negative spectral subspace, the Dirac sea, is occupied. As long as one ignores pair creation, only the positive spectral subspace is available.

Magnetic fields may be included by considering the "magnetic kinetic energy"

\[ T_j^{\mathrm{Mag}} = \tfrac{1}{2}\left(-i\nabla_j - c^{-1}A(x_j)\right)^2 \]

It turns out that the stability of matter theorem (Theorem 2) holds for all magnetic vector potentials $A : \mathbb{R}^3 \to \mathbb{R}^3$ with a constant $C_{|z|}$ independent of $A$. This is, therefore, also the case if we consider the magnetic field (or rather the vector potential) as a dynamic variable and add the (positive) field energy

\[ U = \frac{1}{8\pi}\int_{\mathbb{R}^3} |\nabla \times A(x)|^2 \, dx \qquad [4] \]

to the Hamiltonian. The resulting Hamiltonian describes a charged spinless particle interacting with a classical electromagnetic field.

A more complicated situation is described by the "magnetic Pauli kinetic energy"

\[ T_j^{\mathrm{Pauli}} = \tfrac{1}{2}\left(\left(-i\nabla_j - c^{-1}A(x_j)\right)\cdot \sigma_j\right)^2 \]

where the coupling of the spin to the magnetic field is included through the vector of $2 \times 2$ Pauli matrices acting on the spin components of particle $j$, that is, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$, with

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

(1995). The latter result includes the physical value of $\alpha$. The fact that a bound on $\alpha$ is needed had been proved by Loss and Yau. Stability for a one-electron atom had been proved in this model by Fröhlich, Lieb, and Loss. The many-electron atom and the one-electron molecule had been studied by Lieb and Loss. Most relevant references may be found in the work of Lieb et al. (1995).

The possibility of quantizing the magnetic field has also been studied. In this case, one must introduce an ultraviolet cutoff in the momentum modes of the vector potential. Stability of matter in the resulting model of (ultraviolet cutoff) QED coupled to non-relativistic matter was proved by Fefferman et al., improving results of Bugliaro, Fröhlich, and Graf.

Finally, one may include both relativistic effects and electromagnetic interactions. Let us first discuss the case of classical electromagnetic fields. If instead of the Pauli kinetic energy one uses the Dirac operator with a magnetic vector potential then there would be no lower bound on the energy. But, as previously described, one can study a no-pair formulation of relativistic particles coupled to electromagnetic fields. The question arises which subspace of $L^2(\mathbb{R}^3; \mathbb{C}^4)$ one should restrict to (i.e., which subspace is filled and which one is available). There are two obvious choices. Either one should, as before, restrict to the positive spectral subspace of the free Dirac operator or one should restrict to the positive spectral subspace of the magnetic Dirac operator. It is proved by Lieb et al. (1997) that the former choice leads to instability, whereas stability of matter holds for the latter choice under some conditions on $\alpha$ and $\max_j\{z_j\}$. Stability requires that the field energy $U$ is included in the Hamiltonian. It then holds independently of the magnetic field. This final stability result also holds if the magnetic field is quantized with an ultraviolet cutoff as proved by Lieb and Loss (2002).

The no-pair model even with the ultraviolet cutoff quantized field is not fully relativistically invariant. As mentioned above, there is still no mathematical formulation of QED, a fully relativistically invariant model for quantum particles interacting with electromagnetic fields.
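As a small numerical aside on the Pauli matrices $\sigma_1, \sigma_2, \sigma_3$ introduced earlier in this section, the following sketch checks the two algebraic identities that sit behind the magnetic Pauli kinetic energy: the anticommutation relations, and $(v \cdot \sigma)^2 = |v|^2$ for a vector $v$ with ordinary (commuting) real components. The helper names (`sigma_dot`, `check_pauli_algebra`, `check_square`) are illustrative and not taken from the article.

```python
import numpy as np

# Pauli matrices as displayed in the text.
SIGMA = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
IDENTITY = np.eye(2, dtype=complex)


def sigma_dot(v):
    """Return the 2x2 matrix v . sigma for a real 3-vector v."""
    return sum(component * s for component, s in zip(v, SIGMA))


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


def check_pauli_algebra():
    """Check sigma_i sigma_j + sigma_j sigma_i = 2 delta_ij * identity."""
    for i in range(3):
        for j in range(3):
            expected = 2.0 * (i == j) * IDENTITY
            if not np.allclose(anticommutator(SIGMA[i], SIGMA[j]), expected):
                return False
    return True


def check_square(v):
    """Check (v . sigma)^2 = |v|^2 * identity for a real vector v."""
    m = sigma_dot(v)
    return np.allclose(m @ m, np.dot(v, v) * IDENTITY)
```

For a constant vector the square therefore carries no spin dependence. The components of $-i\nabla_j - c^{-1}A(x_j)$ do not commute when $A$ varies in space, so the same computation leaves an additional term coupling $\sigma$ to $\nabla \times A$; that extra term is what distinguishes the Pauli case from the spinless magnetic case.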
A mathematically more flexible formulation is provided by the classical Sobolev inequality, which states that for all square-integrable functions $\psi \in L^2(\mathbb{R}^3)$, one has

\[ \int |\nabla\psi|^2 \ge C_S \left(\int |\psi|^6\right)^{1/3} \qquad [5] \]

for $C_S > 0$. It follows from this inequality that for any attractive potential $V$, there is a lower bound on the energy expectation

\[
\begin{aligned}
\left(\psi, \left(-\tfrac{1}{2}\Delta - V\right)\psi\right)
&= \tfrac{1}{2}\int |\nabla\psi|^2 - \int V|\psi|^2 \\
&\ge \tfrac{1}{2} C_S \left(\int |\psi|^6\right)^{1/3}
 - \left(\int V^{5/2}\right)^{2/5} \left(\int |\psi|^2\right)^{2/5} \left(\int |\psi|^6\right)^{1/5} \\
&\ge -C \int V^{5/2} \int |\psi|^2
\end{aligned}
\]

for some $C > 0$. Thus, the lowest possible energy of one particle moving in the potential $V$ is bounded below by $-C\int V^{5/2}$. For $N$ (noninteracting) particles, the lower bound is $-CN\int V^{5/2}$. This holds whether or not the particles have spin. If, more generally, the potential can be written as $V = U + W$, $U, W \ge 0$, where $\int U^{5/2} < \infty$ and $W$ is bounded, $W \le \|W\|_\infty$, then the energy of $N$ noninteracting particles moving in the potential $V$ is bounded below by

\[ -NC\int U^{5/2} - N\|W\|_\infty \qquad [6] \]

For the Hamiltonian $H_N$ from [1], one can get a lower bound on the energy $E(z, N, K)$ by ignoring all the positive potential terms, that is, the last two sums in [1]. The remaining Hamiltonian describes $N$ independent particles moving in the potential

\[ V = \sum_{k=1}^{K} \frac{z_k}{|x - r_k|} = \sum_{k=1}^{K} (U_k + W_k) \]

where $U_k$ is the restriction of $z_k/|x - r_k|$ to the set $|x - r_k| < R$ for some $R > 0$ and $W_k$ is the restriction to the complementary set. Using [6], one can easily see that the energy expectation is bounded below by

\[ -C\,N K^{5/2} \max_k\{z_k\}^{5/2} R^{1/2} - N K \max_k\{z_k\} R^{-1} = -C'\,N K^2 \max_k\{z_k\}^2 \]

where we have made the optimal choice for $R \sim (K \max_k\{z_k\})^{-1}$.

This finite lower bound on the energy proves the stability of the first kind, but it clearly does not have the form required for the stability of the second kind.

The Proof of Stability of Matter

The proof of stability of the first kind presented in the previous section must be improved in two ways in order to conclude the stability of matter.

For fermions, it turns out that the lower bound in [6] can be improved in such a way that there is no factor $N$ in the first term. This is the content of the bound of Lieb and Thirring discussed in the introduction.

Theorem 5 (Lieb–Thirring inequality 1975). The sum of all the negative eigenvalues of the operator $-\frac{1}{2}\Delta - V(x)$ is bounded below by

\[ -L_{LT}\int V^{5/2} \]

for some constant $L_{LT} > 0$.

For $N$ noninteracting fermions moving in the potential $V$, the lowest possible energy is given by the sum of the $N$ lowest eigenvalues of the operator in the above theorem. Thus, the theorem gives a lower bound on this energy independently of $N$.

The second point where the argument from the previous section has to be improved is the control of the electrostatic energy. In the above discussion, all repulsive terms have simply been ignored. For stability of matter, a much more delicate bound is needed. Many versions of such bounds have been given going back to the work of Onsager (1939). Here, a result of Baxter (1980) will be used.

Theorem 6 (Baxter's correlation estimate). For all positions $x_1, \ldots, x_N, r_1, \ldots, r_K \in \mathbb{R}^3$ and all charges $z_1, \ldots, z_K > 0$, we have the pointwise inequality

\[ -\sum_{k=1}^{K}\sum_{i=1}^{N} \frac{z_k}{|x_i - r_k|} + \sum_{1 \le i < j \le N} \frac{1}{|x_i - x_j|} + \sum_{1 \le k < \ell \le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \ge -\sum_{i=1}^{N} V(x_i) \]

where $V(x) = (1 + 2\max_k\{z_k\}) \max_k\{|x - r_k|^{-1}\}$.

This theorem simply states that, for a lower bound, one can replace the full electrostatic Coulomb energy by the energy of independent electrons moving in the potential where they always see only the closest nuclei (with a modified charge). Baxter (1980) used probabilistic techniques to prove the inequality. An improved version of the inequality was given by Lieb and Yau (1988), with an analytic proof.
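The "optimal choice" of $R$ in the stability-of-the-first-kind bound above can be made explicit. The following is a sketch of the one-variable minimization; the abbreviations $Z = \max_k\{z_k\}$, $a = C K^{5/2} Z^{5/2}$ and $b = K Z$ are introduced here for convenience and do not appear in the original.

```latex
% Per-particle bound as a function of the cutoff radius R > 0:
%   f(R) = a R^{1/2} + b R^{-1},  a = C K^{5/2} Z^{5/2},  b = K Z.
\begin{align*}
f'(R) &= \tfrac{1}{2} a R^{-1/2} - b R^{-2} = 0
  \quad\Longrightarrow\quad
  R_* = \left(\frac{2b}{a}\right)^{2/3}
      = \left(\frac{2}{C}\right)^{2/3} (K Z)^{-1}, \\
f(R_*) &= a \left(\frac{2b}{a}\right)^{1/3} + b \left(\frac{2b}{a}\right)^{-2/3}
        = \left(2^{1/3} + 2^{-2/3}\right) a^{2/3} b^{1/3}
        = 3 \cdot 2^{-2/3}\, C^{2/3} K^{2} Z^{2}.
\end{align*}
% Hence the energy is bounded below by -C' N K^2 Z^2
% with C' = 3 \cdot 2^{-2/3} C^{2/3}.
```

This reproduces both the scaling $R \sim (K\max_k\{z_k\})^{-1}$ and the $N K^2 \max_k\{z_k\}^2$ form of the bound. The bound grows like $N K^2$ rather than linearly in $N + K$, which is why it gives stability of the first kind only.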
Similarly to the argument in the previous section, one can write $V(x) = U(x) + W(x)$, where $U$ is the restriction of $V$ to the set where $\min_k\{|x - r_k|\} < R$ for some $R > 0$ and $W$ is the restriction to the complementary set. It then follows from Baxter's correlation estimate and the Lieb–Thirring inequality that the lowest eigenvalue of the Hamiltonian $H_N$ on the fermionic Hilbert space $\mathcal{H}_N^F$ is bounded below by

\[
\begin{aligned}
-L_{LT}\int U^{5/2} - N\left(1 + 2\max_k\{z_k\}\right)R^{-1}
&\ge -C\left(1 + 2\max_k\{z_k\}\right)^{5/2} K R^{1/2} - N\left(1 + 2\max_k\{z_k\}\right)R^{-1} \\
&= -C'\left(1 + 2\max_k\{z_k\}\right)^{2}(N + K)
\end{aligned}
\]

where $R \sim (1 + 2\max_k\{z_k\})^{-1}$. This lower bound is linear in the total particle number $N + K$, as required by stability of matter.

From Thomas–Fermi Theory to Stability of Matter

In this final section, the proof of stability of matter by Lieb and Thirring (1975), where they use the Thomas–Fermi theory, is discussed briefly. First note that there is a dual formulation of the Lieb–Thirring inequality theorem (Theorem 5), which makes the connection to the Sobolev inequality [5] much more transparent.

Theorem 7 (Lieb–Thirring inequality as a kinetic energy bound). For any normalized antisymmetric (fermionic) wave function $\psi \in \mathcal{H}_N^F$ we have with $C_{LT} = \frac{3}{5}\left(\frac{2}{5}L_{LT}^{-1}\right)^{2/3}$ the following lower bound on the kinetic energy:

\[ \sum_{i=1}^{N} \frac{1}{2}\int_{\mathbb{R}^{3N}} \|\nabla_i \psi(x_1, \ldots, x_N)\|^2 \, dx_1 \cdots dx_N \ge C_{LT}\int_{\mathbb{R}^3} \rho(x)^{5/3} \, dx \]

where $\|\cdot\|$ is the norm in spin space ($\mathbb{C}^{2^N}$) and the one-electron density $\rho$ is given by

\[ \rho(x) = N\int_{\mathbb{R}^{3(N-1)}} \|\psi(x, x_2, \ldots, x_N)\|^2 \, dx_2 \cdots dx_N \]

One should compare the Lieb–Thirring kinetic energy bound with the expression $(3/10)(3\pi^2)^{2/3}\rho^{5/3}$ for the (thermodynamic) energy density of a free Fermi gas. One of the yet unproven conjectures is that the Lieb–Thirring bound holds with $C_{LT}$ replaced by the free Fermi constant $(3/10)(3\pi^2)^{2/3}$.

The idea in the Lieb–Thirring proof of stability of matter is to bound the energy below by an expression depending only on the one-electron density. Theorem 7 achieves this for the kinetic energy. What is missing is a lower bound on the electrostatic Coulomb energy depending only on the density. One can show (see Lieb (1976) or Lieb and Thirring (1975)) that, except for an error of the form "$-\,\mathrm{const}\; N$," the total energy expectation $(\psi, H_N\psi)$ may be bounded below by

\[ C_{LT}\int \rho^{5/3}(x)\,dx - \sum_{k=1}^{K}\int \frac{z_k\,\rho(x)}{|x - r_k|}\,dx + \frac{1}{2}\iint \frac{\rho(x)\rho(y)}{|x - y|}\,dx\,dy + \sum_{1 \le k < \ell \le K} \frac{z_k z_\ell}{|r_k - r_\ell|} \qquad [7] \]

Here, as before, $\rho$ is the one-electron density of the $N$-body wave function $\psi$. The expression [7] is the famous Thomas–Fermi energy functional. It has been studied rigorously by Lieb and Simon (1977). The Thomas–Fermi energy is the infimum of the expression [7] over all $\rho$ with $\int\rho = N$. One of the important results about the Thomas–Fermi energy is Teller's no-binding theorem (Lieb and Simon 1977). It states that in Thomas–Fermi theory atoms do not bind to form molecules. This means that the Thomas–Fermi energy is greater than the sum of the individual atomic energies (these energies in turn depend only on the nuclear charges).

The above Thomas–Fermi lower bound on the energy expectation $(\psi, H_N\psi)$ together with the no-binding theorem implies stability of matter.

The generalizations to stability of matter discussed earlier are proved in a way similar to the proof presented in the previous section.

See also: h-Pseudodifferential Operators and Applications; Quantum Statistical Mechanics: Overview; Schrödinger Operators.
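The constant $C_{LT}$ in Theorem 7 can be recovered from Theorem 5 by optimizing over the potential. The following is a sketch of that duality argument, not the authors' proof: $T_\psi$ is shorthand introduced here for the kinetic energy on the left side of Theorem 7, and spin multiplicities are assumed to be absorbed into $L_{LT}$.

```latex
% For any V >= 0, the N-fermion energy in the potential -V is bounded below
% by the sum of the negative eigenvalues of -(1/2)\Delta - V (Theorem 5):
\begin{align*}
T_\psi - \int V \rho \;\ge\; -L_{LT} \int V^{5/2}
\quad\Longrightarrow\quad
T_\psi \;\ge\; \int V \rho - L_{LT} \int V^{5/2}.
\end{align*}
% Maximize the right side pointwise in V:
%   \rho = (5/2) L_{LT} V^{3/2},  i.e.  V = ( 2\rho / (5 L_{LT}) )^{2/3}.
% For this V one has L_{LT} V^{5/2} = (2/5) \rho V, so
\begin{align*}
T_\psi \;\ge\; \frac{3}{5} \int \rho V
       \;=\; \frac{3}{5} \left( \frac{2}{5} L_{LT}^{-1} \right)^{2/3} \int \rho^{5/3}
       \;=\; C_{LT} \int \rho^{5/3}.
\end{align*}
```

The exponent pair $5/2$ and $5/3$ is exactly the pair of conjugate exponents produced by this Legendre-type optimization.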
Dyson FJ and Lenard A (1967) Stability of matter. I. Journal of Mathematical Physics 8: 423–434.
Dyson FJ and Lenard A (1968) Stability of matter. II. Journal of Mathematical Physics 9: 698–711.
Fefferman CL (1997) Stability of matter with magnetic fields. CRM Proceedings and Lecture Notes 12: 119–133.
Fefferman C, Fröhlich J, and Graf GM (1997) Stability of ultraviolet cutoff quantum electrodynamics with non-relativistic matter. Communications in Mathematical Physics 190: 309–330.
Fisher M and Ruelle D (1966) The stability of many-particle systems. Journal of Mathematical Physics 7: 260–270.
Lieb EH and Lebowitz JL (1972) The constitution of matter: existence of thermodynamics for systems composed of electrons and nuclei. Advances in Mathematics 9: 316–398.
Lieb EH and Thirring WE (1975) Bound for the kinetic energy of fermions which proves the stability of matter. Physical Review Letters 35: 687–689.
Lieb EH (1976) The stability of matter. Reviews of Modern Physics 48: 553–569.
Lieb EH and Simon B (1977) Thomas–Fermi theory of atoms, molecules and solids. Advances in Mathematics 23: 22–116.
Lieb EH (1979) The $N^{5/3}$ law for bosons. Physics Letters A 70: 71–73.
Lieb EH and Thirring WE (1984) Gravitational collapse in quantum mechanics with relativistic kinetic energy. Annals of Physics, NY 155: 494–512.
Lieb EH and Yau H-T (1987) The Chandrasekhar theory of stellar collapse as the limit of quantum mechanics. Communications in Mathematical Physics 112: 147–174.
Lieb EH and Yau H-T (1988) The stability and instability of relativistic matter. Communications in Mathematical Physics 118: 177–213.
Lieb EH (1990) The stability of matter: from atoms to stars. 1989 Gibbs Lecture. Bulletin of the American Mathematical Society 22: 1–49.
Lieb EH, Loss M, and Solovej JP (1995) Stability of matter in magnetic fields. Physical Review Letters 75: 985–989.
Lieb EH, Siedentop H, and Solovej JP (1997) Stability and instability of relativistic electrons in classical electromagnetic fields. Journal of Statistical Physics 89: 37–59.
Lieb EH and Loss M (2002) Stability of a model of relativistic quantum electrodynamics. Communications in Mathematical Physics 228: 561–588.
Lieb EH (2004) The stability of matter and quantum electrodynamics, Proceedings of the Heisenberg Symposium, Munich, Dec. 2001. In: Buschhorn G and Wess J (eds.) Fundamental Physics – Heisenberg and Beyond, pp. 53–68. Berlin: Springer (arXiv math-ph/0209034).
Lieb EH and Solovej JP Ground state energy of the two-component charged Bose gas. Communications in Mathematical Physics (to appear).
Onsager L (1939) Electrostatic interaction of molecules. Journal of Physical Chemistry 43: 189–196.
Solovej JP (2004) Upper bounds to the ground state energies of the one- and two-component charged Bose gases. Preprint.
(M, g) is called asymptotically simple (AS) if its conformal completion is smooth everywhere except $i^0$ and every null geodesic intersects S at precisely two endpoints. The AS assumption allows one to derive precise decay asymptotics for various curvature components of (M, g) along null geodesics which are referred to as strong peeling. The obvious questions raised by this procedure are: do there exist nontrivial AS spacetimes and, if so, do they contain a sufficiently large class of radiating spacetimes including those which appear in all relevant applications?

Clearly, the two problems mentioned above are related but not equivalent. Asymptotically simple spacetimes verify strong peeling, in particular they are globally asymptotically flat, that is, their curvature tensor tends to zero along all geodesics. Yet, it is perfectly possible that arbitrarily small perturbations of the Minkowski space are geodesically complete and globally asymptotically flat without being asymptotically simple.

The first global stability result of the Minkowski metric was proved by Christodoulou and Klainerman (1993). Their result proves sufficiently strong peeling estimates to allow one to derive the most important properties of gravitational radiation, such as the Bondi mass-law formula, but not as strong as those consistent with asymptotic simplicity. A companion result was proved by Klainerman and Nicolò (2003). Recently, Rodnianski and Lindblad (submitted) have obtained a surprising global stability of Minkowski result for the Einstein vacuum equations in the Lorentz gauge, which provides considerably weaker peeling than Christodoulou and Klainerman (1993) and Klainerman and Nicolò (1999) but is much easier to prove.

The goal of this article is to describe various results obtained since the early 1980s concerning both aspects of the problem of stability of Minkowski mentioned above.

Initial Data Formulation

The proper mathematical context for the stability of Minkowski is that provided by the initial-value problem for vacuum solutions to the Einstein field equations, that is, Ricci flat spacetimes (M, g), $R_{\alpha\beta} = 0$. We recall the following simple definitions:

Definition 1 An initial data set is a triplet $(\Sigma, g, k)$ consisting of a three-dimensional complete Riemannian manifold $(\Sigma, g)$ and a 2-covariant symmetric tensor $k$ on $\Sigma$ satisfying the constraint equations:

\[ \nabla^j k_{ij} - \nabla_i \,\mathrm{tr}_g k = 0; \qquad R - |k|^2 + (\mathrm{tr}\, k)^2 = 0 \]

where $\nabla$ is the covariant derivative, $R$ the scalar curvature of $(\Sigma, g)$. An initial data set is said to be maximal if $\mathrm{tr}_g k = 0$. This is a gauge condition which can be imposed without loss of generality. For simplicity we shall assume, throughout this article, that all initial data sets we consider are maximal.

Definition 2 An initial data set is said to be flat, or trivial, if it corresponds to a complete spacelike hypersurface in Minkowski space with its induced metric and second fundamental form. An initial data set is said to be asymptotically flat if there exists a system of coordinates $(x^1, x^2, x^3)$ defined in a neighborhood of infinity on $\Sigma$, with $r = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$, relative to which the metric $g$ approaches the Euclidean metric and $k$ approaches zero as $r \to \infty$. We assume, for simplicity, that $\Sigma$ has only one end. A neighborhood of infinity means the complement of a sufficiently large compact set on $\Sigma$.

Remark 1 Because of the constraint equations, the asymptotic behavior cannot be arbitrarily prescribed. A precise definition of asymptotic flatness has to involve the ADM mass of $(\Sigma, g)$. Taking the mass into account, we write

\[ g_{ij} = \left(1 + \frac{2M}{r}\right)\delta_{ij} + o(r^{-1}) \]

According to the positive-mass theorem, $M \ge 0$ and $M = 0$ implies that the initial data set is flat.

Definition 3 We say that an initial data set is strongly asymptotically flat if, for some $\gamma \ge 1/2$, relative to the coordinate system mentioned above,

\[ g_{ij} - \left(1 + \frac{2M}{r}\right)\delta_{ij} = O(r^{-1-\gamma}), \qquad k_{ij} = O(r^{-2-\gamma}) \]

as $r \to \infty$. Moreover, every derivative of $g - (1 + 2M/r)\delta$ and $k$ improves the asymptotics by one.

Definition 4 A Cauchy development of an initial data set $(\Sigma, g, k)$ is a spacetime manifold (M, g) satisfying the Einstein equations together with an embedding $i : \Sigma \to M$ such that $i_*(g)$, $i_*(k)$ are the first and second fundamental forms of $i(\Sigma)$ in M. A development is required to be also globally hyperbolic (which means that $i(\Sigma)$ is a Cauchy hypersurface, i.e., each causal curve in M intersects $i(\Sigma)$ at precisely one point) in order to assure the unique dependence of solutions on the data. A future development of $(\Sigma, g, k)$ consists of a globally hyperbolic manifold (M, g) with boundary, satisfying the Einstein equations, and an embedding $i$ as before which identifies $\Sigma$ to the boundary of M.
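As a short worked consequence of Definition 1 combined with the maximality condition $\mathrm{tr}_g k = 0$ adopted throughout this article, the constraint equations simplify as follows. This is a sketch that uses nothing beyond the constraint equations as stated.

```latex
% Constraint equations of Definition 1 with tr_g k = 0 substituted:
\begin{align*}
\mathrm{tr}_g k = 0 \quad\Longrightarrow\quad
  \nabla^j k_{ij} = 0, \qquad R = |k|^2 \ge 0 .
\end{align*}
```

In words, for maximal data the tensor $k$ is trace-free and divergence-free, and the scalar curvature of $(\Sigma, g)$ is nonnegative.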
16 Stability of Minkowski Space
The most primitive question asked about the initial-value problem, solved in a satisfactory way, for very large classes of evolution equations, is that of local existence and uniqueness of solutions. For the Einstein equations, this type of result was first established by Bruhat (1952) with the help of wave coordinates which allowed her to cast the Einstein equations in the form of a system of nonlinear wave equations to which one can apply the standard theory of symmetric hyperbolic systems. A stronger result, due to Hughes et al. (1976), states the following:

Theorem 1 Let $(\Sigma, g, k)$ be an initial data set for the Einstein vacuum equations. Assume that $\Sigma$ can be covered by a locally finite system of coordinate charts $U_\alpha$ related to each other by $C^1$ diffeomorphisms, such that $(g, k) \in H^s_{\mathrm{loc}}(U_\alpha) \times H^{s-1}_{\mathrm{loc}}(U_\alpha)$ with $s > 5/2$. Then there exists a unique (up to an isometry) globally hyperbolic, Hausdorff, development (M, g) for which $\Sigma$ is a Cauchy hypersurface.

In Theorem 1, the uniqueness up to an isometry requires additional regularity, $s > (5/2) + 1$, on the data. One has uniqueness, however, without additional regularity for the reduced Einstein equations system in wave coordinates.

Remark 2 In the case of nonlinear systems of differential equations, the local existence and uniqueness result leads, through a straightforward extension argument, to a global result. The formulation of the same type of result for the Einstein equations is a little more subtle; it was done by Bruhat and Geroch.

Theorem 2 (Bruhat–Geroch). For each smooth initial data set, there exists a unique maximal future development.

Thus, any construction, obtained by an evolutionary approach from a specific initial data set, must be necessarily contained in its maximal development. This may be said to solve the problem of global existence and uniqueness in general relativity. This is of course misleading: for equations defined on a fixed background, a global solution is one which exists for all time. In general relativity, however, we have no such background as the spacetime itself is the unknown. The connection with the classical meaning of a global solution requires a special discussion concerning the proper time of timelike geodesics; all further questions may be said to concern the qualitative properties of the maximal development. The central issue is that of existence and character of singularities. First, we can define a regular maximal development as one which is complete in the sense that all future timelike and null geodesics can be indefinitely extended relative to their proper time (or affine parameter in the case of null geodesics). If the initial data set is sufficiently far off from the trivial one, the corresponding future development may not be regular. This is the content of the following well-known theorem of Penrose (1979).

Theorem 3 If the manifold support of an initial data set is noncompact and contains a closed trapped surface, the corresponding maximal development is incomplete.

Stability of Minkowski Space

At the opposite end of Penrose's trapped-surface condition, the problem of stability of Minkowski space concerns the development of asymptotically flat initial data sets which are sufficiently close to the trivial one. Although it may be reasonable to expect the existence of a sufficiently small neighborhood of the trivial initial data set, in an appropriate topology, such that all corresponding developments are geodesically complete and globally asymptotically flat, such a result was by no means preordained. First, all known explicit asymptotically flat solutions of the Einstein vacuum equations, that is, the Kerr family, are singular. The attempts to construct nonexplicit, dynamic, solutions based on the conformal compactification method, due to Penrose (1962), were obstructed by the irregular behavior of initial data sets at $i^0$. (The problem is that the singularity at $i^0$ could propagate and thus destroy the expected smoothness of scri. This problem has been recently solved by constructing initial data sets which are precisely stationary at spacelike infinity.) Finally, the attempts, using partial differential equation hyperbolic methods, to extend the classical local result of Bruhat ran into the usual difficulties of establishing global in time existence to solutions of quasilinear hyperbolic systems. Indeed, as mentioned above, the wave coordinate gauge allows one to express the Einstein vacuum equations in the form of a system of nonlinear wave equations which does not satisfy Klainerman's null condition (the null condition (Klainerman 1983, 1986) identifies an important class of quasilinear systems of wave equations in four spacetime dimensions for which one can prove global in time existence of small solutions) and thus was thought to lead to formation of singularities. (The conjectured singular behavior of wave coordinates was thought, however, to reflect only the instability of the specific choice of gauge condition and not a true singularity of the equations.) According to Bruhat (personal communication),
Einstein himself had reasons to believe that the Minkowski space may not be stable. The problem of stability of the Minkowski space was first settled by Christodoulou and Klainerman (1990).

Theorem 4 (Global stability of Minkowski). Any asymptotically flat initial data set which is sufficiently close to the trivial one has a complete maximal future development.

A related result (Theorem 5) proved recently by Klainerman and Nicolò (2003a), solves the problem of radiation for arbitrary asymptotically flat initial data sets: a proof of the result below can also be derived, indirectly, from Christodoulou and Klainerman (1993). The proof of Klainerman and Nicolò (2003a) avoids, however, a great deal of the technical complications of this proof.

Theorem 5 For any, suitably defined, asymptotically flat initial data set $(\Sigma, g, k)$ with maximal future development (M, g), one can find a suitable domain $\Omega_0$ with compact closure in $\Sigma$ such that the boundary $D_0^+$ of its domain of influence $C^+(\Omega_0)$, or causal future of $\Omega_0$, in M has complete null geodesic generators with respect to the corresponding affine parameters.

Both the results of Christodoulou–Klainerman and Klainerman–Nicolò prove in fact a lot more than stated above. They provide a wealth of information concerning the behavior of null hypersurfaces as well as the rate at which various components of the Riemann curvature tensor approach zero along timelike and null geodesics. Here are more precise versions for Theorems 4 and 5.

Theorem 4 (Expanded version). Assume that $(\Sigma, g, k)$ is maximal and strongly asymptotically flat, $g - (1 + 2M/r)\delta = O(r^{-3/2})$, $k = O(r^{-5/2})$, plus an appropriate global smallness assumption. We can construct a complete spacetime (M, g) together with a maximal foliation $\Sigma_t$ given by the level hypersurfaces of a time function $t$ and a null foliation $C_u$, given by the level hypersurfaces of an outgoing optical function $u$ such that relative to an adapted null frame $e_4 = L$, $e_3 = \underline{L}$, and $(e_a)_{a=1,2}$ we have, along the null hypersurfaces $C_u$ the weak peeling decay,

\[
\begin{aligned}
\alpha_{ab} &= R(L, e_a, L, e_b) = O(r^{-7/2}) \\
2\beta_a &= R(L, \underline{L}, L, e_a) = O(r^{-7/2}) \\
4\rho &= R(L, \underline{L}, L, \underline{L}) = O(r^{-3}) \\
4\sigma &= {}^{*}R(L, \underline{L}, L, \underline{L}) = O(r^{-3}) \\
2\underline{\beta}_a &= R(\underline{L}, L, \underline{L}, e_a) = O(r^{-2}) \\
\underline{\alpha}_{ab} &= R(\underline{L}, e_a, \underline{L}, e_b) = O(r^{-1})
\end{aligned}
\qquad [1]
\]

as $r \to \infty$ with $4\pi r^2 = \mathrm{Area}(S_{t,u} = \Sigma_t \cap C_u)$. Also, $\rho - \bar{\rho},\ \sigma = O(r^{-7/2})$, with $\bar{\rho}$ the average of $\rho$ over the compact 2-surfaces $S_{t,u} = \Sigma_t \cap C_u$.

Three points are noteworthy. (1) The outgoing optical solution refers to the solution of the Eikonal equation $g^{\alpha\beta}\partial_\alpha u\,\partial_\beta u = 0$ whose level hypersurfaces $C_u$ intersect $\Sigma_t$ in expanding wave fronts for increasing $t$; (2) the generators $L$ and $\underline{L}$ are given by: $L = -g^{\alpha\beta}\partial_\alpha u\,\partial_\beta$, the null geodesic generator of $C_u$; $\underline{L}$ is then the null conjugate of $L$, perpendicular to $S_{t,u} = C_u \cap \Sigma_t$; and (3) $e_a$ is an orthonormal frame on $S_{t,u}$.

Theorem 5 (Expanded version). For any asymptotically flat initial data set $(\Sigma, g, k)$, verifying the same asymptotically flat conditions as in Theorem 4, one can find a suitable domain $\Omega_0$ with compact closure in $\Sigma$ such that its future domain of influence $C^+(\Omega_0)$ can be foliated by two null foliations; one outgoing $C(u)$ whose leaves are complete towards the future and the second one $\underline{C}(\underline{u})$ which is incoming. Let $S(u, \underline{u}) = C(u) \cap \underline{C}(\underline{u})$ denote the compact 2-surfaces of intersection between the outgoing and incoming null hypersurfaces, whose area is denoted by $4\pi r^2$, and consider an adapted null frame (that is, $L$ is the geodesic null generator of $C(u)$, $\underline{L}$ its null conjugate perpendicular to $S(u, \underline{u})$, and $e_a$ an orthonormal frame on $S(u, \underline{u})$) $L$, $\underline{L}$, $(e_a)_{a=1,2}$ at every point along an outgoing null cone $C(u)$. Then, denoting by $\alpha, \beta, \rho, \sigma, \underline{\beta}, \underline{\alpha}$ the null components of the curvature tensor, as in Theorem 4, we have, along $C(u)$ as $r \to \infty$,

\[ \alpha,\ \beta,\ \rho - \bar{\rho},\ \sigma = O(r^{-7/2}); \qquad \underline{\beta} = O(r^{-2}); \qquad \underline{\alpha} = O(r^{-1}) \qquad [2] \]

Observe that the rates of decay in [1] and [2] are the same. This will be referred to as weak peeling to distinguish from the rates of decay compatible with asymptotic simplicity, that is,

\[ \alpha = O(r^{-5}); \quad \beta = O(r^{-4}); \quad \rho, \sigma = O(r^{-3}); \quad \underline{\beta} = O(r^{-2}); \quad \underline{\alpha} = O(r^{-1}) \qquad [3] \]

to which we shall refer as strong peeling. We shall discuss more about these in the next section, following a review of a recent result of Lindblad–Rodnianski.

Even the expanded forms of Theorems 4 and 5 stated here do not exhaust all the information provided by global stability results in Christodoulou and Klainerman (1993) and Klainerman and Nicolò (2003a). Of particular interest are the main asymptotic conclusions which can be derived with the help of this information, the most
important being the Bondi mass-law formula which calculates the gravitational energy radiated at null infinity.

The simplest gauge condition in which the hyperbolic character of the Einstein field equations is easiest to exhibit is the wave coordinate condition; that is, one solves the Einstein vacuum equations relative to a special system of coordinates $x^\alpha$ which satisfy the equation $\Box_g x^\alpha = 0$. Then, denoting by $h_{\mu\nu} = g_{\mu\nu} - m_{\mu\nu}$ with $m$ the standard Minkowski metric, we obtain the following system of quasilinear wave equations in $h$,

\[ g^{\alpha\beta}\partial_\alpha\partial_\beta h_{\mu\nu} = N_{\mu\nu}(h, \partial h) \qquad [4] \]

with $N(h, \partial h)$ a nonlinear term, quadratic in $\partial h$, which can be exhibited explicitly. This form of the Einstein field equations, called the wave coordinates reduced Einstein equations, is precisely the one which allowed Bruhat (1952) to prove the first local existence result. Later, she also pointed out that the first nontrivial iterate of [4] behaves like $t^{-1}\log t$ rather than $t^{-1}$ as expected from the decay properties of solutions to $\Box h = 0$ in Minkowski space. This seems to indicate that the wave coordinates may not be suitable to study the long-time behavior of solutions to the Einstein field equations. This negative conclusion is also consistent with the fact that the eqns [4] do not verify Klainerman's null condition. (Klainerman's null condition (Klainerman 1983) is an algebraic condition on systems of nonlinear wave equations in (1 + 3) dimensions, similar to [4], which allows one to extend all local solutions, corresponding to small initial data, for all time. Moreover, these solutions decay at the rate of $t^{-1}$ as $t \to \infty$, consistent with the decay of free waves.) Lindblad and Rodnianski (2003) were able to isolate a new condition, which they call the weak null condition, verified by the wave coordinates reduced Einstein eqns [4], for which one can prove a small data global existence result consistent with the weaker decay rates suggested by the linear asymptotic analysis of Bruhat. Although the new result provides far weaker peeling information than [1], it is much simpler to prove than both Theorems 4 and 5. Moreover, the result seems to apply to a broader class of initial data than in Theorems 4 and 5. It remains an intriguing open problem whether the result of Lindblad–Rodnianski can be used as a stepping stone towards the more complete results of Theorems 4 and 5; that is, once a complete solution, with limited peeling, is known to exist, whether one can improve it to the weak peeling properties of [1], using the more precise techniques employed in Theorems 4 and 5 minus an important part of their technical complications.

Strong Peeling

The weak peeling properties [1] derived in Theorems 4 and 5 are consistent, from a scaling point of view, with the SAF condition. To derive strong peeling, see [3], one needs stronger asymptotic conditions. Recently, Corvino–Schoen and Chruściel and Delay (2002) have proved the existence of a large class of asymptotically flat initial data sets $(\Sigma, g, k)$ which are precisely stationary, $g = g_{\mathrm{kerr}}$, $k = k_{\mathrm{kerr}}$, outside a sufficiently large compact set (here $g_{\mathrm{kerr}}$, $k_{\mathrm{kerr}}$ are the initial data of a Kerr solution in standard coordinates). Moreover, they have proved the existence of sufficiently small solutions in this class which satisfy the requirements needed in Friedrich's conformal compactification method (see Friedrich (2002) and the references within) to produce asymptotically simple spacetimes, that is, spacetimes satisfying Penrose's regular compactification condition (Penrose 1962). Simultaneously, Klainerman and Nicolò (1999) were able to refine the methods used in the proof of Theorem 5 to prove the following:

Theorem 6 Assume that the initial data set $(\Sigma, g, k)$ of Theorem 5 satisfies the stronger assumption,

\[ g - g_S = O(r^{-(3/2+\gamma)}); \qquad k = O(r^{-(5/2+\gamma)}) \qquad [5] \]

for some $\gamma > 3/2$. Here

\[ g_S = \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta \, d\phi^2\right) \]

denotes the restriction of the Schwarzschild metric to $t = 0$ in standard polar coordinates. Then, in addition to the results reported in Theorem 5, we have the strong peeling estimates,

\[ \alpha = O(r^{-5}); \qquad \beta = O(r^{-4}) \]

as $r \to \infty$ along the outgoing null leaves $C(u)$. Moreover, the same conclusions hold true if [5] is replaced by

\[ g - g_{\mathrm{kerr}} = O(r^{-(3/2+\gamma)}); \qquad k - k_{\mathrm{kerr}} = O(r^{-(5/2+\gamma)}) \qquad [6] \]

for some $\gamma > 5/2$.

The first part of the theorem was proved in Klainerman and Nicolò (2003b). The second part is work in progress by Klainerman and Nicolò. The existence of initial conditions of the type required in Theorem 6 was established in the works of Corvino (2000) and Chruściel and Delay (2002).
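The Schwarzschild slice metric $g_S$ of Theorem 6 and the asymptotic form $g_{ij} = (1 + 2M/r)\delta_{ij} + o(r^{-1})$ used in the definition of asymptotic flatness are related by a change of radial coordinate. The following worked computation is a sketch using the standard isotropic radius, written here as $\tilde{r}$ (a symbol introduced for this illustration, not used in the article).

```latex
% Isotropic radial coordinate on the t = 0 Schwarzschild slice:
%   r = \tilde r \, ( 1 + M / (2 \tilde r) )^2 .
\begin{align*}
\frac{dr}{d\tilde r}
  &= \left(1 + \frac{M}{2\tilde r}\right)\left(1 - \frac{M}{2\tilde r}\right),
\qquad
1 - \frac{2M}{r}
  = \frac{\left(1 - \frac{M}{2\tilde r}\right)^{2}}{\left(1 + \frac{M}{2\tilde r}\right)^{2}}, \\
g_S
  &= \left(1 - \frac{2M}{r}\right)^{-1} dr^2 + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)
   = \left(1 + \frac{M}{2\tilde r}\right)^{4}
     \left( d\tilde r^2 + \tilde r^2 \left(d\theta^2 + \sin^2\theta\, d\phi^2\right) \right), \\
\left(1 + \frac{M}{2\tilde r}\right)^{4}
  &= 1 + \frac{2M}{\tilde r} + \frac{3M^2}{2\tilde r^{2}} + O(\tilde r^{-3}) .
\end{align*}
```

In the Cartesian coordinates associated with $\tilde{r}$, the slice metric is therefore $(1 + 2M/\tilde{r})\delta_{ij} + O(\tilde{r}^{-2})$, which has the form required of asymptotically flat data with ADM mass $M$.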
Open Problems Bruhat YCh and Geroch RP Global aspects of the Cauchy
problem in general relativity. Communications in Mathema-
tical Physics 14: 329–335.
Problem 1 Extend results of Theorems 5 and 6 to Christodoulou D (2000) The Global Initial Value Problem in
the whole domain of dependence, for small data sets. General Relativity, Lecture Given at the Ninth Marcel
Grossman Meeting. 2–8 July, Rome.
The results of Theorems 5 and 6 give a Christodoulou D and Klainerman S (1990) Asymptotic properties
satisfactory description of gravitational radiation of of linear field equations in Minkowski space. Communications
general classes of asymptotically flat initial data sets in Pure and Applied Mathematics XLIII: 137–199.
Christodoulou D and Klainerman S (1993) The global non-linear
outside the domain of dependence of a sufficiently stability of the Minkowski space. Princeton Mathematical Series.
large compact set. It would be desirable to extend Chruściel PT and Delay E (2002) Existence of non-trivial,
these results to the whole domain of dependence of vacuum, asymptotically simple spacetimes. Classical and
initial data sets which satisfy an additional global Quantum Gravity 19: L71–L79.
smallness assumption similar to that of Theorem 4. Corvino J (2000) Scalar curvature deformation and a gluing
construction for the Einstein constraint equations. Commu-
Problem 2 Is strong peeling (and implicitly asymp- nications in Mathematical Physics 214: 137–189.
totic simplicity) consistent with physically relevant Friedrich H (2002) Conformal Einstein evolution. In:
Frauendiener J and Friedrich H (eds.) The Conformal
data? If not, is weak peeling a good substitute? Structure of Space-Time, Lecture Notes in Physics. Springer.
Damour and Christodoulou (2000) have given Hughes T, Kato T, and Marsden J (1976) Well-posed quasilinear
second-order hyperbolic systems. Archives for Rational and
conclusive evidence that under no-incoming- Mechanical Analysis 63: 273–294.
radiation condition the future null infinity cannot Hawking SW and Ellis GFR (1973) The large scale structure of
be smooth. In fact, = O(r4 log r) as r ! 1. spacetime. Cambridge Monographs on Mathematical Physics.
Klainerman S (1983) Long-Time Behavior of Solutions to Non-
Problem 3 Can one weaken the AF conditions to linear Wave Equations Proc. ICM Warszawa.
include, for example, initial data sets with infinite Klainerman S (1986) The null condition and global existence to
ADM angular momentum? non-linear wave equations. Lectures in Applied Mathematics
23: 293–326.
It is reasonable to expect a global stability of Klainerman S and Nicolò F (1999) On local and global aspects of
Minkowski result for small initial data sets which the Cauchy problem in general relativity. Classical and
verify, for arbitrarily small
, Quantum Gravity 16: R73–R157.
Klainerman S and Nicolò F (2003a) The evolution problem in
M general relativity. Progress in Mathematical Physics, vol. 25.
g 1þ2 ¼ 0ðr1c Þ; k ¼ 0ðr2c Þ Boston: Birkhäuser.
r
Klainerman S and Nicolò F (2003b) Peeling properties of
One expects in this case that the top null components asymptotically flat solutions to the Einstein vacuum equations.
Classical and Quantum Gravity.
and decay only like O(r3 ) as r ! 1 along the
Lindblad H and Rodnianski I (2003) The weak null condition for
null hypersurfaces C(u). It seems that the methods of Einstein’s equations. Comptes Rendus Hebdomadaires des
Lindblad–Rodnianski can treat this case but can only Séances de l’Académie des Sciences, Paris 336(11): 901–906.
give decay estimates for , of the form O(r3þc ). Lindblad H, and Rodnianski I Global existence for the Einstein
vacuum equations in wave coordinates. Annals of Mathe-
Problem 4 Is the Kerr solution in the exterior of matics (submitted).
the black hole stable? Penrose R (1965) Zero rest-mass fields including gravitation:
asymptotic behaviour. Proceedings of the Royal Society of
The problem remains wide open. London. Series A 284: 159–203.
Penrose R (1979) Singularities and time asymmetry in general
See also: Asymptotic Structure and Conformal Infinity; relativity – an Einstein centenary survey. In: Hawking S and
Classical Groups and Homogeneous Spaces; Critical Israel W (eds.) Cambridge: Cambridge University Press.
Phenomena in Gravitational Collapse; Einstein Penrose R (1962) Zero rest mass fields including gravitation:
Equations: Exact Solutions; Geometric Analysis and asymptotic behavior. Proceedings of the Royal Society of
London. Series A 270: 159–203.
General Relativity; Supergravity.
Wald R (1984) General Relativity. University of Chicago Press.
Further Reading
Bruhat YC (1952) Théorème d’existence pour certains systèmes d’équations aux dérivées partielles non linéaires. Acta Mathematica 88: 141–225.
20 Stability Problems in Celestial Mechanics
attracting bodies (according to Newton’s law), inspired several mathematical and physical theories: from the development of perturbation methods to the discovery of chaotic systems, as attested by the masterly work of H Poincaré (1892). In particular, perturbation theory had relevant applications in celestial mechanics; for example, it led to the prediction of the existence of Neptune in the nineteenth century by J C Adams and U Leverrier

Table 1 Titius–Bode law and observed data

Planet     Index n (of [1])   Distance computed from [1] (AU)   Observed distance (AU)
Mercury    −∞                 0.4                               0.39
Venus      0                  0.7                               0.72
Earth      1                  1                                 1
Mars       2                  1.6                               1.52
Jupiter    4                  5.2                               5.2
Saturn     5                  10                                9.54
Uranus     6                  19.6                              19.19

In the case N = 2, one reduces to the two-body problem, which can be explicitly solved by means of Kepler’s laws as follows. Consider, for example, the Earth–Sun case: for negative values of the energy, the trajectory of the Earth is an ellipse with one focus coinciding with the barycenter, which can practically be identified with the Sun; the Earth–Sun radius vector describes equal areas in equal times; the cube of the semimajor axis is proportional to the square of the period of revolution.

Consider now an extension to the study of three bodies such that in the Keplerian approximation $P_2$ and $P_3$ move around $P_1$ and such that the semimajor axis of $P_2$ is greater than that of $P_3$ (an example is obtained identifying $P_1$ with the Sun, $P_2$ with Jupiter, and $P_3$ with an asteroid of the main belt). The three-body problem is described by [2] setting N = 3; a special case is given by the restricted three-body problem, which describes the evolution of a ‘‘zero-mass’’ body under the gravitational attraction exerted by an assigned two-body system. Setting N = 3 and $m_3 = 0$ in [2], the
equations governing the restricted three-body problem are given by

$$\frac{d^2 u^{(1)}}{dt^2} = -\frac{m_2\,(u^{(1)} - u^{(2)})}{|u^{(1)} - u^{(2)}|^3}$$

$$\frac{d^2 u^{(2)}}{dt^2} = -\frac{m_1\,(u^{(2)} - u^{(1)})}{|u^{(2)} - u^{(1)}|^3}$$

$$\frac{d^2 u^{(3)}}{dt^2} = -\frac{m_1\,(u^{(3)} - u^{(1)})}{|u^{(3)} - u^{(1)}|^3} - \frac{m_2\,(u^{(3)} - u^{(2)})}{|u^{(3)} - u^{(2)}|^3}$$

The first two equations concern the motion of the primaries $P_1$ and $P_2$ and they correspond to a Keplerian two-body problem, whose solution can be inserted in the equation for $u^{(3)}$, which becomes a periodically forced second-order equation. The restricted three-body problem can be conveniently described in terms of suitable action-angle coordinates, known as Delaunay variables. The present discussion is restricted to the planar case, namely we assume that the motion of the three bodies takes place on the same plane. The corresponding Delaunay variables, say $(L, G, \ell, \gamma) \in \mathbb{R}^2 \times \mathbb{T}^2$, are defined as follows (Szebehely 1967). Let $a$ and $e$ be, respectively, the semimajor axis and the eccentricity of the osculating orbit of $P_3$ and let $\mu = 1/m_1^{2/3}$; then Delaunay’s action variables are given by

$$L = \mu\sqrt{m_1 a}, \qquad G = L\sqrt{1 - e^2}$$

Next, introduce the angle variables: we denote by $\psi$ and $\varphi$ the longitudes of Jupiter and of the asteroid; let $\gamma$ be the argument of perihelion, namely the angle formed by the periapsis direction with a preassigned reference line, and let $u$ denote the eccentric anomaly, which can be defined through

$$\tan\frac{\varphi - \gamma}{2} = \sqrt{\frac{1+e}{1-e}}\,\tan\frac{u}{2} \tag*{[3]}$$

Let $\ell$ be the mean anomaly, which is related to the eccentric anomaly by means of Kepler’s equation

$$\ell = u - e\sin u \tag*{[4]}$$

Delaunay’s angle variables are represented by the mean anomaly $\ell$ and by the argument of perihelion $\gamma$. For completeness, it should be remarked that the distance $r$ between the minor body $P_3$ and the primary $P_1$ is related to the longitude and to the eccentric anomaly by means of the relations

$$r = \frac{a(1 - e^2)}{1 + e\cos(\varphi - \gamma)} = a(1 - e\cos u) \tag*{[5]}$$

In a reference frame centered at one of the primaries, say $P_1$, let $H = H(L, G, \ell, \gamma, \psi)$ denote the Hamiltonian function describing the planar problem; notice that $H(L, G, \ell, \gamma, \psi)$ has two degrees of freedom and an explicit time dependence through the longitude $\psi$ of $P_2$. If the primaries are assumed to move in circular orbits around their common center of mass, the Hamiltonian function reduces to two degrees of freedom, where a new variable $g$ is introduced as the difference between the argument of perihelion and the longitude of the primary. Normalizing the units of measure so that the distance between the primaries and the sum of their masses is unity, the Hamiltonian function $H$ describing the circular, planar, restricted three-body problem is given by

$$H(L, G, \ell, g) = -\frac{1}{2L^2} - G + \varepsilon F(L, G, \ell, g) \tag*{[6]}$$

where $\varepsilon = m_2$. The perturbing function takes the form

$$F = r\cos(f + g) - \frac{1}{\sqrt{1 + r^2 - 2r\cos(f + g)}}$$

where $f = \varphi - \gamma$ represents the true anomaly, namely the angle formed by the instantaneous orbital radius with the periapsis line. Notice that the quantities $r$ and $f$ are functions of the Delaunay variables through the relations [3]–[5]. As a consequence, one can expand the perturbing function in the form (Delaunay 1860)

$$F(L, G, \ell, g) = \sum_{j,k \ge 0} F_{jk}(\ell, g)\, e^j a^k$$

where $F_{jk}$ are cosine terms with arguments given by a linear combination of the variables $\ell$ and $g$. For example, the first few terms of the series development are given by the following expression:

$$\begin{aligned}
F(L, G, \ell, g) = {} & -1 - \frac{L^4}{4} - \frac{9}{64}L^8 + \frac{L^4 e}{2}\cos\ell \\
& - \left(\frac{3}{8}L^6 + \frac{15}{64}L^{10}\right)\cos(\ell + g) \\
& + \frac{9}{4}L^4 e\cos(\ell + 2g) \\
& - \left(\frac{3}{4}L^4 + \frac{5}{16}L^8\right)\cos(2\ell + 2g) \\
& - \frac{3}{4}L^4 e\cos(3\ell + 2g) \\
& - \left(\frac{5}{8}L^6 + \frac{35}{128}L^{10}\right)\cos(3\ell + 3g) \\
& - \frac{35}{64}L^8\cos(4\ell + 4g) \\
& - \frac{63}{128}L^{10}\cos(5\ell + 5g) + \cdots
\end{aligned} \tag*{[7]}$$
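The relations [3]–[5] and the truncated expansion [7] can be checked numerically. The following is a minimal sketch, not the authors' procedure: it solves Kepler's equation [4] by Newton iteration, evaluates the closed form of the perturbing function, and compares it with the truncated series. It assumes normalized units in which $a = L^2$, and it uses the signs of [7] obtained from a Legendre expansion of the closed form. All function names are illustrative.

```python
import math


def solve_kepler(mean_anomaly, e, tol=1e-14, max_iter=50):
    """Solve Kepler's equation  l = u - e*sin(u)  for the eccentric anomaly u."""
    u = mean_anomaly  # starting guess; adequate for moderate eccentricities
    for _ in range(max_iter):
        step = (u - e * math.sin(u) - mean_anomaly) / (1.0 - e * math.cos(u))
        u -= step
        if abs(step) < tol:
            break
    return u


def true_anomaly(u, e):
    """True anomaly f from tan(f/2) = sqrt((1+e)/(1-e)) * tan(u/2), relation [3]."""
    return 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(0.5 * u),
                            math.sqrt(1.0 - e) * math.cos(0.5 * u))


def perturbing_function(L, G, l, g):
    """Closed form F = r cos(f+g) - 1/sqrt(1 + r^2 - 2 r cos(f+g))."""
    a = L * L  # ASSUMPTION: normalized units in which L = sqrt(a)
    e = math.sqrt(max(0.0, 1.0 - (G / L) ** 2))  # from G = L sqrt(1 - e^2)
    u = solve_kepler(l, e)
    r = a * (1.0 - e * math.cos(u))              # relation [5]
    c = math.cos(true_anomaly(u, e) + g)
    return r * c - 1.0 / math.sqrt(1.0 + r * r - 2.0 * r * c)


def truncated_series(L, G, l, g):
    """Terms of [7]: up to L^10 at order e^0, and the listed terms linear in e."""
    e = math.sqrt(max(0.0, 1.0 - (G / L) ** 2))
    L4, L6, L8, L10 = L ** 4, L ** 6, L ** 8, L ** 10
    return (-1.0 - L4 / 4 - 9 * L8 / 64
            + 0.5 * L4 * e * math.cos(l)
            - (3 * L6 / 8 + 15 * L10 / 64) * math.cos(l + g)
            + 9 * L4 * e / 4 * math.cos(l + 2 * g)
            - (3 * L4 / 4 + 5 * L8 / 16) * math.cos(2 * l + 2 * g)
            - 3 * L4 * e / 4 * math.cos(3 * l + 2 * g)
            - (5 * L6 / 8 + 35 * L10 / 128) * math.cos(3 * l + 3 * g)
            - 35 * L8 / 64 * math.cos(4 * l + 4 * g)
            - 63 * L10 / 128 * math.cos(5 * l + 5 * g))


if __name__ == "__main__":
    L = 0.3  # semimajor axis a = 0.09 in units of the distance between the primaries
    for l, g in [(0.0, 0.0), (0.7, 0.4), (2.5, -1.1)]:
        exact = perturbing_function(L, L, l, g)  # G = L means a circular orbit
        approx = truncated_series(L, L, l, g)
        print(f"l={l:5.2f} g={g:5.2f} closed={exact:.8f} series={approx:.8f}")
```

For a circular osculating orbit (G = L) the difference between the two evaluations is of order $L^{12}$, which is the first power omitted in [7].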
dynamics of two bodies with comparable masses, moving in the gravitational field of a larger primary) and the existence of quasiperiodic motions has been proved for values of the ratio of semimajor axis less than 0.8 and for inclinations up to 1°.

Concrete estimates on the strength of the perturbation were given by M Hénon: in the context of the three-body problem, the application of the original version of Arnol’d’s theorem allows one to prove the existence of invariant tori for values of the perturbing parameter (representing the Jupiter–Sun mass ratio) $\le 10^{-333}$, while the implementation of Moser’s theorem provides an estimate of $\le 10^{-50}$. We remark that the astronomical value of the Jupiter–Sun mass ratio amounts to about $10^{-3}$, showing a relevant discrepancy between KAM results and physical measurements.

More recently, KAM estimates have been refined and adapted to the study of significant problems of celestial mechanics (Celletti and Chierchia 1995). Strong improvements have been obtained combining accurate estimates with a computer-assisted implementation, where the computer is used to perform long computations concerning the development of the perturbing series and the check of KAM estimates. The numerical errors are controlled through the implementation of a suitable technique, known as interval arithmetic. In the framework of the planar, circular, restricted three-body problem, the stability of some asteroids has been proved by A Celletti and L Chierchia for realistic values of the perturbing parameter (e.g., for $\varepsilon = 10^{-3}$). A suitable approximation of the disturbing function (namely, a finite truncation of the series development as in [7]) has been considered. The result relies on an implementation of a computer-assisted isoenergetic KAM theorem and on the following remark: in the four-dimensional phase space, on a fixed energy level the invariant two-dimensional surfaces separate the phase space, providing the stability of the actions for all motions trapped between any two invariant tori. Since the action variables are related to the semimajor axis and to the eccentricity of the orbit, one obtains that the elliptic elements remain close to their initial values.

A computer-assisted KAM theorem has been applied by A Giorgilli and U Locatelli to the planetary (Jupiter–Saturn) problem. Using a suitable secular approximation, it can be shown that this model admits two invariant tori, which bound the orbits corresponding to the initial data of Jupiter and Saturn.

Nekhoroshev Stability

A different approach in order to study the stability of nearly integrable systems is provided by Nekhoroshev’s theorem (see, e.g., Arnol’d et al. (1997)), which guarantees, under smallness requirements, the stability of the motions on an open set of initial conditions for exponentially long times. Consider a Hamiltonian function of the form

$$H(y, x) = h(y) + \varepsilon f(y, x), \qquad (y, x) \in B \times \mathbb{T}^n \tag*{[9]}$$

where $B$ is an open subset of $\mathbb{R}^n$. We assume that $h$ and $f$ are analytic functions and that the integrable Hamiltonian $h$ satisfies a geometric condition, called steepness. We remark that functions such as $h(L, G)$ in [8] satisfy the steepness condition. For sufficiently small values of $\varepsilon$, Nekhoroshev’s theorem states that any motion $(y(t), x(t))$ satisfying Hamilton’s equations associated with [9] is bounded for a finite (but exponentially long) time, that is,

$$\|y(t) - y(0)\| \le y_0\,\varepsilon^a, \qquad \text{for } |t| \le t_0\, e^{(\varepsilon_0/\varepsilon)^b}$$

where $y_0$, $t_0$, $\varepsilon_0$, $a$, and $b$ are suitable positive constants.

Nekhoroshev’s theorem can be conveniently applied to the three-body problem, where it provides a confinement of the action variables, representing the semimajor axis and the eccentricity of the osculating orbit. Interesting applications of Nekhoroshev’s theorem concern the investigation of the triangular Lagrangian points in the spatial, restricted three-body problem. (The Lagrangian points are five equilibrium positions of the planar, restricted three-body problem in a synodic reference frame, which rotates with the angular velocity of the primaries. Two of such positions are called triangular, since the configuration of the three bodies is an equilateral triangle in the orbital plane.) Effective estimates were developed by A Giorgilli and C Skokos, showing the existence of a stability region around the Lagrangian point $L_4$, large enough to include some known asteroids. In the same framework, the exponential stability was proven by G Benettin, F Fassò, and M Guzzo for all values of the mass-ratio parameter, except for a few values of the reduced mass up to $\simeq 0.038$.

Numerical Results

The stability of the N-body problem can be investigated by performing numerical integrations of the equations of motion. The dynamics of the outer planets of the solar system (from Jupiter to Pluto) has been explored by Sussman and Wisdom (1992) using a dedicated computer, the Digital Orrery. The integration of the equations of motion was performed over 845 million years; the results provided evidence of the stability of the major planets and a chaotic behavior of Pluto. An
alternative approach, based on an average of the equations of motion over fast angles, was adopted by Laskar (1995), where the perturbing function of the spatial problem was expanded up to the second order in the masses and up to the fifth powers of the eccentricity and the inclination. The dynamics of all planets (excluding Pluto) was investigated by means of frequency analysis over a time span ranging from −15 Gyr to +10 Gyr. The numerical integrations provided evidence of the regularity of the external planets (from Jupiter to Neptune), a moderate chaotic behavior of Venus and the Earth, and a marked chaotic dynamics of Mercury and Mars. The computations show that the inner solar system is chaotic, with a Lyapunov time of 5 Myr, thus preventing any prediction of the evolution over 100 Myr.

The Spin–Orbit Problem

The dynamics of the bodies of the solar system results from a combination of a motion of revolution around a primary body and a rotation about an internal axis. A simple mathematical model describing the spin–orbit interaction can be introduced as follows. Let $S$ be a triaxial ellipsoidal satellite, which moves about a central planet $P$. We denote by $T_{\rm rev}$ and $T_{\rm rot}$ the periods of revolution and rotation. A $p:q$ spin–orbit resonance occurs if

$$\frac{T_{\rm rev}}{T_{\rm rot}} = \frac{p}{q}, \qquad \text{for } p, q \in \mathbb{N},\ q \ne 0$$

Whenever $p = q = 1$, the satellite always points the same face to the host planet. Most of the evolved satellites or planets are trapped in a 1:1 resonance, with the only exception of Mercury, which is observed in a nearly 3:2 resonance. In order to introduce a simple mathematical model which describes the spin–orbit interaction, we assume that:

1. the satellite moves on a Keplerian orbit around the planet (with semimajor axis $a$ and eccentricity $e$);
2. the spin axis is perpendicular to the orbit plane;
3. the spin axis coincides with the shortest physical axis; and
4. dissipative effects as well as perturbations due to other planets or satellites are neglected.

We denote by $A < B < C$ the principal moments of inertia of the satellite and by $r$ and $f$, respectively, the instantaneous orbital radius and the true anomaly of the Keplerian orbit. Let $x$ be the angle between the longest axis of the ellipsoid and a preassigned reference line. From the standard Euler equations for a rigid body, the equation of motion in normalized units (i.e., assuming that the period of revolution is $2\pi$) takes the form

$$\ddot{x} + \frac{\varepsilon}{r^3}\sin(2x - 2f) = 0 \tag*{[10]}$$

where $\varepsilon \equiv \frac{3}{2}(B - A)/C$. This equation is integrable whenever $A = B$ or in the case of zero orbital eccentricity. Due to the assumption of Keplerian motion, both $r$ and $f$ are known functions of the time. Therefore, we can expand [10] in Fourier series as

$$\ddot{x} + \varepsilon \sum_{m \ne 0,\ m = -\infty}^{\infty} W\!\left(\frac{m}{2}, e\right)\sin(2x - mt) = 0 \tag*{[11]}$$

where the coefficients $W(m/2, e)$ decay as $W(m/2, e) \propto e^{|m - 2|}$. A further simplification of the model is obtained as follows. According to (4), we neglected the dissipative forces and perturbations due to other bodies. The most important contribution is due to the nonrigidity of the satellite, provoking a tidal torque caused by the internal friction. The size of the dissipative effects is significantly small compared to the gravitational terms. Therefore, we decide to retain in [11] only those terms which are of the same order or larger than the average effect of the tidal torque. The following equation results:

$$\ddot{x} + \varepsilon \sum_{m \ne 0,\ m = N_1}^{N_2} \tilde{W}\!\left(\frac{m}{2}, e\right)\sin(2x - mt) = 0 \tag*{[12]}$$

where $N_1$ and $N_2$ are suitable integers, which depend on the physical and orbital parameters of the satellite, while $\tilde{W}(m/2, e)$ are suitable truncations of the coefficients $W(m/2, e)$. We remark that eqn [12] can be derived from Hamilton’s equations associated with a one-dimensional, time-dependent, nearly integrable Hamiltonian function with perturbing parameter $\varepsilon$ and a trigonometric disturbing function.

Analytical Results

The phase space associated with [12] admits a Poincaré map showing a pendulum-like structure: the periodic orbits are surrounded by librational curves and the chaotic separatrix divides the librational regime from the region where rotational motions can take place. The three-dimensional phase space is separated by KAM rotational tori into invariant regions, providing a strong stability property for all motions confined between any pair of KAM rotational tori. Let us denote by $P(p/q)$ a periodic orbit associated with the $p:q$ resonance; in the context of the model associated with [12], the
stability of the periodic orbit $P(p/q)$ is obtained by showing the existence of two invariant tori $\mathcal{T}(\omega_1)$ and $\mathcal{T}(\omega_2)$ with $\omega_1 < p/q < \omega_2$. A refined computer-assisted KAM theorem has been implemented (Celletti 1990) with the aim of proving the existence of trapping invariant surfaces. Realistic estimates, in agreement with the physical values of the parameters (namely, the equatorial oblateness $\varepsilon$ and the eccentricity $e$), have been obtained in several examples of spin–orbit commensurabilities, like the 1:1 Moon–Earth interaction or the 3:2 Mercury–Sun resonance.

Concerning Nekhoroshev-type estimates, the classical D’Alembert problem has been studied by Biasco and Chierchia (2002). In particular, an equatorially symmetric oblate planet moving on a Keplerian orbit around a primary body has been investigated; the model does not assume any further constraint on the spin axis. Although the Hamiltonian describing this model is properly degenerate, it is shown that Nekhoroshev-like results apply to the D’Alembert problem in the proximity of a 1:1 resonance.

Numerical Results

The model introduced in [10]–[12] often represents an unrealistic simplification of the spin–orbit dynamics. In particular, assumption (1) implies that secular perturbations of the orbital parameters are neglected, whereas the hypothesis (2) corresponds to disregarding the spin–orbit obliquity, namely the angle formed by the rotational axis with the normal to the orbital plane. Due to the presence of an equatorial bulge, the gravitational attraction of the other bodies of the solar system induces a torque, resulting in a precessional motion. It is also important to take into account the changes of the obliquity angle, whose variations might affect the climatic behavior.

A realistic model for the precession and the variation of the obliquity has been presented by Laskar (1995). The numerical simulations and the frequency-map analysis show that the Earth’s obliquity is actually stable, although a large chaotic region is found in the interval between 60° and 90°. Since the present obliquity of the Earth amounts to 23.3°, the Earth is outside the dangerous region. An interesting simulation was performed to evaluate the role played by the Moon. Without the Moon, the extent of the chaotic region would greatly increase, eventually preventing the birth of evolved life. Among the inner planets, Mars’ obliquity shows a larger chaotic extent, which leads to variations from 0° to 60° in a few million years. On the contrary, the external planets do not show significant chaotic regions and their obliquities are essentially stable.

See also: Averaging Methods; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Gravitational N-Body Problem (Classical); Hamiltonian Systems: Stability and Instability Theory; KAM Theory and Celestial Mechanics; Multiscale Approaches; Stability Theory and KAM.

Further Reading

Arnol’d VI, Kozlov VV, and Neishtadt AI (1997) Mathematical Aspects of Classical and Celestial Mechanics. Berlin: Springer.
Biasco L and Chierchia L (2002) Nekhoroshev stability for the D’Alembert problem of celestial mechanics. Atti Accademia Nazionale Lincei Classe Scienze Fisiche Matematiche Naturali Rendiconti Lincei (9) Matematica Applicata 13(2): 85–89.
Celletti A (1990) Analysis of resonances in the spin–orbit problem in celestial mechanics: the synchronous resonance (Part I). Journal of Applied Mathematics and Physics (ZAMP) 41: 174–204.
Celletti A and Chierchia L (1995) A Constructive Theory of Lagrangian Tori and Computer-Assisted Applications. Dynamics Reported, New Series, vol. 4, pp. 60–129. Berlin: Springer.
Delaunay C (1860) Théorie du Mouvement de la Lune, Mémoires de l’Académie des Sciences 1, Tome XXVIII, Paris.
Laskar J (1995) Large scale chaos and marginal stability in the solar system. In: XIth ICMP Meeting (Paris, July 1994), pp. 75–120. Cambridge, MA: International Press.
Poincaré H (1892) Les Méthodes Nouvelles de la Mécanique Céleste. Paris: Gauthier-Villars.
Sussman GJ and Wisdom J (1992) Chaotic evolution of the solar system. Science 257: 56–62.
Szebehely V (1967) Theory of Orbits. New York: Academic Press.
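As a numerical companion to the spin–orbit equation [10], the sketch below integrates $\ddot{x} + (\varepsilon/r^3)\sin(2x - 2f) = 0$ with a classical fourth-order Runge–Kutta scheme. It is only an illustration under stated assumptions: the semimajor axis is normalized to 1, the mean motion is 1 (period $2\pi$), perihelion passage is placed at $t = 0$, and the function names and step count are hypothetical choices, not taken from the works cited above.

```python
import math


def solve_kepler(mean_anomaly, e, tol=1e-14, max_iter=50):
    """Eccentric anomaly u from Kepler's equation l = u - e*sin(u) (Newton iteration)."""
    u = mean_anomaly
    for _ in range(max_iter):
        step = (u - e * math.sin(u) - mean_anomaly) / (1.0 - e * math.cos(u))
        u -= step
        if abs(step) < tol:
            break
    return u


def orbit(t, e):
    """Orbital radius r and true anomaly f at time t (a = 1, mean motion = 1)."""
    u = solve_kepler(t, e)  # ASSUMPTION: mean anomaly equals t, perihelion at t = 0
    r = 1.0 - e * math.cos(u)
    f = 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(0.5 * u),
                         math.sqrt(1.0 - e) * math.cos(0.5 * u))
    return r, f


def acceleration(t, x, eps, e):
    """Right-hand side of [10] solved for the second derivative of x."""
    r, f = orbit(t, e)
    return -eps / r ** 3 * math.sin(2.0 * x - 2.0 * f)


def integrate(x, v, eps, e, t_end, n_steps):
    """Advance (x, dx/dt) from t = 0 to t_end with n_steps classical RK4 steps."""
    h = t_end / n_steps
    t = 0.0
    for _ in range(n_steps):
        k1x, k1v = v, acceleration(t, x, eps, e)
        k2x, k2v = v + 0.5 * h * k1v, acceleration(t + 0.5 * h, x + 0.5 * h * k1x, eps, e)
        k3x, k3v = v + 0.5 * h * k2v, acceleration(t + 0.5 * h, x + 0.5 * h * k2x, eps, e)
        k4x, k4v = v + h * k3v, acceleration(t + h, x + h * k3x, eps, e)
        x += h / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        v += h / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += h
    return x, v


if __name__ == "__main__":
    # Circular orbit: x(t) = t (synchronous rotation) solves [10] exactly.
    print(integrate(0.0, 1.0, eps=0.1, e=0.0, t_end=2.0 * math.pi, n_steps=2000))
    # Eccentric orbit: the same initial data now librate around the 1:1 resonance.
    print(integrate(0.0, 1.0, eps=0.1, e=0.05, t_end=2.0 * math.pi, n_steps=2000))
```

For zero eccentricity one has $r = 1$ and $f = t$, so the synchronous solution $x(t) = t$ makes the sine term vanish identically; this gives a simple consistency check of the integrator.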
26 Stability Theory and KAM
elapsed, an equidistribution of the energy among all modes (thermalization) might be expected. At least this behavior was expected by Fermi, Pasta, and Ulam, but it was not what they found numerically: on the contrary, all the energy seemed to remain associated with the modes close to the few initially excited ones.

At about the same time, Kolmogorov (1954) published a breakthrough paper going exactly in the opposite direction: if one perturbs an integrable system, under some mild conditions on the integrable part, most of the tori are preserved, although slightly deformed. A more precise statement is the following.

Theorem 1 Let an N-degree-of-freedom Hamiltonian system be described by an analytic Hamiltonian of the form

$$H(A, \boldsymbol{\alpha}) = H_0(A) + \varepsilon f(A, \boldsymbol{\alpha}) \tag*{[3]}$$

with $\varepsilon$ a real parameter (perturbation parameter), $f$ a $2\pi$-periodic function of each angle variable (potential or perturbation), and $H_0(A)$ satisfying the nondegeneracy condition $\det \partial_A^2 H_0(A) \ne 0$ (anisochrony condition). If $\boldsymbol{\omega} = \boldsymbol{\omega}(A) \equiv \partial_A H_0(A)$ is fixed to satisfy the Diophantine condition

$$|\boldsymbol{\omega} \cdot \boldsymbol{\nu}| > \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus 0 \tag*{[4]}$$

for some constants $C_0 > 0$ and $\tau > N - 1$ (here $|\boldsymbol{\nu}| = |\nu_1| + \cdots + |\nu_N|$ and $\cdot$ denotes the standard inner product: $\boldsymbol{\omega} \cdot \boldsymbol{\nu} = \omega_1\nu_1 + \cdots + \omega_N\nu_N$), then there is an invariant torus with rotation vector $\boldsymbol{\omega}$ for $\varepsilon$ small enough, say for $\varepsilon$ smaller than some value $\varepsilon_0$ depending on $C_0$ and $\tau$ (and on the function $f$).

By saying that there is an invariant torus with rotation vector $\boldsymbol{\omega}$, one means that there is an invariant surface in phase space on which, in suitable coordinates, the dynamics is the same as in the unperturbed case, and the conjugation (i.e., the change of variables which leads to such coordinates) is analytic in the angle variables and in the perturbation parameter. One also says that the torus of an integrable system ($\varepsilon = 0$) is preserved (or even persists) under a small perturbation.

Note that, a posteriori, this proves convergence of the perturbation series: however, a direct check of convergence was performed only recently by Eliasson (1996). Kolmogorov’s proof was based on a completely different idea, that is, by performing iteratively a sequence of canonical transformations (which are changes of coordinates preserving the Hamiltonian structure of the equations of motion) such that at each step the size of the perturbation is reduced. Of course, on the basis of Poincaré’s result, this iterative procedure cannot work for all initial conditions (e.g., when $\boldsymbol{\omega}$ does not satisfy [4]). The key point in Kolmogorov’s scheme is to fix the rotation vector $\boldsymbol{\omega}$ of the torus one is looking for, in such a way that the small divisors are controlled through the Diophantine condition [4] and the exponentially fast convergence of the algorithm.

New proofs and extensions of Kolmogorov’s theorem were given later by Arnol’d (1962) and by Moser (1962); hence, the acronym KAM to denote such a theorem. Arnol’d gave a more detailed (and slightly different) proof compared to the original one by Kolmogorov, and applied the result to the planar three-body problem, thus showing that physical applications of the theorem were possible. Moser, on the other hand, proposed a modified method using a technique introduced by Nash (which approximates smooth functions with analytical ones) to deal with the case of systems with finite smoothness.

For fixed small enough $\varepsilon$, the surviving invariant tori cover a large portion of the phase space, called the Kolmogorov set; the relative measure of the region of phase space which is not filled by such tori tends to zero at least as $\sqrt{\varepsilon}$ for $\varepsilon \to 0$. A system described by a Hamiltonian like [3] is then called a quasi-integrable Hamiltonian system.

The excluded region of phase space corresponds to the unperturbed tori which are destroyed by the perturbation: the rotation vectors of such tori are close to a resonance, that is, to a value $\boldsymbol{\omega}$ such that $\boldsymbol{\omega} \cdot \boldsymbol{\nu} = 0$ for some integer vector $\boldsymbol{\nu}$, and these are exactly the vectors which do not satisfy the Diophantine condition [4] for any value $C_0$. A subset of phase space of this kind is called a resonance region.

At first sight, this would seem to provide an explanation for the results found by Fermi, Pasta, and Ulam, but this is not quite the case. First, the threshold value $\varepsilon_0$ depends on $N$, and goes to zero very fast as $N \to \infty$ (in general as $N!^{-\beta}$ for some $\beta > 0$); however, the results of the numerical experiments apparently were insensitive to the number $N$ of oscillators. Second, the KAM theorem deals with maximal tori, that is, tori characterized by rotation vectors which have as many components as the number of degrees of freedom, while the rotation vectors of the numerical quasiperiodic solutions seem to involve just a small number of components.

Finally, as an extra problem, the validity of the nondegeneracy condition for the unperturbed
Hamiltonian is violated, because the unperturbed Hamiltonian is linear in the action variables (one says that the Hamiltonian is isochronous). Recently, Rink (2001), by continuing the work by Nishida, showed that in the Fermi–Pasta–Ulam problem it is possible to perform a canonical change of coordinates such that in the new variables the Hamiltonian becomes anisochronous: one uses part of the perturbation to remove isochrony. But the other two obstacles remain.

Lower-Dimensional Tori

A natural question is what happens to the invariant tori corresponding to rotation vectors which are not rationally independent, that is, vectors satisfying $n$ resonance conditions, such as $\boldsymbol{\omega} \cdot \boldsymbol{\nu}_i = 0$ for $n$ independent vectors $\boldsymbol{\nu}_1, \ldots, \boldsymbol{\nu}_n$, with $1 \le n \le N - 2$ (the case $n = N - 1$ corresponds to periodic orbits and is comparatively easy); for instance, one can take $\boldsymbol{\omega} = (\omega_1, \ldots, \omega_n, 0, \ldots, 0)$ and, by a suitable linear change of coordinates, one can always make the reduction to a case of this kind. In particular, one can ask if a result analogous to the KAM theorem holds for these tori. Such a problem for the model [3] has not been studied very widely in the literature. What has usually been considered is a system of $n$ rotators coupled with a system with $s = N - n$ degrees of freedom near an equilibrium point: then one calls normal coordinates the coordinates describing the latter, and the role of the parameter $\varepsilon$ is played by the size of the normal coordinates (if their initial conditions are chosen near the equilibrium point). In the absence of perturbation (i.e., for $\varepsilon = 0$), one has either hyperbolic or elliptic or, more generally, mixed tori, according to the nature of the equilibrium points: one refers to these tori as lower-dimensional tori, as they represent $n$-dimensional invariant surfaces in a system with $N$ degrees of freedom. Then one can study the preservation of such tori.

One can prove that, in such a case, at least if certain generic conditions are satisfied, in suitable coordinates, $n$ angles rotate with frequencies $\omega_1, \ldots, \omega_n$, respectively, while the remaining $N - n$ angles have to be fixed close to some values corresponding to the extremal points of the function obtained by averaging the potential over the rotating angles.

The case of hyperbolic tori is easier, as in the case of elliptic tori one has to exclude some values of $\varepsilon$ to avoid some further resonance conditions between the rotation vector $\boldsymbol{\omega}$ and the normal frequencies $\lambda_k$ (i.e., the eigenvalues of the linearized system corresponding to the normal coordinates), known as the first and second Mel’nikov conditions:

$$\begin{aligned}
|\boldsymbol{\omega} \cdot \boldsymbol{\nu} \pm \lambda_k| &> \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus 0,\ \forall\, 1 \le k \le s \\
|\boldsymbol{\omega} \cdot \boldsymbol{\nu} \pm \lambda_k \pm \lambda_{k'}| &> \frac{C_0}{|\boldsymbol{\nu}|^{\tau}} \qquad \forall \boldsymbol{\nu} \in \mathbb{Z}^N \setminus 0,\ \forall\, 1 \le k, k' \le s
\end{aligned} \tag*{[5]}$$

Such conditions appear, with the values of the normal frequencies slightly modified by terms depending on $\varepsilon$, at each iterative step, and at the end only for values of $\varepsilon$ belonging to some Cantor set one can have elliptic lower-dimensional tori.

The second Mel’nikov conditions are not really necessary, and in fact they can be relaxed as Bourgain (1994) has shown; this is an important fact, as it allows degenerate normal frequencies, which were forbidden in the previous works by Kuksin (1987), Eliasson (1988), and Pöschel (1989).

Similar results also apply in the case of lower-dimensional tori for the model [3], which represents sort of a degenerate situation, as the normal frequencies vanish for $\varepsilon = 0$. Again, one has to use part of the perturbation to remove the complete degeneracy of normal frequencies.

Quasiperiodic Solutions in Partial Differential Equations

For explaining the Fermi–Pasta–Ulam experiment, one has to deal with systems with arbitrarily many degrees of freedom. Hence, it is natural to investigate systems which have ab initio infinitely many degrees of freedom, such as the nonlinear wave equation, $u_{tt} - u_{xx} + V(x)u = \varphi(u)$, the nonlinear Schrödinger equation, $iu_t - u_{xx} + V(x)u = \varphi(u)$, the nonlinear Korteweg–de Vries equation $u_t + u_{xxx} - 6u_x u = \varphi(u)$, and other systems of nonlinear partial differential equations (PDEs); the continuum limit of the Fermi–Pasta–Ulam model gives indeed a nonlinear Korteweg–de Vries equation, as shown by Zabusky and Kruskal (1965). Here $(t, x) \in \mathbb{R} \times [0, \pi]^d$, if $d$ is the space dimension, and either periodic ($u(0, t) = u(\pi, t)$) or Dirichlet ($u(0, t) = u(\pi, t) = 0$) boundary conditions can be considered; $\varphi(u)$ is a function analytic in $u$ and starting from orders strictly higher than one, while $V(x)$ is an analytic function of $x$, depending on extra parameters $\xi_1, \ldots, \xi_n$. Such a function is introduced essentially for technical reasons, as we shall see that the eigenvalues $\lambda_k$ of the Sturm–Liouville operator $-\partial_x^2 + V(x)$ must satisfy some Diophantine conditions. If we set $V(x) = \mu \in \mathbb{R}$ in the nonlinear wave equation, we obtain the Klein–Gordon equation, which, in the particular case $\mu = 0$,
reduces to the string equation. Again, the role of the perturbation parameter is played by the size of the solution itself.

Small-amplitude periodic and quasiperiodic solutions for PDE systems have been extensively studied, among others, by Kuksin, Wayne, Craig, Pöschel, and Bourgain. Results for such systems read as follows. Consider for concreteness the one-dimensional nonlinear wave equation with Dirichlet boundary conditions and with $\varphi(u) = u^3 + O(u^5)$. When the nonlinear function $\varphi(u)$ is absent, any solution of the linear wave equation $u_{tt} - u_{xx} + V(x)u = 0$ is a superposition of either finitely or infinitely many periodic solutions with frequencies $\lambda_k$ determined by the function $V(x)$. Let $u_0(\boldsymbol{\omega} t, x)$ be a quasiperiodic solution of the linear wave equation with rotation vector $\boldsymbol{\omega} \in \mathbb{R}^n$, where $\omega_k = \lambda_{m_k}$, for some $n$-tuple $\{m_1, \ldots, m_n\}$. Then for $\varepsilon$ small enough there exists a subset $\Xi_\varepsilon$ of the space of parameters with large Lebesgue measure (more precisely, with complementary Lebesgue measure which tends to zero when $\varepsilon \to 0$) such that for all $\boldsymbol{\xi} = (\xi_1, \ldots, \xi_n) \in \Xi_\varepsilon$ there is a solution $u_\varepsilon(t, x)$ of the nonlinear wave equation and a rotation vector $\boldsymbol{\omega}_\varepsilon$ satisfying the conditions

$|u_\varepsilon(t, x) - \sqrt{\varepsilon}\, u_0(\boldsymbol{\omega}_\varepsilon t, x)| \leq C\varepsilon, \qquad |\boldsymbol{\omega}_\varepsilon - \boldsymbol{\omega}| < C\varepsilon$   [6]

for some positive constant $C$.

The case $n = 1$ (periodic solutions) is not as easy as the finite-dimensional case, because there are infinitely many normal frequencies, so that there are small divisor problems which for finite-dimensional systems appear only for $n \geq 2$.

For the nonlinear wave equation and the Schrödinger equation, if $n \geq 1$, one can take $V(x) = \mu$, but one needs $\mu \neq 0$; for $n > 1$, one can take $V(x) = \mu$, as one can perform a preliminary transformation leading to an equation in which a function depending on parameters naturally appears, as shown by Kuksin and Pöschel (1996). For $n = 1$, the case $\mu = 0$ has been very recently solved by Gentile et al. (2005).

Statements for more general situations can also be obtained, while extensions to space dimensions $d \geq 2$ are not trivial and have been obtained only recently by Bourgain (1998). The above result also holds if the number of components of the rotation vector is less than the number of parameters: one uses such parameters because one needs to impose some Diophantine conditions such as [5], now for all the frequencies $\lambda_k = \omega_k$, $k \notin \{m_1, \ldots, m_n\}$. Again, the second Mel’nikov conditions were shown by Bourgain to be unnecessary, and this is an essential ingredient for the higher-dimensional case.

Even if systems of the type considered above have been widely studied, they remain significantly different from a discrete system such as the chain of oscillators [1] for $N$ large enough (also in the limit $N \to \infty$), so that the results which have been found for PDE systems do not really provide an explanation for the numerical findings.

Also in the case of lower-dimensional tori for finite-dimensional systems the main problem is that, even if such tori exist, it is not clear what relevance they can have for the dynamics (a case in which hyperbolic tori play a role is considered later). An important feature of maximal tori is that they fill most of the phase space, a property which certainly does not hold for lower-dimensional tori, which lie outside the Kolmogorov set.

In the Fermi–Pasta–Ulam experiment, one considers initial conditions close to lower-dimensional tori; hence, an interesting problem is to study their stability, that is, how fast the trajectories starting from such initial conditions drift away.

Arnol’d Diffusion and Nekhoroshev’s Theorem

Consider again the maximal tori. For $N = 2$, the preservation of most of the invariant tori prevents the possibility of diffusion in phase space: the tori represent two-dimensional surfaces in a three-dimensional space (as dynamics occur on the level surfaces of the energy in a four-dimensional space), so that, if an initial condition is trapped in a gap between two tori, the corresponding trajectory remains confined forever between them. The situation is quite different for $N \geq 3$: in such a case, the tori do not represent a topological obstruction to diffusion any more.

That mechanisms of diffusion are really possible was shown by Arnol’d (1964). Because of the perturbation, lower-dimensional hyperbolic tori appear inside the resonance regions, with their stable and unstable manifolds (whiskers). It is possible that these manifolds of the same torus intersect with a nonvanishing angle (homoclinic angle); as a consequence, the angles between the stable and unstable manifolds of nearby tori (heteroclinic angles) can also be different from zero, and one can find a set of hyperbolic lower-dimensional tori such that the unstable manifold of each of them intersects the stable manifold of the torus next to it: one says that such tori form a transition chain of heteroclinic connections. Then there can be trajectories moving along such connections, producing at the end a drift of order 1 (in $\varepsilon$) in the action variables. Such a phenomenon is referred to as Arnol’d diffusion.
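The difference between $N = 2$ and $N \geq 3$ in the confinement argument above comes down to a dimension count. The following lines are a sketch of that standard counting, added here for illustration; the notation (codimension of a maximal torus inside an energy surface) is not taken from the sources cited in this article.

```latex
\dim(\text{phase space}) = 2N, \qquad
\dim(\text{energy surface}) = 2N - 1, \qquad
\dim(\text{maximal torus}) = N,
\\[4pt]
\operatorname{codim}(\text{torus in energy surface}) = (2N - 1) - N = N - 1 .
```

For $N = 2$ the codimension is 1, so each torus locally separates the three-dimensional energy surface into two sides and a trajectory between two tori cannot cross them. For $N \geq 3$ the codimension is at least 2, and a surface of codimension 2 or more cannot separate the space around it, just as a closed curve cannot separate three-dimensional space.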
Of course, diffusing trajectories should be located in the region of phase space where there are no invariant tori (hence, a very small region when $\varepsilon$ is small), but an important consequence is that, unlike what happens in the unperturbed case, not all motions are stable: in particular, the action variables can change by a large amount over long times.

Providing interesting examples of Hamiltonian systems in which Arnol’d diffusion can occur is not so easy: in fact, for the diffusion to really occur, one needs a lower bound on the homoclinic angles, and to evaluate these angles can be difficult. For instance, Arnol’d’s (1964) original example, which describes a system near a resonance region, is a two-parameter system given by

$\tfrac{1}{2}\left(A_1^2 + A_2^2\right) + A_3 + \mu(\cos\alpha_1 - 1) + \varepsilon\mu(\cos\alpha_1 - 1)(\sin\alpha_2 + \cos\alpha_3)$   [7]

and the angles can be proved to be bounded from below only by assuming that the perturbation parameter $\varepsilon$ is exponentially small with respect to the other parameter $\mu$, which in turn implies a situation not really convincing from a physical point of view. More generally, for all the examples which are discussed in literature, the relation with physics (as the d’Alembert problem on the possibility for a planet to change the inclination of the precession cone) is not obvious.

So the question naturally arises as to how fast such a mechanism of diffusion can be, and how relevant it is for practical purposes. A first answer is provided by a theorem of Nekhoroshev (1977), which states the following result.

Theorem 2 Suppose we have an $N$-degree-of-freedom quasi-integrable Hamiltonian system, where the unperturbed Hamiltonian satisfies some condition such as convexity (or a weaker one, known as steepness, which is rather involved to state in a concise way); for concreteness consider a function $H_0(A)$ in [2] which is quadratic in $A$. Then there are two positive constants $a$ and $b$ such that for times $t$ up to $O(\exp(\varepsilon^{-b}))$ the variations of the action variables cannot be larger than $O(\varepsilon^{a})$.

The constants $a$ and $b$ depend on $N$, and they tend to zero when $N \to \infty$; Lochak and Neĭshtadt (1992) and Pöschel (1993) found estimates $a = b = 1/2N$, which are probably in general optimal. Nekhoroshev’s theorem is usually stated in the form above, but it provides more information than that explicitly written: the trajectories, when trapped into a resonance region, drift away and come close to some invariant torus, and then they behave like quasiperiodic motions, up to very small corrections, for a long time, until they enter some other resonance region, and so on. Of course, for initial conditions on some invariant torus, the KAM theorem applies, but the new result concerns initial conditions which do not belong to any tori.

Nekhoroshev’s theorem gives a lower bound for the diffusion time, that is, the time required for a drift of order 1 to occur in the action variables. But, of course, an upper bound would also be desirable. The diffusion times are related to the amplitude of the homoclinic angles, which are very small (and difficult to estimate as stated before). The strongest results in this direction have been obtained with variational methods, for instance, by Bessi, Bernard, Berti, and Bolle: at best, for the diffusion time, one finds an estimate $O(\delta^{-1} \log \delta^{-1})$, if $\delta$ is the amplitude of the homoclinic angles (which in turn are exponentially small in some power of the perturbation parameter, as one can expect as a consequence of Nekhoroshev’s theorem).

Then one can imagine that the results of the Fermi–Pasta–Ulam experiment can also be interpreted in the light of Nekhoroshev’s theorem. The solutions one finds numerically certainly do not correspond to maximal tori, but one could expect that they could be solutions which appear to be quasiperiodic for long but finite times (e.g., moving near some lower-dimensional torus determined by the initial conditions), and that if one really insists on observing the time evolution for a very long time, then deviations from quasiperiodic behavior could be detected. This is an appealing interpretation, and the most recent numerical results make it plausible: Galgani and Giorgilli (2003) have found numerically that the energy, even if initially confined to the lower modes, tends to be shared among all the other modes, and the higher the modes, the longer the time needed for the energy to flow to them. Of course, this does not settle the problem, as there is still the issue of the large number of degrees of freedom; furthermore, for large $N$ the spacing between the frequencies is small, and they become almost degenerate. Hence, the problem still has to be considered as open.

Stability versus Chaos

The main problem in applying the KAM theorem seems to be related to the small value of the threshold $\varepsilon_0$ which is required. In general, when the size of the perturbation parameter is very large, the region of phase space filled with invariant tori decreases (or even disappears), and chaotic motions appear. By the latter, one generally means motions which are highly sensitive to the initial conditions: a small variation of the initial conditions produces a catastrophic variation in the corresponding trajectories (this is due to the appearance of strictly positive Lyapunov exponents).
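To get a feeling for how quickly the Nekhoroshev bounds of Theorem 2 degrade as the number of degrees of freedom grows, the exponents $a = b = 1/2N$ quoted above can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and all the unspecified constants hidden in the $O(\cdot)$ symbols are set to 1, which is an assumption and not part of the theorem.

```python
def nekhoroshev_exponents(n_dof):
    """Return (a, b) with a = b = 1/(2N), the estimates quoted in the text."""
    if n_dof < 1:
        raise ValueError("the number of degrees of freedom must be at least 1")
    a = b = 1.0 / (2.0 * n_dof)
    return a, b


def stability_estimates(eps, n_dof):
    """Return (log_time, drift) for a perturbation of size eps.

    log_time is eps**(-b), the natural logarithm of the time scale
    exp(eps**(-b)); the logarithm is returned to avoid overflow.
    drift is eps**a, the bound on the variation of the actions.
    Both O(.) constants are assumed equal to 1 for illustration.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps is expected to lie in (0, 1)")
    a, b = nekhoroshev_exponents(n_dof)
    return eps ** (-b), eps ** a


if __name__ == "__main__":
    for n in (2, 3, 9, 100):
        log_t, drift = stability_estimates(1e-6, n)
        print(f"N={n:4d}  log(time) ~ {log_t:8.3f}  action drift ~ {drift:.4f}")
```

With these conventions the logarithm of the stability time shrinks and the allowed drift grows as $N$ increases, which is the numerical face of the statement that $a$ and $b$ tend to zero when $N \to \infty$.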
A natural question is then how such a result as the KAM theorem is meaningful in physical situations: in other words, for which systems the KAM theorem can really apply.

One of the main motivations to study such a problem was to explain astronomical observations and to study the stability of the solar system. In order to apply the KAM theorem to the solar system, one has to interpret the gravitational forces between the planets as perturbations of a collection of several decoupled two-body systems (each planet with the Sun). One can write the masses of the planets as $\varepsilon m_i$, and $\varepsilon$ plays the role of the perturbation parameter. The corresponding Hamiltonian (after suitable reductions and scalings) is

$\sum_{i=1}^{N} \frac{p_i^2}{2\mu_i} - \sum_{i=1}^{N} \frac{m_i m_0}{|q_i|} + \varepsilon \sum_{1 \leq i < j \leq N} \frac{p_i \cdot p_j}{m_0} - \varepsilon \sum_{1 \leq i < j \leq N} \frac{m_i m_j}{|q_i - q_j|}$   [8]

where $i = 0$ corresponds to the Sun, while $i = 1, \ldots, N$ correspond to the planets (hence $N = 9$), $m_0$ is the mass of the Sun, and $\varepsilon\mu_i$ are the reduced masses ($\mu_i^{-1} = m_i^{-1} + \varepsilon m_0^{-1}$); here $(q_i, p_i) \in \mathbb{R}^3 \times \mathbb{R}^3$, $i = 0, \ldots, N$, the inner product in $p_i \cdot p_j$ is in $\mathbb{R}^3$, and the norm $|\cdot|$ is the Euclidean one.

A first difficulty is that the solar system is a properly degenerate system; that is, the unperturbed Hamiltonian does not depend on all the action variables. But such a degeneracy can be removed by performing a canonical change of coordinates which produces a new Hamiltonian in which the integrable part contains new terms of order $\varepsilon$ depending on all action variables and is nondegenerate, while the perturbation becomes of order $\varepsilon^2$: the angle variables corresponding to the actions not originally appearing in the unperturbed Hamiltonian are called the slow variables, while the others are called the fast variables.

However, a naive implementation of the KAM theorem, in general, even for simplified but still realistic systems, would provide a preposterously small value of the threshold $\varepsilon_0$. The problem could be just a computational one: in principle, a very refined estimate of the threshold could give a better value, so that it is very difficult to decide analytically if the real values of the planetary masses allow the solar system to fall inside the regime of applicability of the KAM theorem. Results in this direction have been obtained, but only for special situations: for instance, by considering the restricted planar circular three-body problem (which provides a simplified description of the system "Sun + Jupiter + asteroid"), Celletti and Chierchia (1997) found analytical bounds on the perturbation parameters comparable with the physical values. Of course, this is not at all conclusive for the general situation in which all planets (with their satellites and the asteroids) are considered together; in particular, it does not shed light on the problem of the stability of the entire solar system.

On the contrary, extensive numerical simulations performed by Laskar (starting from 1989) seem to suggest that the solar system is unstable. Deflections from the current orbits could be produced to such an extent that collisions between planets could not be avoided: Mercury could collide with Venus and be ejected from the solar system. An important issue is to consider the times over which such phenomena can occur. Laskar’s numerical simulations show that such times are less than the estimated age of the solar system, and that one can make accurate predictions for the planetary motions only for a finite amount of time (100 Myr). Furthermore, the assumed partial instability of the solar system has also been used by Laskar (2004) to explain some observed phenomena such as the evolution of the obliquity (which is the angle between equator and orbital plane) of some planets. Of course, these simulations have been carried out with several approximations, such as that of averaging over the fast variables, which allows one to use a large integration step in the numerical integration of the equations of motion for the resulting system. This is the so-called secular system introduced by Lagrange: instead of the fast motion of the planets, one describes the slow deformations of the planetary orbits (imagining the planets as regions of mass spread along their orbits).

See also: Averaging Methods; Bifurcation Theory; Billiards in Bounded Convex Domains; Diagrammatic Techniques in Perturbation Theory; Dynamical Systems and Thermodynamics; Gravitational N-Body Problem (Classical); Hamiltonian Systems: Stability and Instability Theory; Hamilton–Jacobi Equations and Dynamical Systems: Variational Aspects; Integrable Systems and Discrete Geometry; KAM Theory and Celestial Mechanics; Localization for Quasiperiodic Potentials; Stability Problems in Celestial Mechanics; Synchronization of Chaos; Weakly Coupled Oscillators.

Further Reading

Arnol’d VI (1963) Proof of a theorem of A. N. Kolmogorov on the preservation of conditionally periodic motions under a small perturbation of the Hamiltonian. Russian Mathematical Surveys 18: 85–192.
Arnol’d VI (1964) Instability of dynamical systems with many degrees of freedom. Soviet Mathematics Doklady 5: 581–585.
32 Standard Model of Particle Physics
Arnol’d VI, Kozlov VV, and Neĭshtadt AI (1988) Dynamical Systems III. Encyclopedia of Mathematical Sciences, vol. 3. Berlin: Springer.
Bourgain J (1994) Construction of quasi-periodic solutions for Hamiltonian perturbations of linear equations and applications to nonlinear PDE. International Mathematics Research Notices 1994(11): 475–497.
Bourgain J (1998) Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Annals of Mathematics 148: 363–439.
Bourgain J (2005) Green’s Function Estimates for Lattice Schrödinger Operators and Applications. Princeton: Princeton University Press.
Celletti A and Chierchia L (1997) On the stability of realistic three-body problems. Communications in Mathematical Physics 186: 413–449.
Eliasson LH (1996) Absolutely convergent series expansions for quasi periodic motions. Mathematical Physics Electronic Journal 2, paper 4 (electronic). Preprint 1988.
Eliasson LH (1988) Perturbations of stable invariant tori for Hamiltonian systems. Annali della Scuola Normale Superiore di Pisa 15: 115–147.
Ford J (1992) The Fermi–Pasta–Ulam problem: paradox turns discovery. Physics Reports 213: 271–310.
Galgani L and Giorgilli A (2003) Recent results on the Fermi–Pasta–Ulam problem. Rossiĭskaya Akademiya Nauk. Sankt-Peterburgskoe Otdelenie. Matematicheskiĭ Institut im. V. A. Steklova. Zapiski Nauchnykh Seminarov (POMI) 300: 145–154.
Gallavotti G (1986) Quasi-integrable mechanical systems. In: Phénomènes critiques, systèmes aléatoires, théories de jauge (Les Houches, 1984), pp. 539–624. Amsterdam: North-Holland.
Gallavotti G, Bonetto F, and Gentile G (2004) Aspects of Ergodic, Qualitative and Statistical Theory of Motion. Berlin: Springer.
Gentile G, Mastropietro V, and Procesi M (2005) Periodic solutions for completely resonant nonlinear wave equations with Dirichlet boundary conditions. Communications in Mathematical Physics 257: 319–362.
Kolmogorov AN (1954) On conservation of conditionally periodic motions for a small change in Hamilton’s function. Doklady Akademii Nauk SSSR 98: 527–530 (Russian).
Kuksin SB (1987) Hamiltonian perturbations of infinite-dimensional linear systems with imaginary spectrum. Functional Analysis and its Applications 21: 192–205.
Kuksin SB (1993) Nearly Integrable Infinite-Dimensional Hamiltonian Systems, Lecture Notes in Mathematics, vol. 1556. Berlin: Springer.
Kuksin SB and Pöschel J (1996) Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Annals of Mathematics 143: 149–179.
Laskar J (2004) Chaos in the solar system. In: Iagolnitzer D, Rivasseau V, and Zinn-Justin J (eds.) Proceedings of the International Conference on Theoretical Physics TH2002 (Paris, 2002). Basel: Birkhäuser.
Lochak P and Neĭshtadt AI (1992) Estimates of stability time for nearly integrable systems with a quasiconvex Hamiltonian. Chaos 2: 495–499.
Moser J (1962) On invariant curves of area-preserving mappings of an annulus. Nachrichten der Akademie der Wissenschaften in Göttingen 1962: 1–20.
Moser J (1973) Stable and Random Motions in Dynamical Systems, Annals of Mathematical Studies. Princeton: Princeton University Press.
Nekhorošev NN (1977) An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Russian Mathematical Surveys 32: 1–65.
Pöschel J (1989) On elliptic lower-dimensional tori in Hamiltonian systems. Mathematische Zeitschrift 202: 559–608.
Pöschel J (1993) Nekhoroshev estimates for quasi-convex Hamiltonian systems. Mathematische Zeitschrift 213: 187–216.
Rink B (2001) Symmetry and resonance in periodic FPU chains. Communications in Mathematical Physics 218: 665–685.
Zabusky NJ and Kruskal MD (1965) Interaction of "solitons" in a collisionless plasma and the recurrence of initial states. Physical Review Letters 15: 240–243.
conserved currents and charges. There are eight strong charges, called "color" charges and four electroweak charges (which, in particular, include the electric charge). The commutators of these

the group SU(3) with color triplet quark matter fields fixes the QCD Lagrangian density to be
$Q^2$: $\alpha_s(Q^2)$ decreases for increasing $Q^2$ and vanishes asymptotically. Thus, the QCD interaction becomes very weak in processes with large $Q^2$, called hard processes or deep inelastic processes (i.e., with a final-state distribution of momenta and a particle content very different from that in the initial state). One can prove that in four spacetime dimensions all gauge theories based on a noncommuting group of symmetry are asymptotically free, and conversely. The effective coupling decreases very slowly at large momenta with the inverse logarithm of $Q^2$: $\alpha_s(Q^2) = 1/[b \log(Q^2/\Lambda^2)]$, where $b$ is a known constant and $\Lambda$ is an energy of the order of a few hundred MeV. Since in quantum mechanics large momenta imply short wavelengths, the result is that at short distances the potential between two color charges is similar to the Coulomb potential, that is, proportional to $\alpha_s(r)/r$, with an effective color charge which is small at short distances. On the contrary the interaction strength becomes large at large distances or small transferred momenta, of order $Q \lesssim \Lambda$. In fact, the observed hadrons are tightly bound composite states of quarks, with compensating color charges so that they are overall neutral in color. The property of confinement is the impossibility of separating color charges, like individual quarks and gluons. This is because in QCD the interaction potential between color charges increases, at long distances, linearly in $r$. When we try to separate the quark and the antiquark that form a color-neutral meson the interaction energy grows until pairs of quarks and antiquarks are created from the vacuum and new neutral mesons are coalesced instead of free quarks. For example, consider the process $e^+ e^- \to q\bar{q}$ at large center-of-mass energies. The final-state quark and antiquark have large energies, so they separate in opposite directions very fast. But the color-confinement forces create new pairs in between them. Two back-to-back jets of colorless hadrons are observed with a number of slow pions that make the exact separation of the two jets impossible. In some cases, a third well-separated jet of hadrons is also observed: these events correspond to the radiation of an energetic gluon from the parent quark–antiquark pair.

Electroweak Interactions

We split the electroweak Lagrangian into two parts by separating the Higgs boson couplings:

$\mathcal{L} = \mathcal{L}_{\rm symm} + \mathcal{L}_{\rm Higgs}$   [6]

We start by specifying $\mathcal{L}_{\rm symm}$, which involves only gauge bosons and fermions (a sum over all flavors of quarks and leptons, generally indicated by $\psi$, is understood):

$\mathcal{L}_{\rm symm} = -\frac{1}{4} \sum_{A=1}^{3} F^A_{\mu\nu} F^{A\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu} + \bar{\psi}_L i\gamma^\mu D_\mu \psi_L + \bar{\psi}_R i\gamma^\mu D_\mu \psi_R$   [7]

This is the Yang–Mills Lagrangian for the gauge group $SU(2) \otimes U(1)$ with fermion matter fields. Here

$B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$
$F^A_{\mu\nu} = \partial_\mu W^A_\nu - \partial_\nu W^A_\mu - g\,\epsilon_{ABC} W^B_\mu W^C_\nu$   [8]

are the gauge antisymmetric tensors constructed out of the gauge field $B_\mu$ associated with $U(1)$, and $W^A_\mu$ corresponding to the three $SU(2)$ generators; $\epsilon_{ABC}$ are the group structure constants (see eqn [11]), which, for $SU(2)$, coincide with the totally antisymmetric Levi-Civita tensor (recall the familiar angular-momentum commutators).

The fermion fields are described through their left- and right-hand components:

$\psi_{L,R} = [(1 \mp \gamma_5)/2]\,\psi, \qquad \bar{\psi}_{L,R} = \bar{\psi}\,[(1 \pm \gamma_5)/2]$   [9]

Note that, as given in eqn [9],

$\bar{\psi}_L = \psi_L^\dagger \gamma_0 = \psi^\dagger [(1 - \gamma_5)/2]\gamma_0 = \bar{\psi}\,[\gamma_0 (1 - \gamma_5)/2\, \gamma_0] = \bar{\psi}\,[(1 + \gamma_5)/2]$

The matrices $P_\pm = (1 \pm \gamma_5)/2$ are projectors. They satisfy the relations $P_\pm P_\pm = P_\pm$, $P_\pm P_\mp = 0$, $P_+ + P_- = 1$.

The standard electroweak theory is a chiral theory, in the sense that $\psi_L$ and $\psi_R$ behave differently under the gauge group. In particular, all $\psi_R$ are singlets and all $\psi_L$ are doublets in the minimal SM (MSM). Thus, mass terms for fermions (of the form $\bar{\psi}_L \psi_R + {\rm h.c.}$) are forbidden in the symmetric limit. Fermion masses are introduced, together with $W^\pm$ and $Z$ masses, by the mechanism of symmetry breaking. The covariant derivatives $D_\mu \psi_{L,R}$ are explicitly given by

$D_\mu \psi_{L,R} = \left[ \partial_\mu + ig \sum_{A=1}^{3} t^A_{L,R} W^A_\mu + ig' \tfrac{1}{2} Y_{L,R} B_\mu \right] \psi_{L,R}$   [10]

where $t^A_{L,R}$ and $\tfrac{1}{2}Y_{L,R}$ are the $SU(2)$ and $U(1)$ generators, respectively, in the reducible representations $\psi_{L,R}$. The commutation relations of the $SU(2)$ generators are given by

$[t^A_L, t^B_L] = i\,\epsilon_{ABC}\, t^C_L \quad \text{and} \quad [t^A_R, t^B_R] = i\,\epsilon_{ABC}\, t^C_R$   [11]

We use the normalization $\mathrm{tr}[t^A t^B] = \tfrac{1}{2}\delta_{AB}$ in the fundamental representation of $SU(2)$. The electric
charge generator $Q$ (in units of $e$, the positron charge) is given by

$Q = t^3_L + \tfrac{1}{2} Y_L = t^3_R + \tfrac{1}{2} Y_R$   [12]

All fermion couplings to the gauge bosons can be derived directly from eqns [7] and [10]. The charged-current (CC) couplings are the simplest. From

$g\left(t^1 W^1_\mu + t^2 W^2_\mu\right) = g\left\{ \left[(t^1 + i t^2)/\sqrt{2}\right] \left[(W^1_\mu - i W^2_\mu)/\sqrt{2}\right] + {\rm h.c.} \right\} = g\left\{ \left[t^+ W^-_\mu/\sqrt{2}\right] + {\rm h.c.} \right\}$   [13]

where $t^\pm = t^1 \pm i t^2$ and $W^\pm = (W^1 \pm i W^2)/\sqrt{2}$, we obtain the vertex

$V_{\bar{\psi}\psi W} = g\,\bar{\psi}\gamma_\mu \left[ (t^+_L/\sqrt{2})(1 - \gamma_5)/2 + (t^+_R/\sqrt{2})(1 + \gamma_5)/2 \right] \psi\, W^-_\mu + {\rm h.c.}$   [14]

In the neutral-current (NC) sector, the photon $A_\mu$ and the mediator $Z_\mu$ of the weak NC are orthogonal and normalized linear combinations of $B_\mu$ and $W^3_\mu$:

$A_\mu = \cos\theta_W B_\mu + \sin\theta_W W^3_\mu$
$Z_\mu = -\sin\theta_W B_\mu + \cos\theta_W W^3_\mu$   [15]

Equations [15] define the weak mixing angle $\theta_W$. The photon is characterized by equal couplings to left and right fermions with a strength equal to the electric charge. Recalling eqn [12] for the charge matrix $Q$, we immediately obtain

$g \sin\theta_W = g' \cos\theta_W = e$   [16]

or, equivalently,

$\tan\theta_W = g'/g$   [17]

Once $\theta_W$ has been fixed by the photon couplings, it is a simple matter of algebra to derive the $Z$ couplings, with the result

$\Gamma_{\bar{\psi}\psi Z} = \frac{g}{2\cos\theta_W}\, \bar{\psi}\gamma_\mu \left[ t^3_L (1 - \gamma_5) + t^3_R (1 + \gamma_5) - 2Q \sin^2\theta_W \right] \psi\, Z^\mu$   [18]

where $\Gamma_{\bar{\psi}\psi Z}$ is a notation for the vertex. In the MSM, $t^3_R = 0$ and $t^3_L = \pm 1/2$. Note that the CC and NC weak couplings do not conserve P (parity) and C (charge conjugation).

In order to derive the effective four-fermion interactions that are equivalent, at low energies, to the CC and NC couplings given in eqns [14] and [18], we anticipate that large masses, as experimentally observed, are provided for $W^\pm$ and $Z$ by $\mathcal{L}_{\rm Higgs}$. For left–left CC couplings, when the momentum transfer squared can be neglected with respect to $m_W^2$ in the propagator of Born diagrams with single $W$ exchange, from eqn [14], we can write

$\mathcal{L}^{\rm CC}_{\rm eff} \simeq \left(g^2/8m_W^2\right) \left[\bar{\psi}\gamma_\mu (1 - \gamma_5) t^+_L \psi\right] \left[\bar{\psi}\gamma^\mu (1 - \gamma_5) t^-_L \psi\right]$   [19]

By specializing further in the case of doublet fields such as $\nu_e$–$e^-$ or $\nu_\mu$–$\mu^-$, we obtain the tree-level relation of $g$ with the Fermi coupling constant $G_F$ measured from $\mu$ decay ($G_F = 1.16639(2) \times 10^{-5}\ {\rm GeV}^{-2}$):

$G_F/\sqrt{2} = g^2/8m_W^2$   [20]

By recalling that $g \sin\theta_W = e$, we can also cast this relation in the form

$m_W = \mu_{\rm Born}/\sin\theta_W$   [21]

with

$\mu_{\rm Born} = \left(\pi\alpha/\sqrt{2}\,G_F\right)^{1/2} \simeq 37.2802\ {\rm GeV}$   [22]

where $\alpha$ is the fine-structure constant of QED ($\alpha \equiv e^2/4\pi = 1/137.036$).

In the same way, for neutral currents we obtain, in Born approximation, from eqn [18], the effective four-fermion interaction given by

$\mathcal{L}^{\rm NC}_{\rm eff} \simeq \sqrt{2}\, G_F\, \rho_0\, \bar{\psi}\gamma_\mu [\ldots] \psi\; \bar{\psi}\gamma^\mu [\ldots] \psi$   [23]

where

$[\ldots] \equiv t^3_L (1 - \gamma_5) + t^3_R (1 + \gamma_5) - 2Q \sin^2\theta_W$   [24]

and

$\rho_0 = m_W^2 / \left(m_Z^2 \cos^2\theta_W\right)$   [25]

All couplings given in this section are obtained at tree level and are modified in higher orders of perturbation theory. In particular, the relations between $m_W$ and $\sin\theta_W$ (eqns [21] and [22]) and the observed values of $\rho$ ($\rho = \rho_0$ at tree level) in different NC processes are altered by computable small electroweak radiative corrections.

The gauge-boson self-interactions can be derived from the $F_{\mu\nu}$ term in $\mathcal{L}_{\rm symm}$, by using eqn [15] and $W^\pm = (W^1 \pm i W^2)/\sqrt{2}$. For the three-gauge-boson vertex $W^+ W^- V$ with $V = Z, \gamma$, we obtain

$V_{W^- W^+ V} = i\, g_{W^- W^+ V} \left[ g_{\mu\nu} (q - p)_\lambda + g_{\mu\lambda} (p - r)_\nu + g_{\nu\lambda} (r - q)_\mu \right]$   [26]

with

$g_{W^- W^+ \gamma} = g \sin\theta_W = e \quad \text{and} \quad g_{W^- W^+ Z} = g \cos\theta_W$   [27]
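The tree-level relations [20]–[22] are easy to check numerically from the two inputs quoted above, $G_F = 1.16639 \times 10^{-5}\ {\rm GeV}^{-2}$ and $\alpha = 1/137.036$. The following is a minimal sketch: the function names are hypothetical, and the value $\sin^2\theta_W = 0.23$ used in the usage lines is an illustrative assumption, not a number taken from this article.

```python
import math

# Inputs quoted in the text (natural units, energies in GeV).
ALPHA = 1.0 / 137.036      # fine-structure constant of QED
G_FERMI = 1.16639e-5       # Fermi coupling constant, GeV^-2


def mu_born(alpha=ALPHA, g_fermi=G_FERMI):
    """Eqn [22]: mu_Born = (pi * alpha / (sqrt(2) * G_F))**(1/2), in GeV."""
    return math.sqrt(math.pi * alpha / (math.sqrt(2.0) * g_fermi))


def m_w_tree(sin2_theta_w, alpha=ALPHA, g_fermi=G_FERMI):
    """Eqn [21]: tree-level W mass m_W = mu_Born / sin(theta_W), in GeV."""
    if not 0.0 < sin2_theta_w < 1.0:
        raise ValueError("sin^2(theta_W) must lie strictly between 0 and 1")
    return mu_born(alpha, g_fermi) / math.sqrt(sin2_theta_w)


if __name__ == "__main__":
    print(f"mu_Born = {mu_born():.4f} GeV")
    # sin^2(theta_W) = 0.23 is an assumed, illustrative value.
    print(f"tree-level m_W for sin^2(theta_W) = 0.23: {m_w_tree(0.23):.2f} GeV")
```

The first number should reproduce the value of $\mu_{\rm Born}$ given in eqn [22]. The second is only a tree-level estimate; as stated in the text, the relation between $m_W$ and $\sin\theta_W$ is modified by electroweak radiative corrections.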
This form of the triple gauge vertex is very special: in general, there could be departures from the above SM expression, even restricting us to $SU(2) \otimes U(1)$ gauge symmetric and C and P invariant couplings. In fact, some small corrections are already induced by the radiative corrections. The SM form of the triple gauge vertex has been experimentally confirmed by measuring the cross section $e^+ e^- \to W^+ W^-$ at LEP.

We now turn to the Higgs sector of the electroweak Lagrangian. The Higgs Lagrangian is specified by the gauge principle and the requirement of renormalizability to be

$\gamma_5$-free and diagonal. In fact, we can make separate unitary transformations on $\psi_L$ and $\psi_R$ according to

$\psi'_L = U \psi_L, \qquad \psi'_R = V \psi_R$   [33]

and consequently

$M \to M' = U^\dagger M V$   [34]

This transformation does not alter the general structure of the fermion couplings in $\mathcal{L}_{\rm symm}$.

If only one Higgs doublet is present, the change of basis that makes $M$ diagonal will at the same time diagonalize also the fermion–Higgs Yukawa cou-
mechanism that ensures natural flavor conservation of the neutral current couplings at the tree level. For three generations of quarks, the CKM matrix depends on four physical parameters: three mixing angles and one phase. This phase is the unique source of CP violation in the SM.

We now consider the gauge-boson masses and their couplings to the Higgs. These effects are induced by the $(D_\mu\phi)^\dagger (D^\mu\phi)$ term in $\mathcal{L}_{\rm Higgs}$ (eqn [28]), where

$D_\mu \phi = \left[ \partial_\mu + ig \sum_{A=1}^{3} t^A W^A_\mu + ig' (Y/2) B_\mu \right] \phi$   [37]

Here $t^A$ and $\tfrac{1}{2}Y$ are the $SU(2) \otimes U(1)$ generators in the reducible representation spanned by $\phi$. Not only doublets but all non-singlet Higgs representations can contribute to gauge-boson masses. The condition that the photon remains massless is equivalent to the condition that the vacuum is electrically neutral:

replaced by $v$. We obtain

$m_W^2\, W^+_\mu W^{-\mu} = g^2 \left| \left(t^+ v/\sqrt{2}\right) \right|^2 W^+_\mu W^{-\mu}$   [39]

whilst for the $Z$ mass we get (recalling eqn [15])

$\tfrac{1}{2} m_Z^2\, Z_\mu Z^\mu = \left| \left[ g \cos\theta_W\, t^3 - g' \sin\theta_W\, (Y/2) \right] v \right|^2 Z_\mu Z^\mu$   [40]

where the factor of 1/2 on the left-hand side is the correct normalization for the definition of the mass of a neutral field. For Higgs doublets

$\phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}, \qquad v = \begin{pmatrix} 0 \\ v \end{pmatrix}$   [41]

we obtain

$m_W^2 = \tfrac{1}{2} g^2 v^2, \qquad m_Z^2 = \tfrac{1}{2} g^2 v^2 / \cos^2\theta_W$   [42]

Note that by using eqn [20] we obtain

$v = 2^{-3/4}\, G_F^{-1/2} = 174.1\ {\rm GeV}$   [43]

It is also evident that for Higgs doublets

$\rho_0 = m_W^2 / \left(m_Z^2 \cos^2\theta_W\right) = 1$   [44]

This relation is typical of one or more Higgs doublets and would be spoiled by the existence of, for example, Higgs triplets. This result is valid at the tree level and is modified by calculable small electroweak radiative corrections. The $\rho_0$ parameter has been measured from the intensity of NC interactions (recall eqn [25]) and confirmed to be close to unity at a few per mille level.

In MSM only one Higgs doublet is present. Then the fermion–Higgs couplings are in proportion to the fermion masses. In fact, from the Yukawa couplings $g_{\phi \bar{f} f} (\bar{f}_L \phi f_R + {\rm h.c.})$, the mass $m_f$ is obtained by replacing $\phi$ by $v$, so that $m_f = g_{\phi \bar{f} f}\, v$. In MSM, three out of the four Hermitian fields are removed from the physical spectrum by the Higgs mechanism and become the longitudinal modes of $W^+$, $W^-$, and $Z$ which acquire a mass. The fourth neutral Higgs is physical and should be found. If more doublets are present, two more charged and two more neutral Higgs scalars should be around for each additional doublet.

The couplings of the physical Higgs $H$ to the gauge bosons can be simply obtained from $\mathcal{L}_{\rm Higgs}$, by the replacement

$\phi(x) = \begin{pmatrix} \phi^+(x) \\ \phi^0(x) \end{pmatrix} \to \begin{pmatrix} 0 \\ v + (H/\sqrt{2}) \end{pmatrix}$   [45]

(so that $(D_\mu\phi)^\dagger (D^\mu\phi) = \tfrac{1}{2}(\partial_\mu H)^2 + \cdots$), with the

$\cdots = g^2 \left(v/\sqrt{2}\right) W^+_\mu W^{-\mu} H + \left(g^2/4\right) W^+_\mu W^{-\mu} H^2 + \left[ g^2 v\, Z_\mu Z^\mu / \left(2\sqrt{2} \cos^2\theta_W\right) \right] H + \left[ g^2 / \left(8 \cos^2\theta_W\right) \right] Z_\mu Z^\mu H^2$

In MSM, the Higgs mass $m_H^2 \sim \lambda v^2$ is of order of the weak scale $v$ but cannot be predicted because the value of $\lambda$ is not fixed. The dominant decay mode of the Higgs is in the $b\bar{b}$ channel below the $WW$ threshold, while the $W^+ W^-$ channel is dominant for sufficiently large $m_H$. The width is small below the $WW$ threshold, not exceeding a few MeV, but increases steeply beyond the threshold, reaching the asymptotic value of $\Gamma \sim \tfrac{1}{2} m_H^3$ at large $m_H$, where all energies and masses are in TeV.

A central role in the experimental verification of the standard electroweak theory has been played by CERN, the European Laboratory for Particle Physics, located near Geneva, between France and Switzerland. The indirect effects of the $Z^0$, that is, the occurrence of weak processes induced by the neutral current, were first observed in 1974 at CERN by the Collaboration Gargamelle (the name of the bubble chamber used in the experiment). Later, in 1982, the $W^\pm$ and the $Z^0$ were, for the first time, directly produced and observed in proton–antiproton collisions by the UA1 and UA2 collaborations and then further studied with the same technique both at CERN and subsequently at the Tevatron of Fermilab near Chicago. Starting from 1989, LEP, the large $e^+ e^-$ collider, was functioning at CERN till 2000. In the LEP circular ring of circumference 27 km, electrons and
38 Stationary Black Holes
positrons were accelerated in opposite directions to an equal energy in the range between 45 and 103 GeV. The beams were made to cross and collide at four experimental areas where the ALEPH, DELPHI, L3, and OPAL detectors were located to study the final states produced in the collisions. In its first phase, called LEP1, from 1989 to 1995, the LEP operation was completely dedicated to a precise study of the $Z^0$ properties (mass, lifetime, and decay modes) in order to accurately test the predictions of the SM. The main lessons of the precision tests of the standard electroweak theory can be summarized as follows. It has been checked that the couplings of quarks and leptons to the weak gauge bosons $W^\pm$ and $Z$ are indeed precisely those prescribed by the gauge symmetry. The accuracy of a few tenths of 1% for these tests implies that not only the tree level, but also the structure of quantum corrections has been verified. Then, from the end of 1995, the energy of LEP was increased and the LEP2 phase was started. The total energy was gradually increased up to 206 GeV. The main physics goals of LEP2 were the search for the Higgs and for possible new particles, the precise measurement of $m_W$, and the experimental study of the triple gauge vertices $WW\gamma$ and $WWZ^0$. The Higgs particle of the SM could in principle be produced at LEP2 in the reaction $e^+e^- \to Z^0 H$, which proceeds by $Z^0$ exchange. The nonobservation of the Higgs particle at LEP2 has made it possible to establish a lower limit on its mass: $m_H > 114$ GeV. Indirect indications of the Higgs mass were also obtained from the precision tests of the SM, as the radiative effects depend logarithmically on $m_H$. The indication is that the Higgs mass cannot be too large if the SM is valid: $m_H < 219$ GeV at 95% c.l. In 2001, LEP was dismantled and, in its tunnel, a new double ring of superconducting magnets is being installed. The new accelerator, the LHC (Large Hadron Collider), will be a proton–proton collider of total center-of-mass energy 14 TeV. Two large experiments, ATLAS and CMS, will continue to search for the Higgs starting in the year 2007. The sensitivity of LHC experiments to the SM Higgs will go up to masses $m_H$ of 1 TeV.

See also: Effective Field Theories; Electric–Magnetic Duality; Electroweak Theory; General Relativity: Experimental Tests; Noncommutative Geometry and the Standard Model; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Quantum Electrodynamics and its Precision Tests; Quantum Field Theory: a Brief Introduction; Relativistic Wave Equations Including Higher Spin Fields; Renormalization: General Theory; Supersymmetric Particle Models.

Further Reading

Altarelli G (2000) The Standard Electroweak Theory and Beyond, Proceedings of the Summer School on Phenomenology of Gauge Interactions. Zuoz, Switzerland, hep-ph/0011078.
Altarelli G (2001) A QCD Primer, Proceedings of the 2001 European School of High Energy Physics. Beatenberg, Switzerland, hep-ph/0204179.
Close F, Marten M, and Sutton C (2002) The Particle Odyssey. Oxford: Oxford University Press.
Fraser G (ed.) (1998) The Particle Century. London: The Institute of Physics.
Martin BR and Shaw G (1997) Particle Physics, 2nd edn. Chichester: Wiley.
Particle Data Group (2004) Review of particle physics. Physics Letters B 592: 1.
Perkins DH (2000) Introduction to High Energy Physics. Reading: Addison-Wesley.
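As a numerical footnote to the tree-level relations [42]–[44] of the electroweak discussion, the following minimal sketch evaluates $v = 2^{-3/4} G_F^{-1/2}$ and checks that the doublet mass formulas give $\rho_0 = 1$. The value of the Fermi constant, the coupling `g`, and the mixing angle used below are illustrative assumptions that are not taken from the text; the function names are hypothetical.

```python
# Assumed input (not given in the text): Fermi constant in GeV^-2.
G_F = 1.16637e-5


def higgs_vev(g_fermi: float) -> float:
    """Tree-level vacuum expectation value v = 2^(-3/4) * G_F^(-1/2) in GeV (eqn [43])."""
    return 2.0 ** (-0.75) * g_fermi ** (-0.5)


def rho0(g: float, v: float, cos2_theta_w: float) -> float:
    """rho_0 = m_W^2 / (m_Z^2 cos^2 theta_W), using the doublet masses of eqn [42]."""
    m_w2 = 0.5 * g**2 * v**2
    m_z2 = 0.5 * g**2 * v**2 / cos2_theta_w
    return m_w2 / (m_z2 * cos2_theta_w)


v = higgs_vev(G_F)
print(f"v = {v:.1f} GeV")

# Illustrative coupling and mixing angle (assumptions); rho_0 equals 1
# up to rounding for any such choice, as eqn [44] states for doublets.
print(rho0(g=0.65, v=v, cos2_theta_w=0.77))
```

The second function makes explicit that $\rho_0 = 1$ is a structural consequence of [42] rather than a numerical coincidence: the coupling and the vacuum expectation value cancel in the ratio.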
ones. In the vacuum region, these are all given by the Schwarzschild family. A theorem of Birkhoff shows that in the vacuum region any spherically symmetric metric, even without assuming stationarity, belongs to the family of Schwarzschild metrics, parametrized by a positive mass parameter $m$. Thus, regardless of possible motions of the matter, as long as they remain spherically symmetric, the exterior metric is the Schwarzschild one for some constant $m$. This has the following consequence for stellar dynamics: imagine following the collapse of a cloud of pressureless fluid ("dust"). Within Newtonian gravity, this dust cloud will, after finite time, contract to a point at which the density and the gravitational potential diverge. However, this result cannot be trusted as a sensible physical prediction because, even if one supposes that Newtonian gravity is still valid at very high densities, a matter model based on noninteracting point particles is certainly not. Consider, next, the same situation in the Einstein theory of gravity: here a new question arises, related to the form of the Schwarzschild metric outside of the spherically symmetric body:

$$g = -V^2\,dt^2 + V^{-2}\,dr^2 + r^2\,d\Omega^2,\qquad V^2 = 1 - \frac{2Gm}{rc^2},\qquad t\in\mathbb{R},\quad r\in\Big(\frac{2Gm}{c^2},\infty\Big) \qquad [2]$$

Here $d\Omega^2$ is the line element of the standard 2-sphere. Since the metric [2] seems to be singular as $r = 2m$ is approached (from now on, we use units in which $G = c = 1$), there arises the need to understand what happens at the surface of the star when the radius $r = 2m$ is reached. One thus faces the need of a careful study of the geometry of the metric [2] when $r = 2m$ is approached, and crossed.

The first key feature of the metric [2] is its stationarity, of course, with Killing vector field $X$ given by $X = \partial_t$. A Killing field, by definition, is a vector field the local flow of which generates isometries. A spacetime (the term spacetime denotes a smooth, paracompact, connected, orientable, and time-orientable Lorentzian manifold) is called stationary if there exists a Killing vector field $X$ which approaches $\partial_t$ in the asymptotically flat region (where $r$ goes to $\infty$; see below for precise definitions) and generates a one-parameter group of isometries. A spacetime is called static if it is stationary and if the stationary Killing vector $X$ is hypersurface orthogonal, that is, $X^\flat\wedge dX^\flat = 0$, where $X^\flat = X_\mu\,dx^\mu = g_{\mu\nu}X^\nu dx^\mu$. A spacetime is called axisymmetric if there exists a Killing vector field $Y$, which generates a one-parameter group of isometries and which behaves like a rotation in the asymptotically flat region, with all orbits $2\pi$-periodic. In asymptotically flat spacetimes, this implies that there exists an axis of symmetry, that is, a set on which the Killing vector vanishes. Killing vector fields which are a nontrivial linear combination of a time translation and of a rotation in the asymptotically flat region are called stationary rotating, or helical.

There exists a technique, due independently to Kruskal and Szekeres, of attaching together two regions $r > 2m$ and two regions $r < 2m$ of the Schwarzschild metric, as in Figure 1, to obtain a manifold with a metric which is smooth at $r = 2m$. In the extended spacetime, the hypersurface $\{r = 2m\}$ is a null hypersurface $\mathscr{E}$, the Schwarzschild event horizon. The stationary Killing vector $X = \partial_t$ extends to a Killing vector in the extended spacetime which becomes tangent to and null on $\mathscr{E}$. The global properties of the Kruskal–Szekeres extension of the exterior Schwarzschild spacetime make this spacetime a natural model for a nonrotating black hole. It is worth noting here that the exterior Schwarzschild spacetime [2] admits an infinite number of nonisometric vacuum extensions, even in the class of maximal, analytic, simply connected ones. The Kruskal–Szekeres extension is singled out by the properties that it is maximal, vacuum, analytic, simply connected, with all maximally extended geodesics either complete, or with the area r of the orbits of the isometry groups tending to zero along them.

We can now come back to the problem of the contracting dust cloud according to the Einstein theory. For simplicity, we take the density of the dust to be uniform (the so-called Oppenheimer–Snyder solution). It then turns out that, in the course of collapse, the surface of the dust will eventually cross the Schwarzschild radius, leaving behind a Schwarzschild black hole. If one follows the dust cloud further, a singularity will eventually form, but will not be visible from the "outside region" where $r > 2m$. For a collapsing body of the mass of the Sun, say, one has $2m = 3$ km. Thus, standard phenomenological matter models such as that for dust can still be trusted, so that the previous objection to the Newtonian scenario does not apply.

There is a rotating generalization of the Schwarzschild metric, namely the two-parameter family of exterior Kerr metrics, which in Boyer–Lindquist coordinates takes the form

$$g = -\frac{\Delta - a^2\sin^2\theta}{\Sigma}\,dt^2 - \frac{2a\sin^2\theta\,(r^2 + a^2 - \Delta)}{\Sigma}\,dt\,d\varphi + \frac{(r^2+a^2)^2 - \Delta\,a^2\sin^2\theta}{\Sigma}\,\sin^2\theta\,d\varphi^2 + \frac{\Sigma}{\Delta}\,dr^2 + \Sigma\,d\theta^2 \qquad [3]$$
[Figure 1 diagram: regions I–IV in the (X, T) plane, separated by the horizons r = 2m; curves r = constant (both > 2m and < 2m) and t = constant; singularities r = 0 at the top and bottom.]
Figure 1 The Kruskal–Szekeres extension of the Schwarzschild solution. (Adapted with permission from Nicolas J-P (2002) Dirac fields
on asymptotically flat space-times. Dissertationes Mathematicae 408: 1–85.)
with $0 \le a < m$. Here $\Sigma = r^2 + a^2\cos^2\theta$, $\Delta = r^2 + a^2 - 2mr$ and $r_+ < r < \infty$, where $r_+ = m + (m^2 - a^2)^{1/2}$. When $a = 0$, the Kerr metric reduces to the Schwarzschild metric. The Kerr metric is again a vacuum solution, and it is stationary with $X = \partial_t$ the asymptotic time translation, as well as axisymmetric with $Y = \partial_\varphi$ the generator of rotations. Similarly to the Schwarzschild case, it turns out that the metric can be smoothly extended across $r = r_+$, with $\{r = r_+\}$ being a smooth null hypersurface $\mathscr{E}$ in the extension. The null generator $K$ of $\mathscr{E}$ is the limit of the stationary-rotating Killing field $X + \omega Y$, where $\omega = a/(2mr_+)$. On the other hand, the Killing vector $X$ is timelike only outside the hypersurface $\{r = m + (m^2 - a^2\cos^2\theta)^{1/2}\}$, on which $X$ becomes null. In the region between $r_+$ and $r = m + (m^2 - a^2\cos^2\theta)^{1/2}$, which is called the ergoregion, $X$ is spacelike. It is also spacelike on and tangent to $\mathscr{E}$, except where the axis of rotation meets $\mathscr{E}$, where $X$ is null. Based on the above properties, the Kerr family provides natural models for rotating black holes.

Unfortunately, as opposed to the spherically symmetric case, there are no known explicit collapsing solutions with rotating matter, in particular no known solutions having the Kerr metric as final state.

The aim of the theory outlined below is to understand the general geometrical features of stationary black holes, and to give a classification of models satisfying the field equations.

Model-Independent Concepts

Some of the notions used informally in the introductory section will now be made more precise. The mathematical notion of black hole is meant to capture the idea of a region of spacetime which cannot be seen by "outside observers." Thus, at the outset, one assumes that there exists a family of physically preferred observers in the spacetime under consideration. When considering isolated physical systems, it is natural to define the "exterior observers" as observers which are "very far" away from the system under consideration. The standard way of making this mathematically precise is by using conformal completions, discussed in more detail in the article about asymptotic structure in this encyclopedia: a pair $(\tilde{\mathscr{M}}, \tilde g)$ is called a conformal completion at infinity, or simply conformal completion, of $(\mathscr{M}, g)$ if $\tilde{\mathscr{M}}$ is a manifold with boundary such that:

1. $\mathscr{M}$ is the interior of $\tilde{\mathscr{M}}$;
2. there exists a function $\Omega$, with the property that the metric $\tilde g$, defined as $\Omega^2 g$ on $\mathscr{M}$, extends by continuity to the boundary of $\tilde{\mathscr{M}}$, with the
extended metric remaining of Lorentzian signature; and
3. $\Omega$ is positive on $\mathscr{M}$, differentiable on $\tilde{\mathscr{M}}$, vanishes on the boundary

$$\mathscr{I} := \tilde{\mathscr{M}}\setminus\mathscr{M}$$

with $d\Omega$ nowhere vanishing on $\mathscr{I}$.

The boundary $\mathscr{I}$ of $\tilde{\mathscr{M}}$ is called Scri, a phonetic shortcut for "script I." The idea here is the following: forcing $\Omega$ to vanish on $\mathscr{I}$ ensures that $\mathscr{I}$ lies infinitely far away from any physical object, which is a mathematical way of capturing the notion "very far away." The condition that $d\Omega$ does not vanish is a convenient technical condition which ensures that $\mathscr{I}$ is a smooth three-dimensional hypersurface, instead of some, say, one- or two-dimensional object, or of a set with singularities here and there. Thus, $\mathscr{I}$ is an idealized description of a family of observers at infinity.

To distinguish between various points of $\mathscr{I}$, one sets

$$\mathscr{I}^+ = \{\text{points in } \mathscr{I} \text{ which are to the future of the physical spacetime}\}$$
$$\mathscr{I}^- = \{\text{points in } \mathscr{I} \text{ which are to the past of the physical spacetime}\}$$

(Recall that a point $q$ is to the future, respectively to the past, of $p$ if there exists a future directed, respectively past directed, causal curve from $p$ to $q$. Causal curves are curves $\gamma$ such that their tangent vector $\dot\gamma$ is causal everywhere, $g(\dot\gamma, \dot\gamma) \le 0$.) One then defines the black hole region $\mathscr{B}$ as

$$\mathscr{B} := \{\text{points in } \mathscr{M} \text{ which are not in the past of } \mathscr{I}^+\} \qquad [4]$$

By definition, points in the black hole region thus cannot send information to $\mathscr{I}^+$; equivalently, observers on $\mathscr{I}^+$ cannot see points in $\mathscr{B}$. The white-hole region $\mathscr{W}$ is defined by changing the time orientation in [4]. A key notion related to the concept of a black hole is that of future ($\mathscr{E}^+$) and past ($\mathscr{E}^-$) event horizons,

$$\mathscr{E}^+ := \partial\mathscr{B},\qquad \mathscr{E}^- := \partial\mathscr{W} \qquad [5]$$

Under mild assumptions, event horizons in stationary spacetimes with matter satisfying the null-energy condition,

$$T_{\mu\nu}\,\ell^\mu\ell^\nu \ge 0 \quad\text{for all null vectors } \ell^\mu \qquad [6]$$

are smooth null hypersurfaces, analytic if the metric is analytic.

In order to develop a reasonable theory, one also needs a regularity condition for the interior of spacetime. This has to be a condition which does not exclude singularities (otherwise the Schwarzschild and Kerr black holes would be excluded), but which nevertheless guarantees a well-behaved exterior region. One such condition, assumed in all the results described below, is the existence in $\mathscr{M}$ of an asymptotically flat spacelike hypersurface $\mathscr{S}$ with compact interior. Further, either $\mathscr{S}$ has no boundary or the boundary of $\mathscr{S}$ lies on $\mathscr{E}^+\cup\mathscr{E}^-$. To make things precise, for any spacelike hypersurface let $g_{ij}$ be the induced metric, and let $K_{ij}$ denote its extrinsic curvature. A spacelike hypersurface $\mathscr{S}_{\rm ext}$ diffeomorphic to $\mathbb{R}^3$ minus a ball will be called asymptotically flat if the fields $(g_{ij}, K_{ij})$ satisfy the fall-off conditions

$$|g_{ij} - \delta_{ij}| + r|\partial_\ell g_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_k} g_{ij}| + r|K_{ij}| + \cdots + r^k|\partial_{\ell_1\cdots\ell_{k-1}} K_{ij}| \le C r^{-1} \qquad [7]$$

for some constants $C$, $k \ge 1$. A hypersurface $\mathscr{S}$ (with or without boundary) will be said to be asymptotically flat with compact interior if $\mathscr{S}$ is of the form $\mathscr{S}_{\rm int}\cup\mathscr{S}_{\rm ext}$, with $\mathscr{S}_{\rm int}$ compact and $\mathscr{S}_{\rm ext}$ asymptotically flat.

There exists a canonical way of constructing a conformal completion with good global properties for stationary spacetimes which are asymptotically flat in the sense of [7], and which are vacuum sufficiently far out in the asymptotic region. This conformal completion is referred to as the standard completion and will be assumed from now on.

Returning to the event horizon $\mathscr{E} = \mathscr{E}^+\cup\mathscr{E}^-$, it is not very difficult to show that every Killing vector field $X$ is necessarily tangent to $\mathscr{E}$. Since the latter set is a null Lipschitz hypersurface, it follows that $X$ is either null or spacelike on $\mathscr{E}$. This leads to a preferred class of event horizons, called Killing horizons. By definition, a Killing horizon associated with a Killing vector $K$ is a null hypersurface which coincides with a connected component of the set

$$H(K) := \{p\in\mathscr{M} : g(K, K)(p) = 0,\ K(p) \ne 0\} \qquad [8]$$

A simple example is provided by the "boost Killing vector field" $K = z\partial_t + t\partial_z$ in Minkowski spacetime: $H(K)$ has four connected components,

$$H_{\epsilon\delta} := \{t = \epsilon z,\ \delta t > 0\},\qquad \epsilon, \delta\in\{\pm 1\}$$

The closure $\bar H$ of $H$ is the set $\{|t| = |z|\}$, which is not a manifold, because of the crossing of the null hyperplanes $\{t = \pm z\}$ at $t = z = 0$. Horizons of this type are referred to as bifurcate Killing horizons, with the set $\{K(p) = 0\}$ being called the bifurcation surface of $H(K)$. The bifurcate horizon structure in the Kruskal–Szekeres–Schwarzschild spacetime can be clearly seen in Figures 1 and 2.
[Figure 2 diagram: regions I–IV bounded by the horizons r = 2M, with curves r = constant (both > 2M and < 2M) and t = constant, the singularities r = 0 at the top and bottom, the boundaries r = infinity, and the points i+, i0, i–.]
Figure 2 The Carter–Penrose diagram for the Kruskal–Szekeres spacetime. There are actually two asymptotically flat regions, with corresponding $\mathscr{I}^\pm$ and $\mathscr{E}^\pm$ defined with respect to the second region, but not indicated on this diagram. Each point in this diagram represents a two-dimensional sphere, and coordinates are chosen so that light cones have slopes $\pm 1$. Regions are numbered as in Figure 1. (Adapted with permission from Nicolas J-P (2002) Dirac fields on asymptotically flat space-times. Dissertationes Mathematicae 408: 1–85.)
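For a sense of scale of the surfaces r = 2m shown in the figures, the radius $2Gm/c^2$ at which $V^2$ in the Schwarzschild metric [2] vanishes can be evaluated in SI units. The sketch below uses standard rounded values of the physical constants, which are assumptions not given in the text, and hypothetical function names.

```python
# Physical constants in SI units (standard rounded values, assumed here).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # solar mass, kg


def schwarzschild_radius(mass_kg: float) -> float:
    """Radius r = 2Gm/c^2 at which V^2 in the metric [2] vanishes, in metres."""
    return 2.0 * G * mass_kg / C**2


def lapse_squared(r_m: float, mass_kg: float) -> float:
    """V^2 = 1 - 2Gm/(r c^2) from the metric [2]."""
    return 1.0 - 2.0 * G * mass_kg / (r_m * C**2)


r_s = schwarzschild_radius(M_SUN)
print(f"Schwarzschild radius of the Sun: {r_s / 1000:.2f} km")
print(f"V^2 at ten Schwarzschild radii: {lapse_squared(10 * r_s, M_SUN):.3f}")
```

With these inputs the radius comes out just under 3 km, consistent with the rounded value quoted for a solar-mass body.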
A corollary of the topological censorship theorem of Friedman, Schleich, and Witt is that DOCs of regular black hole spacetimes satisfying the dominant-energy condition are simply connected. This implies that connected components of event horizons in stationary spacetimes have $\mathbb{R}\times S^2$ topology.

The discussion of the concepts associated with stationary black hole spacetimes can be concluded by summarizing the properties of the Schwarzschild and Kerr geometries: the extended Kerr spacetime with $m > a$ is a black hole spacetime with the hypersurface $\{r = r_+\}$ forming a nondegenerate, bifurcate Killing horizon generated by the vector field $X + \omega Y$ and surface gravity given by

$$\kappa = \frac{(m^2 - a^2)^{1/2}}{2m\,[m + (m^2 - a^2)^{1/2}]}$$

In the case $a = 0$, where the angular velocity $\omega$ vanishes, $X$ is hypersurface orthogonal and becomes the generator of $H$. The bifurcation surface in this case is the totally geodesic 2-sphere, along which the four regions in Figure 1 are joined.

Classification of Stationary Solutions ("No-Hair Theorems")

We confine attention to the "outside region" of black holes, the DOC. (Except for the degenerate case discussed later, the "inside" (black hole) region is not stationary, so that this restriction already follows from the requirement of stationarity.) For reasons of space, we only consider vacuum solutions; there exists a similar theory for electro-vacuum black holes. (There is a somewhat less developed theory for black hole spacetimes in the presence of nonabelian gauge fields.) In connection with a collapse scenario, the vacuum condition begs the question: collapse of what? The answer is twofold: first, there are large classes of solutions of Einstein equations describing pure gravitational waves. It is believed that sufficiently strong such solutions will form black holes. (Whether or not they will do that is related to the cosmic censorship conjecture, see Spacetime Topology, Causal Structure and Singularities.) Consider, next, a dynamical situation in which matter is initially present. The conditions imposed in this section correspond then to a final state in which matter has either been radiated away to infinity, or has been swallowed by the black hole (as in the spherically symmetric Oppenheimer–Snyder collapse described above).

Based on the facts below, it is expected that the DOCs of appropriately regular, stationary, vacuum black holes are isometrically diffeomorphic to those of Kerr black holes:

1. The rigidity theorem (Hawking). Event horizons in regular, nondegenerate, stationary, analytic vacuum black holes are either Killing horizons for $X$, or there exists a second Killing vector in $\langle\langle\mathscr{M}\rangle\rangle$.
2. The Killing horizons theorem (Sudarsky–Wald). Nondegenerate stationary vacuum black holes such that the event horizon is the union of Killing horizons of $X$ are static.
3. The Schwarzschild black holes exhaust the family of static regular vacuum black holes (Israel, Bunting–Masood-ul-Alam, Chruściel).
4. The Kerr black holes satisfying

$$m^2 > a^2 \qquad [11]$$

exhaust the family of nondegenerate, stationary-axisymmetric, vacuum, connected black holes. Here $m$ is the total Arnowitt–Deser–Misner (ADM) mass, while the product $am$ is the total ADM angular momentum. (Of course, these quantities generalize the constants $a$ and $m$ appearing in the Kerr metric.) The framework for the proof has been set up by Carter, and the statement above is due to Robinson.

The above results are collectively known under the name of no-hair theorems, and they have not provided the final answer to the problem so far. There are no a priori reasons known for the analyticity hypothesis in the rigidity theorem. Further, degenerate horizons have been completely understood in the static case only.

Yet another key open question is that of the existence of nonconnected regular stationary-axisymmetric vacuum black holes. The following result is due to Weinstein: let $\partial\mathscr{S}_a$, $a = 1, \ldots, N$, be the connected components of $\partial\mathscr{S}$. Let $X^\flat = g_{\mu\nu}X^\mu dx^\nu$, where $X$ is the Killing vector field which asymptotically approaches the unit normal to $\mathscr{S}_{\rm ext}$. Similarly, set $Y^\flat = g_{\mu\nu}Y^\mu dx^\nu$, $Y$ being the Killing vector field associated with rotations. On each $\partial\mathscr{S}_a$, there exists a constant $\omega_a$ such that the vector $X + \omega_a Y$ is tangent to the generators of the Killing horizon intersecting $\partial\mathscr{S}_a$. The constant $\omega_a$ is called the angular velocity of the associated Killing horizon. Define

$$m_a = -\frac{1}{8\pi}\int_{\partial\mathscr{S}_a} {*}\,dX^\flat \qquad [12]$$

$$L_a = -\frac{1}{4\pi}\int_{\partial\mathscr{S}_a} {*}\,dY^\flat \qquad [13]$$
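The Kerr horizon data quoted in this article (the horizon radius $r_+ = m + (m^2 - a^2)^{1/2}$, the angular velocity $\omega = a/(2mr_+)$, the surface gravity $\kappa$, and the outer boundary of the ergoregion) are simple closed-form expressions in units with $G = c = 1$. The following minimal sketch evaluates them; the function names are hypothetical.

```python
import math


def kerr_horizon(m: float, a: float):
    """Return (r_plus, omega, kappa) for a Kerr black hole with 0 <= a < m (G = c = 1)."""
    if not 0 <= a < m:
        raise ValueError("the formulas assume 0 <= a < m")
    root = math.sqrt(m * m - a * a)
    r_plus = m + root                       # event horizon radius
    omega = a / (2.0 * m * r_plus)          # angular velocity of the horizon
    kappa = root / (2.0 * m * (m + root))   # surface gravity
    return r_plus, omega, kappa


def ergosurface_radius(m: float, a: float, theta: float) -> float:
    """Radius m + (m^2 - a^2 cos^2 theta)^(1/2) on which X = d/dt becomes null."""
    return m + math.sqrt(m * m - (a * math.cos(theta)) ** 2)


# Schwarzschild limit a = 0: horizon at 2m, no rotation, kappa = 1/(4m).
print(kerr_horizon(1.0, 0.0))

# A rotating example: on the axis (theta = 0) the ergosurface touches the horizon.
r_plus, omega, kappa = kerr_horizon(1.0, 0.6)
print(r_plus, ergosurface_radius(1.0, 0.6, 0.0))
```

In the limit $a = 0$ the angular velocity vanishes and the surface gravity reduces to $1/(4m)$, matching the statement that $X$ itself generates the horizon in the Schwarzschild case.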
44 Stationary Phase Approximation
latter integral can be studied if suitable assumptions are made about the asymptotic behavior of the phase function and the amplitude at infinity, but this is not the subject of this article. The use of the exponential function with purely imaginary argument instead of the sine and the cosine is just a matter of convenience.

The first observation about oscillatory integrals in the next section is the principle of stationary phase, which states that the contributions to the integral which are not rapidly decreasing as $\omega\to\infty$ only come from the stationary points of $\varphi$, the points $\theta\in\Theta$ where the total derivative $d\varphi(\theta)$ of $\varphi$ is equal to zero. This principle is closely related to the observation that a superposition of waves is maximal at points where the waves are in phase, an observation which goes back to Huygens (1690).

Assume that $\theta_0$ is a nondegenerate stationary point of $\varphi$. That is, $d\varphi(\theta_0) = 0$ and the Hessian $D^2\varphi(\theta_0)$ of $\varphi$ at $\theta_0$ is nondegenerate. Then $\theta_0$ is an isolated stationary point of $\varphi$, and the contribution to $I(\omega)$ of a neighborhood of $\theta_0$ has an asymptotic expansion of the form

$$I(\omega) \sim e^{i\omega\varphi(\theta_0)}\sum_{r=0}^{\infty} c_r\,\omega^{-k/2 - r},\qquad \omega\to\infty$$

Here the leading coefficient $c_0$ is the product of $a(\theta_0)$ with a nonzero constant which only depends on $D^2\varphi(\theta_0)$ and the density $d\theta$ at $\theta_0$. For increasing $r$ the coefficients $c_r$ depend on the derivatives of $\varphi$ and $a$ at $\theta_0$ of increasing order (see the section "The method of stationary phase").

Usually, even if all the objects are analytic in a neighborhood of $\theta_0$, the asymptotic power series does not converge. However, there are exceptional cases where the stationary phase approximation is exact. Assume, for instance, that $\Theta$ is a compact manifold provided with a symplectic form $\sigma$, $\varphi$ is the Hamiltonian function of a Hamiltonian circle action on $\Theta$ with isolated fixed points, and $a(\theta)\,d\theta = \sigma^k/k!$. Then the stationary points of $\varphi$ are the fixed points of the circle action, each stationary point of $\varphi$ is nondegenerate and $I(\omega)$ is equal to the sum over the finitely many stationary points of only the leading terms of the asymptotic expansions at the stationary points. This Duistermaat–Heckman formula is a consequence of a more general localization formula in equivariant cohomology (see the section "Exact stationary phase").

For the purpose of applications, but also in the analysis of oscillatory integrals, it is worthwhile to allow complex-valued phase functions, but with a local minimum for the imaginary part at the stationary point $\theta_0$ of the real parts. That is, the real part of the exponent $i\omega\varphi(\theta)$ has a local maximum at $\theta_0$. An extreme case occurs when $\varphi(\theta) = i\psi(\theta)$ for a real-valued function $\psi$ which has a nondegenerate local minimum at $\theta_0$, in which case the integrand is a sharply peaking Gaussian density at $\theta_0$. When $\varphi$ and $a$ are analytic near $\theta_0$, then the method of steepest descent consists of deforming the path of integration in the complex domain in such a way that the integrand becomes such a sharply peaking Gaussian density. During the deformation, the integral does not change because of Cauchy's integral theorem.

An important extension of the theory occurs if the real-valued phase function and the amplitude are allowed to depend smoothly on additional parameters $x$, which vary in an $n$-dimensional smooth manifold $M$. The amplitude is also allowed to depend on $\omega$, with an asymptotic expansion of the form

$$a(x, \theta, \omega) \sim \sum_{r=0}^{\infty} a_r(x, \theta)\,\omega^{m + (k/2) - r}\quad\text{as } \omega\to\infty \qquad [2]$$

The expansion is supposed to be locally uniform in $(x, \theta)$ and to allow termwise differentiations of any order with respect to the variables $(x, \theta)$. Then the integral

$$I(x, \omega) = \int_\Theta e^{i\omega\varphi(x,\theta)}\,a(x, \theta, \omega)\,d\theta$$

is called an oscillatory integral of order $m$. Here the function $x\mapsto I(x, \omega)$ is viewed as a continuous superposition of the $\theta$-dependent family of oscillatory functions $x\mapsto e^{i\omega\varphi(x,\theta)}a(x, \theta)$.

The example which formed the point of departure of Airy (1838) is that $e^{i\omega\varphi(x,\theta)}a(x, \theta, \omega)$ is the wave which arrives at the points $x$ in spacetime which is sent out by a point $\theta$ on a reflecting mirror $\Theta$. That is, at $x$ one collects (= integrates over $\theta\in\Theta$) all the waves sent out by the various points $\theta$ of the mirror $\Theta$. The main point of the theory, however, is that in great generality the solutions of linear partial differential equations, such as classical wave equations or quantum mechanical Schrödinger equations, can be represented, as functions of $x$, as oscillatory integrals. This construction has led to decisive progress in the general theory of linear partial differential equations with smoothly varying coefficients.

According to the principle of stationary phase, the main asymptotic contributions to the integral come from the points $\theta$ such that $\partial\varphi(x, \theta)/\partial\theta = 0$. The phase function $\varphi\in C^\infty(M\times\Theta)$ is called nondegenerate if the $(n + k)\times k$-matrix

$$\frac{\partial^2\varphi(x, \theta)}{\partial(x, \theta)\,\partial\theta}\ \text{ has rank } k \text{ when } \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0 \qquad [3]$$
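The leading term of the stationary-phase expansion described above can be compared with a direct numerical evaluation in the simplest one-dimensional case. The sketch below assumes $k = 1$, the phase $\varphi(\theta) = \theta^2/2$ (stationary point $\theta_0 = 0$, $\varphi''(0) = 1$), and the amplitude $a(\theta) = e^{-\theta^2}$; these choices, the quadrature parameters, and the function names are illustrative assumptions. For this phase the standard leading term is $a(0)\sqrt{2\pi/\omega}\,e^{i\pi/4}$.

```python
import cmath
import math


def oscillatory_integral(omega: float, half_width: float = 6.0, n: int = 120_000) -> complex:
    """Trapezoidal estimate of I(omega) = integral of exp(i*omega*theta^2/2) * exp(-theta^2)."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        theta = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0   # trapezoidal end-point weights
        total += weight * cmath.exp(1j * omega * theta * theta / 2.0 - theta * theta)
    return total * h


def leading_term(omega: float) -> complex:
    """Leading stationary-phase term a(0) * sqrt(2*pi/omega) * exp(i*pi/4) for this phase."""
    return math.sqrt(2.0 * math.pi / omega) * cmath.exp(1j * math.pi / 4.0)


omega = 200.0
numeric = oscillatory_integral(omega)
approx = leading_term(omega)
relative_error = abs(numeric - approx) / abs(numeric)
print(numeric, approx, relative_error)
```

For this Gaussian amplitude the integral is also known in closed form, $\sqrt{\pi/(1 - i\omega/2)}$, so the quadrature can be validated independently. The relative error of the leading term is of order $1/\omega$, as expected from the next term $c_1\,\omega^{-3/2}$ of the expansion.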
This is the natural condition to ensure that the set

$$S_\varphi := \Big\{(x, \theta)\in M\times\Theta \ \Big|\ \frac{\partial\varphi(x, \theta)}{\partial\theta} = 0\Big\}$$

is a smooth $n$-dimensional submanifold of $M\times\Theta$. The condition [3], moreover, implies that the mapping

$$\Phi_\varphi : S_\varphi\ni(x, \theta)\mapsto\Big(x, \frac{\partial\varphi(x, \theta)}{\partial x}\Big)\in T^*M$$

is a smooth immersion from $S_\varphi$ into the cotangent bundle $T^*M$ of $M$. Note that $\xi = \partial\varphi(x, \theta)/\partial x$ is coordinate invariantly defined as a linear form on the tangent space $T_xM$ of $M$ at the point $x$. That is, $\xi\in(T_xM)^*$ = the dual space of $T_xM$, and $(T_xM)^*$ is the fiber of $T^*M$ over $x$. In classical mechanics, $T^*M$ is the phase space of the position space $M$, and a linear form on $T_xM$ is called a momentum vector at the position $x$. If $\sigma$ denotes the canonical symplectic form on $T^*M$, then $\Phi_\varphi^*\sigma = 0$. The immersion $\Phi_\varphi$ locally embeds $S_\varphi$ onto a smooth $n$-dimensional submanifold $\Lambda_\varphi$ of $T^*M$, which is a Lagrangian manifold in $T^*M$, which by definition means that $\sigma|_{\Lambda_\varphi} = 0$.

Oscillatory integrals with very different phase functions and amplitudes can define the same $\omega$-dependent functions on $M$. The theory of Hörmander (1971, section 3.1) says that the germs of the Lagrangian manifolds $\Lambda_\varphi$ and $\Lambda_\psi$ are the same if and only if $\varphi$ and $\psi$ define the same class of oscillatory integrals. Moreover, every Lagrangian submanifold of $T^*M$ is locally of the form $\Lambda_\varphi$ for some nondegenerate phase function $\varphi$. In this way, the mapping $\varphi\mapsto\Lambda_\varphi$ defines a bijection between the set of equivalence classes of germs of nondegenerate phase functions and the set of germs of Lagrangian submanifolds of $T^*M$. Let $\Lambda$ be an immersed Lagrangian submanifold of $T^*M$. A global oscillatory integral of order $m$ on $M$, defined by $\Lambda$, is a locally finite sum $u(x, \omega)$ of oscillatory integrals of order $m$ with nondegenerate phase functions $\varphi$ such that $\Lambda_\varphi\subset\Lambda$. The leading terms of the amplitudes correspond to a section $s$ of a canonically defined complex line bundle over $\Lambda$, which is called the principal symbol of $u$ (see the section "The principal symbol on the Lagrangian manifold").

If $P$ is a linear partial differential operator, such as the wave operators, in which the coefficients may depend in a smooth way on $x$ and in a polynomial way on $\omega$, then the condition that $Pu$ is asymptotically small implies that $p = 0$ on $\Lambda$, in which $p$ is a smooth function on $T^*M$, called the principal symbol of $P$. Because $\Lambda$ is a Lagrangian manifold, the equation $p = 0$ implies that $\Lambda$ is invariant under the flow of the Hamiltonian system with Hamilton function equal to $p$. Furthermore, the principal symbol $s$ of $u$ satisfies a homogeneous first-order ordinary differential equation along the solution curves of the Hamiltonian system. Conversely, these properties can be used to construct global oscillatory integrals $u$ which asymptotically satisfy $Pu = 0$ and have prescribed initial values. This theory, due to Maslov (1972), may be viewed as a far-reaching generalization of the WKB method.

Let $\pi : T^*M\to M : (x, \xi)\mapsto x$ denote the canonical projection from $T^*M$ onto $M$. The projections into $M$ of the solution curves in a Lagrangian submanifold $\Lambda$ of $T^*M$, of a Hamiltonian system which leaves $\Lambda$ invariant, are the ray bundles of geometrical optics. If $\Lambda$ is not transversal to the fiber of $T^*M$ at $(x, \xi)$, then the ray bundle exhibits a caustic at the point $x\in M$, and the oscillatory integral is asymptotically of larger order than $\omega^m$ near $x$. Applying the theory of unfoldings of singularities to the phase function, one can determine the structurally stable caustics and obtain normal forms of the oscillatory integrals in the structurally stable cases (see the section "Caustics").

If we also integrate over the frequency variable $\omega$, then we obtain the Fourier integral distributions $u$ of Hörmander (1971, sections 1.2 and 3.2). In this case the corresponding Lagrangian manifold $\Lambda$ is conic in the sense that if $(x, \xi)\in\Lambda$, then $(x, \tau\xi)\in\Lambda$ for every $\tau > 0$. The wave front set of $u$, which is the microlocal singular locus of the distribution $u$, is contained in $\Lambda$, with equality if the principal symbol of $u$ is not equal to zero at the corresponding stationary points of the phase function. Fourier integral operators are defined as the linear operators acting on distributions, of which the distribution kernels are Fourier integral distributions. Under a suitable transversality condition for the Lagrangian manifolds of the distribution kernels, the composition of two Fourier integral operators is again a Fourier integral operator, and the principal symbol of the composition is a product of the principal symbols. The proof is an application of the method of stationary phase. Fourier integral operators are a very powerful tool in the analysis of linear partial differential operators with smoothly varying coefficients (see Hörmander (1985)).

The Principle of Stationary Phase

The principle of stationary phase says that if the phase function $\varphi$ has no stationary points in the support of the amplitude function $a$, then the
oscillatory integral [1] is rapidly decreasing, in the sense that for every $N$ we have $I(\omega) = O(\omega^{-N})$ as $\omega \to \infty$. For the proof, one introduces a vector field $v$ on $\Theta$ such that $v\varphi = 1$ on a neighborhood of the support of $a$. Then $e^{i\omega\varphi} = (i\omega)^{-1} v(e^{i\omega\varphi})$, and an integration by parts in [1] yields that
$$ I(\omega) = \frac{1}{i\omega} \int e^{i\omega\varphi(\theta)}\, ({}^t v\, a)(\theta)\, d\theta $$
where ${}^t v$ denotes the transpose of the linear partial differential operator $v$. Iterating this, the rapid decrease of $I(\omega)$ follows.
Using cutoff functions, $I(\omega)$ is, modulo a rapidly decreasing function, equal to an oscillatory integral with phase function $\varphi$ and an amplitude which has support in an arbitrarily small neighborhood of the set of stationary points of $\varphi$. In this sense, the contributions to the integral which are not rapidly decreasing come only from the stationary points of $\varphi$.

The Method of Stationary Phase
Assume that $\theta_0$ is a nondegenerate stationary point of $\varphi$. Then $\theta_0$ is an isolated stationary point of $\varphi$. Using local coordinates near $\theta_0$, the contribution to [1] from the neighborhood of $\theta_0$ can be written as an oscillatory integral with $\Theta = \mathbb{R}^k$ and a phase function $\varphi$ which has a nondegenerate stationary point at 0. Write $Q = D^2\varphi(0)$. According to the Morse lemma, there is a smooth substitution of variables $\theta = T(y)$ such that $T(0) = 0$, $DT(0) = I$, and $\varphi(T(y)) = \varphi(0) + \langle Qy, y\rangle/2$ for all $y$ in a neighborhood of 0 in $\mathbb{R}^k$. Applying this substitution of variables to [1] we obtain
$$ I(\omega) = e^{i\omega\varphi(0)} \int_{\mathbb{R}^k} e^{i\omega\langle Qy, y\rangle/2}\, b(y)\, dy $$
where $b$ is a compactly supported smooth function on $\mathbb{R}^k$ with $b(0) = a(0)$. Now the Fourier transform of the function $y \mapsto e^{i\omega\langle Qy, y\rangle/2}$ is equal to the function
$$ \eta \mapsto \det\left(\frac{\omega}{2\pi i}\, Q\right)^{-1/2} e^{-i\omega^{-1}\langle Q^{-1}\eta, \eta\rangle/2} \qquad [4] $$
Both in the definition of the square root of the determinant and in the proof one uses the analytic continuation to the domain of complex-valued symmetric bilinear forms $Q$ for which the imaginary part of $Q$ is positive definite. For purely imaginary $Q$ we have the familiar formula for the Fourier transform of a Gaussian density (see Hörmander (1990, theorem 7.6.1)). The Taylor expansion of the exponential factor in [4] then yields that
$$ \int_{\mathbb{R}^k} e^{i\omega\langle Qy, y\rangle/2}\, b(y)\, dy \sim \det\left(\frac{\omega}{2\pi i}\, Q\right)^{-1/2} \sum_{r=0}^{\infty} \frac{1}{r!}\, (-2i\omega)^{-r} \left. \left\langle Q^{-1}\frac{\partial}{\partial y}, \frac{\partial}{\partial y} \right\rangle^{r} b(y) \right|_{y=0} $$
as $\omega \to \infty$ (see Hörmander (1990, lemma 7.7.3)).
It is important for the applications that, if the phase function and amplitude depend smoothly on parameters, all the constructions can be made to depend smoothly on the parameters.

Exact Stationary Phase
Suppose that we are given an action of a Lie group $G$ on the manifold $\Theta$. Let $\mathfrak{g}$ denote the Lie algebra of $G$. For any $g \in G$ and $X \in \mathfrak{g}$ the corresponding diffeomorphism of $\Theta$ and vector field on $\Theta$ are denoted by $g_\Theta$ and $X_\Theta$, respectively. If $\Omega(\Theta)$ denotes the algebra of smooth differential forms on $\Theta$, then we consider the algebra $S\mathfrak{g}^* \otimes \Omega(\Theta)$ of all $\Omega(\Theta)$-valued polynomials on $\mathfrak{g}$, where $S\mathfrak{g}^*$ denotes the algebra of all polynomial functions on $\mathfrak{g}$. On $S\mathfrak{g}^* \otimes \Omega(\Theta)$ we have the action of $g \in G$ which sends $\alpha$ to $X \mapsto g_\Theta^*(\alpha(\mathrm{Ad}\, g\, X))$. Let $A = (S\mathfrak{g}^* \otimes \Omega(\Theta))^G$ denote the subalgebra of all $G$-invariant elements of $S\mathfrak{g}^* \otimes \Omega(\Theta)$. The equivariant exterior derivative $D$ is defined by
$$ (D\alpha)(X) = d(\alpha(X)) - i_{X_\Theta}(\alpha(X)) $$
If $\alpha$ is homogeneous as a differential form of degree $p$ and homogeneous as a polynomial on $\mathfrak{g}$ of degree $q$, then $r = p + 2q$ is called the total degree of $\alpha$. Let $A^r$ denote the space of sums of such $\alpha \in A$ of total degree $r$. Then $D_r = D: A^r \to A^{r+1}$ and $D_r \circ D_{r-1} = 0$. The space $H^r_G(\Theta) := \ker D_r / \mathrm{Im}\, D_{r-1}$ is called the equivariant cohomology in degree $r$, in the model of Cartan (1950).
Assume that $\Theta$ is compact and oriented, and that the action of $G$ preserves the orientation. If $\alpha \in A$, then we denote by $\alpha(X)_{[k]}$ the volume part of the differential form $\alpha(X)$, and
$$ \left(\int \alpha\right)(X) := \int_\Theta \alpha(X)_{[k]}, \qquad X \in \mathfrak{g} $$
defines an $\mathrm{Ad}\, G$-invariant function $\int \alpha$ on $\mathfrak{g}$. Now $\alpha = D\beta$ implies that $\alpha(X)_{[k]}$ is equal to the exterior derivative of $\beta(X)_{[k-1]}$, and therefore $\int \alpha = 0$, in view of Stokes' theorem. It follows that integration over $\Theta$ yields a linear mapping $\int$ from $H_G(\Theta)$ to $(S\mathfrak{g}^*)^{\mathrm{Ad}\, G}$, which is called integration in equivariant cohomology.
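As a rough numerical illustration of the expansion quoted from Hörmander (1990, lemma 7.7.3) in the section "The Method of Stationary Phase", the following Python sketch takes $k = 1$ and $Q = 1$. As an assumption made only to have a closed form available, it uses the Gaussian amplitude $b(y) = e^{-y^2}$, which is rapidly decreasing rather than compactly supported. The function names are illustrative and not part of the article.

```python
import cmath
import math


def exact_integral(omega):
    """Closed form of the integral of exp(i*omega*y**2/2) * exp(-y**2) over the real line."""
    return cmath.sqrt(math.pi / (1.0 - 0.5j * omega))


def stationary_phase(omega, order):
    """Partial sum of the stationary-phase series for k = 1, Q = 1, b(y) = exp(-y**2).

    Term r is (1/r!) * (-2*i*omega)**(-r) * (d/dy)**(2r) b(0); only r = 0 and r = 1
    are tabulated here, using b(0) = 1 and b''(0) = -2.
    """
    if order not in (0, 1):
        raise ValueError("only orders 0 and 1 are tabulated in this sketch")
    prefactor = cmath.sqrt(2j * math.pi / omega)  # det(omega*Q/(2*pi*i))**(-1/2)
    terms = [1.0, (-2.0) / (-2j * omega)]
    return prefactor * sum(terms[: order + 1])


def relative_error(omega, order):
    """Relative distance between the truncated series and the closed form."""
    exact = exact_integral(omega)
    return abs(stationary_phase(omega, order) - exact) / abs(exact)


if __name__ == "__main__":
    for omega in (10.0, 100.0, 1000.0):
        print(omega, relative_error(omega, 0), relative_error(omega, 1))
```

In this setting the relative error of the leading term should decay roughly like $1/\omega$ and that of the two-term sum roughly like $1/\omega^2$, which is what one expects from truncating an asymptotic series in powers of $\omega^{-1}$.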
Now assume that also the Lie group $G$ is compact, and let $X \in \mathfrak{g}$. Then the zero-set $Z_X$ of $X_\Theta$ in $\Theta$ has finitely many connected components $F$, each of which is a smooth and compact submanifold of $\Theta$. In general, the $F$'s can have different dimensions. The linearization $L_X$ of the vector field $X_\Theta$ along $F$ acts linearly on the normal bundle $NF$ of $F$. If $\Omega$ is the curvature form of $NF$, then
$$ \varepsilon(X) := \det\nolimits_{\mathbb{C}} \left( \frac{i}{2\pi}\, (L_X - \Omega) \right) $$
is called the equivariant Euler form of $NF$. $\varepsilon(X)$ is an invertible element in the algebra $\Omega^{\mathrm{even}}(F)$. The localization formula of Berline–Vergne (1982) and Atiyah–Bott (1984) now says that if $D\alpha = 0$ then
$$ \left(\int \alpha\right)(X) = \sum_F \int_F \left( \frac{i_F^*\, \alpha(X)}{\varepsilon(X)} \right)_{[\dim F]} $$
Assume that $\sigma$ is a symplectic form on $\Theta$, which implies that $k = 2l$ is even. Furthermore, assume that the infinitesimal action of $\mathfrak{g}$ on $\Theta$ is Hamiltonian, which means that there exists a $G$-equivariant smooth mapping $\mu: \Theta \to \mathfrak{g}^*$, called the momentum mapping, such that $i_{X_\Theta}\sigma = -d(\mu(X))$ for every $X \in \mathfrak{g}$. Here $\mu$ is viewed as an element of $(\mathfrak{g}^* \otimes \Omega^0(\Theta))^G \subset A$. Then $\hat\sigma(X) := \mu(X) - \sigma$ defines an element $\hat\sigma \in A$ such that $D\hat\sigma = 0$. In turn, this implies that the form
$$ \alpha(X) := e^{i\omega\hat\sigma(X)} = e^{i\omega\mu(X)} \sum_{r=0}^{l} \frac{(-i\omega\sigma)^r}{r!} $$
is equivariantly closed, and the localization formula of equivariant cohomology applied to this case yields the Duistermaat–Heckman (1982, 1983) formula. Because $\alpha(X)_{[k]} = e^{i\omega\mu(X)}\, (-i\omega)^l \sigma^l / l!$, its integral over $\Theta$ is an oscillatory integral with phase function $\mu(X)$. The stationary points of $\mu(X)$ are the zeros of $X_\Theta$, and the stationary points of $\mu(X)$ are nondegenerate if and only if the zeros of $X_\Theta$ are isolated. It follows that in this case the oscillatory integral is equal to the leading term in the stationary-phase approximation.

The Principal Symbol on the Lagrangian Manifold
Let $u(x, \omega)$ be a global oscillatory integral of order $m$ defined by $\Lambda$, and let $(x_0, \xi_0) \in \Lambda$. One way to define the principal symbol of $u$ at $(x_0, \xi_0) \in \Lambda$ is to test $u$ with an oscillatory function of the form $e^{-i\omega\psi(x)} b(x)$, in which $d\psi(x_0) = \xi_0$, the support of $b$ is contained in a small neighborhood of $x_0$, and $b(x_0) = 1$. If $u$ is locally represented by the phase function $\varphi$ and amplitude $a$, and $(x_0, \xi_0) = (x_0, d_x\varphi(x_0, \theta_0))$, then the phase function $\varphi(x, \theta) - \psi(x)$ in the oscillatory integral
$$ \langle u, e^{-i\omega\psi} b \rangle = \int_M \int e^{i\omega(\varphi(x, \theta) - \psi(x))}\, a(x, \theta)\, b(x)\, d\theta\, dx $$
has a stationary point at $(x_0, \theta_0)$, which means that $\Lambda$ and $d\psi$ intersect at $(x_0, \xi_0)$. Here the 1-form $d\psi$ on $M$, which is a section of $\pi: T^*M \to M$, is viewed as a submanifold of $T^*M$. Locally the Lagrangian submanifolds of $T^*M$ which are transversal to the fibers of $\pi: T^*M \to M$ are precisely the manifolds of the form $d\psi$. The stationary point of $\varphi - \psi$ is nondegenerate if and only if $L := T_{(x_0, \xi_0)}\Lambda$ and $L_\psi := T_{(x_0, \xi_0)}(d\psi)$ are transversal. In this case, the method of stationary phase can be applied in order to obtain an asymptotic expansion in terms of powers of $\omega$. The coefficient of the leading term of order $\omega^m$ depends only on the Lagrangian plane $L_\psi$, which is transversal to both $L$ and the tangent space of the fiber of $T^*M$, and not on the other data of $\psi$ and $b$. If $\mathcal{L}$ denotes the set of all Lagrangian planes in $T_{(x_0, \xi_0)}(T^*M)$ which are transversal to both $L$ and the fiber, then the complex-valued functions on $\mathcal{L}$ which arise in this way form a one-dimensional complex vector space $\mathbb{L}_{(x_0, \xi_0)}$. The $\mathbb{L}_{(x_0, \xi_0)}$ for $(x_0, \xi_0) \in \Lambda$ form a complex line bundle $\mathbb{L}$ over $\Lambda$ which is canonically isomorphic to the tensor product of the line bundle of half-densities and the Maslov line bundle, a line bundle with structure group $\mathbb{Z}/4\mathbb{Z}$ (see Duistermaat (1974, section 1.2)). In this way, the principal symbol $s$ of $u$ can be viewed as a section of the line bundle $\mathbb{L}$ over $\Lambda$.

Caustics
Let $(x_0, \xi_0)$ be a point in the Lagrangian submanifold $\Lambda$ of $T^*M$. The restriction to $\Lambda$ of the projection $\pi: T^*M \to M$ is a diffeomorphism from an open neighborhood of $(x_0, \xi_0)$ in $\Lambda$ onto an open neighborhood of $x_0$ in $M$, if and only if $\Lambda$ is transversal to the fiber of $T^*M$ at $(x_0, \xi_0)$. If $\Lambda = \Lambda_\varphi$ for a nondegenerate phase function $\varphi$, $(x_0, \theta_0) \in S_\varphi$ and $(x_0, \xi_0) = (x_0, d_x\varphi(x_0, \theta_0))$, then this condition is in turn equivalent to the condition that $\theta_0$ is a nondegenerate stationary point of $\theta \mapsto \varphi(x_0, \theta)$. An application of the method of stationary phase shows that in this case the oscillatory integral is equal to a progressing wave of the form $e^{i\omega\psi(x)} b(x, \omega)$. Here $\psi(x) = \varphi(x, \theta(x))$, where $\theta(x)$ is the stationary point of $\theta \mapsto \varphi(x, \theta)$, and $b(x, \omega)$ has an asymptotic expansion as in [2] with $k = 0$.
If $\theta_0$ is a degenerate stationary point of $\theta \mapsto \varphi(x_0, \theta)$ and $a_0(x_0, \theta_0) \neq 0$, then the oscillatory integral is not of order $O(\omega^m)$. That is, it is of larger order than at points where we have a nondegenerate
stationary point. For this reason, the points $(x_0, \xi_0)$ at which $\Lambda$ is not transversal to the fibers of $\pi: T^*M \to M$ are called the caustic points of $\Lambda$. Their projections $x_0 \in M$ form the caustic set in $M$.
In the theory of unfoldings of singularities, the germs of the families of functions $x \mapsto (\theta \mapsto \varphi(x, \theta))$ and $y \mapsto (\eta \mapsto \psi(y, \eta))$ are called equivalent if there exists a germ of a diffeomorphism of the form $H: (x, \theta) \mapsto (y(x), \eta(x, \theta))$ and a smooth function $\chi(x)$ such that $\psi(y(x), \eta(x, \theta)) = \varphi(x, \theta) + \chi(x)$. If $J(y, \omega)$ is an oscillatory integral with phase function $\psi$, integration variable $\eta$ and parameter $y$, then the substitution of variables $\eta = \eta(x, \theta)$ in the integral, followed by the substitution of variables $y = y(x)$ in the parameters, yields that $J(y, \omega) = e^{i\omega\chi(x)} I(x, \omega)$, in which $I(x, \omega)$ is an oscillatory integral with phase function $\varphi$ and an amplitude function of the same order as the amplitude function of $J$. The germ $\varphi$ is called stable if every nearby germ is equivalent to $\varphi$. The Morse lemma with parameters implies that this is the case if $\theta \mapsto \varphi(x_0, \theta)$ has a nondegenerate stationary point at $\theta_0$. However, the theory of unfoldings of singularities of Thom and Mather shows that there are many stable germs with degenerate critical points. Moreover, in dimension $n \le 5$ the generic germ is stable, and is equivalent to a germ in a finite list of normal forms.
The simplest example of a normal form with degenerate critical points is $\varphi(x, \theta) = \theta^3 + x_1\theta$. Here we have taken $k = 1$, but still allowed an arbitrary dimension $n \ge 1$ of $M$. In this normal form, the stationary points correspond to $3\theta^2 + x_1 = 0$, which is a manifold which over the $x$-space folds over at $x_1 = 0$. The stationary point is degenerate if and only if $6\theta = 0$, hence $x_1 = 0$, which means that $x_1 = 0$ is the caustic set. If the amplitude is equal to 1, then the oscillatory integral is equal, up to constant factors, to $\omega^{-1/3}\, \mathrm{Ai}(\omega^{2/3} x_1)$, in which $\mathrm{Ai}(z)$ denotes the Airy function. If the amplitude is nonzero at a degenerate critical point, then the oscillatory integral near the corresponding caustic point is asymptotically of the same order as $\omega^{-1/3}\, \mathrm{Ai}(\omega^{2/3} x_1)$, which implies that the oscillatory integral is a factor $\omega^{1/6}$ larger at these caustic points than at the points away from the caustic set. In Airy (1838), where the Airy function was introduced, Airy considered light in a neighborhood of a caustic as an oscillatory integral. Then, under suitable genericity conditions, he brought the phase function into the normal form $\theta^3 + x_1\theta$. Even for stable normal forms in low dimensions, the interference patterns near the caustic points can be very intricate (see, e.g., Berry et al. (1979)). A survey of the application of the theory of unfoldings to caustics in oscillatory integrals can be found in Duistermaat (1974).

See also: Equivariant Cohomology and the Cartan Model; Feynman Path Integrals; Functional Integration in Quantum Physics; Hamiltonian Group Actions; h-Pseudodifferential Operators and Applications; Multiscale Approaches; Normal Forms and Semiclassical Approximation; Optical Caustics; Path Integrals in Noncommutative Geometry; Perturbation Theory and its Techniques; Schrödinger Operators; Singularity and Bifurcation Theory; Wave Equations and Diffraction.

Further Reading
Airy GB (1838) On the intensity of light in a neighborhood of a caustic. Transactions of the Cambridge Philosophical Society 6: 379–403.
Atiyah MF and Bott R (1984) The moment map and equivariant cohomology. Topology 23: 1–28.
Berline N and Vergne M (1982) Classes caractéristiques équivariante. Formules de localisation en cohomologie équivariante. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences, Paris 295: 539–541.
Berry MV, Nye JF, and Wright FJ (1979) The elliptic umbilic diffraction catastrophe. Philosophical Transactions of the Royal Society of London A291: 453–484.
Cartan H (1950) Notion d'algèbre différentielle; applications ou opère un groupe de Lie, and: La transgression dans un groupe de Lie et dans un fibré principal. In: Colloque de Topologie, pp. 15–27, 57–71. Bruxelles: C.B.R.M.
Duistermaat JJ (1974) Oscillatory integrals, Lagrange immersions and unfoldings of singularities. Communications in Pure and Applied Mathematics 27: 207–281.
Duistermaat JJ and Heckman GJ (1982) On the variation in the cohomology of the symplectic form of the reduced phase space. Inventiones Mathematicae 69: 259–268.
Duistermaat JJ and Heckman GJ (1983) On the variation in the cohomology of the symplectic form of the reduced phase space. Inventiones Mathematicae 72: 153–158.
Hörmander L (1971) Fourier integral operators I. Acta Mathematica 127: 79–183.
Hörmander L (1983, 1990) The Analysis of Linear Partial Differential Operators I. Berlin: Springer.
Hörmander L (1985) The Analysis of Linear Partial Differential Operators IV. Berlin: Springer.
Huygens C (1690) Traité de la Lumière, Leyden: Van der Aa. (English translation: Treatise on Light, 1690, reprint by Dover Publications, New York, 1962.)
Maslov VP (1965) Perturbation Theory and Asymptotic Methods (In Russian), Moscow: Moskov. Gos. University. (French translation by Lascoux J and Sénéor R (1972) Paris: Dunod.)
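A standard illustration of the exactness statement in the section "Exact Stationary Phase" (chosen here as an example and not taken from the article) is the unit sphere with its area form, the circle acting by rotations about the vertical axis, and the height $z$ as momentum function. The integral of $e^{i\omega z}$ over the sphere should then coincide with the sum of the leading stationary-phase terms at the two poles. The Python sketch below checks this numerically under those assumptions, with illustrative function names and a plain midpoint rule.

```python
import cmath
import math


def sphere_integral(omega, n=20000):
    """Midpoint-rule value of the integral of exp(i*omega*z) over the unit sphere.

    In the coordinates (z, phi) the area element of the unit sphere is dz dphi,
    so the phi integration only contributes a factor 2*pi.
    """
    h = 2.0 / n
    total = sum(cmath.exp(1j * omega * (-1.0 + (j + 0.5) * h)) for j in range(n))
    return 2.0 * math.pi * h * total


def pole_contributions(omega):
    """Sum of the leading stationary-phase terms at the poles z = +1 and z = -1.

    At the north pole the Hessian of z is -I (signature -2), at the south pole +I
    (signature +2), which gives the factors 2*pi/(i*omega) and -2*pi/(i*omega).
    """
    north = (2.0 * math.pi / (1j * omega)) * cmath.exp(1j * omega)
    south = -(2.0 * math.pi / (1j * omega)) * cmath.exp(-1j * omega)
    return north + south


if __name__ == "__main__":
    for omega in (1.0, 5.0, 25.0):
        print(omega, abs(sphere_integral(omega) - pole_contributions(omega)))
```

Both quantities equal $4\pi \sin(\omega)/\omega$, so the printed differences only reflect the quadrature error, for small as well as large $\omega$.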
50 Statistical Mechanics and Combinatorial Problems
the NP class to counting problems and obviously contains the class NP as a particular case.
Let us consider the following slightly simplified definition of the Ising and the spin glass problems.

Problem instance A symmetric matrix $J_{ij}$ with entries in $\{-1, 0, 1\}$ and an inverse temperature $\beta$.
Output The partition function $Z = \sum_{\{\sigma_i\}} 2^{-\beta H(s)}$, where $H(s) = -\sum_{i,j} J_{ij}\sigma_i\sigma_j$ with $\sigma_i = \pm 1$.

Moreover, let us define the fully polynomial randomized approximation scheme (FPRAS) for counting and decision problems. An FPRAS for a function $f$ from problem instances to real numbers is a probabilistic algorithm that, in time polynomial in the problem size $n$ and in the inverse $1/\epsilon$ of the relative error $\epsilon \in (0, 1]$, outputs with high probability a number which approximates $f(n)$ within a ratio $1 + \epsilon$. Given the above definitions, the theorems can be stated as follows:

Theorem 1 There can be no FPRAS for the spin glass problem unless P = NP, that is, all problems in NP turn out to be solvable in polynomial time.

Theorem 2 The Ising problem is #P-complete even when the matrix $J_{ij}$ is non-negative, that is, an algorithm which outputs in polynomial time the exact Ising partition function for an arbitrary graph could be used to solve any other counting problem in #P.

The above theorems hold for arbitrary graphs, in particular for those graph or lattice realizations which are particularly hard to analyze, the so-called worst cases. There exist no similar proofs of computational hardness for more restricted and realistic structures, such as, for instance, three-dimensional regular lattices for the Ising problem or finite connectivity random graphs for spin glasses.
As a final introductory remark, it is worth mentioning that the connection between worst-case complexity and average-case complexity is a building block of modern cryptography and communication theory. On the one hand, the so-called RSA cryptosystem is based on factoring large integers, a problem which is believed to be hard on average while it is not known to be so in the worst case. On the other hand, alternative cryptographic systems have been proposed which rely on a worst-case/average-case equivalence (see, e.g., the theorem of Ajtai (1996) concerning some hidden vector problems in high-dimensional lattices).
As far as communication theory is concerned, average-case complexity is indeed crucial: while Shannon's theorem (1948) provides a very general result stating that many optimal codes do exist (in fact, random codes are optimal), the decoding problem is in general NP-complete and therefore potentially intractable. However, since the choice of the coding scheme is part of the design, what matters is the average-case behavior of the decoding algorithm (and its large deviations), and very efficient codes which can solve on average the decoding problem close to Shannon's bounds are known.
In what follows, we will limit the discussion to two basic examples of combinatorial and counting problems which are representative and central to both computer science and statistical physics.

Constraint Satisfaction Problems
Combinatorial problems are usually written as constraint satisfaction problems (CSPs): $n$ discrete variables are given which have to satisfy $m$ constraints, all at the same time. Each constraint can take different forms depending on the problem under study: famous examples are the K-satisfiability (K-SAT) problem, in which constraints are an "OR" function of K variables in the ensemble (or their negations), and the graph Q-coloring problem, in which constraints simply enforce the condition that the endpoints of the edges in the graph must not have the same color (among the Q possible ones). Quite generally, a CSP can be written as the problem of finding a zero-energy ground state of an appropriate energy function, and its analysis amounts to performing a zero-temperature statistical physics study. Hard combinatorial problems are those which correspond to frustrated physical model systems.
Given an instance of a CSP, one wants to know whether there exists a solution, that is, an assignment of the variables which satisfies all the constraints (e.g., a proper coloring). When it exists, the instance is called SAT, and one wants to find a solution. Most of the interesting CSPs are NP-complete: in the worst case, the number of operations needed to decide whether an instance is SAT or not is expected to grow exponentially with the number of variables. But recent years have seen an upsurge of interest in the theory of typical-case complexity, where one tries to identify random ensembles of CSPs which are hard to solve, and the reason for this difficulty. As already mentioned, random ensembles of CSPs are also of great theoretical and practical importance in communication theory, since some of the best modern error-correcting codes (the so-called low-density parity check codes) are based on such constructions.
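The reformulation of a CSP as the search for a zero-energy ground state can be sketched in a few lines of Python. The encoding below (clauses as tuples of signed integers, energy as the number of violated clauses) and the function names are illustrative assumptions rather than notation from the article, and the exhaustive search is only meant for very small instances.

```python
from itertools import product


def energy(clauses, assignment):
    """Number of violated clauses.

    A clause is a tuple of nonzero integers: +i stands for x_i and -i for NOT x_i
    (variables are numbered from 1). `assignment` is a tuple of 0/1 values.
    """
    violated = 0
    for clause in clauses:
        # A literal is true when the variable's value agrees with the literal's sign.
        satisfied = any((assignment[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
        violated += 0 if satisfied else 1
    return violated


def ground_states(clauses, n):
    """Exhaustive search over all 2**n assignments; returns (minimum energy, minimizers)."""
    best, minimizers = None, []
    for assignment in product((0, 1), repeat=n):
        e = energy(clauses, assignment)
        if best is None or e < best:
            best, minimizers = e, [assignment]
        elif e == best:
            minimizers.append(assignment)
    return best, minimizers


if __name__ == "__main__":
    formula = [(1, 2, -3), (-1, 2, 3), (1, -2, 3), (-1, -2, -3)]
    print(ground_states(formula, 3))
```

An instance is SAT exactly when the minimum energy returned by `ground_states` is zero. The exponential cost of this enumeration is the worst-case behavior that the text refers to.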
Satisfiability and Spin Glass Models
The archetypical example of CSP is satisfiability (SAT). This is a core problem in computational complexity: it is the first one to have been shown NP-complete, and since then thousands of problems have been shown to be computationally equivalent to it. Yet it is not so easy to find difficult instances. The main ensemble which has been used for this goal is the random K-SAT ensemble (for K > 2, K-SAT is NP-complete).
The SAT problem is defined as follows. Given a vector of {0, 1} Boolean variables $x = \{x_i\}_{i \in I}$, where $I = \{1, \ldots, n\}$, consider a SAT formula defined by
$$ F(x) = \bigwedge_{a \in A} C_a(x) $$
where $A$ is an arbitrary finite set (disjoint with $I$) labeling the clauses $C_a$; $C_a(x) = \bigvee_{i \in I(a)} J_{a,i}(x_i)$; any literal $J_{a,i}(x_i)$ is either $x_i$ or $\bar{x}_i$ ("not" $x_i$); and finally, $I(a) \subset I$ for every $a \in A$. Similarly to $I(a)$, we can define the set $A(i) \subset A$ as $A(i) = \{a : i \in I(a)\}$, that is, the set of clauses containing variable $x_i$ or its negation.
Given a formula $F$, the problem of finding a variable assignment $s$ such that $F(s) = 1$, if it exists, can also be written as a spin glass problem as follows: if we consider a set of $n$ Ising spins, $\sigma_i \in \{\pm 1\}$, in place of the Boolean variables ($\sigma_i = -1, 1 \leftrightarrow x_i = 0, 1$), we may write the energy function associated to each clause as follows:
$$ E_a = \prod_{r=1}^{K} \frac{1 + J_{a,i_r}\sigma_{i_r}}{2} $$
where $J_{a,i} = -1$ (resp. $J_{a,i} = 1$) if $x_i$ (resp. $\bar{x}_i$) appears in clause $a$. The total energy of a configuration, $E = \sum_{a=1}^{|A|} E_a$, is nothing but a K-spin spin glass model.
Random K-SAT is a version of SAT in which each clause is taken to involve exactly K distinct variables, randomly chosen and negated with uniform distribution. Its energy function corresponds to a spin glass system over a finite connectivity (diluted) random graph.
In recent years random K-SAT has attracted much interest in computer science and in statistical physics. The interesting limit is the thermodynamic limit $n \to \infty$, $m = |A| \to \infty$ at fixed clause density $\alpha = m/n$.
Its most striking feature is certainly its sharp threshold. It is strongly believed that there exists a phase transition for this problem: numerical and heuristic analytical arguments are in support of the so-called satisfiability threshold conjecture:

Conjecture There exists $\alpha_c(K)$ such that with high probability:
if $\alpha < \alpha_c(K)$, a random instance is satisfiable;
if $\alpha > \alpha_c(K)$, a random instance is unsatisfiable.

Although this conjecture remains unproven, the existence of a nonuniform sharp threshold has been established by Friedgut (1999). A lot of effort has been devoted to understanding this phase transition. This is interesting both from the physics and the computer science points of view, because the random instances with $\alpha$ close to $\alpha_c$ are the hardest to solve. There exist rigorous results that give bounds for the threshold $\alpha_c(K)$: using these bounds, it was shown that $\alpha_c(K)$ scales as $2^K \ln(2)$ when $K \to \infty$.
On the statistical physics side, the cavity method (which is the generalization to disordered systems characterized by ergodicity breaking of the iterative method used to solve exactly physical models on the Bethe lattice) is a powerful tool which is claimed to be able to compute the exact value of the threshold, giving for instance $\alpha_c(3) \simeq 4.2667\ldots$ It is a nonrigorous method, but the self-consistency of its results has been checked by a "stability analysis," and it has also led to the development of a new family of algorithms, the so-called "survey propagation," which can solve efficiently very large instances at clause densities which are very close to the threshold (for technical details see Mézard and Zecchina (2002) and Braunstein et al. (2005) and references therein).

Figure 1 A pictorial representation of the clustering transition in random K-SAT. (Horizontal axis: clause density $\alpha = m/n$, with the regions Easy SAT, Hard SAT, and UnSAT from left to right; the diagram also carries the label O(n).)

The main hypothesis on which the cavity analysis of random K-SAT relies is the existence, in a region of clause density $[\alpha_d, \alpha_c]$ close to the threshold, of an intermediate phase called the "hard-SAT" phase; see Figure 1. In this phase the set $S$ of solutions (a subset of the vertices in an $n$-dimensional hypercube) is supposed to split into many disconnected clusters $S = S_1 \cup S_2 \cup \cdots$. If one considers two solutions $X$, $Y$ in the same cluster $S_j$, it is
possible to walk from $X$ to $Y$ (staying in $S$) by flipping at each step a finite number of variables. If, on the other hand, $X$ and $Y$ are in different clusters, in order to walk from $X$ to $Y$ (staying in $S$), at least one step will involve an extensive number (i.e., $\propto n$) of flips. This clustered phase is held responsible for entrapping many local-search algorithms into nonoptimal metastable states. This phenomenon is not exclusive to random K-SAT. It is also predicted to appear in many other hard SAT and optimization problems such as "coloring," and corresponds to the so-called "one-step replica symmetry breaking" (1RSB) phase in the language of statistical physics. It is also a crucial limiting feature for decoding algorithms in some error-correcting codes.
The only CSP for which the existence of the clustering phase has been established rigorously is the polynomial problem of solving random linear equations in GF(2) (Motwani and Raghavan 2000). For random K-SAT, rigorous probabilistic bounds can be used to prove the existence of the clustering phenomenon, for large enough K, in some region of $\alpha$ included in the interval $[\alpha_d(K), \alpha_c(K)]$ predicted by the statistical physics analysis.
In the analysis of CSPs like K-SAT, two main questions are in order. The first is of algorithmic nature and asks for an algorithm which decides whether for a given CSP instance all the constraints can be simultaneously satisfied or not. The second question is more theoretical and deals with large random instances, for which one wants to know the structure of the solution space and predict the typical behavior of classes of algorithms.

Message-Passing Algorithms from Statistical Physics
The algorithmic contributions of statistical mechanics to combinatorial optimization are numerous and important (a representative example being the celebrated "simulated annealing algorithm"). For the sake of brevity, here we limit the discussion to the so-called "message-passing algorithms," which are also of great interest in coding theory.
The statistical analysis of the cavity equations allows one to study the average properties of ensembles of problems, and it is totally equivalent to the replica method, in which the average over the ensemble is the first step in any calculation. The survey propagation (SP) equations are a formulation of the cavity equations which is valid for each specific instance and is able to provide information about the statistical behavior of the individual variables in the stable and metastable states of a given energy density (i.e., given fraction of violated constraints).
The single-sample SP equations are nicely described in terms of the factor graph representation used in information theory to characterize error-correcting codes. In the factor graph, the $N$ variables $i, j, k, \ldots$ are represented by circular "variable nodes," whereas the $M$ clauses $a, b, c, \ldots$ are represented by square "function nodes." For random K-SAT, the function nodes have connectivity $K$, while the variable nodes have a Poisson-distributed connectivity with average $\alpha K$.
The iterative SP equations are examples of message-passing procedures. In message-passing algorithms such as the so-called "belief propagation (BP) algorithm" used in error-correcting codes and statistical inference problems, the unknowns which are self-consistently evaluated by iteration are the marginals over the solution space of the variables characterizing the combinatorial problem (the probability space is the set of all solutions sampled with uniform measure). According to the physical interpretation, the quantities that are evaluated by SP are the probability distributions of local fields over the set of clusters. That is, while BP performs a "white" average over solutions, SP takes care of cluster-to-cluster fluctuations, telling us what the probability is of picking a cluster at random and finding a given variable biased in a certain direction (or unfrozen if it is paramagnetic in the cluster). SP computes quantities which are probabilities over different pure states: the order parameter which is evaluated as a fixed point of the SP equations is a probability measure in a space of functions or, for finite $n$, the full list of probability densities describing the cluster-to-cluster fluctuations of the variables.
In both SP and BP one assumes knowledge of the marginals of all variables in the temporary absence of one of them and then writes the marginal probability induced on this "cavity" variable in the absence of another, third variable interacting with it (i.e., the so-called Bethe lattice approximation for the problem). These relations define a closed set of equations for such cavity marginals that can be solved iteratively (this is known as the message-passing technique). The equations become exact if the cavity variables acting as inputs are uncorrelated. They are conjectured to be an asymptotically exact approximation over random locally tree-like structures such as, for instance, the random K-SAT factor graph. Both BP and SP can be derived in a variational framework.

Complexity of Counting Problems
In order to describe the nature of computational complexity of counting in physical models, it is enough to consider the classical Ising problem. The
computation of the Ising partition function or, more The generalization of the Pfaffian construction to
in general, of the weighted matching polynomial, is the nonplanar case must deal with the ambiguity of
the root problem of lattice statistics. orienting the homology cycles of the graph. Such a
For planar graphs like, for example, two-dimensional problem can be formally solved in full generality for
regular lattices, counting problems can often be any orientable lattice and leads to an expression of
solved by a variety of different methods, for the Ising partition function or the dimer coverings
example, transfer matrices and Pfaffians, which generating function given as a sum over all possible
require a number of operations which are poly- inequivalent orientations of the lattice (or its embed-
nomial in the number of vertices. ding surface): for a graph of genus g, the homology
The complexity of the counting problems changes basis is composed of 2g cycles and, therefore, there
if one considers nonplanar graphs, that is, graphs are 22g inequivalent orientations. It is only for graphs
with a nontrivial topological genus. In discrete of logarithmic genus that the generalized Pfaffian
mathematics, such problems are classified as formalism provides a polynomial algorithm.
#P-complete, meaning that the existence of an Counting perfect matchings can be thought of as
exact polynomial algorithm for the evaluation of the the problem of evaluating the permanent of 0–1
generating functions would imply the polynomial matrices over properly constructed bipartite graphs,
solvability of many known counting combinatorial which is among the oldest and most famous
problems, the most famous one being the evaluation of #P-complete problems.
the permanent of 0–1 matrices. In statistical mechanics The Pfaffian formalism when applied to the perma-
and mathematical chemistry, the interest in nonplanar nent problem leads to a simple general result, that is, it
lattices is obviously related to their D > 2 character: provides a general formula for writing the permanent
the three-dimensional cubic lattice is nothing but a of a matrix in terms of a number of determinants which
nonplanar graph of topological genus g = 1 þ N=4, is exponential in the genus of the underlying graph.
where N is the number of sites.
The planar two-dimensional Ising model was solved See also: Combinatorics: Overview; Determinantal
in 1944 by Onsager using the algebraic transfer matrix Random Fields; Dimer Problems; Phase Transitions in
method. Successively, alternative exact solutions have Continuous Systems; Spin Glasses; Two-Dimensional
been proposed which resorted to simple combinatorial Ising Model.
and geometrical reasoning. As is well known, the
underlying idea of the combinatorial methods consists
in recasting the sum over spin configurations of the Further Reading
Boltzmann weights as a sum over closed curves (loops)
Achlioptas D, Naor A, and Peres Y (2005) Rigorous location of
weighted by the activity of their bonds. Double phase transitions in hard optimization problems. Nature 435:
counting is avoided by a proper cancellation mechan- 759–764.
ism which takes care of the different intrinsic Ajtai N (1996) Generating hard instances of lattice problems.
topologies of loops which give rise to the same Electronic Colloquium on Computational Complexity
contribution in the partition function. Such an (ECCC) 7: 3.
Braunstein A, Mézard M, and Zecchina R (2005) Survey
approach has been developed first by Kac and Ward Propagation: an Algorithm for Satisfiability. Random Struc-
(1952) and provides a direct way of taking the field tures and Algorithms 27: 201–226.
theoretic continuum limit. In D > 2, the general- Cocco S and Monasson R (2004) Heuristic average-case analysis
ization of the above method encounters enormous of the backtrack resolution of random 3-satisfiability
difficulties due to the variety of intrinsic topologies of instances. Theoretical Computer Science A 320: 345.
Distler J (1992) A note on the 3D Ising model as a string theory.
surfaces immersed in D > 2 lattices. Nuclear Physics B 388: 648.
Another combinatorial method proposed in the Dubois O, Monasson R, Selman B, and Zecchina R (eds.) (2001)
1960s by Kasteleyn is the so-called Pfaffian method. NP-hardness and Phase transitions (Special Issue), Theoretical
It consists in writing the weighted sum over loops as Computer Science, vol. 265 (1–2). Elsevier.
a dimer covering or perfect matching generating Friedgut E (1999) Sharp threshold of graph properties, and the
KSat problem. Journal of American Mathematical Society 12:
function. Once the relationship between loop count- 1017–1054.
ing and dimer coverings (or perfect matchings) over Jerrum M and Sinclair A (1989) Approximating the permanent.
a suitably decorated and properly oriented lattice is SIAM Journal on Computing 18: 1149.
established, the Pfaffian method turns out to be a Lovász L and Plummer MD (1986) Matching Theory. North-
simple technique for the derivation of exact solu- Holland Mathematics Studies 121, Annals of Discrete Mathe-
matics (29). New York: North-Holland.
tions or for the definition of polynomial algorithms Mézard M, Mora T, and Zecchina R (2005) Clustering of
over planar lattices which are applicable also to the solutions in the random satisfiability problem. Physical Review
two-dimensional Ising spin glass. Letters 94: 197205.
Statistical Mechanics of Interfaces 55
Mézard M and Parisi G (2001) The Bethe lattice spin glass Motwani R and Raghavan P (2000) Randomized Algorithms.
revisited. European Physical Journal B 20: 217. Cambridge: Cambridge University Press.
Mézard M, Parisi G, and Virasoro MA (1987) Spin Glass Nishimori H (2001) Statistical Physics of Spin Glasses and
Theory and Beyond. Singapore: World Scientific. Information Processing. Oxford: Oxford University Press.
Mézard M, Parisi G, and Zecchina R (2002) Analytic and Papadimitriou CH (1994) Computational Complexity. Addison-
algorithmic solution of random satisfiability problems. Science Wesley.
297: 812–815. Regge T and Zecchina R (2000) Combinatorial and topological
Mézard M and Zecchina R (2002) Random K-satisfiability: from approach to the 3D Ising model. Journal of Physics A:
an analytic solution to a new efficient algorithm. Physical Mathematical and General 33: 741.
Review E 66: 056126. Richardson T and Urbanke R (2001) An introduction to the analysis
Monasson R, Zecchina R, Kirkpatrick S, Selman B, and of iterative coding systems. In: Marcus B and Rosenthal J (eds.)
Troyansky L (1999) Determining computational complexity Codes, Systems, and Graphical Models. New York: Springer.
from characteristic ‘‘phase transitions’’. Nature 400: 133.
Introduction
Figure 1 Partial and complete wetting.
When a fluid is in contact with another fluid, or
with a gas, a portion of the total free energy of the
system is proportional to the area of the surface of
contact, and to a coefficient, the surface tension,
which is specific for each pair of substances.
Equilibrium will accordingly be obtained when the conditions and it is a subsequent relaxation of the
free energy of the surfaces in contact is a minimum. macroscopic crystal that restores the equilibrium.
Suppose that we have a drop of some fluid, b, An interesting phenomenon that can be observed
over a flat substrate, w, while both are exposed to on these crystals is the roughening transition,
air, a. We have then three different surfaces of characterized by the disappearance of the facets of
contact, and the total free energy of the system a given orientation, when the temperature attains a
consists of three parts, associated to these three certain particular value. The best observations have
surfaces. A drop of fluid b will exist provided its been made on helium crystals, in equilibrium with
own two surface tensions exceed the surface tension superfluid helium, since the transport of matter and
between the substrate w and the air, that is, heat is then extremely fast. Crystals grow to sizes of
provided that 1–5 mm and relaxation times vary from milliseconds
to minutes. Roughening transitions for three differ-
$$\tau_{wb} + \tau_{ba} > \tau_{wa}$$
ent types of facets have been observed (see, e.g.,
If equality is attained, then a film of fluid b is Wolf et al. (1983)).
formed, a situation which is known as perfect, or These are some classical examples among a
complete wetting (see Figure 1). variety of interesting phenomena connected with
When one of the substances involved is aniso- the behavior of the interface between two phases in
tropic, such as a crystal, the contribution to the total a physical system. The study of the nature and
free energy of each element of area depends on its properties of the interfaces, at least for some simple
orientation. The minimum surface free energy for a systems in statistical mechanics, is also an interesting
given volume then determines the ideal form of the subject of mathematical physics. Some aspects of
crystal in equilibrium. this study will be discussed in the present article.
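The wetting criterion stated above for a drop b on a substrate w in air a can be sketched in a few lines of Python. The function names, the tolerance, and the numerical values are illustrative assumptions. The contact angle uses Young's relation, $\tau_{wa} = \tau_{wb} + \tau_{ba} \cos\theta$, a standard result that is not stated in the text and is included here only as a hedged complement.

```python
import math


def wetting_regime(tau_wb, tau_ba, tau_wa, tol=1e-12):
    """Classify the regime from the three surface tensions.

    partial wetting : tau_wb + tau_ba > tau_wa (a drop exists)
    complete wetting: equality (a film of fluid b covers the substrate)
    """
    excess = tau_wb + tau_ba - tau_wa
    if excess > tol:
        return "partial"
    if abs(excess) <= tol:
        return "complete"
    # A strict reverse inequality cannot hold for equilibrium tensions.
    return "not an equilibrium set of tensions"


def contact_angle_degrees(tau_wb, tau_ba, tau_wa):
    """Contact angle from Young's relation (assumed, not from the text)."""
    cos_theta = (tau_wa - tau_wb) / tau_ba
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


# Hypothetical values in arbitrary units.
print(wetting_regime(1.0, 1.0, 1.5), contact_angle_degrees(1.0, 1.0, 1.5))
print(wetting_regime(0.5, 1.0, 1.5), contact_angle_degrees(0.5, 1.0, 1.5))
```

In this sketch the contact angle shrinks to zero exactly when the inequality becomes an equality, which is the complete-wetting situation.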
It is only in recent times that equilibrium crystals We assume that the interatomic forces can be
have been produced in the laboratory, first, in modeled by a lattice gas, and consider, as a simple
negative crystals (vapor bubbles) of organic sub- example, the ferromagnetic Ising model. In a typical
stances. Most crystals grow under nonequilibrium two-phase equilibrium state, there is a dense
component, which can be interpreted as a solid or liquid phase, and a dilute phase, which can be interpreted as the vapor phase. Considering certain particular cases of such situations, we first introduce a precise definition of the surface tension and then proceed to the mathematical analysis of some preliminary properties of the corresponding interfaces. The next topic concerns the wetting properties of the system, and the final section is devoted to the associated equilibrium crystal.

Pure Phases and Surface Tension

The Ising model is defined on the cubic lattice $L = Z^3$, with configuration space $\Omega = \{-1, 1\}^L$. If $\sigma \in \Omega$, the value $\sigma(i) = -1$ or $1$ is the spin at the site $i = (i_1, i_2, i_3) \in L$, and corresponds to an empty or an occupied site in the lattice gas version of the model. The system is first considered in a finite box $\Lambda \subset L$, with fixed values of the spins outside.
In order to simplify the exposition, we shall mainly consider the three-dimensional Ising model, though some of the results to be discussed hold in any dimension $d \ge 2$. We shall also, sometimes, refer to the two-dimensional model, it being understood that the definitions have been adapted in the obvious way. We assume that the box $\Lambda$ is a parallelepiped, centered at the origin of $L$, of sides $L_1, L_2, L_3$, parallel to the axes.
A configuration of spins on $\Lambda$, $(\sigma(i), i \in \Lambda)$, denoted $\sigma_\Lambda$, has an energy defined by the Hamiltonian
$$H_\Lambda(\sigma_\Lambda \mid \bar\sigma) = -J \sum_{\langle i,j \rangle \cap \Lambda \neq \emptyset} \sigma(i)\sigma(j) \qquad [1]$$
where $J$ is a positive constant (ferromagnetic or attractive interaction). The sum runs over all nearest-neighbor pairs $\langle i, j \rangle \subset L$, such that at least one of the sites belongs to $\Lambda$, and one takes $\sigma(i) = \bar\sigma(i)$ when $i \notin \Lambda$, the configuration $\bar\sigma \in \Omega$ being the given boundary condition. The probability of the configuration $\sigma_\Lambda$, at the inverse temperature $\beta = 1/kT$, is given by the Gibbs measure
$$\mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) = Z^{\bar\sigma}(\Lambda)^{-1} \exp(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)) \qquad [2]$$
where $Z^{\bar\sigma}(\Lambda)$ is the partition function
$$Z^{\bar\sigma}(\Lambda) = \sum_{\sigma_\Lambda} \exp(-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma)) \qquad [3]$$
Local properties at equilibrium can be described by the correlation functions between the spins on finite sets of sites,
$$\mu_\Lambda^{\bar\sigma}(\sigma(A)) = \sum_{\sigma_\Lambda} \mu_\Lambda(\sigma_\Lambda \mid \bar\sigma) \prod_{i \in A} \sigma(i) \qquad [4]$$
The measures [2] determine (by the Dobrushin–Lanford–Ruelle equations) the set of Gibbs states of the infinite system, as measures on the set $\Omega$ of all configurations. If a Gibbs state happens to be equal to $\lim \mu_\Lambda(\cdot \mid \bar\sigma)$, when $L_1, L_2, L_3 \to \infty$, under a fixed boundary condition $\bar\sigma$, we shall call it the Gibbs state associated to the boundary condition $\bar\sigma$. One also says that this state exists in the thermodynamic limit. Then, equivalently, the correlation functions [4] converge to the corresponding expectation values in this state.
This model presents, at low temperatures (i.e., for $\beta > \beta_c$, where $\beta_c$ is the critical inverse temperature), two different thermodynamic pure phases, a dense and a dilute phase in the lattice gas language (called here the positive and the negative phase). This means two extremal translation-invariant Gibbs states, $\mu^{+}$ and $\mu^{-}$, obtained as the Gibbs states associated with the boundary conditions $\bar\sigma$, respectively equal to the ground configurations $\bar\sigma(i) = 1$ and $\bar\sigma(i) = -1$, for all $i \in L$. The spontaneous magnetization
$$m^{*}(\beta) = \mu^{+}(\sigma(i)) = -\mu^{-}(\sigma(i)) \qquad [5]$$
is then strictly positive. On the other hand, if $\beta \le \beta_c$, then the Gibbs state is unique and $m^{*} = 0$.
Each configuration inside $\Lambda$ can be described in a geometric way by specifying the set of Peierls contours which indicate the boundaries between the regions of spin $1$ and the regions of spin $-1$. Unit-square faces are placed midway between the pairs of nearest-neighbor sites $i$ and $j$, perpendicularly to these bonds, whenever $\sigma(i)\sigma(j) = -1$. The connected components of this set of faces are the Peierls contours. Under the boundary conditions $(+)$ and $(-)$, the contours form a set of closed surfaces. They describe the defects of the considered configuration with respect to the ground states of the system (the constant configurations $1$ and $-1$), and are a basic tool for the investigation of the model at low temperatures.
In order to study the interface between the two pure phases, one needs to construct a state describing the coexistence of these phases. This can be done by means of a new boundary condition. Let $n = (n_1, n_2, n_3)$ be a unit vector in $R^3$, such that $n_3 > 0$, and introduce the mixed boundary condition $(\pm, n)$, for which
$$\bar\sigma(i) = \begin{cases} 1 & \text{if } i \cdot n \ge 0 \\ -1 & \text{if } i \cdot n < 0 \end{cases} \qquad [6]$$
This boundary condition forces the system to produce a defect going transversally through the box $\Lambda$, a large Peierls contour that can be
interpreted as the microscopic interface (also called the number of faces of (inside ). The term U ()
a domain wall). The other defects that appear above equals ln Zþ (, )=Zþ (), the sum in the partition
and below the interface can be described by closed function Zþ (, ) being extended to all configura-
contours inside the pure phases. tions whose associated contours do not intersect .
The free energy per unit area due to the presence Each term in sum [9] gives a weight proportional to
of the interface is the surface tension. It is defined by the probability of the corresponding microscopic
interface.
$$\tau(n) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{n_3}{\beta L_1 L_2} \ln \frac{Z^{\pm,n}(\Lambda)}{Z^{+}(\Lambda)} \qquad [7]$$
At low (positive) temperatures, we expect the microscopic interface corresponding to this boundary condition, which at zero temperature coincides
In this expression the volume contributions propor-
with the plane i3 = 1=2, to be modified by small
tional to the free energy of the coexisting phases, as
deformations. Each microscopic interface can then
well as the boundary effects, cancel, and only the
be described by its defects, with respect to the
contributions to the free energy due to the interface
interface at = 1. To this end, one introduces some
are left. The existence of such a quantity indicates
objects, called walls, which form the boundaries
that the macroscopic interface, separating the
between the horizontal plane portions of the micro-
regions occupied by the pure phases in a large
scopic interface, also called the ceilings of the
volume , has a microscopic thickness and can
interface.
therefore be regarded as a surface in a thermo-
More precisely, one says that a face of is a
dynamic approach.
ceiling face if it is horizontal and such that
Theorem 1 The interfacial free energy per unit the vertical line passing through its center does not
area, (n), exists, is bounded, and its extension by have other intersections with . Otherwise, one
positive homogeneity, f (x) = jxj (x=jxj), is a convex says that it is a wall face. The set of wall faces splits
function on R3 . Moreover, (n) is strictly positive into maximal connected components. The set of
for > c , and vanishes if c . walls, associated to , is the set of these compo-
nents, each component being identified by its
The existence of (n) and also the last statement
geometric form and its projection on the plane
were proved by Lebowitz and Pfister (1981), in the
$i_3 = -1/2$. Every wall $\omega$, with projection $\pi(\omega)$,
particular case n = (0, 0, 1), with the help of correla-
increases the energy of the interface by a quantity
tion inequalities. A complete proof of the theorem
$2J\|\omega\|$, where $\|\omega\| = |\omega| - |\pi(\omega)|$, and two walls are
was given later with similar arguments. The con-
compatible if their projections do not intersect. In
vexity of f is equivalent to the fact that the surface
this way, the microscopic interfaces may be inter-
tension satisfies a thermodynamic stability condi-
preted as a ‘‘gas of walls’’ on the two-dimensional
tion known as the pyramidal inequality (see
lattice.
Messager et al. (1992)).
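The convexity statement of Theorem 1 can be checked numerically in the one case where the surface tension is elementary. At zero temperature the interface with normal $n$ is a staircase of unit faces, each costing $2J$, so that $\tau(n) = 2J(|n_1| + |n_2| + |n_3|)$. This closed form is an assumption of the sketch (it follows from counting faces of the ground-state interface and is not quoted from the text), as are the function names and the sampling parameters.

```python
import random

J = 1.0  # coupling constant, illustrative value


def tau_zero_temperature(n):
    """Zero-temperature surface tension for a unit normal n (assumed form)."""
    return 2.0 * J * sum(abs(c) for c in n)


def f(x):
    """Extension of tau by positive homogeneity: f(x) = |x| tau(x/|x|)."""
    norm = sum(c * c for c in x) ** 0.5
    if norm == 0.0:
        return 0.0
    unit = tuple(c / norm for c in x)
    return norm * tau_zero_temperature(unit)


def check_convexity(samples=10000, seed=0, tol=1e-9):
    """Test f(t x + (1-t) y) <= t f(x) + (1-t) f(y) on random points."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        y = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
        t = rng.random()
        mid = tuple(t * a + (1.0 - t) * b for a, b in zip(x, y))
        if f(mid) > t * f(x) + (1.0 - t) * f(y) + tol:
            return False
    return True


print(check_convexity())
```

A random search of this kind can only fail to find a counterexample; it does not replace the proof, and at positive temperature $\tau(n)$ has no such simple expression.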
Dobrushin, who developed the above analysis,
also proved the dilute character of this ‘‘gas’’ at low
temperatures. This implies that the microscopic
Gibbs States and Interfaces interface is essentially flat, or rigid. One can under-
In this section we consider the ( , n0 ) boundary stand this fact by noticing first that the probability
condition, also simply denoted ( ), associated to the of a wall is less than exp (2Jk!k) and, second,
vertical direction n0 = (0, 0, 1), that in order to create a ceiling in , which is not in
the plane i3 = 1=2, one needs to surround it by a
$$\bar\sigma(i) = 1 \ \text{if}\ i_3 \ge 0; \qquad \bar\sigma(i) = -1 \ \text{if}\ i_3 < 0 \qquad [8]$$
wall, that one has to grow when the ceiling is made
over a larger area.
The corresponding surface tension is = (n0 ). We Using correlation inequalities one proves that the
shall first recall some classical results which concern Gibbs state , associated to the ( ) boundary
the Gibbs states and interfaces at low temperatures. conditions, always exists, and that it is invariant
According to the geometrical description of the under horizontal translations of the lattice, that is,
configurations introduced in the last section, we ((A þ a)) = ((A)) for all a = (a1 , a2 , 0). It is
observe that also an extremal Gibbs state. Let m(z) be the
magnetization $\mu^{\pm}(\sigma(z))$ at the site $z = (0, 0, z)$. The
$$Z^{\pm,n_0}(\Lambda) / Z^{+}(\Lambda) = \sum_{I} \exp(-2\beta J |I| - U_\Lambda(I)) \qquad [9]$$
function m(z) is monotone increasing and satisfies
the symmetry property m(z) = m(z þ 1). Some
where the sum runs over all microscopic interfaces consequences of Dobrushin’s work are the following
compatible with the boundary condition and jj is properties.
Theorem 2 If the temperature is low enough, that This is an exact computation that has been done by
is, if J c1 , where c1 is a given constant, then Abraham and Reed.
Let us come back to the three-dimensional Ising
model where we know that the interface orthogonal to a lattice axis is rigid at low temperatures.
$$m^{\pm}(0) \ \text{is strictly positive} \qquad [10]$$
$$m^{\pm}(z) \to m^{*}, \ \text{when}\ z \to \infty, \ \text{exponentially fast} \qquad [11]$$
Question 1 At higher temperatures, but before
reaching the critical temperature, do the fluctuations
Equation [10] is just another way of saying that
of this interface become unbounded, in the thermo-
the interface is rigid and that the state is non-
dynamic limit, so that the corresponding Gibbs state
translation invariant (in the vertical direction).
is translation invariant?
Then, the correlation functions ((A)) describe
the local properties, or local structure, of the One says then that the interface is rough, and it is
macroscopic interface. In particular, the function believed that, effectively, the interface becomes
m(z) represents the magnetization profile. Then rough when the temperature is raised, undergoing
statement [11], together with the symmetry prop- a roughening transition at an inverse temperature
erty, tells us that the thickness of this interface is R > c .
finite, with respect to the unit lattice spacing. It is known that R cd = 2 , the critical inverse
The statistics of interfaces has been rewritten in temperature of the two-dimensional Ising model,
terms of a gas of walls and this system may further since van Beijeren proved, using correlation inequal-
be studied by cluster expansion techniques. There is ities, that above this value, the state is not
an interaction between the walls, coming from the translation invariant. Recalling that the rigid inter-
term U () in eqn [9], but a convenient mathema- face may be viewed as a two-dimensional system,
tical description of this interaction can be obtained the system of walls, a representation that would
by applying the standard low-temperature cluster become inappropriate for a rough interface, one
expansion, in terms of contours, to the regions might think that the phase transition of the two-
above and below the interface. dimensional Ising model is relevant for the rough-
This method was introduced by Gallavotti in his ening transition, and that R is somewhere near
study (mentioned below) of the two-dimensional cd = 2 . Indeed, approximate methods, used by Weeks
Ising model. It has been applied by Bricmont and and co-workers give some evidence for the existence
co-workers to examine the interface structure in the of such a R and suggest a value slightly smaller
present case. As a consequence, it follows that than cd = 2 , as shown in Table 1. To this day,
the surface tension, more exactly (), and also however, there appears to be no proof of the fact
the correlation functions, are analytic functions at that R > c , that is, that the roughening transition
low temperatures. They can be obtained as explicit for the three-dimensional Ising model really occurs.
convergent series in the variable = e2J . At present one is able to study the roughening
The same analysis applied to the two-dimensional transition rigorously only for some simplified mod-
model shows a very different behavior at low els with a restricted set of admissible microscopic
temperatures. In this case, the microscopic interface interfaces. Moreover, the closed contours, describing
is a polygonal line and the walls belong to the one- the defects above and below , are neglected, so that
dimensional lattice. One can then increase the size of these two regions have the constant configurations 1
a ceiling without modifying the walls attached to it. or 1, and one has U () = 0 in eqn [9].
Indeed, Gallavotti turned this observation into a The best known of these models is the classic SOS
proof that the Gibbs state is now translation (solid-on-solid) model in which the interfaces have
invariant.
The line undergoes large fluctuations of order $\sqrt{L_1}$, and disappears from any finite region of
the property of being cut only once by all vertical lines of the lattice. This means that $I$ is the graph of
the lattice, in the thermodynamic limit. In particular, a function that can equivalently be used to define
we have then $\mu^{\pm} = (1/2)(\mu^{+} + \mu^{-})$, a result that
the possible configurations of $I$. If $I$ contains the
extends to all boundary conditions ( , n). horizontal face with center (i1 , i2 , i3 1=2), then
Using these results Bricmont and co-workers also
studied the local structure of the interface at low
temperatures and showed that its intrinsic thickness is finite. To study the global fluctuations, one can compute the magnetization profile by introducing, before taking the thermodynamic limit, a change of scale: $\mu^{\pm}_{\Lambda}(\sigma(z L_1^{\delta}))$, with $\delta = 1/2$ or near to this value.

Table 1 Some temperature values
d = 3    $\beta_c J \approx 0.22$    approximate critical temperature
d = 3    $\beta_R J \approx 0.41$    conjectured roughening temperature
d = 2    $\beta_c J = 0.44$    exact critical temperature
$\phi(i_1, i_2) = i_3$.
The proof that the SOS model with the boundary condition ($\pm$) has a roughening transition is a highly nontrivial result due to Fröhlich and Spencer. When $\beta$ is small enough, the fluctuations of $I$ are of order $\sqrt{\ln L}$ (in a cubic box of side $L$).
Moreover, other interface models, with additional conditions on the allowed microscopic interfaces, are exactly solvable. The BCSOS (body-centered SOS) model, introduced by van Beijeren, belongs to this class. It is, in fact, the first model for which the existence of a roughening transition has been proved. More recently, also the TISOS (triangular Ising SOS) model, introduced by Blöte and Hilhorst and further studied by Nienhuis and co-workers, has been considered in this context.
The interested reader can find more information and references, concerning the subject of this section, in the review article by Abraham (1986).
Figure 2 Boundary conditions for the cubic lattice. Above, the box $\Lambda$ with the ($\pm$) and (step) boundary conditions. Below, the box $\Lambda'$ and the wall W with the (w$-$) boundary conditions.
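In the SOS model the interface is the graph of an integer height function and, with the closed contours neglected, its statistical weight depends only on the number of faces $|I|$ through the factor $\exp(-2\beta J |I|)$. The following Python sketch counts the faces of a small height configuration: one horizontal face per column plus as many vertical faces as the height difference between neighboring columns. The function names and the free treatment of the patch edges (no faces are counted toward the outside) are simplifying assumptions made only for illustration.

```python
import math


def sos_face_count(heights):
    """Number of unit faces of the SOS interface over a rectangular patch.

    heights: list of rows of integer column heights.
    Edges of the patch are left free (an assumption of this sketch).
    """
    rows, cols = len(heights), len(heights[0])
    horizontal = rows * cols  # one ceiling face above every column
    vertical = 0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                vertical += abs(heights[r][c] - heights[r + 1][c])
            if c + 1 < cols:
                vertical += abs(heights[r][c] - heights[r][c + 1])
    return horizontal + vertical


def sos_weight(heights, beta, J):
    """Unnormalized Boltzmann weight exp(-2 beta J |I|)."""
    return math.exp(-2.0 * beta * J * sos_face_count(heights))


flat = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
bump = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(sos_face_count(flat), sos_face_count(bump))
print(sos_weight(bump, 1.0, 1.0) / sos_weight(flat, 1.0, 1.0))
```

Raising a single column by one unit adds four vertical faces, so the elementary excitation carries the relative weight $e^{-8\beta J}$.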
Wetting Phenomena
Next we consider the Ising model over a plane condition, (i) = 1 or (i) = 1, for all i 2 L0 . Let us
horizontal substrate (also called a wall) and study consider first the case of the () boundary condition.
the difference of surface tensions which governs the The surface free energy contribution per unit area
wetting properties of this system. due to the presence of the wall, when we have the
We first describe the approach developed by negative phase in the bulk, is
Fröhlich and Pfister (1987) and briefly report some
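results of their study. We consider the model on the semi-infinite lattice
$$L' = \{ i \in Z^3 : i_3 \ge 0 \} \qquad [12]$$
$$\tau^{w-}(\beta, K) = \lim_{L_1, L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{1}{\beta L_1 L_2} \ln \frac{Z^{w-}(\Lambda')}{Z^{-}(\Lambda)^{1/2}} \qquad [14]$$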
The division by Z ()1=2 allows us to subtract from
A magnetic field, K 0, is added on the boundary
the total free energy, ln Zw (0 ), the bulk term and
sites, i3 = 0, which describes the interaction with the
all boundary terms which are not related to the
substrate, supposed to occupy the complementary
presence of the wall. The existence of limit [14]
region L n L0 .
follows from correlation inequalities, and we have
We constrain the model in the finite box 0 = \ L0 ,
w 0.
with as above, and impose the value of the spins
One can prove, as well, the existence of the Gibbs
outside. The Hamiltonian becomes
state w of the semi-infinite system, associated to
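$$H^{w}_{\Lambda'}(\sigma_{\Lambda'} \mid \bar\sigma) = -J \sum_{\langle i,j \rangle \cap \Lambda' \neq \emptyset} \sigma(i)\sigma(j) - K \sum_{i \in \Lambda',\, i_3 = 0} \sigma(i) \qquad [13]$$
the ($-$) boundary condition. This state is the limit of the finite volume Gibbs measures $\mu_{\Lambda'}(\sigma_{\Lambda'} \mid (-))$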
defined by the Hamiltonian [13]. It describes the
Here 0 represents the configuration inside 0 , the local equilibrium properties of the system near
pairs hi, ji are contained in L0 , and (i) = (i) when the wall, when deep inside the bulk the system is
i 62 0 , the configuration being the given boundary in the negative phase. Similar definitions give the
condition (see Figure 2). The corresponding parti- surface tension wþ and the Gibbs state wþ ,
tion function is denoted by Zw (0 ). corresponding to the boundary condition (i) = 1,
Since there are two pure phases in the model, we for all i 2 0 .
must consider two surface free energies, or surfaces We remark that the states wþ and w are invariant
tensions, wþ and w , between the wall and the by translations parallel to the plane i3 = 0, and
positive or negative phase present in the bulk. They introduce the magnetizations, mw (z) = w ((z)),
are defined through the choice of the boundary where z denotes the site (0, 0, z), mw = mw (0), and
similarly mwþ (z) and mwþ . Their connection with function Zw , etc., and we obtain = 2K and
the surface free energies is given by the formula = 2J. For nonzero but low temperatures, the
small perturbations of these ground states have to be
$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) = \int_0^K \left( m^{w+}(\beta, s) - m^{w-}(\beta, s) \right) ds \qquad [15]$$
considered, a problem that can be treated by the method of cluster expansions. In fact, the corresponding defects can be described by closed contours as in the case of pure phases.
We mention in the following theorem some
results of Fröhlich and Pfister’s study. Here is, Theorem 4 For K < J, the functions w (, K)
as before, the usual surface tension between the and wþ (, K) are analytic at low temperatures,
two pure phases of the system, for a horizontal that is, provided that (J K) c2 , where c2 is a
interface. given constant. Moreover, mwþ (z) and mw (z) tend,
respectively, to m and to m , when z ! 1,
Theorem 3 With the above definitions, we have exponentially fast.
$$\tau^{w-}(\beta, K) - \tau^{w+}(\beta, K) \le \tau(\beta) \qquad [16]$$
$$m^{w+}(\beta, K) - m^{w-}(\beta, K) \ge 0 \qquad [17]$$
The last statement in Theorem 4 tells us that the wall affects only a layer of finite thickness (with respect to the lattice spacing). From a macroscopic
and the difference in [17] is a monotone decreasing point of view, the negative phase reaches the wall,
function of the parameter K. Moreover, if mwþ = mw , and we are in the partial-wetting regime. Indeed, a
then the Gibbs states wþ and w coincide. strict inequality holds in [16].
Thus, for K < J there is always partial wetting at
The proof is a subtle application of correlation low temperatures. Then the following question arises:
inequalities. Since, from Theorem 3, the integrand in
eqn [15] is a positive and decreasing function, the Question 2 Is there a situation of complete wetting
difference = w wþ is a monotone increasing at higher temperatures? It is understood here that K
and concave (and hence continuous) function of the takes a fixed value, characteristic of the substrate,
parameter K. On the other hand, one can prove that such that 0 < K < J.
= , if K J. This justifies the following This is known to be the case in dimension d = 2,
definition: where the exact value of Kw () can be obtained
$$K_w(\beta) = \min \{ K : \Delta\tau(\beta, K) = \tau(\beta) \} \qquad [18]$$
from Abraham's solution of the model:
$$\cosh 2\beta K_w = \cosh 2\beta J - e^{-2\beta J} \sinh 2\beta J$$
In the thermodynamic description of wetting, the
partial-wetting regime is characterized by the strict
Then complete wetting occurs for in the interval
inequality in [16]. Equivalently, by K < Kw (). We
c < w (K), where c is the critical inverse
must have then mwþ 6¼ mw , because of eqn [15].
temperature and w (K) is the solution of Kw () = K.
This shows that, in the case of partial wetting, wþ
The case d = 2 has been reviewed in Abraham
and w are different Gibbs states.
(1986).
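The formula obtained from Abraham's solution for $d = 2$ is explicit enough to be evaluated directly. The Python sketch below computes $K_w(\beta)$ from $\cosh 2\beta K_w = \cosh 2\beta J - e^{-2\beta J} \sinh 2\beta J$ and inverts it by bisection to obtain $\beta_w(K)$. The function names, the bisection bounds, and the assumption that $K_w$ increases with $\beta$ (suggested by the numerical values, not stated in the text) are illustrative. The two-dimensional critical point is taken from the standard condition $\sinh 2\beta_c J = 1$, which reproduces the value 0.44 of Table 1.

```python
import math


def beta_c_2d(J=1.0):
    """Critical inverse temperature of the 2D Ising model: sinh(2 beta J) = 1."""
    return 0.5 * math.log(1.0 + math.sqrt(2.0)) / J


def wetting_field_2d(beta, J=1.0):
    """K_w(beta) from cosh(2 beta K_w) = cosh(2 beta J) - exp(-2 beta J) sinh(2 beta J)."""
    rhs = math.cosh(2.0 * beta * J) - math.exp(-2.0 * beta * J) * math.sinh(2.0 * beta * J)
    rhs = max(rhs, 1.0)  # guard against rounding just below 1 at beta_c
    return math.acosh(rhs) / (2.0 * beta)


def wetting_inverse_temperature_2d(K, J=1.0, iterations=100):
    """Solve K_w(beta) = K for 0 < K < J by bisection (monotonicity assumed)."""
    lo, hi = beta_c_2d(J), 50.0 / J
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if wetting_field_2d(mid, J) < K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(beta_c_2d())
for beta in (0.6, 1.0, 2.0):
    print(beta, wetting_field_2d(beta))
print(wetting_inverse_temperature_2d(0.5))
```

In this sketch $K_w$ vanishes at $\beta_c$ and approaches $J$ as $\beta \to \infty$, consistent with partial wetting at low temperatures for every fixed $K < J$.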
The complete-wetting regime is characterized by the
To our knowledge, the above question remains an
equality in [16], that is, by K Kw (). Then, we have
open problem for the Ising model in dimension
mwþ = mw , and taking into account the last statement
d = 3. The problem has, however, been solved for
in Theorem 3, also wþ = w . This last result implies
the simpler case of a SOS interface model. In this
that there is only one Gibbs state. Thus, complete
case, a nice and rather brief proof of the following
wetting corresponds to unicity of the Gibbs state.
result has been given by Chalker (1982): one has
In this case, we also have lim mw (z) = m , when
mwþ = mw , and hence complete wetting, if
z ! 1, because this is always true for mwþ (z). This
indicates that we are in the positive phase of the
$$2\beta(J - K) < -\ln(1 - e^{-8\beta J})$$
system although we have used the () boundary
condition, so that the bulk negative phase cannot It is very plausible that a similar statement is valid
reach the wall anymore. The film of positive phase, for the semi-infinite Ising model and, also that
which wets the wall completely, has an infinite Chalker’s method could play a role for extending the
thickness with respect to the unit lattice spacing, in proof to this case, provided an additional assump-
the thermodynamic limit. tion is made. Namely, that is sufficiently large,
When = 1, only a few particular ground con- and hence J K small enough, in order to insure the
figurations contribute to the partition functions, convergence of the cluster expansions and to be able
such as the configuration (i) = 1 for the partition to use them.
section ‘‘Gibbs states and interfaces.’’ This analysis allows us to determine the shape of the facets in a rigorous way.
We first observe that the appearance of a facet in the equilibrium crystal shape is related, according to the Wulff construction, to the existence of a discontinuity in the derivative of the surface tension with respect to the orientation. More precisely, assume that the surface tension satisfies the convexity condition of Theorem 1, and let this function $\tau(n) = \tau(\theta, \varphi)$ be expressed in terms of the spherical coordinates of $n$, the vector $n_0$ being taken as the $x_3$-axis. A facet orthogonal to $n_0$ appears in the Wulff shape if and only if the derivative $\partial\tau(\theta, \varphi)/\partial\theta$ is discontinuous at the point $\theta = 0$, for all $\varphi$. The facet $F \subset \partial W$ consists of the points $x \in R^3$ belonging to the plane $x_3 = \tau(n_0)$ and such that, for all $\varphi$ between $0$ and $2\pi$,
$$x_1 \cos\varphi + x_2 \sin\varphi \le \partial\tau(\theta, \varphi)/\partial\theta \,|_{\theta = 0^{+}} \qquad [21]$$
The step free energy is expected to play an important role in the facet formation. It is defined as the free energy associated with the introduction of a step of height 1 on the interface, and can be regarded as an order parameter for the roughening transition. Let $\Lambda$ be a parallelepiped as in the section ‘‘Pure phases and surface tension,’’ and introduce the (step, $m$) boundary conditions (see Figure 2), associated to the unit vectors $m = (\cos\varphi, \sin\varphi) \in R^2$, by
$$\bar\sigma(i) = \begin{cases} 1 & \text{if } i_3 > 0 \text{ or if } i_3 = 0 \text{ and } i_1 m_1 + i_2 m_2 \ge 0 \\ -1 & \text{otherwise} \end{cases} \qquad [22]$$
Then, the step free energy per unit length for a step orthogonal to $m$ (with $m_2 > 0$) on the horizontal interface, is
$$\tau^{\mathrm{step}}(\varphi) = \lim_{L_1 \to \infty} \lim_{L_2 \to \infty} \lim_{L_3 \to \infty} -\frac{\cos\varphi}{\beta L_1} \ln \frac{Z^{\mathrm{step},m}(\Lambda)}{Z^{\pm,n_0}(\Lambda)} \qquad [23]$$
A first result concerning this point was obtained by Bricmont and co-workers, by proving a correlation inequality which establishes $\tau^{\mathrm{step}}(0)$ as a lower bound to the one-sided derivative $\partial\tau(\theta, 0)/\partial\theta$ at $\theta = 0^{+}$ (the inequality extends also to $\varphi \neq 0$). Thus, when $\tau^{\mathrm{step}} > 0$, a facet is expected.
Using the perturbation theory of the horizontal interface, it is possible to also study the microscopic interfaces associated with the (step, $m$) boundary conditions. When considering these configurations, the step may be viewed as an additional defect on the rigid interface described in the section ‘‘Pure phases and surface tension.’’ It is, in fact, a long wall going from one side to the other side of the box $\Lambda$. The step structure at low temperatures can then be analyzed with the help of a new cluster expansion. As a consequence of this analysis, we have the following theorem.

Theorem 5 If the temperature is low enough, that is, if $\beta J \ge c_3$, where $c_3$ is a given constant, then the step free energy, $\tau^{\mathrm{step}}(\varphi)$, exists, is strictly positive, and extends by positive homogeneity to a strictly convex function. Moreover, $\beta\tau^{\mathrm{step}}(\varphi)$ is an analytic function of $\zeta = e^{-2\beta J}$, for which an explicit convergent series expansion can be found.

Using the above results on the step structure, similar methods allow us to evaluate the increment in surface tension of an interface tilted by a very small angle $\theta$ with respect to the rigid horizontal interface. This increment can be expressed in terms of the step free energy, and one obtains the following relation.

Theorem 6 For $\beta J \ge c_3$, we have
$$\partial\tau(\theta, \varphi)/\partial\theta \,|_{\theta = 0^{+}} = \tau^{\mathrm{step}}(\varphi) \qquad [24]$$

This relation, together with eqn [21], implies that one obtains the shape of the facet by means of the two-dimensional Wulff construction applied to the step free energy. The reader will find a detailed discussion on these points, as well as the proofs of Theorems 5 and 6, in Miracle-Sole (1995).
From the properties of $\tau^{\mathrm{step}}$ stated in Theorem 5, it follows that the Wulff equilibrium crystal presents well-defined boundary lines, smooth and without straight segments, between a rounded part of the crystal surface and the facets parallel to the three main lattice planes.
It is expected, but not proved, that at a higher temperature, but before reaching the critical temperature, the facets associated with the Ising model undergo a roughening transition. It is then natural to believe that the equality [24] is true for any $\beta$ larger than $\beta_R$, allowing us to determine the facet shape from eqns [21] and [24], and that for $\beta \le \beta_R$, both sides in this equality vanish, and thus, the disappearance of the facet is involved. However, the condition that the temperature is low enough is needed in the proofs of Theorems 5 and 6.

See also: Dimer Problems; Phase Transitions in Continuous Systems; Phase Transition Dynamics; Two-Dimensional Ising Model; Wulff Droplets.
Stochastic Differential Equations 63
More precisely we perturb eqn [1] as follows:

\[ \frac{dX_t}{dt} = b(t, X_t) + a(t, X_t)\,\xi_t, \qquad X_0 = x_0 \qquad [2] \]

We suppose for a moment that $d = m = 1$. In reality no reasonable real-valued process $(\xi_t)_{t \ge 0}$ fulfilling the previous assumptions exists. In particular, if the process $(\xi_t)$ exists (resp. $(\xi_t)$ exists and each $\xi_t$ is a square-integrable r.v.), then the process cannot have continuous paths (resp. it cannot be measurable with respect to $\Omega \times \mathbb{R}_+$). However, suppose that such a process exists; we set $B_t = \int_0^t \xi_s\,ds$. In that case, properties (1) and (2) can be translated into the following on $(B_t)$.

(P1) It has independent increments, which means that for any $t_0, \dots, t_n$, $h \ge 0$, the r.v.'s $B_{t_1+h} - B_{t_0+h}, \dots, B_{t_n+h} - B_{t_{n-1}+h}$ are independent.
(P2) It has stationary increments, which means that for any $t_0, \dots, t_n$, $h \ge 0$, the law of $(B_{t_1+h} - B_{t_0+h}, \dots, B_{t_n+h} - B_{t_{n-1}+h})$ does not depend on $h$.

On the other hand, it is natural to require that

(C1) $B_0 = 0$ a.s.,
(C2) it is a continuous process, that is, it has continuous paths a.s.

Equation [2] should be rewritten in some integral form

\[ X_t = X_0 + \int_0^t b(s, X_s)\,ds + \int_0^t a(s, X_s)\,dB_s \qquad [3] \]

Clearly the paths of the process $(B_t)$ cannot be differentiable, so one has to give meaning to the integral $\int_0^t a(s, X_s)\,dB_s$. This will be intended in the “Itô” sense, see considerations below.

An important result of probability theory says that a stochastic process $(B_t)$ fulfilling properties P1, P2 and C1, C2 is essentially a “Brownian motion”. More precisely, there are real constants $b$, $\sigma$ such that $B_t = bt + \sigma W_t$, where $(W_t)$ is a classical Brownian motion defined below.

Definition 1
(i) A (continuous) stochastic process $(W_t)$ is called classical “Brownian motion” if $W_0 = 0$ a.s., it has independent increments and the law of $W_t - W_s$ is Gaussian $N(0, t - s)$.
(ii) An $m$-dimensional Brownian motion is a vector $(W^1, \dots, W^m)$ of independent classical Brownian motions.

Let $(\mathcal{F}_t)_{t \ge 0}$ be a filtration fulfilling the usual conditions, see Karatzas and Shreve (1991, section 1.1). There one can find basic concepts of the theory of stochastic processes, such as the concepts of adapted and progressively measurable process. An adapted process is also said to be nonanticipating towards the filtration $(\mathcal{F}_t)$, which represents the state of the information at each time $t$. A process $(X_t)$ is said to be adapted if for any $t$, $X_t$ is $\mathcal{F}_t$-measurable. The notion of progressively measurable process is a slight refinement of the notion of adapted process.

Definition 2
(i) A (continuous) $(\mathcal{F}_t)$-adapted process $(W_t)$ is called (classical) $(\mathcal{F}_t)$-Brownian motion if $W_0 = 0$ and if, for any $s < t$, $W_t - W_s$ is an $N(0, t - s)$ distributed r.v. which is independent of $\mathcal{F}_s$.
(ii) An $(\mathcal{F}_t)$-$m$-dimensional Brownian motion is a vector $(W^1, \dots, W^m)$ of independent $(\mathcal{F}_t)$-classical Brownian motions.

From now on, we will consider a probability space $(\Omega, \mathcal{F}, P)$ equipped with a filtration $(\mathcal{F}_t)_{t \ge 0}$ fulfilling the usual conditions. From now on all the considered filtrations will have that property.

Let $W = (W_t)_{t \ge 0}$ be an $(\mathcal{F}_t)_{t \ge 0}$-$m$-dimensional classical Brownian motion. In Karatzas and Shreve (1991, chapter 3) and Revuz and Yor (1999, chapter 4), one introduces the notion of stochastic Itô integral announced before. Let $Y = (Y^1, \dots, Y^m)$ be a progressively measurable $m$-dimensional process such that $\int_0^T \|Y_s\|^2\,ds < \infty$; then the Itô integral $\int_0^T Y_s\,dW_s$ is well defined. In particular, the indefinite integral $\int_0^{\cdot} Y_s\,dW_s$ is an $(\mathcal{F}_t)$-progressively measurable continuous process. If $Y$ is an $\mathbb{R}^{d \times m}$ matrix-valued process, the integral $\int_0^t Y_s\,dW_s$ is componentwise defined and it will be a vector in $\mathbb{R}^d$. The analogue of differential calculus in the framework of stochastic processes is Itô calculus, see again Karatzas and Shreve (1991, chapter 3) and Revuz and Yor (1999, chapter 4). An important tool is the concept of quadratic variation $[X]$ of a stochastic process, when it exists. For instance, the quadratic variation $[W]_t$ of a classical Brownian motion equals $t$. If $M_t = \int_0^t Y_s\,dW_s$, then $[M]_t = \int_0^t \|Y_s\|^2\,ds$. One celebrated theorem of P Lévy states the following: if $(M_t)$ defines a continuous $(\mathcal{F}_t)$-local martingale such that $[M]_t \equiv t$, then $M$ is an $(\mathcal{F}_t)$-classical Brownian motion. That theorem is called the “Lévy characterization theorem of Brownian motion.” The Itô formula constitutes the natural generalization of the fundamental theorem of differential calculus to the stochastic calculus. Another significant tool is the Girsanov theorem; it states essentially the following: suppose that the following so-called “Novikov condition” is verified:

\[ E\left( \exp\left( \frac{1}{2} \int_0^T \|Y_t\|^2\,dt \right) \right) < \infty \]
Then the process $\tilde{W}_t = W_t + \int_0^t Y_s\,ds$, $t \in [0, T]$, is again an $m$-dimensional $(\mathcal{F}_t)$-classical Brownian motion under a new probability measure $Q$ on $(\Omega, \mathcal{F}_T)$ defined by

\[ dQ = dP\,\exp\left( -\int_0^T Y_s\,dW_s - \frac{1}{2} \int_0^T \|Y_s\|^2\,ds \right) \]

Let $\xi$ be an $\mathcal{F}_0$-measurable r.v., for instance, $\xi \equiv x \in \mathbb{R}^d$. We are interested in the SDE

\[ dX_t = a(t, X_t)\,dW_t + b(t, X_t)\,dt, \qquad X_0 = \xi \qquad [4] \]

Definition 3 A progressively measurable process $(X_t)_{t \in [0, T]}$ is said to be a solution of [4] if a.s.

\[ X_t = \xi + \int_0^t a(s, X_s)\,dW_s + \int_0^t b(s, X_s)\,ds, \qquad \forall t \in [0, T] \qquad [5] \]

provided that the right-hand side makes sense. In particular, such a solution is continuous. The function $a$ (resp. $b$) is called the diffusion (resp. drift) coefficient of the SDE. $a$ and $b$ may sometimes be allowed to be random; however, this dependence has to be progressively measurable. Clearly, we can define the notion of solution $(X_t)_{t \ge 0}$ on the whole positive real axis.

We remark that those equations are called Itô SDEs. A solution of the previous equation is named a diffusion process.

The Lipschitz Case

The most natural framework for studying existence and uniqueness for SDEs appears when the coefficients are Lipschitz.

A function $\gamma : [0, T] \times \mathbb{R}^m \to \mathbb{R}^d$ is said to have “polynomial growth” (with respect to $x$ uniformly in $t$), if for some $n$ there is a constant $C > 0$ with

\[ \sup_{t \in [0, T]} \|\gamma(t, x)\| \le C(1 + \|x\|^n) \qquad [6] \]

The same function is said to have “linear growth” if [6] holds with $n = 1$. A function $\gamma : \mathbb{R}_+ \times \mathbb{R}^m \to \mathbb{R}^d$ is said to be “locally Lipschitz” (with respect to $x$ uniformly in $t$), if for every $T > 0$, $K > 0$, $\gamma|_{[0, T] \times [-K, K]^m}$ is Lipschitz (with respect to $x$ uniformly with respect to $t$).

Let $a : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^{d \times m}$, $b : \mathbb{R}_+ \times \mathbb{R}^d \to \mathbb{R}^d$ be Borel functions, $\xi$ an $\mathbb{R}^d$-valued $\mathcal{F}_0$-measurable r.v. and $(W_t)_{t \ge 0}$ an $m$-dimensional $(\mathcal{F}_t)$-Brownian motion.

Classical fixed-point theorems allow one to establish the following classical result.

Theorem 1 We suppose $a$ and $b$ locally Lipschitz with linear growth. Let $\xi$ be a square-integrable r.v. that is $\mathcal{F}_0$-measurable. Then [4] has a unique solution $X$. Moreover,

\[ E\left( \sup_{t \le T} |X_t|^2 \right) < \infty \]

Remark 1
(i) Equation [4] can be settled similarly by putting the initial condition $x$ at some time $s$. In that case the problem is again well stated. If $x$ is a deterministic point of $\mathbb{R}^d$, then we will often denote by $X^{s,x}$ the solution of that problem.
(ii) If the coefficients are only locally Lipschitz, the equation may be solved until a stopping time. If $d = 1$, it is possible to state necessary and sufficient conditions for nonexplosion (Feller test).
(iii) The theorem above admits several generalizations. For instance, the Brownian motion can be replaced by general semimartingales (possibly with jumps, such as Lévy processes).

An important role of diffusion processes is the fact that they provide a probabilistic representation of PDEs of parabolic (and even elliptic) type. We will only mention here the parabolic framework.

We denote $A(t, x) = a(t, x)\,a(t, x)^*$, where $*$ means transposition for matrices. $(t, x) \mapsto A(t, x) = (A_{ij}(t, x))$ is a $d \times d$ matrix-valued function. Let us consider also continuous functions $k : [0, T] \times \mathbb{R}^d \to \mathbb{R}$, $g : [0, T] \times \mathbb{R}^d \to \mathbb{R}$ with polynomial growth or non-negative.

Given a solution of [4], we can associate its generator $(L_t, t \in [0, T])$ by setting

\[ L_t f(x) = \frac{1}{2} \sum_{i,j=1}^d A_{ij}(t, x)\,\partial^2_{ij} f(x) + b(t, x) \cdot \nabla f(x) \]

The Feynman–Kac theorem is stated below; it provides a probabilistic representation of an associated linear parabolic PDE.

Theorem 2 Suppose there is a function $v : [0, T[ \times \mathbb{R}^d \to \mathbb{R}$, continuous with polynomial growth, of class $C^{1,2}([0, T] \times \mathbb{R}^d)$, satisfying the following Cauchy problem:

\[ (\partial_t + L_t)v - kv = g, \qquad v(T, x) = f(x) \qquad [7] \]

Then

\[ v(s, x) = E\left( f(X_T) \exp\left( -\int_s^T k(\theta, X_\theta)\,d\theta \right) - \int_s^T g(t, X_t) \exp\left( -\int_s^t k(\theta, X_\theta)\,d\theta \right) dt \right) \]
for $(s, x) \in [0, T] \times \mathbb{R}^d$, where $X = X^{s,x}$. In particular, such a solution is unique.

Remark 2
(i) In order to obtain “classical solutions” of the above Cauchy problem, one needs some conditions. It is the case, for instance, when the following ellipticity condition holds on $A$:

\[ \exists c > 0,\ \forall (t, x) \in [0, T] \times \mathbb{R}^d,\ \forall (\xi_1, \dots, \xi_d) \in \mathbb{R}^d: \quad \sum_{i,j=1}^d A_{ij}(t, x)\,\xi_i \xi_j \ge c \sum_{i=1}^d |\xi_i|^2 \qquad [8] \]

In the degenerate case, it is possible to deal with viscosity solutions, in the sense of P L Lions. This theorem establishes an important link between deterministic PDEs and SDEs.
(ii) A natural generalization of the Feynman–Kac theorem comes from the system of forward–backward SDEs in the sense of Pardoux and Peng.
(iii) Other types of probabilistic representation appear in stochastic control theory through the so-called verification theorems, see, for instance, Fleming and Soner (1993) and Yong and Zhou (1999). In that case, the (nonlinear) Hamilton–Jacobi–Bellman deterministic equation is represented by a controlled SDE.
(iv) Another bridge between nonlinear PDEs and diffusions can be provided in the framework of interacting particle systems with chaos propagation, see Graham et al. (1996) for a survey on those problems. Among the most significant nonlinear PDEs investigated probabilistically, we quote the case of porous media equations. For instance, for a positive integer $m$, a solution to

\[ \partial_t u = \tfrac{1}{2}\,\partial^2_{xx}(u^{2m+1}) \qquad [9] \]

can be represented by a (nonlinear) diffusion of the type, see Benachour et al. (1996),

\[ dX_t = u^m(t, X_t)\,dW_t, \qquad u(t, \cdot) = \text{law density of } X_t \qquad [10] \]

Different Notions of Solutions

Let $a$ and $b$ be as at the beginning of the previous section. Let $(\Omega, \mathcal{F}, P)$ be a probability space, $(\mathcal{F}_t)_{t \ge 0}$ a filtration fulfilling the usual conditions, and $(W_t)_{t \ge 0}$ an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion. Let $\xi$ be an $\mathcal{F}_0$-measurable r.v. In the section “Motivation and preliminaries,” we defined the notion of solution of the following equation:

\[ dX_t = b(t, X_t)\,dt + a(t, X_t)\,dW_t, \qquad X_0 = \xi \qquad [11] \]

This equation will be denoted by $E(a, b)$ (without initial condition). However, as we will see, the general concept of solution of an SDE is more sophisticated and subtle than in the deterministic case. We distinguish several variants of existence and uniqueness.

Definition 4 (Strong existence). We will say that equation $E(a, b)$ admits strong existence if the following holds. Given any probability space $(\Omega, \mathcal{F}, P)$, a filtration $(\mathcal{F}_t)_{t \ge 0}$, an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$, and an $\mathcal{F}_0$-measurable and square-integrable r.v. $\xi$, there is a process $(X_t)_{t \ge 0}$ solution to $E(a, b)$ with $X_0 = \xi$ a.s.

Definition 5 (Pathwise uniqueness). We will say that equation $E(a, b)$ admits pathwise uniqueness if the following property is fulfilled. Let $(\Omega, \mathcal{F}, P)$ be a probability space, $(\mathcal{F}_t)_{t \ge 0}$ a filtration, and $(W_t)_{t \ge 0}$ an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion. If two processes $X$, $\tilde{X}$ are two solutions such that $X_0 = \tilde{X}_0$ a.s., then $X$ and $\tilde{X}$ coincide.

Definition 6 (Existence in law or weak existence). Let $\nu$ be a probability law on $\mathbb{R}^d$. We will say that $E(a, b; \nu)$ admits weak existence if there is a probability space $(\Omega, \mathcal{F}, P)$, a filtration $(\mathcal{F}_t)_{t \ge 0}$, an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$, and a process $(X_t)_{t \ge 0}$ solution of $E(a, b)$ with $\nu$ being the law of $X_0$. We say that $E(a, b)$ admits weak existence if $E(a, b; \nu)$ admits weak existence for every $\nu$.

Definition 7 (Uniqueness in law). Let $\nu$ be a probability law on $\mathbb{R}^d$. We say that $E(a, b; \nu)$ has a unique solution in law if the following holds. We consider an arbitrary probability space $(\Omega, \mathcal{F}, P)$ and a filtration $(\mathcal{F}_t)_{t \ge 0}$ on it; we consider also another probability space $(\tilde{\Omega}, \tilde{\mathcal{F}}, \tilde{P})$ equipped with another filtration $(\tilde{\mathcal{F}}_t)_{t \ge 0}$; we consider an $(\mathcal{F}_t)_{t \ge 0}$-Brownian motion $(W_t)_{t \ge 0}$ and an $(\tilde{\mathcal{F}}_t)_{t \ge 0}$-Brownian motion $(\tilde{W}_t)_{t \ge 0}$; we suppose having a process $(X_t)_{t \ge 0}$ (resp. a process $(\tilde{X}_t)_{t \ge 0}$) solution of $E(a, b)$ on the first (resp. on the second) probability space such that the laws of $X_0$ and $\tilde{X}_0$ are both identical to $\nu$. Then $X$ and $\tilde{X}$ must have the same law as r.v. with values in $E = C(\mathbb{R}_+)$ (or $C[0, T]$).

We say that $E(a, b)$ has a unique solution in law if $E(a, b; \nu)$ has a unique solution in law for every $\nu$.

There are important theorems which establish bridges among the preceding notions. One of the most celebrated is the following.

Proposition 1 (Yamada–Watanabe). Consider the equation $E(a, b)$.
(i) Pathwise uniqueness implies uniqueness in law.
(ii) Weak existence and pathwise uniqueness imply strong existence.
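To make the notion of solution of [11] on a given probability space more concrete, one may approximate a path numerically from the increments of the driving Brownian motion. The following is a minimal sketch of the Euler–Maruyama scheme in dimension one; the scheme itself is not discussed in the text, and the function name `euler_maruyama`, its parameters and the chosen coefficients are illustrative assumptions. It is only meant for well-behaved (for instance Lipschitz) coefficients.

```python
import math
import random


def euler_maruyama(a, b, x0, T, n_steps, rng):
    """Approximate one path of dX_t = b(t, X_t) dt + a(t, X_t) dW_t on [0, T].

    a, b    : callables (t, x) -> float, diffusion and drift coefficients
    x0      : deterministic initial condition
    n_steps : number of time steps of size h = T / n_steps
    rng     : random.Random instance supplying the Brownian increments
    Returns the list [X_0, X_h, ..., X_T] of the discretized path.
    """
    h = T / n_steps
    x = x0
    path = [x]
    for i in range(n_steps):
        t = i * h
        dw = rng.gauss(0.0, math.sqrt(h))  # increment W_{t+h} - W_t ~ N(0, h)
        x = x + b(t, x) * h + a(t, x) * dw
        path.append(x)
    return path


if __name__ == "__main__":
    # Hypothetical example: drift b(t, x) = -x, diffusion a(t, x) = 1.
    rng = random.Random(0)
    path = euler_maruyama(lambda t, x: 1.0, lambda t, x: -x, 1.0, 1.0, 1000, rng)
    print(path[-1])
```

The approximate path is built as a functional of the initial value and of the Brownian increments only, which is the discrete counterpart of constructing $X$ on the given space carrying $W$ and $\xi$.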
growth, Theorem 1 implies that $E(a, b)$ admits strong existence and pathwise uniqueness.
(ii) If $a$ and $b$ are only locally Lipschitz, then pathwise uniqueness is fulfilled.

Existence and Uniqueness in Law

A way to create weak solutions of $E(1, b)$ when $(t, x) \mapsto b(t, x)$ is Borel with linear growth is the Girsanov theorem. Suppose $d = 1$ for simplicity. Let us consider an $(\mathcal{F}_t)$-classical Brownian motion $(X_t)$. We set

\[ W_t = X_t - \int_0^t b(s, X_s)\,ds \]

Under some suitable probability $Q$, $(W_t)$ is an $(\mathcal{F}_t)$-classical Brownian motion. Therefore, $(X_t)$ provides a solution to $E(1, b; \delta_0)$.

We continue with an example where $E(a, b)$ does not admit pathwise uniqueness, even though it admits uniqueness in law.

Example 1 We consider the stochastic equation

\[ X_t = \int_0^t \mathrm{sign}(X_s)\,dW_s \qquad [12] \]

with

\[ \mathrm{sign}(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ -1 & \text{if } x < 0 \end{cases} \]

It corresponds to $E(a, b; \delta_0)$ with $b = 0$ and $a(x) = \mathrm{sign}(x)$.

If $(W_t)_{t \ge 0}$ is an $(\mathcal{F}_t)$-classical Brownian motion, then $(X_t)_{t \ge 0}$ is a continuous $(\mathcal{F}_t)_{t \ge 0}$-local martingale vanishing at zero such that $[X]_t \equiv t$. According to the Lévy characterization theorem stated earlier, $X$ is an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion. This shows in particular that $E(a, b; \delta_0)$ admits uniqueness in law. In the sequel, we will show that $E(a, b; \delta_0)$ also admits weak existence.

Let now $(\Omega, \mathcal{F}, P)$ be a probability space, $(W_t)_{t \ge 0}$ an $(\mathcal{F}_t)_{t \ge 0}$-classical Brownian motion with respect to a filtration, and $(X_t)_{t \ge 0}$ such that [12] is verified. Then $\tilde{X}_t = -X_t$ can also be shown to be a solution. Therefore, $E(a, b; \delta_0)$ does not admit pathwise uniqueness.

We continue by stating a result true in the multidimensional case.

for a certain $m > 1$. We suppose that $a$, $b$ are continuous with linear growth. Then $E(a, b; \nu)$ admits weak existence.

From now on, a function $\gamma : [0, T] \times \mathbb{R}^m \to \mathbb{R}^d$ will be said to be Hölder-continuous if it is Hölder-continuous in the space variable $x \in \mathbb{R}^m$ uniformly with respect to the time variable $t \in [0, T]$.

Stroock and Varadhan (1979) also provide the following result, which is an easy consequence of their theorem 7.2.1.

Proposition 4 We suppose $a$, $b$ both Hölder-continuous and bounded, such that condition [8] is fulfilled. Then the SDE $E(a, b; \nu)$ admits weak uniqueness.

Remark 4
(i) The Hölder condition and [8] in Proposition 4 may be relaxed and replaced with the solvability of a Cauchy problem of a parabolic PDE with suitable terminal value.
(ii) In the case $d = 1$, if $a$, $b$ are bounded and just Borel with [8] for $x$ on each compact, then $E(a, b; \nu)$ admits weak existence and uniqueness in law. See Stroock and Varadhan (1979, exercises 7.3.2 and 7.3.3).
(iii) If $d = 2$, the same holds as in the previous point provided that, moreover, $a$ does not depend on time.

We proceed with some more specifically one-dimensional material, stating some results from H J Engelbert and W Schmidt, who furnished necessary and sufficient conditions for weak existence and uniqueness in law of SDEs.

For a Borel function $\gamma : \mathbb{R} \to \mathbb{R}$, we first define

\[ Z(\gamma) = \{x \in \mathbb{R} \mid \gamma(x) = 0\} \]

then we define the set $I(\gamma)$ as the set of real numbers $x$ such that

\[ \int_{x - \varepsilon}^{x + \varepsilon} \frac{dy}{\gamma^2(y)} = \infty, \qquad \forall \varepsilon > 0 \]

Proposition 5 (Engelbert–Schmidt criterion). Suppose that $a : \mathbb{R} \to \mathbb{R}$, that is, $a$ does not depend on time, and consider the equation without drift $E(a, 0)$.

(i) $E(a, 0)$ admits weak existence (without explosion) if and only if

\[ I(a) \subset Z(a) \qquad [14] \]
(ii) $E(a, 0)$ admits weak existence and uniqueness in law if and only if

\[ I(a) = Z(a) \qquad [15] \]

Remark 5
(i) If $a$ is continuous, then [14] is always verified. Indeed, if $a(x) \ne 0$, there is $\varepsilon > 0$ such that

\[ |a(y)| > 0, \qquad \forall y \in [x - \varepsilon, x + \varepsilon] \]

Therefore, $x$ cannot belong to $I(a)$.
(ii) Equation [14] is verified also for some discontinuous functions as, for instance, $a(x) = \mathrm{sign}(x)$. This confirms what was affirmed previously, that is, the weak existence (and uniqueness in law) for $E(a, 0)$.
(iii) If $a(x) = 1_{\{0\}}(x)$, [14] is not verified.
(iv) If $a(x) = |x|^\alpha$, $\alpha \ge 1/2$, then

\[ Z(a) = I(a) = \{0\} \]

So there is at most one solution in law for $E(a, 0)$.
(v) The proof is technical and makes use of the Lévy characterization theorem of Brownian motion.

Results on Pathwise Uniqueness

Proposition 6 (Yamada–Watanabe). Let $a, b : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}$ and consider again $E(a, b)$. Suppose $b$ globally Lipschitz and $h : \mathbb{R}_+ \to \mathbb{R}_+$ strictly increasing and continuous such that
(i) $h(0) = 0$;
(ii) $\int_0^\varepsilon (1/h^2)(y)\,dy = \infty$, $\forall \varepsilon > 0$; and
(iii) $|a(t, x) - a(t, y)| \le h(|x - y|)$.

Then pathwise uniqueness is verified.

Remark 6
(i) In Proposition 6, one typical choice is $h(u) = u^\alpha$, $\alpha > 1/2$.
(ii) Pathwise uniqueness for $E(a, b)$ holds therefore if $b$ is globally Lipschitz and $a$ is Hölder-continuous with parameter equal to 1/2.

Corollary 1 Suppose that the assumptions of Proposition 6 are verified and $a$, $b$ are continuous with linear growth. Then $E(a, b; \nu)$ admits strong existence and pathwise uniqueness, whenever $\nu$ verifies condition [13].

Proof It follows from Propositions 6 and 3 together with Proposition 1 (ii). ∎

Remark 7 Suppose $d = 1$. Pathwise uniqueness for $E(a, b)$ also holds under the following assumptions.
(i) $a$, $b$ are bounded, $a$ is time independent and $a \ge \text{const.} > 0$, $h$ as in Proposition 6. This result has an analogous form in the case of spacetime white noise driven SPDEs of parabolic type, as proved by Bally, Gyöngy, and Pardoux in 1994.
(ii) $a$ is independent of time, $b$ is bounded and $a \ge \text{const.} > 0$; moreover, $|a(x) - a(y)|^2 \le |f(y) - f(x)|$ and $f$ is increasing and bounded.

For illustration we provide some significant examples.

Example 2

\[ X_t = \int_0^t |X_s|^\alpha\,dW_s, \qquad t \ge 0 \qquad [16] \]

We set $a(x) = |x|^\alpha$, $0 < \alpha < 1$. This is equation $E(a, 0)$ with $a(x) = |x|^\alpha$. According to the Engelbert–Schmidt notations, we have $Z(a) = \{0\}$. Moreover
(i) If $\alpha \ge 1/2$, then $I(a) = \{0\}$.
(ii) If $\alpha < 1/2$, then $I(a) = \emptyset$.

Therefore, according to Proposition 5, $E(a, 0)$ admits weak existence. On the other hand, if $\alpha \ge 1/2$,

\[ \big|\,|x|^\alpha - |y|^\alpha\,\big| \le h(|x - y|) \qquad [17] \]

where $h(z) = z^\alpha$. According to Proposition 6, [16] admits pathwise uniqueness and, by Corollary 1, also strong existence. The unique solution is $X \equiv 0$. If $\alpha < 1/2$, $X \equiv 0$ is still a solution. It is not the only one; even uniqueness in law is not true.

Example 3 Let $a(x) = \sqrt{|x|}$, $b$ Lipschitz. Then $E(a, b)$ admits strong existence and pathwise uniqueness. In fact, $a$ is Hölder-continuous with parameter 1/2 and the second item of Remark 6 applies; so pathwise uniqueness holds. Strong existence is a consequence of Propositions 3 and 1 (ii).

An interesting particular case is provided by the following equation. Let $x_0, \sigma, \delta \ge 0$, $k \in \mathbb{R}$. The following equation admits strong existence and pathwise uniqueness.

\[ Z_t = x_0 + \sigma \int_0^t \sqrt{|Z_s|}\,dW_s + \int_0^t (\delta - kZ_s)\,ds, \qquad t \in [0, T] \qquad [18] \]

Equation [18] is widely used in mathematical finance and it constitutes the model of Cox–Ingersoll–Ross: the solution of the mentioned equation represents the short interest rate.

Consider now the particular case where $k = 0$, $\sigma = 2$. According to some comparison theorem for SDEs, the solution $Z$ is always non-negative and
therefore the absolute value may be omitted. The equation becomes

\[ Z_t = x_0 + 2 \int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [19] \]

Definition 8 The unique solution $Z$ to

\[ Z_t = x_0 + 2 \int_0^t \sqrt{Z_s}\,dW_s + \delta t \qquad [20] \]

is called “square $\delta$-dimensional Bessel process” starting at $x_0$; it is denoted by $\mathrm{BESQ}^\delta(x_0)$; for fine properties of this process, see Revuz and Yor (1999, ch. IX.3).

Since $Z \ge 0$, we call $\delta$-dimensional Bessel process starting from $x_0$ the process $X = \sqrt{Z}$. It is denoted by $\mathrm{BES}^\delta(x_0)$.

Remark 8 Let $d \ge 1$. Let $W = (W^1, \dots, W^d)$ be a classical $d$-dimensional Brownian motion. We set $X_t = \|W_t\|$. $(X_t)_{t \ge 0}$ is a $d$-dimensional Bessel process.

Remark 9 If $\delta > 1$, it is possible to see that

\[ X_t = W_t + \frac{\delta - 1}{2} \int_0^t \frac{ds}{X_s} \]

The Case with Distributional Drift

Pioneering work about diffusions with generalized drift was presented by N I Portenko, but in the framework of semimartingale processes. Recently, some work was done characterizing solutions in the class of the so-called Dirichlet processes, with some motivations in random irregular environment.

A useful transformation in the theory of SDEs is the so-called “Zvonkin transformation.” Let $(W_t)$ be an $(\mathcal{F}_t)$-classical Brownian motion. Let $a$ (resp. $b$) $: \mathbb{R} \to \mathbb{R}$ (resp. $C^1$) be locally bounded. We suppose moreover $a > 0$. We fix $x_0 \in \mathbb{R}$. Let $(X_t)_{t \ge 0}$ be a solution of

\[ X_t = x_0 + \int_0^t b(X_s)\,ds + \int_0^t a(X_s)\,dW_s \qquad [21] \]

We set

\[ \Sigma(x) = \int_0^x \frac{2b}{a^2}(y)\,dy \]

and we define $h : \mathbb{R} \to \mathbb{R}$ such that

Without entering into details, the classical Itô formula allows one to show that $(Y_t)$ defines a solution of

\[ dY_t = \tilde{a}(Y_t)\,dW_t, \qquad Y_0 = h(x_0) \qquad [22] \]

Now, eqn [22] fulfills the requirements of the Engelbert–Schmidt criterion so that it admits weak existence and uniqueness in law. Consequently, barring explosion, one can easily establish the same well-posedness for [21].

The Zvonkin transformation also allows one to prove strong existence and pathwise uniqueness results for [21]; for instance, when $a$ has linear growth, and

\[ y \mapsto \int_0^y \frac{b(s)}{a^2(s)}\,ds \]

is a bounded function.

In fact, problem [22] satisfies pathwise uniqueness and strong existence since the coefficients are Lipschitz with linear growth. Therefore, one can deduce the same for [21].

Veretennikov generalized the Zvonkin transformation to the $d$-dimensional case in some cases which include the case $a = 1$ and $b$ bounded Borel.

Zvonkin's procedure suggests also to consider a formal equation of the type

\[ dX_t = dW_t + \beta'(X_t)\,dt \qquad [23] \]

where $\beta$ is only a continuous function and so $b = \beta'$ is a Schwartz distribution; $\beta$ could be, for instance, the realization of a Brownian motion independent of $W$. Therefore, eqn [23] is motivated by the study of irregular random media. When $a = 1$, $b = \beta'$, the SDE [22], with $h' = e^{-2\beta}$, still makes sense.

Using the Engelbert–Schmidt criterion, one can see that problem [22] still admits weak existence and uniqueness in the sense of distribution laws. If $Y$ is a solution of [22], $X = h^{-1}(Y)$ provides a natural candidate solution for [21]. R F Bass, Z-Q Chen and F Flandoli, F Russo, and J Wolf investigated generalized SDEs such as [23]: in particular, they made the previous reasoning rigorous, respectively, in the case of strong and weak solutions, see Flandoli et al. (2003).

Connected Topics

We aim here at giving some basic references about topics which are closely connected to SDEs.
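Before turning to the references, the Itô-formula computation behind [22], which the text passes over "without entering into details," can be sketched as follows. The sketch rests on assumptions that are not fully visible in the text: $h$ is taken with $h(0) = 0$ and $h' = e^{-\Sigma}$ (consistent with the relation $h' = e^{-2\beta}$ quoted for $a = 1$, $b = \beta'$), $Y = h(X)$, $\tilde{a} = (a h') \circ h^{-1}$, and $b$ is regular enough for $h$ to be of class $C^2$.

```latex
% Assumed definitions: \Sigma' = 2b/a^2, h' = e^{-\Sigma}, Y_t = h(X_t).
\begin{align*}
h''(x) &= -\Sigma'(x)\,h'(x) = -\frac{2b(x)}{a^2(x)}\,h'(x), \\
dY_t = dh(X_t)
  &= h'(X_t)\,dX_t + \tfrac{1}{2}\,h''(X_t)\,d[X]_t \\
  &= h'(X_t)\bigl(b(X_t)\,dt + a(X_t)\,dW_t\bigr)
     + \tfrac{1}{2}\,h''(X_t)\,a^2(X_t)\,dt \\
  &= (a h')(X_t)\,dW_t
     + \Bigl(h'\,b - \frac{b}{a^2}\,a^2\,h'\Bigr)(X_t)\,dt \\
  &= (a h')\bigl(h^{-1}(Y_t)\bigr)\,dW_t = \tilde{a}(Y_t)\,dW_t .
\end{align*}
```

The drift cancels exactly because $h$ solves $\tfrac{1}{2} a^2 h'' + b\,h' = 0$; this is why the transformed equation has no drift term.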
differential equations. In: Talay and Tubaro L (eds.) Lectures given at the 1st Session and Summer School held in Montecatini Terme, 22–30 May 1995, Lecture Notes in Mathematics, vol. 1627. Springer-Verlag. Centro Internazionale Matematico Estivo (C.I.M.E.), Florence, 1996.
Karatzas I and Shreve SE (1991) Brownian Motion and Stochastic Calculus, 2nd edn. New York: Springer.
Kloeden PE and Platen E (1992) Numerical Solutions of Stochastic Differential Equations. Berlin: Springer.
Lamberton D and Lapeyre B (1997) Introduction au Calcul Stochastique et Applications à la Finance. Paris: Collection Ellipses.
Ma J and Yong J (1999) Forward–Backward Stochastic Differential Equations and Their Applications, Lecture Notes in Mathematics, vol. 1702. Berlin: Springer.
Nualart D (1995) The Malliavin Calculus and Related Topics. New York: Springer.
Øksendal B (2003) Stochastic Differential Equations. An Introduction with Applications, 6th edn. Universitext. Berlin: Springer-Verlag.
Pardoux E (1998) Backward stochastic differential equations and viscosity solutions of systems of semilinear parabolic and elliptic PDEs of second order. In: Stochastic Analysis and Related Topics, VI (Geilo, 1996), pp. 79–127. Progr. Probab., vol. 42. Boston: Birkhäuser.
Protter P (1992) Stochastic Integration and Differential Equations. A New Approach. Berlin: Springer.
Lyons T and Qian Z (2002) Controlled Systems and Rough Paths. Oxford: Oxford University Press.
Revuz D and Yor M (1999) Continuous Martingales and Brownian Motion, 3rd edn. Berlin: Springer-Verlag.
Stroock D and Varadhan SRS (1979) Multidimensional Diffusion Processes. Berlin–New York: Springer.
Walsh JB (1986) An introduction to stochastic partial differential equations. In: Ecole d’été de probabilités de Saint–Flour, XIV – 1984, pp. 265–439. Lecture Notes in Mathematics, vol. 1180. Springer.
Yong J and Zhou XY (1999) Stochastic Controls: Hamiltonian Systems and HJB Equations. New York: Springer.
Stochastic Hydrodynamics
B Ferrario, Università di Pavia, Pavia, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Mathematical models in hydrodynamics are introduced to describe the motion of fluids. The basic equations for Newtonian incompressible fluids are the Euler and the Navier–Stokes equations, for inviscid and viscous fluids, respectively. For a given set of body forces acting on the fluid, these nonlinear partial differential equations (PDEs) model the evolution in time of the velocity and pressure at each point of the fluid, given the initial velocity and suitable boundary conditions (see Partial Differential Equations: Some Examples). The equations of hydrodynamics offer challenging mathematical problems, like proving the existence and uniqueness of solutions, determining their regularity, their asymptotic behavior for large time, and their stability. To gain some insight into the behavior of fluids, stochastic analysis is introduced into hydrodynamics. In fact, there are various attempts to describe the turbulent regime (see Turbulence Theories). But analyzing individual solutions that determine the flow at any time, for a given initial condition, is a desperate task, since the dynamics in a turbulent regime is chaotic and highly unstable. This is a particular chaotic motion with some characteristic statistical properties (see Monin and Yaglom (1987)). The aim of a statistical description of turbulent flow is to single out some relevant collective properties of the flow that, hopefully, make it possible to grasp the salient features of the dynamics. In this sense, stochastic hydrodynamics is germane to the kinetic gas theory. In the next section we shall review a typical topic of stochastic hydrodynamics, the evolution of probability measures. Results on stationary probability measures will be given in the subsequent sections. Another characteristic of turbulent flows is the lack of space regularity of the velocity field. We shall introduce in the section “The stochastic Navier–Stokes equations” a stochastic model of turbulence, which exhibits lack of regularity of the solutions.

The Euler equations are a singular limit of the Navier–Stokes equations, since they are first-order, instead of second-order, PDEs. It is little surprise that they involve different mathematical techniques. A full section will be devoted to a discussion of the Euler equations and another to the Navier–Stokes equations. Statistics of an inviscid flow, when approximated by vortex motion, will be described in the final section.

Statistical Solutions

Let $u(t, x)$ be the fluid velocity at time $t$ and point $x \in D \subset \mathbb{R}^d$; since the initial velocity is always affected by experimental errors, it is reasonable to assign a measure $\mu$ determining the probability that the initial velocity belongs to a Borel set of the space $H$ of all admissible velocity fields $u = u(x)$.

A spatial statistical solution is a family of probability measures $\mu(t, \cdot)$, $t \ge 0$, each supported
on the set H such that, given any Borel set in H, describing the spatial statistical solution, we deal with
we have the moments of (t, ) of any order. For a nonlinear
dynamics [3], the moments equations are an infinite
Probfuðt; xÞ 2 g ¼ ðt; Þ; 8t > 0 ½1
chain of coupled equations, the so-called Friedman–
with the initial condition (0, ) = (). The con- Keller equations.
struction and analysis of statistical solutions (t, ) is A prominent role among statistical solutions is
one of the crucial mathematical problems in played by stationary solutions. They contain all the
stochastic hydrodynamics (see, e.g., Vishik and statistical information in the case of equilibrium in
Fursikov (1988)). time. We have that the characteristic functional of
Hopf gave the first mathematical formulation of an invariant measure is constant in time. Therefore,
the problem of describing turbulent flows by
statistical solutions. The first result on the existence d
ðt; Þ ¼ 0
of statistical solutions is by Foias in 1973. Hopf dt
(1952) presented an equation in variational deriva- Bearing in mind equation [5], this is equivalent
tives satisfied by the characteristic functional $\chi(t, \theta)$ of the family of measures $\mu(t, \cdot)$ associated with the Navier–Stokes equations. The characteristic functional $\chi(t, \theta)$ is the Fourier transform of the measure $\mu(t, \cdot)$:

$$\chi(t, \theta) = \int_H e^{i\langle\theta, u\rangle}\, \mu(t, du) \qquad [2]$$

defined for any smooth test function $\theta$.

We now derive the evolution equation for $\chi(t, \theta)$, by assuming that the dynamics takes place in the phase space $H$ and follows the nonlinear equation

$$\frac{du}{dt} = F(u) \qquad [3]$$

If $u^v(t)$ is the solution started from $v$ at time $t = 0$, then its probability distribution is represented by the time-evolved measure $\mu(t, \cdot)$. Therefore, we have that

$$\int_H e^{i\langle\theta, u\rangle}\, \mu(t, du) = \int_H e^{i\langle\theta, u^v(t)\rangle}\, \mu(0, dv) \qquad [4]$$

Differentiating in time, we obtain

$$\frac{d}{dt}\chi(t, \theta) = \int_H e^{i\langle\theta, u^v(t)\rangle}\, i\langle\theta, F(u^v(t))\rangle\, \mu(0, dv) = i\int_H e^{i\langle\theta, v\rangle}\, \langle\theta, F(v)\rangle\, \mu(t, dv) \qquad [5]$$

The last integral is uniquely determined by $\chi$, since the measure $\mu(t, \cdot)$ is uniquely determined by $\chi(t, \cdot)$. We denote by $\Phi(t, \theta)$ the last integral in [5]. The evolution equation thus obtained for the characteristic functional is

$$\frac{d}{dt}\chi(t, \theta) = i\,\Phi(t, \theta), \quad \forall\theta \qquad [6]$$

This is called the Hopf equation associated with the dynamical system [3].

Another way to analyze the evolution of measures is through the moments; instead of the measure $\mu(t, \cdot)$

[...]

to say that the signed measure $\langle\nabla\phi, F(v)\rangle\,\mu(t, dv)$ vanishes, for any test function $\phi$ and time $t$. Setting $t = 0$, we obtain that an invariant measure $\mu$ in the space $H$ satisfies the Liouville equation

$$\int_H \langle\nabla\phi(v), F(v)\rangle\, d\mu(v) = 0 \qquad [7]$$

for appropriate test functions $\phi$. This equation is also called the relation of infinitesimal invariance and the measure $\mu$ is said to be infinitesimally invariant.

The stationary measures are natural candidates to describe the statistical asymptotic behavior of the system when $t \to \infty$. Notice that, in a chaotic system two motions that are arbitrarily close to one another at $t = 0$ can evolve in completely different ways. So, to describe satisfactorily the dynamics we take average over a big number of experiments. This is the so-called ensemble average. These averages are assumed to be with respect to an invariant measure $\mu$. The invariant measures must exist and either they are unique or at most one has physical meaning and enters in the functional integral defining the ensemble average. According to the ergodic principle (an assumption not yet proved in hydrodynamics), ensemble averages replace long-time averages: for every initial velocity field $v$, except for a set of initial values negligible in some sense, the time average of an observable $\phi$ tends, as time goes to infinity, to the ensemble average

$$\lim_{T\to\infty} \frac{1}{T}\int_0^T \phi(u^v(t))\, dt = \int_H \phi\, d\mu \qquad [8]$$

However, it is extremely difficult to prove the existence of stationary probability measures for the Navier–Stokes equations solving directly equation [7]. The situation is formally the same as in equilibrium statistical mechanics, where the Liouville equation is in fact solved, leading to the Boltzmann–Gibbs distribution. However, the results in statistical hydrodynamics are far from being satisfactory.
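The ergodic principle [8] can be illustrated on a toy system for which it is known to hold. The sketch below is not a fluid model: it assumes a one-dimensional Ornstein–Uhlenbeck process $dX = -X\,dt + \sqrt{2}\,dW$, whose invariant measure is the standard Gaussian, and compares the time average of an observable along one simulated path with its ensemble average. The function names, step size, and seed are illustrative choices.

```python
import math
import random


def time_average_ou(observable, total_time=2000.0, dt=0.01, x0=3.0, seed=0):
    """Time average of observable(X_t) along one Euler-Maruyama path of
    dX = -X dt + sqrt(2) dW (toy model; invariant measure N(0, 1))."""
    rng = random.Random(seed)
    n_steps = int(total_time / dt)
    noise_scale = math.sqrt(2.0 * dt)
    x = x0
    acc = 0.0
    for _ in range(n_steps):
        acc += observable(x)
        x += -x * dt + noise_scale * rng.gauss(0.0, 1.0)
    return acc / n_steps


def ensemble_average_gaussian(observable, n_nodes=2001, cutoff=8.0):
    """Integral of the observable against the standard Gaussian density,
    computed with the trapezoidal rule on [-cutoff, cutoff]."""
    h = 2.0 * cutoff / (n_nodes - 1)
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n_nodes):
        v = -cutoff + i * h
        weight = 0.5 if i in (0, n_nodes - 1) else 1.0
        total += weight * observable(v) * norm * math.exp(-0.5 * v * v)
    return total * h


def square(x):
    """Observable phi(x) = x^2 used in the comparison."""
    return x * x


if __name__ == "__main__":
    print("time average    :", time_average_ou(square))
    print("ensemble average:", ensemble_average_gaussian(square))
```

The starting point `x0 = 3.0` is deliberately far from typical values of the invariant measure; its influence on the time average fades as the averaging window grows, which is the content of [8] in this simple setting.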
Stochastic Hydrodynamics 73
Recent studies to prove the existence of invariant measures for the Navier–Stokes equations are based on stochastic models (see the section ‘‘The stochastic Navier–Stokes equations’’). On the other hand, for the Euler equations it is possible to construct formally invariant measures, by means of invariant quantities of the classical motion (see the next section).

Finally, we point out that there are techniques using invariant measures to show some results for the time evolution (e.g., the motion exists for almost all initial values with respect to an invariant measure).

The Euler Equations

We start by recalling some basic facts on the Euler equations (see Incompressible Euler Equations: Mathematical Theory).

The motion of an inviscid, incompressible, and homogeneous fluid is described by the Euler equations, which in Eulerian coordinates read as

$$\begin{aligned} \frac{\partial u}{\partial t} + (u\cdot\nabla)u + \nabla p &= f && \text{in } D \\ \nabla\cdot u &= 0 && \text{in } D \\ u\cdot n &= 0 && \text{on } \partial D \end{aligned} \qquad [9]$$

where, at time $t \ge 0$ and position $x \in D$, $u = u(t, x)$ is the vector velocity, $p = p(t, x)$ the hydrodynamic pressure. The units have been chosen so that the mass density $\rho = 1$. $\nabla$ denotes the nabla vector operator so

$$u\cdot\nabla = \sum_{j=1}^{d} u_j \frac{\partial}{\partial x_j}, \qquad \nabla\cdot u = \sum_{j=1}^{d} \frac{\partial u_j}{\partial x_j}, \qquad \nabla p = \left(\frac{\partial p}{\partial x_1}, \ldots, \frac{\partial p}{\partial x_d}\right)$$

Finally, $f$ denotes the external force. If the spatial domain $D$ has a boundary $\partial D$, then the velocity is assumed to be tangent to the boundary ($n$ denotes the exterior normal vector to the boundary). Some initial condition $u_0$ at time $t = 0$ is assigned.

When $f = 0$, there are invariant quantities for system [9]. In the literature, there are many works suggesting Gaussian stationary statistics (see, e.g., the paper by Kraichnan and Montgomery (1980)). We consider invariants that are quadratic in the velocity so as to construct (formally) invariant measures of Gibbs type: the energy

$$E(u) := \frac{1}{2}\int_D |u|^2\, dx$$

and, only in the two-dimensional case ($d = 2$), the enstrophy

$$S(u) := \frac{1}{2}\int_D |\operatorname{curl} u|^2\, dx$$

(with $\operatorname{curl} u = \nabla^{\perp}\cdot u \equiv \partial u_2/\partial x_1 - \partial u_1/\partial x_2$ for $d = 2$).

It is natural to look for velocity fields in the following function spaces: the space $H^0$ of finite kinetic energy and the space $H^1$ of finite enstrophy. Clearly, the admissible fields should also obey the boundary conditions and divergence-free condition. If $P$ is the projection operator onto the space of divergence-free vectors, and $B$ is the bilinear form $B(u, v) := P[(u\cdot\nabla)v]$, the Euler equations can be given the structure of an evolution,

$$\frac{du}{dt} = -B(u, u) \qquad [10]$$

obtained by applying the projection operator $P$ to the first equation in [9]. The pressure disappears and can be regarded as a Lagrange multiplier associated with the divergence-free constraint ($\nabla\cdot u = 0$); it can be fully recovered once the velocity field is known.

The dynamics is considered in the phase space of divergence-free velocity vectors $H$ (a large space containing $H^0$ and $H^1$), which is an infinite-dimensional functional space. More precisely, identifying $H^0$ with its dual $(H^0)'$, we introduce the Gelfand triplet

$$H^1 \subset H^0 \subset (H^1)' = H^{-1}$$

The spaces $H^{\alpha}$, with $\alpha = 1, 2, \ldots$, are the usual Sobolev spaces but with the additional divergence-free and boundary conditions. For $\alpha > 0$ noninteger, the spaces $H^{\alpha}$ are defined by interpolation, whereas those with $\alpha < 0$ by duality. As usual, regularity in space is related to the spaces $H^{\alpha}$ with higher exponent $\alpha$. We have that $H = \cup_{\alpha\in\mathbb{R}} H^{\alpha}$.

Invariance of $E$ and $S$ can be proved resorting to eqn [9] and assuming that $u$ is a smooth vector field. For instance,

$$\frac{d}{dt}E(u(t)) = \frac{d}{dt}\,\frac{1}{2}\int_D |u|^2\, dx = \int_D u\cdot\frac{\partial u}{\partial t}\, dx = -\int_D u\cdot[(u\cdot\nabla)u]\, dx - \int_D u\cdot\nabla p\, dx$$
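The first integral on the right-hand side of the energy balance can be checked numerically in a simple case. The sketch below makes several assumptions that are not in the text: the domain is the periodic square $[0, 2\pi)^2$ rather than a bounded domain with boundary, the velocity is derived from the stream function $\psi = \sin x_1 \sin x_2$ as $u = (-\partial\psi/\partial x_2, \partial\psi/\partial x_1)$, and derivatives are approximated by centered finite differences. Under these assumptions the field is divergence-free and the advection term should not change the energy, so the computed integral is expected to be zero up to rounding.

```python
import math


def stream_velocity(x, y):
    """Velocity u = (-d psi/dy, d psi/dx) for the assumed stream function
    psi = sin(x) sin(y); this field is divergence-free."""
    return (-math.sin(x) * math.cos(y), math.cos(x) * math.sin(y))


def sample_on_grid(velocity, n):
    """Sample the velocity on a uniform n x n grid of the periodic square."""
    h = 2.0 * math.pi / n
    return h, [[velocity(i * h, j * h) for j in range(n)] for i in range(n)]


def advection_energy_integral(velocity, n=64):
    """Approximate the integral of u . [(u . grad) u] over [0, 2 pi)^2 with
    centered periodic finite differences and a uniform-grid quadrature."""
    h, u = sample_on_grid(velocity, n)
    total = 0.0
    for i in range(n):
        ip, im = (i + 1) % n, (i - 1) % n
        for j in range(n):
            jp, jm = (j + 1) % n, (j - 1) % n
            u1, u2 = u[i][j]
            for c in range(2):
                d_dx = (u[ip][j][c] - u[im][j][c]) / (2.0 * h)
                d_dy = (u[i][jp][c] - u[i][jm][c]) / (2.0 * h)
                total += u[i][j][c] * (u1 * d_dx + u2 * d_dy)
    return total * h * h


def kinetic_energy(velocity, n=64):
    """Approximate E(u) = (1/2) * integral of |u|^2 on the same grid."""
    h, u = sample_on_grid(velocity, n)
    total = sum(u1 * u1 + u2 * u2 for row in u for (u1, u2) in row)
    return 0.5 * total * h * h


if __name__ == "__main__":
    print("energy               :", kinetic_energy(stream_velocity))
    print("advection contribution:", advection_energy_integral(stream_velocity))
```

The periodic setting is chosen only because it removes boundary terms; it is a stand-in for the tangency condition $u\cdot n = 0$ used in [9], not a replacement for it.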
Each component $B_j(u, u)$ is defined for $\mu_S$-a.e. $u$. These estimates lead to define a weak solution (see Albeverio and Cruzeiro (1990)):

Theorem 1 Let $d = 2$. There exists a flow $U(t, \omega)$ defined on a probability space $(\Omega, \mathcal{F}, P)$ with values in $H^{-1-\varepsilon}$ for any $\varepsilon > 0$, $U(\cdot, \omega) \in C(\mathbb{R}, H^{-1-\varepsilon})$ $P$-a.e. $\omega$, such that for each component $U_j$ we have

$$U_j(t, \omega) = U_j(0, \omega) - \int_0^t B_j(U(s, \omega), U(s, \omega))\, ds, \quad P\text{-a.e. } \omega,\ \forall t \in \mathbb{R}$$

Moreover, the measure $\mu_S$ is invariant under this flow.

We point out that uniqueness is an open problem also for $d = 2$. But already in the classical analysis of the Euler equations in a bounded domain, uniqueness for initial velocity of finite energy is not known.

Working with the measure $\mu_E$ is even worse, especially when $d = 3$, because its support is a larger space within which more irregular velocity vectors live. The more irregular the spaces where the flow lives, the more difficult it is to handle the nonlinear term $B(u, u)$.

On the other hand, for $d = 1$, the mathematical analysis is much easier. For instance, it can be proved (see Robert (2003)) that the one-dimensional inviscid Burgers equation on the line

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right) = 0 \qquad [17]$$

has an intrinsic invariant statistical solution, given by a class of Lévy processes with negative jumps.

The Stochastic Navier–Stokes Equations

The Navier–Stokes equations describe advection with velocity $u$ and diffusion with kinematic viscosity $\nu > 0$ (see Viscous Incompressible Fluids: Mathematical Theory)

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= f && \text{in } D \\ \nabla\cdot u &= 0 && \text{in } D \\ u &= 0 && \text{on } \partial D \end{aligned} \qquad [18]$$

where $\Delta$ is the Laplace operator. Nonslip boundary conditions are assumed. Although the Euler equations [9] are formally obtained from [18] by setting $\nu = 0$, the presence of the second-order operator $\Delta$ makes the analysis needed to prove the existence, uniqueness, and regularity of solutions easier than for the Euler equations. However, at variance with the Euler equations, the Navier–Stokes equations do not possess invariants, since the viscosity dissipates energy. Hence, it is difficult to find explicit expressions of invariant measures for the deterministic Navier–Stokes equations, except the trivial invariant measures concentrated on a stationary solution. However, as soon as a stochastic force is introduced in these equations, it is possible to have nontrivial invariant measures. It is impossible to review here the wide literature concerning the stochastic Navier–Stokes equations and we confine ourselves to making some remarks. Most results are concerned with proving the existence and/or uniqueness of an invariant measure $\mu$, without giving an explicit representation, apart from some attempts like Gallavotti (2002), where a formal representation of stationary distributions is given in terms of functional integrals. Some properties of the nonexplicit invariant measures are given, for instance, estimates of moments and exponential convergence of the statistical solution for large time.

Stochastic forces can enter in the Navier–Stokes equations in different ways. We can consider randomness in the forcing term, so that the force $f$ in [18] has a deterministic component, which represents its slowly varying mean, and a stochastic one, which accounts for small and very rapidly varying fluctuations around the mean. Alternatively, since the molecules are not rigidly connected to one another in the fluid, they are subjected to fluctuations. A complete description of fluctuations relating the microscopic and macroscopic motion is not achieved at present. However, we shall introduce some models for which rigorous mathematical results can be proved.

The first part of this section concerns the Navier–Stokes equations with noise $n$:

$$\begin{aligned} \frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p &= n \\ \nabla\cdot u &= 0 \end{aligned} \qquad [19]$$

for which invariant measures exist, one of which can be ergodic provided that the noise is suitably chosen. In the second part, a Navier–Stokes-type stochastic system is described, which has irregular solutions, as expected in turbulence.

Let us introduce the stochastic Navier–Stokes equations with time white noise. The first equation in [19] is an Itô equation:

$$\partial_t u + [-\nu\Delta u + (u\cdot\nabla)u + \nabla p] = \partial_t w \qquad [20]$$

Here $w = (w^{(1)}, \ldots, w^{(d)})$ is a Brownian motion, that is, its time derivative $n = \partial w/\partial t$ is a Gaussian
stochastic field with zero mean and correlation function given by

$$E[n^{(j)}(t, x)\, n^{(k)}(t', x')] = \delta_{jk}\, q(x - x')\, \delta(t - t') \qquad [21]$$

for $j, k = 1, \ldots, d$.

We shall use the differential form for the Itô equation [20] always understood in the integral form

$$u(t) - u(0) + \int_0^t [-\nu\Delta u(s) + (u(s)\cdot\nabla)u(s) + \nabla p(s)]\, ds = w(t) \qquad [22]$$

Modeling perturbations by a white noise process represents the first step to understand how a random perturbation acts in the mathematical equations, rather than a good physical or numerical model. The first results are in a paper by Bensoussan and Temam (1973).

Obviously, the regularity of the solutions depends on the spatial covariance $q$ of the noise.

Let us consider the following cases.

$q = \delta$: the noise is white also in space.

An invariant measure is known explicitly. Indeed, assume periodic boundary conditions on the square ($d = 2$) or the cube ($d = 3$) $D$, which makes the spatial domain a torus. In this case, the Euler and Navier–Stokes equations are set in the same functional spaces. The generator of the stochastic Navier–Stokes equations [20] corresponds to the sum of the generator of the Euler equations [9] and of the stochastic Stokes equations

$$\begin{aligned} \partial_t u &= [\nu\Delta u - \nabla p] + \partial_t w \\ \nabla\cdot u &= 0 \end{aligned} \qquad [23]$$

Since the first equation in [23] is linear in the unknown velocity $u$, the Stokes system has a unique invariant measure which is a centered Gaussian measure. In particular, when the noise is a space-time white noise and $d = 2$, this is the invariant measure [14] of the enstrophy:

$$\mu_S^{(0),(2\nu)}(du) = \frac{1}{Z}\, e^{-2\nu S(u)}\, du$$

On a bidimensional torus, it is proved that this measure is not only infinitesimally invariant, but also globally invariant for a unique flow [20] defined for $\mu_S^{(0),(2\nu)}$-a.e. initial velocity. We recall that initial velocities of finite energy are negligible with respect to the measure $\mu_S^{(0),(2\nu)}$.

$q$ more regular than above, that is, the noise is colored in space.

As soon as the forcing term is more regular in space, the Navier–Stokes system has a solution of finite energy. These are solutions close to those of the deterministic equation. Techniques similar to those used to prove the existence and/or uniqueness of solutions for the deterministic equations work also in the stochastic case with an additive noise (or even a multiplicative noise) to get weak or strong solutions. Global existence in the space $H^0$ is proved for $d = 2, 3$ and uniqueness only for $d = 2$, as is the case for the deterministic Navier–Stokes equations.

The interesting feature is that by adding a noise which acts on all the components with respect to a Hilbert basis (or at least on many components), the stochastic Navier–Stokes system has a unique invariant measure, which is ergodic. This is proved for the spatial dimension $d = 2$. By means of the Krylov–Bogoliubov method, existence of at least one invariant measure is proved by compactness of a family of averaged measures; the limit measures are stationary measures. But, when many modes are perturbed by a noise, there is a mixing effect on the dynamics, avoiding existence of many stationary measures. For the spatial dimension $d = 2$, the best result in this context is in Hairer and Mattingly (2004), where the noise acts on very few modes. For the spatial dimension $d = 3$, the result in Da Prato and Debussche (2003) shows the existence of an invariant measure; even if there is no uniqueness of the solutions (as in the deterministic case), by a selection principle, they construct a transition semigroup, which has a unique invariant measure, ergodic and strongly mixing.

Mathematical proofs are given for very different noises. (The reader is urged to consult, among the others, the papers by E, Mattingly and Sinai; Flandoli and Maslowski; Mikulevicius and Rozovskii; Vishik and Fursikov. The latter authors study also statistical solutions in two and three dimensions. For a kick noise $n = \sum_k \delta(t - k)\, q_k(x)$ in equations [19], there are results for $d = 2$ by Bricmont, Kupiainen and Lefevere; Kuksin and Shirikyan.)

We conclude that, as far as invariant measures and their ergodicity are concerned, the stochastic Navier–Stokes equations have richer results than the deterministic Navier–Stokes equations. It is appealing to investigate the limit as the intensity of the noise goes to zero, so as to recover the deterministic equation. Now, think of equation [19] with a noise $\varepsilon n$, for $n$ fixed and $\varepsilon \to 0$. Due to the sensitive dependence on initial conditions, even a small noise may have important effects on the dynamics. A conjecture by Kolmogorov is that the unique invariant measure $\mu_{\varepsilon}$ tends, when $\varepsilon \to 0$, to a specific measure, the so-called Kolmogorov measure, which
would enter into the ergodic principle. This is a difficult problem, not yet solved.

We also mention the analysis of the inviscid limit. Kuksin (2004) showed that the solution $u_{\nu}$ of the two-dimensional stochastic Navier–Stokes equations

$$\frac{\partial u}{\partial t} - \nu\Delta u + (u\cdot\nabla)u + \nabla p = \sqrt{\nu}\, n, \quad 0 < \nu \le 1 \qquad [24]$$

on the torus converges in distribution to a stationary solution of the Euler equations. Here $n$ is a random force white in time and smooth in space. More precisely, for each subsequence $u_{\nu_j}$,

$$\lim_{\nu_j\to 0}\ \lim_{T\to\infty} u_{\nu_j}(T + t) = U(t) \qquad [25]$$

and almost every trajectory of the nontrivial limit process $U$ solves the Euler equations [9] without the forcing term. Moreover, the process $U$ keeps memory of some features of the noise force $n$, since the mean values of the enstrophy and of the energy of $U$ depend on the noise $n$.

We now present the second part on stochastic models for viscous fluids. In his 1884 paper, Reynolds introduced the decomposition of turbulent flow into mean and fluctuating flows. The equations obtained are difficult to study. We shall show now a tractable model for a one-dimensional problem ($d = 1$) with a suitable model of fluctuations. Decompose the velocity field into the sum of a mean flow $\bar{u}$ and a fluctuation

$$u = \bar{u} + \ldots$$

[...]

According to the mathematical model for the fluctuation, we have

$$dx(t) = \bar{u}(t, x(t))\, dt + b(x(t))\, dw(t) \qquad [28]$$

Therefore, $D\bar{u}$ is computed by means of Itô's formula

$$D\bar{u}(t, x(t)) = \frac{\partial\bar{u}}{\partial t}\, dt + \sum_{k=1}^{d} \frac{\partial\bar{u}}{\partial x_k}\, dx_k(t) + \frac{1}{2}\sum_{k,s=1}^{d} \frac{\partial^2\bar{u}}{\partial x_k\,\partial x_s}\, b_k b_s\, dt \qquad [29]$$

This leads to the stochastic Navier–Stokes-type equations (we neglect the overline symbol)

$$\begin{aligned} d_t u + \left[-\nu\Delta u + (u\cdot\nabla)u + \nabla p + \tfrac{1}{2} Q u\right] dt &= -(b\cdot\nabla)u\, dw(t) \\ \nabla\cdot u &= 0 \end{aligned} \qquad [30]$$

where $Q$ is the second-order differential operator given by the last term in [29].

Rigorous mathematical results for the above equations have been proved for the one-dimensional case, that is, the Burgers equations on the line. Given an initial velocity of finite energy $u_0 \in H^0$, there exists a unique solution $u \in C([0, T]; H^0) \cap L^2(0, T; H^1)$ ($P$-a.s.). But it can be shown that for a more regular initial velocity there is no higher regularity of the solution of eqn [30], if $b \neq 0$. This means that these stochastic Burgers equations cannot have too regular nontrivial solutions, as expected in turbulent motion.
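Returning to the linear stochastic Stokes system [23], the Gaussian character of its invariant measure can be made concrete mode by mode. The sketch below rests on assumptions that go beyond the text: the domain is a torus, the noise acts independently on each Fourier mode with a hypothetical amplitude `sigma`, and the pressure has been removed by projection, so that a single real Fourier coefficient with wave vector $k$ is taken to follow the scalar equation $d\hat{u} = -\nu|k|^2\hat{u}\,dt + \sigma\,d\beta$. By Itô's formula its variance $v(t)$ then satisfies $dv/dt = -2\nu|k|^2 v + \sigma^2$, which relaxes to $\sigma^2/(2\nu|k|^2)$.

```python
def stationary_variance(nu, k_squared, sigma=1.0):
    """Stationary variance of one mode of the assumed scalar model
    d u_hat = -nu |k|^2 u_hat dt + sigma d beta."""
    return sigma ** 2 / (2.0 * nu * k_squared)


def relax_variance(nu, k_squared, sigma=1.0, v0=0.0, total_time=40.0, dt=1e-3):
    """Integrate dv/dt = -2 nu |k|^2 v + sigma^2 with the explicit Euler
    scheme, starting from variance v0, and return v(total_time)."""
    rate = 2.0 * nu * k_squared
    v = v0
    for _ in range(int(total_time / dt)):
        v += dt * (-rate * v + sigma ** 2)
    return v


if __name__ == "__main__":
    nu = 0.5
    for k_squared in (1.0, 2.0, 5.0):
        print(k_squared,
              relax_variance(nu, k_squared),
              stationary_variance(nu, k_squared))
```

With `sigma = 1`, the stationary variance is $1/(2\nu|k|^2)$: high modes carry less variance, and the decay is only like $|k|^{-2}$. This is consistent with a Gaussian weight of the form $e^{-2\nu S(u)}$ if the enstrophy is normalized as $S(u) = \frac{1}{2}\sum_k |k|^2 \hat{u}_k^2$, a normalization assumed here for illustration.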
recover the continuous system. But there are many different ways to approximate a continuous vorticity by a cloud of point vortices and different approximations may lead to very different statistical equilibrium states.

We present here the approach presented in Lions (1997). To get an idea of a completely different approximation, see, for example, Robert (2003).

Let $D$ be a bounded open smooth simply connected subset of $\mathbb{R}^2$. Then there exists a function $\psi$ (the stream function) such that $u = \nabla^{\perp}\psi$ and $\psi|_{\partial D} = 0$. Given the velocity $u$, we recover the stream function by means of the vorticity $\omega = \operatorname{curl} u = -\Delta\psi$, so $\psi(x) = \int_D g(x, y)\,\omega(y)\, dy$ (here $g$ is the Green's function of the Laplacian and $x$, $y$ are points in $D$). The Euler equations can be written as

$$\begin{aligned} \frac{\partial\omega}{\partial t} + u\cdot\nabla\omega &= 0 \\ \omega &= \operatorname{curl} u \end{aligned} \qquad [31]$$

For a system of $N$ point vortices, the vorticity is

$$\omega = \sum_{i=1}^{N} \xi_i\, \delta_{x_i(t)} \qquad [32]$$

Here the vortex intensities $\xi_i$ are real values and $x_i(t)$ are distinct points in $D$ for $i = 1, \ldots, N$. According to the Euler equations, these points evolve as follows (see also Marchioro and Pulvirenti (1994)):

$$\frac{d}{dt}x_j(t) = \nabla^{\perp}_{x_j} \sum_{l=1,\ l\neq j}^{N} \xi_l\, g(x_j(t), x_l(t)) + \xi_j\, \nabla^{\perp}_{x_j}\tilde{\gamma}(x_j), \quad j = 1, \ldots, N \qquad [33]$$

where $\tilde{\gamma}$ is related to the Green's function $g$. This is a Hamiltonian system in $D^N$. Hereafter, we shall suppose that the vortex intensities are the same ($\xi_i = \xi\ \forall i$), so that the Hamiltonian is

$$H(x_1, \ldots, x_N) = \frac{1}{2}\sum_{l,j=1,\ l\neq j}^{N} g(x_j, x_l) + \sum_{j=1}^{N} \tilde{\gamma}(x_j) \qquad [34]$$

By means of $H$, we define the canonical Gibbs measure

$$\mu_N(dx_1\, dx_2 \cdots dx_N) = \frac{1}{Z(N)}\, e^{-\tilde{\beta} H(x_1, \ldots, x_N)}\, dx_1\, dx_2 \cdots dx_N \qquad [35]$$

where $Z(N)$ is the partition function. If $Z(N) < \infty$, then $\mu_N$ is a well-defined probability measure on $D^N$ and, by construction, it is an invariant measure for system [33]. We can prove that $Z(N)$ is finite for $\tilde{\beta} \in (-8\pi/N, 4\pi)$, so that it is natural to choose as a scaling $\tilde{\beta} N = \beta$. Hence,

$$\mu_N(dx_1\, dx_2 \cdots dx_N) = \frac{1}{Z(N)}\, e^{-(\beta/N) H}\, dx_1\, dx_2 \cdots dx_N \qquad [36]$$

is considered for $-8\pi < \beta \le 0$, or $\beta > 0$ with $N > \beta/4\pi$.

Bearing in mind the Onsager approach to approximate the turbulent Euler motion by means of point vortices, we are interested in the limit as $N$ goes to $+\infty$, for fixed $\beta$ in $(-8\pi, +\infty)$. It turns out that, when the number of point vortices becomes very large, their statistical behavior corresponds to a very large number of independent particles moving in a mean force field that they create.

More precisely, consider $\xi = 1/N$, $\tilde{\beta} = \beta$. The empirical measure describing the vorticity weakly converges to a probability density $\rho$ and each correlation function

$$\rho_j^N(x_1, \ldots, x_j) = \int_D \cdots \int_D dx_{j+1} \cdots dx_N\, \frac{1}{Z(N)}\, e^{-(\beta/N) H} \quad \text{for } j = 1, \ldots, N - 1 \qquad [37]$$

weakly converges to $\rho^{\otimes j} = \prod_{i=1}^{j} \rho(x_i)$.

The equation satisfied by $\rho$, also called the mean-field equation, is

$$\rho(x) = \frac{e^{-\beta U(x)}}{\int_D e^{-\beta U(y)}\, dy}, \quad \text{with } U(x) = \int_D g(x, y)\,\rho(y)\, dy \qquad [38]$$

The relation between $U$ and $\rho$ can also be written as $-\Delta U = \rho$ in $D$, $U = 0$ on $\partial D$. We point out that $u = \nabla^{\perp} U$ is a stationary solution of the Euler equations. Indeed, $\omega = -\Delta U = \rho$ and $\rho$ is a function of $U$, let us say $\rho = F(U)$. This gives that $\nabla\omega = \nabla U\, F'(U)$ and thus the term $u\cdot\nabla\omega$ in the Euler equation [31] vanishes.

It can be proved that there exists a solution of the mean-field equation when $\beta \ge 0$ or when $\beta < 0$ and $D$ is simply connected. Uniqueness is known in some cases, for instance, when $D$ is a bounded open smooth simply connected domain and the velocity is assumed tangent to the boundary.

There is numerical evidence for this approximation approach (see references in Lions (1997) referring to the periodic case). They show that for
large time and large Reynolds number (viscosity $\nu$ close to 0), the vorticity of the solution of the Navier–Stokes equations appears in a simple and organized structure. This stays intact until the viscous dissipation damps it. The important observation is that the organized structure is described quite precisely by the solution of the mean-field equation for some specific $\beta$.

Actually, to say that a fluid is inviscid is an approximation (which may be justified in many contexts), since every fluid displays some kind of viscosity. But turbulence is a phenomenon occurring at very small viscosity. In this sense, the above result provides a description of stationary regime in an ideal fluid, which is a good approximation of some numerical simulations of real fluids. Besides this good agreement with numerical simulations, there is no proof on how to deduce the mean-field equation from the Euler equations (e.g., which parameter $\beta$ has to be chosen in eqn [38]?).

Remark The extension of this analysis to three-dimensional flows involves vortex filaments, instead of point vortices. There are attempts to describe interacting vortex filaments as proposed by Chorin. Idealizations of behavior of vortices are introduced to have a tractable mathematical model. The reader is referred to Lions (1997) for a description of nearly parallel vortex filaments and to Flandoli and Bessaih (2003) for more realistic filaments which fold.

See also: Cauchy Problem for Burgers-Type Equations; Hamiltonian Fluid Dynamics; Incompressible Euler Equations: Mathematical Theory; Malliavin Calculus; Non-Newtonian Fluids; Partial Differential Equations: Some Examples; Stochastic Differential Equations; Turbulence Theories; Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics.

Further Reading

Albeverio S and Cruzeiro AB (1990) Global flows with invariant (Gibbs) measures for Euler and Navier–Stokes two dimensional fluids. Communications in Mathematical Physics 129: 431–444.
Bensoussan A and Temam R (1973) Équations stochastiques du type Navier–Stokes. Journal of Functional Analysis 13: 195–222.
Bricmont J, Kupiainen A, and Lefevere R (2001) Ergodicity of the 2D Navier–Stokes equations with random forcing. Communications in Mathematical Physics 224(1): 65–81.
Da Prato G and Debussche A (2003) Ergodicity for the 3D stochastic Navier–Stokes equations. Journal de Mathématiques Pures et Appliquées 82(8): 877–947.
E Weinan, Mattingly JC, and Sinai Y (2001) Gibbsian dynamics and ergodicity for the stochastically forced Navier–Stokes equation. Communications in Mathematical Physics 224(1): 83–106.
Flandoli F and Bessaih H (2003) A mean field result for 3D vortex filaments. In: Davies IM, Jacob N, Truman A, Hassan O, Morgan K, and Weatherill NP (eds.) Probabilistic Methods in Fluids, 22–34. River Edge, NJ: World Scientific.
Flandoli F and Maslowski B (1995) Ergodicity of the 2-D Navier–Stokes equation under random perturbations. Communications in Mathematical Physics 172(1): 119–141.
Frisch U (1995) Turbulence. The legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press.
Gallavotti G (2002) Foundations of Fluid Dynamics. Berlin: Springer.
Hairer M and Mattingly JC (2004) Ergodic properties of highly degenerate 2D stochastic Navier–Stokes equations (English. English, French summary). Comptes Rendus Mathématique. Académie des Sciences. Paris 339(12): 879–888 (see also the paper ‘‘Ergodicity of the 2D Navier–Stokes equations with degenerate stochastic forcing’’ to appear on Annals of Mathematics).
Hopf E (1952) Statistical hydromechanics and functional calculus. Journal of Rational Mechanics and Analysis 1: 87–123.
Kraichnan RH and Montgomery D (1980) Two-dimensional turbulence. Reports on Progress in Physics 43(5): 547–619.
Kuksin SB (2004) The Eulerian limit for 2D statistical hydrodynamics. Journal of Statistical Physics 115(1/2): 469–492.
Kuksin S and Shirikyan A (2000) Stochastic dissipative PDEs and Gibbs measures. Communications in Mathematical Physics 213(2): 291–330.
Lions P-L (1997) On Euler Equations and Statistical Physics, Pubblicazione della Scuola Normale Superiore. Pisa: Cattedra Galileiana.
Marchioro C and Pulvirenti M (1994) Mathematical Theory of Incompressible Nonviscous Fluids. Applied Mathematical Science, vol. 96. New York: Springer.
Mikulevicius R and Rozovskii BL (2004) Stochastic Navier–Stokes equations for turbulent flows. SIAM Journal on Mathematical Analysis 35(5): 1250–1310.
Monin AS and Yaglom AM (1987) Statistical Fluid Mechanics: Mechanics of Turbulence. Cambridge, MA: MIT Press.
Onsager L (1949) Statistical hydrodynamics. Nuovo Cimento 6(suppl. 2): 279–287.
Robert R (2003) Statistical hydrodynamics (Onsager revisited). In: Friedlander S and Serre D (eds.) Handbook of Mathematical Fluid Dynamics, vol. II, pp. 1–54. Amsterdam: North-Holland.
Vishik MJ and Fursikov AV (1988) Mathematical Problems of Statistical Hydromechanics. Dordrecht: Kluwer Academic Publishers.
80 Stochastic Loewner Evolutions
Figure 2 The map $g_t$ from $\mathbb{H}\setminus\gamma[0, t]$ onto $\mathbb{H}$.
where $B_t$ is a standard one-dimensional Brownian motion. Equation [1] is often given in terms of the inverse $f_t = g_t^{-1}$:

$$\dot{f}_t(z) = -f_t'(z)\, \frac{2}{z - \sqrt{\kappa}\, B_t}$$

This equation describes a random evolution of conformal maps $f_t$ from $\mathbb{H}$ into subdomains of $\mathbb{H}$.

For each $z \in \overline{\mathbb{H}}$, the solution of [1] is defined up to a time $T_z \in [0, \infty]$ with $T_z > 0$ for $z \neq 0$. For fixed $t$, $g_t$ is the unique conformal transformation of $H_t := \{z \in \mathbb{H} : T_z > t\}$ onto $\mathbb{H}$ with expansion

$$g_t(z) = z + \frac{2t}{z} + \cdots, \quad z \to \infty$$

The chordal SLE$_{\kappa}$ path is the random curve $\gamma : [0, \infty) \to \overline{\mathbb{H}}$ such that for each $t$, $H_t$ is the unbounded component of $\mathbb{H}\setminus\gamma[0, t]$. It is not immediate from the definition that such a curve exists, but its existence has been proved. If $G_t = g_{t/\kappa}$, then we can write eqn [1] as

$$\dot{G}_t(z) = \frac{a}{G_t(z) + W_t} \qquad [2]$$

where $a = 2/\kappa$ and $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a standard Brownian motion. Then $Z_t^z := G_t(z) + W_t$ satisfies the Bessel stochastic differential equation

$$dZ_t^z = \frac{a}{Z_t^z}\, dt + dW_t, \quad Z_0^z = z \qquad [3]$$

This equation is valid up to time $T_z$, which is the first time that $Z_t^z = 0$.

Although chordal SLE$_{\kappa}$ is defined with a particular parametrization, one generally thinks of it as a measure on curves modulo reparametrization. The scaling properties of Brownian motion imply that this measure is invariant under dilations of $\mathbb{H}$. If $D$ is a simply connected domain and $z$, $w$ are distinct boundary points of $D$, chordal SLE$_{\kappa}$ in $D$ connecting $z$ and $w$ is defined to be the conformal image of SLE$_{\kappa}$ in $\mathbb{H}$ from 0 to $\infty$ under a conformal transformation of $\mathbb{H}$ onto $D$ taking 0 to $z$ and $\infty$ to $w$. There is a one-parameter family of such transformations, but the scale invariance of SLE$_{\kappa}$ in $\mathbb{H}$ shows that the image measure is independent of the choice of transformation.

The geometric and fractal properties of the curve $\gamma$ vary greatly as the parameter $\kappa$ changes:

if $\kappa \le 4$, $\gamma$ is a simple curve;
if $4 < \kappa < 8$, $\gamma$ has self-intersections, but is not space filling; and
if $\kappa \ge 8$, $\gamma$ is a space filling curve.

To see this, one notes that the conformal Markov property implies that there can be double points with positive probability if and only if $T_x < \infty$ occurs with positive probability for $x > 0$. In addition, the curve is space filling if and only if $T_z < \infty$ for all $z$ and $T_w \neq T_z$ for $w \neq z$. The problem is then reduced to a problem about the Bessel equation [3] for which the following holds:

if $a \ge 1/2$ and $z \neq 0$, the probability that $T_z < \infty$ is zero. If $a < 1/2$, this probability equals 1.
if $1/4 < a < 1/2$, and $w$, $z$ are distinct points in $\mathbb{H}$, then there is a positive probability that $T_w = T_z$.
if $0 < a \le 1/4$, then with probability 1, $T_w \neq T_z$ for all $w \neq z$.

This kind of argument is typical when studying SLE: geometric properties of the curve are established by analyzing a stochastic differential equation. The Hausdorff dimension of the path is given by

$$\dim[\gamma[0, \infty)] = \min\left\{1 + \frac{\kappa}{8},\ 2\right\}$$

The radial Loewner equation describes the evolution of a curve from the boundary of the unit disk $\mathbb{D} = \{z : |z| < 1\}$ to the origin. Suppose $\gamma : [0, \infty) \to \overline{\mathbb{D}}$ is a simple curve with $\gamma(0) = 1$, $\gamma(0, \infty) \subset \mathbb{D}\setminus\{0\}$, and $\gamma(t) \to 0$ as $t \to \infty$. Let $g_t$ be the unique conformal transformation of $\mathbb{D}\setminus\gamma[0, t]$ onto $\mathbb{D}$ such that $g_t(0) = 0$, $g_t'(0) > 0$. One can check that $g_t'(0)$ is continuous and strictly increasing in $t$, and hence we can parametrize $\gamma$ in such a way that $g_t'(0) = e^t$. Using this reparametrization, there is a continuous
$U_t : [0, \infty) \to \mathbb{R}$ with $U_0 = 0$ such that $g_t$ satisfies the radial Loewner equation

$$\dot{g}_t(z) = g_t(z)\, \frac{e^{iU_t} + g_t(z)}{e^{iU_t} - g_t(z)}, \quad g_0(z) = z$$

If $z \neq 0$, then we can define $h_t(z) = -i\log g_t(z)$ locally near $z$, and this equation becomes

$$\dot{h}_t(z) = \cot\left(\frac{h_t(z) - U_t}{2}\right)$$

Radial SLE$_{\kappa}$ (connecting 1 and 0 in $\mathbb{D}$) is obtained by setting $U_t = \sqrt{\kappa}\, B_t$. If $D$ is a simply connected domain, $z \in D$, $w \in \partial D$, then radial SLE$_{\kappa}$ in $D$ connecting $w$ and $z$ is obtained by conformal transformation using the unique transformation $f$ of $\mathbb{D}$ onto $D$ with $f(0) = z$, $f(1) = w$. Again, we think of this as being defined modulo time change. If $a = 2/\kappa$ and $v_t = h_{at/2}$, then

$$\dot{v}_t(z) = \frac{a}{2}\cot\left(\frac{v_t(z) + W_t}{2}\right) \qquad [4]$$

where $W_t := -\sqrt{\kappa}\, B_{t/\kappa}$ is a standard Brownian motion. If $L_t^z = v_t(z) + W_t$, then we get

$$dL_t^z = \frac{a}{2}\cot\left(\frac{L_t^z}{2}\right) dt + dW_t$$

[...]

motion. Girsanov's theorem implies that Brownian motions with the same variance but different drifts have absolutely continuous distributions. In particular, qualitative properties such as existence of double points or Hausdorff dimension of paths are the same for radial and chordal SLE. $U_t$ is a driftless Brownian motion if $a = 1/3$, $\kappa = 6$.

Whole-plane SLE$_{\kappa}$ from 0 to $\infty$ is a path $\gamma : (-\infty, \infty) \to \mathbb{C}$ with $\gamma(-\infty) = 0$, $\gamma(\infty) = \infty$, such that given $\gamma(-\infty, t]$, the distribution of $\gamma(t, \infty)$ is that of radial SLE$_{\kappa}$ from boundary point $\gamma(t)$ to interior point $\infty$ in the domain $\mathbb{C}\setminus\gamma[-\infty, t]$. One can define whole-plane SLE$_{\kappa}$ connecting two distinct points in $\mathbb{C}$ by conformal transformation.

Locality and Restriction

There are two special values of $\kappa$: $\kappa = 6$, $a = 1/3$ that satisfies the ‘‘locality’’ property and $\kappa = 8/3$, $a = 3/4$ that satisfies the ‘‘restriction’’ property. Suppose $\gamma$ is a chordal SLE$_{\kappa}$ curve from 0 to $\infty$ in $\mathbb{H}$ parametrized as in [2]. Suppose $\Phi : \mathcal{N} \to \mathbb{H}$ is a conformal map taking a neighborhood $\mathcal{N}$ of 0 in $\mathbb{H}$ to $\Phi(\mathcal{N})$ and that locally maps $\mathbb{R}$ into $\mathbb{R}$. Let $\tilde{\gamma}(t) = \Phi\circ\gamma(t)$, which is defined for sufficiently small $t$. Let $\tilde{g}_t$ be the conformal transformation of $\mathbb{H}\setminus\tilde{\gamma}[0, t]$ onto $\mathbb{H}$ with
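The chordal Loewner flow discussed earlier can be explored numerically. The sketch below assumes the standard form of the chordal Loewner equation, $\dot{g}_t(z) = 2/(g_t(z) - U_t)$ with $g_0(z) = z$, which is consistent with the expansion $g_t(z) = z + 2t/z + \cdots$; the function name `loewner_flow`, the driving function argument, and the choice of a fourth-order Runge–Kutta scheme are illustrative. For the zero driving function the solution is known in closed form, $g_t(z) = \sqrt{z^2 + 4t}$, which gives a simple check of the integrator.

```python
import cmath


def loewner_flow(z, driving, total_time, n_steps=2000):
    """Integrate dg/dt = 2 / (g - U_t), g_0 = z, with classical RK4.

    `driving` is a callable t -> U_t (real-valued). The integration is only
    meaningful as long as g stays away from the driving value."""
    dt = total_time / n_steps
    g = complex(z)
    t = 0.0

    def rhs(time, value):
        return 2.0 / (value - driving(time))

    for _ in range(n_steps):
        k1 = rhs(t, g)
        k2 = rhs(t + 0.5 * dt, g + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, g + 0.5 * dt * k2)
        k4 = rhs(t + dt, g + dt * k3)
        g += (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += dt
    return g


def zero_driving(t):
    """Driving function U_t = 0, for which g_t(z) = sqrt(z^2 + 4 t)."""
    return 0.0


if __name__ == "__main__":
    z0 = 1.0 + 1.0j
    print("numerical  :", loewner_flow(z0, zero_driving, 1.0))
    print("closed form:", cmath.sqrt(z0 * z0 + 4.0))
```

To sample an SLE-type flow one would replace `zero_driving` by a discretized path of $\sqrt{\kappa}\,B_t$; a Brownian driving function is not smooth, so a high-order scheme such as RK4 brings no accuracy gain there and the result should be read as a rough approximation only.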
SLE$_6$ is a natural candidate for the boundary of percolation clusters.

If $\kappa \le 4$, SLE$_{\kappa}$ paths are simple, that is, with no self-intersections. Suppose $A \subset \overline{\mathbb{H}}\setminus\{0\}$ is a compact set such that $\mathbb{H}\setminus A$ is simply connected. Let $\gamma$ denote a chordal SLE$_{\kappa}$ in $\mathbb{H}$ connecting 0 and $\infty$ and let $E_A$ be the event $E_A = \{\gamma(0, \infty) \cap A = \emptyset\}$. Let $\Phi_A : \mathbb{H}\setminus A \to \mathbb{H}$ be the unique conformal transformation with $\Phi_A(0) = 0$, $\Phi_A(\infty) = \infty$, $\Phi_A'(\infty) = 1$. On the event $E_A$, we can define $\tilde{\gamma}(t) = \Phi_A\circ\gamma(t)$. Chordal SLE$_{\kappa}$ is said to satisfy the restriction property if the conditional distribution of $\tilde{\gamma}$ given $E_A$ is the same as (a time change of) $\gamma$. The only $\kappa \le 4$ that satisfies this property is $\kappa = 8/3$. The proof of this fact also establishes the formula: if $\gamma$ is a chordal SLE$_{8/3}$ curve in $\mathbb{H}$ from 0 to $\infty$, then

$$P\{\gamma(0, \infty) \cap A = \emptyset\} = \Phi_A'(0)^{5/8} \qquad [6]$$

There is a similar formula for radial SLE$_{8/3}$, which establishes a radial restriction property. Suppose $A \subset \overline{\mathbb{D}}\setminus\{0, 1\}$ is a compact set such that $\mathbb{D}\setminus A$ is simply connected. Let $\Phi_A$ be the unique conformal transformation of $\mathbb{D}\setminus A$ onto $\mathbb{D}$ with $\Phi_A(0) = 0$, $\Phi_A'(0) > 0$. Then, if $\gamma$ is a radial SLE$_{8/3}$ curve from 1 to 0 in $\mathbb{D}$, then

$$P\{\gamma(0, \infty) \cap A = \emptyset\} = \Phi_A'(0)^{5/48}\, |\Phi_A'(1)|^{5/8}$$

The restriction property makes SLE$_{8/3}$ the candidate for the scaling limit of self-avoiding walks.

In studying the relationship between SLE and conformal field theories, two other probabilistic objects, restriction measures and the (Brownian) loop soup, arise. An $\mathbb{H}$-hull (connecting 0 and $\infty$) is an unbounded, connected, closed set $K \subset \overline{\mathbb{H}}$ with $K \cap \mathbb{R} = \{0\}$ and such that $\mathbb{H}\setminus K$ consists of two connected components, one whose boundary includes the positive reals and the other whose boundary includes the negative reals. A (chordal) restriction measure on hulls $K$ is a probability measure with the property that for any $A$ as in [6], the distribution of $\Phi_A\circ K$ given $\{K \cap A = \emptyset\}$ is the same as the original measure. The (Brownian) loop measure is a measure on unrooted loops derived from Brownian bridges. It is the scaling limit of the measure on random walk loops that gives each unrooted simple random walk loop of length $2n$ measure $4^{-2n}$. The loop measure in a bounded domain is obtained by restricting to loops that stay in that domain. We can consider this as a measure on ‘‘hulls’’ by filling in the bounded holes (so that the complement of the hull is connected). By doing this we get a family of infinite measures on hulls, indexed by domains $D$, and this family satisfies conformal invariance and the restriction property. The loop soup with parameter $\lambda$ is a Poissonian realization from this measure with parameter $\lambda$.

The set of all restriction measures is parametrized by $\alpha \ge 5/8$; the $\alpha$-restriction measure has the property that
Stochastic Resonance
S Herrmann, Université Henri Poincaré, Nancy 1, Vandoeuvre-lès-Nancy, France
P Imkeller, Humboldt Universität zu Berlin, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The concept of stochastic resonance was introduced by physicists. It originated in a toy model designed for a qualitative description of periodicity phenomena in the recurrences of glacial eras in Earth’s history. It spread its popularity over numerous areas of natural sciences: neuronal response to periodic stimuli, variations of magnetization in a ferromagnetic system, voltage variations in the simple Schmitt trigger electronic circuit or in more complicated devices, behavior of lasers in optical bi-stability, etc. The interest in this ubiquitous phenomenon is enhanced by signal analysis: an optimal dose of noise in some system can essentially boost signal transduction. Noise in this context does not enter the system as an impurity perturbing its performance, but on the contrary as a catalyst triggering amplified stochastic response to weak periodic signals.

[...]

deduce estimates of the average temperature on Earth over the last 700 000 years. They exhibit periodic switching between ice and warm ages with fast spontaneous transitions. The average periodicity of the glaciation time series obtained is $10^5$ years.

In order to explain temperature variations, Benzi et al. (1981) introduced random perturbations into an energy balance model of the Budyko–Sellers type. This model describes the evolution of the seasonal and global average temperature $X$ caused by defects in the balance between incoming and outgoing radiation

$$c\,\frac{dX(t)}{dt} = E_{in} - E_{out}$$

where $c$ is the active thermal inertia of the system. The incoming energy is modeled as proportional to the ‘‘solar constant’’ $Q$:

$$E_{in} = Q\left(1 + A\cos\frac{2\pi t}{T}\right), \quad \text{with } T \approx 92\,000 \text{ years}$$

and $A \approx 0.1\%$ of $Q$. This exceedingly small variation of the solar constant is caused by a modulation of the orbital eccentricity of the Earth’s trajectory (Figure 1). The outgoing radiation $E_{out}$ is composed
of two essential parts. The first part $a(X)E_{in}$ is dominated by the albedo $a(X)$, representing the proportion of energy reflected back to space. It is a decreasing function of temperature: at low temperatures a bigger volume of ice makes the Earth brighter, so the rate of reflection is higher. The second part of the outgoing radiation comes from the fact that the Earth radiates energy like a black body, and is given by the Boltzmann law $\gamma X^4$, where $\gamma$ is the Stefan constant. Describing the balance of energy terms as a slowly and weakly time-varying gradient of a potential $U$, the balance model can be expressed by

\[ \frac{dX(t)}{dt} = -\frac{\partial U}{\partial x}\left(\frac{t}{T}, X(t)\right) \]

where the time period 1 is blown up to (large) $T$ by time scaling. The roles of deep and shallow wells switch periodically (Figure 2). Since the variation of the solar constant is extremely small, we can assume that the height of the barrier between the two wells is lower-bounded by a positive constant. The system then admits three steady states, two of which are stable and separated by roughly 10 K. Like the solar constant, they fluctuate slowly and very weakly. Therefore, this deterministic system cannot account for climate changes with temperature variations of 10 K. They can only be explained by allowing transitions between the two steady states, which become possible by adding noise to the system. In general, short timescale phenomena such as annual fluctuations in solar radiation are modeled by Gaussian white noise of intensity $\varepsilon$ and lead to equations of the type

\[ dX^\varepsilon_t = -\frac{\partial U}{\partial x}\left(\frac{t}{T}, X^\varepsilon_t\right) dt + \sqrt{\varepsilon}\, dW_t \qquad [1] \]

which are generic for studying stochastic resonance in numerous physical and biological models. Generally, the input of noise amplifies a weak periodic signal by creating trajectories that fluctuate in a randomly periodic manner between metastable states. An optimal tuning of noise intensity to period length ("stochastic resonance") significantly enhances the response of the random system to weak perturbations with long periods.

Strongly Damped Brownian Particle

It is useful to roughly compare solutions of stochastic differential equations and motions of Brownian particles in double-well landscapes (Figure 3) in order to understand properties of their trajectories (see Schweitzer 2003, Mazo 2002). As in the previous section, let us concentrate on a one-dimensional setting, remarking that we shall give a treatment that easily generalizes to the finite-dimensional setting. Due to Newton's law, the motion of a particle is governed by the impact of all forces acting on it. Let us denote by $F$ the sum of these forces, $m$ the mass, $x$ the space coordinate, and $v$ the velocity of the particle. Then

\[ m\dot v = F \]

Let us first assume the potential to be switched off. In their pioneering work at the turn of the twentieth century, Marian v. Smoluchowski and Paul Langevin introduced stochastic concepts to describe the Brownian particle motion by claiming that at time $t$

\[ F(t) = -\gamma_0 v(t) + \sqrt{2 k_B T \gamma_0}\, \dot W_t \]

The first term results from friction $\gamma_0$ and is velocity dependent. An additional stochastic force represents random interactions between Brownian particles and their simple molecular random environment. The white noise $\dot W$ (formal derivative of the Wiener process) plays the crucial role. The diffusion coefficient (standard deviation of the random impact) is composed of Boltzmann's constant $k_B$, friction, and environmental temperature $T$. It satisfies the condition of the fluctuation–dissipation theorem expressing the balance of energy loss due to friction and energy gain resulting from noise. The equation of motion becomes

\[ \frac{dx(t)}{dt} = v(t), \qquad dv(t) = -\frac{\gamma_0}{m}\, v(t)\, dt + \frac{\sqrt{2 k_B T \gamma_0}}{m}\, dW_t \]
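As a rough numerical illustration (a sketch, not part of the original treatment), the velocity equation above can be integrated with the Euler–Maruyama scheme. The function names, the unit parameter values and the step size are assumptions chosen for the example. After the initial transient, the empirical variance of the velocity should settle near $k_B T/m$, the equipartition value consistent with the fluctuation–dissipation balance.

```python
import math
import random


def simulate_velocity(gamma0, mass, kT, v0, dt, n_steps, seed=0):
    """Euler-Maruyama path of dv = -(gamma0/m) v dt + sqrt(2 kT gamma0)/m dW.

    All parameter values are illustrative; units are arbitrary.
    """
    rng = random.Random(seed)
    beta = gamma0 / mass                          # relaxation rate gamma_0 / m
    sigma = math.sqrt(2.0 * kT * gamma0) / mass   # noise amplitude
    v = v0
    path = [v]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))        # Wiener increment ~ N(0, dt)
        v += -beta * v * dt + sigma * dW
        path.append(v)
    return path


def stationary_variance(path, burn_in):
    """Empirical variance after discarding the first burn_in samples."""
    tail = path[burn_in:]
    mean = sum(tail) / len(tail)
    return sum((x - mean) ** 2 for x in tail) / len(tail)


if __name__ == "__main__":
    path = simulate_velocity(gamma0=1.0, mass=1.0, kT=1.0, v0=0.0,
                             dt=0.01, n_steps=200_000)
    # Compare with the equipartition value kT / mass.
    print(stationary_variance(path, burn_in=2_000))
```

The explicit scheme is only adequate when the step `dt` is small compared with the relaxation time $m/\gamma_0$; in the strongly over-damped regime an exact Ornstein–Uhlenbeck update would be the safer choice.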
Figure 2 Deep and shallow wells switching periodically.
Figure 3 Brownian particle in a double-well landscape.
Figure 4 Resonance pictures for diffusions.
In the stationary regime, the stationary Ornstein–Uhlenbeck process provides its solution

\[ v(t) = v(0)\, e^{-(\gamma_0/m)t} + \frac{\sqrt{2 k_B T \gamma_0}}{m} \int_0^t e^{-(\gamma_0/m)(t-s)}\, dW_s \]

The ratio $\beta := \gamma_0/m$ determines the dynamic behavior. Let us focus on the over-damped situation with large friction and very small mass. Then for $t \gg 1/\beta = \tau$ (relaxation time), the first term in the expression for velocity can be neglected, while the stochastic integral represents a Gaussian process. By integrating, we obtain in the over-damped limit ($\beta \to \infty$) that $v$ and thus $x$ is Gaussian with almost constant mean

\[ m(t) = x(0) + \frac{1 - e^{-\beta t}}{\beta}\, v(0) \approx x(0) \]

and covariance close to the covariance of white noise (see Nelson 1967):

\[ K(s,t) = \frac{2 k_B T}{\gamma_0} \min(s,t) + \frac{k_B T}{\gamma_0 \beta}\left(-2 + 2e^{-\beta t} + 2e^{-\beta s} - e^{-\beta|t-s|} - e^{-\beta(t+s)}\right) \approx \frac{2 k_B T}{\gamma_0} \min(s,t) \]

Hence, the time-dependent change of the velocity of the Brownian particle can be neglected: the velocity rapidly thermalizes ($\dot v \approx 0$), while the spatial coordinate remains far from equilibrium. In the so-called adiabatic transformation, the evolution of the particle's position is thus given by the transformed Langevin equation

\[ dx(t) = \sqrt{\frac{2 k_B T}{\gamma_0}}\, dW_t \]

Let us next suppose that we have a Brownian particle in an external field of force (see Figure 3), generating a potential $U(t, x)$. This leads to the Langevin equation

\[ \frac{dx(t)}{dt} = v(t), \qquad m\, dv(t) = -\gamma_0 v(t)\, dt - \frac{\partial U}{\partial x}(t, x(t))\, dt + \sqrt{2 k_B T \gamma_0}\, dW_t \]

In the over-damped limit, after relaxation time, the adiabatic elimination of the fast variables (Gardiner 2004) leads to an equation similar to the one encountered in the previous section:

\[ dx(t) = -\frac{1}{\gamma_0} \frac{\partial U}{\partial x}(t, x(t))\, dt + \sqrt{\frac{2 k_B T}{\gamma_0}}\, dW_t \]

In the particular case of some double-well potential $x \mapsto U(t, x)$ with slow periodic variation, the following patterns of behavior of the solution trajectories will be experienced. If temperature is high, noise has a predominant influence on the motion, and the particle often crosses the barrier separating the two wells during one period. The behavior of the particle does not seem to be periodic but rather chaotic. If temperature is small, the particle stays for a very long time in the starting well, fluctuating weakly around the equilibrium position. It has too low energy to follow the periodic variation of the potential. So in this case too, the trajectories do not look periodic. Between these two extreme situations, there exists a regime of noise intensities for which the energy transmitted by the noise is sufficient to cross the barrier almost twice per period. The parameters are then near to the resonance point and the motion exhibits periodic switching (Figure 4).

Transition Criteria and Quasideterministic Motion

Studying stochastic resonance accordingly means looking for the range of regimes for which periodic behavior is enhanced and eventually optimal. The optimal relation between period $T$ and noise
intensity $\varepsilon$ emerges in the small noise limit. To explain this, let us focus on the basic indicator for periodic transitions: the time the Brownian particle needs to exit from the starting well, say the left one. In the "frozen" case, that is, if the time variation of the potential term is eliminated just by freezing it at some time $s$, the asymptotics of the exit time is derived from the classical large deviation theory of randomly perturbed dynamical systems (see Freidlin and Wentzell 1998). Let us assume that $U$ is locally Lipschitz. We denote by $D_l$ (resp. $D_r$) the domain corresponding to the left (resp. right) well and by $\chi$ their common boundary. The law of the first exit time $\tau^\varepsilon_{D_l} = \inf\{t \ge 0 : X^\varepsilon_t \notin D_l\}$ is described by some particular functional related to large deviation. For $t > 0$, we introduce the "action functional" on the space of continuous functions $C([0, t])$ on $[0, t]$ by

\[ S^s_t(\varphi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^t \left(\dot\varphi_u + \frac{\partial U}{\partial x}(s, \varphi_u)\right)^2 du, & \text{if } \varphi \text{ is absolutely continuous} \\ +\infty, & \text{otherwise} \end{cases} \]

which is non-negative and vanishes on the set of solutions of the ordinary differential equation $\dot x = -(\partial U/\partial x)(s, x)$. Let $x, y \in \mathbb{R}$. In relation with the action functional, we define the quasipotential

\[ V_s(x, y) = \inf\{S^s_t(\varphi) : \varphi \in C([0, t]),\ \varphi_0 = x,\ \varphi_t = y,\ t \ge 0\} \]

It represents the minimal work the diffusion starting in $x$ has to do in order to reach $y$. To switch wells, the Brownian particle starting in the left well's bottom $x_l$ has to overcome the barrier. So we let

\[ \bar V_s = \inf_{y \in \chi} V_s(x_l, y) \]

This minimal work needed to exit from the left well can be computed explicitly, and is seen to equal twice its depth. The asymptotic behavior of the exit time is expressed by

\[ \lim_{\varepsilon \to 0} \varepsilon \ln E[\tau^\varepsilon_{D_l}] = \bar V_s \]

and

\[ \lim_{\varepsilon \to 0} P_x\left(e^{(\bar V_s - \delta)/\varepsilon} < \tau^\varepsilon_{D_l} < e^{(\bar V_s + \delta)/\varepsilon}\right) = 1 \quad \text{for any } \delta > 0 \]

The prefactor for the exponential rate derived by Freidlin and Wentzell (1998) was first given by Eyring and Kramers and then by Bovier et al. (2004). Let us now assume that the left well is the deeper one at time $s$. If the Brownian particle has enough time to cross the barrier, that is, if $T > e^{\bar V_s/\varepsilon}$, then whatever the starting point is, Freidlin (2000) proved that it should stay near $x_l$ in the following sense:

\[ \Lambda\left(t \in [0, 1] : |X^\varepsilon_{tT} - x_l| > \delta\right) \to 0 \]

in probability as $\varepsilon \to 0$. Here $\Lambda$ denotes Lebesgue measure on $\mathbb{R}$. If $T < e^{\bar V_s/\varepsilon}$, the time left is not long enough for crossings: the particle stays in the starting well, near the stable equilibrium point:

\[ \Lambda\left(t \in [0, 1] : |X^\varepsilon_{tT} - (x_l 1_{\{x \in D_l\}} + x_r 1_{\{x \in D_r\}})| > \delta\right) \to 0 \]

This observation is at the basis of Freidlin's law of quasideterministic periodic motion discussed in the subsequent section. The lesson it teaches is this: to observe switching of the position to the energetically most favorable well, $T$ should be larger than some critical level $e^{\lambda/\varepsilon}$. Measuring time in exponential scales by $\mu$ through the equation $T^\varepsilon = e^{\mu/\varepsilon}$, the condition becomes $\mu > \lambda$.

Stochastic Resonance for Landscapes, Frozen on Half-Periods

This particular case has analytical advantages, since it allows one to employ classical techniques of semigroup and operator theory. The situation is the following: let $U$ be a double-well potential with minima $x_l = -1$ and $x_r = 1$ and a saddle point at the origin. We assume that $U(x) \to \infty$ as $|x| \to \infty$ and $U(-1) = -V/2 = -V_l/2$, $U(1) = -v/2 = -V_r/2$, $U(0) = 0$, and $0 < v < V$. We define the 1-periodic potential by $U(t, x) = U(t + 1/2, -x)$. Hence on each half-period the corresponding diffusion is time homogeneous. The critical level is then easily defined by $\lambda = v$, that is, twice the depth of the shallow well. By letting

\[ \phi(t) = \begin{cases} -1 & \text{for } t \in [k, k + \tfrac12) \\ \phantom{-}1 & \text{for } t \in [k + \tfrac12, k + 1), \end{cases} \qquad k = 0, 1, 2, \ldots \]

the periodic function which describes the location of the global minimum of the potential, we get in the small noise limit

\[ \Lambda\left(t \in [0, 1] : |X^\varepsilon_{tT} - \phi(t)| > \delta\right) \to 0 \]

in probability as $\varepsilon \to 0$. This result expresses Freidlin's law of quasideterministic motion: for large periods, the trajectories of the particle approach a periodic deterministic function. But the sense in which this notion measures periodicity does not take into account that for large periods short excursions to the wrong well may occur in an erratic way without counting much for Lebesgue measure of time. In fact, if the period is too large, that is, $\mu > V$, the time available in one period permits the exit of not only the shallow well but also that of the deep well. So, whatever the starting position of
the particle is, the number of observed transitions in one half period becomes very large. Indeed the first time $\tau$ at which the particle starting in $x_l$ hits $x_l$ again after visiting the position $x_r$ satisfies

\[ E(\tau) = e^{v/\varepsilon} + e^{V/\varepsilon} < T^\varepsilon = e^{\mu/\varepsilon} \]

The motion of the particle appears chaotic rather than periodic: noise intensity is too large compared to period length. We avoid this range of chaotic spontaneous transitions by defining the resonance interval $I_R = [v, V]$ as the range of admissible energy parameters $\mu$ for randomly periodic behavior. In this regime, the trajectories possess periodicity properties. In these terms the resonance point describes the tuning rate $\mu_R \in I_R$ for which the stochastic response to weak external periodic forcing is optimal. To make sense, this point has to refer to some measure of quality for periodicity of random trajectories. In the huge physics literature concerning resonance, two families of criteria can be distinguished. The first one is based on invariant measures and spectral properties of the infinitesimal generator associated with the diffusion $X^\varepsilon$. Now, $X^\varepsilon$ is not a time-homogeneous Markov process and consequently does not admit invariant measures. But by taking into account deterministic motion of time in the interval of periodicity and considering the process $Z_t = (t \bmod T^\varepsilon, X_t)$, we obtain a Markov process with an invariant measure $\nu_t(x)\,dx$. In other words, the law of $X_t \sim \nu_t(x)\,dx$ and the law of $X_{t+T} \sim \nu_{t+T}(x)\,dx$, under this measure, are the same for all $t \ge 0$. Let us present the most important ones:

• the spectral power amplification (SPA), which plays an eminent role in the physics literature, describes the energy carried by the spectral component of the averaged trajectories of $X^\varepsilon$ corresponding to the period:

\[ M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[X^\varepsilon_{sT}]\, e^{2\pi i s}\, ds \right|^2 \]

• the SPA-to-noise ratio, giving the ratio of the amplitude of the response and the noise intensity, which is also related to the signal-to-noise ratio:

\[ M_{SPN}(\varepsilon, T) = M_{SPA}(\varepsilon, T)/\varepsilon^2 \]

• the total energy of the averaged trajectories

\[ M_{EN}(\varepsilon, T) = \int_0^1 \left(E_\nu[X^\varepsilon_{sT}]\right)^2 ds \]

The second family of criteria is more probabilistic. It refers to quality measures based on transition times between the domains of attraction of the local minima, residence-time distributions measuring the time spent in one well between two transitions, or interspike times. This family is certainly less popular in the physics community.

Figure 5 Resonance pictures for Markov chain.

However, measures related to invariant measures may suffer from robustness deficiency (Imkeller and Pavlyukevich 2002). To explain what we mean by robustness, let us introduce a model reduction first discussed by McNamara and Wiesenfeld (1989). Instead of studying the diffusion $X^\varepsilon$ in the double-well landscape, they introduce a two-state Markov chain $Y^\varepsilon$ (Figure 5) the dynamics of which just takes account of the domain of attraction the diffusion is in, and therefore with state space $\{-1, 1\}$. A reasonable choice of the infinitesimal generator should retain the dynamics of the diffusion's transitions characterized by Kramers' rate. We may take

\[ Q(t) = \begin{pmatrix} -\varphi & \varphi \\ \psi & -\psi \end{pmatrix}, \quad 0 \le t < \frac{T}{2}; \qquad Q(t) = \begin{pmatrix} -\psi & \psi \\ \varphi & -\varphi \end{pmatrix}, \quad \frac{T}{2} \le t < T \]

periodically continued on $\mathbb{R}_+$. Here, $\varphi = p\,e^{-V/\varepsilon}$ and $\psi = q\,e^{-v/\varepsilon}$. The prefactors of subexponential order are beyond the scope of large deviation theory. They are related to the curvature of the potential in the
minima and the saddle point of the landscape and given by

\[ p = \frac{1}{2\pi} \sqrt{U''(-1)\,|U''(0)|}, \qquad q = \frac{1}{2\pi} \sqrt{U''(1)\,|U''(0)|} \]

On the intervals $[kT/2, (k+1)T/2)$, $k \ge 0$, the Markov chain $Y^\varepsilon$ is time-homogeneous and its transition probabilities can be expressed in terms of $\varphi$ and $\psi$. For instance, the probability with which the chain jumps from state $-1$ to state $+1$ in the time window $[t, t + h]$ equals $\varphi h + o(h)$, if this time interval is contained in $[kT/2, (k+1)T/2)$ for some even $k$. The stationary measure of the Markov chain, denoted by $\nu$, can be explicitly calculated, and so can the classical quality measures based on the spectral notions. For instance, the spectral power amplification coefficient equals

\[ M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[Y^\varepsilon_{sT}]\, e^{2\pi i s}\, ds \right|^2 = \frac{4}{\pi^2}\, \frac{T^2 (\varphi - \psi)^2}{(\varphi + \psi)^2 T^2 + \pi^2} \]

particular function designed to cut out the small fluctuations of the diffusion in the neighborhood of the bottoms of the wells, by identifying all states there. So $g(x) = -1$ (resp. 1) in some neighborhood of $-1$ (resp. 1) and otherwise $g$ is the identity. This results in

\[ \tilde M_{SPA}(\varepsilon, T) = \left| \int_0^1 E_\nu[g(X^\varepsilon_{sT})]\, e^{2\pi i s}\, ds \right|^2 \]

In the small noise limit this quality function admits a local maximum close to the resonance point of the reduced model: the growth rate of $T^\varepsilon_{opt}$ is also given by the sum of the wells' depths. So the lack of robustness seems to be due to the small fluctuations of the particle in the wells' bottoms. In any case, this clearly calls for other quality measures to be used to transfer properties of the reduced model to the original one. Our discussion indicates that due to their emphasis on the pure transition dynamics, the second family of quality measures should be used. For these notions there is no need to restrict to landscapes frozen in time-independent potential states on half-period intervals.
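The two-state reduction described in this section lends itself to a small experiment. The following Python sketch is an illustration under stated assumptions, not the authors' procedure: it evaluates the Kramers prefactors $p$, $q$ and the rates $\varphi = p\,e^{-V/\varepsilon}$, $\psi = q\,e^{-v/\varepsilon}$, then simulates the chain with exponential waiting times, restarting the clock at every half-period (which is exact because exponential waiting times are memoryless). The curvature values, $V$, $v$, $\varepsilon$, $T$ and all function names are hypothetical choices for the example.

```python
import math
import random


def kramers_prefactor(curv_min, curv_saddle):
    """Prefactor (1/(2 pi)) * sqrt(U''(x_min) * |U''(0)|)."""
    return math.sqrt(curv_min * abs(curv_saddle)) / (2.0 * math.pi)


def rates(eps, V, v, p, q):
    """Return (phi, psi) = (p exp(-V/eps), q exp(-v/eps))."""
    return p * math.exp(-V / eps), q * math.exp(-v / eps)


def simulate_chain(phi, psi, T, n_periods, seed=0):
    """Simulate the two-state chain on {-1, +1} over n_periods periods.

    phi is the rate of leaving the currently deep state, psi the rate of
    leaving the shallow one; the deep state alternates every half-period.
    Returns (fraction of time spent in the deep state, number of jumps).
    """
    rng = random.Random(seed)
    half = T / 2.0
    state = -1
    time_in_deep = 0.0
    jumps = 0
    for k in range(2 * n_periods):
        deep = -1 if k % 2 == 0 else 1
        t = 0.0
        while True:
            rate = phi if state == deep else psi
            wait = rng.expovariate(rate)
            if t + wait >= half:          # no further jump in this half-period
                if state == deep:
                    time_in_deep += half - t
                break
            if state == deep:
                time_in_deep += wait
            t += wait
            state = -state
            jumps += 1
    return time_in_deep / (n_periods * T), jumps


if __name__ == "__main__":
    # Hypothetical curvatures U''(-1) = U''(1) = 4 and U''(0) = -2.
    p = kramers_prefactor(4.0, -2.0)
    q = kramers_prefactor(4.0, -2.0)
    phi, psi = rates(eps=0.25, V=2.0, v=1.0, p=p, q=q)
    print(phi, psi)
    print(simulate_chain(phi, psi, T=2000.0, n_periods=200))
```

With these values the half-period lies between the mean waiting times $1/\psi$ and $1/\varphi$, so the chain should spend most of its time in the currently deep state and jump roughly once per half-period. Varying `T` or `eps` moves the chain toward either the frozen or the erratic regime.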
have to look at smaller periods even at the cost that the particle may not stay close to the global minimum. Let us study the transition dynamics. Assume that the starting point is $-1$, corresponding to the bottom of the deep well. If the depth of the well is always larger than $\mu = \varepsilon \log T^\varepsilon$, the particle has too little time during one period to climb the barrier, and should stay in the starting well. If, on the contrary, the minimal work to leave the starting well, given by $2\Delta_-(s)$, becomes smaller than $\mu$ at some time $s$, then the transition can and will happen. More formally, for $\mu \in [\inf_{t \ge 0} 2\Delta_-(t), \sup_{t \ge 0} 2\Delta_-(t)]$, we define (Figure 6)

\[ a^-_\mu(s) = \inf\{t \ge s : 2\Delta_-(t) \le \mu\} \]

The first transition time from $-1$ to $1$, denoted $\tau_+$, has the following asymptotic behavior as $\varepsilon \to 0$: $\tau_+/T^\varepsilon \to a^-_\mu(0)$. At the second transition the particle returns to the starting well. If $a^+_\mu$ is defined

On this interval they get close to deterministic periodic ones. Again, periodicity is quantified by a quality measure, to be maximized in order to obtain resonance as the best possible response to periodic forcing. One interesting measure is based on the probability that random transitions happen in some small time window around a deterministic time, in the small noise limit (Herrmann and Imkeller 2005). Formally, for $h > 0$, the measure gives

\[ M_h(\varepsilon, T) = \min_{i = \pm} P_i\left(\tau_{\mp}/T^\varepsilon \in [a^i_\mu - h, a^i_\mu + h]\right) \]

where $P_i$ is the law of the diffusion starting in $i$. In the small noise limit, this quality measure tends to 1, and optimal tuning can be related to the exponential rate at which this happens. This is due to the following large deviations principle:

\[ \lim_{\varepsilon \to 0} \varepsilon \log\left(1 - M_h(\varepsilon, T)\right) = \max_{i = \pm} \left\{\mu - 2\Delta_i(a^i_\mu - h)\right\} \]
\[ \Delta_+(t) = \Delta_-(t + \phi) \]
consisting of a two-state Markov chain with infinitesimal generator

\[ Q(t) = \begin{pmatrix} -\varphi(t) & \varphi(t) \\ \psi(t) & -\psi(t) \end{pmatrix} \]

where $\varphi(t) = \exp\{-2\Delta_-(t/T)/\varepsilon\}$ and $\psi(t) = \exp\{-2\Delta_+(t/T)/\varepsilon\}$. The law of transition times of this Markov chain is readily computed from Laplace transforms. Normalized by $T^\varepsilon$ it converges to $a^i_\mu$. This calculation even reveals a rigorous underlying pattern for the second- and higher-order transition times interpreting the interspike distributions of the physics literature. The dynamics of diffusion and Markov chain are similar. Resonance points provided by $M_h$ for the diffusion and its analog for the Markov chain agree.

Related Notions: Synchronization

In the preceding sections, we interpreted stochastic resonance as optimal response of a randomly perturbed dynamical system to weak periodic forcing, in the spirit of the physics literature (see Gammaitoni et al. (1998)). Our crucial assumption concerned the barrier heights a Brownian particle has to overcome in the potential landscape of the dynamical system: it is uniformly lower bounded in time. Measures for the quality of tuning were based on essentially two concepts: one concerning spectral criteria, with the spectral power amplification as most prominent member, the other one concerning the pure transitions dynamics between the domains of attraction of the local minima. A number of different criteria can be used to create an optimal tuning between the intensity of the noise perturbation and the large period of the dynamical system. The relations have to be of an exponential type $T = \exp(\mu/\varepsilon)$, since the Brownian particle needs exponentially long times to cross the barrier separating the wells according to the Eyring–Kramers–Freidlin transition law. Our barrier height assumption seems natural in many situations, but can fail in others. If it becomes small periodically, and eventually scales with the noise-intensity parameter, the Brownian particle does not need to wait an exponentially long time to climb it. So periodicity obtains for essentially smaller timescales. In this setting, the slowness of periodic forcing may also be assumed to be essentially subexponential in the noise intensity.

If it is fast enough to allow for substantial changes before large deviation effects can take over, we are in the situation of Berglund and Gentz (2002). They in fact consider the case in which the barrier between the wells becomes low twice per period, to the effect of modulating periodically a bifurcation parameter: at time zero the right-hand well becomes almost flat, and at the same time the bottom of the well and the saddle approach each other; half a period later, a spatially symmetric scenario is encountered. In this situation, there is a threshold value for the noise intensity under which transitions become unlikely. Above this threshold, the trajectories typically contain two transitions per period. Results are formulated in terms of concentration properties for random trajectories. The intuitive picture is this: with overwhelming probability, sample paths will be concentrated in spacetime sets scaling with the small parameters of the problem. In higher dimensions, these sets may be given by adiabatic or center manifolds of the deterministic system, which allow model reduction of higher-dimensional systems to lower-dimensional ones. Asymptotic results hold for any choice of the small parameters in a whole parameter region. A passage to the small noise limit as for optimal tuning in the preceding sections is not needed.

Related problems studied by Berglund and Gentz in the multidimensional case concern the noise-induced passage through periodic orbits, where unexpected phenomena arise. Here, as opposed to the classical Freidlin–Wentzell theory, the distribution of first-exit points depends nontrivially on the noise intensity. Again aiming at results valid for small but nonvanishing parameters in subexponential scale ranges, they investigate the density of first-passage times in a large regime of parameter values, and obtain insight into the transition from the stochastic resonance regime into the synchronization regime.

See also: Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Magnetic Resonance Imaging; Spectral Theory for Linear Operators; Stochastic Differential Equations.

Further Reading

Benzi R, Sutera A, and Vulpiani A (1981) The mechanism of stochastic resonance. Journal of Physics A 14: L453–L457.
Berglund N and Gentz B (2002) A sample-paths approach to noise-induced synchronization: stochastic resonance in a double-well potential. Annals of Applied Probability 12(4): 1419–1470.
Bovier A, Eckhoff M, Gayrard V, and Klein M (2004) Metastability in reversible diffusion processes. I. Sharp asymptotics for capacities and exit times. Journal of the European Mathematical Society 6(4): 399–424.
Freidlin MI (2000) Quasi-deterministic approximation, metastability and stochastic resonance. Physica D 137(3–4): 333–352.
Freidlin M (2003) Noise sensitivity of stochastic resonance and other problems related to large deviations. In: Sri Namachchivaya and Lin YK (eds.) IUTAM Symposium on Nonlinear Stochastic Dynamics, Solid Mech. Appl, vol. 110, pp. 43–55. Dordrecht: Kluwer Academic.
Freidlin MI and Wentzell AD (1998) Random Perturbations of Dynamical Systems, 2nd edn. New York: Springer.
Gammaitoni L, Hänggi P, Jung P, and Marchesoni F (1998) Stochastic resonance. Reviews of Modern Physics 70(1): 223–287.
Gardiner CW (2004) Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences. Springer Series in Synergetics, 3rd edn., vol. 13. Berlin: Springer.
Herrmann S and Imkeller P (2005) The exit problem for diffusions with time-periodic drift and stochastic resonance. Annals of Applied Probability 15(1A): 39–68.
Imkeller P and Pavlyukevich I (2002) Model reduction and stochastic resonance. Stochastics and Dynamics 2(4): 463–506.
Mazo RM (2002) Brownian Motion: Fluctuations, Dynamics, and Applications. International Series of Monographs on Physics, vol. 112. New York: Oxford University Press.
McNamara B and Wiesenfeld K (1989) Theory of stochastic resonance. Physical Review A 39(9): 4854–4869.
Nelson E (1967) Dynamical Theories of Brownian Motion. Princeton, NJ: Princeton University Press.
Schweitzer F (2003) Brownian Agents and Active Particles, Springer Series in Synergetics, (Collective dynamics in the natural and social sciences, With a foreword by J. Doyne Farmer). Berlin: Springer.

String Field Theory
open SFT (OSFT). We can hope that the nonperturbative string dualities will also be understood in the framework of SFT, once covariant SFTs for the superstring are better developed.

In this article, we review the basic formalism of covariant SFT, using for illustration purposes the simplest model: cubic bosonic OSFT. We then briefly sketch the generalization to bosonic SFTs that include closed strings. Finally, we turn to the subjects of classical solutions in OSFT and the physics of the open-string tachyon.

Here to the standard free bulk action (integrated over the upper-half complex plane UHP) we have added perturbation localized on the real axis $\mathbb{R}$. Notice that the basis of perturbations depends on the chosen BCFT$_0$.

3. We interpret the coefficients $\{\tilde\phi^i(x^\mu)\}$ of the perturbations as spacetime fields. (The tilde on $\tilde\phi^i(x)$ serves as a reminder that these fields are not quite the same as the fields $\phi^i(x)$ that will appear in the OSFT action). We are after a spacetime action $S[\{\tilde\phi^i\}]$ such that solutions of its classical equations of motion correspond to conformal boundary conditions:
where the nilpotent BRST operator $Q$ has the standard expression

\[ Q = \frac{1}{2\pi i} \oint \left(c\, T_{matter} + {:}bc\partial c{:}\right) \qquad [6] \]

Though not a priori obvious, it turns out that the simplest form of the OSFT action is achieved by taking as the fundamental off-shell variable an arbitrary $G = +1$ element of the first-quantized Fock space,

\[ |\Phi\rangle \in \mathcal{H}^{(1)}_{BCFT_0} \qquad [7] \]

By the usual state–operator correspondence of CFT, we can also represent $|\Phi\rangle$ as a local (boundary) vertex operator acting on the vacuum,

\[ |\Phi\rangle = \Phi(0)|0\rangle \qquad [8] \]

The open-string field $|\Phi\rangle$ is really an infinite-dimensional array of spacetime fields. We can make this transparent by expanding it as

\[ |\Phi\rangle = \sum_i \int d^{p+1}k\, |\Phi_i(k)\rangle\, \phi^i(k^\mu) \qquad [9] \]

where $\{|\Phi_i(k)\rangle\}$ is some convenient basis of $\mathcal{H}^{(1)}_{BCFT_0}$ that diagonalizes the momentum $k^\mu$. The fields $\phi^i$ are a priori complex. This is remedied by imposing a suitable reality condition on the string field, which will be stated momentarily. Notice that there are many more elements in $\{|\Phi_i(k)\rangle\}$ than in the physical subspace (the cohomology classes of $Q$). Some of the extra fields will turn out to be nondynamical and could be integrated out, but at the price of making the OSFT action look much more complicated.

It is often useful to think of the string field in terms of its Schrödinger representation, that is, as a functional on the configuration space of open strings. Consider the unit half-disk in the upper-half plane, $D_H \equiv \{|z| \le 1,\ \Im z \ge 0\}$, with the vertex operator $\Phi(0)$ inserted at the origin. Impose BCFT$_0$ open string boundary conditions for the fields $X(z, \bar z)$ on the real axis (here $X(z, \bar z)$ is a short-hand notation for all matter and ghost fields), and boundary conditions $X(\sigma) = X_b(\sigma)$ on the curved boundary of $D_H$, $z = \exp(i\sigma)$, $0 \le \sigma \le \pi$. The path integral over $X(z, \bar z)$ in the interior of the half-disk assigns a complex number to any given $X_b(\sigma)$, so we obtain a functional $\Phi[X_b(\sigma)]$. This is the Schrödinger wave function of the state $\Phi(0)|0\rangle$. Thus, we can think of open-string functionals $\Phi[X_b(\sigma)]$ as the fundamental variables of OSFT. This is as it should be: the first-quantized wave functions are promoted to dynamical fields in the second-quantized theory. Finally, let us quote the reality condition for the string field, which takes a compact form in the Schrödinger representation:

\[ \Phi[X^\mu(\sigma), b(\sigma), c(\sigma)] = \Phi^*[X^\mu(\pi - \sigma), b(\pi - \sigma), c(\pi - \sigma)] \qquad [10] \]

where the superscript $*$ denotes complex conjugation.

The Classical Action

With all the ingredients in place, it is immediate to write the quadratic part of the OSFT action. The linearized equations of motion must reproduce the physical-state condition [5]. This suggests

\[ S \sim \langle \Phi | Q | \Phi \rangle \qquad [11] \]

Here $\langle \cdot | \cdot \rangle$ is the usual BPZ inner product of BCFT$_0$, which is defined in terms of a two-point correlator on the disk, as we review below. The ghost anomaly implies that on the disk we must have $G_{tot} = +3$, which happily is the case in [11]. Moreover, since the inner product is nondegenerate, variation of [11] gives

\[ Q|\Phi\rangle = 0 \qquad [12] \]

as desired. The equivalence relation $|V_{phys}\rangle \sim |V_{phys}\rangle + Q|\Lambda\rangle$ is interpreted in the second-quantized language as the spacetime gauge invariance

\[ \delta|\Phi\rangle = Q|\Lambda\rangle, \qquad |\Lambda\rangle \in \mathcal{H}^{(0)}_{BCFT_0} \qquad [13] \]

valid for the general off-shell field. This equation is a very compact generalization of the linearized gauge invariance for the massless gauge field. Indeed, focusing on the level-zero components, $|\Phi\rangle \sim A_\mu(x)(c\,\partial X^\mu)(0)|0\rangle$ and $|\Lambda\rangle \sim \lambda(x)|0\rangle$, we find $\delta A_\mu(x) = \partial_\mu \lambda(x)$. It is then plausible to guess that the nonlinear gauge invariance should take the form

\[ \delta|\Phi\rangle = Q|\Lambda\rangle + |\Phi\rangle * |\Lambda\rangle - |\Lambda\rangle * |\Phi\rangle \qquad [14] \]

where $*$ is some suitable product operation that conserves ghost number

\[ * : \mathcal{H}^{(n)}_{BCFT_0} \otimes \mathcal{H}^{(m)}_{BCFT_0} \to \mathcal{H}^{(n+m)}_{BCFT_0} \qquad [15] \]

Based on a formal analogy with 3D nonabelian Chern–Simons theory, Witten proposed the cubic action

\[ S = -\frac{1}{g_o^2} \left( \frac{1}{2} \langle \Phi | Q | \Phi \rangle + \frac{1}{3} \langle \Phi | \Phi * \Phi \rangle \right) \qquad [16] \]

The string field $|\Phi\rangle$ is analogous to the Chern–Simons gauge potential $A = A_i\, dx^i$, the $*$ product to the $\wedge$ product of differential forms, $Q$ to the exterior derivative $d$, and the ghost number $G$ to the degree
of the form. The analogy also suggests a number of algebraic identities:

$$\begin{aligned}
Q^2 &= 0 \\
\langle QA|B\rangle &= -(-1)^{G(A)}\langle A|QB\rangle \\
Q(A\star B) &= (QA)\star B + (-1)^{G(A)} A\star(QB) \\
\langle A|B\rangle &= (-1)^{G(A)G(B)}\langle B|A\rangle \\
\langle A|B\star C\rangle &= \langle B|C\star A\rangle \\
A\star(B\star C) &= (A\star B)\star C
\end{aligned} \qquad [17]$$

Note in particular the associativity of the $\star$-product. It is straightforward to check that this algebraic structure implies the gauge invariance of the cubic action under [14]. A $\star$-product satisfying all required formal properties can indeed be defined. The most intuitive presentation is in the functional language. Given an open-string curve $X(\sigma)$, $0 \le \sigma \le \pi$, we single out the string mid-point $\sigma = \pi/2$ and define the left and right “half-string” curves

$$X_L(\sigma) \equiv X(\sigma) \quad \text{for } 0 \le \sigma \le \frac{\pi}{2}, \qquad X_R(\sigma) \equiv X(\pi - \sigma) \quad \text{for } \frac{\pi}{2} \le \sigma \le \pi \qquad [18]$$

A functional $\Psi[X(\sigma)]$ can, of course, be regarded as a functional of the two half-strings, $\Psi[X] \to \Psi[X_L, X_R]$. We define

$$(\Psi_1 \star \Psi_2)[X_L, X_R] \equiv \int [dY]\, \Psi_1[X_L, Y]\, \Psi_2[Y, X_R] \qquad [19]$$

where $[dY]$ is meant as the functional integral over the space of half-strings $Y(\sigma)$, with $Y(\pi/2) = X_L(\pi/2) = X_R(\pi/2)$. Figure 1a shows two open strings interacting (to form a single open string) if and only if the right half of the first string precisely overlaps with the left half of the second string. Associativity is transparent (Figure 1b).

We can now translate this formal construction into precise CFT language. Very generally, an n-point vertex of open strings can be defined by specifying an n-punctured disk, that is, a disk with marked points on the boundary (punctures) and a choice of local coordinates around each puncture. Local coordinates are essential since we are dealing with off-shell open-string states. The BPZ inner product (two-point vertex) is given by

$$\langle\Psi_1|\Psi_2\rangle \equiv \langle I\circ\Psi_1(0)\,\Psi_2(0)\rangle_{UHP}, \qquad I(z) = -\frac{1}{z} \qquad [20]$$

The symbol $f\circ\phi(0)$, where f is a complex map, means the conformal transform of $\phi(0)$ by f. For example, if $\phi$ is a dimension-d primary field, then $f\circ\phi(0) = f'(0)^d\,\phi(f(0))$. If $\phi$ is nonprimary, the transformation rule will be more complicated and involve extra terms with higher derivatives of f. By performing the SL(2, C) transformation

$$w = h(z) \equiv \frac{1 + iz}{1 - iz} \qquad [21]$$

we can represent the two-point vertex as a correlator on the unit disk $D = \{|w| \le 1\}$,

$$\langle\Psi_1|\Psi_2\rangle = \langle f_1\circ\Psi_1(0),\, f_2\circ\Psi_2(0)\rangle_D, \qquad f_1(z_1) = -h(z_1), \quad f_2(z_2) = h(z_2) \qquad [22]$$

The vertex operators are inserted at $w = -1$ and $w = +1$ on D (see Figure 2a) and correspond to the two open strings at (Euclidean) world sheet time $\tau = -\infty$ (we take $z = \exp(i\sigma + \tau)$). The left half of D is the world sheet of the first open string; the right half of D is the world sheet of the second string. The two strings meet at $\tau = 0$ on the imaginary w axis.

The three-point Witten vertex is given by

$$\langle\Psi_1, \Psi_2, \Psi_3\rangle \equiv \langle g_1\circ\Psi_1(0)\; g_2\circ\Psi_2(0)\; g_3\circ\Psi_3(0)\rangle_D \qquad [23]$$

where

$$g_1(z_1) = e^{2\pi i/3}\left(\frac{1 + iz_1}{1 - iz_1}\right)^{2/3}, \qquad g_2(z_2) = \left(\frac{1 + iz_2}{1 - iz_2}\right)^{2/3}, \qquad g_3(z_3) = e^{-2\pi i/3}\left(\frac{1 + iz_3}{1 - iz_3}\right)^{2/3} \qquad [24]$$
Figure 1 Midpoint overlaps of open strings: (a) A*B; (b) A*B*C.
Figure 2 Representation of the quadratic and cubic vertices as 2- and 3-punctured unit disks: (a) quadratic vertex; (b) cubic vertex.
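The conformal maps [21] and [24] are easy to probe numerically. The following Python sketch is only an illustration: the names `h` and `witten_maps` are hypothetical, and the principal branch of the 2/3 power is assumed. It checks that the three punctures $g_i(0)$ sit at the cube roots of unity on the boundary of the unit disk, that the real axis of the z-plane lands on the unit circle under h, and that the string mid-point $z = i$ is sent to the centre of the disk, where the three strings meet.

```python
import cmath
import math


def h(z: complex) -> complex:
    """SL(2, C) map of [21]: w = (1 + i z) / (1 - i z)."""
    return (1 + 1j * z) / (1 - 1j * z)


def witten_maps():
    """Return the three local-coordinate maps g1, g2, g3 of [24].

    Assumption: the fractional power is taken on the principal branch.
    """
    phase = cmath.exp(2j * math.pi / 3)

    def wedge(z: complex) -> complex:
        # The 2/3 power squeezes the half-disk image of h into a 120-degree wedge.
        return h(z) ** (2.0 / 3.0)

    return (
        lambda z: phase * wedge(z),
        lambda z: wedge(z),
        lambda z: wedge(z) / phase,
    )


g1, g2, g3 = witten_maps()

# Punctures: images of the origin of each local coordinate patch.
punctures = [g(0) for g in (g1, g2, g3)]

# Image of the string mid-point z = i under each map.
midpoints = [g(1j) for g in (g1, g2, g3)]

if __name__ == "__main__":
    for k, p in enumerate(punctures, start=1):
        print(f"g{k}(0) = {p:.6f}, |g{k}(0)| = {abs(p):.6f}")
    print("mid-point images:", midpoints)
```

Because each $g_i$ differs from the others only by a rotation through $2\pi/3$, the three wedges tile the disk symmetrically, which is the cyclic symmetry of the vertex.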
The 3-punctured disk is depicted in Figure 2b, and where @l and @r are derivatives from the left and
describes the symmetric mid-point overlap of the from the right. It is often convenient to expand S in
three strings at = 0. Finally, the relation between powers of h, S = S0 þ hS1 þ h2 S2 þ , with
the three-point vertex and the -product is
fS0 ; S0 g ¼ 0
h1 j2 3 i h1 ; 2 ; 3 i ½25 ½28
fS0 ; S1 g þ fS0 ; S1 g ¼ 2hS0 ; . . .
Knowledge of the right-hand side (RHS) in [25] for
all allows to reconstruct the -product. All formal
With these definitions in place, we shall simply
properties [17] are easily shown to hold in the CFT
describe the answer, which is extremely elegant. In
language. This completes the definition of the OSFT
OSFT the full set of fields and antifields is packaged
action. in a single string field ji of unrestricted ghost
Evaluation of the classical action is completely
number. If we write
algorithmic and can be carried out for arbitrary
massive states, with no fear of divergences, since in
ji ¼ j i þ jþ i
all required correlators the operators are inserted ½29
well apart from each other. with Gð Þ 1 and Gðþ Þ 2
SFT Diagrams and Minimal Area Metrics

Imposing the Siegel gauge condition $b_0\Psi = 0$, one finds the gauge-fixed action

$$S_{gf} = -\frac{1}{g_o^2}\left(\frac{1}{2}\langle\Psi|c_0 L_0|\Psi\rangle + \frac{1}{3}\langle\Psi|\Psi\star\Psi\rangle + \langle\beta|b_0|\Psi\rangle\right) \qquad [32]$$

where $\beta$ is a Lagrange multiplier. The propagator reads

$$\frac{b_0}{L_0} = b_0\int_0^\infty dT\, e^{-TL_0} \qquad [33]$$

Since $L_0$ is the first-quantized open-string Hamiltonian, $e^{-TL_0}$ is the operator that evolves the open-string wave functions $\Psi[X(\sigma)]$ by Euclidean world sheet time T. It can be visualized as a flat rectangular strip of “horizontal” width $\pi$ and “vertical” height T. Each propagator comes with an antighost insertion

$$b_0 = \int_0^\pi b(\sigma)\, d\sigma \qquad [34]$$

integrated on a horizontal trajectory.

The only elementary interaction vertex is the mid-point three-string overlap, visualized in Figure 3. We are instructed to draw all possible diagrams with given external legs (represented as semi-infinite strips), and to integrate over all Schwinger parameters $T_i \in [0, \infty)$ associated with the internal propagators. The claim is that this prescription reproduces precisely the first-quantized result [1]. This follows if we can show that (1) the OSFT Feynman rules give a unique cover of the moduli space of open Riemann surfaces; (2) the integration measure agrees with the measure in [1]. The latter property holds because the antighost insertion [34] is precisely the one prescribed by the Polyakov formalism for integrating over the moduli $T_i$. To show point (1), we introduce the concept of minimal-area metrics, which has proved very fruitful. (Here and below, our discussion of minimal-area metrics will summarize ideas developed mainly by Zwiebach.) Quite generally, the Feynman rules of an SFT provide us with a cell decomposition of the appropriate moduli space of Riemann surfaces, a way to construct surfaces in terms of vertices and propagators. Given a Riemann surface (for fixed values of its complex moduli), the SFT must associate with it one and only one string diagram. The diagram has more structure than the Riemann surface: it defines a metric on it. In all known covariant SFTs, this is the metric of minimal area obeying suitable length conditions. Consider the following:

Minimal-area problem for open SFT Let $R_o$ be a Riemann surface with at least one boundary component and possibly punctures on the boundary. Find the (conformal) metric of minimal area on $R_o$ such that all nontrivial Jordan open curves have length greater than or equal to $\pi$. (A curve is said to be nontrivial if it cannot be continuously shrunk to a point without crossing a puncture.)

An OSFT diagram (for fixed values of its $T_i$) defines a Riemann surface $R_o$ endowed with a metric solving this minimal-area problem. This is the metric implicit in its picture: flat everywhere except at the conical singularities of defect angle $(n - 2)\pi$ when n propagators meet symmetrically. (For n = 3, these are the elementary cubic vertices; for n > 3, they are effective vertices, obtained when propagators joining cubic vertices collapse to zero length.) It is not difficult to see both that the length conditions are obeyed, and that the metric cannot be made smaller without violating a length condition. Conversely, any surface $R_o$ endowed with a minimal-area metric corresponds to an OSFT diagram. The idea is that the minimal-area metric must have open geodesics (“horizontal trajectories”) of length $\pi$ foliating the surface. The geodesics intersect on a set of measure zero – the “critical graph” where the propagators are glued. Bands of open geodesics of infinite height are the external legs of the diagram, while bands of finite height are the internal propagators.

The single cover of moduli space is then ensured by an existence and uniqueness theorem for metrics solving the minimal-area problem for OSFT. These metrics are seen to arise from Jenkins–Strebel quadratic differentials. Existence shows that the Feynman rules of OSFT generate each Riemann surface $R_o$ at least once. Uniqueness shows that there is no overcounting: since different diagrams correspond to different metrics (by inspection of their picture), no Riemann surface can be generated twice.

Figure 3 The cubic vertex represented as the mid-point gluing of three strips.
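The Schwinger representation [33] can be checked one eigenvalue at a time: on a state with $L_0$ eigenvalue $\lambda > 0$, the integral of $e^{-T\lambda}$ over the strip length T should reproduce $1/\lambda$. The Python sketch below does this with a plain midpoint rule; the function name `schwinger_integral`, the cutoff `t_max` and the step count are illustrative choices, not part of the formalism.

```python
import math


def schwinger_integral(level: float, t_max: float = 60.0, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of exp(-T * level) over 0 <= T <= t_max.

    `level` plays the role of an L0 eigenvalue; t_max truncates the
    Schwinger parameter, which is harmless only for level > 0.
    """
    dt = t_max / steps
    return dt * sum(math.exp(-(k + 0.5) * dt * level) for k in range(steps))


if __name__ == "__main__":
    for level in (1.0, 2.0, 3.5):
        # Compare the strip-length integral with the inverse eigenvalue.
        print(level, schwinger_integral(level), 1.0 / level)
```

For eigenvalues that are zero or negative the integral does not converge as the cutoff is removed, so the representation is then to be understood by analytic continuation; the sketch deliberately covers only the convergent case.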
corresponds to propagator lengths $T_i$ going to infinity. Surprisingly, the closed-string poles are also correctly reproduced, despite the fact that OSFT treats only the open strings as fundamental dynamical variables. In some sense, closed strings must be considered as derived objects in OSFT. Factorizing the amplitudes over the closed-string poles, one finds that on-shell closed-string states can be represented, at least formally, as certain singular open-string fields with G = +2, closely related to the (formal) identity string field. The picture is that of a folded open string, whose left and right halves precisely overlap, with an extra closed-string vertex operator inserted at the mid-point. The corresponding open/closed vertex is given by

$$\langle\Phi_{phys}|\Psi\rangle_{OC} \equiv \langle\Phi_{phys}(0)\; I\circ\Psi(0)\rangle_D, \qquad I \equiv \left(\frac{1 + iz}{1 - iz}\right)^2 \qquad [35]$$

and describes the coupling to the open-string field of a nondynamical, on-shell closed string $|\Phi_{phys}\rangle$. It is possible to add this open/closed vertex to the OSFT action. Remarkably, the resulting Feynman rules give a single cover of the moduli space of Riemann surfaces with at least one boundary, with open and closed punctures. This is shown using the same minimal-area problem as above, but now allowing for surfaces with closed punctures as well.

We should finally mention that the structure of OSFT emerges frequently in topological string theory, in contexts where open/closed duality plays a central role. Two examples are the interpretation of Chern–Simons theory as the OSFT for the A-model on the conifold, and the interpretation of the Kontsevich matrix integral for topological gravity as the OSFT on FZZT branes in (2, 1) minimal string theory.

Closed Bosonic SFT

The generalization to covariant closed SFT is nontrivial, essentially because the requisite closed-string decomposition of moduli space is much more complicated.

The free theory parallels the open case, with a minor complication in the treatment of the CFT zero

In the classical theory, the string field carries ghost number G = +2, since it is the off-shell extension of the familiar closed-string physical states, and the quadratic action reads

$$S \sim \langle\Psi, Q_c\Psi\rangle \qquad [37]$$

Here $Q_c$ is the usual closed BRST operator. The inner product $\langle\,\cdot\,,\,\cdot\,\rangle$ is defined in terms of the BPZ inner product, with an extra insertion of $c_0^- \equiv c_0 - \bar c_0$,

$$\langle A, B\rangle \equiv \langle A|c_0^-|B\rangle \qquad [38]$$

In [37] $G_{top} = +6$, as it should be. Without the extra ghost insertion and the subsidiary conditions [36] it would not be possible to write a quadratic action. The linearized equations of motion and gauge invariance,

$$Q_c|\Psi\rangle = 0, \qquad |\Psi\rangle \sim |\Psi\rangle + Q_c|\Lambda\rangle, \qquad |\Lambda\rangle \in \tilde{\mathcal{H}}^{(1)}_{CFT_0} \qquad [39]$$

give the expected cohomological problem. The fact that the cohomology is computed in the semirelative complex, $b_0^-|\Psi\rangle = b_0^-|\Lambda\rangle = 0$, well known from the operator formalism of the first-quantized theory, is recovered naturally in the second-quantized treatment.

The interacting action is constructed iteratively, by demanding that the resulting Feynman rules give a (unique) cover of moduli space. This requires the introduction of infinitely many elementary string vertices $\mathcal{V}_{g,n}$, where n is the number of closed-string punctures and g the genus. This decomposition of moduli space is more intricate than the decomposition that arises in OSFT, but is in fact analogous to it, when characterized in terms of the following.

Minimal-area problem for closed SFT Let $R_c$ be a closed Riemann surface, possibly with punctures. Find the (conformal) metric of minimal area on $R_c$ such that all nontrivial Jordan closed curves have length greater than or equal to $2\pi$.

The minimal-area metric induces a foliation of $R_c$ by closed geodesics of length $2\pi$. In the classical theory (g = 0), the minimal-area metrics arise from Jenkins–Strebel quadratic differentials (as in the open case), and geodesics intersect on a measure-zero set. For g > 0, however, there can be foliation bands of geodesics that cross. By staring at the foliation, we can break up the surface into vertices and propagators. In correspondence with each puncture, there is a band of
infinite height, a flat semi-infinite cylinder of circumference $2\pi$, which we identify as an external leg of the diagram. We mark a closed geodesic on each semi-infinite cylinder, at a distance $\pi$ from its boundary. Bands of finite height (internal bands not associated to punctures) correspond to propagators if their height is greater than $2\pi$; otherwise they are considered part of an elementary vertex. Along any internal cylinder of height greater than $2\pi$, we mark two closed geodesics, at a distance $\pi$ from the boundary of the cylinder. If we now cut open all the marked curves, the surface decomposes into a number of semi-infinite cylinders (external legs), finite cylinders (internal propagators) and surfaces with boundaries (elementary interactions). Each elementary interaction of genus g and with n boundaries is an element of $\mathcal{V}_{g,n}$. A crucial point of this construction is that we took care of leaving a “stub” of length $\pi$ attached to each boundary. Stubs ensure that sewing of surfaces preserves the length condition on the metric (no closed curve shorter than $2\pi$).

These geometric data can be translated into an iterative algebraic construction of the full quantum action $S[\Psi]$. The $\mathcal{V}_{g,n}$ satisfy geometric recursion relations whose algebraic counterpart is the quantum BV master equation for $S[\Psi]$. Remarkably, the singularities of the operator $\Delta$ encountered in OSFT are absent here, precisely because of the presence of the stubs. We refer to Zwiebach (1993) for a complete discussion of closed SFT.

Open/Closed SFT

There is also a covariant SFT that includes both open and closed strings as fundamental variables. The Feynman rules arise from the following problem.

Minimal-area problem for open/closed SFT Let $R_{oc}$ be a Riemann surface, with or without boundaries, possibly with open and closed punctures. Find the (conformal) metric of minimal area on $R_{oc}$ such that all nontrivial Jordan open curves have length greater than or equal to $l_o = \pi$, and all nontrivial Jordan closed curves have length greater than or equal to $l_c = 2\pi$.

The surface $R_{oc}$ is decomposed in terms of elementary vertices $\mathcal{V}^{g,n}_{b,m}$ (of genus g, b boundary components, n closed-string punctures and m open-string punctures) joined by open and closed propagators. Degenerations of the surface correspond always to propagators becoming of infinite length – factorization is manifest both in the open and in the closed channel.

The SFT described in the section “Closed strings in OSFT” (Witten OSFT augmented with the single open/closed vertex [35]) corresponds to taking $l_o = \pi$ and $l_c = 0$. Varying $l_c \in [0, 2\pi]$, we find a whole family of interpolating SFTs. This construction clarifies the special status of the Witten theory: moduli space is covered by a single cubic open overlap vertex, with no need to introduce dynamical closed strings, but at the price of a somewhat singular formulation.

Classical Solutions in Open SFT

In the present formulation of SFT, a background (a classical solution of string theory) must be chosen from the outset. The very definition of the string field requires specifying a (B)CFT$_0$. Intuitively, the string field lives in the “tangent” to the “theory space” at a specific point – where “theory space” is some notion of a “space of 2D (boundary) quantum field theories,” not necessarily conformal. In the early 1990s independence from the choice of background was demonstrated for infinitesimal deformations: the SFT actions written using neighboring (B)CFTs are indeed related by a field redefinition. In recent years, it has become apparent that at least the open-string field reaches out to open-string backgrounds a finite distance away – possibly covering the whole of theory space. (Classical solutions of closed SFT are beginning to be investigated at the time of this writing (2005).)

The OSFT action written using BCFT$_0$ data is just the full world volume action of the D-brane with BCFT$_0$ boundary conditions. Which classical solutions should we expect in this OSFT? In the bosonic string, Dp branes carry no conserved charge and are unstable. This instability is reflected in the presence of a mode with $m^2 = -1/\alpha'$, the open-string tachyon $T(x^\mu)$, $\mu = 0, \ldots, p$. From this physical picture, Sen argued that:

1. the tachyon potential, obtained by eliminating the higher modes of the string field by their equations of motion, must admit a local minimum corresponding to the vacuum with no D-brane at all (henceforth, the tachyon vacuum, $T(x^\mu) = T_0$);
2. the value of the potential at $T_0$ (measured with respect to the BCFT$_0$ point T = 0) must be exactly equal to minus the tension of the brane with BCFT$_0$ boundary conditions;
3. there must be no perturbative open-string excitations around the tachyon vacuum; and
4. there must be space-dependent “lump” solutions corresponding to lower-dimensional branes. For example, a lump localized along one world volume direction, say $x^1$, such that $T(x^1) \to T_0$ as $x^1 \to \pm\infty$, is identified with a D(p − 1) brane.
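The second conjecture can be illustrated with the crudest possible truncation, in which only the tachyon mode $t\,c_1|0\rangle$ is kept. The Python sketch below rests on an assumption imported from the level-truncation literature rather than derived in this article: in units of the brane tension the level-zero potential is $f(t) = 2\pi^2(-t^2/2 + K^3 t^3/3)$ with $K = 3\sqrt{3}/4$. Sen's conjecture requires $f = -1$ at the tachyon vacuum; the names `K`, `f`, `t_vac` and `depth` are illustrative.

```python
import math

# Assumed constant from the cubic vertex at level zero (literature value).
K = 3 * math.sqrt(3) / 4


def f(t: float) -> float:
    """Level-zero tachyon potential in units of the brane tension (assumed form)."""
    return 2 * math.pi**2 * (-0.5 * t**2 + (K**3 / 3) * t**3)


# Stationary point away from t = 0: f'(t) = 2 pi^2 (-t + K^3 t^2) = 0.
t_vac = 1 / K**3

# Depth of the local minimum; the conjectured exact value is -1.
depth = f(t_vac)

if __name__ == "__main__":
    print(f"t_vac = {t_vac:.4f}")
    print(f"f(t_vac) = {depth:.4f}  (conjectured exact value: -1)")
```

Already at this lowest level the minimum accounts for roughly 68% of the brane tension, which gives a feeling for why including higher levels is expected to close the gap.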
Sen’s conjectures have all been verified in OSFT. (See Sen (2004) and Taylor and Zwiebach (2003) for reviews.) The deceptively simple-looking equations of motion (in Siegel gauge)

$$L_0|\Psi\rangle + b_0\left(|\Psi\rangle \star |\Psi\rangle\right) = 0 \qquad [40]$$

are really an infinite system of coupled equations, and no analytic solutions are known. Turning on a vacuum expectation value (VEV) for the tachyon drives into condensation an infinite tower of modes. Fortunately, the approximation technique of “level truncation” is surprisingly effective. The string field is restricted to modes with an $L_0$ eigenvalue smaller than a prescribed maximal level L. For any finite L, the truncated OSFT contains a finite number of fields and numerical computations are possible. Numerical results for various classical solutions converge quite rapidly as the level L is increased. The most important solution is the string field $|T\rangle$ that corresponds to the tachyon vacuum. A remarkable feature of $|T\rangle$ is universality: it can be written as a linear combination of modes obtained by acting on the tachyon $c_1|0\rangle$ with ghost oscillators and matter Virasoro operators,

$$|T\rangle = T_0\, c_1|0\rangle + u\, L^m_{-2} c_1|0\rangle + v\, c_{-1}|0\rangle + \cdots$$

This implies that the properties of $|T\rangle$ are independent of any detail of BCFT$_0$, since all computations involving $|T\rangle$ can be reduced to purely combinatorial manipulations involving the ghosts and the Virasoro algebra. The numerical results strongly confirm Sen’s conjectures, and indicate that the tachyon vacuum is located at a non-singular point in configuration space. Numerical solutions describing lower-dimensional branes and exactly marginal deformations are also available. For example, the full family of solutions interpolating between a D1 and a D0 brane at the self-dual radius has been found. There is increasing evidence that the open-string field provides a faithful map of the open-string landscape.

Vacuum SFT: D-branes as Projectors

In the absence of a closed-form expression for $|T\rangle$, we are led to guesswork. When expanded around $|T\rangle$, the OSFT is still cubic, only with a different kinetic term $\mathcal{Q}$,

$$S = -\kappa_0\left[\frac{1}{2}\langle\Psi|\mathcal{Q}|\Psi\rangle + \frac{1}{3}\langle\Psi|\Psi\star\Psi\rangle\right] \qquad [41]$$

The operator $\mathcal{Q}$ must obey all the formal properties [17], must be universal (constructed from ghosts and matter Virasoro operators), and must have trivial cohomology at G = +1. Another constraint comes from requiring that [41] admits classical solutions in Siegel gauge. The choice

$$\mathcal{Q} = \frac{1}{2i}\left(c(i) - c(-i)\right) = c_0 - (c_2 + c_{-2}) + (c_4 + c_{-4}) - \cdots \qquad [42]$$

satisfies all these requirements. The conjecture (Rastelli et al. 2001) is that, by a field redefinition, the kinetic term around the tachyon vacuum can be cast into this form. This “purely ghost” $\mathcal{Q}$ is somewhat singular (it acts at the delicate string mid-point), and presumably should be regarded as the leading term of a more complicated operator that includes matter pieces as well. The normalization constant $\kappa_0$ is formally infinite. Nevertheless, a regulator (e.g., level truncation) can be introduced, and physical observables are finite and independent of the regulator. The vacuum SFT ([41]–[42]) appears to capture the correct physics, at least at the classical level. Taking a matter/ghost factorized ansatz

$$|\Psi_g\rangle \otimes |\Psi_m\rangle \qquad [43]$$

and assuming that the ghost part is universal for all D-brane solutions, the equations of motion reduce to the following equation for the matter part:

$$|\Psi_m\rangle \star |\Psi_m\rangle = |\Psi_m\rangle \qquad [44]$$

A solution $|\Psi_m\rangle$ can be regarded as a projector acting in “half-string space.” Recall that the $\star$-product looks formally like a matrix multiplication [19]: the matrices are the string fields, whose “indices” run over the half-string curves. These projector equations have been exactly solved by many different techniques (see Rastelli (2004) for a review). In particular, there is a general BCFT construction that shows that one can obtain solutions corresponding to any D-brane configuration, including multiple branes – the rank of the projector is the number of branes. A rank-one projector corresponds to an open-string functional which is left/right split, $\Psi[X(\sigma)] = F_L(X_L)\,F_R(X_R)$. There is also a clear analogy between these solutions and the soliton solutions of noncommutative field theory. The analogy can be made sharper using a formalism that rewrites the open-string $\star$-product as the tensor product of infinitely many Moyal products. (See Bars (2002) and references therein.)

It is unclear whether or not multiple-brane solutions (should) exist in the original OSFT – they are yet to be found in level truncation. Understanding this and other issues, like the precise role of closed strings in the quantum theory, seems to require a precise characterization of the allowed
String Theory: Phenomenology 103
space of open-string functionals. In principle, the path integral over such functionals would define the theory at the full nonperturbative level. This remains a challenge for the future.

Note Added in Proof Very recently, M Schnabl, building on previous work on star algebra projectors and related surface states (Rastelli L (2004) and references therein), was able to find the exact solution for the universal tachyon condensate in OSFT. This breakthrough is likely to lead to rapid new developments in SFT.

See also: Boundary Conformal Field Theory; BRST Quantization; Chern–Simons Models: Rigorous Results; Fedosov Quantization; The Jones Polynomial; Large-N and Topological Strings; Large-N Dualities; Noncommutative Geometry from Strings; Noncommutative Tori, Yang–Mills, and String Theory; Operads; Superstring Theories; Topological Quantum Field Theory: Overview; Two-Dimensional Conformal Field Theory and Vertex Operator Algebras.

Further Reading

Bars I (2002) MSFT: Moyal Star Formulation of String Field Theory. Proceedings of 3rd International Sakharov Conference on Physics, Moscow, Russia, 24–29 Jun, arXiv:hep-th/0211238.
Berkovits N (2001) Review of open superstring field theory, arXiv:hep-th/0105230.
Berkovits N, Okawa Y, and Zwiebach B (2004) WZW-like action for heterotic string field theory. Journal of High Energy Physics 0411: 038 (arXiv:hep-th/0409018).
Bordes J, Chan HM, Nellen L, and Tsou ST (1991) Half string oscillator approach to string field theory. Nuclear Physics B 351: 441.
Erler TG and Gross DJ (2004) Locality, causality, and an initial value formulation for open string field theory, arXiv:hep-th/0406199.
Ohmori K (2001) A Review on Tachyon Condensation in Open String Field Theories. Master’s thesis, University of Tokyo (arXiv:hep-th/0102085).
Okawa Y (2002) Open string states and D-brane tension from vacuum string field theory. Journal of High Energy Physics 0207: 003 (arXiv:hep-th/0204012).
Rastelli L (2004) Open string fields and D-branes. Fortschritte der Physik 52: 302.
Rastelli L, Sen A, and Zwiebach B Vacuum String Field Theory, Proceedings of the Strings 2001 Conference, TIFR, Mumbai, India, arXiv:hep-th/0106010.
Schnabl M (2005) Analytic solution for tachyon condensation in open string theory. arXiv:hep-th/0511286.
Sen A (2004) Tachyon dynamics in open string theory, arXiv:hep-th/0410103.
Shatashvili SL On Field Theory of Open Strings, Tachyon Condensation and Closed Strings, Proceedings of the Strings 2001 Conference, TIFR, Mumbai, India, arXiv:hep-th/0105076.
Siegel W (1988) Introduction To String Field Theory. Advanced Series in Mathematical Physics 8: 1–244 (arXiv:hep-th/0107094).
Taylor W and Zwiebach B (2001) D-branes, tachyons, and string field theory. In: Gubser SS and Lykken JD (eds.) Boulder 2001, Strings, Branes and Extra Dimensions, TASI 2001 Lectures, pp. 641–759. Singapore: World Scientific.
Thorn CB (1989) String field theory. Physics Reports 175: 1.
Witten E (1986) Noncommutative geometry and string field theory. Nuclear Physics B 268: 253.
Zwiebach B (1993) Closed string field theory: quantum action and the B-V master equation. Nuclear Physics B 390: 33 (arXiv:hep-th/9206084).
Zwiebach B (1998) Oriented open–closed string theory revisited. Annals of Physics 267: 193 (arXiv:hep-th/9705241).
(which carry nonabelian gauge symmetries and charged matter) in compactifications of type II theories (and orientifolds thereof, like the type I theory itself) makes the latter reasonable alternative setups to embed the standard model as a brane world. The different 10D theories (as well as the 11D M-theory) are related by diverse dualities, upon compactification. This suggests that they are just different limits of a unique underlying theory. For 4D models, this implies that the different classes of constructions are ultimately related by dualities, and that often a given model may be realized using different string theory constructions as starting points.

In order to recover 4D physics at low energies, compactification of the theory is required. In geometrical terms, the theory is required to propagate on a spacetime with geometry $M_4 \times X_6$, where $M_4$ is a 4D Minkowski space, and $X_6$ is a compact manifold. This description is valid in the regime of a large compactification volume, $\alpha'/R^2 \ll 1$ (where R is the overall scale of the compact manifold), where $\alpha'$ string theory corrections are negligible. Other 4D string models may be constructed using abstract conformal field theories. They may often be regarded as extrapolations of geometric compactifications to the regime of sizes comparable with the string length, where string theory corrections are relevant and the classical geometric picture does not hold.

In the simplest situation of geometrical compactification, not including additional backgrounds beyond the metric, the requirement of 4D spacetime supersymmetry (useful for the stability of the model, as well as of phenomenological interest) implies that the space $X_6$ is endowed with an SU(3) holonomy metric. Existence of such metrics is guaranteed for Calabi–Yau spaces, namely Kähler manifolds with vanishing first Chern class.

There are a very large number of 4D supersymmetric string models that can be constructed using different starting string theories and different compactification manifolds. They lead to different 4D spectra, often including nonabelian gauge symmetries and charged chiral fermions (but only rarely resembling the actual standard model). In addition, for each given model, there exist, in general, a large number of massless 4D scalars, known as moduli, whose vacuum expectation values are not fixed. They parametrize different choices of the compactification data in a given topological sector (e.g., Kähler and complex structure moduli of the internal Calabi–Yau space). All physical parameters of the 4D theory vary continuously with the vacuum expectation values of these scalars.

All such models are on equal footing from the point of view of the theory. Hence, 4D string models suffer from a large arbitrariness. Although the breaking of supersymmetry clearly changes the picture qualitatively (e.g., flat directions associated to moduli are lifted by radiative corrections), it is also difficult to evaluate this impact.

In this situation, most of the research in string theory phenomenology has centered on the study of generic properties of certain classes of compactifications, with the potential to lead to realistic structures (such as N = 1 or no supersymmetry, nonabelian gauge symmetries with replicated sets of charged chiral fermions). Within each class, explicit models (as close as possible to the standard model) have also been constructed. Generic predictions or expectations for phenomenology can be obtained within each setup, but quantitative results, even for explicit models, are always functions of undetermined moduli vacuum expectation values. Tractable mechanisms for moduli stabilization are under active research, although only preliminary results are available presently.

The better-studied classes of models are compactifications of heterotic theories on Calabi–Yau spaces, and compactifications of type II theories (or orientifolds thereof) with D-branes. Other possibilities include the heterotic M-theory, the M-theory on $G_2$ holonomy varieties, the F-theory on Calabi–Yau 4-folds, etc. As already mentioned, different classes (or even explicit models) are often related by string duality.

Heterotic String Phenomenology

A large class of phenomenologically interesting string vacua, which has been explored in depth, is provided by 4D compactifications of (any of the two) perturbative heterotic string theories. Compactification on large volume manifolds can be described in the supergravity approximation. As described by Candelas, Horowitz, Strominger, and Witten, the requirement of 4D N = 1 supersymmetry requires the internal manifold to be of SU(3) holonomy, a condition which is satisfied by Calabi–Yau manifolds. In the presence of a curvature, the Bianchi identity for the Kalb–Ramond 2-form B is modified, so that, in general, it reads

$$dH = \mathrm{tr}\, R^2 - \frac{1}{30}\,\mathrm{tr}\, F^2 \qquad [1]$$

where H is the field strength 3-form, R is the Ricci 2-form, and F is the field strength, in the adjoint representation, of the 10D gauge fields. Regarding the above equation in cohomology leads to a
consistency condition, forcing the background gauge bundle V to be topologically nontrivial, with

$$c_2(V) = c_2(TX_6) \qquad [2]$$

where $c_2$ denotes the second Chern class, and $TX_6$ is the compactification tangent space.

The condition of supersymmetry implies that the gauge fields must be solutions of the Donaldson–Uhlenbeck–Yau equations. Existence of such a solution is guaranteed for holomorphic and stable gauge bundles. The simplest solution to these conditions is the so-called standard embedding, where the gauge connection is locally identical to the spin connection, but more general solutions exist and have been characterized for particular classes of Calabi–Yau manifolds (e.g., when they are elliptically fibered).

The gauge background bundle V, with structure group H, breaks the 10D gauge symmetry G to its commutant subgroup $G_{4D}$. The latter corresponds to the 4D gauge symmetry. Moreover, the background bundle modifies the Kaluza–Klein reduction of the 10D charged fermions, leading to a nonzero number of replicated 4D chiral fermions. Decomposing the adjoint representation of G (in which 10D fermions transform) with respect to $G_{4D} \times H$,

$$\mathrm{Adj}\, G = \bigoplus_i \, (R_{G_{4D},i},\, R_{H,i}) \qquad [3]$$

the net number of 4D chiral fermions in the representation $R_{G_{4D},i}$ is given by the index of the Dirac operator coupled to V in the representation $R_{H,i}$. Condition [1] implies proper cancellation of chiral anomalies in the resulting theory. A simple and well-studied class is provided by standard embedding compactifications of the $E_8 \times E_8$ heterotic string theory, whose unbroken 4D gauge group is $E_6 \times E_8$. The number of families (i.e., chiral multiplets in the representation $27$ of $E_6$) and conjugate families (in the $\overline{27}$) are given by the Hodge numbers

$$n_{27} = h^{1,1}(X_6), \qquad n_{\overline{27}} = h^{2,1}(X_6) \qquad [4]$$

More specifically, the harmonic representatives in each cohomology class represent the internal profile of the corresponding 4D fields. The net number of families is thus determined by the Euler characteristic $\chi(X_6)$

$$n_{\rm fam} = |h^{1,1} - h^{2,1}| = \tfrac{1}{2}\, |\chi(X_6)| \qquad [5]$$

Recently, much progress in heterotic model building has been achieved in nonstandard embedding compactifications by the detailed construction of holomorphic stable bundles and the computation of the diverse indexes. In particular, explicit models with just the minimal supersymmetric standard model spectrum have been constructed.

The above geometric approach has several limitations. On the technical side, the construction of explicit holomorphic and stable gauge bundles is nontrivial from the mathematical viewpoint. On the more fundamental side, it allows one to explore only the large volume limit of heterotic compactifications.

Further insight into the latter aspect can be obtained via constructions based on exactly solvable conformal field theories (CFTs), which describe the world-sheet string dynamics in compactifications, including all $\alpha'$ corrections, and, therefore, allowing one to enter the small volume regime. The simplest such compactifications are provided by toroidal orbifolds, which describe string propagation in quotients of toroidal compactifications by a discrete group $\Gamma$. From the world-sheet viewpoint, they are described by 2D free CFT, but which include sectors of closed strings with boundary conditions twisted by elements of $\Gamma$. The resulting 4D theory contains chiral fermions, arising from the untwisted and twisted sectors. In the former, the nonchiral spectrum of toroidal compactification suffers a projection onto the $\Gamma$-invariant states and leads to chirality. Twisted sectors are localized at the fixed points of the orbifold action, where the local supersymmetry is reduced, leading naturally to chiral fermions.

Many of these models can be regarded as limits of compactifications on Calabi–Yau spaces in the limit in which they become locally flat and develop conical singularities (and similarly, their gauge bundles become locally flat and with curvature localized near the singular points). Indeed, flat directions involving moduli fields in the twisted sector often exist, which correspond to geometric blow-ups of the singular point that resolve the conical singularities to yield a smooth Calabi–Yau. The theories remain simple and solvable for any value of the untwisted moduli (namely moduli of the underlying toroidal compactification). This allows the discussion of their low-energy effective action including the explicit dependence on the untwisted moduli, while only partial results for the dependence on twisted moduli are known.

Other approaches, such as free fermion constructions or Gepner models, also provide exact descriptions of compactifications, although only at a point of the moduli space, deep inside the small volume regime.

Exact CFT constructions provide a small volume description of Calabi–Yau compactifications, at least for particular models. Moreover, their consistency conditions (modular invariance of the partition function) provide a stringy version of the large volume geometric condition implied by eqn [2]. The
world-volume gauge bundles, and type IIA compactifications with D6 branes wrapped on special Lagrangian 3-cycles (in general, models with D4 and D8 branes are not allowed since Calabi–Yau spaces do not have nontrivial 1- or 5-cycles on which to wrap the branes). This classification is a large volume realization of the general classification of supersymmetric configurations of D-branes into two classes, denoted A and B.

Intersecting Brane Worlds

Type IIA compactifications with A-branes correspond to compactifications of type IIA theory (or orientifolds thereof) with D6 branes wrapped on 3-cycles of the internal Calabi–Yau space. In these models, each stack of N D6 branes generically leads to a U(N) gauge factor. Chirality arises from open strings stretched between pairs of branes at the corresponding intersections. The chiral fermions from an open string stretched between branes a and b transform in the bifundamental representation $(\square_a, \overline{\square}_b)$ of the gauge factors $U(N_a) \times U(N_b)$ of the intersecting D6 brane stacks. In general, two 3-cycles in a 6D manifold intersect at points of the internal space. Hence, such fermions arise in several families, whose (net) number is given by the (net) number of intersections of the corresponding 3-cycles $\Pi_a$, $\Pi_b$, namely the topological invariant intersection number of their homology classes

$$I_{ab} = [\Pi_a] \cdot [\Pi_b] \qquad [8]$$

Simple modifications of the above rules arise in some sectors in the presence of orientifold planes (e.g., the reduction of the gauge symmetry from unitary to orthogonal or symplectic factors for branes on top of orientifold planes).

The RR tadpole cancellation conditions specify that the total homological charge carried by the D6 branes (and the orientifold 6-planes) cancel. They imply automatic cancellation of cubic nonabelian anomalies, and the cancellation of mixed U(1) anomalies by a Green–Schwarz mechanism mediated by 4D scalars from the RR closed-string sector.

Explicit models with SM spectrum have been constructed in orientifolds of toroidal compactifications in the nonsupersymmetric case, and in orbifolds thereof in supersymmetric cases. The generalization of the above construction beyond toroidal situations is, in principle, possible, but difficult, due to the mathematically challenging task of constructing special Lagrangian submanifolds for general Calabi–Yau manifolds.

Certain phenomenologically interesting quantities, such as gauge couplings and their threshold corrections, Yukawa couplings, and other diverse correlation functions have been computed in toroidal cases, where the corresponding correlators are computable exactly in $\alpha'$. Particularly interesting is the computation of Yukawa couplings, or, in general, of couplings involving only fields at intersections. These couplings arise from open-string world-sheet instantons, namely disks with boundaries on the D-branes corresponding to those intersections.

Type IIB Orientifolds

Type IIB compactifications with B-type branes contain several familiar classes of 4D models, for instance, compactifications of type I string theory on smooth Calabi–Yau spaces (whose description may be carried out using the effective supergravity action, in close analogy with the heterotic compactifications). Compactifications of type I string theory on orbifolds can be regarded as a particular realization of this, easily described using exact CFTs (although from the viewpoint of the general description as B-branes, the appearance of lower-dimensional branes requires their mathematical description to involve coherent sheaves). Since open strings at orbifolds do not have twisted boundary conditions, chirality arises from the orbifold projection of the toroidally compactified theory on the spectrum.

Another example within this kind is provided by the so-called magnetized D-brane models. These correspond to toroidal compactifications of type I theory, with D9 branes carrying constant magnetic backgrounds for the internal components of the world-volume gauge fields. In this kind of model, although the closed-string sector is highly supersymmetric, the open-string spectrum has reduced supersymmetry, or no supersymmetry (if the bundle stability condition is relaxed). Chirality arises from the nontrivial index of the Dirac operator for open strings ending on D-branes with different world-volume magnetic fields. Explicit models have mainly centered on nonsupersymmetric models from orientifolds of $T^6$, and on supersymmetric models from orientifolds of the $T^6/(\mathbb{Z}_2 \times \mathbb{Z}_2)$ orbifold. In both contexts, models with semirealistic spectra have been obtained: concretely nonsupersymmetric models with just the standard model spectrum, or supersymmetric models with the minimal supersymmetric standard model spectrum, plus nonchiral matter. Further, properties of the gauge coupling constants and the computation of the Yukawa couplings have been studied as functions of undetermined moduli.
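As a rough numerical illustration of two chirality counts quoted in this article, the sketch below evaluates eqn [5] for a Calabi–Yau space with given Hodge numbers and eqn [8] for a pair of D6 brane stacks. Two ingredients are assumptions that the text does not state: the Euler characteristic is taken as $\chi = 2(h^{1,1} - h^{2,1})$, the usual relation for Calabi–Yau threefolds, and the 3-cycles are taken to be factorizable on $T^6 = T^2 \times T^2 \times T^2$, with wrapping numbers $(n^i, m^i)$ on each two-torus, so that $I_{ab} = \prod_i (n_a^i m_b^i - m_a^i n_b^i)$. The function names and the sample wrapping numbers are hypothetical; the Hodge numbers (1, 101) are those of the quintic threefold.

```python
from math import prod


def net_families(h11, h21):
    """Net number of families, eqn [5]: |h11 - h21| = |chi| / 2."""
    chi = 2 * (h11 - h21)  # assumed Calabi-Yau threefold relation
    return abs(chi) // 2


def intersection_number(cycle_a, cycle_b):
    """Intersection number I_ab of eqn [8] for factorizable 3-cycles.

    Assumption: each cycle is a list of three (n, m) wrapping-number
    pairs, one pair per two-torus of T^2 x T^2 x T^2.
    """
    return prod(
        n_a * m_b - m_a * n_b
        for (n_a, m_a), (n_b, m_b) in zip(cycle_a, cycle_b)
    )


# Quintic threefold: h11 = 1, h21 = 101, chi = -200.
print(net_families(1, 101))  # 100

# Hypothetical wrapping numbers for two D6 brane stacks.
stack_a = [(1, 0), (1, 1), (1, -1)]
stack_b = [(0, 1), (1, 0), (1, 2)]
print(intersection_number(stack_a, stack_b))  # -3
```

The sign of $I_{ab}$ distinguishes the two chiralities, and the count is antisymmetric under exchange of the two stacks, so only its absolute value gives the number of replicated families in this toy setup.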
Finally, a second large class of models constructed using B-type branes is given by lower-dimensional D-branes, for example, D3 branes, located at singular points in the internal compactification space. Since the massless sector of open strings is determined only in terms of the local structure of the singularity, these models have been mostly studied in noncompact setups. Resulting spectra can be encoded in quiver diagrams, related to those in the mathematical literature on the McKay correspondence. Semirealistic three-family models have been constructed based on systems of D3 and D7 branes at the $\mathbb{C}^3/\mathbb{Z}_3$ orbifold singularity.

Type IIB orientifold compactifications are also intimately related to F-theory compactifications on Calabi–Yau 4-folds, which provide a nonperturbative completion for such models.

Mirror symmetry exchanges type IIB and IIA compactifications with B- and A-type branes. Hence, it provides a map between the above two kinds of compactifications. This shows that type IIB orientifold models lead to spectra with structure similar to that of intersecting brane worlds, and that they share many of their general properties.

As a particular example, toroidal models of intersecting D6 branes are mapped under mirror symmetry to models of magnetized D9 branes. This mirror map has been exploited to construct the same theories from both starting points and to recover certain quantities, such as the $\alpha'$-exact Yukawa couplings in the IIA picture from a purely classical (no $\alpha'$ corrections) computation in the mirror IIB model. This is a particular application of the general proposal of homological mirror symmetry in compactifications with branes.

Type II orientifold compactifications with D-branes have also been explored beyond the geometric regime, using exact CFTs to describe the (analog of the) internal space, and crosscap and boundary states to describe (the analogs of) orientifold planes and D-branes. Formal developments in the construction of the latter in Gepner models have been successfully applied to obtain large classes of semirealistic 4D string models in this setup.

As compared with heterotic compactifications, the setup of D-brane models leads to several generic features:

Since gauge sectors are localized on D-branes, and have a dilaton dependence different from gravitational interactions, the relation between the fundamental string scale and the 4D Planck scale and gauge coupling reads

$$M_P^2\, g_{YM}^2 = \frac{M_s^{11-p}\, V_T}{g_s} \qquad [9]$$

where $V_T$ is a measure of the volume in the directions transverse to the brane, and $g_s$ is the 10D string coupling. The above relation shows that it is possible to achieve large 4D Planck mass with a lower fundamental string scale by adjusting the transverse volume and the string coupling. This has been proposed by Antoniadis, Arkani-Hamed, Dimopoulos, and Dvali as an alternative to explain the Planck/weak hierarchy without supersymmetry.

The compactifications contain several U(1) gauge symmetries. For some of the corresponding gauge bosons, the 4D effective theory contains Stückelberg masses of order $M_s$, due to $B \wedge F$ couplings to fields in the RR sector. These couplings make the U(1) gauge bosons massive; hence, they are absent from the low-energy physics. Nevertheless, the U(1)’s remain as global symmetries exact in $\alpha'$ and to all orders in the perturbation theory in $g_s$. They are violated by D-brane instantons, which are nonperturbative in $g_s$. In many realistic models, the baryon number is one such global symmetry, and it prevents proton decay, even if the string scale is not large.

In general, each gauge factor in the standard model arises from a different brane stack, and their gauge couplings at the string scale are controlled by different moduli. This implies that, generically, it is not natural to have gauge coupling unification in D-brane models. Particular models may enjoy enhanced discrete global symmetries at special points in moduli space where unification is achieved, thus making unification appear more natural in such examples. Similar statements apply for constructions which realize complete or partial unification of gauge groups at large scales (like string models of grand unification or of Pati–Salam type).

As already mentioned, important quantities such as Yukawa couplings are, in principle, computable, although quantitative expressions have been derived only in a few examples, mostly in toroidal compactifications or quotients thereof. The results are moduli dependent, making it difficult to derive model-independent patterns.

M-Theory Phenomenology

Most of the phenomenological models from the M-theory have been constructed using the Hořava–Witten theory (compactification of M-theory on $S^1/\mathbb{Z}_2$) as starting point. This theory provides a description of the strong coupling regime of the $E_8 \times E_8$ heterotic theory, and many of its basic features are similar to those in the perturbative regime. In particular, the techniques used in model
building involve the construction of stable and holomorphic vector bundles and the computation of the relevant indexes to obtain the 4D gauge group and charged matter content. An important difference is that gauge interactions propagate only over the 10D boundaries of spacetime, while gravity propagates over the 11 dimensions. This makes the setup share some features of brane-world constructions, and, in particular, it allows one to lower the fundamental scale of the theory (the 11D Planck scale) to reconcile it with the traditional unification scale.

A different setup for M-theory phenomenology involves the compactification of the 11D theory on a 7-manifold of $G_2$ holonomy $X_7$, in order to lead to N = 1 supersymmetry in four dimensions. Although a fundamental formulation of the M-theory is lacking, duality arguments and indirect evidence can be used to show that nonabelian gauge symmetries of the A–D–E classical groups arise if $X_7$ contains 3-cycles of codimension-4 singularities, locally of the form $\mathbb{C}^2/\Gamma$, with $\Gamma$ an A–D–E Kleinian subgroup of SU(2). Similarly, it can be shown that chiral multiplets charged under these gauge symmetries arise if $X_7$ contains certain codimension-7 singularities. The local geometry of the latter has been explicitly described, and can be regarded as lying at the intersections of codimension-4 singularities.

The direct construction of such singular $G_2$ holonomy manifolds is very difficult, and there are no known topological conditions that guarantee existence of such a metric for a fixed topology. However, the existence of large classes of such models can be indirectly shown by using duality arguments. Namely, any type IIA model of intersecting D6 branes and O6 planes, preserving N = 1 supersymmetry, lifts to an M-theory compactification on a singular $G_2$ holonomy manifold. In fact, the local structure of the codimension-4 and -7 singularities agrees in particular cases with the local structure of D6 branes on 3-cycles and D6 brane intersections.

Further Topics

Some additional topics related to the phenomenology of the string theory, but not covered by the above model building description are discussed in the following.

Effective Actions

The construction of effective actions for such classes of models has been carried out in general in supersymmetric compactifications, using the parametrization of the general 4D N = 1 supergravity action in terms of the Kähler potential for the moduli and matter fields, the gauge kinetic functions, and the superpotential. The moduli action is quite universal, at least for geometric compactifications and for untwisted moduli in orbifold compactifications. For instance, the Kähler potential for the 4D dilaton multiplet S and the modulus T controlling the size of the internal manifold, in the large volume and weak coupling regime, reads

$$K = -\log(S + \bar{S}) - 3 \log(T + \bar{T}) \qquad [10]$$

The corresponding expression including matter fields is more model dependent, but known within each particular class.

Moduli Stabilization and Supersymmetry Breaking

Both issues are often related. Although moduli stabilization preserving supersymmetry is possible, it often occurs that the potential stabilizing moduli has its origin in mechanisms related to supersymmetry breaking.

The description of purely string theoretical mechanisms to break supersymmetry is difficult, and most approaches rely on field-theoretical mechanisms in the effective action. One of the better-studied mechanisms, mostly in the heterotic string setup (but also in type II compactifications), is gaugino condensation in a strongly coupled hidden sector, interacting with the standard model sector via gravitational (or perhaps additional gauge) interactions. Although explicit models with such hidden sectors and strong dynamics exist, they often result in runaway potentials for moduli. Racetrack scenarios where several condensates balance each other are possible but contrived.

A second mechanism to break supersymmetry, mostly explored in type IIB/F-theory compactifications, is the introduction of field-strength fluxes for p-form fields. Interestingly, such fluxes lead to nontrivial potentials depending on moduli, and generically breaking supersymmetry. The existence of several remnant flat directions in the leading $\alpha'$, $g_s$ approximation leaves unanswered the question of possible runaway moduli potentials in those directions. However, evidence for nonperturbative contributions stabilizing the remaining moduli at finite distance has been proposed. Preliminary results in the analysis of flux stabilized vacua have been obtained in simple examples of (still unrealistic) Calabi–Yau compactifications with a small number of moduli.

Most explored mechanisms propose supersymmetry breaking below the Kaluza–Klein compactification
scale, and, therefore, can be described in the 4D effective theory. They can be nicely parametrized in terms of vacuum expectation values for the dilaton and geometric moduli of the compactification. This description allows for a computation of the soft terms using the expansion of the N = 1 supergravity formulas in components. Concrete patterns, such as the universality of squark masses, or the complex phases of diverse soft terms, can be explored using this approach.

Alternative mechanisms of breaking supersymmetry at higher scales, such as the introduction of antibranes or nonsupersymmetric compactifications, lead to generic difficulties with stability.

Related to the question of supersymmetry breaking is the question of the cosmological constant. Unfortunately, there is no manifest mechanism in the string theory that explains the smallness of the observed value of this scale. Given that many aspects of both quantum gravity in the string theory and realistic model building (with proper supersymmetry breaking and moduli stabilization) are still in progress, an open-minded point of view on this problem and the proposed solutions is kept.

Cosmology

Although somewhat different from the traditional focus of string phenomenology, recent progress in observational cosmology has triggered much interest in string theory realizations of inflationary models (or alternatives such as pre-big bang scenarios).

Most inflationary models have centered on using moduli as the inflaton field, due to their flat potentials. A simple setup in type II compactifications, known as brane inflation models, uses the modulus controlling a brane position as the inflaton field, which has a flat enough potential with a moderate fine-tuning. Such setups may lead to interesting additional features, such as a moderate but potentially observable density of cosmic strings created in the reheating process.

On the other hand, many interesting questions in string cosmology await further understanding of time-dependent backgrounds in the string theory.

Retrospect

It is remarkable that the formal framework of the string theory admits tractable solutions with reasonable resemblance to the structure of the standard model. In particular, generic features such as nonabelian gauge symmetry and chirality, coupled to gravity, are generic in 4D compactifications. This is already a success. In addition, much progress has been made in the general description of the relevant mathematical tools, and physical mechanisms and ingredients involved in these vacua, as well as in the explicit construction of models with the standard model spectrum (or supersymmetric extensions of it). Yet, many questions remain open and much more work is needed in order to make contact with the physics observed in nature.

See also: Brane Worlds; Compactification of Superstring Theory; Cosmology: Mathematical Aspects; Superstring Theories.

Further Reading

Acharya B and Witten E (2001) Chiral fermions from manifolds of G(2) holonomy, hep-th/0109152.
Aldazabal G, Ibáñez LE, Quevedo F, and Uranga AM (2000) D-branes at singularities: a bottom up approach to the string embedding of the standard model. Journal of High Energy Physics 0008: 002.
Angelantonj C and Sagnotti A (2002) Open strings. Physics Reports 371: 1–150.
Angelantonj C and Sagnotti A (2003) Open strings – erratum. Physics Reports 376: 339–405.
Antoniadis I, Arkani-Hamed N, Dimopoulos S, and Dvali GR (1998) New dimensions at a millimeter to a Fermi and superstrings at a TeV. Physics Letters B 436: 257–263.
Bachas C (1995) A way to break supersymmetry, hep-th/9503030.
Blumenhagen R, Cvetič M, Langacker P, and Shiu G (2005) Toward realistic intersecting D-brane models, hep-th/0502005.
Candelas P, Horowitz GT, Strominger A, and Witten E (1985) Vacuum configurations for superstrings. Nuclear Physics B 258: 46–74.
Donagi R, He Y-H, Ovrut BA, and Reinbacher R (2004) The spectra of heterotic standard model vacua, hep-th/0411156.
Green MB, Schwarz JH, and Witten E (1987) Superstring Theory. Cambridge Monographs On Mathematical Physics, vols. 1 and 2. Cambridge: Cambridge University Press.
Ibáñez LE (1987) The search for a standard model SU(3) × SU(2) × U(1) superstring: an introduction to orbifold constructions. Seoul Sympos. 1986, 46.
Polchinski J (1998) String Theory. vols. 1 and 2. Cambridge: Cambridge University Press.
Uranga AM (2003) Chiral four-dimensional string compactifications with intersecting D-branes. Classical and Quantum Gravity 20: S373–S394.
Witten E (1996) Strong coupling expansion of Calabi–Yau compactification. Nuclear Physics B 471: 135–158.
String Topology: Homotopy and Geometric Perspectives 111
the homology classes are represented by submanifolds, $P^r$ and $Q^s$ with transverse intersection, then the image of the intersection pairing is represented by the geometric intersection, $P \cap Q$.

The remarkable result of Chas and Sullivan says that even without Poincaré duality, there is an intersection type product

$$\mu : H_p(LM) \otimes H_q(LM) \to H_{p+q-n}(LM)$$

$$e : LM^{-TM} \wedge LM^{-TM} \to \mathrm{Map}(8, M)^{ev^*(-TM)}$$

where $LM^{-TM}$ is the Thom spectrum of the pullback of the virtual bundle $ev^*(-TM)$. Now we can compose, to obtain a multiplication,

$$LM^{-TM} \wedge LM^{-TM} \xrightarrow{\;e\;} \mathrm{Map}(8, M)^{ev^*(-TM)} \longrightarrow LM^{-TM}$$

The following was proved by Cohen and Jones (2002).
Theorem 1 Let M be a closed manifold, then $LM^{-TM}$ is a ring spectrum. If M is orientable the ring structure on $LM^{-TM}$ induces the Chas–Sullivan loop product on $H_*(LM)$ by applying homology and the Thom isomorphism.

The ring structure on the spectrum $LM^{-TM}$ was also observed by Dwyer and Miller using different methods.

Cohen and Godin (2004) generalized the loop product in the following way. Observe that the figure 8 is homotopy equivalent to the pair of pants surface P, which we think of as a genus 0 cobordism between two circles and one circle.

Furthermore, Figure 1 is homotopic to the diagram of mapping spaces,

$$LM \xleftarrow{\;\rho_{out}\;} \mathrm{Map}(P, M) \xrightarrow{\;\rho_{in}\;} (LM)^2$$

where $\rho_{in}$ and $\rho_{out}$ are restriction maps to the ‘‘incoming’’ and ‘‘outgoing’’ boundary components of the surface P. So the loop product can be viewed as a composition,

$$\mu = \mu_P = (\rho_{out})_* \circ (\rho_{in})_! : (H_*(LM))^{\otimes 2} \to H_*(\mathrm{Map}(P, M)) \to H_*(LM)$$

where using the figure 8 to replace the surface P can be viewed as a technical device that allows one to define the umkehr map $(\rho_{in})_!$.

In general if one considers a surface of genus g, viewed as a cobordism from p incoming circles to q outgoing circles, $\Sigma_{g,p+q}$, one gets a similar diagram (Figure 2)

$$(LM)^q \xleftarrow{\;\rho_{out}\;} \mathrm{Map}(\Sigma_{g,p+q}, M) \xrightarrow{\;\rho_{in}\;} (LM)^p$$

Cohen and Godin (2004) used the theory of ‘‘fat’’ or ‘‘ribbon’’ graphs to represent surfaces as developed by Harer (1985), Penner (1987), and Strebel (1984), in order to define Pontrjagin–Thom maps,

$$\tau_{\Sigma_{g,p+q}} : (LM)^p \to \mathrm{Map}(\Sigma_{g,p+q}, M)^{\nu(\Sigma_{g,p+q})}$$

where $\nu(\Sigma_{g,p+q})$ is the appropriately defined normal bundle of $\rho_{in}$. By applying (perhaps generalized) homology and the Thom isomorphism, they defined the umkehr map,

$$(\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M))$$

where $\chi(\Sigma_{g,p+q}) = 2 - 2g - p - q$ is the Euler characteristic. Cohen and Godin then defined the string topology operation to be the composition,

$$\mu_{\Sigma_{g,p+q}} = (\rho_{out})_* \circ (\rho_{in})_! : H_*((LM)^p) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}(\mathrm{Map}(\Sigma_{g,p+q}, M)) \to H_{*+\chi(\Sigma_{g,p+q}) \cdot n}((LM)^q)$$

They proved that these operations respect gluing of surfaces,

$$\mu_{\Sigma_1 \# \Sigma_2} = \mu_{\Sigma_2} \circ \mu_{\Sigma_1}$$

where $\Sigma_1 \# \Sigma_2$ is the glued surface as shown in Figure 3.

The coherence of these operations is summarized in the following theorem.

Theorem 2 (Cohen and Godin 2004). Let $h_*$ be any multiplicative generalized homology theory that supports an orientation of M. Then the assignment

$$\Sigma_{g,p+q} \to \mu_{\Sigma_{g,p+q}} : h_*((LM)^p) \to h_*((LM)^q)$$

Figure 2 $\Sigma_{g,p+q}$ (p incoming circles, q outgoing circles).
Figure 3 $\Sigma_1 \# \Sigma_2$ (p, r, and q circles).
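The degree bookkeeping for these operations can be checked with a few lines of arithmetic. The sketch below, with hypothetical function names, evaluates the Euler characteristic $\chi = 2 - 2g - p - q$ of a genus g cobordism from p incoming to q outgoing circles and the resulting homological degree shift $\chi \cdot n$ for an n-dimensional manifold M. It then confirms that the pair of pants (g = 0, p = 2, q = 1) reproduces the shift $-n$ of the Chas–Sullivan loop product and that the Euler characteristic is additive under gluing. The genus rule used for gluing, $g = g_1 + g_2 + r - 1$ when two connected surfaces are glued along r circles, is a standard fact about surfaces that the text does not state explicitly.

```python
def euler_characteristic(g, p, q):
    """Euler characteristic of a genus-g surface with p + q boundary circles."""
    return 2 - 2 * g - p - q


def degree_shift(g, p, q, n):
    """Homological degree shift chi * n of the operation, with n = dim M."""
    return euler_characteristic(g, p, q) * n


def glue(first, second):
    """Glue the outgoing circles of `first` to the incoming circles of `second`.

    Each surface is a (genus, incoming, outgoing) triple. Assumes both
    surfaces are connected and that at least one circle is glued.
    """
    g1, p, r_out = first
    g2, r_in, q = second
    if r_out != r_in or r_out < 1:
        raise ValueError("outgoing and incoming circle counts must match")
    return (g1 + g2 + r_out - 1, p, q)


pants = (0, 2, 1)    # two incoming circles, one outgoing circle
copants = (0, 1, 2)  # one incoming circle, two outgoing circles
n = 3                # illustrative dimension of M

print(degree_shift(*pants, n))  # -3

glued = glue(pants, copants)
print(glued, euler_characteristic(*glued))  # (0, 2, 2) -2
```

Because circles have vanishing Euler characteristic, the shift of a glued operation is the sum of the shifts of its pieces, which is the degree-level shadow of the gluing property of the operations.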
setting one has a collection of submanifolds, $D_i \subset M$, referred to as ‘‘D-branes.’’ This theory studies intersections in the path spaces $P_M(D_i, D_j)$.

A theory with D-branes involves ‘‘open–closed cobordisms’’ which are cobordisms between compact one-dimensional manifolds whose boundary is partitioned into three parts:

1. Incoming circles and intervals.
2. Outgoing circles and intervals.
3. The rest is the ‘‘free boundary’’ which is itself a cobordism between the boundary of the incoming and boundary of the outgoing intervals. Each connected component of the ‘‘free boundary’’ is labeled by a D-brane (see Figure 4).

In a topological field theory with D-branes, one associates to each boundary circle a vector space $V_{S^1}$ (in our case $V_{S^1} = H_*(LM)$) and to an interval whose endpoints are labeled by $D_i$, $D_j$, one associates a vector space $V_{D_i, D_j}$ (in our case $V_{D_i, D_j} =$

Figure 4 Open–closed cobordism.

group, one has that the loop space of the classifying space satisfies

$$LBG \simeq \coprod_{[g]} BC_g$$

where [g] is the conjugacy class determined by $g \in G$, and $C_g < G$ is the centralizer of g. When BG is represented by a closed manifold, or more generally, when G is a Poincaré duality group, the Chas–Sullivan loop product then defines pairings among the homologies of the centralizer subgroups. Abbaspour et al. describe this loop product entirely in terms of group homology, thus giving structure to the homology of Poincaré-duality groups that previously had not been known.

Example 2 Applications to 3-manifolds. (Abbaspour 2005). Let $H_*(M) \to H_*(LM)$ be induced by inclusion of constant loops. This is a split injection of rings. Write $H_*(LM) = H_*(M)$

Lemma 3 Let $G \to E \to M$ be a fiberwise monoid over a closed manifold M. Then $E^{-TM}$ is a ring spectrum.
The following construction gives a large supply of examples of such fiberwise monoids over manifolds. Let $G \to P \to M$ be a principal G bundle over a closed manifold M. We can construct the corresponding adjoint bundle,

$$\mathrm{Ad}(P) = P \times_G G \to M$$

It is an easy observation that $G \to \mathrm{Ad}(P) \to M$ is a fiberwise monoid.

Theorem 4 $\mathrm{Ad}(P)^{-TM}$ is a ring spectrum. This ring structure is natural with respect to maps of principal G-bundles.

Let BG be the classifying space of a compact Lie group. It is possible to construct a filtration of BG,

$$M_1 \hookrightarrow M_2 \hookrightarrow \cdots \hookrightarrow M_i \hookrightarrow M_{i+1} \hookrightarrow \cdots \hookrightarrow BG$$

where the $M_i$’s are compact, closed manifolds. An example of this is filtering BU(n) by Grassmannians. Let $G \to P_i \to M_i$ be the restriction of $EG \to BG$. By the above theorem one obtains an inverse system of ring spectra

$$P_1^{-TM_1} \leftarrow P_2^{-TM_2} \leftarrow \cdots \leftarrow P_i^{-TM_i} \leftarrow P_{i+1}^{-TM_{i+1}} \leftarrow \cdots$$

K-theory K(LBG) maps to the equivariant K-theory, $K_G(G)$. Now in recent work of Freed (2003) twisted equivariant K-homology, $K_G(G)$ was shown to be isomorphic to the Verlinde algebra. This algebra is a space of representations of the loop group, LG. The multiplication in this algebra is the ‘‘fusion product,’’ coming from conformal field theory. One topic of current research is to understand the relationship between multiplicative structure coming from the string topology of BG, and this fusion product in the Verlinde algebra. More generally, the goal is to bring to bear the considerable calculational techniques of algebraic topology that are available in string topology, to understand the recently uncovered field theoretic structure of twisted K-theory (Freed 2003), and its applications to string theory.

Acknowledgment

The author was partially supported by a grant from the NSF.

See also: Mathematical Knot Theory; Topological Defects and Their Homotopy Classification.
Superfluids
D Einzel, Bayerische Akademie der Wissenschaften, Garching, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Superfluidity has been known to exist since the 1930s. This widespread phenomenon occurs in many-particle Bose and Fermi systems as different as liquid $^4$He, liquid $^3$He, atomic gases like Rb and Li, atomic nuclei, pulsars and last, but not least, in metals, where the itinerant electrons may become superfluid. This article is devoted to a unifying theoretical description of Bose and Fermi superfluidity. The mechanisms leading to superfluidity include Bose–Einstein condensation (BEC) and Bardeen, Cooper, and Schrieffer (BCS)–Leggett pairing correlations. We hope to be able to demonstrate why this fascinating phenomenon is, even roughly 80 years after its experimental discovery and its first theoretical explanation, still a subject of intensive research.

The phenomenon of superfluidity is closely connected with the apparent lack of any measurable flow resistance, which scales with the shear viscosity of the fluid. Its complete absence implies that the system moves without friction, that is, with zero viscosity. The observation of superfluidity is usually precluded by the solidification of most liquids as the temperature is lowered. Only systems with particularly light atoms (like the helium isotopes $^4$He and $^3$He) stay liquid down to the lowest temperatures. These systems are referred to as ‘‘quantum liquids,’’ since their liquid state is caused by the quantum-mechanical zero-point motion of the atoms. It should be noted that the helium isotopes belong to two different kinds of elementary particles which can be distinguished by their statistics: $^4$He is a spin-0 boson and $^3$He a spin-1/2 fermion.

In 1924, Satyendra Nath Bose and Albert Einstein proposed that below a characteristic degeneracy temperature $T_B$, a macroscopic number of bosons can condense into the state of lowest energy k = 0. In the 1930s, Fritz London and Heinz London showed that this so-called Bose–Einstein condensate can be described by a macroscopic quantum-mechanical wave function like the one for a single elementary particle, but with the probability density replaced by the density of the condensed particles. By the end of the 1930s, the experimental results of Allen, Kamerlingh Onnes, Keesom, Kapitza, Misener, Wolfke, and others accumulated the evidence that liquid $^4$He undergoes a second-order phase transition at T = 2.17 K to a state referred to as a superfluid, since the liquid could flow without any sign of a flow resistance. This superfluid state was interpreted in terms of Bose condensation of the $^4$He atoms in the liquid (London 1938).

In Figure 1 the P–T phase diagram of liquid $^4$He is shown with a normal liquid phase, a solid phase and the superfluid phase below the $\lambda$-line at about 2 K.

Figure 1 The phase diagram of liquid $^4$He (pressure in MPa versus temperature in K, with solid, normal liquid, superfluid, and gas regions and the $\lambda$-line). Courtesy of Erkki Thuneberg.

Fermions cannot condense in a way similar to the BEC, due to the Pauli exclusion principle. In 1957 Bardeen, Cooper, and Schrieffer came up with their ingenious proposal that the superfluidity of the electron system (usually referred to as superconductivity) comes about through the formation of fermion pairs (quasibosons) in k-space in a spin-singlet state. In 1971, several superfluid phases of liquid $^3$He at a few mK were discovered by Lee, Osheroff, and Richardson at Cornell University. Experimental aspects connected with the spin degrees of freedom of the quantum liquid gave strong evidence for Cooper pairing of the $^3$He atoms in a spin-triplet state. In Figure 2 the zero-field P–T phase diagram of liquid $^3$He is shown with a normal (Fermi) liquid phase, a solid phase and the superfluid A and B phases.

Immediately after this discovery, Anthony J Leggett applied the BCS ideas to liquid $^3$He and introduced a generalized scheme that allowed for triplet-pairing correlations. His theory turned out to describe a large variety of experimental results accurately. A new and exciting development set in when Bose–Einstein condensates were discovered for the first time in dilute gases of alkali atoms in 1995 by Cornell and Wieman et al. (Rb), Ketterle et al. (Na), and Hulet et al. (Li).
$$n = \frac{1}{V}\sum_{\mathbf{k}}{}^{\prime}\, n_{\mathbf{k}} = \frac{1}{\lambda_T^{3}}\,B_{3/2}(\alpha) \qquad [2]$$
where the prime indicates the summation over excited states $|\mathbf{k}| > 0$. In [2], $\lambda_T = h/\sqrt{2\pi m k_B T}$ denotes the thermal de Broglie wavelength, which provides a criterion for the importance of quantum effects or degeneracy through $n\lambda_T^3 \approx O(1)$. The Bose integrals $B_\nu(\alpha)$ originate from the conversion of the momentum sum into an energy integral and read for parabolic dispersion:
$$B_\nu(\alpha) = \frac{1}{\Gamma(\nu)}\int_0^\infty dy\,\frac{y^{\nu-1}}{e^{y+\alpha}-1} = \sum_{j=1}^{\infty}\frac{e^{-j\alpha}}{j^{\nu}} \qquad [3]$$

$$\mu \;\xrightarrow{\;T\to 0\;}\; \frac{\hbar^2}{2m}\,(3\pi^2 n)^{2/3} = E_F \qquad [8]$$
To summarize, quantum behavior in Bose and Fermi systems sets in below the degeneracy temperature $T^*$, defined through $n\lambda_{T^*}^3 = O(1)$. For bosons, $T^* = T_B$ is the temperature at which the chemical potential vanishes, whereas for fermions $T^* = T_F$ is the Fermi temperature.
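The degeneracy criterion can be made concrete with a short numerical sketch. The snippet below evaluates the series form of the Bose integral in [3] at vanishing argument and solves $n\lambda_T^3 = B_{3/2}(0)$ for the temperature of an ideal Bose gas. The function names, the tail correction used to speed up the slowly converging sum, and the liquid $^4$He mass density of 145 kg/m$^3$ are illustrative assumptions and are not taken from the text.

```python
import math

H = 6.62607015e-34        # Planck constant (J s)
KB = 1.380649e-23         # Boltzmann constant (J/K)
AMU = 1.66053906660e-27   # atomic mass unit (kg)


def thermal_wavelength(mass, temperature):
    """Thermal de Broglie wavelength lambda_T = h / sqrt(2 pi m k_B T)."""
    return H / math.sqrt(2.0 * math.pi * mass * KB * temperature)


def bose_integral(nu, alpha=0.0, terms=10000):
    """Series form of the Bose integral: sum_{j>=1} exp(-j*alpha) / j**nu.

    For alpha == 0 (and nu > 1) the sum converges slowly, so the neglected
    tail is approximated by an Euler-Maclaurin correction. This correction
    is an assumption of the sketch, not part of the source formula.
    """
    total = sum(math.exp(-j * alpha) / j**nu for j in range(1, terms + 1))
    if alpha == 0.0:
        total += terms ** (1.0 - nu) / (nu - 1.0) - 0.5 * terms ** (-nu)
    return total


def degeneracy_temperature(mass, density):
    """Solve n * lambda_T**3 = B_{3/2}(0) for T (ideal Bose gas)."""
    zeta_32 = bose_integral(1.5)
    return H**2 / (2.0 * math.pi * mass * KB) * (density / zeta_32) ** (2.0 / 3.0)


# Illustrative input: liquid helium-4 at roughly 145 kg/m^3 (assumed value).
m_he4 = 4.0026 * AMU
n_he4 = 145.0 / m_he4
t_b = degeneracy_temperature(m_he4, n_he4)
print(f"B_3/2(0) = {bose_integral(1.5):.4f}")
print(f"T_B (ideal gas estimate) = {t_b:.2f} K")
```

Under these assumptions the ideal-gas estimate comes out at about 3 K, the same order as the $\lambda$-transition near 2 K; the difference reflects the strong interactions in the real liquid.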
in which $\mu$ represents the condensate's chemical potential. After performing a Madelung transformation (Madelung 1926):
$$\psi = a\,e^{i\varphi}; \qquad a^2 = \frac{\rho^s}{m}$$
one arrives at two coupled hydrodynamic equations, the first of which reads
$$\frac{\partial\rho^s}{\partial t} + \nabla\cdot\mathbf{j}^s_m = 0; \qquad \mathbf{j}^s_m = \rho^s\mathbf{v}^s; \qquad \mathbf{v}^s = \frac{\hbar}{m}\nabla\varphi \qquad [10]$$
Equation [10] can be interpreted as a continuity equation, which represents the conservation law for the condensate mass density $\rho^s$. The second equation
$$\hbar\frac{\partial\varphi}{\partial t} = -\left(\frac{1}{2}m\mathbf{v}^{s2} + \mu + O(\hbar^2\nabla^2)\right) \qquad [11]$$
assumes the form of the Hamilton–Jacobi equation for the action field $\hbar\varphi$ of classical mechanics, if the quasiclassical limit (terms $\propto O(\hbar^2\nabla^2) \to 0$) is taken. From [10] and [11] a condensate acceleration equation can be derived, which resembles the Euler equation of classical hydrodynamics ($\mu = \mu_0 + \delta\mu$):
$$\frac{\partial\mathbf{v}^s}{\partial t} + (\mathbf{v}^s\cdot\nabla)\mathbf{v}^s = -\frac{1}{m}\nabla\delta\mu \qquad [12]$$
The physical nature of the driving force becomes evident after applying the Gibbs–Duhem relation

and those with $\mathbf{k} > 0$ (excited states) and average occupation number
$$n_{ex} = \frac{N_{ex}}{V} = \frac{1}{V}\sum_{\mathbf{k}}{}^{\prime}\, n_{\mathbf{k}} = n\,\frac{B_{3/2}(\alpha)}{B_{3/2}(0)}\left(\frac{T}{T_B}\right)^{3/2} \qquad [15]$$
with the total density $n = n_{ex} + n_0$. The consequence of the chemical potential vanishing at $T_B$ clearly is a macroscopic occupation of the ground state of the Bose gas:
$$N_0 = \frac{1}{e^{\alpha} - 1} \;\overset{\alpha\to 0}{=}\; \frac{1}{1 + \alpha + \cdots - 1} \to \infty \qquad [16]$$
This phenomenon is referred to as BEC. Below $T_B$, $\alpha = 0$ and from [15] we see that
$$n_{ex}^{\text{free bosons}} = n\left(\frac{T}{T_B}\right)^{3/2}; \qquad T < T_B \qquad [17]$$
The average occupation of the ground state is given by
$$n_0(T) = n - n_{ex}(T); \qquad T < T_B \qquad [18]$$
It is important to understand that the number density of condensed particles $n_0$ has nothing to do with the current response function $\rho^s$ (eqn [10]). A derivation of $\rho^s$ will be given in the section “Local response of condensates and excitation gases.”

Let us now discuss the structure of the excitation spectrum, which will turn out to be crucial for the observability of superfluidity, in some more detail. Suppose that a macroscopic object of mass $M$ moves through the superfluid. Then one may ask at what velocity this motion causes the creation of an excitation of energy $E_p$ and momentum $\mathbf{p}$. The condition can be formulated in terms of the velocity
difference $\mathbf{v}_i - \mathbf{v}_f$ as $E_p = M(\mathbf{v}_i^2 - \mathbf{v}_f^2)/2$ and $\mathbf{p} = M(\mathbf{v}_i - \mathbf{v}_f)$. Eliminating $\mathbf{v}_f$ yields $E_p = \mathbf{p}\cdot\mathbf{v}_i + O(M^{-1})$, so that the condition for the creation of an excitation leads to the so-called Landau critical velocity
$$v_L = \min\left(\frac{E_p}{|\mathbf{p}|}\right) > 0 \qquad [19]$$
It is immediately clear that for free bosons $v_L = 0$. This means that a free Bose gas can never be a superfluid, since drag forces on moving objects will start to act even at the smallest velocities.

It turns out that interaction effects can drastically modify the nature of the elementary excitations. In 1947, Nikolai Bogoliubov showed (for the first time using the method of second quantization) that even in the limit of weak repulsive interactions the excitation spectrum is phonon-like, $E_p = c|\mathbf{p}|$, with $c$ the sound velocity. Lev Landau and Richard Feynman investigated the situation for superfluid $^4$He, where the interactions between the atoms are far from weak. Landau (1947) postulated the following form for the excitation spectrum, for which Feynman (1953) gave the microscopic justification. At low momenta, the spectrum is phonon-like and linear in $p$:
$$\lim_{p\to 0} E_p = E_p^{\text{phon}} = c|\mathbf{p}| \qquad [20]$$
At higher momenta, the spectrum is reminiscent of that of crystal phonons in that $E_p$ passes through a maximum and then, at a characteristic momentum $p_0$, approaches the next minimum, which, however, is located at a finite energy $\Delta$. Feynman called this part of the spectrum the “roton” (mass $m_r$) in an analogy with a “smoke ring,” since it is connected with the forward motion of a particle accompanied by a ring of back-flowing other particles:
$$\lim_{|\mathbf{p}|\to p_0} E_p = E_p^{\text{rot}} = \Delta + \frac{(|\mathbf{p}| - p_0)^2}{2m_r} \qquad [21]$$
Figure 4 shows a sketch of the phonon–roton spectrum of superfluid $^4$He. Clearly, the Landau critical velocity for the phonon–roton spectrum is characterized by the roton minimum and is given by $v_L \approx \Delta/p_0$.

BCS–Leggett Pair Condensation

The key assumptions of the weak-coupling mean-field BCS–Leggett pairing model can be summarized as follows: one first assumes that at sufficiently low temperatures it is energetically favorable that a temperature-dependent part of the fermions forms so-called Cooper pairs. This pair formation is caused by an attractive interaction in k-space near the Fermi surface:
$$\Gamma^{(s)}_{\mathbf{kp}} < 0; \qquad |\xi_{\mathbf{k}}|, |\xi_{\mathbf{p}}| < \epsilon_c$$
Here $\xi_{\mathbf{k}} = \epsilon_{\mathbf{k}} - \mu$ measures the energy from the chemical potential. The index $s$ denotes the total spin of the pair. Classical superconductors have pairs in a relative singlet state $s = 0$, $m_s = 0$, whereas the superfluid phases of liquid $^3$He have pairs in a relative spin-triplet state $s = 1$, $m_s = 0, \pm 1$, with $m_s$ the magnetic quantum number. The amplitude of spontaneous pair formation is
$$g_{\mathbf{k}\sigma_1\sigma_2} \equiv \langle \hat c_{-\mathbf{k}\sigma_1}\hat c_{\mathbf{k}\sigma_2}\rangle \neq 0; \qquad T \le T_c \qquad [22]$$
with $\mathbf{k} = \mathbf{k}_1 - \mathbf{k}_2$ the relative momentum of the pair. The attractive interaction that drives the Cooper-pair formation connects the pairing amplitude $g_{\mathbf{k}\sigma_1\sigma_2}$ with a new energy scale, the so-called pair potential
$$\Delta_{\mathbf{k}\sigma_1\sigma_2} = \sum_{\mathbf{p}} \Gamma^{(s)}_{\mathbf{kp}}\, g_{\mathbf{p}\sigma_1\sigma_2} \qquad [23]$$
As a consequence of triplet pairing the spin part of the pair potential is “even” upon interchange of $\sigma_1$ and $\sigma_2$: $\Delta_{\mathbf{k}\sigma_2\sigma_1} = \Delta_{\mathbf{k}\sigma_1\sigma_2}$. Then the Pauli principle requires that $\Delta_{\mathbf{k}\sigma_1\sigma_2}$ must be “odd” with respect to the interchange of $\mathbf{k}_1$ and $\mathbf{k}_2$ or, equivalently, $\mathbf{k} \to -\mathbf{k}$. The k-dependence can now be classified by an orbital quantum number $\ell$ with the special cases of $\ell = 1$ (p-wave) pairing, $\ell = 3$ (f-wave) pairing, etc. All superfluid phases of $^3$He are characterized by p-wave orbital symmetry.

The transition temperature $T_c$ from [23] reads
$$k_B T_c = \frac{2e^{\gamma}}{\pi}\,\epsilon_c\, e^{-1/(N_F|\Gamma^{(s)}|)}$$
with $N_F = 3n/2E_F$ the density of states at the Fermi level and $\gamma = 0.577\ldots$ the Euler constant. The energies $\xi_{\mathbf{k}}$ can trivially be divided into particle-like ($\xi_{\mathbf{k}} > 0$) and hole-like ($\xi_{\mathbf{k}} < 0$) terms. The presence of the pair potential $\Delta_{\mathbf{k}}$ leads to a mixing of particle- and hole-like contributions to the energy, which

Figure 4 The phonon–roton spectrum (sketch of $E_p$ versus $p$, with the phonon branch at small $p$ and the roton minimum of energy $\Delta$ at $p_0$).
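The Landau criterion [19] applied to the roton branch [21] can be checked numerically. The sketch below minimizes $E_p/|\mathbf{p}|$ both on a momentum grid and in closed form and compares the result with the estimate $\Delta/p_0$. The roton parameters (gap of 8.65 K, $p_0/\hbar = 1.92\ \text{\AA}^{-1}$, $m_r = 0.16$ helium masses) are assumed typical values for superfluid $^4$He and do not come from the text; the function names are likewise hypothetical.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
KB = 1.380649e-23         # Boltzmann constant (J/K)
AMU = 1.66053906660e-27   # atomic mass unit (kg)

# Assumed, typical roton parameters for superfluid helium-4 (illustrative only).
DELTA = 8.65 * KB               # roton gap (J)
P0 = HBAR * 1.92e10             # roton momentum (kg m/s)
M_ROTON = 0.16 * 4.0026 * AMU   # roton effective mass (kg)


def roton_energy(p):
    """Roton branch: E_p = Delta + (|p| - p0)^2 / (2 m_r)."""
    return DELTA + (p - P0) ** 2 / (2.0 * M_ROTON)


def landau_velocity_grid(points=20001):
    """Minimise E_p / |p| on a momentum grid from 0.5*p0 to 1.5*p0."""
    momenta = (P0 * (0.5 + i / (points - 1)) for i in range(points))
    return min(roton_energy(p) / p for p in momenta)


def landau_velocity_exact():
    """Closed form: d(E_p/p)/dp = 0 gives p*^2 = p0^2 + 2 m_r Delta."""
    p_star = math.sqrt(P0**2 + 2.0 * M_ROTON * DELTA)
    return (p_star - P0) / M_ROTON


v_grid = landau_velocity_grid()
v_exact = landau_velocity_exact()
v_estimate = DELTA / P0
print(f"grid minimum : {v_grid:.2f} m/s")
print(f"closed form  : {v_exact:.2f} m/s")
print(f"Delta / p0   : {v_estimate:.2f} m/s")
```

With these inputs the exact minimum lies slightly below $\Delta/p_0$, because the tangent from the origin touches the roton parabola just to the right of $p_0$. For a free-particle spectrum $E_p = p^2/2m$ the same ratio tends to zero as $p \to 0$, which is the statement $v_L = 0$ for free bosons.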
The short-range Fermi liquid interaction leads to a After a Taylor expansion of n with respect to the small
quasiparticle mass enhancement m =m = 1 þ F1s =3 local temperature change T, the result for cV (T) reads
characterized by the pressure-dependent dimensionless 2
Landau parameter F1s . In Figure 7, the normal fluid 2s þ 1 X Ek @Ek
cV ¼ ’k; Ek ½36
density (
nk, ? for 3 He-A,
n for 3 He-B) is shown as a V T @T
k
function of reduced temperature at a pressure of 27
bar, where F1s = 12.53. The entropy density of an In Figures 8 and 9 we show the cusp-like specific heat
excitation system of arbitrary statistics below the of a Bose gas as compared with the specific heat of
3
transition can be written as He-A, B, which display discontinuities at Tc .
Finally, the superfluid phases of 3 He are char-
ð2s þ 1Þ X acterized in addition by the spin degrees of freedom,
0 ¼ k B Pk
V k ½34 reflected by the bogolon spin magnetization
Pk ¼ ð1 þ n Þ lnð1 þ n Þ n ln n response to an external magnetic field B:
Figure 6 The normal and superfluid density for He-II ($\rho^s(T)/\rho$ and $\rho^n(T)/\rho$ versus $T/T_\lambda$).
Figure 8 The specific heat capacity of a Bose gas ($C(T)/Nk_B$ versus $T/T_B$).
Figure 9 The specific heat of $^3$He-A, B ($C(T)/C_N$ versus $T/T_c$).
Figure 10 The spin susceptibility of $^3$He-A, B ($\chi(T)/\chi_N$ versus $T/T_c$, with curves for A ($m_s = \pm 1$), B ($m_s = \pm 1, 0$), and the B-phase $m_s = 0$ contribution).
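The free-boson results [17] and [18], together with the cusp of Figure 8, can be tabulated with a few lines of code. The sketch below uses the reduced temperature $t = T/T_B$. The specific-heat expression $C/(Nk_B) = \tfrac{15}{4}\,[\zeta(5/2)/\zeta(3/2)]\,t^{3/2}$ for $T \le T_B$ is the standard ideal-gas result and is an assumption here, since it is not written out in the text; the function names are hypothetical.

```python
ZETA_3_2 = 2.6123753487   # Riemann zeta(3/2) = B_{3/2}(0)
ZETA_5_2 = 1.3414872573   # Riemann zeta(5/2) = B_{5/2}(0)


def excited_fraction(t):
    """n_ex / n = (T/T_B)^(3/2) below T_B and 1 above it (eqn [17])."""
    return min(t, 1.0) ** 1.5


def condensate_fraction(t):
    """n_0 / n = 1 - n_ex / n (eqn [18]); t is the reduced temperature T/T_B."""
    return 1.0 - excited_fraction(t)


def specific_heat_below_tb(t):
    """C / (N k_B) of the ideal Bose gas for T <= T_B (standard result)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("formula only holds for 0 <= T/T_B <= 1")
    return 3.75 * ZETA_5_2 / ZETA_3_2 * t ** 1.5


for t in (0.25, 0.5, 0.75, 1.0):
    print(f"T/T_B = {t:4.2f}  n0/n = {condensate_fraction(t):.3f}  "
          f"C/(N k_B) = {specific_heat_below_tb(t):.3f}")
```

At $T = T_B$ the specific heat per particle reaches roughly $1.93\,k_B$, above the classical value $3k_B/2$ that is approached at high temperature, which is why the curve shows a cusp rather than a jump.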
Note that eqn [38] accounts only for the $m_s = 0$ (bogolon) contribution to the spin-triplet susceptibility, the temperature dependence of which is given by the so-called Yosida function $Y(T) = N_F^{-1}\sum_{\mathbf{k}}\varphi_{\mathbf{k}}$. The total susceptibility reads
$$\chi_{tot} = \underbrace{\chi_0}_{\text{bogolons}} + \underbrace{\chi_{+1} + \chi_{-1}}_{\text{condensate}} \qquad [39]$$
with the condensate contributing through $m_s = \pm 1$ a fraction of 2/3 of the normal state Pauli susceptibility. In Figure 10, the reduced spin susceptibility $\chi/\chi_N$ of $^3$He-A, B is plotted vs. reduced temperature. While the constant susceptibility is characteristic of the ESP pairing state, the reduction of the B-phase susceptibility is due to the lack of the nonmagnetic $m_s = 0$ contribution to the spin triplet in the low-temperature limit. Exchange interaction effects, characterized by the dimensionless Landau parameter $F_0^a$, lead to a further reduction of the Balian–Werthamer (BW)-state susceptibility, which is shown for 27 bar, where $F_0^a = -0.755$. Note that the theoretical picture reflected in Figure 10, and also in Figures 6, 7, and 9, is in quantitative agreement with experimental observations.

In summary, superfluidity is a quantum-mechanical phenomenon seen on a macroscopic scale. It occurs below the degeneracy temperature $T^* \propto n^{2/3}/m$ of both Bose and Fermi many-particle systems (like liquid $^4$He and $^3$He) and is a property of a macroscopic number of particles, the condensate. The role of (weak or strong) interactions is manifested in the structure of the relevant elementary excitations, which always exist in addition to the condensate at finite temperatures and above certain critical velocities. These excitations form a gas, referred to as the normal fluid, since it gives rise to temperature-dependent thermodynamic and response functions and contributes to the entropy and the flow dissipation. Superfluidity is now well understood using various aspects of the concept of the macroscopic wave function. On the microscopic level, the mechanisms of BEC and BCS–Leggett pair formation have been successfully invoked to understand the fascinating properties of Bose and Fermi superfluids.

See also: Bose–Einstein Condensates; Bosons and Fermions in External Fields; High Tc Superconductor Theory; Topological Knot Theory and Macroscopic Physics; Variational Techniques for Ginzburg–Landau Energies; Vortex Dynamics.

Further Reading

Anderson PW and Brinkman WF (1973) Physical Review Letters 30: 1911.
Anderson PW and Morel P (1960) Physical Review Letters 5: 136.
Annett J (2004) Superconductivity, Superfluids and Condensates. Oxford: Oxford University Press.
Balian R and Werthamer NR (1963) Physical Review 131: 1553.
Bogoliubov NN (1947) Journal of Physics USSR 11: 23.
Bogoliubov NN (1958) Soviet Physics JETP 7: 794.
Dobbs ER (2000) Helium Three. Oxford: Oxford University Press.
Feynman RP (1953) Physical Review 91: 1301.
Keller WE (1969) Helium-3 and Helium-4. New York: Plenum.
Landau LD (1947) Journal of Physics USSR 11: 91.
London F (1938) Physical Review 54: 947.
Madelung E (1926) Z. Physik 40: 322.
Nambu Y (1960) Physical Review 117: 648.
Tilley DR and Tilley J (1990) Superfluidity and Superconductivity. Adam Hilger.
Tisza L (1938) Journal de Physique et Radium 1: 164.
Tsuneto T (1998) Superconductivity and Superfluidity. Cambridge: Cambridge University Press.
Vollhardt D and Wölfle P (1990) The Superfluid Phases of Helium 3. Taylor and Francis.
Supergravity
K S Stelle, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction: Minimal D = 4 Supergravity

The essential idea of supersymmetry is an extension of the relativistic structure group of spacetime, which in ordinary four-dimensional physics in the absence of gravity is the Poincaré group ISO(3, 1). In a minimal supersymmetric theory in flat D = 4 spacetime, the minimal supersymmetry algebra (the “graded Poincaré algebra”) adds spinorial generators $Q_\alpha$ to the Lorentz generators $M_{mn}$ and the translational generators (momenta) $P_m$, where $m = 0, 1, 2, 3$. The core relation is the “anticommutator” of two $Q_\alpha$:
$$\{Q_\alpha, \bar Q_\beta\} = 2\gamma^m_{\alpha\beta}P_m \qquad [1]$$
where $\bar Q = Q^\dagger\gamma^0$ and the $\gamma^m$ are the Dirac gamma matrices. In the minimal D = 4 supersymmetry algebra, the spinor generator $Q_\alpha$ is taken to be Majorana: $Q = C(\bar Q)^T$, where $C$ is the charge-conjugation matrix and $A^T$ denotes the transpose of the matrix $A$. The full supersymmetry algebra adjoins to the anticommutation relation [1] the usual commutation relations among the Lorentz generators and the commutators of the Lorentz generators with the momenta and the spinors $Q_\alpha$; the latter express respectively the vectorial and spinorial characters of $P_m$ and $Q_\alpha$:
$$i[M_{mn}, M_{pq}] = \eta_{np}M_{mq} - \eta_{mp}M_{nq} - \eta_{nq}M_{mp} + \eta_{mq}M_{np} \qquad [2]$$
$$i[M_{mn}, P_q] = \eta_{nq}P_m - \eta_{mq}P_n \qquad [3]$$
$$i[M_{mn}, Q_\alpha] = \tfrac{1}{2}(\gamma_{mn}Q)_\alpha \qquad [4]$$
where $\gamma_{mn} = (1/2)(\gamma_m\gamma_n - \gamma_n\gamma_m)$ and $\eta_{mn} = \mathrm{diag}(-1, 1, 1, 1)$ is the Minkowski metric. The final relation in the supersymmetry algebra expresses the flatness of Minkowski space:
$$[P_m, P_n] = 0 \qquad [5]$$
This algebra has been considered as an extension of the symmetry algebra of particle physics since the work of Gol'fand and Likhtman in 1971, and especially since the linearly realized supersymmetric model of Wess and Zumino in 1974. That model contains a pair of D = 4 scalar fields and a D = 4 Majorana spinor, so the numbers of bosonic and fermionic degrees of freedom are equal; this is a fundamental characteristic of supersymmetric theories.

The work of Wess and Zumino led to an explosion of interest in supersymmetry, especially once it was realized that renormalizable supersymmetric models display a cancellation of some of the divergences that have plagued relativistic quantum field theory since its inception in the 1930s. In particular, in renormalizable flat-space field theory models, divergences quadratic in a high-momentum cutoff vanish as a result of cancellations between virtual bosonic and fermionic particles. This is a very attractive feature for control of the “hierarchy problem” in particle physics, especially for the instability inherent in having vastly different scales within the same theory, for example, the TeV scale of ordinary electroweak physics and the $10^{16}$ GeV scale where unification with the strong interactions might come in.

When one includes gravity, the stability problems of particle physics become much more severe. Einstein's theory of general relativity is itself nonrenormalizable, that is, its ultraviolet divergences are of different forms from the terms present in the original “classical” action and there is no acceptable finite set of correction terms that can be added to it to remove this defect. Moreover, when otherwise tolerably behaved matter field theories that are renormalizable in a flat-spacetime context are coupled to general relativity, the gravitational couplings pollute the matter theories with nonrenormalizable divergences. This is a key aspect of the great difficulty that has been encountered in interpreting gravity as a quantum theory.

Supersymmetry, with its divergence-canceling powers, was thus a very attractive option in the struggle to formulate a quantum theory of gravity, and the creation of a supergravity theory was thus a very high priority task. This was achieved in 1976 by Freedman, Ferrara, and Van Nieuwenhuizen using the technique of iterative Noether coupling to build up this nonlinear theory order-by-order in powers of the fermionic fields. The fermionic partner of the massless spin-2 “graviton” field is a massless fermionic spin-3/2 field that has come to be called the “gravitino.”

A second 1976 paper by Deser and Zumino soon followed, emphasizing how supergravity manages to circumvent the well-known problems of coupling spins higher than 1 to gravity. A key point in achieving this result is the role played by the local version of the supersymmetry algebra [1]–[5]. As one can see from the translations occurring on the right-hand side of [1], when one replaces translation symmetry by local general coordinate invariance in a gravitational context, the supersymmetry transformations must themselves become local as well. Local symmetries allow for transformation parameters
that are local in the spacetime coordinates $x^m$, and in interacting theories they require coupling of the corresponding “gauge field” to a conserved current. In the case of supergravity, the gravitino field $\psi_{m\alpha}$ plays this gauge-field role, and its coupling to the conserved current of supersymmetry is the key to allowing a consistent coupling between the spin-2 graviton and the spin-3/2 gravitino.

The Minimal Supergravity Action

The action for minimal supergravity in D = 4 dimensions can be written, using the vierbein formalism where the metric is expressed as a quadratic expression in a nonsymmetric $4 \times 4$ vierbein matrix $e^a_m$, $g_{mn} = e^a_m e^b_n \eta_{ab}$, as
$$I = \frac{1}{2\kappa^2}\int d^4x\,\det(e)\,R(e, \omega(e) + K(\psi)) - \frac{i}{2}\int d^4x\,\epsilon^{mnpq}\,\bar\psi_m\gamma_5\gamma_n D_p(e, \omega(e) + K(\psi))\,\psi_q \qquad [6]$$
where $\kappa = \sqrt{8\pi G}$ is the gravitational coupling constant,
$$\omega_m^{ab}(e) = \tfrac{1}{2}e^{na}\left(e^b_{n,m} - e^b_{m,n}\right) - \tfrac{1}{2}e^{nb}\left(e^a_{n,m} - e^a_{m,n}\right) + \tfrac{1}{2}e^{na}e^{rb}\left(e_{nc,r} - e_{rc,n}\right)e^c_m \qquad [7]$$
is the usual vierbein formalism spin connection (in which $e^b_{n,m} = \partial_m e^b_n$ and $e^m_a$ is the matrix inverse of $e^a_m$), and
$$K_m^{ab}(\psi) = \frac{i\kappa^2}{4}\left(\bar\psi_m\gamma^a\psi^b + \bar\psi^a\gamma_m\psi^b - \bar\psi_m\gamma^b\psi^a\right) \qquad [8]$$
is the fermionic contorsion, an additional part of the covariant derivative $D_m(e, \omega(e) + K(\psi))$ appearing in the action [6]. (Indices $m, n$ are taken to be “world” indices while indices $a, b$ are “tangent space” indices; one can convert from one type to another using the vierbein $e^a_m$ and its inverse, e.g., $\gamma_a = e^m_a\gamma_m$.)

Keeping the terms in the action grouped as above using the nonstandard covariant derivative built from $\omega_m^{ab} + K_m^{ab}$ is what has been called “1.5 order formalism”: this greatly simplifies the writing and analysis of the supergravity action [6]. In the action [6], one has the Ricci scalar $R(e, \omega(e) + K(\psi))$ written in terms of this generalized torsional spin connection. One may of course expand out all the $\omega_m^{ab} + K_m^{ab}$ combinations and write the nonlinear fermionic terms separately. Doing this produces a quartic term
$$L_4 = -\frac{\kappa^2}{32}\left[(\bar\psi^b\gamma^a\psi^c)(\bar\psi_b\gamma_a\psi_c + 2\bar\psi_a\gamma_b\psi_c) - 4(\bar\psi_a\gamma^b\psi_b)(\bar\psi^a\gamma^c\psi_c)\right]$$
showing the highly nonlinear nature of supergravity theory: when expanded out, the theory becomes much more cumbersome to study. The 1.5 order formalism trick is one of a large number of algebraic simplifications that had to be developed in order to master the technical aspects of supergravity. It also reveals a characteristic physical feature: this theory naturally involves a connection with torsion built from the fermionic fields.

In terms of the torsional covariant derivative $D_m\epsilon(x) = (\partial_m + (1/4)(\omega_m^{ab}(e) + K_m^{ab}(\psi))\gamma_{ab})\,\epsilon(x)$ of the infinitesimal supersymmetry parameter $\epsilon(x)$, the local supersymmetry transformations which leave the action [6] invariant (up to the integral of a total derivative) are
$$\delta e^a_m = -i\kappa\,\bar\epsilon\gamma^a\psi_m \qquad [9]$$
$$\delta\psi_m = 2\kappa^{-1}D_m\epsilon \qquad [10]$$
The inhomogeneous part $2\kappa^{-1}\partial_m\epsilon$ in the gravitino transformation [10] demonstrates the gauge-field nature of the gravitino field. For a distribution of “supermatter” fields (e.g., Wess–Zumino model scalars and spinors), the integrated “charge” that one would get from a Gauss's law surface integral at spatial infinity using the gravitino gauge field is the total supercharge $Q_\alpha$, which in turn plays the role of the supersymmetry generator in the original matter-sector supersymmetry algebra [1].

Both the gravitational field and the gravitino field are thus effectively gauge fields, albeit not of a standard Yang–Mills type. The local algebra is a deformation of the rigid supersymmetry algebra [1]–[5], generalizing the relation between general covariance and flat-space Poincaré symmetry. Some basic consequences of the flat-space algebra are preserved, however. An extremely important instance of this is energy positivity. As one can see by multiplying [1] by $\gamma^0$ and then contracting on the spinor index,
$$E = P^0 = \frac{1}{2}\sum_\alpha \{Q_\alpha, Q_\alpha^\dagger\}$$
The right-hand side is manifestly non-negative provided the theory is quantized in a positive-metric Hilbert space. One can see this even more explicitly in a Majorana spinor basis, where $Q_\alpha^\dagger = Q_\alpha$. Accordingly, for flat-space supersymmetric theories, one obtains directly the result that energy is non-negative. This carries over to the local algebra of supergravity, where the total energy is obtained from a Gauss's law integral over the sphere at spatial infinity.

In general relativity, an integrated energy can be defined with respect to an asymptotic timelike Killing vector at spatial infinity. Showing that this
energy is non-negative remained for decades a famously unsolved problem in gravitational physics; it was ultimately proven in Yau's positive-energy theorem. The algebraic structure of supergravity makes energy positivity much more transparent, however. Since pure general relativity can be obtained by setting the gravitino field to zero, this result is inherited by pure Einstein theory as a consequence of its being embeddable into supergravity. Energy positivity can thus be proved even at the classical level using ideas taken from supergravity, as was done by Witten and later streamlined by Nester, in an argument much simpler than Yau's proof. This argument writes the energy as an integral over a positive-semidefinite expression quadratic in a commuting spinor field which is analogous to the (anticommuting) spinor parameter of supergravity in the transformations [9] and [10].

Auxiliary Fields and Superspace

Supergravity shares with flat-space supersymmetric theories a curious technical feature that gives a hint of a new underlying geometry. Standard counting of the gauge-invariant continuous degrees of freedom of the graviton and the gravitino in momentum space yields the same result per momentum value: two bosonic degrees of freedom and two fermionic degrees of freedom. This accords with the general requirement in supersymmetric theories that the numbers of bosonic and fermionic degrees of freedom match. This count follows from the Einstein and spin-3/2 equations of motion, or “on-shell.” If one compares the count of nongauge degrees of freedom without using the equations of motion (i.e., “off-shell”), one obtains an imbalance, however: six nongauge graviton versus 12 nongauge fermion fields. This is directly related to another puzzling feature of the supergravity realization of local supersymmetry: the local supersymmetry algebra closes onto a finite set of transformations only when the equations of motion are imposed.

As in flat-space supersymmetry, the cure for this problem is to add nondynamical “auxiliary” fields to the action. In the supergravity case, the imbalance in the off-shell bose–fermi field count indicates that an additional six bosonic fields are needed. In the minimal set of auxiliary fields, these organize into a vector $b_m$ and a scalar-pseudoscalar pair $M$, $N$; the additional terms in the action [6] are simply
$$\int d^4x\,\det(e)\left(-\tfrac{1}{3}M^2 - \tfrac{1}{3}N^2 + \tfrac{1}{3}b_m b^m\right)$$
while the local supersymmetry transformations are changed to include the auxiliary fields, e.g., the gravitino transformation becomes
$$\delta\psi_m = 2\kappa^{-1}D_m(\omega, K)\,\epsilon + \gamma_5\left(b_m - \tfrac{1}{3}\gamma_m\gamma^n b_n\right)\epsilon - \tfrac{1}{3}\gamma_m(M + \gamma_5 N)\,\epsilon$$
while the auxiliary fields transform into expressions that vanish on-shell. Since the field equations for the auxiliary fields are algebraic in character and since for source-free supergravity they have the simple solution $b_m = M = N = 0$, one can directly regain the on-shell formalism by algebraically eliminating the auxiliary fields.

The inclusion of auxiliary fields is not an empty trick, however. The local supersymmetry transformations including the auxiliary fields form a closed set without the use of equations of motion (“off-shell closure”). This standardizes the form of the supersymmetry transformations so that they remain the same even when supermatter is coupled to supergravity instead of needing a case-by-case Noether construction as in the case without the auxiliary fields. In this way, a standard set of coupling rules can be drawn up, known as the “tensor calculus.” This tensor calculus is of great importance as it allows for the construction of general models of supergravity coupled to supermatter (Wess–Zumino multiplets and super Yang–Mills multiplets consisting of spin-1 gauge fields and spin-1/2 “gaugino” fields). These general couplings form the basis for essentially all supersymmetric phenomenology, and in particular for the formulation of the Minimal Supersymmetric Standard Model. Since supersymmetry is not directly observed in low-energy physics, it must be spontaneously broken, like many other gauge symmetries. As it happens, the physically realistic mechanisms of supersymmetry breaking all originate from supergravity couplings derived using the tensor calculus.

Given the regular set of tensor calculus rules for coupling supergravity to supermatter, one is led to suspect that a geometrical structure lies in the background. This is indeed the case; the corresponding construction is known as “superspace.”

The basic idea of superspace is a generalization of the coset space construction of Minkowski space as the coset space given by the Poincaré group divided by the Lorentz group: $M_4(x^m) = ISO(3, 1)/SO(3, 1)$. For supersymmetric theories, one analogously constructs Superspace$(x^m, \theta^\alpha)$ = Graded Poincaré/SO(3, 1). The basic ideas of superspace were introduced by Akulov and Volkov in 1972, while the idea of expanding in “functions” on this space, thus yielding “superfields,” was introduced by Salam and Strathdee
in 1974. This led to a formulation of the Wess–Zumino model in terms of a chiral superfield $\phi(x, \theta)$, which is subjected to a covariant superspace constraint.

In order to manage the formalism of superspace more efficiently, it is convenient to use a two-component spinor formalism corresponding to the Weyl basis for the Dirac gamma matrices, in which the Majorana spinor coordinate $\theta$ is represented as
$$\theta = \begin{pmatrix}\theta_\alpha \\ \bar\theta^{\dot\alpha}\end{pmatrix}$$
where two-component indices $\alpha, \dot\alpha = 1, 2$ are raised and lowered with the covariant two-index antisymmetric tensors $\epsilon^{\alpha\beta}$, $\epsilon^{\dot\alpha\dot\beta}$, which both take the numerical value $i\sigma_2$. The flat-space fermionic covariant derivatives are then
$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i\sigma^m_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\frac{\partial}{\partial x^m}, \qquad \bar D_{\dot\alpha} = \frac{\partial}{\partial\bar\theta^{\dot\alpha}} + i\theta^\alpha\sigma^m_{\alpha\dot\alpha}\frac{\partial}{\partial x^m} \qquad [11]$$
where the $\sigma^m_{\alpha\dot\alpha} = (1, \sigma^i)$ for $m = (0, i)$ (where $\sigma^i$ are the Pauli matrices) are the Van der Waerden matrices which establish the mapping between vector indices and (chiral, antichiral) spinor index pairs. The Wess–Zumino multiplet is then described by a complex chiral superfield satisfying the constraint $\bar D_{\dot\alpha}\phi = 0$. Unlike the situation in Minkowski space, where the only Lorentz-covariant solution to a constraint that sets to zero the $\partial/\partial x^m$ derivatives is a constant, superspace has a reducible set of coordinates $(x^m, \theta^\alpha, \bar\theta^{\dot\alpha})$ and, as a result, requiring $\phi$ to be annihilated by $\bar D_{\dot\alpha}$ does not require the whole superfield to be a constant.

Since the fermionic coordinates of superspace $\theta^\alpha$, $\bar\theta^{\dot\alpha}$ are anticommuting (i.e., they are elements of a Grassmann algebra), and since $\alpha, \dot\alpha = 1, 2$ have an index range of two, powers of them higher than the second order necessarily vanish. As a result, superfields like $\phi$ can be expanded into sets of component fields, each of which is an ordinary field in Minkowski space. In this way, a chiral superfield expands into $(A(x), B(x), \chi_\alpha(x), \bar\chi_{\dot\alpha}(x), F(x), G(x))$, where the fields $A$, $B$, $\chi_\alpha$, and $\bar\chi_{\dot\alpha}$ are the physical fields of the Wess–Zumino model, while $F$ and $G$ are dimension-2 auxiliary fields. In this way, the auxiliary fields of supersymmetry naturally fit into a superspace formalism as higher components in a superfield expansion. It is in this sense that they point toward the superspace formulations of supersymmetric theories.

For supergravity, there are a number of different approaches to realizing the theory in superspace, and these correspond naturally to the various possible choices of auxiliary-field sets. With the minimal set, the supergravity multiplet is described by a superfield carrying a vector index $H_m(x, \theta, \bar\theta)$; this superfield is called the prepotential of supergravity. Note that since the divisor group in the coset-space construction of superspace is the Lorentz group, superfields may carry indices corresponding to any Lorentz representation. The component-field expansion of the $H_m$ superfield yields the physical $e^a_m$, $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ and auxiliary fields $(b_m, M, N)$ together with a number of other components of dimension lower than those of the physical fields. This is not, however, all that surprising: even the physical fields $e^a_m$, $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ contain components that are not directly related to the physical modes because we are dealing with a gauge theory. What occurs in superspace is a redundant expression of the supergravity multiplet with the presence of various component gauge fields.

The full expression of local supersymmetry in superspace can be given in a number of different formalisms. Suffice it here to indicate the transformation of the linearized theory expanded in small fluctuations about empty flat superspace. Converting the vector index of $H_m$ into a (chiral, antichiral) spinor index pair via $H_{\alpha\dot\alpha} = \sigma^m_{\alpha\dot\alpha}H_m$, the linearized local symmetry transformation of the supergravity multiplet is
$$\delta H_{\alpha\dot\alpha} = D_\alpha\bar L_{\dot\alpha} - \bar D_{\dot\alpha}L_\alpha \qquad [12]$$
where the transformation parameter superfield $L_\alpha$ carrying a spinor index is antichiral: $D_\beta L_\alpha = 0$ (while the conjugate parameter superfield $\bar L_{\dot\alpha}$ is chiral). Expanding in component fields and comparing with the expansion of $H_m$, one sees that the chiral spinor superfield contains precisely the components needed to provide the standard gauge symmetries of $e^a_m$ and $\psi_{m\alpha}$, $\bar\psi_{m\dot\alpha}$ and also to transform the other gauge components of $H_m$ as well. One can then make various gauge choices according to taste in a given context.

One frequently encountered superspace gauge choice sets to zero all the fields in $H_m$ except for the physical and auxiliary fields $(e^a_m, \psi_{m\alpha}, \bar\psi_{m\dot\alpha}, b_m, M, N)$. This is called a Wess–Zumino gauge following the analogy to a similar construction for super Maxwell theory (containing spins 1 and 1/2). Wess–Zumino gauge choices are not, however, supersymmetrically covariant. This shows up when one works out the supersymmetry algebra in such a gauge: the presence of auxiliary fields gives closure, as required, without use of the equations of motion, but the anticommutator of two supersymmetry
126 Supergravity
transformations when acting on a gauge field such as the Maxwell field or the vierbein gives a combination of the anticipated translation with an admixture of a gauge transformation with a field-dependent parameter.

The prepotential superfield of minimal supergravity can itself be fit into larger formalisms in superspace that are analogous to standard differential geometry, with supervielbeins, superspin connections and so forth. An unavoidable feature of these more seemingly geometric constructions, however, is their high degree of redundancy: superspace vielbeins and spin connections carrying Lorentz indices have many component fields in addition to those found in the prepotential. This redundancy is then cut down in turn by imposing superspace constraints on the geometrical superfields, for example, on the components of the torsion tensor in superspace.

Extended Supergravities and Supergravities in Higher Dimensions

The possible graded extensions of the Poincaré algebra allow for more than one spinorial generator. Thus, one can have N supersymmetry generators $Q_{i\alpha}$, $\bar Q^{j}_{\dot\alpha}$, i, j = 1, ..., N, with basic anticommutators (in Lorentz two-component notation)

$$\{Q_{i\alpha}, \bar Q^{j}_{\dot\beta}\} = 2\,\delta_i^{\,j}\,\sigma^m_{\alpha\dot\beta}\,P_m \qquad [13]$$

$$\{Q_{i\alpha}, Q_{j\beta}\} = 2\,\epsilon_{\alpha\beta}\,a^{\ell}_{ij}\,Z_\ell \qquad [14]$$

$$\{\bar Q^{i}_{\dot\alpha}, \bar Q^{j}_{\dot\beta}\} = 2\,\epsilon_{\dot\alpha\dot\beta}\,a^{\ell\,ij}\,Z_\ell \qquad [15]$$

The right-hand sides of [14] and [15] allow for the possibility of nonvanishing anticommutators between supersymmetry generators of the same chirality. As one can see from the overall symmetry in pairs of indices (i, j), the coefficients $a^{\ell}_{ij}$ must be antisymmetric in the i, j indices, so such nonvanishing same-chirality anticommutators cannot occur for N = 1. The corresponding abelian generators $Z_\ell$ are called central charges since they must commute with all the other ($Q_{i\alpha}$, $\bar Q^{j}_{\dot\alpha}$, $P_m$) elements of the algebra.

The i, j indices may be endowed with a symmetry meaning as well, although this is not obligatory in every model. When the central charges are absent, $Z_\ell = 0$, one has U(N) (or SU(N)) as the maximal such external automorphism; the choice of index placement on $Q_{i\alpha}$ and $\bar Q^{j}_{\dot\alpha}$ anticipates this. If such a symmetry is realized in a given model, the fact that the $Q_{i\alpha}$, $\bar Q^{j}_{\dot\alpha}$ carry representations both for that symmetry and for the spacetime Poincaré symmetry demonstrates how supersymmetry evades the no-go theorem barring unified spacetime and internal symmetries. This theorem (the Coleman–Mandula theorem) can be evaded, since at the time it was written, graded Lie symmetry algebras were not yet considered. For nonzero central charges, the external automorphism algebra becomes a subalgebra of U(N) determined by the requirement that invariant antisymmetric tensors $a^{\ell}_{ij}$ exist.

The representations of the algebra [13]–[14] span an increasing range of spins as the number N of D = 4 supersymmetries increases. For massive representations without central charges, the spins of the smallest supersymmetry representation extend from states of spin 0 (scalars) up to spin N/2; with central charges, the spin range can be shortened down to a minimum range of N/4. For massless representations, the range of helicities in a PCT (parity–charge–time reversal) symmetric multiplet is from −N/4 to N/4. This spin range has an important implication for the maximal extension of supersymmetry that can be realized in an interacting supersymmetric field theory, because no interacting theories with a finite set of spins exist for spins > 2. Accordingly, the maximal extension of supersymmetry is N = 8 for massless theories, and in order to have massive states with spins that do not exceed spin 2 in an N = 8 theory, the central charges have to be active for maximal multiplet shortening.

The N = 8 supergravity theory, found by Cremmer and Julia in 1978, is thus the largest possible supergravity in D = 4 dimensions. It contains the following "spin" range (allowing for a certain imprecision of expression: for massless fields one should really speak only of helicities)

N = 8 supergravity spins
Spin           2    3/2   1    1/2   0
Multiplicity   1    8     28   56    70

In order to realize the automorphism SU(8) symmetry, one has to consider the field strengths for the 28 spin-1 fields, separated into complex self-dual and anti-self-dual parts in their antisymmetric Lorentz indices. These complex field strengths can then be endowed with a complex 28-dimensional representation of SU(8). The 70 scalars, on the other hand, fit precisely into the four-index antisymmetric self-dual representation of SU(8), $\phi_{i_1 i_2 i_3 i_4} = \frac{1}{4!}\,\epsilon_{i_1 i_2 i_3 i_4 j_1 j_2 j_3 j_4}\,\bar\phi^{\,j_1 j_2 j_3 j_4}$. It is the use of the eight-index epsilon tensor here that restricts the automorphism group to SU(8) instead of U(8).

The SU(8) automorphism symmetry of N = 8 supergravity theory is linearly realized. It plays an important role in another symmetry of this theory which is highly nonlinear. This theory has a
remarkable nonlinear $E_7$ symmetry. In fact, the 70 scalars form a nonlinear sigma model with the fields taking their values in the coset space $E_7/SU(8)$ (of dimension 133 − 63 = 70), where the SU(8) divisor is the linearly realized automorphism group discussed above.

The extended supergravities point to another aspect of supergravity theory: the existence of higher-dimensional supergravities, from which the extended theories in D = 4 spacetime can be derived by Kaluza–Klein dimensional reduction. If one considers a D′-dimensional massless theory in a spacetime where d dimensions form a compact d-torus, then the theory can be viewed as a D = D′ − d dimensional theory in which the discrete Fourier modes arising from the periodicity requirements on the d-torus give rise to towers of equally spaced massive Kaluza–Klein states, plus a massless sector in D′ − d dimensions corresponding to the modes with no dependence on the d-torus coordinates.

Importantly, N = 8 supergravity in four-dimensional spacetime can be obtained in this way from a supergravity theory that exists in 11 spacetime dimensions. Upon dimensional reduction on a 7-torus to four dimensions, one obtains N = 8, D = 4 supergravity at the massless level, plus an infinite tower of massive N = 8 supermultiplets with central charges so that their spin range extends only up to spin 2. This D = 11 supergravity was in fact found before the N = 8 theory by Cremmer, Julia, and Scherk, with the details of the more complicated N = 8, D = 4 theory being worked out via the techniques of Kaluza–Klein dimensional reduction.

The fields of the D = 11 theory include an exotic field type not encountered in D = 4 theories: the bosonic fields of the theory comprise the graviton $e_M{}^A$ plus a three-index antisymmetric tensor gauge field $C_{MNP}$. Counting the number of propagating modes of these fields for a given momentum value gives 44 + 84 = 128 bosonic degrees of freedom. This precisely balances the 128 fermionic degrees of freedom coming from the D = 11 gravitino $\psi_M$.

Supergravity Effective Theories, Strings and Branes

The hope for a cancellation of the ultraviolet divergences in a supersymmetric theory of gravity turned out to be ephemeral, although there is in fact a postponement of the divergence onset until a higher order in quantum field loops. There is agreement that the nonmaximal supergravities diverge at the three-loop order. For the N = 8, D = 4 theory, the situation remains unclear, but divergences are nonetheless expected to occur at some finite loop order.

This persistence of nonrenormalizability in D = 4 supergravity theories is no longer seen as a disaster, however, because these theories are now seen as effective theories for the massless modes arising from a deeper microscopic quantum theory. In addition, the theories that are most directly connected to this underlying quantum theory are, surprisingly, the maximal supergravities in spacetime dimensions 10 and 11. D = 11 supergravity can be dimensionally reduced on a 1-torus (i.e., a circle) to D = 10 where the massless sector yields type IIA supergravity theory. This theory is the effective theory for a consistent quantum theory of type IIA superstrings in D = 10. Theories of relativistic strings (i.e., one-dimensional extended objects) have strikingly different properties from theories of point particles. In particular, the spread-out nature of the interactions leads to a damping out of the quantum field theory divergences, while the underlying supersymmetry causes a cancellation of other infinities that could have arisen owing to the two-dimensional nature of the string world sheets. This gives, for the first time, a perturbatively well-defined quantum theory including gravity.

In addition to the type IIA theory, there are four other consistent superstring theories in D = 10, and these are in turn related to various D = 10 supergravity effective theories for the massless modes: type IIB, $E_8 \times E_8$ heterotic, SO(32) heterotic, and SO(32) type I. Remarkably, the maximal D = 11 supergravity enters into this picture as well, as a consequence of a pattern of duality symmetries that have been found among the superstring theories.

The dualities of string theory are directly related to the nonlinear symmetries of the dimensionally reduced supergravities in D = 4. The string quantum corrections do not respect the $E_7$ symmetry of the classical N = 8 theory, but they do respect a discrete subgroup of this symmetry in which the $E_7$ group elements are required to take integer values: $E_7(\mathbb{Z})$. This quantum-level restriction to a discrete subgroup can be seen from another phenomenon characteristic of superstring theories: the existence of "electric" and "magnetic" brane solutions. The antisymmetric-tensor (or "form") fields of the higher-dimensional supergravities naturally give rise to solitonic solutions in which p + 1 dimensions form a flat Poincaré invariant subspace. This can be interpreted as the world volume of an infinite p-brane extended object. In the D = 11 supergravity theory, the branes that emerge in this way are a 2-brane and a 5-brane. The three-dimensional world volume of the 2-brane naturally couples to the
3-form field $C_{MNP}$, just as an ordinary Maxwell vector field couples to the one-dimensional world line of a point particle (or 0-brane). The 2-brane is thus naturally electrically charged with respect to the 3-form field; its charge can be obtained, in a direct generalization of the Maxwell case, from a Gauss' law integral of the field strength $H_{[4]} = dC_{[3]}$ over a 7-sphere at spatial infinity in the eight directions transverse to the brane worldvolume. The 5-brane, on the other hand, has a magnetic type charge; it is the 7-form dual to $H_{[4]}$ that is integrated to give its charge. In addition to these static infinite p-branes, the theory contains dynamical finite-extent branes as well, although for these one generally does not have explicit solutions.

As one reduces a higher-dimensional supergravity to lower and lower dimensions, there is a proliferation of solitonic brane solutions of varying dimensionality, and of both electric and magnetic charge types. In a quantum theory context, these electrically and magnetically charged branes pair up in ways that must satisfy a generalization of the Dirac quantization condition for D = 4 electric and magnetic point particles. This ends up requiring all the supergravity solitonic brane charges to lie on a charge lattice. It is the requirement that this discrete brane-charge lattice be respected that restricts the classical supergravity nonlinear symmetry groups to discrete duality subgroups.

The dualities relate brane solutions within a given theory and also between different string theories. They include transformations that invert the radii of compactifying tori, giving a large–small compactification scale duality. They also include transformations that invert the string coupling constant, thus interchanging strong and weak coupling. The type IIB theory, for example, is self-dual under strong–weak coupling duality. In the case of the type IIA theory, however, something remarkable happens. The strong coupling limit of this theory turns out to be related by duality, not to another string theory, but to the maximal D = 11 supergravity. The role of the Kaluza–Klein massive modes for the 11 to 10 reduction is played by an infinite tower of extremal charged black holes.

Thus, even D = 11 supergravity theory has a role to play in the effective theory of the underlying quantum dynamics. This underlying theory has been dubbed "M-theory." It is still only partially understood, but many of its most important properties are presaged by the remarkable nonlinear structure of the classical supergravities.

See also: Brane Construction of Gauge Theories; Brane Worlds; Branes and Black Hole Statistical Mechanics; Random Algebraic Geometry, Attractors and Flux Vacua; Renormalization: General Theory; Spinors and Spin Coefficients; Stability of Minkowski Space; Supermanifolds; Superstring Theories; Supersymmetric Particle Models; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory: Algebraic Aspects.

Further Reading
Buchbinder JL and Kuzenko SM (1998) Ideas and Methods of Supersymmetry and Supergravity. Bristol: IoP Publishing Ltd.
Stelle KS (1998) BPS branes in supergravity, Trieste 1987 School of High-Energy Physics and Cosmology, arXiv:hep-th/9803116.
Van Nieuwenhuizen P (1981) Supergravity. Physics Reports 68: 189–398.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
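The state counting quoted in the Supergravity article (the N = 8 multiplicities 1, 8, 28, 56, 70, the balance of 44 + 84 = 128 bosonic against 128 fermionic modes in D = 11, and the coset dimension 133 − 63 = 70) can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes the standard counting rules, which the article does not spell out: helicity 2 − k/2 occurs with multiplicity C(8, k), and D = 11 massless modes are classified by the transverse rotation group SO(9) with a 16-component spinor. All variable names are hypothetical.

```python
from math import comb

N = 8  # number of D = 4 supersymmetries

# Assumed rule: helicity 2 - k/2 appears C(N, k) times, for k = 0..N.
multiplicity = [comb(N, k) for k in range(N + 1)]
table_row = multiplicity[:5]  # spins 2, 3/2, 1, 1/2, 0 as in the table

bosonic_states = sum(multiplicity[0::2])    # integer helicities
fermionic_states = sum(multiplicity[1::2])  # half-integer helicities

# D = 11 massless modes, counted in the d = 9 transverse directions.
d = 11 - 2
graviton = d * (d + 1) // 2 - 1  # symmetric traceless tensor
three_form = comb(d, 3)          # antisymmetric three-index tensor
gravitino = (d - 1) * 16         # gamma-traceless vector-spinor

# Scalar manifold E7/SU(8): dim E7 = 133, dim SU(8) = 8^2 - 1.
coset_dim = 133 - (N * N - 1)

print(table_row, bosonic_states, fermionic_states)
print(graviton, three_form, gravitino, coset_dim)
```

The coset dimension coincides with the number of scalars, C(8, 4) = 70, as the text states.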
Supermanifolds
F A Rogers, King's College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
A supermanifold is a generalization of a classical manifold to include coordinates that are in some sense anticommuting. Much of the motivation for the study of supermanifolds comes from supersymmetric physics, where it is useful to have a formalism which treats fermions and bosons in the same way. The underlying reason for the effectiveness of supermanifolds is that anticommuting coordinates allow the fermionic canonical anticommutation relations to be handled in a way analogous to the bosonic canonical commutation relations. Supersymmetric methods have proved immensely effective in fundamental physics; they also play a considerable role in geometrical index theory in mathematics. In this article we describe supermanifolds from two points of view, geometric and algebraic, and consider some of the standard features of manifold calculus, including integration, since this is an area where the distinctive features of this generalized geometry are particularly apparent.
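To make the phrase "coordinates that are in some sense anticommuting" concrete, here is a minimal sketch of a real Grassmann algebra in Python. It is an illustration under stated assumptions, not the construction used in the article: the class name `Grassmann`, the helper `_sort_with_sign`, and the dictionary representation (sorted tuples of generator indices mapped to real coefficients) are all hypothetical choices.

```python
def _sort_with_sign(indices):
    """Sort generator indices, flipping the sign once per adjacent swap."""
    idx, sign = list(indices), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)


class Grassmann:
    """Element of a Grassmann algebra: {sorted index tuple: coefficient}."""

    def __init__(self, terms=None):
        # Drop zero coefficients so that equality compares canonical forms.
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def gen(i):
        """The i-th anticommuting generator."""
        return Grassmann({(i,): 1})

    def __add__(self, other):
        out = dict(self.terms)
        for k, v in other.terms.items():
            out[k] = out.get(k, 0) + v
        return Grassmann(out)

    def __mul__(self, other):
        out = {}
        for ka, va in self.terms.items():
            for kb, vb in other.terms.items():
                if set(ka) & set(kb):
                    continue  # a repeated generator squares to zero
                sign, key = _sort_with_sign(ka + kb)
                out[key] = out.get(key, 0) + sign * va * vb
        return Grassmann(out)

    def __eq__(self, other):
        return self.terms == other.terms


zero = Grassmann()
b1, b2, b3 = Grassmann.gen(1), Grassmann.gen(2), Grassmann.gen(3)
print((b1 * b2 + b2 * b1) == zero)      # odd elements anticommute
print((b1 * b1) == zero)                # the square of an odd element vanishes
print((b1 * b2) * b3 == b3 * (b1 * b2))  # even elements commute
```

Storing each monomial in sorted order keeps the representation canonical, so the sign bookkeeping lives in one place (`_sort_with_sign`).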
groups, which are supermanifolds with a compatible group structure.
(i) A "superalgebra" is a super vector space $A = A_0 \oplus A_1$ which is also an algebra which satisfies $A_i A_j \subset A_{i+j}$.
(ii) The superalgebra is "supercommutative" if, for all homogeneous a, b in A, $ab = (-1)^{|a||b|}\,ba$.

If the algebra is supercommutative then odd elements anticommute, and the square of an odd element is zero. The basic supercommutative superalgebra used is the real Grassmann algebra with generators $1, \beta_1, \beta_2, \ldots$ and relations

$$1\beta_i = \beta_i 1 = \beta_i, \qquad \beta_i\beta_j = -\beta_j\beta_i \qquad [2]$$

A typical element of this algebra is then

$$a = a_\emptyset 1 + \sum_i a_i\,\beta_i + \sum_{i<j} a_{ij}\,\beta_i\beta_j + \cdots \qquad [3]$$

numbers. Such functions will be known (anticipating the terminology for functions of both odd and even variables) as supersmooth. (A useful notation will be to write

$$F(\xi^1, \ldots, \xi^n) = \sum_\mu F_\mu\,\xi^\mu \qquad [6]$$

with $\mu$ a multi-index $\mu = \mu_1 \ldots \mu_k$ and $\xi^\mu = \xi^{\mu_1} \cdots \xi^{\mu_k}$, $\xi^\emptyset = 1$. The set of multi-indices is restricted to those where $1 \le \mu_1 < \cdots < \mu_k \le n$.) More general supersmooth functions, with the coefficients $F_\emptyset, \ldots$ taking values in $\mathbb{C}$, $\mathbb{R}_S$, or some other algebra are also possible.

Differentiation of supersmooth functions of anticommuting variables is defined by linearity together with the rule
$$\epsilon\big((x^1, \ldots, x^m; \xi^1, \ldots, \xi^n)\big) = \big(\epsilon(x^1), \ldots, \epsilon(x^m)\big) \qquad [10]$$

These maps project out all the nilpotent Grassmann generators, leaving simply the real part. The topology involves the inverse of these projection maps: a subset U of $\mathbb{R}_S^{m,n}$ is said to be open if and only if there exists an open set V in $\mathbb{R}^m$ such that $U = \epsilon^{-1}(V)$. Thus, an open set is unlimited in the nilpotent directions.

In the sequel, where we consider integration, the superdeterminant of the matrix M of an endomorphism of a super vector space V will be useful. If V is an (m, n)-dimensional super vector space (so that $V_0$ has dimension m and $V_1$ dimension n), then M will have the block form

$$M = \begin{pmatrix} M_{00} & M_{01} \\ M_{10} & M_{11} \end{pmatrix}$$

where the entries of $M_{00}$ and $M_{11}$ are even, whereas those of $M_{01}$ and $M_{10}$ are odd. If $N = M^{-1}$ has block form

$$N = \begin{pmatrix} N_{00} & N_{01} \\ N_{10} & N_{11} \end{pmatrix}$$

then the superdeterminant of M is defined by

$$\mathrm{Sdet}\,M = \det M_{00}\,\det N_{11}$$

It can be shown that the superdeterminant obeys the product rule, unlike the obvious generalization of the determinant to the super case.

The Geometric Approach to Supermanifolds

A manifold is a space locally modeled on the topological space $\mathbb{R}^m$, where m is the dimension of the manifold. Thus, each point in a manifold has a neighborhood which is essentially a neighborhood in $\mathbb{R}^m$. The most geometrically intuitive approach to supermanifolds is to generalize this directly by modeling a space locally on an extension of $\mathbb{R}^m$ to include anticommuting variables; the most straightforward space with the required algebraic property is the superspace $\mathbb{R}_S^{m,n}$ built from a Grassmann algebra, leading to a supermanifold of dimension (m, n). (The dimension of a supermanifold is a pair of integers, indicating the numbers of even and odd coordinates of each point.)

The formal definition of a supermanifold will now be given in a manner very closely analogous to that of a classical manifold.

Definition 3. Let M be a set.
(i) An (m, n) open chart on M is a pair $(U, \phi)$ such that U is a subset of M and $\phi$ is an injective map of U into $\mathbb{R}_S^{m,n}$, with the image $\phi(U)$ an open set in $\mathbb{R}_S^{m,n}$.
(ii) An (m, n) atlas on M is a collection $\{(U_\alpha, \phi_\alpha)\}$ of (m, n) charts on M such that the $U_\alpha$ cover M and, whenever $U_\alpha \cap U_\beta$ is not empty, the change of coordinate function $\phi_\alpha \circ \phi_\beta^{-1}$ is supersmooth.

An (m, n)-dimensional supermanifold is a set M together with a maximal (m, n) atlas on M.

The space M is given a topology by defining $U \subset M$ to be open if and only if, for each $\alpha$ such that $U \cap U_\alpha$ is not empty, the set $\phi_\alpha(U \cap U_\alpha)$ is an open subset of $\mathbb{R}_S^{m,n}$.

Examples of supermanifolds include $\mathbb{R}_S^{m,n}$ itself, and also supermanifolds constructed from the data of a vector bundle over a classical manifold in a manner which will now be described. If N is a classical m-dimensional real manifold and E is an n-dimensional vector bundle over N, then an (m, n)-dimensional supermanifold can be constructed in the following way: suppose that $\{(V_\alpha, \psi_\alpha)\}$ is an atlas of charts on N, so that each $V_\alpha$ is an open subset of N and each $\psi_\alpha$ is an injective map of $V_\alpha$ onto an open subset of $\mathbb{R}^m$, with $\psi_\alpha \circ \psi_\beta^{-1}$ smooth. Suppose further that the $V_\alpha$ are also local trivialization neighborhoods of the bundle E with transition functions $g_{\alpha\beta} : V_\alpha \cap V_\beta \to GL(n)$. Then we build the supermanifold M by patching together the sets $\epsilon^{-1}(\psi_\alpha(V_\alpha)) \times \mathbb{R}_S^{0,n}$ in a consistent way. This leads to a supermanifold with coordinate change functions

$$\phi_\alpha \circ \phi_\beta^{-1}\big(x_\beta^1, \ldots, x_\beta^m; \xi_\beta^1, \ldots, \xi_\beta^n\big) = \big(x_\alpha^1, \ldots, x_\alpha^m; \xi_\alpha^1, \ldots, \xi_\alpha^n\big)$$

where

$$\big(x_\alpha^1, \ldots, x_\alpha^m\big) = \psi_\alpha \circ \psi_\beta^{-1}\big(x_\beta^1, \ldots, x_\beta^m\big), \qquad \xi_\alpha^j = \sum_{k=1}^{n} g_{\alpha\beta}{}^j{}_k\big(x_\beta^1, \ldots, x_\beta^m\big)\,\xi_\beta^k \qquad [11]$$

(Here again we refer to the appendix for the way in which functions of even Grassmann variables, as opposed simply to real numbers, are handled.) Particular examples of this construction are the tangent bundle over N and bundles of spinors over N. It was actually shown by Batchelor that all real, supersmooth supermanifolds are of this form.

A similar definition may be made of a complex supermanifold using a complex Grassmann algebra, with the coordinate transition functions required to be superanalytic. In this case, supermanifolds which
are not related to vector bundles in the manner difference, the two approaches lead to essentially
described above are possible, basically because equivalent supermanifolds.
partitions of unity do not exist in the analytic The advantage of the algebraic approach is its
setting. An example is the twisted supertorus, which mathematical elegance and economy – there is no
is built over the standard torus and has transition need to introduce the auxiliary Grassmann algebra
functions (z, ) ! (z þ 1, ) and (z, ) ! (z þ a þ RS in which coordinate functions take values – but
, þ ), extending the standard torus with transi- from the point of view of physicists, the geometric
tion functions z ! z þ 1, z ! z þ a. (Here a, are, point of view has two advantages: first, it is closer to
respectively, even and odd constants.) This super- the standard manifold picture and thus easier to
manifold is an example of a super Riemann surface; grasp, and, second, it allows a wider class of
such surfaces play an important role in the quanti- supermanifolds, because Grassmann constants are
zation of the spinning string. allowed; for instance, the twisted supertorus
As with classical manifolds, a natural class of described above cannot be included in the algebraic
functions can be defined on a supermanifold: approach without either introducing an auxiliary
a function f on an open subset U of the super- algebra or moving to the more difficult concept of a
manifold M is said to be supersmooth if, for each family of supermanifolds.
such that U \ U is nonempty, the function f 1 is While there have been various attempts to develop
supersmooth on (U \ U ). In local coordinates infinite-dimensional supermanifolds, most of the
supersmooth functionsP are such that constructions have been developed for very specific
f (x1 , . . . , xm , 1 , . . . , n ) = f (x1 , . . . , xm ) with purposes, such as path integration and functional
each f a smooth function. integration methods for theories with fermions.
Even the question of defining a basic infinite-
dimensional superalgebra with the necessary
analytic properties, such as a Hilbert–Banach super-
The Algebraic Approach to
algebra, requires sophisticated procedures, so that
Supermanifolds the development of a theory of infinite-dimensional
In the algebraic approach to supermanifolds, it is the supermanifolds becomes extremely technical.
algebra of functions, rather than the manifold
itself, which is extended to include anticommuting
Calculus on Supermanifolds
elements. In this approach an (m, n)-dimensional
supermanifold is defined to be a pair (N, A), where Much of the calculus of functions on supermanifolds
N is an m-dimensional classical manifold and A is a proceeds in simple analogy to that of classical
sheaf of superalgebras over N with various proper- manifolds, with addition sign factors occurring when-
ties, described below. The statement that A is a ever two odd quantities are transposed. For instance, a
sheaf of algebras over N means that corresponding vector field on M may be described as a super-
to each open subset U of N there is an algebra A(U); derivation of the algebra of supersmooth functions
also, if V U, there is a ‘‘restriction map’’ U, V on M, that is, a linear mapping of this space obeying
mapping A(U) into A(V), and the various restriction the super Leibnitz rule X fg = Xf g þ (1)(jXjjf j) f Xg.
maps obey certain consistency conditions. A parti- Standard examples of vector fields (defined locally) are
cular example of such a sheaf (with trivial odd part) coordinate derivatives @=@xi and @=@j , defined by
is the sheaf A; of real-valued functions on N, with (@=@xi )f = @i (f ) and (@=@j )f = @jþm (f ) with
A; (U) = C1 (U), the set of real-valued smooth func- the coordinate function corresponding to the coordi-
tions on U and U, V mapping a function in C1 (U) nates (x1 , . . . , xm ; 1 , . . . , n ). Equipped with this con-
to its restriction in C1 (V). The defining property of cept of vector field, much of differential calculus on
the sheaf corresponding to an (m, n)-dimensional manifolds can be directly generalized to supermani-
supermanifold is that there is a cover {U } of N for folds in a relatively straightforward way. However, in
which the algebras A(U ) have the form A(U ) ffi the case of integration the situation is quite different.
C1 (U ) (Rn ), so that a typical P element f of The standard approach to integration of anticommut-
A(U ) may be expressed as f = f , where f 2 ing variables is the Berezin integral, which is a formal,
C1 (U ) and 1 , . . . , n are generators of (Rn ). The algebraic integral that is not an antiderivative and has
notation here is chosen to emphasize the close no measure-theoretic features. There are various
correspondence with the algebra of smooth func- reasons why such an integral is used: for instance,
tions described at the end of the previous section. even the simple function of a single anticommuting
This makes it clear that, despite an apparent variable has no antiderivative, while the topology on
RSm,n does not allow open sets which discriminate in
odd directions. Additionally, when changing variables on $\mathbb{R}_S^{m,n}$ it is the superdeterminant of the Jacobian matrix which must be used. In the purely odd sector, differentials thus transform the "wrong" way.

The Berezin integral of a function f of n anticommuting variables is defined by

$$\int d^n\xi \left(\sum_\mu f_\mu\,\xi^\mu\right) = f_{1 \ldots n} \qquad [12]$$

In other words, Berezin integration simply picks out the coefficient of the highest-order term, thus resembling differentiation more than integration in the classical sense. Nonetheless, the Berezin integral has very useful properties, in particular allowing direct analogues of Fourier transformations and integral kernels. Given that it is the algebra of functions, and the operators acting on these algebras, which is the key element in supergeometry, these are vital properties of the integral.

The transformation rule under change of variable is the inverse of that which one expects. For instance, in the case of a single variable, if one makes the transformation $\xi \to \eta = a\xi + \gamma$ with a and $\gamma$ constants, a direct calculation shows that the integral is invariant provided that one sets $d\xi = a\,d\eta$.

Integration on $\mathbb{R}_S^{m,n}$ is essentially defined by combining classical integration for the even variables with Berezin integration for odd variables, giving

$$\int_{\epsilon^{-1}(V)} d^m x\, d^n\xi \left(\sum_\mu f_\mu(x^1, \ldots, x^m)\,\xi^\mu\right) = \int_V d^m x\, f_{1 \ldots n}(x^1, \ldots, x^m) \qquad [13]$$

This also defines integration on supermanifolds, provided that we can find a rule for the change of variable. This, as indicated above, may be done by using the superdeterminant of the Jacobian matrix. Suppose that $(y, \eta)$ are a new set of coordinates on our supermanifold. Then an invariant definition of integral is obtained if we set

$$d^m y\, d^n\eta = \mathrm{Sdet}\begin{pmatrix} \partial y/\partial x & \partial y/\partial\xi \\ \partial\eta/\partial x & \partial\eta/\partial\xi \end{pmatrix} d^m x\, d^n\xi \qquad [14]$$

Appendix

We now describe the device which allows functions of even Grassmann variables to be handled simply as functions of conventional variables. The necessary class of functions is captured by defining supersmooth functions on $\mathbb{R}_S^{m,0}$ as extensions by Taylor expansion from smooth functions on $\mathbb{R}^m$.

Definition 4. The function $F : \mathbb{R}_S^{m,0} \to \mathbb{R}_S$ is said to be supersmooth if there exists a smooth function $\tilde F : \mathbb{R}^m \to \mathbb{R}$, such that

$$F(x^1, \ldots, x^m) = \tilde F(\epsilon(x)) + \sum_{i=1}^{m} \big(x^i - \epsilon(x^i)1\big)\,\frac{\partial \tilde F}{\partial x^i}(\epsilon(x)) + \frac{1}{2}\sum_{i,j=1}^{m} \big(x^i - \epsilon(x^i)1\big)\big(x^j - \epsilon(x^j)1\big)\,\frac{\partial^2 \tilde F}{\partial x^i \partial x^j}(\epsilon(x)) + \cdots \qquad [15]$$

(Although this Taylor series will in general be infinite, it gives well-defined coefficients for each term in the expansion [3], so that the value of F is a well-defined element of $\mathbb{R}_S$.) A number of different classes of function can be obtained, by varying the space in which the function $\tilde F$ takes its value.

See also: Batalin–Vilkovisky Quantization; BRST Quantization; Graded Poisson Algebras; Path-Integrals in Non Commutative Geometry; Random Matrix Theory in Physics; Supergravity; Superstring Theories; Supersymmetric Particle Models; Supersymmetric Quantum Mechanics.

Further Reading
Batchelor M (1979) The structure of supermanifolds. Transactions of the American Mathematical Society 253: 329–338.
Batchelor M (1980) Two approaches to supermanifolds. Transactions of the American Mathematical Society 258: 257–270.
Berezin FA (1987) Introduction to Superanalysis. Dordrecht: Reidel.
Berezin FA and Leǐtes DA (1976) Supermanifolds. Soviet Maths Doklady 16: 1218–1222.
Crane L and Rabin JM (1988) Super Riemann surfaces: uniformization and Teichmüller theory. Communications in Mathematical Physics 113: 601–623.
DeWitt BS (1992) Supermanifolds. Cambridge: Cambridge University Press.
Howe PS (1979) Super Weyl transformations in two dimensions. Journal of Physics A 12: 393–402.
Jadczyk A and Pilch K (1981) Superspace and supersymmetries. Communications in Mathematical Physics 78: 373–390.
Kostant B (1977) Graded manifolds, graded Lie theory and prequantization. In: Differential Geometric Methods in Mathematical Physics. Lecture Notes in Mathematics, Springer.
Kupsch J and Smolyanov O (2000) Hilbert norms for graded algebras. Proceedings of the American Mathematical Society 128: 1647–1653.
Polchinski J (1998) String Theory, vol. II. Cambridge: Cambridge University Press.
Rogers A (2003) Supersymmetry and Brownian motion on supermanifolds. Infinite Dimensional Analysis, Quantum Probability and Related Topics 6(suppl. 1): 83–102.
Salam A and Strathdee J (1974) Super-gauge transformations. Nuclear Physics B 76: 477–482.
Voronov AA (1992) Geometric integration theory on supermanifolds. Soviet Scientific Reviews C: Mathematical Physics Reviews 9: 1–138.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
West P (1990) Introduction to Supersymmetry and Supergravity. Singapore: World Scientific.
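As a small numerical companion to the Berezin integral and the change-of-variable rule discussed in the Supermanifolds article, the following sketch checks the "inverse" transformation behaviour for two odd variables under a real linear map. It rests on assumptions that are not in the article: functions are stored as dictionaries from sorted index tuples to real coefficients, the change of variables is $\eta^i = \sum_j A^i{}_j \xi^j$ with a real 2 × 2 matrix A, and the helper names `gmul`, `gadd`, `berezin`, and `substitute` are hypothetical. For a purely odd linear map the superdeterminant of the Jacobian reduces to 1/det A, so the top coefficient of the pulled-back function should be det A times the original one.

```python
def gmul(p, q):
    """Product of two Grassmann polynomials {sorted index tuple: coeff}."""
    out = {}
    for ka, va in p.items():
        for kb, vb in q.items():
            if set(ka) & set(kb):
                continue  # repeated odd variable: the product vanishes
            idx, sign = list(ka + kb), 1
            for i in range(len(idx)):
                for j in range(len(idx) - 1 - i):
                    if idx[j] > idx[j + 1]:
                        idx[j], idx[j + 1] = idx[j + 1], idx[j]
                        sign = -sign  # each adjacent swap flips the sign
            key = tuple(idx)
            out[key] = out.get(key, 0) + sign * va * vb
    return out


def gadd(p, q):
    """Sum of two Grassmann polynomials."""
    out = dict(p)
    for k, v in q.items():
        out[k] = out.get(k, 0) + v
    return out


def berezin(p, n):
    """Berezin integral over n odd variables: the top coefficient."""
    return p.get(tuple(range(1, n + 1)), 0)


def substitute(p, images):
    """Replace odd variable i by the polynomial images[i] throughout p."""
    total = {}
    for key, coeff in p.items():
        term = {(): coeff}
        for i in key:
            term = gmul(term, images[i])
        total = gadd(total, term)
    return total


# f(eta) = 5 + 2 eta1 - eta2 + 3 eta1 eta2
f = {(): 5, (1,): 2, (2,): -1, (1, 2): 3}

# eta1 = 2 xi1 + xi2, eta2 = xi1 + 3 xi2, so A = [[2, 1], [1, 3]]
A = [[2, 1], [1, 3]]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
images = {1: {(1,): A[0][0], (2,): A[0][1]},
          2: {(1,): A[1][0], (2,): A[1][1]}}

pulled_back = substitute(f, images)
print(berezin(f, 2), berezin(pulled_back, 2), det_A)
```

Dividing the integral of the pulled-back function by det A recovers the original integral, which is the purely odd case of the superdeterminant rule in which the odd-odd block enters through its inverse.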
Superstring Theories
C Bachas and J Troost, Ecole Normale Supérieure, Paris, France
© 2006 Elsevier Ltd. All rights reserved.
Introduction
String theory postulates that all elementary particles in nature correspond to different vibration states of an underlying relativistic string. In the quantum theory both the frequencies and the amplitudes of vibration are quantized, so that the quantum states of a string are discrete. They can be characterized by their mass, spin, and various gauge charges. One of these states has zero mass and spin equal to $2\hbar$, and can be identified with the messenger of gravitational interactions, the graviton. Thus, string theory is a candidate for a unified theory of all fundamental interactions, including quantum gravity.
In this article, we discuss the theory of superstrings as consistent theories of quantum gravity. The aim is to provide a quick (mostly lexicographic and bibliographic) entry to some of the salient features of the subject for a nonspecialist audience. Our treatment is thus neither complete nor comprehensive – there exist for this several excellent expert books, in particular by Green et al. (1987) and by Polchinski (1998). An introductory textbook by Zwiebach (2004) is also highly recommended for beginners. Several other complementary reviews on various aspects of superstring theories are available on the internet (see the “Further reading” section); some more will be given as we proceed.
The Five Superstring Theories
Theories of relativistic extended objects are tightly constrained by anomalies, that is, quantum violations of classical symmetries. These arise because the classical trajectory of an extended p-dimensional object (or “p-brane”) is described by the embedding $X^\mu(\zeta^a)$, where $\zeta^a$, $a = 0, \ldots, p$, parametrize the brane world volume, and $X^\mu$, $\mu = 0, \ldots, D-1$, are coordinates of the target space. The quantum mechanics of a single p-brane is therefore a (p + 1)-dimensional quantum field theory, and as such suffers a priori from ultraviolet divergences and anomalies. The case p = 1 is special in that these problems can be exactly handled. The story for higher values of p is much more complicated, as will become apparent later on.
The theory of ordinary loops in space is called closed bosonic string theory. The classical trajectory of a bosonic string extremizes the Nambu–Goto action (proportional to the invariant area of the world sheet)
$$ S_{NG} = -\frac{1}{2\pi\alpha'} \int d^2\zeta \, \sqrt{-\det\left(G_{\mu\nu}\, \partial_a X^\mu \partial_b X^\nu\right)} $$ [1]
where $G_{\mu\nu}(X)$ is the target-space metric, and $\alpha'$ is the Regge slope (which is inversely proportional to the string tension and has dimensions of length squared). In flat spacetime, and for a conformal choice of world-sheet parameters $\zeta^\pm = \zeta^0 \pm \zeta^1$, the equations of motion read:
$$ \partial_+ \partial_- X^\mu = 0 \quad \text{and} \quad \eta_{\mu\nu}\, \partial_\pm X^\mu \partial_\pm X^\nu = 0 $$ [2]
with $\eta_{\mu\nu}$ the Minkowski metric. The $X^\mu$ are thus free two-dimensional fields, subject to quadratic phase-space constraints known as the Virasoro conditions. These can be solved consistently at the quantum level in the critical dimension D = 26. Otherwise, the symmetries of eqns [2] are anomalous: either Lorentz invariance is broken, or there is a conformal anomaly leading to unitarity problems. (For D < 26, unitary noncritical string theories in highly curved rather than in the originally flat background can be constructed.)
Even for D = 26, bosonic string theory is, however, sick because its lowest-lying state is a tachyon,
Figure 1 A four-particle and a four-string interaction.
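The first of eqns [2] is the two-dimensional wave equation, so any sum of a left-moving and a right-moving profile solves it. The sketch below checks this numerically for one coordinate; the sample profiles, the grid, and the function names are illustrative assumptions, and the Virasoro constraints (the second of eqns [2]) are not tested here.

```python
import math

def x_solution(z0, z1):
    """Left mover plus right mover: a function of z0+z1 plus a function of z0-z1."""
    zp, zm = z0 + z1, z0 - z1  # world-sheet parameters zeta^+ and zeta^-
    return math.sin(zp) + 0.5 * math.cos(2.0 * zm)

def x_not_solution(z0, z1):
    """A profile that is neither purely left- nor right-moving."""
    return math.sin(z0 + 2.0 * z1)

def wave_operator(f, z0, z1, h=1e-3):
    """Central-difference estimate of (d^2/dz0^2 - d^2/dz1^2) f, which equals 4 d_+ d_- f."""
    d00 = (f(z0 + h, z1) - 2.0 * f(z0, z1) + f(z0 - h, z1)) / h**2
    d11 = (f(z0, z1 + h) - 2.0 * f(z0, z1) + f(z0, z1 - h)) / h**2
    return d00 - d11

grid = [(0.1 * i, 0.2 * j) for i in range(10) for j in range(10)]
residual = max(abs(wave_operator(x_solution, z0, z1)) for z0, z1 in grid)
violation = abs(wave_operator(x_not_solution, 0.3, 0.4))

print("largest residual for the left+right mover:", residual)
print("residual for the counterexample:", violation)
```

The residual of the first profile is limited only by floating-point rounding, while the counterexample fails the equation by an amount of order one.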
The finiteness of string perturbation theory has been, strictly speaking, only established up to two loops – for a recent review see D’Hoker and Phong (2002). However, even though the technical problem is open and hard, the qualitative case for all-order finiteness is convincing. It can be illustrated with the torus diagram which makes a one-loop contribution to string amplitudes. The thin torus of Figure 2 could be traced either by a short, light string propagating (virtually) for a long time, or by a long, heavy string propagating for a short period of time. In conventional field theory, these two virtual trajectories would have made distinct contributions to the amplitude, one in the infrared and the second in the ultraviolet region. In string theory, on the other hand, they are related by a modular transformation (that exchanges $\zeta^0$ with $\zeta^1$) and must not, therefore, be counted twice. A similar kind of argument shows that all potential divergences of string theory are infrared – they are therefore kinematical (i.e., occur for special values of the external momenta), or else they signal an instability of the vacuum and should cancel if one expands around a stable ground state.
Figure 2 The same torus diagram viewed in two different channels.
The low-energy limit of the heterotic and type I string theories is N = 1 supergravity plus super Yang–Mills. In addition to the N = 1 graviton multiplet, the massless spectrum now also includes gauge bosons and their associated gauginos. The two-derivative effective action in the heterotic case reads:
$$ S_{het} = \frac{1}{2\kappa^2} \int d^{10}x \, \sqrt{-G}\, e^{-2\Phi} \Big[ R + 4\, \partial_\mu\Phi\, \partial^\mu\Phi + \frac{\kappa^2}{g_{YM}^2}\, \mathrm{tr}(F_{\mu\nu}F^{\mu\nu}) $$
$$ \qquad\qquad - \frac{1}{2} \Big| dB_2 - \frac{\kappa^2}{g_{YM}^2}\, \omega_3^{gauge} \Big|^2 \Big] + \text{fermions} $$ [7]
where $\omega_3^{gauge} = \mathrm{tr}(A\,dA + (2/3)A^3)$ is the Chern–Simons gauge 3-form. Again, supersymmetry fixes completely the above action – the only freedom is in the choice of the gauge group and of the Yang–Mills coupling $g_{YM}$. Thus, up to redefinitions of the fields, the type I theory has necessarily the same low-energy limit.
The D = 10 supergravity plus super Yang–Mills has a hexagon diagram that gives rise to gauge and gravitational anomalies, similar to the triangle anomaly in D = 4. It turns out that for the two special groups $E_8 \times E_8$ and SO(32), the structure of these anomalies is such that they can be canceled by a combination of local counter-terms. One of them is of the form $\int B_2 \wedge X_8(F, R)$, where $X_8$ is an 8-form quartic in the curvature and/or Yang–Mills field strength. The other is already present in the lower line of expression [7], with the replacement $\omega_3^{gauge} \to \omega_3^{gauge} - \omega_3^{Lorentz}$, where the second Chern–Simons form is built out of the spin connection. Note that these modifications of the effective action involve terms with more than two derivatives, and are not required by supersymmetry at the classical level. The discovery by Green and Schwarz that string theory produces precisely these terms (from integrating out the massive string modes) was called the “first superstring revolution.”
Figure 3 D-branes and open strings.
D-Branes
A large window into the nonperturbative structure of string theory has been opened by the discovery of D(irichlet)-branes, and of strong/weak-coupling duality symmetries. A Dp brane is a solitonic p-dimensional excitation, defined indirectly by the property that open string endpoints can attach to its world volume (see Figure 3). Stable Dp branes exist in the type IIA and type IIB theories for p even, respectively, odd, and in the type I theory for p = 1 and 5. They are charged under the R–R (p + 1)-form potential or, for p > 4, under its magnetic dual. Strictly speaking, only for $0 \le p \le 6$ do D-branes resemble regular solitons (the word stands for “solitary waves”). The D7 branes are more like
cosmic strings, the D8 branes are domain walls, while the D9 branes are spacetime filling. Indeed, type I string theory can be thought of as arising from type IIB through the introduction of an orientifold 9-plane (required for tadpole cancelation) and of 32 D9 branes.
The low-energy dynamics of a Dp brane is described by a supersymmetric abelian gauge theory, reduced from ten down to p + 1 dimensions. The gauge field multiplet includes 9 − p real scalars, plus gauginos in the spinor representation of the R-symmetry group SO(9 − p). These are precisely the massless states of an open string with endpoints moving freely on a hyperplane. The real scalar fields are Goldstone modes of the broken translation invariance, that is, they are the transverse coordinate fields $\vec{Y}(\zeta^a)$ of the D-brane. The bosonic part of the low-energy effective action is the sum of a Dirac–Born–Infeld (DBI) and a Chern–Simons (CS) like term:
$$ I_p = -T_p \int d^{p+1}\zeta \; e^{-\Phi} \sqrt{-\det\left(\hat{G}_{ab} + \mathcal{F}_{ab}\right)} \; - \; \rho_p \int \sum_n \hat{C}_n \wedge e^{\mathcal{F}} $$ [8]
where $\mathcal{F}_{ab} = \hat{B}_{ab} + 2\pi\alpha' F_{ab}$, hats denote pullbacks on the brane of bulk tensor fields (e.g., $\hat{G}_{ab} = G_{\mu\nu}\, \partial_a Y^\mu \partial_b Y^\nu$), $F_{ab}$ is the field strength of the world-volume gauge field, and in the CS term one is instructed to keep the (p + 1)-form of the expression under the integration sign. The constants $T_p$ and $\rho_p$ are the tension and charge density of the D-brane. As was the case for the effective supergravities, the above action receives curvature corrections that are higher order in the $\alpha'$ expansion. Note however that a class of higher-order terms has already been resummed in expression [8]. These involve arbitrary powers of $F_{ab}$, and are closely related (more precisely T-dual, see later) to relativistic effects which can be important even in the weak-acceleration limit. When referring to the D9 branes of the type I superstring, the action [8] includes the GS terms required to cancel the gauge anomaly.
The tension and charge density of a Dp brane can be extracted from its coupling to the (closed-string) graviton and R–R (p + 1)-form, with the result:
$$ T_p^2 = \rho_p^2 = \frac{\pi}{\kappa^2}\, (4\pi^2\alpha')^{3-p} $$ [9]
The equality of tension and charge follows from unbroken supersymmetry, and is also known as a Bogomol’nyi–Prasad–Sommerfeld (BPS) condition. It implies that two or more identical D-branes exert no net static force on each other, because their R–R repulsion cancels exactly their gravitational attraction. A nontrivial check of the result [9] comes from the Dirac quantization condition (generalized to extended objects by Nepomechie and Teitelboim). Indeed, a Dp brane and a D(6 − p)-brane are dual excitations, like electric and magnetic charges in four dimensions, so their couplings must obey
$$ 2\kappa^2 \rho_p\, \rho_{6-p} = 2\pi k \quad \text{where } k \in \mathbb{Z} $$ [10]
This ensures that the Dirac singularity of the long-range R–R fields of the branes does not lead to an observable Bohm–Aharonov phase. The couplings [9] obey this condition with k = 1, so that D-branes carry the smallest allowed R–R charges in the theory.
A simple but important observation is that open strings living on a collection of n identical D-branes have matrix-valued wave functions $\psi^{ij}$, where i, j = 1, ..., n label the possible endpoints of the string. The low-energy dynamics of the branes is thus described by a nonabelian gauge theory, with group U(n) if the open strings are oriented, and SO(n) or Sp(n) if they are not. We have already encountered such Chan–Paton factors in our discussion of the type I superstring. More generally, this simple property of D-branes has led to many insights on the geometric interpretation and engineering of gauge theories, which are reviewed in the articles Brane Construction of Gauge Theories and Gauge Theories from Strings. It has also placed on a firmer footing the idea of a brane world, according to which the fields and interactions of the standard model would be confined to a set of D-branes, while gravitons are free to propagate in the bulk (for reviews, see Brane Worlds and reference Lust (2004)). It has, finally, inspired the gauge/string theory or AdS/CFT correspondence (see AdS/CFT Correspondence and Aharony et al. (2000)) on which we will comment later.
Dualities and M Theory
One other key role of D-branes has been to provide evidence for the various nonperturbative duality conjectures. Dual descriptions of the same physics arise also in conventional field theory. A prime example is the Montonen–Olive duality of four-dimensional, N = 4 supersymmetric Yang–Mills, which is the low-energy theory describing the dynamics of a collection of D3 branes. The action
for the gauge field and six associated scalars $\Phi^I$ (all in the adjoint representation of the gauge group G) is
$$ S_{N=4} = \frac{1}{4g^2} \int d^4x \; \mathrm{tr}\Big( F_{\mu\nu}F^{\mu\nu} + 2\sum_I D_\mu\Phi^I D^\mu\Phi^I + 2\sum_{I<J} [\Phi^I, \Phi^J]^2 \Big) - \frac{\theta}{32\pi^2} \int d^4x \; \mathrm{tr}\, F_{\mu\nu}\, {}^*F^{\mu\nu} + \text{fermionic terms} $$ [11]
Consider for simplicity the case G = SU(2). The scalar potential has flat directions along which the six $\Phi^I$ commute. By an SO(6) R-symmetry rotation, we can set all but one of them to zero, and let $\langle \mathrm{tr}(\Phi^1\Phi^1) \rangle = v^2$ in the vacuum. In this “Coulomb phase” of the theory, a U(1) gauge multiplet stays massless, while the charged states become massive by the Higgs effect. The theory admits furthermore smooth magnetic-monopole and dyon solutions, and there is an elegant formula for their mass:
$$ M = v\, |n_{el} + \tau\, n_{mg}|, \quad \text{where } \tau = \frac{\theta}{2\pi} + \frac{4\pi i}{g^2} $$ [12]
and $n_{el}$ ($n_{mg}$) denotes the quantized electric (magnetic) charge. This is a BPS formula that receives no quantum corrections. It exhibits the SL(2, Z) covariance of the theory,
$$ \tau \to \frac{a\tau + b}{c\tau + d} \quad \text{and} \quad (n_{el}, n_{mg}) \to (n_{el}, n_{mg}) \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} $$ [13]
Here a, b, c, d are integers subject to the condition ad − bc = 1. Of special importance is the transformation $\tau \to -1/\tau$, which exchanges electric and magnetic charges and (at least for $\theta = 0$) the strong- with the weak-coupling regimes. For more details see the review by Harvey (1996).
The extension of these ideas to string theory can be illustrated with the strong/weak-coupling duality between the type I theory, and the $Spin(32)/Z_2$ heterotic string. Both have the same massless spectrum and low-energy action, whose form is dictated entirely by supersymmetry. The only difference lies in the relations between the string and supergravity parameters. Eliminating the latter, one finds
$$ \lambda_{het} = \frac{1}{2\lambda_I} \quad \text{and} \quad \alpha'_{het} = \sqrt{2}\, \lambda_I\, \alpha'_I $$ [14]
It is thus tempting to conjecture that the strongly coupled type I theory has a dual description as a weakly coupled heterotic string. These are, indeed, the only known ultraviolet completions of the theory [7]. Furthermore, for $\lambda_I \gg 1$, the D1 brane of the type I theory becomes light, and could be plausibly identified with the heterotic string. This conjecture has been tested successfully by comparing various supersymmetry-protected quantities (such as the tensions of BPS excitations and special higher-derivative terms in the effective action), which can be calculated exactly either semiclassically, or at a given order in the perturbative expansion. Testing the duality for nonprotected quantities is a hard and important problem, which looks currently out of reach.
The other three string theories have also well-motivated dual descriptions at strong coupling $\lambda$. The type IIB theory is believed to have an SL(2, Z) symmetry, similar to that of the N = 4 super Yang–Mills. (Note that $\lambda$ is a dynamical parameter, that changes with the vacuum expectation value of the dilaton $\langle\Phi\rangle$. Thus, dualities are discrete gauge symmetries of string theory.) The type IIA theory has a more surprising strong-coupling limit: it grows one extra dimension (of radius $R_{11} = \lambda\sqrt{\alpha'}$), and can be approximated at low energy by the maximal 11-dimensional supergravity of Cremmer, Julia, and Scherk. The latter is a very economical theory – its massless bosonic fields are only the graviton and a 3-form potential $A_3$. The bosonic part of the action reads
$$ S_{11D} = \frac{1}{2\kappa_{11}^2} \int d^{11}x \, \sqrt{-G}\, \Big( R - \frac{1}{2}|F_4|^2 \Big) - \frac{1}{12\kappa_{11}^2} \int A_3 \wedge F_4 \wedge F_4 $$ [15]
The electric and magnetic charges of the 3-form are a (fundamental?) membrane and a solitonic 5-brane. Standard Kaluza–Klein reduction on a circle maps $S_{11D}$ to the IIA supergravity action [6], where $G_{\mu\nu}$, $\Phi$, and $C_1$ descend from the 11-dimensional graviton, and $B_2$ and $C_3$ from the 3-form $A_3$. Furthermore, all BPS excitations of the type IIA string theory have a counterpart in 11 dimensions, as summarized in Table 1. Finally, if one compactifies the eleventh dimension on an interval (rather than a circle), one finds the conjectured strong-coupling limit of the $E_8 \times E_8$ heterotic string.
The web of duality relations can be extended by compactifying further to $D \le 9$ dimensions. Readers interested in more details should consult Polchinski (1998) or one of the many existing reviews of the subject (Townsend (1996), see also “Further Reading” section). In nine dimensions, in particular, the two type II theories, as well as the two heterotic superstrings, are pairwise T-dual. T-duality is a perturbative symmetry (thus firmly established, not
Table 1 BPS excitations of type IIA string theory, and their counterparts in M theory compactified on a circle of radius R11
From Bachas CP (1997) Lectures on D-branes. In: Olive DI and West PC (eds.) Duality and Supersymmetric Theories, Proceedings,
Easter School, Newton Institute, Euroconference, Cambridge, UK, April 7–18. With permission of Cambridge University Press.
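The statement that the D-brane charges of eqn [9] satisfy the Dirac condition [10] with k = 1 can be checked in a few lines. The sketch below is only an illustration: the function names and the numerical values chosen for $\kappa^2$ and $\alpha'$ are arbitrary assumptions, since the result should not depend on them.

```python
import math

def rho_squared(p, kappa2, alpha_prime):
    """Eqn [9]: T_p^2 = rho_p^2 = (pi / kappa^2) * (4 pi^2 alpha')^(3 - p)."""
    return (math.pi / kappa2) * (4.0 * math.pi**2 * alpha_prime) ** (3 - p)

def dirac_integer(p, kappa2=1.0, alpha_prime=1.0):
    """Solve eqn [10], 2 kappa^2 rho_p rho_{6-p} = 2 pi k, for k."""
    rho_p = math.sqrt(rho_squared(p, kappa2, alpha_prime))
    rho_dual = math.sqrt(rho_squared(6 - p, kappa2, alpha_prime))
    return 2.0 * kappa2 * rho_p * rho_dual / (2.0 * math.pi)

# Arbitrary (assumed) values of kappa^2 and alpha'; k should come out as 1 for every p.
for p in range(0, 7):
    print(p, dirac_integer(p, kappa2=0.37, alpha_prime=2.5))
```

The powers of $4\pi^2\alpha'$ cancel between a brane and its dual, which is why k is the same for every p in the range 0 to 6.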
charge conjugation symmetries are violated in weak interactions. Therefore, N > 1 theories may not be of immediate phenomenological relevance. However, they may be useful for constructing supersymmetric theories in more than four dimensions (more than three spatial dimensions). Chiral (effective) theory in four dimensions can be then obtained after compactification of extra dimensions. For instance, N = 2 theory in five dimensions $(x^\mu, y)$ compactified on a circle with reflection symmetry $y \to -y$ (orbifold compactification) gives chiral N = 1 theory in four dimensions.
Absence of quadratic divergences in supersymmetric theories is the main argument supporting the belief that fundamental interactions of elementary particles at energies not higher than O(1 TeV) should be described by an (approximately) N = 1 supersymmetric extension of the standard model (SM). Indeed, supersymmetric models elegantly solve the so-called hierarchy problem of the SM. At present, supersymmetry remains a theoretical hypothesis. No experimental evidence for it has been found yet (for experimental lower bounds on the masses of supersymmetric particles see Eidelman et al. (2004)). Supersymmetric models will be tested experimentally at the Large Hadron Collider at CERN (Geneva), after the completion of its construction in 2007. Supergravity theories may be physically relevant as an intermediate step in constructing phenomenologically viable models from superstring theories.
The essence of the hierarchy problem of the standard model (SM) – the successful $SU(3)_c \times SU(2)_L \times U(1)_Y$ gauge theory of interactions of quarks and leptons at energies up to about 100 GeV – is the following. By itself, the SM does not explain the value of the Fermi scale v of the electroweak $SU(2)_L \times U(1)_Y$ symmetry breaking ($v \sim G_F^{-1/2}$ where $G_F$ is the Fermi constant determined by the life time of the muon). Indeed, in the SM, the electroweak symmetry breaking is realized by an elementary Higgs field H (an SU(2) doublet) with a potential
$$ V = m^2 H^\dagger H + \frac{\lambda}{2} (H^\dagger H)^2 $$ [1]
where $m^2$ and $\lambda$ are free parameters of the SM. When $m^2 < 0$ is chosen, the minimum of the potential occurs when
$$ \langle H^\dagger H \rangle = -\frac{m^2}{\lambda} \equiv \frac{v^2}{2} $$ [2]
that is, the Higgs doublet acquires $SU(2) \times U(1)_Y$ breaking vacuum expectation value v which is just the Fermi scale. The masses of the intermediate vector bosons $W^\pm$ and $Z^0$ are proportional to v and depend also on the gauge couplings. Within the SM understood as a theory with the momentum cut-off $\Lambda_{SM}$, quantum corrections to the mass parameter $m^2$ in eqn [1] are quadratically divergent:
$$ \delta m^2 = \frac{3}{64\pi^2} \left( 3g_2^2 + g_1^2 + \lambda - 8y_t^2 \right) \Lambda_{SM}^2 + \cdots $$ [3]
Here, $g_1$, $g_2$, and $y_t$ are the gauge couplings of the groups $U(1)_Y$, $SU(2)_L$, and the top-quark Yukawa coupling, respectively. This means that if, above the energy scale $\Lambda_{SM}$, the SM is replaced by some more fundamental theory, in which there are particles of masses $M \gtrsim \Lambda_{SM}$, the quantum corrections to $m^2$ are quadratically dependent on the new mass scale M. For $M \gg v$, this is very unnatural even if the original parameter $m^2$ remains a free parameter of this underlying theory and particularly difficult to accept if in the underlying theory $m^2$ is fixed by some more fundamental considerations. If the SM was the correct theory up to, for example, the mass scale suggested by the see-saw mechanism for the neutrino masses, $\Lambda_{SM} \sim 10^{15}$ GeV
$$ |\delta m^2| \sim 10^{28}\ \mathrm{GeV}^2 \sim 10^{24}\, v^2 \, ! $$
Clearly, this excludes the possibility of understanding the magnitude of the Fermi scale v in any sensible way. Thus, for naturalness of the Higgs mechanism in the SM there should exist a new mass scale $M \gtrsim v$, say only one order of magnitude higher than v, and the theory describing the physics above that scale should be free of quadratic divergences. (Approximate) supersymmetry is at present the most elegant and theoretically most complete solution to the hierarchy problem of the SM.
Supersymmetric Extensions of the SM
In supersymmetry, the gauge fields $A^a_\mu$ are promoted to vector superfields $\hat{V}^a = (A^a_\mu, \lambda^a, D^a)$, one for each gauge symmetry group generator, where $\lambda^a$’s are Weyl fermions (called gauginos) and $D^a$’s are nondynamical auxiliary fields. A renormalizable supersymmetric gauge theory is completely defined (see, e.g., Sohnius (1985) and Wess and Bagger (1992)) by specifying the gauge group, the set of chiral supermultiplets $\hat{\Phi}_i = (\phi_i, \psi_i, F_i)$ representing matter fields, and the superpotential – a holomorphic polynomial function of at most third order in the chiral superfields which determines Yukawa couplings of the fermions $\psi_i$ and scalars $\phi_i$. Auxiliary fields $D^a$ and $F_i$ can be eliminated via their (algebraic) equations of motion.
The so-called minimal supersymmetric SM (MSSM) encodes the main features of any supersymmetric extension of the SM. Its gauge group is
142 Supersymmetric Particle Models
$SU(3) \times SU(2) \times U(1)$ – the same as in the SM – and the chiral superfields are associated to each of the SM quark and lepton fields. Thus, quarks and leptons get scalar spin zero superpartners, the squarks and sleptons, carrying the same quantum numbers as their corresponding fermions and the vector superfields provide spin 1/2 superpartners for the gauge fields – the gluinos, the winos, and the bino. The SM Higgs doublet with weak hypercharge Y = 1/2 becomes a scalar component of a chiral superfield $\hat{H}_u$ which contains in addition one doublet of Weyl fermions – the Higgsinos. The chiral anomaly cancelation condition requires that there be also a second Higgs chiral superfield $\hat{H}_d$ with Y = −1/2. Such a superfield is also required for giving masses to all flavors of quarks; because of the holomorphicity of the superpotential the same Higgs doublet cannot couple simultaneously to all quarks.
With the MSSM superfield content, the most general renormalizable superpotential consistent with the gauge symmetry has the form
$$ W = Y_u\, \hat{U}^c \hat{Q} \hat{H}_u + Y_d\, \hat{D}^c \hat{Q} \hat{H}_d + Y_l\, \hat{E}^c \hat{L} \hat{H}_d + \mu\, \hat{H}_d \hat{H}_u $$
$$ \qquad + \lambda_1\, \hat{D}^c \hat{Q} \hat{L} + \lambda_2\, \hat{E}^c \hat{L} \hat{L} + \lambda_3\, \hat{U}^c \hat{D}^c \hat{D}^c + \lambda_4\, \hat{L} \hat{H}_u $$ [4]
(flavor indices are suppressed) where the superfield $\hat{Q}$ contains the SU(2) quark doublet Q and its scalar superpartner $\tilde{Q}$ and similarly for the lepton doublet $\hat{L}$, quark singlets $\hat{U}$, $\hat{D}$, and lepton singlet $\hat{E}$ superfields. The three first terms in [4] give the SM-like Yukawa couplings of quarks and leptons to the Higgs fields together with Yukawa couplings of the corresponding superpartners. The fourth term has no SM analogy; it gives supersymmetric masses to the Higgs scalar and Higgsinos. The interactions in the second line do not conserve baryon and lepton numbers, respectively B and L, and should be forbidden (or strongly suppressed) by some additional symmetry of the theory as they would lead to rapid proton decay. A discrete symmetry, called R-parity $R = (-1)^{2S+3(B-L)}$, where S is the spin of the field, is an interesting possibility. R-parity acts differently on the different components of the superfields: it is even for all SM particles and odd for their superpartners. Its conservation implies that superpartners must appear in pairs in any interaction vertex. Thus, with R-parity imposed, the lightest supersymmetric particle is stable and it is an excellent candidate for the dark matter in the universe.
Supersymmetry cannot be an exact symmetry of nature because there do not exist elementary fermions and bosons degenerate in mass. The superpotential [4] does not break supersymmetry spontaneously but even if it did the elementary fermions and bosons would on average have equal masses (they would satisfy some mass sum rule) which is also contradicted by the experimental data. Therefore, in the MSSM, supersymmetry has to be broken explicitly but in such a way that the soft ultraviolet behavior remains intact. Remarkably, the supersymmetry breaking terms which can be added to the MSSM Lagrangian without reintroducing quadratic divergences make heavy just those fields which are opposite statistics superpartners of the SM gauge bosons and fermions. These so-called soft terms are:
$$ L_{soft} = -\frac{1}{2} M_{\tilde{G}}\, \tilde{G}^a \tilde{G}^a - \frac{1}{2} M_{\tilde{W}}\, \tilde{W}^a \tilde{W}^a - \frac{1}{2} M_{\tilde{B}}\, \tilde{B}\tilde{B} $$
$$ \qquad - m_Q^2 |\tilde{Q}|^2 - m_U^2 |\tilde{U}^c|^2 - m_D^2 |\tilde{D}^c|^2 - m_L^2 |\tilde{L}|^2 - m_E^2 |\tilde{E}^c|^2 $$
$$ \qquad - m_{H_u}^2 |H_u|^2 - m_{H_d}^2 |H_d|^2 - m_3^2 (H_u H_d + \mathrm{c.c.}) $$
$$ \qquad + A_U\, \tilde{U}^c \tilde{Q} H_u + A_D\, \tilde{D}^c \tilde{Q} H_d + A_E\, \tilde{E}^c \tilde{L} H_d $$ [5]
and yield gaugino (gluino $\tilde{G}$, wino $\tilde{W}$, and bino $\tilde{B}$) and scalar mass terms as well as explicit trilinear couplings between scalars (scalar mass terms and A-terms are 3 × 3 matrices in the flavor space). As a result, supersymmetry is broken in the mass spectra but not in the dimensionless couplings.
The origin of the soft supersymmetry breaking remains an open issue. Terms [5] are most probably remnants of the spontaneous supersymmetry breaking in the so-called “hidden” sector – a hypothetical set of fields that do not interact directly with the MSSM fields. For example, in the popular scenario, they interact with the MSSM fields only gravitationally and spontaneous supersymmetry breaking in the hidden sector is communicated to the MSSM sector by gravitational interactions giving rise to terms [5]. Several other mechanisms of supersymmetry breaking transmission have also been proposed (gauge mediation, anomaly mediation, etc.).
The mass parameters and A-terms in [5] are free parameters of the low-energy supersymmetric theory and, combined with the interactions like $Q\tilde{Q}\tilde{G}$ originating from supersymmetric kinetic terms, may be a new, troublesome, source of flavor changing neutral currents and of CP violation.
Higgs Sector of the MSSM
The MSSM Higgs potential reads
$$ V = m_1^2 |H_d|^2 + m_2^2 |H_u|^2 + m_3^2 (H_u H_d + \mathrm{c.c.}) + \frac{g_1^2 + g_2^2}{8} \left( |H_d|^2 - |H_u|^2 \right)^2 $$ [6]
Its quartic part is uniquely determined by the structure of the supersymmetric gauge theory. The parameters $m_1^2$, $m_2^2$, and $m_3^2$ are determined by
the soft supersymmetry breaking Higgs boson masses [5] and the parameter $\mu$ in [4]. The potential [6] is bounded from below for $m_1^2 + m_2^2 > 2|m_3^2|$, and for $m_1^2 m_2^2 - m_3^4 < 0$ it has the electroweak symmetry breaking minimum at $v_u = \langle H_u^0 \rangle \neq 0$, $v_d = \langle H_d^0 \rangle \neq 0$. The ratio $v_u/v_d \equiv \tan\beta$ is then phenomenologically a very important parameter.
Quantum corrections to the mass parameters in [6] are controlled by the mass scale $M_{soft}$ of the supersymmetry breaking terms [5]; at the one-loop level instead of [3], one finds
$$ \delta m_{1,2}^2 \sim \frac{1}{16\pi^2} \left( 3g_2^2 + g_1^2 - 12\, y_{b,t}^2 \right) M_{soft}^2 \ln\frac{\Lambda_{NEW}^2}{M_{soft}^2} $$ [7]
where $y_b$ and $y_t$ are the bottom- and top-quark Yukawa couplings, respectively, and $\Lambda_{NEW}$ is the scale at which the soft supersymmetry breaking terms are generated by the putative supersymmetry breaking transmission mechanism. In gravity mediation scenarios, $\Lambda_{NEW} \sim M_{Pl}$. In gauge mediation scenarios, $\Lambda_{NEW}$ is low but it is a new scale, introduced by hand.
In the softly broken supersymmetric models, the hierarchy problem is solved for $M_{soft} \lesssim O(10)\,v$. Moreover, eqn [7] shows that via quantum corrections the large top-quark Yukawa coupling $y_t$ drives the mass parameter $m_2^2$ to a negative value, inducing the electroweak symmetry breaking. This means that in supersymmetric models the electroweak scale is calculable in terms of the known coupling constants and the (unknown) scales $M_{soft}$ and cutoff scale $\Lambda_{NEW}$ to the MSSM. If $M_{soft} \lesssim O(10)\,v$, the correct electroweak scale is obtained for $\Lambda_{NEW} \sim M_{GUT}$. This nicely fits with unification of the gauge couplings.
In supersymmetric models, the quartic couplings in the Higgs potential are restricted. This typically leads to a strong upper bound on the mass of the lightest Higgs particle. In the minimal model with the potential [6], at the tree level
$$ M_{Higgs} < M_Z \approx 91\ \mathrm{GeV} $$ [8]
This bound is substantially modified by quantum corrections. They depend quadratically on the top-quark mass and logarithmically on the stop mass scale $M_{\tilde{t}} \sim M_{soft}$:
$$ M_{Higgs}^2 < \lambda v^2 $$ [9]
where $\lambda$ is given by
$$ \lambda = \frac{1}{8} \left( g_2^2 + g_1^2 \right) \cos^2 2\beta + \Delta\lambda \quad \text{with} \quad \Delta\lambda = \frac{3 g_2^2\, m_t^4}{8\pi^2 v^2 M_W^2} \ln\frac{M_{\tilde{t}}^2}{m_t^2} $$ [10]
For $M_{\tilde{t}} < 1$ TeV, $M_{Higgs} < 130$ GeV.
The minimal-model bound on the Higgs mass can be relaxed in models with extended Higgs sector. For instance, if an additional gauge group singlet chiral superfield couples to the Higgs doublets, the Higgs self-coupling $\lambda$ in [9] receives additional contributions. Explicit calculations show that in such and other models, with $M_{soft} \lesssim 1$ TeV, the bound on the Higgs mass cannot be raised above 150 GeV if one wants to preserve perturbative gauge coupling unification.
Supersymmetric Grand Unified Theories
There are two striking aspects of the matter spectrum in the SM. One is the chiral anomalies cancelation (Weinberg 1996–2000, Pokorski 2000), which is necessary for a unitary (and renormalizable) theory, and occurs thanks to certain conspiracy between quarks and leptons suggesting a deeper link between them. The second one is that the spectrum fits into simple representations of the SU(5) and SO(10) groups (Ross 1985). Indeed, each generation of the SM matter fills $\bar{5} + 10 + 1$ (if the right-handed neutrino is included into the spectrum) representations of SU(5) and for SO(10), $16 = \bar{5} + 10 + 1$. The assignment of fermions to the SU(5) or SO(10) representations fixes the normalization of the $U(1)_Y$ generator. Both facts suggest unification of strong and electroweak elementary forces in a grand unified theory with some bigger gauge symmetry group. Such unification implies that all the SM gauge forces become of equal strength at some unification scale. Their strength is measured by the running gauge couplings $\alpha_i = g_i^2/4\pi$, i = 1, 2, 3, of the three group factors $SU(3)_c \times SU(2)_L \times U(1)_Y$. The energy scale dependence of $\alpha_i$ is governed by the renormalization group equations. In the first nontrivial approximation, they read:
$$ \frac{1}{\alpha_i(Q)} = \frac{1}{\alpha_i(M_Z)} - \frac{b_0^{(i)}}{2\pi} \ln\frac{Q}{M_Z} $$ [11]
Here, $1/\alpha_i(M_Z) = (58.98 \pm 0.04,\ 29.57 \pm 0.03,\ 8.40 \pm 0.14)$ are the experimental values of the gauge couplings at the Fermi scale and $b_0^{(i)}$ are the coefficients which depend on the matter content of the theory. They are
$$ b_0 = \left( \tfrac{1}{10} + \tfrac{4}{3} N_g,\ -\tfrac{43}{6} + \tfrac{4}{3} N_g,\ -11 + \tfrac{4}{3} N_g \right) $$
in the SM and
$$ b_0 = \left( \tfrac{3}{5} + 2 N_g,\ -5 + 2 N_g,\ -9 + 2 N_g \right) $$
in the MSSM, where $N_g$ is the number of fermion generations. In the SM, the running gauge couplings approach each other at high scale of order $10^{13}$ GeV but never unify.
be consistent with but close to the present experimental limits.
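Eqn [11] is simple enough to evaluate directly. The following is a minimal sketch, assuming one-loop running all the way up from $M_Z$ with a single set of coefficients (so the MSSM thresholds near $M_{soft}$ are ignored), $M_Z = 91.19$ GeV, $N_g = 3$, and the central values of $1/\alpha_i(M_Z)$ quoted in the text; the function names are illustrative. It finds the scale at which $\alpha_1$ and $\alpha_2$ meet and measures how far $\alpha_3$ is from them at that scale.

```python
import math

M_Z = 91.19  # GeV, assumed Z-boson mass
INV_ALPHA_MZ = (58.98, 29.57, 8.40)  # central values of 1/alpha_i(M_Z) quoted in the text

def b0_sm(n_g=3):
    """One-loop coefficients in the SM for n_g fermion generations."""
    return (1 / 10 + 4 / 3 * n_g, -43 / 6 + 4 / 3 * n_g, -11 + 4 / 3 * n_g)

def b0_mssm(n_g=3):
    """One-loop coefficients in the MSSM for n_g fermion generations."""
    return (3 / 5 + 2 * n_g, -5 + 2 * n_g, -9 + 2 * n_g)

def inv_alpha(q, b0):
    """Eqn [11]: 1/alpha_i(Q) = 1/alpha_i(M_Z) - b0_i / (2 pi) * ln(Q / M_Z)."""
    t = math.log(q / M_Z)
    return tuple(a - b / (2 * math.pi) * t for a, b in zip(INV_ALPHA_MZ, b0))

def crossing_scale(b0, i=0, j=1):
    """Scale Q (GeV) at which couplings i and j become equal under eqn [11]."""
    t = 2 * math.pi * (INV_ALPHA_MZ[i] - INV_ALPHA_MZ[j]) / (b0[i] - b0[j])
    return M_Z * math.exp(t)

def mismatch(b0):
    """|1/alpha_3 - 1/alpha_1| evaluated where alpha_1 and alpha_2 cross."""
    a1, _, a3 = inv_alpha(crossing_scale(b0), b0)
    return abs(a3 - a1)

for name, b0 in (("SM", b0_sm()), ("MSSM", b0_mssm())):
    print(f"{name}: alpha_1 = alpha_2 at Q ~ {crossing_scale(b0):.2e} GeV, "
          f"mismatch in 1/alpha_3 = {mismatch(b0):.2f}")
```

Under these assumptions the SM couplings cross near $10^{13}$ GeV with $1/\alpha_3$ off by several units, while the MSSM coefficients bring all three within a fraction of a unit near $10^{16}$ GeV.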
In the MSSM, with sparticle spectrum characterized by $M_{soft} \sim 1$ TeV and for the initial Fermi scale values given above, the three gauge couplings unify with high precision at the scale $M_{GUT} \sim 10^{16}$ GeV. Therefore, the MSSM can be embedded into supersymmetric grand unified theories with no hierarchy problem for the Fermi scale (it is stable with respect to radiative corrections generated by particles with masses $\sim M_{GUT}$) and no conflict with the measured values of the gauge couplings.
In the SM, the baryon number is (perturbatively) conserved since there are no renormalizable couplings violating this symmetry. Experimental search for proton decay, for example, $p \to e^+\pi^0$, $p \to K^+\bar{\nu}$, is one of the most fundamental tests for particle physics. The present limit on the proton life time is $\tau_p > 10^{33}$ yr. In grand unified theories, baryon number conservation is violated by interactions mediated by the heavy gauge bosons corresponding to the enlarged gauge symmetry (e.g., SU(5)), spontaneously broken at $M_{GUT}$ to the SM gauge symmetry. Such interactions manifest themselves at low energy as additional, nonrenormalizable interactions added to the SM Lagrangian. Proton decay is then induced by the set of dimension-6 operators of the form
$$ O_i^{(6)} = \frac{c_i^{(6)}}{M_{(6)}^2}\, qqql $$ [12]
where q, l denote quarks and leptons, respectively. For $c_i^{(6)} \sim \alpha_{GUT} \sim 1/25$, the experimental limit on $\tau_p$ requires $M_{(6)} \gtrsim 10^{15}$ GeV, consistent with $M_{GUT} = 10^{16}$ GeV in supersymmetric GUTs. However, in supersymmetric GUTs, there is still another, genuinely supersymmetric, source of contributions to the proton decay amplitudes. These are the dimension-5 operators
$$ O_i^{(5)} = \frac{c_i^{(5)}}{M_{(5)}}\, qq\tilde{q}\tilde{l} $$ [13]
where $\tilde{q}$, $\tilde{l}$ denote squarks and sleptons, respectively. Such operators originate from the exchange of the color triplet scalars present in the Higgs boson GUT multiplets, with $M_{(5)} \sim M_{GUT} \sim 10^{16}$ GeV, and $c^{(5)} \gtrsim 10^{-7}$ is given by the Yukawa couplings.
Summary
Supersymmetry is distinct in several very important points from all other proposed solutions to the hierarchy problem. First of all, it provides a general theoretical framework which allows one to address many physical questions. Supersymmetric models, like the MSSM or its simple extensions, satisfy a very important criterion of “perturbative calculability.” In particular, they are easily consistent with the precision electroweak data. The SM is their low-energy approximation in the sense of the Appelquist–Carazzone decoupling, so most of the successful structure of the SM is built into supersymmetric models. The quadratically divergent quantum corrections to the Higgs mass parameter (the origin of the hierarchy problem in the SM) are absent in any order of perturbation theory. Therefore, the cutoff to a supersymmetric theory can be as high as the Planck scale, and the “small” scale of the electroweak breaking is still natural. Supersymmetry is not only consistent with grand unification of elementary forces but, in fact, makes it very successful. And, finally, supersymmetry is needed for string theory.
However, there are also some problems to be solved: the hierarchy problem of the electroweak scale is solved but the origin of the soft supersymmetry breaking scale $M_{soft}$ remains an open question: spontaneous supersymmetry breaking and its transmission to the visible sector is a difficult problem and a fully satisfactory mechanism which would yield $M_{soft}$ hierarchically smaller than the Planck (string) scale has not yet been found. On the phenomenological side, there are new potential sources of flavor-changing neutral current transitions and of CP violation, and baryon and lepton numbers are not automatically conserved by the renormalizable couplings. But even those problems can at least be discussed in a concrete quantitative way.
See also: Brane Construction of Gauge Theories; Perturbation Theory and its Techniques; Seiberg–Witten Theory; Standard Model of Particle Physics; Supergravity; Supermanifolds.
Inserted into diagrams with gaugino exchanges they Further Reading
give rise to dimension-6 operators of the form [12]. Eidelman S et al. (The Particle Data Group) (2004) Review of
One then gets c(6) = GUT c(5) , M2(6) = M(5) MSUSY . particle physics. Physics Letters B 592: 1.
Given various uncertainties, for example, in the Kane GL (ed.) (1998a) Perspectives in Supersymmetry. Singapore:
World Scientific.
unknown squark, gaugino, and heavy Higgs boson
Kane GL (ed.) (1998b) Perspectives on Higgs Physics II.
mass spectrum, such contributions in supersym- Singapore: World Scientific.
metric GUT models predict the proton life time to Nilles H-P (1984) Physics Reports C 110: 1.
Supersymmetric Quantum Mechanics 145
Pokorski S (2000) Gauge Field Theories, Cambridge Monographs Weinberg S (1996–2000) The Quantum Theory of Fields,
on Mathematical Physics, 2nd edn. Cambridge: Cambridge vols. I–III. Cambridge: Cambridge University Press.
University Press. Wess J and Bagger J (1992) Supersymmetry and Supergravity,
Ross GG (1985) Grand Unified Theories. Redwood City, CA: Princeton Series in Physics, 2nd edn. Princeton: Princeton
Addison-Wesley. University Press.
Sohnius M-F (1985) Physics Reports C 128: 39.
where $\varepsilon_0 = \varepsilon_b + \varepsilon_f$. The ground state of this system is the state annihilated by both $b$ and $f$:

$$ b\,|0,0\rangle = f\,|0,0\rangle = 0 \qquad [10] $$

The full set of energy eigenstates of the system is constructed by taking

$$ |n_b, n_f\rangle = \frac{1}{\sqrt{n_b!}}\,(b^\dagger)^{n_b} (f^\dagger)^{n_f}\,|0,0\rangle, \qquad n_b = (0, 1, 2, \ldots),\quad n_f = (0, 1) \qquad [11] $$

whilst

$$ Q^2 = Q^{\dagger 2} = 0 \qquad [16] $$

The above relations suffice to guarantee that the supercharges $(Q, Q^\dagger)$ are conserved:

$$ [Q, H] = [Q^\dagger, H] = 0 \qquad [17] $$

a result re-expressing the degeneracy between states with the same $n$ but different $n_b$ and $n_f$. The real form of the supercharges is
and the states by provides the integrand for the path-integral repre-
sentation of the evolution operator in the quantum
j0i ! 1; j1i ! ½25
theory. The proof is not given here; the reader is
Then an arbitrary state takes the form of a linear referred to the literature. In passing, note that as the
superposition anticommuting variables (, ) are taken to be
dimensionless, one actually should identify the
ji ¼ 0 j0i þ 1 j1i ! ðÞ ¼ 0 þ 1 ½26 in the
momentum conjugate to with = ih;
and the standard positive-semidefinite inner product quantum theory, this is replaced by the operator
on the state space is represented on the wave ih@=@.
functions by the double integral
Z
hji ¼ d de ðÞðÞ
¼ 0 0 þ 1 1 ½27
Classical Supersymmetry
By construction, f y = and f = @=@ are conjugates The classical action for the supersymmetric oscilla-
with respect to this inner product: tor with bosonic amplitude x and fermionic ampli-
Z Z
tude is
@
d d e ðÞðÞ ¼ d e
ðÞðÞ ½28 Z 2
@ 1 2 ! 2 _
S¼ dt x_ x þ i þ ! ½35
1 2 2
The real (self-conjugate) forms of the fermion
operators are, therefore, defined by As inferred from the quantum theory, it is a
combination of a linear harmonic oscillator and a
@ @
1 ¼ þ ; 2 ¼ i ½29 fermionic
pffiffiffi oscillator of the same frequency. A factor
@ @ h is also absorbed in and ; equivalently, we can
which satisfy the Pauli–Dirac anticommutation use natural units in which h = 1. In the following,
relations we use this convention.
The action [35] is invariant under infinitesimal
i j þ j i ¼ 2ij ½30 symmetry transformations
By taking the product, we obtain
x ¼ i þ
@ ½36
3 ¼ i1 2 ¼ 1 2 ¼ 1 2Nf ¼ ðx_ þ i!xÞ; ¼ ðx_ i!xÞ
@
1 with (
, ) Grassmann-odd parameters. The Noether
, N f ¼ ð 1 3 Þ ½31 theorem then implies that there are conserved
2
fermionic charges
Thus, we may think of the wave functions as two-
Q ¼ ðp i!xÞ; ¼ ðp þ i!xÞ
Q ½37
component spinors, the components being labeled
either by the eigenvalues of the spin operator 3 , or with the momentum defined by p = ẋ. The other
equivalently by the fermion number Nf , which is a conserved quantity is the energy, represented by the
projection operator on the states with negative spin. Hamiltonian
The action of the Hamiltonian on a wave function
() is represented by the integral 1 2
H¼ p þ !2 x2 þ ! ½38
Z 2
0
½HðÞ ¼ d0 deð Þ Hð; Þð
0
Þ ½32 The canonical phase-space formulation is obtained
by defining brackets of two functions (A, B) on the
is the ordered symbol of the Hamiltonian:
where H(, ) by
phase space (x, p ; , )
Hð; Þ h!
¼ "f þ ½33 @A @B @A @B
fA; Bg ¼
This expression is to be considered as the classical @x @p @p @x
Moreover, the charges Q and Q satisfy the bracket states jE, nf i by the energy E and the fermion
algebra number nf = (0, 1). Moreover, all states of positive
energy are degenerate with respect to fermion
¼ 2iH;
Q; Q fQ; H g ¼ Q; H ¼ 0 ½41
number, as they form pairs related by
Thus, the action [35] is the classical counterpart of supersymmetry:
the quantum theory [9]–[17] in the correspondence pffiffiffiffiffiffi pffiffiffiffiffiffi
QjE; 0i ¼ 2E jE; 1i; QjE; 1i ¼ 2E jE; 0i ½47
limit i{A, B} ! [A, B] = AB BA. For these the-
ories, supersymmetry is rooted in the classical Only ground states with E0 = 0 can occur as singlets
transformations [36]. under supersymmetry. The existence of such a
ground state with fermion number nf amounts to
the existence of a state j0, nf i satisfying
Supersymmetric Quantum Mechanics
Ay f j0; nf i ¼ Af y j0; nf i ¼ 0 ½48
The construction for the supersymmetric oscillator
can be generalized to other dynamical systems in The corresponding wave functions are of the form
two ways. First, the nature of the interactions as
represented by the potential can be modified. j0; 0i ! 0 ðx; Þ ¼ ðxÞ
½49
Second, the number of degrees of freedom can be j0; 1i ! 1 ðx; Þ ¼ þ ðxÞ
varied. This section presents a generalization of the
where (x) are solutions of the equations
supersymmetric oscillator to anharmonic interac-
tions, obtained by modification of the supercharges A ¼ 0; Ay þ ¼0 ½50
[37] with a general function (x) as follows:
These functions are formally given by the
Q ¼ ðp iðxÞÞ; ¼ ðp þ iðxÞÞ
Q ½42 expressions
Rx
The brackets [39] imply the supersymmetry algebra ðyÞ dy
ðxÞ ¼ C e 0 ½51
[41] with the Hamiltonian
For a zero-energy ground state to exist, one of these
i
functions must be normalizable. For example, if
H¼ Q; Q
2 (x) is a polynomial of positive odd degree 2k 1,
1 1 1 then, depending on the sign of the coefficient of
¼ p2 þ 2 ðxÞ þ 0 ðxÞ
½43
2 2 2 x2k1 , one of the exponents is bounded, approaching
zero for x ! 1, and as a result becomes square
In quantum mechanics, the supercharges become
integrable.
operators Q and Qy upon reinterpretation of (x, p)
If no normalizable wave functions of the form
as canonically conjugate operators, and the replace-
[51] exist, the ground state cannot have zero energy
ment ! f y and ! f ; this procedure involves no
(E0 > 0) and all states necessarily belong to
ordering ambiguity. The Hamiltonian operator
superdoublets.
defined by the anticommutator of Q and Qy then
takes the operator form associated with [43]. With
the identification
Spinning-Particle Mechanics
1 1
A ¼ pffiffiffi ðp iÞ; Ay ¼ pffiffiffi ðp þ iÞ ½44 Minimal supersymmetric classical or quantum
2 2
mechanics requires equal number of bosonic and
and making use of the (anti)commutation relations fermionic coordinates in configuration space (xi , i ),
rather than equal number of bosonic and fermionic
AAy Ay A ¼ 0 ðxÞ; ff y þ f y f ¼ 1 ½45 degrees of freedom in phase space. Specifically,
this Hamilton operator can be written in normal- minimal free supersymmetric particle mechanics in
ordered form as n dimensions is described by the classical
Lagrangian
H ¼ 12 QQy þ Qy Q ¼ Ay A þ 0 ðxÞf y f ½46
1 i
L ¼ x_ 2i þ i _i ; i ¼ 1; . . . ; n ½52
It is positive-semidefinite by construction. All results 2 2
for the supersymmetric oscillator are reproduced It is invariant modulo a total time derivative under
upon taking (x) = !x. infinitesimal supersymmetry transformations
As the Hamiltonian commutes with the fermion
number operator Nf , we can label all stationary xi ¼ ii ; i ¼ x_ i ½53
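The pairing of positive-energy states implied by the Hamiltonian [46] can be checked numerically. The sketch below is only an illustration under stated assumptions: the function entering the supercharges [42] is passed in under the hypothetical name `phi` (with derivative `dphi`), units are chosen with ħ = 1, and the operators A†A and AA† are discretized on a finite grid with a second-order difference for p². The grid size, cutoff `x_max`, and the function name `partner_spectra` are illustrative choices, not part of the original text.

```python
import numpy as np

def partner_spectra(phi, dphi, x_max=10.0, n_points=801, n_levels=4):
    """Lowest eigenvalues of H = A†A + phi'(x) f†f for n_f = 0 and n_f = 1.

    phi, dphi: callables giving the function in the supercharges and its
    derivative (hypothetical names; hbar = 1 is assumed).
    """
    x = np.linspace(-x_max, x_max, n_points)
    dx = x[1] - x[0]
    # Second-order central difference for d^2/dx^2 with Dirichlet boundaries.
    lap = (np.diag(np.full(n_points, -2.0))
           + np.diag(np.ones(n_points - 1), 1)
           + np.diag(np.ones(n_points - 1), -1)) / dx**2
    kinetic = -0.5 * lap
    # n_f = 0 sector: A†A = (p^2 + phi^2 - phi') / 2
    h0 = kinetic + np.diag(0.5 * (phi(x) ** 2 - dphi(x)))
    # n_f = 1 sector: AA† = (p^2 + phi^2 + phi') / 2
    h1 = kinetic + np.diag(0.5 * (phi(x) ** 2 + dphi(x)))
    return np.linalg.eigvalsh(h0)[:n_levels], np.linalg.eigvalsh(h1)[:n_levels]

# Oscillator case phi(x) = omega * x with omega = 1.
omega = 1.0
e0, e1 = partner_spectra(lambda x: omega * x,
                         lambda x: omega * np.ones_like(x))
print(np.round(e0, 3))  # n_f = 0 levels
print(np.round(e1, 3))  # n_f = 1 levels
```

For the oscillator choice the n_f = 0 sector should contain a level close to zero, while every higher level should reappear in the n_f = 1 sector up to discretization error, which is the superdoublet structure described in the text.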
The canonical phase-space formulation is phrased field described by a vector potential Ai (x). An
in terms of the free-particle momentum and extension of the free-particle action [52], invariant
Hamiltonian under the same supersymmetry transformations
[53], is
pi ¼ x_ i ; H ¼ 12p2i ½54
Z
1 i iq
and the brackets S¼ dt x_ 2i þ i _i þ qAi ðxÞx_ i Fij ðxÞi j
2 2 2
@A @B @A @B @A @B
fA; Bg ¼ þ ið1ÞA ½55 ½63
@xi @pi @pi @xi @i @i
where Fij = ri Aj rj Ai is the field strength. The
The supersymmetry transformations are generated
canonical momentum in this model is
by the supercharge
pi ¼ x_ i þ qAi ðxÞ ½64
Q ¼ pi i ; A ¼ ifQ; Ag ½56
with the supersymmetry algebra with the result that the canonical expressions for the
Hamiltonian and supercharge become
ifQ; Qg ¼ 2H; fQ; H g ¼ 0 ½57
H ¼ 12ðpi qAi ðxÞÞ2 ; Q ¼ ðpi qAi ðxÞÞi ½65
An important quantity in these models is the bilinear
(Grassmann-even) antisymmetric tensor In the quantum theory, these constants of motion
become the covariant Laplacian and Dirac operator
ij ¼ ii j ½58
in an external vector potential Ai (x). Observe that
For a free particle, it is a set of constants of motion supersymmetry requires the spin to couple to the
forming a representation of so(n), the Lie algebra of magnetic field with gyromagnetic ratio g = 2. Expli-
n-dimensional rotations: citly, the equation of motion for can be trans-
formed into an equation for the spin precession:
ij ; kl ¼ jk il jl ik ik jl þ il jk ½59
_i ¼ qFij j ) _ ij ¼ qðFik kj ik Fkj Þ ½66
Therefore, the physical interpretation of ij is that it
represents the particle spin. For this reason, super- In three dimensions, this is equivalent to an equation
symmetric particle mechanics is often called spin- in terms of axial vectors:
ning-particle mechanics.
Quantum mechanics of the spinning particle has Fij ¼ "ijk Bk ; ij ¼ "ijk sk ) s_ ¼ qB s ½67
the same algebraic structure, with (xi , pi ) the
showing that the precession rate of s is given by
standard canonically conjugate operators, and the
twice the Larmor frequency.
fermionic coordinates $\psi_i$ represented by the generators of a Clifford algebra; the irreducible representation
in terms of Pauli–Dirac matrices of dimension
2[n=2] 2[n=2] is Extended Supersymmetry
1 It is possible to construct theories with more
i ! pffiffiffi i ; i j þ j i ¼ 2ij ½60
2 supersymmetries by associating with every bosonic
coordinate several fermionic coordinates. An exam-
It follows that the wave functions have 2[n=2] ple is the supersymmetric oscillator and its general-
components, describing different polarization states. izations considered earlier, which has equal number
Furthermore, in minimal supersymmetric quantum of bosonic and fermionic degrees of freedom in
mechanics, the supersymmetry operator is repre- phase space, rather than equal number of bosonic
sented by the Dirac operator: and fermionic coordinates in configuration space.
1 The classical phase space, spanned by variables
Q ! pffiffiffi p; ð pÞ2 ¼ p2i ¼ 2H ½61 (xi , pi ; i , i ) with i = 1, . . . , n, then has double the
2
number of fermionic variables compared to the
Hence, the stationary states of the system solve the minimal supersymmetric particle models. Such mod-
Dirac equation els can be constructed for systems with an
pffiffiffiffiffiffi n-dimensional bosonic configuration space. Their
p ¼ 2E ½62
supercharges take the form
The models can, without difficulty, be extended to
Q ¼ ðpi ii ðxÞÞi ; ¼ ðpi þ ii ðxÞÞi
Q
include interactions with external fields. As an ½68
example, we consider the coupling to a magnetic x ¼ ðx1 ; . . . ; xn Þ
whilst the Hamiltonian becomes Alternatively, we can represent the wave functions
as spinors of dimension 2n , on which the fermion
H ¼12p2i þ 122i ðxÞ operators fiy and fi act as a 2n -dimensional matrix
þ 14ðrj i þ ri j Þ i j i j ½69 representation of the Clifford algebra with genera-
tors a , a = 1, . . . , 2n, defined by
The supercharges are conserved if the curl of i (x)
vanishes: ri j rj i = 0. It follows that at least i ¼ fi þ fiy ; iþr ¼ i fi fiy ½79
locally there exists a single function W(x) such that
These operators indeed satisfy the anticommutation
i ðxÞ ¼ ri WðxÞ ½70
rule
W(x) is called the superpotential. Defining the
a b þ b a ¼ 2ab ½80
operators
n
Thus, the wave functions have 2 components, as
Ai ¼ pi ii ðxÞ; Ayi ¼ pi þ ii ðxÞ compared to the 2[n=2] polarization states of the
½71
Ai Ayj Ayj Ai ¼ ri j þ rj i minimal models.
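The statement that n pairs of fermion operators generate a Clifford algebra with 2n generators acting on a 2^n-dimensional space, as in [79] and [80], can be verified with explicit matrices. The sketch below assumes a Jordan–Wigner construction of the operators f_i, which is one standard choice and not necessarily the representation used by the author; the function names are hypothetical.

```python
import numpy as np
from functools import reduce

def fermion_operators(n):
    """Jordan-Wigner matrices f_1..f_n on a 2^n-dimensional space (assumed construction)."""
    lower = np.array([[0, 1], [0, 0]], dtype=complex)  # maps occupied -> empty
    sign = np.diag([1.0, -1.0]).astype(complex)        # (-1)^N on one mode
    ident = np.eye(2, dtype=complex)
    ops = []
    for i in range(n):
        # Sign strings on earlier modes make different f_i anticommute.
        factors = [sign] * i + [lower] + [ident] * (n - i - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def clifford_generators(n):
    """2n Hermitian generators: f_i + f_i† and i (f_i - f_i†)."""
    first, second = [], []
    for f in fermion_operators(n):
        fd = f.conj().T
        first.append(f + fd)
        second.append(1j * (f - fd))
    return first + second

gens = clifford_generators(3)
dim = gens[0].shape[0]
# Check the anticommutation rule for every pair of generators.
ok = all(
    np.allclose(ga @ gb + gb @ ga, 2.0 * (a == b) * np.eye(dim))
    for a, ga in enumerate(gens)
    for b, gb in enumerate(gens)
)
print(len(gens), dim, ok)
```

With n = 3 this produces six generators acting on an eight-dimensional space, matching the count of 2^n wave-function components quoted in the text.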
partner of complementary fermion number; these states can never get a nonzero energy under changes in the parameters of the potential, as long as the changes respect supersymmetry. Such systems, therefore, necessarily possess exact zero-energy states which are invariant under all supersymmetries.

Deformations of the potential respecting supersymmetry are those obtained by changing the parameters in the superpotential. The usefulness of this concept is, therefore, that the index for models with complicated superpotentials can be computed by comparing them with models with simple superpotentials having similar topological properties.

Counting the number of states is not always a simple procedure, in particular when the spectrum includes continuum states. Therefore, in practice one often needs a regularization procedure, by taking the trace over the full state space of the exponentially damped quantity

$$ I(\beta) = \mathrm{tr}\,(-1)^{N_f}\, e^{-\beta H} \qquad [84] $$

and taking the limit $\beta \to 0$. The quantity [84] can be computed in terms of a path integral with periodic boundary conditions for the fermionic degrees of freedom.

Finally, as the wave function representation of supersymmetric quantum mechanics [82] links the Witten index to the space of zero modes of a Dirac operator, in particular cases it can be used to describe topological aspects of sigma models and gauge theories, and related mathematical quantities such as the Atiyah–Singer index.

More details and references to the original literature can be found in the reviews listed in the Further Reading section.

See also: Path-Integrals in Non Commutative Geometry; Supermanifolds.

Further Reading

Cooper F, Khare A, and Sukhatme U (1995) Supersymmetry and quantum mechanics. Physics Reports 251: 267.
De Witt BS (1984, 1992) Supermanifolds. Cambridge: Cambridge University Press.
Shifman MA (1999) ITEP Lectures on Particle Physics and Field Theory, vol. 1, ch. 4. Singapore: World Scientific.
van Holten JW (1996) D = 1 Supergravity and spinning particles. In: Jancewicz B and Sobczyk J (eds.) From Field Theory to Quantum Groups, p. 173. Singapore: World Scientific.
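As a small illustration of the regularized index [84], the following sketch evaluates the damped trace for the supersymmetric oscillator. It assumes the spectrum E = ω(n_b + n_f) with zero ground-state energy and ħ = 1; the function name `regularized_index` and the cutoff `n_max` on the boson number are hypothetical and serve only to make the sum finite.

```python
import math

def regularized_index(beta, omega=1.0, n_max=2000):
    """Truncated trace of (-1)^{N_f} exp(-beta H) over oscillator states |n_b, n_f>.

    Assumes E = omega * (n_b + n_f); n_max is an illustrative cutoff on n_b.
    """
    total = 0.0
    for n_b in range(n_max):
        for n_f in (0, 1):
            energy = omega * (n_b + n_f)
            total += (-1) ** n_f * math.exp(-beta * energy)
    return total

for beta in (2.0, 0.5, 0.1):
    print(beta, regularized_index(beta))
```

All paired states of positive energy cancel, so the sum reduces to the contribution of the single zero-energy state and does not depend on the damping parameter. The cutoff leaves one unpaired state at the top of the truncated spectrum, whose weight is negligible once beta * omega * n_max is large.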
by some probability distribution on Herm(V), the Hermitian linear operators on V. We may fix some orthonormal basis of V and represent the elements H of Herm(V) by Hermitian square matrices.

Quite generally, probability distributions are characterized by their Fourier transform or characteristic function. In the present case this is

$$ \Omega(K) = \left\langle e^{-i\,\mathrm{tr}\,HK} \right\rangle $$

where the Fourier variable K is some other linear operator on V, and $\langle \ldots \rangle$ denotes the expectation value with respect to the probability distribution for H. Later, it will be important that, if $\Omega(K)$ is an analytic function of K, the matrix entries of K need not be from $\mathbb{R}$ or $\mathbb{C}$ but can be taken from the even part of some exterior algebra.

The probability distributions to be considered in this article are Gaussian with zero mean, $\langle H \rangle = 0$. Their Fourier transform is also Gaussian:

$$ \Omega(K) = e^{-(1/2) J(K,K)} $$

with J some quadratic form. We now describe J for a large family of hierarchical models that includes the case of band random matrices.

Let V be given a decomposition by orthogonal vector spaces:

that the matrix entries of H all are statistically independent.

By varying the lattice $\Lambda$, the number of orbitals N, and the variances $J_{ij}$, one obtains a large class of Hermitian random matrix models, two prominent subclasses of which are the following:

1. For $|\Lambda| = 1$, one gets the Gaussian Unitary Ensemble (GUE). Its symmetry group is U = U(N), the largest one possible in dimension N = dim V.
2. If $|i - j|$ denotes a distance function for $\Lambda$, and f a rapidly decreasing positive function on $\mathbb{R}_+$ of width W, the choice $J_{ij} = f(|i - j|)$ with N = 1 gives an ensemble of band random matrices with bandwidth W and symmetry group $U = U(1)^{|\Lambda|}$.

Beyond being real, symmetric, and positive, the variances $J_{ij}$ are required to have two extra properties in order for all of the following treatment to go through:

• They must be positive as a quadratic form. This is to guarantee the existence of an inverse, which we denote by $w_{ij} = (J^{-1})_{ij}$.
• The off-diagonal matrix entries of the inverse must be nonpositive: $w_{ij} \le 0$ for $i \ne j$.
where it is understood that we are integrating with where the Fourier variables K1 , . . . , Kjj are n n
the Lebesgue measure matrices with matrix entries taken from C or
R on (the normed vector space)
another commutative algebra.
V normalized by e(’, ’)
= 1. The same integral
with anticommuting instead of the (commuting) The key relation of the fermionic variant of the
’ 2 V gives supersymmetry method is that the expectation of the
Z product of determinants [5] has another expression as
eð ;A Þ ¼ det A ½4 Z Y
jj
ferm
n; N ðz; JÞ ¼ detN ðz iQj Þ dn; J ðQÞ ½7
This basic formula from the field theory of j¼1
fermionic particles is a consequence of the integra- pffiffiffiffiffiffi
tion over anticommuting variables actually being (i = 1). The strategy of the proof is quite simple:
differentiation: one writes the determinants in both expressions for
Z ferm
n, N as Gaussian integrals over nNjj complex
@2 fermionic variables 1 , . . . , n (each is a vector in
d 1 d 1 f ð 1 ; 1 ; . . .Þ :¼ f ð 1 ; 1 ; . . .Þ
@ 1@ 1 V with anticommuting coefficients), using the basic
154 Supersymmetry Methods in Random Matrix Theory
formula [4]. The integrals then encountered are holds true, provided that the parameters z1 , . . . , zn
essentially the Fourier transforms of the distribu- all lie in the same half (upper or lower) of the
tions dN, J (H) resp., dn, J (Q). The result is complex plane. To obtain information on transport
Z properties, however, one needs parameters in both
e z ð ; Þ eð1=2Þij Jij ð ;i Þð ;j Þ the upper and lower halves; see the paragraph
following [2]. The general case to be addressed
for both expressions of ferm n, N . In other words, below is I m z > 0 for = 1, . . . , p, and I m z < 0
although the probability distributions dN, J (H) and for = p þ 1, . . . , n. Careful inspection of the steps
dn, J (Q) are distinct (they are defined on different leading to eqn [9] reveals a convergence problem for
spaces), their characteristic functions coincide
P when 0 < p < n. In fact, [9] with Qj in Herm(Cn ) turns
evaluated on the Fourier variables K = ( , ) out to be false in that range. Learning how to
for H and (Ki ) = ( , i ) for Qi . This establishes resolve this problem is the main step toward
the claimed equality of the expressions [5] and [7] for mathematical mastery of the method. Let us there-
ferm
n, N (z, J). fore give the details.
What is the advantage of passing to the alternative If s := sgnI m z , the good (meaning convergent)
expression by dn, J (Q)? The answer is that, while H Gaussian integral to consider is
is made up of independent random variables, the new Z Y
n
variables Qi , called the Hubbard–Stratonovich field, ei s ð’ ;ðz HÞ’ Þ ¼ det1 ðis ðz HÞÞ
are correlated: they interact through the ‘‘exchange’’ ¼1
constants wij = (J1 )ij . If that interaction creates
To avoid carrying around trivial constants, we now
enough collectivity, a kind of mean-field behavior
assume i(n2p)Njj = 1. Use of the characteristic
results.
function of the distribution for H then gives
For the simple case of GUE (jj = 1, w11 = N=2 ) Z
with z1 = = zn = E, one gets the relation
Z n;N ðz; JÞ ¼ ei s z ð’ ;’ Þ
bos
2 2
hdetn ðE HÞi ¼ detN ðE iQÞ eðN=2 Þtr Q dQ 1
e2ij Jij s ð’ ;i ’ Þs ð’ ;j ’ Þ ½10
the right-hand side of which is easily analyzed by the The difficulty of analyzing this expression stems
steepest descent method in the limit of large N. from the ‘‘hyperbolic’’ nature (due to the indefinite-
For band random matrices in the so-called ergodic ness of the signs s = 1) of the term quartic in the
regime, the physical behavior turns out to be governed ’ , ’
.
by the constant mode Q1 = = Qjj – a fact that can
be used to establish GUE universality in that regime.
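The GUE relation quoted above can be tested numerically in its simplest instance. The sketch below makes several assumptions that are not in the original text: it takes n = 1, so that Q is a single real number; it writes the variance parameter as `lam`, normalized so that the matrix entries satisfy ⟨|H_ab|²⟩ = lam²/N; and it normalizes the Gaussian weight for Q to unit total mass. The left-hand side is estimated by Monte Carlo sampling, the right-hand side by Gauss–Hermite quadrature. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def avg_det_monte_carlo(energy, n, lam=1.0, samples=100_000, seed=0):
    """Monte Carlo estimate of <det(E - H)> for N x N GUE with <|H_ab|^2> = lam^2 / N."""
    rng = np.random.default_rng(seed)
    g = rng.normal(size=(samples, n, n)) + 1j * rng.normal(size=(samples, n, n))
    h = (g + np.conj(np.swapaxes(g, 1, 2))) / 2.0  # Hermitian, unit variance entries
    h *= lam / np.sqrt(n)                          # rescale to variance lam^2 / N
    dets = np.linalg.det(energy * np.eye(n) - h)
    return dets.mean().real

def avg_det_q_integral(energy, n, lam=1.0, nodes=40):
    """Gaussian average of (E - iQ)^N over a real variable Q of variance lam^2 / N."""
    x, w = hermegauss(nodes)            # nodes and weights for weight exp(-x^2 / 2)
    q = lam / np.sqrt(n) * x
    values = (energy - 1j * q) ** n
    return (np.sum(w * values) / np.sqrt(2.0 * np.pi)).real

E, N = 0.5, 3
print(avg_det_monte_carlo(E, N), avg_det_q_integral(E, N))
```

The two numbers should agree within the statistical error of the sampling. The point of the comparison is that the average over an N × N random matrix has been traded for an integral over a single variable in which N appears only as a parameter, which is what makes the large-N steepest descent analysis feasible.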
Fyodorov’s Method
The integrand for bos is naturally expressed in
Bosonic Variant terms of n n matrices Mi with matrix ele-
ments (Mi ) = (’ , i ’ ). These matrices lie in
The bosonic variant of the present method, due to
Hermþ (Cn ), that is, they are non-negative as well
Wegner, computes averages of products of determi-
as Hermitian. Fyodorov’s idea was to introduce
nants placed in the denominator:
them as the new variables of integration. To do
Z Y
n
that step, recall the basic fact that, given two
bos
n;N ðz; JÞ ¼ det1 ðz HÞdN; J ðHÞ ½8 differentiable spaces X and Y and a smooth map
¼1
: X ! Y, a distribution on X is pushed forward
where we now require I m z 6¼ 0 for all = 1, . . . , n. to a distribution () on Y by ()[f ] := [f
],
Complications relative to the fermionic case arise where f is any test function on Y.
from the fact that the integrand in [8] has poles. If We apply this universal principle to the case at
one replaces the anticommuting vectors by hand by identifying X with V n , and Y with
commuting ones ’ , and then simply repeats the (Hermþ (Cn ))jj , and with the mapping that sends
previous calculation in a naive manner, one arrives at
ð’1 ; . . . ; ’n Þ 2 X to ðM1 ; . . . ; Mjj Þ 2 Y
Z Y
jj
bos ?
n;N ðz; JÞ ¼ detN ðz Qj Þdn; J ðQÞ ½9 by (Mi ) = (’ , i ’ ). On X = V n we are integrat-
j¼1 ing Rwith the product Lebesgue measure normalized
by e (’ , ’ ) = 1. We now want the push-forward
where the integral is still over Qj 2 Herm(Cn ). The of this flat measure (or distribution) by the mapping
calculation is correct, and relation [9] therefore . In general, the push-forward of a measure is not
guaranteed to have a density but may be singular constructed by Schäfer and Wegner, but was largely
(like a Dirac -distribution). This is in fact what forgotten in later physics work.
happens if N < n. The matrices Mi then have less Writing (Mk ) = (’ , k ’ ) as before, consider
than the maximal rank, so they fail to be positive the function
but possess zero eigenvalues, which implies that
the flat measure on X is pushed forward by into the FM ðQÞ ¼ eð1=2Þij wij trðsQi þizÞðsQj þizÞk trMk Qk ½12
boundary of Y. For N n, on the other hand, the viewed as a holomorphic function of
push-forward measure
Qjj does have a density on Y; and
that density is i = 1 (det Mi )Nn dMi , as is seen by Q ¼ ðQ1 ; . . . ; Qjj Þ 2 EndðCn Þjj
transforming to the eigenvalue representation and R
If the Gaussian integral Q FM (Q)DQ with holo-
comparing Jacobians. The dMi are Lebesgue mea-
morphic density DQ = i dQi is formally carried
sures on Herm(Cn ), normalized by the condition
out by completing the square, one gets the integrand
Z Z
of [10]. This is just what we want, as it would allow
etrMi ðdet Mi ÞNn dMi ¼ e ð’ ;i ’ Þ ¼ 1
Mi >0
us to pass to a Q-matrix formulation akin to the one
of the previous section. But how can that formal
Assembling the sign information for I m z in a step be made rigorous? To that end, one needs to (1)
diagonal matrix s := diag(s1 , . . . , sn ), and pushing the construct a domain on which jFM (Q)j decreases
integral over X forward
Q to an integral over Y with rapidly so that the integral exists, and (2) justify
measure DM := i dMi , we obtain Fyodorov’s completion of the square and shifting of variables.
formula: To begin, take the absolute value of FM (Q).
Z
Putting (1=2)(Qj þ Qj ) =: R eQj and (1=2i)(Qj
n;N ðz; JÞ ¼ eð1=2Þij Jij trðsMi sMj Þ
bos
Qj ) =: I m Qj , we have jFM j = e(1=4)(f1 þf2 þf3 ) with
Y
X
ek trðiszMk þðNnÞ ln Mk Þ DM ½11 f1 ðQÞ ¼ wij trðsI m Qi þ zÞðsI m Qj þ zÞ þ c:c:
ij
This formula has a number of attractive features. X
One is ease of derivation, another is ready general- f2 ðQÞ ¼ 2 wij trðsR eQi ÞðsR eQj Þ
ij
izability to the case of non-Gaussian distributions. !
The main disadvantage of the formula is that it does X X
not apply to the case of band random matrices f3 ðQÞ ¼ 4 tr Mi þ sI m z wij R eQi
i j
(because of the restriction N n); nor does it
combine nicely with the fermionic formula [7] to These expressions suggest making the following
give a supersymmetric formalism, as one formula is choice of integration domain for Qi (i = 1, . . . , jj).
built on Jij and the other on wij . Pick some real constant > 0 and put
Note that [11] clearly displays the dependence on þ
the signature of I m z: you cannot remove the s1 , . . . , sn Pi 0
R eQi ¼ Ti Ti ; I m Qi ¼ Pi :¼
from the integrand without changing the domain of 0 P i
integration Y = (Hermþ (Cn ))jj . This important p q
with Ti 2 U(p, q), Pþ
i 2 Herm(C ), Pi 2 Herm(C ).
feature is missing from the naive formula [9].
The set of matrices Qi so defined is referred to as
Setting q = n p, let U(p, q) be the pseudounitary p, q
the Schäfer–Wegner domain X . The range of the
group of complex n n matrices T with inverse
field Q = (Q1 , . . . , Qjj ) is the direct product
T 1 = sT s. Since jdet Tj = 1 for T 2 U(p,Q q), the X : = (Xp, q jj
) .
integration domain Y and density DM = i dMi of
To show that this is a good choice of domain, we
Fyodorov’s formula are invariant under U(p, q)
first
R of all show convergence of the integral
transformations Mi 7! TMi T , and so is actually the
FM (Q)DQ. The matrices Pi commute with s, so
integrand in the limit where all parameters z1 , . . . , zn X
X
become equal. Thus, the elements of U(p, q) are f1 ðQÞjX ¼ 2R e wij trðPi þ szÞðPj þ szÞ
global symmetries in that limit. This observation ij
holds the key to another method of transforming the
Since the coefficients wij are positive as a quadratic
expression [10].
form, this expression is convex (with a positive
The Method of Schäfer and Wegner
Hessian) in the Hermitian matrices Pi . Second, the
function
To rescue the naive formula [9], what needs to be X 1
abandoned is the integration domain Herm(Cn ) for f2 ðQÞjX ¼ 22 wij tr Ti Ti Tj Tj
the matrices Qi . The good domain to use was ij
R R
is bounded from below by the constant 22 ni wii . which proves X (t) FM (Q)DQ = X FM (Q)DQ, inde-
This holds true because wij is negative for i 6¼ j, and pendent of t. (This argument does not go through
because Ti Ti > 0 and the trace of a product of two for the nonrigorous choice sQi := Ti Pi Ti1 usually
positive Hermitian matrices is always positive. made!)
Third, In the limit t ! 1, one encounters the expression
Z Z
!
X X FM ðQÞDQ ¼ dn;J ðisQÞ
f3 ðQÞjX ¼ 4 tr Mi þ sI m z wij Ti Ti X ð1Þ X
i j
eð1=2Þij Jij trðsMi sMj Þþik trðszMk Þ
is positive, as ( . . . ) is positive Hermitian. As long as with dn, J as in [6]. The normalization integral over
sI m z > 0, the function f3 goes to infinity for all X is defined by taking the Hermitian matrices Pi to
possible directions of taking the Ti to infinity on be the inner variables of integration. The outer
U(p, q). integrals over the Ti then demonstrably exist, and
Thus, when the matrices Qi are taken to vary on one can fix the (otherwise
R arbitrary) normalization
the Schäfer–Wegner domain Xp, q
, the absolute value
of DQ by setting X dn, J (isQ) = 1. Making that
(1=4)(f1 þf2 þf3 ) choice, and comparing with [10], one has proved
jFM j = e decreases rapidly
R at infinity.
This establishes the convergence of X FM (Q)DQ. Z Z
Next, let us count dimensions. The mapping bos
n;N ¼ FðMi Þ ¼ð’
;i ’ Þ ðQÞDQ
T 7! TT for T 2 U(p, q) =: G is invariant under
’;’ X
with the Schäfer–Wegner bosonic formalism for where the second supertrace includes a sum over
D
E sites and orbitals, and on setting t1 = t2 = 0 becomes
det1 z1 H t1 Eba
ji det 1
ðz 2 HÞ Y
eNr Str lnðQr iszÞ ¼ SdetN ðQr iszÞ
and eventually differentiating with respect to t1 , t2 at r
t1 = t2 = 0 and summing over a, b; see the subsection Q
“Green’s functions from determinants.” All steps are formally the same as before, but with traces and determinants replaced by their supersymmetric analogs. Having given a great many technical details in the last two sections, we now just present the final formula along with the necessary definitions and some indication of what are the new elements involved in the proof.
Let each of $Q_{BB}$, $Q_{FF}$, $Q_{BF}$, and $Q_{FB}$ stand for a $2 \times 2$ matrix. If the first two matrices have commuting entries and the last two anticommuting ones, they combine to a $4 \times 4$ supermatrix:
$$Q = \begin{pmatrix} Q_{BB} & Q_{BF} \\ Q_{FB} & Q_{FF} \end{pmatrix}$$
Relevant operations on supermatrices are the supertrace,
$$\mathrm{Str}\,Q = \mathrm{tr}\,Q_{BB} - \mathrm{tr}\,Q_{FF}$$
and the superdeterminant,
$$\mathrm{Sdet}\,Q = \frac{\det(Q_{BB})}{\det(Q_{FF} - Q_{FB}\,Q_{BB}^{-1}\,Q_{BF})}$$
These are related by the identity $\mathrm{Sdet} = \exp \circ\, \mathrm{Str} \circ \ln$ whenever the superdeterminant exists and is nonzero.
In the process of applying the method described earlier, a supermatrix $Q_i$ gets introduced at every site $i$ of the lattice $\Lambda$. The domain of integration for each of the matrix blocks $(Q_i)_{BB}$ ($i = 1, \ldots, |\Lambda|$) is taken to be the Schäfer–Wegner domain $X^{\varepsilon}_{1,1}$ (with some choice of $\varepsilon > 0$); the integration domain for each of the $(Q_i)_{FF}$ is the space of Hermitian $2 \times 2$ matrices, as before.
Let $E^{11}_{BB}$ be the $4 \times 4$ (super)matrix with unit entry in the upper-left corner and zeros elsewhere; similarly, $E^{22}_{FF}$ has unity in the lower-right corner and zeros elsewhere. Putting $s = \mathrm{diag}(1, -1, 1, 1)$ and $z = \mathrm{diag}(z_1, z_2, z_1, z_2)$, the supersymmetric Q-integral formula for the generating function of $G^{(2)}_{ij}$ – obtained by combining the Schäfer–Wegner bosonic method with the fermionic variant – is written as
$$\left\langle \frac{\det(z_1 - H)\,\det(z_2 - H + t_2 E^{ab}_{ij})}{\det(z_1 - H - t_1 E^{ba}_{ji})\,\det(z_2 - H)} \right\rangle = \int DQ\; e^{-(1/2)\sum_{kl} w_{kl}\,\mathrm{Str}(s Q_k s Q_l)}\; e^{-\mathrm{Str}\ln\left(\sum_{r,c}(Q_r - i s z)\otimes E^{cc}_{rr} \,+\, i t_1 E^{11}_{BB}\otimes E^{ba}_{ji} \,-\, i t_2 E^{22}_{FF}\otimes E^{ab}_{ij}\right)}$$  [14]
The superintegral “measure” $DQ = \prod_r DQ_r$ is the flat Berezin form, that is, the product of differentials for all the commuting matrix entries in $(Q_r)_{BB}$ and $(Q_r)_{FF}$, times the product of derivatives for all the anticommuting matrix entries in $(Q_r)_{BF}$ and $(Q_r)_{FB}$. To prove the formula [14], two new tools are needed, a brief account of which is as follows.
Gaussian Superintegrals
There exists a supersymmetric generalization of the Gaussian integration formulas given in the subsection “Determinants as Gaussian integrals”: if $A$, $D$ ($B$, $C$) are linear operators or matrices with commuting (resp., anticommuting) entries, and $\mathrm{Re}\,A > 0$, one has
$$\mathrm{Sdet}^{-1}\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \int e^{-(\bar\varphi, A\varphi) - (\bar\varphi, B\psi) - (\bar\psi, C\varphi) - (\bar\psi, D\psi)}$$
Verification of this formula is straightforward. Using it, one writes the last factor in [14] as a Gaussian superintegral over four vectors: $\varphi_1$, $\varphi_2$, $\psi_1$, and $\psi_2$. The integrand then becomes Gaussian in the matrices $Q_r$.
Shifting Variables
The next step in the proof is to do the “Gaussian” integral over the supermatrices $Q_r$. By definition, in a superintegral, one first carries out the Fermi integral, and afterwards the ordinary integrations. The Gaussian integral over the anticommuting parts $(Q_r)_{BF}$ and $(Q_r)_{FB}$ is readily done by completing the square and shifting variables using the fact that fermionic integration is differentiation:
$$\int d\xi\, f(\xi - \xi_0) = \frac{\partial}{\partial \xi} f(\xi - \xi_0) = \int d\xi\, f(\xi)$$
Similarly, the Gaussian integral over the Hermitian matrices $(Q_r)_{FF}$ is done by completing the square and shifting. The integral over $(Q_r)_{BB}$, however, is not Gaussian, as the domain is not $\mathbb{R}^n$ but the Schäfer–Wegner domain. Here, more advanced calculus is required: these integrations are done by using a supersymmetric change-of-variables theorem due to Berezin to make the necessary shifts by nilpotents. (There is not enough space to describe this here, so please consult Berezin’s (1987) book.) Without difficulty, one finds the result to agree with the left-hand side of eqn [14], thereby establishing that formula.
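The identity Sdet = exp ∘ Str ∘ ln quoted on this page can be checked numerically in the simplest special case. The following is only a minimal sketch, assuming a supermatrix whose anticommuting blocks vanish (Q_BF = Q_FB = 0), so that ordinary complex arithmetic suffices and Sdet Q reduces to det(Q_BB)/det(Q_FF). The helper names and the choice of random positive-definite Hermitian blocks (made so that the logarithm is well defined) are illustrative assumptions, not part of the article.

```python
import numpy as np

def random_positive_hermitian(rng, n=2):
    """Random positive-definite Hermitian n x n matrix (illustrative test data)."""
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return m @ m.conj().T + np.eye(n)

def matrix_log(a):
    """Logarithm of a positive-definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def supertrace(q_bb, q_ff):
    """Str Q = tr Q_BB - tr Q_FF."""
    return np.trace(q_bb) - np.trace(q_ff)

def superdeterminant(q_bb, q_ff):
    """Sdet Q when Q_BF = Q_FB = 0: the general formula reduces to det(Q_BB)/det(Q_FF)."""
    return np.linalg.det(q_bb) / np.linalg.det(q_ff)

def check_sdet_identity(seed=0):
    """Return (Sdet Q, exp(Str ln Q)) for a random block-diagonal supermatrix."""
    rng = np.random.default_rng(seed)
    q_bb = random_positive_hermitian(rng)
    q_ff = random_positive_hermitian(rng)
    lhs = superdeterminant(q_bb, q_ff)
    # ln of a block-diagonal matrix is block diagonal, so Str ln Q splits into two traces.
    rhs = np.exp(supertrace(matrix_log(q_bb), matrix_log(q_ff)))
    return lhs, rhs
```

Calling `check_sdet_identity()` returns two numbers that should agree to rounding error. The genuinely supersymmetric case, with nonzero Grassmann-valued blocks, would need a Grassmann-algebra implementation and is not attempted here.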
158 Supersymmetry Methods in Random Matrix Theory
$V_x$. As the point $x$ moves on $M$ the vector spaces $V_x$ turn and twist; thus, they form what is called a vector bundle $V$ over $M$. (The bundle at hand turns out to be nontrivial, i.e., there exists no global choice of coordinates for it.)
A section of $V$ is a smooth mapping $\xi : M \to V$ such that $\xi(x) \in V_x$ for all $x \in M$. The sections of $V$ are to be multiplied in the exterior sense, as they represent anticommuting degrees of freedom; hence the proper object to consider is the exterior bundle, $\wedge V$.
It is a beautiful fact that there exists a unique action of the Lie superalgebra $\mathfrak{g}$ on the sections of $\wedge V$ by first-order differential operators, or derivations for short. (Be advised however that this canonical $\mathfrak{g}$-action is not well known in physics or mathematics.)
The manifold $M$ is a symmetric space, that is, a Riemannian manifold with $G$-invariant geometry. Its metric tensor, $g$, uniquely extends to a second-rank tensor field (still denoted by $g$) which maps pairs of derivations of $\wedge V$ to sections of $\wedge V$, and is invariant with respect to the $\mathfrak{g}$-action. This collection of objects – the symmetric space $M$, the exterior bundle $\wedge V$ over it, the action of the Lie superalgebra $\mathfrak{g}$ on the sections of $\wedge V$, and the $\mathfrak{g}$-invariant second-rank tensor $g$ – forms what the author calls a “Riemannian symmetric superspace,” $\mathcal{M}$.
According to the Landau–Ginzburg–Wilson (LGW) paradigm of the theory of phase transitions, the large-scale physics of a statistical mechanical system near criticality is expected to be controlled by an effective field theory for the long-wavelength excitations of the order parameter of the system. Wegner is credited with the profound insight that the LGW paradigm applies to the random matrix situation at hand, with the role of the order parameter being taken by the matrix $Q$. He argued that transport observables (such as the electrical conductivity) are governed by slow spatial variations of the $Q$-field inside the saddle-point manifold. Efetov skilfully implemented this insight in a supersymmetric variant of Wegner’s method.
While the direct construction of the effective continuum field theory by gradient expansion of [14] is not an entirely easy task, the outcome of the calculation is predetermined by symmetry. On general grounds, the effective field theory has to be a nonlinear sigma model for the Goldstone bosons and fermions of $\mathcal{M}$: if $\{\phi^A\}$ are local coordinates for the bundle $V$ with metric $g_{AB}(\phi) = g(\partial/\partial\phi^A, \partial/\partial\phi^B)$, the action functional is
$$S = -\sigma \int d^d x\; \partial_\mu \phi^A\, g_{AB}(\phi)\, \partial^\mu \phi^B$$
The coupling parameter $\sigma$ has the physical meaning of bare (i.e., unrenormalized) conductivity. In the present model $\sigma = N W^2 a^{2-d}$, where $W$ is essentially the width of the band random matrix in units of the lattice spacing $a$ (the short-distance cutoff of the continuum field theory). $S$ is the effective action in the limit $z_1 = z_2$. For a finite frequency $\omega = z_1 - z_2$, a symmetry-breaking term of the form $i\omega\nu \int d^d x\, f(\phi)$, where $\nu = N(\pi\lambda)^{-1} a^{-d}$ is the local density of states, has to be added to $S$.
By perturbative renormalization group analysis, that is, by integrating out the rapid field fluctuations, one finds for $d = 2$ that $\sigma$ decreases on increasing the cutoff $a$. This property is referred to as “asymptotic freedom” in field theory. On its basis one expects exponentially decaying correlations, and hence localization of all states, in two dimensions. However, a mathematical proof of this conjecture is not currently available.
In three dimensions and for a sufficiently large bare conductivity, the renormalization flow goes toward the metallic fixed point ($\sigma \to \infty$), where $G$-symmetry is broken spontaneously. A rigorous proof of this important conjecture (existence of disordered metals in three space dimensions) is not available either.
For a system in a box of linear size $L$, the cost of exciting fluctuations in the sigma model field is estimated as the Thouless energy $E_{\mathrm{Th}} = \sigma/\nu L^2$. In the limit of small frequency, $|\omega| \ll E_{\mathrm{Th}}$, the physical behavior is dominated by the constant modes $\phi^A(x) = \phi^A$ (independent of $x$). By computing the integral over these modes, Efetov found the energy-level correlations in the small-frequency limit to be those of the GUE.
See also: Random Matrix Theory in Physics; Symmetry Classes in Random Matrix Theory.
Further Reading
Berezin FA (1987) Introduction to Superanalysis. Dordrecht: Reidel.
Disertori M, Pinson H, and Spencer T (2002) Density of states of random band matrices. Communications in Mathematical Physics 232: 83–124.
Efetov KB (1997) Supersymmetry in Disorder and Chaos. Cambridge: Cambridge University Press.
Fyodorov YV (2002) Negative moments of characteristic polynomials of random matrices: Ingham–Siegel integral as an alternative to Hubbard–Stratonovich transformation. Nuclear Physics B 621: 643–674.
Mirlin AD (2000) Statistics of energy levels and eigenfunctions in disordered systems. Physics Reports 326: 260–382.
Schäfer L and Wegner F (1980) Disordered system with n orbitals per site: Lagrange formulation, hyperbolic symmetry, and Goldstone modes. Zeitschrift für Physik B 38: 113–126.
Wegner F (1979) The mobility edge problem: continuous symmetry and a conjecture. Zeitschrift für Physik B 35: 207–210.
Zirnbauer MR (1996) Riemannian symmetric superspaces and their origin in random matrix theory. Journal of Mathematical Physics 37: 4986–5018.
160 Symmetric Hyperbolic Systems and Shock Waves
Ajk
Definitions BB = Djk BA with (Djk ) diagonal. Some authors
require the symmetry condition
Consider a quasilinear system
vanishes: if the initial condition vanishes for $|x| \le R$, we claim that $u$ at some later time vanishes for $|x| \le R - t/a$, for $a$ large enough.
Indeed, let us integrate the energy identity on a truncated cone $\Gamma := \{|x| \le a(t_0 - t)/t_0;\ 0 \le t \le t_1\}$ with $t_1 < t_0$. The boundary of $\Gamma$ consists of three parts: $\partial\Gamma = \Gamma_0 \cup \Gamma_1 \cup S$, where $\Gamma_0$ and $\Gamma_1$ represent the portions of the boundary on which $t = 0$ and $t_1$, respectively. The outer normal to $S$ is proportional to $(a, t_0 x_j/|x|)$. Let $E(s)$ denote the integral of $u^T Q u$ on $\Gamma \cap \{t = s\}$. Integrating eqn [11] by parts, we obtain
$$E(t_1) - E(0) + \int_S u^T \Phi u \, ds = \iint_\Gamma (2u^T f - u^T C u)\, dt\, dx$$  [12]
where $\Phi$ is proportional to $aQ + t_0 \sum_j x_j A^j/|x|$. Take $a$ so large that $\Phi$ is positive definite. The integral over $S$ is then non-negative. If $C$ is positive definite and $f \equiv 0$, so that $E(0) = 0$, we find that $E(t_1) \le 0$. Since $Q$ is positive definite, this implies $u \equiv 0$ on $\Gamma_1$, as claimed.
A Numerical Scheme
System $Lu = f$ may be discretized, for example, by the Lax–Friedrichs method: let $h$ be the discretization step in space, and $k$ the time step; write $\tau_j u(t, x) = u(t, x_1, \ldots, x_j + h, \ldots, x_n)$ (translation in the $j$ direction). One replaces $\partial_j u$ by the centered difference in the $j$ direction: $(\tau_j u - \tau_j^{-1} u)/2h$; and the time derivative by
$$\Big[u(t + k, x) - \frac{1}{2n}\sum_j \big(\tau_j u(t, x) + \tau_j^{-1} u(t, x)\big)\Big]\Big/k$$  [13]
For consistency of the scheme, we require $k/h = \lambda > 0$ to be fixed as $k$ and $h$ tend to zero; stability then holds if $\lambda$ is small.
Nonlinear Problems and Singularities
We give a simple setup for proving the existence of smooth solutions to SH systems for small times. Such solutions may develop singularities. We limit ourselves to two types of singularities, on which SH structure provides some information: jump discontinuities and blow-up patterns. Caustic formation is not considered.
Construction of a Smooth Solution
Consider a real SH system (eqn [1]). Recall that a function of $x$ belongs to the Sobolev space $H^s$ if its derivatives of order $s$ or less are square-integrable. One constructs a solution defined for $t$ small, which is in $H^s$, $s > n/2 + 1$, as a function of $x$, by the following procedure:
(1) Replace spatial derivatives by regularized operators, which should be bounded in Sobolev spaces; the regularized equation is an ODE in $H^s$; let $u_\varepsilon$ be its solution.
(2) Write the equation satisfied by derivatives of order $s$ of $u_\varepsilon$, and apply the energy identity to it.
(3) Find a positive $T$ such that the solution is bounded in $H^s$ for $|t| \le T$, uniformly in $\varepsilon$; this implies a $C^1$ bound.
(4) Prove the convergence of the approximations in $L^2$.
(5) Prove the continuity in time of the $H^s$ norm; conclude that the $u_\varepsilon$ tend to a solution in $C(-T, T; H^s)$.
The result admits a local version, in which Sobolev spaces are replaced by Kato’s “uniformly local” spaces. Uniqueness of the solution is proved along similar lines. We do not attempt to identify the infimum of the values of $s$ for which the Cauchy problem is well-posed.
Jump Discontinuities: Shock Waves
A “shock wave” is a weak solution of a system of conservation laws admitting a jump discontinuity. By definition, weak solutions satisfy, for any smooth function $\varphi_A(x)$ with compact support,
$$\iint \{ f^{A\mu}\, \partial_\mu \varphi_A + N^A \varphi_A \}\, dt\, dx = 0$$
The theory of shock waves is an attempt to understand solutions of conservation laws which are limits of solutions of diffusion equations; the hope is that the influence of second-derivative terms is appreciable only near shocks, and that, for given initial data, there is a unique weak solution of the conservation law which may be obtained as such a limit, if modeling has been done correctly. This problem may be difficult already for a single shock (“shock structure”).
The theory of shock waves follows the one-dimensional theory closely. We therefore describe the main facts for a conservation law in one space dimension ($u = u(t, x)$):
$$\partial_t u + \partial_x f(u) = 0$$
If a shock travels at speed $c$, the weak formulation of the equations gives the Rankine–Hugoniot relation $c[u] = [f(u)]$, where square brackets denote jumps. There may be several weak solutions having the same initial condition. One restricts solutions by making two further requirements: (1) the system admits an entropy pair $(U, F)$ with a convex entropy and (2) to be admissible, weak solutions must be limits of “viscous approximations”
$$\partial_t u + \partial_x f(u) = \varepsilon\, \partial_x^2 u$$
as $\varepsilon \to 0$. One then finds easily that the entropy equality ($\partial_t U + \partial_x F = 0$) must be replaced, for such weak solutions, by the entropy condition: $\partial_t U + \partial_x F \le 0$ in the weak sense. This condition admits a concrete interpretation if the gradient of each characteristic speed is never orthogonal to the corresponding right eigenvector (“genuine nonlinearity”); in that case, characteristics must impinge on the shock (“shock inequalities”).
For the equations of gas dynamics with polytropic law ($p v^\gamma = \text{const.}$), there is a unique solution with initial condition $u = u_l$ for $x < 0$, $u = u_r$ for $x > 0$, where $u_l$ and $u_r$ are constant (“Riemann problem”) which satisfies the entropy condition, provided $|u_l - u_r|$ is small. More generally, if the equation of state $p = p(v, s) > 0$ satisfies $\partial p/\partial v < 0$ and $\partial^2 p/\partial v^2 > 0$, the shock inequalities are equivalent to the fact that the entropy increases after the passage of a shock with $|u_l - u_r|$ small.
On the numerical side, one should mention: (1) the widely used idea of upstream differencing; (2) the Lax–Wendroff scheme, the complete analysis of which requires tools from soliton theory; and (3) the availability of general results for dissipative schemes for SH systems.
Recent trends include: (1) admissibility conditions when genuine nonlinearity does not hold and (2) other approximations of shock wave problems, most notably kinetic formulations.
Some of the ideas of shock wave theory have been applied to Hamilton–Jacobi equations and to motion by mean curvature, with applications to front propagation problems and “computer vision.”
one can write the solution as the sum of a singular part, known in closed form, and a regular part. If the singularity locus is represented by $t = 0$, the regular part solves a renormalized equation of the typical form
$$t\,Mu + Au = t^{\varepsilon} N$$  [14]
where $Mu = 0$ is SH. Under natural conditions, for any initial condition $u_0$ such that $Au_0 = 0$, there is a unique solution of eqn [14] defined for small $t$. The upshot is an asymptotic representation of solutions which renders the same services as an exact solution, and is valid precisely where numerical computation breaks down.
Fuchsian reduction enables one in particular to study (1) the blow-up time; (2) how the singularity locus varies when Cauchy data, prescribed in the smooth region, are varied; and (3) expressions which remain finite at blow-up. It is the only known general procedure for constructing analytically singular spacetimes involving arbitrary functions, rather than arbitrary parameters, and is therefore relevant to the search for alternatives to the big bang.
Examples and Applications
Wave Equation with Variable Coefficients
Consider the equation
$$\partial_{tt} u + 2a^j(x)\,\partial_{jt} u - a^{jk}(x)\,\partial_{jk} u = f(t, x, u, \nabla u)$$
with $(a^{jk})$ positive definite. Letting $v = (v_0, \ldots, v_{n+1}) := (u, \partial_j u, \partial_t u)$, we find the system
$$\partial_t v_0 = v_{n+1}$$
$$\partial_t v_k - \partial_k v_{n+1} = 0$$
$$\partial_t v_{n+1} + 2a^k \partial_k v_{n+1} - a^{jk} \partial_k v_j = f$$
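To make the Lax–Friedrichs discretization [13] and the Rankine–Hugoniot relation more concrete, here is a minimal sketch for a single conservation law in one space dimension. Several choices are assumptions made for illustration and do not come from the article: the Burgers flux f(u) = u²/2, the application of the centered difference to the flux f(u) (conservation form) rather than to u itself, the grid and time-step values, and all function names. The initial data form a Riemann problem with u_l = 1 and u_r = 0, so the expected shock speed from c[u] = [f(u)] is 1/2.

```python
import numpy as np

def flux(u):
    """Burgers flux f(u) = u^2 / 2 (an assumption; the text keeps f general)."""
    return 0.5 * u**2

def rankine_hugoniot_speed(u_left, u_right):
    """Shock speed c from the Rankine-Hugoniot relation c [u] = [f(u)]."""
    return (flux(u_right) - flux(u_left)) / (u_right - u_left)

def lax_friedrichs_step(u, lam):
    """One Lax-Friedrichs step in 1-D with lam = k / h; the two end cells stay fixed."""
    new = u.copy()
    new[1:-1] = 0.5 * (u[2:] + u[:-2]) - 0.5 * lam * (flux(u[2:]) - flux(u[:-2]))
    return new

def simulate(u_left=1.0, u_right=0.0, h=0.01, lam=0.5, t_end=1.0):
    """Evolve Riemann data with a jump at x = 0 and estimate the shock position."""
    x_min, x_max = -1.0, 2.0
    n = int(round((x_max - x_min) / h))
    x = x_min + (np.arange(n) + 0.5) * h        # cell centres
    u = np.where(x < 0.0, u_left, u_right)      # piecewise-constant initial data
    steps = int(round(t_end / (lam * h)))       # time step k = lam * h
    for _ in range(steps):
        u = lax_friedrichs_step(u, lam)
    # Equal-area estimate: length of the region that would hold u_left if the
    # smeared numerical profile were replaced by a sharp jump.
    position = x_min + np.sum(u - u_right) * h / (u_left - u_right)
    return x, u, position
```

With these defaults, `simulate()` returns a shock position close to `rankine_hugoniot_speed(1.0, 0.0) * 1.0`. The ratio `lam = 0.5` keeps `lam * max|f'(u)|` below 1, in line with the remark that stability holds when k/h is small. The numerical profile is smeared over several cells, which is the scheme's own diffusion acting as a stand-in for the viscous approximation.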
a statement could be made about invariance under rotations; if space were not isotropic, experimental results would depend on which direction the apparatus was aligned in, and again any laws would be extremely hard to find. Turning to the question of motion, Newton and Galileo realized that the laws of dynamics are the same in all inertial frames in relative motion. In the Newton–Galileo scheme, the rule for relating the space and time coordinates of two frames of reference is (for relative motion along the common x-axis)
$$x' = x - vt, \qquad t' = t$$  [1]
This principle of relativity was reaffirmed by Einstein, but with the crucial modification that the rules for relating coordinates in two frames are given by Lorentz transformations, so that [1] is replaced by
$$x' = \gamma(x - vt), \qquad t' = \gamma\left(t - \frac{vx}{c^2}\right)$$  [2]
Time is absolute in [1] but relative in [2]. Einstein was of course motivated by the fact that Maxwell’s equations are covariant under Lorentz transformations, but not under Newton–Galileo ones.
The above considerations reveal that the laws of nature should be covariant under ten types of transformation: three translations in space, one in time, three parameters (angles) for rotations and three velocities. These transformations together form a group, the inhomogeneous Lorentz, or Poincaré group. It is a nonabelian group whose ten generators correspond to 4-momentum, angular momentum, and Lorentz boosts. The seminal work on the significance of this group in fundamental physics is that of Wigner in 1939. Assuming that the states of fundamental quantum systems (particles, atoms, molecules) form the basis states for representations of this group, these entities are described by two quantities, mass and spin. Spin, moreover, which was already familiar from earlier investigations in quantum physics, was described by the rotation group (SU(2), which is homomorphic to SO(3)) only for states with timelike momentum. For photons, for example, with null momentum, spin is described by the (noncompact) Euclidean group in the plane, with the consequence that there are only two polarization states for this massless particle.
Noether’s theorem provides the crucial link between symmetries and conservation laws, via the principle of least action. Noether showed that the invariance of the action under a continuous symmetry operation implied the existence of a conserved quantity. The conserved quantities corresponding to invariance under translation in space and time are momentum and energy; conservation of angular momentum follows from invariance under rotations and invariance under Lorentz transformations gives rise to conservation of motion of the center of mass.
Gauge Theories: Electromagnetism and Yang–Mills Theories
A quantity whose conservation has been well known for a long time is electric charge. The question may then be asked: invariance under what symmetry gives rise to conservation of electric charge? A classical complex field $\phi$ has the Lagrangian density
$$\mathcal{L} = (\partial_\mu \phi)(\partial^\mu \phi^*) - m^2 \phi^* \phi$$  [3]
which is invariant under
$$\phi \to \exp(iQ\Lambda)\,\phi$$  [4]
$\Lambda$ being the parameter for the transformation. Noether’s theorem then yields conservation of $Q$, interpreted as electric charge. With $\Lambda$ a constant, as above, the Lagrangian possesses a “global” symmetry. This becomes a “local” symmetry when $\Lambda$ becomes space and time dependent, $\Lambda(\mathbf{r}, t)$ or $\Lambda(x^\mu)$. In that case, however, the Lagrangian [3] is no longer invariant under [4], because of the derivative terms. To preserve invariance an extra field $A_\mu$ must be introduced, so that [4] then becomes
$$\phi \to \exp(iQ\Lambda(x^\mu))\,\phi, \qquad A_\mu \to A_\mu + \frac{1}{Q}\,\partial_\mu \Lambda$$  [5]
and the Lagrangian acquires extra terms, involving $A_\mu$. The field $A_\mu$ is called a gauge field and is identified with the electromagnetic potential. The transformation [5] is called a gauge transformation, and since the phase factor $\exp(iQ\Lambda)$ may be regarded as a unitary $1 \times 1$ matrix, we have here a theory with U(1) gauge invariance, which describes electromagnetism and conservation of charge.
The notion of isospin had been introduced by Heisenberg in 1932. Isospin (then called isotopic spin) was a vector-like quantity conserved in strong (nuclear) interactions. Yang and Mills in 1954 made the pioneering suggestion that isospin conservation could also be recast as a gauge theory, by enlarging the U(1) group of electromagnetism to SU(2) (corresponding to rotations in “isospin space”), and at the same time treating the rotation angles as functions of spacetime. Then, eqn [4] will change: if
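The contrast between the Newton–Galileo rule [1] and the Lorentz rule [2] on this page can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: it assumes the usual factor γ = (1 − v²/c²)^(−1/2) in [2], works in units with c = 1 by default, and uses hypothetical function names. It compares the quantity c²t² − x² before and after each transformation.

```python
import math

def galilean(x, t, v):
    """Newton-Galileo rule [1]: x' = x - v t, t' = t."""
    return x - v * t, t

def lorentz(x, t, v, c=1.0):
    """Lorentz rule [2]: x' = gamma (x - v t), t' = gamma (t - v x / c^2)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)

def interval(x, t, c=1.0):
    """Spacetime interval c^2 t^2 - x^2."""
    return (c * t) ** 2 - x**2

def add_velocities(v1, v2, c=1.0):
    """Velocity of the single boost equivalent to two successive collinear boosts."""
    return (v1 + v2) / (1.0 + v1 * v2 / c**2)
```

For an event at (x, t) = (3, 5) and v = 0.6, `interval` is unchanged by `lorentz` but not by `galilean`, which is one way of seeing that time is absolute in [1] and relative in [2]. Applying `lorentz` with v1 and then with v2 gives the same result as a single call with `add_velocities(v1, v2)`, reflecting the group property of the boosts.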
168 Symmetries and Conservation Laws
case of a nonabelian symmetry group by Guralnik, Hagen, and Kibble and invoked by Weinberg in his 1971 model for the electroweak interaction in which the gauge quanta were massive.
Higgs’ work was motivated by the theory of superconductivity, where the Meissner effect (expulsion of magnetic flux from a superconductor), when treated relativistically, implies that the effective mass of a photon in a superconductor is nonzero – this is the “reason” that the flux does not penetrate. In the theory of Bardeen, Cooper, and Schrieffer (BCS), a superconductor is described by an effective scalar field, a composite of electron pairs (though paired in momentum space rather than coordinate space), and this provides a physical analogy with the model above. The SM of particle physics postulates a Higgs scalar field analogous to the BCS composite scalar field. If this field exists, Higgs particles should also exist, but they have not yet been found. This is an outstanding problem for the SM.
Baryon and Lepton Numbers
The fact that the proton $p$ does not decay into positron plus photon, $e^+ + \gamma$, or muon plus photon, $\mu^+ + \gamma$, implies a conservation law of baryon number $B$ (the proton possessing $B = 1$ and the others $B = 0$). Furthermore, the stability of $\mu$ and $\tau$ against decay into $e + \gamma$ implies conservation of lepton numbers $L_e$, $L_\mu$, and $L_\tau$. These are regarded as global, not local, symmetries, so there are no associated gauge fields or interactions. Interestingly, however, these symmetries are not built into the SM, so are not guaranteed by it. More interestingly, these symmetries are actually destroyed in one attempt to go beyond the SM. This is the hypothesis that QCD may be unified with electroweak interactions to produce a “grand unified” theory (GUT). The simplest GUT is the one in which the SU(2) $\otimes$ U(1) $\otimes$ SU(3) symmetry is assumed to be a subgroup of the much tighter symmetry SU(5), and in that theory the proton is unstable:
$$p \to e^+ + \pi^0$$  [10]
The predicted lifetime is $10^{30 \pm 1}$ years, while a recent estimate of the lifetime for this decay mode is $> 5 \times 10^{32}$ years. It may be that GUTs do not exist in nature, but since the decay [10] violates conservation of the quantities $B$ and $L_e$, even entertaining the idea that the decay might take place begs the question, “are these conservation laws sacrosanct?”
Another recent development which leads to the same question is the subject of neutrino oscillations. A strong motivation for this is the solar neutrino problem; this is the problem that the number of electron neutrinos detected on Earth, originating in the Sun, is less than the number predicted, by a factor close to 3. The mismatch could be at least partly, and perhaps completely, explained if electron neutrinos “oscillated” into muon and/or tau neutrinos on their passage from the Sun to the Earth, since the reaction which detects the neutrinos on Earth is sensitive only to electron neutrinos, and not to the other species. But oscillation is only permitted if $L_e$, $L_\mu$, and $L_\tau$ are not separately conserved quantities. Oscillation can also only take place if the masses of the different neutrinos are different – the oscillation rate depends on $\Delta m^2$ – hence not all the neutrinos may be massless.
Discrete Symmetries
Ever since parity violation was discovered in weak interactions (nuclear beta decay) by Wu in 1957, the whole subject of discrete symmetries has presented problems which are still not resolved. The symmetries in question are
P (space inversion): $(x, y, z) \to (-x, -y, -z)$
T (time reversal): $t \to -t$
C (particle–antiparticle conjugation): particle $\leftrightarrow$ antiparticle
Are the laws of physics invariant under these operations? The Wu experiment revealed that weak interactions are not invariant under P, but what about other interactions and other operations? In this context, the CPT theorem is highly important. According to this theorem (based on very general assumptions), all laws of nature must be invariant under the combined operation CPT, so that, for example, the fact that weak interactions are not invariant under P means that they are not invariant under the product CT either.
The violation of P invariance in beta decay was soon related to the fact that the neutrino involved (the electron neutrino – or, to be precise, antineutrino) was massless. Spin-1/2 particles like the electron and neutrino obey the Dirac equation, which may be written out as a pair of coupled equations for left- and right-handed states. In the case $m = 0$, however, these equations decouple so it is possible to have a massless spin-1/2 particle which is either left-handed or right-handed. Any interaction involving this particle would automatically violate parity (which turns a left-handed state into a right-handed one). Experiments have verified that the neutrino is indeed left-handed. The SM incorporates this in the sense that the left-handed electron $e_L$ and the electron neutrino $\nu_e$ are assigned to a weak isospin SU(2) doublet, while the right-handed electron $e_R$ transforms as a singlet. A similar pattern is repeated for the $\mu$ and $\tau$ particles and their neutrinos. The phenomenon of neutrino oscillations, on the other hand, does not allow all the neutrino states also to be purely left-handed (since they cannot be massless). This poses a potential problem for the SM.
For a few years after 1957 it was believed that beta decay violated C as well as P, but conserved the product CP; and indeed that all weak interactions were CP invariant. In 1964, however, it was found that there is a small element of CP violation in $K^0$ decay. CP-violating effects are also expected in $B^0$ decays. The physical origin of CP violation is still not understood, but its importance is that it implies T violation, so that in (at least some) weak interactions, there is an “arrow of time” on the subnuclear scale. (Such an arrow of time is, of course, familiar in thermodynamics.) This is used in a cosmological context to explain baryon–antibaryon asymmetry in the Universe.
Baryon–Antibaryon Asymmetry
In the standard model of cosmology it is shown that applying the known laws of physics to the early Universe (the first few minutes) leads to the conclusion that at an age of 226 s nuclear fusion reactions took place resulting in a mixture of 74% protons and 26% $\alpha$ particles, so that, hundreds of thousands of years later, when galactic condensation took place, it would involve precisely this admixture of hydrogen and helium gases. Just this amount of helium has been found in the Sun, giving great confidence to the “big bang” model. Assuming that at extremely small times the baryon number of the Universe was zero, $B = 0$, and assuming also (a big assumption, but one nevertheless made by cosmologists) that the Universe is made of matter and not antimatter, we may then ask, why is this – where has the antimatter gone?
Surprisingly, this question was addressed as early as 1966 by Sakharov, who showed that, starting with an initial state with $B = 0$, it would be possible to reach a state with $B \neq 0$ as long as three conditions obtained: B violating interactions, CP and C violating interactions, and lack of thermal equilibrium. GUTs and ordinary weak interactions already provide possibilities for the first two of these conditions. Breakdown of thermal equilibrium will be expected to occur as the Universe expands. When the particle density is high, reactions such as $p + \bar{p} \to \gamma + \gamma$ will ensure an equal population of baryons and antibaryons, even in the presence of B violating interactions, but as the density decreases and this reaction rate becomes less than the expansion rate, thermal equilibrium can no longer be maintained. Thus, GUTs offer an explanation of why there is no antimatter in the Universe. It might be thought that this sort of explanation is implausible, since the B-violating and CP-violating forces are so weak, but actually this is not a problem, since the ratio of baryon number to photon number in the Universe is of the order $N_B/N_\gamma \approx 10^{-9}$; so we may conjure up a scenario in which the B and CP violating forces give rise to a volume of space in which there are, say, $10^9$ antibaryons, $10^9 + 1$ baryons and approximately the same number of photons. Then, all the antibaryons become annihilated leaving one baryon and $10^9$ photons – as observed.
A recent development in the area of discrete symmetries has been the suggestion by Kostelecky and coworkers that there might exist spontaneous violation of CPT and Lorentz symmetry.
Topological Charges
Conserved quantities of a quite different type have received a lot of attention in recent decades. Their conservation is a consequence of nontrivial boundary conditions for the fields. A famous example is the sine-Gordon “kink.” The sine-Gordon equation
$$\frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} + \frac{1}{b^2}\sin(b\phi) = 0$$  [11]
describes a scalar field in one space and one time dimension. It is a nonlinear equation which possesses, among others, the interesting solution
$$f(\xi) = \frac{4}{b}\arctan \exp\left[\pm\left(\gamma\xi/\sqrt{b}\right)\right]$$
where $\xi = x - vt$ and $\gamma = (1 - v^2)^{-1/2}$. This corresponds to a solitary wave which moves, preserving its shape and size – in distinction to usual waves, which spread out and dissipate. Waves of this type are called solitons, and solitons have in fact been observed moving along canals. In this case, they are solutions to the Korteweg–de Vries equation. Equation [11] clearly possesses the constant solutions
$$\phi = \frac{2\pi n}{b}, \qquad n = 0, \pm 1, \pm 2, \ldots$$
which, it may be shown, all have zero energy. We may then construct a solution of the above type, but with $n = 0$ as $x \to -\infty$ and $n = N$ as $x \to +\infty$. This so-called “kink” solution has finite energy and is not continuously deformable into a solution with $n = 0$ everywhere, since this would involve overcoming an
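As a numerical cross-check of the sine-Gordon kink discussed on this page, the sketch below evaluates the travelling-wave profile f(ξ) = (4/b) arctan exp(γξ/√b) and estimates the residual of equation [11] by central differences. The sign convention (upper sign), the parameter values, the finite-difference step and the function names are assumptions chosen for illustration. The coefficient 1/b² in [11] is the reading of the equation that is consistent with the √b in the exponent of the solution.

```python
import numpy as np

def kink(x, t, b=1.0, v=0.5):
    """Travelling kink f(xi) = (4/b) arctan exp(gamma xi / sqrt(b)), xi = x - v t."""
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    xi = x - v * t
    return (4.0 / b) * np.arctan(np.exp(gamma * xi / np.sqrt(b)))

def sine_gordon_residual(x, t, b=1.0, v=0.5, d=1e-3):
    """Central-difference estimate of phi_tt - phi_xx + sin(b phi) / b^2 on the kink."""
    phi = kink(x, t, b, v)
    phi_tt = (kink(x, t + d, b, v) - 2.0 * phi + kink(x, t - d, b, v)) / d**2
    phi_xx = (kink(x + d, t, b, v) - 2.0 * phi + kink(x - d, t, b, v)) / d**2
    return phi_tt - phi_xx + np.sin(b * phi) / b**2

def winding_number(b=1.0, v=0.5, far=50.0):
    """n(+infinity) - n(-infinity), read off from phi = 2 pi n / b at large |x|."""
    return b * (kink(far, 0.0, b, v) - kink(-far, 0.0, b, v)) / (2.0 * np.pi)
```

On a grid such as `np.linspace(-10, 10, 201)` the residual should be small compared with the individual terms, limited only by the finite-difference error. `winding_number` should come out close to 1, matching a kink that interpolates between the constant solutions n = 0 and n = 1.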
isometries, with corresponding groups of motion (so that the isometry group of Minkowski space is the Poincaré group). These groups are an important subject of study in cosmology; for example, there is a classification of homogeneous cosmological models, labeled according to the Bianchi classification.
See also: Cotangent Bundle Reduction; Effective Field Theories; Electroweak Theory; General Relativity: Overview; Infinite-Dimensional Hamiltonian Systems; Noncommutative Geometry and the Standard Model; Quantum Field Theory: A Brief Introduction; Quasiperiodic Systems; Sine-Gordon Equation; Supergravity; Symmetries in Quantum Field Theory of Lower Spacetime dimensions; Symmetry and Symplectic Reduction; Symmetry Classes in Random Matrix Theory; Topological Defects and Their Homotopy Classification.
Further Reading
Aitchison IJ and Hey AJ (1981) Gauge Theories in Particle Physics. Bristol: Adam Hilger.
Cheng T-P and Li L-F (1984) Gauge Theory of Elementary Particle Physics. Oxford: Clarendon Press.
Eguchi T, Gilkey PB, and Hanson AJ (1980) Gravitation, gauge theories and differential geometry. Physics Reports 66: 213.
Huang K (1998) Quantum Field Theory: From Operators to Path Integrals. New York: Wiley.
Kostelecky VA (ed.) (2004) CPT and Lorentz Symmetry: Proceedings of the Third Meeting. Singapore: World Scientific.
Landau LD and Lifshitz EM (1971) The Classical Theory of Fields. Oxford: Pergamon Press.
Manton N and Sutcliffe P (2004) Topological Solitons. Cambridge: Cambridge University Press.
Perkins DH (2000) Introduction to High Energy Physics, 4th edn. Cambridge: Cambridge University Press.
Review of Particle Properties (2002), Physical Review D 66: 01002, July 2002, Part 1.
Rubakov V (2002) Classical Theory of Gauge Fields. Princeton: Princeton University Press.
Ryder LH (1996) Quantum Field Theory, 2nd edn. Cambridge: Cambridge University Press.
Stephani H (2004) Relativity: An Introduction to Special and General Relativity, 3rd edn. Cambridge: Cambridge University Press.
Weinberg S (1983) The First Three Minutes. London: Fontana.
Wess J and Bagger J (1983) Supersymmetry and Supergravity. Princeton: Princeton University Press.
Wigner E (1939) On unitary representations of the inhomogeneous Lorentz group. Annals of Mathematics 40: 149.
the exact computation of correlation functions by the help of Ward identities (Belavin, Polyakov, and Zamolodchikov 1984). Only the finite-dimensional Möbius group, however, is also a symmetry of the vacuum state. Möbius covariance implies that the theory contains two subtheories of chiral fields defined on the light rays $t - x$ = constant, resp. $t + x$ = constant, and that these can be extended to fields defined on a circle, by adding a “point at infinity” to the light ray (Lüscher and Mack 1976). One arrives thus at one-dimensional chiral quantum field theories on a circle, which will play an important role in the discussion below.

Continuous symmetries cannot be spontaneously broken in two dimensions. The latter is true not only for relativistic quantum field theory (Coleman 1973), but also in quantum statistical mechanics (Mermin and Wagner 1966) where it is responsible for the absence of ferromagnetism (see Symmetry Breaking in Field Theory). Spontaneous symmetry breakdown requires long-range order which is overcome by thermal fluctuations down to zero temperature, because these diverge logarithmically (in the thermodynamical limit) in two dimensions. This theorem thus illustrates how the spacetime dimension-dependent size of phase space has an effect on internal symmetries of quantum systems. A detailed mathematical analysis of the balance between phase space (thermal fluctuations) and long-range order (symmetry breakdown) has been given in a recent discussion of the Goldstone theorem (Buchholz, Doplicher, Longo and Roberts 1992).

The Coleman–Mandula theorem, excluding a mixing between internal and spacetime symmetries (see above), is valid only in higher dimensions.

In more recent times, it has become apparent that low-dimensional quantum systems do not only admit more symmetries, but they may exhibit internal symmetries of an entirely new type, not describable by groups of transformations. In this article, we shall focus on the various ways in which the new symmetries can arise, and how they can be understood. In order to properly appreciate these issues, let us first recall some basic symmetry concepts in the conventional case.

In the traditional setting, symmetries arise in the form of groups of transformations of the quantum system which leave observable quantities (e.g., vacuum expectation values and correlation functions) invariant. The symmetries form a group of *-automorphisms of the algebra of fields:

\[
\begin{aligned}
\alpha_g(\phi_1 \phi_2) &= \alpha_g(\phi_1)\,\alpha_g(\phi_2) \\
\alpha_g(\phi^*) &= \alpha_g(\phi)^* \\
\alpha_{g_1} \circ \alpha_{g_2} &= \alpha_{g_1 g_2}
\end{aligned}
\qquad [1]
\]

(typically given by linear transformations of field multiplets). In the strongest case, the automorphisms are implemented by unitary operators on the state space

\[ U(g)\,\phi\,U(g)^* = \alpha_g(\phi) \qquad [2] \]

The implementers form a representation of the group of automorphisms,

\[ U(g_1)\,U(g_2) = U(g_1 g_2) \qquad [3] \]

and there is an invariant vector state (a ground state, or the vacuum state in relativistic quantum field theory),

\[ U(g)\,\Omega = \Omega \qquad [4] \]

However, depending on the dynamics of the quantum system, these relations cannot always be fully realized. One therefore considers several weaker or more general notions of symmetries relevant in four dimensions:

Spontaneously broken symmetries. The transformations are given as automorphisms of an algebra, but which are not unitarily implemented in a given irreducible representation of the algebra. Invariant pure states do not exist.

Projective representations. The symmetries are unitarily implemented, but the implementers fail to satisfy the group law [3]. They give rise to ray (projective) representations or representations of a covering group. In particular, an invariant state vector as in [4] cannot exist in an irreducible representation.

Infinitesimal symmetries. Lie algebras of infinitesimal transformations, given as derivations of an algebra, which cannot be integrated to finite transformations. Derivations may or may not be implemented in a given representation of the algebra by commutators with self-adjoint generators.

Supersymmetry. The infinitesimal transformations form a graded Lie algebra.

Local gauge symmetries form an infinite-dimensional group which are, however, not realized as automorphisms of the quantum algebra. Quantization of classical gauge interactions usually proceeds by breaking the gauge invariance in some way and restoring it at a later stage.
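The projective case in the list of symmetry notions can be made concrete in a small finite example. The following is a minimal sketch, not taken from the article: it assumes the group Z2 x Z2 and implementers U(a, b) = X^a Z^b built from the Pauli matrices X and Z (the names `U`, `cocycle`, `alpha`, and the two check functions are illustrative choices). The implementers satisfy the group law [3] only up to a sign, while the automorphisms A -> U(g) A U(g)* that they induce still compose exactly, as required by the last relation in [1].

```python
from itertools import product

# 2x2 integer matrices as nested tuples; sufficient for the Pauli matrices.
I2 = ((1, 0), (0, 1))
X = ((0, 1), (1, 0))
Z = ((1, 0), (0, -1))


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def scale(c, a):
    """Multiply a matrix by a scalar."""
    return tuple(tuple(c * x for x in row) for row in a)


def transpose(a):
    """Transpose; for the real orthogonal matrices used here this is the adjoint."""
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))


def mul(g, h):
    """Group law of Z2 x Z2 (componentwise addition mod 2)."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)


def U(g):
    """Assumed implementer of g = (a, b): U(g) = X^a Z^b."""
    a, b = g
    m = I2
    if a:
        m = matmul(m, X)
    if b:
        m = matmul(m, Z)
    return m


def cocycle(g, h):
    """Sign omega(g, h) in U(g) U(h) = omega(g, h) U(gh); it comes from ZX = -XZ."""
    return (-1) ** (g[1] * h[0])


def alpha(g, A):
    """Automorphism induced by the implementer: alpha_g(A) = U(g) A U(g)^*."""
    return matmul(matmul(U(g), A), transpose(U(g)))


GROUP = list(product((0, 1), repeat=2))


def group_law_holds_up_to_sign():
    """Check U(g) U(h) = omega(g, h) U(gh) for all pairs."""
    return all(
        matmul(U(g), U(h)) == scale(cocycle(g, h), U(mul(g, h)))
        for g in GROUP
        for h in GROUP
    )


def automorphisms_compose_exactly(A):
    """Check alpha_g(alpha_h(A)) = alpha_{gh}(A): the signs cancel."""
    return all(
        alpha(g, alpha(h, A)) == alpha(mul(g, h), A)
        for g in GROUP
        for h in GROUP
    )
```

Because `cocycle((0, 1), (1, 0))` equals -1, the map g -> U(g) is not a genuine representation of Z2 x Z2, yet the symmetry is perfectly well defined at the level of automorphisms. The sign can be absorbed only by passing to a larger (covering) group, in line with the description of projective representations above.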
174 Symmetries in Quantum Field Theory of Lower Spacetime Dimensions
massive theories always satisfy this selection criterion with a localization region O of the form of a narrow cone extending in spacelike direction. (In massless theories with long-range interactions, such as QED, the situation is more complicated because the charge creates an electric field whose flux at infinity does not vanish (Gauss’ law) and is not Lorentz invariant.) DHR assume that the localization region is even compact, and can be chosen arbitrarily within the unitary equivalence class of the representation.

Exploiting a strong version of locality (Haag duality) for the vacuum representation of the observables, DHR proceed to define an associative composition (or fusion) law for positive-energy representations. This law is commutative only up to unitary equivalence. The crucial point is that the unitary intertwiner establishing this equivalence (the statistics operator) can be chosen in a unique way provided any pair of spacelike disconnected localization regions can be continuously deformed into any other such pair.

This point marks the separation between high and low dimensions. In two dimensions, in each pair of spacelike disconnected regions, one region is to the left of the other, thus distinguishing the pair $(O_1, O_2)$ from $(O_2, O_1)$. Consequently, they cannot be deformed into each other, and there arise two statistics operators. The same holds in three dimensions when the localization regions are spacelike cones, and $O_1$, $O_2$ are taken within (the causal complement of) some larger spacelike cone. If the spacetime dimension is at least 4, or if in three dimensions the localization regions are compact, then the statistics operator is unique and, as a consequence, coincides with its inverse.

The (non-)uniqueness of the statistics operator has far-reaching consequences concerning our original question about the underlying gauge symmetry. Namely, the DHR analysis proceeds to show that the set of positive-energy representations equipped with the composition law, and the linear spaces of intertwiners between different representations, together form the mathematical structure of a C* tensor category. The statistics operators which are distinguished intertwiners give additional structure to this category: this structure is called a (permutation) symmetry if the statistics operators coincide with their inverse, and it is called a braiding otherwise. (It gives rise to a representation of the permutation group or the braid group, respectively.) In other words, the spacetime topology, through the intervention of the uniqueness of the statistics operator, causes the tensor category to be symmetric in high dimensions, and braided in low dimensions.

At a more elementary level, one may think of statistics operators as reflecting commutation relations between the searched-for charged fields. Making an ansatz for the commutation relations at spacelike separation, essentially the same topological argument as before implies, together with Poincaré invariance, that the coefficients appearing in this relation should form a representation of the permutation group, or of the braid group, respectively. The DHR approach, however, is entirely intrinsic, avoiding any a priori assumption of charged fields.

The duality theorem due to Doplicher and Roberts (1990) now states that every symmetric C* tensor category (with some further qualifications valid in the DHR setting) is isomorphic to the category of unitary representations of a compact group, in which the composition law is the tensor product and the (permutation) symmetry is the natural one. Moreover, the category uniquely determines the group, and by a crossed product construction (an action of the category on the algebra A) one reconstructs a field algebra F such that [5] holds. If fermionic sectors are present, then there is some arbitrariness in the commutation relations among the corresponding fermionic fields, which can be exploited to produce the normal commutation relations (fermionic fields anticommute among each other, and bosonic fields commute with any field at spacelike separation). This fixes the field algebra F up to unitary equivalence. The conclusion is that the WWW scenario is the most general in four dimensions (apart from the reservations due to long-range forces, see above).

Generalized Symmetries in Low Dimensions

In view of the success of this program in four dimensions and the advantage of the WWW scenario for model building, the obvious challenge is to search for an analogous understanding of superselection sectors (charges) in low dimensions in terms of an algebra of charged fields and a gauge symmetry distinguishing the observables. This gauge symmetry cannot, in general, be a group for several reasons:

As stated before, the tensor category of superselection sectors possesses only a braiding, rather than a (permutation) symmetry, hence the duality theorem fails.

One can associate a (statistical) dimension $d_\rho$ to each superselection sector $[\rho]$ which is multiplicative under the composition law (fusion), and additive under direct sums. In a symmetric
category, the dimensions are necessarily positive integers. Indeed, in the WWW scenario, they coincide with the naive dimension of the associated representation of the gauge group. But in the low-dimensional models, the dimensions turn out to be nonintegers in general.

Moore and Seiberg (1988) have axiomatized the superselection structure of chiral and two-dimensional conformal field theories in terms of a system of recoupling and braiding coefficients controlling the fusion of sectors and its noncommutativity. (In fact, this system is basically equivalent to the DHR category.) For models such as SU(2) current algebras at level k, these coefficients turn out to coincide with the recoupling and braiding coefficients one can associate with a quantum group deformation (Drinfel’d 1986) of SU(2) with deformation parameter $q = \exp(i\pi/k)$. Representations of quantum groups (quasitriangular Hopf algebras, see Hopf Algebras and q-Deformation Quantum Groups) have a tensor product defined in terms of a noncocommutative coproduct. Moreover, they possess a quantum dimension which is a q-deformation of an integer. The quantum dimensions precisely match the statistical dimensions of the superselection sectors. All this strongly suggests that quantum groups appear as generalized symmetries in two dimensions, at least in a large class of models.

A natural testing ground for the search for appropriate generalized symmetry concepts in low dimensions is the abundance of models in chiral and two-dimensional conformal QFT (see Two-Dimensional Models). As mentioned before, conformal symmetry in two dimensions has far-reaching consequences, especially the existence of chiral quantum fields which are defined on a one-dimensional light ray. As a null direction in the two-dimensional spacetime, this ray unites both the spacelike property of carrying a causal structure, and the timelike property that the generator of translations has positive spectrum (energy). These two features together with Möbius covariance are so powerful that they allow for the exact construction of large classes of models. The most elementary ones (minimal models) are completely described by the chiral stress–energy density field, that is, the local generator of the conformal symmetry. Other models also contain currents which are the local generators of internal symmetries. These models exhibit many nontrivial superselection structures, which illustrate the wide range of possible deviations from higher-dimensional QFT, and at the same time exhibit possible approaches to appropriate symmetry concepts in low dimensions.

Attempts to classify the possible algebraic structures of generalized internal symmetries in a model-independent setting start from the idea that the representation category of the internal symmetries of a given model should be equivalent to the tensor category of its superselection sectors. Several algebraic structures have been proposed as candidates, complying with this idea. They all assume specific modifications or deformations of eqns [1]–[5] above, highly constrained by self-consistency. Among these proposals are:

quantum groups (see e.g., Fröhlich and Kerler 1993),

weak quasiquantum groups (Mack and Schomerus 1992) and rational Hopf algebras (Fuchs et al. 1994),

weak C* Hopf algebras (Rehren 1997, Böhm and Szlachányi 1996) or quantum groupoids (Nikshych and Vainerman 1998), and

braided groups (Majid 1991).

In several cases, the respective “symmetry algebra” can be reconstructed from the tensor category of superselection sectors, and a field algebra with linear transformation behavior can be constructed which contains the observables as invariant elements as in [5]. However, the situation is unsatisfactory for various reasons. First, the class of QFT models for which these constructions have been performed is quite restricted (most constructions work only for rational models, i.e., models with a finite set of charges); second, the reconstructed symmetry algebra is not unique and finally, the constructed field algebras have features which diverge significantly from the WWW scenario. For example, it is not always warranted that the quantum symmetries are consistent with the *-structure, indispensable for Hilbert space positivity (a necessary prerequisite for the probability interpretation of quantum theory). Moreover, typically there are global gauge transformations which are implemented by localized field operators, thus exhibiting a mixing of local and global concepts. It also happens that this holds for elements in the center of the symmetry algebra, which implies that the field algebra is not local relative to its gauge invariant elements, that is, the charged fields do not commute with the gauge-invariant elements at spacelike separation. In other constructions, the field algebra is not associative, or there are no finite field multiplets.

Historically, the first candidate for a “symmetry algebra” compatible with braid group statistics has
been the structure of a quantum group, as mentioned above. However, in physically interesting models, the quantum group is not semisimple and thus has too many (namely, indecomposable) representations. Solutions to this problem have been:

1. A BRS approach in an indefinite-metric framework (Hadjiivanov et al. 1991),

2. “Truncation,” that is, discarding the “unphysical” representations. Fröhlich and Kerler (1993) have done this consistently in a categorical framework. In fact, they have given a complete classification of the possible braided tensor categories generated by a single irreducible object with statistical dimension d satisfying 1 < d < 2, in terms of categories constructed from the “truncated” representations of $U_q(sl_2)$. Truncation can also be performed by dividing the quantum group itself through the ideal which is annihilated by all “physical” representations, leading to a weak quasiquantum group (Mack and Schomerus 1992).

3. Relaxing the axioms, thus admitting the more general structures mentioned above.

All the above approaches assume a given generalized symmetry concept and show to what extent field algebras complying with it can be constructed. They thus concern nonobservable objects, and it is no contradiction if different symmetry concepts can be associated with the same observable data.

A more radical concept of global gauge symmetry, applicable to the low-dimensional case, has been developed by Longo and Rehren (1995). Its point of departure is the notion of a conditional expectation, which has the same abstract properties as a group average. In the WWW scenario, the Haar measure of the compact gauge group defines an average

\[ \mu : F \ni \phi \mapsto \int d\mu(g)\, \alpha_g(\phi) \in A \qquad [6] \]

which is a positive linear map respecting the localization, and the observables are invariant, $\mu(a) = a$. In fact, the observables are exactly the image of this map, that is, [5] is equivalently formulated, but without reference to the group transformations, as

\[ A(O) = \mu(F(O)) \qquad [7] \]

Turning to the observables A of a quantum field theory in low dimensions, one looks for a quantum field theory F, containing A and equipped with a conditional expectation $\mu$ such that [7] holds, and which preserves the vacuum state. F may not satisfy local commutativity, but it should be local relative to the observables in the sense mentioned before. In rational chiral CFT, such extensions can be classified (and indeed constructed) in terms of the superselection category of A, giving direct access to the decomposition of the vacuum Hilbert space of F into superselection sectors of A. The advantage here is that no problems with Hilbert space structure can arise (because the approach is entirely in terms of operator algebras); a drawback is that in general F is not unique, and nonvacuum representations of F also have to be considered in order to generate all sectors of A.

The method can be used to classify and construct both nonlocal chiral extensions as candidates for sector-generating field algebras for a theory A of chiral observables, and local two-dimensional quantum field theories containing two given chiral subtheories, that is, observable algebras of two-dimensional models (Kawahigashi and Longo 2004). The chiral sector structure of the latter models is described by a “modular invariant.” In many cases, this means that their thermal partition functions are invariant under the group PSL(2, Z) of modular transformations of the temperature (see below).

At this point, another link between spacetime and internal symmetries may be noted. The modular theory of von Neumann algebras (see Tomita–Takesaki Modular Theory) associates a one-parameter group of automorphisms (called the “modular group”) with a state and an algebra “in standard position.” In quantum field theory, for the vacuum state and an algebra of observables localized in certain wedge regions of Minkowski spacetime, this group can be identified with a boost subgroup of the Lorentz group (Bisognano and Wichmann 1975). Similarly, in chiral CFT on the circle, the modular group associated with the observables in an interval and the vacuum coincides with a subgroup of the Möbius group. For nonlocal theories, there may be an obstruction, however. On the other hand, if a subalgebra is stable under the modular group of some algebra, then there is a conditional expectation from the larger algebra onto the smaller algebra. Combining these general theorems, the Möbius covariance of the inclusions $A(O) \subset F(O)$ implies the existence of a conditional expectation, that is, the above generalization of the average over the internal symmetry. Moreover, assuming a generalized notion of compactness (“finite index”) for the generalized internal symmetry, the Bisognano–Wichmann property holds also for nonlocal theories (Longo and Rehren 2004).

Of course, there is also a WWW scenario in chiral theories, that is, one may restrict a local theory to its invariants under some group of internal gauge
symmetries (“orbifold models”). It then happens that the invariants not only have the expected superselection sectors in correspondence with the representations of the gauge group, but in addition “twisted” sectors appear which, together with the former, constitute a “quantum double” structure. The twisted sectors arise by restriction of solitonic sectors of the original theory, which are in one-to-one correspondence with the elements of the gauge group (Müger 2005). Solitonic sectors are localizable with respect to two different vacua, and do not admit an unrestricted composition law.

Special Issues

A particularly simple situation is the case of anyons, that is, when all sectors have statistical dimension 1. Then the sectors form an abelian group $\hat{G}$ under fusion, and one can construct a WWW scenario with global gauge group G, the dual of $\hat{G}$. The ensuing quantum fields satisfy generalized commutation relations at spacelike separation, given by an abelian representation of the braid group, where the coefficients can be arbitrary complex phases (responsible for the name “anyons”). However, it is known that there can arise an obstruction, which enforces the “local” global gauge transformations (mentioned before) to be present. In this case, the gauge symmetry can also be described by a quasiquantum group. It is noteworthy that free anyon fields have been constructed in two-dimensional spacetime, while in three dimensions there can be no (cone-)localized massive anyon fields which are free in the sense that they generate only single-particle states from the vacuum (Mund 1998).

The charge structure of massive quantum field theories in two dimensions is very different both from that encountered in conformal quantum field theories, and from the charge structure in high dimensions. It has been observed long ago that, in contrast to four dimensions, the strong locality property (Haag duality) which is necessary to set up the DHR analysis of superselection sectors, fails for the algebra of invariants under an internal gauge group in two dimensions. This algebraic feature can be traced back to the fact that the causal complement of a point is disconnected in two dimensions, or, in physical terms, that “a charge cannot be transported around a detector” without passing through its region of causal dependence. Müger (1998) has shown that any algebra of observables which satisfies Haag duality, cannot possess any nontrivial DHR superselection sectors at all, and that the only sectors which can exist are solitonic sectors. This general result nicely complies with the experience with integrable models, as mentioned before.

There are also some results giving interesting insight, which can be obtained intrinsically in terms of the observables. One of them concerns “central” observables (generalized Casimir operators).

Casimir operators in the WWW scenario are functions of the generators of the internal symmetry which usually are integrals over densities belonging to the field algebra F (Noether’s theorem). Since they also commute with the generators, they can be approximated by local observables, and are therefore defined in each representation of the latter. By Schur’s lemma, they are multiples of the identity in each irreducible sector. Since the eigenvalues of Casimir operators distinguish the representations of the gauge group, they also distinguish the sectors.

In chiral CFT extended to the circle (see above), one can find global “charge measuring operators” $C_i$, one for each sector $\pi_i$, in the center of the observable algebra (Fredenhagen et al. 1992) which have similar properties. They arise as a consequence of an algebraic obstruction to define the charged sectors on the circle, related to a nontrivial effect if a charge is “transported once around the circle,” and form an operator representation of the fusion rules within the global algebra of observables. Under rather natural conditions clarified by Kawahigashi, Longo, and Müger (2001), the matrix of eigenvalues $\pi_j(C_i)$ is nondegenerate, that is, the generalized Casimir operators completely distinguish the superselection sectors. In this case, the superselection category is a modular category (see Braided and Modular Tensor Categories): the matrix with entries $d_j\,\pi_j(C_i)$ and the diagonal matrix with entries $\pi_j(U)$ (where U is the Möbius rotation by $2\pi$) are multiples of the generators S and T of the “modular group” PSL(2, Z), in a matrix representation labeled by the superselection sectors of the chiral observables. The physical significance of this matrix representation is that it relates thermal expectation values for different values of the temperature (Cardy 1986, Kac and Peterson 1984, Verlinde 1988).

These examples, together with the failure of the Coleman–Mandula theorem, may illustrate the intricate relations among spacetime geometry, covariance, and internal symmetry (charge structure) in low dimensions. In relativistic quantum field theory, the link is provided by the principle of locality, which “turns geometry into algebra.”

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Braided and Modular Tensor Categories; Hopf Algebras and q-Deformation
Symmetries in Quantum Field Theory: Algebraic Aspects 179
Quantum Groups; Integrability and Quantum Field Theory; Quantum Field Theory: A Brief Introduction; Quantum Fields with Topological Defects; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetry Breaking in Field Theory; Tomita–Takesaki Modular Theory; Two-Dimensional Conformal Field Theory and Vertex Operator Algebras; Two-Dimensional Models.

Further Reading

Böhm G and Szlachányi K (1996) A coassociative C*-quantum group with non-integral dimensions. Letters in Mathematical Physics 35: 437–448.
Coleman S and Mandula J (1967) All possible symmetries of the S-matrix. Physical Review 159: 1251–1256.
Doplicher S and Roberts JE (1990) Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Communications in Mathematical Physics 131: 51–107.
Fredenhagen K, Rehren K-H, and Schroer B (1992) Superselection sectors with braid group statistics and exchange algebras II: geometric aspects and conformal covariance. Reviews in Mathematical Physics SI1: 113–157.
Fröhlich J and Kerler T (1993) Quantum Groups, Quantum Categories, and Quantum Field Theory. Lecture Notes in Mathematics, vol. 1542. Berlin: Springer.
Fuchs J, Ganchev A, and Vecsernyés P (1994) On the quantum symmetry of rational field theories. Theoretical Mathematical Physics 98: 266–276.
Hadjiivanov LK, Paunov RR, and Todorov IT (1991) Quantum group extended chiral p-models. Nuclear Physics B 356: 387–438.
Longo R and Rehren K-H (1995) Nets of subfactors. Reviews in Mathematical Physics 7: 567–597.
Mack G and Schomerus V (1992) Quasi Hopf quantum symmetry in quantum theory. Nuclear Physics B 370: 185–230.
Majid S (1991) Braided groups and algebraic quantum field theories. Letters in Mathematical Physics 22: 167–175.
Moore G and Seiberg N (1989) Classical and quantum conformal field theory. Communications in Mathematical Physics 123: 177–254.
Müger M (1998) Superselection structure of massive quantum field theories in 1 + 1 dimensions. Reviews in Mathematical Physics 10: 1147–1170.
Mund J (1998) No-go theorem for “free” relativistic anyons in d = 2 + 1. Letters in Mathematical Physics 43: 319–328.
Rehren K-H (1997) Weak C* Hopf symmetry. In: Doebner H-D and Dobrev VK (eds.) Group Theoretical Methods in Physics, pp. 62–69. (q-alg/9611007). Sofia: Heron Press.
Schomerus V (1995) Construction of field algebras with quantum symmetries from local observables. Communications in Mathematical Physics 169: 193–236.
Wick GC, Wightman AS, and Wigner EP (1952) The intrinsic parity of elementary particles. Physical Review 88: 101–105.
transition probabilities $|(\Phi, \Psi)|^2$. This formed the starting point for Wigner’s analysis, who concluded:

Theorem Every symmetry is of the form $A \mapsto UAU^{-1}$ and $\rho \mapsto U\rho U^{-1}$, where U is a unitary or antiunitary operator.

As could have been foreseen from the outset, this simple result in no way distinguishes one elementary quantum-mechanical system from another. A more useful notion of symmetry results if the Hamiltonian is reckoned as part of the information describing the system and, therefore, has to be left invariant by a symmetry. The operator U above must therefore satisfy the condition $UHU^{-1} = H$ and it commutes with the Hamiltonian. As the Hamiltonian is the generator of time translations, U is a constant of motion. This is the genesis of the relation between symmetries and conservation laws.

Quantum Field Theories

The simplest types of quantum field theories can be described by von Neumann algebras A(O) depending on double cones O and subject to

\[ O_1 \subset O_2 \;\Rightarrow\; A(O_1) \subset A(O_2) \]

a structure referred to as the net of observables. An alternative approach would be to use the Wightman formalism. This would need a discussion of pointlike fields and the domains of definition of unbounded operators, thus complicating a general exposition of symmetry.

Comparing this description of a quantum field theory with that of an elementary quantum-mechanical system, the net clearly substitutes observables but nothing has yet been said about states. Since the set of double cones is directed under inclusion, the union of the A(O) is a *-algebra A and a state of our system is a state on this algebra. Most states are of no physical relevance. A characterization of the states of physical relevance, even say to elementary particle physics, is not known although some progress has been made.

The net structure is the hallmark of a field theory and allows us to distinguish two important classes of symmetries. An internal symmetry $\alpha$ satisfies the condition

\[ \alpha(A(O)) = A(O) \]

for all double cones O. By contrast, a spacetime symmetry is an automorphism $\alpha_L$ implementing a Poincaré transformation L and hence satisfying the condition

\[ \alpha_L(A(O)) = A(LO) \]

for every double cone O. It is usually the case that internal symmetries commute with spacetime symmetries.

The state of prime relevance to elementary particle physics is the vacuum state $\omega_0$. The corresponding Gelfand–Naimark–Segal (GNS) representation $\pi_0$ is called the vacuum representation. Now the vacuum state of a quantum field theory is typically unique and as such invariant under a symmetry $\alpha$ of the system, $\omega_0 \circ \alpha^{-1} = \omega_0$.

Spacetime Symmetries

Since the vacuum state is invariant, we have a unitary representation of the Poincaré group implementing the spacetime symmetries in the vacuum representation. To illustrate the role of representations up to a factor, we take instead the GNS representation of a pure state corresponding to a particle of half-integral spin. Here we need a unitary representation of the covering group of the Poincaré group, inhomogeneous SL(2, C) to implement the symmetries. The situation for the subgroup of rotations is the same.

The most important property of these representations is positivity of the energy. More precisely, in a representation of relevance to elementary particle physics such as the vacuum representation, the generator $P_0$ of time translations is a positive operator, $P_0 \geq 0$. Expressed in a frame-independent way, the spectrum of spacetime translations is contained in the closed forward light cone. It is one of the basic principles to be exploited in applying quantum field theory to elementary particle physics. Notice that the principle is no longer valid for an equilibrium state.

A similar situation arises in conformal field theory. Here the role of double cones in Minkowski space is played by intervals on the circle and that of the Poincaré group by the Möbius group on the circle PSL(2, R). Again, the Möbius group cannot always be unitarily implemented and conformal invariance is defined via a continuous unitary representation of its covering group. Most importantly, there is an analog of positivity of the energy. The generator of rotations of the circle is a positive operator.

A remarkable aspect of spacetime symmetries was discovered by Bisognano and Wichmann in an application of modular theory in the field-theoretical context looking not at double cones but at wedges. A wedge W is a Poincaré transform of the standard wedge $x^1 > |x^0|$. They found that the modular automorphisms of A(W) and the vacuum vector $\Omega_0$
have a geometric significance. For the standard wedge, they got the following result.

Theorem If the net is derived from Wightman fields, the modular operator is e^{−2πK}, where K is the generator of boosts in the 1-direction and the modular conjugation is ZΘR, where Θ is the TCP-operator, R is the rotation through π about the 1-axis, and Z is the unitary operator equal to 1 on the Bose subspace and i on the Fermi subspace.

The modular data for A(O) and Ω0 also admit a geometric interpretation for the free massless scalar field.

These facts enhance our understanding of spacetime symmetries. The ideas have meanwhile been applied to curved spacetime to select a state with vacuum-like properties using the principle of the geometric action of the modular conjugation.

Gauge Symmetry

Gauge symmetries do not fit into our scheme in that they act trivially on the observable algebra A. To exhibit a gauge symmetry we need a larger net F called the field net. The gauge group will be the group of automorphisms of F leaving the subnet A pointwise fixed and A the subnet of F of fixed points under G. This has the merit of indicating the mathematical framework for gauge symmetry but otherwise begs important questions. A priori one does not know what properties F should have nor how it should be constructed.

The right approach is to understand what intrinsic structure of A governs the existence of a nontrivial gauge group. This brings us back to the states or representations relevant to elementary particle physics. A condition for selecting some of these relevant representations is that asymptotically they be like the vacuum in spacelike directions. More precisely, π must be unitarily equivalent to the vacuum representation π0 on the spacelike complement of every double cone.

The resulting theory of superselection sectors hinges on the property of Haag duality that, for each double cone O,

\[ A(O) = A(O')' \]

where O′ denotes the spacelike complement of O. It implies that every representation satisfying the selection criterion is unitarily equivalent to one of the form π0 ∘ ρ, where ρ is an endomorphism of A localized in some fixed but arbitrary double cone, that is, ρ(A) = A if A ∈ A(O′). The endomorphisms thus obtained are closed under composition and hence the objects of a full tensor subcategory T of the category of all endomorphisms and their intertwiners. There is a dimension function d defined on the objects of T, d(ρ) = 1, 2, ..., ∞. If T_f denotes the full subcategory whose objects have finite dimension, then the following result holds.

Theorem T_f is equivalent to the tensor category of finite-dimensional continuous unitary representations of a canonical compact group G. There is a canonical field net F with Bose–Fermi commutation relations extending A such that G is the group of automorphisms of F leaving A pointwise fixed.

The first step in the proof is to define and analyze the statistics of the representations in question. The statistics of an irreducible representation can be classified as being para-Bose or para-Fermi of order d(ρ). The second step is to show that each ρ of finite dimension has a well-defined conjugate up to equivalence. The third and most difficult step is showing that T_f can be embedded in the tensor category of Hilbert spaces.

The Local Implementation of Symmetries

Gauge symmetry has its associated conservation laws in that the different sectors of the last section are labeled by conserved quantities such as baryon number, lepton number, or electric charge, generically called charges. The theory is built round the idea of creating charge and elements of the field net carry charges. But there should be a dual approach based on measuring charges. One would like to prove the existence of local conserved currents corresponding to these charges. This has not proved possible but there is a good substitute, described below, which can be regarded as a weak version of a quantum Noether theorem.

If O1 ⊂ O2 is a strict inclusion of double cones, then the theory is said to satisfy the split property if there is a type I factor M such that

\[ A(O_1) \subset M \subset A(O_2) \]

where a type I factor is a von Neumann algebra isomorphic to some B(H). In this case M can be chosen in a canonical fashion and there is an isomorphism ψ called the universal localizing map of B(H) onto M, where H is the underlying Hilbert space. We have ψ(A) = A for A ∈ A(O1).

Theorem If U is an implementing representation of the internal symmetry group G, ψ(U) will be a representation of G in M that continues to implement the symmetry on A(O1). If G is a Lie group
A state of a von Neumann algebra is said to be normal if it is continuous in the σ-topology. If ω is normal, then π_ω(R) is σ-closed.

An inclusion of unital von Neumann algebras has the split property if there is an intermediate type I factor, that is, if it has the form R1 ⊂ B(H) ⊂ R2.

The following elementary observation is often used in treating symmetries. If α is an automorphism of A with ω ∘ α^{−1} = ω, there is a unique unitary operator U leaving the cyclic vector Ω invariant and inducing α in the representation π_ω. In other words, UΩ = Ω and

\[ U \pi_\omega(A) U^{-1} = \pi_\omega(\alpha(A)) \]

If we apply the above lemma to a group G of symmetries leaving a state invariant, it yields a group U(g) of unitaries satisfying the condition

\[ U(gh) = U(g)U(h), \qquad g, h \in G \]

since U(g) is uniquely defined by the above conditions.

When there is no invariant state, the situation is more complicated. Suppose there is a group G of symmetries and a representation π of A where each α_g is unitarily implemented. Thus, there is a unitary U(g) with

and JRJ = R′. J is called the modular conjugation, Δ the modular operator, and Δ^{it} the modular automorphisms. The closure of {Δ^{1/4}AΩ : A ∈ R, A ≥ 0} is a cone, called the natural cone. Every normal state of R is implemented by a unique vector in the natural cone. If α is an automorphism of R, there is therefore a unique vector Ω_α in the natural cone such that, for every A ∈ R,

\[ (\Omega, \alpha^{-1}(A)\Omega) = (\Omega_\alpha, A\,\Omega_\alpha) \]

There is now a canonical unitary operator V_α defined by

\[ V_\alpha A \Omega = \alpha(A)\,\Omega_\alpha \]

V_α maps the natural cone into itself and α ↦ V_α is an implementing representation of the group of automorphisms of R. Under these circumstances, we do not have to deal with representations up to a factor.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Boundary Conformal Field Theory; Current Algebra; Quantum Fields with Topological Defects; Supergravity; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Two-Dimensional Models.
\[ f(\gamma x) = \gamma f(x) \quad \text{for all } x \in \mathbb{R}^n,\ \gamma \in \Gamma \tag{2} \]

Equivalently, if x(t) is a solution and γ ∈ Γ, then γx(t) is a solution.

In this article, we are interested in the dynamics to be expected for equivariant vector fields, and transitions that arise as parameters are varied. The symmetry group Γ is taken as given, whereas f is a general Γ-equivariant vector field. (Other features such as energy conservation or time reversibility must be built into the general setup, but are excluded in this article.)

For m even, the isotropy subgroups up to conjugacy are

D_m, Z2(κ), Z2(κρ), 1

where Z_j(g) denotes the cyclic group of order j generated by g. The maximal isotropy subgroups Σ = Z2(κ), Z2(κρ) are axial with N(Σ)/Σ ≅ Z2. For m odd, Z2(κ) is conjugate to Z2(κρ) leaving three conjugacy classes of isotropy subgroups, and Σ = Z2(κ) is axial with N(Σ)/Σ = 1.

The space of commuting linear maps
If the action of Γ is not irreducible, write R^n = V1 ⊕ ⋯ ⊕ Vk (nonuniquely) as a sum of irreducible subspaces. Summing together irreducible subspaces that are isomorphic to form isotypic components W_j gives the (unique) isotypic decomposition R^n = W1 ⊕ ⋯ ⊕ W_ℓ. If L ∈ Hom_Γ(R^n), then L(W_j) ⊂ W_j for each j, hence Hom_Γ(R^n) = Hom_Γ(W1) ⊕ ⋯ ⊕ Hom_Γ(W_ℓ). Each W_j consists of k_j isomorphic copies of an irreducible representation with division ring D_j. Let M_k(D) denote the space of k × k matrices with entries in D. Then

\[ \mathrm{Hom}_\Gamma(\mathbb{R}^n) \cong M_{k_1}(D_1) \oplus \cdots \oplus M_{k_\ell}(D_\ell) \tag{3} \]

Spectral properties of commuting linear maps can be recovered from the decomposition [3], paying due attention to multiplicity and complex conjugates of eigenvalues.

Equivariant Dynamics

The dynamics of equivariant systems includes (relative) equilibria and periodic solutions, robust heteroclinic cycles/networks, and symmetric chaotic attractors.

Equilibria

Consider the ODE [1] with Γ-equivariant vector field f satisfying [2]. If x(t) ≡ x0 is an equilibrium, f(x0) = 0, then there is a group orbit Γx0 of equilibria.

Let Σ = Σ_{x0} be the isotropy subgroup of x0. If dim Σ = dim Γ, then generically (for an open dense set of Γ-equivariant vector fields), the eigenvalues of (df)_{x0} have nonzero real part, hence x0 is hyperbolic. If the eigenvalues all have negative real part, then x0 is asymptotically stable. If at least one eigenvalue has positive real part, then x0 is unstable. Hyperbolic equilibria are isolated and persist under perturbations of f; the perturbed equilibria continue to have isotropy Σ. Since (df)_{x0} ∈ Hom_Σ(R^n), decomposition [3] for the action of Σ on R^n facilitates stability computations for x0.

If dim Σ < dim Γ, then Γx0 is a continuous group orbit of equilibria. Generically, dim ker (df)_{x0} = dim Γ − dim Σ and ker (df)_{x0} = {ξx0 : ξ ∈ LΓ}, where LΓ is the Lie algebra of Γ. The remaining k = n − dim Γ + dim Σ eigenvalues generically have nonzero real part so Γx0 is normally hyperbolic. If all k eigenvalues have negative real part, then Γx0 is asymptotically stable. If at least one has positive real part, then Γx0 is unstable. When N(Σ)/Σ is finite, generically x0 is an isolated equilibrium in Fix Σ and persists as an equilibrium with isotropy Σ under perturbation.

Relative Equilibria and Skew Products

A point x0 ∈ R^n (or the corresponding group orbit Γx0) is a relative equilibrium if f(x0) ∈ T_{x0}Γx0 = LΓ x0. If x0 has isotropy Σ, then x0 is a relative equilibrium if f(x0) ∈ LD x0, where D = (N(Σ)/Σ)^0. Write f(x0) = ξx0, where ξ ∈ LD. The closure of the one-parameter subgroup exp(tξ) is a maximal torus in D for almost every ξ. All maximal tori are conjugate with common dimension d = rank D. The solution x(t) = exp(tξ)x0 is typically a d-dimensional quasiperiodic motion. "Typically" holds in both the topological and probabilistic sense and there is no phase-locking. When d = 1, x(t) is periodic, often called a rotating wave.

Choose a Σ-invariant local cross section X to the group orbit Γx0 at x0. There is a Γ-invariant neighborhood of Γx0 that is Γ-equivariantly diffeomorphic to (Γ × X)/Σ, where Σ acts freely on Γ × X by

\[ \sigma \cdot (\gamma, x) = (\gamma\sigma^{-1}, \sigma x) \]

and Γ acts by left multiplication on the first factor. The Γ-equivariant ODE on (Γ × X)/Σ lifts to a (Γ × Σ)-equivariant skew product on Γ × X

\[ \dot\gamma = \gamma\,\xi(x), \qquad \dot x = h(x) \tag{4} \]

where ξ : X → LΓ, h : X → X satisfy the Σ-equivariance conditions

\[ \xi(\sigma x) = \mathrm{Ad}_\sigma\, \xi(x) = \sigma\,\xi(x)\,\sigma^{-1}, \qquad h(\sigma x) = \sigma h(x) \]

and h(x0) = 0.

Thus, dynamics near the relative equilibrium Γx0 ⊂ R^n reduces to dynamics near the ordinary equilibrium x0 ∈ X for the Σ-equivariant vector field h : X → X, coupled with Γ drifts. In particular, the stability of Γx0 is determined by (dh)_{x0}.

Periodic Solutions

A nonequilibrium solution x(t) is periodic if x(t + T) = x(t) for some T > 0. The least such T is the (absolute) period. The spatial symmetry group Σ is the isotropy subgroup of x(t) for some, and hence all, t ∈ R. The periodic solution P = {x(t) : 0 ≤ t < T} lies inside Fix Σ. Define the spatiotemporal symmetry group Δ = {γ ∈ Γ : γP = P}. Note that Σ is a normal subgroup of Δ and either Δ/Σ ≅ S¹ (P is a rotating wave) or Δ/Σ ≅ Z_q and P is called a standing wave or a discrete rotating wave. For each δ ∈ Δ, there exists T_δ ∈ [0, T) such that δx(t) = x(t + T_δ). The relative period of x(t) is the least T > 0 such that x(T) ∈ Γx0.

If dim Σ = dim Γ, then generically P is hyperbolic, hence isolated, the stability of P is determined by its
186 Symmetry and Symmetry Breaking in Dynamical Systems
Floquet exponents, and P persists under perturbation as a periodic solution with spatial symmetry Σ and spatiotemporal symmetry Δ. For Γ infinite and N(Σ)/Σ finite, generically P is isolated in Fix Σ and the neutral Floquet exponent has multiplicity dim Γ − dim Σ + 1.

Relative Periodic Solutions

A solution x(t) is a relative periodic solution if it is not a relative equilibrium and x(T) ∈ Γx(0) for some T > 0. The least such T is the relative period. The spatial symmetry group is Σ = Σ_{x(t)} for some, hence all, t. The spatiotemporal symmetry group Δ is the closed subgroup of Γ generated by Σ and σ, where x(T) = σx(0), and generically Δ/Σ ≅ T^d × Z_q is a maximal topologically cyclic (Cartan) subgroup of N(Σ)/Σ containing σ. Then x(t) is a (d + 1)-dimensional quasiperiodic motion.

The dynamics near the relative periodic solution is again governed by a skew product. There exists n ≥ 1 such that σ^n = exp(nξ), where ξ ∈ LZ(Σ) and Z(Σ) is the centralizer of Σ. Define α = exp(−ξ)σ. Form a semidirect product Σ ⋊ Z_{2n} by adjoining to Σ an element Q of order 2n such that QρQ^{−1} = αρα^{−1} for ρ ∈ Σ.

In a comoving frame with velocity ξ, a neighborhood of the relative periodic orbit is Γ-equivariantly diffeomorphic to (Γ × X × S¹)/Σ ⋊ Z_{2n}, where X is a Σ ⋊ Z_{2n}-invariant cross section, S¹ = R/2nZ and Σ ⋊ Z_{2n} acts on Γ × X × S¹ as

\[ \rho \cdot (\gamma, x, \theta) = (\gamma\rho^{-1}, \rho x, \theta), \qquad Q \cdot (\gamma, x, \theta) = (\gamma\alpha^{-1}, Qx, \theta + 1) \]

The Γ-equivariant ODE on (Γ × X × S¹)/Σ ⋊ Z_{2n} lifts to a Γ × (Σ ⋊ Z_{2n})-equivariant skew product

\[ \dot\gamma = \gamma\,\xi(x, \theta), \qquad \dot x = h(x, \theta), \qquad \dot\theta = 1 \tag{5} \]

where ξ : X × S¹ → LΓ, h : X × S¹ → X satisfy appropriate Σ ⋊ Z_{2n}-equivariance conditions.

Robust Heteroclinic Cycles

Heteroclinic cycles, degenerate in systems without symmetry, arise robustly in equivariant systems. Let x1, ..., xm ∈ R^n be saddles with W^u(x_i) − {x_i} ⊂ W^s(x_{i+1}) (where m + 1 = 1). If Σ1, ..., Σm are isotropy subgroups, W^u(x_i) ⊂ Fix Σ_i, and x_{i+1} is a sink in Fix Σ_i, then saddle–sink connections from x_i to x_{i+1} persist for nearby Γ-equivariant flows. The union ⋃_{i=1}^{m} W^u(x_i) forms a robust heteroclinic cycle (see the subsection "Dynamics" for an example). Such cycles, when asymptotically stable, are a mechanism for intermittency or bursting, notably in rotating Rayleigh–Bénard convection (where rolls disappear and reorient themselves at approximately 60°), and provide a possible intrinsic explanation for irregular reversals of the Earth's magnetic field. Asymmetric perturbations (deterministic or noisy) destroy the cycles, but the perturbed attractors inherit the bursting behavior.

Establishing the existence of heteroclinic connections is often straightforward when dim Fix Σ_i = 2 and nontrivial with dim Fix Σ_i ≥ 3. Criteria for asymptotic stability of heteroclinic cycles are given in terms of real parts of eigenvalues of (df)_{x_i}, and depend on the geometry of the representation of Γ. Robust cycles exist also between more complicated dynamical states such as periodic solutions or chaotic sets (cycling chaos). When W^u(x_i) connects to two or more distinct states, the collection of unstable manifolds forms a heteroclinic network leading to competition between various subnetworks.

Symmetric Attractors

Suppose that Γ is a finite group acting linearly on R^n. A closed subset A ⊂ R^n has symmetry groups Σ = {γ ∈ Γ : γx = x for all x ∈ A}, Δ = {γ ∈ Γ : γA = A}. Here, Σ is an isotropy subgroup and Σ ⊂ Δ ⊂ N(Σ). In applications, Σ corresponds to instantaneous symmetry and Δ to symmetry on average.

If A is an attractor (a Lyapunov stable ω-limit set) for a Γ-equivariant vector field f : R^n → R^n, then Δ fixes a connected component of Fix Σ − L, where L is the union of proper fixed-point spaces in Fix Σ. Provided dim Fix Σ ≥ 3, all pairs Σ, Δ satisfying the above restrictions arise as symmetry groups of a nonperiodic attractor A. If dim Fix Σ ≥ 5, then A is realized by a uniformly hyperbolic (Axiom A) attractor.

If dim Fix Σ ≥ 3 and Δ fixes a connected component of Fix Σ − L, then A is realized by a periodic sink provided Δ/Σ is cyclic. If dim Fix Σ = 2, then in addition either Δ = Σ or Δ = N(Σ).

Suppose A is an attractor and γ ∈ Γ − Δ. Then γA ∩ A = ∅. Varying a parameter, A may undergo a symmetry-increasing bifurcation: A grows until it collides with γA producing a larger attractor with symmetry on average generated by Δ and γ.

Determining symmetries of an attractor by inspection is often infeasible. A detective is a Γ-equivariant polynomial φ : R^n → V where every subgroup of Γ is an isotropy subgroup for the action on V, and each component of φ is nonzero. Suppose that A ⊂ R^n is an attractor with physical (Sinai–Ruelle–Bowen) measure μ. By ergodicity, the time average

\[ \phi_A = \lim_{T \to \infty} \frac{1}{T} \int_0^T \phi(x(t))\,dt \in V \]
is well defined for almost every trajectory x(t) in supp μ. Generically, Σ_{φ_A} = Δ_A so computing the symmetry of A reduces to computing the symmetry of a point.

If Γ is an infinite compact Lie group, and A is an ω-limit set containing points of trivial isotropy, then A cannot be uniformly hyperbolic. Hence partially hyperbolic flows arise naturally in systems with continuous symmetry. Consider the skew product [4] where Σ = 1 and h : X → X possesses a hyperbolic basic set Λ ⊂ X with equilibrium measure μ (for a Hölder potential). Let ν denote Haar measure on Γ. Then Γ × Λ is partially hyperbolic, and ν × μ is ergodic (even Bernoulli) for an open dense set of equivariant flows. Such stably ergodic flows possess strong statistical properties (rapid decay of correlations, central-limit theorem); a possible explanation for hypermeander (Brownian-like motion) of spiral waves in planar excitable media.

Forced Symmetry Breaking

In applications, symmetry is not perfect and account should be taken of Γ0-equivariant perturbations of [1] for Γ0 a subgroup of Γ (including Γ0 = 1). This topic is not discussed in this article, except in the subsections "Robust heteroclinic cycles" and "Branching patterns and finite determinacy."

Equivariant Bifurcation Theory

Consider families of ODEs ẋ = f(x, λ), with bifurcation parameter λ ∈ R and vector field f : R^n × R → R^n satisfying f(0, 0) = 0 and the Γ-equivariance condition

\[ f(\gamma x, \lambda) = \gamma f(x, \lambda) \quad \text{for all } x \in \mathbb{R}^n,\ \lambda \in \mathbb{R},\ \gamma \in \Gamma \]

A local bifurcation from the equilibrium x = 0 occurs if (df)_{0,0} is nonhyperbolic. The center subspace E^c is the sum of generalized eigenspaces corresponding to eigenvalues on the imaginary axis, and is Γ-invariant. By center manifold theory, local dynamics ((x, λ) near (0, 0)) are captured by the center manifold W^c. After center manifold reduction (or Lyapunov–Schmidt reduction if the focus is on equilibria), it may be assumed that R^n = E^c.

If (df)_{0,0} possesses zero eigenvalues, then there is a steady-state bifurcation. Generically, (df)_{0,0} = 0 and E^c is absolutely irreducible. There are two subcases.

If Γ acts trivially on R^n, then n = 1 and generically there is a saddle–node (or limit point) bifurcation where the zero sets of f(x, λ) and λ ± x² are diffeomorphic for (x, λ) near (0, 0). Higher-order degeneracies can be treated using singularity theory. The equilibria and their stability determine the local dynamics. All bifurcating equilibria have isotropy Γ, so there is no symmetry breaking.

From now on, consider the remaining subcase where Γ acts absolutely irreducibly and nontrivially on R^n. Then Fix Γ = {0}, f(0, λ) ≡ 0, and (df)_{0,λ} = c(λ)I_n where generically c′(0) ≠ 0. Assume that c′(0) > 0, so the "trivial solution" x = 0 is asymptotically stable subcritically (λ < 0) and unstable supercritically (λ > 0). Bifurcating solutions lie outside Fix Γ and hence there is spontaneous symmetry breaking.

Axial Isotropy Subgroups

The "equivariant branching lemma" guarantees branches of equilibria with isotropy Σ for each axial isotropy subgroup Σ. There are three associated branching patterns, see Figure 1.

If N(Σ)/Σ = Z2, then f is odd. Generically, ∂³_x f(0, 0) ≠ 0, since (x1² + ⋯ + xn²)x is Γ-equivariant, and there are two branches of equilibria bifurcating supercritically or subcritically together, and lying on the same group orbit. The branches form a symmetric pitchfork whose direction of branching is determined by sgn ∂³_x f(0, 0).

If N(Σ)/Σ ≅ 1, then generically f is even. If all quadratic Γ-equivariant maps vanish on Fix Σ, then the bifurcation is sub/supercritical depending on sgn ∂³_x f(0, 0) but the branches lie on distinct group orbits. This is an asymmetric pitchfork.

If ∂²_x f(0, 0) ≠ 0, then the equilibria exist transcritically: for λ < 0 and λ > 0.

The natural actions of D_m on R² are absolutely irreducible. The axial branches are symmetric pitchforks for m ≥ 4 even, asymmetric pitchforks for m ≥ 5 odd, and transcritical for m = 3.

The actions of D_m, m ≥ 5 odd, provide the simplest instances of hidden symmetries, where certain N(Σ)/Σ-equivariant mappings on Fix Σ do not extend to smooth Γ-equivariant mappings on R^n.

Figure 1 Axial branches: (a) supercritical symmetric pitchfork, (b) supercritical asymmetric pitchfork, and (c) transcritical branches.

Nonaxial Maximal Isotropy Subgroups

For Σ a real maximal isotropy subgroup, dim Fix Σ odd, there exist branches of equilibria with isotropy
Σ. When dim Fix Σ is even, there are examples where equilibria exist and examples where no equilibria exist. For Σ complex or quaternionic, there exist branches of rotating waves with isotropy Σ. In the quaternionic case, the rotating waves foliate the SU(2) group orbits according to the Hopf fibration.

Submaximal Isotropy Subgroups

It has been conjectured falsely that steady-state bifurcation leads generically to equilibria only with maximal isotropy. The simplest counterexample is the 24-element group Γ = Z3 ⋉ Z2³ generated by

\[ \rho = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \kappa = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]

(Alternatively, Γ = T ⊕ Z2(−I3), where T ⊂ SO(3) is the tetrahedral group.)

The isotropy subgroup Σ = Z2(κ) has two-dimensional fixed-point subspace Fix Σ = {(x, y, 0)}. The only one-dimensional fixed-point spaces contained in Fix Σ are the x- and y-axes. The general Γ-equivariant vector field is

\[ \dot x = g(x^2, y^2, z^2, \lambda)\,x, \qquad \dot y = g(y^2, z^2, x^2, \lambda)\,y, \qquad \dot z = g(z^2, x^2, y^2, \lambda)\,z \]

After scaling,

\[ g(x^2, y^2, z^2, \lambda) = \lambda - x^2 - a y^2 - b z^2 + o(x^2, y^2, z^2, \lambda) \tag{6} \]

Restricting to Fix Σ and dividing out the axial solutions x = 0 and y = 0 yields at lowest order the equations λ = x² + ay² = y² + bx². Submaximal solutions exist provided sgn(a − 1) = sgn(b − 1).

In general, the existence of equilibria with submaximal isotropy must be treated on a case-by-case basis (for each absolutely irreducible representation of Γ and isotropy subgroup Σ).

Asymptotic Stability

Subcritical and axial transcritical branches are automatically unstable. Moreover, the existence of a quadratic Γ-equivariant mapping q : R^n → R^n and x ∈ Fix Σ such that (dq)_x has eigenvalues with nonzero real part guarantees that branches of equilibria with axial isotropy Σ are generically unstable (even when q|Fix Σ ≡ 0).

There are no general results for asymptotic stability, and calculations must be done on a case-by-case basis. (The remarks in the subsection "Equilibria" are useful here.)

Branching Patterns and Finite Determinacy

The following notion of finite determinacy is based on equivariant transversality theory. Assume Γ acts absolutely irreducibly. Consider the set F of Γ-equivariant vector fields f : R^n × R → R^n satisfying (df)_{0,0} = 0. For an open dense subset of F, branches of relative equilibria near (0, 0) are normally hyperbolic. The collection of branches of relative equilibria, together with their isotropy type, direction of branching, and stability properties, is called a branching pattern. These persist under small perturbations and are finitely determined: there exist q = q(Γ) ≥ 2 and an open dense subset U(q) ⊂ F such that the branching patterns of f and f + g are identical for f ∈ U(q), g ∈ F, provided g(x, λ) = o(‖x‖^q).

Furthermore, branching patterns are strongly finitely determined: there exist d ≥ 2 and an open dense subset S(d) ⊂ F such that the branching patterns of f and f + g are identical for f ∈ S(d) and all (not necessarily equivariant) g satisfying g(x, λ) = o(‖x‖^d).

For example, consider the hyperoctahedral group S_n ⋉ Z2^n, n ≥ 1. Here S_n acts by permutations of the coordinates (x1, ..., xn) and Z2^n consists of diagonal matrices with entries ±1. Let Γ = T ⋉ Z2^n, where T ⊂ S_n is a transitive subgroup. Then Γ acts absolutely irreducibly on R^n and is strongly 3-determined. Submaximal branches of equilibria exist except when T = S_n, T = A_n and, if n = 6, T = PGL2(F5).

Dynamics

Absolutely irreducible representations have arbitrarily high dimension, so steady-state bifurcation leads to rich dynamics. The group Γ = Z3 ⋉ Z2³ with sgn(a − 1) ≠ sgn(b − 1) and a + b > 2 in [6] yields asymptotically stable heteroclinic cycles with planar connections connecting equilibria in the x-, y- and z-axes (see Figure 2). In R⁴, there is the possibility of instant chaos where chaotic dynamics bifurcates directly from the equilibrium 0.

Figure 2 Robust heteroclinic cycle for the group Γ = Z3 ⋉ Z2³.
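The lowest-order truncation of [6] is easy to explore numerically. The following is a minimal sketch in Python with NumPy, not taken from the article: the function names (`g`, `vector_field`, `is_equivariant`, `submaximal_equilibrium`, `axial_eigenvalues`) and the sample values of a, b and λ are illustrative assumptions, and the o(·) tail terms in [6] are dropped. It checks equivariance under the two generating matrices, solves λ = x² + ay² = y² + bx² for equilibria in Fix Σ = {(x, y, 0)} away from the axes, and lists the eigenvalues of the linearization at the x-axis equilibrium (√λ, 0, 0) for λ > 0.

```python
import numpy as np

# Generators of the 24-element group acting on R^3:
# RHO cyclically permutes coordinates, KAPPA flips the sign of z.
RHO = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
KAPPA = np.diag([1.0, 1.0, -1.0])


def g(u, v, w, lam, a, b):
    """Truncation of [6] (tail terms dropped); u, v, w are squared coordinates."""
    return lam - u - a * v - b * w


def vector_field(p, lam, a, b):
    """Truncated equivariant vector field evaluated at p = (x, y, z)."""
    x, y, z = p
    return np.array([
        g(x * x, y * y, z * z, lam, a, b) * x,
        g(y * y, z * z, x * x, lam, a, b) * y,
        g(z * z, x * x, y * y, lam, a, b) * z,
    ])


def is_equivariant(gamma, lam, a, b, trials=20, seed=0):
    """Test f(gamma p) = gamma f(p) at a few random points p."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        p = rng.normal(size=3)
        lhs = vector_field(gamma @ p, lam, a, b)
        rhs = gamma @ vector_field(p, lam, a, b)
        if not np.allclose(lhs, rhs):
            return False
    return True


def submaximal_equilibrium(lam, a, b):
    """Solve lam = x^2 + a*y^2 = y^2 + b*x^2 with x, y nonzero.

    Returns (x, y, 0) with x, y > 0, or None if no real solution exists.
    """
    det = 1.0 - a * b
    if det == 0.0:
        return None
    x2 = lam * (1.0 - a) / det
    y2 = lam * (1.0 - b) / det
    if x2 <= 0.0 or y2 <= 0.0:
        return None
    return np.array([np.sqrt(x2), np.sqrt(y2), 0.0])


def axial_eigenvalues(lam, a, b):
    """Eigenvalues of the linearization at (sqrt(lam), 0, 0), lam > 0.

    The Jacobian is diagonal there: the radial (x) direction gives -2*lam,
    the y direction lam*(1 - b), and the z direction lam*(1 - a).
    """
    return {"x": -2.0 * lam, "y": lam * (1.0 - b), "z": lam * (1.0 - a)}


if __name__ == "__main__":
    print(is_equivariant(RHO, 1.0, 0.5, 2.5), is_equivariant(KAPPA, 1.0, 0.5, 2.5))
    print(submaximal_equilibrium(1.0, 0.5, 0.5))
    print(submaximal_equilibrium(1.0, 0.5, 2.5))
    print(axial_eigenvalues(1.0, 0.5, 2.5))
```

With a = b = 0.5 and λ = 1 the sketch finds a submaximal equilibrium, whereas for a = 0.5, b = 2.5 none exists and the x-axis equilibrium is a saddle with a single expanding direction. In the case a < 1 < b the ratio of contraction to expansion at that saddle is (b − 1)/(1 − a), which exceeds 1 exactly when a + b > 2, consistent with the stability condition quoted in the subsection "Dynamics".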
2k such that τρτ^{−1} = σρσ^{−1}, for ρ ∈ Σ. Codimension-1 bifurcations from P are in one-to-one correspondence (modulo tail terms) with bifurcations from fully symmetric equilibria for a (Σ ⋊ Z_{2k})-equivariant vector field. In particular, period-preserving and period-doubling bifurcations from P reduce to steady-state bifurcations, and Naimark–Sacker bifurcations reduce to Hopf bifurcations. This framework incorporates issues such as suppression of period doubling. Similar results hold for higher-codimension bifurcations.

The skew products [4] and [5] are valid for proper actions of certain noncompact Lie groups Γ provided the spatial symmetries are compact, leading to explanations of spiral and scroll wave phenomena in excitable media.

When the spatial symmetry group is noncompact, E^c may be infinite-dimensional and center manifold reduction may break down due to continuous-spectrum issues. For Euclidean symmetry, there is a theory of modulation or Ginzburg–Landau equations.

See also: Bifurcation Theory; Bifurcations in Fluid Dynamics; Bifurcations of Periodic Orbits; Central Manifolds, Normal Forms; Chaos and Attractors; Electroweak Theory; Finite Group Symmetry Breaking; Hyperbolic Dynamical Systems; Quantum Spin Systems; Quasiperiodic Systems; Singularity and Bifurcation Theory.

Further Reading

Chossat P and Lauterbach R (2000) Methods in Equivariant Bifurcations and Dynamical Systems. Advanced Series in Nonlinear Dynamics, vol. 15. Singapore: World Scientific.
Crawford JD and Knobloch E (1991) Symmetry and symmetry-breaking in fluid dynamics. In: Lumley JL, Van Dyke M, and Reed HL (eds.) Annual Review of Fluid Mechanics, vol. 23, pp. 341–387. Palo Alto, CA: Annual Reviews.
Fiedler B and Scheel A (2003) Spatio-temporal dynamics of reaction-diffusion patterns. In: Kirkilionis M, Krömker S, Rannacher R, and Tomi F (eds.) Trends in Nonlinear Analysis, pp. 23–152. Berlin: Springer.
Field M (1996a) Lectures on Bifurcations, Dynamics and Symmetry. Pitman Research Notes in Mathematics Series, vol. 356. Harlow: Addison Wesley Longman.
Field M (1996b) Symmetry Breaking for Compact Lie Groups. Memoirs of the American Mathematical Society, vol. 574. Providence, RI: American Mathematical Society.
Golubitsky M and Stewart IN (2002) The Symmetry Perspective. Progress in Mathematics, vol. 200. Basel: Birkhäuser.
Golubitsky M, Stewart IN, and Schaeffer D (1988) Singularities and Groups in Bifurcation Theory, Vol. II, Applied Mathematical Sciences, vol. 69. New York: Springer.
Lamb JSW and Melbourne I (1999) Bifurcation from periodic solutions with spatiotemporal symmetry. In: Golubitsky M, Luss D, and Strogatz SH (eds.) Pattern Formation in Continuous and Coupled Systems, IMA Volumes in Mathematics and its Applications, vol. 115, pp. 175–191. New York: Springer.
Lamb JSW, Melbourne I, and Wulff C (2003) Bifurcation from periodic solutions with spatiotemporal symmetry, including resonances and mode interactions. Journal of Differential Equations 191: 377–407.
Melbourne I (2000) Ginzburg–Landau theory and symmetry. In: Debnath L and Riahi DN (eds.) Nonlinear Instability, Chaos and Turbulence, Vol 2, Advances in Fluid Mechanics, vol. 25, pp. 79–109. Southampton: WIT Press.
Michel L (1980) Symmetry defects and broken symmetry. Configurations. Hidden symmetry. Reviews of Modern Physics 52: 617–651.
Sandstede B, Scheel A, and Wulff C (1999) Dynamical behavior of patterns with Euclidean symmetry. In: Golubitsky M, Luss D, and Strogatz SH (eds.) Pattern Formation in Continuous and Coupled Systems, IMA Volumes in Mathematics and its Applications, vol. 115, pp. 249–264. New York: Springer.
flow-invariant manifolds appear as the level sets of a momentum map induced by the symmetry of the system.

Symmetry Reduction

The Symmetries of a System

The standard mathematical fashion to describe the symmetries of a dynamical system (see Dynamical Systems in Mathematical Physics: An Illustration from Water Waves) X ∈ X(M) defined on a manifold M (X(M) denotes the Lie algebra of smooth vector fields on M endowed with the Jacobi–Lie bracket [·, ·]) consists in studying its invariance properties with respect to a smooth Lie group Φ : G × M → M (continuous symmetries) or Lie algebra φ : 𝔤 → X(M) (infinitesimal symmetry) action. Recall that Φ is a (left) action if the map g ∈ G ↦ Φ(g, ·) ∈ Diff(M) is a group homomorphism, where Diff(M) denotes the group of smooth diffeomorphisms of the manifold M. The map φ is a (left) Lie algebra action if the map ξ ∈ 𝔤 ↦ φ(ξ) ∈ X(M) is a Lie algebra antihomomorphism and the map (m, ξ) ∈ M × 𝔤 ↦ φ(ξ)(m) ∈ TM is smooth. The vector field X is said to be G-symmetric whenever it is equivariant with respect to the G-action Φ, that is, X ∘ Φ_g = TΦ_g ∘ X, for any g ∈ G. The space of G-symmetric vector fields on M is denoted by X(M)^G. The flow F_t of a G-symmetric vector field X ∈ X(M)^G is G-equivariant, that is, F_t ∘ Φ_g = Φ_g ∘ F_t, for any g ∈ G. The vector field X is said to be 𝔤-symmetric if [φ(ξ), X] = 0, for any ξ ∈ 𝔤.

If 𝔤 is the Lie algebra of the Lie group G (see Lie Groups: General Theory) then the infinitesimal generators ξ_M ∈ X(M) of a smooth G-group action defined by

\[ \xi_M(m) := \left.\frac{d}{dt}\right|_{t=0} \Phi(\exp t\xi, m), \qquad \xi \in \mathfrak{g},\ m \in M \]

constitute a smooth Lie algebra 𝔤-action and we denote in this case φ(ξ) = ξ_M.

If m ∈ M, the closed Lie subgroup G_m := {g ∈ G | Φ(g, m) = m} is called the isotropy or symmetry subgroup of m. Similarly, the Lie subalgebra 𝔤_m := {ξ ∈ 𝔤 | φ(ξ)(m) = 0} is called the isotropy or symmetry subalgebra of m. If 𝔤 is the Lie algebra of G and the Lie algebra action is given by the infinitesimal generators, then 𝔤_m is the Lie algebra of G_m. The action is called free if G_m = {e} for every m ∈ M and locally free if 𝔤_m = {0} for every m ∈ M. We will write interchangeably Φ(g, m) = Φ_g(m) = Φ^m(g) = g · m, for m ∈ M and g ∈ G.

In this article we will focus mainly on continuous symmetries induced by proper Lie group actions. The action Φ is called proper whenever for any two convergent sequences {m_n}_{n∈N} and {g_n · m_n := Φ(g_n, m_n)}_{n∈N} in M, there exists a convergent subsequence {g_{n_k}}_{k∈N} in G. Compact group actions are obviously proper.

Symmetry Reduction of Vector Fields

Let M be a smooth manifold and G a Lie group acting properly on M. Let X ∈ X(M)^G and F_t be its (necessarily equivariant) flow. For any isotropy subgroup H of the G-action on M, the H-isotropy type submanifold M_H := {m ∈ M | G_m = H} is preserved by the flow F_t. This property is known as the law of conservation of isotropy. The properness of the action guarantees that G_m is compact and that the (connected components of) M_H are embedded submanifolds of M for any closed subgroup H of G. The manifolds M_H are, in general, not closed in M. Moreover, the quotient group N(H)/H (where N(H) denotes the normalizer of H in G) acts freely and properly on M_H. Hence, if π_H : M_H → M_H/(N(H)/H) denotes the projection onto orbit space and i_H : M_H ↪ M is the injection, the vector field X induces a unique vector field X^H on the quotient M_H/(N(H)/H) defined by X^H ∘ π_H = Tπ_H ∘ X ∘ i_H, whose flow F_t^H is given by F_t^H ∘ π_H = π_H ∘ F_t ∘ i_H. We will refer to X^H ∈ X(M_H/(N(H)/H)) as the H-isotropy type reduced vector field induced by X.

This reduction technique has been widely exploited in handling specific dynamical systems. When the symmetry group G is compact and we are dealing with a linear action, the construction of the quotient M_H/(N(H)/H) can be implemented in a very explicit and convenient manner by using the invariant polynomials of the action and the theorems of Hilbert and Schwarz–Mather.

Symplectic Reduction

Symplectic or Marsden–Weinstein reduction is a procedure that implements symmetry reduction for the symmetric Hamiltonian systems defined on a symplectic manifold (M, ω). The particular case in which the symplectic manifold is a cotangent bundle is dealt with separately (see Cotangent Bundle Reduction). We recall that the Hamiltonian vector field X_h ∈ X(M) associated to the Hamiltonian function h ∈ C^∞(M) is uniquely determined by the equality ω(X_h, ·) = dh. In this context, the symmetries Φ : G × M → M of interest are given by symplectic or canonical transformations, that is, Φ_g^* ω = ω, for any g ∈ G. For canonical actions each G-invariant function h ∈ C^∞(M)^G has an associated G-symmetric Hamiltonian vector field X_h. A Lie
192 Symmetry and Symplectic Reduction
algebra action $\varphi$ is called symplectic or canonical if $\pounds_{\varphi(\xi)} \omega = 0$ for all $\xi \in \mathfrak{g}$, where $\pounds$ denotes the Lie derivative operator. If the Lie algebra action is induced from a canonical Lie group action by taking its infinitesimal generators, then it is also canonical.

Momentum Maps

The symmetry reduction described in the previous section for general vector fields does not produce a well-adapted answer for symplectic manifolds $(M, \omega)$ in the sense that the reduced spaces $M_H/(N(H)/H)$ are, in general, not symplectic. To solve this problem one has to use the conservation laws associated to the canonical action, which often appear as momentum maps.

Let $G$ be a Lie group acting canonically on the symplectic manifold $(M, \omega)$. Suppose that for any $\xi \in \mathfrak{g}$, the vector field $\xi_M$ is Hamiltonian, with Hamiltonian function $J^\xi \in C^\infty(M)$ and that $\xi \in \mathfrak{g} \mapsto J^\xi \in C^\infty(M)$ is linear. The map $J : M \to \mathfrak{g}^*$ defined by the relation $\langle J(z), \xi \rangle = J^\xi(z)$, for all $\xi \in \mathfrak{g}$ and $z \in M$, is called a momentum map of the G-action (see Hamiltonian Group Actions). Momentum maps, if they exist, are determined up to a constant in $\mathfrak{g}^*$ for any connected component of $M$.

Examples 1

(i) (Linear momentum) The phase space of an N-particle system is the cotangent space $T^*\mathbb{R}^{3N}$ endowed with its canonical symplectic structure. The additive group $\mathbb{R}^3$, whose Lie algebra is abelian and is also equal to $\mathbb{R}^3$, acts canonically on it by spatial translation on each factor: $v \cdot (q_i, p_i) = (q_i + v, p_i)$, with $i = 1, \ldots, N$. This action has an associated momentum map $J : T^*\mathbb{R}^{3N} \to \mathbb{R}^3$, where we identified the dual of $\mathbb{R}^3$ with itself using the Euclidean inner product, which coincides with the classical linear momentum $J(q_i, p_i) = \sum_{i=1}^{N} p_i$.

(ii) (Angular momentum) Let $SO(3)$ act on $\mathbb{R}^3$ and then, by lift, on $T^*\mathbb{R}^3$, that is, $A \cdot (q, p) = (Aq, Ap)$. This action is canonical and has as associated momentum map $J : T^*\mathbb{R}^3 \to \mathfrak{so}(3)^* \cong \mathbb{R}^3$, the classical angular momentum $J(q, p) = q \times p$.

(iii) (Lifted actions on cotangent bundles) The previous two examples are particular cases of the following situation. Let $\Phi : G \times Q \to Q$ be a smooth Lie group action. The (left) cotangent lifted action of $G$ on $T^*Q$ is given by $g \cdot \alpha_q := T^*_{g \cdot q}\Phi_{g^{-1}}(\alpha_q)$ for $g \in G$ and $\alpha_q \in T^*Q$. Cotangent lifted actions preserve the canonical 1-form on $T^*Q$ and hence are canonical. They admit an associated momentum map $J : T^*Q \to \mathfrak{g}^*$ given by $\langle J(\alpha_q), \xi \rangle = \alpha_q(\xi_Q(q))$, for any $\alpha_q \in T^*Q$ and any $\xi \in \mathfrak{g}$.

(iv) (Symplectic linear actions) Let $(V, \omega)$ be a symplectic linear space and let $G$ be a subgroup of the linear symplectic group, acting naturally on $V$. By the choice of $G$ this action is canonical and has a momentum map given by $\langle J(v), \xi \rangle = \tfrac{1}{2}\omega(\xi_V(v), v)$, for $\xi \in \mathfrak{g}$ and $v \in V$ arbitrary.

Properties of the Momentum Map

The main feature of the momentum map that makes it of interest for use in reduction is that it encodes conservation laws for G-symmetric Hamiltonian systems. Noether's theorem states that the momentum map is a constant of the motion for the Hamiltonian vector field $X_h$ associated to any G-invariant function $h \in C^\infty(M)^G$ (see Symmetries and Conservation Laws).

The derivative $TJ$ of the momentum map satisfies the following two properties: $\operatorname{range}(T_m J) = (\mathfrak{g}_m)^\circ$ and $\ker T_m J = (\mathfrak{g} \cdot m)^\omega$, for any $m \in M$, where $(\mathfrak{g}_m)^\circ$ denotes the annihilator in $\mathfrak{g}^*$ of the isotropy subalgebra $\mathfrak{g}_m$ of $m$, $\mathfrak{g} \cdot m := T_m(G \cdot m) = \{\xi_M(m) \mid \xi \in \mathfrak{g}\}$ is the tangent space at $m$ to the G-orbit that contains this point, and $(\mathfrak{g} \cdot m)^\omega$ is the symplectic orthogonal space to $\mathfrak{g} \cdot m$ in the symplectic vector space $(T_m M, \omega(m))$. The first relation is sometimes called the bifurcation lemma since it establishes a link between the symmetry of a point and the rank of the momentum map at that point.

The existence of the momentum map for a given canonical action is not guaranteed. A momentum map exists if and only if the linear map $\rho : [\xi] \in \mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] \mapsto [\omega(\xi_M, \cdot)] \in H^1(M, \mathbb{R})$ is identically zero. Thus, if $H^1(M, \mathbb{R}) = 0$ or $\mathfrak{g}/[\mathfrak{g}, \mathfrak{g}] = H^1(\mathfrak{g}, \mathbb{R}) = 0$ then $\rho \equiv 0$. In particular, if $\mathfrak{g}$ is semisimple, the "first Whitehead lemma" states that $H^1(\mathfrak{g}, \mathbb{R}) = 0$ and therefore a momentum map always exists for canonical semisimple Lie algebra actions.

A natural question to ask is when the map $(\mathfrak{g}, [\cdot, \cdot]) \to (C^\infty(M), \{\cdot, \cdot\})$ defined by $\xi \mapsto J^\xi$, $\xi \in \mathfrak{g}$, is a Lie algebra homomorphism, that is, $J^{[\xi, \eta]} = \{J^\xi, J^\eta\}$, $\xi, \eta \in \mathfrak{g}$. Here $\{\cdot, \cdot\} : C^\infty(M) \times C^\infty(M) \to C^\infty(M)$ denotes the Poisson bracket associated to the symplectic form $\omega$ of $M$ defined by $\{f, h\} := \omega(X_f, X_h)$, $f, h \in C^\infty(M)$. This is the case if and only if $T_z J(\xi_M(z)) = -\operatorname{ad}^*_\xi J(z)$, for any $\xi \in \mathfrak{g}$, $z \in M$, where $\operatorname{ad}^*$ is the dual of the adjoint representation $\operatorname{ad} : (\xi, \eta) \in \mathfrak{g} \times \mathfrak{g} \mapsto [\xi, \eta] \in \mathfrak{g}$ of $\mathfrak{g}$ on itself. A momentum map that satisfies this relation is called infinitesimally equivariant. The reason behind this terminology is that this is the infinitesimal version of global or coadjoint equivariance: $J$ is G-equivariant if $\operatorname{Ad}^*_{g^{-1}} \circ J = J \circ \Phi_g$ or, equivalently,
$J^{\operatorname{Ad}_g \xi}(g \cdot z) = J^\xi(z)$, for all $g \in G$, $\xi \in \mathfrak{g}$, and $z \in M$; $\operatorname{Ad}^*$ denotes the dual of the adjoint representation $\operatorname{Ad}$ of $G$ on $\mathfrak{g}$. Actions admitting infinitesimally equivariant momentum maps are called Hamiltonian actions and Lie group actions with coadjoint equivariant momentum maps are called globally Hamiltonian actions. If the symmetry group $G$ is connected then global and infinitesimal equivariance of the momentum map are equivalent concepts. If $\mathfrak{g}$ acts canonically on $(M, \omega)$ and $H^1(\mathfrak{g}, \mathbb{R}) = \{0\}$ then this action admits at most one infinitesimally equivariant momentum map.

Since momentum maps are not uniquely defined, one may ask whether one can choose them to be equivariant. It turns out that if the momentum map is associated to the action of a compact Lie group, this can always be done. Momentum maps of cotangent lifted actions are also equivariant as are momentum maps defined by symplectic linear actions. Canonical actions of semisimple Lie algebras on symplectic manifolds admit infinitesimally equivariant momentum maps, since the "second Whitehead lemma" states that $H^2(\mathfrak{g}, \mathbb{R}) = 0$ if $\mathfrak{g}$ is semisimple. We shall identify below a specific element of $H^2(\mathfrak{g}, \mathbb{R})$ which is the obstruction to the equivariance of a momentum map (assuming it exists).

Even though, in general, it is not possible to choose a coadjoint equivariant momentum map, it turns out that when the symplectic manifold is connected there is an affine action on the dual of the Lie algebra with respect to which the momentum map is equivariant. Define the nonequivariance 1-cocycle associated to $J$ as the map $\sigma : G \to \mathfrak{g}^*$ given by $g \mapsto J(\Phi_g(z)) - \operatorname{Ad}^*_{g^{-1}}(J(z))$. The connectivity of $M$ implies that the right-hand side of this equality is independent of the point $z \in M$. In addition, $\sigma$ is a (left) $\mathfrak{g}^*$-valued 1-cocycle on $G$ with respect to the coadjoint representation of $G$ on $\mathfrak{g}^*$, that is, $\sigma(gh) = \sigma(g) + \operatorname{Ad}^*_{g^{-1}} \sigma(h)$ for all $g, h \in G$. Relative to the affine action $\Theta : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ given by $(g, \mu) \mapsto \operatorname{Ad}^*_{g^{-1}} \mu + \sigma(g)$, the momentum map $J$ is equivariant. The "reduction lemma," the main technical ingredient in the proof of the reduction theorem, states that for any $m \in M$ we have
\[ \mathfrak{g}_{J(m)} \cdot m = \mathfrak{g} \cdot m \cap \ker T_m J = \mathfrak{g} \cdot m \cap (\mathfrak{g} \cdot m)^\omega \]
where $\mathfrak{g}_{J(m)}$ is the Lie algebra of the isotropy group $G_{J(m)}$ of $J(m) \in \mathfrak{g}^*$ with respect to the affine action of $G$ on $\mathfrak{g}^*$ induced by the nonequivariance 1-cocycle of $J$.

The Symplectic Reduction Theorem

The symplectic reduction procedure that we now present consists of constructing a new symplectic manifold out of a given symmetric one in which the conservation laws encoded in the form of a momentum map and the degeneracies associated to the symmetry have been eliminated. This strategy allows the reduction of a symmetric Hamiltonian dynamical system to a dimensionally smaller one. This reduction procedure preserves the symplectic category, that is, if we start with a Hamiltonian system on a symplectic manifold, the reduced system is also a Hamiltonian system on a symplectic manifold. The reduced symplectic manifold is usually referred to as the symplectic or Marsden–Weinstein reduced space.

Theorem 2 Let $\Phi : G \times M \to M$ be a free proper canonical action of the Lie group $G$ on the connected symplectic manifold $(M, \omega)$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$, with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of $J$ and denote by $G_\mu$ the isotropy of $\mu$ under the affine action of $G$ on $\mathfrak{g}^*$. Then:

(i) The space $M_\mu := J^{-1}(\mu)/G_\mu$ is a regular quotient manifold and, moreover, it is a symplectic manifold with symplectic form $\omega_\mu$ uniquely characterized by the relation
\[ \pi_\mu^* \omega_\mu = i_\mu^* \omega \]
The maps $i_\mu : J^{-1}(\mu) \hookrightarrow M$ and $\pi_\mu : J^{-1}(\mu) \to J^{-1}(\mu)/G_\mu$ denote the inclusion and the projection, respectively. The pair $(M_\mu, \omega_\mu)$ is called the symplectic point reduced space.

(ii) Let $h \in C^\infty(M)^G$ be a G-invariant Hamiltonian. The flow $F_t$ of the Hamiltonian vector field $X_h$ leaves the connected components of $J^{-1}(\mu)$ invariant and commutes with the G-action, so it induces a flow $F_t^\mu$ on $M_\mu$ defined by $\pi_\mu \circ F_t \circ i_\mu = F_t^\mu \circ \pi_\mu$.

(iii) The vector field generated by the flow $F_t^\mu$ on $(M_\mu, \omega_\mu)$ is Hamiltonian with associated reduced Hamiltonian function $h_\mu \in C^\infty(M_\mu)$ defined by $h_\mu \circ \pi_\mu = h \circ i_\mu$. The vector fields $X_h$ and $X_{h_\mu}$ are $\pi_\mu$-related. The triple $(M_\mu, \omega_\mu, h_\mu)$ is called the reduced Hamiltonian system.

(iv) Let $k \in C^\infty(M)^G$ be another G-invariant function. Then $\{h, k\}$ is also G-invariant and $\{h, k\}_\mu = \{h_\mu, k_\mu\}_{M_\mu}$, where $\{\cdot, \cdot\}_{M_\mu}$ denotes the Poisson bracket associated to the symplectic form $\omega_\mu$ on $M_\mu$.

Reconstruction of Dynamics

We pose now the question converse to the reduction of a Hamiltonian system. Assume that an integral curve $c_\mu(t)$ of the reduced Hamiltonian system $X_{h_\mu}$
i ð;
Þ ½1
$\mathcal{O}_\mu^+ \to M$ and $\pi_2 : M \times \mathcal{O}_\mu^+ \to \mathcal{O}_\mu^+$ are the projections.
A momentum map for this action is given by $J \circ \pi_1 - \pi_2 : M \times \mathcal{O}_\mu^+ \to \mathfrak{g}^*$. Let $(M \ominus \mathcal{O}_\mu^+)_0 := ((J \circ \pi_1 - \pi_2)^{-1}(0)/G, (\omega \ominus \omega^+_{\mathcal{O}_\mu})_0)$ be the symplectic point reduced space at zero.

Theorem 4 (Shifting theorem). Under the hypotheses of the symplectic orbit reduction theorem (Theorem 3), the symplectic orbit reduced space $M_{\mathcal{O}_\mu}$, the point reduced spaces $M_\mu$, and $(M \ominus \mathcal{O}_\mu^+)_0$ are symplectically diffeomorphic.

Singular Reduction

In the previous section we carried out symplectic reduction for free and proper actions. The freeness guarantees via the bifurcation lemma that the momentum map $J$ is a submersion and hence the level sets $J^{-1}(\mu)$ are smooth manifolds. Freeness and properness ensure that the orbit spaces $M_\mu := J^{-1}(\mu)/G_\mu$ are regular quotient manifolds. The theory of singular reduction studies the properties of the orbit space $M_\mu$ when the hypothesis on the freeness of the action is dropped. The main result in this situation shows that these quotients are symplectic Whitney stratified spaces, in the sense that the strata are symplectic manifolds in a very natural way; moreover, the local properties of this Whitney stratification make it into what is called a cone space. This statement is referred to as the "symplectic stratification theorem" and adapts to the symplectic symmetric context the stratification theorem of the orbit space of a proper Lie group action by using its orbit type manifolds. In order to present this result, we review the necessary definitions and results on stratified spaces (see Singularity and Bifurcation Theory for more information on singularity theory).

Stratified Spaces

Let $\mathcal{Z}$ be a locally finite partition of the topological space $P$ into smooth manifolds $S_i \subset P$, $i \in I$. We assume that the manifolds $S_i \subset P$, $i \in I$, with their manifold topology are locally closed topological subspaces of $P$. The pair $(P, \mathcal{Z})$ is a decomposition of $P$ with pieces in $\mathcal{Z}$ when the following condition is satisfied:

Condition (DS) If $R, S \in \mathcal{Z}$ are such that $R \cap \bar{S} \neq \emptyset$, then $R \subset \bar{S}$. In this case we write $R \preceq S$. If, in addition, $R \neq S$ we say that $R$ is incident to $S$ or that it is a boundary piece of $S$ and write $R \prec S$.

The above condition is called the frontier condition and the pair $(P, \mathcal{Z})$ is called a decomposed space. The dimension of $P$ is defined as $\dim P = \sup\{\dim S_i \mid S_i \in \mathcal{Z}\}$. If $k \in \mathbb{N}$, the k-skeleton $P^k$ of $P$ is the union of all the pieces of dimension smaller than or equal to $k$; its topology is the relative topology induced by $P$. The depth $\operatorname{dp}(z)$ of any $z \in (P, \mathcal{Z})$ is defined as
\[ \operatorname{dp}(z) := \sup\{k \in \mathbb{N} \mid \exists\, S_0, S_1, \ldots, S_k \in \mathcal{Z} \text{ with } z \in S_0 \prec S_1 \prec \cdots \prec S_k\} \]
Since for any two elements $x, y \in S$ in the same piece $S \in \mathcal{Z}$ we have $\operatorname{dp}(x) = \operatorname{dp}(y)$, the depth $\operatorname{dp}(S)$ of the piece $S$ is well defined by $\operatorname{dp}(S) := \operatorname{dp}(x)$, $x \in S$. Finally, the depth $\operatorname{dp}(P)$ of $(P, \mathcal{Z})$ is defined by $\operatorname{dp}(P) := \sup\{\operatorname{dp}(S) \mid S \in \mathcal{Z}\}$.

A continuous mapping $f : P \to Q$ between the decomposed spaces $(P, \mathcal{Z})$ and $(Q, \mathcal{Y})$ is a morphism of decomposed spaces if, for every piece $S \in \mathcal{Z}$, there is a piece $T \in \mathcal{Y}$ such that $f(S) \subset T$ and the restriction $f|_S : S \to T$ is smooth. If $(P, \mathcal{Z})$ and $(P, \mathcal{T})$ are two decompositions of the same topological space we say that $\mathcal{Z}$ is coarser than $\mathcal{T}$ or that $\mathcal{T}$ is finer than $\mathcal{Z}$ if the identity mapping $(P, \mathcal{T}) \to (P, \mathcal{Z})$ is a morphism of decomposed spaces. A topological subspace $Q \subset P$ is a decomposed subspace of $(P, \mathcal{Z})$ if, for all pieces $S \in \mathcal{Z}$, the intersection $S \cap Q$ is a submanifold of $S$ and the corresponding partition $\mathcal{Z} \cap Q$ forms a decomposition of $Q$.

Let $P$ be a topological space and $z \in P$. Two subsets $A$ and $B$ of $P$ are said to be equivalent at $z$ if there is an open neighborhood $U$ of $z$ such that $A \cap U = B \cap U$. This relation constitutes an equivalence relation on the power set of $P$. The class of all sets equivalent to a given subset $A$ at $z$ will be denoted by $[A]_z$ and called the set germ of $A$ at $z$. If $A \subset B \subset P$, we say that $[A]_z$ is a subgerm of $[B]_z$, and denote $[A]_z \subset [B]_z$.

A stratification of the topological space $P$ is a map $\mathcal{S}$ that associates to any $z \in P$ the set germ $\mathcal{S}(z)$ of a closed subset of $P$ such that the following condition is satisfied:

Condition (ST) For every $z \in P$ there is a neighborhood $U$ of $z$ and a decomposition $\mathcal{Z}$ of $U$ such that for all $y \in U$ the germ $\mathcal{S}(y)$ coincides with the set germ of the piece of $\mathcal{Z}$ that contains $y$.

The pair $(P, \mathcal{S})$ is called a stratified space. Any decomposition of $P$ defines a stratification of $P$ by associating to each of its points the set germ of the piece in which it is contained. The converse is, by definition, locally true.

The Strata

Two decompositions $\mathcal{Z}_1$ and $\mathcal{Z}_2$ of $P$ are said to be equivalent if they induce the same stratification of $P$. If $\mathcal{Z}_1$ and $\mathcal{Z}_2$ are equivalent decompositions of $P$ then, for all $z \in P$, we have that $\operatorname{dp}_{\mathcal{Z}_1}(z) = \operatorname{dp}_{\mathcal{Z}_2}(z)$. Any stratified space $(P, \mathcal{S})$ has a unique decomposition $\mathcal{Z}_{\mathcal{S}}$ associated with the following maximality property: for any open subset $U \subset P$ and any
decomposition $\mathcal{Z}$ of $P$ inducing $\mathcal{S}$ over $U$, the restriction of $\mathcal{Z}_{\mathcal{S}}$ to $U$ is coarser than the restriction of $\mathcal{Z}$ to $U$. The decomposition $\mathcal{Z}_{\mathcal{S}}$ is called the canonical decomposition associated to the stratification $(P, \mathcal{S})$. It is often denoted by $\mathcal{S}$ and its pieces are called the strata of $P$. The local finiteness of the decomposition $\mathcal{Z}_{\mathcal{S}}$ implies that for any stratum $S$ of $(P, \mathcal{S})$ there are only finitely many strata $R$ with $S \prec R$. Henceforth, the symbol $\mathcal{S}$ in the stratification $(P, \mathcal{S})$ will denote both the map that associates to each point a set germ and the set of pieces associated to the canonical decomposition induced by the stratification of $P$.

Stratified Spaces with Smooth Structure

Let $(P, \mathcal{S})$ be a stratified space. A singular or stratified chart of $P$ is a homeomorphism $\phi : U \to \phi(U) \subset \mathbb{R}^n$ from an open set $U \subset P$ to a subset of $\mathbb{R}^n$ such that for every stratum $S \in \mathcal{S}$ the image $\phi(U \cap S)$ is a submanifold of $\mathbb{R}^n$ and the restriction $\phi|_{U \cap S} : U \cap S \to \phi(U \cap S)$ is a diffeomorphism. Two singular charts $\phi : U \to \phi(U) \subset \mathbb{R}^n$ and $\varphi : V \to \varphi(V) \subset \mathbb{R}^m$ are compatible if for any $z \in U \cap V$ there exist an open neighborhood $W \subset U \cap V$ of $z$, a natural number $N \geq \max\{n, m\}$, open neighborhoods $O, O' \subset \mathbb{R}^N$ of $\phi(U) \times \{0\}$ and $\varphi(V) \times \{0\}$, respectively, and a diffeomorphism $\psi : O \to O'$ such that $i_m \circ \varphi|_W = \psi \circ i_n \circ \phi|_W$, where $i_n$ and $i_m$ denote the natural embeddings of $\mathbb{R}^n$ and $\mathbb{R}^m$ into $\mathbb{R}^N$ by using the first $n$ and $m$ coordinates, respectively. The notion of singular or stratified atlas is the natural generalization for stratifications of the concept of atlas existing for smooth manifolds. Analogously, we can talk of compatible and maximal stratified atlases. If the stratified space $(P, \mathcal{S})$ has a well-defined maximal atlas, then we say that this atlas determines a smooth or differentiable structure on $P$. We will refer to $(P, \mathcal{S})$ as a smooth stratified space.

The Whitney Conditions

Let $M$ be a manifold and $R, S \subset M$ two submanifolds. We say that the pair $(R, S)$ satisfies the Whitney condition (A) at the point $z \in R$ if the following condition is satisfied:

Condition (A) For any sequence of points $\{z_n\}_{n \in \mathbb{N}}$ in $S$ converging to $z \in R$ for which the sequence of tangent spaces $\{T_{z_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of $(\dim S)$-dimensional subspaces of $TM$ to $\tau \subset T_z M$, we have that $T_z R \subset \tau$.

Let $\phi : U \to \mathbb{R}^n$ be a smooth chart of $M$ around the point $z$. The Whitney condition (B) at the point $z \in R$ with respect to the chart $(U, \phi)$ is given by the following statement:

Condition (B) Let $\{x_n\}_{n \in \mathbb{N}} \subset R \cap U$ and $\{y_n\}_{n \in \mathbb{N}} \subset S \cap U$ be two sequences with the same limit
\[ z = \lim_{n \to \infty} x_n = \lim_{n \to \infty} y_n \]
and such that $x_n \neq y_n$, for all $n \in \mathbb{N}$. Suppose that the set of connecting lines $\overline{\phi(x_n)\phi(y_n)} \subset \mathbb{R}^n$ converges in projective space to a line $L$ and that the sequence of tangent spaces $\{T_{y_n} S\}_{n \in \mathbb{N}}$ converges in the Grassmann bundle of $(\dim S)$-dimensional subspaces of $TM$ to $\tau \subset T_z M$. Then, $(T_z \phi)^{-1}(L) \subset \tau$.

If the condition (A) (respectively (B)) is verified for every point $z \in R$, the pair $(R, S)$ is said to satisfy the Whitney condition (A) (respectively (B)). It can be verified that Whitney's condition (B) does not depend on the chart used to formulate it. A stratified space with smooth structure such that, for every pair of strata, Whitney's condition (B) is satisfied is called a Whitney space.

Cone Spaces and Local Triviality

Let $P$ be a topological space. Consider the equivalence relation $\sim$ in the product $P \times [0, \infty)$ given by $(z, a) \sim (z', a')$ if and only if $a = a' = 0$. We define the cone $CP$ on $P$ as the quotient topological space $P \times [0, \infty)/\!\sim$. If $P$ is a smooth manifold then the cone $CP$ is a decomposed space with two pieces, namely, $P \times (0, \infty)$ and the vertex which is the class corresponding to any element of the form $(z, 0)$, $z \in P$, that is, $P \times \{0\}$. Analogously, if $(P, \mathcal{Z})$ is a decomposed (stratified) space then the associated cone $CP$ is also a decomposed (stratified) space whose pieces (strata) are the vertex and the sets of the form $S \times (0, \infty)$, with $S \in \mathcal{Z}$. This implies, in particular, that $\dim CP = \dim P + 1$ and $\operatorname{dp}(CP) = \operatorname{dp}(P) + 1$.

A stratified space $(P, \mathcal{S})$ is said to be locally trivial if for any $z \in P$ there exist a neighborhood $U$ of $z$, a stratified space $(F, \mathcal{S}^F)$, a distinguished point $0 \in F$, and an isomorphism of stratified spaces
\[ \psi : U \to (S \cap U) \times F \]
where $S$ is the stratum that contains $z$ and $\psi$ satisfies $\psi^{-1}(y, 0) = y$, for all $y \in S \cap U$. When $F$ is given by a cone $CL$ over a compact stratified space $L$ then $L$ is called the link of $z$.

An important corollary of "Thom's first isotopy lemma" guarantees that every Whitney stratified space is locally trivial. A converse to this implication needs the introduction of cone spaces. Their definition is given by recursion on the depth of the space.
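The notions of depth and cone used in this recursion can be checked on a finite combinatorial model. The following Python sketch is an illustration only and is not part of the original article: it encodes a decomposed space by the dimensions of its pieces and by the incidence relation $R \prec S$, computes depths as lengths of the longest incidence chains, and builds the pieces of the cone $CP$. The function names `depth` and `cone`, the dictionary encoding, and the example (the closed interval $[0, 1]$ decomposed into its two endpoints and the open interval) are assumptions chosen for the illustration.

```python
from functools import lru_cache


def depth(pieces, below):
    """Return {piece: dp(piece)}, where dp(S) is the largest k such that
    S = S_0 < S_1 < ... < S_k is a chain for the incidence relation.

    pieces: dict mapping a piece name to its dimension.
    below:  set of pairs (R, S) meaning R is a boundary piece of S (R < S).
    """
    above = {s: [t for (r, t) in below if r == s] for s in pieces}

    @lru_cache(maxsize=None)
    def dp(s):
        # Longest chain starting at s; 0 if no piece lies above s.
        return max((1 + dp(t) for t in above[s]), default=0)

    return {s: dp(s) for s in pieces}


def cone(pieces, below):
    """Pieces and incidence relation of the cone CP: the vertex plus
    one piece S x (0, inf) for every piece S of P."""
    name = lambda s: f"{s} x (0,inf)"
    cone_pieces = {"vertex": 0}
    cone_pieces.update({name(s): d + 1 for s, d in pieces.items()})
    # The vertex lies in the closure of every piece S x (0, inf).
    cone_below = {("vertex", name(s)) for s in pieces}
    cone_below |= {(name(r), name(s)) for (r, s) in below}
    return cone_pieces, cone_below


# Hypothetical example: [0, 1] with pieces {0}, {1} and (0, 1).
P = {"{0}": 0, "{1}": 0, "(0,1)": 1}
P_below = {("{0}", "(0,1)"), ("{1}", "(0,1)")}

CP, CP_below = cone(P, P_below)
dim_P, dp_P = max(P.values()), max(depth(P, P_below).values())
dim_CP, dp_CP = max(CP.values()), max(depth(CP, CP_below).values())
print(dim_P, dp_P, dim_CP, dp_CP)
```

In this example the interval has dimension 1 and depth 1, and its cone has dimension 2 and depth 2, in agreement with $\dim CP = \dim P + 1$ and $\operatorname{dp}(CP) = \operatorname{dp}(P) + 1$.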
Definition 5 Let $m \in \mathbb{N} \cup \{\infty, \omega\}$. A cone space of class $C^m$ and depth 0 is the union of countably many $C^m$ manifolds together with the stratification whose strata are the unions of the connected components of equal dimension. A cone space of class $C^m$ and depth $d + 1$, $d \in \mathbb{N}$, is a stratified space $(P, \mathcal{S})$ with a $C^m$ differentiable structure such that for any $z \in P$ there exists a connected neighborhood $U$ of $z$, a compact cone space $L$ of class $C^m$ and depth $d$ called the link, and a stratified isomorphism
\[ \psi : U \to (S \cap U) \times CL \]
where $S$ is the stratum that contains the point $z$, the map $\psi$ satisfies $\psi^{-1}(y, 0) = y$, for all $y \in S \cap U$, and $0$ is the vertex of the cone $CL$.

If $m \neq 0$ then $L$ is required to be embedded into a sphere via a fixed smooth global singular chart $\varphi : L \to S^l$ that determines the smooth structure of $CL$. More specifically, the smooth structure of $CL$ is generated by the global chart $\tau : [z, t] \in CL \mapsto t\varphi(z) \in \mathbb{R}^{l+1}$. The maps $\psi : U \to (S \cap U) \times CL$ and $\varphi : L \to S^l$ are referred to as a cone chart and a link chart, respectively. Moreover, if $m \neq 0$ then $\psi$ and $\psi^{-1}$ are required to be differentiable of class $C^m$ as maps between stratified spaces with a smooth structure.

The Symplectic Stratification Theorem

Let $(M, \omega)$ be a connected symplectic manifold acted canonically and properly upon by a Lie group $G$. Suppose that this action has an associated momentum map $J : M \to \mathfrak{g}^*$ with nonequivariance 1-cocycle $\sigma : G \to \mathfrak{g}^*$. Let $\mu \in \mathfrak{g}^*$ be a value of $J$, $G_\mu$ the isotropy subgroup of $\mu$ with respect to the affine action $\Theta : G \times \mathfrak{g}^* \to \mathfrak{g}^*$ determined by $\sigma$, and let $H \subset G$ be an isotropy subgroup of the G-action on $M$. Let $M_H^z$ be the connected component of the H-isotropy type manifold that contains a given element $z \in M$ such that $J(z) = \mu$ and let $G_\mu M_H^z$ be its $G_\mu$-saturation. Then the following hold:

1. The set $J^{-1}(\mu) \cap G_\mu M_H^z$ is a submanifold of $M$.
2. The set $M_\mu^{(H)} := [J^{-1}(\mu) \cap G_\mu M_H^z]/G_\mu$ has a unique quotient differentiable structure such that the canonical projection $\pi_\mu^{(H)} : J^{-1}(\mu) \cap G_\mu M_H^z \to M_\mu^{(H)}$ is a surjective submersion.
3. There is a unique symplectic structure $\omega_\mu^{(H)}$ on $M_\mu^{(H)}$ characterized by
\[ i_\mu^{(H)*} \omega = \pi_\mu^{(H)*} \omega_\mu^{(H)} \]
where $i_\mu^{(H)} : J^{-1}(\mu) \cap G_\mu M_H^z \hookrightarrow M$ is the natural inclusion. The pairs $(M_\mu^{(H)}, \omega_\mu^{(H)})$ will be called singular symplectic point strata.
4. Let $h \in C^\infty(M)^G$ be a G-invariant Hamiltonian. Then the flow $F_t$ of $X_h$ leaves the connected components of $J^{-1}(\mu) \cap G_\mu M_H^z$ invariant and commutes with the $G_\mu$-action, so it induces a flow $F_t^\mu$ on $M_\mu^{(H)}$ that is characterized by $\pi_\mu^{(H)} \circ F_t \circ i_\mu^{(H)} = F_t^\mu \circ \pi_\mu^{(H)}$.
5. The flow $F_t^\mu$ is Hamiltonian on $M_\mu^{(H)}$, with reduced Hamiltonian function $h_\mu^{(H)} : M_\mu^{(H)} \to \mathbb{R}$ defined by $h_\mu^{(H)} \circ \pi_\mu^{(H)} = h \circ i_\mu^{(H)}$. The vector fields $X_h$ and $X_{h_\mu^{(H)}}$ are $\pi_\mu^{(H)}$-related.
6. Let $k : M \to \mathbb{R}$ be another G-invariant function. Then $\{h, k\}$ is also G-invariant and $\{h, k\}_\mu^{(H)} = \{h_\mu^{(H)}, k_\mu^{(H)}\}_{M_\mu^{(H)}}$, where $\{\cdot, \cdot\}_{M_\mu^{(H)}}$ denotes the Poisson bracket induced by the symplectic structure on $M_\mu^{(H)}$.

Theorem 6 (Symplectic stratification theorem). The quotient $M_\mu := J^{-1}(\mu)/G_\mu$ is a cone space when considered as a stratified space with strata $M_\mu^{(H)}$.

As was the case for regular reduction, this theorem can also be formulated from the orbit reduction point of view. Using that approach one can conclude that the orbit reduced spaces $M_{\mathcal{O}_\mu}$ are cone spaces symplectically stratified by the manifolds $M_{\mathcal{O}_\mu}^{(H)} := G \cdot (J^{-1}(\mu) \cap M_H^z)/G$ that have symplectic structure uniquely determined by the expression
\[ i_{\mathcal{O}_\mu}^{(H)*} \omega = \pi_{\mathcal{O}_\mu}^{(H)*} \omega_{\mathcal{O}_\mu}^{(H)} + J_{\mathcal{O}_\mu}^{(H)*} \omega^+_{\mathcal{O}_\mu} \]
where $i_{\mathcal{O}_\mu}^{(H)} : G \cdot (J^{-1}(\mu) \cap M_H^z) \hookrightarrow M$ is the inclusion, $J_{\mathcal{O}_\mu}^{(H)} : G \cdot (J^{-1}(\mu) \cap M_H^z) \to \mathcal{O}_\mu$ is obtained by restriction of the momentum map $J$, and $\omega^+_{\mathcal{O}_\mu}$ is the (+)-symplectic form on $\mathcal{O}_\mu$. Analogous statements to (1)–(6) above with obvious modifications are valid.

See also: Cotangent Bundle Reduction; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Graded Poisson Algebras; Hamiltonian Group Actions; Lie Groups: General Theory; Poisson Reduction; Singularity and Bifurcation Theory; Symmetries and Conservation Laws.

Further Reading

Abraham R and Marsden JE (1978) Foundations of Mechanics, 2nd edn. Reading, MA: Addison-Wesley.
Arms JM, Cushman R, and Gotay MJ (1991) A universal reduction procedure for Hamiltonian group actions. In: Ratiu TS (ed.) The Geometry of Hamiltonian Systems, pp. 33–51. New York: Springer.
Cendra H, Marsden JE, and Ratiu TS (2001) Lagrangian reduction by stages. Memoirs of the American Mathematical Society, Volume 152, No. 722.
Huebschmann J (2001) Singularities and Poisson geometry of certain representation spaces. In: Landsman NP, Pflaum M, and Schlichenmaier M (eds.) Quantization of Singular Symplectic Quotients, Progr. Math., vol. 198, pp. 119–135. Boston, MA: Birkhäuser.
Kazhdan D, Kostant B, and Sternberg S (1978) Hamiltonian group actions and dynamical systems of Calogero type. Communications in Pure and Applied Mathematics 31: 481–508.
Kirillov AA (1976) Elements of the Theory of Representations, Grundlehren Math. Wiss. Berlin–New York: Springer.
Kostant B (1970) Quantization and Unitary Representations, Lecture Notes in Math., vol. 570, pp. 177–306. Berlin: Springer.
Lerman E, Montgomery R, and Sjamaar R (1993) Examples of singular reduction. In: Symplectic Geometry, London Math. Soc. Lecture Note Ser., vol. 192, pp. 127–155. Cambridge: Cambridge University Press.
Marsden J (1981) Lectures on Geometric Methods in Mathematical Physics. Philadelphia: SIAM.
Marsden JE, Misiołek G, Ortega J-P, Perlmutter M, and Ratiu TS (2005) Hamiltonian Reduction by Stages, Lecture Notes in Mathematics. Berlin: Springer.
Marsden JE and Ratiu TS (2003) Introduction to Mechanics and Symmetry, 2nd edn., second printing (1st edn (1994)), Texts in Applied Mathematics, vol. 17. New York: Springer.
Marsden JE and Weinstein A (1974) Reduction of symplectic manifolds with symmetry. Reports on Mathematical Physics 5(1): 121–130.
Marsden JE and Weinstein A (2001) Comments on the history, theory, and applications of symplectic reduction. In: Landsman N, Pflaum M, and Schlichenmaier M (eds.) Quantization of Singular Symplectic Quotients, Progress in Mathematics, vol. 198, pp. 1–20. Boston: Birkhäuser.
Meyer KR (1973) Symmetries and integrals in mechanics. In: Peixoto M (ed.) Dynamical Systems, pp. 259–273. New York: Academic Press.
Ortega J-P and Ratiu TS (2003) Momentum Maps and Hamiltonian Reduction, Progress in Math., vol. 222. Boston: Birkhäuser.
Pflaum MJ (2001) Analytic and Geometric Study of Stratified Spaces, Lecture Notes in Mathematics, vol. 1768. Berlin: Springer–Verlag.
Sjamaar R and Lerman E (1991) Stratified symplectic spaces and reduction. Annals of Mathematics 134: 375–422.
Souriau J-M (1970) Structure des Systèmes Dynamiques. Paris: Dunod. (English translation by Cushman RH and Tuynman GM (1997) Structure of Dynamical Systems, Progress in Math., vol. 149. Boston: Birkhäuser.)

Symmetry Breaking in Field Theory
and the ground state has angular momentum $j \neq 0$, then it is $(2j + 1)$-fold degenerate.

The situation is different, however, in a quantum field theory. In the infinite-volume limit, even abelian symmetries can be spontaneously broken. Take, for example, a real scalar field with Lagrangian
\[ \mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - V = \tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}(\nabla\phi)^2 - V \qquad [2] \]
(where we set $c = \hbar = 1$), again with a double-well potential
\[ V = \tfrac{1}{8}\lambda(\phi^2 - \eta^2)^2 \qquad [3] \]
exhibiting a $Z_2$ symmetry under which $\phi(x) \mapsto -\phi(x)$.

At least in the semiclassical or tree approximation, there are two degenerate vacuum states $|0\rangle$ and $|0'\rangle$, with
\[ \langle 0|\hat\phi(x)|0\rangle \approx \eta \quad\text{and}\quad \langle 0'|\hat\phi(x)|0'\rangle \approx -\eta \qquad [4] \]
If we quantize the system in a box of finite volume $\mathcal{V}$, then, as earlier, there is an off-diagonal matrix element of the Hamiltonian connecting the two states, so the true ground state is (approximately) $(1/\sqrt{2})(|0\rangle + |0'\rangle)$. However, this matrix element goes to zero exponentially as $\mathcal{V} \to \infty$. Even for large but finite volume, the rate of transitions from $|0\rangle$ to $|0'\rangle$ is exponentially slow.

Similarly, we can consider a complex scalar field theory with a sombrero potential:
\[ \mathcal{L} = |\dot\phi|^2 - |\nabla\phi|^2 - V, \qquad V = \tfrac{1}{2}\lambda\left(|\phi|^2 - \tfrac{1}{2}\eta^2\right)^2 \qquad [5] \]
This model is invariant under the U(1) group of phase transformations, $\phi(x) \mapsto \phi(x)e^{i\alpha}$, so we now have a continuously infinite set of degenerate vacuum states $|0_\alpha\rangle$ labeled by an angle $\alpha$, and satisfying
\[ \langle 0_\alpha|\hat\phi(x)|0_\alpha\rangle \approx \frac{1}{\sqrt{2}}\,\eta\, e^{i\alpha} \qquad [6] \]
Once again, one finds that in the infinite-volume limit there are no matrix elements connecting the different vacuum states. Moreover, in this limit no polynomial formed from the field operators $\hat\phi(x)$ in a finite volume can have nonzero matrix elements between $|0_\alpha\rangle$ and $|0_\beta\rangle$ for $\alpha \neq \beta$. Applying the operators $\hat\phi(x)$ to any one of these vacuum states $|0_\alpha\rangle$, we can construct a Fock space $\mathcal{H}_\alpha$, and the representations of the canonical commutation relations on these separate Hilbert spaces are unitarily inequivalent. Formally, we can introduce operators $\hat U_\alpha$ that perform the symmetry transformations:
\[ \hat U_\alpha \hat\phi(x) \hat U_\alpha^{-1} = \hat\phi(x) e^{i\alpha} \qquad [7] \]
However, these are not unitary operators on the spaces $\mathcal{H}_\alpha$, but rather maps from one space to another, $\hat U_\alpha : \mathcal{H}_\beta \to \mathcal{H}_{\beta+\alpha}$, or, alternatively, operators on the nonseparable Hilbert space $\mathcal{H} = \bigoplus_\alpha \mathcal{H}_\alpha$.

So far, our discussion has been restricted to the tree approximation. For a full quantum treatment, $V(\phi)$ must be replaced by the effective potential $V_{\rm eff}(\phi)$, which may be defined as the minimum value of the mean energy density in all states in which the field $\hat\phi$ has the uniform expectation value $\langle\hat\phi(x)\rangle = \phi$. $V_{\rm eff}$ may be computed by summing vacuum loop diagrams.

A point to note is that although the degenerate vacua $|0_\alpha\rangle$ are mathematically distinct, in the absence of any external definition of phase, they are physically identical. There is no internal observational test that will distinguish them.

Symmetry-Breaking Phase Transitions

Spontaneous symmetry breaking often occurs in the context of a phase transition. At high temperature $T$, there are large fluctuations in $\phi$ and the central hump of the potential is unimportant. Then the equilibrium state is symmetric, with $\langle\hat\phi\rangle = 0$. However, as the temperature falls, it becomes less probable that the field will fluctuate over the top of the hump. It will tend to fall into the trough, and acquire a nonzero average value $\langle\hat\phi\rangle$ (the order parameter for the phase transition), thus breaking the symmetry. The direction of symmetry breaking (e.g., the phase of $\phi$ in the U(1) model) is random, determined in practice by small preexisting fluctuations or interactions with the environment.

One way of studying this process is to compute the temperature-dependent effective potential $V_{\rm eff}(\phi, T)$. In the one-loop approximation, at high temperature, the leading corrections to the zero-temperature effective potential $V_{\rm eff}(\phi, 0)$ are of the form
\[ V_{\rm eff}(\phi, T) = V_{\rm eff}(\phi, 0) - \frac{\pi^2}{90} N_* T^4 + \frac{1}{24} M^2(\phi) T^2 + O(T) \qquad [8] \]
where $N_*$ is the total number of helicity states of light particles (those with masses $\ll T$), and $M^2$, which depends on $\phi$, is the sum of their squared masses. (Fermions if present contribute to $N_*$ with a factor of 7/8 and to $M^2$ with a factor of 1/2.) In the simplest case, where we have only a multiplet $\boldsymbol\phi = (\phi_a)_{a=1,\ldots,N}$ of real scalar fields, $N_* = N$ and $M^2 = M^2_{aa}$
(summation over $a$ implied), where the mass-squared matrix is

\[ M^2_{ab} = \frac{\partial^2 V}{\partial\phi_a\,\partial\phi_b} \qquad [9] \]

For example, in an O(N) theory, with $V = (1/8)\lambda(\boldsymbol\phi^2 - \eta^2)^2$, where
$\boldsymbol\phi^2 = \phi_a\phi_a$, one has

\[ M^2_{ab} = \tfrac12\lambda(\boldsymbol\phi^2 - \eta^2)\,\delta_{ab} + \lambda\,\phi_a\phi_b \qquad [10] \]

whence

\[ V_{\rm eff}(\boldsymbol\phi, T) \approx \frac18\lambda(\boldsymbol\phi^2 - \eta^2)^2 - \frac{\pi^2}{90} N T^4
   + \frac{1}{48}\lambda T^2\left[(N+2)\boldsymbol\phi^2 - N\eta^2\right] \qquad [11] \]

It is then easy to see that the minimum occurs at $\boldsymbol\phi = 0$ for $T > T_c$, where in this
approximation $T_c^2 = 12\eta^2/(N+2)$, while below the critical temperature the minimum is at

\[ \boldsymbol\phi^2 = \eta_{\rm eq}^2(T) \equiv \eta^2 - \frac{N+2}{12}\,T^2 \qquad [12] \]

As $T \to 0$, the equilibrium state approaches one of the vacuum states $|0_{\mathbf n}\rangle$, labeled by an
N-dimensional unit vector $\mathbf n$, such that $\langle 0_{\mathbf n}|\hat{\boldsymbol\phi}|0_{\mathbf n}\rangle = \eta\mathbf n$.

It is often convenient to introduce a classical symmetry-breaking potential. For example, in the O(N)
model, we may take $V_{\rm sb} = -\mathbf j\cdot\boldsymbol\phi(x)$, where $\mathbf j$ is a constant N-vector.
This has the effect of tilting the potential, thus removing the degeneracy. A characteristic of spontaneous
symmetry breaking is that the limits $\mathbf j \to 0$ and $V \to \infty$ do not commute. If (for $T < T_c$)
we take the infinite-volume limit first, and then let $\mathbf j \to 0$, we get different equilibrium states,
depending on the direction from which $\mathbf j$ approaches zero; if we fix $\mathbf n$ and let
$\mathbf j = j\mathbf n$, $j \to 0$, then we find

\[ \lim_{j\to 0}\,\lim_{V\to\infty} \langle\hat{\boldsymbol\phi}(x)\rangle_{j\mathbf n} = \eta_{\rm eq}(T)\,\mathbf n \qquad [13] \]

We may also regard $\mathbf j$ as representing an interaction with the external environment (e.g., other
fields). If such a term is present during the cooling of the system through the phase transition, it will
constrain the direction of the spontaneous symmetry breaking. Note that one always arrives in this way at
one of the degenerate vacua $|0_{\mathbf n}\rangle$, not a linear combination of them.

Goldstone Bosons

The Goldstone theorem states that spontaneous breaking of any continuous global symmetry leads inevitably
(except, as we discuss later, in the presence of long-range forces) to the appearance of massless modes –
the Goldstone bosons.

The proof is straightforward. Associated with any continuous symmetry there is a Noether current
satisfying the continuity equation $\partial_\mu \hat j^\mu = 0$ and such that infinitesimal symmetry
transformations are generated by the spatial integral of $\hat j^0$. The fact that the symmetry is broken
means that there is some scalar field $\hat\phi(x)$ whose vacuum expectation value
$\langle 0|\hat\phi(0)|0\rangle$ is not invariant under the symmetry transformation. Hence,

\[ \lim_{V\to\infty} i\int_V d^3x\, \langle 0|[\hat j^0(x), \hat\phi(0)]|0\rangle\Big|_{x^0=0} \neq 0 \qquad [14] \]

Moreover, the time derivative of this integral is

\[ \lim_{V\to\infty} i\int_V d^3x\, \langle 0|[\partial_0\hat j^0(x), \hat\phi(0)]|0\rangle\Big|_{x^0=0}
   = -\lim_{V\to\infty} i\oint_{\partial V} dS_k\, \langle 0|[\hat j^k(x), \hat\phi(0)]|0\rangle\Big|_{x^0=0} = 0 \qquad [15] \]

where $\partial V$ is the bounding surface of $V$. This vanishes because the surface integral is zero – in a
relativistic theory, because the commutator vanishes at spacelike separation, and more generally in the
absence of long-range interactions because it tends rapidly to zero at large spatial separation.

Now, inserting a complete set of momentum eigenstates $|n, \mathbf p\rangle$ in [14], we can see that there
must exist states such that $\langle n, \mathbf p|\hat\phi(0)|0\rangle \neq 0$, with $p^0 \to 0$ in the limit
$|\mathbf p| \to 0$, that is, massless modes.

One can see this more directly in the U(1) model above. Consider a vacuum state $|0\rangle$ such that
$\langle 0|\hat\phi|0\rangle = \eta/\sqrt2$ is real. Then it is useful to shift the origin of $\phi$ by writing

\[ \phi(x) = \frac{1}{\sqrt2}\left[\eta + \varphi_1(x) + i\varphi_2(x)\right] \qquad [16] \]

where $\varphi_1$ and $\varphi_2$ are real. Then the Lagrangian becomes

\[ L = \tfrac12\Big[\dot\varphi_1^2 - (\nabla\varphi_1)^2 + \dot\varphi_2^2 - (\nabla\varphi_2)^2 - \lambda\eta^2\varphi_1^2
   - \lambda\eta\,\varphi_1\left(\varphi_1^2 + \varphi_2^2\right) - \tfrac14\lambda\left(\varphi_1^2 + \varphi_2^2\right)^2\Big] \qquad [17] \]

Evidently, the field $\varphi_1$, corresponding to radial oscillations in $\phi$, is massive, with mass
$\sqrt\lambda\,\eta$. But there is no term in $\varphi_2^2$, so $\varphi_2$ is massless.

In the case of spontaneous symmetry breaking of nonabelian symmetries, there may be several Goldstone
bosons, one for each broken component of the continuous symmetry. In our theory with symmetry group
G = O(N), the possible values of the vacuum expectation value at $T = 0$ are
$\langle 0_{\mathbf n}|\hat{\boldsymbol\phi}(0)|0_{\mathbf n}\rangle = \eta\mathbf n$,
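The counting of massless modes in the O(N) model can be checked directly from the mass-squared matrix [10]. The following Python sketch assumes the form $M^2_{ab} = \tfrac12\lambda(\boldsymbol\phi^2-\eta^2)\delta_{ab} + \lambda\phi_a\phi_b$ that follows from $V = (1/8)\lambda(\boldsymbol\phi^2-\eta^2)^2$, evaluates it at a vacuum $\boldsymbol\phi = \eta\mathbf n$, and confirms one massive radial direction and $N-1$ massless tangential directions. The function names and the values of `lam`, `eta`, and `N` are illustrative assumptions, not taken from the text.

```python
# Sketch: count massless modes of the O(N) model at a broken-symmetry vacuum.
# Assumes V = (lam/8) * (phi^2 - eta^2)^2, so that
# M2_ab = (lam/2) * (phi^2 - eta^2) * delta_ab + lam * phi_a * phi_b   (eq. [10]).

def mass_matrix(phi, lam, eta):
    """Return the mass-squared matrix M2_ab as a list of rows."""
    phi2 = sum(x * x for x in phi)
    n = len(phi)
    return [
        [0.5 * lam * (phi2 - eta ** 2) * (1.0 if a == b else 0.0) + lam * phi[a] * phi[b]
         for b in range(n)]
        for a in range(n)
    ]

def apply(matrix, vector):
    """Matrix-vector product for plain Python lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Illustrative (hypothetical) parameter values.
lam, eta, N = 0.7, 2.0, 4
n_hat = [0.0] * (N - 1) + [1.0]        # unit vector n labelling the vacuum
vacuum = [eta * x for x in n_hat]      # phi = eta * n
M2 = mass_matrix(vacuum, lam, eta)

# Radial direction: M2 n = lam * eta^2 * n (the massive mode).
radial = apply(M2, n_hat)

# Tangential directions: M2 e_a = 0 for each basis vector e_a orthogonal to n.
basis = [[1.0 if b == a else 0.0 for b in range(N)] for a in range(N - 1)]
massless = sum(1 for e in basis if all(abs(x) < 1e-12 for x in apply(M2, e)))

print("radial eigenvalue:", radial[-1], "expected:", lam * eta ** 2)
print("massless directions:", massless, "expected:", N - 1)
```

The $N-1$ vanishing directions are the tangent directions to the sphere of degenerate vacua, one for each broken generator of O(N).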
Elitzur’s Theorem; the Role of Gauge Fixing

The concept of spontaneous symmetry breaking in the context of a local symmetry requires further
discussion, in particular because of Elitzur’s theorem, proved in 1975, which states in essence that
“spontaneous breaking of a local symmetry is impossible.” In the light of this theorem, it may seem that a
“spontaneously broken gauge theory” is an oxymoron. In fact, it means something rather different, although
even that is not unproblematic.

The theorem was proved in the context of lattice gauge theory, where the spatial continuum is replaced by
a discrete lattice. The scalar field is then represented by values $\boldsymbol\phi_x$ at each lattice site,
and the gauge potential by values $A_{x,\mu}$ on the links of the lattice. This is significant because on the
lattice one can use a manifestly gauge-invariant formalism. Expectation values of gauge-invariant physical
variables can be found, for example, by a Monte Carlo algorithm that effectively averages over all possible
gauges. In this context, it is possible to show that the expectation value of any gauge-noninvariant
operator (such as $\hat{\boldsymbol\phi}_x$) necessarily vanishes identically.

To be more specific, suppose we incorporate a symmetry-breaking term of the form
$-\mathbf j\cdot\sum_x \boldsymbol\phi_x$, and consider the limits $V \to \infty$ followed by $\mathbf j \to 0$.
In the global-symmetry case, as we noted earlier, this yields the nonzero result [13]. However, in the case
of a local gauge symmetry, one can show rigorously that

\[ \lim_{j\to 0}\,\lim_{V\to\infty} \langle\hat{\boldsymbol\phi}_x\rangle_{j\mathbf n} = 0 \qquad [27] \]

The essential reason for this is that we can make a gauge transformation in the neighborhood of the point
$x$ to make $\boldsymbol\phi_x$ have any value we like without changing the energy by more than a very small
amount that goes to zero as $\mathbf j \to 0$. Within this manifestly gauge-invariant formalism, it is clear
that the expectation value of a gauge-noninvariant operator such as $\hat{\boldsymbol\phi}$ is not an
appropriate order parameter. One must instead look for a gauge-invariant order parameter.

It is important to note, however, that this result applies only in the context of a manifestly
gauge-invariant formalism. But, in general, gauge theories cannot be quantized in a manifestly
gauge-invariant way. In a path-integral formalism, the action functional, which appears in the exponent, is
constant along the orbits of the gauge-group action. Consequently, the integral contains an infinite
factor, the volume of the (infinite-dimensional) gauge group. There are corresponding divergences in the
perturbation series. As is well known, this problem can be dealt with by introducing a gauge-fixing term,
which explicitly breaks the gauge symmetry, and renders Elitzur’s theorem inapplicable. But this procedure
leaves a global symmetry unbroken, and it is in fact that global symmetry that is broken spontaneously.

One example is the Landau–Ginzburg model of a superconductor, which is essentially just the
nonrelativistic limit of the abelian Higgs model, although there is one significant difference: here the
field $\hat\phi$ annihilates a Cooper pair, a bound pair of electrons with equal and opposite momenta and
spins, so $e$ above is replaced by the charge $2e$ of a Cooper pair. The appearance of a condensate of
Cooper pairs in the low-temperature superconducting phase corresponds to a state in which
$\langle\hat\phi\rangle$ is nonzero. This would not be possible without fixing a gauge. In the
nonrelativistic context, the obvious gauge to choose is the Coulomb gauge, defined by the condition
$\partial_k A_k = 0$. This gauge-fixing condition breaks the local symmetry explicitly, but it leaves
unbroken the global symmetry $\phi(x) \to \phi(x)e^{i\alpha}$ with constant $\alpha$. It is that global
symmetry that is spontaneously broken when $\langle\hat\phi\rangle \neq 0$.

For a model with nonabelian local symmetry the standard procedure used to derive a perturbation expansion
is that of Faddeev and Popov. Consider, for example, the SO(3) gauge theory discussed in the preceding
section. To fix the gauge, we can choose a set of functions $\mathbf F = (F_a)$ of the fields, and introduce
into the path integral a gauge-fixing term of the form

\[ L_{\rm gf} = -\frac{1}{2\xi}\,\mathbf F^2 \qquad [28] \]

where $\xi$ is an arbitrary real constant. However, to ensure that this does not bias the integral, so that
the gauge-fixed theory is at least formally equivalent to the original gauge-invariant theory, one must
also include the determinant of the Jacobian matrix

\[ J_{ab}(x, y) = \frac{\delta F_a(x)}{\delta\omega_b(y)} \qquad [29] \]

The easiest way to do this is to introduce Faddeev–Popov ghost fields $\mathbf C$, $\bar{\mathbf C}$, which are
scalar Grassmann variables, and an appropriate term in the Lagrangian

\[ L_{\rm FP} = \bar{\mathbf C}\cdot J\cdot\mathbf C \qquad [30] \]

For the SO(3) model, a convenient choice of gauge is the $R_\xi$ gauge defined by

\[ \mathbf F = \partial_\mu \mathbf A^\mu - \xi e\eta\,\mathbf n\times\boldsymbol\phi \qquad [31] \]
where $\mathbf n$ is an arbitrarily chosen unit vector. It is clear that the full Lagrangian
$L + L_{\rm gf} + L_{\rm FP}$ is no longer invariant under the full SO(3) gauge group, although there is a
residual U(1) gauge invariance corresponding to rotations about $\mathbf n$. In this gauge, the arbitrary
choice of $\mathbf n$ means that the global SO(3) symmetry is also broken. However, for other choices, such
as the Lorentz gauge $\mathbf F = \partial_\mu\mathbf A^\mu$ or axial gauge $\mathbf F = \mathbf A_3$, the
Lagrangian is invariant under global SO(3) rotations of all the fields. This global symmetry is then
spontaneously broken, with $\hat{\boldsymbol\phi}$ acquiring as before a nonzero expectation value of the
form $\langle\hat{\boldsymbol\phi}(x)\rangle = \eta\mathbf n$.

It is interesting to look again at the particle content of this model. By setting
$\boldsymbol\phi(x) = \eta\mathbf n + \boldsymbol\varphi(x)$ with $\mathbf n = (0, 0, 1)$, one finds that in the
quadratic part of the Lagrangian, the cross-terms between $\mathbf A_\mu$ and $\boldsymbol\varphi$ combine to
form a total divergence which can be dropped. As before, $\varphi_3$ is the Higgs field, with
$m^2 = \lambda\eta^2$, $A^\mu_3$ is the massless gauge field corresponding to the unbroken gauge symmetry, and
the three transverse components of $A^\mu_1$ and $A^\mu_2$ represent the massive vector fields, with
$m^2 = e^2\eta^2$. There are, however, also unphysical fields with $\xi$-dependent masses:
$\varphi_{1,2}$, $C_{1,2}$, $\bar C_{1,2}$, and the longitudinal components $\partial_\mu A^\mu_{1,2}$ all have
$m^2 = \xi e^2\eta^2$. We can now compute the effective potential $V_{\rm eff}(T, \boldsymbol\varphi)$. One
point that should be noted in performing this calculation is that the ghost fields $\mathbf C$,
$\bar{\mathbf C}$ contribute negatively. Obviously, $V_{\rm eff}$, being $\xi$-dependent, is not itself
physically meaningful. Nevertheless, it can be shown that the stationary points of $V_{\rm eff}$ are
physical, and correspond to the possible equilibrium states of the theory. Moreover, the extremal values
of $V_{\rm eff}$ are independent of $\xi$ and give correctly the thermodynamic potential in the corresponding
equilibrium states. The negative contributions from the ghost fields to $N_*$ and $M^2$ ensure that the
$\xi$ dependence cancels out, and we find as expected $N_* = 9$ and $M^2 = (\lambda + 6e^2)\eta^2$.

Phase Transitions and Crossovers

Our discussion so far has for the most part been restricted to a semiclassical or mean-field
approximation. It is important to bear in mind, however, that this approximation does not suffice to
determine whether a phase transition (where the thermodynamic free energy is nonanalytic) exists, or what
its nature is. Determining the detailed characteristics of phase transitions requires other methods, such
as the renormalization group or lattice simulations. In many cases, it is far from trivial to establish the
order of the transition, or even whether a true phase transition actually exists.

Gauge theories pose particular problems because of the infrared divergences in the thermal field theory at
high temperature, and because in asymptotically free nonabelian theories the coupling becomes large at very
low energy. Even when they appear to exhibit spontaneous symmetry breaking, they do not necessarily undergo
a true phase transition. Lattice gauge theory calculations have led to the conclusion that in nonabelian
gauge theories with the Higgs field in the fundamental representation, there are values of the coupling
constants for which there is no phase transition, only a rapid but smooth crossover from one type of
behavior to another, so that the high- and low-temperature phases are analytically connected. If the
coupling constant is small, there is a first-order phase transition, and for moderate values the theory
exhibits a very rapid crossover that looks quite similar to a symmetry-breaking phase transition.
Nevertheless, the analytic connection between the two phases implies that there cannot exist an order
parameter that is strictly zero above the transition and nonzero below it.

In particular, it appears that for physical values of the Higgs mass, the electroweak theory does not in
fact undergo a true phase transition. It is somewhat ironic that the most famous example of a
spontaneously broken gauge theory probably does not, strictly speaking, exhibit a symmetry-breaking phase
transition!

Conclusions

We have discussed the main features of spontaneous symmetry breaking in both the global- and
local-symmetry cases, especially the appearance of Goldstone bosons when a continuous global symmetry
breaks, and their elimination in the local-symmetry case by the Higgs mechanism, as well as the problems
attaching to the concept of spontaneous symmetry breaking in gauge theories.

See also: Abelian Higgs Vortices; Effective Field Theories; Electroweak Theory; Finite Group Symmetry
Breaking; Lattice Gauge Theory; Noncommutative Geometry and the Standard Model; Phase Transitions in
Continuous Systems; Quantum Central Limit Theorems; Quantum Spin Systems; Symmetries in Quantum Field
Theory of Lower Spacetime Dimensions; Topological Defects and their Homotopy Classification.

Further Reading

Anderson PW (1963) Plasmons, gauge invariance and mass. Physical Review 130: 439–442.
204 Symmetry Classes in Random Matrix Theory
Coleman S (1985) Aspects of Symmetry, ch. 5. Cambridge: Cambridge University Press.
Elitzur S (1975) Impossibility of spontaneously breaking local symmetries. Physical Review D 12: 3978–3982.
Englert F and Brout R (1964) Broken symmetry and the mass of gauge vector bosons. Physical Review Letters 13: 321–323.
Fradkin E and Shenker SH (1979) Phase diagrams of lattice gauge theories with Higgs fields. Physical Review D 19: 3682–3697.
Goldstone J, Salam A, and Weinberg S (1962) Broken symmetries. Physical Review 127: 965–970.
Guralnik GS, Hagen CR, and Kibble TWB (1964) Global conservation laws and massless particles. Physical Review Letters 13: 585–587.
Higgs PW (1964) Broken symmetries, massless particles and gauge fields. Physics Letters 12: 132–133.
Kibble TWB (1967) Symmetry-breaking in non-abelian gauge theories. Physical Review 155: 1554–1561.
Weinberg S (1996) The Quantum Theory of Fields, Vol. II. Modern Applications, chs. 19 and 21. Cambridge: Cambridge University Press.
important role. A prominent example is provided by the operator for space reflection. Its eigenspaces are
the subspaces of states with positive and negative parity; these reduce the matrix of any
reflection-invariant Hamiltonian to two blocks.

Not all symmetries of a quantum-mechanical system are of the canonical, unitary kind: the prime
counterexample is the operation of inverting the time direction, called time reversal for short. In
classical mechanics, this operation reverses the sign of the symplectic structure of phase space; in
quantum mechanics, its algebraic properties reflect the fact that inverting the time direction,
$t \mapsto -t$, amounts to sending $i = \sqrt{-1}$ to $-i$. Indeed, time $t$ enters in the Dirac, Pauli, or
Schrödinger equation as $i\hbar\, d/dt$. Therefore, time reversal is represented in the quantum theory by an
antiunitary operator $T$, which is to say that $T$ is complex antilinear:

what are the corresponding symmetry classes, meaning the irreducible spaces of Hamiltonians on $V$ that
commute with $G$.

For technical reasons, we assume the group $G_0$ to be compact; this is an assumption that covers most (if
not all) of the cases of interest in physics. The noncompact group of space translations can be
incorporated, if necessary, by wrapping the system around a torus, whereby translations are turned into
compact torus rotations.

While the primary objects to classify are the spaces of Hamiltonians $H$, we shall focus for convenience
on the spaces of time evolutions $U_t = e^{-itH/\hbar}$ instead. This change of focus results in no loss, as
the Hamiltonians can always be retrieved by linearizing in $t$ at $t = 0$.

Symmetric Spaces
leave the structure of the vector space $V$ invariant. Thus, $U(V)$ is a group of unitary transformations
if $V$ carries no more than the usual Hermitian scalar product; and is some subgroup of the unitary group
if $V$ does have extra structure (as is the case for the Nambu space of quasiparticle excitations in a
superconductor). The symmetry group $G_0$, by acting on $V$ and preserving its structure, is contained as a
subgroup in $U(V)$.

Let now $H$ be any Hamiltonian with the prescribed symmetries. Then the time evolution
$t \mapsto U_t = e^{-itH/\hbar}$ generated by $H$ is a one-parameter subgroup of $U(V)$ which commutes with the
$G_0$-action. The total set of transformations $U_t$ that arise in this way is called the (connected part
of the) “centralizer” of $G_0$ in $U(V)$, and is denoted by $Z$. This is the “good” set of unitary time
evolutions – the set compatible with the given symmetries of an ensemble of quantum systems.

The centralizer $Z$ is obviously a group: if $U$ and $U'$ belong to $Z$, then so do their inverses and their
product. What can one say about the structure of the group $Z$?

Since $G_0$ is compact by assumption, its group action on $V$ is completely reducible and $V$ is guaranteed
to have an orthogonal decomposition

\[ V = \bigoplus_\lambda V_\lambda \]

where the sum runs over isomorphism classes $\lambda$ of irreducible $G_0$-representations, and the vector
spaces $V_\lambda$ are called the $G_0$-isotypic components of $V$. For example, if $G_0$ is the rotation group
$SO_3$, the $G_0$-isotypic component $V_\lambda$ of $V$ is the subspace spanned by all the states with total
angular momentum $\lambda$.

Consider now any $U \in Z$. Since $U$ commutes with the $G_0$-action, it does not connect different
$G_0$-isotypic components. (Indeed, in the example of $SO_3$-invariant dynamics, angular momentum is
conserved and transitions between different angular momentum sectors are forbidden.) Thus, every
$G_0$-isotypic component $V_\lambda$ is an invariant subspace for the action of $Z$ on $V$, and $Z$ decomposes
as $Z = \prod_\lambda Z_\lambda$ with blocks $Z_\lambda = Z|_{V_\lambda}$.

To say more, fix a standard irreducible $G_0$-module $R_\lambda$ of isomorphism class $\lambda$ and consider

\[ L_\lambda := \mathrm{Hom}_{G_0}(R_\lambda, V_\lambda) \]

the linear space of $\mathbb C$-linear maps $l : R_\lambda \to V_\lambda$ that intertwine the $G_0$-actions on
$R_\lambda$ and $V_\lambda$. An element of $L_\lambda$ is called a $G_0$-equivariant homomorphism. By Schur’s lemma,
$L_\lambda \cong \mathbb C$ if $V_\lambda$ is $G_0$-irreducible. More generally, $\dim L_\lambda =: m_\lambda$ counts the
multiplicity of occurrence of $R_\lambda$ in $V_\lambda$; for example, in the case of $G_0 = SO_3$ we take $R_\lambda$
to be the standard irreducible module of dimension $2\lambda + 1$; and $m_\lambda$ then is the number of times a
multiplet of states with total angular momentum $\lambda$ occurs in $V_\lambda$.

The natural mapping $L_\lambda \otimes R_\lambda \to V_\lambda$ by $l \otimes r \mapsto l(r)$ is an isomorphism,

\[ V_\lambda \cong L_\lambda \otimes R_\lambda \]

and using it we can transfer the entire discussion from $V_\lambda$ to $L_\lambda \otimes R_\lambda$. The group $G_0$
acts trivially on $L_\lambda \cong \mathbb C^{m_\lambda}$ and irreducibly on $R_\lambda$. Therefore, the component
$Z_\lambda$ of the centralizer $Z$ is the unitary group

\[ Z_\lambda \cong U(L_\lambda) \cong U_{m_\lambda} \]

if $V_\lambda$ is a unitary vector space with no extra structure. In the presence of extra structure (which, by
compatibility with the $G_0$-action, restricts to every subspace $V_\lambda$), the factor $Z_\lambda$ is some
subgroup of $U_{m_\lambda}$. In all cases, $Z$ is a direct product of connected compact Lie groups $Z_\lambda$.

To make the connection with symmetric spaces, write $M := Z_\lambda$. Since $M$ is a group, the operation of
taking the inverse, $U \mapsto U^{-1}$, makes sense for all $U \in M$. Moreover, being a compact Lie group, the
manifold $M$ admits a left- and right-invariant Riemannian structure in which the inversion
$U \mapsto U^{-1}$ is an isometry. By translation, one gets an isometry $s_{U_1} : U \mapsto U_1 U^{-1} U_1$ for
every $U_1 \in M$. All of these maps $s_{U_1}$ are globally defined, and the restriction of $s_{U_1}$ to some
neighborhood of $U_1$ coincides with the geodesic inversion with respect to $U_1$. Thus, $M$ is a symmetric
space by the definition given above. Symmetric spaces of this kind are called type II.

Type I

Consider next the case of $G_1 \neq \emptyset$, where some antiunitary symmetry $T$ is present. As before, let
$Z$ be the connected component of the centralizer of $G_0$ in $U(V)$. Conjugation by $T$,

\[ U \mapsto \tau(U) := T U T^{-1} \]

is an automorphism of $U(V)$ and, owing to $T^2 = \pm\mathrm{Id}$, is involutive. Because $G_0 \subset G$ is a
normal subgroup, $\tau$ restricts to an involutive automorphism (still denoted by $\tau$) of $Z$. Now recall
that $T$ is complex antilinear and the good Hamiltonians are subject to $THT^{-1} = H$. The good time
evolutions $U_t = e^{-itH/\hbar}$ clearly satisfy $\tau(U_t) = U_{-t} = U_t^{-1}$. Thus, the good set to consider
is $M := \{U \in Z \mid U = \tau(U)^{-1}\}$. The set $M$ is a manifold, but in general is not a Lie group.

Further details depend on what $\tau$ does with the factorization $Z = \prod_\lambda Z_\lambda$. If $V_\lambda$ is a
$G_0$-isotypic
component of $V$, then so is $TV_\lambda$, since $T$ normalizes $G_0$. Thus, either $V_\lambda \cap TV_\lambda = 0$,
or $TV_\lambda = V_\lambda$. In the former case, the involutive automorphism $\tau$ just relates $U \in Z_\lambda$
with $\tau(U) \in Z_{TV_\lambda}$, whence no intrinsic constraint on $Z_\lambda$ results, and the time evolutions
$(U, \tau(U)^{-1}) \in Z_\lambda \times Z_{TV_\lambda}$ constitute a type-II symmetric space, as before.

Table 1 The large families of symmetric spaces. The form of H in the header applies to the last seven families

Family   Symmetric space                                       Form of $H = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{t} \end{pmatrix}$
A        $U_N$                                                 Complex Hermitian
AI       $U_N / O_N$                                           Real symmetric
AII      $U_{2N} / USp_{2N}$                                   Quaternion self-adjoint
C        $USp_{2N}$                                            $Z$ complex symmetric, $W = W^\dagger$
CI       $USp_{2N} / U_N$                                      $Z$ complex symmetric, $W = 0$
D        $SO_{2N}$                                             $Z$ complex skew, $W = W^\dagger$
DIII     $SO_{2N} / U_N$                                       $Z$ complex skew, $W = 0$
AIII     $U_{p+q} / U_p \times U_q$                            $Z$ complex $p \times q$, $W = 0$
BDI      $SO_{p+q} / SO_p \times SO_q$                         $Z$ real $p \times q$, $W = 0$
CII      $USp_{2p+2q} / USp_{2p} \times USp_{2q}$              $Z$ quaternion $2p \times 2q$, $W = 0$

A novel situation occurs when $TV_\lambda = V_\lambda$, in which case $\tau$ restricts to an automorphism of
$Z_\lambda$. Let therefore $TV_\lambda = V_\lambda$, put $K \equiv Z_\lambda$ for short, and consider

\[ M := \{U \in K \mid U = \tau(U)^{-1}\} \]

Note that if two elements $p$, $p'$ of $K$ are in $M$, then so is the product $p' p^{-1} p'$. The group $K$
acts on $M \subset K$ by

\[ k \cdot U = k\,U\,\tau(k)^{-1} \qquad (k \in K) \]
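The closure property and the $K$-action just stated can be illustrated numerically. The sketch below is a minimal check under a stated assumption: $T$ is taken to be complex conjugation in the chosen basis, so that $\tau(U) = \bar U$ and $M$ consists of the symmetric unitary matrices. It works with $2 \times 2$ unitaries only, and all function names are hypothetical helpers introduced for illustration.

```python
import cmath
import math
import random

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def tau(u):
    """Assumed involution tau(U) = T U T^-1 with T = complex conjugation."""
    return [[x.conjugate() for x in row] for row in u]

def inverse(u):
    """Inverse of a unitary matrix (equal to its conjugate transpose)."""
    return dagger(u)

def random_unitary2(rng):
    """Random 2x2 unitary: a phase times [[a, -conj(b)], [b, conj(a)]], |a|^2 + |b|^2 = 1."""
    theta, x, y, z = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(4))
    a = math.cos(theta) * cmath.exp(1j * x)
    b = math.sin(theta) * cmath.exp(1j * y)
    phase = cmath.exp(1j * z)
    return [[phase * a, -phase * b.conjugate()],
            [phase * b, phase * a.conjugate()]]

def in_M(u, tol=1e-12):
    """Test the defining condition U = tau(U)^-1."""
    target = inverse(tau(u))
    n = len(u)
    return all(abs(u[i][j] - target[i][j]) < tol for i in range(n) for j in range(n))

def cartan(k):
    """Map k to k tau(k)^-1, which always lands in M."""
    return matmul(k, inverse(tau(k)))

rng = random.Random(0)
k1, k2, k3 = (random_unitary2(rng) for _ in range(3))
p, p_prime = cartan(k1), cartan(k2)

closure = matmul(matmul(p_prime, inverse(p)), p_prime)   # p' p^-1 p'
acted = matmul(matmul(k3, p), inverse(tau(k3)))          # k . U = k U tau(k)^-1

print(in_M(p), in_M(p_prime), in_M(closure), in_M(acted))
```

With this choice of $\tau$ the fixed-point subgroup of $K = U_2$ is the real orthogonal group $O_2$, so the example is an instance of the family $U_N/O_N$ of Table 1.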
complex Hermitian $N \times N$ matrices. By putting a $U_N$-invariant Gaussian probability measure

\[ \exp\left(-\mathrm{tr}\, H^2 / 2\sigma^2\right) dH \qquad (\sigma \in \mathbb R) \]

on that space, one gets what is called the GUE – the Gaussian unitary ensemble – which defines the
Wigner–Dyson universality class of unitary symmetry.

Classes AI and AII

Consider next the case $G_1 \neq \emptyset$, with antiunitary generator $T$. Let $V_\lambda = TV_\lambda$ be any
$G_0$-isotypic component of $V$ invariant under $T$ (the type-I situation). The mapping
$U \mapsto TUT^{-1} = \tau(U)$ then is an automorphism of the groups $U(V_\lambda)$, $G_0$ and
$K = Z_\lambda \cong U_{m_\lambda}$. If $K^\tau$ is the subgroup of fixed points of $\tau$ in $K$, the space of good
time evolutions can be identified with the symmetric space $K/K^\tau$ by the Cartan embedding. Our task is
to determine $K^\tau$.

To simplify the notation let us write $V_\lambda \equiv V$, $R_\lambda \equiv R$, and $L_\lambda \equiv L$. We now ask
what happens with $T : V \to V$ in the process of transfer to $L \otimes R \cong V$. The answer, so we claim,
is that $T$ transfers to a pure tensor made from antiunitary maps $\alpha : L \to L$ and $\beta : R \to R$,

\[ T = \alpha \otimes \beta \]

To prove this claim, let $C$ be the antilinear map from $V$ to the dual vector space $V^*$ by
$v \mapsto \langle v, \cdot\rangle$. Because the elements of $G_0$ are represented by unitaries, the
$\mathbb C$-linear operator $CT : V \to V^*$ intertwines $G_0$-actions:

\[ CT\, a(g) = g^{-1}\, CT \qquad (g \in G_0) \]

where $a$ is the automorphism $a(g) = T^{-1} g T$. From the irreducibility of $R$ it follows that the space
of intertwiners $R \to R^*$ is one dimensional here (Schur’s lemma). Therefore,
$CT : L \otimes R \to L^* \otimes R^*$ must be a pure tensor (as opposed to a sum of such tensors), and since
$C$ is clearly a pure tensor, so is $T$. This completes the proof.

By the involutive property $T^2 = \epsilon_T\,\mathrm{Id}_V$ ($\epsilon_T = \pm 1$), the two antiunitary factors
of $T = \alpha \otimes \beta$ cannot but square to $\alpha^2 = \epsilon_\alpha\,\mathrm{Id}_L$ and
$\beta^2 = \epsilon_\beta\,\mathrm{Id}_R$ where $\epsilon_\alpha, \epsilon_\beta = \pm 1$ are related by
$\epsilon_\alpha\epsilon_\beta = \epsilon_T$. The factor $\alpha$ determines a nondegenerate complex bilinear form
$Q : L \times L \to \mathbb C$ by

\[ Q(l_1, l_2) = \langle \alpha l_1, l_2\rangle_L \qquad (l_1, l_2 \in L) \]

Since $\alpha$ is antiunitary one has the exchange symmetry

\[ Q(l_1, l_2) = \overline{\langle \alpha^2 l_1, \alpha l_2\rangle}_L = \epsilon_\alpha\, Q(l_2, l_1) \]

Thus, the complex bilinear form (or pairing) $Q$ is symmetric for $\epsilon_\alpha = +1$ and alternating for
$\epsilon_\alpha = -1$.

Knowing the sign of $\epsilon_\alpha = \pm 1$ we know the group $K^\tau$. Indeed, an element $k \in K^\tau$ commutes
with $T$ and after transfer from $V$ to $L$ still commutes with $\alpha$. But since $K^\tau$ is a subgroup of
$K = U_{m_\lambda}$, this means that $k \in K^\tau$ preserves $Q$. In the case of $\epsilon_\alpha = +1$, what is
preserved is a symmetric pairing, and therefore $K^\tau \cong O_{m_\lambda}$. For $\epsilon_\alpha = -1$, the
multiplicity $m_\lambda$ must be even and $K^\tau$ preserves an alternating pairing (or symplectic structure);
in that case $K^\tau \cong USp_{m_\lambda}$, the unitary symplectic group.

Thus, there is a dichotomy for the sets of good time evolutions $M \cong K/K^\tau$:

Class AI :  $K/K^\tau \cong U_N / O_N$  ($N = m_\lambda$)
Class AII : $K/K^\tau \cong U_{2N} / USp_{2N}$  ($2N = m_\lambda$)

Again we are referring to symmetric spaces by the names they – or rather their simple parts
$SU_N/SO_N$ and $SU_{2N}/USp_{2N}$ – have in the Cartan classification.

In general, there is no immediate means of predicting the parity $\epsilon_\alpha$, and one has no choice but
to go through the steps of constructing $\alpha$. If $\beta : R \to R$ happens to be $G_0$-invariant, however,
the situation simplifies. In that case $\beta$ determines a $G_0$-invariant pairing $R \times R \to \mathbb C$
(in the same way as $\alpha$ determines $Q : L \times L \to \mathbb C$ above). On general grounds, an
irreducible $G_0$-representation space admits at most one such pairing. If that pairing is symmetric, then,
as we have seen, $\epsilon_\beta = +1$; if it is alternating, then $\epsilon_\beta = -1$. The parity $\epsilon_\alpha$
is given by $\epsilon_\alpha\epsilon_\beta = \epsilon_T$.

Example. Consider any physical system with spin-rotation symmetry ($G_0 = SU_2$) and time-reversal
symmetry. The physical operation of time reversal, $T$, commutes with spin rotations and, hence, here is a
case where the factor $\beta$ in $T = \alpha \otimes \beta$ is $G_0$-invariant. On fundamental physics grounds
one has $T^2 = (-1)^{2S}$ on states with spin $S$. The spin-$S$ representation of $SU_2$ is known to carry an
invariant pairing which is symmetric or skew depending on whether the integer $2S$ is even or odd.
Therefore, $\epsilon_T = \epsilon_\beta$ and $\epsilon_\alpha = +1$ in all cases.

Thus, $T$-invariant systems with no symmetries other than energy and spin invariably are class AI. By
breaking spin-rotation symmetry ($G_0 = \{\mathrm{Id}\}$, $\epsilon_\beta = +1$) while maintaining $T$-symmetry
for states with half-integer spin (say single electrons, which carry spin $S = 1/2$), one gets
$\epsilon_\alpha = -1$, thereby realizing class AII.

The Hamiltonians. By passing to the tangent space of $K/K^\tau$ at unity one obtains Hermitian matrices
with entries that are real numbers (class AI) or real quaternions (class AII). When $K^\tau$-invariant Gaussian
probability measures (called GOE resp., GSE) are put on these spaces, one gets the Wigner–Dyson
universality classes of orthogonal resp., symplectic symmetry. In mesoscopic physics, these are realized in
disordered metals with time-reversal invariance (absence of magnetic fields and magnetic impurities).
Spin-rotation symmetry is broken by strong spin–orbit scatterers such as gold impurities.

Warning

The word “symmetry class” is not synonymous with “universality class.” Indeed, inside a symmetry class
many different types of physical behavior are possible. For example, random matrix models for disordered
metallic grains with time-reversal symmetry belong to the symmetry class of the example above (class AI),
and so do Anderson tight-binding models with real hopping. The former are known to exhibit energy level
statistics of universal GOE type, whereas the latter have localized eigenfunctions and hence level
statistics which is expected to approach the Poisson limit when the system size goes to infinity.

Disordered Superconductors

When Dirac first wrote down his famous equation in 1928, he assumed that he was writing an equation for
the wave function of the electron. Later, because of the instability caused by negative-energy solutions,
the Dirac equation was reinterpreted (via second quantization) as an equation for the fermionic field
operators of a quantum field theory. A similar change of viewpoint is carried out in reverse in the
Hartree–Fock–Bogoliubov mean-field description of quasiparticle excitations in superconductors. There, one
starts from the equations of motion for linear superpositions of the electron creation and annihilation
operators, and reinterprets them as a unitary quantum dynamics for what might be called the quasiparticle
“wave function.”

In both cases – the Dirac equation and the quasiparticle dynamics of a superconductor – there enters a
structure not present in the standard quantum mechanics underlying Dyson’s classification: the field
operators for fermionic particles are subject to a set of relations called the “canonical anticommutation
relations,” and these are preserved by the quantum dynamics. Therefore, whenever second quantization is
undone (assuming it can be undone) to return from field operators to wave functions, the wave-function
dynamics is required to preserve some extra structure. This puts a linear constraint on the good
Hamiltonians $H$. For our purposes, the best viewpoint to take is to attribute the extra invariant
structure to the Hilbert space $V$, thereby turning it into a Nambu space.

Nambu Space

Adopting the standard physics conventions of second quantization, consider some set of single-particle
creation and annihilation operators $c_i^\dagger$ and $c_i$, where $i = 1, 2, \dots$ labels an orthonormal
system of single-particle states. Such operators are subject to the canonical anticommutation relations
(CARs)

\[ c_i^\dagger c_j + c_j c_i^\dagger = \delta_{ij}, \qquad
   c_i^\dagger c_j^\dagger + c_j^\dagger c_i^\dagger = 0 = c_i c_j + c_j c_i \qquad [1] \]

When written in terms of $c_j + c_j^\dagger$ and $i(c_j - c_j^\dagger)$, these become the standard defining
relations of a Clifford algebra over $\mathbb R$. Field operators are linear combinations
$\psi = \sum_i (u_i c_i + f_i c_i^\dagger)$ with complex coefficients $u_i$ and $f_i$.

Now take $H$ to be some Hamiltonian which is quadratic in the creation and annihilation operators:

\[ H = \sum_{i,j} W_{ij}\, c_i^\dagger c_j + \frac12 \sum_{i,j}\left(Z_{ij}\, c_i^\dagger c_j^\dagger + \bar Z_{ij}\, c_j c_i\right) \]

and let $H$ act on field operators $\psi$ by the commutator: $H\cdot\psi \equiv [H, \psi]$. The time evolution
of $\psi$ is then determined by the Heisenberg equation of motion

\[ i\hbar\,\frac{d\psi}{dt} = H\cdot\psi \qquad [2] \]

which integrates to $\psi(t) = e^{-itH/\hbar}\cdot\psi(0)$, and is easily verified to preserve the CARs [1].

The dynamical equation [2] is equivalent to a system of linear differential equations for the amplitudes
$u_i$ and $f_i$. If these are assembled into vectors, and the $W_{ij}$ and $Z_{ij}$ into matrices, eqn [2]
becomes

\[ i\hbar\,\frac{d}{dt}\begin{pmatrix} f \\ u \end{pmatrix}
   = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{t} \end{pmatrix}\begin{pmatrix} f \\ u \end{pmatrix} \]

The Hamiltonian matrix on the right-hand side has some special properties due to $Z_{ij} = -Z_{ji}$ (from
$c_i c_j = -c_j c_i$) and $W_{ij} = \bar W_{ji}$ (from $H$ being self-adjoint as an operator in Fock space). To
keep track of these properties while imposing some unitary and antiunitary symmetries, it is best to put
everything in invariant form.

So, let $U$ be the unitary vector space of annihilation operators $u = \sum_i u_i c_i$, and view the creation
operators $f = \sum_i f_i c_i^\dagger$ as lying in the dual vector space $U^*$. The field operators
$\psi = u + f$ then are elements of the direct sum $U \oplus U^* =: V$, called “Nambu
space.” On $V$ there exists a canonical unitary structure expressed by

\[ \langle\psi, \tilde\psi\rangle = \sum_i \left(\bar u_i \tilde u_i + \bar f_i \tilde f_i\right) \]

group $SO(V)$, and imposing unitarity yields a real orthogonal subgroup $SO(V_{\mathbb R})$ with
$\dim V_{\mathbb R} \in 2\mathbb N$ – a symmetric space of the D family.

When expressed in some basis of Majorana
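The constraint that the canonical anticommutation relations impose on a quadratic Hamiltonian can be made concrete with a small numerical check. The sketch below assumes the block form $\begin{pmatrix} W & Z \\ Z^\dagger & -W^{t}\end{pmatrix}$ with $W$ Hermitian and $Z$ skew, and assumes that in the basis $(f, u)$ the symmetric pairing of $V = U \oplus U^*$ is represented by the block-swap matrix $\Sigma = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. It then verifies that the matrix is Hermitian and satisfies $\Sigma H^{t} \Sigma = -H$, the infinitesimal form of preserving that pairing. Function names and the dimension `n` are illustrative only.

```python
import random

def random_complex_matrix(n, rng):
    """n x n matrix with independent Gaussian real and imaginary parts."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]

def nambu_hamiltonian(W, Z):
    """Assemble the 2n x 2n block matrix [[W, Z], [Z^dagger, -W^t]]."""
    n = len(W)
    H = [[0j] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = W[i][j]
            H[i][n + j] = Z[i][j]
            H[n + i][j] = Z[j][i].conjugate()   # (Z^dagger)_ij
            H[n + i][n + j] = -W[j][i]          # (-W^t)_ij
    return H

rng = random.Random(1)
n = 3
A = random_complex_matrix(n, rng)
B = random_complex_matrix(n, rng)
W = [[A[i][j] + A[j][i].conjugate() for j in range(n)] for i in range(n)]  # Hermitian
Z = [[B[i][j] - B[j][i] for j in range(n)] for i in range(n)]              # skew

H = nambu_hamiltonian(W, Z)
dim = 2 * n

def swap(i):
    """Index permutation implemented by the block-swap matrix Sigma."""
    return (i + n) % dim

# Hermiticity: H = H^dagger.
hermitian = all(abs(H[i][j] - H[j][i].conjugate()) < 1e-12
                for i in range(dim) for j in range(dim))

# (Sigma H^t Sigma)_ij = H[swap(j)][swap(i)]; the pairing is preserved iff this equals -H_ij.
odd_under_pairing = all(abs(H[swap(j)][swap(i)] + H[i][j]) < 1e-12
                        for i in range(dim) for j in range(dim))

print(hermitian, odd_under_pairing)
```

Replacing the skew $Z$ by a generic complex matrix makes the second check fail, which is one way to see that the skew symmetry of $Z$ is what ties the Hamiltonian to the orthogonal structure.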
$\mathbb C^2$. Since two spinors combine to give a scalar, the latter comes with an alternating bilinear form
$a : R \times R \to \mathbb C$. In a suitable basis, the anticommutation relations [1] factor on
particle–hole and spin indices. The symmetric bilinear form $\{\,,\}$ of $V$ correspondingly factors under
the tensor product decomposition $V = L \otimes R$ as

\[ \{l_1 \otimes r_1,\, l_2 \otimes r_2\} = [l_1, l_2]\; a(r_1, r_2) \]

where $[\,,]$ is an alternating form on $L$, giving $L$ the structure of a complex symplectic vector space.

The good set $M$ now consists of the time evolutions that, in addition to preserving the structure of
Nambu space, commute with the spin-rotation group $SU_2$:

\[ M = \{U \in U(V) \mid UC = CU;\ \forall g \in SU_2 : gU = Ug\} \]

By the last condition, all time evolutions act trivially on the factor $R$. The condition $UC = CU$, which
expresses invariance of the symmetric form of $V$, then implies that time evolutions preserve the
alternating form of $L$. Time evolutions therefore are unitary symplectic transformations of $L$, hence
$M = USp(L) \cong USp_{2N}$ – a symmetric space of the C family. The Hamiltonian matrices in class C have the
standard form

\[ H = \begin{pmatrix} W & Z \\ Z^\dagger & -W^{t} \end{pmatrix} \]

with $W$ being Hermitian and $Z$ complex and symmetric.

Three large families of symmetric spaces remain to be implemented. Although these, too, occur in
mesoscopic physics, their most natural realization is by 4D Dirac fermions in a random gauge field
background.

Consider the Lagrangian $L$ for the Euclidean spacetime version of QCD with $N_c \geq 3$ colors of quarks
coupled to an $SU_{N_c}$ gauge field $A_\mu$:

\[ L = i\bar\psi\,\gamma^\mu(\partial_\mu - A_\mu)\psi + i m\,\bar\psi\psi \]

The massless Dirac operator $D = i\gamma^\mu(\partial_\mu - A_\mu)$ anticommutes with
$\gamma_5 = \gamma^0\gamma^1\gamma^2\gamma^3$. Therefore, in a basis of eigenstates of $\gamma_5$ the matrix of
$D$ takes the form

\[ D = \begin{pmatrix} 0 & Z \\ Z^\dagger & 0 \end{pmatrix} \qquad [3] \]

If the gauge field carries topological charge $\nu \in \mathbb Z$, the Dirac operator $D$ has at least
$|\nu|$ zero modes by the index theorem. To make a simple model of the challenging situation where $A_\mu$
is distributed according to Yang–Mills measure, one takes the matrices $Z$ to be complex rectangular, of
size $p \times q$ with $p - q = \nu$, and puts a Gaussian probability measure on that space. This random
matrix model for $D$ captures the universal features of the QCD Dirac spectrum in the massless limit.

The exponential of the truncated Dirac operator, $e^{itD}$ (where $t$ is not the time), lies in a space
equivalent to $U_{p+q}/U_p \times U_q$ – a symmetric space of the AIII family. We therefore say that the
universal behavior of the QCD Dirac spectrum is that of symmetry class AIII.

But hold on! Why are we entitled to speak of a symmetry class here? By definition, symmetries
always commute with the Hamiltonian, never do they anticommute! (The relation $D = -\gamma_5 D \gamma_5$ is not a symmetry in the sense of Dyson, nor is it a symmetry in our sense.)

Class AIII

To incorporate the massless QCD Dirac operator into the present classification scheme, we adapt it to the Nambu space setting. This is done by reorganizing the four-component Dirac spinor $\psi, \bar\psi$ as an eight-component Majorana spinor $\Psi$, to write

$$\mathcal{L}_{m=0} = \frac{i}{2}\,\Psi\,\Gamma^\mu(\partial_\mu - \mathcal{A}_\mu)\Psi$$

The $8 \times 8$ matrices $\Gamma^\mu$ are real symmetric besides satisfying the Clifford relations $\Gamma^\mu\Gamma^\nu + \Gamma^\nu\Gamma^\mu = 2\delta^{\mu\nu}$. A possible tensor product realization is

$$\Gamma^0 = 1 \otimes \sigma_z \otimes 1, \qquad \Gamma^1 = \sigma_x \otimes \sigma_y \otimes \sigma_y$$
$$\Gamma^2 = \sigma_y \otimes \sigma_y \otimes 1, \qquad \Gamma^3 = \sigma_z \otimes \sigma_y \otimes \sigma_y$$

The gauge field in this Majorana representation is $\mathcal{A}_\mu = 1 \otimes 1 \otimes (A_\mu^{(-)} + A_\mu^{(+)}\sigma_y)$, where $A_\mu^{(\pm)} = (1/2)(A_\mu \pm A_\mu^{t})$ are the symmetric and skew parts of $A_\mu \in \mathfrak{su}(N_c)$.

The operator $H = i\Gamma^\mu(\partial_\mu - \mathcal{A}_\mu)$ is imaginary skew, therefore $e^{itH}$ is real orthogonal. This means that there exists a Nambu space V with unitary structure $\langle\,,\rangle$ and symmetric pairing { , }, both of which are preserved by the action of $e^{itH}$. No change of physical meaning or interpretation is implied by the identical rewriting from Dirac D to Majorana H. The fact that Dirac fermions are not truly Majorana is encoded in a $\mathrm{U}_1$-symmetry $H e^{i\theta Q} = e^{i\theta Q} H$ generated by $Q = 1 \otimes 1 \otimes \sigma_y$.

Now comes the essential point: since H obeys $H = -\overline{H}$, the chiral "symmetry" $H = -\Gamma_5 H \Gamma_5$ with $\Gamma_5 = 1 \otimes \sigma_x \otimes 1$ can be recast as a true symmetry:

$$H = +\Gamma_5 \overline{H}\, \Gamma_5 = T H T^{-1}$$

with antilinear $T : \Psi \mapsto \Gamma_5 \overline{\Psi}$. Thus, the massless QCD Dirac operator is indeed associated with a symmetry class in the present, post-Dyson sense: that is class AIII, realized by self-adjoint operators on Nambu space with Dirac $\mathrm{U}_1$-symmetry and an antiunitary symmetry T.

Classes BDI and CII

Consider Hamiltonians D still of the form [3] but now with matrix entries taken from either the real numbers or the real quaternions. Their one-parameter groups $e^{itD}$ belong to two further families of symmetric spaces, namely the classes BDI and CII of Table 1. These large families are known to be realized as symmetry classes by the massless Dirac operator with gauge group $\mathrm{SU}_2$ (for BDI), or with fermions in the adjoint representation (for CII). For the details we must refer to Verbaarschot's (1994) paper and the recent article by Heinzner et al. (2005).

See also: Classical Groups and Homogeneous Spaces; Compact Groups and Their Representations; Determinantal Random Fields; Dirac Fields in Gravitation and Nonabelian Gauge Theory; Dirac Operator and Dirac Field; High Tc Superconductor Theory; Integrable Systems in Random Matrix Theory; Lie Groups: General Theory; Random Matrix Theory in Physics; Random Partitions; Supersymmetry Methods in Random Matrix Theory; Symmetries and Conservation Laws.

Further Reading

Altland A, Simons BD, and Zirnbauer MR (2002) Theories of low-energy quasi-particle states in disordered d-wave superconductors. Physics Reports 359: 283–354.
Altland A and Zirnbauer MR (1997) Non-standard symmetry classes in mesoscopic normal-/superconducting hybrid systems. Physical Review B 55: 1142–1161.
Dyson FJ (1962) The threefold way: algebraic structure of symmetry groups and ensembles in quantum mechanics. Journal of Mathematical Physics 3: 1199–1215.
Heinzner P, Huckleberry A, and Zirnbauer MR (2005) Symmetry classes of disordered fermions. Communications in Mathematical Physics 257: 725–771.
Helgason S (1978) Differential Geometry, Lie Groups and Symmetric Spaces. New York: Academic Press.
Mehta ML (1991) Random Matrices. New York: Academic Press.
Verbaarschot JJM (1994) The spectrum of the QCD Dirac operator and chiral random matrix theory: the threefold way. Physical Review Letters 72: 2531–2533.
Weyl H (1939) The Classical Groups: Their Invariants and Representations. Princeton: Princeton University Press.
Synchronization of Chaos
M A Aziz-Alaoui, Université du Havre, Le Havre, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction: Chaotic Systems Can Synchronize

Synchronization is a ubiquitous phenomenon characteristic of many processes in natural systems and (nonlinear) science. It has permanently remained an objective of intensive research and is today considered as one of the basic nonlinear phenomena studied in mathematics, physics, engineering, or life science. This word has a Greek root, syn = common and chronos = time, which means to share the common time or to occur at the same time, that is, correlation or agreement in time of different processes (Boccaletti et al. 2002). Thus, synchronization of two dynamical systems generally means that one system somehow traces the motion of another. Indeed, it is well known that many coupled oscillators have the ability to adjust some common relation that they have between them due to weak interaction, which leads to a situation in which a synchronization-like phenomenon takes place.

The original work on synchronization involved periodic oscillators. Indeed, observations of (periodic) synchronization phenomena in physics go back at least as far as C Huygens (1673), who, during his experiments on the development of improved pendulum clocks, discovered that two very weakly coupled pendulum clocks become synchronized in phase: two clocks hanging from a common support (on the same beam of his room) were found to oscillate with exactly the same frequency and opposite phase due to the (weak) coupling in terms of the almost imperceptible oscillations of the beam generated by the clocks.

Since this discovery, periodic synchronization has found numerous applications in various domains, for instance, in biological systems and living nature where synchronization is encountered on different levels. Examples range from the modeling of the heart to the investigation of the circadian rhythm, phase locking of respiration with a mechanical ventilator, synchronization of oscillations of human insulin secretion and glucose infusion, neuronal information processing within a brain area and communication between different brain areas. Also, synchronization plays an important role in several neurological diseases such as epilepsies and pathological tremors, or in different forms of cooperative behavior of insects, animals, or humans (Pikovsky et al. 2001).

This process may also be encountered in celestial mechanics, where it explains the locking of the revolution periods of planets and satellites.

The subject was strongly broadened by developments in radio engineering and acoustics, due to the work of Eccles and Vincent (1920), who found synchronization of a triode generator. Appleton, Van der Pol, and Van der Mark (1922–27) extended it, experimentally and theoretically, and worked on radio tube oscillators, where they observed entrainment when driving such oscillators sinusoidally, that is, the frequency of a generator can be synchronized by a weak external signal of a slightly different frequency.

But, even though the original notion and theory of synchronization imply periodicity of oscillators, during the last decades the notion of synchronization has been generalized to the case of interacting chaotic oscillators. Indeed, the discovery of deterministic chaos introduced new types of oscillating systems, namely the chaotic generators.

Chaotic oscillators are found in many dynamical systems of various origins; the behavior of such systems is characterized by instability and, as a result, limited predictability in time.

Roughly speaking, a system is chaotic if it is deterministic, has a long-term aperiodic behavior, and exhibits sensitive dependence on initial conditions on a closed invariant set (chaos theory is discussed in more detail elsewhere in this encyclopedia) (see Chaos and Attractors).

Consequently, for a chaotic system, trajectories starting arbitrarily close to each other diverge exponentially with time, and quickly become uncorrelated. It follows that two identical chaotic systems, left uncoupled, cannot synchronize. This means that they cannot produce identical chaotic signals, unless they are initialized at exactly the same point, which is in general physically impossible. Thus, at first sight, synchronization of chaotic systems seems to be rather surprising because one may intuitively (and naively) expect that the sensitive dependence on initial conditions would lead to an immediate breakdown of any synchronization of coupled chaotic systems. This scenario in fact led to the belief that chaos is uncontrollable and thus unusable. Despite this, in the last decades, the search for synchronization has moved to chaotic systems. Significant research has been done and, as a result, Yamada and Fujisaka (1983), Afraimovich et al. (1986), and Pecora and Carroll (1990) showed that
two chaotic systems could be synchronized by coupling them: synchronization of chaos is real, and chaos can therefore be exploited. Ever since, many researchers have discussed the theory and the design or applications of synchronized motion in coupled chaotic systems. A broad variety of applications has emerged, for example, to increase the power of lasers, to synchronize the output of electronic circuits, to control oscillations in chemical reactions, or to encode electronic messages for secure communications.

The publication of the seminal paper of Pecora and Carroll (1990) had a very strong impact in the domain of chaos theory and chaos synchronization, and their applications. It stimulated very intense research activities and the related studies continue to attract great attention. Many authors have contributed to developing this domain, theoretically or experimentally (Boccaletti et al. 2002, Pecora et al. 1997, references therein).

However, the special features of chaotic systems make it impossible to directly apply the methods developed for synchronization of periodic oscillators. Moreover, in the area of coupled chaotic systems, many different phenomena, which are usually referred to as synchronization, exist and have been studied now for over a decade. Thus, more precise descriptions of such systems are indeed desirable.

Several different regimes of synchronization have been investigated. In the following, the focus will be on explaining the essentials of this large topic, subdivided into four basic types of synchronization of coupled or forced chaotic systems which have been found and have received much attention, with emphasis on the first three:

• identical (or complete) synchronization (IS), which is defined as the coincidence of states of interacting systems;
• generalized synchronization (GS), which extends the IS phenomenon and implies the presence of some functional relation between two coupled systems; if this relationship is the identity, we recover the IS;
• phase synchronization (PS), which means entrainment of phases of chaotic oscillators, whereas their amplitudes remain uncorrelated; and
• lag synchronization (LS), which appears as a coincidence of time-shifted states of two systems.

Other regimes exist; some of them will be briefly pointed out at the end of this article. We will also briefly discuss the very relevant issue of the stability of synchronous motions.

Our discussion and the examples given here are based on unidirectionally coupled continuous systems; most of the ideas presented can be easily extended to discrete systems.

Let us also emphasize that the same year, 1990, saw the publication of another seminal paper, by Ott, Grebogi, and Yorke (OGY) on the control of chaos (Ott et al. 1990). Recently, it has been realized that synchronization and control of chaos share a common root in nonlinear control theory. Both topics were presented by many authors in a unified framework. However, synchronization of chaos has evolved in its own right, even if it is nowadays known as a part of nonlinear control theory.

Synchronization and Stability

For the basic master–slave configuration, where an autonomous chaotic system (the master)

$$\frac{dX}{dt} = F(X), \qquad X \in \mathbb{R}^n \qquad [1]$$

drives another system (the slave),

$$\frac{dY}{dt} = G(X, Y), \qquad Y \in \mathbb{R}^m \qquad [2]$$

synchronization takes place when Y asymptotically copies, in a certain manner, a subset $X_p$ of X. That is, there exists a relation between the two coupled systems, which could be a smooth invertible function $\psi$, which transforms the trajectories on the attractor of the first system into those on the attractor of the second system. In other words, if we know, after a transient regime, the state of the first system, it allows us to predict the state of the second: $Y(t) = \psi(X(t))$. Generally, it is assumed that $n \ge m$; however, for the sake of easy readability (even if this is not a necessary restriction) only the case n = m will be considered; thus, $X_p = X$. Henceforth, if we denote the difference $Y - \psi(X)$ by $X_\perp$, in order to arrive at a synchronized motion, it is expected that

$$\|X_\perp\| \to 0 \quad \text{as } t \to +\infty \qquad [3]$$

If $\psi$ is the identity function, the process is called IS.

Definition of IS  System [2] synchronizes with system [1], if the set $M = \{(X, Y) \in \mathbb{R}^n \times \mathbb{R}^n,\ Y = X\}$ is an attracting set with a basin of attraction B ($M \subset B$) such that $\lim_{t\to\infty} \|X(t) - Y(t)\| = 0$, for all $(X(0), Y(0)) \in B$.

Thus, this regime corresponds to the situation where all the variables of two (or more) coupled chaotic systems converge.
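As a small worked illustration of condition [3] (a sketch under an assumed coupling, not an example taken from the article), suppose the slave is a copy of the master with full-state linear feedback, $G(X, Y) = F(Y) + k(X - Y)$, where the scalar gain k is a hypothetical parameter. Linearizing the equation for $X_\perp = Y - X$ about the synchronized state gives:

```latex
\begin{aligned}
\frac{dX_\perp}{dt} &= F(X + X_\perp) - F(X) - k X_\perp
  = \bigl(DF(X) - k I\bigr) X_\perp + O\!\left(\|X_\perp\|^2\right), \\
X_\perp &= e^{-kt} Z
  \;\Longrightarrow\;
  \frac{dZ}{dt} = DF(X)\, Z \quad \text{(to first order)}, \\
\lambda_i^{\perp} &= \lambda_i - k , \qquad i = 1, \dots, n .
\end{aligned}
```

Here $\lambda_i$ denote the Lyapunov exponents of the master system [1] and $\lambda_i^{\perp}$ the growth rates of the transverse perturbation. Under this assumed coupling, the linearized version of [3] therefore holds when k exceeds the largest Lyapunov exponent of the master. Other coupling choices, such as driving through a single variable, do not reduce to such a simple shift.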
If $\psi$ is not the identity function, the phenomenon is more general and is referred to as GS.

Definition of GS  System [2] synchronizes with system [1], in the generalized sense, if there exists a transformation $\psi : \mathbb{R}^n \to \mathbb{R}^m$, a manifold $M = \{(X, Y) \in \mathbb{R}^{n+m},\ Y = \psi(X)\}$ and a subset B ($M \subset B$), such that for all $(X_0, Y_0) \in B$, the trajectory based on the initial conditions $(X_0, Y_0)$ approaches M as time goes to infinity. This is explained further in the following.

Henceforth, in the case of IS, eqn [3] above means that a certain hyperplane M, called the synchronization manifold, within $\mathbb{R}^{2n}$, is asymptotically stable. Consequently, to establish synchronous motion, we have to prove that the origin of the transverse system $X_\perp = Y - X$ is asymptotically stable, that is, that the motion transversal to the synchronization manifold dies out.

Significant progress has been made by mathematicians and physicists in studying the stability of synchronous motions. Two main tools are used in the literature for this aim: conditional Lyapunov exponents and asymptotic stability. In the examples given below, we will essentially formulate conditions for synchronization in terms of Lyapunov exponents, which play a central role in chaos theory. These quantities measure the sensitive dependence on initial conditions for a dynamical system and also quantify synchronization of chaos.

The Lyapunov exponents associated with the variational equation corresponding to the transverse system $X_\perp$:

$$\frac{dX_\perp}{dt} = DF(X)\, X_\perp \qquad [4]$$

where DF(X) is the Jacobian of the vector field evaluated along the driving trajectory X, are referred to as transverse or conditional Lyapunov exponents (CLEs).

In the case of IS, it appears that the condition $L^{\perp}_{\max} < 0$ is sufficient to ensure synchronization, where $L^{\perp}_{\max}$ is the largest CLE. Indeed, eqn [4] gives the dynamics of the motion transverse to the synchronization manifold; therefore, CLEs indicate if this motion dies out or not, and hence, whether the synchronization state is stable or not. Consequently, if $L^{\perp}_{\max}$ is negative, it ensures the stability of the synchronized state. This will be best explained using two examples below.

Even if there exist other approaches for studying synchronization, one may ask if this condition on $L^{\perp}_{\max}$ is true in general. To answer this question, mathematicians have recently formulated it in terms of properties of manifolds (or synchronization hyperplanes). Some rigorous results on (generalized) synchronization, when the system is smooth, are given by Josic (2000). This approach relies on the Fenichel theory of normally hyperbolic invariant manifolds and quantities that resemble Lyapunov exponents, and is referred to as differentiable GS.

However, many situations correspond to the case where, in some region of values of the coupling parameters, the function $\psi$ is only continuous but not smooth, that is, the graph of $\psi$ is a complicated geometrical object. This kind of synchronization is called nonsmooth GS (Afraimovich et al. 2001).

Furthermore, the mathematical theory of IS often assumes the coupled oscillators to be identical, even if, in practice, no two oscillators are exact copies of each other. This leads to small differences in system parameters and then to synchronization errors. These errors have been studied by many authors (see, e.g., Illing (2002), and references therein).

Identical Synchronization

Perhaps the best way to explain synchronization of chaos is through IS, also referred to as conventional or complete synchronization (Boccaletti et al. 2002). It is the simplest form of chaos synchronization and generalizes to the complete replacement which is explained below. It is also the most typical form of chaotic synchronization often observable in two identical systems.

There are various processes leading to synchronization; depending on the particular coupling configuration used, these processes could be very different. So, one has to distinguish between the following two main situations, even if they are, in some sense, similar: the unidirectional and the bidirectional coupling. Indeed, synchronization of chaotic systems is often studied for schemes of the form

$$\begin{aligned}
\frac{dX}{dt} &= F(X) + kN(X - Y) \\
\frac{dY}{dt} &= G(Y) + kM(X - Y)
\end{aligned} \qquad [5]$$

where F and G act in $\mathbb{R}^n$, $(X, Y) \in (\mathbb{R}^n)^2$, k is a scalar, and M and N are coupling matrices belonging to $\mathbb{R}^{n \times n}$. If F = G the two subsystems X and Y are identical. Moreover, when both matrices are nonzero then the coupling is called bidirectional, while it is referred to as unidirectional if one is the zero matrix, and the other nonzero.

Constructing Pairs of Synchronized Systems: Complete Replacement

Pecora and Carroll (1990) proposed the use of stable subsystems of given chaotic systems to
construct pairs of unidirectionally coupled synchronizing systems. Since then, generalizations of this approach have been developed and various methods now exist to synchronize systems (Wu 2002, Hasler 1998).

One way to build a pair of synchronized systems is then to use the basic construction method introduced by Pecora and Carroll, who made an important observation. They found that, when they make a replica of part of a chaotic system and send a system variable from the original system (transmitter) to drive this replica (receiver), sometimes the replica subsystem and the original chaotic one lock in their steps and evolve together chaotically in synchrony. This method can be described as follows. Consider the autonomous n-dimensional dynamical system,

$$\frac{du}{dt} = F(u) \qquad [6]$$

divide this system into two subsystems (u = (v, w)),

$$\begin{aligned}
\frac{dv}{dt} &= G(v, w) \\
\frac{dw}{dt} &= H(v, w)
\end{aligned} \qquad [7]$$

where $v = (u_1, \dots, u_m)$, $w = (u_{m+1}, \dots, u_n)$, $G = (F_1, \dots, F_m)$, and $H = (F_{m+1}, \dots, F_n)$. Next, create a new subsystem $w'$ identical to the w-subsystem. This yields a (2n − m)-dimensional system:

$$\begin{aligned}
\frac{dv}{dt} &= G(v, w) \\
\frac{dw}{dt} &= H(v, w) \\
\frac{dw'}{dt} &= H(v, w')
\end{aligned} \qquad [8]$$

Let w(t) be a chaotic trajectory with initial condition w(0), and $w'(t)$ be a trajectory started at a nearby point $w'(0)$. The basic idea of the Pecora–Carroll approach is to establish the asymptotic stability of the solutions of the $w'$-subsystem by means of CLEs. They have shown the following result (Pecora and Carroll 1990):

Theorem  A necessary and sufficient condition for the two subsystems, w and $w'$, to be synchronized is that all of the CLEs be negative.

Note that only a finite number of possible decompositions (or couplings) v–w exist; this is bounded by the number of different possible subsystems, namely N(N − 1)/2. (For a description and mathematical analysis of various coupling schemes see Wu (2002).) Furthermore, if the main system [6] is split in a different way, (complete) synchronization might not exist. Indeed, in general, only a few of the possible response subsystems possess negative CLEs, and may thus be used to implement synchronizing systems using the Pecora–Carroll method. In fact, it has been pointed out in the literature that in some cases, the CLE criterion is not as practical as some other criteria.

For simplicity, the idea will now be developed on the following three-dimensional simple autonomous system, which belongs to the class of dynamical systems called generalized Lorenz systems (see Derivière and Aziz-Alaoui (2003), and references therein):

$$\begin{aligned}
\dot x &= -9x - 9y \\
\dot y &= -17x - y - xz \\
\dot z &= -z + xy
\end{aligned} \qquad [9]$$

(This should be compared with the well-known Lorenz system:
Figure 1 The chaotic attractor of system [9]: x–y and x–z plane projections.
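The attractor of Figure 1 can be explored numerically. The following is a minimal sketch, assuming that system [9] reads dx/dt = −9x − 9y, dy/dt = −17x − y − xz, dz/dt = −z + xy (the minus signs are a reconstruction, chosen to agree with the matrix in eqn [17]). The fixed-step RK4 integrator, the initial condition, and the step size are illustrative choices, not taken from the article. Under this reading the system has equilibria at (±4, ∓4, −16), which lie inside the axis ranges shown in Figure 1.

```python
def rhs9(state):
    """Right-hand side of system [9] under the assumed sign convention."""
    x, y, z = state
    return (-9.0 * x - 9.0 * y,
            -17.0 * x - y - x * z,
            -z + x * y)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(f, state, dt, n_steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(f, state, dt)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    # Hypothetical initial condition and step size, chosen only for illustration.
    trajectory = integrate(rhs9, (1.0, 1.0, -1.0), dt=0.002, n_steps=50_000)
    tail = trajectory[10_000:]  # discard the initial transient
    for name, index in (("x", 0), ("y", 1), ("z", 2)):
        values = [point[index] for point in tail]
        print(f"{name}: min = {min(values):.2f}, max = {max(values):.2f}")
```

Plotting the x and y columns, or the x and z columns, of `tail` against each other gives the two plane projections; the printed ranges can be compared with the axes of Figure 1.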
$w' = (y_2, z_2)$ of the w-subsystem, we obtain the following five-dimensional dynamical system:

is replaced with the drive counterpart only in certain locations (Pecora et al. 1997).
Figure 2 Complete replacement synchronization. Time series for (a) $y_i(t)$ and (b) $z_i(t)$, i = 1, 2, in system [10]. The difference between the variable of the transmitter and the variable of the receiver tends asymptotically to zero as time progresses, that is, synchronization occurs after transients die down. (c) The plot of amplitudes $y_1$ against $y_2$, after transients die down, shows a diagonal line, which also indicates that the receiver and the transmitter are maintaining synchronization. The plot of $z_1$ against $z_2$ shows a similar behavior.
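The behavior summarized in Figure 2 can be reproduced with a short simulation. This is a sketch under stated assumptions: system [10] is not legible in this text, so the code takes it to be system [9] (with the sign convention dx/dt = −9x − 9y, dy/dt = −17x − y − xz, dz/dt = −z + xy) together with a replica of the (y, z) equations in which x is replaced by the drive signal x1, following construction [8]. The initial conditions, step size, and function names are hypothetical. With this form, the differences e_y = y2 − y1 and e_z = z2 − z1 satisfy d(e_y² + e_z²)/dt = −2(e_y² + e_z²), so the error norm should decay like exp(−t) whatever the drive does.

```python
import math


def rhs10(state):
    """Assumed five-dimensional system: drive [9] plus a (y2, z2) replica driven by x1."""
    x1, y1, z1, y2, z2 = state
    return (-9.0 * x1 - 9.0 * y1,
            -17.0 * x1 - y1 - x1 * z1,
            -z1 + x1 * y1,
            -17.0 * x1 - y2 - x1 * z2,   # replica of the y-equation, x replaced by x1
            -z2 + x1 * y2)               # replica of the z-equation, x replaced by x1


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def sync_error(state):
    """Euclidean distance between the receiver (y2, z2) and the transmitter (y1, z1)."""
    _, y1, z1, y2, z2 = state
    return math.hypot(y2 - y1, z2 - z1)


def run(initial, dt=0.002, t_end=30.0, sample_every=2500):
    """Integrate the assumed system and return a list of (time, error) samples."""
    state = initial
    samples = [(0.0, sync_error(state))]
    n_steps = int(round(t_end / dt))
    for i in range(1, n_steps + 1):
        state = rk4_step(rhs10, state, dt)
        if i % sample_every == 0:
            samples.append((i * dt, sync_error(state)))
    return samples


if __name__ == "__main__":
    # Drive starts at (1, 1, -1); the receiver starts far away at (8, -20).
    for t, err in run((1.0, 1.0, -1.0, 8.0, -20.0)):
        print(f"t = {t:5.1f}   error = {err:.3e}")
```

The exponential decay of the printed error corresponds to both conditional Lyapunov exponents of this particular response subsystem being equal to −1, which is consistent with the Pecora–Carroll criterion. A different choice of drive variable would need its own check.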
For k = 0, the two subsystems are uncoupled; for k > 0 both subsystems are unidirectionally coupled; and for $k \to +\infty$, we recover the complete replacement coupling scheme explained above. Our numerical computations yield the threshold value $\tilde k$ for the synchronization; we found that for $k \ge \tilde k = 4.999$, both subsystems of [13] synchronize. That is, starting from random initial conditions, and after some transient time, system [13] generates the same attractor as system [9] (see Figure 1). Consequently, all the variables of the coupled chaotic subsystems converge: $x_2$ converges to $x_1$, $y_2$ to $y_1$, and $z_2$ to $z_1$ (see Figure 3). Thus, the second system (the response) is locked to the first one (the drive).

Alternatively, observation of diagonal lines in correlation diagrams, which plot the amplitudes $x_1$ against $x_2$, $y_1$ against $y_2$, and $z_1$ against $z_2$, can also indicate the occurrence of system synchronization.

Figure 3 Time series for $x_i(t)$, $y_i(t)$, and $z_i(t)$ (i = 1, 2) in system [13] for the coupling constant k = 5.0, that is, beyond the threshold necessary for synchronization. After transients die down, the two subsystems synchronize perfectly.

IS was the first type of synchronization for which examples of unidirectionally coupled chaotic systems were presented. It is important for potential applications of chaos synchronization in communication systems, or for time-series analysis, where the information flow is also unidirectional.

Bidirectional IS

A second brief example uses a bidirectional (also called mutual or two-way) coupling. In this situation, in contrast to the unidirectional coupling, both drive and response systems are connected in such a way that they influence each other's behavior. Many biological or physical systems consist of bidirectionally interacting elements or components; examples range from cardiac and respiratory systems to coupled lasers with feedback. Let us then take two copies of the same system [9] as given above, but two-way coupled through a linear constant term k > 0 according to variables $x_{1,2}$:

$$\begin{aligned}
\dot x_1 &= -9x_1 - 9y_1 - k(x_1 - x_2) \\
\dot y_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot z_1 &= -z_1 + x_1 y_1 \\
\dot x_2 &= -9x_2 - 9y_2 - k(x_2 - x_1) \\
\dot y_2 &= -17x_2 - y_2 - x_2 z_2 \\
\dot z_2 &= -z_2 + x_2 y_2
\end{aligned} \qquad [14]$$

We can get an idea of the onset of synchronization by plotting, for example, $x_1$ against $x_2$ for various values of the coupling-strength parameter k. Our numerical computations yield the threshold value $\tilde k$ for the synchronization: $\tilde k \simeq 2.50$ (Figure 4). For $k \ge \tilde k$, both $(x_i, y_i, z_i)$ subsystems synchronize and system [14] also generates the attractor of Figure 1.

Synchronization manifold and stability  Geometrically, the fact that systems [13] and [14], beyond synchronization, generate the same attractor as system [9], implies that the attractors of these combined drive–response six-dimensional systems are confined to a three-dimensional hyperplane (the synchronization manifold) defined by Y = X. After the synchronization is reached, this manifold is a stable submanifold in the full phase space $\mathbb{R}^6$. Figure 5 gives an idea of what the geometry of the synchronous attractor of system [13] or [14] looks like, by exhibiting the projection of the phase space $\mathbb{R}^6$ onto the $(x_1, y_1, y_2)$ subspace. But one can similarly plot any combination of the variables $x_i$, $y_i$, and $z_i$ (i = 1, 2), and get the same result, since the motion, in case of synchronization, is confined to the hyperplane defined in $\mathbb{R}^6$ by the equalities $x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$.

This hyperplane is stable since small perturbations which take the trajectory off the synchronization manifold decay in time. Indeed, as stated earlier, CLEs of the linearization of the system around the synchronous state could determine the stability of the synchronized solution. This leads to requiring that the origin of the transverse system, $X_\perp$, is asymptotically stable. To see this, for both systems [13] and [14], we then switch to the new set of coordinates, $X_\perp = Y - X$, that is, $x_\perp = x_2 - x_1$, $y_\perp = y_2 - y_1$, and $z_\perp = z_2 - z_1$. The origin (0, 0, 0) is obviously a fixed point for this transverse system,
$$\begin{pmatrix} dx_\perp/dt \\ dy_\perp/dt \\ dz_\perp/dt \end{pmatrix} = V \begin{pmatrix} x_\perp \\ y_\perp \\ z_\perp \end{pmatrix} \qquad [16]$$

For systems [13] and [14], we obtain

$$V = V_i = \begin{pmatrix} -9 - k_i & -9 & 0 \\ -17 - z & -1 & -x \\ y & x & -1 \end{pmatrix} \qquad [17]$$

with $k_i = k$ for system [13] and $k_i = 2k$ for system [14]. Let us remark that the only difference between both matrices $V_i$ is the coupling k, which has a factor 2 in the bidirectional case. Figure 6 shows the dependence of $L^{\perp}_{\max}$ on k, for both examples of unidirectionally and bidirectionally coupled systems. $L^{\perp}_{\max}$ becomes negative as k increases, which ensures the stability of the synchronized state for systems [13] and [14].

Let us note that this can also be proved analytically, as done by Derivière and Aziz-Alaoui (2003), by using a suitable Lyapunov function and an extended version of the LaSalle invariance principle.

Figure 4 Illustration of the onset of synchronization of system [14]. (a)–(c) Plots of amplitudes $x_1$ against $x_2$ for values of the coupling parameter k = 0.5, 1.5, 2.8, respectively. The system synchronizes for $k \ge 2.5$. (d) Plot, for k = 2.8, of the norm $N(X) = \|x_1 - x_2\| + \|y_1 - y_2\| + \|z_1 - z_2\|$ versus t, which shows that the system synchronizes very quickly.

Figure 5 The motion of synchronized system [13] or [14] takes place on a chaotic attractor which is embedded in the synchronization manifold, that is, the hyperplane defined by $x_1 = x_2$, $y_1 = y_2$, and $z_1 = z_2$. (Axes: $x_1$, $y_1$, $y_2$; the plotted plane is labeled as the synchronization manifold, hyperplane $y_1 = y_2$.)

Figure 6 The largest transverse Lyapunov exponents $L^{\perp}_{\max}$ as a function of coupling strength k, in the unidirectional system [13] (solid) and the bidirectional system [14] (dotted).

Desynchronization motion  Synchronization depends not only on the coupling strength, but also on the vector field and the coupling function. For some choice of these quantities, synchronization may occur only within a finite range $[k_1, k_2]$ of coupling strength; in such a case a desynchronization phenomenon occurs. Thus, increasing k beyond the critical value $k_2$ yields loss of the synchronized motion ($L^{\perp}_{\max}$ becomes positive).

Generalized Synchronization

Identical chaotic systems synchronize by following the same chaotic trajectory. However, real systems are in general not identical. For instance, when the parameters of two coupled identical systems do not match, or when these coupled systems belong to different classes, complete IS may not be expected, because there does not exist such an invariant manifold Y = X, as for IS. For non-identical systems, the possibility of some type of synchronization has been investigated (Afraimovich et al. 1986). It was shown that when two different systems are coupled with sufficiently strong coupling strength, a general synchronous relation between their states could exist and it could be expressed by a smooth invertible function, $Y(t) = \psi(X(t))$. GS is thus a relaxed and extended form of IS in non-identical systems.

However, it may also occur for pairs of identical systems, for example, for systems having reflection symmetry, $F(-X) = -F(X)$. Besides these examples of GS, others also exist that exploit symmetries of the underlying systems (Parlitz and Kocarev 1999).

GS was introduced for unidirectionally coupled systems by Rulkov et al. (1995). For simplicity, we also focus on unidirectionally coupled continuous time systems:

$$\begin{aligned}
\frac{dX}{dt} &= F(X) \\
\frac{dY}{dt} &= G(Y, u(t))
\end{aligned} \qquad [18]$$

where $X \in \mathbb{R}^n$, $Y \in \mathbb{R}^m$, $F : \mathbb{R}^n \to \mathbb{R}^n$, $G : \mathbb{R}^m \times \mathbb{R}^k \to \mathbb{R}^m$, and $u(t) = (u_1(t), \dots, u_k(t))$ with $u_i(t) = h_i(X(t, X_o))$. Two (non-identical) dynamical systems are said to be synchronized in a generalized sense if there is a continuous function $\psi$ from the phase space of the first to the phase space of the second, taking orbits of the first system to orbits of the second.

The main problem is to know when and under what conditions system [18] undergoes GS. Many authors have addressed this question, and it has been shown that asymptotic stability is equally significant for this more universal concept (for some theoretical results, see Rulkov et al. (1995) and Parlitz and Kocarev (1999)). For unidirectionally coupled continuous time systems, the following results hold:

Theorem  A necessary and sufficient condition for system [18] to be synchronized in the generalized sense is that for each $u(t) = u(X(t, X_o))$ the response system of [18] is asymptotically stable.

When it is not possible to find a Lyapunov function in order to use this theorem, one can numerically compute the CLEs of the response system, and use the following result:

Theorem  The drive and response subsystems of system [18] synchronize in the generalized sense iff all of the CLEs of the response subsystem are negative.

The definition of $\psi$ has the advantage that it allows the discussion of synchronization of non-identical systems and, at the same time, the consideration of synchronization in terms of the property of the synchronization manifold. Therefore, it is important to study the existence of the transformation $\psi$ and its nature
(continuity, smoothness, . . .). Unfortunately, except in special cases (Afraimovich et al. 1986), rarely will one be able to produce formulas exhibiting the mapping.

An example of two unidirectionally coupled chaotic systems which synchronize in the generalized sense is given below. Consider the following Rössler system driven by system [9]:

$$
\begin{aligned}
\dot{x}_1 &= -9x_1 - 9y_1 \\
\dot{y}_1 &= -17x_1 - y_1 - x_1 z_1 \\
\dot{z}_1 &= -z_1 + x_1 y_1 \\
\dot{x}_2 &= -y_2 - z_2 - k\,\bigl(x_2 - (x_1^2 + y_1^2)\bigr) \\
\dot{y}_2 &= x_2 + 0.2\,y_2 - k\,\bigl(y_2 - (y_1^2 + z_1^2)\bigr) \\
\dot{z}_2 &= 0.2 + z_2\,(x_2 - 9.0) - k\,\bigl(z_2 - (x_1^2 + z_1^2)\bigr)
\end{aligned}
\qquad [19]
$$

As shown in Figure 7, it appears impossible to tell what the relation is between the transmitter subsystem $(x_1, y_1, z_1)$ in eqn [19] and the two Rössler response subsystems $(x_2, y_2, z_2)$ at k = 1 and k = 100. However, GS occurs for large values of the coupling-strength parameter k. Therefore, for such values we expect that orbits of [19] will lie in the vicinity of a certain synchronization manifold. Indeed, let us define the set

$$
S = \{(x_1, y_1, z_1, x_2, y_2, z_2) \in \mathbb{R}^6 : x_2 = x_1^2 + y_1^2,\; y_2 = y_1^2 + z_1^2,\; z_2 = x_1^2 + z_1^2\}
$$

Since the projections of S onto the coordinates $(x_1, y_1, x_2)$, $(y_1, z_1, y_2)$, and $(x_1, z_1, z_2)$ are paraboloids, we can see how the synchronization manifold is approached. This is illustrated in Figure 8, where the $(x_1, y_1, x_2)$ projections of typical trajectories are shown at four different coupling values. (See Josic (2000) for other examples and further developments; see also Pecora et al. (1997), where the authors summarize a method in order to get an idea on the functional relation occurring in case of GS, between two coupled systems.)

Phase Synchronization

For coupled non-identical chaotic systems, other types of synchronizations exist. Recently, a rather weak degree of synchronization, the PS, of chaotic systems has been described (Pikovsky et al. 2001). The Greek meaning of the word synchronization, mentioned in the introduction, is closely related to this type of process. The synchronous motion is actually not visible. Indeed, in PS the phases of the chaotic systems are locked, that is, there exists a certain relation between them, whereas the amplitudes vary chaotically and are practically uncorrelated. Thus, it is mostly close to synchronization of periodic oscillators.

Definition PS of two coupled chaotic oscillators occurs if, for arbitrary integers n and m, the phase locking condition between the corresponding phases, $|n\phi_1(t) - m\phi_2(t)| \le$ constant, holds and the amplitudes of both systems remain uncorrelated.

Let us note that such a phenomenon occurs when a zero Lyapunov exponent of the response system becomes negative, while, as explained above, identical chaotic systems synchronize by following the same chaotic trajectory, when their largest transverse Lyapunov exponent of the synchronized manifold decreases from positive to negative values. Moreover, following the definition above, this phenomenon is best observed when a well-defined phase variable can be identified in both coupled systems. This can be done for strange attractors that spiral around a "hole," or a particular (fixed) point in a two-dimensional projection of the attractor. The typical example is given by the Rössler system, which, for some range of parameters, exhibits a Möbius-
Figure 8 Generalized synchronization. $(x_1, y_1, x_2)$ projections of typical trajectories of system [19] after transients die out, with (a) k = 1, (b) k = 20, (c) k = 100, and (d) k = 200. For the last value, the attractor lies in the set S, three-dimensional projections of which are paraboloids.
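The approach to the set S summarized in Figure 8 can be explored numerically. The following is a minimal sketch, not the authors' code: it integrates system [19] with a fixed-step fourth-order Runge-Kutta scheme and measures the componentwise distance from S. The minus signs in `rhs` are an assumed reading of the printed equations, and the step size, initial condition, and helper names (`rhs`, `rk4_step`, `simulate`, `mismatch`) are illustrative choices.

```python
import numpy as np


def rhs(state, k):
    """Vector field of system [19]: driver (x1, y1, z1), response (x2, y2, z2).

    The minus signs are an assumed reading of the printed equations.
    """
    x1, y1, z1, x2, y2, z2 = state
    return np.array([
        -9.0 * x1 - 9.0 * y1,
        -17.0 * x1 - y1 - x1 * z1,
        -z1 + x1 * y1,
        -y2 - z2 - k * (x2 - (x1**2 + y1**2)),
        x2 + 0.2 * y2 - k * (y2 - (y1**2 + z1**2)),
        0.2 + z2 * (x2 - 9.0) - k * (z2 - (x1**2 + z1**2)),
    ])


def rk4_step(state, k, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(state, k)
    k2 = rhs(state + 0.5 * dt * k1, k)
    k3 = rhs(state + 0.5 * dt * k2, k)
    k4 = rhs(state + dt * k3, k)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate(k, t_end, dt=1e-3, state0=(1.0, 1.0, 1.0, 0.0, 0.0, 0.0)):
    """Trajectory sampled every dt; dt must stay well below 1/k for stability."""
    n = int(round(t_end / dt))
    traj = np.empty((n + 1, 6))
    traj[0] = state0
    for i in range(n):
        traj[i + 1] = rk4_step(traj[i], k, dt)
    return traj


def mismatch(traj):
    """Componentwise distance of each sample from the set S."""
    x1, y1, z1, x2, y2, z2 = traj.T
    return np.abs(np.stack([
        x2 - (x1**2 + y1**2),
        y2 - (y1**2 + z1**2),
        z2 - (x1**2 + z1**2),
    ], axis=1))
```

Comparing `mismatch(simulate(k, t_end))` over the second half of a run for several values of k gives a rough numerical counterpart of the four panels. On S itself the coupling terms vanish, so the response equations reduce to an uncoupled Rössler system whatever the value of k.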
strip-like chaotic attractor with a central hole. In such a case, a phase angle $\phi(t)$ can be defined that decreases or increases monotonically. For an illustration, we take the following two coupled Rössler oscillators:

$$
\begin{aligned}
\dot{x}_1 &= -\omega_1 y_1 - z_1 + k\,(x_2 - x_1) \\
\dot{y}_1 &= \omega_1 x_1 + 0.17\,y_1 \\
\dot{z}_1 &= 0.2 + z_1\,(x_1 - 9.0) \\
\dot{x}_2 &= -\omega_2 y_2 - z_2 + k\,(x_1 - x_2) \\
\dot{y}_2 &= \omega_2 x_2 + 0.17\,y_2 \\
\dot{z}_2 &= 0.2 + z_2\,(x_2 - 9.0)
\end{aligned}
\qquad [20]
$$

with a small parameter mismatch $\omega_{1,2} = 0.95 \pm 0.04$; k governs the strength of coupling.

If we can define a Poincaré section surface for the system, then, for each piece of a trajectory between two cross sections with this surface, we define the phase, as done in Pikovsky et al. (2001), as a piecewise linear function of time, so that the phase increment is $2\pi$ at each rotation:

$$
\phi(t) = 2\pi\,\frac{t - t_n}{t_{n+1} - t_n} + 2\pi n, \qquad t_n \le t \le t_{n+1}
$$

where $t_n$ is the time of the nth crossing of the secant surface.

In our example, the secant surface has been chosen as the negative x-axis and is represented by the wide segment in Figure 9a. This definition of phases is clearly ambiguous since it depends on the choice of the Poincaré section; nevertheless, defined in this way, the phase has a physically important property: it corresponds to the direction with the zero Lyapunov exponent in the phase space, so its perturbations neither grow nor decay in time. Figure 9c shows that there is a transition from the nonsynchronous phase regime, where the phase difference increases almost linearly with time (k = 0.01 and k = 0.05), to a synchronous state, where the relation $|\phi_1(t) - \phi_2(t)| <$ constant holds (k = 0.1), that is, the phase difference does not grow with time. However, the amplitudes are obviously uncorrelated as seen in Figure 9b. This example shows that PS can take place at a weaker degree of synchronization in chaotic systems. Readers can find more rigorous mathematical discussion on this subject, and on the definition of phases of chaotic oscillators, in Pikovsky et al. (2001); see also Boccaletti et al. (2002) and references therein.

Other Treatments and Types of Synchronization

Lag Synchronization

PS occurs when non-identical chaotic oscillators are weakly coupled: the phases are locked, while the amplitudes remain uncorrelated. When the coupling strength becomes larger, some relationships between amplitudes may be established. Indeed, it has been shown (Rosenblum et al. 1997), in symmetrically coupled non-identical oscillators and in time-delayed systems, that there exists
Figure 9 (a) Rössler chaotic attractor projection onto x–y plane. (b) Amplitudes A1 versus A2 for the phase synchronized case at k = 0.1. (c) Time series of the phase difference $\phi_1 - \phi_2$ for different coupling strengths k (curves for k = 0.01, k = 0.05, and k = 0.1); for k = 0.01 PS is not achieved, while for k = 0.1 PS takes place. Although the phases are locked, for k = 0.1, the amplitudes remain chaotic and uncorrelated.
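The phase construction behind Figure 9c can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: it integrates the coupled Rössler oscillators [20], detects crossings of the negative x-axis by linear interpolation, and evaluates the piecewise linear phase between crossings. The assignment of the mismatch (0.99 to the first oscillator, 0.91 to the second), the initial condition, the step size, and all function names are assumptions made for illustration.

```python
import numpy as np

A, B, C = 0.17, 0.2, 9.0              # coefficients read off eqn [20]
OMEGA = (0.95 + 0.04, 0.95 - 0.04)    # assumed assignment of the mismatch


def rhs(s, k):
    """Vector field of the two coupled Rössler oscillators [20]."""
    x1, y1, z1, x2, y2, z2 = s
    w1, w2 = OMEGA
    return np.array([
        -w1 * y1 - z1 + k * (x2 - x1), w1 * x1 + A * y1, B + z1 * (x1 - C),
        -w2 * y2 - z2 + k * (x1 - x2), w2 * x2 + A * y2, B + z2 * (x2 - C),
    ])


def simulate(k, t_end, dt=0.01, s0=(1.0, 1.0, 0.0, -1.0, 0.5, 0.0)):
    """Fixed-step RK4 integration; returns times and states."""
    n = int(round(t_end / dt))
    t = dt * np.arange(n + 1)
    s = np.empty((n + 1, 6))
    s[0] = s0
    for i in range(n):
        a = rhs(s[i], k)
        b = rhs(s[i] + 0.5 * dt * a, k)
        c = rhs(s[i] + 0.5 * dt * b, k)
        d = rhs(s[i] + dt * c, k)
        s[i + 1] = s[i] + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
    return t, s


def crossing_times(t, x, y):
    """Times at which y passes from positive to non-positive while x < 0."""
    idx = np.where((y[:-1] > 0) & (y[1:] <= 0) & (x[:-1] < 0))[0]
    frac = y[idx] / (y[idx] - y[idx + 1])      # linear interpolation weight
    return t[idx] + frac * (t[idx + 1] - t[idx])


def phase(t, crossings):
    """Piecewise linear phase: 2*pi*(t - t_n)/(t_{n+1} - t_n) + 2*pi*n."""
    t = np.asarray(t, dtype=float)
    n = np.searchsorted(crossings, t, side="right") - 1
    n = np.clip(n, 0, len(crossings) - 2)
    tn, tn1 = crossings[n], crossings[n + 1]
    return 2.0 * np.pi * (t - tn) / (tn1 - tn) + 2.0 * np.pi * n


def phase_difference(k, t_end=200.0, dt=0.01):
    """Phase difference on the time window where both phases are defined."""
    t, s = simulate(k, t_end, dt)
    c1 = crossing_times(t, s[:, 0], s[:, 1])
    c2 = crossing_times(t, s[:, 3], s[:, 4])
    lo, hi = max(c1[0], c2[0]), min(c1[-1], c2[-1])
    tt = t[(t >= lo) & (t <= hi)]
    return tt, phase(tt, c1) - phase(tt, c2)
```

Each phase is counted from the first crossing of its own oscillator, so the difference is defined only up to a constant offset. What matters for PS is whether it stays bounded or drifts as time increases, which can be inspected by calling `phase_difference` for several coupling strengths.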
a regime of LS. This process appears as a coincidence of time-shifted states of two systems:

$$
\lim_{t \to +\infty} \| Y(t) - X(t - \tau) \| = 0
$$

where $\tau$ is a positive delay.

Projective Synchronization

In coupled partially linear systems, it was reported by Mainieri and Rehacek (1999) that two identical systems could be synchronized up to a scaling factor. This type of chaotic synchronization is referred to as projective synchronization. Consider, for example, a three-dimensional chaotic system $\dot{X} = F(X)$, where X = (x, y, z). Decompose X into a vector v = (x, y) and a scalar z; the system can then be rewritten as

$$
\frac{dv}{dt} = g(v, z), \qquad \frac{dz}{dt} = h(v, z)
$$

In projective synchronization, two identical systems $X_1 = (x_1, y_1, z_1)$ (drive) and $X_2 = (x_2, y_2, z_2)$ (response) are coupled through the scalar variable z. It occurs if the state vectors $v_1$ and $v_2$ synchronize up to a constant ratio, that is, $\lim_{t \to +\infty} \| \alpha v_1(t) - v_2(t) \| = 0$, where $\alpha$ is called a scaling factor. For partially linear systems, it may automatically occur provided that the systems satisfy some stability conditions.

However, this process could not be classified as GS, even if there exists a linear relation between the coupled systems, because the response system of projective synchronization is not asymptotically stable. For more information about this subject, the reader is referred to Mainieri and Rehacek (1999).

Anticipating Synchronization

It is interesting to mention that a new form of synchronization has recently appeared, the so-called anticipating synchronization (Boccaletti et al. 2002). It shows that some coupled chaotic systems might synchronize such that their response anticipates the drivers by synchronizing with their future states.

It is also interesting to mention the nonlinear $H_\infty$ synchronization method for nonautonomous schemes introduced by Suykens et al. (1997).

Spatio-Temporal Synchronization

Low-dimensional systems have rather limited usefulness in modeling real-world applications. This is why the synchronization of chaos has been carried
out in high dimensions (see Kocarev et al. (1997) for a review). See also Chen and Dong (2001) for a discussion of special high-dimensional systems, namely large arrays of coupled chaotic systems.

Application to Transmission Systems and Secure Communication

Synchronization principles are useful in practical applications. Use of chaotic signals to transmit information has been a very active research topic in the last decade. Thus, it has been established that chaotic circuits may be used to transmit information by synchronization. As a result, several proposals for secure-communication schemes have been advanced (see, e.g., Cuomo et al. (1993), Hasler (1998), and Parlitz et al. (1999)). The first laboratory demonstration of a secure-communication system, which uses a chaotic signal for masking purposes, and which exploits the chaotic synchronization techniques to recover the signal, was reported by Kocarev et al. (1992).

It is difficult, within the scope of this article, to give a complete or detailed discussion, and it should be noted that there exist many competing and tested methods that are well established.

The main idea of the communication schemes is to encode a message by means of a chaotic dynamical system (the transmitter), and to decode it using a second dynamical system (the receiver) that synchronizes with the first. In general, secure-communication applications assume additionally that the coupled systems used are identical.

Different methods can be used to hide the useful information, for example, chaotic masking, chaotic switching, or direct chaotic modulation (Hasler 1998). For instance, in the chaotic masking method, an analog information-carrying signal s(t) is added to the output y(t) of the chaotic system in the transmitter. The receiver tries to synchronize with component y(t) of the transmitted signal s(t) + y(t). If synchronization takes place, the information signal can be retrieved by subtraction (Figure 10).

It is interesting to note that, in all proposed schemes for secure communications using the idea of synchronization (experimental realization or computer simulation), there is an inevitable noise degrading the fidelity of the original message.

Figure 10 A typical communication setup. (Block diagram: the information signal s(t) enters the chaotic transmitter with output y(t); the chaotic transmitted signal is passed to the receiver, which delivers the retrieved information signal $\hat{s}(t)$.)

Robustness to parameter mismatch was addressed by many authors (Illing et al. 2002). Lozi et al. (1993) showed that, by connecting two identical receivers in cascade, a significant amount of the noise can be reduced, thereby allowing the recovery of a much higher quality signal.

Furthermore, different implementations of chaotic secure communication have been proposed during the last decades, as well as methods for cracking this encoding. The methods used to crack such a chaotic encoding make use of the low dimensionality of the chaotic attractors. Indeed, since the properties of low-dimensional chaotic systems with one positive Lyapunov exponent can be reconstructed by analyzing the signal, such as through the delay-time reconstruction methods, it seems unlikely that these systems might provide a secure encryption method. The hidden message can often be retrieved easily by an eavesdropper without using the receiver. However, chaotic masking and encoding are difficult to break with state-of-the-art analysis tools if sufficiently high-dimensional chaos generators with multiple positive Lyapunov exponents (i.e., hyperchaotic systems) are used (see Pecora et al. (1997), and references therein).

Conclusion

In spite of the essential progress in theoretical and experimental studies, synchronization of chaotic systems continues to be a topic of active investigations and will certainly continue to have a broad impact in the future. Theory of synchronization remains a challenging problem of nonlinear science.

See also: Bifurcations of Periodic Orbits; Chaos and Attractors; Fractal Dimensions in Dynamics; Generic Properties of Dynamical Systems; Isochronous Systems; Lyapunov Exponents and Strange Attractors; Singularity and Bifurcation Theory; Stability Theory and KAM; Weakly Coupled Oscillators.

Further Reading

Afraimovich V, Chazottes JR, and Cordonet A (2001) Synchronization in directionally coupled systems: some rigorous results. Discrete and Continuous Dynamical Systems B 1(4): 421–442.
Afraimovich V, Verichev N, and Rabinovich MI (1986) Stochastic synchronization of oscillations in dissipative systems. Radiophysics and Quantum Electron 29: 795–803.
Boccaletti S, Kurths J, Osipov G, Valladares D, and Zhou C (2002) The synchronization of chaotic systems. Physics Reports 366: 1–101.
Chen G and Dong X (1998) From Chaos to Order. Singapore: World Scientific.
Cuomo K, Oppenheim A, and Strogatz S (1993) Synchronization of Lorenz-based chaotic circuits with applications to communications. IEEE Transactions on Circuits and Systems – II: Analog and Digital Signal Processing 40(10): 626–633.
Derivière S and Aziz-Alaoui MA (2003) Estimation of attractors and synchronization of generalized Lorenz systems. Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms 10(6): 833–852.
Hasler M (1998) Synchronization of chaotic systems and transmission of information. International Journal of Bifurcation and Chaos 8(4): 647–659.
Huygens Ch (Hugenii) (1673) Horologium Oscillatorium (English translation: 1986 The Pendulum Clock. Ames: Iowa State University Press). Parisiis, France: Apud F. Muguet.
Illing L (2002) Chaos Synchronization and Communications in Semiconductor Lasers. Ph.D. dissertation. San Diego: University of California.
Illing L, Brocker J, Kocarev L, Parlitz U, and Abarbanel H (2002) When are synchronization errors small? Physical Review E 66: 036229.
Josic K (2000) Synchronization of chaotic systems and invariant manifolds. Nonlinearity 13: 1321–1336.
Kocarev Lj, Halle K, Eckert K, and Chua LO (1992) Experimental demonstration of secure communication via chaotic synchronization. International Journal of Bifurcation and Chaos 2(3): 709–713.
Kocarev Lj, Tasev Z, Stojanovski T, and Parlitz U (1997) Synchronizing spatiotemporal chaos. Chaos 7(4): 635–643.
Lozi R and Chua LO (1993) Secure communications via chaotic synchronization II: noise reduction by cascading two identical receivers. International Journal of Bifurcation and Chaos 3(5): 1319–1325.
Mainieri R and Rehacek J (1999) Projective synchronization in three-dimensional chaotic systems. Physical Review Letters 82: 3042–3045.
Ott E, Grebogi C, and Yorke JA (1990) Controlling chaos. Physical Review Letters 64: 1196–1199.
Parlitz U and Kocarev L (1999) Synchronization of chaotic systems. In: Schuster HG (ed.) Handbook of Chaos Control, pp. 271–303. Germany: Wiley-VCH.
Pecora L and Carroll T (1990) Synchronization in chaotic systems. Physical Review Letters 64: 821–824.
Pecora L and Carroll T (1991) Driving systems with chaotic signals. Physical Review A 44: 2374–2383.
Pecora L, Carroll T, Johnson G, and Mar D (1997) Fundamentals of synchronization in chaotic systems, concepts and applications. Chaos 7(4): 520–543.
Pikovsky A, Rosenblum M, and Kurths J (2001) Synchronization, A Universal Concept in Nonlinear Science. Cambridge: Cambridge University Press.
Rosenblum M, Pikovsky A, and Kurths J (1997) From phase to lag synchronization in coupled chaotic oscillators. Physical Review Letters 78: 4193–4196.
Rulkov N, Sushchik M, Tsimring L, and Abarbanel H (1995) Generalized synchronization of chaos in directionally coupled chaotic systems. Physical Review E 51(2): 980–994.
Suykens J, Vandewalle J, and Chua LO (1997) Nonlinear H∞ synchronization of chaotic Lur'e systems. International Journal of Bifurcation and Chaos 7(6): 1323–1335.
Wu CW (2002) Synchronization in Coupled Chaotic Circuits and Systems. Series on Nonlinear Science, Series A, vol. 41. Singapore: World Scientific.
Yamada T and Fujisaka H (1983) Stability theory of synchronized motion in coupled-oscillator systems. Progress of Theoretical Physics 70: 1240–1248.
T
't Hooft–Polyakov Monopoles see Solitons and Other Extended Field Configurations
effects that occur as a result of a curved spacetime is provided in Quantum Field Theory in Curved Spacetime. Another subject, which is missing almost completely, is perturbation theory. This subject has been covered extensively in three well-known textbooks by Kapusta, Le Bellac, and Umezawa.

Observables and States

Following Heisenberg, we start from the basic assumption that quantum theory can be formulated in terms of observables which form an algebra $\mathcal{A}$, that is, a vector space with a (noncommutative) multiplication law. Although our emphasis on the abstract algebraic structure may look strange, there is a profound reason for starting out with an abstract algebra of observables: as soon as one considers systems with infinitely many degrees of freedom, one encounters a possibility to realize the abstract elements of the algebra $\mathcal{A}$ as operators on a Hilbert space in various inequivalent ways. The famous equivalence between the Heisenberg and the Schrödinger picture simply breaks down. States which are macroscopically different (e.g., thermal equilibrium states for different temperatures) give rise – in a natural way, which will be discussed in the sequel – to unitarily inequivalent representations of the abstract algebra of observables $\mathcal{A}$, while states which only differ microscopically can be accommodated by density matrices within the same Hilbert space. In other words, a physical state is described macroscopically by specifying a representation, and microscopically by a density matrix in this representation.

In a Lagrangian approach, the algebra of observables $\mathcal{A}$ may be thought of as being generated by the underlying fields, currents, etc. This leads to the so-called polynomial algebras. It is mathematically convenient to assume that $\mathcal{A}$ is an algebra of bounded operators, generated by the bounded functions of the underlying quantum fields. If $\phi(x)$ is any such field and if $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is any real test function with support in a bounded region of spacetime, then the corresponding operator

$$
W(f) = \exp\Bigl( i \int dx\, f(x)\, \phi(x) \Bigr)
$$

would be a typical element of $\mathcal{A}$. The set of operators $\{ W(f) \mid \operatorname{supp} f \subset \mathcal{O} \}$ will generate a subalgebra $\mathcal{A}(\mathcal{O})$ of $\mathcal{A}$. The underlying fields can be recovered by taking (functional) derivatives, once a representation of $\mathcal{A}$ on a Hilbert space is specified.

The spacetime symmetry of Minkowski space manifests itself in the existence of a representation

$$
\alpha : (\Lambda, x) \mapsto \alpha_{\Lambda, x} \in \operatorname{Aut}(\mathcal{A}), \qquad (\Lambda, x) \in \mathcal{P}_+^{\uparrow}
$$

of the (orthochronous) Poincaré group $\mathcal{P}_+^{\uparrow}$. Here $\alpha_{\Lambda, x}$ is an automorphism of $\mathcal{A}$, that is, a mapping from $\mathcal{A}$ to $\mathcal{A}$ which preserves the algebraic structure. Once a Lorentz frame is fixed by choosing a timelike vector $e \in V_+$, the time evolution $t \mapsto \alpha_{1, te}$ will be denoted by $t \mapsto \tau_t$.

For the free field, the group of automorphisms $(\Lambda, x) \mapsto \alpha_{\Lambda, x}$ is defined by

$$
\alpha_{\Lambda, x}(W(f)) := W\bigl( f(\Lambda^{-1}(\,\cdot - x)) \bigr)
$$

As before, $f \in \mathcal{S}(\mathbb{R}^{d+1})$ is a Schwartz function over the Minkowski space $\mathbb{R}^{d+1}$.

While the invariance of the equations of motion is reflected in the existence of a representation of the Poincaré group in terms of automorphisms in the Heisenberg picture, at least the invariance with respect to Lorentz boosts is spontaneously broken in the Schrödinger picture for a thermal equilibrium state.

The usual notions of vector states and density matrices associated with a given Hilbert space (usually Fock space) are a priori not general enough to cover all cases of interest in thermal field theory. The following algebraic definition of a state substantially generalizes the notion of a state: A state $\omega$ is a positive, linear, and normalized functional, that is, a linear map $\omega : \mathcal{A} \to \mathbb{C}$ such that

$$
\omega(a^* a) \ge 0 \quad \text{and} \quad \omega(1) = 1
$$

Once a state $\omega$ is distinguished on physical grounds, the GNS reconstruction theorem provides a Hilbert space $\mathcal{H}_\omega$ and a representation $\pi_\omega$ of $\mathcal{A}$, that is, a map from $\mathcal{A}$ to the set of bounded operators $\mathcal{B}(\mathcal{H}_\omega)$, which preserves the algebraic relations.

It is instructive to consider the GNS representation of the Pauli matrices $\{\sigma_0 = 1, \sigma_1, \sigma_2, \sigma_3\}$. Given a state $\rho$ (a diagonal $2 \times 2$ matrix with positive entries and $\operatorname{tr} \rho = 1$), the left regular representation (a construction well known from group theory)

$$
\pi(\sigma_i)\, |\sqrt{\rho}\,\rangle = |\sigma_i \sqrt{\rho}\,\rangle, \qquad i = 0, 1, 2, 3
$$

defines a reducible representation on $\mathbb{C}^4$, unless one of the entries in the diagonal of $\rho$ is zero (which corresponds to a pure state). In the latter case, the GNS Hilbert space is $\mathbb{C}^2$. By construction, $\langle \sqrt{\rho}\, | \pi(\sigma_i) | \sqrt{\rho}\, \rangle = \operatorname{tr} \rho\, \sigma_i$, $i = 1, 2, 3$.

Thermal Equilibrium

The variety of nonequilibrium states ranges from mild perturbations of equilibrium states through steady states, whose properties are governed by external heat baths, or hydrodynamic flows up to totally chaotic states which no longer
Thermal Quantum Field Theory 229
admit a description in terms of thermodynamic notions. Buchholz et al. (2002) have initiated an investigation of nonequilibrium states that are locally (but not globally) close to thermal equilibrium. Unfortunately, we will not be able to cover this topic. Instead, we will concentrate on states which deviate from a true equilibrium state only microscopically.

Characterization of Thermal Equilibrium States

When the time evolution $t \mapsto \tau_t \in \operatorname{Aut}(\mathcal{A})$ is changed by a local perturbation, which is slowly switched on and slowly switched off again, then an equilibrium state $\omega$ returns to its original form at the end of this procedure. This heuristic condition of adiabatic invariance can be expressed by the stability requirement

$$
\lim_{t \to \infty} \int_{-t}^{t} dt'\, \omega\bigl( [a, \tau_{t'}(b)] \bigr) = 0 \qquad \forall a, b \in \mathcal{A} \qquad [1]
$$

In a pioneering work Haag, Kastler, and Trych-Pohlmeyer showed that the characterization [1] of an equilibrium state leads to a sharp mathematical criterion, first encountered by Haag, Hugenholtz, and Winnink and more implicitly by Kubo, Martin, and Schwinger:

Definition 1 A state $\omega_\beta$ over $\mathcal{A}$ is called a KMS state for some $\beta > 0$, if for all $a, b \in \mathcal{A}$, there exists a function $F_{a,b}$ which is continuous in the strip $0 \le \Im z \le \beta$ and analytic and bounded in the open strip $0 < \Im z < \beta$, with boundary values given by

$$
F_{a,b}(t) = \omega_\beta\bigl( a\, \tau_t(b) \bigr) \quad \text{and} \quad F_{a,b}(t + i\beta) = \omega_\beta\bigl( \tau_t(b)\, a \bigr) \qquad \forall t \in \mathbb{R} \qquad [2]
$$

Before we start analyzing the properties of KMS states, we should mention an alternative characterization of thermal equilibrium states: passivity. The amount of work a cycle can perform when applied to a moving thermodynamic equilibrium state is bounded by the amount of work an ideal windmill or turbine could perform; this property is called semipassivity (Kuckert 2002): a state $\omega$ is called semipassive (passive) if there is an "efficiency bound" $E \ge 0$ ($E = 0$) such that

$$
-\bigl( W \Omega_\omega, H_\omega W \Omega_\omega \bigr) \le E \cdot \bigl( W \Omega_\omega, |P_\omega|\, W \Omega_\omega \bigr) \qquad \forall W \in \pi_\omega(\mathcal{A})''
$$

with $W^{-1} = W^*$, $[H_\omega, W] \in \pi_\omega(\mathcal{A})''$, and $[P_\omega, W] \in \pi_\omega(\mathcal{A})''$. Here $(H_\omega, P_\omega)$ denote the generators implementing the spacetime translations in the GNS representation $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$. Generalizing the notion of complete passivity, the state $\omega$ is called completely semipassive if all its finite tensorial powers are semipassive with respect to one fixed efficiency bound E. It has been shown by Kuckert (2002) that a state is completely semipassive in all inertial frames if and only if it is completely passive in some inertial frame. The latter implies that $\omega$ is a KMS state or a ground state (a result due to Pusz and Woronowicz).

Let us now turn to properties of thermal equilibrium states which are specific for relativistic models. It was first recognized by Bros and Buchholz (1994) that KMS states of a relativistic theory have stronger analyticity properties in configuration space than those imposed by the traditional KMS condition:

Definition 2 A KMS state $\omega_\beta$ satisfies the relativistic KMS condition (Bros and Buchholz 1994) if there exists a unit vector e in the forward light cone $V_+$ such that for every pair of local elements a, b of $\mathcal{A}$ the function $F_{a,b}$

$$
F_{a,b}(x_1, x_2) = \omega_\beta\bigl( \alpha_{x_1}(a)\, \alpha_{x_2}(b) \bigr)
$$

extends to an analytic function in the tube domain $-\mathcal{T}_{\beta e/2} \times \mathcal{T}_{\beta e/2}$, where $\mathcal{T}_{\beta e/2} = \{ z \in \mathbb{C}^{d+1} \mid \Im z \in V_+ \cap (\beta e/2 - V_+) \}$.

The relativistic KMS condition can be understood as a remnant of the relativistic spectrum condition in the vacuum sector. It has been rigorously established (Bros and Buchholz 1994) for the KMS states constructed by Buchholz and Junglas (1989) and by C Gérard and the author for the $P(\phi)_2$ model. In the thermal Wightman framework (Bros and Buchholz 1996) it has been shown that the relativistic KMS condition implies existence of model-independent analyticity properties of thermal n-point functions. These properties also appear in perturbative computations of the thermal Wightman functions (Steinmann 1995).

We now turn to the properties of the set of KMS states. For given $\beta$, the convex set $S_\beta$ of all KMS states is known to form a simplex; the extreme points in the set $S_\beta$ are called extremal KMS states. As a consequence, the extremal states in $S_\beta$ can be distinguished with the help of "classical" (central) observables, that is, by observables which commute with all other observables.

If $\omega_\beta$ is an extremal KMS state and $\gamma$ is an automorphism which commutes with the time evolution $t \mapsto \tau_t$, then the state $\omega'_\beta$ defined by

$$
\omega'_\beta(a) := \omega_\beta(\gamma(a)), \qquad a \in \mathcal{A}
$$

is again an extremal KMS state to the same parameter values. If $\omega'_\beta \ne \omega_\beta$, one says that the symmetry $\gamma$ is spontaneously broken.
Lorentz invariance with respect to boosts is in the vacuum representation vac. . Next it is shown
always broken by a KMS state, since the KMS that the function
condition distinguishes a rest frame. A KMS state
might also break spatial translation or rotation t 7! !; ðat ðbÞÞ
invariance. However, by averaging over the different 1
¼ tr EðÞeH EðÞvac: ðat ðbÞÞ
configurations one can usually construct a transla- Z
tion- and rotation-invariant state. The situation is allows an analytic extension to a strip of width
drastically different with respect to supersymmetry. which satisfies the KMS boundary condition [2] for
Buchholz and Ojima (1997) have shown that super- jtj < if a, b 2 A(O
) and O
þ te O for jtj < . In
symmetry is broken in any thermal state and it is the final step, Buchholz and Junglas were able to
impossible to proceed from it by ‘‘symmetrization’’ demonstrate that bounds on the nuclear norm are
to states on which an action of supercharges can be even sufficient to control the thermodynamic limit.
defined. Given a thermal field theory, a slight variation of
the method used by Buchholz and Junglas allows
Existence of Thermal Equilibrium States one to construct a KMS state for a new temperature
(Jäkel 2004), that is, to change the temperature of a
Buchholz and Junglas (1989) demonstrated that the thermal state.
existence of KMS states can be guaranteed for a
large class of quantum field-theoretic models. The
basic assumption to be met concerns the phase-space Thermal Representations
properties of the model. A generalized trace norm Given a KMS state ! , the GNS construction gives
(the so-called ‘‘nuclear norm’’) is used to estimate rise to a Hilbert space H and a representation ,
the ‘‘number’’ of degrees of freedom in phase space. called a thermal representation, of A. The algebra
The first step is to construct a subspace H()
of the vacuum Hilbert space Hvac. , which represents construction) and separating (due to the KMS
excitations of the vacuum strictly localized inside of condition) vector such that
a bounded spacetime region Ô. Due to the strong
correlations present in the vacuum state of any ! ðaÞ ¼ ; ðaÞ 8a 2 A
relativistic model, as a consequence of the Reeh–
The KMS condition implies that ! is invariant
Schlieder property (see the section ‘‘Analyticity of n-
under time translations, that is, !
t = ! for all
point functions’’) this is a delicate procedure, which
t 2 R. Thus,
involves the so-called ‘‘split property.’’ This property
ensures that there exists a product vector
in UðtÞ ðaÞ ¼ ðt ðaÞÞ ; a2A
vacuum Hilbert space Hvac. such that
defines a strongly continuous unitary group
ð
; vac: ðabÞ
Þ ¼ !vac: ðaÞ !vac: ðbÞ {U(t)}t2R implementing the time evolution in the
^c representation . By Stone’s theorem there exists a
8a 2 AðOÞ; b 2 AðOÞ ½3
self-adjoint generator L such that
Here O Ô denotes a slightly smaller open space-
time region (such that the closure O is inside the UðtÞ ¼ eiLt ; t2R ½4
interior of Ô) and A(Ô)c := {A 2 A j [A, B] = 0 8B 2 For 0 < 1, the Liouville operator L is not
A(Ô)}. The existence of a product vector can be bounded from below; its spectrum is symmetric and
ensured if the nuclear norm satisfies some mild consists typically of the whole real line. However,
bounds which are expected to hold in all models of the negative part of L is ‘‘suppressed’’ with respect
physical interest. Given a product vector
which to the algebra of observables R := (A)00 in the
satisfies [3], the sought after subspace is following sense (Haag 1992): let 1]1,] be the
spectral projection of L for the interval ] 1, ]
HðÞ :¼ vac: ðAðOÞÞ00 vac: Sp(L), then
The crucial step in the proof of existence of KMS
k11; A k e kAk 8A 2 R
states is to show that
We now turn to structural aspects which are
tr EðÞeH EðÞ < 1 for > 0 characteristic for a relativistic model, namely the
if the nuclearity condition holds. Here E() denotes existence of strong spatial correlations and the
the projection onto the subspace H() representing connection between the decay of these correlations
localized excitations and H denotes the Hamiltonian and the spectral properties of the Liouville operator.
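The structure of a thermal representation described above can be made concrete for a finite quantum system. In the sketch below, which is a toy illustration and not part of the original construction, the algebra is the full matrix algebra, the GNS Hilbert space is the space of matrices with the Hilbert–Schmidt inner product, the cyclic and separating vector is the square root of the Gibbs density matrix, and the Liouvillean acts as the commutator with the Hamiltonian. The helper names and the column-stacking convention for `vec` are assumptions of this example.

```python
import numpy as np


def gibbs_sqrt(h, beta):
    """Square root of the Gibbs density matrix exp(-beta H) / Z."""
    energies, vecs = np.linalg.eigh(h)
    weights = np.exp(-beta * energies)
    return vecs @ np.diag(np.sqrt(weights / weights.sum())) @ vecs.conj().T


def vec(x):
    """Stack the columns of a matrix into a vector."""
    return x.flatten(order="F")


def liouvillean(h):
    """Matrix of X -> H X - X H acting on vec(X)."""
    eye = np.eye(h.shape[0])
    return np.kron(eye, h) - np.kron(h.T, eye)


def thermal_expectation(a, h, beta):
    """Vector-state expectation <Omega, a Omega> in the Hilbert-Schmidt space."""
    omega = gibbs_sqrt(h, beta)
    return np.vdot(vec(omega), vec(a @ omega))
```

In this model the eigenvalues of the Liouvillean are the energy differences $E_i - E_j$, so its spectrum is symmetric about zero and not bounded below by a positive gap, and the vector built from `gibbs_sqrt` is annihilated by it. The vector-state expectation reproduces the Gibbs expectation $\operatorname{tr}(\rho\, a)$.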
Let ! be a state, which satisfies the relativistic conjugation) and a self-adjoint operator 1=2 . The
KMS condition. It follows (using a theorem of connection to physics was established independently
Glaser) that for a 2 A the function a : R4 ! H , by Takesaki and Winnink, showing that the pair
(R, ) satisfies the KMS condition for = 1, if one
x 7! ðx ðaÞÞ
sets t (A) = it Ait for A 2 R.
can be analytically continued from the real axis into Taking advantage of the Reeh–Schlieder property
the domain $T_{\beta e/2}$ such that it is weakly continuous for $\Im z \searrow 0$. If the usual additivity assumption $\bigcup_i \mathcal{O}_i = \mathcal{O} \Rightarrow \bigvee_i \mathcal{R}_\beta(\mathcal{O}_i) = \mathcal{R}_\beta(\mathcal{O})$ for the local von Neumann algebras holds, then

$$ \mathcal{H}_\beta = \overline{\pi_\beta(\mathcal{A}(\mathcal{O}))\,\Omega_\beta} \qquad [5] $$

for any open spacetime region $\mathcal{O} \subset \mathbb{R}^{d+1}$. Junglas has shown that the thermal Reeh–Schlieder property [5] follows as well from the standard KMS condition, if $\omega_\beta$ is locally normal with respect to the vacuum representation.

The decay of spatial correlations depends on infrared properties of the model, and the essential ingredients for the following cluster theorem are the continuity properties of the spectrum of $L_\beta$ near zero.

Theorem 3 Let $\Omega_\beta$ denote the unique (up to a phase) normalized eigenvector with eigenvalue {0} of the Liouvillean $L_\beta$ and let $P^+$ denote the projection onto the strictly positive part of the spectrum of $L_\beta$. Assume that there exist positive constants $m > 0$ and $C_1(\mathcal{O}) > 0$ such that

$$ \| e^{-\lambda L_\beta} P^+ \pi_\beta(a)\,\Omega_\beta \| \le C_1(\mathcal{O})\,\lambda^{-m}\,\|a\| \qquad \forall a \in \mathcal{A}(\mathcal{O}) $$

Here $\mathcal{O} \subset \mathbb{R}^{d+1}$ is an open and bounded spacetime region. Now consider two spacelike separated spacetime regions $\mathcal{O}_1$, $\mathcal{O}_2$, which can be embedded into $\mathcal{O}$ by translation and such that $\mathcal{O}_1 + \delta e \subset \mathcal{O}_2'$, $\delta \gg \beta$. Then, for $a \in \mathcal{A}(\mathcal{O}_1)$ and $b \in \mathcal{A}(\mathcal{O}_2)$,

$$ |\omega_\beta(ba) - \omega_\beta(b)\,\omega_\beta(a)| \le C_2\,\delta^{-2m}\,\|a\|\,\|b\| $$

The constant $C_2(\beta, \mathcal{O}) \in \mathbb{R}^+$ may depend on the temperature $\beta^{-1}$ and the size of the region $\mathcal{O}$ but is independent of $\delta$, $a$, and $b$.

From explicit calculations one expects that $m = 1/2$ for free massless bosons in 3 + 1 spacetime dimensions. Consequently, the exponent given on the right-hand side is optimal since it is well known that in this case the correlations decay only like $\delta^{-1}$.

A description of thermal representations would be inadequate without pointing out one of the deepest connections between pure mathematics and physics that emerged in the last century: consider a von Neumann algebra $\mathcal{R}$ which possesses a cyclic and separating vector $\Omega$. Then polar decomposition of the closeable operator $S : A\Omega \mapsto A^*\Omega$, $A \in \mathcal{R}$, provides an antiunitary operator $J$ (the modular

[5], one can associate modular objects to certain spacetime regions $\mathcal{O}$. In general, a physical interpretation of these modular objects is missing. But for two-dimensional thermal models, which factorize in light-cone coordinates, the modular group corresponding to the algebra of a spacelike wedge admits a simple description: at large distances (compared to $\beta$) from the boundary, the flow pattern is essentially the same as time translations. These are results due to Borchers and Yngvason (1999).

Analyticity Properties of n-Point Functions

The correlation functions describe the full physical content of the theory: all observable quantities can in principle be derived from them. This is so because according to the Wightman reconstruction theorem (which is closely related to the GNS construction) knowledge of the correlation functions allows the reconstruction of the full representation of the field algebra. The Wightman distributions $\{W_\beta^{(n)}\}_{n\in\mathbb{N}}$,

$$ W_\beta^{(n)}(t_2 - t_1, x_2 - x_1; \ldots; t_n - t_{n-1}, x_n - x_{n-1}) := \big(\Omega_\beta,\ \phi_\beta(t_1, x_1) \cdots \phi_\beta(t_n, x_n)\,\Omega_\beta\big) \qquad [6] $$

where $\pi_\beta(W(f)) =: \exp\big(i \int dt\,dx\, f(t,x)\,\phi_\beta(t,x)\big)$, satisfy a number of key properties: locality, positivity, Poincaré covariance, and temperedness. These properties have been formulated for thermal fields by Bros and Buchholz (1996), and this section is entirely based on their work.

The relativistic KMS condition implies that the Wightman distributions $\{W_\beta^{(n)}\}_{n\in\mathbb{N}}$ of a translation-invariant equilibrium state admit in the corresponding set of spacetime variables $(t_2 - t_1, x_2 - x_1), \ldots, (t_n - t_{n-1}, x_n - x_{n-1})$ an analytic continuation into the union of domains

$$ (\lambda_1 T_{\beta e}) \times \cdots \times (\lambda_{n-1} T_{\beta e}) $$

for $\lambda_i > 0$, $i = 1, \ldots, n-1$ and $\sum_{i=1}^{n-1} \lambda_i = 1$. The tube domains $T_{\beta e}$ were specified in Definition 2. For $\beta \to \infty$, the tube $T_{\beta e}$ tends to the vacuum tube $T_{\mathrm{vac.}} = \mathbb{R}^{d+1} + iV_+$; thus, one recovers the spectrum condition for the vacuum expectation values.

Let us now turn to the Fourier transformed Wightman correlation functions. Translation invariance implies

$$ \tilde W_\beta^{(n)}(\nu_1, p_1; \ldots; \nu_n, p_n)\,\delta(\nu_1 + \cdots + \nu_n)\,\delta(p_1 + \cdots + p_n) $$
The Wightman distribution $\tilde W_\beta^{(n)}$ satisfies on the linear manifold $(\nu_1, p_1) + \cdots + (\nu_n, p_n) = 0$ the KMS relation in the energy variables: for any pair of multi-indices $(I, J)$ the identity

$$ \tilde W_\beta^{(n)}(J, I) = e^{-\beta \nu_I}\, \tilde W_\beta^{(n)}(I, J) $$

holds, where $\tilde W_\beta^{(n)}(J, I)$ is an abbreviation for $\tilde W_\beta^{(n)}(\{p_i\}_{i\in I}, \{p_j\}_{j\in J})$ and $\nu_I = \sum_{i\in I} \nu_i$.

We now specialize to the two-point function $W_\beta^{(2)}$. The corresponding commutation function $C(x)$ is given by

$$ C(x_1 - x_2) = W_\beta^{(2)}(x_1, x_2) - W_\beta^{(2)}(x_2, x_1) $$

Locality implies that $\mathrm{supp}\, C \subset \overline{V}_+ \cup \overline{V}_-$. The retarded and the advanced propagator $r$ and $a$, formally given by

$$ r(x) = i\theta(x^0)\,C(x), \qquad a(x) = -i\theta(-x^0)\,C(x) $$

satisfy the relation

$$ r - a = iC $$

which corresponds to a partition of the support of $C$ in its convex components: $\mathrm{supp}\, r \subset \overline{V}_+$ and $\mathrm{supp}\, a \subset \overline{V}_-$. For the free scalar field of mass $m$ the commutator function is

$$ C^{(m)}(x) = \frac{1}{(2\pi)^2} \int_{\mathbb{R}^4} dp\; e^{-ipx}\, \tilde C^{(m)}(p) $$

with

$$ \tilde C^{(m)}(p) = \frac{1}{2\pi}\, \mathrm{sgn}(\nu)\, \delta(\nu^2 - p^2 - m^2) $$

and subsequently the retarded and advanced propagators $r^{(m)}$ and $a^{(m)}$ are structural functions of the field algebra, which are determined by the c-number commutation relations of the fields. Thus, they are independent of the temperature, in contrast to the two-point function:

$$ \tilde W_\beta^{(2)}(p) = \frac{\tilde C^{(m)}(p)}{1 - e^{-\beta\nu}} \qquad [7] $$

Let now $\tilde\tau_\beta(p)$ be the Fourier transform of the time-ordered function $\tau_\beta(x)$. The relation

$$ \tilde\tau_\beta(p) = \frac{i\tilde r(p) + i\tilde a(p)\,e^{-\beta\nu}}{1 - e^{-\beta\nu}} $$

shows that $\tilde\tau_\beta(p)$ and $i\tilde r(p)$ only "coincide up to an exponential tail" at very high energies (Bros and Buchholz 1996).

Particle Aspects

The condition of locality (together with the relativistic KMS condition) leads to strong constraints on the general form of the thermal two-point functions that allow one to apply the techniques of the Jost–Lehmann–Dyson representation. As has been shown by Bros and Buchholz (1996), the interacting two-point function $W_\beta$ can be represented in the form

$$ W_\beta^{(2)}(t, x) = \int_0^\infty dm\; D_\beta(x, m)\, W_\beta^{(2)}(t, x; m) $$

Here $D_\beta(x, m)$ is a distribution in $x$, $m$ which is symmetric in $x$, and

$$ W_\beta^{(2)}(t, x; m) = (2\pi)^{-1} \int d\nu\, d^dp\; e^{-i(\nu t - px)}\, \tilde W_\beta^{(2)}(\nu, p) $$

is the two-point correlation function of the free thermal field of mass $m$. In contrast to the vacuum case, the damping factors $D_\beta(x, m)$ depend in a nontrivial way on the spatial variables $x$. The damping factors describe the dissipative effects of the thermal system on the propagation of sharply localized excitations. Bros and Buchholz suggested that the damping factor $D_\beta(x, m)$ can be decomposed into a discrete and an absolutely continuous part

$$ D_\beta(x, m) = \delta(m - m_0)\, D_{\beta,d}(x) + D_{\beta,c}(x, m) $$

and that the $\delta$-contribution in the damping factors is due to stable constituent particles of mass $m_0$ out of which the thermal states are formed, whereas the collective quasiparticle-like excitations only contribute to the continuous part of the damping factors (Bros and Buchholz 1996).

In the case of spontaneously broken internal symmetries Bros and Buchholz (1998) have shown that the damping factors $D_\beta^{\pm}(x, m)$ which appear in the representation of current-field correlation functions

$$ \big(\Omega_\beta,\ j_0(t, x)\,\phi(0, 0)\,\Omega_\beta\big) = \int_0^\infty dm\, \Big( D_\beta^{+}(x, m)\,\partial_t W_\beta^{(2)}(t, x; m) + D_\beta^{-}(x, m)\, W_\beta^{(2)}(t, x; m) \Big) $$

indeed contain a discrete (in the sense of measures) zero-mass contribution and are slowly decreasing in $|x|$ for small values of $m$. Thus, these damping factors coincide locally with the Källén–Lehmann weights appearing in the case of spontaneous symmetry breaking in the vacuum sector (Bros and Buchholz 1998). It is easily seen in examples that there is no sharp energy–momentum dispersion law for the Goldstone particles. Thus, the Källén–Lehmann representation is better suited than Fourier transformation to uncover the particle aspects of thermal equilibrium states.
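The thermal weight in [7] can be checked numerically in a toy setting. The following Python sketch is an illustration under stated assumptions, not part of the cited work: the function names are hypothetical, the commutator weight is replaced by a smooth function that is odd in the energy $\nu$ (a smeared stand-in for $\mathrm{sgn}(\nu)\,\delta(\nu^2 - p^2 - m^2)$ at fixed spatial momentum), and the Fourier convention is assumed to be such that the commutator weight equals the difference of the two-point weights at $\nu$ and $-\nu$.

```python
import math

def commutator_weight(nu, energy=1.0, width=0.5):
    """Toy commutator weight, odd in nu (assumption: a smeared sgn(nu)*delta)."""
    def bump(x):
        return math.exp(-0.5 * (x / width) ** 2)
    return bump(nu - energy) - bump(nu + energy)

def thermal_two_point(nu, beta, energy=1.0, width=0.5):
    """Two-point weight of the form [7]: C(nu) / (1 - exp(-beta * nu)), nu != 0."""
    return commutator_weight(nu, energy, width) / (1.0 - math.exp(-beta * nu))

def check_kms(nu, beta):
    """Return the two residuals that should vanish for the weight [7]."""
    w_plus = thermal_two_point(nu, beta)
    w_minus = thermal_two_point(-nu, beta)
    # The commutator is recovered as the difference of the two orderings.
    commutator_residual = (w_plus - w_minus) - commutator_weight(nu)
    # KMS relation in the energy variable: W(-nu) = exp(-beta * nu) * W(nu).
    kms_residual = w_minus - math.exp(-beta * nu) * w_plus
    return commutator_residual, kms_residual

if __name__ == "__main__":
    for nu in (0.3, 1.0, 2.5):
        print(nu, check_kms(nu, beta=2.0))
    # For large beta the negative-energy weight is exponentially suppressed,
    # which mimics the vacuum spectrum condition.
    print(thermal_two_point(-1.0, beta=200.0))
```

Both residuals vanish up to rounding for any odd weight, which reflects the statement that the commutator is temperature independent while the two-point function carries the whole temperature dependence through the factor $(1 - e^{-\beta\nu})^{-1}$.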
In the simplest case, the classical Lagrangian density of the so-called $P(\phi)_2$ models is given by

$$ \mathcal{L} = (\partial_\nu \phi)(\partial^\nu \phi) - m^2 \phi^2 - \frac{\lambda}{4}\,\phi^4 \qquad [8] $$

Here $\phi(t, x)$ denotes a real scalar field over spacetime. The construction of the corresponding quantized thermal field presented in this section (Gérard and Jäkel 2005) is based on the original ideas of Høegh-Krohn (1974).

$$ \tau_t^{\circ}(W(f)) = W(e^{it\epsilon} f), \qquad f \in h_m $$

If $m > 0$, the KMS condition allows just one unique (quasifree) $(\tau^{\circ}, \beta)$-KMS state:

$$ \omega_\beta^{\circ}(W(f)) := e^{-(1/4)(f, (1 + 2\rho) f)_m}, \qquad \rho := (e^{\beta\epsilon} - 1)^{-1} $$

The GNS representation associated to the pair $(W(h_m), \omega_\beta^{\circ})$ is the well-known Araki–Woods representation, given by

$$ \mathcal{H}_{AW} := \Gamma(h_m \oplus \bar h_m), \qquad \Omega_{AW} := \Omega_F, $$

$$ \pi_{AW}(W(h)) := W_F\big((1 + \rho)^{1/2} h \oplus \bar\rho^{1/2} \bar h\big), \qquad h \in h_m $$

Here $\bar h_m$ is the Hilbert space conjugate to $h_m$, $W_F(\cdot)$ denotes the usual Weyl operator on the Fock space $\Gamma(h_m \oplus \bar h_m)$ and $\Omega_F \in \Gamma(h_m \oplus \bar h_m)$ is the Fock vacuum. The Liouvillean $L_{AW}$ (see [4]) can be identified with $d\Gamma(\epsilon \oplus -\bar\epsilon)$.

The local von Neumann algebra generated by $\{\pi_{AW}(W(h)) \mid h \in h_m(\mathcal{O})\}$ is denoted by $\mathcal{R}_{AW}(\mathcal{O})$. The algebra of observables for the free quantum field (and, as we will see, the $P(\phi)_2$ model) is the norm closure

$$ \mathcal{A} := \overline{\bigcup_{\mathcal{O} \subset \mathbb{R}^2} \mathcal{R}_{AW}(\mathcal{O})}^{\,(C^*)} $$

of the local von Neumann algebras.

In 1 + 1 spacetime dimensions Wick ordering is sufficient to eliminate the UV divergences of polynomial interactions. As it turns out, the leading order in the UV divergences is independent of the temperature (in agreement with the results found in Kopper et al. (2001)). Thus, it is a matter of convenience whether one uses the thermal covariance function $C_\beta$,

$$ C_\beta(h_1, h_2) := \Big( h_1,\ \frac{1 + e^{-\beta\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})}\, h_2 \Big)_{L^2(\mathbb{R})} $$

in the Araki–Woods representation and to show that $H_l$ is essentially self-adjoint (Gérard and Jäkel 2005). Thus, (the closure of) $H_l$ can be used to define a perturbed time evolution $t \mapsto \tau_t^l$ on $\mathcal{A}$ and the vector

$$ \Omega_l := \frac{e^{-(\beta/2) H_l}\,\Omega_{AW}}{\| e^{-(\beta/2) H_l}\,\Omega_{AW} \|} $$

induces a KMS state $\omega_l$ for the dynamical system $(\pi_{AW}(\mathcal{A})'', \tau^l)$.

A finite propagation speed argument (using Trotter's product formula) shows that

$$ \tau_t^l(A) := e^{itH_l} A\, e^{-itH_l}, \qquad t \in \mathbb{R} \qquad [9] $$

is independent of $l$ for $A \in \mathcal{R}_{AW}(\mathcal{O})$, $t \in \mathbb{R}$ fixed and $l$ sufficiently large. Thus, there exists a limiting dynamics $\tau$ such that

$$ \lim_{l\to\infty} \| \tau_t^l(A) - \tau_t(A) \| = 0 \qquad [10] $$

for all $A \in \mathcal{R}_{AW}(\mathcal{O})$, $\mathcal{O}$ bounded. This norm convergence extends to the norm closure $\mathcal{A}$ of the local von Neumann algebras.

The existence of weak limit points (which are states) of the (generalized) sequence $\{\omega_l\}_{l>0}$ is a consequence of the Banach–Alaoglu theorem. The fact that all limit states satisfy the KMS condition with respect to the pair $(\mathcal{A}, \tau)$ follows from [10]. To
prove that the sequence $\{\omega_l\}_{l>0}$ has only one accumulation point,

$$ \omega_\beta = \lim_{l\to\infty} \omega_l \qquad [11] $$

is more delicate. Following Høegh-Krohn, Nelson symmetry is used in Gérard and Jäkel (2005) to relate the interacting thermal theory on the real line to the $P(\phi)_2$ model on the circle $S^1$ of length $\beta$ at temperature 0. The existence of the limit [11] then follows from the uniqueness of the vacuum state on the circle. The relativistic KMS condition can be derived by Nelson symmetry as well, using the fact that the discrete spectrum of the model on the circle satisfies the spectrum condition. Since the limit [11] exists on the norm closure $\mathcal{A}$ of the weakly closed local algebras, it follows from a result of Takesaki and Winnink that $\omega_\beta$ is locally normal with respect to the Araki–Woods representation (which itself is locally normal with respect to the Fock representation). Consequently,

$$ \mathcal{R}_\beta(\mathcal{O}) := \pi_\beta(\mathcal{A}(\mathcal{O}))'' \cong \mathcal{R}_{AW}(\mathcal{O}), \qquad \mathcal{O} \text{ bounded} $$

that is, $\mathcal{R}_\beta(\mathcal{O})$ is (isomorphic to) the unique hyperfinite factor of type III$_1$. Moreover, the local Fock property implies that the split property holds.

Perturbation Theory

Steinmann (1995) has shown that perturbative expansions for the Wightman distributions of the $:\!\phi^4\!:_4$ model can be derived directly in the thermodynamic limit, using as only inputs the equations of motion and the (thermal) Wightman axioms. The result can be represented as a sum over generalized Feynman graphs. The method consists in solving the differential equations for the correlation functions which follow from the field equation, by a power series expansion in the coupling constant, using the axiomatic properties of the Wightman functions as subsidiary conditions. The Wightman axioms are expected to hold separately in each order of perturbation theory, with the exception of the cluster property.

As expected, the UV renormalization can be chosen to be temperature independent, that is, one can use the same counterterms as in the vacuum case. But the infrared divergences are more severe: they cannot be removed by minor adjustments of the renormalization procedure. Various elaborate resummation techniques have been proposed to (at least partially) remove the infrared singularities.

Another approach has been pursued by Kopper et al. (2001). They have investigated the perturbation expansion of the $:\!\phi^4\!:_4$ model in the imaginary-time formalism, using Wilson's flow equations. The result is once again that all correlation functions become ultraviolet-finite in all orders of the perturbation expansion, once the theory has been renormalized at zero temperature by usual renormalization prescriptions.

Asymptotic Dynamics of Thermal Fields

Timelike asymptotic properties of thermal correlation functions cannot be interpreted in terms of free fields due to persistent dissipative effects of a thermal system. This well-known fact manifests itself in a softened pole structure of the Green's functions in momentum space and is at the root of the failure of the conventional approach to thermal perturbation theory (Bros and Buchholz 2002). In fact, assuming a sharp dispersion law, one would be forced to conclude that the scattering matrix is trivial (a famous no-go theorem by Narnhofer et al. (1983)).

However, there seems to be a possibility to find an effective theory, which is much simpler and still reproduces the correct asymptotic behavior of the full theory. Disregarding low-energy excitations, Bros and Buchholz (2002) have shown that the $\delta$-contributions in the damping factors give rise to asymptotically leading terms which have a rather simple form: they are products of the thermal correlation function of a free field and a damping factor describing the dissipative effects of the model-dependent thermal background.

This result is based on the assumption that the truncated n-point functions satisfy

$$ \lim_{T\to\infty} T^{3(n-1)/2}\; W_\beta^{(n)}(t_1, x_1, \ldots, t_n, x_n)_{\mathrm{trunc.}} = 0 $$

while the $\delta$-contributions in the damping factors exhibit, for large timelike separations $T$, a $T^{-3/2}$ type behavior (in 3 + 1 spacetime dimensions).

Bros and Buchholz (2002) have shown that the asymptotically dominating parts of the correlation functions can be interpreted in terms of quasifree states acting on the algebra generated by a Hermitian field $\phi_0$ satisfying the commutation relations

$$ [\phi_0(t_1, x_1), \phi_0(t_2, x_2)] = \Delta_{m_0}(t_1 - t_2, x_1 - x_2)\, Z(x_1 - x_2) $$

Here $\Delta_{m_0}$ is the usual commutator function of a free scalar field of mass $m_0$ and $Z$ is an operator-valued distribution commuting with $\phi_0$ such that $\hat\omega_\beta(Z(x_1 - x_2)) = D_{\beta,d}(x_1 - x_2)$. (Here $\hat\omega_\beta$ denotes a KMS state for the algebra generated by $\phi_0$.) Intuitively speaking, the field $\phi_0$ carries an additional stochastic degree of freedom, which manifests itself in a central element that appears in the commutation relations and couples to the thermal background.

As $\phi_0$ describes the interacting field asymptotically, one may expect that $\phi_0$ satisfies the field
equation of the interacting field in an asymptotic sense. Bros and Buchholz (2002) have demonstrated that this assumption allows one to derive an explicit expression for the discrete part of the damping factors $D_{\beta,d}(x)$ in simple models.

See also: Axiomatic Quantum Field Theory; Quantum Field Theory in Curved Spacetime; Scattering in Relativistic Quantum Field Theory: The Analytic Program; Tomita–Takesaki Modular Theory.

Further Reading

Borchers H-J and Yngvason J (1999) Modular groups of quantum fields in thermal states. Journal of Mathematical Physics 40: 601–624.
Birke L and Fröhlich J (2002) KMS, etc. Reviews in Mathematical Physics 14: 829–871.
Bros J and Buchholz D (1994) Towards a relativistic KMS condition. Nuclear Physics B 429: 291–318.
Bros J and Buchholz D (1996) Axiomatic analyticity properties and representations of particles in thermal quantum field theory. Annales de l'Institut Henri Poincaré 64: 495–521.
Bros J and Buchholz D (1998) The unmasking of thermal Goldstone bosons. Physical Review D 58: 125012-1.
Bros J and Buchholz D (2002) Asymptotic dynamics of thermal quantum fields. Nuclear Physics B 627: 289–310.
Buchholz D and Junglas P (1989) On the existence of equilibrium states in local quantum field theory. Communications in Mathematical Physics 121: 255–270.
Buchholz D and Ojima I (1997) Spontaneous collapse of supersymmetry. Nuclear Physics B 498: 228–242.
Buchholz D, Ojima I, and Roos H (2002) Thermodynamic properties of non-equilibrium states in quantum field theory. Annals of Physics 297: 219–242.
Cuniberti G, De Micheli E, and Viano GA (2001) Reconstructing the thermal Green functions at real times from those at imaginary times. Communications in Mathematical Physics 216: 59–83.
Derezinski J, Jaksic V, and Pillet CA (2003) Perturbations of W*-dynamics, Liouvilleans and KMS states. Reviews in Mathematical Physics 15: 447–489.
Fröhlich J (1975) The reconstruction of quantum fields from Euclidean Green's functions at arbitrary temperatures. Helvetica Physica Acta 48: 355–363.
Gérard C and Jäkel C (2005) Thermal quantum fields with spatially cutoff interactions in 1 + 1 space-time dimensions. Journal of Functional Analysis 220: 157–213.
Gérard C and Jäkel C (2005) Thermal quantum fields without cutoffs in 1 + 1 space-time dimensions. Reviews in Mathematical Physics 17: 113–173.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Høegh-Krohn R (1974) Relativistic quantum statistical mechanics in two-dimensional space-time. Communications in Mathematical Physics 38: 195–224.
Jäkel CD (2004) The relation between KMS states for different temperatures. Annales de l'Institut Henri Poincaré 5: 1–30.
Kopper C, Müller VF, and Reisz T (2001) Temperature independent renormalization of finite temperature field theory. Annales de l'Institut Henri Poincaré 2: 387–402.
Kuckert B (2002) Covariant thermodynamics of quantum systems: passivity, semipassivity, and the Unruh effect. Annals of Physics 295: 216–229.
Landsman NP and van Weert Ch G (1987) Real- and imaginary-time field theory at finite temperature and density. Physics Reports 145: 141–249.
Narnhofer H (1994) Entropy density for relativistic quantum field theory. Reviews in Mathematical Physics 6: 1127–1145.
Narnhofer H, Requardt M, and Thirring W (1983) Quasiparticles at finite temperature. Communications in Mathematical Physics 92: 247–268.
Steinmann O (1995) Perturbative quantum field theory at positive temperatures: an axiomatic approach. Communications in Mathematical Physics 170: 405–415.
Toda Lattices
Y B Suris, Technische Universität München, München, Germany
© 2006 Elsevier Ltd. All rights reserved.

Lattices, or differential–difference equations, are a special class of ordinary differential equations, with the independent variable $t$ playing the role of time and an infinite number of dependent variables $q_n = q_n(t)$ numbered by integer indices $n$, characterized by a translational invariance with respect to the shift $n \to n + 1$. Due to this property, such equations are well suited for description of processes in translationally symmetric systems like crystals. In his search for lattice models admitting interesting explicit solutions, M Toda discovered in 1967 the lattice which nowadays carries his name:

$$ \ddot q_n = e^{q_{n+1} - q_n} - e^{q_n - q_{n-1}} \qquad [1] $$

The Toda lattice is one of the most celebrated systems of mathematical physics, and a large amount of literature is devoted to it and to its various generalizations. Its most prominent property is "integrability," so that it is amenable to a rather complete exact treatment; moreover, it can be regarded as one of the basic models, illustrating all the relevant
It turns out that eqns [11] are equivalent to the operator equation

$$ \dot L = [L, A_+] = [A_-, L] \qquad [12] $$

where $L$ and $A_\pm$ are linear difference operators with coefficients depending on $a_n$, $b_n$:

$$ L = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n E_{n,n+1} + \sum_{n\in\mathbb{Z}} E_{n+1,n} \qquad [13] $$

$$ A_+ = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} E_{n+1,n}, \qquad A_- = \sum_{n\in\mathbb{Z}} a_n E_{n,n+1} \qquad [14] $$

Here difference operators are represented as infinite matrices, $E_{m,n}$ being the matrix with the only nonvanishing element equal to 1 in the position $(m, n)$. A diagonal similarity (gauge) transformation of the matrix $L$ leads to an equivalent Lax representation of the Toda lattice:

$$ \dot L_0 = [L_0, A_0] \qquad [15] $$

with

$$ L_0 = \sum_{n\in\mathbb{Z}} b_n E_{n,n} + \sum_{n\in\mathbb{Z}} a_n^{1/2} \big( E_{n+1,n} + E_{n,n+1} \big) \qquad [16] $$

$$ A_0 = \frac{1}{2} \sum_{n\in\mathbb{Z}} a_n^{1/2} \big( E_{n+1,n} - E_{n,n+1} \big) \qquad [17] $$

Being equivalent for the Toda lattice, these two Lax representations admit nonequivalent generalizations (see below). Note that the matrices $A_\pm$ in [14] may be interpreted as $A_\pm = \pi_\pm(L)$, where $\pi_\pm$ stands for the lower-triangular, resp., strictly upper-triangular part. The commuting higher members of the Toda lattice hierarchy (enumerated by $s \in \mathbb{N}$) are characterized by the Lax equations of the form [12] with the same Lax matrix $L$ as in [13] and with $A_\pm = \pi_\pm(L^s)$. In the Lax representation [15], the higher Toda flows are obtained by choosing $A_0 = \mathrm{skew}(L_0^s)$, where "skew" denotes the skew-symmetric part (strictly lower-triangular part minus strictly upper-triangular part) of the symmetric matrix. The Hamilton functions of the higher flows are obtained as $H_s \sim \mathrm{tr}(L^s) = \mathrm{tr}(L_0^s)$.

Inverse Scattering

H Flaschka and S Manakov laid the Lax representation into the base of the application of the inverse-scattering, or inverse-spectral, transformation method (IST) to the infinite Toda lattice. It was the first application of IST in the lattice context. The matrix $L_0$ in [16] is symmetric tridiagonal, which yields that the operator $L_0$ is second order and self-adjoint. The direct and inverse-spectral problem for $L_0 \psi = \lambda\psi$ with such operators $L_0$ is well studied and parallel, to a large extent, to the corresponding theory for second-order differential operators. In the rapidly decaying case, the set of spectral data of the operator $L_0$, allowing for a solution of the inverse problem, consists of:

1. eigenvalues $\lambda_j = z_j + z_j^{-1}$ of the discrete spectrum, with $z_j \in (-1, 1)$;
2. normalizing coefficients $\gamma_j$ of the corresponding eigenfunctions; and
3. reflection coefficient $r(z)$ for $|z| = 1$, characterizing the continuous spectrum $\lambda = z + z^{-1} \in [-2, 2]$.

The solution of the inverse-spectral problem is given in terms of the Riemann–Hilbert problem or its variants, like the Gelfand–Levitan equation. Equation [12] means that the evolution of the operator $L$, induced by the evolution of $q_n(t)$, $p_n(t)$ in virtue of the Toda lattice equations [2], is "isospectral." More precisely, the discrete eigenvalues are integrals of motion, while the evolution of other spectral data is governed by simple linear equations:

$$ z_j = \mathrm{const.}, \qquad \gamma_j(t) = \gamma_j(0)\, e^{(z_j - z_j^{-1})t}, \qquad r(z, t) = r(z, 0)\, e^{(z - z^{-1})t} \qquad [18] $$

In particular, the multisoliton solutions correspond to the reflectionless case $r(z, t) \equiv 0$. The IST solution of the initial-value problem for the infinite Toda lattice can be schematically depicted as in Figure 1.

Figure 1 General scheme of the IST: $q_n(0), p_n(0)$ → (direct-spectral problem) → $z_j, \gamma_j(0), r(z, 0)$ → (linear evolution) → $z_j, \gamma_j(t), r(z, t)$ → (inverse-spectral problem) → $q_n(t), p_n(t)$.

Bi-Hamiltonian Structure

The canonical Poisson bracket for the variables $q_n$, $p_n$ turns in the Flaschka–Manakov variables [10] into

$$ \{b_n, a_n\}_1 = -a_n, \qquad \{a_n, b_{n+1}\}_1 = -a_n \qquad [19] $$
(all other brackets of the coordinate functions vanish), and the system [11] is Hamiltonian with respect to this bracket, with the Hamilton function

$$ H_2 = \frac{1}{2} \sum b_n^2 + \sum a_n $$

However, one can define also a different Poisson bracket for the variables $a_n$, $b_n$:

$$ \{b_n, a_n\}_2 = -b_n a_n, \qquad \{a_n, b_{n+1}\}_2 = -a_n b_{n+1}, \qquad \{b_n, b_{n+1}\}_2 = -a_n, \qquad \{a_n, a_{n+1}\}_2 = -a_n a_{n+1} \qquad [20] $$

with the following properties: it is compatible with the first one (i.e., their linear combinations are again Poisson brackets), and the system [11] is Hamiltonian with respect to this bracket, with the Hamilton function $H_1 = \sum b_n$. So, the Toda lattice in the form [11] is a bi-Hamiltonian system. This result is due to M Adler (1979). The bi-Hamiltonian property, introduced by F Magri in 1978 on the example of the Korteweg–de Vries equation, has been established since then as an alternative (and highly effective and informative) definition of integrability. Actually, the Toda lattice [11] is even tri-Hamiltonian, since there exists one more local Poisson bracket for the variables $a_n$, $b_n$ with similar properties, discovered by B Kupershmidt in 1985.

Darboux–Bäcklund Transformations and Discretization

with the flipped factors. The Bäcklund transformation [21] serves also as an integrable discretization of the Toda flow [2] with the time step $h$.

Finite Open-End Toda Lattice

Model

The infinite Toda lattice [1] can be reduced to finite-dimensional systems by imposing suitable boundary conditions, different from the rapidly decaying ones. Particularly important are "open-end boundary conditions," which correspond to placing the particles 0 and $N + 1$ at $q_0 = +\infty$ and $q_{N+1} = -\infty$, respectively. In terms of the Flaschka–Manakov variables, this means that $a_0 = a_N = 0$ and $b_0 = b_{N+1} = 0$. The Hamilton function of the resulting system with $N$ degrees of freedom is

$$ H_2(p, q) = \frac{1}{2} \sum_{n=1}^{N} p_n^2 + \sum_{n=1}^{N-1} e^{q_{n+1} - q_n} \qquad [24] $$

This system consists of $N$ particles subject to repulsive forces between nearest neighbors, and exhibits a scattering behavior both as $t \to -\infty$ and $t \to +\infty$. It admits a Lax representation of the same form [12] or [15] as in the infinite case, but with all the matrices being now of finite size $N \times N$, so that [13]–[14] and [16]–[17] are replaced by

$$ L = \sum_{n=1}^{N} b_n E_{n,n} + \sum_{n=1}^{N-1} a_n E_{n,n+1} + \sum_{n=1}^{N-1} E_{n+1,n} \qquad [25] $$
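The isospectral character of the Lax equation [12] for the finite matrix [25] can be illustrated numerically. The Python sketch below is a minimal illustration, not the author's method: it assumes the flow $\dot a_n = a_n(b_{n+1} - b_n)$, $\dot b_n = a_n - a_{n-1}$ with $a_0 = a_N = 0$, which is what the commutator $[L, A_+]$ gives for $a_n = e^{q_{n+1} - q_n}$, $b_n = p_n$; it integrates this flow with a classical Runge–Kutta step and compares the traces $\mathrm{tr}(L^s)$ before and after. All function names are hypothetical.

```python
def toda_rhs(a, b):
    """Assumed open-end Toda flow: da_n = a_n (b_{n+1} - b_n), db_n = a_n - a_{n-1}."""
    n = len(b)
    da = [a[k] * (b[k + 1] - b[k]) for k in range(n - 1)]
    db = []
    for k in range(n):
        right = a[k] if k < n - 1 else 0.0   # a_N = 0
        left = a[k - 1] if k > 0 else 0.0    # a_0 = 0
        db.append(right - left)
    return da, db

def _shift(x, dx, s):
    return [xi + s * di for xi, di in zip(x, dx)]

def rk4_step(a, b, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1a, k1b = toda_rhs(a, b)
    k2a, k2b = toda_rhs(_shift(a, k1a, h / 2), _shift(b, k1b, h / 2))
    k3a, k3b = toda_rhs(_shift(a, k2a, h / 2), _shift(b, k2b, h / 2))
    k4a, k4b = toda_rhs(_shift(a, k3a, h), _shift(b, k3b, h))
    new_a = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(a, k1a, k2a, k3a, k4a)]
    new_b = [x + h / 6 * (p + 2 * q + 2 * r + s)
             for x, p, q, r, s in zip(b, k1b, k2b, k3b, k4b)]
    return new_a, new_b

def integrate(a, b, t_end, h=1e-3):
    """Integrate the flow from t = 0 to t = t_end with fixed step h."""
    for _ in range(int(round(t_end / h))):
        a, b = rk4_step(a, b, h)
    return a, b

def lax_matrix(a, b):
    """Finite Lax matrix of the form [25]: b on the diagonal, a above, 1 below."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for k in range(n):
        L[k][k] = b[k]
    for k in range(n - 1):
        L[k][k + 1] = a[k]
        L[k + 1][k] = 1.0
    return L

def trace_power(L, s):
    """Trace of the s-th matrix power of L (plain nested-list arithmetic)."""
    n = len(L)
    M = [row[:] for row in L]
    for _ in range(s - 1):
        M = [[sum(M[i][k] * L[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(M[i][i] for i in range(n))

if __name__ == "__main__":
    a0, b0 = [1.0, 0.5], [0.3, -0.2, 0.1]   # illustrative initial data, N = 3
    a1, b1 = integrate(a0, b0, t_end=1.0)
    for s in (1, 2, 3):
        print(s, trace_power(lax_matrix(a0, b0), s), trace_power(lax_matrix(a1, b1), s))
```

Since $\mathrm{tr}(L) = \sum b_n$ and $\mathrm{tr}(L^2) = \sum b_n^2 + 2\sum a_n = 2H_2$, the first two traces reproduce the Hamilton functions $H_1$ and $H_2$; the third trace is an additional integral of motion that is not visible from the Hamiltonian form alone.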
Moser's Solution

Integration of this system has been first performed by J Moser in 1975. His solution can be interpreted within the general scheme of the IST (see Figure 1). The spectral data in this case consist, for example, of the eigenvalues $\lambda_j$ ($j = 1, \ldots, N$) of the matrix $L_0$ and the first components $r_j$ of the corresponding orthonormal eigenvectors. The evolution of these data induced by the Toda flow [2] turns out to be simple:

$$ \lambda_j = \mathrm{const.}, \qquad r_j^2(t) = \frac{r_j^2(0)\, e^{\lambda_j t}}{\sum_{i=1}^{N} r_i^2(0)\, e^{\lambda_i t}} \qquad [29] $$

The IST is expressed by the identity

$$ \sum_{j=1}^{N} \frac{r_j^2}{\lambda - \lambda_j} = \cfrac{1}{\lambda - b_1 - \cfrac{a_1}{\lambda - b_2 - \ddots - \cfrac{a_{N-1}}{\lambda - b_N}}} \qquad [30] $$

both parts of which represent the entry (1, 1) of the matrix $(\lambda I - L_0)^{-1}$. It implies that all variables $a_n(t)$, $b_n(t)$ are rational functions of $\lambda_j$ and $e^{\lambda_j t}$; in particular, one finds:

$$ a_n(t) = e^{q_{n+1}(t) - q_n(t)} = \frac{\tau_{n-1}(t)\,\tau_{n+1}(t)}{\tau_n^2(t)} \qquad [31] $$

where $\tau_n(t)$ can be represented as an $n \times n$ Hankel determinant

$$ \tau_n(t) = \det\big[ c_{j+k}(t) \big]_{0 \le j,k \le n-1}, \qquad c_j(t) = \sum_{i=1}^{N} \lambda_i^j\, r_i^2(t) \qquad [32] $$

Factorization Solution

The Lax representation [12] is a particular instance of a general construction, known under the name of Adler–Kostant–Symes (AKS) method and found around 1980. The ingredients of this construction are:

• a splitting of $g$ into a direct sum of its two subspaces $g_\pm$ which are also Lie subalgebras, with $\pi_\pm : g \to g_\pm$ being the corresponding projections;

The AKS method provides a formula for the solution of the initial-value problem for Lax equations of the form [12] with the Lax matrix $L \in g$ and $A_\pm = \pi_\pm(\varphi(L))$. The solution is given by

$$ L(t) = U_+^{-1}(t)\, L(0)\, U_+(t) = U_-(t)\, L(0)\, U_-^{-1}(t) \qquad [33] $$

where the elements $U_\pm(t) \in G_\pm$ solve the factorization problem

$$ \exp\big( t\,\varphi(L(0)) \big) = U_+(t)\, U_-(t) \qquad [34] $$

For the open-end Toda lattice $g = gl(N)$, the Lie algebra of all $N \times N$ matrices, $g_\pm$ consist of all lower-triangular, resp., strictly upper-triangular, matrices. Accordingly, $G = GL(N)$, the Lie group of all nondegenerate $N \times N$ matrices, and $G_\pm$ consist of all nondegenerate lower-triangular matrices, resp., of upper-triangular matrices with units on the diagonal. The corresponding factorization problem in $G$ is well known in linear algebra under the name of LR factorization, and is related to Gaussian elimination. From [33] and the well-known expression of the diagonal elements of the lower-triangular factor in the LR factorization through the minors of the factorized matrix, we find:

$$ a_n(t) = a_n(0)\, \frac{\tau_{n+1}(t)\,\tau_{n-1}(t)}{\tau_n^2(t)} \qquad [35] $$

where $\tau_n(t)$ is the upper-left $n \times n$ minor of the matrix $\exp(tL(0))$. If $L(t)$ is the Lax matrix along the solution of the Toda flow ($\varphi(L) = L$), then the sampling of the matrix $\exp(L(t))$ at the integer times $t \in \mathbb{Z}$ coincides with the result of application of Rutishauser's LR algorithm to the matrix $\exp(L(0))$. The LR algorithm applied to the matrix $I + hL(0)$ is nothing other than the Bäcklund transformation [21] in the open-end situation.

Finite Periodic Toda Lattice

Model

periodic boundary conditions, $q_{n+N}(t) \equiv q_n(t)$ for all $n \in \mathbb{Z}$ (of course, such relations hold also for the Flaschka–Manakov variables $a_n$, $b_n$). The Hamilton
function of the resulting system with $N$ degrees of freedom is

$$ H_2(p, q) = \frac{1}{2} \sum_{n\in\mathbb{Z}/N\mathbb{Z}} p_n^2 + \sum_{n\in\mathbb{Z}/N\mathbb{Z}} e^{q_{n+1} - q_n} \qquad [36] $$

This system consists of $N$ particles $q_n$ ($n = 1, \ldots, N$), and it is always assumed that $q_{N+1} \equiv q_1$ and $q_0 \equiv q_N$. Thus, the potential energy in [36] differs from the potential energy in [24] by one additional term $e^{q_1 - q_N}$. However, this modest difference leads to much more complicated dynamics of the system (quasiperiodic instead of scattering). It is convenient to replace infinite matrices in the Lax representation [12] by finite ones, of size $N \times N$, but depending on an additional parameter $\lambda$ (called the spectral parameter):

$$ L = \sum_{n\in\mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda^{-1} \sum_{n\in\mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1} + \lambda \sum_{n\in\mathbb{Z}/N\mathbb{Z}} E_{n+1,n} \qquad [37] $$

$$ A_+ = \sum_{n\in\mathbb{Z}/N\mathbb{Z}} b_n E_{n,n} + \lambda \sum_{n\in\mathbb{Z}/N\mathbb{Z}} E_{n+1,n} \qquad [38] $$

$$ A_- = \lambda^{-1} \sum_{n\in\mathbb{Z}/N\mathbb{Z}} a_n E_{n,n+1} \qquad [39] $$

The Lax representation [12] holds identically in $\lambda$, so that the spectral parameter drops out of the equations of motion. Note that, unlike the open-end case, $L$ is no longer a tridiagonal matrix, because of the nonvanishing entries in the positions $(N, 1)$ and $(1, N)$.

Inverse-Spectral Transformation

Solution of the periodic lattice in terms of multidimensional theta functions has been given independently by E Date and S Tanaka, and by I Krichever in 1976. In this case, the set of the spectral data is more complicated; it includes:

• a hyperelliptic Riemann surface $\mathcal{R}$ of genus $N - 1$ determined by the eigenvalues of the periodic boundary-value problem for the operator $L$, or, in other words, by the equation $R(\lambda, \mu) = \det(L(\lambda) - \mu I) = 0$; and

multidimensional theta-functions by formula [35] with $\tau_n(t) = \theta(nU - tV + D)$, where $U$, $V$, $D$ are certain vectors on the Jacobian of $\mathcal{R}$ (the first two of them depending on the spectrum $\mathcal{R}$ only).

Loop Algebras

The periodic Toda lattice can be included into the general AKS scheme, if one interprets the Lax matrix $L$ as an element of the loop algebra $g$ which consists of Laurent polynomials (in $\lambda$) with coefficients from $gl(N)$, singled out by the additional condition

$$ g = \big\{ L(\lambda) \in gl(N)[\lambda, \lambda^{-1}] : \Omega L(\lambda) \Omega^{-1} = L(\omega\lambda) \big\} $$

where $\Omega = \mathrm{diag}(1, \omega, \ldots, \omega^{N-1})$, $\omega = \exp(2\pi i/N)$. Subalgebras $g_\pm$ consist of Laurent polynomials with respect to non-negative, resp., strictly negative powers of $\lambda$. The Lie group $G$ corresponding to the Lie algebra $g$ consists of $GL(N)$-valued functions $U(\lambda)$ of the complex parameter $\lambda$, regular in $\mathbb{CP}^1 \setminus \{0, \infty\}$ and satisfying $\Omega U(\lambda) \Omega^{-1} = U(\omega\lambda)$. Its subgroups $G_\pm$ corresponding to the Lie algebras $g_\pm$ are singled out by the following conditions: elements of $G_+$ are regular in the neighborhood of $\lambda = 0$, while elements of $G_-$ are regular in the neighborhood of $\lambda = \infty$ and take at $\lambda = \infty$ the value $I$. The corresponding factorization is called the generalized LR factorization. As opposed to the open-end case, finding such a factorization is a problem of the Riemann–Hilbert type which is solved in terms of algebraic geometry and theta-functions rather than in terms of linear algebra and exponential functions. This approach to the periodic Toda lattice is due to Reyman and Semenov-Tian-Shansky (1979) and, independently, to M Adler and P van Moerbeke (1980).

Generalizations: Lie-Algebraic Systems

The AKS interpretation of the finite Toda lattices leads directly to their generalizations by replacing the algebra $gl(N)$, resp., the loop algebra over $gl(N)$, by simple Lie algebras, resp. affine Lie algebras. These generalized Toda systems were introduced in 1976 by O Bogoyavlensky and solved in 1979 independently by M Olshanetsky, A Perelomov,
while $g_-$ is spanned by the root spaces for negative roots (Borel decomposition). For $\alpha \in \Delta$ let $E_\alpha$ be a corresponding root vector. So, $[H, E_\alpha] = \alpha(H) E_\alpha$ for all $H \in h$. The root $\alpha \in h^*$ may be identified with $H_\alpha \in h$ defined by $\langle H_\alpha, H \rangle = \alpha(H)$ for all $H \in h$. It is easy to deduce that $[E_\alpha, E_{-\alpha}] = c_\alpha H_\alpha$, where $c_\alpha = \langle E_\alpha, E_{-\alpha} \rangle$. The system of simple roots will be denoted by $\Pi \subset \Delta_+$.

The generalized Toda lattice for the Lie algebra $g$ is the following system of differential equations on $h \times h$:

$$ \dot Q = P, \qquad \dot P = \sum_{\alpha\in\Pi} e^{\alpha(Q)}\, [E_{-\alpha}, E_\alpha] = -\sum_{\alpha\in\Pi} c_\alpha\, e^{\alpha(Q)}\, H_\alpha \qquad [40] $$

This system can be given a Hamiltonian formulation, with the Hamilton function

$$ H_g = \frac{1}{2} \langle P, P \rangle + \sum_{\alpha\in\Pi} c_\alpha\, e^{\alpha(Q)} \qquad [41] $$

It is completely integrable, and has a Lax representation [12] with

$$ L = P + \sum_{\alpha\in\Pi} E_\alpha + \sum_{\alpha\in\Pi} e^{\alpha(Q)} E_{-\alpha} \qquad [42] $$

$$ A_+ = P + \sum_{\alpha\in\Pi} E_\alpha, \qquad A_- = \sum_{\alpha\in\Pi} e^{\alpha(Q)} E_{-\alpha} \qquad [43] $$

The usual open-end Toda lattice corresponds to the algebra $sl(N)$ (series $A_{N-1}$), so that the Hamilton function [24] can be denoted by $H_{A_{N-1}}$. The Hamilton functions of the generalized lattices corresponding to other classical algebras $so(2N + 1)$ (series $B_N$), $sp(N)$ (series $C_N$), and $so(2N)$ (series $D_N$) can be written in the canonically conjugate variables $q_n$, $p_n$ ($n = 1, \ldots, N$) as

$$ H_g(p, q) = H_{A_{N-1}}(p, q) + \begin{cases} e^{-q_N}, & g = B_N \\ e^{-2q_N}, & g = C_N \\ e^{-q_N - q_{N-1}}, & g = D_N \end{cases} \qquad [44] $$

Affine Lie Algebras

Turning to the generalizations of the periodic Toda lattice, let $\sigma$ be a Coxeter automorphism of a simple complex algebra $\mathbf{g}$, the order of $\sigma$ being $m$. Introduce the loop algebra $g$ as the Lie algebra of Laurent polynomials

$$ g = \big\{ L(\lambda) \in \mathbf{g}[\lambda, \lambda^{-1}] : \sigma(L(\lambda)) = L(\omega\lambda) \big\} $$

where $\omega = \exp(2\pi i/m)$. Denote by $\mathbf{g}_j$ the eigenspaces of $\sigma$ corresponding to the eigenvalues $\omega^j$ ($j \in \mathbb{Z}/m\mathbb{Z}$). Set $\mathbf{a} = \mathbf{g}_0$, and let $s$ denote the dimension of $\mathbf{a}$. By definition of the Coxeter automorphism, $\mathbf{a}$ is an abelian subalgebra of $\mathbf{g}$. Denote by $\hat\Pi$ the set of $\alpha \in \mathbf{a}^*$ for which there exist nonzero elements $E_\alpha \in \mathbf{g}_1$ with $[H, E_\alpha] = \alpha(H) E_\alpha$ for all $H \in \mathbf{a}$. The elements $E_{-\alpha} \in \mathbf{g}_{-1}$ are defined similarly. It can be shown that $\hat\Pi$ contains $s + 1$ elements, so that between them there exists exactly one linear relation. The elements of $\hat\Pi$ are called simple weights of the loop algebra $g$. The Lie algebra $g$ is a direct sum of its two subspaces $g_\pm$ consisting of Laurent polynomials with non-negative, resp., with strictly negative powers of $\lambda$; these subspaces are also Lie subalgebras.

Now the generalized Toda lattice related to the loop algebra $g$ can be introduced as the system of differential equations on $\mathbf{a} \times \mathbf{a}$, which looks formally exactly as [40], and has the Hamilton function which looks exactly as [41], but with the set of simple roots of $\mathbf{g}$ being replaced by the set of simple weights of $g$. The matrices participating in the Lax representation [12] belong now to the loop algebra $g$:

$$ L(\lambda) = P + \lambda \sum_{\alpha\in\hat\Pi} E_\alpha + \lambda^{-1} \sum_{\alpha\in\hat\Pi} e^{\alpha(Q)} E_{-\alpha} \qquad [45] $$

$$ A_+(\lambda) = P + \lambda \sum_{\alpha\in\hat\Pi} E_\alpha, \qquad A_-(\lambda) = \lambda^{-1} \sum_{\alpha\in\hat\Pi} e^{\alpha(Q)} E_{-\alpha} \qquad [46] $$

For the classical series of loop algebras, the Hamilton functions $H_g$ in the canonically conjugate variables $q_n$, $p_n$ ($n = 1, \ldots, N$) can be presented as

$$ H_g(p, q) = H_{A_{N-1}}(p, q) + \begin{cases} e^{-q_N} + e^{q_1 + q_2}, & g = B_N^{(1)} \\ e^{-2q_N} + e^{2q_1}, & g = C_N^{(1)} \\ e^{-q_N - q_{N-1}} + e^{q_1 + q_2}, & g = D_N^{(1)} \\ e^{-2q_N} + e^{q_1 + q_2}, & g = A_{2N-1}^{(2)} \\ e^{-q_N} + e^{2q_1}, & g = A_{2N}^{(2)} \\ e^{-q_N} + e^{q_1}, & g = D_{N+1}^{(2)} \end{cases} \qquad [47] $$

Actually, one can find even more general integrable systems of the Toda type: one can add to $H_{A_{N-1}}(p, q)$ any of the two potentials $\alpha e^{-q_N - q_{N-1}}$ or $\alpha e^{-q_N} + \beta e^{-2q_N}$ on one end combined with any of the two potentials $\gamma e^{q_1 + q_2}$ or $\gamma e^{q_1} + \delta e^{2q_1}$ on the other end, where $\alpha$, $\beta$, $\gamma$, $\delta$ are arbitrary constants. This result is due to E Sklyanin (1987).

Generalizations: Lattices with Nearest-Neighbor Interactions

There exist further integrable lattice systems with the nearest-neighbor interaction apart from the classical exponential Toda lattice [1]. Those of the
242 Toda Lattices
type q€n = r(q_ n )(g(qnþ1 qn ) g(qn qn1 )) have eqnþ1 qn
been classified by R Yamilov in 1982, and the list qn ¼ qn ð1 qn Þ ð1 q_ nþ1 Þ
€ _ _
1 þ eqnþ1 qn
contains, apart from the usual Toda lattice [1], the
eqn qn1
following ones: ð1 q_ n1 Þ ½56
1 þ eqn qn1
qnþ1 qn qn qn1
q
€n ¼ q_ n ðe e Þ ½48
two -perturbations of the dual Toda lattice [49]:
q
€n ¼ q_ n ðqnþ1 2qn þ qn1 Þ ½49 q_ nþ1 q_ n
q
€n ¼ q_ n ðqnþ1 2qn þ qn1 Þ þ
1 þ ðqnþ1 qn Þ
1 1
€n ¼ ðq_ 2n 2 Þ
q ½50 q_ n q_ n1
qnþ1 qn qn qn1 ½57
1 þ ðqn qn1 Þ
€n ¼ ðq_ 2n 2 Þðcothðqnþ1 qn Þ
q q qn q_ nþ1
€n ¼ q_ n ð1 þ 2 q_ n Þ nþ1
q
cothðqn qn1 ÞÞ ½51 1 þ ðqnþ1 qn Þ
Equations [48] are known as the ‘‘modified Toda qn qn1 q_ n1
½58
lattice.’’ Equations [49] describe the ‘‘dual Toda lattice’’ 1 þ ðqn qn1 Þ
which was instrumental in the original discovery by and one -perturbation of each of the systems [50]
Toda (see Toda (1989)). All systems [49]–[51] can be and [51]:
obtained from [11] via suitable parametrizations of the
variables an , bn by canonically conjugate ones qn , pn , qnþ1 qn q_ nþ1
€n ¼ q_ 2n 2
q
similar to [10] for [1], see Suris (2003). ðqnþ1 qn Þ2 ðÞ2
A remarkable discovery of the integrable relati- !
vistic Toda lattice is due to S Ruijsenaars (1990). qn qn1 q_ n1
½59
This lattice with the equations of motion ðqn qn1 Þ2 ðÞ2
eqnþ1 qn 1 2
q
€n ¼ ð1 þ q_ n Þ ð1 þ q_ nþ1 Þ
q
€n ¼ q_ n 2
1 þ 2 eqnþ1 qk 2
eqn qn1
ð1 þ qn1 Þ
_ ½52 sinh 2ðqnþ1 qn Þ 1 sinhð2Þq_ nþ1
1 þ 2 eqn qn1
sinh2 ðqnþ1 qn Þ sinh2 ðÞ
!
can be considered as the perturbation of the usual sinh 2ðqn qn1 Þ 1 sinhð2Þq_ n1
Toda lattice with the small parameter (the inverse ½60
sinh2 ðqn qn1 Þ sinh2 ðÞ
speed of light).
A class of integrable lattice systems of the relativistic A detailed study of all these systems, their interrelations,
Toda type q €n = r(q_ n )(q_ nþ1 f (qnþ1 qn ) q_ n1 f (qn and time discretizations can be found in Suris (2003).
qn1 ) þ g(qnþ1 qn ) g(qn qn1 )) is richer than There exist also lattices with more complicated
that of the Toda type, and has been isolated by Yu nearest-neighbor interactions, involving elliptic
B Suris and by V Adler and A Shabat in 1997. The list functions. They were discovered by A Shabat and
contains, apart from the relativistic Toda lattice [52], R Yamilov (1990), and by I Krichever (2000). For
two more -perturbations of the usual Toda lattice [1]: example, the nonrelativistic elliptic Toda lattice is
governed by the equations
€n ¼ ð1 þ q_ nþ1 Þeqnþ1 qn ð1 þ q_ n1 Þeqn qn1
q
€n ¼ q_ 2n 1 ðVðqn ; qnþ1 Þ þ Vðqn ; qn1 ÞÞ
q ½61
2 e2ðqnþ1 qn Þ e2ðqn qn1 Þ ½53
where V(q, q0 ) =
(q þ q0 ) þ
(q q0 )
(2q) is an
elliptic function in both arguments q, q0 (here
(q) is
€n ¼ ð1 q_ n Þ2 ð1 q_ nþ1 Þ eqnþ1 qn
q
the Weierstrass
-function).
ð1 q_ n1 Þeqn qn1 ½54
Further Developments
two -perturbations of the modified Toda lattice [48]:
and Generalizations
eqnþ1 qn Sato’s Theory
€k ¼ q_ n eqnþ1 qn eqn qn1 þ q_ nþ1
q
1 þ eqnþ1 qn
qn qn1
Formulas [6], [31], and [35] have the same structure,
e
q_ n1 ½55 with the case-dependent functions n (t) given by the
1 þ eqn qn1 determinants [7] for the multisoliton solution in the
infinite case, by the Hankel determinants [32] or by the minors of the matrix exp(L(0)) in the open case, and by the multidimensional theta functions in the periodic case. All these seemingly different objects are actually particular cases of a beautiful construction due to M Sato (1981), developed by E Date, M Jimbo, M Kashiwara, T Miwa (1981–83), and by G Segal and G Wilson (1985), which provides one of the major unifying schemes for the theory of integrable systems. In this construction, integrable systems are interpreted as simple dynamical systems on an infinite-dimensional Grassmannian. The $\tau$-function (first invented by R Hirota in 1971) receives in this theory a representation-theoretical interpretation in terms of the determinant bundle over the Grassmannian.

Band Matrices

The Lax matrices [13] and [16] in the Manakov–Flaschka variables can be easily generalized: in the symmetric matrix L0 one can admit nonvanishing elements in the band of the width $2s + 1 > 3$ around the main diagonal, and in the Hessenberg matrix L one can admit more nonvanishing diagonals in the upper-triangle part. A systematic presentation of a large body of relevant results is given in Kupershmidt (1985). In the setting of finite lattices, the integrability of such systems becomes a nontrivial problem (as opposed to the tridiagonal situation), because the number of independent conjugation-invariant functions $\mathrm{tr}(L^s)$ becomes less than the number of degrees of freedom. An effective approach to this problem based on the semi-invariant functions has been found by P Deift, L-Ch Li, T Nanda, and C Tomei in 1986.

Two-Dimensional Toda Lattices

Up to now, we considered integrable lattices with one continuous and one discrete independent variable. This allows for a further generalization. Integrable systems with two continuous and one discrete independent variables are well known and widely used as models of the field theory. For instance, the Toda field theory deals with the system

$$(q_n)_{xy} = e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}}$$ [62]

introduced in the soliton theory by A Mikhailov in 1979. This two-dimensional system admits all possible kinds of reductions and generalizations mentioned above for the usual Toda lattice. In particular, the periodic two-dimensional Toda lattice is referred to as the affine Toda field theory (with the prominent example of the sine-Gordon field which corresponds to the period 2). Later, it was realized that the equivalent equation $(\log v_n)_{xy} = v_{n+1} - 2v_n + v_{n-1}$, which is obtained from [62] by setting $v_n = \exp(q_{n+1} - q_n)$, already appeared in studies by G Darboux in the 1880s, as the equation satisfied by the Laplace invariants of the chain of Laplace transformations of a given conjugate net. This relation to the classical differential geometry was extensively studied by G Darboux, G Tzitzéica, and others long before the advent of the theory of integrable systems. Another link to the differential geometry is a more recent observation, and relates the two-dimensional Toda lattice, with the d’Alembert operator $(\cdot)_{xy}$ on the left-hand side of [62] replaced by the Laplace operator $(\cdot)_{z\bar z}$, to harmonic maps. For instance, the sinh-Gordon equation $u_{z\bar z} = \sinh u$ governs harmonic maps from $\mathbb{C}$ into the unit sphere $S^2$, which can be interpreted also as Gauss maps of the constant mean curvature surfaces in $\mathbb{R}^3$. A review of this topic can be found in Guest (1997).

Discretization of Toda lattices, nonabelian Toda lattices, quantization of Toda lattices, dispersionless limit of Toda lattices, etc., are only some of the further relevant topics, which cannot be discussed in any detail in the restricted frame of this article, and the same holds, unfortunately, for such fascinating applications of the Toda lattice as the Frobenius manifolds, Laplacian growth problem, quantum cohomology, random matrix theory, two-dimensional gravity, etc.

See also: Bäcklund Transformations; Bi-Hamiltonian Methods in Soliton Theory; Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Current Algebra; Dynamical Systems and Thermodynamics; Functional Equations and Integrable Systems; Integrable Discrete Systems; Integrable Systems and Discrete Geometry; Integrable Systems and the Inverse Scattering Method; Integrable Systems: Overview; Lie Groups: General Theory; Multi-Hamiltonian Systems; Quantum Calogero–Moser Systems; Separation of Variables for Differential Equations; Solitons and Kac–Moody Lie Algebras; WDVV Equations and Frobenius Manifolds.

Further Reading

Adler M (1979) On a trace functional for formal pseudo-differential operators and the symplectic structure for Korteweg–de Vries type equations. Invent. Math. 50: 219–248.
Adler VE and Shabat AB (1997) On a class of Toda chains. Teor. Mat. Phys. 111: 323–334 (in Russian; English translation: Theor. Math. Phys. 111: 647–657); Generalized Legendre transformations. Teor. Mat. Phys. 112: 179–194 (in Russian; English translation: Theor. Math. Phys. 112: 935–948).
Adler M and van Moerbeke P (1980) Completely integrable systems, Kac–Moody algebras and curves. Advances in Mathematics 38: 267–317. Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Advances in Mathematics 38: 318–379.
244 Toeplitz Determinants and Statistical Mechanics
Bogoyavlensky OI (1976) On perturbations of the periodic Toda lattice. Communications in Mathematical Physics 51: 201–209.
Date E, Jimbo M, and Miwa T (1982–83) Method for generating discrete soliton equations. I–V. Journal of the Physical Society of Japan 51: 4116–4124, 4125–4131; 52: 388–393, 761–765, 766–771.
Date E and Tanaka S (1976) Analogue of inverse scattering theory for the discrete Hill’s equation and exact solutions for the periodic Toda lattice. Progress in Theoretical Physics 55: 457–465.
Deift P, Li L-C, Nanda T, and Tomei C (1986) The Toda flow on a generic orbit is integrable. Communications in Pure and Applied Mathematics 39: 183–232.
Flaschka H (1974) On the Toda lattice II. Inverse scattering solution. Progress in Theoretical Physics 51: 703–716.
Guest MA (1997) Harmonic Maps, Loop Groups, and Integrable Systems. Cambridge: Cambridge University Press.
Hénon M (1974) Integrals of the Toda lattice. Physical Review B 9: 1921–1923.
Krichever IM (1978) Algebraic curves and nonlinear difference equations. Uspekhi Mat. Nauk 33: 215–216 (in Russian).
Krichever I (2000) Elliptic analog of the Toda lattice. Int. Math. Res. Notes 8: 383–412.
Kupershmidt BA (1985) Discrete Lax equations and differential-difference calculus. Astérisque 123: 212.
Magri F (1978) A simple model of the integrable Hamiltonian equation. Journal of Mathematical Physics 19: 1156–1162.
Manakov SV (1974) On the complete integrability and stochastization in discrete dynamical systems. Zh. Exp. Theoretical Physics 67: 543–555 (in Russian; English translation: Soviet Physics JETP 40: 269–274).
Mikhailov AV (1979) Integrability of a two-dimensional generalization of the Toda chain. Pis’ma Zh. Eksp. Teor. Fiz. 30: 443–448 (in Russian; English translation: JETP Letters 30: 414–418).
Moser J (1975) Finitely many mass points on the line under the influence of an exponential potential – an integrable system. Lecture Notes in Physics 38: 467–497.
Olshanetsky MA and Perelomov AM (1979) Explicit solutions of classical generalized Toda models. Invent. Math. 54: 261–269.
Reyman AG and Semenov-Tian-Shansky MA (1979–81) Reduction of Hamiltonian systems, affine Lie algebras and Lax equations. I, II. Invent. Math. 54: 81–100, 63: 423–432.
Reyman AG and Semenov-Tian-Shansky MA (1994) Group theoretical methods in the theory of finite dimensional integrable systems. In: Arnold VI and Novikov SP (eds.) Encyclopaedia of Mathematical Science, vol. 16. Dynamical Systems VII, pp. 116–225. Berlin: Springer.
Ruijsenaars SNM (1990) Relativistic Toda systems. Communications in Mathematical Physics 133: 217–247.
Sato M and Sato Y (1983) Soliton equations as dynamical systems on infinite-dimensional Grassmann manifold. In: Nonlinear Partial Differential Equations in Applied Science (Tokyo, 1982), pp. 259–271. Amsterdam: North-Holland.
Segal G and Wilson G (1985) Loop groups and equations of KdV type. Inst. Hautes Études Sci. Publ. Math. 61: 5–65.
Shabat AB and Yamilov RI (1990) Symmetries on nonlinear chains. Algebra i Analiz 2: 183–208 (in Russian; English translation: Leningrad Mathematical Journal 2: 377–400).
Sklyanin EK (1987) Boundary conditions for integrable equations. Funkts. Anal. Prilozh. 21: 86–87 (in Russian; English translation: Funct. Anal. Appl. 21: 164–166).
Suris YuB (1997) New integrable systems related to the relativistic Toda lattice. Journal of Physics A: Math. and Gen. 30: 1745–1761.
Suris YuB (2003) The Problem of Integrable Discretization: Hamiltonian Approach. Basel: Birkhäuser.
Toda M (1967) Vibration of a chain with nonlinear interaction. Journal of Physical Society Japan 22: 431–436.
Toda M (1989) Theory of Nonlinear Lattices. Berlin: Springer.
Toda M and Wadati M (1975) A canonical transformation for the exponential lattice. Journal of the Physical Society of Japan 39: 1204–1211.
Yamilov RI (1982) On the classification of discrete equations. In: Integrable Systems, Ufa, 95–114 (in Russian).
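
The passage from the Toda field equation [62] to the equation for $v_n$ quoted in the section on two-dimensional Toda lattices is a short computation. A sketch, assuming only the substitution $v_n = \exp(q_{n+1} - q_n)$ stated there and that $q_n$ solves [62] at every site $n$:

```latex
% Assumptions: v_n := \exp(q_{n+1} - q_n), and [62] holds for all n.
\begin{align*}
(\log v_n)_{xy}
  &= (q_{n+1})_{xy} - (q_n)_{xy} \\
  &= \bigl(e^{q_{n+2}-q_{n+1}} - e^{q_{n+1}-q_n}\bigr)
   - \bigl(e^{q_{n+1}-q_n} - e^{q_n-q_{n-1}}\bigr) \\
  &= v_{n+1} - 2v_n + v_{n-1}.
\end{align*}
```

The right-hand side is the discrete second difference of $v_n$, which is the form in which the equation appears for the Laplace invariants mentioned in that section.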
Þ¼ ai eij
The corresponding eigenvector for the eigenvalue given by the value of this function at $e^{i2\pi k/n}$ is

$$[1,\ e^{i2\pi k/n},\ \ldots,\ e^{i2\pi k(n-1)/n}]$$

This can be verified by direct computation. The role of circulant matrices will not be emphasized in this article, although they are used in the computation of the generating function for certain dimer configurations and also in applications using the discrete Fourier transform.

The most common way to generate a finite Toeplitz matrix is with the Fourier coefficients of an integrable function. Let $\varphi : \mathbb{T} \to \mathbb{C}$ be a function defined on the unit circle with Fourier coefficients

$$\varphi_k = \frac{1}{2\pi}\int_0^{2\pi} \varphi(e^{i\theta})\, e^{-ik\theta}\, d\theta$$ [3]

We define $T_n(\varphi)$ to be the Toeplitz matrix:

$$T_n(\varphi) = (\varphi_{i-j})_{i,j=0}^{n-1}$$

A basic problem that in large part has been motivated by statistical mechanics is to determine the asymptotic behavior of the determinant of $T_n(\varphi)$ as $n \to \infty$. The determinant will be referred to as $D_n(\varphi)$, where $\varphi$ is called the generating function of the determinant. If the generating function has the property that its Fourier coefficients vanish for negative index (positive index) then the corresponding matrix is lower-triangular (upper-triangular) and hence the determinant is $\varphi_0^n$. For other cases, the determinant is not easy to determine and requires additional mathematical machinery.

Some of the primary motivation to study the determinant of these matrices comes from the two-dimensional Ising model. We consider the Onsager lattice in the absence of a magnetic field with sites labeled by

$$(i, j), \quad 0 \le i \le M, \quad 0 \le j \le N$$

and with a value $\sigma_{i,j} = \pm 1$ assigned to each site. In the Ising model, $\sigma_{i,j}$ signifies the state of the spin at the site $(i, j)$. To each possible configuration of spins, we assign an energy

$$E(\sigma) = -E_1 \sum_{i,j} \sigma_{i,j}\sigma_{i+1,j} - E_2 \sum_{i,j} \sigma_{i,j}\sigma_{i,j+1}$$

Let

$$Z = \sum_{\sigma} e^{-\beta E(\sigma)}$$

be the partition function. Then the probability of a given configuration is

$$\frac{1}{Z}\, e^{-\beta E(\sigma)}$$

Here $E_1$, $E_2$, and $\beta = 1/kT$ are, without loss of generality, assumed to be positive constants, $T$ is the temperature, and $k$ is the Boltzmann constant. If $X$ is a random variable defined on the space of configurations, the expectation is given by

$$E(X) = \frac{1}{Z}\sum_{\sigma=\pm 1} X(\sigma)\, e^{-\beta E(\sigma)}$$

Let $n$ be fixed for the moment and assume toroidal boundary conditions for the lattice and then let $N, M \to \infty$. It is known that the random variable

$$X(\sigma) = \sigma_{0,0}\,\sigma_{0,n}$$

has expectation $\langle \sigma_{0,0}\sigma_{0,n} \rangle$ given by $D_n(\varphi)$, where

$$\varphi(e^{i\theta}) = \left[\frac{(1 - \alpha_1 e^{i\theta})(1 - \alpha_2 e^{-i\theta})}{(1 - \alpha_1 e^{-i\theta})(1 - \alpha_2 e^{i\theta})}\right]^{1/2}$$

$$\alpha_1 = z_1\,\frac{1 - z_2}{1 + z_2}, \qquad \alpha_2 = z_1^{-1}\,\frac{1 - z_2}{1 + z_2}$$

and

$$z_1 = \tanh \beta E_1, \qquad z_2 = \tanh \beta E_2$$

The square root is taken so that $\varphi(e^{i\pi}) = 1$. This formula was first stated by Onsager and later verified in a difficult computation by Montroll, Potts, and Ward.

The spontaneous magnetization $M$ for the Ising model is defined by

$$M^2 = \lim_{n\to\infty} \langle \sigma_{0,0}\sigma_{0,n} \rangle = \lim_{n\to\infty} D_n(\varphi)$$

Note that it is the square root of the correlation between two distant sites. Hence, the asymptotics of the Toeplitz determinants will determine whether the magnetization is positive or tends to zero as $n \to \infty$.

Strong Szegö Limit Theorem

To determine the behavior of the determinants, we need to analyze the generating function $\varphi$. Let us first consider the case where $\alpha_2 < 1$. (It is always the case that $0 < \alpha_1 < 1$.) This generating function is differentiable, nonzero and has winding number zero, and it is for functions of this type that a second-order expansion of the Toeplitz determinants can be described. The expansion first formulated by Szegö, in response to the question concerning the
spontaneous magnetization, is called the “strong Szegö limit theorem.”

Before proving the Szegö theorem, it should be remarked that we can view the finite Toeplitz matrix as a truncation of an infinite array,

$$\begin{pmatrix} \varphi_0 & \varphi_{-1} & \varphi_{-2} & \cdots \\ \varphi_1 & \varphi_0 & \varphi_{-1} & \ddots \\ \varphi_2 & \varphi_1 & \varphi_0 & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}$$ [4]

The above infinite array is the matrix representation for the Toeplitz operator

Theorem 1 (Strong Szegö limit theorem). Assume $\varphi = \varphi_-\varphi_+$, where $\varphi_\pm$ have logarithms in $B$. Suppose $\overline{\log\varphi_-},\ \log\varphi_+ \in H^2$. Then

$$\lim_{n\to\infty} D_n(\varphi)/G(\varphi)^n = E(\varphi) = \exp\Bigl(\sum_{k=1}^{\infty} k\, s_k s_{-k}\Bigr)$$

where $G(\varphi) = \exp((\log\varphi)_0)$ and $s_k = (\log\varphi)_k$.

Since $B$ is a Banach algebra, it follows that if $\log\varphi_\pm$ belong to $B$ so do

$$\varphi_-,\ \varphi_+,\ \varphi_-^{-1},\ \varphi_+^{-1},\ \varphi,\ \varphi^{-1}$$

and hence they are bounded. Since $\varphi_+$ is in $H^2$ as well, its Fourier coefficients vanish for negative index and the Toeplitz operator has a corresponding infinite array that is lower-triangular. The Fourier coefficients vanish for positive index for $\varphi_-$ and hence the infinite array is upper triangular. From this, it follows that

$$T(\varphi_+)T(\varphi_+^{-1}) = T(\varphi_-)T(\varphi_-^{-1}) = I$$ [5]

$$T(\varphi_-)T(\varphi_+) = T(\varphi)$$ [6]

and

$$P_n T(\varphi_+) = P_n T(\varphi_+) P_n, \qquad P_n T(\varphi_-) P_n = T(\varphi_-) P_n$$ [7]

This yields

$$D_n(\varphi) = \det T_n(\varphi) = \det P_n T(\varphi) P_n$$ [8]

is $I$ plus a trace class operator, we use the identity

$$T(fg) - T(f)T(g) = H(f)H(\tilde g)$$ [12]

where $H(f)$ has matrix form $(f_{i+j+1})_{i,j=0}^{\infty}$, and $\tilde g(e^{i\theta}) = g(e^{-i\theta})$. Our Banach algebra conditions show that if $f$ is in $B$ then the operator $H(f)$ satisfies $\sum_{i,j} |a_{ij}|^2 < \infty$, where the $a_{ij}$ are the matrix entries of the operator. Any operator satisfying this is called a Hilbert–Schmidt operator, and it is known that the product of two Hilbert–Schmidt operators is trace class. Applying the identity to

$$T(\varphi_+^{-1})\,T(\varphi_-)$$

shows that this operator is $T(\varphi_-\varphi_+^{-1})$ plus trace class. The operator

$$T(\varphi_+)\,T(\varphi_-^{-1})$$
is thus $T(\varphi_+\varphi_-^{-1})$ plus trace class and one more application of the identity combined with the fact that trace class operators form an ideal yield the desired result.

From the theory of infinite determinants, as $n \to \infty$,

$$\det P_n\, T(\varphi_+^{-1})\, T(\varphi)\, T(\varphi_-^{-1})\, P_n$$ [13]

converges to

$$\det T(\varphi_+^{-1})\, T(\varphi)\, T(\varphi_-^{-1})$$ [14]

By [5] and [6], this last expression is of the form

$$e^{A} e^{B} e^{-A} e^{-B}$$

where

$$A = -T(\log\varphi_+) \quad\text{and}\quad B = T(\log\varphi_-)$$

If $AB - BA$ is trace class then

$$\det e^{A} e^{B} e^{-A} e^{-B} = e^{\mathrm{tr}(AB - BA)}$$

The operator $AB - BA$ is

$$-T(\log\varphi_+)T(\log\varphi_-) + T(\log\varphi_-)T(\log\varphi_+)$$

which equals

$$-T(\log\varphi_+)T(\log\varphi_-) + T((\log\varphi_-)(\log\varphi_+))$$

and, by the identity from eqn [12], becomes

$$H(\log\varphi_+)\,H(\widetilde{\log\varphi_-})$$

It can be directly computed that

$$\mathrm{tr}\bigl(H(\log\varphi_+)\,H(\widetilde{\log\varphi_-})\bigr)$$

equals

$$\sum_{k=1}^{\infty} k\, s_k s_{-k}$$

In order for this computation to be valid, it was necessary for $0 < \alpha_2 < 1$, and by elementary computations one can show that this is equivalent to the inequality

$$\sinh 2\beta E_1\, \sinh 2\beta E_2 > 1$$

Nonsmooth Symbols or $T = T_c$

A problem occurs in the analysis just outlined when the inequality $0 < \alpha_2 < 1$ does not hold. There are two separate possibilities, $\alpha_2 > 1$ or $\alpha_2 = 1$. First, we consider the latter case. For fixed $E_1$ and $E_2$, this happens for exactly one fixed value of the constant $\beta_c = 1/kT_c$ and the corresponding temperature $T_c$ is called the critical temperature. The “strong Szegö
limit theorem’’ does not apply since our generating For the above factors, we normalize so that the
function is of the form geometric mean is 1. Then we may assume that
1=2 the factors þ , ( þ = ) are 1 at zero and
i ð1 1 ei Þð1 ei Þ infinity, respectively, and this defines the loga-
ðe Þ ¼ ½16
ð1 1 ei Þð1 ei Þ rithms for the first product. The E( ) term is the
constant in Szegö’s theorem, and the argument of
In 1968, Fisher and Hartwig raised a conjecture
a term of the form (1 ei(s r ) ) is taken between
about Dn () for nonsmooth which included the
=2 and =2.
above example. They considered generating func-
In the case where R = 1, the conjecture is known
tions of the form
to hold if < > 1=2 and the function b satisfies the
Y
R conditions of Szegö’s theorem and is infinitely
ðei Þ ¼ ðei Þ j ;j ðeiðj Þ Þ ½17 differentiable. The theorem also has an extension
j¼1
to the case where < < 1=2, with 2 not an
where integer, as long as the Fourier coefficients are
defined as the coefficients of a distribution.
; ðei Þ ¼ ð2 2 cos Þ eiðÞ ; 0 < < 2 If we apply the theorem to the generating function
< > 1=2, and is not an integer. The function from [16]
is assumed to be a smooth function. Using the 1=2
i i 1 1 ei
Fisher–Hartwig notation, the symbol of interest in ðe Þ0;1=2 ðe Þ ¼ 0;1=2 ðei Þ
the Ising model from eqn [16] can be written as 1 1 ei
and E is a constant whose value they did not h0;0 n;n i ¼ Dn ð0;1=2 Þ n1=4 Gð1=2ÞGð3=2Þ
identify. The constant was later computed to be
and thus this limit is also zero.
Y
R The proof of the Fisher–Hartwig conjecture is
ij j þj ij j j
E
ðÞ ¼ Eð Þ þ ðe Þ ðe Þ much more complicated than the proof of the
j¼1
Y ‘‘strong Szegö limit theorem.’’ For an indication of
ð1 eiðs r Þ Þðs þs Þðr r Þ how it is proved, note that if we consider the
1s6¼rR generating function 0, , the Fourier coefficients
Y
R
Gð1 þ j þ j ÞGð1 þ j j Þ are (sin )=[(n )] and hence the matrix is
Cauchy and the determinant can be computed
Gð1 þ 2j Þ
j¼1 exactly. From this the asymptotics can be derived
where G(z) is the Barnes G-function satisfying and they yield a special case of the Fisher–Hartwig
conjecture. The main idea in extending the result to
Gð1 þ zÞ ¼ ðzÞGðzÞ a symbol of the form
and is defined by ðei Þ0; ðei Þ
2
Gð1 þ zÞ ¼ ð2Þz=2 eðzþ1Þz=2 z =2 is to prove that the limit of
1
Y z k zþz2 =2k Dn ð 0; Þ
1þ e
k¼1
k Dn ð ÞDn ð0; Þ
exists. The proof uses much of the same trace-class In the example above, 1 = 1=2, 2 = 1=2,
approach used in proving the ‘‘strong Szegö limit 1 = 0, 2 = , n1 = 1, and n2 = 1. The result for
theorem,’’ although the results are more compli- the counterexample, combined with what is known
cated. These ideas are then extended for R > 1 and for the case of integer values of and , leads to the
also more general and . following generalized conjecture. Suppose
It should be noted that in this article the Fisher–
Y
R
Hartwig conjecture does not always hold. If we ðei Þ ¼ k k ;k ;j
consider the function j¼1
j j
PR
1; < < 0 for some set of indices k. Define Q(k) = (kj )2
ðei Þ ¼ j=1
1; 0 < < (jk )2 . Let Q = maxk <(Q(k)) and
then K ¼ fk j <ðQðkÞÞ ¼ Qg
The generalized asymptotic formula is conjectured
0, if k is even
n ¼ to be
2i/ðkÞ, if k is odd X
Dn ð Þ ¼ Gð k ÞnQðkÞ E
k þ oðjGðÞjn nQ Þ
The matrix Tn () is antisymmetric and, if n is odd, k2K
Dn () = 0. If n is even, using elementary row and
column operations, the determinant can be put in It may turn out that there is only one element in K
block form with each block of Cauchy type. The and for these symbols there is a unique representa-
determinant can then be evaluated to find tion that yields the highest power in the exponent of
the asymptotic expansion. These are the symbols for
Dn ðÞ ðiÞn n1=2 K which the original Fisher–Hartwig conjecture should
be true and it is now confirmed in these cases. For
where K is a certain constant.
example, the conjecture is known to hold for R > 1
It is instructive to note that
when j<r j < 1=2 and j<r j < 1=2.
ðei Þ ¼ 0;1=2 ðei Þ0;1=2 ðeiðÞ Þ
¼ 0;1=2 ðei Þ0;1=2 ðeiðÞ Þ
and thus that this particular symbol has two Symbols with Nonzero Index or T > Tc
representations of the type given in [17] and each
would give a different asymptotic expansion of the The last possibility in computing the correlation
determinant if the conjecture were true for this set of asymptotics is the case where 2 > 1. Note that, for
parameters. Hence, it is clear that the conjecture fixed E1 and E2 , there is exactly one value of
must fail to hold in this case. = 1=kT where
However, this example indicates that there might 1 1 z2
be a generalization of the original conjecture of 2 ¼ z1 ¼1
1 þ z2
Fisher and Hartwig. If
For values of T > Tc , we have that the symbol
; ðeið Þ Þ ¼ ;; 1=2
ð1 1 ei Þð1 2 ei Þ
then ð1 1 ei Þð1 2 ei Þ
Y
R is the same as
¼ j ;j ;j
1=2
j¼1 ð1 1 ei Þð1 ð1/2 Þei Þ
ei
it is also the case that ð1 1 ei Þð1 ð1/2 Þei Þ
Y
R with the argument chosen so that the symbol is
¼
j ;j þnj ;j positive at . Except for the extra factor of ei , this
j¼1 is the same type of smooth symbol that was
considered earlier (see the section ‘‘Strong Szegö
where limit theorem’’). However, a factor of ei can change
X
R Y
R the asymptotics considerably as can be seen by
nj ¼ 0 and
statistics correspond to singular symbols. For basic random-matrix theory information see Mehta (1991), and for connections between the circular unitary ensemble and Toeplitz determinants, see Hughes et al. (2001), Tracy and Widom (1993), and Widom (1994).

See also: Integrable Systems in Random Matrix Theory; Two-Dimensional Ising Model.

Further Reading

Böttcher A and Silbermann B (1990) Analysis of Toeplitz Operators. Berlin: Akademie-Verlag.
Böttcher A and Silbermann B (1998) Introduction to Large Truncated Toeplitz Matrices. Berlin: Springer.
Ehrhardt T (2001) A status report on the asymptotic behavior of Toeplitz determinants with Fisher–Hartwig singularities. Operator Theory: Advances and Applications 124: 217–241.
Ehrhardt T and Silbermann B (1997) Toeplitz determinants with one Fisher–Hartwig singularity. Journal of Functional Analysis 148: 229–256.
Fisher ME and Hartwig RE (1968) Toeplitz determinants: some applications, theorems, and conjectures. Advances in Chemical Physics 15: 333–353.
Hughes C, Keating JP, and O’Connell N (2001) On the characteristic polynomial of a random unitary matrix. Communications in Mathematical Physics 220: 429–451.
McCoy BM and Wu TT (1973) The Two-Dimensional Ising Model. Cambridge, MA: Harvard University Press.
Mehta ML (1991) Random Matrices. San Diego: Academic Press.
Montroll EW, Potts RB, and Ward JC (1963) Correlations and spontaneous magnetization of the two-dimensional Ising model. Journal of Mathematical Physics 4(2): 308–322.
Onsager L (1971) The Ising model in two dimensions. In: Mills RE, Ascher E, and Jaffee RI (eds.) Critical Phenomena in Alloys, Magnets, and Superconductors, pp. 3–12. New York: McGraw-Hill.
Szegö G (1915) Ein Grenzwertsatz über die Toeplitzschen Determinanten einer reellen positiven Funktion. Mathematische Annalen 76: 490–503.
Szegö G (1952) On Certain Hermitian Forms Associated with the Fourier Series of a Positive Function, pp. 222–238. Lund: Festschrift Marcel Riesz.
Tracy CA and Widom H (1993) Introduction to random matrices. In: Helminck GF (ed.) Geometric and Quantum Aspects of Integrable Systems (Scheveningen, 1992), Lecture Notes in Physics, vol. 424, pp. 103–130. Berlin: Springer.
Widom H (1994) Random Hermitian matrices and (nonrandom) Toeplitz matrices. In: Basor E and Gohberg I (eds.) Toeplitz Operators and Related Topics (Santa Cruz, CA, 1992), Oper. Theory Adv. Appl., vol. 71, pp. 9–15. Basel: Birkhäuser.
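
The strong Szegö limit theorem (Theorem 1) can be checked numerically on a simple smooth symbol. The sketch below is not taken from the article. It assumes the test symbol $\varphi(e^{i\theta}) = \exp(\cos\theta)$, for which $\log\varphi$ has $s_{\pm 1} = 1/2$ and all other $s_k = 0$, so that $G(\varphi) = 1$ and $E(\varphi) = e^{1/4}$. The helper names (`fourier_coefficients`, `toeplitz_determinant`, `szego_prediction`, `example_symbol`) and the quadrature size are illustrative choices, and the Fourier coefficients [3] are approximated by an equispaced sum rather than computed exactly.

```python
import cmath
import math


def fourier_coefficients(symbol, kmax, samples=256):
    """Approximate phi_k of eqn [3] for |k| <= kmax by an equispaced sum."""
    values = [symbol(cmath.exp(2j * math.pi * m / samples)) for m in range(samples)]
    coeffs = {}
    for k in range(-kmax, kmax + 1):
        total = sum(v * cmath.exp(-2j * math.pi * k * m / samples)
                    for m, v in enumerate(values))
        coeffs[k] = total / samples
    return coeffs


def determinant(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(r) for r in rows]
    n = len(a)
    det = 1.0 + 0.0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0.0:
            return 0.0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def toeplitz_determinant(symbol, n, samples=256):
    """D_n(phi) = det (phi_{i-j}), i, j = 0..n-1."""
    c = fourier_coefficients(symbol, n, samples)
    return determinant([[c[i - j] for j in range(n)] for i in range(n)])


def szego_prediction(symbol, n, kmax=20, samples=256):
    """G(phi)^n E(phi), with s_k the Fourier coefficients of log(phi)."""
    s = fourier_coefficients(lambda z: cmath.log(symbol(z)), kmax, samples)
    g = cmath.exp(s[0])
    e = cmath.exp(sum(k * s[k] * s[-k] for k in range(1, kmax + 1)))
    return g ** n * e


def example_symbol(z):
    """Assumed test symbol: exp(cos(theta)) written as exp((z + 1/z)/2), |z| = 1."""
    return cmath.exp(0.5 * (z + 1.0 / z))


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n,
              toeplitz_determinant(example_symbol, n).real,
              szego_prediction(example_symbol, n).real)
```

For this symbol the two printed columns should agree more and more closely as $n$ grows. The principal branch of the logarithm is adequate here only because the symbol is positive with winding number zero; a symbol with nonzero index or a jump would need a different treatment, as the later sections of the article indicate.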
S0 A ¼ A ; for all A 2 M
Note that if one defines F0 A0 = A0
, for all A0 2
This operator extends to a closed antilinear operator M0 , and takes its closure F, then one has the relations
S defined on a dense subset of H. Let be the
unique positive, self-adjoint operator and J the ¼ FS; 1 ¼ SF; F ¼ J1=2
252 Tomita–Takesaki Modular Theory
Theorem 4 There exists a -strongly continuous setting of faithful normal functionals on von
map R 3 t 7! Ut 2 M such that Neumann algebras M of any type, enabling the
definition of noncommutative Lp spaces, Lp (M, ).
(i) Ut is unitary for all t 2 R;
(ii) Utþs = Ut !t (Us ) for all s,t 2 R; and
(iii) t (A) = Ut !t (A)Ut for all A 2 M and t 2 R. Modular Invariants and the Classification
The 1-cocycle {Ut } is commonly called the cocycle of von Neumann Algebras
derivative of with respect to ! and one writes
As already mentioned, the modular structure carries
Ut = (D : D!)t . There is a chain rule for this
information about the algebra. This is best evi-
derivative, as well: If , , and are faithful normal
denced in the structure of type III factors. As this
states on M, then (D : D)t = (D : D)t (D : D)t ,
theory is rather involved, only a sketch of some of
for all t 2 R. More can be said about the cocycle
the results can be given.
derivative if the states satisfy any of the conditions
If M is a type III algebra, then its crossed
in the following theorem.
product N = M o ! R relative to the modular
Theorem 5 The following conditions are automorphism group of any faithful normal state
equivalent: ! on M is a type II1 algebra with a faithful
semifinite normal trace such that t = et ,
(i) is {!t }-invariant; t 2 R, where is the dual of ! on N . Moreover,
(ii) ! is {t }-invariant; the algebra M is isomorphic to the cross product
(iii) there exists a unique positive injective operator $h$ affiliated with $\mathcal{M}^{\sigma^\phi} \cap \mathcal{M}^{\sigma^\omega}$ such that $\omega(\cdot) = \phi(h\,\cdot) = \phi(h^{1/2} \cdot h^{1/2})$;
(iv) there exists a unique positive injective operator $h'$ affiliated with $\mathcal{M}^{\sigma^\phi} \cap \mathcal{M}^{\sigma^\omega}$ such that $\phi(\cdot) = \omega(h'\,\cdot) = \omega(h'^{1/2} \cdot h'^{1/2})$;
(v) the norms of the linear functionals $\omega + i\phi$ and $\omega - i\phi$ are equal; and
(vi) $\sigma^\omega_t \sigma^\phi_s = \sigma^\phi_s \sigma^\omega_t$, for all $s, t \in \mathbb{R}$.

The conditions in Theorem 5 turn out to be equivalent to the cocycle derivative being a representation.

Theorem 6 The cocycle $\{U_t\}$ intertwining $\{\sigma^\omega_t\}$ with $\{\sigma^\phi_t\}$ is a group representation of the additive group of reals if and only if $\phi$ and $\omega$ satisfy the conditions in Theorem 5. In that case, $U(t) = h^{it}$.

The operator $h' = h^{-1}$ in Theorem 5 is called the Radon–Nikodym derivative of $\phi$ with respect to $\omega$ (often denoted by $d\phi/d\omega$), due to the following result, which, if the algebra $\mathcal{M}$ is abelian, is the well-known Radon–Nikodym theorem from measure theory.

Theorem 7 If $\phi$ and $\omega$ are normal positive linear functionals on $\mathcal{M}$ such that $\phi(A) \le \omega(A)$, for all positive elements $A \in \mathcal{M}$, then there exists a unique element $h^{1/2} \in \mathcal{M}$ such that $\phi(\cdot) = \omega(h^{1/2} \cdot h^{1/2})$ and $0 \le h^{1/2} \le 1$.

The analogies with measure theory are not accidental, although these are not discussed in detail here. Indeed, any normal trace on a (finite) von Neumann algebra $\mathcal{M}$ gives rise to a noncommutative integration theory in a natural manner. Modular theory affords an extension of this theory to the

$\mathcal{N} \rtimes \mathbb{R}$, and this decomposition is unique in a very strong sense. This structure theorem entails the existence of important algebraic invariants for $\mathcal{M}$, which has many consequences, one of which is made explicit here.
If $\omega$ is a faithful normal state of a von Neumann algebra $\mathcal{M}$ induced by $\Omega$, let $\Delta_\omega$ denote the modular operator associated to $(\mathcal{M}, \Omega)$ and $\mathrm{sp}\,\Delta_\omega$ denote the spectrum of $\Delta_\omega$. The intersection
\[ S'(\mathcal{M}) = \bigcap \mathrm{sp}\,\Delta_\omega \]
over all faithful normal states $\omega$ of $\mathcal{M}$ is an algebraic invariant of $\mathcal{M}$.

Theorem 8 Let $\mathcal{M}$ be a factor acting on a separable Hilbert space. If $\mathcal{M}$ is of type III, then $0 \in S'(\mathcal{M})$; otherwise, $S'(\mathcal{M}) = \{0, 1\}$ if $\mathcal{M}$ is of type I$_\infty$ or II$_\infty$ and $S'(\mathcal{M}) = \{1\}$ if not. Let $\mathcal{M}$ now be a factor of type III.
(i) $\mathcal{M}$ is of type III$_\lambda$, $0 < \lambda < 1$, if and only if $S'(\mathcal{M}) = \{0\} \cup \{\lambda^n \mid n \in \mathbb{Z}\}$.
(ii) $\mathcal{M}$ is of type III$_0$ if and only if $S'(\mathcal{M}) = \{0, 1\}$.
(iii) $\mathcal{M}$ is of type III$_1$ if and only if $S'(\mathcal{M}) = [0, \infty)$.

In certain physically relevant situations, the spectra of the modular operators of all faithful normal states coincide, so that Theorem 8 entails that it suffices to compute the spectrum of any conveniently chosen modular operator in order to determine the type of $\mathcal{M}$. In other such situations, there are distinguished states $\omega$ such that $S'(\mathcal{M}) = \mathrm{sp}\,\Delta_\omega$. One such example is provided by asymptotically abelian systems. A von Neumann algebra $\mathcal{M}$ is said to be "asymptotically abelian" if there exists a sequence $\{\alpha_n\}_{n \in \mathbb{N}}$ of automorphisms of $\mathcal{M}$ such that the limit of $\{A\alpha_n(B) - \alpha_n(B)A\}_{n \in \mathbb{N}}$ in
254 Tomita–Takesaki Modular Theory
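the strong operator topology is zero, for all $A, B \in \mathcal{M}$. If the state $\omega$ is $\alpha_n$-invariant, for all $n \in \mathbb{N}$, then $\mathrm{sp}\,\Delta_\omega$ is contained in $\mathrm{sp}\,\Delta_\phi$, for all faithful normal states $\phi$ on $\mathcal{M}$, so that $S'(\mathcal{M}) = \mathrm{sp}\,\Delta_\omega$. If, moreover, $\mathrm{sp}\,\Delta_\omega = [0, \infty)$, then $\mathrm{sp}\,\Delta_\omega = \mathrm{sp}\,\Delta_\phi$, for all $\phi$ as described.

Self-Dual Cones
Let $j : \mathcal{M} \to \mathcal{M}'$ denote the antilinear $*$-isomorphism defined by $j(A) = JAJ$, $A \in \mathcal{M}$. The natural positive cone $\mathcal{P}^\natural$ associated with the pair $(\mathcal{M}, \Omega)$ is defined as the closure, in $\mathcal{H}$, of the set of vectors
\[ \{ A j(A) \Omega \mid A \in \mathcal{M} \} \]
Let $\mathcal{M}_+$ denote the set of all positive elements of $\mathcal{M}$. The following theorem collects the main attributes of the natural cone.

Theorem 9
(i) $\mathcal{P}^\natural$ coincides with the closure in $\mathcal{H}$ of the set $\{\Delta^{1/4} A \Omega \mid A \in \mathcal{M}_+\}$.
(ii) $\Delta^{it} \mathcal{P}^\natural = \mathcal{P}^\natural$ for all $t \in \mathbb{R}$.
(iii) $J\Phi = \Phi$ for all $\Phi \in \mathcal{P}^\natural$.
(iv) $A j(A) \mathcal{P}^\natural \subset \mathcal{P}^\natural$ for all $A \in \mathcal{M}$.
(v) $\mathcal{P}^\natural$ is a pointed, self-dual cone whose linear span coincides with $\mathcal{H}$.
(vi) If $\Phi \in \mathcal{P}^\natural$, then $\Phi$ is cyclic for $\mathcal{M}$ if and only if $\Phi$ is separating for $\mathcal{M}$.
(vii) If $\Phi \in \mathcal{P}^\natural$ is cyclic, and hence separating, for $\mathcal{M}$, then the modular conjugation and the natural cone associated with the pair $(\mathcal{M}, \Phi)$ coincide with $J$ and $\mathcal{P}^\natural$, respectively.
(viii) For every normal positive linear functional $\phi$ on $\mathcal{M}$, there exists a unique vector $\Phi_\phi \in \mathcal{P}^\natural$ such that $\phi(A) = \langle \Phi_\phi, A \Phi_\phi \rangle$ for all $A \in \mathcal{M}$.

In fact, the algebras $\mathcal{M}$ and $\mathcal{M}'$ are uniquely characterized by the natural cone $\mathcal{P}^\natural$ [4]. In light of (viii), if $\alpha$ is an automorphism of $\mathcal{M}$, then
\[ V(\alpha) \Phi_\phi = \Phi_{\phi \circ \alpha^{-1}} \]
defines an isometric operator on $\mathcal{P}^\natural$, which by (v) extends to a unitary operator on $\mathcal{H}$. The map $\alpha \mapsto V(\alpha)$ defines a unitary representation of the group of automorphisms $\mathrm{Aut}(\mathcal{M})$ on $\mathcal{M}$ in such a manner that $V(\alpha) A V(\alpha)^{-1} = \alpha(A)$ for all $A \in \mathcal{M}$ and $\alpha \in \mathrm{Aut}(\mathcal{M})$. Indeed, one has the following:

Theorem 10 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$. The group $\mathcal{V}$ of all unitaries $V$ satisfying
\[ V\mathcal{M}V^* = \mathcal{M}, \quad VJV^* = J, \quad V\mathcal{P}^\natural = \mathcal{P}^\natural \]
is isomorphic to $\mathrm{Aut}(\mathcal{M})$ under the above map $\alpha \mapsto V(\alpha)$, which is called the "standard implementation" of $\mathrm{Aut}(\mathcal{M})$.

Often of particular physical interest are (anti-)automorphisms of $\mathcal{M}$ leaving $\omega$ invariant. They can only be implemented by (anti)unitaries which leave the pair $(\mathcal{M}, \Omega)$ invariant. In fact, if $U$ is a unitary or antiunitary operator satisfying $U\Omega = \Omega$ and $U\mathcal{M}U^* = \mathcal{M}$, then $U$ commutes with both $J$ and $\Delta$.

Two Algebras and One State
Motivated by applications to quantum field theory, the study of the modular structures associated with one state and more than one von Neumann algebra has begun (see Borchers (2000) for references and details). Let $\mathcal{N} \subset \mathcal{M}$ be von Neumann algebras with a common cyclic and separating vector $\Omega$, and $\Delta_\mathcal{N}$, $J_\mathcal{N}$ and $\Delta_\mathcal{M}$, $J_\mathcal{M}$ denote the corresponding modular objects. The structure $(\mathcal{M}, \mathcal{N}, \Omega)$ is called a $\pm$-half-sided modular inclusion if $\Delta_\mathcal{M}^{it} \mathcal{N} \Delta_\mathcal{M}^{-it} \subset \mathcal{N}$, for all $\pm t \ge 0$.

Theorem 11 Let $\mathcal{M}$ be a von Neumann algebra with cyclic and separating vector $\Omega$. The following are equivalent:
(i) There exists a proper subalgebra $\mathcal{N} \subset \mathcal{M}$ such that $(\mathcal{M}, \mathcal{N}, \Omega)$ is a $\mp$-half-sided modular inclusion.
(ii) There exists a unitary group $\{U(t)\}$ with positive generator such that
\[ U(t)\mathcal{M}U(t)^{-1} \subset \mathcal{M}, \ \text{for all } \pm t \ge 0; \qquad U(t)\Omega = \Omega, \ \text{for all } t \in \mathbb{R} \]
Moreover, if these conditions are satisfied, then the following relations must hold:
\[ \Delta_\mathcal{M}^{it} U(s) \Delta_\mathcal{M}^{-it} = \Delta_\mathcal{N}^{it} U(s) \Delta_\mathcal{N}^{-it} = U(e^{\mp 2\pi t} s) \]
and
\[ J_\mathcal{M} U(s) J_\mathcal{M} = J_\mathcal{N} U(s) J_\mathcal{N} = U(-s) \]
for all $s, t \in \mathbb{R}$. In addition, $\mathcal{N} = U(\pm 1)\mathcal{M}U(\pm 1)^{-1}$, and if $\mathcal{M}$ is a factor, it must be type III$_1$.

The richness of this structure is further suggested by the next theorem.

Theorem 12
(i) Let $(\mathcal{M}, \mathcal{N}_1, \Omega)$ and $(\mathcal{M}, \mathcal{N}_2, \Omega)$ be $-$-half-sided, resp. $+$-half-sided, modular inclusions satisfying the condition $J_{\mathcal{N}_1} J_{\mathcal{N}_2} = J_\mathcal{M} J_{\mathcal{N}_2} J_{\mathcal{N}_1} J_\mathcal{M}$. Then the modular unitaries $\Delta_\mathcal{M}^{it}$, $\Delta_{\mathcal{N}_1}^{is}$, $\Delta_{\mathcal{N}_2}^{iu}$, $s, t, u \in \mathbb{R}$, generate a faithful continuous unitary representation of the identity component of the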
group of isometries of two-dimensional Minkowski space.
(ii) Let $\mathcal{M}$, $\mathcal{N}$, $\mathcal{N} \cap \mathcal{M}$ be von Neumann algebras with a common cyclic and separating vector $\Omega$. If $(\mathcal{M}, \mathcal{M} \cap \mathcal{N}, \Omega)$ and $(\mathcal{N}, \mathcal{M} \cap \mathcal{N}, \Omega)$ are $-$-half-sided, resp. $+$-half-sided, modular inclusions such that $J_\mathcal{N} \mathcal{M} J_\mathcal{N} = \mathcal{M}$, then the modular unitaries $\Delta_\mathcal{M}^{it}$, $\Delta_\mathcal{N}^{is}$, $\Delta_{\mathcal{N} \cap \mathcal{M}}^{iu}$, $s, t, u \in \mathbb{R}$, generate a faithful continuous unitary representation of $SL(2, \mathbb{R})/\mathbb{Z}_2$.

This has led to a further useful notion. For von Neumann algebras $\mathcal{N}$, $\mathcal{M}$ such that $\Omega$ is cyclic for $\mathcal{N} \cap \mathcal{M}$, the structure $(\mathcal{M}, \mathcal{N}, \Omega)$ is said to be a "$\pm$-modular intersection" if both $(\mathcal{M}, \mathcal{M} \cap \mathcal{N}, \Omega)$ and $(\mathcal{N}, \mathcal{M} \cap \mathcal{N}, \Omega)$ are $\pm$-half-sided modular inclusions and
\[ J_\mathcal{N} \Big[ \lim_{t \to \mp\infty} \Delta_\mathcal{N}^{it} \Delta_\mathcal{M}^{-it} \Big] J_\mathcal{N} = \lim_{t \to \mp\infty} \Delta_\mathcal{M}^{it} \Delta_\mathcal{N}^{-it} \]
where the existence of the strong operator limits is assured by the preceding assumptions. An example of the utility of this structure is the following theorem.

Theorem 13 Let $\mathcal{N}$, $\mathcal{M}$, $\mathcal{L}$ be von Neumann algebras with a common cyclic and separating vector $\Omega$. If $(\mathcal{M}, \mathcal{N}, \Omega)$ and $(\mathcal{N}', \mathcal{L}, \Omega)$ are $-$-modular intersections and $(\mathcal{M}, \mathcal{L}, \Omega)$ is a $+$-modular intersection, then the unitaries $\Delta_\mathcal{M}^{it}$, $\Delta_\mathcal{N}^{is}$, $\Delta_\mathcal{L}^{iu}$, $s, t, u \in \mathbb{R}$, generate a faithful continuous unitary representation of $SO^\uparrow(1, 2)$.

These results and their extensions to larger numbers of algebras were developed for application in algebraic quantum field theory, but one may anticipate that half-sided modular inclusions will find wider use. Modular theory has also been applied fruitfully in the theory of inclusions $\mathcal{N} \subset \mathcal{M}$ of properly infinite algebras with finite or infinite index.

Applications in Quantum Theory
The Tomita–Takesaki theory has found many applications in quantum field theory and quantum statistical mechanics. As mentioned earlier, the modular automorphism group satisfies the KMS condition, a property of physical significance in the quantum theory of many-particle systems, which includes quantum statistical mechanics and quantum field theory. In such settings, for a suitable algebra of observables $\mathcal{M}$ and state $\omega$, an automorphism group $\{\sigma_{\beta t}\}$ representing the time evolution of the system satisfies the modular condition. Hence, on the one hand, $\{\sigma_{\beta t}\}$ is the modular automorphism group of the pair $(\mathcal{M}, \Omega)$, and, on the other, $\omega$ is an equilibrium state at inverse temperature $\beta$, with all the consequences which both of these facts have.
But it has become increasingly clear that the modular objects $\Delta^{it}$, $J$, of certain algebras of observables and states encode additional physical information. In 1975, it was discovered that if one considers the algebras of observables associated with a finite-component quantum field theory satisfying the Wightman axioms, then the modular objects associated with the vacuum state and algebras of observables localized in certain wedge-shaped regions in Minkowski space have geometric content. In fact, the unitary group $\{\Delta^{it}\}$ implements the group of Lorentz boosts leaving the wedge region invariant (this property is now called modular covariance), and the modular involution $J$ implements the spacetime reflection about the edge of the wedge, along with a charge conjugation. This discovery caused some intense research activity (see Baumgärtel and Wollenberg 1992, Borchers 2000, Haag 1992).

Positive Energy
In quantum physics the time development of the system is often represented by a strongly continuous group $\{U(t) = e^{itH} \mid t \in \mathbb{R}\}$ of unitary operators, and the generator $H$ is interpreted as the total energy of the system. There is a link between modular structure and positive energy, which has found many applications in quantum field theory. This result was crucial in the development of Theorem 11 and was motivated by the 1975 discovery mentioned above, now commonly called the Bisognano–Wichmann theorem.

Theorem 14 Let $\mathcal{M}$ be a von Neumann algebra with a cyclic and separating vector $\Omega$, and let $\{U(t)\}$ be a continuous unitary group satisfying $U(t)\mathcal{M}U(-t) \subset \mathcal{M}$, for all $t \ge 0$. Then any two of the following conditions imply the third:
(i) $U(t) = e^{itH}$, with $H \ge 0$;
(ii) $U(t)\Omega = \Omega$, for all $t \in \mathbb{R}$; and
(iii) $\Delta^{it} U(s) \Delta^{-it} = U(e^{-2\pi t} s)$ and $JU(s)J = U(-s)$, for all $s, t \in \mathbb{R}$.

Modular Nuclearity and Phase Space Properties
Modular theory can be used to express physically meaningful properties of quantum "phase spaces" by a condition of compactness or nuclearity of certain maps. In its initial form, the condition was formulated in terms of the Hamiltonian, the global energy operator of theories in Minkowski space. The above indications that the modular operators carry information about the energy of the system were reinforced when it was shown that a
formulation in terms of modular operators was essentially equivalent.
Let $\mathcal{O}_1 \subset \mathcal{O}_2$ be nonempty bounded open subregions of Minkowski space with corresponding algebras of observables $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$ in a vacuum representation with vacuum vector $\Omega$, and let $\Delta$ be the modular operator associated with $(\mathcal{A}(\mathcal{O}_2), \Omega)$ (by the Reeh–Schlieder theorem, $\Omega$ is cyclic and separating for $\mathcal{A}(\mathcal{O}_2)$). For each $\lambda \in (0, 1/2)$ define the mapping $\Xi_\lambda : \mathcal{A}(\mathcal{O}_1) \to \mathcal{H}$ by $\Xi_\lambda(A) = \Delta^\lambda A \Omega$. The compactness of any one of these mappings implies the compactness of all of the others. Moreover, the $\ell^p$ (nuclear) norms of these mappings are interrelated and provide a measure of the number of local degrees of freedom of the system. Suitable conditions on the maps in terms of these norms entail the strong statistical independence condition called the split property. Conversely, the split property implies the compactness of all of these maps. Moreover, the existence of equilibrium temperature states on the global algebra of observables can be derived from suitable conditions on these norms in the vacuum sector.
The conceptual advantage of the modular compactness and nuclearity conditions compared to their original Hamiltonian form lies in the fact that they are meaningful also for quantum systems in curved spacetimes, where global energy operators (i.e., generators corresponding to global timelike Killing vector fields) need not exist.

Modular Position and Quantum Field Theory
The characterization of the relative "geometric" position of algebras based on the notions of modular inclusion and modular intersection was directly motivated by the Bisognano–Wichmann theorem. Observable algebras associated with suitably chosen wedge regions in Minkowski space provided examples whose essential structure could be abstracted for more general application, resulting in the notions presented in the preceding sections.
Theorem 12(ii) has been used to construct, from two algebras and the indicated half-sided modular inclusions, a conformal quantum field theory on the circle (compactified light ray) with positive energy. Since the chiral part of a conformal quantum field model in two spacetime dimensions naturally yields such half-sided modular inclusions, studying the inclusions in Theorem 12(ii) is equivalent to studying such field theories. Theorems 12(i) and 13 and their generalizations to inclusions involving up to six algebras have been employed to construct Poincaré-covariant nets of observable algebras (the algebraic form of quantum field theories) satisfying the spectrum condition on $(d + 1)$-dimensional Minkowski space for $d = 1, 2, 3$. Conversely, such quantum field theories naturally yield such systems of algebras.
This intimate relation would seem to open up the possibility of constructing interacting quantum field theories from a limited number of modular inclusions/intersections.

Geometric Modular Action
The fact that the modular objects in quantum field theory associated with wedge-shaped regions and the vacuum state in Minkowski space have geometric significance ("geometric modular action") was originally discovered in the framework of the Wightman axioms. As an algebraic quantum field theory (AQFT) does not rely on the concept of Wightman fields, it was natural to ask (i) when does geometric modular action hold in AQFT and (ii) which physically relevant consequences follow from this feature?
There are two approaches to the study of geometric modular action. In the first, attention is focused on modular covariance, expressed in terms of the modular groups associated with wedge algebras and the vacuum state in Minkowski space. Modular covariance has been proven to obtain in conformally invariant AQFT, in any massive theory satisfying asymptotic completeness, and also in the presence of other, physically natural assumptions. To mention only three of its consequences, both the spin–statistics theorem and the PCT theorem, as well as the existence of a continuous unitary representation of the Poincaré group acting covariantly upon the observable algebras and satisfying the spectrum condition follow from modular covariance.
In a second approach to geometric modular action, the modular involutions are the primary focus. Here, no a priori connection between the modular objects and isometries of the spacetime is assumed. The central assumption, given the state vector $\Omega$ and the von Neumann algebras of localized observables $\{\mathcal{A}(\mathcal{O})\}$ on the spacetime, is that there exists a family $\mathcal{W}$ of subsets of the spacetime such that $J_{W_1} \mathcal{R}(W_2) J_{W_1} \in \{\mathcal{R}(W) \mid W \in \mathcal{W}\}$, for every $W_1, W_2 \in \mathcal{W}$. This condition makes no explicit appeal to isometries or other special attributes and is thus applicable, in principle, to quantum field theories on general curved spacetimes.
It has been shown for certain spacetimes, including Minkowski space, that under certain additional technical assumptions, the modular involutions encode enough information to determine the dynamics of the theory, the isometry group of the spacetime, and a continuous unitary representation of the isometry group which acts covariantly upon the observables and leaves the state invariant. In certain
Topological Defects and Their Homotopy Classification 257
cases including Minkowski space, it is even possible to derive the spacetime itself from the group $\mathcal{J}$ generated by the modular involutions $\{J_W \mid W \in \mathcal{W}\}$.
The modular unitaries $\Delta_W^{it}$ enter in this approach through a condition which is designed to assure the stability of the theory, namely that $\Delta_W^{it} \in \mathcal{J}$, for all $t \in \mathbb{R}$ and $W \in \mathcal{W}$. In Minkowski space, this additional condition entails that the derived representation of the Poincaré group satisfies the spectrum condition.

Further Applications
As previously observed, through the close connection to the KMS condition, modular theory enters naturally into the equilibrium thermodynamics of many-body systems. But in recent work on the theory of nonequilibrium thermodynamics it also plays a role in making mathematical sense of the notion of quantum systems in local thermodynamic equilibrium. Modular theory has also proved to be of utility in recent developments in the theory of superselection rules and their attendant sectors, charges and charge-carrying fields.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Quantum Central-Limit Theorems; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Thermal Quantum Field Theory; Positive Maps on C*-Algebras; Two-Dimensional Models; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory.

Further Reading
Baumgärtel H and Wollenberg M (1992) Causal Nets of Operator Algebras. Berlin: Akademie-Verlag.
Borchers HJ (2000) On revolutionizing quantum field theory with Tomita's modular theory. Journal of Mathematical Physics 41: 3604–3673.
Bratteli O and Robinson DW (1981) Operator Algebras and Quantum Statistical Mechanics II. Berlin: Springer.
Connes A (1974) Caractérisation des algèbres de von Neumann comme espaces vectoriels ordonnés. Annales de l'Institut Fourier 24: 121–155.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Kadison RV and Ringrose JR (1986) Fundamentals of the Theory of Operator Algebras, vol. II. Orlando: Academic Press.
Pedersen GK (1979) C*-Algebras and Their Automorphism Groups. New York: Academic Press.
Stratila S (1981) Modular Theory in Operator Algebras. Tunbridge Wells: Abacus Press.
Takesaki M (1970) Tomita's Theory of Modular Hilbert Algebras and Its Applications, Lecture Notes in Mathematics, vol. 128. Berlin: Springer.
Takesaki M (2003) Theory of Operator Algebras II. Berlin: Springer.
stability subgroup $H$ of $\phi_0$ (the group of unbroken symmetries in this ground state):
\[ H = \{ g \in G : D(g)\phi_0 = \phi_0 \} \qquad [3] \]
In terms of this subgroup, we can find a useful characterization of the manifold $M$ of degenerate ground states. As noted above, for each $g \in G$, $\hat{U}(g)|0\rangle$ is also a ground state. However, these are not all distinct, because clearly $\hat{U}(gh)|0\rangle = \hat{U}(g)|0\rangle$ for all $h \in H$. Hence, the distinct ground states are in one-to-one correspondence with the left cosets $gH$ of $H$ in $G$, and $M$ may be identified with the quotient space $G/H$, the space of left cosets.
For example, suppose $G$ is the rotation group $SO(3)$, and $\hat{\boldsymbol{\phi}}$ belongs to the three-dimensional vector representation. If $\boldsymbol{\phi} \ne 0$ in the ground state, we may choose $\boldsymbol{\phi}_0 = (0, 0, v)$. Then, clearly, $H = SO(2)$, the group of rotations about the z-axis, and $M = SO(3)/SO(2) = S^2$, the 2-sphere. It is useful to think of $M$ as the subset of the order-parameter space comprising the possible expectation values $\boldsymbol{\phi} = \langle \hat{\boldsymbol{\phi}} \rangle$ for the various degenerate ground states. For example, in this case, $M = \{\boldsymbol{\phi} : \boldsymbol{\phi}^2 = v^2\}$.

Defect Formation
It is often possible to characterize the dynamics at finite temperature in terms of a function of the order parameter, the effective potential $V(\phi)$, which is necessarily invariant under $G$, and whose minima define the equilibrium states. At low temperatures, it has a form like $V = (\boldsymbol{\phi}^2 - v^2)^2$, whose minima occur at nonzero values of $\boldsymbol{\phi}$. But above the critical temperature $T_c$, the only minimum is at $\boldsymbol{\phi} = 0$, so the equilibrium state is symmetric under $G$. In the high-temperature phase, there may be large fluctuations in $\hat{\boldsymbol{\phi}}$, but its mean value will be zero.
Now, when the system is cooled through the phase transition, $\hat{\phi}$ will acquire a nonzero expectation value, gradually approaching one of the degenerate ground states characterized by a point of $M$. But the choice of which one is unpredictable; the symmetry breaking is spontaneous. Moreover, in a large system, there is no reason why the same choice should be made everywhere. For example, a ferromagnet cooling through its Curie point may acquire a spontaneous magnetization in different directions in different parts of the sample.
Of course, there is an energetic penalty to having a spatially varying order parameter, so it will tend to become more uniform as the temperature is lowered. But the question arises whether there may be any topological obstruction to this process. It can happen that if we choose points on $M$ in a continuous manner everywhere around the periphery of some region, it is topologically impossible to complete the process throughout its interior. Continuity may require that there are points where $\phi$ leaves the surface $M$. For example, if our ferromagnet has two opposite possible directions of easy magnetization, described by $\boldsymbol{\phi}_0$ and $-\boldsymbol{\phi}_0$, then $M$ consists essentially of these two points. Regions where $\boldsymbol{\phi} \approx \boldsymbol{\phi}_0$ and where $\boldsymbol{\phi} \approx -\boldsymbol{\phi}_0$ must be separated by domain walls across which $\boldsymbol{\phi}$ varies smoothly from one to the other.

Homotopy Groups
To classify the various possible types of defect, we need to consider the homotopy groups of the manifold $M$ of degenerate ground states. In this section, we briefly review the necessary definitions.
A path in $M$ is a map $\phi : I \to M$ from the unit interval $I = [0, 1] \subset \mathbb{R}$. We choose a base point $m_0 \in M$ (which may be identified with $\phi_0$), and consider loops in $M$, paths such that $\phi(0) = \phi(1) = m_0$. We say that two loops are homotopic, and write $\phi \sim \psi$, if one can be continuously deformed into the other within $M$, that is, if there exists a map $\chi : I^2 \to M$ such that
\[ \chi(0, t) = \phi(t) \quad \text{and} \quad \chi(1, t) = \psi(t) \qquad [4] \]
for all $t$, and
\[ \chi(s, 0) = \chi(s, 1) = m_0 \qquad [5] \]
for all $s$. This is an equivalence relation. The set $\pi_1(M)$ is the set of equivalence classes $[\phi]$ of loops under this relation.
On the set of loops, we may define a product $\phi \cdot \psi$, comprising the loop $\phi$ followed by $\psi$ (see Figure 1). Explicitly,
\[ (\phi \cdot \psi)(t) = \begin{cases} \phi(2t), & 0 \le t \le \tfrac{1}{2} \\ \psi(2t - 1), & \tfrac{1}{2} < t \le 1 \end{cases} \qquad [6] \]

Figure 1 The product of loops (curves labeled $\phi$, $\psi$, and $\phi\psi$).

It is easy to show that if $\phi \sim \phi'$ and $\psi \sim \psi'$, then $\phi \cdot \psi \sim \phi' \cdot \psi'$. Hence, this defines a product on $\pi_1(M)$, by $[\phi][\psi] = [\phi \cdot \psi]$. So equipped, $\pi_1(M)$ becomes the
fundamental group or first homotopy group of $M$. Note that the identity is the equivalence class $[\phi_0]$ of the trivial loop with $\phi_0(t) \equiv m_0$, while the inverse is $[\phi]^{-1} = [\tilde{\phi}]$, where the map $\tilde{\phi}$ is the reverse of $\phi$: $\tilde{\phi}(t) = \phi(1 - t)$.
Strictly speaking, we should write $\pi_1(M, m_0)$ in place of $\pi_1(M)$. However, for any path-connected space, the groups $\pi_1(M, m_0)$ and $\pi_1(M, m_0')$ are always isomorphic, and, more importantly, the same is true for any coset space $M = G/H$, where $G$ is a Lie group and $H$ a closed subgroup. For a general manifold $M$, $\pi_1(M)$ is not necessarily abelian, but it is so if $M$ is a Lie group, or more generally a Riemannian symmetric space. The space $M$ is said to be simply connected if $\pi_1(M) = 0$, the group comprising only the identity element, $0 = \{[\phi_0]\}$. (Although $\pi_1(M)$ is not always abelian, it is conventional for homotopy groups to use an additive notation and represent the trivial group by 0 rather than 1.)
The $n$th homotopy group $\pi_n(M)$ may be defined similarly, as a set of equivalence classes of maps $\phi : I^n \to M$ such that $\phi$ maps the entire boundary $\partial I^n$ to the base point $m_0$. Two such maps are homotopic ($\phi \sim \psi$) if there exists a map $\chi : I^{n+1} \to M$ such that
\[ \chi(0, t) = \phi(t) \quad \text{and} \quad \chi(1, t) = \psi(t) \qquad [7] \]
for all $t = (t_1, \ldots, t_n)$, and, for each $s \in I$, $\chi(s, t) = m_0$ for all $t \in \partial I^n$. The product is defined by
\[ (\phi \cdot \psi)(t_1, \ldots, t_n) = \begin{cases} \phi(2t_1, t_2, \ldots, t_n), & 0 \le t_1 \le \tfrac{1}{2} \\ \psi(2t_1 - 1, t_2, \ldots, t_n), & \tfrac{1}{2} < t_1 \le 1 \end{cases} \qquad [8] \]
The choice of $t_1$ rather than any other $t_j$ is arbitrary; all choices yield homotopic product maps. The product again defines a product on $\pi_n(M)$, which thereby becomes a group, the $n$th homotopy group. One new feature is that, for all $n > 1$, $\pi_n(M)$ is always abelian.
Note that since the entire boundary of $I^n$ is mapped to a single point, it is possible to collapse it, and talk instead about maps from the $n$-sphere $S^n$ to $M$, taking one designated point to $m_0$. The fact that $\pi_n(M)$ is nontrivial indicates the existence in $M$ of closed $n$-surfaces that cannot be smoothly shrunk to a point. In particular, it is worth noting that, for any $n$, $\pi_n(S^n) = \mathbb{Z}$, the additive group of integers, while $\pi_m(S^n) = 0$ for all $m < n$.
A special case is $n = 0$. Here, $S^0$ comprises two points only, and since one of them is always mapped to $m_0$, we really have to consider maps from a single point to $M$, that is, points in $M$. Two points are homotopic if they can be joined by a path in $M$. Thus, $\pi_0(M)$ may be identified with the set of path-connected components of $M$. Note, however, that in general no product can be defined on $\pi_0(M)$, so $\pi_0(M)$ should be called the zeroth homotopy set (not group). There is an important exception, however: if $G$ is a Lie group, and $G_0$ its connected subgroup (the subset of elements joined by paths to the identity $e$), then $\pi_0(G)$ may be identified with the quotient group $G/G_0$. Note, however, that this group $\pi_0(G) = G/G_0$ is not necessarily abelian.

Classification of Defects
We now turn to the classification of defects by means of homotopy groups. It will be useful to start with simple specific examples in three-dimensional space, $\mathbb{R}^3$.
First, suppose again that $\boldsymbol{\phi}$ belongs to the vector representation of $G = SO(3)$. Then $M = SO(3)/SO(2) = S^2$ may be identified with the sphere $M = \{\boldsymbol{\phi} : \boldsymbol{\phi}^2 = v^2\}$ in $\boldsymbol{\phi}$ space. Consider a closed surface $S$, an embedding of a 2-sphere $S^2$ in $\mathbb{R}^3$. Assume that everywhere on $S$ the field $\boldsymbol{\phi}(\mathbf{r})$ has one of the ground-state values. In other words, we have a map $\boldsymbol{\phi} : S \to M$, from one 2-sphere to another. The map $\boldsymbol{\phi}$ can be extended to a map from the interior of $S$ to $M$ only if it belongs to the trivial homotopy class $[\boldsymbol{\phi}_0] \in \pi_2(M)$, where $\phi_0 : I^2 \to M : (t_1, t_2) \mapsto m_0 = eH$. In all other cases, there must be at least one point where $\boldsymbol{\phi}(\mathbf{r}) = 0$; this is a point defect. The second homotopy group in this case is $\pi_2(S^2) = \mathbb{Z}$, so the possible point defects, or monopoles, are labeled by an integer $n \in \mathbb{Z}$, the winding number. (An example of a map with winding number $n$ is (in spherical polars) $(r, \theta, \varphi) \mapsto (v, \theta, n\varphi)$.)
More generally, point defects in $\mathbb{R}^d$ are classified by $\pi_{d-1}(M)$. A map $\phi$ from a closed $(d - 1)$-dimensional surface $S \subset \mathbb{R}^d$ to $M$ can be extended to the interior of $S$ if and only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_{d-1}(M)$. If this is not the case, there must be at least one point around which $\phi(\mathbf{r})$ leaves the surface $M$, although in general it is not required to vanish anywhere.
Second, take the case where $\phi$ is a single complex field, and $G$ is the phase symmetry group $U(1)$. In this case, $H$ is the subgroup $1 = \{1\} \subset G$. Thus, $M = U(1)/1 = S^1$; this manifold may be identified with the circle $\{\phi : |\phi| = v\}$ in the order-parameter space. Now consider a closed loop $C$ in space, an embedding of $S^1$ in $\mathbb{R}^3$ (see Figure 2). Suppose that on $C$, $\phi(\mathbf{r})$ takes one of the ground-state values, say $\phi(\mathbf{r}) = v \exp[i\alpha(\mathbf{r})]$. If $S$ is some surface with boundary $C$, then the map $\phi : C \to M$ can be extended to a map $\phi : S \to M$ if and only if it belongs to the trivial homotopy class $[\phi_0] \in \pi_1(M)$. If it does not, then there must be at least one point on $S$ within $C$ where $\phi = 0$. Moreover, this
also remain in one homotopy class. Thus, we have defined a map $\partial : \pi_{n+1}(M) \to \pi_n(H) : [\phi] \mapsto [\tilde{\phi}]$.
It is also easy to see that the image of $\partial : \pi_{n+1}(M) \to \pi_n(H)$ is the kernel of $i : \pi_n(H) \to \pi_n(G)$, because the $n$-surface in $H$ defined by $\tilde{\phi}$ is necessarily homotopically trivial in $G$. Similarly, one can see that the image of $p : \pi_{n+1}(G) \to \pi_{n+1}(M)$ is the kernel of $\partial : \pi_{n+1}(M) \to \pi_n(H)$.
Putting all these results together, we see that there is a (semi-infinite) exact sequence connecting all the homotopy groups:
\[
\begin{aligned}
\cdots &\xrightarrow{\,p\,} \pi_{n+1}(M) \xrightarrow{\,\partial\,} \pi_n(H) \xrightarrow{\,i\,} \pi_n(G) \xrightarrow{\,p\,} \pi_n(M) \\
&\xrightarrow{\,\partial\,} \pi_{n-1}(H) \cdots \xrightarrow{\,i\,} \pi_1(G) \xrightarrow{\,p\,} \pi_1(M) \xrightarrow{\,\partial\,} \pi_0(H) \qquad [9] \\
&\xrightarrow{\,i\,} \pi_0(G) \xrightarrow{\,p\,} \pi_0(M)
\end{aligned}
\]
This sequence makes it easy to compute most of the low-dimensional homotopy groups of $M$. Let us begin with $\pi_0(M)$, which merely labels its disconnected components. As noted earlier, for the Lie group $G$, $\pi_0(G)$ is the quotient group $\pi_0(G) = G/G_0$, where $G_0$ is the connected subgroup of $G$. Now the image of $\pi_0(H)$ under $i$ is clearly the set of connected components of $G$ that contain elements of $H$, so if $G$ has $m$ connected components, and $n$ of them contain elements of $H$, then $\pi_0(M)$ has $m/n$ elements (see Figure 5).

Figure 5 The disconnected components of $G$ are shaded, those of $H$ are cross-hatched. Here $\pi_0(M)$ has two elements.

Next, we note that, for all the higher homotopy groups, disconnected pieces are irrelevant. Since a loop, for example, starting at $m_0$ must remain within its connected component $M_0 \subset M$, it follows that $\pi_1(M) = \pi_1(M_0)$, and similarly $\pi_n(M) = \pi_n(M_0)$ for all $n > 1$. So one can ignore any disconnected parts of the symmetry group $G$, and assume from now on that $\pi_0(G) = 0$. Moreover, it is always possible to replace $G$ by its simply connected covering group, replacing $SO(3)$, for example, by $SU(2)$. Thus, we may also assume that $\pi_1(G) = 0$. Then the section of the exact sequence in the second line of [9] becomes
\[ 0 \xrightarrow{\,p\,} \pi_1(M) \xrightarrow{\,\partial\,} \pi_0(H) \xrightarrow{\,i\,} 0 \]
which implies that the two groups in the center are isomorphic:
\[ \pi_1(M) = \pi_0(H) \qquad [10] \]
For example, if the symmetry group $G = SO(3)$ is completely broken, so that $H = 1$, then replacing $G$ by $\tilde{G} = SU(2)$ requires replacing $H$ by $\tilde{H} = \{+1, -1\} \simeq \mathbb{Z}_2$, hence also $\pi_1(M) = \pi_0(\tilde{H}) = \mathbb{Z}_2$; there is only one nontrivial class of linear defects in this model.
To find $\pi_2(M)$, we need a standard theorem about Lie groups, namely that the second homotopy group of any Lie group is trivial: for any $G$, $\pi_2(G) = 0$. (No details of the proof are given here. It derives from the fact that a generic element $g \in G$ belongs to a unique one-parameter subgroup $\{\exp(tX), t \in \mathbb{R}\} \subset G$, where $X$ is an element of the Lie algebra of $G$. Thus, all the points on a surface in $G$ may be joined by these paths to the identity, and the surface may then be shrunk along the resulting cone. There are exceptional elements for which this is not true, but it can be shown that in a $d$-dimensional group they lie on $(d - 3)$-dimensional surfaces, so any 2-surface can be smoothly deformed to avoid them.)
It follows from this theorem that another section of the exact sequence is
\[ 0 \xrightarrow{\,p\,} \pi_2(M) \xrightarrow{\,\partial\,} \pi_1(H) \xrightarrow{\,i\,} 0 \]
which again implies an isomorphism:
\[ \pi_2(M) = \pi_1(H) \qquad [11] \]
For example, if $G = SO(3)$ and $H = SO(2)$, or equivalently $\tilde{G} = SU(2)$ and $\tilde{H} = U(1)$ (a double cover of the $SO(2)$), then $\pi_2(M) = \pi_1(\tilde{H}) = \mathbb{Z}$, so point defects in this theory are labeled by an integer winding number.

Examples
The simplest continuous symmetry is the $U(1)$ phase symmetry $\hat{\phi} \mapsto \hat{\phi} e^{i\alpha}$ of a complex field. In a weakly interacting Bose gas, below the Bose–Einstein condensation temperature, or in superfluid helium-4, a macroscopic fraction of the atoms occupies a single quantum state, and $\hat{\phi}$ acquires a nonzero expectation value, $\langle \hat{\phi} \rangle = \phi$, whose phase is arbitrary, so the symmetry is completely broken to $H = 1$. Thus, $M = S^1$; we have a circle of equivalent degenerate ground states. (This corresponds to
$\mathbf{l}$ defines the orbital angular momentum state by $\mathbf{l} \cdot \mathbf{L} = 1$, while $\mathbf{d}$ defines the spin quantization axis, such that $\mathbf{d} \cdot \mathbf{S} = 0$. The manifold $M_A$ for this phase is
\[ M_A = [SO(3) \times S^2]/\mathbb{Z}_2 \qquad [13] \]
where the $\mathbb{Z}_2$ is present because $(\mathbf{m}, \mathbf{n}, \mathbf{d})$ and $(-\mathbf{m}, -\mathbf{n}, -\mathbf{d})$ represent the same state. If, for example, we take $\mathbf{l}$ and $\mathbf{d}$ in the z-direction, the unbroken symmetry subgroup is
\[ H_A = SO(2)_{L_z + Y} \times [SO(2)_{S_z} \ltimes \mathbb{Z}_2] \qquad [14] \]
where the nontrivial element of $\mathbb{Z}_2$ may be taken to be $e^{i\pi(S_x + L_z)}$. The covering group of $G$ is, of course,
\[ \tilde{G} = \mathbb{R}_Y \times SU(2)_L \times SU(2)_S \qquad [15] \]
Correspondingly,
\[ \tilde{H}_A = \mathbb{R}_{L_z + Y} \times [U(1)_{S_z} \ltimes \mathbb{Z}_4] \qquad [16] \]
It follows that the homotopy groups are
\[ \pi_0(M_A) = 0, \quad \pi_1(M_A) = \mathbb{Z}_4, \quad \pi_2(M_A) = \mathbb{Z} \qquad [17] \]
There are linear defects labeled by a mod-4 quantum number and point defects labeled by an integer.
For the $^3$He-B phase, by contrast, the order parameter is of the form
\[ \Phi_{jk} \propto R_{jk} e^{i\theta} \qquad [18] \]
where $R$ is a rotation matrix, $R \in SO(3)$. Here then,
\[ M_B = SO(3) \times S^1 \qquad [19] \]
with homotopy groups
\[ \pi_0(M_B) = 0, \quad \pi_1(M_B) = \mathbb{Z}_2 \times \mathbb{Z}, \quad \pi_2(M_B) = 0 \qquad [20] \]
In this phase, there are two distinct types of linear defect, the mass vortices with an integer label, and the spin vortices with a mod-2 label. (One can also have a "spin–mass vortex" carrying both quantum numbers.)

Composite Defects
There are several cases, including in particular helium-3, that exhibit symmetry breaking with multiple length or energy scales. For example, there may be two order parameters, say $\phi$, $\psi$, with $|\phi| \gg |\psi|$. If $|\psi|$ is negligible, the symmetry $G$ is broken by $\phi$ to $H$, and the manifold of degenerate ground states is $M = G/H$. However, these states are not all exactly degenerate: $\psi$ breaks the symmetry further to $K \subset H$, so the precisely degenerate ground states form a submanifold $M' = G/K$.
The case of helium-3 is slightly different. Here it is the small spin–orbit coupling, arising from long-range dipole–dipole interactions, that introduces the second scale. Its effect is only significant over large distances.
In the $^3$He-A phase, at short range the $\mathbf{l}$ and $\mathbf{d}$ vectors are uncorrelated but, over large distances, they tend to be aligned parallel or antiparallel. We can use the $\mathbb{Z}_2$ symmetry mentioned earlier to choose $\mathbf{l} = \mathbf{d}$. Hence, the manifold $M'_A$ of true ground states is only a submanifold of $M_A$, namely $M'_A = SO(3)$, whose homotopy groups are
\[ \pi_0(M'_A) = 0, \quad \pi_1(M'_A) = \mathbb{Z}_2, \quad \pi_2(M'_A) = 0 \qquad [21] \]
Because of different behavior on different scales, "composite" defects can arise. For example, because $\pi_2(M_A) = \mathbb{Z}$, there are short-range monopole configurations. For the $n = 1$ monopole, we have a configuration with uniform $\mathbf{l}$, and with $\mathbf{d}$ pointing outwards from the center. But, eventually the misalignment of $\mathbf{d}$ with $\mathbf{l}$ is energetically disfavored, and at large distances $\mathbf{d}$ tends to rotate to align with $\mathbf{l}$ except around one particular direction where it is oppositely aligned (see Figure 7). We have a composite defect: a small monopole coupled to a relatively fat string.

Figure 7 Cross-section of a short-range monopole attached to a fat string (vectors labeled $\mathbf{l}$ and $\mathbf{d}$).

To see how the small- and large-scale structures fit together, one has to look also at the relative homotopy groups $\pi_n(M, M')$, whose elements are homotopy classes of maps from $I^n$ to $M$ such that one face of the boundary is mapped into $M'$, and the remainder to the chosen base point $m_0$. For example, $\pi_1(M, M')$ classifies paths that terminate at $m_0$ while beginning at any point of $M'$. There is, in fact, a long exact sequence, similar to [9], relating these homotopy groups, of which a typical segment is
\[ \cdots \xrightarrow{\,\partial\,} \pi_n(M') \xrightarrow{\,i\,} \pi_n(M) \xrightarrow{\,p\,} \pi_n(M, M') \xrightarrow{\,\partial\,} \pi_{n-1}(M') \xrightarrow{\,i\,} \cdots \qquad [22] \]
264 Topological Gravity, Two-Dimensional
The relevant groups in the present case are

$$\pi_1(M_A, M'_A) = \mathbb{Z}_2, \quad \pi_2(M_A, M'_A) = \mathbb{Z} \qquad [23]$$

Because $\pi_1(M_A) = \mathbb{Z}_4$, there are three distinct classes of linear defects at small scales, but only those with quantum number n = 2 (mod 4) survive unchanged to large scales; they correspond to the nontrivial element of $\pi_1(M'_A) = \mathbb{Z}_2$. On the other hand, the homotopy classes n = ±1 (mod 4) are mapped to nontrivial elements of $\pi_1(M_A, M'_A) = \mathbb{Z}_2$, which indicates that the corresponding linear defects are coupled at long range to fat domain walls, across which $\mathbf{d}$ rotates through $\pi$ with a compensating rotation through $\pi$ about $\mathbf{l}$. Similarly, the nontrivial elements of $\pi_2(M_A) = \mathbb{Z}$ are mapped to nontrivial elements of $\pi_2(M_A, M'_A)$, confirming that these short-range monopoles are coupled to fat strings, as in Figure 7.

For $^3$He-B, the effect of the spin–orbit coupling is to make the most energetically favorable configurations those in which the rotation matrix R in [18] represents a rotation about an arbitrary axis $\mathbf{n}$ through the Leggett angle $\theta_L = \arccos(-1/4) \approx 104^\circ$: $R = \exp(i\theta_L\, \mathbf{n}\cdot\mathbf{J})$. Consequently,

$$M'_B = S^2 \times S^1 \qquad [24]$$

and so

$$\pi_0(M'_B) = 0, \quad \pi_1(M'_B) = \mathbb{Z}, \quad \pi_2(M'_B) = \mathbb{Z} \qquad [25]$$

The relative homotopy groups are

$$\pi_1(M_B, M'_B) = \mathbb{Z}_2, \quad \pi_2(M_B, M'_B) = 0 \qquad [26]$$

Here the mass vortex persists at long range, but the configuration around the spin vortex deforms so that it becomes attached to fat domain walls. The "monopole" configurations corresponding to nontrivial elements of $\pi_2(M'_B)$ have no short-range singularity at all.

See also: Abelian Higgs Vortices; Leray–Schauder Theory and Mapping Degree; Liquid Crystals; Phase Transition Dynamics; Quantum Field Theory: A Brief Introduction; Quantum Fields with Topological Defects; Solitons and Other Extended Field Configurations; String Topology: Homotopy and Geometric Perspectives; Symmetries and Conservation Laws; Symmetry Breaking in Field Theory; Variational Techniques for Ginzburg–Landau Energies.

Further Reading

Helgason S (2001) Differential Geometry, Lie Groups and Symmetric Spaces. Providence, RI: American Mathematical Society.
Hu S-T (1959) Homotopy Theory. New York: Academic Press.
Kibble TWB (1976) Topology of cosmic domains and strings. Journal of Physics A: Mathematical and General 9: 1387–1398.
Kibble TWB (2000) Classification of topological defects and their relevance to cosmology and elsewhere. In: Bunkov YM and Godfrin H (eds.) Topological Defects and the Non-Equilibrium Dynamics of Symmetry Breaking Phase Transitions, NATO Science Series C: Mathematical and Physical Sciences, vol. 249, pp. 7–31. Dordrecht: Kluwer Academic Publishers.
Shellard EPS and Vilenkin A (1994) Cosmic Strings and Other Topological Defects. Cambridge: Cambridge University Press.
Tilley DR and Tilley J (1990) Superfluidity and Superconductivity, 3rd edn. Bristol: IoP Publishing.
Toulouse G and Kléman M (1976) Principles of a classification of defects in ordered media. Journal de Physique Lettres 37: 149.
Vollhardt D and Wölfle P (1990) The Superfluid Phases of Helium 3. London: Taylor and Francis.
Volovik GE (1992) Exotic Properties of Superfluid 3He. Singapore: World Scientific.
Volovik GE and Mineev VP (1977) Investigation of singularities in superfluid 3He and in liquid crystals by the homotopic topology methods. Zhurnal Eksperimentalnoi i Teoreticheskoi Fiziki 72: 2256–2274 (Soviet Physics–JETP 24: 1186–1196).
Shenker 1990, Gross and Migdal 1990) is given by adjusting the coupling to its critical value and at the same time taking the limit $N \to \infty$. In this limit, contributions of all genera survive, and the theory describes the dynamics of fluctuating surfaces of arbitrary topologies. Results obtained in this way do not, in fact, depend on the detailed choice of the potential (quartic type in [1]) and have a high degree of universality. Thus, it provides an interesting model of two-dimensional (2D) quantum gravity.

Soon after the discovery of the double scaling limit of matrix models, Witten observed that the correlation functions of the 2D gravity theory may be given a geometrical interpretation as topological invariants of the moduli space of Riemann surfaces $\mathcal{M}$, and that the 2D gravity theory may be reformulated as a topological field theory (Witten 1990). This reformulation of the results of the 2D gravity theory is called "2D topological gravity."

In fact, 2D gravity theories come in a family parametrized by a pair of integers (p, q). The double scaling limit of [1] gives the simplest example (p = 2, q = 1). Models with a chain of p − 1 Hermitian matrices give the (p, q) 2D gravity theories. The label q stands for the order of criticality of the model, and higher values of q are achieved by fine-tuning the parameters of the potential. At q = 1, 2D gravity theories possess a topological interpretation. The most basic case (p = 2, q = 1) is called pure topological gravity, and in theories at higher values of p, topological gravity is coupled to a matter system, that is, topological minimal models. Topological minimal models are obtained by twisting N = 2 superconformal field theories.

Let us first consider the case of pure gravity (p = 2, q = 1). Let $O_n$ denote the observables in the theory and $t_n$ the coupling constants to these operators. The correlation functions of topological gravity are given by

$$\langle O_{n_1} O_{n_2} \cdots O_{n_s} \rangle_g, \qquad n_i = 1, 2, \ldots \qquad [3]$$

where $\langle\ \rangle_g$ denotes the expectation value on a surface with g handles. The precise significance of eqn [3] as the intersection number on the moduli space is discussed below. The string partition function $\tau(t)$ is defined as the generating function of all possible correlation functions

$$\tau(t) = \exp \sum_{g=0}^{\infty} \Big\langle \exp \sum_n t_n O_n \Big\rangle_g \qquad [4]$$

The most striking aspect of topological gravity is the connection of the intersection theory on $\mathcal{M}$ to the theory of completely integrable systems, that is, Korteweg–de Vries (KdV) and KP hierarchies. Witten conjectured that the generating function of intersection numbers on moduli space $\tau(t)$ is the $\tau$-function of the KdV hierarchy. The KdV hierarchy is obtained by generalizing the well-known KdV equation

$$\frac{\partial u}{\partial t} = \frac{3}{2}\, u \frac{\partial u}{\partial x} + \frac{1}{4} \frac{\partial^3 u}{\partial x^3} \qquad [5]$$

Identification of the KdV equation with topological gravity is given by $u = 2\langle O_1 O_1 \rangle$, $x = t_1$, $t = t_3$. Witten's conjecture was verified by Kontsevich (1991) by an explicit construction of a new type of matrix model which generates the triangulation of the moduli space of Riemann surfaces.

In the general case of (p, 1) topological gravity, the partition function of the theory obeys the equations of the pth generalized KdV hierarchy (p reduction of KP hierarchy).

Intersection Theory

We now present some basic features of intersection theory on the moduli space of Riemann surfaces. It is known that 2D oriented surfaces with g handles and s marked points $x_i$ (i = 1, ..., s) possess a finite-dimensional family of inequivalent complex structures (complex structures are identified when they differ only by diffeomorphism). The space of inequivalent complex structures is called the moduli space $\mathcal{M}_{g,s}$ of the Riemann surface $\Sigma$. Its dimension is given by

$$\dim \mathcal{M}_{g,s} = 3g - 3 + s \qquad [6]$$

For a mathematically rigorous treatment, we have to consider a compactification $\bar{\mathcal{M}}_{g,s}$ of moduli space $\mathcal{M}_{g,s}$ by adding suitable boundary components which arise due to various types of degenerations of Riemann surfaces. In the Deligne–Mumford or stable compactification, one considers the following three classes of singular Riemann surfaces $\Sigma$:

1. Two points, $x_i$ and $x_j$, on $\Sigma$ come close together. In this case, an extra 2-sphere is pinched off from the surface by forming a thin neck. The sphere contains points $x_i$ and $x_j$ and also the point $x_l$ at the end of the neck (see Figure 1a). Since the original surface now has one point less and the 2-sphere with three points has no moduli, the degenerate surface has 3g − 4 + s parameters and forms a boundary divisor of the moduli space $\bar{\mathcal{M}}_{g,s}$.
2. If a cycle of nontrivial homology class shrinks to a point, we have a surface with one less genus and two extra marked points. The singular surface has 3(g − 1) − 3 + s + 2 moduli and this is again a complex codimension-1 component (see Figure 1b).
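The puncture and dilaton equations for (p, 1) theories read

$$\langle \sigma_{0,0}\, \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g = \sum_{i=1,\ n_i \neq 0}^{s} \langle \sigma_{n_1,k_1} \cdots \sigma_{n_i-1,k_i} \cdots \sigma_{n_s,k_s} \rangle_g \qquad [27]$$

$$\langle \sigma_{1,0}\, \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g = (2g - 2 + s)\, \langle \sigma_{n_1,k_1} \cdots \sigma_{n_s,k_s} \rangle_g \qquad [28]$$

The special terms at g = 0 and g = 1 are given by

$$\langle \sigma_{0,0}\, \sigma_{0,i}\, \sigma_{0,p-i-2} \rangle_0 = 1, \qquad \langle \sigma_{1,0} \rangle_1 = \frac{p-1}{24} \qquad [29]$$

Integrable Hierarchy

We now summarize some basic facts about the integrable hierarchy (see for instance [5]). We consider the operator

$$L = D^p + \sum_{i=0}^{p-2} u_i(x) D^i, \qquad D \equiv \frac{\partial}{\partial x} \qquad [30]$$

where the coefficient functions $u_i$ are arbitrary functions of x. This Lax operator describes the pth generalized KdV hierarchy. We consider the time evolution of the operator L by an infinite set of commuting Hamiltonians:

$$\frac{\partial L}{\partial t_n} = [H_n, L], \qquad n = 1, 2, \ldots \qquad [31]$$

where $H_n$ is given by

$$H_n = (L^{n/p})_+ \qquad [32]$$

$$\mathrm{res}\, A = f_{-1}(x), \qquad A = \sum_i f_i(x) D^i \qquad [34]$$

Note that x is identified as the first time variable $t_1$, that is, $x = t_1$.

It is a basic result of the calculus of pseudodifferential operators that the above Hamiltonians satisfy the zero-curvature condition

$$\frac{\partial H_m}{\partial t_n} - \frac{\partial H_n}{\partial t_m} + [H_m, H_n] = 0 \qquad [35]$$

Note that when m is a multiple of p, $H_m$ becomes a power of L and trivially commutes with L. Thus, the time variables $t_m$ are absent for $m \equiv 0 \bmod p$. In the simple case of p = 2, one has

$$L = D^2 + u(x) \qquad [36]$$

and $H_3 = D^3 + (3/2)uD + (3/4)u'$. One finds

$$\frac{\partial L}{\partial t_3} = \frac{\partial u}{\partial t_3} = [H_3, L] = \frac{3}{2}\, u \frac{\partial u}{\partial x} + \frac{1}{4} \frac{\partial^3 u}{\partial x^3} \qquad [37]$$

which is the standard KdV equation.

In the case of the KP hierarchy, one starts with a pseudodifferential operator $Q$ [38] and considers the time evolution equations

$$\frac{\partial Q}{\partial t_n} = [H_n, Q], \qquad H_n = (Q^n)_+ \qquad [39]$$

The p-reduced KP hierarchy is obtained if one has

$$(Q^p)_- = 0 \qquad [40]$$

By introducing a pseudodifferential operator K, one may bring Q to the simple derivative operator D as

$$Q = K D K^{-1} \qquad [41]$$

K has an expansion of the form

$$K = 1 + \sum_{i=1}^{\infty} a_i D^{-i} \qquad [42]$$

$$\mathrm{res}\, L^{i/p} = \frac{\partial^2}{\partial x\, \partial t_i} \log \tau(t) \qquad [44]$$

These residues are expressed in terms of $\{u_i\}$ and their derivatives in x, and one can determine them in terms of the $\tau$-function.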
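The p = 2 computation in eqns [36]–[37] can be checked mechanically. The following is a minimal sketch, not the article's method: it assumes u(x) is a polynomial with rational coefficients (so that all arithmetic is exact), represents a differential operator as a dictionary from order to coefficient polynomial, and composes operators with the Leibniz rule. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Polynomials in x are coefficient lists [c0, c1, ...].
def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pscale(c, p):
    return trim([c * a for a in p])

def pmul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def pdiff(p, k=1):
    for _ in range(k):
        p = [i * a for i, a in enumerate(p)][1:] or [Fraction(0)]
    return trim(p)

# A differential operator is a dict {order: coefficient polynomial}.
def compose(A, B):
    """Operator product via (a D^m)(b D^n) = sum_k C(m,k) a b^(k) D^(m+n-k)."""
    out = {}
    for m, a in A.items():
        for n, b in B.items():
            for k in range(m + 1):
                term = pscale(comb(m, k), pmul(a, pdiff(b, k)))
                order = m + n - k
                out[order] = padd(out.get(order, [Fraction(0)]), term)
    return out

def commutator(A, B):
    AB, BA = compose(A, B), compose(B, A)
    out = {}
    for order in set(AB) | set(BA):
        diff = padd(AB.get(order, [Fraction(0)]),
                    pscale(-1, BA.get(order, [Fraction(0)])))
        if diff != [0]:          # drop orders whose coefficient cancels
            out[order] = diff
    return out

def kdv_sides(u):
    """Return ([H3, L], (3/2) u u' + (1/4) u''') for L = D^2 + u."""
    one = [Fraction(1)]
    L = {2: one, 0: u}
    H3 = {3: one, 1: pscale(Fraction(3, 2), u), 0: pscale(Fraction(3, 4), pdiff(u))}
    rhs = padd(pscale(Fraction(3, 2), pmul(u, pdiff(u))),
               pscale(Fraction(1, 4), pdiff(u, 3)))
    return commutator(H3, L), rhs

# Sample (arbitrary) polynomial u(x) = 1 - 2x + 5x^3 + x^4/3.
u = [Fraction(1), Fraction(-2), Fraction(0), Fraction(5), Fraction(1, 3)]
lhs, rhs = kdv_sides(u)
print(lhs == {0: rhs})
```

For this sample u, the commutator has no surviving terms of order one or higher in D, and its order-zero part equals the right-hand side of [37], so the script prints True. This is a spot check on one polynomial, not a proof.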
In the case p = 2, one has

$$[H_k, L] = 2D\, \mathrm{res}(L^{k/2}) = D R_k, \qquad k\ \text{odd} \qquad [45]$$

Here $\{R_k\}$ are the Gelfand–Dikii potentials

$$R_1 = u, \qquad R_3 = \tfrac{1}{4}(3u^2 + u''), \qquad R_5 = \tfrac{1}{16}(10u^3 + 5u'^2 + 10uu'' + u''''), \qquad \ldots \qquad [46]$$

and obey the recursion relation

$$D R_{k+2} = \Big( \tfrac{1}{4} D^3 + \tfrac{1}{2}(Du + uD) \Big) R_k \qquad [47]$$

If one uses the relation [44], the Gelfand–Dikii potentials are identified as

$$R_k = 2\langle O_1 O_k \rangle \qquad [48]$$

By setting k = 1, we note $u = 2\langle O_1 O_1 \rangle$ and find that the evolution equations [31] are all satisfied as

$$\frac{\partial L}{\partial t_k} = \frac{\partial u}{\partial t_k} = 2 \frac{\partial}{\partial t_k} \langle O_1 O_1 \rangle = 2D \langle O_1 O_k \rangle = D R_k = [H_k, L] \qquad [49]$$

Now it is possible to identify the initial condition for the Lax operator in the case of topological (p, 1) gravity. By using the definition

$$\log \tau(t) = \sum_{g=0}^{\infty} \Big\langle \exp \sum_n t_n O_n \Big\rangle_g \qquad [50]$$

one has

$$\mathrm{res}\, L^{i/p}(0) = \langle O_1 O_i \rangle, \qquad i = 1, \ldots, p - 1 \qquad [51]$$

From [29] one finds

$$\mathrm{res}\, L^{i/p}(0) = i\, x\, \delta_{i,p-1} \qquad [52]$$

This gives the initial value of the Lax operator:

$$L(0) = D^p + px \qquad [53]$$

Thus, only the lowest term $u_0(x) = px$ is nonzero and higher coefficients all vanish at t = 0. This is the special simplification which takes place in the topological gravity theory.

We note a relation

$$\Big[ \frac{1}{p} D,\ L(0) \Big] = 1 \qquad [54]$$

This is the so-called "string equation" (at t = 0). At nonzero values of t, the string equation takes the form

$$[P, L] = 1, \qquad P = \frac{1}{p} \Big( (L^{1/p})_+ - \sum_{k=p+1}^{\infty} k\, t_k\, (L^{(k-p)/p})_+ \Big) \qquad [55]$$

From [55], we see that the (p, 1) theory corresponds to the background value of the coupling $t_{p+1} = 1/(p + 1)$. In the case of the (p, q) theory, the background value is given by $t_{pq+1} = 1/(pq + 1)$.

Virasoro Conditions

A powerful algebraic machinery controlling the structure of 2D gravity is the so-called "Virasoro conditions." One introduces differential operators

$$L_{-1} = -\frac{\partial}{\partial t_1} + \sum_{k=p+1}^{\infty} k\, t_k \frac{\partial}{\partial t_{k-p}} + \frac{1}{2} \sum_{i+j=p} i j\, t_i t_j \qquad [56]$$

$$L_0 = -\frac{\partial}{\partial t_{p+1}} + \sum_{k=1}^{\infty} k\, t_k \frac{\partial}{\partial t_k} + \frac{p^2 - 1}{24} \qquad [57]$$

By using the fact that the derivative in $t_n$ brings down the operator $O_n$ when acting on the $\tau$-function, it is easy to show that

$$L_{-1} \tau = 0 \qquad [58]$$

$$L_0 \tau = 0 \qquad [59]$$

reproduce the puncture [27] and dilaton [28] equations, respectively. It is possible to show that the $L_{-1}$-condition, $L_{-1}\tau = 0$, is equivalent to the string equation [55].

Together with the operators ($n \geq 1$)

$$L_n = -\frac{\partial}{\partial t_{1+(n+1)p}} + \sum_{k=1}^{\infty} k\, t_k \frac{\partial}{\partial t_{k+np}} + \frac{1}{2} \sum_{i+j=np} \frac{\partial^2}{\partial t_i\, \partial t_j}$$

they generate the Virasoro algebra ($L'_n \equiv (1/p) L_n$)

$$[L'_m, L'_n] = (m - n) L'_{m+n}, \qquad n, m \geq -1 \qquad [60]$$

It is possible to show that the (p, 1) model obeys the Virasoro conditions [6]

$$L_n \tau = 0, \qquad n \geq -1 \qquad [61]$$

It is known that (p, 1) models with p > 2 also obey constraints of W-algebra.

The relationship of the Virasoro conditions to the KdV hierarchy is summarized as

string equation + KdV hierarchy $\Longleftrightarrow$ Virasoro and W-algebra constraints

Topological σ-Model

It is known that when the target space of a supersymmetric nonlinear σ-model is a Kähler manifold K, the theory acquires an enhanced N = 2
supersymmetry. Then we can twist the theory and convert it into a topological field theory. This is the topological σ-model [7]. The partition function of the theory consists of a sum over world-sheet instantons, that is, holomorphic maps from the Riemann surface to the target space K. Due to supersymmetry, functional determinants around instantons cancel and the theory simply counts the number of holomorphic curves inside the Kähler manifold K. Thus, the topological σ-model has a close relationship with enumerative problems in algebraic geometry, that is, Gromov–Witten invariants and quantum cohomology theory.

When the topological σ-model is coupled to topological gravity, the BRST-invariant observables are given by $\sigma_n(\alpha_i)$, where $\alpha_i$ are cohomology classes of K. Correlation functions are defined as

$$\Big\langle \prod_{i=1}^{s} \sigma_{n_i}(\alpha_i) \Big\rangle_{g,d} = \int_{\bar{\mathcal{M}}_{g,s}(K;d)} \prod_{i=1}^{s} c_1(L_i)^{n_i} \wedge e_i^*(\alpha_i) \qquad [62]$$

Here $\bar{\mathcal{M}}_{g,s}(K; d)$ denotes the (stable compactification of the) moduli space of degree d holomorphic maps to K from genus g Riemann surfaces $\Sigma$. $e_i^*$ is the pullback by the evaluation map $e_i : (f; x_1, \ldots, x_s) \in \bar{\mathcal{M}}_{g,s}(K; d) \to f(x_i) \in K$, where f is a holomorphic map. Correlation functions [62] give topological (symplectic) invariants of the manifold K. In the cases $n_i = 0$ (i = 1, ..., s), they are known as Gromov–Witten invariants.

Equation [62] is nonvanishing if the selection rule

$$\sum_{i=1}^{s} (n_i + q_i) = \dim \bar{\mathcal{M}}_{g,s}(K; d) = c_1(K)\, d + (3 - \dim K)(g - 1) + s \qquad [63]$$

is obeyed, where $q_i$ is the degree of the cohomology class $\alpha_i$ and $c_1(K)$ is the first Chern class of the tangent bundle of K.

We see that there is a close parallel between the topological σ-model and (p, 1) topological gravity. If we formally set $q_i = \ell_i/p$, $c_1(K) = 0$, and $\dim K = (p - 2)/p$, eqn [63] agrees with eqn [25]. Based on this analogy, Eguchi, Hori, and Xiong proposed the Virasoro conjecture [8], that is, generating functions of the number of holomorphic maps to arbitrary Kähler manifolds are annihilated by the Virasoro operators, which are constructed by analogy with those of (p, 1) gravity. The Virasoro conjecture is a natural generalization of Witten's conjecture, and has recently been rigorously proved in the case of curves and projective spaces.

Excellent reviews on the theory of 2D topological gravity are given in Witten (1991) and Dijkgraaf (1991).

See also: Axiomatic Approach to Topological Quantum Field Theory; Large-N and Topological Strings; Mirror Symmetry: A Geometric Survey; Moduli Spaces: An Introduction; Riemann Surfaces; Topological Sigma Models; WDVV Equations and Frobenius Manifolds.

Further Reading

Brézin E and Kazakov V (1990) Exactly soluble field theory of closed strings. Physics Letters B 236: 144.
Dijkgraaf R (1991) Lectures Presented at Cargèse Summer School on New Symmetry Principles in Quantum Field Theory, hep-th/9201003.
Dijkgraaf R, Verlinde E, and Verlinde H (1991) Loop equations and Virasoro constraints in non-perturbative 2d quantum gravity. Nuclear Physics B 348: 435.
Douglas M and Shenker S (1990) Strings in less than one dimension. Nuclear Physics B 335: 635.
Eguchi T and Yang S-K (1990) N = 2 superconformal models as topological field theories. Modern Physics Letters A 5: 1693.
Eguchi T, Hori K, and Xiong C-S (1997) Quantum cohomology and Virasoro algebra. Physics Letters B 402: 71.
Fukuma M, Kawai H, and Nakayama R (1991) Continuum Schwinger–Dyson equations and universal structures in two-dimensional quantum gravity. International Journal of Modern Physics A 6: 1385.
Gelfand IM and Dikii LA (1975) Asymptotic behavior of the resolvent of Sturm–Liouville equations and the algebra of the Korteweg–de Vries equations. Russian Mathematical Surveys 30(5): 77–113.
Gross D and Migdal A (1990) Non-perturbative two dimensional quantum gravity. Physical Review Letters 64: 17.
Kontsevich M (1991) Intersection theory on the moduli space of curves and the matrix Airy function. Communications in Mathematical Physics.
Witten E (1988) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1990) On the topological phase of two dimensional gravity. Nuclear Physics B 340: 281.
Witten E (1991) Two-dimensional gravity and intersection theory of moduli space. Surveys in Differential Geometry 1: 243–310.
Topological Knot Theory and Macroscopic Physics 271
combinations of linked and knotted vortices. Kelvin was inspired by the work of Gauss, who, in an attempt to describe topologically the behavior of two inseparably linked closed circuits carrying electric current, found a relationship between the magnetic action induced by the currents and a pure number that depends only on the type of link, and not on the geometry: this number is the first topological invariant, now known as the linking number.

In modern mathematical terms, Gauss introduced an invariant of a link consisting of two simple closed curves $\gamma_1$, $\gamma_2$ in $\mathbb{R}^3$, namely the signed number of turns of one of the curves around the other, the linking coefficient $\{\gamma_1, \gamma_2\}$ of the link. His formula for this is

$$N = \{\gamma_1, \gamma_2\} = \frac{1}{4\pi} \oint_{\gamma_1} \oint_{\gamma_2} \frac{([d\gamma_1(t), d\gamma_2(t)],\ \gamma_1 - \gamma_2)}{|\gamma_1(t) - \gamma_2(t)|^3} \qquad [1]$$

where [ , ] denotes the vector (or cross) product of vectors in $\mathbb{R}^3$ and ( , ) the Euclidean scalar product. This integral always has an integer value N. If we take one of the curves to be the z-axis in $\mathbb{R}^3$ and the other to lie in the (x, y)-plane, then the formula [1] gives the net number of turns of the plane curve around the z-axis. It is interesting to note that the linking coefficient [1] may be zero even though the curves are nontrivially linked. Thus, its having nonzero value represents only a sufficient condition for nontrivial linkage of the loops. This last consideration leads naturally to the mathematical concepts of knots and links whose most striking properties have been investigated in our introductory article (see Mathematical Knot Theory).

The other source of inspiration of Kelvin's theory of matter was Helmholtz's laws of vortex motion, which state that in an ideal fluid (where there is no viscosity) vortices live forever: two closed vortex rings, once linked, will always be linked. The classical results obtained by Helmholtz are basic to understanding the dynamics of Euler motions. The vorticity of a velocity field is its curl and is denoted $\omega_t(z) := \mathrm{curl}(X(z, t))$. In two dimensions, the vorticity is a real-valued function and $\omega_t = \Delta\psi$, where $\psi$ is the stream function of X(z, t). Recall that the push-forward of a scalar field (0-form) s under a diffeomorphism f is $f_* s = s \circ f^{-1}$. These results, in modern terms, can be stated as follows:

Theorem (Helmholtz–Kelvin). An incompressible fluid motion $(M_t, \phi_t)$ with velocity field X and vorticity $\omega_t$ is Euler if and only if its vorticity is passively transported,

$$\phi_{t*}\, \omega_0 = \omega_t$$

and circulation around all smooth simple closed curves C is preserved under the flow,

$$\frac{d}{dt} \oint_{\phi_t(C)} X \cdot dr = 0$$

One knows that in three dimensions, the Helmholtz–Kelvin theorem says that the vorticity (now a vector field) is transported. Thus, with generic initial vorticity a 3D time-periodic Euler fluid motion preserves a nontrivial vector field. One very interesting question that remains to be elucidated is the following: are there any chaotic, time-periodic Euler flows with stationary boundaries?

The Connection between Topological and Numerical Invariants of Knots and the Physical Helicity of Vector Fields

The writhing number of a curve in Euclidean three-dimensional space is the standard measure of the extent to which the curve wraps and coils around itself; it has proved its importance for electrodynamics and fluid mechanics in the study of the knotted structures of magnetic vortices and dynamic flows, and for molecular biologists in the study of knotted duplex DNA and the enzymes which affect it. The helicity of a divergenceless vector field defined on a domain in Euclidean 3-space, introduced by Woltjer in 1958 in an astrophysical context and named by Moffatt in 1969 in the study of its topological meaning, is the standard measure of the extent to which the field lines wrap and coil around one another; it plays important roles in fluid mechanics, magnetohydrodynamics, and plasma physics. The "Biot–Savart operator" associates with each current distribution on a given domain the restriction of its magnetic field to the domain. When the domain is simply connected, the divergence-free fields which are tangent to the boundary and which minimize energy for given helicity provide models for stable force-free magnetic fields in space and laboratory plasmas; these fields appear mathematically as the extreme eigenfields for an appropriate modification of the Biot–Savart operator. Information about these fields can be converted into bounds on the writhing number of a given piece of DNA.

Recent research (Cantarella et al. 2001) obtained rough upper bounds for the writhing number of a knot or link in terms of its length and thickness, and rough upper bounds for the helicity of a vector field in terms of its energy and the geometry of its domain. It was also shown that in the case of classical electrodynamics in vacuum, the
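Gauss's integral [1] is easy to approximate numerically, which gives a concrete feel for why it returns an integer. The sketch below is an illustration under stated assumptions, not part of the original article: each curve is replaced by a closed polygon, the double integral by a midpoint-rule double sum over polygon edges, and the two test configurations (a Hopf link of unit circles, and two far-apart circles) as well as all function names are chosen here for the example. The sign of the result depends on the orientations chosen for the curves.

```python
import math

def circle(center, u, v, n):
    """n points on the unit circle centred at `center` in the plane spanned by
    the orthonormal vectors u and v."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        pts.append(tuple(center[i] + math.cos(t) * u[i] + math.sin(t) * v[i]
                         for i in range(3)))
    return pts

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def edges(curve):
    """Edge vectors and edge midpoints of a closed polygon."""
    n = len(curve)
    out = []
    for i in range(n):
        p, q = curve[i], curve[(i + 1) % n]
        d = tuple(q[k] - p[k] for k in range(3))
        m = tuple(0.5 * (p[k] + q[k]) for k in range(3))
        out.append((d, m))
    return out

def linking_number(c1, c2):
    """Midpoint-rule approximation of the Gauss linking integral [1]."""
    total = 0.0
    e2 = edges(c2)
    for d1, m1 in edges(c1):
        for d2, m2 in e2:
            r = tuple(m1[k] - m2[k] for k in range(3))
            dist = math.sqrt(sum(x * x for x in r))
            c = cross(d1, d2)
            total += sum(c[k] * r[k] for k in range(3)) / dist ** 3
    return total / (4.0 * math.pi)

n = 120
ex, ey, ez = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
ring_a = circle((0.0, 0.0, 0.0), ex, ey, n)      # unit circle in the (x, y)-plane
ring_b = circle((1.0, 0.0, 0.0), ex, ez, n)      # threads through ring_a: Hopf link
ring_c = circle((5.0, 0.0, 0.0), ex, ez, n)      # far away from ring_a: unlinked

print(round(linking_number(ring_a, ring_b), 3))  # close to +1 or -1
print(round(linking_number(ring_a, ring_c), 3))  # close to 0
```

A value near zero does not by itself show that two curves are unlinked, in line with the remark above that a nonzero linking coefficient is only a sufficient condition for nontrivial linkage.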
Cantarella et al. (2001) found two very interesting results.

Theorem 1 Let K be a smooth knot or link in 3-space, with length L and with an embedded tubular neighborhood of radius R. Then the writhing number Wr(K) of K is bounded by

$$|Wr(K)| < \tfrac{1}{4} (L/R)^{4/3}$$

For the proof, see Cantarella et al. (2001).

Theorem 2 The helicity of a unit vector field V defined on the compact domain $\Omega$ is bounded by

$$|H(V)| < \tfrac{1}{2}\, \mathrm{vol}(\Omega)^{4/3}$$

Let us now give a brief overview of the methods used to find sharp upper bounds for the helicity of vector fields defined on a given domain $\Omega$ in 3-space. As usual, $\Omega$ will denote a compact domain with smooth boundary in 3-space. Let $K(\Omega)$ denote the set of all smooth divergence-free vector fields defined on $\Omega$ and tangent to its boundary. These vector fields, sometimes called "fluid knots," are prominent for several reasons: (1) They are natural vector fields to study in a "fluid dynamics approach" to geometric knot theory. (2) They correspond to incompressible fluid flows inside a fixed container. (3) They are the vector fields most often studied in plasma physics. (4) Those that maximize helicity for given energy (equivalently, minimize energy for given helicity) provide models for stable force-free magnetic fields in gaseous nebulae and laboratory plasmas. (5) The search for these helicity-maximizing fields can be converted to the task of solving a system of partial differential equations. (6) The fluid knots can reveal some fundamental and still unknown mechanisms, which characterize the phenomenon of phase transition, and in particular the transition from chaotic (unstable) phases and behaviors of matter to ordered (stable) ones.

Knots and Fluid Mechanics (Vortex Lines, Magnetic Helicity, and Turbulence)

Kelvin's theory explaining atoms as knotted vortices in a fluid ether was seminal in the development of topological fluid mechanics. The recent revival (starting in the 1970s) is mainly due to the work of Moffatt, on the topological interpretation of helicity, and Arnol'd, on the asymptotic linking number of space-filling curves. Modern developments have been influenced by recent progress in the theory of knots and links.

Influence of Geometry and Topology on Fluid Flows

Ideal topological fluid mechanics deals essentially with the study of fluid structures that are continuously deformed from one configuration to another by ambient isotopies. Since the fluid flow map $\varphi$ is both continuous and invertible, $\varphi_{t_1}(K)$ and $\varphi_{t_2}(K)$ generate isotopies of a fluid structure K (e.g., a vortex filament) for any $\{t_1, t_2\} \in I$. Isotopic flows generate equivalence classes of (linked and knotted) fluid structures. In the case of (vortex or magnetic) fluid flux tubes, fluid actions induce continuous deformations in D. One of the simplest deformations is local stretching of the tube. From a mathematical viewpoint, this deformation corresponds to a time-dependent, continuous reparametrization of the tube centerline. This reparametrization (via homotopy classes) generates ambient isotopies of the flux tube, with a continuous deformation of the integral curves. Moreover, in the context of the Euler equations, the Reidemeister moves (or isotopic plane deformations), which conserve the knot topology, are performed quite naturally by the action of local flows on flux tube strands. If the fluid in (D − K) is irrotational, then these fluid flows (with velocity $\mathbf{u}$) must satisfy the Neumann problem for the Laplacian of the velocity potential $\varphi$, that is,

$$\mathbf{u} = \nabla\varphi \quad \text{in } (D - K), \qquad \nabla^2 \varphi = 0 \qquad [7]$$

with the normal component $u_\perp$ of the velocity at the tube boundary given. Equations [7] admit a unique solution in terms of local flows, and these flows are interpretable in terms of Reidemeister moves performed on the tube strands. Note that the boundary conditions prescribe only $u_\perp$, whereas no condition is imposed on the tangential component of the velocity. This is consistent with the fact that tangential effects do not alter the topology of the physical knot (or link). The three types of Reidemeister moves are therefore performed by local fluid flows, which are solutions to [7], up to arbitrary tangential actions.

Knotted and Linked Tubes of Magnetic Flux

Let T be the standard solid torus in $\mathbb{R}^3$ given by

$$((2 + \varepsilon \cos\theta) \cos\varphi,\ (2 + \varepsilon \cos\theta) \sin\varphi,\ \varepsilon \sin\theta) \qquad [8]$$

where $0 \le \theta, \varphi < 2\pi$, and $0 \le \varepsilon < 1$. For relatively prime integers p and q, let $F_{p,q}$ denote the foliation
of T by the curves $\gamma_{\varepsilon,\theta}$ (where $0 \le \varepsilon \le 1$ and $0 \le \theta < 2\pi$) given by

$$\gamma_{\varepsilon,\theta}(s) = \big( (2 + \varepsilon \cos(\theta + qs)) \cos(ps),\ (2 + \varepsilon \cos(\theta + qs)) \sin(ps),\ \varepsilon \sin(\theta + qs) \big) \qquad [9]$$

where $0 \le s < 2\pi$.

Definition A magnetic tubular link (or magnetic link) is a smooth immersion into $\mathbb{R}^3$ of finitely many disjoint standard solid tori $\bigsqcup_{i=1}^{n} T_i$

$$\Phi_i = \Phi(LT_i) := \iint_{LD_i} \mathbf{B} \cdot \mathbf{U}\, d(\text{area})$$

where $\mathbf{U}$ denotes the normal to the surface $LD_i$ pointing in the positive direction induced by the $\mathbf{B}$ field.

It can be shown that $\Phi_i$ is independent of the chosen meridional disk. It also can be shown that each $\Phi_i$ is a fluid flow invariant, that is,

$$\Phi_i(g_t LT_i) = \iint_{g_t LD_i} \mathbf{B} \cdot \mathbf{U}\, d(\text{area}) \qquad [11]$$

is independent of t.

One more fluid invariant that will play a central role in the energy minimization of magnetic links is given by the following definition.

where $SL_{F_i}$ denotes the self-linking number of the axis curve of the tube $LT_i$ with respect to the framing $F_i$ induced by the integral curves of the magnetic field $\mathbf{B}$ within $LT_i$, and $LK_{ij}$ denotes the linking number between any integral curve of
the magnetic field $\mathbf{B}$ in $LT_i$ with any integral curve of the magnetic field $\mathbf{B}$ in $LT_j$.

Remark In fact, $SL_{F_i}$ is the same as the linking number between any two integral curves of the magnetic field $\mathbf{B}$ within the tube $LT_i$.

Thus, as many authors have shown, the helicity does reflect the topology and the geometry of the magnetic lines of force within a magnetic link. If, for example, L has only one component, that is, L is a magnetic knot, then

$$H(L) = \Phi^2\, SL_F(C) \qquad [13]$$

where $SL_F(C)$ is the self-linking number of the axis curve C of the knotted tube with respect to the framing F induced by the integral curves of the magnetic field $\mathbf{B}$ within the magnetic knot. If, for example, the tube is knotted in the form of a trefoil and if the magnetic lines of force appear to be parallel to the axis curve when the trefoil is placed on a flat plane surface, then SL = 3 and

$$H = 3\Phi^2 \qquad [14]$$

On the other hand, if, for example, the magnetic lines of force induce the trivial framing in each component, then

$$H(L) = 2 \sum_{1 \le i < j \le n} \Phi_i \Phi_j\, LK_{ij} \qquad [15]$$

Thus, if L is a magnetic two-component Hopf link with no twisting of the integral curves of the magnetic field within the components of L, then

$$H(L) = 2\Phi_1 \Phi_2 \qquad [16]$$

because the self-linking number based on the $\mathbf{B}$-field framing is zero for each component, and the linking number between the two components is 1.

Energy of Magnetic Knots and Links

Let us conclude this section with the definition of the energy of a magnetic link.

Definition The magnetic energy $E_M(L)$ of a magnetic link L is defined by the classical formula

$$E_M(L) = \frac{1}{8\pi} \iiint_{\cup_i LT_i} |\mathbf{B}|^2\, d(\text{vol}) \qquad (\text{Gaussian units})$$

Although the energy $E_M$ is not flow invariant, it will play a central role in magnetic relaxation of knots and minimum energy magnetic links.

Consider a magnetic link L in a perfectly conducting, incompressible, viscous fluid. As a result of dissipative frictional fluid forces, the magnetic energy $E_M(g_t L)$ of $g_t L$ will decrease with time t. In losing energy, the magnetic lines of force will contract. On the other hand, since this is a volume-preserving process, the cross sections of the flux tubes of $g_t L$ will at the same time expand. These changes of geometry occur while the flux $\Phi$, volume V, and helicity of $g_t L$ remain the same. In other words, knotted magnetic flux tubes left free to evolve in such a fluid will do so by conserving their magnetic flux $\Phi$ and volume V, but converting their magnetic energy into kinetic energy, which in turn dissipates by internal friction. Magnetic links and knots evolve from high to low magnetic energy levels, conserving topology; and because of the induced shortening of field lines under conservation of volume, they become fatter and fatter, with an increase of the average tube cross section.

This process cannot continue indefinitely. Eventually, the magnetic flux tubes of $g_t L$ must make contact with each other. In other words, the topology of the magnetic link $g_t L$, as expressed in knotting and linking, creates a barrier to the full dissipation of the magnetic link's energy, that is, $E_M(g_t L)$ has a positive lower bound that results from the topology of $g_t L$. That means that relaxation is obstructed by the knottedness and entanglement of the field lines, and a minimum magnetic energy is reached. Thus, the magnetic link will reach a nontrivial stable and invariant energy state, much as Kelvin conjectured his atomic vortices would.

Various estimates of magnetomechanical energy in terms of topological quantities have been put forward in recent years (see Freedman and He (1994)). These relations give lower bounds for the energy levels attainable by knot or link types by taking into account the effects that linking numbers and number of crossings have on the energy of the relaxed state. These bounds are expressed by relationships of the kind

$$E_{\min} \geq \eta(C_{\min}, \Phi, V, N) \qquad [17]$$

where $E_{\min}$ is the equilibrium energy and $\eta$ gives the relationship between physical quantities (such as total flux $\Phi$, number of tubes N, and magnetic volume V) and topology, given here by the minimal possible number of crossings $C_{\min}$. These relations offer numerous advantages due to the explicit dependence on qualitative properties of the flow field. A simple example is provided by the analysis of three braids, which shows that magnetic energy grows quadratically in time due to random braiding. This means that the least possible amount of magnetic energy that can be attained by the physical knot or link is determined purely by its topology. If topological information sets the levels of minimum energy accessible to the knot or link, geometric properties
may also influence the relaxation process. Considerations of helicity and linking numbers, for example, demonstrate that internal rearrangement of magnetic field geometry leads to a spectrum of different asymptotic endstates with the same topology. Moreover, magnetic knots have a natural tendency to get rid of excessive torsion of field lines and S-shaped tube geometries, and this may influence the relaxation process.

Since the helicity $H(g_t L)$ is both an invariant of fluid flow and an expression of the topology of the magnetic link $g_t L$, the following theorem, first stated by Moffatt, is a mathematical expression of this topological bound.

Theorem Let L be a magnetic link. Then

$$E_M(L) \ge q_0 \, |H(L)|$$

where $q_0$ is a nonzero constant that is independent of the magnetic link.

Freedman and He (1991) obtain more subtle and tighter topological bounds on the minimum energy of magnetic links. For example, for a magnetic knot K, they prove that

$$E_M(K) \ge \frac{1}{4\pi^{5/4}} \, \frac{\Phi(K)^{3/2} \, ac(K)^{3/4}}{V(K)^{1/3}}$$

vector field lines (streamlines, vortex lines, or magnetic lines) cross each other. If two field lines meet, the point of crossing is a true nodal point, like a bifurcation in a path. Dissipative effects allow the reconnection to proceed through such points.

In dissipative fluids, mathematical and physical properties are no longer conserved, and during the process we lose part of the original information. However, some of the invariants are rather robust and may only degrade slowly. One of them is magnetic helicity, the magnetic analog of the kinetic helicity. Its dissipation during reconnection can be modest; in particular, if the reconnection timescale is small compared to classical dissipation times, then helicity loss will be negligible. The robustness of magnetic helicity plays a central role in fusion plasma physics and in many astrophysical contexts. On the other hand, large changes in kinetic helicity are intimately related to qualitative changes in the topology of vortex flows.

Under Euler's equations, the helicity of a vortex tube of vorticity $\omega$ and velocity u is defined by $H = \int_V u \cdot \omega \, dV$. The integral is taken over the tube volume V occupied by $\omega$. Now, for n knotted and linked vortex tubes, each of constant strength (total vorticity) and labeled by $i$ ($1 \le i \le n$), the helicity of the whole system can be expressed in terms of linking numbers.
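The relation between helicity, fluxes, and linking numbers used in this article can be checked numerically. The Python sketch below is illustrative only. It assumes the combined formula H = Σ_i SL_i Φ_i² + 2 Σ_{i<j} LK_ij Φ_i Φ_j, which reduces to [14] for a single trefoil tube with SL = 3 and to [15] for trivially framed components. The function names and the polygonal midpoint discretization of the Gauss linking integral are choices made here, not part of the original text.

```python
import math

def circle(center, u, v, radius=1.0, n=200):
    """Vertices of a closed polygon approximating a circle in the plane of u, v."""
    pts = []
    for i in range(n):
        t = 2.0 * math.pi * i / n
        pts.append(tuple(center[k] + radius * (math.cos(t) * u[k] + math.sin(t) * v[k])
                         for k in range(3)))
    return pts

def _segments(curve):
    """(midpoint, edge vector) for every edge of a closed polygon."""
    segs = []
    for i, p in enumerate(curve):
        q = curve[(i + 1) % len(curve)]
        mid = tuple((p[k] + q[k]) / 2.0 for k in range(3))
        edge = tuple(q[k] - p[k] for k in range(3))
        segs.append((mid, edge))
    return segs

def gauss_linking_number(curve_a, curve_b):
    """Midpoint-rule estimate of (1/4pi) * double integral of r.(da x db)/|r|^3."""
    segs_b = _segments(curve_b)
    total = 0.0
    for ma, da in _segments(curve_a):
        for mb, db in segs_b:
            r = (ma[0] - mb[0], ma[1] - mb[1], ma[2] - mb[2])
            cross = (da[1] * db[2] - da[2] * db[1],
                     da[2] * db[0] - da[0] * db[2],
                     da[0] * db[1] - da[1] * db[0])
            dist = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2)
            total += (r[0] * cross[0] + r[1] * cross[1] + r[2] * cross[2]) / dist ** 3
    return total / (4.0 * math.pi)

def link_helicity(fluxes, linking):
    """H = sum_i SL_i*Phi_i^2 + 2*sum_{i<j} LK_ij*Phi_i*Phi_j (assumed general form).

    linking[i][i] holds the self-linking number SL_i of component i,
    linking[i][j] (i != j) the linking number LK_ij.
    """
    n = len(fluxes)
    h = sum(linking[i][i] * fluxes[i] ** 2 for i in range(n))
    h += 2.0 * sum(linking[i][j] * fluxes[i] * fluxes[j]
                   for i in range(n) for j in range(i + 1, n))
    return h

# Hopf link: unit circle in the xy-plane and unit circle in the xz-plane
# centred at (1, 0, 0), so each passes through the disc spanned by the other.
ring_a = circle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
ring_b = circle((1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
lk = gauss_linking_number(ring_a, ring_b)  # magnitude close to 1; sign depends on orientation
```

Passing the rounded value of `lk` as the off-diagonal entry of `linking`, with zero self-linking and fluxes Φ_1 and Φ_2, gives a helicity of magnitude 2Φ_1Φ_2, in line with [16]. The sign of `lk` depends on the orientation chosen for the two rings.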
are maybe the most significant examples in the last years). In particular, fluid dynamics, a topological macroscopic field theory, provides a powerful framework for the modern theory of knots and links in 3-manifolds. Moreover, as we saw here, it provides a physical interpretation of the linking, self-linking, and writhing numbers of knots and links. The present article was essentially aimed at illustrating such a relationship. Thus, the most fundamental result we reported here is the relation (formula) connecting the helicity of vector (magnetic) fields to the writhing number of knots: $H(V) = \mathrm{Flux}(V)^2 \, Wr(K)$. So, the writhing number for knots is the analog of helicity for vector fields. Both expressions of these invariants are variants of the (Gaussian) integral formula for the linking number of two disjoint closed space curves. Further investigations of these invariants and their mathematical properties might throw new light on the interfaces between many different areas of macroscopic and quantum physics.

See also: The Jones Polynomial; Knot Theory and Physics; Magnetohydrodynamics; Mathematical Knot Theory; Stability of Flows; Superfluids; Topological Quantum Field Theory: Overview; Vortex Dynamics; Yang–Baxter Equations.

Further Reading

Arnol'd V and Khesin B (1998) Topological Methods in Hydrodynamics. Heidelberg: Springer.
Berger MA and Field GB (1984) The topological properties of magnetic helicity. Journal of Fluid Mechanics 147: 133–148.
Berry MV and Dennis MR (2001) Knotted and linked phase singularities in monochromatic waves. Proceedings of the Royal Society A 457: 2251–2263.
Boi L (2005) Topological knots' models in physics and biology. In: Boi L (ed.) Geometries of Nature, Living Systems and Human Cognition. New Interactions of Mathematics with Natural Sciences and Humanities, pp. 211–294. Singapore: World Scientific.
Boyland P (2001) Fluid mechanics and mathematical structures. In: Ricca RL (ed.) An Introduction to the Geometry and Topology of Fluid Flows, pp. 105–134. NATO-ASI Series: Mathematics. Dordrecht: Kluwer.
Cantarella J, DeTurck D, and Gluck H (2001) The Biot–Savart operator for application to knot theory, fluid dynamics, and plasma physics. Journal of Mathematical Physics 42: 876–905.
Freedman MH and He Z-X (1991) Divergence free fields: energy and asymptotic crossing number. Annals of Mathematics 134: 189–229.
Fuller FB (1978) Decomposition of the linking number of a closed ribbon: a problem from molecular biology. Proceedings of the National Academy of Sciences, USA 75: 3557–3561.
Ghrist RW, Holmes PJ, and Sullivan MC (1997) Knots and Links in Three-Dimensional Flows. Heidelberg: Springer.
Hornig G (2002) Topological Methods in Fluid Dynamics. Preprint. Ruhr-Universität-Bochum.
Kauffman LH (1995) Knots and Applications. Series on Knots and Everything, vol. 6. Singapore: World Scientific.
Lomonaco SJ (1995) The modern legacies of Thomson's atomic vortex theory in classical electrodynamics. In: Kauffman LH (ed.) The Interface of Knots and Physics, Proc. Symp. Appl. Math., vol. 51, pp. 145–166. American Mathematical Society.
Moffatt HK (1969) The degree of knottedness of tangled vortex lines. Journal of Fluid Mechanics 35: 117–129.
Moffatt HK (1990) The energy spectrum of knots and links. Nature 347: 367–369.
Moffatt HK, Zaslavsky GM, Comte P, and Tabor M (1992) Topological Aspects of the Dynamics of Fluids and Plasmas, NATO ASI Series, Series E: Applied Sciences, vol. 218. Dordrecht: Kluwer Academic.
Ricca RL (1998) New developments in topological fluid mechanics: from Kelvin's vortex knots to magnetic knots. In: Stasiak A, Katritch V, and Kauffman LH (eds.) Ideal Knots. Singapore: World Scientific.
Ricca RL and Moffatt HK (1992) The helicity of a knotted vortex filament. In: Moffatt HK (ed.) Topological Aspects of Dynamics of Fluids and Plasmas, pp. 225–236. Dordrecht: Kluwer.
Tait PG (1900) On Knots I, II, III. In: Scientific Papers. Cambridge: Cambridge University Press.
Thomson JJ (1883) A Treatise on the Motion of Vortex Rings. London: Macmillan.
Trueba JL and Rañada AF (2000) Helicity in classical electrodynamics and its topological quantization. Apeiron 7: 83–88.
Woltjer L (1958) A theorem on force-free magnetic fields. Proceedings of the National Academy of Sciences, USA 44: 489–491.
quantum physics. Global effects play an important role in quantum-mechanical models and topology becomes an essential ingredient in their description. TQFT itself appeared in the winter of 1987 after Witten's work (Witten 1988a) on Donaldson theory (Donaldson 1990), but a series of papers during the 1980s which dealt with topological aspects of field and string theory anticipated its existence. Two of these correspond to Witten's works on supersymmetric quantum mechanics and supersymmetric sigma models (Witten 1982) that led to a generalization of Morse theory. This generalization was considered by Floer (1987) in a new context that constituted the key element in Witten's construction of TQFT. These developments were certainly influenced by Atiyah (1988). TQFT was born as a result of the interplay between physics and mathematics. This has been a constant feature all along its development.

Soon after the formulation of the TQFT addressing Donaldson theory, now known as Donaldson–Witten theory, Witten formulated a new TQFT which focuses on knot invariants such as the Jones polynomial and its generalizations (Jones 1985). Witten (1989) constructed Chern–Simons gauge theory and proved its relation to the theory of knot and link invariants. This theory possesses different features from Donaldson–Witten theory, and in fact it turns out that these two theories fall into two different general types of TQFTs, as will be explained in the following section. Despite their formal differences, both Donaldson–Witten and Chern–Simons gauge theory emerged as a novel way to express topological invariants in terms of quantum field theory quantities as well as to generalize their previous formulation. But there was much more to them than it seemed in their beginnings. Once these topological invariants were formulated in field theory language, one had a huge machinery to study them from different points of view. Theoretical physicists have developed many useful tools to study quantum field theory. The use of these tools led to new frameworks for these topological invariants.

In this overview we are going to provide the basics of TQFT and briefly describe two examples, Donaldson–Witten theory and Chern–Simons gauge theory, to explain how the general features are implemented. Some excellent reviews on the subject (Birmingham et al. 1991, Cordes et al. 1996, Labastida and Mariño 2004) are available. The organization of this work is as follows. In the following section we present a general introduction to TQFT from a functional integral point of view. Next, we touch upon the twisting of extended supersymmetry as a general constructive approach to TQFT. This is followed by a section on Donaldson–Witten theory where we discuss the computation of its observables from a perturbative approach, showing their relation to the Donaldson invariants. Next, we introduce Chern–Simons gauge theory as a theory of knot and link invariants. The penultimate section deals with advanced developments in TQFT. Finally, we end with some concluding remarks.

Topological Quantum Field Theory

We will start our overview by presenting the most general structure of a TQFT from a functional integral point of view which, though not rigorously defined, is the approach that has led to the most important developments. As in conventional quantum field theory, axiomatic approaches to TQFT do exist, but we will not follow that route here.

Let us consider an n-dimensional Riemannian manifold X endowed with a metric $g_{\mu\nu}$ and a quantum field theory on it. We will say that this theory is "topological" if there exist operators in the theory such that their correlation functions do not depend on the metric. If we denote these operators by $O_i$ (where i is a generic label), then

$$\frac{\delta}{\delta g^{\mu\nu}} \langle O_{i_1} \cdots O_{i_n} \rangle = 0 \qquad [1]$$

where $\langle \cdots \rangle$ denotes a vacuum expectation value. The operators that satisfy this equation are called "topological observables."

The simplest way to achieve metric independence is to consider a theory whose action and operators do not depend on the metric. In this situation, if no anomalous metric dependence is generated upon quantization, the correlation functions of these operators satisfy [1] and lead to topological invariants on X. Theories of this sort are collectively referred to as Schwarz-type TQFTs, and well-known examples are Chern–Simons gauge theory and BF theories. However, Schwarz-type theories are too restrictive. One would like to have a theory satisfying property [1] with a weaker condition on the action. This can be achieved with the help of a symmetry. The resulting TQFTs are said to be of Witten or cohomological type, the main examples being Donaldson–Witten theory and topological sigma models (Witten 1988b).

For TQFTs of Witten type, the action may depend on the metric. However, the theory has an underlying scalar symmetry $\delta$ acting on the fields $\phi_i$. Since $\delta$ is a symmetry, the action of the theory satisfies $\delta S(\phi_i) = 0$. In these theories, metric independence of the correlation functions is achieved as follows. Let $T_{\mu\nu} = (\delta/\delta g^{\mu\nu}) S(\phi_i)$ be the energy–momentum tensor of
280 Topological Quantum Field Theory: Overview
the theory. It turns out that the energy–momentum tensor is $\delta$-exact:

$$T_{\mu\nu} = -i\,\delta G_{\mu\nu} \qquad [2]$$

$G_{\mu\nu}$ being some tensor. Indeed, if [2] is satisfied, it follows that for any set of operators $O_i$ which are $\delta$-invariant,

$$\frac{\delta}{\delta g^{\mu\nu}} \langle O_{i_1} O_{i_2} \cdots O_{i_n} \rangle = -\langle O_{i_1} O_{i_2} \cdots O_{i_n} T_{\mu\nu} \rangle = i \langle O_{i_1} O_{i_2} \cdots O_{i_n} \, \delta G_{\mu\nu} \rangle = i \langle \delta ( O_{i_1} O_{i_2} \cdots O_{i_n} G_{\mu\nu} ) \rangle = 0 \qquad [3]$$

In this computation we have assumed that the symmetry $\delta$ is not anomalous and that there are no contributions coming from boundary terms, since we have integrated by parts in field space. This is not always the case, and in fact the situations in which one of these two properties fails lead to rich phenomena. In those cases, for example, in Donaldson–Witten theory on manifolds with $b_2^+ = 1$, the correlation functions fail to be topological invariants in a controlled manner which unveils many interesting properties.

We will now describe Witten-type theories in a general context. The general structure of Schwarz-type theories is much simpler and will be illustrated in the example presented below. In Witten-type theories the observables are the $\delta$-invariant operators. It is simple to prove that $\delta$-exact operators decouple from the theory. Indeed, if $O_a$ is $\delta$-exact, $O_a = \delta \hat{O}_a$, then

$$\langle O_a O_{i_1} O_{i_2} \cdots O_{i_n} \rangle = \langle \delta \hat{O}_a \, O_{i_1} O_{i_2} \cdots O_{i_n} \rangle = \langle \delta ( \hat{O}_a O_{i_1} O_{i_2} \cdots O_{i_n} ) \rangle = 0 \qquad [4]$$

Thus, one can restrict the set of observables to the cohomology of $\delta$:

$$O \in \frac{\mathrm{Ker}\,\delta}{\mathrm{Im}\,\delta} \qquad [5]$$

There is no reason a priori why the $\delta$-symmetry should be a scalar Grassmannian symmetry, but in all known models of Witten-type TQFTs this turns out to be the case. Thus, these theories violate the spin-statistics theorem. In all these models the algebra of the symmetry has the form

$$\delta^2 = Z \qquad [6]$$

where Z is a symmetry transformation (typically a gauge symmetry of some sort). This property forces one to consider Z-invariant observables and to work in the context of "equivariant cohomology."

The observables of Witten-type theories fit into a general pattern that we describe now. The key ingredient is a map between the homology of X and the equivariant cohomology of $\delta$. Given an operator $\phi^{(0)}$ in the equivariant cohomology of $\delta$, let us consider the following set of equations:

$$d\phi^{(n)} = \delta\phi^{(n+1)}, \qquad n \ge 0 \qquad [7]$$

where the operators $\phi^{(n)}$ ($n = 1, \ldots, \dim X$) are differential forms of degree n on X and d is the de Rham differential. These differential equations are called "descent equations" and their solutions $\phi^{(n)}$ ($n \ge 0$) "topological descendants" of $\phi^{(0)}$. We will show how to construct a solution to these equations on general grounds.

The topological descendants lead to the construction of a set of elements of the equivariant cohomology of $\delta$. Let $\gamma_n$ be an n-cycle on X, $\gamma_n \in H_n(X)$, and let us consider the following operator:

$$W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\gamma_n} \phi^{(n)} \qquad [8]$$

This operator is $\delta$-invariant,

$$\delta W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\gamma_n} \delta\phi^{(n)} = \int_{\gamma_n} d\phi^{(n-1)} = \int_{\partial\gamma_n} \phi^{(n-1)} = 0 \qquad [9]$$

since $\partial\gamma_n = 0$. On the other hand, if $\gamma_n$ were trivial in homology, that is, if $\gamma_n = \partial\gamma_{n+1}$, we would have that $W_{\phi^{(0)}}^{(\gamma_n)}$ is $\delta$-exact:

$$W_{\phi^{(0)}}^{(\gamma_n)} = \int_{\partial\gamma_{n+1}} \phi^{(n)} = \int_{\gamma_{n+1}} d\phi^{(n)} = \delta \int_{\gamma_{n+1}} \phi^{(n+1)} \qquad [10]$$

Thus, given the operator $\phi^{(0)}$, we have constructed a map between the homology of X and the equivariant cohomology of $\delta$. There are as many maps as basic operators $\phi^{(0)}$ one finds in the theory.

To actually construct these maps, we need to find a solution of the descent equations [7]. As announced before, there is a general solution to those equations in Witten-type theories. Since in this type of theories [2] holds, there exists an operator

$$G_\mu \equiv G_{0\mu} \qquad [11]$$

that satisfies

$$P_\mu = T_{0\mu} = -i\,\delta G_\mu \qquad [12]$$

Notice that $G_\mu$ is an anticommuting operator and a 1-form in spacetime. With the aid of this operator, one constructs the following solution to the descent equations [7]:

$$\phi^{(n)} = \frac{1}{n!} \, \phi^{(n)}_{\mu_1 \mu_2 \ldots \mu_n} \, dx^{\mu_1} \wedge \cdots \wedge dx^{\mu_n} \qquad [13]$$

where

$$\phi^{(n)}_{\mu_1 \mu_2 \ldots \mu_n}(x) = G_{\mu_1} G_{\mu_2} \cdots G_{\mu_n} \phi^{(0)}(x), \qquad n = 1, \ldots, \dim X \qquad [14]$$
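As an illustration, the lowest descent equation can be checked directly from [12]–[14]. The sketch below rests on conventions that the text does not spell out and that are assumed here: [12] is read as $P_\mu = -i\,\delta G_\mu$, the product $G_\mu \phi^{(0)}$ in [14] denotes the graded commutator of $G_\mu$ with $\phi^{(0)}$, $\delta$ obeys the graded Leibniz rule, and $P_\mu$ acts on fields as $-i\partial_\mu$.

```latex
\begin{aligned}
\delta \phi^{(1)}_{\mu}
  &= \delta\bigl(G_\mu \phi^{(0)}\bigr)
   = (\delta G_\mu)\,\phi^{(0)} - G_\mu\,\bigl(\delta\phi^{(0)}\bigr) \\
  &= i P_\mu \phi^{(0)} - 0
   = \partial_\mu \phi^{(0)},
\end{aligned}
\qquad\Longrightarrow\qquad
\delta\phi^{(1)} = dx^\mu\,\partial_\mu\phi^{(0)} = d\phi^{(0)} .
```

This is the $n = 0$ case of [7]. The minus sign in the Leibniz rule reflects the anticommuting character of $G_\mu$, and the second term vanishes because $\phi^{(0)}$ is $\delta$-invariant.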
One can easily check using [12] and the $\delta$-invariance of $\phi^{(0)}$ that the operators [13] do satisfy the descent equations [7].

We have seen that Witten-type TQFTs are characterized by property [2]. It would be desirable to have at hand a systematic procedure to build theories satisfying that property. It has been found that extended supersymmetry provides a very helpful starting point to build those theories. Although supersymmetry guarantees from first principles only the weaker condition [12] instead of [2], all TQFTs that have been constructed from extended supersymmetry actually satisfy [2]. To build a TQFT from a theory with extended supersymmetry, one needs to go through the twisting procedure that we now describe.

Twisting of Extended Supersymmetry

All known Witten-type theories are related to an underlying extended supersymmetric quantum field theory. The topological theory is a modified version of the supersymmetric theory in which the Lorentz transformation properties (spins) of some of the fields have been modified. This modification of spin assignments is known as twisting, and it can be carried out on any theory with extended supersymmetry in any spacetime dimension. We will not consider the procedure in such a general setting but instead we will illustrate it by considering the case of N = 2 supersymmetry in four dimensions. We will begin with a general description and then we will apply it to a specific example: Donaldson–Witten theory.

Let us consider the Euclidean version of the N = 2 supersymmetry algebra with no central charges. Central charges can be included without much ado but we will not consider them for simplicity. The total symmetry group of the theory is $H = SU(2)_+ \otimes SU(2)_- \otimes SU(2)_R \otimes U(1)_R$, $K = SU(2)_+ \otimes SU(2)_-$ being the rotation group, and $SU(2)_R \otimes U(1)_R$ the internal symmetry group of the N = 2 supersymmetry algebra. The generator algebra takes the following form:

$$
\begin{aligned}
\{Q_{v\alpha}, \bar{Q}_{w\dot\beta}\} &= 2\,\epsilon_{vw}\,\sigma^\mu_{\alpha\dot\beta}\,P_\mu, &
\{Q_{v\alpha}, Q_{w\beta}\} &= 0 \\
[P_\mu, Q_{v\alpha}] &= 0, &
[P_\mu, \bar{Q}_{v\dot\alpha}] &= 0 \\
[M_{\alpha\beta}, Q_{v\gamma}] &= \epsilon_{\gamma(\alpha} Q_{\beta)v}, &
[M_{\alpha\beta}, \bar{Q}_{v\dot\gamma}] &= 0 \\
[\bar{M}_{\dot\alpha\dot\beta}, Q_{v\gamma}] &= 0, &
[\bar{M}_{\dot\alpha\dot\beta}, \bar{Q}_{v\dot\gamma}] &= \epsilon_{\dot\gamma(\dot\alpha} \bar{Q}_{\dot\beta)v} \\
[B_{vw}, Q_{u\alpha}] &= \epsilon_{u(v} Q_{w)\alpha}, &
[B_{vw}, \bar{Q}_{u\dot\alpha}] &= \epsilon_{u(v} \bar{Q}_{w)\dot\alpha} \\
[Q_{v\alpha}, R] &= Q_{v\alpha}, &
[\bar{Q}_{v\dot\alpha}, R] &= -\bar{Q}_{v\dot\alpha}
\end{aligned}
\qquad [15]
$$

In these relations $v, w \in \{1, 2\}$ are $SU(2)_R$ indices and $\alpha$ and $\dot\alpha$ denote spinorial indices of $SU(2)_-$ and $SU(2)_+$, respectively. The supersymmetry generators $Q_{v\alpha}$ and $\bar{Q}_{v\dot\alpha}$ transform under H as $(0, 1/2, 1/2)^{1}$ and $(1/2, 0, 1/2)^{-1}$, respectively. $\bar{M}_{\dot\alpha\dot\beta}$ and $M_{\alpha\beta}$ are the generators of $SU(2)_+$ and $SU(2)_-$, respectively, while $B_{vw}$ and R generate $SU(2)_R$ and $U(1)_R$, respectively.

The twisting of a supersymmetric theory involves a modification of the couplings of the theory to a background metric on the space where the theory is defined. This modification is carried out by redefining the Lorentz transformation properties of the different fields making use of the internal symmetry $SU(2)_R$. In particular, we will redefine the couplings of the fields to the $SU(2)_+$ spin connection according to the way they transform under $SU(2)_R$. This is easily done by identifying the $SU(2)_R$ indices v with the $SU(2)_+$ indices $\dot\alpha$. The procedure involves a redefinition of the rotation group into $K' = SU'(2)_+ \otimes SU(2)_-$, where $SU'(2)_+$ is generated by

$$M'_{\dot\alpha\dot\beta} = \bar{M}_{\dot\alpha\dot\beta} - B_{\dot\alpha\dot\beta} \qquad [16]$$

The supersymmetry generators $Q_{v\alpha}$ and $\bar{Q}_{v\dot\alpha}$ get transformed in the following way:

$$\bar{Q}_{v\dot\alpha} \to \bar{Q}_{\dot\beta\dot\alpha}, \qquad Q_{v\alpha} \to Q_{\dot\beta\alpha} \qquad [17]$$

which allows us to define the "topological supercharge":

$$Q \equiv \epsilon^{\dot\alpha\dot\beta}\,\bar{Q}_{\dot\alpha\dot\beta} \qquad [18]$$

It is simple to prove using [15] and [16] that this quantity is a scalar under the new rotation group $K'$: $[M_{\alpha\beta}, Q] = 0$ and $[M'_{\dot\alpha\dot\beta}, Q] = 0$. In addition, from [15], it follows that Q is nilpotent (in the absence of central charges):

$$Q^2 = 0 \qquad [19]$$

The scalar generator Q leads to the topological symmetry $\delta$ of the previous section. Actually, the twisting procedure also provides the operator $G_\mu$ in [12]. Defining

$$G_\mu = \frac{i}{4}\,(\bar\sigma_\mu)^{\dot\alpha\gamma}\,Q_{\dot\alpha\gamma} \qquad [20]$$

one easily finds, after using [15] and [18],

$$\{Q, G_\mu\} = \partial_\mu \qquad [21]$$

which is indeed equivalent to [12]. On general grounds we cannot prove that twisted supersymmetric theories lead to theories which satisfy [2]. However, relation [12], which is weaker, is guaranteed. It turns out that in all the models originating from extended supersymmetry which have been studied, [2] is satisfied and thus the resulting theories are TQFTs of Witten type.
The action [25] is Q-exact up to a topological Using G we can now construct the map between
term: the homology of X and the equivariant cohomology
Z of Q. Let us consider the simple case SU(2). There
1
S ¼ fQ; Vg F^F ½27 exists only one independent Casimir and, corre-
2 spondingly, only one basic operator:
where
O ¼ trð2 Þ ½32
Z
pffiffiffi i _ _ for which one finds the following set of descendants:
V ¼ d4 x g tr _ _ ðF_ þ D_ Þ
4
1
1 1 Oð1Þ ¼ tr pffiffiffi dx
½; y þ pffiffiffi _ r_
y ½28 2
2 2 2
ð2Þ 1 1
O ¼ tr pffiffiffi ðF þ D Þ
Actually, it turns out that in all the theories obtained 2 2 ½33
after twisting extended supersymmetry, the resulting 1
actions are Q-exact up to topological terms. In the dx ^ dx
4
case
R of N = 2 theories, topological (theta) terms
..
F ^ F are generically not observable (due to a chiral .
anomaly), so it is customary to pick
The map from the homology of X to the equivariant
SDW ¼ fQ; Vg ½29 cohomology of Q can now be constructed very
easily. Let i be an element of the homology group
as the action of the theory, which immediately implies
Hi (X). We associate to it the following observable:
[2] and therefore the topological character of the Z
theory. Notice, however, that [29] is stronger than [2].
i ! Ii ði Þ ¼ OðiÞ ½34
As we described in the previous section, the i
observables of the theory can be constructed using (i)
the operator G in [20]. Its action on the twisted where O is given in [33]. The construction assures
fields is easily obtained using [23]: that Ii (i ) is invariant under Q and gauge transfor-
mations. Furthermore, it is also assured that Ii (i ) is
1 not Q-exact.
½G ; ¼ pffiffiffi
2 2 Let us consider the computation of correlation
i functions. The discussion will be presented for a
½G ; A ¼ g
i generic gauge group. We will consider the topologi-
2pffiffiffi
cal theory defined by the Donaldson–Witten action
i 2
½G;
¼ r
4 SDW ¼ fQ; Vg ½35
fG ; g ¼ ðF þ Dþ
Þ
½30 where V is defined in [28]. The property [35] has a
½G; ¼ 0 very important consequence. The action SDW shows
3i up in the correlation functions as exp(SDW =e2 ),
½G; Fþ ¼ ir þ
r
where e is a free parameter which corresponds to
pffiffiffi2
3i 2 the coupling constant of the N = 2 theory. Since the
fG; g ¼
r term involving the coupling constant is Q-exact, the
8
3i 3i correlation functions of Q-invariant operators are
½G; D ¼
r
þ r independent of e. Let us explain this in some detail.
4 2
The (unnormalized) correlation functions of the
We now need to fix the basic operator (0) in [14]. theory are defined by
The starting point must be a set of gauge-invariant, Z
2
Q-closed operators which are not Q-trivial. Since h1 n i ¼ D1 n eð1=e ÞSDW ½36
[Q, ] = 0, these operators are the gauge-invariant
polynomials in the field . For a simple gauge group where 1 , . . . , n are invariant under Q transforma-
of rank r the algebra of these polynomials is tions. Using the fact that SDW is Q-exact, one obtains
generated by r elements, and we shall denote this
basis by On , n = 1, . . . , r. A simple choice for SU(N) @ 2
h1 n i ¼ 3 h1 n SDW i
consists of the following Casimirs: @e e
2
On ¼ trðnþ1 Þ; n ¼ 1; . . . ; N ½31 ¼ 3 hfQ; 1 n Vgi ¼ 0 ½37
e
where we have used the fact that Q is a symmetry of the theory, and therefore as in [3] the last functional integral gives zero. This result implies that one can compute these correlation functions in different limits of e. In the weak-coupling limit (semiclassical or saddle point approximation), one establishes the connection with Donaldson theory. In the strong-coupling limit, Seiberg–Witten invariants appear and one finds the connection between these two types of invariants. We will briefly explore the weak-coupling limit $e \to 0$. The functional integral [36] can be evaluated exactly in two steps: first one analyzes the zero modes or classical configurations that minimize the action, then one expands around them considering only quadratic fluctuations. The integration over these quadratic fluctuations involves ratios of determinants of kinetic operators that, because of the Q-symmetry of the theory (which in fact is a Bose–Fermi symmetry), are $\pm 1$. One is then left with an integral over the bosonic zero modes which leads to a finite-dimensional integral over the space of bosonic collective coordinates, and a finite Grassmannian integral over the zero modes of the fermionic fields. A careful analysis of the zero modes, first carried out by Witten, reveals that the infinite-dimensional functional integral is replaced by a finite-dimensional integral over the moduli space of anti-self-dual (ASD) connections $\mathcal{M}_{ASD}$, that is, the space of connections satisfying $F^+ = 0$. Therefore, the correlation functions [36] have the form

$$\langle \phi_1 \cdots \phi_n \rangle = \int_{\mathcal{M}_{ASD}} \hat\phi_1 \wedge \cdots \wedge \hat\phi_n \qquad [38]$$

where the fields in $\phi_1 \cdots \phi_n$ are mapped to differential forms $\hat\phi_1 \cdots \hat\phi_n$ on $\mathcal{M}_{ASD}$, the degree of each form being given by the ghost number of its partner. Notice that the integral on the right-hand side vanishes unless the form has top degree. From the field-theoretical point of view, this is the requirement that the overall ghost number of the correlation function must be equal to $\dim \mathcal{M}_{ASD}$. The quantities on the right-hand side of [38] are, for gauge group SU(2), precisely the Donaldson invariants. Thus, Witten's work provided a new point of view on these invariants by reformulating them in a quantum field theory language. This is a very important contribution since quantum field theory is a very rich framework and a wide variety of methods can be used to analyze the correlation functions. This opened an entirely new strategy to investigate the Donaldson invariants. The emergence of Seiberg–Witten invariants is perhaps the greatest achievement of the implementation of this strategy.

We finish this section by pointing out that many features of the evaluation of the functional integral of the Donaldson–Witten theory developed here are common to most topological field theories of the Witten type. These features can be studied in the context of the Mathai–Quillen formalism, which is the object of a separate article in the encyclopedia (see Mathai–Quillen Formalism).

Chern–Simons Gauge Theory for Knots and Links

Chern–Simons gauge theory is the most important example of Schwarz-type TQFTs. Let us begin by introducing its basic elements. Chern–Simons gauge theory is a quantum field theory whose action is based on the Chern–Simons form associated to a nonabelian gauge group. The theory is defined by the following data: a smooth 3-manifold M which will be taken to be compact, a gauge group G which will be taken to be semisimple and compact, and an integer parameter k. The action of the theory is

$$S_{CS}(A) = \frac{k}{4\pi} \int_M \mathrm{tr}\left( A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A \right) \qquad [39]$$

where A is a gauge connection and the trace is taken in the fundamental representation. The exponential of i times this action is invariant under gauge transformations,

$$A \to g^{-1} A\, g + g^{-1} dg \qquad [40]$$

where g is a map $g : M \to G$.

Notice that the action [39] is independent of the metric on the 3-manifold M. In this theory, appropriate observables lead to correlation functions which correspond to topological invariants. Candidates to be observables of this type must be metric independent and gauge invariant. Wilson loops satisfy these properties. They correspond to the holonomy of the gauge connection A along a loop. Given a representation R of the gauge group G and a 1-cycle $\gamma$ on M, the Wilson loop is defined as

$$W_R^\gamma(A) = \mathrm{tr}_R\, \mathrm{Hol}_\gamma(A) = \mathrm{tr}_R\, P \exp \oint_\gamma A \qquad [41]$$

Products of these operators are the natural candidates to obtain topological invariants after computing their correlation functions. These correlation functions are formally written as

$$\langle W_{R_1}^{\gamma_1} W_{R_2}^{\gamma_2} \cdots W_{R_n}^{\gamma_n} \rangle = \int [DA]\, W_{R_1}^{\gamma_1}(A)\, W_{R_2}^{\gamma_2}(A) \cdots W_{R_n}^{\gamma_n}(A)\, e^{i S_{CS}(A)} \qquad [42]$$
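The role of the integer parameter k in the gauge invariance of the exponentiated Chern–Simons action can be illustrated with a small numerical sketch. It assumes the standard result, not derived in the text, that under a gauge transformation of winding number w the action [39] shifts by 2πkw, so that exp(iS) is unchanged exactly when k is an integer. The function name is hypothetical.

```python
import cmath
import math

def phase_after_gauge_shift(k, winding):
    """Factor by which exp(i*S_CS) changes when S_CS shifts by 2*pi*k*winding.

    The shift 2*pi*k*winding is an assumed standard result for a gauge
    transformation of integer winding number; it is not taken from the text.
    """
    return cmath.exp(1j * 2.0 * math.pi * k * winding)

# Integer level: the weight exp(i*S_CS) is unchanged.
integer_level = phase_after_gauge_shift(3, -2)
# Half-integer level: a winding-one transformation flips the sign of the weight.
half_integer_level = phase_after_gauge_shift(0.5, 1)
```

With an integer level the returned factor equals 1 up to rounding, whereas a half-integer level with winding number one gives a factor close to −1, so the weight in [42] would not be well defined.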
invariants, $N_g^\beta$. The topological string contribution takes the form

$$\sum_{g \ge 0} x^{2g-2} \left( \sum_{\beta \in H_2(X,\mathbb{Z})} N_g^\beta \, e^{-\int_\beta \omega} \right) \qquad [47]$$

where $\omega$ is the Kähler class of the Calabi–Yau manifold. In general, the quantities $N_g^\beta$ are rational numbers.

The preceding discussion has shown how Gromov–Witten invariants can be interpreted in terms of string theory. One could think that this is just a fancy observation and that no further insight on these invariants can be gained from this formulation. The situation turns out to be quite the opposite. Once a string formulation has been obtained, the whole machinery of string theory is at our disposal. One should look for new ways to compute the quantity [47], where Gromov–Witten invariants are packed. The hope is that, if this is possible, the new emerging picture will provide new insights on these invariants. This is indeed what occurred recently. It turns out that the quantity [47] can be obtained from an alternative point of view in which the embedded Riemann surfaces are regarded as D-branes. The outcome of this approach is that the Gromov–Witten invariants can be written in terms of other invariants which are integers and that possess a geometrical interpretation. To be more specific, the quantity [47] takes the form

$$\sum_{g \ge 0} \; \sum_{d > 0} \; \sum_{\beta \in H_2(X,\mathbb{Z})} n_g^\beta \, \frac{1}{d} \left( 2 \sin \frac{d x}{2} \right)^{2g-2} e^{-d \int_\beta \omega} \qquad [48]$$

where $n_g^\beta$ are the new "integer" invariants. This prediction has been verified in all the cases in which it has been tested. A similar structure will be found in the next section in the context of knot theory in the large-N limit.

Let us now consider also Donaldson–Witten theory from a new perspective. To be more specific, let us consider the case in which the gauge group is SU(2), and the 4-manifold X is simply connected and has $b_2^+ > 1$ (the case $b_2^+ = 1$ is anomalous). In this situation there are $1 + b_2$ physical observables [34], $O = I_1$ and $I(\Sigma_a) = I_2(\Sigma_a)$ ($a = 1, \ldots, b_2$), where $\Sigma_a$ is a basis of $H_2(X)$. These can be packed in a generating functional:

$$\left\langle \exp\left( \sum_a \alpha_a I(\Sigma_a) + \lambda\, O \right) \right\rangle \qquad [49]$$

where $\lambda$ and $\alpha_a$ ($a = 1, \ldots, b_2$) are parameters. In computing this quantity one can argue that the contribution is localized on the moduli space of instanton configurations and one ends up, after taking into account the selection rule dictated by the dimensionality of the moduli space, with integrations over the moduli space of the selected forms. The resulting quantities are Donaldson invariants.

As in the case of topological sigma models one could be tempted to argue that the observation leading to a field-theoretical interpretation of Donaldson invariants does not provide any new insight. Quite on the contrary, once a field theory formulation is available, one has at one's disposal a huge machinery which could lead, on the one hand, to further generalizations of the theory and, on the other hand, to new ways to compute quantities such as [49], obtaining new insights on these invariants. This is indeed what happened in the 1990s, leading to an important breakthrough in 1994 when Seiberg and Witten calculated [49] in a different way and pointed out the relation of Donaldson invariants to new integer invariants that nowadays bear their names.

The localization argument that led to the interpretation of [49] as Donaldson invariants is valid because the theory under consideration is exact in the weak-coupling limit. Actually, the topological theory under consideration is independent of the coupling constant and thus calculations in the strong-coupling limit are also exact. These types of calculations were out of reach before 1994. The situation changed dramatically after the work of Seiberg and Witten in which N = 2 super Yang–Mills theory was solved in the strong-coupling limit. Its application to the corresponding twisted version was immediate and it turned out that Donaldson invariants can be written in terms of new integer invariants now known as Seiberg–Witten invariants (Witten 1994). The development has a strong resemblance to the one described above for topological strings: certain noninteger invariants can be expressed in terms of new integer invariants.

The Seiberg–Witten invariants are actually simpler to compute than Donaldson invariants. They correspond to partition functions of topological Yang–Mills theories where the gauge group is abelian. These contributions can be grouped into classes labeled by $x = -2c_1(L)$, where $c_1(L)$ is the first Chern class of the corresponding line bundle. The sum of contributions, each being $\pm 1$, for a given class x is the integer Seiberg–Witten invariant $n_x$. The strong-coupling analysis of topological Yang–Mills theory leads to the following expression for [49]:

$$2^{1 + \frac{1}{4}(7\chi + 11\sigma)} \left( e^{\frac{v^2}{2} + 2\lambda} \sum_x n_x\, e^{v \cdot x} + i^{\frac{\chi + \sigma}{4}}\, e^{-\frac{v^2}{2} - 2\lambda} \sum_x n_x\, e^{-i v \cdot x} \right) \qquad [50]$$

where $v = \sum_a \alpha_a \Sigma_a$, and $\chi$ and $\sigma$ are the Euler number and the signature of the manifold X. This result matches the known structure of [49] (structure theorem of Kronheimer and Mrowka) and provides
a meaning to its unknown quantities in terms of the new Seiberg–Witten invariants. Equation [50] is a rather remarkable prediction that has been tested in many cases, and for which a general proof has been recently proposed. For a review of the subject, see Labastida and Lozano (1998).

The situation for manifolds with $b_2^+ = 1$ involves a metric dependence and has been worked out in detail (Moore and Witten 1998). The formulation of Donaldson invariants in field-theoretical terms has also provided a generalization of these invariants. This generalization has been carried out in several directions: (1) the consideration of higher-rank groups, (2) the coupling to matter fields after twisting N = 2 hypermultiplets, and (3) the twist of theories involving N = 4 supersymmetry.

We will now look at Chern–Simons gauge theory from the perspective that emerges from its treatment in the context of the large-N expansion. We will restrict the discussion to the case of knots on $S^3$ with gauge group SU(N). Gauge theories with gauge group SU(N) admit, besides the perturbative expansion, a large-N expansion. In this expansion correlators are expanded in powers of 1/N while keeping the ’t Hooft coupling t = Nx fixed, x being the coupling constant of the gauge theory. For example, for the free energy of the theory, one has the general form

$$F = \sum_{g\ge 0,\; h\ge 1}^{\infty} C_{g,h}\, N^{2-2g}\, t^{2g-2+h} \qquad [51]$$

In the case of Chern–Simons gauge theory, the coupling constant is $x = 2\pi i/(k + N)$ after taking into account the shift in k. The large-N expansion [51] resembles a string-theory expansion and indeed the quantities $C_{g,h}$ can be identified with the partition function of a topological open string with g handles and h boundaries, with N D-branes on $S^3$ in an ambient six-dimensional target space $T^*S^3$. This was pointed out by Witten in 1992. The result makes a connection between a topological three-dimensional field theory and the topological strings described in the previous section.

In 1998 an important breakthrough took place which provided a new approach to compute quantities such as [51]. Using arguments inspired by the AdS/CFT correspondence, Gopakumar and Vafa (1999) provided a closed-string-theory interpretation of the partition function [51]. They conjectured that the free energy F can be expressed as

$$F = \sum_{g\ge 0}^{\infty} N^{2-2g} F_g(t) \qquad [52]$$

where $F_g(t)$ corresponds to the partition function of a topological closed-string theory on the noncompact Calabi–Yau manifold X called the resolved conifold, $\mathcal{O}(-1)\oplus\mathcal{O}(-1)\to\mathbb{P}^1$, t being the flux of the B-field through $\mathbb{P}^1$. The quantities $F_g(t)$ have been computed using both physical and mathematical arguments, thus proving the conjecture.

Once a new picture for the partition function of Chern–Simons gauge theory is available, one should ask about the form that the expectation values of Wilson loops could take in the new context. The question was faced by Ooguri and Vafa and they provided the answer, later refined by Labastida, Mariño, and Vafa. The outcome is an entirely new point of view in the theory of knot and link invariants. The new picture provides a geometrical interpretation of the integer coefficients of the quantum group invariants, an issue that has been investigated for many years. To present an account of these developments, one needs to review first some basic facts of large-N expansions.

To consider the presence of Wilson loops, it is convenient to introduce a particular generating functional. First, one performs a change of basis from representations R to conjugacy classes C(k) of the symmetric group, labeled by vectors $k = (k_1, k_2, \ldots)$ with $k_i \ge 0$, and $|k| = \sum_j k_j > 0$. The change of basis is $W_k = \sum_R \chi_R(C(k))\, W_R$, where $\chi_R$ are characters of the permutation group $S_\ell$ of $\ell = \sum_j j\,k_j$ elements ($\ell$ is also the number of boxes of the Young tableau associated to R). Second, one introduces the generating functional:

$$F(V) = \log Z(V) = \sum_k \frac{|C(k)|}{\ell!}\, W_k^{(c)}\, \Upsilon_k(V) \qquad [53]$$

where

$$Z(V) = \sum_k \frac{|C(k)|}{\ell!}\, W_k\, \Upsilon_k(V), \qquad \Upsilon_k(V) = \prod_j \left(\operatorname{tr} V^j\right)^{k_j}$$

In these expressions $|C(k)|$ denotes the number of elements of the class C(k) in $S_\ell$. The reason behind the introduction of this generating functional is that the large-N structure of the connected Wilson loops, $W_k^{(c)}$, turns out to be very simple:

$$\frac{|C(k)|}{\ell!}\, W_k^{(c)} = \sum_{g=0}^{\infty} x^{2g-2+|k|}\, F_{g,k}(\lambda) \qquad [54]$$

where $\lambda = e^t$ and t = Nx is the ’t Hooft coupling. Writing x = t/N, it corresponds to a power series expansion in 1/N. As before, the expansion looks like a perturbative series in string theory where g is the genus and $|k|$ is the number of holes. Ooguri and Vafa conjectured in 1999 the appropriate string-theory description of [54]. It corresponds to an open topological string theory (notice that the ones
described in the previous section were closed), whose target space is the resolved conifold X. The contribution from this theory will lead to open-string analogs of Gromov–Witten invariants.

In order to describe in more detail the fact that one is dealing with open strings, some new data need to be introduced. Here is where the knot description intrinsic to the Wilson loop enters. Given a knot K on $S^3$, let us associate to it a Lagrangian submanifold $C_K$ with $b_1 = 1$ in the resolved conifold X and consider a topological open string on it. The contributions in this open topological string are localized on holomorphic maps $f : \Sigma_{g,h} \to X$ with $h = |k|$ which satisfy $f_*[\Sigma_{g,h}] = Q$, and $f_*[C] = j[\gamma]$ for $k_j$ oriented circles C. In these expressions $\gamma \in H_1(C_K, \mathbb{Z})$, and $Q \in H_2(X, C_K, \mathbb{Z})$, that is, the map is such that $k_j$ boundaries of $\Sigma_{g,h}$ wrap the knot j times, and $\Sigma_{g,h}$ itself gets mapped to a relative two-homology class characterized by the Lagrangian submanifold $C_K$. The number of such maps (in the sense described in the previous section) is the open-string analog of Gromov–Witten invariants. They will be denoted by $N^Q_{g,k}$. Comparing to the situation that led to [47] in the closed-string case, one concludes that in this case the quantities $F_{g,k}(\lambda)$ in [54] must take the form

$$F_{g,k}(\lambda) = \sum_Q N^{Q}_{g,k}\, e^{\int_Q \omega}, \qquad t = \int_{\mathbb{P}^1}\omega \qquad [55]$$

where $\omega$ is the Kähler class of the Calabi–Yau manifold X and $\lambda = e^t$. For any Q, one can always write $\int_Q \omega = Qt$, where Q is in general a half-integer number. Therefore, $F_{g,k}(\lambda)$ is a polynomial in $\lambda^{\pm 1/2}$ with rational coefficients.

The result [55] is very impressive but still does not provide a representation where one can assign a geometrical interpretation to the integer coefficients of the quantum-group invariants. Notice that to match a polynomial invariant to [55], after obtaining its connected part, one must expand it in x after setting $q = e^x$ keeping $\lambda$ fixed. One would like to have a refined version of [55], in the spirit of what was described in the previous section leading from the Gromov–Witten invariants $N_g^\beta$ of [47] to the new integer invariants $n_g^\beta$ of [48]. It turns out that, indeed, F(V) can be expressed in terms of integer invariants in complete analogy with the description presented in the previous section for topological strings. A good review on the subject can be found in Mariño (2005).

Concluding Remarks

In this overview we have introduced key features of TQFTs and we have described some of the most relevant results that have emerged from them. We have described how the many faces of TQFT provide a variety of important insights in a selected set of problems in topology. Prominent among these are the reformulation of Donaldson theory and the discovery of the Seiberg–Witten invariants, and the string-theory description of the large-N expansion of Chern–Simons gauge theory, which provides an entirely new point of view in the study of knot and link invariants and points to an underlying fascinating interplay between string theory, knot theory, and enumerative geometry which opens new fields of study.

In addition to their intrinsic mathematical interest, TQFTs have been found relevant to important questions in physics as well. This is so because, in a sense, TQFTs are easier to solve than conventional quantum field theories. For example, topological sigma models are relevant to the computation of certain couplings in string theory. Also, Witten-type gauge TQFTs such as Donaldson–Witten theory and its generalizations play a role in string theory as effective world-volume theories of extended string states (branes) wrapping curved spaces, and TQFTs arising from N = 4 gauge theories in four dimensions have shed light on field- (and string-) theory dualities.

Most of these developments, and others that we have not touched upon or only mentioned in passing, have their own entries in the encyclopedia, to which we refer the interested reader for further details.

See also: Axiomatic Approach to Topological Quantum Field Theory; BF Theories; Chern–Simons Models: Rigorous Results; Donaldson–Witten Theory; Gauge Theoretic Invariants of 4-Manifolds; Gauge Theory: Mathematical Applications; Hamiltonian Fluid Dynamics; The Jones Polynomial; Knot Theory and Physics; Mathai–Quillen Formalism; Mathematical Knot Theory; Schwarz-Type Topological Quantum Field Theory; Seiberg–Witten Theory; Stationary Phase Approximation; Topological Sigma Models.

Further Reading

Atiyah MF (1988) New invariants of three and four dimensional manifolds. In: The Mathematical Heritage of Hermann Weyl, Proc. Symp. Pure Math., vol. 48. American Math. Soc. pp. 285–299.
Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theory. Physics Reports 209: 129–340.
Cordes S, Moore G, and Ramgoolam S (1996) Lectures on 2D Yang–Mills theory, equivariant cohomology and topological field theories. In: David F, Ginsparg P, and Zinn-Justin J (eds.) Fluctuating Geometries in Statistical Mechanics and Field Theory, Les Houches Session LXII, p. 505 (hep-th/9411210). Elsevier.
Donaldson SK (1990) Polynomial invariants for smooth four-manifolds. Topology 29: 257–315.
Floer A (1987) Morse theory for fixed points of symplectic diffeomorphisms. Bulletin of the American Mathematical Society 16: 279.
290 Topological Sigma Models
Freyd P, Yetter D, Hoste J, Lickorish WBR, Millet K, and Ocneanu A (1985) A new polynomial invariant of knots and links. Bulletin of the American Mathematical Society 12(2): 239–246.
Gopakumar R and Vafa C (1999) On the gauge theory/geometry correspondence. Advances in Theoretical and Mathematical Physics 3: 1415 (hep-th/9811131).
Jones VFR (1985) A polynomial invariant for knots via von Neumann algebras. Bulletin of the American Mathematical Society 12: 103–112.
Jones VFR (1987) Hecke algebra representations of braid groups and link polynomials. Annals of Mathematics 126(2): 335–388.
Kauffman LH (1990) An invariant of regular isotopy. Transactions of American Mathematical Society 318(2): 417–471.
Labastida JMF (1999) Chern-Simons gauge theory: ten years after. In: Falomir H, Gamboa R, and Schaposnik F (eds.) Trends in Theoretical Physics II, ch. 484, 1. New York: AIP (hep-th/9905057).
Labastida JMF and Lozano C (1998) Lectures in topological quantum field theory. In: Falomir H, Gamboa R, and Schaposnik F (eds.) Trends in Theoretical Physics, ch. 419, 54. New York: AIP (hep-th/9709192).
Labastida JMF and Mariño M (2005) Topological Quantum Field Theory and Four Manifolds. Dordrecht: Elsevier; Norwell, MA: Springer.
Mariño M (2005) Chern–Simons theory and topological strings. Reviews of Modern Physics 77: 675–720.
Moore G and Witten E (1998) Integrating over the u-plane in Donaldson theory. Advances in Theoretical and Mathematical Physics 1: 298–387.
Vassiliev VA (1990) Cohomology of knot spaces. In: Theory of Singularities and Its Applications, Advances in Soviet Mathematics, vol. 1, pp. 23–69. American Mathematical Society.
Witten E (1982) Supersymmetry and Morse theory. Journal of Differential Geometry 17: 661–692.
Witten E (1988a) Topological quantum field theory. Communications in Mathematical Physics 117: 353.
Witten E (1988b) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1989) Quantum field theory and the Jones polynomial. Communications in Mathematical Physics 121: 351.
Witten E (1994) Monopoles and four-manifolds. Mathematical Research Letters 1: 769–796.
geometric data. As mentioned above, these models manifold M. The coordinates on are denoted by
possess no local degrees of freedom. One can then ( = 1, 2), while those on the target manifold M
show that the path-integral expression for the are denoted by ui (i = 1, . . . , dim M). The metric and
correlation functions can be localized to a finite- complex structure of are denoted by h and ,
dimensional moduli space of instanton configura- respectively; they obey the relations =
tions which minimize the classical action. and = h . The metric tensor gij and almost-
We will first show how the full quantum action of complex structure Ji j of M obey analogous relations
the theory can be obtained as a BRST quantization of a to the above. In the general model, the target space
classical action with a local gauge symmetry. How- need only be an almost-complex manifold. This
ever, we shall then highlight the fact that the gauge requires the existence of a globally defined tensor
algebra for this topological shift symmetry only closes field Ji j such that Ji j Jj k = i k .
on-shell. In order to proceed with a BRST quantization The action [3] is invariant under the topological
of the model, and obtain the complete quantum shift symmetry
action, one must take recourse to the Batalin–
Vilkovisky quantization scheme. This machinery is ui ¼ i ½7
ideally tailored for such a problem, with the end result
where i is an arbitrary local function of the
that quartic ghost terms are present in the action.
coordinates on the base manifold . Already, at
However, the presence of such terms does not affect
this level, we see the distinction with the standard
the arguments presented above, since the quantum is
sigma model. The presence of this shift symmetry
still obtained as a BRST commutator. Following this,
means that all local degrees of freedom can be
we construct all observables of the theory and
gauged away, leaving only a finite number of global
demonstrate their connection to the de Rham coho-
topological degrees of freedom. It requires some
mology of the target space. The special topological
work to determine the corresponding transformation
properties of the observables are then discussed, and it
for Gi , the key point being the preservation of the
is shown how their computation is localized to the
self-duality constraint. We find
moduli space M of holomorphic maps from to M.
j 1 l k
As a particular example, we show how the computa- Gi ¼ Pi j
þ j D þ 2 Dl J k @ u
tion of a certain class of observables determines the
intersection numbers of the moduli space M. We þ 12 k Dk Ji j Gj ilk k Gl ½8
present a brief discussion of the connection between
topological sigma models with Calabi–Yau target where the covariant derivative is defined by
space M, and the mirror symmetry of M. D i = @ i þ ijk (@ uj )k .
Having determined the classical symmetries of the
model, we can now proceed with the BRST quantized
Construction of the Model form of the quantum action. As a topological field
theory of Witten type, one can show that the quantum
We begin with the following classical action: action can be written as a BRST commutator, that is,
Z pffiffiffi Sq = {Q, V}, where the gauge fermion V is defined by
Sc ¼ d2 h h gij Ki Kj ½3 Z pffiffiffi
i @ ui Bi
V ¼ d2 h C ½9
where 4
Ki ¼ Gi 12 @ ui þ Ji j @ uj ½4 where is an arbitrary gauge-fixing parameter. The
i i
BRST operator Q is nilpotent Q2 = 0, off-shell. It is
The fields G and K both satisfy the self-duality defined by = {Q, }, and takes the form
constraint
ui ¼ Ci
Gi ¼ Pi
þ j G
j
½5 Ci ¼ 0
Ki ¼ Pi
þ j K
j
1
j k k j
Ci ¼ Bi þ Dk Ji Cj C þ ij Ck C
where the self-dual and anti-self-dual projection 2
operators are defined as t ½10
Bi ¼ Ck Cl Riklt þ Rklrs Jri Js t C
4
Pi 1 i i
j ¼ 2 j J j ½6
Dk Ji j Ck Bj
2
The above action describes a theory of maps ui ()
þ Ck Dk Ji s Cl Dl Jt s C t þ i Cj Bk
jk
from a Riemann surface to an almost complex 4
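As a numerical sanity check on the projection operators of [6], the following minimal sketch assumes the reading $P_\pm = \tfrac12(1 \pm \varepsilon\otimes J)$ and takes the simplest case of a two-dimensional worldsheet and a two-dimensional target. The explicit 2×2 matrices chosen for $\varepsilon$ and $J$, the index ordering of the Kronecker product, and the helper names (`matmul`, `kron`, `projectors`) are illustrative assumptions, not taken from the text. Since $\varepsilon^2 = -1$ and $J^2 = -1$, the product $\varepsilon\otimes J$ squares to the identity, which is what makes $P_\pm$ complementary projectors.

```python
# Sketch: verify that P± = (1 ± eps⊗J)/2 are complementary projectors.
# Assumed (illustrative) matrices: both square to minus the identity.
EPS = [[0, 1], [-1, 0]]   # worldsheet complex structure, eps^2 = -1
J = [[0, -1], [1, 0]]     # target almost-complex structure, J^2 = -1


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product a⊗b with rows indexed by (i, k), columns by (j, l)."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def projectors(eps, j):
    """Return (P_plus, P_minus) built from eps⊗j."""
    ej = kron(eps, j)
    n = len(ej)
    p_plus = [[((1 if r == c else 0) + ej[r][c]) / 2 for c in range(n)]
              for r in range(n)]
    p_minus = [[((1 if r == c else 0) - ej[r][c]) / 2 for c in range(n)]
               for r in range(n)]
    return p_plus, p_minus


P_PLUS, P_MINUS = projectors(EPS, J)
# The trace of a projector equals its rank.
RANK_PLUS = sum(P_PLUS[i][i] for i in range(4))
```

Under these assumptions each projector has rank 2, so a self-duality constraint of the form [5] removes half of the four components of a field carrying one worldsheet index and one target index.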
1 Z
k
þ Ci C ðDj Jli ÞðDr Jlk ÞCj Cr ½11 t hOi ¼ t dui dC i dCi etSq fQ; VOg ¼ 0 ½14
16
As a result, one can evaluate the correlation function
It should be stressed that the classical gauge algebra
in the large-t (weak-coupling) limit. In this limit, the
[7] and [8] only closes on-shell. Quantization of the
path integral is dominated by fluctuations around
model is therefore more subtle, and requires use of
the classical minima. For the sigma model under
the Batalin–Vilkovisky formalism. The on-shell
study, the classical action is minimized by the
closure problem automatically results in the pre-
instanton configurations
sence of quartic ghost coupling terms in the action
and consequently cubic terms in the BRST transfor- @ ui þ Ji j @ ui ¼ 0 ½15
mations. Despite this, we have established that the
full quantum action can be written as a BRST Indeed, this localization of the path integral to the
commutator. moduli space of instantons can also be seen by
The form of the action simplifies when the choosing the = 0 gauge in [9]. Integration over the
complex structure of the target manifold is multiplier field then imposes a delta function
covariantly constant, Dk Ji j = 0. In this case, the constraint to the instanton configurations. The key
target manifold M is Kähler and we denote the point in the above derivation is the fact that the
complex coordinates as uI , with their complex quantum action is a BRST commutator, Sq = {Q, V}.
conjugates denoted by uI . The nonzero compo- By a similar argument, one can show that variations
nents of the metric tensor are then gIJ . Similarly, of hOi with respect to the metric and complex
the coordinates of are denoted , with nonzero structure of and M are also zero.
metric components hþ . The nonzero components Our aim now is to construct the Q-cohomology
of the ghost and anti-ghost are then given by classes of operators in the theory. Let us first associate
CI , CI , C an operator O(0) i1
A to each p-form A = Ai1 ip du ^ ^
þI , CI . The action can be written in the ip
form du on the target space M, given by
ð0Þ
Z pffiffiffi OA ¼ Ai1 ip Ci1 Cip ½16
2 1 I J þ
Sq ¼ d h hþ gIJ @þ uI @ uJ þ C þ ðD C Þh gIJ
2 where Ci is the ghost field. Under a BRST
1 I J þ
transformation, we see that
þ C ðDþ C Þh gIJ
2
ð0Þ
fQ; OA g ¼ @i0 Ai1 ip Ci0 Cip
1 þI C
I R CJ CJ
þ hþ C IIJJ ½12 ð0Þ
4 ¼ OdA ½17
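To make the instanton condition [15] concrete, the sketch below takes the simplest Kähler target, the complex plane with $J$ acting as multiplication by $i$, and a flat worldsheet with coordinates $(\sigma^1, \sigma^2)$. With the sign convention $\varepsilon_1{}^2 = 1 = -\varepsilon_2{}^1$ (an assumption; the conventions of the text may differ by a sign, which would exchange holomorphic and antiholomorphic maps), the condition $\partial_\alpha u^i + \varepsilon_\alpha{}^\beta J^i{}_j\,\partial_\beta u^j = 0$ reduces to $\partial_1 u + i\,\partial_2 u = 0$, that is, to the Cauchy–Riemann equations. The function name `instanton_residual` and the two test maps are hypothetical, and derivatives are approximated by central finite differences.

```python
# Sketch: the instanton condition on a flat worldsheet with target C,
# in the assumed convention where it reads d_1 u + i d_2 u = 0.

def instanton_residual(u, s1, s2, h=1e-5):
    """Return |d_1 u + i d_2 u| at the worldsheet point (s1, s2)."""
    z = complex(s1, s2)
    d1 = (u(z + h) - u(z - h)) / (2 * h)            # derivative along sigma^1
    d2 = (u(z + 1j * h) - u(z - 1j * h)) / (2 * h)  # derivative along sigma^2
    return abs(d1 + 1j * d2)


def holomorphic(z):
    """A holomorphic map: satisfies the condition in this convention."""
    return z**2 + 3 * z


def antiholomorphic(z):
    """An antiholomorphic map: violates the condition in this convention."""
    return z.conjugate()
```

In this convention the residual vanishes (up to rounding error) for the holomorphic map and equals 2 for $u = \bar z$, which illustrates how the path integral localizes on holomorphic maps from the worldsheet to the target.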
support on their respective submanifolds. Since each of the operators in the correlation function depends on some fixed point $\sigma_i$, it is meaningful to define the submanifolds $L_i \equiv \{u \in \mathcal{M} \mid u(\sigma_i) \in M_i\} \subset \mathcal{M}$. Now, the correlation function represents a functional integral over the space of maps $\mathrm{Map}(\Sigma, M)$, and we have argued that this integral only receives contributions from the instanton configurations. Since the operators $A_i(u(\sigma_i))$ vanish unless $u \in L_i$ by our choice of the Poincaré duals, we see that the only contribution to the functional integral can be from those maps which lie in the intersection $L_1 \cap \cdots \cap L_s$. By ghost number considerations, this correlation function must vanish unless the codimension of the intersection equals the virtual dimension of $\mathcal{M}$. In the generic case where the virtual dimension is equal to $\dim \mathcal{M}$, this means that the intersection is simply a finite number of points. Intersection numbers $\pm 1$ can then be assigned to each point in the intersection $L_1 \cap \cdots \cap L_s$, by considering the relative orientation of the submanifolds $L_i$ at the intersection points. From the functional integral point of view, the computation reduces to an evaluation of the ratio of the bosonic determinant (integration over $u^i$) to the fermionic determinant (integration over $C^i$ and $\bar{C}^i_\alpha$). In the Kähler case, for example, the intersection number assigned to each point in the intersection is always +1. This is due to the fact that the $C^{I}, \bar{C}_{-}^{\bar{I}}$ determinant is the complex conjugate of the $C^{\bar{I}}, \bar{C}_{+}^{I}$ determinant.

A and B Models and Mirror Symmetry

The topological sigma model for a Kähler target space [12] is also known as the topological A model. In this case, the action can be recovered by twisting the standard N = 2 supersymmetric sigma model. This twisting procedure amounts to a reassignment of the spins of the fields in the theory. However, there is an alternative twisting which can be done, and this leads to another model known as the topological B model. The usefulness of this observation lies in the fact that the topological A model on a Calabi–Yau target space M is related to the topological B model on the mirror of M. This relationship and the computation of correlation functions in the A and B models thus sheds light on the nature of mirror symmetry.

See also: Batalin–Vilkovisky Quantization; BRST Quantization; Functional Integration in Quantum Physics; Graded Poisson Algebras; Mathai–Quillen Formalism; Mirror Symmetry: A Geometric Survey; Several Complex Variables: Compact Manifolds; Singularities of the Ricci Flow; Topological Gravity, Two-Dimensional; Topological Quantum Field Theory: Overview; WDVV Equations and Frobenius Manifolds.

Further Reading

Baulieu L and Singer I (1989) The topological sigma model. Communications in Mathematical Physics 125: 227.
Birmingham D, Blau M, Rakowski M, and Thompson G (1991) Topological field theory. Physics Reports 209: 129.
Birmingham D, Rakowski M, and Thompson G (1989) BRST quantization of topological field theory. Nuclear Physics B 315: 577.
Eguchi T and Yang SK (1990) N = 2 superconformal models as topological field theories. Modern Physics Letters A 5: 1693.
Floer A (1987) Morse theory for fixed points of symplectic diffeomorphisms. Bulletin of the American Mathematical Society 16: 279.
Floer A (1988) An instanton invariant for three manifolds. Communications in Mathematical Physics 118: 215.
Gromov M (1985) Pseudo holomorphic curves in symplectic manifolds. Inventiones Mathematicae 82: 307.
Gromov M (1986) In: Proceedings of the International Congress of Mathematicians, p. 81. Berkeley.
Vafa C (1991) Topological Landau–Ginzburg models. Modern Physics Letters A 6: 337.
Witten E (1988) Topological sigma models. Communications in Mathematical Physics 118: 411.
Witten E (1991) Mirror manifolds and topological field theory. In: Yau ST (ed.) Mirror Symmetry I, p. 121. American Mathematical Society.
Turbulence Theories
R M S Rosa, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Turbulence has initially been defined as an irregular motion in fluids. The cloud formations in the atmosphere and the motion of water in rivers make this point clear. These are but a few readily available examples of a multitude of flows which display turbulent regimes: from the blood that flows in our veins and arteries to the motion of air within our lungs and around us; from the flow of water in creeks to the atmospheric and oceanic currents; from the flows past submarines, ships, automobiles, and aircraft to the combustion processes propelling them; and in the flow of gas, oil, and water, from the prospecting end to the entrails of the cities. The great majority of flows in nature and in engineering applications are somehow turbulent.
Turbulent Regimes
Turbulence is studied from many perspectives. The
subject of ‘‘transition to turbulence’’ attempts to
describe the initial mechanisms responsible for the
generation of turbulence starting from a laminar
motion in particular geometries. This transition can
be followed with respect to position in space (e.g., the flow becomes more complicated as we look further downstream on a flow past an obstacle or

Figure 2 Illustration of a flow past an object, with a laminar boundary layer (light gray), a turbulent boundary layer (medium gray), and a turbulent wake (dark gray).
large heat gradients, occurring in the atmosphere, and large salinity gradients, in the ocean. Geophysical turbulence involves also stratification and the anisotropy generated by Earth’s rotation. Anisotropic turbulence is also crucial in astrophysics and plasma theory. Multiphase and multicomponent turbulence appears in flows with suspended particles or bubbles and in mixtures such as gas, water, and oil. Transonic and supersonic flows are also of great importance and fall into the category of compressible turbulence, much less explored than the incompressible case.

In all those real situations one would like, from the engineering point of view, to compute mean properties of the flow, such as drag and lift for more efficient designs of aircraft, ships, and other vehicles. Knowledge of the drag coefficient is also of fundamental importance in the design of pipes and pumps, from pipelines to artificial human organs. Mean turbulent diffusion coefficients of heat and other passive scalars (quantities advected by the flow without interfering with it, such as chemical products, nutrients, moisture, and pollutants) are also of major importance in industry, ecology, meteorology, and climatology, for instance. And in most of those cases a large amount of research is dedicated to the “control of turbulence,” either to increase mixing or reduce drag, for instance. From a theoretical point of view, one would like to fully understand and characterize the mechanisms involved in turbulent flows, clarifying this fascinating phenomenon. This could also improve practical applications and lead to a better control of turbulence.

The concept of “two-dimensional turbulence” is controversial. A two-dimensional flow may be irregular and display mixing, statistical order, and a wide range of active scales but definitely it does not involve vortex stretching since the velocity field is always perpendicular to the vorticity field. For this reason many researchers discard two-dimensional turbulence altogether. It is also argued that real two-dimensional flows are unstable at complicated regimes and soon develop into a three-dimensional flow. Nevertheless, many believe that two-dimensional turbulence, even lacking vortex stretching, is of fundamental theoretical importance. It may shed some light on the three-dimensional theory and modeling, and it can serve as an approximation to some situations such as the motion of the atmosphere and oceans in the large and meso scales and some magnetohydrodynamic flows. The relative shallowness of the atmosphere and oceans or the imposition of a strong uniform magnetic field may force the flow into two-dimensionality, at least for a certain range of scales.

“Chaos” serves as a paradigm for turbulence, in the sense that it is now accepted that turbulence is a dynamic process in a sensitive deterministic system. But not all chaotic motions in fluids are termed turbulent for they may not display mixing and vortex stretching or involve a wide range of scales. An important such example appears in the dispersive, nonlinear interactions of waves.

The Equations of Motion

It is usually stressed that turbulence is a continuum phenomenon, in the sense that the active scales are much larger than the collision mean free path between molecules. For this reason, turbulence is believed to be fully accounted for by the Navier–Stokes equations.

In the case of incompressible homogeneous flows, the Navier–Stokes equations in the Eulerian form and in vector notation read

$$\frac{\partial \mathbf{u}}{\partial t} - \nu\,\Delta \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla p = \mathbf{f} \qquad [1a]$$

$$\nabla\cdot\mathbf{u} = 0. \qquad [1b]$$

Here, $\mathbf{u} = \mathbf{u}(\mathbf{x}, t) = (u_1, u_2, u_3)$ denotes the velocity vector of an idealized fluid particle located at position $\mathbf{x} = (x_1, x_2, x_3)$, at time t. The mass density in a homogeneous flow is constant, denoted $\rho$. The constant $\nu$ denotes the kinematic viscosity of the fluid, which is the molecular viscosity divided by $\rho$. The variable $p = p(\mathbf{x}, t)$ is the kinematic pressure, and $\mathbf{f} = \mathbf{f}(\mathbf{x}, t) = (f_1, f_2, f_3)$ denotes the mass density of volume forces.

Equation [1a] expresses the conservation of linear momentum. The term $-\nu\,\Delta\mathbf{u}$ accounts for the dissipation of energy due to molecular viscosity, and the nonlinear term $(\mathbf{u}\cdot\nabla)\mathbf{u}$, also called the inertial term, accounts for the redistribution of energy among different structures and scales of motion. Equation [1b] represents the incompressibility condition. In Einstein’s summation convention, these equations can be written as

$$\frac{\partial u_i}{\partial t} - \nu\,\frac{\partial^2 u_i}{\partial x_j^2} + u_j\frac{\partial u_i}{\partial x_j} + \frac{\partial p}{\partial x_i} = f_i, \qquad \frac{\partial u_j}{\partial x_j} = 0$$

The Reynolds Number

The transition to turbulence was carefully studied by Reynolds in the late nineteenth century in a series of experiments in which water at rest in a tank was allowed to flow through a glass pipe. Starting with dimensional analysis, Reynolds argued that a critical
value of a certain nondimensional quantity was likely to exist beyond which a laminar flow gives rise to a “sinuous” motion. This was followed by observations of the flow for tubes with different diameters $L$, different mean velocities $U$ across the tube section, and with the kinematic viscosity $\nu = \mu/\rho$ being altered through changes in temperature. The experiments confirmed the existence of such a critical value for what is now called the Reynolds number:

$$\mathrm{Re} = \frac{LU}{\nu}$$

The dimensional analysis argument can be reproduced in the following form: the physical dimension for the inertial term in [1a] is $U^2/L$, while that for the viscous term is $\nu U/L^2$. The ratio between them is precisely $\mathrm{Re} = LU/\nu$. For small values of Re viscosity dominates and the flow is laminar, whereas for large values of Re the inertial term dominates, and the flow becomes more complicated and eventually turbulent. In applications, different types of Reynolds number can be used depending on the choice of the characteristic velocity and length, but in any case, the larger the Reynolds number, the more complicated the flow.

The Reynolds Equations

Another advance put forward by Reynolds in a subsequent article was to decompose the flow into a mean component and the remaining fluctuations. In terms of the velocity and pressure fields this can be written as

$$u = \bar{u} + u', \qquad p = \bar{p} + p' \qquad [2]$$

with $\bar{u}$ and $\bar{p}$ representing the mean components and $u'$ and $p'$, the fluctuations. By substituting [2] into [1], one finds the Reynolds-averaged Navier–Stokes (RANS) equations for the mean flow:

$$\frac{\partial \bar{u}}{\partial t} - \nu \Delta \bar{u} + (\bar{u} \cdot \nabla)\bar{u} + \nabla \bar{p} = f + \nabla \cdot \tau, \qquad \nabla \cdot \bar{u} = 0$$

It differs from [1] only by the addition of the Reynolds stress tensor:

$$\tau = -\overline{u' \otimes u'} = -\left(\overline{u'_i u'_j}\right)_{i,j=1}^{3}$$

In a laminar flow, the fluctuations are negligible; otherwise, this decomposition shows how they influence the mean flow through these additional turbulent stresses.

The Closure Problem and Turbulence Models

The RANS equations cannot be solved directly for the mean flow since the Reynolds stresses are unknown. Equations for these stress terms can be derived but they involve further unknown moments. This continues with equations for moments of a given order depending on new moments up to a higher order, leading to an infinite system of equations known as the Friedman–Keller system. For practical applications, approximations closing the system at some finite order are needed, in what is called the closure problem. Several ad hoc approximations exist, the most famous being the Boussinesq eddy-viscosity approximation, in which the turbulent fluctuations are regarded as increasing the viscosity of the flow. Prandtl’s mixing-length hypothesis yields a prescription for the computation of this eddy viscosity, and together they form the basis of the algebraic models of turbulence. Other models involve additional equations, such as the $k$-$\varepsilon$ and $k$-$\omega$ models. Most of the practical computations of industrial flows are based on such lower-order models, and a large amount of research is done to determine appropriate values for the various ad hoc parameters which appear in these models and which are highly dependent on the geometry of the flow. This dependency can be explained by the fact that the RANS is supposed to model the mean flow even at the large scales of motion, which are highly affected by the geometry.

Computational fluid dynamics (CFD) is indeed a fundamental tool in turbulence, both for research and engineering applications. From the theoretical side, direct numerical simulations (DNS), which attempt to resolve all the active scales of the flow, reveal some fundamental mechanisms involved in the transition to turbulence and in vortex stretching. As for applications, DNS applies to flows up to low-Reynolds turbulence, with the current computational power not allowing for a full resolution of all the scales involved in high-Reynolds flows. At the current rate of growth of computational power, this is expected to remain so for several decades.

An intermediate CFD method between RANS and DNS is the large-eddy simulation (LES), which attempts to fully resolve the large scales while modeling the turbulent motion at the smaller scales. Several models have been proposed which have their own advantages and limitations as compared to RANS and DNS. It is currently a subject of intense research, particularly for the development of suitable models for the structure functions near the boundary. Theoretical results on fully developed turbulence play a fundamental role in the modeling process.
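As a numerical illustration of the decomposition [2] and of the Reynolds stress tensor, the following minimal Python sketch builds a synthetic ensemble of velocity samples at a single point, splits it into mean and fluctuations, and assembles the matrix of averaged products of fluctuations. The ensemble is artificial (Gaussian noise around a prescribed mean, not measured data), the function names `reynolds_decompose` and `reynolds_stress` are hypothetical, and unit density is assumed.

```python
import random


def reynolds_decompose(samples):
    """Split an ensemble of 3-vectors into its ensemble mean and the fluctuations."""
    n = len(samples)
    mean = [sum(u[i] for u in samples) / n for i in range(3)]
    fluct = [[u[i] - mean[i] for i in range(3)] for u in samples]
    return mean, fluct


def reynolds_stress(fluct):
    """Return tau_ij = -<u'_i u'_j>, averaging over the ensemble of fluctuations."""
    n = len(fluct)
    return [[-sum(f[i] * f[j] for f in fluct) / n for j in range(3)]
            for i in range(3)]


# Synthetic ensemble (an assumption for illustration): mean flow (1, 0, 0) plus noise.
rng = random.Random(0)
samples = [[1.0 + rng.gauss(0, 0.2), rng.gauss(0, 0.1), rng.gauss(0, 0.1)]
           for _ in range(2000)]

mean, fluct = reynolds_decompose(samples)
tau = reynolds_stress(fluct)
print("mean velocity:", mean)
print("Reynolds stress:", tau)
```

By construction the fluctuations average to zero, and the resulting matrix is symmetric with nonpositive diagonal entries; in a laminar flow, where the fluctuations are negligible, it would be close to zero.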
LESs are a promising tool and they have been successfully applied to a number of situations. The choice of the best method for a given application, however, depends very much on the Reynolds number of the flow and the prior knowledge of similar situations for adjusting the parameters.

Elements of the Statistical Theory

Several types of averages can be used. The ensemble average is taken with respect to a number of experiments at nearly identical conditions. Despite the irregular motion of, say, the velocity vector $u^{(n)}(x,t)$ of each experiment $n = 1, \ldots, N$, the average value

$$\bar{u}(x,t) = \frac{1}{N} \sum_{n=1}^{N} u^{(n)}(x,t)$$

is expected to behave in a more regular way. This type of averaging is usually denoted with the symbol $\langle \cdot \rangle$. This notion can be cast into the context of a probability space $(M, \Sigma, P)$, where $M$ is a set, $\Sigma$ is a $\sigma$-algebra of subsets of $M$, and $P$ is a probability measure on $\Sigma$. The velocity field is a random variable in the sense that it is a density function $\omega \mapsto u(x,t,\omega)$ from $M$ into the space of time-

Based on this assumption, the averages may in practice be calculated as time averages over a sufficiently large period $T$. There is a related argument for substituting space averages by time averages and based on the mechanics of turbulence which is called the “Taylor hypothesis.”

Another fundamental concept in the statistical theory is that of homogeneity, which is the spatial analog of the statistical equilibrium in time. In homogeneous turbulence, the statistical quantities of a flow are independent of translations in space, that is,

$$\langle \varphi(u(\cdot + \ell, \cdot)) \rangle = \langle \varphi(u(\cdot, \cdot)) \rangle$$

for all $\ell \in \mathbb{R}^3$. The concept of isotropic turbulence assumes further independence with respect to rotations and reflections in the frame of reference, that is,

$$\langle \varphi(Q^t u(Q\,\cdot, \cdot)) \rangle = \langle \varphi(u(\cdot, \cdot)) \rangle$$

for all orthogonal transformations $Q$ in $\mathbb{R}^3$, with adjoint $Q^t$.

Under the homogeneity assumption, mean quantities can be defined independently of position in space, such as the mean kinetic energy per unit mass
The theory assumes a wide separation between the energy-containing scales, of order say $\ell_0$, and the energy-dissipative scales, of order $\ell_\varepsilon$, so that the cascade process occurs within a wide range of scales $\ell$ such that $\ell_0 \gg \ell \gg \ell_\varepsilon$. In this range, termed the inertial range, the viscous effects are still negligible and the statistical regime should depend only on $\varepsilon$. Then, the Kolmogorov “two-thirds law” asserts that within the inertial range the second-order correlations must be proportional to $(\varepsilon\ell)^{2/3}$, that is,

$$S_2(\ell) = C_K (\varepsilon\ell)^{2/3}$$

for some constant $C_K$ known as the Kolmogorov constant in physical space (there is a related constant in spectral space). The argument extends to higher-order structure functions, yielding

$$S_p(\ell) = C_p (\varepsilon\ell)^{p/3}$$

Kolmogorov’s derivation of these results was not by dimensional analysis; it was in fact a more convincing self-similarity argument based on the universality assumed for the equilibrium range. A different argument without resorting to universality assumptions, however, was applied to the third-order structure function, yielding the more precise “four-fifths law”:

$$S_3(\ell) = -\frac{4}{5}\,\varepsilon\ell$$

The “Kolmogorov five-thirds law” concerns the energy spectrum $S(\kappa)$ and is the spectral version of the two-thirds law, given by Obukhoff:

$$S(\kappa) = C'_K\, \varepsilon^{2/3} \kappa^{-5/3}$$

The constant $C'_K$ is the Kolmogorov constant in spectral space. The spectral version of the dissipation length is the Kolmogorov wave number $\kappa_\varepsilon = (\varepsilon/\nu^3)^{1/4}$.

A typical distribution of energy in a turbulent flow is depicted in Figure 4. The energy is concentrated on the large scales, while the dissipation is concentrated near the Kolmogorov scale $\ell_\varepsilon$. The four-fifths law becomes visible as a straight line in the logarithmic scale.

A more precise mechanism for the energy cascade assumes that in the inertial range, eddies with length scale $\ell$ transfer kinetic energy to smaller eddies during their characteristic timescale, also known as circulation time. If $u_\ell$ is their characteristic velocity, then $\tau_\ell = \ell/u_\ell$ is their circulation time, so that the kinetic energy transferred from these eddies during this time is

$$\varepsilon_\ell \sim \frac{u_\ell^2}{\tau_\ell} = \frac{u_\ell^3}{\ell}$$

In statistical equilibrium, the energy lost to the smaller scales equals the energy gained from the larger scales, and that should also equal the total kinetic energy dissipated by viscous effects. Hence, $\varepsilon_\ell \sim \varepsilon$, and we find

$$\varepsilon \sim \frac{u_\ell^3}{\ell}$$

It also follows that $\tau_\ell = \ell/u_\ell = \ell(\varepsilon\ell)^{-1/3} = \varepsilon^{-1/3}\ell^{2/3}$, so that the circulation time decreases with the length scale and becomes of the order of the viscous dissipation time $(\nu/\varepsilon)^{1/2}$ precisely when $\ell \sim \ell_\varepsilon$.

A similar relation between $\varepsilon$ and the large scales can also be obtained with heuristic arguments: let $e$ be the mean kinetic energy and $\ell_0$, a characteristic length for the large scales. Then $u_0$ given by $e = u_0^2/2$ is a characteristic velocity for the large scales, and $\tau_0 = \ell_0/u_0$ is the large-scale circulation time. In statistical equilibrium, the rate of kinetic energy dissipated per unit time and unit mass is expected to be of the order of $e/\tau_0$, hence

$$\varepsilon \sim \frac{u_0^3}{\ell_0}$$

which is called the “energy dissipation law.”
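The statement that the circulation time reaches the viscous dissipation time at the Kolmogorov scale can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the numerical values of $\nu$ and $\varepsilon$ are arbitrary, and the dissipation length is assumed to be the inverse of the Kolmogorov wave number, $\ell_\varepsilon = 1/\kappa_\varepsilon = (\nu^3/\varepsilon)^{1/4}$.

```python
def kolmogorov_scales(nu, eps):
    """Dissipation length, time and wave number from viscosity nu and dissipation rate eps."""
    kappa_eps = (eps / nu**3) ** 0.25   # Kolmogorov wave number, as given in the text
    ell_eps = 1.0 / kappa_eps           # assumed here: dissipation length = 1 / kappa_eps
    tau_eps = (nu / eps) ** 0.5         # viscous dissipation time
    return ell_eps, tau_eps, kappa_eps


def circulation_time(ell, eps):
    """Inertial-range circulation time tau_ell = eps**(-1/3) * ell**(2/3)."""
    return eps ** (-1.0 / 3.0) * ell ** (2.0 / 3.0)


# Illustrative values only (viscosity roughly that of water, arbitrary dissipation rate).
nu = 1.0e-6    # kinematic viscosity, m^2/s
eps = 1.0e-2   # energy dissipation rate per unit mass, m^2/s^3

ell_eps, tau_eps, kappa_eps = kolmogorov_scales(nu, eps)
for ell in (1.0e-1, 1.0e-2, 1.0e-3, ell_eps):
    print(f"ell = {ell:.3e} m -> circulation time = {circulation_time(ell, eps):.3e} s")
print(f"viscous dissipation time = {tau_eps:.3e} s")
```

The circulation time shrinks as the eddy size decreases and coincides with the dissipation time at $\ell = \ell_\varepsilon$, which is where the inertial-range picture stops applying.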
[Figure 4: plot with axes $\kappa$ and $\ln \kappa$; $\kappa_0$ and the inertial range are marked.]

Figure 5 A schematic representation of a flow structure displaying a range of active scales and a three-dimensional grid with linear dimension $\ell_0$ and mesh length $\ell_\varepsilon$, sufficient to represent all the active scales in a turbulent flow. The number of degrees of freedom is the number of blocks: $(\ell_0/\ell_\varepsilon)^3$.

Figure 6 A portion of rotating fluid gets stretched and thinned as the flow speeds up, generating one of many coherent structures of high vorticity and low dissipation.
From the energy dissipation law, several relations between characteristic quantities of turbulent flows can be obtained, such as $\ell_0/\ell_\varepsilon \sim \mathrm{Re}^{3/4}$, for $\mathrm{Re} = \ell_0 u_0/\nu$. Now, assuming the active scales in a turbulent flow exist down to the Kolmogorov scale $\ell_\varepsilon$, one needs a three-dimensional grid with mesh spacing $\ell_\varepsilon$ to resolve all the scales, which means that the number $N$ of degrees of freedom of the system is of the order of $N \sim (\ell_0/\ell_\varepsilon)^3$ (see Figure 5). This number can be estimated in terms of the Reynolds number by $N \sim \mathrm{Re}^{9/4}$. This relation is important in predicting the computational power needed to simulate all the active scales in turbulent flows.

Several such universal laws can be deduced and extended to other situations such as turbulent boundary layers, with the famous logarithmic law of the wall. They play a fundamental role in turbulence modeling and closure, for the calculation of the mean flow and other quantities.

$S_p(\ell) \propto \ell^{\zeta(p)}$, $\zeta(p) < p/3$, for high-order ($p > 3$) structure functions. The issues of intermittency and coherent structures and whether and how they could affect the deductions of the universality theory such as the power laws for the structure functions are far from settled and are currently one of the major and most fascinating issues being addressed in turbulence theory. Several phenomenological theories attempt to adjust the universality theory to the existence of such coherent structures. Multifractal models, for instance, suppose that the eddies generated in the cascade process do not fill up the space and form multifractal structures. Field-theoretic renormalization group develops techniques based on quantum field renormalization theory. Intermediate asymptotics also exploits self-similar analysis and renormalization theory but with a somewhat different flavor. Detailed mathematical analysis of the vorticity equations is also playing a major role in the understanding of the dynamics of the vorticity field.
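The estimates $\ell_0/\ell_\varepsilon \sim \mathrm{Re}^{3/4}$ and $N \sim \mathrm{Re}^{9/4}$ quoted earlier follow in two lines from the energy dissipation law $\varepsilon \sim u_0^3/\ell_0$. The worked step below is a sketch that assumes the dissipation length is the inverse of the Kolmogorov wave number, $\ell_\varepsilon = \kappa_\varepsilon^{-1} = (\nu^3/\varepsilon)^{1/4}$; that identification is made here for illustration.

```latex
\begin{align*}
\frac{\ell_0}{\ell_\varepsilon}
  &= \ell_0 \left(\frac{\varepsilon}{\nu^3}\right)^{1/4}
   \sim \ell_0 \left(\frac{u_0^3}{\ell_0\,\nu^3}\right)^{1/4}
   = \left(\frac{\ell_0 u_0}{\nu}\right)^{3/4}
   = \mathrm{Re}^{3/4}, \\
N &\sim \left(\frac{\ell_0}{\ell_\varepsilon}\right)^{3}
   \sim \mathrm{Re}^{9/4}.
\end{align*}
```

For instance, $\mathrm{Re} = 10^4$ gives $N$ of order $10^9$, while $\mathrm{Re} = 10^6$ gives $N$ of order $10^{13.5}$, which indicates how quickly the cost of resolving all active scales grows with the Reynolds number.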
times) exists but which may not be unique, and it has been proved that unique solutions exist which may not be global (i.e., they are guaranteed to exist as unique solutions only for a finite time).

The difficulty here is the possible existence of singularities in the vorticity field (vorticity becoming infinite at some points in space and time). Depending on how large the singularity set is, uniqueness may fail in strictly mathematical terms. The existence of singularities may not be a purely mathematical curiosity; it may in fact be related with the intermittency phenomenon. Rigorous studies of the vorticity equation may continue to reveal more fundamental aspects on vortex dynamics and coherent structures.

The statistical theory has also been put into a firm foundation with the notion of statistical solution of the Navier–Stokes equations. It addresses the existence and regularity of the probability distribution assumed for turbulent flows and of the fundamental elements of the statistical theory such as correlation functions and spectra. Based on that, a number of relations between physical quantities of turbulent flows may be derived in a mathematically sound and definitive way. This does not replace other theories; it is mostly a mathematical framework upon which other techniques can be applied to yield rigorous results.

Despite the difficulties in the mathematical theory of the NSE some successes have been collected such as estimates for the number of degrees of freedom in terms of fractal dimensions of suitable sets associated with the solutions of the Navier–Stokes equations, and partial estimates of a number of relations derived in the statistical theory of fully developed turbulence.

See also: Bifurcations in Fluid Dynamics; Geophysical Dynamics; Incompressible Euler Equations: Mathematical Theory; Intermittency in Turbulence; Inviscid Flows; Lagrangian Dispersion (Passive Scalar); Stochastic Hydrodynamics; Variational Methods in Turbulence; Viscous Incompressible Fluids: Mathematical Theory; Vortex Dynamics; Wavelets: Application to Turbulence.

Further Reading

Adzhemyan LT, Antonov NV, and Vasiliev AN (1999) The Field Theoretic Renormalization Group in Fully Developed Turbulence. Amsterdam: Gordon and Breach.
Barenblatt GI and Chorin AJ (1998) New perspectives in turbulence: scaling laws, asymptotics, and intermittency. SIAM Review 40(2): 265–291.
Batchelor GK (1953) The Theory of Homogeneous Turbulence. Cambridge Monographs on Mechanics and Applied Mathematics. New York: Cambridge University Press.
Chorin AJ (1994) Vorticity and Turbulence. Applied Mathematical Sciences, vol. 103. New York: Springer.
Constantin P (1994) Geometric statistics in turbulence. SIAM Review 36: 73–98.
Foias C, Manley OP, Rosa R, and Temam R (2001) Navier–Stokes Equations and Turbulence. Encyclopedia of Mathematics and its Applications, vol. 83. Cambridge: Cambridge University Press.
Friedlander S and Topper L (1961) Turbulence. Classic Papers on Statistical Theory. New York: Interscience Publisher.
Frisch U (1995) Turbulence. The Legacy of A. N. Kolmogorov. Cambridge: Cambridge University Press.
Hinze JO (1975) Turbulence. McGraw-Hill Series in Mechanical Engineering. New York: McGraw-Hill.
Holmes P, Lumley JL, and Berkooz G (1996) Turbulence, Coherent Structures, Dynamical Systems, and Symmetry. Cambridge: Cambridge University Press.
Lesieur M (1997) Turbulence in Fluids. Fluid Mechanics and its Applications, 3rd edn., vol. 40. Dordrecht: Kluwer Academic.
Monin AS and Yaglom AM (1975) Statistical Fluid Mechanics: Mechanics of Turbulence 2. Cambridge: MIT Press.
Sagaut P (2001) Large Eddy Simulation for Incompressible Flows. Berlin: Springer.
Schlichting H and Gersten K (2000) Boundary Layer Theory, 8th edn. Berlin: Springer.
Tennekes H and Lumley JL (1972) A First Course in Turbulence. Cambridge, MA: MIT Press.
Vishik MI and Fursikov AV (1988) Mathematical Problems of Statistical Hydrodynamics. Dordrecht: Kluwer.
Wilcox DC (2000) Turbulence Modeling for CFD, 2nd edn. Anaheim, CA: DCW Industries Inc.
Twistor Theory

A basic motivation of twistor theory is to bring out the complex (holomorphic) geometry that underlies real spacetime. In general relativity, a spacetime is a 4-manifold with metric $g$ of signature (1, 3), and when it is flat, that is, $g = dt^2 - dx^2 - dy^2 - dz^2$, where $(t, x, y, z)$ are coordinates on $\mathbb{R}^4$, it is called Minkowski space. The first appearance of a complex structure arises from the fact that, at a given event, the celestial sphere of light rays (directions of zero length with respect to $g$) naturally has the structure of the Riemann sphere, $\mathbb{CP}^1$, in such a way that Lorentz transformations (linear transformations of the tangent space preserving the metric) act on this sphere by Möbius transformations. These form the maximal group of complex analytic transformations of $\mathbb{CP}^1$.

Twistor space extends this idea to the whole of Minkowski space. Denoted $\mathbb{PT}$, the twistor space for Minkowski space is complex projective 3-space, $\mathbb{CP}^3$, the space of one-dimensional subspaces of $\mathbb{C}^4$; it is a three-dimensional complex manifold obtained by adding a “plane at infinity” to $\mathbb{C}^3$. Explicitly, we can introduce homogeneous coordinates $Z^\alpha \in \mathbb{C}^4 - \{0\}$ with $\alpha = 0, 1, 2, 3$ but where $Z^\alpha \sim \lambda Z^\alpha$ for $\lambda \in \mathbb{C} - \{0\}$. Affine coordinates on a $\mathbb{C}^3$ chart $Z^3 \neq 0$ can be obtained by setting $(z_1, z_2, \lambda) = (Z^0/Z^3, Z^1/Z^3, Z^2/Z^3)$. Physically, points of twistor space correspond to spinning massless particles in Minkowski space. Mathematically, the correspondence can be understood as the Klein correspondence.

The Klein Correspondence

The correspondence between $\mathbb{PT}$ and Minkowski space can be extended first to complexified Minkowski space so that the coordinates are allowed to take on values in $\mathbb{C}$, and then to its conformal compactification by including the “light cone at infinity.” It then coincides with the classical complex Klein correspondence. The Klein correspondence is the one-to-one correspondence between lines in $\mathbb{CP}^3$ and points of a four complex-dimensional quadric, $\mathbb{CM}$, in $\mathbb{CP}^5$. The 4-quadric $\mathbb{CM}$ can be understood as conformally compactified complexified Minkowski space. Introducing affine coordinates $(z_1, z_2, \lambda)$ on $\mathbb{PT}$ and $(t, x, y, z)$ on $\mathbb{CM}$, we find that a point $(t, x, y, z)$ in $\mathbb{CM}$ corresponds to a line in $\mathbb{PT}$ according to

$$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} t - z & x + iy \\ x - iy & t + z \end{pmatrix} \begin{pmatrix} 1 \\ \lambda \end{pmatrix}$$

Alternatively, fixing $(\lambda, z_1, z_2)$ in these equations gives a 2-plane in complex Minkowski space corresponding to all the lines in $\mathbb{PT}$ through $(\lambda, z_1, z_2)$. Such 2-planes are called “$\alpha$-planes.” They are totally null (i.e., the tangent vectors not only have zero length but are also mutually orthogonal) and also self-dual (under the differential geometer’s notion of Hodge duality).

This complex correspondence can also be restricted to give correspondences for $\mathbb{R}^4$ with metrics of positive-definite, Euclidean, signature or ultrahyperbolic, (2, 2), signature. A particular simplification in Euclidean signature is that the complex $\alpha$-planes intersect the real slice in a point. The conformal compactification of Euclidean $\mathbb{R}^4$ is the 4-sphere $S^4$ given by adding a single point at infinity, and so we have a projection $p : \mathbb{PT} \to S^4$ whose fibers are holomorphically embedded $\mathbb{CP}^1$s. These fibers can be characterized as the lines in $\mathbb{PT}$ that are invariant under a quaternionic complex conjugation which is an antiholomorphic map $\hat{\ } : \mathbb{PT} \to \mathbb{PT}$ with no fixed points. (Here quaternionic means that on the nonprojective twistor space, $\mathbb{T} = \mathbb{C}^4$, the conjugation has the property $\hat{\hat{Z}} = -Z$ so that it defines a second complex structure anticommuting with the standard one; this is sufficient to express $\mathbb{T} = Q^2$, where $Q$ denotes the quaternions. The complex structures $i$, $j$, and $k$ of the quaternions are given by identifying $i$ with $\sqrt{-1}$ on $\mathbb{C}^4$ and $j$ with $\hat{\ }$ and $k = ij$.)

The Penrose Transform

A basic task of twistor theory is to transform solutions to the field equations of mathematical physics into objects on twistor space. This works well for linear massless fields such as the Weyl neutrino equation, Maxwell’s equations for electromagnetism and linearized gravity. In its general form, this transform has become known as the Penrose transform. Such fields correspond to freely prescribable holomorphic functions $f(\lambda, z_1, z_2)$ (or, more precisely, analytic cohomology classes) on regions of twistor space. The field can be obtained from this function by means of a contour integral. The simplest of these integral formulas is

$$\phi(x^a) = \oint f\bigl(\lambda,\; t - z + \lambda(x + iy),\; x - iy + \lambda(t + z)\bigr)\, d\lambda$$

and differentiation under the integral sign leads easily to the fact that $\phi$ satisfies the wave equation

$$\frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} - \frac{\partial^2 \phi}{\partial y^2} - \frac{\partial^2 \phi}{\partial z^2} = 0$$

This formula was originally discovered by Bateman. Note that $f$ must have singularities on twistor space to yield a nontrivial $\phi$ and even then, there are many choices of $f$ that yield zero. For a solution defined
Twistor Theory: Some Applications 305
over a region $U$ in spacetime, the function $f$ is correctly understood as a representative of a Čech cohomology class defined on the region $U'$ in twistor space swept out by the lines corresponding to points of $U$. Furthermore, the function $f$ should be taken globally to be a function of homogeneity $-2$, $f(\lambda Z^\alpha) = \lambda^{-2} f(Z^\alpha)$. This formula has generalizations to massless fields of all helicities in which a field of helicity $s$ corresponds to a function (Čech cocycle) of homogeneity degree $2s - 2$.

The Penrose transform has found important applications in representation theory and integral geometry. For a review, the reader is referred to Baston and Eastwood (1989), the relevant survey articles in Bailey and Baston (1990), or Mason and Hughston (1990, chapter 1).

Twistor Theory and Nonlinear Equations

The Penrose transform for the Maxwell equations and linearized gravity turns out to be linearizations of correspondences for the nonlinear analogs of these equations: the Einstein vacuum equations and the Yang–Mills equations. However, the constructions only work when these fields are anti-self-dual. This is the condition that the curvature 2-forms satisfy $F^* = iF$, where $*$ denotes the Hodge dual (which, up to certain factors of $i$, has the effect of interchanging electric and magnetic fields); it is a nonlinear generalization of the right-handed circular polarization condition. Explicitly, in terms of spacetime indices $a, b, \ldots = 0, 1, 2, 3$, $F^*_{ab} = (1/2)\varepsilon_{abcd}F^{cd}$, where $\varepsilon_{0123} = 1$ and $\varepsilon_{abcd} = \varepsilon_{[abcd]}$. In Minkowski signature, the $i$ factor in the anti-self-duality condition implies that real fields cannot be anti-self-dual. Thus, these extensions are not sufficient to fulfill the ambitions of twistor theory to incorporate real classical nonlinear physics in Minkowski space. However, the factor of $i$ is not present in Euclidean and ultrahyperbolic signature, so the anti-self-duality condition is consistent with real fields in these signatures and this is where the main applications of these constructions have been.

The Nonlinear Graviton Construction and Its Generalizations

The first nonlinear twistor construction was due to Penrose (1976), and was inspired by Newman’s (1976) construction of “heavens” from the infinities of asymptotically flat spacetimes in general relativity.

The nonlinear graviton construction proceeds from the definition of twistors in flat spacetime as $\alpha$-planes in complexified Minkowski space. It is natural to ask which complexified metrics admit a full family of $\alpha$-surfaces, that is, 2-surfaces that are totally null and self-dual. The answer is that a full family of $\alpha$-surfaces exists iff the conformally invariant part of the curvature tensor, the Weyl tensor, is anti-self-dual. If this is the case, twistor space can be defined to be the (necessarily three-dimensional) space of such $\alpha$-surfaces.

The remarkable fact is that the twistor space, together with its complex structure, is sufficient to determine the original spacetime. Twistor space is again a three-dimensional complex manifold, and contains holomorphically embedded rational curves, $\mathbb{CP}^1$s, at least one for each point of the spacetime. However, holomorphic rigidity implies that the family of rational curves is precisely four-dimensional over the complex numbers. Furthermore, incidence of a pair of curves can be taken to imply that the corresponding points in spacetime lie on a null geodesic and this yields a conformal structure on spacetime. Further structures on twistor space can be imposed to give the complex spacetime a metric that is vacuum, perhaps with a cosmological constant. The correspondence is stable under small deformations and so the data defining the twistor space is effectively freely prescribable, see Penrose (1976).

In Euclidean signature, again the complex $\alpha$-planes intersect the real spacetime in a point, so the twistor space again fibers over spacetime. The twistor fibration can be constructed as the projectivized bundle of self-dual spinors or more commonly as the unit sphere bundle in the space of self-dual 2-forms (Atiyah et al. 1978). In the latter formulation, the complex structure on the twistor space arises from the direct sum of the naturally defined complex structures on the horizontal and vertical tangent spaces to the bundle; that on the vertical subspace is the standard one on the sphere, and that on the horizontal subspace is a multiple of the self-dual 2-form at the given point of the fiber.

There are now large families of extensions, generalizations, and reductions of this construction. They are all based on the idea of realizing a space with a given complexified geometric structure as the parameter space of a family of holomorphically embedded submanifolds inside a twistor space. In general, the most useful of these constructions are those in which the “spacetime” is obtained as the space of rational curves in a twistor space. This is because the equations that are solved on the corresponding spacetime can be thought of as a completely integrable system in which the integrability condition for the generalized $\alpha$-surfaces is interpreted as the consistency condition of a Lax
pair or more general linear system. For a more detailed discussion from this point of view, see Mason and Woodhouse (1996, chapter 13).

The Anti-Self-Dual Yang–Mills Equation and Its Twistor Correspondence

The anti-self-dual Yang–Mills equations extend Maxwell’s equations for electromagnetism in the right-circularly polarized case. They are a family of equations that depend on a choice of Lie group $G$, usually taken to be a group of complex matrices; Maxwell’s equations arise when $G = U(1)$.

Introduce coordinates $x^a$, $a = 0, 1, 2, 3$, on $\mathbb{R}^4$ with metric $ds^2 = dx^0\,dx^3 - dx^1\,dx^2$ (this is a metric of ultrahyperbolic signature; Euclidean signature can be obtained by choosing the coordinates to be complex, but with $(x^3, -x^2)$ the complex conjugates of $(x^0, x^1)$). The dependent variables are the components $A_a$ of a connection $D_a = \partial_a - A_a$, where $\partial_a = \partial/\partial x^a$ and $A_a = A_a(x^b) \in \mathrm{Lie}\,G$, the Lie algebra of $G$. This connection defines a method of differentiating vector-valued functions $s$ in some representation of $G$. The freedom in changing bases for the vector bundle induces the gauge transformations $A_a \to g^{-1}A_a g - g^{-1}\partial_a g$, $g(x) \in G$, on $A_a$; two connections that are related by a gauge transformation are deemed to be the same. The anti-self-dual Yang–Mills equations are the condition

$$[D_0, D_2] = [D_1, D_3] = [D_0, D_3] + [D_1, D_2] = 0$$

They are the compatibility conditions

$$[D_0 + \lambda D_1,\; D_2 + \lambda D_3] = 0$$

for the linear system of equations

$$(D_0 + \lambda D_1)s = (D_2 + \lambda D_3)s = 0 \qquad [1]$$

where $\lambda \in \mathbb{C}$ and $s$ is an $n$-component column vector. These latter equations form a “Lax pair” for the system.

The Ward (1977) construction provides a one-to-one correspondence between gauge equivalence classes of solutions of the anti-self-dual Yang–Mills equations and holomorphic vector bundles on regions in twistor space. The key point here is that eqn [1] defines parallel propagation along $\alpha$-planes. To each point $Z$ in twistor space, we can associate the vector space $E_Z$ of solutions to eqn [1] along the corresponding $\alpha$-plane. These vector spaces vary holomorphically with $Z$ and that is what one means by a holomorphic vector bundle $E \to \mathbb{PT}$. The remarkable fact is that the anti-self-dual Yang–Mills field can be reconstructed up to gauge from $E$, and, in effect, for local analytic solutions, $E$ can be represented by freely prescribable “patching” data consisting of local holomorphic matrix-valued functions on twistor space. To construct the solution on spacetime, one must first find a Birkhoff factorization of the patching data on each Riemann sphere in twistor space corresponding to points of the appropriate region in spacetime. On each Riemann sphere, the Birkhoff factorization starts with the given patching function with values in $GL(n, \mathbb{C})$ on the real axis in the complex plane, and expresses it as a product of functions with values in $GL(n, \mathbb{C})$, one of which extends over the upper-half plane, and the other over the lower-half complex plane. The anti-self-dual connection can be obtained by differentiating the resulting matrices. See Penrose (1984, 1986), Ward and Wells (1990), or Mason and Woodhouse (1996) for a full discussion, and Atiyah (1979) for the formulation appropriate to Euclidean signature.

Completely Integrable Systems

In effect, the twistor constructions amount to providing a geometric general local solution to the anti-self-duality equations; the twistor data is, for a local solution, freely prescribable. In this sense, they demonstrate complete integrability of the anti-self-duality equations. The reconstruction of a solution on spacetime from twistor data is not a quadrature: it involves, in the anti-self-dual Yang–Mills case, a Birkhoff factorization (also sometimes referred to as the solution to a Riemann–Hilbert problem), and in the case of the anti-self-dual Einstein equations, the construction of a family of rational curves inside a complex manifold. Nevertheless, such constructions are a familiar part of the apparatus of the theory of integrable systems.

In Ward (1985), this connection with integrable systems was developed further, and the anti-self-dual Yang–Mills equations were shown to yield many important integrable systems under symmetry reduction. Ward’s list has been extended and now includes many of the most famous examples of integrable systems such as the Painlevé equations, the Korteweg–de Vries (KdV) equation, the nonlinear Schrödinger equation, the $n$-wave equations, and so on; see Mason and Woodhouse (1996) for a review. There are some notable omissions from the list such as the Kadomtsev–Petviashvili (KP) and Davey–Stewartson equations (at least if one restricts oneself to finite-dimensional gauge groups; reductions using infinite-dimensional gauge groups have been obtained).

The list of integrable systems obtainable by symmetry reduction nevertheless remains impressive and provides a route to the classification of at least those integrable systems that can be obtained in this
way. Such systems can be classified by the choice of with the Euclidean signature versions of the original
ingredients required in the symmetry reduction: the Ward construction for anti-self-dual Yang–Mills
gauge group, the group of spacetime symmetries to fields and Penrose’s nonlinear graviton construction
be reduced by, the choice of Euclidean or ultra- for Ricci-flat anti-self-dual metrics but, as we will
hyperbolic signature, and the choice of certain discuss, these constructions have a number of
constants of integration that arise in the reduction. extensions and generalizations.
Another implication is that if an integrable system The first dramatic application of these construc-
can be obtained from one of the self-duality tions was the ADHM construction of Yang–Mills
equations by symmetry reduction, then it inherits a instantons. These areR absolute minima of the Yang–
reduced twistor correspondence because the twistor Mills action, S[A] = tr(F ^ F ) on the 4-sphere, S4 ,
correspondences share the symmetry groups of the with its round metric. A simple argument shows that
spacetime field equations. These twistor correspon- the action is bounded below by the second Chern
dences can be seen to underlie much of the theory of class of the bundle and that this bound is achieved
these equations; for example, Backlund transforma- only for anti-self-dual fields. Thus, the problem was
tions of solutions correspond to elementary alge- to characterize all the anti-self-dual Yang–Mills
braic operations on the twistor data, similarly the fields on S4 . In this Euclidean context, twistor
Kac–Moody Lie algebras of hidden symmetries act space, CP3 , fibers over S4 and the corresponding
locally on the twistor data by matrix multiplication Ward vector bundle is a bundle over all of CP3 . It
of the appropriate loop algebras. Similarly, the turns out that all such bundles satisfying a certain
inverse-scattering transform for the KdV and non- stability condition had been constructed reasonably
linear Schrodinger equations can be seen to arise as explicitly by algebraic geometers. Since the stability
particular presentations of the twistor construction. condition was implied by the context, this could be
By and large, although twistor methods have turned into an algebraic construction of the general
yielded new insight into the geometry and structure instanton explicit enough to give some insight into
of systems in dimensions 1 and 2, they have not both the local and global structure of the solution
necessarily superseded pre-existing techniques for space. See Atiyah (1979) for a review.
constructing solutions and analyzing the solution Hitchin used the Euclidean version of the non-
space. The systems for which twistor methods have linear graviton to develop the theory of gravitational
been particularly effective for constructing solutions instantons that are asymptotically locally Euclidean
and characterizing their properties are in 2 + 1 or (i.e., asymptotically $\mathbb{R}^4/\Gamma$, where $\Gamma$ is a finite
higher dimension. Key examples here are of course subgroup of the rotation group). These were finally
the anti-self-dual Yang–Mills and Einstein equations constructed by Kronheimer who again used twistor
themselves, and their single translation reductions. theory to identify the appropriate parameter space,
In the anti-self-dual Yang–Mills case, these reduc- see his article in Mason et al. (2001) and Dancer’s
tions lead either to Ward’s or Manakov and review of hyper-Kähler manifolds in LeBrun and
Zakharov’s chiral model in Lorentzian signature, Wang (1999).
2 + 1, or the Bogomolny equations for monopoles, Even in four dimensions, there are a number of
the reduction from Euclidean signature. In both variants of the nonlinear graviton construction. The
cases, the twistor construction has played a major basic twistor correspondence produces a twistor
role in constructing and studying the solitonic space that is a complex 3-manifold PT for
solutions. 4-manifolds with conformal structures whose Weyl
See Ward and Wells (1990), Mason and Wood- tensor is anti-self-dual. There are four natural
house (1996), Ward’s article in Huggett et al. (1998) specializations that have attracted study: (1) the
and the first few chapters of Mason et al. (1995), Ricci-flat case, (2) the Einstein case (with nonzero
and Mason et al. (2001) for more examples of cosmological constant), (3) the scalar-flat Kähler
aspects of the theory of integrable systems arising case, and (4) the hypercomplex case.
from twistor correspondences. The twistor space in the Ricci-flat case admits the
additional structure of a fibration over $\mathbb{CP}^1$ together
with a holomorphic Poisson structure on the fibers
Applications to Geometry
with values in the pullback of the 1-forms on $\mathbb{CP}^1$
These applications are, to a large extent, higher- (alternatively, the bundle of holomorphic 3-forms
dimensional analogs of those discussed above; most should be the pullback of the square of the bundle of
of the problems in geometry to which twistor theory holomorphic 1-forms on $\mathbb{CP}^1$). The Einstein case
has been applied are those for which the underlying with nonzero cosmological constant is a variant of
differential equations are integrable. These start this in which the twistor space admits a
nondegenerate holomorphic contact structure, that is, a distribution of 2-plane elements, which are only integrable when the cosmological constant vanishes. It also admits a Kähler form when the scalar curvature is positive (in the negative case the corresponding Kähler form is indefinite). For the case of Kähler metrics with vanishing scalar curvature, the twistor space admits a holomorphic volume form with a double pole. The Ricci-flat case is equivalent to the case of hyper-Kähler metrics, those that are Kähler with respect to three different complex structures I, J, and K satisfying the standard quaternionic relations IJ = K, etc. A hypercomplex structure is obtained when one only has the three integrable complex structures satisfying the quaternion relations. Such manifolds admit an underlying conformal structure that is anti-self-dual, and the corresponding twistor space admits a fibration to $\mathbb{CP}^1$.

These constructions have all played a significant role in the general analysis of these geometric structures, and the construction of examples. A striking example of an application of the nonlinear graviton construction to general properties is due to Donaldson and Friedman, who show that if two 4-manifolds admit anti-self-dual conformal structures, then their connected sum does also.

In higher dimensions, most generalizations rely on quaternionic geometry and its reductions. The Euclidean signature formulation of the nonlinear graviton construction has natural extensions to quaternionic manifolds in 4k dimensions. These are manifolds with metric whose holonomies are contained in $Sp(k)\cdot Sp(1)$. The latter $Sp(1) = SU(2)$ factor leads to an associated $S^2$ bundle whose total space is the twistor space PT and it naturally has the structure of a (2k + 1)-dimensional complex manifold.

For a series of review articles, the reader is referred to Bailey and Baston (1990, chapters 3 and 4) and also LeBrun and Wang (1999, chapters 2, 5, 6, 10, and 14) which, despite being a book on the distinct subject of Einstein manifolds, is strongly influenced by twistor theory. Other applications along these lines are summarized in Mason et al. (2001, chapter 1).

There are a number of applications that go beyond complete integrability. A striking application is the twistor framework of Merkulov for studying arbitrary geometric structures. This has led to a classification of all possible irreducible holonomies of torsion-free affine connections, see Merkulov’s article in Huggett et al. (1998). Another important area is in the field of conformal invariants in which the local twistor connection plays a prominent role. This is a connection that is naturally defined on any conformal manifold, being the spinor representation of the Cartan conformal connection. An impressive application here is the construction of conformally invariant differential operators and other conformal invariants. See the article by Baston and Eastwood in Bailey and Baston (1990).

Beyond Classical Integrability: Twistor-String Theory

Until Witten (2004), there was little indication that twistor theory would have much useful to say about Yang–Mills or gravitational fields that are not anti-self-dual. Furthermore, it was problematic to incorporate quantum field theory into twistor ideas. However, twistor-string theory has transformed the situation and has furthermore had impressive applications to the field of perturbative gauge theory.

The story starts with a formulation by Nair of the remarkable Parke–Taylor formulas for the so-called maximal helicity violating (MHV) amplitudes in gauge theory. These are scattering amplitudes at tree level in which helicity conservation is maximally violated; using crossing symmetry to take all the particles to be outgoing, these are amplitudes in which $n - 2$ of the particles have helicity $-1$ and two have helicity $+1$. These amplitudes can be expressed simply as follows. Let the n particles have color $t_i$ in the Lie algebra of the gauge group and null momenta $p_i$ with spinor decompositions $p_i^a = \tilde\pi_i^A \pi_i^{A'}$, $i = 1, \ldots, n$, where the $\pi_i^{A'}$ are self-dual spinors and $\tilde\pi_i^A$ are anti-self-dual spinors using the index notation of Spinors and Spin Coefficients, and Twistors. Let $i = r$ and $i = s$ be the two gluons of helicity $+1$. Then the coefficient of the colour term $\mathrm{tr}(t_1 t_2 \cdots t_n)$ is

$$\delta^4\!\left(\sum_{i=1}^{n} p_i^a\right) \frac{(\pi_r \cdot \pi_s)^4}{\prod_{i=1}^{n} \pi_i \cdot \pi_{i+1}}$$

where $\pi_i \cdot \pi_j = \pi_i^{A'} \pi_{jA'}$ denotes the standard skew-symmetric inner product on chiral spinors and $\pi_{n+1} = \pi_1$. A striking feature is that, except for the delta function, it is holomorphic in the $\pi_i$s except at the simple poles $\pi_i \cdot \pi_{i+1} = 0$. Nair interprets these poles as those associated to fermion correlators in a current algebra on a $\mathbb{CP}^1$ parametrized by $\pi$. Using a supersymmetric formulation adapted to N = 4 super Yang–Mills, he formulated the amplitude as arising from an integral over lines in supertwistor space $\mathbb{CP}^{3|4}$.

Witten extends these ideas to give, at least conjecturally, a complete theory. He proposes that full perturbative N = 4 super Yang–Mills theory on spacetime is equivalent to a string theory, a topological
contour integral over the moduli space can be performed using residues in such a way as to eliminate the Chern–Simons propagators leaving an integral over d intersecting lines. On the other hand, the measure on the space of connected curves has a simple pole where the curve acquires double points and again the contour integral can be performed in such a way as to yield the same integral over d intersecting lines.

It should be mentioned that Berkovits has given an alternative version of twistor-string theory which is a heterotic open-string theory with target supertwistor space in which the strings are taken to have boundary on the real slice $\mathbb{RP}^3$ in $\mathbb{CP}^3$ (this is appropriate to a spacetime with split signature) and the D1-instanton expansions are replaced by expansions in the fundamental modes of the string (this is not a topological theory). This gives rise to the same formulas for scattering amplitudes as Witten’s original model.

There have been many applications now of these ideas, perhaps the most striking being the recursion relations of Britto, Cachazo, Feng, and Witten (BCFW), which give, at tree level, on-shell recurrence relations for Yang–Mills scattering amplitudes that suggest a hitherto unsuspected underlying structure for Yang–Mills theory.

Despite all these successes, twistor-string theory is not thought by string theorists to be a good vehicle for basic physics. The most serious problem is that the closed-string sector gives rise to conformal supergravity, which is an unphysical theory. This is particularly pernicious for the analysis of loop diagrams since, from the point of view of string theory, loop diagrams will carry supergravity modes. From this point of view, twistor-string theory is another duality, like AdS/CFT etc., that gives insight into some standard physics but is fundamentally limited.

From the point of view of a twistor theorist, however, twistor-string theory has overcome major obstacles to the twistor programme. Hodges has used the BCFW recursion relations to provide all twistor diagrams for gauge theory. In Mason (2005) it is shown how to derive the main generating function formulas from Yang–Mills and conformal gravity spacetime action principles via twistor space actions for these theories. These twistor actions can in the first instance be expressed purely bosonically and distinctly, and the twistor-string generating function formulas are obtained by expanding and re-summing the classical limit of the path integral in a parameter that expands about the anti-self-dual sector. This allows one to decouple the Yang–Mills and conformal gravity modes, and indeed to work purely bosonically; one is not tied to super Yang–Mills. Although there is much work to be done to extend these ideas to provide a consistent approach to the main equations of basic physics, obstacles that seemed insurmountable a few years ago have been overcome.

See also: Chern–Simons Models: Rigorous Results; Einstein Equations: Exact Solutions; General Relativity: Overview; Instantons: Topological Aspects; Integrable Systems and the Inverse Scattering Method; Riemann–Hilbert Methods in Integrable Systems; Spinors and Spin Coefficients; Twistors; Classical Groups and Homogeneous Spaces; Quantum Mechanics: Foundations; Several Complex Variables: Compact Manifolds; Several Complex Variables: Basic Geometric Theory.

Further Reading

Atiyah MF (1979) Geometry of Yang–Mills Fields: Lezioni Fermiane. Pisa: Accademia Nazionale dei Lincei Scuola Normale Superiore.
Atiyah MF, Hitchin NJ, and Singer IM (1978) Self-duality in four-dimensional Riemannian geometry. Proceedings of the Royal Society A 362: 425.
Bailey TN and Baston R (eds.) (1990) Twistors in Mathematics and Physics, LMS Lecture Notes Series, vol. 156. Cambridge: Cambridge University Press.
Baston RJ and Eastwood MG (1989) The Penrose Transform: Its Interaction with Representation Theory. Oxford: Oxford University Press.
Cachazo F and Svrček P (2005) Lectures on twistor strings and perturbative Yang–Mills theory, arXiv:hep-th/0504194.
Hitchin N (1987) Monopoles, Minimal Surfaces and Algebraic Curves, Séminaire de Mathématiques Supérieures, vol. 105. NATO Advanced Study Institute. Les Presses de l’Université de Montréal.
Huggett S, Mason LJ, Tod KP, Tsou TS, and Woodhouse NMJ (eds.) (1998) The Geometric Universe. Oxford: Oxford University Press.
LeBrun C and Wang M (eds.) (1999) Essays on Einstein Manifolds, Surveys in Differential Geometry, vol. VI. Boston, MA: International Press.
Mason LJ (2005) Twistor actions for non-self-dual fields, a derivation of twistor-string theory, arXiv:hep-th/0507269.
Mason LJ and Hughston LP (eds.) (1990) Further Advances in Twistor Theory, Volume I: The Penrose Transform and Its Applications. Pitman Research Notes in Maths, vol. 231. Harlow: Longman.
Mason LJ, Hughston LP, and Kobak PZ (1995) Further Advances in Twistor Theory, Volume II: Integrable Systems, Conformal Geometry and Gravitation. Pitman Research Notes in Maths, vol. 232. Harlow: Longman.
Mason LJ, Hughston LP, Kobak PZ, and Pulverer K (eds.) (2001) Further Advances in Twistor Theory, Volume III: Curved Twistor Spaces. Pitman Research Notes in Maths, vol. 424. Boca Raton, FL: Chapman and Hall/CRC Press.
Mason LJ and Woodhouse NMJ (1996) Integrability, Self-Duality and Twistor Theory. Oxford: Oxford University Press.
Newman ET (1976) Heaven and its properties. General Relativity and Gravitation 7(1): 107–111.
Penrose R (1976) Nonlinear gravitons and curved twistor theory. General Relativity and Gravitation 7: 31–52.
Penrose R (1984, 1986) Spinors and Space-Time, vols. I and II. Cambridge: Cambridge University Press.
Ward RS (1977) On self-dual gauge fields. Physics Letters A 61: 81–82.
Ward RS (1985) Integrable and solvable systems and relations among them. Philosophical Transactions of the Royal Society A 315: 451–457.
Ward RS and Wells RO (1990) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Witten E (2004) Perturbative gauge theory as a string theory in twistor space. Communications in Mathematical Physics 252: 189 (arXiv:hep-th/0312171).
Twistors
K P Tod, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Twistor theory initially arose from two principal motivations: a desire for a conformally invariant calculus for spacetime geometry and fields on spacetime, and a desire to unify and account for the various occurrences of complex numbers and holomorphic functions in mathematical physics, especially in general relativity (Penrose and MacCallum 1973). The theory leads to a nonlocal relation between spacetime and twistor space, whereby a point in one is an extended object in the other. Part of the present-day motivation of the subject is that this nonlocal relation will be a fruitful way to approach the quantization of spacetime. A comparison is often invoked with Hamiltonian mechanics, which is a formal rephrasing of classical mechanics that nonetheless provides a bridge from that theory to quantum mechanics. The hope is that the twistor theory has the right character to provide a bridge from general relativity to quantum theory, specifically to quantum gravity.

The principal successes of twistor theory in mathematical physics can be characterized as the linear Penrose transform, which provides a solution of the zero-rest-mass free-field equations in Minkowski space in terms of sheaf cohomology in twistor space, and the nonlinear Penrose transform, which provides solutions of certain nonlinear field equations in terms of holomorphic geometry. These are treated below, together with other applications of twistor theory, following a brief introduction to twistor geometry.

Very recently, there has been a resurgence of interest in twistor theory following Witten’s introduction of twistor string theory (Witten 2003) as a string theory in twistor space. This is not treated here, but this article does provide the necessary background.

Twistor Geometry

General references for this section are the books by Penrose and Rindler (1986) and Huggett and Tod (1994). It will be convenient to use Penrose’s abstract index convention (Penrose and Rindler 1984, 1986), which is also used in Spinors and Spin Coefficients. This can be used wherever vector or tensor indices occur. Suppose that V is a (real or complex) finite-dimensional vector space with dual V′. Elements of V are written $v^a, u^b, w^c, \ldots$, where an index a, b, c, . . . is regarded not as an integer in the range 1 to dim V but simply as an abstract label indicating that the object to which it is attached is a vector. Elements of V′ are similarly written $u_a, v_b, w_c, \ldots$ and elements of the tensor algebra as $t^{ab}{}_{cd}$ according to valence, and so on. The usual operations of tensor algebra are written in the way that component calculations would suggest, but without necessitating a choice of basis. The jump to tensor fields on a manifold M is immediate. A metric is a particular field $g_{ab}$ and determines a Levi-Civita connection $\nabla_a$ which defines maps $\nabla_a : v^b \to \nabla_a v^b$ and similarly for other valences. The virtue of the formalism is that, while remaining invariant, it can harness the strength and flexibility of calculations in components.

With this understanding, twistors may first be defined as the fundamental representation of SU(2, 2), so that they are elements $Z^\alpha$ of a four-dimensional complex vector space T. T carries a Hermitian form $\Sigma$ of signature (+ + − −) which is made explicit below and which provides an isomorphism from the complex conjugate of T to its dual. This isomorphism is used to eliminate all appearances of complex-conjugate twistors from the formalism and is therefore regarded as an antilinear map to the dual. SU(2, 2) is the double cover of O(2, 4), the rotation group of $E^{2,4}$, the six-dimensional space with flat metric $\eta_{2,4}$ of signature (+ + − − − −), which in turn is the double cover of C(1, 3), the conformal group of Minkowski space M. This last group homomorphism may be made explicit as follows (suspending the abstract-index convention for the duration of this
Each generator meets this sphere twice at, say, $y$ and $-y$, and PN is the quotient by this identification of the two surfaces

$$(y^0)^2 + (y^4)^2 = 1 = (y^1)^2 + (y^2)^2 + (y^3)^2 + (y^5)^2$$

which define the intersection. The metric $\eta_{2,4}$ defines a degenerate metric on N, which, however, is nondegenerate on any smooth cross section of N which meets each generator once. Furthermore, the map along the generators between any two such cross sections is conformal. Thus, there is a conformal metric on PN and it is conformal to $\eta_{1,3}$. We call PN compactified Minkowski space $M^c$ as it is compact and has the same conformal metric as Minkowski space. It can be thought of as M compactified by the addition of some points, namely

where $T^{AA'}$ (a real vector) defines an infinitesimal translation, $B_{AA'}$ (another real vector) defines an infinitesimal special conformal transformation, a real constant defines a dilatation and the (real) bivector $M_{ab} = \phi_{AB}\,\epsilon_{A'B'} + \bar\phi_{A'B'}\,\epsilon_{AB}$ defines an infinitesimal rotation. This gives a total of 15 parameters for the transformation, which is the correct dimension for C(1, 3).

The Hermitian form $\Sigma(\cdot\,,\cdot)$ can be written as

$$\Sigma(Z, \bar Z) = Z^\alpha \bar Z_\alpha = \omega^A \bar\pi_A + \bar\omega^{A'} \pi_{A'} \qquad [7]$$

when it can be checked that the transformations [6] leave it invariant (and that its signature is (+ + − −); this establishes that SU(2, 2) is locally isomorphic to C(1, 3)). Equation [7] will be referred to as the norm of a twistor.
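The signature claim for the norm [7] can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a twistor is stored as the component pairs (ω^0, ω^1) and (π_0', π_1'), that the conjugate spinor has the complex-conjugate components, and the function names `twistor_norm` and `diagonal_form` are hypothetical. Rotating to u = (ω + π)/√2 and v = (ω − π)/√2 turns the norm into |u|² − |v|², which exhibits two plus signs and two minus signs.

```python
import math

def twistor_norm(omega, pi):
    """Norm [7]: omega^A conj(pi)_A plus its complex conjugate."""
    return sum(2.0 * (w * p.conjugate()).real for w, p in zip(omega, pi))

def diagonal_form(omega, pi):
    """The same quantity as |u|^2 - |v|^2 in the rotated basis
    u = (omega + pi)/sqrt(2), v = (omega - pi)/sqrt(2)."""
    root2 = math.sqrt(2)
    u = [(w + p) / root2 for w, p in zip(omega, pi)]
    v = [(w - p) / root2 for w, p in zip(omega, pi)]
    return sum(abs(c) ** 2 for c in u) - sum(abs(c) ** 2 for c in v)

if __name__ == "__main__":
    omega = (1 + 2j, 0.5j)      # arbitrary sample components
    pi = (3 - 1j, 2 + 0j)
    print(twistor_norm(omega, pi), diagonal_form(omega, pi))
```

The two functions agree for any input, so the form has two positive and two negative directions, as stated for [7].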
From [6], a twistor $Z^\alpha = (\omega^A, \pi_{A'})$ gives rise, under translation by a variable $x^{AA'}$, to a spinor field $\Omega^A$ given by

$$\Omega^A = \omega^A - i x^{AA'} \pi_{A'} \qquad [8]$$

Differentiating [8] and symmetrizing, we see that $\Omega^A$ satisfies the differential equation

$$\nabla_{A'}{}^{(A} \Omega^{B)} = 0 \qquad [9]$$

which is known as the twistor equation. In fact, the general solution of [9] takes the form of [8] for constant spinors $\omega^A$ and $\pi_{A'}$. Furthermore, the conformal group can be shown directly to act on solutions of [9], so that twistor theory can begin with the study of [9] and its solutions. In this approach, a twistor is precisely a solution of [9].

Given a spinor field $\Omega^A$ of the form of [8], we may seek the points of M where it vanishes. In general, there are none, but if we consider complexified Minkowski space CM, then $\Omega^A$ vanishes on a two-dimensional complex plane with the property that every tangent vector is of the form $\lambda^A \pi^{A'}$ for varying $\lambda^A$ and fixed $\pi^{A'}$. The 2-plane is flat and totally null, in that the (analytically extended) Minkowski metric vanishes identically on it, and it has a self-dual (SD) tangent bivector determined by $\pi^{A'}$. Such a 2-plane is known as an $\alpha$-plane (reserving the term $\beta$-plane for a totally null 2-plane with anti-self-dual (ASD) tangent bivector). At a given point p in CM, there is an $\alpha$-plane for each choice of $\pi_{A'}$ up to scale (in other words, for each element of the projective (primed) spin space at p) which is a copy of the complex projective line, $\mathbb{CP}^1$.

The $\alpha$-plane is determined by the twistor up to scale (in that a constant complex multiple of the field $\Omega^A$ determines the same $\alpha$-plane). Thus, we consider the projective twistor space PT which, since T is $\mathbb{C}^4$, is a copy of complex projective 3-space, $\mathbb{CP}^3$. This is now the space of $\alpha$-planes, but is also compact. We define complexified, compactified Minkowski space $CM^c$ as the space of all (complex projective) lines in PT; then it is easy to see that this includes CM as an open dense subset. PT is the space of $\alpha$-planes in $CM^c$ and two lines meet in PT iff the corresponding points in $CM^c$ lie on an $\alpha$-plane, or, equivalently, iff they are null separated. Thus, the conformal structure in $CM^c$ is determined by incidence of lines in PT.

To find M and $M^c$ in this picture, we seek $\alpha$-planes containing real points. If $\Omega^A$ from [8] vanishes at a real $x^{AA'}$, then the contraction $\omega^A \bar\pi_A$ must be purely imaginary, so that, by [7], the norm of the twistor is zero. Conversely, one calculates that $\Omega^A$ can indeed vanish at real points if the norm is zero, and that it will then in fact vanish along a null geodesic with tangent vector (proportional to) $\bar\pi^A \pi^{A'}$. Twistors with norm zero are called null and the (five-dimensional, real) submanifold of them in PT is PN. This is a compactification of the space of (unscaled) null geodesics in M by the inclusion of the 2-sphere of null geodesics in $M^c$ which lie on the light cone at $\mathscr{I}$. For use in the next section, we note the definition of $PT^+$ and $PT^-$ as the projective twistors with positive and negative norm, respectively.

To summarize, we have found M and $M^c$: (complex projective) lines in PT define points of $CM^c$; lines in PN define points of $M^c$ with one such, call it I, picked out as the vertex of the null cone $\mathscr{I}$; lines in PN which meet I correspond to points of $\mathscr{I}$; lines in PN which do not meet I correspond to points in M. As for $CM^c$, the conformal structure of M and $M^c$ is determined by incidence in PN. We may now note the nonlocal correspondence mentioned in the introduction: points in $CM^c$ are lines in PT and points in PT are $\alpha$-planes in $CM^c$.

It will be convenient to refer to the line in PT associated with a point x in $CM^c$ as $L_x$. With this notation, it is possible to characterize the forward or future tube in terms of twistor space: a point x of CM is in the forward tube iff its imaginary part is timelike and past-pointing, and this is equivalent to $L_x$ lying in $PT^+$.

The starting point for Riemannian twistor theory is the fact that $\mathbb{CP}^3$ is a fibration with fiber $\mathbb{CP}^1$ over $S^4$, where the fiber above a point p can be interpreted as the almost-complex structures at p (since this is the same as the projective primed spin space at p). In the picture developed above, this means that there is an $S^4$’s worth of lines filling out $\mathbb{CP}^3$, no two of which intersect (so that there are no null vectors and the metric is definite). The complexification of $S^4$ with its conformal structure is again $CM^c$.

If a twistor has nonzero norm, say $Z^\alpha \bar Z_\alpha = s \neq 0$, then it can be interpreted as a massless particle with spin s: the momentum is $p_a = \bar\pi_A \pi_{A'}$ and the angular momentum bivector is $M^{ab} = i\omega^{(A} \bar\pi^{B)} \epsilon^{A'B'} - i\bar\omega^{(A'} \pi^{B')} \epsilon^{AB}$. The angular momentum transforms appropriately under translation by virtue of [6] and the (Pauli–Lubanski) spin vector is $s p_a$, as it should be for a massless spinning particle.

The Linear Penrose Transform: Zero-Rest-Mass Free Fields

A zero-rest-mass free field of spin s is a symmetric spinor field $\phi_{AB\ldots C}$ with 2s indices which satisfies the field equation

$$\nabla^{A'A} \phi_{AB\ldots C} = 0 \qquad [10]$$
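The link between real points, null twistors and the forward tube described in the twistor geometry section can be tried out numerically. The sketch below is illustrative only: it assumes the common convention in which a point (t, x, y, z) corresponds to the 2×2 matrix x^{AA'} = (1/√2)[[t + z, x + iy], [x − iy, t − z]], which is Hermitian exactly when the coordinates are real, and it takes a twistor incident with the point, that is, with ω^A = i x^{AA'} π_{A'} so that the field of [8] vanishes there. The function names are hypothetical.

```python
import math

def point_to_spinor(t, x, y, z):
    """Assumed convention for x^{AA'}; Hermitian iff t, x, y, z are real."""
    s = 1 / math.sqrt(2)
    return [[s * (t + z), s * (x + 1j * y)],
            [s * (x - 1j * y), s * (t - z)]]

def incident_twistor(X, pi):
    """Twistor (omega, pi) with omega^A = i x^{AA'} pi_{A'}."""
    omega = [1j * sum(X[a][b] * pi[b] for b in range(2)) for a in range(2)]
    return omega, list(pi)

def twistor_norm(omega, pi):
    """Norm [7]: omega^A conj(pi)_A plus its complex conjugate."""
    return sum(2.0 * (w * complex(p).conjugate()).real
               for w, p in zip(omega, pi))

if __name__ == "__main__":
    pi = (1 + 2j, -0.5 + 1j)                      # arbitrary sample spinor
    for t in (0.3, 0.3 - 1j, 0.3 + 1j):           # real, past-, future-pointing imaginary part
        omega, p = incident_twistor(point_to_spinor(t, -1.2, 0.7, 2.0), pi)
        print(t, twistor_norm(omega, p))
```

Under these assumptions the norm vanishes (up to rounding) for the real point, is positive when the imaginary part of the point is timelike and past-pointing, and is negative when it is future-pointing, in line with the characterization of the forward tube in terms of positive-norm twistors.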
The Weyl neutrino equation, source-free Maxwell equation, and linearized Einstein vacuum equation are examples of zero-rest-mass free-field equations, with spins 1/2, 1, and 2, respectively, so that these are equations of physical interest. Conventionally, one takes the s = 0 case to be the wave equation, and the complex-conjugate fields $\phi_{A'B'\ldots C'}$ to have the same spin but opposite helicity.

The conformal group acts on solutions of [10], so that the equations are conformally invariant. The equations can be solved by contour integral expressions involving homogeneous functions of a twistor variable. To be explicit, we define an operation $\rho_x$ of restriction to the line $L_x$ for a function of a twistor variable by the following:

$$\rho_x f(Z^\alpha) = f(i x^{AA'} \pi_{A'}, \pi_{A'}) \qquad [11]$$

Now suppose that $f(Z^\alpha)$ is holomorphic and homogeneous of degree $-2s - 2$ in the twistor variable for positive integer 2s, but otherwise arbitrary, and consider the integral

$$\phi_{A'B'\ldots C'}(x) = \oint \pi_{A'} \pi_{B'} \cdots \pi_{C'}\, \rho_x f(Z^\alpha)\, \epsilon^{E'F'} \pi_{E'}\, d\pi_{F'} \qquad [12]$$

where there are 2s indices on $\phi$ and the integration is around a contour in the line $L_x$ in PT. The choice of homogeneity ensures that the integral is well defined but, to obtain a nonzero answer, $\rho_x f$ must have some singularities as a function of $\pi_{A'}$ on $L_x$. The answer then automatically gives a helicity-$(-s)$ solution of [10], as may be checked by differentiation under the integral sign.

For a helicity-s solution, we take an arbitrary function $f(Z^\alpha)$, holomorphic and of homogeneity $(2s - 2)$, and consider the integral

$$\phi_{AB\ldots C}(x) = \oint \rho_x\!\left( \frac{\partial}{\partial\omega^A} \frac{\partial}{\partial\omega^B} \cdots \frac{\partial}{\partial\omega^C} f(Z^\alpha) \right) \epsilon^{E'F'} \pi_{E'}\, d\pi_{F'} \qquad [13]$$

where there are 2s indices on $\phi$ and the integration is again around a contour in the line $L_x$. As before, one needs singularities to make the contour integral nonzero, but again the result satisfies [10].

The correct framework in which to understand these integrals is sheaf cohomology theory. For [12], the functions with singularities are actually elements of $H^1(\hat U, \mathcal{O}(-2s - 2))$, the first cohomology group of a region $\hat U$ in PT with coefficients in the sheaf of germs of holomorphic functions of homogeneity $-2s - 2$, while the fields are elements of $H^0(U, Z_s)$, the zeroth cohomology group of the corresponding region U of M with coefficients in helicity-s zero-rest-mass fields (thus, $\hat U$ must contain the neighborhood of lines $L_x$ for points x in U). Similarly, [13] is interpreted cohomologically in terms of potentials modulo a gauge. With appropriate conditions on $\hat U$ and U (for brevity, U is said to be elementary), these groups can be shown to be isomorphic and this isomorphism is known as the Penrose transform (Ward and Wells (1991)). A particular instance of an elementary U is the forward tube, when $\hat U$ is $PT^+$. Since the definition of positive frequency is holomorphicity on the forward tube, this observation geometrizes the notion of positive frequency in terms of twistor space.

For free fields with mass, there are generalizations of [12] and [13] to solve the Dirac equation for different spins. However, the integrands now involve functions of more than one twistor variable, subject to an equation. This equation is a counterpart of the Klein–Gordon equation and breaks the conformal invariance (as it must, since mass does). It can be imposed by a projection which can in turn be written as a contour integral over arbitrary holomorphic functions. It has been argued that the appropriate description of leptons and hadrons in twistor theory is with functions of two and three twistor variables, respectively. Such a function has two or three integer quantum numbers determined by the homogeneities in different variables, and this leads to a twistor particle classification scheme (see, e.g., Hughston and Sheppard (1980) and Sparling (1981)), similar in many respects to, but not identical with, the standard classifications.

Given that free fields, massive or massless, are determined from arbitrary twistor functions through contour integrals, one may translate the Feynman diagrams of a quantum field theory into contour integrals over twistor functions. In the massless case, the contours are compact, so that the integrals are finite without need for renormalization. The massive case is more complicated but essentially parallel. This is twistor diagram theory and there is a substantial literature on it (see, e.g., the article by Hodges in the volume edited by Huggett et al. (1998)). There is currently no new physical theory, distinct from a known quantum field theory, to generate the relevant diagrams.

The Nonlinear Penrose Transform: Curved Twistor Spaces

The electromagnetic field, in Minkowski space say, can be regarded as a spinor field subject to field equations, in which case these equations can be
solved via the Penrose transform by contour completely integrable partial differential equations
integrals. Alternatively, it can be seen as the (PDEs) (including the sine-Gordon, Korteweg–de
curvature of a connection on a U(1) bundle over Vries (KdV) and nonlinear Schrödinger equations)
M, which is a more active role for the field in are reductions of the ASD Yang–Mills equations.
curving a bundle. For SD or ASD electromagnetic Solutions of these other integrable systems can be
fields, there are analogous active twistor construc- given in terms of a geometrical construction,
tions. From an ASD electromagnetic field, one may usually of some structure in holomorphic geometry.
define a connection on the primed spin space of CM The other major active twistor construction,
which is flat on $\alpha$-planes: if the tangents to the $\alpha$-plane which historically preceded the Yang–Mills one, is
are of the form $\lambda^A \pi^{A'}$ for varying $\lambda^A$ and with Penrose’s nonlinear graviton (Penrose 1976), which
$\pi^{A'}$ fixed up to scale, then consider the propagation solves the ASD Einstein vacuum equations. For this,
of $\pi_{A'}$ around the $\alpha$-plane given by one starts from a complex, four-dimensional manifold
M with holomorphic metric, vanishing Ricci
$$\pi^{A'} (\nabla_{A'A} - i A_{A'A}) \pi_{B'} = 0 \qquad [14]$$
curvature and ASD Weyl tensor. These conditions
on the curvature are necessary and sufficient to
where $A_{A'A}$ is a potential for the electromagnetic
allow the existence of $\alpha$-surfaces, which generalize
field. This connection is flat provided
$\alpha$-planes. They are two-dimensional totally null
$$\pi^{A'} \pi^{B'} \nabla_{AA'} A^{A}{}_{B'} = 0 \qquad [15]$$
(complex) surfaces with SD tangent bivector, one
for each choice of (null) SD bivector, or, equivalently,
and if this is to hold for all $\pi^{A'}$ then $\nabla_{A(A'} A^{A}{}_{B')}$ for each choice of primed spinor, at each
vanishes and the electromagnetic field, defined as point.
usual as the exterior derivative of the potential, is The space of $\alpha$-surfaces is a three-dimensional
necessarily ASD. Now the space of $\alpha$-planes in CM complex manifold, the curved twistor space $\mathcal{PT}$.
is projective twistor space PT, so we define a This is curved inasmuch as it is not now (part of)
holomorphic $\mathbb{C}^*$ bundle T over PT by taking the $\mathbb{CP}^3$, but it still contains complex projective lines:
fiber above an $\alpha$-plane to be choices of $\pi_{A'}$ scaled as given a point p in M there is an $\alpha$-surface through p
in [14]. If we restrict attention to the $\alpha$-planes for every primed spinor at p up to scale; these $\alpha$-surfaces
through a given point p of CM, then by comparing make up a projective line $L_p$ in $\mathcal{PT}$. The
the scalings at p we can trivialize the bundle; thus, T conditions on the curvature are equivalent to the
is trivial on lines in PT. There is a converse to this statement that the Levi-Civita connection is flat on
construction and we have: there is a one-to-one primed spinors, so that there exist constant primed
correspondence between holomorphic $\mathbb{C}^*$ bundles spinors in M, and the tangent bivector to an $\alpha$-surface
on a region $\hat U$ in PT which are trivial on lines and can be taken to be constant, without loss of
ASD electromagnetic fields on the corresponding generality. The map associating a constant primed
region U of CM (for elementary U). spinor with each $\alpha$-surface defines a projection
This construction can be extended to solve the from $\mathcal{PT}$ to $\mathbb{CP}^1$, so that $\mathcal{PT}$ is a fibration over
ASD Yang–Mills equations with holomorphic vector $\mathbb{CP}^1$. The lines $L_p$ define a four-parameter family of
bundles replacing holomorphic line bundles: with $\hat U$ sections of this fibration.
and elementary U as above, there is a natural one-to- To define the metric of M from $\mathcal{PT}$, one needs
one correspondence between ASD GL(n, C) gauge the notion of normal bundle: the normal bundle of a
fields on U and holomorphic rank-n vector bundles submanifold Y in a manifold X is $N = TX|_Y / TY$ in
E over $\hat U$ which are trivial on $L_x$ for every x in U. terms of the tangent bundles TX and TY. The
ASD Yang–Mills fields cannot be real on M, but normal bundle $N_p$ of a particular section $L_p$ is the
using Riemannian twistor theory, one can impose same in $\mathcal{PT}$ as it was in PT, namely $H \oplus H$, where
appropriate reality and globality conditions to H is the hyperplane-section line bundle over $\mathbb{CP}^1$
ensure that these ASD Yang–Mills fields are both (Ward and Wells 1991). A section $S_V$ of $N_p$
real and globally defined on $S^4$. These are then corresponds to a vector V in $T_pM$ (think of it as
instantons. The Atiyah–Drinfeld–Hitchin–Manin an infinitesimally neighboring point in M) and V is
(ADHM) construction of instantons (Atiyah et al. defined to be null iff $S_V$ has a zero. Because of the
1978) proceeds via construction of the correspond- nature of N , this defines a quadratic conformal
ing holomorphic vector bundles over twistor space. metric, which, furthermore, agrees with the con-
The construction of ASD Yang–Mills fields is formal metric on M and generalizes the definition of
also the starting point for the twistor theory of conformal metric for CMc in terms of incidence in
integrable systems (Mason and Woodhouse 1996), PT. To define the actual metric, as opposed to just
following the observation that many of the known the conformal metric, one has a covariant-constant
316 Twistors
choice of $\varepsilon_{A'B'}$ in M which defines a form on the base of the fibration, and a Poisson structure on the fibers of the projection. The definition of the latter is more intricate, but the two structures enable the metric of M to be recovered from $\mathcal{PT}$. Penrose (1976) and Huggett and Tod (1994) provide more details.
Now the metric and curvature properties of M are coded into holomorphic properties of $\mathcal{PT}$ together with these two structures. These properties characterize M: subject to topological conditions on M, there is a one-to-one correspondence between holomorphic solutions M of the Einstein vacuum equations with ASD Weyl tensor and three-dimensional complex manifolds $\mathcal{PT}$ fibered over $CP^1$, with a four-parameter family of sections, each with normal bundle $H \oplus H$, and the two forms as above.
In fact, one only needs to assume the existence of one section with the correct normal bundle and the full four-parameter family will automatically exist, at least near to the initial one. Penrose (1976) showed how curved twistor spaces with the necessary structures could be obtained by deforming the neighborhood of a line in the ‘‘flat’’ twistor space PT. The Kodaira–Spencer theory of complex deformations ensures that the necessary lines continue to exist under this deformation.
The original nonlinear graviton construction has been extended in various ways including the following: to allow the possibility of a cosmological constant (Ward and Wells 1991); to produce real, Riemannian solutions (Hitchin 1995); to solve other but related field equations (e.g., those for hypercomplex metrics, scalar-flat Kähler metrics or Einstein–Weyl structures).
The search for a twistor construction of the SD Einstein equations (distinct from a construction in terms of dual twistors, which is, of course, provided by deforming dual twistor space) is an active area of research. This and other applications of twistor theory, including a quasilocal definition of mass in general relativity, the classification of affine holonomies and the construction of four-dimensional conformal field theories, may be found in the literature cited in the ‘‘Further reading’’ section.

See also: Classical Groups and Homogeneous Spaces; Clifford Algebras and Their Representations; Integrable Systems: Overview; Quantum Field Theory: A Brief Introduction; Quantum Mechanics: Foundations; Relativistic Wave Equations Including Higher Spin Fields; Riemann–Hilbert Problem; Spinors and Spin Coefficients; Twistor Theory: Some Applications.

Further Reading
Atiyah MF, Hitchin NJ, Drinfeld VG, and Manin YuI (1978) Construction of instantons. Physics Letters A 65: 185–187.
Bailey TN and Baston RJ (eds.) (1990) Twistors in Mathematics and Physics, LMS Lecture Note Series, vol. 156. Cambridge: Cambridge University Press.
Frauendiener J and Penrose R (2001) Twistors and general relativity. In: Engquist B and Schmid W (eds.) Mathematics Unlimited – 2001 and Beyond. Berlin: Springer.
Hitchin NJ (1995) Twistor spaces, Einstein metrics and isomonodromic deformations. Journal of Differential Geometry 42: 30–112.
Hitchin NJ, Segal GB, and Ward RS (1999) Integrable Systems. Twistors, Loop Groups, and Riemann Surfaces, Oxford Graduate Texts in Mathematics, vol. 4. Oxford: Oxford University Press.
Huggett SA and Tod KP (1994) An Introduction to Twistor Theory, LMS Student Text, vol. 4. Cambridge: Cambridge University Press.
Huggett SA, Mason LJ, Tod KP, Tsou TS, and Woodhouse NMJ (eds.) (1998) The Geometric Universe – Science, Geometry and the Work of Roger Penrose. Oxford: Oxford University Press.
Hughston LP and Sheppard M (1980) On the magnetic moments of hadrons. Reports on Mathematical Physics 18: 53–66.
Mason LJ, Hughston LP, Kobak PZ, and Pulverer K (eds.) (1995, 1998, 2001) Further Advances in Twistor Theory, vols. 1–3. Boca Raton, FL: Pitman Advanced Publishing Programme and Chapman and Hall, CRC.
Mason LJ and Woodhouse NMJ (1996) Integrability, Self-Duality, and Twistor Theory, LMS Monographs Series. Oxford: Clarendon.
Penrose R (1976) Nonlinear gravitons and curved twistor theory. General Relativity and Gravitation 7: 31–52.
Penrose R (1999) The central programme of twistor theory. Chaos Solitons Fractals 10: 581–611.
Penrose R and MacCallum MAH (1973) Twistor theory: an approach to the quantisation of fields and space–time. Physics Reports 6C: 241–316.
Penrose R and Rindler W (1984, 1986) Spinors and Space–Time, vols. 1 and 2. Cambridge: Cambridge University Press.
Sparling GAJ (1981) Theory of massive particles. I. Algebraic structure. Philosophical Transactions of the Royal Society of London A 301: 27–74.
Ward RS and Wells RO (1991) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Witten E (2003) Perturbative gauge theory as a string theory in twistor space, hep-th/0312171.
Two-Dimensional Conformal Field Theory and Vertex Operator Algebras 317
subgroup of Möbius transformations, and their commutation relations are simply
\[ [L_m, L_n] = (m - n) L_{m+n} \qquad [4] \]
In fact, [4] describes also the commutation relations of all generators $L_n$ with $n \in \mathbb{Z}$: this is the Lie algebra of (locally defined) 2D conformal transformations; it is called the Witt algebra.

The General Structure of Conformal Field Theory

A 2D conformal field theory is determined (like any other field theory) by its space of states and the collection of its correlation functions (vacuum expectation values). The space of states is a vector space H (which, in many interesting examples, is a Hilbert space), and the correlation functions are defined for collections of vectors in some dense subspace of H. These correlation functions are defined on a 2D (Euclidean) space. We shall mainly be interested in the case where the underlying 2D space is a closed compact surface; the other important case concerning surfaces with boundaries (whose analysis was pioneered by Cardy) will be reviewed elsewhere (see the article Boundary Conformal Field Theory). The closed surfaces are classified (topologically) by their genus g, which counts the number of handles; the simplest such surface which we shall mainly consider is the sphere with g = 0, the surface with g = 1 is the torus, etc.
One of the special features of conformal field theory is the fact that the theory is naturally defined on a Riemann surface (or complex curve), that is, on a surface that possesses suitable complex coordinates. In the case of the sphere, the complex coordinates can be taken to be those of the complex plane that cover the sphere except for the point at infinity; complex coordinates around infinity are defined by means of the coordinate function $z \mapsto 1/z$ that maps a neighborhood of infinity to a neighborhood of zero. With this choice of complex coordinates, the sphere is usually referred to as the Riemann sphere, and this choice of complex coordinates is, up to Möbius transformations, unique. The correlation functions of a conformal field theory that is defined on the sphere are thus of the form
\[ \langle 0| V(\psi_1; z_1, \bar z_1) \cdots V(\psi_n; z_n, \bar z_n) |0\rangle \qquad [5] \]
where $V(\psi, z, \bar z)$ is the field that is associated to the state $\psi$, and $z_i$ and $\bar z_i$ are complex conjugates of one another. Here $|0\rangle$ denotes the $SL(2, C)/Z_2$-invariant vacuum. The usual locality assumption of a 2D (bosonic) Euclidean quantum field theory implies that these correlation functions are independent of the order in which the fields appear in [5].
It is conventional to think of $z = 0$ as describing ‘‘past infinity,’’ and $z = \infty$ as ‘‘future infinity’’; this defines a time direction in the Euclidean field theory and thus a quantization scheme (radial quantization). Furthermore, we identify the space of states with the space of ‘‘incoming’’ states; thus, the state $\psi$ is simply
\[ \psi = V(\psi; 0, 0) |0\rangle \qquad [6] \]
We can think of $z_i$ and $\bar z_i$ in [5] as independent variables, that is, we may relax the constraint that $\bar z_i$ is the complex conjugate of $z_i$. Then we have two commuting actions of the conformal group on these correlation functions: the infinitesimal action on the $z_i$ variables is described (as before) by the $L_n$ generators, while the generators for the action on the $\bar z_i$ variables are $\bar L_n$. In a conformal field theory, the space of states H thus carries two commuting actions of the Witt algebra. The generator $L_0 + \bar L_0$ can be identified with the time-translation operator, and thus describes the energy operator. The space of states of the physical theory should have a bounded energy spectrum, and it is thus natural to assume that the spectrum of both $L_0$ and $\bar L_0$ is bounded from below; representations with this property are usually called positive-energy representations. It is relatively easy to see that the Witt algebra does not have any unitary positive-energy representations except for the trivial representation. However, as is common in many instances in quantum theory, it possesses many interesting projective representations. These projective representations are conventional representations of the central extension of the Witt algebra
\[ [L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12}\, m (m^2 - 1)\, \delta_{m,-n} \qquad [7] \]
which is the famous Virasoro algebra. Here c is a central element that commutes with all $L_m$; it is called the central charge (or conformal anomaly).
Given the actions of the two Virasoro algebras (that are generated by $L_n$ and $\bar L_n$), one can decompose the space of states H into irreducible representations as
\[ H = \bigoplus_{ij} M_{ij}\, H_i \otimes \bar H_j \qquad [8] \]
where $H_i$ ($\bar H_j$) denotes the irreducible representations of the algebra of $L_n$ ($\bar L_n$), and $M_{ij} \in \mathbb{N}_0$ describe the multiplicities with which these combinations of representations occur. (We are assuming here that the space of states is completely reducible with
respect to the action of the two Virasoro algebras; examples where this is not the case are the so-called logarithmic conformal field theories.) The positive-energy representations of the Virasoro algebra are characterized by the value of the central charge, as well as the lowest eigenvalue of $L_0$; the state $\psi$ whose $L_0$ eigenvalue is smallest is called the highest-weight state, and its eigenvalue $L_0 \psi = h \psi$ is the conformal weight. The conformal weight determines the conformal transformation properties of $\psi$: under the conformal transformation $z \mapsto f(z)$, $\bar z \mapsto \bar f(\bar z)$, we have
\[ V(\psi; z, \bar z) \mapsto (f'(z))^{h}\, (\bar f'(\bar z))^{\bar h}\, V(\psi; f(z), \bar f(\bar z)) \qquad [9] \]
where $L_0 \psi = h \psi$ and $\bar L_0 \psi = \bar h \psi$. The corresponding field $V(\psi; z, \bar z)$ is then called a primary field; if [9] only holds for the Möbius transformations [3], the field is called quasiprimary.
Since $L_m$ with $m > 0$ lowers the conformal weight of a state (see [7]), the highest-weight state $\psi$ is necessarily annihilated by all $L_m$ (and $\bar L_m$) with $m > 0$. However, in general the $L_m$ (and $\bar L_m$) with $m < 0$ do not annihilate $\psi$; they generate the descendants of $\psi$ that lie in the same representation. Their conformal transformation property is more complicated, but can be deduced from that of the primary state [9], as well as the commutation relations of the Virasoro algebra.
The Möbius symmetry (whose generators annihilate the vacuum) determines the 1-, 2- and 3-point functions of quasiprimary fields up to numerical constants: the 1-point function vanishes, unless $h = \bar h = 0$, in which case $\langle 0|V(\psi; z, \bar z)|0\rangle = C$, independent of $z$ and $\bar z$. The 2-point function of $\psi_1$ and $\psi_2$ vanishes unless $h_1 = h_2$ and $\bar h_1 = \bar h_2$; if the conformal weights agree, it takes the form
\[ \langle 0| V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) |0\rangle = C\, (z_1 - z_2)^{-2h} (\bar z_1 - \bar z_2)^{-2\bar h} \qquad [10] \]
Finally, the structure of the 3-point function of three quasiprimary fields $\psi_1$, $\psi_2$, and $\psi_3$ is
\[ \langle 0| V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) V(\psi_3; z_3, \bar z_3) |0\rangle = C \prod_{i<j} (z_i - z_j)^{(h_k - h_i - h_j)} (\bar z_i - \bar z_j)^{(\bar h_k - \bar h_i - \bar h_j)} \qquad [11] \]
where for each pair $i < j$, k labels the third field, that is, $k \neq i$ and $k \neq j$. The Möbius symmetry also restricts the higher correlation function of quasiprimary fields: the 4-point function is determined up to an (undetermined) function of the Möbius invariant cross-ratio, and similar statements also hold for n-point functions with $n \geq 5$. The full Virasoro symmetry must then be used to restrict these functions further; however, since the generators $L_n$ with $n \leq -2$ do not annihilate the vacuum $|0\rangle$, the Virasoro symmetry leads to Ward identities that cannot be easily evaluated in general. (In typical examples, the Ward identities give rise to differential equations that must be obeyed by the correlation functions.)

Chiral Fields and Vertex Operator Algebras

The decomposition [8] usually contains a special class of states that transform as the vacuum state with respect to $\bar L_m$; these states are the so-called chiral states. (Similarly, the states that transform as the vacuum state with respect to $L_m$ are the antichiral states.) Given the transformation properties described above, it is not difficult to see that the corresponding chiral fields $V(\psi; z, \bar z)$ only depend on $z$ in any correlation function, that is $V(\psi; z, \bar z) \equiv V(\psi, z)$. (Similarly, the antichiral fields only depend on $\bar z$.) The chiral fields always contain the field corresponding to the state $L_{-2}|0\rangle$, that describes a specific component of the stress–energy tensor.
In conformal field theory, the product of two fields can be expressed again in terms of the fields of the theory. The conformal symmetry restricts the structure of this operator product expansion:
\[ V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) = \sum_i (z_1 - z_2)^{\Delta_i} (\bar z_1 - \bar z_2)^{\bar\Delta_i} \sum_{r,s \geq 0} V(\phi^i_{r,s}; z_2, \bar z_2)\, (z_1 - z_2)^r (\bar z_1 - \bar z_2)^s \qquad [12] \]
where $\Delta_i$ and $\bar\Delta_i$ are real numbers, and $r, s \in \mathbb{N}_0$. (Here i labels the conformal representations that appear in the operator-product expansion, while r and s label the different descendants.) The actual form of this expansion (in particular, representations that appear) can be read off from the correlation functions of the theory since the identity [12] has to hold in all correlation functions.
Given that the chiral fields only depend on $z$ in all correlation functions, it is then clear that the operator-product expansion of two chiral fields again only contains chiral fields. Thus, the subspace of chiral fields closes under the operator-product expansion, and therefore defines a consistent (sub)theory by itself. This subtheory is sometimes referred to as a meromorphic conformal field theory (Goddard 1989). (Obviously, the same also applies to the subtheory of antichiral fields.) The operator-product
expansion defines a product on the space of meromorphic fields. This product involves the complex parameters $z_i$ in a nontrivial way, and therefore does not directly define an algebra structure; it is, however, very similar to an algebra, and is therefore usually called a vertex operator algebra in the mathematical literature. The formal definition involves formal power series calculus and is quite complicated; details can be found in Frenkel, Lepowsky, and Meurman (1988).
By virtue of its definition as an identity that holds in arbitrary correlation functions, the operator-product expansion is associative, that is,
\[ \bigl( V(\psi_1; z_1, \bar z_1) V(\psi_2; z_2, \bar z_2) \bigr) V(\psi_3; z_3, \bar z_3) = V(\psi_1; z_1, \bar z_1) \bigl( V(\psi_2; z_2, \bar z_2) V(\psi_3; z_3, \bar z_3) \bigr) \qquad [13] \]
where the brackets indicate which operator-product expansion is evaluated first. If we consider the case where both $\psi_1$ and $\psi_2$ are meromorphic fields, then the associativity of the operator-product expansion implies that the states in H form a representation of the vertex operator algebra. The same also holds for the vertex operator algebra associated to the antichiral fields. Thus the meromorphic fields encode in a sense the symmetries of the underlying theory: this symmetry always contains the conformal symmetry (since $L_{-2}|0\rangle$ is always a chiral field, and $\bar L_{-2}|0\rangle$ always an antichiral field). In general, however, the symmetry may be larger. In order to take full advantage of this symmetry, it is then useful to decompose the full space of states H not just with respect to the two Virasoro algebras, but rather with respect to the two vertex operator algebras; the structure is again the same as in [8], where, however, each $H_i$ and $\bar H_j$ is now an irreducible representation of the chiral and antichiral vertex operator algebra, respectively.

Rational Theories and Zhu’s Algebra

Of particular interest are the rational conformal field theories that are characterized by the property that the corresponding vertex operator algebras only possess finitely many irreducible representations. (The name ‘‘rational’’ stems from the fact that the conformal weights and the central charge of these theories are rational numbers.) The simplest example of such rational theories are the so-called minimal models, for which the vertex operator algebra describes just the conformal symmetry: these models exist for a certain discrete set of central charges $c < 1$ and were first studied by Belavin, Polyakov, and Zamolodchikov in 1984. (Their paper is contained in the reprint volume of Goddard and Olive (1988).) It was this seminal paper that started many of the modern developments in conformal field theory. Another important class of examples are the Wess–Zumino–Witten (WZW) models that describe the world-sheet theory of strings moving on a compact Lie group. The relevant vertex operator algebra is then generated by the loop group symmetries. There is some evidence that all rational conformal field theories can be obtained from the WZW models by means of two standard constructions, namely by considering cosets and taking orbifolds; thus rational conformal field theory seems to have something of the flavor of (reductive) Lie theory.
Rational theories may be characterized in terms of Zhu’s algebra that can be defined as follows. The chiral fields $V(\psi, z)$ that only depend on $z$ must by themselves define local operators; they can therefore be expanded in a Laurent expansion as
\[ V(\psi, z) = \sum_{n \in \mathbb{Z}} V_n(\psi)\, z^{-n-h} \qquad [14] \]
where h is the conformal weight of the state $\psi$. For example, for the case of the holomorphic component of the stress–energy tensor one finds
\[ T(z) = \sum_{n \in \mathbb{Z}} L_n\, z^{-n-2} \qquad [15] \]
where the $L_n$ are the Virasoro generators. By the state/field correspondence [6], it then follows that
\[ V_n(\psi) |0\rangle = 0 \quad \text{for } n > -h \qquad [16] \]
and that
\[ V_{-h}(\psi) |0\rangle = \psi \qquad [17] \]
(For the example of the above component of the stress–energy tensor, [16] implies that $L_{-1}|0\rangle = L_0|0\rangle = L_n|0\rangle = 0$ for $n \geq 0$; thus the vacuum is in particular $SL(2, C)/Z_2$ invariant. Furthermore, [17] shows that $L_{-2}|0\rangle$ is the state corresponding to this component of the stress–energy tensor.) We denote by $H_0$ the space of states that can be generated by the action of the modes $V_n(\psi)$ from the vacuum $|0\rangle$. On $H_0$ we consider the subspace $O(H_0)$ that is spanned by the states of the form
\[ V^{(N)}(\psi)\, \chi, \quad N > 0 \qquad [18] \]
where $V^{(N)}(\psi)$ is defined by
\[ V^{(N)}(\psi) = \sum_{n=0}^{h} \binom{h}{n} V_{-n-N}(\psi) \qquad [19] \]
and h is the conformal weight of $\psi$. Zhu’s algebra is then the quotient space
\[ A = H_0 / O(H_0) \qquad [20] \]
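The remark that the central charges and conformal weights of rational theories are rational numbers can be illustrated for the unitary minimal models. The following is a minimal sketch that assumes the standard formulas $c = 1 - 6/(m(m+1))$ and $h_{r,s} = ((r(m+1) - sm)^2 - 1)/(4m(m+1))$ with $m \geq 3$, $1 \leq r < m$, $1 \leq s \leq m$; these formulas are not stated in this article, and the function names are hypothetical.

```python
from fractions import Fraction


def central_charge(m):
    """Central charge c = 1 - 6/(m(m+1)) of the m-th unitary minimal model.

    Assumption: the standard parametrization with integer m >= 3.
    """
    return 1 - Fraction(6, m * (m + 1))


def conformal_weights(m):
    """Sorted list of the distinct weights h_{r,s} of the m-th unitary model.

    Assumption: h_{r,s} = ((r(m+1) - s m)^2 - 1) / (4 m (m+1)),
    with 1 <= r < m and 1 <= s <= m.
    """
    weights = set()
    for r in range(1, m):
        for s in range(1, m + 1):
            numerator = (r * (m + 1) - s * m) ** 2 - 1
            weights.add(Fraction(numerator, 4 * m * (m + 1)))
    return sorted(weights)


if __name__ == "__main__":
    for m in (3, 4, 5):
        print(m, central_charge(m), conformal_weights(m))
```

Exact `Fraction` arithmetic is used so that rationality is visible directly in the output; for m = 3 the sketch returns c = 1/2 and the three weights 0, 1/16 and 1/2.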
Zhu’s algebra actually forms an associative algebra, where the algebra structure is defined by
\[ \psi \star \chi = V^{(0)}(\psi)\, \chi \qquad [21] \]
This algebra structure can be identified with the action of the ‘‘zero-mode algebra’’ on an arbitrary highest-weight state.
Zhu’s algebra captures much of the structure of the (chiral) conformal field theory: in particular, it was shown by Zhu in 1996 that the irreducible representations of A are in one-to-one correspondence with the representations of the full vertex operator algebra. A conformal field theory is thus rational (in the above, physicists’, sense) if Zhu’s

As explained above, the correlation function of three primary fields is determined up to an overall constant. One important question is whether or not this constant actually vanishes since this determines the possible ‘‘couplings’’ of the theory. This information is encoded in the so-called fusion rules of the theory. More precisely, the fusion rules $N_{ij}{}^{k} \in \mathbb{N}_0$ determine the multiplicity with which the representation of the vertex operator algebra labeled by k appears in the operator-product expansion of the two representations labeled by i and j.
In 1988, Verlinde found a remarkable relation between the fusion rules of a vertex operator algebra and the modular transformation properties of its characters. To each irreducible representation $H_i$ of a vertex operator algebra, one can define the character
\[ \chi_i(\tau) = \mathrm{tr}_{H_i}\, q^{L_0 - c/24}, \qquad q = e^{2\pi i \tau} \qquad [23] \]
For rational vertex operator algebras (in the mathematical sense) these characters transform under the modular transformation $\tau \mapsto -1/\tau$ as
\[ \chi_i(-1/\tau) = \sum_j S_{ij}\, \chi_j(\tau) \qquad [24] \]
where $S_{ij}$ are constant matrices. Verlinde’s formula then states that, at least for unitary theories,
\[ \sum S_{il}\, S_{jl}\, S \cdots \]

is invariant under the action of SL(2, Z). This is a very powerful constraint on the multiplicity matrices $M_{ij}$ that has been analyzed for various vertex operator algebras. For example, Cappelli, Itzykson, and Zuber have shown that the modular invariant WZW models corresponding to the group SU(2) have an A–D–E classification. The case of SU(3) was solved by Gannon, using the Galois symmetries of these rational conformal field theories.
The condition of modular invariance is relatively easily testable, but it does not, by itself, guarantee that a given space of states H comes from a consistent conformal field theory. In order to construct a consistent conformal field theory, one needs to solve the conformal bootstrap, that is, one has to determine
322 Two-Dimensional Ising Model
all the normalization constants of the correlators so that the resulting set of correlators is local and factorizes appropriately into 3-point correlators (crossing symmetry). This is typically a difficult problem which has only been solved explicitly for rather few theories, for example, the minimal models. Recently, it has been noticed that the conformal bootstrap can be more easily solved for the corresponding boundary conformal field theory. Furthermore, Fuchs, Runkel, and Schweigert have shown that any solution of the boundary problem induces an associated solution for conformal field theory on surfaces without boundary. This construction relies heavily on the relation between 2D conformal field theory and 3D topological field theory (Turaev 1994).

See also: Boundary Conformal Field Theory; Compactification of Superstring Theory; Current Algebra; Knot Theory and Physics; String Field Theory; Superstring Theories; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions.

Further Reading
Di Francesco P, Mathieu P, and Sénéchal D (1997) Conformal Field Theory. New York: Springer.
Frenkel I, Lepowsky J, and Meurman A (1988) Vertex Operator Algebras and the Monster. Boston, MA: Academic Press.
Gaberdiel MR (2000) An introduction to conformal field theory. Reports on Progress in Physics 63: 607 (arXiv:hep-th/9910156).
Gannon T (1999) Monstrous moonshine and the classification of CFT, arXiv:math.QA/9906167.
Gannon T (2006) Moonshine beyond the Monster: The Bridge Connecting Algebra, Modular Forms and Physics (to appear). Cambridge: Cambridge University Press.
Gawedzki K (1999) Lectures on conformal field theory. In: Quantum Fields and Strings: A Course for Mathematicians, vol. 2. Providence, RI: American Mathematical Society.
Ginsparg P (1988) Applied Conformal Field Theory. Lectures Given at the Les Houches Summer School in Theoretical Physics. Elsevier.
Goddard P (1989) Meromorphic conformal field theory. Infinite Dimensional Lie Algebras and Lie Groups: Proceedings of the CIRM Luminy Conference, 1988, 556. Singapore: World Scientific.
Goddard P and Olive DI (1988) Kac–Moody and Virasoro Algebras, A Reprint Volume for Physicists. Singapore: World Scientific.
Kac VG (1998) Vertex Algebras for Beginners. Providence, RI: American Mathematical Society.
Pressley A and Segal GB (1986) Loop Groups. Oxford: Clarendon.
Schweigert C, Fuchs J, and Walcher J (2000) Conformal field theory, boundary conditions and applications to string theory, arXiv:hep-th/0011109.
Turaev VG (1994) Quantum Invariants of Knots and 3-Manifolds. Berlin: de Gruyter.
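As a companion to the Virasoro algebra [7] discussed in the article above, the following sketch checks on a finite range of modes that the central term $(c/12)\, m(m^2 - 1)\, \delta_{m,-n}$ is compatible with the Jacobi identity, and that it drops out for the Möbius generators $L_{-1}$, $L_0$, $L_1$. The representation of algebra elements as dictionaries and all function names are illustrative assumptions, not part of the article.

```python
from fractions import Fraction
from itertools import product

CENTRAL = "c"  # hypothetical label for the central element c


def bracket_basis(m, n):
    """[L_m, L_n] as {basis label: coefficient}, following [7]."""
    out = {}
    if m != n:
        out[m + n] = Fraction(m - n)
    if m + n == 0 and m * (m * m - 1) != 0:
        # coefficient multiplying the central element c
        out[CENTRAL] = Fraction(m * (m * m - 1), 12)
    return out


def bracket(x, y):
    """Bilinear extension of the bracket; c commutes with everything."""
    out = {}
    for (a, xa), (b, yb) in product(x.items(), y.items()):
        if a == CENTRAL or b == CENTRAL:
            continue
        for label, coeff in bracket_basis(a, b).items():
            out[label] = out.get(label, 0) + xa * yb * coeff
    return {label: v for label, v in out.items() if v != 0}


def L(n):
    """The generator L_n as an algebra element."""
    return {n: Fraction(1)}


def jacobi_defect(m, n, p):
    """Cyclic sum [L_m,[L_n,L_p]] + [L_n,[L_p,L_m]] + [L_p,[L_m,L_n]].

    An empty dictionary means that the Jacobi identity holds.
    """
    total = {}
    for a, b, c in ((m, n, p), (n, p, m), (p, m, n)):
        for label, coeff in bracket(L(a), bracket(L(b), L(c))).items():
            total[label] = total.get(label, 0) + coeff
    return {label: v for label, v in total.items() if v != 0}


if __name__ == "__main__":
    modes = range(-6, 7)
    print(all(jacobi_defect(m, n, p) == {} for m in modes for n in modes for p in modes))
    print(bracket(L(2), L(-2)))
```

The check is finite and exact (rational arithmetic), so it is evidence for, not a proof of, the consistency of [7]; the analytic argument is the usual cocycle computation.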
energy per site. This symmetry is generated by the relations
\[ [A_l, A_m] = 4 G_{l-m} \]
\[ [G_l, A_m] = 2A_{m+l} - 2A_{m-l} \qquad [2] \]
\[ [G_l, G_m] = 0 \]
This algebra of Onsager is a subalgebra of what is now called the loop algebra of the Lie algebra $sl_2$ and it is the first infinite-dimensional algebra to be used in physics.
In the 60 years since Onsager first computed the free energy, several other methods of exact solution have been found. In 1949, Kaufman reduced the computation of the free energy to a problem of free fermions. A closely related combinatorial method was invented by Kac and Ward, Hurst and Green, and by Kasteleyn. Baxter (1982) has computed the free energy by means of star triangle equations and functional equations in his book.
The fermionic and the combinatorial methods are powerful enough to compute the correlation functions but are not generalizable to other models. The functional equation methods of Baxter generalize to many other important models but they do not give correlation functions. There are still aspects of Onsager’s method that remain unexplored.
The free energy per site in the thermodynamic limit is defined as
\[ F = -k_B T \lim_{N \to \infty} N^{-1} \ln Z_I(H) \qquad [3] \]
where N is the total number of sites of the lattice and the partition function $Z_I(H)$ is defined as
\[ Z_I(H) = \sum_{\text{all } \sigma = \pm 1} e^{-E(H)/k_B T} \qquad [4] \]
with the sum being over all values $\sigma_{j,k} = \pm 1$ and $k_B$ is Boltzmann’s constant. The result of Onsager is that, at H = 0,
\[ -F/k_B T = \ln 2 + \frac{1}{8\pi^2} \int_0^{2\pi} d\theta_1 \int_0^{2\pi} d\theta_2\, \ln\bigl[ \cosh(2E_h/k_B T) \cosh(2E_v/k_B T) - \sinh(2E_h/k_B T) \cos\theta_1 - \sinh(2E_v/k_B T) \cos\theta_2 \bigr] \qquad [5] \]
This free energy has a singularity at a temperature $T_c$ defined from
\[ \sinh(2E_v/k_B T_c)\, \sinh(2E_h/k_B T_c) = 1 \qquad [6] \]
and near $T_c$ the specific heat diverges as
\[ c \sim -\frac{2}{\pi k_B T^2} \bigl( E_h^2 \sinh^2(2E_v/k_B T_c) + 2E_v E_h + E_v^2 \sinh^2(2E_h/k_B T_c) \bigr) \ln|1 - T/T_c| \qquad [7] \]
The next property to be computed was the spontaneous magnetization, which is usually defined as
\[ M_- = \lim_{H \to 0^+} M(H) \qquad [8] \]
However, because the solution is only available at H = 0, this definition cannot be used and instead $M_-$ is computed from an alternative definition in terms of the spin–spin correlation function
\[ \langle \sigma_{0,0} \sigma_{M,N} \rangle = \frac{1}{Z_I(0)} \sum_{\sigma = \pm 1} \sigma_{0,0}\, \sigma_{M,N}\, e^{-E(0)/k_B T} \qquad [9] \]
as
\[ M_-^2 = \lim_{M^2 + N^2 \to \infty} \langle \sigma_{0,0} \sigma_{M,N} \rangle \qquad [10] \]
The result for $M_-$, first announced by Onsager in 1949, is
\[ M_- = \begin{cases} (1 - k^2)^{1/8} & \text{for } T \leq T_c \\ 0 & \text{for } T > T_c \end{cases} \qquad [11] \]
where
\[ k = \bigl( \sinh(2E_v/k_B T)\, \sinh(2E_h/k_B T) \bigr)^{-1} \qquad [12] \]
A key point in the computation of the magnetization [11] from [9] is that the spin–spin correlation function can be written as a determinant. In fact, there are many such different, but equal, determinantal representations and the size of the smallest one in general is $2(|M| + |N|)$. The simplest case is the diagonal correlation
\[ \langle \sigma_{0,0} \sigma_{N,N} \rangle = \begin{vmatrix} a_0 & a_{-1} & a_{-2} & \cdots & a_{1-N} \\ a_1 & a_0 & a_{-1} & \cdots & a_{2-N} \\ a_2 & a_1 & a_0 & \cdots & a_{3-N} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{N-1} & a_{N-2} & a_{N-3} & \cdots & a_0 \end{vmatrix} \qquad [13] \]
where
\[ a_n = \frac{1}{2\pi} \int_0^{2\pi} d\theta\, e^{-in\theta} \left[ \frac{1 - k e^{i\theta}}{1 - k e^{-i\theta}} \right]^{1/2} \qquad [14] \]
Determinants of the form [13], where the elements on each diagonal are equal, are called Toeplitz.
The study of the spin–spin correlations of the Ising model provides a microscopic picture of the behavior of the ferromagnet near the phase transition temperature $T_c$, and an entire branch of mathematics has developed from the study of the behavior of Toeplitz determinants when the size is large. The first such mathematical advance was the discovery by Szegö of a general formula for the limit as $N \to \infty$, from which the magnetization [11] is computed.
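The statement that the large-N limit of the Toeplitz determinant [13] reproduces the magnetization [11] can be checked numerically. The sketch below is illustrative only: it assumes that the symbol in [14] is $[(1 - k e^{i\theta})/(1 - k e^{-i\theta})]^{1/2}$ with $0 \leq k < 1$ (that is, $T < T_c$), evaluates the Fourier coefficients with the trapezoidal rule, and computes the determinant by Gaussian elimination. All function names and the sample count are hypothetical choices.

```python
import cmath
import math


def toeplitz_symbol(theta, k):
    """[(1 - k e^{i theta}) / (1 - k e^{-i theta})]^{1/2} for 0 <= k < 1."""
    w = 1 - k * cmath.exp(1j * theta)
    # Numerator and denominator are complex conjugates with positive real
    # part, so the principal square root of their ratio is w / |w|.
    return w / abs(w)


def fourier_coefficient(n, k, samples=512):
    """a_n of [14], approximated by the trapezoidal rule on a uniform grid."""
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        total += cmath.exp(-1j * n * theta) * toeplitz_symbol(theta, k)
    return (total / samples).real  # the imaginary part vanishes by symmetry


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det


def diagonal_correlation(size, k):
    """The size x size Toeplitz determinant [13], with entries a_{i-j}."""
    coeffs = {n: fourier_coefficient(n, k) for n in range(1 - size, size)}
    return determinant([[coeffs[i - j] for j in range(size)] for i in range(size)])


if __name__ == "__main__":
    k = 0.5
    for size in (1, 2, 4, 8, 12):
        print(size, diagonal_correlation(size, k), (1 - k * k) ** 0.25)
```

For fixed $k < 1$ the determinant approaches $(1 - k^2)^{1/4}$, the square of [11], very quickly as the size grows, which is the behavior that Szegö's limit formula predicts for this symbol.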
By comparing [16] with [17] and [18], we see that at $T = T_c$ the correlations decay algebraically but for $T \neq T_c$ the decay is exponential. It is useful to write the exponential in [17] for $T < T_c$ as
\[ k^{N} = e^{-N/\xi_-} \quad \text{with} \quad \xi_-^{-1} = -\ln k \qquad [19] \]
and in [18] for $T > T_c$ as
\[ k^{-N} = e^{-N/\xi_+} \quad \text{with} \quad \xi_+^{-1} = \ln k \qquad [20] \]
The quantity $\xi_\pm$ is called the correlation length and as $T \to T_c$ the correlation length diverges as
\[ \xi_\pm \sim |1 - k|^{-1} = \text{const} \cdot |T - T_c|^{-1} \qquad [21] \]
A more profound property of the correlations is that they satisfy differential and difference equations. It was found by Jimbo and Miwa (1980) that the diagonal correlation function satisfies the nonlinear differential equation related to the sixth Painlevé function
\[ \left( t(t-1) \frac{d^2\sigma}{dt^2} \right)^2 = N^2 \left[ (t-1) \frac{d\sigma}{dt} - \sigma \right]^2 - 4 \frac{d\sigma}{dt} \left[ (t-1) \frac{d\sigma}{dt} - \sigma - \frac{1}{4} \right] \left[ t \frac{d\sigma}{dt} - \sigma \right] \qquad [22] \]
where for $T < T_c$ we set $t = k^2$ and
\[ \sigma_N(t) = t(t-1) \frac{d}{dt} \ln \langle \sigma_{0,0} \sigma_{N,N} \rangle - \frac{1}{4} \qquad [23] \]
and for $T > T_c$ we set $t = k^{-2}$ and

and considering the joint (scaling limit) where
\[ N \to \infty \ \text{and} \ T \to T_c \ \text{with } r \text{ fixed} \qquad [26] \]
We define the scaled correlation function as
\[ G_\pm(r) = \lim_{\text{scaling}} M_\pm^{-2} \langle \sigma_{0,0} \sigma_{N,N} \rangle \qquad [27] \]
where the subscript $\pm$ means that the limit is taken from $T > T_c$ or $T < T_c$, respectively, $M_-$ is the spontaneous magnetization [11] and
\[ M_+ = (k^2 - 1)^{1/8} \qquad [28] \]
This concept of the scaling limit and scaling function is very general and can be defined for any system with a critical point that has an order parameter like $M_-$ that vanishes at $T_c$ and a correlation length that diverges at $T_c$. However, the Ising model has the further remarkable property discovered by Wu et al. (1976) that the scaled correlation function may be explicitly expressed in terms of a function which satisfies an ordinary nonlinear differential equation. Specifically,
\[ G_\pm(r) = \frac{1}{2} \bigl[ 1 \mp \eta(r/2) \bigr]\, \eta(r/2)^{-1/2} \exp \int_{r/2}^{\infty} dr'\, \frac{1}{4}\, r'\, \eta^{-2} \bigl[ (1 - \eta^2)^2 - (\eta')^2 \bigr] \qquad [29] \]
where the function $\eta(r)$ satisfies the Painlevé III equation
\[ \eta'' = \frac{1}{\eta} (\eta')^2 - \frac{1}{r} \eta' + \eta^3 - \eta^{-1} \qquad [30] \]
with the boundary condition that
\[ \eta(r) \sim 1 - 2\lambda K_0(2r) \quad \text{as } r \to \infty \qquad [31] \]
where $K_0(r)$ is the modified Bessel function of the third kind and
\[ \lambda = 1/\pi \qquad [32] \]
The leading behavior of $G_\pm(r)$ for $r \to \infty$ is
\[ G_+(r) \sim \lambda K_0(r) \qquad [33] \]
\[ G_-(r) \sim 1 + \lambda^2 \bigl\{ r^2 [K_1^2(r) - K_0^2(r)] - r K_0(r) K_1(r) + \tfrac{1}{2} K_0^2(r) \bigr\} \qquad [34] \]
where $K_n(z)$ is the modified Bessel function of the third kind. When $\lambda$ is given by [32] these $r \to \infty$ limits of $G_\pm(r)$ agree with the behavior of $\langle \sigma_{0,0} \sigma_{N,N} \rangle$ for $N \gg 1$ and $|T - T_c|$ small with $N|T - T_c| \gg 1$ which is obtained from [18] and [17]. The behavior of $G_\pm(r)$ for $r \to 0$ with the value of $\lambda$ given by [32] is
\[ G_\pm(r) = \text{const} \cdot r^{-1/4} \qquad [35] \]
where the constant agrees with that computed from the result [16] for $\langle \sigma_{0,0} \sigma_{N,N} \rangle$ at $T = T_c$ for $N \gg 1$. For other values of the boundary condition constant $\lambda$, the scaling function $G_\pm(r)$ diverges with a power which differs from 1/4. The computation of the constant in [35] requires the evaluation of a nontrivial integral involving the Painlevé III function.
The agreement of the limits $r \to \infty$ and $r \to 0$ of the function $G_\pm(r)$ with the lattice results near $T_c$ means that this scaling function uniformly interpolates between $T \neq T_c$ and $T = T_c$ and that the lattice size (defined here as unity) and the self-generated correlation length are the only two length scales in the theory. This feature that the system generates only one new length scale near $T_c$ is referred to as one length scale scaling.
The susceptibility may be studied by using the determinantal expression for the correlation function. The simplest result is obtained (for the isotropic case, $E_v = E_h$) by using the scaling form [27] to find for $T \sim T_c$ that
\[ k_B T \chi(T) \sim M_\pm^2\, \xi_\pm^2\, 2\pi \int_0^{\infty} dr\, r \{ G_\pm(r) - \epsilon_\pm \} \qquad [38] \]
where $\epsilon_+ = 0$ and $\epsilon_- = 1$, and thus $\chi(T)$ diverges at $T \to T_c$ as
\[ \chi(T) \sim C_\pm |T - T_c|^{-7/4} \qquad [39] \]
where $C_\pm$ are transcendental constants given as integrals over the scaling function $G_\pm(r)$, which were first evaluated by Barouch et al. in 1973 as
\[ C_- = 0.0255369719 \ldots, \qquad C_+ = 0.9625817322 \ldots \qquad [40] \]

Critical-Exponent Phenomenology

From the behavior for the Ising model of the specific heat, magnetization, susceptibility, correlation length, and the correlation at $T_c$ given above we abstract for general systems the phenomenological critical-exponent parametrization for $T \to T_c$ of
\[ c \sim A_c |T - T_c|^{-\alpha} \qquad [41] \]
\[ M \sim A_M |T_c - T|^{\beta} \qquad [42] \]
\[ \chi \sim A_\chi |T - T_c|^{-\gamma} \qquad [43] \]
\[ \xi \sim A_\xi |T - T_c|^{-\nu} \qquad [44] \]
and at $T = T_c$ for $R \to \infty$
\[ \langle \sigma_0 \sigma_R \rangle \sim A_\sigma / R^{d-2+\eta} \quad \text{where } d \text{ is the dimension} \qquad [45] \]
The exponents $\alpha$, $\gamma$, $\nu$ above and below $T_c$ are usually
found to be equal, and the exponent is usually called the
anomalous dimension. If it is assumed that the scaling
Susceptibility function [27] exists and that one length scale scaling holds
The final quantity of macroscopic thermodynamic then the exponents are related by what are called scaling
interest is the magnetic susceptibility laws, such as
@MðHÞ 2
¼ ðd 2 þ Þ ½46
ðTÞ ¼ ½36
@H H¼0
þ 2
¼ 2 ½47
which is expressed in terms of the spin–spin
correlation function as d ¼ 2 ½48
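As a consistency check, the two-dimensional Ising exponents can be substituted into the scaling laws. The sketch below assumes the standard forms 2β = ν(d − 2 + η) for [46], γ + 2β = 2 − α for [47], and dν = 2 − α for [48]. The exponent values are read off from the results quoted above: β = 1/8 from the magnetization, γ = 7/4 from [39], ν = 1 from [21], and η = 1/4 from [35] with d = 2. The value α = 0 is the conventional assignment for a logarithmic specific-heat singularity and is an assumption here. The function name is hypothetical.

```python
from fractions import Fraction

def scaling_law_residuals(alpha, beta, gamma, nu, eta, d):
    """Left side minus right side for the three scaling laws (assumed standard forms)."""
    return {
        "[46] 2*beta = nu*(d - 2 + eta)": 2 * beta - nu * (d - 2 + eta),
        "[47] gamma + 2*beta = 2 - alpha": gamma + 2 * beta - (2 - alpha),
        "[48] d*nu = 2 - alpha": d * nu - (2 - alpha),
    }

# Two-dimensional Ising exponents, kept as exact rationals.
ising_2d = dict(
    alpha=Fraction(0),
    beta=Fraction(1, 8),
    gamma=Fraction(7, 4),
    nu=Fraction(1),
    eta=Fraction(1, 4),
    d=2,
)

residuals = scaling_law_residuals(**ising_2d)
for law, value in residuals.items():
    print(law, "residual:", value)
```

Exact rational arithmetic is used so that a vanishing residual is an identity rather than a floating-point coincidence.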
Fuchsian Equations and Natural correct, there will be physical effects which are not
Boundaries for Susceptibility incorporated in the phenomenological scaling theory
of critical phenomena.
This critical phenomenology, however, has not
taken into account the fact that the susceptibility
is a much more complicated function than either
the spontaneous magnetization [11] or the free Impure Ising Models
energy [5], which have only isolated singularities at The Ising model may also be studied when the
k2 = 1, and that there is more structure to the interaction energies at sites j, k are not chosen to be
susceptibility than the singularity of [39]. independent of position but are allowed to vary
For arbitrary T, the susceptibility was shown by from site to site. When these interactions are chosen
Wu, McCoy, Tracy, and Barouch to be expressible randomly out of some probability distribution, this
in the form is a model of a ferromagnet with frozen (quenched)
X
ðTÞ ¼ M2 ~ðjÞ ðTÞ
½49 impurities. All real systems will be impure to some
j extent, so the study of such dirty systems is of great
practical importance.
where in the sum j is odd (even) for T above (below) The special case where the interactions are transla-
Tc . The quantities ~(j) (T) are explicitly given as tionally invariant in the horizontal direction but are
j-fold integrals of algebraic functions and thus will allowed to vary in a layered fashion from row to row
satisfy linear differential equations with polynomial was introduced by McCoy and Wu in 1968 and
coefficients. Such functions can have only isolated found to be dramatically different from the pure Ising
singularities. The function ~(1) (T) is elementary and model described above. In particular, what is a
has a double pole at Tc and ~(2) (T) is given in terms critical temperature Tc in the pure case is now spread
of complete elliptic integrals. Quite recently, out into a region bounded by the temperatures the
remarkable Fuchsian linear differential equations pure model would be critical if all the bonds took on
for ~(3) (T) and ~(4) (T) of seventh and tenth orders, the minimum or maximum value allowed by the
respectively, have been obtained by Zenine, Bouk- probability distribution. In this new region, the
raa, Hassani, and Maillard for the isotropic lattice. correlations (in the direction of translational invar-
Furthermore, it was shown by Orrick et al. (2001) iance) are found to decay as a power law which
that ~(j) has singularities in the complex T plane at depends on the temperature; the specific heat is never
coshð2Ev =kTÞ coshð2Eh =kTÞ infinite but the susceptibility is infinite in an entire
temperature region that includes the temperature at
sinhð2Ev =kTÞ cosð2=jÞ which the spontaneous magnetization first appears as
sinhð2Eh =kTÞ cosð2m0 =jÞ ¼ 0 ½50 T is lowered. The existence of this new region for
Ising models with a general randomness in two and
with m, m0 = 1, 2, . . . , j. The form of the singularity
three dimensions has been demonstrated by Griffiths.
~(j) (T) for T > Tc is as
in
More recently, this effect has been reinterpreted in
2
ðj 3Þ=2
ln
½51 terms of impurities in quantum spin chains.
ðj
2
3Þ=2
½52 Quantum Field Theory
where
measures the deviation from the singular The Ising model of [1] may be reinterpreted as a two-
point [50]. These singularities become dense as dimensional lattice gauge theory of the gauge field
j ! 1 and, therefore, the singularity at T = Tc is
sjþ1=2;k ¼ 1
not isolated and instead the critical point is
embedded in a natural boundary. Such a function on the vertical link between ðj; kÞ and ðj þ 1; kÞ
cannot satisfy a linear differential equation of finite sj;kþ1=2 ¼ 1
order with polynomial coefficients.
The existence of the natural boundary in the on the horizontal link between ðj; kÞ
susceptibility is a new phenomenon which is not and ðj; k þ 1Þ ½53
seen in either the free energy or magnetization and
leads to the speculation that in the presence of a and a ‘‘Higgs’’ field
magnetic field the one length scale scaling property
of the model at H = 0 may fail. If this proves to be j:k ¼ 1 on the site ðj; kÞ ½54
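The accumulation of singularities described by [50] can be made concrete in the isotropic case. The sketch below is an illustration under stated assumptions, not the computation of Orrick et al. It assumes E_v = E_h = E and writes s = sinh(2E/kT), so that cosh² = 1 + s² turns [50] into the quadratic s² − c s + 1 = 0 with c = cos(2πm/j) + cos(2πm'/j). The function name is hypothetical.

```python
import cmath
import math

def isotropic_singularities(j):
    """Distinct roots s = sinh(2E/kT) of s^2 - c*s + 1 = 0 for m, m' = 1..j,
    where c = cos(2*pi*m/j) + cos(2*pi*m'/j) (isotropic form of [50], assumed)."""
    seen = set()
    for m in range(1, j + 1):
        for m_prime in range(1, j + 1):
            c = math.cos(2 * math.pi * m / j) + math.cos(2 * math.pi * m_prime / j)
            disc = cmath.sqrt(c * c - 4.0)
            for s in ((c + disc) / 2.0, (c - disc) / 2.0):
                # Round so that numerically identical roots are counted once.
                seen.add((round(s.real, 12), round(s.imag, 12)))
    return [complex(re, im) for re, im in sorted(seen)]

for j in (3, 5, 9):
    roots = isotropic_singularities(j)
    max_dev = max(abs(abs(s) - 1.0) for s in roots)
    print(f"j = {j}: {len(roots)} distinct roots, max | |s| - 1 | = {max_dev:.1e}")
```

Since c is real with |c| ≤ 2, the two roots are complex conjugates with product 1, so every root lies on the circle |s| = 1. The choice m = m' = j gives the double root s = 1, the physical critical point, and the number of distinct points on the circle grows with j, which is the sense in which the singularities become dense as j → ∞.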
with the action has poles of the form Al =(k2 þ m2l ), where ml is the
X mass of the lth particle. If we note that the Fourier
Sg ¼ Eg sjþ1=2;k sjþ1;kþ1=2 sjþ1=2;kþ1 sj;kþ1=2
transform of K0 (r) is
j;k
X Z
Eh ðj;k sjþ1=2;k jþ1;k þ j;k sj;kþ1=2 j;kþ1 Þ ½55 2
d2 reikr K0 ðrÞ ¼ 2 ½65
j;k k þ1
If we define we see that the Fourier transform of [62] is the sum of
an infinite number of poles. This is to be compared
zg;h ¼ tanh Eg;h =kB T ½56
with the Fourier transform of the scaled correlation
the partition function of the gauge theory is expressed function G (r) at H = 0 and T < Tc [34], which does
in terms of the Ising model partition function as not contain any poles at all and may instead be
interpreted as having a two-particle cut. This phe-
Zg ¼ ½8 coshðEg =kB TÞ cosh2 nomenon of a cut at h = 0 breaking up into an infinite
N number of poles for h > 0 is a signal that at h = 0 the
ðEh =kB TÞz1=2
g zh ZI ðHÞ ½57
theory has free unconfined two-particle states which
where we make the identification become weakly confined by a linear confining
potential for h > 0. This confinement is thought to
H=kB T ¼ 12 ln zg and E=kB T ¼ 12 ln zh ½58 be a characteristic of most gauge theories.
This identification may be extended to correlation See also: Eight Vertex and Hard Hexagon Models;
functions. Of particular interest for the gauge theory is Holonomic Quantum Fields; Painlevé Equations;
the plaquette–plaquette correlation < P0, 0 Pj, l > , where Percolation Theory; Phase Transitions in Continuous
Systems; Statistical Mechanics and Combinatorial
Pj;k ¼ sjþ1=2;k sjþ1;kþ1=2 sjþ1=2;kþ1 sj;kþ1=2 ½59 Problems; Toeplitz Determinants and Statistical
Mechanics; Yang–Baxter Equations.
which is expressed in terms of the Ising correlations
at H 6¼ 0 as
< P0;0 Pj;k > < P0;0 >2 Further Reading
¼ sinh2 ð2H=kB TÞð< 0;0 j;k > < 0;0 >2 Þ ½60 Barouch E, McCoy BM, and Wu TT (1973) Zero-field suscept-
ibility of the two dimensional Ising model near Tc . Physical
To study this correlation further, we need to study Review Letters 31: 1409–1411.
the correlations of the Ising model in nonzero Baxter RJ (1982) Exactly Solved Models in Statistical Mechanics.
London: Academic Press.
magnetic field. This has been done by McCoy and Griffiths RB (1969) Nonanalytic behavior above the critical point in
Wu in the scaling limit H ! 0, T ! Tc with a random Ising ferromagnet. Physical Review Letters 23: 17–19.
Jimbo M and Miwa T (1980) Studies on holonomic quantum
H fields. XVII. Proceedings of the Japanese Academy 56A: 405;
h¼ fixed ½61
jT Tc j15=8 (1981) 57A: 347.
Kasteleyn PW (1963) Dimer statistics and phase transitions.
for T < Tc , where it is found that the scaling Journal of Mathematical Physics 4: 287–293.
McCoy BM (1969) Incompleteness of the critical exponent
function G(r, h) for small h and large r if
description for ferromagnetic systems containing random
X h i impurities. Physical Review Letters 23: 383–388.
Gðr; hÞ ahK0 ð2 þ h2=3 l Þr McCoy BM (1995) The connection between statistical mechanics
l and quantum field theory. In: Bazhanov VV and Burden CJ
X 2=3
½62
1=2 1=2 2r
r e haerh l (eds.) Statistical Mechanics and Field Theory, pp. 26–128.
Singapore: World Scientific.
l
McCoy BM and Wu TT (1968) Theory of a two dimensional Ising
where l are the solutions of model with random impurities. Physical Review 176: 631–643.
McCoy BM and Wu TT (1973) The Two Dimensional Ising
J1=3 13 3=2 þ J1=3 13 3=2 ¼ 0 ½63 Model. Cambridge: Harvard University Press.
McCoy BM and Wu TT (1978) Two dimensional Ising field
with Jn (z) the Bessel function of order n and K0 (z) theory in a magnetic field: Breakup of the cut in the 2-point
the modified Bessel function of the third kind. function. Physical Review D 18: 1259–1267.
A field theory is said to possess a particle spectrum McCoy BM, Perk JHH, and Wu TT (1981) Ising field theory:
quadratic difference equations for the n-point Green’s func-
if the Fourier transform of the two-point function tions on the lattice. Physical Review Letters 46: 757.
Z Montroll EW, Potts RB, and Ward JC (1963) Correlations and
Gðk; hÞ ¼ d2 reikr Gðr; hÞ ½64 spontaneous magnetization of the two dimensional Ising
model. Journal of Mathematical Physics 4: 308–322.
Onsager L (1944) Crystal statistics I. A two dimensional model with an order disorder transition. Physical Review 65: 117–149.
Orrick WP, Nickel BG, Guttmann AJ, and Perk JHH (2001) Critical behavior of the two dimensional Ising susceptibility. Physical Review Letters 86: 4120–4123.
Wu TT, McCoy BM, Tracy CA, and Barouch E (1976) Spin–spin correlation functions for the two dimensional Ising model, exact theory in the scaling region. Physical Review B 13: 315–374.
Zenine N, Boukraa S, Hassani S, and Maillard JM (2004) The Fuchsian differential equation of the square lattice Ising model χ(3) susceptibility. Journal of Physics A 37: 9651–9668.
Zenine N, Boukraa S, Hassani S, and Maillard JM (2005) Ising model susceptibility: Fuchsian differential equation for χ(4) and its factorization properties. Journal of Physics A 38: 4149–4173.
Two-Dimensional Models
B Schroer, Freie Universität Berlin, Berlin, Germany
© 2006 Elsevier Ltd. All rights reserved.

History and Motivation

Local quantum physics of systems with infinitely many interacting degrees of freedom leads to situations whose understanding often requires new physical intuition and mathematical concepts beyond that acquired in quantum mechanics and perturbative constructions in quantum field theory. In this situation, two-dimensional soluble models turned out to play an important role. On the one hand, they illustrate new concepts and sometimes remove misconceptions in an area where new physical intuition is still in the process of being formed. On the other hand, rigorously soluble models confirm that the underlying physical postulates are mathematically consistent, a task which for interacting systems with infinite degrees of freedom is mostly beyond the capability of pedestrian methods or brute-force application of hard analysis on models whose natural invariances have been mutilated by a cutoff.

In order to underline these points and motivate the interest in two-dimensional QFT, let us briefly look at the history, in particular at the physical significance of the three oldest two-dimensional models of relevance for statistical mechanics and relativistic particle physics, in chronological order: the Lenz–Ising (L–I) model, Jordan’s model of bosonization/fermionization, and the Schwinger model (QED$_2$). (A more detailed account of the changeful history concerning their correct physical interpretation and generalizations to higher dimensions of these models and the increasing conceptual role of low-dimensional models in QFT can be found in Schroer (2005).)

The L–I model was proposed in 1920 by Wilhelm Lenz (see Lenz (1925)) as the simplest discrete statistical mechanics model with a chance to go beyond the P. Weiss phenomenological ansatz involving long-range forces and instead explain ferromagnetism in terms of nonmagnetic short-range interactions. Its one-dimensional version was solved four years later by his student Ernst Ising. Its changeful history reached a temporary conceptual climax when Onsager succeeded in rigorously establishing a second-order phase transition in two dimensions.

Another conceptually rich model which lay dormant for almost two decades as a result of a misleading speculative higher-dimensional generalization by its protagonist is the bosonization/fermionization model first proposed by Jordan (1937). This model establishes a certain equivalence between massless two-dimensional fermions and bosons and is related to Thirring’s massless 4-fermion coupling model and also to Luttinger’s one-dimensional model of an electron gas (Schroer). One reason why even nowadays hardly anybody knows Jordan’s contribution is certainly the ambitious but unfortunate title ‘‘the neutrino theory of light’’ under which he published a series of papers.

Both discoveries demonstrate the usefulness of having controllable low-dimensional models; at the same time, their complicated history also illustrates the danger of rushing to premature ‘‘intuitive’’ conclusions about extensions to higher dimensions.

A review of the early historical benchmarks of conceptual progress through the study of solvable two-dimensional models would be incomplete without mentioning Schwinger’s (1962) proposed solution of two-dimensional quantum electrodynamics, afterwards referred to as the Schwinger model. He used this model in order to argue that gauge theories are not necessarily tied to zero-mass vector particles. Some work was necessary (Schroer) to unravel its physical content, with the result that the would-be charge of that QED$_2$ model was ‘‘screened’’ and its apparent chiral symmetry broken; in other words, the model exists only in the so-called Schwinger–Higgs phase with massive free scalar particles accounting for its physical content. Another closely related aspect of
this model which also arose in the Lagrangian setting of four-dimensional gauge theories was that of the $\theta$-angle parametrizing an ambiguity in the quantization.

A coherent and systematic attempt at a mathematical control of two-dimensional models came in the wake of Wightman’s first rigorous programmatic formulation of QFT (Schroer 2005). This formulation stayed close to the physical ideas underlying the impressive success of renormalized QED perturbation theory, although it avoided the direct use of Lagrangian quantization. The early attempts towards a ‘‘constructive QFT’’ found their successful realization in two-dimensional QFT (the $P(\varphi)_2$ models (Glimm and Jaffe 1987)); the restriction to low dimensions is related to the mild short-distance singularity behavior (super-renormalizability) which these methods require. We will focus our main attention on alternative constructive methods which, even though not suffering from such short-distance restrictions, also suffer from a lack of mathematical control in higher spacetime dimensions; the illustration of the constructive power of these new methods comes presently from massless d = 1+1 conformal and chiral QFT as well as from massive factorizing models.

There are several books and review articles (Furlan et al. 1989, Ginsparg 1990, Di Francesco et al. 1996) on d = 1+1 conformal as well as on massive factorizing models (Abdalla et al. 1991). To the extent that concepts and mathematical structures are used which permit no extension to higher dimensions (Kac–Moody algebras, loop groups, integrability, presence of an infinite number of conservation laws), this line of approach will not be followed in this article since our primary interest will be the use of two-dimensional models of QFT as ‘‘theoretical laboratories’’ of general QFT. Our aim is twofold: on the one hand, we intend to illustrate known principles of general QFT in a mathematically controllable context and, on the other hand, we want to identify new concepts whose adaptation to QFT in d = 1+1 leads to their solvability (Schroer).

General Concepts and Their Two-Dimensional Manifestation

The general framework of QFT, to which the rich world of controllable two-dimensional models contributes as an important testing ground, exists in two quite different but nevertheless closely related formulations: the 1956 approach in terms of pointlike covariant fields due to Wightman (see Streater and Wightman (1964)) (see Axiomatic Quantum Field Theory), and the more algebraic setting which can be traced back to ideas which Haag (1992) developed shortly after and which are based on spacetime-indexed operator algebras and related concepts which developed over a long period of time, with contributions of many other authors to what is now referred to as algebraic QFT (AQFT) or simply local quantum physics (LQP). Whereas the Wightman approach aims directly at the (not necessarily observable) quantum fields, the operator-algebraic setting (see Algebraic Approach to Quantum Field Theory) is more ambitious. It starts from physically well-motivated assumptions about the algebraic structure of local observables and aims at the reconstruction of the full field theory (including the operators carrying the superselected charges) in the spirit of a local representation theory of (the assumed structure of the) local observables. This has the advantage that the somewhat mysterious concept of an inner symmetry (as opposed to outer (spacetime) symmetry) can be traced back to its physical root, which is the representation-theoretical structure of the local observable algebra (see Symmetries in Quantum Field Theory of Lower Spacetime Dimensions). In the standard Lagrangian quantization approach, the inner symmetry is part of the input (multiplicity indices of field components on which subgroups of U(n) or O(n) act linearly) and hence it is not possible to problematize this fundamental question. When in low-dimensional spacetime dimensions the sharp separation (the Coleman–Mandula theorems) of inner versus outer symmetry becomes blurred as a result of the appearance of braid group statistics, the standard Lagrangian quantization setting of most of the textbooks is inappropriate and even the Wightman framework has to be extended. In that case, the algebraic approach is the most appropriate.

The important physical principles which are shared between the Wightman approach (see Streater and Wightman (1964)) and the operator algebra (AQFT) setting (Haag 1992) are the spacelike locality or Einstein causality (in terms of pointlike fields or algebras localized in causally disjoint regions) and the existence of positive-energy representations of the Poincaré group implementing covariance and the stability of matter. In the algebraic approach, the observable content of the theory is encoded into a family of (weakly closed) operator algebras $\{\mathcal{A}(\mathcal{O})\}_{\mathcal{O}\in\mathcal{K}}$ indexed by a family of convex causally closed spacetime regions $\mathcal{O}$ (with $\mathcal{O}'$ denoting the spacelike complement and $\mathcal{A}'$ the von Neumann commutant) which act in one common Hilbert space. Covariant local fields lose their distinguished role
which they have in the classical setting and which (via Lagrangian quantization) was at least partially inherited by the Wightman approach and, apart from their role as local generators of symmetries (conserved currents), became mere ‘‘field coordinatizations’’ of local algebras. (There is a denumerable set of such pointlike field generators which form a local equivalence (Borchers) class of fields and in the absence of interactions permits a neat description in terms of Wick-ordered free-field polynomials (Haag 1992).) Certain properties cannot be naturally formulated in the pointlike field setting (e.g., Haag duality for convex regions, $\mathcal{A}(\mathcal{O}') = \mathcal{A}(\mathcal{O})'$), but apart from those properties the two formulations are quite close; in particular for two-dimensional theories there are convincing arguments that one can pass between the two without imposing additional technical requirements. (Haag duality holds for observable algebras in the vacuum sector in the sense that any violation can be explained in terms of a spontaneously broken symmetry; in local theories, it can always be enforced by dualization and the resulting Haag dual algebra has a charge superselection structure associated with the unbroken subgroup.)

Haag duality is the statement that the commutant of observables not only contains the algebra of the causal complement, that is, $\mathcal{A}(\mathcal{O}') \subset \mathcal{A}(\mathcal{O})'$ (Einstein causality), but is even exhausted by it; it is deeply connected to the measurement process and its violation in the vacuum sector for convex causally complete regions signals spontaneous symmetry breaking in the associated charge-carrying field algebra (Haag 1992). It can always be enforced (assuming that the wedge-localized algebras fulfill [1] below) by a symmetry-reducing extension called Haag dualization. Its violation for multilocal regions reveals the charge content of the model via charge–anticharge splitting in the neutral observable algebra (Schroer).

Another physically important property which has a natural algebraic formulation is the split property: for regions $\mathcal{O}_i$ separated by a finite spacelike distance, one finds $\mathcal{A}(\mathcal{O}_1 \cup \mathcal{O}_2) \simeq \mathcal{A}(\mathcal{O}_1) \otimes \mathcal{A}(\mathcal{O}_2)$, which can be derived from the Buchholz–Wichmann ‘‘nuclearity property’’ (Haag 1992) (an appropriate adaptation of the ‘‘finiteness of phase-space cell’’ property of QM to QFT). Related to the Haag duality is the local version of the ‘‘time slice property’’ (the QFT counterpart of the classical causal dependency property), sometimes referred to as ‘‘strong Einstein causality’’, $\mathcal{A}(\mathcal{O}'') = \mathcal{A}(\mathcal{O})''$.

One of the most astonishing achievements of the algebraic approach (which justifies its emphasis on properties of ‘‘local observables’’) is the DHR theory of superselection sectors (Doplicher et al. 1971), that is, the realization that the structure of charged (nonvacuum) representations (with the superposition principle being valid only within one representation) and the spacetime properties of the generating fields which are the carriers of these generalized charges (including their spacelike commutation relations which lead to the particle statistics and also to their internal symmetry properties) are already encoded in the structure of the Einstein causal observable algebra (see Symmetries in Quantum Field Theory: Algebraic Aspects). The intuitive basis of this remarkable result (whose prerequisite is locality) is that one can generate charged sectors by spatially separating charges in the vacuum (neutral) sector and disposing of the unwanted charges at spatial infinity (Haag 1992).

An important concept which especially in d = 1+1 has considerable constructive clout is ‘‘modular localization.’’ It is a consequence of the above algebraic setting if either the net of algebras has pointlike field generators, or if the one-particle masses are separated by spectral gaps so that the formalism of time-dependent scattering can be applied (Schroer 2005); in conformal theories, this property holds automatically in all spacetime dimensions. It rests on the basic observation (see Tomita–Takesaki Modular Theory) that a standard pair $(\mathcal{A}, \Omega)$ of a von Neumann operator algebra and a standard vector (standardness means that the operator algebra of the pair $(\mathcal{A}, \Omega)$ acts in a cyclic and separating manner on the vector $\Omega$) gives rise to a Tomita operator $S$ through its star-operation, whose polar decomposition yields two modular objects: a one-parameter subgroup $\Delta^{it}$ of the unitary group of operators in Hilbert space whose Ad-action defines the modular automorphism of $(\mathcal{A}, \Omega)$, whereas the angular part $J$ is the modular conjugation which maps $\mathcal{A}$ into its commutant $\mathcal{A}'$:

$$S A\Omega = A^{*}\Omega, \quad S = J\Delta^{1/2}$$
$$J_W = U(j_W) = S_{\mathrm{scat}} J_0, \quad \Delta_W^{it} = U(\Lambda_W(2\pi t)) \qquad [1]$$
$$\sigma_W(t) := \mathrm{Ad}\,\Delta_W^{it}$$

The standardness assumption is always satisfied for any field-theoretic pair $(\mathcal{A}(\mathcal{O}), \Omega)$ of an $\mathcal{O}$-localized algebra and the vacuum state (as long as $\mathcal{O}$ has a nontrivial causal disjoint $\mathcal{O}'$), but it is only for the wedge region $W$ that the modular objects have a physical interpretation in terms of the global symmetry group of the vacuum as specified in the second line of [1]; the modular unitary $\Delta_W^{it}$ represents the $W$-associated boost $\Lambda_W$ and the modular conjugation $J_W$ implements the TCP-like reflection along the edge of the wedge (Bisognano
and Wichmann 1975). The third line is the defini- turns out to be a bona fide quantum field in a larger
tion of the modular group. The importance of this Hilbert space (which extends the Fock space
theory for local quantum physics results from the generated from applying currents to the vacuum).
fact that it leads to the concept of modular The power in front is determined by the requirement
localization, an intrinsic new scenario for field- that all Wightman functions (computed with the
theoretic constructions which is different from the help of free-field Wick combinatorics) stay finite in
Lagrangian quantization schemes (Schroer 2005). this massless limit; the necessary and sufficient
A special feature of d = 1 þ 1 Minkowski spacetime condition for this is the charge conservation rule
is the disconnectedness of the right/left spacelike region * +
leading to a right–left ordering structure. So in addition Y
i i ðxÞ
:e :
to the Lorentz-invariant timelike ordering x y (x
i
earlier than y, which is independent of spacetime 8
> Y 1
ð1=2Þ i j
½4
dimensions), there is an invariant spacelike ordering >
< ; i ¼ 0
x < y (x to the left of y) in d = 1 þ 1 which opens the ¼ i<j
ð
þij Þ" ð
ij Þ"
possibility of more general Lorentz-invariant spacelike >
>
:
commutation relations than those implemented by 0; otherwise
Bose/Fermi fields (Rehren and Schroer 1987) of fields
where the resulting correlation function has been
with a spacelike braid group commutation structure.
factored in terms of light-ray coordinates
The appearance of such exotic statistics fields is not
ij = xi xj , x = t x, and the "-prescription
compatible with their Fourier transforms being crea-
stands for taking the standard Wightman bound-
tion/annihilation operators for Wigner particles;
ary value t ! t þ i", lim" ! 0 which insures the
rather, the state vectors which they generate from the
positive-energy condition. The finiteness of the
vacuum contain in addition to the one-particle
limit insures that the resulting zero-mass limiting
contribution a vacuum polarization cloud (Schroer
theory is a bona fide quantum field theory that is,
2005). This close connection between new kinematic
its system of Wightman functions permits the
possibilities and interactions is one of the reasons why,
construction of an operator theory in a Hilbert
(different from higher dimensions where interactions
space with a distinguished vacuum vector.
are prescribed by the recipe of local couplings of free
The factorization into light-ray components [4]
fields) low-dimensional QFT offers a more intrinsic
shows that the exponential charge-carrying opera-
access to the central issue of interactions.
tors inherit this factorization into two independent
chiral components : exp i (x) : = exp i þ (xþ ):
: exp i (x ):, each one being covariant under
Boson/Fermion Equivalence and
scaling
!
if one assigns the scaling dimension
Superselection Theory in a Special Model d = 2 =2 to the chiral exponential field and d = 1 to
The simplest and oldest but conceptually still rich the current. As any Wightman field, this is a singular
model is obtained, as first proposed by Jordan object which only after smearing with Schwartz test
(1937), by using a two-dimensional massless Dirac functions yields an (unbounded) operator. But the
current and showing that it may be expressed in above form of the correlation function belongs to a
terms of scalar canonical Bose creation/annihilation class of distributions which admits a much larger
operators test-function space consisting of smooth functions
which instead of decreasing rapidly only need to be
j ¼: :¼ @ ; bounded so that they stay finite on the compactified
Z þ1
dp light-ray line R_ = S1 . To make this visible, one uses
:¼ feipx a ðpÞ þ h:c:g ½2
1 2jpj the Cayley transform (now x denotes either xþ
or x )
Although the potential (x) of the current as a result
of its infrared divergence is not a field in the 1 þ ix
standard sense of an operator-valued distribution z¼ 2 S1 ½5
1 ix
in the Fock space of the a(p)# (It becomes an
operator after smearing with test functions whose This transforms the Schwartz test function into a space
Fourier transform vanishes at p = 0), the formal of test functions on S1 which have an infinite order
exponential defined as the zero-mass limit of a well- zero at z = 1 (corresponding to x = 1) but the
defined exponential free massive field rotational transformed fields j(z), : exp i (z): permit
2
the smearing with all smooth functions on S1 , a
: ei ðxÞ :¼ lim m =2
: ei m ðxÞ : ½3 characteristic feature of all conformal invariant
m!0
theories as the present one turns out to be. There is an additional advantage in the use of this compactification. Fourier transforming the circular current actually allows for a quantum-mechanical zero mode whose possible nonzero eigenvalues indicate the presence of additional charge sectors beyond the charge-zero vacuum sector. For the exponential field, this leads to a quantum-mechanical pre-exponential factor which automatically ensures the charge selection rules so that unrestricted (by charge conservation) Wick contraction rules can be applied. In this approach, the original chiral Dirac fermion $\psi(x)$ (from which the current was formed as the $:\bar\psi\psi:$ composite) reappears as a charge-carrying exponential field for $\alpha = 1$ and thus illustrates the meaning of bosonization/fermionization. (It is interesting to note that Jordan's (1937) original treatment of fermionization had such a pre-exponential quantum-mechanical factor.) Naturally, this terminology has to be taken with a grain of salt in view of the fact that the bosonic current algebra only generates a superselected subspace into which the charge-carrying exponential field does not fit. Only in the case of massive two-dimensional QFT can fermions be incorporated into a Fock space of bosons (see last section). At this point, it should however be clear to the reader that the physical content of Jordan's paper had nothing to do with its misleading title "neutrino theory of light" but rather was an early illustration of charge superselection rules in two-dimensional QFT.

A systematic and rigorous approach consists in solving the problem of positive-energy representation theory for the Weyl algebra on the circle (which is the rigorous operator-algebraic formulation of the abelian current algebra). (The Weyl algebra originated in quantum mechanics around 1927; its use in QFT only appeared after the cited Jordan paper. By representation we mean here a regular representation in which the exponentials can be differentiated in order to obtain (unbounded) smeared current operators.) It is the operator algebra generated by the exponential of a smeared chiral current (always with real test functions) with the following relation between the generators

$$ W(f) = e^{ij(f)}, \qquad j(f) = \oint \frac{dz}{2\pi i}\, j(z) f(z), \qquad [j(z), j(z')] = -\delta'(z - z') $$ [6a]

$$ W(f)W(g) = e^{-\frac{1}{2}s(f,g)}\, W(f+g), \qquad W^{*}(f) = W(-f) $$ [6b]

$$ \mathcal{A}(S^{1}) = \mathrm{alg}\{W(f),\ f \in C^{\infty}(S^{1})\}, \qquad \mathcal{A}(I) = \mathrm{alg}\{W(f),\ \mathrm{supp}\, f \subset I\} $$ [6c]

where

$$ s(f, g) = \oint \frac{dz}{2\pi i}\, f'(z) g(z) $$

is the symplectic form which characterizes the Weyl algebra structure and [6c] denotes the unique $C^{*}$-algebra generated by the unitary objects $W(f)$. A particular representation of this algebra is given by assigning the vacuum state to the generators, $\langle W(f)\rangle_{0} = e^{-\frac{1}{2}\|f\|_{0}^{2}}$, $\|f\|_{0}^{2} = \sum_{n\ge 1} n |f_{n}|^{2}$. Starting with the vacuum Hilbert space representation $\mathcal{A}(S^{1})_{0} = \pi_{0}(\mathcal{A}(S^{1}))$, one easily checks that the formula

$$ \langle W(f)\rangle_{\alpha} := e^{i\alpha f_{0}} \langle W(f)\rangle_{0} $$ [7a]

$$ \pi_{\alpha}(W(f)) = e^{i\alpha f_{0}}\, \pi_{0}(W(f)) $$ [7b]

defines a state with positive energy, that is, one whose GNS representation for $\alpha \neq 0$ is unitarily inequivalent to the vacuum representation. Its incorporation into the vacuum Hilbert space [7b] is part of the DHR formalism. It is convenient to view this change as the result of an application of an automorphism $\gamma_{\alpha}$ on the $C^{*}$-Weyl algebra $\mathcal{A}(S^{1})$ which is implemented by a unitary charge-generating operator $\Gamma_{\alpha}$ in a larger (nonseparable) Hilbert space which contains all charge sectors $H_{\alpha} = \Gamma_{\alpha} H_{0}$, $H_{0} \equiv H_{vac} = \overline{\mathcal{A}(S^{1})\Omega}$:

$$ \langle W(f)\rangle_{\alpha} = \langle \gamma_{\alpha}(W(f))\rangle_{0}, \qquad \gamma_{\alpha}(W(f)) = \Gamma_{\alpha}^{*}\, W(f)\, \Gamma_{\alpha} $$ [8]

$\Gamma_{\alpha}\Omega = \Omega_{\alpha}$ describes a state with a rotationally homogeneous charge distribution; arbitrary charge distributions $\rho_{\alpha}$ of total charge $\alpha$, that is, $\oint (dz/2\pi i)\, \rho_{\alpha} = \alpha$, are obtained in the form

$$ \psi_{\zeta}^{\rho_{\alpha}} = \eta(\rho_{\alpha})\, W(\hat{\rho}_{\alpha})\, \Gamma_{\alpha} $$ [9]

where $\eta(\rho_{\alpha})$ is a numerical phase factor and the net effect of the Weyl operator is to change the rotationally homogeneous charge distribution into $\rho_{\alpha}$. The necessary charge-neutral compensating function $\hat{\rho}_{\alpha}$ in the Weyl cocycle $W(\hat{\rho}_{\alpha})$ is uniquely determined in terms of $\rho_{\alpha}$ up to the choice of one point $\zeta \in S^{1}$ (the determining equation involves the $\ln z$ function which needs the specification of a branch cut (Schroer 2005)). From this formula, one derives the commutation relations

$$ \psi_{\zeta}^{\rho_{\alpha}}\, \psi_{\zeta}^{\rho_{\beta}} = e^{\pm i\pi\alpha\beta}\, \psi_{\zeta}^{\rho_{\beta}}\, \psi_{\zeta}^{\rho_{\alpha}} $$

for spacelike separations of the supports; hence, these fields are relatively local (bosonic) for $\alpha\beta \in 2\mathbb{Z}$. In particular, if only one
type of charge is present, the generating charge is $\alpha_{gen} = \sqrt{2N}$ and the composite charges are multiples, that is, $\alpha_{gen}\mathbb{Z}$. This locality condition providing bosonic commutation relations does not yet ensure the $\zeta$-independence. Since the equation which controls the $\zeta$-change turns out to be

$$ \psi_{\zeta_{1}}^{\rho_{\alpha}} \bigl(\psi_{\zeta_{2}}^{\rho_{\alpha}}\bigr)^{*} = e^{i\pi\alpha^{2}}\, e^{2\pi i Q\alpha} $$ [10]

one achieves $\zeta$-independence by restricting the Hilbert space charges to be "dual" to that of the operators, that is,

$$ Q = \frac{1}{\sqrt{2N}}\, \mathbb{Z} $$

The localized operators acting on the restricted separable Hilbert space $H_{res}$ generate a $\zeta$-independent extended observable algebra $\mathcal{A}_{N}(S^{1})$ (Schroer) and it is not difficult to see that its representation in $H_{res}$ is reducible and that it decomposes into $2N$ charge sectors

$$ \frac{1}{\sqrt{2N}}\, n, \qquad n = 0, 1, \ldots, 2N-1 $$

Hence, the process of extension has led to a charge quantization with a finite ("rational") number of charges relative to the new observable algebra which is neutral in the new charge counting

$$ \frac{1}{\alpha_{gen}}\mathbb{Z} \Big/ \alpha_{gen}\mathbb{Z} = \mathbb{Z}/\alpha_{gen}^{2}\mathbb{Z} = \mathbb{Z}_{2N} $$

The charge-carrying fields in the new setting are also of the above form [9], but now the generating field carries the charge

$$ \oint \frac{dz}{2\pi i}\, \rho_{gen} = Q_{gen} $$

which is a $(1/2N)$ fraction of the old $\alpha_{gen}$. Their commutation relations for disjoint charge supports are "braidal" (or better "plektonic" which is more on par with being bosonic/fermionic). (In the abelian case like the present, the terminology "anyonic" enjoys widespread popularity, but in the present context the "any" does not go well with charge quantization.) These objects considered as operators localized on $S^{1}$ do depend on the cut $\zeta$, but using an appropriate finite covering of $S^{1}$ this dependence is removed (Schroer 2005). So the field algebra $\mathcal{F}_{\mathbb{Z}_{2N}}$ generated by the charge-carrying fields (as opposed to the bosonic observable algebra $\mathcal{A}_{N}$) has its unique localization structure on a finite covering of $S^{1}$. An equivalent description which gets rid of $\zeta$ consists in dealing with operator-valued sections on $S^{1}$. The extension $\mathcal{A} \to \mathcal{A}_{N}$, which renders the Hilbert space separable and quantizes the charges, seems to be characteristic for abelian current algebra; in all other models which have been constructed up to now the number of sectors is at least denumerable and in the more interesting ones even finite (rational models).

An extension is called maximal if there exists no further extension which maintains the bosonic commutation relation. For the case at hand, this would require the presence of another generating field of the same kind as above, which belongs to an integer $N'$ and is relatively local to the first one. This is only possible if $N$ is divisible by a square.

In passing, it is interesting to mention a somewhat unexpected relation between the Schwinger model, whose charges are screened, and the Jordan model. Since the Lagrangian formulation of the Schwinger model is a gauge theory, the analog of the four-dimensional "asymptotic freedom" wisdom would suggest the possibility of "charge liberation" in the short-distance limit of this model. This seems to contradict the statement that the intrinsic content of the Schwinger model (QED$_{2}$ with massless fermions), after removing a classical degree of freedom, is the QFT of a free massive Bose field, and such a simple free field is at first sight not expected to contain subtle information about asymptotic charge liberation. (In its original gauge-theoretical form, the Schwinger model has an infinite vacuum degeneracy. The removal of this degeneracy (restoration of the cluster property) with the help of the "$\theta$-angle formalism" leaves a massive free Bose field (the Schwinger–Higgs mechanism). As expected in $d = 1 + 1$ the model only possesses this phase.) As we have seen above, however, the massless limit really does have liberated charges and the short-distance limit of the massive free field is the massless model (Schroer).

As a result of the peculiar bosonization/fermionization aspect of the zero-mass limit of the derivative of the massive free field, Jordan's model is also closely related to the massless Thirring model (and the related Luttinger model for an interacting one-dimensional electron gas) whose massive version is in the class of factorizing models (see later section). (Another structural consequence of this aspect leads to Coleman's theorem (Schroer 2005) which connects the Mermin–Wagner no-go theorem for two-dimensional spontaneous continuous symmetry breaking with these zero-mass peculiarities.) The Thirring model is a special case in a vast class of "generalized" multicoupling multicomponent Thirring models, that is, models with 4-fermion interactions. Under this name they were studied in the early 1970s (Schroer) with the aim to identify massless subtheories for which the currents form chiral current algebras.
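The Weyl relations [6b] and the vacuum state on the generators introduced earlier can be made concrete for test functions with finitely many Fourier modes. The following Python sketch is an illustration only and rests on stated assumptions: test functions are taken to be finite Laurent polynomials $f(z) = \sum_n f_n z^n$ on the unit circle, so that the contour integral in the symplectic form reduces to the residue sum $s(f,g) = \sum_n n f_n g_{-n}$; the helper names (`real_modes`, `add`, `symplectic_form`, `vacuum_norm_sq`, `vacuum_expectation`) are hypothetical and do not come from the article. The sketch checks that $s$ is antisymmetric, bilinear, and purely imaginary on real test functions, which is what makes the factor in [6b] a pure phase.

```python
import math

def real_modes(positive_modes):
    """Fourier modes {n: f_n} of a real function on |z| = 1.

    Only coefficients with n > 0 are given; reality forces f_{-n} = conj(f_n).
    The zero mode is omitted (it enters neither s nor the vacuum norm).
    """
    modes = {}
    for n, coeff in positive_modes.items():
        modes[n] = complex(coeff)
        modes[-n] = complex(coeff).conjugate()
    return modes

def add(f, g):
    """Mode-wise sum f + g."""
    return {n: f.get(n, 0) + g.get(n, 0) for n in set(f) | set(g)}

def symplectic_form(f, g):
    """s(f, g) = contour integral dz/(2 pi i) f'(z) g(z) = sum_n n f_n g_{-n}."""
    return sum(n * fn * g.get(-n, 0) for n, fn in f.items())

def vacuum_norm_sq(f):
    """||f||_0^2 = sum over n >= 1 of n |f_n|^2."""
    return sum(n * abs(fn) ** 2 for n, fn in f.items() if n >= 1)

def vacuum_expectation(f):
    """<W(f)>_0 = exp(-||f||_0^2 / 2)."""
    return math.exp(-0.5 * vacuum_norm_sq(f))

# Three sample real test functions (arbitrary illustrative coefficients).
f = real_modes({1: 1 + 2j, 2: 0.5})
g = real_modes({1: -1j, 3: 2.0})
h = real_modes({2: 1j})

s_fg = symplectic_form(f, g)
print("s(f,g) =", s_fg, " s(g,f) =", symplectic_form(g, f))
print("<W(f)>_0 =", vacuum_expectation(f))
```

Because $s$ is bilinear, the phases $e^{-s(f,g)/2}$ automatically satisfy the consistency condition needed for the product in [6b] to be associative; the sign convention of $s$ does not affect any of these checks.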
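The charge quantization obtained from the local extension by a generating charge $\alpha_{gen} = \sqrt{2N}$ can be enumerated directly. The sketch below is a minimal illustration under assumptions that go beyond the text: sectors are labelled by the integer $n$ in $Q_n = n/\sqrt{2N}$, fusion is taken to be addition of charges modulo $\alpha_{gen}$ (that is, $n$ modulo $2N$), and the exchange phase of a charge-$Q$ field with itself is assumed to be $e^{i\pi Q^{2}}$, the abelian form suggested by the commutation relations of the charge-carrying fields. The function names are hypothetical.

```python
import cmath
import math

def charge_sectors(N):
    """Charges Q_n = n / sqrt(2N), n = 0, ..., 2N-1, labelling the 2N sectors."""
    alpha_gen = math.sqrt(2 * N)
    return [n / alpha_gen for n in range(2 * N)]

def fuse(n1, n2, N):
    """Charges add; sectors are defined modulo alpha_gen, i.e. n modulo 2N."""
    return (n1 + n2) % (2 * N)

def exchange_phase(n, N):
    """Assumed abelian exchange phase exp(i*pi*Q^2) for Q = n / sqrt(2N)."""
    return cmath.exp(1j * math.pi * n * n / (2 * N))

N = 2  # generating charge alpha_gen = 2
for n, q in enumerate(charge_sectors(N)):
    print(n, round(q, 4), exchange_phase(n, N))
```

The phase depends only on $n$ modulo $2N$, so it is a property of the sector. For $N = 2$ the sector $n = 2$ has $Q = 1$ and phase $-1$, matching the remark that the Dirac fermion reappears as the exponential field with charge 1.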
The counterpart of the potential of the conserved Dirac current in the massive Thirring model is the sine-Gordon field, that is, a composite field which in the attractive regime of the Thirring coupling again obeys the so-called sine-Gordon equation of motion. Coleman gave a supportive argument (Schroer 2005) but some fine points about the range of its validity in terms of the coupling strength remained open. (It was noticed that the current potential of the free massive Dirac fermion ($g = 0$) does not obey the sine-Gordon equation (Schroer 2005).) A rigorous confirmation of these facts was recently given in the bootstrap form-factor setting (Schroer 2005). Massive models which have a continuous or discrete internal symmetry have "disorder" fields which implement a "half-space" symmetry on the charge-carrying field (acting as the identity in the other half-axis) and together with the basic pointlike field form composites which have exotic commutation relations (see the last section).

The Conformal Setting, Structural Results

Chiral theories play a special role within the setting of conformal quantum fields. General conformal theories have observable algebras which live on compactified Minkowski space ($S^{1}$ in the case of chiral models) and fulfill the Huygens principle, which in an even number of spacetime dimensions means that the commutator is only nonvanishing for lightlike separation of the fields. The fact that this classically expected behavior breaks down for nonobservable conformal fields (e.g., the massless Thirring field) was noticed at the beginning of the 1970s and considered paradoxical at that time ("reverberation" in the timelike (Huygens) region). Its resolution around 1974–75 confirmed that such fields are genuine conformal covariant objects but that some fine points about their causality needed to be addressed. The upshot was the proposal of two different but basically equivalent concepts about globally causal fields. They are connected by the following global decomposition formula:

$$ A(x_{cov}) = \sum_{\alpha,\beta} A_{\alpha,\beta}(x), \qquad A_{\alpha,\beta}(x) = P_{\alpha} A(x) P_{\beta}, \qquad Z = \sum_{\alpha} e^{i d_{\alpha}} P_{\alpha} $$ [11]

On the left-hand side, the spacetime point of the field is a point on the universal covering of the conformal compactified Minkowski space. These are fields (Lüscher and Mack 1975) (Schroer 2005) which "live" in the sense of quantum (modular) localization on the universal covering spacetime (or on a finite covering, depending on the "rationality" of the model) and fulfill the global causality condition previously discovered by I Segal (Schroer 2005). They are generally highly reducible with respect to the center of the covering group. The family of fields on the right-hand side, on the other hand, are fields which were introduced (Schroer and Swieca 1974; Schroer et al. 1975) with the aim to have objects which live on the projection $x(x_{cov})$, that is, on the spacetime of the physics laboratory instead of the "hells and heavens" of the covering (Schroer 2005). They are operator-distributional valued sections in the compactification of ordinary Minkowski spacetime. The connection is given by the above decomposition formula into irreducible conformal blocks with respect to the center $\mathbf{Z}$ of the noncompact covering group $\widetilde{SO(2, n)}$, where $\alpha, \beta$ are labels for the eigenspaces of the generating unitary $Z$ of the abelian center $\mathbf{Z}$. The decomposition [11] is minimal in the sense that in general there will be a refinement due to the presence of additional charge superselection rules (and internal group symmetries). The component fields are not Wightman fields since they annihilate the vacuum if the right-hand projection differs from $P_{0} = P_{vac}$.

Note that the Huygens (timelike) region in Minkowski spacetime has a timelike ordering structure $x \prec y$ or $x \succ y$ (earlier or later). In $d = 1 + 1$, the topology allows in addition a spacelike left–right ordering $x \lessgtr y$. In fact, it is precisely the presence of these two orderings in conjunction with the factorization of the vacuum symmetry group $\widetilde{SO(2, 2)} \simeq \widetilde{PSL(2, \mathbb{R})}_{l} \times \widetilde{PSL(2, \mathbb{R})}_{r}$, in particular $\mathbf{Z} = \mathbf{Z}_{l} \times \mathbf{Z}_{r}$, which is at the root of a significant simplification. This situation suggested a tensor factorization into chiral components and led to an extremely rich and successful construction program of two-dimensional conformal QFT as a two-step process: the classification of chiral observable algebras on the light ray and the amalgamation of left–right chiral theories to two-dimensional local conformal QFT. The action on the circular coordinates $z$ is through fractional $SU(1, 1)$ transformations

$$ g(z) = \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}} $$

whereas the covering group acts on the Mack–Lüscher covering coordinates.

The presence of an ordering structure permits the appearance of more general commutation relations for the above $A_{\alpha,\beta}$ component fields, namely

$$ A_{\alpha,\beta}(x)\, B_{\beta,\gamma}(y) = \sum_{\beta'} R^{(\alpha,\gamma)}_{\beta,\beta'}\, B_{\alpha,\beta'}(y)\, A_{\beta',\gamma}(x), \qquad x > y $$ [12]
with numerical $R$-coefficients which, as a result of associativity and relative commutativity with respect to observable fields, have to obey certain structure relations; in this way, Artin braid relations emerge as a new manifestation of the Einstein causality principle for observables in low-dimensional QFT (Rehren and Schroer 1989) (see Schroer 2005). Indeed, the DHR method to interpret charged fields as charge superselection carriers (tied by local representation theory to the bosonic local structure of observable algebras) leads precisely to such a plektonic statistics structure (Fredenhagen et al. 1992, Gabbiani and Froehlich 1993) for systems in low spacetime dimension (see Symmetries in Quantum Field Theory of Lower Spacetime Dimensions). With an appropriately formulated adjustment to observables fulfilling the Huygens commutativity, this plektonic structure (but now disconnected from particle/field statistics) is also a possible manifestation of causality for the higher-dimensional timelike structure (Schroer 2005).

The only examples known up to the appearance of the seminal BPZ work (Belavin et al. 1984) were the abelian current models of the previous section which furnish a rather poor man's illustration of the richness of the decomposition theory. The floodgates of conformal QFT were only opened after the BPZ discovery of "minimal models," which was preceded by the observation (Friedan et al. 1984) that the algebra of the stress–energy tensor came with a new representation structure which was not compatible with an underlying internal group symmetry (see Symmetries in Quantum Field Theory: Algebraic Aspects).

An important step in the structural study of chiral models was the recognition that the energy–momentum tensor has the commutation structure of a Lie field (Schroer 2005); in the next section, its algebraic structure and its representation theory will be presented.

Chiral Fields and Two-Dimensional Conformal Models

Let us start with a family which generalizes the abelian model of the previous section. Instead of a one-component abelian current we now take $n$ independent copies. The resulting multicomponent Weyl algebra has the previous form except that the current is $n$-component and the real function space underlying the Weyl algebra consists of functions with values in an $n$-component real vector space, $f \in LV$, with the standard Euclidean inner product denoted by $( , )$. The local extension now leads to $(\alpha, \beta) \in 2\mathbb{Z}$, that is, an even-integer lattice $L$ in $V$, whereas the restricted Hilbert subspace $H_{L^{*}}$ which ensures $\zeta$-independence is associated with the dual lattice $L^{*}$: $(\alpha_{i}, \lambda_{k}) = \delta_{ik}$, which contains $L$. The resulting superselection structure (i.e., the $Q$-spectrum) corresponds to the finite factor group $L^{*}/L$. For self-dual lattices $L^{*} = L$ (which can only occur if $\dim V$ is a multiple of 8), the resulting observable algebra has only the vacuum sector; the most famous case is the Leech lattice $\Lambda_{24}$ in $\dim V = 24$, also called the "moonshine" model. The observation that the root lattices of the Lie algebras of types A, D, or E (e.g., $su(n)$ corresponding to $A_{n-1}$) also appear among the even-integral lattices suggests that the nonabelian current algebras associated to those Lie algebras can also be implemented. This turns out to be indeed true as far as the level-1 representations are concerned, which brings us to the second family: the nonabelian current algebras of level $k$ associated to those Lie algebras; they are characterized by the commutation relation

$$ [J_{\alpha}(z), J_{\beta}(z')] = i f_{\alpha\beta}^{\gamma}\, j_{\gamma}(z)\, \delta(z - z') - \tfrac{1}{2} k\, g_{\alpha\beta}\, \delta'(z - z') $$ [13]

where $f_{\alpha\beta}^{\gamma}$ are the structure constants of the underlying Lie algebra, $g$ their Cartan–Killing form, and $k$, the level of the algebra, must be an integer in order that the current algebra can be globalized to a loop group algebra. The Fourier decomposition of the current leads to the so-called affine Lie algebras, a special family of Kac–Moody algebras. For $k = 1$, these currents can be constructed as bilinears in terms of the multicomponent chiral Dirac field; there exists also the mentioned possibility to obtain them by constructing their maximal Cartan currents within the above abelian setting and representing the remaining nondiagonal currents as certain charge-carrying ("vertex" algebra) operators. Level-$k$ algebras can be constructed from reducing tensor products of $k$ level-1 currents or directly via the representation theory of infinite-dimensional affine Lie algebras. (The global exponentiated algebras (the analogs to the Weyl algebra) are called loop group algebras.) Either way one finds that, for example, the $SU(2)$ current algebra of level $k$ has (together with the vacuum sector) $k + 1$ sectors (inequivalent representations). The different sectors are already distinguished by the structure of their ground states of the conformal Hamiltonian $L_{0}$. Although the computation of higher point correlation functions for $k > 1$ is difficult, there is no problem in securing the existence of the algebraic nets which define these chiral models as well as their $k + 1$
representation sectors and to identify their generating charge-carrying fields (primary fields) including their $R$-matrices appearing in their plektonic commutation relations. It is customary to use the notation $SU(2)_{k}$ for the abstract operator algebras associated with the current generators [13] and we will denote their $k + 1$ equivalence classes of representations by $\mathcal{A}_{SU(2)_{k},\,n}$, $n = 0, \ldots, k$, whereas representations of current algebras for higher rank groups require a more complicated labeling (in terms of Weyl chambers).

The third family of models are the so-called minimal models which are associated with the Lie-field commutation structure of the chiral stress–energy tensor which results from the chiral decomposition of a conformally covariant two-dimensional stress–energy tensor

$$ [T(z), T(z')] = i\,(T(z) + T(z'))\, \delta'(z - z') + \frac{ic}{24\pi}\, \delta'''(z - z') $$ [14]

whose Fourier decomposition yields the Witt–Virasoro algebra, that is, a central extension of the Lie algebra of $\mathrm{Diff}(S^{1})$. (The presence of the central term in the context of QFT (the analog of the Schwinger term) was noticed later; however, the terminology Witt–Virasoro algebra in the physics literature came to mean the Lie algebra of diffeomorphisms of the circle including the central extension.) The first two coefficients are determined by the physical role of $T(z)$ as the generating field density for the Lie algebra of the Poincaré group and by the connection with the generation of the Moebius transformations, whereas the undetermined parameter $c > 0$ (the central extension parameter, positive by positivity of the two-point function) is easily identified with the strength of the two-point function. Although the structure of the $T$-correlation functions resembles that of free fields (in the sense that there is an algebraically computable unique set of correlation functions once one has specified the two-point function), the realization that $c$ is subject to a discrete quantization if $c < 1$ came as a surprise. As already mentioned, the observation that the superselection sectors (the positive-energy representation structure) of this algebra did not at all follow the logic of a representation theory of an inner symmetry group generated a lot of attention and stimulated a flurry of publications on symmetry concepts beyond groups (quantum groups). A concept of fundamental importance is the DHR theory of localized endomorphisms of operator algebras and the concept of operator-algebraic inclusions (in particular, inclusions with conditional expectations, that is, V Jones inclusions).

The $SU(2)_{k}$ current coset construction (Goddard et al. 1985) revealed that the proof of existence and the actual construction of the minimal models is related to that of the $SU(2)_{k}$ current algebras. Constructing a chiral model does not necessarily mean the explicit determination of the $n$-point Wightman functions of their generating fields (which for most chiral models remains a prohibitively complicated task) but rather a proof of their existence by demonstrating that these models are obtained from free fields by a series of computationally complicated but mathematically controlled operator-algebraic steps such as reduction of tensor products, formation of orbifolds under group actions, coset constructions, and a special kind of extensions. The generating fields of the models are nontrivial in the sense of not obeying free-field equations (i.e., not being "on-shell"). The cases where one can write down explicit $n$-point functions of generating fields are very rare; in the case of the minimal family this is limited to the field theory of the Ising model (Schroer 2005).

To show the power of inclusion theory for the determination of the charge content of a theory, let us look at a simple illustration in the context of the above multicomponent abelian current algebra. The vacuum representation of the corresponding Weyl algebra is generated from smooth $V$-valued functions on the circle modulo constant functions (i.e., functions with vanishing total integral), $f \in LV_{0}$. These functions equipped with the aforementioned complex structure and scalar product yield a Hilbert space. The $I$-localized subalgebra is generated by the Weyl image of $I$-supported functions (class functions whose representing functions are constant in the complement $I'$)

$$ \mathcal{A}(I) := \mathrm{alg}\{W(f) \mid f \in K(I)\}, \qquad K(I) = \{f \in LV_{0} \mid f = \mathrm{const.\ in\ } I'\} $$ [15]

The one-interval Haag duality $\mathcal{A}(I)' = \mathcal{A}(I')$ (the commutant algebra equals the algebra localized in the complement) is simply a consequence of the fact that the symplectic complement $K(I)'$ in terms of $\mathrm{Im}(f, g)$ consists of real functions in that space which are localized in the complement, that is, $K(I)' = K(I')$. The answer to the same question for a double interval $I = I_{1} \cup I_{3}$ (think of the first and third quadrant on the circle) does not lead to duality but rather to a genuine inclusion

$$ K((I_{1} \cup I_{3})') = K(I_{2} \cup I_{4}) \subset K(I_{1} \cup I_{3})', \qquad K(I_{1} \cup I_{3}) \subset K((I_{1} \cup I_{3})')' $$ [16]
The meaning of the left-hand side is clear; these are functions which are constant in $I_{1} \cup I_{3}$ with the same constant in the two intervals whereas the functions on the right-hand side are less restrictive in that the constants can be different. The conversion of real subspaces into von Neumann algebras by the Weyl functor leads to the algebraic inclusion $\mathcal{A}(I_{1} \cup I_{3}) \subset \mathcal{A}((I_{1} \cup I_{3})')'$. In physical terms, the enlargement results from the fact that within the charge neutral vacuum algebra a charge split with one charge in $I_{1}$ and the compensating charge in $I_{2}$ for all values of the (unquantized) charge occurs. A more realistic picture is obtained if one allows a charge split to be subjected to a charge quantization implemented by a lattice condition $f(I_{2}) - f(I_{4}) \in 2\pi L$ which relates the two multicomponent constant functions (where $f(I)$ denotes the constant value $f$ takes in $I$). As in the previous one-component case, the choice of even lattices corresponds to the local (bosonic) extensions. Although imposing such a lattice structure destroys the linearity of the $K$, the functions still define Weyl operators which generate operator algebras $\mathcal{A}_{L}(I_{1} \cup I_{2})$. (The linearity structure is recovered on the level of the operator algebra.) But now the inclusion involves the dual lattice $L^{*}$ (which of course contains the original lattice),

$$ \mathcal{A}_{L}(I_{1} \cup I_{2}) \subset \mathcal{A}_{L^{*}}(I_{1} \cup I_{2}) $$
$$ \mathrm{ind}\{\mathcal{A}_{L}(I_{1} \cup I_{2}) \subset \mathcal{A}_{L}((I_{1} \cup I_{2})')'\} = |G| $$
$$ \mathcal{A}_{L}(I_{1} \cup I_{2}) = \mathrm{inv}_{G}\, \mathcal{A}_{L^{*}}(I_{1} \cup I_{2}) $$

This time the possible charge splits correspond to the factor group $G = L^{*}/L$, that is, the number of possibilities is $|G|$ which measures the relative size of the bigger algebra in terms of the smaller. This is a special case of the general concept of the so-called Jones index of an inclusion which is a numerical measure of its depth. A prerequisite is that the inclusion permits a conditional expectation which is a generalization of the averaging under the "gauge group" $G$ on $\mathcal{A}_{L^{*}}(I_{1} \cup I_{2})$ in the third equation above, which identifies the invariant smaller algebra with the fix-point algebra (the invariant part) under the action of $G$. In fact, using the conceptual framework of Jones, one can show that the two-interval inclusion is independent of the position of the disjoint intervals and is characterized by the group $G$.

There exists another form of this inclusion which is more suitable for generalizations. One starts from the charge quantized extended local algebra $\mathcal{A}_{L}^{ext} \supset \mathcal{A}$ described earlier in terms of an even-integer lattice $L$ (which lives in the separable Hilbert space $H_{L^{*}}$) as our observable algebra. Again the Haag duality is violated and converted into an inclusion $\mathcal{A}_{L}^{ext}(I_{1} \cup I_{2}) \subset \mathcal{A}_{L}^{ext}((I_{1} \cup I_{2})')'$ which turns out to have the same $G = L^{*}/L$ charge structure (it is in fact isomorphic to the previous inclusion). In the general setting (current algebras, minimal model algebras, ...), this double interval inclusion is particularly interesting if the associated Jones index is finite. One finds (Kawahigashi et al. 2001) (Schroer 2005):

Theorem 1 A chiral theory with finite Jones index $\mu = \mathrm{ind}\{\mathcal{A}((I_{1} \cup I_{2})')' : \mathcal{A}(I_{1} \cup I_{2})\}$ for the double interval inclusion (always assuming that $\mathcal{A}(S^{1})$ is strongly additive and split) is a rational theory and the statistical dimensions $d_{\rho}$ of its charge sectors are related to $\mu$ through the formula

$$ \mu = \sum_{\rho} d_{\rho}^{2} $$ [17]

Instead of presenting more constructed chiral models, it may be more informative to mention some of the algebraic methods by which they are constructed and explored. The already mentioned DHR theory provides the conceptual basis for converting the notion of positive-energy representation sectors of the chiral model observable algebras $\mathcal{A}$ (equivalence classes of unitary representations) into localized endomorphisms $\rho$ of this algebra. This is an important step because contrary to group representations which have a natural tensor product composition structure, representations of operator algebras generally do not come with a natural composition structure. The DHR endomorphism theory of $\mathcal{A}$ leads to fusion laws and an intrinsic notion of generalized statistics (for chiral theories: plektonic in addition to bosonic/fermionic). The chiral statistics parameters are complex numbers (Haag 1992) whose phase is related to a generalized concept of spin via a spin-statistics theorem and whose absolute value (the statistics dimension) generalizes the notion of multiplicities of fields known from the description of inner symmetries in higher-dimensional standard QFTs. The different sectors may be united into one bigger algebra called the exchange algebra $\mathcal{F}_{red}$ in the chiral context (the "reduced field bundle" of DHR) in which every sector occurs by definition with multiplicity 1 and the statistics data are encoded into exchange (commutation) relations of charge-carrying operators or generating fields ("exchange algebra fields") (Schroer 2005). Even though this algebra is useful in that all properties concerning fusion and statistics are nicely encoded, it lacks some cherished properties of standard field theory
namely there is no unique state–field relation, that is, no Reeh–Schlieder property (a field $A_{\alpha,\beta}$ whose source projection $P_{\beta}$ does not coalesce with the vacuum projection annihilates the vacuum); in operator-algebraic terms, the local algebras are not factors. This poses the question of how to manufacture from the set of all sectors natural (not necessarily local) extensions with these desired properties. It was found that this problem can be characterized in operator-algebraic terms by the existence of the so-called DHR triples (Schroer). In case of rational theories, the number of such extensions is finite and in the aforementioned "classical" current algebra and minimal models they all have been constructed by this method (thus confirming existing results and completing the minimal family by adding some missing models). The same method adapted to the chiral tensor product structure of $d = 1 + 1$ conformal observables classifies and constructs all two-dimensional local (bosonic/fermionic) conformal QFT $\mathcal{B}_{2}$ which can be associated with the observable chiral input. It turns out that this approach leads to another of those pivotal numerical matrices which encode structural properties of QFT: the coupling matrix $Z$,

$$ \mathcal{A} \otimes \mathcal{A} \subset \mathcal{B}_{2} $$
$$ \sum_{\rho\sigma} Z_{\rho,\sigma}\, \rho(\mathcal{A}) \otimes \sigma(\mathcal{A}) \subset \mathcal{A} \otimes \mathcal{A} $$ [18]

where the second line is an inclusion solely expressed in terms of observable algebras from which the desired (isomorphic) inclusion in the first line follows by a canonical construction, the so-called Jones basic construction. The numerical matrix $Z$ is an invariant closely related to the so-called "statistics character matrix" (Schroer 2005) and in case of rational models it is even a modular invariant with respect to the modular $SL(2, \mathbb{Z})$ group transformations (which are closely related to the matrix $S$ in the final section).

Integrability, the Bootstrap Form-Factor Program

Integrability in QFT and the closely associated bootstrap form-factor construction of a very rich class of massive two-dimensional QFTs can be traced back to two observations made during the 1960s and 1970s. On the one hand, there was the time-honored idea to bypass the "off-shell" field-theoretic approach to particle physics in favor of a pure on-shell S-matrix setting which (in particular recommended for strong interactions), as a result of the elimination of short distances via the mass-shell restriction, would be free of ultraviolet divergencies. This idea was enriched in the 1960s by the crossing property which in turn led to the bootstrap idea, a highly nonlinear seemingly self-consistent proposal for the determination of the S-matrix. However, the protagonists of this S-matrix bootstrap program placed themselves into a totally antagonistic fruitless position with respect to QFT so that the strong return of QFT in the form of gauge theory undermined their credibility. On the other hand, there were rather convincing quasiclassical calculations in certain two-dimensional massive QFTs as, for example, the sine-Gordon model which indicated that the obtained quasiclassical mass spectrum is exact and hence suggested that the associated QFTs are integrable (Dashen et al. 1975) and have no real particle creation. These provocative observations asked for a structural explanation beyond quasiclassical approximations, and it soon became clear that the natural setting for obtaining such mass formulas was that of the "fusion" of boundstate poles of unitary crossing-symmetric purely elastic S-matrices; first in the special context of the sine-Gordon model (Schroer et al. 1976) and later as a classification program from which factorizing S-matrices can be determined by solving well-defined equations for the elastic two-particle S-matrix (Karowski et al. 1977). (It was incorrectly believed that the "nontrivial elastic scattering implies particle creation" statement of Aks (Aks, 1963) is also valid for low-dimensional QFTs.) Some equations in this bootstrap approach resembled mathematical structures which appeared in C N Yang's work on nonrelativistic $\delta$-function particle interactions as well as relations for Boltzmann weights in Baxter's work on solvable lattice models; hence, they were referred to as Yang–Baxter relations. These results suggested that the old bootstrap idea, once liberated from its ideological dead freight (in particular from the claim that the bootstrap leads to a unique "theory of everything" (minus gravity)), generates a useful setting for the classification and construction of factorizing two-dimensional relativistic S-matrices. Adapting certain known relations between two-particle form factors of field operators and the S-matrix to the case at hand (Karowski and Weisz 1978), and extending this with hindsight to generalized (multiparticle) form factors, one arrived at the axiomatized recipes of the bootstrap form-factor program of $d = 1 + 1$ factorizable models (Smirnov 1992). Although this approach can be formulated within the
setting of the LSZ scattering formalism, the use of a certain algebraic structure (Zamolodchikov and Zamolodchikov 1979) which in the simplest version reads

$$Z(\theta)Z^{*}(\theta') = S^{(2)}(\theta-\theta')\,Z^{*}(\theta')Z(\theta) + \delta(\theta-\theta')$$
$$Z(\theta)Z(\theta') = S^{(2)}(\theta'-\theta)\,Z(\theta')Z(\theta) \qquad [19]$$

(the $\delta$-term is due to Faddeev) brought significant simplifications. In the general case, the $Z$'s are vector valued and the $S^{(2)}$-structure function is matrix valued. (The identification of the Z–F structure coefficients with the elastic two-particle S-matrix $S^{(2)}$ (which is preempted by our notation) can be shown to follow from the physical interpretation of the Z–F structure in terms of localization.) In that case the associativity of the Z–F algebra is equivalent to the Yang–Baxter equations. Recently, it became clear that this algebraic relation has a deep physical interpretation; it is the simplest algebraic structure which can be associated with generators of nontrivial wedge-localized operator algebras (see the next section).

Conceptually as well as computationally it is much simpler to identify the intrinsic meaning of integrability in QFT with the factorization of its S-matrix or a certain property of wedge-localized algebras (see next section) than to establish integrability (see Integrability and Quantum Field Theory).

The first step of the bootstrap form-factor program, namely the classification and construction of model S-matrices, follows a combination of two patterns: prescribing particle multiplets transforming according to group symmetries and/or specifying structural properties of the particle spectrum. The simplest illustration for the latter strategy is supplied by the $Z_N$ model. In terms of particle content, $Z_N$ demands the identification of the bound state of $N-1$ particles with the antiparticle. The fusion condition for the bound mass, $m_b^2 = (p_1 + p_2)^2 = m_1^2 + m_2^2 + 2m_1 m_2 \cosh(\theta_1 - \theta_2)$, can only be satisfied for a purely imaginary rapidity difference $\theta_{12} = \theta_1 - \theta_2 = i\alpha$ (the "binding angle"). Hence, the binding of two "elementary" particles of mass $m$ gives

$$m_2 = m\,\frac{\sin\alpha}{\sin(\alpha/2)}$$

and more generally of $k$ particles

$$m_k = m\,\frac{\sin(k\alpha/2)}{\sin(\alpha/2)}$$

so that the antiparticle mass condition $m_{N-1} = \bar{m} = m$ fixes the binding angle to $\alpha = 2\pi/N$. (The quotation marks are meant to indicate that, in contrast to Schrödinger QM, there is "nuclear democracy" on the level of particles. The inexorable presence of interaction-caused vacuum polarization limits a fundamental/fused hierarchy to the fusion of charges.) The minimal (no additional physical poles) two-particle S-matrix in terms of which the n-particle S-matrix factorizes is therefore

$$S^{(2)}_{\min} = \frac{\sin\frac{1}{2}\left(\theta + \frac{2\pi i}{N}\right)}{\sin\frac{1}{2}\left(\theta - \frac{2\pi i}{N}\right)} \qquad [20]$$

(minimal = without so-called CDD poles). The SU(N) model as compared with the U(N) model requires a similar identification of bound states of $N-1$ particles with an antiparticle. This S-matrix enters in the equation for the vacuum to n-particle meromorphic form factor of local operators; together with the crossing and the so-called "kinematical pole equation," one obtains a recursive infinite system linking a certain residue with a form factor involving a lower number of particles. The solutions of this infinite system form a linear space from which the form factors of specific tensor fields can be selected by a process which is analogous to, but more involved than, the specification of a Wick basis of composite free fields. Although the statistics property of two-dimensional massive fields is not intrinsic but a matter of choice, it would be natural to realize, for example, the $Z_N$ fields as $Z_N$-anyons.

Another rich class of factorizing models are the Toda theories, of which the sine-Gordon and sinh-Gordon models are the simplest cases. For their descriptions, the quasiclassical use of Lagrangians (supported by integrability) turns out to be of some help in setting up their more involved bootstrap form-factor construction.

The unexpected appearance of objects with new fundamental (solitonic) charges (e.g., the Thirring field as the carrier of a solitonic sine-Gordon charge) and the unexpected confinement of charges (e.g., the CP(1) model as a confined SU(2) model) turn out to be opposite sides of the same coin and both cases have realizations in the setting of factorizing models (Schroer 2005).

Recent Developments

There are two ongoing developments which place the two-dimensional bootstrap form-factor program into a more general setting which permits one to understand its position in the general context of local quantum physics.

One of these starts from the observation that the smallest spacetime localization region in which it is possible to find vacuum-polarization-free generators (PFG) in the presence of interactions is the wedge
region. If one demands in addition that these generators (necessarily unbounded operators) have the standard domain properties of QFT (which include stability of the domain under translations), then one finds that this leads precisely to the two-dimensional Z–F algebraic structure, which in this way acquires a spacetime interpretation for the first time. In these investigations (Schroer 2005), modular localization theory plays a prominent role and there are strong indications that with these methods one can show the nontriviality of intersections of wedge algebras, which is the algebraic criterion for the existence of a model within local quantum physics.

There is a second constructive idea based on light-front holography which uses the radical reorganization of spacetime properties of the algebraic structure while maintaining the physical content including the Hilbert space. Since spacetime localization aspects (apart from the remark about wedge algebras and their PFG generators made before) are traditionally related to the concept of fields, holographic methods tend to de-emphasize the particle structure in favor of "field properties." Indeed, the transversely extended chiral theories which arise as the holographic image lead to simplification of many interesting properties with very similar aims to the old "light-cone quantization" except that light-front holography is another way of looking at the original local ambient theory without subjecting it to another quantization. (The price for this simplification is that as a result of the nonuniqueness of the holographic inversion certain problems cannot be formulated.)

Actually, as a result of the absence of a transverse direction in the two-dimensional setting, the family of factorizing models provides an excellent theoretical laboratory to study their rigorous "chiral encoding" which is conceptually very different from Zamolodchikov's perturbative relation (which is based on identifying a factorizing model in terms of a perturbation on a chiral theory).

It turns out that the issue of statistics of particles loses its physical relevance for two-dimensional massive models since they can be changed without affecting the physical content. Instead such notions as order/disorder fields and solitons take their place (Schroer 2005).

In accordance with its historical origin, the theory of two-dimensional factorizing models may also be viewed as an outgrowth of the quantization of classical integrable systems (Integrability and Quantum Field Theory). But in comparison with the rather involved structure of integrability (verifying the existence of sufficiently many commuting conservation laws), the conceptual setting of factorizing models within the scattering framework (factorization follows from existence of wedge-localized tempered PFGs) is rather simple and intrinsic (Schroer 2005).

Among the additional ongoing investigations in which the conceptual relation with higher-dimensional QFT is achieved via modular localization theory, we will select three which have caught our active attention. One is motivated by the recent discovery of the adaptation of Einstein's classical principle of local covariance to QFT in curved spacetime. The central question raised by this work (see Algebraic Approach to Quantum Field Theory) is whether all models of Minkowski spacetime QFTs permit a local covariant extension to curved spacetime and, if not, which models do. In the realm of chiral QFT, this would amount to asking if all Moebius-invariant models are also Diff($S^1$)-covariant. It has been known for some time that a QFT with all its rich physical content can be uniquely defined in terms of a carefully chosen relative position of a finite number of copies of one unique von Neumann operator algebra within one common Hilbert space. This is a perfect quantum field-theoretical illustration for Leibniz's philosophical proposal that reality results from the relative position of "monades" (as opposed to the more common (Newtonian) view that the material reality originates from a material content being placed into a spacetime vessel) if one takes the step of identifying the hyperfinite type III$_1$ Murray–von Neumann factor algebra with an abstract monade from which the different copies result from different ways of positioning in a shared Hilbert space (Schroer 2005). In particular, Moebius-covariant chiral QFTs arise from two monades with a joint intersection defining a third monade in such a way that the relative positions are specified in terms of natural modular concepts (without reference to geometry). This begs the question whether one can extend these modular-based algebraic ideas to pass from the global vacuum-preserving Moebius invariance to local Diff($S^1$) covariance, Moeb $\to$ Diff($S^1$). This would be precisely the two-dimensional adaptation of the crucial problem raised by the recent successful generalization of the local covariance principle underlying Einstein's classical theory of gravity to QFT in curved spacetime: does every Poincaré covariant Minkowski spacetime QFT allow a unique correspondence with one curved spacetime (having the same abstract algebraic substrate but with a totally different spacetime encoding)? In the chiral context, one is led to the notion of "partially geometric modular groups" which only act geometrically if restricted to specific subalgebras (Schroer
2005). It is hard to imagine how one can combine quantum theory and gravity without understanding first the still mysterious links between spacetime geometry, thermal properties, and relative positioning of monades in a joint Hilbert space.

A second important umbilical cord with higher-dimensional theories is the issue of "Euclideanization," in particular the chiral counterpart of Osterwalder–Schrader localization and the closely related Nelson–Symanzik duality. In concrete chiral models (e.g., the models in the section "Chiral fields and two-dimensional conformal models"), it has been noted as a result of explicit calculations that the analytic continuation in the angular parametrization for thermal correlation functions leads to a duality relation

$$\langle A(\varphi_1,\dots,\varphi_n)\rangle_{\alpha,\,2\pi\beta_t} = \left(\frac{i}{\beta_t}\right)^{a}\sum_{\gamma} S_{\alpha\gamma}\left\langle A\!\left(\frac{i}{\beta_t}\varphi_1,\dots,\frac{i}{\beta_t}\varphi_n\right)\right\rangle_{\gamma,\,2\pi/\beta_t} \qquad [21]$$

where the thermal correlation function is defined as

$$\langle A(\varphi_1,\dots,\varphi_n)\rangle_{\alpha,\,2\pi\beta_t} := \mathrm{tr}_{H_\alpha}\, e^{-2\pi\beta_t\,(L_0 - c/24)}\,\pi_\alpha\big(A(\varphi_1,\dots,\varphi_n)\big), \qquad A(\varphi_1,\dots,\varphi_n) = \prod_{i=1}^{n} A_i(\varphi_i) \qquad [22]$$

Compared with the thermally extended Nelson–Symanzik relation for two-dimensional QFT one notices that in addition to the expected behavior of real coordinates becoming imaginary and the $2\pi$-periodicity changing role with the (suitably normalized) KMS inverse temperature, there is a rotation in the space of superselected charges in terms of a unitary matrix $S$ whose origin lies in the braid group statistics (the statistics character matrix). The deeper structural explanation which shows that this relation is not just a property of special models, but rather a generic property of chiral QFT, comes from a very deep angular Euclideanization which is based on modular theory (Schroer). Specializing A = identity, one obtains a relation for the partition function, the famous Verlinde identity which is part of the transformation law of the thermal angular correlation functions under the SL(2, Z) modular group.

There are many additional important observations on factorizing models whose relation to the physical principles of QFT, unlike the bootstrap form-factor program, is not yet settled. The meaning of the c-parameter outside the chiral setting and ideas on its renormalization group flow as well as the various formulations of the thermodynamic Bethe ansatz belong to a series of interesting observations whose final relation to the principles of QFT still needs clarification.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Bosons and Fermions in External Fields; Euclidean Field Theory; Integrability and Quantum Field Theory; Operator Product Expansion in Quantum Field Theory; Sine-Gordon Equation; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Tomita–Takesaki Modular Theory.

Further Reading

Abdalla E, Abdalla MCB, and Rothe K (1991) Non-Perturbative Methods in 2-Dimensional Quantum Field Theory. Singapore: World Scientific.
Belavin AA, Polyakov AM, and Zamolodchikov AB (1984) Infinite conformal symmetry in two-dimensional quantum field theory. Nuclear Physics B 241: 333.
Bisognano JJ and Wichmann EH (1975) Journal of Mathematical Physics 16: 985.
Dashen F, Hasslacher B, and Neveu A (1975) Physics Reviews D 11: 3424.
Di Francesco P, Mathieu P, and Sénéchal D (1996) Conformal Field Theory. Berlin: Springer.
Doplicher S, Haag R, and Roberts JE (1971/1974) Communications in Mathematical Physics 23: 199; 35: 49.
Fredenhagen K, Rehren KH, and Schroer B (1992) Superselection sector with braid group statistics and exchange algebras II: Geometric aspects and conformal invariance. Reviews of Mathematical Physics 1 (special issue): 113.
Furlan P, Sotkov G, and Todorov I (1989) Two-dimensional conformal quantum field theory. Rivista del Nuovo Cimento 12: 1–203.
Gabbiani F and Froehlich J (1993) Operator algebras and conformal field theory. Communications in Mathematical Physics 155: 569.
Glimm J and Jaffe A (1987) Quantum Physics. A Functional Integral Point of View. Berlin: Springer.
Ginsparg P (1990) Applied conformal field theory. In: Brezin E and Zinn-Justin J (eds.) Fields, Strings and Critical Phenomena, Les Houches 1988. Amsterdam: North-Holland.
Goddard P, Kent A, and Olive D (1985) Virasoro algebras and coset space models. Physics Letters B 152: 88.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Ising E (1925) Zeitschrift für Physik 31: 253.
Jordan P (1937) Beiträge zur Neutrinotheorie des Lichts. Zeitschrift für Physik 114: 229 and earlier papers quoted therein.
Karowski M, Thun H-J, Truong TT, and Weisz P (1977) Physics Letters B 67: 321.
Karowski M and Weisz P (1978) Physics Reviews B 139: 445.
Kawahigashi Y, Longo R, and Mueger M (2001) Multi-interval subfactors and modularity in representations of conformal field theory. Communications in Mathematical Physics 219: 631.
Lenz W (1920) Physikalische Zeitschrift 21: 613.
Lüescher M and Mack G (1975) Global conformal invariance in quantum field theory. Communications in Mathematical Physics 41: 203.
Rehren K-H and Schroer B (1987) Exchange algebra and Ising n-point functions. Physics Letters B 198: 84.
Rehren KH and Schroer B (1989) Einstein causality and Artin braids. Nuclear Physics B 312: 715.
Schroer B (2005) Two-dimensional models, a testing ground for principles and concepts of QFT, Annals of Physics (in print) (hep-th/0504206).
Schroer B and Swieca JA (1974) Conformational transformations for quantized fields. Physics Reviews D 10: 480.
Schroer B, Swieca JA, and Voelkel AH (1975) Global operator expansions in conformally invariant relativistic quantum field theory. Physics Reviews D 11: 11.
Schroer B, Truong TT, and Weisz P (1976) Towards an explicit construction of the Sine-Gordon field theory. Annals of Physics (New York) 102: 156.
Schwinger J (1962) Physical Review 128: 2425.
Schwinger J (1963) Gauge theory of vector particles. In: Theoretical Physics Trieste Lectures 1962. Wien: IAEA.
Smirnov FA (1992) Advanced Series in Mathematical Physics 14. Singapore: World Scientific.
Streater RF and Wightman AS (1964) PCT, Spin and Statistics and All That. New York: Benjamin.
Zamolodchikov AB and Zamolodchikov AB (1979) Annals of Physics (New York) 120: 253.
U
Universality and Renormalization
M Lyubich, University of Toronto, Toronto, ON, Canada and Stony Brook University, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Discovery of the universality phenomenon and the underlying renormalization mechanism by Feigenbaum and independently by Coullet and Tresser in the late 1970s was one of the most influential events in dynamical systems theory in the last quarter of the twentieth century. It was numerically observed that the cascades of doubling bifurcations leading to chaotic regimes in one-parameter families of interval maps, as well as the dynamical attractors that appear in the limits, exhibit universal small-scale geometry. To explain this surprising observation, a "Renormalization Conjecture" was formulated which asserted that a natural renormalization operator acting in the space of dynamical systems has a unique hyperbolic fixed point.

It took about two decades to prove this conjecture rigorously (and without the help of computers). The proof revealed rich mathematical structures behind the universality phenomenon that linked it tightly to holomorphic dynamics and conformal and hyperbolic geometry.

Besides the universality per se, the renormalization theory led to many other important results. These include the proof of the regular or stochastic dichotomy that gives us a complete understanding of the real quadratic family (and more general families of one-dimensional maps) from the measure-theoretic point of view, as well as deep advances in several key problems of holomorphic dynamics.

Since the original discovery, many other manifestations of the universality have been observed, experimentally, numerically, and theoretically, in various classes of dynamical systems. However, in this article we will focus on mathematical aspects of the original phenomenon.

General Terminology and Notations

We will use general notations and terminology from Holomorphic Dynamics.

Unimodal Maps

Definitions and Conventions

Let us consider a smooth interval map $f : I \to I$. It is called unimodal if it has a single critical point $c$ and this point is an extremum. We assume that the critical point is nondegenerate, unless explicitly stated otherwise. A unimodal map is called S-unimodal if it has a negative Schwarzian derivative:

$$Sf = \frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^{2} < 0$$

For simplicity, we also assume that the map $f$ is even, and normalize it so that $c = 0$ and one of the endpoints of $I$ is a fixed point.

Topological Dynamics

Let $J \ni 0$ be a 0-symmetric periodic interval, that is, $f^p(J) \subset J$ for some $p \in \mathbb{N}$, such that the intervals $J_k = f^k(J)$, $k = 0, 1, \dots, p-1$, have disjoint interiors. Then we refer to $\cup J_k$ as a cycle of intervals of period $p$.

According to their topological dynamics, S-unimodal maps can be divided into three possible types (Sharkovskii, Singer, Guckenheimer, Misiurewicz, van Strien, Blokh, etc.):

Regular maps. Such a map has an attracting or parabolic cycle a. In this case, almost all trajectories of $f$ converge to a. In case a is attracting, the map $f$ is also called hyperbolic (see Holomorphic Dynamics).

Topologically chaotic maps. For such a map, there is a cycle of intervals $\cup J_k$ such that the restriction $f|\cup J_k$ is topologically transitive (i.e., it has a dense orbit). Moreover, for almost all $z \in I$, orb $z$ eventually lands in this cycle.

Infinitely renormalizable maps. For such a map, there is a nested sequence of periodic intervals $J^1 \supset J^2 \supset \dots \ni 0$ of periods $p_n \to \infty$. Then the
independent of the particular map $f$ under consideration. Thus, the small-scale geometry of $A_f$ is universal.

This was historically the first observed manifestation of the quantitative universality of dynamical and parameter structures.

Feigenbaum–Coullet–Tresser Renormalization Conjecture

To explain the above universality phenomenon, Feigenbaum and independently Coullet and Tresser formulated the following Renormalization Conjecture. Let us consider the space $\mathcal{U}$ of S-unimodal maps $f : [-1, 1] \to [-1, 1]$. A map $f \in \mathcal{U}$ is called (doubling) renormalizable if it has a cycle of intervals $J \to J_1 \to J$ of period 2. Then, for any $n \in \mathbb{Z}_+ \cup \{\infty\}$, we can naturally define n-times renormalizable maps, where $n = 0$ corresponds to the non-renormalizable case, while $n = \infty$ corresponds to the infinitely renormalizable case.

Let $\mathcal{U}^0 \subset \mathcal{U}$ be the space of doubling renormalizable maps. If $f \in \mathcal{U}^0$ then $f^2 : J \to J$ is an S-unimodal map as well, and we define the (doubling) renormalization operator $R : \mathcal{U}^0 \to \mathcal{U}$ as the rescaling of this map:

$$Rf(x) = \lambda^{-1} f^{2}(\lambda x)$$

where $\lambda = |J|/2$.

The Renormalization Conjecture asserted that:

The renormalization operator $R$ has a unique fixed point $f_*$, and this point is hyperbolic;

the stable manifold $W^s(f_*)$ consists of infinitely renormalizable unimodal maps;

the unstable manifold $W^u(f_*)$ is one dimensional and represents an almost full family of unimodal maps (see the section "Kneading theory"); and

the quadratic family $\{P_c\}$ transversally intersects $W^s(f_*)$ (see Figure 2).

Assuming this conjecture, one can see that for any curve $t \mapsto g_t$ in $\mathcal{U}$ that transversally intersects the stable manifold $W^s(f_*)$ at some moment $t_*$, the doubling bifurcation parameters $t_n$ converge to $t_*$ at exponential rate $\delta^{-n}$, where $\delta$ is the unstable eigenvalue of the differential $DR(f_*)$. This explains the universal geometry of doubling bifurcations. One can also show that the Feigenbaum attractor $A_f$ of any map $f \in W^s(f_*)$ is smoothly equivalent to $A_{f_*}$, which explains the universal small-scale geometry of these attractors.

Figure 2 Renormalization fixed point. (Schematic showing the fixed point $f_*$ with its manifolds $W^u$ and $W^s$, and the quadratic family crossing $W^s$ at $z^2 + c_*$.)

Full Renormalization Horseshoe

Along with period doublings, one can consider period triplings, quadruplings, etc. A unimodal map $f \in \mathcal{U}$ is said to be renormalizable with period $p$ if it has a cycle of intervals $J \to J_1 \to \dots \to J_{p-1} \to J$ of period $p$. The corresponding renormalization operator is defined as $Rf(x) = \lambda^{-1} f^{p}(\lambda x)$, where $\lambda = |J|/2$.

The combinatorics or type of the renormalization operator is the order of the intervals $J_k$, $k = 0, 1, \dots, p-1$, on the real line (up to reversal). (For instance, there are three admissible combinatorics of period 5.) If we want to specify combinatorics $\sigma$ of the renormalization operator under consideration, we use notation $R_\sigma$. This operator is defined on the "renormalization strip" $\mathcal{U}_\sigma$ of unimodal maps $f \in \mathcal{U}$ that are renormalizable with combinatorics $\sigma$.

The Renormalization Conjecture admits a straightforward generalization to any renormalization operator $R_\sigma$. More interestingly, one can formulate a stronger version of it by putting all the admissible renormalization types together. Let $\mathcal{T}$ stand for the set of all minimal renormalization types, that is, the types that cannot be factored through other types. Then the renormalization strips $\mathcal{U}_\sigma$, $\sigma \in \mathcal{T}$, are pairwise disjoint, and we can define the full renormalization operator

$$R : \bigcup_{\sigma \in \mathcal{T}} \mathcal{U}_\sigma \to \mathcal{U} \qquad [2]$$

by letting $R|\mathcal{U}_\sigma = R_\sigma$. Then the strong version of the renormalization conjecture asserted that:

there is an $R$-invariant hyperbolic subset $\mathcal{A} \subset \mathcal{U}$ called the full renormalization horseshoe such that the restriction $R|\mathcal{A}$ is topologically conjugate to the full shift on the space of bi-infinite sequences $(\dots, \sigma_{-1}, \sigma_0, \sigma_1, \dots)$ of symbols $\sigma_n \in \mathcal{T}$;

for any $f_* \in \mathcal{A}$, the stable manifold $W^s(f_*)$ consists of infinitely renormalizable maps $f \in \mathcal{U}$ with the same combinatorics as $f_*$;

for any $f_* \in \mathcal{A}$, the unstable manifold $W^u(f_*)$ is one-dimensional and represents an almost full family of unimodal maps; and

the real quadratic family $\{P_c\}$ transversally intersects all stable manifolds $W^s(f_*)$.
$$K(f) = \{z : f^{n} z \in U,\ n = 0, 1, \dots\}$$

The Julia set of $f$ is the boundary of its filled Julia set: $J(f) = \partial K(f)$.

A polynomial-like map of degree $d$ has $d - 1$ critical points counted with multiplicities. The Julia set (and the filled Julia set) is connected if and only if all the critical points $c_i$ are nonescaping, that is, $c_i \in K(f)$.

A polynomial-like map of degree 2 is called quadratic-like. The Julia set of a quadratic-like map is either connected or a Cantor set, depending on whether its critical point is nonescaping or otherwise.

The domain of a polynomial-like map is allowed to be slightly adjusted by taking $V'$ to be a topological disk such that $U \subset V' \subset U'$ and letting $V = f^{-1}(V')$. We say that two polynomial-like maps represent the same germ if one can be obtained from the other by a sequence of such adjustments.

We will be mostly interested in the quadratic case; so let $\mathcal{Q}$ be the space of quadratic-like germs considered up to affine conjugacy, and let $\mathcal{C}$ be the connectedness locus in $\mathcal{Q}$, that is, the subset of $f \in \mathcal{Q}$ with connected Julia set. The space $\mathcal{Q}$ has a natural complex analytic structure such that holomorphic curves in $\mathcal{Q}$ are represented by holomorphic families $f_\lambda(z)$ of quadratic-like maps.

Two polynomial-like maps are called hybrid equivalent if they are conjugate by a quasiconformal map $h$ such that $\bar\partial h = 0$ a.e. on $K(f)$ (in particular, $h$ is conformal on int $K(f)$). By the Straightening Theorem, any polynomial-like map is hybrid equivalent (after an adjustment of its domain) to a polynomial of the same degree (called the "straightening" of $f$). The straightening depends only on the germ of $f$.

For a quadratic-like map $f$ with connected Julia set, the straightening $P_c : z \mapsto z^2 + c$ is unique, $c = \chi(f)$. Thus, we obtain the straightening map

Complex Renormalization and Little Mandelbrot Sets

A quadratic-like map $f : U \to U'$ with connected Julia set is called renormalizable if there is a topological disk $V \ni 0$ and a natural number $p \ge 2$ called the renormalization period such that:

letting $g = f^p|V$ and $V' = g(V)$, the map $g : V \to V'$ is quadratic-like;

the little Julia set $K(g)$ is connected; and

the sets $f^n(K(g))$, $n = 1, \dots, p-1$, can intersect $K(g)$ only at the $\beta$-fixed point of $g$.

Under these circumstances, the quadratic-like germ $g$ considered up to affine conjugacy is called the renormalization of the quadratic-like germ $f$; $g = Rf$. Moreover, one says that $f$ is primitively renormalizable if the little Julia sets $f^n(K(g))$, $n = 0, 1, \dots, p-1$, are pairwise disjoint. Otherwise, $f$ is satellite renormalizable.

As in the unimodal case, one can define combinatorics or type of the complex renormalization. Roughly speaking, renormalizable maps with the same combinatorics have the same renormalization period and the "same position" of the little Julia sets $f^k(K(g))$ in $\hat{\mathbb{C}}$ (the rigorous definition is based on the notion of Thurston's equivalence from Holomorphic Dynamics).

Theorem 1 (Douady and Hubbard 1986). The set of parameters $c$ for which a quadratic map $P_c : z \mapsto z^2 + c$ is renormalizable with a given combinatorics $\sigma$ assembles a homeomorphic copy $M_\sigma$ of the Mandelbrot set $M$.

This theorem explains the presence of many little Mandelbrot sets that are observable on the computer pictures of $M$ (see Figures 3 and 4). Moreover, the copies corresponding to the primitive renormalization originate at primitive hyperbolic components (see Holomorphic Dynamics), while the copies obtained by a satellite renormalization originate at satellite hyperbolic components attached to some
Julia sets are "hairy" at the origin, that is, their blow-ups fill in densely the whole plane (this phenomenon is related to the universal geometry of the Feigenbaum attractors; McMullen (1996)). However, some of them have zero Lebesgue measure (Yarrington, thesis 1995) and Hausdorff dimension smaller than 2 (Avila–Lyubich, preprint 2004). It is unknown whether this happens for all of them or not (in particular, the answer is unknown for the Feigenbaum map born in the cascade of doubling bifurcations).

Regular or Stochastic Dichotomy

Stochastic Maps

An S-unimodal map $f$ is called stochastic if it has an absolutely continuous invariant measure $\mu$. In this case, $f$ is topologically chaotic (see the section "Topological dynamics") and $\mu$ is supported on the transitive cycle of intervals $\cup J_k$. Moreover, $\mu$ has a positive characteristic exponent,

$$\chi = \int \log |Df|\, d\mu > 0$$

and Lebesgue almost all orbits are equidistributed with respect to $\mu$, that is, for Lebesgue a.e. $x \in I$,

$$\frac{1}{n}\sum_{k=0}^{n-1} \varphi(f^{k} x) \to \int \varphi\, d\mu$$

for any continuous function $\varphi$. The map $f^p|J$ is mixing with respect to $\mu$, and in fact, is weakly Bernoulli.

Here are two important criteria for stochasticity:

Collet–Eckmann condition (see Holomorphic Dynamics). These maps have extra strong stochastic properties, notably, the exponential decay of correlations.

Martens–Nowicki condition. To state it, we need to define the principal nest of intervals, $I^0 \supset I^1 \supset \dots \ni 0$. Here $I^0 = [\alpha, -\alpha]$, where $\alpha$ is the fixed point with negative multiplier, and $I^{n+1}$ is inductively defined as the component of $f^{-l_n}(I^n)$ containing 0, where $l_n$ is the moment of first return of the orbit of 0 to $I^n$. Let us consider the scaling factors $\lambda_n = |I^n|/|I^{n-1}|$. If $\sum \sqrt{\lambda_n} < \infty$ then $f$ is stochastic.

Let $\mathcal{N} \subset [-2, 1/4]$ be the set of parameters $c$ for which the quadratic map $P_c$ is topologically chaotic. Not every such map is stochastic. However, the set of stochastic parameters has positive Lebesgue measure (Jakobson 1981), and in fact,

Theorem 5 (Lyubich 2000). For a.e. $c \in \mathcal{N}$, the map $P_c$ satisfies the Martens–Nowicki condition, and thus, is stochastic.

Avila and Moreira (2005) went on to prove that for a.e. $c \in \mathcal{N}$, the map $P_c$ is Collet–Eckmann.

Renormalization Horseshoe

Let us consider the complexification of the renormalization operator [2],

$$R : \bigcup_{\sigma \in \mathcal{T}} \mathcal{Q}_\sigma \to \mathcal{Q} \qquad [3]$$

acting in the space of quadratic-like maps.

Theorem 6 (Lyubich 2002). The "Strong Renormalization Conjecture" is valid for the operator [3].

Let $\mathcal{I} \subset [-2, 1/4]$ be the set of parameters for which the quadratic map $P_c$ is infinitely renormalizable. The above theorem implies that this set has zero Lebesgue measure. (Avila and Moreira went on to prove that HD($\mathcal{I}$) < 1.)

Regular or Stochastic Dichotomy

Putting together Theorems 5 and 6, we obtain:

Theorem 7 For a.e. $c \in [-2, 1/4]$, the quadratic map $P_c$ is either regular or stochastic.

This result gives a complete probabilistic picture of dynamics in the real quadratic family. It has been later transferred to any nondegenerate real analytic family of S-unimodal maps (Avila–Lyubich–de Melo), and further to a generic smooth family of S-unimodal maps (Avila–Moreira).

Palis has formulated a strong general conjecture (in all dimensions) asserting that a typical (from the probabilistic point of view) smooth dynamical system $f$ has finitely many attractors supporting SRB measures (see Lyapunov Exponents and Strange Attractors) that govern the behavior of Lebesgue a.e. trajectories of $f$. The above results confirm the Palis Conjecture in the setting of S-unimodal maps.

Other Universality Classes

From a more general point of view, renormalization is an appropriately rescaled return map to a relevant piece of the phase space, viewed as an operator in some class of dynamical systems. From this point of view, most dynamical systems are "renormalizable," and the renormalization approach often provides a deep insight into the nature of the systems in question.

Here is a partial list of classes of nonlinear systems that exhibit universality with an underlying renormalization mechanism (we provide a few
Universality and Renormalization 349
relevant names, but there are many more people who contributed to the corresponding theories):

holomorphic germs near indifferent equilibria (Yoccoz, Shishikura, McMullen);
critical circle maps (Kadanoff, Feigenbaum, Rand, Lanford, Swiatek, de Faria, Yampolsky);
non-renormalizable quadratic-like maps of Fibonacci type (Lyubich–Milnor);
conservative two-dimensional diffeomorphisms near the point of breaking of KAM tori (MacKay, Koch); and
dissipative Hénon-like maps (Collet–Eckmann–Koch, de Carvalho–Lyubich–Martens).

See also: Fractal Dimensions in Dynamics; Holomorphic Dynamics; Lyapunov Exponents and Strange Attractors; Multiscale Approaches.

Further Reading

Collet P and Eckmann J-P (1980) Iterated Maps of the Interval as Dynamical Systems. Boston: Birkhäuser.
Cvitanović P (1984) Universality in Chaos. Bristol: Adam Hilger.
Douady A and Hubbard JH (1985) On the dynamics of polynomial-like maps. Annales Scientifiques de l’École Normale Supérieure 18: 287–343.
Lyubich M (1999) Feigenbaum–Coullet–Tresser universality and Milnor’s hairiness conjecture. Annals of Mathematics 149: 319–420.
Lyubich M (2000) Quadratic family as a qualitatively solvable model of dynamics. Notices of the American Mathematical Society 47(9): 1042–1052.
McMullen C (1996) Renormalization and 3-Manifolds Which Fiber Over the Circle, Annals of Math. Studies, vol. 135. Princeton: Princeton University Press.
de Melo W and van Strien S (1993) One-Dimensional Dynamics. Berlin: Springer.
Sullivan D (1993) Linking the universalities of Milnor–Thurston, Feigenbaum and Ahlfors–Bers. In: Topological Methods in Modern Mathematics, The Proceedings of Symposium held in honor of John Milnor’s 60th Birthday, SUNY at Stony Brook, 1991, pp. 543–564. Houston, TX: Publish or Perish.
Vul EB, Sinai YaG, and Khanin KM (1984) Feigenbaum universality and the thermodynamical formalism. Russian Mathematical Surveys 39: 1–40.
V
Variational Methods in Turbulence
F H Busse, Universität Bayreuth, Bayreuth, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The problem of fluid turbulence is commonly regarded as one of the most challenging problems of theoretical physics and mathematics. There is general agreement that the Navier–Stokes equations (NSEs) provide a satisfactory basis for the description of turbulent motions of homogeneous Newtonian fluids such as gases and most liquids. But the difficulty of generating solutions of these equations for high-Reynolds-number flows has prevented accurate answers to simple questions such as the question of the discharge of turbulent pipe flow as a function of the pressure head or the question of the heat transport by turbulent convection in a fluid layer heated from below. In view of this difficulty, it has become an attractive idea to obtain rigorous bounds on turbulent transports. Variational methods have played an important role in the derivation of such bounds.

There is another motivation for the use of variational methods for the understanding of turbulent fluid systems. Experimenters have sometimes noted the tendency of turbulent flows to maximize transports under given external conditions. In his pioneering paper, Howard (1963) mentions that the Malkus hypothesis of a maximum heat transport by thermal convection had motivated him to derive upper bounds through the use of variational methods. The techniques developed by Howard have later been applied to other kinds of turbulent transports by Busse. While relatively simple ordinary differential equations are obtained when the equation of continuity is not imposed as a constraint, the Euler–Lagrange equations for a stationary value of the variational functional lead to nonlinear partial differential equations when solenoidal extremalizing vector fields are required. Nevertheless, using boundary layer methods one can derive approximate analytical solutions even in the limit of asymptotically large Rayleigh and Reynolds numbers (Busse 1969, 1978).

In the following, we shall first discuss the energy method which provides necessary conditions for the existence of turbulent solutions of the underlying equations and then turn to the problem of upper bounds for the turbulent momentum transport in the plane Couette flow configuration as a particular example. The properties and physical relevance of the extremalizing vector fields will be discussed in a final section.

Energy Method

For simplicity, we consider the NSEs for a homogeneous incompressible fluid with a constant kinematic viscosity $\nu$ in an arbitrary fixed domain $D$. Using the diameter $d$ of the domain as length scale and $d^2/\nu$ as timescale, we can write the NSEs of motion in dimensionless form,

$$\frac{\partial}{\partial t} v + v \cdot \nabla v = -\nabla p + f + \nabla^2 v \qquad [1a]$$

$$\nabla \cdot v = 0 \qquad [1b]$$

where $f$ denotes some given steady distribution of a force density. On the boundary $\partial D$ of the domain $D$, steady velocities parallel to the boundary may be specified. We assume that the basic steady solution of the problem is given by $v_s = Re\, \hat{v}$ where the average of $(\hat{v})^2/2$ over the domain $D$ (indicated by angular brackets) is unity, $\langle |\hat{v}|^2 \rangle = 2$. Any velocity field $v_t$ different from $v_s$, that is, with $u \equiv v_t - v_s \not\equiv 0$, must obey the equations

$$\frac{\partial}{\partial t} u + v_s \cdot \nabla u + u \cdot \nabla v_s + u \cdot \nabla u = -\nabla \tilde{p} + \nabla^2 u \qquad [2a]$$

$$\nabla \cdot u = 0 \qquad [2b]$$

together with the homogeneous boundary conditions for $u$ on $\partial D$. By multiplying eqn [2a] by $u$ and averaging the result over the domain $D$ we obtain the relationship

$$\frac{1}{2} \frac{d}{dt} \langle u \cdot u \rangle = -\langle |\nabla u|^2 \rangle - Re \langle u \cdot (u \cdot \nabla) \hat{v} \rangle \qquad [3]$$
where the vanishing of $u$ on $\partial D$ and equations such as

$$\langle u \cdot (v_s \cdot \nabla) u \rangle = \tfrac{1}{2} \langle v_s \cdot \nabla (u \cdot u) \rangle$$

Table 1 Reynolds numbers for shear flows (columns: $Re_E$, $Re_G$, $Re_c$ (from exp.))
respectively, where k is the unit vector normal to the The Euler–Lagrange equations as necessary con-
plates such that the boundary conditions are given by ditions for an extremal value of the functional are
given by
1 1
v ¼ Re i at z ¼ ½8
2 2 d
d
w
~ u U
¼ r þ r2 ~v
U þ k~ ½13
dz dz
After separating the velocity field v into its mean
and fluctuating parts, v = U þ v with v = U, v = 0, r ~v ¼ 0 ½14
where the bar denotes the average over planes
z = const., we obtain by multiplying eqn [6] by v where dU
=dz is defined by
and averaging it over the entire fluid layer (indicated
by angular brackets) d
hjr~vj2 i
U ¼u
~w uwi
~ h~ ~ i R ½15
dz 2h~vx wi
~
1d @
vj2 i ¼ uw
hj U hjr vj2 i ½9 and where = h~ ux wi
~ has been set. When eqns
2 dt @z
[13]–[15] are compared with the equations for v
Here u denotes the component of v perpendicular to and for U, a strong similarity can be noticed. The
k and w is its z-component. We define fluid variational problem does not exhibit any time
turbulence under stationary conditions by the prop- dependence, but the Euler–Lagrange equations may
erty that quantities averaged over planes z = const. still be regarded as the symmetric analogue of the
are time independent. Accordingly, the equation for NSEs for steady flow.
the mean flow U can be integrated to yield
relatively easily emphasizes the point that the extremalizing vector fields are the most interesting aspect of the variational problems. They often exhibit similarities with the observed turbulent velocity fields, in particular as far as the mean flows are concerned. In the case of convection in a layer heated from below, the transition of the bound from the 1-$\alpha$ solution to the 2-$\alpha$ solution corresponds closely to the experimentally observed transition from convection rolls to bimodal convection (Busse 1969).

The close similarities between variational functionals for rather different physical systems suggest corresponding similarities between the respective turbulent fields. For example, the analogy between the fluctuating component of the temperature in turbulent convection and the streamwise component of the fluctuating velocity field in shear flow turbulence has been demonstrated and employed in a theory of the atmospheric boundary layer (Busse 1978). Better bounds and more physically realistic properties of the extremalizing vector fields can be expected when additional constraints are imposed. For example, the energy balances for poloidal and toroidal components of the velocity field can be applied separately. But these developments are still in their initial stages.

See also: Bifurcations in Fluid Dynamics; Fluid Mechanics: Numerical Methods; Turbulence Theories.

Further Reading

Busse FH (1969) On Howard’s upper bound for heat transport by turbulent convection. Journal of Fluid Mechanics 37: 457–477.
Busse FH (1978) The optimum theory of turbulence. Advances in Applied Mechanics 18: 77–121.
Busse FH (2002) The problem of turbulence and the manifold of asymptotic solutions of the Navier–Stokes equations. In: Oberlack M and Busse FH (eds.) Theories of Turbulence, pp. 77–121. Wien: Springer.
Doering CR and Constantin P (1994) Variational bounds on energy dissipation in incompressible flows: shear flow. Physical Review E 49: 4087–4099.
Howard LN (1963) Heat transport by turbulent convection. Journal of Fluid Mechanics 17: 405–432.
Howard LN (1972) Bounds on flow quantities. Annual Review of Fluid Mechanics 4: 473–494.
Joseph DD (1976) Stability of Fluid Motions, vol. 1. Berlin: Springer.
Kerswell RR (1998) Unification of variational principles for turbulent shear flows: the background method of Doering–Constantin and Howard–Busse’s mean-fluctuation formulation. Physica D 121: 175–192.
below the critical temperature, through its Gibbs energy:

$$G_\varepsilon(\psi, A) = \frac{1}{2} \int_\Omega |\nabla_A \psi|^2 + \frac{(1 - |\psi|^2)^2}{2\varepsilon^2} + \frac{1}{2} \int_{\mathbb{R}^3} |\operatorname{curl} A - h_{ex}|^2 \qquad [1]$$

In this expression, the first unknown $\psi$ is the "order parameter" in physics. It is a complex-valued condensed wave function, indicating the local state of the material, or the phase (in the Landau theory approach of phase transitions): $|\psi|^2$ is the density of the "Cooper pairs" of superconducting electrons explaining superconductivity in the BCS approach. With our normalization $|\psi| \le 1$ and where $|\psi| \simeq 1$ the material is in the superconducting phase, while where $|\psi| \simeq 0$, it is in the normal phase (i.e., behaves like a normal conductor), the two phases being able to coexist in the sample.

The second unknown $A$ is the electromagnetic vector potential of the magnetic field, a function from $\Omega$ to $\mathbb{R}^3$. The induced magnetic field in the sample is deduced by $h = \operatorname{curl} A$. The notation $\nabla_A$ denotes the covariant derivative $\nabla - iA$. The superconducting current is the vector $j$ of components

$$j_k = \langle i\psi, (\nabla_A \psi)_k \rangle \qquad [2]$$

where $\langle \cdot, \cdot \rangle$ denotes the scalar product in $\mathbb{C}$ identified with $\mathbb{R}^2$.

Finally, the parameter $\varepsilon$ is the inverse of the "Ginzburg–Landau parameter" $\kappa$, a dimensionless parameter (ratio of the penetration depth and the coherence length) depending on the material only. Most variational studies of Ginzburg–Landau focus on the regime of large $\kappa$ or small $\varepsilon$, corresponding to "extreme type-II" superconductors, also called the London limit. In this limit, the potential term acts as a singular perturbation, and the characteristic size of the vortices is $\varepsilon \to 0$; vortices become line-like topological singularities, which makes it easier to extract and describe them.

This model is a U(1)-gauge theory, that is, it is invariant under the gauge transformations:

$$\psi \mapsto \psi e^{i\Phi}, \qquad A \mapsto A + \nabla \Phi \qquad [3]$$

where $\Phi$ is a smooth real-valued function. The physically relevant quantities are those that are gauge invariant, such as the energy $G_\varepsilon$, $|\psi|$, $h$, and the superconducting current $j$.

For more on the model, we refer to the physics literature (e.g., DeGennes (1966) and Tinkham (1996)).

Reductions of the Model

The goal of variational studies of the Ginzburg–Landau model is to relate the energy to the vortices and the applied field. In three dimensions (3D), vortices are filaments, or lines of zeros of the order parameter $\psi$, around which $\psi$ has a nonzero winding number. These are quite delicate to describe in 3D (we will mention some results below), so a simplification that is commonly made consists in reducing to a two-dimensional model.

When reducing to 2D, one assumes that everything is independent of the vertical direction, and that the applied magnetic field is also vertical. The domain $\Omega$ is then a two-dimensional, bounded and (for simplicity) simply connected open set, which is the horizontal section of an infinite vertical cylinder. One can also imagine it represents a thin film.

In 2D, the energy is written the same way:

$$G_\varepsilon(\psi, A) = \frac{1}{2} \int_\Omega |\nabla_A \psi|^2 + \frac{(1 - |\psi|^2)^2}{2\varepsilon^2} + |\operatorname{curl} A - h_{ex}|^2 \qquad [4]$$

where this time $A$ is $\mathbb{R}^2$-valued, and the induced magnetic field $h = \operatorname{curl} A = \partial_1 A_2 - \partial_2 A_1$ is now a real-valued function, which can be taken to be equal to $h_{ex}$ (now a real positive number) in $\mathbb{R}^2 \setminus \Omega$.

The stationary states of the system are the critical points of $G_\varepsilon$, or the solutions of the Ginzburg–Landau equations:

$$-(\nabla_A)^2 \psi = \frac{1}{\varepsilon^2} \psi (1 - |\psi|^2) \quad \text{in } \Omega$$
$$-\nabla^\perp h = \langle i\psi, \nabla_A \psi \rangle \quad \text{in } \Omega \qquad [5]$$
$$h = h_{ex} \quad \text{on } \partial\Omega$$
$$\nu \cdot \nabla_A \psi = 0 \quad \text{on } \partial\Omega$$

where $\nabla^\perp$ denotes $(-\partial_{x_2}, \partial_{x_1})$.

A common simplification consists in suppressing the magnetic field, and thus in studying the simplified energy

$$E_\varepsilon(u) = \frac{1}{2} \int_\Omega |\nabla u|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \qquad [6]$$

where the order parameter is commonly denoted by $u$, and is still complex valued. This energy, which can be seen as a complex analog of the real-valued Allen–Cahn model of phase transitions, has been extensively studied, especially since the work of Bethuel–Brezis–Hélein, where the domain $\Omega$ is assumed to be two dimensional and simply connected. The higher-dimensional case has also been considered.
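As a rough numerical illustration of the simplified energy [6], the following sketch evaluates a finite-difference version of $E_\varepsilon$ on a square grid. The names `gl_energy` and `model_vortex`, the grid on $[-1,1]^2$, and the modulus profile $\tanh(r/\varepsilon)$ are assumptions made for this example; they are not taken from the article.

```python
import numpy as np

def gl_energy(u, h, eps):
    """Discrete version of E_eps(u) = 1/2 * int |grad u|^2 + (1-|u|^2)^2 / (2 eps^2).

    u is a complex 2D array sampled on a uniform grid of spacing h.
    Forward differences approximate the gradient; sums times h^2 approximate integrals.
    """
    dux = (u[1:, :] - u[:-1, :]) / h
    duy = (u[:, 1:] - u[:, :-1]) / h
    grad_term = np.sum(np.abs(dux) ** 2) + np.sum(np.abs(duy) ** 2)
    pot_term = np.sum((1.0 - np.abs(u) ** 2) ** 2) / (2.0 * eps ** 2)
    return 0.5 * (grad_term + pot_term) * h ** 2

def model_vortex(n, d, eps):
    """Hypothetical test configuration rho(r) * exp(i d theta) on [-1, 1]^2.

    The profile rho = tanh(r / eps) is an assumed stand-in for a vortex core
    of size eps; it is not the true minimizing profile.
    """
    x = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    r = np.hypot(X, Y)
    theta = np.arctan2(Y, X)
    return np.tanh(r / eps) * np.exp(1j * d * theta), x[1] - x[0]

if __name__ == "__main__":
    eps = 0.1
    for d in (1, 2):
        u, h = model_vortex(101, d, eps)
        print(d, gl_energy(u, h, eps))
```

In this sketch a constant map of modulus 1 has zero energy, and multiplying $u$ by a constant phase leaves the value unchanged, which is the global part of the gauge invariance discussed for [3].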
Variational Techniques for Ginzburg–Landau Energies 357
Vortices and Critical Fields

We now need to explain more precisely what a vortex is. In two dimensions, a vortex is an object centered at an isolated zero of $u$ (or $\psi$), around which the phase of $u$ has a nonzero winding number called the "degree of the vortex." It is the simplest example of a topological defect. If the zero is located at $x_0$, the winding number or degree is the integer that can be computed by

$$\frac{1}{2\pi} \int_{\partial B(x_0, r)} \frac{\partial \varphi}{\partial \tau} = d \in \mathbb{Z} \qquad [7]$$

where $r$ is small enough, and $\varphi$ is the phase of $u$, that is, $u$ can be written $u = |u| e^{i\varphi}$. For example, the phase $\varphi = d\theta$, where $\theta$ is the polar angle centered at $x_0$, yields a vortex of degree $d$. Observe that the phase $\varphi$ is not a well-defined function: it is multivalued (and defined up to $2\pi$); however, we have the important relation

$$\operatorname{curl} \nabla \varphi = 2\pi \sum_i d_i \delta_{a_i} \qquad [8]$$

where the $a_i$'s are the zeros of $u$, the $d_i$'s the associated degrees, and $\delta_x$ denotes the Dirac mass at $x$.

When $\varepsilon$ is small, it is clear from [4] or [6] that $|u|$ prefers to be close to 1, and a scaling argument hints that $|u|$ is different from 1 in regions of characteristic size $\varepsilon$. Of course this is an intuitive picture and several mathematical notions are used to describe the vortices.

Vortices appear due to the applied field $h_{ex}$. For type-II superconductors there are essentially three critical fields, $H_{c1}$, $H_{c2}$, $H_{c3}$, critical values of $h_{ex}$ for which phase transitions occur. For $h_{ex} \le H_{c1} = O(|\log \varepsilon|)$, there are no vortices and the superconductor is in the superconducting phase $|\psi| \simeq 1$ everywhere. At $H_{c1}$ the first vortices appear, and their number increases as $h_{ex}$ is raised. When they become numerous they tend to arrange in triangular lattices called Abrikosov lattices, as observed in experiments and predicted by Abrikosov from the Ginzburg–Landau model, in a very influential work. At the second critical field $H_{c2} = O(1/\varepsilon^2)$ bulk superconductivity is destroyed, and surface superconductivity remains until $H_{c3} = O(1/\varepsilon^2)$, the third critical field, above which $\psi \equiv 0$ and the material is normal.

Issues and Methods

The variational approach to Ginzburg–Landau consists in expressing the energy in terms of reduced quantities or objects, in particular in terms of the vortices. This requires developing mathematical tools to describe and characterize the vortices (in particular giving some suitable definitions of a "vortex structure" for a given $u$ or $\psi$), and estimating precisely the energetic cost of each vortex and of their interaction. This allows us to obtain results of variational convergence of the energy $G_\varepsilon$, $E_\varepsilon$ (or their variants), that is, to derive $\Gamma$-limits, or "reduced problems" posed in terms of the vortices, which are easier to minimize than the original ones. These limits depend on the regime of applied field, and allow one to characterize, in turn, the critical fields, and the optimal repartition and number of the vortices, if any.

Variational methods also serve to solve some inverse problems, that is, to prove the existence of solutions of the equation which have some given properties, such as a given repartition of vortices, through local minimization procedures, or the use of topological methods based on investigating the topology of the energy levels.

Nonvariational approaches to Ginzburg–Landau are also very useful, in particular to identify the profiles of the solutions, to describe vortices of nonminimizing critical points, or to perform a bifurcation analysis around the normal solution at $H_{c3}$.

The Simplified Model

We first present the variational study of $E_\varepsilon$ [6] in dimension 2, together with the mathematical tools used for both [6] and [4]. We will restrict to the asymptotics $\varepsilon \to 0$, since this is the situation where the most results are known.

Let us present informally the essential ingredients of the analysis.

Tracing the Vortices

The easiest way to trace the vortices is to use the current $\langle iu, \nabla u \rangle$ (or the "superconducting current" $j = \langle i\psi, \nabla_A \psi \rangle$ for the case with magnetic field). Here we recall that $\langle \cdot, \cdot \rangle$ denotes the scalar product in $\mathbb{C}$ as identified with $\mathbb{R}^2$, that is, $\langle iu, \nabla u \rangle = (u \times \partial_1 u, u \times \partial_2 u)$ with $\times$ the vector product in $\mathbb{R}^2$.

The curl of the current is the vorticity of the map $u$, exactly as in fluid mechanics. Writing $u = \rho e^{i\varphi}$ we have (at least formally) $\langle iu, \nabla u \rangle = \rho^2 \nabla \varphi$ and since $\rho = |u|$ is close to 1 (other than in the small vortex regions), we have the approximation

$$\operatorname{curl} \langle iu, \nabla u \rangle = \operatorname{curl}(\rho^2 \nabla \varphi) \simeq \operatorname{curl} \nabla \varphi = 2\pi \sum_i d_i \delta_{a_i} \qquad [9]$$

where the $a_i$'s are the zeros of $u$ (or its vortices) and the $d_i$'s their degrees, or

$$\operatorname{curl} \langle i\psi, \nabla_A \psi \rangle + \operatorname{curl} A \simeq \operatorname{curl} \nabla \varphi$$
in the case with magnetic field. This can be made rigorous (see Jerrard and Soner (2002) and Sandier and Serfaty (to appear)), that is, one can express that

$$\operatorname{curl} \langle iu, \nabla u \rangle - 2\pi \sum_i d_i \delta_{a_i} \to 0 \quad \text{as } \varepsilon \to 0 \qquad [10]$$

(or respectively $\operatorname{curl} \langle i\psi, \nabla_A \psi \rangle + \operatorname{curl} A - 2\pi \sum_i d_i \delta_{a_i} \to 0$) in some weak functional norm, thus giving a rigorous use of [8]. The quantity

$$\mu(u) = \operatorname{curl} \langle iu, \nabla u \rangle \qquad [11]$$

or

$$\mu(\psi, A) = \operatorname{curl} \langle i\psi, \nabla_A \psi \rangle + \operatorname{curl} A = \operatorname{curl} j + h \qquad [12]$$

in the case with magnetic field, will thus be called the vorticity and be used to trace the vortices, in this limit $\varepsilon \to 0$. The relation

$$\mu - 2\pi \sum_i d_i \delta_{a_i} \to 0 \quad \text{as } \varepsilon \to 0 \qquad [13]$$

states that it is close to being a measure.

This is also called the Jacobian determinant if written (with differential forms) $Ju = d\langle iu, du \rangle = \langle i\,du, du \rangle = 2 (u_{x_1} \times u_{x_2})\, dx_1 \wedge dx_2$, and under this form it can be used in higher dimensions.

The Cost of Each Vortex

Here we investigate informally the cost of a vortex of degree $d$. We know already that the characteristic length scale of variation of $u$ is $\varepsilon$, and that $(1 - |u|^2)^2$ is strongly penalized. Thus, we may expect that $|u|$ is close to 1 at a distance $\varepsilon$ from the zeros. Assuming that $x_0$ is a zero of $u$, and taking formally $|u| = 1$ for $|x - x_0| \ge \varepsilon$, we may write $u = e^{i\varphi}$ and $|\nabla u| = |\nabla \varphi|$ for $|x - x_0| \ge \varepsilon$. Then, we have

$$\frac{1}{2} \int_{R \ge |x - x_0| \ge \varepsilon} |\nabla u|^2 \ge \frac{1}{2} \int_\varepsilon^R \left( \int_{\partial B(x_0, r)} \left| \frac{\partial \varphi}{\partial \tau} \right|^2 \right) dr \ge \frac{1}{2} \int_\varepsilon^R \left( \int_{\partial B(x_0, r)} \frac{\partial \varphi}{\partial \tau} \right)^2 \frac{1}{2\pi r} \, dr \qquad [14]$$

$$= \frac{1}{2} \int_\varepsilon^R \frac{4\pi^2 d^2}{2\pi} \frac{dr}{r} = \pi d^2 \log \frac{R}{\varepsilon} \qquad [15]$$

where we have used the Cauchy–Schwarz inequality for [14], and the characterization of the degree [7]. We may also observe that this lower bound is sharp if $\partial \varphi / \partial \tau$ is constant, that is, if the phase is $d\theta$ (and the vortex radial). The cost associated to $|u|$ in the energy imposes the length scale $\varepsilon$ and is generally of order 1 ($|\nabla u| \le C/\varepsilon$), thus negligible compared to the cost associated to the phase, which blows up as $\log 1/\varepsilon$ as $\varepsilon \to 0$.

The above estimate is only valid as long as $B(x_0, R)$ does not contain any other zero of $u$. If vortices get close to each other or become numerous, one needs refined techniques to estimate their cost. This can be done through a "ball-construction method" introduced independently by Jerrard and Sandier.

Evaluating the Total Interaction Cost of Vortices

In a first approach, one studies configurations which satisfy the upper bound $E_\varepsilon(u) \le C |\log \varepsilon|$. Then, lower bounds of the type [15] show that the total sum of the degrees (hence the total number of vortices of nonzero degree) remains bounded as $\varepsilon \to 0$. Up to extraction, we may assume these zeros $a_i$ converge as $\varepsilon \to 0$ to a finite set of points $p_i$, with a total degree still denoted $d_i$. This can also be expressed as $\mu(u_\varepsilon) \to 2\pi \sum_i d_i \delta_{p_i}$ as $\varepsilon \to 0$.

This is not the only case of interest, since unbounded numbers of vortices do arise, especially in the physical situation of the energy with magnetic field, as we will see in the next section. However, this hypothesis, which was made in the work of Bethuel–Brezis–Hélein, makes the analysis easier and already allows us to exhibit the main phenomena.

Vortices in superconductors are generated by the presence of the external magnetic field $h_{ex}$. For the energy without magnetic field, this has to be replaced by some boundary condition which forces some degree. Bethuel–Brezis–Hélein considered the fixed Dirichlet boundary condition $u_\varepsilon = g$ on $\partial\Omega$, where $g$ is a fixed unit-valued map on $\partial\Omega$, of degree $d > 0$. This forces $u$ to have a total degree $d$ in $\Omega$. However, the Neumann boundary condition, for instance, can also be considered (the minimizers of $E_\varepsilon$ are then simply constants, hence trivial, but one can still look for other critical points).

Let us return to lower bounds in order to look for the next order term in the energy (still with formal arguments). Cutting out holes $\bigcup_i B(p_i, \rho)$ of fixed size $\rho$ around the limiting vortices $p_i$, we may assume that $u = e^{i\varphi}$ in $\Omega \setminus \bigcup_i B(p_i, \rho) = \Omega_\rho$, with $\varphi$ a real-valued function, defined modulo $2\pi$. Minimizing the energy outside of the holes amounts to solving

$$\min \left\{ \frac{1}{2} \int_{\Omega_\rho} |\nabla u|^2 \;:\; u: \Omega_\rho \to S^1, \; u = g \text{ on } \partial\Omega, \; \deg(u, \partial B(p_i, \rho)) = d_i \right\}$$
This is a harmonic map problem, whose solution is given in terms of $\varphi$ by

$$\Delta \varphi = 0 \quad \text{in } \Omega_\rho$$
$$\frac{\partial \varphi}{\partial \tau} = \left\langle ig, \frac{\partial g}{\partial \tau} \right\rangle \quad \text{on } \partial\Omega$$
$$\int_{\partial B(p_i, \rho)} \frac{\partial \varphi}{\partial \tau} = 2\pi d_i$$

and in terms of the harmonic conjugate $\Phi$, which is the function (up to a constant) such that $\nabla \varphi = \nabla^\perp \Phi$,

$$\Delta \Phi = 0 \quad \text{in } \Omega_\rho$$
$$\frac{\partial \Phi}{\partial \nu} = \left\langle ig, \frac{\partial g}{\partial \tau} \right\rangle \quad \text{on } \partial\Omega \qquad [16]$$
$$\int_{\partial B(p_i, \rho)} \frac{\partial \Phi}{\partial \nu} = 2\pi d_i$$

As $\rho \to 0$, $\Phi$ behaves like the solution $\Phi_0$ of

$$\Delta \Phi_0 = 2\pi \sum_i d_i \delta_{p_i} \quad \text{in } \Omega$$
$$\frac{\partial \Phi_0}{\partial \nu} = \left\langle ig, \frac{\partial g}{\partial \tau} \right\rangle \quad \text{on } \partial\Omega \qquad [17]$$

Hence, we have

$$\frac{1}{2} \int_{\Omega_\rho} |\nabla \varphi|^2 = \frac{1}{2} \int_{\Omega_\rho} |\nabla \Phi|^2 \simeq \frac{1}{2} \int_{\Omega_\rho} |\nabla \Phi_0|^2 = \pi \sum_i d_i^2 \log \frac{1}{\rho} + W_d(p_1, \ldots, p_n) + o(1) \quad \text{as } \rho \to 0 \qquad [18]$$

where

$$W_d(p_1, \ldots, p_n) = -\pi \sum_{i \ne j} d_i d_j \log |p_i - p_j| - \pi \sum_i d_i R(p_i) + \frac{1}{2} \int_{\partial\Omega} \Phi_0 \left\langle ig, \frac{\partial g}{\partial \tau} \right\rangle \qquad [19]$$

and $R(x) = \Phi_0(x) - \sum_i d_i \log |x - p_i|$. The function $W$ was introduced by Bethuel–Brezis–Hélein and called the renormalized energy, since it consists in the part of the energy that is left after subtracting the "infinite part" in $|\log \varepsilon|$ from $E_\varepsilon$. It contains the (logarithmic) interaction energy between the vortices: we see that vortices with degrees of same sign repel one another while vortices with degrees of opposite signs attract one another. The $\pi d_i^2 \log 1/\rho$ term corresponds to the self-interaction, or cost of the vortex of core of size $\rho$; it is what replaces the infinite term in the formal calculation.

Now [18] is a good estimate for the optimal energy outside of the holes, while the energy in holes of size $\rho$ can be bounded below by [15]. Given the degree $d_i$ on the boundary $\partial B(p_i, \rho)$ of the small hole, $B(p_i, \rho)$ contains one or several zeros of $u$ whose degrees sum to $d_i$. In view of [15], since the cost of a vortex of degree $d$ grows like $d^2 |\log \varepsilon|$, and since the sum of the squares of these degrees, under the constraint that they sum to $d_i$, is smallest when each of them equals $\operatorname{sign}(d_i)$, the least costly way to achieve this is to have $|d_i|$ vortices of degree $\operatorname{sign}(d_i)$. The smallest lower bound possible is thus

$$\frac{1}{2} \int_{B(p_i, \rho)} |\nabla u|^2 + \frac{(1 - |u|^2)^2}{2\varepsilon^2} \ge \pi |d_i| \log \frac{\rho}{\varepsilon} + C \qquad [20]$$

where the constant $C$ can be described explicitly. Adding up the results of [20] and [18], we find

$$E_\varepsilon(u) \ge \pi \sum_i d_i^2 \log \frac{1}{\rho} + \pi \sum_i |d_i| \log \frac{\rho}{\varepsilon} + W_d(p_1, \ldots, p_n) + nC + o_\rho(1) + o_\varepsilon(1)$$
$$\ge \pi \sum_i |d_i| \log \frac{1}{\varepsilon} + W_d(p_1, \ldots, p_n) + nC + o_\varepsilon(1) \qquad [21]$$

with equality only if $u$ has $|d_i|$ zeros of degree $\operatorname{sign}(d_i)$ in each $B(p_i, \rho)$.

This provides a lower bound of the energy in terms of the vortices. Moreover, this bound is sharp: one can construct test configurations which have the given limiting vortices $(p_i, d_i)$, and an energy equal to the right-hand side of [21].

One can thus deduce the behavior of global minimizers of the energy. Given the total degree $d = \deg(g) > 0$ on $\partial\Omega$, we need $\sum_i d_i = d$, and the lowest value achievable under this constraint in the right-hand side of [21] is to have $d_i = 1$ for every $i$, and thus to have exactly $d$ vortices of degree 1. Moreover, the limiting points $p_i$ should minimize $W$. We thus are led to the first main result.

Theorem 1 (Bethuel–Brezis–Hélein). Minimizers of $E_\varepsilon$ under the boundary condition $u = g$, $\deg(g) = d > 0$, have $d$ zeros of degree 1, which converge as $\varepsilon \to 0$ to a minimizer of $W$.

This result can be rephrased as a result of $\Gamma$-convergence of $E_\varepsilon - \pi d |\log \varepsilon|$. It reduces the minimization of $E_\varepsilon$ to one of $W$, which is a finite-dimensional problem (interaction of point charges).
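The degree formula [7] and the leading-order cost [15] can be checked numerically on explicit maps. The sketch below is only an illustration: the function names `degree`, `vortex_cost`, and `two_zeros`, and the choice of sampling the circle at 720 points, are assumptions of this example rather than anything prescribed by the article.

```python
import numpy as np

def degree(u, x0, r, m=720):
    """Winding number of the phase of u along the circle |z - x0| = r, as in [7].

    u maps complex arrays to complex arrays and must not vanish on the circle.
    """
    t = np.linspace(0.0, 2.0 * np.pi, m + 1)      # closed loop: last point equals first
    vals = u(x0 + r * np.exp(1j * t))
    steps = np.angle(vals[1:] / vals[:-1])        # phase increments between samples
    return int(np.rint(np.sum(steps) / (2.0 * np.pi)))

def vortex_cost(d, R, eps):
    """Leading-order lower bound pi * d^2 * log(R / eps) from [15]."""
    return np.pi * d ** 2 * np.log(R / eps)

def two_zeros(z):
    """Hypothetical test map with simple zeros at -0.5 and +0.5."""
    return (z - 0.5) * (z + 0.5)

if __name__ == "__main__":
    print(degree(two_zeros, 0.0, 2.0))    # circle enclosing both zeros
    print(degree(two_zeros, 0.5, 0.2))    # circle enclosing one zero
    print(degree(np.conj, 0.0, 1.0))      # orientation-reversing map
    # One vortex of degree 2 versus two vortices of degree 1:
    print(vortex_cost(2, 1.0, 1e-3), 2 * vortex_cost(1, 1.0, 1e-3))
```

The last comparison reflects the argument leading to [20]: since the cost grows like $d^2$, splitting a degree-2 vortex into two vortices of degree 1 halves the leading-order cost.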
Thus, we see again the interest of studying this asymptotic limit $\varepsilon \to 0$ because the vortices become pointlike and the problem reduces to a finite-dimensional one, or one of minimizing the vortex interaction.

A nonvariational approach also allowed Bethuel–Brezis–Hélein to prove a further correspondence between $E_\varepsilon$ and $W$: they obtained that critical points of $E_\varepsilon$, under the upper bound $E_\varepsilon \le C |\log \varepsilon|$, have vortices which converge to a critical point of $W$. Other important results are the study of the blow-up profiles or solutions in the whole plane, by Brezis–Merle–Rivière and Mironescu.

In two dimensions, the variational approach is also used to solve inverse problems (construct solutions) and study variants of the energy with pinning (or weighted) terms.

The variational approach is also fruitful in higher dimensions. In dimension 3, for example, vortices are not points but vortex lines, and the Jacobian $Ju = d\langle iu, du \rangle$ can be seen as a current carried by the vortex line, with $\|Ju\|$, the total mass of the current, equal to $\pi$ times the length of the line. It was established by Jerrard and Soner that $Ju_\varepsilon$ is compact in some weak sense, and converges, up to extraction, to some $\pi$ times integer-multiplicity rectifiable current $J$, with

$$\liminf_{\varepsilon \to 0} \frac{E_\varepsilon(u_\varepsilon)}{|\log \varepsilon|} \ge \|J\|$$

In fact, a complete $\Gamma$-convergence result of $E_\varepsilon / |\log \varepsilon|$ can be proved (see the work of Alberti–Baldo–Orlandi), and thus minimizing $E_\varepsilon$ reduces at the limit to minimizing the length of the line, leading to straight lines, or in higher dimensions, to codimension-2 minimal currents. This is a nontrivial problem, contrary to dimension 2, where the $\Gamma$-limit of $E_\varepsilon / |\log \varepsilon|$ is trivial, which required going to the lower-order term to find the nontrivial renormalized energy limit $W$.

The Functional with Magnetic Field

The aim here is to achieve the same objective: express or bound from below the energy by terms which depend only on the vortices and their degrees. The method consists in transposing the type of analysis above, taking into account the magnetic field contribution, to see how the external field triggers the sudden appearance of vortices, and for what values they appear (thus retrieving the critical fields, etc.). One of the main difficulties consists in the fact that the number of vortices becomes divergent, which requires more delicate estimates. Also, it is then no longer possible to study the convergence of the individual zeros of $\psi$, so one studies instead the limit of rescalings of the vorticity measures $\mu(\psi, A)$.

Let us recall that in the case with magnetic field, the vorticity is given by [12]. In addition, we may assume that the second set of equations in [5]

$$-\nabla^\perp h = j \quad \text{in } \Omega, \qquad h = h_{ex} \quad \text{on } \partial\Omega \qquad [22]$$

is satisfied (if not, keeping $\psi$ fixed and choosing $A$ which satisfies this equation always decreases the energy). Taking the curl of this equation, we find exactly

$$-\Delta h + h = \mu(\psi, A) \quad \text{in } \Omega, \qquad h = h_{ex} \quad \text{on } \partial\Omega \qquad [23]$$

Thus, the vorticity and the induced magnetic field are in one-to-one correspondence with each other. Combining it with the relation [13], we are led to the approximate relation

$$-\Delta h + h \simeq 2\pi \sum_i d_i \delta_{a_i} \quad \text{in } \Omega, \qquad h = h_{ex} \quad \text{on } \partial\Omega \qquad [24]$$

where again the $a_i$'s are the vortex centers and the $d_i$'s their degrees, well known in physics as the "London equation." It shows how the magnetic field is induced by the vortices which act like "charges," and how the magnetic field "penetrates the sample" around the positive vortex locations. Of course this equation is only an approximation, because the singularities at the $a_i$'s, where $h$ would become infinite, are really smoothed out in $\mu(\psi, A)$; however, the approximation is good far from the vortex cores, just as [17] is an approximation for [16].

It is then natural to introduce the field corresponding to the vortex-free situation, which is $h_{ex} h_0$ where $h_0$ solves

$$-\Delta h_0 + h_0 = 0 \quad \text{in } \Omega, \qquad h_0 = 1 \quad \text{on } \partial\Omega \qquad [25]$$

$h_0$ is thus a fixed smooth function, depending only on $\Omega$, and when there are no vortices, we expect $h$ to be approximately $h_{ex} h_0$. Moreover, $h' := h - h_{ex} h_0$ then solves

$$-\Delta h' + h' = \mu(\psi, A) \simeq 2\pi \sum_i d_i \delta_{a_i} \quad \text{in } \Omega, \qquad h' = 0 \quad \text{on } \partial\Omega \qquad [26]$$
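To get a feeling for the vortex-free profile $h_0$ of [25], one can look at a one-dimensional analog, which is an assumption of this sketch (the article works in a two-dimensional domain $\Omega$): on the interval $(-L, L)$ the problem $-h_0'' + h_0 = 0$, $h_0(\pm L) = 1$ has the explicit solution $\cosh(x)/\cosh(L)$. The function name `solve_h0_1d` and the grid size are hypothetical choices for this example.

```python
import numpy as np

def solve_h0_1d(L=1.0, n=201):
    """Finite-difference solution of -h'' + h = 0 on (-L, L) with h(-L) = h(L) = 1.

    One-dimensional stand-in for [25]; returns the grid x and the values h.
    """
    x = np.linspace(-L, L, n)
    dx = x[1] - x[0]
    m = n - 2                                        # number of interior unknowns
    main = (2.0 / dx ** 2 + 1.0) * np.ones(m)        # diagonal of -d^2/dx^2 + 1
    off = (-1.0 / dx ** 2) * np.ones(m - 1)          # off-diagonals
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    rhs = np.zeros(m)
    rhs[0] += 1.0 / dx ** 2                          # boundary value h(-L) = 1
    rhs[-1] += 1.0 / dx ** 2                         # boundary value h(L) = 1
    h = np.ones(n)
    h[1:-1] = np.linalg.solve(A, rhs)
    return x, h

if __name__ == "__main__":
    x, h = solve_h0_1d()
    exact = np.cosh(x) / np.cosh(1.0)
    print("max error:", np.max(np.abs(h - exact)))
    print("min of h0 - 1:", np.min(h - 1.0), "at x =", x[np.argmin(h)])
```

In this analog $h_0 \le 1$ everywhere and $h_0 - 1$ is most negative at the center of the interval, far from the boundary, which is the kind of point where the first vortices are expected to sit in the discussion of the first critical field.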
Variational Techniques for Ginzburg–Landau Energies 361
Defining the Green kernel $G(\cdot, y)$ by

$$-\Delta G + G = \delta_y \ \text{in } \Omega, \qquad G = 0 \ \text{on } \partial\Omega \qquad [27]$$

and $S$ by $S(x, y) = 2\pi G(x, y) + \log|x - y|$, for $x$ far enough from the $a_i$'s, we may approximate $h'$ by

$$h'(x) = 2\pi \sum_i d_i\, G(x, a_i) \qquad [28]$$

Using the second Ginzburg–Landau equation [22] and the fact that $|\psi| \le 1$, we have $|\nabla_A \psi| \ge |j| = |\nabla h|$, thus $G_\varepsilon(\psi, A) \ge \frac{1}{2}\int_\Omega |\nabla h|^2 + |h - h_{ex}|^2$. Plugging in the decomposition $h = h_{ex} h_0 + h'$ and using an integration by parts and [26], one finds

$$G_\varepsilon(\psi, A) \ge \frac{h_{ex}^2}{2}\int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2 + h_{ex}\int_\Omega \nabla h_0 \cdot \nabla h' + (h_0 - 1)h' + \frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2$$

$$= h_{ex}^2 J_0 + h_{ex}\int_\Omega (h_0 - 1)\,\mu(\psi, A) + \frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2 \qquad [29]$$

where $J_0$ is the constant $\frac{1}{2}\int_\Omega |\nabla h_0|^2 + |h_0 - 1|^2$.

The right-hand side of eqn [29] can be expressed in terms of the vortices. First, using [26], we have $\int_\Omega (h_0 - 1)\,\mu(\psi, A) \simeq 2\pi \sum_i d_i (h_0 - 1)(a_i)$. Second, the expression $\int_\Omega |\nabla h'|^2 + |h'|^2$ can be treated exactly like $E_\varepsilon(u)$ in the previous section. Using lower bounds for the cost of vortices provided by the Jerrard–Sandier method, we are led to the (approximate) relation

$$\frac{1}{2}\int_\Omega |\nabla h'|^2 + |h'|^2 \gtrsim \pi \sum_i |d_i| \log\frac{1}{\varepsilon} - \pi \sum_{i \ne j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [30]$$

Combining this with [29] we find the decomposition

$$G_\varepsilon(\psi, A) \gtrsim h_{ex}^2 J_0 + \pi \sum_i |d_i|\,|\log\varepsilon| + 2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i) - \pi \sum_{i \ne j} d_i d_j \log|a_i - a_j| + \pi \sum_{i,j} d_i d_j S(a_i, a_j) \qquad [31]$$

On the other hand, this inequality is sharp: as before, given vortices $a_i$, one can construct a configuration $(\psi, A)$ for which this is an equality, at leading order.

In that relation, $h_{ex}^2 J_0$ is a fixed energy, the energy of the vortex-free configuration. To it are added the intrinsic cost of each vortex $\pi|d_i||\log\varepsilon|$, the interaction cost between vortices, and the interaction between the vortices and the external field $2\pi h_{ex} \sum_i d_i (h_0 - 1)(a_i)$.

It is then simple, by minimizing the right-hand side with respect to the vortices for a given $h_{ex}$, and observing that $h_0 - 1 \le 0$, to deduce a few basic facts about vortices: vortices of positive degree (and of degree $+1$) are preferred, each vortex costs $\pi|\log\varepsilon|$, and allows a gain of at best an energy $2\pi h_{ex} \max|h_0 - 1|$ when placed at the minimum of $h_0 - 1$. Therefore, vortices become favorable when their cost becomes smaller than the gain, that is, when $h_{ex}$ becomes larger than the "first critical field"

$$H_{c1} \sim \frac{|\log\varepsilon|}{2|\min(h_0 - 1)|} \qquad [32]$$

We have the first main result.

Theorem 2 (Sandier–Serfaty). When $\varepsilon$ is small enough and $h_{ex} \le H_{c1}$, then minimizers of $G_\varepsilon$ have no vortices.

On the other hand, if $h_{ex} \ge H_{c1}$, the vortices cannot all be located at the same minimum point of $h_0 - 1$, because their repulsion $-\pi \sum_{i \ne j} \log|a_i - a_j|$ would be infinite. There is thus a trade-off between their repulsion and the cost for being far from the minimum of $h_0 - 1$. Only if $n$, the number of vortices, is small compared to $h_{ex}$ do the vortices tend to concentrate near the minimum of $h_0 - 1$. If so, then, assuming for simplicity that the minimum of $h_0 - 1$ is achieved at a unique point $p$, and denoting by $Q$ the Hessian of $h_0 - 1$ at $p$, in the relation above $(h_0 - 1)(a_i)$ can be approximated by $\min(h_0 - 1) + \frac{1}{2}Q(a_i - p)$ and thus $G_\varepsilon(\psi, A)$ by

$$G_\varepsilon(\psi, A) \simeq h_{ex}^2 J_0 + \pi n|\log\varepsilon| + 2\pi n h_{ex} \min(h_0 - 1) + \pi h_{ex} \sum_i Q(a_i - p) - \pi \sum_{i \ne j} \log|a_i - a_j| + \pi n^2 S(p, p) \qquad [33]$$

From this relation, optimizing on $\ell$, the characteristic distance to $p$ and characteristic distance between the vortices, we find that $\ell = \sqrt{n/h_{ex}}$ is optimal.

Moreover, optimizing with respect to $n$, we find that $n$ should remain bounded (as $\varepsilon \to 0$) when $h_{ex} \le H_{c1} + O(\log|\log\varepsilon|)$. In that regime, rescaling by setting $x_i = (a_i - p)/\ell$ leads to Theorem 3.
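The scaling $\ell = \sqrt{n/h_{ex}}$ can be illustrated numerically. The following is a minimal sketch, not the authors' method: it keeps only the vortex-dependent part of [33], drops the common factor $\pi$, and assumes $p = 0$ and the model Hessian $Q(x) = |x|^2$. The function names `minimize_vortex_energy` and `separation` are hypothetical, and plain gradient descent is used only because it is easy to read.

```python
import math


def minimize_vortex_energy(n, h_ex, steps=2000):
    """Gradient descent on F(a) = h_ex * sum_i |a_i|^2 - sum_{i != j} log|a_i - a_j|.

    Assumptions (not from the text): p = 0, Q(x) = |x|^2, common factor pi dropped.
    """
    # Start on a circle of radius sqrt(n / h_ex) so that no two points coincide.
    radius = math.sqrt(n / h_ex)
    pts = [[radius * math.cos(2 * math.pi * k / n),
            radius * math.sin(2 * math.pi * k / n)] for k in range(n)]
    lr = 0.1 / h_ex  # step size scaled by the stiffness of the confining term
    for _ in range(steps):
        grads = []
        for i, (xi, yi) in enumerate(pts):
            # Gradient of the confining term h_ex * |a_i|^2.
            gx, gy = 2.0 * h_ex * xi, 2.0 * h_ex * yi
            # Gradient of the repulsion; the sum over ordered pairs gives the factor 2.
            for j, (xj, yj) in enumerate(pts):
                if j != i:
                    dx, dy = xi - xj, yi - yj
                    r2 = dx * dx + dy * dy
                    gx -= 2.0 * dx / r2
                    gy -= 2.0 * dy / r2
            grads.append((gx, gy))
        pts = [[x - lr * gx, y - lr * gy] for (x, y), (gx, gy) in zip(pts, grads)]
    return pts


def separation(pts):
    """Distance between the first two points."""
    return math.hypot(pts[0][0] - pts[1][0], pts[0][1] - pts[1][1])
```

For $n = 2$ the minimizer can be computed by hand under these assumptions: with $a_{1,2} = \pm(r, 0)$ one has $F = 2h_{ex}r^2 - 2\log(2r)$, so $r^2 = 1/(2h_{ex})$ and the separation is $\sqrt{2/h_{ex}} = \sqrt{n/h_{ex}}$, which is what the sketch reproduces.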
Theorem 3 (Sandier–Serfaty). There exist fields $H_n \sim H_{c1} + C(n - 1)\log|\log\varepsilon|$ such that when $H_n \le h_{ex} < H_{n+1}$, minimizers of $G_\varepsilon$ have $n$ vortices of degree 1, and the rescaled vortices $x_i$ tend to minimize:

$$w_n(x_1, \ldots, x_n) = -\pi \sum_{i \ne j} \log|x_i - x_j| + \pi n \sum_{i=1}^{n} Q(x_i) \qquad [34]$$

If $h_{ex} - H_{c1} \gg \log|\log\varepsilon|$, then the optimal number of vortices $n$ becomes unbounded as $\varepsilon \to 0$. The analysis above still holds, but in order to get a convergence of the vortices, one needs to rescale the vorticity measure by $n$. There is an intermediate regime, for $\log|\log\varepsilon| \ll h_{ex} - H_{c1} \ll |\log\varepsilon|$, for which $n$ should be $\gg 1$ but still $n \ll h_{ex}$, so $\ell \ll 1$: vortices are numerous, but still concentrate around $p$. Rescaling by the scale $\ell$ as above, we prove that the density of vortices (after dividing it by $n$) converges to a probability measure, minimizer of the energy

$$I(\mu) = -\pi \int_{\mathbb{R}^2 \times \mathbb{R}^2} \log|x - y|\, d\mu(x)\, d\mu(y) + \pi \int_{\mathbb{R}^2} Q(x)\, d\mu(x) \qquad [35]$$

This is an averaged/continuous form of [34].

If $h_{ex} - H_{c1}$ is of order $|\log\varepsilon|$, then the optimal number $n$ becomes of order $h_{ex}$ and the vortices no longer concentrate around a single point. The simplest approach is then to consider the vorticity measure $\mu(\psi, A)$ and to rescale it by the order $n$, hence by $h_{ex}$. Then $(1/h_{ex})\mu(\psi, A)$ converges, after extraction, to some measure $\mu$. A continuous version of [31] can thus be written, using [12], as

$$G_\varepsilon(\psi, A) \gtrsim \frac{1}{2} h_{ex}|\log\varepsilon| \int_\Omega |\mu| + \frac{1}{2} h_{ex}^2 \int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2 \qquad [36]$$

where $h_\mu$ solves

$$-\Delta h_\mu + h_\mu = \mu \ \text{in } \Omega, \qquad h_\mu = 1 \ \text{on } \partial\Omega$$

Again, this inequality can be proved to be sharp (by a construction) and allows one to show that minimizers of $G_\varepsilon$ have a vorticity $\mu(\psi, A)$ such that $\mu(\psi, A)/h_{ex}$ converges to a minimizer of

$$G(\mu) = \frac{1}{2}\left(\lim_{\varepsilon \to 0} \frac{|\log\varepsilon|}{h_{ex}}\right) \int_\Omega |\mu| + \frac{1}{2}\int_\Omega |\nabla h_\mu|^2 + |h_\mu - 1|^2$$

In fact the stronger result holds, in that sense:

Theorem 4 (Sandier–Serfaty). $G_\varepsilon/h_{ex}^2$ $\Gamma$-converges to $G$.

The limit problem of minimizing $G$ turns out to have a simple solution in terms of an obstacle problem: the optimal $\mu$ is a uniform density of vortices on a subdomain of $\Omega$ determined through a free boundary problem (and depending on $h_{ex}$), which is nonzero.

In all these regimes, we have thus been able to identify the optimal number and distribution of vortices through a $\Gamma$-convergence-type approach, that is, by reducing the minimization of the energy to the minimization of a limiting problem: $w_n$ or $I$ or $G$, according to the regime.

Further Results

Concerning vortices, in the same spirit as what was done for $E_\varepsilon$, we can obtain necessary conditions characterizing limiting vorticities obtained from sequences of (nonminimizing) critical points of the energy $G_\varepsilon$. They consist in passing to the limit in the conservative form of the Ginzburg–Landau equations [5].

Most of the results concerning the phase transitions at the next critical fields $H_{c2}$ and $H_{c3}$ are also obtained by nonvariational methods, and often by linear analysis.

The study of the Ginzburg–Landau energy in non-simply-connected domains is also very interesting because it leads to nontrivial topological effects, since in such domains there exist unit-valued maps with nonzero degree (corresponding to permanent currents).

See also: Abelian Higgs Vortices; Aharonov–Bohm Effect; Bose–Einstein Condensates; Gamma-Convergence and Homogenization; Gauge Theory: Mathematical Applications; Ginzburg–Landau Equation; High Tc Superconductor Theory; Image Processing: Mathematics; Superfluids; Topological Defects and Their Homotopy Classification; Variational Techniques for Microstructures.

Further Reading

Bethuel F, Brezis H, and Hélein F (1994) Ginzburg–Landau Vortices. Boston: Birkhäuser.
DeGennes PG (1966) Superconductivity of Metals and Alloys. New York: Benjamin.
Jerrard RL and Soner HM (2002) The Jacobian and the Ginzburg–Landau energy. Calculus of Variations and Partial Differential Equations 14(2): 151–191.
Sandier E and Serfaty S Vortices in the Magnetic Ginzburg–Landau Model. Birkhäuser (monograph to appear).
Tinkham M (1996) Introduction to Superconductivity, 2nd edn. McGraw-Hill.
Variational Techniques for Microstructures 363
nonlinear elasticity for solid–solid phase transformations based on a huge body of work in the original literature. The precise references can be found in the extensive bibliographies of the books and review articles that are cited in the subsequent section, in particular in Ball (2004), Bhattacharya (2003), Dolzmann (2003), James and Hane (2000), and Müller (1999). This article focuses on models for single crystals; the behavior of polycrystals (which strongly depends on the amount of symmetry breaking in the transformation) was studied by Bhattacharya and Kohn.

The formulation of solid–solid phase transformations via nonlinear continuum theory goes back to Ericksen, and the analysis via tools in the calculus of variations was initiated by Ball and James, Chipot and Kinderlehrer, and Fonseca. The Russian school developed the theory in linear elasticity in the 1960s; see Khachaturyan (1983) for a review. A detailed discussion of the crystallographic and group-theoretical aspects is contained in Pitteri and Zanzotto (2002).

Quasiconvexity was introduced by Morrey (1966) and his results were extended to Carathéodory integrands by Acerbi and Fusco and Marcellini. A modern treatment including Dacorogna's relaxation theorem and a summary of the various notions of convexity and their properties can be found in Dacorogna (1989). Šverák proved that rank-1 convexity does not imply quasiconvexity for $m \ge 3$, and Milton modified his example to show that the rank-1 convex hull of a set can be strictly smaller than its quasiconvex hull. The explicit characterizations for nematic elastomers were obtained by DeSimone and Dolzmann.

Lipschitz solutions to differential inclusions were constructed by Müller and Šverák based on Gromov's concept of convex integration, by Dacorogna and Marcellini using Baire's category argument, and by Kirchheim in the framework of Banach–Mazur games. The structure of solutions of the two-well problem with finite surface energy was analyzed by Dolzmann and Müller. Young measures (also called parametrized measures or chattering controls) were originally introduced as generalized solutions for optimal control problems which do not admit classical solutions (Young 1969). Tartar (1979) introduced Young measures as a fundamental tool for the analysis of oscillation effects in partial differential equations and for the passage from microscopic to macroscopic models. Gradient Young measures were characterized by Kinderlehrer and Pedregal. The four-point configuration was discovered independently in various contexts by several authors including Scheffer, Aumann and Hart, Casadio Tarabusi, Tartar, and Milton and Nesi. The characterization of the quasiconvex hull uses a quasiconvex function constructed by Šverák. The quasiconvex hull of the two-well problem in 3D was found by Ball and James, and the generalization to n wells in 2D by Bhattacharya and Dolzmann.

Acknowledgments

The work of G Dolzmann was supported by the NSF through grants DMS0405853 and DMS0104118.

See also: Gamma-Convergence and Homogenization; Variational Techniques for Ginzburg–Landau Energies.

Further Reading

Ball JM (2004) Mathematical models of martensitic microstructure. Materials Science and Engineering A 378(1–2): 61–69.
Bhattacharya K (2003) Microstructure of Martensite. Oxford: Oxford University Press.
Dacorogna B (1989) Direct Methods in the Calculus of Variations. Berlin: Springer.
Dolzmann G (2003) Variational Methods for Crystalline Microstructure: Analysis and Computation, Lecture Notes in Mathematics, vol. 1803. Berlin: Springer.
James RD and Hane KF (2000) Martensitic transformations and shape-memory materials. Acta Materialia 48: 197–222.
Khachaturyan A (1983) Theory of Structural Transformations in Solids. New York: Wiley.
Morrey CB (1966) Multiple Integrals in the Calculus of Variations. Berlin: Springer.
Müller S (1999) Variational methods for microstructure and phase transitions. Proc. C.I.M.E. Summer School "Calculus of Variations and Geometric Evolution Problems," Cetraro, 1996, Lecture Notes in Mathematics, vol. 1713. Berlin–Heidelberg: Springer.
Pitteri M and Zanzotto G (2002) Continuum Models for Phase Transitions and Twinning in Crystals. London: Chapman and Hall.
Tartar L (1979) Compensated compactness and partial differential equations. In: Knops R (ed.) Nonlinear Analysis and Mechanics: Heriot–Watt Symposium, vol. IV. London: Pitman.
Young LC (1969) Lectures on the Calculus of Variations and Optimal Control Theory. Philadelphia–London–Toronto: Saunders.
Vertex Operator Algebras see Two-Dimensional Conformal Field Theory and Vertex Operator Algebras
Viscous Incompressible Fluids: Mathematical Theory 369
which has the form of the Navier–Stokes equations if and only if the coefficients of the two inertial terms on the left-hand side are equal. That is, if and only if

= [5]

Multiplying this by $w$, integrating over $\Omega$, and integrating by parts, one then obtains

$$\frac{1}{2}\frac{d}{dt}\|w\|^2 + \|\nabla w\|^2 = -(w \cdot \nabla u, w) \qquad [11]$$
where

$$\|w\|^2 = \int_\Omega w^2\, dx, \qquad \|\nabla w\|^2 = \int_\Omega \frac{\partial w_i}{\partial x_j}\frac{\partial w_i}{\partial x_j}\, dx, \qquad (w \cdot \nabla u, w) = \int_\Omega w_j \frac{\partial u_i}{\partial x_j} w_i\, dx$$

since (and this should further explain our notation)

$$(w_t, w) = \int_\Omega w_t \cdot w\, dx = \frac{1}{2}\frac{d}{dt}\int_\Omega w^2\, dx = \frac{1}{2}\frac{d}{dt}\|w\|^2$$

$$-(\Delta w, w) = -\int_\Omega \frac{\partial^2 w_i}{\partial x_j^2} w_i\, dx = \int_\Omega \frac{\partial w_i}{\partial x_j}\frac{\partial w_i}{\partial x_j}\, dx = \|\nabla w\|^2$$

$$(\nabla q, w) = \int_\Omega \frac{\partial q}{\partial x_i} w_i\, dx = -\int_\Omega q \frac{\partial w_i}{\partial x_i}\, dx = 0$$

$$(u \cdot \nabla w, w) = \int_\Omega u_j \frac{\partial w_i}{\partial x_j} w_i\, dx = -\frac{1}{2}\int_\Omega \frac{\partial u_j}{\partial x_j} w_i w_i\, dx = 0$$

and similarly $(w \cdot \nabla w, w) = 0$. In deriving these we have used the fact that the vector fields are divergence free and vanish on the boundary. In the following, we will use such identities without further mention.

We can estimate the nonlinear term on the right-hand side of [11] by using the "Sobolev inequalities"

$$\|\phi\|_4^2 \le \|\phi\|\,\|\nabla\phi\|, \ \text{if } n = 2; \qquad \|\phi\|_4^2 \le \|\phi\|^{1/2}\|\nabla\phi\|^{3/2}, \ \text{if } n = 3 \qquad [12]$$

proved by Ladyzhenskaya (1969), though with larger constants. These are valid for any smooth function $\phi$ which vanishes on the boundary of $\Omega$. It may be either scalar or vector valued. The norms on the left are $L^4$-norms; we use the notation $\|\phi\|_p = (\int_\Omega |\phi|^p\, dx)^{1/p}$ for any $p > 1$, but usually drop the subscript when $p = 2$. Using first Hölder's inequality and then [12], one obtains

$$|(w \cdot \nabla u, w)| \le \|w\|_4^2 \|\nabla u\| \le \begin{cases} \|w\|\,\|\nabla w\|\,\|\nabla u\| & \text{if } n = 2 \\ \|w\|^{1/2}\|\nabla w\|^{3/2}\|\nabla u\| & \text{if } n = 3 \end{cases}$$

Young's inequality

$$ab \le \frac{1}{p}a^p + \frac{1}{q}b^q$$

holds if $a, b > 0$, $p, q > 1$ and $1/p + 1/q = 1$. Taking $a = \sqrt{2}\,\|\nabla w\|$, along with $p = q = 2$ in the two-dimensional case, and $a = (4/3)^{3/4}\|\nabla w\|^{3/2}$, along with $p = 4/3$, $q = 4$ in the three-dimensional case, one obtains

$$|(w \cdot \nabla u, w)| \le \begin{cases} \|\nabla w\|^2 + \frac{1}{4}\|\nabla u\|^2\|w\|^2, & \text{if } n = 2 \\ \|\nabla w\|^2 + \frac{27}{256}\|\nabla u\|^4\|w\|^2, & \text{if } n = 3 \end{cases} \qquad [13]$$

Using these estimates for the right-hand side of [11], we obtain linear differential inequalities for $\|w\|^2$ that are easily integrated to give

$$\|w(t)\|^2 \le \begin{cases} \|w_0\|^2 \exp\left(\int_0^t \frac{1}{2}\|\nabla u\|^2\, d\tau\right), & \text{if } n = 2 \\ \|w_0\|^2 \exp\left(\int_0^t \frac{27}{128}\|\nabla u\|^4\, d\tau\right), & \text{if } n = 3 \end{cases} \qquad [14]$$

It follows that if we can estimate the integrals on the right, which concern only the solution $u$, and if $v$ is a second solution, perhaps differing only slightly from $u$ when $t = 0$, then we can estimate the difference $\|v(t) - u(t)\|$ at later times. Moreover, at any particular time this difference will be bounded proportionally to $\|v(0) - u(0)\|$. The integral on the right-hand side of the two-dimensional version of [14] is easily estimated using the energy estimate [16] below. The estimation of the corresponding integral in the three-dimensional case, without a restriction on the size of the data, remains an open problem. It can be regarded as the most important open problem in the Navier–Stokes theory. It would never be enough to somehow prove that solutions are smooth without estimating this integral, or something equivalent to it. Of course, if solutions were known to be smooth one could infer their uniqueness from [14], since smoothness would imply that the integrals are finite, which is enough to conclude that $\|w(t)\|$ is zero if $\|w_0\|$ is zero.

Energy Estimate

If one multiplies the Navier–Stokes equation for $u$ by $u$, and proceeds as in deriving [11], one obtains

$$\frac{1}{2}\frac{d}{dt}\|u\|^2 + \|\nabla u\|^2 = 0 \qquad [15]$$

and hence

$$\frac{1}{2}\|u(t)\|^2 + \int_0^t \|\nabla u\|^2\, d\tau = \frac{1}{2}\|u_0\|^2 \qquad [16]$$
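The constants $1/4$ and $27/256$ in [13] can be checked by direct arithmetic. The following is a minimal sketch under stated assumptions: the arguments `w`, `gw`, `gu` are hypothetical stand-ins for the numbers $\|w\|$, $\|\nabla w\|$, $\|\nabla u\|$; the second factor $b$ is chosen so that $ab$ equals the product bound obtained from [12]; and the three-dimensional choice of $a$ uses the power $\|\nabla w\|^{3/2}$, which is what the constants require.

```python
import math


def young_rhs(a, b, p):
    """Right-hand side a^p/p + b^q/q of Young's inequality, with 1/p + 1/q = 1."""
    q = p / (p - 1.0)  # conjugate exponent
    return a ** p / p + b ** q / q


def bound_2d(w, gw, gu):
    """n = 2: split w*gw*gu as a*b with a = sqrt(2)*gw, then apply p = q = 2."""
    a = math.sqrt(2.0) * gw
    b = w * gu / math.sqrt(2.0)
    return a * b, young_rhs(a, b, 2.0)


def bound_3d(w, gw, gu):
    """n = 3: split w^(1/2)*gw^(3/2)*gu as a*b with a = (4/3)^(3/4)*gw^(3/2), p = 4/3."""
    a = (4.0 / 3.0) ** 0.75 * gw ** 1.5
    b = (3.0 / 4.0) ** 0.75 * math.sqrt(w) * gu
    return a * b, young_rhs(a, b, 4.0 / 3.0)
```

Each function returns the product $ab$ and the Young bound. Up to rounding, the bound equals `gw**2 + 0.25 * gu**2 * w**2` in two dimensions and `gw**2 + (27 / 256) * gu**4 * w**2` in three. Doubling the coefficient of $\|w\|^2$ gives the rates $1/2$ and $27/128$ in [14].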
The energy identity [16] settles the matter of continuous dependence in the two-dimensional case. Together with [16], the two-dimensional version of [14] implies

$$\|w(t)\|^2 \le \|w_0\|^2 \exp\left(\tfrac{1}{4}\|u_0\|^2\right), \quad \text{if } n = 2 \qquad [17]$$

We remark that the local rate of energy dissipation is $2|Du|^2$ rather than $|\nabla u|^2$, where $Du$ is the stress tensor $Du = \frac{1}{2}(\nabla u + (\nabla u)^T)$. However, integrating over the domain, and integrating by parts using the boundary condition $u|_{\partial\Omega} = 0$, one may verify that the rate of total energy dissipation $2\|Du\|^2$ equals $\|\nabla u\|^2$. For the purpose of this article, it is convenient to write the energy identity as [15].

Estimates for $\|\nabla u(t)\|$ Pointwise in Time

Of course, an estimate for $\|\nabla u(t)\|$ pointwise in time would imply an estimate for the integral of $\|\nabla u(t)\|^4$ on the right-hand side of [14]. We can prove such an estimate for at least a finite interval of time by an argument due to Prodi (1962). It requires, in preparation, some deep results concerning the regularity of solutions of the steady Stokes equations. These cannot be proved here, but we can briefly summarize what will be needed. Let

$L^2(\Omega)$ = space of vector fields $\phi$, with finite $L^2$-norms $\|\phi\|$,
$C_0^\infty(\Omega)$ = space of smooth vector fields with compact support in $\Omega$,
$D(\Omega) = \{\phi \in C_0^\infty(\Omega) : \nabla \cdot \phi = 0\}$,
$J(\Omega)$ = completion of $D(\Omega)$ in the $L^2$-norm $\|\phi\|$,
$J_1(\Omega)$ = completion of $D(\Omega)$ in the norm $\|\nabla\phi\|$,
$G(\Omega) = \{\nabla p : p \in L^2(\Omega) \text{ with } \nabla p \in L^2(\Omega)\}$, and
$P : L^2(\Omega) \to J(\Omega)$ be the $L^2$-projection of $L^2(\Omega)$ onto $J(\Omega)$,

and define the Sobolev $W_2^2(\Omega)$ norm by

$$\|u\|_{W_2^2(\Omega)}^2 = \|u\|^2 + \|\nabla u\|^2 + \int_\Omega \sum \left|\frac{\partial^2 u_i}{\partial x_j \partial x_k}\right|^2 dx$$

Furthermore, observe that $(\nabla p, \phi) = 0$ for $\nabla p \in G(\Omega)$ and $\phi \in J(\Omega)$, since it holds if $p$ is smooth and $\phi \in D(\Omega)$. Therefore, $P\nabla p = 0$, since $(P\nabla p, \phi) = (\nabla p, \phi) = 0$, for all $\phi \in J(\Omega)$. Later, when we need it, we will also argue that $L^2(\Omega) = J(\Omega) \oplus G(\Omega)$.

With these preparations, it is evident that every smooth vector field $u$ satisfying $\nabla \cdot u = 0$ and $u|_{\partial\Omega} = 0$ can be regarded as a solution of the steady Stokes problem

$$-\Delta u + \nabla p = f \ \text{ and } \ \nabla \cdot u = 0 \ \text{in } \Omega, \qquad u|_{\partial\Omega} = 0 \qquad [18]$$

with $f = -P\Delta u$. For such solutions, and hence for all such $u$, we have the estimates

$$\|u\|_{W_2^2(\Omega)} \le c\|P\Delta u\| \qquad [19]$$

and

$$\sup_\Omega |u|^2 \le \begin{cases} c\|u\|\,\|P\Delta u\|, & \text{if } n = 2 \\ c\|\nabla u\|\,\|P\Delta u\|, & \text{if } n = 3 \end{cases} \qquad [20]$$

with constants independent of $u$. It can also be shown that every such vector field $u$ belongs to $J_1(\Omega)$ and hence to $J(\Omega)$; see Heywood (1973).

Some history and remarks are in order. The inequality [19] was proved independently by Solonnikov (1964, 1966), and by Prodi's student Cattabriga (1961). In fact, they gave $L^p$ versions of it for all orders of the derivatives. Several proofs specific to the $L^2$ case needed here have been given by Solonnikov and Ščadilov (1973) and by Beirão da Veiga (1997). The inequalities [20] can be proved by combining [19] with appropriate Sobolev inequalities, or better, by combining [19] with recent inequalities of Xie (1991) which are of precisely the form [20], but with $\Delta u$ instead of $P\Delta u$ on the right-hand side, and without the requirement that $\nabla \cdot u = 0$. The constant $c$ in [19] depends upon the regularity of the boundary, and tends to infinity along with a bound for the boundary curvature. Through the work of Xie (1992, 1997), there is reason to believe that the inequalities [20] are probably valid for arbitrary domains, with the constant $c = (2\pi)^{-1}$ if $n = 2$, and $c = (3\pi)^{-1}$ if $n = 3$. Xie's efforts to prove this have been continued by the author (Heywood 2001). If the inequalities [20] can be proved for arbitrary domains (i.e., arbitrary open sets), with these fixed constants, then the approach to Navier–Stokes theory presented in this article will extend immediately to arbitrary domains, as explained in Heywood and Xie (1997), with estimates independent of the domain.

We go on now with an estimation of $\|\nabla u(t)\|$ based on [20]. Multiplying the Navier–Stokes equation for $u$ by $-P\Delta u$, and integrating over $\Omega$, one obtains

$$\frac{1}{2}\frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 = (u \cdot \nabla u, P\Delta u) \le \sup_\Omega |u|\, \|\nabla u\|\, \|P\Delta u\| \qquad [21]$$

since $-(u_t, P\Delta u) = -(Pu_t, \Delta u) = -(u_t, \Delta u) = (\nabla u_t, \nabla u)$ and $(\nabla p, P\Delta u) = 0$.
The right-hand side of [21] can be estimated using [20] and Young's inequality:

without any restriction on the size of the data. Integrating the second, one obtains a global estimate containing an integral of $\|P\Delta u\|^2$ on the left-hand side.

viewing $u$ at any fixed time as a solution of the steady Stokes equations, we can apply regularity estimates for the Stokes equations to infer that it is $C^\infty$ in all variables throughout $\bar\Omega \times (0, T)$, with specific estimates for each derivative.

The estimates of this section are obtained by integrating an infinite sequence of differential inequalities, for $\|u\|$, $\|\nabla u\|$, $\|u_t\|$, $\|\nabla u_t\|$, $\|u_{tt}\|$, $\|\nabla u_{tt}\|$, .... The first two are [15] and [21], which have already been dealt with. It turns out that after these first two, each succeeding differential inequality is linearized by the estimates obtained from its predecessor, which explains why the time intervals for these additional estimates do not become successively shorter. In fact, in the two-dimensional case, the energy estimate resulting from [15], which is valid for all time, already gives the linearization [23] of [21], which then provides an estimate valid for all time. Except for noting such differences between the two- and three-dimensional cases, we will henceforth deal with only the three-dimensional case.

The differential inequalities just mentioned are obtained by estimating the right-hand sides of two sequences of differential identities, and ordering them by an iteration between the two sequences. The first sequence begins with and is patterned after the energy identity,

$$\frac{1}{2}\frac{d}{dt}\|u\|^2 + \|\nabla u\|^2 = 0$$

$$\frac{1}{2}\frac{d}{dt}\|u_t\|^2 + \|\nabla u_t\|^2 = -(u_t \cdot \nabla u, u_t) \qquad [30]$$

$$\frac{1}{2}\frac{d}{dt}\|u_{tt}\|^2 + \|\nabla u_{tt}\|^2 = -(u_{tt} \cdot \nabla u, u_{tt}) - 2(u_t \cdot \nabla u_t, u_{tt})$$

etc.

while the second begins with and is patterned after Prodi's identity,

$$\frac{1}{2}\frac{d}{dt}\|\nabla u\|^2 + \|P\Delta u\|^2 = (u \cdot \nabla u, P\Delta u)$$

$$\frac{1}{2}\frac{d}{dt}\|\nabla u_t\|^2 + \|P\Delta u_t\|^2 = (u_t \cdot \nabla u, P\Delta u_t) + (u \cdot \nabla u_t, P\Delta u_t) \qquad [31]$$

$$\frac{1}{2}\frac{d}{dt}\|\nabla u_{tt}\|^2 + \|P\Delta u_{tt}\|^2 = (u_{tt} \cdot \nabla u, P\Delta u_{tt}) + \cdots$$

etc.

Before going on, notice that we can return to [22] and use [29] to infer a more complete estimate of the form

$$\|\nabla u(t)\|^2 + \int_0^t \|P\Delta u\|^2\, d\tau \le B(M, t), \quad \text{for } 0 \le t < T \qquad [32]$$

We will use the notation $B(M, t)$ generically, for any bound that depends only on the function $M(t)$ and $t$. We remark that a term $\|u_t\|^2$ can also be included under the integral sign on the left-hand side of [32], because $\|u_t\|$ and $\|P\Delta u\|$ are of essentially the same order, being the leading terms in the projection $u_t + P(u \cdot \nabla u) = P\Delta u$ of the Navier–Stokes equation. Finally, one can also include $\|u\|_{W_2^2(\Omega)}^2$ under the integral sign, in view of [19].

Going on, we obtain a third differential inequality from the second identity of the sequence [30]. Its right-hand side admits the estimate

$$-(u_t \cdot \nabla u, u_t) \le \|u_t\|_4^2 \|\nabla u\| \le c\|u_t\|^{1/2}\|\nabla u_t\|^{3/2}\|\nabla u\| \le \frac{1}{2}\|\nabla u_t\|^2 + c\|\nabla u\|^4\|u_t\|^2 \qquad [33]$$

which, in view of [29] or [32], produces a linear differential inequality with integrable coefficients. Its integration yields an estimate of the form

$$\|u_t(t)\|^2 + \int_0^t \|\nabla u_t\|^2\, d\tau \le B(M, t, \|u_t(0)\|), \quad \text{for } 0 \le t < T \qquad [34]$$

provided $\|u_t(0)\|$ is bounded. Since $u_t = P(\Delta u - u \cdot \nabla u)$, we have the estimate

$$\|u_t(0)\| = \|P(\Delta u_0 - u_0 \cdot \nabla u_0)\| \le \|\Delta u_0 - u_0 \cdot \nabla u_0\| \le B\left(\|u_0\|_{W_2^2(\Omega)}\right) \qquad [35]$$

provided that $u$ is smooth in $\bar\Omega \times [0, T)$. This is a delicate point, having been forewarned of a regularity breakdown at $t = 0$. But we will be able to replicate the estimate [35] for the Galerkin approximations, ultimately validating [34] for the approximations and the solution.

The integration of the next differential inequality, which arises from the second of the identities [31], requires that $\|\nabla u_t(0)\| < \infty$. Similarly to [35], we have

$$\|\nabla u_t(0)\| = \|\nabla P(\Delta u_0 - u_0 \cdot \nabla u_0)\| \le B\left(\|u_0\|_{W_2^3(\Omega)}\right) \qquad [36]$$

provided that $u$ is smooth in $\bar\Omega \times [0, T)$. However, there is a big difference between [35] and [36]. In the next section, we will not be able to obtain an analog of [36] for the Galerkin approximations. Consequently, the solution that is obtained will not be fully regular at time $t = 0$. It will satisfy $u \in C(\bar\Omega \times [0, T)) \cap C^\infty(\bar\Omega \times (0, T))$, but not $u \in C^\infty(\bar\Omega \times [0, T))$. It will satisfy
$\|u(t) - u_0\|_{W_2^2(\Omega)} \to 0$ but not $\|u(t) - u_0\|_{W_2^3(\Omega)} \to 0$, as $t \to 0^+$.

One may wonder whether this is a fault or deficiency in the Galerkin method. It is not, remembering what was said at the beginning of this section. For most prescribed values of $u_0$, no matter how smooth, there is a breakdown in the regularity of the solution as $t \to 0^+$. In fact, it was proved in Heywood and Rannacher (1982) that if $\|\nabla u_t(t)\|$ or any one of several other quantities, including $\|u(t)\|_{W_2^3(\Omega)}$, remains bounded as $t \to 0^+$, then there exists a solution $p_0$ of the overdetermined Neumann problem

account this initial breakdown in the regularity. The continuous dependence estimate [14] meets this requirement. So also do the error estimates given in a series of four papers by Rannacher and the author, beginning with Heywood and Rannacher (1982). They were based on the "smoothing" regularity estimates for solutions that are being presented here. We go on with these now, as models for similar estimates for the Galerkin approximations.

Estimating the right-hand side of the second of the identities [31] using [20] and Young's inequality, and then multiplying through by $t$, we get the linear differential inequality
of the first term on the right-hand side follows from the boundedness of the integral

$$\int_0^t \tau \|u_{tt}\|^2\, d\tau \qquad [42]$$

which, we have pointed out, can be included on the left-hand side of [39]. Finally, notice that the boundedness of the integral [42] implies

$$\limsup_{t \to 0^+} t^2 \|u_{tt}(t)\|^2 = 0 \qquad [43]$$

Therefore, we can integrate [41] to get the estimate

$$t^2\|u_{tt}(t)\|^2 + \int_0^t \tau^2 \|\nabla u_{tt}\|^2\, d\tau \le B\left(M, t, \|u_0\|_{W_2^2(\Omega)}\right), \quad \text{for } 0 \le t < T \qquad [44]$$

analogous to [34].

At this point, we have introduced every device needed to proceed by induction to an infinite sequence of time-weighted estimates, similar to [39] and [44], but with successively higher orders of time derivatives and weights. The dependence of these estimates on $\|u_0\|_{W_2^2(\Omega)}$ was introduced through [34] and [35]. It can be eliminated by beginning the introduction of powers of $t$ as weight functions one step earlier, with the added advantage that the initial velocity $u_0$ needs only belong to $J_1(\Omega)$. In the two-dimensional case, the weight functions can be introduced even another step earlier, with the advantage that the initial velocity $u_0$ needs only belong to $J(\Omega)$. Each of these cases leads to an existence theorem for solutions $u \in C^\infty(\bar\Omega \times (0, T))$, with the initial values assumed in the norms of $J_1(\Omega)$ and $J(\Omega)$, respectively.

Existence by Galerkin Approximation

Let $\{a_1, a_2, \ldots\}$ and $\{\lambda_1, \lambda_2, \ldots\}$ denote the eigenfunctions and eigenvalues of the Stokes equations,

$$-\Delta a_k + \nabla p = \lambda_k a_k, \quad \nabla \cdot a_k = 0 \ \text{in } \Omega, \qquad a_k|_{\partial\Omega} = 0 \qquad [45]$$

chosen to be orthonormal in $L^2(\Omega)$. Clearly, $-P\Delta a_k = \lambda_k a_k$, so they are also the eigenfunctions and eigenvalues of the Stokes operator, $-P\Delta$. Using regularity estimates for the Stokes equations, each eigenfunction is known to be $C^\infty$ smooth in $\bar\Omega$.

The $n$th Galerkin approximation for problem [9] is the solution

$$u^n(x, t) = \sum_{k=1}^{n} c_{kn}(t)\, a_k(x)$$

of the system of ordinary differential equations

$$(u^n_t, a_l) + (u^n \cdot \nabla u^n, a_l) = (\Delta u^n, a_l), \quad \text{for } l = 1, 2, \ldots, n \qquad [46]$$

satisfying the initial conditions $(u^n(0), a_l) = (u_0, a_l)$, for $l = 1, 2, \ldots, n$. Of course, since $(u^n_t, a_l) = \partial c_{ln}/\partial t$ and $(\Delta u^n, a_l) = (P\Delta u^n, a_l) = -\lambda_l c_{ln}$, the differential equations can be written as

$$\frac{d}{dt} c_{ln} = -\sum_{i,j=1}^{n} c_{in} c_{jn} (a_i \cdot \nabla a_j, a_l) - \lambda_l c_{ln}$$

and the initial conditions as $c_{ln}(0) = (u_0, a_l)$, for $l = 1, 2, \ldots, n$.

The system [46] is at least locally solvable, on some interval $[0, T_n)$, with each coefficient satisfying $c_{ln} \in C^\infty[0, T_n)$. Therefore, since the eigenfunctions are also smooth, $u^n$ is $C^\infty$ smooth in $\bar\Omega \times [0, T_n)$. It also satisfies all of the identities [30] and [31] on the interval $[0, T_n)$. Indeed, multiplying [46] by $c_{ln}$ and summing over $l$ from 1 to $n$ has the effect of converting $a_l$ into $u^n$. The resulting identity for $u^n$ leads immediately to the energy identity

$$\frac{1}{2}\frac{d}{dt}\|u^n\|^2 + \|\nabla u^n\|^2 = 0 \qquad [47]$$

The remaining identities in the sequence [30] are obtained similarly. For example, the second is obtained by taking the time derivative of [46], multiplying through by $dc_{ln}/dt$ and summing over $l$. Prodi's identity is obtained by multiplying [46] by $\lambda_l c_{ln}$ and summing, which has the effect of converting $a_l$ into $-P\Delta u^n$. To obtain the second of the identities [31] for $u^n$, one differentiates [46], multiplies by $\lambda_l\, dc_{ln}/dt$ and sums. The remaining identities in the sequence [31] are obtained similarly.

The initial conditions easily imply that $\|u^n(0)\| \le \|u_0\|$, because $u_0 \in J(\Omega)$ and the eigenfunctions are orthogonal and complete in $J(\Omega)$. Therefore, integration of [47] yields the energy estimate

$$\frac{1}{2}\|u^n(t)\|^2 + \int_0^t \|\nabla u^n\|^2\, d\tau \le \frac{1}{2}\|u_0\|^2 \qquad [48]$$

which is uniform in $n$. Since $\|u^n(t)\|$ remains bounded, the solution $u^n(t)$ can be continued for all time. Thus, $T_n = \infty$, for all $n$. Hence, our early working assumption about solutions, that they are smooth in $\bar\Omega \times [0, \infty)$, is actually valid for the Galerkin approximations. The issue becomes one of obtaining estimates for their derivatives that are uniform in $n$. All of the estimates we have proved for solutions are proved in exactly the same way for the approximations. The only possible source of nonuniformity would arise from the initial values of $\|\nabla u^n\|$ and $\|u^n_t\|$.
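The structure of the Galerkin system [46] and of the energy identity [47] can be seen in a much simpler toy analog. The following is a sketch under stated assumptions, not the Stokes setting of this article: it uses the one-dimensional Burgers equation $u_t + u u_x = u_{xx}$ on $(0, \pi)$ with zero boundary values, whose eigenfunctions $a_k(x) = \sqrt{2/\pi}\sin(kx)$ and eigenvalues $\lambda_k = k^2$ stand in for the Stokes eigenpairs. All function names are hypothetical, and the classical Runge–Kutta time stepping is chosen only for illustration.

```python
import math


def trilinear_coeff(i, j, l):
    """Closed form of (a_i * a_j', a_l) on (0, pi) for a_k(x) = sqrt(2/pi) * sin(k x)."""
    delta = int(abs(i - l) == j) - int(i + l == j)
    return (2.0 / math.pi) ** 1.5 * j * (math.pi / 4.0) * delta


def convection(c):
    """Coefficients sum_{i,j} c_i c_j (a_i a_j', a_l) of the projected nonlinear term."""
    n = len(c)
    return [sum(c[i - 1] * c[j - 1] * trilinear_coeff(i, j, l)
                for i in range(1, n + 1) for j in range(1, n + 1))
            for l in range(1, n + 1)]


def rhs(c):
    """Toy analog of [46]: dc_l/dt = -convection_l - lambda_l * c_l, lambda_l = l^2."""
    conv = convection(c)
    return [-conv[l] - (l + 1) ** 2 * c[l] for l in range(len(c))]


def rk4_step(c, dt):
    """One classical fourth-order Runge-Kutta step for the coefficient vector."""
    def shifted(k, scale):
        return [ci + scale * ki for ci, ki in zip(c, k)]
    k1 = rhs(c)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return [ci + dt / 6 * (p + 2 * q + 2 * r + s)
            for ci, p, q, r, s in zip(c, k1, k2, k3, k4)]


def energy(c):
    """(1/2) * ||u^n||^2, using orthonormality of the basis."""
    return 0.5 * sum(x * x for x in c)


def energy_transfer(c):
    """(u^n u^n_x, u^n): multiply the nonlinear term by c_l and sum over l."""
    return sum(cl * vl for cl, vl in zip(c, convection(c)))


def integrate(c0, dt=1e-3, steps=200):
    """Advance the Galerkin coefficients and record the energy after each step."""
    c = list(c0)
    history = [energy(c)]
    for _ in range(steps):
        c = rk4_step(c, dt)
        history.append(energy(c))
    return c, history
```

Multiplying by $c_l$ and summing over $l$ makes the nonlinear contribution `energy_transfer(c)` vanish for every coefficient vector, which is the mechanism behind [47]. The recorded energies then decrease along the computed solution, in line with [48].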
The estimates [24], [26], and [27] are uniform in $n$, since $u_0 \in J_1(\Omega)$ and hence $\|\nabla u^n(0)\| \le \|\nabla u_0\|$, due to the orthogonality of the eigenfunctions in the inner product $(\nabla u, \nabla v)$, and their completeness with respect to functions in $J_1(\Omega)$. We also obtain a uniform bound for $\|u^n_t(0)\|$ of the form [35], by multiplying [46] by $\partial c_{ln}/\partial t$ and summing over $l$. In the last step, we also need the inequality $\|u^n(0)\|_{W_2^2(\Omega)} \le \|u_0\|_{W_2^2(\Omega)}$, which follows from the orthogonality of the eigenfunctions in the inner product $(P\Delta u, P\Delta v)$, and their completeness with respect to functions in $J_1(\Omega) \cap W_2^2(\Omega)$; see Ladyzhenskaya (1969, p. 46). Any attempt to find a bound for $\|\nabla u^n_t(0)\|$ analogous to [36] is certain to fail, as it would lead to a contradiction with aforementioned results from Heywood and Rannacher (1982).

$u^n$, $\nabla u^n$, $\partial^2 u^n/\partial x_i \partial x_j$ imply that $u$ and its time derivatives are time continuous in $W_2^2(\Omega)$. Therefore, $u, u_t, u_{tt}, \ldots$ are classically continuous in $\bar\Omega \times (0, T)$.

Introduction of the Pressure

Because of the strong convergence $u^n \to u$, $\nabla u^n \to \nabla u$, $u^n_t \to u_t$ and the weak convergence $P\Delta u^n \to P\Delta u$, in $L^2(\Omega)$, for any $t > 0$, it is an easy matter to let $n \to \infty$ in [46], obtaining, for all $t > 0$,

$$(u_t, a_l) + (u \cdot \nabla u, a_l) = (\Delta u, a_l), \quad \text{for } l = 1, 2, \ldots \qquad [49]$$

Since the eigenfunctions are complete in $J(\Omega)$, and $D(\Omega) \subset J(\Omega)$, this implies

$$(u_t + u \cdot \nabla u - \Delta u, \phi) = 0, \quad \text{for all } \phi \in D(\Omega) \qquad [50]$$

Therefore, there exists a vector field $\nabla p \in G(\Omega)$ such that

$$u_t + u \cdot \nabla u - \Delta u = -\nabla p \qquad [51]$$

Indeed, the usual test to determine whether a smooth vector field $w$ is conservative in some domain $\Omega$, and therefore representable as a gradient, is to check whether the curve integrals

$$\oint_C w \cdot ds \qquad [52]$$

From our estimates above, we easily conclude that $f \equiv -u_t - u \cdot \nabla u \in W_2^1(\Omega)$. Hence, $u \in W_2^3(\Omega)$. In fact, in view of the regularity we have proven with respect to time, $f \in C^\infty(0, T; W_2^1(\Omega))$ and $u \in C^\infty(0, T; W_2^3(\Omega))$. Thus begins a bootstrapping argument. In the next step, we observe that $f \in C^\infty(0, T; W_2^2(\Omega))$ and conclude that $u \in C^\infty(0, T; W_2^4(\Omega))$. By induction, one obtains $u \in C^\infty(0, T; W_2^k(\Omega))$ for every positive integer $k$. Then well-known Sobolev inequalities imply that $u \in C^\infty(\bar\Omega \times (0, T))$.
Assumption of the Initial Values

We begin by showing that $u(t) \to u_0$, weakly in $L^2(\Omega)$, as $t \to 0^+$. Of course, $\|u(t)\|$ remains bounded as $t \to 0^+$, in virtue of [48], and the eigenfunctions $\{a_l\}$ are complete in $J(\Omega)$. Writing

$$(u(t) - u_0, a_l) = (u(t) - u^n(t), a_l) + (u^n(t) - u^n(0), a_l) + (u^n(0) - u_0, a_l)$$

note that the first and third terms on the right-hand side can be made small by choosing $n$ large. The second can be written as

$$(u^n(t) - u^n(0), a_l) = \int_0^t (u^n_t, a_l)\, d\tau$$

which will be small if $t$ is small, in view of [34]. Thus, $(u(t) - u_0, a_l) \to 0$, as $t \to 0^+$, which implies the desired weak convergence.

The strong convergence $u(t) \to u_0$ in $L^2(\Omega)$ follows from the weak convergence if $\limsup_{t \to 0^+} \|u(t)\| \le \|u_0\|$. The energy estimate [48] for the approximations implies this also.

To conclude that $u(t) \to u_0$ strongly in $J_1(\Omega)$, it only remains to be shown that $\limsup_{t \to 0^+} \|\nabla u(t)\| \le \|\nabla u_0\|$. This readily follows from [29], provided the bounding function $M(t)$ satisfies $M(t) \to \|\nabla u_0\|$, as $t \to 0^+$. The bounding functions provided by our basic estimates [24], [26], and [27] all have this property.

We may conclude that $u(t) \to u_0$ weakly in $W_2^2(\Omega)$, provided $\|u(t)\|_{W_2^2(\Omega)}$ remains bounded as $t \to 0^+$. To see this, remember that $\|P\Delta u\|$ and $\|u_t\|$ are of essentially the same order. Thus the term $\|u_t(t)\|^2$ on the left-hand side of [34] can be accompanied by a term $\|u(t)\|_{W_2^2(\Omega)}^2$.

Finally, to prove that $u(t) \to u_0$ strongly in $W_2^2(\Omega)$, we need only show that $\limsup_{t \to 0^+} \|P\Delta u(t)\| \le \|P\Delta u_0\|$, since $\|P\Delta \cdot\|$ and $\|\cdot\|_{W_2^2(\Omega)}$ are equivalent norms on $J_1(\Omega) \cap W_2^2(\Omega)$. To this end, multiply [46] by $\lambda_l\, dc_{ln}/dt$ and sum to get

$$\frac{1}{2}\frac{d}{dt}\|P\Delta u^n\|^2 + \|\nabla u^n_t\|^2 = (u^n \cdot \nabla u^n, P\Delta u^n_t) = \frac{d}{dt}(u^n \cdot \nabla u^n, P\Delta u^n) - (u^n_t \cdot \nabla u^n + u^n \cdot \nabla u^n_t, P\Delta u^n)$$

Integrating this gives

$$\|P\Delta u^n(t)\|^2 - \|P\Delta u^n(0)\|^2 = \int_0^t \frac{d}{ds}\|P\Delta u^n\|^2\, ds$$

$$= 2(u^n \cdot \nabla u^n, P\Delta u^n)\big|_t - 2(u^n \cdot \nabla u^n, P\Delta u^n)\big|_0 - 2\int_0^t \|\nabla u^n_t\|^2\, ds - 2\int_0^t (u^n_t \cdot \nabla u^n + u^n \cdot \nabla u^n_t, P\Delta u^n)\, ds \qquad [53]$$

For the terms under the last integral we have

$$(u^n_t \cdot \nabla u^n, P\Delta u^n) + (u^n \cdot \nabla u^n_t, P\Delta u^n) \le \|\nabla u^n_t\|^2 + c\|\nabla u^n\|^{1/2}\|P\Delta u^n\|^{3/2}$$

Therefore, [53] implies

$$\|P\Delta u^n(t)\|^2 \le \|P\Delta u^n(0)\|^2 + 2(u^n \cdot \nabla u^n, P\Delta u^n)\big|_t - 2(u^n \cdot \nabla u^n, P\Delta u^n)\big|_0 + Kt$$

uniformly in $n$, as $t \to 0^+$, where $K$ is a constant depending on the estimates [32] and [34]. Letting $n \to \infty$ gives

$$\|P\Delta u(t)\|^2 \le \|P\Delta u(0)\|^2 + 2(u \cdot \nabla u, P\Delta u)\big|_t - 2(u \cdot \nabla u, P\Delta u)\big|_0 + Kt$$

Since $u \cdot \nabla u \to u_0 \cdot \nabla u_0$ strongly in $L^2(\Omega)$, and $P\Delta u \to P\Delta u_0$ weakly in $L^2(\Omega)$, we get the desired result. The continuous assumption of the initial values in $W_2^2(\Omega)$ also implies their continuous assumption in the classical sense, and hence that $u \in C(\bar\Omega \times [0, T))$.

Conclusion

Years ago, mathematical questions concerning the Navier–Stokes equations were usually considered in the context of generalized or weak solutions, which was a technical barrier to many in the scientific community. Nowadays, realizing that solutions are at least locally classical, fundamental questions such as that of global regularity can be studied within the classical context. If the estimate [29] is proved for classical solutions, with $T = \infty$, and without a restriction on the size of the data, this particular matter will be settled.

This work has been supported by the Natural Sciences and Engineering Research Council of Canada.

See also: Compressible Flows: Mathematical Theory; Elliptic Differential Equations: Linear Theory; Incompressible Euler Equations: Mathematical Theory; Interfaces and Multicomponent Fluids; Leray–Schauder Theory and Mapping Degree; Non-Newtonian Fluids; Partial Differential Equations: Some Examples; Stochastic Hydrodynamics; Turbulence Theories; Wavelets: Application to Turbulence.

Further Reading

Beirão da Veiga H (1997) A new approach to the $L^2$-regularity theorem for linear stationary nonhomogeneous Stokes systems. Portugaliae Mathematica 54(Fasc. 3): 271–286.
Cattabriga L (1961) Su un problema al contorno relativo al sistema di equazioni di Stokes. Rendiconti del Seminario Matematico della Università di Padova 31: 308–340.
von Neumann Algebras: Introduction, Modular Theory, and Classification Theory 379
Heywood JG (1976) On uniqueness questions in the theory of viscous flow. Acta Mathematica 136: 61–102.
Heywood JG (1980) The Navier–Stokes equations: on the existence, regularity and decay of solutions. Indiana University Mathematics Journal 29: 639–681.
Heywood JG (1990) Open problems in the theory of the Navier–Stokes equations for viscous incompressible flow. In: Heywood JG, Masuda K, Rautmann R, and Solonnikov VA (eds.) The Navier–Stokes Equations: Theory and Numerical Methods, Lecture Notes in Mathematics, vol. 1431, pp. 1–22. Berlin–Heidelberg: Springer Verlag.
Heywood JG (1994) Remarks concerning the possible global regularity of solutions of the three-dimensional incompressible Navier–Stokes equations. In: Galdi GP, Malek J, and Necas J (eds.) Progress in Theoretical and Computational Fluid Dynamics, Pitman Research Notes in Mathematics Series, vol. 308, pp. 1–32. Essex: Longman Scientific and Technical.
Heywood JG (2001) On a conjecture concerning the Stokes problem in nonsmooth domains. In: Neustupa J and Penel P (eds.) Mathematical Fluid Mechanics: Recent Results and Open Problems, Advances in Mathematical Fluid Mechanics vol. 2, pp. 195–206. Basel: Birkhäuser Verlag.
Heywood JG (2003) A curious phenomenon in a model problem, suggestive of the hydrodynamic inertial range and the smallest scale of motion. Journal of Mathematical Fluid Mechanics 5: 403–423.
Heywood JG and Rannacher R (1982) Finite element approximation of the nonstationary Navier–Stokes problem, Part I: Regularity of solutions and second-order error estimates for the spatial discretization. SIAM Journal on Numerical Analysis 19: 275–311.
Heywood JG and Xie W (1997) Smooth solutions of the vector Burgers equation in nonsmooth domains. Differential and Integral Equations 10: 961–974.
Ladyzhenskaya OA (1969) The Mathematical Theory of Viscous Incompressible Flow, 2nd edn. New York: Gordon and Breach.
Prodi G (1962) Teoremi di tipo locale per il sistema di Navier–Stokes e stabilità delle soluzioni stazionarie. Rendiconti del Seminario Matematico della Università di Padova 32: 374–397.
Solonnikov VA (1964) On general boundary-value problems for elliptic systems in the sense of Douglis–Nirenberg. I. Izvestiya Akademii Nauk SSSR, Seriya Matematicheskaya 28: 665–706.
Solonnikov VA (1966) On general boundary-value problems for elliptic systems in the sense of Douglis–Nirenberg. II. Trudy Matematiceskogo Instituta Imeni V.A. Steklova 92: 233–297.
Solonnikov VA and Sčadilov VE (1973) On a boundary value problem for a stationary system of Navier–Stokes equations. Proceedings. Steklov Institute of Mathematics 125: 186–199.
Xie W (1991) A sharp pointwise bound for functions with $L^2$ Laplacians and zero boundary values on arbitrary three-dimensional domains. Indiana University Mathematics Journal 40: 1185–1192.
Xie W (1992) On a three-norm inequality for the Stokes operator in nonsmooth domains. In: Heywood JG, Masuda K, Rautmann R, and Solonnikov VA (eds.) The Navier–Stokes Equations II: Theory and Numerical Methods, Lecture Notes in Mathematics, vol. 1530, pp. 310–315. Berlin–Heidelberg: Springer Verlag.
Xie W (1997) Sharp Sobolev interpolation inequalities for the Stokes operator. Differential and Integral Equations 10: 393–399.
group were intensively studied by Krieger (1970, 1976). We shall simply refer to such factors as being "Krieger factors." The term "Krieger factor" is actually used for factors obtained from a slightly more general construction, with ergodic group actions replaced by more general ergodic equivalence relations. Since there is no difference in the two notions at least in good (amenable) cases, we will say no more about this.

Abstract von Neumann Algebras

So far, we have described matters as they were in von Neumann's time. To come to the modern era, it is desirable to "free a von Neumann algebra from the ambient Hilbert space" and to regard it as an abstract object in its own right which can act on different Hilbert spaces – for example, $L^\infty(\Omega, \mu)$ is an object worthy of study in its own right, without reference to $L^2(\Omega, \mu)$.

The abstract viewpoint is furnished by a theorem of Sakai (1983); let us define an abstract von Neumann algebra to be an abstract $C^*$-algebra (this is a Banach algebra with an involution related to the norm by the so-called $C^*$-identity $\|x\|^2 = \|x^*x\|$) M which admits a pre-dual $M_*$ – i.e., M is isometrically isomorphic to the Banach dual space $(M_*)^*$. It turns out that a predual of such an abstract von Neumann algebra is unique up to isometric isomorphism. Consequently, an abstract von Neumann algebra comes equipped with a canonical "weak$^*$-topology," usually called the "$\sigma$-weak topology" on M. The natural morphisms in the category of abstract von Neumann algebras are $*$-homomorphisms which are continuous with respect to $\sigma$-weak topologies on domain and range. It is customary to call a linear map between abstract von Neumann algebras "normal" if it is continuous with respect to $\sigma$-weak topologies on domain and range.

The equivalence of the "abstract" definition of this section, with the "concrete" one given earlier (which depends on an ambient Hilbert space), relies on the following four facts:

1. $L(H)$ is an abstract von Neumann algebra, with the predual $L(H)_*$ being the so-called "trace class" of operators, equipped with the "trace norm."
2. A self-adjoint subalgebra of $L(H)$ is closed in the strong operator topology, and is hence a "concrete von Neumann algebra" precisely when it is closed in the $\sigma$-weak topology on $L(H)$.
3. If M is an abstract von Neumann algebra, and N is a $*$-subalgebra of M which is closed in the $\sigma$-weak topology of M, then N is also an abstract von Neumann algebra, with one candidate for $N_*$ being $M_*/N_\perp$ (where $N_\perp = \{\rho \in M_* : n(\rho) = 0\ \forall n \in N\}$).
4. Any abstract von Neumann algebra (with separable predual) is isomorphic (in the category of abstract von Neumann algebras) to a (concrete) von Neumann subalgebra of $L(H)$ (for a separable H).

With the abstract viewpoint available, we shall look for modules over a von Neumann algebra M, meaning pairs $(H, \pi)$ where $\pi : M \to L(H)$ is a normal $*$-homomorphism.

A brief digression into the proof of fact (4) above – which asserts the existence of faithful M-modules – will be instructive and useful. Suppose M is an abstract von Neumann algebra. A linear functional $\varphi$ on M is called a normal state if:

(positivity) $\varphi(x^*x) \ge 0\ \forall x \in M$;
(normality) $\varphi : M \to \mathbb{C}$ is normal; and
(normalization) $\varphi(1) = 1$.

(Normal states on $L^\infty(\Omega, \mu)$ correspond to non-negative probability measures on $\Omega$ which are absolutely continuous with respect to $\mu$.) It is true that there exist plenty of normal states on M. In fact, they linearly span $M_*$. This implies that if $M_*$ is separable, then there exist normal states $\varphi$ on M which are even "faithful" – meaning $\varphi(x^*x) = 0 \Leftrightarrow x = 0$.

Fix a faithful normal state $\varphi$ on M. (Consistent with our convention about separable H's, we shall only consider M's with separable preduals.) The well-known "Gelfand–Naimark–Segal" construction then yields a faithful M-module which is usually denoted by $L^2(M, \varphi)$ – motivated by the fact that if $M = L^\infty(\Omega, \mu)$, and $\varphi(f) = \int f\, d\nu$, with $\nu$ a probability measure mutually absolutely continuous with respect to $\mu$, then $L^2(M, \varphi) = L^2(\Omega, \nu)$ with $L^\infty(\Omega, \mu)$ acting as multiplication operators. The construction mimics this case: the assumptions on $\varphi$ ensure that the equation

\[
\langle x, y \rangle = \varphi(y^* x)
\]

defines a positive-definite inner product on M; let $L^2(M, \varphi)$ be the Hilbert space completion of M. It turns out that the operator of left-multiplication by an element of M extends as a bounded operator to $L^2(M, \varphi)$, and it then follows easily that $L^2(M, \varphi)$ is indeed a faithful M-module, thereby establishing fact (4) above.

Since we wish to distinguish between elements of the dense subspace M of $L^2(M, \varphi)$ and the operators of left-multiplication by members of M, let us write $\hat{x}$ for an element of M when thought of as an element of $L^2(M, \varphi)$, and $x$ for the operator of left-multiplication by $x$; thus, for instance, $\hat{x} = x\hat{1}$, $x\hat{y} = \widehat{xy}$, $\langle x\hat{1}, \hat{1} \rangle = \varphi(x)$, etc.

elements of M – then the one-parameter subgroup $\{\pi(\sigma_t^{\varphi}) : t \in \mathbb{R}\}$ of Out(M) is independent of $\varphi$.
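As a finite-dimensional illustration of the Gelfand–Naimark–Segal construction described above, one may take M to be the algebra of 2 × 2 complex matrices and a state of the form x ↦ Tr(ρx), where ρ is a positive-definite density matrix of trace 1; in finite dimensions every such functional is a faithful normal state. The Python sketch below is an illustration added here and is not part of the original article: the names `DENSITY`, `phi`, and `inner`, and the particular choice of ρ, are assumptions made for the example. It evaluates the state, the inner product given by the state applied to y*x, and the vector state of the identity.

```python
# Toy GNS data for M = 2x2 complex matrices (illustrative assumption only).

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

# Hypothetical density matrix: positive definite with trace 1,
# so the state below is faithful.
DENSITY = [[0.25, 0.0], [0.0, 0.75]]
IDENTITY = [[1, 0], [0, 1]]

def phi(x):
    """The state x -> Tr(DENSITY * x)."""
    return trace(matmul(DENSITY, x))

def inner(x, y):
    """GNS inner product <x, y> = phi(y* x)."""
    return phi(matmul(adjoint(y), x))

if __name__ == "__main__":
    x = [[1, 2j], [0, 1]]
    print("phi(1)        =", phi(IDENTITY))
    print("<x, x>        =", inner(x, x))
    # Left multiplication by x applied to the vector of the identity,
    # paired with that vector, recovers phi(x).
    print("<x.1, 1>      =", inner(matmul(x, IDENTITY), IDENTITY))
    print("phi(x)        =", phi(x))
```

Here the Hilbert space completion is simply the four-dimensional space of 2 × 2 matrices with this inner product, so no completion step is needed; the infinite-dimensional case treated in the text is where completion matters.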
with, he established the equivalence of several (seemingly quite disparate) requirements on a von Neumann algebra $M \subseteq L(H)$ – ranging from injectivity (meaning the existence of a projection of norm 1 from $L(H)$ onto M) to "approximate finite dimensionality" (meaning $M = (\bigcup_n A_n)''$ for some increasing sequence $A_1 \subseteq A_2 \subseteq \cdots \subseteq A_n \subseteq \cdots$ of finite-dimensional $*$-subalgebras). In the same paper, Connes (1976) essentially finished the complete classification of injective factors. Only the injective III$_1$ factor withstood his onslaught; but eventually even it had to surrender to the technical virtuosity of Haagerup (1987) a few years later!

In the language we have developed thus far, the classification of injective factors may be summarized as follows:

Every injective factor is isomorphic to a Krieger factor.

Up to isomorphism, there is a unique injective factor of each type with the solitary exception of III$_0$.

Injective factors of type III$_0$ are classified (up to isomorphism) by an invariant of an ergodic-theoretic nature called the "flow of weights"; unfortunately, coming up with a crisp description of this invariant, which is simultaneously accessible to the nonexpert and is consistent with the stipulated size of this survey, is beyond the scope of this author.

The interested reader is invited to browse through one of the books (Connes 1994, Sunder 1986, Dixmier 1981) for further details; the third book is the oldest (a classic but the language has changed a bit since it was written), the second is more recent (but quite sketchy in many places), and the first is clearly the best choice (if one has the time to read it carefully and digest it). Alternatively, the interested reader might want to browse through the encyclopedic treatments (Kadison and Ringrose) or (Takesaki).

See also: Algebraic Approach to Quantum Field Theory; Bicrossproduct Hopf Algebras and Noncommutative Spacetime; Braided and Modular Tensor Categories; C*-Algebras and Their Classification; Ergodic Theory; Finite-Type Invariants; Hopf Algebra Structure of Renormalizable Quantum Field Theory; Hopf Algebras and q-Deformation Quantum Groups; The Jones Polynomial; Knot Theory and Physics; Noncommutative Geometry and the Standard Model; Noncommutative Tori, Yang–Mills and String Theory; Positive Maps on C*-Algebras; Quantum 3-Manifold Invariants; Quantum Entropy; Tomita–Takesaki Modular Theory; von Neumann Algebras: Subfactor Theory.

Further Reading

Connes A (1973) Une classification des facteurs de type III. Annales Scientifiques de l'Ecole Normale Superieure 6: 133–252.
Connes A (1976) Classification of injective factors. Annals of Mathematics 104: 73–115.
Connes A (1994) Non-commutative Geometry. San Diego: Academic Press.
Dixmier J (1981) von Neumann Algebras. Amsterdam: North Holland.
Haagerup U (1987) Connes' bicentralizer problem and uniqueness of the injective factor of type III$_1$. Acta Mathematica 158: 95–148.
Kadison RV and Ringrose JR (1983/1986) Fundamentals of the Theory of Operator Algebras, vol. I–IV. New York: Academic Press.
Krieger W (1970) On the Araki–Woods asymptotic ratio set and non-singular transformations of a measure space. In: Contributions to Ergodic Theory and Probability, pp. 158–177. Lecture Notes in Mathematics, vol. 160, Springer-Verlag.
Krieger W (1976) On ergodic flows and the isomorphism of factors. Mathematische Annalen 223: 19–70.
Murray FJ and von Neumann J (1936) Rings of operators. Annals of Mathematics 37: 116–229.
Murray FJ and von Neumann J (1937) On rings of operators, II. Transactions of the American Mathematical Society 41: 208–248.
Murray FJ and von Neumann J (1943) On rings of operators, IV. Annals of Mathematics 44: 716–808.
Sakai S (1983) C*-algebras and W*-algebras. Berlin–New York: Springer-Verlag.
Sunder VS (1986) An Invitation to von Neumann Algebras. New York: Springer-Verlag.
Takesaki M (1970) Tomita's Theory of Modular Hilbert Algebras and its Applications, Lecture Notes in Mathematics 128, Berlin–Heidelberg–New York: Springer-Verlag.
Takesaki M (1979/2003) Theory of Operator Algebras, vols. I–III. Heidelberg: Springer Verlag.
von Neumann J (1936) On rings of operators, III. Annals of Mathematics 37: 111–115.
von Neumann J (1949) On rings of operators. Reduction theory. Annals of Mathematics 50: 401–485.
von Neumann Algebras: Subfactor Theory 385
where $B(H)$ denotes the set of all the bounded linear operators on H. (We are mostly interested in separable, infinite-dimensional Hilbert spaces. A von Neumann algebra is automatically closed also in the norm topology and thus it is also a $C^*$-algebra.) By definition, a factor M acts on a certain Hilbert space H, but we also consider its action on another Hilbert space K, that is, a $\sigma$-weakly continuous homomorphism preserving the $*$-operation from M into $B(K)$. A subfactor is a factor N which is contained in another factor M and has the same identity. A factor is classified into types I$_n$ ($n = 1, 2, 3, \ldots$), I$_\infty$, II$_1$, II$_\infty$, and III. In most of the interesting studies of subfactors, the two factors are of both type II$_1$ or both type III. A factor M is said to be of type II$_1$ if it is infinite dimensional and has a finite trace $\mathrm{tr} : M \to \mathbb{C}$. By definition, a finite trace tr is a linear functional on M satisfying $\mathrm{tr}(1) = 1$, $\mathrm{tr}(xy) = \mathrm{tr}(yx)$ for all $x, y \in M$, and $\mathrm{tr}(x^*x) \ge 0$ for all $x \in M$. When a factor M, not isomorphic to $\mathbb{C}$, acts on a separable Hilbert space, it is of type III if and only if for any two nonzero projections $p, q \in M$, we have an operator $v \in M$ with $vv^* = p$ and $v^*v = q$. One obviously cannot have a trace on such a factor. (See Takesaki (2002, 2003) for a general theory on factors.)

Let M be a type II$_1$ factor acting on a Hilbert space H. We then have the coupling constant of Murray and von Neumann, which is denoted by $\dim_M H$ and belongs to $(0, \infty]$. This measures the relative dimension of H with respect to M. Note that the factor M acts on M itself by the left multiplication. We introduce an inner product on M by $(x, y) = \mathrm{tr}(y^*x)$ and denote the completion by

and all the values in this set are indeed realized.

Suppose we have a II$_1$ factor M and an action of an at most countable, discrete group G on M, that is, a homomorphism $\alpha : G \to \mathrm{Aut}(M)$, where $\mathrm{Aut}(M)$ is the automorphism group of M. Then we have a construction $M \rtimes_\alpha G$, called the crossed product. If $\alpha_g$ is not an inner automorphism of M for any $g \in G$ other than the identity element of G, then $M \rtimes_\alpha G$ is also a type II$_1$ factor. (An automorphism $\alpha$ of M is said to be inner if it is of the form $\alpha(x) = uxu^*$ for some unitary operator $u \in M$.) The index of a subfactor $M \subset M \rtimes_\alpha G$ is the order of G, which can be infinite. If we have a subgroup H of G, then we obtain a subfactor $M \rtimes_\alpha H \subset M \rtimes_\alpha G$ and its index is given by the index $[G : H]$ of the subgroup H. This analogy to the index of a subgroup is the origin of the terminology of the Jones index for a subfactor. The Jones index is also analogous to the degree of an extension of a field. From the viewpoint of this analogy, subfactor theory can be regarded as a certain generalized analogue (or the "quantum" version) of the classical Galois theory for field extensions. (The direct analog of the classical Galois correspondence for subfactors was studied by Nakamura–Takeda in the early days, and Izumi–Longo–Popa gave the most general form.)

The tools Jones (1983) has introduced to study subfactors are as follows. Let $N \subset M$ be a subfactor of type II$_1$ with finite Jones index. We consider the actions of N, M on $L^2(M)$. The completion of N with respect to the inner product given by the trace gives $L^2(N)$, which is naturally regarded as a closed subspace of $L^2(M)$. Let $e_N$ be the projection on $L^2(M)$ onto $L^2(N)$, which is called the Jones
projection. We define $M_1$ to be the von Neumann algebra generated by M and $e_N$ on $L^2(M)$. This is again a type II$_1$ factor and denoted by $M_1$. This construction is called the basic construction. We obtain $[M_1 : M] = [M : N]$. Repeat the same procedure for $M \subset M_1$ acting on $L^2(M_1)$ this time. In this way, we have an increasing sequence of type II$_1$ factors,

\[
N \subset M \subset M_1 \subset M_2 \subset M_3 \subset \cdots
\]

which is called the Jones tower. We label the corresponding Jones projections as $e_1 = e_N$, $e_2 = e_M$, $e_3 = e_{M_1}, \ldots$. We then have the following celebrated Jones relations:

\[
e_j e_k = e_k e_j \quad \text{if } |j - k| > 1, \qquad
e_j e_{j \pm 1} e_j = \frac{1}{[M : N]}\, e_j \qquad [2]
\]

Jones proved the above-mentioned restriction on the possible values of the Jones index using these relations. The realization of the index values below 4 in the set [1] by Jones also relies on these relations of the Jones projection. The basic construction is also possible for the other direction. That is, we can construct a subfactor $N_1 \subset N$ so that $N \subset M$ is the basic construction of $N_1 \subset N$. This is called the downward basic construction. This $N_1$ is not unique, but is unique up to an inner automorphism of N.

A subfactor $N \subset M$ is said to be irreducible if the relative commutant $N' \cap M$ is equal to $\mathbb{C}$. If a subfactor has Jones index less than 4, then it is automatically irreducible. The original realization of the Jones index values above 4 by Jones was through reducible subfactors. Popa proved that all the values above 4 are realized with irreducible subfactors. A factor is said to be hyperfinite if it has a dense subalgebra given as the union of increasing sequence of finite-dimensional $*$-algebras. If M is a hyperfinite type II$_1$ factor, then its subfactor is automatically also hyperfinite by a deep theorem of Connes. For hyperfinite, irreducible type II$_1$ subfactors, it is still an open problem to determine all the possible values of the Jones index.

For type II$_1$ factors $N \subset M \subset P$, the Jones index $[P : N]$ is equal to the product $[P : M][M : N]$. Thus for the Jones tower, we have $[M_k : N] = [M : N]^{k+1}$. In general, if a subfactor $N \subset M$ has a finite Jones index, then the relative commutant $N' \cap M$ is automatically finite dimensional. So, if we start with a type II$_1$ subfactor $N \subset M$ with finite Jones index, we have an increasing sequence of finite-dimensional algebras as follows:

\[
N' \cap M \subset N' \cap M_1 \subset N' \cap M_2 \subset N' \cap M_3 \subset \cdots \qquad [3]
\]

These finite-dimensional algebras are called higher relative commutants of $N \subset M$. We draw the Bratteli diagram for the higher relative commutants as follows. Consider $N' \cap M_k$ (with convention $M_{-1} = N$, $M_0 = M$), then it is a finite-dimensional $*$-algebra; thus, it is of the form $\bigoplus_j M_{n_j}(\mathbb{C})$, where we have only finitely many direct summands. We draw a dot for each summand. We similarly draw a dot for each summand in $\bigoplus_l M_{m_l}(\mathbb{C})$ for $N' \cap M_{k+1}$. Let $\iota$ be the inclusion map from $N' \cap M_k = \bigoplus_j M_{n_j}(\mathbb{C})$ to $N' \cap M_{k+1} = \bigoplus_l M_{m_l}(\mathbb{C})$ and $p_l$ the identity of $M_{m_l}(\mathbb{C})$, which is a projection in $N' \cap M_{k+1}$. We denote by $\lambda_{jl}$ the multiplicity of the embedding map $x \mapsto \iota(x)p_l$ from $M_{n_j}(\mathbb{C})$ to $M_{m_l}(\mathbb{C})$. Then we draw $\lambda_{jl}$ edges from the jth dot for $M_{n_j}(\mathbb{C})$ to the lth dot for $M_{m_l}(\mathbb{C})$. We repeat this procedure for all k, and get a picture as in Figure 1, which is called the Bratteli diagram of the higher relative commutants of $N \subset M$.

It turns out that the edges connecting the kth and (k + 1)th steps of the Bratteli diagram consist of the reflection of those connecting the (k − 1)th and kth steps, and a (possibly empty) new part. The "new" parts taken altogether in the above Bratteli diagram constitute the principal graph of a subfactor $N \subset M$. In the example of Figure 1, the principal graph is the Dynkin diagram $A_5$. In general, a principal graph can be finite or infinite. If it is finite, we say that a subfactor is of finite depth. If a subfactor has the Jones index less than 4, it is automatically of finite depth and the principal graph must be one of the A–D–E Dynkin diagrams.

Pimsner and Popa (1986) obtained the characterization of the Jones index value in terms of the Pimsner–Popa inequality for a conditional expectation. This can be used as a definition of the index for a subfactor of arbitrary type (and even for $C^*$-subalgebras). Kosaki obtained a definition of the index for type III subfactors based on works of Connes and Haagerup.

Figure 1 The Bratteli diagram of the higher relative commutants. (Rows, from top to bottom: $N' \cap N$, $N' \cap M$, $N' \cap M_1$, $N' \cap M_2$, $N' \cap M_3$, $N' \cap M_4$.)
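The numerical content of the index statements and of the Bratteli diagram bookkeeping can be sketched in a few lines of Python. The block below is an illustration added here, not part of the original article. It assumes the standard form of Jones's theorem, in which the index values below 4 are exactly 4cos²(π/n) for n = 3, 4, 5, ... (presumably the set the text cites as [1], which is not reproduced in this section), and it uses a hypothetical multiplicity matrix rather than the data of Figure 1. All function names are invented for the example.

```python
import math

def index_values_below_four(n_max):
    """Values 4*cos^2(pi/n) for n = 3, ..., n_max (assumed form of set [1])."""
    return [4.0 * math.cos(math.pi / n) ** 2 for n in range(3, n_max + 1)]

def tower_index(index, k):
    """[M_k : N] = [M : N]^(k+1), by multiplicativity of the Jones index."""
    return index ** (k + 1)

def next_level_sizes(sizes, multiplicity):
    """Matrix sizes m_l of the next level of a Bratteli diagram.

    sizes[j] is n_j, and multiplicity[j][l] is the number of edges from the
    j-th dot to the l-th dot.  For a unital inclusion of finite-dimensional
    *-algebras, m_l = sum_j multiplicity[j][l] * n_j.
    """
    n_targets = len(multiplicity[0])
    return [
        sum(multiplicity[j][l] * sizes[j] for j in range(len(sizes)))
        for l in range(n_targets)
    ]

if __name__ == "__main__":
    # First few admissible index values below 4.
    print(index_values_below_four(6))
    # Index of M_3 over N when [M : N] = 2.
    print(tower_index(2, 3))
    # Hypothetical step: C + C included in M_2(C) + C.
    print(next_level_sizes([1, 1], [[1, 0], [1, 1]]))
```

The last call models a single step of a diagram in which the first summand maps only into the first target summand and the second summand maps into both; the dimension count is the only thing checked, not whether such a diagram arises from an actual subfactor.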
See Evans and Kawahigashi (1998) and Goodman et al. (1989) for these constructions and classifications. Evans–Kawahigashi and Xu studied the orbifold construction of subfactors applied to the Hecke algebra subfactors of Wenzl.

In a theory of integrable lattice models, we have squares with labeled edges, and we assign complex numbers to them. A paragroup has much formal similarity to such a lattice model, and the paragroups of subfactors of Jones and Wenzl correspond to the lattice models of Andrews–Baxter–Forrester.

Goodman–de la Harpe–Jones have another construction of subfactors from the Dynkin diagrams, and for $E_6$ this gives a hyperfinite type II$_1$ subfactor with Jones index $3 + \sqrt{3}$ and finite depth. Haagerup has made a combinatorial study on type II$_1$ subfactors with Jones index values between 4 and $3 + \sqrt{3}$ and obtained a list of candidates of possible higher relative commutants. Haagerup himself showed one in the list with Jones index $(5 + \sqrt{13})/2$ is indeed realized. Asaeda–Haagerup showed that another in the list having the Jones index $(5 + \sqrt{17})/2$ is also realized. These two examples are still among the most mysterious examples of subfactors today and do not seem to arise from other constructions using quantum groups or conformal field theory. Izumi has another construction of a subfactor with the Jones index $(7 + \sqrt{29})/2$ using an endomorphism of the Cuntz algebra.

Popa has obtained a complete characterization of higher relative commutants including the case of infinite depth, and axiomatized the higher relative commutant as the standard $\lambda$-lattices. Xu has constructed standard $\lambda$-lattices, hence subfactors, from quantum groups. This realization of Popa of a given standard $\lambda$-lattice produces a nonhyperfinite type II$_1$ subfactor. Popa–Shlyakhtenko later showed that any standard $\lambda$-lattice is realized for a subfactor of a single type II$_1$ factor, a group II$_1$ factor arising from the free group $F_\infty$ having countably many generators, which is not hyperfinite.

Jones (1999) has introduced a combinatorial characterization of standard $\lambda$-lattices as planar algebras. This approach uses planar operads based on tangles and provides a new viewpoint on the structure of higher relative commutants. More studies on planar algebras have been done by Bisch–Jones.

Topological Invariants in Three Dimensions and Tensor Categories

Through the relations of the Jones projections, Jones (1985) discovered the Jones polynomial as an invariant for links. This was the beginning of a series of entirely new theories in three-dimensional topology. The Jones polynomial was quickly generalized to the two-variable HOMFLY polynomial by Hoste, Ocneanu, Millett, Freyd, Lickorish, and Yetter.

A three-dimensional topological quantum field theory (TQFT$_3$) assigns a complex number to each closed oriented 3-manifold and a finite dimensional vector space to each closed oriented surface. Furthermore, to each compact oriented 3-manifold with boundary, it assigns a vector in the vector space corresponding to its boundary. Turaev–Viro have constructed TQFT$_3$ from combinatorial data called quantum 6j-symbols arising from quantum groups. Ocneanu has found that a subfactor of finite index and finite depth also produces quantum 6j-symbols, which give rise to a TQFT$_3$ generalizing the Turaev–Viro construction. See Evans and Kawahigashi (1998) for this construction. Reshetikhin–Turaev have another construction of TQFT$_3$ from a modular tensor category, which is a braided tensor category with nondegenerate braiding. Ocneanu has found a subfactor version of the quantum double construction which produces a modular tensor category from a type II$_1$ subfactor of finite index and finite depth. From a type II$_1$ subfactor of finite index and finite depth, we can apply Ocneanu's generalization of the Turaev–Viro construction on one hand, and also the Reshetikhin–Turaev construction to the modular tensor category arising from the quantum double construction of Ocneanu. The resulting two TQFT$_3$s are shown to be equal by Kawahigashi–Sato–Wakui. Concrete computations of these topological invariants have been made by Sato–Wakui based on Izumi's work. Turaev and Wenzl have other constructions of TQFT$_3$ and modular tensor categories.

Algebraic Quantum Field Theory

An operator algebraic approach to quantum field theory is called algebraic quantum field theory and the standard reference is Haag (1996). In this approach, instead of quantum fields which are operator-valued distributions, we consider a family $\{A(O)\}$ of von Neumann algebras parametrized by spacetime regions O in a Minkowski space. Each $A(O)$ is meant to be generated by self-adjoint operators which are observables in O. We axiomatize such a family of von Neumann algebras and call one a local net of von Neumann algebras. It is enough to take O of a special form, called a double cone. The name "local" comes from the locality axiom which is a mathematical expression of the Einstein causality on a Minkowski space. The
Poincaré group is used as the spacetime symmetry of the Minkowski space. Doplicher et al. (1971, 1974) have introduced a representation theory of a local net A of von Neumann algebras and found that a "physically nice" representation is realized as an endomorphism of a single von Neumann algebra $A(O)$ for some fixed O. They have a notion of a statistical dimension for such a representation and it is an integer (or infinite) if the spacetime dimension is larger than 2. Longo (1989, 1990) has shown that this statistical dimension of a representation is equal to the square root of the index $[A(O) : \rho(A(O))]$, where $\rho$ is the endomorphism of $A(O)$ corresponding to the representation. The relation between algebraic quantum field theory and subfactor theory has been found in this way. Longo (1989, 1990) has also started a theory of canonical endomorphisms for a subfactor and Izumi has further studied it. Longo has later obtained a characterization of when an endomorphism of a factor becomes a canonical endomorphism by introducing a Q-system.

Recently, conformal field theory has attracted much attention. An approach based on algebraic quantum field theory describes a conformal field theory with a local net of von Neumann algebras on a two-dimensional Minkowski space with diffeomorphism group as the spacetime symmetry. We can restrict such a theory into a tensor product of two theories on the circle, the compactified one-dimensional Euclidean space. Each theory on the circle is called a chiral conformal field theory and described by a local conformal net of von Neumann algebras, which is a family of von Neumann algebras parametrized by intervals on the circle. The name "conformal" comes from the fact that we use the orientation preserving diffeomorphism group on the circle as the symmetry group of the space. For a local conformal net A of von Neumann algebras on the circle with natural irreducibility assumption, each von Neumann algebra $A(I)$ is automatically a type III factor. The Doplicher–Haag–Roberts theory works in this setting after an appropriate adaptation as in Fredenhagen et al. (1989) and each representation of a local conformal net of von Neumann algebras is realized by an endomorphism of $A(I)$, where I is an arbitrarily fixed interval on the circle. (Here we do not need an assumption that a representation is "physically nice" since it now automatically holds.) Now the representations give a braided tensor category.

Buchholz–Mack–Todorov constructed examples of local conformal nets of von Neumann algebras on the circle using the U(1)-current algebra. Wassermann (1998) has constructed more examples using positive energy representations of the loop groups LSU(N) and computed their representation theory, and his construction has been extended to other Lie groups by Toledano Laredo and others. For the local conformal net A of von Neumann algebras on the circle arising from LSU(N), we take an endomorphism $\rho$ of $A(I)$ arising from a representation of the local conformal net, then we have a subfactor $\rho(A(I)) \subset A(I)$. This is isomorphic to the type II$_1$ subfactor constructed by Jones and Wenzl tensored with a common type III factor.

Longo–Rehren (1995) started the study of a local net of subfactors, $A(I) \subset B(I)$. They have defined a certain induction procedure which gives a representation of the larger local conformal net B from that of A. This procedure is today called $\alpha$-induction. Xu has studied this procedure and found several basic properties. In the cases of local conformal nets of subfactors arising from conformal embeddings, he has found a simple construction of subfactors with principal graphs $E_6$ and $E_8$ using $\alpha$-induction.

In the context of subfactor theory, $\alpha$-induction has been further studied by Böckenhauer–Evans–Kawahigashi, together with graphical methods of Ocneanu on the Dynkin diagrams. More detailed studies on local conformal nets of factors on the circle have been pursued partly using various techniques of subfactor theory, including classification of local conformal nets of von Neumann algebras on the circle with central charge less than 1 by Kawahigashi–Longo.

See also: Algebraic Approach to Quantum Field Theory; Braided and Modular Tensor Categories; C*-Algebras and Their Classification; Hopf Algebras and q-Deformation Quantum Groups; The Jones Polynomial; Quantum 3-Manifold Invariants; Quantum Entropy; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory; Yang–Baxter Equations.

Further Reading

Doplicher S, Haag R, and Roberts JE (1971) Local observables and particle statistics, I. Communications in Mathematical Physics 23: 199–230.
Doplicher S, Haag R, and Roberts JE (1974) Local observables and particle statistics, II. Communications in Mathematical Physics 35: 49–85.
Evans DE and Kawahigashi Y (1998) Quantum Symmetries on Operator Algebras. Oxford: Oxford University Press.
Fredenhagen K, Rehren K-H, and Schroer B (1989) Superselection sectors with braid group statistics and exchange algebras. Communications in Mathematical Physics 125: 201–226.
Goodman F, de la Harpe P, and Jones VFR (1989) Coxeter Graphs and Towers of Algebras, vol. 14. Berlin: MSRI Publications, Springer.
Haag R (1996) Local Quantum Physics. Berlin: Springer.
Jones VFR (1983) Index for subfactors. Inventiones Mathematicae 72: 1–25.
Jones VFR (1985) A polynomial invariant for knots via von Neumann algebras. Bulletin of the American Mathematical Society 12: 103–112.
Jones VFR (1999) Planar algebras, I, math.QA/9909027.
Longo R (1989) Index of subfactors and statistics of quantum fields I. Communications in Mathematical Physics 126: 217–247.
Longo R (1990) Index of subfactors and statistics of quantum fields II. Communications in Mathematical Physics 130: 285–309.
Longo R and Rehren K-H (1995) Nets of subfactors. Reviews in Mathematical Physics 7: 567–597.
Ocneanu A (1988) Quantized groups, string algebras and Galois theory for algebras. In: Evans DE and Takesaki M (eds.) Operator Algebras and Applications, vol. 2 (Warwick 1987), London Mathematical Society Lecture Note Series, vol. 136, pp. 119–172. Cambridge: Cambridge University Press.
Pimsner M and Popa S (1986) Entropy and index for subfactors. Annales Scientifiques de l'Ecole Normale Supérieure 19: 57–106.
Popa S (1994) Classification of amenable subfactors of type II. Acta Mathematica 172: 163–255.
Popa S (1995) An axiomatization of the lattice of higher relative commutants of a subfactor. Inventiones Mathematicae 120: 427–445.
Takesaki M (2002, 2003) Theory of Operator Algebras I, II, III. Berlin: Springer.
Wassermann A (1998) Operator algebras and conformal field theory III: Fusion of positive energy representations of SU(N) using bounded operators. Inventiones Mathematicae 133: 467–538.
Vortex Dynamics
M Nitsche, University of New Mexico, Albuquerque, NM, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A vortex is commonly associated with the rotating
motion of fluid around a common centerline. It is
defined by the vorticity in the fluid, which measures
the rate of local fluid rotation. Typically, the fluid
circulates around the vortex, the speed increases as
the vortex is approached and the pressure decreases.
Vortices arise in nature and technology applications
in a large range of sizes, as illustrated by the
examples given in Table 1. The next section presents
some of the mathematical background necessary to
understand vortex formation and evolution. Next,
some sample flows are described, including important
instabilities and reconnection processes. Finally,
some of the numerical methods used to simulate
these flows are presented.

Background

Let D be a region in three-dimensional (3D) space
containing a fluid, and let x = (x, y, z)^T be a point in
D. The fluid motion is described by its velocity
u(x, t) = u(x, t)i + v(x, t)j + w(x, t)k, and depends on
the fluid density ρ(x, t), temperature T(x, t), gravitational
field g, and other external forces possibly
acting on it. The fluid vorticity is defined by ω = ∇ × u.
The vorticity measures the local fluid rotation about an
axis, as can be seen by expanding the velocity near
x = x0,

$$ \mathbf{u}(\mathbf{x}) = \mathbf{u}(\mathbf{x}_0) + D(\mathbf{x}_0)(\mathbf{x}-\mathbf{x}_0) + \tfrac{1}{2}\,\boldsymbol{\omega}(\mathbf{x}_0)\times(\mathbf{x}-\mathbf{x}_0) + O(|\mathbf{x}-\mathbf{x}_0|^2) \qquad [1] $$

where

$$ D(\mathbf{x}_0) = \tfrac{1}{2}\left(\nabla\mathbf{u} + \nabla\mathbf{u}^T\right), \qquad \nabla\mathbf{u} = \begin{bmatrix} u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end{bmatrix} \qquad [2] $$

The first term u(x0) corresponds to translation: all
fluid particles move with constant velocity u(x0).
The second term D(x0)(x − x0) corresponds to a
strain field in the three directions of the eigenvectors
of the symmetric matrix D. If the eigenvalue
corresponding to a given eigenvector is positive,
the fluid is stretched in that direction, if it is
negative, the fluid is compressed. Note that, in
incompressible flow, ∇ · u = 0, so the sum of the
eigenvalues of D equals zero. Thus, at least one
eigenvalue is positive and one negative. If the third
eigenvalue is positive, fluid particles move towards
sheets (Figure 1a). If the third eigenvalue is negative,
fluid particles move towards tubes (Figure 1b). The
last term in eqn [1], (1/2) ω(x0) × (x − x0), corresponds
to a rotation: near a point with ω(x0) ≠ 0,
the fluid rotates with angular velocity |ω|/2 in a
plane normal to the vorticity vector ω. Fluid for
which ω = 0 is said to be irrotational.
A vortex line is an integral curve of the vorticity.
For incompressible flow, ∇ · ω = ∇ · (∇ × u) = 0,
which implies that vortex lines cannot end in the
interior of the flow, but must either form a closed loop

Table 1 Sample vortices and typical sizes

Vortex                          Diameter
Superfluid vortices             10^-8 cm (= 1 Å)
Trailing vortex of Boeing 727   1–2 m
Dust devils                     1–10 m
Tornadoes                       10–500 m
Hurricanes                      100–2000 km
Jupiter’s Red Spot              25 000 km
Spiral galaxies                 Thousands of light years
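
The local expansion in eqns [1] and [2] can be checked numerically. The sketch below is illustrative only: the function names and the sample matrix GRAD_U are assumptions, not part of the article. It splits a given velocity gradient into the strain matrix D and the vorticity vector, and rebuilds the linear part of eqn [1] from them.

```python
def decompose(grad_u):
    """Split a 3x3 velocity gradient into strain matrix D and vorticity.

    grad_u[i][j] is the derivative of velocity component i with respect
    to coordinate j, the same layout as the matrix in eqn [2].
    """
    strain = [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
              for i in range(3)]
    vorticity = [
        grad_u[2][1] - grad_u[1][2],  # w_y - v_z
        grad_u[0][2] - grad_u[2][0],  # u_z - w_x
        grad_u[1][0] - grad_u[0][1],  # v_x - u_y
    ]
    return strain, vorticity


def matvec(matrix, vec):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(3)) for i in range(3)]


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def linear_velocity(grad_u, dx):
    """Strain plus rotation terms of eqn [1]: D dx + (1/2) vorticity x dx."""
    strain, vorticity = decompose(grad_u)
    strain_part = matvec(strain, dx)
    rotation_part = cross(vorticity, dx)
    return [s + 0.5 * r for s, r in zip(strain_part, rotation_part)]


# Hypothetical trace-free (incompressible) velocity gradient.
GRAD_U = [[1.0, 2.0, 0.0],
          [0.0, -3.0, 0.5],
          [0.0, 0.0, 2.0]]
```

For any displacement dx, linear_velocity(GRAD_U, dx) should agree with the direct product of GRAD_U and dx, since the antisymmetric part of the gradient acts as half the vorticity crossed with dx. The trace of D vanishes here because the sample gradient was chosen to be divergence free.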
Figure 3 Velocity and vorticity in boundary layer near a flat wall: (a) velocity and (b) vorticity.
Figure 5 Vortex sheet: (a) velocity profile and (b) dispersion relation.
horizontal flow with speed Uo moving past a
solid wall at rest (Figure 3a). Since in viscous
flow the fluid sticks to the wall (the no-slip
boundary condition), the fluid velocity at the wall
is zero. As a result, there is a thin layer near the
wall in which the horizontal velocity varies
greatly while the vertical velocity gradients are
small, yielding large negative vorticity values
ω = v_x − u_y (Figure 3b). Similarity solutions to
the approximating Prandtl boundary-layer equations
show that the boundary-layer thickness d
grows proportional to √t, where t measures the
time from the beginning of the motion. Boundary
layers can separate from the wall at corners or
regions of high curvature and move into the fluid
interior, as illustrated in several of the following
examples.

Sample Vortex Flows

Shear Layers

A shear layer is a thin region of concentrated
vorticity across which the tangential velocity component
varies greatly. An example is the constant-vorticity
layer given by parallel 2D flow
u(x, y) = U(y), v(x, y) = 0, where U is as shown in
Figure 4a. In this case, the velocity is constant
outside the layer and linear inside. The vorticity
ω = −U′(y) is zero outside the layer and constant
inside. Shear layers occur naturally in the ocean or
atmosphere when regions of distinct temperature or
density meet. To illustrate this scenario, consider a
tank containing two horizontal layers of fluids of
different densities, one on top of the other. If the
tank is tilted, the heavier bottom fluid moves
downstream, and the lighter one moves upstream,
creating a shear layer.
Flat shear layers are unstable to perturbations:
they do not remain flat but roll up into a sequence
of vortices. This is the Kelvin–Helmholtz instability,
which can be deduced analytically using linear
stability analysis. One shows that in a periodically
perturbed flat shear layer, the amplitude of a
perturbation with wave number k will initially
grow exponentially in time as e^{wt}, where w = w(k)
is the dispersion relation, leading to instability. The
wave number of largest growth depends on the layer
thickness. This is illustrated in Figure 4b, which
plots w(k) for a constant-vorticity layer of thickness
2d. The wave number of maximal growth is
proportional to 1/d.
A vortex sheet is a model for a shear layer. The
layer is approximated by a surface of zero thickness
across which the tangential velocity is discontinuous,
as illustrated in Figure 5a. In this case, the
dispersion relation reduces to w(k) = ±k. That is,
for each wave number k there is a growing and a
decaying mode, and the growing mode grows faster
the higher the wave number is, as shown in
Figure 5b. The vortex sheet arises from a constant
vorticity shear layer as the thickness d → 0 and the
vorticity ω → ∞ in such a way that the product ωd
remains constant. Figure 6 shows the roll-up of a
periodically perturbed vortex sheet due to the

Figure 4 Shear layer: (a) velocity profile U(y) for a layer of thickness 2d and (b) dispersion relation w plotted against kd, with the value kd = 0.65 marked on the axis.
Figure 6 Computation of vortex sheet roll-up.
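
The dependence of the growth rate on layer thickness can be explored with a short calculation. The sketch below is not taken from the article: it assumes Rayleigh's classical dispersion relation for a piecewise-linear profile (U = U0 y/d inside |y| < d and U = ±U0 outside), and the function names are hypothetical. Under that assumption the unstable band ends near kd ≈ 0.64 and the fastest-growing wave sits near kd ≈ 0.4, so the most unstable wave number scales like 1/d.

```python
import math


def growth_rate(k, d, u0=1.0):
    """Growth rate w(k) for a constant-vorticity layer of thickness 2d.

    Assumption: Rayleigh's result for the piecewise-linear profile,
    w^2 = (u0 / (2 d))^2 * (exp(-4 k d) - (1 - 2 k d)^2).
    Returns 0.0 for wave numbers outside the unstable band.
    """
    a = 2.0 * k * d
    disc = math.exp(-2.0 * a) - (1.0 - a) ** 2
    if disc <= 0.0:
        return 0.0
    return u0 / (2.0 * d) * math.sqrt(disc)


def most_unstable_kd(d, samples=20000, kd_max=1.0):
    """Scan 0 < k*d <= kd_max and return the k*d with the largest growth."""
    best_kd, best_w = 0.0, 0.0
    for i in range(1, samples + 1):
        kd = kd_max * i / samples
        w = growth_rate(kd / d, d)
        if w > best_w:
            best_kd, best_w = kd, w
    return best_kd
```

As d shrinks at fixed k, growth_rate(k, d) approaches u0 * k, which matches the linear growth in k described for the vortex sheet when the velocity scale is normalized to one.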
Vortex Rings

A vortex tube that forms a closed loop is called a
vortex ring. Vortex rings can be formed by ejecting
fluid from a circular opening, such as when a smoke
ring is formed. The boundary layer wall vorticity
separates at the opening as a cylindrical shear layer
that rolls up at its edge into a ring (Figure 10). The
vorticity is concentrated in a core, which may be
thin or thick relative to the ring diameter. The
limiting cases are an infinitely thin circular filament
of nonzero circulation and Hill’s vortex, in
which the vorticity occupies all the interior of a
sphere.
Just like a counter-rotating vortex pair, a ring
translates under its self-induced velocity U in the
direction normal to the plane of the ring (Figure 11).
However, unlike the vortex pair, the ring velocity
depends significantly on its core thickness. For a ring
with radius, circulation and core size, respectively,
R, Γ, a, the self-induced velocity is

$$ U \sim \frac{\Gamma}{4\pi R}\left(\log\frac{8R}{a} - \frac{1}{4}\right) \qquad [9] $$

asymptotically as a → 0. Thus, the translation
velocity becomes unbounded for rings with decreasing
core size. In reality, at some point viscosity takes
over and spreads the core vorticity, slowing the ring
down.
Vortex rings of small cross section are subject
to an azimuthal instability. Theory, experiment,
and simulations show that if a ring is perturbed
in the azimuthal direction, there exists a dominant
wave number which is unstable and grows
(Figure 12). The unstable wave number increases
as the core size decreases, while its spatial
amplification rate is almost independent of the
core size.

Figure 12 Sketch. Onset of azimuthal vortex ring instability.

Interesting dynamics are obtained when two or
more rings interact. Two coaxial vortex rings of
equally signed circulation move in the same
direction and exhibit leap-frogging: the rear ring
causes the front ring to grow in radius and the
front ring causes the rear one to decrease. From
eqn [9] it can be seen that the ring velocity is,
up to a logarithmic factor, inversely proportional
to its radius. Consequently,
the front ring slows down and the rear ring
speeds up, until the rear ring travels through the
front ring. This process then repeats itself. On the
other hand, two
coaxial vortex rings of oppositely signed circulation
approach each other and grow in radius.
Their cores contract in order to preserve volume,
and their vorticity increases in order to preserve
circulation. Under certain experimental conditions,
the azimuthal instability develops, the
resulting waves on opposite rings reconnect and
a sequence of smaller rings forms.
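
The thin-core formula in eqn [9] is easy to evaluate directly. The following is a minimal sketch, with a hypothetical function name and argument names, that makes the two trends used in the discussion of ring interactions concrete: the speed grows as the core thins, and it falls roughly like 1/R as the radius grows.

```python
import math


def ring_speed(radius, circulation, core_size):
    """Leading-order self-induced speed of a thin-cored vortex ring, eqn [9].

    Only meaningful for core_size much smaller than radius, since the
    formula is asymptotic as the core size tends to zero.
    """
    log_term = math.log(8.0 * radius / core_size) - 0.25
    return circulation / (4.0 * math.pi * radius) * log_term
```

For example, halving the core size at fixed radius and circulation raises the speed, while doubling the radius at fixed core size lowers it by a factor slightly larger than one half because of the logarithm.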
Stretching and folding in turn are the fingerprint
of chaos; thus, mixing and chaos are intimately
related. Mixing and associated chaotic fluid
motion can be obtained by simple vortical
motion. For example, two counter-rotating vortices
subject to a periodic strain field oscillate in a
regular fashion but induce chaos in a region of
fluid moving with them. Similarly, two corotating
vortices of equal strength that are turned on and
off periodically so that one is on when the other
is off, known as the blinking vortices, rotate
around a common axis in a stepwise manner but
induce chaos in nearby regions. On the other
hand, if there are four or more vortices present,
the vortex motion itself is generally chaotic. It
should be noted that there are also nonchaotic
equilibrium solutions of four or more vortices
forming what is called a vortex crystal.
Information about chaotic particle motion is
obtained by studying Poincaré sections, examining
the associated stable and unstable manifolds, and
investigating the existence of chaotic maps such as
the horseshoe map.

Atmospheric Vortices

Atmospheric vortices are driven by temperature
gradients, Earth’s rotation (Coriolis force), spatial
landscape variations, and instabilities. For example,
temperature differences between the equator and the
poles and Earth’s rotation lead to large-scale
vortices such as the trade winds (Hadley cell), the
jet streams, and the polar vortex (Figure 13). Semiannual
temperature oscillations are responsible for
the Indian monsoons. Daily oscillations cause land-
and sea-breezes. Landscape variations can cause
urban–rural wind flows and mountain–valley
circulations.

Figure 13 Vortices in the atmosphere (labels: polar front, polar vortex, polar jet stream, Ferrel cell, subtropical jet stream, Hadley cell).

Instabilities are often responsible for large
cyclonic vortices. Barotropic instability results
from large horizontal velocity gradients, and has
been deemed responsible for disturbances over the
Sahara region that occasionally intensify into
tropical cyclones. Baroclinic instability, which
occurs when temperature advection is superposed
on a velocity field, can lead to cyclonic vortices at
the front between air of polar origin and that of
tropical origin. The inertial or centrifugal
instability occurs when air flows around high-pressure
systems and the pressure gradient force
is not large enough to balance the centripetal
acceleration and the Coriolis effect.
Vortices also form on other planets with an
atmosphere. On Mars, dust devils are quite
common. They are 10–50 times larger than the
ones on Earth and can carry high-voltage electric
fields caused by the rubbing of dust grains against
each other. Jupiter’s characteristic spots are
extremely large storm vortices. The Great Red
Spot is a vortex spanning twice the diameter of
the Earth. Unlike the low-pressure terrestrial
storms and hurricanes, the Great Red Spot is a
high-pressure system that has been stable for
more than 300 years. Other vortices on Jupiter
decay and vanish, such as the White Ovals, three
large anticyclones which merged into one within
two years. Recent computer simulations predict
that many of Jupiter’s vortices will merge and
disappear in the next decade. As a result, mixing
of heat across zones will decay and the planet’s
temperature is predicted to increase.
Numerical simulations of the atmosphere are
expensive due to the large number of parameters
and the relatively small scales that need to be
resolved. For climate models and medium-range
forecast models, the governing 3D compressible
Euler equations are simplified using the hydrostatic
approximation (in which only the pressure
gradient and the gravitational forces are retained
in the vertical-momentum equation) and the
anelastic approximation (in which dρ/dt is
neglected), to obtain the primitive equations.
Additional vertical averaging yields the shallow-water
equations. One big hurdle is to accurately
incorporate the effect of clouds, which is significant
and is usually treated using subgrid
models.

Vortices in Superfluids and Superconductors

At temperatures below 2.2 K, liquid helium is a
superfluid, meaning that it acts essentially like a
fluid with zero viscosity governed by the Euler
equations. The fluid is irrotational, except for
extremely thin vortex filaments, which are formed
by quantum-mechanical processes. Since the vortices
cannot end in the interior of the flow, they can be
generated only at the surface or they nucleate as
vortex rings inside the fluid. As an example, if
a cylindrical container with helium is rotated
sufficiently fast, vortex lines attached to both ends
of the container appear. These quantum vortices
have discrete values of circulation (Γ = nh/m, where
h = Planck’s constant, m = mass of helium atom,
n = integer), core sizes of about 1 Å (roughly the
diameter of a single hydrogen atom) and move
without viscosity.
Similarly, certain types of materials lose their
electric resistance at low temperatures and
become superconductors. One distinguishes type-I
superconductors (most pure metals) from type-II
superconductors (alloys). Using the Ginzburg–Landau
theory it has been predicted that in
type-II superconductors a lattice of vortex filaments
forms, each carrying a quantized amount
of magnetic flux. This was subsequently confirmed
by experimental observation. More precisely,
for temperatures T below a critical value
Tc, there are three regions corresponding to
increasing values of the magnetic field (Figure 14).
At low magnetic fields (H < Hc1), no vortices
exist (superconducting phase). At intermediate
values (Hc1 < H < Hc2), the magnetic field penetrates
the superconductor in the form of quantized
vortices, also called flux lines (mixed
phase). The values Hc1 and Hc2 are determined by the
London penetration depth λ, which measures the
electromagnetic response of the superconductor.
With increasing magnetic field, the density of flux
lines increases until the vortex cores overlap
when the upper critical field Hc2 is reached,
beyond which one recovers the normal metallic
state (normal conductor).
When an external current density j is applied to
the vortex system, the flux lines start to move under
the action of the Lorentz force. As a result, a
dissipating electric field E appears that is parallel to
j, and the superconducting property of dissipation-free
current flow is lost. In order to recover the
desired property of dissipation-free flow, flux lines
have to be pinned, for example, by introducing
inhomogeneities and structural defects. For a given
pinning force, flux lines remain pinned as long as the
current density stays below a critical value. A major
research objective is to optimize the pinning force in
order to preserve superconductivity at larger current
densities.

Figure 14 Superconductor phase dependence on magnetic field H and temperature T (regions for T between 0 and Tc: superconductor with no vortices below Hc1, mixed phase between Hc1 and Hc2, normal conductor above Hc2).

Numerical Vortex Methods

Many numerical methods used to compute fluid
flow are Eulerian schemes based on a fixed mesh,
such as finite difference, finite element, and spectral
methods, commonly used for example in atmosphere
and ocean modeling. This section briefly
describes alternative vorticity-tracking methods
used to simulate incompressible inviscid vortex
flows, and concludes with some extensions to
viscous flows. The premise of these methods is
that since the fluid velocity is determined by the
vorticity through the Biot–Savart law (eqn [6]), it
suffices to track only that portion of the fluid
carrying nonzero vorticity. This region is often
much smaller than the total fluid volume, and
computational efficiency is gained. Numerical vortex
methods are typically Lagrangian, that is, the
computational elements move with the fluid
velocity.

Point-Vortex Approximation in 2D

To compute the evolution of a vorticity distribution
ω(x, t) in 2D, the simplest approach is to approximate
the vorticity by a set of point vortices at x_j(t)
with circulation Γ_j and evolve them under their self-induced
motion. The values Γ_j are an estimate of the
initial circulation around x_j(0). The vortex positions
x_j(t) evolve in the induced velocity field

$$ \frac{d\mathbf{x}_j}{dt} = \sum_{\substack{k=1 \\ k\neq j}}^{N} \Gamma_k\, K_{2D}(\mathbf{x}_j - \mathbf{x}_k) \qquad [10] $$

where the exclusion k ≠ j accounts for the fact that
a point vortex induces zero velocity on itself. The
solution to the system of ordinary differential
equations [10] can be obtained using any method,
such as Runge–Kutta or Adams–Bashforth.
The point-vortex approximation can be written in
Hamiltonian form as

$$ \frac{dx_j}{dt} = \frac{1}{\Gamma_j}\frac{\partial H}{\partial y_j}, \qquad \frac{dy_j}{dt} = -\frac{1}{\Gamma_j}\frac{\partial H}{\partial x_j} \qquad [11] $$

$$ H(\mathbf{x}, \mathbf{y}) = -\frac{1}{4\pi}\sum_{j=1}^{N}\sum_{\substack{k=1 \\ k>j}}^{N} \Gamma_j \Gamma_k \log\left[(x_j - x_k)^2 + (y_j - y_k)^2\right] \qquad [12] $$
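
The point-vortex system [10] can be integrated in a few lines. The sketch below is illustrative rather than the article's implementation: it assumes the standard singular kernel K_2D(x, y) = (−y, x) / (2π(x² + y²)), since eqn [6] is not reproduced in this section, and it uses a classical fourth-order Runge–Kutta step. All function names are hypothetical. The Hamiltonian of eqn [12] is evaluated as a diagnostic, since it should stay constant along the motion.

```python
import math


def velocities(pos, gam):
    """Right-hand side of eqn [10], assuming the standard 2D kernel."""
    vel = []
    for j, (xj, yj) in enumerate(pos):
        u = v = 0.0
        for k, (xk, yk) in enumerate(pos):
            if k == j:
                continue  # a point vortex induces no velocity on itself
            dx, dy = xj - xk, yj - yk
            r2 = dx * dx + dy * dy
            u -= gam[k] * dy / (2.0 * math.pi * r2)
            v += gam[k] * dx / (2.0 * math.pi * r2)
        vel.append((u, v))
    return vel


def rk4_step(pos, gam, dt):
    """Advance all vortex positions by one classical Runge-Kutta step."""
    def shifted(slope, h):
        return [(x + h * u, y + h * v) for (x, y), (u, v) in zip(pos, slope)]

    k1 = velocities(pos, gam)
    k2 = velocities(shifted(k1, 0.5 * dt), gam)
    k3 = velocities(shifted(k2, 0.5 * dt), gam)
    k4 = velocities(shifted(k3, dt), gam)
    return [
        (x + dt * (a[0] + 2.0 * b[0] + 2.0 * c[0] + d[0]) / 6.0,
         y + dt * (a[1] + 2.0 * b[1] + 2.0 * c[1] + d[1]) / 6.0)
        for (x, y), a, b, c, d in zip(pos, k1, k2, k3, k4)
    ]


def hamiltonian(pos, gam):
    """Hamiltonian of eqn [12], summed over pairs with k > j."""
    total = 0.0
    for j in range(len(pos)):
        for k in range(j + 1, len(pos)):
            dx = pos[j][0] - pos[k][0]
            dy = pos[j][1] - pos[k][1]
            total -= gam[j] * gam[k] * math.log(dx * dx + dy * dy) / (4.0 * math.pi)
    return total


def simulate(pos, gam, dt, steps):
    """Integrate the point-vortex system for a fixed number of steps."""
    for _ in range(steps):
        pos = rk4_step(pos, gam, dt)
    return pos
```

A simple check is a pair of equal vortices of unit circulation one unit apart: under the assumed kernel they rotate about their midpoint with angular velocity 1/π, so after a time of π²/2 each has turned by a quarter revolution.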
Vortex Filament Methods in 3D

Vortex simulations in 3D differ from those in 2D in
that the stretching term in eqn [4] needs to be
incorporated. The vortex filament method approximates
the fluid vorticity by a finite number of
filaments whose circulation remains constant in
time. Each filament is marked by computational
mesh points which move with the regularized
induced velocity. The regularization is necessary to
prevent the infinite self-induced velocities of curved
vortex filaments. As in 2D, this method automatically
conserves circulation. Vorticity stretching is
accounted for by the stretching between computational
mesh points. As the filament length increases,
more mesh points are typically introduced to keep it
resolved. Also, the number of filaments can be
increased throughout the simulation to maintain
resolution.

Viscous Vortex Methods

While inviscid models are expected to approximate
small viscosity fluids well far from boundaries, near
boundaries, where vortex shedding is an inherently
viscous mechanism, it is important to incorporate
the effects of viscosity. The first methods to do so
used operator splitting in which inviscid and viscous
terms of the Navier–Stokes equations were solved in
a sequential manner. In each time step, the computational
elements would first be convected, and then
they would be diffused by a random-walk scheme.
The particle strength exchange method, introduced
more recently, does not rely on operator splitting
and has better accuracy. The particle position and
vorticity evolve simultaneously, and viscous
diffusion is accounted for in a consistent manner.
Vortex dynamics continues to be a source of
interesting problems of theoretical and practical
importance. In particular, much remains to be
learned to better understand turbulence and the
transition to turbulence, a process dominated by
deterministic vortex dynamics.

Further Remarks

Finally, some remarks on relevant literature on this
subject are in order. Lugt (1983) and Tritton (1988)
are recommended as elementary introductions to
vortex flows. van Dyke (1982) presents beautiful and
instructive flow visualizations. Comprehensive treatments
of incompressible fluid dynamics are given in
Batchelor (1967), Chorin and Marsden (1992), Lamb
(1932), and Saffman (1992), and compressible flow is
treated in Anderson (1990). Cottet and Koumoutsakos
(2000) give an overview of numerical vortex methods.
Special topics have also been addressed: atmosphere
(Andrews et al. 1987), point vortex motion and chaos
(Aref 1983, Newton 2001, Ottino 1989), superfluids
and superconductors (Blatter et al. 1994, Donnelly
1991), turbulence theory using statistical mechanics
(Chorin 1994), vortex reconnection (Kida and
Takaoka 1994), theory for Euler and Navier–Stokes
equations (Majda and Bertozzi 2002), contour
dynamics (Pullin 1992), vortex rings (Shariff and
Leonard 1992), and aircraft trailing vortices (Spalart
1998). Green (1995) includes survey articles on
various topics.

Nomenclature

a                                          vortex ring core size
g                                          gravitational field
H                                          Hamiltonian
K2D                                        singular velocity kernel
K2D,                                       regularized velocity kernel
ρ(x, t)                                    fluid density
R                                          vortex ring radius
T(x, t)                                    temperature
U                                          translation velocity
u(x, t) = u(x, t)i + v(x, t)j + w(x, t)k   fluid velocity
w(k)                                       dispersion relation
ω = ∇ × u                                  vorticity
ω = v_x − u_y                              scalar vorticity
Γ                                          ring circulation

See also: Abelian Higgs Vortices; Incompressible Euler
Equations: Mathematical Theory; Integrable Systems:
Overview; Interfaces and Multicomponent Fluids;
Intermittency in Turbulence; Newtonian Fluids and
Thermohydraulics; Point-Vortex Dynamics; Stochastic
Hydrodynamics; Superfluids; Topological Knot Theory
and Macroscopic Physics; Turbulence Theories.

Further Reading

Anderson JD (1990) Modern Compressible Flow with Historical Perspective, 2nd edn. New York: McGraw-Hill.
Andrews DG, Holton JR, and Leovy CB (1987) Middle Atmosphere Dynamics. Orlando: Academic Press.
Aref H (1983) Integrable, chaotic, and turbulent vortex motion in two-dimensional flows. Annual Review of Fluid Mechanics 15: 345–389.
Batchelor GK (1967) An Introduction to Fluid Dynamics. Cambridge: Cambridge University Press.
Blatter G, Feigel’man MV, Geshkenbein VB, Larkin AI, and Vinokur VM (1994) Vortices in high-temperature superconductors. Reviews of Modern Physics 66(4): 1125–1388.
Chorin AJ (1994) Vorticity and Turbulence. New York: Springer.
Chorin AJ and Marsden JE (1992) A Mathematical Introduction to Fluid Mechanics, 3rd edn. New York: Springer.
Cottet G-H and Koumoutsakos PD (2000) Vortex Methods: Theory and Practice. Cambridge: Cambridge University Press.
Donnelly RJ (1991) Quantized Vortices in Helium II. Cambridge: Cambridge University Press.
Green SI (ed.) (1995) Fluid Vortices. Dordrecht: Kluwer Academic.
Kida S and Takaoka M (1994) Vortex reconnection. Annual Review of Fluid Mechanics 26: 169–189.
Lamb H (1932) Hydrodynamics, 6th edn. New York: Dover.
Lugt HJ (1983) Vortex Flow in Nature and Technology. New York: Wiley.
Majda AJ and Bertozzi AL (2002) Vorticity and Incompressible Flow. Cambridge: Cambridge University Press.
Newton PK (2001) The N-Vortex Problem: Analytical Techniques. New York: Springer.
Ottino JM (1989) The Kinematics of Mixing: Stretching, Chaos, and Transport. Cambridge: Cambridge University Press.
Pullin DI (1992) Contour dynamics methods. Annual Review of Fluid Mechanics 24: 89–115.
Saffman PG (1992) Vortex Dynamics. Cambridge: Cambridge University Press.
Shariff K and Leonard A (1992) Vortex rings. Annual Review of Fluid Mechanics 24: 235–279.
Spalart PR (1998) Airplane trailing vortices. Annual Review of Fluid Mechanics 30: 107–138.
Tritton DJ (1988) Physical Fluid Dynamics, 2nd edn. Oxford: Clarendon Press.
van Dyke M (1982) Album of Fluid Motion. Stanford: The Parabolic Press.
from normals to 0 have mirror images, which are
rays in R^n. If such a ray hits ∂R^n_+, its mirror image
does so also, and continues into R^n_+, as the reflected
ray. The singularities of u propagate along such
reflected rays.
Such a description extends to a general complete
Riemannian manifold with boundary M, in the case
of rays that hit the boundary transversally. Such a
ray is reflected by retaining the tangential component
of its velocity vector at the point of intersection
with ∂M and reversing the sign of the normal component.
One says that the ray is reflected according to the
laws of geometrical optics. Singularities of u carried
by such rays that hit ∂M are correspondingly
reflected. Methods to establish such transversal
reflection of singularities are natural extensions of
those developed to treat the propagation away from
∂M, mentioned above.
Matters become more delicate when there are rays
that are tangent to ∂M. A model example is given by

$$ M = \mathbb{R}^n \setminus B, \qquad B = \{x \in \mathbb{R}^n : |x| < 1\} \qquad [12] $$

which one takes when studying the scattering of
waves in R^n by the obstacle B. Consider a solution
to [1] with boundary condition given by [4] or [5]
that has a simple singularity on the hyperplane
{x ∈ R^n : x_n = t} for t < −1. The associated rays are
of the form t ↦ (x′, t), for t < −1, with x′ ∈ R^{n−1}.
If |x′| > 1, these rays continue on in R^n \ B, for all
t ≥ −1. If |x′| < 1, these rays hit ∂M = ∂B transversally, and
their reflection is as described above. If |x′| = 1,
these rays hit ∂B tangentially, at t = 0; they are
sometimes called grazing rays. One also continues
them past t = 0. One defines in this fashion the
surface carrying the singularity for t ≥ −1. The region

$$ S = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| < 1,\ x_n > 0\} \qquad [13] $$

is called the “shadow region.” It is disjoint from that
surface for all t. The solution u is smooth in S for all t,
although it is not identically zero. The set

$$ S_b = \{x = (x', x_n) \in \mathbb{R}^n \setminus B : |x'| = 1,\ x_n \ge 0\} \qquad [14] $$

is the “shadow boundary.”
One can replace B in [12] by a more general
smooth, convex obstacle K, with positive Gauss
curvature everywhere, and the same considerations
of transversal and grazing rays and shadow regions
apply. These notions also extend to a more general
class of Riemannian manifolds with boundary,
called manifolds with diffractive boundary. In the
case K = B, one can use separation of variables to
reduce the problem of analyzing solutions to [1] and
showing that singularities propagate along such rays
to a problem in harmonic analysis on the sphere
S^{n−1}. For more general convex obstacles K or
manifolds with diffractive boundary, other techniques
are required, to show that waves reflect off the
boundary in a fashion similar to the case [12].
Another situation arises if instead of [12] one
takes M = B, or more generally M = K, a convex
region as described above. A ray starting off from
a point in ∂M, almost tangent to ∂M but with a
small component in the direction of the normal
pointing into M, will undergo many reflections in
a short time. Upon shrinking the normal component
of the initial velocity to zero, one obtains in
the limit a geodesic in ∂M, known as a gliding ray.
In such a case, singularities of solutions to [1],
with such a boundary condition as [4] or [5],
propagate along both transversally reflected and
gliding rays.
For the generic smooth obstacle K in R^n, the
second fundamental form can have a variety of
signatures at various boundary points. Various types
of “generalized rays” occur, generally speaking
limits of sequences of transversally reflected rays.
This situation also holds for general complete
Riemannian manifolds with smooth boundary. The
main result about propagation of singularities in
such a case is that it is always along such generalized
rays. This was established by Melrose and Sjöstrand
(1978).
Further diffraction effects arise when ∂M has
singularities, such as edges and corners. The simplest
example is

$$ M = \{x \in \mathbb{R}^2 : a \le \theta \le b,\ r \ge 0\} \qquad [15] $$

where (r, θ) are the polar coordinates of x ∈ R², and
we assume 0 ≤ a < b ≤ 2π. Here one is studying the
diffraction of waves by a wedge. In the limiting case
a = 0, b = 2π, the wedge becomes a half-line, that is,

$$ M = \mathbb{R}^2 \setminus \{(x_1, 0) : x_1 > 0\} \qquad [16] $$

Singularities of solutions to [1] on R × M with
such a boundary condition as [4] or [5] propagate in
the interior of M and reflect off the regular points of
∂M as before. If a family of continuous, piecewise
smooth curves carrying the singularity of u hits the
corner x = 0 at t = a, this reflection creates a tear in
these curves for t > a. In addition, a diffracted wave spreads
out from the corner at unit speed. This diffracted
wave carries a singularity that is weaker than that of
the incident wave. For example, if one has a solution
like [8], but shifted to have support in a disk of radius
|t| about a point p ≠ 0 in R², for small |t|, then the
diffracted wave will have a jump discontinuity.
The space M in [15] is a special case of a cone.
More generally, if N is a complete Riemannian
404 Wave Equations and Diffraction
manifold (possibly with boundary), then the cone
C(N) with base N is the set

$$ C(N) = [0, \infty) \times N \qquad [17] $$

with all points (0, x), x ∈ N, identified, with the
metric tensor

$$ ds^2 = dr^2 + r^2 g \qquad [18] $$

where g is the metric tensor on N, and points on
C(N) are denoted (r, x), r ∈ [0, ∞), x ∈ N. The space
in [15] has the form M = C(N) with N = [a, b], an
interval. A cone in Euclidean space R^n is of the form
C(N) with N a domain in the unit sphere S^{n−1}.
The propagation of singularities for solutions to [1]
on C(N), when N has smooth boundary, has a
description similar to that above for the case [15].
Again, there is a diffracted wave set off from the conic
point {r = 0} when a singularity of a wave hits it. The
diffracted wave is typically (n − 1)/2 units smoother
than the singular wave producing it, where
n = dim C(N). For example, the fundamental solution
to the wave equation on C(N) produces a diffracted
wave which is the sum of a jump discontinuity and (in
general) a logarithmic singularity.
In fact, precise understanding of the behavior of
the fundamental solution to the wave equation on
C(N) is encoded in terms of the behavior of the
solution operator to the wave equation on the base
N. This is discussed in further detail in the section
on harmonic analysis. In the case where C(N) is
given by [15], we are dealing with the wave
equation on an interval [a, b], whose behavior is
elementary.
One can use analysis of [15] together with finite
propagation speed to get a good qualitative picture of
diffraction of waves in R² by a polygonal obstacle. A
variation of this argument allows one to understand
the behavior of the wave equation on a “polygonal”
domain N in S², that is, one whose boundary consists
of a finite number of geodesic segments in S². Going
from there to C(N), one can then analyze diffraction
of waves in R³ by a polyhedron.
It is worth remarking how the “shadow region”
for such an obstacle as a wedge in R² differs from
that in [12]–[14]. For example, if one considers M
given by [16] and u(t, x) = δ(x_2 − t), for t < 0, then
the region

$$ S = \{x = (x_1, x_2) : x_1, x_2 > 0\} \qquad [19] $$

is the “shadow region,” in the sense that rays either
missing or reflecting off the obstacle {(x_1, 0) : x_1 > 0}
do not enter the region [19]. However, unlike the
case [13], the solution u(t, x) is not smooth in the

although it is weaker than the singularity of the
main wave.
Taking Cartesian products of spaces of the form
[15] with R^k yields spaces with k-dimensional
edges. There are also spaces with curvy edges.
Rather than continuing with further general
description, one more particular case is discussed
next, which has had a historical significance.
Namely, we consider the reflection of waves in R³
off a disk, that is, take

$$ M = \mathbb{R}^3 \setminus D, \qquad D = \{(x_1, x_2, 0) : x_1^2 + x_2^2 \le 1\} \qquad [20] $$

Consider a wave given for t < 0 by u(t, x) = δ(x_3 − t).
This wave hits D = ∂M at t = 0, giving off a diffracted
wave, traveling away from the circle
{(x_1, x_2, 0) : x_1² + x_2² = 1} at speed 1 for t > 0.
This diffracted wave
carries a singularity that blows up like the −1/2 power
of the distance to the torus of points of distance t from
this circle, for t ∈ (0, 1). For t > 1, there is a focusing effect
along the x_3-axis, producing a stronger singularity for
u(t, x) there.
This sort of phenomenon was understood, at
least from a heuristic point of view, in the
nineteenth century, and it played a role in an
important argument of Poisson. At the time, there
was a debate about whether the propagation of
light was a wave phenomenon. Poisson did not
think it was, and he noted that if it were, the light
waves propagated past such an obstacle should
produce a bright spot along the axis normal to the
disk and through its center. The experiment was
performed and the bright spot was observed.
This is now called the Poisson spot, and its
occurrence convinced many physicists, including
Poisson, that the propagation of light is a wave
phenomenon.

Harmonic Analysis and the Wave Equation

The wave equation [1] with Cauchy data [3] can be
regarded as an operator differential equation, with
solution

$$ u(t, x) = \cos\left(t\sqrt{-\Delta}\right) f(x) + \frac{\sin\left(t\sqrt{-\Delta}\right)}{\sqrt{-\Delta}}\, g(x) \qquad [21] $$

This brings one to investigate functions of the self-adjoint
operator Δ. If M = R^n, one can do this using
the Fourier transform, which is given by

$$ \mathcal{F} f(\xi) = \hat{f}(\xi) = (2\pi)^{-n/2} \int f(x)\, e^{-i x\cdot\xi}\, dx \qquad [22] $$

One defines F
by changing eix to eix in [22],
region [19] for t > 0. There is a singularity there, and the Fourier inversion formula says F and F
are
Wave Equations and Diffraction 405
inverses of each other on various function spaces, To understand functions of the Laplace operator
including L2 (Rn ). Then one has on a cone C(N), one uses
pffiffiffiffiffiffiffiffi Z
@2 n 1 @ 1
’ð Þf ðxÞ ¼ ð2Þn=2 ’ðjjÞ^f ðÞeix d ½23 ¼ þ þ N ½32
@r2 r @r r2
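The Fourier multiplier formulas [21] and [23] can be tried out numerically. The following is a minimal sketch, not the article's method: it assumes a periodic interval of length $2\pi$ in one dimension instead of $\mathbb{R}^n$, and it uses the discrete Fourier transform as a stand-in for [22]. The function name `wave_solution` and its parameters are hypothetical.

```python
import numpy as np

def wave_solution(f_vals, g_vals, t, length=2.0 * np.pi):
    """Evaluate u(t) = cos(t sqrt(-Lap)) f + sin(t sqrt(-Lap))/sqrt(-Lap) g
    on a periodic grid (assumption: periodic interval, not R^n)."""
    n = f_vals.size
    # Angular wavenumbers xi of the discrete Fourier modes.
    xi = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    abs_xi = np.abs(xi)
    cos_mult = np.cos(t * abs_xi)
    # sin(t|xi|)/|xi|, with the limit value t at xi = 0.
    # np.sinc(y) = sin(pi y)/(pi y), hence the division by pi.
    sin_mult = t * np.sinc(t * abs_xi / np.pi)
    u_hat = cos_mult * np.fft.fft(f_vals) + sin_mult * np.fft.fft(g_vals)
    return np.real(np.fft.ifft(u_hat))

# Usage: Cauchy data made of single Fourier modes, where the exact answer is known.
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
f = np.sin(3.0 * x)
g = np.cos(2.0 * x)
u = wave_solution(f, g, t=0.7)
```

For this data the exact solution is $\cos(3t)\sin(3x) + \tfrac{1}{2}\sin(2t)\cos(2x)$, which gives a direct check of the multipliers $\cos(t|\xi|)$ and $\sin(t|\xi|)/|\xi|$.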
understands the behavior of families of functions of the operator so produced. An approach taken by Cheeger and Taylor (1982) to this was to synthesize these operators from $e^{is\nu}$, $s \in \mathbb{R}$, and deduce their behavior from the behavior of the solution operator to the wave equation on the base $N$.

One can apply similar considerations to $M = \mathbb{R}^n \setminus B$, which is the truncated cone $[1, \infty) \times S^{n-1}$, with metric tensor [18], where $g$ is the metric tensor on $S^{n-1}$, and Laplace operator given by [32], with $\Delta_N$ the Laplace operator on $S^{n-1}$. The problem of diffraction of waves by the ball $B$ can be recast as solving

where Ai is the Airy function. The coefficients $a_k(\zeta)$ and $b_k(\zeta)$ are smooth functions of their argument, $\zeta = \zeta(z)$, which is defined by

$$\frac{2}{3}\,\zeta^{3/2} = \int_z^1 \sqrt{1 - t^2}\;\frac{dt}{t} \qquad [45]$$

Making use of [43] in [44], one can obtain a parametrix for $u$ (i.e., a solution modulo a $C^\infty$ error) whose form is a special case of the formula [50], which we will present in the next section.

Geometrical Optics and Extensions
producing a parametrix valid for all t. Generally, one (1976), which produced solutions satisfying [52] to
can solve [49] and the associated transport equations infinite order at n = 0. This earlier construction is
for t in some interval, past which the eikonal adequate to produce a grazing ray parametrix, but
equation might break down. Hörmander’s theory the sharper result [52] is extremely valuable for
treats products of Fourier integral operators, yielding constructing a gliding ray parametrix. This has the
global constructions. This facilitates the treatment of form Z
caustics mentioned earlier. Stationary-phase methods uðyÞ ¼ ½a AiðÞ þ ijj1=3 b Ai0 ðÞ
can be brought to bear to relate the singularities of R n
Wavelets: Application to Turbulence

Moreover, the classical method for modeling turbulent flows consists in neglecting high-wavenumber motions and replacing them by their average, supposing their dynamics to be either linear or slaved to the low-wavenumber motions. Such a method would work if there existed a clear separation between low and high wavenumbers, that is, a spectral gap.

Actually, there is now strong evidence, from both laboratory and direct numerical simulation (DNS) experiments, that this is not the case. On the contrary, one observes that turbulent flows are nonlinearly active all along the inertial range and that coherent vortices seem to play an essential dynamical role there, especially for transport and mixing. One may then ask the following questions: Are coherent vortices the elementary building blocks of turbulent flows? How can we extract them? Do their mutual interactions have a universal character? Can we compress turbulent flows and compute their evolution with a reduced number of degrees of freedom corresponding to the coherent vortices?

The DNS of turbulent flows, based on the integration of the Navier–Stokes equations using either grid points in physical space or Fourier modes in spectral space, requires a number of degrees of freedom per time step that varies as $Re^{9/4}$ in dimension 3 (and as $Re$ in dimension 2). Due to the inherent limitation of computer performance, one can presently only perform DNS of turbulent flows up to Reynolds numbers $Re = 10^6$. To compute flows at higher Reynolds numbers, one should then design ad hoc turbulence models, whose parameters are empirically adjusted to each type of flow, in particular to its geometry and boundary conditions, using data from either laboratory or numerical experiments.

What are Wavelets?

The wavelet transform unfolds signals (or fields) into both time (or space) and scale, and possibly directions in dimensions higher than 1. The starting point is a function $\psi \in L^2(\mathbb{R})$, called the "mother wavelet", which is well localized in physical space $x \in \mathbb{R}$, is oscillating ($\psi$ has at least a vanishing integral, or better, its first $m$ moments vanish), and is smooth (its Fourier transform $\hat{\psi}(k)$ exhibits fast decay for wave numbers $|k|$ tending to infinity). The mother wavelet then generates a family of dilated and translated wavelets

$$\psi_{a,b}(x) = a^{-1/2}\, \psi\left(\frac{x - b}{a}\right)$$

with $a \in \mathbb{R}^+$ the scale parameter and $b \in \mathbb{R}$ the position parameter, all wavelets being normalized in $L^2$-norm.

The wavelet transform of a function $f \in L^2(\mathbb{R})$ is the inner product of $f$ with the analyzing wavelets $\psi_{a,b}$, which gives the wavelet coefficients: $\tilde{f}(a, b) = \langle f, \psi_{a,b} \rangle = \int f(x)\, \psi^{*}_{a,b}(x)\, dx$. They measure the fluctuations of $f$ around the scale $a$ and the position $b$. $f$ can then be reconstructed without any loss as the inner product of its wavelet coefficients $\tilde{f}$ with the analyzing wavelets $\psi_{a,b}$:

$$f(x) = C_\psi^{-1} \iint \tilde{f}(a, b)\, \psi_{a,b}(x)\, a^{-2}\, da\, db$$

$C_\psi = \int |\hat{\psi}|^2\, |k|^{-1}\, dk$ being a constant which depends on the wavelet $\psi$.

Like the Fourier transform, the wavelet transform realizes a change of basis from physical space to wavelet space which is an isometry. It thus conserves the inner product (Plancherel theorem), and in particular energy (Parseval's identity). Let us mention that, due to the localization of wavelets in physical space, the behavior of the signal at infinity does not play any role. Therefore, the wavelet analysis and synthesis can be performed locally, in contrast to the Fourier transform, where the nonlocal nature of the trigonometric functions does not allow a local analysis.

Moreover, wavelets constitute building blocks of various function spaces, out of which some can be used to construct orthogonal bases. The main difference between the continuous and the orthogonal wavelet transforms is that the latter is nonredundant, but preserves the invariance by translation and dilation only for a discrete subset of wavelet space which corresponds to the dyadic grid $\lambda = (j, i)$, for which scale is sampled by octaves $j$ and space by positions $2^{-j} i$. The advantage is that all orthogonal wavelet coefficients are decorrelated, which is not the case for the continuous wavelet transform, whose coefficients are redundant and correlated in space and scale. Such a correlation can be visualized by plotting the continuous wavelet coefficients of a white noise. The patterns one thus observes are due to the reproducing kernel of the continuous wavelet transform, which corresponds to the correlation between the analyzing wavelets themselves.

In practice, to analyze turbulent signals or fields, one should use the continuous wavelet transform with complex-valued wavelets, since the modulus of the wavelet coefficients allows one to read the evolution of the energy density in both space (or time) and scales. If one uses real-valued wavelets instead, the modulus of the wavelet coefficients will present the same oscillations as the analyzing wavelets and it will then become difficult to sort out features
belonging to the signal or to the wavelets. In the case of complex-valued wavelets, the quadrature between the real and the imaginary parts of the wavelet coefficients eliminates these spurious oscillations; this is why we recommend using complex-valued wavelets, such as the Morlet wavelet. To compress turbulent flows, and a fortiori to compute their evolution at a reduced cost compared to standard methods (finite difference, finite volume, or spectral methods), one should use orthogonal wavelets. This avoids redundancy, since one has the same number of grid points as wavelet coefficients. Moreover, there exists a fast algorithm to compute the orthogonal wavelet coefficients which is even faster than the fast Fourier transform, having $O(N)$ operations instead of $O(N \log_2 N)$.

The first paper about the continuous wavelet transform was published by Grossmann and Morlet (1984). Then, discrete wavelets were constructed, leading to frames (Daubechies et al. 1986) and orthogonal bases (Lemarié and Meyer, 1986). From there the formalism of multiresolution analysis (MRA) was constructed, which led to the fast wavelet algorithm (Mallat 1989). The first application of wavelets to analyze turbulent flows was published by Farge and Rabreau (1988). Since then a long-term research program has been developed for analyzing, computing and modeling turbulent flows using either continuous wavelets, orthogonal wavelets, or wavelet packets.

Wavelet Analysis

Wavelet Spectra

Wavelet space. To study turbulent signals one uses the continuous wavelet transform for analysis, and the orthogonal wavelet transform for compression and computation. To perform a continuous wavelet transform, one can choose:

either a real-valued wavelet, such as the Marr wavelet, also called "Mexican hat," which is the second derivative of a Gaussian,

$$\psi(x) = (1 - x^2)\exp\left(-\frac{x^2}{2}\right) \qquad [1]$$

or a complex-valued wavelet, such as the Morlet wavelet,

$$\hat{\psi}(k) = \begin{cases} \dfrac{1}{2\pi}\exp\left(-\dfrac{(k - k_\psi)^2}{2}\right) & \text{for } k > 0 \\ 0 & \text{for } k \le 0 \end{cases} \qquad [2]$$

with the wavenumber $k_\psi$ denoting the barycenter of the wavelet support in Fourier space computed as

$$k_\psi = \frac{\int_0^\infty k\, |\hat{\psi}(k)|\, dk}{\int_0^\infty |\hat{\psi}(k)|\, dk} \qquad [3]$$

For the orthogonal wavelet transform, there is a large collection of possible wavelets and the choice depends on which properties are preferred, for instance: compact support, symmetry, smoothness, number of cancelations, computational efficiency.

From our own experience, we tend to prefer the Coifman wavelet 12, which is compactly supported, has four vanishing moments, is quasi-symmetric, and is defined with a filter of length 12, which leads to a computational cost for the fast wavelet transform of $24N$ operations, since two filters are used.

As stated above, we recommend the complex-valued continuous wavelet transform for analysis. In this case, one plots the modulus and the phase of the wavelet coefficients in wavelet space, with a linear horizontal axis for the position $b$, and a logarithmic vertical axis for the scale $a$, with the largest scale at the bottom and the smallest scale at the top.

In Figure 1a we show the wavelet analysis of a turbulent signal, corresponding to the time evolution of the velocity fluctuations of two successive vortex breakdowns, measured by hot-wire anemometry at $N = 32768 = 2^{15}$ instants (Cuypers et al. 2003). The modulus of the wavelet coefficients (Figure 1b) shows that during the vortex breakdown, which is due to strong nonlinear flow instability, energy is spread over a wide range of scales. The phase of the wavelet coefficients (Figure 1c) is plotted only where the modulus is non-negligible, otherwise the phase information would be meaningless. In Figure 1c, one observes that the lines of constant phase point towards the instants where the signal is less regular, that is, during vortex breakdowns.

Local wavelet spectrum. Since the wavelet transform conserves energy and preserves locality in physical space, one can extend the concept of energy spectrum and define a local energy spectrum, such that

$$\tilde{E}(k, x) = \frac{1}{C_\psi k_\psi}\left|\tilde{f}\left(\frac{k_\psi}{k}, x\right)\right|^2 \quad \text{for } k \ge 0 \qquad [4]$$

where $k_\psi$ is the centroid wavenumber of the analyzing wavelet and $C_\psi$ is defined in the
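The Morlet wavelet [2] and the local spectrum [4] can be illustrated with a short numerical sketch. The code below is an assumption-laden illustration rather than the authors' implementation: it works on a periodic grid of length $2\pi$, evaluates the continuous wavelet transform with the FFT, takes $k_\psi = 6$ as an illustrative value, and omits the normalization $1/(C_\psi k_\psi)$. The names `morlet_hat` and `cwt` are hypothetical.

```python
import numpy as np

K_PSI = 6.0  # illustrative centroid wavenumber of the Morlet wavelet

def morlet_hat(k, k_psi=K_PSI):
    """Fourier transform of the Morlet wavelet, eq. [2]: zero for k <= 0."""
    k = np.asarray(k, dtype=float)
    return np.where(k > 0.0, np.exp(-0.5 * (k - k_psi) ** 2) / (2.0 * np.pi), 0.0)

def cwt(signal, scales, length=2.0 * np.pi):
    """Wavelet coefficients f~(a, b) on a periodic grid, one row per scale a.
    Each row is the inverse FFT of f_hat(k) * sqrt(a) * psi_hat(a k)."""
    n = signal.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    f_hat = np.fft.fft(signal)
    return np.array([np.fft.ifft(f_hat * np.sqrt(a) * morlet_hat(a * k))
                     for a in scales])

# Usage: a pure cosine at wavenumber 8, analyzed at scales a = k_psi / k.
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
signal = np.cos(8.0 * x)
wavenumbers = np.arange(2, 33)
coeffs = cwt(signal, K_PSI / wavenumbers)
modulus = np.abs(coeffs)
# Unnormalized space average of the local spectrum |f~(k_psi/k, x)|^2.
global_spectrum = np.mean(modulus ** 2, axis=1)
peak_wavenumber = wavenumbers[np.argmax(global_spectrum)]
```

Because $\hat{\psi}$ vanishes for $k \le 0$, the modulus of the coefficients of a pure cosine does not oscillate in $b$, which is the practical reason given above for preferring complex-valued wavelets; the space-averaged spectrum peaks at the wavenumber of the cosine.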
seen from the following relation between the two spectra:

$$\tilde{E}(k) = \frac{1}{C_\psi k}\int_0^\infty E(k')\left|\hat{\psi}\left(\frac{k_\psi k'}{k}\right)\right|^2 dk' \qquad [6]$$

which shows that the global wavelet spectrum is an average of the Fourier spectrum weighted by the square of the Fourier transform of the analyzing wavelets at wavenumber $k$. Note that the larger $k$, the larger the averaging interval, because wavelets are bandpass filters with $\Delta k / k$ constant. This property of the global wavelet energy spectrum is particularly useful to study turbulent flows. Indeed, the Fourier energy spectrum of a single realization of a turbulent flow oscillates too much for a slope to be clearly detected, which is no longer the case for the global wavelet energy spectrum, which is a better estimator of the spectral slope.

The real-valued Marr wavelet [1] has only two vanishing moments and thus can correctly measure the energy spectrum exponents up to $\beta < 5$. In the case of the complex-valued Morlet wavelet [2], only the zeroth-order moment is null, but the higher $m$th-order moments are very small ($\propto k_\psi^{m}\, e^{-k_\psi^2/2}$), provided that $k_\psi$ is larger than 5. For instance, the Morlet wavelet transform with $k_\psi = 6$ gives accurate estimates of the power-law exponent of the energy spectrum up to $\beta < 7$.

There is also a family of wavelets with an infinite number of cancelations

$$\hat{\psi}_n(k) = \alpha_n \exp\left(-\frac{1}{2}\left(k^2 + \frac{1}{k^{2n}}\right)\right), \quad n \ge 1 \qquad [7]$$

The classical measures based on structure functions can be thought of as a special case of wavelet filtering using a nonsmooth wavelet defined as the difference of two Diracs (DOD). It is this lack of regularity of the underlying wavelet that limits the adequacy of classical measures to analyze smooth signals. Wavelet-based diagnostics can overcome these limitations, and produce accurate results, whatever the signal to be analyzed.

We will link the scale-dependent moments of the wavelet coefficients and the structure functions, which are classically used to study turbulence. In the case of second-order statistics, the global wavelet spectrum corresponds to the second-order structure function. Furthermore, a rigorous bound for the maximum exponent detected by the structure functions can be computed, but there is a way to overcome this limitation by using wavelets.

The increments of a signal, also called the modulus of continuity, can be seen as its wavelet coefficients using the DOD wavelet

$$\psi^{\delta}(x) = \delta(x + 1) - \delta(x) \qquad [8]$$

We thus obtain

$$f(x + a) - f(x) = \tilde{f}_{x,a} = \langle f, \psi^{\delta}_{x,a} \rangle \qquad [9]$$

with $\psi^{\delta}_{x,a}(y) = \frac{1}{a}\left[\delta\left(\frac{y - x}{a} + 1\right) - \delta\left(\frac{y - x}{a}\right)\right]$. Note that the wavelet is normalized with respect to the $L^1$-norm. The $p$th-order structure function $S_p(a)$ therefore corresponds to the $p$th-order moment of the wavelet coefficients at scale $a$

$$S_p(a) = \int \left(\tilde{f}_{x,a}\right)^p dx \qquad [10]$$

Setting $a = k_\psi / k$, we see that the wavelet spectrum corresponds to the second-order structure function, such that

$$\tilde{E}(k) = \frac{1}{C_\psi k_\psi}\, S_2(a) \qquad [12]$$

The above results show that, if the Fourier spectrum behaves like $k^{-\alpha}$ for $k \to \infty$, $\tilde{E}(k) \propto k^{-\alpha}$ if $\alpha < 2M + 1$, where $M$ denotes the number of vanishing moments of the wavelets. Consequently, we find for $S_2(a)$ that $S_2(a) \propto a^{\zeta(p)} = (k_\psi / k)^{\zeta(p)}$ for $a \to 0$ if $\zeta(2) \le 2M$. For the DOD wavelet, we have $M = 1$, therefore, the second-order structure function can only detect slopes smaller than 2, corresponding

statistics therefore characterize intermittency. Of course, intermittency is not essential for all problems: second-order statistics are sufficient to measure dispersion (dominated by energy-containing scales), but not to calculate drag or mixing (dominated by vorticity production in thin boundary or shear layers).

To measure intermittency, one uses the space–scale information contained in the wavelet coefficients to define scale-dependent moments and moment ratios. Useful diagnostics to quantify the intermittency of a field $f$ are the moments of its wavelet coefficients at different scales $j$

$$\sum^{2^j - 1} \cdots$$
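The link between increments and structure functions in [9] and [10] can be checked on a simple signal. The sketch below assumes a periodic, uniformly sampled signal and replaces the integral in [10] by a mean over grid points; the function name `structure_function` and the test signal are illustrative choices, not taken from the article.

```python
import numpy as np

def structure_function(signal, shift, order):
    """S_p(a): p-th moment of the increments f(x + a) - f(x), eq. [9]-[10],
    for a periodic signal; 'shift' is the scale a in grid points."""
    increments = np.roll(signal, -shift) - signal
    return np.mean(increments ** order)

# A smooth signal: its Fourier spectrum decays faster than any power law.
n = 4096
x = 2.0 * np.pi * np.arange(n) / n
signal = np.sin(x)

shifts = np.array([1, 2, 4, 8, 16])
scales = shifts * (2.0 * np.pi / n)
s2 = np.array([structure_function(signal, s, 2) for s in shifts])

# Scaling exponent of S_2(a) for a -> 0, from a log-log fit.
slope = np.polyfit(np.log(scales), np.log(s2), 1)[0]
```

For $f = \sin x$ one has $S_2(a) = 1 - \cos a \approx a^2/2$, so the fitted exponent is 2 even though the signal is infinitely smooth. This is the saturation stated above for the DOD wavelet with $M = 1$.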
localized in physical space, one might try to use an on–off filter defined in physical space to extract them. However, this approach changes the spectral properties by introducing spurious discontinuities, adding an artificial scaling (e.g., $k^{-2}$ in one dimension) to the energy spectrum. To avoid these problems, we use the wavelet representation, which combines both physical and spectral space localizations (bounded from below by Heisenberg's uncertainty principle). In turbulence, the relevant rare events are the coherent vortices and the dense events correspond to the residual background flow. We have proposed a nonlinear wavelet filtering of the wavelet coefficients of vorticity to extract the coherent vortices out of turbulent flows. We now detail the different steps of this procedure.

Extraction of Coherent Structures

Principle. We propose a new method to extract coherent structures from turbulent flows, as encountered in fluids (e.g., vortices, shocklets) or plasmas (e.g., bursts), in order to study their role in transport and mixing.

We first replace the Fourier representation by the wavelet representation, which keeps track of both time and scale, instead of frequency only. The second improvement consists in changing our viewpoint about coherent structures. Since there is not yet a universal definition of coherent structures, we prefer starting from a minimal but more consensual statement about them, that everyone hopefully could agree with: "coherent structures are not noise." Using this apophatic method, we propose the following definition: "coherent structures are what remain after denoising."

For the noise we use the mathematical definition stating that a noise cannot be compressed in any functional basis. Another way to say this is to observe that the shortest description of a noise is the noise itself. Notice that often one calls "noise" what is actually "experimental noise," but not noise in the mathematical sense.

Considering our definition of coherent structures, turbulent signals can be split into two contributions: coherent bursts, corresponding to that part of the signal which can be compressed in a wavelet basis, and incoherent noise, corresponding to that part of the signal which cannot be compressed, neither in wavelets nor in any other basis. We will then check a posteriori that the incoherent contribution is spread, and therefore does not compress, in both Fourier and grid-point basis. Since we use the orthogonal wavelet representation, both coherent and incoherent components are orthogonal and therefore the $L^2$-norm, for example, energy or enstrophy, is a superposition of coherent and incoherent contributions (Mallat 1998).

Assuming that coherent structures are what remain after denoising, we need a model, not for the structures themselves, but for the noise. As a first guess, we choose the simplest model and suppose the noise to be additive, Gaussian and white, that is, uncorrelated. Having this model in mind, we use Donoho and Johnstone's theorem to compute the value to threshold the wavelet coefficients. Since the threshold value depends on the variance of the noise, which in the case of turbulence is not a priori known, we propose a recursive method to estimate it from the variance of the weakest wavelet coefficients, that is, those whose modulus is below the threshold value.

Wavelet decomposition. We describe the wavelet algorithm to extract coherent vortices out of turbulent flows and apply it as an example to a 3D turbulent flow. We consider the vorticity field $\boldsymbol{\omega} = \nabla \times \mathbf{v}$, computed at resolution $N = 2^{3J}$, $N$ being the number of grid points and $J$ the number of octaves in each spatial direction. Each vorticity component is developed into an orthogonal wavelet series from the largest scale $l_{\max} = 2^{0}$ to the smallest scale $l_{\min} = 2^{J-1}$ using a three-dimensional (3D) MRA:

$$\boldsymbol{\omega}(\mathbf{x}) = \bar{\boldsymbol{\omega}}_{0,0,0}\,\phi_{0,0,0}(\mathbf{x}) + \sum_{j=0}^{J-1}\,\sum_{i_x=0}^{2^j-1}\,\sum_{i_y=0}^{2^j-1}\,\sum_{i_z=0}^{2^j-1}\,\sum_{d=1}^{7} \tilde{\boldsymbol{\omega}}^{d}_{j,i_x,i_y,i_z}\,\psi^{d}_{j,i_x,i_y,i_z}(\mathbf{x}) \qquad [15]$$

with $\phi_{j,i_x,i_y,i_z}(\mathbf{x}) = \phi_{j,i_x}(x)\,\phi_{j,i_y}(y)\,\phi_{j,i_z}(z)$, and

$$\psi^{d}_{j,i_x,i_y,i_z}(\mathbf{x}) = \begin{cases} \psi_{j,i_x}(x)\,\phi_{j,i_y}(y)\,\phi_{j,i_z}(z) & d = 1 \\ \phi_{j,i_x}(x)\,\psi_{j,i_y}(y)\,\phi_{j,i_z}(z) & d = 2 \\ \phi_{j,i_x}(x)\,\phi_{j,i_y}(y)\,\psi_{j,i_z}(z) & d = 3 \\ \psi_{j,i_x}(x)\,\phi_{j,i_y}(y)\,\psi_{j,i_z}(z) & d = 4 \\ \psi_{j,i_x}(x)\,\psi_{j,i_y}(y)\,\phi_{j,i_z}(z) & d = 5 \\ \phi_{j,i_x}(x)\,\psi_{j,i_y}(y)\,\psi_{j,i_z}(z) & d = 6 \\ \psi_{j,i_x}(x)\,\psi_{j,i_y}(y)\,\psi_{j,i_z}(z) & d = 7 \end{cases} \qquad [16]$$

where $\phi_{j,i}$ and $\psi_{j,i}$ are the one-dimensional scaling function and the corresponding wavelet, respectively. Due to orthogonality, the scaling coefficients are given by $\bar{\boldsymbol{\omega}}_{0,0,0} = \langle \boldsymbol{\omega}, \phi_{0,0,0} \rangle$ and the wavelet coefficients are given by $\tilde{\boldsymbol{\omega}}^{d}_{j,i_x,i_y,i_z} = \langle \boldsymbol{\omega}, \psi^{d}_{j,i_x,i_y,i_z} \rangle$, where $\langle \cdot, \cdot \rangle$ denotes the $L^2$-inner product.
does not exhibit coherent structures. Hence, the wavelet compression retains all the vortex tubes and preserves their structure at all scales. Consequently, the coherent flow is as intermittent as the total flow, while the incoherent flow is structureless and non intermittent. Modeling the effect of the incoherent flow onto the coherent flow should then be much simpler than with methods based on Fourier filtering.

Figure 4 Isosurfaces of incoherent vorticity field, for $|\boldsymbol{\omega}| = \tfrac{3}{2}\sigma, 2\sigma, \tfrac{5}{2}\sigma$ with opacity 1, 0.5, 0.1, respectively. Simulation with resolution $N = 256^3$. Zoom on a subcube $64^3$. Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

Figure 5 shows the velocity PDF in semilogarithmic coordinates. We observe that the coherent velocity has the same Gaussian distribution as the total velocity, while the incoherent velocity remains Gaussian, but its variance is much smaller. The corresponding energy spectra are plotted on Figure 6. We observe that the spectrum of the coherent energy is identical to the spectrum of the total energy all along the inertial range. This implies that the vortex tubes are responsible for the $k^{-5/3}$ energy scaling, which corresponds to a long-range correlation, characteristic of 3D turbulence as predicted by Kolmogorov's theory. In contrast, the incoherent energy has a scaling close to $k^{2}$, which corresponds to an energy equipartition between all wave vectors $\mathbf{k}$, since the isotropic spectrum is obtained by integrating energy in 3D $\mathbf{k}$-space over 2D shells $k = |\mathbf{k}|$. The incoherent velocity field is therefore spatially uncorrelated, which is consistent with the observation that incoherent vorticity is structureless and homogeneous.

Figure 5 Velocity PDF, resolution $N = 256^3$ with a zoom at $64^3$ (curves: total, coherent, incoherent, Gaussian fit). Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

Figure 6 Energy spectrum, resolution $N = 256^3$ with a zoom at $64^3$ (curves: total, coherent, incoherent, Fourier cut; reference slopes $k^{-5/3}$ and $k^{2}$). Reprinted with permission from Farge et al. Coherent vortex extraction in three-dimensional homogeneous turbulence: Comparison between CVS-wavelet and POD-Fourier decompositions. Physics of Fluids 15(10): 2886–2896. Copyright 2003, American Institute of Physics.

From these observations, we propose the following scenario to interpret the turbulent cascade: the coherent energy injected at large scales is transferred towards small scales by nonlinear interactions between vortex tubes. In the meantime, these nonlinear interactions also produce incoherent energy at all scales, which is dissipated at the smallest scales by molecular kinematic viscosity. Thus, the coherent flow causes direct transfer of the coherent energy into incoherent energy. Conversely, the incoherent flow does not trigger any energy transfer to the coherent flow, as it is structureless and uncorrelated. We conjecture that the coherent flow is dynamically active, while the incoherent flow is slaved to it, being only passively advected and mixed by the coherent vortex tubes. This is a different view from the classical interpretation since it does not suppose any scale separation. Both
coherent and incoherent flows are active all along the inertial range, but they are characterized by different probability distribution functions and correlations: non-Gaussian and long-range correlated for the former, while Gaussian and uncorrelated for the latter.

Wavelet Computation

Principle

The mathematical properties of wavelets (see Wavelets: Mathematical Theory) motivate their use for solving partial differential equations (PDEs). The localization of wavelets, both in scale and space, leads to effective sparse representations of functions and pseudodifferential operators (and their inverse) by performing nonlinear thresholding of the wavelet coefficients of the function and of the matrices representing the operators. Wavelet coefficients allow one to estimate the local regularity of solutions of PDEs and thus can define autoadaptive discretizations with local mesh refinements. The characterization of function spaces in terms of wavelet coefficients and the corresponding norm equivalences lead to diagonal preconditioning of operators in wavelet space. Moreover, the existence of the fast wavelet transform yields algorithms with optimal linear complexity. The currently existing algorithms can be classified in different ways. We can distinguish between Galerkin, collocation, and hybrid schemes. Hybrid schemes combine classical discretizations, for example, finite differences or finite volumes, and wavelets, which are only used to speed up the linear algebra and to define adaptive grids. On the other hand, Galerkin and collocation schemes employ wavelets directly for the discretization of the solution and the operators. Wavelet methods have been developed to solve Burgers', Stokes, Kuramoto–Sivashinsky, nonlinear Schrödinger, Euler, and Navier–Stokes equations. As an example, we present an adaptive wavelet algorithm, of Galerkin type, to solve the 2D Navier–Stokes equations.

Adaptive Wavelet Scheme

We consider the 2D Navier–Stokes equations written in terms of vorticity $\omega$ and stream function $\Psi$, which are both scalars in two dimensions,

$$\partial_t \omega + \mathbf{v} \cdot \nabla \omega - \nu \nabla^2 \omega = \nabla \times \mathbf{F} \qquad [17]$$

$$\nabla^2 \Psi = \omega \quad \text{and} \quad \mathbf{v} = \nabla^{\perp} \Psi \qquad [18]$$

for $\mathbf{x} \in [0, 1]^2$, $t > 0$. The velocity is denoted by $\mathbf{v}$, $\mathbf{F}$ is an external force, $\nu > 0$ is the molecular kinematic viscosity, and $\nabla^{\perp} = (-\partial_y, \partial_x)$.

The above equations are completed with boundary conditions and a suitable initial condition.

Time discretization. Introducing a classical semi-implicit time discretization with a time step $\Delta t$ and setting $\omega^n(\mathbf{x}) \approx \omega(\mathbf{x}, n\Delta t)$, we obtain

$$(1 - \nu\, \Delta t\, \nabla^2)\, \omega^{n+1} = \omega^n + \Delta t\, (\nabla \times \mathbf{F}^n - \mathbf{v}^n \cdot \nabla \omega^n) \qquad [19]$$

$$\nabla^2 \Psi^{n+1} = \omega^{n+1} \quad \text{and} \quad \mathbf{v}^{n+1} = \nabla^{\perp} \Psi^{n+1} \qquad [20]$$

Hence, in each time step two elliptic problems have to be solved and a differential operator has to be applied.

Formally the above equations can be written in the abstract form $Lu = f$, where $L$ is an elliptic operator with constant coefficients. This corresponds to a Helmholtz type equation for $\omega$ with $L = (1 - \nu\, \Delta t\, \nabla^2)$ and a Poisson equation for $\Psi$ with $L = \nabla^2$.

Spatial discretization. For the spatial discretization, we use the method of weighted residuals, that is, a Petrov–Galerkin scheme. The trial functions are orthogonal wavelets $\psi$ and the test functions are operator adapted wavelets, called "vaguelettes," $\theta$. To solve the elliptic equation $Lu = f$ at time step $t^{n+1}$, we develop $u^{n+1}$ into an orthogonal wavelet series, that is, $u^{n+1} = \sum_{\lambda} \tilde{u}^{n+1}_{\lambda} \psi_{\lambda}$, where $\lambda = (j, i_x, i_y, d)$ denotes the multi-index for scale $j$, space $i$, and direction $d$. Requiring that the residual vanishes with respect to all test functions $\theta_{\lambda'}$, we obtain a linear system for the unknown wavelet coefficients $\tilde{u}^{n+1}_{\lambda}$ of the solution $u$:

$$\sum_{\lambda} \tilde{u}^{n+1}_{\lambda}\, \langle L\psi_{\lambda}, \theta_{\lambda'} \rangle = \langle f, \theta_{\lambda'} \rangle \qquad [21]$$

The test functions $\theta$ are defined such that the stiffness matrix turns out to be the identity. Therefore, the solution of $Lu = f$ reduces to a change of basis, that is, $u^{n+1} = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle\, \psi_{\lambda}$. The right-hand side (RHS) $f$ can then be developed into a biorthogonal operator adapted wavelet basis $f = \sum_{\lambda} \langle f, \theta_{\lambda} \rangle\, \mu_{\lambda}$, with $\theta_{\lambda} = L^{\star -1} \psi_{\lambda}$ and $\mu_{\lambda} = L \psi_{\lambda}$, $\star$ denoting the adjoint operator. By construction, $\theta$ and $\mu$ are biorthogonal, that is, such that $\langle \theta_{\lambda}, \mu_{\lambda'} \rangle = \delta_{\lambda, \lambda'}$. It can be shown that both have similar localization properties in physical and Fourier space as $\psi$, and that they form a Riesz basis.

Adaptive discretization. To get an adaptive space discretization for the linear problem $Lu = f$, we
consider only the significant wavelet coefficients of the solution. Hence, we only retain coefficients $\tilde{u}^{n}_{\lambda}$ whose modulus is larger than a given threshold $\varepsilon$, that is, $|\tilde{u}^{n}_{\lambda}| > \varepsilon$. The corresponding coefficients are shown in Figure 7 (white area under the solid line curve).

Figure 7 Illustration of the dynamic adaption strategy in wavelet coefficient space (axes: position $i$ and scale $j$, from 0 to $J$; retained region $|\tilde{\omega}| > \varepsilon$).

Adaption strategy. To be able to integrate the equation in time we have to account for the evolution of the solution in wavelet coefficient space (indicated by the arrow in Figure 7). Therefore, we add at time step $t^n$ the neighbors to the retained coefficients, which constitute a security zone (gray area in Figure 7). The equation is then solved in this enlarged coefficient set (white and gray areas below the curves in Figure 7) to obtain $\tilde{u}^{n+1}_{\lambda}$. Subsequently, we threshold the coefficients and retain only those whose modulus $|\tilde{u}^{n+1}_{\lambda}| > \varepsilon$ (coefficients under the dashed curve in Figure 7). This strategy is applied in each time step and hence allows one to automatically track the evolution of the solution in both scale and space.

Evaluation of the nonlinear term. For the evaluation of the nonlinear term $f(u^n)$, where the wavelet coefficients $\tilde{u}^{n}$ are given, there are two possibilities:

Evaluation in wavelet coefficient space. As illustration, we consider a quadratic nonlinear term, $f(u) = u^2$. The wavelet coefficients of $f$ can be calculated using the connection coefficients, that is, one has to calculate the bilinear expression, $\sum_{\lambda} \sum_{\lambda'} \tilde{u}_{\lambda}\, I_{\lambda \lambda' \lambda''}\, \tilde{u}_{\lambda'}$ with the interaction tensor $I_{\lambda \lambda' \lambda''} = \langle \psi_{\lambda} \psi_{\lambda'}, \theta_{\lambda''} \rangle$. Although many coefficients of $I$ are zero or very small, the size of $I$ leads to a computation which is quite intractable in practice.

Evaluation in physical space. This approach is similar to the pseudospectral evaluation of the

advantage of this scheme is that general nonlinear terms, for example, $f(u) = (1 - u)\, e^{-C/u}$, can be treated more easily. The method can be summarized as follows: starting from the significant wavelet coefficients, $|\tilde{u}_{\lambda}| > \varepsilon$, one reconstructs $u$ on a locally refined grid and gets $u(x_{\lambda})$. Then one can evaluate $f(u(x_{\lambda}))$ pointwise and the wavelet coefficients $\tilde{f}_{\lambda}$ are calculated using the adaptive decomposition.

Finally, one computes the scalar products of the RHS of [21] with the test functions $\theta$ to advance the solution in time. We compute $\tilde{u}_{\lambda} = \langle f, \theta_{\lambda} \rangle$ belonging to the enlarged coefficient set (white and gray regions in Figure 7).

The algorithm is of $O(N)$ complexity, where $N$ denotes the number of wavelet coefficients retained in the computation.

Application to 2D Turbulence

To illustrate the above algorithm we present an adaptive wavelet computation of a vortex dipole in a square domain, impinging on a no-slip wall at Reynolds number $Re = 1000$. To take into account the solid wall, we use a volume penalization method, for which both the fluid flow and the solid container are modeled as a porous medium whose porosity tends towards zero in the fluid and towards infinity in the solid region.

The 2D Navier–Stokes equations are thus modified by adding the forcing term $\mathbf{F} = -(1/\eta)\, \chi\, \mathbf{v}$ in eqn [18], where $\eta$ is the penalization parameter and $\chi$ is the characteristic function whose value is 1 in the solid region and 0 elsewhere. The equations are solved using the adaptive wavelet method in a periodic square domain of size 1.1, in which the square container of size 1 is imbedded, taking $\eta = 10^{-3}$. The maximal resolution corresponds to a fine grid of $1024^2$ points. Figure 8a shows snapshots of the vorticity field at times $t = 0.2$, 0.4, 0.6, and 0.8 (in arbitrary units). We observe that the vortex dipole is moving towards the wall and that strong vorticity gradients are produced when the dipole hits the wall. The computational grid is dynamically adapted during the flow evolution, since the nonlinear wavelet filter automatically refines the grid in regions where strong gradients develop. Figure 8b shows the centers of the retained wavelet coefficients at corresponding times.

Note that during the computation only 5% out of $1024^2$ wavelet coefficients are used. The time
nonlinear terms used in spectral methods, there- evolution of total kinetic energy and the total
fore it is called pseudowavelet technique. The enstrophy F = ( 1y )v, are plotted in Figure 9 to
Wavelets: Application to Turbulence 419
Figure 8 Dipole wall interaction at Re = 1000. (a) Vorticity field, (b) corresponding centers of the active wavelets, at t = 0.2, 0.4, 0.6,
and 0.8 (from top to bottom).
refinement, both in the boundary layers at the wall and also in shear layers which develop during the flow evolution far from the wall. Therewith, the number of grid points necessary for the computation is significantly reduced, and we conjecture that the resulting compression rate will increase with the Reynolds number.

Figure 9 Time evolution of energy (solid line) and enstrophy (dashed line). (Axes: $t$ from 0 to 0.8; vertical axis label $Z(t)$.)
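The adaption strategy used in this article (keep the coefficients above a threshold $\varepsilon$, then add their neighbors as a security zone) can be illustrated with a small sketch. It is not the authors' code: it assumes an orthonormal Haar basis on a 1D periodic grid, and the neighbor rule (adjacent positions at the same scale plus the two children at the next finer scale) and all function names are hypothetical choices made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(u):
    """Orthonormal Haar transform of a signal of length 2**J.
    Returns [approx, detail_coarsest, ..., detail_finest]."""
    approx = list(u)
    details = []
    while len(approx) > 1:
        even, odd = approx[0::2], approx[1::2]
        details.append([(e - o) / SQRT2 for e, o in zip(even, odd)])
        approx = [(e + o) / SQRT2 for e, o in zip(even, odd)]
    return [approx] + details[::-1]

def haar_inverse(coeffs):
    """Inverse of haar_forward."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        finer = []
        for a, d in zip(approx, detail):
            finer += [(a + d) / SQRT2, (a - d) / SQRT2]
        approx = finer
    return approx

def significant(coeffs, eps):
    """Mask of coefficients with modulus above eps (coarse approximation always kept)."""
    mask = [[abs(c) > eps for c in level] for level in coeffs]
    mask[0] = [True] * len(coeffs[0])
    return mask

def add_safety_zone(mask):
    """Hypothetical neighbor rule: left/right neighbors at the same scale
    and the two children at the next finer scale."""
    grown = [level[:] for level in mask]
    for j in range(1, len(mask)):
        n = len(mask[j])
        for i, active in enumerate(mask[j]):
            if not active:
                continue
            grown[j][(i - 1) % n] = True
            grown[j][(i + 1) % n] = True
            if j + 1 < len(mask):
                grown[j + 1][2 * i] = True
                grown[j + 1][2 * i + 1] = True
    return grown

def count(mask):
    return sum(sum(level) for level in mask)

# Demo: a steep front needs fine-scale coefficients only near x = 0.5.
N = 1024
u = [math.tanh((i / N - 0.5) / 0.01) for i in range(N)]
coeffs = haar_forward(u)
mask = significant(coeffs, 1e-3)
grown = add_safety_zone(mask)
filtered = [[c if m else 0.0 for c, m in zip(cl, ml)] for cl, ml in zip(coeffs, mask)]
u_eps = haar_inverse(filtered)
max_err = max(abs(a - b) for a, b in zip(u, u_eps))
print(count(mask), count(grown), max_err)
```

In a time-stepping code the enlarged set `grown` would be the index set on which the next step is solved, after which the threshold is applied again. Because the basis is orthonormal, the $L^2$ error of the filtered signal is bounded by $\varepsilon\sqrt{N}$.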
Wavelets: Applications
M Yamada, Kyoto University, Kyoto, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Wavelet analysis was first developed in the early 1980s in the field of seismic signal analysis in the form of an integral transform with a localized kernel function with continuous parameters of dilation and translation. When a seismic wave or its derivative has a singular point, the integral transform has a scaling property with respect to the dilation parameter; thus, this scaling behavior can be used to locate the singular point. In the mid-1980s, the orthonormal smooth wavelet was first constructed, and later the construction method was generalized and reformulated as multiresolution analysis (MRA). Since then, several kinds of wavelets have been proposed for various purposes, and the concept of wavelet has been extended to new types of basis functions. In this sense, the most important effect of wavelets may be that they have awakened deep interest in bases employed in data analysis and data processing. Wavelets are now widely used in various fields of research; some of their applications are discussed in this article.

From the perspective of time–frequency analysis, the wavelet analysis may be regarded as a windowed Fourier analysis with a variable window width, narrower for higher frequency. The wavelets can therefore give information on the local frequency structure of an event; they have been applied to various kinds of one-dimensional (1D) or multidimensional signals, for example, to identify an event or to denoise or to sharpen the signal.

1D wavelets $\psi^{(a,b)}(x)$ are defined as

$$\psi^{(a,b)}(x) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{x-b}{a}\right)$$

where $a\ (\neq 0)$, $b$ are real parameters and $\psi(x)$ is a spatially localized function called "analyzing wavelet" or "mother wavelet." Wavelet analysis gives a decomposition of a function into a linear combination of those wavelets, where a perfect reconstruction requires the analyzing wavelet to satisfy some mathematical conditions.

For the continuous wavelet transform (CWT), where the parameters $(a, b)$ are continuous, the analyzing wavelet $\psi(x) \in L^2(\mathbb{R})$ has to satisfy the admissibility condition
$$C_\psi \equiv \int_{-\infty}^{\infty} \frac{|\hat\psi(\omega)|^2}{|\omega|}\, d\omega < \infty$$

where $\hat\psi(\omega)$ is the Fourier transform of $\psi(x)$:

$$\hat\psi(\omega) = \int_{-\infty}^{\infty} e^{-i\omega x}\, \psi(x)\, dx$$

The admissibility condition is known to be equivalent to the condition that $\psi(x)$ has no zero-frequency component, that is, $\hat\psi(0) = 0$, under some mild condition for the decay rate at infinity. Then the CWT and its inverse transform of a data function $f(x) \in L^2(\mathbb{R})$ are defined as

$$T(a,b) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty} \overline{\psi^{(a,b)}(x)}\, f(x)\, dx$$

$$f(x) = \frac{1}{\sqrt{C_\psi}} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} T(a,b)\, \psi^{(a,b)}(x)\, \frac{da\, db}{a^2}$$

In the case of the discrete wavelet transform (DWT), the parameters $(a, b)$ are taken discrete; a typical choice is $a = 1/2^j$, $b = k/2^j$, where $j$ and $k$ are integers:

$$\psi_{j,k}(x) = 2^{j/2}\, \psi(2^j x - k)$$

In order that the wavelets $\{\psi_{j,k}(x) \mid j, k \in \mathbb{Z}\}$ may constitute a complete orthonormal system in $L^2(\mathbb{R})$, the analyzing wavelet should satisfy more stringent conditions than the admissibility condition for the CWT, and is now constructed in the framework of MRA. A data function is then decomposed by the DWT as

$$f(x) = \sum_{j,k=-\infty}^{\infty} \alpha_{j,k}\, \psi_{j,k}(x), \qquad \alpha_{j,k} = \int_{-\infty}^{\infty} \overline{\psi_{j,k}(x)}\, f(x)\, dx$$

Even when the discrete wavelets do not constitute a complete orthonormal system, they often form a wavelet frame if linear combinations of the wavelets are dense in $L^2(\mathbb{R})$ and if there are two positive constants $A$, $B$ such that the inequality

$$A\,\|f\|^2 \le \sum_{j,k} |\langle \psi_{j,k}, f \rangle|^2 \le B\,\|f\|^2$$

holds for an arbitrary $f(x) \in L^2(\mathbb{R})$. For the wavelet frame $\{\psi_{j,k}\}$, there is a corresponding dual frame, $\{\tilde\psi_{j,k}\}$, which permits the following expansion of $f(x)$:

$$f(x) = \sum_{j,k} \langle \psi_{j,k}, f \rangle\, \tilde\psi_{j,k}(x) = \sum_{j,k} \langle \tilde\psi_{j,k}, f \rangle\, \psi_{j,k}(x)$$

The wavelet frame is also employed in several applications.

From the viewpoint of applications, the CWTs are better adapted for the analysis of data functions, including the detection of singularities and patterns, while the DWTs are adapted to the data processing, including signal compression or denoising.

Singularity Detection and Multifractal Analysis of Functions

Since its birth, the wavelet analysis has been applied for the detection of singularity of a data function. The Hölder exponent $h(x_0)$ at $x_0$ of a function $f(x)$ is defined here as the largest value of the exponent $h$ such that there exists a polynomial $P_n(x)$ of degree $n$ that satisfies for $x$ in the neighborhood of $x_0$:

$$|f(x) - P_n(x - x_0)| = O(|x - x_0|^h)$$

The data function is not differentiable if $h(x_0) < 1$, but if $h(x_0) > 1$ then it is differentiable and a singularity may arise in its higher derivatives. The wavelet transform is applied to find the Hölder exponent $h(x_0)$, because $T(a, b)$ has an asymptotic behavior $T(a, b) = O(a^{h(x_0)+1/2})$ $(a \to 0)$ if the analyzing wavelet has $N\ (> h(x_0))$ vanishing moments, that is,

$$\int_{-\infty}^{\infty} x^m\, \psi(x)\, dx = 0, \qquad m \in \mathbb{Z},\ 0 \le m < N$$

A commonly used analyzing wavelet for this purpose may be the $N$-time derivative of the Gaussian function $\psi(x) = d^N(e^{-x^2/2})/dx^N$. This method works well to examine a single or some finite number of singular points of the data function.

When the data function is a multifractal function with an infinite number of singular points of various strengths, the multifractal property of the data function is often characterized by the singularity spectrum $D(h)$ which denotes the Hausdorff dimension of the set of points where $h(x) = h$. The singularity spectrum is, however, difficult to obtain directly from the CWT, and the Legendre transformation is introduced to bypass the difficulty.

Fully developed 3D fluid turbulence may be a typical example of wavelet application to the singularity detection. The Kolmogorov similarity law of fluid turbulence for the longitudinal velocity increment $\delta u(r) \equiv e \cdot (u(x + re) - u(x))$, where $u(x)$ is the velocity field and $e$ is a constant unit vector,
predicts a scaling property of the structure function; for $r$ in the inertial subrange,

$$\langle (\delta u(r))^p \rangle \sim r^{\zeta_p}, \qquad \zeta_p = p/3$$

where $\langle\ \rangle$ denotes the statistical mean. In reality, however, the scaling exponent $\zeta_p$ measured in experiments shows a systematic deviation from $p/3$, which is considered to be a reflection of intermittency, namely the spatial nonuniformity or multifractal property of active vortical motions in turbulence. For simplicity, let us consider the velocity field on a linear section of the turbulence field. According to the multifractal formalism, the turbulence velocity field has singularities of various strengths described by the singularity spectrum $D(h)$, which is related to the scaling exponent $\zeta_p$ through the Legendre transform, $D(h) = \inf_p (ph - \zeta_p + 1)$. This relation is often used to determine $D(h)$ from the knowledge of $\zeta_p$ (structure function method). However, this method does not necessarily work well because, for example, it does not capture the singular points of the Hölder exponent larger than 1 and it is unstable for $h < 0$.

These difficulties are not restricted to the turbulence research, but arise commonly when the structure function is employed to determine the singularity spectrum. In these problems, the CWT $T(a, b)$ provides an alternative method. An ingenious technique is to take only the modulus maxima of $T(a, b)$ (for each fixed $a$) to construct a partition function

$$Z(a, q) = \sum_{l \in L_{\max}} \Big[ \sup_{(a, b') \in l} |T(a, b')| \Big]^q$$

where $q \in \mathbb{R}$, and $L_{\max}$ denotes the set of all maxima lines, each of which is a continuous curve for small value of $a$, and there exists at least one maxima line toward a singular point of the Hölder exponent $h(x_0) < N$. In the limit of $a \to 0$, defining the exponent $\tau(q)$ as $Z(a, q) \sim a^{\tau(q)}$, one can obtain the singularity spectrum through the Legendre transform:

$$D(h) = \inf_q \left[ q\left(h + \tfrac{1}{2}\right) - \tau(q) \right]$$

This method (wavelet-transform modulus-maxima (WTMM) method) is advantageous in that it works also for singularities of $h > 1$ and $h < 0$. Several simple examples of multifractal functions have been successfully analyzed by this method. For fluid turbulence, this method gives a singularity spectrum $D(h)$ which has a peak value of 1 at $h \approx 1/3$, consistently with Kolmogorov similarity law, but has a convex shape around $h = 1/3$ suggesting a multifractal property. For a fractal signal, we note that the WTMM method reveals the hierarchical organization of the singularities, in the branching structure of the WT skeleton defined by the maxima lines arrangement in the $(a, b)$ half-plane.

Though the above discussion also applies to the DWT, the detection of the Hölder exponent $h$ in experimental situations is usually performed by the CWT, which has no restriction on possible values of $a$, while the DWT is often employed for theoretical discussions of singularity and multifractal structure of a function.

Multiscale Analysis

Wavelet transform expands a data function in the time–frequency or the position–wavenumber space, which has twice the dimension of the original signal, and makes it easier to perform a multiscale analysis and to identify events involved in the signal. In the wavelet transform, as stated above, the time resolution is higher at higher frequency, in contrast with the windowed Fourier transform where the time and the frequency resolutions are independent of frequency. Another advantage of wavelets is the wide variety of analyzing wavelets, which enables us to optimize the wavelet according to the purpose of data analysis. Both the CWT and the DWT are available for these time–frequency or position–wavenumber analyses. However, the CWT has properties quite different from those of familiar orthonormal bases of discrete wavelets.

Multidimensional CWT

The CWT can be formulated in an abstract way. We can regard $G = \{(a, b) \mid a\ (\neq 0),\ b \in \mathbb{R}\}$ as an affine group on $\mathbb{R}$ with the group operation of $(a, b)(a', b') = (aa', ab' + b)$ associated with the invariant measure $d\mu = da\, db/a^2$. The group $G$ has its unitary representation in the Hilbert space $H = L^2(\mathbb{R})$:

$$(U(a, b) f)(x) = \frac{1}{\sqrt{|a|}}\, f\!\left(\frac{x - b}{a}\right)$$

and the CWT can then be constructed as a linear map $W$ from $L^2(\mathbb{R})$ to $L^2(G;\, da\, db/a^2)$:

$$W : f(x) \mapsto T(a, b) = \frac{1}{\sqrt{C_\psi}}\, \langle U(a, b)\psi, f \rangle$$

where $\langle\ ,\ \rangle$ is the inner product of $L^2(\mathbb{R})$ with the complex conjugate taken at the first element, and
$\psi(x)$ is a unit vector (analyzing wavelet) satisfying the abstract admissibility condition

$$C_\psi \equiv \int_G |\langle U(a, b)\psi, \psi \rangle|^2\, d\mu < \infty$$

This formulation is applicable also to a locally compact group $G$ and its unitary and square integrable representation in a Hilbert space $H$. Note that even the canonical coherent states are included in this framework by taking the Weyl–Heisenberg group and $L^2(\mathbb{R})$ for $G$ and $H$, respectively. This abstract formulation allows us to extend the CWT to higher-dimensional Euclidean spaces and other manifolds: for example, 2D sphere $S^2$ for geophysical application and 4D manifold of spacetime taking the Poincaré group into consideration.

In $\mathbb{R}^n$, the CWT of $f(x) \in L^2(\mathbb{R}^n)$ and its inverse transform are given by

$$T(a, r, b) = \frac{1}{\sqrt{C_\psi}} \int_{\mathbb{R}^n} \overline{\psi^{(a,r,b)}(x)}\, f(x)\, dx$$

$$f(x) = \frac{1}{\sqrt{C_\psi}} \int_G T(a, r, b)\, \psi^{(a,r,b)}(x)\, \frac{da\, dr\, db}{a^{n+1}}$$

where $r \in SO(n)$, $b \in \mathbb{R}^n$, $dr$ is the normalized invariant measure of $G = SO(n)$, and the wavelets are defined as $\psi^{(a,r,b)}(x) = (1/a^{n/2})\, \psi(r^{-1}(x - b)/a)$, with the analyzing wavelet satisfying the admissibility condition

$$C_\psi = \int_{\mathbb{R}^n} \frac{|\hat\psi(w)|^2}{|w|^n}\, dw < \infty$$

Note that these wavelets are constructed not only by dilation and translation but also by rotation, which therefore gives the possibility for directional pattern detection in a data function. In the case of 2D sphere $S^2$, on the other hand, the dilation operation should be reinterpreted in such a way that at the North Pole, for example, it is the normal dilation in the tangent plane followed by lifting it to $S^2$ by the stereographic projection from the South Pole.

Generally, the abstract map $W$ thus defined is injective and therefore invertible on its image, but not surjective, in contrast with the Fourier case. Actually in the case of 1D CWT, $T(a, b)$ is subject to an integral condition:

$$T(a, b) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} K(a, b; a', b')\, T(a', b')\, \frac{da'\, db'}{a'^2}$$

$$K(a, b; a', b') = \frac{1}{C_\psi} \int_{-\infty}^{\infty} \overline{\psi^{(a,b)}(x)}\, \psi^{(a',b')}(x)\, dx$$

which defines the range of the CWT, a subspace of $L^2(\mathbb{R}^2;\, da\, db/a^2)$. Therefore, if one wants to modify $T(a, b)$ by, for example, assigning its value as zero in some parameter region just as in a filter process, care should be taken for the resultant $T(a, b)$ to be in the image of the CWT. The reason may be understood intuitively by noticing that the wavelets $\psi^{(a,b)}(x)$ are linearly dependent on each other. The expression of a data function by a linear combination of the wavelets is therefore not unique, and thus is redundant. The CWT gives only $T(a, b)$ of the least norm in $L^2(\mathbb{R}^2;\, da\, db/a^2)$. In physical interpretations of the CWT, however, this nonuniqueness is often ignored.

Pattern Detection

Edge detection The edges of an object are often the most important components for pattern detection. The edge may be considered to consist of points of sharp transition of image intensity. At the edge, the modulus of the gradient of the image $f(x, y)$ is expected to take a local maximum in the 1D direction perpendicular to the edge. Therefore, the local maxima of $|\nabla f(x, y)|$ may be the indicator of the edge. However, the image textures can also give similar sharp transitions of $f(x, y)$, and one should take into account the scale dependence which distinguishes between edges and textures. One of the practically possible ways for this purpose is to use dyadic wavelets $\psi^m_j(x, y) = 2^{j}\, \psi^m(2^j x, 2^j y)$ which are generated from the two wavelets $(\psi^1, \psi^2) = (\partial\theta/\partial x, \partial\theta/\partial y)$, where $\theta$ is a localized function (multiscale edge detection method). The dyadic wavelet transform of the image $f(x, y)$

$$T^m_j(b_1, b_2) = \langle f(x, y),\ \psi^m_j(x - b_1, y - b_2) \rangle, \qquad m = 1, 2$$

defines the multiscale edges as a set of points $b = (b_1, b_2)$ where the modulus of the wavelet transform, $|(T^1_j, T^2_j)|$, takes a locally maximum value (WTMM) in a 1D neighborhood of $b$ in the direction of $(T^1_j(b), T^2_j(b))$. Scale dependence of the magnitude of the modulus maxima is related to the Hölder exponent of $f(x, y)$ similarly to the 1D case, and thus gives information to distinguish between the edges and the textures.

Inversely, the information of WTMM $b_{j,p} = \{(b_{1,j,p}, b_{2,j,p})\}$ of multiscale edges can be used for an approximate reconstruction of the original image, although the perfect reconstruction cannot be expected because of the noncompleteness of the modulus maxima wavelets. Assuming that $\{\psi^1_{j,p}, \psi^2_{j,p}\} = \{\psi^1_j(x - b_{j,p}), \psi^2_j(x - b_{j,p})\}$ constitutes a frame of the linear closed space generated by
$\{\psi^1_{j,p}, \psi^2_{j,p}\}$, an approximate image $\hat f$ is obtained by inverting the relation

$$L\hat f \equiv \sum_m \sum_{j,p} \langle \hat f, \psi^m_{j,p} \rangle\, \psi^m_{j,p} = \sum_m \sum_{j,p} T^m_j(b_{j,p})\, \psi^m_{j,p}$$

using, for example, a conjugate gradient algorithm, where a fast calculation is possible with a filter bank algorithm for the dyadic wavelet ("algorithm à trous"). This algorithm gives only the solution of minimum norm among all possible solutions, but it is often satisfactory for practical purposes and thus is applicable also to data compression.

Directional detection For oriented features such as segments or edges in images to be detected, a directionally selective wavelet for the CWT is desired. A useful wavelet for this purpose is one that has the effective support of its Fourier transform in a convex cone with apex at the origin in wave number space. A typical example of the directional wavelet may be the 2D Morlet wavelet:

$$\psi(x) = \exp(i k_0 \cdot x)\, \exp(-|Ax|^2)$$

where $k_0$ is the center of the support in Fourier space, and $A$ is a $2 \times 2$ matrix $\mathrm{diag}[\epsilon^{-1/2}, 1]$ $(\epsilon \ge 1)$, where the admissibility condition for the CWT is approximately satisfied for $|k_0| \ge 5$. Another example is the Cauchy wavelet which has the support strictly in a convex cone in wave number space.

These wavelets have the directional selectivity with preference to a slender object in a specific direction. One of their applications is the analysis of the velocity field of fluid motion from experimental data, where many tiny plastic balls distributed in fluid give a lot of line segments in a picture taken with a short exposure. The directional wavelet analysis of the picture classifies the line segments according to their directions, indicating the directions of fluid velocity. Another example may be a wave-field analysis where many waves in different directions are superimposed; the directional wavelets allow one to decompose the wave field into the component waves. Directional wavelets have also been applied successfully to detect symmetry of objects such as crystals or quasicrystals.

Denoising and separation of signals The wavelet frame as well as the CWT give a redundant representation of a data function. If, instead of the original data, the redundant expression is transmitted, the redundancy is used to reduce the noise included in the received data because the redundancy requires the data to belong to a subspace, and the projection of the received data to the subspace reduces the noise component orthogonal to it. More specifically, the wavelet frame gives a representation of a data function as $f(x) = \sum_{j,k} \alpha_{j,k}\, \psi_{j,k}$, where the expansion coefficients $\alpha_{j,k} = \langle \tilde\psi_{j,k}, f(x) \rangle$ satisfy the defining equation of the subspace

$$\alpha_{j',k'} = \sum_{j,k} \alpha_{j,k}\, \langle \tilde\psi_{j',k'}, \psi_{j,k} \rangle$$

If the frame coefficients are transmitted, the projection operator $P$, which is defined on the right-hand side of the above equation, reduces the noise in the received coefficients $\alpha_{j,k}$ contaminated during the transmission.

However, this method is not applicable if the transmitted signal is not redundant. Then some a priori criterion is necessary to discriminate between signal and noise. Various criteria have been proposed in different fields. If the signal and the noise, or plural signals, have different power-law forms of spectra, then their discrimination may be possible by the DWT at higher-frequency region where the difference in the magnitude of the coefficients is significant. In this approach, the wavelets of Meyer type, that is, an orthogonal wavelet with a compact support in Fourier space, may be preferable because the wavelets of different scales are separated, at least to some extent, in Fourier space.

In fluid dynamics, the vorticity field of 2D turbulence is found to be decomposed into coherent and incoherent vorticity fields, according as the CWT is larger than a threshold value or not, respectively. These two fields give different Fourier spectra of the velocity field ($k^{-5}$ for coherent part while $k^{-3}$ for incoherent part), showing that the coherent structures are responsible for the deviation from $k^{-3}$ predicted by the classical enstrophy cascade theory. In an astronomical application, on the other hand, the data processing is performed by a more sophisticated method taking into account interscale relation in the wavelet transform, because an astronomical image contains various kinds of objects, including stars, double-stars, galaxies, nebulas, and clusters. In a medical image, however, contrast analysis is indispensable for diagnostic imaging to get a clear detailed picture of organic structure. A scale-dependent local contrast is defined as the ratio of the CWT to that given by an analyzing wavelet with a larger support. A multiplicative scheme to improve the contrast is constructed by using the local contrast.

Signal Compression

Signal compression is quite an important technology in digital communication. Speech, audio, image, and digital video are all important fields of signal
compression, and plenty of compression methods have been put to practical use, but we mention here only a few.

The MRA for orthogonal wavelets gives a successive procedure to decompose a subspace of $L^2(\mathbb{R})$ into a direct sum of two subspaces corresponding to higher- and lower-frequency parts, only the latter of which is decomposed again into its higher- and lower-frequency parts. Algebraically, this procedure was already known before the discovery of MRA in filter theory in electrical engineering, where a discretely sampled signal is convolved with a filter series to give, for example, a high-pass-filtered or low-pass-filtered series. An appropriately designed pair of a high-pass and a low-pass filter followed by the downsampling yields two new series corresponding to the higher- and lower-frequency parts, respectively, which are then reversible by another two reconstruction filters with the upsampling. These four filters, which are often employed in a widely used technique of "sub-band coding," then constitute a perfect reconstruction filter bank. Under some conditions, successive applications of this decomposition process to the series of lower-frequency parts, which is equivalent to the nesting structure of MRA, have been used for data compression (quadrature mirror filter). A famous example is a data compression system of FBI for fingerprints, consisting of wavelet coding with scalar quantization.

In MRA, however, it is only the lower-frequency parts that are successively decomposed. If both the lower- and the higher-frequency parts are repeatedly decomposed by the decomposition filters, then the successive convolution processes correspond to a decomposition of data function by a set of wavelet-like functions, called "wavelet packet," where there are choices whether to decompose the higher- and/or the lower-frequency parts. The best wavelet packet, in the sense of the entropy, for example, within a specified number of decompositions, often provides a powerful tool for data compression in several areas, including speech analysis and image analysis. We also note that from the viewpoint of the best basis which minimizes the statistical mean square error of the thresholded coefficients, an orthonormal wavelet basis gives a good concentration of the energy if the original signal is a piecewise smooth function superimposed by a white noise, which is thus efficiently removed by thresholding the coefficients. The efficiency of a wavelet expansion of a signal is sometimes evaluated with the entropy of "probability" defined as $|\alpha_{j,k}|^2/\|f\|^2$. A better wavelet can be selected by reducing the entropy, practically from among some set of wavelets, and its restricted expansion coefficients give a compressed signal. One of the systematic methods to generate such a suitable basis is also to employ the wavelet packets.

Numerical Calculation

Application of wavelet transform, especially of the DWT, to numerical solver for a differential equation (DE) has long been studied. At first sight, the wavelets appear to give a good DE solver because the wavelet expansion is generally quite efficient compared to Fourier series due to its spatial localization. But its implementation in an efficient computer code is not so straightforward; research is still continuing for concrete problems. Application of the CWT to spectral method for partial differential equation (PDE) has been studied extensively. There is no wavelet which diagonalizes the differential operator $\partial/\partial x$; therefore, an efficient numerical method is necessary for derivatives of wavelets. Products of wavelets also yield another numerical problem. MRA brings about mesh points which are adaptive to some extent, but finite element method still gives more flexible mesh points.

For some scaling-invariant differential or integral operators, including $\partial^2/\partial x^2$, Abel transformations, and Riesz potential, adaptive biorthogonal wavelets can be provided with block-diagonal Galerkin representations, which has been applied to data processing. Generally, simultaneous localization of wavelets, both in space and in scale, leads to a sparse Galerkin representation for many pseudodifferential operators and their inverses. A thresholding technique with DWT has been introduced to coherent vortex simulation of the 2D Navier–Stokes equations, to reduce the relevant wavelet coefficients. Another promising application of wavelet occurs as a preprocessor for an iterative Poisson solver, where a wavelet-based preconditioning leads to a matrix with a bounded condition number.

Other Wavelets and Generalizations

Several new types of wavelets have been proposed: "coiflet" whose scaling function has vanishing moments giving expansion coefficients approximately equal to values of the data functions, and "symlet" which is an orthonormal wavelet with a nearly symmetric profile. Multiwavelets are wavelets which give a complete orthonormal system in $L^2$ space. In 2D or multidimensional applications of the DWT, separable orthonormal wavelets consisting of tensor products of 1D orthonormal wavelets are frequently used, while nonseparable orthonormal wavelets are also available. Another generalization
426 Wavelets: Mathematical Theory
Note that these filters have a variable width $\Delta k/k$; therefore, when the wave number increases, the
Figure 1 shows wavelet analyses of a cosine, two sines, a Dirac, and a characteristic function. Below
the four signals we plot the modulus and the phase of the corresponding wavelet coefficients.

Higher Dimensions

The continuous wavelet transform can be extended to higher dimensions in $L^2(\mathbb{R}^n)$ in different ways. Either we define spherically symmetric wavelets by setting $\psi(x) = \psi_{1d}(|x|)$ for $x \in \mathbb{R}^n$ or we introduce in addition to dilations $a \in \mathbb{R}^+$ and translations $b \in \mathbb{R}^n$ also rotations to define wavelets with a directional sensitivity. In the two-dimensional case, we obtain for example,

$$\psi_{a,b,\theta}(x) = \frac{1}{a}\, \psi\!\left(R_\theta^{-1}\, \frac{x - b}{a}\right) \qquad [16]$$

where $a \in \mathbb{R}^+$, $b \in \mathbb{R}^2$, and where $R_\theta$ is the rotation matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad [17]$$

The analysis formula [8] then becomes

$$\tilde f(a, b, \theta) = \int_{\mathbb{R}^2} f(x)\, \overline{\psi_{a,b,\theta}(x)}\, dx \qquad [18]$$

and for the corresponding inverse wavelet transform [11] we obtain

$$f(x) = \frac{1}{C_\psi} \int_0^{\infty}\!\int_{\mathbb{R}^2}\!\int_0^{2\pi} \tilde f(a, b, \theta)\, \psi_{a,b,\theta}(x)\, \frac{da\, db\, d\theta}{a^3} \qquad [19]$$

Similar constructions can be made in dimensions larger than 2 using $n - 1$ angles of rotation.
Figure 1 Examples of a one-dimensional continuous wavelet analysis using the complex-valued Morlet wavelet. Each subfigure shows on the top the function to be analyzed and below (left) the modulus of its wavelet coefficients and below (right) the phase of its wavelet coefficients. (Panels: cosine, two sines, Dirac, characteristic function.)
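As a rough numerical counterpart to Figure 1, the following sketch evaluates a 1D continuous wavelet transform with a complex Morlet wavelet by direct summation. It is an illustration only: the Morlet parameter `k0 = 6`, the $a^{-1/2}$ normalization, the cut-off of the Gaussian envelope, and the function names `morlet` and `cwt_point` are assumptions rather than the settings used to produce the figure.

```python
import cmath
import math

def morlet(t, k0=6.0):
    """Complex Morlet wavelet exp(i*k0*t) * exp(-t**2 / 2) (assumed form)."""
    return cmath.exp(1j * k0 * t) * math.exp(-0.5 * t * t)

def cwt_point(signal, a, ib, dx=1.0, k0=6.0):
    """Wavelet coefficient at scale a and position b = ib*dx, by direct summation.
    Uses the normalization a**-0.5 and the complex conjugate of the wavelet."""
    acc = 0j
    for ix, fx in enumerate(signal):
        t = (ix - ib) * dx / a
        # The Gaussian envelope is negligible for |t| >= 8.
        if fx != 0.0 and abs(t) < 8.0:
            acc += morlet(t, k0).conjugate() * fx
    return acc * dx / math.sqrt(a)

n = 256
k = 2.0 * math.pi / 16.0                      # wave number of the test cosine
cosine = [math.cos(k * i) for i in range(n)]
dirac = [1.0 if i == n // 2 else 0.0 for i in range(n)]

scales = [4.0, 8.0, 6.0 / k, 30.0]            # 6/k is the scale matched to the cosine
mod_cos = [abs(cwt_point(cosine, a, n // 2)) for a in scales]
mod_dirac = [abs(cwt_point(dirac, a, n // 2)) for a in scales]
phase_cos = cmath.phase(cwt_point(cosine, 6.0 / k, n // 2))

best_scale = scales[mod_cos.index(max(mod_cos))]
print(best_scale, mod_cos, mod_dirac, phase_cos)
```

For the cosine the modulus is concentrated at the scale $a = k_0/k$, whereas for the Dirac the modulus at the position of the impulse is $a^{-1/2}$ at every scale, so the impulse is visible at all scales.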
A scaling function $\phi(x)$ is required to exist. Its translates generate a basis in each $V_j$, that is,

$$V_j = \overline{\mathrm{span}}\{\phi_{ji}\}_{i \in \mathbb{Z}} \qquad [29]$$

where

$$\phi_{ji}(x) = 2^{j/2}\, \phi(2^j x - i), \qquad j, i \in \mathbb{Z} \qquad [30]$$

At a given scale $j$, this basis is orthonormal with respect to its translates by steps $i/2^j$ but not to its dilates,

$$\langle \phi_{ji}, \phi_{jk} \rangle = \delta_{ik} \qquad [31]$$

The nestedness of the approximation spaces [28] generated by the scaling function implies that it satisfies a refinement equation:

$$\phi_{j-1,i}(x) = \sum_{n=-\infty}^{\infty} h_{n-2i}\, \phi_{jn}(x) \qquad [32]$$

with the filter coefficients $h_n = \langle \phi_{jn}, \phi_{j-1,0} \rangle$, which determine the scaling function completely. In general, only the filter coefficients $h_n$ are known and no analytical expression of $\phi$ is given. Equation [32] implies that the approximation of a function at coarser scale can be described by linear combinations of the same function at finer scales.

The orthogonal projection of a function $f \in L^2(\mathbb{R})$ on $V_J$ is defined as

$$P_{V_J} : f \to P_{V_J} f = f_J \qquad [33]$$

with

$$f_J(x) = \sum_{k \in \mathbb{Z}} \langle f, \phi_{Jk} \rangle\, \phi_{Jk}(x) \qquad [34]$$

with $g_n = \langle \phi_{jn}, \psi_{j-1,0} \rangle$, and where $\psi_{ji}(x) = 2^{j/2}\, \psi(2^j x - i)$, $j, i \in \mathbb{Z}$ (cf. Figure 2). The filter coefficients $g_n$ can be computed from the filter coefficients $h_n$ using the relation

$$g_n = (-1)^{1-n}\, h_{1-n} \qquad [38]$$

The translates and dilates of the wavelet constitute orthonormal bases of the spaces $W_j$,

$$W_j = \overline{\mathrm{span}}\{\psi_{ji}\}_{i \in \mathbb{Z}} \qquad [39]$$

As in the continuous case, the wavelets have vanishing mean, and also possibly vanishing higher-order moments; therefore,

$$\int_{-\infty}^{\infty} x^m\, \psi(x)\, dx = 0 \quad \text{for } m = 0, \ldots, M - 1 \qquad [40]$$

Let us now consider approximations of a function $f \in L^2(\mathbb{R})$ at two different scales $j$:

at scale $j$

$$f_j(x) = \sum_{i=-\infty}^{\infty} \bar f_{ji}\, \phi_{ji}(x) \qquad [41]$$

at scale $j - 1$

$$f_{j-1}(x) = \sum_{i=-\infty}^{\infty} \bar f_{j-1,i}\, \phi_{j-1,i}(x) \qquad [42]$$

with the scaling coefficients
differences. These details are needed to go from one scale $j$ to the next finer scale $j+1$ for successive $j = 0, \dots, J-1$,

$$f(x) = \sum_{i=-\infty}^{\infty} \bar f_{0,i}\,\phi_{0,i}(x) + \sum_{j=0}^{\infty} \sum_{i=-\infty}^{\infty} \tilde f_{ji}\,\psi_{ji}(x) \qquad [46]$$

For numerical applications, the sums in eqn [46] have to be truncated in both scale $j$ and position $i$. The truncation in scale corresponds to a limitation of $f$ to a given finest scale $J$, which is in practice imposed by the available sampling rate. Due to the finite length of the available data, the sum over $i$ also becomes finite. The decomposition [46] is orthogonal, as, by construction,

upsampling which adds zeros in between two coefficients.

Reconstruction: for $j = 1$ to $J$, step 1, do

$$\bar f_{ji} = \sum_{n=-\infty}^{\infty} h_{i-2n}\,\bar f_{j-1,n} + \sum_{n=-\infty}^{\infty} g_{i-2n}\,\tilde f_{j-1,n} \qquad [51]$$

The FWT was introduced by Stéphane Mallat in 1989. If the scaling functions (and wavelets) are compactly supported, the filters $h_n$ and $g_n$ have only a finite number of nonvanishing coefficients. In this case, the numerical complexity of the FWT is $O(N)$, where $N$ denotes the number of samples.
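A minimal sketch of the fast wavelet transform may help make the filter relations concrete. It builds $g_n$ from $h_n$ through eqn [38], projects onto the coarser scale with the filters $h_{n-2i}$ and $g_{n-2i}$ (the form suggested by the refinement equation [32]), and reconstructs with eqn [51]. The Haar filter, the periodic boundary treatment, the dictionary representation of the filters, and all function names are assumptions made for illustration.

```python
import math


def wavelet_filter(h):
    """Return g with g_n = (-1)^(1-n) * h_(1-n) (eqn [38]); filters are dicts {n: value}."""
    # Substituting m = 1 - n gives g_(1-m) = (-1)^m * h_m.
    return {1 - m: (hm if m % 2 == 0 else -hm) for m, hm in h.items()}


def decompose(fine, h, g):
    """One step to the next coarser scale, with periodic wrap-around (an assumption)."""
    size = len(fine)
    coarse = [0.0] * (size // 2)
    detail = [0.0] * (size // 2)
    for i in range(size // 2):
        for k, hk in h.items():
            coarse[i] += hk * fine[(k + 2 * i) % size]   # sum_n h_(n-2i) f_n
        for k, gk in g.items():
            detail[i] += gk * fine[(k + 2 * i) % size]   # sum_n g_(n-2i) f_n
    return coarse, detail


def reconstruct(coarse, detail, h, g):
    """One step to the next finer scale, following eqn [51]."""
    size = 2 * len(coarse)
    fine = [0.0] * size
    for n in range(len(coarse)):
        for k, hk in h.items():
            fine[(k + 2 * n) % size] += hk * coarse[n]   # h_(i-2n) term
        for k, gk in g.items():
            fine[(k + 2 * n) % size] += gk * detail[n]   # g_(i-2n) term
    return fine


def fwt(samples, h, g, levels):
    """Decompose repeatedly; details are returned from finest to coarsest."""
    coarse, details = list(samples), []
    for _ in range(levels):
        coarse, detail = decompose(coarse, h, g)
        details.append(detail)
    return coarse, details


def ifwt(coarse, details, h, g):
    """Invert fwt by reconstructing from the coarsest scale upward."""
    for detail in reversed(details):
        coarse = reconstruct(coarse, detail, h, g)
    return coarse


# Haar filter, chosen only because it is the shortest orthogonal example.
haar_h = {0: 1.0 / math.sqrt(2.0), 1: 1.0 / math.sqrt(2.0)}
haar_g = wavelet_filter(haar_h)
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = fwt(signal, haar_h, haar_g, levels=3)
restored = ifwt(coarse, details, haar_h, haar_g)
```

Each level touches every coefficient a fixed number of times (the filter length) and halves the data, which is the origin of the $O(N)$ operation count for compactly supported filters.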
Figure 3 Orthogonal wavelets Coiflet C12. (a) Scaling function $\phi(x)$ (left) and $|\hat\phi(\omega)|$ (right). (b) Wavelet $\psi(x)$ (left) and $|\hat\psi(\omega)|$ (right).
Figure 4 Schematic representation of the 2D wavelet transforms: (a) tensor product construction and (b) 2D MRA.
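[Figure: schematic relating the spaces $C^{\alpha}(\mathbb{R}^d)$ and $W^{s,p}(\mathbb{R}^d)$ by embedding, with linear approximation of order $O(N^{-t/d})$ and nonlinear approximation of order $O(N^{-t/d})$.]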
N ¼ fk ;k ¼ 1; Nj ke
fk klp > ke
f klp 8 2 g ½63
1 1 s
¼ þ
q p d
representation in wavelet space, that is, to sparse matrices. For integral operators, for example, Calderón–Zygmund operators $T$ on $\mathbb{R}$ defined by

$$Tf(x) = \int_{\mathbb{R}} K(x, y)\,f(y)\,dy \qquad [65]$$

where the kernel $K$ satisfies

$$|K(x, y)| \le \frac{C}{|x - y|}$$

and

$$\left|\frac{\partial}{\partial x} K(x, y)\right| + \left|\frac{\partial}{\partial y} K(x, y)\right| \le \frac{C}{|x - y|^2}$$

their wavelet representation $\langle T\psi_{j,i}, \psi_{j',i'} \rangle$ is sparse and a large number of weak coefficients can be suppressed by simple thresholding of the matrix entries while controlling the precision. The resulting numerical scheme is called the BCR algorithm and is due to Beylkin et al. (1991).

The characterization of function spaces by the decay of the wavelet coefficients and the corresponding norm equivalences can be used for diagonal preconditioning of integral or differential operators, which leads to matrices with uniformly bounded condition numbers. For elliptic differential operators, for example, the Laplace operator $\nabla^2$, the norm equivalence $\|\nabla^2 f\| \simeq \|2^{2j} \tilde f_{ji}\|$ can be used for preconditioning the matrix $\langle \nabla^2 \psi_{j,i}, \psi_{j',i'} \rangle$ by a simple diagonal scaling with $2^{-2j}$ to obtain a uniformly bounded condition number. For further details, we refer to the book of Cohen (2000).

Wavelet Denoising

We consider a function $f$ which is corrupted by a Gaussian white noise $n \in \mathcal{N}(0, \sigma^2)$. The noise is spread over all wavelet coefficients $\tilde s$, while, typically, the original function $f$ is determined by only a few significant wavelet coefficients. The aim is then to reconstruct the function $f$ from the observed noisy signal $s = f + n$.

The principle of the wavelet denoising can be summarized in the following procedure:

Decomposition. Compute the wavelet coefficients $\tilde s$ using the FWT.
Thresholding. Apply the thresholding function $\rho_\varepsilon$ to the wavelet coefficients $\tilde s$, thus reducing the relative importance of the coefficients with small absolute value.
Reconstruction. Reconstruct a denoised version $s_C$ from the thresholded wavelet coefficients using the fast inverse wavelet transform.

The thresholding parameter $\varepsilon$ depends on the variance of the noise and on the sample size $N$. The thresholding function we consider corresponds to hard thresholding:

$$\rho_\varepsilon(a) = \begin{cases} a & \text{if } |a| > \varepsilon \\ 0 & \text{if } |a| \le \varepsilon \end{cases} \qquad [66]$$

Donoho and Johnstone (1994) have shown that there exists an optimal $\varepsilon$ for which the relative quadratic error between the signal $s$ and its estimator $s_C$ is close to the minimax error for all signals $s \in H$, where $H$ belongs to a wide class of function spaces, including Hölder and Besov spaces. They showed that using the threshold

$$\varepsilon_D = \sigma_n \sqrt{2 \ln N} \qquad [67]$$

yields an error which is close to the minimum error. The threshold $\varepsilon_D$ depends only on the sampling $N$ and on the variance of the noise $\sigma_n^2$; hence, it is called the universal threshold. However, in many applications, $\sigma_n$ is unknown and has to be estimated from the available noisy data $s$. For this, the present authors have developed an iterative algorithm (see Azzolini et al. (2005)), which is sketched in the following:

1. Initialization
(a) given $s_k$, $k = 0, \dots, N-1$, set $i = 0$ and compute the FWT of $s$ to obtain $\tilde s$;
(b) compute the variance $\sigma_0^2$ of $s$ as a rough estimate of the variance of $n$ and compute the corresponding threshold $\varepsilon_0 = (2 \ln N\, \sigma_0^2)^{1/2}$;
(c) set the number of coefficients considered as noise $N_{\mathrm{noise}} = N$.
2. Main loop: repeat
(a) set $N'_{\mathrm{noise}} = N_{\mathrm{noise}}$ and count the wavelet coefficients $N_{\mathrm{noise}}$ with modulus smaller than $\varepsilon_i$;
(b) compute the new variance $\sigma_{i+1}^2$ from the wavelet coefficients whose modulus is smaller than $\varepsilon_i$ and the new threshold $\varepsilon_{i+1} = (2 (\ln N)\, \sigma_{i+1}^2)^{1/2}$;
(c) set $i = i + 1$
until ($N'_{\mathrm{noise}} = N_{\mathrm{noise}}$).
3. Final step
(a) compute $s_C$ from the coefficients with modulus larger than $\varepsilon_i$ using the inverse FWT.

Example To illustrate the properties of the denoising algorithm, we apply it to a one-dimensional test signal. We construct a noisy signal $s$ by superposing a Gaussian white noise, with zero mean and variance $\sigma_W^2 = 1$, to a function $f$, normalized such that $\left((1/N) \sum_k |f_k|^2\right)^{1/2} = 10$. The number of samples is
Figure 9 Construction (top) of a 1D noisy signal $s = f + n$ (middle), and results obtained by the recursive denoising algorithm (bottom). Panels: $f$ and $n$ (top), $s$ (middle), $s_C$ and $s - s_C$ (bottom).
$N = 8192$. Figure 9a shows the function $f$ together with the noise $n$; Figure 9b shows the constructed noisy signal $s$ and Figure 9c shows the wavelet denoised signal $s_C$ together with the extracted noise.

Acknowledgments

Marie Farge thankfully acknowledges Trinity College, Cambridge, UK, and CIRM, Marseille, France, for support while writing this paper. The authors also thank Barbara Burke for kindly revising their English.

See also: Coherent States; Fractal Dimensions in Dynamics; Homeomorphisms and Diffeomorphisms of the Circle; Image Processing: Mathematics; Wavelets: Application to Turbulence; Wavelets: Applications.

Further Reading

Azzolini A, Farge M, and Schneider K (2005) Nonlinear wavelet thresholding: A recursive method to determine the optimal denoising threshold. Applied and Computational Harmonic Analysis 18(2): 177.
Beylkin G, Coifman R, and Rokhlin V (1991) Fast wavelet transforms and numerical algorithms. Communications on Pure and Applied Mathematics 44: 141.
Cohen A (2000) Wavelet methods in numerical analysis. In: Ciarlet PG and Lions JL (eds.) Handbook of Numerical Analysis, vol. 7. Amsterdam: Elsevier.
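The recursive threshold estimation described in the section on wavelet denoising can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is taken to be a list of wavelet coefficients already produced by an orthogonal FWT, the variance is computed as the mean square of the coefficients (zero mean assumed), and the synthetic test vector (eight large coefficients plus unit-variance Gaussian noise) and all names are hypothetical.

```python
import math
import random


def hard_threshold(a, eps):
    """Hard thresholding of eqn [66]: keep a if |a| > eps, else 0."""
    return a if abs(a) > eps else 0.0


def recursive_threshold(coeffs, max_iter=100):
    """Iterate the threshold (2 ln N * variance)^(1/2) until the noise count is stable."""
    n = len(coeffs)
    log_term = 2.0 * math.log(n)
    # Step 1(b): rough variance estimate from all coefficients (zero mean assumed).
    variance = sum(c * c for c in coeffs) / n
    eps = math.sqrt(log_term * variance)
    n_noise = n  # step 1(c)
    for _ in range(max_iter):  # max_iter is a safety guard added for this sketch
        previous = n_noise  # step 2(a)
        noise = [c for c in coeffs if abs(c) < eps]
        n_noise = len(noise)
        if n_noise == 0:
            break
        # Step 2(b): new variance and threshold from the coefficients deemed noise.
        variance = sum(c * c for c in noise) / n_noise
        eps = math.sqrt(log_term * variance)
        if n_noise == previous:  # step 2(c): stop when the count no longer changes
            break
    return eps


# Synthetic coefficients standing in for the FWT of a noisy signal (assumption).
rng = random.Random(0)
n_samples = 1024
signal_positions = [0, 100, 200, 300, 400, 500, 600, 700]
coeffs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
for p in signal_positions:
    coeffs[p] += 50.0

eps = recursive_threshold(coeffs)
denoised = [hard_threshold(c, eps) for c in coeffs]  # step 3, before the inverse FWT
kept = [k for k, c in enumerate(denoised) if c != 0.0]
```

Because each new threshold is computed only from coefficients below the previous one, the sequence of thresholds cannot increase, so the count of noise coefficients stabilizes after finitely many passes.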
438 WDVV Equations and Frobenius Manifolds
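Main Definition

WDVV equations of associativity (after E Witten, R Dijkgraaf, E Verlinde, and H Verlinde) are tantamount to the following problem: find a function $F(v)$ of $n$ variables $v = (v^1, v^2, \dots, v^n)$ satisfying the conditions [1], [3], and [4] given below. First,

$$\eta_{\alpha\beta} := \frac{\partial^3 F(v)}{\partial v^1\,\partial v^\alpha\,\partial v^\beta} \qquad [1]$$

must be a constant symmetric nondegenerate matrix. Denote $(\eta^{\alpha\beta}) = (\eta_{\alpha\beta})^{-1}$ the inverse matrix and introduce the functions …

$$EF = (3 - d)F + \tfrac{1}{2} A_{\alpha\beta} v^\alpha v^\beta + B_\alpha v^\alpha + C \qquad [4]$$

where

$$E = \left(a^\alpha_\beta v^\beta + b^\alpha\right) \frac{\partial}{\partial v^\alpha}$$

for some constants $a^\alpha_\beta$, $b^\alpha$ satisfying

$$a^\alpha_1 = \delta^\alpha_1, \quad b^1 = 0$$

$A_{\alpha\beta}$, $B_\alpha$, $C$, $d$ are some constants. $E$ is called the Euler vector field and $d$ is the charge of the Frobenius manifold.

For $n = 1$ one has $F(v) = (1/6)v^3$. For $n = 2$ one can choose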
It is also allowed to add to $F(v)$ a polynomial of the degree at most 2. To consider more general nonlinear changes of coordinates one has to give a coordinate-free form of the above equations [1], [3], [4]. This gives rise to the notion of Frobenius manifold introduced in Dubrovin (1992).

Recall that a Frobenius algebra is a pair $(A, \langle\ ,\ \rangle)$, where $A$ is a commutative associative algebra with a unity $e$ over a field $k$ (we will consider only the cases $k = \mathbb{R}, \mathbb{C}$) and $\langle\ ,\ \rangle$ is a $k$-bilinear symmetric nondegenerate invariant form on $A$, that is,

$$\langle x \cdot y, z \rangle = \langle x, y \cdot z \rangle$$

for arbitrary vectors $x, y, z$ in $A$.

Definition Frobenius structure $(\cdot, e, \langle\ ,\ \rangle, E, d)$ on the manifold $M$ is a structure of a Frobenius algebra on the tangent spaces $T_v M = (A_v, \langle\ ,\ \rangle_v)$ depending (smoothly, analytically, etc.) on the point $v \in M$. It must satisfy the following axioms.

FM1. The curvature of the metric $\langle\ ,\ \rangle_v$ on $M$ (not necessarily positive definite) vanishes. Denote $\nabla$ the Levi-Civita connection for the metric. The unity vector field $e$ must be flat, $\nabla e = 0$.

FM2. Let $c$ be the 3-tensor $c(x, y, z) := \langle x \cdot y, z \rangle$, $x, y, z \in T_v M$. The 4-tensor $(\nabla_w c)(x, y, z)$ must be symmetric in $x, y, z, w \in T_v M$.

FM3. A linear vector field $E \in \mathrm{Vect}(M)$ (called Euler vector field) must be fixed on $M$, that is, $\nabla\nabla E = 0$, such that …

The structure constants of the Frobenius algebra $A_v = T_v M$

$$\frac{\partial}{\partial v^\alpha} \cdot \frac{\partial}{\partial v^\beta} = c^\gamma_{\alpha\beta}(v)\,\frac{\partial}{\partial v^\gamma} \qquad [6]$$

can be locally represented by third derivatives [2] of a function $F(v)$ satisfying [1], [3], [4]. The function $F(v)$ is called "potential" of the Frobenius manifold. It is defined up to adding of an at most quadratic polynomial in $v^1, \dots, v^n$.

A generalization of the above definition to the case of Frobenius supermanifolds can be found in Manin (1999). For the more general class of the so-called F-manifolds, the requirement of the existence of a flat invariant metric has been relaxed.

Deformed Flat Connection

One of the main geometrical structures of the theory of Frobenius manifolds is the deformed flat connection. This is a symmetric affine connection on $M \times \mathbb{C}^*$ defined by the following formulas:

$$\tilde\nabla_x y = \nabla_x y + z\,x \cdot y, \quad x, y \in TM, \ z \in \mathbb{C}^*$$
$$\tilde\nabla_{d/dz}\,y = \partial_z y + E \cdot y - \frac{1}{z} V y \qquad [7]$$
$$\tilde\nabla_x \frac{d}{dz} = \tilde\nabla_{d/dz} \frac{d}{dz} = 0$$
Definition A ‘‘deformed flat function’’ f (v; z) on a Here the multiplication law on the cotangent planes
domain in M C is defined by the requirement of is defined by means of the isomorphism.
horizontality of the differential df
< ; > : TM ! T M
~ ¼0
rdf ½9 The discriminant
M is a proper analytic (for an
Due to vanishing of the curvature of r̃ locally analytic M) subset where the intersection form
there exist n independent deformed flat functions degenerates. One can introduce a new metric on
f1 (v; z), . . . , fn (v; z) such that their differentials, the open subset Mn taking the inverse of the
together with the flat 1-form dz, span the cotangent intersection form. A remarkable result of the theory
of Frobenius manifolds is vanishing of the curvature
plane T(v; z) (M C ). They will be called ‘‘deformed
flat coordinates.’’ The global analytic properties of of this new metric. Moreover, the new flat metric
deformed flat coordinates can be derived, for the together with the following new multiplication:
case of semisimple Frobenius manifolds, from the
x y :¼ x y E1
results of the section ‘‘Moduli of semisimple
Frobenius manifolds’’ discussed later. defines on Mn a structure of an almost-dual
One can relax the definition of Frobenius manifold Frobenius manifold (Dubrovin 2004). In the original
dropping the last axiom FM3. The potential F(v) in flat coordinates v1 , ... , vn the coordinate expressions
this case satisfies [1] and [3] but not [4]. In this case, for the new metric and for the associated Levi-Civita
the deformed flat connection r̃ is just a family of connection r , called the Gauss–Manin connection,
affine flat connections on M depending on the read
parameter z 2 C given by the first line in [7]. The
curvature and torsion of this family of connections g ðvÞ :¼ ðdv ; dv Þ ¼ E ðvÞc
ðvÞ
vanishes identically in z. The deformed flat functions r dv ¼ ðvÞ dv
Thus, f (v; 0) is just an affine linear function of the ð ; Þ :¼ ð ; Þ < ; > ½13
flat coordinates v1 , . . . , vn ; the dependence on z can defines a metric with vanishing curvature. Flat
be considered as a deformation of the affine functions p = p(v; ) for the flat metric are deter-
structure. This motivates the name ‘‘deformed flat mined from the system
coordinates.’’ The coefficients of the expansions of
the deformed flat coordinates are the leading terms ðr rÞ dp ¼ 0 ½14
of the "-expansion of the Hamiltonian densities They are called ‘‘periods’’ of the Frobenius manifold.
of the integrable hierarchies associated with the The periods p(v; ) are related to the deformed flat
Frobenius manifolds (see below). functions f (v; z) by the suitably regularized Laplace-
type integral transform
Z 1
dz
Intersection Form of a pðv; Þ ¼ ez f ðv; zÞ pffiffiffi ½15
0 z
Frobenius Manifold
Choosing a system of n independent periods, one
Another important geometric structure on M is the obtains a system of flat coordinates p1 (v; ), . . . ,
intersection form of the Frobenius manifold. It is a pn (v; ) for the metric ( , ) on Mn ,
symmetric bilinear form on the cotangent bundle i
T M defined by the formula dp ðv; Þ; dpj ðv; Þ ¼ Gij ½16
ð!1 ; !2 Þ ¼ iE !1 !2 ; !1 ; !2 2 T M ½11 for some constant nondegenerate matrix Gij .
The structure of a flat pencil on the Frobenius manifold $M$ gives rise to a natural Poisson pencil (= bi-Hamiltonian structure) on the infinite-dimensional "manifold" $L(M)$ consisting of smooth maps of a circle to $M$ (the so-called loop space). In the flat coordinates $v^1, \dots, v^n$ for the metric $\langle\ ,\ \rangle$ the Poisson pencil has the form

$$\{v^\alpha(x), v^\beta(y)\}_1 = \eta^{\alpha\beta}\,\delta'(x - y)$$
$$\{v^\alpha(x), v^\beta(y)\}_2 = g^{\alpha\beta}(v(x))\,\delta'(x - y) + \Gamma^{\alpha\beta}_\gamma(v(x))\,v^\gamma_x\,\delta(x - y) \qquad [17]$$

By definition of the Poisson pencil, the linear combination $a_1\{\ ,\ \}_1 + a_2\{\ ,\ \}_2$ of the Poisson brackets is again a Poisson bracket for arbitrary constants $a_1, a_2$. Choosing a system of $n$ independent periods $p^i(v; \lambda)$, $i = 1, \dots, n$, as a new system of dependent variables, one obtains a reduction of the Poisson bracket $\{\ ,\ \}_\lambda := \{\ ,\ \}_2 - \lambda\{\ ,\ \}_1$ for a given $\lambda$ to the canonical form

$$\{p^i(v(x); \lambda), p^j(v(y); \lambda)\}_\lambda = G^{ij}\,\delta'(x - y) \qquad [18]$$

Under an additional assumption of existence of tau function (Dubrovin 1996, Dubrovin and Zhang), one can prove that any Poisson pencil on $L(M)$ of the form [17] with a nondegenerate matrix $(\eta^{\alpha\beta})$ comes from a Frobenius structure on $M$.

Canonical Coordinates on Semisimple Frobenius Manifolds

Definition The Frobenius manifold $M$ is called semisimple if the algebras $T_v M$ are semisimple for $v$ belonging to an open dense subset in $M$.

Any $n$-dimensional semisimple Frobenius algebra over $\mathbb{C}$ is isomorphic to the orthogonal direct sum of $n$ copies of one-dimensional algebras. In this section, all the manifolds will be assumed to be complex analytic.

Near a semisimple point, the roots $u_i = u_i(v)$, $i = 1, \dots, n$, of the characteristic equation

$$\det\left(g^{\alpha\beta}(v) - u\,\eta^{\alpha\beta}\right) = 0 \qquad [19]$$

can be used as local coordinates. The vectors $\partial/\partial u_i$, $i = 1, \dots, n$, are basic idempotents of the algebras $T_v M$

$$\frac{\partial}{\partial u_i} \cdot \frac{\partial}{\partial u_j} = \delta_{ij}\,\frac{\partial}{\partial u_i}$$

We call $u_1, \dots, u_n$ "canonical coordinates." Observe that we violate the indices convention labeling the canonical coordinates by subscripts. We will never use summation over repeated indices when working in the canonical coordinates. Actually, existence of canonical coordinates can be proved without using [4] (see details in Dubrovin (1992)).

Choosing locally branches of the square roots

$$\psi_{i1}(u) := \sqrt{\langle \partial/\partial u_i, \partial/\partial u_i \rangle}, \quad i = 1, \dots, n \qquad [20]$$

we obtain a transition matrix $\Psi = (\psi_{i\alpha}(u))$,

$$\frac{\partial}{\partial v^\alpha} = \sum_{i=1}^{n} \frac{\psi_{i\alpha}(u)}{\psi_{i1}(u)}\,\frac{\partial}{\partial u_i} \qquad [21]$$

from the basis $\partial/\partial v^\alpha$ to the orthonormal basis $\langle f_i, f_j \rangle = \delta_{ij}$

$$f_1 = \psi_{11}^{-1}(u)\,\frac{\partial}{\partial u_1}, \quad f_2 = \psi_{21}^{-1}(u)\,\frac{\partial}{\partial u_2}, \ \dots, \quad f_n = \psi_{n1}^{-1}(u)\,\frac{\partial}{\partial u_n} \qquad [22]$$

The matrix $\Psi(u)$ satisfies the orthogonality condition

$$\Psi^*(u)\,\Psi(u) \equiv \eta, \quad \eta = (\eta_{\alpha\beta}), \quad \eta_{\alpha\beta} := \left\langle \frac{\partial}{\partial v^\alpha}, \frac{\partial}{\partial v^\beta} \right\rangle$$

In this formula $\Psi^*$ stands for the transposed matrix. The lengths [20] coincide with the first column of this matrix.

Denote $V(u) = (V_{ij}(u))$ the matrix of the antisymmetric operator $\mathcal{V}$ [8] with respect to the orthonormal frame

$$V(u) := \Psi(u)\,\mathcal{V}\,\Psi^{-1}(u) \qquad [23]$$

The antisymmetric matrix $V(u) = (V_{ij}(u))$ satisfies the following system of commuting time-dependent Hamiltonian flows on the Lie algebra $so(n)$ equipped with the standard Lie–Poisson brackets $\{V_{ij}, V_{kl}\} = V_{il}\delta_{jk} - V_{jl}\delta_{ik} + V_{jk}\delta_{il} - V_{ik}\delta_{jl}$:

$$\frac{\partial V}{\partial u_i} = \{V, H_i(V; u)\}, \quad i = 1, \dots, n \qquad [24]$$

with quadratic Hamiltonians

$$H_i(V; u) = \frac{1}{2} \sum_{j \ne i} \frac{V_{ij}^2}{u_i - u_j} \qquad [25]$$

The matrix $\Psi(u)$ satisfies

$$\frac{\partial \Psi}{\partial u_i} = V_i(u)\,\Psi, \quad V_i(u) := \mathrm{ad}_{E_i}\,\mathrm{ad}_U^{-1}(V(u)), \quad i = 1, \dots, n \qquad [26]$$

Here the matrix unity $E_i$ has the entries $(E_i)_{ab} = \delta_{ai}\delta_{ib}$, $U = \mathrm{diag}(u_1, \dots, u_n)$. Conversely, given a solution to [24] and [26], one can reconstruct the
Frobenius manifold structure by quadratures n-dimensional linear space equipped with a sym-
(Dubrovin 1998). The reconstruction depends on a metric nondegenerate bilinear form < , > . Two
choice of an eigenvector of the constant matrix linear operators on V, a semisimple operator
V = 1 (u) V(u)(u). ˆ : V ! V, and a nilpotent operator R : V ! V must
The system [24] coincides with the equations of satisfy the following properties. First, the operator ˆ
isomonodromic deformations (see Isomonodromic is antisymmetric:
Deformations) of the following linear differential
^ ¼ ^ ½31
operator with rational coefficients:
and the operator R satisfies
dY V
¼ Uþ Y ½27
dz z R ¼ ei^ R ei^ ½32
The latter is nothing but the last component of the
Here the adjoint operators are defined with respect
deformed flat connection [7] written in the ortho-
to the bilinear form < , > . The last condition to be
normal frame [22]. Other components of the
imposed onto the operator R can be formulated in a
horizontality equations yield
simple way by choosing a basis e1 , . . . , en of
@i Y ¼ ðzEi þ Vi ðuÞÞY; i ¼ 1; . . . ; n ½28 eigenvectors of the semisimple operator ,ˆ
The compatibility conditions of the system [27] and ^ ¼ e ;
e ¼ 1; . . . ; n
[28] coincide with [24].
We require the existence of a decomposition
The integration of [24], [26] and, more generally,
the reconstruction of the Frobenius structure can be R ¼ R0 þ R1 þ R2 þ ½33
reduced to a solution of a certain Riemann–Hilbert
where for any integer k 0 the linear operator Rk
problem (see Riemann–Hilbert Problem).
satisfies
The isomonodromic tau function of the semisim-
ple Frobenius manifold is defined by Rk e 2 span e j ¼ þ k 8 ¼ 1; ...; n ½34
X
n
In the nonresonant case, such that none of the
d log I ðuÞ ¼ Hi ðVðuÞ; uÞdui ½29
differences of the eigenvalues of ˆ being equal to a
i¼1
positive integer, all the matrices R1 , R2 , . .., are equal
It is an analytic function on a suitable unramified to zero. Observe a useful identity
covering of the semisimple part of M.
Alternatively, eqns [24] can be represented as the z^ R z^ ¼ R0 þ zR1 þ z2 R2 þ ½35
isomonodromy deformations of the dual Fuchsian
system More generally, for any operator A : V ! V com-
muting with e2iˆ a decomposition is defined as
d
1
½U ¼ þV ½30 A ¼ ½Ak
d 2 k2Z
X ½36
The latter comes from the Gauss–Manin system for z^ A z^ ¼ zk ½Ak
the periods p = p(v; ) of the Frobenius manifold k2Z
written in the canonical coordinates [22].
In particular, [R]k = Rk , k 0, [R]k = 0, k < 0.
One has to also choose an eigenvector e of the
Moduli of Semisimple operator ˆ such that R0 e = 0; denote d=2 the
Frobenius Manifolds corresponding eigenvalue
u0 = (u01 , . . . , u0n ) of n pairwise distinct complex manifolds in terms of the monodromy data (, ˆ R,
numbers and on a choice of a ray ‘þ on an auxiliary S, C).
complex z-plane starting at the origin such that Conversely, to reconstruct the Frobenius manifold
near a semisimple point with the canonical coordi-
Re z u0i u0j 6¼ 0; i 6¼ j; z 2 ‘þ ½38 nates u01 , . . . , u0n , one is to solve the following
boundary-value problem. Let
Let us order the complex numbers in such a way that
‘ ¼ ð‘ Þ [ ‘þ
zðu0i u0j Þ
e ! 0; i < j; jzj ! 1; z 2 ‘þ ½39 be the oriented line on the complex z-plane chosen
as in [38]. Here the ray ‘ is the opposite to ‘þ .
The operator S must be upper triangular Denote R =L the right/left half-planes with respect
to ‘. To reconstruct the Frobenius manifold, one is
S ¼ ðSij Þ; Sij ¼ 0; i > j
½40 to find three matrix-valued functions 0 (z; u),
Sii ¼ 1; i ¼ 1; . . . ; n R (z; u), and L (z; u):
The operator C must satisfy 0 ðz; uÞ : V ! Cn
R=L ðz; uÞ : Cn ! Cn
C S C ¼ ei^ eiR ½41
for u close to u0 such that 0 (z; u) is analytic and
Here the adjoint operator C is understood as invertible for z 2 C, R (z; u)=L (z; u) are analytic
follows: and invertible for z 2 R =L resp., and continuous
<;>1
up to the boundary ‘n0 and
’
C : Cn ! Cn ! V ! V
R=L ðz; uÞ
1 þ Oð1=zÞ; jzj ! 1; z 2 R =L
The group of diagonal n n matrices
The boundary values of the functions
D ¼ diagð1; . . . ; 1Þ 0 (z; u),R (z; u), and L (z; u) must satisfy the
following boundary-value problem (as above
acts on the pairs (S, C) by
U = diag(u1 , . . . , un )):
S 7! DSD; C 7! DC R ðz; uÞ ¼ L ðz; uÞez U Sez U ; z 2 ‘þ ½42
One is to factor out the action of this diagonal R ðz; uÞ ¼ L ðz; uÞez U S ez U ; z 2 ‘ ½43
group. Besides, the operator C is defined up to a left ^ R zU
action of certain group of linear operators depend- 0 ðz; uÞz z ¼ R ðz; uÞe C; z 2 R
^ R zU
½44
ing on the spectrum. 0 ðz; uÞz z ¼ L ðz; uÞe SC; z 2 L
For the generic (i.e., nonresonant) case where
e2 iˆ has simple spectrum, the operator C is defined Here zˆ := eˆ log z , zR := eR log z are considered as
up to left multiplication by any matrix commuting Aut(V)-valued functions on the universal covering
with e2 iˆ . In this situation, the monodromy data of Cn0; the branch cut in the definition of log z is
(,
ˆ R, S, C) are locally uniquely determined by the chosen to be along ‘ .
n(n 1)=2 entries of the matrix S. Therefore, near a The solution of the above boundary-value pro-
generic point, the variety of the monodromy data is blem [42]–[44], if exists, is unique. It can be reduced
a smooth manifold of the dimension n(n 1)=2. At to a certain Riemann–Hilbert problem, that is, to a
nongeneric points, the variety can get additional problem of factorization of an analytic n n
strata. nondegenerate matrix-valued function on the
The monodromy data S, C are determined at an annulus
arbitrary semisimple point of a Frobenius manifold Gðz; uÞ; r < jzj < R; det Gðz; uÞ 6¼ 0
in terms of the analytic properties of horizontal
sections of the deformed flat connection r̃ [7] in the depending on the parameter u = (u1 , . . . , un ) in a
complex z-plane (the so-called ‘‘Stokes matrix’’ and product
the ‘‘central connection matrix’’ of the operator
Gðz; uÞ ¼ G0 ðz; uÞ1 G1 ðz; uÞ ½45
[27]). Locally, they do not depend on the point of
the semisimple Frobenius manifold (the isomono-
dromicity property). of two matrix-valued functions G0 (z; u) and
We will now describe the reconstruction procedure G1 (z; u) analytic for jzj < R and r < jzj 1 resp.,
giving a parametrization of semisimple Frobenius with nowhere-vanishing determinant.
for arbitrary non-negative integers p1 , . . . , pm . Here (formal) Frobenius manifold on H (X) with the
the evaluation maps evi , i = 1, . . . , m, are given by bilinear form given by the Poincaré pairing
Z
evi : Xg;m; ! X; f 7! f ðxi Þ
¼
^
X
The so-called tautological line bundles Li over Xg, m,
by definition have the fiber Txi Cg , i = 1, . . . , m (see the unity
the article Moduli Spaces: An Introduction regarding
@
the construction of the so-called virtual fundamental e¼
class [Xg, m, ]virt ). The numbers [49] can be defined @v1
for an arbitrary compact symplectic manifold X and the Euler vector field
where one is to deal with the intersection theory on X
n
@
the moduli spaces of pseudoholomorphic curves E¼ ½ð1 q Þv þ r
fixing a suitable almost-complex structure on X. ¼1
@v
They depend only on the symplectic structure on X. Here the numbers q , r are defined by the
In particular, the numbers conditions
< 0 ð
1 Þ . . . 0 ð
m Þ >g; ½50 X
2 H 2q ðXÞ; c1 ðXÞ ¼ r
are called the genus g and degree GW invariants of
X. In certain cases, they admit an interpretation in
The resulting Frobenius manifold will be denoted
terms of enumerative geometry of the variety X
MX . The corresponding n-parameter family of
(Kontsevich and Manin 1994). The numbers [49]
n-dimensional algebras on the tangent spaces Tv MX
with some of pi > 0 are called ‘‘gravitational
is also called ‘‘quantum cohomology’’ QH (X). At
descendents.’’
the point vcl 2 MX of classical limit, the algebra
One can form a generating functions of the Tvcl MX coincides with the cohomology ring H (X).
numbers [49] In all known examples, the series [53] actually
converges in a neighborhood of the point vcl .
X X 1 1 ;p1
FX t . . . tm ; pm Therefore, one obtains a genuine Frobenius structure
g ¼
m 2H2 ðX;ZÞ
m! on a domain MX
H (X; C)=2iH2 (X; Z). How-
ever, a general proof of convergence is still missing.
< p1 ð
1 Þ . . . pm ð
m Þ >g; ½51
In particular, for d = 1, the quantum cohomology
(summation over repeated indices 1 1 , . . . , m of complex projective line P 1 is a two-dimensional
n will always be assumed). Here t, p are indetermi- Frobenius manifold with the potential, unity, and
nates labeled by pairs (, p) with = 1, . . . , n, the Euler vector field
p = 0, 1, 2, . . . . (Usually one is to insert in the Fðu; vÞ ¼ 12 uv2 þ eu ;
definition of F X
g elements q of the Novikov ring
C[H2 (X; Z)]. However, due to the divisor axiom @
e¼ ;
(Kontsevich and Manin 1994) and these insertions @v
can be compensated by a suitable shift in the space @ @
E¼v þ2
of couplings t = (t, p ).) We finally introduce the full @v @u
generating function called total GW potential (it is For d = 2 one has a three-dimensional Frobenius
also called the free energy of the topological sigma manifold QH (P 2 ) with
model with the target space X)
X Fðv1 ; v2 ; v3 Þ¼ 12 v21 v3 þ 12 v1 v22
F X ðt; Þ ¼ 2g2 F X
g ½52 X v3k1 kv2
g 0 þ Nk 3 e
k 1
ð3k 1Þ!
Restricting the genus-zero generating function @ ½54
onto the so-called small phase space e¼
@v1
@ @ @
FX ðvÞ :¼ F X
0 ðt
;0
¼ v ; t;p>0 ¼ 0Þ E ¼ v1 þ3 v3
½53 @v1 @v2 @v3
v ¼ ðv1 ; . . . ; vn Þ
where Nk = number of rational curves on P2 passing
one obtains a solution to the WDVV associativity through 3k 1 generic points. WDVV [5] yields
equations. This solution defines a structure of (Kontsevich and Manin 1994) recursion relations for
the numbers Nk starting from N1 = 1. The closed the needed integrable hierarchy is a new one. It can
analytic formula for the function [54] is still unknown. be associated (Dubrovin and Zhang) with an arbi-
Only for certain very exceptional X the Frobenius trary n-dimensional semisimple Frobenius manifold
manifold MX is semisimple (e.g., for X = Pd ). The M. The equations of the hierarchy have the form
general geometrical reasons of the semisimplicity of h
MX are still to have been understood. wit ¼ Aij ðwÞwjx þ 2 Bij ðwÞwixxx þ Cijk ðwÞwjx wkxx
For the case X = Calabi–Yau manifold, the Fro- i
benius manifold QH (X) is never semisimple. This þ Dijkl ðwÞwjx wkx wlx þ Oð4 Þ; i ¼ 1;.. .;n ½57
Frobenius structure can be computed in terms of the
The coefficients of 2g are graded homogeneous
mirror symmetry construction (see Mirror Symme-
polynomials in ux , uxx , etc., of the degree 2g þ 1,
try: A Geometric Survey).
deg dm u=dxm ¼ m
The construction of the hierarchy is done in two
Frobenius Manifold and Integrable
steps. First, we construct the leading approximation
Systems
(Dubrovin 1992). The equation of the hierarchy
The identities in the cohomology ring generated by specifying the dependence on t = t, p at = 0 reads
the cocycles evi (
) and j := c1 (Lj ) can be recast
@v
into the form of differential equations for the ;p
¼ @x r
; pþ1 ðvÞ
generating function [52]. The variable x := t1, 0
@t ½58
corresponding to
1 = 1 plays a distinguished role ¼ 1; . . . ; n; p 0
in these differential equations. According to the idea The functions
, p (v), v 2 M, are the coefficients of
of Witten (1991), the differential equations for the expansion [10] of the deformed flat functions
generating functions can be written as a hierarchy of normalized by
, 0 = v . The solution v = v(x, t) of
systems of n evolutionary PDEs (n = dim H (X)) for interest is determined from the implicit function
the unknown functions equations
X
@ 2 F X ðt; Þ v ¼ xe þ t;p r
;p ðvÞ ½59
w ¼ hh0 ð
Þ0 ð
1 Þii ¼ 2 ½55
@t1;0 @t;0 ;p
1, 0
The variable x = t is the spatial variable of the Next, one has to find solution
equations of the hierarchy. The remaining para- X
meters (coupling constants) t, p of the generating F ¼ 2g2 F g ðv; vx ; . . . ; vð3g2Þ Þ ½60
function play the role of the time variables. Witten g 1
suggested to use the two-point correlators
of the following universal loop equation (closely
@ 2 F X ðt; Þ
2 related with the Virasoro conjecture of Eguchi and
h; p ¼ hhpþ1 ð
Þ0 ð
1 Þii ¼ ½56
@t1;0 @t;p Xiong (1998)):
as the densities of the Hamiltonians of the flows of
the hierarchy. X @F 1
@xr
Existence of such a hierarchy can be proved for r 0
@v;r EðvÞ
the case of GW invariants (and their descendents) !
of complex projective spaces Pd (the results of X @F X
r r
þ @xk1 @e p G @xrkþ1 @ p
Givental (2001) along with Dubrovin and Zhang r 1
@v;r k
k¼1
(2005) can be used). For d = 0 one obtains,
according to the celebrated result by Kontsevich 1 1 h i2
¼ trðU Þ2 þ tr ðU Þ1 V
conjectured by Witten (see Topological Gravity, 16 4
Two-Dimensional), the tau function of the solution 2 X @ 2 F @F @F
to the KdV hierarchy (see Korteweg–de Vries Equation þ þ
2 @v;k @v;l @v;k @v;l
and Other Modulation Equations) specified by the
initial condition, @xkþ1 @ p G @xlþ1 @ p
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Practically any physical, chemical, or biological system can exhibit rhythmic oscillatory activity, at least when the conditions are right. Winfree (2001) reviews the ubiquity of oscillations in nature, ranging from autocatalytic chemical reactions to pacemaker cells in the heart, to animal gaits, and to circadian rhythms. When coupled, even weakly, oscillators interact via adjustment of their phases, that is, their timing, often leading to synchronization. In this chapter, we review the most important concepts needed to study and understand the dynamics of coupled oscillators.

From a mathematical point of view, an oscillator is a dynamical system,

$$\dot x = f(x), \quad x \in \mathbb{R}^m \qquad [1]$$

having a limit-cycle attractor, that is, a periodic orbit $\gamma \subset \mathbb{R}^m$. Its period is the minimal $T > 0$ such that

$$\gamma(t) = \gamma(t + T) \quad \text{for any } t$$

and its frequency is $\Omega = 2\pi/T$. Let $x(0) = x_0 \in \gamma$ be an arbitrary point on the attractor, then the state of the system, $x(t)$, is uniquely defined by its phase $\vartheta \in S^1$ relative to $x_0$, where $S^1$ is the unit circle. Throughout this article, we assume that the periodic orbit $\gamma$ is exponentially stable, which implies normal hyperbolicity. In this case, there is a continuous transformation $\Theta : U \to S^1$ defined in a neighborhood $U \supset \gamma$ such that $\vartheta(t) = \Theta(x(t))$ for any trajectory in $U$, that is, $\Theta$ maps solutions of [1] to solutions of

$$\dot\vartheta = \Omega \qquad [2]$$

Such a transformation removes the amplitude but preserves the phase of oscillation.

Accordingly, there is a continuous transformation that maps solutions of the weakly coupled network of $n$ oscillators,

$$\dot x_i = f_i(x_i) + \varepsilon g_i(x_1, \dots, x_n, \varepsilon), \quad \varepsilon \ll 1 \qquad [3]$$

onto solutions of the phase system

$$\dot\vartheta_i = \Omega_i + \varepsilon h_i(\vartheta_1, \dots, \vartheta_n, \varepsilon), \quad \vartheta_i \in S^1 \qquad [4]$$

which is easier for studying the collective properties of [3].

Figure 1 A 2-torus and its representation on the square. (Modified from Hoppensteadt and Izhikevich 1997.)

Figure 2 Various degrees of locking of oscillators: frequency locking, entrainment (1:1 frequency locking), phase locking, and synchronization (in-phase, antiphase). (Modified from Izhikevich 2006.)

The oscillators are said to be frequency locked when [4] has a stable periodic orbit $\vartheta(t) = (\vartheta_1(t), \dots, \vartheta_n(t))$ on the $n$-torus $T^n$, as in Figure 1a. The "rotation vector" or "winding ratio" of the orbit is the set of integers $q_1 : q_2 : \cdots : q_n$ such that $\vartheta_1$ makes $q_1$ rotations while $\vartheta_2$ makes $q_2$ rotations, etc., as in the 2:3 frequency locking in Figure 1a. The oscillators are entrained when they are $1 : 1 : \cdots : 1$ frequency locked. The oscillators are phase locked when there is an $(n-1) \times n$ integer matrix $K$ having linearly independent rows such that $K\vartheta(t) = \mathrm{const}$. For example, the two oscillators in Figure 1b are phase locked with $K = (2, 3)$, while those in Figure 1c are not. The oscillators are synchronized when they are entrained and phase locked. Synchronization is in-phase when $\vartheta_1(t) = \cdots = \vartheta_n(t)$ and out-of-phase otherwise. Two oscillators are said to be synchronized antiphase when $\vartheta_1(t) - \vartheta_2(t) = \pi$. Frequency locking without phase locking, as in Figure 1c, is called phase trapping. The relationship between all these definitions is depicted in Figure 2.

Phase Resetting

An exponentially stable periodic orbit is a normally hyperbolic invariant manifold, hence its sufficiently small neighborhood, $U$, is invariantly foliated by
Weakly Coupled Oscillators 449
stable submanifolds (Guckenheimer 1975) illustrated in Figure 3. The manifolds represent points having equal phases and, for this reason, they are called isochrons (from Greek "iso" meaning equal and "chronos" meaning time).

Figure 3 Isochrons of Andronov–Hopf oscillator ($\dot{z} = (1 + i)z - z|z|^2$, $z \in \mathbb{C}$) and van der Pol oscillator ($\dot{x} = x - x^3 - y$, $\dot{y} = x$).
[Left panel: isochrons $\theta = 0, \pi/4, \ldots, 7\pi/4$ in the (Re z, Im z) plane. Right panel: limit cycle $\gamma$, neighborhood $U$, point $x_0$, and a pulse in the (x, y) plane.]

The geometry of isochrons determines how the oscillators react to perturbations. For example, the pulse in Figure 3, right, moves the trajectory from one isochron to another, thereby changing its phase. The magnitude of the phase shift depends on the amplitude and the exact timing of the stimulus relative to the phase of oscillation $\vartheta$. Stimulating the oscillator at different phases, one can measure the phase transition curve (Winfree 2001)

$\vartheta_{\text{new}} = \mathrm{PTC}(\vartheta_{\text{old}})$

and the phase resetting curve

$\mathrm{PRC}(\vartheta) = \mathrm{PTC}(\vartheta) - \vartheta$   (shift = new phase − old phase)

Positive (negative) values of the PRC correspond to phase advances (delays). PRCs are convenient when the phase shifts are small, so that they can be

In Figure 5 we depict phase portraits of the Andronov–Hopf oscillator receiving pulses of magnitude 0.5 (left) and 1.5 (right). Notice the drastic difference between the corresponding PRCs or PTCs. Winfree (2001) distinguishes two cases:

1. type 1 (weak) resetting results in continuous PRCs and PTCs with mean slope 1, and
2. type 0 (strong) resetting results in discontinuous PRCs and PTCs with mean slope 0.

[Figure panels (graphics not reproduced): Andronov–Hopf oscillator (Re z(t)) and van der Pol oscillator (x(t)); type 1 (weak) resetting and type 0 (strong) resetting. Plotted quantities: phase resetting $\mathrm{PRC}(\theta)$ and phase transition $\mathrm{PTC}(\theta)$ against stimulus phase $\theta$.]
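The distinction between type 1 and type 0 resetting can be illustrated numerically. The following Python sketch uses the Andronov–Hopf oscillator of Figure 3 under two assumptions that are not spelled out in the text: the pulse displaces the state instantaneously along the real axis, and the phase of a point is its polar angle (in polar coordinates the oscillator reads $\dot r = r - r^3$, $\dot\theta = 1$, so the isochrons are rays). The function names are illustrative only.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi


def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))


def ptc(theta, amplitude):
    """Phase transition curve: new phase after the point exp(i*theta) on the
    unit-circle limit cycle is shifted by `amplitude` along the real axis.
    Isochrons are rays, so the new phase is the polar angle of the shifted point."""
    z_new = cmath.exp(1j * theta) + amplitude
    return cmath.phase(z_new) % TWO_PI


def prc(theta, amplitude):
    """Phase resetting curve PRC = PTC - theta, wrapped to (-pi, pi]."""
    return wrap(ptc(theta, amplitude) - theta)


def ptc_winding(amplitude, n=2000):
    """Net number of turns made by PTC(theta) while theta goes once around
    the circle: 1 for type 1 resetting, 0 for type 0 resetting."""
    total = 0.0
    for k in range(n):
        a = ptc(TWO_PI * k / n, amplitude)
        b = ptc(TWO_PI * (k + 1) / n, amplitude)
        total += wrap(b - a)  # shortest signed angular step
    return round(total / TWO_PI)


if __name__ == "__main__":
    for magnitude in (0.5, 1.5):
        print("pulse magnitude", magnitude, "-> PTC winding", ptc_winding(magnitude))
```

With these assumptions the winding number of the PTC plays the role of the "mean slope": a pulse smaller than the radius of the limit cycle leaves the shifted circle around the origin, while a larger pulse moves it past the equilibrium where the phase is undefined.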
The discontinuity of type 0 PRC in Figure 5 is a topological property that cannot be removed by reallocating the initial point $x_0$ that corresponds to zero phase. The discontinuity stems from the fact that the shifted image of the limit cycle (dashed circle) goes beyond the central equilibrium at which the phase is not defined.

The stroboscopic mapping of $S^1$ to itself, called the Poincaré phase map,

$\vartheta_{k+1} = \mathrm{PTC}(\vartheta_k)$   [5]

describes the response of an oscillator to a $T$-periodic pulse train. Here, $\vartheta_k$ denotes the phase of oscillation when the $k$th input pulse arrives. Its fixed points correspond to synchronized solutions, and its periodic orbits correspond to phase-locked states.

Weak Coupling

Now consider dynamical systems of the form

$\dot{x} = f(x) + \varepsilon s(t)$   [6]

describing periodic oscillators, $\dot{x} = f(x)$, forced by a weak time-dependent input $\varepsilon s(t)$, for example, from other oscillators in a network. Let $\Theta(x)$ denote the phase of oscillation at point $x \in U$, so that the map $\Theta : U \to S^1$ is constant along each isochron. This mapping transforms [6] into the phase model

1. Winfree: $Q(\vartheta)$ is the normalized PRC to infinitesimal pulsed perturbations;
2. Kuramoto: $Q(\vartheta) = \operatorname{grad} \Theta(x)$; and
3. Malkin: $Q$ is the solution to the adjoint problem

$\dot{Q} = -\{Df(\gamma(t))\}^{\top} Q$   [7]

with the normalization $Q(t) \cdot f(\gamma(t)) = \Omega$ for any $t$.

The function $Q(\vartheta)$ can be found analytically in a few simple cases:

1. a nonlinear phase oscillator $\dot{x} = f(x)$ with $x \in S^1$ and $f > 0$ has $Q(\vartheta) = \Omega / f(\gamma(\vartheta))$;
2. a system near saddle-node on invariant circle bifurcation has $Q(\vartheta)$ proportional to $1 - \cos\vartheta$; and
3. a system near supercritical Andronov–Hopf bifurcation has $Q(\vartheta)$ proportional to $\sin(\vartheta - \psi)$, where $\psi \in S^1$ is a constant phase shift.

Other interesting cases, including homoclinic, relaxation, and bursting oscillators, are considered by Izhikevich (2006).

Treating $s(t)$ in [6] as the input from the network, we can transform weakly coupled oscillators

$\dot{x}_i = f_i(x_i) + \varepsilon \overbrace{\sum_{j=1}^{n} g_{ij}(x_i, x_j)}^{s_i(t)}, \quad x_i \in \mathbb{R}^m$   [8]
describing the interaction between oscillators (Ermentrout and Kopell 1984). To summarize, we transformed weakly coupled system [8] into the phase model [10] with $H$ given by [11] and each $Q$ being the solution to the adjoint problem [7]. This constitutes the Malkin theorem for weakly coupled oscillators (Hoppensteadt and Izhikevich 1997, theorem 9.2).

Existence of one equilibrium of the phase model [10] implies the existence of the entire circular family of equilibria, since translation of all $\varphi_i$ by a constant phase shift does not change the phase differences $\varphi_i - \varphi_j$ and hence the form of [10]. This family corresponds to a limit cycle of [8], on which all oscillators have equal frequencies and constant phase shifts, that is, they are synchronized, possibly out of phase.

We say that two oscillators, $i$ and $j$, have resonant (or commensurable) frequencies when the ratio $\Omega_i/\Omega_j$ is a rational number, for example, it is $p/q$ for some integers $p$ and $q$. They are nonresonant when the ratio is an irrational number. In this case, the function $H_{ij}$ defined above is constant regardless of the details of the oscillatory dynamics or the details of the coupling, that is, the dynamics of two coupled nonresonant oscillators is described by an uncoupled phase model. Apparently, such oscillators do not interact; that is, the phase of one of them cannot change the phase of the other one even on the long timescale of order $1/\varepsilon$.

Synchronization

Consider [8] with $n = 2$, describing two mutually coupled oscillators. Let us introduce "slow" time $\tau = \varepsilon t$ and rewrite the corresponding phase model [10] in the form

Figure 7 Solid curves: functions $H_{ij}(\chi)$ defined by [11] corresponding to the gap-junction input $g(x_i, x_j) = (x_{j1} - x_{i1}, 0)$. Dashed curves: functions $H(\chi) = H_{ji}(-\chi) - H_{ij}(\chi)$. Parameters are as in Figure 3.
[Panels: Andronov–Hopf oscillator and van der Pol oscillator; horizontal axis: phase difference $\chi$ from 0 to $2\pi$.]

All equilibria of [12] are solutions to $H(\chi) = \omega$, and they are intersections of the horizontal line $\omega$ with the graph of $H$. They are stable if the slope of the graph is negative at the intersection. If oscillators are identical, then $H(\chi)$ is an odd function (i.e., $H(-\chi) = -H(\chi)$), and $\chi = 0$ and $\chi = \pi$ are always equilibria, possibly unstable, corresponding to the in-phase and antiphase synchronized solutions. The in-phase synchronization of gap-junction coupled oscillators in Figure 7 is stable because the slope of $H$ (dashed curves) is negative at $\chi = 0$. The max and min values of the function $H$ determine the tolerance of the network to the frequency mismatch $\omega$, since there are no equilibria outside this range.

Now consider a network of $n > 2$ weakly coupled oscillators [8]. To determine the existence and stability of synchronized states in the network, we need to study equilibria of the corresponding phase model [10]. The vector $\varphi = (\varphi_1, \ldots, \varphi_n)$ is an equilibrium of [10] when

$0 = \omega_i + \sum_{j \neq i}^{n} H_{ij}(\varphi_i - \varphi_j)$   (for all $i$)   [13]
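The equilibrium condition [13] lends itself to a direct numerical check. The sketch below evaluates its right-hand side for a candidate phase vector, taking the argument of each connection function to be the phase difference $\varphi_i - \varphi_j$ (the sign convention is an assumption here, since the phase model [10] itself is not reproduced in this section). The sine coupling and the function names are hypothetical choices made only for illustration.

```python
import math


def phase_model_residual(phi, omega, H):
    """Right-hand side of [13] for every oscillator i:
    omega[i] + sum over j != i of H(i, j, phi[i] - phi[j])."""
    n = len(phi)
    return [
        omega[i] + sum(H(i, j, phi[i] - phi[j]) for j in range(n) if j != i)
        for i in range(n)
    ]


def is_equilibrium(phi, omega, H, tol=1e-9):
    """True when every component of the residual vanishes within `tol`."""
    return all(abs(r) < tol for r in phase_model_residual(phi, omega, H))


def sine_coupling(i, j, chi):
    """Hypothetical odd connection function, identical for every pair (i, j)."""
    return math.sin(chi)


if __name__ == "__main__":
    omega = [0.0, 0.0, 0.0]  # identical oscillators, no frequency mismatch
    in_phase = [0.3, 0.3, 0.3]
    splay = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
    generic = [0.0, 1.0, 2.0]
    for name, phi in (("in-phase", in_phase), ("splay", splay), ("generic", generic)):
        print(name, is_equilibrium(phi, omega, sine_coupling))
```

Because only phase differences enter, adding the same constant to every entry of `phi` leaves the residual unchanged, which is the circular family of equilibria described above.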
formulation could be maintained to some extent. Thus, there is room for exploring some of the nonperturbative attributes of the theory. The relativists favor the canonical formulation, since some of the geometrical features of the general theory of relativity could be incorporated here and be explored to see how far the quantum theory captures such properties of the classical theory. As we shall discuss in the sequel, some of the interesting issues of quantum cosmology are addressed in this approach. However, there are limitations and shortcomings in this formulation and we refer the reader to the textbooks and review articles for further reading and critical assessments of the canonical approach to quantizing gravity.

The second approach is primarily the endeavor of physicists who have devoted their research to quantum field theory. Feynman's seminal work on quantization of gravity from this perspective has profoundly influenced the subsequent developments. The quantization of gravity is carried out in the weak-field approximation such that the graviton is identified as the fluctuation over the Minkowski background metric. It is a massless spin-2 field as one concludes from the properties of low-energy gravitational interaction in the classical limit. Furthermore, the gauge invariance associated with a spin-2 massless field gets intimately related with invariance of Einstein's theory under general coordinate transformation. In this setup, the field-theoretic techniques could be employed to quantize the theory and to consider perturbative expansions for the scattering amplitudes. It is realized that low-energy amplitudes computed from the massless spin-2 theory match with those derived from the Einstein–Hilbert action in the weak-field approximation. Furthermore, the theory is not perturbatively renormalizable since the coupling constant carries dimension. One of the most important outcomes of the investigations from this perspective is the discovery, due to Feynman, that the introduction of ghost fields is necessary in order to maintain unitarity of the S-matrix when one goes beyond the tree level. As is well known, this work has profoundly influenced frontiers of research in physics, leading to quantization of Yang–Mills theory which, in turn, paved the way for electroweak theory and QCD. It is worthwhile to mention in passing that the quantum phenomena associated with gravity in the nonperturbative regime cannot be addressed in this framework.

In recent years, superstring theory has been at the center stage in order to provide a unified theory of fundamental interactions. It is postulated that all elementary constituents of matter and the carriers of the interactions such as gauge bosons and graviton are excitations of one-dimensional extended objects: the strings. The superstring theories are perturbatively consistent in critical ten dimensions. The closed-superstring spectrum contains a spin-2 massless state which is identified to be the graviton. It is well known that perturbative computation of processes involving the graviton turns out to be finite. Moreover, the Einstein–Hilbert term appears naturally when one derives the string effective action. Therefore, it is expected that string theory will be able to provide answers to questions related to quantum gravity. Indeed, the theory has met with success in resolving some important issues. We note that cosmological scenarios have been discussed in the string theory framework and the WDW equation has played an important role in the study of quantum string cosmology. We shall comment on this aspect towards the end of this article.

The Canonical Structure of Einstein Gravity

The Einstein–Hilbert action is

$S = \frac{1}{16\pi G} \int_M \sqrt{-g}\, d^4x\, (R - 2\Lambda)$   [1]

where $R$ is the Ricci scalar derived from the metric, $g_{\mu\nu}$, and $\Lambda$ is the cosmological constant. The field equations are derived from the action by the standard variational technique. Note that $R$ involves second derivatives of the metric. If we have compact manifolds with boundary $\partial M$ such that variations of the metric vanish on the boundary and the normal derivatives do not, it is necessary to add a surface term to this action. The exact form of this term will be discussed later. Einstein's theory of gravitation is manifestly covariant. The associated action [1] is invariant under general coordinate transformations: under $x^\mu \to x'^\mu(x)$,

$g'^{\mu\nu}(x') = \frac{\partial x'^\mu}{\partial x^\lambda} \frac{\partial x'^\nu}{\partial x^\rho}\, g^{\lambda\rho}(x)$   [2]

Therefore, we expect that the theory will be endowed with constraints expressed in terms of the canonical variables. One can implement general coordinate transformations so that there are only two pairs of canonical phase-space variables on a spacelike hypersurface. In other words, from physical considerations, the graviton has only two polarizations whereas the metric has ten components. Therefore, the two physical degrees of freedom can be obtained using the freedom of choosing the "gauge" transformations in this context. It is desirable to identify the constraints and analyze their structure, most appropriately in Dirac's
Wheeler–De Witt Theory 455
formalism, and to quantize the theory canonically as the next step. This is the path we intend to follow in order to arrive at the WDW equation.

The Classical Constraints

The Hamiltonian approach is most appropriate to employ the constraint formalism due to Dirac. We recall that the Lagrangian formulation is manifestly covariant as is reflected in the field equations, whereas the spacetime covariance is lost in the passage to the Hamiltonian approach. Furthermore, the spatial components of the metric are the dynamical degrees of freedom. We adopt the formalism introduced by Arnowitt, Deser, and Misner (ADM) for the so-called 3 + 1 split of the hyperbolic Riemannian spacetime metric, $g_{\mu\nu}$. One introduces the lapse function, $N^{\perp}$, and the shift function, $N^i$. We suppress the factors of $1/16\pi G$, etc., for the time being for the general discussions and shall reintroduce them later. The family of spacelike hypersurfaces, $\Sigma_t$, is constructed, with metric $h_{ij}$ induced on it. Here $t$ is a timelike parameter that parametrizes $\Sigma_t$. The distance between points on two neighboring hypersurfaces, $\Sigma_t$ and $\Sigma_{t+dt}$, with coordinates $(t, x^i)$ and $(t + dt, x^i + dx^i)$, respectively, is given by

$ds^2 = -(N^{\perp})^2 dt^2 + h_{ij}(N^i dt + dx^i)(N^j dt + dx^j) = g_{\mu\nu}\, dx^\mu dx^\nu$   [3]

The indices of tensors defined on $\Sigma_t$ are raised and lowered by $h_{ij}$ and its inverse $h^{ij}$. The relations between the components of $g_{\mu\nu}$ and $N^{\perp}$, $N^i$, $h_{ij}$ can be obtained easily,

$g_{00} = h_{ij} N^i N^j - (N^{\perp})^2, \quad g_{0i} = h_{ij} N^j$   [4]

The above relations can be inverted to give

$N^i = h^{ij} g_{0j}, \quad N^{\perp} = \frac{1}{\sqrt{-g^{00}}}$   [5]

The relations between spatial components, $g^{ij}$, of $g^{\mu\nu}$ and $h^{ij}$ and some other useful relations are listed below for later convenience:

$g^{ij} = h^{ij} - \frac{N^i N^j}{(N^{\perp})^2}, \quad \sqrt{-g} = N^{\perp}\sqrt{h}, \quad g^{0i} = \frac{N^i}{(N^{\perp})^2}$   [6]

Note that $(N^{\perp}, N^i)$ are introduced to specify the deformation of the hypersurface and therefore, the evolution equations through the Hamiltonian will not determine them; they are arbitrary functions. Consequently, [4] implies that $g_{00}$ and $g_{0i}$ will enter the Hamiltonian as arbitrary functions. As alluded to above, $h_{ij}$ and their conjugate momenta $\pi^{ij}$ are the dynamical degrees of freedom. We may choose $(N^{\perp}, N^i) = N^\mu$ and $h_{ij}$ as independent variables rather than $(g_{00}, g_{0i}) = g_{0\mu}$ and $h_{ij}$ for convenience and go back to the other set of variables through [4] and [5] if we desire. Let $\pi_\mu$ be the canonically conjugate momenta to $N^\mu$; then it is obvious that a Lagrange multiplier, $\lambda^\mu$, is necessary so that a $\lambda^\mu \pi_\mu$ term has to be supplemented to the Hamiltonian due to the arbitrariness of $N^\mu$. We remind the reader that in electrodynamics an analogous situation arises while analyzing its canonical structure; local gauge symmetry plays a crucial role there. It is obvious that the generic form of the Hamiltonian is (we shall introduce $1/16\pi G$, etc., later)

$H = \int d^3x \left( N^{\perp} H_{\perp}[h_{ij}, \pi^{ij}] + N^i H_i[h_{ij}, \pi^{ij}] + \lambda^\mu \pi_\mu \right)$   [7]

From the perspective of constraint analysis, it is natural that $\pi_\mu \approx 0$ appears as a first-class constraint as they are multiplied by arbitrary functions. Moreover, this constraint must hold good under the deformation of the surface, which implies $\{\pi_\mu, H\}_{PB}$ must vanish weakly, leading to $H_\mu \approx 0$. As a consistency requirement, these must be first-class constraints if $N^\mu$ are to be arbitrary functions. We identify that $\pi_\mu \approx 0$ and $H_\mu \approx 0$ are the primary and secondary constraints, respectively. Thus far, we have discussed the case for pure gravity; the presence of matter fields in the full action modifies the treatment appropriately.

Let us analyze the structure of the constraints for the Einstein–Hilbert action [1]. For a compact manifold with boundary $\partial M$, we have to add the surface term which takes the form:

$\frac{1}{8\pi G} \int_{\partial M} d^3x\, \sqrt{h}\, K$

Here $K$ stands for the trace of the extrinsic curvature of the boundary 3-surface and $h = \det h_{ij}$; note that $h_{ij}$ is the induced metric on the 3-surface. If we include matter fields, the corresponding action is to be taken into account. Once we make the 3 + 1 split of the metric, the action assumes the following form:

$S = \frac{1}{16\pi G} \int d^3x\, dt\, N^{\perp} \sqrt{h} \left( K_{ij} K^{ij} - K^2 + {}^{3}R - 2\Lambda \right)$   [8]

where

$K_{ij} = \frac{1}{N^{\perp}} \left( -\frac{\partial h_{ij}}{\partial t} + D_i N_j + D_j N_i \right)$   [9]
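The algebraic relations [4]–[6] of the 3 + 1 split can be checked numerically. The following Python sketch builds the covariant 4-metric from a lapse, a shift, and a 3-metric according to [4], builds the contravariant components from $g^{00} = -1/(N^{\perp})^2$ (which is [5] solved for $g^{00}$) together with [6], and then verifies that the two matrices are inverse to each other and that $\sqrt{-g} = N^{\perp}\sqrt{h}$. The sample numbers and helper names are arbitrary illustrations, not taken from the article.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total


def inverse3(h):
    """Inverse of a 3x3 matrix via the adjugate."""
    d = det(h)
    inv = [[0.0] * 3 for _ in range(3)]
    for r in range(3):
        for c in range(3):
            minor = [[h[i][j] for j in range(3) if j != r] for i in range(3) if i != c]
            inv[r][c] = (-1) ** (r + c) * det(minor) / d
    return inv


def adm_metric(lapse, shift, h):
    """Covariant metric from [4]: g_00 = h_ij N^i N^j - N^2, g_0i = h_ij N^j, g_ij = h_ij."""
    shift_low = [sum(h[i][j] * shift[j] for j in range(3)) for i in range(3)]
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = sum(shift_low[i] * shift[i] for i in range(3)) - lapse ** 2
    for i in range(3):
        g[0][i + 1] = g[i + 1][0] = shift_low[i]
        for j in range(3):
            g[i + 1][j + 1] = h[i][j]
    return g


def adm_inverse_metric(lapse, shift, h):
    """Contravariant metric: g^00 = -1/N^2, g^0i = N^i/N^2, g^ij = h^ij - N^i N^j/N^2."""
    h_inv = inverse3(h)
    gi = [[0.0] * 4 for _ in range(4)]
    gi[0][0] = -1.0 / lapse ** 2
    for i in range(3):
        gi[0][i + 1] = gi[i + 1][0] = shift[i] / lapse ** 2
        for j in range(3):
            gi[i + 1][j + 1] = h_inv[i][j] - shift[i] * shift[j] / lapse ** 2
    return gi


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    lapse, shift = 1.7, [0.2, -0.1, 0.4]
    h = [[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.2]]
    g = adm_metric(lapse, shift, h)
    product = matmul(g, adm_inverse_metric(lapse, shift, h))
    print("g * g^-1 diagonal:", [product[i][i] for i in range(4)])
    print("sqrt(-g):", math.sqrt(-det(g)), " N*sqrt(h):", lapse * math.sqrt(det(h)))
```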
Here $D_i N_j$ represents the covariant derivative of $N_j$ with the connections computed from $h_{ij}$, and ${}^{3}R$ is the curvature of the 3-surface. The canonical momenta are

$\pi^{ij} = \frac{\sqrt{h}}{16\pi G} \left( K^{ij} - h^{ij} K^{l}{}_{l} \right)$   [10]

and we can invert this relation to get

$K^{ij} = \frac{16\pi G}{\sqrt{h}} \left( \pi^{ij} - \frac{1}{2} h^{ij} \pi^{l}{}_{l} \right)$

The Hamiltonian form of the action is given by

$S_H = \int d^3x\, dt \left( \dot{h}_{ij} \pi^{ij} - N^{\perp} H_{\perp} - N^i H_i \right)$   [11]

Notice that [8] does not involve time derivatives of $N^{\perp}$ and $N^i$, so their corresponding canonical momenta vanish:

$\pi_{\perp} \approx 0, \quad \pi_i \approx 0$   [12]

as expected from our earlier discussions about the role of $N^\mu$. A straightforward constraint analysis leads to the pair of constraints

$H_i = -2 D_j \pi^{j}{}_{i} \approx 0$   [13]

$H_{\perp} = \frac{16\pi G}{\sqrt{h}} \left( h_{ij} h_{kl} - \frac{1}{2} h_{ik} h_{jl} \right) \pi^{ik} \pi^{jl} - \frac{\sqrt{h}}{16\pi G}\, {}^{3}R \approx 0$   [14]

We mention in passing that the above constraint equations get modified in the presence of matter fields in the theory. This is relevant because the WDW equation plays an important role in quantum cosmology to describe the evolution of the universe in early epochs, and the equation is studied in the presence of a generic matter content, that is, a scalar field with potential. The constraint equations [13] and [14] modify to

$H^{T}_{i} = H_i + H^{\text{matter}}_{i} \approx 0$   [15]

$H^{T}_{\perp} = H_{\perp} + H^{\text{matter}}_{\perp} \approx 0$   [16]

The Algebra of Constraints

In order to compute the classical Poisson bracket algebra of the constraints [13] and [14], we use the canonical Poisson bracket relations for the phase-space variables on $\Sigma_t$:

$\{h_{ij}(x), h_{kl}(x')\} = 0$   [17]

$\{\pi^{ij}(x), \pi^{kl}(x')\} = 0$   [18]

$\{h_{ij}(x), \pi^{kl}(x')\} = \delta^{k}_{(i} \delta^{l}_{j)}\, \delta(x, x')$   [19]

Thus, Poisson brackets among the constraints [13] and [14] are

$\{H_i(x), H_j(x')\} = H_j(x)\, \partial^{x}_{i} \delta(x, x') + H_i(x')\, \partial^{x}_{j} \delta(x, x')$   [20]

$\{H_i(x), H_{\perp}(x')\} = H_{\perp}(x)\, \partial^{x}_{i} \delta(x, x')$   [21]

$\{H_{\perp}(x), H_{\perp}(x')\} = h^{ij}(x) H_i(x)\, \partial^{x}_{j} \delta(x, x') - h^{ij}(x') H_i(x')\, \partial^{x'}_{j} \delta(x, x')$   [22]

When we resort to canonical quantization, the starting point is the Hamiltonian action in the first-order formalism, where the canonical variables are subjected to the constraints [13] and [14] in terms of $H_{\perp}$ and $H_i$ satisfying the algebra given by [20]–[22]. One encounters a number of important issues while proceeding to canonically quantize the theory. We shall mention only a few of them in what follows. It is important to address issues related to the role of the constraints in the quantized theory and how to deal with the Lagrange multipliers $N^{\perp}$ and $N^i$. A simple proposal is to solve the constraints at the classical level, identify the physical degrees of freedom, and quantize the theory subsequently. There are four first-class constraints, $H_{\perp}$ and $H_i$; therefore, out of the 12 phase-space variables, $(h_{ij}, \pi^{ij})$, only eight are independent. We need to supply four gauge conditions in order to render the theory (classically) solvable. Thus, we are left with four physical degrees of freedom in the Hamiltonian phase space and we can quantize them. The implementation of this idea is easier said than done. One obstacle is that the constraints cannot be solved in a closed form in this formalism. If we fix a gauge and quantize the theory, we obviously break the gauge invariance. It is essential to show, subsequently, that all physically observable quantities are independent of the gauge choice. Another criticism of this formalism is that we already get rid of some of the components of the metric. Therefore, the spirit of the general theory of relativity, which is based on the geometrical structure of spacetime, is somewhat diluted. There are other suggestions where $h_{ij}$ and their conjugate momenta are elevated to quantum status before supplying the gauge conditions. The issues of gauge fixing and dealing with the constraints are addressed at the quantum level. We replace the canonical Poisson bracket
algebra by the canonical commutators and proceed further. The momentum operator assumes the form

$\hat{\pi}^{ij} = -i\hbar \frac{\delta}{\delta h_{ij}}$

and the wave functional depends on $h_{ij}$, that is, $\Psi[h]$. There are many technical problems related to the properties of the states and we shall not deal with them due to limitations of space. It is essential to discuss the role of the constraints in the quantum theory. We demand that the quantum constraints annihilate the physical states (recall the Gauss law constraint in gauge theories). However, the issue of operator ordering is to be dealt with, which in turn is connected with the Hermiticity properties of the quantum constraints. The Hamiltonian constraint $H_{\perp} \approx 0$ (henceforth denoted as $H$ and defined as the Hamiltonian) is a product of the metric $\hat{h}_{ij}$ and $\hat{\pi}^{ij}$. There is a certain ambiguity in defining the constraint. Therefore, one has to choose a convention.

The condition that the Hamiltonian, $\hat{H}^{T}$, consisting of gravitational and matter components, annihilates the state is expressed as

$\hat{H}^{T} \Psi = 0$   [23]

When we adopt the coordinate representation for $\pi^{ij}$, the above equation takes the form

$\left[ -16\pi G\, G_{ijkl} \frac{\delta^2}{\delta h_{ij}\, \delta h_{kl}} - \frac{\sqrt{h}}{16\pi G} \left( {}^{3}R - 2\Lambda \right) + H^{\text{matter}} \right] \Psi[h, \phi] = 0$   [24]

This is the celebrated WDW equation. Here we have considered a simple case where the matter Hamiltonian density generically contains a single scalar field, $\phi$, and therefore $\Psi$ is a functional of the 3-metric on $\Sigma_t$ and $\phi$. $G_{ijkl}$ is the De Witt metric in the superspace:

$G_{ijkl} = \frac{1}{2\sqrt{h}} \left( h_{ik} h_{jl} + h_{il} h_{jk} - h_{ij} h_{kl} \right)$   [25]

Remarks The space of all 3-metrics and the scalar field $(h_{ij}, \phi)$, on $\Sigma_t$, for the description of classical evolutions is called the superspace (no connection with the superspace of supersymmetry). Thus, $\Psi[h_{ij}, \phi]$ is a functional on superspace. Furthermore, $\Psi$ carries no explicit dependence on $t$. This is a consequence of the fact that "time" plays the role of a parameter in the general theory of relativity; thus the dynamical variables $h_{ij}$ and $\phi$ already provide the evolutionary processes although $t$ does not make its appearance. As mentioned earlier, we always discuss the case when $\Sigma_t$ is compact. Another point to note is that the quantum momentum constraint, $\hat{H}_i$, as an operator annihilates the wave function, which is a statement of the quantum-mechanical invariance of the theory under three-dimensional diffeomorphisms. However, the WDW equation conveys invariance of the theory under reparametrization, although careful analysis is necessary to prove this point. Now we proceed to discuss the solutions of the WDW equation.

WDW Equation and the Solutions

It is recognized that the WDW equation [24] is a second-order hyperbolic functional differential equation and naturally it has an enormous number of solutions. Therefore, if we want the WDW equation to have any predictive power, it is necessary to introduce boundary conditions. One of the possible choices is to specify the wave function on the boundary of the superspace. Indeed, the central issue of quantum cosmology is about the choice of various boundary conditions, which has been an important topic of debate. This point will be briefly discussed later. Notice that the boundary condition has to be introduced keeping in mind how the universe is expected to behave as it evolves. There is a proposition that the boundary condition for the quantum evolution of the universe be given the status of a physical law. Therefore, the role of the wave functional, $\Psi[h_{ij}(x), \phi(x), B]$, its evolution, and interpretation are central to the development of quantum cosmology. Thus, $\Psi$ represents the amplitude for the universe to have $h_{ij}(x)$ on the 3-surface, $B$, and matter field $\phi(x)$. It is argued that the path-integral formalism should be adopted as an alternative to the canonical prescription to solve for the wave function, or rather the transition amplitude, satisfying the WDW equation. Here the first step is to define the Euclidean version of the gravitational action keeping in mind the subtleties. As is well known, we deal with the propagator (or transition amplitude) in the path-integral approach, where the functional integral is carried out over a set of 4-metrics and matter fields with the Euclidean action inside the integral acting as the weight factor. We recall that while formulating quantum mechanics in the path-integral approach, we sum over all possible paths in the functional integral. However, in the semiclassical approximation, the amplitude is dominated by the action corresponding to the classical path and we approximate the wave function as $e^{(i/\hbar) S_{cl}}$, and it gets modified appropriately in the Euclidean formulation. In this background, we briefly discuss how the wave function of the universe is obtained in the path-integral formalism.
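The hyperbolic character of the WDW equation is tied to the indefinite signature of the De Witt metric [25]. As a small check, the Python sketch below evaluates the quadratic form $G_{ijkl} A^{ij} A^{kl}$ on a few symmetric directions $A^{ij}$ at a single point. The prefactor $1/(2\sqrt{h})$ is the commonly quoted normalization and is an assumption here (any positive prefactor gives the same signs); the diagonal sample metric and the function names are illustrative only.

```python
import math


def det3(h):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
            - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
            + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0]))


def dewitt_metric(h):
    """G_ijkl = (h_ik h_jl + h_il h_jk - h_ij h_kl) / (2 sqrt(det h))."""
    scale = 1.0 / (2.0 * math.sqrt(det3(h)))
    r = range(3)
    return [[[[scale * (h[i][k] * h[j][l] + h[i][l] * h[j][k] - h[i][j] * h[k][l])
               for l in r] for k in r] for j in r] for i in r]


def quadratic_form(G, A):
    """Evaluate G_ijkl A^ij A^kl for a symmetric 3x3 array A."""
    r = range(3)
    return sum(G[i][j][k][l] * A[i][j] * A[k][l] for i in r for j in r for k in r for l in r)


if __name__ == "__main__":
    h = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
    G = dewitt_metric(h)
    # Pure-trace direction A^ij = h^ij (a uniform rescaling of the 3-metric).
    trace_dir = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0 / 3.0]]
    # Two traceless directions (h_ij A^ij = 0).
    shear_offdiag = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    shear_diag = [[1.0, 0.0, 0.0], [0.0, -0.5, 0.0], [0.0, 0.0, 0.0]]
    for name, A in (("trace", trace_dir), ("shear (off-diagonal)", shear_offdiag),
                    ("shear (diagonal)", shear_diag)):
        print(name, quadratic_form(G, A))
```

The pure-trace direction comes out negative while the traceless directions come out positive, so one direction in the six-dimensional space of symmetric $A^{ij}$ behaves like "time" and the remaining ones like "space".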
According to the proposal of Hartle and Hawking, one adopts the path-integral formalism for the Euclidean action, where the functional integral is not only carried out over the 4-metric, $g_{\mu\nu}$, and the scalar field $\phi$, but also one takes the sum over the class of manifolds, $M$. Note that $B$ is a part of the boundary of this set of manifolds. If $\bar{h}_{ij}$ and $\bar{\phi}$ are the induced metric and the configuration of the scalar field, $\phi$, on the boundary, $B$, then the propagator (henceforth we just call it the wave function) $\Psi[\bar{h}_{ij}, \bar{\phi}, B]$ can be given a functional-integral representation. Indeed, obtaining the most general form of the path integral, summing over the 4-manifolds, is quite a formidable task. On the other hand, if one chooses a class of 4-manifolds which can be decomposed as a product (foliation) $\mathbb{R} \times B$, the wave function is expressed as

$\Psi[\bar{h}, \bar{\phi}, B] = \int DN^\mu \int Dh_{ij}\, D\phi\; \delta\big(f(N^\mu)\big)\, \Delta_{FP}\; e^{-S_E[g_{\mu\nu}, \phi]}$   [26]

We have introduced the gauge-fixing condition as $f(N^\mu)$, which is usually taken to be $\dot{N}^\mu = \chi^\mu$, and then the corresponding Faddeev–Popov determinant, $\Delta_{FP}$, has to be inserted into the path-integral measure. We recall from our earlier discussions that $N^\mu$ has to be unrestricted on the boundary, $B$, since they have no dynamical role when we express the action in terms of the variables defined on the 3-surface. As noted in the previous discussion, explicit time dependence does not appear after the 3 + 1 split and $(h_{ij}(x), \phi(x))$ have no dependence on $t$. Therefore, we introduce a parameter $\tau$ to designate the paths over which the functional integral is to be taken. Recall that in the quantum-mechanical case, the paths are parametrized as $q_i(t)$ for the coordinates. However, when we resort to a parametrization of the variables for the case at hand, certain conditions must be fulfilled. We are permitted to integrate over $h_{ij}$ and $\phi$ over only those paths, while parametrizing them as $(h_{ij}(x, \tau), \phi(x, \tau))$, so that they match the arguments of the wave function on the boundary $B$. Therefore, we may define the metric and the scalar field configuration so that at $\tau = 1$ they assume their functional values on the boundary: in other words, $\bar{h}_{ij}(x) = h_{ij}(x, \tau = 1)$ and $\bar{\phi}(x) = \phi(x, \tau = 1)$. It is worthwhile to go back to the quantum-mechanical analogy once more. When we compute amplitudes/propagators in quantum mechanics, the functional integral is defined for the amplitude of going from a configuration $q_i$ to $q_f$ while summing over all possible paths originating from one endpoint $q_i$ and ending at the final point $q_f$. On this occasion, we have imposed the constraint on the final endpoint belonging to the boundary $B$. Thus, in order to determine the wave function of the universe, we are required to specify the initial configurations of $h_{ij}$ and $\phi$ at $\tau = 0$. We shall not enter into important issues related to the properties of the Euclidean action, the problems associated with the choice of contours of the path integrals, and related topics. The reader will find detailed discussions in the lectures and monographs referred to in the "Further reading" section.

It is important to re-emphasize that boundary conditions are to be introduced while solving the WDW equation. It was argued by De Witt that the wave function will be determined uniquely from the mathematical consistency of the theory, and that hope has not been realized. Whether one attempts to solve the functional differential WDW equation or obtain the wave function in the path-integral formalism, the issue of boundary condition is unavoidable. There are mainly three different kinds of boundary conditions in quantum cosmology: the Hartle–Hawking (HH) no-boundary proposal, Vilenkin's tunneling mechanism, and Linde's boundary condition. We shall briefly discuss the first two proposals. Instead of stating the boundary conditions in full generality, we shall envisage quantum cosmology in a minisuperspace and provide illustrative examples to compare the main features of HH and Vilenkin solutions to the WDW equation.

It is realized that the discussion and solution of quantum cosmology in the superspace is rather difficult, since we deal with functional differential equations and the configuration space is infinite dimensional. Therefore, it is worthwhile to consider a system, as a simple model, which has finitely many degrees of freedom. Thus, we assume that the metric and matter fields depend only on cosmic time to begin with. There is a physical motivation behind this assumption, since the present classical state of the universe is described by the Friedmann–Robertson–Walker (FRW) metric corresponding to an isotropic and homogeneous universe. Notice that the classical evolution equation resembles that of the motion of a particle. The quantum evolution equations are now given by differential equations of quantum mechanics rather than functional differential equations. Similarly, the path-integral formulation becomes analogous to the quantum-mechanical framework. Of course, adopting such a simplified approach deprives us of describing some of the important aspects of quantum gravity. However, within this framework, several essential features can be exhibited and deep insight might be gained into the physics of the very early universe. The first step in getting the minisuperspace metric is to assume that the lapse is homogeneous, that is, $N^{\perp} = N^{\perp}(t)$
and the shift is set to zero, $N^i = 0$. Thus, the metric takes the form

$ds^2 = -(N^{\perp}(t))^2 dt^2 + h_{ij}(x, t)\, dx^i dx^j$   [27]

The relevant choice of 3-metric for the FRW isotropic and homogeneous universe is

$h_{ij}(x, t)\, dx^i dx^j = a(t)^2\, d\Omega_3^2$   [28]

Note that $d\Omega_3^2$ is the metric on a 3-sphere. It is straightforward to derive the Friedmann equations for such a geometry.

The HH no-boundary condition can be interpreted as a topological proposition about the set of paths over which we have to sum. The 3-surface $B$ is to be taken as the only surface of a compact 4-manifold $M$ which is endowed with the metric $g_{\mu\nu}$, and $h_{ij}$ and $\phi$ are the induced metric and the scalar field on the surface. The wave function is obtained by using the matching condition supplemented with the initial condition. For the minisuperspace case, initial conditions impose constraints on the scale factor $a(\tau = 0)$ and $(da/d\tau)(\tau = 0)$, and $N^{\perp}$ is to be gauge fixed. These conditions are to be

derivative term; the total derivative term can be removed by adding a boundary term and $k$ is positive since we take the spatial part to be closed. We have redefined the scale factor, the scalar field, the potential term, and $k$ such that the Einstein–Hilbert action with matter field assumes the form of [29], and this action facilitates the definition of conjugate momenta without cumbersome numerical factors, and the Hamiltonian takes a simple form. The conjugate momenta and resulting Hamiltonian are

$\pi_a = -\frac{a \dot{a}}{N^{\perp}}, \quad \pi_\phi = \frac{a^3 \dot{\phi}}{N^{\perp}}$   [30]

$H_c = \frac{N^{\perp}}{2} \left[ -\frac{\pi_a^2}{a} + \frac{\pi_\phi^2}{a^3} + a^3 V(\phi) - a \right] = N^{\perp} H$   [31]

and the constraint is $H = 0$. In the quantum cosmology context, we solve the WDW equation: $\hat{H} \Psi = 0$. Since the exact solution is not possible, one resorts to some approximation with simple assumptions. The differential equation is
the boundary which were denoted by $\bar{h}_{ij}$ and $\bar{\phi}$; strictly speaking, we should denote the solutions as $\bar{a}$ and $\bar{\phi}$. But from now on, we drop this bar on $a$ and $\phi$.

Let us momentarily assume that $V$ is $\phi$-independent and therefore, we have an effective cosmological constant. The problem is identical to the motion of a particle in a potential well. There are two turning points. In one region, the particle starts from $a = 0$, reaches one turning point $r_1$ and returns back. In another case, it starts from $a = \infty$, travels up to $a = r_2$ and reflects back. In the quantum-mechanical case, the particle can tunnel through the barrier. The wave function has both decaying and growing modes under the barrier, and boundary conditions tell us which mode to choose. One possibility is that the particle starts from $a = 0$, tunnels through and proceeds towards $a = \infty$, that is, it has outgoing mode. The other possibility is that the wave function has both outgoing and ingoing modes. In this simple scenario, the former corresponds to Vilenkin's tunneling boundary condition, where the universe is created at $a = 0$ and it keeps growing. The latter is HH no-boundary proposal where the wave function has both modes and the universe contracts and expands.

Now we discuss the two boundary conditions in the presence of the potential, with the approximations mentioned above. The proposition of Vilenkin amounts to the following conditions on the wave function: the region of the boundary which is nonsingular is finite and $a = 0$. Other than this domain, either $a$ or $\phi$ diverge on any other region of the boundary; both can diverge in this singular boundary. Notice from the expression for [33] and [34] that the tunneling region corresponds to $a^2 V(\phi) < 1$, whereas the oscillatory domain is $a^2 V(\phi) > 1$. If we use the saddle-point approximation, $\psi \approx e^{\pm iS_{cl}}$. Vilenkin's boundary condition corresponds to $\psi \approx e^{-iS_{cl}}$, with

$$S_{cl} = \frac{\left(\sqrt{a^2 V(\phi) - 1}\right)^3}{3V(\phi)}$$

So far, we considered the situation where differential operator for $\phi$ is dropped in [32]. In order to account for weak $\phi$-dependence, we could introduce it by multiplying a slowly varying function, say $F(\phi)$, and write $\psi(a,\phi) \approx F(\phi)e^{-iS_{cl}}$. Similarly, the wave function can be obtained under the barrier and required to satisfy WKB matching conditions. Furthermore, the regularity condition on the wave function in small scale factor limit and behavior of its derivative with respect to $\phi$ in that limit determine the form of $F(\phi)$. In summary, the Vilenkin boundary conditions yield the following wave functions:

$$\psi_V(a,\phi) \approx e^{-\frac{1}{3V(\phi)}\left(1 - [1 - a^2 V(\phi)]^{3/2}\right)} \qquad [35]$$

$$\psi_V(a,\phi) \approx e^{-\frac{1}{3V(\phi)}}\, e^{-\frac{i}{3V(\phi)}[a^2 V(\phi) - 1]^{3/2}} \qquad [36]$$

Note that [35] is the wave function under the barrier, that is, $a^2 V(\phi) < 1$ in this region, whereas [36] is in the classically accessible domain ($a^2 V(\phi) > 1$) which is reflected by the oscillatory character. The slowly varying function $F(\phi) \approx e^{-1/(3V(\phi))}$ appears as the common factor for the wave functions in the two domains.

The HH no-boundary proposal to derive the wave function of the universe was formulated in the Euclidean path-integral formalism. A considerable amount of attention has been focused in this area. We shall present the HH wave function providing only a sketchy argument. In the Euclidean description, 4-metric is $ds^2 = (N^{\perp})^2 d\tau^2 + a^2(\tau)\,d\Omega_3^2$. The 4-geometry should close in a regular way. If we make the bounding 3-space smaller and smaller, it can be closed with flat space. We can infer about the behavior of the scale factor in the limit $\tau \to 0$ from this consideration. Furthermore, in the semiclassical approximation $\psi(a,\phi) \approx e^{-S_E}$; we have replaced $\psi(\bar{a},\bar{\phi})$ by $\psi(a,\phi)$ as remarked earlier. Thus, our aim is to evaluate $S_E$ at the saddle point. This is achieved by writing down the (Euclidean version) field equations for $a$ and $\phi$ and the Hamiltonian constraint, and then solving for $a(\tau)$, $\phi(\tau)$, and $N^{\perp}(\tau)$. Eventually, we want to eliminate $N^{\perp}$ and then obtain $S_E$. After all, the path integral is dominated by the classical trajectory, $a(\tau)$, and one does not fix the gauge for $N^{\perp}$ while solving for $a$. In fact, the lapse gets eliminated by utilizing the Hamiltonian constraint which involves $\tau$-derivatives of both $a$ and $\phi$. We mention, without going into details, that the classical action is not unique. One of the ways to visualize it is to note that the solutions obtained for the lapse from the Hamiltonian constraint have sign ambiguities.

The classical action is

$$S_E = -\frac{1}{3V(\phi)}\left(1 \pm [1 - a^2 V(\phi)]^{3/2}\right) \qquad [37]$$

Note that the two solutions correspond to 3-sphere boundary being closed off by sections of 4-sphere. Moreover, the Euclidean action is negative. Hartle and Hawking argue that the negative sign in [37] gives the correct answer since the wave function peaks for that choice. However, there is no unanimity
for HH argument and some authors have put forward a point of view that additional inputs are necessary to arrive at the HH conclusion about choosing the negative sign for $S_E$ in [37]. We refer the reader to the reviews of Hartle and Halliwell for detailed discussions on the choice of contours for the path integrals, subtleties involved in getting various solutions for the lapse and their interpretations. We give below the wave function under the barrier (with choice of negative sign in [37]):

$$\psi_{HH}(a,\phi) \approx e^{\frac{1}{3V(\phi)}\left(1 - [1 - a^2 V(\phi)]^{3/2}\right)} \qquad [38]$$

$$\psi_{HH}(a,\phi) \approx e^{\frac{1}{3V(\phi)}} \cos\left(\frac{1}{3V(\phi)}[a^2 V(\phi) - 1]^{3/2} - \frac{\pi}{4}\right) \qquad [39]$$

Remarks
The wave function in [38] is obtained in the classically inaccessible region under the condition $a^2 V(\phi) < 1$, and wave function [39] corresponds to the case $a^2 V(\phi) > 1$, where the particle motion is permissible classically. Note the factor $e^{1/(3V(\phi))}$ in the wave functions in both the regions and compare that with Vilenkin's wave function which has the opposite sign. We may conclude where the wave function will peak for each of the two boundary conditions. Whereas Vilenkin's proposal implies that $\psi_V(a,\phi)$ peaks when $V(\phi)$ takes large values, HH no-boundary condition tells us that it peaks when $V(\phi) \to 0$. Furthermore, we note that $\psi_V$ is complex and $\psi_{HH}$ is real in the oscillatory region. Although the debates on the merits and demerits of each of the boundary proposals have been going on for more than two decades, the issue is far from being settled. In the absence of any experimental tests, there is no way to favor one boundary proposal over another. Then, boundary conditions do have predictions about the evolution of the universe after the quantum era and have predictions in that (classical) regime. Therefore, determination of the wave function with specific boundary conditions does have some connections with the laws that govern the evolution of our universe in the present epoch.

It is worthwhile to dwell on the WDW equation from the perspectives of string theories. Indeed, there have been important developments to understand the dynamics of the universe in the string-theoretic framework. It is important to note the key role played by dilaton in string theory: (1) it is one of the massless states of the theory, and (2) the vacuum expectation value (VEV) of this field determines the coupling constants we hope to use in describing fundamental interactions. Therefore, the graviton is always accompanied by the dilaton in any string-theoretic approach to study the universe. The duality symmetries are recognized to provide deep understanding of the string dynamics. Therefore, the investigations of quantum gravity phenomena from string-theory viewpoint are necessarily influenced by the above-mentioned facts. Indeed, classical cosmological solutions, derived from string effective action, have several interesting characteristics. We mention in passing that the WDW equation has played an important role in the study of quantum evolution equations in string cosmology. The choice of operator-ordering prescription in defining the WDW Laplace–Beltrami operator can be resolved by appealing to the duality symmetries. Furthermore, the boundary conditions imposed on the wave function are dictated by string symmetries and therefore, the resulting wave function has very interesting properties. The string theory has addressed some of the most important problems in quantum gravity and it has provided resolutions to several key issues. It is expected that string theory will provide answers to challenging questions in quantum cosmology. In summary, we have conveyed some of the salient aspects of the WDW equation. The canonical quantization technique is adopted to study quantum gravity in this approach. We have illustrated the crucial role of the constraint formalism due to Dirac and argued that some of the nonperturbative aspects of quantum gravity could be retained. In a short article of this nature, it is not possible to provide detailed discussion about the general derivation of the WDW equation and discuss the role of boundary conditions more exhaustively. Instead, we presented some of the key steps in the derivation of the WDW equation adopting the canonical formalism and provided simple examples. The subject is still an active area of research. The interested reader may benefit from the bibliography.

See also: Canonical General Relativity; Loop Quantum Gravity; Quantum Cosmology; Quantum Dynamics in Loop Quantum Gravity; Quantum Geometry and its Applications; Superstring Theories.

Further Reading
De Witt BS (1967) Quantum theory of gravity. Physical Review 160: 1113.
Feynman RP, Morinigo FB, and Wagner WG (1995) Feynman Lectures on Gravity. New York: Addison-Wesley.
Halliwell JJ (1990) Quantum cosmology. In: Randjbar-Daemi S, Sezgin E, and Shafi Q (eds.) Summer School in High Energy Physics and Cosmology, p. 513. Singapore: World Scientific.
Hartle JB (1989) Introductory lectures on quantum cosmology. In: Coleman S, Hartle JB, Piran T, and Weinberg S (eds.) Quantum Cosmology and Baby Universes, Proceedings of 7th Winter School in Theoretical Physics, p. 159. Jerusalem: World Scientific.
Hartle JB and Hawking SW (1983) Wave function of the universe. Physical Review D 28: 2960.
Hawking SW (1983) Quantum cosmology. Les Houches Lectures on Quantum Cosmology, p. 333. (Les Houches Publication) in Einstein Centenary Volume.
Vilenkin A (1984) Quantum creation of universe. Physical Review D 30: 509.
Wheeler JA (1963) Geometrodynamics. In: De Witt C and De Witt BS (eds.) Relativity, Groups and Topology. New York: Gordon and Breach.
Wulff Droplets
S Shlosman, Université de Marseille, Marseille, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Historically, the first question where the Wulff shapes have appeared is the one of the formation of a droplet or a crystal of one substance inside another. The natural problem here is: what shape such a formation would take? The statement that such a shape should be defined by the minimum of the overall surface energy subject to the volume constraint is physically very natural. In the isotropic case, when the surface tension does not depend on the orientation of the surface, and so is just a positive number, the shape in question should be of course spherical (provided we neglect the gravitational effects). In a more general situation the shape in question is less symmetric. The corresponding variational problem is called the Wulff problem. Wulff (1901) formulated it in his paper, where he also presented a geometric solution to it, called the "Wulff construction."

The Wulff variational problem is formulated as follows. Let $\tau(n)$, $n \in S^{d-1}$, be some continuous function on the unit sphere $S^{d-1} \subset R^d$. We suppose that $\tau > 0$, and that $\tau$ is even: $\tau(n) = \tau(-n)$. The value $\tau(n)$ plays the role of the surface tension between two phases separated by the hyperplane orthogonal to the vector $n$. For every closed compact (hyper)surface $M^{d-1} \subset R^d$, we define its surface energy as

$$\mathcal{W}_\tau(M) = \int_M \tau(n_s)\,ds$$

where $n_s$ is the normal vector to $M$ at $s \in M$. The functional $\mathcal{W}_\tau(M)$ has the meaning of the surface energy of the $M$-shaped droplet made from one of these two phases. It is called the Wulff functional.

Let $W_\tau$ be the surface which minimizes $\mathcal{W}_\tau(\cdot)$ over all the surfaces enclosing the unit volume. Such a minimizer does exist and is unique up to translation. It is called the Wulff shape.

The following is the geometric construction of $W_\tau$. Consider the set

$$K_\tau = \{x \in R^d : \forall n \in S^{d-1},\ (x, n) \le \tau(n)\}$$

If we define the half-spaces

$$L_{\tau,n} = \{x \in R^d : (x, n) \le \tau(n)\}$$

then

$$K_\tau = \bigcap_n L_{\tau,n} \qquad [1]$$

In particular, $K_\tau$ is convex. It turns out that

$$W_\tau = \lambda_\tau\,\partial(K_\tau)$$

where the dilatation factor $\lambda_\tau$ is defined by the normalization: $\mathrm{vol}(\lambda_\tau K_\tau) = 1$. The relation [1] is called the Wulff construction. For the future use, we introduce the notation $w_\tau$ for the value of the surface energy of the Wulff shape:

$$w_\tau = \mathcal{W}_\tau(W_\tau)$$

The Wulff construction was considered by the rigorous statistical mechanics as just a phenomenological statement, though the notion of the surface tension was among its central notions. The situation changed after the appearance of the book by Dobrushin et al. (1992). There it was shown that in the setting of the canonical ensemble formalism, in the regime of the first-order phase transition, the (random) shape occupied by one of the phases has asymptotically (in the thermodynamic limit) a nonrandom shape, given precisely by the Wulff construction! In other words, a typical macroscopic random droplet looks very close to the Wulff shape. In what follows we will explain the above result.

Another important application of the concepts introduced above – the role played by the Wulff
pffiffiffiffiffiffiffiffiffiffi
the ball centered at t with radius d 1=N , and let droplet () present in the configuration , then the
BN (t) (N) be its preimage under iN . Then value M (t) should be expected to be md (),
X depending on whether t is outside or inside the
1
M ðrÞ ¼ ðxÞ droplet, which explains the factor 1=md ().
jBN ðtÞj x2B ðtÞ For a proof, see Bodineau (1999) and Cerf and
N
Pisztora (1999).
We have to expect to see a droplet sK with
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi See also: Cluster Expansion; Large Deviations in
d d md ðÞ Equilibrium Statistical Mechanics; Metastable States;
s¼
w 2md ðÞ Percolation Theory; Statistical Mechanics of Interfaces.
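As a numerical illustration of the Wulff construction [1] from the Introduction, the following Python sketch builds the set K_tau in d = 2 as an intersection of half-planes, rescales it to unit area, and evaluates the Wulff functional on the resulting boundary. The surface tension tau(n) = |n_1| + |n_2| is a hypothetical choice made only for this example (it is not taken from the text); for it K_tau is the square [-1, 1]^2. All function names are illustrative.

```python
import numpy as np

def tau(phi):
    """Assumed sample surface tension tau(n) = |n_1| + |n_2|, n = (cos phi, sin phi)."""
    return np.abs(np.cos(phi)) + np.abs(np.sin(phi))

def wulff_radial_function(theta, num_normals=720):
    """Distance from the origin to the boundary of K_tau along each direction theta.

    A point r*u lies in K_tau iff r*(u, n) <= tau(n) for every unit normal n,
    so the boundary sits at r = min over normals with (u, n) > 0 of tau(n)/(u, n).
    """
    phi = 2.0 * np.pi * np.arange(num_normals) / num_normals
    dots = np.cos(theta[:, None] - phi[None, :])          # (u, n) for all pairs
    ratios = np.full(dots.shape, np.inf)
    mask = dots > 1e-12                                   # only constraining half-planes
    ratios[mask] = np.broadcast_to(tau(phi), dots.shape)[mask] / dots[mask]
    return ratios.min(axis=1)

def wulff_shape_2d(num_angles=720):
    """Return area(K_tau), the dilatation factor, and the rescaled boundary points."""
    theta = 2.0 * np.pi * np.arange(num_angles) / num_angles
    r = wulff_radial_function(theta)
    area = 0.5 * np.sum(r**2) * (2.0 * np.pi / num_angles)  # polar area formula
    lam = 1.0 / np.sqrt(area)                               # so that area(lam * K_tau) = 1
    boundary = lam * np.column_stack((r * np.cos(theta), r * np.sin(theta)))
    return area, lam, boundary

def wulff_energy(boundary):
    """Wulff functional of a closed polygon: sum of tau(edge normal) * edge length."""
    edges = np.roll(boundary, -1, axis=0) - boundary
    lengths = np.hypot(edges[:, 0], edges[:, 1])
    normal_angle = np.arctan2(edges[:, 1], edges[:, 0]) - 0.5 * np.pi
    return np.sum(tau(normal_angle) * lengths)

area, lam, boundary = wulff_shape_2d()
energy = wulff_energy(boundary)
# For this tau: area is close to 4, lam close to 0.5, energy close to 4.
print(area, lam, energy)
```

For this choice the rescaled shape is the unit-area square, whose four sides have outward normals along the axes, so each side contributes tau = 1 times its length.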
$$R(p,q,r)\,W_{ab}(p,q)\,W_{ac}(q,r)\,\bar{W}_{bc}(p,r) = \sum_d \bar{W}_{dc}(p,q)\,\bar{W}_{bd}(q,r)\,W_{ad}(p,r) \qquad [4]$$

Note that eqns [3] and [4] differ from each other by the transposition of both spin variables in all six weight

$$R_{i,i+1} R_{i+1,i+2} R_{i,i+1} = R_{i+1,i+2} R_{i,i+1} R_{i+1,i+2} \qquad [5]$$

and

$$[R_{i,i+1}, R_{j,j+1}] = 0, \quad \text{if } |i - j| \ge 2 \qquad [6]$$

and similar relations in which $R_{i,i+1}$ and/or $R_{i+1,i+2}$ are replaced by their inverses.
Figure 3 Star–triangle equation.
Figure 4 Reidemeister moves of types I, II, and III.

Factorizable S-Matrices and Bethe Ansatz
Yang–Baxter Equations 467
Figure 5 Vertex model YBE.
Figure 6 Vertex model weight $\omega(p, q)$, mixed model weight $W|^{dc}_{ab}(p, q)$ and IRF model weight $w^{dc}_{ab}(p, q)$.
XXX 00 00 0 0 00 0
! ðp; qÞ!0000 ðq; rÞ!00 ðp; rÞ
discovery of the condition for factorizable S-matrices 00 00 00
by McGuire in 1964, represented pictorially by XXX 0 0 00 00 0 00
Figure 5, where the world lines of the particles are ¼ !0000 ðp; qÞ! ðq; rÞ!00 ðp; rÞ ½8
00 00 00
given. Upon collisions the particles can only exchange
their rapidities p, q, r, so that there is no dispersion. This equation is represented graphically in Figure 5.
Also indicated are the internal degrees of freedom in From it one can also derive a sufficient condition for
Greek letters. In other words, the three-body S-matrix the commutation of transfer matrices and spin-chain
can be factorized in terms of two-body contributions Hamiltonians, generalizing the work of McCoy and
and the order of the collisions does not affect the Wu, who had earlier initiated the search by showing
final outcome. McGuire also realized that this that the general six-vertex model transfer matrix
condition is all one needs for the consistency of commutes with a Heisenberg spin-chain Hamilto-
factoring the n-body S-matrix in terms of two-body nian. To be more precise, Baxter found that if
S-matrices. The consistency condition is obviously !
= for some choice of p and q, some spin-
related to the Reidemeister move of type III in chain Hamiltonians could be derived as logarithmic
Figure 4. derivatives of the transfer matrix.
Yang succeeded in solving the spin-1/2 fermionic model using a nested Bethe ansatz, utilizing a generalization of Artin's braid relations [5] and [6],

$$\check{R}_{i,i+1}(p-q)\,\check{R}_{i+1,i+2}(p-r)\,\check{R}_{i,i+1}(q-r) = \check{R}_{i+1,i+2}(q-r)\,\check{R}_{i,i+1}(p-r)\,\check{R}_{i+1,i+2}(p-q) \qquad [7]$$

He submitted his findings in two short papers in 1967. The $\check{R}$ operators in eqn [7] – a notation introduced later by the Leningrad school – depend on differences of two momenta or two relativistic rapidities. Sutherland solved the general spin case using repeated nested Bethe ansätze, while Lieb and Wu used Yang's work to solve the one-dimensional Hubbard model.

Vertex Models
Since Lieb's solution of the ice model by a Bethe ansatz, there have been many developments on vertex models, in which the state variables live on line segments and weight factors $\omega$ are assigned to a vertex where four line segments with the four states $\alpha$, $\beta$, $\lambda$, $\mu$ on them meet, see Figure 6.

Interaction-Round-a-Face Model
Baxter introduced another language, namely that of the IRF or "interaction-round-a-face" model, which he introduced in connection with his solution of the hard-hexagon model. This formulation is convenient when studying one-point functions using the corner-transfer-matrix method. Now the integrability condition can be represented graphically as in Figure 7 or algebraically as

$$\sum_d w^{a'd}_{cb'}(p,q)\,w^{a'b}_{dc'}(q,r)\,w^{dc'}_{b'a}(p,r) = \sum_{d'} w^{bc'}_{d'a}(p,q)\,w^{cd'}_{b'a}(q,r)\,w^{a'b}_{cd'}(p,r) \qquad [9]$$

The spins live on faces enclosed by rapidity lines and the weights $w^{dc}_{ab}(p, q)$ are assigned as in Figure 6.
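Yang's relation [7] can be checked numerically for a concrete operator. The sketch below assumes the rational form R-check(u) = 1 + u P, with P the operator exchanging two neighbouring sites; this specific solution and its normalization are assumptions made for illustration and are not stated in the text, and the helper names are hypothetical.

```python
import numpy as np

def swap_operator(d):
    """Permutation operator P on C^d (x) C^d: P(x (x) y) = y (x) x."""
    P = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            P[a * d + b, b * d + a] = 1.0
    return P

def r_check(u, d=2):
    """Assumed rational solution R-check(u) = 1 + u P (illustrative normalization)."""
    return np.eye(d * d) + u * swap_operator(d)

def on_sites(op, first_site, d=2):
    """Embed a two-site operator on sites (i, i+1) of a three-site chain.

    first_site = 0 acts on sites (1, 2); first_site = 1 acts on sites (2, 3).
    """
    one = np.eye(d)
    return np.kron(op, one) if first_site == 0 else np.kron(one, op)

def yang_residual(p, q, r, d=2):
    """Largest entry of |LHS - RHS| of relation [7] for rapidities p, q, r."""
    lhs = (on_sites(r_check(p - q, d), 0, d)
           @ on_sites(r_check(p - r, d), 1, d)
           @ on_sites(r_check(q - r, d), 0, d))
    rhs = (on_sites(r_check(q - r, d), 1, d)
           @ on_sites(r_check(p - r, d), 0, d)
           @ on_sites(r_check(p - q, d), 1, d))
    return np.max(np.abs(lhs - rhs))

residual = yang_residual(0.3, -1.1, 2.0)
print(residual)  # vanishes up to floating-point rounding
```

The check works because the three arguments satisfy (p - q) + (q - r) = (p - r); with unrelated arguments the two sides generally differ.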
IRF-Vertex Model
quantum inverse-scattering method (QISM), coining
In Figure 6, we have also defined mixed IRF-vertex the term quantum YBEs (QYBEs) for eqns [8]. If
dc
model weights W jab (p, q). (We could put further special limiting values of p and q can be found, say as
state variables on the vertices, but then the natural h ! 0, such that !
= þ O(
h), one can reduce
thing to do is to introduce new effective weights [8] to the classical Yang–Baxter equations (CYBEs) by
summing over the states at each vertex.) With the expanding up to the first nontrivial order in expansion
choice made a more general YBE can be represented variable h. These determine the integrability of certain
as in Figure 8, or by models of classical mechanics by the inverse-scattering
X X X X 00 00 a0 d method and the existence of Lax pairs.
W jcb0 ðp; qÞ
00 00 00 d
0 0 0 00 0 0
W0000 jadcb0 ðq; rÞW00 jdc
b0 a ðp; rÞ Checkerboard generalizations
XXXX 0 0 0
¼ W0000 jbc
d0 a ðp; qÞ Star–triangle equations [3] and [4] imply that there are
00 00 00 d0 further generalizations of the YBEs, namely those for
00 00 0 0 00 0
cd ab
which the faces enclosed by the rapidity lines are
W jb0 a ðq; rÞW 00 jcd0 ðp; rÞ ½10
alternatingly colored black and white in a checkerboard
pattern. We can then introduce either vertex model
Quantum Inverse-Scattering Method weights !
(p, q) and ! (p, q), or IRF-vertex model
The Leningrad school of Faddeev incorporated the weights W jab (p, q) and W jdc
dc
ab (p, q), or IRF
dc dc
methods of Baxter and Yang in their so-called model weights wab (p, q) and wab (p, q), see Figure 9.
Figure 9 Checkerboard versions of the weights.
The black faces are those where the spins of the Checkerboard IRF Model
spin model with weights defined in Figure 2 live; the
The checkerboard IRF version of the YBE [8]
white faces are to be considered empty in Figures 2
becomes
and 3 (or, equivalently, they can be assumed to host
trivial spins that take on only a single value). X 0 0 0
ad ab dc
Clearly, the IRF-vertex model description contains wcb 0 ðp; qÞwdc0 ðq; rÞwb0 a ðp; rÞ
X 0 0 0
¼ Rðp; q; rÞ wbc cd ab
d0 a ðp; qÞwb0 a ðq; rÞwcd0 ðp; rÞ ½13
Checkerboard Vertex Model d0
0 0 0 00 0 0
W0000 jdc
ab dc
0 ðq; rÞW 00 jb0 a ðp; rÞ
p
γ′ γ′ XXXX 0 0 0
p α′ β ¼ Rðp; q; rÞ W 00 00 jbc
d0 a ðp; qÞ
β α″ β″ α′ 00 00 00 d0
γ″ = γ″
α β″ α″ β′
q β′ 00 00 0 0 00 0
α W jcd ab
b0 a ðq; rÞ W 00 jcd0 ðp; rÞ ½15
γ q γ
r r
XXXX ad 00 00 0
Rðp; q; rÞ W jcb 0 ðp; qÞ
00 00 00 d
γ′ p
γ′ ab 0 0 dc 0 00 0 0
α
¼ W0000 jbc
d0 a ðp; qÞ
β″ α″ β′
q β′ α 00 00 00 d0
γ q γ
cd 00 00ab 0 0 00 0
W jb0 a ðq; rÞW 00 jcd 0 ðp; rÞ ½16
r r
Figure 10 Checkerboard vertex model YBE. with its graphical representation in Figure 12.
Figure 13 Square weight as vertex weight.
^^ ^^
!^^ ðp; qÞ ¼ !^^ ðp; qÞ ¼ 0 otherwise ½19
In eqn [19], we have set all vertex model weights zero that are inconsistent with IRF-vertex configurations. Clearly, the translation of IRF models and spin models to vertex models can be done similarly.

Map to Spin Model
Reflection YBEs
Operator Formulations Cherednik and Sklyanin found a condition deter-
The R-Matrix mining the solvability of systems with boundaries,
the reflection YBEs (RYBEs), see Figure 14. Upon
For a problem with N rapidity lines, carrying
rapidities p1 , . . . , pN , we can introduce a set of
matrices $R_{ij}(p_i, p_j)$, for $1 \le i < j \le N$, with elements
Y q– q–
Rij ðpi ; pj Þ11...N
...N ¼ !jiij ðpi ; pj Þ kk ½22
k6¼i; j p–
p–
In terms of these, the YBE [8] can be rewritten in matrix form as

$$R_{jk}(p_j,p_k)\,R_{ik}(p_i,p_k)\,R_{ij}(p_i,p_j) = R_{ij}(p_i,p_j)\,R_{ik}(p_i,p_k)\,R_{jk}(p_j,p_k) \qquad [23]$$

where $1 \le i < j < k \le N$.

Figure 14 Reflection YBE.
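The matrix form [23] can be tested on a small system. The sketch below takes N = 4 rapidity lines and assumes the rational example R_ij(p_i, p_j) = (p_i - p_j) 1 + P_ij, where P_ij exchanges tensor factors i and j; this particular R-matrix is an assumption chosen for illustration, not one given in the text, and the function names are hypothetical.

```python
import numpy as np

def permutation_operator(i, j, n_sites, d=2):
    """Operator on the n_sites-fold tensor power of C^d exchanging factors i and j (0-based)."""
    dim = d ** n_sites
    # View the identity as a tensor with n_sites row indices and n_sites column indices,
    # then exchange two of the row indices to obtain the permutation matrix.
    tensor = np.eye(dim).reshape([d] * (2 * n_sites))
    return np.swapaxes(tensor, i, j).reshape(dim, dim)

def r_matrix(i, j, p, n_sites, d=2):
    """Assumed example R_ij(p_i, p_j) = (p_i - p_j) * 1 + P_ij."""
    return (p[i] - p[j]) * np.eye(d ** n_sites) + permutation_operator(i, j, n_sites, d)

def ybe_residual(i, j, k, p, d=2):
    """Largest entry of |LHS - RHS| of eqn [23] for sites i < j < k."""
    n = len(p)
    lhs = r_matrix(j, k, p, n, d) @ r_matrix(i, k, p, n, d) @ r_matrix(i, j, p, n, d)
    rhs = r_matrix(i, j, p, n, d) @ r_matrix(i, k, p, n, d) @ r_matrix(j, k, p, n, d)
    return np.max(np.abs(lhs - rhs))

rapidities = [0.7, -0.4, 1.9, 0.1]
worst = max(ybe_residual(i, j, k, rapidities)
            for i in range(4) for j in range(i + 1, 4) for k in range(j + 1, 4))
print(worst)  # vanishes up to floating-point rounding for every triple i < j < k
```

Unlike the braid-type relations, the sites i, j, k here need not be adjacent, which is why the permutation operator is built for arbitrary pairs of tensor factors.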
Behrend RE, Pearce PA, and O'Brien DL (1996) Interaction-round-a-face models with fixed boundary conditions: the ABF fusion hierarchy. Journal of Statistical Physics 84: 1–48.
Gaudin M (1983) La Fonction d'Onde de Bethe. Paris: Masson.
Jimbo M (ed.) (1987) Yang–Baxter Equation in Integrable Systems. Singapore: World Scientific.
Kennelly AE (1899) The equivalence of triangles and three-pointed stars in conducting networks. Electrical World and Engineer 34: 413–414.
Korepin VE, Bogoliubov NM, and Izergin AG (1993) Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press.
Kulish PP and Sklyanin EK (1981) Quantum spectral transform method. Recent developments. In: Hietarinta J and Montonen C (eds.) Integrable Quantum Field Theories, Lecture Notes in Physics, vol. 151, pp. 61–119. Berlin: Springer.
Lieb EH and Wu FY (1972) Two-dimensional ferroelectric models. In: Domb C and Green MS (eds.) Phase Transitions and Critical Phenomena, vol. 1, pp. 331–490. London: Academic Press.
Onsager L (1971) The Ising model in two dimensions. In: Mills RE, Ascher E, and Jaffee RI (eds.) Critical Phenomena in Alloys, Magnets and Superconductors, pp. xix–xxiv, 3–12. New York: McGraw-Hill.
Perk JHH (1989) Star-triangle equations, quantum Lax pairs, and higher genus curves. Proceedings of Symposia in Pure Mathematics 49(1): 341–354.
Perk JHH and Schultz CL (1981) New families of commuting transfer matrices in q-state vertex models. Physics Letters A 84: 407–410.
Perk JHH and Wu FY (1986) Graphical approach to the nonintersecting string model: star-triangle equation, inversion relation, and exact solution. Physica A 138: 100–124.
Reidemeister K (1926a) Knoten und Gruppen. Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität 5: 7–23.
Reidemeister K (1926b) Elementare Begründung der Knotentheorie. Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität 5: 24–32.
Yang CN (1967) Some exact results for the many-body problem in one dimension with repulsive delta-function interaction. Physical Review Letters 19: 1312–1314.
Yang CN (1968) S-matrix for the one-dimensional N-body problem with repulsive or attractive δ-function interaction. Physical Review 167: 1920–1923.