Painlevé Equations

N Joshi, University of Sydney, Sydney, NSW, Australia

© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Painlevé equations PI–PVI are six classical second-order ordinary differential equations that appear widely in modern physical applications. Their conventional forms (governing $y(x)$ with derivatives $y' = dy/dx$, $y'' = d^2y/dx^2$) are:

\[
\begin{aligned}
\mathrm{P_{I}}:\quad y'' &= 6y^2 + x \\
\mathrm{P_{II}}:\quad y'' &= 2y^3 + xy + \alpha \\
\mathrm{P_{III}}:\quad y'' &= \frac{y'^2}{y} - \frac{y'}{x} + \frac{1}{x}\left(\alpha y^2 + \beta\right) + \gamma y^3 + \frac{\delta}{y} \\
\mathrm{P_{IV}}:\quad y'' &= \frac{y'^2}{2y} + \frac{3}{2}y^3 + 4xy^2 + 2(x^2 - \alpha)y + \frac{\beta}{y} \\
\mathrm{P_{V}}:\quad y'' &= \left(\frac{1}{2y} + \frac{1}{y-1}\right)y'^2 - \frac{y'}{x} + \frac{(y-1)^2}{x^2}\left(\alpha y + \frac{\beta}{y}\right) + \frac{\gamma y}{x} + \frac{\delta\, y(y+1)}{y-1} \\
\mathrm{P_{VI}}:\quad y'' &= \frac{1}{2}\left(\frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-x}\right)y'^2 - \left(\frac{1}{x} + \frac{1}{x-1} + \frac{1}{y-x}\right)y' \\
&\quad + \frac{y(y-1)(y-x)}{x^2(x-1)^2}\left\{\alpha + \frac{\beta x}{y^2} + \frac{\gamma (x-1)}{(y-1)^2} + \frac{\delta\, x(x-1)}{(y-x)^2}\right\}
\end{aligned}
\]

where $\alpha$, $\beta$, $\gamma$, $\delta$ are constants. They were identified and studied by Painlevé and his school in their search for ordinary differential equations (in the class $y'' = R(x, y, y')$, where $R$ is rational in $y'$, $y$ and analytic in $x$) that define new transcendental functions. Painlevé focussed his search on equations that possess what is now known as the Painlevé property: that all solutions are single-valued around all movable singularities (a singularity is "movable" if its location changes with initial conditions).

For the Painlevé equations, all movable singularities are poles. For PI and PII, all solutions are meromorphic functions. However, the solutions of each of the remaining equations have other singularities called "fixed" singularities, with locations that are determined by the singularities of the coefficient functions of the equation. PIII–PVI have a fixed singularity at $x = \infty$. PIII and PV have additional fixed singularities at $x = 0$, and PVI has them at $x = 0$ and $1$. Although each solution of PIII–PVI is single-valued around a movable singularity, it may be multivalued around a fixed singularity.

Painlevé's school considered canonical classes of ordinary differential equations equivalent under linear fractional transformations of $y$ and $x$. Of the fifty canonical classes of equations they found, all except six were found to be solvable in terms of already known functions. These six lead to the Painlevé equations PI–PVI as their canonical representatives.

A resurgence of interest in the Painlevé equations came about from the observation (due to Ablowitz and Segur) that they arise as similarity reductions of well-known integrable partial differential equations (PDEs), or soliton equations, such as the Korteweg–de Vries equation, the sine-Gordon equation, and the self-dual Yang–Mills equations.

As this connection suggests, the Painlevé equations possess many of the special properties that are commonly associated with soliton equations. They have associated linear problems (i.e., Lax pairs) for which they act as compatibility conditions. There exist special transformations (called Bäcklund transformations) mapping a solution of one equation to a solution of another Painlevé equation (or the same equation with changed parameters). There exist Hamiltonian forms that are related to the existence of tau-functions, which are analytic everywhere except at the fixed singularities. They also possess multilinear forms (or Hirota forms) that are satisfied by tau-functions. In the following subsections, for conciseness, we give examples of these properties for the first or second Painlevé equations and briefly indicate differences, if any, with other Painlevé equations.
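The notion of a movable singularity can be illustrated numerically. The following Python sketch is not part of the original article: it integrates PI along the real axis with a fixed-step fourth-order Runge–Kutta scheme and reports where $|y|$ first exceeds a large threshold, which serves only as a crude proxy for the nearest real pole. The function name `first_blowup`, the step size, the threshold, and the two sets of initial data are illustrative assumptions.

```python
def first_blowup(y0, dy0, h=1e-4, x_max=5.0, threshold=1e4):
    """Integrate y'' = 6 y^2 + x from x = 0 with classical RK4.

    Returns the first x at which |y| is no longer below `threshold`
    (a rough stand-in for a pole on the real axis), or None if that
    never happens before x_max.
    """
    def rhs(x, y, z):
        # First-order system: y' = z, z' = 6 y^2 + x
        return z, 6.0 * y * y + x

    x, y, z = 0.0, float(y0), float(dy0)
    while x < x_max:
        k1y, k1z = rhs(x, y, z)
        k2y, k2z = rhs(x + h / 2, y + h / 2 * k1y, z + h / 2 * k1z)
        k3y, k3z = rhs(x + h / 2, y + h / 2 * k2y, z + h / 2 * k2z)
        k4y, k4z = rhs(x + h, y + h * k3y, z + h * k3z)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        z += h / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        x += h
        if not abs(y) < threshold:  # also catches inf and nan
            return x
    return None


if __name__ == "__main__":
    # Same equation, different initial data: the blow-up point moves.
    print("y(0)=0, y'(0)=0:", first_blowup(0.0, 0.0))
    print("y(0)=1, y'(0)=0:", first_blowup(1.0, 0.0))
```

The two runs solve the same equation, so any difference in the reported blow-up location comes from the initial data alone, which is what "movable" means. A fixed-step integrator cannot continue past a pole, so this only locates the first singularity on the positive real axis.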
Complex Analytic Structure of Solutions

Consider the two-(complex-)parameter manifold of solutions of a Painlevé equation. Each solution is globally determined by two initial values given at a regular point of the solution. However, the solution can also be determined by two pieces of data given at a movable pole. The location $x_0$ of such a pole provides one of the two free parameters. The other free parameter occurs as a coefficient in the Laurent expansion of the solution in a domain punctured at $x_0$. For PI, the Laurent expansion of a solution at a movable singularity $x_0$ is

\[
y(x) = \frac{1}{(x - x_0)^2} - \frac{x_0}{10}(x - x_0)^2 - \frac{1}{6}(x - x_0)^3 + c_{\mathrm I}(x - x_0)^4 + \cdots \tag{1}
\]

where $c_{\mathrm I}$ is arbitrary. This second free parameter is normally called a "resonance parameter." For PII, the Laurent expansion of a solution at a movable singularity $x_0$ is

\[
y(x) = \frac{\pm 1}{x - x_0} \mp \frac{x_0}{6}(x - x_0) + \frac{\mp 1 - \alpha}{4}(x - x_0)^2 + c_{\mathrm{II}}(x - x_0)^3 + \cdots \tag{2}
\]

where $c_{\mathrm{II}}$ is arbitrary. The symmetric solution of PI that has a pole at the origin and corresponding resonance parameter $c_{\mathrm I} = 0$ has a distribution of poles in the complex $x$-plane shown in Figure 1. (This figure was obtained by searching for zeros of truncated Taylor expansions of the tau-function $\tau_{\mathrm I}$ described in the section "Hamiltonians and Tau-Functions." One hundred and sixty numerical zeros are shown. The two pairs of closely spaced zeros near the imaginary axis (between $8 < \operatorname{Im} x < 12$) may be numerical artifacts. We used the command NSolve to 32 digits in MATHEMATICA4.)

Figure 1 Poles of a symmetric solution of PI in the complex $x$-plane, with a pole at the origin and zero corresponding resonance parameter, i.e., $x_0 = 0$, $c_{\mathrm I} = 0$.

The rays of symmetry evident in Figure 1 reflect discrete symmetries of PI. The solutions of PI and PII are invariant under the respective discrete symmetries,

\[
\begin{aligned}
\mathrm{P_{I}}:\quad & y_n(x) = e^{-2\pi i n/5}\, y\!\left(e^{4\pi i n/5} x\right), \quad n = 1, 2 \\
\mathrm{P_{II}}:\quad & y_n(x) = e^{-\pi i n/3}\, y\!\left(e^{2\pi i n/3} x\right), \quad \alpha \mapsto e^{\pi i n}\alpha, \quad n = 1, 2, 3
\end{aligned}
\]

The rays of angle $2n\pi/5$ for PI and $n\pi/3$ for PII related to these symmetries play special roles in the asymptotic behaviors of the corresponding solutions for $|x| \to \infty$.

Linear Problems

The Painlevé equations are regarded as completely integrable because they can be solved through an associated system of linear equations (Jimbo and Miwa 1981).

\[
\frac{d\varphi}{d\lambda} = L(x, \lambda)\varphi \tag{3a}
\]

\[
\frac{d\varphi}{dx} = M(x, \lambda)\varphi \tag{3b}
\]

The compatibility condition, that is,

\[
L_x - M_\lambda + [L, M] = 0 \tag{4}
\]

is equivalent to the corresponding Painlevé equation. The matrices $L$, $M$ for PI and PII are listed below:

\[
\begin{aligned}
\mathrm{P_{I}}:\quad L_{\mathrm I}(x, \lambda) &= \lambda^2 \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + \lambda \begin{pmatrix} 0 & y \\ 4 & 0 \end{pmatrix} + \begin{pmatrix} -z & y^2 + x/2 \\ -4y & z \end{pmatrix} \\
M_{\mathrm I}(x, \lambda) &= \lambda \begin{pmatrix} 0 & 1/2 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & y \\ 2 & 0 \end{pmatrix}
\end{aligned}
\]

where $z = y'$, $z' = 6y^2 + x$.

\[
\begin{aligned}
\mathrm{P_{II}}:\quad L_{\mathrm{II}}(x, \lambda) &= \lambda^2 \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \lambda \begin{pmatrix} 0 & u \\ -2z/u & 0 \end{pmatrix} + \begin{pmatrix} z + x/2 & -uy \\ -2(\vartheta + zy)/u & -(z + x/2) \end{pmatrix} \\
M_{\mathrm{II}}(x, \lambda) &= \lambda \begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix} + \begin{pmatrix} 0 & u/2 \\ -z/u & 0 \end{pmatrix}
\end{aligned}
\]

where $u' = -uy$, $z = y' - y^2 - x/2$, and

\[
\vartheta := \frac{1}{2} - \alpha
\]
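As a hedged consistency check of the PI pair, the sketch below (an illustration, not taken from the original article) evaluates the left-hand side of eqn [4] at exact rational sample values. It treats $y$, $z = y'$, and $w = y''$ as independent numbers and hand-codes the total $x$-derivative of $L_{\mathrm I}$; the helper names `matmul2` and `lax_residual_PI` are assumptions introduced here.

```python
from fractions import Fraction


def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def lax_residual_PI(x, lam, y, z, w):
    """Return L_x - M_lambda + [L, M] for the P_I pair.

    z stands for y' and w for y''. The x-derivative of L is the total
    derivative along a solution, i.e. dy/dx = z and dz/dx = w.
    """
    half = Fraction(1, 2)
    # L_I and M_I with the three matrix terms summed entry by entry
    L = [[-z, lam**2 + lam * y + y**2 + x * half],
         [4 * lam - 4 * y, z]]
    M = [[0, lam * half + y],
         [2, 0]]
    L_x = [[-w, lam * z + 2 * y * z + half],
           [-4 * z, w]]
    M_lam = [[0, half],
             [0, 0]]
    LM, ML = matmul2(L, M), matmul2(M, L)
    return [[L_x[i][j] - M_lam[i][j] + LM[i][j] - ML[i][j]
             for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    x, lam = Fraction(1, 3), Fraction(2)
    y, z = Fraction(-3, 7), Fraction(5, 2)
    # w chosen to satisfy P_I, then deliberately perturbed
    print(lax_residual_PI(x, lam, y, z, 6 * y**2 + x))
    print(lax_residual_PI(x, lam, y, z, 6 * y**2 + x + 1))
```

With these matrices the residual works out to $(w - 6y^2 - x)\,\mathrm{diag}(-1, 1)$, so it vanishes for every $\lambda$ exactly when $y'' = 6y^2 + x$. Exact fractions are used so that the comparison with zero is not blurred by rounding.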
Alternative linear problems also exist for each equation. For example, for PII, an alternative choice of $L$ and $M$ is (Flaschka and Newell 1980):

\[
\begin{aligned}
\mathrm{P_{II}}:\quad L_{\mathrm{II}}'(x, \lambda) &= \lambda^2 \begin{pmatrix} -4i & 0 \\ 0 & 4i \end{pmatrix} + \lambda \begin{pmatrix} 0 & 4y \\ 4y & 0 \end{pmatrix} + \begin{pmatrix} -i(x + 2y^2) & 2iy' \\ -2iy' & i(x + 2y^2) \end{pmatrix} - \frac{\alpha}{\lambda} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
M_{\mathrm{II}}'(x, \lambda) &= \begin{pmatrix} -i\lambda & y \\ y & i\lambda \end{pmatrix}
\end{aligned}
\]

The matrix $L$ for each Painlevé equation is singular at a finite number of points $a_i(x)$ in the $\lambda$-plane. For the above choices of $L$ for PI and PII, the point $\lambda = \infty$ is clearly a singularity. For $L_{\mathrm{II}}'$, the origin $\lambda = 0$ is also a singularity. The analytic continuation of a fundamental matrix of solutions $\Phi$ around $a_i$ gives a new solution $\widetilde{\Phi}$ which must be related to the original solution: $\widetilde{\Phi} = \Phi A$. $A$ is called the monodromy matrix and its trace and determinant are called the monodromy data. In general, the data will change with $x$. However, eqn [4] ensures that the monodromy data remain constant in $x$. For this reason, the system [3] is called an isomonodromy problem.

Bäcklund and Miura Transformations

Bäcklund transformations are those that map a solution of a Painlevé equation with one choice of parameter to a solution of the same equation with different parameters. For PI no such transformation is known. For PII, there is one Bäcklund transformation. Let $y = y(x; \alpha)$ denote a solution of PII with parameter $\alpha$. Then $\tilde{y} = y(x; \alpha - 1)$, which solves PII with parameter $\alpha - 1$, is given by

\[
\tilde{y} := -y + \frac{\alpha - \frac{1}{2}}{y' - y^2 - x/2} \quad \text{if } \alpha \neq 1/2 \tag{5}
\]

If $\alpha = 1/2$, then $y' = y^2 + x/2$ and $\tilde{y} = -y$ (see the next section for this case). Combined with the symmetry $y \mapsto -y$, $\alpha \mapsto -\alpha$, we can write down another version of this Bäcklund transformation which maps $y$ to $\hat{y} = y(x; \alpha + 1)$:

\[
\hat{y} := -y - \frac{\alpha + \frac{1}{2}}{y' + y^2 + x/2}, \quad \text{if } \alpha \neq -\frac{1}{2} \tag{6}
\]

If we parametrize $\alpha$ by $c + n$ for arbitrary $c$, and denote the solution for the corresponding parameter as $y_n$, we can write a difference equation relating $y_{n-1}$ and $y_{n+1}$ (by eliminating $y'$ from the two transformations $\tilde{y}$, $\hat{y}$) as

\[
\frac{c + \frac{1}{2} + n}{y_{n+1} + y_n} + \frac{c - \frac{1}{2} + n}{y_{n-1} + y_n} + 2y_n^2 + x = 0
\]

This is an example of a discrete Painlevé equation (called "alternate" dPI in the literature). In such a discrete Painlevé equation, $x$ is fixed while $n$ varies. Another lesser known Bäcklund transformation for PII is

\[
y' - y^2 - \frac{x}{2} - v^2 = 0 \tag{7}
\]

\[
v' + y\,v = 0 \tag{8}
\]

between PII with $\alpha = 1/2$ and

\[
v'' + v^3 + \frac{x}{2}\,v = 0
\]

which can be scaled (take $v(x) = a\,y(bx)$ with $b^3 = -1/2$ and $a^2 = -2b^2$) to the usual form of PII with $\alpha = 0$.

Miura transformations are those that map a solution of a Painlevé equation to another equation in the 50 canonical types classified by Painlevé's school. If $y$ is a solution of PII with parameter $\alpha \neq 1/2$, then

\[
(2\alpha - 1)\,w = 2\left(y' - y^2 - x/2\right), \qquad y = \frac{1 - w'}{2w}
\]

maps between PII and

\[
w'' = \frac{(w')^2}{2w} - (2\alpha - 1)\,w^2 - xw - \frac{1}{2w}
\]

which represents the 34th canonical class in the Painlevé classification listed in Ince (1927).

The Painlevé equations do not possess continuous symmetries other than the Bäcklund and Miura transformations described here. However, they do possess discrete symmetries described in the section "Complex Analytic Structure of Solutions."

Classical Special Solutions

Painlevé showed that there can be no explicit first integral that is rational in $y$ and $y'$ for his eponymous equations. It is known that this statement can be extended to say that no such algebraic first integral exists. But the question whether the Painlevé equations define new transcendental functions remained open until recently.

Form a class of functions consisting of those satisfying linear second-order differential equations, such as the Airy, Bessel, and hypergeometric functions, as well as rational, algebraic, and exponential functions. Extend this class to include arithmetic operations, compositions under such functions, and
solutions of linear equations with these earlier functions as coefficients. Members of this class are called classical functions. For general values of the constants $\alpha$, $\beta$, $\gamma$, $\delta$, it is now known (Umemura 1990, Umemura and Watanabe 1997) that the six Painlevé equations cannot be solved in terms of classical functions.

However, there are special values of the constant parameters $\alpha$, $\beta$, $\gamma$, $\delta$ for which classical functions do solve the Painlevé equations. Each Painlevé equation, except PI, has special solutions given by classical functions when the parameters in the Painlevé equation take on special values. For PII, with $\alpha = 1/2$ we have the special integral

\[
I_{1/2} \equiv y' - y^2 - \frac{x}{2} = 0 \tag{9}
\]

which, modulo PII with $\alpha = 1/2$, satisfies the relation

\[
\left(\frac{d}{dx} + 2y\right) I_{1/2} = 0
\]

The Riccati eqn [9] can be linearized via $y = -\psi'/\psi$ to yield

\[
\psi'' + \frac{x}{2}\,\psi = 0
\]

which gives

\[
\psi(x) = a\,\mathrm{Ai}\!\left(-2^{-1/3}x\right) + b\,\mathrm{Bi}\!\left(-2^{-1/3}x\right)
\]

for arbitrary constants $a$ and $b$, that is, the well-known Airy function solutions of PII. Iterations of the Bäcklund transformations $\tilde{y}$ and $\hat{y}$, [5]–[6], give further classical solutions in terms of Airy functions for the case when $\alpha = (2N + 1)/2$ for integer $N$.

Similarly, there is a sequence of rational solutions of the family of equations PII with $\alpha = N$, for integer $N$, if we iterate the Bäcklund transformations $\tilde{y}$, $\hat{y}$ by starting with the trivial solution $y \equiv 0$ for the case $\alpha = 0$. For example, for $\alpha = 1$, we have $\hat{y} = -1/x$. The transformations [7]–[8] give a mapping that shows that this family of rational solutions and the above family of Airy-type solutions of PII both exist for the cases when $\alpha$ is half-integer and when it is integer.

Hamiltonians and Tau-Functions

Each Painlevé equation has a Hamiltonian form. For PI and PII, these can be found by integrating each equation after multiplying by $y'$. These give

\[
\begin{aligned}
\mathrm{P_{I}}:\quad \frac{y'^2}{2} &= 2y^3 + xy - \int^x y(\xi)\,d\xi + E_{\mathrm I} \\
\mathrm{P_{II}}:\quad \frac{y'^2}{2} &= \frac{y^4}{2} + \frac{x}{2}\,y^2 - \frac{1}{2}\int^x y(\xi)^2\,d\xi + \alpha y + E_{\mathrm{II}}
\end{aligned}
\]

where $E_{\mathrm I}$ and $E_{\mathrm{II}}$ are constants. We choose canonical variables $q_1(t) = y(x)$, $p_1(t) = y'(x)$, where $t = x$. Furthermore, for PI, we take

\[
q_2(t) = x, \qquad p_2(t) = \int^x y(\xi)\,d\xi
\]

and the Hamiltonian

\[
H_{\mathrm I} := \frac{p_1^2}{2} - 2q_1^3 - q_2 q_1 + p_2
\]

so that the Hamiltonian equations of motion $\dot{q}_i = \partial H/\partial p_i$ and $\dot{p}_i = -\partial H/\partial q_i$ are satisfied. For PII, we take

\[
q_2(t) = x/2, \qquad p_2(t) = \int^x y(\xi)^2\,d\xi
\]

and the Hamiltonian

\[
H_{\mathrm{II}} := \frac{p_1^2}{2} - \frac{q_1^4}{2} - q_2 q_1^2 + \frac{1}{2}\,p_2 - \alpha q_1
\]

We note that these Hamiltonians govern systems with two degrees of freedom and each is conserved. However, no explicit second conserved quantity is known (see the comments on first integrals in the previous section).

Painlevé's viewpoint of the transcendental solutions of the Painlevé equations as natural generalizations of elliptic functions also led him to search for entire functions that play the role of theta functions in this new setting. He found that analogous functions could be defined which have only zeros at the locations of the movable singularities of the Painlevé transcendents. These functions are now commonly known as tau-functions (also denoted $\tau$-functions). For PI and PII, the corresponding tau-functions are entire functions (i.e., they are analytic everywhere in the complex $x$-plane). However, for the remaining Painlevé equations, they are singular at the fixed singularities of the respective equation.

For PI, all movable singularities are double poles of strength unity (see eqn [1]). Therefore, the function given by

\[
\mathrm{P_{I}}:\quad \tau_{\mathrm I}(x) = \exp\left(-\int^x\!\!\int^s y(t)\,dt\,ds\right)
\]

has Taylor expansion with leading term $(x - x_0)$. In other words, $\tau_{\mathrm I}(x)$ is analytic at all the poles of the corresponding solution of PI. Since $y(x)$ has no other singularity (other than at infinity), $\tau_{\mathrm I}(x)$ must be analytic everywhere in the complex $x$-plane. Differentiation and substitution of PI shows that $\tau_{\mathrm I}(x)$ satisfies the fourth-order equation

\[
\mathrm{P_{I}}:\quad \tau_{\mathrm I}^{(4)}(x)\,\tau_{\mathrm I}(x) = 4\,\tau_{\mathrm I}'(x)\,\tau_{\mathrm I}^{(3)}(x) - 3\,\tau_{\mathrm I}''(x)^2 - x\,\tau_{\mathrm I}(x)^2
\]
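The step from PI to this fourth-order equation can be checked pointwise. The sketch below is an illustration, not the author's computation: it takes arbitrary rational values for $\tau$ and its first four derivatives at a point, recovers the derivatives of $f = \log\tau$ from the standard relations $\tau' = f'\tau$, $\tau'' = (f'' + f'^2)\tau$, and so on, sets $y = -f''$, and compares the PI residual with the fourth-order expression. The function names and sample values are assumptions.

```python
from fractions import Fraction


def log_derivatives(t0, t1, t2, t3, t4):
    """Derivatives f', f'', f''', f'''' of f = log(tau) from the jet of tau."""
    f1 = t1 / t0
    f2 = t2 / t0 - f1**2
    f3 = t3 / t0 - 3 * f1 * f2 - f1**3
    f4 = t4 / t0 - 4 * f1 * f3 - 3 * f2**2 - 6 * f1**2 * f2 - f1**4
    return f1, f2, f3, f4


def pI_residual(x, jet):
    """y'' - 6 y^2 - x with y = -(log tau)'' evaluated from the jet."""
    _, f2, _, f4 = log_derivatives(*jet)
    y, ypp = -f2, -f4
    return ypp - 6 * y**2 - x


def fourth_order_form(x, jet):
    """tau'''' tau - 4 tau' tau''' + 3 tau''^2 + x tau^2."""
    t0, t1, t2, t3, t4 = jet
    return t4 * t0 - 4 * t1 * t3 + 3 * t2**2 + x * t0**2


if __name__ == "__main__":
    jet = tuple(Fraction(v) for v in (2, 3, -1, 5, 7))  # tau, tau', ..., tau''''
    x = Fraction(1, 3)
    lhs = pI_residual(x, jet)
    rhs = -fourth_order_form(x, jet) / jet[0]**2
    print(lhs == rhs)
```

The identity being exercised is $y'' - 6y^2 - x = -\left(\tau^{(4)}\tau - 4\tau'\tau^{(3)} + 3\tau''^2 + x\tau^2\right)/\tau^2$, so $y$ solves PI exactly where the fourth-order expression vanishes (and $\tau \neq 0$).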
Note that this equation is bilinear in $\tau$ and its derivatives. Such bilinear, or in general, multilinear, equations are called Hirota-type forms of the Painlevé equations. The special nature of such equations is most simply expressed in terms of the Hirota $D$ ($\equiv D_x$) operator, an antisymmetric differential operator defined here on products of functions of $x$:

\[
D^n f \cdot g = \left(\partial_\xi - \partial_\eta\right)^n f(\xi)\,g(\eta)\Big|_{\xi = \eta = x}
\]

Notice that

\[
\begin{aligned}
D^2\,\tau \cdot \tau &= 2\left(\tau\,\tau'' - \tau'^2\right), \\
D^4\,\tau \cdot \tau &= 2\left(\tau\,\tau^{(4)} - 4\,\tau'\,\tau^{(3)} + 3\,\tau''^2\right)
\end{aligned}
\]

Hence the equation satisfied by $\tau_{\mathrm I}(x)$ can be rewritten more succinctly as

\[
\left(D^4 + 2x\right)\tau_{\mathrm I} \cdot \tau_{\mathrm I} = 0
\]

For PII, a generic solution $y(x)$ has movable simple poles of residue $\pm 1$ (see eqn [2]). Painlevé pointed out that if we square the function $y(x)$, multiply by $-1$, integrate twice, and exponentiate, we obtain a function with Taylor expansion with leading term $(x - x_0)$. However, the square is not invertible and to construct an invertible mapping to entire functions, we need two $\tau$-functions. We denote these by $\tau(x)$ and $\sigma(x)$:

\[
\begin{aligned}
\mathrm{P_{II}}:\quad \tau_{\mathrm{II}}(x) &= \exp\left(-\int^x\!\!\int^s y(t)^2\,dt\,ds\right) \\
\sigma_{\mathrm{II}}(x) &= y(x)\,\tau_{\mathrm{II}}(x)
\end{aligned}
\]

The equations satisfied by these tau-functions are

\[
\begin{aligned}
\mathrm{P_{II}}:\quad \tau''(x)\,\tau(x) &= \tau'(x)^2 - \sigma(x)^2 \\
\sigma''(x)\,\tau(x)^2 &= 2\,\tau(x)\,\sigma'(x)\,\tau'(x) - \tau'(x)^2\,\sigma(x) + \sigma(x)^3 + x\,\tau(x)^2\,\sigma(x) + \alpha\,\tau(x)^3
\end{aligned}
\]

Hierarchies

Each Painlevé equation is associated with at least one infinite sequence of ordinary differential equations (ODEs) indexed by order. These sequences are called hierarchies and arise from symmetry reductions of PDE hierarchies that are associated with soliton equations.

Define the operator $L_n\{v(z)\}$ (the Lenard recursion operator) recursively by

\[
\frac{d}{dz} L_{n+1}\{v\} = \left(\frac{d^3}{dz^3} + 4v\,\frac{d}{dz} + 2v'\right) L_n\{v\}, \qquad L_1\{v\} = v
\]

where primes denote $z$-derivatives. Note that

\[
\begin{aligned}
L_2\{v\} &= v'' + 3v^2 \\
L_3\{v\} &= v^{(4)} + 10\,v\,v'' + 5\,v'^2 + 10\,v^3
\end{aligned}
\]

This operator is intimately related to the Korteweg–de Vries equation. (It was first discovered as a method of generating the infinite number of conservation laws associated with this soliton equation.)

The scaling $v(z) = \lambda\,y(\mu z)$, with $\mu^5 = -1/2$ and $\lambda = -2\mu^2$, shows that the case $n = 2$ of the sequence of ODEs defined recursively by

\[
L_n\{v\} = z
\]

is PI. Hence this is called the first Painlevé hierarchy. A second Painlevé hierarchy is given recursively by

\[
\left(\frac{d}{dx} + 2y\right) L_n\{y' - y^2\} = xy + \alpha_n, \qquad n \geq 1
\]

where $\alpha_n$ are constants.

Each Painlevé equation may arise as a reduction of more than one PDE. Since different soliton equations have different hierarchies, this means that more than one hierarchy may be associated with each Painlevé equation.

See also: Bäcklund Transformations; Integrable Discrete Systems; Integrable Systems: Overview; Isomonodromic Deformations; Ordinary Special Functions; Riemann–Hilbert Methods in Integrable Systems; Riemann–Hilbert Problem; Solitons and Kac–Moody Lie Algebras; Two-Dimensional Ising Model; WDVV Equations and Frobenius Manifolds.

Further Reading

Flaschka H and Newell AC (1980) Monodromy- and spectrum-preserving deformations. Communications in Mathematical Physics 76: 65–116.
Ince EL (1927) Ordinary Differential Equations. London: Longmans, Green and Co. (Reprinted in 1956 – New York: Dover).
Jimbo M and Miwa T (1981) Monodromy preserving deformation of linear ordinary differential equations with rational coefficients. II. Physica D 2: 407–448.
Umemura H (1990) Second proof of the irreducibility of the first differential equation of Painlevé. Nagoya Mathematical Journal 117: 125–171.
Umemura H and Watanabe H (1997) Solutions of the second and fourth Painlevé equations. I. Nagoya Mathematical Journal 148: 151–198.
Partial Differential Equations: Some Examples

R Temam, Indiana University, Bloomington, IN, USA

© 2006 Elsevier Ltd. All rights reserved.

Introduction

Many physical laws are mathematically expressed in terms of partial differential equations (PDEs); this is, for instance, the case in the realm of classical mechanics and physics of the laws of conservation of angular momentum, mass, and energy.

The object of this short article is to provide an overview and make a few comments on the set of PDEs appearing in classical mechanics, which is tremendously rich and diverse. From the mathematical point of view the PDEs appearing in mechanics range from well-understood PDEs to equations which are still at the frontier of sciences as far as their mathematical theory is concerned. The mathematical theory of PDEs deals primarily with their "well-posedness" in the sense of Hadamard. A well-posed PDE problem is a problem for which existence and uniqueness of solutions in suitable function spaces and continuous dependence on the data have been proved.

For simplicity, let us restrict ourselves to space dimension 2. Several interesting and important PDEs are of the form

\[
a\,\frac{\partial^2 u}{\partial x^2} + b\,\frac{\partial^2 u}{\partial x\,\partial y} + c\,\frac{\partial^2 u}{\partial y^2} = 0 \tag{1}
\]

Here $a$, $b$, $c$ may depend on $x$ and $y$ or they may be constants, and then eqn [1] is linear; they may also depend on $u$, $\partial u/\partial x$, and $\partial u/\partial y$, in which case the equation is nonlinear.

Such an equation is

• elliptic when (where) $b^2 - 4ac < 0$,
• hyperbolic when (where) $b^2 - 4ac > 0$,
• parabolic when (where) $b^2 - 4ac = 0$.

Among the simplest linear equations, we have the elliptic equation

\[
\Delta u = 0 \tag{2}
\]

which governs the following phenomena: equation for the potential or stream function of plane, incompressible irrotational fluids; equation for some potential in linear elasticity, or the equation for the temperature in suitable conditions (stationary case; see below for the time-dependent case).

Another eqn of the form [1] is the hyperbolic equation

\[
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \tag{3}
\]

which governs, for example, linear acoustics in one dimension (sound pipes) or the propagation of an elastic wave along an elastic string.

A third equation of type [1] is the linear parabolic equation

\[
\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \tag{4}
\]

also called the heat equation, which governs, under appropriate circumstances, the temperature ($u(x, t)$ = temperature at $x$ at time $t$).

All these equations are well understood from the mathematical viewpoint and many well-posedness results are available. A fundamental difference between eqns [2], [3], and [4] is that for [2] and [4] the solution is as smooth as allowed by the data (forcing terms, boundary data not mentioned here), whereas the solutions of [3] usually present some discontinuities corresponding to the propagation of a wave or wave front.

A considerable jump of complexity occurs if we consider the equation of transonic flows in which

\[
a = 1 - \frac{1}{v^2}\left(\frac{\partial u}{\partial x}\right)^2, \qquad
b = -\frac{2}{v^2}\,\frac{\partial u}{\partial x}\,\frac{\partial u}{\partial y}, \qquad
c = 1 - \frac{1}{v^2}\left(\frac{\partial u}{\partial y}\right)^2 \tag{5}
\]

where $v = v(x, y)$ is the local speed of sound. This is a mixed second-order equation: it is elliptic in the subsonic region where $M < 1$, $M$ the Mach number being the ratio of the velocity

\[
|\operatorname{grad} u| = \left(\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right)^{1/2}
\]

to the local velocity of sound $v = v(x, y)$; eqn [1] (with [5]) is hyperbolic in the supersonic region, where $M > 1$, and parabolic on the sonic line $M = 1$. Essentially no result of well-posedness is available for this problem, and it is not even totally clear what are the boundary conditions that one should associate to [1]–[5] to obtain a well-posed problem.
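The discriminant test and its link to the Mach number can be sketched in a few lines of Python. This is an illustration rather than part of the original article; the function names `classify` and `transonic_coefficients` and the sample velocities are assumptions. For the coefficients [5] a short computation gives $b^2 - 4ac = 4(M^2 - 1)$, which is what the example exercises.

```python
from fractions import Fraction


def classify(a, b, c):
    """Type of a u_xx + b u_xy + c u_yy = 0 at a point, from b^2 - 4ac."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return "elliptic"
    if disc > 0:
        return "hyperbolic"
    return "parabolic"


def transonic_coefficients(ux, uy, v):
    """Coefficients a, b, c of eqn [5] for velocity (ux, uy) and sound speed v."""
    a = 1 - ux**2 / v**2
    b = -2 * ux * uy / v**2
    c = 1 - uy**2 / v**2
    return a, b, c


if __name__ == "__main__":
    print(classify(1, 0, 1))    # Laplace equation [2]
    print(classify(1, 0, -1))   # wave equation [3], variables (t, x)
    print(classify(-1, 0, 0))   # heat equation [4], second-order part only
    v = Fraction(1)
    for ux, uy in [(Fraction(3, 10), Fraction(4, 10)),   # M = 1/2
                   (Fraction(3, 5), Fraction(4, 5)),     # M = 1
                   (Fraction(6, 5), Fraction(9, 10))]:   # M = 3/2
        print(ux, uy, classify(*transonic_coefficients(ux, uy, v)))
```

Exact fractions are used for the transonic samples because the sonic case requires the discriminant to be exactly zero, which floating-point arithmetic would not guarantee.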
Intermediate mathematical situations are encountered with the Navier–Stokes and Euler equations, which govern the motion of fluids in the viscous and inviscid cases, respectively. A number of mathematical results are available for these equations (see Compressible Flows: Mathematical Theory, Incompressible Euler Equations: Mathematical Theory, Viscous Incompressible Fluids: Mathematical Theory, Inviscid Flows); but other questions are still open, including the famous Clay prize problem, which is: to show that the solutions of the (viscous, incompressible) Navier–Stokes equations, in space dimension three, remain smooth for all time, or to exhibit an example of appearance of singularity. A prize of US$ 1 million will be awarded by the Clay Foundation for the solution of this problem.

For compressible fluids, the Navier–Stokes equations expressing conservation of momentum and mass read

\[
\rho\left(\frac{\partial u}{\partial t} + (u \cdot \nabla)u\right) - \mu\,\Delta u + \nabla p - (\mu + \lambda)\,\nabla(\nabla \cdot u) = 0 \tag{6}
\]

\[
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho u) = 0 \tag{7}
\]

Here $u = u(x, t)$ is the velocity at $x$ at time $t$, $p = p(x, t)$ the pressure, $\rho$ the density; $\mu$, $\lambda$ are viscosity coefficients, $\mu > 0$, $3\lambda + 2\mu \geq 0$. When $\mu = \lambda = 0$, we obtain the Euler equation (see Compressible Flows: Mathematical Theory). If the fluid is incompressible and homogeneous, then the density is constant, $\rho = \rho_0$ and

\[
\nabla \cdot u = 0 \tag{8}
\]

so that eqn [8] replaces eqn [7] and eqn [6] simplifies accordingly.

Finally, let us mention still different nonlinear PDEs corresponding to nonlinear wave phenomena, namely the Korteweg–de Vries equation (see Korteweg–de Vries Equation and Other Modulation Equations)

\[
\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = 0 \tag{9}
\]

and the nonlinear Schrödinger equation (see Nonlinear Schrödinger Equations)

\[
\frac{\partial A}{\partial z} + i\alpha\,\frac{\partial^2 A}{\partial t^2} - i\beta\,|A|^2 A + \gamma A = 0 \tag{10}
\]

in which two of the constant coefficients are positive. These equations are very different from eqns [1]–[8] and are reasonably well understood from the mathematical point of view; they produce and describe the amazing physical wave phenomenon known as the soliton (see Solitons and Kac–Moody Lie Algebras).

This article is based on the Appendix of the book by Miranville and Temam quoted below, with the authorization of Cambridge University Press.

See also: Compressible Flows: Mathematical Theory; Elliptic Differential Equations: Linear Theory; Evolution Equations: Linear and Nonlinear; Fluid Mechanics: Numerical Methods; Fractal Dimensions in Dynamics; Image Processing: Mathematics; Incompressible Euler Equations: Mathematical Theory; Integrable Systems and the Inverse Scattering Method; Interfaces and Multicomponent Fluids; Inviscid Flows; Korteweg–de Vries Equation and Other Modulation Equations; Leray–Schauder Theory and Mapping Degree; Magnetohydrodynamics; Newtonian Fluids and Thermohydraulics; Nonlinear Schrödinger Equations; Solitons and Kac–Moody Lie Algebras; Stochastic Hydrodynamics; Symmetric Hyperbolic Systems and Shock Waves; Viscous Incompressible Fluids: Mathematical Theory; Non-Newtonian Fluids.

Further Reading

Brezis H and Browder F (1998) Partial differential equations in the 20th century. Advances in Mathematics 135: 76–144.
Evans LC (1998) Partial Differential Equations. Providence, RI: American Mathematical Society.
Miranville A and Temam R (2001) Mathematical Modelling in Continuum Mechanics. Cambridge: Cambridge University Press.

Path Integral Methods see Functional Integration in Quantum Physics; Feynman Path Integrals
Path Integrals in Noncommutative Geometry

R Léandre, Université de Bourgogne, Dijon, France

© 2006 Elsevier Ltd. All rights reserved.

Introduction

Let us recall that there are basically two algebraic infinite-dimensional distribution theories:

• The first one is white-noise analysis (Hida et al. 1993, Berezansky and Kondratiev 1995), and uses Fock spaces and the algebra of creation and annihilation operators.
• The second one is the noncommutative differential geometry of Connes (1988) and uses the entire cyclic complex.

If we disregard the differential operations, these two distribution theories are very similar. Let us recall quickly their background on geometrical examples. Let $V$ be a compact Riemannian manifold and $E$ a Hermitian bundle on it. We consider an elliptic Laplacian $\Delta_E$ acting on sections $\omega$ of this bundle. We consider the Sobolev space $H_k$, $k > 0$, of sections $\omega$ of $E$ such that:

\[
\int_V \left\langle (\Delta_E + 1)^k \omega, \omega \right\rangle dm_V < \infty \tag{1}
\]

where $dm_V$ is the Riemannian measure on $V$ and $\langle\,,\rangle$ the Hermitian structure on $E$. $H_{k+1}$ is included in $H_k$ and the intersection of all $H_k$ is nothing other than the space of smooth sections of the bundle $E$, by the Sobolev embedding theorem.

Let us quickly recall Connes' distribution theory: let $\alpha(n)$ be a sequence of real strictly positive numbers. Let

\[
\sigma = \sum \sigma_n \tag{2}
\]

where $\sigma_n$ belongs to $H_k^{\otimes n}$ with the Hilbert structure naturally inherited from the Hilbert structure of $H_k$. We put, for $C > 0$,

\[
\|\sigma\|_{1,C,k} = \sum C^n \alpha(n)\,\|\sigma_n\|_{H_k^{\otimes n}} \tag{3}
\]

The set of $\sigma$ such that $\|\sigma\|_{1,C,k} < \infty$ is a Banach space called $Co_{C,k}$. The space of Connes functionals $Co_{\infty-}$ is the intersection of these Banach spaces for $C > 0$ and $k > 0$ endowed with its natural topology. Its topological dual $Co_{-\infty}$ is the space of distributions in Connes' sense.

Remark We do not give the original version of the space of Connes where tensor products of Banach algebras appear but we use here the presentation of Jones and Léandre (1991).

Let us now quickly recall the theory of distributions in the white-noise sense. The main tools are Fock spaces. We consider interacting Fock spaces (Accardi and Bożejko (1998)) $\Lambda_{k,C}$ constituted of $\sigma$ written as in [2] such that

\[
\|\sigma\|_{2,C,k}^2 = \sum C^n \alpha(n)^2\,\|\sigma_n\|_{H_k^{\otimes n}}^2 < \infty \tag{4}
\]

The space of white-noise functionals $WN_{\infty-}$ is the intersection of these interacting Fock spaces $\Lambda_{k,C}$ for $C > 0$, $k > 0$. Its topological dual $WN_{-\infty}$ is called the space of white-noise distributions.

Traditionally, in white-noise analysis, one considers in [2] the case where $\sigma_n$ belongs to the symmetric tensor product of $H_k$ endowed with its natural Hilbert structure. We get a symmetric Fock space $\Lambda^s_{C,k}$ and another space of white-noise distributions $WN_{s,-\infty}$. The interest in considering symmetric Fock spaces, instead of interacting Fock spaces, arises from the characterization theorem of Potthoff–Streit. For the sake of simplicity, let us consider the case where $\alpha(n) = 1$. If $\omega$ is a smooth section of $E$, we can consider its exponential $\exp[\omega] = \sum (n!)^{-1}\,\omega^{\otimes n}$. If we consider an element $\Phi$ of $WN_{s,-\infty}$, $\langle \Phi, \exp[\omega] \rangle$ satisfies two natural conditions:

1. $|\langle \Phi, \exp[\omega] \rangle| \leq C \exp\left[C\|\omega\|_{H_k}^2\right]$ for some $k > 0$.
2. $z \to \langle \Phi, \exp[\omega_1 + z\omega_2] \rangle$ is entire.

The Potthoff–Streit theorem states the converse: a functional which sends a smooth section of $E$ into a Hilbert space and which satisfies the two previous requirements defines an element of $WN_{s,-\infty}$ with values in this Hilbert space. Moreover, if the functional depends holomorphically on a complex parameter, then the distribution depends holomorphically on this complex parameter as well.

The Potthoff–Streit theorem allows us to define flat Feynman path integrals as distributions. This is the opposite of the traditional point of view of physicists, where path integrals are generally defined by convergence of the finite-dimensional lattice approximations. Hida–Streit have proposed replacing the approach of physicists by defining path integrals as infinite-dimensional distributions, and by using Wiener chaos. Getzler was the first who thought of replacing Wiener chaos by other functionals on path spaces, that is, Chen iterated integrals. In this article, we review the recent
developments of path integrals in this framework. We will mention the following topics:

• infinite-dimensional volume element
• Feynman path integral on a manifold
• Bismut–Chern character and path integrals
• fermionic Brownian motion

The reader who is interested in various rigorous approaches to path integrals should consult the review of Albeverio (1996).

Infinite-Dimensional Volume Element

Let us recall that the Lebesgue measure does not exist generally as a measure in infinite dimensions. For instance, the Haar measure on a topological group exists if and only if the topological group is locally compact. Our purpose in this section is to define the Lebesgue measure as a distribution.

We consider the set $C^\infty(M; N)$ of smooth maps $x(.)$ from a compact Riemannian manifold $M$ into a compact Riemannian manifold $N$ endowed with its natural Fréchet topology. $S$ is the generic point of $M$ and $x$ the generic point of $N$. We would like to say that the law of $x(S_i)$ for a finite set of $n$ different points $S_i$ under the formal Lebesgue measure $dD(x(.))$ on $C^\infty(M; N)$ is the product law $\otimes^n\,dm_N$ (this means that the Lebesgue measure on $C^\infty(M; N)$ is a cylindrical measure). Let us consider a smooth function $\sigma_n$ from $(M \times N)^n$ into $\mathbb{C}$. We introduce the associated functional $F(\sigma_n)(x(.))$ on $C^\infty(M; N)$:

\[
F(\sigma_n)(x(.)) = \int_{M^n} \sigma_n\big(S_1, \ldots, S_n, x(S_1), \ldots, x(S_n)\big)\,dm_{M^n} \tag{5}
\]

If we use formally the Fubini formula, we get

\[
\int_{C^\infty(M;N)} F(\sigma_n)(x(.))\,dD(x(.)) = \int_{M^n \times N^n} \sigma_n(S_1, \ldots, S_n, x_1, \ldots, x_n)\,dm_{M^n \times N^n} \tag{6}
\]

We will interpret this formal remark in the framework of the distribution theories of the introduction. We consider $V = M \times N$ and $E$ the trivial complex line bundle endowed with the trivial metric and $\alpha(n) = 1$. We can define the associated algebraic spaces $Co_{\infty-}$ and $WN_{\infty-}$ and we can extend to $Co_{\infty-}$ and $WN_{\infty-}$ the map $F$ of [5]. $F$ sends elements of $Co_{\infty-}$ and $WN_{\infty-}$ into the set of continuous bounded maps of $C^\infty(M; N)$ where we can extend [6]. We obtain:

Theorem 1 $\sigma \to \int_{C^\infty(M;N)} F(\sigma)(x(.))\,dD(x(.))$ defines an element of $Co_{-\infty}$ or $WN_{-\infty}$.

Feynman Path Integral on a Manifold

Let us introduce the flat Brownian motion $s \to B(s)$ in $\mathbb{R}^d$ starting from 0. It has formally the Gaussian law

\[
Z^{-1} \exp\left[-\frac{1}{2}\int_0^1 \left|\frac{d}{ds}B(s)\right|^2 ds\right] dD(B(.))
\]

where $dD(B(.))$ is the formal Lebesgue measure on finite-energy paths starting from 0 in $\mathbb{R}^d$ (the partition function $Z$ is infinite!). Let $N$ be a compact Riemannian manifold of dimension $d$ endowed with the Levi-Civita connection. The stochastic parallel transport on semimartingales for the Levi-Civita connection exists almost surely (Ikeda and Watanabe 1981). Let us introduce the Laplace–Beltrami operator $\Delta_N$ on $N$ and the Eells–Elworthy–Malliavin equation starting from $x$ (Ikeda and Watanabe 1981):

\[
dx_s(x) = \tau_s(x)\,dB(s) \tag{7}
\]

where $B(.)$ is a Brownian motion in $T_x(N)$ starting from 0 and $s \to \tau_s(x)$ is the stochastic parallel transport associated to the solution. $s \to x_s(x)$ is called the Brownian motion on $N$. The heat semigroup associated to $\Delta_N$ satisfies $\exp[-t\Delta_N]f(x) = E[f(x_t(x))]$ for $f$ continuous on $N$. Formally, there is a Jacobian which appears in the transformation of the formal path integral which governs $B(.)$ into the formal path integral which governs $x_.(x)$:

\[
d\mu_x(1) = Z_x^{-1} \exp\left[-I(x_.(x))/2\right] dD(x_.(x)) \tag{8}
\]

It was shown by B DeWitt, in a formal way, that the action in [8] is not the energy of the path and that there are some counter-terms in the action where the scalar curvature $K$ of $N$ appears (see Andersson and Driver (1999) and Sidorova et al. (2004) for rigorous results). In order to describe Feynman path integrals, we perform, as is classical in physics, analytic continuation on the semigroup and on the "measure" $d\mu_x(1)$ such that we get a distribution $d\mu_x(\lambda)$ which depends holomorphically on $\lambda$, $\operatorname{Re}\lambda \geq 0$.

In order to return to the formalism of the introduction, we consider $V = N$, $E$ the trivial complex line bundle and the symmetric Fock space and $\alpha(n) = 1$. To $\sigma_n/n!$ belonging to $H_k^{\otimes_{\mathrm{sym}} n}$ we associate the functional on $P(N)$, the smooth path space on $N$:

\[
F(\sigma_n/n!)(x(.)) = \int_{\Delta_n} \sigma_n\big(x(s_1), \ldots, x(s_n)\big)\,ds_1 \cdots ds_n \tag{9}
\]

where $\Delta_n$ is the $n$-dimensional simplex of $[0, 1]^n$ constituted of times $0 < s_1 < \cdots < s_n < 1$ (Léandre (2003)). We remark that $F$ maps $WN_{s,\infty-}$ into the set of bounded continuous functionals on $P(N)$. We

introduce an element h of L2 (N). The map which to where s ! x(s) is a smooth loop in N, !i is of odd
!, a smooth function on N, associates exp[(N þ degree and !1i is of even degree. Let us recall that
!)]h(Re  0) satisfies the requirements (1) and (2) even forms on the free loop space commute. F(n ) is
of the introduction and depends holomorphically on built from even forms on the free loop space, which
. This defines by the Potthoff–Streit theorem a commute. This explains why we have to consider
distribution  which depends holomorphically on , the symmetric Fock space.P Therefore, if  belongs to
Re  0 with values in L2 (N). By uniqueness of WNs, 1 , then F() = F2r (), where F2r () is a
analytic continuation, we obtain: measurable form on L(N) of degree 2r (see Jones
and Léandre (1991) for an analogous statement in
Theorem 2 If Px (N) is the space of smooth paths
the stochastic context).
starting from x in N, we have
( ) Let us explain why the free loop space is
Z important in this context. Let d
x (1) be the law of
h ; i ¼ x ! FðÞhðxð1ÞÞdx ðÞ ½10 the Brownian bridge on N starting from x and
Px ðNÞ
coming back at x at time 1: this is the law of the
Instead of taking functions, we can consider as Brownian motion x. (x) subject to return in time 1 at
bundle E the space of complex 1-forms on N. We its departure. Let pt (x, y) be the heat kernel
then consider Chen (1973) iterated integrals: associated with xt (x): the law of xt (x) is namely
pt (x, y) dmN (y) (Ikeda and Watanabe 1981). We
Fðn Þðxð:ÞÞ consider the Bismut–Høegh–Krohn measure on the
Z
continuous free loop space L0 (N):
¼ hn ðxðs1 Þ; . . . ; xðsn ÞÞ; dxðs1 Þ; . . . ; dxðsn Þi ½11
n
dP ¼ p1 ðx; xÞdx  d
x ð1Þ ½14
such that F maps WNs, 1 into the set of measurable
maps on P(N). These maps are generally not This satisfies
bounded. Namely,
tr½exp½s1 N  f1    fn exp½ð1  sn ÞN 
Z 1  Z
Fðexp½!Þ ¼ exp h!ðxðsÞÞ; dxðsÞi ½12 ¼ f1 ðxðs1 ÞÞ    fn ðxðsn ÞÞdP ½15
0 L0 ðNÞ
R1
instead of exp[ 0 !(x(s))ds] in the previous case. By (We are interested in the trace of the heat semigroup
using the Cameron–Martin–Girsanov–Maruyama for- instead of the heat semigroup itself unlike in the
mula and Kato perturbation theory, we get an analog previous section.)
of Theorem 2 for Chen iterated integrals, but for Since N is spin, we can consider the spin bundle
Re  < 0, because we have to deal with a perturbation Sp = Spþ
Sp on it, the Clifford bundle Cl on it with
of N by a drift when we want to check (1) and (2). its natural Z=2Z gradation (Gilkey 1995). Let us recall
The interest of this formalism is that the parallel that the Clifford algebra acts on the spinors. A form !
transport belongs in some sense to the domain of the can be associated with an element !˜ of the Clifford
distribution and that we get the flat Feynman path bundle (Gilkey 1995). We consider the Brownian loop
integral from the curved one by using an analog of [7]. x(.) associated to the Bismut–Høegh–Krohn measure.
If s < t, we can define the stochastic parallel transport
~s, t from x(t) to x(s) (we identify a loop to a path from
Bismut–Chern Character [0, 1] into N with the same end values). We remark
and Path Integrals that with the notations of [13]
Z
Since we are concerned in this part with index theory, ~0;s1 ð!
~1 ðdxðs1 ÞÞ þ ! ~1 ds1 Þ~
s1 ;s2 . . .
we replace the free path space of N by the free smooth n
loop space L(N). We consider the case where V = N is

~sn1 ;sn !~n ðdxðsn ÞÞ þ !~1n dsn ~sn ;1 ¼ A ½16


a compact oriented Riemannian spin manifold and
E = E
Eþ . E is the bundle of complexified odd is a random almost surely defined even element of the
forms and Eþ is the bundle of complexified even Clifford bundle over x(0). Acting on Sp(x(0)), it thus
forms. To n = n!1 (!1 þ !11 )      (!n þ !1n ), we preserves the gradation. We consider its supertrace
associate the even Chen (1973) iterated integral trs A = trSpþ A  trSp A. This becomes a random vari-
Z able on L0 (N). We introduce the scalar curvature K of

Fðn Þ ¼ !1 ðdxðs1 Þ; :Þ þ !11 ds1 ^    the Levi–Civita connection on N, whose introduction


n arises from the Lichnerowicz formula given the square

^ !n ðdxðsn Þ; :Þ þ !1n dsn ½13 of the Dirac operator in terms of the horizontal

Laplacian on the spin bundle (Gilkey 1995). We Fermionic Brownian Motion


R R1
consider the expression L0 (N) exp[ 0 K(x(s) ds=8]
Alvarez-Gaumé has given a supersymmetric proof of the
trs A dP. This expression can be extended to WNs, 1
index theorem: the path representation of the index of
and therefore defines an element Wi of WNs, 1 called
the Dirac operator involves infinite-dimensional Berezin
by Getzler (Léandre 2002) the Witten current.
integrals, while in the previous section only integrals of
Bismut has introduced a Hermitian bundle on M.
forms on the free loop space were concerned. Rogers
He deduces a bundle 1 on L(N): the fiber on a loop x(.)
(1987) has given an interpretation of the work of
is the space of smooth sections along the loop of . We
Alvarez-Gaumé, which begins with the study of
can suppose that is a sub-bundle given by a projector p
fermionic Brownian motion. Let us interpret the
of a trivial bundle. We can suppose that the Hermitian
considerations of Rogers (1987) in this framework.
connection on is the projection connection A = pdp
We consider Cd . H is the space of L2 -maps from
such that its curvature is R = pdp ^ pdp. Bismut (1985,
[0, 1] into Cd . We denote such a path by pffiffiffiffiffiffi (s) =
1987) has introduced the Bismut–Chern character:
( 1 (s), . . . , d (s)), where i (s) = qi (s) þ 1pi (s).
Z
pi (s) is the ith momentum and qi (s) the ith position.
Chð 1 Þ ¼ tr ðAdxðs1 Þ  Rds1 Þ ^    We denote by (H) ^ the fermionic Fock space associated
n  with H.
^ ðAdxðsn Þ  Rdsn Þ ½17 We introduce the bilinear antisymmetric form on H:
d Z 1
pffiffiffiffiffiffiffi X
Ch( 1 ) is a collection of even forms equal to F(( )), ð 1 ; 2 Þ ¼ 1 p1i ðsÞ dq2i ðsÞ
where ( ) belongs to WNs, 1 . We obtain: i¼1 0

Theorem 3 Let us consider the index Ind(D ) of þ pi ðsÞ dq1i ðsÞ


2
½20
the Dirac operator on N with auxiliary bundle
and we consider the formal expression exp[] =
P
(Hida et al. 1993). We have 1 1 ^n ^2
n = 0 n!  . We define a state on  (H) by
1 2 1 2 ^
hWi; ð Þi ¼ Ind D ½18 pffiffiffiffiffiffi ^ ) = ( , ). We put i (s) = 1[0, s] þ
!(
11[0, s] where we take the ith coordinate in Cd .
The proof arises from the Lichnerowicz formula, We obtain, if s1 < s2 ,
the matricial Feynman–Kac formula, and the decom- pffiffiffiffiffiffiffi
position of the solution of a stochastic linear !ð ^i ðs1 Þ ^ ^j ðs2 ÞÞ ¼  1i;j ½21
equation into the sum of iterated integrals.
where i, j is the Kronecker symbol. We change the
By using the Potthoff–Streit theorem, we can do the
sign if s2 > s1 and we write 0 if s1 = s2 .
analytic continuation of [18], as is suggested by the path-
We consider the finite-dimensional space Pol of
integral interpretation of Atiyah (1985) or Bismut
fermionic polynomials on Cd . Pol is endowed with a
(1985, 1987) of [18], motivated by the Duistermaat–
suitable norm, and we consider Poln endowed with
Heckman or Berline–Vergne localization formulas on
the Pinduced norm. We consider a formal series
the free loop space. For this, these authors consider the
 = n , where n belongs to Poln . In order to
Atiyah–Witten
R even form on the free loop space given by
simplify the treatment, we suppose that our fermio-
I(x(.)) = S1 j(d=ds)x(s)j2 ds þ dX1 , where dX1 is the
nic polynomials do not contain constant terms. We
exterior derivative of the Killing form X1 which to a
introduce the following Banach norm:
vector
R X(.) on the loop associates hX1 , X(.)i =
hX(s), dx(s)i. We should obtain the heuristic formula X Cn
S1 kkC ¼ kn k ½22
Z   n!
1
hWi; i ¼ Z1 FðÞ ^ exp  Iðxð:ÞÞ ½19 We obtain the notion of Connes space Co1 in this
LðMÞ 2
simpler context:  belongs to Co1 if kkC < 1 for
We refer to Léandre (2002) for details. all C. If n = P1      Pn , we associate
Let us remark that Bismut (1987) and Léandre Z
(2003) has continued his formal considerations to Fðn Þ ¼ ^ 1 ÞÞ^   
P1 ð ðs
n
the case of the index theorem for a family of Dirac
operators. We consider a fibration : N ! B of ^ n ÞÞ ds1    dsn
^ Pn ð ðs ½23
compact manifolds. Bismut replaces [19] by an
F can be extended in an injective continuous map
integral of forms on the set of loops of N which ^
from Co1 into (H). By using [21], we get:
project to a given loop of B. Bismut remarks that
this integration in the fiber is related to filtering Theorem 4 exp[] is a distribution in the sense of
theory in stochastic analysis. Connes.

We have only to use the formula [21] and Bismut JM (1986) Localization formula, superconnections and
the index theorem for families. Communications in Mathema-
hexp½; 1 ^ 2    ^ 2n i ¼ Pf f!ð i ^ j Þg ½24 tical Physics 103: 127–166.
Bismut JM (1987) Filtering equation, equivariant cohomology
and the Chern character. In: Seneor R (ed.) Proc. VIIIth Int.
and to estimate the obtained Pfaffians when n ! 1.
Cong. Math. Phys., pp. 17–56. Singapore: World Scientific.
Theorem 4 allows us to give a rigorous interpreta- Chen KT (1973) Iterated path integrals of differential forms and
tion of the fermionic Feynman–Kac formula of Rogers loop space homology. Annals of Mathematics 97: 213–237.
(1987). We refer to Roepstorff (1994) for details. Connes A (1988) Entire cyclic cohomology of Banach algebras
exp[] should give a rigorous interpretation to the and character of -summable Fredholm modules. K-Theory
1: 519–548.
Gaussian
pffiffiffiffiffiffi R 1Berezin
P integral with formal density
Cuntz J (2001) Cyclic theory, bivariant K-theory and the bivariant
exp [ 1 0 pi (s) dqi (s)]. Chern–Connes character. In: Cyclic Homology in Noncom-
mutative Geometry, pp. 2–69. Encyclopedia of Mathematical
See also: Equivariant Cohomology and the Cartan Sciences, 121. Heidelberg: Springer.
Model; Feynman Path Integrals; Functional Integration in Gilkey P (1995) Invariance Theory, the Heat Equation and the
Quantum Physics; Hopf Algebras and q-Deformation Atiyah–Singer Theorem. Boca Raton: CRC Press.
Quantum Groups; Index Theorems; Measure on Loop Hida T, Kuo HH, Potthoff J, and Streit L (1993) White Noise: An
Spaces; Positive Maps on C -Algebras; Stationary Phase Infinite Dimensional Calculus. Dordrecht: Kluwer.
Approximation; Stochastic Differential Equations; Ikeda N and Watanabe S (1981) Stochastic Differential Equations
and Diffusion Processes. Amsterdam: North-Holland.
Supermanifolds; Supersymmetric Quantum Mechanics.
Jones JDS and Léandre R (1991) Lp Chen forms on loop spaces.
In: Barlow M and Bingham N (eds.) Stochastic Analysis,
pp. 104–162. Cambridge: Cambridge University Press.
Further Reading
Accardi L and Boźejko M (1998) Interacting Fock spaces and Gaussianization of probability measures. Infinite Dimensional Analysis, Quantum Probability and Related Topics 1: 663–670.
Albeverio S (1996) Wiener and Feynman path integrals and their applications. In: Masani PR (eds.) Norbert Wiener Centenary Congress, Proc. Symp. Appl. Math. vol. 52, pp. 163–194. Providence, RI: American Mathematical Society.
Andersson L and Driver B (1999) Finite dimensional approximation to Wiener measure and path integral formulas on manifolds. Journal of Functional Analysis 165: 430–498.
Atiyah M (1985) Circular symmetry and stationary phase approximation. In: Colloque en l’honneur de L. Schwartz, vol. 131, pp. 43–59. Paris: Asterisque.
Bismut JM (1985) Index theorem and equivariant cohomology on the loop space. Communications in Mathematical Physics 98: 213–237.
Berezansky YM and Kondratiev YO (1995) Spectral Methods in Infinite-Dimensional Analysis, vols. I, II. Dordrecht: Kluwer.
Léandre R (2002) White noise analysis, filtering equation and the index theorem for families. In: Heyer H and Saitô (eds.) Infinite Dimensional Harmonic Analysis (to appear).
Léandre R (2003) Theory of distributions in the sense of Connes–Hida and Feynman path integral on a manifold. Inf. Dim. Anal. Quant. Probab. Rel. Top. 6: 505–517.
Roepstorff G (1994) Path Integral Approach to Quantum Physics. An Introduction. Heidelberg: Springer.
Rogers A (1987) Fermionic path integration and Grassmann Brownian motion. Communications in Mathematical Physics 113: 353–368.
Sidorova N, Smolyanov O, von Weizsaecker H, and Wittich O (2004) The surface limit of Brownian motion in tubular neighborhood of an embedded Riemannian manifold. Journal of Functional Analysis 206: 391–413.
Szabo R (2000) Equivariant cohomology and localization of path integrals in physics, Lecture Notes in Physics M63. Berlin: Springer.

Peakons
D D Holm, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Peakons are singular solutions of the dispersionless Camassa–Holm (CH) shallow-water wave equation in one spatial dimension. These are reviewed in the context of asymptotic expansions and Euler–Poincaré (EP) variational principles. The dispersionless CH equation generalizes to the EPDiff equation (defined subsequently in this article), whose singular solutions are peakon wave fronts in higher dimensions. The reduction of these singular solutions of CH and EPDiff to canonical Hamiltonian dynamics on lower-dimensional sets may be understood, by realizing that their solution ansatz is a momentum map, and momentum maps are Poisson.
Camassa and Holm (1993) discovered the “peakon” solitary traveling-wave solution for a shallow-water wave:
$u(x,t) = c\,e^{-|x-ct|/\alpha}$   [1]
whose fluid velocity $u$ is a function of position $x$ on the real line and time $t$. The peakon traveling wave

moves at a speed equal to its maximum height, at which it has a sharp peak (jump in derivative). Peakons are an emergent phenomenon, solving the initial-value problem for a partial differential equation (PDE) derived by an asymptotic expansion of Euler’s equations using the small parameters of shallow-water dynamics. Peakons are nonanalytic solitons, which superpose as
$u(x,t) = \sum_{a=1}^{N} p_a(t)\,e^{-|x-q_a(t)|/\alpha}$   [2]
for sets {p} and {q} satisfying canonical Hamiltonian dynamics. Peakons arise for shallow-water waves in the limit of zero linear dispersion in one dimension. Peakons satisfy a PDE arising from Hamilton’s principle for geodesic motion on the smooth invertible maps (diffeomorphisms) with respect to the $H^1$ Sobolev norm of the fluid velocity. Peakons generalize to higher dimensions, as well. We explain how peakons were derived in the context of shallow-water asymptotics and describe some of their remarkable mathematical properties.

Shallow-Water Background for Peakons
Euler’s equations for irrotational incompressible ideal fluid motion under gravity with a free surface have an asymptotic expansion for shallow-water waves that contains two small parameters, $\epsilon$ and $\delta^2$, with ordering $\epsilon \ge \delta^2$. These small parameters are $\epsilon = a/h_0$ (the ratio of wave amplitude to mean depth) and $\delta^2 = (h_0/l_x)^2$ (the squared ratio of mean depth to horizontal length, or wavelength). Euler’s equations are made nondimensional by introducing $x = l_x x'$ for horizontal position, $z = h_0 z'$ for vertical position, $t = (l_x/c_0)\,t'$ for time, $\eta = a\,\eta'$ for surface elevation, and $\varphi = (g l_x a/c_0)\,\varphi'$ for velocity potential, where $c_0 = \sqrt{g h_0}$ is the mean wave speed and $g$ is the constant gravity. The quantity $\sigma = \sigma_0/(h_0 \rho c_0^2)$ is the dimensionless Bond number, in which $\rho$ is the mass density of the fluid and $\sigma_0$ is its surface tension, both of which are taken to be constants. After dropping primes, this asymptotic expansion yields the nondimensional Korteweg–de Vries (KdV) equation for the horizontal velocity variable $u = \varphi_x(x,t)$ at “linear” order in the small dimensionless ratios $\epsilon$ and $\delta^2$, as the left-hand side of
$u_t + u_x + \frac{3\epsilon}{2}\,u u_x + \frac{\delta^2}{6}(1 - 3\sigma)\,u_{xxx} = O(\epsilon\delta^2)$   [3]
Here, partial derivatives are denoted using subscripts, and boundary conditions are $u = 0$ and $u_x = 0$ at spatial infinity on the real line. The famous $\mathrm{sech}^2(x - t)$ traveling-wave solutions (the solitons) for KdV [3] arise in a balance between its (weakly) nonlinear steepening and its third-order linear dispersion, when the quadratic terms in $\epsilon$ and $\delta^2$ on its right-hand side are neglected.
In eqn [3], a normal-form transformation due to Kodama (1985) has been used to remove the other possible quadratic terms of order $O(\epsilon^2)$ and $O(\delta^4)$. The remaining quadratic correction terms in the KdV equation [3] may be collected at order $O(\epsilon\delta^2)$. These terms may be expressed, after introducing a “momentum variable,”
$m = u - \nu\delta^2 u_{xx}$   [4]
and neglecting terms of cubic order in $\epsilon$ and $\delta^2$, as
$m_t + m_x + \frac{\epsilon}{2}\,(u m_x + b\,m u_x) + \frac{\delta^2}{6}(1 - 3\sigma)\,u_{xxx} = 0$   [5]
In the momentum variable $m = u - \nu\delta^2 u_{xx}$, the parameter $\nu$ is given by Dullin et al. (2001):
$\nu = \dfrac{19 - 30\sigma - 45\sigma^2}{60(1 - 3\sigma)}$   [6]
Thus, the effects of $\delta^2$-dispersion also enter the nonlinear terms. After restoring dimensions in eqn [5] and rescaling velocity $u$ by $(b + 1)$, the following “b-equation” emerges,
$m_t + c_0 m_x + u m_x + b\,m u_x + \Gamma u_{xxx} = 0$   [7]
where $m = u - \alpha^2 u_{xx}$ is the dimensional momentum variable, and the constants $\alpha^2$ and $\Gamma/c_0$ are squares of length scales. When $\alpha^2 \to 0$, one recovers KdV from the b-equation [7], up to a rescaling of velocity. Any value of the parameter $b \ne -1$ may be achieved in eqn [7] by an appropriate Kodama transformation (Dullin et al. 2001).
As already emphasized, the values of the coefficients in the asymptotic analysis of shallow-water waves at quadratic order in their two small parameters only hold, modulo the Kodama normal-form transformations. Hence, these transformations may be used to advance the analysis and thereby gain insight, by optimizing the choices of these coefficients. The freedom introduced by the Kodama transformations among asymptotically equivalent equations at quadratic order in $\epsilon$ and $\delta^2$ also helps to answer the perennial question, “Why are integrable equations so ubiquitous when one uses asymptotics in modeling?”

Integrable Cases of the b-equation [7]
The cases b = 2 and b = 3 are special values for which the b-equation becomes a completely integrable Hamiltonian system. For b = 2, eqn [7]

specializes to the integrable CH equation of Camassa and Holm (1993). The case b = 3 in [7] recovers the integrable equation of Degasperis and Procesi (1999) (henceforth DP equation). These two cases exhaust the integrable candidates for [7], as was shown using Painlevé analysis. The b-family of eqns [7] was also shown in Mikhailov and Novikov (2002) to admit the symmetry conditions necessary for integrability, only in the cases b = 2 for CH and b = 3 for DP.
The b-equation [7] with b = 2 was first derived in Camassa and Holm (1993) by using asymptotic expansions directly in the Hamiltonian for Euler’s equations governing inviscid incompressible flow in the shallow-water regime. In this analysis, the CH equation was shown to be bi-Hamiltonian and thereby was found to be completely integrable by the inverse-scattering transform (IST) on the real line. Reviews of IST may be found, for example, in Ablowitz and Clarkson (1991), Dubrovin (1981), and Novikov et al. (1984). For discussions of other related bi-Hamiltonian equations, see Degasperis and Procesi (1999).
Camassa and Holm (1993) also discovered the remarkable peaked soliton (peakon) solutions of [1], [2] for the CH equation on the real line, given by [7] in the case b = 2. The peakons arise as solutions of [7], when $c_0 = 0$ and $\Gamma = 0$ in the absence of linear dispersion. Peakons move at a speed equal to their maximum height, at which they have a sharp peak (jump in derivative). Unlike the KdV soliton, the peakon speed is independent of its width ($\alpha$). Periodic peakon solutions of CH were treated in Alber et al. (1999). There, the sharp peaks of periodic peakons were associated with billiards reflecting at the boundary of an elliptical domain. These billiard solutions for the periodic peakons arise from geodesic motion on a triaxial ellipsoid, in the limit that one of its axes shrinks to zero length.
Before Camassa and Holm (1993) derived their shallow-water equation, a class of integrable equations existed, which was later found to contain eqn [7] with b = 2. This class of integrable equations was derived using hereditary symmetries in Fokas and Fuchssteiner (1981). However, eqn [7] was not written explicitly, nor was it derived physically as a shallow-water equation, and its solution properties for b = 2 were not studied before Camassa and Holm (1993). (See Fuchssteiner (1996) for an insightful history of how the shallow-water equation [7] in the integrable case with b = 2 relates to the mathematical theory of hereditary symmetries.)
Equation [7] with b = 2 was recently re-derived as a shallow-water equation by using asymptotic methods in three different approaches in Dullin et al. (2001), in Fokas and Liu (1996), and also in Johnson (2002). All three derivations used different variants of the method of asymptotic expansions for shallow-water waves in the absence of surface tension. Only the derivation in Dullin et al. (2001) used the Kodama normal-form transformations to take advantage of the nonuniqueness of the asymptotic expansion results at quadratic order.
The effects of the parameter b on the solutions of eqn [7] were investigated in Holm and Staley (2003), where b was treated as a bifurcation parameter, in the limiting case when the linear dispersion coefficients are set to $c_0 = 0$ and $\Gamma = 0$. This limiting case allows several special solutions, including the peakons, in which the two nonlinear terms in eqn [7] balance each other in the “absence” of linear dispersion.

Peakons: Singular Solutions without Linear Dispersion in One Spatial Dimension
Peakons were first found as singular soliton solutions of the completely integrable CH equation. This is eqn [7] with b = 2, now rewritten in terms of the velocity as
$u_t + c_0 u_x + 3 u u_x + \Gamma u_{xxx} = \alpha^2\,(u_{xxt} + 2 u_x u_{xx} + u u_{xxx})$   [8]
Peakons were found in Camassa and Holm (1993) to arise in the absence of linear dispersion. That is, they arise when $c_0 = 0$ and $\Gamma = 0$ in CH [8]. Specifically, peakons are the individual terms in the peaked N-soliton solution of CH [8] for its velocity
$u(x,t) = \sum_{b=1}^{N} p_b(t)\,e^{-|x-q_b(t)|/\alpha}$   [9]
in the absence of linear dispersion. Each term in the sum is a soliton with a sharp peak at its maximum, hence the name “peakon.” Expressed using its momentum, $m = (1 - \alpha^2\partial_x^2)u$, the peakon velocity solution [9] of dispersionless CH becomes a sum over delta functions, supported on a set of points moving on the real line. Namely, the peakon velocity solution [9] implies
$m(x,t) = 2\alpha \sum_{b=1}^{N} p_b(t)\,\delta(x - q_b(t))$   [10]
because of the relation $(1 - \alpha^2\partial_x^2)\,e^{-|x|/\alpha} = 2\alpha\,\delta(x)$. These solutions satisfy the b-equation [7] for any value of b, provided $c_0 = 0$ and $\Gamma = 0$.
Thus, peakons are “singular momentum solutions” of the dispersionless b-equation, although

they are not stable for every value of b. From numerical simulations (Holm and Staley 2003), peakons are conjectured to be stable for b > 1. In the integrable cases b = 2 for CH and b = 3 for DP, peakons are stable singular soliton solutions. The spatial velocity profile $e^{-|x|/\alpha}/2\alpha$ of each separate peakon in [9] is the Green’s function for the Helmholtz operator on the real line, with vanishing boundary conditions at spatial infinity. Unlike the KdV soliton, whose speed and width are related, the width of the peakon profile is set by its Green’s function, independently of its speed.
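
The Green’s-function statement above can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes a uniform grid, second-order central differences, and a large finite interval standing in for the real line, and the function names (`helmholtz_green_numeric`, `helmholtz_green_exact`) are hypothetical. It solves the discretized problem $(1 - \alpha^2\partial_x^2)g = \delta$ with vanishing boundary values and compares the result with the profile $e^{-|x|/\alpha}/2\alpha$.

```python
import numpy as np


def helmholtz_green_numeric(alpha=1.0, half_width=20.0, n=801):
    """Solve (1 - alpha^2 d^2/dx^2) g = delta on [-half_width, half_width].

    Assumptions of this sketch: n is odd so that x = 0 is a grid point,
    g vanishes at both ends of the interval, and the second derivative is
    approximated by second-order central differences.
    """
    x = np.linspace(-half_width, half_width, n)
    dx = x[1] - x[0]
    # Tridiagonal matrix of the discrete Helmholtz operator on interior points
    main = np.full(n - 2, 1.0 + 2.0 * alpha**2 / dx**2)
    off = np.full(n - 3, -(alpha**2) / dx**2)
    helmholtz = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    # Discrete delta function: unit mass placed on the grid point x = 0
    rhs = np.zeros(n - 2)
    rhs[(n - 2) // 2] = 1.0 / dx
    g = np.zeros(n)
    g[1:-1] = np.linalg.solve(helmholtz, rhs)
    return x, g


def helmholtz_green_exact(x, alpha=1.0):
    """Peakon profile exp(-|x|/alpha) / (2 alpha) quoted in the text."""
    return np.exp(-np.abs(x) / alpha) / (2.0 * alpha)


if __name__ == "__main__":
    x, g = helmholtz_green_numeric(alpha=1.0)
    err = np.max(np.abs(g - helmholtz_green_exact(x, alpha=1.0)))
    print(f"max deviation from exp(-|x|/alpha)/(2 alpha): {err:.2e}")
```

Changing `alpha` in this sketch rescales the width of the computed profile, which illustrates that the peakon width is fixed by the operator rather than by the amplitude or speed.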

Integrable Peakon Dynamics of CH
Substituting the peakon solution ansatz [9] and [10] into the dispersionless CH equation
$m_t + u m_x + 2 m u_x = 0, \quad m = u - \alpha^2 u_{xx}$   [11]
yields Hamilton’s canonical equations for the dynamics of the discrete set of peakon parameters $q_a(t)$ and $p_a(t)$:
$\dot{q}_a(t) = \dfrac{\partial h_N}{\partial p_a} \quad\text{and}\quad \dot{p}_a(t) = -\dfrac{\partial h_N}{\partial q_a}$   [12]
for a = 1, 2, . . . , N, with Hamiltonian given by (Camassa and Holm 1993):
$h_N = \dfrac{1}{2} \sum_{a,b=1}^{N} p_a p_b\,e^{-|q_a - q_b|/\alpha}$   [13]
Thus, one finds that the points $x = q_a(t)$ in the peakon solution [9] move with the flow of the fluid velocity $u$ at those points, since $u(q_a(t), t) = \dot{q}_a(t)$. This means the $q_a(t)$ are Lagrangian coordinates. Moreover, the singular momentum solution ansatz [10] is the Lagrange-to-Euler map for an invariant manifold of the dispersionless CH equation [11]. On this finite-dimensional invariant manifold for the PDE [11], the dynamics is canonically Hamiltonian.
With Hamiltonian [13], the canonical equations [12] for the 2N canonically conjugate peakon parameters $p_a(t)$ and $q_a(t)$ were interpreted in Camassa and Holm (1993) as describing “geodesic motion” on the N-dimensional Riemannian manifold whose co-metric is $g^{ij}(\{q\}) = e^{-|q_i - q_j|/\alpha}$. Moreover, the canonical geodesic equations arising from Hamiltonian [13] comprise an integrable system for any number of peakons N. This integrable system was studied in Camassa and Holm (1993) for solutions on the real line, and in Alber et al. (1999) and Mckean and Constantin (1999) and references therein, for spatially periodic solutions.

Figure 1 A smooth localized (Gaussian) initial condition for the CH equation breaks up into an ordered train of peakons as time evolves (the time direction being vertical). The peakon train eventually wraps around the periodic domain, thereby allowing the leading peakons to overtake the slower emergent peakons from behind in collisions that cause phase shifts as discussed in Camassa and Holm (1993). Courtesy of Staley M.

Being a completely integrable Hamiltonian soliton equation, the continuum CH equation [8] has an associated isospectral eigenvalue problem, discovered in Camassa and Holm (1993) for any values of its dispersion parameters $c_0$ and $\Gamma$. Remarkably, when $c_0 = 0$ and $\Gamma = 0$, this isospectral eigenvalue problem has a purely “discrete” spectrum. Moreover, in this case, each discrete eigenvalue corresponds precisely to the time-asymptotic velocity of a peakon. This discreteness of the CH isospectrum in the absence of linear dispersion implies that only the singular peakon solutions [10] emerge asymptotically in time, in the solution of the initial-value problem for the dispersionless CH equation [11]. This is borne out in numerical simulations of the dispersionless CH equation [11], starting from a smooth initial distribution of velocity (Fringer and Holm 2001, Holm and Staley 2003).
Figure 1 shows the emergence of peakons from an initially Gaussian velocity distribution and their subsequent elastic collisions in a periodic one-dimensional domain. This figure demonstrates that singular solutions dominate the initial-value problem and, thus, that it is imperative to go beyond smooth solutions for the CH equation; the situation is similar for the EPDiff equation.

Peakons as Mechanical Systems
Being governed by canonical Hamiltonian equations, each N-peakon solution can be associated with a mechanical system of moving particles. Calogero (1995) further extended the class of mechanical systems of this type. The r-matrix approach was applied to the Lax pair formulation of the N-peakon system for CH by Ragnisco and Bruschi (1996), who also pointed out the connection

of this system with the classical Toda lattice. A discrete version of the Adler–Kostant–Symes factorization method was used by Suris (1996) to study a discretization of the peakon lattice, realized as a discrete integrable system on a certain Poisson submanifold of gl(N) equipped with an r-matrix Poisson bracket. Beals et al. (1999) used the Stieltjes theorem on continued fractions and the classical moment problem for studying multipeakon solutions of the CH equation. Generalized peakon systems are described for any simple Lie algebra by Alber et al. (1999).

Pulsons: Generalizing the Peakon Solutions of the Dispersionless b-Equation for Other Green’s Functions
The Hamiltonian $h_N$ in eqn [13] depends on the Green’s function for the relation between velocity $u$ and momentum $m$. However, the singular momentum solution ansatz [10] is “independent” of this Green’s function. Thus, as discovered in Fringer and Holm (2001), the singular momentum solution ansatz [10] for the dispersionless equation
$m_t + u m_x + 2 m u_x = 0, \quad\text{with } u = g * m$   [14]
provides an invariant manifold on which canonical Hamiltonian dynamics occurs, for any choice of the Green’s function $g$ relating velocity $u$ and momentum $m$ by the convolution $u = g * m$.
The fluid velocity solutions corresponding to the singular momentum ansatz [10] for eqn [14] are the “pulsons”. Pulsons are given by the sum over N velocity profiles determined by the Green’s function $g$, as
$u(x,t) = \sum_{a=1}^{N} p_a(t)\,g(x, q_a(t))$   [15]
Again for [14], the singular momentum ansatz [10] results in a finite-dimensional invariant manifold of solutions, whose dynamics is canonically Hamiltonian. The Hamiltonian for the canonical dynamics of the 2N parameters $p_a(t)$ and $q_a(t)$ in the “pulson” solutions [15] of eqn [14] is
$h_N = \dfrac{1}{2} \sum_{a,b=1}^{N} p_a p_b\,g(q_a, q_b)$   [16]
Again, for the pulsons, the canonical equations for the invariant manifold of singular momentum solutions provide a phase-space description of geodesic motion, this time with respect to the co-metric given by the Green’s function $g$. Mathematical analysis and numerical results for the dynamics of these pulson solutions are given in Fringer and Holm (2001). These results describe how the collisions of pulsons [15] depend upon their shape.

Compactons in the $1/\alpha^2 \to 0$ Limit of CH
As mentioned earlier, in the limit that $\alpha^2 \to 0$, the CH equation [8] becomes the KdV equation. In contrast, when $1/\alpha^2 \to 0$, CH becomes the Hunter–Zheng equation (Hunter and Zheng 1994):
$(u_t + u u_x)_{xx} = \tfrac{1}{2}\,(u_x^2)_x$
This equation has “compacton” solutions, whose collision dynamics was studied numerically and put into the present context in Fringer and Holm (2001). The corresponding Green’s function satisfies $-\partial_x^2 g(x) = 2\delta(x)$, so it has the triangular shape, $g(x) = 1 - |x|$ for $|x| < 1$, and vanishes otherwise, for $|x| \ge 1$. That is, the Green’s function in this case has compact support, hence the name “compactons” for these pulson solutions, which as a limit of the integrable CH equations are true solitons, solvable by IST.

Pulson Solutions of the Dispersionless b-Equation
Holm and Staley (2003) give the pulson solutions of the traveling-wave problem and their elastic collision properties for the dispersionless b-equation:
$m_t + u m_x + b\,m u_x = 0, \quad\text{with } u = g * m$   [17]
with any (symmetric) Green’s function $g$ and for any value of the parameter b. Numerically, pulsons and peakons are both found to be stable for b > 1 (Holm and Staley 2003). The reduction to “noncanonical” Hamiltonian dynamics for the invariant manifold of singular momentum solutions [10] of the other integrable case b = 3 with peakon Green’s function $g(x, y) = e^{-|x-y|/\alpha}$ is found in Degasperis and Procesi (1999) and Degasperis et al. (2002).

Euler–Poincaré Theory in More Dimensions
Generalizing the Peakon Solutions of the CH Equation to Higher Dimensions
In Holm and Staley (2003), weakly nonlinear analysis and the assumption of columnar motion in the variational principle for Euler’s equations are found to produce the two-dimensional generalization of the dispersionless CH equation [11]. This generalization is the EP equation (Holm et al. 1998a, b) for the Lagrangian consisting of the kinetic energy:
$\ell = \dfrac{1}{2} \int \left[\,|\mathbf{u}|^2 + \alpha^2 (\mathrm{div}\,\mathbf{u})^2\,\right] dx\,dy$   [18]
in which the fluid velocity $\mathbf{u}$ is a two-dimensional vector. Evolution generated by kinetic energy in
Hamilton’s principle results in geodesic motion, with respect to the velocity norm $\|\mathbf{u}\|$, which is provided by the kinetic-energy Lagrangian. For ideal incompressible fluids governed by Euler’s equations, the importance of geodesic flow was recognized by Arnol’d (1966) for the $L^2$ norm of the fluid velocity. The EP equation generated by any choice of kinetic-energy norm without imposing incompressibility is called “EPDiff,” for “Euler–Poincaré equation for geodesic motion on the diffeomorphisms.” EPDiff is given by (Holm et al. 1998a):

$$\left( \frac{\partial}{\partial t} + \mathbf{u} \cdot \nabla \right) \mathbf{m} + \nabla \mathbf{u}^{T} \cdot \mathbf{m} + \mathbf{m} (\operatorname{div} \mathbf{u}) = 0 \qquad [19]$$

with momentum density $\mathbf{m} = \delta\ell/\delta\mathbf{u}$, where $\ell = (1/2)\|\mathbf{u}\|^2$ is given by the kinetic energy, which defines a norm in the fluid velocity $\|\mathbf{u}\|$, yet to be determined. By design, this equation has no contribution from either potential energy or pressure. It conserves the velocity norm $\|\mathbf{u}\|$ given by the kinetic energy. Its evolution describes geodesic motion on the diffeomorphisms with respect to this norm (Holm et al. 1998a).

An alternative way of writing the EPDiff equation [19] in either two or three dimensions is

$$\frac{\partial}{\partial t} \mathbf{m} - \mathbf{u} \times \operatorname{curl} \mathbf{m} + \nabla (\mathbf{u} \cdot \mathbf{m}) + \mathbf{m} (\operatorname{div} \mathbf{u}) = 0 \qquad [20]$$

This form of EPDiff involves all three differential operators: curl, gradient, and divergence. For the kinetic-energy Lagrangian $\ell$ given in [18], which is a norm for “irrotational” flow (with $\operatorname{curl} \mathbf{u} = 0$), we have the EPDiff equation [19] with momentum $\mathbf{m} = \delta\ell/\delta\mathbf{u} = \mathbf{u} - \alpha^2 \nabla(\operatorname{div} \mathbf{u})$.

EPDiff [19] may also be written intrinsically as

$$\frac{\partial}{\partial t} \frac{\delta \ell}{\delta \mathbf{u}} = -\operatorname{ad}^{*}_{\mathbf{u}} \frac{\delta \ell}{\delta \mathbf{u}} \qquad [21]$$

where $\operatorname{ad}^{*}$ is the $L^2$ dual of the ad-operation (commutator) for vector fields (see Arnol’d and Khesin (1998) and Marsden and Ratiu (1999) for additional discussions of the beautiful geometry underlying this equation).

Reduction to the Dispersionless CH Equation in One Dimension

In one dimension, the EPDiff equations [19]–[21] with Lagrangian $\ell$ given in [18] simplify to the dispersionless CH equation [11]. The dispersionless limit of the CH equation appears, because potential energy and pressure have been ignored.

Strengthening the Kinetic-Energy Norm to Allow for Circulation

The kinetic-energy Lagrangian [18] is a norm for irrotational flow, with $\operatorname{curl} \mathbf{u} = 0$. However, inclusion of rotational flow requires the kinetic-energy norm to be strengthened to the $H^1_\alpha$ norm of the velocity, defined as

$$\ell = \frac{1}{2} \int \left[\, |\mathbf{u}|^2 + \alpha^2 (\operatorname{div} \mathbf{u})^2 + \alpha^2 (\operatorname{curl} \mathbf{u})^2 \,\right] dx\, dy = \frac{1}{2} \int \left[\, |\mathbf{u}|^2 + \alpha^2 |\nabla \mathbf{u}|^2 \,\right] dx\, dy = \frac{1}{2} \|\mathbf{u}\|^2_{H^1_\alpha} \qquad [22]$$

Here, we assume boundary conditions that give no contributions upon integrating by parts. The corresponding EPDiff equation is [19] with $\mathbf{m} \equiv \delta\ell/\delta\mathbf{u} = \mathbf{u} - \alpha^2 \Delta \mathbf{u}$. This expression involves inversion of the familiar Helmholtz operator in the (nonlocal) relation between fluid velocity and momentum density. The $H^1_\alpha$ norm $\|\mathbf{u}\|^2_{H^1_\alpha}$ for the kinetic energy [22] also arises in three dimensions for turbulence modeling based on Lagrangian averaging and using Taylor’s hypothesis that the turbulent fluctuations are “frozen” into the Lagrangian mean flow (Foias et al. 2001).

Generalizing the CH Peakon Solutions to n Dimensions

Building on the peakon solutions [9] for the CH equation and the pulsons [15] for its generalization to other traveling-wave shapes in Fringer and Holm (2001), Holm and Staley (2003) introduced the following measure-valued singular momentum solution ansatz for the $n$-dimensional solutions of the EPDiff equation [19]:

$$\mathbf{m}(\mathbf{x}, t) = \sum_{a=1}^{N} \int \mathbf{P}^a(s,t)\, \delta\bigl(\mathbf{x} - \mathbf{Q}^a(s,t)\bigr)\, ds \qquad [23]$$

These singular momentum solutions, called “diffeons,” are vector density functions supported in $\mathbb{R}^n$ on a set of $N$ surfaces (or curves) of codimension $(n - k)$ for $s \in \mathbb{R}^k$ with $k < n$. They may, for example, be supported on sets of points (vector peakons, $k = 0$), one-dimensional filaments (strings, $k = 1$), or two-dimensional surfaces (sheets, $k = 2$) in three dimensions.

Figure 2 shows the results for the EPDiff equation when a straight peakon segment of finite length is created initially moving rightward (East). Because of propagation along the segment in adjusting to the condition of zero speed at its ends and finite speed in its interior, the initially straight segment expands outward as it propagates and curves into a peakon “bubble.” Figure 3 shows an initially straight segment whose velocity distribution is exponential in the transverse
direction, but is wider than $\alpha$ for the peakon solution. This initial-velocity distribution evolves under EPDiff to separate into a train of curved peakon “bubbles,” each of width $\alpha$. This example illustrates the emergent property of the peakon solutions in two dimensions. This phenomenon is observed in nature, for example, as trains of internal wave fronts in the South China Sea (Liu et al. 1998).

Figure 2 A peakon segment of finite length is initially moving rightward (east). Because its speed vanishes at its ends and it has fully two-dimensional spatial dependence, it expands into a peakon “bubble” as it propagates. (The various shades indicate different speeds. Any transverse slice will show a wave profile with a maximum at the center of the wave, which falls exponentially with distance away from the center.)

Figure 3 An initially straight segment of velocity distribution whose exponential profile is wider than the width $\alpha$ for the peakon solution breaks up into a train of curved peakon “bubbles,” each of width $\alpha$. This example illustrates the emergent property of the peakon solutions in two dimensions.

Substitution of the singular momentum solution ansatz [23] into the EPDiff equation [19] implies the following integro-partial-differential equations (IPDEs) for the evolution of the parameters $\{\mathbf{P}\}$ and $\{\mathbf{Q}\}$:

$$\frac{\partial}{\partial t} \mathbf{Q}^a(s,t) = \sum_{b=1}^{N} \int \mathbf{P}^b(s',t)\, G\bigl(\mathbf{Q}^a(s,t) - \mathbf{Q}^b(s',t)\bigr)\, ds'$$

$$\frac{\partial}{\partial t} \mathbf{P}^a(s,t) = -\sum_{b=1}^{N} \int \bigl(\mathbf{P}^a(s,t) \cdot \mathbf{P}^b(s',t)\bigr)\, \frac{\partial}{\partial \mathbf{Q}^a(s,t)} G\bigl(\mathbf{Q}^a(s,t) - \mathbf{Q}^b(s',t)\bigr)\, ds' \qquad [24]$$

Importantly for the interpretation of these solutions, the coordinates $s \in \mathbb{R}^k$ turn out to be Lagrangian coordinates. The velocity field corresponding to the momentum solution ansatz [23] is given by

$$\mathbf{u}(\mathbf{x}, t) = G * \mathbf{m} = \sum_{b=1}^{N} \int \mathbf{P}^b(s',t)\, G\bigl(\mathbf{x} - \mathbf{Q}^b(s',t)\bigr)\, ds' \qquad [25]$$

for $\mathbf{u} \in \mathbb{R}^n$. When evaluated along the curve $\mathbf{x} = \mathbf{Q}^a(s,t)$, this velocity satisfies

$$\mathbf{u}\bigl(\mathbf{Q}^a(s,t), t\bigr) = \sum_{b=1}^{N} \int \mathbf{P}^b(s',t)\, G\bigl(\mathbf{Q}^a(s,t) - \mathbf{Q}^b(s',t)\bigr)\, ds' = \frac{\partial \mathbf{Q}^a(s,t)}{\partial t} \qquad [26]$$

Consequently, the lower-dimensional support sets defined on $\mathbf{x} = \mathbf{Q}^a(s,t)$ and parametrized by coordinates $s \in \mathbb{R}^k$ move with the fluid velocity. This means that the $s \in \mathbb{R}^k$ are Lagrangian coordinates. Moreover, eqns [24] for the evolution of these support sets are canonical Hamiltonian equations:

$$\frac{\partial}{\partial t} \mathbf{Q}^a(s,t) = \frac{\delta H_N}{\delta \mathbf{P}^a}, \qquad \frac{\partial}{\partial t} \mathbf{P}^a(s,t) = -\frac{\delta H_N}{\delta \mathbf{Q}^a} \qquad [27]$$

The corresponding Hamiltonian function $H_N : (\mathbb{R}^n \times \mathbb{R}^n)^N \to \mathbb{R}$ is

$$H_N = \frac{1}{2} \iint \sum_{a,b=1}^{N} \bigl(\mathbf{P}^a(s,t) \cdot \mathbf{P}^b(s',t)\bigr)\, G\bigl(\mathbf{Q}^a(s,t), \mathbf{Q}^b(s',t)\bigr)\, ds\, ds' \qquad [28]$$

This is the Hamiltonian for geodesic motion on the cotangent bundle of a set of curves $\mathbf{Q}^a(s,t)$ with respect to the metric given by $G$. This dynamics was investigated numerically in Holm and Staley (2003), which can be referred to for more details of the solution properties. One important result found “numerically” in Holm and Staley (2003) is that only codimension-1 singular momentum solutions
appear to be stable under the evolution of the EPDiff equation. Thus,
Stability for codimension-1 solutions: the singular
momentum solutions of EPDiff are stable, as points
on the line (peakons), as curves in the plane (filaments,
or wave fronts), or as surfaces in space (sheets).
Proving this stability result analytically remains an
outstanding problem. The stability of peakons on the
real line is proven in Constantin and Strauss (2000).
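
The behavior of peakons on the real line can be explored numerically. The following is a minimal sketch, not the method of the works cited above: it assumes the canonical Hamiltonian $h_N$ of [16] with the peakon Green's function $g(x, y) = e^{-|x-y|/\alpha}$, and integrates Hamilton's equations with a classical fourth-order Runge–Kutta step. The function names (`peakon_rhs`, `hamiltonian`, `rk4_step`, `integrate`), the step size, and the initial data in the usage note are illustrative assumptions.

```python
import math


def peakon_rhs(q, p, alpha=1.0):
    """Hamilton's equations for h_N = 1/2 * sum_ab p_a p_b g(q_a - q_b),
    assuming g(x) = exp(-|x| / alpha)."""
    n = len(q)
    dq = [0.0] * n
    dp = [0.0] * n
    for a in range(n):
        for b in range(n):
            x = q[a] - q[b]
            g = math.exp(-abs(x) / alpha)
            sgn = (x > 0) - (x < 0)
            dq[a] += p[b] * g                        # dq_a/dt =  dh_N/dp_a
            dp[a] += p[a] * p[b] * sgn * g / alpha   # dp_a/dt = -dh_N/dq_a
    return dq, dp


def hamiltonian(q, p, alpha=1.0):
    """Evaluate h_N for positions q and momenta p."""
    n = len(q)
    return 0.5 * sum(
        p[a] * p[b] * math.exp(-abs(q[a] - q[b]) / alpha)
        for a in range(n)
        for b in range(n)
    )


def _shift(base, slope, h):
    """Return base + h * slope, componentwise."""
    return [x + h * s for x, s in zip(base, slope)]


def rk4_step(q, p, dt, alpha=1.0):
    """Advance (q, p) by one classical Runge-Kutta step of size dt."""
    k1q, k1p = peakon_rhs(q, p, alpha)
    k2q, k2p = peakon_rhs(_shift(q, k1q, dt / 2), _shift(p, k1p, dt / 2), alpha)
    k3q, k3p = peakon_rhs(_shift(q, k2q, dt / 2), _shift(p, k2p, dt / 2), alpha)
    k4q, k4p = peakon_rhs(_shift(q, k3q, dt), _shift(p, k3p, dt), alpha)
    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new


def integrate(q, p, t_end, dt=0.01, alpha=1.0):
    """Integrate the peakon system from t = 0 to t = t_end."""
    for _ in range(int(round(t_end / dt))):
        q, p = rk4_step(q, p, dt, alpha)
    return q, p
```

For instance, `integrate([-5.0, 5.0], [2.0, 1.0], t_end=30.0)` follows a faster peakon overtaking a slower one. In this sketch the two exchange momentum without crossing, and the total momentum and `hamiltonian(q, p)` can be monitored as conserved quantities.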

Reconnections in Oblique Overtaking Collisions of Peakon Wave Fronts
Figures 4 and 5 show results of oblique wave front
collisions producing reconnections for the EPDiff
equation in two dimensions. Figure 4 shows a single oblique overtaking collision, as a faster expanding peakon wave front overtakes a slower one and reconnects with it at the collision point. Figure 5 shows a series of reconnections involving the oblique overtaking collisions of two trains of curved peakon filaments, or wave fronts.

Figure 4 A single collision is shown involving reconnection as the faster peakon segment initially moving southeast along the diagonal expands, curves, and obliquely overtakes the slower peakon segment initially moving rightward (east). This reconnection illustrates one of the collision rules for the strongly two-dimensional EPDiff flow.

Figure 5 A series of multiple collisions is shown involving reconnections as the faster wider peakon segment initially moving northeast along the diagonal expands, breaks up into a wave train of peakons, each of which propagates, curves, and obliquely overtakes the slower wide peakon segment initially moving rightward (east), which is also breaking up into a train of wave fronts. In this series of oblique collisions, the now-curved peakon filaments exchange momentum and reconnect several times.

The Peakon Reduction is a Momentum Map

As shown in Holm and Marsden (2004), the singular solution ansatz [23] is a momentum map from the cotangent bundle of the smooth embeddings of lower-dimensional sets $\mathbb{R}^s \in \mathbb{R}^n$, to the dual of the Lie algebra of vector fields defined on these sets. (Momentum maps for Hamiltonian dynamics are reviewed in Marsden and Ratiu (1999), for example.) This geometric feature underlies the remarkable reduction properties of the EPDiff equation, and it also explains why the reduced equations must be Hamiltonian on the invariant manifolds of the singular solutions; namely, because momentum maps are Poisson maps. This geometric feature also underlies the singular momentum solution [23] and its associated velocity [25] which generalize the peakon solutions, both to higher dimensions and to arbitrary kinetic-energy metrics. The result that the singular solution ansatz [23] is a momentum map helps to organize the theory, to explain previous results, and to suggest new avenues of exploration.

Acknowledgment

The author is grateful to R Camassa, J E Marsden, T S Ratiu, and A Weinstein for their collaboration, help, and inspiring discussions over the years. He also thanks M F Staley for providing the figures obtained from his numerical simulations in the collaborations. US DOE provided partial support, under contract W-7405-ENG-36 for Los Alamos National Laboratory, and Office of Science ASCAR/AMS/MICS.

See also: Hamiltonian Systems: Obstructions to Integrability; Integrable Systems: Overview; Wave Equations and Diffraction.

Further Reading

Ablowitz MJ and Clarkson PA (1991) Solitons, Nonlinear Evolution Equations and Inverse Scattering. Cambridge: Cambridge University Press.
Ablowitz MJ and Segur H (1981) Solitons and the Inverse Scattering Transform. Philadelphia: SIAM.
Alber M, Camassa R, Fedorov Y, Holm D, and Marsden JE (1999) On billiard solutions of nonlinear PDEs. Physics Letters A 264: 171–178.
Alber M, Camassa R, Fedorov Y, Holm D, and Marsden JE (2001) The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE’s of shallow water and Dym type. Communications in Mathematical Physics 221: 197–227.
Alber M, Camassa R, Holm D, and Marsden JE (1994) The geometry of peaked solitons and billiard solutions of a class of integrable PDEs. Letters in Mathematical Physics 32: 137–151.
Alber MS, Camassa R, and Gekhtman M (2000) On billiard weak solutions of nonlinear PDE’s and Toda flows. CRM Proceedings and Lecture Notes 25: 1–11.
Arnol’d VI (1966) Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits. Annales de l’Institut Fourier, Grenoble 16: 319–361.
Arnol’d VI and Khesin BA (1998) Topological Methods in Hydrodynamics. New York: Springer.
Beals R, Sattinger DH, and Szmigielski J (1999) Multi-peakons and a theorem of Stieltjes. Inverse Problems 15: L1–4.
Beals R, Sattinger DH, and Szmigielski J (2000) Multipeakons and the classical moment problem. Advances in Mathematics 154: 229–257.
Beals R, Sattinger DH, and Szmigielski J (2001) Peakons, strings, and the finite Toda lattice. Communications on Pure and Applied Mathematics 54: 91–106.
Calogero F (1995) An integrable Hamiltonian system. Physics Letters A 201: 306–310.
Calogero F and Francoise J-P (1996) A completely integrable Hamiltonian system. Journal of Mathematical Physics 37: 2863–2871.
Camassa R and Holm DD (1993) An integrable shallow water equation with peaked solitons. Physical Review Letters 71: 1661–1664.
Camassa R, Holm DD, and Hyman JM (1994) A new integrable shallow water equation. Advances in Applied Mechanics 31: 1–33.
Constantin A and Strauss W (2000) Stability of peakons. Communications on Pure and Applied Mathematics 53: 603–610.
Degasperis A and Procesi M (1999) Asymptotic integrability. In: Degasperis A and Gaeta G (eds.) Symmetry and Perturbation Theory, pp. 23–37. Singapore: World Scientific.
Degasperis A, Holm DD, and Hone ANW (2002) A new integrable equation with peakon solutions. Theoretical and Mathematical Physics 133: 1463–1474.
Dubrovin B (1981) Theta functions and nonlinear equations. Russian Mathematical Surveys 36: 11–92.
Dubrovin BA, Novikov SP, and Krichever IM (1985) Integrable systems. I. Itogi Nauki i Tekhniki. Sovr. Probl. Mat. Fund. Naprav. 4. VINITI (Moscow) (Engl. transl. (1989) Encyclopaedia of Mathematical Sciences, vol. 4. Berlin: Springer).
Dullin HR, Gottwald GA, and Holm DD (2001) An integrable shallow water equation with linear and nonlinear dispersion. Physical Review Letters 87: 194501–04.
Dullin HR, Gottwald GA, and Holm DD (2003) Camassa–Holm, Korteweg–de Vries-5 and other asymptotically equivalent equations for shallow water waves. Fluid Dynamics Research 33: 73–95.
Dullin HR, Gottwald GA, and Holm DD (2004) On asymptotically equivalent shallow water wave equations. Physica D 190: 1–14.
Foias C, Holm DD, and Titi ES (2001) The Navier–Stokes-alpha model of fluid turbulence. Physica D 152: 505–519.
Fokas AS and Fuchssteiner B (1981) Bäcklund transformations for hereditary symmetries. Nonlinear Analysis Transactions of the American Mathematical Society 5: 423–432.
Fokas AS and Liu QM (1996) Asymptotic integrability of water waves. Physical Review Letters 77: 2347–2351.
Fuchssteiner B (1996) Some tricks from the symmetry-toolbox for nonlinear equations: generalization of the Camassa–Holm equation. Physica D 95: 229–243.
Fringer O and Holm DD (2001) Integrable vs. nonintegrable geodesic soliton behavior. Physica D 150: 237–263.
Holm DD (2005) The Euler–Poincaré variational framework for modeling fluid dynamics. In: Montaldi J and Ratiu T (eds.) Geometric Mechanics and Symmetry: The Peyresq Lectures, pp. 157–209. London Mathematical Society Lecture Notes Series 306. Cambridge: Cambridge University Press.
Holm DD and Marsden JE (2004) Momentum maps and measure-valued solutions (peakons, filaments and sheets) for the EPDiff equation. In: Marsden JE and Ratiu TS (eds.) The Breadth of Symplectic and Poisson Geometry, A Festschrift for Alan Weinstein, pp. 203–235, Progress in Mathematics, vol. 232. Boston: Birkhäuser.
Holm DD, Marsden JE, and Ratiu TS (1998a) The Euler–Poincaré equations and semidirect products with applications to continuum theories. Advances in Mathematics 137: 1–81.
Holm DD, Marsden JE, and Ratiu TS (1998b) Euler–Poincaré models of ideal fluids with nonlinear dispersion. Physical Review Letters 80: 4173–4177.
Holm DD and Staley MF (2003) Nonlinear balance and exchange of stability in dynamics of solitons, peakons, ramps/cliffs and leftons in a 1+1 nonlinear evolutionary PDE. Physics Letters A 308: 437–444.
Holm DD and Staley MF (2003) Wave structures and nonlinear balances in a family of evolutionary PDEs. SIAM Journal on Applied Dynamical Systems 2(3): 323–380.
Holm DD and Staley MF (2004) Interaction dynamics of singular wave fronts in nonlinear evolutionary fluid equations (in preparation).
Hunter JK and Zheng Y (1994) On a completely integrable hyperbolic variational equation. Physica D 79: 361–386.
Johnson RS (2002) Camassa–Holm, Korteweg–de Vries models for water waves. Journal of Fluid Mechanics 455: 63–82.
Kodama Y (1985) On integrable systems with higher order corrections. Physics Letters A 107: 245–249.
Kodama Y (1985a) Normal forms for weakly dispersive wave equations. Physics Letters A 112: 193–196.
Kodama Y (1987) On solitary-wave interaction. Physics Letters A 123: 276–282.
Liu AK, Chang YS, Hsu M-K, and Liang NK (1998) Evolution of nonlinear internal waves in the East and South China Seas. Journal of Geophysical Research 103: 7995–8008.
Marsden JE and Ratiu TS (1999) Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, 2nd edn. vol. 17, Berlin: Springer.
McKean HP and Constantin A (1999) A shallow water equation on the circle. Communications on Pure and Applied Mathematics 52: 949–982.
Mikhailov AV and Novikov VS (2002) Perturbative symmetry approach. Journal of Physics A 35: 4775–4790.
Novikov SP, Manakov SV, Pitaevski LP, and Zakharov VE (1984) Theory of Solitons. The Inverse Scattering Method, Contemporary Soviet Mathematics. Consultants Bureau (translated from Russian). New York: Plenum.
Ragnisco O and Bruschi M (1996) Peakons, r-matrix and Toda lattice. Physica A 228: 150–159.
Suris YB (1996) A discrete time peakons lattice. Physics Letters A 217: 321–329.
Penrose Inequality see Geometric Flows and the Penrose Inequality

Percolation Theory
V Beffara, Ecole Normale Supérieure de Lyon, Lyon, France
V Sidoravicius, IMPA, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Percolation as a mathematical theory was introduced by Broadbent and Hammersley (1957), as a stochastic way of modeling the flow of a fluid or gas through a porous medium of small channels which may or may not let gas or fluid pass. It is one of the simplest models exhibiting a phase transition, and the occurrence of a critical phenomenon is central to the appeal of percolation. Having truly applied origins, percolation has been used to model the fingering and spreading of oil in water, to estimate whether one can build nondefective integrated circuits, and to model the spread of infections and forest fires. From a mathematical point of view, percolation is attractive because it exhibits relations between probabilistic and algebraic/topological properties of graphs.

To make the mathematical construction of such a system of channels, take a graph $G$ (which originally was taken as $\mathbb{Z}^d$), with vertex set $V$ and edge set $E$, and make all the edges independently open (or passable) with probability $p$ or closed (or blocked) with probability $1 - p$. Write $P_p$ for the corresponding probability measure on the set of configurations of open and closed edges; this model is called bond percolation. The collection of open edges thus forms a random subgraph of $G$, and the original question stated by Broadbent was whether the connected component of the origin in that subgraph is finite or infinite.

A path on $G$ is a sequence $v_1, v_2, \ldots$ of vertices of $G$, such that for all $i \ge 1$, $v_i$ and $v_{i+1}$ are adjacent on $G$. A path is called open if all the edges $\{v_i, v_{i+1}\}$ between successive vertices are open. The infiniteness of the cluster of the origin is equivalent to the existence of an unbounded open path starting from the origin.

There is an analogous model, called “site percolation,” in which all edges are assumed to be passable, but the vertices are independently open or closed with probability $p$ or $1 - p$, respectively. An open path is then a path along which all vertices are open. Site percolation is more general than bond percolation in the sense that the existence of a path for bond percolation on a graph $G$ is equivalent to the existence of a path for site percolation on the covering graph of $G$. However, site percolation on a given graph may not be equivalent to bond percolation on any other graph.

All graphs under consideration will be assumed to be connected, locally finite and quasitransitive. If $A, B \subset V$, then $A \leftrightarrow B$ means that there exists an open path from some vertex of $A$ to some vertex of $B$; by a slight abuse of notation, $u \leftrightarrow v$ will stand for the existence of a path between sites $u$ and $v$, that is, the event $\{u\} \leftrightarrow \{v\}$. The open cluster $C(v)$ of the vertex $v$ is the set of all open vertices which are connected to $v$ by an open path:

$$C(v) = \{u \in V : u \leftrightarrow v\}$$

The central quantity of the percolation theory is the percolation probability:

$$\theta(p) := P_p\{0 \leftrightarrow \infty\} = P_p\{|C(0)| = \infty\}$$

The most important property of the percolation model is that it exhibits a phase transition, that is, there exists a threshold value $p_c \in [0, 1]$, such that the global behavior of the system is substantially different in the two regions $p < p_c$ and $p > p_c$. To make this precise, observe that $\theta$ is a nondecreasing function. This can be seen using Hammersley’s joint construction of percolation systems for all $p \in [0, 1]$ on $G$: let $\{U(v), v \in V\}$ be independent random variables, uniform in $[0, 1]$. Declare $v$ to be $p$-open if $U(v) \le p$, otherwise it is declared $p$-closed. The configuration of $p$-open vertices has the distribution $P_p$ for each $p \in [0, 1]$. The collection of $p$-open vertices is nondecreasing in $p$, and therefore $\theta(p)$ is nondecreasing as well. Clearly, $\theta(0) = 0$ and $\theta(1) = 1$ (Figure 1).

Figure 1 The behavior of $\theta(p)$ around the critical point (for bond percolation).
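
Hammersley's joint construction is easy to try out numerically. The following is a minimal sketch under stated assumptions rather than a definitive implementation: it uses site percolation on a finite square box of $\mathbb{Z}^2$, and it replaces the event $\{0 \leftrightarrow \infty\}$ by the finite-volume proxy "the center of the box is joined to the boundary of the box by a $p$-open path." The function names, box size, sample count, and seeds are illustrative choices.

```python
import random
from collections import deque


def uniform_field(size, seed=0):
    """Independent uniform variables U(v) on a size x size box."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def origin_reaches_boundary(field, p):
    """True if the center site is joined to the boundary of the box by a
    path of p-open sites, where v is p-open when U(v) <= p."""
    size = len(field)
    c = size // 2
    if field[c][c] > p:
        return False
    seen = {(c, c)}
    queue = deque([(c, c)])
    while queue:
        i, j = queue.popleft()
        if i in (0, size - 1) or j in (0, size - 1):
            return True  # reached the boundary of the box
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if (ni, nj) not in seen and field[ni][nj] <= p:
                seen.add((ni, nj))
                queue.append((ni, nj))
    return False


def estimate_theta(p_values, size=31, samples=200, seed=0):
    """Monte Carlo estimate of the finite-volume proxy for theta(p),
    reusing the same field U for every p in each sample (the coupling)."""
    hits = [0] * len(p_values)
    for k in range(samples):
        field = uniform_field(size, seed + k)
        for idx, p in enumerate(p_values):
            hits[idx] += origin_reaches_boundary(field, p)
    return [h / samples for h in hits]
```

Because every value of $p$ is evaluated on the same field $U$, the set of $p$-open sites only grows with $p$, so the estimates returned by `estimate_theta` for an increasing list of parameters are nondecreasing sample by sample, not merely on average.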
The critical probability is defined as

$$p_c := p_c(G) = \sup\{p : \theta(p) = 0\}$$

By definition, when $p < p_c$, the open cluster of the origin is $P_p$-a.s. finite; hence, all the clusters are also finite. On the other hand, for $p > p_c$ there is a strictly positive $P_p$-probability that the cluster of the origin is infinite. Thus, from Kolmogorov’s zero–one law it follows that

$$P_p\{|C(v)| = \infty \text{ for some } v \in V\} = 1 \quad \text{for } p > p_c$$

Therefore, if the intervals $[0, p_c)$ and $(p_c, 1]$ are both nonempty, there is a phase transition at $p_c$.

Using a so-called Peierls argument it is easy to see that $p_c(G) > 0$ for any graph $G$ of bounded degree. On the other hand, Hammersley proved that $p_c(\mathbb{Z}^d) < 1$ for bond percolation as soon as $d \ge 2$, and a similar argument works for site percolation and various periodic graphs as well. But for some graphs $G$, it is not so easy to show that $p_c(G) < 1$. One says that the system is in the subcritical (resp. supercritical) phase if $p < p_c$ (resp. $p > p_c$).

It was one of the most remarkable moments in the history of percolation when Kesten (1980) proved, based on results by Harris, Russo, Seymour and Welsh, that the critical parameter for bond percolation on $\mathbb{Z}^2$ is equal to 1/2. Nevertheless, the exact value of $p_c(G)$ is known only for a handful of graphs, all of them periodic and two dimensional (see below).

Percolation in $\mathbb{Z}^d$

The graph on which most of the theory was originally built is the cubic lattice $\mathbb{Z}^d$, and it was not before the late twentieth century that percolation was seriously considered on other kinds of graphs (such as Cayley graphs), on which specific phenomena can appear, such as the coexistence of multiple infinite clusters for some values of the parameter $p$. In this section, the underlying graph is thus assumed to be $\mathbb{Z}^d$ for $d \ge 2$, although most of the results still hold in the case of a periodic $d$-dimensional lattice.

The Subcritical Regime

When $p < p_c$, all open clusters are finite almost surely. One of the greatest challenges in percolation theory has been to prove that $\chi(p) := E_p\{|C(v)|\}$ is finite if $p < p_c$ ($E_p$ stands for the expectation with respect to $P_p$). For that one can define another critical probability as the threshold value for the finiteness of the expected cluster size of a fixed vertex:

$$p_T(G) := \sup\{p : \chi(p) < \infty\}$$

It was an important step in the development of the theory to show that $p_T(G) = p_c(G)$. The fundamental estimate in the subcritical regime, which is a much stronger statement than $p_T(G) = p_c(G)$, is the following:

Theorem 1 (Aizenman and Barsky, Menshikov). Assume that $G$ is periodic. Then for $p < p_c$ there exist constants $0 < C_1, C_2 < \infty$, such that

$$P_p\{|C(v)| \ge n\} \le C_1 e^{-C_2 n}$$

The last statement can be sharpened to a “local limit theorem” with the help of a subadditivity argument: for each $p < p_c$, there exists a constant $0 < C_3(p) < \infty$, such that

$$\lim_{n \to \infty} -\frac{1}{n} \log P_p\{|C(v)| = n\} = C_3(p)$$

The Supercritical Regime

Once an infinite open cluster exists, it is natural to ask what it looks like, and how many infinite open clusters exist. It was shown by Newman and Schulman that for periodic graphs, for each $p$, exactly one of the following three situations prevails: if $N \in \mathbb{Z}_+ \cup \{\infty\}$ is the number of infinite open clusters, then $P_p(N = 0) = 1$, or $P_p(N = 1) = 1$, or $P_p(N = \infty) = 1$.

Aizenman, Kesten, and Newman showed that the third case is impossible on $\mathbb{Z}^d$. By now several proofs exist, perhaps the most elegant of which is due to Burton and Keane, who prove that indeed there cannot be infinitely many infinite open clusters on any amenable graph. However, there are some graphs, such as regular trees, on which coexistence of several infinite clusters is possible.

The geometry of the infinite open cluster can be explored in some depth by studying the behavior of a random walk on it. When $d = 2$, the random walk is recurrent, and when $d \ge 3$ it is a.s. transient. In all dimensions $d \ge 2$, the walk behaves diffusively, and the “central limit theorem” and the “invariance principle” were established in both the annealed and quenched cases.

Wulff droplets In the supercritical regime, aside from the infinite open cluster, the configuration contains finite clusters of arbitrarily large sizes. These large finite open clusters can be thought of as droplets swimming in the areas surrounded by an infinite open cluster. The presence at a particular location of a large finite cluster is an event of low probability, namely, on $\mathbb{Z}^d$, $d \ge 2$, for $p > p_c$, there exist positive constants $0 < C_4(p), C_5(p) < \infty$, such that

$$C_4(p) \le -\frac{1}{n^{(d-1)/d}} \log P_p\{|C(v)| = n\} \le C_5(p)$$
for all large $n$. This estimate is based on the fact that the occurrence of a large finite cluster is due to a surface effect. The typical structure of the large finite cluster is described by the following theorem:

Theorem 2 Let $d \ge 2$, and $p > p_c$. There exists a bounded, closed, convex subset $W$ of $\mathbb{R}^d$ containing the origin, called the normalized Wulff crystal of the Bernoulli percolation model, such that, under the conditional probability $P_p\{\,\cdot \mid n^d \le |C(0)| < \infty\}$, the random measure

$$\frac{1}{n^d} \sum_{x \in C(0)} \delta_{x/n}$$

(where $\delta_x$ denotes a Dirac mass at $x$) converges weakly in probability toward the random measure $\theta(p)\, 1_W(x - M)\, dx$ (where $M$ is the rescaled center of mass of the cluster $C(0)$). The deviation probabilities behave as $\exp\{-c\, n^{d-1}\}$ (i.e., they exhibit large deviations of surface order; in dimensions 4 and more it holds up to re-centering).

This result was proved in dimension 2 by Alexander et al. (1990), and in dimensions 3 and more by Cerf (2000).

Percolation Near the Critical Point

Percolation in Slabs The main macroscopic observable in percolation is $\theta(p)$, which is positive above $p_c$, 0 below $p_c$, and continuous on $[0, 1] \setminus \{p_c\}$. Continuity at $p_c$ is an open question in the general case; it is known to hold in two dimensions (cf. below) and in high enough dimension (at the moment $d \ge 19$ though the value of the critical dimension is believed to be 6) using lace expansion methods. The conjecture that $\theta(p_c) = 0$ for $3 \le d \le 18$ remains one of the major open problems.

Efforts to prove that led to some interesting and important results. Barsky, Grimmett, and Newman solved the question in the half-space case, and simultaneously showed that the slab percolation and half-space percolation thresholds coincide. This was complemented by Grimmett and Marstrand showing that

$$p_c(\text{slab}) = p_c(\mathbb{Z}^d)$$

Critical exponents In the subcritical regime, exponential decay of the correlation indicates that there is a finite correlation length $\xi(p)$ associated to the system, and defined (up to constants) by the relation

$$P_p(0 \leftrightarrow nx) \approx \exp\left( -\frac{n\, \varphi(x)}{\xi(p)} \right)$$

where $\varphi$ is bounded on the unit sphere (this is known as Ornstein–Zernike decay). The phase transition can then also be defined in terms of the divergence of the correlation length, leading again to the same value for $p_c$; the behavior at or near the critical point then has no characteristic length, and gives rise to scaling exponents (conjecturally in most cases).

The most usual critical exponents are defined as follows, if $\theta(p)$ is the percolation probability, $C$ the cluster of the origin, and $\xi(p)$ the correlation length:

$$\frac{\partial^3}{\partial p^3} E_p\bigl[|C|^{-1}\bigr] \approx |p - p_c|^{-1-\alpha}$$
$$\theta(p) \approx (p - p_c)_+^{\beta}$$
$$\chi^f(p) := E_p\bigl[|C|\, 1_{|C| < \infty}\bigr] \approx |p - p_c|^{-\gamma}$$
$$P_{p_c}\bigl[|C| = n\bigr] \approx n^{-1-1/\delta}$$
$$P_{p_c}\bigl[x \in C\bigr] \approx |x|^{2-d-\eta}$$
$$\xi(p) \approx |p - p_c|^{-\nu}$$
$$P_{p_c}\bigl[\operatorname{diam}(C) = n\bigr] \approx n^{-1-1/\rho}$$
$$\frac{E_p\bigl[|C|^{k+1}\, 1_{|C| < \infty}\bigr]}{E_p\bigl[|C|^{k}\, 1_{|C| < \infty}\bigr]} \approx |p - p_c|^{-\Delta}$$

These exponents are all expected to be universal, that is, to depend only on the dimension of the lattice, although this is not well understood at the mathematical level; the following scaling relations between the exponents are believed to hold:

$$2 - \alpha = \gamma + 2\beta = \beta(\delta + 1), \qquad \Delta = \delta\beta, \qquad \gamma = \nu(2 - \eta)$$

In addition, in dimensions up to $d_c = 6$, two additional hyperscaling relations involving $d$ are strongly conjectured to hold:

$$d\rho = \delta + 1, \qquad d\nu = 2 - \alpha$$

while above $d_c$ the exponents are believed to take their mean-field value, that is, the ones they have for percolation on a regular tree:

$$\alpha = -1, \quad \beta = 1, \quad \gamma = 1, \quad \delta = 2$$
$$\eta = 0, \quad \nu = \tfrac{1}{2}, \quad \rho = \tfrac{1}{2}, \quad \Delta = 2$$

Not much is known rigorously on critical exponents in the general case. Hara and Slade (1990) proved that mean field behavior does happen above dimension 19, and the proof can likely be extended to treat the case $d \ge 7$. In the two-dimensional case on the other hand, Kesten (1987) showed that, assuming that the exponents $\delta$ and $\rho$ exist, then so do $\beta$, $\gamma$, $\eta$, and $\nu$, and they satisfy the scaling and hyperscaling relations where they appear.

The incipient infinite cluster When studying long-range properties of a critical model, it is useful to have an object which is infinite at criticality, and such is not the case for percolation clusters. There are two ways to condition the cluster of the origin to
be infinite when $p = p_c$: The first one is to condition it to have diameter at least $n$ (which happens with positive probability) and take a limit in distribution as $n$ goes to infinity; the second one is to consider the model for parameter $p > p_c$, condition the cluster of 0 to be infinite (which happens with positive probability) and take a limit in distribution as $p$ goes to $p_c$. The limit is the same in both cases; it is known as the incipient infinite cluster.

As in the supercritical regime, the structure of the cluster can be investigated by studying the behavior of a random walk on it, as was suggested by de Gennes; Kesten proved that in two dimensions, the random walk on the incipient infinite cluster is subdiffusive, that is, the mean square displacement after $n$ steps behaves as $n^{1-\varepsilon}$ for some $\varepsilon > 0$.

The construction of the incipient infinite cluster was done by Kesten (1986) in two dimensions, and a similar construction was performed recently in high dimension by van der Hofstad and Járai (2004).

Percolation in Two Dimensions

As is the case for several other models of statistical physics, percolation exhibits many specific properties when considered on a two-dimensional lattice: duality arguments allow for the computation of $p_c$ in some cases, and for the derivation of a priori bounds for the probability of crossing events at or near the critical point, leading to the fact that $\theta(p_c) = 0$. On another front, the scaling limit of critical site percolation on the two-dimensional triangular lattice can be described in terms of Stochastic Loewner evolution (SLE) processes.

Duality, Exact Computations, and RSW Theory

Given a planar lattice $L$, define two associated graphs as follows. The dual lattice $L'$ has one vertex for each face of the original lattice, and an edge between two vertices if and only if the corresponding faces of $L$ share an edge. The star graph $L^*$ is obtained by adding to $L$ an edge between any two vertices belonging to the same face ($L^*$ is not planar in general; $(L, L^*)$ is commonly known as a matching pair). Then, a result of Kesten is that, under suitable technical conditions,

$$p_c^{\mathrm{bond}}(L) + p_c^{\mathrm{bond}}(L') = p_c^{\mathrm{site}}(L) + p_c^{\mathrm{site}}(L^*) = 1$$

Two cases are of particular importance: the lattice $\mathbb{Z}^2$ is isomorphic to its dual; the triangular lattice $T$ is its own star graph. It follows that

$$p_c^{\mathrm{bond}}(\mathbb{Z}^2) = p_c^{\mathrm{site}}(T) = \tfrac{1}{2}$$

The only other critical parameters that are known exactly are $p_c^{\mathrm{bond}}(T) = 2\sin(\pi/18)$ (and hence also $p_c^{\mathrm{bond}}$ for $T'$, i.e., the hexagonal lattice) and $p_c^{\mathrm{bond}}$ for the bow-tie lattice which is a root of the equation $p^5 - 6p^3 + 6p^2 + p - 1 = 0$. The value of the critical parameter for site percolation on $\mathbb{Z}^2$ might, on the other hand, never be known; it is even possible that it is ‘‘just a number’’ without any other significance.

Still using duality, one can prove that the probability, for bond percolation on the square lattice with parameter $p = 1/2$, that there is a connected component crossing an $(n+1) \times n$ rectangle in the longer direction is exactly equal to 1/2. This and clever arguments involving the symmetry of the lattice lead to the following result, proved independently by Russo and by Seymour and Welsh and known as the RSW theorem:

Theorem 3 (Russo 1978, Seymour and Welsh 1978). For every $a, b > 0$ there exist $\eta > 0$ and $n_0 > 0$ such that for every $n > n_0$, the probability that there is a cluster crossing an $\lfloor na \rfloor \times \lfloor nb \rfloor$ rectangle in the first direction is greater than $\eta$.

The most direct consequence of this estimate is that the probability that there is a cluster going around an annulus of a given modulus is bounded below independently of the size of the annulus; in particular, almost surely there is some annulus around 0 in which this happens, and that is what allows one to prove that $\theta(p_c) = 0$ for bond percolation on $\mathbb{Z}^2$ (Figure 2).

Figure 2 Two large critical percolation clusters in a box of the square lattice (first: bond percolation, second: site percolation).

The Scaling Limit

RSW-type estimates give positive evidence that a scaling limit of the model should exist; it is indeed essentially sufficient to show convergence of the crossing probabilities to a nontrivial limit as $n$ goes to infinity. The limit, which should depend only on the ratio $a/b$, was predicted by Cardy using conformal field theory methods. A celebrated result of Smirnov is the proof of Cardy’s formula in the case of site percolation on the triangular lattice $T$:

Theorem 4 (Smirnov (2001)). Let $\Omega$ be a simply connected domain of the plane with four points $a$, $b$, $c$, $d$ (in that order) marked on its boundary. For every $\delta > 0$, consider a critical site-percolation

model on the intersection of $\Omega$ with $\delta T$ and let $f_\delta(ab, cd; \Omega)$ be the probability that it contains a cluster connecting the arcs $ab$ and $cd$. Then:
(i) $f_\delta(ab, cd; \Omega)$ has a limit $f_0(ab, cd; \Omega)$ as $\delta \to 0$;
(ii) the limit is conformally invariant, in the following sense: if $\Phi$ is a conformal map from $\Omega$ to some other domain $\Omega' = \Phi(\Omega)$, and maps $a$ to $a'$, $b$ to $b'$, $c$ to $c'$ and $d$ to $d'$, then $f_0(ab, cd; \Omega) = f_0(a'b', c'd'; \Omega')$; and
(iii) in the particular case when $\Omega$ is an equilateral triangle of side length 1 with vertices $a$, $b$ and $c$, and if $d$ is on $(ca)$ at distance $x \in (0, 1)$ from $c$, then $f_0(ab, cd; \Omega) = x$.

Point (iii) in particular is essential since it allows us to compute the limiting crossing probabilities in any conformal rectangle. In the original work of Cardy, he made his prediction in the case of a rectangle, for which the limit involves hypergeometric functions; the remark that the equilateral triangle gives rise to nicer formulae is originally due to Carleson.

To precisely state the convergence of percolation to its scaling limit, define the random curve known as the percolation exploration path (see Figure 3) as follows: In the upper half-plane, consider a site-percolation model on a portion of the triangular lattice and impose the boundary conditions that on the negative real half-line all the sites are open, while on the other half-line the sites are closed. The exploration curve is then the common boundary of the open cluster spanning from the negative half-line, and the closed cluster spanning from the positive half-line; it is an infinite, self-avoiding random curve in the upper half-plane.

Figure 3 A percolation exploration path. Figure courtesy Schramm O (2000) Scaling limits of loop-erased random walks and uniform spanning trees. Israel Journal of Mathematics 118: 221–228.

As the mesh of the lattice goes to 0, the exploration curve then converges in distribution to the trace of an SLE process, as introduced by Schramm, with parameter $\kappa = 6$ (see Figure 4). The limiting curve is not simple anymore (which corresponds to the existence of pivotal sites on large critical percolation clusters), and it has Hausdorff dimension 7/4. For more details on SLE processes, see, for example, the related entry in the present volume.

Figure 4 An SLE process with parameter $\kappa = 6$ (infinite time, with the driving process stopped at time 1).

As an application of this convergence result, one can prove that the critical exponents described in the previous section do exist (still in the case of the triangular lattice), and compute their exact values, except for $\alpha$, which is still listed here for completeness:

$$\alpha = -\frac{2}{3}, \quad \beta = \frac{5}{36}, \quad \gamma = \frac{43}{18}, \quad \delta = \frac{91}{5},$$
$$\eta = \frac{5}{24}, \quad \nu = \frac{4}{3}, \quad \rho = \frac{48}{5}, \quad \Delta = \frac{91}{36}$$

These exponents are expected to be universal, in the sense that they should be the same for percolation on any two-dimensional lattice; but at the time of this writing, this phenomenon is far from being understood on a mathematical level.

The rigorous derivation of the critical exponents for percolation is due to Smirnov and Werner (2001); the dimension of the limiting curve was obtained by Beffara (2004).

Other Lattices and Percolative Systems

Some modifications or generalizations of standard Bernoulli percolation on $\mathbb{Z}^d$ exhibit an interesting behavior and as such provide some insight into the original process as well; there are too many mathematical objects which can be argued to be percolative in some sense to give a full account of all

of them, so the following list is somewhat arbitrary and by no means complete.

Percolation on Nonamenable Graphs

The first modification of the model one can think of is to modify the underlying graph and move away from the cubic lattice; phase transition still occurs, and the main difference is the possibility for infinitely many infinite clusters to coexist. On a regular tree, such is the case whenever $p \in (p_c, 1)$; the first nontrivial example was produced by Grimmett and Newman as the product of $\mathbb{Z}$ by a tree: there, for some values of $p$ the infinite cluster is unique, while for others there is coexistence of infinitely many of them. The corresponding definition, due to Benjamini and Schramm, is then the following: if $N$ is as above the number of infinite open clusters,

$$p_u := \inf\{p : P_p(N = 1) = 1\} \ge p_c$$

The main question is then to characterize graphs on which $0 < p_c < p_u < 1$.

A wide class of interesting graphs is that of Cayley graphs of infinite, finitely generated groups. There, by a simultaneous result by Häggström and Peres and by Schonmann, for every $p \in (p_c, p_u)$ there are $P_p$-a.s. infinitely many infinite clusters, while for every $p \in (p_u, 1]$ there is only one (note that this does not follow from the definition, since new infinite components could appear when $p$ is increased). It is conjectured that $p_c < p_u$ for any Cayley graph of a nonamenable group (and more generally for any quasitransitive graph with positive Cheeger constant), and a result by Pak and Smirnova is that every infinite, finitely generated, nonamenable group has a Cayley graph on which $p_c < p_u$; this is then expected not to depend on the choice of generators. In the general case, it was recently proved by Gaboriau that if the graph $G$ is unimodular, transitive, locally finite, and supports nonconstant harmonic Dirichlet functions (i.e., harmonic functions whose gradient is in $\ell^2$), then indeed $p_c(G) < p_u(G)$.

For reference and further reading on the topic, the reader is advised to refer to the review paper by Benjamini and Schramm (1996), the lecture notes of Peres (1999), and the more recent article of Gaboriau (2005).

Gradient Percolation

Another possible modification of the original model is to allow the parameter $p$ to depend on the location; the porous medium may for instance have been created by some kind of erosion, so that there will be more open edges on one side of a given domain than on the other. If $p$ still varies smoothly, then one expects some regions to look subcritical and others to look supercritical, with interesting behavior in the vicinity of the critical level set $\{p = p_c\}$. This particular model was introduced by Sapoval et al. (1985) under the name of gradient percolation (see Figure 5).

Figure 5 Gradient percolation in a square. In black is the cluster spanning from the bottom side of the square.

The control of the model away from the critical zone is essentially the same as for usual Bernoulli percolation, the main question being how to estimate the width of the phase transition. The main idea is then the same as in scaling theory: if the distance between a point $v$ and the critical level set is less than the correlation length for parameter $p(v)$, then $v$ is in the phase transition domain. This of course makes sense only asymptotically, say in a large $n \times n$ square with $p(x, y) = 1 - y/n$ as is the case in the figure: the transition then is expected to have width of order $n^a$ for some exponent $a > 0$.

First-Passage Percolation

First-passage percolation (also known as Eden or Richardson model) was introduced by Hammersley and Welsh (1965) as a time-dependent model for the passage of fluid through a porous medium. To define the model, with each edge $e \in E(\mathbb{Z}^d)$ is associated a random variable $T(e)$, which can be interpreted as being the time required for fluid to flow along $e$. The $T(e)$ are assumed to be independent non-negative random variables having common distribution $F$. For any path $\pi$ we define the passage time $T(\pi)$ of $\pi$ as

$$T(\pi) := \sum_{e \in \pi} T(e)$$

The first passage time $a(x, y)$ between vertices $x$ and $y$ is given by

$$a(x, y) = \inf\{T(\pi) : \pi \text{ a path from } x \text{ to } y\}$$

and we can define

$$W(t) := \{x \in \mathbb{Z}^d : a(0, x) \le t\}$$

the set of vertices reached by the liquid by time $t$. It turns out that $W(t)$ grows approximately linearly as time passes, and that there exists a nonrandom limit set $B$ such that either $B$ is compact and

$$(1 - \varepsilon)B \subset \frac{1}{t}\widetilde{W}(t) \subset (1 + \varepsilon)B, \quad \text{eventually a.s.}$$

for all $\varepsilon > 0$, or $B = \mathbb{R}^d$, and

$$\{x \in \mathbb{R}^d : |x| \le K\} \subset \frac{1}{t}\widetilde{W}(t), \quad \text{eventually a.s.}$$

for all $K > 0$. Here $\widetilde{W}(t) = \{z + [-1/2, 1/2]^d : z \in W(t)\}$.

Studies of first-passage percolation brought many fascinating discoveries, including Kingman’s celebrated subadditive ergodic theorem. In recent years interest has been focused on the study of fluctuations of the set $\widetilde{W}(t)$ for large $t$. In spite of huge effort and some partial results achieved, it still remains a major task to establish rigorously the conjectures predicted by Kardar–Parisi–Zhang theory about shape fluctuations in first-passage percolation.

Contact Processes

Introduced by Harris and conceived with biological interpretation, the contact process on $\mathbb{Z}^d$ is a continuous-time process taking values in the space of subsets of $\mathbb{Z}^d$. It is informally described as follows: particles are distributed in $\mathbb{Z}^d$ in such a way that each site is either empty or occupied by one particle. The evolution is Markovian: each particle disappears after an exponential time of parameter 1, independently from the others; at any time, each particle has a possibility to create a new particle at any of its empty neighboring sites, and does so with rate $\lambda > 0$, independently of everything else.

The question is then whether, starting from a finite population, the process will die out in finite time or whether it will survive forever with positive probability. The outcome will depend on the value of $\lambda$, and there is a critical value $\lambda_c$, such that for $\lambda \le \lambda_c$ the process dies out, while for $\lambda > \lambda_c$ indeed there is survival, and in this case the shape of the population obeys a shape theorem similar to that of first-passage percolation.

The analogy with percolation is strong, the corresponding percolative picture being the following: in $\mathbb{Z}^{d+1}_+$, each edge is open with probability $p \in (0, 1)$, and the question is whether there exists an infinite oriented path (i.e., a path along which the sum of the coordinates is increasing), composed of open edges. Once again, there is a critical parameter customarily denoted by $p_c$, at which no such path exists (compare this to the open question of the continuity of the function $\theta$ at $p_c$ in dimensions $3 \le d \le 18$). This variation of percolation lies in a different universality class than the usual Bernoulli model.

Invasion Percolation

Let $X(e) : e \in E$ be independent random variables indexed by the edge set $E$ of $\mathbb{Z}^d$, $d \ge 2$, each having uniform distribution in $[0, 1]$. One constructs a sequence $C = \{C_i, i \ge 1\}$ of random connected subgraphs of the lattice in the following iterative way: the graph $C_0$ contains only the origin. Having defined $C_i$, one obtains $C_{i+1}$ by adding to $C_i$ an edge $e_{i+1}$ (with its outer lying end-vertex), chosen from the outer edge boundary of $C_i$ so as to minimize $X(e_{i+1})$. Still very little is known about the behavior of this process.

An interesting observation, relating $\theta(p_c)$ of usual percolation with the invasion dynamics, comes from CM Newman:

$$\theta(p_c) = 0 \iff P\{x \in C\} \to 0 \text{ as } |x| \to \infty$$

Further Remarks

For a much more in-depth review of percolation on lattices and the mathematical methods involved in its study, and for the proofs of most of the results we could only point at, we refer the reader to the standard book of Grimmett (1999); another excellent general reference, and the only place to find some of the technical graph-theoretical details involved, is the book of Kesten (1982). More information in the case of graphs that are not lattices can be found in the lecture notes of Peres (1999).

For curiosity, the reader can refer to the first mention of a problem close to percolation, in the problem section of the first volume of the American Mathematical Monthly (problem 5, June 1894, submitted by D V Wood).

See also: Determinantal Random Fields; Stochastic Loewner Evolutions; Two-Dimensional Ising Model; Wulff Droplets.
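As a concrete illustration of the invasion percolation construction described in this article, the iterative rule (start from the origin, then repeatedly add the outer-boundary edge of smallest weight together with its outer end-vertex) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a reference implementation: it works on $\mathbb{Z}^2$, samples the uniform weights $X(e)$ lazily as edges are first seen, and the function name `invasion_percolation` is hypothetical.

```python
import heapq
import random

NEIGHBOUR_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def invasion_percolation(steps, seed=0):
    """Run `steps` invasion steps on Z^2; return the invaded vertex set and edge list."""
    rng = random.Random(seed)
    X = {}  # lazily sampled i.i.d. uniform weights X(e)

    def weight(u, v):
        e = frozenset((u, v))
        if e not in X:
            X[e] = rng.random()
        return X[e]

    origin = (0, 0)
    cluster = {origin}   # vertices of C_i
    invaded = []         # edges e_1, e_2, ... in order of invasion
    boundary = []        # heap of candidate boundary edges (weight, inner, outer)

    def push_boundary(u):
        for dx, dy in NEIGHBOUR_STEPS:
            v = (u[0] + dx, u[1] + dy)
            if v not in cluster:
                heapq.heappush(boundary, (weight(u, v), u, v))

    push_boundary(origin)
    while len(invaded) < steps:
        w, u, v = heapq.heappop(boundary)
        if v in cluster:
            continue  # this edge is no longer in the outer edge boundary
        cluster.add(v)
        invaded.append((u, v, w))
        push_boundary(v)
    return cluster, invaded

if __name__ == "__main__":
    cluster, invaded = invasion_percolation(1000, seed=1)
    print("largest weight accepted:", max(w for _, _, w in invaded))
```

The heap keeps every edge that was on the boundary at some point; entries whose outer vertex has since been invaded are simply discarded when popped, which avoids having to delete from the heap.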
28 Perturbation Theory and Its Techniques

Further Reading

Alexander K, Chayes JT, and Chayes L (1990) The Wulff construction and asymptotics of the finite cluster distribution for two-dimensional Bernoulli percolation. Communications in Mathematical Physics 131: 1–50.
Beffara V (2004) Hausdorff dimensions for SLE6. Annals of Probability 32: 2606–2629.
Benjamini I and Schramm O (1996) Percolation beyond Zd, many questions and a few answers. Electronic Communications in Probability 1(8): 71–82 (electronic).
Broadbent SR and Hammersley JM (1957) Percolation processes, I and II. Mathematical Proceedings of the Cambridge Mathematical Society 53, pp. 629–645.
Cerf R (2000) Large deviations for three dimensional supercritical percolation. Astérisque, SMF vol. 267.
Gaboriau D (2005) Invariant percolation and harmonic Dirichlet functions. Geometric and Functional Analysis (in print).
Grimmett G (1999) Percolation. Grundlehren der Mathematischen Wissenschaften, 2nd edn, vol. 321. Berlin: Springer.
Hammersley JM and Welsh DJA (1965) First-passage percolation, subadditive processes, stochastic networks, and generalized renewal theory. In: Proc. Internat. Res. Semin, Statist. Lab., Univ. California, Berkeley, CA, pp. 61–110. New York: Springer.
Hara T and Slade G (1990) Mean-field critical behaviour for percolation in high dimensions. Communications in Mathematical Physics 128: 333–391.
Kesten H (1980) The critical probability of bond percolation on the square lattice equals 1/2. Communications in Mathematical Physics 74: 41–59.
Kesten H (1982) Percolation Theory for Mathematicians, Progress in Probability and Statistics, vol. 2. Boston, MA: Birkhäuser.
Kesten H (1986) The incipient infinite cluster in two-dimensional percolation. Probability Theory and Related Fields 73: 369–394.
Kesten H (1987) Scaling relations for 2D-percolation. Communications in Mathematical Physics 109: 109–156.
Peres Y (1999) Probability on Trees: An Introductory Climb, Lectures on Probability Theory and Statistics (Saint-Flour, 1997), Lecture Notes in Math, vol. 1717, pp. 193–280. Berlin: Springer.
Russo L (1978) A note on percolation. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 43: 39–48.
Sapoval B, Rosso M, and Gouyet J (1985) The fractal nature of a diffusion front and the relation to percolation. Journal de Physique Lettres 46: L146–L156.
Seymour PD and Welsh DJA (1978) Percolation probabilities on the square lattice. Annals of Discrete Mathematics 3: 227–245.
Smirnov S (2001) Critical percolation in the plane: conformal invariance, Cardy’s formula, scaling limits. Comptes-rendus de l’Académie des Sciences de Paris, Série I Mathématiques 333: 239–244.
Smirnov S and Werner W (2001) Critical exponents for two-dimensional percolation. Mathematical Research Letters 8: 729–744.
van der Hofstad R and Járai AA (2004) The incipient infinite cluster for high-dimensional unoriented percolation. Journal of Statistical Physics 114: 625–663.

Perturbation Theory and Its Techniques


R J Szabo, Heriot-Watt University, Edinburgh, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

There are several equivalent formulations of the problem of quantizing an interacting field theory. The list includes canonical quantization, path-integral (or functional) techniques, stochastic quantization, ‘‘unified’’ methods such as the Batalin–Vilkovisky formalism, and techniques based on the realizations of field theories as low-energy limits of string theory. The problem of obtaining an exact nonperturbative description of a given quantum field theory is most often a very difficult one. Perturbative techniques, on the other hand, are abundant, and common to all of the quantization methods mentioned above is that they admit particle interpretations in this formalism.

The basic physical quantities that one wishes to calculate in a relativistic $(d+1)$-dimensional quantum field theory are the S-matrix elements

$$S_{ba} = {}_{\mathrm{out}}\langle \psi_b(t) | \psi_a(t) \rangle_{\mathrm{in}} \qquad [1]$$

between in and out states at large positive time $t$. The scattering operator $S$ is then defined by writing [1] in terms of initial free-particle (descriptor) states as

$$S_{ba} =: \langle \psi_b(0) | S | \psi_a(0) \rangle \qquad [2]$$

Suppose that the Hamiltonian of the given field theory can be written as $H = H_0 + H'$, where $H_0$ is the free part and $H'$ the interaction Hamiltonian. The time evolutions of the in and out states are governed by the total Hamiltonian $H$. They can be expressed in terms of descriptor states which evolve in time with $H_0$ in the interaction picture and correspond to free-particle states. This leads to the Dyson formula

$$S = T \exp\left( -i \int_{-\infty}^{\infty} dt \, H_I(t) \right) \qquad [3]$$

where $T$ denotes time ordering and $H_I(t) = \int d^d x \, \mathcal{H}_{\mathrm{int}}(x, t)$ is the interaction Hamiltonian in the interaction picture, with $\mathcal{H}_{\mathrm{int}}(x, t)$ the interaction Hamiltonian density, which deals with essentially free fields. This formula expresses $S$ in terms of interaction-picture operators acting on free-particle states in [2] and is the first step towards Feynman perturbation theory.
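The role of the time-ordering symbol in the Dyson formula [3] can be seen in a toy setting. The sketch below is an illustration only and is not taken from the article: it assumes a two-level system with a hypothetical Hamiltonian $H_I(t) = h_x(t)\sigma_x + h_y(t)\sigma_y + h_z(t)\sigma_z$, replaces the infinite time interval by a finite one, and approximates the time-ordered exponential by an ordered product of short-time factors, with later times standing to the left.

```python
import math

IDENTITY = [[1, 0], [0, 1]]

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_minus_i(hx, hy, hz, dt):
    """Closed form of exp(-i dt (hx sx + hy sy + hz sz)) for Pauli matrices sx, sy, sz."""
    h = math.sqrt(hx * hx + hy * hy + hz * hz)
    if h == 0.0:
        return IDENTITY
    c = math.cos(h * dt)
    s = math.sin(h * dt) / h
    return [[c - 1j * s * hz, -1j * s * (hx - 1j * hy)],
            [-1j * s * (hx + 1j * hy), c + 1j * s * hz]]

def time_ordered_exp(H, t0, t1, steps):
    """Approximate T exp(-i * integral of H(t) dt over [t0, t1]).

    H(t) returns the tuple (hx, hy, hz); each new factor multiplies from the
    left, so operators at later times act after those at earlier times.
    """
    dt = (t1 - t0) / steps
    S = IDENTITY
    for k in range(steps):
        t = t0 + (k + 0.5) * dt  # midpoint of the k-th time slice
        S = matmul(exp_minus_i(*H(t), dt), S)
    return S

if __name__ == "__main__":
    # A Hamiltonian that does not commute with itself at different times.
    S = time_ordered_exp(lambda t: (math.cos(t), 0.0, math.sin(t)), 0.0, 2.0, 2000)
    print(S)
```

When $H_I(t)$ commutes with itself at different times the ordering is irrelevant and the product collapses to an ordinary exponential; the ordering matters precisely when it does not.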

For many analytic investigations, such as those which arise in renormalization theory, one is interested instead in the Green’s functions of the quantum field theory, which measure the response of the system to an external perturbation. For definiteness, let us consider a free real scalar field theory in $d + 1$ dimensions with Lagrangian density

$$\mathcal{L} = \tfrac{1}{2} \partial_\mu \phi \, \partial^\mu \phi - \tfrac{1}{2} m^2 \phi^2 + \mathcal{L}_{\mathrm{int}} \qquad [4]$$

where $\mathcal{L}_{\mathrm{int}}$ is the interaction Lagrangian density which we assume has no derivative terms. The interaction Hamiltonian density is then given by $\mathcal{H}_{\mathrm{int}} = -\mathcal{L}_{\mathrm{int}}$. Introducing a real scalar source $J(x)$, we define the normalized ‘‘partition function’’ through the vacuum expectation values,

$$Z[J] = \frac{\langle 0 | S[J] | 0 \rangle}{\langle 0 | S[0] | 0 \rangle} \qquad [5]$$

where $|0\rangle$ is the normalized perturbative vacuum state of the quantum field theory given by [4] (defined to be destroyed by all field annihilation operators), and

$$S[J] = T \exp\left( i \int d^{d+1}x \, \big( \mathcal{L}_{\mathrm{int}} + J(x)\phi(x) \big) \right) \qquad [6]$$

from the Dyson formula. This partition function is the generating functional for all Green’s functions of the quantum field theory, which are obtained from [5] by taking functional derivatives with respect to the source and then setting $J(x) = 0$. Explicitly, in a formal Taylor series expansion in $J$ one has

$$Z[J] = \sum_{n=0}^{\infty} \frac{i^n}{n!} \prod_{i=1}^{n} \int d^{d+1}x_i \, J(x_i) \; G^{(n)}(x_1, \dots, x_n) \qquad [7]$$

whose coefficients are the Green’s functions

$$G^{(n)}(x_1, \dots, x_n) := \frac{\langle 0 | T\left[ \exp\left( i \int d^{d+1}x \, \mathcal{L}_{\mathrm{int}} \right) \phi(x_1) \cdots \phi(x_n) \right] | 0 \rangle}{\langle 0 | T \exp\left( i \int d^{d+1}x \, \mathcal{L}_{\mathrm{int}} \right) | 0 \rangle} \qquad [8]$$

It is customary to work in momentum space by introducing the Fourier transforms

$$\tilde{J}(k) = \int d^{d+1}x \, e^{i k \cdot x} J(x), \qquad \tilde{G}^{(n)}(k_1, \dots, k_n) = \prod_{i=1}^{n} \int d^{d+1}x_i \, e^{-i k_i \cdot x_i} \; G^{(n)}(x_1, \dots, x_n) \qquad [9]$$

in terms of which the expansion [7] reads

$$Z[J] = \sum_{n=0}^{\infty} \frac{i^n}{n!} \prod_{i=1}^{n} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}} \, \tilde{J}(k_i) \; \tilde{G}^{(n)}(k_1, \dots, k_n) \qquad [10]$$

The generating functional [10] can be written as a sum of Feynman diagrams with source insertions. Diagrammatically, the Green’s function is an infinite series of graphs which can be represented symbolically as

$\tilde{G}^{(n)}(k_1, \dots, k_n)$ = [diagram: a bubble with $n$ external lines carrying momenta $k_1, k_2, k_3, \dots, k_n$] [11]

where the $n$ external lines denote the source insertions of momenta $k_i$ and the bubble denotes the sum over all Feynman diagrams constructed from the interaction vertices of $\mathcal{L}_{\mathrm{int}}$.

This procedure is, however, rather formal in the way that we have presented it, for a variety of reasons. First of all, by Haag’s theorem, it follows that the interaction representation of a quantum field theory does not exist unless a cutoff regularization is introduced into the interaction term in the Lagrangian density (this regularization is described explicitly below). The addition of this term breaks translation covariance. This problem can be remedied via a different definition of the regularized Green’s functions, as we discuss below. Furthermore, the perturbation series of a quantum field theory is typically divergent. The expansion into graphs is, at best, an asymptotic series which is Borel summable. These shortcomings will not be emphasized any further in this article. Some mathematically rigorous approaches to perturbative quantum field theory can be found in the bibliography.

The Green’s functions can also be used to describe scattering amplitudes, but there are two important differences between the graphs [11] and those which appear in scattering theory. In the present case, external lines carry propagators, that is, the free-field Green’s functions

$$\Delta(x - y) = \langle 0 | T[\phi(x) \phi(y)] | 0 \rangle = \left\langle x \left| \left( \Box + m^2 \right)^{-1} \right| y \right\rangle = \int \frac{d^{d+1}p}{(2\pi)^{d+1}} \, \frac{i}{p^2 - m^2 + i\epsilon} \, e^{-i p \cdot (x - y)} \qquad [12]$$

where $\epsilon \to 0^+$ regulates the mass shell contributions, and their momenta $k_i$ are off-shell in general ($k_i^2 \ne m^2$). By the LSZ theorem, the S-matrix element

is then given by the multiple on-shell residue of the Green’s function in momentum space as

$$\langle k'_1, \dots, k'_n | S - 1 | k_1, \dots, k_l \rangle = \lim_{\substack{k'^2_1, \dots, k'^2_n \to m^2 \\ k_1^2, \dots, k_l^2 \to m^2}} \; \prod_{i=1}^{n} \frac{1}{i \sqrt{c'_i}} \left( k'^2_i - m^2 \right) \prod_{j=1}^{l} \frac{1}{i \sqrt{c_j}} \left( k_j^2 - m^2 \right) \tilde{G}^{(n+l)}(k'_1, \dots, k'_n, k_1, \dots, k_l) \qquad [13]$$

where $i c'_i$, $i c_j$ are the residues of the corresponding particle poles in the exact two-point Green’s function.

This article deals with the formal development and computation of perturbative scattering amplitudes in relativistic quantum field theory, along the lines outlined above. Initially we deal only with real scalar field theories of the sort [4] in order to illustrate the concepts and technical tools in as simple and concise a fashion as possible. These techniques are common to most quantum field theories. Fermions and gauge theories are then separately treated afterwards, focusing on the methods which are particular to them.

Diagrammatics

The pinnacle of perturbation theory is the technique of Feynman diagrams. Here we develop the basic machinery in a quite general setting and use it to analyze some generic features of the terms comprising the perturbation series.

Wick’s Theorem

The Green’s functions [8] are defined in terms of vacuum expectation values of time-ordered products of the scalar field $\phi(x)$ at different spacetime points. Wick’s theorem expresses such products in terms of normal-ordered products, defined by placing each field creation operator to the left of each field annihilation operator, and in terms of two-point Green’s functions [12] of the free-field theory (propagators). The consequence of this theorem is the Hafnian formula

$$\langle 0 | T[\phi(x_1) \cdots \phi(x_n)] | 0 \rangle = \begin{cases} 0, & n = 2k - 1 \\[4pt] \displaystyle \sum_{\pi \in S_{2k}} \prod_{i=1}^{k} \langle 0 | T\left[ \phi(x_{\pi(2i-1)}) \, \phi(x_{\pi(2i)}) \right] | 0 \rangle, & n = 2k \end{cases} \qquad [14]$$

The formal Taylor series expansion of the scattering operator $S$ may now be succinctly summarized into a diagrammatic notation by using Wick’s theorem. For each spacetime integration $\int d^{d+1}x_i$ we introduce a vertex with label $i$, and from each vertex there emanate some lines corresponding to field insertions at the point $x_i$. If the operators represented by two lines appear in a two-point function according to [14], that is, they are contracted, then these two lines are connected together. The $S$ operator is then represented as a sum over all such Wick diagrams, bearing in mind that topologically equivalent diagrams correspond to the same term in $S$. Two diagrams are said to have the same pattern if they differ only by a permutation of their vertices. For any diagram $D$ with $n(D)$ vertices, the number of ways of interchanging vertices is $n(D)!$. The number of diagrams per pattern never exceeds this number. The symmetry number $S(D)$ of $D$ is the number of permutations of vertices that give the same diagram. The number of diagrams with the pattern of $D$ is then $n(D)!/S(D)$.

In a given pattern, we write the contribution to $S$ of a single diagram $D$ as

$$\frac{1}{n(D)!} \, {:}\theta(D){:}$$

where the combinatorial factor comes from the Taylor expansion of $S$, the large colons denote normal ordering of quantum operators, and ${:}\theta(D){:}$ contains spacetime integrals over normal-ordered products of the fields. Then all diagrams with the pattern of $D$ contribute ${:}\theta(D){:}/S(D)$ to $S$. Only the connected diagrams $D_r$, $r \in \mathbb{N}$ (those in which every vertex is connected to every other vertex) contribute and we can write the scattering operator in a simple form which eliminates contributions from all disconnected diagrams as

$$S = {:}\exp\left( \sum_{r=1}^{\infty} \frac{\theta(D_r)}{S(D_r)} \right){:} \qquad [15]$$

Feynman Rules

Feynman diagrams in momentum space are defined from the Wick diagrams above by dropping the labels on vertices (and also the symmetry factors $S(D)^{-1}$), and by labeling the external lines by the momenta of the initial and final particles that the corresponding field operators annihilate. In a spacetime interpretation, external lines

represent on-shell physical particles while internal lines of the graph represent off-shell virtual particles ($k^2 \ne m^2$). Physical particles interact via the exchange of virtual particles. An arbitrary diagram is then calculated via the Feynman rules:

[diagram: internal line carrying momentum $p$] $= \displaystyle\int \frac{d^{d+1}p}{(2\pi)^{d+1}} \, \frac{i}{p^2 - m^2 + i\epsilon}$

[diagram: vertex joining lines with momenta $p_1, p_2, p_3, \dots, p_n$] $= i g \, (2\pi)^{d+1} \, \delta^{(d+1)}(p_1 + \cdots + p_n)$ [16]

for a monomial interaction $\mathcal{L}_{\mathrm{int}} = (g/n!)\,\phi^n$.

Irreducible Green’s Functions

A one-particle irreducible (1PI) or proper Green’s function is given by a sum of diagrams in which each diagram cannot be separated by cutting one internal line. In momentum space, it is defined without the overall momentum conservation delta-function factors and without propagators on external lines. For example, the two particle 1PI Green’s function

[diagram: 1PI bubble with one incoming and one outgoing line of momentum $k$] $=: \Sigma(k)$ [17]

is called the self-energy. If $G(k)$ is the complete two-point function in momentum space, then one has

$G(k) :=$ [diagram: full propagator] $=$ [diagram: free propagator] $+$ [diagram: propagator, 1PI, propagator] $+$ [diagram: propagator, 1PI, propagator, 1PI, propagator] $+ \cdots = \dfrac{i}{k^2 - m^2 - \Sigma(k)}$ [18]

and thus it suffices to calculate only 1PI diagrams. The 1PI effective action, defined by the Legendre transformation $\Gamma[\phi] := -i \ln Z[J] - \int d^{d+1}x \, J(x)\phi(x)$ of [5], is the generating functional for proper vertex functions and it can be represented as a functional of only the vacuum expectation value of the field $\phi$, that is, its classical value. In the semiclassical (WKB) approximation, the one-loop effective action is given by

$$\Gamma[\phi] = S[\phi] + \frac{i\hbar}{2} \, \mathrm{Tr} \ln\left( 1 + \Delta V''[\phi] \right) + O(\hbar^2) = S[\phi] + i\hbar \sum_{n=1}^{\infty} \frac{(-1)^n}{2n} \prod_{i=1}^{n} \int d^{d+1}x_i \, \Delta(x_i - x_{i+1}) \, V''[\phi(x_{i+1})] + O(\hbar^2) \qquad [19]$$

where we have denoted $S[\phi] = \int d^{d+1}x \, \mathcal{L}$ and $V[\phi] = -\mathcal{L}_{\mathrm{int}}$, and for each term in the infinite series we define $x_{n+1} := x_1$. The first term in [19] is the classical contribution and it can be represented in terms of connected tree diagrams. The second term is the sum of contributions of one-loop diagrams constructed from $n$ propagators $i\Delta(x - y)$ and $n$ vertices $-iV''[\phi]$. The expansion may be carried out to all orders in terms of connected Feynman diagrams, and the result of the above Legendre transformation is to select only the one-particle irreducible diagrams and to replace the classical value of $\phi$ by an arbitrary argument. All information about the quantum field theory is encoded in this effective action.

Parametric Representation

Consider an arbitrary proper Feynman diagram $D$ with $n$ internal lines and $v$ vertices. The number, $\ell$, of independent loops in the diagram is the number of independent internal momenta in $D$ when conservation laws at each vertex have been taken into account, and it is given by $\ell = n + 1 - v$. There is an independent momentum integration variable $k_i$ for each loop, and a propagator for each internal line as in [16]. The contribution of $D$ to a proper Green’s function with $r$ incoming external momenta $p_i$, with $\sum_{i=1}^{r} p_i = 0$, is given by

$$\tilde{I}_D(p) = \frac{V(D)}{S(D)} \prod_{i=1}^{n} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}} \, \frac{i}{k_i^2 - m^2 + i\epsilon} \; \prod_{j=1}^{v} (2\pi)^{d+1} \, \delta^{(d+1)}\left( P_j - K_j \right) \qquad [20]$$

where $V(D)$ contains all contributions from the interaction vertices of $\mathcal{L}_{\mathrm{int}}$, and $P_j$ (resp. $K_j$) is the sum of incoming external momenta $p_{l_j}$ (resp. internal momenta $k_{l_j}$) at vertex $j$ with respect to a fixed chosen orientation of the lines of the graph. After resolving the delta-functions in terms of independent internal loop momenta $k_1, \dots, k_\ell$ and dropping the overall momentum conservation

delta-function along with the symmetry and vertex factors in [20], one is left with a set of momentum space integrals

$$I_D(p) = \prod_{i=1}^{\ell} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}} \prod_{j=1}^{n} \frac{i}{a_j(k,p) + i\epsilon} \qquad [21]$$

where $a_j(k,p)$ are functions of both the internal and external momenta.

It is convenient to exponentiate propagators using the Schwinger parametrization

$$\frac{i}{a_j + i\epsilon} = \int_0^\infty d\alpha_j \, e^{i\alpha_j(a_j + i\epsilon)} \qquad [22]$$

and after some straightforward manipulations one may write the Feynman parametric formula

$$\prod_{j=1}^{n} \frac{i}{a_j(k,p) + i\epsilon} = (n-1)! \prod_{j=1}^{n} \int_0^1 d\alpha_j \, \frac{\delta\big(1 - \sum_j \alpha_j\big)}{D_D(k;\alpha,p)^n} \qquad [23]$$

where $D_D(k;\alpha,p) := \sum_j \alpha_j [a_j(k,p) + i\epsilon]$ is generically a quadratic form

$$D_D(k;\alpha,p) = \frac12 \sum_{i,j=1}^{\ell} k_i \cdot Q_{ij}(\alpha) k_j + \sum_{i=1}^{\ell} L_i(p) \cdot k_i + \lambda(p^2) \qquad [24]$$

The positive symmetric matrix $Q_{ij}$ is independent of the external momenta $p_l$, invertible, and has nonzero eigenvalues $Q_1, \dots, Q_\ell$. The vectors $L_i$ are linear combinations of the $p_j$, while $\lambda(p^2)$ is a function of only the Lorentz invariants $p_i^2$. After some further elementary manipulations, the loop diagram contribution [21] may be written as

$$I_D(p) = (n-1)! \prod_{j=1}^{n} \int_0^1 d\alpha_j \prod_{i=1}^{\ell} \frac{1}{Q_i(\alpha)^2} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}} \, \delta\Big(1 - \sum_j \alpha_j\Big) \Big( \frac12 \sum_i k_i^2 + \lambda(p^2) - \frac12 \sum_{i,j} L_i(p) \cdot Q^{-1}(\alpha)_{ij} L_j(p) \Big)^{-n} \qquad [25]$$

Finally, the integrals over the loop momenta $k_i$ may be performed by Wick-rotating them to Euclidean space and using the fact that the combination of $\ell$ integrations in $\mathbb{R}^{d+1}$ has $O((d+1)\ell)$ rotational invariance. The contribution from the entire Feynman diagram $D$ thereby reduces to the calculation of the parametric integrals:

$$I_D(p) = \frac{\Gamma\big(n - \frac{(d+1)\ell}{2}\big)}{(2\pi)^{(d+1)\ell/2} \, i^{d\ell}} \prod_{j=1}^{n} \int_0^1 d\alpha_j \prod_{i=1}^{\ell} \frac{1}{Q_i(\alpha)^2} \, \frac{\delta\big(1 - \sum_j \alpha_j\big)}{\Big( \lambda(p^2) - \frac12 \sum_{i,j} L_i(p) \cdot Q^{-1}(\alpha)_{ij} L_j(p) \Big)^{n - (d+1)\ell/2}} \qquad [26]$$

where $\Gamma(s)$ is the Euler gamma-function.

Regularization

The parametric representation [26] is generically convergent when $2n - (d+1)\ell > 0$. When divergent, the infinities arise from the lower limits of integration $\alpha_j \to 0$. This is just the parametric representation of the large-$k$ divergence of the original Feynman amplitude [20]. Such ultraviolet divergences plague the very meaning of a quantum field theory and must be dealt with in some way. We will now quickly tour the standard methods of ultraviolet regularization for such loop integrals, which is a prelude to the renormalization program that removes the divergences (in a renormalizable field theory). Here we consider regularization simply as a means of justification for the various formal manipulations that are used in arriving at expressions such as [26].

Momentum Cutoff

Cutoff regularization introduces a mass scale $\Lambda$ into the quantum field theory and throws away the Fourier modes of the fields for spatial momenta $k$ with $|k| > \Lambda$. This regularization spoils Lorentz invariance. It is also nonlocal. For example, if we restrict to a hypercube in momentum space, so that $|k_i| < \Lambda$ for $i = 1, \dots, d$, then

$$\int_{|k_i| < \Lambda} \frac{d^dk}{(2\pi)^d} \, e^{ik \cdot x} = \prod_{i=1}^{d} \frac{\sin(\Lambda x_i)}{\pi x_i}$$

which is a delta-function in the limit $\Lambda \to \infty$ but is nonlocal for $\Lambda < \infty$. The regularized field theory is finite order by order in perturbation theory and depends on the cutoff $\Lambda$.
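The nonlocality of the sharp momentum cutoff can be made concrete in one dimension. The following Python sketch compares a direct numerical integration of $e^{ikx}/2\pi$ over $|k| < \Lambda$ with the closed form $\sin(\Lambda x)/(\pi x)$. The function names and the Simpson-rule integrator are illustrative assumptions and are not taken from the text.

```python
import math

def cutoff_kernel_exact(x, lam):
    """Closed form of (1/2pi) * integral_{-lam}^{lam} exp(i k x) dk."""
    if x == 0.0:
        return lam / math.pi              # limit of sin(lam x)/(pi x) as x -> 0
    return math.sin(lam * x) / (math.pi * x)

def cutoff_kernel_numeric(x, lam, steps=20000):
    """Composite Simpson rule for the same integral (steps must be even).

    The imaginary part of exp(i k x) cancels on the symmetric interval,
    so only cos(k x) needs to be integrated.
    """
    h = 2.0 * lam / steps
    total = math.cos(-lam * x) + math.cos(lam * x)
    for n in range(1, steps):
        k = -lam + n * h
        total += (4 if n % 2 else 2) * math.cos(k * x)
    return total * h / 3.0 / (2.0 * math.pi)

if __name__ == "__main__":
    for lam in (1.0, 5.0, 25.0):
        # The peak at x = 0 grows like lam/pi while the kernel keeps
        # oscillating tails of width ~ 1/lam for any finite cutoff.
        print(lam, cutoff_kernel_exact(0.0, lam), cutoff_kernel_numeric(0.7, lam))
```

For any finite cutoff the kernel has oscillating tails at $x \neq 0$, which is the nonlocality referred to above; the peak height $\Lambda/\pi$ grows without bound as the cutoff is removed.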

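Returning to the Schwinger parametrization [22] and the Feynman parametric formula [23], both can be spot-checked numerically. The sketch below is a minimal illustration under stated assumptions: it keeps $\epsilon$ finite so that the $\alpha$-integral converges and can be truncated, and it tests only the standard two-denominator identity $1/(AB) = \int_0^1 dx \, [xA + (1-x)B]^{-2}$ that underlies the $n = 2$ case of [23]. The helper names are hypothetical.

```python
import cmath

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule for a complex-valued integrand (steps even)."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for n in range(1, steps):
        total += (4 if n % 2 else 2) * f(a + n * h)
    return total * h / 3.0

def schwinger_rhs(a, eps, cutoff=80.0):
    """Right-hand side of [22] with the alpha-integral truncated at `cutoff`.

    A finite eps > 0 damps the integrand like exp(-eps * alpha), so the
    truncation error is of order exp(-eps * cutoff).
    """
    return simpson(lambda al: cmath.exp(1j * al * (a + 1j * eps)),
                   0.0, cutoff, steps=200000)

def feynman_rhs(A, B):
    """Two-denominator Feynman formula: integral_0^1 dx / (x A + (1 - x) B)^2."""
    return simpson(lambda x: 1.0 / (x * A + (1.0 - x) * B) ** 2, 0.0, 1.0)

if __name__ == "__main__":
    a, eps = 1.3, 0.5
    print(schwinger_rhs(a, eps), 1j / (a + 1j * eps))
    A, B = 2 + 0.3j, 5 + 0.3j
    print(feynman_rhs(A, B), 1 / (A * B))
```

The small imaginary parts given to `A` and `B` play the role of the $i\epsilon$ prescription: they keep the combined denominator away from zero along the whole integration range.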
Lattice Regularization

We can replace the spatial continuum by a lattice $L$ of rank $d$ and define a Lagrangian on $L$ by

$$L_L = \frac12 \sum_{i \in S(L)} \dot\phi_i^2 + J \sum_{\langle i,j \rangle \in L(L)} \phi_i \phi_j + \sum_{i \in S(L)} V(\phi_i) \qquad [27]$$

where $S(L)$ is the set of sites $i$ of the lattice on each of which is situated a time-dependent function $\phi_i$, and $L(L)$ is the collection of links connecting pairs $\langle i,j \rangle$ of nearest-neighbor sites $i, j$ on $L$. The regularized field theory is now local, but still has broken Lorentz invariance. In particular, it suffers from broken rotational symmetry. If $L$ is hypercubic with lattice spacing $a$, that is, $L = (\mathbb{Z}a)^d$, then the momentum cutoff is at $\Lambda = a^{-1}$.

Pauli–Villars Regularization

We can replace the propagator $i(k^2 - m^2 + i\epsilon)^{-1}$ by $i(k^2 - m^2 + i\epsilon)^{-1} + i \sum_{j=1}^{N} c_j (k^2 - M_j^2 + i\epsilon)^{-1}$, where the masses $M_j \gg m$ are identified with the momentum cutoff as $\min\{M_j\} = \Lambda \to \infty$. The mass-dependent coefficients $c_j$ are chosen to make the modified propagator decay rapidly as $(k^2)^{-N-1}$ at $k \to \infty$, which gives the $N$ equations $(m^2)^i + \sum_j c_j (M_j^2)^i = 0$, $i = 0, 1, \dots, N-1$. This regularization preserves Lorentz invariance (and other symmetries that the field theory may possess) and is local in the following sense. The modified propagator can be thought of as arising through the alteration of the Lagrangian density [4] by $N$ additional scalar fields $\varphi_j$ of masses $M_j$ with

$$L_{PV} = \frac12 \partial_\mu \phi \, \partial^\mu \phi - \frac12 m^2 \phi^2 + \sum_{j=1}^{N} \Big( \frac12 \partial_\mu \varphi_j \, \partial^\mu \varphi_j - \frac12 M_j^2 \varphi_j^2 \Big) + L_{int}[\Phi] \qquad [28]$$

where $\Phi := \phi + \sum_j \sqrt{c_j} \, \varphi_j$. The contraction of the $\Phi$ field thus produces the required propagator. However, the $c_j$'s as computed above are generically negative numbers and so the Lagrangian density [28] is not Hermitian (as $\Phi \neq \Phi^\dagger$). It is possible to make [28] formally Hermitian by redefining the inner product on the Hilbert space of physical states, but this produces negative-norm states. This is no problem at energy scales $E \ll M_j$ on which the extra particles decouple and the negative probability states are invisible.

Dimensional Regularization

Consider a Euclidean space integral $\int d^4k \, (k^2 + a^2)^{-r}$ arising after Wick rotation from some loop diagram in (3 + 1)-dimensional scalar field theory. We replace this integral by its $D$-dimensional version

$$\int \frac{d^Dk}{(k^2 + a^2)^r} = \pi^{D/2} \, (a^2)^{D/2 - r} \, \frac{\Gamma\big(r - \frac{D}{2}\big)}{(r-1)!} \qquad [29]$$

This integral is absolutely convergent for $D < 2r$. We can analytically continue the result of this integration to the complex plane $D \in \mathbb{C}$. As an analytic function, the only singularities of the Euler function $\Gamma(z)$ are poles at $z = 0, -1, -2, \dots$. In particular, $\Gamma(z)$ has a simple pole at $z = 0$ of residue 1. If we write $D = 4 + \epsilon$ with $|\epsilon| \to 0$, then the integral [29] is proportional to $\Gamma(r - 2 - \epsilon/2)$ and $\epsilon$ plays the role of the regulator here. This regularization is Lorentz invariant (in $D$ dimensions) and is distinguished as having a dimensionless regularization parameter $\epsilon$. This parameter is related to the momentum cutoff $\Lambda$ by $\epsilon^{-1} = \ln(\Lambda/m)$, so that the limit $\epsilon \to 0$ corresponds to $\Lambda \to \infty$.

Infrared Divergences

Thus far we have only considered the ultraviolet behavior of loop amplitudes in quantum field theory. When dealing with massless particles ($m = 0$ in [4]) one has to further worry about divergences arising from the $k \to 0$ regions of Feynman integrals. After Wick rotation to Euclidean momenta, one can show that no singularities arise in a given Feynman diagram as some of its internal masses vanish provided that all vertices have superficial degree of divergence $d + 1$, the external momenta are not exceptional (i.e., no partial sum of the incoming momenta $p_i$ vanishes), and there is at most one soft external momentum. This result assumes that renormalization has been carried out at some fixed Euclidean point. The extension of this property when the external momenta are continued to physical on-shell values is difficult. The Kinoshita–Lee–Nauenberg theorem states that, as a consequence of unitarity, transition probabilities in a theory involving massless particles are finite when the sum over all degenerate states (initial and final) is taken. This is true order by order in perturbation theory in bare quantities or if minimal subtraction renormalization is used (to avoid infrared or mass singularities in the renormalization constants).

Fermion Fields

We will now leave the generalities of our pure scalar field theory and start considering the extensions of our previous considerations to other types of particles. Henceforth we will primarily deal with the case of (3 + 1)-dimensional spacetime. We begin by indicating how the rudiments of perturbation

theory above apply to the case of Dirac fermion fields. The Lagrangian density is

$$L_F = \bar\psi (i\partial\!\!\!/ - m) \psi + L' \qquad [30]$$

where $\psi$ are four-component Dirac fermion fields in 3 + 1 dimensions, $\bar\psi := \psi^\dagger \gamma^0$ and $\partial\!\!\!/ = \gamma^\mu \partial_\mu$ with $\gamma^\mu$ the generators of the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. The Lagrangian density $L'$ contains couplings of the Dirac fields to other field theories, such as the scalar field theories considered previously.

Wick's theorem for anticommuting Fermi fields leads to the Pfaffian formula

$$\langle 0 | T[\psi(1) \cdots \psi(n)] | 0 \rangle = \begin{cases} 0, & n = 2k - 1 \\ \dfrac{1}{2^k k!} \displaystyle\sum_{\sigma \in S_{2k}} \mathrm{sgn}(\sigma) \prod_{i=1}^{k} \langle 0 | T[\psi(\sigma(2i-1)) \, \psi(\sigma(2i))] | 0 \rangle, & n = 2k \end{cases} \qquad [31]$$

where for compactness we have written in the argument of $\psi(i)$ the spacetime coordinate, the Dirac index, and a discrete index which distinguishes $\psi$ from $\bar\psi$. The nonvanishing contractions in [31] are determined by the free-fermion propagator

$$\Delta_F(x - y) = \langle 0 | T[\psi(x) \bar\psi(y)] | 0 \rangle = \langle x | (i\partial\!\!\!/ - m)^{-1} | y \rangle = i \int \frac{d^4p}{(2\pi)^4} \, \frac{p\!\!\!/ + m}{p^2 - m^2 + i\epsilon} \, e^{-ip \cdot (x - y)} \qquad [32]$$

Perturbation theory now proceeds exactly as before. Suppose that the coupling Lagrangian density in [30] is of the form $L' = \bar\psi(x) V(x) \psi(x)$. Both the Dyson formula [3] and the diagrammatic formula [15] are formally the same in this instance. For example, in the formal expansion in powers of $\int d^4x \, L'$, the vacuum-to-vacuum amplitude (the denominator in [5]) will contain field products of the form

$$\prod_{i=1}^{n} \int d^4x_i \, \langle 0 | T[\bar\psi(x_i) V(x_i) \psi(x_i)] | 0 \rangle$$

which correspond to fermion loops. Before applying Wick's theorem, the fields must be rearranged as

$$\mathrm{tr} \prod_{i=1}^{n} V(x_i) \, \psi(x_i) \, \bar\psi(x_{i+1})$$

(with $x_{n+1} := x_1$), where tr is the 4 × 4 trace over spinor indices. This reordering introduces the familiar minus sign for a closed fermion loop, and one has

$$\big[\text{closed fermion loop with insertions } V(x_1), V(x_2), V(x_3), \dots, V(x_n)\big] = (-) \prod_{i=1}^{n} \int d^4x_i \times \mathrm{tr} \prod_{j=1}^{n-1} \Delta_F(x_j - x_{j+1}) \times V(x_{j+1}) \, \Delta_F(x_{j+1} - x_{j+2}) \qquad [33]$$

Feynman rules are now described as follows. Fermion lines are oriented to distinguish a particle from its corresponding antiparticle, and carry both a four-momentum label $p$ as well as a spin polarization index $r = 1, 2$. Incoming fermions (resp. antifermions) are described by the wave functions $u_p^{(r)}$ (resp. $\bar v_p^{(r)}$), while outgoing fermions (resp. antifermions) are described by the wave functions $\bar u_p^{(r)}$ (resp. $v_p^{(r)}$). Here $u_p^{(r)}$ and $v_p^{(r)}$ are the classical spinors, that is, the positive and negative-energy solutions of the Dirac equation $(p\!\!\!/ - m) u_p^{(r)} = (p\!\!\!/ + m) v_p^{(r)} = 0$. Matrices are multiplied along a Fermi line, with the head of the arrow on the left. Closed fermion loops produce an overall minus sign as in [33], and the multiplication rule gives the trace of Dirac matrices along the lines of the loop. Unpolarized scattering amplitudes are summed over the spins of final particles and averaged over the spins of initial particles using the completeness relations for spinors

$$\sum_{r=1,2} u_p^{(r)} \bar u_p^{(r)} = p\!\!\!/ + m, \qquad \sum_{r=1,2} v_p^{(r)} \bar v_p^{(r)} = p\!\!\!/ - m \qquad [34]$$

leading to basis-independent results. Polarized amplitudes are computed using the spinor bilinears $\bar u_p^{(r)} \gamma^\mu u_p^{(s)} = \bar v_p^{(r)} \gamma^\mu v_p^{(s)} = 2p^\mu \delta^{rs}$, $\bar u_p^{(r)} u_p^{(s)} = -\bar v_p^{(r)} v_p^{(s)} = 2m \, \delta^{rs}$, and $\bar u_p^{(r)} v_p^{(s)} = 0$.

When calculating fermion loop integrals using dimensional regularization, one utilizes the Dirac algebra in $D$ dimensions

$$\gamma^\mu \gamma_\mu = \eta^\mu{}_\mu = D$$
$$\gamma^\mu p\!\!\!/ \gamma_\mu = (2 - D) \, p\!\!\!/$$
$$\gamma^\mu p\!\!\!/ k\!\!\!/ \gamma_\mu = 4 p \cdot k + (D - 4) \, p\!\!\!/ k\!\!\!/$$
$$\gamma^\mu p\!\!\!/ k\!\!\!/ q\!\!\!/ \gamma_\mu = -2 \, q\!\!\!/ k\!\!\!/ p\!\!\!/ - (D - 4) \, p\!\!\!/ k\!\!\!/ q\!\!\!/ \qquad [35]$$
$$\mathrm{tr} \, 1 = 4, \qquad \mathrm{tr} \, \gamma^{\mu_1} \cdots \gamma^{\mu_{2k-1}} = 0, \qquad \mathrm{tr} \, \gamma^\mu \gamma^\nu = 4 \eta^{\mu\nu}$$
$$\mathrm{tr} \, \gamma^\mu \gamma^\nu \gamma^\lambda \gamma^\rho = 4 \big( \eta^{\mu\nu} \eta^{\lambda\rho} - \eta^{\mu\lambda} \eta^{\nu\rho} + \eta^{\mu\rho} \eta^{\nu\lambda} \big)$$
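The contraction and trace identities above can be checked with explicit matrices in the special case $D = 4$. The following Python sketch builds the gamma matrices in the Dirac representation with metric signature $(+,-,-,-)$ and provides helpers for slashed momenta and for the contraction $\gamma^\mu M \gamma_\mu$. The choice of representation, the helper names, and the sample momenta are assumptions made for illustration; the check says nothing about the continuation to $D \neq 4$.

```python
# Explicit 4x4 Dirac matrices (Dirac representation), metric signature (+,-,-,-).
ETA = [1, -1, -1, -1]                      # diagonal entries of eta_{mu nu}
SIGMA = [[[0, 1], [1, 0]],                 # Pauli matrices sigma_1, sigma_2, sigma_3
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

def zeros():
    return [[0] * 4 for _ in range(4)]

def identity():
    return [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lincomb(c1, a, c2, b):
    """Return c1*a + c2*b entrywise."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(4)] for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

def build_gammas():
    g0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
    gammas = [g0]
    for s in SIGMA:
        g = zeros()
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = s[i][j]      # upper-right block: +sigma
                g[i + 2][j] = -s[i][j]     # lower-left block:  -sigma
        gammas.append(g)
    return gammas

GAMMA = build_gammas()

def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (p^0, p^1, p^2, p^3)."""
    out = zeros()
    for mu in range(4):
        out = lincomb(1, out, ETA[mu] * p[mu], GAMMA[mu])
    return out

def sandwich(m):
    """gamma^mu m gamma_mu summed over mu, using gamma_mu = eta_{mu mu} gamma^mu."""
    out = zeros()
    for mu in range(4):
        out = lincomb(1, out, ETA[mu], matmul(matmul(GAMMA[mu], m), GAMMA[mu]))
    return out

def dot(p, k):
    """Minkowski product p.k."""
    return sum(ETA[mu] * p[mu] * k[mu] for mu in range(4))

def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

if __name__ == "__main__":
    p, k = (3, 1, -2, 5), (2, 0, 4, -1)
    ps, ks = slash(p), slash(k)
    # D = 4 form of the second and third identities in [35]
    print(equal(sandwich(ps), lincomb(-2, ps, 0, ps)))
    print(equal(sandwich(matmul(ps, ks)), lincomb(4 * dot(p, k), identity(), 0, identity())))
```

In four dimensions the $(D - 4)$ terms drop out, so the relations being tested reduce to $\gamma^\mu p\!\!\!/ \gamma_\mu = -2 p\!\!\!/$, $\gamma^\mu p\!\!\!/ k\!\!\!/ \gamma_\mu = 4 p \cdot k$ and $\gamma^\mu p\!\!\!/ k\!\!\!/ q\!\!\!/ \gamma_\mu = -2 q\!\!\!/ k\!\!\!/ p\!\!\!/$.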

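Returning to the Pfaffian formula [31]: its combinatorial content is the sum over all permutations in $S_{2k}$, weighted by $1/(2^k k!)$, of products of two-point contractions. The sketch below implements that sum literally for an antisymmetric matrix whose entry $(i, j)$ stands in for the contraction of fields $i$ and $j$. The function names and the sample matrix are hypothetical, and the brute-force sum over $(2k)!$ permutations is meant only for very small $k$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..n-1, computed from its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def pfaffian(a):
    """Pfaffian as the permutation sum in [31]:
    (1 / (2^k k!)) * sum_sigma sgn(sigma) * prod_i a[sigma(2i-1)][sigma(2i)].
    Returns 0 for odd size, matching the n = 2k - 1 case.
    """
    n = len(a)
    if n % 2:
        return Fraction(0)
    k = n // 2
    total = Fraction(0)
    for perm in permutations(range(n)):
        term = Fraction(sign(perm))
        for i in range(k):
            term *= a[perm[2 * i]][perm[2 * i + 1]]
        total += term
    return total / (2 ** k * factorial(k))

def antisymmetric(upper):
    """Build an antisymmetric matrix from a dict {(i, j): value} with i < j."""
    n = 1 + max(j for _, j in upper)
    a = [[0] * n for _ in range(n)]
    for (i, j), value in upper.items():
        a[i][j], a[j][i] = value, -value
    return a

if __name__ == "__main__":
    a = antisymmetric({(0, 1): 1, (0, 2): 2, (0, 3): 3,
                       (1, 2): 4, (1, 3): 5, (2, 3): 6})
    # For four fields the sum collapses to a01*a23 - a02*a13 + a03*a12.
    print(pfaffian(a))
```

Each distinct pairing appears $2^k k!$ times in the permutation sum with the same value, which is why the prefactor in [31] leaves exactly one signed term per complete set of contractions.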
Specific to $D = 4$ dimensions are the trace identities

$$\mathrm{tr} \, \gamma^5 = \mathrm{tr} \, \gamma^\mu \gamma^\nu \gamma^5 = 0, \qquad \mathrm{tr} \, \gamma^\mu \gamma^\nu \gamma^\lambda \gamma^\rho \gamma^5 = 4i \, \epsilon^{\mu\nu\lambda\rho} \qquad [36]$$

where $\gamma^5 := i \gamma^0 \gamma^1 \gamma^2 \gamma^3$. Finally, loop diagrams evaluated with the fermion propagator [32] require a generalization of the momentum space integral [29] given by

$$\int \frac{d^Dk}{(2\pi)^D} \, \frac{1}{(k^2 + 2k \cdot p + a^2 + i\epsilon)^r} = \frac{i(-\pi)^{D/2}}{(2\pi)^D} \, \frac{\Gamma\big(r - \frac{D}{2}\big)}{(r-1)!} \, \frac{1}{(a^2 - p^2 + i\epsilon)^{r - D/2}} \qquad [37]$$

From this formula we can extract expressions for more complicated Feynman integrals which are tensorial, that is, which contain products of momentum components $k^\mu$ in the numerators of their integrands, by differentiating [37] with respect to the external momentum $p^\mu$.

Gauge Fields

The issues we have dealt with thus far have interesting difficulties when dealing with gauge fields. We will now discuss some general aspects of the perturbation expansion of gauge theories using as prototypical examples quantum electrodynamics (QED) and quantum chromodynamics (QCD) in four spacetime dimensions.

Quantum Electrodynamics

Consider the QED Lagrangian density

$$L_{QED} = -\frac14 F_{\mu\nu} F^{\mu\nu} + \frac12 \mu^2 A_\mu A^\mu + \bar\psi (i\partial\!\!\!/ - eA\!\!\!/ - m) \psi \qquad [38]$$

where $A_\mu$ is a U(1) gauge field in 3 + 1 dimensions and $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is its field strength tensor. We have added a small mass term $\mu \to 0$ for the gauge field, which at the end of calculations should be taken to vanish in order to describe real photons (as opposed to the soft photons described by [38]). This is done in order to cure the infrared divergences generated in scattering amplitudes due to the masslessness of the photon, that is, the long-range nature of the electromagnetic interaction. The Bloch–Nordsieck theorem in QED states that infrared divergences cancel for physical processes, that is, for processes with an arbitrary number of undetectable soft photons.

Perturbation theory proceeds in the usual way via the Dyson formula, Wick's theorem, and Feynman diagrams. The gauge field propagator is given by

$$\langle 0 | T[A_\mu(x) A_\nu(y)] | 0 \rangle = \langle x | \big( \eta_{\mu\nu} (\Box + \mu^2) - \partial_\mu \partial_\nu \big)^{-1} | y \rangle = i \int \frac{d^4p}{(2\pi)^4} \, \frac{-\eta_{\mu\nu} + p_\mu p_\nu / \mu^2}{p^2 - \mu^2 + i\epsilon} \, e^{-ip \cdot (x - y)} \qquad [39]$$

and is represented by a wavy line. The fermion–fermion–photon vertex is

$$\big[\text{fermion–fermion–photon vertex with photon index } \mu\big] = -ie \, \gamma^\mu \qquad [40]$$

An incoming (resp. outgoing) soft photon of momentum $k$ and polarization $r$ is described by the wave function $e_\mu^{(r)}(k)$ (resp. $e_\mu^{(r)}(k)^*$), where the polarization vectors $e_\mu^{(r)}(k)$, $r = 1, 2, 3$ solve the vector field wave equation $(\Box + \mu^2) A_\mu = \partial^\mu A_\mu = 0$ and obey the orthonormality and completeness conditions

$$e^{(r)}(k)^* \cdot e^{(s)}(k) = \delta^{rs}$$
$$\sum_{r=1}^{3} e_\mu^{(r)}(k) \, e_\nu^{(r)}(k)^* = -\eta_{\mu\nu} + \frac{k_\mu k_\nu}{\mu^2} \qquad [41]$$

along with $k \cdot e^{(r)}(k) = 0$. All vector indices are contracted along the lines of the Feynman graph. All other Feynman rules are as previously.

Quantum Chromodynamics

Consider nonabelian gauge theory in 3 + 1 dimensions minimally coupled to a set of fermion fields $\psi^A$, $A = 1, \dots, N_f$, each transforming in the fundamental representation of the gauge group $G$ whose generators $T^a$ satisfy the commutation relations $[T^a, T^b] = f^{abc} T^c$. The Lagrangian density is given by

$$L_{QCD} = -\frac14 F^a_{\mu\nu} F^{a\,\mu\nu} + \frac{1}{2\alpha} \big( \partial^\mu A^a_\mu \big)^2 + \partial^\mu \bar c \, D_\mu c + \sum_{A=1}^{N_f} \bar\psi^A (iD\!\!\!\!/ - m_A) \psi^A \qquad [42]$$

where $F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + f^{abc} A^b_\mu A^c_\nu$ and $D_\mu = \partial_\mu + ie R(T^a) A^a_\mu$, with $R$ the pertinent representation of $G$ ($R(T^a)_{bc} = f^a{}_{bc}$ for the adjoint representation and $R(T^a) = T^a$ for the fundamental representation). The first term is the Yang–Mills Lagrangian density, the second term is the covariant gauge-fixing term, and the third term contains the Faddeev–Popov ghost fields $c$ which transform in the adjoint representation of the gauge group.

Feynman rules are straightforward to write down and are given in Figure 1 where wavy lines

represent gluons and dashed lines represent ghosts. Feynman rules for the fermions are exactly as before, except that now the vertex [40] is multiplied by the color matrix $T^a$. All color indices are contracted along the lines of the Feynman graph. Color factors may be simplified by using the identities

$$\mathrm{Tr} \, R^a R^b = \frac{\dim R}{\dim G} \, C_2(R) \, \delta^{ab}, \qquad R^a R^a = C_2(R)$$
$$R^a R^b R^a = \Big( C_2(R) - \frac12 C_2(G) \Big) R^b \qquad [43]$$

where $R^a := R(T^a)$ and $C_2(R)$ is the quadratic Casimir invariant of the representation $R$ (with value $C_2(G)$ in the adjoint representation). For $G = SU(N)$, one has $C_2(G) = N$ and $C_2(N) = (N^2 - 1)/2N$ for the fundamental representation.

Figure 1 Feynman rules.
Gluon propagator (indices $a,\mu$ and $b,\nu$, momentum $k$): $-i \, \delta^{ab} \, \dfrac{\eta_{\mu\nu} + (\alpha - 1) k_\mu k_\nu / k^2}{k^2 + i\epsilon}$
Three-gluon vertex (legs $a,\mu,p$; $b,\nu,q$; $c,\lambda,k$): $e f^{abc} \big[ \eta_{\lambda\mu} (p_\nu - k_\nu) + \eta_{\mu\nu} (q_\lambda - p_\lambda) + \eta_{\nu\lambda} (k_\mu - q_\mu) \big]$
Four-gluon vertex (legs $a,\mu$; $b,\nu$; $c,\lambda$; $d,\rho$): $-i e^2 \big[ f^{eab} f^{ecd} (\eta_{\mu\lambda} \eta_{\nu\rho} - \eta_{\mu\rho} \eta_{\nu\lambda}) + f^{eac} f^{ebd} (\eta_{\mu\nu} \eta_{\lambda\rho} - \eta_{\mu\rho} \eta_{\nu\lambda}) + f^{ead} f^{ebc} (\eta_{\mu\nu} \eta_{\lambda\rho} - \eta_{\nu\rho} \eta_{\mu\lambda}) \big]$
Ghost propagator (indices $a$ and $b$, momentum $k$): $\dfrac{i \, \delta^{ab}}{k^2 + i\epsilon}$
Ghost–ghost–gluon vertex (legs $a,\mu,p$; $b,q$; $c,k$): $e \, k_\mu f^{abc}$

The cancellation of infrared divergences in loop amplitudes of QCD is far more delicate than in QED, as there is no analog of the Bloch–Nordsieck theorem in this case. The Kinoshita–Lee–Nauenberg theorem guarantees that, at the end of any perturbative calculation, these divergences must cancel for any appropriately defined physical quantity. However, at a given order of perturbation theory, a physical quantity typically involves both virtual and real emission contributions that are separately infrared divergent. Already at two-loop level these divergences have a highly intricate structure. Their precise form is specified by the Catani color-space factorization formula, which also provides an efficient way of organizing amplitudes into divergent parts, which ultimately drop out of physical quantities, and finite contributions.

The computation of multigluon amplitudes in nonabelian gauge theory is rather complicated when one uses polarization states of vector bosons. A much more efficient representation of amplitudes is provided by adopting a helicity (or circular polarization) basis for external gluons. In the spinor–helicity formalism, one expresses positive and negative-helicity polarization vectors in terms of massless Weyl spinors $|k^\pm\rangle := \frac12 (1 \pm \gamma^5) u_k = \frac12 (1 \mp \gamma^5) v_k$ through

$$e^\pm_\mu(k; q) = \pm \frac{\langle q^\mp | \gamma_\mu | k^\mp \rangle}{\sqrt{2} \, \langle q^\mp | k^\pm \rangle} \qquad [44]$$

where $q$ is an arbitrary null reference momentum which drops out of the final gauge-invariant amplitudes. The spinor products are crossing symmetric, antisymmetric in their arguments, and satisfy the identities

$$\langle k_i^- | k_j^+ \rangle \langle k_j^+ | k_i^- \rangle = 2 k_i \cdot k_j$$
$$\langle k_i^- | k_j^+ \rangle \langle k_l^- | k_r^+ \rangle = \langle k_i^- | k_r^+ \rangle \langle k_l^- | k_j^+ \rangle + \langle k_i^- | k_l^+ \rangle \langle k_j^- | k_r^+ \rangle \qquad [45]$$

Any amplitude with massless external fermions and vector bosons can be expressed in terms of spinor products. Conversely, the spinor products offer the most compact representation of helicity amplitudes which can be related to more conventional amplitudes described in terms of Lorentz invariants. For loop amplitudes, one uses a dimensional regularization scheme in which all helicity states are kept four dimensional and only internal loop momenta are continued to $D = 4 + \epsilon$ dimensions.

Computing Loop Integrals

At the very heart of perturbative quantum field theory is the problem of computing Feynman integrals for multiloop scattering amplitudes. The integrations typically involve serious technical challenges and for the most part are intractable by straightforward analytical means. We will now survey some of the computational techniques that have been developed for calculating quantum loop amplitudes which arise in the field theories considered previously.

Asymptotic Expansion

In many physical instances one is interested in scattering amplitudes in certain kinematical limits. In this case one may perform an asymptotic expansion of multiloop diagrams whose coefficients are typically nonanalytic functions of the perturbative expansion parameter $\hbar$. The main simplification which arises comes from the fact that the expansions are done before any momentum integrals are evaluated. In the limits of interest, Taylor series expansions in different selected regions of each loop momentum can be interpreted in terms of subgraphs and co-subgraphs of the original Feynman diagram.

Consider a Feynman diagram $D$ which depends on a collection $\{Q_i\}$ of large momenta (or masses), and a collection $\{m_i, q_i\}$ of small masses and momenta. The prescription for the large-momentum asymptotic expansion of $D$ may be summarized in the diagrammatic formula

$$\lim_{Q \to \infty} D(Q; m, q) = \sum_{d \subseteq D} (D/d)(m, q) \star \big( T_{\{m_d, q_d\}} d \big)(Q; m_d, q_d) \qquad [46]$$

where the sum runs through all subgraphs $d$ of $D$ which contain all vertices where a large momentum enters or leaves the graph and is one-particle irreducible after identifying these vertices. The operator $T_{\{m_d, q_d\}}$ performs a Taylor series expansion before any integration is carried out, and the notation $(D/d) \star (T_{\{m_d, q_d\}} d)$ indicates that the subgraph $d \subseteq D$ is replaced by its Taylor expansion in all masses and external momenta of $d$ that do not belong to the set $\{Q_i\}$. The external momenta of $d$ which become loop momenta in $D$ are also considered to be small. The loop integrations are then performed only after all these expansions have been carried out. The diagrams $D/d$ are called co-subgraphs.

Figure 2 Asymptotic expansion of the two-loop double bubble diagram.

The subgraphs become massless integrals in which the scales are set by the large momenta. For instance, in the simplest case of a single large momentum $Q$ one is left with integrals over propagators. The co-subgraphs may contain small external momenta and masses, but the resulting integrals are typically much simpler than the original one. A similar formula is true for large-mass expansions, with the vertex conditions on subdiagrams replaced by propagator conditions. For example, consider the asymptotic expansion of the two-loop double bubble diagram (Figure 2) in the region $q^2 \gg m^2$, where $m$ is the mass of the inner loop. The subgraphs (to the right of the stars) are expanded in all external momenta including $q$ and reinserted into the fat vertices of the co-subgraphs (to the left of the stars). Once such asymptotic expansions are carried out, one may attempt to reconstruct as much information as possible about the given scattering amplitude

by using the method of Padé approximation which requires knowledge of only part of the expansion of the diagram. By construction, the Padé approximation has the same analytic properties as the exact amplitude.

Brown–Feynman Reduction

When considering loop diagrams which involve fermions or gauge bosons, one encounters tensorial Feynman integrals. When these involve more than three distinct denominator factors (propagators), they require more than two Feynman parameters for their evaluation and become increasingly complicated. The Brown–Feynman method simplifies such higher-rank integrals and effectively reduces them to scalar integrals which typically require fewer Feynman parameters for their evaluation.

To illustrate the idea behind this method, consider the one-loop rank-3 tensor Feynman integral

$$J^{\mu\nu\lambda} = \int \frac{d^Dk}{(2\pi)^D} \, \frac{k^\mu k^\nu k^\lambda}{k^2 (k^2 - \mu^2)(q - k)^2 \big( (k - q)^2 + \mu^2 \big)(k^2 + 2k \cdot p)} \qquad [47]$$

where $p$ and $q$ are external momenta with the mass-shell conditions $p^2 = (p - q)^2 = m^2$. By Lorentz invariance, the general structure of the integral [47] will be of the form

$$J^{\mu\nu\lambda} = a^{\mu\nu} p^\lambda + b^{\mu\nu} q^\lambda + c^\mu s^{\nu\lambda} + c^\nu s^{\mu\lambda} \qquad [48]$$

where $a^{\mu\nu}$, $b^{\mu\nu}$ are tensor-valued functions and $c^\mu$ a vector-valued function of $p$ and $q$. The symmetric tensor $s^{\mu\nu}$ is chosen to project out components of vectors transverse to both $p$ and $q$, i.e., $p_\mu s^{\mu\nu} = q_\mu s^{\mu\nu} = 0$, with the normalization $s^\mu{}_\mu = D - 2$. Solving these constraints leads to the explicit form

$$s^{\mu\nu} = \eta^{\mu\nu} - \frac{m^2 q^\mu q^\nu + q^2 p^\mu p^\nu - (p \cdot q)(q^\mu p^\nu + p^\mu q^\nu)}{m^2 q^2 - (p \cdot q)^2} \qquad [49]$$

To determine the as yet unknown functions $a^{\mu\nu}$, $b^{\mu\nu}$ and $c^\mu$ above, we first contract both sides of the decomposition [48] with $p$ and $q$ to get

$$2 p_\lambda J^{\mu\nu\lambda} = 2 m^2 a^{\mu\nu} + 2 (p \cdot q) \, b^{\mu\nu}$$
$$2 q_\lambda J^{\mu\nu\lambda} = 2 (p \cdot q) \, a^{\mu\nu} + 2 q^2 b^{\mu\nu} \qquad [50]$$

Inside the integrand of [47], we then use the trivial identities

$$2k \cdot p = (k^2 + 2k \cdot p) - k^2$$
$$2q \cdot k = k^2 + q^2 - (k - q)^2 \qquad [51]$$

to write the left-hand sides of [50] as the sum of rank-2 Feynman integrals which, with the exception of the one multiplied by $q^2$ from [51], have one less denominator factor. This formally determines the coefficients $a^{\mu\nu}$ and $b^{\mu\nu}$ in terms of a set of rank-2 integrations. The vector function $c^\mu$ is then found from the contraction

$$J^{\mu\nu}{}_\nu = p_\nu a^{\mu\nu} + q_\nu b^{\mu\nu} + (D - 2) \, c^\mu \qquad [52]$$

This contraction eliminates the $k^2$ denominator factor in the integrand of [47] and produces a vector-valued integral. Solving the system of algebraic equations [50] and [52] then formally determines the rank-3 Feynman integral [47] in terms of rank-1 and rank-2 Feynman integrals. The rank-2 Feynman integrals thus generated can then be evaluated in the same way by writing a decomposition for them analogous to [48] and solving for them in terms of vector-valued and scalar-valued Feynman integrals. Finally, the rank-1 integrations can be solved for in terms of a set of scalar-valued integrals, most of which have fewer denominator factors in their integrands.

Generally, any one-loop amplitude can be reduced to a set of basic integrals by using the Passarino–Veltman reduction technique. For example, in supersymmetric amplitudes of gluons any tensor Feynman integral can be reduced to a set of scalar integrals, that is, Feynman integrals in a scalar field theory with a massless particle circulating in the loop, with rational coefficients. In the case of $N = 4$ supersymmetric Yang–Mills theory, only scalar box integrals appear.

Reduction to Master Integrals

While the Brown–Feynman and Passarino–Veltman reductions are well suited for dealing with one-loop diagrams, they become rather cumbersome for higher-loop computations. There are other more powerful methods for reducing general tensor integrals into a basis of known integrals called master integrals. Let us illustrate this technique on a scalar example. Any scalar massless two-loop Feynman integral can be brought into the form

$$I(p) = \int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D} \prod_{j=1}^{t} \Delta_j^{-l_j} \prod_{i=1}^{q} \Sigma_i^{n_i} \qquad [53]$$

where $\Delta_j$ are massless scalar propagators depending on the loop momenta $k$, $k'$ and the external momenta $p_1, \dots, p_n$, and $\Sigma_i$ are scalar products of a loop momentum with an external momentum or of the two loop momenta. The topology of the corresponding Feynman diagram is uniquely determined by specifying the set $\Delta_1, \dots, \Delta_t$ of $t$ distinct

propagators in the graph, while the integral itself is specified by the powers $l_j \geq 1$ of all propagators, by the selection $\Sigma_1, \dots, \Sigma_q$ of $q$ scalar products and by their powers $n_i \geq 0$.

The integrals in a class of diagrams of the same topology with the same denominator dimension $r = \sum_j l_j$ and same total scalar product number $s = \sum_i n_i$ are related by various identities. One class follows from the fact that the integral over a total derivative with respect to any loop momentum vanishes in dimensional regularization as

$$\int \frac{d^Dk}{(2\pi)^D} \, \frac{\partial J(k)}{\partial k^\mu} = 0$$

where $J(k)$ is any tensorial combination of propagators, scalar products and loop momenta. The resulting relations are called integration-by-parts identities and for two-loop integrals can be cast into the form

$$\int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D} \, v^\mu \frac{\partial f(k, k', p)}{\partial k^\mu} = 0 = \int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D} \, v^\mu \frac{\partial f(k, k', p)}{\partial k'^\mu} \qquad [54]$$

where $f(k, k', p)$ is a scalar function containing propagators and scalar products, and $v^\mu$ is any internal or external momentum. For a graph with $\ell$ loops and $n$ independent external momenta, this results in a total of $\ell(n + \ell)$ relations.

In addition to these identities, one can also exploit the fact that all Feynman integrals [53] are Lorentz scalars. Under an infinitesimal Lorentz transformation $p^\mu \to p^\mu + \delta p^\mu$, with $\delta p^\mu = p^\nu \, \delta\epsilon^\mu{}_\nu$, $\delta\epsilon^\mu{}_\nu = -\delta\epsilon^\nu{}_\mu$, one has the invariance condition $I(p + \delta p) = I(p)$, which leads to the linear homogeneous differential equations

$$\sum_{i=1}^{n} \Big( p_i^\nu \frac{\partial}{\partial p_{i\mu}} - p_i^\mu \frac{\partial}{\partial p_{i\nu}} \Big) I(p) = 0 \qquad [55]$$

This equation can be contracted with all possible antisymmetric combinations of $p_{i\mu} p_{j\nu}$ to yield linearly independent Lorentz invariance identities for [53].

Using these two sets of identities, one can either obtain a reduction of integrals of the type [53] to those corresponding to a small number of simpler diagrams of the same topology and diagrams of simpler topology (fewer denominator factors), or a complete reduction to diagrams with simpler topology. The remaining integrals of the topology under consideration are called irreducible master integrals. These momentum integrals cannot be further reduced and have to be computed by different techniques. For instance, one can apply a Mellin–Barnes transformation of all propagators given by

$$\frac{1}{(k^2 + a)^l} = \frac{1}{(l - 1)!} \int_{-i\infty}^{i\infty} \frac{dz}{2\pi i} \, \frac{a^z}{(k^2)^{l + z}} \, \Gamma(l + z) \, \Gamma(-z) \qquad [56]$$

where the contour of integration is chosen to lie to the right of the poles of the Euler function $\Gamma(l + z)$ and to the left of the poles of $\Gamma(-z)$ in the complex $z$-plane. Alternatively, one may apply the negative-dimension method in which $D$ is regarded as a negative integer in intermediate calculations and the problem of loop integration is replaced with that of handling infinite series. When combined with the above methods, it may be used to derive powerful recursion relations among scattering amplitudes. Both of these techniques rely on an explicit integration over the loop momenta of the graph, their differences occurring mainly in the representations used for the propagators.

The procedure outlined above can also be used to reduce a tensor Feynman integral to scalar integrals, as in the Brown–Feynman and Passarino–Veltman reductions. The tensor integrals are expressed as linear combinations of scalar integrals of either higher dimension or with propagators raised to higher powers. The projection onto a tensor basis takes the form [53] and can thus be reduced to master integrals.

String Theory Methods

The realization of field theories as the low-energy limits of string theory provides a number of powerful tools for the calculation of multiloop amplitudes. They may be used to provide sets of diagrammatic computational rules, and they also work well for calculations in quantum gravity. In this final part we shall briefly sketch the insights into perturbative quantum field theory that are provided by techniques borrowed from string theory.

String Theory Representation

String theory provides an efficient compact representation of scattering amplitudes. At each loop order there is only a single closed string diagram, which includes within it all Feynman graphs along with the contributions of the infinite tower of massive string excitations. Schematically, at one-loop order, the situation is as shown in Figure 3. The terms arising from the heavy string modes are removed by taking the low-energy limit in which all external momenta lie well below the energy scale set by the string tension. This limit picks out the regions of integration in the string diagram corresponding to particle-like graphs, but with different diagrammatic rules.
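The integration-by-parts mechanism described under Reduction to Master Integrals can be illustrated on the simplest one-loop case, which is an assumption chosen here for brevity and is not the two-loop setting of [53]. Taking the Euclidean tadpole $I(r) = \int d^Dk \, (k^2 + a^2)^{-r}$ of [29] and the total derivative $\partial/\partial k^\mu \, [k^\mu (k^2 + a^2)^{-r}]$ gives the recursion $(D - 2r) I(r) + 2 r a^2 I(r + 1) = 0$, so every $I(r)$ reduces to a single master integral. The Python sketch below checks this recursion against the closed form [29]; the function names are hypothetical.

```python
import math

def tadpole(D, r, a2):
    """Right-hand side of [29]: pi^(D/2) (a^2)^(D/2 - r) Gamma(r - D/2) / Gamma(r).

    For integer r, Gamma(r) = (r - 1)!. D may be non-integer, which mimics
    the analytic continuation in the dimension.
    """
    return (math.pi ** (D / 2) * a2 ** (D / 2 - r)
            * math.gamma(r - D / 2) / math.gamma(r))

def ibp_residual(D, r, a2):
    """(D - 2r) I(r) + 2 r a^2 I(r + 1), which the total-derivative identity sets to zero."""
    return (D - 2 * r) * tadpole(D, r, a2) + 2 * r * a2 * tadpole(D, r + 1, a2)

def reduce_to_master(D, r, a2):
    """Express I(r) through the master integral I(1) by iterating the recursion."""
    value = tadpole(D, 1, a2)
    for s in range(1, r):
        value *= (2 * s - D) / (2 * s * a2)   # I(s + 1) = (2s - D) / (2 s a^2) * I(s)
    return value

if __name__ == "__main__":
    D, a2 = 3.7, 1.5
    for r in (1, 2, 3, 4):
        print(r, tadpole(D, r, a2), reduce_to_master(D, r, a2), ibp_residual(D, r, a2))
```

In this toy topology there is a single master integral, $I(1)$; the multiloop reductions described in the text solve much larger linear systems of the same kind.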

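The Mellin–Barnes representation [56] can also be checked in a simple regime. For $|a| < k^2$ the contour may be closed to the right, picking up the poles of $\Gamma(-z)$ at $z = 0, 1, 2, \dots$; the resulting sum of residues is the binomial series of $(k^2 + a)^{-l}$. The sketch below sums that series numerically. The function name is hypothetical, and the restriction to $|a| < k^2$ and positive integer $l$ is an assumption made so that the series converges.

```python
def mellin_barnes_series(k2, a, l, terms=200):
    """Sum over residues of [56] at z = n = 0, 1, 2, ... (valid for |a| < k2).

    The n-th residue contributes
        Gamma(l + n) / (n! (l - 1)!) * (-a / k2)^n / k2^l,
    and the coefficient is built up recursively to avoid large factorials.
    """
    ratio = -a / k2
    coeff, power, total = 1.0, 1.0, 0.0
    for n in range(terms):
        total += coeff * power
        coeff *= (l + n) / (n + 1)        # Gamma(l+n+1)/(n+1)! from Gamma(l+n)/n!
        power *= ratio
    return total / k2 ** l

if __name__ == "__main__":
    k2, a = 3.0, 1.2
    for l in (1, 2, 3):
        print(l, mellin_barnes_series(k2, a, l), 1.0 / (k2 + a) ** l)
```

For $|a| > k^2$ one would instead close the contour to the left over the poles of $\Gamma(l + z)$, which produces an expansion in $k^2/a$; that case is not covered by this sketch.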
Figure 3 String theory representation at one-loop order.

Given these rules, one may formulate a purely field-theoretic framework which reproduces them. In the case of QCD, a key ingredient is the use of a special gauge originally derived from the low-energy limit of tree-level string amplitudes. This is known as the Gervais–Neveu gauge and it is defined by the gauge-fixing Lagrangian density

$$L_{GN} = -\frac12 \, \mathrm{Tr} \Big( \partial_\mu A^\mu - \frac{ie}{\sqrt{2}} A_\mu A^\mu \Big)^2 \qquad [57]$$

This gauge choice simplifies the color factors that arise in scattering amplitudes. The string theory origin of gauge theory amplitudes is then most closely mimicked by combining this gauge with the background field gauge, in which one decomposes the gauge field into a classical background field and a fluctuating quantum field as $A_\mu = A_\mu^{cl} + A_\mu^{qu}$, and imposes the gauge-fixing condition $D_\mu^{cl} A^{qu\,\mu} = 0$, where $D_\mu^{cl}$ is the background field covariant derivative evaluated in the adjoint representation of the gauge group. This hybrid gauge is well suited for computing the effective action, with the quantum part describing gluons propagating around loops and the classical part describing gluons emerging from the loops. The leading loop momentum behavior of one-particle irreducible graphs with gluons in the loops is very similar to that of graphs with scalar fields in the loops.

Supersymmetric Decomposition

String theory also suggests an intimate relationship with supersymmetry. For example, at tree level, QCD is effectively supersymmetric because a multigluon tree amplitude contains no fermion loops, and so the fermions may be taken to lie in the adjoint representation of the gauge group. Thus, pure gluon tree amplitudes in QCD are identical to those in supersymmetric Yang–Mills theory. They are connected by supersymmetric Ward identities to amplitudes with fermions (gluinos) which drastically simplify computations. In supersymmetric gauge theory, these identities hold to all orders of perturbation theory.

At one-loop order and beyond, QCD is not supersymmetric. However, one can still perform a supersymmetric decomposition of a QCD amplitude for which the supersymmetric components of the amplitude obey the supersymmetric Ward identities. Consider, for example, a one-loop multigluon scattering amplitude. The contribution from a fermion propagating in the loop can be decomposed into the contribution of a complex scalar field in the loop plus a contribution from an $N = 1$ chiral supermultiplet consisting of a complex scalar field and a Weyl fermion. The contribution from a gluon circulating in the loop can be decomposed into contributions of a complex scalar field, an $N = 1$ chiral supermultiplet, and an $N = 4$ vector supermultiplet comprising three complex scalar fields, four Weyl fermions and one gluon all in the adjoint representation of the gauge group. This decomposition assumes the use of a supersymmetry-preserving regularization.

The supersymmetric components have important cancellations in their leading loop momentum behavior. For instance, the leading large loop momentum power in an $n$-point 1PI graph is reduced from $|k|^n$ down to $|k|^{n-2}$ in the $N = 1$ amplitude. Such a reduction can be extended to any amplitude in supersymmetric gauge theory and is related to the improved ultraviolet behavior of supersymmetric amplitudes. For the $N = 4$ amplitude, further cancellations reduce the leading power behavior all the way down to $|k|^{n-4}$. In dimensional regularization, $N = 4$ supersymmetric loop amplitudes have a very simple analytic structure owing to their origins as the low-energy limits of superstring scattering amplitudes. The supersymmetric Ward identities in this way can be used to provide identities among the nonsupersymmetric contributions. For example, in $N = 1$ supersymmetric Yang–Mills theory one can deduce that fermion and gluon loop contributions are equal and opposite for multigluon amplitudes with maximal helicity violation.

Scattering Amplitudes in Twistor Space

The scattering amplitude in QCD with $n$ incoming gluons of the same helicity vanishes, as does the amplitude with $n - 1$ incoming gluons of one helicity and one gluon of the opposite helicity for $n \geq 3$. The first nonvanishing amplitudes are the maximal helicity violating (MHV) amplitudes involving $n - 2$ gluons of one helicity and two gluons of the opposite helicity. Stripped of the momentum conservation delta-function and the group theory factor, the tree-level amplitude for a pair of gluons $r$, $s$ of negative helicity is given by

$$A(k) = e^{n-2} \, \langle k_r^- | k_s^+ \rangle^4 \prod_{i=1}^{n} \frac{1}{\langle k_i^- | k_{i+1}^+ \rangle} \qquad [58]$$

This amplitude depends only on the holomorphic (negative chirality) Weyl spinors. The full MHV amplitude (with the momentum conservation delta-function) is invariant under the conformal group $SO(4, 2) \cong SU(2, 2)$ of four-dimensional
Minkowski space. After a Fourier transformation of the positive-chirality components, the complexification SL(4, C) has an obvious four-dimensional representation acting on the positive- and negative-chirality spinor products. This representation space is isomorphic to $\mathbb{C}^4$ and is called twistor space. Its elements are called twistors.
Wave functions and amplitudes have a known behavior under the $\mathbb{C}^*$-action which rescales twistors, giving the projective twistor space $\mathbb{CP}^3$ or $\mathbb{RP}^3$ according to whether the twistors are complex valued or real valued. The Fourier transformation to twistor space yields (due to momentum conservation) the localization of an MHV amplitude to a genus-0 holomorphic curve $\mathbb{CP}^1$ of degree 1 in $\mathbb{CP}^3$ (or to a real line $\mathbb{RP}^1 \subset \mathbb{RP}^3$). It is conjectured that, generally, an ℓ-loop amplitude with p gluons of positive helicity and q gluons of negative helicity is supported on a holomorphic curve in twistor space of degree q + ℓ − 1 and genus ≤ ℓ. The natural interpretation of this curve is as the world sheet of a string. The perturbative gauge theory may then be described in terms of amplitudes arising from the couplings of gluons to a string. This twistor string theory is a topological string theory which gives the appropriate framework for understanding the twistor properties of scattering amplitudes. This framework has been used to analyze MHV tree diagrams and one-loop N = 4 supersymmetric amplitudes of gluons.

See also: Constructive Quantum Field Theory; Dispersion Relations; Effective Field Theories; Gauge Theories from Strings; Hopf Algebra Structure of Renormalizable Quantum Field Theory; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Renormalization: General Theory; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Stationary Phase Approximation; Supersymmetric Particle Models.

Further Reading

Bern Z, Dixon L, and Kosower DA (1996) Progress in one-loop QCD computations. Annual Review of Nuclear and Particle Science 46: 109–148.
Brown LM and Feynman RP (1952) Radiative corrections to Compton scattering. Physical Review 85: 231–244.
Cachazo F and Svrček P (2005) Lectures on twistor strings and perturbative Yang–Mills theory. Preprint arXiv:hep-th/0504194.
Catani S (1998) The singular behaviour of QCD amplitudes at two-loop order. Physics Letters B 427: 161–171.
Chetyrkin KG and Tkachov FV (1981) Integration by parts: the algorithm to calculate beta functions in 4 loops. Nuclear Physics B 192: 159–204.
Dixon L (1996) Calculating scattering amplitudes efficiently. In: Soper DE (ed.) QCD and Beyond, pp. 539–584. Singapore: World Scientific.
Fleischer J, Jegerlehner F, and Tarasov OV (2000) Algebraic reduction of one-loop Feynman graph amplitudes. Nuclear Physics B 566: 423–440.
Gehrmann T and Remiddi E (2000) Differential equations for two-loop four-point functions. Nuclear Physics B 580: 485–518.
Halliday IG and Ricotta RM (1987) Negative dimensional integrals 1: Feynman graphs. Physics Letters B 193: 241–246.
Itzykson C and Zuber J-B (1980) Quantum Field Theory. New York: McGraw-Hill.
Mangano ML and Parke SJ (1991) Multiparton amplitudes in gauge theories. Physics Reports 200: 301–367.
Magnen J and Seneor R (1979) Expansion and summability methods in constructive field theory. Mathematical Problems in Theoretical Physics, Proc. Lausanne 1979: 217–223.
Passarino G and Veltman MJG (1979) One-loop corrections for $e^+ e^-$ annihilation into $\mu^+ \mu^-$ in the Weinberg model. Nuclear Physics B 160: 151–207.
Peskin ME and Schroeder DV (1995) An Introduction to Quantum Field Theory. Reading, MA: Addison-Wesley.
Pokorski S (1987) Gauge Field Theories. Cambridge: Cambridge University Press.
Steinhauser M (2002) Results and techniques of multiloop calculations. Physics Reports 364: 247–357.
Steinmann O (1995) Axiomatic approach to perturbative quantum field theory. Annales de Poincare en Physique Theorique. 63: 399–409.
Weinberg S (1995) The Quantum Theory of Fields, Volume 1: Foundations. Cambridge: Cambridge University Press.
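
The kinematic structure of the MHV expression [58] in the section on scattering amplitudes in twistor space can be explored numerically. The following is a minimal sketch, not the article's own construction: it assumes one common convention for the two-component spinor of a lightlike momentum, omits the coupling factor $e^{n-2}$, and uses arbitrary lightlike vectors without imposing momentum conservation. All function names (`lightlike`, `spinor`, `angle`, `mhv_kinematic_factor`) are hypothetical and chosen only for illustration.

```python
import math


def lightlike(px, py, pz):
    """Build a massless four-momentum (k0, k1, k2, k3) from a spatial momentum."""
    return (math.sqrt(px * px + py * py + pz * pz), px, py, pz)


def spinor(k):
    """Two-component spinor of a lightlike momentum (assumed convention; needs k0 + k3 > 0)."""
    root = math.sqrt(k[0] + k[3])
    return (root, complex(k[1], k[2]) / root)


def angle(ki, kj):
    """Antisymmetric spinor product <ki, kj> built from the two spinors."""
    a, b = spinor(ki), spinor(kj)
    return a[0] * b[1] - a[1] * b[0]


def minkowski_dot(p, q):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def mhv_kinematic_factor(momenta, r, s):
    """<k_r, k_s>^4 / prod_i <k_i, k_{i+1}> with cyclic labels; coupling factor omitted."""
    n = len(momenta)
    denominator = 1
    for i in range(n):
        denominator *= angle(momenta[i], momenta[(i + 1) % n])
    return angle(momenta[r], momenta[s]) ** 4 / denominator


if __name__ == "__main__":
    ks = [lightlike(1, 2, 2), lightlike(-2, 1, 2), lightlike(2, -1, 3), lightlike(0, 3, 4)]
    print(abs(angle(ks[0], ks[1])) ** 2, 2 * minkowski_dot(ks[0], ks[1]))
    print(mhv_kinematic_factor(ks, 0, 1))
```

In this convention the squared modulus of a spinor product equals twice the Minkowski product of the two momenta, which is why the expression depends on the momenta only through the spinors; the first printed pair compares the two quantities.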

Perturbative Renormalization Theory and BRST


K Fredenhagen, Universität Hamburg, Hamburg, Germany
M Dütsch, Universität Zürich, Zürich, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Main Problems in the Perturbative Quantization of Gauge Theories

Gauge theories are field theories in which the basic fields are not directly observable. Field configurations yielding the same observables are connected by a gauge transformation. In the classical theory, the Cauchy problem is well posed for the observables, but in general not for the nonobservable gauge-variant basic fields, due to the existence of time-dependent gauge transformations.
Attempts to quantize the gauge-invariant objects directly have not yet been completely satisfactory. Instead, one modifies the classical action by adding a gauge-fixing term such that standard techniques of perturbative quantization can be applied and such that the dynamics of the gauge-invariant classical fields is not changed. In perturbation theory, this
problem shows up already in the quantization of the free gauge fields (see the section "Quantization of free gauge fields"). In the final (interacting) theory the physical quantities should be independent of how the gauge fixing is done ("gauge independence").
Traditionally, the quantization of gauge theories is mostly analyzed in terms of path integrals (e.g., by Faddeev and Popov), where some parts of the arguments are only heuristic. In the original treatment of Becchi, Rouet, and Stora (cf. also Tyutin) (which is called "BRST-quantization"), a restriction to purely massive theories was necessary; the generalization to the massless case by Lowenstein's method is cumbersome.
The BRST quantization is based on earlier work of Feynman, Faddeev, and Popov (introduction of "ghost fields"), and of Slavnov. The basic idea is that after adding a term to the Lagrangian which makes the Cauchy problem well posed but which is not gauge-invariant, one enlarges the number of fields by infinitesimal gauge transformations ("ghosts") and their duals ("anti-ghosts"). One then adds a further term to the Lagrangian which contains a coupling of the anti-ghosts and ghosts. The BRST transformation acts as an infinitesimal gauge transformation on the original fields and on the gauge transformations themselves and maps the anti-ghosts to the gauge-fixing terms. This is done in such a way that the total Lagrangian is invariant and that the BRST transformation is nilpotent.
The hard problem in the perturbative construction of gauge theories is to show that BRST symmetry can be maintained during renormalization (see the section on perturbative renormalization). By means of the "quantum action principle" of Lowenstein (1971) and Lam (1972, 1973) a cohomological classification of anomalies was worked out (an overview is given, e.g., in the book of Piguet and Sorella (1995)). For more details, see BRST Quantization.
The BRST quantization can be carried out in a transparent way in the framework of algebraic quantum field theory (AQFT, see Algebraic Approach to Quantum Field Theory). The advantage of this formulation is that it allows one to separate the three main problems of perturbative gauge theories:
1. the elimination of unphysical degrees of freedom,
2. positivity (or "unitarity"), and
3. the problem of infrared divergences.
In AQFT, the procedure is the following: starting from an algebra of all local fields, including the unphysical ones, one shows that after perturbative quantization the algebra admits the BRST transformation as a graded nilpotent derivation. The algebra of observables is then defined as the cohomology of the BRST transformation. To solve the problem of positivity, one has to show that the algebra of observables, in contrast to the algebra of all fields, has a nontrivial representation on a Hilbert space. Finally, one can attack the infrared problem by investigating the asymptotic behavior of states. The latter problem is nontrivial even in quantum electrodynamics (since an electron is accompanied by a "cloud of soft photons") and may be related to confinement in quantum chromodynamics.
The method of BRST quantization is by no means restricted to gauge theories, but applies to general constrained systems. In particular, massive vector fields, where the masses are usually generated by the Higgs mechanism, can alternatively be treated directly by the BRST formalism, in close analogy to the massless case (cf. the section on quantization of free gauge fields).

Local Operator BRST Formalism

In AQFT, the principal object is the family of operator algebras $\mathcal{O} \to \mathcal{A}(\mathcal{O})$ (where $\mathcal{O}$ runs, e.g., through all double cones in Minkowski space), which fulfills the Haag–Kastler axioms (cf. Algebraic Approach to Quantum Field Theory). To construct these algebras, one considers the algebras $\mathcal{F}(\mathcal{O})$ generated by all local fields including ghosts $u$ and anti-ghosts $\tilde{u}$. Ghosts and anti-ghosts are scalar fermionic fields. The algebra gets a $\mathbb{Z}_2$ grading with respect to even and odd ghost numbers, where ghosts get ghost number +1 and anti-ghosts ghost number −1. The BRST transformation $s$ acts on these algebras as a $\mathbb{Z}_2$-graded derivation with $s^2 = 0$, $s(\mathcal{F}(\mathcal{O})) \subset \mathcal{F}(\mathcal{O})$, and $s(F^*) = -(-1)^{\delta_F} s(F)^*$, $\delta_F$ denoting the ghost number of $F$.
The observables should be $s$-invariant and may be identified if they differ by a field in the range of $s$. Since the range $\mathcal{A}_{00}$ of $s$ is an ideal in the kernel $\mathcal{A}_0$ of $s$, the algebra of observables is defined as the quotient

$$\mathcal{A} := \mathcal{A}_0 / \mathcal{A}_{00}$$ [1]

and the local algebras $\mathcal{A}(\mathcal{O}) \subset \mathcal{A}$ are the images of $\mathcal{A}_0 \cap \mathcal{F}(\mathcal{O})$ under the quotient map $\mathcal{A}_0 \to \mathcal{A}$.
To prove that $\mathcal{A}$ admits a nontrivial representation by operators on a Hilbert space, one may use the BRST operator formalism (Kugo and Ojima 1979, Dütsch and Fredenhagen 1999): one starts from a representation of $\mathcal{F}$ on an inner-product space $(\mathcal{K}, \langle\cdot,\cdot\rangle)$ such that $\langle F^*\phi, \psi\rangle = \langle\phi, F\psi\rangle$
and that $s$ is implemented by an operator $Q$ on $\mathcal{K}$, that is,

$$s(F) = [Q, F]$$ [2]

with $[\cdot,\cdot]$ denoting the graded commutator, such that $Q$ is symmetric and nilpotent. One may then construct the space of physical states as the cohomology of $Q$, $\mathcal{H} := \mathcal{K}_0/\mathcal{K}_{00}$, where $\mathcal{K}_0$ is the kernel and $\mathcal{K}_{00}$ the range of $Q$. The algebra of observables now has a natural representation $\pi$ on $\mathcal{H}$:

$$\pi([A])[\phi] := [A\phi]$$ [3]

(where $A \in \mathcal{A}_0$, $\phi \in \mathcal{K}_0$, $[A] := A + \mathcal{A}_{00}$, $[\phi] := \phi + \mathcal{K}_{00}$). The crucial question is whether the scalar product on $\mathcal{H}$ inherited from $\mathcal{K}$ is positive definite.
In free quantum field theories $(\mathcal{K}, \langle\cdot,\cdot\rangle)$ can be chosen in such a way that the positivity can directly be checked by identifying the physical degrees of freedom (see next section). In interacting theories (see the section on perturbative construction of gauge theories), one may argue in terms of scattering states that the free BRST operator on the asymptotic fields coincides with the BRST operator of the interacting theory. This argument, however, is invalidated by infrared problems in massless gauge theories. Instead, one may use a stability property of the construction.
Namely, let $\tilde{\mathcal{F}}$ be the algebra of formal power series with values in $\mathcal{F}$, and let $\tilde{\mathcal{K}}$ be the vector space of formal power series with values in $\mathcal{K}$. $\tilde{\mathcal{K}}$ possesses a natural inner product with values in the ring of formal power series $\mathbb{C}[[\lambda]]$, as well as a representation of $\tilde{\mathcal{F}}$ by operators. One also assumes that the BRST transformation $\tilde{s}$ is a formal power series $\tilde{s} = \sum_n \lambda^n s_n$ of operators $s_n$ on $\mathcal{F}$ and that the BRST operator $\tilde{Q}$ is a formal power series $\tilde{Q} = \sum_n \lambda^n Q_n$ of operators on $\mathcal{K}$. The algebraic construction can then be done in the same way as before, yielding a representation $\tilde{\pi}$ of the algebra of observables $\tilde{\mathcal{A}}$ by endomorphisms of a $\mathbb{C}[[\lambda]]$ module $\tilde{\mathcal{H}}$, which has an inner product with values in $\mathbb{C}[[\lambda]]$.
One now assumes that at $\lambda = 0$ the inner product is positive, in the sense that

(Positivity)
(i) $\langle\phi, \phi\rangle \geq 0 \quad \forall \phi \in \mathcal{K}$ with $Q_0\phi = 0$; and
(ii) $Q_0\phi = 0 \,\wedge\, \langle\phi, \phi\rangle = 0 \;\Longrightarrow\; \phi \in Q_0\mathcal{K}$ [4]

Then the inner product on $\tilde{\mathcal{H}}$ is positive in the sense that for all $\tilde{\phi} \in \tilde{\mathcal{H}}$ the inner product with itself, $\langle\tilde{\phi}, \tilde{\phi}\rangle$, is of the form $\tilde{c}^*\tilde{c}$ with some power series $\tilde{c} \in \mathbb{C}[[\lambda]]$, and $\tilde{c} = 0$ iff $\tilde{\phi} = 0$.
This result guarantees that, within perturbation theory, the interacting theory satisfies positivity, provided the unperturbed theory was positive and BRST symmetry is preserved.

Quantization of Free Gauge Fields

The action of a classical free gauge field $A$,

$$S_0(A) = -\frac{1}{4}\int dx\, F^{\mu\nu}(x) F_{\mu\nu}(x) = \frac{1}{2}\int dk\, \hat{A}_\mu(k)^* M^{\mu\nu}(k) \hat{A}_\nu(k)$$ [5]

(where $F^{\mu\nu} := \partial^\mu A^\nu - \partial^\nu A^\mu$ and $M^{\mu\nu}(k) := k^2 g^{\mu\nu} - k^\mu k^\nu$) is unsuited for quantization because $M^{\mu\nu}$ is not invertible: due to $M^{\mu\nu} k_\nu = 0$, it has an eigenvalue 0. Therefore, the action is usually modified by adding a Lorentz-invariant gauge-fixing term: $M^{\mu\nu}$ is replaced by $M^{\mu\nu}(k) + \lambda k^\mu k^\nu$, where $\lambda \in \mathbb{R} \setminus \{0\}$ is an arbitrary constant. The corresponding Euler–Lagrange equation reads

$$\Box A^\mu - (1 - \lambda)\,\partial^\mu \partial_\nu A^\nu = 0$$ [6]

For simplicity, let us choose $\lambda = 1$, which is referred to as Feynman gauge. Then the algebra of the free gauge field is the unital $\star$-algebra generated by elements $A^\mu(f)$, $f \in \mathcal{D}(\mathbb{R}^4)$, which fulfill the relations:

$$f \mapsto A^\mu(f) \text{ is linear}$$ [7]
$$A^\mu(\Box f) = 0$$ [8]
$$A^\mu(f)^* = A^\mu(\bar{f})$$ [9]
$$[A^\mu(f), A^\nu(g)] = i g^{\mu\nu} \int dx\, dy\, f(x) D(x - y) g(y)$$ [10]

where $D$ is the massless Pauli–Jordan distribution.
This algebra does not possess Hilbert space representations which satisfy the microlocal spectrum condition, a condition which in particular requires the singularity of the two-point function to be of the so-called Hadamard form. It possesses, instead, representations on vector spaces with a nondegenerate sesquilinear form, for example, the Fock space over a one-particle space with scalar product

$$\langle\phi, \psi\rangle = (2\pi)^{-3} \int \frac{d^3p}{2|\mathbf{p}|}\, \overline{\phi^\mu(p)}\, \psi_\mu(p)\Big|_{p^0 = |\mathbf{p}|}$$ [11]

Gupta and Bleuler characterized a subspace of the Fock space on which the scalar product is semidefinite; the space of physical states is then obtained
by dividing out the space of vectors with vanishing norm.
After adding a mass term

$$\frac{m^2}{2}\int dx\, A_\mu(x) A^\mu(x)$$

to the action [5], it seems to be no longer necessary to add also a gauge-fixing term. The fields then satisfy the Proca equation

$$\partial_\mu F^{\mu\nu} + m^2 A^\nu = 0$$ [12]

which is equivalent to the equation $(\Box + m^2)A^\mu = 0$ together with the constraint $\partial_\mu A^\mu = 0$. The Cauchy problem is well posed, and the fields can be represented in a positive-norm Fock space with only physical states (corresponding to the three physical polarizations of $A$). The problem, however, is that the corresponding propagator admits no power-counting renormalizable perturbation series.
The latter problem can be circumvented in the following way: for the algebra of the free quantum field, one takes only the equation $(\Box + m^2)A^\mu = 0$ into account (or, equivalently, one adds the gauge-fixing term $(1/2)(\partial_\mu A^\mu)^2$ to the Lagrangian) and goes over from the physical field $A^\mu$ to

$$B^\mu := A^\mu + \frac{\partial^\mu \phi}{m}$$ [13]

where $\phi$ is a real scalar field, to the same mass $m$ where the sign of the commutator is reversed ("bosonic ghost field" or "Stückelberg field"). The propagator of $B^\mu$ yields a power-counting renormalizable perturbation series; however, $B^\mu$ is an unphysical field. One obtains four independent components of $B$ which satisfy the Klein–Gordon equation. The constraint $0 = \partial_\mu A^\mu = \partial_\mu B^\mu + m\phi$ is required for the expectation values in physical states only. So, quantization in the case $m > 0$ can be treated in analogy with [8]–[10] by replacing $A^\mu$ by $B^\mu$, the wave operator by the Klein–Gordon operator $(\Box + m^2)$ in [8], and $D$ by the corresponding massive commutator distribution $\Delta_m$ in [10]. Again, the algebra can be nontrivially represented on a space with indefinite metric, but not on a Hilbert space.
One can now use the method of BRST quantization in the massless as well as in the massive case. One introduces a pair of fermionic scalar fields (ghost fields) $(u, \tilde{u})$. $u$, $\tilde{u}$, and (for $m > 0$) $\phi$ fulfill the Klein–Gordon equation to the same mass $m \geq 0$ as the vector field $B$. The free BRST transformation reads

$$s_0(B^\mu) = i\partial^\mu u, \quad s_0(\phi) = i m u, \quad s_0(u) = 0, \quad s_0(\tilde{u}) = -i(\partial_\nu B^\nu + m\phi)$$ [14]

(see, e.g., Scharf (2001)). It is implemented by the free BRST charge

$$Q_0 = \int_{x^0 = \mathrm{const.}} d^3x\, j_0^{(0)}(x^0, \mathbf{x})$$ [15]

where

$$j_\mu^{(0)} := (\partial_\nu B^\nu + m\phi)\,\partial_\mu u - \partial_\mu(\partial_\nu B^\nu + m\phi)\, u$$ [16]

is the free BRST current, which is conserved. (The interpretation of the integral in [15] requires some care.) $Q_0$ satisfies the assumptions of the (local) operator BRST formalism, in particular it is nilpotent and positive [4]. Distinguished representatives of the equivalence classes $[\phi] \in \mathrm{Ke}\, Q_0 / \mathrm{Ra}\, Q_0$ are the states built up only from the three spatial (two transversal for $m = 0$, respectively) polarizations of $A$.

Perturbative Renormalization

The starting point for a perturbative construction of an interacting quantum field theory is Dyson's formula for the evolution operator in the interaction picture. To avoid conflicts with Haag's theorem on the nonexistence of the interaction picture in quantum field theory, one multiplies the interaction Lagrangian $\mathcal{L}$ with a test function $g$ and studies the local S-matrix,

$$S(g\mathcal{L}) = 1 + \sum_{n=1}^{\infty} \frac{i^n}{n!} \int dx_1 \cdots dx_n\, g(x_1) \cdots g(x_n)\, T(\mathcal{L}(x_1) \cdots \mathcal{L}(x_n))$$ [17]

where $T$ denotes a time-ordering prescription. In the limit $g \to 1$ (adiabatic limit), $S(g\mathcal{L})$ tends to the scattering matrix. This limit, however, is plagued by infrared divergences and does not always exist. Interacting fields $F_{g\mathcal{L}}$ are obtained by the Bogoliubov formula:

$$F_{g\mathcal{L}}(x) = \frac{\delta}{\delta h(x)}\Big|_{h=0} S(g\mathcal{L})^{-1} S(g\mathcal{L} + hF)$$ [18]

The algebraic properties of the interacting fields within a region $\mathcal{O}$ depend only on the interaction within a slightly larger region (Brunetti and Fredenhagen 2000), hence the net of algebras in the sense of AQFT can be constructed in the adiabatic limit without the infrared problems (this is called the "algebraic adiabatic limit").
The construction of the interacting theory is thus reduced to a definition of time-ordered products of fields. This is the program of causal perturbation theory (CPT), which was developed by Epstein and Glaser (1973) on the basis of previous work by Stückelberg and Petermann (1953) and Bogoliubov
and Shirkov (1959). For simplicity, we describe CPT only for a real scalar field. Let $\varphi$ be a classical real scalar field which is not restricted by any field equation. Let $\mathcal{P}$ denote the algebra of polynomials in $\varphi$ and all its partial derivatives $\partial^a \varphi$ with multi-indices $a \in \mathbb{N}_0^4$. The time-ordered products $(T_n)_{n \in \mathbb{N}}$ are linear and symmetric maps $T_n : (\mathcal{P} \otimes \mathcal{D}(\mathbb{R}^4))^{\otimes n} \to L(\mathcal{D})$, where $L(\mathcal{D})$ is the space of operators on a dense invariant domain $\mathcal{D}$ in the Fock space of the scalar free field. One often uses the informal notation

$$T_n(g_1 F_1 \otimes \cdots \otimes g_n F_n) = \int dx_1 \cdots dx_n\, T_n(F_1(x_1), \ldots, F_n(x_n))\, g_1(x_1) \cdots g_n(x_n)$$ [19]

where $F_j \in \mathcal{P}$, $g_j \in \mathcal{D}(\mathbb{R}^4)$.
The sequence $(T_n)$ is constructed by induction on $n$, starting with the initial condition

$$T_1\Big(\prod_j \partial^{a_j} \varphi(x)\Big) = {:}\prod_j \partial^{a_j} \phi(x){:}$$ [20]

where the right-hand side is a Wick polynomial of the free field $\phi$. In the inductive step the requirement of causality plays the main role, that is, the condition that

$$T_n(f_1 \otimes \cdots \otimes f_n) = T_k(f_1 \otimes \cdots \otimes f_k)\, T_{n-k}(f_{k+1} \otimes \cdots \otimes f_n)$$ [21]

if

$$(\operatorname{supp} f_1 \cup \cdots \cup \operatorname{supp} f_k) \cap ((\operatorname{supp} f_{k+1} \cup \cdots \cup \operatorname{supp} f_n) + \bar{V}_-) = \emptyset$$

(where $\bar{V}_-$ is the closed backward light cone). This condition expresses the composition law for evolution operators in a relativistically invariant and local way. Causality determines $T_n$ as an operator-valued distribution on $\mathbb{R}^{4n}$ in terms of the inductively known $T_l$, $l < n$, outside of the total diagonal $\Delta_n := \{(x_1, \ldots, x_n) \mid x_1 = \cdots = x_n\}$, that is, on test functions from $\mathcal{D}(\mathbb{R}^{4n} \setminus \Delta_n)$.
Perturbative renormalization is now the extension of $T_n$ to the full test function space $\mathcal{D}(\mathbb{R}^{4n})$. Generally, this extension is nonunique. In contrast to other methods of renormalization, no divergences appear, but the ambiguities correspond to the finite renormalizations that persist after removal of divergences by infinite counter terms. The ambiguities can be reduced by (re-)normalization conditions, which means that one requires that certain properties which hold by induction on $\mathcal{D}(\mathbb{R}^{4n} \setminus \Delta_n)$ are maintained in the extension, namely:

(N0) a bound on the degree of singularity near the total diagonal;
(N1) Poincaré covariance;
(N2) unitarity of the local S-matrix;
(N3) a relation to the time-ordered products of subpolynomials;
(N4) the field equation for the interacting field $\varphi_{g\mathcal{L}}$ [18];
(AWI) the "action Ward identity" (Stora 2002, Dütsch and Fredenhagen 2003): $\partial_\mu^x T(\cdots F_l(x) \cdots) = T(\cdots \partial_\mu F_l(x) \cdots)$. This condition can be understood as the requirement that physics depends on the action only, so total derivatives in the interaction Lagrangian can be removed; and
further symmetries, in particular in gauge theories, Ward identities expressing BRST invariance. A universal formulation of all symmetries which can be derived from the field equation in classical field theory is the "master Ward identity" (which presupposes (N3) and (N4)) (Boas and Dütsch 2002, Dütsch and Fredenhagen 2003); see next section.

The problem of perturbative renormalization is to construct a solution of all these normalization conditions. Epstein and Glaser have constructed the solutions of (N0)–(N3). Recently, the conditions (N4) and (AWI) have been included. The master Ward identity cannot always be fulfilled, the obstructions being the famous "anomalies" of perturbative quantum field theory.

Perturbative Construction of Gauge Theories

In the case of a purely massive theory, the adiabatic limit $S = \lim_{g \to 1} S(g\mathcal{L})$ exists (Epstein and Glaser 1976), and one may adopt a formalism due to Kugo and Ojima (1979), who use the fact that in these theories the BRST charge $Q$ can be identified with the incoming (free) BRST charge $Q_0$ [15]. For the scattering matrix $S$ to be a well-defined operator on the physical Hilbert space of the free theory, $\mathcal{H} = \mathrm{Ke}\, Q_0 / \mathrm{Ra}\, Q_0$, one then has to require

$$\lim_{g \to 1}\, [Q_0, T((g\mathcal{L})^{\otimes n})]\big|_{\ker Q_0} = 0$$ [22]

This is the motivation for introducing the condition of "perturbative gauge invariance" (Dütsch et al. 1993, 1994; see Scharf (2001)): according to this condition, there should exist a Lorentz
vector $\mathcal{L}_1^\nu \in \mathcal{P}$ associated with the interaction $\mathcal{L}$, such that

$$[Q_0, T_n(\mathcal{L}(x_1) \cdots \mathcal{L}(x_n))] = i \sum_{l=1}^{n} \partial_\nu^{x_l}\, T_n(\mathcal{L}(x_1) \cdots \mathcal{L}_1^\nu(x_l) \cdots \mathcal{L}(x_n))$$ [23]

This is a somewhat stronger condition than [22] but has the advantage that it can be formulated independently of the adiabatic limit. The condition [22] (or perturbative gauge invariance) can be satisfied for tree diagrams (i.e., the corresponding requirement in classical field theory can be fulfilled). In the massive case, this is impossible without a modification of the model; the inclusion of additional physical scalar fields (corresponding to Higgs fields) yields a solution. It is gratifying that, by making a polynomial ansatz for the interaction $\mathcal{L} \in \mathcal{P}$, perturbative gauge invariance [23] for tree diagrams, renormalizability (i.e., the mass dimension of $\mathcal{L}$ is 4), and some obvious requirements (e.g., the Lorentz invariance) determine $\mathcal{L}$ to a large extent. In particular, the Lie-algebraic structure need not be put in, as it can be derived in this way (Stora 1997, unpublished). Including loop diagrams (i.e., quantum effects), it has been proved that (N0)–(N2) and perturbative gauge invariance can be fulfilled to all orders for massless SU(N) Yang–Mills theories.
Unfortunately, in the massless case, it is unlikely that the adiabatic limit exists and, hence, an S-matrix formalism is problematic. It is better to rely on the construction of local observables in terms of couplings with compact support. However, then the selection of the observables [1] has to be done in terms of the BRST transformation $\tilde{s}$ of the interacting fields.
For the corresponding BRST charge, one makes the ansatz

$$\tilde{Q} = \int d^4x\, \tilde{j}^\mu_{g\mathcal{L}}(x)\, b_\mu(x), \qquad \mathcal{L} = \sum_{n \geq 1} \mathcal{L}_n \lambda^n$$ [24]

where $(b_\mu)$ is a smooth version of the $\delta$-function characterizing a Cauchy surface and $\tilde{j}^\mu_{g\mathcal{L}}$ is the interacting BRST-current [18] (where $\tilde{j}_\mu = \sum_n j_\mu^{(n)} \lambda^n$ ($j_\mu^{(n)} \in \mathcal{P}$) is a formal power series with $j_\mu^{(0)}$ given by [16]). (Note that there is a volume divergence in this integral, which can be avoided by a spatial compactification. This does not change the abstract algebra $\mathcal{F}_{\mathcal{L}}(\mathcal{O})$.) A crucial requirement is that $\tilde{j}^\mu_{g\mathcal{L}}$ is conserved in a suitable sense. This condition is essentially equivalent to perturbative gauge invariance and hence its application to classical field theory determines the interaction $\mathcal{L}$ in the same way, and in addition the deformation $j^{(0)} \to \tilde{j}_{g\mathcal{L}}$. The latter also gives the interacting BRST charge and transformation, $\tilde{Q}$ and $\tilde{s}$, by [24] and [2]. The so-obtained $\tilde{Q}$ is often nilpotent in classical field theory (and hence this holds also for $\tilde{s}$). However, in QFT conservation of $\tilde{j}_{g\mathcal{L}}$ and $\tilde{Q}^2 = 0$ requires the validity of additional Ward identities, beyond the condition of perturbative gauge invariance [23]. All the necessary identities can be derived from the master Ward identity

$$T_{n+1}(A, F_1, \ldots, F_n) = -\sum_{k=1}^{n} T_n(F_1, \ldots, \delta_A F_k, \ldots, F_n)$$ [25]

where $A = \delta_A S_0$ with a derivation $\delta_A$. The master Ward identity is closely related to the quantum action principle which was formulated in the formalism of generating functionals of Green's functions. In the latter framework, the anomalies have been classified by cohomological methods. The vanishing of anomalies of the BRST symmetry is a selection criterion for physically acceptable models.
In the particular case of QED, the Ward identity

$$\partial_\mu^y\, T(j^\mu(y) F_1(x_1) \cdots F_n(x_n)) = i \sum_{j=1}^{n} \delta(y - x_j)\, T(F_1(x_1) \cdots (\theta F_j)(x_j) \cdots F_n(x_n))$$ [26]

for the Dirac current $j^\mu := \bar{\psi}\gamma^\mu\psi$ is sufficient for the construction, where $(\theta F) := i(r - s)F$ for $F = \psi^r \bar{\psi}^s B_1 \cdots B_l$ ($B_1, \ldots, B_l$ are nonspinorial fields) and $F_1, \ldots, F_n$ run through all subpolynomials of $\mathcal{L} = j^\mu A_\mu$. (N0)–(N4) and [26] can be fulfilled to all orders (Dütsch and Fredenhagen 1999).

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Batalin–Vilkovisky Quantization; BRST Quantization; Constrained Systems; Indefinite Metric; Perturbation Theory and its Techniques; Quantum Chromodynamics; Quantum Field Theory: A Brief Introduction; Quantum Fields with Indefinite Metric: Non-Trivial Models; Renormalization: General Theory; Renormalization: Statistical Mechanics and Condensed Matter; Standard Model of Particle Physics.

Further Reading

Becchi C, Rouet A, and Stora R (1975) Renormalization of the abelian Higgs–Kibble model. Communications in Mathematical Physics 42: 127.
Becchi C, Rouet A, and Stora R (1976) Renormalization of gauge theories. Annals of Physics (NY) 98: 287.
Bogoliubov NN and Shirkov DV (1959) Introduction to the Theory of Quantized Fields. New York: Interscience Publishers Inc.
Brunetti R and Fredenhagen K (2000) Microlocal analysis and interacting quantum field theories: renormalization on physical backgrounds. Communications in Mathematical Physics 208: 623.
Dütsch M, Hurth T, Krahe K, and Scharf G (1993) Causal construction of Yang–Mills theories. I. Nuovo Cimento A 106: 1029.
Dütsch M, Hurth T, Krahe K, and Scharf G (1994) Causal construction of Yang–Mills theories. II. Nuovo Cimento A 107: 375.
Dütsch M and Fredenhagen K (1999) A local (perturbative) construction of observables in gauge theories: the example of QED. Communications in Mathematical Physics 203: 71.
Dütsch M and Boas F-M (2002) The Master Ward identity. Reviews in Mathematical Physics 14: 977–1049.
Dütsch M and Fredenhagen K (2003) The master Ward identity and generalized Schwinger–Dyson equation in classical field theory. Communications in Mathematical Physics 243: 275.
Dütsch M and Fredenhagen K (2004) Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity. Reviews in Mathematical Physics 16: 1291–1348.
Epstein H and Glaser V (1973) Annals Institut Henri Poincaré A 19: 211.
Epstein H and Glaser V (1976) Adiabatic limit in perturbation theory. In: Velo G and Wightman AS (eds.) Renormalization Theory, pp. 193–254.
Henneaux M and Teitelboim C (1992) Quantization of Gauge Systems. Princeton: Princeton University Press.
Kugo T and Ojima I (1979) Local covariant operator formalism of nonabelian gauge theories and quark confinement problem. Supplement of the Progress of Theoretical Physics 66: 1.
Lam Y-MP (1972) Perturbation Lagrangian theory for scalar fields – Ward–Takahashi identity and current algebra. Physical Review D 6: 2145.
Lam Y-MP (1973) Equivalence theorem on Bogoliubov–Parasiuk–Hepp–Zimmermann – renormalized Lagrangian field theories. Physical Review D 7: 2943.
Lowenstein JH (1971) Differential vertex operations in Lagrangian field theory. Communications in Mathematical Physics 24: 1.
Piguet O and Sorella S (1995) Algebraic Renormalization: Perturbative Renormalization, Symmetries and Anomalies, Lecture Notes in Physics. Berlin: Springer.
Scharf G (1995) Finite Quantum Electrodynamics. The Causal Approach, 2nd edn. Berlin: Springer.
Scharf G (2001) Quantum Gauge Theories – A True Ghost Story. New York: Wiley.
Stora R (2002) Pedagogical experiments in renormalized perturbation theory. Contribution to the conference. Theory of Renormalization and Regularization. Germany: Hesselberg.
Stückelberg ECG and Petermann A (1953) La normalisation des constantes dans la theorie des quanta. Helvetica Physica Acta 26: 499–520.
Weinberg S (1996) The Quantum Theory of Fields. Cambridge: Cambridge University Press.
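
The statement in the section on quantization of free gauge fields, that $M^{\mu\nu}(k) = k^2 g^{\mu\nu} - k^\mu k^\nu$ annihilates $k_\nu$ and becomes invertible once $\lambda k^\mu k^\nu$ is added, can be checked with exact rational arithmetic. The following is a minimal sketch under stated assumptions: the metric signature (+, −, −, −) is assumed, the closed form used for the inverse is a standard algebraic identity rather than a formula quoted from the article, and all function names (`lower`, `kinetic_matrix`, `candidate_inverse`, `matvec`, `matmul`) are hypothetical.

```python
from fractions import Fraction

# Minkowski metric with signature (+, -, -, -); an assumed convention.
# For a diagonal metric of this kind g^{mu nu} and g_{mu nu} have the same entries.
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if m == r else 0 for r in range(4)] for m in range(4)]


def lower(k_up):
    """Lower the index: k_mu = g_{mu nu} k^nu."""
    return [sum(G[m][n] * k_up[n] for n in range(4)) for m in range(4)]


def kinetic_matrix(k_up, lam=0):
    """Return M^{mu nu}(k) + lam k^mu k^nu with M^{mu nu} = k^2 g^{mu nu} - k^mu k^nu."""
    k2 = sum(a * b for a, b in zip(k_up, lower(k_up)))
    return [[k2 * G[m][n] - (1 - lam) * k_up[m] * k_up[n] for n in range(4)]
            for m in range(4)]


def candidate_inverse(k_up, lam):
    """Lower-index matrix (g + ((1 - lam)/lam) k k / k^2) / k^2; needs lam != 0 and k^2 != 0."""
    k_dn = lower(k_up)
    k2 = Fraction(sum(a * b for a, b in zip(k_up, k_dn)))
    b = Fraction(1 - lam) / lam
    return [[(G[n][r] + b * k_dn[n] * k_dn[r] / k2) / k2 for r in range(4)]
            for n in range(4)]


def matvec(a, v):
    """Contract the second index of a with the vector v."""
    return [sum(a[m][n] * v[n] for n in range(4)) for m in range(4)]


def matmul(a, b):
    """Contract the second index of a with the first index of b."""
    return [[sum(a[m][n] * b[n][r] for n in range(4)) for r in range(4)]
            for m in range(4)]


if __name__ == "__main__":
    k = [3, 1, 2, 1]  # an arbitrary off-shell momentum with k^2 = 3
    print(matvec(kinetic_matrix(k), lower(k)))          # zero mode of the ungauged operator
    print(matmul(kinetic_matrix(k, 1), candidate_inverse(k, 1)) == IDENTITY)
```

For $\lambda = 1$ (Feynman gauge) the gauge-fixed matrix reduces to $k^2 g^{\mu\nu}$, so the inverse is simply $g_{\mu\nu}/k^2$; the sketch covers other nonzero values of $\lambda$ as well.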

Phase Transition Dynamics


A Onuki, Kyoto University, Kyoto, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

When an external parameter such as the temperature $T$ is changed, physical systems in a homogeneous state often become unstable and tend to an ordered phase with broken symmetry. The growth of new order takes place with coarsening of domains or defect structures on mesoscopic spatial scales much longer than the microscopic molecular scale. Such ordering processes are ubiquitously observed in many systems such as ferromagnetic (spin) systems, solid alloys, and fluids. Historically, structural ordering and phase separation in solid alloys have been one of the central problems in metallurgy (Cahn 1961). These are highly nonlinear and far-from-equilibrium processes and have been studied as challenging subjects in condensed matter physics, polymer science, and metallurgy (Gunton et al. 1983, Binder 1991, Bray 1994, Onuki 2002). Here a short review on phase ordering is given on the basis of prototype mathematical models, which can be a starting point to understand the real complex problems.

Phase Ordering in Nonconserved Systems

Let us consider phase ordering in a system with a scalar spacetime-dependent variable $\psi(\mathbf{r}, t)$. If its space integral is not conserved in time, it is called the nonconserved order parameter, representing magnetization, electric polarization, etc. After appropriate scaling of time $t$, space $\mathbf{r}$, and $\psi$, the simplest dynamic equation reads

$$\frac{\partial \psi}{\partial t} = \nabla^2 \psi - \tau\psi - \psi^3 + h + \theta$$ [1]

The coefficient $\tau$ is related to the temperature by $\tau = A(T - T_c)$, where $A$ is a constant and $T_c$ is the critical temperature. The constant $h$ is also an externally controllable parameter, proportional to the applied magnetic field for the ferromagnetic case. The last term is the Markovian Gaussian random noise needed when eqn [1] is treated as a Langevin (stochastic differential) equation. In physics its stochastic property is usually expressed as

$$\langle \theta(\mathbf{r}, t)\, \theta(\mathbf{r}', t') \rangle = 2\varepsilon\, \delta(\mathbf{r} - \mathbf{r}')\, \delta(t - t')$$ [2]

where $\varepsilon$ represents the strength of the noise (proportional to the temperature before the scaling). In the presence of $\theta$, the variable $\psi$ is a random variable, whose probability distribution $P(\{\psi\}, t)$
obeys the Fokker–Planck equation. The equilibrium (steady) distribution is given by

\[ P_{\mathrm{eq}}\{\psi\} = \mathrm{const.}\,\exp(-F\{\psi\}/\varepsilon) \qquad [3] \]

where

\[ F = \int d\mathbf{r}\,\left[ \frac{\tau}{2}\psi^2 + \frac{1}{4}\psi^4 + \frac{1}{2}|\nabla\psi|^2 - h\psi \right] \qquad [4] \]

is the so-called Ginzburg–Landau free energy. Using $F$ we rewrite eqn [1] in a standard form of the Langevin equation,

\[ \frac{\partial\psi}{\partial t} = -\frac{\delta F}{\delta\psi} + \theta \qquad [5] \]

In equilibrium $\psi$ consists of the average $\psi_e$ and the deviation $\delta\psi$, where the latter is a Gaussian fluctuation in the limit of small $\varepsilon$. If $\tau > 0$ and $h = 0$, we obtain $\psi_e = 0$. If $\tau < 0$ and $h = 0$, there are two minima $\psi_e = \pm|\tau|^{1/2}$. These two states can coexist in equilibrium with a planar interface separating them at $h = 0$. If its normal is along the $x$-axis, the interface solution is of the form

\[ \psi(x) = |\tau|^{1/2}\tanh\left(|\tau|^{1/2}x/\sqrt{2}\right) \qquad [6] \]

which tends to $\pm|\tau|^{1/2}$ as $x \to \pm\infty$ and satisfies

\[ \frac{\delta F}{\delta\psi} = (\tau + \psi^2)\psi - \frac{d^2\psi}{dx^2} = 0 \qquad [7] \]

It is well known that the fluctuations of $\psi$ are increasingly enhanced near the critical point. The renormalization group theory shows how the equilibrium distribution $P_{\mathrm{eq}}\{\psi\}$ in eqn [3] depends on the upper cutoff wave number $\Lambda$ of $\psi$, where we suppose that $\psi$ consists of the Fourier components $\psi_{\mathbf{k}}$ with $k < \Lambda$ (Onuki 2002). In our phase-ordering problem the shortest relevant spatial scale is the interface width of the order of the thermal correlation length $\xi$ at the final temperature. Therefore, near criticality, we may assume that the thermal fluctuations with wave numbers larger than $\xi^{-1}$ have been eliminated in the model (or $\Lambda\xi \sim 1$ at the starting point).

Domain Growth

Thermodynamic instability occurs when $\tau$ is changed from a positive value $\tau_i$ to a negative value $\tau_f$ at $t = 0$. We here assume $h = 0$. We set $\tau_f = -1$ using the scaling. At long wavelengths $k < 1$, small plane wave fluctuations with wave vector $\mathbf{k}$ grow exponentially as

\[ \psi_{\mathbf{k}}(t) \sim \exp[(1 - k^2)t] \qquad [8] \]

with the growth rate largest at $k = 0$. This suggests that the nonlinear term in eqn [1] becomes crucial after a transient time. Numerically obtained snapshots of the subsequent $\psi(\mathbf{r}, t)$ are shown in Figure 1 in two dimensions (2D), where we can see the coarsening of the patterns. The characteristic domain size $\ell(t)$ grows algebraically as

\[ \ell(t) \sim t^{a} \qquad [9] \]

where $a = 1/2$ is known for the model [1].

Figure 1 Time evolution of $\psi$ in model [1] in 2D with system length = 128. The numbers are the times after quenching (panels at 0.5, 2, 5, 10, 20, and 40). Noise is added, but is not essential for large patterns or in the late stage. Reproduced with permission from Onuki A (2002) Phase Transition Dynamics. Cambridge, UK: Cambridge University Press.

Scattering experiments detect the time-dependent correlation

\[ g(r, t) = \langle \psi(\mathbf{r} + \mathbf{r}', t)\,\psi(\mathbf{r}', t) \rangle \qquad [10] \]

\[ S(k, t) = \int d\mathbf{r}\, g(r, t)\, e^{i\mathbf{k}\cdot\mathbf{r}} \qquad [11] \]

where $S(k, t)$ is called the structure factor. We assume the translational invariance and the spatial isotropy after the thermal average $\langle\cdots\rangle$. If $\tau_i \gg 1$, the quartic term in $F$ is negligible, leading to the initial structure factor

\[ S(k, 0) \cong \varepsilon/(\tau_i + k^2) \qquad [12] \]

which is produced by the thermal fluctuations. However, when the domain size $\ell(t)$ much exceeds the microscopic length (lattice constant), the following scaling behavior emerges:

\[ g(r, t) = G(r/\ell(t)) \qquad [13] \]

\[ S(k, t) = \ell(t)^{d}\, Q(\ell(t)k) \qquad [14] \]

where $d$ is the space dimensionality and $G(x)$ and $Q(x)$ are the scaling functions of order unity for $x \sim 1$. The correlation on the scale of $\ell(t)$ in eqn [13] arises from large-scale domain structures, while eqn [14] is simply its Fourier transformation. The maximum of the structure factor grows as $\ell(t)^{d}$. When $\varepsilon \ll 1$, however, there can be a well-defined initial stage in which $S(k, t)$ grows exponentially at long wavelengths.
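As a quick numerical check of the planar interface of eqn [6], the following minimal sketch evaluates the left-hand side of eqn [7] by central finite differences. The function names, the step size `dx`, and the sample points are illustrative assumptions, not part of the original model description; only the formulas themselves are taken from eqns [6] and [7].

```python
import math

def interface_profile(x, tau):
    """Planar interface of eqn [6]: |tau|^(1/2) tanh(|tau|^(1/2) x / sqrt(2))."""
    a = math.sqrt(abs(tau))
    return a * math.tanh(a * x / math.sqrt(2.0))

def residual(x, tau, dx=1e-3):
    """Left-hand side of eqn [7], (tau + psi^2) psi - psi'', by central differences."""
    psi = interface_profile(x, tau)
    d2psi = (interface_profile(x + dx, tau) - 2.0 * psi
             + interface_profile(x - dx, tau)) / dx**2
    return (tau + psi**2) * psi - d2psi

tau = -1.0                                   # scaled final temperature, tau_f = -1
xs = [-4.0 + 0.5 * i for i in range(17)]     # sample points across the interface
max_residual = max(abs(residual(x, tau)) for x in xs)
print(f"max |residual of eqn [7]| = {max_residual:.2e}")
print(f"psi(+8) = {interface_profile(8.0, tau):.6f}")
```

The residual is limited only by the finite-difference error, and the profile approaches $\pm|\tau|^{1/2}$ far from the interface, as stated below eqn [6].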
We may explain the roles of the terms on the right-hand side of eqn [1] in phase ordering in a simple manner.

1. The linear term $-\tau\psi$ triggers instability for $\tau < 0$.
2. The nonlinear term $-\psi^3$ gives rise to saturation of $\psi$ into $\pm 1$. To see this, we neglect $\nabla^2\psi$ and $\theta$ to have $\partial\psi/\partial t = \psi(1 - \psi^2)$ for $\tau = -1$. This equation is solved to give
\[ \psi(t) = \psi_0 \Big/ \sqrt{\psi_0^2 + (1 - \psi_0^2)\,e^{-2t}} \qquad [15] \]
where $\psi_0 = \psi(0)$ is the initial value. Thus, $\psi \to 1$ for $\psi_0 > 0$ and $\psi \to -1$ for $\psi_0 < 0$ as $t \to \infty$.
3. The gradient term limits the instability only in the long wavelength region $k < 1$ in the initial stage (see eqn [8]) and creates the interfaces in the late stage (see eqn [7]).
4. The noise term $\theta$ is relevant only in the early stage where $\psi$ is still on the order of the initial thermal fluctuations. The range of the early stage is of order 1 for $\varepsilon \gtrsim 1$, but weakly grows as $\ln(1/\varepsilon)$ for $\varepsilon \ll 1$. The noise term can be neglected once the fluctuations much exceed the thermal level.
5. If $h$ is a small positive number, it favors growth of regions with $\psi \cong 1$.

Interface Dynamics

At long times $t \gg 1$ domains with typical size $\ell(t)$ are separated by sharp interfaces and the thermal noise is negligible. Allowing the presence of a small positive $h$, we may approximate the free energy $F$ as

\[ F = \sigma S(t) - 2hV_{+}(t) + \mathrm{const.} \qquad [16] \]

where $\sigma$ is a constant (surface tension), $S(t)$ is the surface area, and $V_{+}(t)$ is the volume of the regions with $\psi \cong 1$. In this stage the interface velocity $v_{\mathrm{int}} = \mathbf{v}_{\mathrm{int}}\cdot\mathbf{n}$ is given by the Allen–Cahn formula (Allen and Cahn 1979):

\[ v_{\mathrm{int}} = -K + (2/\sigma)h \qquad [17] \]

The normal unit vector $\mathbf{n}$ is from a region with $\psi \cong 1$ to a region with $\psi \cong -1$. The $K$ is the sum of the principal curvatures $1/R_1 + 1/R_2$ in 3D. This equation can be derived from eqn [1]. If the interface position $\mathbf{r}_a$ moves to $\mathbf{r}_a + \mathbf{n}\,\delta R$ infinitesimally, the surface area changes by $\delta S = \int da\, K\,\delta R$, where $\int da \cdots$ denotes the surface integral. Therefore, $F$ in eqn [16] changes in time as

\[ \frac{dF}{dt} = \int da\,(\sigma K - 2h)\,v_{\mathrm{int}} \le 0 \qquad [18] \]

which is nonpositive owing to eqn [17], since the integrand equals $-\sigma v_{\mathrm{int}}^2$. Furthermore, we may draw three results from eqn [17].

1. If we set $v_{\mathrm{int}} \sim \ell(t)/t$ and $K \sim 1/\ell(t)$, we obtain $a = 1/2$ in the growth law [9].
2. In phase ordering under very small positive $h$, the balance $1/\ell(t) \sim h/\sigma$ yields the crossover time $t_h \sim h^{-2}$. For $t < t_h$ the effect of $h$ is small, while for $t > t_h$ the region with $\psi \cong 1$ becomes predominant.
3. A spherical droplet with $\psi \cong 1$ evolves as
\[ \frac{\partial R}{\partial t} = -\frac{2}{R} + \frac{2h}{\sigma} \qquad [19] \]
from which the critical radius is determined as
\[ R_c = \sigma/h \qquad [20] \]
A droplet with $R > R_c$ ($R < R_c$) grows (shrinks).

We mention a statistical theory of interface dynamics at $h = 0$ by Ohta et al. (1982). There, a smooth subsidiary field $u(\mathbf{r}, t)$ is introduced to represent surfaces by $u = \mathrm{const}$. The differential geometry is much simplified in terms of such a field. The two-phase boundaries are represented by $u = 0$. If all the surfaces follow $v_{\mathrm{int}} = -K$ in eqn [17] in the whole space, $u$ obeys

\[ \frac{\partial}{\partial t}u = \Big[ \nabla^2 - \sum_{ij} n_i n_j \nabla_i \nabla_j \Big] u \qquad [21] \]

where $\nabla_i = \partial/\partial x_i$ and $n_i = \nabla_i u/|\nabla u|$. This equation becomes a linear diffusion equation if $n_i n_j \nabla_i \nabla_j$ is replaced by $d^{-1}\delta_{ij}\nabla^2$. Then $u$ can be expressed in terms of its initial value and the correlation function of $\psi(\mathbf{r}, t)$ ($\cong u(\mathbf{r}, t)/|u(\mathbf{r}, t)|$ in the late stage) is calculated in the form of eqn [13] with

\[ G(x) = \frac{2}{\pi}\sin^{-1}\left[ \exp\left( -\frac{1}{8(1 - 1/d)}\,x^2 \right) \right] \qquad [22] \]

which agrees excellently with simulations.

Spinodal Decomposition in Conserved Systems

The order parameter $\psi$ can be a conserved variable such as the density or composition in fluids or alloys. With the same $F$ in eqn [4], a simple dynamic model in such cases reads

\[ \frac{\partial\psi}{\partial t} = \nabla^2\frac{\delta F}{\delta\psi} - \nabla\cdot\mathbf{j}_R \qquad [23] \]

Here $\mathbf{j}_R$ is the random current characterized by

\[ \langle j_{R\alpha}(\mathbf{r}, t)\, j_{R\beta}(\mathbf{r}', t') \rangle = 2\varepsilon\,\delta_{\alpha\beta}\,\delta(\mathbf{r} - \mathbf{r}')\,\delta(t - t') \qquad [24] \]

which ensures the equilibrium distribution [3] of $\psi$. However, the noise $\mathbf{j}_R$ is negligible in late-stage phase separation as in the nonconserved case. Note that $h$ in the conserved case is the chemical potential
conjugate to $\psi$ and, if it is homogeneous, it vanishes in the dynamic equation [23]. In experiments the average order parameter

\[ M = \langle\psi\rangle = \int d\mathbf{r}\,\psi(\mathbf{r})/V \qquad [25] \]

is used as a control parameter instead of $h$, where the integral is within the system with volume $V$. If there is no flux from outside, $M$ is constant in time. Here the instability occurs below the so-called spinodal $M^2 < 1/3$ ($M^2 < |\tau|/3$ for general $\tau < 0$). In fact, small fluctuations with wave vector $\mathbf{k}$ grow exponentially as

\[ \psi_{\mathbf{k}}(t) \sim \exp[k^2(1 - 3M^2 - k^2)t] \qquad [26] \]

right after the quenching as in eqn [8]. The growth rate is largest at an intermediate wave number $k = k_m$ with

\[ k_m = [(1 - 3M^2)/2]^{1/2} \qquad [27] \]

This behavior and the exponential growth of the structure factor have been observed in polymer mixtures where the parameter $\varepsilon$ in eqn [3] or [12] is expected to be small (Onuki 2002). In late-stage coarsening the peak position of $S(k, t)$ decreases in time as

\[ k_m(t) \sim 2\pi/\ell(t) \qquad [28] \]

in terms of the domain size $\ell(t)$. The growth exponent in eqn [9] is given by 1/3 for the simple model [23] (see eqn [33] below).

Figure 2 shows the patterns after quenching in 2D. For $M = 0$ the two phases are symmetric and the patterns are bicontinuous, while for $M \ne 0$ the minority phase eventually appears as droplets in the percolating region of the majority phase.

[Figure 2 panels: (a) M = 0 and (b) M = 0.1, each shown at times 20, 100, and 400.]

Interface Dynamics

Interface dynamics in the conserved case is much more complicated than in the nonconserved case, because the coarsening can proceed only through diffusion. Long-distance correlations arise among the domains and the interface velocity cannot be written in terms of the local quantities like the curvature. As a simple example, we give the counterpart of eqn [19]. In 3D a spherical droplet with $\psi \cong 1$ appears in a nearly homogeneous matrix with $\psi = M$ far from the droplet. The droplet radius $R$ is then governed by (Lifshitz and Slyozov 1961)

\[ \frac{\partial}{\partial t}R = D\left( \frac{\Delta}{R} - \frac{2d_0}{R^2} \right) \qquad [29] \]

where $\Delta = (M + 1)/2$ is called the supersaturation, while $D$ and $d_0$ are constants (equal to 2 and $\sigma/8$, respectively, after the scaling). The critical radius is written as

\[ R_c = 2d_0/\Delta \qquad [30] \]

The general definition of the supersaturation is

\[ \Delta = \left( M - \psi_{\mathrm{cx}}^{(2)} \right) \Big/ \left( \psi_{\mathrm{cx}}^{(1)} - \psi_{\mathrm{cx}}^{(2)} \right) \qquad [31] \]

Here the equilibrium values of $\psi$ are written as $\psi_{\mathrm{cx}}^{(1)}$ and $\psi_{\mathrm{cx}}^{(2)}$ and $M$ is supposed to be slightly different from $\psi_{\mathrm{cx}}^{(2)}$.

Lifshitz and Slyozov (1961) analyzed domain coarsening in binary AB alloys when the volume fraction $q$ of the A-rich domains is small. They noticed that the supersaturation $\Delta$ around each domain decreases in time with coarsening. That is, the A component atoms in the B-rich matrix are slowly absorbed onto the growing A-rich domains, while a certain fraction of the A-rich domains disappear. Thus, $q(t)$ and $\Delta(t)$ both depend on time, but satisfy the conservation law

\[ q(t) + \Delta(t) = \Delta(0) = (M + 1)/2 \qquad [32] \]

With this overall constraint, they found the asymptotic late-stage behavior

\[ \ell(t) \sim \Delta(t)^{-1} \sim t^{1/3} \qquad [33] \]

where $\ell(t)$ is the average droplet radius. Notice that this behavior is consistent with the droplet equation [29], where each term is of order $R/t \sim t^{-2/3}$.
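The role of the critical radius in eqn [30] can be illustrated by integrating the droplet equation [29] directly. The sketch below is a minimal illustration under stated assumptions: the supersaturation is held fixed (whereas in real coarsening it decreases in time), $D = 2$ follows the scaled value quoted above, and the values `d0 = 0.1`, `delta = 0.05`, the time step, and the cutoff `R_min` below which a droplet is counted as dissolved are hypothetical choices made only for this example.

```python
def droplet_velocity(R, delta, D=2.0, d0=0.1):
    """Right-hand side of eqn [29]: dR/dt = D (delta / R - 2 d0 / R^2)."""
    return D * (delta / R - 2.0 * d0 / R**2)

def critical_radius(delta, d0=0.1):
    """Eqn [30]: R_c = 2 d0 / delta."""
    return 2.0 * d0 / delta

def evolve(R0, delta, t_end, dt=1e-3, D=2.0, d0=0.1, R_min=0.5):
    """Integrate eqn [29] with the midpoint rule at fixed supersaturation.

    Stops early once R falls below R_min, which this sketch treats as the
    droplet having dissolved.
    """
    R, t = R0, 0.0
    while t < t_end and R > R_min:
        k1 = droplet_velocity(R, delta, D, d0)
        k2 = droplet_velocity(R + 0.5 * dt * k1, delta, D, d0)
        R += dt * k2
        t += dt
    return R

delta = 0.05                      # illustrative supersaturation (assumption)
Rc = critical_radius(delta)
R_grow = evolve(1.2 * Rc, delta, t_end=200.0)
R_shrink = evolve(0.8 * Rc, delta, t_end=200.0)
print(f"R_c = {Rc:.3f}")
print(f"start at 1.2 R_c -> R = {R_grow:.3f}")
print(f"start at 0.8 R_c -> R = {R_shrink:.3f}")
```

A droplet started above $R_c$ keeps growing while one started below $R_c$ shrinks until it reaches the cutoff, which is the single-droplet picture behind the Lifshitz–Slyozov analysis.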
Figure 2 Time evolution of $\psi$ in model [23] in 2D with system length = 128 without thermal noise: (a) M = 0 and (b) M = 0.1. The numbers are the times after quenching. Reproduced with permission from Onuki A (2002) Phase Transition Dynamics. Cambridge, UK: Cambridge University Press.

Nucleation

In metastable states the free energy is at a local minimum but not at the true minimum. Such states
are stable for infinitesimal fluctuations, but rare spatially localized fluctuations, called critical nuclei, can continue to grow, leading to macroscopic phase ordering (Onuki 2002, Debenedetti 1996). The birth of a critical droplet is governed by the Boltzmann factor $\exp(-F_c/k_{\mathrm{B}}T)$ at finite temperatures, where $F_c$ is the free energy needed to create a critical droplet and $k_{\mathrm{B}}T$ is the thermal energy with $k_{\mathrm{B}}$ being the Boltzmann constant. In this section we explicitly write $k_{\mathrm{B}}T$, but we may scale $\psi$ and space such that $\tau = -1$ at the final temperature.

Droplet Free Energy and Experiments

In the nonconserved case we prepare a spin-down state with $\psi \cong -1$ in the time region $t < 0$ and then apply a small positive field $h$ at $t = 0$. For $t > 0$ a spin-up droplet with radius $R$ requires a free energy change

\[ F(R) = 4\pi\sigma R^2 - \frac{8\pi}{3}hR^3 \qquad [34] \]

The first term is the surface free energy and the second term is the bulk decrease due to $h$. The critical radius $R_c$ in eqn [20] gives the maximum of $F(R)$ given by

\[ F_c = \frac{4\pi}{3}\sigma R_c^2 \qquad [35] \]

In fact, $F'(R) = \partial F(R)/\partial R$ is written as

\[ F'(R) = 8\pi\sigma(R - R^2/R_c) \qquad [36] \]

In conserved systems such as fluids or alloys, we lower the temperature slightly below the coexistence curve with the average order parameter $M$ held fixed. We again obtain the droplet free energy [34], but

\[ h = (\sigma/2d_0)\Delta \qquad [37] \]

in terms of the (initial) supersaturation $\Delta = \Delta(0)$. Let the equilibrium values $\psi_{\mathrm{cx}}^{(1)}$ and $\psi_{\mathrm{cx}}^{(2)}$ in the two phases be written as $\pm A(T_c - T)^{\beta}$ with $A$ and $\beta$ being constants ($\beta \cong 1/3$ as $T \to T_c$). For each given $M$, we define the coexistence temperature $T_{\mathrm{cx}}$ by $M = \psi_{\mathrm{cx}}^{(2)} = -A(T_c - T_{\mathrm{cx}})^{\beta}$. In nucleation experiments the final temperature $T$ is slightly below $T_{\mathrm{cx}}$ and $\delta T \equiv T_{\mathrm{cx}} - T$ is a positive temperature increment. For small $\delta T$, expanding eqn [31] to first order in $\delta T$ gives

\[ \Delta \cong \beta\,\delta T/[2(T_c - T_{\mathrm{cx}})] \qquad [38] \]

Droplet Size Distribution and Nucleation Rate

In a homogeneous metastable matrix, droplets of the new phase appear as rare thermal fluctuations. We describe this process by adding a thermal noise term to the droplet equation [19] or [29]. The droplet size distribution $n(R, t)$ then obeys the Fokker–Planck equation

\[ \frac{\partial}{\partial t}n = \frac{\partial}{\partial R}L(R)\left[ \frac{\partial}{\partial R} + \frac{F'(R)}{k_{\mathrm{B}}T} \right] n \qquad [39] \]

Here $n(R, t)\,dR$ denotes the droplet number density in the range $[R, R + dR]$. We determine the kinetic coefficient $L(R)$ such that

\[ v(R) \equiv -L(R)F'(R)/k_{\mathrm{B}}T \qquad [40] \]

is the right-hand side of eqn [19] or [29]. It is equal to $\partial R/\partial t$ when the thermal noise is neglected. Thus, $L(R) \propto R^{-2}$ or $R^{-3}$ for the nonconserved or conserved case. The second derivative $(\partial/\partial R)L(R)(\partial/\partial R)$ in eqn [39] stems from the thermal noise and is negligible for $R - R_c \gtrsim 1$ in 3D (Onuki 2002). Hence, for $R - R_c \gtrsim 1$, the droplets follow the deterministic equation [19] or [29] and $n$ obeys

\[ \frac{\partial}{\partial t}n = -\frac{\partial}{\partial R}[v(R)\,n] \qquad [41] \]

In Figure 3, we plot the solution of eqn [39] for the conserved case with $F_c/k_{\mathrm{B}}T = 17.4$ (Onuki 2002). The time is measured in units of $1/\Gamma_c$, which is the timescale of a critical droplet defined by

\[ \Gamma_c = (\partial v(R)/\partial R)_{R = R_c} \qquad [42] \]

We notice $\Gamma_c \propto R_c^{-3}$ from eqn [29] so $\Gamma_c$ is small. The initial distribution is given by

\[ n(R, 0) = n_0 \exp(-4\pi\sigma R^2/k_{\mathrm{B}}T) \qquad [43] \]

Figure 3 Time evolution of the droplet size distribution $n(R, t)$ on a semilogarithmic scale ($\log_{10} n(R, t)$ versus $R/R_c$) as a solution of eqn [39] in the 3D conserved case. The first 11 curves correspond to the times at $\Gamma_c t = 0, 1, \ldots$ and 10. The last four curves are those at $\Gamma_c t = 15, 20, 25$, and 30. Reproduced with permission from Onuki A (2002) Phase Transition Dynamics. Cambridge, UK: Cambridge University Press.
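The droplet free energy of eqn [34] and its barrier can be checked numerically. The following minimal sketch locates the maximum of $F(R)$ on a grid and compares it with $R_c = \sigma/h$ from eqn [20] and $F_c$ from eqn [35]. The values of `sigma` and `h`, the grid spacing, and the function names are illustrative assumptions only.

```python
import math

def droplet_free_energy(R, sigma, h):
    """Eqn [34]: F(R) = 4 pi sigma R^2 - (8 pi / 3) h R^3."""
    return 4.0 * math.pi * sigma * R**2 - (8.0 * math.pi / 3.0) * h * R**3

def free_energy_slope(R, sigma, h):
    """Eqn [36]: F'(R) = 8 pi sigma (R - R^2 / R_c) with R_c = sigma / h."""
    Rc = sigma / h
    return 8.0 * math.pi * sigma * (R - R**2 / Rc)

sigma, h = 1.0, 0.1                          # illustrative values (assumptions)
Rc = sigma / h                               # eqn [20]
Fc = 4.0 * math.pi * sigma * Rc**2 / 3.0     # eqn [35]

# Locate the maximum of F(R) on a fine grid and compare with R_c and F_c.
grid = [0.001 * i for i in range(1, 20001)]  # R from 0.001 to 20
R_max = max(grid, key=lambda R: droplet_free_energy(R, sigma, h))
F_max = droplet_free_energy(R_max, sigma, h)

print(f"R_c = {Rc:.3f}, grid maximum at R = {R_max:.3f}")
print(f"F_c = {Fc:.3f}, grid maximum F = {F_max:.3f}")
print(f"F'(R_c) = {free_energy_slope(Rc, sigma, h):.2e}")
```

The grid maximum coincides with $R_c$ to within the grid spacing, and the slope of eqn [36] vanishes there, which is why droplets near $R_c$ evolve slowly and are strongly affected by the thermal noise.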
Here $n_0$ is a constant number density. This form has been observed in computer simulations as the droplet size distribution on the coexistence curve ($h = 0$). Figure 3 indicates that $n(R, t)$ tends to a steady solution $n_s(R)$ which satisfies

\[ L(R)\left[ \frac{\partial}{\partial R} + \frac{F'(R)}{k_{\mathrm{B}}T} \right] n_s = -I \qquad [44] \]

where $I$ is a constant. Imposing the condition $n_s(R) \to 0$ as $R \to \infty$, we integrate the above equation as

\[ n_s(R) = I\int_R^{\infty} dR_1\, \frac{1}{L(R_1)} \exp\left[ \frac{F(R_1) - F(R)}{k_{\mathrm{B}}T} \right] \qquad [45] \]

For $R - R_c \gg 1$ we may replace $F(R_1) - F(R)$ by $F'(R)(R_1 - R)$ in the integrand of eqn [45] to obtain

\[ n_s(R) \cong I/v(R) \qquad [46] \]

which also follows from eqn [41]. Thus

\[ n_s(R)\,dR = I\,dt \quad (dR = v(R)\,dt) \qquad [47] \]

This means that $I$ is the nucleation rate of droplets with radii larger than $R_c$ emerging per unit volume and per unit time. Furthermore, as $R \to 0$, we require $n_s(R) \to n_0 = \mathrm{const.}$ in eqn [43] so that

\[ n_0 = I\int_0^{\infty} dR_1\, \frac{1}{L(R_1)} \exp\left[ \frac{F(R_1)}{k_{\mathrm{B}}T} \right] \qquad [48] \]

where the integrand becomes maximum around $R_c$. Using the expansion $F(R) = F_c + F''(R_c)(R - R_c)^2/2 + \cdots$, we obtain the famous formula for the nucleation rate

\[ I = I_0 \exp(-F_c/k_{\mathrm{B}}T) \qquad [49] \]

\[ \phantom{I} = I_0 \exp(-C_0/\Delta^2) \qquad [50] \]

where the coefficient $I_0$ is of order $n_0\Gamma_c$. The second line holds in the 3D conserved case. Here, $C_0 \sim 10^{-3}$ typically and $I_0$ is a very large number in units of $\mathrm{cm}^{-3}\,\mathrm{s}^{-1}$, say, $10^{30}$. Then the exponential factor in $I$ changes abruptly from a very small to a very large number with only a slight increase of $\Delta$ at small $\Delta \ll 1$. For example, if $C_0/\Delta^2 = 50$, $I$ is increased by $\exp(100\,\delta\Delta/\Delta)$ with a small increase of $\Delta$ to $\Delta + \delta\Delta$. This factor can be of order $10^2$ even for $\delta\Delta/\Delta = 0.05$. Unless very close to criticality, simple metastable fluids become opaque suddenly with increasing $\Delta$ or $\delta T$ at a rather definite cloud point. In near-critical fluids, however, $I_0$ itself becomes small ($\propto \xi^{-6}$) such that the cloud point considerably depends on the experimental timescale (observation time).

Remarks

The order parameter can be a scalar, a vector as in the Heisenberg spin system, a tensor as in liquid crystals, and a complex number as in superfluids and superconductors. In phase ordering a crucial role is played by topological singularities like interfaces in the scalar case and vortices in the complex number case. Furthermore, a rich variety of phase transition dynamics can be explained if the order parameter is coupled to other relevant variables in the free energy and/or in the dynamic equations. We mention couplings to velocity field in fluids, electrostatic field in charged systems, and elastic field in solids. Phase ordering can also be influenced profoundly by external fields such as electric field or shear flow.

See also: Reflection Positivity and Phase Transitions; Renormalization: Statistical Mechanics and Condensed Matter; Statistical Mechanics of Interfaces; Topological Defects and Their Homotopy Classification.

Further Reading

Allen SM and Cahn JW (1979) Microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metallurgica 27: 1085.
Binder K (1991) Spinodal decomposition. In: Cahn RW, Haasen P, and Kramer EJ (eds.) Materials Science and Technology, vol. 5. Weinheim: VCH.
Bray AJ (1994) Theory of phase-ordering kinetics. Advances in Physics 43: 357.
Cahn JW (1961) On spinodal decomposition. Acta Metallurgica 9: 795.
Debenedetti PG (1996) Metastable Liquids. Princeton: Princeton University Press.
Gunton JD, San Miguel M, and Sahni PS (1983) The dynamics of first-order phase transitions. In: Domb C and Lebowitz JL (eds.) Phase Transitions and Critical Phenomena, vol. 8. London: Academic Press.
Lifshitz IM and Slyozov VV (1961) The kinetics of precipitation from supersaturated solid solutions. Journal of Physics and Chemistry of Solids 19: 35.
Ohta T, Jasnow D, and Kawasaki K (1982) Universal scaling in the motion of random interfaces. Physical Review Letters 49: 1223.
Onuki A (2002) Phase Transition Dynamics. Cambridge: Cambridge University Press.

Phase Transitions in Continuous Systems


E Presutti, Università di Roma "Tor Vergata," Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Many aspects of our everyday life, from weather to boiling water for a cup of coffee, involve heat exchanges and variations of pressure and, as a result, a phase transition. The general theory behind these phenomena is thermodynamics, which studies fluids and macroscopic bodies under these and more general transformations.

In the simple case of a one-component substance, the behavior under changes of temperature $T$ and pressure $P$ is described, according to the Gibbs phase rule, by a phase diagram such as the one in Figure 1. The curves in the $(T, P)$ plane distinguish regions where the substance is in its solid, liquid, and gas phases. Thus, in an experiment where we vary the pressure and temperature moving along a line which crosses a transition curve, we observe an abrupt and dramatic change at the crossing, when the system changes phase. As already stated, everyday life is an active source of examples of such phenomena.

The picture is "far from innocent": it states that air, liquid, and solid are not different elements of nature, as was long believed, but just different aspects of the same thing: substances are able to adapt to different external conditions in dramatically different ways. What properties of intermolecular forces are responsible for such astonishing behavior? The question has been extensively studied and it is the subject of the present article, where it will be discussed in the framework of statistical mechanics for continuous systems. Before entering into the matter, let us mention two basic motivations.

As always, there is a "fundamental theory" aspect; in the specific case it is the attempt for an atomistic theory able to describe also macroscopic phenomena, thus ranging from the angstrom to the kilometer scales. From an engineering point of view, the target is, for instance, to understand why and when a substance is an insulator, or a conductor or, maybe, a superconductor, and, more importantly, how we should change its microscopic interactions to produce such effects: this opens the way to technologies which are indeed enormously affecting our life.

Phase Transitions and Statistical Mechanics

The modern theory of statistical mechanics is based upon the Gibbs hypothesis. In a classical (i.e., not quantum) framework, the macroscopic states are described by probability measures on a particle configuration phase space. The equilibrium states are then selected by the Gibbs prescription, which requires that the probability of observing a configuration which has energy $E$ should be proportional to $e^{-\beta E}$, where $\beta = 1/kT$, $k$ is the Boltzmann constant, and $T$ the absolute temperature. These are the "Gibbs measures" and the purpose of statistical mechanics is to study their properties. A prerequisite for the success of the theory is compatibility with the principles of thermodynamics; the theory should then be able to explain the origin of the various phase diagrams and in particular to determine the circumstances under which phase transitions appear.

The theory, commonly called DLR, after Dobrushin, Lanford, and Ruelle, who, in the 1960s, contributed greatly to its foundations, has a solid mathematical basis. Its main success is a rigorous proof of consistency with thermodynamics, which is derived under the only assumption that surface effects are negligible, a condition which is mathematically achieved by studying the system in a "thermodynamic limit," where the region containing the system invades the whole space.
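The Gibbs prescription can be made concrete in the simplest possible setting. The sketch below assumes a hypothetical system with only three configurations of given energies (an assumption made purely for illustration, since the article concerns continuous systems with infinitely many configurations) and computes the normalized weights proportional to $e^{-\beta E}$ at several temperatures.

```python
import math

def gibbs_probabilities(energies, kT):
    """Normalized Gibbs weights p_i proportional to exp(-E_i / kT)."""
    beta = 1.0 / kT
    e_min = min(energies)                      # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)                           # normalization (shifted partition function)
    return [w / z for w in weights]

energies = [0.0, 1.0, 2.0]                     # hypothetical energy levels
for kT in (0.1, 1.0, 10.0):
    probs = gibbs_probabilities(energies, kT)
    print(f"kT = {kT}: {[round(p, 4) for p in probs]}")
```

At low temperature almost all the weight sits on the lowest-energy configuration, while at high temperature the weights become nearly uniform; this competition between energy and the number of available configurations is what the later sections refer to as energy versus entropy.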
Figure 1 Phase diagram of a one-component substance in the (T, P) plane, with regions labeled Solid, Liquid, and Gas.

In the thermodynamic limit, the equilibrium states can no longer be defined by the Gibbs prescription, because the energy of configurations in the whole space, being extensive, is typically infinite. The problem has been solved by first proving convergence of the finite-volume Gibbs measures in the thermodynamic limit. After defining the limit states, called "DLR states," as the equilibrium states of the
infinite systems, it is proved that the DLR states can be directly characterized (i.e., without using limit procedures) as the solutions of a set of equations, the "DLR equations," which generalize the finite-volume Gibbs prescription.

In terms of DLR states, the mathematical meaning of phase transitions becomes very clear and sharp. The starting point is the proof that the physical property that intensive variables in a pure phase have negligible fluctuations is verified by all the DLR measures which are in a special class, thus selected by this property, and which are therefore interpreted as "pure phases." All the other DLR measures are proved to be mixtures, that is, general convex combinations, of the pure DLR states. Thus, in the DLR theory, the system is in a single phase when there is only one DLR state, at the given values of the thermodynamic parameters (e.g., temperature and chemical potential), while the system is at a phase transition if there are several distinct DLR states.

While the theory beautifully clarifies the meaning of phase transitions, it does not say whether the phenomenon really occurs! This is maybe the main open problem in equilibrium statistical mechanics. A general proof of existence of phase diagrams is needed, which should at least capture the basic property behind the Gibbs phase rule, namely that in most of the space (of thermodynamic parameters) there is a single phase, with rare exceptions where several phases coexist. A more refined result should then indicate that coexistence occurs only on regular surfaces of positive codimension.

There is, however, a general result of existence of the gaseous phase, with a proof of uniqueness of DLR measures when temperature is large and density low. Coexistence of phases is much less understood at a general level, but results for particular classes of models exist, for instance, in lattice systems at low temperatures. The prototype is the ferromagnetic Ising model in two or more dimensions, where indeed the full diagram has been determined, see Figure 2. The transition curve is the segment $\{0 \le T \le T_c, h = 0\}$, in the $(T, h)$ plane, $h$ being the magnetic field. In the upper-half plane, there is a single phase with positive magnetization, in the lower one with a negative value; at $h = 0$, positive and negative magnetization states can coexist, if the temperature is lower than the critical value $T_c$. Correspondingly, there are, simultaneously, a positive and a distinctly negative DLR state, which describe the two phases.

Figure 2 Phase diagram of the Ising ferromagnet in the (T, h) plane, with the critical temperature $T_c$ marked on the T axis.

An analogous result is missing for systems of particles in the continuum, but there has been recent progress on the analysis of the liquid–vapor branch of the phase diagram, and the issue will be the main focus of this article.

Sensitive Dependence on Boundary Conditions

Phase transitions describe exceptional regimes where the system is in a critical state; this is why they are so interesting and difficult to study. As in chaotic systems, criticality corresponds to a "butterfly effect," which, in a statistical-mechanics setting, means changing far-away boundary conditions. Such changes affect the neighbors, which in turn influence their neighbors, and so on. In general, the effect decays with the distance but, at phase transition, it provokes an avalanche which propagates throughout the system reaching all its points. Its occurrence is not at all obvious, if we remember the stochastic nature of the theory. The domino effect described above can in fact, at each step, be subverted by stochastic fluctuations. The latter, in the end, may completely hide the effect of changing the boundary conditions. This is an instance of a competition between energy and entropy which is the ruling phenomenon behind phase transitions.

This intuitive picture also explains the relevance of space dimensionality. In a many-dimensional space, the influence of the boundary conditions has clearly many more ways to percolate, in contrast to the one-dimensional case, where in fact there is a general result on the uniqueness of DLR measures and therefore absence of phase transitions, for short-range interactions. For pair potentials, "short" means that the interaction energy between two molecules, respectively at $r$ and $r'$, decays as $|r - r'|^{-\alpha}$, $\alpha > 2$. There are results on the converse, namely on the presence of phase transitions when the above condition is not satisfied, mainly for lattice systems, but with partial extensions also to continuous systems. One-dimensional and long-range cases are not the main focus of this article, and the issue will not be discussed further here.
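The loss of boundary influence in one dimension can be seen in the simplest example, which is an assumption of this sketch rather than a model treated in the article: an open nearest-neighbor Ising chain with energy $-J\sum_i \sigma_i\sigma_{i+1}$ at zero field, with the first spin fixed to $+1$. A brute-force Gibbs average of the last spin is compared with the known closed form $\tanh(J/kT)^n$ for a chain of $n$ bonds.

```python
import itertools
import math

def boundary_influence(n, beta_j):
    """Mean of the last spin in an open chain sigma_0..sigma_n with sigma_0 = +1.

    Brute-force Gibbs average for the energy -J * sum_i sigma_i sigma_{i+1}
    at zero field, where beta_j = J / kT.
    """
    num = 0.0
    z = 0.0
    for rest in itertools.product((-1, 1), repeat=n):
        spins = (1,) + rest
        bonds = sum(spins[i] * spins[i + 1] for i in range(n))
        w = math.exp(beta_j * bonds)           # Gibbs weight exp(-E / kT)
        num += spins[-1] * w
        z += w
    return num / z

beta_j = 1.0
for n in (1, 2, 4, 8, 12):
    closed_form = math.tanh(beta_j) ** n
    print(n, round(boundary_influence(n, beta_j), 6), round(closed_form, 6))
```

Since $\tanh(J/kT) < 1$ at any positive temperature, the effect of the fixed boundary spin decays exponentially with distance, so far-away boundary conditions cannot select a phase in this one-dimensional example.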
Ising Model

In order to make the previous ideas quantitative, let us first describe the simple case of the Ising model. Ising spin configurations are collections $\{\sigma(x), x \in \mathbb{Z}^d\}$ of $\sigma(x) \in \{\pm 1\}$ magnetic moments called spins. In the nearest-neighbor case, the interaction between two spins is $-J\sigma(x)\sigma(y)$, $J > 0$, if $x$ and $y$ are nearest neighbors on $\mathbb{Z}^d$, or is vanishing otherwise. There are, therefore, two ground states, one with all spins equal to $+1$ and the other one with all spins equal to $-1$. Since the Gibbs probability of higher energies vanishes as the temperature goes to zero, these are interpreted as the equilibrium states at temperature $T = 0$.

If $T > 0$, configurations with larger energy will appear, even though depressed by the Gibbs factor, but their occurrence is limited if $T$ is small. In fact, in the ferromagnetic Ising model at zero magnetic field, dimensions $d \ge 2$, and low enough temperature, it has been proved that there are two distinct DLR measures, one called positive and the other negative. The typical configurations in the positive measure are mainly made by positive spins and, in such an "ocean of positive spins" there are rare and small islands of negative spins. The same situation, but with the positive and negative spins interchanged, occurs in the negative DLR state.

The selection of one of these two states can be made by choosing the positive or the negative boundary conditions, which shows how a surface effect, namely putting the boundary spins equal to $+1$ or $-1$, has a volume effect, as most of the spins in the system follow the value indicated by the boundary values. Again, this is more and more striking as we note that each spin is random, yet a strong, cooperative effect takes over and controls the system.

The original proof due to Peierls exploits the spin-flip symmetry of the Ising interaction, but it has subsequently been extended to a wider class of systems on the lattice, in the general framework of the "Pirogov–Sinai theory." This theory studies the low-temperature perturbations of ground states and it applies to many lattice systems, proving the existence of a phase transition and determining the structure of the phase diagram in the low-temperature region. The theory, however, does not cover continuous systems, where the low-temperature regime is essentially not understood, with the notable exception of the Widom and Rowlinson model.

Two Competing Species in the Continuum

The simplest version of the Widom and Rowlinson model has two types of particles, red and black, which are otherwise identical. Particles are massive points and the only interaction is a hard-core interaction among different colors, namely a red and a black particle cannot be closer than $2R_0$, $R_0 > 0$ being the hard-core radius.

The order parameter for the phase transition is the particle color. For large values of the chemical potential, and thus large densities, there are two states, one essentially red, the other black, while, if the density is low, the colors "are not separated" and there is a unique state. The proof of the statement starts by dividing the particles of a configuration into clusters, each cluster made by a maximal connected component, where two particles are called connected when their mutual distance is $< 2R_0$. Then, in each cluster, all particles have the same color (because of the hard-core exclusion between black and red), and the color is either black or red, with equal probability.

The question of phase transition is then related to cluster percolation, namely the existence of clusters which extend to infinity. If this occurs, then the influence of fixing the color of a particle may propagate infinitely far away, hence the characteristic "sensitive dependence phenomenon" of phase transitions. Percolation and hence phase transitions have been proved to exist in the positive and negative states, if the density is large and, respectively, small. The above argument is a more recent version of the original proof by Ruelle, which goes back to the 1970s.

The key element for the appearance of the phase transition is the competition between two different components, so that the analysis is not useful in explaining the mechanisms for coexistence in the case of identical particles, which are considered in the following.

Coarse Graining Transformations

The Peierls argument in Ising systems does not seem to extend to the continuum, certainly not in a trivial way. The ground states, in fact, will not be as simple as the constant configurations of a lattice system; they will instead be periodic or quasiperiodic configurations with a complicated dependence on the particle interactions. The typical fluctuations when we raise the temperature above zero have a much richer and complex structure and are correspondingly more difficult to control. Closeness to the ground states at nonzero temperature, as described in the Ising model, would prove the spontaneous breaking of the Euclidean symmetries and the existence of a crystalline phase. The question is, of course, of great interest, but it looks far beyond the reach of our present mathematical techniques.

The simpler Ising picture should instead reappear at the liquid–vapor coexistence line. Looking at the fluid on a proper spatial scale, we should in fact see a density that is essentially constant, except for small and rare fluctuations. Its value will differ in the liquid and in the gaseous states, $\rho_{\rm gas} < \rho_{\rm liq}$. Therefore, density is an order parameter for the transition and plays the role of the spin magnetization in the Ising picture.

There are general mathematical techniques developed to translate these ideas into proofs; they involve "coarse graining," "block spin transformations," and "renormalization group" procedures. The starting point is to ideally divide the space into cells. Their size should be chosen to be much larger than the typical microscopic distance between molecules, to depress fluctuations of the particle density in a cell. To study the probability distribution of the latter, we integrate out all the other degrees of freedom. After such a coarse graining, we are left with a system of spins on a lattice, the lattice sites labeling the cells (also called blocks) and each spin (also called block spin) giving the value of the density of particles in the corresponding cell. Translated into the language of block spins, the previous physical analysis of the state of the fluid suggests that most probably, in each block the density is approximately equal to either $\rho_{\rm liq}$ or $\rho_{\rm gas}$, and the same in different blocks, except in the case of small and rare fluctuations. If we represent the probability distribution of the block spins in terms of a Gibbs measure (as is always possible if the system is in a bounded region), the previous picture is compatible with a new Hamiltonian with a single spin (one-body) potential which favors the two values $\rho_{\rm liq}$ and $\rho_{\rm gas}$ and an attractive interaction between spins which suppresses changes from one to the other. A new effective low temperature should finally dampen the fluctuations.

Thus, after coarse graining, the system should be in the same universality class as the low-temperature Ising model, and we may hope, in this way, to extend to the liquid–vapor branch of the phase diagram the Pirogov–Sinai theory of low-temperature lattice systems. In particular, as in the Ising model, we will then be able to select the liquid or the vapor phases by the introduction of suitable boundary conditions.

The conditional tense arises because the computation of the coarse graining transformation is in general very difficult, if not impossible, to carry out, but there is a class of systems where it has been accomplished. These are systems of identical point particles in $\mathbb{R}^d$, $d \ge 2$, which interact with "special" two- and four-body potentials, having finite range and which can be chosen to be rotation and translation invariant; their specific form will be described later. For such systems, the above coarse graining picture works and it has been proved that in a "small" region of the temperature–chemical potential plane, there is a part of the curve where two distinct phases coexist, while elsewhere in the neighborhood, the phase is unique.

The ideas behind the choice of the Hamiltonian go back to van der Waals, and the Ginzburg–Landau theory, which are milestones in the theory of phase transitions, while the mathematics of variational problems also enters here in an important way. These are briefly discussed in the next sections.

The van der Waals Liquid–Vapor Transition

Let us then take a step back and recall the van der Waals theory of the liquid–vapor transition. As typical intermolecular forces have a strong repulsive core and a rather long attractive tail, in a continuum, mesoscopic approximation the system will be described by a free-energy functional of the type

\[
F(\rho) = \int_\Lambda f^0_{\beta,\lambda}(\rho(r))\,dr - \frac{1}{2}\int_{\Lambda\times\Lambda} J(r,r')\,\rho(r)\rho(r')\,dr\,dr' \qquad [1]
\]

where $\rho = \{\rho(r),\ r \in \Lambda\}$ is the particle density and $\Lambda$ the region where the system is confined, which, for simplicity, is taken here as a torus in $\mathbb{R}^d$, consisting of a cube with periodic boundary conditions. The term $-J(r,r')\rho(r)\rho(r')$, $J(r,r') \ge 0$, is the energy due to the attractive tail of the interaction, which is periodic in $\Lambda$; $f^0_{\beta,\lambda}(\rho) = f^0_{\beta,0}(\rho) - \lambda\rho$ is the free-energy density due to the short, repulsive part of the interaction, $\lambda$ being the chemical potential.

As noted later, [1] can be rigorously derived by a coarse graining transformation; it will be used to build a bridge between the van der Waals theory and the previous block spin analysis of the liquid–vapor phase transition. Let us take for the moment [1] as a primitive notion. By invoking the second principle of thermodynamics, the equilibrium states can be found by minimizing the free-energy functional. Supposing $J$ to be translation invariant, that is, $J(r,r') = J(r+a, r'+a)$, $r, r', a \in \mathbb{R}^d$, and calling $\alpha = \int J(r,r')\,dr'$ the intensity of $J$, we can rewrite $F(\rho)$ as

\[
F(\rho) = \int_\Lambda \Bigl\{ f^0_{\beta,\lambda}(\rho(r)) - \frac{\alpha\,\rho(r)^2}{2} \Bigr\}\,dr + \frac{1}{4}\int_{\Lambda\times\Lambda} J(r,r')\,[\rho(r)-\rho(r')]^2\,dr\,dr' \qquad [2]
\]
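The passage from [1] to [2] can be checked in two steps. The sketch below uses only the assumptions already made in the text ($J$ translation invariant on the torus $\Lambda$, with intensity $\alpha$); translation invariance is what guarantees that integrating $J(r,r')$ over the first argument also gives $\alpha$.

```latex
% Step 1: write the product of densities as a sum of squares minus a squared difference.
\rho(r)\rho(r') = \tfrac12\bigl[\rho(r)^2 + \rho(r')^2\bigr] - \tfrac12\bigl[\rho(r)-\rho(r')\bigr]^2

% Step 2: insert this into the interaction term of [1] and use
% \int_\Lambda J(r,r')\,dr' = \int_\Lambda J(r,r')\,dr = \alpha.
-\frac12 \int_{\Lambda\times\Lambda} J(r,r')\,\rho(r)\rho(r')\,dr\,dr'
  = -\frac{\alpha}{2}\int_\Lambda \rho(r)^2\,dr
    + \frac14 \int_{\Lambda\times\Lambda} J(r,r')\,\bigl[\rho(r)-\rho(r')\bigr]^2\,dr\,dr'
```

Adding the local term $\int_\Lambda f^0_{\beta,\lambda}(\rho(r))\,dr$ to both sides gives [2]: a local part with the curly bracket $f^0_{\beta,\lambda}(\rho) - \alpha\rho^2/2$, plus a non-negative term that penalizes spatial variations of $\rho$.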

This shows that the minimizer must have $\rho(r)$ constant (so that the second integral is minimized) and equal to any value which minimizes the function $\{f^0_{\beta,\lambda}(\rho) - \alpha\rho^2/2\}$. By thermodynamic principles, the free energy $f^0_{\beta,\lambda}(\rho)$ is convex in $\rho$, but, if $\alpha$ is large enough, the above expression is not convex and, by properly choosing the value of $\lambda$, the minimizers are no longer unique, hence the van der Waals phase transition.

Kac Potentials

The analogy between the above analysis of [2] and the previous heuristic study of the fluid based on coarse graining is striking. As customary in continuum theory, each mesoscopic point $r$ should be regarded as representative of a cell containing many molecules. Then the functional $F(\rho)$ can be interpreted as the effective Hamiltonian after coarse graining. The role of the one-body term is played in [2] by the curly bracket, which selects two values of $\rho$ (its minimizers, to be identified with $\rho_{\rm liq}$ and $\rho_{\rm gas}$); the attractive two-body potential is then related to the last term in [2], as it suppresses the variations of $\rho$. The analogy clearly suggests a strategy for a rigorous proof of phase transitions in the continuum, an approach which has been and still is actively pursued. It will be discussed briefly in the sequel.

The first rigorous derivation of the van der Waals theory in a statistical-mechanics setting goes back to the 1960s and to Kac, who proposed a model where the particle pair interaction is

\[
-\alpha\gamma^d e^{-\gamma|q_i - q_j|} + \text{hard core}, \qquad \alpha, \gamma > 0 \qquad [3]
\]

The phase diagram of such systems, after the thermodynamic limit, can be quite explicitly determined in the limit $\gamma \to 0$, where it has been proved to converge to the van der Waals phase diagram, under a proper choice of $f^0_{\beta,\lambda}(\cdot)$ in [1].

The characteristic features of the first term in [3] are: (1) very long range, which scales as $\gamma^{-1}$, and (2) very small intensity, which scales as $\gamma^d$, so that the total intensity of the potential, defined as the integral over the second position, is independent of $\gamma$. The additional hard-core term (which imposes that any two particles cannot get closer than $2R_0$, $R_0 > 0$ being the hard-core radius) is to ensure stability of matter, that is, to avoid collapse of the whole system on an infinitesimally small region, as would happen if only the attractive part of the interaction were present.

Derivation of the van der Waals theory has been proved for a general class of Kac potentials, where the exponential term in [3] is replaced by functions whose dependence on $\gamma$ has the same scaling properties as mentioned above (in (1) and (2)), while the hard core can be replaced by suitably repulsive interactions.

The proof, in the version proposed by Lebowitz and Penrose, uses coarse graining and shows that the effective Hamiltonian is well approximated by the van der Waals functional [1], when $\gamma$ is small, while the effective temperature scales as $\gamma^d$. The approximation becomes exact in the limit $\gamma \to 0$, where it reduces the computation of the partition function to the analysis of the minima and the ground states of an effective Hamiltonian which, in the limit $\gamma \to 0$, is exactly the van der Waals functional.

A true proof of phase transitions requires instead keeping $\gamma > 0$ fixed (instead of letting $\gamma \to 0$) and thus controlling the difference between the effective Hamiltonian after coarse graining and the van der Waals functional, which is the effective Hamiltonian, but only in the actual limit $\gamma \to 0$. In general, there is no symmetry between the two ground states, unlike in the Ising case where they are related by spin flip, and the Pirogov–Sinai theory thus comes into play. The framework in fact is exactly similar, with the lattice Hamiltonian replaced by the functional and low temperatures by small $\gamma$ (recall that the effective temperature scales as $\gamma^d$). The extension of the theory to such a setting, however, presents difficulties and success has so far been only partial.

A Model for Phase Transitions in the Continuum

The problem is twofold: to have a good control of (1) the limit theory and (2) the perturbations induced by a nonzero value of the Kac parameter $\gamma$. The former falls in the category of variational problems for integral functionals, whose prototype is the Ginzburg–Landau free energy

\[
F^{\rm gl}(\rho) = \int \bigl\{ w(\rho) + |\nabla\rho|^2 \bigr\}\,dr \qquad [4]
\]

which can be regarded as an approximation of [2] with $w$ equal to the curly bracket in [2] and $J$ replaced by a $\delta$-function. Minimization problems for this and similar functionals have been widely analyzed in the context of general variational problems theory and partial differential equations (PDEs), and the study of the limit theory can benefit from a vast literature on the subject. The analysis of the corrections due to small $\gamma$ is, however, so far quite limited. To implement the Pirogov–Sinai strategy, we need, in the case of the interaction [3],

a very detailed knowledge of the system without the Kac part of the interaction and with only hard cores. This, however, is so far not available when the particle density is near to close-packing (i.e., the maximal density allowed by the hard-core potential). Replacing hard cores by other short-range repulsive interactions does not help either, and this seems the biggest obstacle to the program.

The difficulty, however, can be avoided by replacing the hard-core potential by a repulsive many-body (more than two) Kac potential, which ensures stability as well. The class of systems covered by the approach is characterized by Hamiltonians of the form

\[
H_{\gamma,\lambda}(q) = \int_{\mathbb{R}^d} e_\lambda(\rho_\gamma(r))\,dr \qquad [5]
\]

where $e_\lambda(\rho)$ is a polynomial of the scalar field variable $\rho$, a specific example being

\[
e_\lambda(\rho) = \frac{\rho^4}{4!} - \frac{\rho^2}{2} - \lambda\rho \qquad [6]
\]

This form of the Hamiltonian is familiar from Euclidean field theories. In these theories, the free distribution of the field is Gaussian; in our case, however, the field $\rho = \rho_\gamma(r)$ is a function of the particle configurations $q = (q_i,\ i = 1, \ldots, n)$:

\[
\rho_\gamma(r) = j_\gamma * q(r) = \sum_{i=1}^{n} j_\gamma(r, q_i), \qquad j_\gamma(r,r') = \gamma^d j(\gamma r, \gamma r') \qquad [7]
\]

where $j(r,r')$ is a translation-invariant, symmetric transition probability kernel. Thus, $\rho_\gamma(r)$ is a non-negative variable which has the meaning of a local density at $r$, weighted by the Kac kernel $j_\gamma(r,r')$.

Contours and Phase Indicators

The dependence on $\gamma$ yields the scaling properties characteristic of the Kac potentials and [5] may be regarded as a generalized Kac Hamiltonian, which, in the polynomial case of [6], involves up to four-body Kac potentials. The phase diagram of the model, after taking first the thermodynamic limit and then the limit $\gamma \to 0$, is determined by the free-energy functional

\[
F(\rho) = \int \Bigl\{ e_\lambda(j * \rho(r)) - \frac{S(\rho(r))}{\beta} \Bigr\}\,dr \qquad [8]
\]

\[
S(\rho) = -\rho(\log\rho - 1) \qquad [9]
\]

where [8] is taken to be defined on a torus (to avoid convergence problems of the integral), and $j = j_\gamma$, $\gamma = 1$.

Exploiting the concavity of the entropy $S(\rho)$, it is proved that the minimizers of $F(\cdot)$ are constant functions with the constants minimizing

\[
f_{\beta,\lambda}(u) = e_\lambda(u) - \frac{S(u)}{\beta}, \qquad u \ge 0 \qquad [10]
\]

In the case of [6], to which we restrict in the sequel, for any $\beta > (3/2)^{3/2}$ there is $\lambda_\beta$ so that $f_{\beta,\lambda_\beta}(u)$ is double-well with two minimizers, $\rho_{\rm gas} < \rho_{\rm liq}$ (dependence on $\beta$ is omitted).

To "recognize" the densities $\rho_{\rm gas}$ and $\rho_{\rm liq}$ in a particle configuration, we use coarse graining and introduce two partitions of $\mathbb{R}^d$ into cubes $C^{(\ell_{\pm,\gamma})}$. The cubes $C^{(\ell_{-,\gamma})}$ of the first partition have side $\ell_{-,\gamma}$ proportional to $\gamma^{-1+\alpha}$, $\alpha > 0$ suitably small; those of the second one have side $\ell_{+,\gamma}$ proportional to $\gamma^{-1-\alpha}$; they are chosen so that each cube $C^{(\ell_{+,\gamma})}$ is a union of cubes $C^{(\ell_{-,\gamma})}$. Notice that the small cubes have side much smaller than the interaction range (for small $\gamma$), while the opposite is true for the large cubes.

Given a particle configuration $q$, we say that a point $r$ is in the liquid phase and write $\eta(r; q) = 1$, if

\[
\Bigl| \frac{|q \sqcap C^{(\ell_{-,\gamma})}|}{\ell_{-,\gamma}^{\,d}} - \rho_{\rm liq} \Bigr| \le \gamma^{a}, \qquad a > 0 \text{ suitably small} \qquad [11]
\]

for any small cube $C^{(\ell_{-,\gamma})}$ contained either in $C_r^{(\ell_{+,\gamma})}$ or in the cubes $C^{(\ell_{+,\gamma})}$ contiguous to $C_r^{(\ell_{+,\gamma})}$: $|q \sqcap C^{(\ell_{-,\gamma})}|$ is the number of particles of $q$ in $C^{(\ell_{-,\gamma})}$, and $C_r^{(\ell_{+,\gamma})}$ is the large cube which contains $r$.

Thus, $\eta(r; q) = 1$ if the local particle density is constantly close to $\rho_{\rm liq}$ in a large region around $r$. Defining $\eta(r; q) = -1$ if the above holds with $\rho_{\rm gas}$ instead of $\rho_{\rm liq}$ and setting $\eta(r; q) = 0$ in all the other cases, we then have a phase indicator $\eta(r; q)$, which identifies, for all particle configurations, which spatial regions should be attributed to the liquid and gas phases. The connected components of the complementary region are called contours and the definition of $\eta(r; q)$ has been structured in such a way that liquid and gas are always separated by a contour. The liquid phase will then be represented by a measure which gives large probability to configurations having mostly $\eta = 1$, while the gas phase by configurations with mostly $\eta = -1$.

This is quite similar to the Ising picture and, as in the Ising model, the existence of a phase transition follows from a Peierls estimate that contours have small probability. In fact, if there are few contours, the phase imposed on the boundaries of the region

where the system is observed percolates inside, invading most of the space. Thus, boundary conditions select the phase in the whole volume. The absence of the short-range potential, which was the hard-core interaction in [3], and hence the absence of all the difficulties which originate from it, allow one to carry through successfully the Pirogov–Sinai program and prove Peierls estimates on contours and, hence, the existence of a phase transition. In particular, the statistical weight of a contour is estimated by first relating the computation to one involving the functional [8] and then computing its value on density profiles compatible with the existence of the given contour. This part of the problem needs variational analysis for [8], with constraints, and benefits from a vast literature on the subject.

The phase transition is very sharp, as shown by the following ideal experiment. Having fixed $\beta > (3/2)^{3/2}$, let $\lambda$ vary in a (suitably) small interval $[\lambda_\beta - \delta, \lambda_\beta + \delta]$, $\delta > 0$, centered around the mean-field critical value $\lambda_\beta$. We consider the system in a large region with, for instance, boundary conditions $\eta = -1$ (i.e., forcing the gas phase) and fix $\gamma$ small enough. At $\lambda = \lambda_\beta - \delta$, the system has $\eta = -1$ in most of the domain, and this persists when we increase $\lambda$ up to a critical value, $\lambda_{\beta,\gamma}$, close to, but not the same as $\lambda_\beta$. For $\lambda > \lambda_{\beta,\gamma}$, $\eta = 1$ in most of the domain, except for a small layer around the boundaries. The analogous picture holds if we choose boundary conditions $\eta = 1$, and $\lambda = \lambda_{\beta,\gamma}$ is the only value of the chemical potential where the system is sensitive to the boundary conditions and both phases can be produced by the right boundary conditions. The fact that the actual value $\lambda_{\beta,\gamma}$ differs from $\lambda_\beta$ is characteristic of the Pirogov–Sinai approach and highlights the delicate nature of the proofs.

Some Related Problems

In this concluding section, two important related problems, which have not been mentioned so far, are discussed.

A natural question, after proving a phase transition, is to describe how two phases coexist, once forced to be simultaneously present in the system. This can be achieved, for instance, by suitable boundary conditions (typically positive and negative on the top and bottom of the spatial domain) or by imposing a total density (or magnetization in the case of spins) intermediate between those of the pure phases. There will then be an interface separating the two phases with a corresponding surface tension and the geometry will be determined by the solution of a variational problem and given by the Wulff shape.

Can statistical mechanics explain and describe the phenomenon? Important progress has been made recently on the subject in the case of lattice systems at low temperatures. The question has also been widely studied at the mesoscopic level, in the context of variational problems for Ginzburg and Landau and many other functionals. Therefore, all the ingredients of further development of the theory in this direction are now present.

We have so far discussed only classical systems; a few words about extensions to the quantum case are now in order. In the range of values of temperatures and densities where the liquid–vapor transition occurs, the quantum effects are not expected to be relevant. Referring to the case of bosons, and away from the Bose condensation regime (and for systems with Boltzmann statistics as well), the quantum delocalization of particles caused by the indeterminacy principle should essentially disappear after macroscopic coarse graining, and the block-spin variables should again behave classically, even though their underlying constituents are quantal. If this argument proves correct, then progress along these lines may be expected in the near future.

See also: Cluster Expansion; Ergodic Theory; Finite Group Symmetry Breaking; Pirogov–Sinai Theory; Reflection Positivity and Phase Transitions; Statistical Mechanics and Combinatorial Problems; Statistical Mechanics of Interfaces; Symmetry Breaking in Field Theory; Two-Dimensional Ising Model.

Further Reading

Dal Maso G (1993) An Introduction to $\Gamma$-Convergence. Birkhäuser.
Lebowitz JL, Mazel A, and Presutti E (1999) Liquid–vapor phase transitions for systems with finite range interactions. Journal of Statistical Physics 94: 955–1025.
Ruelle D (1969) Statistical Mechanics. Rigorous Results. Benjamin.
Sinai YaG (1982) Theory of Phase Transitions: Rigorous Results. Pergamon Press (co-edition with Akademiai Kiadó, Budapest).

Pirogov–Sinai Theory
R Kotecký, Charles University, Prague, Czech Republic, and the University of Warwick, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Pirogov–Sinai theory is a method developed to study the phase diagrams of lattice models at low temperatures. The general claim is that, under appropriate conditions, the phase diagram of a lattice model is, at low temperatures, a small perturbation of the zero-temperature phase diagram determined by ground states. The treatment can be generalized to cover temperature-driven transitions with coexistence of ordered and disordered phases.

Formulation of the Main Result

Setting

Refraining first from full generality, we formulate the result for a standard class of lattice models with finite spin state and finite-range interaction. We will mention different generalizations later.

We consider classical lattice models on the $d$-dimensional hypercubic lattice $\mathbb{Z}^d$ with $d \ge 2$. A spin configuration $\sigma = (\sigma_x)_{x \in \mathbb{Z}^d}$ is an assignment of a spin with values in a finite set $S$ to each lattice site $x \in \mathbb{Z}^d$; the configuration space is $\Omega = S^{\mathbb{Z}^d}$. For $\sigma \in \Omega$ and $\Lambda \subset \mathbb{Z}^d$, we use $\sigma_\Lambda \in \Omega_\Lambda = S^\Lambda$ to denote the restriction $\sigma_\Lambda = \{\sigma_x;\ x \in \Lambda\}$.

The Hamiltonian is given in terms of a collection of interaction potentials $(\Phi_A)$, where $\Phi_A$ are real functions on $\Omega$, depending only on $\sigma_x$ with $x \in A$, and $A$ runs over all finite subsets of $\mathbb{Z}^d$. We assume that the potential is periodic with finite range of interactions. Namely, $\Phi_{A'}(\sigma') = \Phi_A(\sigma)$ whenever $A$ and $\sigma$ are related to $A'$ and $\sigma'$ by a translation from $(a\mathbb{Z})^d$ for some fixed integer $a$, and there exists $R \ge 1$ such that $\Phi_A \equiv 0$ for all $A$ with diameter exceeding $R$.

Without loss of generality (possibly multiplying the number $a$ by an integer and increasing $R$), we may assume that $R = a$.

The Hamiltonian $H_\Lambda(\sigma_\Lambda|\bar\sigma)$ in $\Lambda$ with boundary conditions $\bar\sigma \in \Omega$ is then given by

\[
H_\Lambda(\sigma_\Lambda|\bar\sigma) = \sum_{A \cap \Lambda \ne \emptyset} \Phi_A(\sigma_\Lambda \vee \bar\sigma_{\Lambda^c}) \qquad [1]
\]

where $\sigma_\Lambda \vee \bar\sigma_{\Lambda^c} \in \Omega$ is the configuration $\sigma_\Lambda$ extended by $\bar\sigma_{\Lambda^c}$ on $\Lambda^c$. The Gibbs state in $\Lambda$ under boundary conditions $\bar\sigma \in \Omega$ (and with Hamiltonian $H$) is the probability $\mu_\Lambda(\,\cdot\,|\bar\sigma)$ on $\Omega_\Lambda$ defined by

\[
\mu_\Lambda(\{\sigma_\Lambda\}|\bar\sigma) = \frac{\exp\{-\beta H_\Lambda(\sigma_\Lambda|\bar\sigma)\}}{Z(\Lambda|\bar\sigma)} \qquad [2]
\]

with the partition function

\[
Z(\Lambda|\bar\sigma) = \sum_{\sigma_\Lambda} \exp\{-\beta H_\Lambda(\sigma_\Lambda|\bar\sigma)\} \qquad [3]
\]

We use $G(H)$ to denote the set of all periodic Gibbs states with Hamiltonian $H$ defined on $\Omega$ by means of the Dobrushin–Lanford–Ruelle (DLR) equations.

Ground-State Phase Diagram and the Removal of Degeneracy

A periodic configuration $\sigma \in \Omega$ is called a (periodic) ground state of a Hamiltonian $H = (\Phi_A)$ if

\[
H(\tilde\sigma; \sigma) = \sum_A \bigl(\Phi_A(\tilde\sigma) - \Phi_A(\sigma)\bigr) \ge 0 \qquad [4]
\]

for every finite perturbation $\tilde\sigma \ne \sigma$ of $\sigma$ ($\tilde\sigma$ differs from $\sigma$ at a finite number of lattice sites). We use $g(H)$ to denote the set of all periodic ground states of $H$. For every configuration $\sigma \in g(H)$, we define the specific energy $e_\sigma(H)$ by

\[
e_\sigma(H) = \lim_{n\to\infty} \frac{1}{|V_n|} \sum_{A \cap V_n \ne \emptyset} \Phi_A(\sigma) \qquad [5]
\]

(with $V_n$ denoting a cube consisting of $n^d$ lattice sites).

To investigate the phase diagram, we will consider a parametric class of Hamiltonians around a fixed Hamiltonian $H^{(0)}$ with a finite set of periodic ground states $g(H^{(0)}) = \{\sigma_1, \ldots, \sigma_r\}$. Namely, let $H^{(0)}, H^{(1)}, \ldots,$ and $H^{(r-1)}$ be Hamiltonians determined by potentials $\Phi^{(0)}, \Phi^{(1)}, \ldots,$ and $\Phi^{(r-1)}$, respectively, and consider the $(r-1)$-parametric set of Hamiltonians $H_t = H^{(0)} + \sum_{\ell=1}^{r-1} t_\ell H^{(\ell)}$ with $t = (t_1, \ldots, t_{r-1}) \in \mathbb{R}^{r-1}$.

Using the shorthand $e_m(H) = e_{\sigma_m}(H)$, and introducing the vectors $e(H) = (e_1(H), \ldots, e_r(H))$ and $h(t) = e(H_t) - \min_m e_m(H_t)$, we notice that for each $t \in \mathbb{R}^{r-1}$, the vector $h(t) \in \partial Q_r$, the boundary of the positive octant in $\mathbb{R}^r$. A crucial assumption for such a parametrization $H_t$ to yield a meaningful phase diagram is the condition of removal of degeneracy: we assume that $g(H^{(0)} + H^{(\ell)}) \subsetneq g(H^{(0)})$, $\ell = 1, \ldots, r-1$, and that the vectors $e(H^{(\ell)})$, $\ell = 1, \ldots, r-1$, are linearly independent.

In particular, its immediate consequence is that the mapping $\mathbb{R}^{r-1} \ni t \mapsto h(t) \in \partial Q_r$ is a bijection. This fact has a straightforward interpretation in terms of the ground-state phase diagram. Viewing the

phase diagram (at zero temperature) as a partition of the parameter space into regions $K_g$ with a given set $g \subset g(H^{(0)})$ of ground states ("coexistence of zero-temperature phases from $g$"), the above bijection means that the region $K_g$ is the preimage of the set

\[
Q_g = \{h \in \partial Q_r \mid h_m = 0 \text{ for } \sigma_m \in g \text{ and } h_m > 0 \text{ otherwise}\} \qquad [6]
\]

The partition of the set $\partial Q_r$ has a natural hierarchical structure implied by the fact that $\bar Q_{g_1} \cap \bar Q_{g_2} = \bar Q_{g_1 \cup g_2}$ ($\bar Q_g$ is the closure of $Q_g$). Namely, the origin $\{0\} = Q_{g(H^{(0)})}$ is the intersection of $r$ positive coordinate axes $\bar Q_{\{\sigma_{\bar m},\ \bar m \ne m\}}$, $m = 1, \ldots, r$; each of those half-lines is an intersection of $r-1$ two-dimensional quarter-planes with boundaries on positive coordinate axes, etc., up to $(r-1)$-dimensional planes $\bar Q_{\{\sigma_m\}}$, $m = 1, \ldots, r$. This hierarchical structure is thus inherited by the partition of the parameter space $\mathbb{R}^{r-1}$ into the regions $K_g$. Phase diagrams with such regular structure are sometimes said to satisfy the Gibbs phase rule.

We can thus summarize in a rather trivial conclusion that the condition of removal of degeneracy implies that the ground-state phase diagram obeys the Gibbs phase rule. The task of the Pirogov–Sinai theory is to provide means for proving that this remains true, at least in a neighborhood of the origin of parameter space, also for small nonzero temperatures. To achieve this, we need an effective control of excitation energies.

Peierls Condition

A crucial assumption for the validity of the Pirogov–Sinai theory is a lower bound on the energy of excitations of ground states, the Peierls condition. Although for a study of the phase diagram we consider a parametric set of Hamiltonians whose set of ground states may differ, it is useful to introduce the Peierls condition with respect to a single fixed collection $G$ of reference configurations (eventually, it will be identified with the ground states of the Hamiltonian $H^{(0)}$). Let thus a fixed set $G$ of periodic configurations $\{\sigma_1, \ldots, \sigma_r\}$ be given. Again, without loss of generality, we may assume that the periodicity of all configurations $\sigma_m \in G$ is $R$.

Before formulating the Peierls condition, we have to introduce the notion of contours. Consider the set of all sampling cubes $C(x) = \{y \in \mathbb{Z}^d : |y_i - x_i| \le R \text{ for } 1 \le i \le d\}$, $x \in \mathbb{Z}^d$. A bad cube of a configuration $\sigma \in \Omega$ is a sampling cube $C$ for which $\sigma_C$ differs from $\sigma_m$ restricted to $C$ for every $\sigma_m \in G$. The boundary $B(\sigma)$ of $\sigma$ is the union of all bad cubes of $\sigma$. If $\sigma_m \in G$ and $\sigma$ is its finite perturbation (differing from $\sigma_m$ on a finite set of lattice sites), then, necessarily, $B(\sigma)$ is finite. A contour of $\sigma$ is a pair $\gamma = (\Gamma, \sigma_\Gamma)$, where $\Gamma$ (the support of the contour $\gamma$) is a connected component of $B(\sigma)$ (and $\sigma_\Gamma$ is the restriction of $\sigma$ on $\Gamma$). Here, the connectedness of $\Gamma$ means that it cannot be split into two parts whose (Euclidean) distance is larger than 1. We use $\partial(\sigma)$ to denote the set of all contours of $\sigma$, $B(\sigma) = \bigcup_{\gamma \in \partial(\sigma)} \Gamma$.

Consider a configuration $\sigma_\gamma$ such that $\gamma$ is its unique contour. The set $\mathbb{Z}^d \setminus \Gamma$ has one infinite component to be denoted $\operatorname{Ext}\gamma$ and a finite number of finite components whose union will be denoted $\operatorname{Int}\gamma$. Observing that the configuration $\sigma$ coincides with one of the states $\sigma_m \in G$ on every component of $\mathbb{Z}^d \setminus B(\sigma)$, each of those components can be labeled by the corresponding $m$. Let $q$ be the label of $\operatorname{Ext}\gamma$; we say that $\gamma$ is a $q$-contour, and let $\operatorname{Int}_m\gamma$ be the union of all components of $\operatorname{Int}\gamma$ labeled by $m$, $m = 1, \ldots, r$.

Defining the "energy" $\Phi(\gamma)$ of a $q$-contour $\gamma$ by the equation

\[
\Phi(\gamma) = H(\sigma_\gamma; \sigma_q) + e_q(H)|\Gamma| - \sum_{m=1}^{r} \bigl(e_m(H) - e_q(H)\bigr)\,|\operatorname{Int}_m\gamma| \qquad [7]
\]

the Peierls condition with respect to the set $G$ of reference configurations is an assumption of the existence of $\rho > 0$ such that

\[
\Phi(\gamma) \ge \bigl(\rho + \min_m e_m(H)\bigr)|\Gamma| \qquad [8]
\]

for any contour of any configuration $\sigma$ that is a finite perturbation of $\sigma_q \in G$.

Notice that if $G = g(H)$, the sum on the right-hand side of [7] vanishes.

Phase Diagram

The main claim of the Pirogov–Sinai theory provides, for $\beta$ sufficiently large, a construction of regions $K_g(\beta)$ of the parameter space characterized by the coexistence of phases labeled by configurations $\sigma_m \in g$. This is done similarly as for the ground-state phase diagram discussed earlier by constructing a homeomorphism $t \mapsto a(t)$ from a neighborhood of the origin of the parameter space to a neighborhood of the origin of $\partial Q_r$ that provides the phase diagram (actually, the function $a(t)$ will turn out to be just a perturbation of $h(t)$ with errors of order $e^{-\beta}$).

Before stating the result, however, we have to clarify what exactly is meant by existence of phase $m$ for a given Hamiltonian $H$. Roughly speaking, it is the existence of a periodic extremal Gibbs state $\mu_m \in G(H)$, whose typical configurations do not differ too much from the ground-state configuration $\sigma_m$. In more technical terms, the existence of such a state is provided once we prove a

suitable bound, for the finite-volume Gibbs state $\mu_\Lambda(\,\cdot\,|\sigma_m)$ under the boundary conditions $\sigma_m$, on the probability that a fixed point in $\Lambda$ is encircled by a contour from $\partial$. If this is the case, we say that the phase $m$ is stable. It turns out that such a bound is actually an integral part of the construction of metastable free energies $f_m(t)$ yielding the homeomorphism $t \mapsto a(t)$. In this way, we get the main claim formulated as follows:

Theorem 1 Consider a parametric set of Hamiltonians $H_t = H^{(0)} + \sum_{\ell=1}^{r-1} t_\ell H^{(\ell)}$ with periodic finite-range interactions satisfying the condition of removal of degeneracy as well as the Peierls condition with respect to the reference set $G = g(H^{(0)})$. Let $d \ge 2$ and let $\beta$ be sufficiently large. Then there exists a homeomorphism $t \mapsto a(t)$ of a neighborhood $V$ of the origin of the parameter space $\mathbb{R}^{r-1}$ onto a neighborhood $U$ of the origin of $\partial Q_r$ such that, for any $t \in V$, the set of all stable phases is $\{m \in \{1, \ldots, r\} \mid a_m(t) = 0\}$.

The Peierls condition can actually be assumed only for the Hamiltonian $H^{(0)}$, inferring its validity for $H_t$ on a sufficiently small neighborhood $V$. Notice also that the result can actually be stated not as a claim about the phase diagram in a space of parameters, but as a statement about stable phases of a fixed Hamiltonian $H$. Namely, for a Hamiltonian $H$ satisfying the Peierls condition with respect to a reference set $G$, one can assure the existence of parameters $a_m$ labeled by elements from $G$ such that the set of extremal periodic Gibbs states of $H$ consists of all those $m$-phases for which $a_m = 0$.

Construction of Metastable Free Energies

An important part of the Pirogov–Sinai theory is an actual construction of the metastable free energies: a set of functions $f_m(t)$, $m = 1, \ldots, r$, that provide the homeomorphism $a(t)$ by taking $a_m(t) = f_m(t) - \min_{\bar m} f_{\bar m}(t)$.

We start with a contour representation of the partition function $Z(\Lambda|\sigma_q)$. Considering, for each contributing configuration $\sigma_\Lambda$, the collection $\partial(\sigma)$ of its contours, we notice that, in addition to the fact that different contours $\gamma, \gamma' \in \partial(\sigma)$ have disjoint supports, $\Gamma \cap \Gamma' = \emptyset$, the contours from $\partial(\sigma)$ have to satisfy the matching conditions: if $C$ is a connected component of $\mathbb{Z}^d \setminus \bigcup_{\gamma \in \partial(\sigma)} \Gamma$, then the restrictions of the spin configurations $\sigma_\gamma$ to $C$ are the same for all contours $\gamma \in \partial(\sigma)$ with $\operatorname{dist}(\Gamma, C) = 1$. In other words, the contours touching $C$ induce the same label on $C$. Let us observe that there is actually a one-to-one correspondence between configurations $\sigma_\Lambda$ that coincide with $\sigma_q$ on $\Lambda^c$ and collections $M(\Lambda, q)$ of contours $\partial$ in $\Lambda$ satisfying the matching condition, and such that the external among them are $q$-contours. Here, a contour $\gamma \in \partial$ is called an external contour in $\partial$ if $\Gamma \subset \operatorname{Ext}\gamma'$ for all $\gamma' \in \partial$ different from $\gamma$.

With this observation and using $\Lambda_m(\partial)$ to denote the union of all components of $\Lambda \setminus \bigcup_{\gamma \in \partial} \Gamma$ with label $m$, we get

\[
Z(\Lambda|\sigma_q) = \sum_{\partial \in M(\Lambda,q)} \prod_m e^{-\beta e_m(H)|\Lambda_m(\partial)|} \prod_{\gamma \in \partial} e^{-\beta\Phi(\gamma)} \qquad [9]
\]

Usefulness of such contour representations stems from an expectation that, for a stable phase $q$, contours should constitute a suppressed excitation and one should be able to use cluster expansions to evaluate the behavior of the Gibbs state $\mu_q$. However, the direct use of the cluster expansion on [9] is hampered by the presence of the energy terms $e^{-\beta e_m(H)|\Lambda_m(\partial)|}$ and, more seriously, by the requirement that the contour labels match.

Nevertheless, one can rewrite the partition function in a form that does not involve any matching condition. Namely, considering first a sum over mutually external contours $\partial^{\rm ext}$ and resumming over collections of contours which are contained in their interiors without touching the boundary (being thus prevented from "gluing" with external contours), we get

\[
Z(\Lambda|\sigma_q) = \sum_{\partial^{\rm ext}} e^{-\beta e_q(H)|\operatorname{Ext}|} \prod_{\gamma \in \partial^{\rm ext}} \Bigl\{ e^{-\beta\Phi(\gamma)} \prod_m Z_{\rm dil}(\operatorname{Int}_m\gamma|\sigma_m) \Bigr\} \qquad [10]
\]

Here the sum goes over all collections of compatible external $q$-contours in $\Lambda$, $\operatorname{Ext} = \operatorname{Ext}_\Lambda(\partial^{\rm ext}) = \bigcap_{\gamma \in \partial^{\rm ext}} (\operatorname{Ext}\gamma \cap \Lambda)$, and the partition function $Z_{\rm dil}(\Lambda|\sigma_q)$ is defined by [9] with $M(\Lambda, q)$ replaced by $M_{\rm dil}(\Lambda, q) \subset M(\Lambda, q)$, the set of all those collections whose external contours $\gamma$ are such that $\operatorname{dist}(\Gamma, \Lambda^c) > 1$. Multiplying now each term by

\[
1 = \prod_{\gamma \in \partial^{\rm ext}} \prod_m \frac{Z_{\rm dil}(\operatorname{Int}_m\gamma|\sigma_q)}{Z_{\rm dil}(\operatorname{Int}_m\gamma|\sigma_q)} \qquad [11]
\]

we get

\[
Z(\Lambda|\sigma_q) = \sum_{\partial^{\rm ext}} e^{-\beta e_q(H)|\operatorname{Ext}|} \prod_{\gamma \in \partial^{\rm ext}} \Bigl( e^{-\beta e_q(H)|\Gamma|}\, w_q(\gamma)\, Z_{\rm dil}(\operatorname{Int}\gamma|\sigma_q) \Bigr) \qquad [12]
\]

where $w_q(\gamma)$ is given by

\[
w_q(\gamma) = e^{-\beta\Phi(\gamma)}\, e^{\beta e_q(H)|\Gamma|} \prod_m \frac{Z_{\rm dil}(\operatorname{Int}_m\gamma|\sigma_m)}{Z_{\rm dil}(\operatorname{Int}_m\gamma|\sigma_q)} \qquad [13]
\]

Observing that a similar expression is valid for 3. If m 2 G, then


Zdil ðjq Þ (with an appropriate restriction on the
sum over external contours @ ext ) and proceeding by jZðjm Þj  e minq fq ðHt Þjj e j@j ½20
induction, we eventually get the representation A standard example illuminating the perturbative
X Y
Zðjq Þ ¼ eeq ðHÞjj wq ðÞ ½14 construction of the metastable free energies and
@2Cð;qÞ 2@ showing the role of entropic contributions is the
Blume–Capel model. It is defined by the Hamiltonian
where C(, q) denotes the set of all collections of X X X
nonoverlapping q-contours in . Clearly, the sum on H ðÞ ¼ J ðx  y Þ2 
2x  h x ½21
the right-hand side is exactly of the form needed to hx;yi x2 x2

apply cluster expansion, provided the contour weights with spins x 2 { 1, 0, 1}. Taking into account only
satisfy the necessary convergence assumptions. the lowest-order excitations, we get:
Even though this is not necessarily the case, there
is a way to use this representation. Namely, one can ~f
ð
; hÞ ¼ 
h  1 eð2d


artificially change the weights to satisfy the needed 
bound, for example, by modifying them to the form (sea of pluses or minuses with a single spin flip
! 0)
 
and
w0q ðÞ ¼ min wq ðÞ; ejj ½15
 
~f0 ð
; hÞ ¼  1 eð2dþ
Þ eh þ eh
with a suitable constant . The modified partition 
function X Y
Z0 ðjq Þ ¼ eeq ðHÞjj w0q ðÞ ½16 (sea of zeros with a single spin flip either 0 ! þ or
@2Cð;qÞ 2@ 0 ! )
can then be controlled by cluster expansion allowing Since these functions differ from full metastable free
to define energies f
(
, h), f0 (
, h) by terms of higher order
( e(4d2) ), the real phase diagram differs in this
1 1 order from the one constructed by equating the
fq ðHÞ ¼  lim log Z0 ðjq Þ ½17
 jj!1 jj functions ~f
(
, h) and ~f0 (
, h). It is particularly
This is the metastable free energy corresponding to the interesting to inspect the origin,
= h = 0. It is only
phase q. Applying the cluster expansion to the the phase 0 that is stable there at all small
logarithm of the sum in [16], we get jfq (H)  eq (H)j  temperatures since
e=2 . The metastable free energy corresponds to 2 1
taking the ground state q and its excitations as long f0 ð0; 0Þ  e2d < f
ð0; 0Þ  e2d ½22
 
as they are sufficiently suppressed. Once wq () exceeds
the weight ejj (and the contour would have been The only reason why the phase 0 is favored at this
actually preferred), we suppress it ‘‘by hand.’’ The point with respect to phases þ and  is that there
point is that if the phase q is stable, this never happens are two excitations of order e2d for the phase 0,
and w0q () = wq () for all q-contours . This is the idea while there is only one such excitation for þ or .
behind the use of the function fq (H) as an indicator of The entropy of the lowest-order contribution to
the stability of the phase q by taking f0 (0, 0) is overweighting the entropy of the contribu-
tion to f
(0, 0) of the same order.
aq ðtÞ ¼ fq ðHt Þ  min fm ðHt Þ ½18
m

Of course, the difficult point is to actually prove that Applications


the stability of phase q (i.e., the fact that aq (t) = 0)
Several applications, stemming from the Pirogov–
indeed implies w0q () = wq () for all . The crucial step
Sinai theory, are based on the fact that, due to the
is to prove, by induction on the diameter of  and ,
cluster expansion, we have quite accurate descrip-
the following three claims (with = 2e=2 ):
tion of the model in finite volume.
1. If  is a q-contour with aq (t) diam   =4, then One class of applications concerns various
w0q () = wq (). problems featuring interfaces between coexisting
2. If aq (t) diam   =4, then Z(jq ) = Z0 (jq ) 6¼ 0 phases. To be able to transform the problem into a
and study of the random boundary line separating the
two phases, one needs a precise cluster expansion
  formula for partition functions in volumes occupied
Zðjq Þ  efq ðHt Þjj j@j ½19
by those phases. In the situation with no symmetry

between the phases, the use of the Pirogov–Sinai theory is indispensable.

Another interesting class of applications concerns the behavior of the system with periodic boundary conditions. It is based on the fact that the partition function $Z_{T_N}$ on a torus $T_N$ consisting of $N^d$ sites can be, again with the help of the cluster expansions, explicitly and very accurately evaluated in terms of metastable free energies,

$$\Bigl| Z_{T_N} - \sum_{q=1}^{r} e^{-\beta f_q(H) N^d} \Bigr| \le \exp\bigl\{-\beta \min_m f_m(H)\, N^d - bN\bigr\} \qquad [23]$$

with a fixed constant $b$. This formula (and its generalization to the case of complex parameters) allows us to obtain various results concerning the behavior of the model in finite volumes.

Finite-Size Effects

Considering as an illustration a perturbation of the Ising model, so that it does not have the $\pm$ symmetry any more (and the value $h_t(\beta)$ of external field at which the phase transition between plus and minus phase occurs is not known), we can pose a natural question that is important for the correct interpretation of simulation data. Namely, what is the asymptotic behavior of the magnetization $m_N^{\mathrm{per}}(\beta, h) = \bigl\langle \tfrac{1}{N^d} \sum_{x \in T_N} \sigma_x \bigr\rangle^{\mathrm{per}}_{T_N}$ on a torus? In the thermodynamic limit, the magnetization $m_\infty^{\mathrm{per}}(\beta, h)$ displays, as a function of $h$, a discontinuity at $h = h_t(\beta)$. For finite $N$, we get a rounding of the discontinuity: the jump is smoothed. What is the shift of a naturally chosen finite-volume transition point $h_t(N)$ with respect to the limiting value $h_t$?

The answer can be obtained with the help of [23] once sufficient care is taken to use the freedom in the definition of the metastable free energies $f_+(h)$ and $f_-(h)$ to replace them with a sufficiently smooth version allowing an approximation of the functions $f_\pm(h)$ around the limiting point $h_t$ in terms of their Taylor expansion.

As a result, in spite of the asymmetry of the model, the finite-volume magnetization $m_N^{\mathrm{per}}(\beta, h)$ has a universal behavior in the neighborhood of the transition point $h_t$. With suitable constants $m$ and $m_0$, we have

$$m_N^{\mathrm{per}}(\beta, h) \approx m_0 + m \tanh\bigl\{N^d \beta m (h - h_t)\bigr\} \qquad [24]$$

Choosing the inflection point $h_{\max}(N)$ of $m_N^{\mathrm{per}}(\beta, h)$ as a natural finite-volume indicator of the occurrence of the transition, one can show that

$$h_{\max}(N) = h_t + \frac{3\chi}{2\beta^2 m^3}\, N^{-2d} + O(N^{-3d}) \qquad [25]$$

Zeros of Partition Functions

The full strength of the formula [23] is revealed when studying the zeros of the partition function $Z_{T_N}(z)$ as a polynomial in a complex parameter $z$ entering the Hamiltonian of the model. To be able to use the theory in this case, one has to extend the definitions of the metastable free energies to complex values of $z$. Indeed, the construction still goes through, now yielding genuinely complex contour models $w$ with the help of an inductive procedure. Notice that no analytic continuation is involved. An analog of [23] is still valid,

$$\Bigl| Z_{T_N}(z) - \sum_{m=1}^{r} e^{-\beta f_m(z) N^d} \Bigr| \le \exp\bigl\{-\beta \min_m \Re f_m(z)\, N^d - bN\bigr\} \qquad [26]$$

Using [26], it is not difficult to convince oneself that the loci of zeros can be traced down to the phase coexistence lines. Indeed, on the line of the coexistence of two phases, $\Re f_m = \Re f_n$, the partition function $Z_{T_N}(z)$ is approximated by $e^{-\beta \Re f_m N^d}\bigl(e^{-i\beta \Im f_m N^d} + e^{-i\beta \Im f_n N^d}\bigr)$. The zeros of this approximation are thus given by the equations

$$\Re f_m = \Re f_n < \Re f_\ell \ \text{ for all } \ell \neq m, n; \qquad \beta N^d (\Im f_m - \Im f_n) = \pi \bmod 2\pi \qquad [27]$$

The zeros of the full partition function $Z_{T_N}(z)$ can be proved to be exponentially close, up to a shift of order $O(e^{-bN})$, to those of the discussed approximation.

Briefly, the zeros of $Z_{T_N}(z)$ asymptotically concentrate on the phase coexistence curves with the density $(1/2\pi)\, \beta N^d\, |(d/dz)(f_m - f_n)|$.

Bibliographical Remarks and Generalizations

The original works Pirogov and Sinai (1975, 1976) and Sinai (1982) introduced an analog of the weights $w'_q(\gamma)$ and parameters $a_q(H)$ as a fixed point of a suitable mapping on a Banach space. The inductive definition used here was introduced in Kotecký and Preiss (1983) and Zahradník (1984). The completeness of the phase diagram (the fact that the stable phases exhaust the set of all periodic extremal Gibbs states) was first proved in Zahradník (1984). Extension to complex parameters was first considered in Gawędzki et al. (1987) and Borgs and Imbrie (1989). For a review of the standard Pirogov–Sinai theory, see Sinai (1982) and Slawny (1987).

Application of Pirogov–Sinai theory for finite-size effects was studied in Borgs and Kotecký (1990) and

general theory of zeros of partition functions is presented in Biskup et al. (2004).

The basic statement of the Pirogov–Sinai theory yielding the construction of the full phase diagram has been extended to a large class of models. Let us mention just a few of them (with rather incomplete references):

1. Continuous spins. The main difficulty in these models is that one has to deal with contours immersed in a sea of fluctuating spins (Dobrushin and Zahradník 1986, Borgs and Waxler 1989).
2. Potts model. An example of a system with a transition in temperature with the coexistence of the low-temperature ordered and the high-temperature disordered phases. The contour reformulation employs contours between ordered and disordered regions (Bricmont et al. 1985, Kotecký et al. 1990). The treatment is simplified with the help of the Fortuin–Kasteleyn representation (Laanait et al. 1991).
3. Models with competing interactions. ANNNI model, microemulsions. Systems with a rich phase structure (Dinaburg and Sinai 1985).
4. Disordered systems. An example is a proof of the existence of the phase transition for the three-dimensional random field Ising model (Bricmont and Kupiainen 1987, 1988) using a renormalization group version of the Pirogov–Sinai theory first formulated in Gawędzki et al. (1987).
5. Quantum lattice models. A class of quantum models that can be viewed as a quantum perturbation of a classical model. With the help of the Feynman–Kac formula these are rewritten as a (d + 1)-dimensional classical model that is, in its turn, treated by the standard Pirogov–Sinai theory (Datta et al. 1996, Borgs et al. 1996).
6. Continuous systems. Gas of particles in continuum interacting with a particular potential of Kac type. Pirogov–Sinai theory is used for a proof of the existence of the phase transitions after a suitable discretisation (Lebowitz et al. 1999).

See also: Cluster Expansion; Falicov–Kimball Model; Phase Transitions in Continuous Systems; Quantum Spin Systems.

Further Reading

Biskup M, Borgs C, Chayes JT, and Kotecký R (2004) Partition function zeros at first-order phase transitions: Pirogov–Sinai theory. Journal of Statistical Physics 116: 97–155.
Borgs C and Imbrie JZ (1989) A unified approach to phase diagrams in field theory and statistical mechanics. Communications in Mathematical Physics 123: 305–328.
Borgs C and Kotecký R (1990) A rigorous theory of finite-size scaling at first-order phase transitions. Journal of Statistical Physics 61: 79–119.
Borgs C, Kotecký R, and Ueltschi D (1996) Low temperature phase diagrams for quantum perturbations of classical spin systems. Communications in Mathematical Physics 181: 409–446.
Borgs C and Waxler R (1989) First order phase transitions in unbounded spin systems: construction of the phase diagram. Communications in Mathematical Physics 126: 291–324.
Bricmont J, Kuroda T, and Lebowitz J (1985) First order phase transitions in lattice and continuum systems: extension of Pirogov–Sinai theory. Communications in Mathematical Physics 101: 501–538.
Bricmont J and Kupiainen A (1987) Lower critical dimensions for the random field Ising model. Physical Review Letters 59: 1829–1832.
Bricmont J and Kupiainen A (1988) Phase transition in the 3D random field Ising model. Communications in Mathematical Physics 116: 539–572.
Datta N, Fernández R, and Fröhlich J (1996) Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. Helv. Phys. Acta 69: 752–820.
Dinaburg EL and Sinaï YaG (1985) An analysis of ANNNI model by Peierls contour method. Communications in Mathematical Physics 98: 119–144.
Dobrushin RL and Zahradník M (1986) Phase diagrams of continuous lattice systems. In: Dobrushin RL (ed.) Math. Problems of Stat. Physics and Dynamics, pp. 1–123. Dordrecht: Reidel.
Gawędzki K, Kotecký R, and Kupiainen A (1987) Coarse-graining approach to first-order phase transitions. Journal of Statistical Physics 47: 701–724.
Kotecký R, Laanait L, Messager A, and Ruiz J (1990) The q-state Potts model in the standard Pirogov–Sinai theory: surface tensions and Wilson loops. Journal of Statistical Physics 58: 199–248.
Kotecký R and Preiss D (1983) An inductive approach to PS theory, Proc. Winter School on Abstract Analysis, Suppl. ai Rend. del Mat. di Palermo.
Lebowitz JL, Mazel A, and Presutti E (1999) Liquid–vapor phase transitions for systems with finite range interactions. Journal of Statistical Physics 94: 955–1025.
Laanait L, Messager A, Miracle-Solé S, Ruiz J, and Shlosman SB (1991) Interfaces in the Potts model I: Pirogov–Sinai theory of the Fortuin–Kasteleyn representation. Communications in Mathematical Physics 140: 81–91.
Pirogov SA and Sinai YaG (1975) Phase diagrams of classical lattice systems (Russian). Theoretical and Mathematical Physics 25(3): 358–369.
Pirogov SA and Sinai YaG (1976) Phase diagrams of classical lattice systems. Continuation (Russian). Theoretical and Mathematical Physics 26(1): 61–76.
Sinai YaG (1982) Theory of Phase Transitions: Rigorous Results. New York: Pergamon.
Slawny J (1987) Low temperature properties of classical lattice systems: phase transitions and phase diagrams. In: Domb C and Lebowitz JL (eds.) Phase Transitions and Critical Phenomena, vol. 11, pp. 127–205. New York: Academic Press.
Zahradník M (1984) An alternate version of Pirogov–Sinai theory. Communications in Mathematical Physics 93: 559–581.
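The finite-size rounding [24] and the zero condition [27] discussed in this article can be illustrated with a small numerical sketch. It is not the rigorous construction: it assumes only two phases, drops the error term of [23], and linearizes the metastable free energies as $f_\pm(h) = f_t - (m_0 \pm m)(h - h_t)$. All function names and parameter values below are hypothetical choices made for illustration.

```python
import cmath
import math


def two_phase_Z(h, beta, m0, m, h_t, volume, f_t=0.0):
    """Two-phase torus sum e^{-beta f_+ V} + e^{-beta f_- V} (error term of [23] dropped).

    Assumption: f_pm(h) = f_t - (m0 +/- m) * (h - h_t), a first-order Taylor truncation.
    Works for real or complex h; for large volumes a log-sum-exp form would be safer.
    """
    f_plus = f_t - (m0 + m) * (h - h_t)
    f_minus = f_t - (m0 - m) * (h - h_t)
    return cmath.exp(-beta * f_plus * volume) + cmath.exp(-beta * f_minus * volume)


def magnetization(h, beta, m0, m, h_t, volume, dh=1e-6):
    """(1 / (beta V)) d(log Z)/dh for real h, by a central finite difference."""
    up = math.log(two_phase_Z(h + dh, beta, m0, m, h_t, volume).real)
    down = math.log(two_phase_Z(h - dh, beta, m0, m, h_t, volume).real)
    return (up - down) / (2.0 * dh * beta * volume)


def tanh_formula(h, beta, m0, m, h_t, volume):
    """Right-hand side of [24] with volume = N^d."""
    return m0 + m * math.tanh(volume * beta * m * (h - h_t))


def approximate_zeros(beta, m, h_t, volume, count=3):
    """Complex fields h where beta V (Im f_+ - Im f_-) = pi mod 2 pi, cf. [27]."""
    return [complex(h_t, math.pi * (2 * k + 1) / (2.0 * beta * m * volume))
            for k in range(count)]
```

Under these assumptions the numerical derivative of $\log Z$ reproduces the hyperbolic tangent of [24], and the zeros of the two-phase sum sit on the line $\Re h = h_t$ with spacing proportional to $1/(\beta m N^d)$, in line with the density quoted after [27].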
66 Point-Vortex Dynamics

Point-Vortex Dynamics
S Boatto, IMPA, Rio de Janeiro, Brazil
D Crowdy, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Vortices have a long fascinating history. Descartes wrote in his Le Monde:

. . . que tous les mouvements qui se font au Monde sont en quelque façon circulaire: c'est à dire que, quand un corps quitte sa place, il entre toujours en celle d'un autre, et celui-ci en celle d'un autre, et ainsi de suite jusques au dernier, qui occupe au même instant le lieu délaissé par le premier.

[. . . that all the motions that take place in the World are in some way circular: that is to say, when a body leaves its place, it always enters that of another, and this one that of another, and so on until the last, which occupies at the same instant the place left by the first.]

In particular, Descartes thought of vortices to model the dynamics of the solar system, as reported by W W R Ball (1940):

Descartes' physical theory of the universe, embodying most of the results contained in his earlier and unpublished Le Monde, is given in his Principia, 1644, . . . He assumes that the matter of the universe must be in motion, and that the motion must result in a number of vortices. He stated that the sun is the center of an immense whirlpool of this matter, in which the planets float and are swept round like straws in a whirlpool of water.

Descartes' theory was later refuted by Newton in his Principia in 1687. A few centuries later, W Thomson (1867), the later Lord Kelvin, made use of vortices to formulate his atomic theory: each atom was assumed to be made up of vortices in a sort of ideal fluid. In 1878–79 the American physicist A M Mayer conducted a few experiments with needle magnets placed on floating pieces of cork in an applied magnetic field, as toy models for studying atomic interactions and forms (Mayer 1878, Aref et al. 2003). In 1883, inspired by Mayer's experiments, J J Thomson combined W Thomson's atomic theory with H von Helmholtz's point-vortex theory (Helmholtz 1858): he thought of the electrons as point vortices inside a positively charged shell (see Figure 1), the vortices being located at the vertices of regular polygons, and investigated the stability of such structures (see Thomson (1883, section 2.1)). The vortex-atomic theory survived for quite a few years, until Rutherford's experiments proved that atoms have quite a different structure!

Figure 1 Thomson atomic model: (a) atom with three electrons and (b) atom with four electrons. From Thomson JJ (1883) A Treatise on the Motion of Vortex Rings. New York: Macmillan and Thomson JJ (1904) Electricity and Matter. Westminster: Archibald Constable.

Before continuing this historical/modeling overview, let's address the following question: what is a vortex and, more specifically, what is a point-vortex?

Figure 2 Hurricane Jeanne. Reproduced with permission from the National Oceanic and Atmospheric Administration (NOAA) (www.noaanews.noaa.gov).

Roughly speaking, following Descartes, a vortex is an entity which makes particles move along circular-like orbits. Examples are the cyclones and anticyclones in the atmosphere (see Figure 3). Mathematically speaking, let $\mathbf{u} = (u, v, w) \in \mathbb{R}^3$ be a velocity field; the associated vorticity field $\boldsymbol{\omega}$ is defined to be

$$\boldsymbol{\omega} = \nabla \wedge \mathbf{u} \qquad [1]$$

In this article we are considering exclusively inviscid flows which are also incompressible, that is,

$$\nabla \cdot \mathbf{u} = 0 \qquad [2]$$

and have constant density $\rho$, which we normalize to be equal to 1 ($\rho = 1$). In two dimensions, a point-vortex field is the simplest of all vorticity fields: it can be thought of as an entity where the vorticity field is concentrated into a point. In other words, point vortices are singularities of the vorticity field! Then, in the plane the vorticity field associated to a system of $N$ point vortices is

$$\omega(\mathbf{r}) = \sum_{\alpha=1}^{N} \Gamma_\alpha\, \delta(\mathbf{r} - \mathbf{r}_\alpha) \qquad [3]$$
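The idea that a point vortex carries all of its vorticity at a single point can be checked numerically through the circulation around a closed curve. The sketch below assumes the standard planar velocity field of a single point vortex of strength $\Gamma$ at $\mathbf{r}_0$, $\mathbf{u} = \Gamma\,(-(y - y_0),\, x - x_0)/(2\pi \|\mathbf{r} - \mathbf{r}_0\|^2)$, which corresponds to the kernel $K$ introduced later in the article. The function names and the choice of a circular contour are illustrative assumptions.

```python
import math


def point_vortex_velocity(x, y, gamma, x0=0.0, y0=0.0):
    """Velocity induced at (x, y) by a point vortex of strength gamma at (x0, y0)."""
    dx, dy = x - x0, y - y0
    r2 = dx * dx + dy * dy
    factor = gamma / (2.0 * math.pi * r2)
    return (-factor * dy, factor * dx)


def circulation(gamma, center, radius, vortex=(0.0, 0.0), samples=2000):
    """Midpoint-rule approximation of the line integral of u . ds
    around a circle traversed anticlockwise."""
    total = 0.0
    dtheta = 2.0 * math.pi / samples
    for k in range(samples):
        theta = (k + 0.5) * dtheta
        x = center[0] + radius * math.cos(theta)
        y = center[1] + radius * math.sin(theta)
        u, v = point_vortex_velocity(x, y, gamma, vortex[0], vortex[1])
        # Unit tangent of the circle is (-sin(theta), cos(theta)); ds = radius * dtheta.
        total += (-u * math.sin(theta) + v * math.cos(theta)) * radius * dtheta
    return total
```

With these assumptions, a circle that encloses the vortex returns a circulation close to $\Gamma$ regardless of its center and radius, while a circle that does not enclose it returns a value close to zero, as expected for a delta-function vorticity.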

Here $\Gamma_\alpha$, $\alpha = 1, \ldots, N$, is a constant and corresponds to the vorticity (or circulation) of the $\alpha$-vortex, situated at $\mathbf{r}_\alpha$. In fact, by definition, the circulation around a curve $C$ delimiting a region $\Sigma$ with boundary $C$ is

$$\Gamma_C = \oint_C \mathbf{u} \cdot d\mathbf{s} = \iint_\Sigma (\nabla \wedge \mathbf{u}) \cdot \mathbf{n}\, dA = \iint_\Sigma \boldsymbol{\omega} \cdot \mathbf{n}\, dA \qquad [4]$$

where we have used Stokes' theorem to bring in the vorticity. Then if the region $\Sigma$ contains only the $\alpha$th point vortex, we obtain

$$\Gamma_C = \iint_\Sigma \boldsymbol{\omega} \cdot \mathbf{n}\, dA = \Gamma_\alpha \qquad [5]$$

by eqn [3]. A positive (resp. negative) sign of $\Gamma_\alpha$ indicates that the corresponding point vortex induces an anticlockwise (resp. clockwise) particle motion (see Figure 4a).

Figure 3 Cyclones and anticyclones in the atmosphere. Reproduced from Boatto S and Cabral HE, SIAM Journal of Applied Mathematics 64: 216–230 (2003). With the permission of SIAM. (Labels in the figure: a region $\Sigma$ bounded by a curve $C$, containing vortices $\Gamma_1, \ldots, \Gamma_7$.)

Is there an analog of a point-vortex system for a three-dimensional flow? Yes, and this brings in the analogy between vortex lines and magnetic field lines that Mayer used in his experiments with floating magnets. In fact, in three dimensions, the notion of a point vortex can be extended to that of a straight vortex line (see Figure 4b), where, by definition, a vortex line is a curve that is tangent to the vorticity vector $\boldsymbol{\omega}$ at each of its points. In this context we would like to mention the beautiful experiments of Yarmchuk–Gordon–Packard on vortices in superfluid helium. They observed the formation of stable polygonal configurations of identical vortices, quite similar to the ones observed by Mayer with his magnets (see Figures 5 and 1).

Figure 4 (a) Advected by the velocity field of one point vortex, a test particle follows a circular orbit, with a speed proportional to the absolute value of the vortex circulation and inversely proportional to its distance from the vortex. (b) Straight vortex lines.

One would like to understand how such configurations form and to give a theoretical account of their stability. In order to answer these questions we have to first be able to describe the dynamics of a system of point vortices from a mathematical point of view.

Evolution Equations

Can point vortices be viewed as "discrete" (or localized) solutions of the Euler equation in two dimensions? Let us consider the Euler equation

$$\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = -\nabla p + \mathbf{f} \qquad [6]$$

where $p$ is the pressure, $\mathbf{f} = -\nabla U$ is a conservative force, and restrict our attention to the two-dimensional setting, for example, vortex dynamics on the plane (or a sphere). Then it is immediate that by taking the curl of eqn [6] we obtain the evolution equation of the vorticity, that is,

$$\frac{\partial \omega}{\partial t} + \mathbf{u} \cdot \nabla \omega = 0, \quad \text{or} \quad \frac{D\omega}{Dt} = 0 \qquad [7]$$

where the operator $D/Dt = \partial/\partial t + \mathbf{u} \cdot \nabla$ is called the material derivative and describes the evolution along the flow lines. It follows from eqn [7] that in two dimensions the vorticity is conserved as it is transported along the flow lines. Then a natural question arises: supposing the vorticity field $\omega$ is known, is it possible to deduce the velocity field $\mathbf{u}$ generating $\omega$? Or in other words, is it possible to solve the system of eqns [1]–[2]? It is immediate to see that in general the solution is not unique, if some boundary conditions are not specified (see Marchioro and Pulvirenti (1993)). Furthermore, as already observed by Kirchhoff in 1876 (Boatto and Cabral 2003), in two dimensions we can recast the fluid equations [1]–[2] into a Hamiltonian formalism. In fact, notice that on the plane $\mathbf{u} = (\dot{x}, \dot{y})$ and eqn [2] is still satisfied if we represent the velocity components as

$$\dot{x} = \frac{\partial \psi}{\partial y}, \qquad \dot{y} = -\frac{\partial \psi}{\partial x} \qquad [8]$$

that is, by means of $\psi$, called the stream function. Formally, $\psi$ plays the role of a Hamiltonian for the pair of conjugate variables $(x, y)$ and it is used to describe the dynamics of a test particle, located at $(x, y)$ and advected by the flow. By substituting [8] into [1], we obtain

$$\Delta \psi(\mathbf{r}) = -\omega(\mathbf{r}) \qquad [9]$$

that is, a Poisson equation with $\omega$ as a source term. Then, once we specify the vorticity field, by inverting [9] we obtain the stream function $\psi$ to be

$$\psi(\mathbf{r}) = \int G(\mathbf{r}, \mathbf{r}')\, \omega(\mathbf{r}')\, d\mathbf{r}' \qquad [10]$$

where $G(\mathbf{r}, \mathbf{r}')$ is the Green's function, solution of the equation $\Delta G(x, y) = -\delta(x, y)$. The Green's function both for the plane and the sphere is (Marchioro and Pulvirenti 1993)

$$G(\mathbf{r}, \mathbf{r}') = -\frac{1}{4\pi} \log \|\mathbf{r} - \mathbf{r}'\|^2 \qquad [11]$$

where $\|\mathbf{r} - \mathbf{r}'\|^2 = (x - x')^2 + (y - y')^2$. By [10], once we specify the vorticity field $\omega(\mathbf{r})$ we can compute $\psi$, and by replacing it into [8] the velocity field becomes

$$\mathbf{u}(\mathbf{r}) = \int K(\mathbf{r}, \mathbf{r}')\, \omega(\mathbf{r}')\, d\mathbf{r}' \qquad [12]$$

where $K(\mathbf{r}, \mathbf{r}') = (\mathbf{r} - \mathbf{r}')^{\perp} / [2\pi \|\mathbf{r} - \mathbf{r}'\|^2]$, with $(a, b)^{\perp} = (-b, a)$, and it represents the velocity field generated by a point vortex of intensity one, located at $\mathbf{r}'$. Then by considering the vorticity field generated by point vortices, eqn [3], together with eqn [11], eqn [10] becomes

$$\psi(\mathbf{r}) = -\frac{1}{4\pi} \int \log \|\mathbf{r} - \mathbf{r}'\|^2 \Bigl( \sum_{\alpha=1}^{N} \Gamma_\alpha\, \delta(\mathbf{r}' - \mathbf{r}_\alpha) \Bigr) d\mathbf{r}' = -\frac{1}{4\pi} \sum_{\alpha=1}^{N} \Gamma_\alpha \log \|\mathbf{r} - \mathbf{r}_\alpha\|^2 \qquad [13]$$

Figure 5 Photographs of vortex configurations in a rotated sample of superfluid helium with 1, . . . , 11 vortices. Reprinted figure with permission from Yarmchuk EJ, Gordon MJV, and Packard RE (1979) Observation of stationary vortex arrays in rotating superfluid helium. Physical Review Letters 43(3): 214–217. Copyright (1979) by the American Physical Society.

Equation [13], together with [8], describes the dynamics of a test particle at a point $\mathbf{r} = (x, y)$ in the plane. Analogously, it can be shown that the dynamics of a system of point vortices in the plane is given by the equations

$$\Gamma_\alpha \frac{dx_\alpha}{dt} = \frac{\partial H_v}{\partial y_\alpha}, \qquad \Gamma_\alpha \frac{dy_\alpha}{dt} = -\frac{\partial H_v}{\partial x_\alpha} \qquad [14]$$

where $(q_\alpha, p_\alpha) = (x_\alpha, \Gamma_\alpha y_\alpha)$, $\alpha = 1, \ldots, N$, is a pair of conjugate variables and $H_v$ is the generalization of the stream function $\psi$ (eqn [13]):

$$H_v = -\frac{1}{4\pi} \sum_{\substack{\alpha, \beta = 1 \\ \alpha \neq \beta}}^{N} \Gamma_\alpha \Gamma_\beta \log \|\mathbf{r}_\alpha - \mathbf{r}_\beta\|^2 \qquad [15]$$

Notice that the vortex Hamiltonian $H_v$ (eqn [15]) is an autonomous Hamiltonian and, as we will discuss in the first subsection, it provides a good Lyapunov-like function to study stability properties of some vortex configurations. Moreover, $H_v$ is invariant with respect to rotations and translations; then, by the Noether theorem, there are other first integrals of motion, that is,

$$L = \sum_{k=1}^{N} \Gamma_k \|\mathbf{r}_k\|^2, \qquad M_x = \sum_{k=1}^{N} \Gamma_k x_k, \qquad M_y = \sum_{k=1}^{N} \Gamma_k y_k$$
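The conservation laws above can be checked numerically. The following is a minimal sketch, not the authors' method: it advances the vortices with the velocities induced through the kernel $K$ of eqn [12] using a classical fourth-order Runge–Kutta step, and evaluates $H_v$, $L$, $M_x$, $M_y$. The function names, step size, and the normalization of the Hamiltonian (a sum over unordered pairs, chosen here so that eqn [14] reproduces the kernel velocities) are assumptions; an overall constant factor in $H_v$ does not affect whether it is conserved.

```python
import math


def velocities(pos, gammas):
    """Velocity of each vortex induced by all the others, using
    K = (r - r')^perp / (2 pi |r - r'|^2) with (a, b)^perp = (-b, a)."""
    vel = []
    for a, (xa, ya) in enumerate(pos):
        u = v = 0.0
        for b, (xb, yb) in enumerate(pos):
            if a == b:
                continue
            dx, dy = xa - xb, ya - yb
            d2 = dx * dx + dy * dy
            u -= gammas[b] * dy / (2.0 * math.pi * d2)
            v += gammas[b] * dx / (2.0 * math.pi * d2)
        vel.append((u, v))
    return vel


def rk4_step(pos, gammas, dt):
    """One classical Runge-Kutta step for dr/dt = velocities(r)."""
    def shifted(p, k, s):
        return [(x + s * kx, y + s * ky) for (x, y), (kx, ky) in zip(p, k)]

    k1 = velocities(pos, gammas)
    k2 = velocities(shifted(pos, k1, dt / 2.0), gammas)
    k3 = velocities(shifted(pos, k2, dt / 2.0), gammas)
    k4 = velocities(shifted(pos, k3, dt), gammas)
    return [(x + dt * (a[0] + 2 * b[0] + 2 * c[0] + d[0]) / 6.0,
             y + dt * (a[1] + 2 * b[1] + 2 * c[1] + d[1]) / 6.0)
            for (x, y), a, b, c, d in zip(pos, k1, k2, k3, k4)]


def invariants(pos, gammas):
    """Return (H, L, Mx, My); H is summed over unordered pairs (assumed normalization)."""
    H = 0.0
    for a in range(len(pos)):
        for b in range(a + 1, len(pos)):
            d2 = (pos[a][0] - pos[b][0]) ** 2 + (pos[a][1] - pos[b][1]) ** 2
            H -= gammas[a] * gammas[b] * math.log(d2) / (4.0 * math.pi)
    L = sum(g * (x * x + y * y) for g, (x, y) in zip(gammas, pos))
    Mx = sum(g * x for g, (x, _) in zip(gammas, pos))
    My = sum(g * y for g, (_, y) in zip(gammas, pos))
    return H, L, Mx, My


def simulate(pos, gammas, dt=1e-3, steps=1000):
    """Integrate the point-vortex system for steps * dt time units."""
    for _ in range(steps):
        pos = rk4_step(pos, gammas, dt)
    return pos
```

For example, calling `invariants` before and after `simulate` on three vortices of unequal strength should return the same four numbers up to the integration error. For two identical vortices of strength 1 placed one unit apart, the pair should rotate rigidly about its midpoint at angular frequency $1/\pi$.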

These quantities express, respectively, the conservation of angular momentum, $L$, and linear momentum, $\mathbf{M} = (M_x, M_y)$, on the plane. We shall denote with $M$ the magnitude of $\mathbf{M}$ (i.e., $M = \|\mathbf{M}\|$). Furthermore, by introducing the Poisson bracket

$$[f, g] = \sum_{\alpha=1}^{N} \Bigl( \frac{\partial f}{\partial q_\alpha} \frac{\partial g}{\partial p_\alpha} - \frac{\partial f}{\partial p_\alpha} \frac{\partial g}{\partial q_\alpha} \Bigr) = \sum_{\alpha=1}^{N} \frac{1}{\Gamma_\alpha} \Bigl( \frac{\partial f}{\partial x_\alpha} \frac{\partial g}{\partial y_\alpha} - \frac{\partial f}{\partial y_\alpha} \frac{\partial g}{\partial x_\alpha} \Bigr)$$

we can construct three integrals in involution out of the four conserved quantities $L$, $M_x$, $M_y$, and $H_v$. These are $L$, $M_x^2 + M_y^2$ and $H_v$: in fact,

$$[H_v, L] = 0, \qquad [H_v, M_x^2 + M_y^2] = 0, \qquad [L, M_x^2 + M_y^2] = 0$$

It is then possible to reduce the system of equations from $N$ to $N - 2$ degrees of freedom. A Hamiltonian system with $N$ degrees of freedom is integrable whenever there are $N$ independent integrals of motion in involution. It follows that a vortex system with $N \le 3$ is integrable, whereas the system of equations of four identical vortices has been shown by Ziglin to be nonintegrable in the sense that there are no other first integrals analytically depending on the coordinates and circulations, and functionally independent of $L$, $H_v$, $M_x$, $M_y$ (see Ziglin (1982)). The following, however, has been shown:

1. Let $K = \sum_{k=1}^{N} \Gamma_k$ be the total vorticity, $\mathbf{M} = (M_x, M_y)$ the total momentum and $M = \|\mathbf{M}\|$. Then, as shown by Aref and Stremler (1999), if $K = 0$ and $M = 0$, the $N$-vortex problem [16] is integrable.
2. A system of four identical vortices (i.e., $\Gamma_\alpha = \Gamma$ for $\alpha = 1, \ldots, 4$) can undergo periodic or quasiperiodic motion for special initial conditions (see Khanin (1981) Russian Math. Surveys 36: 231; Aref and Pomphrey (1982) Proc. R. Soc. Lond. A 380: 359–387). More specifically, the motion of a system of four identical vortices can be periodic, quasiperiodic, or chaotic depending on the symmetry of the initial configuration. In fact, every vortex configuration that belongs to the subspace of symmetric configurations, $x_\alpha = -x_{\alpha+2}$ and $y_\alpha = -y_{\alpha+2}$, $\alpha = 1, 2$, gives rise to an integrable vortex motion.

We have that up to two vortices, the motion is almost always periodic and the orbits are circles; the only exception being the case for which $\Gamma_2 = -\Gamma_1$, when the circles degenerate into straight lines. Thus, a configuration of two point vortices is always a relative equilibrium configuration, that is, there exists a specific reference frame in which the two vortices are at rest. If the vortices are identical ($\Gamma_1 = \Gamma_2 = \Gamma$), the motion is synchronous with frequency $\Omega = \Gamma/\pi$ (for vortices a unit distance apart) and the vortices share the same circular orbit (see Figure 6a). If the vortices are not identical and have vorticities of different magnitudes (say $|\Gamma_1| > |\Gamma_2|$), their motion is still synchronous and periodic, with frequency $\Omega = (\Gamma_1 + \Gamma_2)/(2\pi)$, and the vortices move on different circular orbits (with $r_2 < r_1$) both centered at the center of vorticity. Note that for both cases, identical and nonidentical vortices, we can view the vortex dynamics in a co-rotating frame where the vortices are simply at rest.

Figure 6 (a–d) For N = 2 the vortex dipole exhibits a synchronous motion and the orbits are in general circular orbits, with the exception of the case (d) for which $\Gamma_1 = -\Gamma_2$ and the circular orbit degenerates into a line (or a circle of infinite radius). (Panel labels in the figure: $\Gamma = \Gamma$; $0 < \Gamma_1 < \Gamma_2$; $|\Gamma_1| > |\Gamma_2|$; $\Gamma_1 = -\Gamma_2$.)

For three vortices we can have periodic and quasiperiodic motion, depending on the initial conditions, and for four vortices we can have periodic, quasiperiodic, or weakly chaotic motion.

Remarks

(i) The nonintegrability of the 4-vortex system was also proved for configurations of nonidentical vortices. Koiller and Carvalho (1989) gave an analytical proof for $\Gamma_1 = -\Gamma_2$ and $\Gamma_3 = \Gamma_4 = \epsilon$, $0 < \epsilon \ll 1$. Moreover, Castilla et al. (1993) considered the case $\Gamma_1 = \Gamma_2 = \Gamma_3 = 1$ and $\Gamma_4 = \epsilon$.

(ii) Due to the translational and rotational symmetries of $H_v$, there are some analogies between the N-vortex problem and the N-body problem, especially as concerns configurations of relative equilibria (see Albouy (1996) and Glass (2000)). A relative equilibrium is a vortex (or mass) configuration that moves without change of shape or form, that is, a configuration which is steadily

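rotating or translating. A few examples are vortex polygons (see Figure 7) like the ones studied by Thomson, Mayer, Yarmchuk–Gordon–Packard, Boatto–Cabral (2003), Cabral–Schmidt (1999/2000), Dritschel–Polvani (1993), Lim–Montaldi–Roberts (2001), Sakajo (2004). For an exhaustive review on relative equilibria of vortices, see the article by Aref et al. (2003). We shall discuss the stability of polygonal vortex configurations in the following subsection.

Figure 7 Polygonal configuration of vortices: (a) planar configurations and (b) configurations of vortex rings on a sphere, with and without polar vortices.

(iii) As shown by Kimura (1999) in a beautiful geometrical formalism, on the unit sphere ($S^2$) and on the hyperbolic plane ($H^2$), the vortex Hamiltonians [15] are

$$H_v = -\frac{1}{4\pi} \sum_{\alpha \neq \beta}^{N} \Gamma_\alpha \Gamma_\beta \log(1 - \cos \gamma_{\alpha\beta}) \quad \text{on } S^2$$

$$H_v = -\frac{1}{4\pi} \sum_{\alpha \neq \beta}^{N} \Gamma_\alpha \Gamma_\beta \log \frac{\cosh \gamma_{\alpha\beta} - 1}{\cosh \gamma_{\alpha\beta} + 1} \quad \text{on } H^2$$

where

$$\cos \gamma_{\alpha\beta} = \cos\theta_\alpha \cos\theta_\beta + \sin\theta_\alpha \sin\theta_\beta \cos(\phi_\alpha - \phi_\beta) \quad \text{on } S^2$$

$$\cosh \gamma_{\alpha\beta} = \cosh\theta_\alpha \cosh\theta_\beta - \sinh\theta_\alpha \sinh\theta_\beta \cos(\phi_\alpha - \phi_\beta) \quad \text{on } H^2$$

On $S^2$, $\theta_\alpha$ and $\phi_\alpha$ are, respectively, the co-latitude and the longitude of the $\alpha$-vortex, $\alpha = 1, \ldots, N$. We can define canonical variables $q_\alpha$ and $p_\alpha$ on $S^2$ and $H^2$, respectively, as

$$q_\alpha = \Gamma_\alpha \cos\theta_\alpha, \quad p_\alpha = \phi_\alpha \quad \text{on } S^2$$

$$q_\alpha = \Gamma_\alpha \cosh\theta_\alpha, \quad p_\alpha = \phi_\alpha \quad \text{on } H^2$$

Montaldi et al. (2002) studied vortex dynamics on a cylindrical surface, and Soulière and Tokieda (2002) considered vortex dynamics on surfaces with symmetries.

(iv) As we shall see in the section on point vortex motion, it is sometimes useful to employ the complex analysis formalism. Then the variables of interest are $z_\alpha = x_\alpha + i y_\alpha$, $\alpha = 1, \ldots, N$, and its conjugate $\bar{z}_\alpha$; the Hamiltonian [15] takes the form

$$H_v = -\frac{1}{2\pi} \sum_{\alpha \neq \beta} \Gamma_\alpha \Gamma_\beta \log |z_\alpha - z_\beta|$$

and the equations of motion become

$$\dot{z}_\alpha = \frac{i}{2\pi} \sum_{\substack{\beta = 1 \\ \beta \neq \alpha}}^{N} \Gamma_\beta\, \frac{z_\alpha - z_\beta}{|z_\alpha - z_\beta|^2}, \qquad \alpha = 1, \ldots, N \qquad [16]$$

(v) Equation [14] can be rewritten in a more compact form as

$$\frac{dX}{dt} = J\, \nabla_X H_v \qquad [17]$$

where

$$X = (q_1, \ldots, q_N, p_1, \ldots, p_N), \qquad \nabla_X = \Bigl( \frac{\partial}{\partial q_1}, \ldots, \frac{\partial}{\partial q_N}, \frac{\partial}{\partial p_1}, \ldots, \frac{\partial}{\partial p_N} \Bigr), \qquad J = \begin{pmatrix} O & I \\ -I & O \end{pmatrix}$$

$I$ being the $N \times N$ identity matrix.

(vi) How close is the point-vortex model to the original Euler equation? Point-vortex systems represent discrete solutions of the Euler equation in a "weak" sense (see both the book and the article by Marchioro and Pulvirenti (1993, 1994)). These authors proved that the Euler dynamics is "similar" to the vortex dynamics in which the vortices are localized in very small regions, and the vortex intensities are the total vorticities associated to such small regions. In particular, let us consider a vorticity field with compact support on a family of $\epsilon$-balls, that is,

$$\omega^\epsilon = \sum_{i=1}^{N} \omega_i^\epsilon$$

with support of $\omega_i^\epsilon$ contained in the ball of center $\mathbf{r}_i$ (independent of $\epsilon$) and radius $\epsilon$. Furthermore let us assume that

$$\int_{|\mathbf{r} - \mathbf{r}_i| \le \epsilon} \omega_i^\epsilon\, d\mathbf{r} = \Gamma_i$$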

stability of co-rotating point vortices in the plane. In


ε particular, his interest was in configurations of
ε 0 identical vortices equally spaced along the circum-
Figure 8 In the limit  ! 0, the dynamics of the center of ference of a circle, that is, located at the vertices of a
vorticity of a vortex -ball is approximated by the dynamics of a regular polygon (see Figure 7). He proved that for
point vortex. six or fewer vortices the polygonal configurations
are stable, while for seven vortices – the Thomson
with the
i independent of . Then in the limit  ! 0 heptagon – he erroneously concluded that the
the dynamics
R of the center of vorticity configuration is slightly unstable. It took more
B (t) = r! (r, t) dr, of a given -ball, ‘‘converges’’ than a century to make some progresses on this
to the motion of a single point vortex (see Figure 8). problem. D G Dritschel (1985) succeeded in solving
This result is important to illustrate as vortex the heptagon mystery for what concerns its linear
systems provide both a useful heuristic tool in the stability analysis, leaving open the nonlinear stabi-
analysis of the general properties of the solutions of lity question: he proved that the Thomson heptagon
Euler’s equations (Poupaud 2002, Schochet 1995), is neutrally stable and that for eight or more vortices
and a useful starting point for the construction of the corresponding polygonal configurations are
practical algorithms for solving equations in specific linearly unstable. Later on in 1993, Polvani and
situations. In particular, it provides a theoretical Dritschel (1993) generalized the techniques used in
justification to the vortex method previously intro- Dritschel (1985) to study the linear stability of a
duced by Carnevale et al. (1992). These authors ‘‘latitudinal’’ ring of point vortices on the sphere, as
constructed a numerical algorithm to study turbu- a function of the number N of vortices in the ring,
lence decaying in two dimensions. Their vortex and of the ring’s co-latitude  (see Figure 10). They
method greatly simplifies fluid simulations as basi- proved that polygonal configurations are more
cally it relies on a discretization of the fluid into unstable on the sphere than in the plane. In
circular patches. The dynamics of patches is given particular, they showed that at the pole, for N < 7
by the centers of vorticity, which interact as a point- the configuration is stable, for N = 7 it is neutrally
vortex system, endowed with a rule dictating how stable and for N > 7 it is unstable. By means of the
patches merge (see Figure 9). energy momentum method (Marsden–Meyer–Weistein
reduction), J E Marsden and S Pekarsky (1998)
studied the nonlinear stability analysis for the
Stability of a Vortex Ring
integrable case of polygonal configurations of
As mentioned in the Introduction section, the study three vortices of arbitrary vorticities (1 , 2 and
of vortex relative equilibria has a long history. 3 ) on the sphere, leaving open the stability
Kelvin showed that steadily rotating patterns of analysis for nonintegrable vortex systems (N > 3).
identical vortices arise as solutions of a variational In 1999 H E Cabral and D S Schmidt completed
problem in which the interaction energy (vortex the linear and nonlinear stability analysis at once
Hamiltonian) is minimized subject to the constraint for polygonal configurations in the plane. In 2003
that the angular impulse be maintained (see Aref Boatto and Cabral studied the nonlinear stability of
(2003). In 1883, while studying and modeling the a ring of vortices on the sphere, as a function of the
atomic structure, J J Thomson investigated the linear number of vortices N and the ring colatitude .

2a1
z

2a3
2a2
θ

Figure 9 In Carnevale et al. (1992) the fluid is modeled by a y


dilute vortex gas with density  and typical radius a. The x
dynamics is governed by the point-vortex dynamics of the disk
centers, each disk corresponding to a point vortex of intensity
 =  ext a 2 , where ext plays the role of a vorticity density. Two
vortices or radius a1 and a2 merge when their center-to-center
distance is less or equal to the sum of their radii, a1 þ a1 . Then a Figure 10 Latitudinal ring of vortices. Reproduced with
new vortex is created and its radius a3 is given by permission from Boatto S and Cabral HE SIAM Journal of
a3 = (a14 þ a24 )1=4 . Applied Mathematics 64: 216–230 (2003).
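The merging rule quoted in the caption of Figure 9 can be sketched in a few lines. This is only an illustration under stated assumptions: patches are stored as plain `(x, y, a)` tuples, the function names are hypothetical, and the position assigned to the merged patch (an area-weighted mean of the two centers) is an assumption that the caption does not specify.

```python
import math

def circulation(radius, xi_ext=1.0):
    """Intensity of the equivalent point vortex: Gamma = pi * xi_ext * a**2."""
    return math.pi * xi_ext * radius ** 2

def should_merge(p1, p2):
    """Merging test: center-to-center distance <= a1 + a2."""
    (x1, y1, a1), (x2, y2, a2) = p1, p2
    return math.hypot(x1 - x2, y1 - y2) <= a1 + a2

def merge(p1, p2):
    """Return the merged patch, with radius a3 = (a1**4 + a2**4) ** (1/4)."""
    (x1, y1, a1), (x2, y2, a2) = p1, p2
    a3 = (a1 ** 4 + a2 ** 4) ** 0.25
    # ASSUMPTION (not stated in the caption): the new center is the
    # area-weighted mean of the two old centers.
    w1, w2 = a1 ** 2, a2 ** 2
    x3 = (w1 * x1 + w2 * x2) / (w1 + w2)
    y3 = (w1 * y1 + w2 * y2) / (w1 + w2)
    return (x3, y3, a3)

if __name__ == "__main__":
    p, q = (0.0, 0.0, 1.0), (1.5, 0.0, 1.0)
    if should_merge(p, q):
        x3, y3, a3 = merge(p, q)
        print("merged radius:", a3, "merged circulation:", circulation(a3))
```

Since $a_3^2 = (a_1^4 + a_2^4)^{1/2} < a_1^2 + a_2^2$, the circulation of the merged patch in this sketch is smaller than the sum of the two original circulations.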
Boatto and Simó (2004) generalized the stability analysis to the case of a ring with polar vortices and of multiple rings, the key idea being, as we shall discuss in this section, the structure of the Hessian of the Hamiltonian.

How can one infer the linear and nonlinear stability of steadily rotating configurations?

Let us restrict the discussion to a polygonal ring of identical vortices on a sphere as illustrated in Figure 7 (Boatto and Cabral 2003, Boatto and Simó 2004). The reasoning is easily generalized to the planar case. The case of multiple rings is discussed in great detail in Boatto and Simó (2004). A polygonal ring is a relative equilibrium of coordinates $X(t) = (q_1(t), \ldots, q_N(t), p_1(t), \ldots, p_N(t))$, where

$$q_\alpha(t) = \phi_\alpha(t) = \omega t + \phi_{\alpha o}, \qquad p_\alpha(t) = p_o = \Gamma\cos\theta_o, \qquad \alpha = 1, \ldots, N \qquad [18]$$

with $\omega = (N-1)p_o/r_o^2$, $r_o = \sqrt{1 - p_o^2/\Gamma^2}$, and $\phi_{\alpha o}$ and $\theta_{\alpha o} = \theta_o$ being the initial longitude and co-latitude of the $\alpha$th vortex.

Theorem 1 (Spherical case) (Boatto and Simó 2004). The relative equilibrium [18] is (linearly and nonlinearly) stable if

$$-4(N-1)(11-N) + 24(N-1)r_o^2 + 2N^2 + 1 + 3(-1)^N < 0 \qquad [19]$$

and it is unstable if the inequality is reversed.

Remarks

(i) By Theorem 1 a vortex polygon, of $N$ point vortices, is stable for $0 \le \theta_o \le \theta_o^*$ and $(180^\circ - \theta_o^*) \le \theta_o \le 180^\circ$, where $\theta_o^* = \arcsin(r_o^*)$ and

$$r_o^{*2} < \frac{7-N}{4} \quad \text{for } N \text{ odd}$$

$$r_o^{*2} < -\frac{N^2 - 8N + 8}{4(N-1)} \quad \text{for } N \text{ even}$$

where $r_o^* = \sin\theta_o^*$.

(ii) Theorem 1 includes at once the results of Thomson (1883), Dritschel (1985), and Polvani and Dritschel (1993) (and other authors who have been working in the area (Aref et al. 2003)). We recover the planar case by setting $r_o = 0$ in eqn [19], deducing that stability is guaranteed for $N < 7$. For $N = 7$ and $r_o = 0$ the left-hand side of [19] equals $-96 + 98 + 1 - 3 = 0$, which corresponds to the neutrally stable Thomson heptagon.

To prove Theorem 1 it is useful to consider the Hamiltonian equations as in eqn [17]. The first step is to make a change of reference frame: view the dynamics in a frame co-rotating with the relative equilibrium configuration. In the co-rotating reference system, the Hamiltonian takes the form

$$\tilde H = H + \omega M$$

where $M$ is the momentum of the system, and $H$ and $\omega$ are, respectively, the Hamiltonian and the rotational frequency of the relative equilibrium in the original frame of reference. In the new reference frame, the relative equilibrium becomes an equilibrium, $X^*$, and the standard techniques can be used to study its stability.

To study linear stability, the relevant equation is

$$\frac{d\,\Delta X}{dt} = JS\,\Delta X \qquad [20]$$

where $X = X^* + \Delta X$, and $S$ is the Hessian of $\tilde H$ evaluated at the equilibrium $X^*$. Then linear (or spectral) stability is deduced by studying the eigenvalues of the matrix $JS$. For nonlinear stability we make use of a sufficient stability criterion due to Dirichlet (1897) (see G Lejeune Dirichlet (1897). Werke, vol. 2, Berlin, pp. 5–8; Boatto and Cabral (2003) and references therein).

Theorem 2 Let $X^*$ be an equilibrium of an autonomous system of ordinary differential equations

$$\frac{dX}{dt} = f(X), \qquad X \in \mathbb{R}^{2N} \qquad [21]$$

that is, $f(X^*) = 0$. If there exists a positive (or negative) definite integral $F$ of the system [21] in a neighborhood of the equilibrium $X^*$, then $X^*$ is stable.

In our case the Hamiltonian itself is an integral of motion. Then by studying the definiteness of its Hessian, $S$, evaluated at $X^*$, we infer minimal stability intervals in $\theta$ and $N$. Details are given in Boatto and Cabral (2003) and Boatto and Simó (2004). The proof is mainly based on the following considerations:

1. Since $S$ is a symmetric matrix it is diagonalizable, that is, there exists an orthogonal matrix $C$ such that $C^T S C = D$, where $D$ is a diagonal matrix, $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_{2N})$. Furthermore, the matrix $C$ can be chosen to leave invariant the symplectic form (equivalently $J = C^T J C$). Then by the canonical change of variables $Y = C^T X$ eqn [20] becomes

$$\frac{dY}{dt} = JDY \qquad [22]$$
where $Y = (\tilde q_1, \ldots, \tilde q_N, \tilde p_1, \ldots, \tilde p_N)$ and $(\tilde q_j, \tilde p_j)$, $j = 1, \ldots, N$, are pairs of conjugate variables. Equation [22] can be rewritten as

$$\frac{d^2\tilde q_j}{dt^2} = -\lambda_j\lambda_{j+N}\,\tilde q_j, \qquad j = 1, \ldots, N$$

2. When evaluated at the equilibrium $X^*$, the Hessian $S$ takes the block structure

$$\tilde S = \begin{pmatrix} Q & O \\ O & P \end{pmatrix}$$

where the matrices $Q$ and $P$ are symmetric circulant matrices, that is, $(N \times N)$ matrices of the form

$$A = \begin{pmatrix} a_1 & a_2 & \ldots & a_N \\ a_N & a_1 & \ldots & a_{N-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_2 & a_3 & \ldots & a_1 \end{pmatrix} \qquad [23]$$

Circulant matrices are of special interest to us because we can easily compute their eigenvalues and eigenvectors for all $N$. In fact, it is immediate to show that:

Lemma 3 All circulant matrices [23] have eigenvalues

$$\lambda_j = \sum_{k=1}^{N} a_k r_j^{k-1}, \qquad j = 1, \ldots, N$$

and corresponding eigenvectors $v_j = (1, r_j, \ldots, r_j^{N-1})^T$, $j = 1, \ldots, N$, where $r_j = \exp(2\pi i(j-1)/N)$ are solutions of $r^N = 1$.

Passive Tracers in the Velocity Fields of N Point Vortices: The Restricted (N + 1)-Vortex Problem

The terminology "restricted (N + 1)-vortex problem" is used in analogy with the celestial mechanics literature, when one of the vorticities is taken to be zero. The zero-vorticity vortex does not affect the dynamics of the remaining N vortices. For this reason, it is said to be passively advected by the flow of the remaining N vortices, and in the fluid mechanics literature the terminology "passive tracer" is also employed. The tracer dynamics is given by the Hamiltonian equations [8]. Notice that in general the Hamiltonian $\Psi$ is time dependent, through the vortex variables $r_j$, $j = 1, \ldots, N$, that is,

$$\Psi(r, t) = \Psi(r; r_1(t), \ldots, r_N(t))$$

and $(q, p) = (x, y)$ play the role of conjugate canonical variables. There is an extensive literature on the subject both from a theoretical (see, e.g., Boatto and Simó (2004) and Newton (2001)) and an experimental (van Heijst 1993, Ottino 1990) point of view. As discussed in the previous section, there are some vortex configurations, such as the polygonal ones, for which vortices undergo a periodic circular motion. Then by viewing the dynamics in a reference frame co-rotating with the vortices the tracer Hamiltonian is manifestly time independent and, therefore, integrable, since it reduces to a Hamiltonian of one degree of freedom. In such an occurrence, tracer trajectories form a web of homoclinic and heteroclinic orbits. An interesting theoretical problem is to study how the tracer transport properties (i.e., existence of barriers to transport, diffusion etc.) are affected by perturbing the polygonal vortex configuration, that is, by introducing in $\Psi$ a "genuine" time dependence (periodic, quasi-periodic, or chaotic) (see, e.g., Boatto and Pierrehumbert (1999), Rom-Kedar, Leonard and Wiggins (1990), Kuznetsov and Zaslavsky (2000), and Newton (2001)). Furthermore, in lab experiments, color dyes, which monitor the flow velocity field, are often used as the experimental equivalent of tracer particles. In this context we would like to stress the striking resemblance between theoretical particle trajectories, deduced from point vortex dynamics, and the actual dye visualizations observed by van Heijst and Flor for vortex dipoles in a stratified fluid (see Figures 11 and 12) (van Heijst 1993). Similarly, tripolar structures have been observed both in lab experiments (see Figure 13) and in nature (see Figure 14). Recently, the Danish group of Jansson–Haspang–Jensen–Hersen–Bohr has observed beautiful rotating polygons, such as squares and pentagons, on a fluid surface in the presence of a rotating cylinder (see Figure 15).

Point Vortex Motion with Boundaries

In comparison with the extensive literature on point vortex motion in unbounded domains, the study of point vortex motion in the presence of walls is modest. There is, however, a general theory for such problems, and some recent new developments in this area have resulted in a versatile tool for analyzing point vortex motion with boundaries. Newton (2001) contains a chapter on point vortex motion with boundaries and also features a detailed bibliography. The reader is referred there for standard treatments; here, we focus on more recent developments of the mathematical theory.

The Method of Images

When point vortices move around in bounded domains, it is clear that the motion is subject to the constraint that no fluid should penetrate any of
the boundary walls of the domain. If $n$ denotes the local normal to the boundary walls, the boundary condition on the velocity field $u$ is therefore $u \cdot n = 0$ everywhere on the walls. Another way to say the same thing is that all the walls must be streamlines so that the streamfunction, $\psi$ say, must be constant on any boundary wall.

Figure 11 Test-particle trajectories: on the left, theoretical trajectories, from the point-vortex model; on the right, a top view of a laboratory experiment in stratified flows. Reproduced from van Heijst GJF and Flor JB (1989) Dipole formation and collisions in a stratified fluid. Nature 340: 212–215, with permission from Nature Publishing Group.

Figure 12 A frontal collision of two dipoles as observed in a stratified fluid: after a so-called "partner-exchange" two new dipoles are formed. Reproduced from van Heijst GJF and Flor JB (1989) Dipole formation and collisions in a stratified fluid. Nature 340: 212–215, with permission from Nature Publishing Group.

Figure 13 A tripolar vortex structure as observed in a rotating stratified fluid. Reproduced from van Heijst GJF, Kloosterziel RC, and Williams CWM (1991) Laboratory experiments on the tripolar vortex in a rotating fluid. Journal of Fluid Mechanics 225: 301–331, with permission from Cambridge University Press.

A classical approach to bounded vortex motion is the celebrated method of images, a rather special technique limited to cases where the domain of interest has certain geometrical symmetries so that an appropriate distribution of image vorticity can be ascertained, essentially by inspection. This image vorticity is placed in nonphysical regions of the plane in order to satisfy the boundary conditions that the walls act as impenetrable barriers for the flow.

The simplest example is the motion of a single vortex next to a straight plane wall of infinite extent. Suppose the wall is along $y = 0$ in an $(x, y)$-plane and that the fluid occupies the upper-half plane. If a circulation-$\Gamma$ vortex is at the complex position $z_0 = x_0 + iy_0$, the solution for the streamfunction is

$$\psi(z, \bar z) = -\frac{\Gamma}{2\pi}\log\left|\frac{z - z_0}{z - \bar z_0}\right| \qquad [24]$$

where $z = x + iy$. This has a single logarithmic singularity in the upper-half plane at $z = z_0$ (corresponding to the point vortex) and it is easily checked that $\psi = 0$ on $y = 0$. Therefore, no fluid penetrates the wall. Equation [24] can be written as

$$\psi(z, \bar z) = -\frac{\Gamma}{2\pi}\log|z - z_0| + \frac{\Gamma}{2\pi}\log|z - \bar z_0| \qquad [25]$$

which is the sum of the streamfunction due to a point vortex of circulation $\Gamma$ at $z_0 = x_0 + iy_0$ and another, one imagines, of circulation $-\Gamma$ at $\bar z_0 = x_0 - iy_0$. In this case, the image vortex distribution is simple: it is just the second vortex sitting at the reflected point in the wall. The method of images can be applied to flows in other regions bounded by straight line segments (e.g., wedge regions of various angles (Newton 2001)).
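The two claims made for the half-plane example, that $\psi$ vanishes on the wall and that the motion of the vortex is the one induced by its image, can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the velocity formula assumes the usual complex-potential convention $w = \frac{\Gamma}{2\pi i}\log(z - z_k)$ for a point vortex at $z_k$, with $u - iv = dw/dz$, which reproduces the streamfunction sign used in [24].

```python
import math

def psi_half_plane(z, z0, gamma=1.0):
    """Streamfunction [24]: a vortex at z0 plus its image at conj(z0)."""
    ratio = abs(z - z0) / abs(z - z0.conjugate())
    return -(gamma / (2.0 * math.pi)) * math.log(ratio)

def vortex_velocity(z0, gamma=1.0):
    """Velocity (u, v) of the vortex at z0, induced by its image alone.

    The image has circulation -gamma and sits at conj(z0); a point vortex
    of circulation k at zk induces u - i v = k / (2 pi i (z - zk)).
    """
    image_circulation = -gamma
    w = image_circulation / (2j * math.pi * (z0 - z0.conjugate()))
    return w.real, -w.imag

if __name__ == "__main__":
    z0 = complex(0.3, 0.5)
    # psi should vanish at every point of the wall y = 0
    print([psi_half_plane(complex(x, 0.0), z0) for x in (-2.0, 0.0, 1.5)])
    # the vortex should drift parallel to the wall at speed gamma / (4 pi y0)
    print(vortex_velocity(z0, gamma=2.0), 2.0 / (4.0 * math.pi * 0.5))
```

Under this convention a vortex of positive circulation drifts in the positive $x$ direction at speed $\Gamma/(4\pi y_0)$, keeping a constant distance from the wall.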
A variant of the method of images is the Milne–Thomson circle theorem relevant to planar flow around a circular cylinder. Given a complex potential $w(z)$ with the required singularities in the fluid region exterior to the cylinder, but failing to satisfy the boundary condition that the surface of the cylinder is a streamline, this theorem says that the correct potential $W(z)$ is

$$W(z) = w(z) + \bar w(a^2/z) \qquad [26]$$

where $a$ is the cylinder radius and $\bar w(z)$ is the conjugate function to $w(z)$. It is easy to verify that the imaginary part of $W(z)$, that is, the streamfunction, is zero on $|z| = a$: there $a^2/z = \bar z$, so the second term is the complex conjugate of the first and $W$ is real. The second term, $\bar w(a^2/z)$, produces the required distribution of image vorticity inside the cylinder. A famous example is the Föppl vortex pair which is the simplest model of the trailing vortices shed in the wake of a circular aerofoil traveling at uniform speed.

Figure 14 Infrared image taken by NOAA11 satellite on January 4 1990 (0212 UT) shows a tripolar structure in the Bay of Biscay. The central part of the tripole measures about 50–70 km and rotates clockwise, whereas the two satellite vortices rotate anticlockwise. The tripole persisted for a few days before it fell apart. Reproduced from Pingree RD and Le Cann B, Anticyclonic Eddy X91 in the Southern Bay of Biscay, Journal of Geophysical Research, 97: 14353–14362, May 1991 to February 1992. Copyright (1992) American Geophysical Union. Reproduced/modified by permission of American Geophysical Union.

Figure 15 The free surface of a rotating fluid will, due to the centrifugal force, be pressed radially outward. If the flow is driven by rotating the bottom plate, the axial symmetry can break spontaneously and the surface can take the shape of a rigidly rotating polygon. With water Jansson–Haspang–Jensen–Hersen–Bohr have observed polygons with up to six corners. The rotation speed of the polygons does not coincide with that of the plate, but it is often mode-locked, such that the polygon rotates by one corner for each complete rotation of the plate. Reproduced from Jansson TRN, Haspang M, Jensen KH, Hersen P, and Bohr T (2005) Rotating polygons on a fluid surface. Preprint, with permission from T Bohr.

Kirchhoff–Routh–Lin Theory

The most important general mathematical tool for point vortex motion in bounded planar regions is the Hamiltonian approach associated with the names of Kirchhoff (1876) and Routh (1881), who developed the early theory. It is now known that the problem of N-vortex motion in a simply connected domain is a Hamiltonian dynamical system. Moreover, the Hamiltonian has simple transformation properties when a given flow domain of interest is mapped conformally to another, a result originally due to Routh. A formula for the Hamiltonian can be built from knowledge of the instantaneous Green's function associated with motion of the point vortex in the simply connected domain $D$. In fact, [24] is precisely the relevant Green's function when $D$ is the upper-half plane.
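As a worked illustration of how a Hamiltonian follows from the Green's function, consider a single vortex of circulation $\Gamma$ at $z_0 = x_0 + iy_0$ in the upper-half plane with no imposed background flow. The derivation below is a sketch under stated assumptions: it takes $G$ to be [24] with unit circulation, reads the regular part as $g = -G - \frac{1}{2\pi}\log r_0$ (the convention of eqn [27] in this article), and uses the single-vortex form of the path function [31] together with the equations of motion [30].

```latex
\begin{align*}
G(z; z_0) &= -\frac{1}{2\pi}\log\left|\frac{z - z_0}{z - \bar z_0}\right|, \\
g(z; z_0) &= -G(z; z_0) - \frac{1}{2\pi}\log|z - z_0|
           = -\frac{1}{2\pi}\log|z - \bar z_0|, \\
g(z_0; z_0) &= -\frac{1}{2\pi}\log(2y_0), \\
H(x_0, y_0) &= -\frac{\Gamma^2}{2}\, g(z_0; z_0)
             = \frac{\Gamma^2}{4\pi}\log(2y_0), \\
\Gamma\,\frac{dx_0}{dt} &= \frac{\partial H}{\partial y_0}
             = \frac{\Gamma^2}{4\pi y_0}, \qquad
\Gamma\,\frac{dy_0}{dt} = -\frac{\partial H}{\partial x_0} = 0 .
\end{align*}
```

The vortex therefore translates parallel to the wall with constant speed $\Gamma/(4\pi y_0)$ at fixed height $y_0$, which is the velocity induced by an image vortex of circulation $-\Gamma$ at the reflected point $\bar z_0$.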
Much later, in 1941, Lin (1941a) extended these general results to the case of multiply connected fluid regions. To visualize such a region, think of a bounded region of the plane containing fluid but also a finite number of impenetrable islands whose boundaries act as barriers for the fluid motion. If the islands are infinitely thin, they can be thought of as straight wall segments immersed in the flow (see later examples). Lin (1941b) showed that both the Hamiltonian structure, and the transformation properties of the Hamiltonian under conformal mapping, are preserved in the multiply connected case.

Lin's Special Green's Function

Since Lin's result subsumes the earlier simply connected studies, we now outline the key results as presented in Lin (1941a). Consider a fluid region $D$, with outer boundary $C_0$ and $M$ enclosed islands each having boundaries $\{C_j \mid j = 1, \ldots, M\}$. Lin introduced a special Green's function $G(x, y; x_0, y_0)$ satisfying the following properties:

1. the function

$$g(x, y; x_0, y_0) = -G(x, y; x_0, y_0) - \frac{1}{2\pi}\log r_0 \qquad [27]$$

is harmonic with respect to $(x, y)$ throughout the region $D$ including at the point $(x_0, y_0)$. Here, $r_0 = \sqrt{(x - x_0)^2 + (y - y_0)^2}$;

2. if $\partial G/\partial n$ is the normal derivative of $G$ on a curve then

$$G(x, y; x_0, y_0) = A_k \ \text{ on } C_k, \qquad \oint_{C_k}\frac{\partial G}{\partial n}\,ds = 0, \qquad k = 1, \ldots, M \qquad [28]$$

where $ds$ denotes an element of arc and $\{A_k\}$ are constants;

3. $G(x, y; x_0, y_0) = 0$ on $C_0$.

Flucher and Gustafsson (1997) refer to this $G$ as the hydrodynamic Green's function. (In fact, it coincides with the modified Green's function arising in abstract potential theory, a function that is dual to the usual first-type Green's function that equals zero on all the domain boundaries.) On the use of $G$, Lin established the following two key results:

Theorem 4 If $N$ vortices of strengths $\{\Gamma_k \mid k = 1, \ldots, N\}$ are present in an incompressible fluid at the points $\{(x_k, y_k) \mid k = 1, \ldots, N\}$ in a general multiply connected region $D$ bounded by fixed boundaries, the stream function of the fluid motion is given by

$$\psi(x, y; x_k, y_k) = \psi_0(x, y) + \sum_{k=1}^{N}\Gamma_k\,G(x, y; x_k, y_k) \qquad [29]$$

where $\psi_0(x, y)$ is the streamfunction due to outside agencies and is independent of the point vortex positions.

Theorem 5 For the motion of vortices of strengths $\{\Gamma_k \mid k = 1, \ldots, N\}$ in a general region $D$ bounded by fixed boundaries, there exists a Kirchhoff–Routh function $H(\{x_k, y_k\})$, depending on the point vortex positions, such that

$$\Gamma_k\frac{dx_k}{dt} = \frac{\partial H}{\partial y_k}, \qquad \Gamma_k\frac{dy_k}{dt} = -\frac{\partial H}{\partial x_k} \qquad [30]$$

where $H(\{x_k, y_k\})$ is given by

$$H(\{x_k, y_k\}) = \sum_{k=1}^{N}\Gamma_k\,\psi_0(x_k, y_k) + \sum_{\substack{k_1, k_2 = 1 \\ k_1 > k_2}}^{N}\Gamma_{k_1}\Gamma_{k_2}\,G(x_{k_1}, y_{k_1}; x_{k_2}, y_{k_2}) - \frac{1}{2}\sum_{k=1}^{N}\Gamma_k^2\,g(x_k, y_k; x_k, y_k) \qquad [31]$$

In rescaled coordinates $(x_k, \Gamma_k y_k)$, [30] is a Hamiltonian system in canonical form. For historical reasons, $H$ is often called the Kirchhoff–Routh path function. Analyzing the separate contributions to the path function [31] is instructive: the first term is the contribution from flows imposed from outside (e.g., background flows and round-island circulations), the second term is the "free-space" contribution (it is the relevant Hamiltonian when no boundaries are present) while the third term encodes the effect of the boundary walls (or, the effect of the "image vorticity" distribution discussed earlier).

Lin (1941a) went on to show that, with the Hamiltonian in some $D$ given by $H$ in [31], the Hamiltonian relevant to vortex motion in another domain obtained from $D$ by a conformal mapping $z(\zeta)$ consists of [31] with some simple extra additive contributions dependent only on the derivative of the map $z(\zeta)$ evaluated at the point vortex positions.

Flucher and Gustafsson (1997) also introduce the Robin function $R(x_0, y_0)$ defined as the regular part of the above hydrodynamic Green's function evaluated at the point vortex. Indeed, $R(x_0, y_0) \equiv g(x_0, y_0; x_0, y_0)$, where $g$ is defined in [27]. An interesting fact is that, for single-vortex motion in a simply connected domain, $R(x_0, y_0)$ satisfies the quasilinear elliptic Liouville equation everywhere in
$D$ with the boundary condition that it becomes infinite everywhere on the boundary of $D$.

By combining the Kirchhoff–Routh theory with conformal mapping theory, many interesting problems can be studied. What happens, for example, if there is a gap in the wall of Figure 16? In recent work, Johnson and McDonald (2005) show that if the vortex starts off, far from the gap, at a distance of less than half the gap width from the wall, then it will eventually penetrate the gap. Otherwise, it will dip towards the gap but not go through it. The trajectories are shown in Figure 17.

Unfortunately, Lin did not provide any explicit analytical expressions for $G$ in the multiply connected case. This has limited the applicability of his theory to fluid regions that are simply or doubly connected. Recently, however, Lin's theory has been brought to implementational fruition by Crowdy and Marshall (2005a), who, up to conformal mapping, have derived explicit formulas for the hydrodynamic Green's function in multiply connected fluid regions of arbitrary finite connectivity. Their approach makes use of elements of classical function theory dating back to the work of Poincaré, Schottky, and Klein (among others). This allows new problems involving bounded vortex motion to be tackled. For example, the motion of a single vortex around multiple circular islands has been studied in Crowdy and Marshall (2005b), thereby extending recent work on the two-island problem (Johnson and McDonald 2005). If the wall in Figure 17 happens to have two (or more) gaps, then the fluid region is multiply connected. The two-gap (doubly connected) case was recently solved by Johnson and McDonald (2005) using Schwarz–Christoffel maps combined with elements of elliptic function theory (see Figure 18). Crowdy and Marshall have solved the problem of an arbitrary number of gaps in a wall by exploiting the new general theory presented in Crowdy and Marshall (2005a,b) (and related works by the authors). The case of a wall with three gaps represents a triply connected fluid region and the critical vortex trajectory is plotted in Figure 19.

Figure 16 The motion of a point vortex near an infinite straight wall. The vortex moves, at constant speed, maintaining a constant distance from the wall. Other possible trajectories are shown; they are all straight lines parallel to the wall. The motion can be thought of as being induced by an opposite-circulation "image" vortex at the reflected point in the wall.

Figure 17 Distribution of point vortex trajectories near a wall with a single gap of length 2. There is a critical trajectory which, far from the gap, is unit distance from the wall.

Figure 18 The critical trajectory when there are two symmetric gaps in a wall. The fluid region is now doubly connected. This problem is solved in Johnson and McDonald (2005) and Crowdy and Marshall (2005).

Figure 19 The critical vortex trajectories when there are three gaps in the wall. This time the fluid region is triply connected. This problem is solved in Crowdy and Marshall (2005) using the general methods in Crowdy and Marshall (2005).

Point vortex motion in bounded domains on the surface of a sphere has received scant attention in
78 Point-Vortex Dynamics

the literature, although Kidambi and Newton von Helmholtz H (1858) On the integrals of the hydrodynamical
(2000) and Newton (2001) have recently made a equations which express vortex motion. Philosophical Maga-
zine 4(33): 485–512.
contribution. Such paradigms are clearly relevant Jansson TRN, Haspang M, Jensen KH, Hersen P, and Bohr T
to planetary-scale oceanographic flows in (2005) Rotating polygons on a fluid surface. Preprint.
which oceanic eddies interact with topography such Johnson ER and McDonald NR (2005) Vortices near barriers
as ridges and land masses and deserve further study. with multiple gaps. Journal of Fluid Mechanics 531: 335–358.
Kidambi R and Newton PK (2000) Point vortex motion on a
sphere with solid boundaries. Physical Fluids 12: 581–588.
Kimura Y (1999) Vortex motion on surfaces with constant
Further Reading curvature. Proceedings of the Royal Society of London A 455:
245–259.
Albouy A (1996) The symmetric central configurations of four Kirchhoff G (1876) Vorlesunger über mathematische Physik,
equal masses. Contemporary Mathematics 198: 131–135. Mechanik. Leipzig.
Aref H and Stremler MA (1999) Four-vortex motion with zero total Koiller J and Carvalho SP (1989) Non-integrability of the 4-
circulation and impulse. Physics of Fluids 11(12): 3704–3715. vortex system: analytical proof. Communications in Mathe-
Aref H, Newton PK, Stremler MA, Tokieda T, and Vainchtein DL matical Physics 120(4): 643–652.
(2003) Vortex crystals. Advances in Applied Mathematics 39: Kuznetsov L and Zavlasky GM (2000) Passive tracer transport in
1–79. three-vortex flow. Physical Review A 61(4): 3777–3792.
Ball WWR (1940) A Short Account of the History of Mathe- Lim C, Montaldi J, and Roberts M (2001) Relative equilibria of
matics, 12th edn. London: MacMillan. point vortices on the sphere. Physica D 148: 97–135.
Boatto S and Pierrehumbert RT (1999) Dynamics of a passive Lin CC (1941a) On the motion of vortices in two dimensions. I.
tracer in the velocity field of four identical point-vortices. Existence of the Kirchhoff–Routh function. Proceedings of the
Journal of Fluid Mechanics 394: 137–174. National Academy of Sciences 27(12): 570–575.
Boatto S and Cabral HE (2003) Nonlinear stability of a Lin CC (1941b) On the motion of vortices in two dimensions. II.
latitudinal ring of point vortices on a non-rotating sphere. Some further investigations on the Kirchhoff–Routh function.
SIAM Journal of Applied Mathematics 64: 216–230. Proceedings of the National Academy Sciences 27(12): 575–577.
Boatto S and Simó C (2004) Stability of latitudinal vortex rings Marchioro C and Pulvirenti M (1993) Vortices and localization in
with polar vortices. Mathematical Physics Preprint Archive Euler flows. Communications in Mathematical Physics 154:
(mp_arc) 04-67. 49–61.
Cabral HE and Schmidt DS (1999/00) Stability of relative Marchioro C and Pulvirenti M (1994) Mathematical Theory of
equilibria in the problem of N þ 1 vortices. SIAM Journal of Incompressible Non-viscous Fluids. vol. 96, AMS. New York:
Mathematical Analysis 31(2): 231–250. Springer.
Carnevale GF, McWilliams JC, Pomeau Y, Weiss JB, and Young Mayer AM (1878) Floating magnetics. Nature 17: 487–488.
WR (1991) Evolution of vortex statistics in two-dimensional Mayer AM (1878) Scientific American 2045–2047.
turbulence. Physical Review Letters 66(21): 2735–2737. Mayer AM (1878) On the morphological laws of the configura-
Carnevale GF, McWilliams JC, Pomeau Y, Weiss JB, and Young WR tions formed by magnets floating vertically and subjected to
(1992) Rates, pathways, and end states of nonlinear evolution in the attraction of a superposed magnetic. American Journal of
decaying two-dimensional. Physics of Fluids A 4(6): 1314–1316. Science 16: 247–256.
Castilla MSAC, Moauro V, Negrini P, and Oliva WM (1993) The Montaldi J, Soulière A, and Tokieda T (2002) Vortex dyanmics
four positive vortices problem – regions of chaotic behavior on a cylinder. SIAM Journal of Applied Dynamical Systems
and non-integrability. Annales de l’Institut Henri Poincaré. 2(3): 417–430.
Section A. Physique Theorique 59(1): 99–115. Newton PK (2001) The N-Vortex Problem. Analytical Tech-
Crowdy DG and Marshall JS (2005a) Analytical formulae for the niques. New York: Springer.
Kirchhoff–Routh path function in multiply connected Ottino JM (1990) The Kinematics of Mixing: Stretching, Chaos
Poisson Reduction 79


Poisson Lie Groups see Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups

Poisson Reduction
J-P Ortega, Université de Franche-Comté, Besançon, France
T S Ratiu, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Poisson reduction techniques allow the construction of new Poisson structures out of a given one by combination of two operations: “restriction” to submanifolds that satisfy certain compatibility assumptions and passage to a “quotient space” where certain degeneracies have been eliminated. For certain kinds of reduction, it is necessary to pass first to a submanifold and then take a quotient. Before making this more explicit, we introduce the notations that will be used in this article. All manifolds in this article are finite dimensional.

Poisson Manifolds

A “Poisson manifold” is a pair $(M, \{\cdot,\cdot\})$, where $M$ is a manifold and $\{\cdot,\cdot\}$ is a bilinear operation on $C^\infty(M)$ such that $(C^\infty(M), \{\cdot,\cdot\})$ is a Lie algebra and $\{\cdot,\cdot\}$ is a derivation (i.e., the Leibniz identity holds) in each argument. The pair $(C^\infty(M), \{\cdot,\cdot\})$ is also called a “Poisson algebra.” The functions in the center $\mathcal{C}(M)$ of the Lie algebra $(C^\infty(M), \{\cdot,\cdot\})$ are called “Casimir functions.” From the natural isomorphism between derivations on $C^\infty(M)$ and vector fields on $M$, it follows that each $h \in C^\infty(M)$ induces a vector field on $M$ via the expression $X_h = \{\cdot, h\}$, called the “Hamiltonian vector field” associated to the “Hamiltonian function” $h$. The triplet $(M, \{\cdot,\cdot\}, h)$ is called a “Poisson dynamical system.” Any Hamiltonian system on a symplectic manifold is a Poisson dynamical system relative to the Poisson bracket induced by the symplectic structure. Given a Poisson dynamical system $(M, \{\cdot,\cdot\}, h)$, its “integrals of motion” or “conserved quantities” are defined as the centralizer of $h$ in $(C^\infty(M), \{\cdot,\cdot\})$, that is, the subalgebra of $(C^\infty(M), \{\cdot,\cdot\})$ consisting of the functions $f \in C^\infty(M)$ such that $\{f, h\} = 0$. Note that the terminology is justified since, by Hamilton's equations in Poisson bracket form, we have $\dot{f} = X_h[f] = \{f, h\} = 0$, that is, $f$ is constant on the flow of $X_h$. A smooth mapping $\varphi : M_1 \to M_2$, between the two Poisson manifolds $(M_1, \{\cdot,\cdot\}_1)$ and $(M_2, \{\cdot,\cdot\}_2)$, is called “canonical” or “Poisson” if for all $g, h \in C^\infty(M_2)$ we have $\varphi^*\{g, h\}_2 = \{\varphi^* g, \varphi^* h\}_1$. If $\varphi : M_1 \to M_2$ is a smooth map between two Poisson manifolds $(M_1, \{\cdot,\cdot\}_1)$ and $(M_2, \{\cdot,\cdot\}_2)$, then $\varphi$ is a Poisson map if and only if $T\varphi \circ X_{h \circ \varphi} = X_h \circ \varphi$ for any $h \in C^\infty(M_2)$, where $T\varphi : TM_1 \to TM_2$ denotes the tangent map (or derivative) of $\varphi$.

Let $(S, \{\cdot,\cdot\}_S)$ and $(M, \{\cdot,\cdot\}_M)$ be two Poisson manifolds such that $S \subset M$ and the inclusion $i_S : S \hookrightarrow M$ is an immersion. The Poisson manifold $(S, \{\cdot,\cdot\}_S)$ is called a “Poisson submanifold” of $(M, \{\cdot,\cdot\}_M)$ if $i_S$ is a canonical map. An immersed submanifold $Q$ of $M$ is called a “quasi-Poisson submanifold” of $(M, \{\cdot,\cdot\}_M)$ if for any $q \in Q$, any open neighborhood $U$ of $q$ in $M$, and any $f \in C^\infty(U)$ we have $X_f(i_Q(q)) \in T_q i_Q(T_q Q)$, where $i_Q : Q \hookrightarrow M$ is the inclusion and $X_f$ is the Hamiltonian vector field of $f$ on $U$ with respect to the Poisson bracket of $M$ restricted to $U$. If $(S, \{\cdot,\cdot\}_S)$ is a Poisson submanifold of $(M, \{\cdot,\cdot\}_M)$, then there is no other bracket $\{\cdot,\cdot\}'$ on $S$ making the inclusion $i : S \hookrightarrow M$ into a canonical map. If $Q$ is a quasi-Poisson submanifold of $(M, \{\cdot,\cdot\})$, then there exists a unique Poisson structure $\{\cdot,\cdot\}_Q$ on $Q$ that makes it into a Poisson submanifold of $(M, \{\cdot,\cdot\})$, but this Poisson structure may be different from the given one on $Q$. Any Poisson submanifold is quasi-Poisson but the converse is not true in general.

The Poisson Tensor and Symplectic Leaves

The derivation property of the Poisson bracket implies that for any two functions $f, g \in C^\infty(M)$, the value of the bracket $\{f, g\}(z)$ at an arbitrary point $z \in M$ (and therefore $X_f(z)$ as well) depends on $f$ only through $df(z)$, which allows us to define a contravariant antisymmetric 2-tensor $B \in \Lambda^2(T^*M)$, called the “Poisson tensor,” by $B(z)(\alpha_z, \beta_z) = \{f, g\}(z)$, where $df(z) = \alpha_z \in T_z^*M$ and $dg(z) = \beta_z \in T_z^*M$. The vector bundle map $B^\sharp : T^*M \to TM$ over the identity naturally associated to $B$ is defined by $B(z)(\alpha_z, \beta_z) = \langle \alpha_z, B^\sharp(\beta_z) \rangle$. Its range $D := B^\sharp(T^*M) \subset TM$ is called the “characteristic distribution” of $(M, \{\cdot,\cdot\})$ since $D$ is a generalized smooth integrable distribution. Its maximal integral leaves are called the “symplectic leaves” of $M$ for they carry a symplectic structure that makes them into Poisson submanifolds. As integral leaves of an integrable distribution, the symplectic leaves $L$ are “initial submanifolds” of $M$, that is, the inclusion $i : L \hookrightarrow M$ is an injective immersion such that for any smooth manifold $P$, an arbitrary map $g : P \to L$ is smooth if and only if $i \circ g : P \to M$ is smooth.

Poisson Reduction

Canonical Lie Group Actions

Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and let $G$ be a Lie group acting canonically on $M$ via the map $\Phi : G \times M \to M$. An action is called “canonical” if for any $h \in G$ and $f, g \in C^\infty(M)$, one has
\[ \{f \circ \Phi_h, g \circ \Phi_h\} = \{f, g\} \circ \Phi_h \]
If the $G$-action is free and proper, then the orbit space $M/G$ is a smooth regular quotient manifold. Moreover, it is also a Poisson manifold with the Poisson bracket $\{\cdot,\cdot\}^{M/G}$, uniquely characterized by the relation
\[ \{f, g\}^{M/G}(\pi(m)) = \{f \circ \pi, g \circ \pi\}(m) \tag*{[1]} \]
for any $m \in M$ and where $f, g : M/G \to \mathbb{R}$ are two arbitrary smooth functions. This bracket is appropriate for the reduction of Hamiltonian dynamics in the sense that if $h \in C^\infty(M)^G$ is a $G$-invariant smooth function on $M$, then the Hamiltonian flow $F_t$ of $X_h$ commutes with the $G$-action, so it induces a flow $F_t^{M/G}$ on $M/G$ that is Hamiltonian on $(M/G, \{\cdot,\cdot\}^{M/G})$ for the reduced Hamiltonian function $[h] \in C^\infty(M/G)$ defined by $[h] \circ \pi = h$.

If the Poisson manifold $(M, \{\cdot,\cdot\})$ is actually symplectic with form $\omega$ and the $G$-action has an associated momentum map $J : M \to \mathfrak{g}^*$, then the symplectic leaves of $(M/G, \{\cdot,\cdot\}^{M/G})$ are given by the spaces $(M^c_{\mathcal{O}_\mu} := G \cdot J^{-1}(\mu)_c / G, \omega^c_{\mathcal{O}_\mu})$, where $J^{-1}(\mu)_c$ is a connected component of the fiber $J^{-1}(\mu)$ and $\omega^c_{\mathcal{O}_\mu}$ is the restriction to $M^c_{\mathcal{O}_\mu}$ of the symplectic form $\omega_{\mathcal{O}_\mu}$ of the symplectic orbit reduced space $M_{\mathcal{O}_\mu}$ (see Symmetry and Symplectic Reduction). If, additionally, $G$ is compact, $M$ is connected, and the momentum map $J$ is proper, then $M^c_{\mathcal{O}_\mu} = M_{\mathcal{O}_\mu}$.

In the remainder of this section, we characterize the situations in which new Poisson manifolds can be obtained out of a given one by a combination of restriction to a submanifold and passage to the quotient with respect to an equivalence relation that encodes the symmetries of the bracket.

Definition 1 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $D \subset TM$ a smooth distribution on $M$. The distribution $D$ is called “Poisson” or “canonical,” if the condition $df|_D = dg|_D = 0$, for any $f, g \in C^\infty(U)$ and any open subset $U \subset M$, implies that $d\{f, g\}|_D = 0$.

Unless strong regularity assumptions are invoked, the passage to the leaf space of a canonical distribution destroys the smoothness of the quotient topological space. In such situations, the Poisson algebra of functions is too small and the notion of presheaf of Poisson algebras is needed. See Singularity and Bifurcation Theory for more information on singularity theory.

Definition 2 Let $M$ be a topological space with a presheaf $\mathcal{F}$ of smooth functions. A presheaf of Poisson algebras on $(M, \mathcal{F})$ is a map $\{\cdot,\cdot\}$ that assigns to each open set $U \subset M$ a bilinear operation $\{\cdot,\cdot\}_U : \mathcal{F}(U) \times \mathcal{F}(U) \to \mathcal{F}(U)$ such that the pair $(\mathcal{F}(U), \{\cdot,\cdot\}_U)$ is a Poisson algebra. A presheaf of Poisson algebras is denoted as a triple $(M, \mathcal{F}, \{\cdot,\cdot\})$. The presheaf of Poisson algebras $(M, \mathcal{F}, \{\cdot,\cdot\})$ is said to be “nondegenerate” if the following condition holds: if $f \in \mathcal{F}(U)$ is such that $\{f, g\}_{U \cap V} = 0$, for any $g \in \mathcal{F}(V)$ and any open set $V$, then $f$ is constant on the connected components of $U$.

Any Poisson manifold $(M, \{\cdot,\cdot\})$ has a natural presheaf of Poisson algebras on its presheaf of smooth functions that associates to any open subset $U$ of $M$ the restriction $\{\cdot,\cdot\}|_U$ of $\{\cdot,\cdot\}$ to $C^\infty(U) \times C^\infty(U)$.

Definition 3 Let $P$ be a topological space and $\mathcal{Z} = \{S_i\}_{i \in I}$ a locally finite partition of $P$ into smooth manifolds $S_i \subset P$, $i \in I$, that are locally closed topological subspaces of $P$ (hence their manifold topology is the relative one induced by $P$). The pair $(P, \mathcal{Z})$ is called a “decomposition” of $P$ with “pieces” in $\mathcal{Z}$, or a “decomposed space,” if the following “frontier condition” holds:

Condition (DS) If $R, S \in \mathcal{Z}$ are such that $R \cap \bar{S} \neq \emptyset$, then $R \subset \bar{S}$. In this case, we write $R \preceq S$. If, in addition, $R \neq S$ we say that $R$ is incident to $S$ or that it is a boundary piece of $S$ and write $R \prec S$.

Definition 4 Let $M$ be a differentiable manifold and $S \subset M$ a decomposed subset of $M$. Let $\{S_i\}_{i \in I}$

be the pieces of this decomposition. The topology of $S$ is not necessarily the relative topology as a subset of $M$. Then $D \subset TM|_S$ is called a “smooth distribution” on $S$ adapted to the decomposition $\{S_i\}_{i \in I}$, if $D \cap TS_i$ is a smooth distribution on $S_i$ for all $i \in I$. The distribution $D$ is said to be “integrable” if $D \cap TS_i$ is integrable for each $i \in I$.

In the situation described by the previous definition and if $D$ is integrable, the integrability of the distributions $D_{S_i} := D \cap TS_i$ on $S_i$ allows us to partition each $S_i$ into the corresponding maximal integral manifolds. Thus, there is an equivalence relation on $S_i$ whose equivalence classes are precisely these maximal integral manifolds. Doing this on each $S_i$, we obtain an equivalence relation $D_S$ on the whole set $S$ by taking the union of the different equivalence classes corresponding to all the $D_{S_i}$. Define the quotient space $S/D_S$ by
\[ S/D_S := \bigcup_{i \in I} S_i / D_{S_i} \]
and let $\pi_{D_S} : S \to S/D_S$ be the natural projection.

The Presheaf of Smooth Functions on $S/D_S$

Define the presheaf of smooth functions $C^\infty_{S/D_S}$ on $S/D_S$ as the map that associates to any open subset $V$ of $S/D_S$ the set of functions $C^\infty_{S/D_S}(V)$ characterized by the following property: $f \in C^\infty_{S/D_S}(V)$ if and only if for any $z \in V$ there exists $m \in \pi_{D_S}^{-1}(V)$, $U_m$ open neighborhood of $m$ in $M$, and $F \in C^\infty(U_m)$ such that
\[ f \circ \pi_{D_S}|_{\pi_{D_S}^{-1}(V) \cap U_m} = F|_{\pi_{D_S}^{-1}(V) \cap U_m} \tag*{[2]} \]
$F$ is called a “local extension” of $f \circ \pi_{D_S}$ at the point $m \in \pi_{D_S}^{-1}(V)$. When the distribution $D$ is trivial, the presheaf $C^\infty_{S/D_S}$ coincides with the presheaf of Whitney smooth functions $C^\infty_{S,M}$ on $S$ induced by the smooth functions on $M$.

The presheaf $C^\infty_{S/D_S}$ is said to have the $(D, D_S)$-local extension property when the topology of $S$ is stronger than the relative topology and, at the same time, the local extensions of $f \circ \pi_{D_S}$ defined in [2] can always be chosen to satisfy
\[ dF(n)|_{D(n)} = 0 \quad \text{for any } n \in \pi_{D_S}^{-1}(V) \cap U_m \]
$F$ is called a “local $D$-invariant extension” of $f \circ \pi_{D_S}$ at the point $m \in \pi_{D_S}^{-1}(V)$. If $S$ is a smooth embedded submanifold of $M$ and $D_S$ is a smooth, integrable, and regular distribution on $S$, then the presheaf $C^\infty_{S/D_S}$ coincides with the presheaf of smooth functions on $S/D_S$ when considered as a regular quotient manifold.

The following definition spells out what we mean by obtaining a bracket via reduction.

Definition 5 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $S$ a decomposed subset of $M$, and $D \subset TM|_S$ a Poisson-integrable generalized distribution adapted to the decomposition of $S$. Assume that $C^\infty_{S/D_S}$ has the $(D, D_S)$-local extension property. Then $(M, \{\cdot,\cdot\}, D, S)$ is said to be “Poisson reducible” if $(S/D_S, C^\infty_{S/D_S}, \{\cdot,\cdot\}^{S/D_S})$ is a well-defined presheaf of Poisson algebras where, for any open set $V \subset S/D_S$, the bracket $\{\cdot,\cdot\}^{S/D_S}_V : C^\infty_{S/D_S}(V) \times C^\infty_{S/D_S}(V) \to C^\infty_{S/D_S}(V)$ is given by
\[ \{f, g\}^{S/D_S}_V(\pi_{D_S}(m)) := \{F, G\}(m) \]
for any $m \in \pi_{D_S}^{-1}(V)$ and for local $D$-invariant extensions $F, G$ at $m$ of $f \circ \pi_{D_S}$ and $g \circ \pi_{D_S}$, respectively.

Theorem 1 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold with associated Poisson tensor $B \in \Lambda^2(T^*M)$, $S$ a decomposed space, and $D \subset TM|_S$ a Poisson-integrable generalized distribution adapted to the decomposition of $S$ (see Definitions 4 and 1). Assume that $C^\infty_{S/D_S}$ has the $(D, D_S)$-local extension property. Then $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible if for any $m \in S$
\[ B^\sharp(\Delta_m) \subset [\Delta_m^S]^\circ \tag*{[3]} \]
where $\Delta_m := \{dF(m) \mid F \in C^\infty(U_m),\ dF(z)|_{D(z)} = 0$ for all $z \in U_m \cap S$, and for any open neighborhood $U_m$ of $m$ in $M\}$ and $\Delta_m^S := \{dF(m) \in \Delta_m \mid F|_{U_m \cap V_m}$ is constant for an open neighborhood $U_m$ of $m$ in $M$ and an open neighborhood $V_m$ of $m$ in $S\}$.

If $S$ is endowed with the relative topology, then $\Delta_m^S := \{dF(m) \in \Delta_m \mid F|_{U_m \cap S}$ is constant for an open neighborhood $U_m$ of $m$ in $M\}$.

Reduction by Regular Canonical Distributions

Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $S$ an embedded submanifold of $M$. Let $D \subset TM|_S$ be a sub-bundle of the tangent bundle of $M$ restricted to $S$ such that $D_S := D \cap TS$ is a smooth, integrable, regular distribution on $S$ and $D$ is canonical.

Theorem 2 With the above hypotheses, $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible if and only if
\[ B^\sharp(D^\circ) \subset TS + D \tag*{[4]} \]

Applications of the Poisson Reduction Theorem

Reduction of Coisotropic Submanifolds

Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold with associated Poisson tensor $B \in \Lambda^2(T^*M)$ and $S$ an immersed smooth submanifold of $M$. Denote by $(TS)^\circ := \{\alpha_s \in T_s^*M \mid \langle \alpha_s, v_s \rangle = 0$, for all $s \in S$, $v_s \in T_sS\} \subset T^*M$ the

conormal bundle of the manifold $S$; it is a vector sub-bundle of $T^*M|_S$. The manifold $S$ is called “coisotropic” if $B^\sharp((TS)^\circ) \subset TS$. In the physics literature, coisotropic submanifolds appear sometimes under the name of “first-class constraints.” The following are equivalent:

1. $S$ is coisotropic;
2. if $f \in C^\infty(M)$ satisfies $f|_S \equiv 0$, then $X_f|_S \in \mathfrak{X}(S)$;
3. for any $s \in S$, any open neighborhood $U_s$ of $s$ in $M$, and any function $g \in C^\infty(U_s)$ such that $X_g(s) \in T_sS$, if $f \in C^\infty(U_s)$ satisfies $\{f, g\}(s) = 0$, it follows that $X_f(s) \in T_sS$;
4. the subalgebra $\{f \in C^\infty(M) \mid f|_S \equiv 0\}$ is a Poisson subalgebra of $(C^\infty(M), \{\cdot,\cdot\})$.

The following proposition shows how to endow the coisotropic submanifolds of a Poisson manifold with a Poisson structure by using the reduction theorem 1.

Proposition 1 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold with associated Poisson tensor $B \in \Lambda^2(T^*M)$. Let $S$ be an embedded coisotropic submanifold of $M$ and $D := B^\sharp((TS)^\circ)$. Then

(i) $D = D \cap TS = D_S$ is a smooth generalized distribution on $S$.
(ii) $D$ is integrable.
(iii) If $C^\infty_{S/D_S}$ has the $(D, D_S)$-local extension property, then $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible.

Coisotropic submanifolds usually appear as the level sets of integrals in involution. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold with Poisson tensor $B$ and let $f_1, \ldots, f_k \in C^\infty(M)$ be $k$ smooth functions in involution, that is, $\{f_i, f_j\} = 0$, for any $i, j \in \{1, \ldots, k\}$. Assume that $0 \in \mathbb{R}^k$ is a regular value of the function $F := (f_1, \ldots, f_k) : M \to \mathbb{R}^k$ and let $S := F^{-1}(0)$. Since for any $s \in S$, $\operatorname{span}\{df_1(s), \ldots, df_k(s)\} \subset (T_sS)^\circ$ and the dimensions of both sides of this inclusion are equal, it follows that $\operatorname{span}\{df_1(s), \ldots, df_k(s)\} = (T_sS)^\circ$. Hence, $B^\sharp(s)((T_sS)^\circ) = \operatorname{span}\{X_{f_1}(s), \ldots, X_{f_k}(s)\}$ and $B^\sharp(s)((T_sS)^\circ) \subset T_sS$ by the involutivity of the components of $F$. Consequently, $S$ is a coisotropic submanifold of $(M, \{\cdot,\cdot\})$.

Cosymplectic Submanifolds and Dirac's Constraints Formula

The Poisson reduction theorem 2 allows us to define Poisson structures on certain embedded submanifolds that are not Poisson submanifolds.

Definition 6 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and let $B \in \Lambda^2(T^*M)$ be the corresponding Poisson tensor. An embedded submanifold $S \subset M$ is called cosymplectic if

(i) $B^\sharp((TS)^\circ) \cap TS = \{0\}$,
(ii) $T_sS + T_sL_s = T_sM$,

for any $s \in S$ and $L_s$ the symplectic leaf of $(M, \{\cdot,\cdot\})$ containing $s \in S$.

The cosymplectic submanifolds of a symplectic manifold $(M, \omega)$ are its symplectic submanifolds. Cosymplectic submanifolds appear in the physics literature under the name of “second-class constraints.”

Proposition 2 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $B \in \Lambda^2(T^*M)$ the corresponding Poisson tensor, and $S$ a cosymplectic submanifold of $M$. Then, for any $s \in S$,

(i) $T_sL_s = (T_sS \cap T_sL_s) \oplus B^\sharp(s)((T_sS)^\circ)$, where $L_s$ is the symplectic leaf of $(M, \{\cdot,\cdot\})$ that contains $s \in S$.
(ii) $(T_sS)^\circ \cap \ker B^\sharp(s) = \{0\}$.
(iii) $T_sM = B^\sharp(s)((T_sS)^\circ) \oplus T_sS$.
(iv) $B^\sharp((TS)^\circ)$ is a sub-bundle of $TM|_S$ and hence $TM|_S = B^\sharp((TS)^\circ) \oplus TS$.
(v) The symplectic leaves of $(M, \{\cdot,\cdot\})$ intersect $S$ transversely and hence $S \cap L$ is an initial submanifold of $S$, for any symplectic leaf $L$ of $(M, \{\cdot,\cdot\})$.

Theorem 3 (The Poisson structure of a cosymplectic submanifold). Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $B \in \Lambda^2(T^*M)$ the corresponding Poisson tensor, and $S$ a cosymplectic submanifold of $M$. Let $D := B^\sharp((TS)^\circ) \subset TM|_S$. Then,

(i) $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible.
(ii) The corresponding quotient manifold equals $S$ and the reduced bracket $\{\cdot,\cdot\}_S$ is given by
\[ \{f, g\}_S(s) = \{F, G\}(s) \tag*{[5]} \]
where $f, g \in C^\infty_{S,M}(V)$ are arbitrary and $F, G \in C^\infty(U)$ are local $D$-invariant extensions of $f$ and $g$ around $s \in S$, respectively.
(iii) The Hamiltonian vector field $X_f$ of an arbitrary function $f \in C^\infty_{S,M}(V)$ is given either by
\[ Ti \circ X_f = X_F \circ i \tag*{[6]} \]
where $F \in C^\infty(U)$ is a local $D$-invariant extension of $f$ and $i : S \hookrightarrow M$ is the inclusion, or by
\[ Ti \circ X_f = \pi_S \circ X_F \circ i \tag*{[7]} \]
where $F \in C^\infty(U)$ is an arbitrary local extension of $f$ and $\pi_S : TM|_S \to TS$ is the projection induced by the Whitney sum decomposition $TM|_S = B^\sharp((TS)^\circ) \oplus TS$ of $TM|_S$.
(iv) The symplectic leaves of $(S, \{\cdot,\cdot\}_S)$ are the connected components of the intersections $S \cap L$, where $L$ is a symplectic leaf of $(M, \{\cdot,\cdot\})$. Any

symplectic leaf of $(S, \{\cdot,\cdot\}_S)$ is a symplectic submanifold of the symplectic leaf of $(M, \{\cdot,\cdot\})$ that contains it.
(v) Let $L_s$ and $L_s^S$ be the symplectic leaves of $(M, \{\cdot,\cdot\})$ and $(S, \{\cdot,\cdot\}_S)$, respectively, that contain the point $s \in S$. Let $\omega_{L_s}$ and $\omega_{L_s^S}$ be the corresponding symplectic forms. Then $B^\sharp(s)((T_sS)^\circ)$ is a symplectic subspace of $T_sL_s$ and
\[ B^\sharp(s)((T_sS)^\circ) = \left(T_sL_s^S\right)^{\omega_{L_s}(s)} \tag*{[8]} \]
where $(T_sL_s^S)^{\omega_{L_s}(s)}$ denotes the $\omega_{L_s}(s)$-orthogonal complement of $T_sL_s^S$ in $T_sL_s$.
(vi) Let $B_S \in \Lambda^2(T^*S)$ be the Poisson tensor associated to $(S, \{\cdot,\cdot\}_S)$. Then
\[ B_S^\sharp = \pi_S \circ B^\sharp|_S \circ \pi_S^* \tag*{[9]} \]
where $\pi_S^* : T^*S \to T^*M|_S$ is the dual of $\pi_S : TM|_S \to TS$.

The “Dirac constraints formula” is the expression in coordinates for the bracket of a cosymplectic submanifold. Let $(M, \{\cdot,\cdot\})$ be an $n$-dimensional Poisson manifold and let $S$ be a $k$-dimensional cosymplectic submanifold of $M$. Let $z_0$ be an arbitrary point in $S$ and $(U, \bar{\kappa})$ a submanifold chart around $z_0$ such that $\bar{\kappa} = (\bar{\varphi}, \psi) : U \to V_1 \times V_2$, where $V_1$ and $V_2$ are two open neighborhoods of the origin in two Euclidean spaces such that $\bar{\kappa}(z_0) = (\bar{\varphi}(z_0), \psi(z_0)) = (0, 0)$ and
\[ \bar{\kappa}(U \cap S) = V_1 \times \{0\} \tag*{[10]} \]
Let $\bar{\varphi} =: (\bar{\varphi}^1, \ldots, \bar{\varphi}^k)$ be the components of $\bar{\varphi}$ and define $\hat{\varphi}^1 := \bar{\varphi}^1|_{U \cap S}, \ldots, \hat{\varphi}^k := \bar{\varphi}^k|_{U \cap S}$. Extend $\hat{\varphi}^1, \ldots, \hat{\varphi}^k$ to $D$-invariant functions $\varphi^1, \ldots, \varphi^k$ on $U$. Since the differentials $d\hat{\varphi}^1(s), \ldots, d\hat{\varphi}^k(s)$ are linearly independent for any $s \in U \cap S$, we can assume (by shrinking $U$ if necessary) that $d\varphi^1(z), \ldots, d\varphi^k(z)$ are also linearly independent for any $z \in U$. Consequently, $(U, \kappa)$ with $\kappa := (\varphi^1, \ldots, \varphi^k, \psi^1, \ldots, \psi^{n-k})$ is a submanifold chart for $M$ around $z_0$ with respect to $S$ such that, by construction,
\[ d\varphi^1(s)|_{B^\sharp(s)((T_sS)^\circ)} = \cdots = d\varphi^k(s)|_{B^\sharp(s)((T_sS)^\circ)} = 0 \]
for any $s \in U \cap S$. This implies that for any $i \in \{1, \ldots, k\}$, $j \in \{1, \ldots, n-k\}$, and $s \in S$
\[ \{\varphi^i, \psi^j\}(s) = d\varphi^i(s)\left(X_{\psi^j}(s)\right) = 0 \]
since $d\psi^j(s) \in (T_sS)^\circ$ by [10] and hence
\[ X_{\psi^j}(s) \in B^\sharp(s)((T_sS)^\circ) \tag*{[11]} \]
Additionally, since the functions $\varphi^1, \ldots, \varphi^k$ are $D$-invariant, by [6], it follows that
\[ X_{\varphi^1}(s) = X_{\hat{\varphi}^1}(s) \in T_sS, \ \ldots, \ X_{\varphi^k}(s) = X_{\hat{\varphi}^k}(s) \in T_sS \]
for any $s \in S$. Consequently, $\{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s), X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\}$ spans $T_sL_s$ with
\[ \{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s)\} \subset T_sS \cap T_sL_s \]
and
\[ \{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\} \subset B^\sharp(s)((T_sS)^\circ) \]
By Proposition 2(i),
\[ \operatorname{span}\{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s)\} = T_sS \cap T_sL_s \]
and
\[ \operatorname{span}\{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\} = B^\sharp(s)((T_sS)^\circ) \]
Since $\dim(B^\sharp(s)((T_sS)^\circ)) = n - k$ by Proposition 2(iii), it follows that $\{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\}$ is a basis of $B^\sharp(s)((T_sS)^\circ)$.

Since $B^\sharp(s)((T_sS)^\circ)$ is a symplectic subspace of $T_sL_s$ by Theorem 3(v), there exists some $r \in \mathbb{N}$ such that $n - k = 2r$ and, additionally, the matrix $C(s)$ with entries
\[ C^{ij}(s) := \{\psi^i, \psi^j\}(s), \quad i, j \in \{1, \ldots, n-k\} \]
is invertible. Therefore, in the coordinates $(\varphi^1, \ldots, \varphi^k, \psi^1, \ldots, \psi^{n-k})$, the matrix associated to the Poisson tensor $B(s)$ is
\[ B(s) = \begin{pmatrix} B_S(s) & 0 \\ 0 & C(s) \end{pmatrix} \]
where $B_S \in \Lambda^2(T^*S)$ is the Poisson tensor associated to $(S, \{\cdot,\cdot\}_S)$. Let $C_{ij}(s)$ be the entries of the matrix $C(s)^{-1}$.

Proposition 3 (Dirac formulas). In the coordinate neighborhood $(\varphi^1, \ldots, \varphi^k, \psi^1, \ldots, \psi^{n-k})$ constructed above and for $s \in S$ we have, for any $f, g \in C^\infty_{S,M}(V)$:
\[ X_f(s) = X_F(s) - \sum_{i,j=1}^{n-k} \{F, \psi^i\}(s)\, C_{ij}(s)\, X_{\psi^j}(s) \tag*{[12]} \]
and
\[ \{f, g\}_S(s) = \{F, G\}(s) - \sum_{i,j=1}^{n-k} \{F, \psi^i\}(s)\, C_{ij}(s)\, \{\psi^j, G\}(s) \tag*{[13]} \]
where $F, G \in C^\infty(U)$ are arbitrary local extensions of $f$ and $g$, respectively, around $s \in S$.
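The Dirac bracket formula [13] can be tried out on a small case. The sketch below is only an illustration under stated assumptions: it takes M = R^4 with coordinates z = (q1, q2, p1, p2) and the canonical bracket, and the two constraint functions ψ^1 = q2, ψ^2 = p2, so that S = {q2 = p2 = 0}. The function names, the sample extensions F and G, and the restriction to exactly two constraints (so that C(s) is a 2 × 2 matrix) are choices made here, not part of the article.

```python
# Illustrative sketch of formula [13] on R^4 with z = (q1, q2, p1, p2)
# and the canonical bracket; gradients are supplied by hand.

def canonical_bracket(grad_f, grad_g, z):
    """{F, G}(z) = sum_a (dF/dq_a dG/dp_a - dF/dp_a dG/dq_a)."""
    df, dg = grad_f(z), grad_g(z)
    return df[0] * dg[2] - df[2] * dg[0] + df[1] * dg[3] - df[3] * dg[1]

def dirac_bracket(grad_f, grad_g, grad_psi, z):
    """Reduced bracket from [13] for exactly two constraints psi^1, psi^2."""
    # Matrix C(s) with entries C^{ij} = {psi^i, psi^j}(s).
    c = [[canonical_bracket(a, b, z) for b in grad_psi] for a in grad_psi]
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    if det == 0:
        raise ValueError("C(s) is not invertible at this point")
    # Entries C_{ij} of the inverse matrix C(s)^{-1}.
    inv = [[c[1][1] / det, -c[0][1] / det],
           [-c[1][0] / det, c[0][0] / det]]
    correction = sum(
        canonical_bracket(grad_f, grad_psi[i], z) * inv[i][j]
        * canonical_bracket(grad_psi[j], grad_g, z)
        for i in range(2) for j in range(2))
    return canonical_bracket(grad_f, grad_g, z) - correction

# Constraints psi^1 = q2 and psi^2 = p2.
grad_psi = (lambda z: (0.0, 1.0, 0.0, 0.0), lambda z: (0.0, 0.0, 0.0, 1.0))

# Two different extensions of f = q1 restricted to S, and of g = p1 restricted to S.
grad_F_plain = lambda z: (1.0, 0.0, 0.0, 0.0)   # F = q1
grad_G_plain = lambda z: (0.0, 0.0, 1.0, 0.0)   # G = p1
grad_F_other = lambda z: (1.0, 0.0, z[3], z[2])  # F = q1 + p2 * p1
grad_G_other = lambda z: (z[1], z[0], 1.0, 0.0)  # G = p1 + q2 * q1

s = (0.5, 0.0, 2.0, 0.0)  # a point of S
plain = dirac_bracket(grad_F_plain, grad_G_plain, grad_psi, s)
other = dirac_bracket(grad_F_other, grad_G_other, grad_psi, s)
ambient_other = canonical_bracket(grad_F_other, grad_G_other, s)
```

At this sample point the ambient bracket of the second pair of extensions differs from that of the first pair, while the corrected bracket agrees for both, which is the independence of the choice of extension asserted in Proposition 3.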
84 Polygonal Billiards

See also: Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Cotangent Bundle Reduction; Graded Poisson Algebras; Symmetry and Symplectic Reduction; Hamiltonian Group Actions; Lie, Symplectic, and Poisson Groupoids and their Lie Algebroids; Singularity and Bifurcation Theory.

Polygonal Billiards
S Tabachnikov, Pennsylvania State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Mechanical Examples. Unfolding Billiard Trajectories

The billiard system inside a polygon $P$ has a very simple description: a point moves rectilinearly with the unit speed until it hits a side of $P$; there it instantaneously changes its velocity according to the rule “the angle of incidence equals the angle of reflection,” and continues the rectilinear motion. If the point hits a corner, its further motion is not defined (see Billiards in Bounded Convex Domains).

From the point of view of the theory of dynamical systems, polygonal billiards provide an example of parabolic dynamics in which nearby trajectories diverge with subexponential rate.

One of the motivations for the study of polygonal billiards comes from the mechanics of elastic particles in dimension 1. For example, consider the system of two point-masses $m_1$ and $m_2$ on the positive half-line $x \geq 0$. The collision between the points is elastic, that is, the energy and momentum are conserved. The reflection off the left endpoint of the half-line is also elastic: if a point hits the “wall” $x = 0$, its velocity changes sign. The configuration space of this system is the wedge $0 \leq x_1 \leq x_2$. After the rescaling $\bar{x}_i = \sqrt{m_i}\, x_i$, $i = 1, 2$, this system identifies with the billiard inside a wedge with the angle measure $\arctan\sqrt{m_1/m_2}$.

Likewise, the system of two elastic point-masses on a segment is the billiard system in a right triangle; a system of a number of elastic point-masses on the positive half-line or a segment is the billiard inside a multidimensional polyhedral cone or a polyhedron, respectively. The system of three elastic point-masses on a circle has three degrees of freedom; one can reduce one by assuming that the center of mass of the system is fixed. The resulting two-dimensional system is the billiard inside an acute triangle with the angles
\[ \arctan\left( m_i \sqrt{\frac{m_1 + m_2 + m_3}{m_1 m_2 m_3}} \right), \quad i = 1, 2, 3 \]
For comparison, the more realistic system of elastic balls identifies with the billiard system in a domain with nonflat boundary components.
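The angle formulas for the wedge and for the triangle of three point-masses on a circle are easy to evaluate numerically. The short Python sketch below is an illustration with hypothetical function names and sample masses; it computes the angles and lets one confirm that the three triangle angles are acute and add up to π.

```python
import math

def wedge_angle(m1, m2):
    """Angle of the wedge for two point-masses on the half-line: arctan sqrt(m1/m2)."""
    return math.atan(math.sqrt(m1 / m2))

def three_mass_triangle_angles(m1, m2, m3):
    """Angles arctan(m_i * sqrt((m1+m2+m3)/(m1*m2*m3))), i = 1, 2, 3."""
    scale = math.sqrt((m1 + m2 + m3) / (m1 * m2 * m3))
    return tuple(math.atan(m * scale) for m in (m1, m2, m3))

# Sample masses chosen only for illustration.
equal_masses = three_mass_triangle_angles(1.0, 1.0, 1.0)
unequal_masses = three_mass_triangle_angles(1.0, 2.0, 5.0)
angle_sum = sum(unequal_masses)
```

For equal masses the triangle is equilateral, and in general the angles sum to π because the three tangents t_i = m_i √((m1 + m2 + m3)/(m1 m2 m3)) satisfy t_1 + t_2 + t_3 = t_1 t_2 t_3.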

Figure 1 Unfolding a billiard trajectory in a wedge.

A useful elementary method of study is unfolding: instead of reflecting the billiard trajectory in the sides of the polygon, reflect the polygon in the respective side and unfold the billiard trajectory to a straight line. This method yields an upper bound
\[ \left[ \frac{\pi}{\arctan\sqrt{m_1/m_2}} \right] \]
for the number of collisions in the system of two point-masses $m_1$ and $m_2$ on the positive half-line. Likewise, the number of collisions for any number of elastic point-masses on the positive half-line is bounded above by a constant depending on the masses only. Similar results are known for systems of elastic balls (Figure 1).

Similarly, one studies the billiard inside the unit square. Unfolding the square yields a square grid in the plane, acted upon by the group of parallel translations $2\mathbb{Z} \times 2\mathbb{Z}$. Factorizing by this group action yields a torus, and the billiard flow in a given direction becomes a constant flow on the torus. If the slope is rational, then all orbits are periodic, and if the slope is irrational, then all orbits are dense and the billiard flow is ergodic. Its metric entropy is equal to zero. Periodic trajectories of the billiard in a square come in bands of parallel ones. Let $f(\ell)$ be the number of such bands of length not greater than $\ell$. Then, $f(\ell)$ equals the number of coprime lattice points inside the circle of radius $\ell$, that is, $f(\ell)$ has quadratic growth in $\ell$.

Periodic Trajectories

The simplest example of a periodic orbit in a polygonal billiard is the 3-periodic Fagnano trajectory in an acute triangle: it connects the bases of the three altitudes of the triangle and has minimal perimeter among inscribed triangles. The Fagnano trajectory belongs to a band of 6-periodic ones. It is not known whether every acute triangle has other periodic trajectories.

For a right triangle, one has the following result: almost every (in the sense of the Lebesgue measure) billiard trajectory that leaves a leg in the perpendicular direction returns to the same leg in the same direction and is therefore periodic. A similar existence result holds for polygons whose sides have only two directions.

In general, not much is known about the existence of periodic billiard trajectories in polygons. Conjecturally, every polygon has one, but this is not known even for all obtuse triangles. Recently, R Schwartz proved that every obtuse triangle with the angles not exceeding $100^\circ$ has a periodic billiard path. This work substantially relies on a computer program, McBilliards, written by Schwartz and Hooper.

If an arbitrarily small perturbation of the vertices of a billiard polygon leads to a perturbation of a periodic billiard trajectory, but not to its destruction, then this trajectory is called stable. Label the sides of the polygon $1, 2, \ldots, k$. Then a periodic trajectory is coded by the word consisting of the labels of the consecutively visited sides. An even-periodic trajectory is stable if and only if the numbers in the respective word can be partitioned in pairs of equal numbers, so that the number from each pair appears once at an even position, and once at an odd one. As a consequence, if the angles of a polygon are independent over the rational numbers, then every periodic billiard trajectory in it is stable.

Complexity of Billiard Trajectories

The encoding of billiard trajectories by the consecutively visited sides of the billiard polygon provides a link between billiard and symbolic dynamics. For a billiard $k$-gon $P$, denote by $\Sigma$ the set of words in letters $1, 2, \ldots, k$ corresponding to billiard trajectories in $P$, and let $\Sigma_n$ be the set of such words of length $n$.

One has a general theorem: the topological entropy of the billiard flow is zero. This implies that a number of quantities, associated with a polygonal billiard, grow slower than exponentially, as functions of $n$: the cardinality $|\Sigma_n|$, the number of strips of $n$-periodic trajectories, the number of generalized diagonals with $n$ links (i.e., billiard trajectories that start and end at corners of the billiard polygon), etc. Conjecturally, all these quantities have polynomial growth in $n$.
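The combinatorial stability criterion for even-periodic trajectories, stated in terms of the word of visited sides, amounts to a simple count: a pairing of equal labels with one member at an odd position and one at an even position exists exactly when every label occurs equally often at odd and at even positions. The following Python sketch implements that count; the function name and the sample words are hypothetical, and positions are counted from 1.

```python
from collections import Counter

def is_stable_even_word(word):
    """Check the pairing criterion for a periodic word of even length.

    `word` is a sequence of side labels; positions are counted from 1.
    """
    if len(word) % 2 != 0:
        raise ValueError("the criterion is stated for words of even length")
    odd_positions = Counter(word[0::2])   # labels at positions 1, 3, 5, ...
    even_positions = Counter(word[1::2])  # labels at positions 2, 4, 6, ...
    # A valid pairing exists iff each label appears equally often in both groups.
    return odd_positions == even_positions

# Sample inputs chosen for illustration: the word 123 written twice, and 1212.
doubled_triangle_word = is_stable_even_word([1, 2, 3, 1, 2, 3])
bouncing_word = is_stable_even_word([1, 2, 1, 2])
```

The sketch only tests the combinatorial condition on a given word; deciding whether a word is actually realized by a billiard trajectory in a particular polygon is a separate question that it does not address.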

The complexity of the billiard in a polygon is defined as the function $p(n) = |\Sigma_n|$. Likewise, one may consider the billiard trajectories in a given direction $\theta$ and define the corresponding complexity $p_\theta(n)$.

In the case of a square, one modifies the encoding using only two symbols, say, 0 and 1, to indicate that a trajectory reflects in a horizontal or a vertical side, respectively. If $\theta$ is a direction with an irrational slope, then $p_\theta(n) = n + 1$. This is a classical result by Hedlund and Morse. The sequences with complexity $p(n) = n + 1$ are called Sturmian; this is the smallest complexity of aperiodic sequences. A generalization for multidimensional cubes and parallelepipeds, due to Yu Baryshnikov, is known.

For a $k$-gon $P$, let $N$ be the least common denominator of its $\pi$-rational angles and $s$ be the number of its distinct $\pi$-irrational angles. Then,

$$p_\theta(n) \le kNn\left(1 + \frac{n}{2}\right)^{s}$$

Concerning billiard trajectories in all directions, one has a lower bound for complexity: $p(n) \ge cn^2$ for a constant $c$ depending on the polygon. A similar estimate holds for a $d$-dimensional polyhedron with the exponent 2 replaced by $d$.

Rational Polygons and Flat Surfaces

The only class of polygons for which the billiard dynamics is well understood is that of rational polygons, that is, polygons with the property that the angles between all pairs of sides are rational multiples of $\pi$. Let $P$ be a simply connected (without holes) rational $k$-gon with angles $\pi m_i/n_i$, where $m_i$ and $n_i$ are coprime integers. The reflections in the sides of $P$ generate a subgroup of the group of isometries of the plane. Let $G(P) \subset O(2)$ consist of the linear parts of the elements of this group. Then, $G(P)$ is the dihedral group $D_N$ consisting of $2N$ elements. When a billiard trajectory reflects in a side of $P$, its direction changes by the action of the group $G(P)$, and the orbit of a generic direction $\theta \ne k\pi/N$ on the unit circle consists of $2N$ points.

The phase space of the billiard flow is the unit tangent bundle $P \times S^1$. Let $M_\theta$ be the subset of points whose projection to $S^1$ belongs to the orbit of $\theta$ under $G(P) = D_N$. Then, $M_\theta$ is an invariant surface of the billiard flow in $P$. The surface $M_\theta$ is obtained from $2N$ copies of $P$ by gluing their sides according to the action of $D_N$. This oriented compact surface depends only on the polygon $P$, but not on the choice of $\theta$, and may be denoted by $M$. The directional billiard flows $F_\theta$ on $M$ in directions $\theta$ are obtained, one from another, by rotations. The genus of $M$ is given by the formula

$$1 + \frac{N}{2}\left(k - 2 - \sum \frac{1}{n_i}\right)$$

For example, if $P$ is a right triangle with an acute angle $\pi/8$, then $M$ is a surface of genus 2 (Figure 2).

Figure 2 The invariant surface for a right triangle with acute angle $\pi/8$ has genus 2.

The cases when $M$ is a torus are as follows: the angles of $P$ are all of the form $\pi/n_i$, where $n_i$ are equal, up to permutations, to

$$(3, 3, 3), \quad (2, 4, 4), \quad (2, 3, 6), \quad (2, 2, 2, 2)$$

and the respective polygons are an equilateral triangle, an isosceles right triangle, a right triangle with an acute angle $\pi/6$, and a square. All these polygons tile the plane.

The billiard flow on the surface $M$ has saddle singularities at the points obtained from the vertices of $P$. The surface $M$ inherits a flat metric from $P$ with a finite number of cone-type singularities, corresponding to the vertices of $P$, with cone angles multiples of $2\pi$ (Figure 3).

Figure 3 A cone singularity for the flow on an invariant surface.

A flat surface $M$ is a compact smooth surface with a distinguished finite set of points $\Sigma$. On $M \setminus \Sigma$, one has coordinate charts $v = (x, y)$ such that the transition functions on the overlaps are of the form

$$v \to v + c \quad \text{or} \quad v \to -v + c$$
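As an aside, the genus formula given earlier in this section is easy to evaluate mechanically. The following Python sketch does so for a rational polygon specified by its angles $\pi m_i/n_i$. The function name `invariant_surface_genus` and the representation of angles as `(m, n)` pairs are illustrative assumptions, not notation from the article.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def invariant_surface_genus(angles):
    """Evaluate 1 + (N/2) * (k - 2 - sum(1/n_i)) for a rational k-gon.

    `angles` lists the interior angles as pairs (m, n), meaning pi * m / n.
    Each pair is reduced to lowest terms before the denominators n_i are used.
    """
    k = len(angles)
    reduced = [Fraction(m, n) for m, n in angles]
    # The interior angles of a simple k-gon add up to (k - 2) * pi.
    if sum(reduced) != k - 2:
        raise ValueError("angles do not sum to (k - 2) * pi")
    denominators = [angle.denominator for angle in reduced]
    # N is the least common multiple of the denominators n_i.
    big_n = reduce(lambda a, b: a * b // gcd(a, b), denominators)
    genus = 1 + Fraction(big_n, 2) * (k - 2 - sum(Fraction(1, n) for n in denominators))
    if genus.denominator != 1:
        raise ValueError("formula did not produce an integer; check the input")
    return int(genus)


if __name__ == "__main__":
    # Right triangle with an acute angle pi/8: angles pi/2, pi/8, 3*pi/8.
    print(invariant_surface_genus([(1, 2), (1, 8), (3, 8)]))
    # The square, one of the torus cases.
    print(invariant_surface_genus([(1, 2)] * 4))
```

For the right triangle with acute angle $\pi/8$ this evaluates to 2, and for the four torus cases listed above it evaluates to 1, in agreement with the text.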
In particular, one may talk about directions on a flat surface.

The group $PSL(2, \mathbb{R})$ acts on the space of flat structures. From the point of view of complex analysis, a flat surface is a Riemann surface with a holomorphic quadratic differential; the set of cone points $\Sigma$ corresponds to the zeros of the quadratic differential. Not every flat surface is associated with a polygonal billiard.

Concerning ergodicity, one has the theorem of Kerckhoff, Masur, and Smillie: given a flat surface of genus not less than 2, for almost all directions $\theta$ (in the sense of the Lebesgue measure), the flow $F_\theta$ is uniquely ergodic. Furthermore, the Hausdorff dimension of the set of angles $\theta$ for which ergodicity fails does not exceed 1/2, and this bound is sharp. As a consequence, the billiard flow on the invariant surface is uniquely ergodic for almost all directions. Another corollary: there is a dense $G_\delta$ subset in the space of polygons consisting of polygons for which the billiard flow is ergodic. If a billiard polygon admits approximation by rational polygons at a superexponentially fast rate, then the billiard flow in it is ergodic.

Concerning periodic orbits, one has the following theorem due to H Masur: given a flat surface of genus not less than 2, there exists a dense set of angles $\theta$ such that $F_\theta$ has a closed trajectory. As a consequence, for any rational billiard polygon, there is a dense set of directions each with a periodic orbit. Furthermore, periodic points are dense in the phase space of the billiard flow in a rational polygon.

Similarly to the case of a square, let $f(\ell)$ be the number of strips of periodic trajectories of length not greater than $\ell$ in a rational polygon $P$. By a theorem of H Masur, there exist constants $c$ and $C$ such that for sufficiently large $\ell$ one has: $c\ell^2 < f(\ell) < C\ell^2$, and likewise for flat surfaces.

There is a class of flat surfaces, called Veech (or lattice) surfaces, for which more refined results are available. The group of affine transformations of a flat surface determines a subgroup in $SL(2, \mathbb{R})$. If this subgroup is a lattice in $SL(2, \mathbb{R})$, then the flat surface is called a Veech surface. Similarly, one defines a Veech rational polygon. For example, regular polygons and isosceles triangles with equal angles $\pi/n$ are Veech. All acute Veech triangles are described.

For a Veech surface, one has the following Veech dichotomy: for any direction $\theta$, either the flow $F_\theta$ is minimal or every leaf of it is closed (unless it is a saddle connection, i.e., a segment connecting cone points).

For a Veech surface (and polygon), the quadratic bounds for the counting function $f(\ell)$ become quadratic asymptotics: $f(\ell)/\ell^2$ has a limit as $\ell \to \infty$. The value of this limit is expressed in arithmetical terms. A generic flat surface also has quadratic asymptotics. The value of the limit depends only on the stratum of the Teichmüller space that contains this surface. These values are known, due to Eskin, Masur, Okounkov, and Zorich. Since a generic flat surface does not correspond to a rational polygon, this result does not immediately apply to polygonal billiards. However, quadratic asymptotics are established for rectangular billiards with barriers.

Note, in conclusion, a close relation of billiards in rational polygons and interval exchange transformations; the reduction of the former to the latter is a particular case of the reduction of the billiard flow to the billiard ball map. On an invariant surface $M$ of the billiard flow, consider a segment $I$, perpendicular to the directional flow. Since "the width of a beam" is an invariant transversal measure for the constant flow, the first return map to $I$ is a piecewise orientation preserving isometry, that is, an interval exchange transformation.

Acknowledgment

This work was partially supported by NSF.

See also: Billiards in Bounded Convex Domains; Ergodic Theory; Fractal Dimensions in Dynamics; Generic Properties of Dynamical Systems; Holomorphic Dynamics; Hyperbolic Billiards; Riemann Surfaces.

Further Reading

Burago D, Ferleger S, and Kononenko A (2000) A Geometric Approach to Semi-Dispersing Billiards. Hard Ball Systems and the Lorentz Gas, pp. 9–27. Berlin: Springer.
Chernov N and Markarian R Theory of Chaotic Billiards (to appear).
Galperin G, Stepin A, and Vorobets Ya (1992) Periodic billiard trajectories in polygons: generating mechanisms. Russian Mathematical Surveys 47(3): 5–80.
Gutkin E (1986) Billiards in polygons. Physica D 19: 311–333.
Gutkin E (1996) Billiards in polygons: survey of recent results. Journal of Statistical Physics 83: 7–26.
Gutkin E (2003) Billiard dynamics: a survey with the emphasis on open problems. Regular and Chaotic Dynamics 8: 1–13.
Katok A and Hasselblatt B (1995) Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press.
Kozlov V and Treshchev D (1991) Billiards. A Genetic Introduction to the Dynamics of Systems with Impacts. Providence: American Mathematical Society.
Masur H and Tabachnikov S (2002) Rational Billiards and Flat Structures. Handbook of Dynamical Systems, vol. 1A, pp. 1015–1089. Amsterdam: North-Holland.
Sinai Ya (1976) Introduction to Ergodic Theory. Princeton: Princeton University Press.
Smillie J (2000) The Dynamics of Billiard Flows in Rational Polygons, Encyclopaedia of Mathematical Sciences, vol. 100, pp. 360–382. Berlin: Springer.
Tabachnikov S (1995) Billiards, Société Math. de France, Panoramas et Synthèses, No 1.
Tabachnikov S (2005) Geometry and Billiards. American Mathematical Society.
Positive Maps on C*-Algebras


F Cipriani, Politecnico di Milano, Milan, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The theme of positive maps on $C^*$-algebras and other ordered vector spaces dates back to the Perron–Frobenius theory of matrices with positive entries, the Schur product of matrices, the study of doubly stochastic matrices describing discrete-time random walks, and the behavior of limits of powers of positive matrices in ergodic theory.

Long experience has shown that far-reaching generalizations of the above situations have to be considered in various fields of mathematical physics and that $C^*$-algebras, their positive cones, and other associated ordered vector spaces provide a rich unifying framework of functional analysis to treat them.

The aim of this note is to review some of the basic aspects both of the general theory and of the applications.

In the next section we briefly recall the definitions of $C^*$-algebras and their positive cones. However, throughout this article we refer to C*-Algebras and their Classification and von Neumann Algebras: Introduction, Modular Theory and Classification Theory as sources of the definitions and general properties of the objects of these operator algebras. We then introduce positive maps, illustrate their general properties, and discuss some relevant classes of them. The correspondence between states and representations is described next, as well as the appearance of vector, normal and non-normal states in applications. We then illustrate the structure of completely positive maps and their relevance in mathematical physics. Finally, we describe the relevance of the class of completely positive maps to understanding the structure of nuclear $C^*$-algebras.

Positive Cones in C*-Algebras

A $C^*$-algebra $A$ is a complex Banach algebra with a conjugate-linear involution $a \mapsto a^*$ such that $\|a^*a\| = \|a\|^2$ for all $a \in A$.

When $A$ has a unit $1_A$, the spectrum $\mathrm{Sp}(a)$ of an element $a$ is the subset of all complex numbers $\lambda$ such that $a - \lambda \cdot 1_A$ is not invertible in $A$. When $A$ is realized as a subalgebra of some $B(H)$, and this is always possible, the set $\mathrm{Sp}(a)$ coincides with the spectrum of the bounded operator $a$ on the Hilbert space $H$.

The involution determines the self-adjoint part $A_h := \{a \in A : a = a^*\}$ of $A$, a real subspace such that $A = A_h + iA_h$. A self-adjoint element $a$ of $A$ satisfies $\mathrm{Sp}(a) \subset \mathbb{R}$ and, if $k \ge 0$, one has $\|a\| \le k$ if and only if $\mathrm{Sp}(a) \subset [-k, k]$.

The involution determines another important subset of $A$: $A_+ := \{a^*a : a \in A\}$. This subset of $A_h$ is closed in the norm topology of $A$ and contains the sums of its elements as well as their multiples by positive scalars: in other words, it is a closed convex cone. From a spectral point of view, one has the following characterization: a self-adjoint element $a$ belongs to $A_+$ if and only if its spectrum is positive: $\mathrm{Sp}(a) \subset [0, +\infty)$. It is this property that allows us to call $A_+$ the positive cone of $A$ and its elements positive. If it exists, a unit $1_A$ in $A$ is always positive and a nonzero Hermitian element $a$ is positive if and only if $\|1_A - a/\|a\|\| \le 1$.

The continuous functional calculus in $A$ allows one to write any self-adjoint element of $A_h$ as a difference of elements of $A_+$: $A_h = A_+ - A_+$. Moreover, $A_+ \cap (-A_+) = \{0\}$ and the decomposition $a = b - c$ of a self-adjoint element $a$ as a difference of positive elements $b$ and $c$ is unique provided one requires that $bc = cb = 0$. In this case, it is called the orthogonal decomposition.

The cone $A_+$ determines an underlying structure of ordered space on $A$: for $a, b \in A$ one says that $a$ is less than or equal to $b$, in symbols $a \le b$, if and only if $b - a \in A_+$. In particular, $a \ge 0$ just means that $a$ is positive.

Another fundamental characterization of the positive cone is the following: a self-adjoint element $a = a^*$ is positive if and only if there exists a self-adjoint element $b$ in $A$ such that $a = b^2$. Moreover, among the elements $b$ with this property, there exists one and only one which is positive, the square root of $a$. Some examples of positive cones are provided in the following.

Example 1 By a fundamental result of I M Gelfand, a commutative $C^*$-algebra $A$ is isomorphic to the $C^*$-algebra $C_0(X)$ of all complex continuous functions vanishing at infinity on a locally compact Hausdorff topological space $X$. The algebraic operations have the usual pointwise meaning and the norm is the uniform one. The constant function 1 represents the unit precisely when $X$ is compact. The positive cone $C_0(X)_+$ coincides with that of the positive continuous functions in $C_0(X)$.

Example 2 Finite dimensional $C^*$-algebras $A$ are classified as finite sums $M_{n_1}(\mathbb{C}) \oplus M_{n_2}(\mathbb{C}) \oplus \cdots \oplus M_{n_k}(\mathbb{C})$ of full matrix algebras $M_{n_i}(\mathbb{C})$. An element
$a_1 \oplus a_2 \oplus \cdots \oplus a_k$ is positive if and only if the matrices $a_i$ have positive eigenvalues.

Example 3 When a $C^*$-algebra $A \subset B(H)$ is represented as a self-adjoint closed algebra of operators on a Hilbert space $H$, its positive elements are those which have non-negative spectrum.

Positive Maps on C*-Algebras

Among the various relevant classes of maps between $C^*$-algebras, we are going to consider the following ones, whose properties are connected with the underlying structures of ordered vector spaces.

Definition 1 Given two $C^*$-algebras $A$ and $B$, a map $\varphi : A \to B$ is called positive if $\varphi(A_+) \subseteq B_+$. In other words, a map is positive if and only if it transforms the positive elements of $A$ into positive elements of $B$:

$$a \in A \Rightarrow \varphi(a^*a) \in B_+ \qquad [1]$$

If $A$ and $B$ have units, the map is called unital provided $\varphi(1_A) = 1_B$.

Morphisms and Jordan Morphisms

A $*$-morphism between $C^*$-algebras $\varphi : A \to B$ is positive; in fact, $\varphi(a^*a) = \varphi(a)^*\varphi(a) \ge 0$.

This is also the case for Jordan $*$-morphisms, the linear maps satisfying $\varphi(a^*) = \varphi(a)^*$ and $\varphi(\{a, b\}) = \{\varphi(a), \varphi(b)\}$, where $\{a, b\} = ab + ba$ denotes the Jordan product. In fact, if $a = a^*$ then $\varphi(a^2) = \varphi(a)^2$ is positive.

Schur Product of Matrices

Let $A \in M_n(\mathbb{C})$ be a positive matrix and define a linear map $\varphi_A : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ through the Schur product of matrices: $\varphi_A(B) := [A_{ij}B_{ij}]_{i,j=1}^{n}$. Since the Schur product of positive matrices is positive too (i.e., the positive cone of $M_n(\mathbb{C})$ is a semigroup under the Schur product), the above map is positive.

Positive-Definite Functions on Groups

Positive maps also arise naturally in harmonic analysis. Let $G$ be a locally compact topological group with identity $e$ and left Haar measure $m$. Let $p : G \to \mathbb{C}$ be a continuous positive-definite function on $G$. This just means that for all $n \ge 1$ and all $s_1, \ldots, s_n \in G$, the matrix $\{p(s_i^{-1}s_j)\}_{i,j=1}^{n}$ belongs to the positive cone of $M_n(\mathbb{C})$: $\sum_{i,j=1}^{n} p(s_i^{-1}s_j)\bar{\lambda}_i\lambda_j \ge 0$ for all $\lambda_1, \ldots, \lambda_n$. Such functions are necessarily bounded with $\|p\|_\infty \le p(e)$, so that an operator $\varphi : L^1(G, m) \to L^1(G, m)$ is well defined by pointwise multiplication: $\varphi(f)(s) := p(s)f(s)$. This map extends to a positive map $\varphi : C^*(G) \to C^*(G)$, which is unital when $p(e) = 1$, on the full group $C^*$-algebra $C^*(G)$. When $G$ is amenable, this algebra coincides with the reduced $C^*$-algebra $C^*_r(G)$ so that, if $G$ is also unimodular (as is the case if $G$ is compact), the positive elements can be approximated by positive-definite functions in $L^1(G, m)$ and the positivity of $\varphi$ follows exactly as in the previous example.

Positive Maps in Commutative C*-Algebras

Positive maps $\varphi : C_0(Y) \to C_0(X)$ between commutative $C^*$-algebras have the following structure: $\varphi(a)(x) = \int_Y k(x, dy)\,a(y)$, $a \in C_0(Y)$. Here the kernel $x \mapsto k(x, \cdot)$ is a continuous map from $X$ to the space of positive Radon measures on $Y$. In case $X$ and $Y$ are compact, the map is unital provided $k(x, \cdot)$ is a probability measure for each $x \in X$. In fact, for a fixed $x \in X$, the map $a \mapsto \varphi(a)(x)$ is a positive linear functional from $C_0(Y)$ to $\mathbb{C}$ and Riesz's theorem guarantees that it can be represented by a positive Radon measure on $Y$.

In probability theory, one-parameter semigroups $\varphi_t \circ \varphi_s = \varphi_{t+s}$ of positive maps $\varphi_t : C_0(X) \to C_0(X)$ such that $\varphi_t(1) \le 1$ for all $t \ge 0$ are called Markovian semigroups (conservative, if the maps are unital). They represent the expectation at time $t > 0$ of Markovian stochastic processes on $X$. In this case, the time-dependent kernel $k(t, x, \cdot)$ represents the probability distribution at time $t$ of a particle starting in $x \in X$ at time $t = 0$.

These kinds of maps arise also in potential theory, where the dependence of the solution $\varphi(a)$ of a Dirichlet problem on a bounded domain $\Omega$, with nice boundary $\partial\Omega$, upon the continuous boundary data $a \in C(\partial\Omega)$ gives rise to a linear unital map $\varphi : C(\partial\Omega) \to C(\Omega \cup \partial\Omega)$, whose positivity and unitality translate the "maximum principle" for harmonic functions. When $\Omega$ is the unit disk, $k$ is the familiar Poisson kernel.

Continuity and Algebraic Properties of Positive Maps

Since the order structure of a $C^*$-algebra $A$ is defined by its positive cone $A_+$, positive maps are

1. real: $\varphi(a^*) = \varphi(a)^*$ and
2. order preserving: $\varphi(a) \le \varphi(b)$ whenever $a \le b$.

From this follows an important interplay between positivity and continuity:

a positive map $\varphi : A \to B$ between $C^*$-algebras is continuous

In case $A$ has a unit, this follows by the fact that $\varphi$ is order preserving and that, for self-adjoint $a$, one has
$-\|a\|1_A \le a \le +\|a\|1_A$, so that $-\|a\|\varphi(1_A) \le \varphi(a) \le +\|a\|\varphi(1_A)$ and then $\|\varphi(a)\| \le \|\varphi(1_A)\| \cdot \|a\|$. In general, splitting $a = b + ic$ as a combination of Hermitian elements $b$ and $c$, as $\|b\| \le \|a\|$ and $\|c\| \le \|a\|$, one obtains

$$\|\varphi(a)\| \le \|\varphi(b)\| + \|\varphi(c)\| \le \|\varphi(1_A)\|(\|b\| + \|c\|) \le 2\|\varphi(1_A)\| \cdot \|a\|$$

The second general result concerning positivity and continuity is the following:

Let $\varphi : A \to B$ be a linear map between $C^*$-algebras with unit such that $\varphi(1_A) = 1_B$; then $\varphi$ is positive if and only if $\|\varphi\| = 1$.

The result relies, among other things, on the generalized Schwarz inequality for unital positive maps on normal elements,

$$\varphi(a^*)\varphi(a) \le \varphi(a^*a), \qquad a^*a = aa^*$$

These results may be used to reveal the strong interplay between the algebraic, continuity and positivity properties of maps:

Let $\varphi : A \to B$ be an invertible linear map between unital $C^*$-algebras such that $\varphi(1_A) = 1_B$. The following properties are equivalent:

1. $\varphi$ is a Jordan isomorphism,
2. $\varphi$ is an isometry, and
3. $\varphi$ is an order isomorphism ($\varphi$ and $\varphi^{-1}$ are order preserving).

The above conclusions can be strengthened if, instead of individual maps, continuous groups of maps are considered.

Let $t \mapsto \varphi_t$ be a strongly continuous, one-parameter group of maps of a unital $C^*$-algebra $A$ and assume that $\varphi_t(1_A) = 1_A$ for all $t \in \mathbb{R}$. The following properties are equivalent:

1. $\varphi_t$ is a $*$-automorphism of $A$ for all $t \in \mathbb{R}$,
2. $\|\varphi_t\| \le 1$ for all $t \in \mathbb{R}$, and
3. $\varphi_t$ is positive for all $t \in \mathbb{R}$.

An analogous result holds true for $w^*$-continuous groups on von Neumann algebras that are abelian or factors.

States on C*-Algebras

A state on a $C^*$-algebra $A$ is a positive functional $\varphi : A \to \mathbb{C}$ of norm 1:

$\varphi(a^*a) \ge 0$ for all $a \in A$, and
$\|\varphi\| = 1$.

As $\mathbb{C}$ is a $C^*$-algebra, when $A$ is unital, a state on it is just a unital positive map:

$\varphi(a^*a) \ge 0$ for all $a \in A$, and
$\varphi(1_A) = 1$.

States for which $\varphi(ab) = \varphi(ba)$ are called tracial states. States constitute a distinguished class of positive maps, both from a mathematical viewpoint and for application to mathematical physics. We will see below that states are deeply connected to representations of $C^*$-algebras (see C*-Algebras and their Classification).

States on Commutative C*-Algebras

Since this is a subcase of positive maps in commutative $C^*$-algebras we only add a comment. Insofar as a $C^*$-algebra represents observable quantities of a physical system, states carry our actual knowledge about the system itself. The smallest $C^*$-subalgebra $\{f(a) : f \in C_0(\mathbb{R})\}$ of $A$ containing a given self-adjoint element $a \in A$, representing a certain observable quantity, is isomorphic to the algebra $C(\mathrm{Sp}(a))$ of continuous functions on the spectrum of $a$. A state on $A$ induces, by restriction, a state on $C(\mathrm{Sp}(a))$, which, by the Riesz representation theorem, is associated to a probability measure $\mu_a$ on $\mathrm{Sp}(a)$ through the formula

$$\varphi(f(a)) = \int_{\mathrm{Sp}(a)} f(x)\,\mu_a(dx)$$

Since $\mathrm{Sp}(a)$ represents the possible values of the observable associated to $a$, $\mu_a$ represents the distribution of these values when the physical state of the system is represented by $\varphi$.

Vector States and Density Matrices

In case $A$ is acting on a Hilbert space $h$, $A \subset B(h)$, each unit vector $\xi \in h$ gives rise to a vector state $\varphi_\xi(a) = (\xi | a\xi)$. In the quantum-mechanical description of a finite system, as far as observables with discrete spectrum are concerned, one can assume $A$ to be the $C^*$-algebra $K(h)$ of compact operators on the Hilbert space $h$. In this case every state is a convex superposition of vector states, in the sense that it can be represented by the formula

$$\varphi(a) = \mathrm{tr}(\rho a)/\mathrm{tr}(\rho), \qquad a \in K(h)$$

for a suitable density matrix $\rho$, that is, a positive, compact operator with finite trace. In quantum statistical mechanics, the grand canonical Gibbs equilibrium state of a finite system at inverse temperature $\beta$ and chemical potential $\mu$, with Hamiltonian $H$ and number operator $N$, is of the above type

$$\varphi_{\beta,\mu}(a) = \mathrm{tr}(e^{-\beta K}a)/\mathrm{tr}(e^{-\beta K})$$
where $K = H - \mu N$, and the spectrum of $H$ is assumed to be discrete and such that $e^{-\beta K}$ is trace-class. For infinite systems, $A$ is a quasilocal $C^*$-algebra generated by a net $\{A_\Lambda\}$ of $C^*$-subalgebras describing observables referred to finite-volume regions. Infinite-volume equilibrium states on $A$ can then be obtained as thermodynamic limits of finite-volume Gibbs equilibrium states of the above type.

Normal and Singular States

When observables with continuous spectrum have to be considered and one chooses the algebra $B(h)$ of all bounded operators, the above formula, although still meaningful, does not describe all states on $B(h)$ but only the important subclass of the normal ones. To this class, which can be considered on any von Neumann algebra $M$, belong states $\varphi$ which are $\sigma$-weakly continuous functionals. Equivalently, these are the states such that for every increasing net $a_\alpha \in M_+$ with least upper bound $a \in M_+$, $\varphi(a)$ is the least upper bound of the net $\varphi(a_\alpha)$.

In general, each state $\varphi$ on a von Neumann algebra $M$ splits as a sum of a maximal normal piece and a singular one. Singular traces appear in noncommutative geometry as very useful tools to get back local objects from spectral ones via the familiar principle that local properties of functions depend on the asymptotics of their Fourier coefficients. This is best illustrated on a compact, Riemannian $n$-manifold $M$ by the formula

$$\int_M f\,dm = c_n \cdot \tau_\omega(M_f|D|^{-n})$$

which expresses the Riemannian integral of a nice function $f$ in terms of the Dirac operator $D$ acting on the Hilbert space of square-integrable spinors, the multiplication operator $M_f$ by $f$, and the singular Dixmier tracial state $\tau_\omega$ on $B(H)$. Here the compactness of $M$ implies the compactness of the operator $M_f|D|^{-n}$ and $\tau_\omega$ is a limiting procedure depending only on the asymptotic behavior of the eigenvalues of $M_f|D|^{-n}$. Similar formulas are valid on self-similar fractals as well as on quasiconformal manifolds. Local index formulas represent cyclic cocycles in Connes' spectral geometry (see Noncommutative Geometry and the Standard Model; Noncommutative Geometry from Strings; Path-Integrals in Noncommutative Geometry).

States and Representations: The GNS Construction

A fundamental tool in studying a $C^*$-algebra $A$ are its representations. These are morphisms of $C^*$-algebras $\pi : A \to B(H)$ from $A$ to the algebra of all bounded operators on some Hilbert space $H$.

There is a symbiotic appearance of states and representations on $C^*$-algebras. In fact, given a representation $\pi : A \to B(H)$, one easily constructs states on $A$ from unit vectors $\xi \in H$ by

$$\varphi_\xi(a) = (\xi | \pi(a)\xi)$$

In fact, one checks that $\varphi_\xi(a^*a) = (\xi | \pi(a^*a)\xi) = (\xi | \pi(a^*)\pi(a)\xi) = \|\pi(a)\xi\|^2 \ge 0$ and, at least if a unit exists, that $\varphi_\xi(1_A) = \|\xi\|^2 = 1$.

A fundamental construction due to Gelfand, Naimark, and Segal allows one to associate a representation to each state in such a way that each state is a vector state for a suitable representation.

“Let $\omega$ be a state over the $C^*$-algebra $A$. It follows that there exists a cyclic representation $(\pi_\omega, H_\omega, \xi_\omega)$ of $A$ such that

$$\omega(a) = (\xi_\omega | \pi_\omega(a)\xi_\omega)$$

Moreover, the representation is unique up to unitary equivalence. It is called the canonical cyclic representation of $A$ associated with $\omega$.”

The positivity property of the state allows one to introduce the positive-semidefinite scalar product $\langle a | b \rangle = \omega(a^*b)$ on the vector space $A$. Moreover, its kernel $I_\omega = \{a \in A : \omega(a^*a) = 0\}$ is a left ideal of $A$: in fact, if $a \in A$ and $b \in I_\omega$ then $\omega((ab)^*(ab)) \le \|a\|^2\omega(b^*b) = 0$. This allows one to define, on the quotient pre-Hilbert space $A/I_\omega$, an action of the elements $a \in A$: $\pi_\omega(a)(b + I_\omega) := ab + I_\omega$. It is the extension of this action to the Hilbert space completion $H_\omega$ of $A/I_\omega$ that gives the representation associated to $\omega$. When $A$ has a unit, the cyclic vector $\xi_\omega$ with the stated properties is precisely the image of $1_A + I_\omega$. By definition, the cyclicity of the representation amounts to the fact that $\pi_\omega(A)\xi_\omega$ is dense in $H_\omega$.

Completely Positive Maps

In a sense, the order structure of a $C^*$-algebra $A$ is better understood through the sequence of $C^*$-algebras $A \otimes M_n(\mathbb{C}) \cong M_n(A)$, obtained as tensor products of $A$ and full matrix algebras $M_n(\mathbb{C})$. For example, $C^*$-algebras are matrix-ordered vector spaces as $\alpha^*(M_m(A))_+\alpha \subseteq (M_n(A))_+$ for all matrices $\alpha \in M_{m \times n}(\mathbb{C})$.

In this respect, one is naturally led to consider a stronger notion of positivity:

“A map $\varphi : A \to B$ is called $n$-positive if its extension

$$\varphi \otimes 1_n : A \otimes M_n(\mathbb{C}) \to B \otimes M_n(\mathbb{C}), \qquad (\varphi \otimes 1_n)[a_{i,j}]_{i,j} = [\varphi(a_{i,j})]_{i,j}$$
is positive and completely positive (CP map for short) if this happens for all $n$.”

Equivalently, $n$-positive means that $\sum_{i,j=1}^{n} b_i^*\varphi(a_i^*a_j)b_j \ge 0$ for all $a_1, \ldots, a_n \in A$ and $b_1, \ldots, b_n \in B$. In particular, if $\varphi$ is $n$-positive then it is $k$-positive for all $k \le n$. Many positive maps we considered are in fact CP maps:

1. morphisms of $C^*$-algebras are CP maps;
2. positive maps $\varphi : A \to B$ are automatically CP maps provided $A$, $B$ or both are commutative and states are, in particular, CP maps; and
3. an important class of CP maps is the following. A norm one projection $\varepsilon : A \to B$, from a $C^*$-algebra $A$ onto a $C^*$-subalgebra $B$, is a contraction such that $\varepsilon(b) = b$ for all $b \in B$. It can be proved that these maps satisfy $\varepsilon(bac) = b\varepsilon(a)c$ for all $a \in A$ and $b, c \in B$ and for this reason they are called conditional expectations. This property then implies that they are CP maps.

However, the identity map from a $C^*$-algebra $A$ into its opposite algebra $A^{\mathrm{op}}$ is positive but not 2-positive unless $A$ is commutative, the transposition $a \mapsto a^t$ in $M_n(\mathbb{C})$ is positive and not 2-positive if $n \ge 2$ and, for all $n$, there exist $n$-positive maps which are not $(n + 1)$-positive.

CP Maps in Mathematical Physics

In several fields of application, the transition of a state of a system into another state can be described by a completely positive map $\varphi : A \to B$ between $C^*$-algebras: for any given state $\omega$ of $B$, $\omega \circ \varphi$ is then a state of $A$.

1. In the theory of quantum communication processes (see Channels in Quantum Information Theory; Optimal Cloning of Quantum States; Source Coding in Quantum Information Theory; Capacity for Quantum Information), for example, $B$ and $A$ represent the input and output systems, respectively, $\omega$ the signal to be transmitted, $\omega \circ \varphi$ the received signal, and $\varphi$ the system of transmission, called the channel.
2. In quantum probability and in the theory of quantum open systems, continuous semigroups of CP maps (see Quantum Dynamical Semigroups) describe dissipative time evolutions of a system due to interaction with an external one (heat bath).
3. In the theory of measurement in quantum mechanics, an observable can be described by a positive-operator-valued (POV) measure $m$ which assigns a positive element $m(E)$ in a $C^*$-algebra $A$ to each Borel subset $E$ of a topological space $X$. For each $f \in C_0(X)$, one can define its integral $\varphi_m(f) := \int_X f\,dm$ as an element of $A$. The map $\varphi_m : C_0(X) \to A$, called the observation channel, is then a CP map.
4. Another field of mathematical physics in which CP maps play a distinguished role is in the construction and application of the quantum dynamical entropy, an extension of the Kolmogorov–Sinai entropy of measure preserving transformations (see Quantum Entropy). When dealing with a noncommutative dynamical system $(M, \varphi, \tau)$ in which $\tau$ is a normal trace state on a finite von Neumann algebra $M$, the Connes–Størmer entropy $h_\tau(\varphi)$ is defined through the consideration of an entropy functional $H_\tau(N_1, \ldots, N_k)$ of finite-dimensional von Neumann subalgebras $N_1, \ldots, N_k \subset M$. To extend the definition to more general $C^*$-algebras and states on them, one has to face the fact that $C^*$-algebras may have no nontrivial $C^*$-subalgebras. To circumvent the problem A Connes, H Narnhofer, and W Thirring (CNT) introduced an entropy functional $H(\gamma_1, \ldots, \gamma_k)$ associated to a set $\gamma_i : A_i \to A$ of CP maps (finite channels) from finite-dimensional $C^*$-algebras $A_i$ into $A$. This led to the CNT entropy $h_\omega(\varphi)$ of a noncommutative dynamical system $(A, \varphi, \omega)$, where $\omega$ is a state on $A$ and $\varphi$ is an automorphism or a CP map preserving it: $\omega \circ \varphi = \omega$.

CP Maps and Continuity

Since for an element $a \in A$ of a unital $C^*$-algebra, one has $\|a\| \le 1$ precisely when

$$\begin{pmatrix} 1 & a \\ a^* & 1 \end{pmatrix}$$

is positive in $M_2(A)$, it follows that

2-positive unital maps are contractive

Unital 2-positive maps satisfy, in particular, the generalized Schwarz inequality for all $a \in A$,

$$\varphi(a^*)\varphi(a) \le \varphi(a^*a)$$

In particular,

“CP maps are completely bounded as $\sup_n \|\varphi \otimes 1_n\| = \|\varphi(1_A)\|$ and completely contractive if they are unital. Conversely, unital, completely contractive maps are CP maps.”

CP Maps and Matrix Algebras

When the domain or the target space of a map is a matrix algebra, one has the following equivalences concerning positivity. Let $[e_{i,j}]_{i,j}$ denote the standard
matrix units in $M_n(\mathbb{C})$ and let $\varphi : M_n(\mathbb{C}) \to B$ be a linear map into a $C^*$-algebra $B$. The following conditions are equivalent:

1. $\varphi$ is a CP map,
2. $\varphi$ is $n$-positive, and
3. $[\varphi(e_{i,j})]_{i,j}$ is positive in $M_n(B)$.

Associating to a linear map $\varphi : A \to M_n(\mathbb{C})$ the linear functional $s_\varphi : M_n(A) \to \mathbb{C}$ defined by $s_\varphi([a_{i,j}]) := \sum_{i,j} \varphi(a_{i,j})_{i,j}$, one has the following equivalent properties:

1. $\varphi$ is a CP map,
2. $\varphi$ is $n$-positive,
3. $s_\varphi$ is positive, and
4. $s_\varphi$ is positive on $A_+ \otimes M_n(\mathbb{C})_+$.

Stinespring Representation of CP Maps

CP maps are relatively easy to handle, thanks to the following dilation result due to W F Stinespring. It describes a CP map as the compression of a morphism of $C^*$-algebras.

Let $A$ be a unital $C^*$-algebra and $\varphi : A \to B(H)$ a linear map. Then $\varphi$ is a CP map if and only if it has the form

$$\varphi(a) = V^*\pi(a)V$$

for some representation $\pi : A \to B(K)$ on a Hilbert space $K$, and some bounded linear map $V : H \to K$. If $A$ is a von Neumann algebra and $\varphi$ is normal then $\pi$ can be taken to be normal. When $A = B(H)$ and $H$ is separable, one has, for some $b_n \in B(H)$,

$$\varphi(a) = \sum_{n=1}^{\infty} b_n^* a b_n$$

The proof of this result is reminiscent of the GNS construction for states and its extension, by G Kasparov, to $C^*$-modules is central in bivariant K-homology theory.

Despite the above satisfactory result, one should be aware that positive but not CP maps are much less understood and only for maps on very low dimensional matrix algebras do we have a definitive classification. To have an idea of the intricacies of the matter, one may consult Størmer (1963).

Positive Semigroups on Standard Forms of von Neumann Algebras and Ground State for Physical Hamiltonians

The above result allows one to derive the structure of generators of norm-continuous dynamical semigroups in terms of dissipative operators.

Strongly continuous positive semigroups, which are KMS symmetric with respect to a KMS state $\omega$ of a given automorphism group of a $C^*$-algebra $A$, can be analyzed as positive semigroups in the standard representation $(M, H, P, J)$ (see Tomita–Takesaki Modular Theory) of the von Neumann algebra $M := \pi_\omega(A)''$. A semigroup on $A$ gives rise to a corresponding $w^*$-continuous positive semigroup on $M$ and to a strongly continuous positive semigroup on the ordered Hilbert space $(H, P)$ of the standard form. In the latter framework, one can develop an infinite-dimensional, noncommutative extension of the classical Perron–Frobenius theory for matrices with positive entries. This applies, in particular, to semigroups generated by physical Hamiltonians and has been used to prove existence and uniqueness of the ground state for boson and fermion systems in quantum field theory (one may consult Gross (1972)).

Nuclear C*-Algebras and Injective von Neumann Algebras

The nonabelian character of the product in $C^*$-algebras may prevent the existence of nontrivial morphisms between them, while one may have an abundance of CP maps. For example, there are no nontrivial morphisms from the algebra of compact operators to $\mathbb{C}$, but there exist sufficiently many states to separate its elements. A much more well-behaved category of $C^*$-algebras is obtained by considering CP maps as morphisms. This is true, in particular, for nuclear $C^*$-algebras: those for which any tensor product $A \otimes B$ with any other $C^*$-algebra $B$ admits a unique $C^*$-cross norm (see C*-Algebras and their Classification). The intimate relation between this class of algebras and CP maps is illustrated by the following characterization:

1. $A$ is nuclear;
2. the identity map of $A$ is a pointwise limit of CP maps of finite rank;
3. the identity map of $A$ can be approximately factorized, $\lim_\alpha (T_\alpha \circ S_\alpha)a = a$ for all $a \in A$, through matrix algebras and nets of CP maps $S_\alpha : A \to M_{n_\alpha}(\mathbb{C})$, $T_\alpha : M_{n_\alpha}(\mathbb{C}) \to A$.

A second important relation between nuclear $C^*$-algebras and CP maps emerges in connection to the lifting problem.

“Let $A$ be a nuclear $C^*$-algebra and $J$ a closed two-sided ideal in a $C^*$-algebra $B$. Then every CP map $\varphi : A \to B/J$ can be lifted to a CP map $\varphi' : A \to B$. In other words, $\varphi$ factors through $B$ by the quotient map $q : B \to B/J$: $\varphi = q \circ \varphi'$.”

This and related results are used to prove that the Brown–Douglas–Fillmore K-homology invariant Ext(A) is a group for separable, nuclear $C^*$-algebras.

Our last basic result, due to W Arveson, about CP maps concerns the extension problem.

“Let A be a unital $C^*$-algebra and N a self-adjoint closed subspace of A containing the identity. Then every CP map $\varphi : N \to B(H)$ from N into a type I factor B(H) can be extended to a CP map $\tilde{\varphi} : A \to B(H)$.”

This result can be restated by saying that type I factors are injective von Neumann algebras. It may suggest how the notion of a completely positive map plays a fundamental role along Connes’ proof of one culminating result of the theory of von Neumann algebras, namely the fact that the class of injective von Neumann algebras coincides with the class of approximately finite-dimensional ones (see von Neumann Algebras: Introduction, Modular Theory and Classification Theory).

See also: Capacity for Quantum Information; C*-Algebras and Their Classification; Channels in Quantum Information Theory; Noncommutative Geometry and the Standard Model; Noncommutative Geometry from Strings; Optimal Cloning of Quantum States; Path Integrals in Noncommutative Geometry; Quantum Dynamical Semigroups; Quantum Entropy; Source Coding in Quantum Information Theory; Tomita–Takesaki Modular Theory; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory.

Further Reading

Bratteli O and Robinson DW (1987) Operator Algebras and Quantum Statistical Mechanics 1, 2nd edn., 505 pp. Berlin: Springer; New York: Heidelberg.
Bratteli O and Robinson DW (1997) Operator Algebras and Quantum Statistical Mechanics 2, 2nd edn., 518 pp. Berlin: Springer; New York: Heidelberg.
Connes A (1994) Noncommutative Geometry, 661 pp. San Diego, CA: Academic Press.
Davies EB (1976) Quantum Theory of Open Systems, 171 pp. London: Academic Press.
Gross L (1972) Existence and uniqueness of physical ground states. Journal of Functional Analysis 10: 52–109.
Lance EC (1995) Hilbert C*-Modules, 130 pp. London: Cambridge University Press.
Ohya M and Petz D (1993) Quantum Entropy and Its Use, 335 pp. New York: Academic Press.
Paulsen VI (1996) Completely Bounded Maps and Dilations, 187 pp. Harlow: Longman Scientific-Technical.
Pisier G (2003) Introduction to Operator Space Theory, 479 pp. London: Cambridge University Press.
Størmer E (1963) Positive linear maps of operator algebras. Acta Mathematica 110: 233–278.
Takesaki M (2004a) Theory of Operator Algebras I, Second printing of the 1st edn., 415 pp. Berlin: Springer.
Takesaki M (2004b) Theory of Operator Algebras II, 384 pp. Berlin: Springer.
Takesaki M (2004c) Theory of Operator Algebras III, 548 pp. Berlin: Springer.

Pseudo-Riemannian Nilpotent Lie Groups


P E Parker, Wichita State University, Wichita, KS, USA
© 2006 Elsevier Ltd. All rights reserved.

Nilpotent Lie Groups

While not much had been published on the geometry of nilpotent Lie groups with a left-invariant Riemannian metric till around 1990, the situation is certainly better now; see the references in Eberlein (2004). However, there is still very little that is conspicuous about the more general pseudo-Riemannian case. In particular, the two-step nilpotent groups are nonabelian and as close as possible to being abelian, but display a rich variety of new and interesting geometric phenomena (Cordero and Parker 1999). As in the Riemannian case, one of many places where they arise naturally is as groups of isometries acting on horospheres in certain (pseudo-Riemannian) symmetric spaces. Another is in the Iwasawa decomposition G = KAN of semisimple groups with the Killing metric tensor, which need not be (positive or negative) definite even on N. Here, K is compact and A is abelian.

An early motivation for this study was the observation that there are two nonisometric pseudo-Riemannian metrics on the Heisenberg group $H_3$, one of which is flat. This is a strong contrast to the Riemannian case in which there is only one (up to positive homothety) and it is not flat. This is not an anomaly, as we now well know.

While the idea of more than one timelike dimension has appeared a few times in the physics literature, both in string/M-theory and in brane-world scenarios, essentially all work to date assumes only one. Thus, all applications so far are of Lorentzian or definite nilpotent groups. Guediri and co-workers led the Lorentzian studies, and most of their results stated near the end of the section “Lorentzian groups” concern a major, perennial interest in relativity: the (non)existence of closed timelike geodesics in compact Lorentzian manifolds.
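The contrast between a flat pseudo-Riemannian metric on $H_3$ and the non-flat Riemannian one can be checked by direct computation. The following is a minimal sketch under stated assumptions, not the method of the cited papers: it assumes a basis e_0, e_1, e_2 of the Heisenberg algebra with [e_0, e_1] = e_2, an illustrative indefinite Gram matrix in which the center is null, and the standard Koszul formula for left-invariant vector fields. All function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 3  # dimension of the Heisenberg algebra

# Structure constants C[i][j][k] with [e_i, e_j] = sum_k C[i][j][k] e_k.
# Assumed basis: [e_0, e_1] = e_2, so e_2 spans the center.
C = [[[Fraction(0)] * N for _ in range(N)] for _ in range(N)]
C[0][1][2] = Fraction(1)
C[1][0][2] = Fraction(-1)

def basis(i):
    return [Fraction(int(i == k)) for k in range(N)]

def bracket(x, y):
    return [sum(C[i][j][k] * x[i] * y[j] for i in range(N) for j in range(N))
            for k in range(N)]

def inner(g, x, y):
    return sum(g[a][b] * x[a] * y[b] for a in range(N) for b in range(N))

def solve(g, rhs):
    """Solve g v = rhs by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(g, rhs)]
    for col in range(N):
        piv = next(r for r in range(col, N) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(N):
            if r != col:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [row[N] for row in m]

def connection(g):
    """conn[i][j] = coefficients of nabla_{e_i} e_j via the Koszul formula."""
    e = [basis(i) for i in range(N)]
    def koszul(i, j, k):
        return Fraction(1, 2) * (inner(g, bracket(e[i], e[j]), e[k])
                                 - inner(g, bracket(e[j], e[k]), e[i])
                                 + inner(g, bracket(e[k], e[i]), e[j]))
    return [[solve(g, [koszul(i, j, k) for k in range(N)]) for j in range(N)]
            for i in range(N)]

def nabla(conn, x, y):
    # Bilinear extension to left-invariant fields with constant coefficients.
    return [sum(x[i] * y[j] * conn[i][j][k] for i in range(N) for j in range(N))
            for k in range(N)]

def curvature(conn, x, y, z):
    """R(x, y)z = nabla_x nabla_y z - nabla_y nabla_x z - nabla_[x,y] z."""
    t1 = nabla(conn, x, nabla(conn, y, z))
    t2 = nabla(conn, y, nabla(conn, x, z))
    t3 = nabla(conn, bracket(x, y), z)
    return [a - b - c for a, b, c in zip(t1, t2, t3)]

def is_flat(g):
    conn = connection(g)
    e = [basis(i) for i in range(N)]
    return all(v == 0 for i, j, k in product(range(N), repeat=3)
               for v in curvature(conn, e[i], e[j], e[k]))

# Gram matrices of the inner product on the assumed basis.
G_RIEMANNIAN = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
G_NULL_CENTER = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]  # <e_2, e_2> = 0

print(is_flat(G_NULL_CENTER), is_flat(G_RIEMANNIAN))
```

Exact rational arithmetic is used so that "curvature vanishes" is tested as an identity rather than up to rounding. Other Gram matrices can be passed to `is_flat` in the same way, provided they are symmetric and nondegenerate.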

Others have made use of nilpotent Lie groups with left-invariant (positive or negative) definite metric tensors, such as Hervig’s (2004) constructions of black hole spacetimes from solvmanifolds (related to solvable groups: those with Iwasawa decomposition G = AN), including the so-called BTZ constructions. Definite groups and their applications, already having received thorough surveys elsewhere, most notably those of Eberlein, are not included here.

Although the geometric properties of Lie groups with left-invariant definite metric tensors have been studied extensively, the same has not occurred for indefinite metric tensors. For example, while the paper of Milnor (1976) has already become a classic reference, in particular for the classification of positive-definite (Riemannian) metrics on three-dimensional Lie groups, a classification of the left-invariant Lorentzian metric tensors on these groups became available only in 1997. Similarly, only a few partial results in the line of Milnor’s study of definite metrics were previously known for indefinite metrics. Moreover, in dimension 3, there are only two types of metric tensors: Riemannian (definite) and Lorentzian (indefinite). But in higher dimensions, there are many distinct types of indefinite metrics while there is still essentially only one type of definite metric. This is another reason why this area has special interest now.

The list in “Further reading” at the end of this article consists of general survey articles and a select few of the more historically important papers. Precise bibliographical information for references merely mentioned or alluded to in this article may be found in those. The main, general reference on pseudo-Riemannian geometry is O’Neill’s (1983) book. Eberlein’s (2004) article covers the Riemannian case. At this time, there is no other comprehensive survey of the pseudo-Riemannian case. One may use Cordero and Parker (1999) and Guediri (2003) and their reference lists to good advantage, however.

Inner Product and Signature

By an inner product on a vector space V we shall mean a nondegenerate, symmetric bilinear form on V, generally denoted by $\langle\,,\rangle$. In particular, we do not assume that it is positive definite. It has become customary to refer to an ordered pair of non-negative integers (p, q) as the signature of the inner product, where p denotes the number of positive eigenvalues and q the number of negative eigenvalues. Then nondegeneracy means that p + q = dim V. Note that there is no real geometric difference between (p, q) and (q, p); indeed, O’Neill gives handy conversion procedures for this and for the other major sign variant (e.g., curvature) (see O’Neill (1983, pp. 92 and 89, respectively)).

A Riemannian inner product has signature (p, 0). In view of the preceding remark, one might as well regard signature (0, q) as also being Riemannian, so that “Riemannian geometry is that of definite metric tensors.” Similarly, a Lorentzian inner product has either p = 1 or q = 1. In this case, both sign conventions are used in relativistic theories with the proviso that the “1” axis is always timelike.

If neither p nor q is 1, there is no physical convention. We shall say that $v \in V$ is timelike if $\langle v, v\rangle > 0$, null if $\langle v, v\rangle = 0$, and spacelike if $\langle v, v\rangle < 0$. (In a Lorentzian example, one may wish to revert to one’s preferred relativistic convention.) We shall refer to these collectively as the causal type of a vector (or of a curve to which a vector is tangent).

Considering indefinite inner products (and metric tensors) thus greatly expands one’s purview, from one type of geometry (Riemannian), or possibly two (Riemannian and Lorentzian), to a total of $\lfloor (p+q)/2 \rfloor + 1$ distinctly different types of geometries on the same underlying differential manifolds.

Rise of 2-Step Groups

Throughout, N will denote a connected (and simply connected, usually), nilpotent Lie group with Lie algebra $\mathfrak{n}$ having center $\mathfrak{z}$. We shall use $\langle\,,\rangle$ to denote either an inner product on $\mathfrak{n}$ or the induced left-invariant pseudo-Riemannian (indefinite) metric tensor on N.

For all nilpotent Lie groups, the exponential map $\exp : \mathfrak{n} \to N$ is surjective. Indeed, it is a diffeomorphism for simply connected N; in this case, we shall denote the inverse by log.

One of the earliest papers on the Riemannian geometry of nilpotent Lie groups was Wolf (1964). Since then, a few other papers about general nilpotent Lie groups have appeared, including Karidi (1994) and Pauls (2001), but the area has not seen a lot of progress.

However, everything changed with Kaplan’s (1981) publication. Following this paper and its successor (Kaplan 1983), almost all subsequent work on the left-invariant geometry of nilpotent groups has been on two-step groups.

Briefly, Kaplan defined a new class of nilpotent Lie groups, calling them of Heisenberg type. This was soon abbreviated to H-type, and has since also been called Heisenberg-like and (unfortunately) “generalized Heisenberg.” (Unfortunate, because that term was already in use for another class, not all of which are of H-type.) What made them so

compelling was that (almost) everything was explicitly calculable, thus making them the next great test bed after symmetric spaces.

Definition 1 We say that N (or $\mathfrak{n}$) is 2-step nilpotent when $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{z}$. Then $[[\mathfrak{n}, \mathfrak{n}], \mathfrak{n}] = 0$ and the generalization to k-step nilpotent is clear:

$$[[\cdots[[[\mathfrak{n}, \mathfrak{n}], \mathfrak{n}], \mathfrak{n}] \cdots], \mathfrak{n}] = 0$$

with k + 1 copies of $\mathfrak{n}$ (or k nested brackets, if you prefer).

It soon became apparent that H-type groups comprised a subclass of 2-step groups; for a nice, modern proof see Berndt et al. (1995). By around 1990, they had also attracted the attention of the spectral geometry community, and Eberlein produced the seminal survey (with important new results) from which the modern era began. (It was published in 1994 (Eberlein 1994), but the preprint had circulated widely since 1990.) Since then, activity around 2-step nilpotent Lie groups has mushroomed; see the references in Eberlein (2004).

Finally, turning to pseudo-Riemannian nilpotent Lie groups, with perhaps one or two exceptions, all results so far have been obtained only for 2-step groups. Thus, the remaining sections of this article will be devoted almost exclusively to them.

The Baker–Campbell–Hausdorff formula takes on a particularly simple form in these groups:

$$\exp(x)\exp(y) = \exp\left(x + y + \tfrac{1}{2}[x, y]\right) \qquad [1]$$

Proposition 1 In a pseudo-Riemannian 2-step nilpotent Lie group, the exponential map preserves causal character. Alternatively, one-parameter subgroups are curves of constant causal character.

Of course, one-parameter subgroups need not be geodesics.

Lattices and Completeness

We shall need some basic facts about lattices in N. In nilpotent Lie groups, a lattice is a discrete subgroup $\Gamma$ such that the homogeneous space $M = \Gamma\backslash N$ is compact. Here we follow the convention that a lattice acts on the left, so that the coset space consists of left cosets and this is indicated by the notation. Other subgroups will generally act on the right, allowing better separation of the effects of two simultaneous actions.

Lattices do not always exist in nilpotent Lie groups.

Theorem 1 The simply connected, nilpotent Lie group N admits a lattice if and only if there exists a basis of its Lie algebra $\mathfrak{n}$ for which the structure constants are rational.

Such a group is said to have a rational structure, or simply to be rational.

A nilmanifold is a (compact) homogeneous space of the form $\Gamma\backslash N$, where N is a connected, simply connected (rational) nilpotent Lie group and $\Gamma$ is a lattice in N. An infranilmanifold has a nilmanifold as a finite covering space. They are commonly regarded as a noncommutative generalization of tori, the Klein bottle being the simplest example of an infranilmanifold that is not a nilmanifold.

We recall the result of Marsden from O’Neill (1983).

Theorem 2 A compact, homogeneous pseudo-Riemannian space is geodesically complete.

Thus, if a rational N is provided with a bi-invariant metric tensor $\langle\,,\rangle$, then M becomes a compact, homogeneous pseudo-Riemannian space which is therefore complete. It follows that $(N, \langle\,,\rangle)$ is itself complete. In general, however, the metric tensor is not bi-invariant and N need not be complete.

For 2-step nilpotent Lie groups, things work nicely as shown by this result first published by Guediri.

Theorem 3 On a 2-step nilpotent Lie group, all left-invariant pseudo-Riemannian metrics are geodesically complete.

No such general result holds for 3- and higher-step groups, however.

2-Step Groups

In the Riemannian (positive-definite) case, one splits $\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v} = \mathfrak{z} \oplus \mathfrak{z}^\perp$, where the superscript denotes the orthogonal complement with respect to the inner product $\langle\,,\rangle$. In the general pseudo-Riemannian case, however, $\mathfrak{z} \oplus \mathfrak{z}^\perp \neq \mathfrak{n}$. The problem is that $\mathfrak{z}$ might be a degenerate subspace; that is, it might contain a null subspace $\mathfrak{U}$ for which $\mathfrak{U} \subseteq \mathfrak{U}^\perp$.

It turns out that this possible degeneracy of the center causes the essential differences between the Riemannian and pseudo-Riemannian cases. So far, the only general success in studying groups with degenerate centers was in Cordero and Parker (1999) where an adapted Witt decomposition of $\mathfrak{n}$ was used together with an involution $\iota$ exchanging the two null parts.

Observe that if $\mathfrak{z}$ is degenerate, the null subspace $\mathfrak{U}$ is well defined invariantly. We shall use a decomposition

$$\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v} = \mathfrak{U} \oplus \mathfrak{Z} \oplus \mathfrak{V} \oplus \mathfrak{E} \qquad [2]$$

in which $\mathfrak{z} = \mathfrak{U} \oplus \mathfrak{Z}$ and $\mathfrak{v} = \mathfrak{V} \oplus \mathfrak{E}$, $\mathfrak{U}$ and $\mathfrak{V}$ are complementary null subspaces, and $\mathfrak{U}^\perp \cap \mathfrak{V}^\perp = \mathfrak{Z} \oplus \mathfrak{E}$. Although the choice of $\mathfrak{V}$ is not well defined invariantly, once a $\mathfrak{V}$ has been chosen then $\mathfrak{Z}$ and $\mathfrak{E}$ are well defined invariantly. Indeed, $\mathfrak{Z}$ is the portion of the center $\mathfrak{z}$ in $\mathfrak{U}^\perp \cap \mathfrak{V}^\perp$, and $\mathfrak{E}$ is its orthocomplement in $\mathfrak{U}^\perp \cap \mathfrak{V}^\perp$. This is a Witt decomposition of $\mathfrak{n}$ given $\mathfrak{U}$, easily seen by noting that $(\mathfrak{U} \oplus \mathfrak{V})^\perp = \mathfrak{Z} \oplus \mathfrak{E}$, adapted to the special role of the center in $\mathfrak{n}$.

We shall also need to use an involution $\iota$ that interchanges $\mathfrak{U}$ and $\mathfrak{V}$ and which reduces to the identity on $\mathfrak{Z} \oplus \mathfrak{E}$ in the Riemannian (positive-definite) case. (The particular choice of such an involution is not significant.) It turns out that $\iota$ is an isometry of $\mathfrak{n}$ which does not integrate to an isometry of N. The adjoint with respect to $\langle\,,\rangle$ of the adjoint representation of the Lie algebra $\mathfrak{n}$ on itself is denoted by $\mathrm{ad}^\dagger$.

Definition 2 The linear mapping

$$j : \mathfrak{U} \oplus \mathfrak{Z} \to \mathrm{End}(\mathfrak{V} \oplus \mathfrak{E})$$

is given by

$$j(a)x = \iota\, \mathrm{ad}^\dagger_x\, \iota a$$

Formulas for the connection and curvatures, and explicit forms for many examples, may be found in Cordero and Parker (1999). It turns out there is a relatively large class of flat spaces, a clear distinction from the Riemannian case in which there are none.

Let $x, y \in \mathfrak{n}$. Recall that homaloidal planes are those for which the numerator $\langle R(x, y)y, x\rangle$ of the sectional curvature formula vanishes. This notion is useful for degenerate planes tangent to spaces that are not of constant curvature.

Definition 3 A submanifold of a pseudo-Riemannian manifold is flat if and only if every plane tangent to the submanifold is homaloidal.

Theorem 4 The center Z of N is flat.

Corollary 1 The only N of constant curvature are flat.

The degenerate part of the center can have a profound effect on the geometry of the whole group.

Theorem 5 If $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and $\mathfrak{E} = \{0\}$, then N is flat.

Among these spaces, those that also have $\mathfrak{Z} = \{0\}$ (which condition itself implies $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$) are fundamental, with the more general ones obtained by making nondegenerate central extensions. It is also easy to see that the product of any flat group with a nondegenerate abelian factor is still flat.

This is the best possible result in general. Using weaker hypotheses in place of $\mathfrak{E} = \{0\}$, such as $[\mathfrak{V}, \mathfrak{V}] = \{0\} = [\mathfrak{E}, \mathfrak{E}]$, it is easy to construct examples which are not flat.

Corollary 2 If $\dim Z \geq \lceil n/2 \rceil$, then there exists a flat metric on N.

Here $\lceil r \rceil$ denotes the least integer greater than or equal to r and n = dim N.

Before continuing, we pause to collect some facts about the condition $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and its consequences.

Remark 1 Since it implies $j(z) = 0$ for all $z \in \mathfrak{Z}$, this latter is possible with no pseudo-Euclidean de Rham factor, unlike the Riemannian case. (On the other hand, a pseudo-Euclidean de Rham factor is characterized in terms of the Kaplan-Eberlein map j whenever the center is nondegenerate.)

Also, it implies $j(u)$ interchanges $\mathfrak{V}$ and $\mathfrak{E}$ for all $u \in \mathfrak{U}$ if and only if $[\mathfrak{V}, \mathfrak{V}] = [\mathfrak{E}, \mathfrak{E}] = \{0\}$. Examples are the Heisenberg group and the groups H(p, 1) for $p \geq 2$ with null centers.

Finally, we note that it implies that, for every $u \in \mathfrak{U}$, $j(u)$ maps $\mathfrak{V}$ to $\mathfrak{V}$ if and only if $j(u)$ maps $\mathfrak{E}$ to $\mathfrak{E}$ if and only if $[\mathfrak{V}, \mathfrak{E}] = \{0\}$.

Proposition 2 If $j(z) = 0$ for all $z \in \mathfrak{Z}$ and $j(u)$ interchanges $\mathfrak{V}$ and $\mathfrak{E}$ for all $u \in \mathfrak{U}$, then N is Ricci flat.

Proposition 3 If $j(z) = 0$ for all $z \in \mathfrak{Z}$, then N is scalar flat. In particular, this occurs when $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$.

Much like the Riemannian case, we would expect that $(N, \langle\,,\rangle)$ should in some sense be similar to flat pseudo-Euclidean space. This is seen, for example, via the existence of totally geodesic subgroups (Cordero and Parker 1999). (O’Neill (1983, ex. 9, p. 125) has extended the definition of totally geodesic to degenerate submanifolds of pseudo-Riemannian manifolds.)

Example 1 For any $x \in \mathfrak{n}$ the one-parameter subgroup exp(tx) is a geodesic if and only if $x \in \mathfrak{z}$ or $x \in \mathfrak{U} \oplus \mathfrak{E}$. This is essentially the same as the Riemannian case, but with some additional geodesic one-parameter subgroups coming from $\mathfrak{U}$.

Example 2 Abelian subspaces of $\mathfrak{V} \oplus \mathfrak{E}$ are Lie subalgebras of $\mathfrak{n}$, and give rise to complete, flat, totally geodesic abelian subgroups of N, just as in the Riemannian case. Eberlein’s construction is valid in general, and shows that if $\dim \mathfrak{V} \oplus \mathfrak{E} \geq 1 + k + k \dim \mathfrak{z}$, then every nonzero element of $\mathfrak{V} \oplus \mathfrak{E}$ lies in an abelian subspace of dimension k + 1.

Example 3 The center Z of N is a complete, flat, totally geodesic submanifold. Moreover, it determines a foliation of N by its left translates, so each leaf is flat and totally geodesic, as in the Riemannian

case. In the pseudo-Riemannian case, this foliation in turn is the orthogonal direct sum of two foliations determined by $\mathfrak{U}$ and $\mathfrak{Z}$, and the leaves of the $\mathfrak{U}$-foliation are also null. All these leaves are complete.

There is also the existence of $\dim \mathfrak{z}$ independent first integrals, a familiar result in pseudo-Euclidean space, and the geodesic equations are completely integrable; in certain cases (mostly when the center is nondegenerate), one can obtain explicit formulas.

Unlike the Riemannian case, there are flat groups (nonabelian) which are isometric to pseudo-Euclidean spaces (abelian).

Theorem 6 If $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and $\mathfrak{E} = \{0\}$, then N is geodesically connected. Consequently, so is any nilmanifold with such a universal covering space.

Thus, these compact nilmanifolds are much like tori. This is also illustrated by the computation of their period spectrum.

Isometry Group

The main new feature is that when the center is degenerate, the isometry group can be strictly larger in a significant way than when the center is nondegenerate (which includes the Riemannian case).

Letting Aut(N) denote the automorphism group of N and I(N) the isometry group of N, set $O(N) = \mathrm{Aut}(N) \cap I(N)$. In the Riemannian case, $I(N) = O(N) \ltimes N$, the semidirect product where N acts as left translations. We have chosen the notation O(N) to suggest an analogy with the pseudo-Euclidean case in which this subgroup is precisely the (general, including reflections) pseudo-orthogonal group. According to Wilson (1982), this analogy is good for any nilmanifold (not necessarily 2-step).

To see what is true about the isometry group in general, first consider the (left-invariant) splitting of the tangent bundle $TN = \mathfrak{z}N \oplus \mathfrak{v}N$.

Definition 4 Denote by $I^{\mathrm{spl}}(N)$ the subgroup of the isometry group I(N) which preserves the splitting $TN = \mathfrak{z}N \oplus \mathfrak{v}N$. Further, let $I^{\mathrm{aut}}(N) = O(N) \ltimes N$, where N acts by left translations.

Proposition 4 If N is a simply connected, 2-step nilpotent Lie group with left-invariant metric tensor, then $I^{\mathrm{spl}}(N) \leq I^{\mathrm{aut}}(N)$.

There are examples to show that $I^{\mathrm{spl}} < I^{\mathrm{aut}}$ is possible when $\mathfrak{U} \neq \{0\}$.

When the center is degenerate, the relevant group analogous to a pseudo-orthogonal group may be larger.

Proposition 5 Let $\tilde{O}(N)$ denote the subgroup of I(N) which fixes $1 \in N$. Then $I(N) \cong \tilde{O}(N) \ltimes N$, where N acts by left translations.

The proof is obvious from the definition of $\tilde{O}$. It is also obvious that $O \leq \tilde{O}$. Examples show that $O < \tilde{O}$, hence $I^{\mathrm{aut}} < I$, is possible when the center is degenerate.

Thus, we have three groups of isometries, not necessarily equal in general: $I^{\mathrm{spl}} \leq I^{\mathrm{aut}} \leq I$. When the center is nondegenerate ($\mathfrak{U} = \{0\}$), the Ricci transformation is block-diagonalizable and the rest of Kaplan’s proof using it now also works.

Corollary 3 If the center is nondegenerate, then $I(N) = I^{\mathrm{spl}}(N)$ whence $\tilde{O}(N) \cong O(N)$.

In the next few results, we use the phrase “a subgroup isometric to” a group to mean that the isometry is also an isomorphism of groups.

Proposition 6 For any N containing a subgroup isometric to the flat three-dimensional Heisenberg group,

$$I^{\mathrm{spl}}(N) < I^{\mathrm{aut}}(N) < I(N)$$

Unfortunately, this class does not include our flat groups in which $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and $\mathfrak{E} = \{0\}$. However, it does include many groups that do not satisfy $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$, such as the simplest quaternionic Heisenberg group.

Remark 2 A direct computation shows that on this flat $H_3$ with null center, the only Killing fields with geodesic integral curves are the nonzero scalar multiples of a vector field tangent to the center.

Proposition 7 For any N containing a subgroup isometric to the flat $H_3 \times \mathbb{R}$ with null center,

$$I^{\mathrm{spl}}(N) < I^{\mathrm{aut}}(N) < I(N)$$

Many of our flat groups in which $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and $\mathfrak{E} = \{0\}$ have such a subgroup isometrically embedded, as in fact do many others which are not flat.

Lattices and Periodic Geodesics

In this subsection, we assume that N is rational and let $\Gamma$ be a lattice in N.

Certain tori $T_F$ and $T_B$ provide the model fiber and the base for a submersion of the coset space $\Gamma\backslash N$. This submersion may not be pseudo-Riemannian in the usual sense, because the tori may be degenerate. We began the study of periodic geodesics in these compact nilmanifolds, and obtained a complete calculation of the period spectrum for certain flat spaces.
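To make the standing assumption of a rational N with a lattice concrete, the following is a minimal sketch for the three-dimensional Heisenberg group in exponential coordinates. It assumes a basis with the single nonzero bracket [e1, e2] = e3 (rational structure constants) and uses the two-step Baker–Campbell–Hausdorff group law exp(x)exp(y) = exp(x + y + ½[x, y]). The particular set Z × Z × ½Z chosen for log Γ and all function names are illustrative assumptions, not taken from the cited papers.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def bch(x, y):
    """Group law in exponential coordinates: log(exp(x) exp(y)).

    A triple (a, b, c) stands for a*e1 + b*e2 + c*e3 with [e1, e2] = e3,
    so [x, y] = (0, 0, a*q - p*b) and the BCH series stops after one bracket.
    """
    a, b, c = x
    p, q, r = y
    return (a + p, b + q, c + r + HALF * (a * q - p * b))

def inverse(x):
    # exp(x)^{-1} = exp(-x)
    return tuple(-t for t in x)

def in_log_gamma(x):
    """Membership in the assumed set Z x Z x (1/2)Z."""
    a, b, c = (Fraction(t) for t in x)
    return a.denominator == 1 and b.denominator == 1 and (2 * c).denominator == 1

# A finite sample of the candidate lattice, in exponential coordinates.
samples = [(Fraction(a), Fraction(b), Fraction(c, 2))
           for a, b, c in product(range(-2, 3), repeat=3)]

# Closure under the group law on the sample.
closed = all(in_log_gamma(bch(x, y)) for x in samples for y in samples)

# Associativity on a thinned-out sample (the law comes from a genuine group).
assoc = all(bch(bch(x, y), z) == bch(x, bch(y, z))
            for x in samples[::7] for y in samples[::11] for z in samples[::13])

print(closed, assoc, bch((1, 0, 0), (0, 1, 0)), bch((0, 1, 0), (1, 0, 0)))
```

In this sketch the central part of the chosen set is ½Z·e3 and its projection to the span of e1, e2 is Z × Z. The half-integers in the central coordinate are what make the set closed under the group law: with integer central coordinates only, the product of (1, 0, 0) and (0, 1, 0) would already leave the set.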

To the compact nilmanifold $\Gamma\backslash N$ we may associate two flat (possibly degenerate) tori.

Definition 5 Let N be a simply connected, two-step nilpotent Lie group with lattice $\Gamma$ and let $\pi : \mathfrak{n} \to \mathfrak{v}$ denote the projection. Define

$$T_{\mathfrak{z}} = \mathfrak{z}/(\log\Gamma \cap \mathfrak{z})$$
$$T_{\mathfrak{v}} = \mathfrak{v}/\pi(\log\Gamma)$$

Observe that $\dim T_{\mathfrak{z}} + \dim T_{\mathfrak{v}} = \dim \mathfrak{z} + \dim \mathfrak{v} = \dim \mathfrak{n}$.

Let $m = \dim \mathfrak{z}$ and $n = \dim \mathfrak{v}$. It is a consequence of a theorem of Palais and Stewart that $\Gamma\backslash N$ is a principal $T^m$-bundle over $T^n$. The model fiber $T^m$ can be given a geometric structure from its closed embedding in $\Gamma\backslash N$; we denote this geometric m-torus by $T_F$. Similarly, we wish to provide the base n-torus with a geometric structure so that the projection $p_B : \Gamma\backslash N \twoheadrightarrow T_B$ is the appropriate generalization of a pseudo-Riemannian submersion (O’Neill 1983) to (possibly) degenerate spaces. Observe that the splitting $\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v}$ induces splittings $TN = \mathfrak{z}N \oplus \mathfrak{v}N$ and $T(\Gamma\backslash N) = \mathfrak{z}(\Gamma\backslash N) \oplus \mathfrak{v}(\Gamma\backslash N)$, and that $p_{B*}$ just mods out $\mathfrak{z}(\Gamma\backslash N)$. Examining O’Neill’s definition, we see that the key is to construct the geometry of $T_B$ by defining

$$p_{B*} : \mathfrak{v}_\eta(\Gamma\backslash N) \to T_{p_B(\eta)}(T_B) \ \text{ for each } \eta \in \Gamma\backslash N \text{ is an isometry} \qquad [3]$$

and

$$\nabla^{T_B}_{p_{B*}x}\, p_{B*}y = p_{B*}(\pi \nabla_x y) \ \text{ for all } x, y \in \mathfrak{v} = \mathfrak{V} \oplus \mathfrak{E} \qquad [4]$$

where $\pi : \mathfrak{n} \to \mathfrak{v}$ is the projection. Then the rest of the usual results will continue to hold, provided that sectional curvature is replaced by the numerator of the sectional curvature formula at least when elements of $\mathfrak{V}$ are involved:

$$\langle R_{T_B}(p_{B*}x, p_{B*}y)\, p_{B*}y, p_{B*}x\rangle = \langle R_{\Gamma\backslash N}(x, y)y, x\rangle + \tfrac{3}{4}\langle [x, y], [x, y]\rangle \qquad [5]$$

Now $p_B$ will be a pseudo-Riemannian submersion in the usual sense if and only if $\mathfrak{U} = \mathfrak{V} = \{0\}$, as is always the case for Riemannian spaces.

In the Riemannian case, Eberlein showed that $T_F \cong T_{\mathfrak{z}}$ and $T_B \cong T_{\mathfrak{v}}$. In general, $T_B$ is flat only if N has a nondegenerate center or is flat.

Remark 3 Observe that the torus $T_B$ may be decomposed into a topological product $T_E \times T_V$ in the obvious way. It is easy to check that $T_E$ is flat and isometric to $(\log\Gamma \cap \mathfrak{E})\backslash\mathfrak{E}$, and that $T_V$ has a linear connection not coming from a metric and not flat in general. Moreover, the geometry of the product is “twisted” in a certain way. It would be interesting to determine which tori could appear as such a $T_V$ and how.

Theorem 7 Let N be a simply connected, 2-step nilpotent Lie group with lattice $\Gamma$, a left-invariant metric tensor, and tori as above. The fibers $T_F$ of the (generalized) pseudo-Riemannian submersion $\Gamma\backslash N \twoheadrightarrow T_B$ are isometric to $T_{\mathfrak{z}}$. If in addition the center Z of N is nondegenerate, then the base $T_B$ is isometric to $T_{\mathfrak{v}}$.

We recall that elements of N can be identified with elements of the isometry group I(N): namely, $n \in N$ is identified with the isometry $\phi = L_n$ of left translation by n. We shall abbreviate this by writing $\phi \in N$.

Definition 6 We say that $\phi \in N$ translates the geodesic $\gamma$ by $\omega$ if and only if $\phi\gamma(t) = \gamma(t + \omega)$ for all t. If $\gamma$ is a unit-speed geodesic, we say that $\omega$ is a period of $\phi$.

Recall that unit speed means that $|\dot\gamma| = |\langle\dot\gamma, \dot\gamma\rangle|^{1/2} = 1$. Since there is no natural normalization for null geodesics, we do not define periods for them. In the Riemannian case and in the timelike Lorentzian case in strongly causal spacetimes, unit-speed geodesics are parametrized by arclength and this period is a translation distance. If $\phi$ belongs to a lattice $\Gamma$, it is the length of a closed geodesic in $\Gamma\backslash N$.

In general, recall that if $\gamma$ is a geodesic in N and if $p_N : N \twoheadrightarrow \Gamma\backslash N$ denotes the natural projection, then $p_N\gamma$ is a periodic geodesic in $\Gamma\backslash N$ if and only if some $\phi \in \Gamma$ translates $\gamma$. We say periodic rather than closed here because in pseudo-Riemannian spaces it is possible for a null geodesic to be closed but not periodic. If the space is geodesically complete or Riemannian, however, then this does not occur; the former is in fact the case for our 2-step nilpotent Lie groups. Further, recall that free homotopy classes of closed curves in $\Gamma\backslash N$ correspond bijectively with conjugacy classes in $\Gamma$.

Definition 7 Let C denote either a nontrivial, free homotopy class of closed curves in $\Gamma\backslash N$ or the corresponding conjugacy class in $\Gamma$. We define $\wp(C)$ to be the set of all periods of periodic unit-speed geodesics that belong to C.

In the Riemannian case, this is the set of lengths of closed geodesics in C, frequently denoted by $\ell(C)$.

Definition 8 The period spectrum of $\Gamma\backslash N$ is the set

$$\mathrm{spec}_\wp(\Gamma\backslash N) = \bigcup_C \wp(C)$$

where the union is taken over all nontrivial, free homotopy classes of closed curves in $\Gamma\backslash N$.

In the Riemannian case, this is the length spectrum $\mathrm{spec}_\ell(\Gamma\backslash N)$.

Example 4 Similar to the Riemannian case, we can compute the period spectrum of a flat torus $\Gamma\backslash\mathbb{R}^m$, where $\Gamma$ is a lattice (of maximal rank, isomorphic to $\mathbb{Z}^m$). Using calculations in an analogous way as for finding the length spectrum of a Riemannian flat torus, we easily obtain

$$\mathrm{spec}_\wp(\Gamma\backslash\mathbb{R}^m) = \{\, |g| \neq 0 \mid g \in \Gamma \,\}$$

It is also easy to see that the nonzero d’Alembertian spectrum is related to the analogous set produced from the dual lattice $\Gamma^*$, multiplied by factors of $\pm 4\pi^2$, almost as in the Riemannian case.

As in this example, simple determinacy of periods of unit-speed geodesics helps make calculation of the period spectrum possible purely in terms of $\log\Gamma \subseteq \mathfrak{n}$.

For the rest of this subsection, we assume that N is a simply connected, two-step nilpotent Lie group with left-invariant pseudo-Riemannian metric tensor $\langle\,,\rangle$. Note that non-null geodesics may be taken to be of unit speed. Most non-identity elements of N translate some geodesic, but not necessarily one of unit speed.

For our special class of flat 2-step nilmanifolds, we can calculate the period spectrum completely.

Theorem 8 If $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{U}$ and $\mathfrak{E} = \{0\}$, then $\mathrm{spec}_\wp(M)$ can be completely calculated from $\log\Gamma$ for any $M = \Gamma\backslash N$.

Thus, we see again just how much these flat, two-step nilmanifolds are like tori. All periods can be calculated purely from $\log\Gamma \subseteq \mathfrak{n}$, although some will not show up from the tori in the fibration.

Corollary 4 $\mathrm{spec}_\wp(T_B)$ (respectively, $T_F$) is $\bigcup_C \wp(C)$ where the union is taken over all those free homotopy classes C of closed curves in $M = \Gamma\backslash N$ that do not (respectively, do) contain an element in the center of $\Gamma \cong \pi_1(M)$, except for those periods arising only from unit-speed geodesics in M that project to null geodesics in both $T_B$ and $T_F$.

We note that one might consider using this to assign periods to some null geodesics in the tori $T_B$ and $T_F$.

When the center is nondegenerate, we obtain results similar to Eberlein’s. Here is part of them.

Theorem 9 Assume $\mathfrak{U} = \{0\}$. Let $\phi \in N$ and write $\log\phi = z^* + e^*$. Assume $\phi$ translates the unit-speed geodesic $\gamma$ by $\omega > 0$. Let $z'$ denote the component of $z^*$ orthogonal to $[e^*, \mathfrak{n}]$ and set $\omega^* = |z' + e^*|$. Let $\dot\gamma(0) = z_0 + e_0$. Then

(i) $|e^*| \leq \omega$. In addition, $\omega < \omega^*$ for timelike (spacelike) geodesics with $\omega z_0 - z'$ timelike (spacelike), and $\omega > \omega^*$ for timelike (spacelike) geodesics with $\omega z_0 - z'$ spacelike (timelike);
(ii) $\omega = |e^*|$ if and only if $\gamma(t) = \exp(t e^*/|e^*|)$ for all $t \in \mathbb{R}$; and
(iii) $\omega = \omega^*$ if and only if $\omega z_0 - z'$ is null.

Although $\omega^*$ need not be an upper bound for periods as in the Riemannian case, it nonetheless plays a special role among all periods, as seen in (iii) above, and we shall refer to it as the distinguished period associated with $\phi \in N$. When the center is definite, for example, we do have $\omega \leq \omega^*$.

Now the following definitions make sense at least for N with a nondegenerate center.

Definition 9 Let C denote either a nontrivial, free homotopy class of closed curves in $\Gamma\backslash N$ or the corresponding conjugacy class in $\Gamma$. We define $\wp^*(C)$ to be the distinguished periods of periodic unit-speed geodesics that belong to C.

Definition 10 The distinguished period spectrum of $\Gamma\backslash N$ is the set

$$\mathcal{D}\mathrm{spec}_\wp(\Gamma\backslash N) = \bigcup_C \wp^*(C)$$

where the union is taken over all nontrivial, free homotopy classes of closed curves in $\Gamma\backslash N$.

Then we get this result:

Corollary 5 Assume the center is nondegenerate. If $\mathfrak{n}$ is nonsingular, then $\mathrm{spec}_\wp(T_B)$ (respectively, $T_F$) is precisely the period spectrum (respectively, the distinguished period spectrum) of those free homotopy classes C of closed curves in $M = \Gamma\backslash N$ that do not (respectively, do) contain an element in the center of $\Gamma \cong \pi_1(M)$, except for those periods arising only from unit-speed geodesics in M that project to null geodesics in both $T_B$ and $T_F$.

Conjugate Loci

This is the only general result on conjugate points.

Proposition 8 Let N be a simply connected, 2-step nilpotent Lie group with left-invariant metric tensor $\langle\,,\rangle$, and let $\gamma$ be a geodesic with $\dot\gamma(0) = a \in \mathfrak{z}$. If $\mathrm{ad}^\dagger_{\bullet}\, a = 0$, then there are no conjugate points along $\gamma$.

In the rest of this subsection, we assume that the center of N is nondegenerate.

For convenience, we shall use the notation $J_z = \mathrm{ad}^\dagger_{\bullet}\, z$ for any $z \in \mathfrak{z}$. (Since the center is
Pseudo-Riemannian Nilpotent Lie Groups 101

nondegenerate, the involution  may be omitted.) where n  o


We follow Ciatti (2000) for this next definition. As  t t
A1 ¼ t 2 Rhx0 ; x0 i cot ¼ h; _ i
_
in the Riemannian case, one might as well make 2 2
2-step nilpotency part of the definition since it and
  
effectively is so anyway.  hx0 ; x0 i

A2 ¼ t 2 Rt ¼ sin t
Definition 11 N is said to be of pseudoH-type if h; _ þ hz0 ; z0 i
_ i
and only if when dim z  2
Jz2 ¼ hz; ziI
for any z 2 z. If t0 2 (2=)Z , then
Complete results on conjugate loci have been 
obtained only for these groups (Jang et al. 2005). dim v 1 if h; _ þ hz0 ; z0 i 6¼ 0
_ i
multcp ðt0 Þ ¼
For example, using standard results from analytic dim n 2 if h; _ þ hz0 ; z0 i ¼ 0
_ i
function theory, one can show that the conjugate 6 (2=)Z , then
If t0 2
locus is an analytic variety in N. This is probably 8
true for general two-step groups, but the proof we <1 if t0 2 A1 A2
know works only for pseudoH-type. multcp ðt0 Þ ¼ dim z 1 if t0 2 A2 A1
:
Definition 12 Let  denote a geodesic and assume dim z if t0 2 A1 \ A2
that (t0 ) is conjugate to (0) along . To indicate
(ii) If hz0 , z0 i = 2 with > 0, then (t0 ) is a
that the multiplicity of (t0 ) is m, we shall write
conjugate point along  if and only if t0 2
multcp (t0 ) = m. To distinguish the notions clearly,
B1 [ B2 where
we shall denote the multiplicity of  as an eigenvalue
of a specified linear transformation by multev .   
 t t
Let  be a geodesic with (0) = 1 and (0)˙ = z0 þ B1 ¼ 
t 2 Rhx0 ; x0 i coth ¼ h;
_ i
_
2 2
x0 2 z  v, respectively, and let J = Jz0 . If  is not
null, we may assume that  is normalized so that and
h, ˙ = 1. As usual, Z denotes the set of all
˙ i
integers with 0 removed.   
 hx0 ; x0 i
Theorem 10 Under these assumptions, if N is of B2 ¼ t 2 R t ¼ sinh t
h; _ þ hz0 ; z0 i
_ i
pseudoH-type, then: when dim z  2
(i) if z0 = 0 and x0 6¼ 0, then (t) is conjugate to
(0) along  if and only if hx0 , x0 i < 0 and The multiplicity is
8
12 <1 if t0 2 B1 B2
2 ¼ hx0 ; x0 i multcp ðt0 Þ ¼ dim z 1 if t0 2 B2 B1
t :
dim z if t0 2 B1 \ B2
in which case multcp (t) = dim z;
(ii) if z0 6¼ 0 and x0 = 0, then (t) is conjugate to (iii) If hz0 , z0 i = 0, then (t0 ) is a conjugate point
(0) along  if and only if hz0 , z0 i > 0 and along  if and only if

2  12
t2 Z t02 ¼
jz0 j hx0 ; x0 i
and multcp (t0 ) = dim z 1.
in which case multcp (t) = dim v.
This covers all cases for a pseudoH-type group with
Theorem 11 Let  be such a geodesic in a a center of any dimension.
pseudoH-type group N with z0 6¼ 0 6¼ x0 . Some results on other two-step groups and
(i) If hz0 , z0 i = 2 with  > 0, then (t0 ) is con- examples (including pictures in dimension 3) may
jugate to (0) along  if and only if be found in the references cited in Jang et al. (2005).
When the groups are not pseudoH-type, however,
complete results are available only when the center
2 
t0 2 Z [ A1 [ A2 is one dimensional. Guediri (2004) has results in the
 timelike Lorentzian case.

Lorentzian Groups

Not too long ago, only a few partial results in the line of Milnor's study of definite metrics were known for indefinite metrics (Barnet 1989, Nomizu 1979), and they were Lorentzian.
Guediri (2003) and others have made special study of Lorentzian two-step groups, partly because of their relevance to general relativity, where they can be used to provide interesting and important (counter)examples. Special features of Lorentzian geometry frequently enable them to obtain much more complete and explicit results than are possible in general.
For example, Guediri (2003) was able to provide a complete and explicit integration of the geodesic equations for Lorentzian 2-step groups. This includes the case of a degenerate center, which only required extremely careful handling through a number of cases. He also paid special attention to the existence of closed timelike geodesics, reflecting the relativistic concerns.
As usual, N denotes a connected and simply connected 2-step nilpotent Lie group. For the rest of this section, we assume that the left-invariant metric tensor is Lorentzian. Whenever a lattice is mentioned, we also assume that the group is rational.

Proposition 9 If the center is degenerate, then no timelike geodesic can be translated by a central element.

Thus, there can be no closed timelike geodesics parallel to the center in any nilmanifold obtained from such an N.

Theorem 12 If the center is Lorentzian, then $\Gamma\backslash N$ contains no timelike or null closed geodesics for any lattice $\Gamma$.

To handle degenerate centers, three refined notions for nonsingular are used: almost, weakly, and strongly nonsingular. The precise definitions involve an adapted Witt decomposition (as in the general pseudo-Riemannian case, but a rather different one here) and are quite technical, as is typical. We refer to Guediri (2003) for details.

Theorem 13 If N is weakly nonsingular, then no timelike geodesic can be translated by an element of N.

Corollary 6 If N is flat, then no timelike geodesic can be translated by a non-identity element.

Corollary 7 If N is flat, then $\Gamma\backslash N$ contains no closed timelike geodesics for any lattice $\Gamma$.

Corollary 8 If N is weakly nonsingular, then $\Gamma\backslash N$ contains no closed timelike geodesic.

Corollary 9 If $N = H_{2k+1}$ is a Lorentzian Heisenberg group with degenerate center, then $\Gamma\backslash N$ contains no closed timelike geodesic.

Guediri also has the only non-Riemannian results so far about the phenomenon Eberlein called "in resonance." Roughly speaking, this occurs when the eigenvalues of the map j have rational ratios. (The Lorentzian case actually requires a slightly more complicated condition when the center is degenerate.)

Theorem 14 If N is almost nonsingular, then N is in resonance if and only if every geodesic of N is translated by some element of N.

See also: Classical Groups and Homogeneous Spaces; Einstein Equations: Exact Solutions; Lorentzian Geometry.

Further Reading

Barnet F (1989) On Lie groups that admit left-invariant Lorentz metrics of constant sectional curvature. Illinois Journal of Mathematics 33: 631–642.
Berndt J, Tricerri F, and Vanhecke L (1995) Generalized Heisenberg Groups and Damek-Ricci Harmonic Spaces, LNM 1598. Berlin: Springer.
Ciatti P (2000) Scalar products on Clifford modules and pseudo-H-type Lie algebras. Annali di Matematica Pura ed Applicata 178: 1–31.
Cordero LA and Parker PE (1999) Pseudoriemannian 2-step nilpotent Lie groups, Santiago-Wichita. Preprint DGS/CP4, (arXiv:math.DG/9905188).
Eberlein P (1994) Geometry of 2-step nilpotent groups with a left-invariant metric. Annales Scientifique de l'École Normale Supérieure 27: 611–660.
Eberlein P (2004) Left-invariant geometry of Lie groups. Cubo 6: 427–510. (See also http://www.math.unc.edu/faculty/pbe.)
Guediri M (2003) Lorentz geometry of 2-step nilpotent Lie groups. Geometriae Dedicata 100: 11–51.
Guediri M (2004) The timelike cut locus and conjugate points in Lorentz 2-step nilpotent Lie groups. Manuscripta Mathematica 114: 9–35.
Hervig S (2004) Einstein metrics: homogeneous solvmanifolds, generalised Heisenberg groups and black holes. Journal of Geometry and Physics 52: 298–312.
Jang C, Parker PE, and Park K (2005) PseudoH-type 2-step nilpotent Lie groups. Houston Journal of Mathematics 31: 765–786 (arXiv:math.DG/0307368).
Kaplan A (1981) Riemannian nilmanifolds attached to Clifford modules. Geometriae Dedicata 11: 127–136.
Kaplan A (1983) On the geometry of Lie groups of Heisenberg type. Bulletin of the London Mathematical Society 15: 35–42.
Karidi R (1994) Geometry of balls in nilpotent Lie groups. Duke Mathematical Journal 74: 301–317.
Milnor J (1976) Curvatures of left-invariant metrics on Lie groups. Advances in Mathematics 21: 293–329.
Nomizu K (1979) Left-invariant Lorentz metrics on Lie groups. Osaka Mathematical Journal 16: 143–150.
O'Neill B (1983) Semi-Riemannian Geometry. New York: Academic Press.
Pauls SD (2001) The large scale geometry of nilpotent Lie groups. Communications in Analysis and Geometry 9: 951–982.
Wilson EN (1982) Isometry groups on homogeneous manifolds. Geometriae Dedicata 12: 337–346.
Wolf JA (1964) Curvature in nilpotent Lie groups. Proceedings of the American Mathematical Society 15: 271–274.
Q
q-Special Functions
T H Koornwinder, University of Amsterdam, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In this article we give a brief introduction to q-special functions, that is, q-analogs of the classical special functions. Here q is a deformation parameter, usually $0 < q < 1$, where $q = 1$ is the classical case. The deformation is such that the calculus simultaneously deforms to a q-calculus involving q-derivatives and q-integrals. The main topics to be treated are q-hypergeometric series, with some selected evaluation and transformation formulas, and some q-hypergeometric orthogonal polynomials, most notably the Askey–Wilson polynomials. In several variables, we discuss Macdonald polynomials associated with root systems, with most emphasis on the $A_n$ case. The rather new theory of elliptic hypergeometric series gets some attention. While much of the theory of q-special functions keeps q fixed, some of the deeper aspects with number-theoretic and combinatorial flavor emphasize expansion in q. Finally, we indicate applications and interpretations in quantum groups, Chevalley groups, affine Lie algebras, combinatorics, and statistical mechanics.

Conventions

$q \in \mathbb{C}\setminus\{1\}$ in general, but $0 < q < 1$ in all infinite sums and products.
$n$, $m$, $N$ will be non-negative integers unless mentioned otherwise.

q-Hypergeometric Series

Definitions

For $a, q \in \mathbb{C}$ the q-shifted factorial $(a;q)_k$ is defined as a product of k factors:
$$(a;q)_k := (1-a)(1-aq)\cdots(1-aq^{k-1}) \quad (k \in \mathbb{Z}_{>0}); \qquad (a;q)_0 := 1$$ [1]
If $|q| < 1$ this definition remains meaningful for $k = \infty$ as a convergent infinite product:
$$(a;q)_\infty := \prod_{j=0}^{\infty} (1 - aq^j)$$ [2]
We also write $(a_1,\ldots,a_r;q)_k$ for the product of r q-shifted factorials:
$$(a_1,\ldots,a_r;q)_k := (a_1;q)_k \cdots (a_r;q)_k \quad (k \in \mathbb{Z}_{\ge 0} \text{ or } k = \infty)$$ [3]
A q-hypergeometric series is a power series (for the moment still formal) in one complex variable z with power series coefficients which depend, apart from q, on r complex upper parameters $a_1,\ldots,a_r$ and s complex lower parameters $b_1,\ldots,b_s$ as follows:
$${}_r\phi_s\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_s \end{matrix}; q, z\right] = {}_r\phi_s(a_1,\ldots,a_r; b_1,\ldots,b_s; q, z) := \sum_{k=0}^{\infty} \frac{(a_1,\ldots,a_r;q)_k}{(b_1,\ldots,b_s;q)_k\,(q;q)_k} \left((-1)^k q^{\frac12 k(k-1)}\right)^{s-r+1} z^k \quad (r, s \in \mathbb{Z}_{\ge 0})$$ [4]
Clearly the above expression is symmetric in $a_1,\ldots,a_r$ and symmetric in $b_1,\ldots,b_s$. On the right-hand side of [4], we have that
$$\frac{(k+1)\text{th term}}{k\text{th term}} = \frac{(1-a_1q^k)\cdots(1-a_rq^k)\,(-q^k)^{s-r+1}\,z}{(1-b_1q^k)\cdots(1-b_sq^k)\,(1-q^{k+1})}$$ [5]
is rational in $q^k$. Conversely, any rational function in $q^k$ can be written in the form of the right-hand side of [5]. Hence, any series $\sum_{k=0}^{\infty} c_k$ with $c_0 = 1$ and $c_{k+1}/c_k$ rational in $q^k$ is of the form of a q-hypergeometric series [4].
In order to avoid singularities in the terms of [4], we assume that $b_1,\ldots,b_s \neq 1, q^{-1}, q^{-2}, \ldots$. If, for some i, $a_i = q^{-n}$, then all terms in the series [4] with $k > n$ will vanish. If none of the $a_i$ is equal to $q^{-n}$ and if $|q| < 1$, then the radius of convergence of the power series [4] equals $\infty$ if $r < s+1$, 1 if $r = s+1$, and 0 if $r > s+1$.
We can view the q-shifted factorial as a q-analog of the shifted factorial (or Pochhammer symbol) by the limit formula
$$\lim_{q\to 1} \frac{(q^a;q)_k}{(1-q)^k} = (a)_k := a(a+1)\cdots(a+k-1)$$ [6]
Hence the q-binomial coefficient
$$\begin{bmatrix} n \\ k \end{bmatrix}_q := \frac{(q;q)_n}{(q;q)_k\,(q;q)_{n-k}} \quad (n, k \in \mathbb{Z},\ n \ge k \ge 0)$$ [7]
tends to the binomial coefficient for $q \to 1$:
$$\lim_{q\to 1} \begin{bmatrix} n \\ k \end{bmatrix}_q = \binom{n}{k}$$ [8]
and a suitably renormalized q-hypergeometric series tends (at least formally) to a hypergeometric series as $q \uparrow 1$:
$$\lim_{q\uparrow 1} {}_{r+r'}\phi_{s+s'}\!\left[\begin{matrix} q^{a_1},\ldots,q^{a_r}, c_1,\ldots,c_{r'} \\ q^{b_1},\ldots,q^{b_s}, d_1,\ldots,d_{s'} \end{matrix}; q, (q-1)^{1+s-r} z\right] = {}_rF_s\!\left(\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_s \end{matrix}; \frac{(c_1-1)\cdots(c_{r'}-1)\,z}{(d_1-1)\cdots(d_{s'}-1)}\right)$$ [9]
At least formally, there are limit relations between q-hypergeometric series with neighboring r, s:
$$\lim_{a_r\to\infty} {}_r\phi_s\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_s \end{matrix}; q, \frac{z}{a_r}\right] = {}_{r-1}\phi_s\!\left[\begin{matrix} a_1,\ldots,a_{r-1} \\ b_1,\ldots,b_s \end{matrix}; q, z\right]$$ [10]
$$\lim_{b_s\to\infty} {}_r\phi_s\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_s \end{matrix}; q, b_s z\right] = {}_r\phi_{s-1}\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_{s-1} \end{matrix}; q, z\right]$$ [11]
A terminating q-hypergeometric series $\sum_{k=0}^{n} c_k z^k$ rewritten as $z^n \sum_{k=0}^{n} c_{n-k} z^{-k}$ yields another terminating q-hypergeometric series, for instance:
$${}_{s+1}\phi_s\!\left[\begin{matrix} q^{-n}, a_1,\ldots,a_s \\ b_1,\ldots,b_s \end{matrix}; q, z\right] = (-1)^n q^{-\frac12 n(n+1)}\, \frac{(a_1,\ldots,a_s;q)_n}{(b_1,\ldots,b_s;q)_n}\, z^n\; {}_{s+1}\phi_s\!\left[\begin{matrix} q^{-n}, q^{-n+1}b_1^{-1},\ldots,q^{-n+1}b_s^{-1} \\ q^{-n+1}a_1^{-1},\ldots,q^{-n+1}a_s^{-1} \end{matrix}; q, \frac{q^{n+1} b_1\cdots b_s}{a_1\cdots a_s\, z}\right]$$ [12]
Often, in physics and quantum groups related literature, the following notation is used for q-number, q-factorial, and q-Pochhammer symbol:
$$[a]_q := \frac{q^{\frac12 a} - q^{-\frac12 a}}{q^{1/2} - q^{-1/2}}, \qquad [k]_q! := \prod_{j=1}^{k} [j]_q, \qquad ([a]_q)_k := \prod_{j=0}^{k-1} [a+j]_q \quad (k \in \mathbb{Z}_{\ge 0})$$ [13]
For $q \to 1$, these symbols tend to their classical counterparts without the need for renormalization. They are expressed in terms of the standard notation [1] as follows:
$$[k]_q! = q^{-\frac14 k(k-1)}\, \frac{(q;q)_k}{(1-q)^k}, \qquad ([a]_q)_k = q^{-\frac12 k(a-1)}\, q^{-\frac14 k(k-1)}\, \frac{(q^a;q)_k}{(1-q)^k}$$ [14]

Special Cases

For $s = r-1$, formula [4] simplifies to
$${}_r\phi_{r-1}\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_{r-1} \end{matrix}; q, z\right] = \sum_{k=0}^{\infty} \frac{(a_1,\ldots,a_r;q)_k}{(b_1,\ldots,b_{r-1};q)_k\,(q;q)_k}\, z^k$$ [15]
which has radius of convergence 1 in the nonterminating case. The case $r = 2$ of [15] is the q-analog of the Gauss hypergeometric series.

q-Binomial series
$${}_1\phi_0(a; -; q, z) = \sum_{k=0}^{\infty} \frac{(a;q)_k\, z^k}{(q;q)_k} = \frac{(az;q)_\infty}{(z;q)_\infty} \quad (\text{if series is not terminating, then } |z| < 1)$$ [16]

q-Exponential series
$$e_q(z) := {}_1\phi_0(0; -; q, z) = \sum_{k=0}^{\infty} \frac{z^k}{(q;q)_k} = \frac{1}{(z;q)_\infty} \quad (|z| < 1)$$ [17]
$$E_q(z) := {}_0\phi_0(-; -; q, -z) = \sum_{k=0}^{\infty} \frac{q^{\frac12 k(k-1)}\, z^k}{(q;q)_k} = (-z;q)_\infty = \left(e_q(-z)\right)^{-1} \quad (z \in \mathbb{C})$$ [18]
$$\varepsilon_q(z) := {}_1\phi_1(0; -q^{1/2}; q^{1/2}, -z) = \sum_{k=0}^{\infty} \frac{q^{\frac14 k(k-1)}}{(q;q)_k}\, z^k \quad (z \in \mathbb{C})$$ [19]
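The product formulas [16]–[18] are easy to probe numerically. The following is a minimal sketch, not part of the original article: the helper names `qpoch` and `phi_1_0` and the truncation at 200 terms are illustrative choices. It compares a partial sum of the q-binomial series [16] with the quotient of infinite products, and checks that $e_q(z)\,E_q(-z) = 1$ as implied by [17] and [18].

```python
def qpoch(a, q, k=None, terms=200):
    """q-shifted factorial (a; q)_k from [1]; k=None truncates the infinite product [2]."""
    n = terms if k is None else k
    prod = 1.0
    for j in range(n):
        prod *= 1.0 - a * q**j
    return prod


def phi_1_0(a, q, z, terms=200):
    """Partial sum of the q-binomial series [16] (hypothetical helper, |z| < 1 assumed)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # ratio of consecutive terms, i.e. [5] with r = 1 and s = 0
        term *= (1.0 - a * q**k) * z / (1.0 - q**(k + 1))
    return total


q, a, z = 0.5, 0.3, 0.4

binomial_series = phi_1_0(a, q, z)                # left-hand side of [16]
binomial_product = qpoch(a * z, q) / qpoch(z, q)  # right-hand side of [16]

small_e = phi_1_0(0.0, q, z)      # e_q(z), series form of [17]
big_e_at_minus_z = qpoch(z, q)    # E_q(-z) = (z; q)_infinity, product form of [18]

print(binomial_series, binomial_product)
print(small_e * big_e_at_minus_z)
```

Because the term ratio approaches z, truncating at 200 terms is far more than needed for the sample values; for |z| close to 1 the truncation would have to grow.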

Jackson's q-Bessel functions
$$J_\nu^{(1)}(x;q) := \frac{(q^{\nu+1};q)_\infty}{(q;q)_\infty} \left(\tfrac12 x\right)^{\nu} {}_2\phi_1\!\left[\begin{matrix} 0, 0 \\ q^{\nu+1} \end{matrix}; q, -\tfrac14 x^2\right] \quad (0 < x < 2)$$ [20]
$$J_\nu^{(2)}(x;q) := \frac{(q^{\nu+1};q)_\infty}{(q;q)_\infty} \left(\tfrac12 x\right)^{\nu} {}_0\phi_1\!\left[\begin{matrix} - \\ q^{\nu+1} \end{matrix}; q, -\tfrac14 q^{\nu+1} x^2\right] = \left(-\tfrac14 x^2; q\right)_\infty J_\nu^{(1)}(x;q) \quad (x > 0)$$ [21]
$$J_\nu^{(3)}(x;q) := \frac{(q^{\nu+1};q)_\infty}{(q;q)_\infty} \left(\tfrac12 x\right)^{\nu} {}_1\phi_1\!\left[\begin{matrix} 0 \\ q^{\nu+1} \end{matrix}; q, \tfrac14 q x^2\right] \quad (x > 0)$$ [22]
See [90] for the orthogonality relation for $J_\nu^{(3)}(x;q)$.
If $\exp_q(z)$ denotes one of the three q-exponentials [17]–[19], then $\frac12(\exp_q(ix) + \exp_q(-ix))$ is a q-analog of the cosine and $-\frac12 i(\exp_q(ix) - \exp_q(-ix))$ is a q-analog of the sine. The three q-cosines are essentially the case $\nu = -1/2$ of the corresponding q-Bessel functions [20]–[22], and the three q-sines are essentially the case $\nu = 1/2$ of x times the corresponding q-Bessel functions.

q-Derivative and q-Integral

The q-derivative of a function f given on a subset of $\mathbb{R}$ or $\mathbb{C}$ is defined by
$$(D_q f)(x) := \frac{f(x) - f(qx)}{(1-q)x} \quad (x \neq 0,\ q \neq 1)$$ [23]
where x and qx should be in the domain of f. By continuity, we set $(D_q f)(0) := f'(0)$, provided $f'(0)$ exists. If f is differentiable on an open interval I, then
$$\lim_{q\uparrow 1} (D_q f)(x) = f'(x) \quad (x \in I)$$ [24]
For $a \in \mathbb{R}\setminus\{0\}$ and a function f given on $(0, a]$ or $[a, 0)$, we define the q-integral by
$$\int_0^a f(x)\, d_q x := a(1-q) \sum_{k=0}^{\infty} f(aq^k)\, q^k = \sum_{k=0}^{\infty} f(aq^k)\,(aq^k - aq^{k+1})$$ [25]
provided the infinite sum converges absolutely (e.g., if f is bounded). If $F(a)$ is given by the left-hand side of [25], then $D_q F = f$. The right-hand side of [25] is an infinite Riemann sum. For $q \uparrow 1$ it converges, at least formally, to $\int_0^a f(x)\, dx$.
For nonzero $a, b \in \mathbb{R}$ we define
$$\int_a^b f(x)\, d_q x := \int_0^b f(x)\, d_q x - \int_0^a f(x)\, d_q x$$ [26]
For a q-integral over $(0, \infty)$, we have to specify a q-lattice $\{aq^k\}_{k\in\mathbb{Z}}$ for some $a > 0$ (up to multiplication by an integer power of q):
$$\int_0^{a\cdot\infty} f(x)\, d_q x := a(1-q) \sum_{k=-\infty}^{\infty} f(aq^k)\, q^k = \lim_{n\to\infty} \int_0^{q^{-n}a} f(x)\, d_q x$$ [27]

The q-Gamma and q-Beta Functions

The q-gamma function is defined by
$$\Gamma_q(z) := \frac{(q;q)_\infty\,(1-q)^{1-z}}{(q^z;q)_\infty} \quad (z \neq 0, -1, -2, \ldots)$$ [28]
$$= \int_0^{(1-q)^{-1}} t^{z-1}\, E_q(-(1-q)qt)\, d_q t \quad (\Re z > 0)$$ [29]
Then
$$\Gamma_q(z+1) = \frac{1-q^z}{1-q}\, \Gamma_q(z)$$ [30]
$$\Gamma_q(n+1) = \frac{(q;q)_n}{(1-q)^n}$$ [31]
$$\lim_{q\uparrow 1} \Gamma_q(z) = \Gamma(z)$$ [32]
The q-beta function is defined by
$$B_q(a,b) := \frac{\Gamma_q(a)\,\Gamma_q(b)}{\Gamma_q(a+b)} = \frac{(1-q)\,(q, q^{a+b};q)_\infty}{(q^a, q^b;q)_\infty} \quad (a, b \neq 0, -1, -2, \ldots)$$ [33]
$$= \int_0^1 t^{b-1}\, \frac{(qt;q)_\infty}{(q^a t;q)_\infty}\, d_q t \quad (\Re b > 0,\ a \neq 0, -1, -2, \ldots)$$ [34]

The q-Gauss Hypergeometric Series

q-Analog of Euler's integral representation
$${}_2\phi_1(q^a, q^b; q^c; q, z) = \frac{\Gamma_q(c)}{\Gamma_q(b)\,\Gamma_q(c-b)} \int_0^1 t^{b-1}\, \frac{(tq;q)_\infty}{(tq^{c-b};q)_\infty}\, \frac{(tzq^a;q)_\infty}{(tz;q)_\infty}\, d_q t \quad (\Re b > 0,\ |z| < 1)$$ [35]
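Definitions [23] and [25] translate directly into code. The sketch below is an illustration under stated assumptions: the function names and the truncation of the infinite sum at 500 terms are choices made here, not taken from the article. It checks that $D_q x^3 = \frac{1-q^3}{1-q}x^2$, that $\int_0^a x^2\, d_q x = a^3\frac{1-q}{1-q^3}$, and that applying $D_q$ to the q-integral recovers the integrand, as stated after [25].

```python
def q_derivative(f, x, q):
    """(D_q f)(x) as in [23]; requires x != 0 and q != 1."""
    return (f(x) - f(q * x)) / ((1.0 - q) * x)


def q_integral(f, a, q, terms=500):
    """Truncated q-integral of f over (0, a], the first sum in [25]."""
    return a * (1.0 - q) * sum(f(a * q**k) * q**k for k in range(terms))


def cube(x):
    return x**3


def square(x):
    return x**2


q, a = 0.8, 1.5

dq_cube = q_derivative(cube, a, q)      # expected (1 - q^3)/(1 - q) * a^2
int_square = q_integral(square, a, q)   # expected a^3 (1 - q)/(1 - q^3)


def antiderivative(t):
    """F(t) = q-integral of x^2 over (0, t]."""
    return q_integral(square, t, q)


recovered = q_derivative(antiderivative, a, q)  # D_q F = f, so this should equal a^2

print(dq_cube, int_square, recovered)
```

The factor $(1-q^n)/(1-q)$ that appears in both closed forms plays the role of n in ordinary calculus and tends to n as $q \uparrow 1$.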

By substitution of [25], formula [35] becomes a transformation formula:
$${}_2\phi_1(a, b; c; q, z) = \frac{(az;q)_\infty}{(z;q)_\infty}\, \frac{(b;q)_\infty}{(c;q)_\infty}\; {}_2\phi_1(c/b, z; az; q, b)$$ [36]
Note the mixing of argument z and parameters a, b, c on the right-hand side.

Evaluation formulas in special points
$${}_2\phi_1(a, b; c; q, c/(ab)) = \frac{(c/a, c/b;q)_\infty}{(c, c/(ab);q)_\infty} \quad (|c/(ab)| < 1)$$ [37]
$${}_2\phi_1(q^{-n}, b; c; q, cq^n/b) = \frac{(c/b;q)_n}{(c;q)_n}$$ [38]
$${}_2\phi_1(q^{-n}, b; c; q, q) = \frac{(c/b;q)_n\, b^n}{(c;q)_n}$$ [39]

Two general transformation formulas
$${}_2\phi_1\!\left[\begin{matrix} a, b \\ c \end{matrix}; q, z\right] = \frac{(az;q)_\infty}{(z;q)_\infty}\; {}_2\phi_2\!\left[\begin{matrix} a, c/b \\ c, az \end{matrix}; q, bz\right]$$ [40]
$$= \frac{(abz/c;q)_\infty}{(z;q)_\infty}\; {}_2\phi_1\!\left[\begin{matrix} c/a, c/b \\ c \end{matrix}; q, \frac{abz}{c}\right]$$ [41]

Transformation formulas in the terminating case
$${}_2\phi_1\!\left[\begin{matrix} q^{-n}, b \\ c \end{matrix}; q, z\right] = \frac{(c/b;q)_n}{(c;q)_n}\; {}_3\phi_2\!\left[\begin{matrix} q^{-n}, b, q^{-n}bc^{-1}z \\ q^{1-n}bc^{-1}, 0 \end{matrix}; q, q\right]$$ [42]
$$= (q^{-n}bc^{-1}z;q)_n\; {}_3\phi_2\!\left[\begin{matrix} q^{-n}, cb^{-1}, 0 \\ c, qcb^{-1}z^{-1} \end{matrix}; q, q\right]$$ [43]
$$= \frac{(c/b;q)_n}{(c;q)_n}\, b^n\; {}_3\phi_1\!\left[\begin{matrix} q^{-n}, b, qz^{-1} \\ q^{1-n}bc^{-1} \end{matrix}; q, \frac{z}{c}\right]$$ [44]

Second order q-difference equation
$$z(q^c - q^{a+b+1}z)\,(D_q^2 u)(z) + \left(\frac{1-q^c}{1-q} - \left(q^b\,\frac{1-q^a}{1-q} + q^a\,\frac{1-q^{b+1}}{1-q}\right) z\right)(D_q u)(z) - \frac{1-q^a}{1-q}\,\frac{1-q^b}{1-q}\, u(z) = 0$$ [45]
Some special solutions of [45] are:
$$u_1(z) := {}_2\phi_1(q^a, q^b; q^c; q, z)$$ [46]
$$u_2(z) := z^{1-c}\; {}_2\phi_1(q^{1+a-c}, q^{1+b-c}; q^{2-c}; q, z)$$ [47]
$$u_3(z) := z^{-a}\; {}_2\phi_1(q^a, q^{a-c+1}; q^{a-b+1}; q, q^{-a-b+c+1}z^{-1})$$ [48]
They are related by:
$$u_1(z) + \frac{(q^a, q^{1-c}, q^{c-b};q)_\infty}{(q^{c-1}, q^{a-c+1}, q^{1-b};q)_\infty}\, \frac{(q^{b-1}z, q^{2-b}z^{-1};q)_\infty}{(q^{b-c}z, q^{c-b+1}z^{-1};q)_\infty}\, z^{c-1}\, u_2(z) = \frac{(q^{1-c}, q^{a-b+1};q)_\infty}{(q^{1-b}, q^{a-c+1};q)_\infty}\, \frac{(q^{a+b-c}z, q^{c-a-b+1}z^{-1};q)_\infty}{(q^{b-c}z, q^{c-b+1}z^{-1};q)_\infty}\, z^{a}\, u_3(z)$$ [49]

Summation and Transformation Formulas for ${}_r\phi_{r-1}$ Series

An ${}_r\phi_{r-1}$ series [15] is called "balanced" if $b_1\cdots b_{r-1} = q a_1\cdots a_r$ and $z = q$, and the series is called "very well-poised" if $qa_1 = a_2b_1 = a_3b_2 = \cdots = a_rb_{r-1}$ and $qa_1^{1/2} = a_2 = -a_3$. The following more compact notation is used for very well-poised series:
$${}_rW_{r-1}(a_1; a_4, a_5, \ldots, a_r; q, z) := {}_r\phi_{r-1}\!\left[\begin{matrix} a_1, qa_1^{1/2}, -qa_1^{1/2}, a_4, \ldots, a_r \\ a_1^{1/2}, -a_1^{1/2}, qa_1/a_4, \ldots, qa_1/a_r \end{matrix}; q, z\right]$$ [50]
Below only a few of the most important identities are given. See Gasper and Rahman (2004) for many more. An important tool for obtaining complicated identities from more simple ones is Bailey's Lemma, which can moreover be iterated (Bailey chain), see Andrews (1986, ch.3).

The q-Saalschütz sum for a terminating balanced ${}_3\phi_2$
$${}_3\phi_2\!\left[\begin{matrix} a, b, q^{-n} \\ c, q^{1-n}abc^{-1} \end{matrix}; q, q\right] = \frac{(c/a, c/b;q)_n}{(c, c/(ab);q)_n}$$ [51]

Jackson's sum for a terminating balanced ${}_8W_7$
$${}_8W_7(a; b, c, d, q^{n+1}a^2/(bcd), q^{-n}; q, q) = \frac{(qa, qa/(bc), qa/(bd), qa/(cd);q)_n}{(qa/b, qa/c, qa/d, qa/(bcd);q)_n}$$ [52]
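Terminating summation formulas such as [38], [39] and [51] can be spot-checked with a direct implementation of the series [4]. The sketch below assumes nothing beyond that definition; the function names and the sample parameter values are illustrative, and floating-point arithmetic limits the check to moderate n.

```python
import math


def qpoch(a, q, k):
    """Finite q-shifted factorial (a; q)_k from [1]."""
    prod = 1.0
    for j in range(k):
        prod *= 1.0 - a * q**j
    return prod


def qpoch_list(params, q, k):
    """(a_1, ..., a_r; q)_k as in [3]; an empty parameter list gives 1."""
    return math.prod(qpoch(a, q, k) for a in params)


def phi(upper, lower, q, z, nterms):
    """First nterms terms of the r-phi-s series [4]."""
    r, s = len(upper), len(lower)
    total = 0.0
    for k in range(nterms):
        coeff = qpoch_list(upper, q, k) / (qpoch_list(lower, q, k) * qpoch(q, q, k))
        twist = ((-1) ** k * q ** (k * (k - 1) // 2)) ** (s - r + 1)
        total += coeff * twist * z**k
    return total


q, a, b, c, n = 0.5, 0.3, 0.7, 0.2, 4

# q-Saalschutz sum [51]: the upper parameter q^{-n} makes the series stop after n + 1 terms
saal_lhs = phi([a, b, q**-n], [c, q ** (1 - n) * a * b / c], q, q, n + 1)
saal_rhs = qpoch_list([c / a, c / b], q, n) / qpoch_list([c, c / (a * b)], q, n)

# q-Chu-Vandermonde sum [39]
cv_lhs = phi([q**-n, b], [c], q, q, n + 1)
cv_rhs = qpoch(c / b, q, n) * b**n / qpoch(c, q, n)

print(saal_lhs, saal_rhs)
print(cv_lhs, cv_rhs)
```

For a balanced ${}_3\phi_2$ or any ${}_{s+1}\phi_s$ the factor `twist` equals 1; it is kept so that the same function also covers cases such as ${}_2\phi_2$ in [40] or ${}_3\phi_1$ in [44].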

Watson's transformation of a terminating ${}_8W_7$ into a terminating balanced ${}_4\phi_3$
$${}_8W_7\!\left(a; b, c, d, e, q^{-n}; q, \frac{q^{n+2}a^2}{bcde}\right) = \frac{(qa, qa/(de);q)_n}{(qa/d, qa/e;q)_n}\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, d, e, qa/(bc) \\ qa/b, qa/c, q^{-n}de/a \end{matrix}; q, q\right]$$ [53]

Sears' transformation of a terminating balanced ${}_4\phi_3$
$${}_4\phi_3\!\left[\begin{matrix} q^{-n}, a, b, c \\ d, e, f \end{matrix}; q, q\right] = \frac{(e/a, f/a;q)_n}{(e, f;q)_n}\, a^n\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, a, d/b, d/c \\ d, q^{1-n}a/e, q^{1-n}a/f \end{matrix}; q, q\right]$$ [54]
By iteration and by symmetries in the upper and in the lower parameters, many other versions of this identity can be found. An elegant comprehensive formulation of all these versions is as follows.
Let $x_1x_2x_3x_4x_5x_6 = q^{1-n}$. Then the following expression is symmetric in $x_1, x_2, x_3, x_4, x_5, x_6$:
$$\frac{q^{\frac12 n(n-1)}\,(x_1x_2x_3x_4, x_1x_2x_3x_5, x_1x_2x_3x_6;q)_n}{(x_1x_2x_3)^n}\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, x_2x_3, x_1x_3, x_1x_2 \\ x_1x_2x_3x_4, x_1x_2x_3x_5, x_1x_2x_3x_6 \end{matrix}; q, q\right]$$ [55]
Similar formulations involving symmetry groups can be given for other transformations, see Van der Jeugt and Srinivasa Rao (1999).

Bailey's transformation of a terminating balanced ${}_{10}W_9$
$${}_{10}W_9\!\left(a; b, c, d, e, f, \frac{q^{n+2}a^3}{bcdef}, q^{-n}; q, q\right) = \frac{(qa, qa/(ef), (qa)^2/(bcde), (qa)^2/(bcdf);q)_n}{(qa/e, qa/f, (qa)^2/(bcdef), (qa)^2/(bcd);q)_n}\; {}_{10}W_9\!\left(\frac{qa^2}{bcd}; \frac{qa}{cd}, \frac{qa}{bd}, \frac{qa}{bc}, e, f, \frac{q^{n+2}a^3}{bcdef}, q^{-n}; q, q\right)$$ [56]

Rogers–Ramanujan Identities
$${}_0\phi_1(-; 0; q, q) = \sum_{k=0}^{\infty} \frac{q^{k^2}}{(q;q)_k} = \frac{1}{(q, q^4; q^5)_\infty}$$ [57]
$${}_0\phi_1(-; 0; q, q^2) = \sum_{k=0}^{\infty} \frac{q^{k(k+1)}}{(q;q)_k} = \frac{1}{(q^2, q^3; q^5)_\infty}$$ [58]

Bilateral Series

Definition [1] can be extended by
$$(a;q)_k := \frac{(a;q)_\infty}{(aq^k;q)_\infty} \quad (k \in \mathbb{Z})$$ [59]
Define a bilateral q-hypergeometric series by the Laurent series
$${}_r\psi_s\!\left[\begin{matrix} a_1,\ldots,a_r \\ b_1,\ldots,b_s \end{matrix}; q, z\right] = {}_r\psi_s(a_1,\ldots,a_r; b_1,\ldots,b_s; q, z) := \sum_{k=-\infty}^{\infty} \frac{(a_1,\ldots,a_r;q)_k}{(b_1,\ldots,b_s;q)_k} \left((-1)^k q^{\frac12 k(k-1)}\right)^{s-r} z^k \quad (a_1,\ldots,a_r, b_1,\ldots,b_s \neq 0,\ s \ge r)$$ [60]
The Laurent series is convergent if $|b_1\cdots b_s/(a_1\cdots a_r)| < |z|$ and moreover, for $s = r$, $|z| < 1$.

Ramanujan's ${}_1\psi_1$ summation formula
$${}_1\psi_1(b; c; q, z) = \frac{(q, c/b, bz, q/(bz);q)_\infty}{(c, q/b, z, c/(bz);q)_\infty} \quad (|c/b| < |z| < 1)$$ [61]
This has as a limit case
$${}_0\psi_1(-; c; q, z) = \frac{(q, z, q/z;q)_\infty}{(c, c/z;q)_\infty} \quad (|z| > |c|)$$ [62]
and as a further specialization the Jacobi triple product identity
$$\sum_{k=-\infty}^{\infty} (-1)^k q^{\frac12 k(k-1)} z^k = (q, z, q/z;q)_\infty \quad (z \neq 0)$$ [63]
which can be rewritten as a product formula for a theta function:
$$\theta_4(x;q) := \sum_{k=-\infty}^{\infty} (-1)^k q^{k^2} e^{2\pi i k x} = \prod_{k=1}^{\infty} (1 - q^{2k})\left(1 - 2q^{2k-1}\cos(2\pi x) + q^{4k-2}\right)$$ [64]

q-Hypergeometric Orthogonal Polynomials

Here we discuss families of orthogonal polynomials $\{p_n(x)\}$ which are expressible as terminating q-hypergeometric series ($0 < q < 1$) and for which either (1) $P_n(x) := p_n(x)$ or (2) $P_n(x) := p_n(\frac12(x + x^{-1}))$ are eigenfunctions of a second-order q-difference operator, that is,
$$A(x)\,P_n(qx) + B(x)\,P_n(x) + C(x)\,P_n(q^{-1}x) = \lambda_n P_n(x)$$ [65]
where $A(x)$, $B(x)$, and $C(x)$ are independent of n, and where the $\lambda_n$ are the eigenvalues. The generic cases are the four-parameter classes of "Askey–Wilson polynomials" (continuous weight function) and q-Racah polynomials (discrete weights on finitely many points). They are of type (2) (quadratic q-lattice). All other cases can be obtained from the generic cases by specialization or limit transition. In particular, one thus obtains the generic three-parameter classes of type (1) (linear q-lattice). These are the big q-Jacobi polynomials (orthogonality by q-integral) and the q-Hahn polynomials (discrete weights on finitely many points).

Askey–Wilson Polynomials

Definition as q-hypergeometric series
$$p_n(\cos\theta) = p_n(\cos\theta; a, b, c, d \mid q) := \frac{(ab, ac, ad;q)_n}{a^n}\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, q^{n-1}abcd, ae^{i\theta}, ae^{-i\theta} \\ ab, ac, ad \end{matrix}; q, q\right]$$ [66]
This is symmetric in a, b, c, d.

Orthogonality relation Assume that a, b, c, d are four reals, or two reals and one pair of complex conjugates, or two pairs of complex conjugates. Also assume that $|ab|, |ac|, |ad|, |bc|, |bd|, |cd| < 1$. Then
$$\int_{-1}^{1} p_n(x)\,p_m(x)\,w(x)\, dx + \sum_k p_n(x_k)\,p_m(x_k)\,\omega_k = h_n\,\delta_{n,m}$$ [67]
where
$$2\pi \sin\theta\; w(\cos\theta) = \left| \frac{(e^{2i\theta};q)_\infty}{(ae^{i\theta}, be^{i\theta}, ce^{i\theta}, de^{i\theta};q)_\infty} \right|^2$$ [68]
$$h_0 = \frac{(abcd;q)_\infty}{(q, ab, ac, ad, bc, bd, cd;q)_\infty}, \qquad \frac{h_n}{h_0} = \frac{1 - abcdq^{n-1}}{1 - abcdq^{2n-1}}\, \frac{(q, ab, ac, ad, bc, bd, cd;q)_n}{(abcd;q)_n}$$ [69]
and the $x_k$ are the points $\frac12(eq^k + e^{-1}q^{-k})$ with e any of the a, b, c, d of absolute value > 1; the sum is over the $k \in \mathbb{Z}_{\ge 0}$ with $|eq^k| > 1$. The $\omega_k$ are certain weights which can be given explicitly. The sum in [67] does not occur if moreover $|a|, |b|, |c|, |d| < 1$.
A more uniform way of writing the orthogonality relation [67] is by the contour integral
$$\frac{1}{2\pi i} \oint_C p_n\!\left(\tfrac12(z + z^{-1})\right) p_m\!\left(\tfrac12(z + z^{-1})\right) \frac{(z^2, z^{-2};q)_\infty}{(az, az^{-1}, bz, bz^{-1}, cz, cz^{-1}, dz, dz^{-1};q)_\infty}\, \frac{dz}{z} = 2h_n\,\delta_{n,m}$$ [70]
where C is the unit circle traversed in positive direction with suitable deformations to separate the sequences of poles converging to zero from the sequences of poles diverging to $\infty$.
The case $n = m = 0$ of [70] or [67] is known as the Askey–Wilson integral.

q-Difference equation
$$A(z)\,P_n(qz) - \left(A(z) + A(z^{-1})\right)P_n(z) + A(z^{-1})\,P_n(q^{-1}z) = (q^{-n} - 1)(1 - q^{n-1}abcd)\,P_n(z)$$ [71]
where $P_n(z) = p_n(\frac12(z + z^{-1}))$ and $A(z) = (1-az)(1-bz)(1-cz)(1-dz)/((1-z^2)(1-qz^2))$.

Special cases These include the continuous q-Jacobi polynomials (two parameters), the continuous q-ultraspherical polynomials (symmetric one-parameter case of continuous q-Jacobi), the Al-Salam-Chihara polynomials (Askey–Wilson with $c = d = 0$), and the continuous q-Hermite polynomials (Askey–Wilson with $a = b = c = d = 0$).

Continuous q-Ultraspherical Polynomials

Definitions as finite Fourier series and as special Askey–Wilson polynomial
$$C_n(\cos\theta; \beta \mid q) := \sum_{k=0}^{n} \frac{(\beta;q)_k\,(\beta;q)_{n-k}}{(q;q)_k\,(q;q)_{n-k}}\, e^{i(n-2k)\theta}$$ [72]
$$= \frac{(\beta^2;q)_n}{(q, -\beta, q^{1/2}\beta, -q^{1/2}\beta;q)_n}\; p_n(\cos\theta; \beta^{1/2}, q^{1/2}\beta^{1/2}, -\beta^{1/2}, -q^{1/2}\beta^{1/2} \mid q)$$ [73]
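The finite Fourier series [72] is straightforward to evaluate. The sketch below is an illustration only, with function names chosen here: it computes $C_n(\cos\theta;\beta\mid q)$ from [72] and confirms two simple consequences, namely that $\beta = q$ makes every coefficient equal to 1, giving the Chebyshev polynomial $U_n(\cos\theta) = \sin((n+1)\theta)/\sin\theta$ of formula [78], and that $C_1(\cos\theta;\beta\mid q) = 2\cos\theta\,(1-\beta)/(1-q)$.

```python
import math


def qpoch(a, q, k):
    """Finite q-shifted factorial (a; q)_k from [1]."""
    prod = 1.0
    for j in range(k):
        prod *= 1.0 - a * q**j
    return prod


def c_ultraspherical(n, theta, beta, q):
    """C_n(cos theta; beta | q) evaluated from the finite Fourier series [72]."""
    total = 0.0
    for k in range(n + 1):
        coeff = (qpoch(beta, q, k) * qpoch(beta, q, n - k)
                 / (qpoch(q, q, k) * qpoch(q, q, n - k)))
        # coefficients are symmetric under k -> n - k, so the imaginary parts
        # of e^{i(n-2k)theta} cancel and only the cosine survives
        total += coeff * math.cos((n - 2 * k) * theta)
    return total


q, theta = 0.4, 0.9

chebyshev_pairs = [
    (c_ultraspherical(n, theta, q, q), math.sin((n + 1) * theta) / math.sin(theta))
    for n in range(6)
]

beta = 0.25
c1 = c_ultraspherical(1, theta, beta, q)
c1_expected = 2.0 * math.cos(theta) * (1.0 - beta) / (1.0 - q)

print(chebyshev_pairs)
print(c1, c1_expected)
```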

Orthogonality relation (1 <  < 1) Big q-Jacobi Polynomials


Z
2i
Definition as q-hypergeometric series
1 
ðe ; qÞ
2
Cn ðcos ; ; qÞCm ðcos ; ; qÞ

2i 1

d
2 0 ðe ; qÞ1 Pn ðxÞ ¼ Pn ðx; a; b; c; qÞ
" #
ð; q; qÞ1 1   ð2 ; qÞn qn ; qnþ1 ab; x
¼ 2 n;m ½74 :¼ 3 2 ; q; q ½82
ð ; q; qÞ1 1  qn ðq; qÞn
qa; qc

q-Difference equation
 Orthogonality relation
AðzÞPn ðqzÞ  AðzÞ þ Aðz1 Þ Pn ðzÞ þ Aðz1 ÞPn ðq1 zÞ Z qa
ða1 x; c1 x; qÞ1
¼ ðqn  1Þð1  qn 2 ÞPn ðzÞ ½75 Pn ðxÞPm ðxÞ dq x ¼ hn n;m ;
qc ðx; bc1 x; qÞ1
where Pn (z) = Cn ( 12 (z þ z1 );  j q) and A(z) = (1  z2 ) ð0 < a < q1 ; 0 < b < q1 ; c < 0Þ ½83
(1  qz2 )=((1  z2 )(1  qz2 )).
where hn can be explicitly given.
Generating function
ðei z; ei z; qÞ1 X 1 q-Difference equation
¼ Cn ðcos ;  j qÞzn
ðei z; ei z; qÞ1 n¼0 AðxÞPn ðqxÞ  ðAðxÞ þ CðxÞÞPn ðxÞ þ CðxÞPn ðq1 xÞ
ðjzj < 1; 0    ; 1 <  < 1Þ ½76 ¼ ðqn  1Þð1  abqnþ1 ÞPn ðxÞ ½84
2
where A(x) = aq(x  1)(bx  c)=x and C(x) = (x  qa)
Special case: the continuous q-Hermite polynomials (x  qc)=x2
Hn ðx j qÞ ¼ ðq; qÞn Cn ðx; 0 j qÞ ½77
Limit case: Jacobi polynomials P(
,
n
)
(x)
Special cases: the Chebyshev polynomials
lim Pn ðx; q
; q ; q1 d; qÞ
q"1
sinððn þ 1ÞÞ  
Cn ðcos ; q j qÞ ¼ Un ðcos Þ :¼ ½78 n! 2x þ d  1
sin  ¼ Pð
;Þ ½85
ð
þ 1Þn n dþ1

ðq; qÞn
lim Cn ðcos ;  j qÞ ¼ Tn ðcos Þ Special case: the little q-Jacobi polynomials
"1 ð; qÞn

:¼ cosðnÞ ðn > 0Þ ½79 pn ðx; a; b; qÞ ¼ ðbÞn qð1=2Þnðnþ1Þ


ðqb; qÞn
 Pn ðqbx; b; a; 0; qÞ ½86
q-Racah Polynomials ðqa; qÞn

Definition as q-hypergeometric series


¼ 2 1 ðqn ; qnþ1 ab; qa; q; qxÞ ½87
(n = 0, 1, . . . , N)
which satisfy orthogonality relation (for 0 < a < q1
Rn ðqy þ qyþ1 ;
; ; ;  j qÞ and b < q1 )
" #
qn ;
qnþ1 ; qy ; qyþ1 Z 1
:¼ 4 3 ; q; q ðqx; qÞ1 logq a
q
; q; q pn ðx; a; b; qÞpm ðx; a; b; qÞ x dq x
0 ðqbx; qÞ1
ð
;  or ¼ qN1 Þ ½80 ðq; qab; qÞ1 ð1  qÞðqaÞn ðq; qb; qÞn
¼ n;m ½88
ðqa; qb; qÞ1 1  abq2nþ1 ðqa; qab; qÞn
Orthogonality relation
X
N Limit case: Jackson’s third q-Bessel function (see [22])
Rn ðqy þ qyþ1 ÞRm ðqy þ qyþ1 Þ!y
y¼0 ðq; qÞ1 ðnþkÞ
lim pNn ðqNþk ; q ; b; qÞ ¼ q
¼ hn n;m ½81 N!1 ðqþ1 ; qÞ1

where !y and hn can be explicitly given.  Jð3Þ ð2qð1=2ÞðnþkÞ ; qÞ ð > 1Þ ½89
112 q-Special Functions

by which [88] tends to the orthogonality relation for where the contour C is as in [70], and where
J(3) (x; q):
wðzÞ
X
1
Jð3Þ ð2qð1=2ÞðnþkÞ ; qÞ Jð3Þ ð2qð1=2ÞðmþkÞ ; qÞqk ðz2 ; z2 ; abcdez; abcde=z; qÞ1
¼ ½97
k¼1 ðaz; a=z; bz; b=z; cz; c=z; dz; d=z; ez; e=z; qÞ1
¼ n;m qn ðn; m 2 ZÞ ½90
ðbcde; acde; abde; abce; abcd; qÞ1
q-Hahn Polynomials h0 ¼ ½98
ðq; ab; ac; ad; ae; bc; bd; be; cd; ce; de; qÞ1
Definition as q-hypergeometric series and hn =h0 can also be given explicitly. For
 n nþ1  ab = qN , n, m 2 {0, 1, . . . , N}, there is a related dis-
q ; q
; x
Qn ðx;
; ; N; qÞ :¼ 3 2 ; q; q crete biorthogonality of the form
q
; qN
ðn ¼ 0; 1; . . . ; NÞ ½91 X N  
1 k 1 k
Rn ðaq þ a q Þ; a; b; c; d; e
k¼0
2
Orthogonality relation  
1 q
X
N ðq
; qN ; qÞy ðq
Þy  Rm ðaqk þ a1 qk Þ; a; b; c; d; wk ¼ 0
y y 2 abcde
Qn ðq ÞQm ðq Þ
y¼0
ðqN  1 ; q; qÞy ðn 6¼ mÞ ½99
¼ hn n;m ½92
where hn can be explicitly given.
Identities and Functions Associated
Stieltjes–Wigert Polynomials with Root Systems
Definition as q-hypergeometric series -Function Identities
 n 
1 q nþ1
Let R be a root system on a Euclidean space of
Sn ðx; qÞ ¼ 1 1 ; q; q x ½93 dimension l. Then Macdonald (1972) generalizes
ðq; qÞn 0
Weyl’s denominator formula to the case of an affine
The orthogonality measure is not uniquely determined: root system. The resulting formula can be written as
Z 1 an explicit expansion in powers of q of
1
Sn ðq1=2 x; qÞSm ðq1=2 x; qÞwðxÞ dx ¼ n n;m ; !
0 q ðq; qÞn Y1 Y
n l n

where, for instance ð1  q Þ ð1  q e Þ


n¼1
2R
q1=2
wðxÞ ¼ or which expansion takes the form of a sum over a
logðq1 Þðq; q1=2 x; q1=2 x1 ; qÞ1
! lattice related to the root system. For root system A1
q1=2 log2 x this reduces to Jacobi’s triple product identity [63].
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi exp  ½94
2 logðq1 Þ 2 logðq1 Þ Macdonald’s formula implies a similar expansion in
powers of q of (q)lþjRj , where (q) is ‘‘Dedekind’s
Rahman–Wilson Biorthogonal Rational Functions -function’’ (q) := q1=24 (q; q)1 .

The following functions are rational in their first


argument: Constant Term Identities

Rn 12 ðz þ z1 Þ; a; b; c; d; e Let R be a reduced root system, Rþ the positive
:¼ 10W9 ða=e; q=ðbeÞ; q=ðceÞ; roots, and k 2 Z>0 . Macdonald conjectured the
second equality in
q=ðdeÞ; az; a=z; qn1 abcd; qn ; q; qÞ ½95 R Q 

T
2Rþ ðe R ; qÞk ðqe ; qÞk dx
They satisfy the biorthogonality relation
  T dx
I !
1 1 1 Y Y k
Rn ðz þ z Þ; a; b; c; d; e ¼ CT ð1  qi1 e
Þð1  qi e
Þ
2i C 2
 
2Rþ i¼1
1 q dz
 Rm ðz þ z1 Þ; a; b; c; d; wðzÞ l 
Y 
2 abcde z kdi
¼ ½100
¼ 2hn n;m ½96 i¼1 k q
q-Special Functions 113

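where $T$ is a torus determined by $R$, CT means the constant term in the Laurent expansion in $e^{\alpha}$, and the $d_i$ are the degrees of the fundamental invariants of the Weyl group of $R$. The conjecture was extended for real $k > 0$, for several parameters $k$ (one for each root length), and for root system $BC_n$, where Gustafson's five-parameter $n$-variable analog of the Askey–Wilson integral ([70] for n = 0) settles:
$$ \int_{[0,2\pi]^n} \big|\Delta(e^{i\theta_1},\ldots,e^{i\theta_n})\big|^2\, \frac{d\theta_1\cdots d\theta_n}{(2\pi)^n} = 2^n n! \prod_{j=1}^{n} \frac{(t,\, t^{n+j-2}abcd;\, q)_\infty}{(t^j,\, q,\, abt^{j-1},\, act^{j-1},\, \ldots,\, cdt^{j-1};\, q)_\infty} \qquad [101] $$
where
$$ \Delta(z) := \prod_{1\le i<j\le n} \frac{(z_i z_j,\, z_i/z_j;\, q)_\infty}{(t z_i z_j,\, t z_i/z_j;\, q)_\infty}\; \prod_{j=1}^{n} \frac{(z_j^2; q)_\infty}{(a z_j,\, b z_j,\, c z_j,\, d z_j;\, q)_\infty} \qquad [102] $$
Further extensions were in Macdonald's conjectures for the quadratic norms of Macdonald polynomials associated with root systems (see the subsection "Macdonald–Koornwinder polynomials"), and finally proved by Cherednik.

Macdonald Polynomials for Root System $A_{n-1}$

Let $n \in \mathbb{Z}_{>0}$. We work with partitions $\lambda = (\lambda_1, \ldots, \lambda_n)$ of length $\le n$, where $\lambda_1 \ge \cdots \ge \lambda_n \ge 0$ are integers. On the set of such partitions, we take the partial order $\lambda \le \mu \Leftrightarrow \lambda_1 + \cdots + \lambda_n = \mu_1 + \cdots + \mu_n$ and $\lambda_1 + \cdots + \lambda_i \le \mu_1 + \cdots + \mu_i$ $(i = 1, \ldots, n-1)$. Write $\lambda < \mu$ iff $\lambda \le \mu$ and $\lambda \ne \mu$. The monomials are $z^{\mu} = z_1^{\mu_1} \cdots z_n^{\mu_n}$ $(\mu_1, \ldots, \mu_n \in \mathbb{Z}_{\ge 0})$. For $\lambda$ a partition the symmetrized monomials $m_\lambda(z)$ and the Schur functions $s_\lambda(z)$ are defined by:
$$ m_\lambda(z) := \sum_{\mu} z^{\mu} \quad (\text{sum over all distinct permutations } \mu \text{ of } (\lambda_1,\ldots,\lambda_n)) \qquad [103] $$
$$ s_\lambda(z) := \frac{\det\big(z_i^{\lambda_j+n-j}\big)_{i,j=1,\ldots,n}}{\det\big(z_i^{n-j}\big)_{i,j=1,\ldots,n}} \qquad [104] $$
We integrate a function over the torus $T := \{z \in \mathbb{C}^n \mid |z_1| = \cdots = |z_n| = 1\}$ as
$$ \int_T f(z)\,dz := \frac{1}{(2\pi)^n} \int_0^{2\pi}\!\cdots\!\int_0^{2\pi} f(e^{i\theta_1},\ldots,e^{i\theta_n})\, d\theta_1\cdots d\theta_n \qquad [105] $$

Definition. For $\lambda$ a partition and for $0 \le t \le 1$, the (analytically defined) Macdonald polynomial $P_\lambda(z) = P_\lambda(z; q, t)$ is of the form
$$ P_\lambda(z) = P_\lambda(z;q,t) = m_\lambda(z) + \sum_{\mu<\lambda} u_{\lambda,\mu}\, m_\mu(z) \quad (u_{\lambda,\mu} \in \mathbb{C}) $$
such that for all $\mu < \lambda$
$$ \int_T P_\lambda(z)\, \overline{m_\mu(z)}\, \Delta(z)\, dz = 0 $$
where
$$ \Delta(z) = \Delta(z;q,t) := \prod_{i\ne j} \frac{(z_i z_j^{-1}; q)_\infty}{(t z_i z_j^{-1}; q)_\infty} \qquad [106] $$

Orthogonality relation
$$ \frac{1}{n!}\int_T P_\lambda(z)\, \overline{P_\mu(z)}\, \Delta(z)\, dz = \prod_{i<j} \frac{(q^{\lambda_i-\lambda_j} t^{j-i},\; q^{\lambda_i-\lambda_j+1} t^{j-i};\; q)_\infty}{(q^{\lambda_i-\lambda_j} t^{j-i+1},\; q^{\lambda_i-\lambda_j+1} t^{j-i-1};\; q)_\infty}\; \delta_{\lambda,\mu} \qquad [107] $$

q-Difference equation
$$ \sum_{i=1}^{n} \prod_{j\ne i} \frac{t z_i - z_j}{z_i - z_j}\; \tau_{q,z_i} P_\lambda(z;q,t) = \Big(\sum_{i=1}^{n} q^{\lambda_i} t^{n-i}\Big) P_\lambda(z;q,t) \qquad [108] $$
where $\tau_{q,z_i}$ is the q-shift operator: $\tau_{q,z_i} f(z_1, \ldots, z_n) := f(z_1, \ldots, q z_i, \ldots, z_n)$. See (Macdonald 1995, ch. VI, §3) for the full system of q-difference equations.

Special value
$$ P_\lambda(1, t, \ldots, t^{n-1}; q, t) = \prod_{i=1}^{n} t^{(i-1)\lambda_i}\; \prod_{i<j} \frac{(t^{j-i+1}; q)_{\lambda_i-\lambda_j}}{(t^{j-i}; q)_{\lambda_i-\lambda_j}} \qquad [109] $$

Restriction of number of variables
$$ P_{\lambda_1,\lambda_2,\ldots,\lambda_{n-1},0}(z_1,\ldots,z_{n-1},0; q,t) = P_{\lambda_1,\lambda_2,\ldots,\lambda_{n-1}}(z_1,\ldots,z_{n-1}; q,t) \qquad [110] $$

Homogeneity
$$ P_{\lambda_1,\ldots,\lambda_n}(z;q,t) = z_1\cdots z_n\, P_{\lambda_1-1,\ldots,\lambda_n-1}(z;q,t) \quad (\lambda_n > 0) \qquad [111] $$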
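The q-difference equation [108] can be checked by hand in the simplest cases. For $\lambda = (1^k)$ ($k$ ones followed by zeros) there is no partition $\mu < \lambda$, so the definition gives $P_\lambda = m_\lambda$, which is the $k$-th elementary symmetric polynomial. The following Python sketch evaluates both sides of [108] at a sample rational point in exact arithmetic. The function names and the sample values of $z$, $q$, $t$ are illustrative choices, not part of the text above.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def e_k(z, k):
    """Elementary symmetric polynomial e_k(z), i.e. m_lambda(z) for lambda = (1^k)."""
    return sum((prod(c) for c in combinations(z, k)), Fraction(0))


def macdonald_operator(f, z, q, t):
    """Left-hand side of [108]:
    sum_i prod_{j != i} (t z_i - z_j)/(z_i - z_j) * f(z_1, ..., q z_i, ..., z_n).
    The entries of z must be pairwise distinct."""
    n = len(z)
    total = Fraction(0)
    for i in range(n):
        coeff = Fraction(1)
        for j in range(n):
            if j != i:
                coeff *= (t * z[i] - z[j]) / (z[i] - z[j])
        shifted = list(z)
        shifted[i] = q * z[i]  # q-shift in the i-th variable
        total += coeff * f(shifted)
    return total


def eigenvalue(lam, q, t):
    """Right-hand side factor of [108]: sum_i q^{lambda_i} t^{n-i} (i = 1..n)."""
    n = len(lam)
    return sum(q ** lam[i] * t ** (n - 1 - i) for i in range(n))


def check(n, k, z, q, t):
    """Compare both sides of [108] for lambda = (1^k, 0^{n-k})."""
    lam = [1] * k + [0] * (n - k)
    lhs = macdonald_operator(lambda w: e_k(w, k), z, q, t)
    rhs = eigenvalue(lam, q, t) * e_k(z, k)
    return lhs == rhs


# Sample point (arbitrary, chosen only for illustration).
z = [Fraction(2), Fraction(3), Fraction(5)]
q, t = Fraction(1, 3), Fraction(2, 7)
print(all(check(3, k, z, q, t) for k in range(4)))
```

For $n = 2$ and $\lambda = (1,0)$ the same computation done by hand gives $(qt + 1)(z_1 + z_2)$ on both sides, in agreement with the eigenvalue $q^{\lambda_1}t + q^{\lambda_2}$.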

Self-duality. Let $\lambda$, $\mu$ be partitions.
$$ \frac{P_\lambda(q^{\mu_1}t^{n-1},\, q^{\mu_2}t^{n-2},\, \ldots,\, q^{\mu_n};\, q,t)}{P_\lambda(t^{n-1},\, t^{n-2},\, \ldots,\, 1;\, q,t)} = \frac{P_\mu(q^{\lambda_1}t^{n-1},\, q^{\lambda_2}t^{n-2},\, \ldots,\, q^{\lambda_n};\, q,t)}{P_\mu(t^{n-1},\, t^{n-2},\, \ldots,\, 1;\, q,t)} \qquad [112] $$

Special cases and limit relations

Continuous q-ultraspherical polynomials (see [72]):
$$ P_{m,n}(re^{i\theta}, re^{-i\theta}; q,t) = \frac{(q;q)_{m-n}}{(t;q)_{m-n}}\, r^{m+n}\, C_{m-n}(\cos\theta;\, t \mid q) \qquad [113] $$
Symmetrized monomials (see [103]):
$$ P_\lambda(z;q,1) = m_\lambda(z) \qquad [114] $$
Schur functions (see [104]):
$$ P_\lambda(z;q,q) = s_\lambda(z) \qquad [115] $$
Hall–Littlewood polynomials (see Macdonald (1995), ch. III):
$$ P_\lambda(z;0,t) = P_\lambda(z;t) \qquad [116] $$
Jack polynomials (see Macdonald (1995), §VI.10):
$$ \lim_{q\uparrow 1} P_\lambda(z;q,q^{a}) = P_\lambda^{(1/a)}(z) \qquad [117] $$

Algebraic definition of Macdonald polynomials

Macdonald polynomials can also be defined algebraically. We work now with partitions $\lambda$ $(\lambda_1 \ge \lambda_2 \ge \cdots \ge 0)$ of arbitrary length $l(\lambda)$, and with symmetric polynomials in arbitrarily many variables $x_1, x_2, \ldots$, which can be canonically extended to symmetric functions in infinitely many variables $x_1, x_2, \ldots$. The $r$th power sum $p_r$ and the symmetric functions $p_\lambda$ are formally defined by
$$ p_r = \sum_{i\ge 1} x_i^r, \qquad p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots \qquad [118] $$
Put
$$ z_\lambda := \prod_{i\ge 1} i^{m_i}\, m_i! \quad \text{where } m_i = m_i(\lambda) \text{ is the number of parts of } \lambda \text{ equal to } i. \qquad [119] $$
Define an inner product $\langle\,\cdot\,,\cdot\,\rangle_{q,t}$ on the space of symmetric functions such that
$$ \langle p_\lambda, p_\mu \rangle_{q,t} = \delta_{\lambda,\mu}\, z_\lambda \prod_{i=1}^{l(\lambda)} \frac{1-q^{\lambda_i}}{1-t^{\lambda_i}} \qquad [120] $$
For partitions $\lambda$, $\mu$ the partial ordering $\lambda \le \mu$ means now that $\sum_{j\ge 1}\lambda_j = \sum_{j\ge 1}\mu_j$ and $\lambda_1 + \cdots + \lambda_i \le \mu_1 + \cdots + \mu_i$ for all $i$. The Macdonald polynomial $P_\lambda(x; q, t)$ can now be algebraically defined as the unique symmetric function $P_\lambda$ of the form $P_\lambda = \sum_{\mu\le\lambda} u_{\lambda,\mu}\, m_\mu$ $(u_{\lambda,\mu} \in \mathbb{C},\ u_{\lambda,\lambda} = 1)$ such that
$$ \langle P_\lambda, P_\mu \rangle_{q,t} = 0 \quad \text{if } \lambda \ne \mu \qquad [121] $$
If $l(\lambda) \le n$, then the newly defined $P_\lambda(x)$ with $x_{n+1} = x_{n+2} = \cdots = 0$ coincides with $P_\lambda(x; q, t)$ defined analytically, and the new inner product is a constant multiple (depending on $n$) of the old inner product.

Bilinear sum
$$ \sum_{\lambda} \frac{1}{\langle P_\lambda, P_\lambda \rangle_{q,t}}\, P_\lambda(x;q,t)\, P_\lambda(y;q,t) = \prod_{i,j\ge 1} \frac{(t x_i y_j; q)_\infty}{(x_i y_j; q)_\infty} \qquad [122] $$

Generalized Kostka numbers. The Kostka numbers $K_{\lambda,\mu}$ occurring as expansion coefficients in $s_\lambda = \sum_\mu K_{\lambda,\mu}\, m_\mu$ were generalized by Macdonald to coefficients $K_{\lambda,\mu}(q, t)$ occurring in connection with Macdonald polynomials, see Macdonald (1995, §VI.8). Macdonald's conjecture that $K_{\lambda,\mu}(q, t)$ is a polynomial in $q$ and $t$ with coefficients in $\mathbb{Z}_{\ge 0}$ was fully proved in Haiman (2001).

Macdonald–Koornwinder Polynomials

Macdonald (2000, 2001) also introduced Macdonald polynomials associated with an arbitrary root system. For root system $BC_n$ this yields a three-parameter family which can be extended to the five-parameter Macdonald–Koornwinder (M–K) polynomials (Koornwinder 1992). They are orthogonal with respect to the measure occurring in [101] with $\Delta(z)$ given by [102]. The M–K polynomials are $n$-variable analogs of the Askey–Wilson polynomials. All polynomials just discussed tend, for $q \uparrow 1$, to Jacobi polynomials associated with root systems.

Macdonald conjectured explicit expressions for the quadratic norms of the Macdonald polynomials associated with root systems and of the M–K polynomials. These were proved by Cherednik by considering these polynomials as Weyl group symmetrizations of non-invariant polynomials

which are related to double affine Hecke algebras (see Macdonald (2003)).

Elliptic Hypergeometric Series

Let $p, q \in \mathbb{C}$, $|p|, |q| < 1$. Define a modified Jacobi theta function by
$$ \theta(x;p) := (x,\, p/x;\, p)_\infty \quad (x \ne 0) \qquad [123] $$
and the elliptic shifted factorial by
$$ (a;q,p)_k := \theta(a;p)\,\theta(aq;p)\cdots\theta(aq^{k-1};p) \quad (k \in \mathbb{Z}_{>0}); \qquad (a;q,p)_0 := 1 \qquad [124] $$
$$ (a_1,\ldots,a_r;q,p)_k := (a_1;q,p)_k \cdots (a_r;q,p)_k \qquad [125] $$
where $a, a_1, \ldots, a_r \ne 0$. For $q = e^{2\pi i\sigma}$, $p = e^{2\pi i\tau}$ $(\Im\tau > 0)$, and $a \in \mathbb{C}$ we have
$$ \frac{\theta\big(a e^{2\pi i\sigma(x+\sigma^{-1})};\, e^{2\pi i\tau}\big)}{\theta\big(a e^{2\pi i\sigma x};\, e^{2\pi i\tau}\big)} = 1, \qquad \frac{\theta\big(a e^{2\pi i\sigma(x+\tau\sigma^{-1})};\, e^{2\pi i\tau}\big)}{\theta\big(a e^{2\pi i\sigma x};\, e^{2\pi i\tau}\big)} = -a^{-1}q^{-x} \qquad [126] $$
A series $\sum_{k=0}^{\infty} c_k$ with $c_{k+1}/c_k$ being an elliptic (i.e., doubly periodic meromorphic) function of $k$ considered as a complex variable is called an elliptic hypergeometric series. In particular, define the ${}_rE_{r-1}$ theta hypergeometric series as the formal series
$$ {}_rE_{r-1}(a_1,\ldots,a_r;\, b_1,\ldots,b_{r-1};\, q,p;\, z) := \sum_{k=0}^{\infty} \frac{(a_1,\ldots,a_r;q,p)_k}{(b_1,\ldots,b_{r-1};q,p)_k\,(q;q,p)_k}\, z^k \qquad [127] $$
It has $g(k) := c_{k+1}/c_k$ with
$$ g(x) = \frac{z\,\theta(a_1 q^x;p)\cdots\theta(a_r q^x;p)}{\theta(q^{x+1};p)\,\theta(b_1 q^x;p)\cdots\theta(b_{r-1} q^x;p)} $$
By [126], $g(x)$ is an elliptic function with periods $\sigma^{-1}$ and $\tau\sigma^{-1}$ $(q = e^{2\pi i\sigma},\ p = e^{2\pi i\tau})$ if the balancing condition $a_1 \cdots a_r = q\, b_1 \cdots b_{r-1}$ is satisfied.

The ${}_rV_{r-1}$ very well-poised theta hypergeometric series (a special ${}_rE_{r-1}$) is defined, in case of argument 1, as:
$$ {}_rV_{r-1}(a_1;\, a_6,\ldots,a_r;\, q,p) := \sum_{k=0}^{\infty} \frac{\theta(a_1 q^{2k};p)}{\theta(a_1;p)}\; \frac{(a_1, a_6, \ldots, a_r;\, q,p)_k}{(qa_1/a_6, \ldots, qa_1/a_r;\, q,p)_k}\; \frac{q^k}{(q;q,p)_k} \qquad [128] $$
The series is called balanced if $a_6^2 \cdots a_r^2 = a_1^{r-6} q^{r-8}$; this is the condition under which the term ratio of [128] is elliptic by [126], and it is satisfied by the parameters in [129] and [130]. The series terminates if, for instance, $a_r = q^{-n}$.

Elliptic Analog of Jackson's ${}_8W_7$ Summation
$$ {}_{10}V_9\big(a;\, b, c, d,\, q^{n+1}a^2/(bcd),\, q^{-n};\, q,p\big) = \frac{(qa,\, qa/(bc),\, qa/(bd),\, qa/(cd);\, q,p)_n}{(qa/b,\, qa/c,\, qa/d,\, qa/(bcd);\, q,p)_n} \qquad [129] $$

Elliptic Analog of Bailey's ${}_{10}W_9$ Transformation
$$ {}_{12}V_{11}\Big(a;\, b,c,d,e,f,\, \frac{q^{n+2}a^3}{bcdef},\, q^{-n};\, q,p\Big) = \frac{(qa,\, qa/(ef),\, (qa)^2/(bcde),\, (qa)^2/(bcdf);\, q,p)_n}{(qa/e,\, qa/f,\, (qa)^2/(bcdef),\, (qa)^2/(bcd);\, q,p)_n} $$
$$ \qquad \times\; {}_{12}V_{11}\Big(\frac{qa^2}{bcd};\, \frac{qa}{cd},\, \frac{qa}{bd},\, \frac{qa}{bc},\, e,\, f,\, \frac{q^{n+2}a^3}{bcdef},\, q^{-n};\, q,p\Big) \qquad [130] $$
Suitable ${}_{12}V_{11}$ functions satisfy a discrete biorthogonality relation which is an elliptic analog of [99].

Ruijsenaars' elliptic gamma function
$$ \Gamma(z;q,p) := \prod_{j,k=0}^{\infty} \frac{1 - z^{-1}q^{j+1}p^{k+1}}{1 - z q^j p^k} \qquad [131] $$
which is symmetric in $p$ and $q$. Then
$$ \Gamma(qz;q,p) = \theta(z;p)\,\Gamma(z;q,p), \qquad \Gamma(q^n z;q,p) = (z;q,p)_n\,\Gamma(z;q,p) \qquad [132] $$

Applications

Quantum Groups

A specific quantum group is usually a Hopf algebra which is a q-deformation of the Hopf algebra of functions on a specific Lie group or, dually, of a universal enveloping algebra (viewed as Hopf algebra) of a Lie algebra. The general philosophy is that representations of the Lie group or Lie algebra also deform to representations of the quantum group, and that special functions associated with the representations in the classical case deform to q-special functions associated with the representations in the quantum case. Sometimes this is straightforward, but often new subtle phenomena occur.

The representation-theoretic objects which may be explicitly written in terms of q-special functions include matrix elements of representations with respect to specific bases (in particular spherical elements), Clebsch–Gordan coefficients and Racah coefficients. Many one-variable q-hypergeometric functions have found interpretation in some way in connection with a quantum analog of a three-dimensional Lie group (generically the Lie group

$SL(2, \mathbb{C})$ and its real forms). Classical by now are: little q-Jacobi polynomials interpreted as matrix elements of irreducible representations of $SU_q(2)$ with respect to the standard basis; Askey–Wilson polynomials similarly interpreted with respect to a certain basis not coming from a quantum subgroup; Jackson's third q-Bessel functions as matrix elements of irreducible representations of $E_q(2)$; q-Hahn polynomials and q-Racah polynomials interpreted as Clebsch–Gordan coefficients and Racah coefficients, respectively, for $SU_q(2)$.

Further developments include: Macdonald polynomials as spherical elements on quantum analogs of compact Riemannian symmetric spaces; q-analogs of Jacobi functions as matrix elements of irreducible unitary representations of $SU_q(1, 1)$; Askey–Wilson polynomials as matrix elements of representations of the $SU(2)$ dynamical quantum group; an interpretation of discrete ${}_{12}V_{11}$ biorthogonality relations on the elliptic $U(2)$ quantum group.

Since the q-deformed Hopf algebras are usually presented by generators and relations, identities for q-special functions involving noncommuting variables satisfying simple relations are important for further interpretations of q-special functions in quantum groups, for instance:

q-Binomial formula with q-commuting variables
$$ (x+y)^n = \sum_{k=0}^{n} \begin{bmatrix} n \\ k \end{bmatrix}_q y^{n-k}x^k \quad (xy = qyx) \qquad [133] $$

Functional equations for q-exponentials with $xy = qyx$
$$ e_q(x+y) = e_q(y)\,e_q(x), \qquad E_q(x+y) = E_q(x)\,E_q(y) \qquad [134] $$
$$ e_q(x+y-yx) = e_q(x)\,e_q(y), \qquad E_q(x+y+yx) = E_q(y)\,E_q(x) \qquad [135] $$

Various Algebraic Settings

Classical groups over finite fields (Chevalley groups). q-Hahn polynomials and various kinds of q-Krawtchouk polynomials have interpretations as spherical and intertwining functions on classical groups ($GL_n$, $SO_n$, $Sp_n$) over a finite field $F_q$ with respect to suitable subgroups, see Stanton (1984).

Affine Kac–Moody algebras (see Lepowsky (1982)). The Rogers–Ramanujan identities [57], [58] and some of their generalizations were interpreted in the context of characters of representations of the simplest affine Kac–Moody algebra $A_1^{(1)}$. Macdonald's generalization of Weyl's denominator formula to affine root systems has an interpretation as an identity for the denominator of the character of a representation of an affine Kac–Moody algebra.

Partitions of Positive Integers

Let $n$ be a positive integer, $p(n)$ the number of partitions of $n$, $p_N(n)$ the number of partitions of $n$ into parts $\le N$, $p_{\mathrm{dist}}(n)$ the number of partitions of $n$ into distinct parts, and $p_{\mathrm{odd}}(n)$ the number of partitions of $n$ into odd parts. Then, Euler observed:
$$ \frac{1}{(q;q)_\infty} = \sum_{n=0}^{\infty} p(n)\,q^n, \qquad \frac{1}{(q;q)_N} = \sum_{n=0}^{\infty} p_N(n)\,q^n \qquad [136] $$
$$ (-q;q)_\infty = \sum_{n=0}^{\infty} p_{\mathrm{dist}}(n)\,q^n, \qquad \frac{1}{(q;q^2)_\infty} = \sum_{n=0}^{\infty} p_{\mathrm{odd}}(n)\,q^n \qquad [137] $$
and
$$ (-q;q)_\infty = \frac{1}{(q;q^2)_\infty}, \qquad p_{\mathrm{dist}}(n) = p_{\mathrm{odd}}(n) \qquad [138] $$
The Rogers–Ramanujan identity [57] has the following partition-theoretic interpretation: the number of partitions of $n$ with parts differing by at least 2 equals the number of partitions of $n$ into parts congruent to 1 or 4 (mod 5). Similarly, [58] yields: the number of partitions of $n$ with parts larger than 1 and differing by at least 2 equals the number of partitions of $n$ into parts congruent to 2 or 3 (mod 5).

The left-hand sides of the Rogers–Ramanujan identities [57] and [58] have interpretations in the "hard hexagon model," see Baxter (1982). Much further work has been done on Rogers–Ramanujan-type identities in connection with more general models in statistical mechanics. The so-called "fermionic expressions" do occur.

See also: Combinatorics: Overview; Eight Vertex and Hard Hexagon Models; Hopf Algebras and q-Deformation Quantum Groups; Integrable Systems: Overview; Ordinary Special Functions; Solitons and Kac–Moody Lie Algebras.

Further Reading

Andrews GE (1986) q-Series: Their Development and Application in Analysis, Number Theory, Combinatorics, Physics, and Computer Algebra, CBMS Regional Conference Series in Mathematics, vol. 66. Providence, RI: American Mathematical Society.

Andrews GE, Askey R, and Roy R (1999) Special Functions. Cambridge: Cambridge University Press.
Andrews GE and Eriksson K (2004) Integer Partitions. Cambridge: Cambridge University Press.
Baxter RJ (1982) Exactly Solved Models in Statistical Mechanics. London: Academic Press.
Gasper G and Rahman M (2004) Basic Hypergeometric Series, 2nd edn. Cambridge: Cambridge University Press.
Haiman M (2001) Hilbert schemes, polygraphs and the Macdonald positivity conjecture. Journal of the American Mathematical Society 14: 941–1006.
Koekoek R and Swarttouw RF (1998) The Askey-Scheme of Hypergeometric Orthogonal Polynomials and Its q-Analogue. Report 98-17, Faculty of Technical Mathematics and Informatics, Delft University of Technology.
Koornwinder TH (1992) Askey–Wilson polynomials for root systems of type BC. In: Richards DStP (ed.) Hypergeometric Functions on Domains of Positivity, Jack Polynomials, and Applications, Contemp. Math., vol. 138, pp. 189–204. Providence, RI: American Mathematical Society.
Koornwinder TH (1994) Compact quantum groups and q-special functions. In: Baldoni V and Picardello MA (eds.) Representations of Lie Groups and Quantum Groups, Pitman Research Notes in Mathematics Series, vol. 311, pp. 46–128. Harlow: Longman Scientific & Technical.
Koornwinder TH (1997) Special functions and q-commuting variables. In: Ismail MEH, Masson DR, and Rahman M (eds.) Special Functions, q-Series and Related Topics, Fields Institute Communications, vol. 14, pp. 131–166. Providence, RI: American Mathematical Society.
Lepowsky J (1982) Affine Lie algebras and combinatorial identities. In: Winter DJ (ed.) Lie Algebras and Related Topics, Lecture Notes in Math., vol. 933, pp. 130–156. Berlin: Springer.
Macdonald IG (1972) Affine root systems and Dedekind's η-function. Inventiones Mathematicae 15: 91–143.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials, 2nd edn. Oxford: Clarendon.
Macdonald IG (2000, 2001) Orthogonal polynomials associated with root systems. Séminaire Lotharingien de Combinatoire 45: Art. B45a.
Macdonald IG (2003) Affine Hecke Algebras and Orthogonal Polynomials. Cambridge: Cambridge University Press.
Stanton D (1984) Orthogonal polynomials and Chevalley groups. In: Askey RA, Koornwinder TH, and Schempp W (eds.) Special Functions: Group Theoretical Aspects and Applications, pp. 87–128. Dordrecht: Reidel.
Suslov SK (2003) An Introduction to Basic Fourier Series. Dordrecht: Kluwer Academic Publishers.
Van der Jeugt J and Srinivasa Rao K (1999) Invariance groups of transformations of basic hypergeometric series. Journal of Mathematical Physics 40: 6692–6700.
Vilenkin NJ and Klimyk AU (1992) Representation of Lie Groups and Special Functions, vol. 3. Dordrecht: Kluwer Academic Publishers.

Quantum 3-Manifold Invariants


C Blanchet, Université de Bretagne-Sud, Vannes, France
V Turaev, IRMA, Strasbourg, France

© 2006 Elsevier Ltd. All rights reserved.

Introduction

The idea to derive topological invariants of smooth manifolds from partition functions of certain action functionals was suggested by A Schwarz (1978) and highlighted by E Witten (1988). Witten interpreted the Jones polynomial of links in the 3-sphere $S^3$ as a partition function of the Chern–Simons field theory. Witten conjectured the existence of mathematically defined topological invariants of 3-manifolds, generalizing the Jones polynomial (or rather its values at complex roots of unity) to links in arbitrary closed oriented 3-manifolds. A rigorous construction of such invariants was given by N Reshetikhin and V Turaev (1989) using the theory of quantum groups. The Witten–Reshetikhin–Turaev invariants of 3-manifolds, also called the "quantum invariants," extend to a topological quantum field theory (TQFT) in dimension 3.

Ribbon and Modular Categories

The Reshetikhin–Turaev approach begins with fixing suitable algebraic data, which are best described in terms of monoidal categories. Let $\mathcal{C}$ be a monoidal category (i.e., a category with an associative tensor product and unit object $\mathbb{1}$). A "braiding" in $\mathcal{C}$ assigns to any objects $V, W \in \mathcal{C}$ an invertible morphism $c_{V,W} : V \otimes W \to W \otimes V$ such that, for any $U, V, W \in \mathcal{C}$,
$$ c_{U,V\otimes W} = (\mathrm{id}_V \otimes c_{U,W})(c_{U,V} \otimes \mathrm{id}_W) $$
$$ c_{U\otimes V,W} = (c_{U,W} \otimes \mathrm{id}_V)(\mathrm{id}_U \otimes c_{V,W}) $$
A "twist" in $\mathcal{C}$ assigns to any object $V \in \mathcal{C}$ an invertible morphism $\theta_V : V \to V$ such that, for any $V, W \in \mathcal{C}$,
$$ \theta_{V\otimes W} = c_{W,V}\, c_{V,W}\, (\theta_V \otimes \theta_W) $$
A "duality" in $\mathcal{C}$ assigns to any object $V \in \mathcal{C}$ a "dual" object $V^* \in \mathcal{C}$, and evaluation and co-evaluation morphisms $d_V : V^* \otimes V \to \mathbb{1}$, $b_V : \mathbb{1} \to V \otimes V^*$ such that
$$ (\mathrm{id}_V \otimes d_V)(b_V \otimes \mathrm{id}_V) = \mathrm{id}_V $$
$$ (d_V \otimes \mathrm{id}_{V^*})(\mathrm{id}_{V^*} \otimes b_V) = \mathrm{id}_{V^*} $$

The category $\mathcal{C}$ with duality, braiding, and twist is ribbon, if for any $V \in \mathcal{C}$,
$$ (\theta_V \otimes \mathrm{id}_{V^*})\, b_V = (\mathrm{id}_V \otimes \theta_{V^*})\, b_V $$
For an endomorphism $f : V \to V$ of an object $V \in \mathcal{C}$, its trace $\mathrm{tr}(f) \in \mathrm{End}_{\mathcal{C}}(\mathbb{1})$ is defined as
$$ \mathrm{tr}(f) = d_V\, c_{V,V^*}\, ((\theta_V f) \otimes \mathrm{id}_{V^*})\, b_V : \mathbb{1} \to \mathbb{1} $$
This trace shares a number of properties of the standard trace of matrices, in particular, $\mathrm{tr}(fg) = \mathrm{tr}(gf)$ and $\mathrm{tr}(f \otimes g) = \mathrm{tr}(f)\,\mathrm{tr}(g)$. For an object $V \in \mathcal{C}$, set
$$ \dim(V) = \mathrm{tr}(\mathrm{id}_V) = d_V\, c_{V,V^*}\, (\theta_V \otimes \mathrm{id}_{V^*})\, b_V $$
Ribbon categories nicely fit the theory of knots and links in $S^3$. A link $L \subset S^3$ is a closed one-dimensional submanifold of $S^3$. (A manifold is closed if it is compact and has no boundary.) A link is oriented (resp. framed) if all its components are oriented (resp. provided with a homotopy class of nonsingular normal vector fields). Given a framed oriented link $L \subset S^3$ whose components are labeled with objects of a ribbon category $\mathcal{C}$, one defines a tensor $\langle L \rangle \in \mathrm{End}_{\mathcal{C}}(\mathbb{1})$. To compute $\langle L \rangle$, present $L$ by a plane diagram with only double transversal crossings such that the framing of $L$ is orthogonal to the plane. Each double point of the diagram is an intersection of two branches of $L$, going over and under, respectively. Associate with such a crossing the tensor $(c_{V,W})^{\pm 1}$ where $V, W \in \mathcal{C}$ are the labels of these two branches and $\pm 1$ is the sign of the crossing determined by the orientation of $L$. We also associate certain tensors with the points of the diagram where the tangent line is parallel to a fixed axis on the plane. These tensors are derived from the evaluation and co-evaluation morphisms and the twists. Finally, all these tensors are contracted into a single element $\langle L \rangle \in \mathrm{End}_{\mathcal{C}}(\mathbb{1})$. It does not depend on the intermediate choices and is preserved under isotopy of $L$ in $S^3$. For the trivial knot $O(V)$ with framing 0 and label $V \in \mathcal{C}$, we have $\langle O(V) \rangle = \dim(V)$.

Further constructions need the notion of a tangle. An (oriented) tangle is a compact (oriented) one-dimensional submanifold of $\mathbb{R}^2 \times [0, 1]$ with endpoints on $\mathbb{R} \times 0 \times \{0, 1\}$. Near each of its endpoints, an oriented tangle $T$ is directed either down or up, and thus acquires a sign $\pm 1$. One can view $T$ as a morphism from the sequence of $\pm 1$'s associated with its bottom ends to the sequence of $\pm 1$'s associated with its top ends. Tangles can be composed by putting one on top of the other. This defines a category of tangles $\mathcal{T}$ whose objects are finite sequences of $\pm 1$'s and whose morphisms are isotopy classes of framed oriented tangles. Given a ribbon category $\mathcal{C}$, we can consider $\mathcal{C}$-labeled tangles, that is, (framed oriented) tangles whose components are labeled with objects of $\mathcal{C}$. They form a category $\mathcal{T}_{\mathcal{C}}$. Links appear here as tangles without endpoints, that is, as morphisms $\emptyset \to \emptyset$. The link invariant $\langle L \rangle$ generalizes to a functor $\langle\,\cdot\,\rangle : \mathcal{T}_{\mathcal{C}} \to \mathcal{C}$.

To define 3-manifold invariants, we need modular categories (Turaev 1994). Let $k$ be a field. A monoidal category $\mathcal{C}$ is $k$-additive if its Hom sets are $k$-vector spaces, the composition and tensor product of the morphisms are bilinear, and $\mathrm{End}_{\mathcal{C}}(\mathbb{1}) = k$. An object $V \in \mathcal{C}$ is simple if $\mathrm{End}_{\mathcal{C}}(V) = k$. A modular category is a $k$-additive ribbon category $\mathcal{C}$ with a finite family of simple objects $\{V_\lambda\}_\lambda$ such that (1) for any object $V \in \mathcal{C}$ there is a finite expansion $\mathrm{id}_V = \sum_i f_i g_i$ for certain morphisms $g_i : V \to V_{\lambda_i}$, $f_i : V_{\lambda_i} \to V$ and (2) the S-matrix $(S_{\lambda,\mu})$ is invertible over $k$ where $S_{\lambda,\mu} = \mathrm{tr}(c_{V_\mu,V_\lambda}\, c_{V_\lambda,V_\mu})$. Note that $S_{\lambda,\mu} = \langle H(\lambda, \mu) \rangle$ where $H(\lambda, \mu)$ is the oriented Hopf link with framing 0, linking number $+1$, and labels $V_\lambda$, $V_\mu$.

Axiom (1) implies that every simple object in $\mathcal{C}$ is isomorphic to exactly one of $V_\lambda$. In most interesting cases (when there is a well-defined direct summation in $\mathcal{C}$), this axiom may be rephrased by saying that $\mathcal{C}$ is finite semisimple, that is, $\mathcal{C}$ has a finite set of isomorphism classes of simple objects and all objects of $\mathcal{C}$ are direct sums of simple objects. A weaker version of the axiom (2) yields premodular categories.

The invariant $\langle\,\cdot\,\rangle$ of links and tangles extends by linearity to the case where labels are finite linear combinations of objects of $\mathcal{C}$ with coefficients in $k$. Such a linear combination $\Omega = \sum_\lambda \dim(V_\lambda)\, V_\lambda$ is called the Kirby color. It has the following sliding property: for any object $V \in \mathcal{C}$, the two tangles in Figure 1 yield the same morphism $V \to V$. Here, the dashed line represents an arc on the closed component labeled by $\Omega$. This arc can be knotted or linked with other components of the tangle (not shown in the figure).

[Figure 1: Sliding property. Two tangles, each with a strand labeled V and a closed component labeled Ω.]
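To make the invertibility axiom (2) concrete, the following Python sketch checks it numerically for one family of matrices. The input is an assumption that is not derived in the text above: for the modular category obtained from $sl_2$ at level $k$, the S-matrix is taken, up to sign conventions that do not affect invertibility, as $S_{i,j} = \sin(\pi(i+1)(j+1)/r)/\sin(\pi/r)$ with $r = k+2$ and $i, j = 0, \ldots, k$. Its first row then lists the dimensions $\dim(V_j)$, and the check is that $S^2$ equals $\sum_j (\dim V_j)^2$ times the identity, so $S$ is invertible. Function names are illustrative.

```python
import math


def sl2_s_matrix(level):
    """Assumed S-matrix of the sl_2 modular category at the given level
    (standard sine formula, normalized so that S[0][0] = 1)."""
    r = level + 2
    s0 = math.sin(math.pi / r)
    size = level + 1
    return [[math.sin(math.pi * (i + 1) * (j + 1) / r) / s0
             for j in range(size)] for i in range(size)]


def mat_mul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_scalar_multiple_of_identity(m, scalar, tol=1e-9):
    """True if m equals scalar * identity up to the tolerance tol."""
    n = len(m)
    return all(abs(m[i][j] - (scalar if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))


S = sl2_s_matrix(3)
dims = S[0]                               # first row: dim(V_j)
total = sum(d * d for d in dims)          # sum of squared dimensions
print(is_scalar_multiple_of_identity(mat_mul(S, S), total))
```

Since $S^2$ is a nonzero multiple of the identity, $S^{-1}$ is just $S$ divided by that multiple, which is the kind of nondegeneracy that axiom (2) demands.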

Invariants of Closed 3-Manifolds

Given an embedded solid torus $g : S^1 \times D^2 \hookrightarrow S^3$, where $D^2$ is a 2-disk and $S^1 = \partial D^2$, a 3-manifold can be built as follows. Remove from $S^3$ the interior of $g(S^1 \times D^2)$ and glue back the solid torus $D^2 \times S^1$ along $g|_{S^1 \times S^1}$. This process is known as "surgery." The resulting 3-manifold depends only on the isotopy class of the framed knot represented by $g$. More generally, a surgery on a framed link $L = \cup_{i=1}^{m} L_i$ in $S^3$ with $m$ components yields a closed oriented 3-manifold $M_L$. A theorem of W Lickorish and A Wallace asserts that any closed connected oriented 3-manifold is homeomorphic to $M_L$ for some $L$. R Kirby proved that two framed links give rise to homeomorphic 3-manifolds if and only if these links are related by isotopy and a finite sequence of geometric transformations called Kirby moves. There are two Kirby moves: adjoining a distant unknot $O^{\varepsilon}$ with framing $\varepsilon = \pm 1$, and sliding a link component over another one as in Figure 1.

Let $L = \cup_{i=1}^{m} L_i \subset S^3$ be a framed link and let $(b_{i,j})_{i,j=1,\ldots,m}$ be its linking matrix: for $i \ne j$, $b_{i,j}$ is the linking number of $L_i$, $L_j$, and $b_{i,i}$ is the framing number of $L_i$. Denote by $e_+$ (resp. $e_-$) the number of positive (resp. negative) eigenvalues of this matrix. The sliding property of modular categories implies the following theorem. In its statement, a knot $K$ with label $\Omega$ is denoted by $K(\Omega)$.

Theorem 1 Let $\mathcal{C}$ be a modular category with Kirby color $\Omega$. Then $\langle O^{1}(\Omega) \rangle \ne 0$, $\langle O^{-1}(\Omega) \rangle \ne 0$ and the expression
$$ \tau_{\mathcal{C}}(M_L) = \langle O^{1}(\Omega) \rangle^{-e_+}\, \langle O^{-1}(\Omega) \rangle^{-e_-}\, \langle L_1(\Omega), \ldots, L_m(\Omega) \rangle $$
is invariant under the Kirby moves on $L$. This expression yields, therefore, a well-defined topological invariant $\tau_{\mathcal{C}}$ of closed connected oriented 3-manifolds.

Several competing normalizations of $\tau_{\mathcal{C}}$ exist in the literature. Here, the normalization used is such that $\tau_{\mathcal{C}}(S^3) = 1$ and $\tau_{\mathcal{C}}(S^1 \times S^2) = \sum_\lambda (\dim(V_\lambda))^2$. The invariant $\tau_{\mathcal{C}}$ extends to 3-manifolds with a framed oriented $\mathcal{C}$-labeled link $K$ inside by
$$ \tau_{\mathcal{C}}(M_L, K) = \langle O^{1}(\Omega) \rangle^{-e_+}\, \langle O^{-1}(\Omega) \rangle^{-e_-}\, \langle L_1(\Omega), \ldots, L_m(\Omega), K \rangle $$

Three-Dimensional TQFTs

A three-dimensional TQFT $V$ assigns to every closed oriented surface $X$ a finite-dimensional vector space $V(X)$ over a field $k$ and assigns to every cobordism $(M, X, Y)$ a linear map $V(M) = V(M, X, Y) : V(X) \to V(Y)$. Here, a "cobordism" $(M, X, Y)$ between surfaces $X$ and $Y$ is a compact oriented 3-manifold $M$ with $\partial M = (-X) \sqcup Y$ (the minus sign indicates the orientation reversal). A TQFT has to satisfy axioms which can be expressed by saying that $V$ is a monoidal functor from the category of surfaces and cobordisms to the category of vector spaces over $k$. Homeomorphisms of surfaces should induce isomorphisms of the corresponding vector spaces compatible with the action of cobordisms. From the definition, $V(\emptyset) = k$. Every compact oriented 3-manifold $M$ is a cobordism between $\emptyset$ and $\partial M$ so that $V$ yields a "vacuum" vector $V(M) \in \mathrm{Hom}(V(\emptyset), V(\partial M)) = V(\partial M)$. If $\partial M = \emptyset$, then this gives a numerical invariant $V(M) \in V(\emptyset) = k$.

Interestingly, TQFTs are often defined for surfaces and 3-cobordisms with additional structure. The surfaces $X$ are normally endowed with Lagrangians, that is, with maximal isotropic subspaces in $H_1(X; \mathbb{R})$. For 3-cobordisms, several additional structures are considered in the literature: for example, 2-framings, $p_1$-structures, and numerical weights. All these choices are equivalent. The TQFTs requiring such additional structures are said to be "projective" since they provide projective linear representations of the mapping class groups of surfaces.

Every modular category $\mathcal{C}$ with ground field $k$ and simple objects $\{V_\lambda\}_\lambda$ gives rise to a projective three-dimensional TQFT $V_{\mathcal{C}}$. It depends on the choice of a square root $D$ of $\sum_\lambda (\dim(V_\lambda))^2 \in k$. For a connected surface $X$ of genus $g$,
$$ V_{\mathcal{C}}(X) = \mathrm{Hom}_{\mathcal{C}}\Big(\mathbb{1},\; \bigoplus_{\lambda_1,\ldots,\lambda_g}\; \bigotimes_{r=1}^{g} \big(V_{\lambda_r} \otimes V_{\lambda_r}^{*}\big)\Big) $$
The dimension of this vector space enters the Verlinde formula
$$ \dim_k(V_{\mathcal{C}}(X)) \cdot 1_k = D^{2g-2} \sum_\lambda (\dim(V_\lambda))^{2-2g} $$
where $1_k \in k$ is the unit of the field $k$. If $\mathrm{char}(k) = 0$, then this formula computes $\dim_k(V_{\mathcal{C}}(X))$. For a closed connected oriented 3-manifold $M$ with numerical weight zero, $V_{\mathcal{C}}(M) = D^{-b_1(M)-1}\, \tau_{\mathcal{C}}(M)$, where $b_1(M)$ is the first Betti number of $M$.

The TQFT $V_{\mathcal{C}}$ extends to a vaster class of surfaces and cobordisms. Surfaces may be enriched with a finite set of marked points, each labeled with an object of $\mathcal{C}$ and endowed with a tangent direction. Cobordisms may be enriched with ribbon (or fat) graphs whose edges are labeled with objects of $\mathcal{C}$ and whose vertices are labeled with appropriate intertwiners. The resulting TQFT, also denoted $V_{\mathcal{C}}$, is nondegenerate in the sense that, for any surface $X$, the vacuum vectors in $V(X)$ determined by all $M$ with $\partial M = X$ span $V(X)$. A detailed construction of $V_{\mathcal{C}}$ is given in Turaev (1994).

The two-dimensional part of $V_{\mathcal{C}}$ determines a "modular functor" in the sense of G Segal, G Moore, and N Seiberg.

[Figure: $a^{-1}\,$(crossing) $-\; a\,$(opposite crossing) $= (s - s^{-1})\,$(smoothing)]
Figure 2 The Homfly relation.
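The Verlinde formula stated in the section on three-dimensional TQFTs is easy to evaluate once the dimensions $\dim(V_\lambda)$ are known. The Python sketch below does this for a sample input that is an assumption, not something derived in the text: the well-known quantum dimensions $\sin((j+1)\pi/(k+2))/\sin(\pi/(k+2))$, $j = 0, \ldots, k$, of the modular category obtained from $sl_2$ at level $k$. Because only $D^{2g-2} = (D^2)^{g-1}$ enters, the choice of square root $D$ plays no role here. Function names are illustrative.

```python
import math


def sl2_quantum_dims(level):
    """Assumed quantum dimensions dim(V_j), j = 0..level, for sl_2 at the given level."""
    r = level + 2
    return [math.sin((j + 1) * math.pi / r) / math.sin(math.pi / r)
            for j in range(level + 1)]


def verlinde_dimension(dims, genus):
    """Evaluate D^{2g-2} * sum_lambda dim(V_lambda)^{2-2g}, with D^2 = sum dim^2."""
    d_squared = sum(d * d for d in dims)
    return d_squared ** (genus - 1) * sum(d ** (2 - 2 * genus) for d in dims)


# Level 2 has dimensions 1, sqrt(2), 1, so D^2 = 4.
for genus in range(4):
    print(genus, round(verlinde_dimension(sl2_quantum_dims(2), genus), 6))
```

For genus 0 the formula returns 1 for any input, and for genus 1 it returns the number of simple objects. With the level 2 data above, genus 2 gives $4 \cdot (1 + 1/2 + 1) = 10$.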

Constructions of Modular Categories

The universal enveloping algebra $U\mathfrak{g}$ of a (finite-dimensional complex) simple Lie algebra $\mathfrak{g}$ admits a deformation $U_q\mathfrak{g}$, which is a quasitriangular Hopf algebra. The representation category $\mathrm{Rep}(U_q\mathfrak{g})$ is $\mathbb{C}$-linear and ribbon. For generic $q \in \mathbb{C}$, this category is semisimple. (The irreducible representations of $\mathfrak{g}$ can be deformed to irreducible representations of $U_q\mathfrak{g}$.) For $q$ an appropriate root of unity, a certain subquotient of $\mathrm{Rep}(U_q\mathfrak{g})$ is a modular category with ground field $k = \mathbb{C}$. For $\mathfrak{g} = sl_2(\mathbb{C})$, it was pointed out by Reshetikhin and Turaev; the general case involves the theory of tilting modules. The corresponding 3-manifold invariant $\tau$ is denoted $\tau_q^{\mathfrak{g}}$. For example, if $\mathfrak{g} = sl_2(\mathbb{C})$ and $M$ is the Poincaré homology sphere (obtained by surgery on a left-hand trefoil with framing $-1$), then (Le 2003)
$$ \tau_q^{\mathfrak{g}}(M) = (1-q)^{-1} \sum_{n\ge 0} q^n (1-q^{n+1})(1-q^{n+2})\cdots(1-q^{2n+1}) $$
The sum here is finite since $q$ is a root of unity.

There is another construction (Le 2003) of a modular category associated with a simple Lie algebra $\mathfrak{g}$ and certain roots of unity $q$. The corresponding quantum invariant of 3-manifolds is denoted $\tau_q^{P\mathfrak{g}}$. (Here, it is normalized so that $\tau_q^{P\mathfrak{g}}(S^3) = 1$.) Under mild assumptions on the order of $q$, we have $\tau_q^{\mathfrak{g}}(M) = \tau_q^{P\mathfrak{g}}(M)\, \tau'(M)$ for all $M$, where $\tau'(M)$ is a certain Gauss sum determined by $\mathfrak{g}$, the homology group $H = H_1(M)$ and the linking form $\mathrm{Tors}\,H \times \mathrm{Tors}\,H \to \mathbb{Q}/\mathbb{Z}$.

A different construction derives modular categories from the category of framed oriented tangles $\mathcal{T}$. Given a ring $K$, a bigger category $K[\mathcal{T}]$ can be considered whose morphisms are linear combinations of tangles with coefficients in $K$. Both $\mathcal{T}$ and $K[\mathcal{T}]$ have a natural structure of a ribbon monoidal category. The skein method builds ribbon categories by quotienting $K[\mathcal{T}]$ using local "skein" relations, which appear in the theory of knot polynomials (the Alexander–Conway polynomial, the Homfly polynomial, and the Kauffman polynomial). In order to obtain a semisimple category, one completes the quotient category with idempotents as objects (the Karoubi completion). Choosing appropriate skein relations, one can recover the modular categories derived from quantum groups of series A, B, C, D. In particular, the categories determined by the series A arise from the Homfly skein relation shown in Figure 2 where $a, s \in K$. The categories determined by the series B, C, D arise from the Kauffman skein relation.

The quantum invariants of 3-manifolds and the TQFTs associated with $sl_N$ can be directly described in terms of the Homfly skein theory, avoiding the language of ribbon categories (W Lickorish, C Blanchet, N Habegger, G Masbaum, P Vogel for $sl_2$ and Y Yokota for all $sl_N$).

Unitarity

From both physical and topological viewpoints, one is mainly interested in Hermitian and unitary TQFTs (over $k = \mathbb{C}$). A TQFT $V$ is Hermitian if the vector space $V(X)$ is endowed with a nondegenerate Hermitian form $\langle\,\cdot\,,\cdot\,\rangle_X : V(X) \otimes_{\mathbb{C}} V(X) \to \mathbb{C}$ such that:
1. the form $\langle\,\cdot\,,\cdot\,\rangle_X$ is natural with respect to homeomorphisms and multiplicative with respect to disjoint union and
2. for any cobordism $(M, X, Y)$ and any $x \in V(X)$, $y \in V(Y)$,
$$ \langle V(M, X, Y)(x),\, y \rangle_Y = \langle x,\, V(-M, Y, X)(y) \rangle_X $$
If $\langle\,\cdot\,,\cdot\,\rangle_X$ is positive definite for every $X$, then the Hermitian TQFT is "unitary." Note two features of Hermitian TQFTs. If $\partial M = \emptyset$, then $V(-M) = \overline{V(M)}$. The group of self-homeomorphisms of any $X$ acts in $V(X)$ preserving the form $\langle\,\cdot\,,\cdot\,\rangle_X$. For a unitary TQFT, this gives an action by unitary matrices.

The three-dimensional TQFT derived from a modular category $\mathcal{V}$ is Hermitian (resp. unitary) under additional assumptions on $\mathcal{V}$, which are discussed briefly here. A "conjugation" in $\mathcal{V}$ assigns to each morphism $f : V \to W$ in $\mathcal{V}$ a morphism $\overline{f} : W \to V$ so that
$$ \overline{\overline{f}} = f, \qquad \overline{f+g} = \overline{f} + \overline{g} \quad \text{for any } f, g : V \to W $$
$$ \overline{f \otimes g} = \overline{f} \otimes \overline{g} \quad \text{for any morphisms } f, g \text{ in } \mathcal{V} $$
$$ \overline{f \circ g} = \overline{g} \circ \overline{f} \quad \text{for any morphisms } f : V \to W,\ g : W \to V $$

One calls $\mathcal{V}$ Hermitian if it is endowed with conjugation such that
$$ \overline{\theta_V} = (\theta_V)^{-1}, \qquad \overline{c_{V,W}} = (c_{V,W})^{-1} $$
$$ \overline{b_V} = d_V\, c_{V,V^*}\, (\theta_V \otimes 1_{V^*}) $$
$$ \overline{d_V} = (1_{V^*} \otimes \theta_V^{-1})\, c_{V^*,V}^{-1}\, b_V $$
for any objects $V, W$ of $\mathcal{V}$. A Hermitian modular category $\mathcal{V}$ is unitary if $\mathrm{tr}(f\overline{f}) \ge 0$ for any morphism $f$ in $\mathcal{V}$. The three-dimensional TQFT, derived from a Hermitian (resp. unitary) modular category, has a natural structure of a Hermitian (resp. unitary) TQFT.

The modular category derived from a simple Lie algebra $\mathfrak{g}$ and a root of unity $q$ is always Hermitian. It may be unitary for some $q$. For simply laced $\mathfrak{g}$, there are always such roots of unity $q$ of any given sufficiently big order. For non-simply-laced $\mathfrak{g}$, this holds under certain divisibility conditions on the order of $q$.

Integral Structures in TQFTs

The quantum invariants of 3-manifolds have one fundamental property: up to an appropriate rescaling, they are algebraic integers. This was first observed by H Murakami, who proved that $\tau_q^{sl_2}(M)$ is an algebraic integer, provided the order of $q$ is an odd prime and $M$ is a homology sphere. This extends to an arbitrary closed connected oriented 3-manifold $M$ and an arbitrary simple Lie algebra $\mathfrak{g}$ as follows (Le 2003): for any sufficiently big prime integer $r$ and any primitive $r$th root of unity $q$,
$$ \tau_q^{P\mathfrak{g}}(M) \in \mathbb{Z}[q] = \mathbb{Z}[\exp(2\pi i/r)] \qquad [1] $$
This inclusion allows one to expand $\tau_q^{P\mathfrak{g}}(M)$ as a polynomial in $q$. A study of its coefficients leads to the Ohtsuki invariants of rational homology spheres and further to perturbative invariants of 3-manifolds due to T Le, J Murakami, and T Ohtsuki (see Ohtsuki (2002)). Conjecturally, the inclusion [1] holds for nonprime (sufficiently big) $r$ as well. Connections with algebraic number theory (specifically modular forms) were studied by D Zagier and R Lawrence.

It turns out that $S(X)$ is a finitely generated projective $D$-module and $V(X) = S(X) \otimes_D \mathbb{C}$. A cobordism $(M, X, Y)$ is targeted if all its connected components meet $Y$ along a nonempty set. In this case, $V(M)(S(X)) \subset S(Y)$. Thus, applying $S$ to surfaces and restricting $V$ to targeted cobordisms, we obtain an "integral version" of $V$. In many interesting cases, the $D$-module $S(X)$ is free and its basis may be described explicitly. A simple Lie algebra $\mathfrak{g}$ and a primitive $r$th (in some cases $4r$th) root of unity $q$ with sufficiently big prime $r$ give rise to an almost $D$-integral TQFT for $D = \mathbb{Z}[q]$.

State-Sum Invariants

Another approach to three-dimensional TQFTs is based on the theory of 6j-symbols and state sums on triangulations of 3-manifolds. This approach, introduced by V Turaev and O Viro, is a quantum deformation of the Ponzano–Regge model for three-dimensional lattice gravity. The quantum 6j-symbols derived from representations of $U_q(sl_2\mathbb{C})$ are $\mathbb{C}$-valued rational functions of the variable $q_0 = q^{1/2}$
$$ \begin{Bmatrix} i & j & k \\ l & m & n \end{Bmatrix} \qquad [2] $$
indexed by 6-tuples of non-negative integers $i, j, k, l, m, n$. One can think of these integers as labels sitting on the edges of a tetrahedron (see Figure 3). The 6j-symbol admits various equivalent normalizations and we choose the one which has full tetrahedral symmetry. Now, let $q_0 \in \mathbb{C}$ be a primitive $2r$th root of unity with $r \ge 2$. Set $I = \{0, 1, \ldots, r-2\}$. Given a labeled tetrahedron $T$ as in Figure 3 with $i, j, k, l, m, n \in I$, the 6j-symbol [2] can be evaluated at $q_0$ to obtain a complex number denoted $|T|$. Consider a closed three-dimensional manifold $M$ with triangulation $t$. (Note that all 3-manifolds can be triangulated.) A coloring of $M$ is a mapping $\varphi$ from the set $\mathrm{Edg}(t)$ of the edges of $t$ to $I$. Set
$$ |M| = \Big(\frac{\sqrt{2r}}{q_0 - q_0^{-1}}\Big)^{-2a} \sum_{\varphi}\; \prod_{e \in \mathrm{Edg}(t)} \langle \varphi(e) \rangle\; \prod_{T} |T^{\varphi}| $$


It is important to obtain similar integrality results
for TQFTs. Following P Gilmer, fix a Dedekind
domain D  C and call a TQFT V almost D-integral
k
if it is nondegenerate and there is d 2 C such i j
that dV(M) 2 D for all M with @M = ;. Given
an almost-integral TQFT V and a surface X, we m
l
^ to be the D-submodule of V(X), generated
define S(X)
by all vacuum vectors for X. This module is preserved n
under the action of self-homeomorphisms of X. Figure 3 Labeled tetrahedron.

where $a$ is the number of vertices of $t$, $\langle n \rangle = (-1)^n (q_0^n - q_0^{-n})(q_0 - q_0^{-1})^{-1}$ for any integer $n$, $T$ runs over all tetrahedra of $t$, and $T^{\varphi}$ is $T$ with the labeling induced by $\varphi$. It is important to note that $|M|$ does not depend on the choice of $t$ and thus yields a topological invariant of $M$.

The invariant $|M|$ is closely related to the quantum invariant $\tau_q^{g}(M)$ for $g = sl_2(\mathbb{C})$. Namely, $|M|$ is the square of the absolute value of $\tau_q^{g}(M)$, that is, $|M| = |\tau_q^{g}(M)|^2$. This computes $|\tau_q^{g}(M)|$ inside $M$ without appeal to surgery. No such computation of the phase of $\tau_q^{g}(M)$ is known.

These constructions generalize in two directions. First, they extend to manifolds with boundary. Second, instead of the representation category of $U_q(sl_2\mathbb{C})$, one can use an arbitrary modular category $\mathcal{C}$. This yields a three-dimensional TQFT, which associates to a surface $X$ a vector space $|X|_{\mathcal{C}}$, and to a 3-cobordism $(M, X, Y)$ a homomorphism $|M|_{\mathcal{C}} : |X|_{\mathcal{C}} \to |Y|_{\mathcal{C}}$ (see Turaev (1994)). When $X = Y = \emptyset$, this homomorphism is multiplication $\mathbb{C} \to \mathbb{C}$ by a topological invariant $|M|_{\mathcal{C}} \in \mathbb{C}$. The latter is computed as a state sum on a triangulation of $M$ involving the 6j-symbols associated with $\mathcal{C}$. In general, these 6j-symbols are not numbers but tensors so that, instead of their product, one should use an appropriate contraction of tensors. The vectors in $V(X)$ are geometrically represented by trivalent graphs on $X$ such that every edge is labeled with a simple object of $\mathcal{C}$ and every vertex is labeled with an intertwiner between the three objects labeling the incident edges. The TQFT $|\cdot|_{\mathcal{C}}$ is related to the TQFT $V = V_{\mathcal{C}}$ by $|M|_{\mathcal{C}} = |V(M)|^2$. Moreover, for any closed oriented surface $X$,

\[ |X|_{\mathcal{C}} = \mathrm{End}(V(X)) = V(X) \otimes (V(X))^* = V(X) \otimes V(-X) \]

and for any three-dimensional cobordism $(M, X, Y)$,

\[ |M|_{\mathcal{C}} = V(M) \otimes V(-M) : V(X) \otimes V(-X) \to V(Y) \otimes V(-Y) \]

J Barrett and B Westbury introduced a generalization of $|M|_{\mathcal{C}}$ derived from the so-called spherical monoidal categories (which are assumed to be semisimple with a finite set of isomorphism classes of simple objects). This class includes modular categories and a most interesting family of (unitary monoidal) categories arising in the theory of subfactors (see Evans and Kawahigashi (1998) and Kodiyalam and Sunder (2001)). Every spherical category $\mathcal{C}$ gives rise to a topological invariant $|M|_{\mathcal{C}}$ of a closed oriented 3-manifold $M$. (It seems that this approach has not yet been extended to cobordisms.) Every monoidal category $\mathcal{C}$ gives rise to a double (or a center) $Z(\mathcal{C})$, which is a braided monoidal category (see Majid (1995)). If $\mathcal{C}$ is spherical, then $Z(\mathcal{C})$ is modular. Conjecturally, $|M|_{\mathcal{C}} = \tau_{Z(\mathcal{C})}(M)$. In the case where $\mathcal{C}$ arises from a subfactor, this has been recently proved by Y Kawahigashi, N Sato, and M Wakui.

The state sum invariants above are closely related to spin networks, spin foam models, and other models of quantum gravity in dimension 2 + 1 (see Baez (2000) and Carlip (1998)).

See also: Axiomatic Approach to Topological Quantum Field Theory; Braided and Modular Tensor Categories; Chern–Simons Models: Rigorous Results; Finite-type Invariants of 3-Manifolds; Large-N and Topological Strings; Schwarz-Type Topological Quantum Field Theory; Topological Quantum Field Theory: Overview; von Neumann Algebras: Subfactor Theory.

Further Reading

Baez JC (2000) An Introduction to Spin Foam Models of BF Theory and Quantum Gravity. Geometry and Quantum Physics, Lecture Notes in Physics, No. 543, pp. 25–93. Berlin: Springer.
Bakalov B and Kirillov A Jr. (2001) Lectures on Tensor Categories and Modular Functors. University Lecture Series, vol. 21. Providence, RI: American Mathematical Society.
Blanchet C, Habegger N, Masbaum G, and Vogel P (1995) Topological quantum field theories derived from the Kauffman bracket. Topology 34: 883–927.
Carlip S (1998) Quantum Gravity in 2 + 1 Dimensions. Cambridge Monographs on Mathematical Physics. Cambridge: Cambridge University Press.
Carter JS, Flath DE, and Saito M (1995) The Classical and Quantum 6j-Symbols. Mathematical Notes, vol. 43. Princeton: Princeton University Press.
Evans D and Kawahigashi Y (1998) Quantum Symmetries on Operator Algebras. Oxford Mathematical Monographs, Oxford Science Publications. New York: The Clarendon Press, Oxford University Press.
Kauffman LH (2001) Knots and Physics, 3rd edn. Series on Knots and Everything, vol. 1. River Edge, NJ: World Scientific.
Kerler T and Lyubashenko V (2001) Non-Semisimple Topological Quantum Field Theories for 3-Manifolds with Corners. Lecture Notes in Mathematics, vol. 1765. Berlin: Springer.
Kodiyalam V and Sunder VS (2001) Topological Quantum Field Theories from Subfactors. Research Notes in Mathematics, vol. 423. Boca Raton, FL: Chapman and Hall/CRC Press.
Le T (2003) Quantum invariants of 3-manifolds: integrality, splitting, and perturbative expansion. Topology and Its Applications 127: 125–152.
Lickorish WBR (2002) Quantum Invariants of 3-Manifolds. Handbook of Geometric Topology, pp. 707–734. Amsterdam: North-Holland.
Majid S (1995) Foundations of Quantum Group Theory. Cambridge: Cambridge University Press.
Ohtsuki T (2002) Quantum Invariants. A Study of Knots, 3-Manifolds, and Their Sets. Series on Knots and Everything, vol. 29. River Edge, NJ: World Scientific.
Turaev V (1994) Quantum Invariants of Knots and 3-Manifolds. de Gruyter Studies in Mathematics, vol. 18. Berlin: Walter de Gruyter.

Quantum Calogero–Moser Systems


R Sasaki, Kyoto University, Kyoto, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Calogero–Moser (C–M) systems are multiparticle (i.e., finite degrees of freedom) dynamical systems with long-range interactions. They are integrable and solvable at both classical and quantum levels. These systems offer an ideal arena for the interplay of many important concepts in mathematical/theoretical physics: to name a few, classical and quantum mechanics, classical and quantum integrability, exact and quasi-exact solvability, addition of discrete (spin) degrees of freedom, quantum Lax pair formalism, supersymmetric quantum mechanics, crystallographic root systems and associated Weyl groups and Lie algebras, noncrystallographic root systems, and Coxeter groups or finite reflection groups. The quantum integrability or solvability of C–M systems does not depend on such known solution mechanisms as Yang–Baxter equations, quantum R-matrix or Bethe ansatz for the quantum systems. In fact, quantum C–M systems provide good material for pondering quantum integrability.

Quantum (Liouville) Integrability

The classical Liouville theorem for an integrable system consists of two parts. Let us consider Hamiltonian dynamics of finite degrees of freedom $N$ with coordinates $q = (q_1, \ldots, q_N)$ and conjugate momenta $p = (p_1, \ldots, p_N)$ equipped with Poisson brackets $\{q_j, p_k\} = \delta_{jk}$, $\{q_j, q_k\} = \{p_j, p_k\} = 0$. The first part is the existence of a set of independent and involutive, $\{K_j, K_k\} = 0$, conserved quantities $\{K_j\}$, as many as the degrees of freedom ($j = 1, \ldots, N$). The second part asserts that the generating function of the canonical transformation for the action-angle variables can be constructed from the conserved quantities via quadrature. In other words, the second part, that is, the reducibility to the action-angle variables, is the integrability. The quantum counterpart of the first half is readily formulated: that is, the existence of a set of independent and mutually commuting (involutive), $[K_j, K_k] = 0$, conserved quantities $\{K_j\}$, as many as the degrees of freedom. (This does not necessarily imply, however, that they are well defined in a proper Hilbert space.) The definition of the quantum integrability should come as a second part, which is yet to be formulated. It is clear that the quantum Liouville integrability does not imply the complete determination of the eigenvalues and eigenfunctions. Such systems would be called exactly solvable. This can be readily understood by considering any (autonomous) degree-1 Hamiltonian system, which, by definition, is Liouville integrable at the classical and quantum levels. However, it is known that the number of exactly solvable degree-1 Hamiltonians is very limited. What would be the quantum counterpart of the "transformation to action-angle variables by quadrature"? Could it be better formulated in terms of a path integral? Many questions remain to be answered. The quantum C–M systems, an infinite family of exactly solvable multiparticle Hamiltonians, would shed some light on the problem of quantum integrability, in addition to their own beautiful structure explored below.

Throughout this article, the dependence on Planck's constant, $\hbar$, is shown explicitly to distinguish the quantum effects.

Simplest Cases (Based on $A_{r-1}$ Root System)

The simplest example of a C–M system consists of $r$ particles of equal mass (normalized to unity) on a line with pairwise $1/(\mathrm{distance})^2$ interactions described by the following Hamiltonian:

\[ \hat H = \frac{1}{2} \sum_{j=1}^{r} p_j^2 + g(g - \hbar) \sum_{j<k}^{r} \frac{1}{(q_j - q_k)^2} \tag{1} \]

in which $g$ is a real positive coupling constant. Here $q = (q_1, \ldots, q_r)$ are the coordinates and $p = (p_1, \ldots, p_r)$ are the conjugate canonical momenta obeying the canonical commutation relations: $[q_j, p_k] = i\hbar\delta_{jk}$, $[q_j, q_k] = [p_j, p_k] = 0$, $j, k = 1, \ldots, r$. The Heisenberg equations of motion are $\dot q_j = (i/\hbar)[\hat H, q_j] = p_j$, $\ddot q_j = \dot p_j = (i/\hbar)[\hat H, p_j] = 2g(g - \hbar) \sum_{k \neq j} 1/(q_j - q_k)^3$. The repulsive $1/(\mathrm{distance})^2$ potential cannot be surmounted classically or quantum mechanically, and the relative position of the particles on the line is not changed during the time evolution. Classically, it means that if a motion starts at a configuration $q_1 > q_2 > \cdots > q_r$, then the inequalities remain valid throughout the time evolution. At the quantum level, the wave functions vanish at the boundaries, and the configuration space can be naturally limited to $q_1 > q_2 > \cdots > q_r$ (the principal Weyl chamber).

Similar integrable quantum many-particle dynamics are obtained by replacing the inverse square potential in [1] by the trigonometric

(hyperbolic) counterpart (see Figure 1) $1/(q_j - q_k)^2 \to a^2/\sinh^2 a(q_j - q_k)$, in which $a > 0$ is a real parameter. The $1/\sin^2 q$ potential case (the Sutherland system) corresponds to the $1/(\mathrm{distance})^2$ interaction on a circle of radius $1/2a$, see Figure 2. A harmonic confining potential $\omega^2 \sum_{j=1}^{r} q_j^2/2$ can be added to the rational Hamiltonian [1] without breaking the integrability (the Calogero system, see Figure 1). At the classical level, the trigonometric (hyperbolic) and rational C–M systems are obtained from the elliptic potential systems (with the Weierstrass $\wp$ function) as the degenerate limits: $\wp(q_1 - q_2) \to a^2/\sinh^2 a(q_1 - q_2) \to 1/(q_1 - q_2)^2$, namely as one (two) period(s) of the $\wp$ function tends to infinity. It is remarkable that these equations of motion can be expressed in a matrix form (Lax pair): $(i/\hbar)[\hat H, L] = dL/dt = LM - ML = [L, M]$ (the Heisenberg equation of motion), in which $L$ and $M$ are given by

\[ L = \begin{pmatrix}
p_1 & \dfrac{ig}{q_1 - q_2} & \cdots & \dfrac{ig}{q_1 - q_r} \\
\dfrac{ig}{q_2 - q_1} & p_2 & \cdots & \dfrac{ig}{q_2 - q_r} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{ig}{q_r - q_1} & \dfrac{ig}{q_r - q_2} & \cdots & p_r
\end{pmatrix} \]

\[ M = \begin{pmatrix}
m_1 & -\dfrac{ig}{(q_1 - q_2)^2} & \cdots & -\dfrac{ig}{(q_1 - q_r)^2} \\
-\dfrac{ig}{(q_2 - q_1)^2} & m_2 & \cdots & -\dfrac{ig}{(q_2 - q_r)^2} \\
\vdots & \vdots & \ddots & \vdots \\
-\dfrac{ig}{(q_r - q_1)^2} & -\dfrac{ig}{(q_r - q_2)^2} & \cdots & m_r
\end{pmatrix} \tag{2} \]

Figure 1 Four different types of quantum C–M potentials. [Panels, $V(q)$ versus $q$: Rational, $1/q^2$; Calogero, $q^2 + 1/q^2$; Sutherland, $1/\sin^2 q$; Hyperbolic, $1/\sinh^2 q$.]

Figure 2 Sutherland potential is $1/(\mathrm{distance})^2$ interaction on a circle. The large-radius limit, $a \to 0$, gives the rational potential. [Particles $q_1, \ldots, q_4$ on a circle of radius $R = 1/2a$; $\mathrm{distance}(q_1, q_2) = \sin a(q_1 - q_2)/a$.]

The diagonal element $m_j$ of $M$ is given by $m_j = ig \sum_{k \neq j} 1/(q_j - q_k)^2$. The matrix $M$ has a special property, $\sum_{j=1}^{r} M_{jk} = \sum_{k=1}^{r} M_{jk} = 0$, which ensures that the total sums of the powers of the Lax matrix $L$ are quantum conserved quantities: $[\hat H, K_n] = 0$, $K_n \equiv \mathrm{Ts}(L^n) = \sum_{j,k} (L^n)_{jk}$ ($n = 1, 2, 3, \ldots$), $[K_n, K_m] = 0$. It should be stressed that the trace of $L^n$ is not conserved because of the noncommutativity of $q$ and $p$. The Hamiltonian is equivalent to $K_2$, $\hat H \propto K_2 + \mathrm{const}$. In other words, the Lax matrix $L$ is like a "square root" of the Hamiltonian. The quantum equations of motion for the Sutherland and hyperbolic potentials are again expressed by Lax pairs if the following replacements are made: $1/(q_j - q_k) \to a \coth a(q_j - q_k)$ in $L$ and $1/(q_j - q_k)^2 \to a^2/\sinh^2 a(q_j - q_k)$ in $M$. The quantum conserved quantities are obtained in the same manner as above for the systems with the trigonometric and hyperbolic interactions.

The main goal here is to find all the eigenvalues $\{E\}$ and eigenfunctions $\{\psi(q)\}$ of the Hamiltonians with the rational, Calogero, Sutherland, and hyperbolic potentials: $\hat H \psi(q) = E \psi(q)$. The momentum operator $p_j$ acts as the differential operator $p_j = -i\hbar\,\partial/\partial q_j$. For example, for the rational model Hamiltonian [1], the eigenvalue equation reads

\[ \left[ -\frac{\hbar^2}{2} \sum_{j=1}^{r} \frac{\partial^2}{\partial q_j^2} + g(g - \hbar) \sum_{j<k}^{r} \frac{1}{(q_j - q_k)^2} \right] \psi(q) = E \psi(q) \tag{3} \]

which is a second-order Fuchsian differential equation for each variable $\{q_j\}$ with a regular singularity at each hyperplane $q_j = q_k$ whose exponents are $g/\hbar$, $1 - g/\hbar$. Any solution of [3] is regular at all points, except for those on the union of hyperplanes $q_j = q_k$. Since the structure of the singularity is the same for the other three types of potentials, the same assertion for the regularity and singularity of the solution holds for these cases, too. For the trigonometric (Sutherland) case, there are other singularities at $q_j - q_k = l\pi/a$, $l \in \mathbb{Z}$, due to the periodicity of the potential. As is clear from the shape of the potentials, see Figure 1, the rational and hyperbolic Hamiltonians have only continuous spectra, whereas the Calogero and Sutherland Hamiltonians have only discrete spectra.

The integrability or more precisely the triangularity of the quantum C–M Hamiltonian was first discovered by Calogero for particles on a line with inverse square potential plus a confining harmonic force and by Sutherland for the particles on a circle

with the trigonometric potential. Later, classical integrability of the models in terms of Lax pairs was proved by Moser. Olshanetsky and Perelomov showed that these systems were based on $A_{r-1}$ root systems, that is, $q_j - q_k = \rho \cdot q$, and $\rho$ is one of the root vectors of the $A_{r-1}$ root system [13]. They also introduced generalizations of the C–M systems based on any root system including the noncrystallographic ones.

As shown by Heckman–Opdam and Sasaki and collaborators, quantum C–M systems with degenerate potentials (i.e., the rational potentials with/without harmonic force, the hyperbolic, and the trigonometric potentials), based on any root system, can be formulated and solved universally. To be more precise, the rational and Calogero systems are integrable for all root systems, the crystallographic and noncrystallographic. The hyperbolic and trigonometric (Sutherland) systems are integrable for any crystallographic root system. The universal formulas for the Hamiltonians, Lax pairs, ground state wave functions, conserved quantities, the triangularity, the discrete spectra for the Calogero and Sutherland systems, the creation and annihilation operators, etc., are equally valid for any root system. This will be shown in the next section. Some rudimentary facts of the root systems and reflections are summarized in the appendix.

Universal Formalism

A C–M system is a Hamiltonian dynamical system associated with a root system $\Delta$ of rank $r$, which is a set of vectors in $\mathbb{R}^r$ with its standard inner product. A brief review of the properties of the root systems and the associated reflections together with explicit realizations of all the classical root systems will be found in the appendix.

Factorized Hamiltonian

The Hamiltonian for the quantum C–M system can be written in terms of a pre-potential $W(q)$ in a "factorized form":

\[ H = \frac{1}{2} \sum_{j=1}^{r} \left( p_j - i \frac{\partial W(q)}{\partial q_j} \right) \left( p_j + i \frac{\partial W(q)}{\partial q_j} \right) \tag{4} \]

The pre-potential is a sum over positive roots:

\[ W(q) = \sum_{\rho \in \Delta_+} g_\rho \ln |w(\rho \cdot q)| + \left( -\frac{\omega}{2} q^2 \right) \tag{5} \]

The real positive coupling constants $g_\rho$ are defined on orbits of the corresponding Coxeter group, that is, they are identical for roots in the same orbit. That is, for the simple Lie algebra cases, there is one coupling constant, $g_\rho = g$, for all roots in simply laced models and two independent coupling constants, $g_\rho = g_L$ for long roots and $g_\rho = g_S$ for short roots, in non-simply laced models. The function $w(u)$ and the other functions $x(u)$ and $y(u)$ appearing in the Lax pair [10], [11] are listed in Table 1 for each type of degenerate potentials. The dynamics of the prepotentials $W(q)$ (eqn [5]) has been discussed by Dyson from a different point of view (random-matrix model). The above factorized Hamiltonian [4] consists of an operator part $\hat H$, which is the Hamiltonian in the usual definition (see the Hamiltonians in the previous section, e.g., [1]), and a constant $\mathcal{E}_0$ which is the ground-state energy, $H = \hat H - \mathcal{E}_0$. The factorized Hamiltonian [4] also arises within the context of supersymmetric quantum mechanics.

Table 1 Functions appearing in the prepotential and Lax pair

Potential        w(u)       x(u)         y(u)
Rational         u          1/u          -1/u^2
Hyperbolic       sinh au    a coth au    -a^2/sinh^2 au
Trigonometric    sin au     a cot au     -a^2/sin^2 au

The pre-potential and the Hamiltonian are invariant under reflection of the phase space variables in the hyperplane perpendicular to any root: $W(s_\rho(q)) = W(q)$, $H(s_\rho(p), s_\rho(q)) = H(p, q)$, $\forall \rho \in \Delta$, with $s_\rho$ defined by [12]. The above Coxeter (Weyl) invariance is the only (discrete) symmetry of the C–M systems. The main problem is, as in the $A_{r-1}$ case, to find all the eigenvalues $\{E\}$ and eigenfunctions $\{\psi(q)\}$ of the above Hamiltonian, $H\psi(q) = E\psi(q)$.

For any root system and for any choice of potential, the C–M system has a hard repulsive potential $\sim 1/(\rho \cdot q)^2$ near the reflection hyperplane $H_\rho = \{q \in \mathbb{R}^r,\ \rho \cdot q = 0\}$. The C–M eigenvalue equation is a second-order Fuchsian differential equation with regular singularities at each reflection hyperplane $H_\rho$ and those arising from the periodicity in the case of the Sutherland potential. Near the reflection hyperplane $H_\rho$, the solution behaves as follows:

\[ \psi \sim (\rho \cdot q)^{g_\rho/\hbar} (1 + \text{regular terms}), \quad \text{or} \quad \psi \sim (\rho \cdot q)^{1 - g_\rho/\hbar} (1 + \text{regular terms}) \]

The former solution is chosen for the square integrability. Because of the singularities, the configuration space is restricted to the principal Weyl chamber PW or the principal Weyl alcove $\mathrm{PW}_T$ for the trigonometric potential (see Figure 3): $\mathrm{PW} = \{q \in \mathbb{R}^r \mid \alpha \cdot q > 0,\ \alpha \in \Pi\}$, $\mathrm{PW}_T = \{q \in \mathbb{R}^r \mid \alpha \cdot q > 0,$

$\alpha \in \Pi,\ \alpha_h \cdot q < \pi/a\}$ ($\Pi$: set of simple roots, see the appendix). Here $\alpha_h$ is the highest root.

Figure 3 Simple roots, the highest root, fundamental weights, and the principal Weyl alcove (grey) and the principal Weyl chamber (light grey, extending to infinity) in a two-dimensional root system. [Labels: simple roots $\alpha_1, \alpha_2$; highest root $\alpha_h$; fundamental weights $\lambda_1, \lambda_2$.]

Ground-State Wave Function and Energy

One straightforward outcome of the factorized Hamiltonian [4] is the universal ground-state wave function, which is given by

\[ \phi_0(q) = e^{W(q)/\hbar} = \prod_{\rho \in \Delta_+} |w(\rho \cdot q)|^{g_\rho/\hbar}\, e^{-(\omega/2\hbar) q^2}, \qquad H \phi_0(q) = 0 \tag{6} \]

The exponential factor $e^{-(\omega/2\hbar) q^2}$ exists only for the Calogero systems. The ground-state energy, that is, the constant part of $H = \hat H - \mathcal{E}_0$, has a universal expression for each potential:

\[ \mathcal{E}_0 = \begin{cases} 0 & \text{rational} \\ \omega \left( \hbar r/2 + \sum_{\rho \in \Delta_+} g_\rho \right) & \text{Calogero} \end{cases} \qquad \mathcal{E}_0 = 2a^2 \varrho^2 \times \begin{cases} -1 & \text{hyperbolic} \\ \phantom{-}1 & \text{Sutherland} \end{cases} \tag{7} \]

where $\varrho = \frac{1}{2} \sum_{\rho \in \Delta_+} g_\rho \rho$ is called a "deformed Weyl vector." Obviously, $\phi_0(q)$ is square integrable in the configuration spaces for the Calogero and Sutherland systems and not square integrable for the rational and hyperbolic potentials.

Excited States, Triangularity, and Spectrum

Excited states of the C–M systems can be easily obtained as eigenfunctions of a differential operator $\tilde H$ obtained from $H$ by a similarity transformation:

\[ \tilde H = e^{-W/\hbar} H e^{W/\hbar} = -\frac{1}{2} \sum_{j=1}^{r} \left( \hbar^2 \frac{\partial^2}{\partial q_j^2} + 2\hbar \frac{\partial W}{\partial q_j} \frac{\partial}{\partial q_j} \right) \]

The eigenvalue equation for $\tilde H$, $\tilde H \phi_E = E \phi_E$, is then equivalent to that of the original Hamiltonian, $H \phi_E e^{W/\hbar} = E \phi_E e^{W/\hbar}$. Since all the singularities of the Fuchsian differential equation $H\psi(q) = E\psi(q)$ are contained in the ground-state wave function $e^{W/\hbar}$, $\phi_E$ must be regular at finite $q$, including all the reflection boundaries. As for the rational and hyperbolic potentials, the energy eigenvalues are only continuous. For the rational case, the eigenfunctions are multivariable generalizations of Bessel functions.

Calogero systems

The similarity-transformed Hamiltonian $\tilde H$ reads

\[ \tilde H = \hbar\omega\, q \cdot \frac{\partial}{\partial q} - \frac{\hbar^2}{2} \sum_{j=1}^{r} \frac{\partial^2}{\partial q_j^2} - \hbar \sum_{\rho \in \Delta_+} \frac{g_\rho}{\rho \cdot q}\, \rho \cdot \frac{\partial}{\partial q} \tag{8} \]

which maps a Coxeter-invariant polynomial in $q$ of degree $d$ to another of degree $d$. Thus, the Hamiltonian $\tilde H$ [8] is lower-triangular in the basis of Coxeter-invariant polynomials and the diagonal elements have values $\hbar\omega \times \text{degree}$, as given by the first term. Independent Coxeter-invariant polynomials exist at the degrees $f_j$ listed in Table 2: $f_j = 1 + e_j$, $j = 1, \ldots, r$, where $\{e_j\}$, $j = 1, \ldots, r$, are the exponents of $\Delta$.

The eigenvalues of the Hamiltonian $H$ are $\hbar\omega N$ with $N$ a non-negative integer. $N$ can be expressed as $N = \sum_{j=1}^{r} n_j f_j$, $n_j \in \mathbb{Z}_+$, and the degeneracy of the eigenvalue $\hbar\omega N$ is the number of such partitions of $N$. It is remarkable that the coupling constant dependence appears only in the ground-state energy $\mathcal{E}_0$. This is a deformation of the isotropic harmonic oscillator confined in the principal Weyl chamber. The eigenpolynomials are generalizations of multivariable Laguerre (Hermite) polynomials. One immediate consequence of this spectrum is the periodicity of the quantum motion. If a system has a wave function $\psi(0)$ at $t = 0$, then at $t = T = 2\pi/\omega$ the system has physically the same wave function as $\psi(0)$, that is, $\psi(T) = e^{-i\mathcal{E}_0 T/\hbar} \psi(0)$. The same assertion holds at the classical level, too.

Table 2 The degrees $f_j$ in which independent Coxeter-invariant polynomials exist

Delta     f_j = 1 + e_j                   Delta     f_j = 1 + e_j
A_r       2, 3, 4, ..., r + 1             E_8       2, 8, 12, 14, 18, 20, 24, 30
B_r       2, 4, 6, ..., 2r                F_4       2, 6, 8, 12
C_r       2, 4, 6, ..., 2r                G_2       2, 6
D_r       2, 4, ..., 2r - 2, r            I_2(m)    2, m
E_6       2, 5, 6, 8, 9, 12               H_3       2, 6, 10
E_7       2, 6, 8, 10, 12, 14, 18         H_4       2, 12, 20, 30
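The counting rule for the Calogero spectrum can be made concrete with a short sketch. It assumes only what is stated above: the levels are $\hbar\omega N$ (measured from the ground-state energy) with $N = \sum_j n_j f_j$ and non-negative integers $n_j$, and the degeneracy is the number of such decompositions. The function names and the small dictionary of degrees (copied from Table 2) are illustrative choices, not part of the original article.

```python
# Illustrative sketch: degeneracy of the Calogero level hbar*omega*N,
# counted as the number of tuples (n_1, ..., n_r) of non-negative
# integers with N = sum_j n_j * f_j.  Degrees f_j are taken from Table 2.

DEGREES = {
    "A2": (2, 3),
    "B2": (2, 4),
    "G2": (2, 6),
    "H3": (2, 6, 10),
    "F4": (2, 6, 8, 12),
}


def degeneracy(level, degrees):
    """Number of ways to write `level` as sum_j n_j * f_j with n_j >= 0."""
    if level < 0:
        return 0
    # ways[m] = number of decompositions of m using the degrees seen so far
    ways = [1] + [0] * level
    for f in degrees:
        for m in range(f, level + 1):
            ways[m] += ways[m - f]
    return ways[level]


def spectrum(degrees, max_level):
    """List (N, degeneracy) for all occupied levels 0 <= N <= max_level."""
    levels = []
    for n in range(max_level + 1):
        d = degeneracy(n, degrees)
        if d > 0:
            levels.append((n, d))
    return levels


if __name__ == "__main__":
    for name, degs in DEGREES.items():
        print(name, spectrum(degs, 12))
```

For $A_2$ (degrees 2 and 3), for instance, the level $N = 1$ is absent and $N = 6$ is doubly degenerate, since $6 = 3 \cdot 2 = 2 \cdot 3$. Each degree is treated as a separate generator, so a root system with a repeated degree (such as $D_4$) is handled by listing that degree twice.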

Sutherland Systems

The periodicity of the trigonometric potential dictates that the wave function should be a Bloch factor $e^{2ia\mu \cdot q}$ (where $\mu$ is a weight) multiplied by a Fourier series in terms of simple roots. The basis of the Weyl invariant wave functions is specified by a dominant weight $\lambda = \sum_{j=1}^{r} m_j \lambda_j$, $m_j \in \mathbb{Z}_+$, $\phi_\lambda(q) \equiv \sum_{\mu \in O_\lambda} e^{2ia\mu \cdot q}$, where $O_\lambda$ is the orbit of $\lambda$ by the action of the Weyl group: $O_\lambda = \{g(\lambda) \mid g \in G_\Delta\}$. The set of functions $\{\phi_\lambda\}$ has an order $\succ$, $|\lambda|^2 > |\lambda'|^2 \Rightarrow \phi_\lambda \succ \phi_{\lambda'}$. The similarity-transformed Hamiltonian $\tilde H$ given by

\[ \tilde H = -\frac{\hbar^2}{2} \sum_{j=1}^{r} \frac{\partial^2}{\partial q_j^2} - a\hbar \sum_{\rho \in \Delta_+} g_\rho \cot(a\rho \cdot q)\, \rho \cdot \frac{\partial}{\partial q} \tag{9} \]

is lower-triangular in this basis: $\tilde H \phi_\lambda = 2a^2(\hbar^2 \lambda^2 + 2\hbar \varrho \cdot \lambda) \phi_\lambda + \sum_{|\lambda'| < |\lambda|} c_{\lambda'} \phi_{\lambda'}$. That is, the eigenvalue is $E = 2a^2(\hbar^2 \lambda^2 + 2\hbar \varrho \cdot \lambda)$ or $E + \mathcal{E}_0 = 2a^2(\hbar\lambda + \varrho)^2$. Again, the coupling constant dependence comes solely from the deformed Weyl vector $\varrho$. This spectrum is a deformation of the spectrum corresponding to the free motion with momentum $2\hbar a\lambda$ in the principal Weyl alcove. The corresponding eigenfunction is called a generalized Jack polynomial or Heckman–Opdam's Jacobi polynomial. For the rank-2 ($r = 2$) root systems, $A_2$, $B_2 \cong C_2$ and $I_2(m)$ (the dihedral group), the complete set of eigenfunctions is known explicitly.

Quantum Lax Pair and Quantum Conserved Quantities

The universal Lax pair for C–M systems is given in terms of the representations of the Coxeter (Weyl) group instead of the Lie algebra. The Lax operators without spectral parameter for the rational, trigonometric, and hyperbolic potentials are

\[ L(p, q) = p \cdot \hat H + X(q), \qquad X(q) = i \sum_{\rho \in \Delta_+} g_\rho\, (\rho \cdot \hat H)\, x(\rho \cdot q)\, \hat s_\rho \tag{10} \]

\[ M(q) = \frac{i}{2} \sum_{\rho \in \Delta_+} g_\rho\, \rho^2\, y(\rho \cdot q)\, (\hat s_\rho - I) \tag{11} \]

where $I$ is the identity operator and $\{\hat s_\rho \mid \rho \in \Delta\}$ are the reflection operators of the root system. They act on a set of $\mathbb{R}^r$ vectors, $R = \{\mu^{(k)} \in \mathbb{R}^r \mid k = 1, \ldots, d\}$, permuting them under the action of the reflection group. The vectors in $R$ form a basis for the representation space $V$ of dimension $d$. The matrix elements of the operators $\{\hat s_\rho \mid \rho \in \Delta\}$ and $\{\hat H_j \mid j = 1, \ldots, r\}$ are defined as follows: $(\hat s_\rho)_{\mu\nu} = \delta_{\mu, s_\rho(\nu)} = \delta_{\nu, s_\rho(\mu)}$, $(\hat H_j)_{\mu\nu} = \mu_j \delta_{\mu\nu}$, $\rho \in \Delta$, $\mu, \nu \in R$. The form of the functions $x$, $y$ depends on the chosen potential as given in Table 1. Then the equations of motion can be expressed in a matrix form $dL/dt = (i/\hbar)[H, L] = [L, M]$. The operator $M$ satisfies the relation $\sum_{\mu \in R} M_{\mu\nu} = \sum_{\nu \in R} M_{\mu\nu} = 0$, which is essential for deriving quantum conserved quantities as the total sum (Ts) of all the matrix elements of $L^n$: $K_n = \mathrm{Ts}(L^n) \equiv \sum_{\mu, \nu \in R} (L^n)_{\mu\nu}$, $[H, K_n] = 0$, $[K_m, K_n] = 0$, $n, m = 1, \ldots$ In particular, the power 2 is universal to all the root systems, and the quantum Hamiltonian is given by $H \propto K_2 + \mathrm{const}$. As in the affine Toda molecule systems, a Lax pair with a spectral parameter can also be introduced universally for all the above potentials. The Dunkl operators, or the commuting differential–difference operators, are also used to construct quantum conserved quantities for some root systems. This method is essentially equivalent to the universal Lax operator formalism. As the Lax operators do not contain Planck's constant, the quantum Lax pair is essentially of the same form as the classical Lax pair. The difference between the trace (tr) and the total sum (Ts) vanishes as $\hbar \to 0$.

Lax pair for Calogero systems

The quantum Lax pair for the Calogero systems is obtained from the universal Lax pair [10] by the replacement $L \to L^\pm = L \pm i\omega Q$, $Q \equiv q \cdot \hat H$, which correspond to the creation and annihilation operators of a harmonic oscillator. The equations of motion are rewritten as $dL^\pm/dt = (i/\hbar)[H, L^\pm] = [L^\pm, M] \pm i\omega L^\pm$. Then $\mathcal{L}^\pm = L^\pm L^\mp$ satisfy the Lax type equation $d\mathcal{L}^\pm/dt = (i/\hbar)[H, \mathcal{L}^\pm] = [\mathcal{L}^\pm, M]$, in which the $\pm i\omega$ terms cancel in the product, giving rise to conserved quantities $\mathrm{Ts}(\mathcal{L}^\pm)^n$, $n = 1, 2, \ldots$ The Calogero Hamiltonian is given by $H \propto \mathrm{Ts}(\mathcal{L}^\pm)$.

All the eigenstates of the Calogero Hamiltonian $H$ with eigenvalues $\hbar\omega N$, $N = \sum_{j=1}^{r} n_j f_j$, $n_j \in \mathbb{Z}_+$, are simply constructed in terms of $L^\pm$: $\prod_{j=1}^{r} (B^+_{f_j})^{n_j} e^{W/\hbar}$. Here the integers $\{f_j\}$, $j = 1, \ldots, r$, are listed in Table 2. The creation operators $B^+_{f_j}$ and the corresponding annihilation operators $B^-_{f_j}$ are defined by $B^\pm_{f_j} = \mathrm{Ts}(L^\pm)^{f_j}$, $j = 1, \ldots, r$. They are Hermitian conjugate to each other, $(B^\pm_{f_j})^\dagger = B^\mp_{f_j}$, with respect to the standard Hermitian inner product of the states defined in PW. They satisfy the commutation relations $[H, B^\pm_k] = \pm\hbar k\omega B^\pm_k$, $[B^+_k, B^+_l] = [B^-_k, B^-_l] = 0$, $k, l \in \{f_j \mid j = 1, \ldots, r\}$. The ground state is annihilated by all the annihilation operators: $B^-_{f_j} e^{W/\hbar} = 0$, $j = 1, \ldots, r$.

Further Developments

Rational Potentials: Superintegrability

The systems with the rational potential have a remarkable property: superintegrability. A rational C–M system based on a rank-$r$ root system has $2r - 1$

independent conserved quantities. Roughly speaking, Sutherland. For each member  of R, to be called
they are of the form Kn = Ts(Ln ), Jm = Ts(QLm ), Q  a ‘‘site,’’ a vector space V is associated whose
q  Ĥ, among which only r are involutive. At the element is called a ‘‘spin.’’ The dynamical variables
classical level, superintegrability can be characterized are those of the particles {qj , pj } and the spin
as algebraic linearizability. Since a commutator of any exchange operators {P^ } ( 2 ) which exchange
conserved quantities is again a conserved quantity, these the spins at the sites  and s (). For each  and R
conserved quantities form a nonlinear algebra called a a spin exchange model can be defined by ‘‘freezing’’
quadratic algebra. It can be considered as a finite-dimensional analog of the W-algebra appearing in certain conformal field theories.

Quantum vs Classical Integrability

In C–M systems, the classical and quantum integrability are very closely related. The quantum discrete spectra of the Calogero and the Sutherland systems are, as shown above, expressed in terms of the coupling constant ($\omega$, $g$) and the exponents or the weights of the corresponding root systems. Namely, they are integral multiples of coupling constants. The corresponding classical systems with the potential $V(q) = (1/2)\sum_{j=1}^{r} (\partial W(q)/\partial q_j)^2$ share many remarkable properties. As is clear from Figure 1, they always have an equilibrium position. The equilibrium positions ($\bar{q}$) are described by the zeros of a classical orthogonal polynomial: the Hermite polynomial (A-type Calogero), the Laguerre polynomial (B, C, D-type Calogero), the Chebyshev polynomial (A-type Sutherland) and the Jacobi polynomial (B, C, D-type Sutherland). For the exceptional root systems, the corresponding polynomials were not known for a long time. The minimum energy of the classical potential $V(q)$ at the equilibrium is the quantum ground-state energy $\lim_{\hbar \to 0} E_0$ itself. It is also an integral multiple of coupling constants for both Calogero and Sutherland cases. Near a classical equilibrium, a multiparticle dynamical system is always reduced to a system of coupled harmonic oscillators. For Calogero systems, the eigenfrequencies of these small oscillations are, in fact, exactly the same as the quantum eigenfrequencies, $\omega f_j = \omega(1 + e_j)$. For Sutherland systems, the classical eigenfrequencies are the same as the $o(\hbar)$ part of the quantum spectra corresponding to all the fundamental weights $\lambda_j$: $2a^2 \lambda_j \cdot \varrho$. Moreover, the eigenvalues of various Lax matrices $L$ and $M$ at the equilibrium take many "interesting values." These results provide ample explicit examples of the general theorem on the quantum–classical correspondence formulated by Loris–Sasaki.

Spin Models

For any root system $\Delta$ and an irreducible representation $R$ of the Coxeter (Weyl) group $G_\Delta$, a spin C–M system can be defined for each of the potentials: rational, Calogero, hyperbolic and

the particle degrees of freedom at the equilibrium point of the corresponding classical potential $\{q, p\} \to \{\bar{q}, 0\}$. These are generalizations of the Haldane–Shastry model for Sutherland potentials and that of Polychronakos for the Calogero potentials. Universal Lax pair operators for both spin C–M systems and spin exchange models are known and conserved quantities are constructed.

Integrable Deformations

C–M systems allow various integrable deformations at the classical and/or quantum levels. One of the well-known deformations is the so-called "relativistic" C–M system or the Ruijsenaars–Schneider (R–S) system. For degenerate potentials, they are integrable both at the classical and quantum levels. The classical quantities of the R–S systems at equilibrium exhibit many interesting properties, too. The equilibrium positions are described by the zeros of certain deformations of the above-mentioned classical polynomials. The frequencies of small oscillations are also related to the exact quantum spectrum, and they can be expressed as coupling constant times the (q-)integers.

Inozemtsev models are classically integrable multiparticle dynamical systems related to C–M systems based on classical root systems (A, B, C, D) with additional $q^6$ (rational) or $\sin^2 2q$ (trigonometric) potentials. Their quantum versions are not exactly solvable in contrast to the C–M or R–S systems, although there is some evidence of their Liouville integrability (without a proper Hilbert space). Quantum Inozemtsev systems can be deformed to be a widest class of quasi-exactly solvable multiparticle dynamical systems. They possess a form of higher-order supersymmetry for which the method of prepotential is also useful.

Appendix: Root Systems

Some rudimentary facts of the root systems and reflections are recapitulated here. The set of roots $\Delta$ is invariant under reflections in the hyperplane perpendicular to each vector in $\Delta$. In other words, $s_\alpha(\beta) \in \Delta$, $\forall \alpha, \beta \in \Delta$, where
\[ s_\alpha(\beta) = \beta - (\alpha^\vee \cdot \beta)\,\alpha, \qquad \alpha^\vee \equiv 2\alpha/|\alpha|^2 \tag{12} \]
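The reflection formula [12] is easy to check on a small example. The following is a minimal sketch, assuming the A-type root set $\{\pm(e_j - e_k)\}$ in an orthonormal basis as the test case; the helper names `dot`, `reflect`, `a_type_roots` and `is_reflection_invariant` are illustrative and not part of the article.

```python
from fractions import Fraction


def dot(u, v):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, beta):
    """s_alpha(beta) = beta - (alpha^vee . beta) alpha, alpha^vee = 2 alpha / |alpha|^2."""
    coeff = Fraction(2 * dot(alpha, beta), dot(alpha, alpha))
    return tuple(b - coeff * a for a, b in zip(alpha, beta))


def a_type_roots(r):
    """Hypothetical helper: the roots +-(e_j - e_k), 1 <= j < k <= r, in R^r."""
    roots = set()
    for j in range(r):
        for k in range(j + 1, r):
            e = [0] * r
            e[j], e[k] = 1, -1
            roots.add(tuple(e))
            roots.add(tuple(-c for c in e))
    return roots


def is_reflection_invariant(roots):
    """True if s_alpha(beta) stays in the set for every pair of roots."""
    return all(reflect(a, b) in roots for a in roots for b in roots)
```

For instance, `is_reflection_invariant(a_type_roots(3))` checks the invariance for the six roots of this type in $\mathbb{R}^3$. Exact rational arithmetic is used so that set membership is not affected by rounding.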
The set of reflections $\{s_\alpha \mid \alpha \in \Delta\}$ generates a group $G_\Delta$, known as a Coxeter group, or finite reflection group. The orbit of $\beta \in \Delta$ is the set of root vectors resulting from the action of the Coxeter group on it. The set of positive roots $\Delta_+$ may be defined in terms of a vector $U \in \mathbb{R}^r$, with $\alpha \cdot U \neq 0$, $\forall \alpha \in \Delta$, as the roots $\alpha \in \Delta$ such that $\alpha \cdot U > 0$. Given $\Delta_+$, there is a unique set of $r$ simple roots $\Pi = \{\alpha_j \mid j = 1, \ldots, r\}$ defined such that they span the root space and the coefficients $\{a_j\}$ in $\beta = \sum_{j=1}^{r} a_j \alpha_j$ for $\beta \in \Delta_+$ are all non-negative. The highest root $\alpha_h$, for which $\sum_{j=1}^{r} a_j$ is maximal, is then also determined uniquely. The subset of reflections $\{s_\alpha \mid \alpha \in \Pi\}$ in fact generates the Coxeter group $G_\Delta$. The products of $s_\alpha$, with $\alpha \in \Pi$, are subject solely to the relations $(s_\alpha s_\beta)^{m(\alpha,\beta)} = 1$, $\alpha, \beta \in \Pi$. The interpretation is that $s_\alpha s_\beta$ is a rotation in some plane by $2\pi/m(\alpha,\beta)$. The set of positive integers $m(\alpha,\beta)$ (with $m(\alpha,\alpha) = 1$, $\forall \alpha \in \Pi$) uniquely specifies the Coxeter group. The weight lattice $P(\Delta)$ is defined as the $\mathbb{Z}$-span of the fundamental weights $\{\lambda_j\}$, defined by $\alpha_j^\vee \cdot \lambda_k = \delta_{jk}$, $\forall \alpha_j \in \Pi$.

The root systems for finite reflection groups may be divided into two types: crystallographic and noncrystallographic. Crystallographic root systems satisfy the additional condition $\alpha^\vee \cdot \beta \in \mathbb{Z}$, $\forall \alpha, \beta \in \Delta$. The remaining noncrystallographic root systems are $H_3$, $H_4$, whose Coxeter groups are the symmetry groups of the icosahedron and four-dimensional 600-cell, respectively, and the dihedral group of order $2m$, $\{I_2(m) \mid m \geq 4\}$.

The explicit examples of the classical root systems, that is, A, B, C, and D are given below. For the exceptional and noncrystallographic root systems, the reader is referred to Humphreys' book. In all cases, $\{e_j\}$ denotes an orthonormal basis in $\mathbb{R}^r$.

1. $A_{r-1}$: This root system is related with the Lie algebra su(r).
\[ \Delta = \bigcup_{1 \le j < k \le r} \{\pm(e_j - e_k)\}, \qquad \Pi = \bigcup_{j=1}^{r-1} \{e_j - e_{j+1}\} \tag{13} \]

2. $B_r$: This root system is associated with Lie algebra so(2r + 1). The long roots have (length)$^2$ = 2 and short roots have (length)$^2$ = 1:
\[ \Delta = \bigcup_{1 \le j < k \le r} \{\pm e_j \pm e_k\} \cup \bigcup_{j=1}^{r} \{\pm e_j\}, \qquad \Pi = \bigcup_{j=1}^{r-1} \{e_j - e_{j+1}\} \cup \{e_r\} \tag{14} \]

3. $C_r$: This root system is associated with Lie algebra sp(2r). The long roots have (length)$^2$ = 4 and short roots have (length)$^2$ = 2:
\[ \Delta = \bigcup_{1 \le j < k \le r} \{\pm e_j \pm e_k\} \cup \bigcup_{j=1}^{r} \{\pm 2e_j\}, \qquad \Pi = \bigcup_{j=1}^{r-1} \{e_j - e_{j+1}\} \cup \{2e_r\} \tag{15} \]

4. $D_r$: This root system is associated with Lie algebra so(2r):
\[ \Delta = \bigcup_{1 \le j < k \le r} \{\pm e_j \pm e_k\}, \qquad \Pi = \bigcup_{j=1}^{r-1} \{e_j - e_{j+1}\} \cup \{e_{r-1} + e_r\} \tag{16} \]

See also: Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Functional Equations and Integrable Systems; Integrable Discrete Systems; Integrable Systems in Random Matrix Theory; Integrable Systems: Overview; Isochronous Systems; Toda Lattices.

Further Reading

Calogero F (1971) Solution of the one-dimensional N-body problem with quadratic and/or inversely quadratic pair potentials. Journal of Mathematical Physics 12: 419–436.
Calogero F (2001) Classical Many-Body Problems Amenable to Exact Treatments. New York: Springer.
Dunkl C (2001) Orthogonal Polynomials of Several Variables. Cambridge: Cambridge University Press.
Humphreys JE (1990) Reflection Groups and Coxeter Groups. Cambridge: Cambridge University Press.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials, 2nd edn. Oxford University Press.
Moser J (1975) Three integrable Hamiltonian systems connected with isospectral deformations. Advances in Mathematics 16: 197–220.
Olshanetsky MA and Perelomov AM (1983) Quantum integrable systems related to Lie algebras. Physics Reports 94: 313–404.
Ruijsenaars SNM (1999) Systems of Calogero–Moser Type. CRM Series in Mathematical Physics 1: 251–352. Springer.
Sasaki R (2001) Quantum Calogero–Moser Models: Complete Integrability for All the Root Systems. Proceedings Quantum Integrable Models and Their Applications, 195–240. World Scientific.
Sasaki R (2002) Quantum vs Classical Calogero–Moser Systems. NATO ARW Proceedings, Elba, Italy.
Stanley R (1989) Some combinatorial properties of Jack symmetric functions. Advances in Mathematics 77: 76–115.
Sutherland B (1972) Exact results for a quantum many-body problem in one dimension. II. Physical Review A 5: 1372–1376.
Quantum Central-Limit Theorems


A F Verbeure, Institute for Theoretical Physics, KU Leuven, Belgium
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Statistical physics deals with systems with many degrees of freedom and the problems concern finding procedures for the extraction of relevant physical quantities for these extremely complex systems. The idea is to find relevant reduction procedures which map the complex systems onto simpler, tractable models at the price of introducing elements of uncertainty. Therefore, probability theory is a natural mathematical tool in statistical physics. Since the early days of statistical physics, in classical (Newtonian) physical systems, it is natural to model the observables by a collection of random variables acting on a probability space. Kolmogorovian probability techniques and results are the main tools in the development of classical statistical physics. A random variable is usually considered as a measurable function with expectation given as its integral with respect to a probability measure. Alternatively, a random variable can also be viewed as a multiplication operator by the associated function. Different random variables commute as multiplication operators, and one speaks of a commutative probabilistic model.

Now, looking at genuine quantum systems, in many cases the procedure mentioned above leads to commutative probabilistic models, but there exist realms of physics where quantum noncommutative probabilistic concepts are unavoidable. Typical examples of such areas are quantum optics, low-temperature solid-state physics and ground-state physics such as quantum field theory. During the last 50 years physicists have developed more or less heuristic methods to deal with, for example, manifestations of fluctuations of typical quantum nature. In the last 30 years, mathematical foundations of such theories were also formulated, and a notion of quantum probability was launched as a branch of mathematical physics and mathematics (Cushen and Hudson 1971, Fannes and Quaegebeur 1983, Quaegebeur 1984, Hudson 1973, Giri and von Waldenfels 1978).

The aim of this article is to review briefly a few selected rigorous results concerning noncommutative limit theorems. This choice is made not only because of the author's interest but also for its close relation to concrete problems in statistical physics where one aims at understanding the macroscopic phenomena on the basis of the microscopic structure. A precise definition or formulation of a microscopic and a macroscopic system is of prime importance. The so-called algebraic approach of dynamical systems (Bratteli and Robinson 1979 and 2002) offers the necessary generality and mathematical framework to deal with classical and quantum, microscopic and macroscopic, finite and infinite systems. The observables of any system are assumed to be elements of a ($C^*$- or von Neumann) algebra $\mathcal{A}$, and the physical states are given by positive linear normalized functionals $\omega$ of $\mathcal{A}$, mapping the observables on their expectation values.

A common physicist's belief is that the macroscopic behavior of an idealized infinite system is described by a reduced set of macroscopic quantities (Sewell 1986). Some examples of these are the average densities of particles, energy, momentum, magnetic moment, etc. Analogously to the microscopic quantities, the macroscopic observables should be elements of an algebra, and macroscopic states of the system should be states on this algebra. The main problem is to construct the precise mathematical procedures to go from a given microscopic system to its macroscopic systems.

A well-known macroscopic system is the one given by the algebra of the observables at infinity (Lanford and Ruelle 1969) containing the spatial averages of local micro-observables, that is, for any local observable $A$ one considers the observable
\[ A_\omega = \omega\text{-}\lim_{V \to \infty} \frac{1}{V} \int_V dx\, \tau_x A \]
where $V$ is any finite volume in $\mathbb{R}^\nu$ and $\tau_x$ the translation over $x \in \mathbb{R}^\nu$, and where $\omega$-lim is the weak operator limit in the microstate $\omega$. The limits $A_\omega$ obtained correspond to the law of large numbers in probability. The algebra generated by these limit observables $\mathcal{A}_\omega = \{A_\omega \mid A \in \mathcal{A}\}$ is an abelian algebra of observables of a macroscopic system. This algebra can be identified with an algebra with pointwise product of measurable functions for some measure or macroscopic state.

The content of this review is to describe an analogous mapping from micro to macro but for a different type of scaling, namely the scaling of fluctuations. For any local observable $A \in \mathcal{A}$, one considers the limit
\[ \lim_{V} \frac{1}{V^{1/2}} \int_V dx\, \big(\tau_x A - \omega(\tau_x A)\big) \equiv F(A) \]
The problem consists in characterizing the $F(A)$ as an operator on a Hilbert space, called fluctuation
operator, and to specify the algebraic character of the set of all of these.

Based on this quantum central-limit theorem, one notes that not all locally different microscopic observables always yield different fluctuation operators. Hence the central-limit theorem realizes a well-defined procedure of coarse graining or reduction procedure which is handled by the mathematical notion of an equivalence relation on the microscopic observables yielding the same fluctuation operator.

In the following sections we discuss the preliminaries, the basic results about normal and abnormal fluctuations. Three model-independent applications are also discussed. In this review, we omit the properties of the so-called modulated fluctuations.

One should remark that we discuss only fluctuations in space. One can also consider timelike fluctuations. The theory of fluctuation operators for these has not been explicitly worked out so far. However, it is clear that for normal fluctuations the clustering properties of the time correlation functions will play a crucial role. On the other hand, typical properties of the structure of this fluctuation algebra may come up.

Another point which one has to stress is that all systems which are treated in this review are quasilocal systems. Other systems, for example, fermion systems, are not treated. But, in particular, fermion systems share many properties of quasilocality, and many of the results mentioned hold true also for fermion systems.

Preliminaries

Quantum Lattice Systems

Although all results we review can be extended to continuous or more general systems, modulo some technicalities, we limit ourselves to quasilocal quantum dynamical lattice systems.

We consider the quasilocal algebra built on a $\nu$-dimensional lattice $\mathbb{Z}^\nu$. Let $\mathcal{D}(\mathbb{Z}^\nu)$ be the directed set of finite subsets of $\mathbb{Z}^\nu$ where the direction is the inclusion. With each point $x \in \mathbb{Z}^\nu$ we associate an algebra ($C^*$- or von Neumann algebra) $\mathcal{A}_x$, all copies of an algebra $\mathcal{A}$. For all $\Lambda \in \mathcal{D}(\mathbb{Z}^\nu)$, the tensor product $\otimes_{x \in \Lambda} \mathcal{A}_x$ is denoted by $\mathcal{A}_\Lambda$. We take $\mathcal{A}$ to be nuclear; then there exists a unique $C^*$-norm on $\mathcal{A}_\Lambda$. Every copy $\mathcal{A}_x$ is naturally embedded in $\mathcal{A}_\Lambda$. The family $\{\mathcal{A}_\Lambda\}_{\Lambda \in \mathcal{D}(\mathbb{Z}^\nu)}$ has the usual relations of locality and isotony:
\[ [\mathcal{A}_{\Lambda_1}, \mathcal{A}_{\Lambda_2}] = 0 \quad \text{if } \Lambda_1 \cap \Lambda_2 = \emptyset \tag{1} \]
\[ \mathcal{A}_{\Lambda_1} \subseteq \mathcal{A}_{\Lambda_2} \quad \text{if } \Lambda_1 \subseteq \Lambda_2 \tag{2} \]
Denote by $\mathcal{A}_L$ all local observables, that is,
\[ \mathcal{A}_L = \bigcup_{\Lambda} \mathcal{A}_\Lambda \]
This algebra is naturally equipped with a $C^*$-norm $\|\cdot\|$ and its closure
\[ \mathcal{B} = \overline{\mathcal{A}_L} \]
is called a quasilocal $C^*$-algebra and considered as the microscopic algebra of observables of the system. Typical examples are spin systems where $\mathcal{A} = M_n$ is the $n \times n$ complex matrix algebra. In this case, every state $\omega$ of $\mathcal{B}$ is then locally normal, that is, there exists a family of density matrices $\{\rho_\Lambda \mid \Lambda \in \mathcal{D}(\mathbb{Z}^\nu)\}$ such that
\[ \omega(A) = \operatorname{tr} \rho_\Lambda A \quad \text{for all } A \in \mathcal{A}_\Lambda \]
An important group of $*$-automorphisms of $\mathcal{B}$ is the group of space translations $\{\tau_x, x \in \mathbb{Z}^\nu\}$:
\[ \tau_x : A_y \in \mathcal{A}_y \to \tau_x A_y = A_{x+y} \in \mathcal{A}_{y+x} \]
for all $A \in \mathcal{A}$.

Note that the quasilocal algebra $\mathcal{B}$ is asymptotically abelian for space translations: that is, for all $A, B \in \mathcal{B}$
\[ \lim_{|x| \to \infty} \|[A, \tau_x B]\| = 0 \]
A state $\omega$ of $\mathcal{B}$ represents a physical state of the system, assigning to every observable $A$ its expectation value $\omega(A)$. Therefore, this setting can be viewed as the quantum analog of the classical probabilistic setting. Sequences of random variables or observables can be constructed by considering an observable and its translates, that is, $(\tau_x(A))_{x \in \mathbb{Z}^\nu}$ is a noncommutative random field. If a state $\omega$ is translation invariant, that is, $\omega \circ \tau_x = \omega$ for all $x$, then all $\tau_x(A)$ are identically distributed random variables. The mixing property of the random field is then expressed by the spatial correlations tending to zero:
\[ \omega\big(\tau_x(A)\,\tau_y(B)\big) - \omega(\tau_x(A))\,\omega\big(\tau_y(B)\big) \to 0 \tag{3} \]
if $|x - y| \to \infty$.

One of the basic limit theorems of probability theory is the weak law of large numbers. In this noncommutative setting the law of large numbers is translated into the problem of the convergence of space averages of an observable $A \in \mathcal{B}$. A first result was given by the mean ergodic theorem of von Neumann (1929). In Bratteli and Robinson (1979, 2002) one finds the following theorem: if the state $\omega$ is space translation invariant and mixing (see [3]) then for all $A$, $B$, and $C$ in $\mathcal{B}$
\[ \lim_{\Lambda \to \mathbb{Z}^\nu} \omega\Big( A \Big( \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x(B) \Big) C \Big) = \omega(AC)\,\omega(B) \tag{4} \]
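The averaging statement [4] can be made concrete in the simplest case. The sketch below assumes a product state on a spin-1/2 lattice with one-site density matrix `RHO`, and one-site observables A, B, C all located at the origin; these choices and all function names are illustrative assumptions. For such a state, every term with x different from the origin factorizes as ω(AC)ω(B), and only the x = 0 term differs.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expect(rho, X):
    """One-site expectation value tr(rho X)."""
    P = matmul(rho, X)
    return sum(P[i][i] for i in range(len(P)))


def averaged_expectation(rho, A, B, C, n_sites):
    """omega(A S_Lambda(B) C) for one-site A, B, C at the origin, a product
    state with one-site density matrix rho, and a box Lambda of n_sites
    lattice points that contains the origin."""
    at_origin = expect(rho, matmul(matmul(A, B), C))        # the term x = 0
    elsewhere = expect(rho, matmul(A, C)) * expect(rho, B)  # each term x != 0
    return (at_origin + (n_sites - 1) * elsewhere) / n_sites


def limit_value(rho, A, B, C):
    """Right-hand side of [4]: omega(AC) omega(B)."""
    return expect(rho, matmul(A, C)) * expect(rho, B)


# Illustrative data (assumptions): a diagonal density matrix and Pauli matrices.
RHO = [[0.75, 0.0], [0.0, 0.25]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]
```

Calling `averaged_expectation(RHO, SIGMA_X, SIGMA_Z, SIGMA_X, n)` for growing `n` approaches `limit_value(RHO, SIGMA_X, SIGMA_Z, SIGMA_X)`, with a correction of order 1/n coming from the single site where B does not commute past A and C. Inside expectation values, the space average of B thus acts like the number ω(B).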
That is, in the GNS (Gelfand–Naimark–Segal) representation of the state $\omega$, the sequence $S_\Lambda(B) = (1/|\Lambda|) \sum_{x \in \Lambda} \tau_x B$ converges weakly to a multiple of the identity: $S(B) \equiv \omega(B) 1$. This theorem, called the mean ergodic theorem, characterizes the class of states yielding a weak law of large numbers. Clearly, these limits $\{S(A) \mid A \in \mathcal{B}\}$ form a trivial abelian algebra of macroscopic observables.

Now we go a step further and consider space fluctuations. Define the local fluctuation of an observable $A$ in a homogeneous (spatially invariant) state $\omega$ by
\[ F_\Lambda(A) = \frac{1}{|\Lambda|^{1/2}} \sum_{x \in \Lambda} \big(\tau_x A - \omega(A)\big) \tag{5} \]
The problem is to give a rigorous meaning to $\lim_\Lambda F_\Lambda(A)$ for $\Lambda$ tending to $\mathbb{Z}^\nu$ in the sense of extending boxes. When does such a limit exist? What are the properties of the fluctuations or the limits $F(A) = \lim_\Lambda F_\Lambda(A)$, etc.? Again, the $F(A)$ are macroscopic variables of the microsystem.

Already we remark the following: if $A$, $B$ are strictly local elements, $A, B \in \mathcal{A}_L$, then
\[ \sum_{y \in \mathbb{Z}^\nu} [A, \tau_y B] \in \mathcal{A}_L \]
and an easy computation yields, by [4],
\[ \begin{aligned} \text{weak-}\lim_\Lambda\, [F_\Lambda(A), F_\Lambda(B)] &= \text{weak-}\lim_\Lambda \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x \Big( \sum_{y \in \Lambda} [A, \tau_{y-x} B] \Big) \\ &= \text{weak-}\lim_\Lambda \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x \Big( \sum_{y \in \mathbb{Z}^\nu} [A, \tau_y B] \Big) \\ &= \sum_{y \in \mathbb{Z}^\nu} \omega\big([A, \tau_y B]\big) \equiv i\sigma(A, B)\,1 \end{aligned} \]
that is, if the $F(A)$ and $F(B)$ limits do exist, then
\[ [F(A), F(B)] = i\sigma(A, B)\,1 \tag{6} \]
This property indicates that fluctuations should have the same commutation relations as boson fields. If fluctuations can be characterized as macroscopic observables, they must satisfy the canonical commutation relations (CCRs). Therefore, in the next section we introduce the essentials on CCR representations.

CCR Representations

We present the abstract Weyl CCR $C^*$-algebra. More details can be found in Bratteli and Robinson (1979, 2002) and in particular in Manuceau et al. (1973), where the case of a real test function space $(H, \sigma)$ with a possibly degenerate symplectic form $\sigma$ is treated. Hence, $H$ is a real vector space and $\sigma$ a bilinear, antisymmetric form on $H$.

Denote by $W(H, \sigma)$ the complex vector space generated by the functions $W(f)$, $f \in H$, defined by
\[ W(f) : H \to \mathbb{C} : g \mapsto W(f)g = \begin{cases} 0 & \text{if } f \neq g \\ 1 & \text{if } f = g \end{cases} \]
$W(H, \sigma)$ becomes an algebra with unit $W(0)$ for the product
\[ W(f)W(g) = W(f + g)\, e^{-(i/2)\sigma(f, g)}, \qquad f, g \in H \]
and a $*$-algebra for the involution
\[ W(f) \to W(f)^* = W(-f) \]
It becomes a $C^*$-algebra $C^*(H, \sigma)$ following the construction of Verbeure and Zagrebnov (1992).

A linear functional $\omega$ of a $C^*$-algebra $C^*(H, \sigma)$ is called a state if $\omega(I) = 1$ and $\omega(A^*A) \geq 0$ for all $A \in C^*(H, \sigma)$ and $I = W(0)$. Every state gives rise to a representation through the GNS construction (Bratteli and Robinson 1979, 2002). In particular, $\omega$ is a state if for any choice of $A = \sum_j c_j W(f_j)$ we have
\[ \sum_{jk} c_j \bar{c}_k\, \omega\big(W(f_j - f_k)\big)\, e^{-(i/2)\sigma(f_j, f_k)} \geq 0, \qquad \omega(W(0)) = 1 \]
A remark about the special case that $\sigma$ is degenerate is in order. Denote by $H_0$ the kernel of $\sigma$:
\[ H_0 = \{ f \in H \mid \sigma(f, g) = 0 \text{ for all } g \in H \} \]
If $H = H_0 \oplus H_1$ with $\sigma_1$ a nondegenerate symplectic form on $H_1$ and $\sigma_1$ equal to the restriction of $\sigma$ to $H_1$, we have that $C^*(H, \sigma)$ is a tensor product:
\[ C^*(H, \sigma) = C^*(H_0, 0) \otimes C^*(H_1, \sigma_1) \]
Note that $C^*(H_0, 0)$ is abelian and that each positive-definite normalized functional $\varphi$,
\[ \varphi : h \in H_0 \to \varphi(W(h)) \]
defines a state $\omega(W(h)) = \varphi(W(h))$ on $C^*(H_0, 0)$.

Let $\chi$ be any character of the abelian additive group $H$, then the map $\tau_\chi$,
\[ \tau_\chi W(f) = \chi(f) W(f) \]
extends to a $*$-automorphism of $C^*(H, \sigma)$. Let $s$ be a positive symmetric bilinear form on $H$ such that for all $f, g \in H$:
\[ \tfrac{1}{4} |\sigma(f, g)|^2 \leq s(f, f)\, s(g, g) \tag{7} \]
and let $\omega_{s,\chi}$ be the linear functional on $C^*(H, \sigma)$ given by
\[ \omega_{s,\chi}(W(h)) = \chi(h)\, e^{-(1/2) s(h, h)} \tag{8} \]
then it is straightforward (Bratteli and Robinson 1979, 2002) to check that $\omega_{s,\chi}$ is a state on $C^*(H, \sigma)$. All states of the type [8] are called quasifree states on the CCR algebra $C^*(H, \sigma)$.

A state $\omega$ of $C^*(H, \sigma)$ is called a regular state if, for all $f, g \in H$, the map $\lambda \in \mathbb{R} \to \omega(W(\lambda f + g))$ is continuous. The regularity property of a state yields the existence of a Bose field as follows. Let $(\mathcal{H}, \pi, \Omega)$ be the GNS representation (Bratteli and Robinson 1979, 2002) of the state $\omega$, then the regularity of $\omega$ implies that there exists a real linear map $b : H \to \mathcal{L}(\mathcal{H})$ (linear operators on $\mathcal{H}$) such that $\forall f \in H$: $b(f)^* = b(f)$ and
\[ \pi(W(f)) = \exp(i b(f)) \]
The map $b$ is called the Bose field satisfying the Bose field commutation relations:
\[ [b(f), b(g)] = i\sigma(f, g) \tag{9} \]
Note that the Bose fields are state dependent. Note also that if $\chi$ is a continuous character of $H$, then any quasifree state [8] is a regular state guaranteeing the existence of a Bose field.

Normal Fluctuations

In this section we develop the theory of normal fluctuations for $\nu$-dimensional quantum lattice systems with a quasilocal structure (see the section "Quantum lattice systems") and for technical simplicity we assume that the local $C^*$-algebras $\mathcal{A}_x$, $x \in \mathbb{Z}^\nu$, are copies of the matrix algebra $M_n(\mathbb{C})$ of $n \times n$ complex matrices. Most of the results stated can be extended to the case where $\mathcal{A}_x$ is a general $C^*$-algebra (Goderis et al. 1989, 1990, Goderis and Vets 1989).

We consider a physical system $(\mathcal{B}, \omega)$ where $\omega$ is a translation-invariant state of $\mathcal{B}$, that is, $\omega \circ \tau_x = \omega$ for all $x \in \mathbb{Z}^\nu$. Later on we extend the situation to a $C^*$-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ and analyze the properties of the dynamics $\alpha_t$ under the central limit.

For any local $A$ we introduced its local fluctuation in the state $\omega$ of the system:
\[ F_\Lambda(A) = \frac{1}{|\Lambda|^{1/2}} \sum_{x \in \Lambda} \big(\tau_x A - \omega(A)\big) \tag{10} \]
The main problem is to give a rigorous mathematical meaning to the limits
\[ \lim_{\Lambda \to \infty} F_\Lambda(A) \equiv F(A) \]
where the limit is taken for any increasing $\mathbb{Z}^\nu$-absorbing sequence $\{\Lambda\}$ of finite volumes of $\mathbb{Z}^\nu$. The limits $F(A)$ are called the macroscopic fluctuation operators of the system $(\mathcal{B}, \omega)$.

Earlier work (Cushen and Hudson 1971, Sewell 1986) already suggested that the fluctuations behave like bosons. We complete this idea by proving that one gets a well-defined representation of a CCR $C^*$-algebra of fluctuations uniquely defined by the original system $(\mathcal{B}, \omega)$.

Denote by $\mathcal{A}_{L,\mathrm{sa}}$ and $\mathcal{B}_{\mathrm{sa}}$ the real vector space of the self-adjoint elements of $\mathcal{A}_L$, respectively, $\mathcal{B}$.

Definition 1 An observable $A \in \mathcal{B}_{\mathrm{sa}}$ satisfies the central-limit theorem if
(i) $\lim_\Lambda \omega(F_\Lambda(A)^2) \equiv s_\omega(A, A)$ exists and is finite, and
(ii) $\lim_\Lambda \omega(e^{itF_\Lambda(A)}) = e^{-(t^2/2) s_\omega(A, A)}$ for all $t \in \mathbb{R}$.

Clearly, our definition coincides with the notion in terms of characteristic functions, for classical systems ($\mathcal{A}$ abelian) equivalent with the notion of convergence in distribution. For quantum systems, there does not exist a standard notion of "convergence in distribution." Only the concept of expectations is relevant. This does not exclude the notion of central-limit theorem in terms of the moments, which is the analog of the moment problem (Giri and von Waldenfels 1978).

Definition 2 The system $(\mathcal{B}, \omega)$ is said to have normal fluctuations if $\omega$ is translation invariant and if
(i) $\forall A, B \in \mathcal{A}_L$: $\sum_{x \in \mathbb{Z}^\nu} |\omega(A \tau_x B) - \omega(A)\omega(B)| < \infty$
(ii) the central-limit theorem holds for all $A \in \mathcal{A}_{L,\mathrm{sa}}$.

Note that (i) implies that the state $\omega$ is mixing for space translations. Also by (i), one can define a sesquilinear form on $\mathcal{A}_L$:
\[ \langle A, B \rangle_\omega = \lim_\Lambda \omega\big(F_\Lambda(A^*) F_\Lambda(B)\big) = \sum_{x \in \mathbb{Z}^\nu} \big(\omega(A^* \tau_x B) - \omega(A^*)\omega(B)\big) \]
and denote
\[ s_\omega(A, B) = \operatorname{Re} \langle A, B \rangle_\omega, \qquad \sigma_\omega(A, B) = 2 \operatorname{Im} \langle A, B \rangle_\omega \]
For $A, B \in \mathcal{A}_{L,\mathrm{sa}}$ one has
\[ \sigma_\omega(A, B) = -i \sum_{x \in \mathbb{Z}^\nu} \omega([A, \tau_x B]) \tag{11} \]
\[ s_\omega(A, A) = \langle A, A \rangle_\omega \tag{12} \]
Clearly, $(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ is a symplectic space and $s_\omega$ a non-negative symmetric bilinear form on $\mathcal{A}_{L,\mathrm{sa}}$.
Following the discussion in the section "CCR representations" we get a natural CCR $C^*$-algebra $C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ defined on this symplectic space. The following theorem is an essential step in the construction of a macroscopic physical system of fluctuations of the microsystem $(\mathcal{B}, \omega)$.

Theorem 1 If the system $(\mathcal{B}, \omega)$ has normal fluctuations, then the limits $\{\lim_\Lambda \omega(e^{iF_\Lambda(A)}) = \exp(-(1/2) s_\omega(A, A)),\ A \in \mathcal{A}_L\}$ define a quasifree state $\tilde{\omega}$ on the CCR $C^*$-algebra $C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ by
\[ \tilde{\omega}(W(A)) = \exp\big(-\tfrac{1}{2} s_\omega(A, A)\big) \]

Proof The proof is clear from the definition [8] if one can prove that the positivity condition [7] holds. But the latter follows readily from
\[ \tfrac{1}{4} |\sigma_\omega(A, B)|^2 = \lim_\Lambda |\operatorname{Im} \omega(F_\Lambda(A) F_\Lambda(B))|^2 \leq \lim_\Lambda \omega(F_\Lambda(A)^2)\, \omega(F_\Lambda(B)^2) = s_\omega(A, A)\, s_\omega(B, B) \]
by the Schwarz inequality. □

This theorem indicates that the quantum-mechanical alternative for (classical) Gaussian measures are quasifree states on CCR algebras. However, the following basic question arises: is it possible to take the limits of products of the form
\[ \lim_\Lambda \omega\big(e^{iF_\Lambda(A)} e^{iF_\Lambda(B)} \cdots\big) \]
and, if they exist, do they preserve the CCR structure? Clearly, this is a typical noncommutative problem.

Using the following general bounds: for $C = C^*$ and $D = D^*$ norm-bounded operators one has
\[ \|e^{i(C+D)} - e^{iC}\| \leq \|D\| \]
\[ \|[e^{iC}, e^{iD}]\| \leq \|[C, D]\| \]
\[ \|e^{i(C+D)} - e^{iC} e^{iD}\| \leq \tfrac{1}{2} \|[C, D]\| \]
and by the expansion of the exponential function one proves easily that
\[ \lim_\Lambda \big\| e^{iF_\Lambda(A)} e^{iF_\Lambda(B)} - e^{i(F_\Lambda(A) + F_\Lambda(B))}\, e^{-(1/2)[F_\Lambda(A), F_\Lambda(B)]} \big\| = 0 \tag{13} \]
if $A$ and $B$ are one-point observables, that is, if $A, B \in \mathcal{A}_{\{0\}}$. For general local elements the proof is somewhat more technical and can be based on a Bernstein-like argument (for details see Goderis and Vets (1989)). The property [13] can be seen as a Baker–Campbell–Hausdorff formula for fluctuations.

From [13], the mean ergodic theorem, and Theorem 1 we get:

Theorem 2 If the system $(\mathcal{B}, \omega)$ has normal fluctuations then for $A, B \in \mathcal{A}_{L,\mathrm{sa}}$:
\[ \lim_\Lambda \omega\big(e^{iF_\Lambda(A)} e^{iF_\Lambda(B)}\big) = \exp\Big(-\frac{1}{2} s_\omega(A + B, A + B) - \frac{i}{2} \sigma_\omega(A, B)\Big) = \tilde{\omega}(W(A)W(B)) \]
with $\tilde{\omega}$ a quasifree state on the CCR algebra $C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$.

Theorems 1 and 2 describe completely the topological and analytical aspects of the quantum central-limit theorem under the condition of normal fluctuations (Definition 2). In fact, the quantum central limit yields, for every microphysical system $(\mathcal{B}, \omega)$, a macrophysical system $(C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega), \tilde{\omega})$ defined by the CCR $C^*$-algebra of fluctuation observables $C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ in the representation defined by the quasifree state $\tilde{\omega}$. As the state $\tilde{\omega}$ is a quasifree state, it is a regular state, that is, the map $\lambda \in \mathbb{R} \to \tilde{\omega}(W(\lambda A + B))$ is continuous. From the section "CCR representations" we know that this regularity property yields the existence of a Bose field, that is, there exists a real linear map
\[ F : A \in \mathcal{A}_{L,\mathrm{sa}} \to F(A) \]
where $F(A)$ is a self-adjoint operator on the GNS representation space $\tilde{\mathcal{H}}$ of $\tilde{\omega}$, such that for all $A, B \in \mathcal{A}_{L,\mathrm{sa}}$:
\[ [F(A), F(B)] = i\sigma_\omega(A, B) \]
Moreover, if one has a complex structure $J$ on $(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ such that $J^2 = -1$ and for all $A, B \in \mathcal{A}_{L,\mathrm{sa}}$:
\[ \sigma_\omega(JA, B) = -\sigma_\omega(A, JB), \qquad \sigma_\omega(A, JA) > 0 \]
then one defines the boson creation and annihilation operators
\[ F^{\pm}(A) = \frac{1}{\sqrt{2}} \big(F(A) \mp iF(JA)\big) \]
satisfying the usual boson commutation relations
\[ [F^{-}(A), F^{+}(B)] = \sigma_\omega(A, JB) + i\sigma_\omega(A, B) \]
Finally, it is straightforward, nevertheless important, to remark that Theorems 1 and 2 hold true if the linear space of local observables $\mathcal{A}_{L,\mathrm{sa}}$ is replaced by any of its subspaces. Some of them can have greater physical importance than others. This means that the quantum central-limit theorems can realize several macrophysical systems of fluctuations. But all of them are Bose field systems.
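The boson commutation relations for the creation and annihilation operators can be verified in a few lines. The derivation below is a sketch under stated sign conventions, which are assumptions wherever the printed formulas are ambiguous: $F^{\pm}(A) = (F(A) \mp iF(JA))/\sqrt{2}$, $[F(A), F(B)] = i\sigma_\omega(A,B)$, $J^2 = -1$ and $\sigma_\omega(JA, B) = -\sigma_\omega(A, JB)$.

```latex
\begin{align*}
[F^{-}(A), F^{+}(B)]
 &= \tfrac{1}{2}\,\big[F(A) + iF(JA),\; F(B) - iF(JB)\big] \\
 &= \tfrac{1}{2}\Big( [F(A),F(B)] - i[F(A),F(JB)] + i[F(JA),F(B)] + [F(JA),F(JB)] \Big) \\
 &= \tfrac{1}{2}\Big( i\sigma_\omega(A,B) + \sigma_\omega(A,JB) - \sigma_\omega(JA,B) + i\sigma_\omega(JA,JB) \Big) \\
 &= \sigma_\omega(A,JB) + i\sigma_\omega(A,B).
\end{align*}
```

The last step uses $-\sigma_\omega(JA, B) = \sigma_\omega(A, JB)$ and $\sigma_\omega(JA, JB) = -\sigma_\omega(A, J^2 B) = \sigma_\omega(A, B)$. With the opposite sign choice in the definition of $F^{\pm}$ the real part of the result would change sign, which is why the convention is fixed explicitly here.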
It is also important to remark that these results end up in giving a probabilistic canonical basis of the canonical commutation relations.

Now we analyze the notion of coarse graining due to the quantum central limit. Consider on $\mathcal{A}_L$ the sesquilinear form (see [11], [12]) again
\[ \langle A, B \rangle_\omega = \sum_{x \in \mathbb{Z}^\nu} \big(\omega(A^* \tau_x B) - \omega(A^*)\omega(B)\big) = s_\omega(A, B) + \tfrac{i}{2}\sigma_\omega(A, B) \tag{14} \]
This form defines a topology on $\mathcal{A}_L$ which is not comparable with the operator topologies induced by $\omega$. In fact, this form is not closable in the weak, strong, ultraweak, or ultrastrong operator topologies. We call $A$ and $B$ in $\mathcal{A}_L$ equivalent, denoted by $A \sim B$, if $\langle A - B, A - B \rangle_\omega = 0$. Clearly, this defines an equivalence relation on $\mathcal{A}_L$. The property of coarse graining is mathematically characterized by the following: for all $A, B \in \mathcal{A}_{L,\mathrm{sa}}$ the relation $A \sim B$ is equivalent with $F(A) = F(B)$. Suppose first that $F(A) = F(B)$, then
\[ [W(A), W(B)] = 0 \]
hence $\sigma_\omega(A, B) = 0$. Therefore, from Theorem 1:
\[ 1 = \tilde{\omega}(W(A)W(B)^*) = \tilde{\omega}(W(A)W(-B)) = \tilde{\omega}(W(A - B)) = \exp\big(-\tfrac{1}{2} s_\omega(A - B, A - B)\big) \]
and from [12] and [14]: $\langle A - B, A - B \rangle_\omega = 0$. The converse is equally straightforward.

From this property, it follows immediately that, for example, the action of the translation group is trivial or that $F(\tau_x A) = F(A)$ for all $x \in \mathbb{Z}^\nu$. Therefore, the map $F : \mathcal{A}_{L,\mathrm{sa}} \to C^*(\mathcal{A}_{L,\mathrm{sa}}, \sigma_\omega)$ is not injective. This expresses the physical phenomenon of coarse graining and gives a mathematical meaning to the fluctuations being macroscopic observables.

In the above, we have constructed the new macroscopic physical system of quantum fluctuations for any microsystem with the property of normal fluctuations (see Definition 2). The main problem remains: when does the microsystem have normal fluctuations? We end this section with the formulation of a general sufficient clustering condition for the microstate $\omega$ in order that the microsystem $(\mathcal{B}, \omega)$ has normal fluctuations.

Let $\Lambda, \Lambda' \in \mathcal{D}(\mathbb{Z}^\nu)$ and $\omega$ a translation invariant state, denote
\[ \alpha_\omega(\Lambda, \Lambda') = \sup_{\substack{A \in \mathcal{A}_\Lambda,\ \|A\| = 1 \\ B \in \mathcal{A}_{\Lambda'},\ \|B\| = 1}} |\omega(AB) - \omega(A)\omega(B)| \]
The cluster function $\alpha_N^\omega(d)$ is defined by
\[ \alpha_N^\omega(d) = \sup \{ \alpha_\omega(\Lambda, \Lambda') : d(\Lambda, \Lambda') \geq d \text{ and } \max(|\Lambda|, |\Lambda'|) \leq N \} \]
where $N, d \in \mathbb{R}^+$ and $d(\Lambda, \Lambda')$ is the Euclidean distance between $\Lambda$ and $\Lambda'$. It is obvious that
\[ \alpha_N^\omega(d) \leq \alpha_N^\omega(d') \quad \text{if } d \geq d' \]
\[ \alpha_N^\omega(d) \leq \alpha_{N'}^\omega(d) \quad \text{if } N \leq N' \]
The clustering condition is expressed by the following scaling law:
\[ \exists \delta > 0 : \lim_{N \to \infty} N^{1/2}\, \alpha_N^\omega\big(N^{1/(2\nu) - \delta}\big) = 0 \tag{15} \]
or, equivalently,
\[ \exists \delta > 0 : \lim_{N \to \infty} N^{\nu + \delta}\, \alpha_{N^{2(\nu + \delta)}}^\omega(N) = 0 \tag{16} \]
Note that this condition implies that
\[ \sum_{x \in \mathbb{Z}^\nu} \alpha_N^\omega(|x|) < \infty \]
that is, that the function $\alpha_N^\omega(|\cdot|)$ is an $L^1(\mathbb{Z}^\nu)$-function for all $N$. In fact, this condition corresponds to the uniform mixing condition in the commutative (classical) central-limit theorem (see, e.g., Ibragimov and Linnik (1971)). This condition can also be called the modulus of decoupling.

Product states, for example, equilibrium states of mean-field systems, are uniformly clustering with $\alpha_N^\omega(d) = 0$ for $d > 0$.

The normality of the fluctuations of the microsystem $(\mathcal{B}, \omega)$ for product states is proved and extensively studied in Goderis et al. (1989), and for states satisfying the condition [15] or [16] in Goderis and Vets (1989). In the latter case, the proofs are very technical and based on a generalization of the well-known Bernstein argument (Ibragimov and Linnik 1971) of the classical central-limit theorem to the noncommutative situation. A refinement of these arguments can be found in Goderis et al. (1990). For the sake of formal self-consistency we formulate the theorem:

Theorem 3 (Central-limit theorem) Take the microsystem $(\mathcal{B}, \omega)$ such that $\omega$ is lattice translation invariant and satisfies the clustering condition [15]; then the system has normal fluctuations for all elements of the vector space of local observables $\mathcal{A}_{L,\mathrm{sa}}$. □

In Goldshtein (1982) a noncommutative central-limit theorem is derived using similar techniques. The main difference, however, is its strictly local character, namely for one local operator separately. The conditions depend on the spectral properties of the operator. It excludes a global approach resulting in a CCR algebra structure.

Even for quantum lattice systems, it is not straightforward to check whether a state satisfies
the degree of mixing as expressed in conditions [15]–[16]. Clearly, one expects the condition to hold for equilibrium states at high enough temperatures. For quantum spin chains, a theorem analogous to Theorem 3 under weaker conditions than [15] is proved, for example, in Matsui (2003).

So far we have reviewed the quantum central-limit theorem for physical $C^*$-spin systems $(\mathcal{B}, \omega)$ with normal fluctuations.

Now we extend the physical system to a $C^*$-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ (Bratteli and Robinson 1979, 2002) and we investigate the properties of the dynamics $\alpha_t$ under the central limit. As usual, the dynamics is supposed to be of the short-range type in order to guarantee the norm limit:

$$\alpha_t(\cdot) = \text{n-}\lim_\Lambda e^{itH_\Lambda} \cdot e^{-itH_\Lambda}$$

and space homogeneous: $\alpha_t \cdot \tau_x = \tau_x \cdot \alpha_t$, $\forall t \in \mathbb{R}$, $\forall x \in \mathbb{Z}^\nu$. We suppose that the state $\omega$ is both space and time translation invariant. Moreover, we assume that the state $\omega$ satisfies the mixing condition [15] for normal fluctuations.

In [10] we defined, for every local $A \in \mathcal{A}_{L,sa}$, the local fluctuation $F_\Lambda(A)$ and obtained a clear meaning of $F(A) = \lim_\Lambda F_\Lambda(A)$ from the central-limit theorem. Now we are interested in the dynamics of the fluctuations $F(A)$. Clearly, for all $A \in \mathcal{A}_{L,sa}$ and all finite $\Lambda$:

$$\alpha_t F_\Lambda(A) = F_\Lambda(\alpha_t A) \qquad [17]$$

and one is tempted to define the dynamics $\tilde\alpha_t$ of the fluctuations in the $\Lambda$-limit by the formula

$$\tilde\alpha_t F(A) = F(\alpha_t A) \qquad [18]$$

Note, however, that in general $\alpha_t A$ is not a local element of $\mathcal{A}_{L,sa}$. It is unclear whether the central limit of elements of the type $\alpha_t A$, with $A \in \mathcal{A}_{L,sa}$, exists or not, and hence whether one can give a meaning to $F(\alpha_t A)$. Moreover, if $F(\alpha_t A)$ exists, it remains to prove that $(\tilde\alpha_t)_t$ defines a weakly continuous group of $*$-automorphisms on the fluctuation CCR algebra $\tilde{\mathcal{M}} = C^*(\mathcal{A}_{L,sa}, \sigma_\omega)''$ (the von Neumann algebra generated by the $\tilde\omega$-representation of $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$). All this needs a proof. In Goderis et al. (1990), one finds the proof of the following basic theorem about the dynamics.

Theorem 4 Under the conditions on the dynamics $\alpha_t$ and on the state $\omega$ expressed above, the limit $F(\alpha_t A) = \lim_\Lambda F_\Lambda(\alpha_t A)$ exists as a central limit as in Theorem 2, and the maps $\tilde\alpha_t$ defined by [18] extend to a weakly continuous one-parameter group of $*$-automorphisms of the von Neumann algebra $\tilde{\mathcal{M}}$. The quasifree state $\tilde\omega$ is $\tilde\alpha_t$-invariant (time invariant).

This theorem yields the existence of a dynamics $\tilde\alpha_t$ on the fluctuations algebra and shows that it is of the quasifree type

$$\tilde\alpha_t F(A) = F(\alpha_t A)$$

where $F(A)$ is a representation of a Bose field in a quasifree state $\tilde\omega$, the noncommutative version of a Gaussian distribution. In physical terms, it also means that any microdynamics $\alpha_t$ induces a linear process on the level of its fluctuations.

We can conclude that, on the basis of Theorems 3 and 4, the quantum central-limit theorem realizes a map from the microdynamical system $(\mathcal{B}, \omega, \alpha_t)$ to a macrodynamical system $(C^*(\mathcal{A}_{L,sa}, \sigma_\omega), \tilde\omega, \tilde\alpha_t)$ of the quantum fluctuations. The latter system is a quasifree Boson system.

Note that, contrary to the central-limit theorem, the law of large numbers [4] maps local observables to their averages, forming a trivial commutative algebra of macro-observables. The macrodynamics is mapped to a trivial dynamics as well. Therefore, the consideration of the law of large numbers does not allow one to observe genuine quantum phenomena. On the other hand, on the level of the fluctuations, macroscopic quantum phenomena are observable.

Abnormal Fluctuations

The results about normal fluctuations in the last section contain two essential elements. On the one hand, the central limit has to exist. The condition in order that this occurs is the validity of the cluster condition ([15] or [16]) guaranteeing the normality of the fluctuations. On the other hand, there is the reconstruction theorem, identifying the CCR algebra representation of the fluctuation observables or operators in the quasifree state, which is denoted by $\tilde\omega$.

The cluster condition is in general not satisfied for systems with long-range correlations, for example, for equilibrium states at low temperatures with phase transitions. It is a challenging question to also study in this case the existence of fluctuation operators and, if they exist, to study their mathematical structure. Here we detect structures other than the CCR structure, other states or distributions different from quasifree states, etc.

Progress in the elucidation of all these questions started with a detailed study of abnormal fluctuations in the harmonic and anharmonic crystal models (Verbeure and Zagrebnov 1992, Momont et al. 1997). More general Lie algebras are obtained than the Heisenberg Lie algebra of the CCR algebra, and more general states $\tilde\omega$ or quantum distributions

are computed beyond quasifree states, which is the case for normal fluctuations.

Abnormal fluctuations turn up if one has an ergodic state $\omega$ with long-range correlations. We have in mind continuous (second-order) phase transitions; then, typically, the heat capacity or some more general susceptibilities diverge at critical points or lines. This means that normally scaled (with the factor $|\Lambda|^{-1/2}$) fluctuations of some observables diverge. This is equivalent to the divergence of sums of the type

$$\sum_{x \in \mathbb{Z}^\nu} \left( \omega(A \tau_x A) - \omega(A)^2 \right)$$

for some local observable $A$.

In order to deal with these situations, we rescale the local fluctuations. One determines a scaling index $\delta_A \in (-1/2, 1/2)$, depending on the observable $A$, such that the abnormally scaled local fluctuations

$$F^{\delta_A}_\Lambda(A) = |\Lambda|^{-\delta_A} F_\Lambda(A)$$

with $F_\Lambda(A)$ as in [10], yield a nontrivial characteristic function: $\forall t \in \mathbb{R}$,

$$\lim_\Lambda \omega_\Lambda\left( e^{itF^{\delta_A}_\Lambda(A)} \right) \equiv \phi_A(t) \qquad [19]$$

where we limit our discussion to states $\omega_\Lambda$ that are local Gibbs states. The index $\delta_A$ is a measure for the abnormality of the fluctuation of $A$. Note that $\delta_A = -1/2$ yields a triviality and that $\delta_A = 1/2$ would lead to a law of large numbers (theory of averages). Observe also that in general the characteristic function $\phi_A$ or the corresponding state $\tilde\omega$ need not be Gaussian or quasifree.

In the physics literature, one describes the long-range order by means of the asymptotic form of the connected two-point function in terms of the critical exponent $\eta$:

$$\omega(A \tau_x A) - \omega(A)^2 \simeq \mathcal{O}\left( \frac{1}{|x|^{\nu - 2 + \eta}} \right), \quad |x| \to \infty \qquad [20]$$

Our scaling index $\delta_A$ is related to the critical exponent $\eta$ by the straightforward relation

$$\eta = 2 - 2\nu\delta_A$$

As stated above, the index $\delta_A$ is determined by the existence of the central limit and is explicitly computed in several model calculations, for example, in Verbeure and Zagrebnov (1992), and for equilibrium states. Apart from the strong model dependence, the indices also depend strongly on the chosen boundary conditions. This fact sheds new light on the universality of the critical exponents.

Suppose now that the indices $\delta_A$ are determined by the existence of the central limit [19]. The next problem is to find out whether also in these cases a reconstruction theorem, comparable to, for example, Theorem 2, can be proved, giving again a mathematical meaning to the limits

$$\lim_\Lambda F^{\delta_A}_\Lambda(A) \equiv F^{\delta_A}(A) \qquad [21]$$

as operators, in general unbounded, on a Hilbert space.

Here we develop a proof of the Lie algebra character of the abnormal fluctuations under the conditions: (1) the $\delta$-indices are determined by the existence of the variances (second moments), and (2) the existence of the third moments (for more details see, e.g., Momont et al. (1997)).

Consider a local algebra, namely an $n$-dimensional vector space $\mathcal{G}$ with basis $\{v_i\}_{i=1,\ldots,n}$ and product

$$v_j \cdot v_k \equiv [v_j, v_k] = \sum_{\ell=1}^{n} c^\ell_{jk} v_\ell \qquad [22]$$

with structure constants $c^\ell_{jk}$ satisfying

$$c^\ell_{jk} + c^\ell_{kj} = 0$$

$$\sum_r \left( c^r_{ij} c^s_{rk} + c^r_{jk} c^s_{ri} + c^r_{ki} c^s_{rj} \right) = 0$$

Consider the concrete Lie algebra basis of operators in $\mathcal{A}_{\{0\}}$

$$\{ L_0 = i\mathbb{1}, L_1, \ldots, L_m \}, \quad m < \infty$$

such that $L_j^* = -L_j$, $j = 0, 1, \ldots, m$, and $\omega(L_j) = \lim_\Lambda \omega_\Lambda(L_j) = 0$ for $j > 0$. Clearly, $\omega_\Lambda(L_0) = i$ for all $\Lambda$, and the $\{L_i\}$ satisfy eqn [22]. Because of the special choice of $L_0$ one has $c^\ell_{0k} = c^\ell_{k0} = 0$ and $c^0_{jk} = -i \lim_\Lambda \omega_\Lambda([L_j, L_k])$. We consider now the fluctuations of these generators and we are looking for a characterization of the Lie algebra of the fluctuations, if any.

For a translation-invariant local state $\omega_\Lambda$, $\Lambda \subset \mathbb{Z}^\nu$, such that $\omega = \lim_\Lambda \omega_\Lambda$ is mixing, define the local fluctuations, for $j = 1, \ldots, m$,

$$F^{\delta_j}_{j,\Lambda} = \frac{1}{|\Lambda|^{1/2 + \delta_j}} \sum_{x \in \Lambda} \left( \tau_x L_j - \omega_\Lambda(L_j) \right) \qquad [23]$$

and for notational convenience, take

$$F_{0,\Lambda} = i\mathbb{1}$$

Now we formulate the conditions for our purposes.

Condition A We assume that the parameters $\delta_j$ are determined by the existence of the finite and nontrivial variances: for all $j = 1, \ldots, m$,

$$0 < \lim_\Lambda \omega_\Lambda\left( \left( F^{\delta_j}_{j,\Lambda} \right)^2 \right) < \infty \qquad [24]$$
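
Condition A fixes each scaling index by demanding a finite, nontrivial variance. As a rough numerical illustration, the sketch below uses a classical toy correlation function on a chain ($\nu = 1$) that decays exactly as $|x|^{-(\nu - 2 + \eta)}$, in the spirit of [20], and reads off the index from the growth of the normally scaled variance. The toy correlation, the reading $\eta = 2 - 2\nu\delta$ of the relation quoted above, and the function names `variance_sum` and `estimate_delta` are assumptions made for this example; nothing here is taken from the cited model calculations.

```python
import math

def variance_sum(n, decay):
    """Sum of c(x - y) over all pairs of sites x, y in a chain of n sites.

    Toy correlation (an assumption): c(0) = 1 and c(d) = |d|**(-decay) for d != 0.
    """
    total = float(n)  # diagonal terms x == y
    for d in range(1, n):
        # a separation of size d occurs (n - d) times for each sign of x - y
        total += 2.0 * (n - d) * d ** (-decay)
    return total

def estimate_delta(decay, n_small=1_000, n_large=100_000):
    """Estimate the scaling index from two chain lengths.

    The normally scaled variance variance_sum(n) / n grows like n**(2 * delta),
    so delta is half the log-log slope between the two lengths.
    """
    v_small = variance_sum(n_small, decay) / n_small
    v_large = variance_sum(n_large, decay) / n_large
    slope = math.log(v_large / v_small) / math.log(n_large / n_small)
    return slope / 2.0

nu, eta = 1, 1.5
decay = nu - 2 + eta                    # exponent of |x| in the toy form of [20]
delta_predicted = (2 - eta) / (2 * nu)  # from eta = 2 - 2 * nu * delta
delta_numeric = estimate_delta(decay)
print(delta_predicted, delta_numeric)
```

The two printed values should agree up to finite-size corrections, which shrink as the chain lengths grow. Dividing the sum by $n^{1 + 2\delta}$ instead of $n$ then gives a variance that stays finite and nonzero, which is the content of [24] in this toy setting.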

After reordering, take $1/2 > \delta_1 \geq \delta_2 \geq \cdots \geq \delta_m > -1/2$.

Condition B Assume that all third moments are finite, that is,

$$\lim_\Lambda \omega_\Lambda\left( F^{\delta_j}_{j,\Lambda} F^{\delta_k}_{k,\Lambda} F^{\delta_\ell}_{\ell,\Lambda} \right) < \infty$$

We have in mind that the $\omega_\Lambda$'s are Gibbs states for some local Hamiltonians with some specific boundary conditions. The limit $\Lambda \to \mathbb{Z}^\nu$ may depend very strongly on these boundary conditions, in the sense that they are visible in the values of the indices $\delta_j$ (see, e.g., Verbeure and Zagrebnov (1992)). If for some $j \geq 1$ the corresponding $\delta_j = 0$, then the operator $L_j$ has a normal fluctuation operator

$$F^{\delta_j}_j = \lim_\Lambda F^{\delta_j}_{j,\Lambda} \qquad [25]$$

where the limit is understood in the sense of Condition A, namely a finite nontrivial variance. If, for some $j \geq 1$, the corresponding $\delta_j \neq 0$, then the fluctuation [25] is called an abnormal fluctuation operator. In order to satisfy Condition A, it happens sometimes that $\delta_j$ has to be chosen negative (see, e.g., Verbeure and Zagrebnov (1992)). In this case, it is reasonable to limit our discussion to the situation that all $\delta_j > -1/2$.

On the basis of Condition A, the limit set $\{F^{\delta_j}_j\}_{j=0,\ldots,m}$ of fluctuation operators generates a Hilbert space $\mathcal{H}$ with scalar product

$$\left( F^{\delta_j}_j, F^{\delta_k}_k \right) = \lim_\Lambda \omega_\Lambda\left( \left( F^{\delta_j}_{j,\Lambda} \right)^* F^{\delta_k}_{k,\Lambda} \right) \qquad [26]$$

On the basis of Condition B, the fluctuation operators are defined as multiplication operators of the Hilbert space $\mathcal{H}$. Note that the Conditions A and B are not sufficient to obtain a characteristic function. However, they are sufficient to obtain the notion of fluctuation operator. Now we proceed to clarify the Lie algebra character of these fluctuation operators on $\mathcal{H}$.

Consider the Lie product of two local fluctuations for a finite $\Lambda$; one gets

$$\left[ F^{\delta_j}_{j,\Lambda}, F^{\delta_k}_{k,\Lambda} \right] = \sum_{\ell=0}^{m} c^\ell_{jk}(\Lambda) F^{\delta_\ell}_{\ell,\Lambda} \qquad [27]$$

with

$$c^\ell_{jk}(\Lambda) = \frac{c^\ell_{jk}}{|\Lambda|^{1/2 + \delta_j + \delta_k - \delta_\ell}}, \quad \ell = 1, \ldots, m$$

$$c^0_{jk}(\Lambda) = |\Lambda|^{-\delta_j - \delta_k} \sum_{\ell=0}^{m} c^\ell_{jk}\, \omega_\Lambda(F_{\ell,\Lambda})$$

It is an easy exercise to check that the $\{c^\ell_{jk}(\Lambda)\}$ are the structure coefficients of a Lie algebra $\mathcal{G}(\Lambda)$. Hence, by considering local fluctuations, one constructs a map from the Lie algebra $\mathcal{G}$ onto the Lie algebra $\mathcal{G}(\Lambda)$ by a nontrivial change of the structure constants. When the transformed structure constants approach a well-defined limit, a new nonisomorphic Lie algebra might appear. The limit algebra $\mathcal{G}(\mathbb{Z}^\nu)$, called the contracted one of the original one $\mathcal{G}$, is always nonsemisimple. This contraction is a typical Inönü–Wigner contraction (Inönü and Wigner 1953). About the limit algebra $\mathcal{G}(\mathbb{Z}^\nu)$, the following results are obtained (see Momont et al. (1997)):

$$\lim_\Lambda c^\ell_{jk}(\Lambda) = \begin{cases} 0 & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell > 0 \\ c^\ell_{jk} & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell = 0 \\ 0 & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell < 0 \end{cases} \qquad [28]$$

It is interesting to distinguish a number of special cases:

1. If all fluctuations are normal, one recovers the Heisenberg algebra of the canonical commutation relations with the right symplectic form $\sigma_\omega$.
2. If $1/2 + \delta_j + \delta_k - \delta_\ell > 0$ for all $j, k, \ell$, one obtains an abelian Lie algebra of fluctuations.
3. One gets the richest structure if $1/2 + \delta_j + \delta_k - \delta_\ell = 0$ for all $j, k, \ell$ or for some of them. One notes a phenomenon of scale invariance: the $c^\ell_{jk}(\Lambda)$ are $\Lambda$-independent. Algebras different from the CCR algebra are observed. A particularly interesting case turns up if $\delta_j = -\delta_k \neq 0$, that is, one of the indices is negative, for example, $\delta_j < 0$; the corresponding fluctuation $F^{\delta_j}_j$ shows a property of space squeezing, and then $\delta_k > 0$, and the fluctuation $F^{\delta_k}_k$ expresses the property of space dilation. These phenomena are observed and computed in several models (see, e.g., Verbeure and Zagrebnov (1992)). This yields in particular a microscopic explanation of the phenomenon of squeezing (squeezed states and all that) in quantum optics. We refer also to the section "Spontaneous symmetry breaking" for this phenomenon as being the basis of the construction of the Goldstone normal modes of the Goldstone particle appearing in systems showing spontaneous symmetry breakdown.

Some Applications

The notion of fluctuation operator as presented above, and the mathematical structure of the algebra of fluctuations, have been tested in several soluble models. Many applications of this theory of quantum fluctuations can be found in the list of references. Here we are not entering into the details

of any model, but we limit ourselves to mention three applications which are of a general nature and totally model independent.

Conservation of the KMS Property under the Transition from Micro to Macro

Suppose that we start with a micro-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ with normal fluctuations, that is, we are in the situation as treated in the section "Normal fluctuations." Hence, we know that the quantum central-limit theorem maps the system $(\mathcal{B}, \omega, \alpha_t)$ onto the macrodynamical system $(C^*(\mathcal{A}_{L,sa}, \sigma_\omega), \tilde\omega, \tilde\alpha_t)$ of quantum fluctuations.

If the microstate $\omega$ is $\alpha_t$-time invariant ($\omega \circ \alpha_t = \omega$ for all $t \in \mathbb{R}$), then it also follows readily that the macrostate $\tilde\omega$ is $\tilde\alpha_t$-time invariant (see Theorem 4, i.e., $\tilde\omega \circ \tilde\alpha_t = \tilde\omega$ for all $t \in \mathbb{R}$).

A less trivial question to pose is: suppose that the microstate $\omega$ is an equilibrium state for the micro-dynamics $\alpha_t$, is then the macrostate $\tilde\omega$ also an equilibrium state for the macrodynamics $\tilde\alpha_t$ of the fluctuations? In Goderis et al. (1990) this question is answered positively in the following more technical sense: if $\omega$ is an $\alpha_t$-KMS state of $\mathcal{B}$ at inverse temperature $\beta$, then $\tilde\omega$ is an $\tilde\alpha_t$-KMS state at the same temperature.

This property proves that the notion of equilibrium is preserved under the operation of coarse graining induced by the central-limit theorem. This statement constitutes a proof of one of the basic assumptions of the phenomenological theory of Onsager about small oscillations around equilibrium.

This result also yields a contribution to the discussion whether or not quantum systems should be described at a macroscopic level by classical observables. The result above states that the macroscopic fluctuation observables behave classically if and only if they are time invariant. In other words, it can only be expected a priori that conserved quantities behave classically. In principle, other observables follow a quantum dynamics.

Linear Response Theory

In particular, in the study of equilibrium states (KMS states) a standard procedure is to perturb the system and to study the response of the system as a function of the perturbation. The response elucidates many, if not all, of the properties of the equilibrium state.

Technically, one considers a perturbation of the dynamics by adding a term to the Hamiltonian. One expands the perturbed dynamics in terms of the perturbation and the unperturbed dynamics. It is often argued that when the perturbation is small, one can limit the study of the response to the first-order term in the perturbation in the corresponding Dyson expansion. This is the basis of what is called the "linear response theory of Kubo."

A long-term debate is going on about the validity of the linear response theory. The question is how to understand from a microscopic point of view the validity of the response theory being linear or not. One must realize that the linear response theory actually observed in macroscopic systems seems to have a significant range of validity beyond the criticism being expressed about it.

Here we discuss the main result of the paper (Goderis et al. 1991) in which contours are sketched for the exactness of the response being linear.

We assume:

1. that the microdynamics $\alpha_t$ is the norm-limit of the local dynamics $\alpha^\Lambda_t = e^{itH_\Lambda} \cdot e^{-itH_\Lambda}$, where $H_\Lambda$ contains only standard finite-range interactions (as in the section "Normal fluctuations");
2. that the $\omega_\Lambda$ are states such that $\omega = \lim_\Lambda \omega_\Lambda$ is a state which is time and space translation invariant; and
3. that $\omega$ satisfies the cluster condition [15] or [16].

From the time invariance of the state, one has a Hamiltonian GNS representation of the dynamics: $\alpha_t = e^{itH} \cdot e^{-itH}$. On the basis of Theorem 4, one has the dynamics $\tilde\alpha_t$ of the fluctuation algebra $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ in the state $\tilde\omega$. This GNS representation yields a Hamiltonian representation for $\tilde\alpha_t$:

$$\tilde\alpha_t = e^{it\tilde{H}} \cdot e^{-it\tilde{H}}$$

Now take any local perturbation $P \in \mathcal{A}_{L,sa}$ of $\alpha_t$, namely

$$\alpha^P_{t,\Lambda} = e^{it(H + F_\Lambda(P))} \cdot e^{-it(H + F_\Lambda(P))}$$

where $F_\Lambda(P)$ is the local fluctuation of $P$ in $\omega$. Then one proves the following central-limit theorem (Goderis et al. 1991): for all $A$ and $B$ in $\mathcal{A}_{L,sa}$, one has the perturbed dynamics

$$\tilde\alpha^P_t = e^{it(\tilde{H} + F(P))} \cdot e^{-it(\tilde{H} + F(P))}$$

of the fluctuation algebra in the sense of [18]:

$$\tilde\alpha^P_t F(A) = \lim_\Lambda F\left( \alpha^P_{t,\Lambda}(A) \right)$$

This proves the existence and the explicit form of the perturbed dynamics lifted to the level of the fluctuations. In particular, one has

$$\lim_\Lambda \omega\left( \alpha^P_{t,\Lambda}(F_\Lambda(A)) \right) = \tilde\omega\left( \tilde\alpha^P_t F(A) \right)$$

This is nothing but the existence of the relaxation function of Kubo, but lifted to the level of the fluctuations: instead of dealing with strictly local observables, here one considers fluctuations.

Assume, furthermore, that the state $\omega$ is an $(\alpha_t, \beta)$-KMS state; then one derives readily Kubo's famous formula of his linear response theory:

$$\frac{d}{dt} \tilde\omega\left( \tilde\alpha^P_t F(A) \right) = i\, \tilde\omega\left( [F(P), \tilde\alpha_t F(A)] \right)$$

which shows full linearity in the perturbation observable $P$. Kubo's formula arises as the central limit of the microscopic response to the dynamics perturbed by a fluctuation observable. We remark that if $\omega$ is an equilibrium state, then the right-hand side of the formula above can be expressed in terms of the Duhamel two-point function, which is the common practice in linear response theory.

Spontaneous Symmetry Breaking

Spontaneous symmetry breaking (SSB) is one of the basic phenomena accompanying collective phenomena, such as phase transitions in statistical mechanics, or specific ground states in field theory. SSB goes back to the Goldstone theorem. There are many different situations to consider. For example, in the case of short-range interactions, it is typical that SSB yields a dynamics which remains symmetric, whereas for long-range interactions SSB also breaks the symmetry of the dynamics. However, in all cases the physics literature predicts the appearance of a particular particle, namely the Goldstone boson, as a result of SSB. The theory of fluctuation operators allows the construction of the canonical coordinates of this particle. The most general result can be found in Michoel and Verbeure (2001). We sketch the essentials in two cases, namely for systems of long-range interactions (mean fields) and for systems with short-range interactions.

Long-range (mean-field) interactions Here we give explicitly the example of the strong-coupling BCS model in one dimension ($\nu = 1$). The microscopic algebra of observables is $\mathcal{B} = \otimes_i (M_2)_i$, where $M_2$ is the algebra of $2 \times 2$ complex matrices. The local Hamiltonian of the model is given by

$$H_N = \epsilon \sum_{i=-N}^{N} \sigma^z_i - \frac{1}{2N + 1} \sum_{i,j=-N}^{N} \sigma^+_i \sigma^-_j, \quad 0 < \epsilon < \frac{1}{2}$$

where $\sigma^z, \sigma^\pm$ are the usual $2 \times 2$ Pauli matrices. In the thermodynamic limit, the KMS equation has the following product state solutions: $\omega_\lambda = \otimes_i \operatorname{tr} \rho_\lambda$, where

$$\rho_\lambda = \frac{e^{-\beta h_\lambda}}{\operatorname{tr} e^{-\beta h_\lambda}}, \quad \lambda = \operatorname{tr} \rho_\lambda \sigma^- = \omega_\lambda(\sigma^-)$$

$$h_\lambda = \epsilon \sigma^z - \lambda \sigma^+ - \bar\lambda \sigma^-$$

Note that $\lambda = \operatorname{tr} \rho_\lambda \sigma^-$ is a nonlinear equation for $\lambda$ whose solutions determine the density matrix $\rho_\lambda$. This equation always has the solution $\lambda = 0$, describing the so-called normal phase. For $\beta > \beta_c$, with $\tanh \beta_c \epsilon = 2\epsilon$, one has a solution $\lambda \neq 0$, describing the superconducting phase. Remark that if $\lambda$ is a solution, then $\lambda e^{i\varphi}$ is a solution as well for all $\varphi$. It is clear that $H_N$ is invariant under the continuous gauge transformation automorphism group $G = \{\gamma_\varphi \mid \varphi \in [0, 2\pi]\}$ of $\mathcal{B}$:

$$\gamma_\varphi(\sigma^+_i) = e^{i\varphi} \sigma^+_i$$

Hence $G$ is a symmetry group. On the other hand: $\omega_\lambda(\gamma_\varphi(\sigma^+_i)) = e^{i\varphi} \omega_\lambda(\sigma^+_i) \neq \omega_\lambda(\sigma^+_i)$. The gauge group $G$ is spontaneously broken. Remark also that the gauge transformations are implemented locally by the charges

$$Q_N = \sum_{j=-N}^{N} \sigma^z_j, \quad \text{i.e.,} \quad \gamma_\varphi(\sigma^+_i) = e^{i\varphi Q_N} \sigma^+_i e^{-i\varphi Q_N}$$

and $\sigma^z$ is the symmetry generator density. As the states $\omega_\lambda$ are product states, all fluctuations are normal (see the section "Normal fluctuations"). One considers the local operators

$$Q = \frac{|\lambda|^2}{\mu^2} \sigma^z + \frac{\epsilon}{\mu^2} \left( \lambda \sigma^+ + \bar\lambda \sigma^- \right)$$

$$P = \frac{i}{\mu} \left( \lambda \sigma^+ - \bar\lambda \sigma^- \right)$$

where $\mu = (\epsilon^2 + |\lambda|^2)^{1/2}$. Note that $P$ is essentially the order parameter operator, that is, the operator $P$ is breaking the symmetry:

$$\frac{d}{d\varphi} \omega_\lambda(\gamma_\varphi(P)) \neq 0, \quad \omega_\lambda(P) = 0$$

On the other hand, $Q$ is essentially the generator of the symmetry $\sigma^z$ normalized to zero, that is, $\omega_\lambda(Q) = 0$.

Michoel and Verbeure (2001) proved in detail that the fluctuations $F(Q)$ and $F(P)$ form a canonical pair

$$[F(Q), F(P)] = i\, \frac{4|\lambda|^2}{\mu}$$

and that they behave, under the time evolution, as harmonic oscillator coordinates oscillating with a

frequency equal to $2\mu$. This frequency is called a plasmon frequency. Moreover, the variances are

$$\tilde\omega(F(Q)^2) = \tilde\omega(F(P)^2) = \frac{|\lambda|^2}{\mu^2}$$

This means that these coordinates vanish or disappear if $\lambda = 0$. The coordinates $F(Q)$ and $F(P)$ are the canonical coordinates of a particle appearing only if there is spontaneous symmetry breakdown. They are the canonical coordinates of the Goldstone boson, which arise if SSB occurs.

Short-range interactions An analogous result, as for long-range interactions, can be derived for systems with short-range interactions. However, in this case we have equilibrium states with poor cluster properties. We are now in the situation as described in the "Abnormal Fluctuations" section. Also in this case we have the phenomenon of SSB, which shows the appearance of a Goldstone particle. Also in this case one is able to construct its canonical coordinates. The details of this construction can be found in Michoel and Verbeure (2001). Here we give a heuristic picture of this construction.

Consider again a microsystem $(\mathcal{B}, \omega, \alpha_t)$ and let $\gamma_s$ be a strongly continuous one-parameter symmetry group of $\alpha_t$ which is locally generated by $Q_\Lambda = \sum_{x \in \Lambda} q_x$. SSB amounts to finding an equilibrium (KMS) or ground state $\omega$ which breaks the symmetry, that is, there exists a local observable $A \in \mathcal{A}_{L,sa}$ such that for $s \neq 0$: $\omega(\gamma_s(A)) \neq \omega(A)$ and $\alpha_t \gamma_s = \gamma_s \alpha_t$. This is equivalent to

$$\left. \frac{d}{ds} \omega(\gamma_s(A)) \right|_{s=0} = \lim_\Lambda \omega([Q_\Lambda, A]) = c \neq 0$$

with $c$ a constant.

Now we turn this equation into a relation for fluctuations. Using space translation invariance of the state, one gets

$$\lim_\Lambda \omega\left( \left[ \frac{1}{|\Lambda|} \sum_{x \in \Lambda} (q_x - \omega(q)),\ \sum_{y \in \Lambda} (\tau_y A - \omega(A)) \right] \right) = c$$

We now use another consequence of the Goldstone theorem, namely that SSB implies poor clustering properties for the order parameter $A$, that is, in the line of what is done in the last section, we assume that the lack of clustering is expressed by the existence of a positive index $\delta$ such that

$$\lim_\Lambda \omega\left( \frac{1}{|\Lambda|^{1 + 2\delta}} \left( \sum_{x \in \Lambda} (\tau_x A - \omega(A)) \right)^2 \right)$$

is nontrivial and finite. This means that the fluctuation $F^\delta(A)$ exists. Then we get

$$\lim_\Lambda \omega\left( \left[ \frac{1}{|\Lambda|^{1/2 - \delta}} \sum_{x \in \Lambda} (q_x - \omega(q)),\ \frac{1}{|\Lambda|^{1/2 + \delta}} \sum_{y \in \Lambda} (\tau_y A - \omega(A)) \right] \right) = c$$

Hence

$$\tilde\omega\left( \left[ F^{-\delta}(q), F^\delta(A) \right] \right) = c$$

which, for equilibrium states $\omega$, turns into the operator equation for fluctuations

$$\left[ F^{-\delta}(q), F^\delta(A) \right] = c\,\mathbb{1}$$

In other words, one obtains a canonical pair $(F^{-\delta}(q), F^\delta(A))$ of normal coordinates of the collective Goldstone mode.

Note that the long-range correlation of the order-parameter operator (positive $\delta$) is exactly compensated by a squeezing, described by the negative index $-\delta$, for the fluctuation operator of the local generator of the broken symmetry. This result can also be expressed as typical for SSB, namely that the symmetry is not completely broken, but only partially. More detailed information about all this is found in Michoel and Verbeure (2001).

See also: Algebraic Approach to Quantum Field Theory; Large Deviations in Equilibrium Statistical Mechanics; Macroscopic Fluctuations and Thermodynamic Functionals; Quantum Phase Transitions; Quantum Spin Systems; Symmetry Breaking in Field Theory; Tomita–Takesaki Modular Theory.

Further Reading

Bratteli O and Robinson DW (1979) Operator Algebras and Quantum Statistical Mechanics, vol. I. New York–Heidelberg–Berlin: Springer.
Bratteli O and Robinson DW (2002) Operator Algebras and Quantum Statistical Mechanics, vol. II. New York–Heidelberg–Berlin: Springer.
Cushen CD and Hudson RL (1971) A quantum mechanical central limit theorem. Journal of Applied Probability 8: 454–469.
Fannes M and Quaegebeur J (1983) Central limits of product mappings between CAR-algebras. Publications of the Research Institute for Mathematical Studies Kyoto 19: 469–491.
Giri N and von Waldenfels W (1978) An algebraic version of the central limit theorem. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 42: 129–134.
Goderis D, Verbeure A, and Vets P (1989) Non-commutative central limits. Probability Theory and Related Fields 82: 527–544.
Goderis D, Verbeure A, and Vets P (1990) Dynamics of fluctuations for quantum lattice systems. Communications in Mathematical Physics 128: 533–549.

Goderis D, Verbeure A, and Vets P (1991) About the exactness of the linear response theory. Communications in Mathematical Physics 136: 265–283.
Goderis D and Vets P (1989) Central limit theorem for mixing quantum systems and the CCR-algebra of fluctuations. Communications in Mathematical Physics 122: 249.
Goldshtein BG (1982) A central limit theorem of non-commutative probability theory. Theory of Probability and its Applications 27: 703.
Hudson RL (1973) A quantum mechanical central limit theorem for anti-commuting observables. Journal of Applied Probability 10: 502–509.
Ibragimov IA and Linnik YuV (1971) Independent and stationary sequences of random variables. Groningen: Wolters-Noordhoff.
Inönü E and Wigner EP (1953) On the contraction of groups and their representations. Proceedings of the National Academy of Sciences, USA 39: 510–524.
Lanford OE and Ruelle D (1969) Observables at infinity and states with short-range correlations in statistical mechanics. Communications in Mathematical Physics 13: 194.
Manuceau J, Sirugue M, Testard D, and Verbeure A (1973) The smallest $C^*$-algebra for canonical commutation relations. Communications in Mathematical Physics 32: 231.
Matsui T (2003) On the algebra of fluctuations in quantum spin chains. Annales Henri Poincaré 4: 63–83.
Michoel T and Verbeure A (2001) Goldstone boson normal coordinates. Communications in Mathematical Physics 216: 461–490.
Momont B, Verbeure A, and Zagrebnov VA (1997) Algebraic structure of quantum fluctuations. Journal of Statistical Physics 89: 633–653.
Quaegebeur J (1984) A non-commutative central limit theorem for CCR-algebras. Journal of Functional Analysis 57: 1–20.
Sewell GL (1986) Quantum theory of collective phenomena. Oxford: Oxford University Press.
Verbeure A and Zagrebnov VA (1992) Phase transitions and algebra of fluctuation operators in an exactly soluble model of a quantum anharmonic crystal. Journal of Statistical Physics 69: 329–359.

Quantum Channels: Classical Capacity


A S Holevo, Steklov Mathematical Institute, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

The Definition

The classical capacity is a numerical measure of the ability of a classical or quantum information processing system (for definiteness, one speaks of a communication channel) to transmit information expressible as a text message (called "classical information" as distinct from quantum information). It is equal to the least upper bound for rates of the asymptotically perfect transmission of classical information through the system, when the transmission time tends to infinity, and arbitrary pre- and post-processing (encoding and decoding) are allowed at the input and the output of the system. Typically, for rates exceeding the capacity, not only is the asymptotically perfect transmission impossible, but the error probability with an arbitrary encoding–decoding scheme tends to 1, so that the capacity has the nature of a threshold parameter.

From Classical to Quantum Information Theory

A central result of classical information theory is the Shannon coding theorem, giving an explicit expression for the capacity in terms of the maximal mutual information between the input and the output of the channel. The issue of the information capacity of quantum communication channels arose soon after the publication of the pioneering papers by Shannon and goes back to the classical works of Gabor, Brillouin, and Gordon, asking for fundamental physical limits on the rate and quality of information transmission. This work laid a physical foundation and raised the question of a consistent quantum treatment of the problem. Important steps in this direction were made in the early 1970s, when a quantum probabilistic framework for this type of problem was created and the conjectured upper bound for the classical capacity of a quantum channel was proved. A long journey to the quantum coding theorem culminated in 1996 with the proof of achievability of the upper bound (the Holevo–Schumacher–Westmoreland theorem; see Holevo (1998) for a detailed historical survey).

Moreover, it was realized that a quantum channel is characterized by a whole spectrum of capacities depending on the nature of the information resources and the specific protocols used for the transmission. To a great extent, this progress was stimulated by an interplay between quantum communication theory and quantum information ideas related to more recent developments in quantum computing. This new age of quantum information science is characterized by an emphasis on the new possibilities (rather than restrictions) opened by the quantum nature of the information processing agent. On the other hand, the question of information capacity is important for the theory of quantum computers, particularly in connection with quantum error-correcting codes, communication and algorithmic complexity, and a number of other important issues.
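
For readers who want a concrete instance of the Shannon expression mentioned in this section, the following sketch maximizes the mutual information $I(X;Y) = H(Y) - H(Y|X)$ over input distributions for a binary symmetric channel and compares it with the textbook closed form $1 - h(p)$. The channel, the grid search, and the function names (`entropy_bits`, `mutual_information`, `capacity_binary_input`) are choices made for this illustration only and are not part of the article.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(input_dist, channel):
    """I(X;Y) = H(Y) - H(Y|X) for a discrete memoryless channel.

    channel[x][y] is the probability of output y given input x.
    """
    n_in, n_out = len(channel), len(channel[0])
    output_dist = [
        sum(input_dist[x] * channel[x][y] for x in range(n_in))
        for y in range(n_out)
    ]
    conditional = sum(input_dist[x] * entropy_bits(channel[x]) for x in range(n_in))
    return entropy_bits(output_dist) - conditional

def capacity_binary_input(channel, steps=1000):
    """Maximize I(X;Y) over input distributions (q, 1 - q) by a grid search."""
    return max(
        mutual_information([k / steps, 1 - k / steps], channel)
        for k in range(steps + 1)
    )

flip = 0.1  # crossover probability of the binary symmetric channel
bsc = [[1 - flip, flip], [flip, 1 - flip]]
closed_form = 1 - entropy_bits([flip, 1 - flip])
print(capacity_binary_input(bsc), closed_form)
```

A grid search is enough here because the input alphabet is binary; for larger alphabets an iterative scheme such as the Blahut–Arimoto algorithm would be the usual choice.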

The Quantum Coding Theorem

In the simplest and most basic memoryless case, the information processing system is described by the sequence of block channels,

$$\Phi^{\otimes n} = \underbrace{\Phi \otimes \cdots \otimes \Phi}_{n}, \quad n = 1, \ldots$$

of $n$ parallel and independent uses of a channel $\Phi$, with $n$ playing the role of transmission time (Holevo 1998). More generally, one can consider memory channels given by open dynamical systems with a kind of ergodic behavior and the limit where the transmission time goes to infinity (Kretschmann and Werner 2005). Restricting to the memoryless case, encoding is given by a mapping of classical messages $x$ from a given codebook of size $N$ into states (density operators) $\rho^{(n)}_x$ in the input space $\mathcal{H}_1^{\otimes n}$ of the block channel $\Phi^{\otimes n}$, and decoding is given by an observable $M^{(n)}$ in the output space $\mathcal{H}_2^{\otimes n}$, that is, a family $\{M^{(n)}_y\}$ of operators constituting a resolution of the identity in $\mathcal{H}_2^{\otimes n}$:

$$M^{(n)}_y \geq 0, \quad \sum_y M^{(n)}_y = I$$

Here $y$ plays the role of outcomes of the whole decoding procedure involving both the quantum measurement at the output and the possible classical information post-processing. Then the diagram for the classical information transmission is

$$x \longrightarrow \underbrace{\rho^{(n)}_x}_{\text{input state}} \longrightarrow \underbrace{\Phi^{\otimes n}[\rho^{(n)}_x]}_{\text{output state}} \xrightarrow{\ M^{(n)}\ } y$$

The encoding and decoding so described constitute a quantum block code of length $n$ and size $N$ for the memoryless channel. The conditional probability of obtaining an outcome $y$ provided the message $x$ was sent for a chosen block code is given by the statistical formula

$$p^{(n)}(y|x) = \operatorname{tr} \Phi^{\otimes n}[\rho^{(n)}_x] M^{(n)}_y$$

and the error probability for the code is just $\max_x (1 - p^{(n)}(x|x))$.

Denoting by $p_e(n, N)$ the infimum of the error probability over all codes of length $n$ and size $N$, the classical capacity $C(\Phi)$ of the memoryless channel is defined as the least upper bound of the rates $R$ for which $\lim_{n \to \infty} p_e(n, 2^{nR}) = 0$.

Let $\Phi$ be a quantum channel from the input to the output quantum systems, assumed to be finite dimensional. The coding theorem for the classical capacity says that

$$C(\Phi) = \lim_{n \to \infty} \frac{1}{n} C_\chi(\Phi^{\otimes n}) \qquad [1]$$

where

$$C_\chi(\Phi) = \max \left\{ H\left( \sum_x p_x \Phi[\rho_x] \right) - \sum_x p_x H(\Phi[\rho_x]) \right\} \qquad [2]$$

$H(\rho) = -\operatorname{tr} \rho \log_2 \rho$ is the binary von Neumann entropy, and the maximum is taken over all probability distributions $\{p_x\}$ and collections of density operators $\{\rho_x\}$ in $\mathcal{H}_1$.

The Variety of Capacities

This basic definition and the formulas [1], [2] generalize the definition of the Shannon capacity and the coding theorem for classical memoryless channels. For a quantum channel, there are several different capacities because one may consider sending different kinds (classical or quantum) of information, restrict the admissible coding and decoding operations, and/or allow the use of additional resources, such as shared entanglement, forward or backward communication, leading to really different quantities (Bennett et al. 2004). A few of these resources (such as feedback) also exist for classical channels but usually influence the capacity less dramatically (at least for memoryless channels). Restricting to the transmission of classical information with no additional resources, one can distinguish at least four capacities (Bennett and Shor 1998), according to whether, for each block length $n$, one is allowed to use arbitrary entangled quantum operations on the full block of input (resp. output) systems, or if, for each of the parallel channels, one has to use a separate quantum encoding (resp. decoding), and combine these only by classical pre- (resp. post-) processing:

[Diagram: the four capacities and the relations between them]
$C_{\infty\infty}$: full capacity, arbitrary (de)coding
$C_{1\infty} = C_\chi$: unentangled coding, block decoding
$C_{\infty 1}$: quantum block coding, separate decoding
$C_{11}$: one-shot capacity or accessible information, separate quantum (de)coding, block (de)coding only classical
Relations shown: upper left, $C_{\infty\infty}$ ??? $C_{1\infty}$; upper right, $C_{\infty\infty} \geq C_{\infty 1}$; lower left, $C_{1\infty} \geq C_{11}$; lower right, $C_{\infty 1} = C_{11}$.

The full capacity $C_{\infty\infty}$ is just the classical capacity $C(\Phi)$ given by [1]. That $C_{1\infty}$ coincides with the
144 Quantum Chromodynamics

quantity $C_\chi(\Phi)$ given by [2] is the essential content of the HSW theorem, from which [1] is obtained by additional blocking. Since $C_\chi$ is apparently superadditive, $C_\chi(\Phi_1 \otimes \Phi_2) \ge C_\chi(\Phi_1) + C_\chi(\Phi_2)$, one has $C_{\infty\infty} \ge C_\chi$. It is still not known whether the quantity $C_\chi(\Phi)$ is in fact additive for all channels, which would imply the equalities here. Additivity of $C_\chi(\Phi)$ would have an important physical consequence: it would mean that using entangled input states does not increase the classical capacity of a quantum channel. While such a result would be very much welcome, giving a single-letter expression for the classical capacity, it would call for a physical explanation of the asymmetry between the effects of entanglement in encoding and decoding procedures. Indeed, the inequality in the lower left is known to be strict sometimes (Holevo 1998), which means that entangled decodings can increase the classical capacity. There is even an intermediate capacity between $C_{11}$ and $C_{1\infty}$ obtained by restricting the quantum block decodings to adaptive ones (Shor 2002). The additivity of the quantity $C_\chi$ for all channels is one of the central open problems in quantum information theory; it was shown to be equivalent to several other important open problems, notably (super)additivity of the entanglement of formation and additivity of the minimal output entropy (Shor 2004).

For infinite-dimensional quantum processing systems, one needs to consider the input constraints such as the power constraint for bosonic Gaussian channels. The definition of the classical capacity and the capacity formula are then modified by introducing the constraint in a way similar to the classical theory (Holevo 1998, Holevo and Werner 2001). Another important extension concerns multiuser quantum information processing systems and their capacity regions (Devetak and Shor 2003).

See also: Capacities Enhanced by Entanglement; Capacity for Quantum Information; Channels in Quantum Information Theory; Entanglement Measures.

Further Reading

Bennett CH and Shor PW (1998) Quantum information theory. IEEE Transactions on Information Theory 44: 2724–2742.
Bennett CH, Devetak I, Shor PW, and Smolin JA Inequality and separation between assisted capacities of quantum channel, e-print quant-ph/0406086.
Devetak I and Shor PW The capacity of quantum channel for simultaneous transmission of classical and quantum information, e-print quant-ph/0311131.
Holevo AS (1998) Quantum Coding Theorems. Russian Math. Surveys vol. 53. pp. 1295–1331, e-print quant-ph/9808023.
Holevo AS (2000) Coding theorems of quantum information theory. In: Grigoryan A, Fokas A, Kibble T, and Zegarlinski B (eds.) Proc. XIII ICMP, pp. 415–422. London: International Press of Boston.
Holevo AS and Werner RF (2001) Evaluating capacities of bosonic Gaussian channels. Physical Review A 63: 032312 (e-print quant-ph/9912067).
Kretschmann D and Werner RF Quantum channels with memory, e-print quant-ph/0502106.
Shor PW The adaptive classical capacity of a quantum channel, or information capacity of 3 symmetric pure states in three dimensions, e-print quant-ph/0206058.
Shor PW (2004) Equivalence of additivity questions in quantum information theory. Communications in Mathematical Physics 246: 4334–4340 (e-print quant-ph/0305035).

Quantum Chromodynamics
G Sterman, Stony Brook University, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Quantum chromodynamics, or QCD, as it is normally called in high-energy physics, is the quantum field theory that describes the strong interactions. It is the SU(3) gauge theory of the current standard model for elementary particles and forces, $SU(3) \times SU(2)_L \times U(1)$, which encompasses the strong, electromagnetic, and weak interactions. The symmetry group of QCD, with its eight conserved charges, is referred to as color SU(3). As is characteristic of quantum field theories, each field may be described in terms of quantum waves or particles.

Because it is a gauge field theory, the fields that carry the forces of QCD transform as vectors under the Lorentz group. Corresponding to these vector fields are the particles called ‘‘gluons,’’ which carry an intrinsic angular momentum, or spin, of 1 in units of $\hbar$. The strong interactions are understood as the cumulative effects of gluons, interacting among themselves and with the quarks, the spin-1/2 particles of the Dirac quark fields.

There are six quark fields of varying masses in QCD. Of these, three are called ‘‘light’’ quarks, in a sense to be defined below, and three ‘‘heavy.’’ The light quarks are the up (u), down (d), and strange (s), while the heavy quarks are the charm (c), bottom (b),

and top (t). Their well-known electric charges are $e_f = 2e/3$ (u, c, t) and $e_f = -e/3$ (d, s, b), with e the positron charge. The gluons interact with each quark field in an identical fashion, and the relatively light masses of three of the quarks provide the theory with a number of approximate global symmetries that profoundly influence the manner in which QCD manifests itself in the standard model.

These quark and gluon fields and their corresponding particles are enumerated with complete confidence by the community of high-energy physicists. Yet, none of these particles has ever been observed in isolation, as one might observe a photon or an electron. Rather, all known strongly interacting particles are colorless; most are ‘‘mesons,’’ combinations with the quantum numbers of a quark $q$ and an antiquark $\bar q'$, or ‘‘baryons’’ with the quantum numbers of (possibly distinct) combinations of three quarks $qq'q''$. This feature of QCD, that its underlying fields never appear as asymptotic states, is called ‘‘confinement.’’ The very existence of confinement required new ways of thinking about field theory, and only with these was the discovery and development of QCD possible.

The Background of QCD

The strong interactions have been recognized as a separate force of nature since the discovery of the neutron as a constituent of atomic nuclei, along with the proton. Neutrons and protons (collectively, nucleons) possess a force, attractive at intermediate distances and so strong that it overcomes the electric repulsion of the protons, each with charge e. A sense of the relative strengths of the electromagnetic and strong interactions may be inferred from the typical distance between mutually repulsive electrons in an atom, $\sim 10^{-8}$ cm, and the typical distance between protons in a nucleus, of order $10^{-13}$ cm.

The history that led up to the discovery of QCD is a fascinating one, beginning with Yukawa’s 1935 theory of pion exchange as the source of the forces that bind nuclei, still a useful tool for low-energy scattering. Other turning points include the creation of nonabelian gauge theories by Yang and Mills in 1954, the discovery of the quantum number known as strangeness, the consequent development of the quark model, and then the proposal of color as a global symmetry. The role of pointlike constituents in hadrons was foreshadowed by the identification of electromagnetic and weak currents and the analysis of their quantum-mechanical algebras. Finally, the observation of ‘‘scaling’’ in deep-inelastic scattering, which we will describe below, made QCD, with color as a local symmetry, the unique explanation of the strong interactions, through its property of asymptotic freedom.

The Lagrangian and Its Symmetries

The QCD Lagrangian may be written as

$$\mathcal{L} = \sum_{f=1}^{n_f} \bar q_f \left( i \not{D}[A] - m_f \right) q_f - \frac{1}{2} \operatorname{tr}\left[ F_{\mu\nu}^2(A) \right] - \frac{\lambda}{2} \left( B_a(A) \right)^2 + \bar c_b\, \frac{\delta B_b(A)}{\delta \alpha_a}\, c_a \qquad [1]$$

with $\not{D}[A] = \gamma \cdot \partial + i g_s\, \gamma \cdot A$ the covariant derivative in QCD. The $\gamma^\mu$ are the Dirac matrices, satisfying the anticommutation relations $[\gamma^\mu, \gamma^\nu]_+ = 2 g^{\mu\nu}$. The SU(3) gluon fields are $A^\mu = \sum_{a=1}^{8} A^\mu_a T_a$, where $T_a$ are the generators of SU(3) in the fundamental representation. The field strengths $F^{\mu\nu}[A] = \partial^\mu A^\nu - \partial^\nu A^\mu + i g_s [A^\mu, A^\nu]$ specify the three- and four-point gluon couplings of nonabelian gauge theory. In QCD, there are $n_f = 6$ flavors of quark fields, $q_f$, with conjugate $\bar q_f = q_f^\dagger \gamma^0$.

The first two terms in the expression [1] make up the classical Lagrangian, followed by the gauge-fixing term, specified by a (usually, but not necessarily linear) function $B_a(A)$, and the ghost Lagrangian. The ghost (anti-ghost) fields $c_a$ ($\bar c_a$) carry the same adjoint index as the gauge fields.

The classical QCD Lagrangian before gauge fixing is invariant under the local gauge transformations

$$A'_\mu(x) = \frac{i}{g_s}\, \partial_\mu \Omega(x)\, \Omega^{-1}(x) + \Omega(x) A_\mu(x) \Omega^{-1}(x) = A_\mu(x) - \partial_\mu \delta\alpha(x) + i g_s \left[ \delta\alpha(x), A_\mu(x) \right] + \cdots$$

$$\psi'_i(x) = \Omega(x)_{ij}\, \psi_j(x) = \psi_i(x) + i g_s\, \delta\alpha(x)_{ij}\, \psi_j(x) + \cdots \qquad [2]$$

$$\delta\alpha(x) = \sum_{a=1}^{8} \delta\alpha_a(x)\, T_a$$

The full QCD action including gauge-fixing and ghost terms is also invariant under the Becchi, Rouet, Stora, Tyutin (BRST) transformations with $\delta\xi$ an anticommuting variable,

$$\delta A_{\mu,a} = \left( \delta_{ab}\, \partial_\mu + g A_{\mu c} f_{abc} \right) c_b\, \delta\xi$$

$$\delta c_a = -\tfrac{1}{2}\, g\, C_{abc}\, c_b c_c\, \delta\xi, \qquad \delta \bar c_a = \lambda B_a\, \delta\xi \qquad [3]$$

$$\delta \psi_i = i g\, [T_b]_{ij}\, c_b\, \psi_j\, \delta\xi$$

with $f_{abc}$ the SU(3) structure constants. The Jacobian of these transformations is unity.

In addition, neglecting masses of the light quarks, u, d, and s, the QCD Lagrangian has a class of global flavor and chiral symmetries, the latter connecting left- and right-handed components of the quark fields, $\psi_{L,R} \equiv \tfrac{1}{2}(1 \mp \gamma_5)\psi$,

$$\psi'(x) = e^{i \alpha \gamma_5^P}\, \psi(x), \qquad P = 0, 1 \qquad [4]$$
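A short worked step may help in reading [4] for P = 1. It is a sketch that assumes only the standard properties $\gamma_5^2 = 1$ and, from the projectors $\psi_{L,R} = \tfrac{1}{2}(1 \mp \gamma_5)\psi$ above, $\gamma_5 \psi_{L,R} = \mp \psi_{L,R}$:

$$e^{i\alpha\gamma_5} = \cos\alpha + i \gamma_5 \sin\alpha, \qquad \psi'_L = e^{-i\alpha}\, \psi_L, \qquad \psi'_R = e^{+i\alpha}\, \psi_R$$

The two handed components therefore rotate with opposite phases. A mass term couples them, $m\, \bar q q = m\, (\bar q_L q_R + \bar q_R q_L)$, and picks up factors $e^{\pm 2 i \alpha}$, which is why this symmetry is exact only when the light-quark masses are neglected.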
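The generators $T_a$ and structure constants $f_{abc}$ that appear in [1]–[3] can be made concrete with a small numerical check. The sketch below is illustrative only: it assumes the conventional Gell-Mann basis with $T_a = \lambda_a/2$, the normalization $\operatorname{tr}(T_a T_b) = \delta_{ab}/2$, and the relation $[T_a, T_b] = i f_{abc} T_c$, none of which is spelled out in the text. The helper names (`matmul`, `structure_constant`, `normalization`) are hypothetical.

```python
import math

I = 1j
S3 = 1.0 / math.sqrt(3.0)

# Gell-Mann matrices lambda_1 .. lambda_8 (conventional basis; an assumption here).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]

# Generators in the fundamental representation, T_a = lambda_a / 2.
T = [[[0.5 * entry for entry in row] for row in lam] for lam in GELL_MANN]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def structure_constant(a, b, c):
    """f_abc = -2i tr([T_a, T_b] T_c), with indices running from 1 to 8."""
    comm = commutator(T[a - 1], T[b - 1])
    return (-2j * trace(matmul(comm, T[c - 1]))).real


def normalization(a, b):
    """tr(T_a T_b), expected to equal delta_ab / 2 in this convention."""
    return trace(matmul(T[a - 1], T[b - 1])).real


f_123 = structure_constant(1, 2, 3)
f_458 = structure_constant(4, 5, 8)
print("f_123 =", f_123, " f_458 =", f_458)
```

The nonvanishing $f_{abc}$ are what generate the gluon self-couplings through the commutator term in $F^{\mu\nu}[A]$; in an abelian theory they would all be zero.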

Here, power P = 0 describes phase, and P = 1 chiral, transformations. Both transformations can be extended to transformations among the light flavors, by letting $\psi$ become a vector, and $\alpha$ an element in the Lie algebra of SU(M), with M = 2 if we take only the u and d quarks, and M = 3 if we include the somewhat heavier strange quark. These symmetries, not to be confused with the local symmetries of the standard model, are strong isospin and its extension to the ‘‘eightfold way,’’ which evolved into the (3-)quark model of Gell-Mann and Zweig. The many successes of these formalisms are automatically incorporated into QCD.

Green Functions, Phases, and Gauge Invariance

In large part, the business of quantum field theory is to calculate Green functions,

$$G_n(x_1 \ldots x_n) = \langle 0 |\, T\left( \Phi_1(x_1) \ldots \Phi_i(x_i) \ldots \Phi_n(x_n) \right) | 0 \rangle \qquad [5]$$

where T denotes time ordering. The $\Phi_i(x)$ are elementary fields, such as $A^\mu$ or $q_f$, or composite fields, such as currents like $J^\mu = \bar q_f \gamma^\mu q_f$. Such a Green function generates amplitudes for the scattering of particles of definite momenta and spin, when in the limit of large times the $x_i$-dependence of the Green function is that of a plane wave. For example, we may have in the limit $x_i^0 \to \infty$,

$$G_n(x_1 \ldots x_n) \to \phi_i(p, \lambda)\, e^{-i p \cdot x_i}\, \langle (p, \lambda) |\, T\left( \Phi_1(x_1) \ldots \Phi_{i-1}(x_{i-1})\, \Phi_{i+1}(x_{i+1}) \ldots \Phi_n(x_n) \right) | 0 \rangle \qquad [6]$$

where $\phi_i(p, \lambda)$ is a solution to the free-field equation for field $\Phi_i$, characterized by momentum p and spin $\lambda$. (An integral over possible momenta p is understood.) When this happens for field i, the vacuum state is replaced by $|(p, \lambda)\rangle$, a particle state with precisely this momentum and spin; when it occurs for all fields, we derive a scattering (S)-matrix amplitude. In essence, the statement of confinement is that Green functions with fields $q_f(x)$ never behave as plane waves at large times in the past or future. Only Green functions of color singlet composite fields, invariant under gauge transformations, are associated with plane wave behavior at large times.

Green functions remain invariant under the BRST transformations [3], and this invariance implies a set of Ward identities

$$\frac{\delta}{\delta \xi(z)} \sum_{i=1}^{n} \langle 0 |\, T\left( \Phi_1(x_1) \ldots \delta_{\rm BRS} \Phi_i(x_i) \ldots \Phi_n(x_n) \right) | 0 \rangle = 0 \qquad [7]$$

The variation of the anti-ghost as in [3] is equivalent to an infinitesimal change in the gauge-fixing term; variations in the remaining fields all cancel single-particle plane wave behavior in the corresponding Green functions. These identities then ensure the gauge invariance of the perturbative S-matrix, a result that turns out to be useful despite confinement.

To go beyond a purely perturbative description of QCD, it is useful to introduce a set of nonlocal operators that are variously called nonabelian phases, ordered exponentials, and Wilson lines,

$$U_C(z, y) = P \exp\left[ -i g_s \int_y^z dx^\mu A_\mu(x) \right] \qquad [8]$$

where C is some self-avoiding curve between y and z. The U’s transform at each end linearly in nonabelian gauge transformations $\Omega(x)$ at that point,

$$U'_C(z, y) = \Omega(z)\, U_C(z, y)\, \Omega^{-1}(y) \qquad [9]$$

Especially interesting are closed curves C, for which z = y. The phases about such closed loops are, like their abelian counterparts, sensitive to the magnetic flux that they enclose, even when the field strengths vanish on the curve.

QCD at the Shortest and Longest Distances

Much of the fascination of QCD is its extraordinary variation of behavior at differing distance scales. Its discovery is linked to asymptotic freedom, which characterizes the theory at the shortest scales. Asymptotic freedom also suggests (and in part provides) a bridge to longer distances.

Most analyses in QCD begin with a path-integral formulation in terms of the elementary fields $\Phi_a = q_f \ldots$,

$$G_n\left( x_i, (z_j, y_j) \right) = \int \left[ \prod_{a = q, \bar q, G, c, \bar c} \mathcal{D}\Phi_a \right] \prod_i \Phi_i(x_i) \times \prod_j U_{C_j}(z_j, y_j)\, e^{i S_{\rm QCD}} \qquad [10]$$

with $S_{\rm QCD}$ the action. Perturbation theory keeps only the kinetic Lagrangian, quadratic in fields, in the exponent, and expands the potential terms in the coupling. This procedure produces Feynman diagrams, with vertices corresponding to the cubic and quartic terms in the QCD Lagrangian [1].

Most nonperturbative analyses of QCD require studying the theory on a Euclidean, rather than Minkowski space, related by an analytic continuation in the times $x^0$, $y^0$, $z^0$ in $G_n$ from real to imaginary values. In Euclidean space, we find, for example,

classical solutions to the equations of motion, known as instantons, that provide nonperturbative contributions to the path integral. Perhaps the most flexible nonperturbative approach approximates the action and the measure at a lattice of points in four-dimensional space. For this purpose, integrals over the gauge fields are replaced by averages over ‘‘gauge links,’’ of the form of eqn [8] between neighboring points.

Perturbation theory is most useful for processes that occur over short timescales and at high relative energies. Lattice QCD, on the other hand, can simulate processes that take much longer times, but is less useful when large momentum transfers are involved. The gap between the two methods remains quite wide, but between the two they have covered enormous ground, enough to more than confirm QCD as the theory of strong interactions.

Asymptotic Freedom

QCD is a renormalizable field theory, which implies that the coupling constant g must be defined by its value at a ‘‘renormalization scale,’’ and is denoted $g(\mu)$. Usually, the magnitude of $\alpha_s(\mu) \equiv g^2/4\pi$ is quoted at $\mu = m_Z$, where it is $\approx 0.12$. In effect, $g(\mu)$ controls the amplitude that connects any state to another state with one more or one fewer gluon, including quantum corrections that occur over timescales from zero up to $\hbar/\mu$ (if we measure $\mu$ in units of energy). The QCD Ward identities mentioned above ensure that the coupling is the same for both quarks and gluons, and indeed remains the same in all terms in the Lagrangian, ensuring that the symmetries of QCD are not destroyed by renormalization.

Quantum corrections to gluon emission are not generally computable directly in renormalizable theories, but their dependence on $\mu$ is computable, and is a power series in $\alpha_s(\mu)$ itself,

$$\mu^2 \frac{d \alpha_s(\mu)}{d \mu^2} = -b_0\, \frac{\alpha_s^2(\mu)}{4\pi} - b_1\, \frac{\alpha_s^3(\mu)}{(4\pi)^2} + \cdots \equiv \beta(\alpha_s) \qquad [11]$$

where $b_0 = 11 - 2 n_f/3$ and $b_1 = 2(51 - 19 n_f/3)$. The celebrated minus signs on the right-hand side are associated with both the spin and self-interactions of the gluons.

The solution to this equation provides an expression for $\alpha_s$ at any scale $\mu_1$ in terms of its value at any other scale $\mu_0$. Keeping only the lowest-order, $b_0$, term, we have

$$\alpha_s(\mu_1) = \frac{\alpha_s(\mu_0)}{1 + (b_0/4\pi)\, \alpha_s(\mu_0) \ln\left( \mu_1^2/\mu_0^2 \right)} = \frac{4\pi}{b_0 \ln\left( \mu_1^2/\Lambda_{\rm QCD}^2 \right)} \qquad [12]$$

where in the second form, we have introduced $\Lambda_{\rm QCD}$, the scale parameter of the theory, which embodies the condition that we get the same coupling at scale $\mu_1$ no matter which scale $\mu_0$ we start from. Asymptotic freedom consists of the observation that at larger renormalization masses $\mu$, or correspondingly shorter timescales, the coupling weakens, and indeed vanishes in the limit $\mu \to \infty$. The other side of the coin is that over longer times or lower momenta, the coupling grows. Eventually, near the pole at $\mu_1 = \Lambda_{\rm QCD}$, the lowest-order approximation to the running fails, and the theory becomes essentially nonperturbative. Thus, the discovery of asymptotic freedom suggested, although it certainly does not prove, that QCD is capable of producing very strong forces, and confinement at long distances. Current estimates of $\Lambda_{\rm QCD}$ are $\approx 200$ MeV.

Spontaneous Breaking of Chiral Symmetry

The number of quarks and their masses are an external input to QCD. In the standard model masses are provided by the Higgs mechanism, but in QCD they are simply parameters. Because the standard model has chosen several of the quarks to be especially light, QCD incorporates the chiral symmetries implied by eqn [4] (with P = 1). In the limit of zero quark masses, these symmetries become exact, respected to all orders of perturbation theory, that is, for any finite number of gluons emitted or absorbed.

At distances on the order of $1/\Lambda_{\rm QCD}$, however, QCD cannot respect chiral symmetry, which would require each state to have a degenerate partner with the opposite parity, something not seen in nature. Rather, QCD produces, nonperturbatively, nonzero values for matrix elements that mix right- and left-handed fields, such as $\langle 0 | \bar u_L u_R | 0 \rangle$, with u the up-quark field. Pions are the Goldstone bosons of this symmetry, and may be thought of as ripples in the chiral condensate, rotating it locally as they pass along. The observation that these Goldstone bosons are not exactly massless is due to the ‘‘current’’ masses of the quarks, their values in $\mathcal{L}_{\rm QCD}$. The (chiral perturbation theory) expansion in these light-quark masses also enables us to estimate them quantitatively: $1.5 \le m_u \le 4$ MeV, $4 \le m_d \le 8$ MeV, and $80 \le m_s \le 155$ MeV. These are the light quarks, with masses smaller than $\Lambda_{\rm QCD}$. (Like $\alpha_s$, the masses are renormalized; these are quoted from Eidelman (2004) with $\mu = 2$ GeV.) For comparison, the heavy quarks have masses $m_c \approx 1$–1.5 GeV, $m_b \approx 4$–4.5 GeV, and $m_t \approx 180$ GeV (the giant among the known elementary particles).

Although the mechanism of the chiral condensate (and in general other nonperturbative aspects of

QCD) has not yet been demonstrated from first principles, a very satisfactory description of the origin of the condensate, and indeed of much hadronic structure, has been given in terms of the attractive forces between quarks provided by instantons. The actions of instanton solutions provide a dependence $\exp[-8\pi^2/g_s^2]$ in Euclidean path integrals, and so are characteristically nonperturbative.

Mechanisms of Confinement

As described above, confinement is the absence of asymptotic states that transform nontrivially under color transformations. The full spectrum of QCD, however, is a complex thing to study, and so the problem has been approached somewhat indirectly. A difficulty is the same light-quark masses associated with approximate chiral symmetry. Because the masses of the light quarks are far below the scale $\Lambda_{\rm QCD}$ at which the perturbative coupling blows up, light quarks are created freely from the vacuum and the process of ‘‘hadronization,’’ by which quarks and gluons form mesons and baryons, is both nonperturbative and relativistic. It is therefore difficult to approach in both perturbation theory and lattice simulations.

Tests and studies of confinement are thus normally formulated in truncations of QCD, typically with no light quarks. The question is then reformulated in a way that is somewhat more tractable, without relativistic light quarks popping in and out of the vacuum all the time. In the limit that its mass becomes infinite compared to the natural scale of fluctuations in the QCD vacuum, the propagator of a quark becomes identical to a phase operator, [8], with a path C corresponding to a constant velocity. This observation suggests a number of tests for confinement that can be implemented in the lattice theory. The most intuitive is the vacuum expectation value of a ‘‘Wilson loop,’’ consisting of a rectangular path, with sides along the time direction, corresponding to a heavy quark and antiquark at rest a distance R apart, and closed at some starting and ending times with straight lines. The vacuum expectation value of the loop then turns out to be the exponential of the potential energy between the quark pair, multiplied by the elapsed time,

$$\left\langle 0 \left|\, P \exp\left[ -i g_s \oint_C A_\mu(x)\, dx^\mu \right] \right| 0 \right\rangle = \exp\left( -V(R)\, T/\hbar \right) \qquad [13]$$

When $V(R) \propto R$ (‘‘area law’’ behavior), there is a linearly rising, confining potential. This behavior, not yet proven analytically yet well confirmed on the lattice, has an appealing interpretation as the energy of a ‘‘string,’’ connecting the quark and antiquark, whose energy is proportional to its length.

Motivation for such a string picture was also found from the hadron spectrum itself, before any of the heavy quarks were known, and even before the discovery of QCD, from the observation that many mesonic ($q \bar q'$) states lie along ‘‘Regge trajectories,’’ which consist of sets of states of spin J and mass $m_J^2$ that obey a relation

$$J = \alpha'\, m_J^2 \qquad [14]$$

for some constant $\alpha'$. Such a relation can be modeled by two light particles (‘‘quarks’’) revolving around each other at some constant (for simplicity, fixed nonrelativistic) velocity $v_0$ and distance 2R, connected by a ‘‘string’’ whose energy per unit length is a constant $\rho$.

Suppose the center of the string is stationary, so the overall system is at rest. Then neglecting the masses, the total energy of the system is $M = 2R\rho$. Meanwhile, the momentum density per unit length at distance r from the center is $\rho\, v(r) = \rho\, (r/R)\, v_0$, and the total angular momentum of the system is

$$J = \frac{2 \rho v_0}{R} \int_0^R dr\, r^2 = \frac{2 \rho v_0}{3}\, R^2 = \frac{v_0}{6\rho}\, M^2 \qquad [15]$$

and for such a system, [14] is indeed satisfied. Quantized values of angular momentum J give quantized masses $m_J$, and we might take this as a sort of ‘‘Bohr model’’ for a meson. Indeed, string theory has its origin in related considerations in the strong interactions.

Lattice data are unequivocal on the linearly rising potential, but it requires further analysis to take a lattice result and determine what field configurations, stringlike or not, gave that result. Probably the most widely accepted explanation is in terms of an analogy to the Meissner effect in superconductivity, in which type II superconductors isolate magnetic flux in quantized tubes, the result of the formation of a condensate of Cooper pairs of electrons. If the strings of QCD are to be made of the gauge field, they must be electric ($F^{\mu 0}$) in nature to couple to quarks, so the analogy postulates a ‘‘dual’’ Meissner effect, in which electric flux is isolated as the result of a condensate of objects with magnetic charge (producing nonzero $F^{ij}$). Although no proof of this mechanism has been provided yet, the role of magnetic fluctuations in confinement has been widely investigated in lattice simulations, with encouraging results. Of special interest are magnetic field configurations, monopoles or vortices, in the $Z_3$ center of SU(3), $\exp[2\pi i k/3]\, I_{3\times 3}$, k = 0, 1, 2. Such configurations, even when localized, influence closed gauge loops [13] through the nonabelian Aharonov–Bohm effect. Eventually, of course, the role of light quarks must be crucial for any complete

description of confinement in the real world, as emphasized by Gribov.

Another related choice of closed loop is the ‘‘Polyakov loop,’’ implemented at finite temperature, for which the path integral is taken over periodic field configurations with period 1/T, where T is the temperature. In this case, the curve C extends from times t = 0 to t = 1/T at a fixed point in space. In this formulation it is possible to observe a phase transition from a confined phase, where the expectation is zero, to a deconfined phase, where it is nonzero. This phase transition is currently under intense experimental study in nuclear collisions.

Using Asymptotic Freedom: Perturbative QCD

It is not entirely obvious how to use asymptotic freedom in a theory that should (must) have confinement. Such applications of asymptotic freedom go by the term perturbative QCD, which has many applications, not the least as a window to extensions of the standard model.

Lepton Annihilation and Infrared Safety

The electromagnetic current, $J^\mu = \sum_f e_f\, \bar q_f \gamma^\mu q_f$, is a gauge-invariant operator, and its correlation functions are not limited by confinement. Perhaps the simplest application of asymptotic freedom, yet of great physical relevance, is the scalar two-point function,

$$\Pi(Q) = -\frac{i}{3} \int d^4x\, e^{-i Q \cdot x}\, \langle 0 |\, T\, J^\mu(0) J_\mu(x)\, | 0 \rangle \qquad [16]$$

The imaginary part of this function is related to the total cross section for the annihilation process $e^+ e^- \to$ hadrons in the approximation that only one photon takes part in the reaction. The specific relation is $\sigma_{\rm QCD} = (e^4/Q^2)\, {\rm Im}\, \Pi(Q^2)$, which follows from the optical theorem, illustrated in Figure 1. The perturbative expansion of the function $\Pi(Q)$ depends, in general, on the mass scales Q and the quark masses $m_f$ as well as on the strong coupling $\alpha_s(\mu)$ and on the renormalization scale $\mu$. We may also worry about the influence of other, truly nonperturbative scales, proportional to powers of $\Lambda_{\rm QCD}$. At large values of $Q^2$, however, the situation simplifies greatly, and dependence on all scales below Q is suppressed by powers of Q. This may be expressed in terms of the operator product expansion,

$$\langle 0 |\, T\, J^\mu(0) J_\mu(x)\, | 0 \rangle = \sum_{O_I} (x^2)^{-3 + d_I/2}\, C_I\left( x^2 \mu^2, \alpha_s(\mu) \right) \langle 0 | O_I(0) | 0 \rangle \qquad [17]$$

where $d_I$ is the mass dimension of operator $O_I$, and where the dimensionless coefficient functions $C_I$ incorporate quantum corrections. The sum over operators begins with the identity ($d_I = 0$), whose coefficient function is identified with the sum of quantum corrections in the approximation of zero masses. The sum continues with quark mass corrections, which are suppressed by powers of at least $m_f^2/Q^2$, for those flavors with masses below Q. Any QCD quantity that has this property, remaining finite in perturbation theory when all particle masses are set to zero, is said to be ‘‘infrared safe.’’

Figure 1 First line: schematic relation of lowest order $e^+ e^-$ annihilation to sum over quarks q, each with electric charge $e_q$. Second line: perturbative unitarity for the current correlation function $\Pi(Q)$.

The effects of quarks whose masses are above Q are included indirectly, through the couplings and masses observed at the lower scales. In summary, the leading power behavior of $\Pi(Q)$, and hence of the cross section, is a function of Q, $\mu$, and $\alpha_s(\mu)$ only. Higher-order operators whose vacuum matrix elements receive nonperturbative corrections include the ‘‘gluon condensate,’’ identified as the product $\alpha_s(\mu)\, G_{\mu\nu} G^{\mu\nu} \propto \Lambda_{\rm QCD}^4$.

Once we have concluded that Q is the only physical scale in $\Pi$, we may expect that the right choice of the renormalization scale is $\mu = Q$. Any observable quantity is independent of the choice of renormalization scale, $\mu$, and neglecting quark masses, the chain rule gives

$$\mu \frac{d \sigma(Q/\mu, \alpha_s(\mu))}{d\mu} = \mu \frac{\partial \sigma}{\partial \mu} + 2 \beta(\alpha_s) \frac{\partial \sigma}{\partial \alpha_s} = 0 \qquad [18]$$

which shows that we can determine the beta function directly from the perturbative expansion of the cross section. Defining $a \equiv \alpha_s(\mu)/\pi$, such a perturbative calculation gives

$${\rm Im}\, \Pi(Q^2) = \frac{3}{4\pi} \sum_f e_f^2 \left[ 1 + a + a^2 \left( 1.986 - 0.115\, n_f - \frac{b_0}{4} \ln \frac{Q^2}{\mu^2} \right) \right] \qquad [19]$$

with $b_0$ as above. Now, choosing $\mu = Q$, we see that asymptotic freedom implies that when Q is large, the total cross section is given by the lowest order,

plus small and calculable QCD corrections, a result that is borne out in experiment. Comparing experiment to an expression like [19], one can measure the value of $\alpha_s(Q)$, and hence, with eqn [12], $\alpha_s(\mu)$ for any $\mu \gg \Lambda_{\rm QCD}$. Figure 2 shows a recent compilation of values of $\alpha_s$ from this kind of analysis in different experiments at different scales, clearly demonstrating asymptotic freedom.

[Figure 2: $\alpha_s(Q)$ versus Q (GeV), for Q from 1 to 100 GeV. Data: deep-inelastic scattering, $e^+ e^-$ annihilation, hadron collisions, heavy quarkonia. Theory: lattice, NNLO, NLO. QCD $O(\alpha_s^4)$ curves for $\Lambda^{(5)}_{\overline{\rm MS}}$ = 245, 210, 180 MeV, corresponding to $\alpha_s(M_Z)$ = 0.1209, 0.1182, 0.1155.]

Figure 2 Experimental variation of the strong coupling with scales. Reproduced from Bethke S (2004) Alpha(s) at Zinnowitz. Nuclear Physics Proceedings Supplements 135: 345–352, with permission from Elsevier.

Factorization, Scaling, and Parton Distributions

One step beyond vacuum matrix elements of currents are their expectation values in single-particle states, and here we make contact with the discovery of QCD, through scaling. Such expectations are relevant to the class of experiments known as deep-inelastic scattering, in which a high-energy electron exchanges a photon with a nucleon target. All QCD information is contained in the tensor matrix element

$$W_N^{\mu\nu}(p, q) \equiv \frac{1}{8\pi} \sum_\sigma \int d^4x\, e^{-i q \cdot x}\, \langle p, \sigma |\, J^\mu(0) J^\nu(x)\, | p, \sigma \rangle \qquad [20]$$

with q the momentum transfer carried by the photon, and p, $\sigma$ the momentum and spin of the target nucleon, N. This matrix element is not infrared safe, since it depends in principle on the entire history of the nucleon state. Thus, it is not accessible to direct perturbative calculation.

Nevertheless, when the scattering involves a large momentum transfer compared to $\Lambda_{\rm QCD}$, we may expect a quantum-mechanical incoherence between the scattering reaction, which occurs (by the uncertainty principle) at short distances, and the forces that stabilize the nucleon. After all, we have seen that the latter, strong forces, should be associated with long distances. Such a separation of dynamics, called factorization, can be implemented in perturbation theory, and is assumed to be a property of full QCD. Factorization is illustrated schematically in Figure 3. Of course, short and long distances are relative concepts, and the separation requires the introduction of a so-called factorization scale, $\mu_F$, not dissimilar to the renormalization scale described above. For many purposes, it is convenient to choose the two equal, although this is not required.

Figure 3 Schematic depiction of factorization in deep-inelastic scattering.

The expression of factorization for deep-inelastic scattering is

$$W_N^{\mu\nu}(p, q) = \sum_{i = q_f, \bar q_f, G} \int_x^1 d\xi\; C_i^{\mu\nu}\left( \xi p, q, \mu_F, \alpha_s(\mu_F) \right) f_{i/N}(\xi, \mu_F) \qquad [21]$$

where the functions $C_i^{\mu\nu}$ (the coefficient functions) can be computed as an expansion in $\alpha_s(\mu_F)$, and describe the scattering of the ‘‘partons,’’ quarks, and gluons, of which the target is made. The variable $\xi$ ranges from unity down to $x \equiv -q^2/2p \cdot q > 0$, and has the interpretation of the fractional momentum of the proton carried by parton i. (Here $-q^2 = Q^2$ is positive.) The parton distributions $f_{i/N}$ can be defined in terms of matrix elements in the nucleon, in which the currents are replaced by quark (or antiquark or gluon) fields, as

$$f_{q/N}(x, \mu) = \frac{1}{4\pi} \int_{-\infty}^{\infty} d\lambda\; e^{-i \lambda x p^+}\, \langle p, \sigma |\, \bar q(\lambda n)\, U_n(\lambda n, 0)\, n \cdot \gamma\, q(0)\, | p, \sigma \rangle \qquad [22]$$
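Returning to the comparison of experiment with [19] described earlier in this section, the following sketch shows how a measured cross section could be turned into a value of $a = \alpha_s(Q)/\pi$. It is a minimal illustration under stated assumptions: the series is truncated at order $a^2$, the scale is chosen as $\mu = Q$ so that the logarithm in [19] drops out, and the five flavors u, d, s, c, b are taken to be active (a choice made here, not in the text). The function names are hypothetical.

```python
import math
from fractions import Fraction

# Quark charges in units of the positron charge e, as quoted in the text.
QUARK_CHARGES = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def charge_sum(flavors):
    """Sum of squared charges e_f^2 over the given flavors."""
    return sum(QUARK_CHARGES[f] ** 2 for f in flavors)


def correction_factor(a, nf):
    """Bracket of eqn [19] at mu = Q: 1 + a + a^2 (1.986 - 0.115 nf)."""
    return 1.0 + a + (1.986 - 0.115 * nf) * a ** 2


def solve_for_a(k, nf):
    """Invert correction_factor: given the measured ratio k of the cross
    section to its lowest-order value, return a (small-a root of the quadratic)."""
    c2 = 1.986 - 0.115 * nf
    if abs(c2) < 1e-12:
        return k - 1.0
    return (-1.0 + math.sqrt(1.0 + 4.0 * c2 * (k - 1.0))) / (2.0 * c2)


active = "udscb"              # assumed active flavors (masses below Q)
nf = len(active)
a_in = 0.12 / math.pi         # alpha_s ~ 0.12 as quoted at mu = m_Z
k = correction_factor(a_in, nf)
print("sum of e_f^2:", charge_sum(active))
print("QCD correction factor:", k)
print("recovered alpha_s:", math.pi * solve_for_a(k, nf))
```

The correction factor differs from unity by only a few percent at these scales, which is the sense in which the lowest-order result is a good first approximation.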
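The use of eqn [12] to carry a measured coupling from one scale to another, mentioned earlier in this section, can be sketched in a few lines. This is a lowest-order illustration only: it keeps just the $b_0$ term, holds $n_f = 5$ fixed at all scales, and takes $m_Z \approx 91.19$ GeV as the reference scale, all of which are assumptions of the sketch rather than statements from the text. The function names are hypothetical.

```python
import math


def b0(nf):
    """Lowest-order beta-function coefficient, b0 = 11 - 2 nf / 3 (eqn [11])."""
    return 11.0 - 2.0 * nf / 3.0


def alpha_s_one_loop(mu1, mu0, alpha0, nf):
    """First form of eqn [12]: alpha_s(mu1) from alpha_s(mu0) = alpha0."""
    log_ratio = math.log(mu1 ** 2 / mu0 ** 2)
    return alpha0 / (1.0 + (b0(nf) / (4.0 * math.pi)) * alpha0 * log_ratio)


def lambda_qcd_one_loop(mu0, alpha0, nf):
    """Scale where the lowest-order coupling has its pole, obtained by
    equating the two forms of eqn [12] at mu1 = mu0."""
    return mu0 * math.exp(-2.0 * math.pi / (b0(nf) * alpha0))


def alpha_s_from_lambda(mu, lam, nf):
    """Second form of eqn [12]: 4 pi / (b0 ln(mu^2 / Lambda^2))."""
    return 4.0 * math.pi / (b0(nf) * math.log(mu ** 2 / lam ** 2))


M_Z = 91.19        # GeV, assumed reference scale
ALPHA_MZ = 0.12    # value quoted in the text at mu = m_Z
NF = 5             # assumed fixed number of active flavors

lam = lambda_qcd_one_loop(M_Z, ALPHA_MZ, NF)
for mu in (2.0, 10.0, 91.19, 1000.0):
    print(mu, "GeV:", alpha_s_one_loop(mu, M_Z, ALPHA_MZ, NF))
print("one-loop Lambda (GeV):", lam)
```

With these inputs the lowest-order pole lies at roughly 0.1 GeV, the same order as the $\approx 200$ MeV quoted above; the precise number depends on the order of the approximation and on how flavor thresholds are treated, neither of which this sketch attempts.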

$n^\mu$ is a light-like vector, and $U_n$ a phase operator whose path $C$ is in the $n$-direction. The dependence of the parton distribution on the factorization scale is through the renormalization of the composite operator consisting of the quark fields, separated along the light cone, and the nonabelian phase operator $U_n(\lambda n, 0)$, which renders the matrix element gauge invariant by eqn [9]. By combining the calculations of the $C$'s and data for $W_N^{\mu\nu}$, we can infer the parton distributions, $f_{i/N}$. Important factorizations of a similar sort also apply to some exclusive processes, including amplitudes for elastic pion or nucleon scattering at large momentum transfer.

Equation [21] has a number of extraordinary consequences. First, because the coefficient function is an expansion in $\alpha_s$, it is natural to choose $\mu_F^2 \sim Q^2 \sim p\cdot q$ (when $x$ is of order unity). When $Q$ is large, we may approximate $C_i^{\mu\nu}$ by its lowest order, which is first order in the electromagnetic coupling of quarks to photons, and zeroth order in $\alpha_s$. In this approximation, dependence on $Q$ is entirely in the parton distributions. But such dependence is of necessity weak (again for $x$ not so small as to produce another scale), because the $\mu_F$ dependence of $f_{i/N}(\xi, \mu_F)$ must be compensated by the $\mu_F$ dependence of $C_i^{\mu\nu}$, which is order $\alpha_s$. This means that the overall $Q$ dependence of the tensor $W_N^{\mu\nu}$ is weak for $Q$ large when $x$ is moderate. This is the scaling phenomenon that played such an important role in the discovery of QCD.

Evolution: Beyond Scaling

Another consequence of the factorization [21], or equivalently of the operator definition [22], is that the $\mu_F$-dependence of the coefficient functions and the parton distributions are linked. As in the lepton annihilation cross section, this may be thought of as due to the independence of the physically observable tensor $W_N^{\mu\nu}$ from the choice of factorization and renormalization scales. This implies that the $\mu_F$-dependence of $f_{i/N}$ may be calculated perturbatively since it must cancel the corresponding dependence in $C_i$. The resulting relation is conventionally expressed in terms of the "evolution equations,"

$$\mu \frac{d f_{a/N}(x,\mu)}{d\mu} = \sum_c \int_x^1 d\xi\; P_{ac}(x/\xi, \alpha_s(\mu))\, f_{c/N}(\xi, \mu) \qquad [23]$$

where $P_{ac}(\xi)$ are calculable as power series, now known up to $\alpha_s^3$. This relation expands the applicability of QCD from scales where parton distributions can be inferred directly from experiment, to arbitrarily high scales, reachable in accelerators under construction or in the imagination, or even on the cosmic level.

At very high energy, however, the effective values of the variable $x$ can become very small and introduce new scales, so that eventually the evolution of eqn [23] fails. The study of nuclear collisions may provide a new high-density regime for QCD, which blurs the distinction between perturbative and nonperturbative dynamics.

Inclusive Production

Once we have evolution at our disposal, we can take yet another step, and replace electroweak currents with any operator from any extension of QCD, in the standard model or beyond, that couples quarks and gluons to the particles of as-yet unseen fields. Factorization can be extended to these situations as well, providing predictions for the production of new particles, $F$ of mass $M$, in the form of factorized inclusive cross sections,

$$\sigma_{AB\to F(M)}(M, p_A, p_B) = \sum_{i,j = q_f, \bar q_f, G} \int d\xi_a\, d\xi_b\; f_{i/A}(\xi_a, \mu)\, f_{j/B}(\xi_b, \mu)\, H_{ij\to F(M)}(\xi_a p_A, \xi_b p_B, M, \mu, \alpha_s(\mu)) \qquad [24]$$

where the functions $H_{ij\to F}$ may be calculated perturbatively, while the $f_{i/A}$ and $f_{j/B}$ parton distributions are known from a combination of lower-energy observation and evolution. In this context, they are said to be "universal," in that they are the same functions in hadron–hadron collisions as in the electron–hadron collisions of deep-inelastic scattering. In general, the calculation of hard-scattering functions $H_{ij}$ is quite nontrivial beyond lowest order in $\alpha_s$. The exploration of methods to compute higher orders, currently as far as $\alpha_s^2$, has required extraordinary insight into the properties of multidimensional integrals.

The factorization method helped predict the observation of the W and Z bosons of electroweak theory, and the discovery of the top quark. The extension of factorization from deep-inelastic scattering to hadron production is nontrivial; indeed, it only holds in the limit that the velocities, $\beta_i$, of the colliding particles approach the speed of light in the center-of-momentum frame of the produced particle. Corrections to the relation [24] are then at the level of powers of $\beta_i - 1$, which translates into inverse powers of the invariant mass(es) of the produced particle(s) $M$. Factorizations of this sort do not apply to low-velocity collisions. Arguments for this

result rely on relativistic causality and the uncertainty principle. The creation of the new state happens over timescales of order $1/M$. Before that well-defined event, the colliding particles are approaching at nearly the speed of light, and hence cannot affect the distributions of each other's partons. After the new particle is created, the fragments of the hadrons recede from each other, and the subsequent time development, when summed over all possible final states that include the heavy particle, is finite in perturbation theory as a direct result of the unitarity of QCD.

Structure of Hadronic Final States

A wide range of semi-inclusive cross sections are defined by measuring properties of final states that depend only on the flow of energy, and which bring QCD perturbation theory to the threshold of nonperturbative dynamics. Schematically, for a state $N = |k_1 \ldots k_N\rangle$, we define $S(N) = \sum_i s(\Omega_i) k_i^0$, where $s(\Omega)$ is some smooth function of directions. We generalize the $e^+e^-$ annihilation case above, and define a cross section in terms of a related, but highly nonlocal, matrix element,

$$\frac{d\sigma(Q)}{dS} \equiv \sigma_0 \int d^4x\, e^{-iQ\cdot x}\, \langle 0 | J^\mu(0)\, \delta\!\left( \int d^2\Omega\; s(\Omega) E(\Omega) - S \right) J_\mu(x) | 0 \rangle \qquad [25]$$

where $\sigma_0$ is a zeroth-order cross section, and where $E$ is an operator at spatial infinity, which measures the energy flow of any state in direction $\Omega$: $E(\Omega) |k_1 \ldots k_N\rangle = (1/Q) \sum_i k_i^0\, \delta^2(\Omega - \Omega_i)\, |k_1 \ldots k_N\rangle$. This may seem a little complicated, but like the total annihilation cross section, the only dimensional scale on which it depends is $Q$. The operator $E$ can be defined in a gauge-invariant manner, through the energy–momentum tensor for example, and has a meaning independent of partonic final states. At the same time, this sort of cross section may be implemented easily in perturbation theory, and like the total annihilation cross section, it is infrared safe. To see why, notice that when a massless ($k^2 = 0$) particle decays into two particles of momenta $xk$ and $(1-x)k$ ($0 \le x \le 1$), the quantity $S$ is unchanged, since the sum of the new energies is the same as the old. This makes the observable $S(N)$ insensitive to processes at low momentum transfer.

For the case of leptonic annihilation, the lowest-order perturbative contribution to energy flow requires no powers of $\alpha_s$, and consists of an oppositely moving quark and antiquark pair. Any measure of energy flow that includes these configurations will dominate over correlations that require $\alpha_s$ corrections. As a result, QCD predicts that in most leptonic annihilation events, energy will flow in two back-to-back collimated sets of particles, known as "jets." In this way, quarks and gluons are observed clearly, albeit indirectly.

With varying choices of $S$, many properties of jets, such as their distributions in invariant mass, and the probabilities and angular distributions of multijet events, and even the energy dependence of their particle multiplicities, can be computed in QCD. This is in part because hadronization is dominated by the production of light quarks, whose production from the vacuum requires very little momentum transfer. Paradoxically, the very lightness of quarks is a boon to the use of perturbative methods. All these considerations can be extended to hadronic scattering, and jet and other semi-inclusive properties of final states can also be computed and compared to experiment.

Conclusions

QCD is an extremely broad field, and this article has hardly scratched the surface. The relation of QCD-like theories to supersymmetric and string theories, and implications of the latter for confinement and the computation of higher-order perturbative amplitudes, have been some of the most exciting developments of recent years. As another example, we note that the reduction of the heavy-quark propagator to a nonabelian phase, noted in our discussion of confinement, is related to additional symmetries of heavy quarks in QCD, with many consequences for the analysis of their bound states. Of the bibliography given below, one may mention the four volumes of Shifman (2001, 2002), which communicate in one place a sense of the sweep of work in QCD.

Our confidence in QCD as the correct description of the strong interactions is based on a wide variety of experimental and observational results. At each stage in the discovery, confirmation, and exploration of QCD, the mathematical analysis of relativistic quantum field theory entered new territory. As is the case for gravity or electromagnetism, this period of exploration is far from complete, and perhaps never will be.

See also: AdS/CFT Correspondence; Aharonov–Bohm Effect; BRST Quantization; Current Algebra; Dirac Operator and Dirac Field; Euclidean Field Theory; Effective Field Theories; Electroweak Theory; Lattice Gauge Theory; Operator Product Expansion in Quantum Field Theory; Perturbation Theory and its Techniques; Perturbative Renormalization Theory and BRST; Quantum Field Theory: A Brief Introduction; Random

Matrix Theory in Physics; Renormalization: General Theory; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Scattering, Asymptotic Completeness and Bound States; Seiberg–Witten Theory; Standard Model of Particle Physics.

Further Reading

Bethke S (2004) $\alpha_s$ at Zinnowitz, 2004. Nuclear Physics Proceedings Supplements 135: 345–352.
Brodsky SJ and Lepage P (1989) Exclusive processes in quantum chromodynamics. In: Mueller AH (ed.) Perturbative Quantum Chromodynamics. Singapore: World Scientific.
Collins JC, Soper DE, and Sterman G (1989) Factorization. In: Mueller AH (ed.) Perturbative Quantum Chromodynamics. Singapore: World Scientific.
Dokshitzer Yu L and Kharzeev DE (2004) Gribov's conception of quantum chromodynamics. Annual Review of Nuclear and Particle Science 54: 487–524.
Dokshitzer Yu L, Khoze V, Troian SI, and Mueller AH (1988) QCD coherence in high-energy reactions. Reviews of Modern Physics 60: 373.
Eidelman S et al. (2004) Review of particle physics. Physics Letters B 592: 1–1109.
Ellis RK, Stirling WJ, and Webber BR (1996) QCD and Collider Physics, Cambridge Monographs on Particle Physics, Nuclear Physics and Cosmology, vol. 8. Cambridge: Cambridge University Press.
Greensite J (2003) The confinement problem in lattice gauge theory. Progress in Particle and Nuclear Physics 51: 1.
Mandelstam S (1976) Vortices and quark confinement in nonabelian gauge theories. Physics Reports 23: 245–249.
Muta T (1986) Foundations of Quantum Chromodynamics. Singapore: World Scientific.
Neubert M (1994) Heavy quark symmetry. Physics Reports 245: 259–396.
Polyakov AM (1977) Quark confinement and topology of gauge groups. Nuclear Physics B 120: 429–458.
Schafer T and Shuryak EV (1998) Instantons and QCD. Reviews of Modern Physics 70: 323–426.
Shifman M (ed.) (2001) At the Frontier of Particle Physics: Handbook of QCD, vols. 1–3. River Edge, NJ: World Scientific.
Shifman M (ed.) (2002) At the Frontier of Particle Physics: Handbook of QCD, vol. 4. River Edge, NJ: World Scientific.
Sterman G (1993) An Introduction to Quantum Field Theory. Cambridge: Cambridge University Press.
't Hooft G (1978) On the phase transition towards permanent quark confinement. Nuclear Physics B 138: 1.
't Hooft G (ed.) (2005) Fifty Years of Yang–Mills Theories. Hackensack: World Scientific.
Weinberg S (1977) The problem of mass. Transactions of the New York Academy of Science 38: 185–201.
Wilson KG (1974) Confinement of quarks. Physical Review D 10: 2445–2459.
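The infrared-safety argument given above for the weighted energy flow $S(N) = \sum_i s(\Omega_i) k_i^0$ can be checked numerically. The following is a minimal sketch, not part of the original article: the weight function `s` (here the squared cosine of the polar angle), the helper names, and the sample back-to-back event are all illustrative assumptions. It splits one massless particle into collinear fragments $xk$ and $(1-x)k$ and compares $S$ before and after.

```python
import math

def s(direction):
    """Hypothetical smooth weight on directions: cos^2 of the polar angle."""
    nx, ny, nz = direction
    return nz ** 2

def energy_flow(momenta):
    """S(N) = sum_i s(Omega_i) * k_i^0 for massless four-momenta (E, px, py, pz)."""
    total = 0.0
    for E, px, py, pz in momenta:
        direction = (px / E, py / E, pz / E)  # unit vector for a massless particle
        total += s(direction) * E
    return total

def split_collinear(k, x):
    """Replace massless momentum k by collinear fragments x*k and (1-x)*k, with 0 < x < 1."""
    return [tuple(x * c for c in k), tuple((1.0 - x) * c for c in k)]

# Assumed sample event: a back-to-back massless pair, each with unit energy.
theta = 0.7
n = (math.sin(theta), 0.0, math.cos(theta))
quark = (1.0, n[0], n[1], n[2])
antiquark = (1.0, -n[0], -n[1], -n[2])

S_before = energy_flow([quark, antiquark])
S_after = energy_flow(split_collinear(quark, 0.3) + [antiquark])
print(S_before, S_after)
```

Because both fragments travel in the direction of the parent and their energies add up to the parent energy, the two values agree up to rounding for any weight that depends only on direction.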

Quantum Cosmology
M Bojowald, The Pennsylvania State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Classical gravity, through its attractive nature, leads to a high curvature in important situations. In particular, this is realized in the very early universe where in the backward evolution energy densities are growing until the theory breaks down. Mathematically, this point appears as a singularity where curvature and physical quantities diverge and the evolution breaks down. It is not possible to set up an initial-value formulation at this place in order to determine the further evolution.

In such a regime, quantum effects are expected to play an important role and to modify the classical behavior such as the attractive nature of gravity or the underlying spacetime structure. Any candidate for quantum gravity thus allows us to reanalyze the singularity problem in a new light which implies the tests of the characteristic properties of the respective candidate. Moreover, close to the classical singularity, in the very early universe, quantum modifications will give rise to new equations of motion which turn into Einstein's equations only on larger scales. The analysis of these equations of motion leads to new classes of early universe phenomenology.

The application of quantum theory to cosmology presents a unique problem with not only mathematical but also many conceptual and philosophical ramifications. Since by definition there is only one universe which contains everything accessible, there is no place for an outside observer separate from the quantum system. This eliminates the most straightforward interpretations of quantum mechanics and requires more elaborate, and sometimes also more realistic, constructions such as decoherence. From the mathematical point of view, this situation is often expected to be mirrored by a new type of theory which does not allow one to choose initial or boundary conditions separately from the dynamical laws. Initial or boundary conditions, after all, are meant to specify the physical system prepared for observations which is impossible in cosmology. Since we observe only one universe, the expectation goes, our theories should finally present us with only

one, unique solution without any freedom for further conditions. This solution then contains all the information about observations as well as observers. Mathematically, this is an extremely complicated problem which has received only scant attention. Equations of motion for quantum cosmology are usually of the type of partial differential or difference equations such that new ingredients from quantum gravity are needed to restrict the large freedom of solutions.

Minisuperspace Approximation

In most investigations, the problem of applying full quantum gravity to cosmology is simplified by a symmetry reduction to homogeneous or isotropic geometries. Originally, the reduction was performed at the classical level, leaving in the isotropic case only one gravitational degree of freedom given by the scale factor $a$. Together with homogeneous matter fields, such as a scalar $\phi$, there are then only finitely many degrees of freedom which one can quantize using quantum mechanics. The classical Friedmann equation for the evolution of the scale factor, depending on the spatial curvature $k = 0$ or $\pm 1$, is then quantized to the Wheeler–DeWitt equation, commonly written as

$$\left( \frac{1}{9} \ell_P^4\, a^{-x} \frac{\partial}{\partial a} a^{x} \frac{\partial}{\partial a} - k a^2 \right) \psi(a,\phi) = -\frac{8\pi G}{3}\, a \hat H_{\rm matter}(a)\, \psi(a,\phi) \qquad [1]$$

for the wave function $\psi(a,\phi)$. The matter Hamiltonian $\hat H_{\rm matter}(a)$, such as

$$\hat H_{\rm matter}(a) = -\frac{1}{2} \hbar^2 a^{-3} \frac{\partial^2}{\partial \phi^2} + a^3 V(\phi) \qquad [2]$$

is left unspecified here, and $x$ parametrizes factor ordering ambiguities (but not completely). The Planck length $\ell_P = \sqrt{8\pi G \hbar}$ is defined in terms of the gravitational constant $G$ and the Planck constant $\hbar$.

The central conceptual issue then is the generality of effects seen in such a symmetric model and its relation to the full theory of quantum gravity. This is completely open in the Wheeler–DeWitt form since the full theory itself is not even known. On the other hand, such relations are necessary to value any potential physical statement about the origin and early history of the universe. In this context, symmetric situations thus present models, and the degree to which they approximate full quantum gravity remains mostly unknown. There are examples, for instance, of isotropic models in anisotropic but still homogeneous models, where a minisuperspace quantization does not agree at all with the information obtained from the less symmetric model. However, often those effects already have a classical analog such as instability of the more symmetric solutions. A wider investigation of the reliability of models and when correction terms from ignored degrees of freedom have to be included has not been done yet.

With candidates for quantum gravity being available, the current situation has changed to some degree. It is then not only possible to reduce classically and then simply use quantum mechanics, but also perform at least some of the reduction steps at the quantum level. The relation to models is then much clearer, and consistency conditions which arise in the full theory can be made certain to be observed. Moreover, relations between models and the full theory can be studied to elucidate the degree of approximation. Even though new techniques are now available, a detailed investigation of the degree of approximation given by a minisuperspace model has not been completed due to its complexity.

This program has mostly been developed in the context of loop quantum gravity, where the specialization to homogeneous models is known as loop quantum cosmology. More specifically, symmetries can be introduced at the level of states and basic operators, where symmetric states of a model are distributions in the full theory, and basic operators are obtained by the dual action on those distributions. In such a way, the basic representation of models is not assumed but derived from the full theory where it is subject to much stronger consistency conditions. This has implications even in homogeneous models with finitely many degrees of freedom, despite the fact that quantum mechanics is usually based on a unique representation if the Weyl operators $e^{isq}$ and $e^{itp}$ for the variables $q$ and $p$ are represented weakly continuously in the real parameters $s$ and $t$.

The continuity condition, however, is not necessary in general, and so inequivalent representations are possible. In quantum cosmology this is indeed realized, where the Wheeler–DeWitt representation assumes that the conjugate to the scale factor, corresponding to extrinsic curvature of an isotropic slice, is represented through a continuous Weyl operator, while the representation derived for loop quantum cosmology shows that the resulting operator is not weakly continuous. Furthermore, the scale factor has a continuous spectrum in the Wheeler–DeWitt representation but a discrete spectrum in the loop representation. Thus, the underlying geometry

of space is very different, and also evolution takes a new form, now given by a difference equation of the type

$$(V_{\mu+5} - V_{\mu+3})\, e^{ik}\, \psi_{\mu+4}(\phi) - (2 + k^2)(V_{\mu+1} - V_{\mu-1})\, \psi_\mu(\phi) + (V_{\mu-3} - V_{\mu-5})\, e^{-ik}\, \psi_{\mu-4}(\phi) = -\frac{4}{3} \pi G \ell_P^2\, \hat H_{\rm matter}(\mu)\, \psi_\mu(\phi) \qquad [3]$$

in terms of volume eigenvalues $V_\mu = (\ell_P^2 |\mu| / 6)^{3/2}$. For large $\mu$ and smooth wave functions, one can see that the difference equation reduces to the Wheeler–DeWitt equation with $|\mu| \propto a^2$ to leading order in derivatives of $\psi$. At small $\mu$, close to the classical singularity, however, both equations have very different properties and lead to different conclusions. Moreover, the prominent role of difference equations leads to new mathematical problems.

This difference equation is not simply obtained through a discretization of [1], but derived from a constraint operator constructed with methods from full loop quantum gravity. It is, thus, to be regarded as more fundamental, with [1] emerging in a continuum limit. The structure of [3] depends on the properties of the full theory such that its qualitative analysis allows conclusions for full quantum gravity.

Applications

Traditionally, quantum cosmology has focused on three main conceptual issues:

• the fate of classical singularities,
• initial conditions and the "prediction" of inflation (or other early universe scenarios), and
• arrow of time and the emergence of a classical world.

The first issue consists of several subproblems since there are different aspects to a classical singularity. Often, curvature or energy densities diverge and one can expect quantum gravity to provide a natural cutoff. More importantly, however, the classical evolution breaks down at a singularity, and quantum gravity, if it is to cure the singularity problem, has to provide a well-defined evolution which does not stop. Initial conditions are often seen in relation to the singularity problem since early attempts tried to replace the singularity by choosing appropriate conditions for the wave function at $a = 0$. Different proposals then lead to different solutions for the wave function, whose dependence on the scalar $\phi$ can be used to determine its probability distribution such as that for an inflaton. Since initial conditions often provide special properties early on, the combination of evolution and initial conditions has been used to find a possible origin of an arrow of time.

Singularities

While classical gravity is based on spacetime geometry and thus metric tensors, this structure is viewed as emergent only at large scales in canonical quantum gravity. A gravitational system, such as a whole universe, is instead described by a wave function which, at best, yields expectation values for a metric. The singularity problem thus takes a different form since it is not metrics which need to be continued as solutions to Einstein's field equations but the wave function describing the quantum system. In the strong curvature regime around a classical singularity, one does not expect classical geometry to be applicable, such that classical singularities may just be a reflection of the breakdown of this picture, rather than a breakdown of physical evolution. Nevertheless, the basic feature of a singularity as presenting a boundary to the evolution of a system equally applies to the quantum equations. One can thus analyze this issue, using new properties provided by the quantum evolution.

The singularity issue is not resolved in the Wheeler–DeWitt formulation since energy densities, with $a$ being a multiplication operator, diverge and the evolution does not continue anywhere beyond the classical singularity at $a = 0$. In some cases one can formally extend the evolution to negative $a$, but this possibility is not generic and leaves open what negative $a$ means geometrically. This is different in the loop quantization: here, the theory is based on triad rather than metric variables. There is thus a new sign factor corresponding to spatial orientation, which implies the possibility of negative $\mu$ in the difference equation. The equation is then defined on the full real line with the classical singularity $\mu = 0$ in the interior. Outside $\mu = 0$, we have positive volume at both sides, and opposite orientations. Using the difference equation, one can then see that the evolution does not break down at $\mu = 0$, showing that the quantum evolution is singularity free.

For the example [3] shown here, one can follow the evolution, for instance, backward in internal time $\mu$, starting from initial values for $\psi$ at large positive $\mu$. By successively solving for $\psi_{\mu-4}$, the wave function at lower $\mu$ is determined. This goes on in this manner only until the coefficient $V_{\mu-3} - V_{\mu-5}$ of $\psi_{\mu-4}$ vanishes, which is the case if and only if $\mu = 4$.
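The backward recurrence can be followed explicitly in a simplified setting. The sketch below is an illustration under stated assumptions, not the article's own computation: it takes $k = 0$, drops the matter term (right-hand side of [3] set to zero), uses units with $\ell_P = 1$, and restricts $\mu$ to multiples of 4. The function names and the placeholder `psi_0` are hypothetical. At $\mu = 4$ the coefficient of $\psi_0$ vanishes, so that instance of the equation is used to relate $\psi_4$ to $\psi_8$; for $\mu \le 0$ the recurrence proceeds again, and any value assigned to $\psi_0$ is multiplied by zero.

```python
def volume(mu, lp=1.0):
    """Volume eigenvalue V_mu = (lp^2 |mu| / 6)^(3/2)."""
    return (lp ** 2 * abs(mu) / 6.0) ** 1.5

def coefficients(mu):
    """Coefficients of psi_{mu+4}, psi_mu, psi_{mu-4} in eq. [3], assuming k = 0 and no matter."""
    c_plus = volume(mu + 5) - volume(mu + 3)
    c_zero = -2.0 * (volume(mu + 1) - volume(mu - 1))
    c_minus = volume(mu - 3) - volume(mu - 5)
    return c_plus, c_zero, c_minus

def evolve_backward(psi_8, psi_0, mu_min=-12):
    """Evolve from mu = 8 down to mu_min in steps of 4; psi_0 is an arbitrary placeholder."""
    psi = {8: psi_8, 0: psi_0}
    # At mu = 4 the coefficient of psi_0 is zero, so the equation only relates psi_4 and psi_8.
    c_plus, c_zero, c_minus = coefficients(4)
    assert c_minus == 0.0
    psi[4] = -c_plus * psi[8] / c_zero
    # For mu <= 0 the coefficient of psi_{mu-4} is nonzero and the recurrence can be solved again.
    for mu in range(0, mu_min, -4):
        c_plus, c_zero, c_minus = coefficients(mu)
        psi[mu - 4] = -(c_plus * psi[mu + 4] + c_zero * psi[mu]) / c_minus
    return psi

# Two runs that differ only in the placeholder value at the classical singularity.
psi_a = evolve_backward(1.0, psi_0=0.0)
psi_b = evolve_backward(1.0, psi_0=123.4)
for mu in sorted(psi_a):
    print(mu, psi_a[mu], psi_b[mu])
```

In this vacuum, $k = 0$ sketch the two runs agree at every $\mu \neq 0$, and the values obtained at negative $\mu$ mirror those at positive $\mu$; with matter or a different ordering the values would change, but the decoupling of $\psi_0$ relies only on the vanishing coefficients.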

The value $\psi_0$ of the wave function exactly at the classical singularity is thus not determined by initial data, but one can easily see that it completely drops out of the evolution. In fact, the wave function at all negative $\mu$ is uniquely determined by initial values at positive $\mu$. Equation [3] corresponds to one particular ordering, which in the Wheeler–DeWitt case is usually parametrized by the parameter $x$ (although the particular ordering obtained from the continuum limit of [3] is not contained in the special family [1]). Other nonsingular orderings exist, such as that after symmetrizing the constraint operator, in which case the coefficients never become 0.

In more complicated systems, this behavior is highly nontrivial but still known to be realized in a similar manner. It is not automatic that the internal time evolution continues, since even in isotropic models one can easily write difference equations for which the evolution breaks down. That the most natural orderings imply nonsingular evolution can be taken as a support of the general framework of loop quantum gravity. It should also be noted that the mechanism described here, providing essentially a new region beyond a classical singularity, presents one mechanism for quantum gravity to remove classical singularities, and so far the only known one. Nevertheless, there is no claim that the ingredients have to be realized in any nonsingular scenario in the same manner. Different scenarios can be imagined, depending on how quantum evolution is understood and what the interpretation of nonsingular behavior is. It is also not claimed that the new region is semiclassical in any sense when one looks at it at large volume. If the initial values for the wave function describe a semiclassical wave packet, its evolution beyond the classical singularity can be deformed and develop many peaks. What this means for the re-emergence of a semiclassical spacetime has to be investigated in particular models, and also in the context of decoherence.

Initial Conditions

Traditional initial conditions in quantum cosmology have been introduced by physical intuition. The main mathematical problem, once such a condition is specified in sufficient detail, then is to study well-posedness, for instance, for the Wheeler–DeWitt equation. Even formulating initial conditions generally, and not just for isotropic models, is complicated, and systematic investigations of the well-posedness have rarely been undertaken. An exception is the historically first such condition, due to DeWitt, that the wave function vanishes at parts of minisuperspace, such as $a = 0$ in the isotropic case, corresponding to classical singularities. This condition, unfortunately, can easily be seen to be ill posed in anisotropic models where in general the only solution vanishes identically. In other models, $\lim_{a \to 0} \psi(a)$ does not even exist. Similar problems of the generality of conditions arise in other scenarios. Most well known are the no-boundary and tunneling proposals where initial conditions are still imposed at $a = 0$, but with a nonvanishing wave function there.

This issue is quite different for difference equations since at first the setup is less restrictive: there are no continuity or differentiability conditions for a solution. Moreover, oscillations that become arbitrarily rapid, which can be responsible for the nonexistence of $\lim_{a \to 0} \psi(a)$, cannot be supported on a discrete lattice. It can then easily happen that a difference equation is well posed, while its continuum limit with an analogous initial condition is ill posed. One example is the dynamical initial conditions of loop quantum cosmology which arise from the dynamical law in the following way: the coefficients in [3] are not always nonzero but vanish if and only if they are multiplied with the value of the wave function at the classical singularity $\mu = 0$. This value thus decouples and plays no role in the evolution. The instance of the difference equation that would determine $\psi_0$, for example, the equation for $\mu = 4$ in the backward evolution, instead implies a condition on the previous two values, $\psi_4$ and $\psi_8$, in the example. Since they have already been determined in previous iteration steps, this translates to a linear condition on the initial values chosen. We thus have one example where indeed initial conditions and the evolution follow from only one dynamical law, which also extends to anisotropic models. Without further conditions, the initial-value problem is always well posed, but may not be complete, in the sense that it results in a unique solution up to norm. Most of the solutions, however, will be rapidly oscillating. In order to guarantee the existence of a continuum approximation, one has to add a condition that these oscillations are suppressed in large volume regimes. Such a condition can be very restrictive, such that the issue of well-posedness appears in a new guise: nonzero solutions do exist, but in some cases all of them may be too strongly oscillating.

In simple cases, one can use generating function techniques advantageously to study oscillating solutions, at least if oscillations are of alternating nature between two subsequent levels of the difference equation. The idea is that a generating function $G(x) = \sum_n \psi_n x^n$ has a stronger pole at $x = -1$ if $\psi_n$

is alternating compared to a solution of constant sign. Choosing initial conditions which reduce the pole order thus implies solutions with suppressed oscillations. As an example, we can look at the difference equation

$$\psi_{n+1} + \frac{2}{n} \psi_n - \psi_{n-1} = 0 \qquad [4]$$

whose generating function is

$$G(x) = \frac{\psi_1 x + \psi_0 \left( 1 + 2x (1 - \log(1 - x)) \right)}{(1 + x)^2} \qquad [5]$$

The pole at $x = -1$ is removed for initial values $\psi_1 = \psi_0 (2 \log 2 - 1)$, which corresponds to nonoscillating solutions. In this way, analytical expressions can be used instead of numerical attempts which would be sensitive to rounding errors. Similarly, the issue of finding bounded solutions can be studied by continued fraction methods. This illustrates how an underlying discrete structure leads to new questions and the application of new techniques compared to the analysis of partial differential equations which appear more commonly.

More General Models

Most of the time, homogeneous models have been studied in quantum cosmology since even formulating the Wheeler–DeWitt equation in inhomogeneous cases, the so-called midisuperspace models, is complicated. Of particular interest among homogeneous models is the Bianchi IX model since it has a complicated classical dynamics of chaotic behavior. Moreover, through the Belinskii–Khalatnikov–Lifschitz (BKL) picture, the Bianchi IX mixmaster behavior is expected to play an important role even for general inhomogeneous singularities. The classical chaos then indicates a very complicated approach to classical singularities, with structure on arbitrarily small scales.

On the other hand, the classical chaos relies on a curvature potential with infinitely high walls, which can be mapped to a chaotic billiard motion. The walls arise from the classical divergence of curvature, and so quantum effects have been expected to change the picture, and shown to do so in several cases.

Inhomogeneous models (e.g., the polarized Gowdy models) have mostly been studied in cases where one can reformulate the problem as that of a massless free scalar on flat Minkowski space. The scalar can then be quantized with familiar techniques in a Fock space representation, and is related to metric components of the original model in rather complicated ways. Quantization can thus be performed, but transforming back to the metric at the operator level and drawing conclusions is quite involved. The main issue of interest in the recent literature has been the investigation of field theory aspects of quantum gravity in a tractable model. In particular, it turns out that self-adjoint Hamiltonians, and thus unitary evolution, do not exist in general.

Loop quantizations of inhomogeneous models are available even in cases where a reformulation such as a field theory on flat space does not exist, or is not being made use of to avoid special gauges. This is quite valuable in order to see if specific features exploited in reformulations lead to artifacts in the results. So far, the dynamics has not been investigated in detail, even though conclusions for the singularity issue can already be drawn.

From a physical perspective, it is most important to introduce inhomogeneities at a perturbative level in order to study implications for cosmological structure formation. On a homogeneous background, one can perform a mode decomposition of metric and matter fields and quantize the homogeneous modes as well as amplitudes of higher modes. Alternatively, one can first quantize the inhomogeneous system and then introduce the mode decomposition at the quantum level. This gives rise to a system of infinitely many coupled equations of infinitely many variables, which needs to be truncated, for example, for numerical investigations. At this level, one can then study the question to which degree a given minisuperspace model presents a good approximation to the full theory, and where additional correction terms should be introduced. It also allows one to develop concrete models of decoherence, which requires a "bath" of many weakly interacting degrees of freedom usually thought of as being provided by inhomogeneities in cosmology, and an understanding of the semiclassical limit.

Interpretations

Due to the complexity of full gravity, investigations without symmetry assumptions or perturbative approximations usually focus on conceptual issues. As already discussed, cosmology presents a unique situation for physics since there cannot be any outside observer. While this fact already has implications for the interpretation of observations at the classical level, its full force is noticed only in quantum cosmology. Since some traditional interpretations of quantum mechanics require the role of
observers outside the quantum system, they do not apply to quantum cosmology.

Sometimes, alternative interpretations such as Bohm theory or many-world scenarios are championed in this situation, but more conventional relational pictures are most widely adopted. In such an interpretation, the wave function yields relational probabilities between degrees of freedom rather than absolute probabilities for measurements done by an outside observer. This has been used, for instance, to determine the probability of the right initial conditions for inflation, but it is marred by unresolved interpretational issues and still disputed. These problems can be avoided by using effective equations, in analogy to an effective action, which modify classical equations on small scales. Since the new equations are still of classical type, that is, differential equations in coordinate time, no interpretational issues arise, at least if one stays in semiclassical regimes. In this manner, new inflationary scenarios motivated from quantum cosmology have been developed.

In general, a relational interpretation, though preferable conceptually, leads to technical complications since the situation is much more involved and evolution is not easy to disentangle. In cosmology, one often tries to single out one degree of freedom as internal time with respect to which evolution of other degrees of freedom is measured. In homogeneous models, one can simply take the volume as internal time, such as $a$ or $\mu$ earlier, but in the full theory no candidate is known. Even in homogeneous models, the volume is not suitable as internal time to describe a possible recollapse. One can use extrinsic curvature around such a point, but then one has to understand what changing the internal time in quantum cosmology implies, that is, whether evolution pictures obtained in different internal time formulations are equivalent to each other.

There are thus many open issues at different levels, which, strictly speaking, do not apply only to quantum cosmology but to all of physics. After all, every physical system is part of the universe, and thus a potential ingredient of quantum cosmology. Obviously, physics works well in most situations without taking into account its being part of one universe. Similarly, much can be learned about a quantum universe if only some degrees of freedom of gravity are considered as in mini- or midisuperspace models. In addition, complicated interpretational issues, as important as they are for a deep understanding of quantum physics, do not prevent the development of physical applications in quantum cosmology, just as they did not do so in the early stages of quantum mechanics.

See also: Canonical General Relativity; Cosmology: Mathematical Aspects; Loop Quantum Gravity; Quantum Geometry and its Applications; Spacetime Topology, Causal Structure and Singularities; Wheeler–De Witt Theory.

Further Reading

Bojowald M (2001a) Absence of a singularity in loop quantum cosmology. Physical Review Letters 86: 5227–5230 (gr-qc/0102069).
Bojowald M (2001b) Dynamical initial conditions in quantum cosmology. Physical Review Letters 87: 121301 (gr-qc/0104072).
Bojowald M (2003) Initial conditions for a universe. General Relativity and Gravitation 35: 1877–1883 (gr-qc/0305069).
Bojowald M (2005) Loop quantum cosmology. Living Reviews in Relativity (to appear).
Bojowald M and Morales-Técotl HA (2004) Cosmological applications of loop quantum gravity. In: Proceedings of the Fifth Mexican School (DGFM): The Early Universe and Observational Cosmology, Lecture Notes in Physics, vol. 646, pp. 421–462. Berlin: Springer (gr-qc/0306008).
DeWitt BS (1967) Quantum theory of gravity. I. The canonical theory. Physical Review 160: 1113–1148.
Giulini D, Kiefer C, Joos E, Kupsch J, Stamatescu IO et al. (1996) Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer.
Hartle JB (2003) What connects different interpretations of quantum mechanics? quant-ph/0305089.
Hartle JB and Hawking SW (1983) Wave function of the universe. Physical Review D 28: 2960–2975.
Kiefer C (2005) Quantum cosmology and the arrow of time. In: Proceedings of the Conference DICE2004, Piombino, Italy, September 2004 (gr-qc/0502016).
Kuchař KV (1992) Time and interpretations of quantum gravity. In: Kunstatter G, Vincent DE, and Williams JG (eds.) Proceedings of the 4th Canadian Conference on General Relativity and Relativistic Astrophysics. Singapore: World Scientific.
McCabe G (2005) The structure and interpretation of cosmology: Part II. The concept of creation in inflation and quantum cosmology. Studies in History and Philosophy of Modern Physics 36: 67–102 (gr-qc/0503029).
Vilenkin A (1984) Quantum creation of universes. Physical Review D 30: 509–511.
Wiltshire DL (1996) An introduction to quantum cosmology. In: Robson B, Visvanathan N, and Woolcock WS (eds.) Cosmology: The Physics of the Universe, pp. 473–531. Singapore: World Scientific (gr-qc/0101003).

Quantum Dynamical Semigroups


R Alicki, University of Gdańsk, Gdańsk, Poland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

With a given quantum system we associate a Hilbert space $\mathcal{H}$ such that pure states of the system are represented by normalized vectors $\psi$ in $\mathcal{H}$ or equivalently by one-dimensional projections $|\psi\rangle\langle\psi|$, whereas mixed states are given by density matrices $\rho = \sum_j p_j |\psi_j\rangle\langle\psi_j|$, $p_j > 0$, $\sum_j p_j = 1$, that is, positive trace-1 operators, and observables are identified with self-adjoint operators $A$ acting on $\mathcal{H}$. The mean value of an observable $A$ at a state $\rho$ is given by the following expression:

$$\langle A\rangle_\rho = \mathrm{tr}(\rho A) \qquad [1]$$

The time evolution of the isolated system is determined by the self-adjoint operator $H$ (Hamiltonian) corresponding to the energy of the system. The infinitesimal change of state of the isolated system can be written as

$$\psi(t+dt) = \psi(t) - iH\,dt\,\psi(t), \quad\text{or}\quad \rho(t+dt) = \rho(t) - i\,dt\,[H, \rho] \qquad [2]$$

which leads to a reversible, purity-preserving unitary dynamics $\psi(t) = e^{-itH}\psi$, $\rho(t) = e^{-itH}\rho\, e^{itH}$. We use the notation $[A, B] \equiv AB - BA$, $\{A, B\} = AB + BA$ and put $\hbar \equiv 1$. An interaction with environment leads to irreversible changes of the density matrix transforming, in general, pure states into mixed ones. Such a process can be modeled phenomenologically by a transition map $V : \mathcal{H} \mapsto \mathcal{H}$ leading to

$$\rho(t+dt) = \rho(t) + dt\,V\rho V^* - \tfrac{1}{2}\,dt\,\{V^*V, \rho\} \qquad [3]$$

Combining Hamiltonian dynamics with several irreversible processes governed by a family of transition operators $\{V_j\}$ we obtain the following formal evolution equation in the Schrödinger picture (quantum Markovian master equation)

$$\frac{d}{dt}\rho(t) = L\rho(t) = -i[H, \rho(t)] + \frac{1}{2}\sum_{j\in I}\big([V_j, \rho(t)V_j^*] + [V_j\rho(t), V_j^*]\big) = D\rho(t) + \rho(t)D^* + \Phi\rho(t) \qquad [4]$$

with the initial condition $\rho(0) = \rho$. Here $D = -iH - \tfrac{1}{2}\sum_j V_j^*V_j$, $\Phi\rho = \sum_{j\in I} V_j\rho V_j^*$, and $I$ is a certain countable set of indices. Assume for the moment that the Hilbert space $\mathcal{H} = \mathbb{C}^n$. Then the eqn [4] is always meaningful and its solution is given in terms of the exponential $\rho(t) = \Lambda(t)\rho \equiv e^{tL}\rho$. The linear map $\Phi$ is a general completely positive map on matrices, which preserves the positivity of $\rho$, and $\Phi\otimes I_d$ preserves positivity of $nd \times nd$ matrices for arbitrary $d = 1, 2, 3, \ldots$ A useful Dyson-type expansion

$$e^{tL}\rho = \mathcal{W}(t)\rho + \sum_{k=1}^{\infty}\int_0^t dt_k\int_0^{t_k} dt_{k-1}\cdots\int_0^{t_2} dt_1\; \mathcal{W}(t-t_k)\,\Phi\,\mathcal{W}(t_k-t_{k-1})\,\Phi\cdots\Phi\,\mathcal{W}(t_1)\rho \qquad [5]$$

with $\mathcal{W}(t)\rho \equiv W(t)\rho W(t)^*$, $W(t) = e^{tD}$ shows that $\Lambda(t)$ is also completely positive. It is often convenient to describe quantum evolution in terms of observables (Heisenberg picture)

$$\langle A\rangle_{\rho(t)} = \mathrm{tr}\big((e^{tL}\rho)A\big) = \mathrm{tr}\big(\rho\, e^{tL^*}A\big) = \langle A(t)\rangle_\rho \qquad [6]$$

$$\frac{d}{dt}A(t) = L^*A(t) = i[H, A(t)] + \frac{1}{2}\sum_{j\in I}\big(V_j^*[A(t), V_j] + [V_j^*, A(t)]V_j\big) = D^*A(t) + A(t)D + \Phi^*A(t) \qquad [7]$$

with the initial condition $A(0) = A$, completely positive $\Phi^*A = \sum_{j\in I} V_j^*AV_j$ and the corresponding Dyson expansion.

The solutions of eqns [4] and [7] are given in terms of dynamical semigroups. Their general mathematical properties and particular examples will be reviewed in this article. Various methods of derivation of master equations for open quantum systems from the underlying Hamiltonian dynamics of composed systems will also be presented.

Semigroups and Their Generators

For standard quantum-mechanical models it is convenient to define a quantum dynamical semigroup in the Schrödinger picture as a one-parameter family $\{\Lambda(t); t \ge 0\}$ of linear and bounded maps acting on the Banach space of trace-class operators $\mathcal{T}(\mathcal{H})$ equipped with the norm $\|\sigma\|_1 = \mathrm{tr}\,(\sigma\sigma^*)^{1/2}$ and satisfying the following conditions:

1. Composition (semigroup) law

$$\Lambda(t)\Lambda(s) = \Lambda(t+s), \quad\text{for all } t, s \ge 0 \qquad [8]$$

2. Complete positivity

$$\Lambda(t)\otimes I_d \text{ is positive on } \mathcal{T}(\mathcal{H}\otimes\mathbb{C}^d) \text{ for all } d = 1, 2, 3, \ldots \text{ and } t \ge 0 \qquad [9]$$

3. Conservativity (trace preservation)

$$\mathrm{tr}(\Lambda(t)\rho) = \mathrm{tr}(\rho), \quad\text{for all } \rho\in\mathcal{T}(\mathcal{H}) \qquad [10]$$

4. Continuity (in a weak sense)

$$\lim_{t\to 0}\mathrm{tr}(A\Lambda(t)\rho) = \mathrm{tr}(A\rho) \quad\text{for all } \rho\in\mathcal{T}(\mathcal{H}),\ A\in\mathcal{B}(\mathcal{H}) \qquad [11]$$

From a general theory of one-parameter semigroups on Banach spaces it follows that under the conditions (1)–(4) $\Lambda(t)$ is a one-parameter strongly continuous semigroup of contractions on $\mathcal{T}(\mathcal{H})$ uniquely characterized by a generally unbounded but densely defined semigroup generator $L$ with the domain $\mathrm{dom}(L)\subset\mathcal{T}(\mathcal{H})$ such that for any $\rho\in\mathrm{dom}(L)$

$$\frac{d}{dt}\rho(t) = L\rho(t), \quad \rho(t) = \Lambda(t)\rho \qquad [12]$$

One can show that for $\lambda > 0$ the resolvent $R(\lambda) = (\lambda I - L)^{-1}$ can be extended to a bounded operator satisfying $\|R(\lambda)\| \le \lambda^{-1}$ and, therefore, the following formula makes sense:

$$\lim_{n\to\infty}\Big(I - \frac{t}{n}L\Big)^{-n}\rho = \Lambda(t)\rho, \quad\text{for all } \rho\in\mathcal{T}(\mathcal{H}) \qquad [13]$$

Under the additional assumption that the generator $L$ is bounded (and hence everywhere defined), Gorini et al. (1976) and Lindblad (1976) proved that eqns [4] and [7] with bounded $H$, $V_j$ and $\sum_j V_j^*V_j$ provide the most general form of $L$. The choice of $H$ and $V_j$ is not unique and the sum over $j$ can be replaced by an integral. In the case of an $n$-dimensional Hilbert space we can always choose the form of eqn [4] with at most $n^2 - 1$ $V_j$'s. Sometimes the structure [4] is hidden, as for the following useful example of the relaxation process to a fixed density matrix $\rho_0$ with the rate $\gamma > 0$:

$$\frac{d}{dt}\rho(t) = \gamma\,(\rho_0 - \rho(t)) \qquad [14]$$

The general structure of an unbounded $L$ is not known. However, the formal expressions [4] and [7] with possibly unbounded $D$ and $V_j$ are meaningful under the following conditions:

the operator $D$ generates a strongly continuous contracting semigroup $\{e^{tD}; t \ge 0\}$ on $\mathcal{H}$;
$\mathrm{dom}(V_j) \supseteq \mathrm{dom}(D)$, for all $j$;
$\langle\phi, D\psi\rangle + \langle D\phi, \psi\rangle + \sum_j \langle V_j\phi, V_j\psi\rangle = 0$, for all $\phi, \psi\in\mathrm{dom}(D)$.

We can solve eqn [4] in terms of a minimal solution. Defining by $Z$ the generator of the contracting semigroup $\rho\mapsto e^{tD}\rho\, e^{tD^*}$ and denoting by $J$ the completely positive (unbounded) map $\rho\mapsto\sum_{j\in I}V_j\rho V_j^*$, one can show that for any $\lambda > 0$, $J(\lambda I - Z)^{-1}$ possesses a unique bounded completely positive extension denoted by $A_\lambda$ with $\|A_\lambda\| \le 1$. Hence, for any $0 \le r < 1$ there exists a strongly continuous, completely positive and contracting semigroup $\Lambda^{(r)}(t)$ with the resolvent explicitly given by

$$R^{(r)}(\lambda) = (\lambda I - Z)^{-1}\sum_{k=0}^{\infty} r^k A_\lambda^k \qquad [15]$$

As $\|R^{(r)}(\lambda)\| \le \lambda^{-1}$, the limit $\lim_{r\to 1} R^{(r)}(\lambda) = R(\lambda)$ exists, where $R(\lambda)$ is the resolvent of the semigroup $\Lambda(t)$ satisfying (1), (2), and (4) and called the minimal solution of the eqn [4]. The minimal solution need not be a unique solution or conservative (generally $\mathrm{tr}\,\rho(t) \le \mathrm{tr}\,\rho(0)$, and for any other solution $\rho'(t) \ge \rho(t)$). There exist useful sufficient conditions for conservativity; an example of a sufficient and necessary condition is the following: $A_\lambda^n \to 0$ strongly as $n\to\infty$ for all $\lambda > 0$ (Chebotarev and Fagnola 1998).

Examples

Bloch equation. The simplest two-level system can be described in terms of spin operators $S_k = \tfrac{1}{2}\sigma_k$, $k = 1, 2, 3$, where $\sigma_k$ are Pauli matrices. The most general master equation of the form [4] can be written as (Alicki and Fannes 2001, Ingarden et al. 1997)

$$\frac{d\rho}{dt} = -i\sum_{k=1}^{3} h_k[S_k, \rho] + \frac{1}{2}\sum_{k,l=1}^{3} a_{kl}\big\{[S_k, \rho S_l] + [S_k\rho, S_l]\big\} \qquad [16]$$

where $h_k\in\mathbb{R}$ and $[a_{kl}]$ is a $3\times 3$ complex, positive-definite matrix. Introducing the magnetization vector $M_k(t) = \mathrm{tr}(\rho(t)S_k)$, we obtain the following Bloch equation used in the magnetic resonance theory:

$$\frac{d}{dt}\mathbf{M}(t) = \mathbf{h}\times(\mathbf{M}(t) - \mathbf{M}_0) - F(\mathbf{M}(t) - \mathbf{M}_0) \qquad [17]$$

where the tensor $F$ (real, symmetric, and positive $3\times 3$ matrix) and the vector $\mathbf{M}_0$ are functions of $[a_{kl}]$. In particular, complete positivity implies the following inequalities for the inverse relaxation times $\gamma_1, \gamma_2, \gamma_3$ (eigenvalues of $F$):

$$\gamma_k \ge 0; \quad \gamma_1 + \gamma_2 \ge \gamma_3, \quad \gamma_3 + \gamma_1 \ge \gamma_2, \quad \gamma_2 + \gamma_3 \ge \gamma_1 \qquad [18]$$
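The link between the master equation [4] for a two-level system and the inequalities [18] on the relaxation rates can be checked numerically. The following is a minimal sketch in pure Python, not a general solver. It assumes $H = 0$ and three illustrative jump operators (decay, pumping, and dephasing with hypothetical rates `g_down`, `g_up`, `kappa`), chosen so that the relaxation tensor $F$ of eqn [17] is diagonal in the Pauli basis; all function names are made up for this illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def lincomb(a, b, cb):
    """Entrywise a + cb * b."""
    return [[a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

ZERO = [[0j, 0j], [0j, 0j]]
PAULI = [
    [[0j, 1 + 0j], [1 + 0j, 0j]],   # sigma_1
    [[0j, -1j], [1j, 0j]],          # sigma_2
    [[1 + 0j, 0j], [0j, -1 + 0j]],  # sigma_3
]
SIGMA_MINUS = [[0j, 0j], [1 + 0j, 0j]]  # sends the sigma_3 = +1 state to the -1 state
SIGMA_PLUS = [[0j, 1 + 0j], [0j, 0j]]

def lindblad(rho, H, Vs):
    """Right-hand side of eqn [4]: -i[H, rho] + sum_j (V rho V* - 1/2 {V*V, rho})."""
    comm = lincomb(matmul(H, rho), matmul(rho, H), -1.0)
    out = [[-1j * x for x in row] for row in comm]
    for V in Vs:
        Vd = dagger(V)
        VdV = matmul(Vd, V)
        out = lincomb(out, matmul(matmul(V, rho), Vd), 1.0)
        anti = lincomb(matmul(VdV, rho), matmul(rho, VdV), 1.0)
        out = lincomb(out, anti, -0.5)
    return out

def jump_operators(g_down, g_up, kappa):
    """Illustrative choice of V_j: decay, pumping, and pure dephasing."""
    def scaled(c, m):
        return [[c * x for x in row] for row in m]
    return [scaled(math.sqrt(g_down), SIGMA_MINUS),
            scaled(math.sqrt(g_up), SIGMA_PLUS),
            scaled(math.sqrt(kappa), PAULI[2])]

def bloch_matrix(H, Vs):
    """G[k][l] = (1/2) tr(sigma_k L(sigma_l)), so that dM/dt = G M + const."""
    return [[0.5 * trace(matmul(PAULI[k], lindblad(PAULI[l], H, Vs))).real
             for l in range(3)] for k in range(3)]

def satisfies_eq18(rates, tol=1e-12):
    """Check gamma_k >= 0 and the three triangle-type inequalities of eqn [18]."""
    g1, g2, g3 = rates
    return (min(rates) >= -tol and g1 + g2 >= g3 - tol
            and g3 + g1 >= g2 - tol and g2 + g3 >= g1 - tol)

# With H = 0 and these V_j the matrix G is diagonal and equals -F,
# so the relaxation rates can be read off the diagonal.
G = bloch_matrix(ZERO, jump_operators(1.0, 0.25, 0.1))
rates = [-G[k][k] for k in range(3)]
print(rates, satisfies_eq18(rates))
```

For this choice the two transverse rates come out as `(g_down + g_up) / 2 + 2 * kappa` and the longitudinal rate as `g_down + g_up`, so the inequalities hold for any non-negative parameters. A rate triple such as `[1.0, 0.1, 0.1]` fails the check, which illustrates that not every positive relaxation tensor comes from a completely positive semigroup.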

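Damped and pumped harmonic oscillator. The quantum master equation for a linearly damped and pumped harmonic oscillator with frequency $\omega$ and the damping (pumping) coefficient $\gamma_\downarrow$ ($\gamma_\uparrow$) has the form

$$\frac{d\rho}{dt} = -i\omega[a^*a, \rho] + \frac{\gamma_\downarrow}{2}\big([a, \rho a^*] + [a\rho, a^*]\big) + \frac{\gamma_\uparrow}{2}\big([a^*, \rho a] + [a^*\rho, a]\big) \qquad [19]$$

where $a^*$, $a$ are creation and annihilation operators satisfying $[a, a^*] = 1$. Taking diagonal elements $p_n = \langle n, \rho\, n\rangle$ in the "particle number" basis $a^*a|n\rangle = n|n\rangle$, $n = 0, 1, 2, \ldots$, which evolve independently of the off-diagonal elements, one obtains the birth and death process,

$$\frac{dp_n}{dt} = \gamma_\downarrow(n+1)p_{n+1} + \gamma_\uparrow n\,p_{n-1} - \big(\gamma_\downarrow n + \gamma_\uparrow(n+1)\big)p_n \qquad [20]$$

It is convenient to use the Heisenberg picture and find an explicit solution in terms of Weyl unitary operators $W(z) = \exp[(i/\sqrt{2})(za + \bar{z}a^*)]$,

$$\Lambda^*(t)W(z) = \exp\left\{-\frac{|z|^2}{4}\,\frac{\gamma_\downarrow + \gamma_\uparrow}{\gamma_\downarrow - \gamma_\uparrow}\Big(1 - e^{-(\gamma_\downarrow - \gamma_\uparrow)t}\Big)\right\}W(z(t)) \qquad [21]$$

where $z(t) = \exp\{-(i\omega + \tfrac{1}{2}(\gamma_\downarrow - \gamma_\uparrow))t\}\,z$, $t \ge 0$. For $\gamma_\downarrow > \gamma_\uparrow$ the solution of eqn [19] always tends to the stationary Gibbs state

$$\rho_\beta = Z^{-1}e^{-\beta\omega a^*a}, \quad Z = \mathrm{tr}\,e^{-\beta\omega a^*a}, \quad \beta = \frac{1}{\omega}\ln(\gamma_\downarrow/\gamma_\uparrow) \qquad [22]$$

Quasifree semigroups. The previous example is the simplest instance of the dynamical semigroups for noninteracting bosons and fermions which are completely determined on the single-particle level. Such systems are defined by a single-particle Hilbert space $\mathcal{H}_1$ and a linear map $\mathcal{H}_1\ni\phi\mapsto a^*(\phi)$ into creation operators satisfying canonical commutation or anticommutation relations (CCRs or CARs) for bosons and fermions, respectively

$$[a(\psi), a^*(\phi)]_\pm = \langle\psi, \phi\rangle, \qquad [A, B]_\pm \equiv AB - (\pm 1)BA \qquad [23]$$

In all expressions containing $(\pm)$, sign $(+)$ refers to bosons and $(-)$ to fermions.

Consider a nonhomogeneous evolution equation on the trace-class operators $\sigma\in\mathcal{T}(\mathcal{H}_1)$:

$$\frac{d\sigma}{dt} = -i[H_1, \sigma] - \frac{1}{2}\big\{(\Gamma_\downarrow - (\pm)\Gamma_\uparrow), \sigma\big\} + \Gamma_\uparrow \qquad [24]$$

with a single-particle Hamiltonian $H_1$ and a damping (pumping) positive operator $\Gamma_\downarrow$ ($\Gamma_\uparrow$) $\ge 0$. The operators $H_1$, $\Gamma_\downarrow$, and $\Gamma_\uparrow$ need not be bounded provided $-iH_1 - \tfrac{1}{2}(\Gamma_\downarrow - (\pm)\Gamma_\uparrow)$ generates a (contracting in the fermionic case) semigroup $\{T(t); t \ge 0\}$ on $\mathcal{H}_1$ and the formal solution of eqn [24]

$$\sigma(t) = T(t)\sigma(0)T^*(t) + Q(t), \quad\text{where}\quad Q(t) = \int_0^t T(s)\Gamma_\uparrow T^*(s)\,ds \qquad [25]$$

is meaningful. We can now define the quasifree dynamical semigroup for the many-particle system described by the Fock space $\mathcal{F}_\pm(\mathcal{H}_1)$ (Alicki and Lendi 1987, Alicki and Fannes 2001). The simplest definition involves Heisenberg evolution of the ordered monomials in $a^*(\psi_j)$ and $a(\phi_j)$:

$$\Lambda^*(t)\,a^*(\psi_1)\cdots a^*(\psi_m)\,a(\phi_1)\cdots a(\phi_n) = \sum \epsilon_\pm\,\mathrm{Det}_\pm\big[\langle\psi_{j_k}, Q(t)\phi_{i_l}\rangle\big]_{k,l=1,2,\ldots,r}\; a^*(T^*(t)\psi_{\alpha_1})\cdots a^*(T^*(t)\psi_{\alpha_{m-r}})\; a(T^*(t)\phi_{\beta_1})\cdots a(T^*(t)\phi_{\beta_{n-r}}) \qquad [26]$$

The sum is taken over all partitions $\{(j_1, \ldots, j_r)(\alpha_1, \ldots, \alpha_{m-r})\}$, $\{(i_1, \ldots, i_r)(\beta_1, \ldots, \beta_{n-r})\}$ such that $j_1 < j_2 < \cdots < j_r$, $\alpha_1 < \alpha_2 < \cdots < \alpha_{m-r}$, $i_1 < i_2 < \cdots < i_r$, $\beta_1 < \beta_2 < \cdots < \beta_{n-r}$; $\epsilon_+ \equiv 1$, $\epsilon_-$ is a product of signatures of the permutations $\{1, 2, \ldots, m\}\mapsto\{j_1, \ldots, j_r, \alpha_1, \ldots, \alpha_{m-r}\}$, $\{1, 2, \ldots, n\}\mapsto\{i_1, \ldots, i_r, \beta_1, \ldots, \beta_{n-r}\}$; a permanent $\mathrm{Det}_+$ is taken for bosons, a determinant $\mathrm{Det}_-$ for fermions.

Introducing an orthonormal basis $\{e_k\}$ in $\mathcal{H}_1$ and using the notation $a^*(e_k) \equiv a_k^*$, we can write a formal master equation for density matrices on the Fock space corresponding to eqn [26]:

$$\frac{d\rho}{dt} = -i[H_F, \rho] + \frac{1}{2}\sum_{k,l}\Big\{\Gamma_\downarrow^{kl}\big([a_k, \rho a_l^*] + [a_k\rho, a_l^*]\big) + \Gamma_\uparrow^{kl}\big([a_k^*, \rho a_l] + [a_k^*\rho, a_l]\big)\Big\} \qquad [27]$$

Again, formally,

$$H_F = \sum_{k,l}\langle e_k, H_1e_l\rangle\,a_k^*a_l, \qquad \Gamma_\downarrow^{kl} = \langle e_k, \Gamma_\downarrow e_l\rangle, \qquad \Gamma_\uparrow^{kl} = \langle e_k, \Gamma_\uparrow e_l\rangle \qquad [28]$$

Often the formulas [27], [28] are not well-defined, but replacing the (infinite) matrices by (distribution-valued) integral kernels, sums by integrals, and $a_k^*$, $a_l$ by quantum fields, we can obtain meaningful objects.

Quasifree dynamical semigroups find applications in the theory of unstable particles, quantum linear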
optics, solid-state physics, quantum information theory, etc. (Alicki and Lendi 1987, Sewell 2002).

Ergodic Properties

Dynamical semigroups which possess stationary states $\rho_0$ satisfying $L\rho_0 = 0$ are of particular interest, for example, in the description of relaxation processes toward equilibrium states (Frigerio 1977, Spohn 1980, Alicki and Lendi 1987). The dynamical semigroup $\{\Lambda(t)\}$ with a stationary state $\rho_0$ is called ergodic if

$$\lim_{t\to\infty}\Lambda(t)\rho = \rho_0, \quad\text{for any initial } \rho \qquad [29]$$

For the case of finite-dimensional $\mathcal{H}$ at least one stationary state always exists. If, moreover, it is strictly positive, $\rho_0 > 0$, then we have the following sufficient condition of ergodicity:

$$\{V_j; j\in I\}' \equiv \{A; A\in\mathcal{B}(\mathcal{H}), [A, V_j] = 0, j\in I\} = \mathbb{C}1 \qquad [30]$$

Open systems interacting with heat baths at the temperature $T$ are described by the semigroups with generators [4] of the special form

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] + \frac{1}{2}\sum_{\omega_j\ge 0}\Big\{[V_j, \rho(t)V_j^*] + [V_j\rho(t), V_j^*] + e^{-\beta\omega_j}\big([V_j^*, \rho(t)V_j] + [V_j^*\rho(t), V_j]\big)\Big\} \qquad [31]$$

where

$$\beta = \frac{1}{k_BT}, \qquad [H, V_j] = -\omega_jV_j \qquad [32]$$

The Gibbs state $\rho_\beta = Z^{-1}e^{-\beta H}$ is a stationary state for eqn [31] and the condition $\{V_j, V_j^*; j\in I\}' = \mathbb{C}1$ implies ergodicity (return to equilibrium). Moreover, the matrix elements of $\rho$ diagonal in the $H$-eigenbasis transform independently of the off-diagonal ones and satisfy the Pauli master equation

$$\frac{dp_k}{dt} = \sum_l (a_{kl}p_l - a_{lk}p_k) \qquad [33]$$

with the detailed balance condition $a_{kl}e^{-\beta E_l} = a_{lk}e^{-\beta E_k}$, where $E_k$ are eigenvalues of $H$.

Define the new Hilbert space $L^2(\mathcal{H}, \rho_\beta)$ as a completion of $\mathcal{B}(\mathcal{H})$ with respect to the scalar product $(A, B) \equiv \mathrm{tr}(\rho_\beta A^*B)$. The semigroup's generators in the Heisenberg picture corresponding to eqn [31] are normal operators in $L^2(\mathcal{H}, \rho_\beta)$ with the Hamiltonian part $i[H, \cdot\,]$ being the anti-Hermitian one (automatically for bounded $L^*$, and for an unbounded one under technical conditions concerning domains). This allows spectral decomposition of $L^*$ and a proper definition of damping rates for the obtained eigenvectors. The normality condition is one of the possible definitions of quantum detailed balance. The other, based on the time-reversal operation, often coincides with the previous one for important examples.

Interesting examples of nonergodic dynamical semigroups are given for open systems consisting of $N$ identical particles with Hamiltonians $H^{(N)}$ and operators $V_j^{(N)}$ invariant with respect to particle permutations. Then the commutant $\{H^{(N)}, V_j^{(N)}, j\in I\}'$ contains an abelian algebra generated by projections on irreducible tensors corresponding to Young tableaux.

From Hamiltonian Dynamics to Semigroups

One of the main tasks in the quantum theory of open systems is to derive master equations [4] from the model of a "small" open system $S$ interacting with a "large" reservoir $R$ at a certain reference state $\omega_R$ (Davies 1976, Spohn 1980, Alicki and Lendi 1987, Breuer and Petruccione 2002, Garbaczewski and Olkiewicz 2002). Starting with the total Hamiltonian $H_\lambda = H_S\otimes 1_R + 1_S\otimes H_R + \lambda\sum_\alpha S_\alpha\otimes R_\alpha$, where $S_\alpha = S_\alpha^*$, $R_\alpha = R_\alpha^*$, $\mathrm{tr}(\omega_RR_\alpha) = 0$, and $\lambda$ is a coupling constant, we define the reduced dynamics of $S$ by

$$\rho(t) = \Lambda^{(\lambda)}(t)\rho = \mathrm{tr}_R\big(U_\lambda(t)\,\rho\otimes\omega_R\,U_\lambda^*(t)\big) \qquad [34]$$

with $U_\lambda(t) = \exp(-itH_\lambda)$. Here $\mathrm{tr}_R$ denotes a partial trace over $R$ defined in terms of an arbitrary basis $\{e_k\}$ of $R$ by the formula $\langle\phi, (\mathrm{tr}_RA)\psi\rangle = \sum_k\langle\phi\otimes e_k, A\,\psi\otimes e_k\rangle$. Generally, $\Lambda^{(\lambda)}(t+s) \ne \Lambda^{(\lambda)}(t)\Lambda^{(\lambda)}(s)$, but dynamical semigroups can provide good approximations in important cases.

Weak-Coupling Limit

Under the conditions of sufficiently fast decay of multitime correlation functions constructed from the observables $R_\alpha$ at the state $\omega_R$, one can prove that for small coupling constant $\lambda$ the exact dynamical map $\Lambda^{(\lambda)}(t)$ can be approximated by the dynamical semigroup corresponding to the following master equation:

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] + \frac{\lambda^2}{2}\sum_{\alpha,\beta}\sum_{\omega\in\mathrm{Sp}} C_{\alpha\beta}(\omega)\big([V_\alpha^{\omega}, \rho(t)V_\beta^{\omega *}] + [V_\alpha^{\omega}\rho(t), V_\beta^{\omega *}]\big) \qquad [35]$$

where $H = H_S + \lambda^2\sum_{\alpha,\beta}\sum_{\omega\in\mathrm{Sp}} K_{\alpha\beta}(\omega)V_\beta^{\omega *}V_\alpha^{\omega}$ is a renormalized Hamiltonian, $\sum_{\omega\in\mathrm{Sp}}$ denotes the sum over eigenfrequencies of $[H, \cdot\,]$, $e^{itH}S_\alpha e^{-itH} = \sum_{\omega\in\mathrm{Sp}} V_\alpha^{\omega}e^{-i\omega t}$ and

$$\int_0^\infty e^{i\omega t}\,\mathrm{tr}\big(\omega_R\,e^{itH_R}R_\alpha e^{-itH_R}R_\beta\big)\,dt = \tfrac{1}{2}C_{\alpha\beta}(\omega) + iK_{\alpha\beta}(\omega) \qquad [36]$$

The rigorous derivation involves the van Hove or weak coupling limit, $\lambda\to 0$, with $\tau = \lambda^2t$ kept fixed.

It follows from the Bochner theorem that the matrix $[C_{\alpha\beta}(\omega)]$ is positive-definite and therefore by its diagonalization we can convert eqn [35] into the standard form [4]. If the reservoir's state $\omega_R$ is an equilibrium state (Kubo–Martin–Schwinger state) then $C_{\alpha\beta}(-\omega) = e^{-\omega/k_BT}C_{\beta\alpha}(\omega)$ and therefore eqn [35] can be written in a form [31]. Moreover, transition probabilities $a_{kl}$ from eqn [33] coincide with those obtained using the "Fermi golden rule."

Low-Density Limit

If the reservoir can be modeled by a gas of noninteracting particles (bosons or fermions) at low density $\nu$, we can derive the following master equation which approximates an exact dynamics [34] in the low-density limit ($\nu\to 0$, with $\tau = \nu t$ kept fixed)

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] + \nu\pi\sum_{\omega\in S}\int_{\mathbb{R}^6} d^3p\,d^3p'\,G(p)\,\delta(E_{p'} - E_p + \omega)\big([T_\omega(p, p'), \rho(t)T_\omega(p, p')^*] + [T_\omega(p, p')\rho(t), T_\omega(p, p')^*]\big) \qquad [37]$$

Here $H$ is a renormalized Hamiltonian of the system $S$, $e^{itH}Te^{-itH} = \sum_{\omega\in S}T_\omega e^{i\omega t}$, $T$ is a T-matrix describing the scattering process involving $S$ and a single particle, $T = V\Omega_+$, where $V$ is a particle-system potential and $\Omega_+$ is a Møller operator. $T_\omega(p, p')$ denotes the integral kernel corresponding to $T_\omega$ expressed in terms of momenta of the bath particle, $E_p$ the kinetic energy of a particle, and $G(p)$ its probability distribution in the momentum space.

If $G(p)\sim\exp(-E_p/k_BT)$ and microreversibility conditions, $E_{-p} = E_p$ and $T_\omega(p, p') = T_{-\omega}(-p', -p)$, hold, then eqn [37] satisfies the quantum detailed-balance condition with the stationary Gibbs state $\rho_\beta$, $\beta = 1/k_BT$.

Entropy and Purity

The relative entropy $S(\rho\,|\,\sigma) = \mathrm{tr}(\rho\ln\rho - \rho\ln\sigma)$ is monotone with respect to any trace-preserving completely positive map $\Lambda$, that is, $S(\Lambda\rho\,|\,\Lambda\sigma) \le S(\rho\,|\,\sigma)$. Hence, for the quantum dynamical semigroup $\Lambda(t)$ with the stationary state $\rho_0$ we obtain the following relation for the von Neumann entropy $S(\rho) = -\mathrm{tr}(\rho\ln\rho)$:

$$\frac{d}{dt}S(\rho(t)) = -\frac{d}{dt}S(\rho(t)\,|\,\rho_0) - \frac{d}{dt}\mathrm{tr}(\rho(t)\ln\rho_0) \qquad [38]$$

where $-(d/dt)S(\rho(t)\,|\,\rho_0) \ge 0$ is an entropy production and the second term describes entropy exchange with environment (Spohn 1980, Alicki and Lendi 1987).

Bistochastic dynamical semigroups preserve the maximally mixed state, that is, $L(1) = 0$. For them, the von Neumann entropy does not decrease and the purity $\mathrm{tr}\,\rho^2$ never increases (Streater 1995). Two important classes of master equations, used to describe decoherence, yield bistochastic dynamical semigroups:

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] - \sum_j [A_j, [A_j, \rho(t)]], \quad A_j = A_j^* \qquad [39]$$

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] + \int_M \mu(d\alpha)\big(U(\alpha)\rho(t)U^*(\alpha) - \rho(t)\big) \qquad [40]$$

where $U(\alpha)$ are unitary and $\mu(\cdot)$ is a (positive) measure on $M$.

Itô–Schrödinger Equations

Up to technical problems in the case of unbounded operators, the master equation [4] is completely equivalent to the following stochastic differential equation (in Itô form):

$$d\psi(t) = -iH\psi(t)\,dt - \frac{1}{2}\sum_{j\in I}V_j^*V_j\psi(t)\,dt - i\sum_{j\in I}V_j\psi(t)\,dX_j(t) \qquad [41]$$

where $X_j(t)$ are arbitrary statistically independent stochastic processes with independent increments (continuous or jump processes) such that the expectation $E(dX_j(t)\,dX_k(t)) = \delta_{jk}\,dt$. Equation [41] should be understood as an integral equation involving stochastic Itô integrals with respect to $\{X_j(t)\}$ computed according to the Itô rule: $dX_j(t)\,dX_k(t) = \delta_{jk}\,dt$. Taking the average $\rho(t) = E(|\psi(t)\rangle\langle\psi(t)|)$ one can show, using the Itô rule, that $\rho(t)$ satisfies eqn [4]. For numerical
applications, it is convenient to use the nonlinear version of eqn [41] for the normalized stochastic vector $\phi(t) = \psi(t)/\|\psi(t)\|$, which can be easily derived from eqn [41] (Breuer and Petruccione 2002).

Introducing quantum noises, for example, quantum Brownian motions defined in terms of bosonic or fermionic fields and satisfying suitable quantum Itô rules, one can develop the theory of noncommutative stochastic differential equations (NSDE) (Hudson and Parthasarathy 1984). Both eqn [41] and NSDE provide examples of unitary dilations: (physically singular) mathematical constructions of the environment $R$ and the $R$–$S$ coupling which exactly reproduce dynamical semigroups as reduced dynamics [34].

Algebraic Formalism

In order to describe open systems in the thermodynamical limit (e.g., infinite spin systems) or systems in quantum field theory one needs the formalism based on $C^*$ or von Neumann algebras. In the $C^*$-algebraic language, by dynamical semigroup (in the Heisenberg picture) we mean a family $\{T(t); t \ge 0\}$ of linear maps on the unital $C^*$-algebra $\mathcal{A}$ satisfying the following conditions: (1) complete positivity, (2) $T(t)T(s) = T(t+s)$, (3) weak (or strong) continuity, and (4) $T(t)1 = 1$. Assuming the existence of a faithful stationary state $\omega = \omega\circ T(t)$ on $\mathcal{A}$, one can use a Gelfand–Naimark–Segal (GNS) representation $\pi_\omega(\mathcal{A})$ of $\mathcal{A}$ in terms of bounded operators on the suitable Hilbert space $\mathcal{H}_\omega$ with the cyclic and separating vector $\Omega$ satisfying $\omega(A) = \langle\Omega, \pi_\omega(A)\Omega\rangle$ for all $A\in\mathcal{A}$. Then the dynamical semigroup can be defined on the von Neumann algebra $\mathcal{M}$ (obtained by a weak closure of $\pi_\omega(\mathcal{A})$) as $\hat{T}(t)\pi_\omega(A) \equiv \pi_\omega(T(t)A)$. The Kadison inequality, valid even for 2-positive bounded maps $\Lambda$ on $\mathcal{A}$,

$$\Lambda(AA^*) \ge \Lambda(A)\Lambda(1)\Lambda(A^*) \qquad [42]$$

implies that $\omega([T(t)A]^*T(t)A) \le \omega(A^*A)$, which allows one to extend the dynamical semigroup to the contracting semigroup $\tilde{T}(t)[\pi_\omega(A)\Omega] \equiv [\pi_\omega(T(t)A)]\Omega$ on the GNS Hilbert space $\mathcal{H}_\omega$. Typically, one tries to define the semigroup in terms of the proper limiting procedures $T(t) = \lim_{n\to\infty}T_n(t)$, where $T_n(t)$ is well defined on $\mathcal{A}$. However, the limit may not exist as an operator on $\mathcal{A}$ but can be well defined on the von Neumann algebra $\mathcal{M}$. If not, the contracting semigroup on $\mathcal{H}_\omega$ may still be a useful object.

Although there exists a rich ergodic theory of dynamical semigroups for the special types of von Neumann algebras, the most difficult problem of constructing physically relevant semigroups for generic infinite systems remains unsolved (Majewski and Zegarliński 1996, Garbaczewski and Olkiewicz 2002).

Nonlinear Dynamical Semigroups

The reduced description of many-body classical or quantum systems in terms of single-particle states (probability distributions, wave functions, or density matrices) leads to nonlinear dynamics (e.g., Boltzmann, Vlasov, Hartree, or Hartree–Fock equations) (Spohn 1980, Garbaczewski and Olkiewicz 2002). A large class of nonlinear evolution equations for single-particle density matrices $\rho$ can be written as (Alicki and Lendi 1987)

$$\frac{d\rho}{dt} = L[\rho]\rho \qquad [43]$$

where $\rho\mapsto L[\rho]$ is a map from density matrices to semigroup generators of the type [4]. Under certain technical conditions the solution of eqn [43] exists and defines a nonlinear dynamical semigroup, that is, a family $\{\Theta(t); t \ge 0\}$ of maps on the set of density matrices satisfying the composition law $\Theta(t+s) = \Theta(t)\Theta(s)$.

A simple example is provided by an open $N$-particle system with the total Hamiltonian invariant with respect to particle permutations. The Markovian approximation combined with the mean-field method leads to a nonlinear dynamical semigroup which preserves purity and for initial pure states is governed by the nonlinear Schrödinger equation with the following structure:

$$\frac{d\psi}{dt} = -i\big(h + NU(\psi)\big)\psi + \frac{N}{2}\sum_j\big(\langle\psi, V_j^*\psi\rangle V_j - \langle\psi, V_j\psi\rangle V_j^*\big)\psi \qquad [44]$$

Here $h$ is a single-particle Hamiltonian, $U(\psi)$ a Hartree potential, and $V_j$ are single-particle operators describing collective dissipation.

See also: Boltzmann Equation (Classical and Quantum); Channels in Quantum Information Theory; Evolution Equations: Linear and Nonlinear; Kinetic Equations; Nonequilibrium Statistical Mechanics (Stationary): Overview; Positive Maps on C*-Algebras; Quantum Error Correction and Fault Tolerance; Quantum
Mechanical Scattering Theory; Stochastic Differential Equations.

Further Reading

Alicki R and Fannes M (2001) Quantum Dynamical Systems. Oxford: Oxford University Press.
Alicki R and Lendi K (1987) Quantum Dynamical Semigroups and Applications. LNP, vol. 286. Berlin: Springer.
Breuer H-P and Petruccione F (2002) Theory of Open Quantum Systems. Oxford: Oxford University Press.
Chebotarev AM and Fagnola F (1998) Sufficient conditions for conservativity of quantum dynamical semigroups. Journal of Functional Analysis 153: 382–404.
Davies EB (1976) Quantum Theory of Open Systems. London: Academic Press.
Frigerio A (1977) Quantum dynamical semigroups and approach to equilibrium. Letters in Mathematical Physics 2: 79–87.
Garbaczewski P and Olkiewicz R (eds.) (2002) Dynamics of Dissipation. LNP, vol. 597. Berlin: Springer.
Gorini V, Kossakowski A, and Sudarshan ECG (1976) Completely positive dynamical semigroups of n-level systems. Journal of Mathematical Physics 17: 821–825.
Hudson R and Parthasarathy KR (1984) Quantum Itô's formula and stochastic evolutions. Communications in Mathematical Physics 93: 301–323.
Ingarden RS, Kossakowski A, and Ohya M (1997) Information Dynamics and Open Systems. Dordrecht: Kluwer.
Lindblad G (1976) On the generators of quantum dynamical semigroups. Communications in Mathematical Physics 48: 119–130.
Majewski WA and Zegarliński B (1996) Quantum stochastic dynamics II. Reviews in Mathematical Physics 8: 689–713.
Sewell G (2002) Quantum Mechanics and Its Emergent Macrophysics. Princeton: Princeton University Press.
Spohn H (1980) Kinetic equations from Hamiltonian dynamics. Reviews of Modern Physics 52: 569–616.
Streater RF (1995) Statistical Dynamics. London: Imperial College Press.

Quantum Dynamics in Loop Quantum Gravity


H Sahlmann, Universiteit Utrecht, Utrecht, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In general relativity, the metric is a dynamic entity, there is no preferred notion of time, and the theory is invariant under diffeomorphisms. Therefore, one expects the concept of dynamics to be very different from that in mechanical or special relativistic systems. Indeed, in a canonical formulation, the diffeomorphism symmetry manifests itself through the appearance of constraints (see Constrained Systems). In particular, in the absence of boundaries, the Hamiltonian turns out to be a linear combination of them. Thus, the dynamics is completely encoded in the constraints.

To quantize such a system following Dirac, one has to define operators corresponding to the constraints on an auxiliary Hilbert space. Solutions to the quantum dynamics are then vectors that are annihilated by all the constraint operators. Technical complications can arise, and the solutions might not lie in the auxiliary Hilbert space but in an appropriately chosen dual.

Physical observables, on the other hand, are associated with operators on the auxiliary space that commute with the constraints or, equivalently, operators that act within the space of solutions.

Since the solutions of the quantum dynamics will not depend on any sort of time parameter in an explicit way, they cannot be readily interpreted as a (quantum) spacetime history. The conceptual questions related to this are known as the "problem of time" in quantum gravity.

We should mention that there is a proposal (consistent discretizations) that allows us to eliminate constraints, at the expense of a discretization of the classical theory and dynamical specification of Lagrange multipliers. Application of this technique to gravity is currently under study.

Loop quantum gravity (LQG) (see Loop Quantum Gravity) is based on the choice of a canonical pair $(A_a, E^b)$ of an SU(2) connection and an su(2)-valued vector density. The constraints come in three classes:

$$G_i[A, E](x) = 0, \qquad V_a[A, E](x) = 0, \qquad C[A, E](x) = 0$$

the Gauss, vector, and scalar constraints, respectively.

Before giving some detail about the quantization of the constraints and their solutions, we should mention that there exists an analogous classical formulation in terms of complex (self-dual) variables. The quantization in that formulation faces serious technical obstacles, but in the case of positive cosmological constant an elegant formal solution to all the constraints, the Kodama state, is known. It is related to the Chern–Simons action on the spatial slice.
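As a compact restatement of the Dirac programme outlined in the Introduction, the conditions on the classical Hamiltonian, the physical states, and the observables can be sketched as follows. This is a schematic summary only, not the article's own notation: the smearing fields $\Lambda^i$ and $N^a$ and the symbols $\Psi$ and $\hat{O}$ are assumptions introduced here for illustration, and domain questions are ignored.

```latex
% Classical Hamiltonian in the absence of boundaries:
% a linear combination of the Gauss, vector, and scalar constraints.
% (\Lambda^i, N^a, N are assumed names for the Lagrange multipliers.)
H = \int d^3x \, \bigl( \Lambda^i G_i + N^a V_a + N\, C \bigr)

% Dirac quantization: a solution \Psi of the quantum dynamics
% is annihilated by every constraint operator.
\hat{G}_i \Psi = 0, \qquad \hat{V}_a \Psi = 0, \qquad \hat{C} \Psi = 0

% If \Psi lies only in a dual of the auxiliary Hilbert space,
% the condition is understood by duality, i.e. evaluated on vectors f
% of the auxiliary space (shown here for the scalar constraint).
\bigl( \hat{C}(N) \Psi \bigr)[f] = 0 \quad \text{for all } N \text{ and all } f

% Physical observables commute with the constraints,
% so they map solutions to solutions.
[\hat{O}, \hat{C}(N)] = 0, \qquad [\hat{O}, \hat{V}_a] = 0, \qquad [\hat{O}, \hat{G}_i] = 0
```

The third line is the form in which the scalar constraint is imposed on diffeomorphism-invariant states in the remainder of the article.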

As said before, strictly speaking, implementing the dynamics comprises quantizing and satisfying all the constraints. Here we will however focus on $C$, since it is the most challenging and the most closely related to standard dynamics, in that it generates changes under timelike deformations of the Cauchy surface $\Sigma$ on which the canonical formulation is based.

The quantum solutions of the other constraints, linear combinations of s-knots, lie in a Hilbert space $\mathcal{K}_{\mathrm{diff}}$ which is part of the dual of the kinematical Hilbert space $\mathcal{K}$ of the theory. For details on these solutions, as well as some basic definitions that will be used without comment below, see Loop Quantum Gravity. Since s-knots are labeled, among other things, by a diffeomorphism equivalence class of a graph, relations to knot theory are emerging at this level (see Knot Invariants and Quantum Gravity).

It is important to note that $C$ does not Poisson-commute with the diffeomorphism constraints. Therefore, in the quantum theory it does matter in which order the constraints are solved. It turns out that on the quantum solutions to the other constraints, the scalar constraint can be defined by introducing a regulator, and stays well defined even when the regulator is removed. This ultraviolet finiteness on $\mathcal{K}_{\mathrm{diff}}$ can be intuitively understood from the diffeomorphism invariance of its elements: there is no problematic short-distance regime since the states do not contain any scale at all.

In the following we will briefly review the implementation of the scalar constraint in LQG and comment on some ramifications and open questions.

The Scalar Constraint Operator

In the Lorentzian theory the scalar constraint $C$ is the sum of the scalar constraint $C_E$ of the Euclidean theory,

$$C_E = (\det q)^{-1/2}\, \mathrm{tr}\bigl(F_{ab}\,[E^a, E^b]\bigr)$$

a second term of a similar form, but with the curvature $F$ of the connection $A$ replaced by the curvature associated to a certain triad $e$, and possibly matter terms. In the following we will just discuss $C_E$; the other terms can be handled in a similar fashion.

There appear to be a number of obstacles to the quantization of $C_E$: for one, the inverse of the determinant would likely be ill defined, as the volume operator (essentially a quantization of $\int (\det q)^{1/2}$) has a large kernel. In addition, there are no well-defined operators corresponding to $F$ and $E$ evaluated at points. Rather, only holonomies $h_e[A]$ of $A$ along curves $e$ and certain functionals of $E$ are well defined as operators. These issues can however be dealt with in an elegant way as follows. The first step is to absorb the determinant factor into a Poisson bracket,

$$C_E = \frac{2}{\kappa}\, \epsilon^{abc}\, \mathrm{tr}\bigl(F_{ab}\,\{A_c, V\}\bigr)$$

where $V$ is the volume of the spatial slice $\Sigma$. Then one approximates the curvature by (identity minus) the holonomy around a small loop. In the present case one finds that for a small tetrahedron $\Delta$ with base point $v$, one can approximate

$$C_E^{\Delta}(N) := \frac{2}{\kappa} \int_{\Delta} N\, \mathrm{tr}\bigl(F \wedge \{A, V\}\bigr) \approx -\frac{2}{3\kappa}\, N(v)\, \epsilon^{ijk}\, \mathrm{tr}\bigl(h_{\alpha_{ij}}\, h_{s_k}\, \{h_{s_k}^{-1}, V\}\bigr)$$ [1]

where (see Figure 1(a)) the $s_i$ are edges of $\Delta$ incident at $v$ and the $\alpha_{ij}$ loops around the faces of $\Delta$ incident at $v$.

This suggests how to define an operator $\hat{C}_E^{\gamma}$ that acts on cylindrical functions on a given graph $\gamma$: one chooses a triangulation adapted to the graph and quantizes the $C_E^{\Delta}(N)$ (where $\Delta$ is a tetrahedron of this triangulation) using the right-hand side of [1]: holonomies are quantized by the holonomy operators of the quantum theory, $V$ by the volume operator $\hat{V}$, and the Poisson bracket by the corresponding commutator divided by $i\hbar$. To be more precise, the triangulation is chosen such that the $s_k$ in [1] are part of $\gamma$, and the operators corresponding to the $h_{\alpha}$ are creating new edges that connect the endpoints of the $s_k$ (see Figure 1(b)).

Figure 1 (a) A tetrahedron $\Delta$ and its labeling of edges ($s_1$, $s_2$, $s_3$ incident at $v$) and loops ($\alpha_{12}$). (b) A tetrahedron $\Delta$ adapted to the edges (dashed lines) of a graph $\gamma$.

Still this is not sufficient, since the definition of $\hat{C}_E^{\gamma}$ depends quite heavily on the choice of the triangulation, and there is no natural way to choose one. Furthermore, there is no choice that would guarantee
that the $\hat{C}_E^{\gamma}$ for different $\gamma$ are consistent in the sense that they correspond to the action of the same operator $\hat{C}_E$ on two different cylindrical subspaces.

Here, the diffeomorphism invariance of the theory comes to the rescue: a well-defined operator largely free of ambiguities can be obtained by letting the operators above act (by duality) on $\mathcal{K}_{\mathrm{diff}}$ to give elements in $\mathcal{K}^*$. When acting on diffeomorphism-invariant states, the ambiguities in the definition of the triangulations can be eliminated, and the operators $\hat{C}_E^{\gamma}$ for different $\gamma$ are consistent and together define an operator $\hat{C}_E(N)$. Roughly speaking, for a diffeomorphism-invariant state, it does not matter anymore where on the graph the endpoints of the $s_k$ lie and how they are connected to form the loops $\alpha$.

The final picture looks as follows: for each s-knot $s$, the operator gives a sum of contributions, one for each vertex of $s$, that is, $\hat{C}_E(N)\, s = \sum_v \hat{C}_v(N)\, s$. The terms in this sum are not diffeomorphism invariant. Their evaluation on a spin network $S$ is of the form

$$(\hat{C}_v\, s)[S] = \sum_{s'} c(s')\, N(x(v))\, s'[S]$$ [2]

where the $s'$ are s-knots that differ from $s$ by the addition or deletion of certain edges, and corresponding changes in coloring (by $\pm 1/2$) and intertwiners. As an example, Figure 2 schematically depicts the action on a trivalent vertex. The point $x(v)$ on which $N$ is evaluated in the above formula gets determined as follows: the evaluation $s'[S]$ is zero unless the graph $\gamma$ on which $S$ is based is an element in the diffeomorphism equivalence class on which $s'$ is based; $x(v)$ is the position of the vertex $v$ in this element of the equivalence class. Because of this $x(v)$, the action of $\hat{C}_E(N)$ is not diffeomorphism invariant.

Figure 2 A schematic rendering of the action of the operator $\hat{C}_v$ for a trivalent vertex.

Similar techniques give a quantization $\hat{C}$ of the full constraint. The solutions to the constraint can be determined as the vectors $\psi \in \mathcal{K}_{\mathrm{diff}}$ that are annihilated by $\hat{C}$ in the sense that $(\hat{C}(N)\psi)[f] = 0$ for all functions $N$ and elements $f$ of $\mathcal{K}$. The solutions are more or less explicitly known; however, the task of interpreting them is a hard one and remains an object of current research.

It should be mentioned that, strictly speaking, one can arrive at several slightly different versions of the constraint operator along the lines sketched above. The quantization ambiguities include changes in the power of the volume operator and the spin quantum number that the constraint creates or annihilates. An interesting check on these quantizations would be to inspect the algebra of constraint operators for anomalies. In the present situation, this can only be carried out to a certain extent, because $\hat{C}$ is defined on diffeomorphism-invariant states. The Poisson bracket between two scalar constraints is proportional to a diffeomorphism constraint, and indeed it turns out that in the quantum theory the commutator of two scalar constraint operators vanishes for quantizations as described above. In that sense they are anomaly free; however, this criterion is not strong enough to distinguish between the candidates.

Recently, a slightly different strategy has been proposed, which, if successfully implemented, would eliminate some of the questions regarding the constraint algebra. The idea is to combine the constraints $C(N)$ for different lapse functions $N$ into one master constraint

$$M = \int_{\Sigma} (\det q)^{-1/2}\, C^2\, d^3x$$

$M$ is manifestly diffeomorphism invariant and could replace all the noncommuting constraints $C(N)$, hence simplifying the constraint algebra considerably.

The interpretation of the solutions of all the constraints hinges on the construction of observables for the theory. This is already a difficult task in the classical theory, and thus even more so after quantization. Though there is no general solution to this problem available, interesting proposals are being studied.

Finally, it should be said that the quantization of the scalar constraint can be used to obtain a picture that more closely resembles the standard time evolution in quantum field theory. The (formal) power series expansion of the projector

$$P = \prod_{x \in \Sigma} \delta\bigl(\hat{C}(x)\bigr) = \int D[N]\, \exp\Bigl( i \int_{\Sigma} N(x)\, \hat{C}(x) \Bigr)$$

onto the kernel of $\hat{C}$ can be described by a spin foam model (see Spin Foams).

For further information on the subject of this article see the references: Thiemann (to appear), Rovelli (2004), and Ashtekar and Lewandowski (2004) for general reviews on LQG (with a systematic exposition of a large class of quantizations of the scalar constraint and their solutions in Ashtekar and Lewandowski (2004)); Thiemann (1998) for a seminal work on the quantization of the scalar constraint; Rovelli (1999) and Reisenberger and Rovelli (1997) on the connection to spin foam models; Di Bartolo et al. (2002) on
consistent discretizations; Kodama (1990) and Freidel and Smolin (2004) on the Kodama state; and Thiemann (2003) on the master constraint program.

See also: Constrained Systems; Knot Invariants and Quantum Gravity; Loop Quantum Gravity; Quantum Geometry and its Applications; Spin Foams; Wheeler–De Witt Theory.

Further Reading

Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report. Classical and Quantum Gravity 21: R53.
Di Bartolo C, Gambini R, and Pullin J (2002) Canonical quantization of constrained theories on discrete space-time lattices. Classical and Quantum Gravity 19: 5275.
Freidel L and Smolin L (2004) The linearization of the Kodama state. Classical and Quantum Gravity 21: 3831.
Kodama H (1990) Holomorphic wave function of the universe. Physical Review D 42: 2548.
Reisenberger MP and Rovelli C (1997) "Sum over surfaces" form of loop quantum gravity. Physical Review D 56: 3490.
Rovelli C (1999) The projector on physical states in loop quantum gravity. Physical Review D 59: 104015.
Rovelli C (2004) Quantum Gravity, Cambridge Monographs in Mathematical Physics. Cambridge: Cambridge University Press.
Thiemann T (1998) Quantum spin dynamics (QSD). Classical and Quantum Gravity 15: 839.
Thiemann T (2003) The Phoenix project: master constraint programme for loop quantum gravity, arXiv:gr-qc/0305080.
Thiemann T (2006) Modern Canonical Quantum General Relativity, Cambridge University Press (to appear).

Quantum Electrodynamics and Its Precision Tests


S Laporta, Università di Parma, Parma, Italy
E Remiddi, Università di Bologna, Bologna, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Quantum electrodynamics (QED) describes the interaction of the electromagnetic field (EMF) with charged particles. Any physical particle interacts, directly or indirectly, with any other particle (including itself); in the case of the electron, however, at low and medium energy (say, up to a few GeV) the interaction with the EMF is by far the most important, so that QED describes with great precision the dynamics of the electron, and at the same time the electron provides the most stringent tests of QED currently available.

In the various sections of this article we will discuss, in the following order, the origin of QED, the structure of the radiative corrections, the application of QED to various bound-state problems (the hydrogen-like atoms, muonium, and positronium) and the anomalous magnetic moments of the leptons (the muon and the electron).

Origin of QED

The origin of QED can ideally be traced back to the very beginning of quantum mechanics, the blackbody formula by M Planck (1900), which was soon understood as pointing to a discretization of the energy and momentum associated to the EMF into quanta of light or photons (Einstein 1905).

The quantization of the EMF was first worked out by P Jordan, within the article (1926) by M Born, W Heisenberg, and P Jordan (usually referred to as the Dreimännerarbeit) and then in the paper "The quantum theory of emission and absorption of radiation" by PAM Dirac, commonly considered the beginning of the so-called second quantization formalism.

In the subsequent year (1928) Dirac published the famous equation for the relativistic electron, from which it was immediately deduced, on a firmer basis, that the electron has spin 1/2, that its spin gyromagnetic ratio (the ratio between spin and associated magnetic moment in suitable dimensionless units; see below for more details) is twice the value predicted by classical physics (a result expressed as $g_e = 2$) and that the levels of atomic hydrogen with the same principal quantum number $n$ are not fully degenerate, as in the nonrelativistic limit, but do possess the so-called fine structure splitting. In particular, the energy of the $n = 2$ levels splits into two values, one value for the $2P_{3/2}$ states with total angular momentum $J = 3/2$ and another value for the states $2S_{1/2}$ and $2P_{1/2}$, which have $J = 1/2$; note that the $2S_{1/2}$ and $2P_{1/2}$ states are still degenerate.

Very soon it was realized that Dirac's equation also requires that each particle must be accompanied by its antiparticle, with exactly the same mass and opposite charge. The antiparticle of the electron, the positron, was indeed discovered by C Anderson
(1932), establishing Dirac's equation as one of the cornerstones of theoretical physics.

All the ingredients needed for the evaluation of the perturbative corrections to the QED theory (usually called radiative corrections) were already present at that moment, but radiative corrections were not systematically investigated for several years, due perhaps to the length and difficulty of the calculations and the absence of important disagreements between theoretical predictions and experimental results.

The situation changed in 1947, when two experiments were carried out, measuring the energy difference between the $2\,^2S_{1/2}$ and $2\,^2P_{1/2}$ levels of the hydrogen atom and the gyromagnetic ratios of the electron.

Lamb and Retherford (1947), by using the "great wartime advances in microwaves techniques," succeeded in establishing that in the hydrogen atom "the $2\,^2S_{1/2}$ state is higher than the $2\,^2P_{1/2}$ by about 1000 Mc/sec.," while (as observed above) according to the Dirac theory the two states are expected to have exactly the same energy. Subsequent refinements of the experiment (Triebwasser et al. 1953) gave for the difference (now referred to as the Lamb shift) the value $1057.77 \pm 0.10$ MHz, with a relative error of $1 \times 10^{-4}$.

The authors of the second 1947 experiment (Kusch and Foley 1947) measured the frequencies associated with the Zeeman splitting of two different states of gallium, finding an inconsistency with the theoretical values of the gyromagnetic ratios of the electron. More exactly, write the magnetic moments $\boldsymbol{\mu}_L$, $\boldsymbol{\mu}_S$ associated to the (dimensionless) orbital and spin angular momenta $\mathbf{L}$, $\mathbf{S}$ of the electron as

$$\boldsymbol{\mu}_L = g_L\, \frac{-e\hbar}{2 m_e c}\, \mathbf{L}, \qquad \boldsymbol{\mu}_S = g_S\, \frac{-e\hbar}{2 m_e c}\, \mathbf{S}$$ [1]

where $(-e)$ is the charge of the electron ($e > 0$), $m_e$ its mass, $c$ the speed of light and $g_L$, $g_S$, respectively, the orbital and spin gyromagnetic ratios; the Dirac theory then predicts $g_L = 1$ and $g_S = 2$, while the results of Kusch and Foley (1947) gave a discrepancy which could be accounted for by taking $g_S = 2.00229 \pm 0.00008$ and $g_L = 1$, or alternatively $g_S = 2$ and $g_L = 0.99886 \pm 0.00004$. In modern notation the first conjecture can be rewritten as

$$g_S = g_e = 2(1 + a_e), \qquad a_e = 0.001145 \pm 0.00004$$ [2]

where $a_e$ is the anomalous magnetic moment (or magnetic anomaly) of the electron.

The need to explain the two experimental results gave rise to a rapid development of covariant perturbation theory (which replaced the previous noncovariant "old fashioned" perturbation theory) and of the renormalization theory, which liberated the perturbative expansion from the divergences plaguing the older approach, opening the path to the evaluation of radiative corrections and to the great success of precision predictions of QED.

The formalism improved quickly, evolving into the more general quantum field theory (QFT) approach; three of the main contributors were Sin-Itiro Tomonaga, Julian Schwinger, and Richard P Feynman, awarded a few years later (1965) the Nobel Prize "for their fundamental work in quantum electrodynamics, with deep-ploughing consequences for the physics of elementary particles." QFT was then successfully used for describing the weak interactions in the electroweak model and later on also for the theory of the strong interactions, dubbed quantum chromodynamics (or QCD, in analogy with the popular QED acronym). For more details and references to original works, the reader is invited to look at any treatise on QED or QFT, such as, for instance, Weinberg (1995).

Initially, the Lamb shift was perhaps more important than the electron magnetic anomaly both for the establishment of renormalization theory and as a test of QED, but in the following years it was supplanted by the latter as a precision test of QED. In 1947 the "best values" for some fundamental constants were indeed

$$c = (2.99776 \pm 0.00004) \times 10^{10}\ \mathrm{cm\,s^{-1}}$$
$$R_\infty = \frac{m_e c^2 \alpha^2}{2hc} = 109737.303 \pm 0.017\ \mathrm{cm^{-1}}$$ [3]
$$1/\alpha = 137.030 \pm 0.016$$

where $R_\infty$ is the Rydberg constant for infinite mass, $h$ the Planck constant, and $\alpha$ the fine structure constant (let us observe here in passing that $R_\infty$ was and is still known much better than the separate values of $m_e$, $\alpha$, and $h$ entering in its definition); for comparison, the current (2005) values for $c$ and $R_\infty$ are

$$c = 299792458\ \mathrm{m\,s^{-1}}$$
$$R_\infty = 109737.31568525(73)\ \mathrm{cm^{-1}}$$ [4]

where the value of $c$ is exact (it is in fact the definition of the meter), and the relative error in $R_\infty$ is $6.6 \times 10^{-12}$ (the value of $\alpha$ will be discussed later).

The measurement of the Lamb shift, repeated several times, gave results in nice agreement with the original value, and for several years it provided either a test of QED or a precise value for $\alpha$. But the Lamb shift is the energy difference between the metastable level $2S_{1/2}$ (whose lifetime
is about 1/7 s) and the $2P_{1/2}$ level, which has a lifetime of about 1.596 ns or a natural linewidth of 99.7 MHz. Such a large linewidth poses a strong intrinsic limitation to the precision attainable in the measurement of the Lamb shift, which is just ten times larger; as a matter of fact, that precision could never reach the $1 \times 10^{-6}$ relative error level, while in the meantime the relative precision in $a_e$ reached the $10^{-9}$ range, replacing the Lamb shift in the role of the leading quantity in high-precision QED.

The Structure of Radiative Corrections

For obvious reasons of space we can only superficially sketch here the lines along which the perturbative expansion of QED leading to the evaluation of radiative corrections can be built, considering for simplicity only the photon and the electron. One can start from a QED Lagrangian, formally similar to the classical Lagrangian, involving the electron field and the vector potentials of the electromagnetic (or photon) field. The theory is a gauge theory (its physical content should not change if a gradient is added to the vector potentials); it is further an abelian gauge theory as the EMF does not interact directly with itself.

The QED Lagrangian is separated into a free part and an interaction part. From the free part, one derives the wave functions of the free-particle states and the corresponding time-evolution operators (free Green's functions or propagators; let us just recall here that to obtain a convenient photon propagator one has to break the gauge invariance by adding to the Lagrangian a suitable gauge-breaking term), while the interaction part of the Lagrangian gives the "interaction vertices" of the theory.

The aim of the theory is to build the Green's functions for the various processes in the presence of the interaction; from these Green's functions, one then derives all the physical quantities of interest.

With the free propagators and the interaction vertices, one generates the perturbative expansion of the Green's functions. The result, namely the contributions to the perturbative expansion (or radiative corrections), can be depicted in terms of Feynman graphs: they consist of various particle lines joined in the interaction vertices, with external lines corresponding to the initial and final particles and internal lines corresponding to intermediate or virtual particle states. Each graph stands for an integral on the momenta of all the intermediate states, each vertex implying among other things an interaction constant, which is $(-e)$ in the case of electron QED, and a $\delta$-function imposing the conservation of the momenta at that vertex. For each process, the Feynman graphs are naturally classified by the total number of the interaction vertices they contain. In the simplest graphs for a given process (the so-called tree graphs) the $\delta$-functions at the vertices make the integrations trivial; but when the number of vertices increases, closed loops of virtual particle states appear, whose evaluation quickly becomes extremely demanding.

In QED, each loop gives an extra factor $(-e)^2$ with respect to the tree graph; it is customary to express it in terms of $(\alpha/\pi) = (e/2\pi)^2$, so that the resulting power of $(\alpha/\pi)$ corresponds to the number of internal loops. The typical QED prediction for a physical quantity is then expressed as a series of powers of the fine structure constant $\alpha$ (and of its logarithm in bound-state problems). As $\alpha$ is small ($\alpha \simeq 1/137$), and the first coefficients of the expansions are usually of the order of 1, a small number of terms in the expansion is in general sufficient to match the precision of the available experimental data.

But the number of different graphs for a given number of loops grows quickly with the number of the loops; in turn, each graph consists in general of a great number of terms and the loop integrations become prohibitively difficult when the number of loops increases, so that the evaluation of radiative corrections proved to be one of the major computational challenges of theoretical physics. As a matter of fact, it prompted the development of computer programs (Veltman 1999) for processing the huge algebraic expressions usually encountered, and of many sophisticated numerical and analytical techniques for performing the loop integrations.

It should be further mentioned here that Feynman graphs written by naively following the rules sketched above are often mathematically ill-defined, taking the form of nonconvergent integrals on the loop momenta. A regularization procedure is needed to give an unambiguous meaning to all the integrals; currently the most powerful regularization is the continuous dimensional regularization scheme, in which the loop integrations are carried out in $d$ continuous dimensions, with $d$ unspecified; renormalization counterterms are also evaluated in the same scheme, and the physical quantities are recovered in the $d \to 4$ limit (unrenormalized loop integrals and renormalization counterterms are usually singular as powers of $1/(d - 4)$ in the $d \to 4$ limit, but all those divergences cancel out in the physical combinations of interest).

QED describes the main interaction of the charged leptons ($e$, $\mu$, and $\tau$) which have, however, weak interactions as well. Strictly speaking, pure
QED processes do not exist; it is an essential feature of QFT that any existing particle can contribute to the Feynman graphs for any process, when the approximation is pushed to a sufficiently high degree. In particular the photon, which is the main carrier of the QED interaction, is directly coupled also to the strongly interacting particles (the resulting contributions are referred to as "hadronic vacuum polarization" effects).

The precision tests of QED are then to be necessarily searched for in those phenomena where non-QED contributions are presumably small and which involve quantities already well known independently of QED itself. But such high-precision quantities are not always available, and as QED is known better than the rest of physics, very often it is taken to be correct by assumption, and used as a tool for extracting or measuring some of the non-QED quantities relevant to various physical processes.

In any case, as QED predictions are expressed in terms of the fine structure constant $\alpha$, a determination of $\alpha$ independent of QED is needed; without it, the most precise predictions of QED would simply become measurements of $\alpha$ and not tests of the theory.

Finally, it is to be recalled that, ironically, the problem of the convergence of the expansion in powers of $\alpha$ is still open, even if it is commonly accepted that convergence problems will matter only for precisions and corresponding perturbative orders (say at order $1/\alpha \simeq 137$) absolutely out of reach of present experimental and computational possibilities, involving further extremely high energies, where the other fundamental interactions are expected to be as important as QED, so that it would be meaningless to consider only QED.

In the following we will discuss only the QED predictions for bound states and the anomalous magnetic moments of $\mu$ and $e$.

The Bound States

A very good review of the current status of the theory of hydrogen-like atoms can be found in Eides et al. (2001), to which we refer for more details and citation of the original papers. The starting point for studying the bound-state problem in QED is the scattering amplitude of two charged particles, predicted by perturbative QED (pQED) as a (formal) series expansion in powers of $\alpha$. In the static limit $v \to 0$, where $v$ is the relative velocity of the two particles, some of the pQED terms behave as $\alpha/v$, so that the naive expansion in $\alpha$ becomes meaningless. Fortunately, it is relatively easy to identify the origin of those terms (which are essentially due to the Coulomb interaction between the two charges) and to devise techniques for their resummation. Among them, one can quote the Bethe–Salpeter equation, formally very elegant and complete but difficult to use in practice. Great progress has been achieved by the NRQED (nonrelativistic QED) approach, which is a nonrelativistic theory designed to reproduce the full QED scattering amplitude in the nonrelativistic limit by the ad hoc definition, a posteriori, of a suitable effective Hamiltonian. The Hamiltonian is then divided into a part containing the Coulomb interaction, which is treated exactly and which gives rise to the bound states, and all the rest, to be treated perturbatively. The power of the NRQED approach was further boosted by the continuous dimensional regularization technique of Feynman graph integrals.

Traditionally, the results are expressed in terms of the energies of the bound states, but as in practice the precise measurements concern the transition frequencies between various levels, it is customary to express any energy contribution to some level, say $\Delta E$, also in terms of the associated frequency $\Delta\nu = (\Delta E)/h$, where $h$ is the Planck constant.

The Hydrogen-Like Atoms

Quite in general, a hydrogen-like atom consists of a single electron bound to a positively charged particle, which is a proton for the hydrogen atom, a deuteron for deuterium, a helium nucleus for a He$^+$ ion, a positive muon $\mu^+$ for muonium, or a positron for positronium. Even if QED alone is not sufficient to treat the dynamical properties of the nuclei, their strong interactions can be described by introducing suitable form factors and a few phenomenological parameters; weak interactions could be treated perturbatively, but are not yet required at the precision levels achieved so far.

The QED results for the hydrogen-like atoms can be expressed in terms of the mass $M$ of the positive particle and of its charge $Ze$ (of course $Z = 1$ for hydrogen). When the electron mass $m_e$ is smaller than $M$ (which is always the case, except for positronium) one can take as a starting point the QED electron moving in the external field of the positive particle, and treat all the other aspects of the relativistic two-body problem (the so-called recoil effects) perturbatively in $m_e/M$.

Neglecting the spin of the positive particle, the energy levels of the hydrogen-like atom are identified by the usual principal quantum number $n$, the orbital angular momentum $l$ (with the convention of writing S, P, D, ... instead of $l = 0$, $l = 1$, $l = 2$, ...) and $j$, the total angular momentum including the spin of the electron. It turns out that the bound
levels consist of very many contributions of different within QED, even if their actual calculation is an
kinds; dropping quantum number indices for sim- extremely demanding task. One of the first results
plicity, the energy levels can be written as an obtained in 1947 was A41 = (4=3)l0 , contributing to
expression of the form the 2S but not to the 2P states (quite in general,
  most corrections are much bigger for l = 0 states
me c2 ðZÞ2 mr than for higher-angular-momentum states), which is
E¼
2 me sufficient to give the right order of magnitude of the
 
1 (2S1=2 –2P1=2 ) Lamb shift (about 1000 MHz). The
 2 þ ðZÞ2 f4 þ ðZÞ4 f6 þ   
n other coefficients are now known, thanks to the
þ Erad þ Erec þ Enucl þ    ½5 strenuous and continued efforts (Eides et al. 2001)
since then, which is impossible to refer properly here
Let us observe that it is convenient to write in any detail. The current frontier of the theoretical
explicitly the Z factors even when Z = 1 for a better calculation (around the dots in the previous for-
bookkeeping of the various corrections. As usual, mr mula) corresponds to 8–9 total powers of (=) and
is the reduced mass of the electron, mr = me M= (Z) or some kHz for the 1S state.
(me þ M) the mass of the nucleus being M; the first The next term in eqn [5], Erec contains
term in the square bracket, 1=n2 , the familiar contributions of order me c2 (Z)5 (me =M) or smaller
Balmer term, is by and far the dominant one, giving (some care must be done for classifying the
for the n = 1 level in the Z = 1 case an energy of contributions of order me =M, which can be
about 13.6 eV or a corresponding frequency of accounted for by proper use of mr rather than me
3.3  1015 Hz. The other terms in the square and genuine me =M contributions), and are suffi-
bracket, f4 and f6 , are known coefficients (depend- ciently known for practical purposes; the same is
ing also on the small parameter me =M; f4 is true for many other contributions discussed in Eides
essentially the fine structure). et al. (2001) and skipped in eqn [5]. A troublesome
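contribution comes however from E_nucl; at leading order, one has

\[ E_{\rm nucl} = \frac{2 (Z\alpha)^4}{3 n^3}\, m c^2 \left(\frac{m c R_p}{\hbar}\right)^2 \delta_{l0} \]

where R_p is the so-called root-mean-square charge radius of the proton, which is not well known experimentally (in the literature, there are indeed two direct measurements, R_p = 0.805(11) fm and R_p = 0.862(12) fm, in poor agreement with each other; a new independent measurement is strongly needed).

The term E_rad is the bulk of the radiative QED corrections; it can be written as a multiple expansion in (Zα), α and L = ln[1/(Zα)^2], which turns out to have the following explicit form:

\[ E_{\rm rad} = m_e c^2 (Z\alpha)^4 \frac{1}{n^3} \Big\{ \frac{\alpha}{\pi}\Big[ A_{41} L + A_{40} + (Z\alpha) A_{50} + (Z\alpha)^2 \big( A_{62} L^2 + A_{61} L + A_{60} \big) + \cdots \Big] \]
\[ \qquad + \Big(\frac{\alpha}{\pi}\Big)^2 \Big[ B_{40} + (Z\alpha) B_{50} + (Z\alpha)^2 \big( B_{63} L^3 + B_{62} L^2 + B_{61} L + B_{60} \big) + \cdots \Big] \]
\[ \qquad + \Big(\frac{\alpha}{\pi}\Big)^3 \big[ C_{40} + (Z\alpha) C_{50} + \cdots \big] + \cdots \Big\} \qquad [6] \]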
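As a rough numerical cross-check of the orders of magnitude quoted around eqns [5] and [6], the following Python sketch evaluates the leading Balmer term of eqn [5] for hydrogen and the single A41 L contribution of eqn [6] for the 2S level. The constants ALPHA, ME_C2_EV, MP_OVER_ME and H_EV_S are assumed reference values that are not given in the text, the sign of the binding energy is dropped, and every other term of eqn [6] is ignored, so the second number is only an order-of-magnitude estimate.

```python
import math

# Assumed reference constants (not taken from the article).
ALPHA = 1 / 137.035999      # fine-structure constant
ME_C2_EV = 510_998.95       # electron rest energy m_e c^2 in eV
MP_OVER_ME = 1836.15267     # proton-to-electron mass ratio
H_EV_S = 4.135667696e-15    # Planck constant in eV s


def balmer_term_ev(n, z=1):
    """Magnitude of the leading term of eqn [5]: (m_e c^2 (Z alpha)^2 / 2)(m_r/m_e)/n^2."""
    reduced_mass_ratio = MP_OVER_ME / (1.0 + MP_OVER_ME)   # m_r / m_e
    return 0.5 * ME_C2_EV * (z * ALPHA) ** 2 * reduced_mass_ratio / n ** 2


def leading_log_one_loop_ev(n, z=1, a41=4.0 / 3.0):
    """Only the A41 * L term of eqn [6], for an S state (l = 0)."""
    big_l = math.log(1.0 / (z * ALPHA) ** 2)
    return ME_C2_EV * (z * ALPHA) ** 4 / n ** 3 * (ALPHA / math.pi) * a41 * big_l


ground_ev = balmer_term_ev(1)
ground_hz = ground_ev / H_EV_S
lamb_2s_hz = leading_log_one_loop_ev(2) / H_EV_S

print(f"n = 1 Balmer term: {ground_ev:.3f} eV, {ground_hz:.3e} Hz")
print(f"A41 L term for 2S: {lamb_2s_hz / 1e6:.0f} MHz")
```

With these inputs the Balmer term comes out near 13.6 eV (about 3.3 × 10^15 Hz), and the A41 L term alone is somewhat above 1 GHz, the same order of magnitude as the Lamb shift of about 1000 MHz.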
 The hyperfine splitting The effect of the interac-
The first index of the coefficients refers to the power tion of the electron with the spin of the positive
of (Z), the second to the power of L; as a rule, particle introduces the so-called hyperfine splitting
there are three powers of (Z) due to the normal- of all the levels. The order of magnitude of the
ization of the wave function and one power of (Z) hyperfine splitting of the 1S state is given by the
for each interaction with the nucleus (in the leading Fermi energy
\[ E_F = \frac{4}{3}\, m_e c^2 (Z\alpha)^4\, \frac{m_e}{m_p}\, g_p \]

term of eqn [5] one must subtract two powers of (Zα) due to the long-range nature of the Coulomb interaction), while the terms in L = ln[1/(Zα)^2] are
related to the infrared divergences of the scattering where gp ’ 5.586 is the g-factor of the proton,
amplitude, with the binding energy acting as infra- which gives ’ 1.42 GHz. It was dubbed hyperfine
red cutoff. The A-coefficients refers to order (=) because it is smaller than the fine structure terms by
or one-loop virtual correction (we do not distinguish the factor me =mp . Many classes of corrections can
here between one-loop self-mass and vacuum- be worked out, with patterns similar to those of the
polarization contribution, as usually done in the previous subsection, and also in this case the nuclear
literature), the B-coefficients to two loops, etc. The contributions (this time mainly due to the theoreti-
coefficients are pure numbers, entirely determined cally unknown magnetic form factor and the

so-called polarizability of the proton) prevent one from obtaining predictions with an error less than 1 kHz (or a relative precision better than 1 × 10^-6).

The comparison with the experiments. Experimentally, one measures transition frequencies among the various levels. For many years the precision record was held by the hyperfine splitting of the ground state of hydrogen, ν_hfs(1S), which was measured long ago (see Hellwig et al. (1970) and Essen et al. (1971)),

\[ \nu_{\rm hfs}(1S) = 1\,420\,405.751\,766\,7(9)\ {\rm kHz} \qquad [7] \]

with a relative error 6 × 10^-13. The current record in the optical range is the value of the (1S–2S) hydrogen transition frequency, obtained by means of two-photon Doppler-free spectroscopy (Niering et al. 2000),

\[ \nu(1S\text{–}2S) = 2\,466\,061\,413\,187.103(46)\ {\rm kHz} \qquad [8] \]

with a relative precision 1.9 × 10^-14; other optical transitions, such as (2S–8D), (2S–12D) are measured with precision of about 1 × 10^-11.

The measurement of the Lamb shift was repeated several times, with results in nice agreement with the original value, such as Lundeen and Pipkin (1986), 1057.845(9) MHz. The most precise value, 1057.8514 ± 0.0019 MHz, was given in Palchikov et al. (1985) (the result depends, however, on the theoretical value of the lifetime, and should be changed into 1057.8576 ± 0.0021 according to subsequent analysis (see Karshenboim (1996))). The experimental (2S1/2–2P1/2) Lamb shift was also obtained as the difference between the measured fine structure separation (2P3/2–2S1/2) and the theoretical value of the (2P3/2–2P1/2) frequency, and the radiative corrections E_rad to any level are now referred to as the Lamb shift of that level.

As a somewhat disappointing conclusion, the wonderful experimental results of eqns [7] and [8] cannot be used as a high-precision test of the theory or to obtain precise values of many fundamental constants, as the theoretical calculations depend, unfortunately, on hadronic quantities which are not known accurately. Combining theoretical predictions, the above transitions and Lamb shift data, and the available values of α and m_e/m_p, one can indeed obtain a measure of R_p (R_p = 0.883 ± 0.014 fm, according to Melnikov and van Ritbergen (2000)) and the value of R∞ already quoted above.

Muonium

The muonium is the bound state of a positive muon μ+ and an electron. At variance with the proton, the μ+ lepton has no strong interactions, the μ+e− system can be studied theoretically within pure QED, with the weak interactions giving a known and small perturbation. Further, the ratio of the masses m_e/m_μ ≃ 4.8 × 10^-3 is small, so that the external field approximation holds. However, the μ is unstable (lifetime ≃ 2.2 μs), which makes experiments more difficult to carry out. The best measured quantity is the hyperfine splitting of the 1S ground state (see Liu et al. (1999))

\[ \nu_{\rm hfs}(\mu e;\ 1S) = 4\,463\,302\,765(53)\ {\rm Hz} \]

with a relative precision of 12 × 10^-9. The theoretical treatment is similar to the case of hydrogen, with the important advantage that nuclear interactions are absent and everything can be evaluated within QED, so that the bulk of the contribution is given by a formula with the structure of eqn [6]. But the prediction depends, in any case, on the m_e/m_μ mass ratio, which is not known with the required precision. Indeed, a recent theoretical calculation (Czarnecki et al. 2002) (which includes also a contribution of 0.233(3) kHz from hadronic vacuum polarization) gives 4 463 302 680(510)(30)(220) Hz, where the first (and biggest) error comes from m_e/m_μ, the second from α, and the third is the theoretical error (an estimation of higher-order contributions not yet evaluated).

Positronium

The positronium is the bound state of an electron and a positron. Theoretically, it is an ideal system to study, as it can be described entirely within QED, without any unknown parameter of non-QED origin. As the masses of the two constituents, positron and electron, are strictly equal, the reduced mass of the system is exactly equal to half of the electron mass, m_r = m_e/2, and the energy scale of the bound states is half of R∞.

At variance with the muonium case, the external field approximation is not valid, so that positronium must be treated with the full two-body bound-state machinery of QFT, of which it provides an excellent test (Karshenboim 2004).

Experimentally, radioactive positron sources are available, so that positronium is easier to produce than muonium. It is, however, unstable; states with total spin S equal to 0 (also called parapositronium states) annihilate into an even number (mainly two) of gammas, and states with S = 1 (orthopositronium) into an odd number (mainly three) of gammas, with short lifetimes (which make precise measurements difficult). Further, as positronium is the lightest

atom, Doppler-broadening effects are very impor- The Anomalous Magnetic Moments
tant, reducing the precision of spectroscopical of Leptons
measurements.
The precision of the measurements requires, for both
Positronium decay rates There has been a long- the e and  leptons, to also take into account graphs
time discrepancy between theory and experiment in with contributions from the other leptons as virtual
decay rate of ground-state orthopositronium, which intermediate states and those of hadronic and weak
prompted thorough theoretical investigations look- origin. Quite in general, if the mass of the virtual
ing for errors in the calculations or flaws in the particle, say mv , is smaller than the mass of the
formalism, but it turned out that the flaw was on the external lepton, say ml , one can have an ln (ml =mv )
experimental side. The current theoretical prediction behavior of the contributions; that is the case of the
for the ground state S = 1 decay is (Adkins et al. virtual electron contributions to the muon magnetic
2002)

anomaly a_μ, which can be enhanced by powers of ln(m_μ/m_e). In the opposite case, m_v > m_l, the contribution has the behavior (m_l/m_v)^2; that is the case of the (m_μ/m_τ)^2 contributions to a_μ from τ loops and of the (m_e/m_μ)^2 contributions from μ loops to the electron magnetic anomaly, a_e. As strong and weak interactions are in general associated with heavy-mass particles, they are expected to be more important for a_μ than a_e; further, a given heavy particle contribution to a_e is smaller by a factor

\[ \Gamma(1S,\ {\rm ortho}) = \Gamma_0 \Big[ 1 + A\,\frac{\alpha}{\pi} + \frac{\alpha^2}{3} \ln\alpha + B \Big(\frac{\alpha}{\pi}\Big)^2 - \frac{3\alpha^3}{2\pi} \ln^2\alpha + C\,\frac{\alpha^3}{\pi} \ln\alpha + \cdots \Big] = 7.039979(11)\ \mu{\rm s}^{-1} \]

where Γ_0 = 2(π^2 − 9) m_e α^6/(9π) = 7.2111670(1), A = −10.286606(10), B = 45.06(26), C = −5.517, in
nice agreement with the less precise experimental (me =mm )2 than the corresponding contribution to am .
result of Karshenboim (2004, ref. 38) 7.0404
(10)(8)ms1 . As a curiosity, the coefficients A, B The Magnetic Anomaly am of the m
above are among the greatest coefficients so far
The am has been reviewed in Passera (2005). The
appeared in QED radiative corrections.
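The quoted orthopositronium decay rate can be reproduced directly from the expansion. The sketch below assumes that the bracket has the form 1 + A α/π + (α^2/3) ln α + B (α/π)^2 − (3α^3/2π) ln^2 α + C (α^3/π) ln α, takes A and C as negative numbers, truncates the series after the C term, and uses an assumed value of α that is not part of the text.

```python
import math

ALPHA = 1 / 137.035999     # assumed value of the fine-structure constant
GAMMA0 = 7.2111670         # lowest-order rate in inverse microseconds, as quoted
A = -10.286606             # sign assumed negative
B = 45.06
C = -5.517                 # sign assumed negative


def ortho_positronium_rate(alpha=ALPHA):
    """Truncated expansion of the ground-state orthopositronium decay rate."""
    x = alpha / math.pi
    log_alpha = math.log(alpha)
    bracket = (
        1.0
        + A * x
        + alpha ** 2 / 3.0 * log_alpha
        + B * x ** 2
        - 1.5 * alpha ** 3 / math.pi * log_alpha ** 2
        + C * alpha ** 3 / math.pi * log_alpha
    )
    return GAMMA0 * bracket


rate = ortho_positronium_rate()
print(f"Gamma(1S, ortho) = {rate:.6f} per microsecond")
```

Under these assumptions the result agrees with the quoted 7.039979(11) μs^-1 to within the stated uncertainty, which also supports the negative signs chosen for A and C.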
present (2005) world average experimental value is
The agreement between theory and experiment for
the ground-state parapositronium decay rate has am ðexpÞ ¼ 116 592 080ð60Þ  1011
always been good; the current status of Karshenboim
(2004, ref. 41) is 7990.9(1.7) ms1 for the experimental with a relative error 0.5  106 .
result and of Karshenboim (2004, ref. 43) Theoretically, one can write
7989.64(2) ms1 for the theoretical prediction. am ¼ am ðQEDÞ þ am ðhadÞ þ am ðEWÞ ½9

Positronium levels The quantum number structure where the three terms stand for the contributions
of the levels is similar to muonium, with the from pure QED, strong interacting hadrons and
important difference, however, that the hyperfine electroweak interactions. In turn, one can expand
splitting (which in hydrogen or muonium is small am (QED) in powers of  as
\[ a_\mu({\rm QED}) = \sum_l C_l \Big(\frac{\alpha}{\pi}\Big)^l = \sum_l \Big[ A_1^{(l)} + A_2^{(l)}\Big(\frac{m_\mu}{m_e}\Big) + A_2^{(l)}\Big(\frac{m_\mu}{m_\tau}\Big) + A_3^{(l)}\Big(\frac{m_\mu}{m_e}, \frac{m_\mu}{m_\tau}\Big) \Big] \Big(\frac{\alpha}{\pi}\Big)^l \qquad [10] \]

because it is proportional to the ratio of the masses of the two components) is in fact of the same order as the fine structure. The theoretical evaluation of the energy levels provides a very stringent check of QED and of the overall treatment of the bound-state problem. Corrections have been evaluated, typically, up to order mc^2 α^7. The best-known quantities are
the ground state (hyper)fine splitting, experimental
value (Ritter et al. 1984) 203.38910(74) GHz (3.6 × 10^-6 relative error), theoretical (Karshenboim 2004) 203.3917(6), and the 1S–2S transition for orthopositronium, experiment (Fee et al. 1993) 1 233 607 216.4(3.2) MHz, theory 1 233 607 222.2(6). The general agreement is good; the precisions achieved are, however, not yet sufficient to allow a determination of R∞ or α competitive with other measurements.

The coefficients A_1^(l) involve only the photon and the external lepton as virtual states, are identically the same as in a_e; they are known up to l = 4 included (but, strictly speaking, the contribution of A_1^(4) is smaller than the experimental error of a_μ) and will be discussed later for the electron. The A_2^(l)(m_μ/m_e) are very large, being enhanced by powers of ln(m_μ/m_e), and are required and known up to l = 5; A_2^(l)(m_μ/m_τ) starts with A_2^(2)(m_μ/m_τ) ≃

1/45 (m_μ/m_τ)^2, contributing 4.2 × 10^-11 to a_μ, so that the A_2^(l)(m_μ/m_τ) with higher values of l are not needed. A_3^(l)(m_μ/m_e, m_μ/m_τ), finally, starts from l = 3, and gives a negligible contribution 0.7 × 10^-11. Summing up, one finds C_1 = 1/2, C_2 = 0.765 857 410(27) (the error is from the experimental errors in the lepton masses), C_3 = 24.050 509 64(43), C_4 = 131.011(8), and C_5 = 677(40). As already observed, the coefficients are large due to the presence of ln(m_μ/m_e) factors. The last term C_5 contributes 4.6(0.3) × 10^-11 to a_μ, and the total QED contribution is

\[ a_\mu({\rm QED}) = 116\,584\,718.8(0.3)(0.4) \times 10^{-11} \]

where the first error is due to the uncertainties in the coefficients C_2, C_3, and C_5 and the second from the value of α coming from atom interferometry measurements (see below).

The hadronic contributions are of two kinds, those due to vacuum polarization, a_μ(vac.pol), which can be evaluated by sound theoretical methods by using existing experimental data, and those due to light-by-light hadronic scattering, a_μ(lbl), whose evaluation rests on much less firm ground and is entirely model-dependent. The value of a_μ(vac.pol) varies slightly among the various authors (see Passera (2005) for reference to original work); let us take as a typical value a_μ(vac.pol) = 6834(92) × 10^-11 (based on e+e− scattering data and including also first-order radiative corrections). The model-dependent value of the light-by-light contribution changed several times over the years (also in sign!) but now there is a general consensus that it should be positive; let us take, somewhat arbitrarily, a_μ(lbl) = 136(25) × 10^-11, so that the total hadronic contribution becomes

\[ a_\mu({\rm had}) = 6970(92) \times 10^{-11} \]

The electroweak contribution, finally, is

\[ a_\mu({\rm EW}) = 154(2) \times 10^{-11} \]

which accounts for a one-loop purely weak contribution and a two-loop electromagnetic and weak contribution, which turns out to be very large (42 × 10^-11) for the presence of logarithms in the masses (the error is due to the uncertainty in the Higgs boson mass).

Summing up, eqn [9] gives a_μ = 116 591 842(92) × 10^-11, so that

\[ a_\mu({\rm exp}) - a_\mu = 138(60)(90) \times 10^{-11} \]

The substantial agreement can be considered to be a good overall check of QED and electroweak interactions. But another attitude is often adopted in the scientific community: the validity of QED and electroweak models is taken for granted, and a disagreement, if any, is considered to be an indication of new physics. To obtain significant information in that direction, however, the experimental and the theoretical errors (dominated in turn by the experimental error in e+e− scattering data) should be significantly reduced.

The Magnetic Anomaly a_e of the Electron

Experimentally, one has the 1987 value (Kinoshita 2005, ref. 1)

\[ a_e({\rm exp}) = 1\,159\,652\,188.4(4.3) \times 10^{-12} \qquad [11] \]

with a relative error 3.7 × 10^-9 and the preliminary Harvard (2004) measurement (Kinoshita 2005, ref. 3)

\[ a_e({\rm Harvard}) = 1\,159\,652\,180.86(0.57) \times 10^{-12} \qquad [12] \]

with 0.5 × 10^-9 relative error, that is, an increase in precision by a factor 7.

Theoretically, eqns [9] and [10] apply also to the electron; given the smallness of the electron mass, the relevant terms up to the precision of the experimental data are

\[ a_e = A_1^{(1)} \Big(\frac{\alpha}{\pi}\Big) + A_1^{(2)} \Big(\frac{\alpha}{\pi}\Big)^2 + A_1^{(3)} \Big(\frac{\alpha}{\pi}\Big)^3 + A_1^{(4)} \Big(\frac{\alpha}{\pi}\Big)^4 + \cdots + A_2^{(2)}\Big(\frac{m_e}{m_\mu}\Big) \Big(\frac{\alpha}{\pi}\Big)^2 + a_e({\rm had}) + a_e({\rm EW}) \qquad [13] \]

The explicit calculation gives

\[ A_1^{(1)} = \frac{1}{2} \quad \text{(Passera 2005, ref. 1)} \]
\[ A_1^{(2)} = \frac{197}{144} + \frac{1}{12}\pi^2 - \frac{1}{2}\pi^2 \ln 2 + \frac{3}{4}\zeta(3) = -0.328\,478\,965\,579\ldots \quad \text{(Passera 2005, ref. 17)} \]
\[ A_1^{(3)} = \frac{83}{72}\pi^2\zeta(3) - \frac{215}{24}\zeta(5) + \frac{100}{3}\Big[\Big(a_4 + \frac{1}{24}\ln^4 2\Big) - \frac{1}{24}\pi^2 \ln^2 2\Big] - \frac{239}{2160}\pi^4 + \frac{139}{18}\zeta(3) - \frac{298}{9}\pi^2 \ln 2 + \frac{17101}{810}\pi^2 + \frac{28259}{5184} = 1.181\,241\,456\ldots \quad \text{(Laporta and Remiddi 1996)} \]
\[ A_1^{(4)} = -1.7283(35) \quad \text{(Kinoshita 2005)} \]

and

\[ A_2^{(2)}\Big(\frac{m_e}{m_\mu}\Big) \Big(\frac{\alpha}{\pi}\Big)^2 \simeq \frac{1}{45} \Big(\frac{m_e}{m_\mu}\Big)^2 \Big(\frac{\alpha}{\pi}\Big)^2 \simeq 2.72 \times 10^{-12} \]
\[ a_e({\rm had}) = 1.67(0.02) \times 10^{-12} \]
\[ a_e({\rm EW}) = 0.03 \times 10^{-12} \qquad [14] \]

Czarnecki A, Eidelman SI, and Karshenboim SG (2002) Muonium hyperfine structure and hadronic effects. Physical Review D 65: 053004.
Dirac PAM (1927) The quantum theory of emission and absorption of radiation. Proceedings of the Royal Society A 114: 243.
Eides MI, Grotch H, and Shelyuto VA (2001) Theory of light hydrogenlike atoms. Physics Reports 342: 63 (arXiv:hep-ph/0002158).
Essen L, Donaldson RW, Bangman MJ, and Hope EG (1971) Frequency of the hydrogen maser. Nature 229: 110.
Fee MS et al. (1993) Measurement of the positronium 1^3S_1–2^3S_1 interval by continuous-wave two-photon excitation. Physical Review Letters 70: 1397.
Hellwig H, Vessot RFC, Levine MW, Zitzewitz PW, Allan DW, and Glaze DJ (1970) Measurement of the unperturbed hydrogen hyperfine transition frequency. IEEE Transactions on Instrument Measurement IM19: 200.
Karshenboim SG (1996) Leading logarithmic corrections and uncertainty of muonium hyperfine splitting calculations. Z. Physics D 36: 11.
Karshenboim SG (2004) Precision study of positronium: testing bound state QED theory. International Journal of Modern Physics A 19: 3879 (e-print hep-ph/0310099).
Kinoshita T (2005) Improved α^4 term of the electron anomalous magnetic moment (hep-ph/0507249).
Kusch P and Foley HM (1947) Precision measurements of the ratio of the atomic 'g values' in the ^2P_3/2 and ^2P_1/2 states of gallium. Physical Review 72: 1256(L).
Lamb WE Jr. and Retherford RC (1947) Fine structure of the hydrogen atom by a microwave method. Physical Review 72: 241.
Laporta S and Remiddi E (1996) The analytical value of the electron (g − 2) at order α^3 in QED. Physics Letters B 379: 283.
Liu W et al. (1999) High precision measurements of the ground state hyperfine structure interval of muonium and of the muon magnetic moment. Physical Review Letters 82: 711.
Lundeen SR and Pipkin FM (1986) Separated oscillatory field measurement of Lamb shift in atomic hydrogen, n = 2. Metrologia 22(1): 9.
Melnikov K and van Ritbergen T (2000) Three-loop slope of the Dirac form factor and the 1S Lamb shift in hydrogen. Physical Review Letters 84: 1673 (hep-ph/9911277).
Niering M et al. (2000) Measurement of the hydrogen 1s–2s transition frequency by phase coherent comparison with a microwave cesium fountain clock. Physical Review Letters 84: 5496.
Pal'chikov VG, Sokolov YL, and Yakovlev VP (1985) Measurement of the Lamb shift in H, n = 2. Metrologia 21: 99.

For obtaining a meaningful prediction, one needs now a precise value of α. The most precise value available at present is that of Passera (2005, ref. 49)

\[ \alpha^{-1}({\rm aif}) = 137.036\,000\,3(10) \]

with relative error 7 × 10^-9, obtained by the atom interferometry method (which is independent of QED, depending only on the kinematics of the Doppler effect). With that value of α, the theoretical prediction for a_e becomes

\[ a_e = 1\,159\,652\,175.9(8.5)(0.1) \times 10^{-12} \]

where the first error comes from α and the second from C_4; conversely, one can use the QED prediction for a_e and a_e(Harvard) for obtaining α; one obtains in that way

\[ \alpha^{-1}({\rm QED},\ a_e) = 137.035\,999\,708(12)(67) \]

where the first uncertainty is from C_4 and the second from the experiment. We see that theory and experiment are in good agreement.

As a concluding remark, another independent and more precise (or analytic!) evaluation of the C_4 contribution would be welcome. The five-loop term is not known; but as (α/π)^5 ≈ 0.07 × 10^-12, if C_5 is, say, not greater than 2, its contribution to a_e becomes equal to the contribution of the error δC_4 of C_4 and is not yet required to match the current precision of a_e(exp). The ultimate theoretical limit, the error of the hadronic contribution, δa_e(had) = 0.02 × 10^-12, is still smaller, corresponding to a change δC_4 = 0.0007 of C_4 or δC_5 = 0.3 of C_5.
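The numbers in the last paragraphs can be checked with a few lines of arithmetic. The following Python sketch assumes the truncated form of eqn [13] with the decimal coefficients quoted in the text (with A_1^(2) and A_1^(4) taken as negative), the atom-interferometry value of α^-1 with its uncertainty of 10 units in the last digit, and an uncertainty of 0.0035 on the four-loop coefficient; the variable names are illustrative only.

```python
import math

# Coefficients A_1^(1) ... A_1^(4) as quoted; the two signs are assumptions.
A1 = [0.5, -0.328478965579, 1.181241456, -1.7283]
A1_4_ERR = 0.0035
MUON_LOOP_TERM = 2.72e-12      # A_2^(2)(m_e/m_mu) (alpha/pi)^2, as quoted
AE_HAD, AE_HAD_ERR = 1.67e-12, 0.02e-12
AE_EW = 0.03e-12

ALPHA_INV, ALPHA_INV_ERR = 137.0360003, 0.0000010


def a_e(alpha_inv):
    """Truncated eqn [13]: QED series plus the small muon, hadronic and electroweak terms."""
    x = 1.0 / (alpha_inv * math.pi)                      # alpha / pi
    qed = sum(coeff * x ** (order + 1) for order, coeff in enumerate(A1))
    return qed + MUON_LOOP_TERM + AE_HAD + AE_EW


x = 1.0 / (ALPHA_INV * math.pi)
prediction = a_e(ALPHA_INV)
err_from_alpha = abs(a_e(ALPHA_INV - ALPHA_INV_ERR) - prediction)
err_from_four_loop = A1_4_ERR * x ** 4
five_loop_unit = x ** 5                                   # size of (alpha/pi)^5
shift_four_loop = AE_HAD_ERR / x ** 4                     # coefficient change matching the hadronic error
shift_five_loop = AE_HAD_ERR / x ** 5

print(f"a_e = {prediction * 1e12:.1f} x 1e-12")
print(f"errors: {err_from_alpha * 1e12:.1f} (alpha), {err_from_four_loop * 1e12:.2f} (four loops) x 1e-12")
print(f"(alpha/pi)^5 = {five_loop_unit * 1e12:.3f} x 1e-12")
print(f"equivalent shifts: {shift_four_loop:.4f} (four-loop), {shift_five_loop:.2f} (five-loop)")
```

Under these assumptions the sketch reproduces the quoted prediction and its two uncertainties, the size of (α/π)^5, and the coefficient changes of about 0.0007 and 0.3 that correspond to the hadronic error.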
See also: Abelian and Nonabelian Gauge Theories Using Differential Forms; Anomalies; Effective Field Theories; Electroweak Theory; Quantum Field Theory: A Brief Introduction; Standard Model of Particle Physics.

Further Reading

Adkins GS, Fell RN, and Sapirstein J (2002) Two-loop correction to the orthopositronium decay rate. Annals of Physics (NY) 295: 136.
Passera M (2005) The standard model prediction of the muon anomalous magnetic moment. Journal of Physics G 31: R75 (hep-ph/0411168).
Ritter MW, Egan PO, Hughes VW, and Woodle KA (1984) Precision determination of the hyperfine-structure interval in the ground state of positronium, V. Physical Review A 30: 1331.
Triebwasser S, Dayhoff ES, and Lamb WE Jr. (1953) Fine structure of the hydrogen atom, V. Physical Review 89: 98.
Veltman MJC (1999) Nobel Lecture (http://nobelprize.org/physics/laureates/1999/veltman-lecture.pdf).
Weinberg S (1995) The Quantum Theory of Fields. Cambridge: Cambridge University Press.

Quantum Entropy
D Petz, Budapest University of Technology and Economics, Budapest, Hungary
© 2006 Elsevier Ltd. All rights reserved.

In the past 50 years, entropy has broken out of thermodynamics and statistical mechanics and invaded communication theory, ergodic theory, mathematical statistics, and even the social and life sciences. The favorite subjects of entropy concern macroscopic phenomena, irreversibility, and incomplete knowledge. In the strictly mathematical sense entropy is related to the asymptotics of probabilities or concerns the asymptotic behavior of probabilities.

This review is organized as follows. First the history of entropy is discussed generally and then we concentrate on the von Neumann entropy again somewhat historically following the work of von Neumann. Umegaki's quantum relative entropy is discussed both in case of finite systems and in the setting of C*-algebras. An axiomatization is presented. To show physical applications of the concept of entropy, the statistical thermodynamics is reviewed in the setting of spin chains. The relative entropy shows up in the asymptotic theory of hypothesis testing and data compression.

General Introduction to Entropy: From Clausius to von Neumann

The word "entropy" was created by Rudolf Clausius and it appeared in his work Abhandlungen über die mechanische Wärmetheorie published in 1864. The word has a Greek origin, its first part reminds us of "energy" and the second part is from "tropos," which means "turning point." Clausius' work is the foundation stone of classical thermodynamics. According to Clausius, the change of entropy of a system is obtained by adding the small portions of heat quantity received by the system divided by the absolute temperature during the heat absorption. This definition is satisfactory from a mathematical point of view and gives nothing other than an integral in precise mathematical terms. Clausius postulated that the entropy of a closed system cannot decrease, which is generally referred to as the second law of thermodynamics.

The concept of entropy was really clarified by Ludwig Boltzmann. His scientific program was to deal with the mechanical theory of heat in connection with probabilities. Assume that a macroscopic system consists of a large number of microscopic ones, we simply call them particles. Since we have ideas of quantum mechanics in mind, we assume that each of the particles is in one of the energy levels E_1 < E_2 < ... < E_m. The number of particles in the level E_i is N_i, so Σ_i N_i = N is the total number of particles. A macrostate of our system is given by the occupation numbers N_1, N_2, ..., N_m. The energy of a macrostate is E = Σ_i N_i E_i. A given macrostate can be realized by many configurations of the N particles, each of them at a certain energy level E_i. These configurations are called microstates. Many microstates realize the same macrostate. We count the number of ways of arranging N particles in m boxes (i.e., energy levels) such that the boxes contain N_1, N_2, ..., N_m particles, respectively. There are

\[ \binom{N}{N_1, N_2, \ldots, N_m} := \frac{N!}{N_1!\, N_2! \cdots N_m!} \qquad [1] \]

such ways. This multinomial coefficient is the number of microstates realizing the macrostate (N_1, N_2, ..., N_m) and it is proportional to the probability of the macrostate if all configurations are assumed to be equally likely. Boltzmann called [1] the thermodynamical probability of the macrostate, in German "thermodynamische Wahrscheinlichkeit," hence the letter W was used. Of course, Boltzmann argued in the framework of classical mechanics and the discrete values of energy came from an approximation procedure with "energy cells."

If we are interested in the thermodynamic limit N increasing to infinity, we use the relative numbers p_i := N_i/N to label a macrostate and, instead of the total energy E = Σ_i N_i E_i, we consider the average energy per particle E/N = Σ_i p_i E_i. To find the most probable macrostate, we wish to maximize [1] under a certain constraint. The Stirling approximation of the factorials gives

\[ \frac{1}{N} \log \binom{N}{N_1, N_2, \ldots, N_m} = H(p_1, p_2, \ldots, p_m) + O(N^{-1} \log N) \qquad [2] \]

where

\[ H(p_1, p_2, \ldots, p_m) := -\sum_i p_i \log p_i \qquad [3] \]

If N is large then the approximation [2] yields that instead of maximizing the quantity [1] we can maximize [3]. For example, maximizing [3] under the constraint Σ_i p_i E_i = e, we get

\[ p_i = \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}} \qquad [4] \]

where the constant λ is the solution of the equation

\[ \sum_i E_i\, \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}} = e \]

Note that the last equation has a unique solution if E_1 < e < E_m, and the distribution [4] is now known

up to an additive constant, which could be chosen to be 0 as a matter of normalization. Equation [6] is von Neumann's celebrated entropy formula; it has a more elegant form

\[ S(\omega) = \kappa\, {\rm tr}\, \eta(\omega) \qquad [7] \]
where the state ! is identified with the correspond-
as the discrete Maxwell–Boltzmann law.
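The constant λ in the discrete Maxwell–Boltzmann law [4] has no closed form in general, but since the mean energy decreases monotonically as λ grows, it can be found by bisection. The sketch below is a minimal illustration; the energy levels, the target mean energy, the search bracket and the comparison distribution are arbitrary choices made for the example, not values from the text.

```python
import math


def boltzmann_distribution(energies, lam):
    """p_i = exp(-lam * E_i) / sum_j exp(-lam * E_j), as in eqn [4]."""
    exponents = [-lam * e for e in energies]
    shift = max(exponents)                      # keeps exp() from overflowing
    weights = [math.exp(x - shift) for x in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def mean_energy(energies, lam):
    probs = boltzmann_distribution(energies, lam)
    return sum(p * e for p, e in zip(probs, energies))


def entropy(probs):
    """H(p) = -sum_i p_i log p_i (natural logarithm), as in eqn [3]."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def solve_lambda(energies, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on lam; the bracket [lo, hi] is an assumption suited to energies of order one."""
    if not min(energies) < target < max(energies):
        raise ValueError("target must lie strictly between the lowest and highest level")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid        # mean energy too large: lam must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


levels = [0.0, 1.0, 2.0, 3.0]
target = 1.0
lam = solve_lambda(levels, target)
mb = boltzmann_distribution(levels, lam)
other = [0.4, 0.3, 0.2, 0.1]    # a different distribution with the same mean energy

print(f"lambda = {lam:.6f}")
print("Maxwell-Boltzmann:", [round(p, 4) for p in mb])
print(f"H(mb) = {entropy(mb):.4f}, H(other) = {entropy(other):.4f}")
```

In this example the solver returns a positive λ, and the resulting distribution has a larger value of H than the comparison distribution with the same mean energy, as the maximization of [3] under the constraint requires.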
ing statistical operator, and  : R þ ! R is the
Let p1 , p2 , . . . , pn be the probabilities of different
continuous function η(t) = −t log t.
outcomes of a random experiment. According to
von Neumann solved the maximization problem
Shannon, the expression [3] is a measure of our
for S(ω) under the constraint tr ωH = e. This means
ignorance prior to the experiment. Hence it is also
the determination of the ensemble of maximal
the amount of information gained by performing the
entropy when the expectation of the energy operator
experiment. The quantity [3] is maximum when all
H is a prescribed value e. It is convenient to rephrase
the pi ’s are equal. In information theory, logarithms
his argument in terms of conditional expectations.
with base 2 are used and the unit of information is
H = H* is assumed to have a discrete spectrum and
called bit (from binary digit). As will be seen below,
we have a conditional expectation E determined by
an extra factor equal to Boltzmann’s constant is
the eigenbasis of H. If we pass from an arbitrary
included in the physical definition of entropy.
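The two statements about the entropy function H of eqn [3], namely that it is largest for equal probabilities and that it is the large-N limit in eqn [2], are easy to check numerically. The following sketch assumes natural logarithms in [2] and [3] and base-2 logarithms for bits; the example distributions and occupation numbers are arbitrary illustrative choices.

```python
import math


def entropy_nats(probs):
    """H(p) = -sum_i p_i log p_i with the natural logarithm."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_bits(probs):
    """The same quantity with logarithms to base 2, i.e. measured in bits."""
    return entropy_nats(probs) / math.log(2.0)


def log_multinomial(counts):
    """log of N! / (N_1! ... N_m!), using log-gamma to avoid huge integers."""
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in counts)


uniform_bits = entropy_bits([0.25] * 4)             # four equally likely outcomes
skewed_bits = entropy_bits([0.7, 0.1, 0.1, 0.1])    # same outcomes, unequal probabilities

counts = [5000, 3000, 2000]                         # occupation numbers N_1, N_2, N_3
n_total = sum(counts)
per_particle = log_multinomial(counts) / n_total    # left-hand side of eqn [2]
limit = entropy_nats([k / n_total for k in counts]) # H(p_1, p_2, p_3)

print(f"uniform: {uniform_bits:.3f} bits, skewed: {skewed_bits:.3f} bits")
print(f"(1/N) log multinomial = {per_particle:.5f}, H = {limit:.5f}")
```

Four equally likely outcomes carry 2 bits, any unequal assignment carries less, and for N = 10 000 the two sides of eqn [2] already differ by only about one part in a thousand.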
statistical operator ω with tr ωH = e to E(ω), then the
The comprehensive mathematical formalism of
entropy is increasing, on the one hand, and the
quantum mechanics was first presented in the famous
expectation of the energy does not change, on the
book Mathematische Grundlagen der Quantenme-
other, so the maximizer should be searched among
chanik published in 1932 by Johann von Neumann.
the operators commuting with H. In this way we are
In the traditional approach to quantum mechanics, a
(and von Neumann was) back to the classical
physical system is described in a Hilbert space:
problem of statistical mechanics treated at the
observables correspond to self-adjoint operators and
beginning of this article. In terms of operators, the
statistical operators are associated with the states. In
solution is in the form

\[ \frac{\exp(-\beta H)}{{\rm tr}\, \exp(-\beta H)} \qquad [8] \]

fact, a statistical operator describes a mixture of pure states. Pure states are really the physical states and they are given by rank-1 statistical operators, or equivalently by rays of the Hilbert space.
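The relation between statistical operators and pure states can be made concrete in a finite-dimensional example. The sketch below builds a statistical operator as a mixture of rank-1 projections and tests whether it is itself a pure state by checking that it is a projection; it uses plain Python lists for small matrices, and all function names and example vectors are illustrative choices.

```python
def projector(vector):
    """Rank-1 statistical operator |phi><phi| for a vector that need not be normalized."""
    amps = [complex(c) for c in vector]
    norm_sq = sum(abs(c) ** 2 for c in amps)
    return [[a * b.conjugate() / norm_sq for b in amps] for a in amps]


def mixture(weights, vectors):
    """Convex combination sum_i w_i |phi_i><phi_i| of pure states."""
    dim = len(vectors[0])
    rho = [[0j] * dim for _ in range(dim)]
    for w, vec in zip(weights, vectors):
        proj = projector(vec)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += w * proj[i][j]
    return rho


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def is_pure(rho, tol=1e-12):
    """A trace-one statistical operator is pure exactly when rho * rho equals rho."""
    squared = matmul(rho, rho)
    n = len(rho)
    return all(abs(squared[i][j] - rho[i][j]) < tol for i in range(n) for j in range(n))


pure = mixture([1.0], [[1, 1j]])                    # a single ray
mixed = mixture([0.5, 0.5], [[1, 0], [0, 1]])       # equal mixture of two orthogonal rays

print("trace:", trace(pure).real, trace(mixed).real)
print("pure?", is_pure(pure), is_pure(mixed))
```

Both operators have trace one, but only the first is a rank-1 projection; the second is a genuine mixture.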
von Neumann associated an entropy quantity to a statistical operator in 1927 and the discussion was extended in his book (von Neumann 1932). His argument was a gedanken experiment on the grounds of phenomenological thermodynamics. Let us consider a gas of N (≫ 1) molecules in a box. Suppose that the gas behaves like a quantum system and is described by a statistical operator ω which is a mixture Σ_i λ_i |φ_i⟩⟨φ_i|, where |φ_i⟩ ≡ φ_i are orthogonal state vectors. We may take λ_i N molecules in the pure state φ_i for every i. The gedanken experiment gave

\[ S\Big( \sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i| \Big) = \sum_i \lambda_i\, S(|\varphi_i\rangle\langle\varphi_i|) - \kappa \sum_i \lambda_i \log \lambda_i \qquad [5] \]

where κ is Boltzmann's constant and S is certain thermodynamical entropy quantity (relative to the fixed temperature and molecule density).

After this, von Neumann showed that S(|φ⟩⟨φ|) is independent of the state vector |φ⟩, so that

\[ S\Big( \sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i| \Big) = -\kappa \sum_i \lambda_i \log \lambda_i \qquad [6] \]

which is called Gibbs state today.

The von Neumann Entropy

von Neumann was aware of the fact that statistical operators form a convex set whose extreme points are exactly the pure states. He also knew that entropy is a concave functional, so

\[ S\Big( \sum_i \lambda_i \omega_i \Big) \ge \sum_i \lambda_i\, S(\omega_i) \qquad [9] \]

for any convex combination. To determine the entropy of a statistical operator, he used the Schatten decomposition, which is an orthogonal extremal decomposition in our present language. For a statistical operator ω there are many ways to write it in the form

\[ \omega = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i| \]

if we do not require the state vectors to be orthogonal. The geometry of the statistical operators, that is, the state space, allows many extremal decompositions and among them there is a unique orthogonal one if the spectrum of ω is not

degenerate. Nonorthogonal pure states are essentially nonclassical. They are between identical and completely different. Jaynes recognized in 1956 that from the point of view of information the Schatten decomposition is optimal. He proved that

\[ S(\omega) = \sup \Big\{ -\sum_i \lambda_i \log \lambda_i : \omega = \sum_i \lambda_i \omega_i \Big\} \qquad [10] \]

where the supremum is over all convex combinations ω = Σ_i λ_i ω_i of statistical operators. This is Jaynes' contribution to the von Neumann entropy. By the way, formula [10] may be used to define von Neumann entropy for states of an arbitrary C*-algebra whose states cannot be described by statistical operators.

Certainly the highlight of quantum entropy theory in the 1970s was the discovery of subadditivity. This property is formulated in a tripartite system whose Hilbert space H is a tensor product H_A ⊗ H_B ⊗ H_C. A statistical operator ω_ABC admits several reduced densities, ω_AB, ω_B, ω_BC, and others. The strong subadditivity is the inequality due to Lieb and Ruskai in 1973:

Theorem 1

\[ S(\omega_{ABC}) + S(\omega_B) \le S(\omega_{AB}) + S(\omega_{BC}) \qquad [11] \]

The strong subadditivity inequality [11] is conveniently rewritten in terms of the relative entropy. For statistical operators ρ and ω,

\[ S(\rho \| \omega) = {\rm tr}\, \rho (\log \rho - \log \omega) \qquad [12] \]

if supp ρ ≤ supp ω, otherwise S(ρ‖ω) = +∞. The relative entropy expresses statistical distinguishability and therefore it decreases under stochastic mappings:

\[ S(\rho \| \omega) \ge S(E(\rho) \| E(\omega)) \qquad [13] \]

for a completely positive trace-preserving mapping E. The strong subadditivity is equivalent to

\[ S(\omega_{AB}, \varphi \otimes \omega_B) \le S(\omega_{ABC}, \varphi \otimes \omega_{BC}) \qquad [14] \]

where φ is any state on B(H_A) of finite entropy. This inequality is a consequence of monotonicity of the relative entropy, since ω_AB = E(ω_ABC) and φ ⊗ ω_B = E(φ ⊗ ω_BC), where E is the partial trace over H_C. Clearly, the equality in [11] is equivalent to equality in [14].

Theorem 2 The equality holds in [11] if and only if there is an orthogonal decomposition p_B H_B = ⊕_n H^L_nB ⊗ H^R_nB, p_B = supp ω_B, such that the density operator of ω_ABC satisfies

\[ \omega_{ABC} = \sum_n \omega_B(p_n)\, \omega_n^L \otimes \omega_n^R \qquad [15] \]

where ω^L_n ∈ B(H_A) ⊗ B(H^L_nB) and ω^R_n ∈ B(H^R_nB) ⊗ B(H_C) are density operators and p_n ∈ B(H_B) are the orthogonal projections H_B → H^L_nB ⊗ H^R_nB.

Quantum Relative Entropy

The quantum relative entropy is an information measure representing the uncertainty of a state with respect to another state. Hence it indicates a kind of distance between the two states. The formal definition [12] is due to Umegaki.

Now we approach quantum relative entropy axiomatically. Our crucial postulate includes the notion of conditional expectation. Let us recall that in the setting of operator algebras conditional expectation (or projection of norm 1) is defined as a positive unital idempotent linear mapping onto a subalgebra.

Now we list the properties of the relative entropy functional which will be used in an axiomatic characterization:

1. Conditional expectation property. Assume that A is a subalgebra of B and there exists a projection of norm 1 E of B onto A, such that φ∘E = φ. Then for every state ω of B, S(ω, φ) = S(ω|A, φ|A) + S(ω, ω∘E) holds.
2. Invariance property. For every automorphism α of B we have S(ω, φ) = S(ω∘α, φ∘α).
3. Direct sum property. Assume that B = B_1 ⊕ B_2. Let φ_12(a ⊕ b) = λφ_1(a) + (1 − λ)φ_2(b) and ω_12(a ⊕ b) = λω_1(a) + (1 − λ)ω_2(b) for every a ∈ B_1, b ∈ B_2 and some 0 < λ < 1. Then S(ω_12, φ_12) = λS(ω_1, φ_1) + (1 − λ)S(ω_2, φ_2).
4. Nilpotence property. S(φ, φ) = 0.
5. Measurability property. The function (ω, φ) ↦ S(ω, φ) is measurable on the state space of the finite dimensional C*-algebra B (when φ is assumed to be faithful).

Theorem 3 If a real valued functional R(ω, φ) defined for faithful states φ and arbitrary states ω of finite quantum systems shares the properties (1)–(5), then there exists a constant c ∈ R such that

\[ R(\omega, \varphi) = c\, {\rm Tr}\, D_\omega (\log D_\omega - \log D_\varphi) \]

The relative entropy may be defined for linear functionals of an arbitrary C*-algebra. The general definition may go through von Neumann algebras, normal states and the relative modular operator. Another possibility is based on the monotonicity. Let ω and φ be states of a C*-algebra A. Consider finite-dimensional algebras B and completely positive unital mappings α: B → A. Then the supremum

of the relative entropies $S(\omega\circ\alpha\|\varphi\circ\alpha)$ (over all $\alpha$) can be defined as $S(\omega\|\varphi)$.

Theorem 4 The relative entropy of states of $C^*$-algebras shares the following properties.
(i) $(\omega, \varphi)\mapsto S(\omega\|\varphi)$ is convex and weakly lower-semicontinuous.
(ii) $\|\varphi - \omega\|^2 \le 2S(\omega, \varphi)$.
(iii) For a unital Schwarz map $\alpha : \mathcal{A}_0\to\mathcal{A}_1$ the relation $S(\omega\circ\alpha\|\varphi\circ\alpha) \le S(\omega\|\varphi)$ holds.

Property (iii) is Uhlmann's monotonicity theorem, which we have already applied above.

The relative entropy appears in many concepts and problems in the area of quantum information theory (Nielsen and Chuang 2000, Schumacher and Westmoreland 2002).

Statistical Thermodynamics

Let an infinitely extended system of quantum spins be considered in the simple cubic lattice $L = \mathbb{Z}^\nu$, where $\nu$ is a positive integer. The observables confined to a lattice site $x\in\mathbb{Z}^\nu$ form the self-adjoint part of a finite-dimensional $C^*$-algebra $\mathcal{A}_x$ which is a copy of the matrix algebra $M_d(\mathbb{C})$. It is assumed that the local observables in any bounded region $\Lambda\subset\mathbb{Z}^\nu$ are those of the finite quantum system
$$\mathcal{A}_\Lambda = \bigotimes_{x\in\Lambda}\mathcal{A}_x$$

It follows from the definition that for $\Lambda\subset\Lambda'$ we have $\mathcal{A}_{\Lambda'} = \mathcal{A}_\Lambda\otimes\mathcal{A}_{\Lambda'\setminus\Lambda}$, where $\Lambda'\setminus\Lambda$ is the complement of $\Lambda$ in $\Lambda'$. The algebra $\mathcal{A}_\Lambda$ and the subalgebra $\mathcal{A}_\Lambda\otimes\mathbb{C}I_{\Lambda'\setminus\Lambda}$ of $\mathcal{A}_{\Lambda'}$ have identical structure and we identify the element $A\in\mathcal{A}_\Lambda$ with $A\otimes I_{\Lambda'\setminus\Lambda}$ in $\mathcal{A}_{\Lambda'}$. If $\Lambda\subset\Lambda'$ then $\mathcal{A}_\Lambda\subset\mathcal{A}_{\Lambda'}$ and it is said that $\mathcal{A}_\Lambda$ is isotonic with respect to $\Lambda$. The definition also implies that if $\Lambda_1$ and $\Lambda_2$ are disjoint then elements of $\mathcal{A}_{\Lambda_1}$ commute with those of $\mathcal{A}_{\Lambda_2}$. The quasilocal $C^*$-algebra $\mathcal{A}$ is the norm completion of the normed algebra $\mathcal{A}_\infty = \bigcup_\Lambda\mathcal{A}_\Lambda$, the union of all local algebras $\mathcal{A}_\Lambda$ associated with bounded (finite) regions $\Lambda\subset\mathbb{Z}^\nu$.

We denote by $a_x$ the element of $\mathcal{A}_x$ corresponding to $a\in\mathcal{A}_0$ ($x\in\mathbb{Z}^\nu$). It follows from the definition that the algebra $\mathcal{A}_\infty$ consists of linear combinations of terms $a^{(1)}_{x_1}\cdots a^{(k)}_{x_k}$ where $x_1,\dots,x_k$ and $a^{(1)},\dots,a^{(k)}$ run through $\mathbb{Z}^\nu$ and $\mathcal{A}_0$, respectively. We define $\gamma_x$ to be the linear transformation
$$a^{(1)}_{x_1}\cdots a^{(k)}_{x_k}\mapsto a^{(1)}_{x_1+x}\cdots a^{(k)}_{x_k+x}$$

$\gamma_x$ corresponds to the space translation by $x\in\mathbb{Z}^\nu$ and it extends to an automorphism of $\mathcal{A}$. Hence $\gamma$ is a representation of the abelian group $\mathbb{Z}^\nu$ by automorphisms of the quasilocal algebra $\mathcal{A}$. Clearly, the covariance condition
$$\gamma_x(\mathcal{A}_\Lambda) = \mathcal{A}_{\Lambda+x}$$
holds, where $\Lambda + x$ is the space-translate of the region $\Lambda$ by the displacement $x$.

Having described the kinematical structure of lattice systems, we turn to the dynamics. The local Hamiltonian $H(\Lambda)$ is taken to be the total potential energy between the particles confined to $\Lambda$. This energy may come from many-body interactions of various orders. Most generally, we assume that there exists a global function $\Phi$ such that for any finite subsystem $\Lambda$ the local Hamiltonian takes the form
$$H(\Lambda) = \sum_{X\subset\Lambda}\Phi(X) \qquad [16]$$

Each $\Phi(X)$ represents the interaction energy of the particles in $X$. Mathematically, $\Phi(X)$ is a self-adjoint element of $\mathcal{A}_X$ and $H(\Lambda)$ will be a self-adjoint operator in $\mathcal{A}_\Lambda$. We restrict our discussion to translation-invariant interactions, which satisfy the additional requirement
$$\gamma_x(\Phi(X)) = \Phi(X + x)$$
for every $x\in\mathbb{Z}^\nu$ and every region $X\subset\mathbb{Z}^\nu$. An interaction $\Phi$ is said to be of finite range if $\Phi(\Lambda) = 0$ when the cardinality (or diameter) of $\Lambda$ is large enough, $d(\Lambda)\ge d_\Phi$. The infimum of such numbers is called the range of $\Phi$.

If $\varphi$ is a state of the quasilocal algebra $\mathcal{A}$ then it will induce a state $\varphi_\Lambda$ on $\mathcal{A}(\Lambda)$, the finite system comprising the spin in the bounded region $\Lambda$ of $\mathbb{Z}^\nu$. The (local) energy, entropy, and free energy of this finite system are given by the following formulas:
$$E_\Lambda(\varphi) := \mathrm{tr}_\Lambda\,\omega_\Lambda H(\Lambda)$$
$$S_\Lambda(\varphi) := -\mathrm{tr}_\Lambda\,\omega_\Lambda\log\omega_\Lambda \qquad [17]$$
$$F_\Lambda(\varphi) := E_\Lambda(\varphi) - \frac{1}{\beta}S_\Lambda(\varphi)$$

Here $\omega_\Lambda$ denotes the density of $\varphi_\Lambda$ with respect to the trace $\mathrm{tr}_\Lambda$ of $\mathcal{A}_\Lambda$, and $\beta$ denotes the inverse temperature. The functionals $E_\Lambda$, $S_\Lambda$, and $F_\Lambda$ are termed local. It is rather obvious that all three local functionals are continuous if the weak topology is considered on the state space of the quasilocal algebra. The energy is affine, the entropy is concave and consequently, the free energy is a convex functional.

The free energy functional $F_\Lambda$ is minimized by the Gibbs state (see [8] with $H = H(\Lambda)$), and the minimum value is given by
$$-\frac{1}{\beta}\log\mathrm{tr}_\Lambda\,e^{-\beta H(\Lambda)} \qquad [18]$$
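The finite-volume variational principle behind [18] can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using a random Hermitian matrix as a stand-in for the local Hamiltonian $H(\Lambda)$; the helper names `free_energy`, `gibbs_state`, and `random_density_matrix` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def free_energy(rho, H, beta):
    """F(rho) = tr(rho H) - S(rho)/beta, with S the von Neumann entropy."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]              # convention: 0 log 0 = 0
    entropy = -np.sum(evals * np.log(evals))
    energy = np.trace(rho @ H).real
    return energy - entropy / beta

def gibbs_state(H, beta):
    """Density matrix exp(-beta H) / tr exp(-beta H)."""
    w, v = np.linalg.eigh(H)
    weights = np.exp(-beta * (w - w.min()))   # shift for numerical stability
    weights /= weights.sum()
    return (v * weights) @ v.conj().T         # V diag(weights) V^dagger

def random_density_matrix(dim, rng):
    """A random full-rank density matrix (illustrative sampling only)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(0)
dim, beta = 4, 1.3                            # assumed toy parameters
g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (g + g.conj().T) / 2                      # stand-in for H(Lambda)

# Right-hand side of [18]: -(1/beta) log tr exp(-beta H)
f_min = -np.log(np.sum(np.exp(-beta * np.linalg.eigvalsh(H)))) / beta
f_gibbs = free_energy(gibbs_state(H, beta), H, beta)
f_random = [free_energy(random_density_matrix(dim, rng), H, beta)
            for _ in range(200)]
```

In this sketch `f_gibbs` should agree with `f_min` up to rounding, and no sampled value in `f_random` should fall below it. The unnormalized matrix trace is used here, which is an assumption about the normalization of $\mathrm{tr}_\Lambda$.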

Our aim is to explain this variational principle after the thermodynamic limit is performed.

The thermodynamic limit "$\Lambda$ tends to infinity" may be taken along lattice parallelepipeds. Let $a\in\mathbb{Z}^\nu$ with positive coordinates and define
$$\Lambda(a) = \{x\in\mathbb{Z}^\nu : 0\le x_i < a_i,\ i = 1, 2, \dots, \nu\} \qquad [19]$$

When $a\to\infty$, $\Lambda(a)$ tends to infinity in a manner suitable for the study of the thermodynamic limit: the boundary of the parallelepipeds becomes more and more negligible compared with the volume. The notion of limit in the sense of van Hove makes this idea more precise and physically more satisfactory. For the sake of simplicity, we restrict ourselves to the thermodynamic limit along parallelepipeds.

Denoting by $|\Lambda|$ the volume of $\Lambda$ (or the number of points in $\Lambda$), we may define the global energy, entropy, and free energy functionals of translationally invariant states to be
$$e(\varphi) := \lim_{\Lambda\to\infty} E_\Lambda(\varphi)/|\Lambda| \qquad [20]$$
$$s(\varphi) := \lim_{\Lambda\to\infty} S_\Lambda(\varphi)/|\Lambda| \qquad [21]$$
$$f(\varphi) := \lim_{\Lambda\to\infty} F_\Lambda(\varphi)/|\Lambda| \qquad [22]$$

The existence of the limit in [21] is guaranteed by the strong subadditivity of entropy, while that of the limits in [20] and [22] is ensured if the interaction is suitably tempered, as it certainly is if the interaction is of finite range.

Theorem 5 If $\varphi$ is a translationally invariant state of the quasilocal algebra $\mathcal{A}$, then the limit [21] exists and
$$s(\varphi) = \inf\{S_{\Lambda(a)}(\varphi)/|\Lambda(a)| : a\in\mathbb{Z}^\nu_+\} \qquad [23]$$
Moreover, the von Neumann entropy density functional $\varphi\mapsto s(\varphi)$ is affine and upper-semicontinuous when the state space is endowed with the weak topology.

Let $\Phi$ be an interaction of finite range. Then the thermodynamic limit [20] exists and the energy density is given by
$$e(\varphi) = \varphi(E_\Phi) \quad\text{and}\quad E_\Phi = \sum_{0\in\Lambda}\frac{\Phi(\Lambda)}{|\Lambda|}$$
Furthermore, $e(\varphi)$ is an affine weakly continuous functional of $\varphi$.

It follows that the free energy density $f(\varphi)$ exists and it is an affine lower-semicontinuous function of the translation-invariant state $\varphi$.

For $0 < \beta < \infty$ the thermodynamic limit
$$\lim_{\Lambda\to\infty}\frac{1}{|\Lambda|}\log\mathrm{tr}_\Lambda\,e^{-\beta H(\Lambda)} \equiv p(\beta, \Phi)$$
exists. In accordance with the lattice-gas interpretation of our model, the global quantity $p$ is termed pressure.

In the treatment of quantum spin systems, the set $S_\gamma$ of all translation-invariant states is essential. The global entropy functional $s$ is a continuous affine function on $S_\gamma$ and physically it is a macroscopic quantity which does not have a microscopic (i.e., local) counterpart. Indeed, the local entropy functional is not an observable because it is not affine on the (local) state space. The local internal energy $E_\Lambda(\varphi)$ is a microscopic observable and the energy density functional $e$ of $S_\gamma$ is the corresponding global extensive quantity.

As an analog of the variational principle for finite quantum systems, the global free-energy functional $f$ attains an absolute minimum at a translationally invariant state, and the minimum value of $f$ is equal to the thermodynamic limit of the canonical free-energy densities of the local finite systems. In the next theorem, this global variational principle will be formulated in a slightly different but equivalent way.

Theorem 6 When $\Phi$ is an interaction of finite range, then
$$p(\beta, \Phi) = \sup\{s(\omega) - \beta e(\omega)\}$$
holds, when the supremum is over all translationally invariant states $\omega$ on $\mathcal{A}$.

The states attaining the supremum on the right-hand side are called equilibrium states and they have several different characterizations.

Asymptotical Properties

We keep the notation of the previous section but we consider one-dimensional chains, $\nu = 1$. Let $\omega$ be a translation-invariant state on $\mathcal{A}$ and fix a positive number $\varepsilon < 1$. We have in mind that $\varepsilon$ is small and say that a sequence of projections $Q_n\in\mathcal{A}_{[1,n]}$ is of high probability if $\omega(Q_n)\ge 1-\varepsilon$. The size of $Q_n$, the cardinality of a maximal pairwise orthogonal family of projections contained in $Q_n$, is given by $\mathrm{tr}_n Q_n$. (The subscript $n$ in $\mathrm{tr}_n$ indicates that the algebraic trace functional on $\mathcal{A}_n$ is meant here.) The theorem below says that the entropy density of $\omega$ governs asymptotically the rank of the high-probability projections.

Theorem 7 Assume that $\omega$ is an ergodic translation-invariant state of $\mathcal{A}$. Then the limit relation
$$\lim_{n\to\infty}\frac{1}{n}\inf\{\log\mathrm{tr}_n Q_n\} = s(\omega)$$
holds, when the infimum is over all projections $Q_n\in\mathcal{A}_{[1,n]}$ such that $\omega_n(Q_n)\ge 1-\varepsilon$.
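The limit relation of Theorem 7 can be illustrated in the simplest case of a product state on a chain of qubits. The sketch below is an illustration under stated assumptions rather than part of the theory above: the single-site density matrix is assumed to have eigenvalues $(1-p, p)$ with $p < 1/2$, so the $n$-site density has eigenvalue $p^k(1-p)^{n-k}$ with multiplicity $\binom{n}{k}$, and the smallest projection of probability at least $1-\varepsilon$ is obtained by collecting eigenvectors in order of decreasing eigenvalue. The function names are hypothetical.

```python
import math

def entropy_density(p):
    """Entropy (in nats) of a qubit state with eigenvalues (1 - p, p)."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def min_rank_rate(n, p, eps):
    """(1/n) log of the smallest rank of a projection Q with omega(Q) >= 1 - eps
    for the n-fold product state; assumes 0 < p < 1/2 and 0 < eps < 1."""
    assert 0 < p < 0.5 and 0 < eps < 1
    target = 1.0 - eps
    cum, rank = 0.0, 0
    for k in range(n + 1):                    # eigenvalues decrease with k
        mult = math.comb(n, k)                # multiplicity of the k-th shell
        log_mass = (math.log(mult) + k * math.log(p)
                    + (n - k) * math.log(1 - p))
        mass = math.exp(log_mass)             # total probability of the shell
        if cum + mass >= target:
            # only part of this shell is needed; count it to ~12 digits
            frac = (target - cum) / mass
            partial = mult * int(frac * 10**12) // 10**12 + 1
            rank += min(partial, mult)
            break
        cum += mass
        rank += mult
    return math.log(rank) / n

s = entropy_density(0.2)
rate_small = min_rank_rate(50, 0.2, 0.1)
rate_large = min_rank_rate(2000, 0.2, 0.1)
```

For these assumed parameters the rate should move toward `s` as `n` grows, although the convergence is slow (the gap shrinks roughly like $1/\sqrt{n}$).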

This result is strongly related to data compression. When $\omega$ is interpreted as a stationary quantum source (with possible memory), then efficient and reliable data compression needs a subspace of small dimension and the range of $Q_n$ can play this role. The entropy density is the maximal rate of reliable compression.

It is interesting that one can impose further requirements on the high-probability projections and the statement of the theorem remains true.

1. The partial trace of $Q_{n+1}$ over $\mathcal{A}_{n+1}$ is $Q_n$;
2. $e^{n(s-\varepsilon)}\le\mathrm{tr}\,Q_n\le e^{n(s+\varepsilon)}$ if $n$ is large enough; and
3. if $q\le Q_n$ is a minimal projection (in $\mathcal{A}_{[1,n]}$), then $\omega(q)\le e^{-n(s-\varepsilon)}$ if $n$ is large enough.

In (2) and (3) $s$ stands for $s(\omega)$. Let $D_n$ be the density matrix of the restriction of $\omega$ to $\mathcal{A}_{[1,n]}$. It follows that for an eigenvalue $\lambda$ of $Q_n D_n Q_n$ the inequality
$$-\frac{\log\lambda}{n}\ge s-\varepsilon$$
holds.

From the point of view of data compression, it is important if the sequence $Q_n\in\mathcal{A}_{[1,n]}$ works universally for many states. Indeed, in this case the compression algorithm can be universal for several quantum sources.

Theorem 8 Let $R > 0$. There is a projection $Q_n\in\mathcal{A}_{[1,n]}$ such that
$$\limsup_n\frac{1}{n}\log\mathrm{tr}\,Q_n\le R \qquad [24]$$
and for any ergodic state $\omega$ on $\mathcal{A}$ such that $s(\omega) < R$ the relation
$$\lim_n\omega(Q_n) = 1 \qquad [25]$$
holds.

In the simplest quantum hypothesis testing problem, one has to decide between two states of a system. The state $\rho_0$ is the null hypothesis and $\rho_1$ is the alternative hypothesis. The problem is to decide which hypothesis is true. The decision is performed by a two-valued measurement $\{T, I - T\}$, where $0\le T\le I$ is an observable. $T$ corresponds to the acceptance of $\rho_0$ and $I - T$ corresponds to the acceptance of $\rho_1$. $T$ is called a test. When the measurement value is 0, the hypothesis $\rho_0$ is accepted, otherwise the alternative hypothesis $\rho_1$ is accepted. The quantity $\alpha[T] = \mathrm{tr}\,\rho_0(I - T)$ is interpreted as the probability that the null hypothesis is true but the alternative hypothesis is accepted. This is the error of the first kind. Similarly, $\beta[T] = \mathrm{tr}\,\rho_1 T$ is the probability that the alternative hypothesis is true but the null hypothesis is accepted. It is called the error of the second kind.

Now we fix a formalism for an asymptotic theory of hypothesis testing. Suppose that a sequence $(\mathcal{H}_n)$ of Hilbert spaces is given, and $(\rho_0^{(n)})$ and $(\rho_1^{(n)})$ are density matrices on $\mathcal{H}_n$. The typical example we have in mind is $\rho_0^{(n)} = \rho_0\otimes\rho_0\otimes\cdots\otimes\rho_0$ and $\rho_1^{(n)} = \rho_1\otimes\rho_1\otimes\cdots\otimes\rho_1$. A positive contraction $T_n\in B(\mathcal{H}_n)$ is considered as a test on a composite system. Now the errors of the first and second kind depend on $n$: $\alpha_n[T_n] = \mathrm{tr}\,\rho_0^{(n)}(I - T_n)$ and $\beta_n[T_n] = \mathrm{tr}\,\rho_1^{(n)}T_n$.

Set
$$\beta^*(n, \varepsilon) = \inf\{\mathrm{tr}\,\rho_1^{(n)}A_n\} \qquad [26]$$
where the infimum is over all $A_n\in B(\mathcal{H}_n)$ such that $0\le A_n\le I$ and $\mathrm{tr}\,\rho_0^{(n)}(I - A_n)\le\varepsilon$. In other words, this is the infimum of the error of the second kind when the error of the first kind is at most $\varepsilon$. The importance of this quantity is in the customary approach to hypothesis testing.

The following result is the quantum Stein lemma.

Theorem 9 In the above setting, the relation
$$\lim_{n\to\infty}\frac{1}{n}\log\beta^*(n, \varepsilon) = -S(\rho_0\|\rho_1)$$
holds for every $0 < \varepsilon < 1$.

Bibliographic Notes

A book about entropy in several contexts is Greven et al. (2003). The role of von Neumann in the mathematization of the quantum entropy is described in Petz (2001); this work also contains his gedanken experiment.

The first proof of the strong subadditivity appeared in Lieb and Ruskai (1973) and a didactical elementary approach is in Nielsen and Petz (2005). The structure of the case of equality was obtained in Hayden et al. (2004). The quantum relative entropy was introduced by Umegaki in 1962 and its properties and axiomatization are contained in the monograph by Ohya and Petz (1993). The monotonicity of the relative entropy is called Uhlmann's theorem, see Uhlmann (1977) and Ohya and Petz (1993).

The rigorous and comprehensive treatment of quantum lattice systems was one of the early successes of the algebraic approach to quantum statistical thermodynamics. The subject is well summarized in Bratteli and Robinson (1981). The book by Sewell (1986) contains more physics and fewer mathematical technicalities. For many interesting entropy results concerning mean field systems, see, for example, Raggio and Werner (1991).

The high probability subspace theorem is due to Ohya and Petz (Petz (1992) and Ohya and Petz (1993), for product states) and was extended to

some algebraic and Gibbs states by Hiai and Petz. The application to data compression was first observed by Schumacher (1995). The chained property of the high-probability subspaces was studied in Bjelaković et al. (2003) and the universality is from Kaltchenko and Yang (2003).

A weak form of the quantum Stein lemma was proved in Hiai and Petz (1991) and the stated form is due to Ogawa and Nagaoka (2000). An extension to the case where $\rho_0^{(n)}$ is not a product was given in Bjelaković and Siegmund-Schultze (2004).

Other surveys about quantum entropy are Petz (1992) and Schumacher and Westmoreland (2002).

See also: Asymptotic Structure and Conformal Infinity; Capacities Enhanced by Entanglement; Channels in Quantum Information Theory; Entropy and Qualitative Transversality; Positive Maps on C*-Algebras; von Neumann Algebras: Introduction, Modular Theory and Classification Theory; von Neumann Algebras: Subfactor Theory.

Further Reading

Bjelaković I, Krüger T, Siegmund-Schultze R, and Szkoła A (2003) Chained typical subspaces – a quantum version of Breiman's theorem. Preprint.
Bjelaković I and Siegmund-Schultze R (2004) An ergodic theorem for the quantum relative entropy. Communications in Mathematical Physics 247: 697–712.
Bratteli O and Robinson DW (1981) Operator Algebras and Quantum Statistical Mechanics. 2. Equilibrium States. Models in Quantum Statistical Mechanics, Texts and Monographs in Physics, (2nd edn., 1997). Berlin: Springer.
Greven A, Keller G, and Warnecke G (2003) Entropy. Princeton and Oxford: Princeton University Press.
Hayden P, Jozsa R, Petz D, and Winter A (2004) Structure of states which satisfy strong subadditivity of quantum entropy with equality. Communications in Mathematical Physics 246: 359–374.
Hiai F and Petz D (1991) The proper formula for relative entropy and its asymptotics in quantum probability. Communications in Mathematical Physics 143: 99–114.
Kaltchenko A and Yang E-H (2003) Universal compression of ergodic quantum sources. Quantum Information and Computation 3: 359–375.
Lieb EH and Ruskai MB (1973) Proof of the strong subadditivity of quantum mechanical entropy. Journal of Mathematical Physics 14: 1938–1941.
Nielsen MA and Petz D (2005) A simple proof of the strong subadditivity inequality. Quantum Information and Computation 5: 507–513.
Ogawa T and Nagaoka H (2000) Strong converse and Stein's lemma in quantum hypothesis testing. IEEE Transactions on Information Theory 46: 2428–2433.
Ohya M and Petz D (1993) Quantum Entropy and Its Use, Texts and Monographs in Physics, (2nd edn., 2004). Berlin: Springer.
Petz D (1992) Entropy in quantum probability. In: Accardi L (ed.) Quantum Probability and Related Topics VII, pp. 275–297. Singapore: World Scientific.
Petz D (2001) Entropy, von Neumann and the von Neumann entropy. In: Rédei M and Stöltzner M (eds.) John von Neumann and the Foundations of Quantum Physics. Dordrecht: Kluwer.
Raggio GA and Werner RF (1991) The Gibbs variational principle for inhomogeneous mean field systems. Helvetica Physica Acta 64: 633–667.
Schumacher B (1995) Quantum coding. Physical Review A 51: 2738–2747.
Schumacher B and Westmoreland MD (2002) Relative entropy in quantum information theory. In: Quantum Computation and Information, (Washington, DC, 2000), Contemp. Math. vol. 305, pp. 265–289. Providence, RI: American Mathematical Society.
Sewell GL (1986) Quantum Theory of Collective Phenomena, New York: Clarendon.
Uhlmann A (1977) Relative entropy and the Wigner–Yanase–Dyson–Lieb concavity in an interpolation theory. Communications in Mathematical Physics 54: 21–32.
von Neumann J (1932) Mathematische Grundlagen der Quantenmechanik. Berlin: Springer. (In English: von Neumann J, Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.)

Quantum Ergodicity and Mixing of Eigenfunctions


S Zelditch, Johns Hopkins University, Baltimore, MD, USA
© 2006 Elsevier Ltd. All rights reserved.

Quantum ergodicity and mixing belong to the field of quantum chaos, which studies quantizations of "chaotic" classical Hamiltonian systems. The basic question is: how does the chaos of the classical dynamics impact on the eigenvalues/eigenfunctions of the quantum Hamiltonian $\hat{H}$ and on long-time dynamics generated by $\hat{H}$?

These problems lie at the foundations of the semiclassical limit, that is, the limit as the Planck constant $h\to 0$ or the energy $E\to\infty$. More generally, one could ask what impact any dynamical feature of a classical mechanical system (e.g., complete integrability, KAM, and ergodicity) has on the eigenfunctions and eigenvalues of the quantization.

Over the last 30 years or so, these questions have been studied rather systematically by both mathematicians and physicists. There is an extensive literature comparing classical and quantum dynamics of model systems, such as comparing the geodesic flow and wave group on a compact (or finite-volume) hyperbolic surface, or comparing classical and quantum billiards on the Sinai billiard or the Bunimovich stadium, or comparing the

discrete dynamical system generated by a hyperbolic torus automorphism and its quantization by the metaplectic representation. As these models indicate, the basic problems and phenomena are richly embodied in simple, low-dimensional examples in much the same way that two-dimensional toy statistical mechanical models already illustrate complex problems on phase transitions. The principles established for simple models should apply to far more complex systems such as atoms and molecules in strong magnetic fields.

The conjectural picture which has emerged from many computer experiments and heuristic arguments on these simple model systems is roughly that there exists a length scale in which quantum chaotic systems exhibit universal behavior. At this length scale, the eigenvalues resemble eigenvalues of random matrices of large size and the eigenfunctions resemble random waves. A small sample of the original physics articles suggesting this picture is Berry (1977), Bohigas et al. (1984), Feingold and Peres (1986), and Heller (1984).

This article reviews some of the rigorous mathematical results in quantum chaos, particularly those on eigenfunctions of quantizations of classically ergodic or mixing systems. They support the conjectural picture of random waves up to two moments, that is, on the level of means and variances. A few results also exist on higher moments in very special cases. But from the mathematical point of view, the conjectural links to random matrices or random waves remain very much open at this time. A key difficulty is that the length scale on which universal behavior should occur is far below the resolving power of any known mathematical techniques, even in the simplest model problems. The main evidence for the random matrix and random wave connections comes from numerous computer experiments of model cases in the physics literature. We will not review numerical results here, but to get a well-rounded view of the field, it is important to understand the computer experiments (see, e.g., Bäcker et al. (1998a, b) and Barnett (2005)).

The model quantum systems that have been most intensively studied in mathematical quantum chaos are Laplacians or Schrödinger operators on compact (or finite-volume) Riemannian manifolds, with or without boundary, and quantizations of symplectic maps on compact Kähler manifolds. Similar techniques and results apply in both settings, so for the sake of coherence we concentrate on the Laplacian on a compact Riemannian manifold with "chaotic" geodesic flow and only briefly allude to the setting of "quantum maps."

Additionally, two main kinds of methods are in use: (1) methods of semiclassical (or microlocal) analysis, which apply to general Laplacians (and more general Schrödinger operators), and (2) methods of number theory and automorphic forms, which apply to arithmetic models such as arithmetic hyperbolic manifolds or quantum cat maps. Arithmetic models are far more "explicitly solvable" than general chaotic systems, and the results obtained for them are far sharper than the results of semiclassical analysis. This article is primarily devoted to the general results on Laplacians obtained by semiclassical analysis; see Arithmetic Quantum Chaos for results by J Marklof. For background on semiclassical analysis, see Heller (1984).

Wave Group and Geodesic Flow

The model quantum Hamiltonians we will discuss are Laplacians $\Delta$ on compact Riemannian manifolds $(M, g)$ (with or without boundary). The classical phase space in this setting is the cotangent bundle $T^*M$ of $M$, equipped with its canonical symplectic form $\sum_i dx_i\wedge d\xi_i$. The metric defines the Hamiltonian
$$H(x, \xi) = |\xi|_g = \sqrt{\sum_{i,j=1}^n g^{ij}(x)\,\xi_i\xi_j}$$
on $T^*M$, where
$$g_{ij} = g\!\left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right),$$
and $[g^{ij}]$ is the inverse matrix to $[g_{ij}]$. We denote the volume density of $(M, g)$ by $d\mathrm{Vol}$ and the corresponding inner product on $L^2(M)$ by $\langle f, g\rangle$. The unit (co-)ball bundle is denoted $B^*M = \{(x, \xi) : |\xi|\le 1\}$.

The Hamiltonian flow $\Phi^t$ of $H$ is the geodesic flow. By definition, $\Phi^t(x, \xi) = (x_t, \xi_t)$, where $(x_t, \xi_t)$ is the terminal tangent vector at time $t$ of the unit speed geodesic starting at $x$ in the direction $\xi$. Here and below, we often identify $T^*M$ with the tangent bundle $TM$ using the metric to simplify the geometric description. The geodesic flow preserves the energy surfaces $\{H = E\}$ which are the co-sphere bundles $S^*_E M$. Due to the homogeneity of $H$, the flow on any energy surface $\{H = E\}$ is equivalent to that on the co-sphere bundle $S^*M = \{H = 1\}$. (This homogeneity could be broken by adding a potential $V\in C^\infty(M)$ to form a semiclassical Schrödinger operator $h^2\Delta + V$, whose underlying Hamiltonian flow is generated by $|\xi|^2_g + V(x)$.) See h-Pseudodifferential Operators and Applications.
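The equivalence of the flows on different energy surfaces can be made explicit by a short computation. The following is a sketch; the dilation $D_c$ is notation introduced here for illustration and is not used elsewhere in the article.

```latex
% Hamilton's equations for H(x,\xi) = |\xi|_g (summation over repeated indices):
\dot{x}_i = \frac{\partial H}{\partial \xi_i} = \frac{g^{ij}(x)\,\xi_j}{|\xi|_g},
\qquad
\dot{\xi}_i = -\frac{\partial H}{\partial x_i}
            = -\frac{1}{2|\xi|_g}\,\frac{\partial g^{jk}}{\partial x_i}(x)\,\xi_j\xi_k .

% H is homogeneous of degree 1 in \xi: H(x, c\xi) = c\,H(x,\xi) for c > 0.
% Hence \partial H/\partial\xi is homogeneous of degree 0 and
% \partial H/\partial x is homogeneous of degree 1, so if (x(t),\xi(t)) solves
% the equations above, then so does (x(t), c\,\xi(t)).
% With the dilation D_c(x,\xi) := (x, c\,\xi) this reads
\Phi^t \circ D_c = D_c \circ \Phi^t , \qquad c > 0,

% and since D_E maps \{H = 1\} onto \{H = E\}, the flow on \{H = E\} is
% conjugate to the flow on S^*M by D_E, with the same time parametrization.
```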

The quantization of the Hamiltonian $H$ is the square root $\sqrt{\Delta}$ of the positive Laplacian
$$\Delta = -\frac{1}{\sqrt{g}}\sum_{i,j=1}^n\frac{\partial}{\partial x_i}\,g^{ij}\sqrt{g}\,\frac{\partial}{\partial x_j}$$
of $(M, g)$. Here, $g = \det[g_{ij}]$. We choose to work with $\sqrt{\Delta}$ rather than $\Delta$ since the former generates the wave group
$$U_t = e^{it\sqrt{\Delta}}$$
which is the quantization of the geodesic flow $\Phi^t$.

By the last statement we mean that $U_t$ is related to $\Phi^t$ in several essentially equivalent ways:

1. singularities of waves, that is, solutions $U_t\psi$ of the wave equation, propagate along geodesics;
2. $U_t$ is a Fourier integral operator (= quantum map) associated to the canonical relation defined by the graph of $\Phi^t$ in $T^*M\times T^*M$; and
3. Egorov's theorem holds.

We only define the latter since it plays an important role in studying eigenfunctions. As with any quantum theory, there is an algebra of observables on the Hilbert space $L^2(M, d\mathrm{vol}_g)$ which quantizes $T^*M$. Here, $d\mathrm{vol}_g$ is the volume form of the metric. The algebra is the algebra $\Psi^*(M)$ of pseudodifferential operators ($\Psi$DOs) of all orders, though we often restrict to the subalgebra $\Psi^0$ of $\Psi$DOs of order zero. We denote by $\Psi^m(M)$ the subspace of pseudodifferential operators of order $m$. The algebra is defined by constructing a quantization Op from an algebra of symbols $a\in S^m(T^*M)$ of order $m$ (polyhomogeneous functions on $T^*M\setminus 0$) to $\Psi^m$. The map Op is not unique. In the reverse direction is the symbol map $\sigma_A : \Psi^m\to S^m(T^*M)$ which takes an operator $\mathrm{Op}(a)$ to the homogeneous term $a_m$ of order $m$ in $a$.

Egorov's theorem for the wave group concerns the conjugations
$$\alpha_t(A) := U_t A U_t^*, \qquad A\in\Psi^m(M) \qquad [1]$$

Such a conjugation defines the quantum evolution of observables in the Heisenberg picture, and, since the early days of quantum mechanics, it was known to correspond to the classical evolution
$$V_t(a) := a\circ\Phi^t \qquad [2]$$
of observables $a\in C^\infty(S^*M)$. Egorov's theorem is the rigorous version of this correspondence: it states that $\alpha_t$ defines an order-preserving automorphism of $\Psi^*(M)$, that is, $\alpha_t(A)\in\Psi^m(M)$ if $A\in\Psi^m(M)$, and that
$$\sigma_{U_t A U_t^*}(x, \xi) = \sigma_A(\Phi^t(x, \xi)) := V_t(\sigma_A), \qquad (x, \xi)\in T^*M\setminus 0 \qquad [3]$$

This formula is almost universally taken to be the definition of quantization of a flow or map in the physics literature.

The key difficulty in quantum chaos is that it involves a comparison between long-time dynamical properties of $\Phi^t$ and $U_t$ through the symbol map and similar classical limits. The classical dynamics defines the "principal symbol" behavior of $U_t$ and the "error" $U_t A U_t^* - \mathrm{Op}(\sigma_A\circ\Phi^t)$ typically grows exponentially in time. This is just the first example of a ubiquitous "exponential barrier" in the subject.

Eigenvalues and Eigenfunctions of $\Delta$

The eigenvalue problem on a compact Riemannian manifold
$$\Delta\varphi_j = \lambda_j^2\varphi_j, \qquad \langle\varphi_j, \varphi_k\rangle = \delta_{jk}$$
is dual under the Fourier transform to the wave equation. Here, $\{\varphi_j\}$ is a choice of orthonormal basis of eigenfunctions, which is not unique if the eigenvalues have multiplicities $> 1$. The individual eigenfunctions are difficult to study directly, and so one generally forms the spectral projections kernel,
$$E(\lambda, x, y) = \sum_{j : \lambda_j\le\lambda}\varphi_j(x)\varphi_j(y) \qquad [4]$$

Semiclassical asymptotics is the study of the $\lambda\to\infty$ limit of the spectral data $\{\varphi_j, \lambda_j\}$ or of $E(\lambda, x, y)$. The (Schwartz) kernel of the wave group can be represented in terms of the spectral data by
$$U_t(x, y) = \sum_j e^{it\lambda_j}\varphi_j(x)\varphi_j(y)$$
or equivalently as the Fourier transform $\int_{\mathbb{R}} e^{it\lambda}\,dE(\lambda, x, y)$ of the spectral projections. Hence, spectral asymptotics is often studied through the large-time behavior of the wave group.

The link between spectral theory and geometry, and the source of Egorov's theorem for the wave group, is the construction of a parametrix (or WKB formula) for the wave kernel. For small times $t$, the simplest is the Hadamard parametrix,
$$U_t(x, y)\sim\int_0^\infty e^{i\theta(r^2(x,y) - t^2)}\sum_{k=0}^\infty U_k(x, y)\,\theta^{((d-3)/2)-k}\,d\theta \qquad (t < \mathrm{inj}(M, g)) \qquad [5]$$
where $r(x, y)$ is the distance between points, $U_0(x, y) = \Theta^{-1/2}(x, y)$ is the volume 1/2-density, $\mathrm{inj}(M, g)$ is the injectivity radius, and the higher Hadamard coefficients are obtained by solving transport equations along geodesics. The parametrix is asymptotic to the wave kernel in the sense of

smoothness, that is, the difference of the two sides of [5] is smooth. The relation [5] may be iterated using $U_{tm} = U_t^m$ to obtain a parametrix for long times. This is obviously complicated and not necessarily the best long-time parametrix construction, but it illustrates again the difficulty of a long-time analysis.

Weyl Law and Local Weyl Law

A fundamental and classical result in spectral asymptotics is Weyl's law on counting eigenvalues:
$$N(\lambda) = \#\{j : \lambda_j\le\lambda\} = \frac{|B_n|}{(2\pi)^n}\mathrm{Vol}(M, g)\,\lambda^n + O(\lambda^{n-1}) \qquad [6]$$

Here, $|B_n|$ is the Euclidean volume of the unit ball and $\mathrm{Vol}(M, g)$ is the volume of $M$ with respect to the metric $g$. An equivalent formula which emphasizes the correspondence between classical and quantum mechanics is
$$\mathrm{tr}\,E_\lambda = \frac{\mathrm{Vol}(|\xi|_g\le\lambda)}{(2\pi)^n} \qquad [7]$$
where Vol is the symplectic volume measure relative to the natural symplectic form $\sum_{j=1}^n dx_j\wedge d\xi_j$ on $T^*M$. Thus, the dimension of the space where $H = \sqrt{\Delta}$ is $\le\lambda$ is asymptotically the volume where its symbol $|\xi|_g\le\lambda$.

The remainder term in Weyl's law is sharp on the standard sphere, where all geodesics are periodic, but is not sharp on $(M, g)$ for which the set of periodic geodesics has measure zero (Duistermaat–Guillemin, Ivrii) (see Semiclassical Spectra and Closed Orbits). When the set of periodic geodesics has measure zero (as is the case for ergodic systems), one has
$$N(\lambda) = \#\{j : \lambda_j\le\lambda\} = \frac{|B_n|}{(2\pi)^n}\mathrm{Vol}(M, g)\,\lambda^n + o(\lambda^{n-1}) \qquad [8]$$

The remainder is then of smaller order than the derivative of the principal term, and one then has asymptotics in shorter intervals:
$$N([\lambda, \lambda+1]) = \#\{j : \lambda_j\in[\lambda, \lambda+1]\} = n\,\frac{|B_n|}{(2\pi)^n}\mathrm{Vol}(M, g)\,\lambda^{n-1} + o(\lambda^{n-1}) \qquad [9]$$

Physicists tend to write $\lambda\sim h^{-1}$ and to average over intervals of this width. The mean spacing between the eigenvalues in this interval is then $\sim C_n\mathrm{Vol}(M, g)^{-1}\lambda^{-(n-1)}$, where $C_n$ is a constant depending on the dimension.

An important generalization is the "local Weyl law" concerning the traces $\mathrm{tr}\,AE(\lambda)$, where $A\in\Psi^m(M)$. It asserts that
$$\sum_{\lambda_j\le\lambda}\langle A\varphi_j, \varphi_j\rangle = \frac{1}{(2\pi)^n}\left(\int_{B^*M}\sigma_A\,dx\,d\xi\right)\lambda^n + O(\lambda^{n-1}) \qquad [10]$$

There is also a pointwise local Weyl law:
$$\sum_{\lambda_j\le\lambda}|\varphi_j(x)|^2 = \frac{1}{(2\pi)^n}|B^n|\,\lambda^n + R(\lambda, x) \qquad [11]$$
where $R(\lambda, x) = O(\lambda^{n-1})$ uniformly in $x$. Again, when the periodic geodesics form a set of measure zero in $S^*M$, one could average over the shorter interval $[\lambda, \lambda+1]$. Combining the Weyl and local Weyl law, we find the surface average of $\sigma_A$ is a limit of traces:
$$\omega(A) := \frac{1}{\mu(S^*M)}\int_{S^*M}\sigma_A\,d\mu = \lim_{\lambda\to\infty}\frac{1}{N(\lambda)}\sum_{\lambda_j\le\lambda}\langle A\varphi_j, \varphi_j\rangle \qquad [12]$$

Here, $\mu$ is the "Liouville measure" on $S^*M$, that is, the surface measure $d\mu = dx\,d\xi/dH$ induced by the Hamiltonian $H = |\xi|_g$ and by the symplectic volume measure $dx\,d\xi$ on $T^*M$.

Problems on Asymptotics of Eigenfunctions

Eigenfunctions arise in quantum mechanics as stationary states, that is, states $\psi$ for which the probability measure $|\psi(t, x)|^2\,d\mathrm{vol}$ is constant in time, where $\psi(t, x) = U_t\psi(x)$ is the evolving state. This follows from the fact that
$$U_t\varphi_k = e^{it\lambda_k}\varphi_k \qquad [13]$$
and that $|e^{it\lambda_k}| = 1$. They are the basic modes of the quantum system. One would like to know the behavior as $\lambda_j\to\infty$ (or $h\to 0$ in the semiclassical setting) of invariants such as:

1. matrix elements $\langle A\varphi_j, \varphi_j\rangle$ of observables in this state;
2. transition elements $\langle A\varphi_i, \varphi_j\rangle$ between states;
3. size properties as measured by $L^p$ norms $\|\varphi_j\|_{L^p}$;
4. value distribution as measured by the distribution function $\mathrm{Vol}\{x\in M : |\varphi_j(x)|^2 > t\}$; and
5. shape properties, for example, distribution of zeros and critical points of $\varphi_j$.

Let us introduce some problems which have motivated much of the work in this area.
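As a numerical sanity check of Weyl's law [6], one can count eigenvalues on an explicitly solvable example. The sketch below assumes the flat torus $\mathbb{R}^2/(2\pi\mathbb{Z})^2$, which is not discussed in the text: its eigenfunctions are $e^{i k\cdot x}$ with $k\in\mathbb{Z}^2$, so the eigenvalues of $\sqrt{\Delta}$ are $\lambda_k = |k|$, and with $n = 2$, $|B_2| = \pi$, and $\mathrm{Vol} = (2\pi)^2$ the leading term in [6] is $\pi\lambda^2$. The function names are hypothetical.

```python
import math

def count_torus_eigenvalues(lam):
    """N(lam) = #{k in Z^2 : |k| <= lam} for the flat torus R^2/(2 pi Z)^2."""
    r = math.floor(lam)
    lam2 = lam * lam
    count = 0
    for k1 in range(-r, r + 1):
        # largest m >= 0 with k1^2 + m^2 <= lam^2; then k2 ranges over -m..m
        m = math.isqrt(math.floor(lam2 - k1 * k1))
        count += 2 * m + 1
    return count

def weyl_leading_term(lam):
    """|B_2| / (2 pi)^2 * Vol * lam^2 with |B_2| = pi and Vol = (2 pi)^2."""
    return math.pi * lam ** 2

for lam in (10, 100, 1000):
    n_lam = count_torus_eigenvalues(lam)
    print(lam, n_lam, n_lam - weyl_leading_term(lam))
```

The printed differences are the remainders in [6]; in this example they should stay well within a constant multiple of $\lambda$, consistent with the $O(\lambda^{n-1})$ bound for $n = 2$.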

Problem 1 Let $\mathcal{Q}$ denote the set of "quantum limits," that is, weak limit points of the sequence $\{\Phi_k\}$ of distributions on the classical phase space $S^*M$, defined by

\[ \int_X a\,d\Phi_k := \langle \mathrm{Op}(a)\varphi_k, \varphi_k\rangle \]

where $a \in C^\infty(S^*M)$.

The set $\mathcal{Q}$ is independent of the definition of Op. It follows almost immediately from Egorov's theorem that $\mathcal{Q} \subset \mathcal{M}_I$, where $\mathcal{M}_I$ is the convex set of invariant probability measures for the geodesic flow. Furthermore, they are time-reversal invariant, that is, invariant under $(x, \xi) \to (x, -\xi)$, since the eigenfunctions are real valued.

To see this, it is helpful to introduce the linear functionals on $\Psi^0$:

\[ \rho_k(A) = \langle \mathrm{Op}(a)\varphi_k, \varphi_k\rangle \qquad [14] \]

We observe that $\rho_k(I) = 1$, $\rho_k(A) \ge 0$ if $A \ge 0$, and that

\[ \rho_k(U_t A U_t^*) = \rho_k(A) \qquad [15] \]

Indeed, if $A \ge 0$ then $A = B^*B$ for some $B \in \Psi^0$ and we can move $B^*$ to the right-hand side. Similarly, [15] is proved by moving $U_t$ to the right-hand side and using [13]. These properties mean that $\rho_j$ is an "invariant state" on the algebra $\Psi^0$. More precisely, one should take the closure of $\Psi^0$ in the operator norm. An invariant state is the analog in quantum statistical mechanics of an invariant probability measure.

The next important fact about the states $\rho_k$ is that any weak limit of the sequence $\{\rho_k\}$ on $\Psi^0$ is an invariant probability measure on $C(S^*M)$, that is, a positive linear functional on $C(S^*M)$ rather than just a state on $\Psi^0$. This follows from the fact that $\langle K\varphi_j, \varphi_j\rangle \to 0$ for any compact operator $K$, and so any limit of $\langle A\varphi_k, \varphi_k\rangle$ is equally a limit of $\langle (A + K)\varphi_k, \varphi_k\rangle$. Hence, any limit is bounded by $\inf_K \|A + K\|$ (the infimum taken over compact operators), and for any $A \in \Psi^0$, $\|\sigma_A\|_{L^\infty} = \inf_K \|A + K\|$. Hence, any weak limit is bounded by a constant times $\|\sigma_A\|_{L^\infty}$ and is therefore continuous on $C(S^*M)$. It is a positive functional since each $\rho_j$, and hence any limit, is a probability measure. By Egorov's theorem and the invariance of the $\rho_k$, any limit of $\rho_k(A)$ is a limit of $\rho_k(\mathrm{Op}(\sigma_A \circ \Phi^t))$ and hence the limit measure is invariant.

Problem 1 is thus to identify which invariant measures in $\mathcal{M}_I$ show up as weak limits of the functionals $\rho_k$ or equivalently the distributions $d\Phi_k$. The weak limits reflect the concentration and oscillation properties of eigenfunctions. Here are some possibilities:

1. Normalized Liouville measure. In fact, the functional $\omega$ of [12] is also a state on $\Psi^0$ for the reason explained above. A subsequence $\{\varphi_{k_j}\}$ of eigenfunctions is considered diffuse if $\rho_{k_j} \to \omega$.
2. A periodic orbit measure $\mu_\gamma$ defined by
\[ \mu_\gamma(A) = \frac{1}{L_\gamma}\int_\gamma \sigma_A\,ds \]
where $L_\gamma$ is the length of $\gamma$. A sequence of eigenfunctions for which $\rho_{k_j} \to \mu_\gamma$ obviously concentrates (or strongly "scars") on the closed geodesic.
3. A finite sum of periodic orbit measures.
4. A delta-function along an invariant Lagrangian manifold $\Lambda \subset S^*M$. The associated eigenfunctions are viewed as "localizing" along $\Lambda$.
5. A more general invariant measure which is singular with respect to $d\mu$.

All of these possibilities can and do happen in different examples. If $d\Phi_{k_j} \to \omega$, then in particular we have

\[ \frac{1}{\mathrm{Vol}(M)}\int_E |\varphi_{k_j}(x)|^2\,d\mathrm{Vol} \to \frac{\mathrm{Vol}(E)}{\mathrm{Vol}(M)} \]

for any measurable set $E$ whose boundary has measure zero. Interpreting $|\varphi_{k_j}(x)|^2\,d\mathrm{Vol}$ as the probability density of finding a particle of energy $\lambda_k^2$ at $x$, this result means that the sequence of probabilities tends to uniform measure.

However, $d\Phi_{k_j} \to \omega$ is much stronger since it says that the eigenfunctions become diffuse on the energy surface $S^*M$ and not just on the configuration space $M$. As an example, consider the flat torus $\mathbb{R}^n/\mathbb{Z}^n$. An orthonormal basis of eigenfunctions is furnished by the standard exponentials $e^{2\pi i\langle k, x\rangle}$ with $k \in \mathbb{Z}^n$. Obviously, $|e^{2\pi i\langle k, x\rangle}|^2 = 1$, so the eigenfunctions are already diffuse in configuration space. On the other hand, they are far from diffuse in phase space, and localize on invariant Lagrangian tori in $S^*M$. Indeed, by definition of pseudodifferential operator, $A e^{2\pi i\langle k, x\rangle} = a(x, k)\,e^{2\pi i\langle k, x\rangle}$, where $a(x, k)$ is the complete symbol. Thus,

\[ \langle A e^{2\pi i\langle k, x\rangle}, e^{2\pi i\langle k, x\rangle}\rangle = \int_{\mathbb{R}^n/\mathbb{Z}^n} a(x, k)\,dx \sim \int_{\mathbb{R}^n/\mathbb{Z}^n} \sigma_A\!\left(x, \frac{k}{|k|}\right) dx \]

A subsequence $e^{2\pi i\langle k_j, x\rangle}$ of eigenfunctions has a weak limit if and only if $k_j/|k_j|$ tends to a limit vector $\xi_0$ in the unit sphere in $\mathbb{R}^n$. In this case, the associated

weak limit is $\int_{\mathbb{R}^n/\mathbb{Z}^n} \sigma_A(x, \xi_0)\,dx$, that is, the delta-function on the invariant torus $T_{\xi_0} \subset S^*M$ defined by the constant momentum condition $\xi = \xi_0$. The eigenfunctions are said to localize on this invariant torus for $\Phi^t$.

The flat torus is a model of a completely integrable system on both the classical and quantum levels. Another example is that of the standard round sphere $S^n$. In this case, the author and D Jakobson showed that absolutely any invariant measure $\mu \in \mathcal{M}_I$ can arise as a weak limit of a sequence of eigenfunctions. This reflects the huge degeneracy (multiplicities) of the eigenvalues.

On the other hand, if the geodesic flow is ergodic, one would expect the eigenfunctions to be diffuse in phase space. In the next section, we will discuss the rigorous results on this problem.

Off-diagonal matrix elements

\[ \rho_{jk}(A) = \langle A\varphi_k, \varphi_j\rangle \qquad [16] \]

are also important as transition amplitudes between states. They no longer define states, since $\rho_{jk}(I) = 0$ and they are neither positive nor invariant. Indeed, $\rho_{jk}(U_t A U_t^*) = e^{it(\lambda_j - \lambda_k)}\rho_{jk}(A)$, so they are eigenvectors of the automorphism $\alpha_t$ of [1]. A sequence of such matrix elements cannot have a weak limit unless the spectral gap $\lambda_j - \lambda_k$ tends to a limit $\tau \in \mathbb{R}$. In this case, by the same discussion as above, any weak limit of the functionals $\rho_{jk}$ will be an eigenmeasure of the geodesic flow which transforms by $e^{i\tau t}$ under the action of $\Phi^t$. Examples of such eigenmeasures are orbital Fourier coefficients

\[ \frac{1}{L_\gamma}\int_0^{L_\gamma} e^{-i\tau t}\,\sigma_A(\Phi^t(x, \xi))\,dt \]

along a periodic orbit. Here, $\tau \in (2\pi/L_\gamma)\mathbb{Z}$. We denote by $\mathcal{Q}_\tau$ such eigenmeasures of the geodesic flow. Problem 1 has the following extension to off-diagonal elements:

Problem 2 Determine the set $\mathcal{Q}_\tau$ of "quantum limits," that is, weak limit points of the sequence $\{\Phi_{kj}\}$ of distributions on the classical phase space $S^*M$, defined by

\[ \int_X a\,d\Phi_{kj} := \langle \mathrm{Op}(a)\varphi_k, \varphi_j\rangle \]

where $\lambda_j - \lambda_k = \tau + o(1)$ and where $a \in C^\infty(S^*M)$, or equivalently of the functionals $\rho_{jk}$.

As will be discussed in the section "Quantum weak mixing," the asymptotics of off-diagonal elements depends on the weak mixing properties of the geodesic flow and not just its ergodicity.

Matrix elements of eigenfunctions are quadratic forms. More "nonlinear" problems involve the $L^p$-norms or the distribution functions of eigenfunctions. Estimates of the $L^\infty$-norms can be obtained from the local Weyl law [10]. Since the jump in the left-hand side at $\lambda$ is $\sum_{j : \lambda_j = \lambda} |\varphi_j(x)|^2$ and the jump in the right-hand side is the jump of $R(\lambda, x)$, this implies

\[ \sum_{j : \lambda_j = \lambda} |\varphi_j(x)|^2 = O(\lambda^{n-1}) \implies \|\varphi_j\|_{L^\infty} = O(\lambda^{\frac{n-1}{2}}) \qquad [17] \]

For general $L^p$-norms, the following bounds were proved by C Sogge for any compact Riemannian manifold:

\[ \frac{\|\varphi_j\|_p}{\|\varphi\|_2} = O(\lambda^{\delta(p)}), \quad 2 \le p \le \infty \qquad [18] \]

where

\[ \delta(p) = \begin{cases} n\left(\frac{1}{2} - \frac{1}{p}\right) - \frac{1}{2}, & \frac{2(n+1)}{n-1} \le p \le \infty \\[4pt] \frac{n-1}{2}\left(\frac{1}{2} - \frac{1}{p}\right), & 2 \le p \le \frac{2(n+1)}{n-1} \end{cases} \qquad [19] \]

These estimates are sharp on the unit sphere $S^n \subset \mathbb{R}^{n+1}$. The extremal eigenfunctions are the zonal spherical harmonics, which are the $L^2$-normalized spectral projection kernels $\Pi_N(x, x_0)/\|\Pi_N(\cdot, x_0)\|$ centered at any $x_0$. However, they are not sharp for generic $(M, g)$, and it is natural to ask how "chaotic dynamics" might influence $L^p$-norms.

Problem 3 Improve the estimates $\|\varphi_j\|_p/\|\varphi\|_2 = O(\lambda^{\delta(p)})$ for $(M, g)$ with ergodic or mixing geodesic flow.

C Sogge and the author have proved that if a sequence of eigenfunctions attains the bounds in [17], then there must exist a point $x_0$ so that a positive measure of geodesics starting at $x_0$ in $S^*_{x_0}M$ returns to $x_0$ at a fixed time $T$. In the real analytic case, all return so $x_0$ is a perfect recurrent point. In dimension 2, such a perfect recurrent point cannot occur if the geodesic flow is ergodic; hence $\|\varphi_j\|_{L^\infty} = o(\lambda^{(n-1)/2})$ on any real analytic surface with ergodic geodesic flow. This shows that none of the $L^p$-estimates above the critical index are sharp for real analytic surfaces with ergodic geodesic flow, and the problem is the extent to which they can be improved.

The random wave model (see the section "Random waves and orthonormal bases") predicts that eigenfunctions of Riemannian manifolds with chaotic geodesic flow should have the bounds $\|\varphi_\lambda\|_{L^p} = O(1)$ for $p < \infty$ and that $\|\varphi_\lambda\|_{L^\infty} < \sqrt{\log\lambda}$. But there are

no rigorous estimates at this time close to such predictions. The best general estimate to date on negatively curved compact manifolds (which are models of chaotic geodesic flow) is just the logarithmic improvement

\[ \|\varphi_j\|_{L^\infty} = O\!\left(\frac{\lambda^{n-1}}{\log\lambda}\right) \]

on the standard remainder term in the local Weyl law. This was known for compact hyperbolic manifolds from the Selberg trace formula, and similar estimates hold for manifolds without conjugate points (P Bérard). The exponential growth of the geodesic flow again causes a barrier in improving the estimate beyond the logarithm. In the analogous setting of quantum "cat maps," which are models of chaotic classical dynamics, there exist arbitrarily large eigenvalues with multiplicities of the order $O(\lambda^{n-1}/\log\lambda)$; the $L^\infty$-norm of the $L^2$-normalized projection kernel onto an eigenspace of this multiplicity is of order of the square root of the multiplicity (Faure et al. 2003). This raises doubt that the logarithmic estimate can be improved by general dynamical arguments. Further discussion of $L^\infty$-norms, as well as zeros, will be given at the end of the next section for ergodic systems.

Quantum Ergodicity

In this section, we discuss results on the problems stated above when the geodesic flow of $(M, g)$ is assumed to be ergodic. Let us recall that this means that Liouville measure is an ergodic measure for $\Phi^t$. This is a spectral property of the operator $V_t$ of [2] on $L^2(S^*M, d\mu)$, namely that $V_t$ has 1 as an eigenvalue of multiplicity 1. That is, the only invariant $L^2$-functions (with respect to Liouville measure) are the constant functions. This implies that the only invariant sets have Liouville measure 0 or 1 and (Birkhoff's ergodic theorem) that time averages of functions are constant almost everywhere (equal to the space average).

In this case, there is a general result which originated in the work of Schnirelman and was developed into the following theorem by Zelditch, Colin de Verdière, and Sunada (manifolds without boundary), and Gérard–Leichtnam and Zelditch–Zworski (manifolds with boundary). The following discussion is based on the articles (Zelditch 1996b, c, Zelditch and Zworski 1996), which contain further references to the literature.

Theorem 1 Let $(M, g)$ be a compact Riemannian manifold (possibly with boundary), and let $\{\lambda_j, \varphi_j\}$ be the spectral data of its Laplacian $\Delta$. Then the geodesic flow $G^t$ is ergodic on $(S^*M, d\mu)$ if and only if, for every $A \in \Psi^0(M)$, we have:

(i) $\lim_{\lambda\to\infty} \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} |(A\varphi_j, \varphi_j) - \omega(A)|^2 = 0$.

(ii) $(\forall\epsilon)(\exists\delta)\ \limsup_{\lambda\to\infty} \frac{1}{N(\lambda)} \sum_{j \ne k :\ \lambda_j, \lambda_k \le \lambda,\ |\lambda_j - \lambda_k| < \delta} |(A\varphi_j, \varphi_k)|^2 < \epsilon$.

This implies that there exists a subsequence $\{\varphi_{j_k}\}$ of eigenfunctions whose indices $j_k$ have counting density 1 for which $\langle A\varphi_{j_k}, \varphi_{j_k}\rangle \to \omega(A)$. We will call the eigenfunctions in such a sequence "ergodic eigenfunctions." One can sharpen the results by averaging over eigenvalues in the shorter interval $[\lambda, \lambda+1]$ rather than in $[0, \lambda]$.

There is also an ergodicity result for boundary values of eigenfunctions on domains with boundary and with Dirichlet, Neumann, or Robin boundary conditions (Gérard–Leichtnam, Hassell–Zelditch, Burq). This corresponds to the fact that the billiard map on $B^*\partial M$ is ergodic.

The first statement (i) is essentially a convexity result. It remains true if one replaces the square by any convex function $\varphi$ on the spectrum of $A$,

\[ \frac{1}{N(E)} \sum_{\lambda_j \le E} \varphi(\langle A\varphi_j, \varphi_j\rangle - \omega(A)) \to 0 \qquad [20] \]

Before sketching a proof, we point out a somewhat heuristic "picture proof" of the theorem. Namely, ergodicity of the geodesic flow is equivalent to the statement that Liouville measure is an extreme point of the compact convex set $\mathcal{M}_I$. In fact, it further implies that $\omega$ is an extreme point of the compact convex set $\mathcal{E}_{\mathbb{R}}$ of invariant states for $\alpha_t$ of eqn [1]; see Ruelle (1969) for background. But the local Weyl law says that $\omega$ is also the limit of the convex combination

\[ \frac{1}{N(E)} \sum_{\lambda_j \le E} \rho_j \]

An extreme point cannot be written as a convex combination of other states unless all the states in the combination are equal to it. In our case, $\omega$ is only a limit of convex combinations so it need not (and does not) equal each term. However, almost all terms in the sequence must tend to $\omega$, and that is equivalent to (i).

Sketch of Proof of Theorem 1(i) As mentioned above, this is a convexity result and with no additional effort we can consider more general sums of the form [20]. We then have

\[ \sum_{\lambda_j \le E} \varphi(\langle A\varphi_j, \varphi_j\rangle - \omega(A)) = \sum_{\lambda_j \le E} \varphi(\langle (\langle A\rangle_T - \omega(A))\varphi_j, \varphi_j\rangle) \qquad [21] \]

where

\[ \langle A\rangle_T = \frac{1}{2T}\int_{-T}^{T} U_t^* A U_t\,dt \]

We then apply the Peierls–Bogoliubov inequality

\[ \sum_{j=1}^{n} \varphi((B\varphi_j, \varphi_j)) \le \operatorname{tr}\varphi(B) \]

with $B = \Pi_E[\langle A\rangle_T - \omega(A)]\Pi_E$ to get

\[ \sum_{\lambda_j \le E} \varphi(\langle (\langle A\rangle_T - \omega(A))\varphi_j, \varphi_j\rangle) \le \operatorname{tr}\varphi(\Pi_E[\langle A\rangle_T - \omega(A)]\Pi_E) \qquad [22] \]

Here, $\Pi_E$ is the spectral projection for $\hat{H}$ corresponding to the interval $[0, E]$. From the Berezin inequality we then have (if $\varphi(0) = 0$):

\[ \frac{1}{N(E)}\operatorname{tr}\varphi(\Pi_E[\langle A\rangle_T - \omega(A)]\Pi_E) \le \frac{1}{N(E)}\operatorname{tr}\Pi_E\,\varphi([\langle A\rangle_T - \omega(A)])\,\Pi_E \to \omega_E(\varphi(\langle A\rangle_T - \omega(A))), \quad \text{as } E \to \infty \]

As long as $\varphi$ is smooth, $\varphi(\langle A\rangle_T - \omega(A))$ is a pseudodifferential operator of order zero with principal symbol $\varphi(\langle\sigma_A\rangle_T - \omega(A))$. By the assumption that $\omega_E \to \omega$ we get

\[ \lim_{E\to\infty} \frac{1}{N(E)} \sum_{\lambda_j \le E} \varphi(\langle A\varphi_j, \varphi_j\rangle - \omega(A)) \le \int_{\{H = 1\}} \varphi(\langle\sigma_A\rangle_T - \omega(A))\,d\mu \]

where

\[ \langle\sigma_A\rangle_T = \frac{1}{2T}\int_{-T}^{T} \sigma_A \circ \Phi^t\,dt \]

As $T \to \infty$ the right-hand side approaches $\varphi(0) = 0$ by the dominated convergence theorem and by Birkhoff's ergodic theorem. Since the left-hand side is independent of $T$, this implies that

\[ \lim_{E\to\infty} \frac{1}{N(E)} \sum_{\lambda_j \le E} \varphi(\langle A\varphi_j, \varphi_j\rangle - \omega(A)) = 0 \]

for any smooth convex $\varphi$ on $\mathrm{Spec}(A)$ with $\varphi(0) = 0$. $\square$

As mentioned above, the statement of Theorem 1(i) is equivalent to saying that there is a subsequence $\{\varphi_{j_k}\}$ of counting density 1 for which $\rho_{j_k} \to \omega$. The above proof does not and cannot settle the question whether there exist exceptional sparse subsequences of eigenfunctions of density zero tending to other invariant measures. To see this, we observe that the proof is so general that it applies to seemingly very different situations. In place of the distributions $\{\Phi_j\}$ we may consider the set $\mu_\gamma$ of periodic orbit measures for a hyperbolic flow on a compact manifold $X$. That is,

\[ \mu_\gamma(f) = \frac{1}{T_\gamma}\int_\gamma f \quad \text{for } f \in C(X) \]

where $\gamma$ is a closed orbit and $T_\gamma$ is its period. According to the Bowen–Margulis equidistribution theorem for closed orbits of hyperbolic flows, we have

\[ \frac{1}{\Pi(T)} \sum_{\gamma :\ T_\gamma \le T} \frac{1}{|\det(I - P_\gamma)|}\,\mu_\gamma \to \mu \]

where as above $\mu$ is the Liouville measure, where $P_\gamma$ is the linear Poincaré map and where $\Pi(T)$ is the normalizing factor which makes the left side a probability measure, that is, defined by the integral of 1 against the sum. An exact repetition of the previous argument shows that up to a sparse subsequence of $\gamma$'s, $\mu_\gamma \to \mu$ individually. Yet clearly, the whole sequence does not tend to $d\mu$: for instance, one could choose the sequence of iterates $\gamma^k$ of a fixed closed orbit.

Quantum Ergodicity in Terms of Operator Time and Space Averages

The first part of the result above may be reformulated as a relation between operator time and space averages.

Definition Let $A \in \Psi^0$ be an observable and define its time average to be:

\[ \langle A\rangle := \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} U_t^* A U_t\,dt \]

and its space average to be the scalar operator

\[ \omega(A)\cdot I \]

Here, the limit is taken in the weak operator topology (i.e., one matrix element at a time). To see what is involved, we consider matrix elements with respect to the eigenfunctions. We have

\[ \left(\frac{1}{2T}\int_{-T}^{T} U_t^* A U_t\,dt\ \varphi_i, \varphi_j\right) = \frac{\sin T(\lambda_i - \lambda_j)}{T(\lambda_i - \lambda_j)}\,(A\varphi_i, \varphi_j) \]

from which it is clear that the matrix element tends to zero as $T \to \infty$ unless $\lambda_i = \lambda_j$. However, there is no uniformity in the rate at which it goes to zero since the spacing $\lambda_i - \lambda_j$ could be uncontrollably small.

In these terms, Theorem 1(i) states that

\[ \langle A\rangle = \omega(A)I + K, \quad \text{where } \lim_{\lambda\to\infty} \omega_\lambda(K^*K) = 0 \qquad [23] \]

where $\omega_\lambda(A) = \frac{1}{N(\lambda)}\operatorname{tr} E(\lambda)A$. Thus, the time average equals the space average plus a term $K$ which is semiclassically small in the sense that its Hilbert–Schmidt norm square $\|E_\lambda K\|^2_{HS}$ in the span of the eigenfunctions of eigenvalue $\le \lambda$ is $o(N(\lambda))$.

This is not exactly equivalent to Theorem 1(i) since it is independent of the choice of orthonormal basis, while the previous result depends on the choice of basis. However, when all eigenvalues have multiplicity 1, then the two are equivalent. To see the equivalence, note that $\langle A\rangle$ commutes with $\sqrt{\Delta}$ and hence is diagonal in the basis $\{\varphi_j\}$ of joint eigenfunctions of $\langle A\rangle$ and of $U_t$. Hence, $K$ is the diagonal matrix with entries $\langle A\varphi_k, \varphi_k\rangle - \omega(A)$. The condition is therefore equivalent to

\[ \lim_{E\to\infty} \frac{1}{N(E)} \sum_{\lambda_k \le E} |\langle A\varphi_k, \varphi_k\rangle - \omega(A)|^2 = 0 \]

Since all the terms are positive, no cancellation is possible and this condition is equivalent to the existence of a subset $S \subset \mathbb{N}$ of density 1 such that $\mathcal{Q}_S := \{d\Phi_k : k \in S\}$ has only $\omega$ as a weak limit point. As above, one says that the sequence of eigenfunctions is ergodic.

One could take this restatement of Theorem 1(i) as a semiclassical definition of quantum ergodicity. Two natural questions arise. First:

Problem 4 Suppose the geodesic flow $\Phi^t$ of $(M, g)$ is ergodic on $S^*M$. Is the operator $K$ in

\[ \langle A\rangle = \omega(A) + K \]

a compact operator? In this case, $\sqrt{\Delta}$ is said to be quantum uniquely ergodic (QUE). If ergodicity is not sufficient for the QUE property, what extra conditions need to be added?

Compactness would imply that $\langle K\varphi_k, \varphi_k\rangle \to 0$, hence $\langle A\varphi_k, \varphi_k\rangle \to \omega(A)$ along the entire sequence. Quite a lot of attention has been focused on this problem in the last decade. It is probable that ergodicity is not by itself sufficient for the QUE property of general Riemannian manifolds. For instance, it is believed that there exist modes of asymptotic bouncing ball type which concentrate on the invariant Lagrangian cylinder (with boundary) formed by bouncing ball orbits of the Bunimovich stadium (see e.g., Heller (1984) for more on such "scarring"). Further, Faure et al. (2003) have shown that QUE does not hold for the hyperbolic system defined by a quantum cat map on the torus. Since the methods applicable to eigenfunctions of quantum maps and of Laplacians have much in common, this negative result shows that there cannot exist a universal structural proof of QUE.

The principal positive result available at this time is the recent proof by Lindenstrauss of the QUE property for the orthonormal basis of Laplace–Hecke eigenfunctions on arithmetic hyperbolic surfaces. It is generally believed that the spectrum of the Laplace eigenvalues is of multiplicity 1 for such surfaces, so this should imply QUE completely for these surfaces. Earlier partial results on Hecke eigenfunctions are due to Rudnick–Sarnak, Wolpert, and others. For references and further discussion on Hecke eigenfunctions, see Rudnick and Sarnak (1994) (see Arithmetic Quantum Chaos).

So far we have not mentioned Theorem 1(ii). In the next section, we will describe a similar but more general result for mixing systems and the relevance of (ii) will become clear. An interesting open problem is the extent to which (ii) is actually necessary for the equivalence to classical ergodicity.

Problem 5 Converse QE: What can be said of the classical limit of a quantum ergodic system, that is, a system for which $\langle A\rangle = \omega(A) + K$, where $K$ is compact? Is it necessarily ergodic?

Very little is known on this converse problem at present. It is known that if there exists an open set in $S^*M$ filled by periodic orbits, then the Laplacian cannot be quantum ergodic (see Marklof and O'Keefe (2005) for recent results and references). But no proof exists at this time that KAM systems, which have Cantor-like positive measure invariant sets, are not quantum ergodic. It is known that there exists a positive proportion of approximate eigenfunctions (quasimodes) which localize on the invariant tori, but it has not been proved that a positive proportion of actual eigenfunctions has this localization property.

Further Problems and Results on Ergodic Eigenfunctions

Ergodicity is also known to have an impact on the distribution of zeros. The complex zeros in Kähler phase spaces of ergodic eigenfunctions of quantum ergodic maps become uniformly distributed with respect to the Kähler volume form (Nonnenmacher–Voros, Shiffman–Zelditch). An interesting problem is whether the real analog is true:

Problem 6 Ergodicity and equidistribution of nodal sets. Let $\mathcal{N}_{\varphi_j} \subset M$ denote the nodal set (zero set) of $\varphi_j$, and equip it with its hypersurface volume form $d\mathcal{H}^{n-1}$ induced by $g$. Let $(M, g)$ have ergodic geodesic flow, and suppose that $\{\varphi_j\}$ is an ergodic

sequence of eigenfunctions. Are the following asymptotics valid?

\[ \int_{\mathcal{N}_{\varphi_j}} f\,d\mathcal{H}^{n-1} \sim \lambda_j\,\frac{1}{\mathrm{Vol}(M, g)}\int_M f\,d\mathrm{Vol} \]

This is predicted by the random wave model of the section "Random waves and orthonormal bases." An equidistribution law for the complex zeros is known which gives some evidence for the validity of this limit formula. Let $(M, g)$ be a compact real analytic Riemannian manifold and let $\varphi_j^{\mathbb{C}}$ be the holomorphic extension of the real analytic eigenfunction $\varphi_j$ to the complexification $M_{\mathbb{C}}$ of $M$ (its Grauert tube). Then, if the geodesic flow is ergodic and if $\varphi_j$ is an ergodic sequence of eigenfunctions, the normalized current of integration $(1/\lambda_j) Z_{\varphi_j^{\mathbb{C}}}$ over the complex zero set of $\varphi_j^{\mathbb{C}}$ tends weakly to $(i/\pi)\,\partial\bar\partial\,|\xi|_g$. This current is singular along the zero section.

Finally, we mention some results on $L^\infty$-norms of eigenfunctions on arithmetic hyperbolic manifolds of dimensions 2 and 3. It was proved by Iwaniec–Sarnak that the joint eigenfunctions of $\Delta$ and the Hecke operators on arithmetic hyperbolic surfaces have the upper bound $\|\varphi_j\|_\infty = O_\epsilon(\lambda_j^{5/48 + \epsilon})$ for all $j$ and $\epsilon > 0$, and the lower bound $\|\varphi_j\|_\infty \ge c\sqrt{\log\log\lambda_j}$ for some constant $c > 0$ and infinitely many $j$. Rudnick and Sarnak (1994) proved that there exists an arithmetic hyperbolic manifold and a subsequence $\varphi_{j_k}$ of eigenfunctions with $\|\varphi_{j_k}\|_{L^\infty} \gg \lambda_{j_k}^{1/4}$, contradicting the random wave model predictions.

Quantum Weak Mixing

There are parallel results on quantizations of weak-mixing geodesic flows which are the subject of this section. First we recall the classical definition: the geodesic flow of $(M, g)$ is weak mixing if the operator $V_t$ has purely continuous spectrum on the orthogonal complement of the constant functions in $L^2(S^*M, d\mu)$. Hence, like ergodicity, it is a spectral property of the geodesic flow.

We have:

Theorem 2 (Zelditch 1996c). The geodesic flow $\Phi^t$ of $(M, g)$ is weak mixing if and only if the conditions (i) and (ii) of Theorem 1 hold and additionally, for any $A \in \Psi^0(M)$,

\[ (\forall\epsilon)(\exists\delta)\ \limsup_{\lambda\to\infty} \frac{1}{N(\lambda)} \sum_{j \ne k :\ \lambda_j, \lambda_k \le \lambda,\ |\lambda_j - \lambda_k - \tau| < \delta} |(A\varphi_j, \varphi_k)|^2 < \epsilon \quad (\forall\tau \in \mathbb{R}) \]

The restriction $j \ne k$ is of course redundant unless $\tau = 0$, in which case the statement coincides with quantum ergodicity. This result follows from the general asymptotic formula, valid for any compact Riemannian manifold $(M, g)$, that

\[ \frac{1}{N(\lambda)} \sum_{i \ne j,\ \lambda_i, \lambda_j \le \lambda} |\langle A\varphi_i, \varphi_j\rangle|^2 \left|\frac{\sin T(\lambda_i - \lambda_j - \tau)}{T(\lambda_i - \lambda_j - \tau)}\right|^2 \sim \left\|\frac{1}{2T}\int_{-T}^{T} e^{it\tau}\,V_t(\sigma_A)\,dt\right\|_2^2 - \left|\frac{\sin T\tau}{T\tau}\right|^2 \omega(A)^2 \qquad [24] \]

In the case of weak-mixing geodesic flows, the right-hand side tends to 0 as $T \to \infty$. As with diagonal sums, the sharper result is true where one averages over the short intervals $[\lambda, \lambda+1]$.

Spectral Measures and Matrix Elements

Theorem 2 is based on expressing the spectral measures of the geodesic flow in terms of matrix elements. The main limit formula is

\[ \int_{\tau-\varepsilon}^{\tau+\varepsilon} d\mu_{\sigma_A} := \lim_{\lambda\to\infty} \frac{1}{N(\lambda)} \sum_{i, j :\ \lambda_j \le \lambda,\ |\lambda_i - \lambda_j - \tau| < \varepsilon} |\langle A\varphi_i, \varphi_j\rangle|^2 \qquad [25] \]

where $d\mu_{\sigma_A}$ is the spectral measure for the geodesic flow corresponding to the principal symbol of $A$, $\sigma_A \in C^\infty(S^*M, d\mu)$. Recall that the spectral measure of $V_t$ corresponding to $f \in L^2$ is the measure $d\mu_f$ defined by

\[ \langle V_t f, f\rangle_{L^2(S^*M)} = \int_{\mathbb{R}} e^{it\tau}\,d\mu_f(\tau) \]

The limit formula [25] is equivalent to the dual formula (under the Fourier transform):

\[ \lim_{\lambda\to\infty} \frac{1}{N(\lambda)} \sum_{i, j :\ \lambda_j \le \lambda} e^{it(\lambda_i - \lambda_j)}\,|\langle A\varphi_i, \varphi_j\rangle|^2 = \langle V_t\sigma_A, \sigma_A\rangle_{L^2(S^*M)} \qquad [26] \]

The proof of [26] is to consider, for $A \in \Psi^0$, the operator $A_t^* A \in \Psi^0$ with $A_t = U_t^* A U_t$. By the local Weyl law,

\[ \lim_{\lambda\to\infty} \frac{1}{N(\lambda)}\operatorname{tr} E(\lambda) A_t^* A = \langle V_t\sigma_A, \sigma_A\rangle_{L^2(S^*M)} \]

The right-hand side of [25] defines a measure $dm_A$ on $\mathbb{R}$ and [26] says

\[ \int_{\mathbb{R}} e^{it\tau}\,dm_A(\tau) = \langle V_t\sigma_A, \sigma_A\rangle_{L^2(S^*M)} = \int_{\mathbb{R}} e^{it\tau}\,d\mu_{\sigma_A}(\tau) \]
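The algebraic identity behind [26], namely that the trace of $A_t^* A$ is a sum of $e^{it(\lambda_i - \lambda_j)}$ weighted by squared matrix elements, can be verified in a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not the article's construction: the "Laplacian" is replaced by a diagonal matrix with hypothetical eigenvalues `lams`, the observable by a random Hermitian matrix (so that the index order in the matrix elements does not matter), and there is no symbol calculus or local Weyl law involved.

```python
import cmath
import random


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[c][r].conjugate() for c in range(n)] for r in range(n)]


def propagator(lams, t):
    """U_t = diag(exp(i t lam_r)) in the eigenbasis."""
    n = len(lams)
    return [[cmath.exp(1j * t * lams[r]) if r == c else 0j for c in range(n)]
            for r in range(n)]


def random_hermitian(n, seed=0):
    """Random Hermitian matrix (g + g^*) / 2 with Gaussian entries."""
    rng = random.Random(seed)
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
         for _ in range(n)]
    gd = dagger(g)
    return [[(g[r][c] + gd[r][c]) / 2 for c in range(n)] for r in range(n)]


def trace_side(a, lams, t):
    """(1/N) tr(A_t^* A) with A_t = U_t^* A U_t."""
    u = propagator(lams, t)
    a_t = matmul(matmul(dagger(u), a), u)
    prod = matmul(dagger(a_t), a)
    n = len(lams)
    return sum(prod[r][r] for r in range(n)) / n


def spectral_side(a, lams, t):
    """(1/N) sum_{i,j} exp(i t (lam_i - lam_j)) |a_ij|^2."""
    n = len(lams)
    return sum(cmath.exp(1j * t * (lams[r] - lams[c])) * abs(a[r][c]) ** 2
               for r in range(n) for c in range(n)) / n


if __name__ == "__main__":
    lams = [0.3, 1.1, 1.7, 2.9, 4.2]
    a = random_hermitian(len(lams), seed=1)
    for t in (0.0, 0.7, 3.0):
        print(t, trace_side(a, lams, t), spectral_side(a, lams, t))
```

In this model the right-hand sum is exactly the Fourier transform of the discrete measure that places mass $|a_{ij}|^2/N$ at the gap $\lambda_i - \lambda_j$, which is the finite-dimensional counterpart of $dm_A$.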

Since weak-mixing systems are ergodic, it is not necessary to average in both indices along an ergodic subsequence:

\[ \lim_{\lambda_j\to\infty} \langle A_t^* A\varphi_j, \varphi_j\rangle = \lim_{\lambda_j\to\infty} \sum_i e^{it(\lambda_i - \lambda_j)}\,|\langle A\varphi_i, \varphi_j\rangle|^2 = \langle V_t\sigma_A, \sigma_A\rangle_{L^2(S^*M)} \qquad [27] \]

Dually, one has

\[ \lim_{\lambda_j\to\infty} \sum_{i :\ |\lambda_i - \lambda_j - \tau| < \varepsilon} |\langle A\varphi_i, \varphi_j\rangle|^2 = \int_{\tau-\varepsilon}^{\tau+\varepsilon} d\mu_{\sigma_A} \qquad [28] \]

For QUE systems, these limit formulas are valid for the full sequence of eigenfunctions.

Rate of Quantum Ergodicity and Mixing

A quantitative refinement of quantum ergodicity is to ask at what rate the sums in Theorem 1(i) tend to zero, that is, to establish a rate of quantum ergodicity. More generally, we consider "variances" of matrix elements. For diagonal matrix elements, we define

\[ V_A(\lambda) := \frac{1}{N(\lambda)} \sum_{j :\ \lambda_j \le \lambda} |\langle A\varphi_j, \varphi_j\rangle - \omega(A)|^2 \qquad [29] \]

In the off-diagonal case, one may view $|\langle A\varphi_i, \varphi_j\rangle|^2$ as analogous to $|\langle A\varphi_j, \varphi_j\rangle - \omega(A)|^2$. However, the sums in [25] are double sums while those of [29] are single. One may also average over the shorter intervals $[\lambda, \lambda+1]$.

Quantum Chaos Conjectures

First, consider off-diagonal matrix elements. One conjecture is that it is not necessary to sum in $j$ in [28]: each individual term has the asymptotics consistent with [28]. This is implicitly conjectured by Feingold–Peres (1986) (see [11]) in the form

\[ |\langle A\varphi_i, \varphi_j\rangle|^2 \simeq \frac{C_A\!\left(\frac{E_i - E_j}{h}\right)}{2\pi\rho(E)} \qquad [30] \]

where

\[ C_A(\tau) = \int_{-\infty}^{\infty} e^{-i\tau t}\,\langle V_t\sigma_A, \sigma_A\rangle\,dt \]

In our notation, $\lambda_j = h^{-1}E_j$ and $\rho(E)\,dE \sim dN(\lambda)$. There are $\sim C\lambda^{n-1}$ eigenvalues $\lambda_i$ in the interval $[\lambda_j - \tau - \epsilon, \lambda_j - \tau + \epsilon]$, so [30] states that individual terms have the asymptotics of [28].

On the basis of the analogy between $|\langle A\varphi_i, \varphi_j\rangle|^2$ and $|\langle A\varphi_j, \varphi_j\rangle - \omega(A)|^2$, it is conjectured in Feingold and Peres (1986) that

\[ V_A(\lambda) \sim \frac{C_{A - \omega(A)I}(0)}{\lambda^{n-1}\,\mathrm{vol}(\Omega)} \]

The idea is that $\varphi_\pm = (1/\sqrt{2})(\varphi_i \pm \varphi_j)$ have the same matrix element asymptotics as eigenfunctions when $\lambda_i - \lambda_j$ is sufficiently small. But then $2\langle A\varphi_+, \varphi_-\rangle = \langle A\varphi_i, \varphi_i\rangle - \langle A\varphi_j, \varphi_j\rangle$ when $A^* = A$. Since we are taking a difference, we may replace each matrix element $\langle A\varphi_i, \varphi_i\rangle$ by $\langle A\varphi_i, \varphi_i\rangle - \omega(A)$ (and also for $\varphi_j$). The conjecture then assumes that $\langle A\varphi_i, \varphi_i\rangle - \omega(A)$ has the same order of magnitude as $\langle A\varphi_i, \varphi_i\rangle - \langle A\varphi_j, \varphi_j\rangle$. Dynamical grounds for this conjecture are given in Eckhardt et al. (1995). The order of magnitude is predicted by some natural random wave models, as discussed in the next section.

Rigorous results

At this time, the strongest variance result is an asymptotic formula for the diagonal variance proved by Luo and Sarnak (2004) for special Hecke eigenfunctions on the quotient $\mathbb{H}^2/SL(2, \mathbb{Z})$ of the upper half plane by the modular group. Their result pertains to holomorphic Hecke eigenforms, but the analogous statement for smooth Maass–Hecke eigenfunctions is expected to hold by similar methods, so we state the result as a theorem/conjecture. Note that $\mathbb{H}^2/SL(2, \mathbb{Z})$ is a noncompact finite-area surface whose Laplacian $\Delta$ has both a discrete and a continuous spectrum. The discrete Hecke eigenfunctions are joint eigenfunctions of $\Delta$ and the Hecke operators $T_p$.

Theorem/Conjecture 1 (Luo and Sarnak 2004). Let $\{\varphi_k\}$ denote the orthonormal basis of Hecke eigenfunctions for $\mathbb{H}^2/SL(2, \mathbb{Z})$. Then there exists a quadratic form $B(f)$ on $C_0^\infty(\mathbb{H}^2/SL(2, \mathbb{Z}))$ such that

\[ \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} \left|\int_X f\,|\varphi_j|^2\,d\mathrm{vol} - \frac{1}{\mathrm{Vol}(X)}\int_X f\,d\mathrm{Vol}\right|^2 = \frac{B(f, f)}{\lambda} + o\!\left(\frac{1}{\lambda}\right) \]

When the multiplier $f = \varphi_\lambda$ is itself an eigenfunction, Luo–Sarnak have shown that

\[ B(\varphi_\lambda, \varphi_\lambda) = C_{\varphi_\lambda}(0)\,L(\tfrac{1}{2}, \varphi_\lambda) \]

where $L(\tfrac{1}{2}, \varphi_\lambda)$ is a certain $L$-function. Thus, the conjectured classical variance is multiplied by an arithmetic factor depending on the multiplier. A crucial fact in the proof is that the quadratic form $B$ is diagonalized by the $\varphi_\lambda$.

The only rigorous result to date which is valid on general Riemannian manifolds with hyperbolic geodesic flow is the logarithmic decay:

Theorem 3 (Zelditch). For any $(M, g)$ with hyperbolic geodesic flow,

\[ \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} |(A\varphi_j, \varphi_j) - \omega(A)|^{2p} = O\!\left(\frac{1}{(\log\lambda)^p}\right) \]

The logarithm reflects the exponential blow-up in time of remainder estimates for traces involving the wave group associated to hyperbolic flows. It would be surprising if the logarithmic decay is sharp for Laplacians. However, a recent result of R Schubert shows that the estimate is sharp in the case of two-dimensional hyperbolic quantum cat maps. Hence, the estimate cannot be improved by semiclassical arguments that hold in both settings.

Random Waves and Orthonormal Bases

We have mentioned that the random wave model provides a kind of guideline for what to conjecture about eigenfunctions of quantum chaotic system. In this final section, we briefly discuss random wave models and what they predict.

By a random wave model, one means a probability measure on a space of functions. To deal with orthonormal bases rather than individual functions, one sets a probability measure on a space of orthonormal bases, that is, on a unitary group. We denote expected values relative to a given probability measure by $\mathbf{E}$. We now consider some specific Gaussian models and what they predict about variances.

As a model for quantum chaotic eigenfunctions in plane domains, Berry (1977) suggested using the Euclidean random wave model at fixed energy. A rigorous version of such a model is as follows: let $\mathcal{E}_\lambda$ denote the space of (tempered) eigenfunctions of eigenvalue $\lambda^2$ of the Euclidean Laplacian $\Delta$ on $\mathbb{R}^n$. It is spanned by exponentials $e^{i\langle k, x\rangle}$ with $k \in \mathbb{R}^n$, $|k| = \lambda$. The infinite-dimensional space $\mathcal{E}_\lambda$ is a unitary representation of the Euclidean motion group and carries an invariant inner product. The inner product defines an associated Gaussian measure whose covariance kernel $C_\lambda(x, y) = \mathbf{E} f(x)f(y)$ is the derivative at $\lambda$ of the spectral function

\[ E(\lambda, x, y) = (2\pi)^{-n}\int_{|\xi| \le \lambda} e^{i\langle x - y, \xi\rangle}\,d\xi, \quad \xi \in \mathbb{R}^n \qquad [31] \]

Thus,

\[ C_\lambda(x, y) = \frac{d}{d\lambda}E(\lambda, x, y) = (2\pi)^{-n}\int_{|\xi| = \lambda} e^{i\langle x - y, \xi\rangle}\,dS = (2\pi)^{-n}\lambda^{n-1}\int_{|\xi| = 1} e^{i\lambda\langle x - y, \xi\rangle}\,dS \qquad [32] \]

where $dS$ is the usual surface measure. With this definition, $C_\lambda(x, x) \sim \lambda^{n-1}$. In order to make $\mathbf{E}(f(x)^2) = 1$ consistent with normalized eigenfunctions, we divide by $\lambda^{n-1}$ to define

\[ \hat{C}_\lambda(x, y) = (2\pi)^{-n}\int_{|\xi| = 1} e^{i\lambda\langle x - y, \xi\rangle}\,dS \]

One could express the integral as a Bessel function to rewrite this as

\[ \hat{C}_\lambda(x, y) = (2\pi)^{-n/2}\,(\lambda|x - y|)^{-(n-2)/2}\,J_{(n-2)/2}(\lambda|x - y|) \]

Wick's formula in this ensemble gives

\[ \mathbf{E}\,\varphi(x)^2\varphi(y)^2 = \frac{1}{\mathrm{Vol}(\Omega)^2}\,[1 + 2\hat{C}_\lambda(x, y)^2] \]

Thus, in dimension $n$ we have

\[ \mathbf{E}\left[\iint V(x)V(y)\,\varphi(x)^2\varphi(y)^2\,dx\,dy - \bar{V}^2\right] = \frac{2}{\mathrm{Vol}(\Omega)^2}\iint \hat{C}_\lambda(x, y)^2\,V(x)V(y)\,dx\,dy \sim \frac{1}{\lambda^{n-1}\,\mathrm{Vol}(\Omega)^2}\iint \cos(|x - y|\lambda)^2\,\frac{V(x)V(y)}{|x - y|^{n-1}}\,dx\,dy \]

In the last line, we used the stationary-phase asymptotics

\[ (2\pi)^{-n}\int_{|\xi| = 1} e^{i\lambda\langle x - y, \xi\rangle}\,dS \sim C_n\,(\lambda|x - y|)^{-(n-1)/2}\cos(|x - y|\lambda) \qquad [33] \]

Thus, the variances have order $\lambda^{-(n-1)}$ in dimension $n$, consistent with the conjectures in Feingold and Peres (1986) and Eckhardt et al. (1995).

This model is often used to obtain predictions on eigenfunctions of chaotic systems. By construction, it is tied to Euclidean geometry and only pertains directly to individual eigenfunctions of a fixed eigenvalue. It is based on the infinite-dimensional multiplicity of eigenfunctions of fixed eigenvalue of the Euclidean Laplacian on $\mathbb{R}^n$. There also exist random wave models on a curved Riemannian manifold $(M, g)$, which model individual eigenfunctions and also random orthonormal bases

(Zelditch 1996a). Thus, one can compare the asymptotics for the traces k Ak , (k Ak )2 for any
behavior of sums over eigenvalues of the orthonor- pseudodifferential operator A. Combining the strong
mal basis of eigenfunctions of  with that of a Szegö asymptotics with the arguments of Zelditch
random orthonormal basis. Instead of taking (1996a), random orthonormal bases can be proved
Gaussian random combinations of Euclidean plane to satisfy the following variance asymptotics:
waves of a fixed eigenvalue, one takes Gaussian P
P 1. Eð j:j 2Ik jðAU’j ; U’j Þ  !ðAÞj2
random combinations j : j 2[,þ1] cj ’j of the eigen-
functions of (M, g) with eigenvalues in a short  ð!ðA AÞ  !ðAÞ2 Þ;
interval in the sense above. Equivalently, one takes P  
P sinTði j  Þ2 2
random combinations with jc j 2
= 1. These 2. Eð i6¼j:j ;i 2Ik  Tði   Þ  jðAU’j ; U’i Þj
j j j

 2  2
random waves are globally adapted to (M, g). The
 2sin T  þ 1 P sinTði j  Þ
statistical results depend on the measure of the set of T NðkÞ i6¼j Tði j  Þ

periodic geodesics of (M, g); thus, as discussed in ð!ðA AÞ  !ðAÞ2 Þ


Kaplan and Heller (1998), different random wave
models make different predictions about off-
See also: Arithmetic Quantum Chaos; Determinantal
diagonal variances.
Random Fields; Eigenfunctions of Quantum Completely
Fix a compact Riemannian pffiffiffiffimanifold (M, g) and Integrable Systems; Fractal Dimensions in Dynamics;
partition the spectrum of  into the intervals h-Pseudodifferential Operators and Applications; Number
Ik = [k, k þ 1]. Let k = E(k þ 1)  E(k) pffiffiffiffibe the Theory in Physics; Regularization for Dynamical Zeta
kernel of the spectral projections for  corre- Functions; Semiclassical Spectra and Closed Orbits.
sponding to the interval Ik . Its kernel k (x, y) is
the covariance
P kernel of Gaussian random combi-
nations j : j 2Ik cj ’j and is analogous to C (x, y) in Further Reading
the Euclidean case; it is of course not Bäcker A, Schubert R, and Stifter P (1998a) Rate of quantum
the derivative dE(, x, y) but the difference of the ergodicity in Euclidean billiards. Physical Review E (3) 57(5):
spectral projector over Ik . We denote by N(k) the part A 5425–5447.
number of eigenvalues in Ik and put Hk = rank Bäcker A, Schubert R, and Stifter P (1998b) Rate of quantum
ergodicity in Euclidean billiards – erratum. Physical Review E,
(the range of k ). We define a ‘‘random’’ ortho- (3) 58(4): 5192 (math-ph/0512030).
normal basis of Hk by changing the basis of Barnett A (2005) Asymptotic rate of quantum ergodicity in
eigenfunctions {’j } of  in Hk by a random chaotic Euclidean billiards.
element of the unitary group U(Hk ) of the finite- Berry MV (1977) Regular and irregular semiclassical wavefunc-
dimensional Hilbert space Hk . We then define a tions. Journal of Physics A 10(12): 2083–2091.
Bohigas O, Giannoni MJ, and Schmit C (1984) Characterization
random orthonormal basis of L2 (M) by taking the of chaotic quantum spectra and universality of level fluctua-
product over all the spectral intervals in our tion laws. Physical Review Letters 52(1): 1–4.
partition. More precisely, we define the infinite- Eckhardt B, Fishman S, Keating J, Agam O, Main J et al. (1995)
dimensional unitary group Approach to ergodocity in quantum wave functions. Physical
Review E 52: 5893–5903.
Y
1 Faure F, Nonnenmacher S, and De Bièvre S (2003) Scarred
Uð1Þ ¼ UðHk Þ eigenstates for quantum cat maps of minimal periods.
k¼1 Communications in Mathematical Physics 239(3): 449–492.
Feingold M and Peres A (1986) Distribution of matrix elements of
of sequences (U1 , U2 , . . . ), with Uk 2 U(Hk ). We chaotic systems. Physical Review A (3) 34(1): 591–595.
equip U(1) with the product Heller EJ (1984) Bound-state eigenfunctions of classically chaotic
Hamiltonian systems: Scars of periodic orbits. Physical Review
Y
1 Letters 53(16): 1515–1518.
d 1 ¼ d k Kaplan L and Heller EJ (1998) Weak quantum ergodicity. Physica
D 121(1–2): 1–18 (Appendix by S. Zelditch.).
k¼1
Luo W and Sarnak P (2004) Quantum variance for Hecke
of the unit mass Haar measures d k on U(Hk ): we eigenforms. Annales Scientifiques de l’Ecole Normale Super-
ieure 37: 769–799.
then define a random orthonormal basis of L2 (M) to Marklof J and O’Keefe S (2005) Weyl’s law and quantum
be obtained by applying a random element ergodicity for maps with divided phase space (with an
U 2 U(1) to thepffiffiffiffiorthonormal basis  = {’j } of appendix ‘Converse quantum ergodicity’ by Steve Zelditch).
eigenfunctions of . Nonlinearity 18(1): 277–304.
Assuming the set of periodic geodesics of (M, g) Rudnick Z and Sarnak P (1994) The behaviour of eigenstates of
arithmetic hyperbolic manifolds. Communications in Mathe-
has measure zero, the Weyl remainder results [8] matical Physics 161: 195–213.
and strong Szegö limit asymptotics of Guillemin– Ruelle D (1969) Statistical Mechanics: Rigorous Results. New
Okikiolu and Laptev–Robert–Safarov give two term York: W. A. Benjamin.

Sarnak P (1995) Arithmetic quantum chaos. In: The Schur Lectures (Tel Aviv, 1992), Israel Mathematical Conference Proc, vol. 8 (1995), pp. 183–236.
Zelditch S (1996a) A random matrix model for quantum mixing. International Mathematics Research Notices 3: 115–137.
Zelditch S (1996b) Quantum ergodicity of C* dynamical systems. Communications in Mathematical Physics 177: 507–528.
Zelditch S (1996c) Quantum Mixing. Journal of Functional Analysis 140: 68–86.
Zelditch S and Zworski M (1996) Ergodicity of eigenfunctions for ergodic billiards. Communications in Mathematical Physics 175: 673–682.

Quantum Error Correction and Fault Tolerance


D Gottesman, Perimeter Institute, Waterloo, ON, Canada
© 2006 Elsevier Ltd. All rights reserved.

Quantum Error Correction

Building a quantum computer or a quantum communications device in the real world means having to deal with errors. Any qubit stored unprotected or one transmitted through a communications channel will inevitably come out at least slightly changed. The theory of quantum error-correcting codes (QECCs) has been developed to counteract noise introduced in this way. By adding extra qubits and carefully encoding the quantum state we wish to protect, a quantum system can be insulated to a great extent against errors.
To build a quantum computer, we face an even more daunting task: if our quantum gates are imperfect, everything we do will add to the error. The theory of fault-tolerant quantum computation tells us how to perform operations on states encoded in a QECC without compromising the code's ability to protect against errors.
In general, a QECC is a subspace of a Hilbert space designed so that any of a set of possible errors can be corrected by an appropriate quantum operation. Specifically:

Definition 1 Let $\mathcal{H}_n$ be a $2^n$-dimensional Hilbert space ($n$ qubits), and let $C$ be a $K$-dimensional subspace of $\mathcal{H}_n$. Then $C$ is an $((n, K))$ (binary) QECC correcting the set of errors $\mathcal{E} = \{E_a\}$ iff $\exists \mathcal{R}$ s.t. $\mathcal{R}$ is a quantum operation and $(\mathcal{R} \circ E_a)(|\psi\rangle) = |\psi\rangle$ for all $E_a \in \mathcal{E}$, $|\psi\rangle \in C$.

$\mathcal{R}$ is called the "recovery" or "decoding" operation and serves to actually perform the correction of the state. The decoder is sometimes also taken to map $\mathcal{H}_n$ into an unencoded Hilbert space $\mathcal{H}_{\log K}$ isomorphic to $C$. This should be distinguished from the "encoding" operation which maps $\mathcal{H}_{\log K}$ into $\mathcal{H}_n$, determining the imbedding of $C$. The computational complexity of the encoder is frequently a great deal lower than that of the decoder. In particular, the task of determining what error has occurred can be computationally difficult (NP-hard, in fact), and designing codes with efficient decoding algorithms is an important task in quantum error correction, as in classical error correction.
This article will cover only binary quantum codes, built with qubits as registers, but all of the techniques discussed here can be generalized to higher-dimensional registers, or "qudits."
To determine whether a given subspace is able to correct a given set of errors, we can apply the quantum error-correction conditions (Bennett et al. 1996, Knill and Laflamme 1997):

Theorem 1 A QECC $C$ corrects the set of errors $\mathcal{E}$ iff

$$\langle\psi_i| E_a^\dagger E_b |\psi_j\rangle = C_{ab}\,\delta_{ij} \qquad [1]$$

where $E_a, E_b \in \mathcal{E}$ and $\{|\psi_i\rangle\}$ form an orthonormal basis for $C$.

The salient point in these error-correction conditions is that the matrix element $C_{ab}$ does not depend on the encoded basis states $i$ and $j$, which, roughly speaking, indicates that neither the environment nor the decoding operation learns any information about the encoded state. We can imagine the various possible errors taking the subspace $C$ into other subspaces of $\mathcal{H}_n$, and we want those subspaces to be isomorphic to $C$, and to be distinguishable from each other by an appropriate measurement. For instance, if $C_{ab} = \delta_{ab}$, then the various erroneous subspaces are orthogonal to each other.
Because of the linearity of quantum mechanics, we can always take the set of errors $\mathcal{E}$ to be a linear space: if a QECC corrects $E_a$ and $E_b$, it will also correct $\alpha E_a + \beta E_b$ using the same recovery operation. In addition, if we write any superoperator $\mathcal{S}$ in terms of its operator-sum representation $\mathcal{S}(\rho) \mapsto \sum_k A_k \rho A_k^\dagger$, a QECC that corrects the set of errors $\{A_k\}$ automatically corrects $\mathcal{S}$ as well. Thus, it is sufficient in general to check that the error-correction conditions hold for a basis of errors.
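
The error-correction conditions of Theorem 1 can be checked by brute force for a small code. The following is a minimal sketch, not part of the original article: it assumes the 3-qubit repetition code (basis states |000⟩ and |111⟩, chosen here only for illustration) and represents states as sparse dictionaries. The helper names `apply_pauli`, `inner`, and `satisfies_conditions` are hypothetical.

```python
from itertools import product

def pauli_on_bit(p, b):
    """Return (new_bit, phase) such that P|b> = phase * |new_bit>."""
    if p == "I":
        return b, 1
    if p == "X":
        return b ^ 1, 1
    if p == "Z":
        return b, (-1 if b else 1)
    if p == "Y":  # Y|0> = i|1>, Y|1> = -i|0>
        return b ^ 1, (1j if b == 0 else -1j)
    raise ValueError(p)

def apply_pauli(pauli, state):
    """Apply a Pauli string such as 'XIZ' to a state {bit-tuple: amplitude}."""
    out = {}
    for bits, amp in state.items():
        new_bits, phase = [], 1
        for p, b in zip(pauli, bits):
            nb, ph = pauli_on_bit(p, b)
            new_bits.append(nb)
            phase *= ph
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + phase * amp
    return out

def inner(u, v):
    """Inner product <u|v> of two sparse states."""
    return sum(complex(a).conjugate() * v.get(k, 0) for k, a in u.items())

def satisfies_conditions(code_basis, errors, tol=1e-12):
    """Check <psi_i| Ea^dagger Eb |psi_j> = C_ab * delta_ij for all a, b, i, j."""
    for ea, eb in product(errors, repeat=2):
        # <Ea psi_i | Eb psi_j> equals <psi_i| Ea^dagger Eb |psi_j>.
        vals = [[inner(apply_pauli(ea, pi), apply_pauli(eb, pj))
                 for pj in code_basis] for pi in code_basis]
        c_ab = vals[0][0]
        for i, row in enumerate(vals):
            for j, v in enumerate(row):
                expected = c_ab if i == j else 0
                if abs(v - expected) > tol:
                    return False
    return True

# Illustrative code: the 3-qubit repetition code (an assumption of this sketch).
basis = [{(0, 0, 0): 1.0}, {(1, 1, 1): 1.0}]
bit_flips = ["III", "XII", "IXI", "IIX"]
phase_flips = ["III", "ZII"]
```

With these definitions, `satisfies_conditions(basis, bit_flips)` holds, while `satisfies_conditions(basis, phase_flips)` fails: the matrix element of a single Z between the two basis states differs in sign, so it depends on the encoded state, which is exactly what the conditions forbid.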

Frequently, we are interested in codes that correct any error affecting $t$ or fewer physical qubits. In that case, let us consider tensor products of the Pauli matrices

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [2]$$

Define the Pauli group $\mathcal{P}_n$ as the group consisting of tensor products of $I$, $X$, $Y$, and $Z$ on $n$ qubits, with an overall phase of $\pm 1$ or $\pm i$. The weight $\mathrm{wt}(P)$ of a Pauli operator $P \in \mathcal{P}_n$ is the number of qubits on which it acts as $X$, $Y$, or $Z$ (i.e., not as the identity). Then the Pauli operators of weight $t$ or less form a basis for the set of all errors acting on $t$ or fewer qubits, so a QECC which corrects these Pauli operators corrects all errors acting on up to $t$ qubits. If we have a channel which causes errors independently with probability $O(\epsilon)$ on each qubit in the QECC, then the code will allow us to decode a correct state except with probability $O(\epsilon^{t+1})$, which is the probability of having more than $t$ errors. We get a similar result in the case where the noise is a general quantum operation on each qubit which differs from the identity by something of size $O(\epsilon)$.

Definition 2 The distance $d$ of an $((n, K))$ QECC is the smallest weight of a nontrivial Pauli operator $E \in \mathcal{P}_n$ s.t. the equation

$$\langle\psi_i| E |\psi_j\rangle = C(E)\,\delta_{ij} \qquad [3]$$

fails.

We use the notation $((n, K, d))$ to refer to an $((n, K))$ QECC with distance $d$. Note that for $P, Q \in \mathcal{P}_n$, $\mathrm{wt}(PQ) \le \mathrm{wt}(P) + \mathrm{wt}(Q)$. Then by comparing the definition of distance with the quantum error-correction conditions, we immediately see that a QECC corrects $t$ general errors iff its distance $d > 2t$. If we are instead interested in "erasure" errors, when the location of the error is known but not its precise nature, a distance $d$ code corrects $d - 1$ erasure errors. If we only wish to detect errors, a distance $d$ code can detect errors on up to $d - 1$ qubits.
One of the central problems in the theory of quantum error correction is to find codes which maximize the ratios $(\log K)/n$ and $d/n$, so they can encode as many qubits as possible and correct as many errors as possible. Conversely, we are also interested in the problem of setting upper bounds on achievable values of $(\log K)/n$ and $d/n$. The quantum Singleton bound (or Knill–Laflamme (1997) bound) states that any $((n, K, d))$ QECC must satisfy

$$n - \log K \ge 2d - 2 \qquad [4]$$

We can set a lower bound on the existence of QECCs using the quantum Gilbert–Varshamov bound, which states that, for large $n$, an $((n, 2^k, d))$ QECC exists provided that

$$k/n \le 1 - (d/n)\log 3 - h(d/n) \qquad [5]$$

where $h(x) = -x \log x - (1 - x)\log(1 - x)$ is the binary Hamming entropy. Note that the Gilbert–Varshamov bound simply states that codes at least this good exist; it does not suggest that better codes cannot exist.

Stabilizer Codes

In order to better manipulate and discover QECCs, it is helpful to have a more detailed mathematical structure to work with. The most widely used structure gives a class of codes known as "stabilizer codes" (Calderbank et al. 1998, Gottesman 1996). They are less general than arbitrary quantum codes, but have a number of useful properties that make them easier to work with than the general QECC.

Definition 3 Let $S \subset \mathcal{P}_n$ be an abelian subgroup of the Pauli group that does not contain $-1$ or $\pm i$, and let $C(S) = \{|\psi\rangle \text{ s.t. } P|\psi\rangle = |\psi\rangle\ \forall P \in S\}$. Then $C(S)$ is a stabilizer code and $S$ is its stabilizer.

Because of the simple structure of the Pauli group, any abelian subgroup has order $2^{n-k}$ for some $k$ and can easily be specified by giving a set of $n - k$ commuting generators.
The code words of the QECC are by definition in the $+1$-eigenspace of all elements of the stabilizer, but an error $E$ acting on a code word will move the state into the $-1$-eigenspace of any stabilizer element $M$ which anticommutes with $E$:

$$M(E|\psi\rangle) = -EM|\psi\rangle = -E|\psi\rangle \qquad [6]$$

Thus, measuring the eigenvalues of the generators of $S$ tells us information about the error that has occurred. The set of such eigenvalues can be represented as an $(n - k)$-dimensional binary vector known as the "error syndrome." Note that the error syndrome does not tell us anything about the encoded state, only about the error that has occurred.

Theorem 2 Let $S$ be a stabilizer with $n - k$ generators, and let $S^\perp = \{E \in \mathcal{P}_n \text{ s.t. } [E, M] = 0\ \forall M \in S\}$. Then $S$ encodes $k$ qubits and has distance $d$, where $d$ is the smallest weight of an operator in $S^\perp \setminus S$.
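
The weight, distance, and bound definitions above translate directly into a few lines of arithmetic. The sketch below is illustrative only: it assumes logarithms are taken base 2 (natural for qubits, but not stated explicitly in the text), and all function names are hypothetical.

```python
import math

def weight(pauli):
    """Number of qubits on which a Pauli string acts as X, Y, or Z."""
    return sum(1 for p in pauli if p != "I")

def correctable_errors(d):
    """Largest t with d > 2t, i.e. errors correctable at unknown locations."""
    return (d - 1) // 2

def correctable_erasures(d):
    """A distance-d code corrects d - 1 erasures (known locations)."""
    return d - 1

def satisfies_singleton(n, K, d):
    """Quantum Singleton bound [4]: n - log2(K) >= 2d - 2 (base-2 log assumed)."""
    return n - math.log2(K) >= 2 * d - 2

def binary_entropy(x):
    """h(x) = -x log2 x - (1 - x) log2 (1 - x), with h(0) = h(1) = 0."""
    if x in (0, 1):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def gilbert_varshamov_rate(n, d):
    """Right-hand side of [5]: a rate k/n that is achievable for large n."""
    x = d / n
    return 1 - x * math.log2(3) - binary_entropy(x)
```

For example, `satisfies_singleton(5, 2, 3)` holds with equality (5 - 1 = 4 = 2·3 - 2), whereas `satisfies_singleton(4, 2, 3)` fails, so under this bound no 4-qubit code can encode one qubit at distance 3.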

We use the notation $[[n, k, d]]$ to refer to such a stabilizer code. Note that the square brackets specify that the code is a stabilizer code, and that the middle term $k$ refers to the number of encoded qubits, and not the dimension $2^k$ of the encoded subspace, as for the general QECC (whose dimension might not be a power of 2).
$S^\perp$ is the set of Pauli operators that commute with all elements of the stabilizer. They would therefore appear to be those errors which cannot be detected by the code. However, the theorem specifies the distance of the code by considering $S^\perp \setminus S$. A Pauli operator $P \in S$ cannot be detected by the code, but there is in fact no need to detect it, since all code words remain fixed under $P$, making it equivalent to the identity operation. A distance $d$ stabilizer code which has nontrivial $P \in S$ with $\mathrm{wt}(P) < d$ is called degenerate, whereas one which does not is nondegenerate. The phenomenon of degeneracy has no analog for classical error-correcting codes, and makes the study of quantum codes substantially more difficult than the study of classical error correction. For instance, a standard bound on classical error correction is the Hamming bound (or sphere-packing bound), but the analogous quantum Hamming bound

$$k/n \le 1 - (t/n)\log 3 - h(t/n) \qquad [7]$$

for $[[n, k, 2t + 1]]$ codes (when $n$ is large) is only known to apply to nondegenerate quantum codes (though in fact we do not know of any degenerate QECCs that violate the quantum Hamming bound).
An example of a stabilizer code is the 5-qubit code, a $[[5, 1, 3]]$ code whose stabilizer can be generated by

$$X \otimes Z \otimes Z \otimes X \otimes I$$
$$I \otimes X \otimes Z \otimes Z \otimes X$$
$$X \otimes I \otimes X \otimes Z \otimes Z$$
$$Z \otimes X \otimes I \otimes X \otimes Z$$

The 5-qubit code is a nondegenerate code, and is the smallest possible QECC which corrects 1 error (as one can see from the quantum Singleton bound).
It is frequently useful to consider other representations of stabilizer codes. For instance, $P \in \mathcal{P}_n$ can be represented by a pair of $n$-bit binary vectors $(p_X | p_Z)$, where $p_X$ is 1 for any location where $P$ has an $X$ or $Y$ tensor factor and is 0 elsewhere, and $p_Z$ is 1 for any location where $P$ has a $Y$ or $Z$ tensor factor. Two Pauli operators $P = (p_X | p_Z)$ and $Q = (q_X | q_Z)$ commute iff $p_X \cdot q_Z + p_Z \cdot q_X = 0$. Then the stabilizer for a code becomes a pair of $(n - k) \times n$ binary matrices, and most interesting properties can be determined by an appropriate linear algebra exercise. Another useful representation is to map the single-qubit Pauli operators $I$, $X$, $Y$, $Z$ to the finite field GF(4), which sets up a connection between stabilizer codes and a subset of classical codes on four-dimensional registers.

CSS Codes

CSS codes are a very useful class of stabilizer codes invented by Calderbank and Shor (1996), and by Steane (1996). The construction takes two binary classical linear codes and produces a quantum code, and can therefore take advantage of much existing knowledge from classical coding theory. In addition, CSS codes have some very useful properties which make them excellent choices for fault-tolerant quantum computation.
A classical $[n, k, d]$ linear code ($n$ physical bits, $k$ logical bits, classical distance $d$) can be defined in terms of an $(n - k) \times n$ binary "parity check" matrix $H$: every classical code word $v$ must satisfy $Hv = 0$. Each row of the parity check matrix can be converted into a Pauli operator by replacing each 0 with an $I$ operator and each 1 with a $Z$ operator. Then the stabilizer code generated by these operators is precisely a quantum version of the classical error-correcting code given by $H$. If the classical distance $d = 2t + 1$, the quantum code can correct $t$ bit flip ($X$) errors, just as could the classical code.
If we want to make a QECC that can also correct phase ($Z$) errors, we should choose two classical codes $C_1$ and $C_2$, with parity check matrices $H_1$ and $H_2$. Let $C_1$ be an $[n, k_1, d_1]$ code and let $C_2$ be an $[n, k_2, d_2]$ code. We convert $H_1$ into stabilizer generators as above, replacing each 0 with $I$ and each 1 with $Z$. For $H_2$, we perform the same procedure, but each 1 is instead replaced by $X$. The code will be able to correct bit flip ($X$) errors as if it had a distance $d_1$ and to correct phase ($Z$) errors as if it had a distance $d_2$. Since these two operations are completely separate, it can also correct $Y$ errors as both a bit flip and a phase error. Thus, the distance of the quantum code is at least $\min(d_1, d_2)$, but might be higher because of the possibility of degeneracy.
However, in order to have a stabilizer code at all, the generators produced by the above procedure must commute. Define the dual $C^\perp$ of a classical code $C$ as the set of vectors $w$ s.t. $w \cdot v = 0$ for all $v \in C$. Then the $Z$ generators from $H_1$ will all commute with the $X$ generators from $H_2$ iff $C_2^\perp \subseteq C_1$ (or equivalently, $C_1^\perp \subseteq C_2$). When this is true, $C_1$ and $C_2$ define an $[[n, k_1 + k_2 - n, d]]$ stabilizer code, where $d \ge \min(d_1, d_2)$.
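
The binary-vector representation makes commutation and syndrome computations purely combinatorial. The sketch below is an illustration, not the article's own procedure: it uses the 5-qubit generators quoted above, while the particular Hamming parity check matrix `HAMMING_H` is one standard choice assumed here, and all function names are hypothetical.

```python
def to_symplectic(pauli):
    """Map a Pauli string to the binary pair (p_X | p_Z)."""
    px = [1 if p in "XY" else 0 for p in pauli]
    pz = [1 if p in "YZ" else 0 for p in pauli]
    return px, pz

def commute(p, q):
    """True iff p_X . q_Z + p_Z . q_X = 0 (mod 2)."""
    px, pz = to_symplectic(p)
    qx, qz = to_symplectic(q)
    s = sum(a * b for a, b in zip(px, qz)) + sum(a * b for a, b in zip(pz, qx))
    return s % 2 == 0

def syndrome(generators, error):
    """One bit per generator: 1 where the error anticommutes with it."""
    return tuple(0 if commute(g, error) else 1 for g in generators)

def single_qubit_errors(n):
    """All weight-1 Pauli strings on n qubits."""
    return ["I" * i + p + "I" * (n - i - 1) for i in range(n) for p in "XYZ"]

# Stabilizer generators of the 5-qubit code, as listed in the text.
FIVE_QUBIT = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]

# A parity check matrix for the classical [7,4,3] Hamming code
# (one standard form, assumed for this sketch).
HAMMING_H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def css_generators_commute(h1, h2):
    """Z-type rows of h1 commute with X-type rows of h2 iff h1 * h2^T = 0 mod 2."""
    return all(sum(a * b for a, b in zip(r1, r2)) % 2 == 0
               for r1 in h1 for r2 in h2)

syndromes = [syndrome(FIVE_QUBIT, e) for e in single_qubit_errors(5)]
```

The 15 single-qubit errors produce 15 distinct nonzero 4-bit syndromes, which is why the code can identify and correct any one of them. For the CSS condition, `css_generators_commute(HAMMING_H, HAMMING_H)` holds, so taking both classical codes to be the Hamming code yields k1 + k2 - n = 4 + 4 - 7 = 1 encoded qubit.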

The smallest distance-3 CSS code is the 7-qubit code, a $[[7, 1, 3]]$ QECC created from the classical Hamming code (consisting of all sums of classical strings 1111000, 1100110, 1010101, and 1111111). The encoded $|\bar{0}\rangle$ for this code consists of the superposition of all even-weight classical code words and the encoded $|\bar{1}\rangle$ is the superposition of all odd-weight classical code words. The 7-qubit code is much studied because its properties make it particularly well suited to fault-tolerant quantum computation.

Fault Tolerance

Given a QECC, we can attempt to supplement it with protocols for performing fault-tolerant operations. The basic design principle of a fault-tolerant protocol is that an error in a single location (either a faulty gate or noise on a quiescent qubit) should not be able to alter more than a single qubit in each block of the QECC. If this condition is satisfied, $t + 1$ separate single-qubit or single-gate failures are required for a distance $2t + 1$ code to fail.
Particular caution is necessary, as computational gates can cause errors to propagate from their original location onto qubits that were previously correct. In general, a gate coupling pairs of qubits allows errors to spread in both directions across the coupling.
The solution is to use transversal gates whenever possible (Shor 1996). A transversal operation is one in which the $i$th qubit in each block of a QECC interacts only with the $i$th qubit of other blocks of the code or of special ancilla states. An operation consisting only of single-qubit gates is automatically transversal. A transversal operation has the virtue that an error occurring on the third qubit in a block, say, can only ever propagate to the third qubit of other blocks of the code, no matter what other sequence of gates we perform before a complete error-correction procedure.
In the case of certain codes, such as the 7-qubit code, a number of different gates can be performed transversally. Unfortunately, it does not appear to be possible to perform universal quantum computations using just transversal gates. We therefore have to resort to more complicated techniques. First we create special encoded ancilla states in a non-fault-tolerant way, but perform some sort of check on them (in addition to error correction) to make sure they are not too far off from the goal. Then we interact the ancilla with the encoded data qubits using gates from our stock of transversal gates and perform a fault-tolerant measurement. Then we complete the operation with a further transversal gate which depends on the outcome of the measurement.

Fault-Tolerant Gates

We will focus on stabilizer codes. Universal fault tolerance is known to be possible for any stabilizer code, but in most cases the more complicated type of construction is needed for all but a few gates. The Pauli group $\mathcal{P}_k$, however, can be performed transversally on any stabilizer code. Indeed, the set $S^\perp \setminus S$ of undetectable errors is a boon in this case, as it allows us to perform these gates. In particular, each coset $S^\perp / S$ corresponds to a different logical Pauli operator (with $S$ itself corresponding to the identity). On a stabilizer code, therefore, logical Pauli operations can be performed via a transversal Pauli operation on the physical qubits.
Stabilizer codes have a special relationship to a finite subgroup $\mathcal{C}_n$ of the unitary group $U(2^n)$ frequently called the "Clifford group." The Clifford group on $n$ qubits is defined as the set of unitary operations which conjugate the Pauli group $\mathcal{P}_n$ into itself; $\mathcal{C}_n$ can be generated by the Hadamard transform, the controlled-NOT (CNOT), and the single-qubit $\pi/4$ phase rotation $\mathrm{diag}(1, i)$. The set of stabilizer codes is exactly the set of codes which can be created by a Clifford group encoder circuit using $|0\rangle$ ancilla states.
Some stabilizer codes have interesting symmetries under the action of certain Clifford group elements, and these symmetries result in transversal gate operations. A particularly useful fact is that a transversal CNOT gate (i.e., CNOT acting between the $i$th qubit of one block of the QECC and the $i$th qubit of a second block for all $i$) acts as a logical CNOT gate on the encoded qubits for any CSS code. Furthermore, for the 7-qubit code, transversal Hadamard performs a logical Hadamard, and the transversal $\pi/4$ rotation performs a logical $-\pi/4$ rotation. Thus, for the 7-qubit code, the full logical Clifford group is accessible via transversal operations.
Unfortunately, the Clifford group by itself does not have much computational power: it can be efficiently simulated on a classical computer. We need to add some additional gate outside the Clifford group to allow universal quantum computation; a single gate will suffice, such as the single-qubit $\pi/8$ phase rotation $\mathrm{diag}(1, \exp(i\pi/4))$. Note that this gives us a finite generating set of gates. However, by taking appropriate products, we get an infinite set of gates, one that is dense in the unitary group $U(2^n)$, allowing universal quantum computation.
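
The defining property of the Clifford group, that conjugation maps Pauli operators to Pauli operators, can be checked numerically for single-qubit gates. The following is a small sketch using plain 2×2 complex matrices; the helper names are hypothetical, and checking only X and Z suffices here because they generate the single-qubit Pauli group up to phase.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def conjugate(u, p):
    """Return U P U^dagger."""
    return matmul(matmul(u, p), dagger(u))

def is_pauli_up_to_phase(m, tol=1e-9):
    """True iff m equals I, X, Y, or Z times a phase in {1, -1, i, -i}."""
    for pauli in (I2, X, Y, Z):
        for phase in (1, -1, 1j, -1j):
            if all(abs(m[i][j] - phase * pauli[i][j]) < tol
                   for i in range(2) for j in range(2)):
                return True
    return False

s = 1 / math.sqrt(2)
HADAMARD = [[s, s], [s, -s]]
P_GATE = [[1, 0], [0, 1j]]                           # pi/4 phase rotation diag(1, i)
T_GATE = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]  # pi/8 rotation diag(1, e^{i pi/4})

clifford_ok = all(is_pauli_up_to_phase(conjugate(g, p))
                  for g in (HADAMARD, P_GATE) for p in (X, Z))
t_maps_x_to_pauli = is_pauli_up_to_phase(conjugate(T_GATE, X))
```

Here `clifford_ok` is true, while `t_maps_x_to_pauli` is false: conjugating X by the π/8 rotation gives an equal-weight combination of X and Y rather than a Pauli operator, so this gate lies outside the Clifford group.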

The following circuit performs a $\pi/8$ rotation, given an ancilla state $|\psi_{\pi/8}\rangle = |0\rangle + \exp(i\pi/4)|1\rangle$:

[Circuit diagram not reproduced; recoverable labels: ancilla input $|\psi_{\pi/8}\rangle$, gate $PX$.]

Here $P$ is the $\pi/4$ phase rotation $\mathrm{diag}(1, i)$, and $X$ is the bit flip. The product is in the Clifford group, and is only performed if the measurement outcome is 1. Therefore, given the ability to perform fault-tolerant Clifford group operations, fault-tolerant measurements, and to prepare the encoded $|\psi_{\pi/8}\rangle$ state, we have universal fault-tolerant quantum computation. A slight generalization of the fault-tolerant measurement procedure below can be used to fault-tolerantly verify the $|\psi_{\pi/8}\rangle$ state, which is a $+1$ eigenstate of $PX$. Using this or another verification procedure, we can check a non-fault-tolerant construction.

Fault-Tolerant Measurement and Error Correction

Since all our gates are unreliable, including those used to correct errors, we will need some sort of fault-tolerant quantum error-correction procedure. A number of different techniques have been developed. All of them share some basic features: they involve creation and verification of specialized ancilla states, and use transversal gates which interact the data block with the ancilla state.
The simplest method, due to Shor, is very general but also requires the most overhead and is frequently the most susceptible to noise. Note that the following procedure can be used to measure (non-fault-tolerantly) the eigenvalue of any (possibly multiqubit) Pauli operator $M$: produce an ancilla qubit in the state $|+\rangle = |0\rangle + |1\rangle$. Perform a controlled-$M$ operation from the ancilla to the state being measured. In the case where $M$ is a multiqubit Pauli operator, this can be broken down into a sequence of controlled-$X$, controlled-$Y$, and controlled-$Z$ operations. Then measure the ancilla in the basis of $|+\rangle$ and $|-\rangle = |0\rangle - |1\rangle$. If the state is a $+1$ eigenvector of $M$, the ancilla will be $|+\rangle$, and if the state is a $-1$ eigenvector, the ancilla will be $|-\rangle$.
The advantage of this procedure is that it measures just $M$ and nothing more. The disadvantage is that it is not transversal, and thus not fault tolerant. Instead of the unencoded $|+\rangle$ state, we must use a more complex ancilla state $|00\ldots0\rangle + |11\ldots1\rangle$ known as a "cat" state. The cat state contains as many qubits as the operator $M$ to be measured, and we perform the controlled-$X$, -$Y$, or -$Z$ operations transversally from the appropriate qubits of the cat state to the appropriate qubits in the data block. Since, assuming the cat state is correct, all of its qubits are either $|0\rangle$ or $|1\rangle$, the procedure either leaves the data state alone or performs $M$ on it uniformly. A $+1$ eigenstate in the data therefore leaves us with $|00\ldots0\rangle + |11\ldots1\rangle$ in the ancilla and a $-1$ eigenstate leaves us with $|00\ldots0\rangle - |11\ldots1\rangle$. In either case, the final state still tells us nothing about the data beyond the eigenvalue of $M$. If we perform a Hadamard transform and then measure each qubit in the ancilla, we get either a random even-weight string (for eigenvalue $+1$) or an odd-weight string (for eigenvalue $-1$).
The procedure is transversal, so an error on a single qubit in the initial cat state or in a single gate during the interaction will only produce one error in the data. However, the initial construction of the cat state is not fault tolerant, so a single-gate error then could eventually produce two errors in the data block. Therefore, we must be careful and use some sort of technique to verify the cat state, for instance, by checking if random pairs of qubits are the same. Also, note that a single phase error in the cat state will cause the final measurement outcome to be wrong (even and odd switch places), so we should repeat the measurement procedure multiple times for greater reliability.
We can then make a full fault-tolerant error-correction procedure by performing the above measurement technique for each generator of the stabilizer. Each measurement gives us one bit of the error syndrome, which we then decipher classically to determine the actual error.
More sophisticated techniques for fault-tolerant error correction involve less interaction with the data but at the cost of more complicated ancilla states. A procedure due to Steane uses (for CSS codes) one ancilla in a logical $|\bar{0}\rangle$ state of the same code and one ancilla in a logical $|\bar{0}\rangle + |\bar{1}\rangle$ state. A procedure due to Knill (for any stabilizer code) teleports the data qubit through an ancilla consisting of two blocks of the QECC containing an encoded Bell state $|\bar{0}\bar{0}\rangle + |\bar{1}\bar{1}\rangle$. Because the ancillas in Steane and Knill error correction are more complicated than the cat state, it is especially important to verify the ancillas before using them.

The Threshold for Fault Tolerance

In an unencoded protocol, even one error can destroy the computation, but a fully fault-tolerant protocol will give the right answer unless multiple
Quantum Error Correction and Fault Tolerance 201

errors occur before they can be corrected. On the Furthermore, these calculations make a number of
other hand, the fault-tolerant protocol is larger, assumptions about the physical properties of the
requiring more qubits and more time to do each computer. The errors are assumed to be independent
operation, and therefore providing more opportu- and uncorrelated between qubits except when a gate
nities for errors. If errors occur on the physical connects them. It is assumed that measurements and
qubits independently at random with probability p classical computations can be performed quickly
per gate or time step, the fault-tolerant protocol has and reliably, and that quantum gates can be
probability of logical error for a single logical gate performed between arbitrary pairs of qubits in the
or time step at most Cp2 , where C is a constant that computer, irrespective of their physical proximity.
depends on the design of the fault-tolerant circuitry Of these, only the assumption of independent errors
(assume the QECC has distance 3, as for the 7-qubit is at all necessary, and that can be considerably
code). When p < pt = 1=C, the fault tolerance helps, relaxed to allow short-range correlations and certain
decreasing the logical error rate. pt is the ‘‘thresh- kinds of non-Markovian environments. However,
old’’ for fault-tolerant quantum computation. If the the effects of relaxing these assumptions on the
error rate is higher than the threshold, the extra threshold value and overhead requirements have not
overhead means that errors will occur faster than been well studied.
they can be reliably corrected, and we are better off
with an unencoded system.
To further lower the logical error rate, we turn to
Further Reading
a family of codes known as ‘‘concatenated codes’’
(Aharonov and Ben-Or, Kitaev 1997, Knill et al. Aharonov D and Ben-Or M (1999) Fault-tolerant quantum
1998). Given a code word of a particular [[n, 1]] computation with constant error rate, quant-ph/9906129.
QECC, we can take each physical qubit and again Bennett C, DiVincenzo D, Smolin J, and Wootters W (1996)
Mixed state entanglement and quantum error correction.
encode it using the same code, producing an $[[n^2, 1]]$ QECC. We could repeat this procedure to get an $n^3$-qubit code, and so forth. The fault-tolerant procedures concatenate as well, and after $L$ levels of concatenation, the effective logical error rate is $p_t (p/p_t)^{2^L}$ (for a base code correcting 1 error). Therefore, if $p$ is below the threshold $p_t$, we can achieve an arbitrarily good error rate $\epsilon$ per logical gate or time step using only $\mathrm{poly}(\log \epsilon)$ resources, which is excellent theoretical scaling.

Unfortunately, the practical requirements for this result are not nearly so good. The best rigorous proofs of the threshold to date show that the threshold is at least $2 \times 10^{-5}$ (meaning one error per 50,000 operations). Optimized simulations of fault-tolerant protocols suggest that the true threshold may be as high as 5%, but to tolerate this much error, existing protocols require enormous overhead, perhaps increasing the number of gates and qubits by a factor of a million or more for typical computations. For lower physical error rates, overhead requirements are more modest, particularly if we only attempt to optimize for calculations of a given size, but are still larger than one would like.
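As a rough numerical reading of the concatenation formula $p_t (p/p_t)^{2^L}$ quoted above, the following sketch evaluates the logical error rate after $L$ levels and counts how many levels are needed to reach a target rate. The function names and the sample values of $p$, $p_t$ and the target are illustrative assumptions, not figures from the article.

```python
def logical_error_rate(p, p_t, levels):
    """Effective logical error rate p_t * (p / p_t) ** (2 ** levels)
    after `levels` levels of concatenation (base code correcting 1 error)."""
    return p_t * (p / p_t) ** (2 ** levels)


def levels_needed(p, p_t, target):
    """Smallest number of concatenation levels whose logical error rate
    does not exceed `target`. Only meaningful below threshold (p < p_t)."""
    if not 0 < p < p_t:
        raise ValueError("physical error rate must satisfy 0 < p < p_t")
    levels = 0
    while logical_error_rate(p, p_t, levels) > target:
        levels += 1
    return levels


if __name__ == "__main__":
    # Hypothetical numbers: p is ten times below an assumed threshold.
    p, p_t = 1e-5, 1e-4
    for level in range(5):
        print(level, logical_error_rate(p, p_t, level))
    print("levels for 1e-15:", levels_needed(p, p_t, 1e-15))
```

Because the exponent doubles at every level, the number of levels grows only like $\log\log(1/\epsilon)$, which is the origin of the polylogarithmic resource scaling mentioned in the text.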

Quantum Field Theory in Curved Spacetime


B S Kay, University of York, York, UK
© 2006 B S Kay. Published by Elsevier Ltd. All rights reserved.

Introduction and Preliminaries

Quantum Field Theory (QFT) in curved spacetime is a hybrid approximate theory in which quantum matter fields are assumed to propagate in a fixed classical background gravitational field. Its basic physical prediction is that strong gravitational fields can polarize the vacuum and, when time dependent, lead to pair creation just as a strong and/or time-dependent electromagnetic field can polarize the vacuum and/or give rise to pair creation of charged particles. One expects it to be a good approximation to full quantum gravity provided the typical frequencies of the gravitational background are very much less than the Planck frequency ($(c^5/G\hbar)^{1/2} \sim 10^{43}\,\mathrm{s}^{-1}$) and provided, with a suitable measure for energy, the energy of created particles is very much less than the energy of the background gravitational field or of its matter sources. Undoubtedly, the most important prediction of the theory is the Hawking effect, according to which a, say spherically symmetric, classical black hole of mass $M$ will emit thermal radiation at the Hawking temperature $T = (8\pi M)^{-1}$ (here and from now on, we use Planck units where $G$, $c$, $\hbar$, and $k$ (Boltzmann's constant) are all taken to be 1).

On the mathematical side, the need to formulate the laws and derive the general properties of QFT on nonflat spacetimes forces one to state and prove results in local terms and, as a byproduct, thereby leads to an improved perspective on flat-spacetime QFT too. It is also interesting to formulate QFT on idealized spacetimes with particular global geometrical features. Thus, QFT on spacetimes with bifurcate Killing horizons is intimately related to the Hawking effect; QFT on spacetimes with closed timelike curves is intimately related to the question whether the laws of physics permit the manufacture of a time machine.

As is standard in general relativity, a curved spacetime is modeled mathematically as a (paracompact, Hausdorff) manifold $M$ equipped with a pseudo-Riemannian metric $g$ of signature $(-, +, +, +)$ (we follow the conventions of the standard text by Misner et al. (1973)). We shall also assume, except where otherwise stated, our spacetime to be globally hyperbolic, that is, that $M$ admits a global time coordinate, by which we mean a global coordinate $t$ such that each constant-$t$ surface is a smooth Cauchy surface, that is, a smooth spacelike 3-surface cut exactly once by each inextendible causal curve. (Without this default assumption, extra problems arise for QFT which we shall briefly mention in connection with the "time machine" question discussed later.) In view of this definition, globally hyperbolic spacetimes are clearly time-orientable and we shall assume a choice of time-orientation has been made so we can talk about the "future" and "past" directions.

Modern formulations of the subject take, as the fundamental mathematical structure modeling the quantum field, a $*$-algebra $\mathcal{A}$ (with identity $I$) together with a family of local sub $*$-algebras $\mathcal{A}(\mathcal{O})$ labeled by bounded open regions $\mathcal{O}$ of the spacetime $(M, g)$ and satisfying the isotony or net condition that $\mathcal{O}_1 \subset \mathcal{O}_2$ implies $\mathcal{A}(\mathcal{O}_1)$ is a subalgebra of $\mathcal{A}(\mathcal{O}_2)$ as well as the condition that whenever two bounded open regions $\mathcal{O}_1$ and $\mathcal{O}_2$ are spacelike separated, then $\mathcal{A}(\mathcal{O}_1)$ and $\mathcal{A}(\mathcal{O}_2)$ commute.

Standard concepts and techniques from algebraic quantum theory are then applicable: In particular, states are defined to be positive (this means $\omega(A^*A) \ge 0$ $\forall A \in \mathcal{A}$) normalized (this means $\omega(I) = 1$) linear functionals on $\mathcal{A}$. One distinguishes between pure states and mixed states, only the latter being writable as nontrivial convex combinations of other states. To each state, $\omega$, the GNS construction associates a representation, $\rho_\omega$, of $\mathcal{A}$ on a Hilbert space $\mathcal{H}_\omega$ together with a cyclic vector $\Omega \in \mathcal{H}_\omega$ such that
\[
\omega(A) = \langle \Omega \,|\, \rho_\omega(A)\Omega \rangle
\]
(and the GNS triple $(\rho_\omega, \mathcal{H}_\omega, \Omega)$ is unique up to equivalence). There are often technical advantages in formulating things so that the $*$-algebra is a $C^*$-algebra. Then the GNS representation is by everywhere-defined bounded operators and is irreducible if and only if the state is pure. A useful concept, due to Haag, is the folium of a given state $\omega$ which may be defined to be the set of all states $\omega_\sigma$ which arise in the form $\mathrm{tr}(\sigma\rho_\omega(\cdot))$, where $\sigma$ ranges over the density operators (trace-class operators with unit trace) on $\mathcal{H}_\omega$.

Given a state, $\omega$, and an automorphism, $\alpha$, which preserves the state (i.e., $\omega \circ \alpha = \omega$) then there will be a unitary operator, $U$, on $\mathcal{H}_\omega$ which implements $\alpha$ in the sense that $\rho_\omega(\alpha(A)) = U^{-1}\rho_\omega(A)U$ and $U$ is chosen uniquely by the condition $U\Omega = \Omega$.

On a stationary spacetime, that is, one which admits a one-parameter group of isometries whose integral curves are everywhere timelike, the algebra will inherit a one-parameter group (i.e., satisfying $\alpha(t_1) \circ \alpha(t_2) = \alpha(t_1 + t_2)$) of time-translation
automorphisms, $\alpha(t)$, and, given any stationary state (i.e., one which satisfies $\omega \circ \alpha(t) = \omega$ $\forall t \in \mathbb{R}$), these will be implemented by a one-parameter group of unitaries, $U(t)$, on its GNS Hilbert space satisfying $U(t)\Omega = \Omega$. If $U(t)$ is strongly continuous so that it takes the form $e^{-iHt}$ and if the Hamiltonian, $H$, is positive, then $\omega$ is said to be a "ground state." Typically one expects ground states to exist and often be unique.

Another important class of stationary states for the algebra of a stationary spacetime is the class of KMS states, $\omega^\beta$, at inverse temperature $\beta$; these have the physical interpretation of thermal equilibrium states. In the GNS representation of one of these, the automorphisms are also implemented by a strongly continuous unitary group, $e^{-iHt}$, which preserves $\Omega$ but (in place of $H$ positive) there is a complex conjugation, $J$, on $\mathcal{H}_\omega$ such that
\[
e^{-\beta H/2}\rho_\omega(A)\Omega = J\rho_\omega(A^*)\Omega \qquad [1]
\]
for all $A \in \mathcal{A}$. An attractive feature of the subject is that its main qualitative features are already present for linear field theories and, unusually in comparison with other questions in QFT, these are susceptible of a straightforward explicit and rigorous mathematical formulation. In fact, as our principal example, we give, in the next section, a construction for the field algebra for the quantized real linear Klein–Gordon equation
\[
(\Box_g - m^2 - V)\phi = 0 \qquad [2]
\]
of mass $m$ on a globally hyperbolic spacetime $(M, g)$. Here, $\Box_g$ denotes the Laplace–Beltrami operator $g^{ab}\nabla_a\partial_b$ $(= |\det(g)|^{-1/2}\partial_a(|\det(g)|^{1/2}g^{ab}\partial_b))$. We include a scalar external background classical field, $V$, in addition to the external gravitational field represented by $g$. In case $m$ is zero, taking $V$ to equal $R/6$, where $R$ denotes the Riemann scalar, makes the equation conformally invariant.

The main new feature of QFT in curved spacetime (present already for linear field theories) is that, in a general (neither flat nor stationary) spacetime there will not be any single preferred state but rather a family of preferred states, members of which are best regarded as on an equal footing with one another. It is this feature which makes the above algebraic framework particularly suitable, indeed essential, to a clear formulation of the subject. Conceptually, it is this feature which takes the most getting used to. In particular, one must realize that, as we shall explain later, the interpretation of a state as having a particular "particle content" is in general problematic because it can only be relative to a particular choice of "vacuum" state and, depending on the spacetime of interest, there may be one state or several states or, frequently, no states at all which deserve the name "vacuum" and even when there are states which deserve this name, they will often only be defined in some approximate or asymptotic or transient sense or only on some subregion of the spacetime.

Concomitantly, one does not expect global observables such as the "particle number" or the quantum Hamiltonian of flat-spacetime free-field theory to generalize to a curved spacetime context, and for this reason local observables play a central role in the theory. The quantized stress–energy tensor is a particularly natural and important such local observable and the theory of this is central to the whole subject. A brief introduction to it is given in a later section.

This is followed by a further section on the Hawking and Unruh effects and then a brief section on the problems of extending the theory beyond the "default" setting, to nonglobally hyperbolic spacetimes. Finally, we briefly mention a number of other interesting and active areas of the subject as well as issuing a few warnings to be borne in mind when reading the literature.

Construction of $*$-Algebra(s) for a Real Linear Scalar Field on Globally Hyperbolic Spacetimes and Some General Theorems

On a globally hyperbolic spacetime, the classical equation [2] admits well-defined advanced and retarded Green functions (strictly bidistributions) $\Delta^A$ and $\Delta^R$ and the standard covariant quantum free real (or "Hermitian") scalar field commutation relations familiar from Minkowski spacetime free-field theory naturally generalize to the (heuristic) equation
\[
[\hat\phi(x), \hat\phi(y)] = i\Delta(x, y)I
\]
where $\Delta$ is the Lichnérowicz commutator function $\Delta = \Delta^A - \Delta^R$. Here, the "^" on the quantum field $\hat\phi$ serves to distinguish it from a classical solution $\phi$. In mathematical work, one does not assign a meaning to the field at a point itself, but rather aims to assign meaning to smeared fields $\hat\phi(F)$ for all real-valued test functions $F \in C_0^\infty(M)$ which are then to be interpreted as standing for $\int_M \hat\phi(x)F(x)|\det(g)|^{1/2}\,d^4x$. In fact, it is straightforward to define a minimal field algebra (see below) $\mathcal{A}_{\min}$ generated by such $\hat\phi(F)$ which satisfy the suitably smeared version
\[
[\hat\phi(F), \hat\phi(G)] = i\Delta(F, G)I
\]
of the above commutation relations together with Hermiticity (i.e., $\hat\phi(F)^* = \hat\phi(F)$), the property of being a weak solution of eqn [2] (i.e., $\hat\phi((\Box_g - m^2 - V)F) = 0$ $\forall F \in C_0^\infty(M)$) and linearity in test functions. There is a technically different alternative formulation of this minimal algebra, which is known as the Weyl algebra, which is constructed to be the $C^*$-algebra generated by operators $W(F)$ (to be interpreted as standing for $\exp(i\int_M \hat\phi(x)F(x)|\det(g)|^{1/2}\,d^4x)$) satisfying
\[
W(F_1)W(F_2) = \exp(-i\Delta(F_1, F_2)/2)\,W(F_1 + F_2)
\]
together with $W(F)^* = W(-F)$ and $W((\Box_g - m^2 - V)F) = I$. With either the minimal algebra or the Weyl algebra one can define, for each bounded open region $\mathcal{O}$, subalgebras $\mathcal{A}(\mathcal{O})$ as generated by the $\hat\phi(\cdot)$ (or the $W(\cdot)$) smeared with test functions supported in $\mathcal{O}$ and verify that they satisfy the above "net" condition and commutativity at spacelike separation.

Specifying a state, $\omega$, on $\mathcal{A}_{\min}$ is tantamount to specifying its collection of $n$-point distributions (i.e., smeared $n$-point functions) $\omega(\hat\phi(F_1) \ldots \hat\phi(F_n))$. (In the case of the Weyl algebra, one restricts attention to "regular" states for which the map $F \to \omega(W(F))$ is sufficiently often differentiable on finite-dimensional subspaces of $C_0^\infty(M)$ and defines the $n$-point distributions in terms of derivatives with respect to suitable parameters of expectation values of suitable Weyl algebra elements.) A particular role is played in the theory by the quasifree states for which all the truncated $n$-point distributions except for $n = 2$ vanish. Thus, all the $n$-point distributions for odd $n$ vanish while the four-point distribution is made out of the two-point distribution according to
\[
\begin{aligned}
\omega(\hat\phi(F_1)\hat\phi(F_2)\hat\phi(F_3)\hat\phi(F_4)) ={}& \omega(\hat\phi(F_1)\hat\phi(F_2))\,\omega(\hat\phi(F_3)\hat\phi(F_4)) \\
&+ \omega(\hat\phi(F_1)\hat\phi(F_3))\,\omega(\hat\phi(F_2)\hat\phi(F_4)) \\
&+ \omega(\hat\phi(F_1)\hat\phi(F_4))\,\omega(\hat\phi(F_2)\hat\phi(F_3))
\end{aligned}
\]
etc. The anticommutator distribution
\[
G(F_1, F_2) = \omega(\hat\phi(F_1)\hat\phi(F_2)) + \omega(\hat\phi(F_2)\hat\phi(F_1)) \qquad [3]
\]
of a quasifree state (or indeed of any state) will satisfy the following conditions (for all test functions $F$, $F_1$, $F_2$, etc.):

C1. Symmetry
\[
G(F_1, F_2) = G(F_2, F_1)
\]
C2. Weak bisolution property
\[
G((\Box_g - m^2 - V)F_1, F_2) = 0 = G(F_1, (\Box_g - m^2 - V)F_2)
\]
C3. Positivity
\[
G(F, F) \ge 0 \quad\text{and}\quad G(F_1, F_1)^{1/2}\,G(F_2, F_2)^{1/2} \ge |\Delta(F_1, F_2)|
\]
and it can be shown that, to every bilinear functional $G$ on $C_0^\infty(M)$ satisfying (C1)–(C3), there is a quasifree state with two-point distribution $(1/2)(G + i\Delta)$. One further declares a quasifree state to be physically admissible only if (for pairs of points in sufficiently small convex neighborhoods)

C4. Hadamard condition
\[
\text{``}G(x_1, x_2) = \frac{1}{2\pi^2}\left(u(x_1, x_2)\,P\frac{1}{\sigma} + v(x_1, x_2)\log|\sigma| + w(x_1, x_2)\right)\text{''}
\]
This last condition expresses the requirement that (locally) the two-point distribution actually "is" (in the usual sense in which one says that a distribution "is" a function) a smooth function for pairs of non-null-separated points. At the same time, it requires that the two-point distribution be singular at pairs of null-separated points and locally specifies the nature of the singularity for such pairs of points with a leading "principal part of $1/\sigma$" type singularity and a subleading "$\log|\sigma|$" singularity, where $\sigma$ denotes the square of the geodesic distance between $x_1$ and $x_2$. $u$ (which satisfies $u(x_1, x_2) = 1$ when $x_1 = x_2$) and $v$ are certain smooth two-point functions determined in terms of the local geometry and the local values of $V$ by something called the Hadamard procedure while the smooth two-point function $w$ depends on the state. We shall omit the details. The important point is that this Hadamard condition on the two-point distribution is believed to be the correct generalization to a curved spacetime of the well-known universal short-distance behavior shared by the truncated two-point distributions of all physically relevant states for the special case of our theory when the spacetime is flat (and $V$ vanishes). In the latter case, $u$ reduces to 1, and $v$ to a simple power series $\sum_{n=0}^{\infty} v_n\sigma^n$ with $v_0 = m^2/4$, etc.

Actually, it is known (this is the content of "Kay's conjecture" which was proved by M Radzikowski in 1992) that (C1)–(C4) together imply that the two-point distribution is nonsingular at all pairs of (not necessarily close together) spacelike separated points. More important than this result itself is a reformulation of the Hadamard condition in terms of the concepts of microlocal analysis which Radzikowski originally introduced as a tool towards its proof.
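The quasifree prescription described in this section (odd $n$-point distributions vanish, and even ones are sums over pairings of two-point distributions, as in the four-point formula) can be sketched in a few lines. This is an illustration only: the function name `quasifree_n_point` and the representation of the two-point distribution as a table `w[i][j]` of numbers standing for $\omega(\hat\phi(F_i)\hat\phi(F_j))$ with $i < j$ are assumptions made for this sketch, not notation from the article.

```python
def quasifree_n_point(w):
    """n-point value of a quasifree state built from its two-point values.

    w[i][j] (for i < j) stands for omega(phi(F_i) phi(F_j)), with the
    operator order preserved; entries with i >= j are never read.
    Returns the sum over all pairings of products of two-point values,
    which is 0 for odd n.
    """
    def pairings(indices):
        if not indices:
            return 1          # empty product
        if len(indices) % 2:
            return 0          # odd n-point distributions vanish
        first, rest = indices[0], indices[1:]
        total = 0
        for k, partner in enumerate(rest):
            remaining = rest[:k] + rest[k + 1:]
            total += w[first][partner] * pairings(remaining)
        return total

    return pairings(tuple(range(len(w))))


if __name__ == "__main__":
    # Hypothetical two-point values; complex because the two-point
    # distribution (1/2)(G + i Delta) is not real in general.
    w = [
        [0, 1 + 2j, 3, 4j],
        [0, 0, 5, 6],
        [0, 0, 0, 7 - 1j],
        [0, 0, 0, 0],
    ]
    print(quasifree_n_point(w))
```

Only the upper triangle of the table is used because the pairings always keep the original left-to-right order of the fields, which matters since $\omega(\hat\phi(F_1)\hat\phi(F_2))$ and $\omega(\hat\phi(F_2)\hat\phi(F_1))$ differ by $i\Delta(F_1, F_2)$.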
C4′. Wave front set (or microlocal) spectrum condition
\[
\mathrm{WF}(G + i\Delta) = \{(x_1, p_1; x_2, p_2) \in T^*(M \times M) \setminus 0 \mid x_1 \text{ and } x_2 \text{ lie on a single null geodesic, } p_1 \text{ is tangent to that null geodesic and future pointing, and } p_2 \text{ when parallel transported along that null geodesic from } x_2 \text{ to } x_1 \text{ equals } -p_1\}
\]
For the gist of what this means, it suffices to know that to say that an element $(x, p)$ of the cotangent bundle of a manifold (excluding the zero section $0$) is in the wave front set, WF, of a given distribution on that manifold may be expressed informally by saying that that distribution is singular at the point $x$ in the direction $p$. (And here the notion is applied to $G + i\Delta$, thought of as a distribution on the manifold $M \times M$.)

We remark that generically (and, e.g., always if the spatial sections are compact and $m^2 + V(x)$ is everywhere positive) the Weyl algebra for eqn [2] on a given stationary spacetime will have a unique ground state and unique KMS states at each temperature and these will be quasifree and Hadamard.

Quasifree states are important also because of a theorem of R Verch (1994, in verification of another conjecture of Kay) that (in the Weyl algebra framework) on the algebra of any bounded open region, the folia of the quasifree Hadamard states coincide. With this result one can extend the notion of physical admissibility to not-necessarily-quasifree states by demanding that, to be admissible, a state belong to the resulting common folium when restricted to the algebra of each bounded open region; equivalently, that it be a locally normal state on the resulting natural extension of the net of local Weyl algebras to a net of local $W^*$-algebras.

Particle Creation and the Limitations of the Particle Concept

Global hyperbolicity also entails that the Cauchy problem is well posed for the classical field equation [2] in the sense that for every Cauchy surface, $\mathcal{C}$, and every pair $(f, p)$ of Cauchy data in $C_0^\infty(\mathcal{C})$, there exists a unique solution $\phi$ in $C^\infty(M)$ such that $f = \phi|_{\mathcal{C}}$ and $p = |\det(g)|^{1/2}g^{0b}\partial_b\phi|_{\mathcal{C}}$. Moreover, $\phi$ has compact support on all other Cauchy surfaces. Given a global time coordinate $t$, increasing towards the future, foliating $M$ into a family of constant-$t$ Cauchy surfaces, $\mathcal{C}_t$, and given a choice of global timelike vector field $\tau^a$ (e.g., $\tau^a = g^{ab}\partial_b t$) enabling one to identify all the $\mathcal{C}_t$, say with $\mathcal{C}_0$, by identifying points cut by the same integral curve of $\tau^a$, a single such classical solution $\phi$ may be pictured as a family $\{(f_t, p_t): t \in \mathbb{R}\}$ of time-evolving Cauchy data on $\mathcal{C}_0$. Moreover, since [2] implies, for each pair of classical solutions, $\phi_1$, $\phi_2$, the conservation (i.e., $\partial_a j^a = 0$) of the current $j^a = |\det(g)|^{1/2}g^{ab}(\phi_1\partial_b\phi_2 - \phi_2\partial_b\phi_1)$, the symplectic form (on $C_0^\infty(\mathcal{C}_0) \times C_0^\infty(\mathcal{C}_0)$)
\[
\sigma((f_t^1, p_t^1); (f_t^2, p_t^2)) = \int_{\mathcal{C}_0} (f_t^1 p_t^2 - p_t^1 f_t^2)\,d^3x
\]
will be conserved in time.

Corresponding to this picture of classical dynamics, one expects there to be a description of quantum dynamics in terms of a family of sharp-time quantum fields $(\varphi_t, \pi_t)$ on $\mathcal{C}_0$, satisfying heuristic canonical commutation relations
\[
\begin{aligned}
[\varphi_t(x), \varphi_t(y)] &= 0 \\
[\pi_t(x), \pi_t(y)] &= 0 \\
[\varphi_t(x), \pi_t(y)] &= i\delta^3(x, y)I
\end{aligned}
\]
and evolving in time according to the same dynamics as the Cauchy data of a classical solution. (Both these expectations are correct because the field equation is linear.) An elegant way to make rigorous mathematical sense of these expectations is in terms of a $*$-algebra with identity generated by Hermitian objects "$\sigma((\varphi_0, \pi_0); (f, p))$" ("symplectically smeared sharp-time fields at $t = 0$") satisfying linearity in $f$ and $p$ together with the commutation relations
\[
[\sigma((\varphi_0, \pi_0); (f^1, p^1)), \sigma((\varphi_0, \pi_0); (f^2, p^2))] = i\sigma((f^1, p^1); (f^2, p^2))I
\]
and to define (symplectically smeared) time-$t$ sharp-time fields by demanding
\[
\sigma((\varphi_t, \pi_t); (f_t, p_t)) = \sigma((\varphi_0, \pi_0); (f_0, p_0))
\]
where $(f_t, p_t)$ is the classical time-evolute of $(f_0, p_0)$. This $*$-algebra of sharp-time fields may be identified with the (minimal) field $*$-algebra of the previous section, the $\hat\phi(F)$ of the previous section being identified with $\sigma((\varphi_0, \pi_0); (f, p))$, where $(f, p)$ are the Cauchy data at $t = 0$ of $\Delta * F$. (This identification is of course many–one since $\hat\phi(F) = 0$ whenever $F$ arises as $(\Box_g - m^2 - V)G$ for some test function $G \in C_0^\infty(M)$.)

Specializing momentarily to the case of the free scalar field $(\Box - m^2)\phi = 0$ ($m \ne 0$) in Minkowski space with a flat $t = 0$ Cauchy surface, the "symplectically smeared" two-point function of the usual ground state ("Minkowski vacuum state"), $\omega_0$, is given, in this formalism, by
\[
\omega_0(\sigma((\varphi, \pi); (f^1, p^1))\,\sigma((\varphi, \pi); (f^2, p^2))) = \tfrac{1}{2}\left(\langle f^1|\mu f^2\rangle + \langle p^1|\mu^{-1}p^2\rangle + i\sigma((f^1, p^1); (f^2, p^2))\right) \qquad [4]
\]
where the inner products are in the one-particle Hilbert space $\mathcal{H} = L^2_{\mathbb{C}}(\mathbb{R}^3)$ and $\mu = (m^2 - \nabla^2)^{1/2}$. The GNS representation of this state may be concretely realized on the familiar Fock space $\mathcal{F}(\mathcal{H})$ over $\mathcal{H}$ by
\[
\rho_0(\sigma((\varphi, \pi); (f, p))) = -i\left(\hat a^\dagger(a) - (\hat a^\dagger(a))^*\right)
\]
where $a$ denotes the element of $\mathcal{H}$:
\[
a = \frac{\mu^{1/2}f + i\mu^{-1/2}p}{\sqrt{2}}
\]
(we note in passing that, if we equip $\mathcal{H}$ with the symplectic form $2\,\mathrm{Im}\langle\cdot|\cdot\rangle$, then $K: (f, p) \mapsto a$ is a symplectic map) and $\hat a^\dagger(a)$ is the usual smeared creation operator (= "$\int \hat a^\dagger(x)a(x)\,d^3x$") on $\mathcal{F}(\mathcal{H})$ satisfying
\[
[(\hat a^\dagger(a^1))^*, \hat a^\dagger(a^2)] = \langle a^1|a^2\rangle_{\mathcal{H}}\,I
\]
The usual (smeared) annihilation operator, $\hat a(a)$, is $(\hat a^\dagger(Ca))^*$, where $C$ is the natural complex conjugation, $a \mapsto a^*$ on $\mathcal{H}$. Both of these operators annihilate the Fock vacuum vector $\Omega^{\mathcal{F}}$. In this representation, the one-parameter group of time-translation automorphisms
\[
\alpha(t): \sigma((\varphi_0, \pi_0); (f, p)) \mapsto \sigma((\varphi_t, \pi_t); (f, p)) \qquad [5]
\]
is implemented by $\exp(-iHt)$ where $H$ is the second quantization of $\mu$ (i.e., the operator otherwise known as $\int \mu(k)\hat a^\dagger(k)\hat a(k)\,d^3k$) on $\mathcal{F}(\mathcal{H})$.

The most straightforward (albeit physically artificial) situation involving "particle creation" in a curved spacetime concerns a globally hyperbolic spacetime which, outside of a compact region, is isometric to Minkowski space with a compact region removed; that is, a globally hyperbolic spacetime which is flat except inside a localized "bump" of curvature (see Figure 1). (One could also allow the function $V$ in [2] to be nonzero inside the bump.) On the field algebra (defined as in the previous section) of such a spacetime, there will be an "in" vacuum state (which may be identified with the Minkowski vacuum to the past of the bump) and an "out" vacuum state (which may be identified with the Minkowski vacuum to the future of the bump) and one expects, for example, the "in vacuum" to arise as a many-particle state in the GNS representation of the "out vacuum" corresponding to the creation of particles out of the vacuum by the bump of curvature.

Figure 1 A spacetime which is flat outside of a compact bump of curvature. (The surfaces $t = 0$ and $t = T$ are marked.)

In the formalism of this section, if we choose our global time coordinate on such a spacetime so that, say, the $t = 0$ surface is to the past of the bump and the $t = T$ surface to its future, then the single automorphism $\alpha(T)$ (defined as in [5]) encodes the overall effect of the bump of curvature on the quantum field and one can ask whether it is implemented by a unitary operator in the GNS representation of the Minkowski vacuum state [4].

This question may be answered by referring to the real linear map $\mathcal{T}: \mathcal{H} \to \mathcal{H}$ which sends $a_T = 2^{-1/2}(\mu^{1/2}f_T + i\mu^{-1/2}p_T)$ to $a_0 = 2^{-1/2}(\mu^{1/2}f_0 + i\mu^{-1/2}p_0)$. By the conservation in time of $\sigma$ and the symplecticity, noted in passing above, of the map $K: (f, p) \mapsto a$, this satisfies the defining relation
\[
\mathrm{Im}\langle \mathcal{T}a^1|\mathcal{T}a^2\rangle = \mathrm{Im}\langle a^1|a^2\rangle
\]
of a classical Bogoliubov transformation. Splitting $\mathcal{T}$ into its complex-linear and complex-antilinear parts by writing
\[
\mathcal{T} = \alpha + \beta C
\]
where $\alpha$ and $\beta$ are complex-linear operators, this relation may alternatively be expressed in terms of the pair of relations
\[
\alpha^*\alpha - \bar\beta^*\bar\beta = I, \qquad \bar\alpha^*\bar\beta = \beta^*\alpha
\]
where $\bar\alpha = C\alpha C$, $\bar\beta = C\beta C$.

We remark that there is an easy-to-visualize equivalent way of defining $\alpha$ and $\beta$ in terms of the analysis, to the past of the bump, into positive- and negative-frequency parts of complex solutions to [2] which are purely positive frequency to the future of the bump. In fact, if, for any element $a \in \mathcal{H}$, we identify the positive-frequency solution to the Minkowski-space Klein–Gordon equation
\[
\phi^{\mathrm{pos}}_{\mathrm{out}}(t, x) = \left((2\mu)^{-1/2}\exp(-i\mu t)\,a\right)(x)
\]
with a complex solution to [2] to the future of the bump, then (it may easily be seen) to the past of the bump, this same solution will be identifiable with
the (partly positive-frequency, partly negative-frequency) Minkowski-space Klein–Gordon solution
\[
\phi_{\mathrm{in}}(t, x) = \left((2\mu)^{-1/2}\exp(-i\mu t)\,\alpha a\right)(x) + \left((2\mu)^{-1/2}\exp(i\mu t)\,\bar\beta a\right)(x)
\]
and this could be taken to be the defining equation for the operators $\alpha$ and $\beta$.

It is then known (by a 1962 theorem of Shale) that the automorphism [5] (strictly, its Weyl algebra counterpart) will be unitarily implemented if and only if $\beta$ is a Hilbert–Schmidt operator on $\mathcal{H}$. Wald (1979, in case $m \ge 0$) and Dimock (1979, in case $m \ne 0$) have verified that this condition is satisfied in the case of our bump-of-curvature situation. In that case, if we denote the unitary implementor by $U$, we have the following results:

R1. The expectation value $\langle U\Omega^{\mathcal{F}}|N(a)U\Omega^{\mathcal{F}}\rangle_{\mathcal{F}(\mathcal{H})}$ of the number operator, $N(a) = \hat a^\dagger(a)\hat a(a)$, where $a$ is a normalized element of $\mathcal{H}$, is equal to $\langle \bar\beta a|\bar\beta a\rangle_{\mathcal{H}}$.

R2. First note that there exists an orthonormal basis of vectors, $e_i$, ($i = 1 \ldots \infty$), in $\mathcal{H}$ such that the (Hilbert–Schmidt) operator $\bar\beta^*\bar\alpha^{-1}$ has the canonical form $\sum_i \lambda_i\langle Ce_i|\cdot\rangle|e_i\rangle$. We then have (up to an undetermined phase)
\[
U\Omega^{\mathcal{F}} = N\exp\left(-\frac{1}{2}\sum_i \lambda_i\,\hat a^\dagger(e_i)\hat a^\dagger(e_i)\right)\Omega^{\mathcal{F}}
\]
where the normalization constant $N$ is chosen so that $\|U\Omega^{\mathcal{F}}\| = 1$. This formula makes manifest that the particles are created in pairs.

We remark that, identifying elements, $a$, of $\mathcal{H}$ with positive-frequency solutions (below, we shall call them "modes") as explained above, result (R1) may alternatively be expressed by saying that the expectation value, $\omega_{\mathrm{in}}(N(a))$, in the in-vacuum state of the occupation number, $N(a)$, of a normalized mode, $a$, to the future of the bump, is given by $\langle \bar\beta a|\bar\beta a\rangle_{\mathcal{H}}$.

Figure 2 Plots of $\varpi$ against $t$ for the two potentials $V$ (continuous line) and $\hat V$ (continuous line up to $s$ and then dashed line) which play a role in our critique of "instantaneous diagonalization." (The times $0$ and $s$ are marked on the $t$-axis.)

This formalism and the results, (R1) and (R2) above, will generalize (at least heuristically, and sometimes rigorously; see especially the rigorous scattering-theoretic work in the 1980s by Dimock and Kay and more recently by A Bachelot and others) to more realistic spacetimes which are only asymptotically flat or asymptotically stationary. In favorable cases, one will still have notions of classical solutions which are positive frequency asymptotically towards the future/past, and, in consequence, one will have well-defined asymptotic notions of "vacuum" and "particles." Also, in, for example, cosmological models where the background spacetime is slowly varying in time, one can define approximate adiabatic notions of classical positive-frequency solutions, and hence also of quantum "vacuum" and "particles" at each finite value of the cosmological time. But, at times where the gravitational field is rapidly varying, one does not expect there to be any sensible notion of "particles." And, in a rapidly time-varying background gravitational field which never settles down, one does not expect there to be any sensible particle interpretation of the theory at all. To understand these statements, it suffices to consider the $(1 + 0)$-dimensional Klein–Gordon equation with an external potential $V$:
\[
\left(-\frac{d^2}{dt^2} - m^2 - V(t)\right)\phi = 0
\]
which is of course a system of one degree of freedom, mathematically equivalent to the harmonic oscillator with a time-varying angular frequency $\varpi(t) = (m^2 + V(t))^{1/2}$. One could of course express its quantum theory in terms of a time-evolving Schrödinger wave function $\Psi(\varphi, t)$ and attempt to give this a particle interpretation at each time, $s$, by expanding $\Psi(\varphi, s)$ in terms of the harmonic oscillator wave functions for a harmonic oscillator with some particular choice of angular frequency. But the problem is, as is easy to convince oneself, that there is no such good choice. For example, one might think that a good choice would be to take, at time $s$, the set of harmonic oscillator wave functions with angular frequency $\varpi(s)$. (This is sometimes known as the method of "instantaneous diagonalization of the Hamiltonian.") But suppose we were to apply this prescription to the case of a smooth $V(\cdot)$ which is constant in time until time 0 and assume the initial state is the usual vacuum state. Then at some positive time $s$, the number of particles predicted to be present is the same as the number of particles predicted to be present on the same prescription at all times after $s$ for a $\hat V(\cdot)$ which is equal to $V(\cdot)$ up to time $s$ and then takes the constant value $V(s)$ for all later times (see Figure 2). But $\hat V(\cdot)$ will generically have a sharp corner in its graph (i.e., a
discontinuity in its time derivative) at time $s$, and one would expect a large part of the particle production in the latter situation to be accounted for by the presence of this sharp corner, and therefore a large part of the predicted particle production in the case of $V(\cdot)$ to be spurious.

Back in $1 + 3$ dimensions, even where a good notion of particles is possible, it depends on the choice of time evolution, as is dramatically illustrated by the Unruh effect discussed in the relevant section.

Theory of the Stress–Energy Tensor

To orient ideas, consider first the free (minimally coupled) scalar field, $(\Box - m^2)\phi = 0$, in Minkowski space. If one quantizes this system in the usual Minkowski-vacuum representation, then the expectation value of the renormalized stress–energy tensor (which in this case is the same thing as the normal ordered stress–energy tensor) in a vector state $\Psi$ in the Fock space will be given by the formal point-splitting expression
\[
\langle\Psi|T_{ab}(x)\Psi\rangle = \lim_{(x_1, x_2) \to (x, x)} \left(\partial_a^1\partial_b^2 - \tfrac{1}{2}\eta_{ab}(\eta^{cd}\partial_c^1\partial_d^2 + m^2)\right)\left(\langle\Psi|\rho_0(\phi(x_1)\phi(x_2))\Psi\rangle - \langle\Omega^{\mathcal{F}}|\rho_0(\phi(x_1)\phi(x_2))\Omega^{\mathcal{F}}\rangle\right) \qquad [6]
\]
where $\eta_{ab}$ is the usual Minkowski metric. A sufficient condition for the limit here to be finite and well defined would, for example, be for $\Psi$ to consist of a (normalized) finite superposition of $n$-particle vectors of form $\hat a^\dagger(a_1) \cdots \hat a^\dagger(a_n)\Omega^{\mathcal{F}}$ where the smearing functions $a_1, \ldots, a_n$ are all $C^\infty$ elements of $\mathcal{H}$ (i.e., of $L^2_{\mathbb{C}}(\mathbb{R}^3)$). The reason this works is that the two-point function in such states shares the same short-distance singularity as the Minkowski-vacuum two-point function. For exactly the same reason, one obtains a well-defined finite limit if one defines the expectation value of the stress–energy tensor in any physically admissible quasifree state by the expression
\[
\omega(T_{ab}(x)) = \lim_{(x_1, x_2) \to (x, x)} \left(\partial_a^1\partial_b^2 - \tfrac{1}{2}\eta_{ab}(\eta^{cd}\partial_c^1\partial_d^2 + m^2)\right)\left(\omega(\phi(x_1)\phi(x_2)) - \omega_0(\phi(x_1)\phi(x_2))\right) \qquad [7]
\]
This latter point-splitting formula generalizes to a definition for the expectation value of the renormalized stress–energy tensor for an arbitrary physically admissible quasifree state (or indeed for an arbitrary state whose two-point function has Hadamard form, i.e., whose anticommutator function satisfies condition (C4)) on the minimal field algebra and to other linear field theories (including the stress tensor for a conformally coupled linear scalar field) on a general globally hyperbolic spacetime (and the result obtained agrees with that obtained by other methods, including dimensional regularization and zeta-function regularization). However, the generalization to a curved spacetime involves a number of important new features which we now briefly list (see Wald (1978) for details).

First, the subtraction term which replaces $\omega_0(\phi(x_1)\phi(x_2))$ is, in general, not the expectation value of $\phi(x_1)\phi(x_2)$ in any particular state, but rather a particular locally constructed Hadamard two-point function whose physical interpretation is more subtle; the renormalization is thus in general not to be regarded as a normal ordering. Second, the immediate result of the resulting limiting process will not be covariantly conserved and, in order to obtain a covariantly conserved quantity, one needs to add a particular local geometrical correction term. The upshot of this is that the resulting expected stress–energy tensor is covariantly conserved but possesses a (state-independent) anomalous trace. In particular, for a massless conformally coupled linear scalar field, one has (for all physically admissible quasifree states, $\omega$) the trace anomaly formula
\[
\omega(T^a{}_a(x)) = (2880\pi^2)^{-1}\left(C_{abcd}C^{abcd} + R_{ab}R^{ab} - \tfrac{1}{3}R^2\right)
\]
plus an arbitrary multiple of $\Box R$. In fact, in general, the thus-defined renormalized stress–energy tensor operator (see below) is only defined up to a finite renormalization ambiguity which consists of the addition of arbitrary multiples of the functional derivatives with respect to $g_{ab}$ of the quantities
\[
I_n = \int_M F_n(x)|\det(g)|^{1/2}\,d^4x
\]
where $n$ ranges from 1 to 4 with $F_1 = 1$, $F_2 = R$, $F_3 = R^2$, and $F_4 = R_{ab}R^{ab}$. In the Minkowski-space case, only the first of these ambiguities arises and it is implicitly resolved in the formulas [6], [7] inasmuch as these effectively incorporate the renormalization condition that $\omega_0(T_{ab}) = 0$. (For the same reason, the locally flat example we give below has no ambiguity.)

One expects, in both flat and curved cases, that, for test functions, $F \in C_0^\infty(M)$, there will exist operators $T_{ab}(F)$ which are affiliated to the net of
local W*-algebras referred to earlier and that it is meaningful to write

$$\int_M \omega(T_{ab}(x))\,F(x)\,|\det(g)|^{1/2}\,d^4x = \omega(T_{ab}(F))$$

provided that, by $\omega$ on the right-hand side, we understand the extension of $\omega$ from the Weyl algebra to this net. ($T_{ab}(F)$ is however not expected to belong to the minimal algebra or be affiliated to the Weyl algebra.)

An interesting simple example of a renormalized stress–energy tensor calculation is the so-called Casimir effect calculation for a linear scalar field on a (for further simplicity, (1 + 1)-dimensional) timelike cylinder spacetime of radius R (see Figure 3). This spacetime is globally hyperbolic and stationary and, while locally flat, globally distinct from Minkowski space. As a result, while – provided the regions O are sufficiently small (such as the diamond region in Figure 3) – elements A(O) of the minimal net of local algebras on this spacetime will be identifiable, in an obvious way, with elements of the minimal net of local algebras on Minkowski space, the stationary ground state $\omega_{\mathrm{cylinder}}$ will, when restricted to such thus-identified regions, be distinct from the Minkowski vacuum state $\omega_0$. The resulting renormalized stress–energy tensor (as first pointed out in Kay (1979), definable, once the above identification has been made, exactly as in [7]) turns out, in the massless case, to be nonzero and, interestingly, to have a (in the natural coordinates, constant) negative energy-density $T_{00}$. In fact, in this massless case,

$$\omega_{\mathrm{cylinder}}(T_{ab}) = -\frac{1}{24\pi R^2}\,\delta_{ab}$$

Figure 3 The timelike cylinder spacetime of radius R with a diamond region isometric to a piece of Minkowski space. See Kay (1979). Casimir effect in quantum field theory. (Original title: The Casimir effect without magic.) Physical Review D 20: 3052–3062. Reprinted with permission © 1979 by the American Physical Society.

Figure 4 The spacetime of a star collapsing to a spherical black hole. (Labels in the figure: Singularity; Horizon; Interior of star.)

Hawking and Unruh Effects

The original calculation by Hawking (1975) concerned a model spacetime for a star which collapses to a black hole. For simplicity, we shall only discuss the spherically symmetric case (see Figure 4). Adopting a similar “mode” viewpoint to that mentioned after results (R1) and (R2) discussed earlier, the result of the calculation may be stated as follows: For a real linear scalar field satisfying [2] with m = 0 (and V = 0) on this spacetime, the expectation value $\omega_{\mathrm{in}}(N(a_{\varpi,\ell}))$ of the occupation number of a one-particle outgoing mode $a_{\varpi,\ell}$ localized (as far as a normalized mode can be) around $\varpi$ in angular-frequency space and about retarded time v, and with angular momentum “quantum number” $\ell$, in the in-vacuum state (i.e., on the minimal algebra for a real scalar field on this model spacetime) $\omega_{\mathrm{in}}$ is, at late retarded times, given by the formula

$$\omega_{\mathrm{in}}(N(a_{\varpi,\ell})) = \frac{\Gamma(\varpi,\ell)}{\exp(8\pi M\varpi) - 1}$$

where M is the mass of the black hole and the absorption factor (alternatively known as gray-body factor) $\Gamma(\varpi,\ell)$ is equal to the norm-squared of that part of the one-particle mode $a_{\varpi,\ell}$ which, viewed as a complex positive-frequency classical solution propagating backwards in time from late retarded times, would be absorbed by the black hole. (Note the independence of the right-hand side of this formula from the retarded time, v.) This calculation can be understood as an application of result (R1)
(even though the spacetime is more complicated than one with a localized “bump of curvature” and even though the relevant overall time evolution will not be unitarily implemented, the result still applies when suitably interpreted) and the heart of the calculation is an asymptotic estimate of the relevant “$\beta$” Bogoliubov coefficient which turns out to be dependent on the geometrical optics of rays which pass through the star just before the formation of the horizon. This result suggests that the in-vacuum state is indistinguishable at late retarded times from a state of blackbody radiation at the Hawking temperature, $T_{\mathrm{Hawking}} = 1/8\pi M$, in Minkowski space from a blackbody (gray body) with the same absorption factor. This was confirmed by further work by many authors. Much of that work, as well as the original result of Hawking was partially heuristic but later work by Dimock and Kay (1987), by Fredenhagen and Haag (1990), and by Bachelot (1999) and others has put different aspects of it on a rigorous mathematical footing. The result generalizes to nonzero mass and higher spin fields to interacting fields as well as to other types of black hole and the formula for the Hawking temperature generalizes to

$$T_{\mathrm{Hawking}} = \kappa/2\pi$$

where $\kappa$ is the surface gravity of the black hole.

This result suggests that there is something fundamentally “thermal” about quantum fields on black-hole backgrounds and this is confirmed by a number of mathematical results. In particular, the theorems in the two papers Kay and Wald (1991) and Kay (1993), combined together, tell us that there is a unique state on the Weyl algebra for the maximally extended Schwarzschild spacetime (a.k.a. Kruskal–Szekeres spacetime) (see Figure 5) which is invariant under the Schwarzschild isometry group and whose two-point function has Hadamard form. Moreover, they tell us that this state, when restricted to a single wedge (i.e., the exterior Schwarzschild spacetime) is necessarily a KMS state at the Hawking temperature. This unique state is known as the Hartle–Hawking–Israel state. These results in fact apply more generally to a wide class of globally hyperbolic spacetimes with bifurcate Killing horizons including de Sitter space – where the unique state is sometimes called the Euclidean and sometimes the Bunch–Davies vacuum state – as well as to Minkowski space, in which case the unique state is the usual Minkowski vacuum state, the analog of the exterior Schwarzschild wedge is a so-called Rindler wedge, and the relevant isometry group is a one-parameter family of wedge-preserving Lorentz boosts. In the latter situation, the fact that the Minkowski vacuum state is a KMS state (at “temperature” $1/2\pi$) when restricted to a Rindler wedge and regarded with respect to the time evolution consisting of the wedge-preserving one-parameter family of Lorentz boosts is known as the Unruh effect (1975). This latter property of the Minkowski vacuum in fact generalizes to general Wightman QFTs and is in fact an immediate consequence of a combination of the Reeh–Schlieder theorem (applied to a Rindler wedge) and the Bisognano–Wichmann theorem (1975). The latter theorem says that the defining relation [1] of a KMS state holds if, in [1], we identify the operator J with the complex conjugation which implements wedge reflection and H with the self-adjoint generator of the unitary implementor of Lorentz boosts. We remark that the Unruh effect illustrates how the concept of “vacuum” (when meaningful at all) is dependent on the choice of time evolution under consideration. Thus, the usual Minkowski vacuum is a ground state with respect to the usual Minkowski time evolution but not (when restricted to a Rindler wedge) with respect to a one-parameter family of Lorentz boosts; with respect to these, it is, instead, a KMS state.

Figure 5 The geometry of maximally extended Schwarzschild (/or Minkowski) spacetime. In the Schwarzschild case, every point represents a 2-sphere (/in the Minkowski case, a 2-plane). The curves with arrows on them indicate the Schwarzschild time evolution (/one-parameter family of Lorentz boosts). These curves include the (straight lines at right angles) event horizons (/Killing horizons). (Labels in the figure: Future singularity (Schwarzschild case); Exterior Schwarzschild wedge/Rindler wedge; Past singularity (Schwarzschild case).)

Nonglobally Hyperbolic Spacetimes and the “Time Machine” Question

Hawking (1992) argued that a spacetime in which a time machine gets manufactured should be modeled (see Figure 6) by a spacetime with an initial globally
hyperbolic region with a region containing closed timelike curves to its future and such that the future boundary of the globally hyperbolic region is a compactly generated Cauchy horizon. On such a spacetime, Kay et al. (1997) proved that it is impossible for any distributional bisolution which satisfies (even a certain weakened version of) the Hadamard condition on the initial globally hyperbolic region to continue to satisfy that condition on the full spacetime – the (weakened) Hadamard condition being necessarily violated at at least one point on the Cauchy horizon. This result implies that, however one extends a state, satisfying our conditions (C1)–(C4), on the minimal algebra for [2] on the initial globally hyperbolic region, the expectation value of its stress–energy tensor must necessarily become singular on the Cauchy horizon. This result, together with many heuristic results and specific examples considered by many other authors appears to support the validity of the (Hawking 1992) chronology protection conjecture to the effect that it is impossible in principle to manufacture a time machine. However, there are potential loopholes in the physical interpretation of this result as pointed out by Visser (1997), as well as other claims by various authors that one can nevertheless violate the chronology protection conjecture. For a recent discussion on this question, we refer to Visser (2003).

Figure 6 The schematic geometry of a spacetime in which a time machine gets manufactured. (Labels in the figure: Region with closed timelike curves; Cauchy horizon; Initial globally hyperbolic region.)

Other Related Topics and Some Warnings

There is a vast computational literature, calculating the expectation values of stress–energy tensors in states of interest for scalar and higher spin linear fields (and also some work for interacting fields) on interesting cosmological and black-hole backgrounds. QFT on de Sitter and anti-de Sitter space is a big subject area in its own right with recent renewed interest because of its relevance to string theory and holography. Also important on black-hole backgrounds is the calculation of gray-body factors, again with renewed interest because of relevance to string theory and to brane-world scenarios.

There are many further mathematically rigorous results on algebraic and axiomatic QFT in a curved spacetime setting, including versions of PCT, spin-statistics and Reeh–Schlieder theorems and also rigorous energy inequalities bounding the extent to which expected energy densities can be negative, etc.

There is much mathematical work controlling scattering theory on black holes, partly with a view to further elucidating the Hawking effect.

Perturbative renormalization theory of interacting quantum fields in curved spacetime is also now a highly developed subject.

Beyond QFT in a fixed curved spacetime is semiclassical gravity which takes into account the back-reaction of the expectation value of the stress–energy tensor on the classical gravitational background. There are also interesting condensed matter analogs of the Hawking effect such as dumb holes.

Readers exploring the wider literature, or doing further research on the subject should be aware that the word “vacuum” is sometimes used to mean “ground state” and sometimes just to mean “quasifree state.” They should be cautious of attempts to define particles on Cauchy surfaces in instantaneous diagonalization schemes (cf. the remarks at the end of the section “Particle creation and the limitations of the particle concept”). When studying (or performing) calculations of the “expectation value of the stress–energy tensor” it is always important to ask oneself with respect to which state the expectation value is being taken. It is also important to remember to check that candidate two-point (anticommutator) functions satisfy the positivity condition (C3) discussed earlier. Typically, two-point distributions obtained via mode sums automatically satisfy condition (C3) (and condition (C4)), but those obtained via image methods do not always satisfy it. (When they do not, the presence of nonlocal spacelike singularities is often a tell-tale sign as can be inferred from Kay’s conjecture/Radzikowski’s theorem discussed earlier.) There are a number of apparent implicit assertions in the literature that some such two-point functions arise from “states” when of course they cannot. Some of these concern proposed analogs to the Hartle–Hawking–Israel state for the (appropriate maximal globally hyperbolic portion of the maximally extended) Kerr spacetime. That they cannot belong to states is clear from a theorem in Kay and Wald (1991) which states that there is no stationary Hadamard state on this spacetime at all. Others of them concern claimed “states” on spacetimes such as those discussed in the previous section which, if they really were states would seem to be in conflict with the chronology protection conjecture. Finally, beware states (such as the so-called $\alpha$-vacua of de Sitter spacetime) whose two-point
distributions violate the “Hadamard” condition (C4) and which therefore do not have a well-defined finite expectation value for the renormalized stress–energy tensor.

See also: AdS/CFT Correspondence; Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Black Hole Mechanics; Bosons and Fermions in External Fields; Integrability and Quantum Field Theory; Quantum Fields with Indefinite Metric: Non-Trivial Models; Quantum Fields with Topological Defects; Quantum Geometry and Its Applications; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Thermal Quantum Field Theory.

Further Reading

Birrell ND and Davies PCW (1982) Quantum Fields in Curved Space. Cambridge: Cambridge University Press.
Brunetti R, Fredenhagen K, and Verch R (2003) The generally covariant locality principle – a new paradigm for local quantum physics. Communications in Mathematical Physics 237: 31–68.
DeWitt BS (1975) Quantum field theory in curved space-time. Physics Reports 19(6): 295–357.
Dimock J (1980) Algebras of local observables on a manifold. Communications in Mathematical Physics 77: 219–228.
Haag R (1996) Local Quantum Physics. Berlin: Springer.
Hartle JB and Hawking SW (1976) Path-integral derivation of black-hole radiance. Physical Review D 13: 2188–2203.
Hawking SW (1975) Particle creation by black holes. Communications in Mathematical Physics 43: 199–220.
Hawking SW (1992) The chronology protection conjecture. Physical Review D 46: 603–611.
Israel W (1976) Thermo-field dynamics of black holes. Physics Letters A 57: 107–110.
Kay BS (1979) Casimir effect in quantum field theory. (Original title: The Casimir effect without magic.) Physical Review D 20: 3052–3062.
Kay BS (1993) Sufficient conditions for quasifree states and an improved uniqueness theorem for quantum fields on space-times with horizons. Journal of Mathematical Physics 34: 4519–4539.
Kay BS (2000) Application of linear hyperbolic PDE to linear quantum fields in curved spacetimes: especially black holes, time machines and a new semi-local vacuum concept. Journées Équations aux Dérivées Partielles, Nantes, 5–9 juin 2000, GDR 1151 (CNRS): IX1–IX19. (Also available at http://www.math.sciences.univ-nantes.fr or as gr-qc/0103056.)
Kay BS, Radzikowski MJ, and Wald RM (1997) Quantum field theory on spacetimes with a compactly generated Cauchy horizon. Communications in Mathematical Physics 183: 533–556.
Kay BS and Wald RM (1991) Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on spacetimes with a bifurcate Killing horizon. Physics Reports 207(2): 49–136.
Misner CW, Thorne KS, and Wheeler JA (1973) Gravitation. San Francisco: W.H. Freeman.
Unruh W (1976) Notes on black hole evaporation. Physical Review D 14: 870–892.
Visser M (2003) The quantum physics of chronology protection. In: Gibbons GW, Shellard EPS, and Rankin SJ (eds.) The Future of Theoretical Physics and Cosmology. Cambridge: Cambridge University Press.
Wald RM (1978) Trace anomaly of a conformally invariant quantum field in a curved spacetime. Physical Review D 17: 1477–1484.
Wald RM (1994) Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago: University of Chicago Press.
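
As a numerical illustration of the formulas quoted in the section on the Hawking and Unruh effects, the following minimal Python sketch evaluates the Hawking temperature 1/(8πM) with the constants ħ, c, G and k_B restored, and the late-time occupation number Γ/(exp(8πMϖ) − 1) in units where those constants are 1. The function names, the nominal solar mass, and the sample gray-body factor 0.5 are assumptions made for illustration; they are not taken from the article, and the gray-body factor of a real mode has to be computed separately.

```python
import math

# SI constants (CODATA 2018 values).
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m / s
G = 6.67430e-11          # Newton constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23       # Boltzmann constant, J / K


def hawking_temperature_kelvin(mass_kg):
    """T_Hawking = 1/(8 pi M), rewritten in SI units as hbar c^3 / (8 pi G M k_B)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)


def occupation_number(varpi, mass, gray_body_factor):
    """Gamma / (exp(8 pi M varpi) - 1) in units with G = c = hbar = k_B = 1.

    gray_body_factor is a hypothetical input here; the article defines it as
    the norm-squared of the absorbed part of the mode.
    """
    # expm1 keeps precision when 8 pi M varpi is small.
    return gray_body_factor / math.expm1(8.0 * math.pi * mass * varpi)


if __name__ == "__main__":
    solar_mass_kg = 1.98847e30  # assumed nominal solar mass
    print("T_Hawking for one solar mass [K]:",
          hawking_temperature_kelvin(solar_mass_kg))
    print("occupation number (M = 1, varpi = 0.01, Gamma = 0.5):",
          occupation_number(0.01, 1.0, 0.5))
```

For a solar-mass black hole the temperature comes out at roughly 6e-8 K, far below that of the cosmic microwave background, which is one way to see why the effect is negligible for astrophysical black holes. Doubling the mass halves the temperature, as the 1/M dependence requires.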

Quantum Field Theory: A Brief Introduction


L H Ryder, University of Kent, Canterbury, UK
© 2006 Elsevier Ltd. All rights reserved.

By any account quantum field theory occupies a prominent place in the history of mathematical physics. This article is, however, not intended to serve as an overview of this subject, but has the more modest aim of identifying a few areas which seem to me interesting and significant.

Historical Remarks; Second Quantization

At the time when quantum field theory was at the forefront of theoretical physics its raison d’être was to complete the quantum description of the subatomic world. Quantum mechanics had been amazingly successful in solving almost the whole of atomic physics by making explicit the quantum (wave) nature of the electron, according to the formulations of Heisenberg and Schrödinger. The introduction of the quantum idea into physics, however, by Planck in 1900 closely followed by Einstein in 1905 was the proposal of a quantum (particular) aspect of the electromagnetic field – the photon. In the mid-1920s the only force in nature to be considered was the electromagnetic interaction; this was before the theories of Yukawa and Fermi, concerning the strong and weak nuclear forces. Dirac, Heisenberg, Jordan, and others then addressed themselves to finding a formulation of quantum electrodynamics (QED) comparable in mathematical sophistication to the Heisenberg–Schrödinger formulation of quantum mechanics – which Planck’s and Einstein’s theories were not. The idea that was pursued, at least in the early stages, was that the Schrödinger wave function $\psi$, taken as a wave field, should be “quantized”; Dirac
seems to have taken this as a model for photons. Jordan further proposed that electrons should be treated as the quanta of an electron field, but recognized that their fermionic nature would modify the quantization procedure. This generic idea involved what was called “second quantization” – of a field into a particle.

One of the earliest quantization rules was Bohr’s condition relating to the periodic orbits of electrons in atoms, $J = \int p\,dq = nh$. At the hands of Heisenberg and Dirac this became upgraded to the commutation relation

$$[q, p] = i\hbar$$

where the operators p and q are “observables.” In their papers on quantum field theory, Dirac, Jordan and Wigner, and Heisenberg introduced creation and annihilation operators which had the function, as their name implied, of creating and destroying single particles – quanta of the field. These operators obeyed the commutation rules (with $[A, B] = AB - BA$)

$$[b_r, b_s^\dagger] = \delta_{rs}, \qquad [b_r, b_s] = [b_r^\dagger, b_s^\dagger] = 0$$

when the field quanta were bosons, and the anticommutation rules

$$\{b_r, b_s^\dagger\} = \delta_{rs}, \qquad \{b_r, b_s\} = \{b_r^\dagger, b_s^\dagger\} = 0$$

(with $\{A, B\} = AB + BA$) when the field quanta were fermions (e.g., electrons). These steps constitute second quantization, but it may be noted that the creation and annihilation operators are not observables, as p and q are in the Heisenberg commutation relation. In addition, the second quantization conditions do not involve Planck’s constant. “First” and “second” quantization are therefore not so similar as one might like to think.

The question of what exactly is being quantized was in fact the source of some confusion. In his paper of 1927, Dirac’s attention is focussed on electromagnetic radiation, but he nevertheless discusses the difference between “a light-wave and the de Broglie or Schrödinger wave associated with the light-quanta.” As Dirac points out, “their intensities are to be interpreted in different ways. The number of light quanta per unit volume associated with a monochromatic light-wave equals the energy per unit volume of the wave divided by the energy $(2\pi h)\nu$ of a single light quantum. On the other hand a monochromatic de Broglie wave of amplitude a (multiplied into the imaginary exponential factor) must be interpreted as representing $a^2$ light quanta per unit volume for all frequencies.” There are at least two problematic issues here. First, is the Schrödinger wave function to be considered as a “real” field, whose quanta result in “real” particles, or is it a probability field, whose significance lies in Born’s probabilistic interpretation of quantum mechanics? Born wrote in 1926, “[Einstein said that] the waves are present only to show the corpuscular light quanta the way, and he spoke in the sense of a ‘ghost field’. This determines the probability that a light quantum, the bearer of energy and momentum, takes a certain path; however, the field itself has no energy and no momentum.” This is the first problem. The second one concerns the nature of the quantization itself. Is this a quantization of field energy, or a quantization of the field itself, as a substantial entity? If the field is real, the second of these does not imply the first.

Ambiguities surrounding the idea of second quantization survived into the 1960s. Wigner is recorded as saying, in an interview in 1963, “just as we get photons by quantising the electromagnetic fields, so we should be able to get material particles by quantising the Schrödinger field.” And Rosenfeld, also in an interview in 1963, said, “in some sense or other, Jordan himself took the wave function, the probability amplitude, physically more seriously than most people [did].”

It would seem we are justified in concluding that the idea of second quantization contains flaws, but an even clearer indication of the need for rethinking is provided by the story of the Dirac equation. This is a wave equation for the electron, compatible with special relativity, and taking explicit account of its spin being $(1/2)\hbar$. The equation famously had both positive- and negative-energy solutions. This potential disaster was converted by Dirac into a triumph by reinterpreting the (absence of) negative-energy solutions as (positive-energy) antiparticles – positrons, particles with positive charge but the same mass and spin as the electron. Positrons were eventually discovered by Anderson. It was later shown that the existence of antiparticles is a general feature of quantum field theory, not just a peculiarity of spin-1/2 particles. The significance of this discovery, however, is that the twin requirements of relativity and quantum theory are not compatible with a single-particle state; rather, these requirements result in a two-particle state. Thus, in some sense the requirements of relativity and quantum mechanics already start to take us down the road to a quantum theory of fields.

Quantum field theory is then constructed on the following sort of framework: “classical” theories for fields with any spin may be written down and these are quantized by reinterpreting the field variables as operators and imposing Heisenberg-type commutation relations on the field and its corresponding
“momentum” variable. So, for example, for spinless fields we have the equal-time commutation relation

$$[\phi(x, t), \pi(y, t)] = i\hbar\,\delta^{(3)}(x - y)$$

where $\pi = \partial\mathcal{L}/\partial(\partial_0\phi)$ and $\mathcal{L}$ is the Lagrange density. The mass and spin of particles are defined with reference to the Poincaré group (thereby incorporating special relativity) and the quantum requirement is the familiar one that physical states are represented by vectors in Hilbert space. The rest follows: as Weinberg says, “quantum field theory is the way it is because (with certain qualifications) this is the only way to reconcile quantum mechanics with special relativity.”

Renormalization

A notorious problem in quantum field theory is the occurrence of infinities. In QED, for example, the electron acquires a self-energy – and therefore a contribution to its mass – by virtue of the emission and reabsorption of virtual photons. It turns out that this self-energy is infinite – it is given by a divergent integral – even in the lowest order of perturbation theory. In the early days, this was recognized as being a serious problem, and in fact it turns out to be a generic problem in quantum field theory. It was realized by Dyson, however, that in some field theories these divergences may be dealt with by redefining a small number of parameters (e.g., in QED, the electron mass, charge, and field amplitude) so that thereafter the theory is finite to all orders of perturbation theory. Such theories are called renormalizable, and QED is a renormalizable field theory.

Some important field theories, however, are not renormalizable; an example is Fermi’s theory of weak interactions. To lowest order in perturbation theory, Fermi’s theory works well (e.g., in accounting for the electron spectrum in neutron beta decay), but to higher orders divergent results are obtained, which cannot be waved away by redefining a finite number of parameters; that is to say, as the order of perturbation increases, so also does the number of parameters to be redefined. Nonrenormalizable theories of this type have traditionally been regarded as highly undesirable, not to say rather nasty.

The modern view of renormalization is, however, somewhat different. The problem with nonrenormalizable theories is that, in order to calculate a physical process to all orders in perturbation theory, an infinite number of parameters must be renormalized, so the theory has no predictive power. In practice, however, we do not need to calculate to all orders in perturbation theory, since any physical process (say a scattering process or a particle decay) will only be observed at a finite energy and comparison of theory and experiment therefore only requires calculation up to a finite order of perturbation theory. So even nonrenormalizable theories are perfectly acceptable as low-energy theories. This amounts to a philosophy of effective field theories; an effective field theory is a model which holds good up to a particular energy scale, or equivalently down to a particular length scale.

An important addition to the theoretical armoury is the renormalization group. Renormalization is implemented first of all by a scheme of regularization, which enables the divergences to be exhibited explicitly. The simplest type of regularization is the introduction of a cutoff in the momentum integrals, but in modern particle physics the favored scheme is dimensional regularization. The dimensionality of the integrals in momentum space is taken to be $d = 4 - \varepsilon$ and the divergent quantities have an explicit dependence on $\varepsilon$ (which, of course, as the “real” world is approached, approaches zero). At the same time, a mass parameter $\mu$ is introduced in order to define dimensionless quantities, for example, a dimensionless coupling constant. The renormalized quantities then depend on the “bare” (unrenormalized) quantities and on $\mu$ and $\varepsilon$. The arbitrariness of $\mu$ enables a differential equation, for scattering amplitudes, for example, to be written down. While at first sight this renormalization group equation might seem to have no physical importance, in fact it gives a powerful way of studying scattering behavior at large momenta.

Most interestingly, the concept of the renormalization group also arises in condensed matter physics. Here, rather than, for example, a cutoff in momentum space, the relevant parameter is a distance scale. In the Ising model in statistical mechanics, for example, in which spins are located on a lattice, the parameter is the lattice spacing. To construct a theory that describes the physics on the macroscopic scale involves integrating out the details on the microscopic scale and one way to do this is via the “block spin” transformation originally introduced by Kadanoff. In this way the renormalization group has had a large impact in condensed matter physics, for example, in the study of critical phenomena.

Particle Physics and Cosmology

Probably the most spectacular success of quantum field theory in the twentieth century has been in particle physics. The “standard model” accounts for the strong, electromagnetic, and weak interactions
between elementary particles with outstanding success. The interactions are generalizations of Maxwell’s electrodynamics, which is invariant under a symmetry group U(1) of gauge transformations. An enlargement of this group to SU(2) × U(1) accounts for the unified electroweak interaction (the unification resulting from the fact that the two U(1)’s above are not exactly the same; there is some on-diagonal mixing), and the strong interactions between quarks, which bind them into hadrons, are invariant under an SU(3) group of gauge transformations. The gauge fields are the photon $\gamma$, the W and Z bosons (both heavy; of the order of 100 times the proton mass), and the (massless) gluons mediating the force between quarks (quantum chromodynamics, QCD). An important feature of the standard model is spontaneous symmetry breaking, which is the mechanism by which the W and Z particles acquire a mass (but the photon does not, and neither do the gluons). This goes by the name of the Higgs mechanism.

The quantization of the standard model is most successfully carried out using the path-integral formalism, rather than canonical quantization, and the proof of the renormalizability of the model (of nonabelian gauge theories with spontaneous symmetry breaking) was given by ’t Hooft. Details of these topics are now available in many textbooks.

Confidence that this is a realistic model of elementary particles – that is to say, of quarks and leptons – depends, of course, on particular experiments and their interpretation and an important milestone on this journey was Feynman’s quark–parton model of deep inelastic electron–proton scattering. The interpretation of the data required a picture of an electron scattering from an individual quark in the proton, and this in turn required a negligible interaction between quarks; in other words, that at small distances (inside the proton) the quarks are (almost) free – despite the fact that at large distances they most certainly are not! The proof, by Gross, Politzer, and Wilczek, that nonabelian gauge theories are indeed asymptotically free (asymptotic in momentum space, that is) was therefore an important event in helping to establish the credibility of the standard model.

A characteristic contribution of quantum field theory to our view of the physical world is its picture of the vacuum, as being populated with virtual particle–antiparticle pairs. A consequence of this is the phenomenon of vacuum polarization – that the presence of an electric charge in free space polarizes these virtual pairs.

of the zero-point energies of all the oscillators in the Fourier expansion of the scalar field operator. In any other interaction than gravity, this zero-point energy may be ignored, but in gravity it may be expected to have observable consequences, and indeed it turns out that it plays the same role as a cosmological constant $\Lambda$, and therefore acts as an agent of acceleration, rather than deceleration, of the universe.

A final topic worth noting is one whose existence would have been inconceivable in the early days of this subject. The nonlinearity of the (nonabelian) gauge field equations and the existence of a nontrivial group space allow new types of topologically nontrivial solutions to these equations: solitons, bounces, instantons, sphalerons, and so on. Effects such as fractional spin and nonconservation of fermion number also appear, and, on the cosmological scale, domain walls and cosmic strings. There is something here for theoretical physicists of many differing interests.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; BRST Quantization; Constrained Systems; Constructive Quantum Field Theory; Deformation Quantization; Electroweak Theory; Euclidean Field Theory; Exact Renormalization Group; Integrability and Quantum Field Theory; Nonperturbative and Topological Aspects of Gauge Theory; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Quantum Electrodynamics and Its Precision Tests; Quantum Fields with Indefinite Metric: Non-Trivial Models; Quantum Fields with Topological Defects; Renormalization: General Theory; Standard Model of Particle Physics; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Topological Defects and Their Homotopy Classification; Topological Quantum Field Theory: Overview; Twistors.

Further Reading

Cao TY (1997) Conceptual Developments of 20th Century Field Theories. Cambridge: Cambridge University Press.
Davies P (ed.) (1989) The New Physics. Cambridge: Cambridge University Press.
Gross F (1993) Relativistic Quantum Mechanics and Field Theory. New York: Wiley.
Huang K (1998) Quantum Field Theory. New York: Wiley.
Itzykson C and Zuber J-B (1980) Quantum Field Theory. New York: McGraw-Hill.
Maggiore M (2005) A Modern Introduction to Quantum Field Theory. Oxford: Oxford University Press.
Rubakov V (2002) Classical Theory of Gauge Fields. Princeton: Princeton University Press.
This in turns leads to the phenomenon of screening in Schweber SS (1994) QED and the Men Who Made It. Princeton:
QED, and antiscreening in QCD, SU(3) having a more Princeton University Press.
Schwinger J (ed.) (1958) Quantum Electrodynamics. New York:
complicated structure than U(1). It also leads to a Dover.
nonzero (in fact, quadratically divergent!) value for the ’t Hooft G (1997) In Search of the Ultimate Building Blocks.
energy of the vacuum. This is in effect the contribution Cambridge: Cambridge University Press.
216 Quantum Fields with Indefinite Metric: Non-Trivial Models

Weinberg S (1995, 1996) The Quantum Theory of Fields, vol. 1 Century, (Reprinted in Mehra J (1973) The Physicist’s Conception
and 2. Cambridge: Cambridge University Press. of Nature. Dordrecht: Reidel.) New York: Interscience.
Wentzel G (1960) Quantum theory of fields (until 1947). In: Fierz M Zee A (2003) Quantum Field Theory in a Nutshell. Princeton:
and Weisskopf VF (eds.) Theoretical Physics in the Twentieth Princeton University Press.

Quantum Fields with Indefinite Metric: Non-Trivial Models


S Albeverio and H Gottschalk, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The nonperturbative construction of quantum field models with nontrivial scattering in arbitrary dimension $d$ of the underlying Minkowski spacetime is much simpler in the framework of quantum field theory with indefinite metric than in the positive-metric case. In particular, there exist a number of solutions in the physical dimension $d = 4$, where up to now no positive-metric solutions are known. The reasons why this is so are reviewed in this article, and some examples obtained by analytic continuation from the solutions of Euclidean covariant stochastic partial differential equations (SPDEs) driven by non-Gaussian white noise are discussed.

The Hilbert Space Structure Condition

It has been proved by F Strocchi that a quantum gauge field in a local, covariant gauge cannot act on a Hilbert space with a positive-definite inner product. But it is possible to overcome this obstacle by passing from a Hilbert space representation of the algebra of the quantum field to Krein space representations in order to preserve locality and covariance under the Poincaré group.

A Krein space $K$ is an inner-product space which also is a Hilbert space with respect to some auxiliary scalar product. The relation between the inner product $\langle\cdot,\cdot\rangle$ and the auxiliary scalar product $(\cdot,\cdot)$ is given by a self-adjoint linear operator $J : K \to K$ with $J^2 = 1_K$ and $\langle\cdot,\cdot\rangle = (\cdot, J\,\cdot)$. $J$ is called the metric operator. A quantum field acting on such a space is called a quantum field with indefinite metric. The formal definition is as follows.

Let $D \subseteq K$ be a dense linear space and $\Omega \in D$ a distinguished vector (henceforth called the vacuum). Let $S = S(\mathbb{R}^d, \mathbb{C}^N)$ be the space of Schwartz test functions with values in $\mathbb{C}^N$. A quantum field $\phi$ by definition is a linear mapping from $S$ to the linear operators on $D$. One usually assumes that $D$ is generated as the linear span of vectors generated by repeated application of field operators to the vacuum. The following properties should hold for the quantum field $\phi$:

1. Temperedness: $f_n \to f$ in $S$ $\Rightarrow$ $\langle\Psi, \phi(f_n)\Phi\rangle \to \langle\Psi, \phi(f)\Phi\rangle$ $\forall\, \Psi, \Phi \in D$.
2. Covariance: There exists a weakly continuous representation $U$ of the covering of the orthochronous, proper Poincaré group $\tilde{P}_+^\uparrow$ by linear operators on $D$ which is $J$-unitary, that is, $U^{[*]} = U^{-1}$ with $U^{[*]} = J U^* J|_D$, and leaves $\Omega$ invariant. $\phi$ is said to be covariant with respect to $U$ and a representation $\tau$ of the covering of the orthochronous, proper Lorentz group $\tilde{L}_+^\uparrow$ if $U(g)\phi(f)U(g)^{-1} = \phi(f_g)$, where $f_g(x) = \tau(\Lambda) f(\Lambda^{-1}(x - a))$, $g = \{\Lambda, a\}$, $\Lambda \in \tilde{L}_+^\uparrow$, $a \in \mathbb{R}^d$.
3. Spectrality: Let $U(a)$, $a \in \mathbb{R}^d$, be the representation of the translation group and let $\sigma = \bigcup_{\Psi,\Phi \in D} \operatorname{supp} F(\langle\Psi, U(\cdot)\Phi\rangle)$ with $F$ the Fourier transform (in the sense of tempered distributions). Formally, $\sigma$ is the joint spectrum of the generators of spacetime translations $U(a)$. The spectral condition then demands that $\sigma \subseteq \bar{V}_0^+$, the closed forward light cone in energy–momentum space.
4. Locality: There is a decomposition $\mathbb{C}^N = \bigoplus_\kappa V_\kappa$ such that for each $f, h \in S$ taking values in a $V_\kappa$ and having spacelike separated supports one has either $[\phi(f), \phi(h)] = 0$ or $\{\phi(f), \phi(h)\} = 0$, where $[\cdot,\cdot]$ is the commutator and $\{\cdot,\cdot\}$ the anticommutator.
5. Hermiticity: There is an involution $*$ on $S$ such that $\phi(f)^{[*]} = \phi(f^*)$.

The quantum-mechanical interpretation of the inner product of two vectors in $K$ as a probability amplitude, however, gets lost. It has to be restored by the construction of a physical subspace of $K$ where the restriction of the inner product is non-negative. This is called the Gupta–Bleuler gauge procedure. Typically, one first considers the problem of constructing quantum fields with indefinite metric, that is, the dynamical problem is addressed. This is often followed by the construction of the physical states, which involves implementation of quantum constraints.
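The Krein space structure described above can be illustrated in the smallest nontrivial case. The following is a minimal sketch, assuming the toy space $K = \mathbb{R}^2$ with metric operator $J = \mathrm{diag}(1, -1)$; the space, the helper names (`krein`, `j_adjoint`, `boost`), and the choice of a hyperbolic rotation as the $J$-unitary operator are illustrative assumptions and are not taken from the article.

```python
import math

# Metric operator J on the toy Krein space K = R^2 (illustrative assumption).
J = [[1.0, 0.0], [0.0, -1.0]]


def matvec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def scalar(x, y):
    """Auxiliary positive-definite scalar product (x, y)."""
    return sum(a * b for a, b in zip(x, y))


def krein(x, y):
    """Indefinite inner product <x, y> = (x, J y)."""
    return scalar(x, matvec(J, y))


def j_adjoint(U):
    """J-adjoint U^[*] = J U^T J (real case, so U^* is the transpose)."""
    return matmul(J, matmul(transpose(U), J))


def boost(eta):
    """Hyperbolic rotation; J-unitary but not unitary for eta != 0."""
    c, s = math.cosh(eta), math.sinh(eta)
    return [[c, s], [s, c]]


if __name__ == "__main__":
    print(krein([1.0, 0.0], [1.0, 0.0]))   # positive-norm vector
    print(krein([0.0, 1.0], [0.0, 1.0]))   # negative-norm vector
    print(krein([1.0, 1.0], [1.0, 1.0]))   # nonzero vector of zero norm
    U = boost(0.7)
    print(matmul(j_adjoint(U), U))          # close to the identity matrix
```

In this sketch the vector $(1, 1)$ is nonzero but has vanishing inner product with itself, which is why the probability interpretation is lost on all of $K$ and has to be restored on a subspace where the inner product is non-negative.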
The vacuum expectation values (VEVs), also called Wightman functions, of the quantum field theory with indefinite metric (IMQFT) are defined as

$$W_n(f_1 \otimes \cdots \otimes f_n) = \langle\Omega, \phi(f_1) \cdots \phi(f_n)\Omega\rangle, \quad f_1, \ldots, f_n \in S \qquad [1]$$

An axiomatic framework for (unconstrained) IMQFT has been suggested by G Morchio and F Strocchi in terms of the Wightman functions $W_n \in (S^{\otimes n})'$, $n \in \mathbb{N}_0$. Previous work on the topic had been done by J Yngvason. These generalized Wightman axioms of Morchio and Strocchi replace the positivity condition on the Wightman functions by a so-called Hilbert space structure condition (HSSC): for $n \in \mathbb{N}_0$ there exists a Hilbert seminorm $p_n$ on $S^{\otimes n}$ such that

$$|W_{n+m}(f \otimes h)| \le p_n(f)\, p_m(h) \quad \forall\, n, m \in \mathbb{N}_0,\ f \in S^{\otimes n},\ h \in S^{\otimes m} \qquad [2]$$

This condition makes sure that a field algebra on a Krein space with VEVs equal to the given set of Wightman functions can be constructed. The remaining axioms of the Wightman framework – temperedness, covariance, spectral condition, locality, and Hermiticity – remain the same. Clustering of Wightman functions is assumed at least for massive theories:

$$\lim_{t \to \infty} W_{n+m}(f \otimes h_{ta}) = W_n(f)\, W_m(h) \quad \forall\, n, m \in \mathbb{N}_0,\ f \in S^{\otimes n},\ h \in S^{\otimes m} \qquad [3]$$

for spacelike $a \in \mathbb{R}^d$. It fails to hold in certain physical contexts where multiple vacua (also called $\theta$-vacua) accompanied by massless Goldstone bosons occur due to spontaneous symmetry breaking.

In the original Wightman axioms, there are essentially two nonlinear axioms: positivity and clustering. Here nonlinear means that checking that condition involves more than one VEV with a given number of field operators. The cluster condition can be linearized by an operation on the Wightman functions called ‘‘truncation.’’ The equations

$$W_n(f_1 \otimes \cdots \otimes f_n) = \sum_{I \in P^{(n)}} \prod_{\substack{\{j_1, \ldots, j_l\} \in I \\ j_1 < j_2 < \cdots < j_l}} W_l^T(f_{j_1} \otimes \cdots \otimes f_{j_l}) \qquad [4]$$

recursively define the truncated Wightman functions $W_n^T$ for $n \in \mathbb{N}$. Here $P^{(n)}$ stands for the set of all partitions of $\{1, \ldots, n\}$ into disjoint, nonempty sets. Unfortunately, the positivity condition (at least when combined with nontrivial scattering) becomes highly nonlinear for truncated Wightman functions. This can be seen as one explanation why it is so difficult to find nontrivial (i.e., corresponding to nontrivial interactions) solutions to the Wightman axioms.

But it turns out that, in contrast to positivity, the HSSC is essentially linear for truncated Wightman functions.

Theorem 1 If there exists a Schwartz norm $\|\cdot\|$ on $S$ such that $W_n^T$ is continuous with respect to $\|\cdot\|^{\otimes n}$ for $n \in \mathbb{N}$, then the associated sequence of Wightman functions $\{W_n\}$ fulfills the HSSC [2].

Note that $\|\cdot\|^{\otimes n}$ is well defined as $S$ is a nuclear space. This theorem makes it much easier to construct IMQFTs. In particular, all known solutions of the linear program for truncated Wightman functions lead to an abundance of mathematical solutions to the axioms of IMQFT, as long as the singularities of truncated Wightman functions in position and energy–momentum space do not become increasingly stronger with growing $n$. For example, the perturbative solutions to Wightman functions of Ostendorf and Steinmann provide solutions when the perturbation series is truncated at a given order.

Relativistic Fields from Euclidean Stochastic Equations

In the classical work on constructive quantum field theory, relativistic fields in spacetime dimensions $d = 2$ and $3$ have been constructed by analytic continuation from Euclidean random fields. This, in particular, has led to firm connections between quantum field theory and equilibrium statistical mechanics. Let us discuss one specific class of solutions of the axioms of IMQFT for arbitrary $d$ which also stem from random fields related to an ensemble of statistical mechanics of classical, continuous particles. Mathematically, this is connected with using random fields with Poisson distribution. As in constructive QFT, the moments, also called Schwinger functions, of the random field can be analytically continued from Euclidean imaginary time to relativistic real time. That this is possible results from an explicit calculation. Axiomatic results cannot be used, as they depend on positivity or reflection positivity in the Euclidean spacetime, respectively.

By definition, a mixing Euclidean covariant random field $\varphi$ is an almost surely linear mapping from $S_{\mathbb{R}} = S(\mathbb{R}^d, \mathbb{R}^N)$ to the space of real-valued
measurable functions (random variables) on some probability space that fulfills the following properties:

1. Temperedness: $f_n \to f$ in $S_{\mathbb{R}}$ $\Rightarrow$ $\varphi(f_n) \xrightarrow{L} \varphi(f)$.
2. Covariance: $\varphi(f) \overset{L}{=} \varphi(f_g)$ $\forall f \in S_{\mathbb{R}}$, $g = \{\Lambda, a\}$, $\Lambda \in SO(d)$, $a \in \mathbb{R}^d$, $f_g(x) = \tau(\Lambda) f(\Lambda^{-1}(x - a))$ for some continuous representation $\tau : SO(d) \to GL(N)$.
3. Mixing: $\lim_{t \to \infty} \mathbb{E}[A B_{ta}] = \mathbb{E}[A]\,\mathbb{E}[B]$ for all square-integrable random variables $A = A(\varphi)$, $B = B(\varphi)$, and $B_{ta} = B(\varphi_{ta})$, $\varphi_{ta}(f) = \varphi(f_{ta})$ $\forall f \in S_{\mathbb{R}}$, $a \in \mathbb{R}^d \setminus \{0\}$.

The mixing condition in the Euclidean spacetime plays the same role as the cluster property in the generalized Wightman axioms.

In particular, we consider random fields $\varphi$ obtained as solutions of the SPDE $D\varphi = \eta$. In this equation, $\eta$ is a noise field, that is, $\eta$ is $\tau$-covariant for some representation of $SO(d)$, $\eta(f)$ has infinitely divisible probability law and $\eta(f)$, $\eta(h)$ are independent $\forall f, h \in S_{\mathbb{R}}$ with $\operatorname{supp} f \cap \operatorname{supp} h = \emptyset$. $D$ is a $\tau$-covariant (i.e., $\tau(\Lambda) D \tau(\Lambda)^{-1} = D$ $\forall \Lambda \in SO(d)$) partial differential operator with constant coefficients (also pseudodifferential operators $D$ could be considered). From the classification of infinitely divisible probability laws, it is known that $\eta$ essentially consists of Gaussian white noise and Poisson fields and derivatives thereof. Such a Gauss–Poisson noise field by the Bochner–Minlos theorem is characterized by its Fourier transform. Direct relations with QFT arise if one chooses

$$\mathbb{E}\big[e^{i\eta(f)}\big] = \exp\left\{ \int_{\mathbb{R}^d} \psi(f) - f \cdot \bar\sigma^2 p(-\Delta) f \, dx \right\}, \quad f \in S_{\mathbb{R}} \qquad [5]$$

where $\psi : \mathbb{R}^N \to \mathbb{C}$ is a Lévy function,

$$\psi(t) = i a \cdot t - \frac{t \cdot \sigma^2 t}{2} + z \int_{\mathbb{R}^N \setminus \{0\}} \big(e^{i t \cdot s} - 1\big)\, dr(s), \quad t \in \mathbb{R}^N \qquad [6]$$

Here the centered dot represents a $\tau$-invariant scalar product on $\mathbb{R}^N$, $\sigma^2$ a positive-semidefinite $\tau$-invariant $N \times N$ matrix, $z \ge 0$ a real number and $r$ is a $\tau$-invariant probability measure on $\mathbb{R}^N \setminus \{0\}$ with all moments. Further, $\bar\sigma^2_{\alpha,\beta} = -\big(\partial^2 \psi(t) / \partial t_\alpha \partial t_\beta\big)\big|_{t=0}$, and $p : [0, \infty) \to [0, \infty)$ is a polynomial depending on $D$. If $\hat{D}^{-1}$, the Fourier-transformed inverse of $D$, exists, it can be represented by

$$\hat{D}^{-1}(k) = \frac{Q_E(k)}{\prod_{l=1}^{P} (|k|^2 + m_l^2)^{\nu_l}} \qquad [7]$$

Here $Q_E(k)$ is a complex $N \times N$ matrix with polynomial entries being $\tau$-covariant, $\tau(\Lambda) Q_E(\Lambda^{-1} k) \tau(\Lambda)^{-1} = Q_E(k)$ $\forall \Lambda \in SO(d)$, $k \in \mathbb{R}^d$. $\nu_l \in \mathbb{N}$ and $m_l \in \mathbb{C} \setminus (-\infty, 0)$ are parameters with the interpretation of the mass spectrum $(m_1, \ldots, m_P)$ and $(\nu_1, \ldots, \nu_P)$ the dipole degrees of the related masses. We restrict ourselves to the case of positive mass spectrum where $m_l > 0$, and in this case

$$p(t) = p(t; D) = \frac{\prod_{l=1}^{P} (t + m_l^2)^{\nu_l}}{\prod_{l=1}^{P} m_l^{2\nu_l}}, \quad t > 0 \qquad [8]$$

One can show that $\varphi$ obtained as the unique solution of the SPDE $D\varphi = \eta$ is a Euclidean covariant, mixing random field. The Schwinger functions (moments) of $\varphi$ are given by

$$S_n(f_1 \otimes \cdots \otimes f_n) = \mathbb{E}[\varphi(f_1) \cdots \varphi(f_n)], \quad f_1, \ldots, f_n \in S_{\mathbb{R}} \qquad [9]$$

Now the Schwinger functions can be calculated explicitly. They are determined by the truncated Schwinger functions, cf. [4], as follows: for $n = 2$,

$$S^T_{2,\alpha_1\alpha_2}(x_1, x_2) = \frac{Q^E_{2,\alpha_1\alpha_2}(-i\nabla_2)}{\prod_{l=1}^{P} m_l^{2\nu_l}} \left[ \prod_{l=1}^{P} (-\Delta + m_l^2)^{-\nu_l} \right](x_1 - x_2) \qquad [10]$$

and for $n \ge 3$

$$S^T_{n,\alpha_1 \cdots \alpha_n}(x_1, \ldots, x_n) = Q^E_{n,\alpha_1 \cdots \alpha_n}(-i\nabla_n) \int_{\mathbb{R}^d} \prod_{j=1}^{n} \left[ \prod_{l=1}^{P} (-\Delta + m_l^2)^{-\nu_l} \right](x_j - x)\, dx \qquad [11]$$

where

$$Q^E_{n,\alpha_1 \cdots \alpha_n}(-i\nabla_n) = C^{\beta_1 \cdots \beta_n} \prod_{l=1}^{n} Q^E_{\alpha_l,\beta_l}\!\left(-i \frac{\partial}{\partial x_l}\right) \qquad [12]$$

with

$$C_{\beta_1 \cdots \beta_n} = (-i)^n \left. \frac{\partial^n \psi(t)}{\partial t_{\beta_1} \cdots \partial t_{\beta_n}} \right|_{t=0} \qquad [13]$$

and the Einstein convention of summation and raising/lowering of indices on $\mathbb{R}^N$ with respect to the invariant inner product $\cdot$ is applied. The Schwinger functions fulfill the requirements of $\tau$-covariance, symmetry, clustering, and Hermiticity from the Osterwalder–Schrader axioms of Euclidean QFT.

While there is no known general reason why a relativistic QFT should exist for a given set of Schwinger functions, one can take advantage of the explicit formulas [10]–[13] in order to calculate the analytic continuation from Euclidean to relativistic times explicitly.
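The truncation relation [4], which also links the Schwinger functions to their truncated parts, can be sketched for a single random variable, where it reduces to the relation between moments and cumulants. The following is a minimal sketch under that simplifying assumption (all test functions equal, so only the block sizes of a partition matter); the function names `set_partitions`, `moments_from_truncated`, and `truncate` are hypothetical. It illustrates why a Poisson component gives nonvanishing higher truncated functions while a Gaussian does not.

```python
from math import prod


def set_partitions(items):
    """Yield all partitions of the list `items` into disjoint nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + part
        # ... or joins one of the existing blocks.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def moments_from_truncated(kappa):
    """Scalar analogue of [4]: m_n is the sum over partitions of products of kappa_|block|."""
    return [sum(prod(kappa[len(b) - 1] for b in part)
                for part in set_partitions(list(range(n))))
            for n in range(1, len(kappa) + 1)]


def truncate(moments):
    """Invert [4] recursively: kappa_n = m_n minus all terms with at least two blocks."""
    kappa = []
    for n in range(1, len(moments) + 1):
        rest = sum(prod(kappa[len(b) - 1] for b in part)
                   for part in set_partitions(list(range(n)))
                   if len(part) > 1)
        kappa.append(moments[n - 1] - rest)
    return kappa


if __name__ == "__main__":
    # First four moments of a Poisson variable with intensity 2.
    print(truncate([2, 6, 22, 94]))
    # First four moments of a standard Gaussian variable.
    print(truncate([0, 1, 0, 3]))
```

For the Poisson moments every truncated value equals the intensity, whereas for the Gaussian moments all truncated values beyond the second vanish.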
It simplifies the considerations to exclude dipole fields, that is, one assumes that $\nu_l = 1$ for $l = 1, \ldots, P$. In physical terms, the no-dipole condition guarantees that the asymptotic fields in Minkowski spacetime fulfill the Klein–Gordon equation and thus generate particles in the usual sense if applied to the vacuum. If this condition is not imposed, asymptotic fields might only fulfill a dipole equation $(\Box + m^2)^2 \phi^{\mathrm{in/out}} = 0$ or a related hyperbolic equation of even higher order, and the particle states generated by application of such fields to the vacuum require a gauge fixing (constraints) in order to obtain a physical interpretation. Given the no-dipole condition, one obtains by expansion into partial fractions

$$\frac{1}{\prod_{l=1}^{P} (|k|^2 + m_l^2)} = \sum_{l=1}^{P} \frac{b_l}{|k|^2 + m_l^2} \qquad [14]$$

with $b_l \in \mathbb{R}$ uniquely determined and $b_l \neq 0$ (for pairwise distinct masses, $b_l = \prod_{l' \neq l} (m_{l'}^2 - m_l^2)^{-1}$). For the truncated Schwinger functions, this implies ($n \ge 3$) that

$$S^T_{n,\alpha_1 \cdots \alpha_n}(x_1, \ldots, x_n) = Q^E_{n,\alpha_1 \cdots \alpha_n}(-i\nabla_n) \sum_{l_1, \ldots, l_n = 1}^{P} \prod_{r=1}^{n} b_{l_r} \int_{\mathbb{R}^d} \prod_{j=1}^{n} (-\Delta + m_{l_j}^2)^{-1}(x - x_j)\, dx \qquad [15]$$

At this point, a lengthy calculation yields a representation of the functions $\int_{\mathbb{R}^d} \prod_{j=1}^{n} (-\Delta + m_j^2)^{-1}(x - x_j)\, dx$ as the Fourier–Laplace transform of a distribution $\hat{W}^T_{n,m_1,\ldots,m_n}$ that fulfills the spectral condition. This is equivalent to the statement that the analytic continuation of such functions to relativistic times yields $W^T_{n,m_1,\ldots,m_n}$, where the latter distribution is the inverse Fourier transform of $\hat{W}^T_{n,m_1,\ldots,m_n}$. This distribution, up to a constant that can be integrated into $Q^E$, is given by

$$\left\{ \sum_{j=1}^{n} \prod_{l=1}^{j-1} \delta^-_{m_l}(k_l)\, \frac{(-1)}{k_j^2 - m_j^2} \prod_{l=j+1}^{n} \delta^+_{m_l}(k_l) \right\} \delta\!\left( \sum_{l=1}^{n} k_l \right) \qquad [16]$$

Here $\delta^\pm_m(k) = \theta(\pm k^0)\, \delta(k^2 - m^2)$, where $\theta$ is the Heaviside step function and $k^2 = (k^0)^2 - |\mathbf{k}|^2$. On the other hand, the partial differential operator $Q^E_n$ can be analytically continued in momentum space:

$$Q^M_n\big((k_1^0, \mathbf{k}_1), \ldots, (k_n^0, \mathbf{k}_n)\big) = Q^E_n\big((i k_1^0, \mathbf{k}_1), \ldots, (i k_n^0, \mathbf{k}_n)\big) \qquad [17]$$

$k_1, \ldots, k_n \in \mathbb{R}^d$. With the definition

$$\hat{W}^T_{2,\alpha_1\alpha_2}(k_1, k_2) = (2\pi)^{d+1}\, \frac{Q^M_{2,\alpha_1\alpha_2}(k_1, k_2)}{\prod_{l=1}^{P} m_l^2} \sum_{l=1}^{P} b_l\, \delta^-_{m_l}(k_1)\, \delta(k_1 + k_2) \qquad [18]$$

and

$$\hat{W}^T_{n,\alpha_1 \cdots \alpha_n}(k_1, \ldots, k_n) = Q^M_{n,\alpha_1 \cdots \alpha_n}(k_1, \ldots, k_n) \sum_{l_1, \ldots, l_n = 1}^{P} \prod_{j=1}^{n} b_{l_j}\, \hat{W}^T_{n,m_{l_1},\ldots,m_{l_n}}(k_1, \ldots, k_n) \qquad [19]$$

the analytic continuation of Schwinger functions can be summarized as follows:

Theorem 2 The truncated Schwinger functions $S^T_n$ have a Fourier–Laplace representation with $\hat{W}^T_n$ defined in eqns [18] and [19]. Equivalently, $S^T_n$ is the analytic continuation of $W^T_n$ from purely real relativistic time to purely imaginary Euclidean time. The truncated Wightman functions $W^T_n$ fulfill the requirements of temperedness, relativistic covariance with respect to the representation of the orthochronous, proper Lorentz group $\tilde\tau : L^\uparrow_+(d) \to GL(N)$, locality, spectral property, and cluster property. Here $\tilde\tau$ is obtained by analytic continuation of $\tau$ to a representation of the proper complex Lorentz group over $\mathbb{C}^d$ (which contains $SO(d)$ as a real submanifold) and restriction of this representation to the real orthochronous proper Lorentz group.

Again making use of the explicit formula in Theorem 2, the condition of Theorem 1 can be verified. This proves the existence of IMQFT models associated with the class of random fields under discussion.

Theorem 3 The Wightman functions defined in Theorem 2 fulfill the HSSC [2]. In particular, there exists a QFT with indefinite metric such that the Wightman functions are given as the VEVs of that IMQFT.

Nontrivial Scattering

Theories as described in Theorem 2 obviously have trivial scattering behavior if the noise field $\eta$ is Gaussian, that is, if, in [6], $z = 0$. In the case where there is also a Poisson component in $\eta$, that is, $z > 0$, higher-order truncated Wightman functions do not vanish and such relativistic theories have nontrivial scattering.

Before the scattering of the models can be discussed, some comments about scattering in IMQFT in general are in order. The scattering
theory in axiomatic QFT, Haag–Ruelle theory, relies on positivity. In fact, one can show that in the class of models under discussion, the LSZ asymptotic condition is violated if dipole degrees of freedom are admitted. In that case more complicated asymptotic conditions have to be used. In any case, the Haag–Ruelle theory cannot be adapted to IMQFT.

Nevertheless, asymptotic fields and states can be constructed in IMQFT if one imposes a no-dipole condition in a mathematically precise way. Then the LSZ asymptotic condition leads to the construction of mixed VEVs of asymptotic in- and out-fields with local fields. The collection of such VEVs is called the form-factor functional. After constructing this collection of mixed VEVs, one can try to check the HSSC for this functional and obtains a Krein space representation for the algebra generated by in-, local, and out-fields.

Following this line, asymptotic in- and out-particle states can be constructed for the given mass spectrum $(m_1, \ldots, m_P)$. If $a^{\mathrm{in/out}\,\dagger}_{\alpha,l}(k)$, $l = 1, \ldots, P$, denotes the creation operator for an incoming/outgoing particle with mass $m_l$, spin component $\alpha$, and energy–momentum $k$, the following scattering amplitude can be derived for $r$ incoming particles with masses $m_{l_1}, \ldots, m_{l_r}$ and $n - r$ outgoing particles with masses $m_{l_{r+1}}, \ldots, m_{l_n}$:

$$\Big\langle a^{\mathrm{in}\,\dagger}_{\alpha_1,l_1}(k_1) \cdots a^{\mathrm{in}\,\dagger}_{\alpha_r,l_r}(k_r),\; a^{\mathrm{out}\,\dagger}_{\alpha_{r+1},l_{r+1}}(k_{r+1}) \cdots a^{\mathrm{out}\,\dagger}_{\alpha_n,l_n}(k_n) \Big\rangle^T = (2\pi)\, i\, Q^M_{\alpha_1,\ldots,\alpha_n}(k_1, \ldots, k_r, k_{r+1}, \ldots, k_n) \prod_{j=1}^{n} \delta^+_{m_{l_j}}(k_j)\; \delta(K^{\mathrm{in}} - K^{\mathrm{out}}) \qquad [20]$$

$K^{\mathrm{in/out}}$ stand for the total energy–momentum of in- and out-particles, that is, $K^{\mathrm{in}} = \sum_{j=1}^{r} k_j$ and $K^{\mathrm{out}} = \sum_{j=r+1}^{n} k_j$.

Two immediate consequences can be drawn from [20]. First, choosing a model with nonvanishing Poisson part such that $C^{\beta_1\beta_2\beta_3} \neq 0$ and a differential operator $D$ containing in its mass spectrum the masses $m$ and $\mu$ with $m > 2\mu$, one gets a nonvanishing scattering amplitude for the process

$$m \to \mu + \mu \qquad [21]$$

even though in- and out-particle states consist of particles with well-defined sharp masses. Thus, for the incoming particle, the energy uncertainty, which for a particle at rest is proportional to the mass uncertainty, vanishes but still the particle undergoes a nontrivial decay and must have a finite decay time. This appears to be a contradiction to the energy–time uncertainty relation, which therefore seems to have an unclear status in IMQFT (i.e., in QFT including gauge fields). The origin of this inequality, which of course is experimentally very well tested, apparently has to be located in the constraints of the theory, that is, in the procedure of implementing a gauge, and not in the unconstrained IMQFT.

Second, one can replace somewhat artificially the polynomials $Q^M_n$ in [17] by any other symmetric and relativistically covariant polynomial. If the sequence of the ‘‘new’’ $Q^M_n$ is of uniformly bounded degree in any of the arguments $k_1, \ldots, k_n$, the redefined Wightman functions in [18] and [19] still fulfill the requirements of Theorem 1 and thus define a new relativistic, local IMQFT. The scattering amplitudes of such a theory are again well defined and given by [20]. For example, in the case of only one scalar particle with mass $m$, one can show that arbitrary Lorentz-invariant scattering behavior of bosonic particles can be reproduced by such theories for energies below an arbitrary maximal energy up to arbitrary precision. This kind of interpolation theorem shows that the outcome of an arbitrary scattering experiment can be reproduced within the formalism of (unconstrained) IMQFT as long as it is in agreement with the general requirements of Poincaré invariance and statistics.

List of Symbols

$\to$ : converges to
$\xrightarrow{L}$ : convergence in law
$\mathbb{N}$ : set of natural numbers
$\mathbb{N}_0$ : set of natural numbers and zero
$\mathbb{R}$ : set of real numbers
$\mathbb{C}$ : set of complex numbers
$1$ : identity mapping
$|_D$ : restricted to $D$
$x^0$ and $\mathbf{x}$ : time and spatial part of $x = (x^0, \mathbf{x}) \in \mathbb{R} \times \mathbb{R}^{d-1}$
$\nabla_n$ : gradient operator on $\mathbb{R}^{dn}$

See also: Algebraic Approach to Quantum Field Theory; Euclidean Field Theory; Indefinite Metric; Perturbative Renormalization Theory and BRST; Quantum Field Theory in Curved Spacetime; Quantum Field Theory: A Brief Introduction; Stochastic Differential Equations.

Further Reading

Albeverio S and Gottschalk H (2001) Scattering theory for quantum fields with indefinite metric. Communications in Mathematical Physics 216: 491–513.
Albeverio S, Gottschalk H, and Wu J-L (1997) Models of local relativistic quantum fields with indefinite metric (in all dimensions). Communications in Mathematical Physics 184: 509–531.
Gottschalk H (2001) On the stability of one particle states generated by quantum fields fulfilling Yang–Feldman equations. Reports on Mathematical Physics 47(2): 241–246.
Grothaus M and Streit L (1999) Construction of relativistic quantum fields in the framework of white noise analysis. Journal of Mathematical Physics 40(11): 5387.
Morchio G and Strocchi F (1980) Infrared singularities, vacuum structure and pure phases in local quantum field theory. Ann. Inst. H. Poincaré 33: 251.
Steinmann O (2000) Perturbative Quantum Electrodynamics and Axiomatic Field Theory. Berlin: Springer.
Strocchi F (1993) Selected Topics on the General Properties of Quantum Field Theory. Lecture Notes in Physics, vol. 51. Singapore: World Scientific.

Quantum Fields with Topological Defects


M Blasone and G Vitiello, Università degli Studi di Salerno, Baronissi (SA), Italy
P Jizba, Czech Technical University, Prague, Czech Republic
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The ordered patterns we observe in condensed matter and in high-energy physics are created by the quantum dynamics. Macroscopic systems exhibiting some kind of ordering, such as superconductors, ferromagnets, and crystals, are described by the underlying quantum dynamics. Even the large-scale structures in the universe, as well as the ordering in the biological systems, appear to be the manifestation of the microscopic dynamics governing the elementary components of these systems. Thus, we talk of macroscopic quantum systems: these are quantum systems in the sense that, although they behave classically, some of their macroscopic features nevertheless cannot be understood without recourse to quantum theory.

The question then arises how the quantum dynamics generates the observed macroscopic properties. In other words, how it happens that the macroscopic scale characterizing those systems is dynamically generated out of the microscopic scale of the quantum elementary components (Umezawa 1993, Umezawa et al. 1982).

Moreover, we also observe a variety of phenomena where quantum particles coexist and interact with extended macroscopic objects which show a classical behavior, for example, vortices in superconductors and superfluids, magnetic domains in ferromagnets, dislocations and other topological defects (grain boundaries, point defects, etc.) in crystals, and so on.

We are thus also faced with the question of the quantum origin of topological defects and their interaction with quanta (Umezawa 1993, Umezawa et al. 1982): this is a crucial issue for the understanding of symmetry-breaking phase transitions and structure formation in a wide range of systems from condensed matter to cosmology (Kibble 1976, Zurek 1997, Volovik 2003).

Here, we will review how the generation of ordered structures and extended objects is explained in quantum field theory (QFT). We follow Umezawa (1993) and Umezawa et al. (1982) in our presentation. We will consider systems in which spontaneous symmetry breaking (SSB) occurs and show that topological defects originate by inhomogeneous (localized) condensation of quanta. The approach followed here is alternative to the usual one (Rajaraman 1982), in which one starts from the classical soliton solutions and then ‘‘quantizes’’ them, as well as to the QFT method based on dual (disorder) fields (Kleinert 1989).

In the next section we introduce some general features of QFT useful for our discussion and treat some aspects of SSB and the rearrangement of symmetry. Next we discuss the boson transformation theorem and the topological singularities of the boson condensate. We then present, as an example, a model with U(1) gauge invariance in which SSB, rearrangement of symmetry, and topological defects are present (Matsumoto et al. 1975a, b). There we show how macroscopic fields and currents are obtained from the microscopic quantum dynamics. The Nielsen–Olesen vortex solution is explicitly obtained as an example. The final section is devoted to conclusions.

Symmetry and Order in QFT: A Dynamical Problem

QFT deals with systems with infinitely many degrees of freedom. The fields used for their description are operator fields whose mathematical significance is fully specified only when the state space where they operate is also assigned. This is the space of the states, or physical phase, of the system under given boundary conditions. A change in the boundary conditions may result in the transition of the system from one phase to another. For example, a change of temperature from above to below the critical temperature may induce the transition from the
222 Quantum Fields with Topological Defects

normal to the superconducting phase in a metal. The may drastically interfere with the interacting objects,
identification of the state space where the field thus changing their nature. Besides the asymptotic
operators have to be realized is thus a physically fields, one then also introduces dynamical or
nontrivial problem in QFT. In this respect, the QFT Heisenberg fields, that is, the fields in terms of
structure is drastically different from the one of which the dynamics is given. Since the interaction
quantum mechanics (QM). The reason is the region is precluded from observation, we do not
following. observe Heisenberg fields. Observables are thus
The von Neumann theorem (1955) in QM states solely described in terms of asymptotic fields.
that for systems with a finite number of degrees of Summing up, QFT is a ‘‘two-level’’ theory: one level
freedom all the irreducible representations of the is the interaction level where the dynamics is specified
canonical commutation relations are unitarily by assigning the equations for the Heisenberg fields.
equivalent. Therefore, in QM the physical system The other level is the physical level, the one of the
can only live in one single physical phase: unitary asymptotic fields and of the physical state space
equivalence means indeed physical equivalence and directly accessible to observations. The equations for
thus there is no room (no representations) for physically different phases. Such a situation drastically changes in QFT where systems with infinitely many degrees of freedom are treated. In such a case, the von Neumann theorem does not hold and infinitely many unitarily inequivalent representations of the canonical commutation relations do in fact exist (Umezawa 1993, Umezawa et al. 1982). It is such richness of QFT that allows the description of different physical phases.

QFT as a Two-Level Theory

In the perturbative approach, any quantum experiment or observation can be schematized as a scattering process where one prepares a set of free (noninteracting) particles (incoming particles or in-fields) which are then made to collide at some later time in some region of space (spacetime region of interaction). The products of the collision are expected to emerge out of the interaction region as free particles (outgoing particles or out-fields). Correspondingly, one has the in-field and the out-field state space. The interaction region is where the dynamics operates: given the in-fields and the in-states, the dynamics determines the out-fields and the out-states.

The incoming particles and the outgoing ones (also called quasiparticles in solid state physics) are well distinguishable and localizable particles only far away from the interaction region, at a time much before ($t = -\infty$) and much after ($t = +\infty$) the interaction time: in- and out-fields are thus said to be asymptotic fields, and for them the interaction forces are assumed not to operate (switched off).

The only regions accessible to observations are those far away (in space and in time) from the interaction region, that is, the asymptotic regions (the in- and out-regions). It is so since, at the quantum level, observations performed in the interaction region or vacuum fluctuations occurring there

the physical fields are equations for free fields, describing the observed incoming/outgoing particles.

To be specific, let the Heisenberg operator fields be generically denoted by $\psi_H(x)$ and the physical operator fields by $\varphi_{in}(x)$. For definiteness, we choose to work with the in-fields, although the set of out-fields would work equally well. They are both assumed to satisfy equal-time canonical (anti)commutation relations.

For brevity, we omit considerations on the renormalization procedure, which are not essential for the conclusions we will reach. The Heisenberg field equations and the free-field equations are written as

\[ \Lambda(\partial)\,\psi_H(x) = J[\psi_H(x)] \qquad [1] \]

\[ \Lambda(\partial)\,\varphi_{in}(x) = 0 \qquad [2] \]

where $\Lambda(\partial)$ is a differential operator, $x \equiv (t, \mathbf{x})$ and $J$ is some functional of the $\psi_H$ fields, describing the interaction.

Equation [1] can be formally recast in the following integral form (Yang–Feldman equation):

\[ \psi_H(x) = \varphi_{in}(x) + \Lambda^{-1}(\partial) * J[\psi_H(x)] \qquad [3] \]

where $*$ denotes convolution. The symbol $\Lambda^{-1}(\partial)$ denotes formally the Green function for $\varphi_{in}(x)$. The precise form of Green’s function is specified by the boundary conditions. Equation [3] can be solved by iteration, thus giving an expression for the Heisenberg fields $\psi_H(x)$ in terms of powers of the $\varphi_{in}(x)$ fields; this is the Haag expansion in the LSZ formalism (or ‘‘dynamical map’’ in the language of Umezawa 1993 and Umezawa et al. 1982), which might be formally written as

\[ \psi_H(x) = F[x; \varphi_{in}] \qquad [4] \]

(A (formal) closed form for the dynamical map is obtained in the closed time path (CTP) formalism (Blasone and Jizba 2002). Then the Haag expansion [4] is directly applicable to both equilibrium and nonequilibrium situations.)
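The iterative solution of the Yang–Feldman equation [3] can be sketched as follows. This is a purely formal illustration that ignores renormalization and operator-ordering issues; the iteration label $(n)$ is notation introduced here and does not appear in the article.

```latex
\begin{align*}
\psi_H^{(0)}(x)   &= \varphi_{in}(x), \\
\psi_H^{(n+1)}(x) &= \varphi_{in}(x) + \Lambda^{-1}(\partial) * J\big[\psi_H^{(n)}(x)\big], \\
\psi_H^{(1)}(x)   &= \varphi_{in}(x) + \Lambda^{-1}(\partial) * J\big[\varphi_{in}(x)\big].
\end{align*}
```

Each further step substitutes the previous iterate into $J$, so that for a polynomial $J$ the $n$-th iterate is a polynomial in $\varphi_{in}$; collecting powers of $\varphi_{in}$ gives the formal series summarized by eqn [4].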
Quantum Fields with Topological Defects 223

We stress that the equality in the dynamical map [4] is a ‘‘weak’’ equality, which means that it must be understood as an equality among matrix elements computed in the Hilbert space of the physical particles.

We observe that mathematical consistency in the above procedure requires that the set of $\varphi_{in}$ fields must be an irreducible set; however, it may happen that not all the elements of the set are known from the beginning. For example, there might be composite (bound states) fields or even elementary quanta whose existence is ignored in a first recognition. Then the computation of the matrix elements in physical states will lead to the detection of unexpected poles in the Green’s functions, which signal the existence of the ignored quanta. One thus introduces the fields corresponding to these quanta and repeats the computation. This way of proceeding is called the self-consistent method (Umezawa 1993, Umezawa et al. 1982). Thus it is not necessary to have a one-to-one correspondence between the sets $\{\psi_H^j\}$ and $\{\varphi_{in}^i\}$, as it happens whenever the set $\{\varphi_{in}^i\}$ includes composite particles.

The Dynamical Rearrangement of Symmetry

As already mentioned, in QFT the Fock space for the physical states is not unique since one may have several physical phases, for example, for a metal the normal phase and the superconducting phase, and so on. Fock spaces describing different phases are unitarily inequivalent spaces and correspondingly we have different expectation values for certain observables and even different irreducible sets of physical quanta. Thus, finding the dynamical map involves singling out the Fock space where the dynamics has to be realized.

Let us now suppose that the Heisenberg field equations are invariant under some group $G$ of transformations of $\psi_H$:

\[ \psi_H(x) \to \psi'_H(x) = g[\psi_H(x)] \qquad [5] \]

with $g \in G$. The symmetry is spontaneously broken when the vacuum state in the Fock space $\mathcal{H}$ is not invariant under the group $G$ but only under one of its subgroups (Umezawa 1993, Umezawa et al. 1982).

On the other hand, eqn [4] implies that when $\psi_H$ is transformed as in [5], then

\[ \varphi_{in}(x) \to \varphi'_{in}(x) = g'[\varphi_{in}(x)] \qquad [6] \]

with $g'$ belonging to some group of transformations $G'$ and such that

\[ g[\psi_H(x)] = F[g'[\varphi_{in}(x)]] \qquad [7] \]

When symmetry is spontaneously broken it is $G' \neq G$, with $G'$ the group contraction of $G$; when symmetry is not broken then $G' = G$.

Since $G$ is the invariance group of the dynamics, eqn [4] requires that $G'$ is the group under which the free-field equations are invariant, that is, also $\varphi'_{in}$ is a solution of [2]. Since eqn [4] is a weak equality, $G'$ depends on the choice of the Fock space $\mathcal{H}$ among the physically realizable unitarily inequivalent state spaces. Thus, we see that the (same) original invariance of the dynamics may manifest itself in different symmetry groups for the $\varphi_{in}$ fields according to different choices of the physical state space. Since this process is constrained by the dynamical equations [1], it is called the dynamical rearrangement of symmetry (Umezawa 1993, Umezawa et al. 1982).

In conclusion, different ordering patterns appear to be different manifestations of the same basic dynamical invariance. The discovery of the process of the dynamical rearrangement of symmetry leads to a unified understanding of the dynamical generation of many observable ordered patterns. This is the phenomenon of the dynamical generation of order. The contraction of the symmetry group is the mathematical structure controlling the dynamical rearrangement of the symmetry. For a qualitative presentation see Vitiello (2001).

One can now ask which ones are the carriers of the ordering information among the system elementary constituents and how the long-range correlations and the coherence observed in ordered patterns are generated and sustained. The answer is in the fact that SSB implies the appearance of bosons (Goldstone 1961, Goldstone et al. 1962, Nambu and Jona-Lasinio 1961), the so-called Nambu–Goldstone (NG) modes or quanta. They manifest as long-range correlations and thus they are responsible for the above-mentioned change of scale, from microscopic to macroscopic. The coherent boson condensation of NG modes turns out to be the mechanism by which order is generated, as we will see in an explicit example in a later section.

The ‘‘Boson Transformation’’ Method

We now discuss the quantum origin of extended objects (defects) and show how they naturally emerge as macroscopic objects (inhomogeneous condensates) from the quantum dynamics. At zero temperature, the classical soliton solutions are then recovered in the Born approximation. This approach is known as the ‘‘boson transformation’’ method (Umezawa 1993, Umezawa et al. 1982).
The Boson Transformation Theorem

Let us consider, for simplicity, the case of a dynamical model involving one scalar field $\psi_H$ and one asymptotic field $\varphi_{in}$ satisfying eqns [1] and [2], respectively.

As already remarked, the dynamical map is valid only in a weak sense, that is, as a relation among matrix elements. This implies that eqn [4] is not unique, since different sets of asymptotic fields and the corresponding Hilbert spaces can be used in its construction. Let us indeed consider a c-number function $f(x)$, satisfying the $\varphi_{in}$ equations of motion [2]:

\[ \Lambda(\partial)\,f(x) = 0 \qquad [8] \]

The boson transformation theorem (Umezawa 1993, Umezawa et al. 1982) states that the field

\[ \psi_H^f(x) = F[x; \varphi_{in} + f] \qquad [9] \]

is also a solution of the Heisenberg equation [1]. The corresponding Yang–Feldman equation takes the form

\[ \psi_H^f(x) = \varphi_{in}(x) + f(x) + \Lambda^{-1}(\partial) * J[\psi_H^f(x)] \qquad [10] \]

The difference between the two solutions $\psi_H$ and $\psi_H^f$ is only in the boundary conditions. An important point is that the expansion in [9] is obtained from that in [4] by the spacetime-dependent translation

\[ \varphi_{in}(x) \to \varphi_{in}(x) + f(x) \qquad [11] \]

The essence of the boson transformation theorem is that the dynamics embodied in eqn [1] contains an internal freedom, represented by the possible choices of the function $f(x)$, satisfying the free-field equation [8].

We also observe that the transformation [11] is a canonical transformation since it leaves invariant the canonical form of commutation relations.

Let $|0\rangle$ denote the vacuum for the free field $\varphi_{in}$. The vacuum expectation value of eqn [10] gives

\[ \phi^f(x) \equiv \langle 0|\psi_H^f(x)|0\rangle = f(x) + \Big\langle 0\Big|\Big[\Lambda^{-1}(\partial) * J[\psi_H^f(x)]\Big]\Big|0\Big\rangle \qquad [12] \]

The c-number field $\phi^f(x)$ is the order parameter. We remark that it is fully determined by the quantum dynamics. In the classical or Born approximation, which consists in taking $\langle 0|J[\psi_H^f]|0\rangle = J[\phi^f]$, that is, neglecting all the contractions of the physical fields, we define $\phi_{cl}^f(x) \equiv \lim_{\hbar \to 0} \phi^f(x)$. In this limit, we have

\[ \Lambda(\partial)\,\phi_{cl}^f(x) = J[\phi_{cl}^f(x)] \qquad [13] \]

that is, $\phi_{cl}^f(x)$ provides the solution of the classical Euler–Lagrange equation.

Beyond the classical level, in general, the form of this equation changes. The Yang–Feldman equation [10] gives not only the equation for the order parameter, eqn [13], but also, at higher orders in $\hbar$, the dynamics of the physical quanta in the potential generated by the ‘‘macroscopic object’’ $\phi^f(x)$ (Umezawa 1993, Umezawa et al. 1982).

One can show (Umezawa 1993, Umezawa et al. 1982) that the class of solutions of eqn [8] which lead to topologically nontrivial (i.e., carrying a nonzero topological charge) solutions of eqn [13] are those which have some sort of singularity with respect to Fourier transform. These can be either divergent singularities or topological singularities. The first are associated with a divergence of $f(x)$ for $|x| = \infty$, at least in some direction. Topological singularities are instead present when $f(x)$ is not single-valued, that is, it is path dependent. In both cases, the macroscopic object described by the order parameter carries a nonzero topological charge.

Topological Singularities and Massless Bosons

An important result is that the boson transformation functions carrying topological singularities are only allowed for massless bosons (Umezawa 1993, Umezawa et al. 1982).

Consider a generic boson field $\chi_{in}$ satisfying the equation

\[ (\partial^2 + m^2)\,\chi_{in}(x) = 0 \qquad [14] \]

and suppose that the function $f(x)$ for the boson transformation $\chi_{in}(x) \to \chi_{in}(x) + f(x)$ carries a topological singularity. It is then not single-valued and thus path dependent:

\[ G^{+}_{\mu\nu}(x) \equiv [\partial_\mu, \partial_\nu]\,f(x) \neq 0, \quad \text{for certain } \mu, \nu, x \qquad [15] \]

On the other hand, $\partial_\mu f(x)$, which is related to observables, is single-valued, that is, $[\partial_\rho, \partial_\nu]\,\partial_\mu f(x) = 0$. Recall that $f(x)$ is a solution of the $\chi_{in}$ equation:

\[ (\partial^2 + m^2)\,f(x) = 0 \qquad [16] \]

From the definition of $G^{+}_{\mu\nu}(x)$ and the regularity of $\partial_\mu f(x)$, it follows, by computing $\partial^\mu G^{+}_{\mu\nu}(x)$, that

\[ \partial_\mu f(x) = \frac{1}{\partial^2 + m^2}\,\partial^\lambda G^{+}_{\lambda\mu}(x) \qquad [17] \]

This equation and the antisymmetric nature of $G^{+}_{\mu\nu}(x)$ then lead to $\partial^2 f(x) = 0$, which in turn implies $m = 0$. Thus, we conclude that [15] is only compatible with a massless equation for $\chi_{in}$.
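The path dependence that defines a topological singularity can be illustrated numerically. The sketch below assumes, purely for illustration, that $f$ is the polar angle in the $(x_1, x_2)$ plane, which is harmonic away from the origin and therefore massless in the sense of the argument above. The function names `grad_f` and `circulation` are hypothetical helpers introduced here. The line integral of $\nabla f$ around a closed loop is nonzero only when the loop encloses the singular point.

```python
import math

def grad_f(x1, x2):
    """Gradient of the polar angle f = atan2(x2, x1), valid away from the origin."""
    r2 = x1 * x1 + x2 * x2
    return (-x2 / r2, x1 / r2)

def circulation(cx, cy, radius, n=2000):
    """Midpoint-rule line integral of grad f around a circle centred at (cx, cy)."""
    total = 0.0
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        x1 = cx + radius * math.cos(phi)
        x2 = cy + radius * math.sin(phi)
        g1, g2 = grad_f(x1, x2)
        # Tangent element along the circle: dl = radius * (-sin(phi), cos(phi)) * dphi
        total += (g1 * (-math.sin(phi)) + g2 * math.cos(phi)) * radius * dphi
    return total

enclosing = circulation(0.0, 0.0, 1.0)      # loop around the singular point
non_enclosing = circulation(3.0, 0.0, 1.0)  # loop that misses it

print(enclosing / (2.0 * math.pi))  # winding number of the enclosing loop
print(non_enclosing)                # vanishes up to rounding error
```

The enclosing loop picks up one full turn of the angle, while the loop that misses the origin returns to its starting value; this is the sense in which a multivalued $f$ carries a charge that does not depend on the details of the contour.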
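The topological charge is defined as

\[ N_T = \int_C dl^\mu\,\partial_\mu f = \int_S dS_\mu\,\epsilon^{\mu\nu\sigma}\,\partial_\nu \partial_\sigma f = \frac{1}{2}\int_S dS^{\mu\nu}\,G^{+}_{\mu\nu} \qquad [18] \]

Here $C$ is a contour enclosing the singularity and $S$ a surface with $C$ as boundary. $N_T$ does not depend on the path $C$ provided this does not cross the singularity. The dual tensor $G^{\mu\nu}(x)$ is

\[ G^{\mu\nu}(x) \equiv -\frac{1}{2}\,\epsilon^{\mu\nu\lambda\rho}\,G^{+}_{\lambda\rho}(x) \qquad [19] \]

and satisfies the continuity equation

\[ \partial_\mu G^{\mu\nu}(x) = 0 \;\Leftrightarrow\; \partial_\mu G^{+}_{\lambda\rho}(x) + \partial_\rho G^{+}_{\mu\lambda}(x) + \partial_\lambda G^{+}_{\rho\mu}(x) = 0 \qquad [20] \]

Equation [20] completely characterizes the topological singularity (Umezawa 1993, Umezawa et al. 1982).

An Example: The Anderson–Higgs–Kibble Mechanism and the Vortex Solution

We consider a model of a complex scalar field $\phi(x)$ interacting with a gauge field $A_\mu(x)$ (Anderson 1958, Higgs 1960, Kibble 1967). The Lagrangian density $\mathcal{L}[\phi(x), \phi^*(x), A_\mu(x)]$ is invariant under the global and the local U(1) gauge transformations (we do not assume a particular form for the Lagrangian density, so the following results are quite general):

\[ \phi(x) \to e^{i\theta}\phi(x), \quad A_\mu(x) \to A_\mu(x) \qquad [21] \]

\[ \phi(x) \to e^{i e_0 \lambda(x)}\phi(x), \quad A_\mu(x) \to A_\mu(x) + \partial_\mu \lambda(x) \qquad [22] \]

respectively, where $\lambda(x) \to 0$ for $|x_0| \to \infty$ and/or $|\mathbf{x}| \to \infty$ and $e_0$ is the coupling constant. We work in the Lorentz gauge $\partial_\mu A^\mu(x) = 0$. The generating functional, including the gauge constraint, is (Matsumoto et al. 1975a, b)

\[ Z[J, K] = \frac{1}{N}\int [dA_\mu][d\phi][d\phi^*][dB]\,\exp\big(i\,S[A_\mu, B, \phi]\big) \qquad [23] \]

\[ S = \int d^4x\,\Big\{ \mathcal{L}(x) + B(x)\,\partial^\mu A_\mu(x) + K^*(x)\phi(x) + K(x)\phi^*(x) + J^\mu(x)A_\mu(x) + i\epsilon\,|\phi(x) - v|^2 \Big\} \]

\[ N = \int [dA_\mu][d\phi][d\phi^*][dB]\,\exp\Big( i\int d^4x\,\big(\mathcal{L}(x) + i\epsilon\,|\phi(x) - v|^2\big) \Big) \]

$B(x)$ is an auxiliary field which implements the gauge-fixing condition (Matsumoto et al. 1975a, b). Notice the $\epsilon$-term where $v$ is a complex number; its rôle is to specify the condition of symmetry breaking under which we want to compute the functional integral and it may be given the physical meaning of a small external field triggering the symmetry breaking (Matsumoto et al. 1975a, b). The limit $\epsilon \to 0$ must be made at the end of the computations. We will use the notation

\[ \langle F[\phi] \rangle_{\epsilon,J,K} \equiv \frac{1}{N}\int [dA_\mu][d\phi][d\phi^*][dB]\,F[\phi]\,\exp\big(i\,S[A_\mu, B, \phi]\big) \qquad [24] \]

with $\langle F[\phi] \rangle_\epsilon \equiv \langle F[\phi] \rangle_{\epsilon, J=K=0}$ and $\langle F[\phi] \rangle \equiv \lim_{\epsilon \to 0} \langle F[\phi] \rangle_\epsilon$.

The fields $\phi$, $A_\mu$, and $B$ appearing in the generating functional are c-number fields. In the following, the Heisenberg operator fields corresponding to them will be denoted by $\phi_H$, $A_{H\mu}$, and $B_H$, respectively. Thus, the spontaneous symmetry breaking condition is expressed by $\langle 0|\phi_H(x)|0\rangle \equiv \tilde{v} \neq 0$, with $\tilde{v}$ constant.

Since in the functional integral formalism the functional average of a given c-number field gives the vacuum expectation value of the corresponding operator field, for example, $\langle F[\phi] \rangle \equiv \langle 0|F[\phi_H]|0\rangle$, we have $\lim_{\epsilon \to 0} \langle \phi(x) \rangle_\epsilon \equiv \langle 0|\phi_H(x)|0\rangle = \tilde{v}$.

Let us introduce the following decompositions:

\[ \phi(x) = \frac{1}{\sqrt{2}}\,[\psi(x) + i\chi(x)] \]

\[ K(x) = \frac{1}{\sqrt{2}}\,[K_1(x) + iK_2(x)] \]

\[ \rho(x) \equiv \psi(x) - \langle \psi(x) \rangle_\epsilon \]

Note that $\langle \chi(x) \rangle_\epsilon = 0$ because of the invariance under $\chi \to -\chi$.

The Goldstone Theorem

Since the functional integral [23] is invariant under the global transformation [21], we have that $\partial Z[J, K]/\partial\theta = 0$ and subsequent derivatives with respect to $K_1$ and $K_2$ lead to

\[ \langle \psi(x) \rangle_\epsilon = \sqrt{2}\,\epsilon v \int d^4y\,\langle \chi(x)\chi(y) \rangle_\epsilon = \sqrt{2}\,\epsilon v\,\Delta_\chi(\epsilon, 0) \qquad [25] \]

In momentum space the propagator for the field $\chi$ has the general form

\[ \Delta_\chi(0, p) = \lim_{\epsilon \to 0}\left[ \frac{Z_\chi}{p^2 - m_\chi^2 + i\epsilon a_\chi} + (\text{continuum contributions}) \right] \qquad [26] \]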
Here $Z_\chi$ and $a_\chi$ are renormalization constants. The integration in eqn [25] picks up the pole contribution at $p^2 = 0$, and leads to

\[ \tilde{v} = \sqrt{2}\,\frac{Z_\chi}{a_\chi}\,v \;\Leftrightarrow\; m_\chi = 0; \qquad \tilde{v} = 0 \;\Leftrightarrow\; m_\chi \neq 0 \qquad [27] \]

The Goldstone theorem (Goldstone 1961, Goldstone et al. 1962) is thus proved: if the symmetry is spontaneously broken ($\tilde{v} \neq 0$), a massless mode must exist, whose field is $\chi(x)$, that is, the NG boson mode. Since it is massless, it manifests as a long-range correlation mode. (Notice that in the present case of a complex scalar field model, the NG mode is an elementary field. In other models, it may appear as a bound state, for example, the magnon in (anti)ferromagnets.) Note that

\[ \frac{\partial}{\partial v}\,\langle \psi(x) \rangle_\epsilon = \sqrt{2}\,\epsilon \int d^4y\,\langle \rho(x)\rho(y) \rangle_\epsilon \qquad [28] \]

and because $m_\rho \neq 0$, the right-hand side of this equation vanishes in the limit $\epsilon \to 0$; therefore, $\tilde{v}$ is independent of $|v|$, although the phase of $v$ determines that of $\tilde{v}$ (from eqn [25]): as in ferromagnets, once an external magnetic field is switched on, the system is magnetized independently of the strength of the external field.

The Dynamical Map and the Field Equations

Observing that the change of variables [21] (and/or [22]) does not affect the generating functional, we may obtain the Ward–Takahashi identities. Also, using $B(x) \to B(x) + \lambda(x)$ in [23] gives $\langle \partial^\mu A_\mu(x) \rangle_{\epsilon,J,K} = 0$. One then finds the following two-point function pole structures (Matsumoto et al. 1975a, b):

\[ \langle B(x)\chi(y) \rangle = \lim_{\epsilon \to 0}\left\{ \frac{-i}{(2\pi)^4}\int d^4p\,e^{-ip(x-y)}\,\frac{e_0\tilde{v}}{p^2 + i\epsilon a_\chi} \right\} \qquad [29] \]

\[ \langle B(x)A^\mu(y) \rangle = \partial_x^\mu\,\frac{i}{(2\pi)^4}\int d^4p\,e^{-ip(x-y)}\,\frac{1}{p^2} \qquad [30] \]

\[ \langle B(x)B(y) \rangle = \lim_{\epsilon \to 0}\left\{ \frac{-i}{(2\pi)^4}\,\frac{(e_0\tilde{v})^2}{Z_\chi}\int d^4p\,e^{-ip(x-y)}\left[ \frac{1}{p^2 + i\epsilon a_\chi} - \frac{1}{p^2} \right] \right\} \qquad [31] \]

The absence of branch-cut singularities in propagators [29]–[31] suggests that $B(x)$ obeys a free-field equation. In addition, eqn [31] indicates that the model contains a massless negative-norm state (ghost) besides the NG massless mode $\chi$. Moreover, it can be shown (Matsumoto et al. 1975a, b) that a massive vector field $U^\mu_{in}$ also exists in the theory. Note that because of the invariance $(\chi, A_\mu, B) \to (-\chi, -A_\mu, -B)$, all the other two-point functions must vanish.

The dynamical maps expressing the Heisenberg operator fields in terms of the asymptotic operator fields are found to be (Matsumoto et al. 1975a, b)

\[ \phi_H(x) = \; :\exp\left\{ i\,\frac{Z_\chi^{1/2}}{\tilde{v}}\,\chi_{in}(x) \right\} \Big[ \tilde{v} + Z_\rho^{1/2}\rho_{in}(x) + F[\rho_{in}, U^\mu_{in}, \partial(\chi_{in} - b_{in})] \Big]: \qquad [32] \]

\[ A^\mu_H(x) = Z_3^{1/2}\,U^\mu_{in}(x) + \frac{Z_\chi^{1/2}}{e_0\tilde{v}}\,\partial^\mu b_{in}(x) + \; :F^\mu[\rho_{in}, U^\mu_{in}, \partial(\chi_{in} - b_{in})]: \qquad [33] \]

\[ B_H(x) = \frac{e_0\tilde{v}}{Z_\chi^{1/2}}\,[b_{in}(x) - \chi_{in}(x)] + c \qquad [34] \]

where $:\ldots:$ denotes the normal ordering and the functionals $F$ and $F^\mu$ are to be determined within a particular model. In eqns [32]–[34], $\chi_{in}$ denotes the NG mode, $b_{in}$ the ghost mode, $U^\mu_{in}$ the massive vector field, and $\rho_{in}$ the massive matter field. In eqn [34] $c$ is a c-number constant, whose value is irrelevant since only derivatives of $B$ appear in the field equations (see below). $Z_3$ represents the wave function renormalization for $U^\mu_{in}$. The corresponding field equations are

\[ \partial^2 \chi_{in}(x) = 0, \quad \partial^2 b_{in}(x) = 0, \quad (\partial^2 + m_\rho^2)\,\rho_{in}(x) = 0 \qquad [35] \]

\[ (\partial^2 + m_V^2)\,U^\mu_{in}(x) = 0, \quad \partial_\mu U^\mu_{in}(x) = 0 \qquad [36] \]

with $m_V^2 = (Z_3/Z_\chi)(e_0\tilde{v})^2$. The field equations for $B_H$ and $A_{H\mu}$ read (Matsumoto et al. 1975a, b)

\[ \partial^2 B_H(x) = 0, \quad -\partial^2 A_{H\mu}(x) = j_{H\mu}(x) - \partial_\mu B_H(x) \qquad [37] \]

with $j_{H\mu}(x) = \delta\mathcal{L}(x)/\delta A^\mu_H(x)$. One may then require that the current $j_{H\mu}$ is the only source of the gauge field $A_{H\mu}$ in any observable process. This amounts to imposing the condition ${}_p\langle b|\partial_\mu B_H(x)|a\rangle_p = 0$, that is,

\[ (-\partial^2)\;{}_p\langle b|A^0_{H\mu}(x)|a\rangle_p = {}_p\langle b|j_{H\mu}(x)|a\rangle_p \qquad [38] \]

where $|a\rangle_p$ and $|b\rangle_p$ denote two generic physical states and $A^{0\mu}_H(x) \equiv A^\mu_H(x) - e_0\tilde{v}\,:\partial^\mu b_{in}(x):$. Equations [38] are the classical Maxwell equations. The condition ${}_p\langle b|\partial_\mu B_H(x)|a\rangle_p = 0$ leads to the Gupta–Bleuler-like condition

\[ [\chi^{(-)}_{in}(x) - b^{(-)}_{in}(x)]\,|a\rangle_p = 0 \qquad [39] \]

where $\chi^{(-)}_{in}$ and $b^{(-)}_{in}$ are the positive-frequency parts of the corresponding fields. Thus, we see that $\chi_{in}$ and $b_{in}$ cannot participate in any observable reaction.
This is confirmed by the fact that they are present in the S-matrix in the combination $(\chi_{in} - b_{in})$ (Matsumoto et al. 1975a, b). It is to be remarked, however, that the NG boson does not disappear from the theory: we shall see below that there are situations in which the NG fields do have observable effects.

The Dynamical Rearrangement of Symmetry and the Classical Fields and Currents

From eqns [32]–[33] we see that the local gauge transformations of the Heisenberg fields

\[ \phi_H(x) \to e^{i e_0 \lambda(x)}\phi_H(x), \quad A^\mu_H(x) \to A^\mu_H(x) + \partial^\mu \lambda(x), \quad B_H(x) \to B_H(x) \qquad [40] \]

with $\partial^2 \lambda(x) = 0$, are induced by the in-field transformations

\[ \chi_{in}(x) \to \chi_{in}(x) + \frac{e_0\tilde{v}}{Z_\chi^{1/2}}\,\lambda(x), \quad b_{in}(x) \to b_{in}(x) + \frac{e_0\tilde{v}}{Z_\chi^{1/2}}\,\lambda(x), \quad \rho_{in}(x) \to \rho_{in}(x), \quad U^\mu_{in}(x) \to U^\mu_{in}(x) \qquad [41] \]

On the other hand, the global phase transformation $\phi_H(x) \to e^{i\theta}\phi_H(x)$ is induced by

\[ \chi_{in}(x) \to \chi_{in}(x) + \frac{\tilde{v}}{Z_\chi^{1/2}}\,\theta f(x), \quad b_{in}(x) \to b_{in}(x), \quad \rho_{in}(x) \to \rho_{in}(x), \quad U^\mu_{in}(x) \to U^\mu_{in}(x) \qquad [42] \]

with $\partial^2 f(x) = 0$ and the limit $f(x) \to 1$ to be performed at the end of computations. Note that under the above transformations, the in-field equations and the S-matrix are invariant and that $B_H$ is changed by an irrelevant c-number (in the limit $f \to 1$).

Consider now the boson transformation $\chi_{in}(x) \to \chi_{in}(x) + \alpha(x)$: in local gauge theories the boson transformation must be compatible with the Heisenberg field equations but also with the physical state condition [39]. Under the boson transformation with $\alpha(x) = \tilde{v} Z_\chi^{-1/2} f(x)$ and $\partial^2 f(x) = 0$, $B_H$ changes as

\[ B_H(x) \to B_H(x) - \frac{e_0\tilde{v}^2}{Z_\chi}\,f(x) \qquad [43] \]

Equation [38] is thus violated when the Gupta–Bleuler-like condition is imposed. In order to restore it, the shift in $B_H$ must be compensated by means of the following transformation on $U_{in}$:

\[ U^\mu_{in}(x) \to U^\mu_{in}(x) + Z_3^{-1/2}\,a^\mu(x), \quad \partial_\mu a^\mu(x) = 0 \qquad [44] \]

with a convenient c-number function $a^\mu(x)$. The dynamical maps of the various Heisenberg operators are not affected by [44] since they contain $U^\mu_{in}$ and $B_H$ in a combination such that the changes of $B_H$ and of $U^\mu_{in}$ compensate each other provided

\[ (\partial^2 + m_V^2)\,a_\mu(x) = \frac{m_V^2}{e_0}\,\partial_\mu f(x) \qquad [45] \]

Equation [45] thus obtained is the Maxwell equation for the massive potential vector $a_\mu$ (Matsumoto et al. 1975a, b). The classical ground state current $j_\mu$ turns out to be

\[ j_\mu(x) \equiv \langle 0|j_{H\mu}(x)|0\rangle = m_V^2\left[ a_\mu(x) - \frac{1}{e_0}\,\partial_\mu f(x) \right] \qquad [46] \]

The term $m_V^2\,a_\mu(x)$ is the Meissner current, while $(m_V^2/e_0)\,\partial_\mu f(x)$ is the boson current. The key point here is that both the macroscopic field and current are given in terms of the boson condensation function $f(x)$.

Two remarks are in order: first, note that the terms proportional to $\partial_\mu f(x)$ are related to observable effects, for example, the boson current which acts as the source of the classical field. Second, note that the macroscopic ground state effects do not occur for regular $f(x)$ ($G^{+}_{\mu\nu}(x) = 0$). In fact, from [45] we obtain $a_\mu(x) = (1/e_0)\,\partial_\mu f(x)$ for regular $f(x)$, which implies zero classical current ($j_\mu = 0$) and zero classical field ($F_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$), since the Meissner and the boson current cancel each other.

In conclusion, the vacuum current appears only when $f(x)$ has topological singularities and these can be created only by condensation of massless bosons, that is, when SSB occurs. This explains why topological defects appear in the process of phase transitions, where NG modes are present and gradients in their condensate densities are nonzero (Kibble 1976, Zurek 1997).

On the other hand, the appearance of a spacetime-dependent order parameter is no guarantee that persistent ground state currents (and fields) will exist: if $f(x)$ is a regular function, the spacetime dependence of $\tilde{v}$ can be gauged away by an appropriate gauge transformation.

Since, as already mentioned, the boson transformation with regular $f(x)$ does not affect observable quantities, the S-matrix is actually given by

\[ S = \; :S\Big[\rho_{in},\, U^\mu_{in} - \frac{1}{m_V}\,\partial(\chi_{in} - b_{in})\Big]: \qquad [47] \]

This is indeed independent of the boson transformation with regular $f(x)$:

\[ S \to S' = \; :S\Big[\rho_{in},\, U^\mu_{in} - \frac{1}{m_V}\,\partial(\chi_{in} - b_{in}) + Z_3^{-1/2}\Big(a^\mu - \frac{1}{e_0}\,\partial^\mu f\Big)\Big]: \qquad [48] \]
since $a_\mu(x) = (1/e_0)\,\partial_\mu f(x)$ for regular $f(x)$. However, $S' \neq S$ for singular $f(x)$: $S'$ includes the interaction of the quanta $U^\mu_{in}$ and $\rho_{in}$ with the classically behaving macroscopic defects (Umezawa 1993, Umezawa et al. 1982).

The Vortex Solution

Below we consider the example of the Nielsen–Olesen vortex string solution. We show which one is the boson function $f(x)$ controlling the nonhomogeneous NG boson condensation in terms of which the string solution is described. For brevity, we only report the results of the computations. The detailed derivation as well as the discussion of further examples can be found in Umezawa (1993) and Umezawa et al. (1982).

In the present U(1) problem, the electromagnetic tensor and the vacuum current are (Umezawa 1993, Umezawa et al. 1982, Matsumoto et al. 1975a, b)

\[ F_{\mu\nu}(x) = \partial_\mu a_\nu(x) - \partial_\nu a_\mu(x) = 2\pi\,\frac{m_V^2}{e_0}\int d^4x'\,\Delta_c(x - x')\,G^{+}_{\mu\nu}(x') \qquad [49] \]

\[ j_\mu(x) = -2\pi\,\frac{m_V^2}{e_0}\int d^4x'\,\Delta_c(x - x')\,\partial^\nu_{x'} G^{+}_{\nu\mu}(x') \qquad [50] \]

respectively, and satisfy $\partial^\mu F_{\mu\nu}(x) = -j_\nu(x)$. In these equations,

\[ \Delta_c(x - x') = \frac{1}{(2\pi)^4}\int d^4p\,e^{-ip(x - x')}\,\frac{1}{p^2 - m_V^2 + i\epsilon} \qquad [51] \]

The line singularity for the vortex (or string) solution can be parametrized by a single line parameter $\sigma$ and by the time parameter $\tau$. A static vortex solution is obtained by setting $y_0(\tau, \sigma) = \tau$ and $\mathbf{y}(\tau, \sigma) = \mathbf{y}(\sigma)$, with $\mathbf{y}$ denoting the line coordinate. $G^{+}_{\mu\nu}(x)$ is nonzero only on the line at $\mathbf{y}$ (we can consider more lines but let us limit ourselves to only one line, for simplicity). Thus, we have

\[ G_{0i}(x) = \int d\sigma\,\frac{dy_i(\sigma)}{d\sigma}\,\delta^3[\mathbf{x} - \mathbf{y}(\sigma)], \quad G_{ij}(x) = 0 \qquad [52] \]

\[ G^{+}_{ij}(x) = -\epsilon_{ijk}\,G_{0k}(x), \quad G^{+}_{0i}(x) = 0 \]

Equation [49] shows that these vortices are purely magnetic. We obtain

\[ \partial_0 f(x) = 0 \]

\[ \partial_i f(x) = \frac{1}{(2\pi)^2}\int d\sigma\,\epsilon_{ijk}\,\frac{dy_k(\sigma)}{d\sigma}\,\partial^x_j \int d^3p\,\frac{e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y}(\sigma))}}{p^2} \qquad [53] \]

that is, by using the identity $(2\pi)^{-2}\int d^3p\,(e^{i\mathbf{p}\cdot\mathbf{x}}/p^2) = 1/(2|\mathbf{x}|)$,

\[ \nabla f(x) = -\frac{1}{2}\int d\sigma\,\frac{d\mathbf{y}(\sigma)}{d\sigma} \wedge \nabla_x \frac{1}{|\mathbf{x} - \mathbf{y}(\sigma)|} \qquad [54] \]

Note that $\nabla^2 f(x) = 0$ is satisfied.

A straight infinitely long vortex is specified by $y_i(\sigma) = \sigma\,\delta_{i3}$ with $-\infty < \sigma < \infty$. The only nonvanishing components of $G^{\mu\nu}(x)$ are $G^{03}(x) = G^{+}_{12}(x) = \delta(x_1)\,\delta(x_2)$. Equation [54] gives (Umezawa 1993, Umezawa et al. 1982, Matsumoto et al. 1975a, b)

\[ \frac{\partial}{\partial x_1} f(x) = \frac{1}{2}\int d\sigma\,\frac{\partial}{\partial x_2}\,[x_1^2 + x_2^2 + (x_3 - \sigma)^2]^{-1/2} = -\frac{x_2}{x_1^2 + x_2^2} \]

\[ \frac{\partial}{\partial x_2} f(x) = \frac{x_1}{x_1^2 + x_2^2}, \quad \frac{\partial}{\partial x_3} f(x) = 0 \qquad [55] \]

and then

\[ f(x) = \tan^{-1}\left(\frac{x_2}{x_1}\right) = \theta(x) \qquad [56] \]

We have thus determined the boson transformation function corresponding to a particular vortex solution. The vector potential is

\[ a_1(x) = -\frac{m_V^2}{2e_0}\int d^4x'\,\Delta_c(x - x')\,\frac{x'_2}{x'^2_1 + x'^2_2} \]

\[ a_2(x) = \frac{m_V^2}{2e_0}\int d^4x'\,\Delta_c(x - x')\,\frac{x'_1}{x'^2_1 + x'^2_2} \qquad [57] \]

\[ a_3(x) = a_0(x) = 0 \]

and the only nonvanishing component of $F_{\mu\nu}$ is

\[ F_{12}(x) = 2\pi\,\frac{m_V^2}{e_0}\int d^4x'\,\Delta_c(x - x')\,\delta(x'_1)\,\delta(x'_2) = \frac{m_V^2}{e_0}\,K_0\Big(m_V\sqrt{x_1^2 + x_2^2}\Big) \qquad [58] \]

Finally, the vacuum current eqn [50] is given by

\[ j_1(x) = -\frac{m_V^3}{e_0}\,\frac{x_2}{\sqrt{x_1^2 + x_2^2}}\,K_1\Big(m_V\sqrt{x_1^2 + x_2^2}\Big) \]

\[ j_2(x) = \frac{m_V^3}{e_0}\,\frac{x_1}{\sqrt{x_1^2 + x_2^2}}\,K_1\Big(m_V\sqrt{x_1^2 + x_2^2}\Big) \qquad [59] \]

\[ j_3(x) = j_0(x) = 0 \]

We observe that these results are the same as those of the Nielsen–Olesen vortex solution. Notice that we did not specify the potential in our model but only the invariance properties. Thus, the invariance properties of the dynamics determine the characteristics of the topological solutions. The vortex solution
manifests the original U(1) symmetry through the cylindrical angle $\theta$ which is the parameter of the U(1) representation in the coordinate space.

Conclusions

We have discussed how topological defects arise as inhomogeneous condensates in QFT. Topological defects are shown to have a genuine quantum nature. The approach reviewed here goes under the name of ‘‘boson transformation method’’ and relies on the existence of unitarily inequivalent representations of the field algebra in QFT.

Describing quantum fields with topological defects then amounts to properly choosing the physical Fock space for representing the Heisenberg field operators. Once the boundary conditions corresponding to a particular soliton sector are found, the Heisenberg field operators embodied with such conditions contain the full information about the defects, the quanta and their mutual interaction. One can thus calculate Green’s functions for particles in the presence of defects. The extension to finite temperature is discussed in Blasone and Jizba (2002) and Manka and Vitiello (1990).

As an example we have discussed a model with U(1) gauge invariance and SSB and we have obtained the Nielsen–Olesen vortex solution in terms of localized condensation of Goldstone bosons. These thus appear to play a physical role, although, in the presence of gauge fields, they do not show up in the physical spectrum as excitation quanta. The function $f(x)$ controlling the condensation of the NG bosons must be singular in order to produce observable effects. Boson transformations with regular $f(x)$ only amount to gauge transformations. For the treatment of topological defects in nonabelian gauge theories, see Manka and Vitiello (1990).

Finally, when there are no NG modes, as in the case of the kink solution or the sine-Gordon solution, the boson transformation function has to carry a divergence singularity at spatial infinity (Umezawa 1993, Umezawa et al. 1982, Blasone and Jizba 2002). The boson transformation has also been discussed in connection with the Bäcklund transformation at a classical level and the confinement of the constituent quanta in the coherent condensation domain.

For further reading on quantum fields with topological defects, see Blasone et al. (2006).

Acknowledgments

The authors thank MIUR, INFN, INFM, and the ESF network COSLAB for partial financial support.

See also: Abelian Higgs Vortices; Algebraic Approach to Quantum Field Theory; Quantum Field Theory: A Brief Introduction; Quantum Field Theory in Curved Spacetime; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Topological Defects and their Homotopy Classification.

Further Reading

Anderson PW (1958) Coherent excited states in the theory of superconductivity: gauge invariance and the Meissner effect. Physical Review 110: 827–835.
Blasone M and Jizba P (2002) Topological defects as inhomogeneous condensates in quantum field theory: kinks in (1 + 1) dimensional $\lambda\psi^4$ theory. Annals of Physics 295: 230–260.
Blasone M, Jizba P, and Vitiello G (2006) Spontaneous Breakdown of Symmetry and Topological Defects. London: Imperial College Press. (in preparation).
Goldstone J (1961) Field theories with ‘‘superconductor’’ solutions. Nuovo Cimento 19: 154–164.
Goldstone J, Salam A, and Weinberg S (1962) Broken symmetries. Physical Review 127: 965–970.
Higgs P (1960) Spontaneous symmetry breakdown without massless bosons. Physical Review 145: 1156–1163.
Kibble TWB (1967) Symmetry breaking in non-abelian gauge theories. Physical Review 155: 1554–1561.
Kibble TWB (1976) Topology of cosmic domains and strings. Journal of Physics A 9: 1387–1398.
Kibble TWB (1980) Some implications of a cosmological phase transition. Physics Reports 67: 183–199.
Kleinert H (1989) Gauge Fields in Condensed Matter, vols. I & II. Singapore: World Scientific.
Manka R and Vitiello G (1990) Topological solitons and temperature effects in gauge field theory. Annals of Physics 199: 61–83.
Matsumoto H, Papastamatiou NJ, Umezawa H, and Vitiello G (1975a) Dynamical rearrangement in Anderson–Higgs–Kibble mechanism. Nuclear Physics B 97: 61–89.
Matsumoto H, Papastamatiou NJ, and Umezawa H (1975b) The boson transformation and the vortex solutions. Nuclear Physics B 97: 90–124.
Nambu Y and Jona-Lasinio G (1961) Dynamical model of elementary particles based on an analogy with superconductivity. I. Physical Review 122: 345–358.
Nambu Y and Jona-Lasinio G (1961) Dynamical model of elementary particles based on an analogy with superconductivity. II. Physical Review 124: 246–254.
Rajaraman R (1982) Solitons and Instantons: An Introduction to Solitons and Instantons in Quantum Field Theory. Amsterdam: North-Holland.
Umezawa H (1993) Advanced Field Theory: Micro, Macro and Thermal Physics. New York: American Institute of Physics.
Umezawa H, Matsumoto H, and Tachiki M (1982) Thermo Field Dynamics and Condensed States. Amsterdam: North-Holland.
Vitiello G (2001) My Double Unveiled. Amsterdam: John Benjamins.
Volovik GE (2003) The Universe in a Helium Droplet. Oxford: Clarendon.
von Neumann J (1955) Mathematical Foundation of Quantum Mechanics. Princeton: Princeton University Press.
Zurek WH (1997) Cosmological experiments in condensed matter systems. Physics Reports 276: 177–221.
Quantum Geometry and Its Applications


A Ashtekar, Pennsylvania State University, University Park, PA, USA
J Lewandowski, Uniwersytet Warszawski, Warsaw, Poland
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In general relativity, the gravitational field is encoded in the Riemannian geometry of spacetime. Much of the conceptual compactness and mathematical elegance of the theory can be traced back to this central idea. The encoding is also directly responsible for the most dramatic ramifications of the theory: the big bang, black holes, and gravitational waves. However, it also leads one to the conclusion that spacetime itself must end and physics must come to a halt at the big bang and inside black holes, where the gravitational field becomes singular. But this reasoning ignores quantum physics entirely. When the curvature becomes large, of the order of $1/\ell_{\rm Pl}^2 = c^3/G\hbar$, quantum effects dominate and predictions of general relativity can no longer be trusted. In this “Planck regime,” one must use an appropriate synthesis of general relativity and quantum physics, that is, a quantum gravity theory. The predictions of this theory are likely to be quite different from those of general relativity. In the real, quantum world, evolution may be completely nonsingular. Physics may not come to a halt and quantum theory could extend classical spacetime.

There are a number of different approaches to quantum gravity. One natural avenue is to retain the interplay between gravity and geometry but now use “quantum” Riemannian geometry in place of the standard, classical one. This is the key idea underlying loop quantum gravity. There are several calculations which indicate that the well-known failure of the standard perturbative approach to quantum gravity may be primarily due to its basic assumption that spacetime can be modeled as a smooth continuum at all scales. In loop quantum gravity, one adopts a nonperturbative approach. There is no smooth metric in the background. Geometry is not only dynamical but quantum mechanical from “birth.” Its fundamental excitations turn out to be one dimensional and polymer-like. The smooth continuum is only a coarse-grained approximation. While a fully satisfactory quantum gravity theory still awaits us (in any approach), detailed investigations have been carried out to completion in simplified models – called mini- and midi-superspaces. They show that quantum spacetime does not end at singularities. Rather, quantum geometry serves as a “bridge” to another large classical spacetime.

This article will focus on structural issues from the perspective of mathematical physics. For complementary perspectives and further details, see Loop Quantum Gravity, Canonical General Relativity, Quantum Cosmology, Black Hole Mechanics, and Spin Foams in this Encyclopedia.

Basic Framework

The starting point is a Hamiltonian formulation of general relativity based on spin connections (Ashtekar 1987). Here, the phase space $\Gamma$ consists of canonically conjugate pairs (A, P), where A is a connection on a 3-manifold M and P a 2-form, both of which take values in the Lie algebra su(2). Since $\Gamma$ can also be thought of as the phase space of the SU(2) Yang–Mills theory, in this approach there is a unified kinematic framework for general relativity that describes gravity and the gauge theories which describe the other three basic forces of nature. The connection A enables one to parallel transport chiral spinors (such as the left-handed fermions of the standard electroweak model) along curves in M. Its curvature is directly related to the electric and magnetic parts of the spacetime “Riemann tensor.” The dual $\tilde{P}$ of P plays a double role (the dual is defined via $\int_M P \wedge \omega = \int_M \tilde{P}\,\lrcorner\,\omega$ for any 1-form $\omega$ on M). Being the momentum canonically conjugate to A, it is analogous to the Yang–Mills electric field. But (apart from a constant), it is also an orthonormal triad (with density weight 1) on M and therefore determines the positive-definite (“spatial”) 3-metric, and hence the Riemannian geometry of M. This dual role of P is a reflection of the fact that now SU(2) is the (double cover of the) group of rotations of the orthonormal spatial triads on M itself rather than of rotations in an “internal” space associated with M.

To pass to quantum theory, one first constructs an algebra of “elementary” functions on $\Gamma$ (analogous to the phase-space functions x and p in the case of a particle) which are to have unambiguous operator analogs. The holonomies

$$h_e(A) := \mathcal{P}\exp\left(-\int_e A\right) \qquad [1]$$

associated with a curve/edge e on M are (SU(2)-valued) configuration functions on $\Gamma$. Similarly,

given a 2-surface S on M, and an su(2)-valued (test) function f on M,

$$P_{S,f} := \int_S \mathrm{tr}\,(f P) \qquad [2]$$

is a momentum function on $\Gamma$, where tr is over the su(2) indices. (For simplicity of presentation, all fields are assumed to be smooth and curves/edges e and surfaces S, finite and piecewise analytic in a specific sense. The extension to smooth curves and surfaces was carried out by Baez and Sawin, Lewandowski and Thiemann, and Fleischhack. It is technically more involved but the final results are qualitatively the same.) The symplectic structure on $\Gamma$ enables one to calculate the Poisson brackets $\{h_e, P_{S,f}\}$. The result is a linear combination of holonomies and can be written as a Lie derivative,

$$\{h_e, P_{S,f}\} = \mathcal{L}_{X_{S,f}}\, h_e \qquad [3]$$

where $X_{S,f}$ is a derivation on the ring generated by holonomy functions, and can therefore be regarded as a vector field on the configuration space $\mathcal{A}$ of connections. This is a familiar situation in classical mechanics of systems whose configuration space is a finite-dimensional manifold. Functions $h_e$ and vector fields $X_{S,f}$ generate a Lie algebra. As in quantum mechanics on manifolds, the first step is to promote this algebra to a quantum algebra by demanding that the commutator be given by $i\hbar$ times the Lie bracket. The result is a $\star$-algebra $\mathfrak{a}$, analogous to the algebra generated by operators $\exp i\lambda\hat{x}$ and $\hat{p}$ in quantum mechanics. By exponentiating the momentum operators $\hat{P}_{S,f}$ one obtains $\mathfrak{W}$, the analog of the quantum-mechanical Weyl algebra generated by $\exp i\lambda\hat{x}$ and $\exp i\mu\hat{p}$.

The main task is to obtain the appropriate representation of these algebras. In that representation, quantum Riemannian geometry can be probed through the momentum operators $\hat{P}_{S,f}$, which stem from classical orthonormal triads. As in quantum mechanics on manifolds or simple field theories in flat space, it is convenient to divide the task into two parts. In the first, one focuses on the algebra $\mathcal{C}$ generated by the configuration operators $\hat{h}_e$ and finds all its representations, and in the second one considers the momentum operators $\hat{P}_{S,f}$ to restrict the freedom.

$\mathcal{C}$ is called the holonomy algebra. It is naturally endowed with the structure of an abelian $C^\star$ algebra (with identity), whence one can apply the powerful machinery made available by the Gel'fand theory. This theory tells us that $\mathcal{C}$ determines a unique compact, Hausdorff space $\bar{\mathcal{A}}$ such that the $C^\star$ algebra of all continuous functions on $\bar{\mathcal{A}}$ is naturally isomorphic to $\mathcal{C}$. $\bar{\mathcal{A}}$ is called the Gel'fand spectrum of $\mathcal{C}$. It has been shown to consist of “generalized connections” $\bar{A}$ defined as follows: $\bar{A}$ assigns to any oriented edge e in M an element $\bar{A}(e)$ of SU(2) (a “holonomy”) such that $\bar{A}(e^{-1}) = [\bar{A}(e)]^{-1}$; and, if the endpoint of $e_1$ is the starting point of $e_2$, then $\bar{A}(e_1 \circ e_2) = \bar{A}(e_1)\,\bar{A}(e_2)$. Clearly, every smooth connection A is a generalized connection. In fact, the space $\mathcal{A}$ of smooth connections has been shown to be dense in $\bar{\mathcal{A}}$ (with respect to the natural Gel'fand topology thereon). But $\bar{\mathcal{A}}$ has many more “distributional elements.” The Gel'fand theory guarantees that every representation of the $C^\star$ algebra $\mathcal{C}$ is a direct sum of representations of the following type: the underlying Hilbert space is $\mathcal{H} = L^2(\bar{\mathcal{A}}, d\mu)$ for some measure $\mu$ on $\bar{\mathcal{A}}$ and (regarded as functions on $\bar{\mathcal{A}}$) elements of $\mathcal{C}$ act by multiplication. Since there are many inequivalent measures on $\bar{\mathcal{A}}$, there is a multitude of representations of $\mathcal{C}$. A key question is how many of them can be extended to representations of the full algebra $\mathfrak{a}$ (or $\mathfrak{W}$) without having to introduce any “background fields” which would compromise diffeomorphism covariance. Quite surprisingly, the requirement that the representation be cyclic with respect to a state which is invariant under the action of the (appropriately defined) group Diff M of piecewise-analytic diffeomorphisms on M singles out a unique irreducible representation. This result was established for $\mathfrak{a}$ by Lewandowski, Okołów, Sahlmann and Thiemann, and for $\mathfrak{W}$ by Fleischhack. It is the quantum geometry analog to the seminal results by Segal and others that characterized the Fock vacuum in Minkowskian field theories. However, while that result assumes not only Poincaré invariance but also specific (namely free) dynamics, it is striking that the present uniqueness theorems make no such restriction on dynamics. The requirement of diffeomorphism invariance is surprisingly strong and makes the “background-independent” quantum geometry framework surprisingly tight.

This representation had been constructed by Ashtekar, Baez, and Lewandowski some ten years before its uniqueness was established. The underlying Hilbert space is given by $\mathcal{H} = L^2(\bar{\mathcal{A}}, d\mu_o)$ where $\mu_o$ is a diffeomorphism-invariant, faithful, regular Borel measure on $\bar{\mathcal{A}}$, constructed from the normalized Haar measure on SU(2). Typical quantum states can be visualized as follows. Fix: (1) a graph $\gamma$ on M (by a graph on M we mean a set of a finite number of embedded, oriented intervals called edges; if two edges intersect, they do so only at one or both ends, called vertices), and (2) a smooth function $\psi$ on $[SU(2)]^n$. Then, the function

$$\Psi_\gamma(\bar{A}) := \psi(\bar{A}(e_1), \ldots, \bar{A}(e_n)) \qquad [4]$$

on $\bar{\mathcal{A}}$ is an element of $\mathcal{H}$. Such states are said to be “cylindrical” with respect to the graph $\gamma$ and their space is denoted by $\mathrm{Cyl}_\gamma$. These are “typical states” in the sense that $\mathrm{Cyl} := \cup_\gamma\, \mathrm{Cyl}_\gamma$ is dense in $\mathcal{H}$. Finally, as ensured by the Gel'fand theory, the holonomy (or configuration) operators $\hat{h}_e$ act just by multiplication. The momentum operators $\hat{P}_{S,f}$ act as Lie derivatives: $\hat{P}_{S,f}\Psi = i\hbar\,\mathcal{L}_{X_{S,f}}\Psi$.

Remark Given any graph $\gamma$ in M, and a labeling of each of its edges by a nontrivial irreducible representation of SU(2) (i.e., by a nonzero half integer j), one can construct a finite-dimensional Hilbert space $\mathcal{H}_{\gamma,j}$, which can be thought of as the state space of a spin system “living on” the graph $\gamma$. The full Hilbert space admits a simple decomposition: $\mathcal{H} = \oplus_{\gamma,j}\, \mathcal{H}_{\gamma,j}$. This is called the spin-network decomposition. The geometric operators discussed in the next section leave each $\mathcal{H}_{\gamma,j}$ invariant. Therefore, the availability of this decomposition greatly simplifies the task of analyzing their properties.

Geometric Operators

In the classical theory, $E := 8\pi G\gamma P$ has the interpretation of an orthonormal triad field (or a “moving frame”) on M (with density weight 1). Here, $\gamma$ is a dimensionless, strictly positive number, called the Barbero–Immirzi parameter, which arises as follows. Because of emphasis on connections, in the classical theory the first-order Palatini action is a more natural starting point than the second-order Einstein–Hilbert action. Now, there is a freedom to add a term to the Palatini action which vanishes when Bianchi identities are satisfied and therefore does not change the equations of motion. $\gamma$ arises as the coefficient of this term. In some respects $\gamma$ is analogous to the $\theta$ parameter of Yang–Mills theory. Indeed, while theories corresponding to any permissible values of $\gamma$ are related by a canonical transformation classically, quantum mechanically this transformation is not unitarily implementable. Therefore, although there is a unique representation of the algebra $\mathfrak{a}$ (or $\mathfrak{W}$), there is a one-parameter family of inequivalent representations of the algebra of geometric operators generated by suitable functions of orthonormal triads E, each labeled by the value of $\gamma$. This is a genuine quantization ambiguity. As with the $\theta$ ambiguity in QCD, the actual value of $\gamma$ in nature has to be determined experimentally. The current strategy in quantum geometry is to fix its value through a thought experiment involving black hole thermodynamics (see below).

The basic object in quantum Riemannian geometry is the triad flux operator $\hat{E}_{S,f} := 8\pi G\gamma\, \hat{P}_{S,f}$. It is self-adjoint and all its eigenvalues are discrete. To define other geometric operators such as the area operator $\hat{A}_S$ associated with a surface S or a volume operator $\hat{V}_R$ associated with a region R, one first expresses the corresponding phase-space functions in terms of the “elementary” functions $E_{S_i,f_i}$ using suitable surfaces $S_i$ and test functions $f_i$ and then promotes $E_{S_i,f_i}$ to operators. Even though the classical expressions are typically nonpolynomial functions of $E_{S_i,f_i}$, the final operators are all well defined, self-adjoint and with purely discrete eigenvalues. Therefore, in the sense of the word used in elementary quantum mechanics (e.g., of the hydrogen atom), one says that geometry is quantized. Because the theory has no background metric or indeed any other background field, all geometric operators transform covariantly under the action of Diff M. This diffeomorphism covariance makes the final expressions of operators rather simple. In the case of the area operator, for example, the action of $\hat{A}_S$ on a state $\Psi_\gamma$ [4] depends entirely on the points of intersection of the surface S and the graph $\gamma$ and involves only right- and left-invariant vector fields on copies of SU(2) associated with edges of $\gamma$ which intersect S. In the case of the volume operator $\hat{V}_R$, the action depends on the vertices of $\gamma$ contained in R and, at each vertex, involves the right- and left-invariant vector fields on copies of SU(2) associated with edges that meet at each vertex.

To display the explicit expressions of these operators, let us first define on $\mathrm{Cyl}_\gamma$ three basic operators $\hat{J}_j^{(v,e)}$, with $j \in \{1, 2, 3\}$, associated with the pair consisting of an edge e of $\gamma$ and a vertex v of e:

$$\hat{J}_j^{(v,e)}\,\Psi_\gamma(\bar{A}) = \begin{cases} i\,\dfrac{d}{dt}\Big|_{t=0}\, \psi_\gamma(\ldots, U_e(\bar{A})\exp(t\tau_j), \ldots) & \text{if } e \text{ begins at } v \\[2ex] i\,\dfrac{d}{dt}\Big|_{t=0}\, \psi_\gamma(\ldots, \exp(t\tau_j)\,U_e(\bar{A}), \ldots) & \text{if } e \text{ ends at } v \end{cases} \qquad [5]$$

where $\tau_j$ denotes a basis in su(2) and “…” stands for the rest of the arguments of $\psi_\gamma$ which remain unaffected. The quantum area operator $\hat{A}_S$ is assigned to a finite two-dimensional submanifold S in M. Given a cylindrical state we can always represent it in the form [4] using a graph $\gamma$ adapted to S, such that every edge e either intersects S at exactly one endpoint, or is contained in the closure $\bar{S}$, or does not intersect S. For each vertex v in S of the graph $\gamma$, the family of edges intersecting v can be divided into three classes: edges $\{e_1, \ldots, e_u\}$ lying on one side (say “above”) S, edges $\{e_{u+1}, \ldots, e_{u+d}\}$ lying

on the other side (say “below”), and edges contained in S. To each v we assign a generalized Laplace operator

$$\Delta_{S,v} = -\eta^{ij} \left( \sum_{I=1}^{u} \hat{J}_i^{(v,e_I)} - \sum_{I=u+1}^{u+d} \hat{J}_i^{(v,e_I)} \right) \left( \sum_{K=1}^{u} \hat{J}_j^{(v,e_K)} - \sum_{K=u+1}^{u+d} \hat{J}_j^{(v,e_K)} \right) \qquad [6]$$

where $\eta_{ij}$ stands for 1/2 the Killing form on su(2). Now, the action of the quantum area operator $\hat{A}_S$ on $\Psi_\gamma$ is defined as follows:

$$\hat{A}_S\,\Psi_\gamma = 4\pi\gamma\,\ell_{\rm Pl}^2 \sum_{v \in S} \sqrt{-\Delta_{S,v}}\;\Psi_\gamma \qquad [7]$$

The quantum area operator has played the most important role in applications. Its complete spectrum is known in a closed form. Consider arbitrary sets $j_I^{(u)}$, $j_I^{(d)}$, and $j_I^{(u+d)}$ of half-integers, subject to the condition

$$j_I^{(u+d)} \in \{\, |j_I^{(u)} - j_I^{(d)}|,\; |j_I^{(u)} - j_I^{(d)}| + 1,\; \ldots,\; j_I^{(u)} + j_I^{(d)} \,\} \qquad [8]$$

where I runs over any finite number of integers. The general eigenvalues of the area operator are given by:

$$a_S = 4\pi\gamma\,\ell_{\rm Pl}^2 \sum_I \left( 2 j_I^{(u)}(j_I^{(u)} + 1) + 2 j_I^{(d)}(j_I^{(d)} + 1) - j_I^{(u+d)}(j_I^{(u+d)} + 1) \right)^{1/2} \qquad [9]$$

On the physically interesting sector of SU(2)-gauge-invariant subspace $\mathcal{H}_{\rm inv}$ of $\mathcal{H}$, the lowest eigenvalue of $\hat{A}_S$ – “the area gap” – depends on some global properties of S. Specifically, it “knows” whether the surface is open, or a 2-sphere, or, if M is a 3-torus, a (nontrivial) 2-torus in M. Finally, on $\mathcal{H}_{\rm inv}$, one is often interested only in the subspace of states $\Psi_\gamma$, where $\gamma$ has no edges which lie within a given surface S. Then, the expression of eigenvalues simplifies considerably:

$$a_S = 8\pi\gamma\,\ell_{\rm Pl}^2 \sum_I \sqrt{j_I (j_I + 1)} \qquad [10]$$

To display the action of the quantum volume operator $\hat{V}_R$, for each vertex v of a given graph $\gamma$, let us first define an operator $\hat{q}_v$ on $\mathrm{Cyl}_\gamma$.

$$\hat{q}_v = (8\pi\gamma\,\ell_{\rm Pl}^2)^3\, \frac{1}{48} \sum_{e,e',e''} \epsilon(e, e', e'')\, c^{ijk}\, \hat{J}_i^{(v,e)}\, \hat{J}_j^{(v,e')}\, \hat{J}_k^{(v,e'')} \qquad [11]$$

where $e$, $e'$, and $e''$ run over the set of edges intersecting v, $\epsilon(e, e', e'')$ takes values $\pm 1$ or 0 depending on the orientation of the half-lines tangent to the edges at v, $[\tau_i, \tau_j] = c^k{}_{ij}\,\tau_k$ and the indices are raised by the tensor $\eta^{ij}$. The action of the quantum volume operator on a cylindrical state [4] is then given by

$$\hat{V}_R\,\Psi_\gamma = \kappa_o \sum_{v \in R} \sqrt{|\hat{q}_v|}\;\Psi_\gamma \qquad [12]$$

Here, $\kappa_o$ is an overall constant, independent of the graph, resulting from an averaging.

The volume operator plays an unexpectedly important role in the definition of both the gravitational and matter contributions to the scalar constraint operator which dictates dynamics. Finally, a notable property of the volume operator is the following. Let $R(p, \epsilon)$ be a family of neighborhoods of a point $p \in M$. Then, as indicated above, $\hat{V}_{R(p,\epsilon)}\Psi_\gamma = 0$ if $\gamma$ has no vertex in the neighborhood. However, if $\gamma$ has a vertex at p,

$$\lim_{\epsilon \to 0} \hat{V}_{R(p,\epsilon)}\,\Psi_\gamma$$

exists but is not necessarily zero. This is a reflection of the “distributional” nature of quantum geometry.

Remark States $\Psi_\gamma \in \mathrm{Cyl}$ have support only on the graph $\gamma$. In particular, they are simply annihilated by geometric operators such as $\hat{A}_S$ and $\hat{V}_R$ if the support of the surface S and the region R does not intersect the support of $\gamma$. In this sense the fundamental excitations of geometry are one dimensional and geometry is polymer-like. States $\Psi_\gamma$, where $\gamma$ is just a “small graph,” are highly quantum mechanical – like states in QED representing just a few photons. Just as coherent states in QED require an infinite superposition of such highly quantum states, to obtain a semiclassical state approximating a given classical geometry, one has to superpose a very large number of such elementary states. More precisely, in the Gel'fand triplet $\mathrm{Cyl} \subset \mathcal{H} \subset \mathrm{Cyl}^\star$, semiclassical states belong to the dual $\mathrm{Cyl}^\star$ of Cyl.

Applications

Since quantum Riemannian geometry underlies loop quantum gravity and spin-foam models, all results obtained in these frameworks can be regarded as its applications. Among these, there are two which have led to resolutions of long-standing issues. The first concerns black hole entropy, and the second, the quantum nature of the big bang.

Black Holes

Seminal advances in fundamentals of black hole physics in the mid-1970s suggested that the entropy of large black holes is given by $S_{\rm BH} = (a_{\rm hor}/4\ell_{\rm Pl}^2)$,

where $a_{\rm hor}$ is the horizon area. This immediately raised a challenge to potential quantum gravity theories: give a statistical mechanical derivation of this relation. For familiar thermodynamic systems, a statistical mechanical derivation begins with an identification of the microscopic degrees of freedom. For a classical gas, these are carried by molecules; for the black body radiation, by photons; and for a ferromagnet, by Heisenberg spins. What about black holes? The microscopic building blocks cannot be gravitons because the discussion involves stationary black holes. Furthermore, the number of microscopic states is absolutely huge: some $\exp 10^{77}$ for a solar mass black hole, a number that completely dwarfs the number of states of systems one normally encounters in statistical mechanics. Where does this huge number come from? In loop quantum gravity, this is the number of states of the “quantum horizon geometry.”

The idea behind the calculation can be heuristically explained using the “It from Bit” argument, put forward by Wheeler in the 1990s. Divide the black hole horizon into elementary cells, each with one Planck unit of area, $\ell_{\rm Pl}^2$, and assign to each cell two microstates. Then the total number of states N is given by $N = 2^n$, where $n = (a_{\rm hor}/\ell_{\rm Pl}^2)$ is the number of elementary cells, whence entropy is given by $S = \ln N \sim a_{\rm hor}$. Thus, apart from a numerical coefficient, the entropy (It) is accounted for by assigning two states (Bit) to each elementary cell. This qualitative picture is simple and attractive. However, the detailed derivation in quantum geometry has several new features.

First, Wheeler's argument would apply to any 2-surface, while in quantum geometry the surface must represent a horizon in equilibrium. This requirement is encoded in a certain boundary condition that the canonically conjugate pair (A, P) must satisfy at the surface and plays a crucial role in the quantum theory. Second, the area of each elementary cell is not a fixed multiple of $\ell_{\rm Pl}^2$ but is given by [10], where I labels the elementary cells and $j_I$ can be any half-integer (such that the sum is within a small neighborhood of the classical area of the black hole under consideration). Finally, the number of quantum states associated with an elementary cell labeled by $j_I$ is not 2 but $(2j_I + 1)$.

The detailed theory of the quantum horizon geometry and the standard statistical mechanical reasoning is then used to calculate the entropy and the temperature. For large black holes, the leading contribution to entropy is proportional to the horizon area, in agreement with quantum field theory in curved spacetimes. (The subleading term $-(1/2)\ln(a_{\rm hor}/\ell_{\rm Pl}^2)$ is a quantum gravity correction to Hawking's semiclassical result. This correction, with the $-1/2$ factor, is robust in the sense that it also arises in other approaches.) However, as one would expect, the proportionality factor depends on the Barbero–Immirzi parameter $\gamma$ and so far loop quantum gravity does not have an independent way to determine its value. The current strategy is to determine $\gamma$ by requiring that, for the Schwarzschild black hole, the leading term agrees exactly with Hawking's semiclassical answer. This requirement implies that $\gamma$ is the root of an algebraic equation and its value is given by $\gamma \approx 0.2735$. Now, quantum geometry theory is completely fixed. One can calculate entropy of other black holes, with angular momentum and distortion. A nontrivial check on the strategy is that for all these cases, the coefficient in the leading-order term again agrees with Hawking's semiclassical result.

The detailed analysis involves a number of structures of interest to mathematical physics. First, the intrinsic horizon geometry is described by a U(1) Chern–Simons theory on a punctured 2-sphere (the horizon), the level k of the theory being given by $k = a_{\rm hor}/4\pi\gamma\,\ell_{\rm Pl}^2$. The punctures are simply the intersections of the excitations of the polymer geometry in the bulk with the horizon 2-surface. Second, because of the horizon boundary conditions, in the classical theory the gauge group SU(2) is reduced to U(1) at the horizon. At each puncture, it is further reduced to the discrete subgroup $\mathbb{Z}_k$ of U(1), sometimes referred to as a “quantum U(1) group.” Third, the “surface phase space” associated with the horizon is represented by a noncommutative torus. Finally, the surface Chern–Simons theory is entirely unrelated to the bulk quantum geometry theory but the quantum horizon boundary condition requires that the spectrum of a certain operator in the Chern–Simons theory must be identical to that of another operator in the bulk theory. The surprising fact is that there is an exact agreement. Without this seamless matching, a coherent description of the quantum horizon geometry would not have been possible.

The main weakness of this approach to black hole entropy stems from the Barbero–Immirzi ambiguity. The argument would be much more compelling if the value of $\gamma$ were determined by independent considerations, without reference to black hole entropy. (By contrast, for extremal black holes, string theory provides the correct coefficient without any adjustable parameter. The AdS/CFT duality hypothesis, as well as other semiquantitative arguments, have been used to encompass certain black holes which are away from extremality. But in these cases, it is not known if the numerical coefficient is

1/4 as in Hawking's analysis.) Its primary strengths are twofold. First, the calculation encompasses all realistic black holes – not just extremal or near-extremal – including the astrophysical ones, which may be highly distorted. Hairy black holes of mathematical physics and cosmological horizons are also encompassed. Second, in contrast to other approaches, one works directly with the physical, curved geometry around black holes rather than with a flat-space system which has the same number of states as the black hole of interest.

The Big Bang

Most of the work in physical cosmology is carried out using spatially homogeneous and isotropic models and perturbations thereon. Therefore, to explore the quantum nature of the big bang, it is natural to begin by assuming these symmetries. Then the spacetime metric is determined simply by the scale factor a(t) and matter fields $\phi(t)$ which depend only on time. Thus, because of symmetries, one is left with only a finite number of degrees of freedom. Therefore, field-theoretic difficulties are bypassed and passage to quantum theory is simplified. This strategy was introduced already in the late 1960s and early 1970s by DeWitt and Misner. Quantum Einstein's equations now reduce to a single differential equation of the type

$$\frac{\partial^2}{\partial a^2}\big( f(a)\,\Psi(a, \phi) \big) = \mathrm{const.}\; \hat{H}_\phi\, \Psi(a, \phi) \qquad [13]$$

on the wave function $\Psi(a, \phi)$, where $\hat{H}_\phi$ is the matter Hamiltonian and f(a) reflects the freedom in factor ordering. Since the scale factor a vanishes at the big bang, one has to analyze the equation and its solutions near a = 0. Unfortunately, because of the standard form of the matter Hamiltonian, coefficients in the equation diverge at a = 0 and the evolution cannot be continued across the singularity unless one introduces unphysical matter or a new principle. A well-known example of new input is the Hartle–Hawking boundary condition which posits that the universe starts out without any boundary and a metric with positive-definite signature and later makes a transition to a Lorentzian metric.

Bojowald and others have shown that the situation is quite different in loop quantum cosmology because quantum geometry effects make a qualitative difference near the big bang. As in older quantum cosmologies, one carries out a symmetry reduction at the classical level. The final result differs from older theories only in minor ways. In the homogeneous, isotropic case, the freedom in the choice of the connection is encoded in a single function c(t) and, in that of the momentum/triad, in another function p(t). The scale factor is given by $a^2 = |p|$. (The variable p itself can assume both signs; positive if the triad is left handed and negative if it is right handed. p vanishes at degenerate triads which are permissible in this approach.) The system again has only a finite number of degrees of freedom. However, quantum theory turns out to be inequivalent to that used in older quantum cosmologies.

This surprising result comes about as follows. Recall that in quantum geometry, one has well-defined holonomy operators $\hat{h}$ but there is no operator corresponding to the connection itself. In quantum mechanics, the analog would be for operators $\hat{U}(\lambda)$ corresponding to the classical functions $\exp i\lambda x$ to exist but not be weakly continuous in $\lambda$; the operator $\hat{x}$ would then not exist. Once the requirement of weak continuity is dropped, von Neumann's uniqueness theorem no longer holds and the Weyl algebra can have inequivalent irreducible representations. The one used in loop quantum cosmology is the direct analog of full quantum geometry. While the space $\mathcal{A}$ of smooth connections reduces just to the real line $\mathbb{R}$, the space $\bar{\mathcal{A}}$ of generalized connections reduces to the Bohr compactification $\bar{\mathbb{R}}_{\rm Bohr}$ of the real line. (This space was introduced by the mathematician Harald Bohr (Niels Bohr's brother) in his theory of almost-periodic functions. It arises in the present application because holonomies turn out to be almost periodic functions of c.) The Hilbert space of states is thus $\mathcal{H} = L^2(\bar{\mathbb{R}}_{\rm Bohr}, d\mu_o)$ where $\mu_o$ is the Haar measure on (the abelian group) $\bar{\mathbb{R}}_{\rm Bohr}$. As in full quantum geometry, the holonomies act by multiplication and the triad/momentum operator $\hat{p}$ via Lie derivatives.

To facilitate comparison with older quantum cosmologies, it is convenient to use a representation in which $\hat{p}$ is diagonal. Then, quantum states are functions $\Psi(p, \phi)$. But the Wheeler–DeWitt equation is now replaced by a difference equation:

$$C^{+}(p)\,\Psi(p + 4p_o, \phi) + C^{o}(p)\,\Psi(p, \phi) + C^{-}(p)\,\Psi(p - 4p_o, \phi) = \mathrm{const.}\; \hat{H}_\phi\, \Psi(p, \phi) \qquad [14]$$

where $p_o$ is determined by the lowest eigenvalue of the area operator (“area gap”) and the coefficients $C^{\pm}(p)$ and $C^{o}(p)$ are functions of p. In a backward “evolution,” given $\Psi$ at $p + 4p_o$ and p, such a “recursion relation” determines $\Psi$ at $p - 4p_o$, provided $C^{-}$ does not vanish there. The coefficients are well behaved and nowhere vanishing, whence the evolution does not stop at any finite p, either in the past or in the future. Thus, near p = 0 this equation is drastically different from the Wheeler–DeWitt equation [13]. However, for large p – that is, when the universe is large – it is well

approximated by [13] and smooth solutions of [13] are approximate solutions of the fundamental discrete equation [14] in a precise sense.

To complete quantization, one has to introduce a suitable Hilbert space structure on the space of solutions to [14], identify physically interesting operators and analyze their properties. For simple matter fields, this program has been completed. With this machinery at hand, one begins with semiclassical states which are peaked at configurations approximating the classical universe at late times (e.g., now) and evolves backwards. Numerical simulations show that the state remains peaked at the classical solution till very early times when the matter density becomes of the order of Planck density. This provides, in particular, a justification, from first principles, for the assumption that spacetime can be taken to be classical even at the onset of the inflationary era, just a few Planck times after the (classical) big bang. While one would expect a result along these lines to hold on physical grounds, technically it is nontrivial to obtain semiclassicality over such huge domains. However, in the Planck regime near the big bang, there are major deviations from the classical behavior. Effectively, gravity becomes repulsive, the collapse is halted and then the universe re-expands. Thus, rather than modifying spacetime structure just in a tiny region near the singularity, quantum geometry effects open a bridge to another large classical universe. These are dramatic modifications of the classical theory.

For over three decades, hopes have been expressed that quantum gravity would provide new insights into the true nature of the big bang. Thanks to quantum geometry effects, these hopes have been realized and many of the long-standing questions have been answered. While the final picture has some similarities with other approaches (e.g., “cyclic universes,” or pre-big-bang cosmology), only in loop quantum cosmology is there a fully deterministic evolution across what was the classical big bang. However, so far, detailed results have been obtained only in simple models. The major open issue is the inclusion of perturbations and subsequent comparison with observations.

See also: Algebraic Approach to Quantum Field Theory; Black Hole Mechanics; Canonical General Relativity; Knot Invariants and Quantum Gravity; Loop Quantum Gravity; Quantum Cosmology; Quantum Dynamics in Loop Quantum Gravity; Quantum Field Theory in Curved Spacetime; Spacetime Topology, Causal Structure and Singularities; Spin Foams; Wheeler–De Witt Theory.

Further Reading

Ashtekar A (1987) New Hamiltonian formulation of general relativity. Physical Review D 36: 1587–1602.
Ashtekar A and Krishnan B (2004) Isolated and dynamical horizons and their applications. Living Reviews in Relativity 10: 1–78 (gr-qc/0407042).
Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report. Classical Quantum Gravity 21: R53–R152.
Bojowald M and Morales-Tecotl HA (2004) Cosmological applications of loop quantum gravity. Lecture Notes in Physics 646: 421–462 (also available at gr-qc/0306008).
Gambini R and Pullin J (1996) Loops, Knots, Gauge Theories and Quantum Gravity. Cambridge: Cambridge University Press.
Perez A (2003) Spin foam models for quantum gravity. Classical Quantum Gravity 20: R43–R104.
Rovelli C (2004) Quantum Gravity. Cambridge: Cambridge University Press.
Thiemann T (2005) Introduction to Modern Canonical Quantum General Relativity. Cambridge: Cambridge University Press (draft available as gr-qc/0110034).
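
Worked example (area spectrum). As a numerical footnote to the area eigenvalues [9] and [10], the following minimal Python sketch evaluates both formulas in units where $\ell_{\rm Pl}^2 = 1$ and checks that [9] reduces to [10] when $j^{(u)} = j^{(d)} = j$ and $j^{(u+d)} = 0$. The function names, the unit choice and the default value of $\gamma$ (the number 0.2735 quoted in the text) are assumptions made for illustration only; this is a sketch, not part of the original article.

```python
import math

GAMMA = 0.2735  # Barbero-Immirzi parameter as quoted in the text (assumed default)


def _check_spin(j):
    # Spins must be non-negative half-integers: 0, 1/2, 1, 3/2, ...
    if j < 0 or (2 * j) != int(2 * j):
        raise ValueError(f"{j!r} is not a non-negative half-integer")


def area_general(triples, gamma=GAMMA, l_pl_sq=1.0):
    """Eq. [9]: each triple is (j_u, j_d, j_ud) for one intersection I."""
    total = 0.0
    for j_u, j_d, j_ud in triples:
        for j in (j_u, j_d, j_ud):
            _check_spin(j)
        lowest = abs(j_u - j_d)
        # Condition [8]: j_ud runs from |j_u - j_d| to j_u + j_d in unit steps.
        in_range = lowest <= j_ud <= j_u + j_d
        unit_step = (j_ud - lowest) == int(j_ud - lowest)
        if not (in_range and unit_step):
            raise ValueError("triple violates condition [8]")
        total += math.sqrt(
            2 * j_u * (j_u + 1) + 2 * j_d * (j_d + 1) - j_ud * (j_ud + 1)
        )
    return 4 * math.pi * gamma * l_pl_sq * total


def area_simple(spins, gamma=GAMMA, l_pl_sq=1.0):
    """Eq. [10]: no edges of the graph lie within the surface."""
    for j in spins:
        _check_spin(j)
    return 8 * math.pi * gamma * l_pl_sq * sum(
        math.sqrt(j * (j + 1)) for j in spins
    )


if __name__ == "__main__":
    # One puncture with j = 1/2, in units of the Planck area.
    print(area_simple([0.5]))
    print(area_general([(0.5, 0.5, 0.0)]))
```

Both calls in the usage lines evaluate the same eigenvalue, $4\pi\gamma\sqrt{3}\,\ell_{\rm Pl}^2$, which is how [10] arises as a special case of [9].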

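Worked example (difference equation). The backward “evolution” described for the difference equation [14] can be sketched as a three-term recursion: given $\Psi$ at $p + 4p_o$ and at $p$, solve [14] for $\Psi$ at $p - 4p_o$. The Python sketch below does this for a single matter eigenvalue, so that const. $\hat{H}_\phi$ acts as multiplication by a number. The coefficient functions used in the usage lines are hypothetical toy choices ($C^{\pm} = 1$, $C^{o} = -2$, vanishing right-hand side), not the actual coefficients of loop quantum cosmology; they are chosen only so that the recursion can be followed by hand.

```python
def evolve_backward(psi_up, psi_here, p_here, steps,
                    c_plus, c_zero, c_minus,
                    rhs=lambda p: 0.0, p_o=1.0):
    """Iterate eq. [14] toward decreasing p.

    psi_up   : value of Psi at p_here + 4*p_o
    psi_here : value of Psi at p_here
    rhs(p)   : number standing in for const. * H_phi at this p (assumption)
    Returns a list of (p, Psi) pairs, starting with the two initial points.
    """
    step = 4.0 * p_o
    history = [(p_here + step, psi_up), (p_here, psi_here)]
    for _ in range(steps):
        (_, above), (p, here) = history[-2], history[-1]
        cm = c_minus(p)
        if cm == 0:
            # The recursion only determines Psi(p - 4 p_o) if C^- is nonzero.
            raise ZeroDivisionError(f"C^- vanishes at p = {p}")
        below = (rhs(p) * here - c_plus(p) * above - c_zero(p) * here) / cm
        history.append((p - step, below))
    return history


if __name__ == "__main__":
    # Toy coefficients (hypothetical): Psi(p - 4) = 2 Psi(p) - Psi(p + 4).
    points = evolve_backward(
        psi_up=8.0, psi_here=4.0, p_here=4.0, steps=3,
        c_plus=lambda p: 1.0, c_zero=lambda p: -2.0, c_minus=lambda p: 1.0,
    )
    for p, psi in points:
        print(p, psi)
```

With these toy coefficients the recursion steps from positive p through p = 0 to negative p without obstruction, which illustrates the statement that the evolution does not stop at any finite p as long as the coefficients do not vanish.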
Quantum Group Differentials, Bundles and Gauge Theory


T Brzeziński, University of Wales Swansea, Swansea, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Mathematics of classical gauge theories is contained in the theory of principal and associated vector bundles. Principal bundles describe pure gauge fields and their transformations, while the associated bundles contain matter fields. A structure group of a bundle has a meaning of a gauge group, while the base manifold is a spacetime for the theory. In this article, we review the theory of bundles in which a structure group is a quantum group and base space or spacetime might be noncommutative. To fully deal with geometric aspects, we first review differential geometry of quantum groups. Then we describe the theory of quantum principal bundles, connections on such bundles, gauge transformations, associated vector bundles and their sections. We indicate that, for a certain class of quantum principal bundles, sections of an associated bundle become vector bundles of noncommutative geometry à la Connes, that is,

finite projective modules. The theory is illustrated by two explicit examples that can be viewed as deformations of the classical magnetic monopole and the instanton.

Differential Structures on Algebras

Algebraic Conventions

Throughout this article, $A$ ($P$, etc.) will be an associative unital complex algebra. To gain some geometric intuition, the reader can think of $A$ as an algebra of continuous complex functions on a compact (Hausdorff) space $X$, $C(X)$, with product given by pointwise multiplication, $fg(x) = f(x)g(x)$, and with the unit provided by the constant function $x \mapsto 1$. The algebra $C(X)$ is commutative, but, in what follows, we do not assume that $A$ is a commutative algebra. By an $A$-bimodule we mean a vector space with mutually commuting left and right actions of $A$. All modules are unital (i.e., the unit element of $A$ acts trivially). On elements, the multiplication in an algebra or an action of $A$ on a module is denoted by juxtaposition.

Differential Calculus on an Algebra

A first-order differential calculus on $A$ is a pair $(\Omega^1(A), d)$, where $\Omega^1(A)$ is an $A$-bimodule and $d : A \to \Omega^1(A)$ is a linear map such that:

1. for all $a, b \in A$, $d(ab) = (da)b + a\,db$ (the Leibniz rule); and
2. every $\omega \in \Omega^1(A)$ can be written as $\omega = \sum_i a_i\, db_i$ for some $a_i, b_i \in A$.

Elements of $\Omega^1(A)$ are called differential 1-forms and the map $d$ is called an exterior derivative. As a motivating example, take $A = C(X)$ and $\Omega^1(A)$ the space of 1-forms on $X$ (sections of the cotangent bundle $T^*X$), and $d$ the usual exterior differential.

Higher-differential forms corresponding to $(\Omega^1(A), d)$ are defined as elements of a differential graded algebra $\Omega(A)$. This is an algebra which can be decomposed into the direct sum of $A$-bimodules $\Omega^n(A)$, that is, $\Omega(A) = A \oplus \Omega^1(A) \oplus \Omega^2(A) \oplus \cdots$. In addition to $d : A \to \Omega^1(A)$, there are maps $d_n : \Omega^n(A) \to \Omega^{n+1}(A)$ such that, for all $\omega_n \in \Omega^n(A)$, $\omega_k \in \Omega^k(A)$,

1. $d_1 \circ d = 0$ and $d_{n+1} \circ d_n = 0$, $n = 1, 2, \ldots$;
2. $\omega_n \omega_k \in \Omega^{n+k}(A)$; and
3. $d_{n+k}(\omega_n \omega_k) = (d_n \omega_n)\omega_k + (-1)^n \omega_n (d_k \omega_k)$.

Elements of $\Omega^n(A)$ are known as "differential $n$-forms." $\Omega^n(A)$ contains all linear combinations of expressions $a_0\, da_1\, da_2 \cdots da_n$ with $a_0, \ldots, a_n \in A$. One says that $\Omega(A)$ satisfies the "density condition" if any element of $\Omega^n(A)$ is of the above form, for any $n$. To simplify notation, one writes $d$ for $d_n$.

As an example of $\Omega(A)$, take $A = C(X)$ and then the exterior algebra $\Omega(X)$ for $\Omega(A)$. The exterior algebra satisfies the density condition, as any $n$-form can be written as $f(x) \wedge dg(x) \wedge dh(x) \wedge \cdots$. The wedge product is anticommutative, but for a noncommutative algebra $A$, the anticommutativity of the product in $\Omega(A)$ cannot be generally required.

The Universal Differential Calculus

Any algebra $A$ comes equipped with a universal differential calculus denoted by $(\Omega^1 A, d)$. $\Omega^1 A$ is defined as the kernel of the multiplication map, that is,
\[ \Omega^1 A := \Big\{ \sum_i a_i \otimes b_i \in A \otimes A \;\Big|\; \sum_i a_i b_i = 0 \Big\} \subset A \otimes A . \]
The derivative is defined by $d(a) = 1 \otimes a - a \otimes 1$. The $n$-forms are defined as $\Omega^n A = \Omega^1 A \otimes_A \Omega^1 A \otimes_A \cdots \otimes_A \Omega^1 A$ ($n$ copies of $\Omega^1 A$). $\Omega^n A$ can be identified with a subspace of $A \otimes A \otimes \cdots \otimes A$ ($n+1$ copies of $A$) consisting of all such elements that vanish upon multiplication of any two consecutive factors. With this identification, higher derivatives read
\[ d\Big( \sum_i a_i^0 \otimes a_i^1 \otimes \cdots \otimes a_i^n \Big) = \sum_{k=0}^{n+1} (-1)^k \sum_i a_i^0 \otimes a_i^1 \otimes \cdots \otimes a_i^{k-1} \otimes 1 \otimes a_i^k \otimes \cdots \otimes a_i^n . \]
The universal differential calculus satisfies the density condition.

This calculus captures very little (if any) of the geometry of the underlying algebra $A$, but it has the universality property, that is, any differential calculus on $A$ can be obtained as a quotient of $\Omega A$. In other words, any differential calculus $\Omega(A)$ is fully determined by a system of $A$-sub-bimodules $N_n \subset A^{\otimes (n+1)}$ (or homogeneous ideals in the algebra $\Omega A$), so that $\Omega^n(A) = \Omega^n A / N_n$. The differentials $d$ in $\Omega(A)$ are derived from universal differentials via the canonical projections $\pi_n : \Omega^n A \to \Omega^n(A)$.

Typical examples of algebras in quantum geometry are given by generators and relations, that is, $A = \mathbb{C}\langle x_1, \ldots, x_n \rangle / \langle R_i(x_1, \ldots, x_n) \rangle$, where $\mathbb{C}\langle x_1, \ldots, x_n \rangle$ is a free algebra on generators $x_k$ and $R_i(x_1, \ldots, x_n)$ are polynomials, so that $R_i(x_1, \ldots, x_n) = 0$ in $A$. Correspondingly, the modules $\Omega^n(A)$ are given by generators and relations. If $\Omega(A)$ satisfies the density condition, then the whole of $\Omega(A)$ must be generated by some 1-forms. The sub-bimodules $N_n$ contain relations satisfied by these generators.
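As a quick consistency check on the universal calculus defined above, one can verify directly that $d(a) = 1 \otimes a - a \otimes 1$ takes values in the kernel of multiplication, obeys the Leibniz rule, and that every element of $\Omega^1 A$ has the form $\sum_i a_i\, db_i$. The sketch below uses only the bimodule structure $a(b \otimes c)e = ab \otimes ce$ on $A \otimes A$; the symbol $\mu$ for the multiplication map is a name introduced here for the check and is not notation from the article.

```latex
% \mu : A \otimes A \to A denotes the multiplication map (name assumed for this check).
\begin{align*}
\mu(da) &= 1 \cdot a - a \cdot 1 = 0, \\
(da)b + a\,db &= (1 \otimes a - a \otimes 1)\,b + a\,(1 \otimes b - b \otimes 1) \\
  &= 1 \otimes ab - a \otimes b + a \otimes b - ab \otimes 1 = d(ab), \\
\sum_i a_i\, db_i &= \sum_i a_i \otimes b_i - \sum_i a_i b_i \otimes 1
  = \sum_i a_i \otimes b_i
  \quad \text{whenever } \sum_i a_i b_i = 0 .
\end{align*}
```

The first line shows that $da \in \Omega^1 A$, the second is the Leibniz rule, and the third is condition (2) in the definition of a first-order differential calculus.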
*-Calculi

If $A$ is a $*$-algebra, then a calculus is called a "$*$-calculus" provided $\Omega(A)$ is a graded $*$-algebra and $d(\rho^*) = (d\rho)^*$ for all $\rho \in \Omega(A)$.

Differential Structures on Quantum Groups (Hopf Algebras)

Hopf Algebra Preliminaries

From now on, $A$ is a Hopf algebra (quantum group), with a coproduct $\Delta : A \to A \otimes A$, counit $\varepsilon : A \to \mathbb{C}$, and antipode $S$. We use Sweedler's notation $\Delta(a) = \sum a_{(1)} \otimes a_{(2)}$. We also write $A^+ = \ker \varepsilon$ (the augmentation ideal).

For any algebra $P$, the convolution product of linear maps $f, g : A \to P$ is the linear map $f * g : A \to P$ defined by $(f * g)(a) = \sum f(a_{(1)})\, g(a_{(2)})$. A map $f : A \to P$ is said to be convolution invertible provided there exists $f^{-1} : A \to P$ such that $f * f^{-1} = f^{-1} * f = 1\varepsilon$.

An $A$-coaction on a comodule $V$, $\varrho : V \to V \otimes A$, is denoted by $\varrho(v) = \sum v_{(0)} \otimes v_{(1)}$. The right adjoint coaction in $A$ is the map
\[ \mathrm{Ad} : A \to A \otimes A, \qquad \mathrm{Ad}(a) = \sum a_{(2)} \otimes (S a_{(1)})\, a_{(3)} . \]
A subspace $B$ of $A$ is said to be "Ad-invariant" provided $\mathrm{Ad}(B) \subset B \otimes A$. For example, $A^+$ is such a space.

Covariant Differential Calculi

For Hopf algebras one can study calculi that are covariant with respect to $\Delta$. For $A = \mathbb{C}[G]$ (an algebra of functions on a Lie group), this corresponds to the covariance of a differential structure on $G$ with respect to regular representations.

A first-order differential calculus $\Omega^1(A)$ on a quantum group $A$ is said to be left-covariant if there exists a linear map $\Delta_L : \Omega^1(A) \to A \otimes \Omega^1(A)$ (called a left coaction) such that, for all $a, b \in A$,
\[ \Delta_L(a\, db) = \sum a_{(1)} b_{(1)} \otimes a_{(2)}\, db_{(2)} . \]
$\Omega^1(A)$ is called a right-covariant differential calculus if there exists a linear map $\Delta_R : \Omega^1(A) \to \Omega^1(A) \otimes A$ (called a right coaction) such that, for all $a, b \in A$,
\[ \Delta_R(a\, db) = \sum a_{(1)}\, db_{(1)} \otimes a_{(2)} b_{(2)} . \]
If $\Omega^1(A)$ is both left- and right-covariant, it is called a "bicovariant differential calculus." A bicovariant $\Omega^1(A)$ has a structure of a Hopf $A$-bimodule, that is, it is an $A$-bimodule and an $A$-bicomodule such that the coactions are compatible with actions.

The universal calculus on $A$ is bicovariant with coactions
\[ \Delta^U_R\Big( \sum_i a^i \otimes b^i \Big) = \sum_i a^i_{(1)} \otimes b^i_{(1)} \otimes a^i_{(2)} b^i_{(2)}, \qquad
   \Delta^U_L\Big( \sum_i a^i \otimes b^i \Big) = \sum_i a^i_{(1)} b^i_{(1)} \otimes a^i_{(2)} \otimes b^i_{(2)} . \]
Since $\Omega^1(A) = \Omega^1 A / N$ for an $A$-sub-bimodule $N \subset \Omega^1 A$, the calculus $\Omega^1(A)$ is left (resp. right) covariant if and only if $\Delta^U_L(N) \subset A \otimes N$ (resp. $\Delta^U_R(N) \subset N \otimes A$).

The Woronowicz Theorems

A form $\omega$ in a left-covariant differential calculus $\Omega^1(A)$ is said to be left-invariant provided $\Delta_L(\omega) = 1 \otimes \omega$. $\Omega^1(A)$ is a free $A$-module with basis given by left-invariant forms, that is, one can choose a set of left-invariant forms $\omega_i$ such that any 1-form $\rho$ can be uniquely written as a finite sum $\rho = \sum_i a_i \omega_i$, $a_i \in A$.

The first Woronowicz theorem states that there is a one-to-one correspondence between left-covariant calculi on $A$ and right $A$-ideals $Q \subset A^+$. The correspondence is provided by the map
\[ \kappa : A \otimes Q \to N, \qquad a \otimes q \mapsto \sum a\, S q_{(1)} \otimes q_{(2)} , \]
where $N$ is such that $\Omega^1(A) = (\Omega^1 A)/N$. The inverse of $\kappa$ reads $\kappa^{-1}\big( \sum_i a^i \otimes b^i \big) = \sum_i a^i b^i_{(1)} \otimes b^i_{(2)}$. The map $\kappa$ induces the map $\omega : A^+/Q \to \Omega^1(A)$ via $\omega([a]) = [\kappa(1 \otimes a)]$, where $[\cdot]$ denotes cosets in $A^+/Q$ and in $\Omega^1(A) = (\Omega^1 A)/N$. This establishes a one-to-one correspondence between the space $\Lambda^1 = A^+/Q$ and the space of left-invariant 1-forms in $\Omega^1(A)$. The dual space to $\Lambda^1$, that is, the space of linear functionals $\Lambda^1 \to \mathbb{C}$, is often termed a "quantum Lie algebra" or a "quantum tangent space" corresponding to a left-covariant calculus $\Omega^1(A)$. The dimension of $\Lambda^1$ is known as the dimension of $\Omega^1(A)$.

The definitions and analysis of right-covariant differential calculi are done in a symmetric manner. For a bicovariant calculus, a form $\omega$ that is both left- and right-invariant is termed a "bi-invariant" form. The second Woronowicz theorem states a one-to-one correspondence between bicovariant differential calculi and Ad-invariant $A$-ideals $Q \subset A^+$ (cf. the subsection "Hopf algebra preliminaries"). The correspondence is provided by the map $\kappa$ above. For the universal calculus, $Q$ is trivial, and hence $\Lambda^1 = A^+ = \ker(\varepsilon)$.

Higher-order Bicovariant Calculi

Given a first-order bicovariant calculus $\Omega^1(A)$, one constructs a braiding operator, known as the

"Woronowicz braiding" $\sigma : \Omega^1(A) \otimes_A \Omega^1(A) \to \Omega^1(A) \otimes_A \Omega^1(A)$, by setting $\sigma(a\omega \otimes_A \eta) = a\eta \otimes_A \omega$ for all $a \in A$ and any left-invariant $\omega$ and right-invariant $\eta$, and then extending it $A$-linearly to the whole of $\Omega^1(A) \otimes_A \Omega^1(A)$. This operator satisfies the braid relation
\[ (\mathrm{id} \otimes_A \sigma) \circ (\sigma \otimes_A \mathrm{id}) \circ (\mathrm{id} \otimes_A \sigma) = (\sigma \otimes_A \mathrm{id}) \circ (\mathrm{id} \otimes_A \sigma) \circ (\sigma \otimes_A \mathrm{id}), \]
and is invertible provided the antipode $S$ is invertible. The Woronowicz braiding is used to define symmetric forms as those invariant under $\sigma$. One then defines exterior 2-forms as elements of $\Omega^1(A) \otimes_A \Omega^1(A) / \ker(\mathrm{id} - \sigma)$, and introduces the wedge product. The wedge product is not in general anticommutative, but one does have $\omega \wedge \eta = -\eta \wedge \omega$ for bi-invariant $\omega$, $\eta$. This construction is extended to higher forms and leads to the definition of the exterior algebra $\Omega(A)$. To define exterior $n$-forms, one maps any permutation on $n$ elements to the corresponding element of the braid group generated by $\sigma$ and then takes the quotient of the $n$th tensor power of $\Omega^1(A)$ by all elements corresponding to even permutations.

The differential $d : A \to \Omega^1(A)$ is extended to an exterior differential in the whole of $\Omega(A)$ in the following way. First, $\Omega^1(A)$ is extended by a one-dimensional $A$-bimodule generated by a form $\theta$ that is required to be bi-invariant. The resulting extended bimodule (which, in general, is not a first-order differential calculus, as $\theta$ is not necessarily of the form $\sum_i a_i\, db_i$ for some $a_i, b_i \in A$) is then determined from the relation $da = \theta a - a\theta$ for all $a \in A$. Higher exterior derivative is then defined by $d\rho = \theta \wedge \rho - (-1)^n \rho \wedge \theta$, for any $\rho \in \Omega^n(A)$.

The algebra $\Omega(A)$ is a $\mathbb{Z}_2$-graded differential Hopf algebra, that is, it has a coproduct such that
\[ \Delta(\omega \wedge \eta) = \sum (-1)^{|\omega_{(2)}||\eta_{(1)}|}\, \omega_{(1)} \wedge \eta_{(1)} \otimes \omega_{(2)} \wedge \eta_{(2)} , \]
where $|\omega_{(2)}|$ etc. denotes the degree of a homogeneous component in the decomposition of $\Delta(\omega)$. Furthermore,
\[ \Delta(d\omega) = \sum d\omega_{(1)} \otimes \omega_{(2)} + (-1)^{|\omega_{(1)}|}\, \omega_{(1)} \otimes d\omega_{(2)} . \]
On the 1-forms this coproduct is simply the sum $\Delta_L + \Delta_R$.

Classification

There is no unique covariant differential calculus on $A$, so classification of covariant differential calculi is an important problem. For example, it is known that the quantum group $SU_q(2)$ admits a left-covariant three-dimensional calculus, but there is no three-dimensional bicovariant calculus. On the other hand, there are two four-dimensional bicovariant calculi on $SU_q(2)$. Differential calculi are classified for standard quantum groups such as $SL_q(N)$ or $Sp_q(N)$.

General classification results are based on the equivalence between the category of Hopf bimodules of a finite-dimensional Hopf algebra $A$ and that of Yetter–Drinfeld or crossed modules of $A$. These are the modules of the Drinfeld double of $A$. As a result, in the case of a finite-dimensional factorizable coquasitriangular Hopf algebra $A$ with a dual Hopf algebra $H$, the bicovariant $\Omega^1(A)$ are in one-to-one correspondence with two-sided ideals in $H^+$. If, in addition, $A$ is semisimple, then (coirreducible) calculi are in one-to-one correspondence with nontrivial irreducible representations of $H$. This can be extended to infinite-dimensional algebras, provided one works over a field of formal power series in the deformation parameter.

Quantum Group Principal Bundles

Quantum Principal Bundles

In classical geometry, a (topological) principal bundle is a locally compact Hausdorff space with a (continuous) free and proper action of a locally compact group (e.g., a Lie group). In terms of algebras of functions this gives rise to the following structure. $A$ is a Hopf algebra (the model is functions on a group $G$), and $P$ is a right $A$-comodule algebra with a coaction $\Delta_P : P \to P \otimes A$ (the model is functions on a total space $X$). Let $B = \{ b \in P \mid \Delta_P(b) = b \otimes 1 \}$ be the coinvariant subalgebra (the model is functions on a base manifold $M = X/G$). Fix a bicovariant calculus $\Omega^1(A)$, with the corresponding $Q$ and $\Lambda^1 = A^+/Q$ as in the subsection "The Woronowicz theorems." Take a differential calculus $\Omega^1(P) = \Omega^1 P / N_P$ such that:

1. $\Delta_{\Omega^1 P}(N_P) \subset N_P \otimes A$, where, for all $\sum_i p^i \otimes q^i \in \Omega^1 P$,
\[ \Delta_{\Omega^1 P}\Big( \sum_i p^i \otimes q^i \Big) = \sum_i p^i_{(0)} \otimes q^i_{(0)} \otimes p^i_{(1)} q^i_{(1)} \in \Omega^1 P \otimes A ; \]
2. $\tilde\chi(N_P) \subset P \otimes Q$, where
\[ \tilde\chi : \Omega^1 P \to P \otimes A^+, \qquad \sum_i p_i \otimes q_i \mapsto \sum_i p_i \Delta_P(q_i) = \sum_i p_i\, q_{i(0)} \otimes q_{i(1)} ; \]
3. $N_B = N_P \cap \Omega^1 B$ gives rise to a differential structure $\Omega^1(B) = \Omega^1 B / N_B$ on $B$.

Condition (1) ensures that $\Delta_{\Omega^1 P}$ descends to a coaction $\Delta_{\Omega^1(P)} : \Omega^1(P) \to \Omega^1(P) \otimes A$, while (2) allows for defining a map
\[ \mathrm{ver} : \Omega^1(P) \to P \otimes \Lambda^1, \qquad \mathrm{ver}([\omega]) = [\tilde\chi(\omega)] . \]
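Two small steps are implicit in the definition of the map ver, and they can be spelled out as follows. The sketch assumes only the counit property of the coaction, $(\mathrm{id} \otimes \varepsilon) \circ \Delta_P = \mathrm{id}$, and the fact that elements of $\Omega^1 P$ lie in the kernel of multiplication.

```latex
% For \sum_i p_i \otimes q_i \in \Omega^1 P, i.e. \sum_i p_i q_i = 0:
\begin{align*}
(\mathrm{id} \otimes \varepsilon)\,\tilde\chi\Big( \sum_i p_i \otimes q_i \Big)
  &= \sum_i p_i\, q_{i(0)}\, \varepsilon(q_{i(1)})
   = \sum_i p_i q_i = 0 ,
\end{align*}
% hence \tilde\chi(\Omega^1 P) \subset P \otimes \ker\varepsilon = P \otimes A^+ .
```

So $\tilde\chi$ indeed takes values in $P \otimes A^+$, and composing with the quotient map $A^+ \to A^+/Q = \Lambda^1$ gives a class $[\tilde\chi(\omega)] \in P \otimes \Lambda^1$. Condition (2) is exactly what is needed for this class to depend only on the coset $[\omega] \in \Omega^1 P / N_P$ and not on the chosen representative $\omega$.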

Since $B$ is a subalgebra of $P$, the $P$-bimodule
\[ P\Omega^1(B)P := \Big\{ \sum_i p_i (db_i) q_i \;\Big|\; p_i, q_i \in P,\ b_i \in B \Big\} \]
is a sub-bimodule of $\Omega^1(P)$, known as horizontal forms. $P$ is called a "quantum principal bundle" over $B$ with quantum structure group $A$ and calculi $\Omega^1(A)$ and $\Omega^1(P)$ provided the following sequence
\[ 0 \to P\Omega^1(B)P \to \Omega^1(P) \xrightarrow{\ \mathrm{ver}\ } P \otimes \Lambda^1 \to 0 \]
is exact. This definition reflects the geometric content of principal bundles, but is not restricted to any specific differential calculus. The surjectivity of ver corresponds to the freeness of the (co)action, while the condition $\ker(\mathrm{ver}) = P\Omega^1(B)P$ corresponds to identification of vertical vector fields as those that are annihilated by horizontal forms.

The Universal Calculus Case

In the universal calculus case, both $N_P$ and $Q$ in the previous subsection are trivial, and $\mathrm{ver} = \tilde\chi$. Universal horizontal forms $P(\Omega^1 B)P$ coincide with the kernel of the canonical projection $P \otimes P \to P \otimes_B P$. The exactness of the sequence in the last subsection is equivalent to the requirement that the map
\[ \mathrm{can} : P \otimes_B P \to P \otimes A, \qquad p \otimes_B q \mapsto p\,\Delta_P(q) = \sum p\, q_{(0)} \otimes q_{(1)} \]
be bijective. In algebra, such an inclusion of algebras $B \subset P$ is known as a Hopf–Galois extension. Thus, a geometric notion of a quantum principal bundle with the universal calculus is the same as the algebraic notion of a Hopf–Galois extension.

If (2) in the previous subsection is replaced by stronger conditions $\tilde\chi(N_P) = P \otimes Q$ and $(N_P \cap \ker \tilde\chi) \subset P(\Omega^1 B)P$, then exactness of the sequence in the previous subsection is equivalent to the bijectivity of "can." Thus, although defined in a purely algebraic way, the notion of a Hopf–Galois extension carries deep geometric meaning. It therefore makes sense to consider primarily Hopf–Galois extensions and then specify differential structure in such a way that this stronger version of (2) is satisfied. Henceforth, unless specified otherwise, a quantum principal bundle is taken with the universal differential calculus.

Quantum Homogeneous Bundles

Suppose that $P$ is a Hopf algebra, and that there is a Hopf algebra surjection $\pi : P \to A$. This induces a coaction of $A$ on $P$ via $\Delta_P = (\mathrm{id} \otimes \pi) \circ \Delta$, where now $\Delta$ is a coproduct in $P$. $P$ is a quantum principal $A$-bundle over the coinvariants $B$, provided $\ker \pi \subset B^+ P$, where $B^+ = B \cap P^+$. $B$ is a left quantum homogeneous space in the sense that $\Delta(B) \subset P \otimes B$, and $P$ is known as a quantum homogeneous bundle.

An example of this is the standard quantum 2-sphere, a quantum homogeneous space of $SU_q(2)$ (see the subsection "The Dirac q-monopole"). This construction reflects the classical construction of a principal bundle over a homogeneous space, since every homogeneous space of a group $G$ can be identified with a quotient $G/H$, where $H \subset G$ is a subgroup. Not every quantum homogeneous space can be obtained in this way (e.g., nonstandard quantum 2-spheres), as quantum groups $P$ do not have sufficiently many quantum subgroups $A$ (in the sense of Hopf algebra projections $\pi : P \to A$). To study gauge theory on general quantum homogeneous spaces, a more general notion of a bundle needs to be developed (see the subsection "Generalizations of quantum principal bundles").

A general differential calculus on a quantum homogeneous bundle is specified by choosing a left-covariant calculus on $P$ with an ideal $Q_P \subset P^+$ such that $(\mathrm{id} \otimes \pi) \circ \mathrm{Ad}(Q_P) \subset Q_P \otimes A$. A bicovariant calculus on $A$ is then given by $Q_A = \pi(Q_P)$.

Quantum Trivial Bundles

A quantum principal bundle (with the universal differential calculus) is said to be "trivial" or "cleft" provided there exists a linear map $\Phi : A \to P$ such that

1. $\Phi(1) = 1$ (unitality);
2. $\Delta_P \circ \Phi = (\Phi \otimes \mathrm{id}) \circ \Delta$ (colinearity or covariance); and
3. $\Phi$ is convolution invertible (cf. the subsection "Hopf algebra preliminaries").

$\Phi$ is called a trivialization. In this case, $P$ is isomorphic to $B \otimes A$ as a left $B$-module and right $A$-comodule via the map $B \otimes A \to P$, $b \otimes a \mapsto b\,\Phi(a)$. In particular, an $A$-covariant (i.e., colinear) algebra map $j : A \to P$ is a trivialization (the convolution inverse of $j$ is $j \circ S$).

Based on trivial bundles, locally trivial bundles can be constructed by choosing a compatible covering of $B$ (in terms of ideals).

At this point, the reader should be warned that the notion of a trivial quantum principal bundle includes bundles which are not trivial classically (i.e., do not correspond to functions on the Cartesian product of spaces). As an example, consider the Möbius strip viewed as a $\mathbb{Z}_2$-principal bundle over the circle $S^1$. Obviously, this is not a trivial bundle (the Möbius strip is not isomorphic to

$S^1 \times \mathbb{Z}_2$). It can be shown, however, that the quantum principal bundle corresponding to the Möbius strip has a trivialization $\Phi$ in the above sense.

Generalizations of Quantum Principal Bundles

In the case of the majority of quantum homogeneous spaces, the map $\pi$ in the subsection "Quantum homogeneous bundles" is a coalgebra and right $P$-module map, but not an algebra map. Thus, the induced coaction is not an algebra map either. To cover examples like these, one needs to introduce a generalization of quantum principal bundles. Consider an algebra $P$ that is also a right comodule of a coalgebra $C$ with coaction $\Delta_P$. Define
\[ B := \Big\{ b \in P \;\Big|\; \forall p \in P,\ \Delta_P(bp) = b\,\Delta_P(p) = \sum b\, p_{(0)} \otimes p_{(1)} \Big\} . \]
$B$ is a subalgebra of $P$. $P$ is a principal coalgebra-bundle over $B$, or $B \subset P$ is a coalgebra-Galois extension, provided the map
\[ \mathrm{can} : P \otimes_B P \to P \otimes C, \qquad p \otimes_B q \mapsto p\,\Delta_P(q) = \sum p\, q_{(0)} \otimes q_{(1)} \]
is bijective. This purely algebraic requirement induces a rich symmetry structure on $P$, given in terms of entwining, which allows one to develop various differential geometric notions such as those discussed in the next section. The lack of space does not permit us to describe this theory here.

Connections, Gauge Transformations, Matter Fields

Connections and Connection Forms

A "connection" in a quantum principal bundle with calculi $\Omega^1(P)$, $\Omega^1(A)$ is a left $P$-linear map $\Pi : \Omega^1(P) \to \Omega^1(P)$ such that:

1. $\Pi \circ \Pi = \Pi$ ($\Pi$ is a projection);
2. $\ker \Pi = P\Omega^1(B)P$; and
3. $\Delta_{\Omega^1(P)} \circ \Pi = (\Pi \otimes \mathrm{id}) \circ \Delta_{\Omega^1(P)}$ (colinearity or covariance).

The exact sequence in the subsection "Quantum principal bundles" implies that $\Pi$ is a left $P$-linear projection if and only if there exists a left $P$-linear map $\sigma : P \otimes \Lambda^1 \to \Omega^1(P)$ such that $\mathrm{ver} \circ \sigma = \mathrm{id}$. Since $\sigma$ is left $P$-linear, it is fully specified by its action on $\Lambda^1$. This leads to the equivalent definition of a connection as a connection form or a gauge field, that is, a map $\omega : \Lambda^1 \to \Omega^1(P)$ such that:

1. for all $\lambda \in \Lambda^1$, $\mathrm{ver}(\omega(\lambda)) = 1 \otimes \lambda$; and
2. $\Delta_{\Omega^1(P)} \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{Ad}_{\Lambda^1}$ (Ad-covariance), where $\mathrm{Ad}_{\Lambda^1}$ is a projection of the adjoint coaction to $\Lambda^1$, that is, $\mathrm{Ad}_{\Lambda^1}([a]) = [\mathrm{Ad}(a)]$ (well defined, because $Q$ is Ad-invariant for a bicovariant calculus; see the subsection "The Woronowicz theorems").

The correspondence between connections and connection 1-forms is given by the formula
\[ \Pi(p\, dq) = \sum p\, q_{(0)}\, \omega([q_{(1)}]) . \]
In the universal differential calculus case, $\Lambda^1 = A^+$, hence $\omega$ can be viewed as a map $\omega : A \to \Omega^1 P$ such that $\omega(1) = 0$. The map $F_\omega : A \to \Omega^2 P$, given by $F_\omega = d\omega + \omega * \omega$, is called a "curvature" of $\omega$. The curvature satisfies the Bianchi identity, $dF_\omega = F_\omega * \omega - \omega * F_\omega$.

In the case of a trivial bundle with trivialization $\Phi$ and universal calculus, any linear map $\beta : A \to \Omega^1 B$ such that $\beta(1) = 0$ defines a connection 1-form
\[ \omega = \Phi^{-1} * d\Phi + \Phi^{-1} * \beta * \Phi . \]
The corresponding curvature is $F_\omega = \Phi^{-1} * F_\beta * \Phi$, where $F_\beta = d\beta + \beta * \beta$.

In the case of a quantum homogeneous bundle with calculus determined by $Q_P \subset P^+$ and $Q_A = \pi(Q_P)$ (cf. the subsection "Quantum homogeneous bundles"), a canonical connection form can be assigned to any algebra map $i : A \to P$ such that

1. $\pi \circ i = \mathrm{id}$ ($i$ splits $\pi$);
2. $\varepsilon_P \circ i = \varepsilon_A$ (co-unitality);
3. $(\mathrm{id} \otimes \pi) \circ \mathrm{Ad}_P \circ i = (i \otimes \mathrm{id}) \circ \mathrm{Ad}_A$ (Ad-covariance); and
4. $i(Q_A) \subset Q_P$ (differentiability).

Explicitly, $\omega([a]) = \sum \big( S\, i(a)_{(1)} \big)\, d\, i(a)_{(2)}$.

Covariant Derivative: Strong Connections

A covariant derivative associated to a connection $\Pi$ is a map $D : P \to P\Omega^1(B)P$, $p \mapsto dp - \Pi(dp)$. A covariant derivative maps elements of $P$ into horizontal forms, since $\ker \Pi = P\Omega^1(B)P$, and satisfies the Leibniz rule $D(bp) = (db)p + b\,Dp$, for all $b \in B$, $p \in P$.

A connection is "strong" provided $D(p) \in \Omega^1(B)P$. A covariant derivative of a strong connection is a connection on the module $P$ in the sense of Connes. Furthermore, in the universal calculus case, and when $A$ has invertible antipode, the existence of strong connections leads to a rich gauge theory of associated bundles (cf. the subsection "Associated bundles: matter fields"). A connection in a trivial bundle

described in the subsection "Connections and connection forms" is strong (and every strong connection in a trivial bundle is of this form). Assuming invertibility of the antipode in $A$, a canonical connection in a quantum homogeneous bundle described in that subsection is strong provided Ad-covariance (3) is replaced by conditions $(\mathrm{id} \otimes \pi) \circ \Delta \circ i = (i \otimes \mathrm{id}) \circ \Delta_A$ (right covariance) and $(\pi \otimes \mathrm{id}) \circ \Delta \circ i = (\mathrm{id} \otimes i) \circ \Delta_A$ (left covariance), where $\Delta$ is a coproduct in $P$, and $\Delta_A$ is a coproduct in $A$.

In the universal calculus case, the map $D$ can be extended to a map $D : \Omega^1 P \to \Omega^2 P$ via the formula $D(\rho) = d\rho + \sum \rho_{(0)}\, \omega(\rho_{(1)})$. Then $D \circ D(p) = \sum p_{(0)} F_\omega(p_{(1)})$, where $F_\omega$ is the curvature of $\omega$ (cf. the subsection "Connections and connection forms"). This explains the relationship between a curvature understood as the square of a covariant derivative and $F_\omega$.

Bundle Automorphisms and Gauge Transformations

A quantum bundle automorphism is a left $B$-linear right $A$-covariant (i.e., colinear) automorphism $F : P \to P$ such that $F(1) = 1$. Bundle automorphisms form a group with operation $FG = G \circ F$. This group is isomorphic to the group $\mathcal{G}(P)$ of gauge transformations, that is, maps $f : A \to P$ that satisfy the following conditions:

1. $f(1_A) = 1_P$ (unitality);
2. $\Delta_P \circ f = (f \otimes \mathrm{id}) \circ \mathrm{Ad}$ (Ad-covariance); and
3. $f$ is convolution invertible (cf. the subsection "Hopf algebra preliminaries").

The product in $\mathcal{G}(P)$ is the convolution product (cf. the subsection "Hopf algebra preliminaries"). The group of gauge transformations acts on the space of (strong) connection forms $\omega$ via the formula
\[ f \triangleright \omega = f * \omega * f^{-1} + f * df^{-1}, \qquad \forall f \in \mathcal{G}(P) . \]
This resembles the gauge transformation law of a gauge field in the standard gauge theory. The curvature transforms covariantly as $F_{f \triangleright \omega} = f * F_\omega * f^{-1}$.

In the case of a trivial principal bundle, gauge transformations correspond to a change of the trivialization and can be identified with convolution-invertible maps $\gamma : A \to B$ such that $\gamma(1) = 1$. A map $\beta : A \to \Omega^1 B$ that induces a connection as in the subsection "Connections and connection forms" is transformed to $\gamma * \beta * \gamma^{-1} + \gamma * d\gamma^{-1}$, and the curvature $F_\beta \mapsto \gamma * F_\beta * \gamma^{-1}$.

Associated Bundles: Matter Fields

Given a right $A$-comodule (corepresentation) $\varrho : V \to V \otimes A$, one defines a quantum vector bundle associated to $P$ as
\[ E := \Big\{ \sum_i v^i \otimes p^i \in V \otimes P \;\Big|\; \sum_i v^i_{(0)} \otimes p^i_{(0)} \otimes v^i_{(1)} p^i_{(1)} = \sum_i v^i \otimes p^i \otimes 1 \Big\} \subset V \otimes P . \]
$E$ is a right $B$-module with product $\big( \sum_i v^i \otimes p^i \big) b = \sum_i v^i \otimes p^i b$. A right $B$-linear map $s : E \to B$ is called a section of $E$. The space of sections $\Gamma(E)$ is a left $B$-module via $(bs)(p) = b\,s(p)$.

The theory of associated bundles is particularly rich when $A$ has a bijective antipode and $P$ has a strong connection form $\omega$. In this case, $\Gamma(E)$ is isomorphic to the left $B$-module $\Gamma_\varrho$ of maps $\phi : V \to P$ such that $\Delta_P \circ \phi = (\phi \otimes \mathrm{id}) \circ \varrho$. If $V$ is finite dimensional, then $\Gamma_\varrho$ is a finite projective $B$-module, that is, it is a module of sections of a noncommutative vector bundle in the sense of Connes. The strong connection induces a map $\nabla : \Gamma_\varrho \to \Omega^1 B \otimes_B \Gamma_\varrho$, given by $\nabla(\phi)(v) = d\phi(v) + \sum \phi(v_{(0)})\, \omega(v_{(1)})$. $\nabla$ is a connection in the sense of Connes (in a projective left $B$-module), that is, for all $b \in B$, $\phi \in \Gamma_\varrho$, $\nabla(b\phi) = db \otimes_B \phi + b\nabla(\phi)$.

In the case of a trivial bundle, $\Gamma_\varrho$ can be identified with the space of linear maps $V \to B$. Thus, sections of an associated bundle correspond to pullbacks of matter fields, as in the classical local gauge theory matter fields are defined as functions on a spacetime with values in a representation (vector) space of the gauge group.

The Dirac q-Monopole

This is an example of a strong connection in a quantum homogeneous bundle (cf. the subsection "Quantum homogeneous bundles"). $P = SU_q(2)$ is a matrix Hopf $*$-algebra with matrix of generators
\[ \begin{pmatrix} a & -q c^* \\ c & a^* \end{pmatrix} \]
and relations
\[ ac = q\,ca, \qquad ac^* = q\,c^* a, \qquad cc^* = c^* c, \qquad a^* a + c^* c = 1, \qquad aa^* + q^2 cc^* = 1, \]
where $q$ is a real parameter. $A = \mathbb{C}[U(1)]$ is a Hopf $*$-algebra generated by a unitary and group-like element $u$ (i.e., $uu^* = u^* u = 1$, $\Delta(u) = u \otimes u$). The $*$-projection $\pi : P \to A$ is defined by $\pi(a) = u$. The coinvariant subalgebra $B$ is generated by $x = cc^*$, $z = ac^*$, $z^* = ca^*$. The elements $x$ and $z$ satisfy relations
\[ x^* = x, \qquad zx = q^2 xz, \qquad zz^* = q^2 x (1 - q^2 x), \qquad z^* z = x(1 - x) . \]
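To see why these relations describe a deformed sphere, one can set $q = 1$. The following is a heuristic sketch: in this limit the generators commute, so $x$ may be treated as a real coordinate and $z = s + it$ as a complex one, where the real coordinates $s$ and $t$ are names introduced only for this check.

```latex
% Commutative limit q = 1, with x real and z = s + i t (s, t real; names assumed here).
\begin{align*}
& zx = xz, \qquad zz^* = z^*z = x(1 - x), \\
& s^2 + t^2 = x - x^2
  \;\Longleftrightarrow\;
  s^2 + t^2 + \big( x - \tfrac{1}{2} \big)^2 = \tfrac{1}{4} .
\end{align*}
```

This is the equation of a 2-sphere of radius $1/2$ centred at $(s, t, x) = (0, 0, 1/2)$, so for $q \neq 1$ the algebra generated by $x$, $z$, $z^*$ can be read as a noncommutative deformation of the polynomial functions on that sphere.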

Thus, $B$ is the algebra of functions on the standard quantum 2-sphere. A strong connection is obtained from a bicovariant $*$-map $i : A \to P$ given by $i(u^n) = a^n$ (cf. the subsections "Quantum homogeneous bundles," "Connections and connection forms," and "Covariant derivative: strong connections"). Explicitly, the connection form reads
\[ \omega(u^n) = \sum_{k=0}^{n} \binom{n}{k}_{q^{-2}} c^{*k} a^{*\,n-k}\, d\big( a^{n-k} c^k \big), \]
\[ \omega(u^{*n}) = \sum_{k=0}^{n} q^{2k} \binom{n}{k}_{q^{-2}} a^{n-k} c^k\, d\big( c^{*k} a^{*\,n-k} \big), \]
where the deformed binomial coefficients are defined for any number $x$ by
\[ \binom{n}{k}_x = \frac{(x^n - 1)(x^{n-1} - 1) \cdots (x^{k+1} - 1)}{(x^{n-k} - 1)(x^{n-k-1} - 1) \cdots (x - 1)} . \]
There is a family $V_n$, $n \in \mathbb{Z}$, of one-dimensional corepresentations of $\mathbb{C}[U(1)]$ with $V_n = \mathbb{C}$ and $\varrho_n(1) = 1 \otimes u^n$ for $n \geq 0$, and $\varrho_n(1) = 1 \otimes (u^*)^{-n}$ for $n < 0$. This leads to the family of finite projective modules $\Gamma_n = \Gamma_{\varrho_n}$ as described in the subsection "Associated bundles: matter fields." The Hermitian projectors $e(n)$ of these modules come out as, for $n > 0$,
\[ e(n)_{ij} = \sqrt{ \binom{n}{i}_{q^{-2}} \binom{n}{j}_{q^{-2}} }\; a^{n-i} c^i c^{*j} a^{*\,n-j}, \qquad i, j = 0, 1, \ldots, n, \]
\[ e(-n)_{ij} = q^{i+j} \sqrt{ \binom{n}{i}_{q^{-2}} \binom{n}{j}_{q^{-2}} }\; c^{*i} a^{*\,n-i} a^{n-j} c^j, \qquad i, j = 0, 1, \ldots, n . \]
The $e(n)$ describe q-monopoles of magnetic charge $n$. For example, the charge-1 projector explicitly reads
\[ \begin{pmatrix} 1 - x & z^* \\ z & q^2 x \end{pmatrix} \]
and reduces to the usual charge-1 Dirac monopole projector when $q = 1$. The covariant derivatives $\nabla$ are Levi-Civita or Grassmann connections in modules $\Gamma_n$ corresponding to projectors $e(n)$.

The q-Instanton

This is an example of a coalgebra bundle and the associated vector bundle, which is a deformation of an instanton (with instanton number 1). $P = \mathbb{C}[S^7_q]$ is the $*$-algebra of polynomial functions on the quantum 7-sphere. As a $*$-algebra it is defined by generators $z_1, z_2, z_3, z_4$ and relations
\[ z_i z_j = q\, z_j z_i \quad (\text{for } i < j), \qquad z_j^* z_i = q\, z_i z_j^* \quad (\text{for } i \neq j), \]
\[ z_k^* z_k = z_k z_k^* + (1 - q^2) \sum_{j=1}^{k-1} z_j z_j^*, \qquad \sum_{k=1}^{4} z_k z_k^* = 1, \]
where $q \in \mathbb{R}$. The coaction of the $*$-Hopf algebra $A = SU_q(2)$ (cf. the previous subsection) on $P$ is constructed as follows. Start with the quantum group $U_q(4)$, generated by a matrix $t = (t_{ij})_{i,j=1}^{4}$, and view $\mathbb{C}[S^7_q]$ as a right quantum homogeneous space of $U_q(4)$ generated by the bottom row in $t$. Thus, there is a right coaction of $U_q(4)$ on $\mathbb{C}[S^7_q]$ obtained by the restriction of the coproduct in $U_q(4)$. Next, project $U_q(4)$ to $SU_q(2)$ by a suitable coideal and a right ideal in $U_q(4)$. The corresponding canonical surjection $r : U_q(4) \to SU_q(2)$ is a coalgebra map, characterized as a right $U_q(4)$-module map by $r(t_{11} t_{22} - q\, t_{12} t_{21}) = 1$ and
\[ r(t) = \begin{pmatrix} u & 0 \\ 0 & \tilde{u} \end{pmatrix}, \qquad \tilde{u} = \begin{pmatrix} u_{22} & -u_{21} \\ -u_{12} & u_{11} \end{pmatrix}, \]
where $u = (u_{ij})_{i,j=1}^{2}$ is the matrix of generators of $SU_q(2)$ (cf. the previous subsection). When applied to the coaction of $U_q(4)$ on $\mathbb{C}[S^7_q]$, $r$ induces the required coaction $\Delta_P : \mathbb{C}[S^7_q] \to \mathbb{C}[S^7_q] \otimes SU_q(2)$. Explicitly, the coaction comes out on generators as $\Delta_P(z_j) = \sum_i z_i \otimes r(t_{ij})$. The coaction $\Delta_P$ is not an algebra map. The coinvariant subalgebra $B$ is a $*$-algebra generated by
\[ a = z_1 z_4^* - z_2 z_3^*, \qquad b = z_1 z_3 + q^{-1} z_2 z_4, \qquad R = z_1 z_1^* + z_2 z_2^* . \]
The elements $a$, $a^*$, $b$, $b^*$, $R$ satisfy the following relations:
\[ Ra = q^{-2} aR, \qquad Rb = q^2 bR, \qquad ab = q^3 ba, \qquad ab^* = q^{-1} b^* a, \]
\[ aa^* + q^2 bb^* = R(1 - q^2 R), \qquad aa^* = q^2 a^* a + (1 - q^2) R^2, \qquad b^* b = q^4 bb^* + (1 - q^2) R . \]
Hence $B$ can be understood as a deformation of the algebra of functions on the 4-sphere and is denoted by $\mathbb{C}[\Sigma^4_q]$. One can show that the map "can" in the subsection "Generalizations of quantum principal bundles" is bijective, hence there is an $SU_q(2)$-coalgebra principal bundle with the total space the quantum 7-sphere $\mathbb{C}[S^7_q]$ and the base space the

quantum 4-sphere $\mathbb{C}[\Sigma^4_q]$. By abstract arguments that involve cosemisimplicity of $SU_q(2)$, one can prove that there exists a strong connection in this bundle; this is the q-deformed instanton field. At the time of writing this article, however, the explicit form of this connection is not known.

On the other hand, following the classical construction of an instanton, one can take the fundamental two-dimensional corepresentation $V = \mathbb{C}^2$ of $SU_q(2)$ and explicitly construct the q-instanton projection with instanton number 1. Writing $e_1$, $e_2$ for the basis of $V$, the coaction $\varrho : V \to V \otimes SU_q(2)$ is given by
\[ \varrho(e_j) = \sum_i e_i \otimes u_{ij} . \]
The associated bundle (cf. the subsection "Associated bundles: matter fields") is a finite projective left module over $\mathbb{C}[\Sigma^4_q]$. The corresponding q-instanton projector comes out as
\[ \begin{pmatrix}
q^2 R & 0 & q a & -q^2 b \\
0 & q^2 R & q b^* & q^3 a^* \\
q a^* & q b & 1 - R & 0 \\
-q^2 b^* & q^3 a & 0 & 1 - q^4 R
\end{pmatrix} . \]

See also: Bicrossproduct Hopf Algebras and Noncommutative Spacetime; Hopf Algebras and q-Deformation Quantum Groups; Noncommutative Tori, Yang–Mills, and String Theory.

Further Reading

Bonechi F, Ciccoli N, and Tarlini M (2003) Noncommutative instantons and the 4-sphere from quantum groups. Communications in Mathematical Physics 226: 419–432.
Bonechi F, Ciccoli N, Dąbrowski L, and Tarlini M (2004) Bijectivity of the canonical map for the noncommutative instanton bundle. Journal of Geometry and Physics 51: 71–81.
Brzeziński T and Majid S (1993) Quantum group gauge theory on quantum spaces. Communications in Mathematical Physics 157: 591–638.
Brzeziński T and Majid S (1995) Quantum group gauge theory on quantum spaces – Erratum. Communications in Mathematical Physics 167: 235.
Brzeziński T and Majid S (2000) Quantum geometry of algebra factorisations and coalgebra bundles. Communications in Mathematical Physics 213: 491–521.
Calow D and Matthes R (2002) Connections on locally trivial quantum principal fibre bundles. Journal of Geometry and Physics 41: 114–165.
Dąbrowski L, Grosse H, and Hajac PM (2001) Strong connections and Chern–Connes pairing in the Hopf–Galois theory. Communications in Mathematical Physics 220: 301–331.
Hajac PM and Majid S (1999) Projective module description of the q-monopole. Communications in Mathematical Physics 206: 247–264.
Heckenberger I and Schmüdgen K (1998) Classification of bicovariant differential calculi on the quantum groups $SL_q(n+1)$ and $Sp_q(2n)$. Journal für die Reine und Angewandte Mathematik 502: 141–162.
Klimyk A and Schmüdgen K (1997) Quantum Groups and Their Representations. Berlin: Springer.
Majid S (1998) Classification of bicovariant differential calculi. Journal of Geometry and Physics 25: 119–140.
Majid S (1999) Quantum and braided Riemannian geometry. Journal of Geometry and Physics 30: 113–146.
Majid S (2000) Foundations of Quantum Group Theory, 1st pbk. edn. Cambridge: Cambridge University Press.
Wess J and Zumino B (1990) Covariant differential calculus on the quantum hyperplane. Nuclear Physics B. Proceedings Supplement 18B: 302–312.
Woronowicz SL (1989) Differential calculus on compact matrix pseudogroups (quantum groups). Communications in Mathematical Physics 122: 125–170.

Quantum Hall Effect


K Hannabuss, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

When a current flows in a thin sample with a transverse magnetic field B, the Lorentz force deflects the trajectories of the charge carriers, producing an excess charge on one side and a charge deficiency on the other, and creating a potential difference across the conductor perpendicular to both the direct current and the magnetic field. This is known as the Hall effect, in honour of E H Hall, who, inspired by a remark of Maxwell, first demonstrated it in thin samples of gold foil in October 1879 (Hall's subsequent measurements of the potential difference showed that the carriers could be positively or negatively charged for different materials). A schematic diagram of Hall's experiment and the lateral separation of charges is shown in Figure 1.

Equilibrium is reached when the magnetic force balances that from the potential difference E due to the displaced charge. When the charge carriers are electrons, with the electron density n, and the electron current J, this gives neE = JB. Comparison with Ohm's law, $J = \sigma E$, gives the conductance (the reciprocal of resistance) as $\sigma = ne/B$. More generally, considering the currents and fields as

vectors, $\sigma$ is represented by a matrix. Rescaling by the sample thickness, the diagonal components of $\sigma$ give the direct conductivity $\sigma_\parallel$ and its off-diagonal elements give the Hall conductivity: $\sigma_H = \sigma_{21}$. (For systems symmetric under 90° rotations, $\sigma_{11} = \sigma_{22}$ and $\sigma_{12} = -\sigma_{21}$.) In quantum theory, one usually works in terms of the filling fraction $\nu = nh/eB$ and then $\sigma_H = \nu e^2/h$.

Figure 1 Schematic diagram of charge separation in Hall's experiment.

In 1980 von Klitzing, Dorda, and Pepper discovered that at very low temperatures in very high magnetic fields, the Hall conductivity $\sigma_H$ is quantized as integral multiples of $e^2/h$, a fact known as the integer quantum Hall effect (IQHE). The integer multiples were accurate to 1 part in $10^8$, and the effect was exceptionally robust against changes in the geometry of the samples and in the experimental parameters. Indeed, the unprecedented accuracy of the effect led to its adoption as the international standard for resistance in 1990.

More precisely, the Hall conductivity was no longer proportional to the filling fraction $\nu$, but the graph of $\sigma_H$ against $\nu$ displayed a sequence of jumps, as shown in Figure 2. In this figure, the conductivity has plateaus at the integer multiples of $e^2/h$, and jumps between them within fairly small ranges of the filling fraction. Moreover, the direct conductivity vanishes where the Hall conductivity takes its constant integral values.

Figure 2 Schematic diagram of the Hall and direct conductivities plotted against the filling factor $\nu$. (Upper plot: $h\sigma_H/e^2$ against $\nu$; lower plot: the direct conductivity $\sigma$ against $\nu$, to a different scale.)

These results raise numerous questions.

1. Why does the conductivity take such precise integer values, and why are they so stable under changes of the geometry and physical parameters?
2. Why does the direct conductivity vanish, except in regions where the Hall conductivity jumps between integer values, and how are such jumps possible?

Moreover, any theory must also explain why these features are not present under the more normal conditions of the classical Hall effect. The following features seem to play a role, and in the case of the first three, even in the classical effect.

1. As Hall discovered, the samples must be very thin to exhibit even the classical effect. (Nowadays they are often a surface layer between two semiconductors.)
2. The samples are macroscopic and much larger than the quantum wavelengths appearing in the problem.
3. The electric field is small enough that nonlinear effects are negligible.
4. The quantum effect appears only at a very low temperature.

The first of these suggests that we should idealize to the case where the motion of the charge carriers is restricted to a two-dimensional region, and the second that we may work in the thermodynamic limit where the conducting surface is the whole of $\mathbb{R}^2$. The third and fourth ensure both that the linear Ohm's law should be adequate, and also that it should be enough to consider the limiting cases of very weak electric fields and zero temperature. Multiple limits of this sort raise delicate mathematical issues. Indeed, many plausible models of the effect turn out, on careful analysis, to predict vanishing Hall conductivity.

A theoretical explanation of the quantization of the conductivity was soon suggested by Laughlin. Exploiting the apparent independence of sample geometry, he considered a cylindrical conductor where quantization followed on consideration of

the flux tubes threading it. Laughlin's choice of a particular configuration precluded investigation of the influence of changing geometry. This was soon provided by Thouless, Kohmoto, Nightingale, and de Nijs, who argued (from a lattice version of the problem) that the conductivity could be identified with the Chern character of a line bundle over a Brillouin zone (a quotient of momentum space by the action of the reciprocal crystal lattice), so that it had to be integral and the stability of the effect was a consequence of the topological nature of $\sigma$. Unfortunately, whilst suggestive, this explanation worked only under the physically implausible constraint that the magnetic flux through a crystal cell was rational, offered no explanation of the link between the Hall and direct conductivities, and, working with a periodic Hamiltonian, made no allowance for the impurities and disorder usually important in solid-state problems of this sort. Notwithstanding these deficiencies, this model contained important insights, which inspired Bellissard to model the effect using Connes' newly developed noncommutative geometry (Bellissard 1986, Connes 1986). (Kunz produced a Hilbert space theory at about the same time, but that has been rather less influential.) Connes' work turned out to contain all the relevant concepts and tools needed to provide a good understanding of the effect, based on interpreting the conductivity as a noncommutative Chern character for a noncommutative version of the Brillouin zone. In fact, the techniques of noncommutative geometry seemed to fit the quantum Hall effect so well that this has become one of the standard examples of the theory.

Even whilst the theorists were struggling to explain the experiments, observations by Tsui, Störmer, and Gossard showed that, with suitable care, fractional Hall conductivities could also be observed, although these were far less stable than those given by integers. One, therefore, distinguishes between IQHE and the fractional quantum Hall effect (FQHE), and this survey concentrates largely on the former. One simplifying feature of the IQHE is that it seems to be comprehensible at the level of individual noninteracting electrons, whereas the FQHE certainly involves some kind of interaction and many-body theory.

This article presents an outline of the connection between noncommutative differential geometry and the IQHE, and concludes by discussing some of the approaches to the FQHE, and some other applications of noncommutative geometry and mathematical directions suggested by the theory. The sections alternate between the physical model and the mathematical abstraction from it.

There are good surveys of the area (Bellissard et al. 1994, McCann 1998) explaining how the mathematical model arises out of the physics, as well as the mathematical models themselves. As well as being the standard reference for noncommutative geometry, Connes (1994) discusses the Hall effect. These resources contain good bibliographies, which may be consulted for further references.

Electron Motion in a Magnetic Field

The following discussion restricts attention to motion in two dimensions, with electrons as the charge carriers, and no interactions between them. (The first condition is essential; the second could be relaxed a little to allow sufficiently long-lived quasi-particles.) A single free electron with mass m and charge e moving in the $x_1$–$x_2$ plane with a constant transverse magnetic field B in the positive $x_3$-direction can be described by the Landau Hamiltonian

$$H_L = |P - eA|^2/2m$$   [1]

where $A = \tfrac{1}{2} B \times X$ is a magnetic vector potential that gives rise to B. This problem is exactly solvable by, for example, introducing $K^\pm = (K_1^\pm, K_2^\pm) = P \mp eA$. The components of $K^+$ and $K^-$ commute with each other, but $[K_1^\pm, K_2^\pm] = \pm i\hbar eB$. Comparison with the harmonic oscillator shows that the energy spectrum of $H_L = [(K_1^+)^2 + (K_2^+)^2]/2m$ is $\{(n + \tfrac{1}{2})\hbar eB/m : n \in \mathbb{N}\}$. Since $H_L$ commutes with the components of $K^-$, each of these Landau energy levels is infinitely degenerate, and the filling fraction $\nu$ measures what proportion of states in the Landau levels are filled. The frequency $\omega_c = eB/m$ is the cyclotron frequency for classical circular orbits in the magnetic field.

The degeneracy of the Landau Hamiltonian can also be understood in terms of the magnetic translations obtained by exponentiating the connection defined by the magnetic potential A: $\nabla_j = \partial_j + ieA_j/\hbar = iK_j^-/\hbar$. More precisely, we set

$$U(a) = \exp(a \cdot \nabla) = \exp(i a \cdot K^-/\hbar)$$   [2]

which clearly commutes with $H_L$, expressing the translational symmetry of this model. The curvature $[\nabla_1, \nabla_2] = ieB/\hbar$ of the connection manifests itself in the identities

$$U(a)U(b) = e^{(1/2)i\phi}\, U(a + b) = e^{i\phi}\, U(b)U(a)$$   [3]

where $\phi = eB \cdot (a \times b)/\hbar$ measures the magnetic flux through the parallelogram spanned by a and b. These show that U is a projective representation of $\mathbb{R}^2$ with projective multiplier $\sigma(a, b) = \exp(\tfrac{1}{2} i\phi)$.
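
As an illustrative aside (not part of the original article), the commutation relation in [3] can be checked numerically in the special case where the flux is a rational multiple of $2\pi$, say $\phi = 2\pi p/q$. In that case the relation $U(a)U(b) = e^{i\phi}U(b)U(a)$ has a standard finite-dimensional realization by $q \times q$ "clock" and "shift" matrices. The sketch below assumes NumPy; the function name `clock_shift` and the choice of matrices are hypothetical illustrations, not the magnetic translation operators themselves, which act on an infinite-dimensional Hilbert space.

```python
import numpy as np

def clock_shift(p: int, q: int):
    """Return q x q matrices (Ua, Ub) and omega = exp(2*pi*i*p/q)
    satisfying Ua @ Ub = omega * Ub @ Ua.

    This is only a finite-dimensional model of relation [3] for the
    rational flux phi = 2*pi*p/q (an assumption made for illustration).
    """
    omega = np.exp(2j * np.pi * p / q)
    Ua = np.diag(omega ** np.arange(q))   # "clock": diagonal phases 1, omega, omega^2, ...
    Ub = np.roll(np.eye(q), 1, axis=0)    # "shift": sends basis vector e_k to e_{k+1 mod q}
    return Ua, Ub, omega

Ua, Ub, omega = clock_shift(p=1, q=3)

# The pair commutes only up to the phase exp(i*phi), as in [3].
print(np.allclose(Ua @ Ub, omega * (Ub @ Ua)))   # relation holds
print(np.allclose(Ua @ Ub, Ub @ Ua))             # plain commutativity fails for phi not in 2*pi*Z
```

When $p/q$ is an integer the phase is 1 and the two matrices commute, which mirrors the special role of fluxes that are integer multiples of $2\pi$.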

The significance of this is that, unless $\phi$ is an integer multiple of $2\pi$, U(a) and U(b) generate a noncommutative algebra. This replaces the commutative algebra of functions on two-dimensional momentum space and leads naturally to a noncommutative geometry.

The unembellished Landau Hamiltonian cannot describe the Hall effect without adding an electric potential $eE \cdot X$ to drive the current in the sample. (Alternatively, and useful for the later discussion, one could use the radiation gauge in which, instead of introducing a scalar potential, a time-dependent term is added to A so that $E = -\partial A/\partial t$.)

The quantum Hall effect also depends crucially on the effects of impurities in the conducting material. These can be modeled by adding a random potential $V_\omega$ with $\omega$ in a compact probability space $\Omega$ to obtain $H_\omega = H_L + eE \cdot X + V_\omega(X)$. A continuous function f on $\Omega$ can be interpreted as a random variable, and its expectation $\tau_\Omega(f)$ gives a trace on the $C^*$-algebra $C(\Omega)$ (i.e., a positive linear functional such that $\tau_\Omega(AB) = \tau_\Omega(BA)$).

Although the magnetic translations commute with $H_L$, they do not generally commute with the potentials so they act on $\Omega$, but, on the other hand, the physics of a disordered system and its translates should be the same, so we assume that the probability measure and hence also $\tau_\Omega$ are invariant under magnetic translations. (As noted earlier, we work in the thermodynamic limit, where the Hall sample expands to fill $\mathbb{R}^2$, so we do not need to worry about translations moving the sample itself.) Then $\Omega$ with the magnetic translation action can be interpreted as the noncommutative Brillouin zone. (A space $\Omega$ can be reconstructed from the magnetic translations of the resolvents of the Hamiltonians (Bellissard et al. 1994).)

The current J may be defined as the functional derivative of the Hamiltonian with respect to the vector potential A or, in components, $J_k = \delta_k H = \delta H/\delta A_k$. For the Landau Hamiltonian, this gives

$$\hbar\,\delta_k H_L = -e\hbar (P_k - eA_k)/m = ie[X_k, H_L]$$   [4]

a relation which persists for $H = H_L + V(X)$ whenever the potential V is independent of A, so that $\delta_k H = ie[X_k, H]/\hbar = -e\,dX_k/dt$, which is, up to sign, the charge times velocity, as one might expect. The operator functional calculus delivers a similar formula for derivations of the spectral projections of H. We have $\delta_k = e\partial_k/\hbar$, where, in view of the commutation relations, $\partial_k = i[X_k, \cdot\,]$ can be regarded as a momentum-space derivative, confirming that we are dealing with the differential geometry of momentum space.

We now wish to calculate the expected current $\langle J_k \rangle$, in a thermal state with chemical potential $\mu$ at inverse temperature $\beta = 1/kT$ (where k is Boltzmann's constant and the temperature is T (kelvin)). Using the Fermi–Dirac distribution, the grand canonical expectation value is

$$\langle J_k \rangle = \mathrm{tr}\left[\left(1 + e^{\beta(H - \mu)}\right)^{-1} J_k\right]$$   [5]

Since the quantum Hall effect occurs at low temperatures (large $\beta$) and for weak fields, we formally proceed to those limits. Then $(1 + e^{\beta(H - \mu)})^{-1}$ tends to the projection $P_F$ onto the states with energy less than the Fermi energy $E_F$ in the absence of the electric field. The limiting expected current is, therefore, $\mathrm{tr}(P_F J_k) = \mathrm{tr}(P_F \delta_k H)$, where H is now the Hamiltonian including the electric field (without which there would be no current).

A detailed calculation of the Hall conductivity using the Kubo–Greenwood formula shows that the conductivity matrix is actually

$$\sigma_{kj} = i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])$$   [6]

In particular, this immediately implies that the direct conductivity terms $\sigma_{jj}$ vanish, as observation suggested. The derivation of [6] requires great care, and references may be found in the surveys, but a formal argument in the next section may lend this expression some plausibility.

The Noncommutative Geometry

The principal ingredient for noncommutative geometry is an algebra, and thus we shall now consider a class of algebras broad enough to include the physical example.

The action of the magnetic translations on $\Omega$ defines automorphisms of the $C^*$-algebra $C(\Omega)$, which permit the construction of a twisted crossed-product algebra, in which these automorphisms are represented by conjugation. Because much of the theory has been formulated with lattice approximations using $\mathbb{Z}^2$ rather than $\mathbb{R}^2$, it is useful to work more generally with a separable locally compact abelian group G with continuous multiplier $\sigma$, and a homomorphism $\alpha$ to automorphisms of a $C^*$-algebra $A_1$ with trace $\tau_1$, which will in practice be the commutative algebra $C(\Omega)$ with $\tau_\Omega$. The twisted crossed product $A = C(A_1, G, \sigma)$ can be constructed as the norm completion of the continuous compactly

supported functions from G to $A_1$ with the product, adjoint and norm

$$(f * g)(x) = \int_G \sigma(y, x - y)\, f(y)\, (\alpha_y g)(x - y)\, dy$$   [7]

$$f^*(x) = \sigma(x, -x)^{-1} f(-x)^*$$   [8]

$$\|f\| = \max\left\{ \int_G \|f(x)\|_{A_1}\, dx,\ \int_G \|f^*(x)\|_{A_1}\, dx \right\}$$   [9]

integration being with respect to the Haar measure. The crossed-product algebra is noncommutative, both because of the action of G and due to the multiplier $\sigma$. It has a trace $\tau[f] = \tau_1[f(0)]$ and, when $G = \mathbb{R}^2$, has derivations given by $(\partial_k f)(x) = i x_k f(x)$.

As an example, consider the case of periodic potentials invariant under translation by vectors a and b. Then the group $G \cong \mathbb{Z}^2$ generated by a and b acts trivially on $\Omega$ and the crossed-product algebra is just a product of $A_1$ and the twisted group algebra of complex-valued functions $C(\mathbb{C}, G, \sigma)$, generated by U(a) and U(b). We already noted that the algebra is commutative only when the flux $\phi \in 2\pi\mathbb{Z}$, in which case it is just the convolution algebra of $\mathbb{Z}^2$, which by Fourier transforming (effectively setting $U(a) = e^{i\alpha}$ and $U(b) = e^{i\beta}$) is the algebra $C(\mathbb{T}^2)$, with torus coordinates $\alpha$ and $\beta$. For fluxes which are rational multiples of $2\pi$ we obtain a matrix algebra, whilst irrational fluxes give an infinite-dimensional irrational rotation algebra or noncommutative torus, a standard example in noncommutative geometry.

Any *-representation $\rho$ of $A_1$ on a Hilbert space $H_\rho$ can be induced to a *-representation $\pi_\rho$ of the twisted crossed product on $H = L^2(G, H_\rho)$ by setting

$$(\pi_\rho(f)\psi)(x) = \int_G \sigma(x, y - x)^{-1}\, (\alpha_x f)(y - x)\, \psi(y)\, dy$$   [10]

for $f \in A$ and $\psi \in H$. When $A_1 = C(\Omega)$, we may take $\rho$ to be a one-dimensional irreducible *-representation given by evaluating the function at a point $\omega \in \Omega$.

When $G = \mathbb{R}^2$, it is easy to construct a Fredholm module from $\pi_\rho$. The space $H^2 = H \otimes \mathbb{C}^2$ has actions $\pi$ of A on the first factor and of the Pauli spin matrices $\sigma_1, \sigma_2, \sigma_3$ on the second. It may be regarded as a graded module with grading operator $\sigma_3$, and

$$F = (x_1^2 + x_2^2)^{-1/2} (x_1 \sigma_1 + x_2 \sigma_2)$$   [11]

provides a Connes–Fredholm involution which anticommutes with $\sigma_3$. Detailed technical results of Connes show how to use the supertrace on $H^2$ and the Dixmier trace to interpret the physically important quantities in this setting.

We now turn to the formal derivation of the key alternative expression for the conductivity. In the abstract algebraic setting, when $p \in A$ is a projection in the domain of a derivation $\delta$ the derivative of $(1 - p)p = 0$ gives

$$0 = \delta((1 - p)p) = (1 - p)\delta(p) - \delta(p)p$$   [12]

and then an easy calculation leads to

$$[p, [\delta p, p]] = 2p\,\delta(p)\,p - \delta(p)p^2 - p^2\delta(p) = -\delta p$$   [13]

In the identity for elements a, b, c, and $h \in A$

$$\tau([a, [b, c]]h) - \tau(c[[h, a], b]) = \tau([a, [b, c]h]) + \tau([b, c[h, a]]) = 0$$   [14]

we set $a = c = p$ and $b = \delta p$ to obtain

$$\tau([p, [\delta p, p]]h) = \tau(p[[h, p], \delta p])$$   [15]

Combining this with [12] when $\tau \circ \delta = 0$, one obtains

$$\tau(p\,\delta h) = \tau(\delta(ph)) - \tau(\delta(p)h) = \tau([p, [\delta p, p]]h) = \tau(p[[h, p], \delta p])$$   [16]

The Hall Conductivity and Anderson Localisation

Substituting $p = P_F$ and $h = H$ in formula [16] would give the current $\mathrm{tr}(P_F[[H, P_F], \delta_k P_F])$. Since $\delta_k$ is proportional to the commutator with $X_k$, it is true that $\mathrm{tr} \circ \delta_k = 0$, but, unfortunately, $P_F$ need not lie in the domain of $\delta_k$, and H is unbounded, further compounding the difficulties. These are serious problems, although the situation is not quite as bad as it seems. Without the electrostatic term $eE \cdot X$ in H, $P_F$ would have been a spectral projection with which H would commute, so that

$$[H, P_F] = e[E \cdot X, P_F] = eE_j[X_j, P_F] = ieE_j\,\partial_j P_F$$   [17]

and H disappears from the formula, to be replaced by $\partial_j P_F$. This would give the expected current $i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])E_j$, and the conductivity matrix

$$\sigma_{kj} = i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])$$   [18]

given earlier (there is no need to scale by the thickness in two dimensions).
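
As an illustrative aside (not part of the original article), the purely algebraic identities [12], [13] and [16] can be sanity-checked with finite matrices, where none of the domain or trace-class difficulties arise. The sketch below assumes NumPy and makes the following hypothetical choices: the algebra is the $n \times n$ matrices, the trace $\tau$ is the ordinary matrix trace, and the derivation is the inner derivation $\delta(a) = [D, a]$ for a fixed random matrix D, so that $\tau \circ \delta = 0$ holds automatically. It says nothing about the analytic subtleties for the actual unbounded operators.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 2   # matrix size and rank of the projection (arbitrary illustrative values)

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

# Random rank-r orthogonal projection p = Q Q^dagger built from orthonormal columns.
Q, _ = np.linalg.qr(rng.normal(size=(n, r)) + 1j * rng.normal(size=(n, r)))
p = Q @ Q.conj().T

# Hypothetical inner derivation delta(a) = [D, a]; the trace of any commutator vanishes.
D = rng.normal(size=(n, n))

def delta(a):
    return comm(D, a)

# A symmetric matrix standing in for the element h.
h = rng.normal(size=(n, n))
h = h + h.T

dp = delta(p)

# [12]: (1 - p) delta(p) = delta(p) p
print(np.allclose((np.eye(n) - p) @ dp, dp @ p))

# [13]: [p, [delta p, p]] = -delta p
print(np.allclose(comm(p, comm(dp, p)), -dp))

# [16]: tr(p delta(h)) = tr(p [[h, p], delta p])
lhs = np.trace(p @ delta(h))
rhs = np.trace(p @ comm(comm(h, p), dp))
print(np.isclose(lhs, rhs))
```

Each of the three checks should report agreement up to floating-point rounding, since the identities depend only on $p^2 = p$, the Leibniz rule, and the vanishing of the trace on commutators.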

However it is derived, this expression for the conductivity only makes sense under suitable conditions, otherwise $\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])$ might either be undefined (because $P_F$ is not differentiable) or might not be trace class. There is a simple condition sufficient to handle both these difficulties, which also leads to an interesting physical insight. From the obvious inequality

$$0 \le \mathrm{tr}\left[P_F(\partial_1 P_F \pm i\partial_2 P_F)^*(\partial_1 P_F \pm i\partial_2 P_F)\right]$$   [19]

$$= \mathrm{tr}\left[P_F\left((\partial_1 P_F)^2 + (\partial_2 P_F)^2\right)\right] \pm i\,\mathrm{tr}(P_F[\partial_1 P_F, \partial_2 P_F])$$   [20]

and the fact that $1 \ge P_F$, we deduce that

$$\mathrm{tr}\left[(\partial_1 P_F)^2 + (\partial_2 P_F)^2\right] \ge \mathrm{tr}\left[P_F\left((\partial_1 P_F)^2 + (\partial_2 P_F)^2\right)\right] \ge |\mathrm{tr}(P_F[\partial_1 P_F, \partial_2 P_F])|$$   [21]

Thus, if $\mathrm{tr}[(\partial_1 P_F)^2 + (\partial_2 P_F)^2]$ exists and is finite, then our expression for the conductivity is well defined. Mathematically, this is a Sobolev type condition. To see the physical significance, we recall that $\partial_k P_F = i[X_k, P_F]$, so that the condition is equivalent to the finiteness of $\mathrm{tr}[(X_1^2 + X_2^2)P_F^2] - \mathrm{tr}[(X_1 P_F)^2 + (X_2 P_F)^2]$. This condition imposes a requirement for some localization in the system (when $P_F$ is a rank-1 projection, it reduces to the requirement that the variance $\langle X_1^2 + X_2^2 \rangle - \langle X_1 \rangle^2 - \langle X_2 \rangle^2$ be finite). This links with a much older observation of Anderson that the interference caused by impurities in a crystal, which cancel at long range, should, at smaller distances, cause localized clumping. The mathematical development of this idea by Pastur provides an appropriate tool for handling the conditions for the validity of the conductivity formula. The impurities generating Anderson localization are provided in this model by the random potential in the Hamiltonian. It also leads us to restrict attention to the dense subalgebra $A_0$ of $f \in A$, where $\tau[(\partial_1 f)^*(\partial_1 f) + (\partial_2 f)^*(\partial_2 f)] < \infty$.

The Integral Quantum Hall Effect

Having identified the features of physical interest, we can return to the abstract algebraic description with conductivity $i(e^2/\hbar)\,\tau(p[\partial_j p, \partial_k p])$. The key observation is that this can be interpreted as the Connes pairing between a cyclic cocycle $\tau_c$ on $A_0$ and the projection p whose stable equivalence class represents an element of the $C^*$-algebraic K-theory, $K_0(A)$. Such pairings give noncommutative Chern characters. The cyclic cocycle is a trilinear form defined on elements $a_0, a_1, a_2 \in A_0$ by

$$\tau_c(a_0, a_1, a_2) = \tau[a_0(\partial_1 a_1\, \partial_2 a_2 - \partial_2 a_1\, \partial_1 a_2)]$$   [22]

This is easily shown to be cyclic, $\tau_c(a_0, a_1, a_2) = \tau_c(a_1, a_2, a_0)$, and to satisfy the cyclic 2-cocycle condition

$$\tau_c(a_0 a_1, a_2, a_3) - \tau_c(a_0, a_1 a_2, a_3) + \tau_c(a_0, a_1, a_2 a_3) - \tau_c(a_3 a_0, a_1, a_2) = 0$$   [23]

The Hall conductivity $\sigma_{21} = i\tau_c(p, p, p)e^2/\hbar$ can now be interpreted as the noncommutative Chern character defined by the projection p.

This interpretation of the Hall conductivity clears the way to prove that it is integral, and there are several different routes to this.

One approach is to identify the conductivity with some kind of index which is clearly integral. Bellissard worked with the Fredholm module where, by results of Connes, the Chern character is interpreted as the index of the Fredholm operator $\pi_\rho(p) F \pi_\rho(p)$. Avron, Seiler and Simon have interpreted the conductivity as a relative index $\dim[\ker(P_F - Q_F - 1)] - \dim[\ker(Q_F - P_F - 1)]$ of the projection $P_F$ and its conjugate $Q_F = uP_F u^*$ by an off-diagonal element u of F. This is particularly interesting as the conjugation by u can be interpreted as a nonsingular gauge transformation of exactly the kind introduced by Laughlin in his original explanation of the quantum Hall effect in terms of singular flux tubes piercing a cylindrical conductor.

Xia suggested another approach rewriting A as a repeated crossed product with $\mathbb{R}$, which allows us to calculate $K_0(A)$, using either Connes' Thom isomorphism theorem or the Takai duality theorem for stable algebras to get

$$K_0(A) = K_0(C(A_1, G, \sigma)) \cong K_0(A_1)$$   [24]

which, when $A_1 = C(\Omega)$, is just $K^0(\Omega)$, leading to identification as a topological index. For the simplest case of $\Omega = \mathbb{T}^2$, this gives $K^0(\Omega) \cong \mathbb{Z}^2$. The image of $\tau$, and so also $\tau_c$, actually sits in just one component, leading to quantization of the Hall conductivity.

The two questions posed in the introduction can now be answered as follows: The Hall conductivity can be identified with a topological index which can take only integer values, and therefore does not respond to continuous changes in any of the physical parameters until the change brings the system into a region where one of the background assumptions fails, such as a breakdown in the localization condition. The same conditions also ensure that the direct current vanishes. Roughly speaking, the

plateaus occur when the Fermi energy is in a gap in the extended (nonlocalized) spectrum.

This brief overview has omitted many of the interesting features of the detailed theory, which can be found in the surveys, such as the fact that low-lying energy levels do not contribute to the conductivity, and Shubin's theorem identifying $\tau(p)$ as the integrated density of states. Harper's equation describing a discrete lattice analog of the IQHE has been a test-bed for many of the ideas, and various results were first proved in that setting. The FQHE was discovered during an unsuccessful search for a Wigner crystal phase transition, but analysis of discrete models provides strong evidence that Hall conductors have very complicated phase diagrams.

The Fractional Quantum Hall Effect

As mentioned in the introduction, by the time the IQHE had been understood theoretically, it had been found that, with appropriate care, fractional conductivities could also be observed, although they were much less precise and stable than the integer values, and the plateaus less pronounced. Although there have been many phenomenological explanations, there is as yet no mathematical understanding from quantum field theory as compelling as that for the integer effect. We shall briefly summarize some of the main lines of attack.

The first explanation, again due to Laughlin, has also provided the basis for many subsequent treatments of the problem. The wave functions of the oscillator-like Landau Hamiltonian can conveniently be represented in the Bargmann–Segal Fock space of holomorphic functions f on $\mathbb{R}^2 \cong \mathbb{C}$ which are square-integrable with respect to a Gaussian measure. Incorporating the measure into the functions, these have the form $f(z)\exp(-|z|^2/2)$. Many particle wave functions are similarly realized in terms of holomorphic functions on $\mathbb{C}^N$, and must be antisymmetric under odd permutations of the particles to describe fermions. This quickly leads one to consider functions of the form

$$\prod_{r<s} (z_r - z_s)^k \exp\left(-\sum_j |z_j|^2/2\right)$$   [25]

for odd integers k > 0, and their multiples by even holomorphic functions. The lowest energy where such a wave function occurs is when k = 1, and larger values of k have the effect of dividing the Hall conductivity by k, which produces fractional conductivities.

Halperin suggested quite early that counterflowing currents in the interior of a sample would tend to cancel, so that most of the current would be carried near the edge of the sample. There are several mathematical derivations of this, by, for example, Macris, Martin, and Pulé, and by Fröhlich, Graf, and Walcher. The K-theory of the boundary and bulk of a sample can be linked by exact sequences such as those of the commutative theory (Kellendonk et al. 2002), and even in the IQHE boundary and bulk conductivities can be used (Schulz-Baldes et al. 2000).

It has been fairly clear that whilst the IQHE can already be understood in terms of the motion of a single electron, the fractional effect is a many-body cooperative effect. One attempt to simplify the description is to work with an incompressible quantum fluid, and for edge currents one should study the boundary theory of such a fluid, in which the dominant contribution to the action is a Chern–Simons term, with conductivity as a coefficient. For an annular sample, this leads, in a suitable limit, to a chiral Luttinger model on the boundary circles, which can then be tackled mathematically using the representation theory of loop groups. This leads to some elegant mathematics, including extensions to multiple coupled bands, with conductivities described by Cartan matrices, as explained in the International Congress of Mathematicians (ICM) survey (Fröhlich 1995), and in the review by Fröhlich and Studer (1993).

The theory of composite fermions provides another physical approach in which field-theoretic effects result in the electrons sharing their charges in such a way as to produce fractional charges, and there is experimental evidence of such fractional charges in studies of tunneling from one edge to another. Then the FQHE is easily understood by simply replacing the electron charge e by e/k in the appropriate formulas.

Susskind has suggested combining noncommutative geometry with the theory of incompressible quantum fluids, an idea taken up by Polychronakos (2001). There are intriguing mathematical parallels with work by Berest and Wilson on ideals in the Weyl algebra and the Calogero–Moser model.

Further Developments

Bellissard and others have extended the use of noncommutative geometrical methods into other parts of solid-state theory, where they clarify a number of the physical ideas. This is particularly useful in the case of quasicrystals, which are not easily handled by the conventional methods (Bellissard et al. 2000). Some ideas in string theory resemble higher-dimensional analogs, and higher-dimensional versions of the quantum Hall effect have also been studied by Hu and Zhang.

Finally, we conclude with some mathematical extensions of the theory. We have seen that, for periodic systems, the noncommutative Brillouin

zone can be a noncommutative torus, and it is possible to consider noncommutative versions of Riemann surfaces of higher genera. Carey et al. (1998) studied the effect in a noncommutative hyperbolic geometry with a discrete group action, generalizing the action of a Fuchsian group on the unit disc. This provides a tractable example in which one has an edge (albeit rather different from the normal physical situations) and also examples of a Hall effect in higher-genus noncommutative Riemann surfaces closely related to those of Klimek and Lesniewski. Natsumé and Nest have subsequently shown that these are deformation quantizations of the commutative Riemann surface theory in the sense of Rieffel. Coverings of noncommutative Riemann surfaces, which might provide an analog of composite fermions, have been investigated by Marcolli and Mathai (1999, 2001).

See also: C*-Algebras and Their Classification; Chern–Simons Models: Rigorous Results; Fractional Quantum Hall Effect; Hopf Algebras and q-Deformation Quantum Groups; Localization for Quasiperiodic Potentials; Noncommutative Geometry and the Standard Model; Noncommutative Tori, Yang–Mills, and String Theory; Schrödinger Operators.

Further Reading

Bellissard J (1986) K-theory of C*-algebras in solid state physics. In: Dorlas T, Hugenholtz NM, and Winnink M (eds.) Statistical Mechanics and Field Theory: Mathematical Aspects, Springer Lecture Notes in Physics vol. 257, pp. 99–156. Berlin: Springer.
Bellissard J, Herrmann DJL, and Zarrouati M (2000) Hulls of aperiodic solids and gap labelling theorem. In: Baake M and Moody RV (eds.) Directions in Mathematical Quasi-Crystals, CRM Monograph Series vol. 13, pp. 207–258. Providence, RI: American Mathematical Society.
Bellissard J, van Elst A, and Schulz-Baldes H (1994) The non-commutative geometry of the quantum Hall effect. Journal of Mathematical Physics 35: 5373–5457.
Carey AL, Hannabuss KC, Mathai V, and McCann P (1998) Quantum Hall effect on the hyperbolic plane. Communications in Mathematical Physics 190: 629–673.
Connes A (1986) Non-commutative differential geometry. Publications of the Institut des Hautes Etudes Scientifiques 62: 257–360.
Connes A (1994) Non Commutative Geometry. San Diego: Academic Press.
Fröhlich J (1995) The Fractional Quantum Hall Effect, Chern–Simons Theory and Integral Lattices. Proceedings of the International Congress of Mathematicians, 1994, Zürich, 75–105. Basel: Birkhäuser Verlag.
Fröhlich J and Studer UM (1993) Gauge invariance and current algebra in nonrelativistic many-body theory. Reviews of Modern Physics 65: 733–802.
Kellendonk J, Richter T, and Schulz-Baldes H (2002) Edge current channels and Chern numbers in the integral quantum Hall effect. Reviews in Mathematical Physics 14: 87–119.
Marcolli M and Mathai V (1999) Twisted index theory on good orbifolds. I. Non-commutative Bloch theory. Communications in Contemporary Mathematics 1: 553–587.
Marcolli M and Mathai V (2001) Twisted index theory on good orbifolds. II. Fractional quantum numbers. Communications in Mathematical Physics 217: 55–87.
McCann PJ (1998) Geometry and the integer quantum Hall effect. In: Carey AL and Murray MK (eds.) Geometric Analysis and Lie Theory in Mathematics and Physics, pp. 122–208. Australian Mathematical Society Lecture Series. Cambridge: Cambridge University Press.
Polychronakos A (2001) Quantum Hall states as matrix Chern–Simons theory. The Journal of High Energy Physics 4(paper 11): 20.
Schulz-Baldes H, Kellendonk J, and Richter T (2000) Simultaneous quantization of edge and bulk Hall conductivity. Journal of Physics A 33: L27–L32.

Quantum Mechanical Scattering Theory


D R Yafaev, Université de Rennes, Rennes, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Scattering theory is concerned with the study of the large-time behavior of solutions of the time-dependent Schrödinger equation [1] for a system with a Hamiltonian $H$:

$$ i\,\partial u/\partial t = Hu, \qquad u(0) = f \qquad [1] $$

Being a part of the perturbation theory, scattering theory describes the asymptotics of $u(t)$ as $t \to +\infty$ or $t \to -\infty$ in terms of solutions of the Schrödinger equation for a "free" system with a Hamiltonian $H_0$. Of course, eqn [1] has a unique solution $u(t) = \exp(-iHt)f$, while the solution of the same equation with the operator $H_0$ and the initial data $u_0(0) = f_0$ is given by the formula $u_0(t) = \exp(-iH_0 t)f_0$. From the viewpoint of scattering theory, the function $u(t)$ has free asymptotics as $t \to \pm\infty$ if for appropriate initial data $f_0^\pm$ eqn [2] holds:

$$ \lim_{t\to\pm\infty} \| u(t) - u_0^\pm(t) \| = 0 \qquad [2] $$

Here and throughout this article a relation containing the signs "$\pm$" is understood as two independent equalities. We emphasize that initial data $f_0^\pm$ are different for $t \to +\infty$ and $t \to -\infty$ and $u_0^\pm(t) = \exp(-iH_0 t)f_0^\pm$. Equation [2] leads to a connection between the corresponding initial data $f_0^\pm$ and $f$ given by

$$ f = \lim_{t\to\pm\infty} \exp(iHt)\exp(-iH_0 t) f_0^\pm \qquad [3] $$

If $f$ is an eigenvector of $H$, that is, $Hf = \lambda f$, then obviously $u(t) = e^{-i\lambda t}f$. On the contrary, if $f$ belongs to the (absolutely) continuous subspace of $H$, then necessarily $u(t)$ has the free asymptotics as $t \to \pm\infty$. This result is known as asymptotic completeness.

The Schrödinger operator $H = -\Delta + V(x)$ in the space $\mathcal{H} = L_2(\mathbb{R}^d)$ with a real potential $V$ decaying at infinity is a typical Hamiltonian of scattering theory. The operator $H$ describes a particle in an external potential $V$ or two interacting particles. Asymptotically (as $t \to +\infty$ or $t \to -\infty$), particles may either form a bound state or be free (a scattering state). Of course, a bound (scattering) state at $-\infty$ remains the same at $+\infty$. To be more precise, suppose that

$$ |V(x)| \le C(1 + |x|)^{-\rho} \qquad [4] $$

where $\rho > 1$. Then relation [2] can be justified with the kinetic energy operator $H_0 = -\Delta$ playing the role of the unperturbed operator.

As discussed in Landau and Lifshitz (1965) (see also Amrein et al. (1977), Pearson (1988), and Yafaev (2000)), in scattering experiments one sends a beam of particles of energy $\lambda > 0$ in a direction $\omega$. Such a beam is described by the plane wave

$$ \psi_0(x; \omega, \lambda) = \exp(ik\langle \omega, x\rangle), \qquad \lambda = k^2 > 0 $$

(which satisfies of course the free equation $-\Delta\psi_0 = \lambda\psi_0$). The scattered particles are described for large distances by the outgoing spherical wave

$$ a(\hat{x}, \omega; \lambda)\, |x|^{-(d-1)/2} \exp(ik|x|) $$

Here $\hat{x} = x|x|^{-1}$ is the direction of observation and the coefficient $a(\hat{x}, \omega; \lambda)$ is known as the scattering amplitude. This means that quantum particles subject to a potential $V(x)$ are described by the solution $\psi$ of eqn [5] with asymptotics [6] at infinity:

$$ -\Delta\psi + V(x)\psi = \lambda\psi \qquad [5] $$

$$ \psi(x; \omega, \lambda) = \exp(ik\langle \omega, x\rangle) + a(\hat{x}, \omega; \lambda)\, |x|^{-(d-1)/2} \exp(ik|x|) + o\bigl(|x|^{-(d-1)/2}\bigr) \qquad [6] $$

The existence of such solutions requires of course a proof. The differential scattering cross section defined by eqn [7] gives us the part of particles scattered in a solid angle $d\hat{x}$:

$$ d\sigma(\hat{x}, \omega; \lambda) = |a(\hat{x}, \omega; \lambda)|^2\, d\hat{x} \qquad [7] $$

As discussed below, the temporal asymptotics of solutions of the time-dependent Schrödinger equation [1] are closely related to the asymptotics at large distances of solutions of the stationary Schrödinger equation [5].

Time-Dependent Scattering Theory and Møller Operators

If $V(x) \to 0$ as $|x| \to \infty$, then the essential spectrum of the Schrödinger operator $H = -\Delta + V(x)$ covers the whole positive half-line, whereas the negative spectrum of $H$ consists of eigenvalues accumulating, perhaps, at the point zero only.

Scattering theory requires a more advanced classification of the spectrum based on measure theory. Consider a self-adjoint operator $H$ defined on domain $\mathcal{D}(H)$ in a Hilbert space $\mathcal{H}$. Let $E$ be its spectral family. Then the space $\mathcal{H}$ can be decomposed into the orthogonal sum of invariant subspaces $\mathcal{H}^{(p)}$, $\mathcal{H}^{(sc)}$ and $\mathcal{H}^{(ac)}$. The subspace $\mathcal{H}^{(p)}$ is spanned by eigenvectors of $H$ and the subspaces $\mathcal{H}^{(sc)}$, $\mathcal{H}^{(ac)}$ are distinguished by the condition that the measure $(E(X)f, f)$ (here $X \subset \mathbb{R}$ is a Borel set) is singularly or absolutely continuous with respect to the Lebesgue measure for all $f \in \mathcal{H}^{(sc)}$ or $f \in \mathcal{H}^{(ac)}$. Typically (in applications to quantum-mechanical problems) the singularly continuous part is absent, that is, $\mathcal{H}^{(sc)} = \{0\}$. We denote by $H^{(ac)}$ the restriction of $H$ on its absolutely continuous subspace $\mathcal{H}^{(ac)}$ and by $P^{(ac)}$ the orthogonal projection on this subspace. The same objects for the operator $H_0$ will be endowed with the index "0."

Equation [3] motivates the following fundamental definition. The wave, or Møller, operator $W_\pm = W_\pm(H, H_0)$ for a pair of self-adjoint operators $H_0$ and $H$ is defined by eqn [8] provided that the corresponding strong limit exists:

$$ W_\pm = \operatorname*{s-lim}_{t\to\pm\infty} \exp(iHt)\exp(-iH_0 t)\, P_0^{(ac)} \qquad [8] $$

The wave operator is isometric on $\mathcal{H}_0^{(ac)}$ and enjoys the intertwining property

$$ H W_\pm = W_\pm H_0 \qquad [9] $$

Therefore, its range $\operatorname{Ran} W_\pm$ is contained in the absolutely continuous subspace $\mathcal{H}^{(ac)}$ of the operator $H$. The operator $W_\pm(H, H_0)$ is said to be complete if eqn [10] holds:

$$ \operatorname{Ran} W_\pm(H, H_0) = \mathcal{H}^{(ac)} \qquad [10] $$
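The intertwining property [9] follows almost directly from definition [8]. The short derivation below is only a sketch: it assumes that the strong limit in [8] exists, and the shift parameter $s$ and the substitution $\tau = t - s$ are introduced here purely for the argument. It uses that $P_0^{(ac)}$ commutes with $\exp(-iH_0 s)$.

```latex
\begin{aligned}
e^{-iHs}\, W_\pm
  &= \operatorname*{s-lim}_{t\to\pm\infty}
     e^{iH(t-s)}\, e^{-iH_0 (t-s)}\, e^{-iH_0 s}\, P_0^{(ac)} \\
  &= \Bigl( \operatorname*{s-lim}_{\tau\to\pm\infty}
     e^{iH\tau}\, e^{-iH_0 \tau}\, P_0^{(ac)} \Bigr)\, e^{-iH_0 s}
   \;=\; W_\pm\, e^{-iH_0 s}, \qquad s \in \mathbb{R}.
\end{aligned}
```

Differentiating this identity in $s$ at $s = 0$ on vectors from the domain of $H_0$ gives $H W_\pm = W_\pm H_0$. Since $W_\pm$ is isometric on $\mathcal{H}_0^{(ac)}$, the restriction of $H$ to $\operatorname{Ran} W_\pm$ is unitarily equivalent to $H_0^{(ac)}$, which is why the range lies in the absolutely continuous subspace of $H$.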
It is easy to see that the completeness of $W_\pm(H, H_0)$ is equivalent to the existence of the "inverse" wave operator $W_\pm(H_0, H)$. Thus, if the wave operator $W_\pm(H, H_0)$ exists and is complete, then the operators $H_0^{(ac)}$ and $H^{(ac)}$ are unitarily equivalent. We emphasize that scattering theory studies not arbitrary unitary equivalence but only the "canonical" one realized by the wave operators.

Along with the wave operators an important role in scattering theory is played by the scattering operator defined by eqn [11] where $W_+^*$ is the operator adjoint to $W_+$:

$$ S = S(H, H_0) = W_+^*(H, H_0)\, W_-(H, H_0) \qquad [11] $$

The operator $S$ commutes with $H_0$ and hence reduces to multiplication by the operator function $S(\lambda) = S(\lambda; H, H_0)$ in a representation of $\mathcal{H}_0^{(ac)}$ which is diagonal for $H_0^{(ac)}$. The operator $S(\lambda)$ is known as the scattering matrix. The scattering operator [11] is unitary on the subspace $\mathcal{H}_0^{(ac)}$ provided the wave operators $W_\pm(H, H_0)$ exist and are complete. The scattering operator $S(H, H_0)$ connects the asymptotics of the solutions of eqn [1] as $t \to -\infty$ and as $t \to +\infty$ in terms of the free problem, that is $S(H, H_0) : f_0^- \mapsto f_0^+$, where $f_0^\pm$ are the same as in eqn [2]. The scattering operator and the scattering matrix are usually of great interest in mathematical physics problems, because they connect the "initial" and the "final" characteristics of the process directly, bypassing its consideration for finite times.

The definition of the wave operators can be extended to self-adjoint operators acting in different spaces. Let $H_0$ and $H$ be self-adjoint operators in Hilbert spaces $\mathcal{H}_0$ and $\mathcal{H}$, respectively, and let "identification" $J : \mathcal{H}_0 \to \mathcal{H}$ be a bounded operator. Then the wave operator $W_\pm = W_\pm(H, H_0; J)$ for the triple $H_0$, $H$, and $J$ is defined by eqn [12] provided again that the strong limit there exists:

$$ W_\pm = \operatorname*{s-lim}_{t\to\pm\infty} \exp(iHt)\, J \exp(-iH_0 t)\, P_0^{(ac)} \qquad [12] $$

Intertwining property [9] is preserved for wave operator [12]. This operator is isometric on $\mathcal{H}_0^{(ac)}$ if and only if

$$ \lim_{t\to\pm\infty} \| J \exp(-iH_0 t) f_0 \| = \| f_0 \| $$

for all $f_0 \in \mathcal{H}_0^{(ac)}$. Since

$$ \operatorname*{s-lim}_{|t|\to\infty} K \exp(-iH_0 t)\, P_0^{(ac)} = 0 $$

for a compact operator $K$, wave operators [12] corresponding to identifications $J_1$ and $J_2$ coincide if $J_2 - J_1$ is compact or, at least, the operators $(J_2 - J_1)E_0(X)$ are compact for all bounded intervals $X$.

Consideration of wave operators [12] with $J \ne I$ may of course be of interest also in the case $\mathcal{H}_0 = \mathcal{H}$.

It suffices to verify the existence of limits [8] or [12] on some set dense in the absolutely continuous subspace $\mathcal{H}_0^{(ac)}$ of the operator $H_0$. The following simple but convenient condition for the existence of wave operators is usually called Cook's criterion. Suppose that $H_0 = H_0^{(ac)}$ and that the operator $J$ maps domain $\mathcal{D}(H_0)$ of the operator $H_0$ into $\mathcal{D}(H)$. Let

$$ \int_0^{\pm\infty} \| (HJ - JH_0) \exp(-iH_0 t) f \|\, dt < \infty $$

for all $f$ from some set $D_0 \subset \mathcal{D}(H_0)$ dense in $\mathcal{H}_0$. Then the wave operator $W_\pm(H, H_0; J)$ exists.

This result is often useful in applications since the operator $\exp(-iH_0 t)$ is known explicitly. For example, it works with $J = I$ for the pair

$$ H_0 = -\Delta, \qquad H = H_0 + V(x) \qquad [13] $$

if $V(x)$ satisfies estimate [4] with $\rho > 1$. On the other hand, different proofs of the existence of the wave operators $W_\pm(H_0, H; J^*)$ require new mathematical tools. There are two essentially different approaches in scattering theory: the trace-class and smooth methods.

Time-Independent Scattering Theory

The approach in scattering theory relying on definition [8] is called time dependent. An alternative possibility is to change the definition of wave operators replacing the unitary groups by the corresponding resolvents $R_0(z) = (H_0 - z)^{-1}$ and $R(z) = (H - z)^{-1}$. They are related by a simple identity

$$ R(z) = R_0(z) - R_0(z) V R(z) = R_0(z) - R(z) V R_0(z) \qquad [14] $$

where $V = H - H_0$ and $\operatorname{Im} z \ne 0$. In the stationary approach in place of limits [8] one has to study the boundary values (in a suitable topology) of the resolvents as the spectral parameter $z$ tends to the real axis. An important advantage of the stationary approach is that it gives convenient formulas for the wave operators and the scattering matrix.

Let us discuss here the stationary formulation of the scattering problem for operators [13] in the Hilbert space $\mathcal{H} = L_2(\mathbb{R}^d)$ in terms of solutions of the Schrödinger equation [5]. If $V(x)$ satisfies estimate [4] with $\rho > (d + 1)/2$, then for all $\lambda > 0$ and all unit vectors $\omega \in \mathbb{S}^{d-1}$, eqn [5] has the solution $\psi(x; \omega, \lambda)$ with asymptotics [6] as $|x| \to \infty$. Moreover, the scattering amplitude $a(\hat{x}, \omega; \lambda)$ belongs to the space
$L_2(\mathbb{S}^{d-1})$ in the variable $\hat{x}$ uniformly in $\omega \in \mathbb{S}^{d-1}$, and it can be expressed via $\psi(x; \omega, \lambda)$ by the formula

$$ a(\theta, \omega; \lambda) = -\gamma_d(\lambda) \int_{\mathbb{R}^d} e^{-ik\langle \theta, x\rangle}\, V(x)\, \psi(x; \omega, \lambda)\, dx $$

where

$$ \gamma_d(\lambda) = e^{-i\pi(d-3)/4}\, 2^{-1} (2\pi)^{-(d-1)/2}\, \lambda^{(d-3)/4} $$

Let us define two sets of scattering solutions, or eigenfunctions of the continuous spectrum, by the formulas

$$ \psi_-(x; \omega, \lambda) = \psi(x; \omega, \lambda) \quad\text{and}\quad \psi_+(x; \omega, \lambda) = \overline{\psi(x; -\omega, \lambda)} $$

In terms of boundary values of the resolvent, the functions $\psi_\pm(\omega, \lambda)$ can be constructed by the formula

$$ \psi_\pm(\omega, \lambda) = \psi_0(\omega, \lambda) - R(\lambda \mp i0)\, V \psi_0(\omega, \lambda) \qquad [15] $$

Obviously, functions [15] satisfy eqn [5]. Using resolvent identity [14], it is easy to derive the Lippmann–Schwinger equation

$$ \psi_\pm(\omega, \lambda) = \psi_0(\omega, \lambda) - R_0(\lambda \mp i0)\, V \psi_\pm(\omega, \lambda) $$

for $\psi_\pm(\omega, \lambda)$. Asymptotics [6] can be deduced from the formula

$$ (R_0(\lambda \pm i0) f)(x) = c_\pm(\lambda)\, (\Gamma_0(\lambda) f)(\pm\hat{x})\, |x|^{-(d-1)/2} \exp(\pm ik|x|) + O\bigl(|x|^{-(d+1)/2}\bigr) $$

where $f \in C_0^\infty(\mathbb{R}^d)$, $c_\pm(\lambda) = \pi^{1/2} \lambda^{-1/4} e^{\mp i\pi(d-3)/4}$ and the operator $\Gamma_0(\lambda)$ defined by eqn [16] is (up to the numerical factor) the restriction of the Fourier transform $\hat{f} = \mathcal{F}_0 f$ onto the sphere of radius $\lambda^{1/2}$:

$$ (\Gamma_0(\lambda) f)(\omega) = 2^{-1/2} \lambda^{(d-2)/4}\, \hat{f}(\lambda^{1/2}\omega), \qquad \omega \in \mathbb{S}^{d-1} \qquad [16] $$

The wave operators $W_\pm(H, H_0)$ can be constructed in terms of the solutions $\psi_\pm$. Set $\xi = \lambda^{1/2}\omega$ ($\xi$ is the momentum variable), write $\psi_\pm(x, \xi)$ instead of $\psi_\pm(x; \omega, \lambda)$, and consider two transformations

$$ (\mathcal{F}_\pm f)(\xi) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} \overline{\psi_\pm(x, \xi)}\, f(x)\, dx \qquad [17] $$

(defined initially, e.g., on the Schwartz class $\mathcal{S}(\mathbb{R}^d)$) of the space $L_2(\mathbb{R}^d)$ into itself. The operators $\mathcal{F}_\pm$ can be regarded as generalized Fourier transforms, and both of them coincide with the usual Fourier transform $\mathcal{F}_0$ if $V = 0$. It follows from eqns [5], [17] that under the action of $\mathcal{F}_\pm$ the operator $H$ goes over into multiplication by $|\xi|^2$, that is,

$$ (\mathcal{F}_\pm H f)(\xi) = |\xi|^2 (\mathcal{F}_\pm f)(\xi) $$

Moreover, with the help of eqn [15], it can be shown that $\mathcal{F}_\pm$ is an isometry on $\mathcal{H}^{(ac)}$, it is zero on $\mathcal{H} \ominus \mathcal{H}^{(ac)}$, and its range $\operatorname{Ran} \mathcal{F}_\pm = L_2(\mathbb{R}^d)$. This is equivalent to eqns [18]:

$$ \mathcal{F}_\pm^* \mathcal{F}_\pm = P^{(ac)}, \qquad \mathcal{F}_\pm \mathcal{F}_\pm^* = I \qquad [18] $$

Hence any function $f \in \mathcal{H}^{(ac)}$ admits the expansion in the generalized Fourier integral

$$ f(x) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} \psi_\pm(x, \xi)\, (\mathcal{F}_\pm f)(\xi)\, d\xi $$

It can also be deduced from eqn [6] that the vector

$$ (\mathcal{F}_\pm^* - \mathcal{F}_0^*) \exp(-i|\xi|^2 t)\, \hat{f} $$

tends to zero as $t \to \pm\infty$ for an arbitrary $\hat{f} \in L_2(\mathbb{R}^d)$. This implies the existence of the wave operators $W_\pm = W_\pm(H, H_0)$ for pair [13] and gives the representation

$$ W_\pm = \mathcal{F}_\pm^* \mathcal{F}_0 \qquad [19] $$

Completeness of $W_\pm$ follows from eqn [19] and the first equation in [18]. The second equality in [18] is equivalent to the isometricity of $W_\pm$.

Formula [19] is an example of a stationary representation for the wave operator. It formally implies that

$$ W_\pm \psi_0(\omega, \lambda) = \psi_\pm(\omega, \lambda) $$

which means that each wave operator establishes a one-to-one correspondence between eigenfunctions of the continuous spectrum of the operators $H_0$ and $H$.

The main ideas of the stationary approach go back to Friedrichs (1965) and Povzner. The inverse problem of reconstruction of a potential $V$ given the scattering amplitude $a$ (see eqn [6]) is treated in Faddeev (1976).

The Trace-Class Method

Recall that the class $\mathfrak{S}_p$, $p \ge 1$, consists of compact operators $T$ such that the norm

$$ \| T \|_p = \Bigl( \sum_n \lambda_n^p(|T|) \Bigr)^{1/p}, \qquad |T| = (T^* T)^{1/2} $$

is finite. Eigenvalues $\lambda_n(|T|) =: s_n(T)$ of a non-negative operator $|T|$ are called singular numbers of $T$. In particular, $\mathfrak{S}_1$ is the trace class and $\mathfrak{S}_2$ is the Hilbert–Schmidt class.

The trace-class method (see Reed and Simon (1979) or Yafaev (1992) for a detailed presentation) makes no assumptions about the "unperturbed" operator $H_0$. Its basic result is the following theorem of Kato and Rosenblum. If $V = H - H_0$ belongs to the trace class
$\mathfrak{S}_1$, then the wave operators $W_\pm(H, H_0)$ exist and are complete. In particular, the operators $H_0^{(ac)}$ and $H^{(ac)}$ are unitarily equivalent. This can be considered as a far advanced extension of the H. Weyl theorem, which states the stability of the essential spectrum under compact perturbations.

The condition $V \in \mathfrak{S}_1$ in the Kato–Rosenblum theorem cannot be relaxed in the framework of operator ideals $\mathfrak{S}_p$. This follows from the Weyl–von Neumann–Kuroda theorem. Let $H_0$ be an arbitrary self-adjoint operator. For any $p > 1$ and any $\varepsilon > 0$ there exists a self-adjoint operator $V$ such that $V \in \mathfrak{S}_p$, $\|V\|_p < \varepsilon$ and the operator $H = H_0 + V$ has purely point spectrum. Of course, such an operator $H$ has no absolutely continuous part. At the same time, the operator $H_0$ may be absolutely continuous. In this case, the wave operators $W_\pm(H, H_0)$ do not exist.

Although sharp in the abstract framework, the Kato–Rosenblum theorem cannot directly be applied to the theory of differential operators where a perturbation is usually an operator of multiplication and hence is not even compact. We mention its two generalizations applicable to this theory. The first, the Birman–Kato–Kreĭn theorem, claims that the wave operators $W_\pm(H, H_0)$ exist and are complete provided

$$ R^n(z) - R_0^n(z) \in \mathfrak{S}_1 $$

for some $n = 1, 2, \ldots$ and all $z$ with $\operatorname{Im} z \ne 0$. The second, the Birman theorem, asserts that the same is true if $\mathcal{D}(H) = \mathcal{D}(H_0)$ or $\mathcal{D}(|H|^{1/2}) = \mathcal{D}(|H_0|^{1/2})$ and

$$ E(X)(H - H_0)E_0(X) \in \mathfrak{S}_1 $$

for all bounded intervals $X$.

The wave operators enjoy the following property known as the Birman invariance principle. Suppose that $\varphi(H) - \varphi(H_0) \in \mathfrak{S}_1$ for a real function $\varphi$ such that its derivative $\varphi'$ is absolutely continuous and $\varphi'(\lambda) > 0$. Then the wave operators $W_\pm(H, H_0)$ exist and eqn [20] holds:

$$ W_\pm(H, H_0) = W_\pm(\varphi(H), \varphi(H_0)) \qquad [20] $$

A direct generalization of the Kato–Rosenblum theorem to the operators acting in different spaces is due to Pearson. Suppose that $H_0$ and $H$ are self-adjoint operators in spaces $\mathcal{H}_0$ and $\mathcal{H}$, respectively, $J : \mathcal{H}_0 \to \mathcal{H}$ is a bounded operator and $V = HJ - JH_0 \in \mathfrak{S}_1$. Then the wave operators $W_\pm(H, H_0; J)$ and $W_\pm(H_0, H; J^*)$ exist.

Although rather sophisticated, the proof relies only on the following elementary lemma of Rosenblum. For a self-adjoint operator $H$, consider the set $\mathfrak{R} \subset \mathcal{H}^{(ac)}$ of elements $f$ such that

$$ r_H^2(f) := \operatorname*{ess\,sup}_\lambda\, d(E(\lambda)f, f)/d\lambda < \infty $$

If $K : \mathcal{H} \to \mathcal{G}$ ($\mathcal{G}$ is some Hilbert space) is a Hilbert–Schmidt operator, then for all $f \in \mathfrak{R}$

$$ \int_{-\infty}^{\infty} \| K \exp(-iHt) f \|^2\, dt \le 2\pi\, r_H^2(f)\, \| K \|_2^2 \qquad [21] $$

Moreover, the set $\mathfrak{R}$ is dense in $\mathcal{H}^{(ac)}$.

The Pearson theorem allows one to simplify considerably the original proofs of different generalizations of the Kato–Rosenblum theorem.

A typical application of the trace-class theory is the following result. Suppose that

$$ \mathcal{H} = L_2(\mathbb{R}^d), \qquad H_0 = -\Delta + V_0(x), \qquad H = H_0 + V(x) \qquad [22] $$

where the functions $V_0$ and $V$ are real, $V_0 \in L_\infty(\mathbb{R}^d)$ and $V$ satisfies estimate [4] for some $\rho > d$. Then the wave operators $W_\pm(H, H_0)$ exist and are complete.

The Smooth Method

The smooth method (see Kuroda (1978), Reed and Simon (1979), or Yafaev (1992), for a detailed presentation) relies on a certain regularity of the perturbation in the spectral representation of the operator $H_0$. There are different ways to understand regularity. For example, in the Friedrichs–Faddeev model $H_0$ acts as multiplication by independent variable in the space $\mathcal{H} = L_2(\Lambda; \mathfrak{N})$, where $\Lambda$ is an interval and $\mathfrak{N}$ is an auxiliary Hilbert space. The perturbation $V$ is an integral operator with sufficiently smooth kernel.

Another possibility is to use the concept of $H$-smoothness introduced by Kato. An $H$-bounded operator $K$ is called $H$-smooth if, for all $f \in \mathcal{D}(H)$,

$$ \int_{-\infty}^{\infty} \| K \exp(-iHt) f \|^2\, dt \le C \| f \|^2 \qquad [23] $$

(cf. eqns [21] and [23]). Here and below, $C$ are different positive numbers whose precise values are inessential. It is important that this definition admits equivalent reformulations in terms of the resolvent or of the spectral family. Thus, $K$ is $H$-smooth if and only if

$$ \sup_{\lambda \in \mathbb{R},\, \varepsilon > 0} \| K (R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)) K^* \| < \infty $$

or if and only if

$$ \sup_X |X|^{-1} \| K E(X) \|^2 < \infty $$

where the supremum is taken over all intervals $X \subset \mathbb{R}$.

In applications the assumption of $H$-smoothness of an operator $K$ imposes too stringent conditions on the operator $H$. In particular, the operator $H$ is necessarily absolutely continuous if the kernel of $K$ is trivial. This assumption excludes eigenvalues and other singular points in the spectrum of $H$, for
example, the bottom of the continuous spectrum for the Schrödinger operator with decaying potential or edges of bands if the spectrum has the band structure. The notion of local $H$-smoothness suggested by Lavine is considerably more flexible. By definition, $K$ is called $H$-smooth on a Borel set $X \subset \mathbb{R}$ if the operator $KE(X)$ is $H$-smooth. Note that, under the assumption

$$ \sup_{\lambda \in X,\, \varepsilon > 0} \| K (R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon)) K^* \| < \infty \qquad [24] $$

the operator $K$ is $H$-smooth on the closure of $X$.

The following Kato–Lavine theorem is simple but very useful. Suppose that

$$ HJ - JH_0 = K^* K_0 $$

where the operators $K_0$ and $K$ are $H_0$-smooth and $H$-smooth, respectively, on an arbitrary compact subinterval of some interval $\Lambda$. Then the wave operators

$$ W_\pm(H, H_0; JE_0(\Lambda)) \quad\text{and}\quad W_\pm(H_0, H; J^* E(\Lambda)) $$

exist (and are adjoint to each other).

This result cannot usually be applied directly since the verification of $H_0$- and especially of $H$-smoothness may be a difficult problem. Let us briefly explain how it can be done on the example of pair [13], where the potential $V(x)$ satisfies estimate [4] for some $\rho > 1$. Let us start with the operator $H_0 = -\Delta$. Denote by $L_2^{(l)} = L_2^{(l)}(\mathbb{R}^d)$ the Hilbert space with the norm $\|f\|_l = \|\langle x\rangle^l f\|$, where $\langle x\rangle = (1 + |x|^2)^{1/2}$. Let the operator $\Gamma_0(\lambda)$ be defined by eqn [16], and let $X \subset (0, \infty)$ be some compact interval. Set $\mathfrak{N} = L_2(\mathbb{S}^{d-1})$. If $f \in L_2^{(l)}$ with $l > 1/2$, then, by the Sobolev trace theorem,

$$ \| \Gamma_0(\lambda) f \|_{\mathfrak{N}} \le C \| f \|_l, \qquad \| \Gamma_0(\lambda) f - \Gamma_0(\lambda') f \|_{\mathfrak{N}} \le C |\lambda - \lambda'|^{\alpha} \| f \|_l \qquad [25] $$

for an arbitrary $\alpha \le l - 1/2$, $\alpha < 1$ and all $\lambda, \lambda' \in X$. These estimates imply that the function

$$ (E_0(\lambda) f, f) = \int_{|\xi|^2 < \lambda} |\hat{f}(\xi)|^2\, d\xi \qquad [26] $$

is differentiable and the derivative

$$ d(E_0(\lambda) f, f)/d\lambda = \| \Gamma_0(\lambda) f \|_{\mathfrak{N}}^2, \qquad f \in L_2^{(l)}, \quad l > 1/2 $$

is Hölder-continuous in $\lambda > 0$ (uniformly in $f$, $\|f\|_l \le 1$). Therefore, applying the Privalov theorem to the Cauchy integral

$$ (R_0(z) f, f) = \int_0^\infty (\lambda - z)^{-1}\, d(E_0(\lambda) f, f) $$

we obtain that the analytic operator function

$$ \mathcal{R}_0(z) = \langle x\rangle^{-l} R_0(z) \langle x\rangle^{-l}, \qquad l > 1/2 $$

considered in the space $\mathcal{H}$, is continuous in norm in the closed complex plane $\mathbb{C}$ cut along $(0, \infty)$ with possible exception of the point $z = 0$. This implies $H_0$-smoothness of the operator $\langle x\rangle^{-l}$, $l > 1/2$, on all compact intervals $X \subset (0, \infty)$.

To obtain a similar result for the operator $H$, we proceed from the resolvent identity [14]. Let $\mathcal{R}(z) = \langle x\rangle^{-l} R(z) \langle x\rangle^{-l}$, and let $B$ be the operator of multiplication by the bounded function $(1 + |x|)^{\rho} V(x)$. If

$$ f + \mathcal{R}_0(z) B f = 0 $$

then $\psi = R_0(z) \langle x\rangle^{-l} B f$ satisfies the Schrödinger equation $H\psi = z\psi$. Since $H$ is self-adjoint, this implies that $\psi = 0$ and hence $f = 0$. Using eqn [14], we obtain that

$$ \mathcal{R}(z) = (I + \mathcal{R}_0(z) B)^{-1} \mathcal{R}_0(z), \qquad \operatorname{Im} z \ne 0 \qquad [27] $$

because the inverse operator here exists by the Fredholm alternative.

The operator function $(I + \mathcal{R}_0(z) B)^{-1}$ is analytic in the complex plane cut along $(0, \infty)$ with possible exception of poles (coinciding with eigenvalues of $H$) on the negative half-axis. Moreover, $(I + \mathcal{R}_0(z) B)^{-1}$ is continuous up to the cut except the set $\mathcal{N} \subset (0, \infty)$ of $\lambda$ where at least one of the homogeneous equations

$$ f + \mathcal{R}_0(\lambda \pm i0) B f = 0 \qquad [28] $$

has a nontrivial solution. It follows from eqn [27] that the same is true for the operator function $\mathcal{R}(z)$. It can be shown that the set $\mathcal{N}$ is closed and has the Lebesgue measure zero. Let $\Lambda = (0, \infty) \setminus \mathcal{N}$; then $\Lambda = \bigcup_n \Lambda_n$ where $\Lambda_n$ are disjoint open intervals. By condition [24], the operator $\langle x\rangle^{-l}$, $l > 1/2$, is $H$-smooth on any strictly interior subinterval of every $\Lambda_n$. Applying the Kato–Lavine theorem, we see that the wave operators $W_\pm(H, H_0; E_0(\Lambda_n))$ and $W_\pm(H_0, H; E(\Lambda_n))$ exist for all $n$. Since $E_0(\Lambda) = I$ and $E(\Lambda) = P^{(ac)}$, this implies the existence of $W_\pm(H, H_0)$ and $W_\pm(H_0, H)$. Thus, the wave operators $W_\pm(H, H_0)$ for pair [13] exist and are complete if estimate [4] holds for some $\rho > 1$.

Compared to the trace-class method, conditions on the perturbation $V(x)$ are less restrictive, while the class of admissible "free" problems is essentially more narrow (in eqn [22] $V_0(x)$ is an arbitrary bounded function). It is not known whether the wave operators $W_\pm(H, H_0)$ exist for all pairs [22] such that $V_0 \in L_\infty$ and $V$ satisfies [4] for some $\rho > 1$.

It is important that the smooth method allows one to prove the absence of the singular continuous spectrum. Note first that the continuity of $\mathcal{R}(z)$ implies that the operator $H$ is absolutely continuous on the subspace $E(\Lambda)\mathcal{H}$. Therefore, the singular
positive spectrum of $H$ is necessarily contained in $\mathcal{N}$. To prove that its continuous part is empty, it suffices to check that the set $\mathcal{N}$ consists of eigenvalues of the operator $H$. In terms of $u = \langle x\rangle^{-l} B f$, $l = \rho/2$, eqn [28] can be rewritten as

$$ u + V R_0(\lambda \pm i0) u = 0 \qquad [29] $$

Multiplying this equation by $R_0(\lambda \pm i0)u$ and taking the imaginary part of the scalar product, we see that

$$ \pi\, d(E_0(\lambda)u, u)/d\lambda = \pm \operatorname{Im}(R_0(\lambda \pm i0)u, u) = 0 $$

According to eqn [26], this implies that

$$ \hat{u}(\xi) = 0 \quad\text{for}\quad |\xi| = \lambda^{1/2} \qquad [30] $$

It follows from eqn [29] that

$$ \psi = R_0(\lambda \pm i0) u \qquad [31] $$

that is, $\hat{\psi}(\xi) = (|\xi|^2 - \lambda \mp i0)^{-1} \hat{u}(\xi)$, is a formal (because of the singularity of the denominator) solution of Schrödinger equation [5]. Therefore, one needs only to verify that $\psi \in L_2(\mathbb{R}^d)$. Since $u \in L_2^{(l)}$, where $l = \rho/2$, this is a direct consequence of [25] and [30] if $\rho > 2$. In the general case, one uses that under assumption [30] the function $(|\xi|^2 - \lambda)^{-1} \hat{u}(\xi)$ belongs to the space $L_2^{(p)}$ for any $p < l - 1$. By virtue of condition [4] where $\rho > 1$, eqn [29] now shows that actually $u \in L_2^{(p)}$ for any $p < l + \rho - 1$. Repeating these arguments, we obtain, after $n$ steps, that $u \in L_2^{(p)}$ for any $p < l + n(\rho - 1)$. For $n$ large enough, this implies that $u \in L_2^{(p)}$ for $p > 1$, and consequently function [31] belongs to $L_2(\mathbb{R}^d)$.

Similar arguments show that eigenvalues of $H$ have finite multiplicity and do not have positive accumulation points. For the proof of boundedness of the set of eigenvalues, one uses additionally the estimate

$$ \| \mathcal{R}_0(\lambda \pm i0) \| = O(\lambda^{-1/2}), \qquad \lambda \to \infty \qquad [32] $$

Actually, according to the Kato theorem the Schrödinger operator $H$ does not have positive eigenvalues.

There exists also a purely time-dependent approach, the Enss method (see Perry (1983)), which relies on an advanced study of the free evolution operator $\exp(-iH_0 t)$.

The Scattering Matrix

The operator $H_0 = -\Delta$ can of course be diagonalized by the classical Fourier transform. To put it slightly differently, set

$$ (F_0 f)(\lambda) = \Gamma_0(\lambda) f $$

where the operator $\Gamma_0(\lambda)$ is defined by eqn [16]. Then

$$ F_0 : L_2(\mathbb{R}^d) \to L_2(\mathbb{R}_+; \mathfrak{N}), \qquad \mathfrak{N} = L_2(\mathbb{S}^{d-1}) $$

is a unitary operator and $(F_0 H_0 f)(\lambda) = \lambda (F_0 f)(\lambda)$. Under assumption [4] where $\rho > 1$, the scattering operator $S$ for pair [13] is defined by eqn [11]. It is unitary on the space $\mathcal{H} = L_2(\mathbb{R}^d)$ and commutes with the operator $H_0$. It follows that $(F_0 S f)(\lambda) = S(\lambda)(F_0 f)(\lambda)$, $\lambda > 0$, where the unitary operator $S(\lambda) : \mathfrak{N} \to \mathfrak{N}$ is known as the scattering matrix. The scattering matrix $S(\lambda)$ for the pair $H_0$, $H$ can be computed in terms of the scattering amplitude. Namely, $S(\lambda)$ acts in the space $L_2(\mathbb{S}^{d-1})$, and $S(\lambda) - I$ is the integral operator whose kernel is the scattering amplitude. More precisely,

$$ (S(\lambda) f)(\theta) = f(\theta) + 2i\lambda^{1/2}\, \overline{\gamma_d(\lambda)} \int_{\mathbb{S}^{d-1}} a(\theta, \omega; \lambda)\, f(\omega)\, d\omega $$

In operator notation, this representation can be rewritten as

$$ S(\lambda) = I - 2\pi i\, \Gamma_0(\lambda) \bigl( V - V R(\lambda + i0) V \bigr) \Gamma_0^*(\lambda) \qquad [33] $$

The right-hand side here is correctly defined as a bounded operator in the space $\mathfrak{N}$ and is continuous in $\lambda > 0$. Moreover, the operator $S(\lambda) - I$ is compact since $\Gamma_0(\lambda)\langle x\rangle^{-l} : \mathcal{H} \to \mathfrak{N}$ is compact for $l > 1/2$ by virtue of the Sobolev trace theorem.

It follows that the spectrum of the operator $S(\lambda)$ consists of eigenvalues of finite multiplicity, except possibly the point 1, lying on the unit circle and accumulating at the point 1 only. In the general case, eigenvalues of $S(\lambda)$ play the role of scattering phases or shifts considered often for radial potentials $V(x) = V(|x|)$.

The scattering amplitude is singular on the diagonal $\theta = \omega$ only. Moreover, this singularity is weaker for potentials with faster decay at infinity (for $\rho$ bigger). If $\rho > (d + 1)/2$, then the operator $S(\lambda) - I$ belongs to the Hilbert–Schmidt class. In this case the total scattering cross section

$$ \sigma(\omega; \lambda) = \int_{\mathbb{S}^{d-1}} |a(\theta, \omega; \lambda)|^2\, d\theta $$

is finite for all energies $\lambda > 0$ and all incident directions $\omega \in \mathbb{S}^{d-1}$. If $\rho > d$, then the operator $S(\lambda) - I$ belongs to the trace class. In this case, the scattering amplitude $a(\theta, \omega; \lambda)$ is a continuous function of $\theta, \omega \in \mathbb{S}^{d-1}$ (and $\lambda > 0$). The unitarity of the operator $S(\lambda)$ implies the optical theorem

$$ \sigma(\omega; \lambda) = \lambda^{-1/2} \operatorname{Im}\bigl( \gamma_d^{-1}(\lambda)\, a(\omega, \omega; \lambda) \bigr) $$
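The optical theorem can be checked numerically in the simplest setting. The sketch below is illustrative only and rests on assumptions that are not part of this article: it takes $d = 3$ (so that $\gamma_3(\lambda) = (4\pi)^{-1}$ and the theorem reads $\sigma = 4\pi k^{-1} \operatorname{Im} a(\omega, \omega; \lambda)$ with $k = \lambda^{1/2}$), a radial potential, and the standard textbook partial-wave form of the amplitude, $a = k^{-1} \sum_l (2l+1) e^{i\delta_l} \sin\delta_l\, P_l(\cos\vartheta)$. The phase shifts, the wave number and the function names are hypothetical values chosen for the demonstration, not the output of any real potential.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval


def amplitude(cos_angle, k, deltas):
    """Partial-wave scattering amplitude in d = 3 (assumed textbook form)."""
    l = np.arange(len(deltas))
    coeffs = (2 * l + 1) * np.exp(1j * deltas) * np.sin(deltas) / k
    # Evaluate the Legendre series separately for real and imaginary parts.
    return legval(cos_angle, coeffs.real) + 1j * legval(cos_angle, coeffs.imag)


def total_cross_section(k, deltas):
    """Integrate |a|^2 over the unit sphere with Gauss-Legendre quadrature."""
    # |a|^2 is a polynomial of degree 2*(len(deltas)-1) in cos(angle),
    # so len(deltas)+1 nodes integrate it exactly (up to rounding).
    nodes, weights = leggauss(len(deltas) + 1)
    return 2 * np.pi * np.sum(weights * np.abs(amplitude(nodes, k, deltas)) ** 2)


# Hypothetical wave number and phase shifts, for illustration only.
k = 1.7
deltas = np.array([0.9, -0.4, 0.15, 0.02])

sigma = total_cross_section(k, deltas)                     # left-hand side
optical = 4 * np.pi / k * amplitude(1.0, k, deltas).imag   # right-hand side
print(sigma, optical)
```

The two printed numbers agree to rounding error for any choice of real phase shifts, which reflects the fact that real $\delta_l$ correspond to a unitary scattering matrix with eigenvalues $e^{2i\delta_l}$.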
Using resolvent identity [14], one deduces from eqn [33] the Born expansion

$$ S(\lambda) = I - 2\pi i \sum_{n=0}^{\infty} (-1)^n\, \Gamma_0(\lambda) V \bigl( R_0(\lambda + i0) V \bigr)^n \Gamma_0^*(\lambda) $$

This series is norm-convergent for small potentials $V$ and according to estimate [32] for high energies $\lambda$.

Long-Range Interactions

Potentials decaying at infinity as the Coulomb potential

$$ V(x) = \gamma |x|^{-1}, \qquad d \ge 3 $$

or slower are called long-range. More precisely, it is required that

$$ |\partial^{\alpha} V(x)| \le C (1 + |x|)^{-\rho - |\alpha|}, \qquad \rho \in (0, 1] \qquad [34] $$

for all derivatives of $V$ up to some order. In the long-range case, the wave operators $W_\pm(H, H_0)$ do not exist, and the asymptotic dynamics should be properly modified. It can be done in a time-dependent way either in the coordinate or momentum representations. For example, in the coordinate representation, the free evolution $\exp(-iH_0 t)$ should be replaced in definition [8] of wave operators by unitary operators $U_0(t)$ defined by

$$ (U_0(t) f)(x) = \exp(i\Xi(x, t))\, (2it)^{-d/2}\, \hat{f}(x/(2t)) $$

where $\hat{f}$ is the Fourier transform of $f$. For short-range potentials we can set $\Xi(x, t) = (4t)^{-1}|x|^2$. In the long-range case the phase function $\Xi(x, t)$ should be chosen as a (perhaps, approximate) solution of the eikonal equation

$$ \partial\Xi/\partial t + |\nabla\Xi|^2 + V = 0 $$

In particular, we can set

$$ \Xi(x, t) = (4t)^{-1}|x|^2 - t \int_0^1 V(sx)\, ds $$

if $\rho > 1/2$ in [34]. For the Coulomb potential,

$$ \Xi(x, t) = (4t)^{-1}|x|^2 - \gamma\, t\, |x|^{-1} \ln|t| $$

(the singularity at $x = 0$ is inessential here). Thus, both in short- and long-range cases solutions of the time-dependent Schrödinger equation "live" in a region of the configuration space where $|x|$ is of order $|t|$. Long-range potentials change only asymptotic phases of these solutions.

Another possibility is a time-independent modification in the phase space. Let us consider wave operators $W_\pm(H, H_0; J)$, where $J$ is a pseudodifferential operator,

$$ (Jf)(x) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{i\langle x, \xi\rangle}\, e^{i\Phi(x, \xi)}\, \zeta(x, \xi)\, \hat{f}(\xi)\, d\xi $$

with oscillating symbol $\exp(i\Phi(x, \xi))\zeta(x, \xi)$. Due to the conservation of energy, we may suppose that $\zeta(x, \xi)$ contains a factor $\psi(|\xi|^2)$ with $\psi \in C_0^\infty(0, \infty)$. Set

$$ \varphi(x, \xi) = \langle x, \xi\rangle + \Phi(x, \xi) $$

The perturbation $HJ - JH_0$ is also a pseudodifferential operator, and its symbol is short-range (it is $O(|x|^{-1-\varepsilon})$, $\varepsilon > 0$, as $|x| \to \infty$) if $\exp(i\varphi(x, \xi))$ is an approximate eigenfunction of the operator $H$ corresponding to the "eigenvalue" $|\xi|^2$. This leads to the eikonal equation

$$ |\nabla_x \varphi(x, \xi)|^2 + V(x) = |\xi|^2 $$

The notorious difficulty (for $d \ge 2$) of this method is that the eikonal equation does not have (even approximate) solutions such that $\nabla_x \Phi(x, \xi) \to 0$ as $|x| \to \infty$ and the arising error term is short-range. However, it is easy to construct functions $\varphi = \varphi_\pm$ satisfying these conditions if a conical neighborhood of the direction $\mp\xi$ is removed from $\mathbb{R}^d$. For example,

$$ \Phi_\pm(x, \xi) = \pm 2^{-1} \int_0^\infty \bigl( V(x \pm \tau\xi) - V(\pm\tau\xi) \bigr)\, d\tau $$

if $\rho > 1/2$ in eqn [34]. Then the cutoff function $\zeta(x, \xi) = \zeta_\pm(x, \xi)$ should be homogeneous of order zero in the variable $x$ and it should be equal to zero in a neighborhood of the direction $\mp\xi$. We emphasize that now we have a couple of different identifications $J = J_\pm$.

The long-range problem is essentially more difficult than the short-range one. The limiting absorption principle remains true in this case, but its proof cannot be performed within perturbation theory. The simplest proof relies on the Mourre estimate (see Cycon et al. (1987)) for the commutator $i[H, A]$ of $H$ with the generator of dilations

$$ A = -i \sum_{j=1}^{d} (x_j \partial_j + \partial_j x_j) $$

The Mourre estimate affirms that, for all $\lambda > 0$,

$$ i E(\Lambda_\lambda) [H, A] E(\Lambda_\lambda) \ge c(\lambda) E(\Lambda_\lambda), \qquad c(\lambda) > 0 \qquad [35] $$

if $\Lambda_\lambda = (\lambda - \varepsilon, \lambda + \varepsilon)$ and $\varepsilon$ is small enough. For the free operator $H_0$, this estimate takes the form $i[H_0, A] = 4H_0$ and can be regarded as a commutation relation. Estimate [35] means that the observable

$$ (A e^{-iHt} f, e^{-iHt} f) $$

is a strictly increasing function of t for all f 2 H(ac) . W (H, H0 ; J ) and W (H0 , H; J 


) exist. These
The H-smoothness of the operator hxil , l > 1=2, is operators are isometric since the operators J
deduced from this fact by some arguments of are in some sense close to unitary operators.

abstract nature (they do not really use concrete The isometricity of W (H0 , H; J ) is equivalent to
forms of the operators H and A). the completeness of W (H, H0 ; J ).
However, the limiting absorption principle is not Although the modified wave operators enjoy
sufficient for construction of scattering theory in the basically the same properties as in the short-range
long-range case, and it should be supplemented by case, properties of the scattering matrices in the
an additional estimate. To formulate it, denote by short- and long-range cases are drastically different.
Here we note only that for long-range potentials,
ðr? uÞðxÞ ¼ ðruÞðxÞ  jxj2 hðruÞðxÞ; xix due to a wild diagonal singularity of kernel of the
the orthogonal projection of a vector (ru)(x) on the scattering matrix, its spectrum covers the whole unit
plane orthogonal to x. Then the operator circle.
K = hxi1=2 r? is H-smooth on any compact X  Different aspects of long-range scattering are
(0, 1). This result is formulated as an estimate discussed in Dereziński and Gérard (1997), Pearson
(either on the resolvent or on the unitary group of (1988), Saitō (1979), and Yafaev (2000).
H), which we refer to as the radiation estimate. This
estimate is not very astonishing from the viewpoint See also: N-Particle Quantum Scattering; Quantum
of analogy with the classical mechanics. Indeed, in Dynamical Semigroups; Random Matrix Theory in
Physics; Scattering in Relativistic Quantum Field Theory:
the case of free motion, the vector x(t) of the
Fundamental Concepts and Tools; Schrödinger
position of a particle is directed asymptotically as its Operators; Spectral Theory for Linear Operators.
momentum . Regarded as a pseudodifferential
operator, r? has symbol   jxj2 h, xix, which
equals zero if x =   for some  2 R. Thus, r?
removes the part of the phase space where a classical Further Reading
particle propagates. The proof of the radiation Amrein WO, Jauch JM, and Sinha KB (1977) Scattering Theory in
estimate is based on the inequality Quantum Mechanics. New York: Benjamin.
Cycon H, Froese R, Kirsh W, and Simon B (1987) Schrödinger
K K  C0 ½H; @r  þ C1 hxi1 ; @r ¼ @=@jxj Operators, Texts and Monographs in Physics. Berlin:
Springer.
which can be obtained by a direct calculation. Since Dereziński J and Gérard C (1997) Scattering Theory of Classical
the integral and Quantum N Particle Systems. Berlin: Springer.
Z t Faddeev LD (1976) Inverse problem of quantum scattering theory
i ð½H; @r eiHs f ; eiHs Þf Þ ds II. J. Soviet. Math. 5: 334–396.
0 Friedrichs K (1965) Perturbation of Spectra in Hilbert Space.
Providence, RI: American Mathematical Society.
¼ ð@r eiHt f ; eiHt f Þ  ð@r f ; f Þ Kuroda ST (1978) An Introduction to Scattering Theory, Lecture
Notes Series No. 51. Aarhus University.
is bounded by C(X)kf k2 for f 2 E(X)f and Landau LD and Lifshitz EM (1965) Quantum Mechanics.
the operator hxi(1þ)=2 is H-smooth on X, this Pergamon.
implies H-smoothness of the operator KE(X). Pearson D (1988) Quantum Scattering and Spectral Theory.
Calculating the perturbation HJ  J H0 , we see London: Academic Press.
Perry PA (1983) Scattering Theory by the Enss Method. vol. 1.
that it is a sum of two pseudodifferential operators.
Harwood: Math. Reports.
The first of them is short-range and thus can be Reed M and Simon B (1979) Methods of Modern Mathematical
taken into account by the limiting absorption Physics III. New York: Academic Press.
principle. The symbol of the second one contains Saitō Y (1979) Spectral Representation for Schrödinger Operators
first derivatives (in the variable x) of the cutoff with Long-Range Potentials. Springer Lecture Notes in Math,
vol. 727.
function
 (x, ) and hence decreases at infinity as
Yafaev DR (1992) Mathematical Scattering Theory. Providence,
jxj1 only. This operator factorizes into a product of RI: American Mathematical Society.
H0 - and H-smooth operators according to the Yafaev DR (2000) Scattering Theory: Some Old and New
radiation estimate. Thus, all wave operators Problems, Springer Lecture Notes in Math., vol. 1735.
Quantum Mechanics: Foundations

R Penrose, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

The Framework of Quantum Mechanics

In 1900, Max Planck initiated the quantum revolution by presenting the hypothesis that radiation is emitted or absorbed only in “quanta,” each of energy $h\nu$, for frequency $\nu$ (where $h$ was a new fundamental constant of Nature). By this device, he explained the precise shape of the puzzling black-body spectrum. Then, in 1905, Albert Einstein introduced the concept of the photon, according to which light of frequency $\nu$ would, in appropriate circumstances, behave as though it were constituted as individual particles, each of energy $h\nu$, rather than as continuous waves, and he was able to explain the conundrum posed by the photoelectric effect by this means. Later, in 1923, Prince Louis de Broglie proposed that, conversely, all particles behave like waves, the energy being Planck's $h\nu$ and the momentum being $h\lambda^{-1}$, where $\lambda$ is the wavelength, which was later strikingly confirmed in a famous experiment of Davisson and Germer in 1927. Some years earlier, in 1913, Niels Bohr had used another aspect of this curious quantum “discreteness,” explaining the stable electron orbits in hydrogen by the assumption that (orbital) angular momentum must be quantized in units of $\hbar$ ($= h/2\pi$).

All this provided a very remarkable collection of facts and concepts, albeit somewhat disjointed, explaining a variety of previously baffling physical phenomena, where a certain discreteness seemed to be entering Nature at a fundamental level, where previously there had been continuity, and where there was an overriding theme of a confusion as to whether – or in what circumstances – waves or particles provide better pictures of reality. Moreover, no clear and consistent picture of an actual “quantum-level reality” as yet seemed to arise out of all this. Then, in 1925, Heisenberg introduced his “matrix mechanics,” subsequently developed into a more complete theory by Born, Heisenberg and Jordan, and then more fully by Dirac. Some six months after Heisenberg, in 1926, Schrödinger introduced his very different-looking “wave mechanics,” which he subsequently showed was equivalent to Heisenberg's scheme. These became encompassed into a comprehensive framework through the transformation theory of Dirac, which he put together in his famous book The Principles of Quantum Mechanics, first published in 1930. Later, von Neumann set the framework on a more rigorous basis in his 1932 book, Mathematische Grundlagen der Quantenmechanik (later translated as Mathematical Foundations of Quantum Mechanics, 1955).

This formalism, now well known to physicists, is based on the presence of a quantum state $|\psi\rangle$ (Dirac's “ket” notation being adopted here). In Schrödinger's description, $|\psi\rangle$ is to evolve by unitary evolution, according to the Schrödinger equation

\[ i\hbar\,\frac{\partial|\psi\rangle}{\partial t} = \mathcal{H}|\psi\rangle \]

where $\mathcal{H}$ is the quantum Hamiltonian. The totality of allowable states $|\psi\rangle$ constitutes a Hilbert space $\mathbf{H}$ and the Schrödinger equation provides a continuous one-parameter family of unitary transformations of $\mathbf{H}$. The letter U is used here for the “quantum-level” evolution whereby the state $|\psi\rangle$ evolves in time according to this unitary Schrödinger evolution. However, we must be careful not to demand an interpretation of this evolution similar to that which we adopt for a classical theory, such as is provided by Maxwell's equations for the electromagnetic field. In Maxwell's theory, the evolution that his equations provide is accepted as very closely mirroring the actual way in which a physically real electromagnetic field evolves with time. In quantum mechanics, however, it is a highly contentious matter how we should regard the “reality” of the unitarily evolving state $|\psi\rangle$.

One of the key difficulties resides in the fact that the world that we actually observe about us rather blatantly does not accord with such a unitarily evolving $|\psi\rangle$. Indeed, the standard way that the quantum formalism is to be interpreted is very far from the mere following of such a picture. So long as no “measurement” is deemed to have been taking place, this U-evolution procedure would be adopted, but upon measurement, the state is taken to behave in a very different way, namely to “jump” instantaneously to some eigenstate $|\phi\rangle$ of the quantum operator $Q$ which is taken to represent the measurement, with probability given by the Born rule

\[ |\langle\phi|\psi\rangle|^2 \]

if we assume that both $|\psi\rangle$ and $|\phi\rangle$ are normalized ($\langle\psi|\psi\rangle = 1 = \langle\phi|\phi\rangle$); otherwise we can express this probability simply as

\[ \frac{\langle\phi|\psi\rangle\langle\psi|\phi\rangle}{\langle\psi|\psi\rangle\langle\phi|\phi\rangle} \]

(The operator $Q$ is normally taken to be self-adjoint, so that $Q^{*} = Q$ and its eigenvalues are real, but more
generally complex eigenvalues are accommodated if we allow $Q$ to be normal, that is, $Q^{*}Q = QQ^{*}$. In each case we require the eigenvectors of $Q$ to span the Hilbert space $\mathbf{H}$.) This “evolution procedure” of the quantum state is very different from U, owing both to its discontinuity and its indeterminacy. The letter R will be used for this, standing for the “reduction” of the quantum state (sometimes referred to as the “collapse of the wave function”). This strange hybrid, whereby U and R are alternated, with U holding between measurements and R holding at measurements, is the standard procedure that is pragmatically adopted in conventional quantum mechanics, and which works so marvelously well, with no known discrepancy between the theory and observation. (In the classic account of von Neumann (1932, 1955), “R” is referred to as “process I” and “U” as “process II.”) However, there appears to be no consensus whatever about the relation between this mathematical procedure and what is “really” going on in the physical world. This is the kind of issue that will be of concern to us here.

Quantum Reality

The discussion here will be given only in the Schrödinger picture, for the reason that the issues appear to be clearer with this description. In the Heisenberg picture, the state $|\psi\rangle$ does not evolve in time, and all dynamics is taken up in the time evolution of the dynamical variables. But this evolution does not refer to the evolution of specific systems, the “state” of any particular system being defined to remain constant in time. Since the Schrödinger and Heisenberg pictures are deemed to be equivalent (at least for the “normal” systems that are under consideration here), we do not lose anything substantial by sticking to Schrödinger's description, whereas there does seem to be a significant gain in understanding of what the formalism is actually telling us.

There are, however, many different attitudes that are expressed as to the “reality” of $|\psi\rangle$. (There is an unfortunate possibility of confusion here in the two uses of the word “real” that come into the discussion. In the quantum formalism, the state is mathematically a “complex” rather than a “real” entity, whereas our present concern is not directly to do with this, but with the “ontology” of the quantum description.) According to what is commonly regarded as the standard – “Copenhagen” – interpretation of quantum mechanics (due primarily to Bohr, Heisenberg, and Pauli), the quantum state $|\psi\rangle$ is not taken as a description of a quantum-level reality at all, but merely as a description of the observer's knowledge of the quantum system under consideration. According to this view, the “jumping” that the quantum state undergoes is regarded as unsurprising, since it does not represent a sudden change in the reality of the situation, but merely in the observer's knowledge, as new information becomes available, when the result of some measurement becomes known to the observer. According to this view, there is no objective quantum reality described by $|\psi\rangle$. Whether or not there might be some objective quantum-level reality with some other mathematical description seems to be left open by this viewpoint, but the impression given is that there might well not be any such quantum-level reality at all, in the sense that it becomes meaningless to ask for a description of “actual reality” at quantum-relevant scales.

Of course some connection with the real world is necessary, in order that the quantum formalism can relate to the results of experiment. In the Copenhagen viewpoint, the experimenter's measuring apparatus is taken to be a classical-level entity, which can be ascribed a real ontological status. When the Geiger counter “clicks” or when the pointer “points” to some position on a dial, or when the track in the cloud chamber “becomes visible” – these are taken to be real events. The intervening description in terms of a quantum state vector $|\psi\rangle$ is not ascribed a reality. The role of $|\psi\rangle$ is merely to provide a calculational procedure whereby the different outcomes of an experiment can be assigned probabilities. Reality comes about only when the result of the measurement is manifested, not before.

A difficulty with this viewpoint is that it is hard to draw a clear line between those entities which are considered to have an actual reality, such as the experimental apparatus or a human observer, and the elemental constituents of those entities, which are such things as electrons or protons or neutrons or quarks, which are to be treated quantum mechanically and therefore, on the “Copenhagen” view, their mathematical descriptions are denied such an honored ontological status. Moreover, there is no limit to the number of particles that can partake in a quantum state. According to current quantum mechanics, the most accurate mathematical procedure for describing a system with a large number of particles would indeed be to use a unitarily evolving quantum state. What reasons can be presented for or against the viewpoint that this gives us a reasonable description of an actual reality? Can our perceived reality arise as some kind of statistical limit when very large numbers of constituents are involved?

Before entering into the more subtle and contentious issues of the nature of “quantum reality,” it
is appropriate that one of the very basic mathematical aspects of the quantum formalism be addressed first. It is an accepted aspect of the quantum formalism that a state-vector such as $|\psi\rangle$ should not, in any case, be thought of as providing a unique mathematical description of a “physical reality” for the simple reason that $|\psi\rangle$ and $z|\psi\rangle$, where $z$ is any nonzero complex number, describe precisely the same physical situation. It is a common, but not really necessary, practice to demand that $|\psi\rangle$ be normalized to unity: $\langle\psi|\psi\rangle = 1$, in which case the freedom in $|\psi\rangle$ is reduced to the multiplication by a phase factor $|\psi\rangle \mapsto e^{i\theta}|\psi\rangle$. Either way, the physically distinguishable states constitute a projective Hilbert space $P\mathbf{H}$, where each point of $P\mathbf{H}$ corresponds to a one-dimensional linear subspace of the Hilbert space $\mathbf{H}$. The issue, therefore, is whether quantum reality can be described in terms of the points of a projective Hilbert space $P\mathbf{H}$.

Reality in Spin-1/2 Systems

As a general comment, for systems with a small number of degrees of freedom – that is, for a Hilbert space $\mathbf{H}^n$ of small finite dimension $n$ – it seems more reasonable to assign a reality to the elements of $P\mathbf{H}^n$ than is the case when $n$ is large. Let us begin with a particularly simple case, where $n = 2$, and $\mathbf{H}^2$ describes the two-dimensional space of spin states of a massive particle of spin 1/2, such as an electron, proton, or quark, or suitable atom. Here we can take, as an orthonormal pair of basis states, $|\uparrow\rangle$ and $|\downarrow\rangle$, representing right-handed spin about the “up” and “down” directions, respectively. Clearly there is nothing special about these particular directions, so any other state of spin, of direction $\nearrow$ say, is just as “real” as the original two. Indeed, we always find

\[ |\nearrow\rangle = w|\uparrow\rangle + z|\downarrow\rangle \]

for some pair of complex numbers $z$ and $w$ (not both zero). The different possible ratios $z : w$ give us a complex plane (of $zw^{-1}$) compactified by a point at infinity (where $w = 0$) – a “Riemann sphere” – which is a realization of the complex projective 1-space $P\mathbf{H}^2$.

There does indeed seem to be something “real” about the spin state of such a spin-1/2 particle or atom. We might imagine preparing the spin of a suitable spin-1/2 atom using a Stern–Gerlach apparatus (see Introductory Article: Quantum Mechanics) oriented in some chosen direction. The atom seems to “know” the direction of its spin, because if we measure it again in the same direction it has to be prepared to give us the answer “YES,” to the second measurement, with certainty, and that direction for its spin state is the only one that can guarantee this answer. (We are, of course, considering only “ideal” measurements, for the purpose of argument.) Moreover, we could imagine that between the two measurements, some appropriate magnetic field had been introduced so as to rotate the spin direction in some very specific way, so that the spin state is now some other direction such as $|\rightarrow\rangle$. By rotating our second Stern–Gerlach apparatus to agree with this new direction, we must again get certainty for the YES answer, the guaranteeing of this by the rotated state seeming now to give a “reality” to this new state $|\rightarrow\rangle$. The quantum formalism does not allow us to ascertain an unknown direction of spin. But it does allow for us to “confirm” (or “refute”) a proposed direction for the spin state, in the sense that if the proposed direction is incorrect, then there is a nonzero probability of refutation. Only the correct direction can be guaranteed to give the YES answer.

EPR–Bohm and Bell's Theorem

For a pair of particles or atoms of spin 1/2, the issue of the “reality” of spin states becomes less clear. Consider, for example, the EPR–Bohm example (where “EPR” stands for Einstein–Podolsky–Rosen) whereby an initial state of spin 0 decays into two spin-1/2 atoms, traveling in opposite directions (east E, and west W). If a suitable Stern–Gerlach apparatus is set up to measure the spin of the atom at E, finding an answer $|\rightarrow\rangle$, say, then this immediately ensures that the state at W is the oppositely pointing $|\leftarrow\rangle$, which can subsequently be “confirmed” by measurement at W. This, then, seems to provide a “reality” for the spin state $|\leftarrow\rangle$ at W as soon as the E measurement has been performed, but not before.

Now, let us suppose that some orientation different from $\leftarrow$ had actually been set up for the measurement at W, namely that which would have given YES for the direction $\nearrow$. This measurement can certainly give the answer YES upon encountering $|\leftarrow\rangle$ (with a certain nonzero probability, namely $(1 + \cos\theta)/2$, where $\theta$ is the angle between $\leftarrow$ and $\nearrow$). So far, this provides us with no problem with the “reality” of the spin state of the atom at W, since it would have been $|\leftarrow\rangle$ before the measurement at W and would have “collapsed” (by the R-process) to $|\nearrow\rangle$ after the measurement. But now suppose that the measurement at W had actually been performed momentarily before the measurement at E, rather than just after it. Then there is no reason that the W-measurement would encounter $|\leftarrow\rangle$, rather than some other direction, but the result $|\nearrow\rangle$ of the measurement at W now seems to force the state at E to be $|\swarrow\rangle$. Indeed, the two measurements, at E and
at W, might have been spacelike separated, and because of the requirements of special relativity there would be no meaning to say which of the two measurements – at E or at W – had “actually” occurred first. One seems to obtain a different picture of “reality” depending on this ordering.

In fact, the calculations of probabilities come out the same whichever picture is used, so if one asks only for a calculational procedure for the probabilities, rather than an actual picture of quantum reality, these considerations are not problematic. But they do provide profound difficulties for any view of quantum reality that is entirely local. The difficulty is made particularly clear in a theorem due to John Bell (1964, 1966a, b) which showed that on the basis of the assumptions of local realism, there are particular relations (inequalities) between the conditional probabilities, which must hold in any situation of this kind; moreover, these inequalities can be violated in various situations in standard quantum mechanics. (See, most specifically, Clauser et al. (1969).) Several experiments that were subsequently performed (notably Aspect et al. (1982)) confirmed the expectations of quantum mechanics, thereby presenting profound difficulties for any local realistic model of the world. There are also situations of this kind which involve only yes/no questions, so that actual probabilities do not need to be considered; see Kochen and Specker (1967), Peres (1991), Hardy (1993), Conway and Kochen (2002). Basically: if one insists on realism, then one must give up locality. Moreover, nonlocal realistic models, consistent with the requirements of special relativity, are not easy to construct (see Quantum Mechanics: Generalizations), and have so far proved elusive.

Other Aspects of Quantum Nonlocality

Problems of this kind occur even at the more elementary level of single particles, if one tries to consider that an ordinary particle wave function (position-space description of $|\psi\rangle$) might be just some kind of “local disturbance,” like an ordinary classical wave. Consider the wave function spreading out from a localized source, to be detected at a perpendicular screen some distance away. The detection of the particle at any one place on the screen immediately forbids the detection of that particle at any other place on the screen, and if we are to think of this information as being transmitted as a classical signal to all other places on the screen, then we are confronted with problems of superluminal communication. Again, any “realistic” picture of this process would require nonlocal ingredients, which are difficult to square with the requirements of special relativity. (It is possible that these difficulties might be resolved within some kind of nonlocal geometry, such as that supplied by twistor theory (see Twistors; Twistor Theory: Some Applications); see, particularly, Penrose (2005).)

These types of issues are made even more dramatic and problematic in the procedure of “quantum teleportation,” whereby the information in a quantum state (e.g., the unknown actual direction $\nearrow$ in some quantum state $|\nearrow\rangle$) can be transported from one experimenter A to another one B, merely by sending a small finite number of classical bits of information from A to B, where before this classical information is transmitted, A and B must each be in possession of one member of an EPR pair. More explicitly, we may suppose A (Alice) is presented with a spin-1/2 state $|\nearrow\rangle$, but is not told the direction $\nearrow$. She has in her possession another spin-1/2 state which is an EPR–Bohm partner of a spin-1/2 state in the possession of B (Bob). She combines this $|\nearrow\rangle$ with her EPR atom and then performs a measurement which distinguishes the four orthogonal “Bell states”

0: $|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle$
1: $|\uparrow\rangle|\uparrow\rangle - |\downarrow\rangle|\downarrow\rangle$
2: $|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle$
3: $|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle$

where the first state in each product refers to her unknown state and the second refers to her EPR atom. The result of this measurement is conveyed to Bob by an ordinary classical signal, coded by the indicated numbers 0, 1, 2, 3. On receiving Alice's message, Bob takes the other member of the EPR pair and performs the following rotation on it:

0: leave alone
1: 180° about x-axis
2: 180° about y-axis
3: 180° about z-axis

This achieves the successful “teleporting” of $|\nearrow\rangle$ from A to B, despite the fact that only 2 bits of classical information have been signaled. It is the acausal EPR–Bohm connection that provides the transmission of “quantum information” in a classically acausal way. Again, we see the essentially nonlocal (or acausal) nature of any attempted “realistic” picture of quantum phenomena. It may be regarded as inappropriate to use the term “information” for something that is propagated acausally and cannot be directly used for signaling. It has been suggested, accordingly, that a term such as “quanglement” might be more appropriate to use for this concept; see Penrose (2002, 2004).
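
The teleportation procedure described above can be checked numerically. The following is a minimal sketch, not part of the original article, under three stated assumptions: the shared EPR–Bohm pair is the spin-0 singlet, each of the four listed Bell states carries a normalization factor $1/\sqrt{2}$, and a 180° rotation about axis $n$ acts on a spin-1/2 state as $\exp(-i\pi\sigma_n/2) = -i\sigma_n$. The names `BELL`, `ROT`, `teleport`, and `fidelity` are illustrative only.

```python
import numpy as np

# Basis states: spin up and spin down along the z-axis.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Pauli matrices.
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

s = 1 / np.sqrt(2)  # assumed normalization of each Bell state

# The four Bell states, numbered as in the text (first factor: unknown
# state, second factor: Alice's EPR atom).
BELL = {
    0: s * (np.kron(up, down) - np.kron(down, up)),
    1: s * (np.kron(up, up) - np.kron(down, down)),
    2: s * (np.kron(up, up) + np.kron(down, down)),
    3: s * (np.kron(up, down) + np.kron(down, up)),
}

# Bob's corrections: a 180-degree rotation about axis n is taken to be
# exp(-i*pi*sigma_n/2) = -i*sigma_n (assumption; overall phase is irrelevant).
ROT = {0: I2, 1: -1j * sx, 2: -1j * sy, 3: -1j * sz}


def teleport(phi, outcome):
    """Return (probability of Alice's outcome, Bob's corrected state).

    phi     : normalized 2-component state handed to Alice
    outcome : 0, 1, 2 or 3, the Bell state Alice finds
    """
    singlet = BELL[0]                      # assumed shared EPR-Bohm pair
    total = np.kron(phi, singlet)          # order: unknown, Alice's atom, Bob's atom
    # Project the first two spins onto the chosen Bell state; what remains
    # is Bob's (unnormalized) state.
    amp = BELL[outcome].conj() @ total.reshape(4, 2)
    prob = np.vdot(amp, amp).real
    bob = ROT[outcome] @ (amp / np.sqrt(prob))
    return prob, bob


def fidelity(a, b):
    """Squared overlap |<a|b>|^2 of two normalized states."""
    return abs(np.vdot(a, b)) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    phi = rng.normal(size=2) + 1j * rng.normal(size=2)
    phi /= np.linalg.norm(phi)
    for k in range(4):
        p, bob = teleport(phi, k)
        print(k, round(p, 6), round(fidelity(bob, phi), 6))
```

In this sketch each of Alice's four outcomes occurs with equal probability, so her two classical bits carry no information about the direction of the unknown state; only after Bob applies the indicated rotation does his atom agree with the original state up to an overall phase.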
The preceding arguments illustrate how quantum systems involving even just a few particles can exhibit features quite unlike the ordinary behavior of classical particles. This was pointed out by Schrödinger (1935), and he referred to this key property of composite quantum systems as “entanglement.” An entangled quantum state (vector) is an element of a product Hilbert space $\mathbf{H}^m \otimes \mathbf{H}^n$ which cannot be written as a tensor product of elements $|\psi\rangle|\phi\rangle$, with $|\psi\rangle \in \mathbf{H}^m$ and $|\phi\rangle \in \mathbf{H}^n$, where $\mathbf{H}^m$ refers to one part of the system and $\mathbf{H}^n$ refers to another part, usually taken to be physically widely separated from the first. EPR systems are a clear example, and we begin to see very nonclassical, effectively nonlocal behavior with entangled systems generally. A puzzling aspect of this is that the vast majority of states are indeed entangled, and the more parts that a system has, the more entangled it becomes (where the generalization of this notion to more than two parts is evident). One might have expected that “big” quantum systems with large numbers of parts ought to behave more and more like classical systems when they get larger and more complicated. However, we see that this is very far from being the case. There is no good reason why a large quantum system, left on its own to evolve simply according to U, should actually resemble a classical system, except in very special circumstances. Something of the nature of the R process seems to be needed in order that classical behavior can “emerge.”

Schrödinger's Cat

To clarify the nature of the problem we must consider a key feature of the U formalism, namely “linearity,” which is supposed to hold no matter how large or complicated is the quantum system under consideration. Recall the quantum superposition principle, which allows us to construct arbitrary combinations of states

\[ |\psi\rangle = w|\chi\rangle + z|\phi\rangle \]

from two given states $|\chi\rangle$ and $|\phi\rangle$. Quantum linearity tells us that if

\[ |\chi\rangle \rightsquigarrow |\chi'\rangle \quad\text{and}\quad |\phi\rangle \rightsquigarrow |\phi'\rangle \]

where the symbol “$\rightsquigarrow$” expresses how a state will have evolved after a specified time period $T$, then

\[ |\psi\rangle = w|\chi\rangle + z|\phi\rangle \;\rightsquigarrow\; |\psi'\rangle = w|\chi'\rangle + z|\phi'\rangle \]

Let us now consider how this might be applied in a particular, rather outlandish situation. Let us suppose that the $|\chi\rangle$-evolution consists of a photon going in one direction, encountering a detector, which is connected to some murderous device which kills a cat. The $|\phi\rangle$-evolution, on the other hand, consists of the photon going in some other direction, missing the detector so that the murderous device is not activated, and the cat is left alive. These two alternatives would each be perfectly plausible evolutions which might take place in the physical world. Now, by use of a beam splitter (effectively a “half-silvered mirror”) we can easily arrange for the initial state of the photon to be the superposition $w|\chi\rangle + z|\phi\rangle$ of the two. Then by quantum linearity we find, as the final result, the superposed state $w|\chi'\rangle + z|\phi'\rangle$, in which the cat is in a superposition of life and death (a “Schrödinger's cat”).

We note that the two individual final states $|\chi'\rangle$ and $|\phi'\rangle$ would each involve not just the cat but also its environment, fully entangled with the cat's state, and perhaps also some human observer looking at the cat. In the latter case, $|\chi'\rangle$ would involve the observer in a state of unhappily perceiving a dead cat, and $|\phi'\rangle$ happily perceiving a live one. Two of the “conventional standpoints” with regard to the measurement problem are of relevance here. According to the standpoint of environmental decoherence, the details of the environmental degrees of freedom are completely inaccessible, and it is deemed to be appropriate to construct a density matrix to describe the situation, which is a partial trace $D$ of the quantity $|\psi\rangle\langle\psi|$, constructed by tracing out over all the environmental degrees of freedom:

\[ D = \text{trace over environment}\,\{|\psi\rangle\langle\psi|\} \]

The density matrix tends to be regarded as a more appropriate quantity than the ket $|\psi\rangle$ to represent the physical situation, although this represents something of an “ontology shift” from the point of view that was being held previously. Under appropriate assumptions, $D$ may now be shown to attain a form that is close to being diagonal in a basis with respect to which the cat is either dead or alive, and then, by a second “ontology shift,” $D$ is re-read as describing a probability mixture of these two states.

According to the second “conventional standpoint” under consideration here, it is not logical to take this detour through a density-matrix description, and instead one should maintain a consistent ontology by following the evolution of the state $|\psi\rangle$ itself throughout. The “real” resulting physical state is then taken to be actually $|\psi'\rangle$, which involves the superposition of a dead and live cat. Of course this “reality” does not agree with the reality that we actually perceive, so the position is taken that a conscious mind would not actually be able to function in such a superposed condition, and would have to settle into a state of perception of either a dead cat or a live one, these two alternatives occurring with probabilities as given by the Born rule stated above. It may be argued that this conclusion depends
upon some appropriate theory of how conscious minds actually perceive things, and this appears to be lacking.

A good many physicists might argue that none of these attempts at resolution of the measurement problem is satisfactory, including “Copenhagen,” although the latter at least has the advantage of offering a pragmatic, if not fully logical, stance. Such physicists might take the position that it is necessary to move away from the precise version of quantum theory that we have at present, and turn to one of its modifications. Some major candidates for modification are discussed in Quantum Mechanics: Generalizations. Most of these actually make predictions that, at some stage, would differ from those of standard quantum mechanics. So it becomes an experimental matter to ascertain the plausibility of these schemes. In addition, there are reinterpretations which do not change quantum theory's predictions, such as the de Broglie–Bohm model. In this, there are two levels of “reality,” a firmer one with a particle or position-space ontology, and a secondary one containing waves which guide the behavior at the firmer level. It is clear, however, that these issues will remain the subject of debate for many years to come.

See also: Functional Integration in Quantum Physics; Normal Forms and Semiclassical Approximation; Quantum Mechanics: Generalizations; Twistor Theory: Some Applications [In Integrable Systems, Complex Geometry and String Theory]; Twistors.

Further Reading

Aspect A, Grangier P, and Roger G (1982) Experimental realization of Einstein–Podolsky–Rosen–Bohm Gedankenexperiment: a new violation of Bell's inequalities. Physical Review Letters 49: 91–94.
Bell JS (1964) On the Einstein Podolsky Rosen paradox. Physics 1: 195–200. Reprinted in Quantum Theory and Measurement, eds., Wheeler JA and Zurek WH (Princeton Univ. Press, Princeton, 1983).
Bell JS (1966a) On the problem of hidden variables in quantum theory. Reviews of Modern Physics 38: 447–452.
Bell JS (1966b) Speakable and Unspeakable in Quantum Mechanics. Cambridge: Cambridge University Press. Reprint 1987.
Clauser JF, Horne MA, Shimony A, and Holt RA (1969) Proposed experiment to test local hidden-variable theories. Physical Review Letters 23: 880–884.
Conway J and Kochen S (2002) The geometry of the quantum paradoxes. In: Bertlmann RA and Zeilinger A (eds.) Quantum [Un]speakables: From Bell to Quantum Information, Ch. 18 (ISBN 3-540-42756-2). Berlin: Springer.
Hardy L (1993) Nonlocality for two particles without inequalities for almost all entangled states. Physical Review Letters 71(11): 1665.
Kochen S and Specker EP (1967) Journal of Mathematics and Mechanics 17: 59.
Penrose R (2002) John Bell, state reduction, and quanglement. In: Bertlmann RA and Zeilinger A (eds.) Quantum [Un]speakables: From Bell to Quantum Information, pp. 319–331. Berlin: Springer.
Penrose R (2004) The Road to Reality: A Complete Guide to the Laws of the Universe. London: Jonathan Cape.
Penrose R (2005) The twistor approach to space-time structures. In: Ashtekar A (ed.) 100 Years of Relativity; Space-Time Structure: Einstein and Beyond. Singapore: World Scientific.
Peres A (1991) Two simple proofs of the Kochen–Specker theorem. Journal of Physics A: Mathematical and General 24: L175–L178.
Schrödinger E (1935) Probability relations between separated systems. Proceedings of the Cambridge Philosophical Society 31: 555–563.
von Neumann J (1932) Mathematische Grundlagen der Quantenmechanik. Berlin: Springer.
von Neumann J (1955) Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.

Quantum Mechanics: Generalizations


P Pearle, Hamilton College, Clinton, NY, USA
A Valentini, Perimeter Institute for Theoretical Physics, Waterloo, ON, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

According to the so-called “Copenhagen Interpretation,” standard quantum theory is limited to describing experimental situations. It is at once remarkably successful in its predictions, and remarkably ill-defined in its conceptual structure: what is an experiment? what physical objects do or do not require quantization? how are the states realized in nature to be characterized? how and when is the wave-function “collapse postulate” to be invoked? Because of its success, one may suspect that quantum theory can be promoted from a theory of measurement to a theory of reality. But, that requires there to be an unambiguous specification (S) of the possible real states of nature and their probabilities of being realized.

There are several approaches that attempt to achieve S. The more conservative approaches (e.g., consistent histories, environmental decoherence, many worlds) do not produce any predictions that differ from the standard ones because they do not tamper with the usual basic mathematical

formalism. Rather, they utilize structures compatible with standard quantum theory to elucidate S. These approaches, which will not be discussed in this article, have arguably been less successful so far at achieving S than approaches that introduce significant alterations to quantum theory.

This article will largely deal with the two most well-developed realistic models that reproduce quantum theory in some limit and yield potentially new and testable physics outside that limit. First, the pilot-wave model, which will be discussed in the broader context of “hidden-variables theories.” Second, the continuous spontaneous localization (CSL) model, which describes wave-function collapse as a physical process. Other related models will also be discussed briefly.

Due to bibliographic space limitations, this article contains a number of uncited references, of the form “[author] in [year].” Those in the next section can be found in Valentini (2002b, 2004a,b) or at www.arxiv.org. Those in the subsequent sections can be found in Adler (2004), Bassi and Ghirardi (2003), Pearle (1999) (or in subsequent papers by these authors, or directly, at www.arxiv.org), and in Wallstrom (1994).

Hidden Variables and Quantum Nonequilibrium

A deterministic hidden-variables theory defines a mapping $\omega = \omega(M, \lambda)$ from initial hidden parameters $\lambda$ (defined, e.g., at the time of preparation of a quantum state) to final outcomes $\omega$ of quantum measurements. The mapping depends on macroscopic experimental settings $M$, and fixes the outcome for each run of the experiment. Bell's theorem of 1964 shows that, for entangled quantum states of widely separated systems, the mapping must be nonlocal: some outcomes for (at least) one system must depend on the setting for another distant system.

In a viable theory, the statistics of quantum measurement outcomes – over an ensemble of experimental trials with fixed settings $M$ – will agree with quantum theory for some special distribution $\rho_{QT}(\lambda)$ of hidden variables. For example, expectation values will coincide with the predictions of the Born rule

\[ \langle\omega\rangle_{QT} \equiv \int d\lambda\, \rho_{QT}(\lambda)\, \omega(M, \lambda) = \mathrm{tr}(\hat\rho\,\hat\Omega) \]

for an appropriate density operator $\hat\rho$ and Hermitian observable $\hat\Omega$. (As is customary in this context, $\int d\lambda$ is to be understood as a generalized sum.)

However, given the mapping $\omega = \omega(M, \lambda)$ for individual trials, one may, in principle, consider nonstandard distributions $\rho(\lambda) \neq \rho_{QT}(\lambda)$ that yield statistics outside the domain of ordinary quantum theory (Valentini 1991, 2002a). We may say that such distributions correspond to a state of quantum nonequilibrium.

Quantum nonequilibrium is characterized by the breakdown of a number of basic quantum constraints. In particular, nonlocal signals appear at the statistical level. We shall first illustrate this for the hidden-variables model of de Broglie and Bohm. Then we shall generalize the discussion to all (deterministic) hidden-variables theories.

At present there is no experimental evidence for quantum nonequilibrium in nature. However, from a hidden-variables perspective, it is natural to explore the theoretical properties of nonequilibrium distributions, and to search experimentally for the statistical anomalies associated with them.

From this point of view, quantum theory is a special case of a wider physics, much as thermal physics is a special case of a wider (nonequilibrium) physics. (The special distribution $\rho_{QT}(\lambda)$ is analogous to, say, Maxwell's distribution of molecular speeds.) Quantum physics may be compared with the physics of global thermal equilibrium, which is characterized by constraints – such as the impossibility of converting heat into work (in the absence of temperature differences) – that are not fundamental but contingent on the state. Similarly, quantum constraints such as statistical locality (the impossibility of converting entanglement into a practical signal) are seen as contingencies of $\rho_{QT}(\lambda)$.

Pilot-Wave Theory

The de Broglie–Bohm “pilot-wave theory” – as it was originally called by de Broglie, who first presented it at the Fifth Solvay Congress in 1927 – is the classic example of a deterministic hidden-variables theory of broad scope (Bohm 1952, Bell 1987, Holland 1993). We shall use it to illustrate the above ideas. Later, the discussion will be generalized to arbitrary theories.

In pilot-wave dynamics, an individual closed system with (configuration-space) wave function $\Psi(X, t)$ satisfying the Schrödinger equation

\[ i\hbar\,\frac{\partial\Psi}{\partial t} = \hat H\Psi \qquad [1] \]

has an actual configuration $X(t)$ with velocity

\[ \dot X(t) = \frac{J(X, t)}{|\Psi(X, t)|^2} \qquad [2] \]

where $J = J[\Psi] = J(X, t)$ satisfies the continuity equation

\[ \frac{\partial|\Psi|^2}{\partial t} + \nabla\cdot J = 0 \qquad [3] \]

(which follows from [1]). In quantum theory, $J$ is the “probability current.” In pilot-wave theory, $\Psi$ is an objective physical field (on configuration space) guiding the motion of an individual system.

Here, the objective state (or ontology) for a closed system is given by $\Psi$ and $X$. A probability distribution for $X$ – discussed below – completes an unambiguous specification S (as mentioned in the introduction).

Pilot-wave dynamics may be applied to any quantum system with a locally conserved current in configuration space. Thus, $X$ may represent a many-body system, or the configuration of a continuous field, or perhaps some other entity.

For example, at low energies, for a system of $N$ particles with positions $x_i(t)$ and masses $m_i$ ($i = 1, 2, \ldots, N$), with an external potential $V$, [1] (with $X \equiv (x_1, x_2, \ldots, x_N)$) reads

\[ i\hbar\,\frac{\partial\Psi}{\partial t} = \sum_{i=1}^{N} -\frac{\hbar^2}{2m_i}\nabla_i^2\Psi + V\Psi \qquad [4] \]

while [2] has components

\[ \frac{dx_i}{dt} = \frac{\hbar}{m_i}\,\mathrm{Im}\,\frac{\nabla_i\Psi}{\Psi} = \frac{\nabla_i S}{m_i} \qquad [5] \]

(where $\Psi = |\Psi|\,e^{(i/\hbar)S}$).

In general, [1] and [2] determine $X(t)$ for an individual system, given the initial conditions $X(0)$, $\Psi(X, 0)$ at $t = 0$. For an arbitrary initial distribution $P(X, 0)$, over an ensemble with the same wave function $\Psi(X, 0)$, the evolution $P(X, t)$ of the distribution is given by the continuity equation

\[ \frac{\partial P}{\partial t} + \nabla\cdot(P\dot X) = 0 \qquad [6] \]

The outcome of an experiment is determined by $X(0)$, $\Psi(X, 0)$, which may be identified with $\lambda$. For an ensemble with the same $\Psi(X, 0)$, we have $\lambda = X(0)$.

Quantum equilibrium From [3] and [6], if we assume $P(X, 0) = |\Psi(X, 0)|^2$ at $t = 0$, we obtain $P(X, t) = |\Psi(X, t)|^2$ – the Born-rule distribution of configurations – at all times $t$.

Quantum measurements are, like any other process, described and explained in terms of evolving configurations. For measurement devices whose pointer readings reduce to configurations, the distribution of outcomes of quantum measurements will match the statistical predictions of quantum theory (Bohm 1952, Bell 1987, Dürr et al. 2003). Thus, quantum theory emerges phenomenologically for a “quantum equilibrium” ensemble with distribution $P(X, t) = |\Psi(X, t)|^2$ (or $\rho(\lambda) = \rho_{QT}(\lambda)$).

Quantum nonequilibrium In principle, as we saw for general hidden-variables theories, we may consider a nonequilibrium distribution $P(X, 0) \neq |\Psi(X, 0)|^2$ of initial configurations while retaining the same deterministic dynamics [1], [2] for individual systems (Valentini 1991). The time evolution of $P(X, t)$ will be determined by [6].

As we shall see, in appropriate circumstances (with a sufficiently complicated velocity field $\dot X$), [6] generates relaxation $P \to |\Psi|^2$ on a coarse-grained level, much as the analogous classical evolution on phase space generates thermal relaxation. But for as long as the ensemble is in nonequilibrium, the statistics of outcomes of quantum measurements will disagree with quantum theory.

Quantum nonequilibrium may have existed in the very early universe, with relaxation to equilibrium occurring soon after the big bang. Thus, a hidden-variables analog of the classical thermodynamic “heat death of the universe” may have actually taken place (Valentini 1991). Even so, relic cosmological particles that decoupled sufficiently early could still be in nonequilibrium today, as suggested by Valentini in 1996 and 2001. It has also been speculated that nonequilibrium could be generated in systems entangled with degrees of freedom behind a black-hole event horizon (Valentini 2004a).

Experimental searches for nonequilibrium have been proposed. Nonequilibrium could be detected by the statistical analysis of random samples of particles taken from a parent population of (for example) relics from the early universe. Once the parent distribution is known, the rest of the population could be used as a resource, to perform tasks that are currently impossible (Valentini 2002b).

H-Theorem: Relaxation to Equilibrium

Before discussing the potential uses of nonequilibrium, we should first explain why all systems probed so far have been found in the equilibrium state $P = |\Psi|^2$. This distribution may be accounted for along the lines of classical statistical mechanics, noting that all currently accessible systems have had a long and violent astrophysical history.

Dividing configuration space into small cells, and introducing coarse-grained quantities $\bar P$, $\overline{|\Psi|^2}$, a general argument for relaxation $\bar P \to \overline{|\Psi|^2}$ is based on an

analog of the classical coarse-graining H-theorem. The coarse-grained H-function

\[ \bar H = \int dX\, \bar P \ln\!\left(\bar P / \overline{|\Psi|^2}\right) \qquad [7] \]

(minus the relative entropy of $\bar P$ with respect to $\overline{|\Psi|^2}$) obeys the H-theorem (Valentini 1991)

\[ \bar H(t) \le \bar H(0) \]

(assuming no initial fine-grained microstructure in $P$ and $|\Psi|^2$). Here, $\bar H \ge 0$ for all $\bar P$, $\overline{|\Psi|^2}$ and $\bar H = 0$ if and only if $\bar P = \overline{|\Psi|^2}$ everywhere.

The H-theorem expresses the fact that $P$ and $|\Psi|^2$ behave like two “fluids” that are “stirred” by the same velocity field $\dot X$, so that $P$ and $|\Psi|^2$ tend to become indistinguishable on a coarse-grained level. Like its classical analog, the theorem provides a general understanding of how equilibrium is approached, while not proving that equilibrium is actually reached. (And of course, for some simple systems – such as a particle in the ground state of a box, for which the velocity field $\nabla S/m$ vanishes – there is no relaxation at all.) A strict decrease of $\bar H(t)$ immediately after $t = 0$ is guaranteed if $\dot X_0 \cdot \nabla(P_0/|\Psi_0|^2)$ has nonzero spatial variance over a coarse-graining cell, as shown by Valentini in 1992 and 2001.

A relaxation timescale $\tau$ may be defined by $1/\tau^2 \equiv -(d^2\bar H/dt^2)_0/\bar H_0$. For a single particle with quantum energy spread $\Delta E$, a crude estimate given by Valentini in 2001 yields $\tau \sim (1/\varepsilon)\,\hbar^2/m^{1/2}(\Delta E)^{3/2}$, where $\varepsilon$ is the coarse-graining length. For wave functions that are superpositions of many energy eigenfunctions, the velocity field (generally) varies rapidly, and detailed numerical simulations (in two dimensions) show that relaxation occurs with an approximately exponential decay $\bar H(t) \approx \bar H_0\, e^{-t/t_c}$, with a time constant $t_c$ of order $\tau$ (Valentini and Westman 2005).

Equilibrium is then to be expected for particles emerging from the violence of the big bang. The possibility is still open that relics from very early times may not have reached equilibrium before decoupling.

Nonlocal Signaling

We now show how nonequilibrium, if it were ever discovered, could be used for nonlocal signaling.

Pilot-wave dynamics is nonlocal. For a pair of particles A, B with entangled wave function $\Psi(x_A, x_B, t)$, the velocity $\dot x_A(t) = \nabla_A S(x_A, x_B, t)/m_A$ of A depends instantaneously on $x_B$, and local operations at B – such as switching on a potential – instantaneously affect the motion of A. For an ensemble $P(x_A, x_B, t) = |\Psi(x_A, x_B, t)|^2$, local operations at B have no statistical effect at A: the individual nonlocal effects vanish upon averaging over an equilibrium ensemble.

Nonlocality is (generally) hidden by statistical noise only in quantum equilibrium. If instead $P(x_A, x_B, 0) \neq |\Psi(x_A, x_B, 0)|^2$, a local change in the Hamiltonian at B generally induces an instantaneous change in the marginal $p_A(x_A, t) \equiv \int d^3x_B\, P(x_A, x_B, t)$ at A. For example, in one dimension a sudden change $\hat H_B \to \hat H'_B$ in the Hamiltonian at B induces a change $\Delta p_A \equiv p_A(x_A, t) - p_A(x_A, 0)$ (for small $t$) (Valentini 1991),

\[ \Delta p_A = -\frac{t^2}{4m}\,\frac{\partial}{\partial x_A}\left( a(x_A) \int dx_B\, b(x_B)\, \frac{P(x_A, x_B, 0) - |\Psi(x_A, x_B, 0)|^2}{|\Psi(x_A, x_B, 0)|^2} \right) \qquad [8] \]

(Here $m_A = m_B = m$, $a(x_A)$ depends on $\Psi(x_A, x_B, 0)$, while $b(x_B)$ also depends on $\hat H'_B$ and vanishes if $\hat H'_B = \hat H_B$.) The signal is generally nonzero if $P_0 \neq |\Psi_0|^2$.

Nonlocal signals do not lead to causal paradoxes if, at the hidden-variable level, there is a preferred foliation of spacetime with a time parameter that defines a fundamental causal sequence. Such signals, if they were observed, would define an absolute simultaneity as discussed by Valentini in 1992 and 2005. Note that in pilot-wave field theory, Lorentz invariance emerges as a phenomenological symmetry of the equilibrium state, conditional on the structure of the field-theoretical Hamiltonian (as discussed by Bohm and Hiley in 1984, Bohm, Hiley and Kaloyerou in 1987, and Valentini in 1992 and 1996).

Subquantum Measurement

In principle, nonequilibrium particles could also be used to perform “subquantum measurements” on ordinary, equilibrium systems. We illustrate this with an exactly solvable one-dimensional model (Valentini 2002b).

Consider an apparatus “pointer” coordinate $y$, with known wave function $g_0(y)$ and known (ensemble) distribution $\pi_0(y) \neq |g_0(y)|^2$, where $\pi_0(y)$ has been deduced by statistical analysis of random samples from a parent population with known wave function $g_0(y)$. (We assume that relaxation may be neglected: for example, if $g_0$ is a box ground state, $\dot y = 0$ and $\pi_0(y)$ is static.) Consider also a “system” coordinate $x$ with known wave function $\psi_0(x)$ and known distribution $\rho_0(x) = |\psi_0(x)|^2$. If $\pi_0(y)$ is arbitrarily narrow, $x_0$ can be measured without

disturbing $\psi_0(x)$, to arbitrary accuracy (violating the uncertainty principle).

To do this, at $t = 0$ we switch on an interaction Hamiltonian $\hat H = a\hat x\hat p_y$, where $a$ is a constant and $p_y$ is canonically conjugate to $y$. For relatively large $a$, we may neglect the Hamiltonians of $x$ and $y$. For $\Psi = \Psi(x, y, t)$, we then have $\partial\Psi/\partial t = -ax\,\partial\Psi/\partial y$. For $|\Psi|^2$ we have the continuity equation $\partial|\Psi|^2/\partial t = -ax\,\partial|\Psi|^2/\partial y$, which implies the hidden-variable velocity fields $\dot x = 0$, $\dot y = ax$ and trajectories $x(t) = x_0$, $y(t) = y_0 + ax_0t$.

The initial product $\Psi_0(x, y) = \psi_0(x)g_0(y)$ evolves into $\Psi(x, y, t) = \psi_0(x)g_0(y - axt)$. For $at \to 0$ (with $a$ large but fixed), $\Psi(x, y, t) \to \psi_0(x)g_0(y)$ and $\psi_0(x)$ is undisturbed: for small $at$, a standard quantum pointer with the coordinate $y$ would yield negligible information about $x_0$. Yet, for arbitrarily small $at$, the hidden-variable pointer coordinate $y(t) = y_0 + ax_0t$ does contain complete information about $x_0$ (and $x(t) = x_0$). This “subquantum” information will be visible to us if $\pi_0(y)$ is sufficiently narrow.

For, over an ensemble of similar experiments, with initial joint distribution $P_0(x, y) = |\psi_0(x)|^2\pi_0(y)$ (equilibrium for $x$ and nonequilibrium for $y$), the continuity equation $\partial P/\partial t = -ax\,\partial P/\partial y$ implies that $P(x, y, t) = |\psi_0(x)|^2\pi_0(y - axt)$. If $\pi_0(y)$ is localized around $y = 0$ ($\pi_0(y) = 0$ for $|y| > w/2$), then a standard (faithful) measurement of $y$ with result $y_{meas}$ will imply that $x$ lies in the interval $(y_{meas}/at - w/2at,\; y_{meas}/at + w/2at)$ (so that $P(x, y, t) \neq 0$). Taking the simultaneous limits $at \to 0$, $w \to 0$, with $w/at \to 0$, the midpoint $y_{meas}/at \to x_0$ (since $y_{meas} = y_0 + ax_0t$ and $|y_0| \le w/2$), while the error $w/2at \to 0$.

If $w$ is arbitrarily small, a sequence of such measurements will determine the hidden trajectory $x(t)$ without disturbing $\psi(x, t)$, to arbitrary accuracy.

Subquantum Information and Computation

From a hidden-variables perspective, immense physical resources are hidden from us by equilibrium statistical noise. Quantum nonequilibrium would probably be as useful technologically as thermal or chemical nonequilibrium.

Distinguishing nonorthogonal states In quantum theory, nonorthogonal states $|\psi_1\rangle$, $|\psi_2\rangle$ ($\langle\psi_1|\psi_2\rangle \neq 0$) cannot be distinguished without disturbing them. This theorem breaks down in quantum nonequilibrium (Valentini 2002b). For example, if $|\psi_1\rangle$, $|\psi_2\rangle$ are distinct states of a single spinless particle, then the associated de Broglie–Bohm velocity fields will in general be different, even if $\langle\psi_1|\psi_2\rangle \neq 0$, and so will the hidden-variable trajectories. Subquantum measurement of the trajectories could then distinguish the states $|\psi_1\rangle$, $|\psi_2\rangle$.

Breaking quantum cryptography The security of standard protocols for quantum key distribution depends on the validity of the laws of quantum theory. These protocols would become insecure given the availability of nonequilibrium systems (Valentini 2002b).

The protocols known as BB84 and B92 depend on the impossibility of distinguishing nonorthogonal quantum states without disturbing them. An eavesdropper in possession of nonequilibrium particles could distinguish the nonorthogonal states being transmitted between two parties, and so read the supposedly secret key. Further, if subquantum measurements allow an eavesdropper to predict quantum measurement outcomes at each “wing” of a (bipartite) entangled state, then the EPR (Einstein–Podolsky–Rosen) protocol also becomes insecure.

Subquantum computation It has been suggested that nonequilibrium physics would be computationally more powerful than quantum theory, because of the ability to distinguish nonorthogonal states (Valentini 2002b). However, this ability depends on the (less-than-quantum) dispersion $w$ of the nonequilibrium ensemble. A well-defined model of computational complexity requires that the resources be quantified in some way. Here, a key question is how the required $w$ scales with the size of the computational task. So far, no rigorous results are known.

Extension to All Deterministic Hidden-Variables Theories

Let us now discuss arbitrary (deterministic) theories.

Nonlocal signaling Consider a pair of two-state quantum systems A and B, which are widely separated and in the singlet state. Quantum measurements of observables $\hat\sigma_A \equiv \mathbf{m}_A\cdot\hat{\boldsymbol\sigma}_A$, $\hat\sigma_B \equiv \mathbf{m}_B\cdot\hat{\boldsymbol\sigma}_B$ (where $\mathbf{m}_A$, $\mathbf{m}_B$ are unit vectors in Bloch space and $\hat{\boldsymbol\sigma}_A$, $\hat{\boldsymbol\sigma}_B$ are Pauli spin operators) yield outcomes $\sigma_A, \sigma_B = \pm 1$, in the ratio 1 : 1 at each wing, with a correlation $\langle\hat\sigma_A\hat\sigma_B\rangle = -\mathbf{m}_A\cdot\mathbf{m}_B$. Bell's theorem shows that for a hidden-variables theory to reproduce this correlation – upon averaging over an equilibrium ensemble with distribution $\rho_{QT}(\lambda)$ – it must take the nonlocal form

\[ \sigma_A = \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda), \qquad \sigma_B = \sigma_B(\mathbf{m}_A, \mathbf{m}_B, \lambda) \qquad [9] \]

More precisely, to obtain $\langle\sigma_A\sigma_B\rangle_{QT} = -\mathbf{m}_A\cdot\mathbf{m}_B$ (where $\langle\sigma_A\sigma_B\rangle_{QT} \equiv \int d\lambda\,\rho_{QT}(\lambda)\,\sigma_A\sigma_B$), at least one of
$\sigma_A$, $\sigma_B$ must depend on the measurement setting at the distant wing. Without loss of generality, we assume that $\sigma_A$ depends on $\mathbf{m}_B$.

For an arbitrary nonequilibrium ensemble with distribution $\rho(\lambda) \neq \rho_{QT}(\lambda)$, in general $\langle\sigma_A\sigma_B\rangle \equiv \int d\lambda\,\rho(\lambda)\,\sigma_A\sigma_B$ differs from $-\mathbf{m}_A\cdot\mathbf{m}_B$, and the outcomes $\sigma_A, \sigma_B = \pm 1$ occur in a ratio different from 1 : 1. Further, a change of setting $\mathbf{m}_B \to \mathbf{m}'_B$ at B will generally induce a change in the outcome statistics at A, yielding a nonlocal signal at the statistical level. To see this, note that, in a nonlocal theory, the “transition sets”

\[ T_A(-,+) \equiv \{\lambda \mid \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda) = -1,\; \sigma_A(\mathbf{m}_A, \mathbf{m}'_B, \lambda) = +1\} \]
\[ T_A(+,-) \equiv \{\lambda \mid \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda) = +1,\; \sigma_A(\mathbf{m}_A, \mathbf{m}'_B, \lambda) = -1\} \]

cannot be empty for arbitrary settings. Yet, in quantum equilibrium, the outcomes $\sigma_A = \pm 1$ occur in the ratio 1 : 1 for all settings, so the transition sets must have equal equilibrium measure, $\mu_{QT}[T_A(-,+)] = \mu_{QT}[T_A(+,-)]$ ($d\mu_{QT} \equiv \rho_{QT}(\lambda)\,d\lambda$). That is, the fraction of the equilibrium ensemble making the transition $\sigma_A = -1 \to \sigma_A = +1$ under $\mathbf{m}_B \to \mathbf{m}'_B$ must equal the fraction making the reverse transition $\sigma_A = +1 \to \sigma_A = -1$. (This “detailed balancing” is analogous to the principle of detailed balance in statistical mechanics.) Since $T_A(-,+)$, $T_A(+,-)$ are fixed by the deterministic mapping, they are independent of the ensemble distribution $\rho(\lambda)$. Thus, for $\rho(\lambda) \neq \rho_{QT}(\lambda)$, in general $\mu[T_A(-,+)] \neq \mu[T_A(+,-)]$ ($d\mu \equiv \rho(\lambda)\,d\lambda$): the fraction of the nonequilibrium ensemble making the transition $\sigma_A = -1 \to \sigma_A = +1$ will not in general balance the fraction making the reverse transition. The outcome ratio at A will then change under $\mathbf{m}_B \to \mathbf{m}'_B$ and there will be an instantaneous signal at the statistical level from B to A (Valentini 2002a).

Thus, in any deterministic hidden-variables theory, nonequilibrium distributions $\rho(\lambda) \neq \rho_{QT}(\lambda)$ generally allow entanglement to be used for nonlocal signalling (just as, in ordinary statistical physics, differences of temperature make it possible to convert heat into work).

Experimental signature of nonequilibrium Quantum expectations are additive, $\langle c_1\hat\Omega_1 + c_2\hat\Omega_2\rangle = c_1\langle\hat\Omega_1\rangle + c_2\langle\hat\Omega_2\rangle$, even for noncommuting observables ($[\hat\Omega_1, \hat\Omega_2] \neq 0$, with $c_1$, $c_2$ real). As emphasized by Bell in 1966, this seemingly trivial consequence of the (linearity of the) Born rule $\langle\hat\Omega\rangle = \mathrm{tr}(\hat\rho\,\hat\Omega)$ is remarkable because it relates statistics from distinct, “incompatible” experiments. In nonequilibrium, such additivity generically breaks down (Valentini 2004b).

Further, for a two-state system with observables $\mathbf{m}\cdot\hat{\boldsymbol\sigma}$, the “dot-product” structure of the quantum expectation $\langle\mathbf{m}\cdot\hat{\boldsymbol\sigma}\rangle = \mathrm{tr}(\hat\rho\,\mathbf{m}\cdot\hat{\boldsymbol\sigma}) = \mathbf{m}\cdot\mathbf{P}$ (for some Bloch vector $\mathbf{P}$) is equivalent to expectation additivity (Valentini 2004b). Nonadditive expectations then provide a convenient signature of nonequilibrium for any two-state system. For example, the sinusoidal modulation of the quantum transmission probability for a single photon through a polarizer

\[ p^{+}_{QT}(\Theta) = \tfrac{1}{2}\left(1 + \langle\mathbf{m}\cdot\hat{\boldsymbol\sigma}\rangle\right) = \tfrac{1}{2}\left(1 + P\cos 2\Theta\right) \qquad [10] \]

(where an angle $\theta$ on the Bloch sphere corresponds to a physical angle $\Theta = \theta/2$) will generically break down in nonequilibrium. Deviations from [10] would provide an unambiguous violation of quantum theory (Valentini 2004b).

Such deviations were searched for by Papaliolios in 1967, using laboratory photons and successive polarization measurements over very short times, to test a hidden-variables theory (distinct from pilot-wave theory) due to Bohm and Bub (1966), in which quantum measurements generate nonequilibrium for short times. Experimentally, successive measurements over timescales $\sim 10^{-13}$ s agreed with the (quantum) sinusoidal modulation $\cos^2\Theta$ to $\lesssim 1\%$. Similar tests might be performed with photons of a more exotic origin.

Continuous Spontaneous Localization Model (CSL)

The basic postulate of CSL is that the state vector $|\psi, t\rangle$ represents reality. Since, for example, in describing a measurement, the usual Schrödinger evolution readily takes a real state into a nonreal state, that is, into a superposition of real states (such as apparatus states describing different experimental outcomes), CSL requires a modification of Schrödinger's evolution. To the Hamiltonian is added a term which depends upon a classical randomly fluctuating field $w(\mathbf{x}, t)$ and a mass-density operator $\hat A(\mathbf{x}, t)$. This term acts to collapse a superposition of states, which differ in their spatial distribution of mass density, to one of these states. The rate of collapse is very slow for a superposition involving a few particles, but very fast for a superposition of macroscopically different states. Thus, very rapidly, what you see (in nature) is what you get (from the theory). Each state vector evolving under each $w(\mathbf{x}, t)$ corresponds to a realizable state, and a rule is given for how to associate a probability with each. In this way, an

unambiguous specification S, as mentioned in the introduction, is achieved.

Requirements for Stochastic Collapse Dynamics

Consider a normalized state vector $|\psi, t\rangle = \sum_n \alpha_n(t)|a_n\rangle$ ($\langle a_n|a_{n'}\rangle = \delta_{nn'}$) which undergoes a stochastic dynamical collapse process. This means that, starting from the initial superposition at $t = 0$, for each run of the process, the squared amplitudes $x_n(t) \equiv |\alpha_n(t)|^2$ fluctuate until all but one vanish, that is, $x_m(\infty) = 1$, ($x_{\neq m}(\infty) = 0$) with probability $x_m(0)$.

This may be achieved simply, assuming negligible effect of the usual Schrödinger evolution, if the stochastic process enjoys the following properties (Pearle 1979):

\[ \sum_n x_n(t) = 1 \qquad [11a] \]

\[ \overline{x_n(t)} = x_n(0) \qquad [11b] \]

\[ \overline{x_n(\infty)\,x_m(\infty)} = 0 \quad \text{for } m \neq n \qquad [11c] \]

where the overbar indicates the ensemble average at the indicated time. The only way that a sum of products of non-negative terms can vanish is for at least one term in each product to vanish. Thus, according to [11c], for each run, at least one of each pair $\{x_n(\infty), x_m(\infty)\}$ ($n \neq m$) must vanish. This means that at most one $x_n(\infty)$ might not vanish and, by [11a], applied at $t = \infty$, one $x_n(\infty)$ must not vanish and, in fact, must equal 1: hence, each run produces collapse. Now, let the probability of the outcome $\{x_n(\infty) = 1, x_{\neq n}(\infty) = 0\}$ be denoted $P_n$. Since $\overline{x_n(\infty)} = 1\cdot P_n + \sum_{m\neq n} 0\cdot P_m = P_n$ then, according to the Martingale property [11b], applied at $t = \infty$, $P_n = x_n(0)$: hence, the ensemble of runs produces the probability postulated by the usual “collapse rule” of standard quantum theory.

A (nonquantum) stochastic process which obeys these equations is the gambler's ruin game. Suppose one gambler initially possesses the fraction $x_1(0)$ of their joint wealth, and the other has the fraction $x_2(0)$. They toss a coin: heads, a dollar goes from gambler 1 to gambler 2, tails the dollar goes the other way. [11a] is satisfied since the sum of money in the game remains constant, [11b] holds because it is a fair game, and [11c] holds because each game eventually ends. Thus, gambler $i$ wins all the money with probability $x_i(0)$.

CSL in Essence

Consider the (nonunitary) Schrödinger picture evolution equation

\[ |\psi, t\rangle_w = T \exp\left\{ -\int_0^t dt' \left\{ i\hat H + (4\lambda)^{-1}\left[w(t') - 2\lambda\hat A\right]^2 \right\} \right\} |\psi, 0\rangle \qquad [12] \]

where $\hat H$ is the usual Hamiltonian, $w(t')$ is an arbitrary function of white noise class, $\hat A$ is a Hermitian operator ($\hat A|a_n\rangle = a_n|a_n\rangle$), $\lambda$ is a collapse rate parameter, $T$ is the time-ordering operator and $\hbar = 1$. Associated with this, the probability rule

\[ P_t(w)\,Dw \equiv {}_w\langle\psi, t|\psi, t\rangle_w \prod_{j=0}^{t/dt} \frac{dw(t_j)}{(2\pi\lambda/dt)^{1/2}} \qquad [13] \]

is defined, which gives the probability that nature chooses a noise which lies in the range $\{w(t'), w(t') + dw(t')\}$ for $0 \le t' \le t$ (for calculational purposes, time is discretized, with $t_0 = 0$).

Equations [12] and [13] contain the essential features of CSL, and are all that is needed to discuss the simplest collapse behavior. Set $\hat H = 0$, so there is no competition between collapse and the usual Schrödinger evolution, and let the initial state vector be $|\psi, 0\rangle = \sum_n \alpha_n|a_n\rangle$. Equations [12] and [13] become

\[ |\psi, t\rangle_w = \sum_n \alpha_n |a_n\rangle \exp\left\{ -(4\lambda)^{-1} \int_0^t dt' \left[w(t') - 2\lambda a_n\right]^2 \right\} \qquad [14a] \]

\[ P_t(w) = \sum_n |\alpha_n|^2 \exp\left\{ -(2\lambda)^{-1} \int_0^t dt' \left[w(t') - 2\lambda a_n\right]^2 \right\} \qquad [14b] \]

When the unnormalized state vector in [14a] is divided by $P_t^{1/2}(w)$ and so normalized, the squared amplitudes are

\[ x_n(t) = |\alpha_n|^2 \exp\left\{ -(2\lambda)^{-1} \int_0^t dt' \left[w(t') - 2\lambda a_n\right]^2 \right\} \Big/ P_t(w) \]

which are readily shown to satisfy [11a], [11b], and [11c] in the form $\overline{x_n^{1/2}(\infty)\,x_m^{1/2}(\infty)} = 0$ ($m \neq n$) (which does not change the argument in the last subsection, but makes for an easier calculation). Thus, [14a] and [14b] describe collapse dynamics.
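
As a numerical illustration of [14a] and [14b] (not part of the original article), the following Python sketch samples noise histories according to the probability rule [13] with the Hamiltonian set to zero and evaluates the squared amplitudes x_n(t). It relies on one reading of [13] and [14b]: each term of [14b] is a Gaussian weight, so P_t(w)Dw is a mixture in which, with probability |alpha_n|^2, every discretized w(t_j) is normally distributed with mean 2*lambda*a_n and variance lambda/dt. The function name simulate_collapse, the parameter values, and the two-state example are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_collapse(alpha_sq, a, lam=1.0, dt=0.01, steps=2000, runs=4000, seed=0):
    """Sample squared amplitudes x_n(t) of [14] at t = steps * dt (H = 0).

    alpha_sq : initial squared amplitudes |alpha_n|^2 (must sum to 1)
    a        : eigenvalues a_n of the collapse operator
    lam      : collapse rate parameter lambda (illustrative value)
    Returns an array of shape (runs, len(a)), one row of x_n(t) per run.
    """
    rng = np.random.default_rng(seed)
    alpha_sq = np.asarray(alpha_sq, dtype=float)
    a = np.asarray(a, dtype=float)

    # Assumed mixture reading of [13]/[14b]: pick a component n with
    # probability |alpha_n|^2, then draw w(t_j) ~ Normal(2*lam*a_n, lam/dt).
    chosen = rng.choice(len(a), size=runs, p=alpha_sq)

    # Accumulate the exponents -(2*lam)^-1 * sum_j dt * [w(t_j) - 2*lam*a_n]^2.
    log_weight = np.zeros((runs, len(a)))
    for _ in range(steps):
        w = 2.0 * lam * a[chosen] + np.sqrt(lam / dt) * rng.standard_normal(runs)
        log_weight -= dt / (2.0 * lam) * (w[:, None] - 2.0 * lam * a[None, :]) ** 2

    # Subtract the row maximum before exponentiating to avoid underflow;
    # the common factor cancels when dividing by P_t(w).
    log_weight -= log_weight.max(axis=1, keepdims=True)
    x = alpha_sq * np.exp(log_weight)
    return x / x.sum(axis=1, keepdims=True)
```

For example, simulate_collapse([0.3, 0.7], [0.0, 1.0]) returns one row of squared amplitudes per run. Each row should sum to 1 as in [11a], the column averages should stay close to 0.3 and 0.7 as in [11b], and with lambda*t much larger than 1 almost every row should lie close to (1, 0) or (0, 1), which is the collapse behavior required by [11c].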

To describe collapse to a joint eigenstate of a set of mutually commuting operators $\hat A^r$, replace $(4\lambda)^{-1}[w(t') - 2\lambda\hat A]^2$ in the exponent of [12] by $\sum_r (4\lambda)^{-1}[w^r(t') - 2\lambda\hat A^r]^2$. The interaction picture state vector in this case is [12] multiplied by $\exp(i\hat Ht)$:

\[ |\psi, t\rangle_w = T \exp\left\{ -(4\lambda)^{-1} \int_0^t dt' \sum_r \left[w^r(t') - 2\lambda\hat A^r(t')\right]^2 \right\} |\psi, 0\rangle \qquad [15] \]

where $\hat A^r(t') \equiv \exp(i\hat Ht')\,\hat A^r \exp(-i\hat Ht')$. The density matrix follows from [15], and [13]:

\[ \hat\rho(t) \equiv \int P_t(w)\,Dw\, |\psi, t\rangle_w\, {}_w\langle\psi, t| \,/\, P_t(w) = T \exp\left\{ -\frac{\lambda}{2} \int_0^t dt' \sum_r \left[\hat A^r_L(t') - \hat A^r_R(t')\right]^2 \right\} \hat\rho(0) \qquad [16] \]

where $\hat A^r_L(t')$ ($\hat A^r_R(t')$) appears to the left (right) of $\hat\rho(0)$, and is time-ordered (time reverse-ordered). In the example described by [14], the density matrix [16] is

\[ \hat\rho(t) = \sum_{n,m} e^{-(\lambda t/2)(a_n - a_m)^2}\, \alpha_n\alpha_m^{*}\, |a_n\rangle\langle a_m| \]

which encapsulates the ensemble's collapse behavior.

CSL

The CSL proposal (Pearle 1989) is that collapse is engendered by distinctions between states at each point of space, so the index $r$ of $\hat A^r$ in [15] becomes $\mathbf{x}$,

\[ |\psi, t\rangle_w = T \exp\left\{ -(4\lambda)^{-1} \int_0^t dt' \int d\mathbf{x}' \left[w(\mathbf{x}', t') - 2\lambda\hat A(\mathbf{x}', t')\right]^2 \right\} |\psi, 0\rangle \qquad [17] \]

and the distinction looked at is mass density. However, one cannot make the choice $\hat A(\mathbf{x}, 0) = \hat M(\mathbf{x})$, where $\hat M(\mathbf{x}) = \sum_i m_i\, \hat\xi_i^{\dagger}(\mathbf{x})\hat\xi_i(\mathbf{x})$ is the mass-density operator ($m_i$ is the mass of the $i$th type of particle, so $m_e, m_p, m_n, \ldots$ are the masses, respectively, of electrons, protons, neutrons..., and $\hat\xi_i^{\dagger}(\mathbf{x})$ is the creation operator for such a particle at location $\mathbf{x}$), because this entails an infinite rate of energy increase of particles ([23] with $a = 0$). Instead, adapting a “Gaussian smearing” idea from the Ghirardi et al. (1986) spontaneous localization (SL) model (see the subsection “Spontaneous localization model”), choose $\hat A(\mathbf{x})$ as, essentially, proportional to the mass in a sphere of radius $a$ about $\mathbf{x}$:

\[ \hat A(\mathbf{x}, t) \equiv e^{i\hat Ht}\, \frac{1}{(\pi a^2)^{3/4}} \int d\mathbf{z}\, \frac{\hat M(\mathbf{z})}{m_p}\, e^{-(2a^2)^{-1}(\mathbf{x} - \mathbf{z})^2}\, e^{-i\hat Ht} \qquad [18] \]

The parameter value choices of SL, $\lambda \approx 10^{-16}\,\mathrm{s}^{-1}$ (according to [17] and [18], the collapse rate for protons) and $a \approx 10^{-5}$ cm are, so far, consistent with experiment (see the next subsection), and will be adopted here.

The density matrix associated with [17] is, as in [16],

\[ \hat\rho(t) = T \exp\left\{ -\frac{\lambda}{2} \int_0^t dt' \int d\mathbf{x}' \left[\hat A_L(\mathbf{x}', t') - \hat A_R(\mathbf{x}', t')\right]^2 \right\} \hat\rho(0) \qquad [19] \]

which satisfies the differential equation

\[ \frac{d\hat\rho(t)}{dt} = -\frac{\lambda}{2} \int d\mathbf{x}' \left[\hat A(\mathbf{x}', t), \left[\hat A(\mathbf{x}', t), \hat\rho(t)\right]\right] \qquad [20] \]

of Lindblad–Kossakowski form.

Consequences of CSL

Since the state vector dynamics of CSL is different from that of standard quantum theory, there are phenomena for which the two make different predictions, allowing for experimental tests. Consider an $N$-particle system with position operators $\hat{\mathbf{X}}_i$ ($\hat{\mathbf{X}}_i|\mathbf{x}\rangle = \mathbf{x}_i|\mathbf{x}\rangle$). Substitution of $\hat A(\mathbf{x}')$ from [18] in the Schrödinger picture version of [20], integration over $\mathbf{x}'$, and utilization of

\[ f(\mathbf{z})\hat M(\mathbf{z})|\mathbf{x}\rangle = \sum_{i=1}^{N} m_i\, \delta(\mathbf{z} - \hat{\mathbf{X}}_i)\, f(\hat{\mathbf{X}}_i)|\mathbf{x}\rangle \]

results in

\[ \frac{d\hat\rho(t)}{dt} = i[\hat\rho(t), \hat H] - \frac{\lambda}{2} \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{m_i}{m_p}\frac{m_j}{m_p} \left[ e^{-(4a^2)^{-1}(\hat{\mathbf{X}}_{Li} - \hat{\mathbf{X}}_{Lj})^2} + e^{-(4a^2)^{-1}(\hat{\mathbf{X}}_{Ri} - \hat{\mathbf{X}}_{Rj})^2} - 2e^{-(4a^2)^{-1}(\hat{\mathbf{X}}_{Li} - \hat{\mathbf{X}}_{Rj})^2} \right] \hat\rho(t) \qquad [21] \]

which is a useful form for calculations first suggested by Pearle and Squires in 1994.

Interference Consider the collapse rate of an initial state $|\psi\rangle = \alpha_1|1\rangle + \alpha_2|2\rangle$, where $|1\rangle$, $|2\rangle$ describe a

clump of matter, of size $\ll a$, at different locations with separation $\gg a$. Electrons may be neglected because of their small collapse rate compared to the much more massive nucleons, and the nucleon mass difference may be neglected. In using [21] to calculate $d\langle 1|\hat{\rho}(t)|2\rangle/dt$, since $\exp[-(4a^2)^{-1}(\hat{\mathbf{X}}_i - \hat{\mathbf{X}}_j)^2] \approx 1$ when acting on state $|1\rangle$ or $|2\rangle$, and $\approx 0$ when $\hat{\mathbf{X}}_i$ acts on $|1\rangle$ and $\hat{\mathbf{X}}_j$ acts on $|2\rangle$, [21] yields, for $N$ nucleons, the collapse rate $\lambda N^2$:

\[ \frac{d\langle 1|\hat{\rho}(t)|2\rangle}{dt} = i\langle 1|[\hat{\rho}(t),\hat{H}]|2\rangle - \lambda N^2\langle 1|\hat{\rho}(t)|2\rangle \qquad [22] \]

If the clump undergoes a two-slit interference experiment, where the size and separation conditions above are satisfied for a time $T$, and if the result agrees with the standard quantum theory prediction to 1%, it also agrees with CSL provided $\lambda^{-1} > 100N^2T$. So far, interference experiments with $N$ as large as $\approx 10^3$ have been performed, by Nairz, Arndt, and Zeilinger in 2000. The SL value of $\lambda^{-1} \approx 10^{16}$ would be testable, that is, the quantum-predicted interference pattern would be ‘‘washed out’’ to 1% accuracy, if the clump were an $\approx 10^{-6}$ cm radius sphere of mercury, which contains $N \approx 10^{8}$ nucleons, interfered for $T = 0.01$ s. Currently envisioned but not yet performed experiments (e.g., by Marshall, Simon, Penrose, and Bouwmester in 2003) have been analyzed (e.g., by Bassi, Ippoliti, and Adler in 2004 and by Adler in 2005), which involve a superposition of a larger clump of matter in slightly displaced positions, entangled with a photon whose interference pattern is measured: these proposed experiments are still too crude to detect the SL value of $\lambda$, or the gravitationally based collapse rate proposed by Penrose in 1996 (see the next section and papers by Christian in 1999 and 2005).

Bound state excitation. Collapse narrows wave packets, thereby imparting energy to particles. If $\hat{H} = \sum_{i=1}^{N}\hat{\mathbf{P}}_i^2/2m_i + \hat{V}(\mathbf{x}_1,\ldots,\mathbf{x}_N)$, it is straightforward to calculate from [21] that

\[ \frac{d}{dt}\langle\hat{H}\rangle \equiv \frac{d}{dt}\,\mathrm{tr}[\hat{H}\hat{\rho}(t)] = \sum_{i=1}^{N}\frac{3\lambda\hbar^2}{4m_i a^2} \qquad [23] \]

For a nucleon, the mean rate of energy increase is quite small, $\approx 3 \times 10^{-25}$ eV s$^{-1}$. However, deviations from the mean can be significantly greater.
Equation [21] predicts excitation of atoms and nuclei. Let $|E_0\rangle$ be an initial bound energy eigenstate. Expanding [21] in a power series in (bound state size$/a)^2$, the excitation rate of state $|E\rangle$ is

\[ \left.\frac{d\langle E|\hat{\rho}(t)|E\rangle}{dt}\right|_{t=0} = \frac{\lambda}{2a^2}\Big\langle E\Big|\sum_{i=1}^{N}\frac{m_i}{m_p}\hat{\mathbf{X}}_i\Big|E_0\Big\rangle\cdot\Big\langle E_0\Big|\sum_{i=1}^{N}\frac{m_i}{m_p}\hat{\mathbf{X}}_i\Big|E\Big\rangle + O(\text{size}/a)^4 \qquad [24] \]

Since $|E_0\rangle$, $|E\rangle$ are eigenstates of the center-of-mass operator $\sum_{i=1}^{N} m_i\hat{\mathbf{X}}_i/\sum_{i=1}^{N} m_i$ with eigenvalue 0, the dipole contribution explicitly given in [24] vanishes identically. This leaves the quadrupole contribution as the leading term, which is too small to be measured at present.
However, the choice of $\hat{A}(\mathbf{x})$ as mass-density operator was made only after experimental indication. Let $g_i$ replace $m_i/m_p$ in [21] and [24], so that $\lambda g_i^2$ is the collapse rate for the $i$th particle. Then, experiments looking for the radiation expected from the ‘‘spontaneously’’ excited atoms and nuclei, in large amounts of matter for a long time, as shown by Collett, Pearle, Avignone, and Nussinov in 1995, Pearle, Ring, Collar, and Avignone in 1999, and Jones, Pearle, and Ring in 2004, have placed the following limits:

\[ \left|\frac{g_e}{g_p} - \frac{m_e}{m_p}\right| < \frac{12m_e}{m_p}, \qquad \left|\frac{g_n}{g_p} - \frac{m_n}{m_p}\right| < \frac{3(m_n - m_p)}{m_p} \]

Random walk. According to [17] and [13], the center-of-mass wave packet, of a piece of matter of size $a$ or smaller, containing $N$ nucleons, achieves equilibrium size $s$ in a characteristic time $\tau_s$, and undergoes a random walk through a root-mean-square distance $Q$:

\[ s \approx \left[\frac{a^2\hbar}{\lambda m_p N^3}\right]^{1/4}, \qquad \tau_s \approx \frac{N m_p s^2}{\hbar}, \qquad Q \approx \frac{\lambda^{1/2}\hbar\,t^{3/2}}{m_p a} \qquad [25] \]

The results in [25] were obtained by Collett and Pearle in 2003. These quantitative results can be qualitatively understood as follows.
In time $t$, the usual Schrödinger equation expands a wave packet of size $s$ to $\approx s + (\hbar/Nm_p s)t$. CSL collapse, by itself, narrows the wave packet to $\approx s[1 - \lambda N^2(s/a)^2 t]$. The condition of no change in $s$ is the result quoted above. $\tau_s$ is the time it takes the Schrödinger evolution to expand a wave packet near size $s$ to size $\approx s$: $(\hbar/Nm_p s)\tau_s \approx s$.

The $t^{3/2}$ dependence of $Q$ arises because this is a random walk without damping (unlike Brownian motion, where $Q \sim t^{1/2}$). The mean energy increase $\approx \lambda N\hbar^2 m_p^{-1}a^{-2}t$ of [23] implies the root-mean-square velocity increase $\approx [\lambda\hbar^2 m_p^{-2}a^{-2}t]^{1/2}$, whose product with $t$ is $Q$.
For example, a sphere of density 1 g cm$^{-3}$ and radius $10^{-5}$ cm has $s \approx 4 \times 10^{-7}$ cm, $\tau_s \approx 0.6$ s and $Q \approx 5[t \text{ in days}]^{3/2}$ cm. At the low pressure of $5 \times 10^{-17}$ torr at 4.2 K reported by Gabrielse’s group in 1990, the mean collision time with gas molecules is $\approx 80$ min, over which $Q \approx 0.7$ mm. Thus, observation of this effect should be feasible.

Further Remarks

It is possible to define energy for the $w(\mathbf{x},t)$ field so that total energy is conserved: as the particles gain energy, the $w$-field loses energy, as shown by Pearle in 2005.
Attempts to construct a special-relativistic CSL-type model have not yet succeeded, although Pearle in 1990, 1992, and 1999, Ghirardi, Grassi, and Pearle in 1990, and Nicrosini and Rimini in 2003 have made valiant attempts. The problem is that the white noise field $w(\mathbf{x},t)$ contains all wavelengths and frequencies, exciting the vacuum in lowest order in $\lambda$ to produce particles at the unacceptable rate of infinite energy per second per cubic centimeter. Collapse models which utilize a colored noise field $w$ have a similar problem in higher orders. In 2005, Pearle suggested a quasirelativistic model which reduces to CSL in the low-speed limit.
CSL is a phenomenological model which describes dynamical collapse so as to achieve S. Besides needing decisive experimental verification, it needs identification of the $w(\mathbf{x},t)$ field with a physical entity.
Other collapse models which have been investigated are briefly described below.

Spontaneous Localization Model

The SL model of Ghirardi et al. (1986), although superseded by CSL, is historically important and conceptually valuable. Let $\hat{H} = 0$ for simplicity, and consider a single particle whose wave function at time $t$ is $\psi(\mathbf{x},t)$. Over the next interval $dt$, with probability $1 - \lambda\,dt$, it does not change. With probability $\lambda\,dt$ it does change, by being ‘‘spontaneously localized’’ or ‘‘hit.’’ A hit means that the new (unnormalized) wave function suddenly becomes

\[ \psi(\mathbf{x},t+dt) = \psi(\mathbf{x},t)\,(\pi a^2)^{-3/4}\,e^{-(2a^2)^{-1}(\mathbf{x}-\mathbf{z})^2} \]

with probability

\[ \lambda\,dt\,d\mathbf{z}\int d\mathbf{x}\,|\psi(\mathbf{x},t+dt)|^2 \]

Thus $\mathbf{z}$, the ‘‘center’’ of the hit, is most likely to be located where the wave function is large. For a single particle in the superposition described in the subsection ‘‘Interference,’’ a single hit is overwhelmingly likely to reduce the wave function to one or the other location, with total probability $|\alpha_i|^2$, at the rate $\lambda$.
For an $N$-particle clump, it is considered that each particle has the same independent probability, $\lambda\,dt$, of being hit. But, for the example in the subsection ‘‘Interference,’’ a single hit on any particle in one location of the clump has the effect of multiplying the wave function part describing the clump in the other location by the tail of the Gaussian, thereby collapsing the wave function at the rate $N\lambda$.
By use of the Gaussian hit rather than a delta-function hit, SL solves the problem of giving too much energy to particles as mentioned in the subsection ‘‘CSL.’’ By the hypothesis of independent particle hits, SL also solves the problem of achieving a slow collapse rate for a superposition of small objects and a fast collapse rate for a superposition of large objects. However, the hits on individual particles destroy the (anti-)symmetry of wave functions. The CSL collapse toward mass density eigenstates removes that problem. Also, while SL modifies the Schrödinger evolution of a wave function, it involves discontinuous dynamics and so is not described by a modified Schrödinger equation as is CSL.

Other Models

For a single (low-energy) particle, the polar decomposition $\psi = Re^{(i/\hbar)S}$ of the Schrödinger equation implies two real equations,

\[ \frac{\partial R^2}{\partial t} + \nabla\cdot\left(R^2\frac{\nabla S}{m}\right) = 0 \qquad [26] \]

(the continuity equation for $R^2 = |\psi|^2$) and

\[ \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V + Q = 0 \qquad [27] \]

where $Q \equiv -(\hbar^2/2m)\nabla^2R/R$ is the ‘‘quantum potential.’’ (These equations have an obvious generalisation to higher-dimensional configuration space.) In 1926, Madelung proposed that one should start from [26] and [27], regarded as hydrodynamical equations for a classical charged fluid with mass density $mR^2$ and fluid velocity $\nabla S/m$, and construct $\psi = Re^{(i/\hbar)S}$ from the solutions.
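As a concrete illustration of the quantum potential entering [27], the sketch below evaluates $Q = -(\hbar^2/2m)\,R''/R$ in one dimension for a Gaussian amplitude and compares a finite-difference estimate with the closed form. This is an illustrative example only: the units $\hbar = m = 1$, the Gaussian choice of $R$, the step size `h`, and all function names are assumptions made here, not taken from the article.

```python
import math

HBAR = 1.0  # ASSUMED units with hbar = 1
M = 1.0     # ASSUMED unit mass


def amplitude(x, sigma=1.0):
    """R(x) = exp(-x^2 / (4 sigma^2)), so that R^2 is a Gaussian of variance sigma^2."""
    return math.exp(-x**2 / (4.0 * sigma**2))


def quantum_potential(R, x, h=1e-4):
    """Q(x) = -(hbar^2 / 2m) R''(x) / R(x), with R'' from a central difference."""
    second = (R(x + h) - 2.0 * R(x) + R(x - h)) / h**2
    return -(HBAR**2 / (2.0 * M)) * second / R(x)


def exact_quantum_potential(x, sigma=1.0):
    """Closed form for the Gaussian amplitude:
    R''/R = x^2 / (4 sigma^4) - 1 / (2 sigma^2)."""
    return -(HBAR**2 / (2.0 * M)) * (x**2 / (4.0 * sigma**4) - 1.0 / (2.0 * sigma**2))


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.5):
        print(x, quantum_potential(amplitude, x), exact_quantum_potential(x))
```

For this amplitude $Q$ is an inverted parabola, positive near the center of the packet, which is the term responsible for the spreading of a free Gaussian packet in the hydrodynamical picture.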

This ‘‘hydrodynamical’’ interpretation suffers from many difficulties, especially for many-body systems. In any case, a criticism by Wallstrom (1994) seems decisive: [26] and [27] (and their higher-dimensional analogs) are not, in fact, equivalent to the Schrödinger equation. For, as usually understood, the quantum wave function $\psi$ is a single-valued and continuous complex field, which typically possesses nodes ($\psi = 0$), in the neighborhood of which the phase $S$ is multivalued, with values differing by integral multiples of $2\pi\hbar$. If one allows $S$ in [26], [27] to be multivalued, there is no reason why the allowed values should differ by integral multiples of $2\pi\hbar$, and in general $\psi$ will not be single-valued. On the other hand, if one restricts $S$ in [26], [27] to be single-valued, one will exclude wave functions, such as those of nonzero angular momentum, with a multivalued phase. (This problem does not exist in pilot-wave theory as we have presented it here, where $\psi$ is regarded as a basic entity.)
Stochastic mechanics, introduced by Fényes in 1952 and Nelson (1966), has particle trajectories $x(t)$ obeying a ‘‘forward’’ stochastic differential equation $dx(t) = b(x(t),t)\,dt + dw(t)$, where $b$ is a drift (equal to the mean forward velocity) and $w$ a Wiener process, and also a similar ‘‘backward’’ equation. Defining the ‘‘current velocity’’ $v = (1/2)(b + b_*)$, where $b_*$ is the mean backward velocity, and using an appropriate time-symmetric definition of mean acceleration, one may impose a stochastic version of Newton’s second law. If one assumes, in addition, that $v$ is a gradient ($v = \nabla S/m$ for some $S$), then one obtains [26], [27] with $R \equiv \sqrt{\rho}$, where $\rho$ is the particle density. Defining $\psi \equiv \sqrt{\rho}\,e^{(i/\hbar)S}$, it appears that one recovers the Schrödinger equation for the derived quantity $\psi$. However, again, there is no reason why $S$ should have the specific multivalued structure required for the phase of a single-valued complex field. It then seems that, despite appearances, quantum theory cannot in fact be recovered from stochastic mechanics (Wallstrom 1994). The same problem occurs in models that use stochastic mechanics as an intermediate step (e.g., Markopoulou and Smolin in 2004): the Schrödinger equation is obtained only for exceptional, nodeless wave functions.
Bohm and Bub (1966) first proposed dynamical wave-function collapse through deterministic evolution. Their collapse outcome is determined by the value of a Wiener–Siegel hidden variable (a variable distributed uniformly over the unit hypersphere in a Hilbert space identical to that of the state vector). In 1976, Pearle proposed dynamical wave-function collapse equations where the collapse outcome is determined by a random variable, and suggested (Pearle 1979) that the modified Schrödinger equation be formulated as an Itô stochastic differential equation, a suggestion which has been widely followed. (The equation for the state vector given here, which is physically more transparent, has its time derivative equivalent to a Stratonovich stochastic differential equation, which is readily converted to the Itô form.) The importance of requiring that the density matrix describing collapse be of the Lindblad–Kossakowski form was emphasized by Gisin in 1984 and Diosi in 1988. The stochastic differential Schrödinger equation that achieves this was found independently by Diosi in 1988 and by Belavkin, Gisin, and Pearle in separate papers in 1989 (see Ghirardi et al. 1990).
A gravitationally motivated stochastic collapse dynamics was proposed by Diosi in 1989 (and somewhat corrected by Ghirardi et al. in 1990). Penrose emphasized in 1996 that a quantum state, such as that describing a mass in a superposition of two places, puts the associated spacetime geometry also in a superposition, and has argued that this should lead to wave-function collapse. He suggests that the collapse time should be $\sim \hbar/\Delta E$, where $\Delta E$ is the gravitational potential energy change obtained by actually displacing two such masses: for example, the collapse time $\approx \hbar/(Gm^2/R)$, where the mass is $m$, its size is $R$, and the displacement is $\approx R$ or larger. No specific dynamics is offered, just the vision that this will be a property of a correct future quantum theory of gravity.
Collapse to energy eigenstates was first proposed by Bedford and Wang in 1975 and 1977 and, in the context of stochastic collapse (e.g., [11] with $\hat{A} = \hat{H}$), by Milburn in 1991 and Hughston in 1996, but it has been argued by Finkelstein in 1993 and Pearle in 2004 that such energy-driven collapse cannot give a satisfactory picture of the macroscopic world. Percival in 1995 and in a 1998 book, and Fivel in 1997 have discussed energy-driven collapse for microscopic situations.
Adler (2004) has presented a classical theory (a hidden-variables theory) from which it is argued that quantum theory ‘‘emerges’’ at the ensemble level. The classical variables are $N \times N$ matrix field amplitudes at points of space. They obey appropriate classical Hamiltonian dynamical equations which he calls ‘‘trace dynamics,’’ since the expressions for Hamiltonian, Lagrangian, Poisson bracket, etc., have the form of the trace of products of matrices and their sums with constant coefficients. Using classical statistical mechanics, canonical ensemble averages of (suitably projected) products of fields are analyzed and it is argued that they obey all the properties associated with Wightman functions, from which quantum field theory, and its nonrelativistic-limit quantum mechanics, may be derived. As well as obtaining the algebra of quantum theory in this way,

it is argued that statistical fluctuations around the canonical ensemble can give rise to the behavior of wave-function collapse, of the kind discussed here, both energy-driven and CSL-type mass-density-driven collapse so that, with the latter, comes the Born probability interpretation of the algebra. The Hamiltonian needed for this theory to work is not provided but, as the argument progresses, its necessary features are delimited.

See also: Quantum Mechanics: Foundations.

Further Reading

Adler SL (2004) Quantum Theory as an Emergent Phenomenon. Cambridge: Cambridge University Press.
Bassi A and Ghirardi GC (2003) Dynamical reduction models. Physics Reports 379: 257–426 (ArXiv: quant-ph/0302164).
Bell JS (1987) Speakable and Unspeakable in Quantum Mechanics. Cambridge: Cambridge University Press.
Bohm D (1952) A suggested interpretation of the quantum theory in terms of ‘hidden’ variables. I and II. Physical Review 85: 166–179 and 180–193.
Bohm D and Bub J (1966) A proposed solution of the measurement problem in quantum mechanics by a hidden variable theory. Reviews of Modern Physics 38: 453–469.
Dürr D, Goldstein S, and Zanghì N (2003) Quantum equilibrium and the role of operators as observables in quantum theory, ArXiv: quant-ph/0308038.
Ghirardi G, Rimini A, and Weber T (1986) Unified dynamics for microscopic and macroscopic systems. Physical Review D 34: 470–491.
Ghirardi G, Pearle P, and Rimini A (1990) Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles. Physical Review A 42: 78–89.
Holland PR (1993) The Quantum Theory of Motion: An Account of the de Broglie–Bohm Causal Interpretation of Quantum Mechanics. Cambridge: Cambridge University Press.
Nelson E (1966) Derivation of the Schrödinger equation from Newtonian mechanics. Physical Review 150: 1079–1085.
Pearle P (1979) Toward explaining why events occur. International Journal of Theoretical Physics 18: 489–518.
Pearle P (1989) Combining stochastic dynamical state-vector reduction with spontaneous localization. Physical Review A 39: 2277–2289.
Pearle P (1999) Collapse models. In: Petruccione F and Breuer HP (eds.) Open Systems and Measurement in Relativistic Quantum Theory, pp. 195–234. Heidelberg: Springer. (ArXiv: quant-ph/9901077).
Valentini A (1991) Signal-locality, uncertainty, and the subquantum H-theorem. I and II. Physics Letters A 156: 5–11 and 158: 1–8.
Valentini A (2002a) Signal-locality in hidden-variables theories. Physics Letters A 297: 273–278.
Valentini A (2002b) Subquantum information and computation. Pramana – Journal of Physics 59: 269–277 (ArXiv: quant-ph/0203049).
Valentini A (2004a) Black holes, information loss, and hidden variables, ArXiv: hep-th/0407032.
Valentini A (2004b) Universal signature of non-quantum systems. Physics Letters A 332: 187–193 (ArXiv: quant-ph/0309107).
Valentini A and Westman H (2005) Dynamical origin of quantum probabilities. Proceedings of the Royal Society of London Series A 461: 253–272.
Wallstrom TC (1994) Inequivalence between the Schrödinger equation and the Madelung hydrodynamic equations. Physical Review A 49: 1613–1617.

Quantum Mechanics: Weak Measurements


L Diósi, Research Institute for Particle and Nuclear Physics, Budapest, Hungary
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In quantum theory, the mean value of a certain observable $\hat{A}$ in a (pure) quantum state $|i\rangle$ is defined by the quadratic form:

\[ \langle\hat{A}\rangle_i =: \langle i|\hat{A}|i\rangle \qquad [1] \]

Here $\hat{A}$ is a Hermitian operator on the Hilbert space $\mathcal{H}$ of states. We use Dirac formalism. The above mean is interpreted statistically. No other forms had been known to possess a statistical interpretation in standard quantum theory. One can, nonetheless, try to extend the notion of mean for normalized bilinear expressions (Aharonov et al. 1988):

\[ A_w =: \frac{\langle f|\hat{A}|i\rangle}{\langle f|i\rangle} \qquad [2] \]

However unusual this structure is, standard quantum theory provides a plausible statistical interpretation for it, too. The two pure states $|i\rangle$, $|f\rangle$ play the roles of the prepared initial and the postselected final states, respectively. The statistical interpretation relies upon the concept of weak measurement. In a single weak measurement, the notorious decoherence is chosen asymptotically small. In physical terms, the coupling between the measured state and the meter is assumed asymptotically weak. The novel mean value [2] is called the (complex) weak value.
The concept of quantum weak measurement (Aharonov et al. 1988) provides particular

conclusions on postselected ensembles. Weak measurements have been instrumental in the interpretation of time-continuous quantum measurements on single states as well. Yet, weak measurement itself can properly be illuminated in the context of classical statistics. Classical weak measurement as well as postselection and time-continuous measurement are straightforward concepts leading to conclusions that are natural in classical statistics. In the quantum context, the case is radically different and certain paradoxical conclusions follow from weak measurements. Therefore, we first introduce the classical notion of weak measurement on postselected ensembles and, alternatively, in time-continuous measurement on a single state. Certain idioms from statistical physics will be borrowed and certain not genuinely quantum notions from quantum theory will be anticipated. The quantum counterpart of weak measurement, postselection, and continuous measurement will be presented afterwards. The apparent redundancy of the parallel presentations is intentional: the reader can separate what is common in classical and quantum weak measurements from what is genuinely quantum.

Classical Weak Measurement

Given a normalized probability density $\rho(X)$ over the phase space $\{X\}$, which we call the state, the mean value of a real function $A(X)$ is defined as

\[ \langle A\rangle_\rho =: \int dX\,A\rho \qquad [3] \]

Let the outcome of an (unbiased) measurement of $A$ be denoted by $a$. Its stochastic expectation value $E[a]$ coincides with the mean [3]:

\[ E[a] = \langle A\rangle_\rho \qquad [4] \]

Performing a large number $N$ of independent measurements of $A$ on the elements of the ensemble of identically prepared states, the arithmetic mean $\bar{a}$ of the outcomes yields a reliable estimate of $E[a]$ and, this way, of the theoretical mean $\langle A\rangle_\rho$.
Suppose, for concreteness, the measurement outcome $a$ is subject to a Gaussian stochastic error of standard dispersion $\sigma > 0$. The probability distribution of $a$ and the update of the state corresponding to the Bayesian inference are described as

\[ p(a) = \langle G_\sigma(a - A)\rangle_\rho \qquad [5] \]

\[ \rho \to \frac{1}{p(a)}\,\rho\,G_\sigma(a - A) \qquad [6] \]

respectively. Here $G_\sigma$ is the central Gaussian distribution of variance $\sigma^2$. Note that, as expected, eqn [5] implies eqn [4]. Nonzero $\sigma$ means that the measurement is nonideal, yet the expectation value $E[a]$ remains calculable reliably if the statistics $N$ is suitably large.
Suppose the spread of $A$ in state $\rho$ is finite:

\[ \Delta_\rho^2 A =: \langle A^2\rangle_\rho - \langle A\rangle_\rho^2 < \infty \qquad [7] \]

Weak measurement will be defined in the asymptotic limit (eqns [8] and [9]) where both the stochastic error of the measurement and the measurement statistics go to infinity. It is crucial that their ratio is kept constant:

\[ \sigma, N \to \infty \qquad [8] \]

\[ \Delta^2 =: \frac{\sigma^2}{N} = \text{const.} \qquad [9] \]

Obviously for asymptotically large $\sigma$, the precision of individual measurements becomes extremely weak. This incapacity is fully compensated by the asymptotically large statistics $N$. In the weak measurement limit (eqns [8] and [9]), the probability distribution $p_w$ of the arithmetic mean $\bar{a}$ of the $N$ independent outcomes converges to a Gaussian distribution:

\[ p_w(\bar{a}) \to G_\Delta\big(\bar{a} - \langle A\rangle_\rho\big) \qquad [10] \]

The Gaussian is centered at the mean $\langle A\rangle_\rho$, and the variance of the Gaussian is given by the constant ratio [9]. Consequently, the mean [3] is reliably calculable on a statistics $N$ growing like $\sim\sigma^2$.
With an eye on quantum theory, we consider two situations of weak measurement in classical statistics: postselection and time-continuous measurement.

Postselection

For the preselected state $\rho$, we introduce postselection via the real function $\Pi(X)$, where $0 \le \Pi \le 1$. The postselected mean value of a certain real function $A(X)$ is defined by

\[ {}_\Pi\langle A\rangle_\rho =: \frac{\langle\Pi A\rangle_\rho}{\langle\Pi\rangle_\rho} \qquad [11] \]

where $\langle\Pi\rangle_\rho$ is the rate of postselection. Postselection means that after having obtained the outcome $a$ regarding the measurement of $A$, we measure the function $\Pi$, too, in ideal measurement with random outcome $\pi$ upon which we base the following random decision. With probability $\pi$, we include the current $a$ into the statistics and we discard it

with probability $1 - \pi$. Then the coincidence of $E[a]$ and ${}_\Pi\langle A\rangle_\rho$, as in eqn [4], remains valid:

\[ E[a] = {}_\Pi\langle A\rangle_\rho \qquad [12] \]

Therefore, a large ensemble of postselected states allows one to estimate the postselected mean ${}_\Pi\langle A\rangle_\rho$.
Classical postselection allows introducing the effective postselected state:

\[ \rho_\Pi =: \frac{\Pi\rho}{\langle\Pi\rangle_\rho} \qquad [13] \]

Then the postselected mean [11] of $A$ in state $\rho$ can, by eqn [14], be expressed as the common mean of $A$ in the effective postselected state $\rho_\Pi$:

\[ {}_\Pi\langle A\rangle_\rho = \langle A\rangle_{\rho_\Pi} \qquad [14] \]

As we shall see later, quantum postselection is more subtle and cannot be reduced to common statistics, that is, to that without postselection. The quantum counterpart of postselected mean does not exist unless we combine postselection and weak measurement.

Time-Continuous Measurement

For time-continuous measurement, one abandons the ensemble of identical states. One supposes that a single time-dependent state $\rho_t$ is undergoing an infinite sequence of measurements (eqns [5] and [6]) of $A$ employed at times $t = \delta t, t = 2\delta t, t = 3\delta t, \ldots$. The rate $\nu =: 1/\delta t$ goes to infinity together with the mean squared error $\sigma^2$. Their ratio is kept constant:

\[ \sigma, \nu \to \infty \qquad [15] \]

\[ g^2 =: \frac{\sigma^2}{\nu} = \text{const.} \qquad [16] \]

In the weak measurement limit (eqns [15] and [16]), the infinitely frequent weak measurements of $A$ constitute the model of time-continuous measurement. Even the weak measurements will significantly influence the original state $\rho_0$, due to the accumulated effect of the infinitely many Bayesian updates [6]. The resulting theory of time-continuous measurement is described by coupled Gaussian processes [17] and [18] for the primitive function $\alpha_t$ of the time-dependent measurement outcome and, respectively, for the time-dependent Bayesian conditional state $\rho_t$:

\[ d\alpha_t = \langle A\rangle_{\rho_t}\,dt + g\,dW_t \qquad [17] \]

\[ d\rho_t = g^{-1}\big(A - \langle A\rangle_{\rho_t}\big)\rho_t\,dW_t \qquad [18] \]

Here $dW_t$ is the Itô differential of the Wiener process.
Equations [17] and [18] are a special case of the Kushner–Stratonovich equations of time-continuous Bayesian inference conditioned on the continuous measurement of $A$ yielding the time-dependent outcome value $a_t$. Formal time derivatives of both sides of eqn [17] yield the heuristic equation

\[ a_t = \langle A\rangle_{\rho_t} + g\,\xi_t \qquad [19] \]

Accordingly, the current measurement outcome is always equal to the current mean plus a term proportional to standard white noise $\xi_t$. This plausible feature of the model survives in the quantum context as well. As for the other equation [18], it describes the gradual concentration of the distribution $\rho_t$ in such a way that the variance $\Delta_{\rho_t}A$ tends to zero while $\langle A\rangle_{\rho_t}$ tends to a random asymptotic value. The details of the convergence depend on the character of the continuously measured function $A(X)$. Consider a stepwise $A(X)$:

\[ A(X) = \sum_\lambda a^\lambda P^\lambda(X) \qquad [20] \]

The real values $a^\lambda$ are step heights all differing from each other. The indicator functions $P^\lambda$ take values 0 or 1 and form a complete set of pairwise disjoint functions on the phase space:

\[ \sum_\lambda P^\lambda \equiv 1 \qquad [21] \]

\[ P^\lambda P^\mu = \delta_{\lambda\mu}P^\lambda \qquad [22] \]

In a single ideal measurement of $A$, the outcome $a$ is one of the $a^\lambda$’s singled out at random. The probability distribution of the measurement outcome and the corresponding Bayesian update of the state are given by

\[ p^\lambda = \langle P^\lambda\rangle_{\rho_0} \qquad [23] \]

\[ \rho_0 \to \frac{1}{p^\lambda}P^\lambda\rho_0 =: \rho^\lambda \qquad [24] \]

respectively. Equations [17] and [18] of time-continuous measurement are a connatural time-continuous resolution of the ‘‘sudden’’ ideal measurement (eqns [23] and [24]) in the sense that they reproduce it in the limit $t \to \infty$. The states $\rho^\lambda$ are trivial stationary states of eqn [18]. It can be shown that they are indeed approached with probability $p^\lambda$ for $t \to \infty$.

Quantum Weak Measurement

In quantum theory, states in a given complex Hilbert space $\mathcal{H}$ are represented by non-negative density operators $\hat{\rho}$, normalized by $\mathrm{tr}\,\hat{\rho} = 1$. Like the

classical states $\rho$, the quantum state $\hat{\rho}$ is interpreted statistically, referring to an ensemble of states with the same $\hat{\rho}$. Given a Hermitian operator $\hat{A}$, called observable, its theoretical mean value in state $\hat{\rho}$ is defined by

\[ \langle\hat{A}\rangle_{\hat{\rho}} = \mathrm{tr}(\hat{A}\hat{\rho}) \qquad [25] \]

Let the outcome of an (unbiased) quantum measurement of $\hat{A}$ be denoted by $a$. Its stochastic expectation value $E[a]$ coincides with the mean [25]:

\[ E[a] = \langle\hat{A}\rangle_{\hat{\rho}} \qquad [26] \]

Performing a large number $N$ of independent measurements of $\hat{A}$ on the elements of the ensemble of identically prepared states, the arithmetic mean $\bar{a}$ of the outcomes yields a reliable estimate of $E[a]$ and, this way, of the theoretical mean $\langle\hat{A}\rangle_{\hat{\rho}}$. If the measurement outcome $a$ contains a Gaussian stochastic error of standard dispersion $\sigma$, then the probability distribution of $a$ and the update, called collapse in quantum theory, of the state are described by eqns [27] and [28], respectively. (We adopt the notational convenience of physics literature to omit the unit operator $\hat{I}$ from trivial expressions like $a\hat{I}$.)

\[ p(a) = \big\langle G_\sigma(a - \hat{A})\big\rangle_{\hat{\rho}} \qquad [27] \]

\[ \hat{\rho} \to \frac{1}{p(a)}\,G_\sigma^{1/2}(a - \hat{A})\,\hat{\rho}\,G_\sigma^{1/2}(a - \hat{A}) \qquad [28] \]

Nonzero $\sigma$ means that the measurement is nonideal, but the expectation value $E[a]$ remains calculable reliably if $N$ is suitably large.
Weak quantum measurement, like its classical counterpart, requires finite spread of the observable $\hat{A}$ on state $\hat{\rho}$:

\[ \Delta_{\hat{\rho}}^2\hat{A} =: \langle\hat{A}^2\rangle_{\hat{\rho}} - \langle\hat{A}\rangle_{\hat{\rho}}^2 < \infty \qquad [29] \]

Weak quantum measurement, too, will be defined in the asymptotic limit [8] introduced for classical weak measurement. Single quantum measurements can no longer distinguish between the eigenvalues of $\hat{A}$. Yet, the expectation value $E[a]$ of the outcome $a$ remains calculable on a statistics $N$ growing like $\sim\sigma^2$.
Both in quantum theory and classical statistics, the emergence of nonideal measurements from ideal ones is guaranteed by general theorems. For completeness of this article, we prove the emergence of the nonideal quantum measurement (eqns [27] and [28]) from the standard von Neumann theory of ideal quantum measurements (von Neumann 1955). The source of the statistical error of dispersion $\sigma$ is associated with the state $\hat{\rho}_M$ in the complex Hilbert space $L^2$ of a hypothetic meter. Suppose $R \in (-\infty, \infty)$ is the position of the ‘‘pointer.’’ Let its initial state $\hat{\rho}_M$ be a pure central Gaussian state of width $\sigma$; then the density operator $\hat{\rho}_M$ in Dirac position basis takes the form

\[ \hat{\rho}_M = \int dR\int dR'\,G_\sigma^{1/2}(R)\,G_\sigma^{1/2}(R')\,|R\rangle\langle R'| \qquad [30] \]

We are looking for a certain dynamical interaction to transmit the ‘‘value’’ of the observable $\hat{A}$ onto the pointer position $\hat{R}$. To model the interaction, we define the unitary transformation [31] to act on the tensor space $\mathcal{H}\otimes L^2$:

\[ \hat{U} = \exp(-i\hat{A}\otimes\hat{K}) \qquad [31] \]

Here $\hat{K}$ is the canonical momentum operator conjugated to $\hat{R}$:

\[ \exp(-ia\hat{K})|R\rangle = |R + a\rangle \qquad [32] \]

The unitary operator $\hat{U}$ transforms the initial uncorrelated quantum state into the desired correlated composite state:

\[ \hat{\Sigma} =: \hat{U}\,\hat{\rho}\otimes\hat{\rho}_M\,\hat{U}^{\dagger} \qquad [33] \]

Equations [30]–[33] yield the expression [34] for the state $\hat{\Sigma}$:

\[ \hat{\Sigma} = \int dR\int dR'\,G_\sigma^{1/2}(R - \hat{A})\,\hat{\rho}\,G_\sigma^{1/2}(R' - \hat{A})\otimes|R\rangle\langle R'| \qquad [34] \]

Let us write the pointer’s coordinate operator $\hat{R}$ into the standard form [35] in Dirac position basis:

\[ \hat{R} = \int da\,a\,|a\rangle\langle a| \qquad [35] \]

The notation anticipates that, when pointer $\hat{R}$ is measured ideally, the outcome $a$ plays the role of the nonideally measured value of the observable $\hat{A}$. Indeed, let us consider the ideal von Neumann measurement of the pointer position on the correlated composite state $\hat{\Sigma}$. The probability of the outcome $a$ and the collapse of the composite state are given by the following standard equations:

\[ p(a) = \mathrm{tr}\big[(\hat{I}\otimes|a\rangle\langle a|)\,\hat{\Sigma}\big] \qquad [36] \]

\[ \hat{\Sigma} \to \frac{1}{p(a)}\big[(\hat{I}\otimes|a\rangle\langle a|)\,\hat{\Sigma}\,(\hat{I}\otimes|a\rangle\langle a|)\big] \qquad [37] \]

respectively. We insert eqn [34] into eqns [36] and [37]. Furthermore, we take the trace over $L^2$ of both sides of eqn [37]. In such a way, as expected, eqns [36] and [37] of ideal measurement of $\hat{R}$ yield the

earlier postulated eqns [27] and [28] of nonideal measurement of $\hat{A}$.

Quantum Postselection

A quantum postselection is defined by a Hermitian operator satisfying $0 \le \hat{\Pi} \le \hat{I}$. The corresponding postselected mean value of a certain observable $\hat{A}$ is defined by

\[ {}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}} =: \mathrm{Re}\,\frac{\langle\hat{\Pi}\hat{A}\rangle_{\hat{\rho}}}{\langle\hat{\Pi}\rangle_{\hat{\rho}}} \qquad [38] \]

The denominator $\langle\hat{\Pi}\rangle_{\hat{\rho}}$ is the rate of quantum postselection. Quantum postselection means that after the measurement of $\hat{A}$, we measure the observable $\hat{\Pi}$ in ideal quantum measurement and we make a statistical decision on the basis of the outcome $\pi$. With probability $\pi$, we include the case in question into the statistics while we discard it with probability $1 - \pi$. By analogy with the classical case [12], one may ask whether the stochastic expectation value $E[a]$ of the postselected measurement outcome does coincide with

\[ E[a] \overset{?}{=} {}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}} \qquad [39] \]

Contrary to the classical case, the quantum equation [39] does not hold. The quantum counterparts of classical equations [12]–[14] do not exist at all. Nonetheless, the quantum postselected mean ${}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}}$ possesses a statistical interpretation, although restricted to the context of weak quantum measurements. In the weak measurement limit (eqns [8] and [9]), a postselected analog of classical equation [10] holds for the arithmetic mean $\bar{a}$ of postselected weak quantum measurements:

\[ p_w(\bar{a}) \to G_\Delta\big(\bar{a} - {}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}}\big) \qquad [40] \]

The Gaussian is centered at the postselected mean ${}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}}$, and the variance of the Gaussian is given by the constant ratio [9]. Consequently, the mean [38] becomes calculable on a statistics $N$ growing like $\sim\sigma^2$.
Since the statistical interpretation of the postselected quantum mean [38] is only possible for weak measurements, ${}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}}$ is called the (real) weak value of $\hat{A}$. Consider the special case when both the state $\hat{\rho} = |i\rangle\langle i|$ and the postselected operator $\hat{\Pi} = |f\rangle\langle f|$ are pure states. Then the weak value ${}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}}$ takes, in usual notations, a particular form [41] yielding the real part of the complex weak value $A_w$ [2]:

\[ {}_f\langle\hat{A}\rangle_i =: \mathrm{Re}\,\frac{\langle f|\hat{A}|i\rangle}{\langle f|i\rangle} \qquad [41] \]

The interpretation of postselection itself reduces to a simple procedure. One performs the von Neumann ideal measurement of the Hermitian projector $|f\rangle\langle f|$, then includes the case if the outcome is 1 and discards it if the outcome is 0. The rate of postselection is $|\langle f|i\rangle|^2$. We note that a certain statistical interpretation of $\mathrm{Im}\,A_w$, too, exists although it relies upon the details of the ‘‘meter.’’
We outline a heuristic proof of the central equation [40]. One considers the nonideal measurement (eqns [27] and [28]) of $\hat{A}$ followed by the ideal measurement of $\hat{\Pi}$. Then the joint distribution of the corresponding outcomes is given by eqn [42]. The probability distribution of the postselected outcomes is defined by eqn [43], and takes the concrete form [44]. The constant $\mathcal{N}$ assures normalization:

\[ p(\pi, a) = \mathrm{tr}\big[\delta(\pi - \hat{\Pi})\,G_\sigma^{1/2}(a - \hat{A})\,\hat{\rho}\,G_\sigma^{1/2}(a - \hat{A})\big] \qquad [42] \]

\[ p_{\hat{\Pi}}(a) =: \frac{1}{\mathcal{N}}\int\pi\,p(\pi, a)\,d\pi \qquad [43] \]

\[ p_{\hat{\Pi}}(a) = \frac{1}{\mathcal{N}}\big\langle G_\sigma^{1/2}(a - \hat{A})\,\hat{\Pi}\,G_\sigma^{1/2}(a - \hat{A})\big\rangle_{\hat{\rho}} \qquad [44] \]

Suppose, for simplicity, that $\hat{A}$ is bounded. When $\sigma \to \infty$, eqn [44] yields the first two moments of the outcome $a$:

\[ E[a] \to {}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}} \qquad [45] \]

\[ E[a^2] \sim \sigma^2 \qquad [46] \]

Hence, by virtue of the central limit theorem, the probability distribution [40] follows for the average $\bar{a}$ of postselected outcomes in the weak measurement limit (eqns [8] and [9]).

Quantum Weak-Value Anomaly

Unlike in classical postselection, effective postselected quantum states cannot be introduced. We can ask whether eqn [47] defines a correct postselected quantum state:

\[ \hat{\rho}^{\,?}_{\hat{\Pi}} =: \mathrm{Herm}\,\frac{\hat{\Pi}\hat{\rho}}{\langle\hat{\Pi}\rangle_{\hat{\rho}}} \qquad [47] \]

This pseudo-state satisfies the quantum counterpart of the classical equation [14]:

\[ {}_{\hat{\Pi}}\langle\hat{A}\rangle_{\hat{\rho}} = \mathrm{tr}\big(\hat{A}\,\hat{\rho}^{\,?}_{\hat{\Pi}}\big) \qquad [48] \]

In general, however, the operator $\hat{\rho}^{\,?}_{\hat{\Pi}}$ is not a density operator since it may be indefinite. Therefore, eqn [47] does not define a quantum state. Equation [48] does not guarantee that the quantum weak value
${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$ lies within the range of the eigenvalues of the observable $\hat A$.

Let us see a simple example for such anomalous weak values in the two-dimensional Hilbert space. Consider the pure initial state given by eqn [49] and the postselected pure state by eqn [50], where $\phi \in [0, \pi]$ is a certain angular parameter.

$$ |i\rangle = \frac{1}{\sqrt 2} \begin{pmatrix} e^{i\phi/2} \\ e^{-i\phi/2} \end{pmatrix} \qquad [49] $$

$$ |f\rangle = \frac{1}{\sqrt 2} \begin{pmatrix} e^{-i\phi/2} \\ e^{i\phi/2} \end{pmatrix} \qquad [50] $$

The probability of successful postselection is $\cos^2\phi$. If $\phi \ne \pi/2$, then the postselected pseudo-state follows from eqn [47]:

$$ \hat\rho_{?\hat\Pi} = \frac{1}{2} \begin{pmatrix} 1 & \cos^{-1}\phi \\ \cos^{-1}\phi & 1 \end{pmatrix} \qquad [51] $$

This matrix is indefinite unless $\phi = 0$; its two eigenvalues are $(1 \pm \cos^{-1}\phi)/2$. The smaller the postselection rate $\cos^2\phi$, the larger is the violation of the positivity of the pseudo-density operator. Let the weakly measured observable take the form

$$ \hat A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad [52] $$

Its eigenvalues are $\pm 1$. We express its weak value from eqns [41], [49], and [50] or, equivalently, from eqns [48] and [51]:

$$ {}_f\langle\hat A\rangle_i = \frac{1}{\cos\phi} \qquad [53] $$

This weak value of $\hat A$ lies outside the range of the eigenvalues of $\hat A$. The anomaly can be arbitrarily large if the rate $\cos^2\phi$ of postselection decreases.

Striking consequences follow from this anomaly if we turn to the statistical interpretation. For concreteness, suppose $\phi = 2\pi/3$ so that ${}_f\langle\hat A\rangle_i = -2$. On average, 75% of the statistics N will be lost in postselection. We learnt from eqn [40] that the arithmetic mean $\bar a$ of the postselected outcomes of independent weak measurements converges stochastically to the weak value up to the Gaussian fluctuation $\Delta$, as expressed symbolically by

$$ \bar a = -2 \pm \Delta \qquad [54] $$

Let us approximate the asymptotically large error $\sigma$ of our weak measurements by $\sigma = 10$ which is already well beyond the scale of the eigenvalues $\pm 1$ of the observable $\hat A$. The Gaussian error $\Delta$ derives from eqn [9] after replacing N by the size of the postselected statistics which is approximately N/4:

$$ \Delta^2 = 400/N \qquad [55] $$

Accordingly, if N = 3600 independent quantum measurements of precision $\sigma = 10$ are performed regarding the observable $\hat A$, then the arithmetic mean $\bar a$ of the approximately 900 postselected outcomes a will be $-2 \pm 0.33$. In magnitude, this significantly exceeds the largest eigenvalue of the measured observable $\hat A$. Quantum postselection appears to bias the otherwise unbiased nonideal weak measurements.

Quantum Time-Continuous Measurement

The mathematical construction of time-continuous quantum measurement is similar to the classical one. We consider the weak measurement limit (eqns [15] and [16]) of an infinite sequence of nonideal quantum measurements of the observable $\hat A$ at $t = \delta t, 2\delta t, \ldots$, on the time-dependent state $\hat\rho_t$. The resulting theory of time-continuous quantum measurement is incorporated in the coupled stochastic equations [56] and [57] for the primitive function $\alpha_t$ of the time-dependent outcome and the conditional time-dependent state $\hat\rho_t$, respectively (Diósi 1988):

$$ d\alpha_t = \langle\hat A\rangle_{\hat\rho_t}\, dt + g\, dW_t \qquad [56] $$

$$ d\hat\rho_t = -\tfrac{1}{8} g^{-2} [\hat A, [\hat A, \hat\rho_t]]\, dt + g^{-1}\, \mathrm{Herm}\left(\hat A - \langle\hat A\rangle_{\hat\rho_t}\right) \hat\rho_t\, dW_t \qquad [57] $$

Equation [56] and its classical counterpart [17] are perfectly similar. There is a remarkable difference between eqn [57] and its classical counterpart [18]. In the latter, the stochastic average of the state is constant: $\mathrm{E}[d\rho_t] = 0$, expressing the fact that classical measurements do not alter the original ensemble if we "ignore" the outcomes of the measurements. On the contrary, quantum measurements introduce irreversible changes to the original ensemble, a phenomenon called decoherence in the physics literature. Equation [57] implies the closed linear first-order differential equation [58] for the stochastic average of the quantum state $\hat\rho_t$ under time-continuous measurement of the observable $\hat A$:

$$ \frac{d\,\mathrm{E}[\hat\rho_t]}{dt} = -\tfrac{1}{8} g^{-2} [\hat A, [\hat A, \mathrm{E}[\hat\rho_t]]] \qquad [58] $$

This is the basic irreversible equation to model the gradual loss of quantum coherence (decoherence) under time-continuous measurement. In fact, the very equation models decoherence under the influence of a large class of interactions, for example, with thermal reservoirs or complex environments. In

two-dimensional Hilbert space, for instance, we can consider the initial pure state $\langle i| =: [\cos\phi, \sin\phi]$ and the time-continuous measurement of the diagonal observable [59] on it. The solution of eqn [58] is given by eqn [60]:

$$ \hat A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [59] $$

$$ \mathrm{E}[\hat\rho_t] = \begin{pmatrix} \cos^2\phi & e^{-t/2g^2} \cos\phi \sin\phi \\ e^{-t/2g^2} \cos\phi \sin\phi & \sin^2\phi \end{pmatrix} \qquad [60] $$

The off-diagonal elements of this density matrix go to zero, that is, the coherent superposition represented by the initial pure state becomes an incoherent mixture represented by the diagonal density matrix $\hat\rho_\infty$.

Apart from the phenomenon of decoherence, the stochastic equations show remarkable similarity with the classical equations of time-continuous measurement. The heuristic form of eqn [56] is eqn [61] of invariable interpretation with respect to the classical equation [19]:

$$ a_t = \langle\hat A\rangle_{\hat\rho_t} + g\,\xi_t \qquad [61] $$

Equation [57] describes what is called the time-continuous collapse of the quantum state under time-continuous quantum measurement of $\hat A$. For concreteness, we assume discrete spectrum for $\hat A$ and consider the spectral expansion

$$ \hat A = \sum_\lambda a^\lambda \hat P_\lambda \qquad [62] $$

The real values $a^\lambda$ are nondegenerate eigenvalues. The Hermitian projectors $\hat P_\lambda$ form a complete orthogonal set:

$$ \sum_\lambda \hat P_\lambda \equiv \hat I \qquad [63] $$

$$ \hat P_\lambda \hat P_\mu = \delta_{\lambda\mu} \hat P_\lambda \qquad [64] $$

In a single ideal measurement of $\hat A$, the outcome a is one of the $a^\lambda$'s singled out at random. The probability distribution of the measurement outcome and the corresponding collapse of the state are given by

$$ p_\lambda = \langle\hat P_\lambda\rangle_{\hat\rho_0} \qquad [65] $$

$$ \hat\rho_0 \to \frac{1}{p_\lambda} \hat P_\lambda \hat\rho_0 \hat P_\lambda =: \hat\rho_\lambda \qquad [66] $$

respectively. Equations [56] and [57] of continuous measurements are an obvious time-continuous resolution of the "sudden" ideal quantum measurement (eqns [65] and [66]) in a sense that they reproduce it in the limit $t \to \infty$. The states $\hat\rho_\lambda$ are stationary states of eqn [57]. It can be shown that they are indeed approached with probability $p_\lambda$ for $t \to \infty$ (Gisin 1984).

Related Contexts

In addition to the two particular examples as in postselection and in time-continuous measurement, respectively, presented above, the weak measurement limit itself has further variants. A most natural example is the usual thermodynamic limit in standard statistical physics. Then weak measurements concern a certain additive microscopic observable (e.g., the spin) of each constituent and the weak value represents the corresponding additive macroscopic parameter (e.g., the magnetization) in the infinite volume limit. This example indicates that weak values have natural interpretation despite the apparent artificial conditions of their definition. It is important that the weak value, with or without postselection, plays the physical role similar to that of the common mean $\langle\hat A\rangle_{\hat\rho}$. If, between their pre- and postselection, the states $\hat\rho$ become weakly coupled with the state of another quantum system via the observable $\hat A$, their average influence will be as if $\hat A$ took the weak value ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$. Weak measurements also open a specific loophole to circumvent quantum limitations related to the irreversible disturbances that quantum measurements cause to the measured state. Noncommuting observables become simultaneously measurable in the weak limit: simultaneous weak values of noncommuting observables will exist.

The term "weak measurement" was coined in 1988 for quantum measurements with (pre- and) postselection, and became the tool of a certain time-symmetric statistical interpretation of quantum states. Foundational applications target the paradoxical problem of pre- and retrodiction in quantum theory. In a broad sense, however, the very principle of weak measurement encapsulates the trade between asymptotically weak precision and asymptotically large statistics. Its relevance in different fields has not yet been fully explored and a growing number of foundational, theoretical, and experimental applications are being considered in the literature, predominantly in the context of quantum physics. Since specialized monographs or textbooks on quantum weak measurement are not yet available, the reader is mostly referred to research articles, like the recent one by Aharonov and Botero (2005), covering many topics of postselected quantum weak values.
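The two-dimensional anomaly example of eqns [49]–[55] can be checked numerically. The following is a minimal sketch, not the author's procedure: it evaluates the weak value [41], the pseudo-state [47], and the mean of the postselected outcome density [44] on a grid. Several ingredients are assumptions made for illustration: the function name `weak_value_example`, the sign convention chosen for the phases in [49] and [50], a zero-mean Gaussian of variance $\sigma^2$ for $G_\sigma$, and the form $\Delta^2 = \sigma^2/N$ for eqn [9] (inferred from eqn [55]).

```python
import numpy as np


def weak_value_example(phi, sigma=10.0, n_runs=3600):
    """Numerical check of the two-dimensional anomaly example (eqns [49]-[55])."""
    # Pre- and postselected states, eqns [49] and [50] (assumed phase convention).
    ket_i = np.array([np.exp(1j * phi / 2), np.exp(-1j * phi / 2)]) / np.sqrt(2)
    ket_f = np.array([np.exp(-1j * phi / 2), np.exp(1j * phi / 2)]) / np.sqrt(2)
    A = np.array([[0.0, 1.0], [1.0, 0.0]])  # observable of eqn [52]

    overlap = np.vdot(ket_f, ket_i)                     # <f|i>
    rate = abs(overlap) ** 2                            # postselection rate
    weak = (np.vdot(ket_f, A @ ket_i) / overlap).real   # eqn [41]

    # Pseudo-state of eqn [47]: Herm(Pi rho) / <Pi>.
    rho = np.outer(ket_i, ket_i.conj())
    Pi = np.outer(ket_f, ket_f.conj())
    prod = Pi @ rho
    pseudo = 0.5 * (prod + prod.conj().T) / np.trace(prod).real
    pseudo_eigs = np.linalg.eigvalsh(pseudo)

    # Postselected outcome density of eqn [44]: p(a) ~ |<f| G^{1/2}(a - A) |i>|^2,
    # evaluated in the eigenbasis of A. The Gaussian prefactor is omitted
    # because it cancels in the normalized mean.
    evals, evecs = np.linalg.eigh(A)
    weights = (evecs.conj().T @ ket_f).conj() * (evecs.conj().T @ ket_i)
    a = np.linspace(-12 * sigma, 12 * sigma, 200001)
    root_g = np.exp(-(a[:, None] - evals[None, :]) ** 2 / (4 * sigma**2))
    density = np.abs(root_g @ weights) ** 2
    mean_a = np.sum(a * density) / np.sum(density)      # grid spacing cancels

    # Statistical error of the arithmetic mean, assuming Delta^2 = sigma^2 / N
    # with N replaced by the postselected sample size rate * n_runs.
    delta = sigma / np.sqrt(rate * n_runs)
    return weak, rate, pseudo_eigs, mean_a, delta


if __name__ == "__main__":
    print(weak_value_example(2 * np.pi / 3))
```

For $\phi = 2\pi/3$ the sketch returns a weak value of −2, a postselection rate of 0.25, pseudo-state eigenvalues −0.5 and 1.5 (so the pseudo-state is indefinite), and an error of about 0.33 for N = 3600. At the finite precision $\sigma = 10$ the postselected mean is close to, but not exactly, −2; the agreement improves as $\sigma$ grows, in line with eqn [45].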

Nomenclature

$a$    measurement outcome
$\bar a$    arithmetic mean of measurement outcomes
$\hat A$    Hermitian operator, quantum observable
$A(X)$    real phase-space function
$\mathrm{E}[\ldots]$    stochastic expectation value
$\langle f|\hat A|i\rangle$    matrix element
$\langle f|i\rangle$    inner product
$\mathcal H$    Hilbert space
$L^2$    space of Lebesgue square-integrable complex functions
$p$    probability distribution
tr    trace
$\hat U$    unitary operator
$W_t$    Wiener process
$\xi_t$    white noise process
${}_\Pi\langle\ldots\rangle_\rho$    postselected mean value
$\hat\rho$    density operator
$\rho(X)$    phase-space distribution
$\otimes$    direct product
$\dagger$    operator adjoint
$|\ldots\rangle$    state vector
$\langle\ldots|$    adjoint state vector
$\langle\ldots\rangle$    mean value

Further Reading

Aharonov Y, Albert DZ, and Vaidman L (1988) How the result of measurement of a component of the spin of a spin-1/2 particle can turn out to be 100. Physical Review Letters 60: 1351–1354.
Aharonov Y and Botero A (2005) Quantum averages of weak values. Physical Review A (in print).
Aharonov Y, Popescu S, Rohrlich D, and Vaidman L (1993) Measurements, errors, and negative kinetic energy. Physical Review A 48: 4084–4090.
Aharonov Y and Vaidman L (1990) Properties of a quantum system during the time interval between two measurements. Physical Review A 41: 11–20.
Belavkin VP (1989) A new equation for a continuous nondemolition measurement. Physics Letters A 140: 355–358.
Diósi L (1988) Continuous quantum measurement and Itô-formalism. Physics Letters A 129: 419–432.
Gisin N (1984) Quantum measurements and stochastic processes. Physical Review Letters 52: 1657–1660.
Giulini D, Joos E, Kiefer C, Kupsch J, Stamatescu IO et al. (1996) Decoherence and the Appearance of a Classical World in Quantum Theory. Berlin: Springer.
Gough J and Sobolev A (2004) Stochastic Schrödinger equations as limit of discrete filtering. Open Systems and Information Dynamics 11: 1–21.
Kraus K (1983) States, Effects, and Operations: Fundamental Notions of Quantum Theory. Berlin: Springer.
von Neumann J (1955) Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.
Nielsen MA and Chuang IL (2000) Quantum Computation and Quantum Information. Cambridge: Cambridge University Press.
Stratonovich R (1968) Conditional Markov Processes and Their Application to the Theory of Optimal Control. New York: Elsevier.
Wiseman HM (2002) Weak values, quantum trajectories, and the cavity-QED experiment on wave-particle correlation. Physical Review A 65: 32111–32116.

Quantum n-Body Problem


R G Littlejohn, University of California at Berkeley, Berkeley, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

This article concerns the nonrelativistic quantum mechanics of isolated systems of n particles interacting by means of a scalar potential, what we shall call the "quantum n-body problem." Such systems are described by the kinetic-plus-potential Hamiltonian,

$$ H = T + V = \sum_{\alpha=1}^{n} \frac{|\mathbf{P}_\alpha|^2}{2m_\alpha} + V(\mathbf{R}_1, \ldots, \mathbf{R}_n) \qquad [1] $$

where $\mathbf{R}_\alpha$, $\mathbf{P}_\alpha$, $\alpha = 1, \ldots, n$ are the positions and momenta of the n particles in three-dimensional space, $m_\alpha$ are the masses, and V is the potential energy. This Hamiltonian also occurs in the "classical n-body problem," in which V is usually assumed to consist of the sum of the pairwise gravitational interactions of the particles. In this article, we shall only assume that V (hence H) is invariant under translations, proper rotations, parity, and permutations of identical particles. The Hamiltonian H is also invariant under time reversal.

This Hamiltonian describes the dynamics of isolated atoms, molecules, and nuclei, with varying degrees of approximation, including the case of molecules in the Born–Oppenheimer approximation, in which V is the Born–Oppenheimer potential. We shall ignore the spin of the particles, and treat the wave function $\Psi$ as a scalar. We assume that $\Psi$ is an eigenfunction of H, $H\Psi = E\Psi$. In practice, the value of n typically ranges from 2 to several hundred. Often the cases n = 3 and n = 4 are of special interest. In this article, we shall assume that $n \ge 3$, since n = 2 is the trivial case of central-force motion. The quantum n-body problem is not to be confused with the "quantum

many-body problem," which usually refers to the quantum mechanics of large numbers of identical particles, such as the electrons in a solid.

Of particular interest is the "reduction" of the Hamiltonian [1], that is, the elimination of those degrees of freedom that can be eliminated due to the continuous symmetries of translations and rotations. A basic problem is to write down the reduced Hamiltonian and to make its analytical and geometrical properties clear. In the following we shall present this reduction in two stages, dealing first with the translations and second with the proper rotations. In each stage, we shall describe the reduction first in coordinate language and then in geometrical language. The discrete symmetries of parity, time reversal, and permutation of identical particles are handled by standard methods of group representation theory, and will not be discussed here.

There has been considerable interest in mathematical circles in recent years in the reduction of dynamical systems with symmetry, and the quantum n-body problem is one of the most important such systems from a physical standpoint. As such, the basic theory of the quantum n-body problem has received considerable attention in the physical literature going back to the birth of quantum mechanics, and continues to be of great practical importance. This article and the bibliography attempt to bridge these two centers of interest.

Reduction by Translations: Coordinate Description

We begin with a coordinate description of the reduction of the system [1] by translations. The coordinates $(\mathbf{R}_1, \ldots, \mathbf{R}_n)$ are coordinates on the configuration space of the system, called the "original configuration space" or OCS. The OCS is $\mathbb{R}^{3n}$. The original system has 3n degrees of freedom. The translation group acts on configuration space by $\mathbf{R}_\alpha \mapsto \mathbf{R}_\alpha + \boldsymbol{\xi}$, for $\alpha = 1, \ldots, n$, where $\boldsymbol{\xi}$ is a displacement vector. It acts on wave functions by $\Psi(\mathbf{R}_1, \ldots, \mathbf{R}_n) \mapsto \Psi(\mathbf{R}_1 - \boldsymbol{\xi}, \ldots, \mathbf{R}_n - \boldsymbol{\xi})$.

To reduce the system by translations, we perform a linear coordinate transformation on the OCS, taking us from the original vectors $(\mathbf{R}_1, \ldots, \mathbf{R}_n)$ to a new set of n vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$, where $\mathbf{R}_{CM}$ is the center-of-mass position,

$$ \mathbf{R}_{CM} = \frac{1}{M} \sum_{\alpha=1}^{n} m_\alpha \mathbf{R}_\alpha \qquad [2] $$

where $M = \sum_\alpha m_\alpha$ is the total mass of the system, and the other n − 1 vectors of the new coordinate system, $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$, are required to be translationally invariant, that is, independent linear functions of the relative particle positions $\mathbf{R}_\alpha - \mathbf{R}_\beta$. We denote the momenta conjugate to $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$ by $(\mathbf{p}_1, \ldots, \mathbf{p}_{n-1}, \mathbf{P}_{CM})$, of which $\mathbf{P}_{CM}$ turns out to be the total momentum of the system,

$$ \mathbf{P}_{CM} = \sum_{\alpha=1}^{n} \mathbf{P}_\alpha \qquad [3] $$

Under such a coordinate transformation, the potential energy becomes simply a function of the n − 1 relative vectors, $V(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$, whereas the kinetic energy becomes

$$ T = \frac{|\mathbf{P}_{CM}|^2}{2M} + \frac{1}{2} \sum_{\alpha,\beta=1}^{n-1} K_{\alpha\beta}\, \mathbf{p}_\alpha \cdot \mathbf{p}_\beta \qquad [4] $$

where $K_{\alpha\beta}$ is a symmetric tensor (the "inverse mass tensor").

The vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ specify the positions of n particles relative to their center of mass. As described so far, these vectors need only be independent, translationally invariant linear combinations of the particle positions. However, it is convenient to choose them so that the inverse mass tensor becomes proportional to the identity, $K_{\alpha\beta} = (1/M)\,\delta_{\alpha\beta}$. An elegant way of doing this is the method of Jacobi vectors, which involves splitting the original set of particles into two nonempty subsets, which are then split into smaller subsets, etc., until only subsets of a single particle remain. The process can be represented by a tree growing downward, with the original n particles as the root, and the ends of the branches at the bottom each containing one particle. Then the vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ (the Jacobi vectors) are chosen to be proportional to the differences between the centers of mass of the two subsets at each splitting. With the right constants of proportionality, the kinetic energy becomes

$$ T = \frac{1}{2M} |\mathbf{P}_{CM}|^2 + \frac{1}{2M} \sum_{\alpha=1}^{n-1} |\mathbf{p}_\alpha|^2 \qquad [5] $$

Henceforth, we shall assume that the vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ are Jacobi vectors with conjugate momenta $(\mathbf{p}_1, \ldots, \mathbf{p}_{n-1})$.

The choice of Jacobi vectors is not unique. In the first place, there is a discrete set of possible ways of splitting the original set of n particles into subsets (of forming trees), each of which leads to the same form [5] of the kinetic energy. More generally, the kinetic energy [5] is invariant under transformations

$$ \mathbf{r}'_\alpha = \sum_{\beta=1}^{n-1} Q_{\alpha\beta}\, \mathbf{r}_\beta \qquad [6] $$
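The Jacobi construction and the invariance expressed by eqn [6] can be illustrated numerically. The sketch below is an assumption-laden example rather than the article's own prescription: it uses one particular tree (a chain in which particles are added one at a time), takes the constant of proportionality at each splitting to be $\sqrt{\mu/M}$ with $\mu$ the reduced mass of the two subsets, and checks the quadratic form $\sum_\alpha m_\alpha |\mathbf{R}_\alpha|^2 = M|\mathbf{R}_{CM}|^2 + M\sum_\alpha |\mathbf{r}_\alpha|^2$, which is the coordinate statement behind the kinetic energy [5]. The function name `jacobi_vectors` is hypothetical, and the matrix Q in the demo is assumed to be orthogonal.

```python
import numpy as np


def jacobi_vectors(masses, positions):
    """Mass-weighted Jacobi vectors for a chain tree (assumed normalization).

    Particle k+1 is split off from the subset {1, ..., k} at each step.
    Returns (r, R_cm, M), where r has shape (n - 1, 3).
    """
    m = np.asarray(masses, dtype=float)
    R = np.asarray(positions, dtype=float)
    M = m.sum()
    r = []
    m_sub = m[0]           # mass of the subset built so far
    c_sub = R[0].copy()    # center of mass of that subset
    for k in range(1, len(m)):
        mu = m_sub * m[k] / (m_sub + m[k])          # reduced mass of the split
        r.append(np.sqrt(mu / M) * (R[k] - c_sub))  # difference of centers of mass
        c_sub = (m_sub * c_sub + m[k] * R[k]) / (m_sub + m[k])
        m_sub += m[k]
    return np.array(r), c_sub, M


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = np.array([1.0, 2.0, 3.5, 0.7])
    R = rng.normal(size=(4, 3))
    r, R_cm, M = jacobi_vectors(m, R)

    # Quadratic form in the original and in the new coordinates.
    print(np.sum(m[:, None] * R**2), M * np.sum(R_cm**2) + M * np.sum(r**2))

    # A transformation of the form [6] with an orthogonal Q leaves sum |r|^2 unchanged.
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    print(np.sum(r**2), np.sum((Q @ r) ** 2))
```

The two printed pairs should agree to rounding error. Shifting all positions by a common displacement leaves the vectors `r` unchanged, which is the translational invariance required of them.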

where $Q_{\alpha\beta}$ is an orthogonal matrix, $Q \in O(n-1)$. Such transformations are called "kinematic rotations." The discrete choices of trees in forming the Jacobi vectors are equivalent to a discrete set of kinematic rotations $Q_{\alpha\beta}$ that map one standard choice of Jacobi vectors into the others.

Since the momentum $\mathbf{P}_{CM}$ of the center of mass commutes with H, the eigenfunctions $\Psi$ of H can be chosen to have the form

$$ \Psi(\mathbf{R}_1, \ldots, \mathbf{R}_n) = \exp(i\,\mathbf{R}_{CM} \cdot \mathbf{P}_{CM}/\hbar)\; \psi(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) \qquad [7] $$

This causes $\psi$ to be an eigenfunction of the "translation-reduced Hamiltonian," $H_{tr}\psi = E_{tr}\psi$, where

$$ H_{tr} = \frac{1}{2M} \sum_{\alpha=1}^{n-1} |\mathbf{p}_\alpha|^2 + V(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) \qquad [8] $$

The kinetic energy of the center of mass, $|\mathbf{P}_{CM}|^2/2M$, has been discarded from both $H_{tr}$ and $E_{tr}$, which represent physically the energy of the system about its center of mass.

Reduction by Translations: Geometrical Description

The kinetic energy T in eqn [1] specifies a metric $ds^2 = \sum_\alpha m_\alpha |d\mathbf{R}_\alpha|^2$ on the OCS ($= \mathbb{R}^{3n}$). The translation group ($= \mathbb{R}^3$) acts freely on the OCS, with an action that is generated by $\mathbf{P}_{CM}$. This action defines an orthogonal decomposition of the OCS, $\mathbb{R}^{3n} = \mathbb{R}^3 \oplus \mathbb{R}^{3n-3}$, where $\mathbb{R}^3$ is the orbit of the origin (the other orbits of the translation group action are parallel spaces), and $\mathbb{R}^{3n-3}$ is the orthogonal subspace (henceforth the "translation-reduced configuration space" or TRCS for short). The TRCS is physically the space of configurations relative to the center of mass. The vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ are coordinates on the TRCS. The TRCS possesses a metric which is the projection of the metric on the OCS onto the TRCS by means of the translation group action. The metric can be projected because translations preserve the original metric (they are isometries). Jacobi vectors are Euclidean coordinates on the TRCS with respect to this metric.

The tree method of constructing Jacobi vectors can be understood in terms of certain group actions which take place as each subset of particles is split into two further subsets. The group action in question leaves the center of mass of the original subset invariant, while moving the two new subsets apart along a line. This motion in the configuration space is orthogonal to all the other group actions that are created in the process of splitting subsets of particles, including the original action of the translation group. Thus, each splitting of a subset of particles generates a three-dimensional subspace of the OCS, on which one of the $\mathbf{r}_\alpha$ are coordinates. The conjugate momentum $\mathbf{p}_\alpha$ is the generator of the group action moving the two new subsets apart. The final result is that the OCS is decomposed into n orthogonal, three-dimensional subspaces, one of which contains the action of the original translation group, and the others of which represent the decomposition of the TRCS into n − 1, three-dimensional orthogonal subspaces.

The TRCS can also be seen as a global section of a flat, trivial, principal fiber bundle created by the action of the translation group on the OCS. Alternatively, the TRCS can be seen as the quotient space, $\mathbb{R}^{3n}/\mathbb{R}^3$. The construction is fairly simple because the translation group is Abelian.

The wave function $\psi$ can be seen as a member of the Hilbert space of wave functions on the TRCS, upon which the reduced Hamiltonian $H_{tr}$ of eqn [8] acts. Alternatively, it can be seen as the function obtained by restricting $\Psi$ on the OCS to the TRCS, where $\Psi$ has a dependence along the orbits of the translation group given by $\exp(i\,\mathbf{R}_{CM} \cdot \mathbf{P}_{CM}/\hbar)$, that is, by an irreducible representation (irrep) of the translation group.

Reduction by Rotations: Coordinate Description

The Hamiltonian $H_{tr}$ acts on wave functions $\psi$ defined on the TRCS and has 3n − 3 degrees of freedom. Consider a coordinate transformation to eliminate further degrees of freedom due to the rotational invariance. This coordinate transformation takes us from the Jacobi vectors $\{\mathbf{r}_\alpha, \alpha = 1, \ldots, n-1\}$ to orientational and shape coordinates. Shape coordinates are a set of 3n − 6 coordinates $\{q^\mu, \mu = 1, \ldots, 3n-6\}$ that specify the shape of the n-particle system, that is, they are 3n − 6 independent functions of the interparticle distances (hence rotationally invariant). We will call the space upon which the $q^\mu$ are coordinates "shape space." For example, in the case of the three-body problem, shape space is the space of all triangles.

As for orientational coordinates, to define them it is necessary first to define a "body frame." We assume we are already given one frame, the "space frame," a fixed inertial frame. The body frame is a 3-frame attached in a conventional way to each shape of the system of particles, which rotates with the particles. The orientational coordinates, to be

denoted by $\{\theta^i, i = 1, 2, 3\}$, are three coordinates (e.g., Euler angles) specifying the SO(3) rotation that maps the space frame into the body frame. We shall write the new coordinates collectively as $\{\theta^i, q^\mu\}$.

There is a great deal of arbitrariness in the choice of a body frame, since for a given shape a body frame can be attached in many ways, the different choices being related by proper rotations. The only requirement is that the body frame should change smoothly as the shape changes. Popular choices for the body frame are the principal axis and Eckart frames. When the potential energy is transformed to the new coordinates, it becomes a function only of the $\{q^\mu\}$, that is, of the shape. The potential can be written as V = V(q). V is a scalar field on shape space.

The transformation of the kinetic energy is more complicated. When the (Euclidean) metric tensor on the TRCS is transformed to orientational and shape coordinates there results a (3n − 3) × (3n − 3) component matrix which may be partitioned into blocks according to the coordinates $\{\theta^i, q^\mu\}$, that is, according to 3n − 3 = 3 + (3n − 6). This matrix cannot be made diagonal or even block diagonal by any choice of orientational or shape coordinates, or by any choice of body frame.

The components of the metric tensor in the new coordinates are conveniently expressed in terms of three fields on shape space. The first is the moment-of-inertia tensor $\mathsf{E}$, which describes the 3 × 3 upper block of the metric tensor. Its components are given by

$$ E_{ij} = M \sum_{\alpha=1}^{n-1} \left( |\mathbf{r}_\alpha|^2\, \delta_{ij} - r_{\alpha i}\, r_{\alpha j} \right) \qquad [9] $$

The vectors and tensors in this equation can be referred either to the space frame or the body frame, but the body frame is more convenient because then the components of the vectors $\mathbf{r}_\alpha$ are functions only of the shape coordinates $q^\mu$. Thus, the body frame components $E_{ij}$ of the moment-of-inertia tensor define a field on shape space.

The second field is the "gauge potential" $\mathbf{A}_\mu$, an object with 3(3n − 6) components $A^i_\mu$, i = 1, 2, 3, $\mu = 1, \ldots, 3n-6$, which describes the off-diagonal blocks of the metric tensor. It is defined by

$$ \mathbf{A}_\mu = \mathsf{E}^{-1} \cdot \left( M \sum_{\alpha=1}^{n-1} \mathbf{r}_\alpha \times \frac{\partial \mathbf{r}_\alpha}{\partial q^\mu} \right) \qquad [10] $$

in which all vectors are understood to be referred to the body frame (so the partial derivatives make sense). The gauge potential $\mathbf{A}_\mu$ is responsible for the "falling cat" phenomenon, in which a flexible body of zero angular momentum nevertheless manages to rotate.

The third field is the (3n − 6) × (3n − 6) lower block of the metric tensor on the TRCS, an object with two shape indices. It is given by

$$ g_{\mu\nu} = M \sum_{\alpha=1}^{n-1} \frac{\partial \mathbf{r}_\alpha}{\partial q^\mu} \cdot \frac{\partial \mathbf{r}_\alpha}{\partial q^\nu} - \mathbf{A}_\mu \cdot \mathsf{E} \cdot \mathbf{A}_\nu \qquad [11] $$

where again the vectors are referred to the body frame. The notation suggests (correctly) that $g_{\mu\nu}$ is the metric tensor on shape space.

On transforming the wave function from the Jacobi vectors to coordinates $(\theta^i, q^\mu)$, it is convenient to introduce a Jacobian factor, $\psi(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) = D^{-1/4}\, \phi(\theta^i, q^\mu)$, where $D = (\det \mathsf{E})(\det g_{\mu\nu})$. This causes the new wave function $\phi$ to have the normalization

$$ \int d\mathsf{R} \left( \prod_{\mu=1}^{3n-6} dq^\mu \right) |\phi|^2 \qquad [12] $$

where $d\mathsf{R}$ is the Haar measure on the group SO(3). The factor D depends only on the $q^\mu$, not the $\theta^i$. Then the Schrödinger equation can be written as $H_{tr}\phi = E_{tr}\phi$, where $H_{tr}$ is a differential operator involving $\partial/\partial\theta^i$ and $\partial/\partial q^\mu$.

The orientational derivatives $\partial/\partial\theta^i$ in $H_{tr}$ are conveniently expressed in terms of the angular momentum operator $\mathbf{L}$. When acting on the original wave function $\Psi$ on the OCS, the angular momentum is

$$ \mathbf{L} = \sum_{\alpha=1}^{n} \mathbf{R}_\alpha \times \mathbf{P}_\alpha \qquad [13] $$

When this is transformed to the coordinates $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$, it becomes $\mathbf{L} = \mathbf{L}_{CM} + \mathbf{L}_{tr}$, where $\mathbf{L}_{CM} = \mathbf{R}_{CM} \times \mathbf{P}_{CM}$, and

$$ \mathbf{L}_{tr} = \sum_{\alpha=1}^{n-1} \mathbf{r}_\alpha \times \mathbf{p}_\alpha \qquad [14] $$

Physically, $\mathbf{L}_{tr}$ is the angular momentum of the system about the center of mass.

We shall henceforth drop the "tr" on $H_{tr}$, $E_{tr}$, and $\mathbf{L}_{tr}$, thereby restricting attention to the energy and angular momentum about the center of mass.

The angular momentum $\mathbf{L}$, when acting on wave functions $\psi(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ on the TRCS, is a vector of differential operators involving $\partial/\partial\mathbf{r}_\alpha$. When these are transformed to orientational and shape coordinates, the components of $\mathbf{L}$ become differential operators involving only orientational derivatives, $\partial/\partial\theta^i$. There are no shape derivatives, $\partial/\partial q^\mu$, since $\mathbf{L}$ generates rotations, that is, changes in orientation, not shape. Thus, one can solve for the operators $\partial/\partial\theta^i$ in terms of the components of $\mathbf{L}$. This is true

both for the space and the body components of $\mathbf{L}$, although the differential operators are not the same in the two cases. The space components of $\mathbf{L}$ satisfy the usual angular momentum commutation relations, $[L_i, L_j] = i\hbar\,\epsilon_{ijk} L_k$, while the body components of $\mathbf{L}$ satisfy $[L_i, L_j] = -i\hbar\,\epsilon_{ijk} L_k$ (with a minus sign relative to the space commutation relations).

Thus, the Hamiltonian can be expressed in terms of $\mathbf{L}$ and the shape momentum operators, $p_\mu = -i\hbar\,\partial/\partial q^\mu$. The result is

$$ H = \tfrac{1}{2}\, \mathbf{L} \cdot \mathsf{E}^{-1} \cdot \mathbf{L} + \tfrac{1}{2}\, (p_\mu - \mathbf{L} \cdot \mathbf{A}_\mu)\, g^{\mu\nu}\, (p_\nu - \mathbf{L} \cdot \mathbf{A}_\nu) + V_2(q) + V(q) \qquad [15] $$

where all vectors are referred to the body frame, where $g^{\mu\nu}$ is the contravariant metric tensor on shape space, and where $V_2$ is given by

$$ V_2 = \frac{\hbar^2}{2}\, D^{-1/4}\, \frac{\partial}{\partial q^\mu} \left( g^{\mu\nu}\, \frac{\partial D^{1/4}}{\partial q^\nu} \right) \qquad [16] $$

$V_2$ looks like a potential (it is a function of only q), hence the notation, but physically it belongs to the kinetic energy. It is sometimes called an "extrapotential." It arises from nonclassical commutators in the transformation of the kinetic energy (hence the $\hbar^2$ dependence). The first term of eqn [15] is the kinetic energy of rotation, also called the "vertical" kinetic energy, the next two terms are the remainder of the kinetic energy, somewhat imprecisely thought of as the kinetic energy of vibrations or changes in shape, also called the "horizontal kinetic energy," and the final term is the (true) potential, discussed above.

Since the Hamiltonian commutes with the angular momentum, $[H, \mathbf{L}] = 0$, $\phi$ can be chosen to be simultaneous eigenfunctions of $L^2$ and $L_z$ (the latter being the space component), as well as of energy. Let $\phi_{lm}$ be these eigenfunctions, where l and m are the quantum numbers of $L^2$ and $L_z$, respectively. Then by the transformation properties of $\phi$ under rotations, we can write

$$ \phi_{lm}(\theta^i, q^\mu) = \sum_{k=-l}^{+l} \chi_{lk}(q^\mu)\, D^l_{km}(\theta^i) \qquad [17] $$

where D is a standard rotation matrix and $\chi_{lk}$ are functions only of $q^\mu$. In these equations we use the phase and other standard conventions of the theory of rotations. The wave function $\chi$ is a function only of $q^\mu$ and can loosely be thought of as the wave function on shape space. It is not a scalar like $\Psi$, $\psi$, or $\phi$, but rather has 2l + 1 components indexed by k.

The Schrödinger equation for $\chi$ can be written as $H\chi = E\chi$, where H has the same form as in eqn [15], except that now the components of the angular momentum $L_i$ are interpreted, no longer as differential operators in $\theta^i$, but as (2l + 1) × (2l + 1) matrices that act on the "spinor" $\chi$. These matrices are the transposes of the usual angular momentum matrices in angular momentum theory, that is, $(L_i)_{kk'} = \langle k'|L_i|k\rangle$.

This is the final form of the Schrödinger equation after all reductions by all continuous symmetries have been carried out. The fully reduced system has 3n − 5 degrees of freedom (3n − 6 for the shape coordinates, and one for the "spinor" index k).

Reduction by Rotations: Geometrical Description

The proper rotation group SO(3) acts on the OCS by $\mathbf{R}_\alpha \mapsto \mathsf{R}\mathbf{R}_\alpha$, and on the TRCS by $\mathbf{r}_\alpha \mapsto \mathsf{R}\mathbf{r}_\alpha$, where $\mathsf{R} \in SO(3)$. Rotations acting on the OCS do not commute with translations, but the action preserves the translation fibers, and thus can be projected onto the TRCS.

The action of SO(3) on the TRCS is effective but not free, that is, most orbits are diffeomorphic to SO(3), but a subset of measure zero (the "singular" orbits) are diffeomorphic to $S^2$ or a single point. Configurations of the n-particle system in which the particles do not lie on a line ("noncollinear shapes") have SO(3) orbits, those in which the particles do lie on a line but are not coincident have $S^2$ orbits, and the n-body collision (a single shape) has an orbit that is a single point. Thus, the action of SO(3) on the TRCS foliates the TRCS into a (3n − 6)-parameter family of copies of SO(3), plus the singular orbits. If we exclude the singular orbits, then the TRCS has the structure of an SO(3) principal fiber bundle. In general, the bundle is not trivial. Shape space may be defined as the quotient space under the SO(3) action. Omitting the singular shapes, shape space is the base space of the bundle. The coordinates $q^\mu$ introduced above are coordinates on shape space. The singular shapes and orbits are physically accessible, and there are important questions regarding the behavior of the system in their neighborhood.

The definition of a body frame is equivalent to the choice of a section of the fiber bundle, generally only locally defined over some region of shape space. A configuration (a point in the TRCS) on the section defines an orientation of the n-particle system for the given shape, which serves as a reference orientation to which others can be referred. We think of the reference orientation as one in which the space and body frames coincide; in other orientations of the same shape, the body frame has been rotated with the body to a new orientation. The choice of the section (body frame) allows us to
impose coordinates on each (nonsingular) rotation fiber, that is, we label points on the fiber by the rotation that takes us from the section to the actual configuration in question. This is why a choice of body frame is necessary before defining orientational coordinates. Sections are only defined locally. Popular choices of body frame, such as the principal axis frame, imply multivalued sections, unless branch cuts are introduced. Orientational coordinates are simply coordinates on the group manifold SO(3), transferred to the nonsingular rotation fibers, with the group identity element mapped onto the point where the fiber intersects the section.

The metric tensor determines much of the geometry of the reduction by rotations. Since the metric on the TRCS is SO(3)-invariant, horizontal subspaces in the SO(3) fiber bundle (the TRCS minus the singular orbits) can be defined as the spaces orthogonal to the fibers (hence orthogonal to the vertical subspaces). This is a standard construction in Kaluza–Klein theories, which reappears here. Thus, the bundle has a connection, induced by the metric.

The moment-of-inertia tensor is the metric tensor restricted to a fiber, evaluated in a basis of left- (body frame) or right-invariant (space frame) vector fields on SO(3), which are transported to the fibers to create a basis of vertical vector fields.

The coordinate description of the connection is the gauge potential $A_\mu$, in which the $\mu$ index refers to shape coordinates $q^\mu$, and the components of the 3-vector $A_\mu$ refer to the standard set of left- or right-invariant vector fields on SO(3). The coordinate representative of the curvature 2-form is conveniently denoted by $B_{\mu\nu}$, defined by

\[ B_{\mu\nu} = \frac{\partial A_\nu}{\partial q^\mu} - \frac{\partial A_\mu}{\partial q^\nu} - A_\mu \times A_\nu \qquad [18] \]

where it is understood that body frame components are used. Direct calculation shows that it is nonzero, hence the fiber bundle is not flat, for any value of $n \ge 3$. The curvature form $B_{\mu\nu}$ appears in the classical equation of motion and in the quantum commutation relations.

The field $B_{\mu\nu}$ satisfies differential equations on shape space that have the form of Yang–Mills field equations. It is interesting that the sources of this field are singularities of the monopole type, located on the singular shapes. In the case n = 3, the source is a single monopole located at the three-body collision, which is similar to a Dirac monopole in electromagnetic theory.

The $(3n - 6)$-dimensional horizontal subspaces of the TRCS are annihilated by three differential forms, whose values on a velocity vector of the system are the components of the classical angular momentum L (body or space components, depending on the basis of forms). Thus, horizontal motions are those for which L = 0, and horizontal lifts of curves in shape space are motions of the system with vanishing angular momentum. Since angular momentum is conserved, such motions are generated by the classical equations of motion and are physically allowed. For loops in shape space, the holonomy generated by the horizontal lift is physically the rotation that a flexible body experiences when it is carried under conditions of vanishing angular momentum from an initial shape, through intermediate shapes and back to the initial shape. An example is the rotation generated by the “falling cat.”

Since the metric on the TRCS is SO(3)-invariant, it may be projected onto shape space, which therefore is a Riemannian manifold in its own right. The projected metric is $ds^2 = g_{\mu\nu}\, dq^\mu dq^\nu$. This metric is not flat (the Riemann curvature tensor is nonzero for all values $n \ge 3$). Geodesics in shape space have horizontal lifts that are free particle motions (V = 0) of zero angular momentum. Conversely, such motions project onto geodesics on shape space.

A popular choice of body frame in molecular physics is the Eckart frame, which has advantages for the description of small vibrations and other purposes. The section defining the Eckart frame is a flat vector subspace of the TRCS of dimension $3n - 6$ that is orthogonal (horizontal) to a particular fiber (over an equilibrium shape) at a particular orientation.

The geometrical meaning of eqn [17] is that rotations act on a set of wave functions $\Psi_{lm}$ that span an irrep of SO(3) by multiplication by the representative element of the group. In standard physics notation, $l$ indexes the irrep, and $m$ indexes the basis vectors spanning the irrep. Thus, the values of these wave functions at any point on the fiber are known once their values are given at a reference point. A convenient choice for the reference point is the point on the section, and the wave functions $\chi_{lk}$ are simply the values of the $\Psi_{lm}$ on this reference point (with a change of notation, $m \to k$). Thus, the wave functions $\chi_{lk}$ are properly not “wave functions on shape space,” but rather wave functions on the section.

Shape space in the case n = 3 is homeomorphic to the region $x_3 \ge 0$ of $\mathbb{R}^3$, and in the case n = 4 to $\mathbb{R}^6$.

A convenient tool for understanding the structure of shape space is by its foliation under the action of the kinematic rotations, eqn [5]. The kinematic rotations commute with ordinary rotations, and hence have an action on shape space. This action preserves the eigenvalues of the moment-of-inertia tensor.
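The two closing claims about kinematic rotations can be checked numerically. The following is only a sketch under stated assumptions: eqn [5] is not reproduced in this part of the article, so the code assumes that a kinematic rotation acts as an orthogonal mixing of the n − 1 mass-weighted Jacobi vectors, stored as the columns of a 3 × (n − 1) array `F`, and that the moment-of-inertia tensor is tr(F Fᵀ) 1 − F Fᵀ in these coordinates. The names `moment_of_inertia` and `random_rotation` are illustrative, not taken from the article.

```python
import numpy as np


def moment_of_inertia(F):
    """Moment-of-inertia tensor for mass-weighted Jacobi vectors (columns of F).

    Assumed form: M = tr(F F^T) * 1 - F F^T, with F of shape 3 x (n - 1).
    """
    G = F @ F.T
    return np.trace(G) * np.eye(3) - G


def random_rotation(dim, rng):
    """Draw a random proper rotation (orthogonal, det = +1) of the given dimension."""
    Q, R = np.linalg.qr(rng.normal(size=(dim, dim)))
    Q = Q * np.sign(np.diag(R))      # make the QR factorization unique
    if np.linalg.det(Q) < 0:         # flip one column to get det = +1
        Q[:, 0] = -Q[:, 0]
    return Q


rng = np.random.default_rng(0)
n = 5                                # number of particles (illustrative)
F = rng.normal(size=(3, n - 1))      # a generic configuration
K = random_rotation(n - 1, rng)      # kinematic rotation (assumed form)
R = random_rotation(3, rng)          # ordinary spatial rotation

F_kin = F @ K.T                      # kinematic rotation mixes the Jacobi vectors

# Ordinary rotations act from the left and kinematic ones from the right,
# so the two actions commute.
commute = np.allclose(R @ (F @ K.T), (R @ F) @ K.T)

# The eigenvalues of the moment-of-inertia tensor are unchanged.
eig_before = np.linalg.eigvalsh(moment_of_inertia(F))
eig_after = np.linalg.eigvalsh(moment_of_inertia(F_kin))

# The shape itself (the Gram matrix of Jacobi vectors) does change.
shape_changed = not np.allclose(F.T @ F, F_kin.T @ F_kin)

print(commute, np.allclose(eig_before, eig_after), shape_changed)
```

Under these assumptions the invariance is immediate, because F Fᵀ is unchanged when F is multiplied on the right by an orthogonal matrix, while the Gram matrix Fᵀ F, which encodes the shape, is conjugated by K.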
Concluding Remarks

The quantum n-body problem provides an interesting example in which nonabelian gauge theories find application in nonrelativistic quantum mechanics. The fields E, $A_\mu$, and $g_{\mu\nu}$, and fields derived from them such as the curvature tensor $B_{\mu\nu}$ and the Riemann curvature tensor derived from $g_{\mu\nu}$, satisfy a complex set of differential equations on shape space that can be derived by considering the vanishing of the Riemann tensor on the TRCS. The resulting field equations are useful in perturbation theory, for example, in the study of small vibrations of a molecule. This means of constructing field equations on the base space of a bundle is standard in Kaluza–Klein theories, which are an important line of thinking in modern attempts to understand gauge field theories in particle physics.

The rotations generated by flexible bodies of vanishing angular momentum (the “falling cat”) are an example of a “geometric phase,” that is, a nonabelian generalization of “Berry’s phase.” It is interesting how the associated gauge potential $A_\mu$ in this problem plays a role in the dynamics of the n-particle system.

The Hamiltonian [15] is the starting point for numerous practical calculations, for example, the numerical evaluation of energy levels, cross-sections and reaction rates in molecular physics. One can compute, for example, chemical reaction rates for molecular processes in atmospheric or astrophysical contexts, where experiments would be difficult or expensive. The numerical analysis of the Hamiltonian [15] usually requires the introduction of a basis set and the processing of large matrices. Current techniques for basis set selection are not very satisfactory, and this is an area where research into wavelets and numerical analysis could have an impact.

See also: Bosons and Fermions in External Fields; Gravitational N-Body Problem (Classical); Integrable Systems: Overview.

Further Reading

Abraham R and Marsden JE (1978) Foundations of Mechanics, 2nd edn. Reading, MA: Benjamin/Cummings.
Aquilanti V and Cavalli S (1986) Coordinates for molecular dynamics: orthogonal local frames. Journal of Chemical Physics 85: 1355–1361.
Berry MV (1984) Quantal phase factors accompanying adiabatic changes. Proceedings of the Royal Society of London. Series A 392: 45–57.
Coquereaux R (1988) Riemannian Geometry, Fiber bundles, Kaluza–Klein Theories and All That. World Scientific Lecture Notes in Physics, vol. 16. Singapore: World Scientific.
Ezra GS (1982) Symmetry Properties of Molecules. New York: Springer.
Iwai T (1987) A gauge theory for the quantum planar three-body problem. Journal of Mathematical Physics 28: 964–974.
Kupperman A (1993) A new look at symmetrized hyperspherical coordinates. Advances in Molecular Vibrations and Collision Dynamics vol. 2B: 117–186.
Littlejohn RG and Reinsch M (1997) Gauge fields in the separation of rotations and internal motions in the n-body problem. Reviews of Modern Physics 69: 213–275.
Mead CA (1992) The geometric phase in molecular systems. Reviews of Modern Physics 64: 51–85.
Mitchell KA and Littlejohn RG (2000) Kinematic orbits and the structure of the internal space for systems of five or more bodies. Journal of Physics A 33: 1395–1416.
Naber GL (1997) Topology, Geometry and Gauge Fields. New York: Springer.
Wilson EB Jr., Decius JC, and Cross PC (1955) Molecular Vibrations: the Theory of Infrared and Raman Vibrational Spectra. New York: McGraw-Hill.

Quantum Phase Transitions


S Sachdev, Yale University, New Haven, CT, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The study of second-order phase transitions at nonzero temperatures has a long and distinguished history in statistical mechanics. Many key physical phenomena, such as the loss of ferromagnetism in iron at the Curie temperature or the critical endpoint of CO2, are now understood in precise quantitative detail. This understanding began in the work of Onsager, and is based upon what may now be called the Landau–Ginzburg–Wilson theory. The content of this sophisticated theory may be summarized in a few basic principles: (1) The collective thermal fluctuations near second-order transitions can be accurately described by simple classical models, that is, quantum-mechanical effects can be entirely neglected. (2) The classical models identify an “order parameter,” a collective variable which has to be treated on par with other thermodynamic variables, and whose correlations exhibit distinct behavior in the phases on either side of the transition. (3) The thermal fluctuations of the order parameter near the transition are controlled by a continuum field theory whose structure is usually completely dictated by simple symmetry considerations.
This article will not consider such nonzero temperature phase transitions, but will instead describe second-order phase transitions at the absolute zero of temperature. Such transitions are driven by quantum fluctuations mandated by the Heisenberg uncertainty principle: one can imagine moving across the quantum critical point by effectively “tuning the value of Planck’s constant, ħ.” Clearly, quantum mechanics plays a central role at such transitions, unlike the situation at nonzero temperatures. The reader may object that absolute zero is an idealization not realized by any experimental system; hence, the study of quantum phase transitions is a subject only of academic interest. As we will illustrate below, knowledge of the zero-temperature quantum critical points of a system is often the key to understanding its finite-temperature properties, and in some cases the influence of a zero-temperature critical point can be detected at temperatures as high as ambient room temperature.

We will begin in the following section by introducing some simple lattice models which exhibit quantum phase transitions. Next, the theory of the critical point in these models, which is based upon a natural extension of the Landau–Ginzburg–Wilson (LGW) method, will be presented. This section will also describe the consequences of a zero-temperature critical point on the nonzero temperature properties. Finally, we will consider more complex models in which quantum interference effects play a more subtle role, and which cannot be described in the LGW framework: such quantum critical points are likely to play a central role in understanding many of the correlated electron systems of current interest.

Simple Models

Quantum Ising Chain

This is a simple model of N qubits, labeled by the index j = 1, . . . , N. On each “site” j there are two qubit quantum states $|\uparrow\rangle_j$ and $|\downarrow\rangle_j$ (in practice, these could be two magnetic states of an ion at site j in a crystal). The Hilbert space therefore consists of $2^N$ states, each consisting of a tensor product of the states on each site. We introduce the Pauli spin operators, $\hat\sigma^\alpha_j$, on each site j, with $\alpha = x, y, z$:

\[ \hat\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \hat\sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \hat\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [1] \]

These operators clearly act on the two states of the qubit on site j, and the Pauli operators on different sites commute.

The quantum Ising chain is defined by the simple Hamiltonian

\[ H_I = -J \sum_{j=1}^{N-1} \hat\sigma^z_j \hat\sigma^z_{j+1} - gJ \sum_{j=1}^{N} \hat\sigma^x_j \qquad [2] \]

where $J > 0$ sets the energy scale, and $g \ge 0$ is a dimensionless coupling constant. In the thermodynamic limit ($N \to \infty$), the ground state of $H_I$ exhibits a second-order quantum phase transition as g is tuned across a critical value $g = g_c$ (for the specific case of $H_I$ it is known that $g_c = 1$), as we will now illustrate.

First, consider the ground state of $H_I$ for $g \ll 1$. At g = 0, there are two degenerate “ferromagnetically ordered” ground states

\[ |\!\Uparrow\rangle = \prod_{j=1}^{N} |\uparrow\rangle_j; \quad |\!\Downarrow\rangle = \prod_{j=1}^{N} |\downarrow\rangle_j \qquad [3] \]

Each of these states breaks a discrete “Ising” symmetry of the Hamiltonian: rotations of all spins by 180° about the x-axis. These states are more succinctly characterized by defining the ferromagnetic moment, $N_0$, by

\[ N_0 = \langle\Uparrow\!|\hat\sigma^z_j|\!\Uparrow\rangle = -\langle\Downarrow\!|\hat\sigma^z_j|\!\Downarrow\rangle \qquad [4] \]

At g = 0 we clearly have $N_0 = 1$. A key point is that in the thermodynamic limit, this simple picture of the ground state survives for a finite range of small g (indeed, for all $g < g_c$), but with $0 < N_0 < 1$. The quantum tunneling between the two ferromagnetic ground states is exponentially small in N (and so can be neglected in the thermodynamic limit), and so the ground state remains 2-fold degenerate and the discrete Ising symmetry remains broken. The change in the wave functions of these states from eqn [3] can be easily determined by perturbation theory in g: these small g quantum fluctuations reduce the value of $N_0$ from unity but do not cause the ferromagnetism to disappear.

Now consider the ground state of $H_I$ for $g \gg 1$. At $g = \infty$ there is a single nondegenerate ground state which fully preserves all symmetries of $H_I$:

\[ |\!\Rightarrow\rangle = 2^{-N/2} \prod_{j=1}^{N} \left( |\uparrow\rangle_j + |\downarrow\rangle_j \right) \qquad [5] \]

It is easy to verify that this state has no ferromagnetic moment, $N_0 = \langle\Rightarrow\!|\hat\sigma^z_j|\!\Rightarrow\rangle = 0$. Further, perturbation theory in 1/g shows that these features of the ground state are preserved for a finite range of large g values (indeed, for all $g > g_c$). One can visualize this ground state as one in which strong quantum fluctuations have destroyed the ferromagnetism, with the local magnetic moments quantum tunneling between “up” and “down” on a timescale of order ħ/J.
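The contrast between the small g and large g ground states of eqn [2] can be seen already on a short chain by exact diagonalization. The following is a minimal sketch, not a method taken from the article: it assumes J = 1, an open chain of N = 8 sites, and uses the end-to-end correlation ⟨σ̂ᶻ₁ σ̂ᶻ_N⟩ as a finite-size stand-in for the ferromagnetic moment (on a finite chain the ground state does not break the Ising symmetry, so ⟨σ̂ᶻ_j⟩ itself vanishes). The helper names `site_operator`, `ising_hamiltonian`, and `gap_and_correlation` are illustrative.

```python
import numpy as np

# Pauli matrices from eqn [1] (only x and z are needed for H_I).
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)


def site_operator(op, j, n_sites):
    """Embed a single-site operator at site j (0-based) in the 2**n_sites space."""
    out = np.array([[1.0]])
    for k in range(n_sites):
        out = np.kron(out, op if k == j else I2)
    return out


def ising_hamiltonian(n_sites, g, J=1.0):
    """Dense matrix of H_I in eqn [2] for an open chain."""
    dim = 2 ** n_sites
    H = np.zeros((dim, dim))
    for j in range(n_sites - 1):
        H -= J * site_operator(SZ, j, n_sites) @ site_operator(SZ, j + 1, n_sites)
    for j in range(n_sites):
        H -= g * J * site_operator(SX, j, n_sites)
    return H


def gap_and_correlation(n_sites, g):
    """Return the lowest excitation gap and the end-to-end zz correlation."""
    energies, states = np.linalg.eigh(ising_hamiltonian(n_sites, g))
    ground = states[:, 0]
    zz = site_operator(SZ, 0, n_sites) @ site_operator(SZ, n_sites - 1, n_sites)
    return energies[1] - energies[0], ground @ zz @ ground


for g in (0.2, 1.0, 5.0):
    gap, corr = gap_and_correlation(8, g)
    print(f"g = {g}: gap = {gap:.6f}, <sz_1 sz_N> = {corr:.4f}")
```

For small g the two lowest levels are nearly degenerate and the end spins remain strongly correlated, while for large g a gap of order J(g − 1) opens and the correlation is negligible. A dense matrix is used only for clarity; it limits the sketch to roughly a dozen sites.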
Given the very distinct signatures of the small g and large g ground states, it is clear that the ground state cannot evolve smoothly as a function of g. There must be at least one point of nonanalyticity as a function of g: for $H_I$ it is known that there is only a single nonanalytic point, and this is at the location of a second-order quantum phase transition at $g = g_c = 1$.

The character of the excitations above the ground state also undergoes a qualitative change across the quantum critical point. In both the $g < g_c$ and $g > g_c$ phases, these excitations can be described in the Landau quasiparticle scheme, that is, as superpositions of nearly independent particle-like excitations; a single well-isolated quasiparticle has an infinite lifetime at low excitation energies. However, the physical nature of the quasiparticles is very different in the two phases. In the ferromagnetic phase, with $g < g_c$, the quasiparticles are domain walls between regions of opposite magnetization:

\[ |j, j+1\rangle = \prod_{k=1}^{j} |\uparrow\rangle_k \prod_{\ell=j+1}^{N} |\downarrow\rangle_\ell \qquad [6] \]

This is the exact wave function of a stationary quasiparticle excitation between sites j and j + 1 at g = 0; for small nonzero g the quasiparticle acquires a “cloud” of further spin-flips and also becomes mobile. However, its qualitative interpretation as a domain wall between the two degenerate ground states remains valid for all $g < g_c$. In contrast, for $g > g_c$, there is no ferromagnetism, and the nondegenerate paramagnetic state has a distinct quasiparticle excitation:

\[ |j\rangle = 2^{-N/2} \left( |\uparrow\rangle_j - |\downarrow\rangle_j \right) \prod_{k \ne j} \left( |\uparrow\rangle_k + |\downarrow\rangle_k \right) \qquad [7] \]

This is a stationary “flipped spin” quasiparticle at site j, with its wave function exact at $g = \infty$. Again, this quasiparticle is mobile and applicable for all $g > g_c$, but there is no smooth connection between eqns [7] and [6].

Coupled Dimer Antiferromagnet

This model also involves qubits, but they are now placed on the sites, j, of a two-dimensional square lattice. Models in this class describe the magnetic excitations of many experimentally important spin gap compounds.

The Hamiltonian of the dimer antiferromagnet is illustrated in Figure 1 and is given by

\[ H_d = J \sum_{\langle jk \rangle \in A} \left( \hat\sigma^x_j \hat\sigma^x_k + \hat\sigma^y_j \hat\sigma^y_k + \hat\sigma^z_j \hat\sigma^z_k \right) + \frac{J}{g} \sum_{\langle jk \rangle \in B} \left( \hat\sigma^x_j \hat\sigma^x_k + \hat\sigma^y_j \hat\sigma^y_k + \hat\sigma^z_j \hat\sigma^z_k \right) \qquad [8] \]

where $J > 0$ is the exchange constant, $g \ge 1$ is the dimensionless coupling, and the set of nearest-neighbor links A and B are defined in Figure 1. An important property of $H_d$ is that it is now invariant under the full O(3) group of spin rotations under which the $\hat\sigma^\alpha$ transform as ordinary vectors (in contrast to the $Z_2$ symmetry group of $H_I$). In analogy with $H_I$, we will find that $H_d$ undergoes a quantum phase transition from a paramagnetic phase which preserves all symmetries of the Hamiltonian at large g, to an antiferromagnetic phase which breaks the O(3) symmetry at small g. This transition occurs at a critical value $g = g_c$, and the best current numerical estimate is $1/g_c = 0.52337(3)$.

Figure 1 The coupled dimer antiferromagnet. Qubits (i.e., S = 1/2 spins) are placed on the sites, the A links are shown as full lines, and the B links as dashed lines.

As in the previous section, we can establish the existence of such a quantum phase transition by contrasting the disparate physical properties at large g with those at $g \approx 1$. At $g = \infty$ the exact ground state of $H_d$ is

\[ |\text{spin gap}\rangle = \prod_{\langle jk \rangle \in A} \frac{1}{\sqrt{2}} \left( |\uparrow\rangle_j |\downarrow\rangle_k - |\downarrow\rangle_j |\uparrow\rangle_k \right) \qquad [9] \]

and is illustrated in Figure 2. This state is nondegenerate and invariant under spin rotations, and so is a paramagnet: the qubits are paired into spin singlet valence bonds across all the A links.

The excitations above the ground state are created by breaking a valence bond, so that the pair of spins form a spin triplet with total spin S = 1; this is illustrated in Figure 3. It costs a large energy to create this excitation, and at finite g the
triplet can hop from link to link, creating a gapped “triplon” quasiparticle excitation. This is similar to the large g paramagnet for $H_I$, with the important difference that each quasiparticle is now 3-fold degenerate.

Figure 2 The paramagnetic state of $H_d$ for $g > g_c$. The state illustrated is the exact ground state for $g = \infty$, and it is adiabatically connected to the ground state for all $g > g_c$. In the figure, the valence-bond symbol denotes the singlet $(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)/\sqrt{2}$.

Figure 3 The triplon excitation of the $g > g_c$ paramagnet. The stationary triplon is an eigenstate only for $g = \infty$ but it becomes mobile for finite g.

At g = 1, the ground state of $H_d$ is not known exactly. However, at this point $H_d$ becomes equivalent to the nearest-neighbor square lattice antiferromagnet, and this is known to have antiferromagnetic order in the ground state, as illustrated in Figure 4. This state is similar to the ferromagnetic ground state of $H_I$, with the difference that the magnetic moment now acquires a staggered pattern on the two sublattices, rather than the uniform moment of the ferromagnet. Thus, in this ground state

\[ \langle \text{AF} | \hat\sigma^\alpha_j | \text{AF} \rangle = N_0\, \eta_j\, n_\alpha \qquad [10] \]

where $0 < N_0 < 1$ is the antiferromagnetic moment, $\eta_j = \pm 1$ identifies the two sublattices in Figure 4, and $n_\alpha$ is an arbitrary unit vector specifying the orientation of the spontaneous magnetic moment which breaks the O(3) spin rotation invariance of $H_d$. The excitations above this antiferromagnet are also distinct from those of the paramagnet: they are a doublet of spin waves consisting of a spatial variation in the local orientation, $n_\alpha$, of the antiferromagnetic order: the energy of this excitation vanishes in the limit of long wavelengths, in contrast to the finite energy gap of the triplon excitation of the paramagnet.

Figure 4 Schematic of the ground state with antiferromagnetic order with $g < g_c$.

As with $H_I$, we can conclude from the distinct characters of the ground states and excitations for $g \gg 1$ and $g \approx 1$ that there must be a quantum critical point at some intermediate $g = g_c$.

Quantum Criticality

The simple considerations of the previous section have given a rather complete description (based on the quasiparticle picture) of the physics for $g \ll g_c$ and $g \gg g_c$. We turn, finally, to the region $g \approx g_c$. For the specific models discussed in the previous section, a useful description is obtained by a method that is a generalization of the LGW method developed earlier for thermal phase transitions. However, some aspects of the critical behavior (e.g., the general forms of eqns [13]–[15]) will apply also to the quantum critical point of the section “Beyond LGW theory.”

Following the canonical LGW strategy, we need to identify a collective order parameter which distinguishes the two phases. This is clearly given by the ferromagnetic moment in eqn [4] for the quantum Ising chain, and the antiferromagnetic moment in eqn [10] for the coupled dimer antiferromagnet. We coarse-grain these moments over some finite averaging region, and at long wavelengths this yields a real order parameter field $\phi_a$, with the index a = 1, . . . , n. For the Ising case we have n = 1 and $\phi_a$ is a measure of the local average of $N_0$ as defined in eqn [4]. For the antiferromagnet, a extends over the three values x, y, z (so n = 3), and three components of $\phi_a$ specify the magnitude and orientation of the local antiferromagnetic order in eqn [10]; note the average orientation of a specific spin at site j is $\eta_j$ times the local value of $\phi_a$.

The second step in the LGW approach is to write down a general field theory for the order parameter, consistent with all symmetries of the underlying model. As we are dealing with a quantum transition, the field theory has to extend over spacetime, with the temporal fluctuations representing the sum over histories in the Feynman path-integral approach. With this reasoning, the proposed partition function
for the vicinity of the critical point takes the following form:

\[ Z_\phi = \int \mathcal{D}\phi_a(x, \tau) \exp\left( -\int d^d x\, d\tau \left\{ \frac{1}{2} \left[ (\partial_\tau \phi_a)^2 + c^2 (\nabla_x \phi_a)^2 + s \phi_a^2 \right] + \frac{u}{4!} \left( \phi_a^2 \right)^2 \right\} \right) \qquad [11] \]

Here $\tau$ is imaginary time; there is an implied summation over the n values of the index a, c is a velocity, and s and $u > 0$ are coupling constants. This is a field theory in d + 1 spacetime dimensions, in which the Ising chain corresponds to d = 1 and the dimer antiferromagnet to d = 2. The quantum phase transition is accessed by tuning the “mass” s: there is a quantum critical point at $s = s_c$ and the $s < s_c$ ($s > s_c$) regions correspond to the $g < g_c$ ($g > g_c$) regions of the lattice models. The $s < s_c$ phase has $\langle \phi_a \rangle \ne 0$ and this corresponds to the spontaneous breaking of spin rotation symmetry noted in eqns [4] and [10] for the lattice models. The $s > s_c$ phase is the paramagnet with $\langle \phi_a \rangle = 0$. The excitations in this phase can be understood as small harmonic oscillations of $\phi_a$ about the point (in field space) $\phi_a = 0$. A glance at eqn [11] shows that there are n such oscillators for each wave vector. These oscillators clearly constitute the $g > g_c$ quasiparticles found earlier in eqn [7] for the Ising chain (with n = 1) and the triplon quasiparticle (with n = 3) illustrated in Figure 3 for the dimer antiferromagnet.

We have now seen that there is a perfect correspondence between the phases of the quantum field theory $Z_\phi$ and those of the lattice models $H_I$ and $H_d$. The power of the representation in eqn [11] is that it also allows us to get a simple description of the quantum critical point. In particular, readers may already have noticed that if we interpret the temporal direction $\tau$ in eqn [11] as another spatial direction, then $Z_\phi$ is simply the classical partition function for a thermal phase transition in a ferromagnet in d + 1 dimensions: this is the canonical model for which the LGW theory was originally developed. We can now take over standard results for this classical critical point, and obtain some useful predictions for the quantum critical point of $Z_\phi$. It is useful to express these in terms of the dynamic susceptibility defined by

\[ \chi(k, \omega) = \frac{i}{\hbar} \int d^d x \int_0^\infty dt\, \left\langle \left[ \hat\phi(x, t), \hat\phi(0, 0) \right] \right\rangle_T e^{-ikx + i\omega t} \qquad [12] \]

Here $\hat\phi$ is the Heisenberg field operator corresponding to the path integral in eqn [11], the square brackets represent a commutator, and the angular brackets an average over the partition function at a temperature T. The structure of $\chi$ can be deduced from the knowledge that the quantum correlators of $Z_\phi$ are related by analytic continuation in time to the corresponding correlators of the classical statistical mechanics problem in d + 1 dimensions. The latter are known to diverge at the critical point as $\sim 1/p^{2-\eta}$ where p is the (d + 1)-dimensional momentum, $\eta$ is defined to be the anomalous dimension of the order parameter ($\eta = 1/4$ for the quantum Ising chain). Knowing this, we can deduce the form of the quantum correlator in eqn [12] at the zero-temperature quantum critical point

\[ \chi(k, \omega) \sim \frac{1}{\left( c^2 k^2 - \omega^2 \right)^{1 - \eta/2}}; \quad T = 0,\ g = g_c \qquad [13] \]

The most important property of eqn [13] is the absence of a quasiparticle pole in the spectral density. Instead, $\mathrm{Im}(\chi(k, \omega))$ is nonzero for all $\omega > ck$, reflecting the presence of a continuum of critical excitations. Thus the stable quasiparticles found at low enough energies for all $g \ne g_c$ are absent at the quantum critical point.

Figure 5 Nonzero temperature phase diagram of $H_I$ (axes: g horizontal, T vertical; regions labeled “Domain wall quasiparticles,” “Quantum critical,” and “Flipped-spin quasiparticles”). The ferromagnetic order is present only at T = 0 on the shaded line with $g < g_c$. The dashed lines at finite T are crossovers out of the low-T quasiparticle regimes where a quasiclassical description applies. The state sketched on the paramagnetic side used the notation $|\!\rightarrow\rangle_j = 2^{-1/2}(|\uparrow\rangle_j + |\downarrow\rangle_j)$ and $|\!\leftarrow\rangle_j = 2^{-1/2}(|\uparrow\rangle_j - |\downarrow\rangle_j)$.

We now briefly discuss the nature of the phase diagram for T > 0 with g near $g_c$. In general, the interplay between quantum and thermal fluctuations near a quantum critical point can be quite complicated, and we cannot discuss it in any detail here. However, the physics of the quantum Ising chain is relatively simple, and also captures many key features found in more complex situations, and is summarized in Figure 5. For all $g \ne g_c$ there is a range of low temperatures ($T \lesssim |g - g_c|$) where the long time dynamics can be described using a dilute gas of thermally excited quasiparticles. Further, the
dynamics of these quasiparticles is quasiclassical, although we reiterate that the nature of the quasiparticles is entirely distinct on opposite sides of the quantum critical point. Most interesting, however, is the novel quantum critical region, $T \gtrsim |g - g_c|$, where neither a quasiparticle picture nor a quasiclassical description is appropriate. Instead, we have to understand the influence of temperature on the critical continuum associated with eqn [13]. This is aided by scaling arguments which show that the only important frequency scale which characterizes the spectrum is $k_B T/\hbar$, and the crossovers near this scale are universal, that is, independent of specific microscopic details of the lattice Hamiltonian. Consequently, the zero-momentum dynamic susceptibility in the quantum critical region takes the following form at small frequencies:

\[ \chi(k = 0, \omega) \sim \frac{1}{T^{2-\eta}} \frac{1}{\left( 1 - i\omega/\Gamma_R \right)} \qquad [14] \]

This has the structure of the response of an overdamped oscillator, and the damping frequency, $\Gamma_R$, is given by the universal expression

\[ \Gamma_R = \left( 2 \tan\frac{\pi}{16} \right) \frac{k_B T}{\hbar} \qquad [15] \]

The numerical proportionality constant in eqn [15] is specific to the quantum Ising chain; other models also obey eqn [15] but with a different numerical value for this constant.

Beyond LGW Theory

The quantum transitions discussed so far have turned out to have a critical theory identical to that found for classical thermal transitions in d + 1 dimensions. Over the last decade it has become clear that there are numerous models, of key physical importance, for which such a simple classical correspondence does not exist. In these models, quantum Berry phases are crucial in establishing the nature of the phases, and of the critical boundaries between them. In less technical terms, a signature of this subtlety is an important simplifying feature which was crucial in the analyses of the section “Simple models”: both models had a straightforward $g \to \infty$ limit in which we were able to write down a simple, nondegenerate, ground-state wave function of the “disordered” paramagnet. In many other models, identification of the disordered phase is not as straightforward: specifying absence of a particular magnetic order is not enough to identify a quantum state, as we still need to write down a suitable wave function. Often, subtle quantum interference effects induce new types of order in the disordered state, and such effects are entirely absent in the LGW theory.

Figure 6 Phase diagram of $H_s$ (g horizontal; antiferromagnetic order for $g < g_c$, VBS order for $g > g_c$). Two possible VBS states are shown: one which is the analog of Figure 2, and the other in which spins form singlets in a plaquette pattern. Both VBS states have a 4-fold degeneracy due to breaking of square lattice symmetry. So the novel critical point at $g = g_c$ (described by $Z_z$) has the antiferromagnetic and VBS orders vanishing as it is approached from either side: this coincident vanishing of orders is generically forbidden in LGW theories.

An important example of a system displaying such phenomena is the S = 1/2 square lattice antiferromagnet with additional frustrating interactions. The quantum degrees of freedom are identical to those of the coupled dimer antiferromagnet, but the Hamiltonian preserves the full point-group symmetry of the square lattice:

\[ H_s = \sum_{j<k} J_{jk} \left( \hat\sigma^x_j \hat\sigma^x_k + \hat\sigma^y_j \hat\sigma^y_k + \hat\sigma^z_j \hat\sigma^z_k \right) + \cdots \qquad [16] \]

Here the $J_{jk} > 0$ are short-range exchange interactions which preserve the square lattice symmetry, and the ellipses represent possible further multiple spin terms. Now imagine tuning all the non-nearest-neighbor terms as a function of some generic coupling constant g. For small g, when $H_s$ is nearly the square lattice antiferromagnet, the ground state has antiferromagnetic order as in Figure 4 and eqn [10]. What is now the disordered ground state for large g? One natural candidate is the spin-singlet paramagnet in Figure 2. However, because all nearest neighbor bonds of the square lattice are now equivalent, the state in Figure 2 is degenerate with three other states obtained by successive 90° rotations about a lattice site. In other words, the state in Figure 2, when transferred to the square lattice, breaks the symmetry of lattice rotations by 90°. Consequently it has a new type of order, often called valence-bond-solid (VBS) order. It is now believed that a large class of models like $H_s$ do indeed exhibit a second-order quantum phase transition between the antiferromagnetic state and a VBS state (see Figure 6). Both the existence of VBS order in the paramagnet, and of a second-order quantum transition, are features that are not predicted by LGW theory: these can only be
understood by a careful study of quantum interference effects associated with Berry phases of spin fluctuations about the antiferromagnetic state. We will not enter into details of this analysis here, but will conclude our discussion by writing down the theory so obtained for the quantum critical point in Figure 6:

\[ Z_z = \int \mathcal{D}z_\alpha(x, \tau)\, \mathcal{D}A_\mu(x, \tau) \exp\left( -\int d^2 x\, d\tau \left[ \left| (\partial_\mu - iA_\mu) z_\alpha \right|^2 + s |z_\alpha|^2 + \frac{u}{2} \left( |z_\alpha|^2 \right)^2 + \frac{1}{2e^2} \left( \epsilon_{\mu\nu\lambda} \partial_\nu A_\lambda \right)^2 \right] \right) \qquad [17] \]

Here $\mu$, $\nu$, $\lambda$ are spacetime indices which extend over the two spatial directions and $\tau$, $\alpha$ is a spinor index which extends over $\uparrow$, $\downarrow$, and $z_\alpha$ is a complex spinor field. In comparing $Z_z$ to $Z_\phi$, note that the vector order parameter $\phi_a$ has been replaced by a spinor $z_\alpha$, and these are related by $\phi_a = z^*_\alpha \sigma^a_{\alpha\beta} z_\beta$, where $\sigma^a$ are the Pauli matrices. So the order parameter has fractionalized into the $z_\alpha$. A second novel property of $Z_z$ is the presence of a U(1) gauge field $A_\mu$: this gauge force emerges near the critical point, even though the underlying model in eqn [16] only has simple two spin interactions. Studies of fractionalized critical theories like $Z_z$ in other models with spin and/or charge excitations are an exciting avenue for further theoretical research.

See also: Bose–Einstein Condensates; Boundary Conformal Field Theory; Fractional Quantum Hall Effect; Ginzburg–Landau Equation; High Tc Superconductor Theory; Quantum Central-Limit Theorems; Quantum Spin Systems; Quantum Statistical Mechanics: Overview.

Further Reading

Matsumoto M, Yasuda C, Todo S, and Takayama H (2002) Physical Review B 65: 014407.
Sachdev S (1999) Quantum Phase Transitions. Cambridge: Cambridge University Press.
Senthil T, Balents L, Sachdev S, Vishwanath A, and Fisher MPA, http://arxiv.org/abs/cond-mat/0312617.

Quantum Spin Systems


B Nachtergaele, University of California at Davis, Davis, CA, USA
© 2006 B Nachtergaele. Published by Elsevier Ltd. All rights reserved.

Introduction

The theory of quantum spin systems is concerned with the properties of quantum systems with an infinite number of degrees of freedom that each have a finite-dimensional state space. Occasionally, one is specifically interested in finite systems. Among the most common examples, one has an $n$-dimensional Hilbert space associated with each site of a $d$-dimensional lattice.

A model is normally defined by describing a Hamiltonian or a family of Hamiltonians, which are self-adjoint operators on the Hilbert space, and one studies their spectrum, the eigenstates, the equilibrium states, the system dynamics, and nonequilibrium stationary states, etc.

More particularly, the term ‘‘quantum spin system’’ often refers to such models where each degree of freedom is thought of as a spin variable, that is, there are three basic observables representing the components of the spin, $S^1$, $S^2$, and $S^3$, and these components transform according to a unitary representation of SU(2). The most commonly encountered situation is where the system consists of $N$ spins, each associated with a fixed irreducible representation of SU(2). One speaks of a spin-$J$ model if this representation is the $(2J + 1)$-dimensional one. The possible values of $J$ are 1/2, 1, 3/2, ...

The spins are usually thought of as each being associated with a site in a lattice, or more generally, a vertex in a graph. In a condensed-matter-physics model, each spin may be associated with an ion in a crystalline lattice. Quantum spin systems are also used in quantum information theory and quantum computation, and show up as abstract mathematical objects in representation theory and quantum probability.

In this article we give a brief introduction to the subject, starting with a very short review of its history. The mathematical framework is sketched and the most important definitions are given. Three sections, ‘‘Symmetries and symmetry breaking,’’ ‘‘Phase transitions,’’ and ‘‘Dynamics,’’ together cover the most important aspects of quantum spin systems actively pursued today.

A Very Brief History

The introduction of quantum spin systems was the result of the marriage of two developments during the 1920s. The first was the realization that angular momentum (hence, also the magnetic moment) is quantized (Pauli 1920, Stern and Gerlach 1922) and that particles such as the electron have an intrinsic
angular momentum called spin (Compton 1921, Goudsmit and Uhlenbeck 1925).

The second development was the attempt in statistical mechanics to explain ferromagnetism and the phase transition associated with it on the basis of a microscopic theory (Lenz and Ising 1925). The fundamental interaction between spins, the so-called exchange operator which is a subtle consequence of the Pauli exclusion principle, was introduced independently by Dirac and Heisenberg in 1926. With this discovery, it was realized that magnetism is a quantum effect and that a fundamental theory of magnetism requires the study of quantum-mechanical models. This realization and a large amount of subsequent work notwithstanding, some of the most fundamental questions, such as a derivation of ferromagnetism from first principles, remain open.

The first and most important quantum spin model is the Heisenberg model, so named after Heisenberg. It has been studied intensely ever since the early 1930s and its study has led to an impressive variety of new ideas in both mathematics and physics. Here, we limit ourselves to listing only some landmark developments.

Spin waves were discovered independently by Bloch and Slater in 1930 and they continue to play an essential role in our understanding of the excitation spectrum of quantum spin Hamiltonians. In two papers published in 1956, Dyson advanced the theory of spin waves by showing how interactions between spin waves can be taken into account.

In 1931, Bethe introduced the famous Bethe ansatz to show how the exact eigenvectors of the spin-1/2 Heisenberg model on the one-dimensional lattice can be found. This exact solution, directly and indirectly, led to many important developments in statistical mechanics, combinatorics, representation theory, quantum field theory and more. Hulthén used the Bethe ansatz to compute the ground-state energy of the antiferromagnetic spin-1/2 Heisenberg chain in 1938.

In their famous 1961 paper, Lieb, Schultz, and Mattis showed that some quantum spin models in one dimension can be solved exactly by mapping them into a problem of free fermions. This paper is still one of the most cited in the field.

Robinson, in 1967, laid the foundation for the mathematical framework, which we describe in the next section. Using this framework, Araki established the absence of phase transitions at positive temperatures in a large class of one-dimensional quantum spin models in 1969.

During the more recent decades, the mathematical and computational techniques used to study quantum spin models have fanned out in many directions. When it was realized in the 1980s that the magnetic properties of complex materials play an important role in high-Tc superconductivity, the variety of quantum spin models studied in the literature proliferated. This motivated a large number of theoretical and experimental studies of materials with exotic properties that are often based on quantum effects that do not have a classical analog. An example of unexpected behavior is the prediction by Haldane of the spin liquid ground state of the spin-1 Heisenberg antiferromagnetic chain in 1983. In the quest for a mathematical proof of this prediction (a quest still ongoing today), Affleck, Kennedy, Lieb, and Tasaki introduced the AKLT model in 1987. They were able to prove that the ground state of this model has all the characteristic properties predicted by Haldane for the Heisenberg chain: a unique ground state with exponential decay of correlations and a spectral gap above the ground state.

There are also particle models that are defined on a lattice, or more generally, a graph. Unlike spins, particles can hop from one site to another. These models are closely related to quantum spin systems and, in some cases, are mathematically equivalent. The best-known example of a model of lattice fermions is the Hubbard model. Such systems are not discussed further in this article.

Mathematical Framework

Quantum spin systems present an area of mathematical physics where the demands of mathematical rigor can be fully met and, in many cases, this can be done without sacrificing the ability to include all physically relevant models and phenomena. This does not mean, however, that there are few open problems remaining. But it does mean that, in general, these open problems are precisely formulated mathematical questions.

In this section we review the standard mathematical framework for quantum spin systems, in which the topics discussed in the subsequent sections can be given a precise mathematical formulation. It is possible, however, to skip this section and read the rest with only a physical or intuitive understanding of the notions of observable, Hamiltonian, dynamics, symmetry, ground state, etc.

The most common mathematical setup is as follows. Let $d \ge 1$, and let $\mathcal{L}$ denote the family of finite subsets of the $d$-dimensional integer lattice $\mathbb{Z}^d$. For simplicity we will assume that the Hilbert space of the ‘‘spin’’ associated with each $x \in \mathbb{Z}^d$ has the same dimension $n \ge 2$: $\mathcal{H}_{\{x\}} \cong \mathbb{C}^n$. The Hilbert space associated with the finite volume $\Lambda \in \mathcal{L}$ is then $\mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{H}_{\{x\}}$. The algebra of observables for the spin of site $x$ consists of the $n \times n$ complex matrices: $\mathcal{A}_{\{x\}} \cong M_n(\mathbb{C})$. For any
$\Lambda \in \mathcal{L}$, the algebra of observables for the system in $\Lambda$ is given by $\mathcal{A}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{A}_{\{x\}}$. The primary observables for a quantum spin model are the spin-$S$ matrices $S^1$, $S^2$, and $S^3$, where $S$ is the half-integer such that $n = 2S + 1$. They are defined as Hermitian matrices satisfying the SU(2) commutation relations. Instead of $S^1$ and $S^2$, one often works with the spin-raising and -lowering operators, $S^+$ and $S^-$, defined by the relations $S^1 = (S^+ + S^-)/2$, and $S^2 = (S^+ - S^-)/(2i)$. In terms of these, the SU(2) commutation relations are

$$ [S^+, S^-] = 2S^3, \qquad [S^3, S^\pm] = \pm S^\pm \qquad [1] $$

where we have used the standard notation for the commutator for two elements $A$ and $B$ in an algebra: $[A, B] = AB - BA$. In the standard basis $S^3$, $S^+$, and $S^-$ are given by the following matrices:

$$ S^3 = \begin{pmatrix} S & & & \\ & S-1 & & \\ & & \ddots & \\ & & & -S \end{pmatrix} $$

$S^- = (S^+)^*$, and

$$ S^+ = \begin{pmatrix} 0 & c_S & & & \\ & 0 & c_{S-1} & & \\ & & \ddots & \ddots & \\ & & & 0 & c_{-S+1} \\ & & & & 0 \end{pmatrix} $$

where, for $m = -S, -S + 1, \ldots, S$,

$$ c_m = \sqrt{S(S + 1) - m(m - 1)} $$

In the case $n = 2$, one often works with the Pauli matrices, $\sigma^1$, $\sigma^2$, $\sigma^3$, simply related to the spin matrices by $\sigma^j = 2S^j$, $j = 1, 2, 3$.

Most physical observables are expressed as finite sums and products of the spin matrices $S^j_x$, $j = 1, 2, 3$, associated with the site $x \in \Lambda$:

$$ S^j_x = \bigotimes_{y \in \Lambda} A_y $$

with $A_x = S^j$, and $A_y = 1$ if $y \ne x$.

The $\mathcal{A}_\Lambda$ are finite-dimensional C*-algebras for the usual operations of sum, product, and Hermitian conjugation of matrices and with identity $1_\Lambda$. If $\Lambda_0 \subset \Lambda_1$, there is a natural embedding of $\mathcal{A}_{\Lambda_0}$ into $\mathcal{A}_{\Lambda_1}$, given by

$$ \mathcal{A}_{\Lambda_0} \cong \mathcal{A}_{\Lambda_0} \otimes 1_{\Lambda_1 \setminus \Lambda_0} \subset \mathcal{A}_{\Lambda_1} $$

The algebra of local observables is then defined by

$$ \mathcal{A}_{\rm loc} = \bigcup_{\Lambda \in \mathcal{L}} \mathcal{A}_\Lambda $$

Its completion is the C*-algebra of quasilocal observables, which we will simply denote by $\mathcal{A}$.

The dynamics and symmetries of a quantum spin model are described by (groups of) automorphisms of the C*-algebra $\mathcal{A}$, that is, bijective linear transformations $\alpha$ on $\mathcal{A}$ that preserve the product and $*$ operations. Translation invariance, for example, is expressed by the translation automorphisms $\tau_x$, $x \in \mathbb{Z}^d$, which map any subalgebra $\mathcal{A}_\Lambda$ to $\mathcal{A}_{\Lambda + x}$, in the natural way. They form a representation of the additive group $\mathbb{Z}^d$ on $\mathcal{A}$.

A translation-invariant interaction, or potential, defining a quantum spin model, is a map $\Phi : \mathcal{L} \to \mathcal{A}$ with the following properties: for all $X \in \mathcal{L}$, we have $\Phi(X) \in \mathcal{A}_X$, $\Phi(X) = \Phi(X)^*$, and for $x \in \mathbb{Z}^d$, $\Phi(X + x) = \tau_x(\Phi(X))$. An interaction is called finite range if there exists $R > 0$ such that $\Phi(X) = 0$ whenever ${\rm diam}(X) > R$. The Hamiltonian in $\Lambda$ is the self-adjoint element of $\mathcal{A}_\Lambda$ defined by

$$ H_\Lambda = \sum_{X \subset \Lambda} \Phi(X) $$

For the standard Heisenberg model the interaction is given by

$$ \Phi(\{x, y\}) = -J\, S_x \cdot S_y, \quad \text{if } |x - y| = 1 \qquad [2] $$

and $\Phi(X) = 0$ in all other cases. Here, $S_x \cdot S_y$ is the conventional notation for $S^1_x S^1_y + S^2_x S^2_y + S^3_x S^3_y$. The magnitude of the coupling constant $J$ sets a natural unit of energy and is irrelevant from the mathematical point of view. Its sign, however, determines whether the model is ferromagnetic ($J > 0$), or antiferromagnetic ($J < 0$). For the classical Heisenberg model, where the role of $S_x$ is played by a unit vector in $\mathbb{R}^3$, and which can be regarded, after rescaling by a factor $S^{-2}$, as the limit $S \to \infty$ of the quantum Heisenberg model, there is a simple transformation relating the ferro- and antiferromagnetic models (just map $S_x$ to $-S_x$ for all $x$ in the even sublattice of $\mathbb{Z}^d$). It is easy to see that there does not exist an automorphism of $\mathcal{A}$ mapping $S_x$ to $-S_x$, since that would be inconsistent with the commutation relations [1]. Not only is there no exact mapping between the ferro- and the antiferromagnetic models, their ground states and equilibrium states have radically different properties. See below for the definitions and further discussion.

The dynamics (or time evolution), of the system in finite volume $\Lambda$ is the one-parameter group of automorphisms of $\mathcal{A}_\Lambda$ given by

$$ \alpha_t^{(\Lambda)}(A) = e^{itH_\Lambda} A\, e^{-itH_\Lambda}, \qquad t \in \mathbb{R} $$
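As an informal illustration, which is not part of the original article, the constructions above can be sketched numerically for a short open chain. The Python sketch below assumes NumPy; the function names (`spin_matrices`, `site_operator`, `heisenberg_chain`, `evolve`) and the restriction to a one-dimensional chain with free boundary conditions are choices made here for the example. It builds the spin-$S$ matrices from the formula for $c_m$, assembles the Heisenberg Hamiltonian from the interaction [2], and evaluates the finite-volume dynamics.

```python
import numpy as np

def spin_matrices(S):
    """Return (S3, Splus, Sminus) for spin S in the standard basis m = S, S-1, ..., -S."""
    n = int(round(2 * S + 1))
    m = S - np.arange(n)                       # diagonal of S3: S, S-1, ..., -S
    S3 = np.diag(m).astype(complex)
    # Superdiagonal entries c_S, c_{S-1}, ..., c_{-S+1}, with c_m = sqrt(S(S+1) - m(m-1)).
    c = np.sqrt(S * (S + 1) - m[:-1] * (m[:-1] - 1))
    Splus = np.diag(c, k=1).astype(complex)
    Sminus = Splus.conj().T
    return S3, Splus, Sminus

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

def site_operator(op, x, L):
    """Embed a single-site matrix at site x (0-based) of an L-site chain,
    with the identity on every other site."""
    n = op.shape[0]
    out = np.eye(1, dtype=complex)
    for y in range(L):
        out = np.kron(out, op if y == x else np.eye(n))
    return out

def heisenberg_chain(S, L, J):
    """H = sum over nearest neighbours of -J S_x . S_{x+1} on an open chain (assumed geometry)."""
    S3, Sp, Sm = spin_matrices(S)
    S1 = (Sp + Sm) / 2
    S2 = (Sp - Sm) / 2j
    dim = S3.shape[0] ** L
    H = np.zeros((dim, dim), dtype=complex)
    for x in range(L - 1):
        for comp in (S1, S2, S3):
            H += -J * site_operator(comp, x, L) @ site_operator(comp, x + 1, L)
    return H

def evolve(A, H, t):
    """alpha_t(A) = exp(itH) A exp(-itH), using the spectral decomposition of H."""
    w, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(1j * t * w)) @ V.conj().T   # U = exp(itH)
    return U @ A @ U.conj().T
```

With these helpers one can check, for example, that `comm(Sp, Sm)` equals `2 * S3` for the matrices returned by `spin_matrices(S)`, and that `evolve` preserves products, as an automorphism should. Exact diagonalization of this kind is only feasible for small $L$, since the dimension grows as $n^L$.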
For each $t \in \mathbb{R}$, $\alpha_t^{(\Lambda)}$ is an automorphism of $\mathcal{A}_\Lambda$ and the family $\{\alpha_t^{(\Lambda)} \mid t \in \mathbb{R}\}$ forms a representation of the additive group $\mathbb{R}$.

Each $\alpha_t^{(\Lambda)}$ can trivially be extended to an automorphism on $\mathcal{A}$, by tensoring with the identity map. Under quite general conditions, $\alpha_t^{(\Lambda)}$ converges strongly as $\Lambda \to \mathbb{Z}^d$ in a suitable sense, that is, for every $A \in \mathcal{A}$, the limit

$$ \lim_{\Lambda \uparrow \mathbb{Z}^d} \alpha_t^{(\Lambda)}(A) = \alpha_t(A) $$

exists in the norm in $\mathcal{A}$, and it can be shown that it defines a strongly continuous one-parameter group of automorphisms of $\mathcal{A}$. $\Lambda \uparrow \mathbb{Z}^d$ stands for any sequence of $\Lambda \in \mathcal{L}$ such that $\Lambda$ eventually contains any given element of $\mathcal{L}$. A sufficient condition on the potential $\Phi$ is that there exists $\lambda > 0$ such that $\|\Phi\|_\lambda$ is finite, with

$$ \|\Phi\|_\lambda = \sum_{X \ni 0} e^{\lambda |X|}\, \|\Phi(X)\| \qquad [3] $$

Here $|X|$ denotes the number of elements in $X$. One can show that, under the same conditions, $\delta$ defined on $\mathcal{A}_{\rm loc}$ by

$$ \delta(A) = \lim_{\Lambda \uparrow \mathbb{Z}^d} [H_\Lambda, A] $$

is a norm-closable (unbounded) derivation on $\mathcal{A}$ and that its closure is, up to a factor $i$, the generator of $\{\alpha_t \mid t \in \mathbb{R}\}$, that is, formally

$$ \alpha_t = e^{it\delta} $$

For the class of $\Phi$ with finite $\|\Phi\|_\lambda$ for some $\lambda > 0$, $\mathcal{A}_{\rm loc}$ is a core of analytic vectors for $\delta$. This means that, for each $A \in \mathcal{A}_{\rm loc}$, the function $t \mapsto \alpha_t(A)$ can be extended to a function $\alpha_z(A)$ analytic in a strip $|{\rm Im}\, z| < a$ for some $a > 0$.

A state of the quantum spin system is a linear functional $\omega$ on $\mathcal{A}$ such that $\omega(A^* A) \ge 0$, for all $A \in \mathcal{A}$ (positivity), and $\omega(1) = 1$ (normalization). The restriction of $\omega$ to $\mathcal{A}_\Lambda$, for each $\Lambda \in \mathcal{L}$, is uniquely determined by a density matrix, that is, $\rho_\Lambda \in \mathcal{A}_\Lambda$, such that

$$ \omega(A) = {\rm tr}\, \rho_\Lambda A, \quad \text{for all } A \in \mathcal{A}_\Lambda $$

where tr denotes the usual trace of matrices. $\rho_\Lambda$ is non-negative definite and of unit trace. If the density matrix is a one-dimensional projection, the state is called a vector state, and can be identified with a vector $\psi \in \mathcal{H}_\Lambda$, such that $\mathbb{C}\psi = {\rm ran}\, \rho_\Lambda$.

A ground state of the quantum spin system is a state $\omega$ satisfying the local stability inequalities:

$$ \omega(A^* \delta(A)) \ge 0, \quad \text{for all } A \in \mathcal{A}_{\rm loc} \qquad [4] $$

The states describing thermal equilibrium are characterized by the Kubo–Martin–Schwinger (KMS) condition: for any $\beta \ge 0$ (related to absolute temperature by $\beta = 1/(k_B T)$, where $k_B$ is the Boltzmann constant), $\omega$ is called $\beta$-KMS if

$$ \omega(A\, \alpha_{i\beta}(B)) = \omega(BA), \quad \text{for all } A, B \in \mathcal{A}_{\rm loc} \qquad [5] $$

The most common way to construct ground states and equilibrium states, namely solutions of [4] and [5], respectively, is by taking thermodynamic limits of finite-volume states with suitable boundary conditions. A ground state of the finite-volume Hamiltonian $H_\Lambda$ is a convex combination of vector states that are eigenstates of $H_\Lambda$ belonging to its smallest eigenvalue. The finite-volume equilibrium state at inverse temperature $\beta$ has density matrix $\rho_\beta$ defined by

$$ \rho_\beta = Z(\Lambda, \beta)^{-1} e^{-\beta H_\Lambda} $$

where $Z(\Lambda, \beta) = {\rm tr}\, e^{-\beta H_\Lambda}$ is called the partition function. By considering limit points as $\Lambda \to \mathbb{Z}^d$, one can show that a quantum spin model always has at least one ground state and at least one equilibrium state for all $\beta$.

In this section, the basic concepts have so far been discussed in the most standard setup. Clearly, many generalizations are possible: one can consider non-translation-invariant models; models with random potentials; the state spaces at each site may have different dimensions; instead of $\mathbb{Z}^d$ one can consider other lattices or define models on arbitrary graphs; one can allow interactions of infinite range that satisfy weaker conditions than those imposed by the finiteness of the norm [3], or restrict to subspaces of the Hilbert space by imposing symmetries or suitable hardcore conditions; and one can study models with infinite-dimensional spins. Examples of all these types of generalizations have been considered in the literature and have interesting applications.

Symmetries and Symmetry Breaking

Many interesting properties of quantum spin systems are related to symmetries and symmetry breaking. Symmetries of a quantum spin model are realized as representations of groups, Lie algebras, or quantum (group) algebras on the Hilbert space and/or the observable algebra. The symmetry property of the model is expressed by the fact that the Hamiltonian (or the dynamics) commutes with this representation. We briefly discuss the most common symmetries.

Translation invariance. The translation automorphisms $\tau_x$ have already been defined on the
observable algebra of infinite quantum spin systems on $\mathbb{Z}^d$. One can also define translation automorphisms for finite systems with periodic boundary conditions, which are defined on the torus $\mathbb{Z}^d / T\mathbb{Z}^d$, where $T = (T_1, \ldots, T_d)$ is a positive integer vector representing the periods.

Other graph automorphisms. In general, if $G$ is a group of automorphisms of the graph $\Lambda$, and $\mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathbb{C}^n$ is the Hilbert space of a system of identical spins defined on $\Lambda$, then, for each $g \in G$, one can define a unitary $U_g$ on $\mathcal{H}_\Lambda$ by linear extension of $U_g \bigotimes \varphi_x = \bigotimes \varphi_{g^{-1}(x)}$, where $\varphi_x \in \mathbb{C}^n$, for all $x \in \Lambda$. These unitaries form a representation of $G$. With the unitaries one can immediately define automorphisms of the algebra of observables: for $A \in \mathcal{A}_\Lambda$, and $U \in \mathcal{A}_\Lambda$ unitary, $\alpha(A) = U^* A U$ defines an automorphism, and if $U_g$ is a group representation, the corresponding $\alpha_g$ will be, too. Common examples of graph automorphisms are the lattice symmetries of rotation and reflection. Translation symmetry and other graph automorphisms are often referred to collectively as spatial symmetries.

Local symmetries (also called gauge symmetries). Let $G$ be a group and $u_g$, $g \in G$, a unitary representation of $G$ on $\mathbb{C}^n$. Then, $U_g = \bigotimes_{x \in \Lambda} u_g$ is a representation on $\mathcal{H}_\Lambda$. The Heisenberg model [2], for example, commutes with such a representation of SU(2). It is often convenient, and generally equivalent, to work with a representation of the Lie algebra. In that case the SU(2) invariance of the Heisenberg model is expressed by the fact that $H_\Lambda$ commutes with the following three operators:

$$ S^i = \sum_{x \in \Lambda} S^i_x, \quad i = 1, 2, 3 $$

Note: sometimes the Hamiltonian is only symmetric under certain combinations of spatial and local symmetries. CP symmetry is an example.

For an automorphism $\alpha$, we say that a state $\omega$ is $\alpha$-invariant if $\omega \circ \alpha = \omega$. If $\omega$ is $\alpha_g$-invariant for all $g \in G$, we say that $\omega$ is $G$-invariant.

It is easy to see that if a quantum spin model has a symmetry $G$, then the set of all ground states or all $\beta$-KMS states will be $G$-invariant, meaning that if $\omega$ is in the set, then so is $\omega \circ \alpha_g$, for all $g \in G$. By a suitable averaging procedure, it is usually easy to establish that the sets of ground states or equilibrium states contain at least one $G$-invariant element.

An interesting situation occurs if the model is $G$-invariant, but there are ground states or KMS states that are not. This means that, for some $g \in G$, and some $\omega$ in the set (of ground states or KMS states), $\omega \circ \alpha_g \ne \omega$. When this happens, one says that there is spontaneous symmetry breaking, a phenomenon that also plays an important role in quantum field theory.

The famous Hohenberg–Mermin–Wagner theorem, applied to quantum spin models, states that, as long as the interactions do not have very long range and the dimension of the lattice is 2 or less, continuous symmetries cannot be spontaneously broken in a $\beta$-KMS state for any finite $\beta$.

Quantum group symmetries. We restrict ourselves to one important example: the SU$_q$(2) invariance of the spin-1/2 XXZ Heisenberg chain with $q \in [0, 1]$, and with special boundary terms. The Hamiltonian of the SU$_q$(2)-invariant XXZ chain of length $L$ is given by

$$ H_L = \sum_{x=1}^{L-1} \left[ -\frac{1}{\Delta}\left( S^1_x S^1_{x+1} + S^2_x S^2_{x+1} \right) - \left( S^3_x S^3_{x+1} - 1/4 \right) + \frac{1}{2}\sqrt{1 - \Delta^{-2}}\, \left( S^3_{x+1} - S^3_x \right) \right] $$

where $q \in (0, 1]$ is related to the parameter $\Delta \ge 1$ by the relation $\Delta = (q + q^{-1})/2$. When $q = 0$, $H_L$ is equivalent to the Ising chain. Thus, the XXZ model interpolates between the Ising model (the primordial classical spin system) and the isotropic Heisenberg model (the most widely studied quantum spin model). In the limit of infinite spin ($S \to \infty$), the model converges to the classical Heisenberg model (XXZ or isotropic). An interesting feature of the XXZ model is the existence of non-translation-invariant ground states, called kink states.

In this family of models, one can see how aspects of discreteness (quantized spins) and continuous symmetry (SU(2), or quantum symmetry SU$_q$(2)) are present at the same time in the quantum Heisenberg models, and the two classical limits ($q \to 0$ and $S \to \infty$) can be used as a starting point to study its properties.

Quantum group symmetry is not a special case of invariance under the action of a group. There is no group, but there is an algebra represented on the Hilbert space of each spin, for which there is a good definition of tensor product of representations, and ‘‘many’’ irreducible representations. In this example, the representation of SU$_q$(2) on $\mathcal{H}_{[1,L]}$ commuting with $H_L$ is generated by

$$ S^3 = \sum_{x=1}^{L} 1_1 \otimes \cdots \otimes S^3_x \otimes 1_{x+1} \otimes \cdots \otimes 1_L $$

$$ S^+ = \sum_{x=1}^{L} t_1 \otimes \cdots \otimes t_{x-1} \otimes S^+_x \otimes 1_{x+1} \otimes \cdots \otimes 1_L $$

$$ S^- = \sum_{x=1}^{L} 1_1 \otimes \cdots \otimes S^-_x \otimes t^{-1}_{x+1} \otimes \cdots \otimes t^{-1}_L $$
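The statement that the SU$_q$(2) generators $S^3$, $S^+$, $S^-$ commute with the XXZ Hamiltonian $H_L$ can be checked numerically on a short chain. The following Python sketch is an illustration added here, not taken from the article; it assumes NumPy, takes $t$ to be the diagonal matrix with entries $q^{-1}$ and $q$ as specified in the text, and uses helper names (`chain_product`, `at_site`, `xxz_hamiltonian`, `suq2_generators`) that are hypothetical.

```python
import numpy as np

# Spin-1/2 matrices in the standard basis (up, down).
S3 = np.diag([0.5, -0.5]).astype(complex)
SP = np.array([[0, 1], [0, 0]], dtype=complex)    # S^+
SM = SP.conj().T                                   # S^-
S1 = (SP + SM) / 2
S2 = (SP - SM) / 2j
ID = np.eye(2, dtype=complex)

def chain_product(factors):
    """Tensor product of a list of single-site matrices, ordered left to right."""
    out = np.eye(1, dtype=complex)
    for f in factors:
        out = np.kron(out, f)
    return out

def at_site(op, x, L):
    """op acting at site x (1-based), identity on the other sites."""
    return chain_product([op if y == x else ID for y in range(1, L + 1)])

def xxz_hamiltonian(q, L):
    """H_L with the boundary term (1/2) sqrt(1 - Delta^-2) (S3_{x+1} - S3_x), for 0 < q <= 1."""
    delta = (q + 1 / q) / 2
    a = 0.5 * np.sqrt(1 - delta ** -2)
    dim = 2 ** L
    H = np.zeros((dim, dim), dtype=complex)
    for x in range(1, L):
        H += -(at_site(S1, x, L) @ at_site(S1, x + 1, L)
               + at_site(S2, x, L) @ at_site(S2, x + 1, L)) / delta
        H += -(at_site(S3, x, L) @ at_site(S3, x + 1, L) - 0.25 * np.eye(dim))
        H += a * (at_site(S3, x + 1, L) - at_site(S3, x, L))
    return H

def suq2_generators(q, L):
    """Total S^3, S^+, S^- built with t = diag(1/q, q) to the left or t^{-1} to the right."""
    t = np.diag([1 / q, q]).astype(complex)
    t_inv = np.linalg.inv(t)
    dim = 2 ** L
    S3_tot = np.zeros((dim, dim), dtype=complex)
    Sp_tot = np.zeros_like(S3_tot)
    Sm_tot = np.zeros_like(S3_tot)
    for x in range(1, L + 1):
        left, right = x - 1, L - x
        S3_tot += chain_product([ID] * left + [S3] + [ID] * right)
        Sp_tot += chain_product([t] * left + [SP] + [ID] * right)
        Sm_tot += chain_product([ID] * left + [SM] + [t_inv] * right)
    return S3_tot, Sp_tot, Sm_tot
```

For a given `q` and `L`, forming `H @ G - G @ H` for each generator `G` returned by `suq2_generators(q, L)` should give a matrix that vanishes up to rounding. At `q = 1` the boundary term drops out and the construction reduces to the isotropic ferromagnetic chain with its ordinary SU(2) symmetry.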
Here $t$ denotes the matrix

$$ t = \begin{pmatrix} q^{-1} & 0 \\ 0 & q \end{pmatrix} $$

Quantum group symmetries were discovered in exactly solvable models, starting with the spin-1/2 XXZ chain. One can exploit their representation theory to study the spectrum of the Hamiltonian in very much the same way as ordinary symmetries. The main restriction to its applicability is that the tensor product structure of the representations is inherently one-dimensional, that is, relying on an ordering from left to right. For the infinite XXZ chain the left-to-right and right-to-left orderings can be combined to generate an infinite-dimensional algebra, the quantum affine algebra $U_q(\widehat{sl}_2)$.

Phase Transitions

Quantum spin models of condensed matter physics often have interesting ground states. Not only are the ground states often a good approximation of the low-temperature behavior of the real systems that are modeled by them, so that studying them is useful; finding them is in many cases also a challenging mathematical problem. This is in contrast with classical lattice models for which the ground states are usually simple and easy to find. In more than one way, ground states of quantum spin systems display behavior similar to equilibrium states of classical spin systems at positive temperature.

The spin-1/2 Heisenberg antiferromagnet on $\Lambda \subset \mathbb{Z}^d$, with Hamiltonian

$$ H_\Lambda = \sum_{x, y \in \Lambda,\ |x - y| = 1} S_x \cdot S_y \qquad [6] $$

is a case in point. Even in the one-dimensional case ($d = 1$), and even though the model in that case is exactly solvable by the Bethe ansatz, its ground state is highly nontrivial. Analysis of the Bethe ansatz solution (which is not fully rigorous) shows that the spin–spin correlation function decays to zero at infinity, but slower than exponentially (roughly as inverse distance squared). For $d = 2$, it is believed, but not mathematically proved, that the ground state has Néel order, that is, long-range antiferromagnetic order, accompanied by a spontaneous breaking of the SU(2) symmetry. Using reflection positivity, Dyson, Lieb, and Simon were able to prove the Néel order at sufficiently low temperature (large $\beta$), for $d \ge 3$ and all $S \ge 1/2$. This was later extended to the ground state for $d = 2$ and $S \ge 1$, and $d \ge 3$ and $S \ge 1/2$, that is, all the cases where Néel order is expected except $d = 2$, $S = 1/2$.

In contrast, no proof of long-range order in the Heisenberg ferromagnet at low temperature exists. This is rather remarkable since proving long-range order in the ground states of the ferromagnet is a trivial problem.

Of particular interest are the so-called quantum phase transitions. These are phase transitions that occur as a parameter in the Hamiltonian is varied and which are driven by the competing effects of energy and quantum fluctuations, rather than the balance between energy and entropy which drives usual equilibrium phase transitions. Since entropy does not play a role, quantum phase transitions can be observed at zero temperature, that is, in the ground states.

An important example of a quantum phase transition occurs in the two- or higher-dimensional XY model with a magnetic field in the Z-direction. It was proved by Kennedy, Lieb, and Shastry that, at zero field, this model has off-diagonal long-range order (ODLRO), and can be interpreted as a hard-core Bose gas at half-filling. It is also clear that if the magnetic field exceeds a critical value, $h_c$, the model has a simple ferromagnetically ordered ground state. There are indications that there is ODLRO for all $|h| < h_c$. However, so far there is no proof that ODLRO exists for any $h \ne 0$.

What makes the ground-state problem of quantum spin systems interesting and difficult at the same time is that ground states, in general, do not minimize the expectation value of the interaction terms in the Hamiltonian individually although, loosely speaking, the expectation value of their sum (the Hamiltonian) is minimized. However, there are interesting exceptions to this rule. Two examples are the AKLT model and the ferromagnetic XXZ model.

The wide-ranging behavior of quantum spin models has required an equally wide range of mathematical approaches to study them. There is one group of methods, however, that can make a claim of substantial generality: those that start from a representation of the partition function based on the Feynman–Kac formula. Such representations turn a $d$-dimensional quantum spin model into a $(d + 1)$-dimensional classical problem, albeit one with some special features. This technique was pioneered by Ginibre in 1968 and was quickly adopted by a number of authors to solve a variety of problems. Techniques borrowed from classical statistical mechanics have been adapted with great success to study ground states, the low-temperature phase diagram, or the high-temperature regime of quantum spin models that can be regarded as perturbations of a classical system. More recently, it was used to develop a quantum version of Pirogov–Sinai theory which is applicable to a large class of problems, including some with low-temperature phases not related by symmetry.
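The Feynman–Kac-type representations mentioned above are not worked out in this article. A closely related and simpler device, substituted here purely for illustration, is the Lie–Trotter product formula: cutting $e^{-\beta H}$ into $M$ imaginary-time slices is what introduces the extra ‘‘time’’ direction of the $(d + 1)$-dimensional problem. The Python sketch below assumes NumPy and uses hypothetical helper names; it compares the sliced approximation of the partition function of the antiferromagnetic spin-1/2 chain [6] (open boundary conditions, $L \ge 3$) with the exact value obtained by diagonalization.

```python
import numpy as np

S3 = np.diag([0.5, -0.5]).astype(complex)
SP = np.array([[0, 1], [0, 0]], dtype=complex)    # S^+
SM = SP.conj().T                                   # S^-

def bond_term(x, L):
    """S_x . S_{x+1} for sites x, x+1 (0-based) of an open chain of L spins 1/2."""
    two_site = np.kron(S3, S3) + 0.5 * (np.kron(SP, SM) + np.kron(SM, SP))
    left = np.eye(2 ** x)
    right = np.eye(2 ** (L - x - 2))
    return np.kron(np.kron(left, two_site), right)

def exp_hermitian(A, s):
    """exp(s*A) for Hermitian A and real s, via the spectral decomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(s * w)) @ V.conj().T

def partition_function_exact(L, beta):
    """Z = tr exp(-beta H) from the eigenvalues of H."""
    H = sum(bond_term(x, L) for x in range(L - 1))
    return np.sum(np.exp(-beta * np.linalg.eigvalsh(H)))

def partition_function_trotter(L, beta, M):
    """tr[(exp(-(beta/M) H_even) exp(-(beta/M) H_odd))^M] with M imaginary-time slices.
    H_even and H_odd collect the bonds starting at even and odd sites; requires L >= 3."""
    H_even = sum(bond_term(x, L) for x in range(0, L - 1, 2))
    H_odd = sum(bond_term(x, L) for x in range(1, L - 1, 2))
    step = exp_hermitian(H_even, -beta / M) @ exp_hermitian(H_odd, -beta / M)
    return np.trace(np.linalg.matrix_power(step, M)).real
```

Comparing `partition_function_trotter(L, beta, M)` with `partition_function_exact(L, beta)` for increasing `M` shows the sliced expression approaching the exact partition function. Within each slice the bonds in `H_even` commute with one another, as do those in `H_odd`, which is what allows each factor to be written in terms of independent two-site weights in an actual quantum-to-classical mapping.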
Dynamics

Another feature of quantum spin systems that makes them mathematically richer than their classical counterpart is the existence of a Hamiltonian dynamics. Quite generally, the dynamics is well defined in the thermodynamic limit as a strongly continuous one-parameter group of automorphisms of the C*-algebra of quasilocal observables. Strictly speaking, a quantum spin model is actually defined by its dynamics $\alpha_t$, or by its generator $\delta$, and not by the potential $\Phi$. Indeed, $\Phi$ is not uniquely determined by $\alpha_t$. In particular, it is possible to incorporate various types of boundary conditions into the definition of $\delta$. This approach has proved very useful in obtaining important structural results, such as the proof by Araki of the uniqueness of the KMS state at any finite $\beta$ in one dimension. Another example is a characterization of equilibrium states by the energy–entropy balance inequalities, which is both physically appealing and mathematically useful: $\omega$ is a $\beta$-KMS state for a quantum spin model in the setting of the section on the mathematical framework in this article (and in fact also for more general quantum systems), if and only if the inequality

$$ \beta\, \omega(X^* \delta(X)) \ge \omega(X^* X)\, \log \frac{\omega(X^* X)}{\omega(X X^*)} $$

is satisfied for all $X \in \mathcal{A}_{\rm loc}$. This characterization and several related results were proved in a series of works by various authors (mainly Roepstorff, Araki, Fannes, Verbeure, and Sewell).

Detailed properties of the dynamics for specific models are generally lacking. One could point to the ‘‘immediate nonlocality’’ of the dynamics as the main difficulty. By this, we mean that, except in trivial cases, most local observables $A \in \mathcal{A}_{\rm loc}$ become nonlocal after an arbitrarily short time, that is, $\alpha_t(A) \notin \mathcal{A}_{\rm loc}$, for any $t \ne 0$. This nonlocality is not totally uncontrolled, however. A result by Lieb and Robinson establishes that, for models with interactions that are sufficiently short range (e.g., finite range), the nonlocality propagates at a bounded speed. More precisely, under quite general conditions, there exist constants $c, v > 0$ such that, for any two local observables $A, B \in \mathcal{A}_{\{0\}}$,

$$ \|[\alpha_t(A), \tau_x(B)]\| \le 2\, \|A\|\, \|B\|\, e^{-c(|x| - v|t|)} $$

Attempts to understand the dynamics have generally been aimed at one of the two issues: return to equilibrium from a perturbed state, and convergence to a nonequilibrium steady state in the presence of currents. Some interesting results have been obtained although much remains to be done.

Acknowledgment

This work was supported in part by the National Science Foundation under Grant # DMS-0303316.

See also: Bethe Ansatz; Channels in Quantum Information Theory; Eight Vertex and Hard Hexagon Models; Exact Renormalization Group; Falicov–Kimball Model; Finitely Correlated States; High Tc Superconductor Theory; Hubbard Model; Pirogov–Sinai Theory; Quantum Central-Limit Theorems; Quantum Phase Transitions; Quantum Statistical Mechanics: Overview; Reflection Positivity and Phase Transitions; Symmetry and Symmetry Breaking in Dynamical Systems; Symmetry Breaking in Field Theory.

Further Reading

Affleck I, Kennedy T, Lieb EH, and Tasaki H (1988) Valence bond ground states in isotropic quantum antiferromagnets. Communications in Mathematical Physics 115: 477–528.
Aizenman M and Nachtergaele B (1994) Geometric aspects of quantum spin states. Communications in Mathematical Physics 164: 17–63.
Araki H (1969) Gibbs states of a one dimensional quantum lattice. Communications in Mathematical Physics 14: 120–157.
Borgs C, Kotecký R, and Ueltschi D (1996) Low-temperature phase diagrams for quantum perturbations of classical spin systems. Communications in Mathematical Physics 181: 409–446.
Bratteli O and Robinson DW (1981, 1997) Operator Algebras and Quantum Statistical Mechanics 2. Equilibrium States. Models in Quantum Statistical Mechanics. Berlin: Springer.
Datta N, Fernández R, and Fröhlich J (1996) Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. Journal of Statistical Physics 84: 455–534.
Dyson F, Lieb EH, and Simon B (1978) Phase transitions in quantum spin systems with isotropic and non-isotropic interactions. Journal of Statistical Physics 18: 335–383.
Fannes M, Nachtergaele B, and Werner RF (1992) Finitely correlated states on quantum spin chains. Communications in Mathematical Physics 144: 443–490.
Kennedy T (1985) Long-range order in the anisotropic quantum ferromagnetic Heisenberg model. Communications in Mathematical Physics 100: 447–462.
Kennedy T and Nachtergaele B (1996) The Heisenberg model – a bibliography. http://math.arizona.edu/tgk/qs.html.
Kennedy T and Tasaki H (1992) Hidden symmetry breaking and the Haldane phase in S = 1 quantum spin chains. Communications in Mathematical Physics 147: 431–484.
Lieb E, Schultz T, and Mattis D (1961) Two soluble models of an antiferromagnetic chain. Annals of Physics (NY) 16: 407–466.
Matsui T (1990) Uniqueness of the translationally invariant ground state in quantum spin systems. Communications in Mathematical Physics 126: 453–467.
Mattis DC (1981, 1988) The Theory of Magnetism. I. Berlin: Springer.
Simon B (1993) The Statistical Mechanics of Lattice Gases. Volume I. Princeton: Princeton University Press.

Quantum Statistical Mechanics: Overview


L Triolo, Università di Roma “Tor Vergata”, Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Quantum theory actually started at the beginning of the twentieth century as a many-body theory, attempting to solve problems to which classical physics gave unsatisfactory answers.

This article aims to follow the developments of quantum statistical mechanics, hereafter called QSM, staying close to the underlying physics and sketching its methods and perspectives. The next section outlines the historical path, and the first achievements by Planck (1900) and Debye (1913); the subsequent free quantum gas theory will be recalled in the first original insights due to Fermi, Dirac (1926) and Bose, Einstein (1924–25), when many open problems began to find a coherent treatment.

In this framework, an interesting new idea appeared: the elementary units of the systems could be “particles”, in the usual or in a broad meaning, a notion which includes photons, phonons, and quasiparticles of current use in condensed matter physics. The description of a classical harmonic system through independent normal modes is an example of a very fruitful use of collective variables.

The subsequent section will deal with more recent achievements, related to the properties of quantum N-body systems, which are fundamental for the derivation of their macroscopic behavior. In particular, the works by Dyson–Lenard and Lieb–Lebowitz on the stability of matter have to be recalled: a system made of electrons and ions has a thermodynamic behavior, thanks to the quantum nature of its constituents, where the Pauli exclusion principle plays an essential role.

We will then present relations that arise in quantum field theory, that is, from the second quantization methods; related technical and conceptual problems will also be presented briefly.

This is necessary for taking into account the recent works and perspectives, which will be considered in the last section. Here the new inputs and challenges from outstanding achievements in physics laboratories will be taken into account, referring to some exactly solvable models which help in understanding and in fixing the boundaries of approximate methods.

The Crisis of Classical Physics: The Quantum Free Gas

Let us briefly recall some of what Lord Kelvin called the “nineteenth century clouds” over the physics of that time (1884), and the subsequent new ideas (Gallavotti 1999).

It is well known that the classical Dulong–Petit law of specific heat of solid crystals may be derived from the model of point particles interacting through harmonic forces; the equipartition of the mean energy among the degrees of freedom implies, for $N$ particles, the linear dependence of the internal energy $U_N$ on absolute temperature $T$, hence a constant heat capacity $C_N$ ($k_B$ is Boltzmann's constant)

$$ U_N = 6 \cdot \tfrac{1}{2} N k_B T; \qquad C_N := \frac{\partial U_N}{\partial T} = 3 N k_B \qquad [1] $$

Experimentally this is relatively well satisfied at high temperatures but it is violated for low $T$: one observes that $U_N$ vanishes faster than linearly as $T$ goes to zero, so that $C_N$ vanishes. Moreover, the contributions to the heat capacity from the internal degrees of freedom of the molecular gases or from the free electrons in conducting solids are negligible at room temperature: these degrees of freedom, in spite of the equipartition principle, seem frozen.

The analysis of the blackbody radiation problem from the classical point of view, that is, using equipartition among the normal modes of the electromagnetic field in the “black” cavity at temperature $T$, gives the following dependence, the Rayleigh–Jeans law (1900), of the spectral energy density $u(\nu, T)$ on frequency $\nu$ and temperature $T$ ($c$ is the speed of light in vacuum):

$$ u(\nu, T) = \frac{8\pi\nu^2}{c^3} k_B T \qquad [2] $$

The experimental curves for any positive $T$ show a maximum for a frequency $\nu_{\max}(T)$ which increases linearly with $T$ according to Wien's displacement law (1893). The spectral energy density decreases fast enough to zero as $\nu \to \infty$ in such a way that the overall (integrated) energy is (finite and) proportional to $T^4$, according to Stefan's law (1879); the agreement with the classical form holds for low frequencies. The analytic form of the classical $u(\nu, T)$ in [2] does not present maxima and the overall radiated energy is clearly divergent (this bad behavior for large $\nu$, present in many formulas for other models, sometimes in the corresponding “short-distance” form, is called an “ultraviolet catastrophe”).
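The divergence of the classical radiated energy can be checked numerically. The following minimal Python sketch is an illustration only and not part of the original treatment: it integrates the Rayleigh–Jeans density [2] up to an artificial cutoff frequency. The function names, the cutoff parameter, and the use of SI units are assumptions made for the example.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light in vacuum, m/s

def rayleigh_jeans(nu, temperature):
    """Classical spectral energy density u(nu, T) of eqn [2]."""
    return 8.0 * math.pi * nu**2 * K_B * temperature / C**3

def integrated_energy(nu_cutoff, temperature, steps=10_000):
    """Midpoint-rule integral of u(nu, T) over 0 <= nu <= nu_cutoff."""
    d_nu = nu_cutoff / steps
    total = sum(rayleigh_jeans((i + 0.5) * d_nu, temperature) for i in range(steps))
    return total * d_nu

def exact_integrated_energy(nu_cutoff, temperature):
    """Closed form of the same integral: 8 pi k_B T nu_c^3 / (3 c^3)."""
    return 8.0 * math.pi * K_B * temperature * nu_cutoff**3 / (3.0 * C**3)
```

Doubling the cutoff multiplies the integrated energy by eight, so the integral grows as the cube of the cutoff and has no finite limit, which is the ultraviolet catastrophe in numerical form.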

The effort by M Planck (1900) to understand the right dependence of $u$ on $\nu$ and $T$ was based on a thermodynamic argument about the possible energy–entropy relation, and on an assumption similar to the discretization rules on which the “old quantum theory” for the atomic structure is based. The electromagnetic field is represented, via Fourier analysis, as a set of infinitely many independent harmonic oscillators, two for every wave vector $\mathbf{k}$, to take into account the polarization. The frequency depends linearly on the wave number $k = |\mathbf{k}|$ (linear dispersion law), and the spacing becomes negligible for macroscopic dimensions of the cavity. The key idea for computing the partition function is the discretization of the phase space of each oscillator (of frequency $\nu = \omega/2\pi$). Putting there the adimensionalized Lebesgue measure $dp\,dq/h$, where $h$ is a constant with physical dimensions of an action, we consider the regions $R_E$ bounded by the constant-energy ellipses and their areas $|R_E|$, and find

$$ |R_E| = \int_{R_E} \frac{dp\,dq}{h} = \frac{2\pi E}{h\omega} = \frac{E}{h\nu} $$

If these adimensional areas have integer values, that is, $E = nh\nu$, $n = 0, 1, 2, \ldots$, the annular region (“cell” $C_n$) between $R_{E_n}$ and $R_{E_{n+1}}$ has unit area and so we approximate the partition function with the series ($\beta = 1/(k_B T)$, the ubiquitous parameter in statistical mechanics, often called “inverse temperature”)

$$ Z_{\mathrm{discr}} = \sum_n \exp(-\beta n h \nu) = \frac{1}{1 - \exp(-\beta h \nu)} $$

In this way, the probabilistic weight given to this cell is

$$ p(C_n) = \frac{\exp(-\beta n h \nu)}{\sum_j \exp(-\beta j h \nu)} = \exp(-\beta n h \nu)\,\bigl(1 - \exp(-\beta h \nu)\bigr) \qquad [3] $$

A well-defined value for the constant $h$ (i.e., $h = 6.626\ldots \times 10^{-27}$ erg s, the Planck constant), combined with the usual computation for the density of states, gives a formula which quantitatively agrees with experimental data (see Figure 1)

$$ u(\nu, T) = \frac{8\pi\nu^2}{c^3} \, \frac{h\nu}{\exp(h\nu/k_B T) - 1} \qquad [4] $$

Figure 1 Dependence of the electromagnetic energy density $u(\nu, T_i)$ on $\nu$, for $T_1 < T_2 < T_3$.

Moreover, for a certain range of parameters, that is, such that $\beta h \nu \ll 1$, there is agreement with the classical law.

The “quantum of light,” introduced by Einstein in 1905 in his work on the photoelectric effect, was later (1926) called photon by G N Lewis. The picture for representing the radiating system was one of a gas of noninteracting photons, carrying energy and momentum, and being continuously created and absorbed.

A slightly different approach was used about the same time, for the problem of specific heat of crystalline solids.

The simpler model considers $N$ points on the nodes of the lattice $\mathbb{Z}^3$, in a cubic box of side $L$, and interacting through harmonic forces; similarly to the radiation problem, the system is represented by a collection of independent harmonic oscillators (normal modes), which are “quantized” as before: the corresponding quanta were called phonons (by Fraenkel, in 1932) for the role of the acoustic band of frequencies. In this simplified approach (by Debye, in 1913) the different phonons are determined by a finite set of wave vectors

$$ \mathbf{k} = \frac{2\pi}{L}\,\mathbf{n}; \quad n_i \ \text{integer}; \quad i = 1, 2, 3; \quad |\mathbf{k}| \le k_M $$

where the maximal modulus $k_M$ is such that the total number of different $\mathbf{k}$'s is $3N$ (degrees of freedom).

Moreover, the frequency–wave number relation is simplified too, extrapolating the low-frequency (acoustic) linear relation $\nu = |\mathbf{k}| v_0$ ($v_0$ is the sound speed). In this way, the density of states, which is quadratic in the frequency, has a cutoff to zero at the maximal frequency $\nu_D$, corresponding to $|\mathbf{k}_M|$, with an associated temperature $\Theta_D = h\nu_D/k_B$ (Debye's temperature). The expected energy $U_N$ in the canonical ensemble, after the computation of the canonical partition function, is given in terms of the Debye function $D(\cdot)$:

$$ U_N = 3 N k_B T \, D\!\left(\frac{\Theta_D}{T}\right), \qquad D(y) := \frac{3}{y^3} \int_0^y \frac{x^3 \, dx}{\exp x - 1} \qquad [5] $$
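The two limiting regimes of the Debye formula [5] can be explored with a short numerical sketch. The Python code below is a hedged illustration, not the article's own computation: it evaluates $D(y)$ with a midpoint rule and obtains the heat capacity per particle, in units of $k_B$, from a finite difference of the energy. The function names, the step counts, and the finite-difference step are assumptions chosen for the example.

```python
import math

def debye_function(y, steps=2000):
    """D(y) = (3 / y**3) * integral_0^y x**3 / (exp(x) - 1) dx, eqn [5]."""
    if y <= 0.0:
        raise ValueError("y must be positive")
    dx = y / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx                   # midpoints never touch x = 0
        if x < 700.0:                        # beyond this the integrand is negligible
            total += x**3 / math.expm1(x)    # expm1(x) = exp(x) - 1
    return 3.0 * total * dx / y**3

def energy_per_particle(temperature, debye_temperature):
    """U_N / (N k_B) = 3 T D(Theta_D / T), expressed in kelvin."""
    return 3.0 * temperature * debye_function(debye_temperature / temperature)

def heat_capacity_per_particle(temperature, debye_temperature, rel_step=1e-4):
    """C_N / (N k_B), from a central finite difference of the energy."""
    h = rel_step * temperature
    upper = energy_per_particle(temperature + h, debye_temperature)
    lower = energy_per_particle(temperature - h, debye_temperature)
    return (upper - lower) / (2.0 * h)
```

For $T \gg \Theta_D$ the result approaches the Dulong–Petit value 3 of eqn [1], while for $T \ll \Theta_D$ it follows $(12\pi^4/5)(T/\Theta_D)^3$, the standard low-temperature limit of the Debye model.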

Figure 2 The specific heat $c(T)$ of crystal solids according to Debye.

The agreement with experimental data, for the specific heat of different materials (i.e., different $\Theta_D$), at low and high temperatures, is rather good. At low temperatures, one recovers the empirical $T^3$ behavior (see Figure 2). More careful measurements at low $T$ put into evidence, for metallic solids, the role of the conduction electrons: their contribution to the heat capacity turns out to be linear in $T$, with a coefficient such that at room temperature it is much smaller than the lattice contribution, so that a satisfactory agreement with the classical law is found.

Soon after the beginning of quantum mechanics in its modern form (1925–26), physicists considered many-particle systems, dealing initially with the simplest situations, with a relatively easy formal apparatus, yet sufficient to understand in the main lines the “anomalous,” that is, nonclassical, behavior.

For a system of $N$ free particles in a cubic box of side $L$, quantum theory brings the labeling of the one-particle states with the wave vectors $\mathbf{k}$, recalling the de Broglie relation for the momentum $\mathbf{p} = \hbar\mathbf{k}$, with a possible additional spin (intrinsic angular momentum) label $\sigma$

$$ \mathbf{k} = \frac{2\pi}{L}\,\mathbf{n}; \quad \mathbf{n} \in \mathbb{Z}^3 \setminus \{0\} $$

and the statistics of the particles: because of indistinguishability, the wave function of several identical particles has to be symmetric (B–E, Bose–Einstein statistics) or antisymmetric (F–D, Fermi–Dirac statistics) in the exchange of the particles. This has the deep implication that no more than one fermion can occupy a given quantum state.

We may here recall the spin–statistics connection, which, in the framework of a local relativistic theory, states that integer spin particles are bosons, while particles with half-odd-integer spin are fermions.

As the state is completely defined by the knowledge of the occupation numbers $n_{\mathbf{k},\sigma}$, the ground states of the $N$ spinless boson and $N$ spin-1/2 fermion systems are described by the simple and relevant statement:

$$ \text{BE system:} \quad n_{\mathbf{k}} = N \delta_{\mathbf{k},\mathbf{k}_0} $$
$$ \text{FD system:} \quad n_{\mathbf{k},\sigma} = \mathbf{1}(|\mathbf{k}| \le k_F) \quad \forall \sigma \qquad [6] $$

The constant $k_F$ (Fermi wave number), or the equivalent $p_F = \hbar k_F$ and $\varepsilon_F = p_F^2/2m$ (Fermi momentum and energy, respectively), denotes the highest occupied level. In the continuum approximation, this implies the following relation between the Fermi energy $\varepsilon_F$ and the density $\rho = N/L^3$:

$$ \varepsilon_F = \frac{\hbar^2}{2m} (3\pi^2\rho)^{2/3} \qquad [7] $$

Going to the positive-temperature case, the grand canonical partition function is computed by considering that occupation numbers are non-negative integers for the B–E case and just 0 or 1 for the F–D case. This implies the simple formulas, with obvious meaning of symbols and leaving more details to the vast literature (see Figure 3):

$$ \langle n_{\mathbf{k},\sigma} \rangle_{\beta,\mu} = \frac{1}{\exp(\beta(\varepsilon_{\mathbf{k}} - \mu)) \pm 1}; \qquad + \ \text{for FD}, \ - \ \text{for BE} \qquad [8] $$

Figure 3 Expected fermionic occupation number $n(\varepsilon)$, for $T = 0$, $T > 0$, and $\mu = 2$.

It is useful to introduce the Fermi temperature $T_F = \varepsilon_F/k_B$; using some realistic data, that is, for common metals like copper, $T_F$ ranges roughly between $10^4$ and $10^5$ K, that is, well above the “normal,” room temperatures: the quantum nature (i.e., quantum degeneracy) of the conduction electrons, modeled as free electrons, is macroscopically visible in normal conditions.
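The order of magnitude quoted for the Fermi temperature can be reproduced from eqn [7]. The Python sketch below is illustrative only; the conduction-electron density used for copper (about $8.47 \times 10^{28}$ m$^{-3}$, one electron per atom) is an assumed input that does not appear in the text, and the function names are hypothetical. The two occupation functions implement the two sign choices of eqn [8].

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
K_B = 1.380649e-23          # Boltzmann constant, J/K
EV = 1.602176634e-19        # one electronvolt in joules

def fermi_energy(density):
    """epsilon_F = (hbar^2 / 2m) (3 pi^2 rho)^(2/3), eqn [7], in joules."""
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * density) ** (2.0 / 3.0)

def fermi_dirac(energy, mu, beta):
    """Mean occupation 1 / (exp(beta (e - mu)) + 1), the '+' case of eqn [8]."""
    # The equivalent tanh form avoids overflow of exp() for large arguments.
    return 0.5 * (1.0 - math.tanh(0.5 * beta * (energy - mu)))

def bose_einstein(energy, mu, beta):
    """Mean occupation 1 / (exp(beta (e - mu)) - 1), the '-' case; needs e > mu."""
    if energy <= mu:
        raise ValueError("Bose-Einstein occupation requires energy > mu")
    return 1.0 / math.expm1(beta * (energy - mu))

# Assumed input, not taken from the text: conduction electrons per m^3 in copper.
RHO_COPPER = 8.47e28
eps_f = fermi_energy(RHO_COPPER)    # roughly 7 eV
t_f = eps_f / K_B                   # Fermi temperature in kelvin
```

With this assumed density the Fermi temperature comes out between $10^4$ and $10^5$ K, consistent with the range stated above, and the Fermi–Dirac occupation equals exactly one half at $\varepsilon = \mu$ for any temperature.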

The presence of an external field, like the periodic one given by the ionic lattice of a crystal, changes the situation in a relevant way, as the one-particle spectrum generally gets a band structure, and the allowed momenta are described in the reciprocal lattice: the Fermi sphere becomes a surface, and its structure is central for further developments.

For massive bosons, the strange superfluid features of liquid $^4$He at low temperature, that is, below the critical value 2.17 K, led F London, just after Kapitza's discovery in 1937, to speculate that these were related to a macroscopic occupation of the ground state (B–E condensation). A more realistic model has to take into account interaction between bosons (see last section) as the microscopic interactions in superfluid liquid $^4$He are not negligible.

Quantum N-Body Properties: Second Quantization

The main step in analyzing a quantum N-body system is its energy spectrum, and in particular its ground state, as it may represent a good approximation of the low-temperature states: its structure, the relations with possible symmetries of the Hamiltonian, its degeneracy, the dependence of its energy on the number of particles, are further relevant questions. The last one is related to the possibility of defining a thermodynamics for the system (Ruelle 1969). As a physically very interesting example, consider a system of electrically charged particles, $N$ electrons with negative unit charge, and $K$ atoms with positive charge $z$, say, interacting through electrostatic forces; the classical Coulomb potential as a function of distance behaves badly, as it diverges at zero and decreases slowly at infinity. The first question is about the stability: thanks to the exclusion principle, for the ground-state energy $E^0_{N,K}$ an extensive estimate from below is valid:

$$ E^0_{N,K} \ge -c_0 (N + Kz) $$

so that a finite-volume grand partition function exists, while for the thermodynamic limit, which involves large distances, we need more, that is, charge neutrality, which allows for screening, and a fast-decreasing effective interaction.

Let us see an example (quantum spin, Heisenberg model) belonging to the class of lattice models, where the identical microscopic elements are distinguishable by their fixed positions, that is, the nodes of a lattice like $\mathbb{Z}^d$. To any site $x \in \mathbb{Z}^d$ is associated a copy $\mathcal{H}_x$ of a $(2s+1)$-dimensional Hilbert space $\mathcal{H}$, where an irreducible unitary representation of SU(2) is given, so that the nonzero values for $s$ are $1/2, 1, 3/2, \ldots$. For any $x$, the generators $S^\alpha(x)$ ($\alpha = 1, 2, 3$) satisfy the well-known commutation relations of the angular momentum; moreover, $\sum_\alpha S^\alpha(x)^2 = s(s+1)\mathbf{1}$, and operators related to different sites commute. The ferromagnetic, isotropic, next-neighbors, magnetic field Hamiltonian for the finite system $\Lambda$ is

$$ H_\Lambda = -J \sum_{\langle x,y \rangle} \mathbf{S}(x) \cdot \mathbf{S}(y) - h \sum_x S^3(x) \qquad [9] $$

where $J$ is the positive strength of the next-neighbors coupling ($\langle x, y \rangle$ means that $x$ and $y$ are next neighbors); $h$ is the intensity of the magnetic field oriented along the third axis. This model is considerably studied even now with several variants regarding possible anisotropies of the interaction, the possibly infinite range of the interaction, and the sign of $J$, for other (e.g., antiferromagnetic) couplings. Among the relevant results, the Mermin–Wagner theorem, at variance with the analogous classical spin model, states the absence of spontaneous magnetization in this zero-field model for $d = 2$ for any positive temperature; this can also be formulated as absence of symmetry breaking for this model (Fröhlich and Pfister in 1981 shed more light on this point).

As mentioned earlier, a useful mathematical tool for dealing with quantum systems of many particles or quasiparticles is the occupation-number representation for the state of the system. The vector space for a system with an indefinite number of particles is the Fock space: it is the direct sum of all spaces with any number of particles, starting with the zero-particle, vacuum state. The operators which connect these subspaces are the creation and annihilation operators, very similar to the raising and lowering operators introduced by Dirac for the spectral analysis of the harmonic-oscillator Hamiltonian and the angular momentum, in the context of one-particle quantum theory.

It is perhaps worth sketching the action of these operators on the Fock space.

We consider spinless bosons first, as spin might easily be taken into account, if necessary. We suppose that a one-particle Hamiltonian has eigenfunctions labeled by a set of quantum numbers $\mathbf{k}$, say, as the wave vector for the purely kinetic one-particle Hamiltonian. Let $|n_{\mathbf{k}_1}, n_{\mathbf{k}_2}, \ldots, n_{\mathbf{k}_p}\rangle$ denote a vector state with $\sum_{i=1,\ldots,p} n_{\mathbf{k}_i}$ particles, where $n_{\mathbf{k}_i}$ denotes the number of particles with wave vector $\mathbf{k}_i$, $i = 1, \ldots, p$; $|0\rangle$ denotes the no-particle, vacuum state. We define the creation operators $a^*_{\mathbf{k}}$ as follows:

$$ a^*_{\mathbf{k}} \, | \ldots, n_{\mathbf{k}}, \ldots \rangle = \sqrt{n_{\mathbf{k}} + 1} \; | \ldots, n_{\mathbf{k}} + 1, \ldots \rangle \qquad [10] $$
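The action [10] of a creation operator can be made concrete for a single mode. The Python sketch below is an illustration under stated assumptions, not the article's construction: it represents $a^*$ as a matrix on a Fock space truncated at a maximal occupation `n_max`, a cutoff introduced here only so that the matrix is finite. All function names are hypothetical.

```python
import math

def creation_matrix(n_max):
    """Matrix of a* on span{|0>, ..., |n_max>} for one mode (truncated)."""
    dim = n_max + 1
    mat = [[0.0] * dim for _ in range(dim)]
    for n in range(n_max):
        mat[n + 1][n] = math.sqrt(n + 1)    # <n+1| a* |n> = sqrt(n + 1), eqn [10]
    return mat

def apply_matrix(mat, vec):
    """Plain matrix-vector product."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in mat]

def number_state(n, n_max):
    """Coordinate vector of the occupation-number state |n>."""
    vec = [0.0] * (n_max + 1)
    vec[n] = 1.0
    return vec

def state_from_vacuum(n, n_max):
    """(a*)^n |0> / sqrt(n!), which should reproduce |n> for n <= n_max."""
    a_star = creation_matrix(n_max)
    vec = number_state(0, n_max)
    for _ in range(n):
        vec = apply_matrix(a_star, vec)
    norm = math.sqrt(math.factorial(n))
    return [v / norm for v in vec]
```

In this truncated picture the top state $|n_{\max}\rangle$ is sent to zero, which is an artifact of the cutoff and not a property of the full Fock space.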

Its adjoint $a_{\mathbf{k}}$ is called the annihilation operator, for its action on the vectors

$$ a_{\mathbf{k}} \, | \ldots, n_{\mathbf{k}}, \ldots \rangle = \sqrt{n_{\mathbf{k}}} \; | \ldots, n_{\mathbf{k}} - 1, \ldots \rangle \qquad [11] $$

The operator $a^*_{\mathbf{k}}$ creates a new particle with that momentum: for any $\mathbf{k}$

$$ a^*_{\mathbf{k}} a_{\mathbf{k}} \, | \ldots, n_{\mathbf{k}}, \ldots \rangle = n_{\mathbf{k}} \, | \ldots, n_{\mathbf{k}}, \ldots \rangle $$

$$ a^*_{\mathbf{k}} a_{\mathbf{k}} := \hat{n}_{\mathbf{k}} \quad \text{(the occupation-number operator)} $$

The vacuum state belongs to $\operatorname{Ker} a_{\mathbf{k}}$ for any $\mathbf{k}$, and the whole space is generated by application of creation operators on the vacuum state.

The following basic commutation relations, for any $\mathbf{k}, \mathbf{k}'$, are valid:

$$ [a_{\mathbf{k}}, a^*_{\mathbf{k}'}] = \delta(\mathbf{k}, \mathbf{k}'), \qquad [a_{\mathbf{k}}, a_{\mathbf{k}'}] = [a^*_{\mathbf{k}}, a^*_{\mathbf{k}'}] = 0 \qquad [12] $$

For fermions, multiple occupancy is forbidden, so that the analogous annihilation ($\alpha_{\mathbf{k}}$) and creation ($\alpha^*_{\mathbf{k}}$) operators satisfy anticommutation relations:

$$ [\alpha_{\mathbf{k}}, \alpha^*_{\mathbf{k}'}]_+ = \delta(\mathbf{k}, \mathbf{k}'), \qquad [\alpha_{\mathbf{k}}, \alpha_{\mathbf{k}'}]_+ = [\alpha^*_{\mathbf{k}}, \alpha^*_{\mathbf{k}'}]_+ = 0 \qquad [13] $$

The presence of spin is dealt with by an additional spin label $\sigma$ to these symbols, and a $\delta(\sigma, \sigma')$, where necessary.

The Hamiltonian for a system of particles, say spinless bosons, in a box $\Lambda$, made of its kinetic part together with a two-body ($v(x - y)$) interaction, is written in terms of the “field operators”; if $\{\varphi_{\mathbf{k}}(x)\}$ are the one-particle eigenfunctions of the single-particle purely kinetic Hamiltonian for the spinless case, and their complex conjugates are $\{\varphi^*_{\mathbf{k}}(x)\}$, we define the fields

$$ \Phi(x) = \sum_{\mathbf{k}} \varphi_{\mathbf{k}}(x)\, a_{\mathbf{k}}, \qquad \Phi^*(x) = \sum_{\mathbf{k}} \varphi^*_{\mathbf{k}}(x)\, a^*_{\mathbf{k}} \qquad [14] $$

so that the full Hamiltonian is given by

$$ H_\Lambda = \int_\Lambda dx \, \Phi^*(x) \left( -\frac{\hbar^2}{2m} \Delta \right) \Phi(x) + \int_\Lambda \int_\Lambda dx \, dy \, v(x - y) \, \Phi^*(x)\Phi(x)\Phi^*(y)\Phi(y) \qquad [15] $$

We mention that a theoretical breakthrough in the analysis of superfluidity was made by Bogoliubov (1946), who, starting from the Hamiltonian in [15], introduced the following Hamiltonian in the momentum representation:

$$ H_\Lambda = \sum_{\mathbf{k}} \varepsilon_{\mathbf{k}} \, a^*_{\mathbf{k}} a_{\mathbf{k}} + \frac{1}{2} \sum_{\mathbf{k}, \mathbf{k}', \mathbf{q}} \hat{v}_{\mathbf{q}} \, a^*_{\mathbf{k} - \mathbf{q}} a^*_{\mathbf{k}' + \mathbf{q}} a_{\mathbf{k}} a_{\mathbf{k}'} \qquad [16] $$

where $\varepsilon_{\mathbf{k}}$ is the one-particle kinetic energy and $\hat{v}_{\mathbf{q}}$ is the Fourier transform of the two-body potential. To study the excitation spectrum above the ground state, he introduced an approximation about the persistency of a macroscopic occupation of the ground state and a diagonalization procedure leading to new quasiparticles with a characteristic energy spectrum, linearly increasing near $|\mathbf{k}| = 0$, then presenting a positive minimum before the subsequent increase (see Figure 4).

Figure 4 Excitation spectrum for superfluids ($\varepsilon/k_B$ versus $k$).

Some Mathematical Tools for Macroscopic Quantum Systems

The formal apparatus of second quantization, born in the context of the quantum field theory, brought to statistical mechanics new ideas and techniques and related difficulties. For instance, the renormalization group was conceived in the 1970s to deal both with critical phenomena (i.e., power singularities of thermodynamic quantities around the critical point) and with divergences in quantum field theory. This subject is currently being developed and applied in models of quantum statistical mechanics (QSM) (Benfatto and Gallavotti, 1995).

Another issue, which has again strong relations with quantum field theory, is the algebraic formulation of QSM. This point of view, which is well suited for the analysis of infinitely extended quantum systems, uses a unified, synthetic, and rigorous language. The procedure for passing from a finite quantum system to its infinitely extended version deserves some attention.

It is well known that, for finite quantum systems, say $N$ particles in a box $\Lambda$, an observable is represented by a self-adjoint operator $A$ on a Hilbert space $\mathcal{H}_\Lambda$, and the normalized elements $\{|\psi\rangle\}$ of this space are the pure states $\psi$ which define the expectations

$$ \psi(A) := \langle \psi | A \psi \rangle $$
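For a finite-dimensional Hilbert space the expectation $\psi(A) = \langle \psi | A \psi \rangle$ is an ordinary inner product. The Python sketch below illustrates this with the occupation-number operator of one bosonic mode; the truncation at two particles, the chosen superposition, and all function names are assumptions made for the example and do not come from the text.

```python
import math

def normalize(vec):
    """Rescale a nonzero complex vector to unit norm."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in vec))
    return [c / norm for c in vec]

def expectation(psi, op):
    """psi(A) = <psi | A psi> for a unit vector psi and a square matrix op."""
    dim = len(psi)
    a_psi = [sum(op[i][j] * psi[j] for j in range(dim)) for i in range(dim)]
    return sum(complex(psi[i]).conjugate() * a_psi[i] for i in range(dim))

def is_self_adjoint(op, tol=1e-12):
    """Check op[i][j] == conj(op[j][i]) entrywise."""
    dim = len(op)
    return all(abs(op[i][j] - complex(op[j][i]).conjugate()) <= tol
               for i in range(dim) for j in range(dim))

def number_operator(n_max):
    """Occupation-number operator of one bosonic mode, truncated at n_max."""
    return [[float(n) if n == m else 0.0 for m in range(n_max + 1)]
            for n in range(n_max + 1)]

psi = normalize([1.0, 1.0j, 0.0])   # equal-weight superposition of |0> and |1>
n_hat = number_operator(2)
mean_n = expectation(psi, n_hat)    # complex number with zero imaginary part
```

Because the operator is self-adjoint the expectation is real, and for this superposition it equals one half, the average of the occupations 0 and 1 with equal weights.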

The mixed states (mixtures) are defined by convex combinations of pure states, the coefficients having an obvious statistical meaning.

Among the observables, the Hamiltonian plays a special role, as it generates the dynamics of the system, which evolves the pure states through the unitary group (Schrödinger picture)

$$ |\psi(t)\rangle = \exp\!\left( -\frac{i}{\hbar} t H_\Lambda \right) |\psi\rangle $$

To the notion of equilibrium probability measure on the phase space of a classical system corresponds the mixed state $\rho_{H_\Lambda,\beta}$ such that

$$ \rho_{H_\Lambda,\beta}(A) := Z_{\Lambda,\beta}^{-1} \, \operatorname{tr}\bigl( \exp(-\beta H_\Lambda) A \bigr) \qquad [17] $$

The normalization factor $Z_{\Lambda,\beta} = \operatorname{tr}(\exp(-\beta H_\Lambda))$ is the canonical partition function.

Consider now the algebra $\mathcal{A}(\Lambda)$ of local observables; sending $\Lambda$ to infinity, by induction, it is possible to define the algebra $\mathcal{A}$ of quasilocal observables. The main point is a set of algebraic relations like the canonical commutation relations (CCRs) and the canonical anticommutation relations (CARs) for the creation/annihilation operators: the observables of $\mathcal{A}$, through the GNS (Gel'fand, Naimark, and Segal) construction, may be represented as operators on the appropriate Hilbert spaces, depending on the chosen state; the representations, at variance with the finite case, might be inequivalent. It is possible to define the equilibrium state for the infinite system and to insert in a natural way the possible group invariance of the system ($\mathbb{R}^d$ or $\mathbb{Z}^d$, typically), ending with a characterization of the pure phases of the system as the ergodic components in the decomposition of an equilibrium state. These states have the property that coarse-grained observables have sharp values (Ruelle 1969, Sewell 2002): if $\mathrm{Av}_l(A)$ is the space average on scale $l$, that is, over boxes of side $l$, for an ergodic state $\rho$,

$$ \lim_{l \to \infty} \rho\bigl( [\mathrm{Av}_l(A) - \rho(A)]^2 \bigr) = 0 $$

Another issue which is worth mentioning is the characterization of equilibrium states through the KMS (Kubo–Martin–Schwinger) condition. The strong formal similarity between the finite-volume quantum evolution operator $\exp(-itH_\Lambda/\hbar)$ and the statistical equilibrium density operator $\exp(-\beta H_\Lambda)$ leads to the identity, valid for any couple of bounded observables $A$ and $B$, using the short symbol $\langle \cdot \rangle_{\Lambda,\beta}$ for the expectations with respect to the statistical operator:

$$ \langle A_t B \rangle_{\Lambda,\beta} = \langle B A_{t + i\hbar\beta} \rangle_{\Lambda,\beta} \qquad [18] $$

This relation is suitably extended for infinite size, and therefore defines a KMS state; it implies some physically relevant properties like stability with respect to local disturbances and dissipativity (Sewell 2002).

A final issue in this section concerns another formalism stemming from the Feynman path-integral formulation of quantum mechanics: here a functional integral represents the statistical equilibrium density operator $W_\beta = \exp(-\beta H)$. For a $d$-dimensional system of $N$ particles in a potential field ($X \in \mathbb{R}^{dN}$) $V = V(X) = \sum_{i<j} \varphi(x_i - x_j)$ and Hamiltonian $H = -(1/2)\Delta + V$, we have the Feynman–Kac formula, which, for a test function $\psi$, may be written as follows:

$$ (W_\beta \psi)(X) = \int \left[ \int P_{X,Y}(d\omega) \, \exp\!\left( -\int_0^\beta ds \, V(\omega(s)) \right) \right] \psi(Y) \, dY $$

where $P_{X,Y}(d\omega)$ is the Wiener measure on the space of paths $\{\omega(s), s \in [0, \beta]\}$. For details on the construction and several other related features on the treatment of the different statistics, see Glimm and Jaffe (1981).

New Problems and Challenges

In this final section, we recall some phenomena which have been observed recently in physics laboratories, and which presumably deserve considerable efforts to overcome the heuristic level of explanation. About this last point, it is worth quoting a method that has been used to get results even without clear justifications of the underlying hypotheses, that is, the mean-field procedure. It started with the Curie–Weiss theory of magnetism and is based on the following drastic simplification: the microscopic element of the system feels an average interaction field due to other elements, independently of the positions of the latter. This method might provide relatively good results if the range of the interaction is very large, and in fact, a clear version with due limiting procedure was introduced by Kac, and applied by Lebowitz and Penrose in the 1960s for a microscopic derivation of the van der Waals equation, and soon extended by Lieb to quantum systems.

We will briefly outline some aspects of three recent achievements of condensed matter physics for which modeling is still on the way of further progress: the B–E condensation, the high-$T_c$ superconductivity, and the fractional quantum Hall effect. The first consists in trapping an ultracold (at less than 50 mK) dilute bosonic gas, for example, $10^4$–$10^7$ atoms of $^{87}$Rb, finding experimental evidence for Bose condensation. To understand the properties of this system, an important tool is the Gross–Pitaevskii energy functional for the condensate wave function $\phi$,

$$ E[\phi] = \int dx \left[ \frac{\hbar^2}{2m} |\nabla\phi|^2 + V_{\mathrm{ext}}(x) |\phi|^2 + \frac{g}{2} |\phi|^4 \right] $$

where the quartic term represents the reduced (mean-field) interaction among particles.

The second issue, that is, the high-temperature superconductivity, certainly deserves much attention. It has been observed recently in some ceramic materials well above 100 K, and a clear model which takes into account the formation of pairs and the peculiar isotropy–anisotropy aspects of the normal conductivity and superconductivity is still lacking (Mattis 2003).

Finally, let us consider the fractional quantum Hall effect; recall that the integer version, that is, a discretization of the Hall resistivity $R_H$ by multiples of $h/(e^2)$, finds an explanation in terms of band spectra, formation of magnetic Landau levels, and localization from surface impurities, that is, without taking into account direct interactions among electrons.

The fractional discretization of $R_H$ (Störmer 1999) has a theoretical interpretation, in terms of subtle collective behavior of the two-dimensional semiconductor electron system: the quasiparticles which represent the excitations may behave as composite fermions or bosons, or exhibit a fractional statistics (see Fractional Quantum Hall Effect).

This brief excursion through these new fascinating phenomena shows the rich interplay between theory and experiments: these phenomena are a source of new ideas and suggest new models for further progress.

See also: Bose–Einstein Condensates; Dynamical Systems and Thermodynamics; Exact Renormalization Group; Falicov–Kimball Model; Fermionic Systems; Finitely Correlated States; Fractional Quantum Hall Effect; High Tc Superconductor Theory; Hubbard Model; Quantum Phase Transitions; Quantum Spin Systems; Stability of Matter.

Further Reading

Benfatto G and Gallavotti G (1995) Renormalization Group. Princeton: Princeton University Press.
Gallavotti G (1999) Statistical Mechanics: A Short Treatise. Berlin: Springer.
Glimm J and Jaffe A (1981) Quantum Physics. A Functional Integral Point of View. New York: Springer.
Landau LD and Lifschitz EN (2000) Statistical Physics, Course of Theoretical Physics, 3rd edn., vol 5. Parts I and II. Oxford: Butterworth-Heinemann.
Mattis DC (2003) Statistical Mechanics Made Simple. NJ: World Scientific.
Ruelle D (1969) Statistical Mechanics. Rigorous Results. New York: Benjamin.
Sewell GL (2002) Quantum Mechanics and Its Emergent Macrophysics. Princeton: Princeton University Press.
Sinai YaG (1982) Theory of Phase Transitions: Rigorous Results. Budapest: Akadémiai Kiadó.
Störmer HL (1999) Nobel lecture: the quantum Hall effect. Reviews of Modern Physics 71: 875–889.
Taylor PL and Heinonen O (2002) A Quantum Approach to Condensed Matter Physics. Cambridge: Cambridge University Press.

Quasiperiodic Systems
P Kramer, Universität Tübingen, Tübingen, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction: From Periodic to Quasiperiodic Systems

Periodic systems occur in many branches of physics. Their mathematical analysis was stimulated in particular by the analysis of the periodic translational symmetry of crystals. The systematic study of the compatibility between translational and crystallographic point or reflection symmetry leads to the concept of space group symmetry. Mathematical crystallography in three dimensions (3D) culminated in 1892 in the complete classification of the 230 space groups due to Fedorov and Schoenflies (see Schwarzenberger (1980, pp. 132–135)). One characteristic property of periodic systems is that their Fourier transform has a pure point spectrum. Since the Fourier spectrum is experimentally accessible through diffraction experiments, it provides a main tool for the structure determination of crystals.

With quantum mechanics in the twentieth century, it became possible to describe crystal structures quantitatively as ordered systems of atomic nuclei and electrons with electromagnetic interactions. The representation theory of crystallographic space groups now opened the way to verify the

space group symmetry of atomic systems for example from the band structure of crystals. It was then believed that in physics atomic long-range order is linked to periodicity and hence to the paradigm of the 230 space groups in 3D.

Mathematical analysis beyond this paradigm started independently in various directions. Bohr (1925) studied quasiperiodic functions and their Fourier transform. He interpreted them as restrictions of periodic functions in nD to their values on a linear subspace of orientation irrational with respect to a lattice. Mathematical crystallography in general dimension $n > 3$, including point group symmetry, was started around 1949 in work by Hermann and by Zassenhaus (see Schwarzenberger (1980)), and completed in 1978 for $n = 4$ in Brown et al. (1978).

A different route was taken by Penrose (1974). He constructed an aperiodic tiling (covering without gaps or overlaps) of the plane. Its tiles in two rhombus shapes provide global 5-fold point symmetry and make the tiling incompatible with any periodic lattice in 2D. The connection between Penrose's aperiodic tiling and irrational subspaces in periodic structures was made by de Bruijn (1981). He interpreted the Penrose rhombus tiling as the intersection of geometric objects from cells of a hypercubic lattice in 5D with a 2D subspace, irrational and invariant under 5-fold noncrystallographic point symmetry. Kramer and Neri (1984) embedded the icosahedral group as a point group into the hypercubic lattice in 6D and constructed a 3D irrational subspace invariant under the noncrystallographic icosahedral point group. From intersections of boundaries of the hypercubic lattice cells with this subspace, they constructed a 3D tiling of global icosahedral point symmetry with two rhombohedral tiles.

Shechtman et al. (1984) discovered in the system AlMn diffraction patterns of icosahedral point symmetry. Since icosahedral symmetry is incompatible with a lattice in 3D, they concluded that there exists atomic long-range order without a lattice. The new paradigm of quasiperiodic long-range order in quasicrystals was established and since then stimulated a broad range of theoretical and experimental research.

The interplay between the notions – (1) of crystallographic symmetry in nD, $n > 3$, (2) of subspaces invariant under a point group but irrational and hence incompatible with a lattice, and (3) of discrete geometric periodic objects in nD

elaborate study of quasiperiodic systems. Therefore, we shall focus in what follows on the concepts developed in this theory.

In the following section, we briefly review basic concepts of periodic systems and lattices in nD, their classification in terms of point symmetry and space groups, and their cell structure. In a section on quasiperiodic point sets and functions, a quasiperiodic system is taken as a geometric object on an irrational mD subspace in an n-dimensional space and lattice. Noncrystallographic point symmetry is shown to select the irrational subspace. Next, scaling symmetry in quasiperiodic systems is demonstrated. Then, examples of quasiperiodic systems with point and scaling symmetry are given. The penultimate section discusses quasiperiodic tilings and their windows. Finally, the notion of a fundamental domain for quasiperiodic functions compatible with a tiling is illustrated.

Concepts from Periodic Systems

A distribution $f^p(x)$ of geometric objects on Euclidean space $E^n$ (a real linear space equipped with standard Euclidean scalar product $\langle\,,\rangle$ and metric) with coordinates $x \in E^n$ is called “periodic” if it is invariant under translations $b_i$ in $n$ linearly independent directions,

$$ (p): \quad f^p: \ f^p(x + b_i) = f^p(x), \quad i = 1, \ldots, n \qquad [1] $$

The set of all translations on $E^n$ forms the discrete additive abelian translation group

$$ T = \Bigl\{ b \in E^n : \ b = \sum_{i}^{n} m_i b_i, \ (m_1, \ldots, m_n) \in \mathbb{Z}^n \Bigr\} \qquad [2] $$

Any orbit (set of all images of an initial point) under the action $T \times E^n \to E^n$ yields a lattice $\Lambda$ on $E^n$. Since $T$ acts fixpoint-free, there is a one-to-one correspondence $\Lambda \leftrightarrow T$. A fundamental domain on $E^n$ is defined as a subset of points $x \in E^n$ which contains a representative point from any orbit under $T$. Such a fundamental domain can be chosen, for example, as the unit cell of the lattice $\Lambda$ or as the Voronoi cell (eqn [5]). By eqn [1], the functional values on $E^n$ of a periodic function $f^p(x)$ are completely determined from its values on a fundamental domain of $E^n$.

Given the lattice basis $(b_1, \ldots, b_n)$ of eqn [2]
providing quasiperiodic tilings on these subspaces – in En , the vector components of the basis form the
forms the mathematical basis for a new quasiper- n  n basis matrix B of . The most general change
iodic long-range order found in quasicrystals. The of the basis preserving the lattice is given by acting
present-day theory of quasicrystals offers the most with any element h of the general linear group

$Gl(n, \mathbb{Z})$, with integral matrix entries and determinant $\pm 1$, on the lattice basis,

$$ Gl(n, \mathbb{Z}) \ni h :\; B \to B' = Bh \qquad [3] $$

The crystallographic classification of inequivalent lattices in $E^n$ starts from $Gl(n, \mathbb{Z})$. In addition to translations, it employs crystallographic point symmetry operations (Brown et al. 1978, p. 9). A crystallographic point group operation of a lattice $\Lambda$ is a Euclidean isometry g which belongs to a group $G \ni g$ with representations $D : G \to O(n, \mathbb{R})$ and $\mathcal{D} : G \to Gl(n, \mathbb{Z})$ such that

$$ G = \{ g : D(g)\,B = B\,\mathcal{D}(g) \} \qquad [4] $$

The maximal crystallographic point group for given lattice $\Lambda$ is the holohedry of $\Lambda$. The group generated by T, G is a space group which classifies the lattice. For finer details in the classification of space groups, we refer to Brown et al. (1978). For crystallography in $E^3$, this classification yields 230 space groups. Crystallography in $E^n$ is described in Schwarzenberger (1980) and in Brown et al. (1978) where it is elaborated for $E^4$.

From a lattice $\Lambda \in E^n$ and from the Euclidean metric, one constructs a cell structure as follows: the Voronoi cell V(b), centered at a lattice point $b \in \Lambda$, known in physics as the Wigner–Seitz cell, is the set of points

$$ V(b) = \{ x \in E^n : |x - b| \le |x - b'|,\; b' \in \Lambda \} \qquad [5] $$

Any Voronoi cell has a hierarchy of boundaries $X_p$ of dimension p, $0 \le p \le n$, which we denote as p-boundaries.

The set of Voronoi cells at all lattice points form the $\Lambda$-periodic Voronoi complex of $\Lambda \in E^n$. The Voronoi cells and complexes associated with a lattice admit a notion of geometric duality. We denote dual objects by a star, $*$. They are built from convex hulls of sets of lattice points (Kramer and Schlottmann 1989) as follows. A Voronoi p-boundary $X_p$ is shared by several Voronoi cells V(b) and determines a set of lattice points

$$ S(X_p) := \{ b \in \Lambda : X_p \in V(b) \} \qquad [6] $$

The boundary dual to $X_p$ is defined as the convex hull $X^*_{(n-p)} := \mathrm{conv}\{ b : b \in S(X_p) \}$. $X^*_{(n-p)}$ can be shown to be an (n − p)-boundary of a dual Delone cell. A Delone cell D is defined as the convex hull of all lattice points whose Voronoi cells share a single vertex, called a hole of the lattice. Since these vertices fall into classes of orbits under translations, they determine translationally inequivalent classes of Delone cells $D^\alpha, D^\beta, \dots$.

Fourier analysis applied to a periodic function $f^p(x)$ on $E^n$ reduces to an n-fold Fourier series. The Fourier spectrum is a pure point spectrum and the Fourier coefficients can be referred to the points of a reciprocal lattice $\Lambda^*$ (eqn [7]) in Fourier space $E^n_*$. We denote objects belonging to this Fourier space by the index $*$. The basis matrix $B^*$ of the reciprocal lattice $\Lambda^* \in E^n_*$ is obtained from B as the inverse transpose,

$$ \langle b^{*i}, b^j \rangle = \delta_{ij} \;\Leftrightarrow\; B^* = (B^{-1})^T \qquad [7] $$

The values of the Fourier coefficients of $f^p(x)$ reduce to integrals over the fundamental domain of the lattice $\Lambda$. From eqns [4] and [7] it follows that the orthogonal representation of a point group G in coordinate and in Fourier space coincides. The Fourier spectrum and its point symmetry in crystals are observed in diffraction experiments.

Quasiperiodic Point Sets and Functions

Quasiperiodic functions are characterized from their Fourier spectrum (Bohr 1925) by

    (qp*) The Fourier point spectrum of a quasiperiodic function forms a $\mathbb{Z}$-module $M^*$ of rank n, n > m on Fourier space $E^m_*$.

A $\mathbb{Z}$-module of rank n, n > m on $E^m_*$ is defined as a set

$$ M^* = \Big\{ b^* : b^* = \sum_{j}^{n} m_j b^{*j},\; (m_1, \dots, m_n) \in \mathbb{Z}^n \Big\} \qquad [8] $$

with the $\mathbb{Z}$-module basis $(b^{*1}, \dots, b^{*n})$ linearly independent with respect to integral linear combinations. The step from a lattice $\Lambda^*$ to a module $M^*$ is nontrivial since the set of all module points becomes dense on $E^m_*$. The Fourier coefficients of a quasiperiodic function are assigned to the discrete set of module points (eqn [8]).

Bohr in his analysis of quasiperiodic functions (Bohr 1925, II, pp. 111–125) shows that a general $\mathbb{Z}$-module $M^*$ of rank n can be taken as the projection to a subspace $E^m_*$ of dimension m of a (nonunique) lattice $\Lambda^* \in E^n_*$, n > m. It is convenient to consider in Fourier space $E^n_*$ an orthogonal splitting which we denote as

$$ E^n_* = E^m_{*\parallel} + E^{(n-m)}_{*\perp}, \qquad E^m_{*\parallel} \perp E^{(n-m)}_{*\perp} \qquad [9] $$

A characterization of a quasiperiodic function $f^{qp}(x)$ on coordinate space is obtained as follows. From $\Lambda^*$ one can construct with the help of eqn [7] the lattice $\Lambda := (\Lambda^*)^*$ reciprocal to $\Lambda^*$ on a coordinate space $E^n$ and associate to it via the Fourier series a quasiperiodic function on a coordinate subspace $E^m_\parallel$ of $E^n = E^m_\parallel + E^{(n-m)}_\perp$, equipped with a $\mathbb{Z}$-module M (eqn [11]). As a result one finds a

characterization of a quasiperiodic function in coordinate space:

    (qp) A quasiperiodic function $f^{qp}(x_\parallel)$, $x_\parallel \in E^m_\parallel$, can always be interpreted as the restriction to a subspace $E^m_\parallel$ of a $\Lambda$-periodic function $f^p(x)$ on $E^n$,

$$ E^n = E^m_\parallel + E^{(n-m)}_\perp, \qquad x = x_\parallel + x_\perp $$
$$ f^p(x_\parallel + c_\perp) =: f^{qp}(x_\parallel) \qquad [10] $$

In the interpretation (qp) (eqn [10]), the $\mathbb{Z}$-modules in Fourier space (eqn [8]) and in coordinate space $E^m_\parallel$ become projections of reciprocal lattices,

$$ M^* = \pi_\parallel(\Lambda^*), \qquad M = \pi_\parallel(\Lambda) \qquad [11] $$

The linear independence of the module basis enforces a splitting (eqn [9]), irrational with respect to the lattice $\Lambda \in E^n$.

As in the classification of crystal lattices, point symmetry plays a crucial role in the classification of $\mathbb{Z}$-modules for quasiperiodic systems like quasicrystals. Noncrystallographic point groups G (with a representation incompatible with any lattice) give rise to quasiperiodic systems as follows:

    (qp) Given a point group G with orthogonal representations $D_\parallel : G \to O(m, \mathbb{R})$, $D_\perp : G \to O(n - m, \mathbb{R})$ such that $D_\parallel$ is incompatible with any lattice in $E^m_\parallel$, we now require in $E^n$ instead of eqn [10] a lattice $\Lambda$ with basis B and a representation $\mathcal{D} : G \to Gl(n, \mathbb{Z})$ such that

$$ \begin{bmatrix} D_\parallel(G) & 0 \\ 0 & D_\perp(G) \end{bmatrix} B = B\,\mathcal{D}(G) \qquad [12] $$

Equation [12] requires that the matrix B provides an irrational reduction of the representation $\mathcal{D}(G)$ into the two representations $D_\parallel(G)$, $D_\perp(G)$. Periodic functions restricted as in the second line of eqn [10] are quasiperiodic.

For any finite group G, a representation $\mathcal{D}(G)$ allowing for lattice embedding can always be constructed by the technique of induced representations. Its reduction into representations $D_\parallel(G)$, $D_\perp(G)$ contained in this induced representation is obtained by standard techniques. If $D_\parallel(G)$ is noncrystallographic and inequivalent to $D_\perp(G)$, the subspace decomposition (eqn [12]) is unique.

Quasiperiodic functions compatible with tilings and their windows can be constructed from the dual cell structure (eqns [5] and [6]) of the embedding lattice (Kramer and Schlottmann 1989). Examples are given in the sections "Point symmetry in quasiperiodic systems" and "Quasiperiodic tilings and their windows".

Scaling and Quasiperiodicity

Quasiperiodic systems lack periodicity but can have scaling symmetries originating from a non-Euclidean extension of eqn [12].

Example 1: Scaling in the Square Lattice $\mathbb{Z}^2$

We begin with the Fibonacci scaling on the square lattice $\mathbb{Z}^2$ of $E^2$. The symmetric matrix

$$ h = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \in Gl(2, \mathbb{Z}) \qquad [13] $$

has eigenvalues

$$ \lambda_1 = -\tau^{-1} = -\tau + 1, \qquad \lambda_2 = \tau := (1 + \sqrt{5})/2 \qquad [14] $$

Evaluation of the orthogonal eigenvectors allows us to define a lattice basis $B = (b^1, b^2)$ and rewrite the eigenvalue equation similar to eqn [12] as

$$ \begin{bmatrix} -\tau^{-1} & 0 \\ 0 & \tau \end{bmatrix} B = B \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} -\sqrt{\dfrac{-\tau + 3}{5}} & \sqrt{\dfrac{\tau + 2}{5}} \\ \sqrt{\dfrac{\tau + 2}{5}} & \sqrt{\dfrac{-\tau + 3}{5}} \end{bmatrix} \qquad [15] $$

This relation shows that h with respect to the basis B acts as a non-Euclidean point symmetry of the square lattice and generates an infinite discrete group. Equation [15] provides an orthogonal splitting $E^2 = E_\parallel + E_\perp$. The element h acts on the two subspaces as a discrete linear scaling by $-\tau^{-1}$, $\tau$, respectively. It maps points of $\mathbb{Z}^2$ in $E^2$, hence also their projections to $E_\parallel$, into one another.

Figure 1 shows the lattice basis from eqn [15]. We choose as fundamental domain of $\mathbb{Z}^2$ two squares A, B whose boundaries are parallel or perpendicular to $E_\parallel$. A horizontal line $E_\parallel$ intersects these two squares at vertical distances varying with respect to their horizontal boundaries. The quasiperiodic restriction $f^{qp}(x_\parallel) = f^p(x_\parallel + c_\perp)$ of a $\mathbb{Z}^2$-periodic function $f^p(x)$ to a line $x = x_\parallel + c_\perp$ picks up varying functional values on these sections. Clearly, one needs all the values of $f^p$ on its fundamental domain in $E^2$ to obtain all the values taken by $f^{qp}$.

Scaling symmetry appears in conjunction with noncrystallographic point symmetry (cf. the following section). Combined with quasiperiodic tilings, it gives rise to a hierarchy of self-similar tilings whose tiles scale with $\tau$.

Figure 1 The square lattice with Fibonacci scaling. Lattice points are black squares, holes white circles. The vectors $(b^1, b^2)$ indicate the lattice basis. The directions $x_\parallel$, $x_\perp$ of scalings by $-\tau^{-1}$, $\tau$ run horizontally and vertically, respectively. Perpendicular and parallel projections $V_\perp$ of Voronoi and $D_\parallel$ of Delone cells are attached to the lattice and hole points, respectively. Two different pairs of dual 1-boundaries $X_1$, $X^*_1$ of Voronoi and Delone squares are marked on the right. The product polytopes $X_{1,\parallel} \times X^*_{1,\perp}$ of their projections form two squares A, B and yield a periodic tiling of $E^2$. A single pair A, B forms a fundamental domain of the lattice. The characteristic functions on A, B are windows for the tiles. A general quasiperiodic function $f^{qp}(x_\parallel)$ is the restriction of a periodic function $f^p(x)$, defined on A, B, to its values on a horizontal line $x = x_\parallel + c_\perp$. If the periodic function $f^p(x)$ on A, B takes only values independent of $x_\perp$, its quasiperiodic restriction $f^{qp}(x_\parallel) := f^p(x_\parallel + c_\perp)$ to this line repeats its values on the long and short tiles $A_\parallel$, $B_\parallel$, respectively, of the standard Fibonacci tiling. Then $A_\parallel$, $B_\parallel$ form a fundamental domain for quasiperiodic functions compatible with the tiling.

Point Symmetry in Quasiperiodic Systems

Quasiperiodic systems with noncrystallographic point symmetry provide the structure theory and physics of quasicrystals. We illustrate the general scheme (qp) of eqn [12] by examples of 5-fold and icosahedral point symmetry. For generalizations, see Janssen (1986).

Example 2: 5-Fold Point Symmetry from the Root Lattice $A_4$

The $A_4$ root lattice basis in $E^4$ may be derived (Baake et al. 1990) from five orthonormal unit vectors $(e_1, e_2, e_3, e_4, e_5)$ in $E^5$ as

$$ B = (b^1, b^2, b^3, b^4) := (e_1 - e_2,\; e_2 - e_3,\; e_3 - e_4,\; e_4 - e_5) \qquad [16] $$

As the generator of the cyclic group $C_5$ of 5-fold rotations, we take the cyclic permutation (12345) in cycle notation acting on the vectors $(e_1, e_2, e_3, e_4, e_5)$.

A possible choice of the basis for eqn [12] is the irrational matrix

$$ B = (b^1, b^2, b^3, b^4) = \begin{bmatrix} 1 & c & c' & c' & c \\ 0 & s & s' & -s' & -s \\ 1 & c' & c & c & c' \\ 0 & s' & -s & s & -s' \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix} $$
$$ c = \cos\frac{2\pi}{5} = \frac{\tau - 1}{2}, \qquad s = \sin\frac{2\pi}{5} = \frac{\sqrt{\tau + 2}}{2} $$
$$ c' = \cos\frac{4\pi}{5} = -\frac{\tau}{2}, \qquad s' = \sin\frac{4\pi}{5} = \frac{\sqrt{3 - \tau}}{2} \qquad [17] $$

Equation [12] for the representation of the generator (12345) of the cyclic group $C_5$ becomes

$$ \begin{bmatrix} c & -s & 0 & 0 \\ s & c & 0 & 0 \\ 0 & 0 & c' & -s' \\ 0 & 0 & s' & c' \end{bmatrix} (b^1, b^2, b^3, b^4) = (b^1, b^2, b^3, b^4) \begin{bmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \qquad [18] $$

The left of eqn [18] generates two 2D inequivalent representations of 5-fold planar rotations which are incompatible with any 2D lattice.

The lattice $A_4$ in addition has a scaling symmetry with a factor $\tau$. The scaling transformation may be expressed in terms of the basis (eqn [16]) and an element $h \in Gl(4, \mathbb{Z})$ as

$$ \begin{bmatrix} \tau & 0 & 0 & 0 \\ 0 & \tau & 0 & 0 \\ 0 & 0 & -\tau^{-1} & 0 \\ 0 & 0 & 0 & -\tau^{-1} \end{bmatrix} (b^1, b^2, b^3, b^4) = (b^1, b^2, b^3, b^4) \begin{bmatrix} 0 & 1 & 0 & -1 \\ 0 & 1 & 1 & -1 \\ -1 & 1 & 1 & 0 \\ -1 & 0 & 1 & 0 \end{bmatrix} \qquad [19] $$

It is easily verified that the operations of scaling and of 5-fold rotation (eqns [19] and [18]) commute with one another.
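
The relations of Example 2 lend themselves to a quick numerical check. The sketch below is an illustration in Python, not taken from the article. It assumes that the basis of eqn [17] is the product of a 4 x 5 projection matrix (columns: images of e_1, ..., e_5) with the 5 x 4 difference matrix of eqn [16], and it assumes the signs of the integer matrices in eqns [18] and [19] as written in the code comments. All names (`PROJ`, `DIFF`, `D_ROT`, `H_SCALE`, ...) are hypothetical.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
C, S = math.cos(2 * math.pi / 5), math.sin(2 * math.pi / 5)
C2, S2 = math.cos(4 * math.pi / 5), math.sin(4 * math.pi / 5)


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def max_abs_diff(a, b):
    """Largest entrywise deviation between two equally shaped matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Assumed 4 x 5 projection: rows 1-2 span the parallel plane, rows 3-4 the
# perpendicular plane; column k holds the image of e_{k+1}.
PROJ = [[1, C, C2, C2, C],
        [0, S, S2, -S2, -S],
        [1, C2, C, C, C2],
        [0, S2, -S, S, -S2]]
# 5 x 4 matrix expressing b^k = e_k - e_{k+1}, eqn [16].
DIFF = [[1, 0, 0, 0], [-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1], [0, 0, 0, -1]]
BASIS = matmul(PROJ, DIFF)                                   # eqn [17]

ROTATION = [[C, -S, 0, 0], [S, C, 0, 0], [0, 0, C2, -S2], [0, 0, S2, C2]]
D_ROT = [[0, 0, 0, -1], [1, 0, 0, -1], [0, 1, 0, -1], [0, 0, 1, -1]]    # eqn [18]
SCALING = [[TAU, 0, 0, 0], [0, TAU, 0, 0],
           [0, 0, -1 / TAU, 0], [0, 0, 0, -1 / TAU]]
H_SCALE = [[0, 1, 0, -1], [0, 1, 1, -1], [-1, 1, 1, 0], [-1, 0, 1, 0]]  # eqn [19]

# Residuals of eqns [18] and [19]; both should vanish up to rounding.
res_rotation = max_abs_diff(matmul(ROTATION, BASIS), matmul(BASIS, D_ROT))
res_scaling = max_abs_diff(matmul(SCALING, BASIS), matmul(BASIS, H_SCALE))

# Exact integer check that rotation and scaling commute on the lattice.
commute = matmul(D_ROT, H_SCALE) == matmul(H_SCALE, D_ROT)
```

Since the commutation test involves only integer matrices, it is exact rather than subject to floating-point tolerance.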

Example 3: Icosahedral Point Symmetry from Lattices $\Lambda = \mathbb{Z}^6$, $D_6$

The icosahedral group $G = H_3$ has two inequivalent 3D noncrystallographic representations. $H_3$ allows for an induced embedding representation $\mathcal{D} : H_3 \to Gl(6, \mathbb{Z})$ (Kramer and Neri 1984, Kramer et al. 1992, Kramer and Papadopolos 1997) into a hypercubic lattice $\Lambda = \mathbb{Z}^6$. This representation reduces into two 3D orthogonal inequivalent irreducible noncrystallographic representations $D_\parallel : H_3 \to O(3, \mathbb{R})$, $D_\perp : H_3 \to O(3, \mathbb{R})$. The irrational basis matrix of eqn [12] for $\Lambda = \mathbb{Z}^6$ becomes (Kramer et al. 1992, p. 185, eqn (7))

$$ B = (b^1, b^2, b^3, b^4, b^5, b^6) = \sqrt{\frac{1}{2(\tau + 2)}} \begin{bmatrix} 0 & 1 & \bar{1} & \tau & 0 & \tau \\ 1 & \tau & \tau & 0 & \bar{1} & 0 \\ \tau & 0 & 0 & 1 & \tau & \bar{1} \\ 0 & \tau & \bar{\tau} & \bar{1} & 0 & \bar{1} \\ \tau & \bar{1} & \bar{1} & 0 & \bar{\tau} & 0 \\ \bar{1} & 0 & 0 & \tau & \bar{1} & \bar{\tau} \end{bmatrix} \qquad [20] $$

with $\bar{\tau} = -\tau$, $\bar{1} = -1$. The six basis vectors with components in the upper three rows span the so-called primitive icosahedral $\mathbb{Z}$-module associated with $D_\parallel$ in $E^3_\parallel$ in the sense of eqn [11]. In this space they point along the directions of six 5-fold axes of the icosahedron.

A second lattice in $E^6$ which admits icosahedral point symmetry is the root lattice $D_6$. The basis of this root lattice, often denoted as the P-lattice, is obtained from eqn [20] by a centering matrix given in Kramer et al. (1992, p. 185, eqn (8)). The corresponding $\mathbb{Z}$-module is inequivalent to the module projected from eqn [20]. The third lattice of icosahedral point symmetry in $E^6$ is $\Lambda = I := P^*$ reciprocal to the root lattice $D_6$. All three icosahedral modules admit (powers of) $\tau$-scaling.

Quasiperiodic Tilings and Their Windows

Quasiperiodic sets of points arise from the general scheme (qp) (eqn [12]) by choosing particular periodic functions in the embedding space $E^n$, called the "windows," whose intersections with $E_\parallel$ are the quasiperiodic sets of points.

The window for the construction of a discrete quasiperiodic point set based on eqn [12] is given by the characteristic function $\chi(x_\perp)$ on the projection $V_\perp(x_\perp) := \pi_\perp(V(b))$ of the Voronoi cell (eqn [5]), attached to any lattice point $b \in \Lambda$.

Example 4: The Quasiperiodic Fibonacci Point Set

If in the Fibonacci system (Figure 1), one attaches to any point b of the square lattice as a window the characteristic function $\chi$ of the perpendicular projection $V_\perp(b)$ of the unit square attached to b, the function $f^{qp}(x_\parallel)$ becomes the standard quasiperiodic Fibonacci sequence of points.

The dual cell geometry of Voronoi and Delone cells and their dual boundaries (eqns [5] and [6]) allows us to construct dual canonical quasiperiodic tilings $(T, \Lambda)$, $(T^*, \Lambda)$ (Kramer and Schlottmann 1989). To this end one constructs from local projections of pairs of dual boundaries $X_{m,\parallel}$, $X^*_{(n-m),\perp}$ or $X^*_{m,\parallel}$, $X_{(n-m),\perp}$ the direct product polytopes $X_{m,\parallel} \times X^*_{(n-m),\perp}$ or $X^*_\parallel \times X_\perp$ called "klotz polytopes." The characteristic functions on these polytopes form the windows for the tiles $X_{m,\parallel}$, $X^*_{(n-m),\parallel}$, respectively.

Example 5: The Quasiperiodic Fibonacci Tiling

The Voronoi cells V of the square lattice are squares centered at lattice points, the Delone cells D are squares centered at the vertices of Voronoi squares. The product polytopes $X^*_{1,\perp} \times X_{1,\parallel}$ from projections of dual 1-boundaries $X^*_1$, $X_1$ of Delone and Voronoi squares (cf. Figure 1) become the two types of square windows A, B. If a parallel line section $x = x_\parallel + c_\perp$ crosses one of these squares, the tile $A_\parallel$ or $B_\parallel$ is formed. The standard Fibonacci tiling results.

Example 6: Canonical Tilings from the Root Lattices $A_4$, $D_6$

The two rhombus tiles of the planar quasiperiodic Penrose pattern (Penrose 1974) $(T, A_4)$ are the projections of 2-boundaries of the Voronoi complex of the root lattice $A_4 \in E^4$ (Baake et al. 1990). The triangle tiles of the dual tiling $(T^*, A_4)$ are shown in Figure 2. They are projections of 2-boundaries from the Delone complex of the same lattice. A full analysis of dual Voronoi and Delone boundaries of the root lattice $D_6$ is given in Kramer et al. (1992). It leads to icosahedral tilings $(T, D_6)$ and $(T^*, D_6)$ of $E^3$ (Kramer et al. 1992, Kramer and Papadopolos 1997, Kramer and Schlottmann 1989) and to models of icosahedral quasicrystals.

Fundamental Domains for Quasiperiodic Tilings

Canonical tilings allow us to construct quasiperiodic functions equipped with a quasiperiodic counterpart of fundamental domains or cells in crystals: assume that the tiles of a tiling $(T, \Lambda)$ all are translates in $E^m$ of a finite minimal set of prototiles $(X_1, \dots, X_r)$. Consider the class of quasiperiodic functions which

take identical values on any translate of a prototile. These values are prescribed on the finite set of prototiles in $E^m$ which define a fundamental domain for this class of quasiperiodic functions. Only this class of quasiperiodic functions is compatible with the tiling. It can be characterized in the scheme (qp) (eqn [12]) by $\Lambda$-periodic functions on $E^n$ whose values on the tile windows of the previous section are independent of the perpendicular coordinate. A fundamental domain for the triangle tiling $(T^*, A_4)$ is given by the shaded parts in Figure 2. The fundamental domain property appears in relation with the theory of covering of quasiperiodic sets (see Kramer and Papadopolos (2000)).

Figure 2 A patch of the planar quasiperiodic triangle tiling $(T^*, A_4)$ obtained from the root lattice $A_4 \in E^4$. The tiles are two triangles, projections of 2-boundaries from the Delone cells of $A_4$. The vertices are projections of lattice points. The 20 shaded triangles form a set of prototiles such that any other tile is a translate of one of them. The shaded set forms a fundamental domain for the tiling.

Example 7: Fundamental Domain for the Fibonacci Tiling

Attach to the squares A, B in Figure 1 a periodic function $f^p(x)$ with functional values independent of the perpendicular coordinate $x_\perp$ within the two squares. Consider the functional values $f^{qp}(x_\parallel) = f^p(x_\parallel + c_\perp)$ picked up on a parallel line. Clearly, these values become independent of the perpendicular coordinate of any intersection with a square A, B. The general prescription of values on a fundamental domain of $\Lambda \in E^2$ needed for a quasiperiodic function reduces to a prescription of its functional values in $E_\parallel$ on the fundamental domain formed by the two prototiles $A_\parallel$, $B_\parallel$.

Conclusion

For quasiperiodic systems, the general construction was introduced in the section "Quasiperiodic point sets and functions", and illustrations were given in four subsequent sections. Further reading resources are provided by the references given at the end. Here, we mention some of the many possible generalizations.

Bohr (1925) considers quasiperiodic systems as special cases of almost periodic systems. The module of an almost periodic function has a countable basis.

Moody (1997) discusses the notion of Meyer sets. These describe discrete sets on locally compact abelian groups and as particular cases encompass quasiperiodic systems.

Lagarias (2000) studies aperiodic sets characterized by the following properties, shared with periodic and quasiperiodic sets:

(ap1): inequivalent patches of points are volume bounded,
(ap2): pure point Fourier spectrum,
(ap3): linear repetitivity of patches, and
(ap4): self-similarity.

See also: Compact Groups and Their Representations; Finite Group Symmetry Breaking; Lie Groups: General Theory; Localization for Quasiperiodic Potentials; Symmetries and Conservation Laws; Symmetry and Symmetry Breaking in Dynamical Systems.

Further Reading

Baake M, Kramer P, Schlottmann M, and Zeidler D (1990) Planar patterns with fivefold symmetry as sections of periodic structures in 4-space. International Journal of Modern Physics B 4: 2217–2268.
Bohr H (1925) Zur Theorie der fastperiodischen Funktionen. I. Acta Mathematica 45: 29–127.
Bohr H (1925) Zur Theorie der fastperiodischen Funktionen. II. Acta Mathematica 46: 101–214.
Brown R, Bülow R, Neubüser J, Wondratschek H, and Zassenhaus H (1978) Crystallographic Groups of Four-Dimensional Space. New York: Wiley.
de Bruijn NG (1981) Algebraic theory of Penrose’s non-periodic tilings of the plane. I. Proceedings Koninklijke Nederlandse Akademie van Wetenschappen 84: 39–52.
de Bruijn NG (1981) Algebraic theory of Penrose’s non-periodic tilings of the plane. II. Proceedings Koninklijke Nederlandse Akademie van Wetenschappen 84: 53–66.
Janssen T (1986) Crystallography of quasicrystals. Acta Crystallographica A 42: 261–71.
Kramer P and Neri R (1984) On periodic and non-periodic space fillings of $E^n$ obtained by projection. Acta Crystallographica A 40: 580–587.
Kramer P and Papadopolos Z (1997) Symmetry concepts for quasicrystals and non-commutative crystallography. In: Moody RV (ed.) The Mathematics of Long-Range Aperiodic Order, pp. 307–330. Dordrecht: Kluwer.
Kramer P and Papadopolos Z (eds.) (2000) Coverings of Discrete Quasiperiodic Sets, Theory and Applications to Quasicrystals. New York: Springer.
Kramer P, Papadopolos Z, and Zeidler D (1992) The root lattice $D_6$ and icosahedral quasicrystals. In: Frank A, Seligman TH, and Wolf KB (eds.) Group Theory in Physics, American Institute of Physics Conference Proceedings, vol. 266, pp. 179–200. New York.
Kramer P and Schlottmann M (1989) Dualization of Voronoi domains and klotz construction: a general method for the generation of proper space filling. Journal of Physics A: Mathematical and General 22: L1097–L1102.
Lagarias J (2000) The impact of aperiodic order on mathematics. Materials Science and Engineering A 294–296: 186–191.
Moody RV (1997) Meyer sets and their duals. In: Moody RV (ed.) The Mathematics of Long-Range Aperiodic Order, pp. 403–441. Dordrecht: Kluwer.
Penrose R (1974) The role of aesthetics in pure and applied mathematical research. Bulletin of the Institute of Mathematics and its Applications 10: 266–71.
Schwarzenberger RLE (1980) N-Dimensional Crystallography. San Francisco: Pitman.
Shechtman D, Blech I, Gratias D, and Cahn JW (1984) Metallic phase with long-range orientational order and no translational symmetry. Physical Review Letters 53: 1951–1953.

Quillen Determinant
S Scott, King’s College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Determinants in Finite Dimensions

The determinant of a linear transformation $A : V \to W$ acting between finite-dimensional complex vector spaces is an element $\det A$ of a complex line $L_A$. The abstract element $\det A$ is called the Quillen determinant of A, and the complex line $L_A$ is called the determinant line of A. A choice of (linear) isomorphism

$$ \varphi : L_A \to \mathbb{C} \qquad [1] $$

associates to $\det A$ the complex number

$$ \det\nolimits_\varphi A := \varphi(\det A) \in \mathbb{C} \qquad [2] $$

which can equivalently be written as the ratio

$$ \det\nolimits_\varphi A = \frac{\det A}{\varphi^{-1}(1)} \qquad [3] $$

taken in the one-dimensional complex vector space $L_A$ relative to the canonical generator $\varphi^{-1}(1)$. It is not necessarily the case that $\det A$ determines a generator for $L_A$; specifically, if dim V = m and dim W = n, then $\det A = 0$ if $m \ne n$ (by "fiat"), while if m = n, then $\det A = 0$ precisely when A is not invertible. For the moment, set m = n.

For $k \in \{0, 1, \dots, n\}$ the kth exterior power operator is defined by

$$ \wedge^k A : \wedge^k V \to \wedge^k W $$
$$ \wedge^k A (v_1 \wedge v_2 \wedge \cdots \wedge v_k) := A v_1 \wedge A v_2 \wedge \cdots \wedge A v_k \qquad [4] $$

where $v_1, \dots, v_k \in V$ and $\wedge^0 V := \mathbb{C}$ and $\wedge^0 A := 1$. When k = n, $\mathrm{Det}\, V := \wedge^n V$ and $\mathrm{Det}\, W := \wedge^n W$ are complex lines and the determinant line of A is

$$ L_A := \mathrm{Det}\, V^* \otimes \mathrm{Det}\, W \qquad [5] $$

while for any basis $\{e_1, \dots, e_n\}$ for V, with dual basis $\{e_1^*, \dots, e_n^*\}$ for $V^*$,

$$ \det A := e_1^* \wedge \cdots \wedge e_n^* \otimes (\wedge^n A)(e_1 \wedge \cdots \wedge e_n) \in L_A \qquad [6] $$

There is a canonical isomorphism for $A \in \mathrm{Hom}(V, W)$, $B \in \mathrm{Hom}(U, V)$

$$ L_{AB} \cong L_A \otimes L_B \qquad [7] $$

coming from the isomorphism

$$ \mathrm{Det}\, V^* \otimes \mathrm{Det}\, V \to \mathbb{C} \qquad [8] $$

defined by the canonical pairing $\mathrm{Det}\, V^* \times \mathrm{Det}\, V \to \mathbb{C}$, and this preserves the determinant elements

$$ \det(AB) \longrightarrow \det A \otimes \det B \qquad [9] $$

The Classical Determinant

When V = W these constructions take on a more familiar form. Then $\varphi$ can be chosen to be the canonical isomorphism [8] and evaluation on $\det A \in L_A$ outputs the classical determinant

$$ \det\nolimits_{\mathbb{C}} A = \sum_{\sigma} (-1)^{|\sigma|}\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)} \qquad [10] $$

where the sum is over permutations $\sigma$ of $\{1, \dots, n\}$ and $(a_{i,j})$ is the matrix of A with respect to any basis of V – changing the basis may change the summands on the right-hand side of [10], but not their sum. It is fundamental that when V = W the classical determinant is an intrinsic invariant of the operator A, independent of the choice of basis for V; when $V \ne W$ that is no longer so since there is then no canonical bilinear pairing $\mathrm{Det}\, V^* \times \mathrm{Det}\, W \to \mathbb{C}$; the choice of a nondegenerate pairing is equivalent to a choice of $\varphi$ in [1].

The identification of [10] from [6] and [8] amounts to the identity in $\mathrm{Det}\, V$

$$ (\wedge^n A)(e_1 \wedge \cdots \wedge e_n) = \det\nolimits_{\mathbb{C}} A \cdot e_1 \wedge \cdots \wedge e_n \qquad [11] $$
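
The permutation sum [10] and its independence of the basis can be illustrated with exact arithmetic. The following Python sketch is an illustration only; the matrices `A`, `P`, `P_INV` and the helper names are hypothetical examples, not taken from the article. It evaluates the sum for a matrix and for its conjugate under a change of basis.

```python
from fractions import Fraction
from itertools import permutations


def permutation_sign(perm):
    """Sign (+1 or -1) of a permutation, from the parity of its inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def leibniz_det(a):
    """Classical determinant as the signed sum over permutations, eqn [10]."""
    n = len(a)
    total = Fraction(0)
    for perm in permutations(range(n)):
        term = Fraction(permutation_sign(perm))
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]           # matrix of A in one basis
P = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]           # change of basis (example)
P_INV = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]     # inverse of P
A_NEW = matmul(matmul(P_INV, A), P)             # matrix of A in the new basis

det_old = leibniz_det(A)
det_new = leibniz_det(A_NEW)
```

The individual summands differ between `A` and `A_NEW`, but the two sums agree, as the text states. The factorial growth of the permutation sum makes this approach practical only for small n.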

Since $\wedge^n(AB) = \wedge^n A\, \wedge^n B$, [11] in turn implies the characterizing multiplicativity property of the classical determinant

$$ \det\nolimits_{\mathbb{C}}(AB) = \det\nolimits_{\mathbb{C}} A \cdot \det\nolimits_{\mathbb{C}} B \qquad [12] $$

for $A, B \in \mathrm{End}(V)$, specializing the general fact in [7]. Similarly, the group $Gl(V, \mathbb{C})$ of invertible elements of End(V) is identified with those A with $\det_{\mathbb{C}} A \ne 0$.

The classical determinant can also be thought of in the following ways. First, the direct sum of the operators defined in [4] yields the total exterior power operator $\wedge A : \wedge V \to \wedge V$ on the exterior algebra $\wedge V = \bigoplus_{k=0}^{n} \wedge^k V$ and this has trace

$$ \mathrm{tr}(\wedge A) = \det\nolimits_{\mathbb{C}}(I + A) \qquad [13] $$

where I is the identity. Alternatively, one can do something a little more sophisticated and use the holomorphic functional calculus to define the logarithm $\log_\theta B$ of $B \in \mathrm{End}(V)$ by

$$ \log\nolimits_\theta B = \frac{i}{2\pi} \int_\Gamma \log\nolimits_\theta \lambda\, (B - \lambda I)^{-1}\, d\lambda \qquad [14] $$

Here $\log_\theta \lambda$ is the branch of the complex logarithm defined by $\theta - 2\pi < \arg(\lambda) \le \theta$ and $\Gamma$ is a positively oriented contour enclosing spec(B) but not any point of the spectral cut $R_\theta = \{ r e^{i\theta} \mid r \ge 0 \}$. Then, if B is invertible,

$$ \mathrm{tr}(\log\nolimits_\theta B) = \log \det\nolimits_{\mathbb{C}} B \qquad [15] $$

The Fredholm Determinant

The advantage of the constructions [13] and [15] is that they extend to a restricted class of bounded linear operators on infinite-dimensional Hilbert spaces. This is consequent on the fact that both of the formulas [13] and [15] are computed as operator traces.

(Recall that a trace on a Banach algebra $\mathcal{B}$ is a linear functional $\tau : \mathcal{B} \to \mathbb{C}$ which has the property $\tau([a, b]) = 0$ for all a, b in $\mathcal{B}$, where $[a, b] := ab - ba$ is defined by the product structure on $\mathcal{B}$. Since one can define the logarithm $\log_\theta b$ of an element b of $\mathcal{B}$ with spectral cut $R_\theta$ by the formula [14], one in this case obtains a determinant $\det_{\tau,\theta}(b)$ on such elements by setting

$$ \log \det\nolimits_{\tau,\theta}(b) = \tau(\log\nolimits_\theta b) \qquad [16] $$

If $a, b, ab \in \mathcal{B}$ have common spectral cuts $\theta$, the trace property of $\tau$ translates into the multiplicativity property $\det_{\tau,\theta}(ab) = \det_{\tau,\theta}(a)\,\det_{\tau,\theta}(b)$ via a version of the Campbell–Hausdorff formula.)

The operator trace arises as follows. Let H be a complex separable Hilbert space with inner product $\langle\,,\rangle$, let C(H) be the algebra of compact operators on H, and let

$$ L^1 = \Big\{ A \in C(H) \;\Big|\; \|A\|_1 := \sum_{i=1}^{\infty} \mu_i(A^*A)^{1/2} < \infty \Big\} \qquad [17] $$

be the ideal of trace-class operators, where the sum is over the square roots of the real discrete eigenvalues $\mu_i(A^*A) \searrow +0$ of the compact self-adjoint operator $A^*A$. For any orthonormal basis $\{\xi_j\}$ of H the map

$$ \mathrm{tr} : L^1 \to \mathbb{C}, \qquad A \mapsto \mathrm{tr}(A) := \sum_j \langle \xi_j, A \xi_j \rangle $$

is a trace functional on $L^1(H)$, independent of the choice of basis. Lidskii’s theorem states that

$$ \mathrm{tr}(A) = \sum_{\lambda \in \mathrm{spec}(A)} \lambda \qquad [18] $$

with the sum over the eigenvalues of A counted with algebraic multiplicity; for general trace-class operators this equality is highly nontrivial.

If A is trace class, then for each non-negative integer k so is each of the exterior power operators $\wedge^k A : \wedge^k H \to \wedge^k H$, defined as in [4]. Following [13], a determinant can therefore be defined on the semigroup $I + L^1 := \{ I + A \mid A \in L^1 \}$ of determinant-class operators by the absolutely convergent sum

$$ \det\nolimits_F(I + A) := \mathrm{tr}(\wedge A) = 1 + \sum_{k=1}^{\infty} \mathrm{tr}(\wedge^k A) \qquad [19] $$

On the other hand, since tr is tracial and $\log_\theta(I + A)$ defined by [14] is trace class, then according to [16], there is a determinant given on invertible determinant-class operators by

$$ \log \det\nolimits_F(I + A) = \mathrm{tr}(\log\nolimits_\theta(I + A)) \qquad [20] $$

which, as the left-hand side already suggests, coincides with the Fredholm determinant.

The Fredholm determinant retains the characterizing properties of the classical determinant in finite dimensions, that $\det_F : I + L^1 \to \mathbb{C}$ is multiplicative,

$$ \det\nolimits_F((I + A)(I + B)) = \det\nolimits_F(I + A)\, \det\nolimits_F(I + B), \qquad A, B \in L^1 \qquad [21] $$

and $\det_F(I + A) \ne 0$ if and only if I + A is invertible. It is, moreover, essentially unique; any other multiplicative functional on $I + L^1$ is equal to some power of the Fredholm determinant, or, equivalently, any trace on $L^1$ is a constant multiple of the operator trace. The trace property of the operator trace and the multiplicativity of the Fredholm determinant do not, however, persist to any functional extension of the operator trace (resp. Fredholm determinant) on the

space of pseudodifferential operators of any real order acting on function spaces (fields over space-time). In quantum physics, this is a primary cause of anomalies. More precisely, determinants of differential operators arise in quantum field theories (QFTs) and string theory through the formal evaluation of their defining Feynman path integrals and the calculation of certain stable quantum numbers, which are in some sense “topological.”

From the latter perspective, it is instructive to be aware also of the following, third, construction of the Fredholm determinant, which equates the existence of a nontrivial determinant to the existence of nontrivial topology of the general linear group. First, in a surprising contrast to $\mathrm{Gl}(n, \mathbb{C})$, the general linear group $\mathrm{Gl}(H)$ of an infinite-dimensional Hilbert space $H$ with the norm topology is contractible, and hence topologically trivial. By transgression properties in cohomology, this implies any vector bundle with structure group $\mathrm{Gl}(H)$ is isomorphic to the trivial bundle. In order to recapture some topology (and hence, in applications, some physics), it is necessary to reduce to certain infinite-dimensional subgroups of $\mathrm{Gl}(H)$. The most obvious one is the group $\mathrm{Gl}(\infty)$ of invertible operators differing from the identity by an operator of finite rank. As the inductive limit of the $\mathrm{Gl}(n, \mathbb{C})$, the cohomology and homotopy groups of $\mathrm{Gl}(\infty)$ are a stable version of those of $\mathrm{Gl}(n, \mathbb{C})$. Precisely, $\mathrm{Gl}(\infty)$ is torsion free and its cohomology ring is an exterior algebra with odd degree generators, while Bott (1959) periodicity identifies $\pi_k(\mathrm{Gl}(\infty))$ to be isomorphic to $\mathbb{Z}$ if $k$ is odd and trivial if $k$ is even. Topologically, it is preferable to consider the closure of $\mathrm{Gl}(\infty)$ in $\mathrm{Gl}(H)$, which yields the group $\mathrm{Gl}_{cpt}(H)$ of operators differing from the identity by a compact operator, but this is now a little “too large” for analysis and differential geometry. Given our earlier comments, there is an intermediate natural choice of the Banach Lie group $\mathrm{Gl}_1(H)$ of operators differing from the identity by a trace-class operator (in fact, there is a tower of such Schatten class groups). Moreover, the inclusions $\mathrm{Gl}(\infty) \subset \mathrm{Gl}_1(H) \subset \mathrm{Gl}_{cpt}(H)$ are homotopy equivalences, and so the cohomology of $\mathrm{Gl}_1(H)$ is just the exterior algebra mentioned above

\[ H^*(\mathrm{Gl}_1(H)) = \wedge(\omega_1, \omega_3, \omega_5, \ldots), \qquad \deg \omega_j = 2j - 1 \qquad [22] \]

The advantage of considering $\mathrm{Gl}_1(H)$ is that precise analytical representatives for the classes $\omega_j$ can be written down:

\[ \omega_j = \left(\frac{i}{2\pi}\right)^{j} \frac{(j-1)!}{(2j-1)!}\, \Theta^{2j-1} \]

where

\[ \Theta = \mathrm{tr}(Z^{-1}\, dZ) \qquad [23] \]

is the 1-form on $\mathrm{Gl}_1(H)$. This equation makes sense because the derivative $dZ$ is trace class, and hence so is $Z^{-1}\, dZ$. Now, locally $\Theta = d \log \det_F(Z)$, so that the integral of the 1-form $\omega_1$ pulled back by a path $\gamma : S^1 \to \mathrm{Gl}_1(H)$ is precisely the winding number of the curve traced out in $\mathbb{C}$ by the function $\det_F(\gamma)$. In fact, this is just a special case of the Bott periodicity theorem, which tells us that the stable homotopy group $\pi_{2j-1}(\mathrm{Gl}_1(H))$ is isomorphic to $\mathbb{Z}$ and an isomorphism is defined by assigning to a map $f : S^{2j-1} \to \mathrm{Gl}_1(H)$ the integer $\int_{S^{2j-1}} f^*\omega_j \in \mathbb{Z}$ (it is not obvious a priori that it is an integer).

Notice that it was not necessary to have mentioned the Fredholm determinant of $Z$ at this point. Indeed, the third definition of the Fredholm determinant is to see it as the integral of the 1-form $\Theta$: define

\[ \log \det{}_F(I + A) := \int_\gamma \Theta \qquad [24] \]

where $\gamma : [0, 1] \to \mathrm{Gl}_1(H)$ is any path with $\gamma(0) = I$ and $\gamma(1) = I + A$; this uses the connectedness of $\mathrm{Gl}_1(H)$ and independence of the choice of $\gamma$, as guaranteed by Bott periodicity.

Interestingly, this is closely tied in with the Atiyah–Singer index theorem for elliptic pseudodifferential operators (which in full generality uses the Bott periodicity theorem). Here, there is the following simple but quintessential version of that theorem which links it to the winding number of the determinant of the symbol of a differential operator

\[ D = \sum_{|\alpha| \le m} a_\alpha(x)\, D_x^\alpha \qquad [25] \]

on Euclidean space $\mathbb{R}^n$ with $\alpha = (\alpha_1, \ldots, \alpha_n)$ a multi-index of non-negative integers, $|\alpha| = \alpha_1 + \cdots + \alpha_n$, and $D_x = i\,\partial/\partial x_i$. Here $D$ acts on $C^\infty(\mathbb{R}^n, V)$ with $V$ a finite-dimensional complex vector space and the coefficients of $D$ are matrices varying smoothly with $x$ which are required to decay suitably fast, $D_x^\beta a_\alpha(x) = O(|x|^{-|\beta|})$ as $|x| \to \infty$. If the symbol $\sigma_D$ of $D$, defined by

\[ \sigma_D(x, \xi) = \sum_{|\alpha| \le m} a_\alpha(x)\, \xi^\alpha \qquad [26] \]

with $\xi = (\xi_1, \ldots, \xi_n) \in \mathbb{R}^n$, satisfies the ellipticity condition of being invertible on the $2n - 1$ sphere $S^{2n-1}$ in $(x, \xi)$ space, then $D$ is a Fredholm operator. The index theorem then states

\[ \mathrm{index}(D) = \int_{S^{2n-1}} \sigma_D^*(\omega_n) \]

the higher-dimensional analog of the winding number of the determinant.

Fredholm Operators and Determinant Line Bundles

The operators whose determinants are considered in this article are all Fredholm operators. Recall that a linear operator $A : H_1 \to H_2$ between Hilbert spaces is Fredholm if it is invertible modulo compact operators; that is, there is a “parametrix” $Q : H_2 \to H_1$ such that $QA - I$ and $AQ - I$ are compact operators on $H_1$ and $H_2$, respectively. Equivalently, the range $A(H_1)$ of $A$ is closed in $H_2$, and the kernel $\mathrm{Ker}(A) = \{\xi \in H_1 \mid A\xi = 0\}$ and cokernel $\mathrm{Coker}(A) = H_2/A(H_1)$ of $A$ are finite dimensional. (This is equally true for Banach and Fréchet spaces; we restrict our attention to Hilbert spaces for brevity.) The space $\mathrm{Fred}$ of all such Fredholm operators with the norm topology has the homotopy type of the classifying space $\mathbb{Z} \times B\mathrm{Gl}(\infty)$. The first factor parametrizes the connected components of $\mathrm{Fred}$: two Fredholm operators are in the same component if and only if they have the same index

\[ \mathrm{index}(A) = \dim \mathrm{Ker}(A) - \dim \mathrm{Coker}(A) \]

Mostly we restrict our attention to the connected component $\mathrm{Fred}_0$ of operators of index zero. The cohomology of $\mathrm{Fred}_0 \simeq B\mathrm{Gl}(\infty)$ is a polynomial ring

\[ H^*(\mathrm{Fred}_0) = \mathbb{R}[\mathrm{ch}_1, \mathrm{ch}_2, \mathrm{ch}_3, \ldots] \]

whose generators may be formally realized as the even degree components of the Chern character of an infinite-dimensional bundle over $\mathrm{Fred}_0$. In fact, the generators $\omega_{2j-1}$ of $H^*(\mathrm{Gl}_1(H))$ are related to the $\mathrm{ch}_j$ through transgression, see Chern and Simons (1974). We shall be interested here in the first generator $\mathrm{ch}_1$, a transgression of the Fredholm determinant “winding number 1-form” $\omega_1$, which coincides with the real Chern class of a canonical complex line bundle $\mathrm{DET}_0 \to \mathrm{Fred}_0$. The fiber of $\mathrm{DET}_0$ at $A \in \mathrm{Fred}_0$ is the determinant line $\mathrm{Det}(A)$ of the Fredholm operator $A$, which is defined as follows (Segal 2004).

Just as for finite-rank operators (see the subsection “Determinants in finite dimensions”), the determinant of a Fredholm operator $A : H^1 \to H^2$ exists abstractly not as a number but as an element $\det A$ of a complex line $\mathrm{Det}(A)$. For simplicity, we suppose that $\mathrm{index}(A) = 0$. Elements of the determinant line $\mathrm{Det}(A)$ are equivalence classes $[E, \lambda]$ of pairs $(E, \lambda)$, where $E : H^1 \to H^2$ is such that $A - E$ is trace class, relative to the equivalence relation $(Eq, \lambda) \sim (E, \det_F(q)\lambda)$ for $q : H^1 \to H^1$ of determinant class and where $\det_F(q)$ is the Fredholm determinant of $q$. Complex multiplication on $\mathrm{Det}(A)$ is defined by $\mu[E, \lambda] = [E, \mu\lambda]$. The abstract, or Quillen, determinant of $A$ is the preferred element $\det A := [A, 1]$ in $\mathrm{Det}(A)$.

Here are some essential properties of the determinant line. First, $\det A$ is nonzero if and only if $A$ is invertible. Next, quotients of abstract determinants in $\mathrm{Det}(A)$ are given by Fredholm determinants; for if $A_1 : H^1 \to H^2$, $A_2 : H^1 \to H^2$ are Fredholm operators such that $A_i - A$ are trace class, then if $A_2$ is invertible we see that $A_2^{-1}A_1$ is determinant class and hence from the definition that

\[ \frac{\det(A_1)}{\det(A_2)} = \det{}_F(A_2^{-1}A_1) \qquad [27] \]

where the quotient on the left-hand side is taken in $\mathrm{Det}(A)$. The principal functorial property of the determinant line is that given a commutative diagram with exact rows and Fredholm columns

\[
\begin{array}{ccccccccc}
0 & \to & H_1 & \to & H_1' & \to & H_1'' & \to & 0 \\
 & & \downarrow A & & \downarrow A' & & \downarrow A'' & & \\
0 & \to & H_2 & \to & H_2' & \to & H_2'' & \to & 0
\end{array}
\qquad [28]
\]

then there is a canonical isomorphism of complex lines

\[ \mathrm{Det}(A') \cong \mathrm{Det}(A) \otimes \mathrm{Det}(A'') \qquad [29] \]

preserving the Quillen determinants $\det(A') \leftrightarrow \det(A) \otimes \det(A'')$. A consequence of this property is that given Fredholm operators $A : H_2 \to H_3$ and $B : H_1 \to H_2$, then

\[ \mathrm{Det}(AB) \cong \mathrm{Det}(A) \otimes \mathrm{Det}(B) \]

with $\det(AB) \leftrightarrow \det(A) \otimes \det(B)$, generalizing the elementary property [9].

The principal context of interest for studying determinant lines is the case where one has a family $\mathcal{A} = \{A_x \mid x \in B\}$ of Fredholm operators parametrized by a manifold $B$, satisfying suitable continuity properties, and one aims to make sense of the determinant as a function $\mathcal{A} \to \mathbb{C}$. It is then of no difficulty to show that the corresponding family of determinant lines $\mathrm{DET}(\mathcal{A}) = \cup\, \mathrm{Det}(A_x)$ defines a complex line bundle over $B$ endowed with a canonical section $\det : B \to \mathrm{DET}(\mathcal{A})$

assigning to $x \in B$ the Quillen determinant $\det(A_x) \in \mathrm{Det}(A_x)$ (Quillen 1985, Segal 2004). To identify the Quillen determinant section with a function on $\mathcal{A}$, we need to identify a trivialization of the line bundle $\mathrm{DET}(\mathcal{A})$, giving a global basis for the fibers. This is the same thing as giving a nonvanishing (or never-vanishing) section $\sigma : B \to \mathrm{DET}(\mathcal{A})$, with respect to which we have the regularized determinant function (cf. [3]):

\[ x \mapsto \det{}_\sigma(A_x) := \frac{\det(A_x)}{\sigma(x)} \qquad [30] \]

If $\mathrm{DET}(\mathcal{A})$ is trivializable, so a nonzero section exists, there will be many such sections and some extra data is needed to fix a natural choice of $\sigma$.

Each of the properties mentioned above for determinant lines carries forward to determinant line bundles in a natural way. In particular, one easily deduces from [28], or from the exact sequence

\[ 0 \to \mathrm{Ker}\, A_x \to H_{1,x} \xrightarrow{A_x} H_{2,x} \to \mathrm{Coker}\, A_x \to 0 \]

that if the kernels $\mathrm{Ker}\, A_x$ have constant dimension as $x$ varies, then there is a canonical isomorphism

\[ \mathrm{Det}(\mathcal{A}) \cong \wedge^{\max} \mathrm{Ker}(\mathcal{A})^* \otimes \wedge^{\max} \mathrm{Coker}(\mathcal{A}) \qquad [31] \]

where $\mathrm{Ker}(\mathcal{A})$ is the finite-rank complex vector bundle over $B$ with fiber $\mathrm{Ker}\, A_x$, and $\mathrm{Coker}(\mathcal{A})$ similarly. The interesting feature here is that it shows the determinant bundle to be the top exterior power of the index bundle $\mathrm{Ind}(\mathcal{A}) = [\mathrm{Ker}(\mathcal{A})] - [\mathrm{Coker}(\mathcal{A})] \in K(B)$ in the even K-theory of $B$, and in this sense determinant theory may be seen as a particular aspect of index theory, understood in the very broadest sense; in fact, the computation of determinants is usually a considerably more complex and difficult task than computing an index.

Determinant Bundles for Differential Operators over Manifolds

The Quillen determinant has been of particular interest in the case of families of Dirac operators. Such a family is associated to a $C^\infty$ fibration $\pi : M \to B$ of closed boundaryless finite-dimensional Riemannian manifolds of even dimension. If there is a graded Hermitian vector bundle $E = E^+ \oplus E^- \to M$ of Clifford modules, then from the Riemannian structure one can construct a Levi-Civita connection on the vertical tangent bundle $T(M/B)$ which can be lifted to a Clifford connection on $E$; for example, the spinor connection if we have a family of spin manifolds. This data yields a smooth family of first-order elliptic differential operators $\mathcal{D} = \{D_x : C^\infty(M_x; E_x^+) \to C^\infty(M_x; E_x^-) \mid x \in B\}$ of chiral Dirac-type, with $D_x$ a Dirac-type operator acting over the manifold $M_x = \pi^{-1}(x)$ parametrized by the fibration, along with a determinant line bundle $\mathrm{DET}(\mathcal{D}) \to B$ endowed with a canonical section $x \mapsto \det(D_x)$.

There are various contexts in mathematics and physics in which one would like to assign to the determinant section a naturally associated smooth function (a regularized determinant) $\det_{reg} : B \to \mathbb{C}$, which can, for example, then be integrated. As discussed in the previous section, this depends on identifying a trivializing (nonzero) section of $\mathrm{DET}(\mathcal{D})$. For such a section to exist, the first Chern class $c_1(\mathrm{DET}(\mathcal{D})) \in H^2(B)$ must vanish, and this in turn can be computed as a term in the Atiyah–Singer (1984) index theorem for families. Indeed, this is clear from the formal identification [31] which here takes on a precise meaning.

The following simple example, which is the basic topological anomaly computation in string theory, may help to explain the type of computation. Let $M_x$ be a copy of $\Sigma$, a compact Riemann surface, so that $M$ is a family of surfaces parametrized by $B$. Let $T = \cup\, T_x$ be the vertical complex tangent line bundle on $M$, where $T_x$ is the complex tangent line bundle to $M_x$. Each fiber has an associated $\bar\partial$-operator $\bar\partial_x$ which we couple to the Hermitian bundle $E_x := T_x^{\otimes m}$ for $m$ a non-negative integer. In this way, we get a family $\mathcal{D}$ of $\bar\partial$-operators coupled to $E = T^{\otimes m}$ whose index bundle is the element $\mathrm{Ind}(\mathcal{D}) = f_!(T^{\otimes m}) \in K(B)$. The Atiyah–Singer index theorem for families in this situation coincides with the Grothendieck–Riemann–Roch theorem and this says that

\[ \mathrm{ch}(f_!(T^{\otimes m})) = f_*\big(\mathrm{ch}(T^{\otimes m})\, \mathrm{Todd}(T)\big) \]

where $\mathrm{ch}$ is the Chern character class and $\mathrm{Todd}(T)$ is the Todd class defined for a vector bundle $F$ whose first few terms are

\[ \mathrm{Todd}(F) = 1 + \tfrac{1}{2} c_1(F) + \tfrac{1}{12} c_1(F)^2 + \cdots \]

and where $f_* : H^i(M) \to H^{i-2}(B)$ is integration over the fibers. That is, with $\xi = c_1(T)$,

\[
\begin{aligned}
\mathrm{ch}(f_!(T^{\otimes m})) &= f_*\Big( \big(1 + m\xi + \tfrac{1}{2} m^2 \xi^2 + \cdots\big)\big(1 + \tfrac{1}{2}\xi + \tfrac{1}{12}\xi^2 + \cdots\big) \Big) \\
&= f_*\Big( 1 + \big(m + \tfrac{1}{2}\big)\xi + \tfrac{1}{2}\big(m^2 + m + \tfrac{1}{6}\big)\xi^2 + \cdots \Big)
\end{aligned}
\]

So we have

\[ c_1(f_!(T^{\otimes m})) = \tfrac{1}{12}(6m^2 + 6m + 1)\, f_*(\xi^2) \in H^2(B) \qquad [32] \]
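The coefficient in [32] can be checked mechanically. The following is a minimal sketch, not part of the original derivation: it multiplies the truncated power series for $\mathrm{ch}(T^{\otimes m}) = e^{m\xi}$ and for the Todd class of a line bundle, and reads off the $\xi^2$ coefficient. The helper names `mul_trunc` and `c1_coefficient` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def mul_trunc(p, q, order=2):
    """Multiply two power series in xi (lists of coefficients),
    discarding every term of degree higher than `order`."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out

def c1_coefficient(m):
    """Coefficient of xi**2 in ch(T^m) * Todd(T), i.e. the rational
    factor multiplying f_*(xi^2) in equation [32]."""
    # exp(m*xi) truncated at order 2
    ch = [Fraction(1), Fraction(m), Fraction(m * m, 2)]
    # Todd class of a line bundle truncated at order 2
    todd = [Fraction(1), Fraction(1, 2), Fraction(1, 12)]
    return mul_trunc(ch, todd)[2]

# Compare with the closed form (6m^2 + 6m + 1)/12 for a few values of m.
for m in range(4):
    assert c1_coefficient(m) == Fraction(6 * m * m + 6 * m + 1, 12)
    print(m, c1_coefficient(m))
```

Exact rational arithmetic is used so that the comparison with $(6m^2 + 6m + 1)/12$ is an equality rather than a floating-point approximation.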

But for any element of K-theory, $c_1(E) = c_1(\mathrm{DET}(E))$, and so the left-hand side of [32] is the first Chern class of the determinant line bundle $\mathrm{DET}(\mathcal{D})$. If we take, in particular, $B = \mathrm{Conf}(\Sigma)$, the space of conformal classes of metrics on $\Sigma$ (or compact subsets of this space), and couple the family $\mathcal{D}$ to a background trivial real bundle of rank $d/2$, or its negative in K-theory, then taking $m = 1$ [32] is easily seen to be modified to

\[ c_1(\mathcal{D}_{\Sigma, d/2}) = \frac{(d - 26)}{24}\, f_*(\xi^2) \]

It follows that for this topological anomaly to vanish one must have background spacetime of dimension $d = 26$. The idea here is that $\mathrm{Conf}(\Sigma)$ is a configuration space for bosonic strings in $\mathbb{R}^d$ with the requirement that the determinant section of the determinant line bundle be conformally invariant, corresponding to the classical invariance of the string Lagrangian defining the string path integral from which the determinant arises. That is, in order to evaluate the path integral on the reduced configuration space, one requires a trivialization of the determinant line bundle which defines a conformally invariant regularized determinant function. The above calculation says that there is a topological obstruction to this occurring when the background space dimension differs from 26.

This is the most basic example of determinant anomaly computations, which have acquired considerably more sophisticated constructions in modern versions of string theory and QFT. One immediate deficiency in the approach explained so far is that not all anomalies are topological and so even though the first Chern class of the determinant line bundle may vanish, there may still be local and global obstructions to the existence of a determinant function with the correct symmetry properties. To be more precise, one needs to say not just that a trivialization of the determinant line bundle formally exists, but to actually be able to construct a specific preferred trivialization. For this more refined objective, one needs to know more about the differential geometry of the determinant line. One approach is to fix a canonical choice of connection and, if the determinant bundle is topologically trivial, to construct a determinant section (up to phase) using the parallel transport of the connection.

The principal contribution to such a theory was made in a remarkable four-page paper by Quillen (1985) in which, using zeta-function regularization, he presented a construction of a metric and connection on the determinant line bundle for a family of $\bar\partial$-operators over a Riemann surface coupled to a holomorphic vector bundle. (This is the first paper one should read on determinant line bundles; Quillen’s motivation, in fact, did not come from physics but from a problem in number theory.)

To outline this construction, which was extended to general families of Dirac-type operators in Bismut and Freed (1986), first we recall that if $\Delta$ is an invertible Laplacian-type second-order elliptic differential operator acting on the space of sections of a vector bundle over a compact manifold of dimension $n$, then it has a spectrum consisting of real discrete eigenvalues $\{\lambda\}$ forming an unbounded subset of the positive real line. The zeta function of $\Delta$ is defined in the complex half-plane $\mathrm{Re}(s) > n/2$ by

\[ \zeta(\Delta, s) = \mathrm{tr}(\Delta^{-s}) = \sum_\lambda \lambda^{-s}, \qquad \mathrm{Re}(s) > \frac{n}{2} \]

and extends to a meromorphic function of $s$ on the whole complex plane. It turns out that the extension has no pole at $s = 0$ and this means that we may define the zeta-function regularized determinant of $\Delta$ by

\[ \det{}_\zeta(\Delta) := \exp\left( -\frac{d}{ds}\Big|_{s=0} \zeta(\Delta, s) \right) \]

Since $(d/ds)|_{s=0}\, \lambda^{-s} = -\log \lambda$, this formally represents a regularized product of the eigenvalues of $\Delta$. A metric is now defined on the determinant line bundle $\mathrm{DET}(\mathcal{D})$ by defining the norm square of the element $\det(D_x) \in \mathrm{Det}(D_x)$ by

\[ \|\det(D_x)\|^2 := \det{}_\zeta(D_x^* D_x) \]

over the subset $B_0$ of $x \in B$ where $D_x$ is invertible. Elsewhere in $B$, one includes a factor defined by the induced $L^2$ metric in the kernel and cokernel. See Quillen (1985) and Bismut and Freed (1986) for full details.

A connection is defined by similarly constructing a regularized version of the connection we would define if we were working with finite-rank bundles. First, one includes in the data associated to the fibration $\pi : M \to B$ defining the family of operators $\mathcal{D}$ a splitting of the tangent bundle $TM = T(M/B) \oplus \pi^*(TB)$. This assumption and the Riemannian geometry of the fibration yield a connection $\nabla^{(\pi)}$ defined along the fibers of the fibration. The connection form over $B_0$ is then defined by

\[ \omega(x) = \mathrm{tr}_\zeta\big(D_x^{-1}\, \nabla^{(\pi)} D_x\big) \]

where the zeta-regularized trace $\mathrm{tr}_\zeta$ is defined on a vertical bundle endomorphism-valued 1-form $x \mapsto A_x$ on $M$ by

\[ \mathrm{tr}_\zeta(A_x) := \mathrm{fp}_{s=0}\, \mathrm{tr}\big(A_x (D_x^* D_x)^{-s}\big)\big|^{\mathrm{mer}} \]

where the superscript indicates we are considering the meromorphically extended form, and $\mathrm{fp}_{s=0}(G(s))$ means the finite part of a meromorphic function $G$ on $\mathbb{C}$; that is, the constant term in the Laurent expansion of $G(s)$ near $s = 0$.

A theorem of Bismut and Freed, generalizing Quillen’s original computation, computes the curvature $\Omega^{(\mathrm{DET}(\mathcal{D}))}$ of this connection to be the 2-form component in the local Atiyah–Singer families index density. This is a refined version of the topological version of that theorem which we utilized earlier; it expresses the characteristic classes on $B$ in terms of specific canonical differential forms constructed by integrating, along the fibers of the fibration, canonically defined vertical characteristic forms. More precisely, they prove the formula (Bismut and Freed 1986 and Berline et al. 1992)

\[ \Omega^{(\mathrm{DET}(\mathcal{D}))} = \left( (2\pi i)^{-n/2} \int_{M/B} \hat{A}(M/B)\, \mathrm{ch}(E) \right)_{[2]} \qquad [33] \]

where $(\sigma)_{[2]} \in \Omega^2(B)$ means the 2-form component of a differential form $\sigma$ on $B$. Here $\hat{A}(M/B) = \det^{1/2}\big((R^{M/B}/2)/\sinh(R^{M/B}/2)\big)$ is the vertical $\hat{A}$-genus differential form, while $\mathrm{ch}(E)$ is the vertical Chern character form associated to the curvature form of the bundle $E$.

This theory seems a long way from the classical theory of stable characteristic classes and the Fredholm determinant discussed in earlier sections. There are, however, interesting parallels which may guide the search for an understanding of the geometry of families of elliptic operators, of which determinants form a component. The prototypical situation where determinants arise in the quantization of gauge theory is the following. Consider the infinite-dimensional affine space $\mathcal{A}$ of connections on a complex vector bundle $E$ with structure group $G$ sitting over $S^n$, the $n$-sphere. The Lie group $G$ is assumed to be compact. For each connection $A \in \mathcal{A}$, we consider a Dirac operator $D_A : C^\infty(S^n, S^+ \otimes E) \to C^\infty(S^n, S^- \otimes E)$, where $E$ is a Hermitian vector bundle coupled to the spinor bundles $S^\pm$. The group $\mathcal{G}$ of based gauge transformations acts on $\mathcal{A}$ and symmetry properties of conservation laws lead one to be interested in constructing a determinant function on the quotient space $\mathcal{A}/\mathcal{G}$. More precisely, $g \in \mathcal{G}$ transforms $D_A$ to $D_{g.A}$ and by equivariance the Quillen determinant section pushes down to a section of a reduced determinant line bundle over $\mathcal{A}/\mathcal{G}$. As seen earlier, the topological obstruction to realizing this determinant section as a function on $\mathcal{A}/\mathcal{G}$ can be computed from the Atiyah–Singer index theorem for families applied to the corresponding index bundle $\mathrm{Ind}(D_{\mathcal{A}/\mathcal{G}})$ in $K(\mathcal{A}/\mathcal{G})$ by picking out the degree-2 component in $H^2(\mathcal{A}/\mathcal{G})$ of the Chern character $\mathrm{ch}(\mathrm{Ind}(D_{\mathcal{A}/\mathcal{G}}))$. On the other hand, it turns out that this characteristic class is the transgression of the element of $H^1(\mathcal{G}, \mathbb{Z})$ defined by the zeta-determinant trace

\[
\begin{aligned}
\omega_\zeta &:= \mathrm{tr}_\zeta\Big( (D_A^* D_{g.A})^{-1}\, d_{\mathcal{G}} (D_A^* D_{g.A}) \Big) \\
&:= \mathrm{fp}_{s=0}\, \mathrm{tr}\Big( (D_A^* D_{g.A})^{-1}\, d_{\mathcal{G}} (D_A^* D_{g.A})\, (D_A^* D_{g.A})^{-s} \Big)\Big|^{\mathrm{mer}}
\end{aligned}
\]

which counts the winding number of the zeta determinant $\mathcal{G} \to \mathbb{C}$ defined by $\det_\zeta(D_A^* D_{g.A})$. This provides an interesting parallel of the classical theory described in the section “The Fredholm determinant.” For more details of this and more advanced ideas take a look at Singer (1985). (A similar parallel holds between the topological derivation of the conformal anomaly outlined at the beginning of this section and what is called the Polyakov multiplicative anomaly formula for the zeta determinant of the Laplacian with respect to conformal changes in the metric on the surface.)

Aspects of more recent work in this direction have been the extension of the theory to manifolds with boundary, and how it encodes into the structures of topological and conformal field theories, see Segal (2004) and Mickelsson and Scott (2001), and more generally into M-theory (Freed and Moore 2004).

See also: Anomalies; Feynman Path Integrals; Index Theorems; Regularization for Dynamical $\zeta$-Functions.

Further Reading

Alvarez O and Singer IM (1984) Gravitational anomalies and the families index theorem. Communications in Mathematical Physics 96: 409–417.
Atiyah MF and Singer IM (1984) Dirac operators coupled to vector potentials. Proceedings of the National Academy of Sciences of the USA 81: 2597–2600.
Bott R (1959) The stable homotopy of the classical groups. Annals of Mathematics 70: 313–337.
Berline N, Getzler E, and Vergne M (1992) Heat Kernels and Dirac Operators, Grundlehren der Mathematischen Wissenschaften, vol. 298. Berlin: Springer.
Bismut J-M and Freed DS (1986) The analysis of elliptic families I. Communications in Mathematical Physics 106: 159–176.
Chern SS and Simons J (1974) Characteristic forms and geometric invariants. Annals of Mathematics 99: 48–69.

Freed DS and Moore G (2004) Setting the quantum integrand of M-theory, arXiv:hep-th/0409135.
Mickelsson J and Scott S (2001) Functorial QFT, gauge anomalies and the Dirac determinant bundle. Communications in Mathematical Physics 219: 567–605.
Quillen D (1985) Determinants of Cauchy–Riemann operators over a Riemann surface. Functional Analysis and its Applications 19: 31–34.
Segal G (2004) The definition of conformal field theory. In: Tillmann U (ed.) Topology, Geometry and Quantum Field Theory, pp. 421–577. Cambridge University Press. Topological field theory, “Stanford lecture notes,” http://www.cgtp.duke.edu.
Singer IM (1985) Families of Dirac operators with applications to physics. Astérisque, hors série, 323–340.
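As a finite-dimensional sanity check on the zeta-function regularized determinant defined in this article, the sketch below evaluates $\exp(-\zeta'(0))$ for a finite list of positive eigenvalues, where no analytic continuation is needed and the result should simply be the product of the eigenvalues. The function names `zeta` and `zeta_det`, the sample spectrum, and the use of a central finite difference for the derivative are all assumptions made for illustration; a genuine Laplacian has infinitely many eigenvalues and requires the meromorphic extension described in the text.

```python
import math

def zeta(eigenvalues, s):
    """Finite spectral zeta function: the sum of lambda**(-s)."""
    return sum(lam ** (-s) for lam in eigenvalues)

def zeta_det(eigenvalues, h=1e-5):
    """exp(-zeta'(0)), with zeta'(0) approximated by a central difference."""
    dzeta = (zeta(eigenvalues, h) - zeta(eigenvalues, -h)) / (2 * h)
    return math.exp(-dzeta)

# Hypothetical finite spectrum of a positive operator.
spectrum = [1.0, 2.0, 3.5, 7.0]

# For a finite spectrum the zeta determinant is the ordinary product.
print(zeta_det(spectrum), math.prod(spectrum))
```

The agreement reflects the identity $(d/ds)|_{s=0}\, \lambda^{-s} = -\log\lambda$ term by term; the regularization only matters once the sum over eigenvalues diverges at $s = 0$.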

Quivers see Finite-Dimensional Algebras and Quivers


R
Random Algebraic Geometry, Attractors and Flux Vacua
M R Douglas, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A classic question in probability theory, studied by M Kac, S O Rice, and many others, is to find the expected number and distribution of zeros or critical points of a random polynomial. The same question can be asked for random holomorphic functions or sections of bundles, and is the subject of “random algebraic geometry.”

While this theory has many physical applications, in this article we focus on a variation on a standard question in the theory of disordered systems. This is to find the expected distribution of minima of a potential function randomly chosen from an ensemble, which might be chosen to model a crystal with impurities, a spin glass, or another disordered system. Now whereas standard potentials are real-valued functions, analogous functions in supersymmetric theories, such as the superpotential and the central charge, are holomorphic sections of a line bundle. Thus, one is interested in finding the distribution of critical points of a randomly chosen holomorphic section.

Two related and much-studied problems of this type are (1) the problem of finding attractor points in the sense of Ferrara, Kallosh, and Strominger, and (2) the problem of finding flux vacua as posed by Giddings, Kachru, and Polchinski. These problems involve a good deal of fascinating mathematics and are good illustrations of the general theory.

A note on general references for further reading on the subject of this article is in order. For background on random algebraic geometry and some of its other applications, as well as references in the text not listed here, consult Edelman and Kostlan (1995) and Zelditch (2001). The attractor problem is discussed in Ferrara et al. (1995) and Moore (2004), while IIb flux vacua were introduced in Giddings et al. (2002). Background on Calabi–Yau manifolds can be found in Cox and Katz (1999) and Gross et al. (2003).

Elementary Random Algebraic Geometry

Let us introduce this subject with the problem of finding the expected distribution of zeros of a random polynomial,

\[ f(z) = c_0 + c_1 z + \cdots + c_N z^N \]

We define a random polynomial to be a probability measure on a space of polynomials. A natural choice might be independent Gaussian measures on the coefficients,

\[ d\mu[f] = d\mu[c_0, \ldots, c_N] = \prod_{i=0}^{N} \frac{d^2 c_i}{2\pi\sigma_i^2}\, e^{-|c_i|^2/2\sigma_i^2} \qquad [1] \]

We still need to choose the variances. At first the most natural choice would seem to be equal variance for each coefficient, say $\sigma_i = 1/2$. We can characterize this ensemble by its two-point function,

\[
\begin{aligned}
G(z_1, \bar z_2) &\equiv E[f(z_1) f^*(\bar z_2)] \\
&= \int d\mu[f]\, f(z_1) f^*(\bar z_2) \\
&= \sum_{n=0}^{N} (z_1 \bar z_2)^n \\
&= \frac{1 - z_1^{N+1} \bar z_2^{N+1}}{1 - z_1 \bar z_2}
\end{aligned}
\]

We now define $d\mu_0(z)$ to be a measure with unit weight at each solution of $f(z) = 0$, such that its integral over a region in $\mathbb{C}$ counts the expected number of zeros in that region. It can be written in terms of the standard Dirac delta function, by multiplication by a Jacobian factor,

\[ d\mu_0(z) = E\big[\delta^{(2)}(f(z))\, \partial f(z)\, \bar\partial f^*(\bar z)\big] \qquad [2] \]

To compute this expectation value, we introduce a constrained two-point function,

\[ G_{f(z)=0}(z_1, \bar z_2) = \frac{E\big[\delta^{(2)}(f(z))\, f(z_1) f^*(\bar z_2)\big]}{E\big[\delta^{(2)}(f(z))\big]} \]

It could be explicitly computed by using the constraint $f(z) = 0$ to solve for a coefficient $c_i$ in

the Gaussian integral, that is, projecting on the linear subspace $0 = \sum c_i z^i$. The result, in terms of $G(z_1, \bar z_2)$, is

\[ E\big[\delta^{(2)}(f(z))\big] = \frac{1}{\pi G(z, \bar z)} \]

\[ G_{f(z)=0}(z_1, \bar z_2) = G(z_1, \bar z_2) - \frac{G(z_1, \bar z)\, G(z, \bar z_2)}{G(z, \bar z)} \]

as can be verified by considering

\[ E\big[\delta^{(2)}(f(z))\, f(z) f^*(\bar z_2)\big] \propto G_{f(z)=0}(z, \bar z_2) = G(z, \bar z_2) - \frac{G(z, \bar z)\, G(z, \bar z_2)}{G(z, \bar z)} = 0 \]

Using this, eqn [2] can be evaluated by taking derivatives:

\[
\begin{aligned}
d\mu_0(z) &= \frac{1}{\pi G(z, \bar z)} \lim_{z_1, z_2 \to z} D_1 \bar D_2\, G_{f(z)=0}(z_1, \bar z_2) \\
&= \frac{1}{\pi}\, \partial \bar\partial \log G(z, \bar z)
\end{aligned}
\]

For the constant variance ensemble, eqn [2] becomes

\[ d\mu_0(z) = \frac{d^2 z}{\pi} \left( \frac{1}{(1 - z\bar z)^2} - \frac{(N+1)^2 (z\bar z)^N}{(1 - (z\bar z)^{N+1})^2} \right) \qquad [3] \]

We see that as $N \to \infty$, the zeros concentrate on the unit circle $|z| = 1$ (Hammersley 1954).

A similar formula can be derived for the distribution of roots of a real polynomial on the real axis, using $d\mu(t) = E[\delta(f(t))\, |df/dt|]$. One obtains (Kac 1943):

\[ d\mu_0^{r}(t) = \frac{dt}{\pi} \sqrt{ \frac{1}{(1 - t^2)^2} - \frac{(N+1)^2 t^{2N}}{(1 - t^{2N+2})^2} } \]

Integrating, one finds the expected number of real zeros of a degree $N$ random real polynomial is $E_N \sim (2/\pi) \log N$, and as $N \to \infty$ the zeros are concentrated at $t = \pm 1$.

While concentration of measure is a fairly generic property for random polynomials, it is by no means universal. Let us consider another Gaussian ensemble, with variance $\sigma_n = N!/n!(N - n)!$. This choice leads to a particularly simple two-point function,

\[ G(z, \bar z) = (1 + z\bar z)^N \qquad [4] \]

and the distribution of zeros

\[ d\mu_0 = \frac{1}{\pi}\, \partial \bar\partial \log G = \frac{N\, d^2 z}{\pi (1 + z\bar z)^2} \qquad [5] \]

Rather than concentrate the zeros, in this ensemble zeros are uniformly distributed according to the volume of the Fubini–Study (SU(2)-invariant) Kähler metric

\[ \omega = \partial \bar\partial K; \qquad K = \log(1 + z\bar z) \]

on complex projective space $\mathbb{CP}^1$.

We can better understand the different behaviors in our two examples by focusing on a Hermitian inner product $(f, g)$ on function space, associated to the measure eqn [1] by the formal expression

\[ d\mu[f] = [Df]\, e^{-(f, f)} \]

In making this precise, let us generalize a bit further and allow $f$ to be a holomorphic section of a line bundle $L$, say $O(N)$ over $\mathbb{CP}^1$ in our examples. We then choose an orthonormal basis of sections $(s_i, s_j) = \delta_{ij}$, and write

\[ f \equiv \sum_i c_i s_i \qquad [6] \]

and

\[ d\mu[f] = \frac{1}{(2\pi)^N} \prod_{i=1}^{N} d^2 c_i\, e^{-|c_i|^2/2} \]

We can then compute the two-point function

\[ G(z_1, \bar z_2) \equiv E[s(z_1) s^*(\bar z_2)] = \sum_{i=1}^{N} s_i(z_1) s_i^*(\bar z_2) \qquad [7] \]

and proceed as before.

In these terms, the simplest way to describe the measure for our first example is that it follows from the inner product on the unit circle,

\[ (f, g) = \oint_{|z|=1} \frac{dz}{2\pi i z}\, f^*(\bar z)\, g(z) \]

Thus, we might suspect that this has something to do with the concentration of eqn [3] on the unit circle. Indeed, this idea is made precise and generalized in Shiffman and Zelditch (2003).

Our second example belongs to a class of problems in which $M$ is compact and $L$ positive. In this case, the space $H^0(M, L)$ of holomorphic sections is finite dimensional, so we can take the basis to consist of all sections. Then, if $M$ is in addition Kähler, we can derive all the other data from a choice of Hermitian metric $h(f, g)$ on $L$. In particular, this determines a Kähler form $\omega$ as the curvature of the metric compatible connection, and thus a volume form $\mathrm{Vol}_\omega = \omega^n/n!$. We then define the inner product to be

\[ (f, g) = \int_M \mathrm{Vol}_\omega\, h(f, g) \]

Thus, the measure equation [1] and the final distribution equation [2] are entirely determined by $h$. In

these terms, the underlying reason for the simplicity of eqn [5] is that we started with the SU(2)-invariant metric $h$, so the final distribution must be invariant as well. More generally, eqn [7] is a Szegö kernel. Taking $L = L_1^{\otimes N}$ for $N$ large, this has a known asymptotic expansion, enabling a rather complete treatment (Zelditch 2001).

Our two examples also make the larger point that a wide variety of distributions are possible. Thus, to get convincing results, we must put in some information about the ensemble of random polynomials or sections which appear in the problem at hand.

The basic computation we just discussed can be vastly generalized to multiple variables, multipoint correlation functions, many different ensembles, and different counting problems. We will discuss the distribution of critical points of holomorphic sections below.

The Attractor Problem

We now turn to our physical problems. Both are posed in the context of compactification of the type IIb superstring theory on a Calabi–Yau 3-fold $M$. This leads to a four-dimensional effective field theory with $N = 2$ supersymmetry, determined by the geometry of $M$.

Let us begin by stating the attractor problem mathematically, and afterwards give its physical background. We begin by reviewing a bit of the theory of Calabi–Yau manifolds. By Yau’s proof of the Calabi conjecture, the moduli space of Ricci-flat metrics on $M$ is determined by a choice of complex structure on $M$, denote this $J$, and a choice of Kähler class. Using deformation theory, it can be shown that the moduli space of complex structures, denote this $\mathcal{M}_c(M)$, is locally a complex manifold of dimension $h^{2,1}(M)$. A point $J$ in $\mathcal{M}_c(M)$ picks out a holomorphic 3-form $\Omega_J \in H^{3,0}(M, \mathbb{C})$, unique up to an overall choice of normalization. The converse is also true; this can be made precise by defining the period map $\mathcal{M}_c(M) \to \mathbb{P}(H^3(M, \mathbb{Z}) \otimes \mathbb{C})$ to be the class of $\Omega$ in $H^3(M, \mathbb{Z}) \otimes \mathbb{C}$ up to projective equivalence. One can prove that the period map is injective (the Torelli theorem), locally in general and globally in certain cases such as the quintic in $\mathbb{CP}^4$.

Now, the data for the attractor problem is a charge, a class $\gamma \in H^3(M, \mathbb{Z})$. An attractor point for $\gamma$ is then a complex structure $J$ on $M$ such that

\[ \gamma \in H_J^{3,0}(M, \mathbb{C}) \oplus H_J^{0,3}(M, \mathbb{C}) \qquad [8] \]

This amounts to $h^{2,1}$ complex conditions on the $h^{2,1}$ complex structure moduli, so picks out isolated points in $\mathcal{M}_c(M)$, the attractor points.

There are many mathematical and physical questions one can ask about attractor points, and it would be very interesting to have a general method to find them. As emphasized by G Moore, this is one of the simplest problems arising from string theory in which integrality (here due to charge quantization) plays a central role, and thus it provides a natural point of contact between string theory and number theory. For example, one might suspect that attractor Calabi–Yaus are arithmetic, that is, are projective varieties whose defining equations live in an algebraic number field. This can be shown to always be true for $K3 \times T^2$, and there are conjectures about when this is true more generally (Moore 2004).

A simpler problem is to characterize the distribution of attractor points in $\mathcal{M}_c(M)$. As these are infinite in number, one must introduce some control parameter. While the first idea which might come to mind is to bound the magnitude of $\gamma$, since the intersection form on $H^3(M, \mathbb{Z})$ is antisymmetric, there is no natural way to do this. A better way to get a finite set is to bound the period of $\gamma$, and consider the attractor points satisfying

\[ Z_{\max}^2 \ge |Z(\gamma; z)|^2 \equiv \frac{\left| \int_M \gamma \wedge \Omega \right|^2}{\int_M \Omega \wedge \bar\Omega} \qquad [9] \]

As an example of the type of result we will discuss below, one can show that for large $Z_{\max}$, the density of such attractor points asymptotically approaches the Weil–Petersson volume form on $\mathcal{M}_c$.

We now briefly review the origins of this problem, in the physics of 1/2 BPS (Bogomol’nyi–Prasad–Sommerfield) black holes in $N = 2$ supergravity. We begin by introducing local complex coordinates $z^i$ on $\mathcal{M}_c(M)$. Physically, these can be thought of as massless complex scalar fields. These sit in vector multiplets of $N = 2$ supersymmetry, so there must be $h^{2,1}(M)$ vector potentials to serve as their bosonic partners under supersymmetry. These appear because the massless modes of the type IIb string include various higher rank-$p$ form gauge potentials, in particular a self-dual 4-form which we denote $C$. Self-duality means that $dC = *\,dC$ up to nonlinear terms, where $*$ is the Hodge star operator in ten dimensions. Now, Kaluza–Klein reduction of this 4-form potential produces $b_3(M)$ 1-form vector potentials $A^I$ in four dimensions. Given an explicit basis of 3-forms $\omega_I$ for $H^3(M, \mathbb{R}) \cap H^3(M, \mathbb{Z})$, this follows from the decomposition

\[ C = \sum_{I=1}^{b_3} A^I \wedge \omega_I + \text{massive modes} \]

However, because of the self-duality relation, only half of these vector potentials are independent; the other half are determined in terms of them by four-dimensional electric–magnetic duality. Explicitly, given the intersection form $\eta_{ij}$ on $H^3 \otimes H^3$, we have

$$dA_i = \eta_{ij} *_4 dA_j \qquad [10]$$

where $*_4$ denotes the Hodge star in $d = 4$. Thus we have $h^{2,1} + 1$ independent vector potentials. One of these sits in the $N = 2$ supergravity multiplet, and the rest are the correct number to pair with the complex structure moduli.

We now consider 1/2 BPS black hole solutions of this four-dimensional $N = 2$ theory. Choosing any $S^2$ which surrounds the horizon, we can define the charge $\gamma$ as the class in $H^3(M, \mathbb{Z})$ which reproduces the corresponding magnetic charges

$$Q_i = \frac{1}{2\pi} \int_{S^2} dA_i \equiv \int_M \omega_i \wedge \gamma$$

Using eqn [10], this includes all charges.

One can show that the mass $M$ of any charged object in supergravity satisfies a BPS bound,

$$M^2 \ge |Z(\gamma; z)|^2 \qquad [11]$$

The quantity $|Z(\gamma; z)|^2$, defined in eqn [9], depends explicitly on $\gamma$, and implicitly on the complex structure moduli $z$ through $\Omega$. A 1/2 BPS solution by definition saturates this bound.

We now explain the "attractor paradox." According to Bekenstein and Hawking, the entropy of any black hole is proportional to the area of its event horizon. This area can be found by finding the black hole as an explicit solution of four-dimensional supergravity, which clearly depends on the charge $\gamma$. In fact, we must fix boundary conditions for all the fields at infinity, in particular the complex structure moduli, to get a particular black hole solution. Now, normally varying the boundary conditions varies all the data of a solution in a continuous way. On the other hand, if the entropy has any microscopic interpretation as the logarithm of the number of quantum states of the black hole, one would expect $e^S$ to be integrally quantized. Thus, it must remain fixed as the boundary conditions on complex structure moduli are varied, in contradiction with naive expectations for the area of the horizon, and seemingly contradicting Bekenstein and Hawking.

The resolution of this paradox is the attractor mechanism. Let us work in coordinates for which the four-dimensional metric takes the form

$$ds^2 = -f(r)\, dt^2 + dr^2 + \frac{A(r)}{4\pi}\, d\Omega^2_{S^2}$$

With some work, one can see that in the 1/2 BPS case, the equations of motion imply that as $r$ decreases, the complex structure moduli $z$ follow gradient flow with respect to $|Z(\gamma, z)|^2$ in eqn [11], and the area $A(r)$ of an $S^2$ at radius $r$ decreases. Finally, at the horizon, $z$ reaches a value $z_*$ at which $|Z(\gamma, z_*)|^2$ is a local minimum, and the area of the event horizon is $A = 4\pi |Z(\gamma, z_*)|^2$. Since $z_*$ is determined by minimization, this area will not change under small variations of the initial $z$, resolving the paradox.

A little algebra shows that the problem of finding nonzero critical points of $|Z(\gamma, z)|^2$ is equivalent to that of finding critical points $D_i Z = 0$ of the period associated to $\gamma$,

$$Z = \int_M \gamma \wedge \Omega \qquad [12]$$

usually called the central charge, with respect to the covariant derivative

$$D_i Z = \partial_i Z + (\partial_i K) Z \qquad [13]$$

Here

$$e^{-K} \equiv \int_M \Omega \wedge \bar{\Omega} \qquad [14]$$

The mathematical significance of this rephrasing is that $K$ is a Kähler potential for the Weil–Petersson Kähler metric on $\mathcal{M}_c(M)$, with Kähler form $\omega = \partial \bar{\partial} K$, and eqn [13] is the unique connection on $H^{(3,0)}(M, \mathbb{C})$ regarded as a line bundle over $\mathcal{M}_c(M)$, whose curvature is $\omega$. These facts can be used to show that $D_i \Omega$ provides a basis for $H^{(2,1)}(M, \mathbb{C})$, so that the critical point condition forces the projection of $\gamma$ on $H^{(2,1)}$ to vanish. This justifies our original definition eqn [8].

Flux Vacua in IIb String Theory

We will not describe our second problem in as much detail, but just give the analogous final formulation. In this problem, a "choice of flux" is a pair of elements of $H^3(M, \mathbb{Z})$, or equivalently a single element

$$F \in H^3(M, \mathbb{Z} \oplus \tau \mathbb{Z}) \qquad [15]$$

where $\tau \in \mathcal{H} \equiv \{\tau \in \mathbb{C} \mid \mathrm{Im}\, \tau > 0\}$ is the so-called "dilaton-axion."

A flux vacuum is then a choice of complex structure $J$ and $\tau$ for which

$$F \in H_J^{3,0}(M, \mathbb{C}) \oplus H_J^{1,2}(M, \mathbb{C}) \qquad [16]$$

Now we have $h^{2,1} + h^{0,3} = h^{2,1} + 1$ complex conditions on the joint choice of $h^{2,1}$ complex structure

moduli and $\tau$, so this condition also picks out special points, now in $\mathcal{M}_c \times \mathcal{H}$.

The critical point formulation of this problem is that of finding critical points of

$$W = \int \Omega \wedge F \qquad [17]$$

under the covariant derivatives eqn [13] and

$$D_\tau W = \partial_\tau W + (\partial_\tau K) W$$

with $K$ the sum of eqn [14] and the Kähler potential $-\log \mathrm{Im}\, \tau$ for the metric on the upper half-plane of constant curvature $-1$.

This is a sort of complexified version of the previous problem and arises naturally in IIb compactification by postulating a nonzero value $F$ for a certain 3-form gauge field strength, the flux. The quantity eqn [17] is the superpotential of the resulting $N = 1$ supergravity theory, and it is a standard fact in this context that supersymmetric vacua (critical points of the effective potential) are critical points of $W$ in the sense we just stated.

We can again pose the question of finding the distribution of flux vacua in $\mathcal{M}_c(M) \times \mathcal{H}$. Besides $|W|^2$, which physically is one of the contributions to the vacuum energy, we can also use the "length of the flux"

$$L = \frac{1}{\mathrm{Im}\, \tau} \int \mathrm{Re}\, F \wedge \mathrm{Im}\, F \qquad [18]$$

as a control parameter, and count flux vacua for which $L \le L_{\max}$. In fact, this parameter arises naturally in the actual IIb problem, as the "orientifold three-plane charge."

What makes this problem particularly interesting physically is that it (and its analogs in other string theories) may bear on the solution of the cosmological constant problem. This begins with Einstein's famous observation that the equations of general relativity admit a one-parameter generalization,

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R = 8\pi T_{\mu\nu} + \Lambda g_{\mu\nu}$$

Physically, the cosmological constant $\Lambda$ is the vacuum energy, which in our flux problem takes the form $\Lambda = \cdots - 3|W|^2$ (the other terms are inessential for us here).

Cosmological observations tell us that $\Lambda$ is very small, of the same order as the energy of matter in the present era, about $10^{-122} M^4_{\mathrm{Planck}}$ in Planck units. However, in a generic theory of quantum gravity, including string theory, quantum effects are expected to produce a large vacuum energy, a priori of order $M^4_{\mathrm{Planck}}$. Finding an explanation for why the theory of our universe is in this sense nongeneric is the cosmological constant problem.

One of the standard solutions of this problem is the "anthropic solution," initiated in work of Weinberg and others, and discussed in string theory in Bousso and Polchinski (2000). Suppose that we are discussing a theory with a large number of vacuum states, all of which are otherwise candidates to describe our universe, but which differ in $\Lambda$. If the number of these vacuum states were sufficiently large, the claim that a few of these states realize a small $\Lambda$ would not be surprising. But one might still feel a need to explain why our universe is a vacuum with small $\Lambda$, and not one of the multitude with large $\Lambda$.

The anthropic argument is that, according to accepted models for early cosmology, if the value of $|\Lambda|$ were even 100 times larger than what is observed, galaxies and stars could not form. Thus, the known laws of physics guarantee that we will observe a universe with $\Lambda$ within this bound; it is irrelevant whether other possible vacuum states "exist" in any sense.

While such anthropic arguments are controversial, one can avoid them in this case by simply asking whether or not any vacuum state fits the observed value of $\Lambda$. Given a precise definition of vacuum state, this is a question of mathematics. Still, answering it for any given vacuum state is extremely difficult, as it would require computing $\Lambda$ to $10^{-122}$ precision. But it is not out of reach to argue that out of a large number of vacua, some of them are expected to realize small $\Lambda$. For example, if we could show that the number of otherwise physically acceptable vacua was larger than $10^{122}$, and that the distribution of $\Lambda$ among these was approximately uniform over the range $(-M^4_{\mathrm{Planck}}, M^4_{\mathrm{Planck}})$, we would have made a good case for this expectation. This style of reasoning can be vastly generalized and, given favorable assumptions about the number of vacua in a theory, could lead to falsifiable predictions independent of any a priori assumptions about the choice of vacuum state (Douglas 2003).

Asymptotic Counting Formulas

We have just defined two classes of physically preferred points in the complex structure moduli space of Calabi–Yau 3-folds, the attractor points and the flux vacua. Both have simple definitions in terms of Hodge structure, eqn [8] and eqn [16], and both are also critical points of integral periods of the holomorphic 3-form.

This second phrasing of the problem suggests the following language. We define a random period of the holomorphic 3-form to be the period for a randomly chosen cycle in $H_3(M, \mathbb{Z})$ of the types we

just discussed (real or complex, and with the appropriate control parameters). We are then interested in the expected distribution of critical points for a random period. This brings our problem into the framework of random algebraic geometry.

Before proceeding to use this framework, let us first point out some differences with the toy problems we discussed. First, while eqn [12] and eqn [17] are sums of the form eqn [6], we take not an orthonormal basis but instead a basis $s_i$ of integral periods of $\Omega$. Second, the coefficients $c_i$ are not normally distributed but instead drawn from a discrete uniform distribution, that is, correspond to a choice of $\gamma$ in $H^3(M, \mathbb{Z})$ or $F$ as in eqn [15], satisfying the bounds on $|Z|$ or $L$. Finally, we do not normalize the distribution (which is thus not a probability measure) but instead take each choice with unit weight.

These choices can of course be modified, but are made in order to answer the question, "how many attractor points (or flux vacua) sit within a specified region of moduli space?" The answer we will get is a density $\mu(Z_{\max})$ or $\mu(L_{\max})$ on moduli space, such that as the control parameter becomes large, the number of critical points within a region $R$ asymptotes to

$$N(R; Z_{\max}) \sim \int_R \mu(Z_{\max})$$

The key observation is that to get such asymptotics, we can start with a Gaussian random element $s$ of $H^3(M, \mathbb{R})$ or $H^3(M, \mathbb{C})$. In other words, we neglect the integral quantization of the charge or flux. Intuitively, this might be expected to make little difference in the limit that the charge or flux is large, and in fact one can prove that this simplification reproduces the leading large $L$ or $|Z|$ asymptotics for the density of critical points, using standard ideas in lattice point counting.

This justifies starting with a two-point function like eqn [7]. While the integral periods $s_i$ of $\Omega$ can be computed in principle (and have been in many examples) by solving a system of linear PDEs, the Picard–Fuchs equations, it turns out that one does not need such detailed results. Rather, one can use the following ansatz for the two-point function,

$$G(z_1, \bar{z}_2) = \sum_{I=1}^{b_3} \eta^{IJ} s_I(z_1)\, \bar{s}_J(\bar{z}_2) = \int_M \Omega(z_1) \wedge \bar{\Omega}(\bar{z}_2) = \exp\left(-K(z_1, \bar{z}_2)\right)$$

In words, the two-point function is the formal continuation of the Kähler potential on $\mathcal{M}_c(M)$ to independent holomorphic and antiholomorphic variables. This incorporates the quadratic form appearing in eqn [18] and can be used to count sections with such a bound.

We can now follow the same strategy as before, by introducing an expected density of critical points,

$$d\mu(z) = E\left[\delta^{(n)}(D_i s(z))\, \delta^{(n)}(\bar{D}_{\bar{i}} \bar{s}(z))\, \left|\det_{1 \le i,j \le 2n} H_{ij}\right|\right] \qquad [19]$$

where the "complex Hessian" $H$ is the $2n \times 2n$ matrix of second derivatives

$$H \equiv \begin{pmatrix} \partial_i D_j s(z) & \partial_i \bar{D}_{\bar{j}} \bar{s}(z) \\ \bar{\partial}_{\bar{i}} D_j s(z) & \bar{\partial}_{\bar{i}} \bar{D}_{\bar{j}} \bar{s}(z) \end{pmatrix} \qquad [20]$$

(note that $\partial D s = D D s$ at a critical point). One can then compute this density along the same lines. The holomorphy of $s$ implies that $\bar{\partial}_{\bar{i}} D_j s = \omega_{\bar{i} j}\, s$, which is one simplification. Other geometric simplifications follow from the fact that eqn [19] depends only on $s$ and a finite number of its derivatives at the point $z$.

For the attractor problem, using the identity

$$D_i D_j s = \mathcal{F}_{ijk}\, \omega^{k \bar{k}}\, \bar{D}_{\bar{k}} \bar{s} = 0$$

from special geometry of Calabi–Yau 3-folds, the Hessian becomes trivial, and $\det H = |s|^{2n}$. One thus finds (Denef and Douglas 2004) that the asymptotic density of attractor points with large $|Z| \le Z_{\max}$ in a region $R$ is

$$N(R; |Z| \le Z_{\max}) \sim \frac{2^{n+1}}{(n+1)\, \pi^n}\, Z_{\max}^{n+1} \cdot \mathrm{vol}(R)$$

where $\mathrm{vol}(R) = \int_R \omega^n / n!$ is the volume of $R$ in the Weil–Petersson metric. The total volume is known to be finite for Calabi–Yau 3-fold moduli spaces, and thus so is the number of attractor points under this bound.

The flux vacuum problem is complicated by the fact that $DDs$ is nonzero and thus the determinant of the Hessian does not take a definite sign, and implementing the absolute value in eqn [19] is nontrivial. The result (Douglas et al. 2004) is

$$\mu(z) \sim \frac{1}{b_3!\, \sqrt{\det \Lambda(z)}} \int_{\mathcal{H}(z) \times \mathbb{C}} \left|\det\left(H H^* - |x|^2 \cdot 1\right)\right| e^{-H^t \Lambda(z)^{-1} H - |x|^2}\, dH\, dx$$

where $\mathcal{H}(z)$ is the subspace of Hessian matrices eqn [20] obtainable from periods at the point $z$, and $\Lambda(z)$ is a covariance matrix computable from the period data.
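The claim made earlier in this section, that neglecting the integral quantization of the charge or flux reproduces the leading asymptotics, rests on a basic fact of lattice point counting: the number of integer points in a large region is approximated by its volume. The following Python sketch illustrates only this toy version, with a Euclidean ball in $\mathbb{Z}^d$ standing in for the actual regions cut out by the bounds on $|Z|$ or $L$ (which are not Euclidean balls; the flux bound in particular involves an indefinite form). The function names are illustrative assumptions and do not come from the article.

```python
import math
from itertools import product


def count_lattice_points(dim: int, radius: float) -> int:
    """Count integer points x in Z^dim with |x| <= radius by direct enumeration."""
    bound = int(math.floor(radius))
    r2 = radius * radius
    count = 0
    # Enumerate the bounding cube and keep the points inside the ball.
    for point in product(range(-bound, bound + 1), repeat=dim):
        if sum(c * c for c in point) <= r2:
            count += 1
    return count


def ball_volume(dim: int, radius: float) -> float:
    """Volume of the Euclidean ball of the given radius in R^dim."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim


if __name__ == "__main__":
    # The ratio count / volume should approach 1 as the radius grows.
    for radius in (2.0, 5.0, 10.0, 20.0):
        n = count_lattice_points(3, radius)
        v = ball_volume(3, radius)
        print(f"R={radius:5.1f}  lattice={n:6d}  volume={v:10.1f}  ratio={n / v:.4f}")
```

The discrepancy between the count and the volume is a boundary effect, which is why it becomes relatively small for large regions; controlling it in the actual problem is more delicate than this toy case suggests.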

A simpler lower bound for the number of solutions can be obtained by instead computing the index density

$$\mu_I(z) = E\left[\delta^{(n)}(D_i s)\, \delta^{(n)}(\bar{D}_{\bar{i}} \bar{s})\, \det_{1 \le i,j \le 2n} H_{ij}\right] \qquad [21]$$

so-called because it weighs the vacua with a Morse–Witten sign factor. This admits a simple explicit formula (Ashok and Douglas 2004),

$$I_{\mathrm{vac}}(R; L \le L_{\max}) \sim \frac{(2\pi L_{\max})^{b_3}}{\pi^{n+1}\, b_3!} \int_R \det(\mathcal{R} + \omega \cdot 1) \qquad [22]$$

where $\mathcal{R}$ is the $(n+1) \times (n+1)$-dimensional matrix of curvature 2-forms for the Weil–Petersson metric. One might have guessed this density by the following reasoning. If $s$ had been a single-valued section on a compact $\mathcal{M}_c$ (it is not), topological arguments determine the total index to be $[c_{n+1}(\mathcal{L} \otimes T^* \mathcal{M})]$, and this is the simplest density constructed solely from the metric and curvatures in the same cohomology class.

It is not in general known whether this integral over Calabi–Yau moduli space is finite, though this is true in examples studied so far. One can also control $|W|^2$ as well as other observables, and one finds that the distribution of $|W|^2$ among flux vacua is to a good approximation uniform. Considering explicit examples, the prefactor in eqn [22] is of order $10^{100}$–$10^{300}$, so assuming that this factor dominates the integral, we have justified the Bousso–Polchinski solution to the cosmological constant problem in these models.

The finite $L$ corrections to these formulas can be estimated using van der Corput techniques, and are suppressed by better than the naive $L^{-1/2}$ or $|Z|^{-1}$ one might have expected. However, the asymptotic formulas for the number of flux vacua break down in certain limits of moduli space, such as the large complex structure limit. This is because eqn [18] is an indefinite quadratic form, and the fact that it bounds the number of solutions at all is somewhat subtle. These points are discussed at length in Douglas et al. (2005).

Similar results have been obtained for a wide variety of flux vacuum counting problems, with constraints on the value of the effective potential at the minimum, on the masses of scalar fields, on scales of supersymmetry breaking, and so on. And in principle, this is just the tip of an iceberg, as the study of more or less any class of superstring vacua leads to similar questions of counting and distribution, less well understood at present. Some of these are discussed in Douglas (2003), Acharya et al. (2005), Denef and Douglas (2005), Blumenhagen et al. (2005).

See also: Black Hole Mechanics; Chaos and Attractors; Compactification of Superstring Theory; Supergravity.

Further Reading

Acharya BS, Denef F, and Valandro R (2005) Statistics of M theory vacua. Journal of High Energy Physics 0506: 056.
Ashok S and Douglas MR (2004) Counting flux vacua. Journal of High Energy Physics 0401: 060.
Blumenhagen R, Gmeiner F, Honecker G, Lüst D, and Weigand T (2005) The statistics of supersymmetric D-brane models. Nuclear Physics B 713: 83.
Bousso R and Polchinski J (2000) Quantization of four-form fluxes and dynamical neutralization of the cosmological constant. Journal of High Energy Physics 06: 06.
Cox DA and Katz S (1999) Mirror Symmetry and Algebraic Geometry. Providence, RI: American Mathematical Society.
Denef F and Douglas MR (2004) Distributions of flux vacua. Journal of High Energy Physics 0405: 072.
Denef F and Douglas MR (2005) Distributions of nonsupersymmetric flux vacua. Journal of High Energy Physics 0503: 061.
Douglas MR (2003) The statistics of string/M theory vacua. Journal of High Energy Physics 5: 046.
Douglas MR, Shiffman B, and Zelditch S (2004) Critical points and supersymmetric vacua I. Communications in Mathematical Physics 252(1–3): 325–358; II: Asymptotics (to appear).
Douglas MR, Shiffman B, and Zelditch S (2005) Critical points and supersymmetric vacua, III: String/M models. Submitted to Communications in Mathematical Physics. http://arxiv.org/abs/math-ph/0506015.
Edelman A and Kostlan E (1995) How many zeros of a random polynomial are real? Bull. Amer. Math. Soc. (N.S.) 32: 1–37.
Ferrara S, Kallosh R, and Strominger A (1995) N = 2 extremal black holes. Physical Review D 52: 5412.
Giddings SB, Kachru S, and Polchinski J (2002) Hierarchies from fluxes in string compactifications. Physical Review D (3) 66(10): 106006.
Gross M, Huybrechts D, and Joyce D (2003) Calabi–Yau Manifolds and Related Geometries. New York: Springer.
Moore GW (2004) Les Houches lectures on strings and arithmetic, arXiv:hep-th/0401049.
Shiffman B and Zelditch S (2003) Equilibrium distribution of zeros of random polynomials. Int. Math. Res. Not. 1: 25–49.
Zelditch S (2001) From random polynomials to symplectic geometry. In: XIIIth International Congress of Mathematical Physics, pp. 367–376. Beijing: Higher Education Press.
Zelditch S (2005) Random complex geometry and vacua, or: How to count universes in string/M theory, 2005 preprint.

Random Dynamical Systems


V Araújo, Universidade do Porto, Porto, Portugal
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The concept of random dynamical system is a comparatively recent development combining ideas and methods from the well-developed areas of probability theory and dynamical systems.

Let us consider a mathematical model of some physical process given by the iterates $T_0^k = T_0 \circ \cdots \circ T_0$ ($k$ times), $k \ge 1$, of a smooth transformation $T_0 : M \to M$ of a manifold into itself. A realization of the process with initial condition $x_0$ is modeled by the sequence $(T_0^k(x_0))_{k \ge 1}$, the orbit of $x_0$.

Due to our inaccurate knowledge of the particular physical system or due to computational or theoretical limitations (e.g., lack of sufficient computational power, inefficient algorithms, or insufficiently developed mathematical or physical theory), the mathematical models never correspond exactly to the phenomenon they are meant to model. Moreover, when considering practical systems, we cannot avoid either external noise or measurement or inaccuracy errors, so every realistic mathematical model should be such that small errors along orbits do not disturb the long-term behavior too much. To be able to cope with unavoidable uncertainty about the "correct" parameter values, observed initial states and even the specific mathematical formulation involved, we let randomness be embedded within the model to begin with.

This article presents the most basic classes of models, defines the general concept, and presents some developments and examples of applications.

Dynamics with Noise

To model random perturbations of a transformation $T_0$, we may consider a transition from the image $T_0(x)$ to some point according to a given probability law, obtaining a Markov chain, or, if $T_0$ depends on a parameter $p$, we may choose $p$ at random at each iteration, which also can be seen as a Markov chain but whose transitions are strongly correlated.

Random Noise

Given $T_0 : M \to M$ and a family $\{p(\cdot \mid x) : x \in M\}$ of probability measures on $M$ such that the support of $p(\cdot \mid x)$ is close to $T_0(x)$, the random orbits are sequences $(x_k)_{k \ge 1}$ where each $x_{k+1}$ is a random variable with law $p(\cdot \mid x_k)$. This is a Markov chain with state space $M$ and transition probabilities $\{p(\cdot \mid x)\}_{x \in M}$. To extend the concept of invariant measure of a transformation to this setting, a probability measure $\mu$ is said to be "stationary" if $\mu(A) = \int p(A \mid x)\, d\mu(x)$ for every measurable (Borel) subset $A$. This can be conveniently translated by saying that the skew-product measure $\mu \times p^{\mathbb{N}}$ on $M \times M^{\mathbb{N}}$ given by

$$d(\mu \times p^{\mathbb{N}})(x_0, x_1, \ldots, x_n, \ldots) = d\mu(x_0)\, p(dx_1 \mid x_0) \cdots p(dx_{n+1} \mid x_n) \cdots$$

is invariant by the shift map $S : M \times M^{\mathbb{N}} \to M \times M^{\mathbb{N}}$ on the space of orbits. Hence, we may use the ergodic theorem to conclude that the time averages of every continuous observable $\varphi : M \to \mathbb{R}$, that is, writing $x = (x_k)_{k \ge 0}$,

$$\tilde{\varphi}(x) = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(x_k) = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(\pi_0(S^k(x)))$$

exist for $\mu \times p^{\mathbb{N}}$-almost all sequences $x$, where $\pi_0 : M \times M^{\mathbb{N}} \to M$ is the natural projection on the first coordinate. It is well known that stationary measures always exist if the transition probabilities $p(\cdot \mid x)$ depend continuously on $x$.

A function $\varphi : M \to \mathbb{R}$ is invariant if $\varphi(x) = \int \varphi(z)\, p(dz \mid x)$ for $\mu$-almost every $x$. We then say that $\mu$ is ergodic if every invariant function is constant $\mu$-almost everywhere. Using the ergodic theorem again, if $\mu$ is ergodic, then $\tilde{\varphi} = \int \varphi\, d\mu$, $\mu$-almost everywhere.

Stationary measures are the building blocks for more sophisticated analysis involving, for example, asymptotic sojourn times, Lyapunov exponents, decay of correlations, entropy and/or dimensions, exit/entrance times from/to subsets of $M$, to name just a few frequent notions of dynamical and probabilistic/statistical nature.

Example 1 (Random jumps). Given $\varepsilon > 0$ and $T_0 : M \to M$, let us define

$$p^\varepsilon(A \mid x) = \frac{m(A \cap B(T_0(x), \varepsilon))}{m(B(T_0(x), \varepsilon))}$$

where $m$ denotes some choice of Riemannian volume form on $M$. Then $p^\varepsilon(\cdot \mid x)$ is the normalized volume restricted to the $\varepsilon$-neighborhood of $T_0(x)$.
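As a minimal sketch of the random-jump kernel of Example 1, the following Python code simulates a random orbit on the circle $M = \mathbb{R}/\mathbb{Z}$ and computes a time average of an observable. The choice of the doubling map for $T_0$, the function names, and the parameter values are illustrative assumptions, not part of the article. On the circle with Lebesgue measure as volume, the normalized volume on the $\varepsilon$-neighborhood of $T_0(x)$ is the uniform law on an arc, so each step adds a uniform displacement in $[-\varepsilon, \varepsilon]$. For this particular choice Lebesgue measure is stationary, so one expects the time average of $\cos(2\pi x)$ to be close to $0$.

```python
import math
import random


def t0(x: float) -> float:
    """Unperturbed map: the doubling map on the circle R/Z (an illustrative choice)."""
    return (2.0 * x) % 1.0


def random_jump_orbit(x0: float, eps: float, steps: int, rng: random.Random):
    """Yield x_0, x_1, ..., where x_{k+1} is uniform on the eps-arc around t0(x_k)."""
    x = x0
    for _ in range(steps):
        yield x
        # Jump uniformly within distance eps of the image point, then wrap to the circle.
        x = (t0(x) + rng.uniform(-eps, eps)) % 1.0


def time_average(phi, x0: float, eps: float, steps: int, seed: int = 0) -> float:
    """Average the observable phi along one random orbit of the given length."""
    rng = random.Random(seed)
    total = 0.0
    for x in random_jump_orbit(x0, eps, steps, rng):
        total += phi(x)
    return total / steps


if __name__ == "__main__":
    avg = time_average(lambda x: math.cos(2 * math.pi * x), x0=0.1, eps=0.05, steps=200_000)
    print(f"time average of cos(2*pi*x): {avg:.4f}")
```

Changing the seed changes the orbit but, for an ergodic stationary measure, should not change the limiting time average.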

The kernel $p^\varepsilon$ thus defines a family of transition probabilities allowing the points to "jump" from $T_0(x)$ to any point in the $\varepsilon$-neighborhood of $T_0(x)$ following a uniform distribution law.

Random Maps

Alternatively, we may choose maps $T_1, T_2, \ldots, T_k$ independently at random near $T_0$ according to a probability law $\nu$ on the space $T(M)$ of maps, whose support is close to $T_0$ in some topology, and consider sequences $x_k = T_k \circ \cdots \circ T_1(x_0)$ obtained through random iteration, $k \ge 1$, $x_0 \in M$.

This is again a Markov chain whose transition probabilities are given for any $x \in M$ by

$$p(A \mid x) = \nu(\{T \in T(M) : T(x) \in A\})$$

so this model may be reduced to the first one. However, in the random-maps setting, we may associate, with each random orbit, a sequence of maps which are iterated, enabling us to use "robust properties" of the transformation $T_0$ (i.e., properties which are known to hold for $T_0$ and for every nearby map $T$) to derive properties of the random orbits.

Under some regularity conditions on the map $x \mapsto p(A \mid x)$ for every Borel subset $A$, it is possible to represent random noise by random maps on suitably chosen spaces of transformations. In fact, the transition probability measures obtained in the random-maps setting exhibit strong spatial correlation: $p(\cdot \mid x)$ is close to $p(\cdot \mid y)$ when $x$ is near $y$.

If we have a parametrized family $T : U \times M \to M$ of maps, we can specify the law $\nu$ by giving a probability $\theta$ on $U$. Then with every sequence $T_1, \ldots, T_k, \ldots$ of maps of the given family, we associate a sequence $\omega_1, \ldots, \omega_k, \ldots$ of parameters in $U$ since

$$T_k \circ \cdots \circ T_1 = T_{\omega_k} \circ \cdots \circ T_{\omega_1} = T^k_{\omega_1, \ldots, \omega_k}$$

for all $k \ge 1$, where we write $T_\omega(x) = T(\omega, x)$. In this setting, the shift map $S$ becomes a skew-product transformation

$$S : M \times U^{\mathbb{N}} \to M \times U^{\mathbb{N}}, \qquad (x, \omega) \mapsto (T_{\omega_1}(x), \sigma(\omega))$$

to which many of the standard methods of dynamical systems and ergodic theory can be applied, yielding stronger results that can be interpreted in random terms.

Example 2 (Parametric noise). Let $T : P \times M \to M$ be a smooth map where $P$, $M$ are finite-dimensional Riemannian manifolds. We fix $p_0 \in P$, denote by $m$ some choice of Riemannian volume form on $P$, set $T_w(x) = T(w, x)$, and for every $\varepsilon > 0$ write $\theta_\varepsilon = (m(B(p_0, \varepsilon)))^{-1} (m \mid B(p_0, \varepsilon))$, the normalized restriction of $m$ to the $\varepsilon$-neighborhood of $p_0$. Then $(T_w)_{w \in P}$, together with $\theta_\varepsilon$, defines a random perturbation of $T_{p_0}$, for every small enough $\varepsilon > 0$.

Example 3 (Global additive perturbations). Let $M$ be a homogeneous space, that is, a compact connected Lie group admitting an invariant Riemannian metric. Fixing a neighborhood $U$ of the identity $e \in M$, we can define a map $T : U \times M \to M$, $(u, x) \mapsto L_u(T_0(x))$, where $L_u(x) = u \cdot x$ is the left translation associated with $u \in M$. The invariance of the metric means that left (and also right) translations are isometries, hence fixing $u \in U$ and taking any $(x, v) \in TM$, we get

$$\|DT_u(x) \cdot v\| = \|DL_u(T_0(x))(DT_0(x) \cdot v)\| = \|DT_0(x) \cdot v\|$$

In the particular case of $M = \mathbb{T}^d$, the $d$-dimensional torus, we have $T_u(x) = T_0(x) + u$, and this simplest case suggests the name "additive random perturbations" for random perturbations defined using families of maps of this type.

For the probability measure on $U$, we may take $\theta_\varepsilon$, any probability measure supported in the $\varepsilon$-neighborhood of $e$ and absolutely continuous with respect to the Riemannian volume on $M$, for any $\varepsilon > 0$ small enough.

Example 4 (Local additive perturbations). If $M = \mathbb{R}^d$ and $U_0$ is a bounded open subset of $M$ strictly invariant under a diffeomorphism $T_0$, that is, $\mathrm{closure}(T_0(U_0)) \subset U_0$, then we can define an isometric random perturbation setting:

(i) $V = T_0(U_0)$ (so that $\mathrm{closure}(V) = \mathrm{closure}(T_0(U_0)) \subset U_0$);
(ii) $G \simeq \mathbb{R}^d$ the group of translations of $\mathbb{R}^d$; and
(iii) $\mathcal{V}$ a small enough neighborhood of $0$ in $G$.

Then for $v \in \mathcal{V}$ and $x \in V$, we set $T_v(x) = x + v$, with the standard notation for vector addition, and clearly $T_v$ is an isometry. For $\theta_\varepsilon$, we may take any probability measure on the $\varepsilon$-neighborhood of $0$, supported in $\mathcal{V}$ and absolutely continuous with respect to the volume in $\mathbb{R}^d$, for every small enough $\varepsilon > 0$.

Random Perturbations of Flows

In the continuous-time case, the basic model to start with is an ordinary differential equation $dX_t = f(t, X_t)\, dt$, where $f : [0, +\infty) \to \mathcal{X}(M)$ and $\mathcal{X}(M)$ is the family of vector fields in $M$. We embed randomness in the differential equation

basically through "diffusion": the perturbation is given by white noise or Brownian motion "added" to the ordinary solution.

In this setting, assuming for simplicity that $M = \mathbb{R}^n$, the random orbits are solutions of stochastic differential equations

$$dX_t = f(t, X_t)\, dt + \varepsilon \cdot \sigma(t, X_t)\, dW_t, \qquad 0 \le t \le T, \quad X_0 = Z$$

where $Z$ is a random variable, $\varepsilon, T > 0$ and both $f : [0, T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma : [0, T] \times \mathbb{R}^n \to L(\mathbb{R}^k, \mathbb{R}^n)$ are measurable functions. The space of linear maps $\mathbb{R}^k \to \mathbb{R}^n$ is written $L(\mathbb{R}^k, \mathbb{R}^n)$ and $W_t$ is the white-noise process on $\mathbb{R}^k$. The solution of this equation is a stochastic process:

$$X : \mathbb{R} \times \Omega \to M, \qquad (t, \omega) \mapsto X_t(\omega)$$

for some (abstract) probability space $\Omega$, given by

$$X_t = Z + \int_0^t f(s, X_s)\, ds + \int_0^t \varepsilon \cdot \sigma(s, X_s)\, dW_s$$

where the last term is a stochastic integral in the sense of Itô. Under reasonable conditions on $f$ and $\sigma$, there exists a unique solution with continuous paths, that is,

$$[0, +\infty) \ni t \mapsto X_t(\omega)$$

is continuous for almost all $\omega \in \Omega$ (in general these paths are nowhere differentiable).

Setting $Z = \delta_{x_0}$, the probability measure concentrated on the point $x_0$, the initial point of the path is $x_0$ with probability 1. We write $X_t(\omega) x_0$ for paths of this type. Hence, $x \mapsto X_t(\omega) x$ defines a map $X_t(\omega) : M \to M$ which can be shown to be a homeomorphism, and even a diffeomorphism, under suitable conditions on $f$ and $\sigma$. These maps satisfy a cocycle property

$$X_0(\omega) = \mathrm{Id}_M \quad \text{(identity map of } M\text{)}$$
$$X_{t+s}(\omega) = X_t(\theta(s)(\omega)) \circ X_s(\omega)$$

for $s, t \ge 0$ and $\omega \in \Omega$, for a family of measure-preserving transformations $\theta(s) : (\Omega, \mathbf{P}) \to (\Omega, \mathbf{P})$ on a suitably chosen probability space $(\Omega, \mathbf{P})$. This enables us to write the solution of this kind of equation also as a skew product.

The Abstract Framework

The illustrative particular cases presented can all be written in skew-product form as follows.

Let $(\Omega, \mathbf{P})$ be a given probability space, which will be the model for the noise, and let $\mathbb{T}$ be time, which usually means $\mathbb{Z}_+$, $\mathbb{Z}$ (discrete, resp. invertible system) or $\mathbb{R}_+$, $\mathbb{R}$ (continuous, resp. invertible system). A random dynamical system is a skew product

$$S_t : \Omega \times M \to \Omega \times M, \qquad (\omega, x) \mapsto (\theta(t)(\omega), \varphi(t, \omega)(x))$$

for all $t \in \mathbb{T}$, where $\theta : \mathbb{T} \times \Omega \to \Omega$ is a family of measure-preserving maps $\theta(t) : (\Omega, \mathbf{P}) \to (\Omega, \mathbf{P})$ and $\varphi : \mathbb{T} \times \Omega \times M \to M$ is a family of maps $\varphi(t, \omega) : M \to M$ satisfying the cocycle property: for $s, t \in \mathbb{T}$, $\omega \in \Omega$,

$$\varphi(0, \omega) = \mathrm{Id}_M$$
$$\varphi(t + s, \omega) = \varphi(t, \theta(s)(\omega)) \circ \varphi(s, \omega)$$

In this general setting an invariant measure for the random dynamical system is any probability measure $\mu$ on $\Omega \times M$ which is $S_t$-invariant for all $t \in \mathbb{T}$ and whose marginal is $\mathbf{P}$, that is, $\mu(S_t^{-1}(U)) = \mu(U)$ and $\mu(\pi_\Omega^{-1}(V)) = \mathbf{P}(V)$ for every measurable $U \subset \Omega \times M$ and $V \subset \Omega$, respectively, with $\pi_\Omega : \Omega \times M \to \Omega$ the natural projection.

Example 5 In the setting of the previous examples of random perturbations of maps, the product measure $\eta = \mathbf{P} \times \mu$ on $\Omega \times M$, with $\Omega = U^{\mathbb{N}}$, $\mathbf{P} = \theta_\varepsilon^{\mathbb{N}}$ and $\mu$ any stationary measure, is clearly invariant. However, not all invariant measures are product measures of this type.

Naturally an invariant measure is ergodic if every $S_t$-invariant function is $\mu$-almost everywhere constant. That is, if $\psi : \Omega \times M \to \mathbb{R}$ satisfies $\psi \circ S_t = \psi$ $\mu$-almost everywhere for every $t \in \mathbb{T}$, then $\psi$ is $\mu$-almost everywhere constant.

Applications

The well-established applications of both probability or stochastic differential equations (solution of boundary value problems, optimal stopping, stochastic control, etc.) and dynamical systems (all kinds of models of physical, economic or biological phenomena, solutions of differential equations, control systems, etc.) will not be presented here. Instead, this section focuses on topics where the subject sheds new light on these areas.

Products of Random Matrices and the Multiplicative Ergodic Theorem

The following celebrated result on products of random matrices has far-reaching applications in dynamical systems theory.

Let $(X_n)_{n \ge 0}$ be a sequence of independent and identically distributed random variables on the probability space $(\Omega, \mathbf{P})$ with values in $L(\mathbb{R}^k, \mathbb{R}^k)$ such that $E(\log^+ \|X_1\|) < +\infty$, where $\log^+ x = \max\{0, \log x\}$ and $\|\cdot\|$ is a given norm on

$L(\mathbb{R}^k, \mathbb{R}^k)$. Writing $\varphi_n(\omega) = X_n(\omega) \circ \cdots \circ X_1(\omega)$ for all $n \ge 1$ and $\omega \in \Omega$ we obtain a cocycle. If we set

\[ B = \Big\{ (\omega, y) \in \Omega \times \mathbb{R}^k : \lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| \ \text{exists and is finite or is } -\infty \Big\}, \]
\[ \Omega' = \{ \omega \in \Omega : (\omega, y) \in B \ \text{for all } y \in \mathbb{R}^k \}, \]

then $\Omega'$ contains a subset $\Omega''$ of full probability and there exist random variables (which might take the value $-\infty$) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$ with the following properties.

1. Let $I = \{k+1 = i_1 > i_2 > \cdots > i_{l+1} = 1\}$ be any $(l+1)$-tuple of integers and then we define
\[ \Omega_I = \{ \omega \in \Omega'' : \lambda_i(\omega) = \lambda_j(\omega),\ i_h > i, j \ge i_{h+1},\ \text{and } \lambda_{i_h}(\omega) > \lambda_{i_{h+1}}(\omega) \ \text{for all } 1 < h < l \} \]
the set of elements where the sequence $\lambda_i$ jumps exactly at the indexes in $I$. Then for $\omega \in \Omega_I$, $1 < h \le l$,
\[ \Sigma_{I,h}(\omega) = \Big\{ y \in \mathbb{R}^k : \lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| \le \lambda_{i_h}(\omega) \Big\} \]
is a vector subspace with dimension $i_{h-1} - 1$.
2. Setting $\Sigma_{I,k+1}(\omega) = \{0\}$, then
\[ \lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| = \lambda_{i_h}(\omega) \]
for every $y \in \Sigma_{I,h}(\omega) \setminus \Sigma_{I,h+1}(\omega)$.
3. For all $\omega \in \Omega''$ there exists the matrix
\[ A(\omega) = \lim_{n \to +\infty} \big[ (\varphi_n(\omega))^* \varphi_n(\omega) \big]^{1/2n} \]
whose eigenvalues form the set $\{ e^{\lambda_i} : i = 1, \ldots, k \}$.

The values of $\lambda_i$ are the random Lyapunov characteristics and the corresponding subspaces are analogous to random eigenspaces. If the sequence $(X_n)_{n \ge 0}$ is ergodic, then the Lyapunov characteristics become nonrandom constants, but the Lyapunov subspaces are still random.

We can easily deduce the multiplicative ergodic theorem for measure-preserving differentiable maps $(T_0, \mu)$ on manifolds $M$ from this result. For simplicity, we assume that $M \subset \mathbb{R}^k$ and set $p(A \mid x) = \delta_{T_0(x)}(A) = 1$ if $T_0(x) \in A$ and 0 otherwise. Then the measure $\mu \times p^{\mathbb{N}}$ on $M \times M^{\mathbb{N}}$ is $\sigma$-invariant (as defined earlier) and we have that $\pi_0 \circ \sigma = T_0 \circ \pi_0$, where $\pi_0: M^{\mathbb{N}} \to M$ is the projection on the first coordinate, and also $(\pi_0)_*(\mu \times p^{\mathbb{N}}) = \mu$. Then, setting for $n \ge 1$

\[ X: M \to L(\mathbb{R}^k, \mathbb{R}^k),\ x \mapsto DT_0(x), \qquad \text{and} \qquad X_n = X \circ \pi_0 \circ \sigma^n \]

we obtain a stationary sequence to which we can apply the previous result, obtaining the existence of Lyapunov exponents and of Lyapunov subspaces on a full measure subset for any $C^1$ measure-preserving dynamical system.

By a standard extension of the previous setup, we obtain a random version of the multiplicative ergodic theorem. We take a family of skew-product maps $S^t: \Omega \times M \to \Omega \times M$ as in the section "The abstract framework" with an invariant probability measure $\mu$ and such that $\varphi(t, \omega): M \to M$ is (for simplicity) a local diffeomorphism. We then consider the stationary family

\[ X_t: \Omega \to L(TM), \quad \omega \mapsto D\varphi(t, \omega): TM \to TM, \quad t \in T \]

where $D\varphi(t, \omega)$ is the tangent map to $\varphi(t, \omega)$. This is a cocycle since for all $t, s \in T$, $\omega \in \Omega$ we have

\[ X(s + t, \omega) = X(s, \theta(t)\omega) \circ X(t, \omega) \]

If we assume that

\[ \sup_{0 \le t \le 1} \sup_{x \in M} \log^+ \|D\varphi(t, \omega)(x)\| \in L^1(\Omega, P) \]

where $\|\cdot\|$ denotes the norm on the corresponding space of linear maps given by the induced norm (from the Riemannian metric) on the appropriate tangent spaces, then we obtain a sequence of random variables (which might take the value $-\infty$) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$, with $k$ being the dimension of $M$, such that

\[ \lim_{t \to +\infty} \frac{1}{t} \log \|X_t(\omega, x) y\| = \lambda_i(\omega, x) \]

for every $y \in E_i(\omega, x) = \Sigma_i(\omega, x) \setminus \Sigma_{i+1}(\omega, x)$ and $i = 1, \ldots, k+1$, where $(\Sigma_i(\omega, x))_i$ is a sequence of vector subspaces in $T_x M$ as before, measurable with respect to $(\omega, x)$. In this setting, the subspaces $E_i(\omega, x)$ and the Lyapunov exponents are invariant, that is, for all $t \in T$ and $\mu$-almost every $(\omega, x) \in \Omega \times M$, we have

\[ \lambda_i(S^t(\omega, x)) = \lambda_i(\omega, x) \quad \text{and} \quad E_i(S^t(\omega, x)) = E_i(\omega, x) \]

The dependence of Lyapunov exponents on the map $T_0$ has been a fruitful and central research program in dynamical systems for decades extending to the present day. The random multiplicative ergodic theorem sets the stage for the study of the stability of Lyapunov exponents under random perturbations.

Stochastic Stability of Physical Measures

The development of the theory of dynamical systems has shown that models involving expressions as simple as quadratic polynomials (as the logistic family or Hénon attractor), or autonomous ordinary

differential equations with a hyperbolic singularity of saddle type, as the Lorenz flow, exhibit sensitive dependence on initial conditions, a common feature of chaotic dynamics: small initial differences are rapidly augmented as time passes, causing two trajectories originally coming from practically indistinguishable points to behave in a completely different manner after a short while. Long-term predictions based on such models are unfeasible, since it is not possible to both specify initial conditions with arbitrary accuracy and numerically calculate with arbitrary precision.

Physical measures. Inspired by an analogous situation of unpredictability faced in the field of statistical mechanics/thermodynamics, researchers focused on the statistics of the data provided by the time averages of some observable (a continuous function on the manifold) of the system. Time averages are guaranteed to exist for a positive-volume subset of initial states (also called an observable subset) on the mathematical model if the transformation, or the flow associated with the ordinary differential equation, admits a smooth invariant measure (a density) or a physical measure.

Indeed, if $\mu_0$ is an ergodic invariant measure for the transformation $T_0$, then the ergodic theorem ensures that for every $\mu_0$-integrable function $\varphi: M \to \mathbb{R}$ and for $\mu_0$-almost every point $x$ in the manifold $M$, the time average $\tilde{\varphi}(x) = \lim_{n \to +\infty} n^{-1} \sum_{j=0}^{n-1} \varphi(T_0^j(x))$ exists and equals the space average $\int \varphi \, d\mu_0$. A physical measure $\mu$ is an invariant probability measure for which it is required that time averages of every continuous function $\varphi$ exist for a positive Lebesgue measure (volume) subset of the space and be equal to the space average $\mu(\varphi)$.

We note that if $\mu$ is a density, that is, absolutely continuous with respect to the volume measure, then the ergodic theorem ensures that $\mu$ is physical. However, not every physical measure is absolutely continuous. To see why in a simple example, consider a singularity $p$ of a vector field which is an attracting fixed point (a sink). Then the Dirac mass $\delta_p$ concentrated on $p$ is a physical probability measure, since every orbit in the basin of attraction of $p$ will have asymptotic time averages for any continuous observable $\varphi$ given by $\varphi(p) = \delta_p(\varphi)$.

Physical measures need not be unique or even exist in general but, when they do exist, it is desirable that the set of points whose asymptotic time averages are described by physical measures (such a set is called the basin of the physical measures) be of full Lebesgue measure: only an exceptional set of points with zero volume would not have a well-defined asymptotic behavior. This is yet far from being proved for most dynamical systems, in spite of much recent progress in this direction.

There are robust examples of systems admitting several physical measures whose basins together are of full Lebesgue measure, where "robust" means that there are whole open sets of maps of a manifold in the $C^2$ topology exhibiting these features. For typical parametrized families of one-dimensional unimodal maps (maps of the circle or of the interval with a unique critical point), it is known that the above scenario holds true for Lebesgue almost every parameter. It is known that there are systems admitting no physical measure, but the only known cases are not robust, that is, there are systems arbitrarily close which admit physical measures.

It is hoped that conclusions drawn from models admitting physical measures will be effectively observable in the physical processes being modeled. In order to lend more weight to this expectation, researchers demand stability properties from such invariant measures.

Stochastic stability. There are two main issues concerning a mathematical model, both from theoretical and practical standpoints. The first one is to describe the asymptotic behavior of most orbits, that is, to understand what happens to orbits when time tends to infinity. The second and equally important one is to ascertain whether the asymptotic behavior is stable under small changes of the system, that is, whether the limiting behavior is still essentially the same after small changes to the law of evolution. In fact, since models are always simplifications of the real system (we cannot ever take into account the whole state of the universe in any model), the lack of stability considerably weakens the conclusions drawn from such models, because some properties might be specific to the model and not in any way resemble the real system.

Random dynamical systems come into play in this setting when we need to check whether a given model is stable under small random changes to the law of evolution.

In more precise terms, we suppose that there is a dynamical system (a transformation or a flow) admitting a physical measure $\mu_0$ and we take any random dynamical system obtained from this one through the introduction of small random perturbations on the dynamics, as in Examples 1–4 or in the section on "Random perturbations of flows," with the noise level $\varepsilon > 0$ close to zero.

In this setting if, for any choice $\mu^\varepsilon$ of invariant measure for the random dynamical system for all $\varepsilon > 0$ small enough, the set of accumulation points of

the family $(\mu^\varepsilon)_{\varepsilon > 0}$, when $\varepsilon$ tends to 0 (also known as zero-noise limits), is formed by physical measures or, more generally, by convex linear combinations of physical measures, then the original unperturbed dynamical system is stochastically stable.

This intuitively means that the asymptotic behavior measured through time averages of continuous observables for the random system is close to the behavior of the unperturbed system.

Recent progress in one-dimensional dynamics has shown that, for typical families $(f_t)_{t \in (0,1)}$ of maps of the circle or of the interval having a unique critical point, a full Lebesgue measure subset $T$ of the set of parameters is such that, for $t \in T$, the dynamics of $f_t$ admits a unique stochastically stable (under additive noise type random perturbations) physical measure $\mu_t$ whose basin has full measure in the ambient space (either the circle or the interval). Therefore, models involving one-dimensional unimodal maps typically are stochastically stable.

In many settings (e.g., low-dimensional dynamical systems), Lyapunov exponents can be given by time averages of continuous functions; for example, the time average of $\log \|DT_0\|$ gives the biggest exponent. In this case, stochastic stability directly implies stability of the Lyapunov exponents under small random perturbations of the dynamics.

Example 6 (Stochastically stable examples). Let $T_0: S^1 \to S^1$ be a map such that $\lambda$, the Lebesgue (length) measure on the circle, is $T_0$-invariant and ergodic. Then $\lambda$ is physical.

We consider the parametrized family $T_t: S^1 \times S^1 \to S^1$, $(t, x) \mapsto x + t$ and a family of probability measures $\theta_\varepsilon = (\lambda(-\varepsilon, \varepsilon))^{-1} \cdot (\lambda \mid (-\varepsilon, \varepsilon))$ given by the normalized restriction of $\lambda$ to the $\varepsilon$-neighborhood of 0, where we regard $S^1$ as the Lie group $\mathbb{R}/\mathbb{Z}$ and use additive notation for the group operation. Since $\lambda$ is $T_t$-invariant for every $t \in S^1$, $\lambda$ is also an invariant measure for the measure-preserving random system

\[ S: (S^1 \times \Omega, \lambda \times \theta_\varepsilon^{\mathbb{N}}) \to (S^1 \times \Omega, \lambda \times \theta_\varepsilon^{\mathbb{N}}) \]

for every $\varepsilon > 0$, where $\Omega = (S^1)^{\mathbb{N}}$. Hence, $(T_0, \lambda)$ is stochastically stable under additive noise perturbations.

Concrete examples can be irrational rotations, $T_0(x) = x + \alpha$ with $\alpha \in \mathbb{R} \setminus \mathbb{Q}$, or expanding maps of the circle, $T_0(x) = b \cdot x$ for some $b \in \mathbb{N}$, $b \ge 2$. Analogous examples exist in higher-dimensional tori.

Example 7 (Stochastic stability depends on the type of noise). In spite of the straightforward method for obtaining stochastic stability in Example 6, for example, an expanding circle map $T_0(x) = 2 \cdot x$, we can choose a continuous family of probability measures $\theta_\varepsilon$ such that the same map $T_0$ is not stochastically stable.

It is well known that $\lambda$ is the unique absolutely continuous invariant measure for $T_0$ and also the unique physical measure. Given $\varepsilon > 0$ small, let us define transition probability measures as follows:

\[ p_\varepsilon(\cdot \mid z) = \frac{\lambda \mid [\phi_\varepsilon(z) - \varepsilon, \phi_\varepsilon(z) + \varepsilon]}{\lambda([\phi_\varepsilon(z) - \varepsilon, \phi_\varepsilon(z) + \varepsilon])} \]

where $\phi_\varepsilon \mid (-\varepsilon, \varepsilon) \equiv 0$, $\phi_\varepsilon \mid [S^1 \setminus (-2\varepsilon, 2\varepsilon)] \equiv T_0$, and over $(-2\varepsilon, -\varepsilon] \cup [\varepsilon, 2\varepsilon)$, we can define $\phi_\varepsilon$ by interpolation in order that it be smooth.

In this setting, every random orbit starting at $(-\varepsilon, \varepsilon)$ never leaves this neighborhood in the future. Moreover, it is easy to see that every random orbit eventually enters $(-\varepsilon, \varepsilon)$. Hence, every invariant probability measure $\mu^\varepsilon$ for this Markov chain model is supported in $[-\varepsilon, \varepsilon]$. Thus, letting $\varepsilon \to 0$, we see that the only zero-noise limit is $\delta_0$, the Dirac mass concentrated at 0, which is not a physical measure for $T_0$.

This construction can be achieved in a random-maps setting, but only in the $C^0$ topology: it is not possible to realize this Markov chain by random maps that are $C^1$ close to $T_0$ for $\varepsilon$ near 0.

Characterization of Measures Satisfying the Entropy Formula

Significant effort has been put in recent years in extending important results from dynamical systems to the random setting. Among many examples are: the local conjugacy between the dynamics near a hyperbolic fixed point and the action of the derivative of the map on the tangent space, the stable/unstable manifold theorems for hyperbolic invariant sets and the notions and properties of metric and topological entropy, dimensions and equilibrium states for potentials on random (or fuzzy) sets.

The characterization of measures satisfying the entropy formula is one important result whose extension to the setting of iteration of independent and identically distributed random maps has recently had interesting new consequences back into nonrandom dynamical systems.

Metric entropy for random perturbations. Given a probability measure $\mu$ and a partition $\xi$ of $M$, except perhaps for a subset of $\mu$-null measure, the entropy of $\mu$ with respect to $\xi$ is defined to be

\[ H_\mu(\xi) = - \sum_{R \in \xi} \mu(R) \log \mu(R) \]

where the convention that $0 \log 0 = 0$ has been used. Given another finite partition $\zeta$, we write $\xi \vee \zeta$ to indicate the partition obtained through intersection of every element of $\xi$ with every element of $\zeta$, and analogously for any finite number of partitions. If $\mu$ is also a stationary measure for a random-maps model (see the section "Random maps"), then for any finite measurable partition $\xi$ of $M$,

\[ h_\mu(\xi) = \inf_{n \ge 1} \frac{1}{n} \int H_\mu \Big( \bigvee_{i=0}^{n-1} (T_\omega^i)^{-1}(\xi) \Big) \, dp^{\mathbb{N}}(\omega) \]

is finite and is called the entropy of the random dynamical system with respect to $\xi$ and to $\mu$.

We define $h_\mu = \sup_\xi h_\mu(\xi)$ as the metric entropy of the random dynamical system, where the supremum is taken over all $\mu$-measurable partitions. An important point here is the following notion: setting $\mathcal{A}$ the Borel $\sigma$-algebra of $M$, we say that a finite partition $\xi$ of $M$ is a random generating partition for $\mathcal{A}$ if

\[ \bigvee_{i=0}^{+\infty} (T_\omega^i)^{-1}(\xi) = \mathcal{A} \]

(except $\mu$-null sets) for $p^{\mathbb{N}}$-almost all $\omega \in \Omega = U^{\mathbb{N}}$. Then a classical result from ergodic theory ensures that we can calculate the entropy using only a random generating partition $\xi$, that is, $h_\mu = h_\mu(\xi)$.

The entropy formula. There exists a general relation ensuring that the entropy of a measure-preserving differentiable transformation $(T_0, \mu)$ on a compact Riemannian manifold is bounded from above by the sum of the positive Lyapunov exponents of $T_0$

\[ h_\mu(T_0) \le \int \sum_{\lambda_i(x) > 0} \lambda_i(x) \, d\mu(x) \]

The equality (entropy formula) was first shown to hold for diffeomorphisms preserving a measure equivalent to the Riemannian volume, and then the measures satisfying the entropy formula were characterized: for $C^2$ diffeomorphisms the equality holds if and only if the disintegration of $\mu$ along the unstable manifolds is formed by measures absolutely continuous with respect to the Riemannian volume restricted to those submanifolds. The unstable manifolds are the submanifolds of $M$ everywhere tangent to the Lyapunov subspaces corresponding to all positive Lyapunov exponents, analogous to "integrating the distribution of Lyapunov subspaces corresponding to positive exponents"; this particular point is a main subject of smooth ergodic theory for nonuniformly hyperbolic dynamics.

Both the inequality and the characterization of stationary measures satisfying the entropy formula were extended to random iterations of independent and identically distributed $C^2$ maps (noninjective and admitting critical points), and the inequality reads

\[ h_\mu \le \iint \sum_{\lambda_i(x, \omega) > 0} \lambda_i(x, \omega) \, d\mu(x) \, dp^{\mathbb{N}}(\omega) \]

where the functions $\lambda_i$ are the random variables provided by the random multiplicative ergodic theorem.

Construction of Physical Measures as Zero-Noise Limits

The characterization of measures which satisfy the entropy formula enables us to construct physical measures as zero-noise limits of random invariant measures in some settings, outlined in the following, obtaining in the process that the physical measures so constructed are also stochastically stable.

The physical measures obtained in this manner arguably are natural measures for the system, since they are both stable under (certain types of) random perturbations and describe the asymptotic behavior of the system for a positive-volume subset of initial conditions. This is a significant contribution to the state-of-the-art of present knowledge on dynamics from the perspective of random dynamical systems.

Hyperbolic measures and the entropy formula. The main idea is that an ergodic invariant measure $\mu$ for a diffeomorphism $T_0$ which satisfies the entropy formula and whose Lyapunov exponents are everywhere nonzero (known as a hyperbolic measure) necessarily is a physical measure for $T_0$. This follows from standard arguments of smooth nonuniformly hyperbolic ergodic theory.

Indeed $\mu$ satisfies the entropy formula if and only if $\mu$ disintegrates into densities along the unstable submanifolds of $T_0$. The unstable manifolds $W^u(x)$ are tangent to the subspace corresponding to every positive Lyapunov exponent at $\mu$-almost every point $x$, they are an invariant family, that is, $T_0(W^u(x)) = W^u(T_0(x))$ for $\mu$-almost every $x$, and distances on them are uniformly contracted under iteration by $T_0^{-1}$.

If the exponents along the complementary directions are nonzero, then they must be negative and smooth ergodic theory ensures that there exist stable manifolds, which are submanifolds $W^s(x)$ of

$M$ everywhere tangent to the subspace of negative Lyapunov exponents at $\mu$-almost every point $x$, form a $T_0$-invariant family ($T_0(W^s(x)) = W^s(T_0(x))$, $\mu$-almost everywhere), and distances on them are uniformly contracted under iteration by $T_0$.

We still need to understand that time averages are constant along both stable and unstable manifolds, and that the families of stable and unstable manifolds are absolutely continuous, in order to realize how a hyperbolic measure is a physical measure.

Given $y \in W^s(x)$, the time averages of $x$ and $y$ coincide for continuous observables simply because $\mathrm{dist}(T_0^n(x), T_0^n(y)) \to 0$ when $n \to +\infty$. For unstable manifolds, the same holds when considering time averages for $T_0^{-1}$. Since forward and backward time averages are equal $\mu$-almost everywhere, the set of points having asymptotic time averages given by $\mu$ has positive Lebesgue measure if the set

\[ B = \bigcup \{ W^s(y) : y \in W^u(x) \cap \mathrm{supp}(\mu) \} \]

has positive volume in $M$, for some $x$ whose time averages are well defined.

Now, stable and unstable manifolds are transverse everywhere where they are defined, but they are only defined $\mu$-almost everywhere and depend measurably on the base point, so we cannot use transversality arguments from differential topology, in spite of $W^u(x) \cap \mathrm{supp}(\mu)$ having positive volume in $W^u(x)$ by the existence of a smooth disintegration of $\mu$ along the unstable manifolds. However, it is known for smooth ($C^2$) transformations that the families of stable and unstable manifolds are absolutely continuous, meaning that projections along leaves preserve sets of zero volume. This is precisely what is needed for measure-theoretic arguments to show that $B$ has positive volume.

Zero-noise limits satisfying the entropy formula. Using the extension of the characterization of measures satisfying the entropy formula for the random-maps setting, we can build random dynamical systems, which are small random perturbations of a map $T_0$, having invariant measures $\mu^\varepsilon$ satisfying the entropy formula for all sufficiently small $\varepsilon > 0$. Indeed, it is enough to construct small random perturbations of $T_0$ having absolutely continuous invariant probability measures $\mu^\varepsilon$ for all small enough $\varepsilon > 0$.

In order to obtain such random dynamical systems, we choose families of maps $T: U \times M \to M$ and of probability measures $(\theta_\varepsilon)_{\varepsilon > 0}$ as in Examples 3 and 4, where we assume that $0 \in U$, so that $T_0$ belongs to the family. Letting $T_x(u) = T(u, x)$ for all $(u, x) \in U \times M$, we then have that $T_x(\theta_\varepsilon)$ is absolutely continuous. This means that sets of perturbations of positive $\theta_\varepsilon$-measure send points of $M$ onto positive-volume subsets of $M$. Such a perturbation can be constructed for every continuous map of any manifold.

In this setting, any invariant probability measure for the associated skew-product map $S: \Omega \times M \to \Omega \times M$ of the form $\theta_\varepsilon^{\mathbb{N}} \times \mu^\varepsilon$ is such that $\mu^\varepsilon$ is absolutely continuous with respect to volume on $M$. Then the entropy formula holds:

\[ h_{\mu^\varepsilon} = \iint \sum_{\lambda_i(x, \omega) > 0} \lambda_i(x, \omega) \, d\mu^\varepsilon(x) \, d\theta_\varepsilon^{\mathbb{N}}(\omega) \]

Having this and knowing the characterization of measures satisfying the entropy formula, it is natural to look for conditions under which we can guarantee that the above equality extends to any zero-noise limit $\mu^0$ of $\mu^\varepsilon$ when $\varepsilon \to 0$. In this case, $\mu^0$ satisfies the entropy formula for $T_0$.

If, in addition, we are able to show that $\mu^0$ is a hyperbolic measure, then we obtain a physical measure for $T_0$ which is stochastically stable by construction.

These ideas can be carried out completely for hyperbolic diffeomorphisms, that is, maps admitting a continuous invariant splitting of the tangent space into two sub-bundles $E \oplus F$ defined everywhere with bounded angles, whose Lyapunov exponents are negative along $E$ and positive along $F$. Recently, maps satisfying weaker conditions were shown to admit stochastically stable physical measures following the same ideas.

These ideas also have applications to the construction and stochastic stability of physical measures for strange attractors and for all mathematical models involving ordinary differential equations or iterations of maps.

See also: Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Homeomorphisms and Diffeomorphisms of the Circle; Lyapunov Exponents and Strange Attractors; Nonequilibrium Statistical Mechanics (Stationary): Overview; Random Walks in Random Environments; Stochastic Differential Equations.

Further Reading

Arnold L (1998) Random Dynamical Systems. Berlin: Springer.
Billingsley P (1965) Ergodic Theory and Information. New York: Wiley.
Billingsley P (1985) Probability and Measure, 3rd edn. New York: Wiley.
Bonatti C, Díaz L, and Viana M (2004) Dynamics Beyond Hyperbolicity: A Global Geometric and Probabilistic Perspective. Berlin: Springer.

Bonatti C, Díaz L, and Viana M (2005) Dynamics Beyond Uniform Hyperbolicity. A Global Geometric and Probabilistic Perspective, Encyclopaedia of Mathematical Sciences, 102 Mathematical Physics III. Berlin: Springer.
Doob J (1953) Stochastic Processes. New York: Wiley.
Fathi A, Herman M-R, and Yoccoz J-C (1983) A proof of Pesin's stable manifold theorem. In: Palis J (ed.) Geometric Dynamics (Rio de Janeiro, 1981), Lecture Notes in Mathematics, vol. 1007, 177–215. Berlin: Springer.
Kifer Y (1986) Ergodic Theory of Random Perturbations. Boston: Birkhäuser.
Kifer Y (1988) Random Perturbations of Dynamical Systems. Boston: Birkhäuser.
Kunita H (1990) Stochastic Flows and Stochastic Differential Equations. Cambridge: Cambridge University Press.
Ledrappier F and Young L-S (1988) Entropy formula for random transformations. Probability Theory and Related Fields 80(2): 217–240.
Liu P-D and Qian M (1995) Smooth Ergodic Theory of Random Dynamical Systems, Lecture Notes in Mathematics, vol. 1606. Berlin: Springer.
Øksendal B (1992) Stochastic Differential Equations. Universitext, 3rd edn. Berlin: Springer.
Viana M (2000) What's new on Lorenz strange attractor. Mathematical Intelligencer 22(3): 6–19.
Walters P (1982) An Introduction to Ergodic Theory. Berlin: Springer.
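
A numerical companion to the multiplicative ergodic theorem for products of random matrices, stated earlier in this article, may help fix ideas. The following Python sketch is not taken from the article: the function names and the diagonal test model are hypothetical choices made only for illustration. It estimates $\frac{1}{n} \log \|\varphi_n(\omega) y\|$ for independent $2 \times 2$ matrices $X_n = \mathrm{diag}(a, 1/a)$, with $a = 2$ or $a = 4$ with equal probability, a case in which the law of large numbers gives the two Lyapunov exponents $\pm E[\log a]$ directly.

```python
import math
import random

def lyapunov_exponent(sample_matrix, y, n_steps, rng):
    """Estimate (1/n) log ||phi_n(omega) y|| for phi_n = X_n ... X_1 (2x2 case)."""
    y0, y1 = y
    log_growth = 0.0
    for _ in range(n_steps):
        (a, b), (c, d) = sample_matrix(rng)    # draw X_n independently
        y0, y1 = a * y0 + b * y1, c * y0 + d * y1
        norm = math.hypot(y0, y1)
        log_growth += math.log(norm)           # accumulate the log of the stretch
        y0, y1 = y0 / norm, y1 / norm          # renormalize to avoid overflow
    return log_growth / n_steps

def diagonal_cocycle(rng):
    """Hypothetical i.i.d. model: X_n = diag(a, 1/a), a = 2 or 4 with probability 1/2."""
    a = rng.choice((2.0, 4.0))
    return ((a, 0.0), (0.0, 1.0 / a))

rng = random.Random(0)
top = lyapunov_exponent(diagonal_cocycle, (1.0, 1.0), 20000, rng)
bottom = lyapunov_exponent(diagonal_cocycle, (0.0, 1.0), 20000, rng)
exact = 0.5 * (math.log(2.0) + math.log(4.0))  # E[log a]
print(top, bottom, exact)
```

A generic initial vector such as $y = (1, 1)$ picks up the largest exponent, while $y = (0, 1)$ lies in the smaller subspace of the filtration and yields the lower exponent, in line with item 2 of the theorem.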

Random Matrix Theory in Physics


T Guhr, Lunds Universitet, Lund, Sweden
© 2006 Elsevier Ltd. All rights reserved.

Introduction

We wish to study energy correlations of quantum spectra. Suppose the spectrum of a quantum system has been measured or calculated. All levels in the total spectrum having the same quantum numbers form one particular subspectrum. Its energy levels are at positions $x_n$, $n = 1, 2, \ldots, N$, say. We assume that $N$, the number of levels in this subspectrum, is large. With a proper smoothing procedure, we obtain the level density $R_1(x)$, that is, the probability density of finding a level at the energy $x$. As indicated in the top part of Figure 1, the level density $R_1(x)$ increases with $x$ for most physics systems. In the present context, however, we are not so interested in the level density. We want to measure the spectral correlations independently of it. Hence, we have to remove the level density from the subspectrum. This is referred to as unfolding. We introduce a new dimensionless energy scale $\xi$ such that $d\xi = R_1(x) \, dx$. By construction, the resulting subspectrum in $\xi$ has level density unity, as shown schematically in the bottom part of Figure 1. It is always understood that the energy correlations are analyzed in the unfolded subspectra.

Surprisingly, a remarkable universality is found in the spectral correlations of a large class of systems, including nuclei, atoms, molecules, quantum chaotic and disordered systems, and even quantum chromodynamics on the lattice. Consider the nearest-neighbor spacing distribution $p(s)$. It is the probability density of finding two adjacent levels in the distance $s$. If the positions of the levels are uncorrelated, the nearest-neighbor spacing distribution can be shown to follow the Poisson law

\[ p^{(\mathrm{P})}(s) = \exp(-s) \qquad [1] \]

While this is occasionally found, many more systems show a rather different nearest-neighbor spacing distribution, the Wigner surmise

\[ p^{(\mathrm{W})}(s) = \frac{\pi}{2} \, s \exp\Big( -\frac{\pi}{4} s^2 \Big) \qquad [2] \]

As shown in Figure 2, the Wigner surmise excludes degeneracies, $p^{(\mathrm{W})}(0) = 0$, the levels repel each other. This is only possible if they are correlated. Thus, the Poisson law and the Wigner surmise reflect the absence or the presence of energy correlations, respectively.

Now, the question arises: if these correlation patterns are so frequently found in physics, is there some simple, phenomenological model? Yes, random matrix theory (RMT) is precisely this. To describe the absence of correlations, we choose, in view of what has been said above, a diagonal Hamiltonian

\[ H = \mathrm{diag}(x_1, \ldots, x_N) \qquad [3] \]

Figure 1 Original (top) and unfolded (bottom) spectrum. [Axes: $x$ (top), $\xi$ (bottom).]
Figure 2 Wigner surmise (solid) and Poisson law (dashed). [Axes: $p(s)$ from 0.0 to 1.0 against $s$ from 0 to 3.]
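
The contrast between the Poisson law [1] and the Wigner surmise [2] shown in Figure 2 can be reproduced in a small experiment. The sketch below is an illustration under stated assumptions, not the simulation referred to in the article: it uses only the Python standard library, models uncorrelated levels by sorted uniform random numbers, and models correlated levels by $2 \times 2$ real symmetric matrices with Gaussian entries, the smallest case, for which the spacing has a closed form and the Wigner surmise is known to be exact. All function names are hypothetical. Rescaling to unit mean spacing plays the role of unfolding.

```python
import math
import random

def poisson_spacings(n_levels, rng):
    """Spacings of uncorrelated levels, rescaled to unit mean spacing."""
    levels = sorted(rng.uniform(0.0, 1.0) for _ in range(n_levels))
    gaps = [b - a for a, b in zip(levels, levels[1:])]
    mean = sum(gaps) / len(gaps)
    return [g / mean for g in gaps]

def goe_2x2_spacings(n_samples, rng):
    """Eigenvalue spacings of random real symmetric 2x2 matrices, unit mean."""
    gaps = []
    for _ in range(n_samples):
        h11 = rng.gauss(0.0, 1.0)
        h22 = rng.gauss(0.0, 1.0)
        h12 = rng.gauss(0.0, math.sqrt(0.5))   # off-diagonal variance: half the diagonal one
        gaps.append(math.hypot(h11 - h22, 2.0 * h12))  # |x_1 - x_2| in closed form
    mean = sum(gaps) / len(gaps)
    return [g / mean for g in gaps]

def fraction_below(spacings, s):
    """Empirical probability of finding a spacing smaller than s."""
    return sum(1 for g in spacings if g < s) / len(spacings)

rng = random.Random(1)
poisson = poisson_spacings(100000, rng)
goe = goe_2x2_spacings(100000, rng)
s = 0.5
# Empirical fraction next to the integral of the corresponding law from 0 to s.
print(fraction_below(poisson, s), 1.0 - math.exp(-s))
print(fraction_below(goe, s), 1.0 - math.exp(-math.pi * s * s / 4.0))
```

Small spacings are far rarer in the matrix model than for uncorrelated levels, which is the level repulsion expressed by $p^{(\mathrm{W})}(0) = 0$.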

whose elements, the eigenvalues $x_n$, are uncorrelated random numbers. To model the presence of correlations, we insert off-diagonal matrix elements,

\[ H = \begin{pmatrix} H_{11} & \cdots & H_{1N} \\ \vdots & & \vdots \\ H_{N1} & \cdots & H_{NN} \end{pmatrix} \qquad [4] \]

We require that $H$ is real symmetric, $H^T = H$. The independent elements $H_{nm}$ are random numbers. The random matrix $H$ is diagonalized to obtain the energy levels $x_n$, $n = 1, 2, \ldots, N$. Indeed, a numerical simulation shows that these two models yield, after unfolding, the Poisson law and the Wigner surmise for large $N$, that is, the absence or presence of correlations. This is the most important insight into the phenomenology of RMT.

In this article, we set up RMT in a more formal way; we discuss analytical calculations of correlation functions, demonstrate how this relates to supersymmetry and stochastic field theory and show the connection to chaos, and we briefly sketch the numerous applications in many-body physics, in disordered and mesoscopic systems, in models for interacting fermions, and in quantum chromodynamics. We also mention applications in other fields, even beyond physics.

Random Matrix Theory

Classical Gaussian Ensembles

For now, we consider a system whose energy levels are correlated. The $N \times N$ matrix $H$ modeling it has no fixed zeros but random entries everywhere. There are three possible symmetry classes of random matrices in standard Schrödinger quantum mechanics. They are labeled by the Dyson index $\beta$. If the system is not time-reversal invariant, $H$ has to be Hermitian and the random entries $H_{nm}$ are complex ($\beta = 2$). If time-reversal invariance holds, two possibilities must be distinguished: if either the system is rotationally symmetric, or it has integer spin and rotational symmetry is broken, the Hamilton matrix $H$ can be chosen to be real symmetric ($\beta = 1$). This is the case in eqn [4]. If, on the other hand, the system has half-integer spin and rotational symmetry is broken, $H$ is self-dual ($\beta = 4$) and the random entries $H_{nm}$ are $2 \times 2$ quaternionic. The Dyson index $\beta$ is the dimension of the number field over which $H$ is constructed.

As we are interested in the eigenvalue correlations, we diagonalize the random matrix, $H = U^{-1} x U$. Here, $x = \mathrm{diag}(x_1, \ldots, x_N)$ is the diagonal matrix of the $N$ eigenvalues. For $\beta = 4$, every eigenvalue is doubly degenerate. This is Kramers' degeneracy. The diagonalizing matrix $U$ is in the orthogonal group $O(N)$ for $\beta = 1$, in the unitary group $U(N)$ for $\beta = 2$ and in the unitary–symplectic group $USp(2N)$ for $\beta = 4$. Accordingly, the three symmetry classes are referred to as orthogonal, unitary, and symplectic.

We have not yet chosen the probability densities for the random entries $H_{nm}$. To keep our assumptions about the system at a minimum, we treat all entries on equal footing. This is achieved by rotational invariance of the probability density $P_N^{(\beta)}(H)$, not to be confused with the rotational symmetry employed above to define the symmetry classes. No basis for the matrices is preferred in any way if we construct $P_N^{(\beta)}(H)$ from matrix invariants, that is, from traces and determinants, such that it depends only on the eigenvalues, $P_N^{(\beta)}(H) = P_N^{(\beta)}(x)$. A particularly convenient choice is the Gaussian

\[ P_N^{(\beta)}(H) = C_N^{(\beta)} \exp\Big( -\frac{\beta}{4v^2} \, \mathrm{tr}\, H^2 \Big) \qquad [5] \]

where the constant $v$ sets the energy scale and the constant $C_N^{(\beta)}$ ensures normalization. The three symmetry classes together with the probability densities [5] define the Gaussian ensembles: the Gaussian orthogonal (GOE), unitary (GUE) and symplectic (GSE) ensemble for $\beta = 1, 2, 4$.

The phenomenology of the three Gaussian ensembles differs considerably. The higher $\beta$, the stronger the level repulsion between the eigenvalues $x_n$. Numerical simulation quickly shows that the nearest-neighbor spacing distribution behaves like $p^{(\beta)}(s) \sim s^\beta$ for small spacings $s$. This also becomes obvious by working out the differential probability $P_N^{(\beta)}(H) \, d[H]$ of the random matrices $H$ in eigenvalue–angle coordinates $x$ and $U$. Here, $d[H]$ is the invariant measure or volume element in the matrix space. When writing $d[\cdot]$, we always mean the product of all differentials of independent variables for the quantity in the square brackets. Up to constants, we have

\[ d[H] = |\Delta_N(x)|^\beta \, d[x] \, d\mu(U) \qquad [6] \]

where $d\mu(U)$ is, apart from certain phase contributions, the invariant or Haar measure on $O(N)$, $U(N)$, or $USp(2N)$, respectively. The Jacobian of the transformation is the modulus of the Vandermonde determinant

\[ \Delta_N(x) = \prod_{n < m} (x_n - x_m) \qquad [7] \]

raised to the power $\beta$. Thus, the differential probability $P_N^{(\beta)}(H) \, d[H]$ vanishes whenever any two eigenvalues $x_n$ degenerate. This is the level

repulsion. It immediately explains the behavior of the nearest-neighbor spacing distribution for small spacings.

Additional symmetry constraints lead to new random matrix ensembles relevant in physics, the Andreev and the chiral Gaussian ensembles. If one refers to the classical Gaussian ensembles, one usually means the three ensembles introduced above.

Correlation Functions

The probability density to find k energy levels at positions $x_1, \ldots, x_k$ is the k-level correlation function $R_k^{(\beta)}(x_1, \ldots, x_k)$. We find it by integrating out $N - k$ levels in the N-level differential probability $P_N^{(\beta)}(H)\, d[H]$. We also have to average over the bases, that is, over the diagonalizing matrices U. Due to rotational invariance, this simply yields the group volume. Thus, we have

$R_k^{(\beta)}(x_1, \ldots, x_k) = \frac{N!}{(N-k)!} \int_{-\infty}^{+\infty} dx_{k+1} \cdots \int_{-\infty}^{+\infty} dx_N \, |\Delta_N(x)|^{\beta} P_N^{(\beta)}(x)$   [8]

Once more, we used rotational invariance which implies that $P_N^{(\beta)}(x)$ is invariant under permutation of the levels $x_n$. Since the same then also holds for the correlation functions [8], it is convenient to normalize them to the combinatorial factor in front of the integrals. A constant ensuring this has been absorbed into $P_N^{(\beta)}(x)$.

Remarkably, the integrals in eqn [8] can be done in closed form. The GUE case ($\beta = 2$) is mathematically the simplest, and one finds the determinant structure

$R_k^{(2)}(x_1, \ldots, x_k) = \det[K_N^{(2)}(x_p, x_q)]_{p,q=1,\ldots,k}$   [9]

All entries of the determinant can be expressed in terms of the kernel $K_N^{(2)}(x_p, x_q)$, which depends on two energy arguments $(x_p, x_q)$. Analogous but more complicated formulae are valid for the GOE ($\beta = 1$) and the GSE ($\beta = 4$), involving quaternion determinants and integrals and derivatives of the kernel.

As argued in the Introduction, we are interested in the energy correlations on the unfolded energy scale. The level density is formally the one-level correlation function. For the three Gaussian ensembles it is, to leading order in the level number N, the Wigner semicircle

$R_1^{(\beta)}(x_1) = \frac{1}{2\pi v^2} \sqrt{4Nv^2 - x_1^2}$   [10]

for $|x_1| \le 2\sqrt{N}\, v$ and zero for $|x_1| > 2\sqrt{N}\, v$. None of the common systems in physics has such a level density. When unfolding, we also want to take the limit of infinitely many levels $N \to \infty$ to remove cutoff effects due to the finite dimension of the random matrices. It suffices to stay in the center of the semicircle where the mean level spacing is $D = 1/R_1^{(\beta)}(0) = \pi v/\sqrt{N}$. We introduce the dimensionless energies $\xi_p = x_p/D$, $p = 1, \ldots, k$, which have to be held fixed when taking the limit $N \to \infty$. The unfolded correlation functions are given by

$X_k^{(\beta)}(\xi_1, \ldots, \xi_k) = \lim_{N \to \infty} D^k R_k^{(\beta)}(D\xi_1, \ldots, D\xi_k)$   [11]

As we are dealing with probability densities, the Jacobians $dx_p/d\xi_p$ enter the reformulation in the new energy variables. This explains the factor $D^k$. Unfolding makes the correlation functions translation invariant; they depend only on the differences $\xi_p - \xi_q$. The unfolded correlation functions can be written in a rather compact form. For the GUE ($\beta = 2$), they read

$X_k^{(2)}(\xi_1, \ldots, \xi_k) = \det\left[ \frac{\sin \pi(\xi_p - \xi_q)}{\pi(\xi_p - \xi_q)} \right]_{p,q=1,\ldots,k}$   [12]

There are similar, but more complicated, formulae for the GOE ($\beta = 1$) and the GSE ($\beta = 4$). By construction, one has $X_1^{(\beta)}(\xi_1) = 1$.

It is useful to formulate the case where correlations are absent, that is, the Poisson case, accordingly. The level density $R_1^{(P)}(x_1)$ is simply N times the (smooth) probability density chosen for the entries in the diagonal matrix [4]. Lack of correlations means that the k-level correlation function only involves one-level correlations,

$R_k^{(P)}(x_1, \ldots, x_k) = \frac{N!}{(N-k)!\, N^k} \prod_{p=1}^{k} R_1^{(P)}(x_p)$   [13]

The combinatorial factor is important, since we always normalize to $N!/(N-k)!$. Hence, one finds

$X_k^{(P)}(\xi_1, \ldots, \xi_k) = 1$   [14]

for all unfolded correlation functions.

Statistical Observables

The unfolded correlation functions yield all statistical observables. The two-level correlation function $X_2(r)$ with $r = \xi_1 - \xi_2$ is of particular interest in applications. If we do not write the superscript $(\beta)$ or (P), we mean either of the functions. For the Gaussian ensembles, $X_2^{(\beta)}(r)$ is shown in Figure 3. One often writes $X_2(r) = 1 - Y_2(r)$. The two-level cluster function $Y_2(r)$ nicely measures the deviation from the uncorrelated Poisson case, where one has $X_2^{(P)}(r) = 1$ and $Y_2^{(P)}(r) = 0$.
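The determinant formula [12] and the cluster function are easy to evaluate numerically. The following Python sketch uses only the standard library: it builds the sine-kernel matrix for a list of unfolded energies and takes its determinant. The function names (`sine_kernel`, `determinant`, `unfolded_gue`, `cluster_gue`) are illustrative choices that do not appear in the text, and the determinant routine is plain Gaussian elimination, assumed adequate for the small k of interest here.

```python
import math


def sine_kernel(xi_p, xi_q):
    """Matrix entry sin(pi*(xi_p - xi_q)) / (pi*(xi_p - xi_q)) of eqn [12]; 1 on the diagonal."""
    d = math.pi * (xi_p - xi_q)
    return 1.0 if abs(d) < 1e-12 else math.sin(d) / d


def determinant(rows):
    """Determinant by Gaussian elimination with partial pivoting (pure Python)."""
    a = [list(r) for r in rows]
    n = len(a)
    det = 1.0
    for c in range(n):
        # Pick the row with the largest entry in column c as pivot.
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-300:
            return 0.0  # singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return det


def unfolded_gue(xis):
    """Unfolded GUE k-level correlation function X_k^(2) of eqn [12]."""
    kernel = [[sine_kernel(p, q) for q in xis] for p in xis]
    return determinant(kernel)


def cluster_gue(r):
    """Two-level cluster function Y_2(r) = 1 - X_2(r) for the GUE."""
    return 1.0 - unfolded_gue([r, 0.0])
```

For k = 2 the determinant reduces to $X_2^{(2)}(r) = 1 - (\sin \pi r / \pi r)^2$, so `cluster_gue(r)` returns $(\sin \pi r / \pi r)^2$, and `unfolded_gue([0.0, 0.0])` vanishes, which is the level repulsion at coinciding energies. The Poisson case needs no code: by eqn [14] every unfolded correlation function equals 1.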
Figure 3 Two-level correlation function $X_2^{(\beta)}(r)$ for GOE (solid), GUE (dashed) and GSE (dotted).

By construction, the average level number in an interval of length L in the unfolded spectrum is L. The level number variance $\Sigma^2(L)$ is shown to be an average over the two-level cluster function,

$\Sigma^2(L) = L - 2 \int_0^L (L - r)\, Y_2(r)\, dr$   [15]

We find $L \pm \sqrt{\Sigma^2(L)}$ levels in an interval of length L. In the uncorrelated Poisson case, one has $\Sigma^{2(P)}(L) = L$. This is just Poisson's error law. For the Gaussian ensembles $\Sigma^{2(\beta)}(L)$ behaves logarithmically for large L. The spectrum is said to be more rigid than in the Poisson case. As Figure 4 shows, the level number variance probes longer distances in the spectrum, in contrast to the nearest-neighbor spacing distribution.

Figure 4 Level number variance $\Sigma^2(L)$ for GOE (solid) and Poisson case (dashed).

Many more observables, also sensitive to higher order, $k > 2$ correlations, have been defined. In practice, however, one is often restricted to analyzing two-level correlations. An exception is, to some extent, the nearest-neighbor spacing distribution p(s). It is the two-level correlation function with the additional requirement that the two levels in question are adjacent, that is, that there are no levels between them. Thus, all correlation functions are needed if one wishes to calculate the exact nearest-neighbor spacing distribution $p^{(\beta)}(s)$ for the Gaussian ensembles. These considerations explain that we have $X_2^{(\beta)}(s) \simeq p^{(\beta)}(s)$ for small s. But while $X_2^{(\beta)}(s)$ saturates for large s, $p^{(\beta)}(s)$ quickly goes to zero in a Gaussian fashion. Thus, although the nearest-neighbor spacing distribution mathematically involves all correlations, it makes in practice only a meaningful statement about the two-level correlations. Luckily, $p^{(\beta)}(s)$ differs only very slightly from the heuristic Wigner surmise [2] (corresponding to $\beta = 1$), respectively from its extensions (corresponding to $\beta = 2$ and $\beta = 4$).

Ergodicity and Universality

We constructed the correlation functions as averages over an ensemble of random matrices. But this is not how we proceeded in the data analysis sketched in the Introduction. There, we started from one single spectrum with very many levels and obtained the statistical observable just by sampling and, if necessary, smoothing. Do these two averages, the ensemble average and the spectral average, yield the same? Indeed, one can show that the answer is affirmative, if the level number N goes to infinity. This is referred to as ergodicity in RMT.

Moreover, as already briefly indicated in the Introduction, very many systems from different areas of physics are well described by RMT. This seems to be at odds with the Gaussian assumption [5]. There is hardly any system whose Hamilton matrix elements follow a Gaussian probability density. The solution for this puzzle lies in the unfolding. Indeed, it has been shown that almost all functional forms of the probability density $P_N^{(\beta)}(H)$ yield the same unfolded correlation functions, if no new scale comparable to the mean level spacing is present in $P_N^{(\beta)}(H)$. This is the mathematical side of the empirically found universality.

Ergodicity and universality are of crucial importance for the applicability of RMT in data analysis.

Wave Functions

By modeling the Hamiltonian of a system with a random matrix H, we do not only make an assumption about the statistics of the energies, but also about those of the wave functions. Because of the eigenvalue equation $H u_n = x_n u_n$, $n = 1, \ldots, N$, the wave function belonging to the eigenenergy $x_n$ is modeled by the eigenvector $u_n$. The columns of the diagonalizing matrix $U = [u_1\, u_2 \cdots u_N]$ are these eigenvectors. The probability density of the components $u_{nm}$ of the eigenvector $u_n$ can be calculated rather easily. For large N it approaches a Gaussian. This is equivalent to the Porter–Thomas distribution. While wave functions are often not accessible in an experiment, one can measure transition amplitudes and widths, giving information about the matrix elements of a transition operator and a
projection of the wave functions onto a certain state in Hilbert space. If the latter are represented by a fixed matrix A or a fixed vector a, respectively, one can calculate the RMT prediction for the probability densities of the matrix elements $u_n^\dagger A u_m$ or the widths $a^\dagger u_n$ from the probability density of the eigenvectors.

Scattering Systems

It is important that RMT can be used as a powerful tool in scattering theory, because the major part of the experimental information about quantum systems comes from scattering experiments. Consider an example from compound nucleus scattering. In an accelerator, a proton is shot on a nucleus, with which it forms a compound nucleus. This then decays by emitting a neutron. More generally, the ingoing channel $\nu$ (the proton in our example) connects to the interaction region (the nucleus), which also connects to an outgoing channel $\mu$ (the neutron). There are $\Lambda$ channels with channel wave functions which are labeled $\nu = 1, \ldots, \Lambda$. The interaction region is described by an $N \times N$ Hamiltonian matrix H whose eigenvalues $x_n$ are bound-state energies labeled $n = 1, \ldots, N$. The dimension N is a cutoff which has to be taken to infinity at the end of a calculation. The $\Lambda \times \Lambda$ scattering matrix S contains the information about how the ingoing channels are transformed into the outgoing channels. The scattering matrix S is unitary. Under certain and often justified assumptions, a scattering matrix element can be cast into the form

$S_{\mu\nu} = \delta_{\mu\nu} - i 2\pi\, W_\mu^\dagger G^{-1} W_\nu$   [16]

The couplings $W_{n\nu}$ between the bound states n and the channels $\nu$ are collected in the $N \times \Lambda$ matrix W, $W_\nu$ is its $\nu$th column. The propagator $G^{-1}$ is the inverse of

$G = z 1_N - H + i\pi \sum_{\nu\ \mathrm{open}} W_\nu W_\nu^\dagger$   [17]

Here, z is the scattering energy and the summation is only over channels which are open, that is, accessible. Formula [16] has a clear intuitive interpretation. The scattering region is entered through channel $\nu$, the bound states of H become resonances in the scattering process according to eqn [17], the interaction region is left through channel $\mu$. This formulation applies in many areas of physics. All observables such as transmission coefficients, cross sections, and others can be calculated from the scattering matrix S.

We have not made any statistical assumptions yet. Often, one can understand generic features of a scattering system by assuming that the Hamiltonian H is a random matrix, taken from one of the three classical ensembles. This is one RMT approach used in scattering theory.

Another RMT approach is based on the scattering matrix itself, S is modeled by a $\Lambda \times \Lambda$ unitary random matrix. Taking into account additional symmetries, one arrives at the three circular ensembles, circular orthogonal (COE), unitary (CUE) and symplectic (CSE). They correspond to the three classical Gaussian ensembles and are also labeled with the Dyson index $\beta = 1, 2, 4$. The eigenphases of the random scattering matrix correspond to the eigenvalues of the random Hamiltonian matrix. The unfolded correlation functions of the circular ensembles are identical to those of the Gaussian ensembles.

Supersymmetry

Apart from the symmetries, random matrices contain nothing but random numbers. Thus, a certain type of redundancy is present in RMT. Remarkably, this redundancy can be removed, without losing any piece of information by using supersymmetry, that is, by a reformulation of the random matrix model involving commuting and anticommuting variables. For the sake of simplicity, we sketch the main ideas for the GUE, but they apply to the GOE and the GSE accordingly.

One defines the k-level correlation functions by using the resolvent of the Schrödinger equation,

$\hat{R}_k^{(2)}(x_1, \ldots, x_k) = \frac{1}{\pi^k} \int P_N^{(2)}(H) \prod_{p=1}^{k} \mathrm{tr}\, \frac{1}{x_p^{\pm} - H}\, d[H]$   [18]

The energies carry an imaginary increment $x_p^{\pm} = x_p \pm i\varepsilon$ and the limit $\varepsilon \to 0$ has to be taken at the end of the calculation. The k-level correlation functions $R_k^{(2)}(x_1, \ldots, x_k)$ as defined in eqn [8] can always be obtained from the functions [18] by constructing a linear combination of the $\hat{R}_k^{(2)}(x_1, \ldots, x_k)$ in which the signs of the imaginary increments are chosen such that only the imaginary parts of the traces contribute. Some trivial $\delta$-distributions have to be removed. The k-level correlation functions [18] can be written as the k-fold derivative

$\hat{R}_k^{(2)}(x_1, \ldots, x_k) = \frac{1}{(2\pi)^k} \left. \frac{\partial^k}{\prod_{p=1}^{k} \partial J_p} Z_k^{(2)}(x + J) \right|_{J=0}$   [19]
of the generating function

$Z_k^{(2)}(x + J) = \int P_N^{(2)}(H) \prod_{p=1}^{k} \frac{\det(x_p^{\pm} + J_p - H)}{\det(x_p^{\pm} - J_p - H)}\, d[H]$   [20]

which depends on the energies and k new source variables $J_p$, $p = 1, \ldots, k$, ordered in $2k \times 2k$ diagonal matrices

$x^{\pm} = \mathrm{diag}(x_1^{\pm}, x_1^{\pm}, \ldots, x_k^{\pm}, x_k^{\pm})$, $J = \mathrm{diag}(+J_1, -J_1, \ldots, +J_k, -J_k)$   [21]

We notice the normalization $Z_k^{(2)}(x) = 1$ at $J = 0$. The generating function [20] is an integral over an ordinary $N \times N$ matrix H. It can be exactly rewritten as an integral over a $2k \times 2k$ supermatrix $\sigma$ containing commuting and anticommuting variables,

$Z_k^{(2)}(x + J) = \int Q_k^{(2)}(\sigma)\, \mathrm{sdet}^{-N}(x^{\pm} + J - \sigma)\, d[\sigma]$   [22]

The integrals over the commuting variables are of the ordinary Riemann–Stieltjes type, while those over the anticommuting variables are Berezin integrals. The Gaussian probability density [5] is mapped onto its counterpart in superspace

$Q_k^{(2)}(\sigma) = c_k^{(2)} \exp\left( -\frac{1}{2v^2}\, \mathrm{str}\, \sigma^2 \right)$   [23]

where $c_k^{(2)}$ is a normalization constant. The supertrace str and the superdeterminant sdet generalize the corresponding invariants for ordinary matrices. The total number of integrations in eqn [22] is drastically reduced as compared to eqn [20]. Importantly, it is independent of the level number N which now only appears as the negative power of the superdeterminant in eqn [22], that is, as an explicit parameter. This most convenient feature makes it possible to take the limit of infinitely many levels by means of a saddle point approximation to the generating function.

Loosely speaking, the supersymmetric formulation can be viewed as an irreducible representation of RMT which yields a clearer insight into the mathematical structures. The same is true for applications in scattering theory and in models for crossover transitions to be discussed below. This explains why supersymmetry is so often used in RMT calculations.

It should be emphasized that the rôle of supersymmetry in RMT is quite different from the one in high-energy physics, where the commuting and anticommuting variables represent physical particles, bosons and fermions, respectively. This is not so in the RMT context. The commuting and anticommuting variables have no direct physics interpretation; they appear simply as helpful mathematical devices to cast the RMT model into an often much more convenient form.

Crossover Transitions

The RMT models discussed up to now describe four extreme situations, the absence of correlations in the Poisson case and the presence of correlations as in the three fully rotational invariant models GOE, GUE, and GSE. A real physics system, however, is often between these extreme situations. The corresponding RMT models can vary considerably, depending on the specific situation. Nevertheless, those models in which the random matrices for two extreme situations are simply added with some weight are useful in so many applications that they acquired a rather generic standing. One writes

$H(\alpha) = H^{(0)} + \alpha H^{(\beta)}$   [24]

where $H^{(0)}$ is a random matrix drawn from an ensemble with a completely arbitrary probability density $P_N^{(0)}(H^{(0)})$. The case of a fixed matrix is included, because one may choose a product of $\delta$-distributions for the probability density. The matrix $H^{(\beta)}$ is random and drawn from the classical Gaussian ensembles with probability density $P_N^{(\beta)}(H^{(\beta)})$ for $\beta = 1, 2, 4$. One requires that the group diagonalizing $H^{(0)}$ is a subgroup of the one diagonalizing $H^{(\beta)}$. The model [24] describes a crossover transition. The weight $\alpha$ is referred to as transition parameter. It is useful to choose the spectral support of $H^{(0)}$ and $H^{(\beta)}$ equal. One can then view $\alpha$ as the root-mean-square matrix element of $H^{(\beta)}$. At $\alpha = 0$, one has the arbitrary ensemble. The Gaussian ensembles are formally recovered in the limit $\alpha \to \infty$, to be taken in a proper way such that the energies remain finite.

We are always interested in the unfolded correlation functions. Thus, $\alpha$ has to be measured in units of the mean level spacing D such that $\lambda = \alpha/D$ is the physically relevant transition parameter. It means that, depending on the numerical value of D, even a small effect on the original energy scale can have sizeable impact on the spectral statistics. This is referred to as statistical enhancement. The nearest-neighbor spacing distribution is already very close to $p^{(\beta)}(s)$ for the Gaussian ensembles if $\lambda$ is larger than 0.5 or so. In the long-range observables such as the level number variance $\Sigma^2(L)$, the deviation from the Gaussian ensemble statistics becomes visible at interval lengths L comparable to $\lambda$.
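The crossover model [24] can be explored numerically in its smallest nontrivial instance, N = 2. The Python sketch below is an illustration under stated assumptions, not a substitute for the unfolded large-N analysis: $H^{(0)}$ is taken diagonal with two independent uniform entries (a Poisson-like case), $H^{(\beta)}$ is a $2 \times 2$ real symmetric Gaussian matrix ($\beta = 1$) with $v = 1$, and spacings are normalized by their sample mean instead of being properly unfolded. The function names are hypothetical.

```python
import math
import random


def crossover_spacing(alpha, rng, v=1.0):
    """Level spacing of H(alpha) = H0 + alpha * H1 for one random 2x2 sample.

    Assumptions (not from the text): H0 = diag(e1, e2) with e1, e2 uniform on
    [0, 1); H1 real symmetric with diagonal variance 2*v**2 and off-diagonal
    variance v**2.
    """
    e1, e2 = rng.random(), rng.random()
    a = rng.gauss(0.0, math.sqrt(2.0) * v)  # diagonal entries of H1
    c = rng.gauss(0.0, math.sqrt(2.0) * v)
    b = rng.gauss(0.0, v)                   # off-diagonal entry of H1
    diag_diff = (e1 - e2) + alpha * (a - c)
    # Eigenvalue difference of [[p, q], [q, r]] is sqrt((p - r)**2 + 4*q**2).
    return math.sqrt(diag_diff ** 2 + 4.0 * (alpha * b) ** 2)


def small_spacing_fraction(alpha, samples=20000, cut=0.1, seed=0):
    """Fraction of spacings below `cut` times the sample mean spacing."""
    rng = random.Random(seed)
    spacings = [crossover_spacing(alpha, rng) for _ in range(samples)]
    mean = sum(spacings) / len(spacings)
    return sum(1 for s in spacings if s < cut * mean) / len(spacings)
```

Comparing `small_spacing_fraction(0.0)` with `small_spacing_fraction(1.0)` shows the expected trend: at $\alpha = 0$ small spacings are common, as in the Poisson case, while a sizeable Gaussian admixture suppresses them through level repulsion.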
Crossover transitions can be interpreted as diffusion processes. With the fictitious time $t = \alpha^2/2$, the probability density $P_N(x, t)$ of the eigenvalues x of the total Hamilton matrix $H = H(t) = H(\alpha)$ satisfies the diffusion equation

$\Delta_x P_N(x, t) = \frac{4}{\beta} \frac{\partial}{\partial t} P_N(x, t)$   [25]

where the probability density for the arbitrary ensemble is the initial condition $P_N(x, 0) = P_N^{(0)}(x)$. The Laplacian

$\Delta_x = \sum_{n=1}^{N} \frac{\partial^2}{\partial x_n^2} + \sum_{n<m} \frac{\beta}{x_n - x_m} \left( \frac{\partial}{\partial x_n} - \frac{\partial}{\partial x_m} \right)$   [26]

lives in the curved space of the eigenvalues x. This diffusion process is Dyson's Brownian motion in slightly simplified form. It has a rather general meaning for harmonic analysis on symmetric spaces, connecting to the spherical functions of Gelfand and Harish-Chandra, Itzykson–Zuber integrals, and to Calogero–Sutherland models of interacting particles. All this generalizes to superspace. In the supersymmetric version of Dyson's Brownian motion the generating function of the correlation functions is propagated,

$\Delta_s Z_k(s, t) = \frac{4}{\beta} \frac{\partial}{\partial t} Z_k(s, t)$   [27]

where the initial condition $Z_k(s, 0) = Z_k^{(0)}(s)$ is the generating function of the correlation functions for the arbitrary ensemble. Here, s denotes the eigenvalues of some supermatrices, not to be confused with the spacing between adjacent levels. Since the Laplacian $\Delta_s$ lives in this curved eigenvalue space, this diffusion process establishes an intimate connection to harmonic analysis on superspaces. Advantageously, the diffusion [27] is the same on the original and on the unfolded energy scales.

Fields of Application

Many-Body Systems

Numerous studies apply RMT to nuclear physics which is also the field of its origin. If the total number of nucleons, that is, protons and neutrons, is not too small, nuclei show single-particle and collective motion. Roughly speaking, the former is decoherent out-of-phase motion of the nucleons confined in the nucleus, while the latter is coherent in-phase motion of all nucleons or of large groups of them such that any additional individual motion of the nucleons becomes largely irrelevant. It has been shown empirically that the single-particle excitations lead to GOE statistics, while collective excitations produce different statistics, often of the Poisson type. Mixed statistics as described by crossover transitions are then of particular interest to investigate the character of excitations. For example, one applies the model [24] with $H^{(0)}$ drawn from a Poisson ensemble and $H^{(\beta)}$ from a GOE. Another application of crossover transitions is breaking of time-reversal invariance in nuclei. Here, $H^{(0)}$ is from a GOE and $H^{(\beta)}$ from a GUE. Indeed, a fit of spectral data to this model yields an upper bound for the time-reversal invariance violating root-mean-square matrix element in nuclei. Yet another application is breaking of symmetries such as parity or isospin. In the case of two quantum numbers, positive and negative parity, say, one chooses $H^{(0)} = \mathrm{diag}(H^{(+)}, H^{(-)})$ block-diagonal with $H^{(+)}$ and $H^{(-)}$ drawn from two uncorrelated GOE and $H^{(\beta)}$ from a third uncorrelated GOE which breaks the block structure. Again, root-mean-square matrix elements for symmetry breaking have been derived from the data.

Nuclear excitation spectra are extracted from scattering experiments. An analysis as described above is only possible if the resonances are isolated. Often, this is not the case and the resonance widths are comparable to or even much larger than the mean level spacing, making it impossible to obtain the excitation energies directly from the cross sections. One then analyzes the latter and their fluctuations as measured and applies the concepts sketched above for scattering systems. This approach has also been successful for crossover transitions.

Due to the complexity of the nuclear many-body problem, one has to use effective or phenomenological interactions when calculating spectra. Hence, one often studies whether the statistical features found in the experimental data are also present in the calculated spectra which result from the various models for nuclei.

Other many-body systems, such as complex atoms and molecules, have also been studied with RMT concepts, but the main focus has always been on nuclei.

Quantum Chaos

Originally, RMT was intended for modeling systems with many degrees of freedom such as nuclei. Surprisingly, RMT proved useful for systems with few degrees of freedom as well. Most of these studies aim at establishing a link between RMT and classical chaos. Consider as an example the classical motion of a point-like particle in a rectangle billiard. Ideal reflection at the boundaries and absence of friction are assumed, implying that the particle is reflected infinitely many times. A second billiard is built by taking a rectangle and replacing one corner with a quarter circle as shown in Figure 5. The motion of the particle in this Sinai
billiard is very different from the one in the rectangle. The quarter circle acts like a convex mirror which spreads out the rays of light upon reflection. This effect accumulates, because the vast majority of the possible trajectories hit the quarter circle infinitely many times under different angles. This makes the motion in the Sinai billiard classically chaotic, while the one in the rectangle is classically regular. The rectangle is separable and integrable, while this feature is destroyed in the Sinai billiard. One now quantizes these billiard systems, calculates the spectra, and analyzes their statistics. Up to certain scales, the rectangle (for irrational squared ratio of the side lengths) shows Poisson behavior, the Sinai billiard yields GOE statistics.

Figure 5 The Sinai billiard.

A wealth of such empirical studies led to the Bohigas–Giannoni–Schmit conjecture. We state it here not in its original, but in a frequently used form: spectra of systems whose classical analogues are fully chaotic show correlation properties as modeled by the Gaussian ensembles. The Berry–Tabor conjecture is complementary: spectra of systems whose classical analogs are fully regular show correlation properties which are often those of the Poisson type. As far as concrete physics applications are concerned, these conjectures are well-posed. From a strict mathematical viewpoint, they have to be supplemented with certain conditions to exclude exceptions such as Artin's billiard. Due to the definition of this system on the hyperbolic plane, its quantum version shows Poisson-like statistics, although the classical dynamics is chaotic. Up to now, no general and mathematically rigorous proofs could be given. However, semiclassical reasoning involving periodic orbit theory and, in particular, the Gutzwiller trace formula, yields at least a heuristic understanding.

Quantum chaos has been studied in numerous systems. An especially prominent example is the Hydrogen atom put in a strong magnetic field, which breaks the integrability and drives the correlations towards the GOE limit.

Disordered and Mesoscopic Systems

An electron moving in a probe, a piece of wire, say, is scattered many times at impurities in the material. This renders the motion diffusive. In a statistical model, one writes the Hamilton operator as a sum of the kinetic part, that is, the Laplacian, and a white-noise disorder potential $V(\mathbf{r})$ with second moment

$\langle V(\mathbf{r}) V(\mathbf{r}') \rangle = c_V\, \delta^{(d)}(\mathbf{r} - \mathbf{r}')$   [28]

Here, $\mathbf{r}$ is the position vector in d dimensions. The constant $c_V$ determines the mean free time between two scattering processes in relation to the density of states. It is assumed that phase coherence is present such that quantum effects are still significant. This defines the mesoscopic regime. The average over the disorder potential can be done with supersymmetry. In fact, this is the context in which supersymmetric techniques in statistical physics were developed, before they were applied to RMT models. In the case of weak disorder, the resulting field theory in superspace for two-level correlations acquires the form

$\int d\mu(Q)\, f(Q) \exp(-S(Q))$   [29]

where f(Q) projects out the observable under consideration and where S(Q) is the effective Lagrangian

$S(Q) = -\int \mathrm{str}\left( D (\nabla Q(\mathbf{r}))^2 + i 2 r M Q(\mathbf{r}) \right) d^d r$   [30]

This is the supersymmetric nonlinear $\sigma$ model. It is used to study level correlations, but also to obtain information about the conductance and conductance fluctuations when the probe is coupled to external leads. The supermatrix field $Q(\mathbf{r})$ is the remainder of the disorder average, its matrix dimension is four or eight, depending on the symmetry class. This field is a Goldstone mode. It does not directly represent a particle as often the case in high-energy physics. The matrix $Q(\mathbf{r})$ lives in a coset space of certain supergroups. A tensor M appears in the calculation, and r is the energy difference on the unfolded scale, not to be confused with the position vector $\mathbf{r}$.

The first term in the effective Lagrangian involving a gradient squared is the kinetic term, it stems from the Laplacian in the Hamiltonian. The constant D is the classical diffusion constant for the motion of the electron through the probe. The second term is the ergodic term. In the limit of zero dimensions, $d \to 0$, the kinetic term vanishes and the remaining ergodic term yields precisely the unfolded two-level correlations of the Gaussian ensembles. Thus, RMT can be viewed as the zero-dimensional limit of field theory for disordered systems. For $d > 0$, there is a competition between the two terms. The diffusion constant D and the system size determine an energy scale, the Thouless
energy $E_c$, within which the spectral statistics is of the Gaussian ensemble type and beyond which it approaches the Poisson limit. In Figure 6, this is schematically shown for the level number variance $\Sigma^2(L)$, which bends from Gaussian ensemble to Poisson behavior when $L > E_c$. This relates to the crossover transitions in RMT. Gaussian ensemble statistics means that the electron states extend over the probe, while Poisson statistics implies their spatial localization. Hence, the Thouless energy is directly the dimensionless conductance.

Figure 6 Level number variance $\Sigma^2(L)$. In this example, the Thouless energy is $E_c \approx 10$ on the unfolded scale. The Gaussian ensemble behavior is dashed.

A large number of issues in disordered and mesoscopic systems have been studied with the supersymmetric nonlinear $\sigma$ model. Most results have been derived for quasi-one-dimensional systems. Through a proper discretization, a link is established to models involving chains of random matrices. As the conductance can be formulated in terms of the scattering matrix, the experience with RMT for scattering systems can be applied and indeed leads to numerous new results.

Quantum Chromodynamics

Quarks interact by exchanging gluons. In quantum chromodynamics, the gluons are described by gauge fields. Relativistic quantum mechanics has to be used. Analytical calculations are only possible after some drastic assumptions and one must resort to lattice gauge theory, that is, to demanding numerics, to study the full problem.

The massless Dirac operator has chiral symmetry, implying that all nonzero eigenvalues come in pairs $(-\lambda_n, +\lambda_n)$ symmetrically around zero. In chiral RMT, the Dirac operator is replaced with block off-diagonal matrices

$W = \begin{pmatrix} 0 & W_b \\ W_b^\dagger & 0 \end{pmatrix}$   [31]

where $W_b$ is a random matrix without further symmetries. By construction, W has chiral symmetry. The assumption underlying chiral RMT is that the gauge fields effectively randomize the motion of the quark. Indeed, this simple schematic model correctly reproduces low-energy sum rules and spectral statistics of lattice gauge calculations. Near the center of the spectrum, there is a direct connection to the partition function of quantum chromodynamics. Furthermore, a similarity to disordered systems exists and an analog of the Thouless energy could be found.

Other Fields

Of the wealth of further investigations, we can mention but a few. RMT is in general useful for wave phenomena of all kinds, including classical ones. This has been shown for elastomechanical and electromagnetic resonances.

An important field of application is quantum gravity and matrix model aspects of string theory. We decided not to go into this, because the reason for the emergence of RMT concepts there is very different from everything else discussed above.

RMT is also successful beyond physics. Not surprisingly, it always received interest in mathematical statistics, but, as already said, it also relates to harmonic analysis. A connection to number theory exists as well. The high-lying zeros of the Riemann $\zeta$ function follow the GUE predictions over certain interval lengths. Unfortunately, a deeper understanding is still lacking.

As the interest in statistical concepts grows, RMT keeps finding new applications. Recently, one even started using RMT for risk management in finance.

See also: Arithmetic Quantum Chaos; Chaos and Attractors; Determinantal Random Fields; Free Probability Theory; Growth Processes in Random Matrix Theory; Hyperbolic Billiards; Integrable Systems in Random Matrix Theory; Integrable Systems: Overview; Number Theory in Physics; Ordinary Special Functions; Quantum Chromodynamics; Quantum Mechanical Scattering Theory; Random Partitions; Random Walks in Random Environments; Semi-Classical Spectra and Closed Orbits; Supermanifolds; Supersymmetry Methods in Random Matrix Theory; Symmetry Classes in Random Matrix Theory.

Further Reading

Beenakker CWJ (1997) Random-matrix theory of quantum transport. Reviews of Modern Physics 69: 731–808.
Bohigas O (1984) Chaotic motion and random matrix theory. In: Dehesa JS, Gomez JMG, and Polls A (eds.) Mathematical and Computational Methods in Physics, Lecture Notes in Physics, vol. 209. Berlin: Springer.
Random Partitions 347

Brody TA, Flores J, French JB, Mello PA, Pandey A, and Wong SSM (1981) Random-matrix physics: spectrum and strength fluctuations. Reviews of Modern Physics 53: 385–479.
Efetov K (1997) Supersymmetry in Disorder and Chaos. Cambridge: Cambridge University Press.
Forrester PJ, Snaith NC, and Verbaarschot JJM (eds.) (2003) Special issue: random matrix theory. Journal of Physics A 36.
Guhr T, Müller-Groeling A, and Weidenmüller HA (1998) Random-matrix theories in quantum physics: common concepts. Physics Reports 299: 189–428.
Haake F (2001) Quantum Signatures of Chaos, 2nd edn. Berlin: Springer.
Mehta ML (2004) Random Matrices, 3rd edn. New York: Academic Press.
Stöckmann HJ (1999) Quantum Chaos: An Introduction. Cambridge: Cambridge University Press.
Verbaarschot JJM and Wettig T (2000) Random matrix theory and chiral symmetry in QCD. Annual Review of Nuclear and Particle Science 50: 343–410.

Random Partitions
A Okounkov, Princeton University, Princeton, NJ, USA
© 2006 A Okounkov. Published by Elsevier Ltd. All rights reserved.

Partitions

A partition of n is a monotone sequence of non-negative integers,
$$\lambda = (\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge 0)$$
with sum n. The number n is also denoted by $|\lambda|$ and is called the size of $\lambda$. The number of nonzero terms in $\lambda$ is called the length of $\lambda$ and often denoted by $\ell(\lambda)$. It is convenient to make the sequence $\lambda$ infinite by adding a string of zeros at the end.
A geometric object associated with a partition is its diagram. The diagram of $\lambda = (4, 2, 2, 1)$ is shown in Figure 1. A larger diagram, flipped and rotated by 135°, can be seen in Figure 2. Flipping the diagram introduces an involution on the set of partitions of n known as transposition. The transposed partition is denoted by $\lambda'$.
Partitions serve as natural combinatorial labels for many basic objects in mathematics and physics. For example, partitions of n index both conjugacy classes and irreducible representations of the symmetric group S(n). Partitions $\lambda$ with $\ell(\lambda) \le n$ index irreducible polynomial representations of the general linear group GL(n). More generally, the highest weight of a rational representation of GL(n) can be naturally viewed as two partitions of total length $\le n$.
For an even more basic example, partitions $\lambda$ with $\lambda_1 \le m$ and $\ell(\lambda) \le n$ are the same as upright lattice paths making n steps up and m steps to the right (just follow the boundary of $\lambda$). In particular, there are $\binom{n+m}{n}$ of such. By a variation on this theme, partitions label the standard basis of fermionic Fock space (Miwa et al. 2000). They also label a standard basis of the bosonic Fock space.
In most instances, partitions naturally occur together with some weight function. For example, the dimension, $\dim\lambda$, of an irreducible representation of S(n), or some power of it, is what always appears in harmonic analysis on S(n). By a theorem of Burnside,
$$M_{\mathrm{Planch}}(\lambda) = \frac{(\dim\lambda)^2}{n!} \qquad [1]$$
is a probability measure on the set of partitions of n; it is known as the Plancherel measure. Besides harmonic analysis, there are many other contexts in which it appears. For example, by a theorem of Schensted (see Sagan (2001) and Stanley (1999)), the distribution of the first part $\lambda_1$ of a Plancherel random partition $\lambda$ is the same as the distribution of the longest increasing subsequence in a uniformly random permutation of {1, 2, . . . , n}.
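The theorems of Burnside and Schensted quoted above can be checked by brute force for small n. The following Python sketch is illustrative only: it assumes the standard Robinson–Schensted row insertion (not spelled out in the text), and the function names are hypothetical. Under that correspondence the number of permutations with insertion shape $\lambda$ is $(\dim\lambda)^2$, and the first row of the shape has the length of the longest increasing subsequence.

```python
from bisect import bisect_right
from collections import Counter
from itertools import permutations
from math import factorial


def rs_shape(perm):
    """Shape of the Robinson-Schensted insertion tableau of a permutation."""
    rows = []
    for x in perm:
        for row in rows:
            k = bisect_right(row, x)      # index of the first entry larger than x
            if k == len(row):             # x is larger than everything in this row
                row.append(x)
                break
            row[k], x = x, row[k]         # bump that entry down to the next row
        else:
            rows.append([x])              # bumped out of every row: open a new row
    return tuple(len(row) for row in rows)


def lis_length(perm):
    """Longest increasing subsequence, by direct dynamic programming."""
    best = []
    for i, x in enumerate(perm):
        best.append(1 + max((best[j] for j in range(i) if perm[j] < x), default=0))
    return max(best, default=0)


def shape_counts(n):
    """Number of permutations of {1..n} per insertion shape, i.e. (dim lambda)^2."""
    return Counter(rs_shape(p) for p in permutations(range(1, n + 1)))


if __name__ == "__main__":
    n = 4
    for shape, count in sorted(shape_counts(n).items(), reverse=True):
        # shape, (dim lambda)^2, Plancherel weight (dim lambda)^2 / n!
        print(shape, count, count / factorial(n))
```

Dividing each count by n! gives the Plancherel weight [1]; the enumeration is exponential in n, so this is only meant for very small cases.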

Figure 1 Diagram of a partition.
Figure 2 A Plancherel-random partition of 1000 and the limit shape.

Partitions of n being just a finite set, one is often interested in letting $n \to \infty$. Even if the original problem was not of a probabilistic origin, one can still often benefit from adopting a probabilistic viewpoint because of the intuition and techniques that it brings. This is best illustrated by concrete examples, which is what we now turn to. These examples are not meant to be a panorama of random partitions. This is an old and still rapidly growing field and a simple list of all major contributions will take more space than is allowed. The books Kerov (2003), Pitman (n.d.), Sagan (2001), and Stanley (1999) offer much more information on the topics discussed below.

Plancherel Measure

Dimension of a Diagram

There are several formulas and interpretations for the number $\dim\lambda$ in [1]; see Sagan (2001) and Stanley (1999). The one that often appears in the context of growth processes is the following: $\dim\lambda$ is the number of ways to grow the diagram $\lambda$ from the empty diagram $\emptyset$ by adding a square at a time. That is, $\dim\lambda$ is the number of chains of the form
$$\emptyset = \lambda^{(0)} \subset \lambda^{(1)} \subset \cdots \subset \lambda^{(n-1)} \subset \lambda^{(n)} = \lambda$$
where $|\lambda^{(k)}| = k$ and $\mu \subset \lambda$ means inclusion of diagrams.
From the classical formula
$$\dim\lambda = \frac{|\lambda|!}{\prod_{i \le k} (\lambda_i + k - i)!} \prod_{i<j\le k} (\lambda_i - \lambda_j + j - i) \qquad [2]$$
where k is any number such that $\lambda_{k+1} = 0$, one sees that the Plancherel measure is a discrete analog of the eigenvalue density
$$e^{-(1/2)\sum_i x_i^2} \prod_{i<j} (x_i - x_j)^2$$
of a GUE random matrix (Mehta 1991). Indeed, the first factor in [2], which looks like a multinomial coefficient, is the analog of the Gaussian weight. Kerov (2003) and Johansson were among the first to recognize the analogy between Plancherel measure and GUE. One comes across many partition sums that are discrete analogs of random matrix integrals.
The most compact formula for $\dim\lambda$ is the hook formula
$$\frac{\dim\lambda}{|\lambda|!} = \prod_{\square\in\lambda} h(\square)^{-1} \qquad [3]$$
Here the product is over all squares $\square$ in the diagram of $\lambda$ and
$$h(\square) = 1 + a(\square) + l(\square)$$
where $a(\square)$ and $l(\square)$ are the numbers of squares to the right of the square $\square$ and below it, respectively. (These are known as arm-length and leg-length.)

Limit Shape and Edge Scaling

When the diagram of $\lambda$ is very large, the logarithm of the hook product approximates a double integral. The analysis of the corresponding integral plays the central role (see Kerov (2003), chapter 3) in the proof of the following law of large numbers for the Plancherel measure.
Take the diagram of $\lambda$, flip and rotate it as in Figure 2 and rescale by a factor of $\sqrt{n}$ so that it has unit area. In this way one obtains a measure on continuous and, in fact, Lipschitz functions. By a result of Logan and Shepp and, independently, Vershik and Kerov these measures converge as $n \to \infty$ to the $\delta$-measure on a single function $\Omega(x)$. This limit shape for the Plancherel measure is also plotted in Figure 2. Explicitly,
$$\Omega(x) = \begin{cases} \dfrac{2}{\pi}\left(x \arcsin(x/2) + \sqrt{4 - x^2}\right), & |x| \le 2 \\ |x|, & |x| > 2 \end{cases}$$
This is an analog of Wigner's semicircle law (Mehta 1991) for spectra of random matrices. The Gaussian correction to the limit shape was also found by Kerov (2003).
The limit shape result can be refined to show that $\lambda_1/\sqrt{n} \to 2$ in probability. Together with Schensted's theorem, this answers the question posed by Ulam about the longest increasing subsequence in a random permutation. Further progress came in the work of Baik, Deift, and Johansson (see Deift (2000)), who conjectured (and proved for i = 1 and 2) that as $n \to \infty$ the joint distribution of
$$\frac{\lambda_i - 2\sqrt{n}}{n^{1/6}}, \quad i = 1, 2, \ldots$$
becomes exactly the same as the distribution of largest eigenvalues of a GUE random matrix. In particular, the longest increasing subsequence, suitably scaled, is distributed exactly like the largest eigenvalue. The distribution of the latter is known as the Tracy–Widom distribution; it is given in terms of a particular solution of the Painlevé II equation. For more information about the proof of the full conjecture, see Aldous and Diaconis (1999), Deift (2000), and Okounkov (2002).
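The hook formula [3] and the limit shape are easy to evaluate directly. The Python sketch below is a minimal illustration; the helper names are hypothetical and not taken from the text. It computes $\dim\lambda$ from hook lengths, the Plancherel weight [1] as an exact fraction, and the function $\Omega(x)$.

```python
from fractions import Fraction
from math import asin, factorial, pi, sqrt


def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(lam):
    """Hook lengths h = 1 + arm + leg of all squares in the diagram of lam."""
    # conj[j] = number of rows with more than j squares (the transposed partition)
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    return [(part - j - 1) + (conj[j] - i - 1) + 1
            for i, part in enumerate(lam) for j in range(part)]


def dim(lam):
    """dim(lam) = |lam|! divided by the product of hook lengths (formula [3])."""
    product = 1
    for h in hook_lengths(lam):
        product *= h
    return factorial(sum(lam)) // product


def plancherel(lam):
    """Plancherel weight (dim lam)^2 / n! as an exact fraction (formula [1])."""
    return Fraction(dim(lam) ** 2, factorial(sum(lam)))


def omega(x):
    """Limit shape Omega(x) of the rescaled Plancherel diagram."""
    if abs(x) > 2:
        return abs(x)
    return (2 / pi) * (x * asin(x / 2) + sqrt(4 - x * x))


if __name__ == "__main__":
    print(dim((4, 2, 2, 1)))                              # the diagram of Figure 1
    print(sum(plancherel(lam) for lam in partitions(8)))  # total mass of the measure
    print(omega(0.0), omega(2.0))
```

Exact fractions are used so that the total mass of the Plancherel measure can be checked without rounding error.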

Correlation Functions

One way to prove the full BDJ conjecture is to use the following exact formula first obtained in a more general setting by Borodin and Olshanski (see Olshanski (2003), and Okounkov (2002) for further generalizations). Look at the downsteps of the zig-zag curve in Figure 2. The x-coordinates of their midpoints are the numbers
$$S(\lambda) = \left\{\lambda_i - i + \tfrac12\right\} \subset \mathbb{Z} + \tfrac12 \qquad [4]$$
The map $\lambda \mapsto S(\lambda)$ makes a random partition a random subset of $\mathbb{Z} + \tfrac12$, that is, a random point field on a lattice. These random points should be treated like eigenvalues of a random matrix. In particular, it is natural to consider their correlations, that is, the probability that $X \subset S(\lambda)$ for some fixed $X \subset \mathbb{Z} + \tfrac12$.
Many formulas work better if we replace the Plancherel measures $M_{\mathrm{Planch},n}$ on partitions of a fixed number n by their Poisson average,
$$M_\xi = e^{-\xi} \sum_{n \ge 0} \frac{\xi^n}{n!} M_{\mathrm{Planch},n}$$
Here $\xi > 0$ is a parameter. It equals the expected size of $\lambda$. For any finite set X, we have
$$\mathrm{Prob}_\xi\,(X \subset S(\lambda)) = \det\left[K_{\mathrm{Bessel}}(x_i, x_j; \xi)\right]_{x_i, x_j \in X} \qquad [5]$$
where $K_{\mathrm{Bessel}}$ is the discrete Bessel kernel given by
$$K_{\mathrm{Bessel}}(x, y; \xi) = \sqrt{\xi}\; \frac{J_{x-1/2}(2\sqrt{\xi})\, J_{y+1/2}(2\sqrt{\xi}) - J_{x+1/2}(2\sqrt{\xi})\, J_{y-1/2}(2\sqrt{\xi})}{x - y}$$
Note that only Bessel functions of integral order enter this formula.
For large argument $\xi$, $J_n(2\sqrt{\xi})$ has sine asymptotics if $n \ll 2\sqrt{\xi}$ and Airy function asymptotics if $n \approx 2\sqrt{\xi}$. Consequently, one gets the random matrix behavior near the edge of the limit shape and discrete sine kernel asymptotics of correlations in the bulk of the limit shape.

Permutation Enumeration

A basic combinatorial problem is to count permutations $\sigma_1, \ldots, \sigma_p \in S(n)$ of given cycle types $\eta^{(1)}, \ldots, \eta^{(p)}$ such that
$$\sigma_1 \cdots \sigma_p = 1 \qquad [6]$$
A geometric interpretation of this problem is to count covers of the sphere $S^2 = \mathbb{CP}^1$ branched over p given points with monodromy $\eta^{(1)}, \ldots, \eta^{(p)}$. Elementary character theory of S(n) gives (Jones 1998)
$$\#\left\{\sigma_i \in C_{\eta^{(i)}},\ \prod \sigma_i = 1\right\} = \left\langle \prod f_{\eta^{(i)}} \right\rangle_{\mathrm{Planch}} \qquad [7]$$
where $C_\eta$ is the conjugacy class with cycle type $\eta$ and
$$f_\eta(\lambda) = |C_\eta|\, \frac{\chi^\lambda_\eta}{\dim\lambda}$$
is the central character of the irreducible representation $\lambda$. Here $\chi^\lambda_\eta$ is the character of any $\sigma \in C_\eta$ in the representation $\lambda$.
Let $\eta$ be of the form $(\nu, 1, 1, \ldots)$ with $\nu$ fixed. By a result of Kerov and Olshanski, $f_\eta(\lambda)$ is a polynomial in $\lambda$ of degree $|\nu|$. See [11] for the simplest example $\nu = (2)$, that is, for the central character of a transposition. We thus recognize in [7] a discrete analog of the GUE expectation of a polynomial in traces of a random matrix. This analogy becomes even clearer in the Gromov–Witten theory of $\mathbb{CP}^1$, which can be viewed as taking into account contributions of certain degenerate covers, see Okounkov (2002).
There is a generalization, due to Burnside, of [7] to counting branched covers of surfaces of any genus g; see Jones (1998). The only modification required is that a representation $\lambda$ is now counted with the weight $(\dim\lambda)^{2-2g}$. For example, covers of the torus correspond to a uniform measure on partitions. In particular, the probability that two random permutations from S(n) commute is p(n)/n!, where p(n) is the number of partitions of n.

Generalizations of Plancherel Measure

Schur Functions and Cauchy Identity

Schur functions $s_\lambda(x_1, \ldots, x_n)$, where $\lambda$ is a partition with at most n parts, form a distinguished linear basis of the algebra of symmetric polynomials in $x_1, \ldots, x_n$. Various definitions and many remarkable properties of these functions are discussed in, for example, Sagan (2001) and Stanley (1999). One of them is that $s_\lambda(x)$ is the trace of a matrix with eigenvalues $\{x_i\}$ in an irreducible GL(n) module with highest weight $\lambda$. The following stability of $s_\lambda$,
$$s_\lambda(x_1, \ldots, x_n, 0) = s_\lambda(x_1, \ldots, x_n), \quad \ell(\lambda) \le n$$
allows one to define Schur functions in infinitely many variables. The formulas
$$p_\mu = \sum_\lambda \chi^\lambda_\mu\, s_\lambda, \qquad s_\lambda = \sum_\mu \chi^\lambda_\mu\, \frac{p_\mu}{z(\mu)}$$
where
$$z(\mu) = \frac{|\mu|!}{|C_\mu|} = |\mathrm{Aut}(\mu)| \prod_i \mu_i$$

establish the transition between the basis of Schur functions and the basis of power sum functions
$$p_\mu = \prod_i p_{\mu_i}, \qquad p_k = \sum_i x_i^k$$
In particular, the dimension function $\dim\lambda$ is the following specialization of the Schur function:
$$\frac{\dim\lambda}{|\lambda|!} = s_\lambda \Big|_{p_1 = 1,\ p_2 = p_3 = \cdots = 0}$$
We will discuss other important specializations of Schur functions later.
A typical situation in which a random matrix integral can be reduced to a sum over partitions is when one uses the Cauchy identity
$$\prod_{i,j} \frac{1}{1 - x_i y_j} = \exp\left(\sum_k \frac{p_k(x)\, p_k(y)}{k}\right) = \sum_\lambda s_\lambda(x)\, s_\lambda(y) \qquad [8]$$
to expand the integrand in Schur functions and integrate term by term using, for example, the orthogonality of characters or the identity
$$\int_{U(n)} s_\lambda(A g B g^{-1})\, dg = \frac{1}{\dim_n \lambda}\, s_\lambda(A)\, s_\lambda(B) \qquad [9]$$
Here $s_\lambda(A)$ denotes the Schur function in eigenvalues of a matrix A, dg is the normalized Haar measure on the unitary group U(n), and
$$\dim_n \lambda = s_\lambda(\underbrace{1, \ldots, 1}_{n\ \text{times}})$$
is the dimension of the irreducible GL(n) module $V^\lambda$ with highest weight $\lambda$. The meaning of [9] is that normalized characters are algebra homomorphisms from the center of the group algebra of U(n) to numbers. This method of converting a random matrix problem to a random partition problem is known as character expansion (see, e.g., Kazakov (2001)).
Inspired by the Cauchy identity, one can generalize Plancherel measure to
$$M_{\mathrm{Schur}}(\lambda) = \prod_{i,j} (1 - x_i y_j)\; s_\lambda(x)\, s_\lambda(y)$$
where x and y, or, equivalently, $p_k(x)$ and $p_k(y)$, are viewed as parameters. This is known as the Schur measure. If $p_1(x) = p_1(y) = \sqrt{\xi}$ and all other $p_k$'s vanish, we get $M_{\mathrm{Schur}} = M_\xi$. Many properties of the Plancherel measure can be generalized to Schur measure, in particular, exact formulas for correlation functions, description of the limit shape, etc. (Okounkov 2002).

Dimension Functions

We already met the function $\dim_n \lambda$. There is a useful formula
$$\dim_n \lambda = \prod_{\square\in\lambda} \frac{n + c(\square)}{h(\square)} \qquad [10]$$
where $c((i, j)) = j - i$ is the content of the square $\square$ in the ith row and jth column. From [10] it is clear that $\dim_n \lambda$ makes sense for arbitrary complex values of n. The corresponding specializations of the Schur measure
$$x = (\underbrace{\sqrt{\xi}, \ldots, \sqrt{\xi}}_{z\ \text{times}}), \qquad y = (\underbrace{\sqrt{\xi}, \ldots, \sqrt{\xi}}_{z'\ \text{times}})$$
where $\xi, z, z'$ are parameters, are related to the so-called Z-measures and their theory is much-developed (Olshanski 2003). As $z, z', \xi^{-1} \to \infty$ in such a way that $z z' \xi \to \xi_0$, we get $M_{\xi_0}$ in the limit.
The enumerative problems discussed in the section "Permutation enumeration" have analogs for the unitary groups U(n) and, suitably interpreted, the answers are the same with the dimension $\dim_n \lambda$ replacing $\dim\lambda$. For example, instead of counting the solutions to [6], one may be interested in the volume of the set of p-tuples of unitary matrices with given eigenvalues that multiply to 1. Geometrically, such data arise as the monodromy of a flat unitary connection over $S^2 \setminus \{p\ \text{points}\}$, which is a U(n) analog of a branched cover. The analog of Burnside's formula is Witten's formula for the volumes of moduli spaces of flat connections on a genus g surface with given holonomy around p punctures (see, e.g., Witten (1991) and Woodward (2004)). It involves summing normalized characters over all representations $V^\lambda$, not necessarily polynomial, with the weight $(\dim V^\lambda)^{2-2g}$. If additionally weighted by a Gaussian of the form $\exp(-A(f_2(\lambda) + (n/2)|\lambda|))$, where
$$f_2(\lambda) = \frac12 \sum_i \left[\left(\lambda_i - i + \tfrac12\right)^2 - \left(-i + \tfrac12\right)^2\right] = \sum_{\square\in\lambda} c(\square) \qquad [11]$$
this becomes Migdal's formula for the partition function of the 2D Yang–Mills theory, the positive constant A being the area of the surface (see, e.g., Witten (1991) and Woodward (2004)).
A further generalization naturally arising in the theory of quantum groups is the quantum dimension
$$\dim_{n,q} \lambda = s_\lambda(q^{1-n}, q^{3-n}, \ldots, q^{n-3}, q^{n-1}) = \prod_{\square\in\lambda} \frac{q^{n + c(\square)} - q^{-n - c(\square)}}{q^{h(\square)} - q^{-h(\square)}}$$

where q is a parameter (it is more common to use $\dim_{n, q^{1/2}}$ instead). Obviously, $\dim_{n,q} \to \dim_n$ as $q \to 1$. The function $\dim_{n,q}$ is an important building block of, for example, quantum invariants of knots and 3-folds, and various related objects (see, e.g., Bakalov and Kirillov (2001)). The Verlinde formula (Bakalov and Kirillov 2001) can be viewed as an analog of Burnside's formula with weight $\dim_{n,q}$. When q is a root of unity the summation over $\lambda$ is naturally truncated to a finite sum.
The next level of generalization is obtained by deforming Schur functions to Jack and, more generally, Macdonald symmetric functions (Macdonald 1995). In particular, the Jack polynomial analog of the Plancherel measure is
$$M_{\mathrm{Jack}}(\lambda) = \prod_{\square\in\lambda} \frac{1}{\big((a(\square) + 1)t_1 + l(\square)t_2\big)\big(a(\square)t_1 + (l(\square) + 1)t_2\big)} \cdot n!\,(t_1 t_2)^n$$
where $t_1, t_2$ are parameters, and $a(\square)$ and $l(\square)$ denote, as above, the arm- and leg-length of a square $\square$. This measure depends only on the ratio $t_2/t_1$, which is the usual parameter of Jack polynomials. To continue the analogy with random matrices, this should be viewed as a general $\beta$ analog of the Plancherel measure.
The measure $M_{\mathrm{Jack}}$ naturally arises in Atiyah–Bott localization computations on the Hilbert scheme of n points in $\mathbb{C}^2$. By definition, this Hilbert scheme parametrizes ideals $I \subset \mathbb{C}[x, y]$ of codimension n as linear spaces. The torus $(\mathbb{C}^*)^2$ acts on it by rescaling x and y and the fixed points of this action are
$$I_\lambda = \text{Span of } \left\{x^{j-1} y^{i-1}\right\}_{(i,j) \notin \lambda}$$
where $\lambda$ is a partition of n. The weight of this fixed point in the Atiyah–Bott formula is proportional to $M_{\mathrm{Jack}}(\lambda)$, the parameters $t_1$ and $t_2$ being the standard torus weights. Corresponding formulas in K-theory involve a Macdonald polynomial analog of $\dim\lambda$.
Nekrasov defines the partition functions of N = 2 supersymmetric gauge theories by formally applying the Atiyah–Bott localization formula to (noncompact) instanton moduli spaces. The resulting expression is a sum over partitions with a weight which is a generalization of $M_{\mathrm{Jack}}$. In this way, random partitions enter gauge theory. What is more, statistical properties of these random partitions are reflected in the dynamics of gauge theories. For example, the limit shape turns out to be precisely the Seiberg–Witten curve (see Nekrasov and Okounkov (2003), Okounkov (2002), and also Nakajima and Yoshioka (2003)).

Harmonic Functions on Young Graph

Definitions

Partitions form a natural directed graph $\mathbb{Y}$, known as Young graph, in which there is an edge from $\mu$ to $\lambda$ if $\lambda$ is obtained from $\mu$ by adding a square. We will denote this by $\mu \nearrow \lambda$. Let $\kappa$ be a non-negative function (called multiplicity) on edges of $\mathbb{Y}$. A function $\varphi$ on the vertices of $\mathbb{Y}$ is harmonic if it satisfies
$$\varphi(\mu) = \sum_{\lambda:\ \mu \nearrow \lambda} \kappa(\mu, \lambda)\, \varphi(\lambda) \qquad [12]$$
for any $\mu$. For given edge multiplicities $\kappa$, non-negative harmonic functions normalized by $\varphi(\emptyset) = 1$ form a convex compact (with respect to pointwise convergence) set, which we will denote by $H(\kappa)$. The extreme points of $H(\kappa)$ are the indecomposable or ergodic harmonic functions. They are the most important ones. One defines
$$\dim_\kappa \lambda/\mu = \sum_{\mu = \nu_0 \nearrow \nu_1 \nearrow \cdots \nearrow \nu_{|\lambda| - |\mu|} = \lambda}\ \prod_i \kappa(\nu_i, \nu_{i+1})$$
and $\dim_\kappa \lambda = \dim_\kappa \lambda/\emptyset$. For example, if $\kappa \equiv 1$ then $\dim_\kappa \lambda = \dim\lambda$. Any function $\varphi \in H(\kappa)$ defines a probability measure on partitions of fixed size n, n = 0, 1, 2, . . . , by
$$M_{\varphi,n}(\lambda) = \varphi(\lambda)\, \dim_\kappa \lambda, \quad |\lambda| = n \qquad [13]$$
The mean value property [12] implies a certain coherence of these measures for different values of n, which, in general, does not hold for measures like $M_{\mathrm{Schur}}$. Two multiplicity functions $\kappa$ and $\kappa'$ are gauge equivalent if
$$\kappa'(\mu, \lambda) = f(\mu)\, \kappa(\mu, \lambda)\, f(\lambda)^{-1}$$
for some function f. In this case, $H(\kappa)$ and $H(\kappa')$ are naturally isomorphic and the measures $M_\varphi$ are the same.

First Example: Thoma Theorem

Let F be a central function on the infinite symmetric group $S(\infty) = \bigcup_n S(n)$, normalized by F(1) = 1. Restricted to S(n), F is a linear combination of irreducible characters
$$F|_{S(n)} = \sum_{|\lambda| = n} \varphi(\lambda)\, \chi^\lambda$$
The branching rule $\chi^\lambda|_{S(n-1)} = \sum_{\mu \nearrow \lambda} \chi^\mu$ implies that the Fourier coefficients $\varphi$ are harmonic with respect to $\kappa \equiv 1$. They are non-negative if and only if F is a positive-definite function on $S(\infty)$, which means that the matrix $(F(g_i g_j^{-1}))$ is non-negative definite for any $\{g_i\} \subset S(\infty)$. The description of all indecomposable positive-definite central functions on $S(\infty)$ was first

obtained by Thoma (see Kerov (2003, 1998) and Olshanski (2003)). Rephrased in our language, it says that the functions
$$\varphi(\lambda) = s_\lambda \Big|_{p_1 = 1,\ p_k = \sum_i \alpha_i^k + (-1)^{k+1} \sum_i \beta_i^k,\ k > 1}$$
are the extreme points of H(1). Here $\alpha_i$ and $\beta_i$ are parameters satisfying
$$\alpha_1 \ge \alpha_2 \ge \cdots \ge 0, \qquad \beta_1 \ge \beta_2 \ge \cdots \ge 0, \qquad \sum_i (\alpha_i + \beta_i) \le 1$$
This set is known as the Thoma simplex. The origin $\alpha_i = \beta_i = 0$ corresponds to the Plancherel measure.
A general positive-definite central function on $S(\infty)$ defines a measure on the Thoma simplex. This measure can be interpreted as a point process on the real line, for example, by placing particles at positions $\{\alpha_i\}$ and $\{-\beta_i\}$. Interesting central functions lead to interesting processes (see Olshanski (2003)).

Second Example: Kingman Theorem

Let $\Pi$ be a partition of the naturals $\mathbb{N}$ into disjoint subsets. For any n = 1, 2, . . . , $\Pi$ defines the induced partition $\Pi_n$ of {1, . . . , n} and hence a partition $\lambda(\Pi_n)$ of the number n. A measure M on partitions $\Pi$ is called exchangeable if
$$M(\Pi_n) = \varphi(\lambda(\Pi_n)), \quad n = 1, 2, \ldots$$
for some function $\varphi$ on $\mathbb{Y}$. This implies that $\varphi$ is harmonic for
$$\kappa_K(\mu, \lambda) = \rho_k$$
where $\mu = 1^{\rho_1} 2^{\rho_2} \cdots$ and $\lambda = 1^{\rho_1} 2^{\rho_2} \cdots k^{\rho_k - 1} (k+1)^{\rho_{k+1} + 1} \cdots$. The description of all exchangeable measures M was first obtained by Kingman. In our language, it says that the extreme points of $H(\kappa_K)$ are
$$\varphi(\lambda) = m_\lambda \Big|_{p_1 = 1,\ p_k = \sum_i \alpha_i^k,\ k > 1}$$
where $m_\lambda$ is the monomial symmetric function (sum of all monomials with exponents $\lambda$) and $\alpha_i$ are parameters as before. The corresponding measure $M_\alpha$ can be described as follows. Let $X_i$ be a sequence of independent, identically distributed random variables such that $\{\alpha_i\}$ are the measures of atoms of their distribution. This defines a random partition $\Pi$ of $\mathbb{N}$ by putting i and j in the same block of $\Pi$ if and only if $X_i = X_j$. A general exchangeable measure M is then a convex linear combination of $M_\alpha$, which can be viewed as making the common distribution of $X_i$ also random. See Pitman (n.d.) for a lot more about Kingman's theorem.
The multiplicities $\kappa_K$ are gauge equivalent to multiplicities
$$\kappa_{VP}(\mu, \lambda) = k \rho_k \qquad [14]$$
which arise in the study of probability measures on virtual permutations $\mathfrak{S}$ (Olshanski 2003). By definition,
$$\mathfrak{S} = \varprojlim S(n)$$
with respect to the maps $S(n) \to S(n-1)$ that delete n from the disjoint cycle decomposition of a permutation $\sigma \in S(n)$. For $n \ge 5$, this is the unique map that commutes with the right and left action of S(n − 1). Thus, $\mathfrak{S}$ has a natural $S(\infty) \times S(\infty)$ action; however, it is not a group. A measure M on $\mathfrak{S}$ is central if it is invariant under the action of the diagonal subgroup in $S(\infty) \times S(\infty)$. Let the push-forward of M to S(n) give mass $\varphi(\lambda)$ to a permutation with cycle type $\lambda$. It is then easy to see that $\varphi$ is harmonic with respect to [14]. Thus, Kingman's theorem gives a description of ergodic central measures on $\mathfrak{S}$. For example, $\alpha_i = 0$ corresponds to the $\delta$-measure at the identity.

Ergodic Method

A unified approach to this type of problem was proposed and developed by Vershik and Kerov. It is based on the following ergodic theorem. Let $\varphi$ be an ergodic harmonic function. Then
$$\varphi(\mu) = \lim \frac{\dim_\kappa \lambda/\mu}{\dim_\kappa \lambda}, \quad |\lambda| \to \infty \qquad [15]$$
for almost all $\lambda$ with respect to the measure [13] (Kerov 2003). This is similar to approximating a Gibbs measure in infinite volume by a sequence of finite-volume Gibbs measures with appropriate boundary conditions. The ratio on the RHS of [15] is known as the Martin kernel. Its asymptotics as $|\lambda| \to \infty$ plays the essential role.
Let us call a sequence $\{\lambda(n)\}$ of partitions of n regular if the limit in [15] exists for all $\mu$. For $\kappa \equiv 1$, Vershik and Kerov proved that $\{\lambda(n)\}$ is regular if and only if the following limits exist:
$$\frac{\lambda(n)_i}{n} \to \alpha_i, \qquad \frac{\lambda(n)'_i}{n} \to \beta_i \qquad [16]$$
that is, if the rows and columns of $\lambda(n)$, scaled by n, have a limit. In this case, the limit in [15] is the harmonic function with Thoma parameters $\alpha_i$ and $\beta_i$. This simultaneously proves Thoma classification and gives a law of large numbers for the corresponding measures [13]. It also gives a transparent geometric interpretation of Thoma parameters. Note that the behavior [16] is very different from the formation of a smooth limit shape that we saw earlier. For a common generalization of this result and Kingman's theorem see Kerov (1998).
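The harmonicity condition [12] can be tested numerically in the simplest case. The Python sketch below is illustrative only; its helper names are hypothetical. It takes unit edge multiplicities on the Young graph and the function $\dim\lambda/|\lambda|!$, which is the Schur function specialization at the origin of the Thoma simplex, and checks the mean value property at every vertex up to a given size. The associated measure [13] is then the Plancherel measure.

```python
from fractions import Fraction
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def dim(lam):
    """dim(lam) via the hook formula: |lam|! / product of hook lengths."""
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    product = 1
    for i, part in enumerate(lam):
        for j in range(part):
            product *= (part - j - 1) + (conj[j] - i - 1) + 1
    return factorial(sum(lam)) // product


def covers(mu):
    """All partitions obtained from mu by adding one square (edges out of mu)."""
    result = []
    for i in range(len(mu) + 1):
        row = mu[i] if i < len(mu) else 0
        # a square can be added to row i only if the row above is strictly longer
        if i == 0 or row < mu[i - 1]:
            result.append(mu[:i] + (row + 1,) + mu[i + 1:])
    return result


def phi(lam):
    """Candidate harmonic function: dim(lam) / |lam|!."""
    return Fraction(dim(lam), factorial(sum(lam)))


def is_harmonic_at(mu):
    """Check phi(mu) = sum of phi over all lam covering mu (multiplicity 1)."""
    return phi(mu) == sum(phi(lam) for lam in covers(mu))


if __name__ == "__main__":
    print(all(is_harmonic_at(mu) for n in range(7) for mu in partitions(n)))
    print(sum(phi(lam) * dim(lam) for lam in partitions(6)))  # mass of measure [13]
```

The check amounts to the identity that the dimensions of all diagrams covering $\mu$ add up to $(|\mu| + 1)\dim\mu$.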
Random Walks in Random Environments 353

See also: Determinantal Random Fields; Growth Processes in Random Matrix Theory; Integrable Systems in Random Matrix Theory; Random Matrix Theory in Physics; Symmetry Classes in Random Matrix Theory.

Further Reading

Aldous D and Diaconis P (1999) Longest increasing subsequences: from patience sorting to the Baik–Deift–Johansson theorem. Bulletin of the American Mathematical Society 36(4): 413–432.
Bakalov B and Kirillov A Jr. (2001) Lectures on Tensor Categories and Modular Functors. University Lecture Series, vol. 21. American Mathematical Society.
Deift P (2000) Integrable systems and combinatorial theory. Notices of the American Mathematical Society 47(6): 631–640.
Jones G (1998) Characters and Surfaces: A Survey, The Atlas of Finite Groups: Ten Years on (Birmingham, 1995), London Mathematical Society Lecture Note Series, vol. 249, pp. 90–118. Cambridge: Cambridge University Press.
Kazakov V (2001) Solvable Matrix Models, Random Matrices and Their Applications, vol. 40. MSRI Publications; Cambridge: Cambridge University Press.
Kerov S (2003) Asymptotic Representation Theory of the Symmetric Group and its Applications in Analysis. American Mathematical Society.
Kerov S, Okounkov A, and Olshanski G (1998) The boundary of the Young graph with Jack edge multiplicities. International Mathematics Research Notices 4: 173–199.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials. Oxford: Clarendon.
Mehta ML (1991) Random Matrices, 2nd edn. Boston, MA: Academic Press.
Miwa T, Jimbo M, and Date E (2000) Solitons. Differential Equations, Symmetries and Infinite-Dimensional Algebras. Cambridge: Cambridge University Press.
Nakajima H and Yoshioka K (2003) Lectures on instanton counting, math.AG/0311058.
Nekrasov N and Okounkov A (2003) Seiberg–Witten theory and random partitions, hep-th/0306238.
Okounkov A (2002) Symmetric Functions and Random Partitions, Symmetric Functions 2001: Surveys of Developments and Perspectives, pp. 223–252, NATO Sci. Ser. II Math. Phys. Chem., 74. (math.CO/0309074). Dordrecht: Kluwer Academic.
Okounkov A (2002) The uses of random partitions, math-ph/0309015.
Olshanski G (2003) An introduction to harmonic analysis on the infinite symmetric group, math.RT/0311369.
Pitman J (n.d.) Combinatorial Stochastic Processes, Lecture Notes from St. Flour Course, available from www.stat.berekeley.edu.
Sagan B (2001) The Symmetric Group. Representations, Combinatorial Algorithms, and Symmetric Functions. Graduate Texts in Mathematics, 2nd edn., vol. 203. New York: Springer.
Stanley R (1999) Enumerative Combinatorics, II. Cambridge: Cambridge University Press.
Witten E (1991) On quantum gauge theories in two dimensions. Communications in Mathematical Physics 141(1): 153–209.
Woodward C (2004) Localization for the norm-square of the moment map and the two-dimensional Yang–Mills integral, math.SG/0404413.

Random Walks in Random Environments


L V Bogachev, University of Leeds, Leeds, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Random walks provide a simple conventional model to describe various transport processes, for example, propagation of heat or diffusion of matter through a medium (for a general reference see, e.g., Hughes (1995)). However, in many practical cases, the medium where the system evolves is highly irregular, due to factors such as defects, impurities, fluctuations, etc. It is natural to model such irregularities as “random environment,” treating the observable sample as a statistical realization of an ensemble, obtained by choosing the local characteristics of the motion (e.g., transport coefficients and driving fields) at random, according to a certain probability distribution.
In the random walks context, such models are referred to as “random walks in random environments” (RWRE). This is a relatively new chapter in applied probability and physics of disordered systems initiated in the 1970s. Early interest in RWRE models was motivated by some problems in biology, crystallography, and metal physics, but later applications have spread through numerous areas (see review papers by Alexander et al. (1981), Bouchaud and Georges (1990), and a comprehensive monograph by Hughes (1996)). After 30 years of extensive work, RWRE remain a very active area of research, which has been a rich source of hard and challenging questions and has already led to many surprising discoveries, such as subdiffusive behavior, trapping effects, localization, etc. It is fair to say that the RWRE paradigm has become firmly established in physics of random media, and its models, ideas, methods, results, and general effects have become an indispensable part of the standard tool kit of a mathematical physicist.
One of the central problems in random media theory is to establish conditions ensuring homogenization, whereby a given stochastic system evolving in a random medium can be adequately described, on some spatial–temporal scale, using a suitable effective system in a homogeneous (nonrandom) medium. In particular, such systems would exhibit classical diffusive behavior with effective drift and diffusion coefficient. Such an approximation, called “effective medium approximation” (EMA), may be expected to

be successful for systems exposed to a relatively small that only the current location of the walk determines
disorder of the environment. However, in certain the random motion mechanism, whereas the past
circumstances, EMA may fail due to atypical environ- history is not relevant. In terms of probability theory,
ment configurations (‘‘large deviations’’) leading to such a process is referred to as ‘‘Markov chain.’’ Thus,
various anomalous effects. For instance, with small but assuming that the walk starts at the origin, its position
positive probability a realization of the environment after n steps can be represented as the sum of
may create ‘‘traps’’ that would hold the particle for an consecutive displacements, Xn = Z1 þ    þ Zn ,
anomalously long time, resulting in the subdiffusive where Zi are independent random variables with the
behavior, with the mean square displacement growing same distribution P{Zi = 1} = p, P{Zi = 1} = q.
slower than linearly in time. The strong law of large numbers (LLN) states that
RWRE models have been studied by various almost surely (i.e., with probability 1)
nonrigorous methods including Monte Carlo simu-
Xn
lations, series expansions, and the renormalization lim ¼ EZ1 ¼ p  q; P-a.s. ½1
group techniques (see more details in the above
n!1 n
references), but only a few models have been where E denotes expectation (mean value) with respect
analyzed rigorously, especially in dimensions greater to P. This result shows that the random walk moves
than one. The situation is much more satisfactory in with the asymptotic average velocity close to p  q. It
the one-dimensional case, where the mathematical follows that if p  q 6¼ 0, then the process Xn , with
theory has matured and the RWRE dynamics has probability 1, will ultimately drift to infinity (more
been understood fairly well. precisely, þ1 if p  q > 0 and 1 if p  q < 0). In
The goal of this article is to give a brief particular, in this case, the random walk may return to
introduction to the beautiful area of RWRE. The the origin (and in fact visit any site on Z) only finitely
principal model to be discussed is a random walk with nearest-neighbor jumps in independent and identically distributed (i.i.d.) random environment in one dimension, although we shall also comment on some generalizations. The focus is on rigorous results; however, heuristics will be used freely to motivate the ideas and explain the approaches and proofs. In a few cases, sketches of the proofs have been included, which should help appreciate the flavor of the results and methods.

Ordinary Random Walks: A Reminder

To put our exposition in perspective, let us give a brief account of a few basic concepts and facts for ordinary random walks, that is, evolving in a nonrandom environment (see further details in Hughes (1995)). In such models, space is modeled using a suitable graph, for example, a $d$-dimensional integer lattice $\mathbb{Z}^d$, while time may be discrete or continuous. The latter distinction is not essential, and in this article we will mostly focus on the discrete-time case. The random mechanism of spatial motion is then determined by the given transition probabilities (probabilities of jumps) at each site of the graph. In the lattice case, it is usually assumed that the walk is translation invariant, so that at each step the distribution of jumps is the same, with no regard to the current location of the walk.
In one dimension ($d = 1$), the simple (nearest-neighbor) random walk may move one step to the right or to the left at a time, with some probabilities $p$ and $q = 1 - p$, respectively. An important assumption is

many times. Such behavior is called "transient."
However, in the symmetric case (i.e., $p = q = 0.5$) the average velocity vanishes, so the above argument fails. In this case, the walk behavior appears to be more complicated, as it makes increasingly large excursions both to the right and to the left, so that $\limsup_{n\to\infty} X_n = +\infty$, $\liminf_{n\to\infty} X_n = -\infty$ ($P$-a.s.). This implies that a symmetric random walk in one dimension is "recurrent," in that it visits the origin (and indeed any site on $\mathbb{Z}$) infinitely often. Moreover, it can be shown to be "null-recurrent," which means that the expected time to return to the origin is infinite. That is to say, return to the origin is guaranteed, but it takes very long until this happens.
Fluctuations of the random walk can be characterized further via the central limit theorem (CLT), which amounts to saying that the probability distribution of $X_n$ is asymptotically normal, with mean $n(p - q)$ and variance $4npq$:
$$ \lim_{n\to\infty} P\left\{ \frac{X_n - n(p - q)}{\sqrt{4npq}} \le x \right\} = \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy \qquad [2] $$
These results can be extended to more general walks in one dimension, and also to higher dimensions. For instance, the criterion of recurrence for a general one-dimensional random walk is that it is unbiased, $E(X_1 - X_0) = 0$. In the two-dimensional case, in addition one needs $E|X_1 - X_0|^2 < \infty$. In higher dimensions, any random walk (which does not reduce to lower dimension) is transient.
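The normal approximation [2] can be checked empirically. The following is a minimal sketch, assuming a walk started at 0 and using only the Python standard library; the names `walk_endpoint` and `clt_moments` and the parameter values are illustrative choices, not taken from the article.

```python
import random
import statistics


def walk_endpoint(n, p, rng):
    """Position X_n of a simple walk started at 0, with P(step = +1) = p."""
    return sum(1 if rng.random() < p else -1 for _ in range(n))


def clt_moments(n, p):
    """Mean n(p - q) and variance 4npq appearing in the CLT, eqn [2]."""
    q = 1.0 - p
    return n * (p - q), 4.0 * n * p * q


rng = random.Random(0)          # fixed seed so the run is reproducible
n, p, samples = 400, 0.6, 2000  # illustrative values
ends = [walk_endpoint(n, p, rng) for _ in range(samples)]
mean_theory, var_theory = clt_moments(n, p)

print("sample mean    ", statistics.fmean(ends), " theory", mean_theory)
print("sample variance", statistics.pvariance(ends), " theory", var_theory)
```

With a few thousand samples the empirical mean and variance should lie close to $n(p-q)$ and $4npq$; the agreement improves as the number of samples grows.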
Random Environments and Random Walks

The definition of an RWRE involves two ingredients: (1) the environment, which is randomly chosen but remains fixed throughout the time evolution, and (2) the random walk, whose transition probabilities are determined by the environment. The set of environments (sample space) is denoted by $\Omega = \{\omega\}$, and we use $\mathbb{P}$ to denote the probability distribution on this space. For each $\omega \in \Omega$, we define the random walk in the environment $\omega$ as the (time-homogeneous) Markov chain $\{X_t, t = 0, 1, 2, \ldots\}$ on $\mathbb{Z}^d$ with certain (random) transition probabilities
$$ p(x, y, \omega) = P_\omega\{X_1 = y \mid X_0 = x\} \qquad [3] $$
The probability measure $P_\omega$ that determines the distribution of the random walk in a given environment $\omega$ is referred to as the "quenched" law. We often use a subindex to indicate the initial position of the walk, so that, for example, $P_\omega^x\{X_0 = x\} = 1$.
By averaging the quenched probability $P_\omega^x$ further, with respect to the environment distribution, we obtain the "annealed" measure $\mathbf{P}_x = \mathbb{P} \times P_\omega^x$, which determines the probability law of the RWRE:
$$ \mathbf{P}_x(A) = \int_\Omega P_\omega^x(A)\, \mathbb{P}(d\omega) = \mathbb{E} P_\omega^x(A) \qquad [4] $$
Expectation with respect to the annealed measure $\mathbf{P}_x$ will be denoted by $\mathbf{E}_x$.
Equation [4] implies that if some property $A$ of the RWRE holds almost surely with respect to the quenched law $P_\omega^x$ for almost all environments (i.e., for all $\omega \in \Omega'$ such that $\mathbb{P}(\Omega') = 1$), then this property is also true with probability 1 under the annealed law $\mathbf{P}_x$.
Note that the random walk $X_n$ is a Markov chain only conditionally on the fixed environment (i.e., with respect to $P_\omega^x$), but the Markov property fails under the annealed measure $\mathbf{P}_x$. This is because the past history cannot be neglected, as it tells what information about the medium must be taken into account when averaging with respect to environment. That is to say, the walk learns more about the environment by taking more steps. (This idea motivates the method of "environment viewed from the particle," see related section below.)
The simplest model is the nearest-neighbor one-dimensional walk, with transition probabilities
$$ p(x, y, \omega) = \begin{cases} p_x & \text{if } y = x + 1 \\ q_x & \text{if } y = x - 1 \\ 0 & \text{otherwise} \end{cases} $$
where $p_x$ and $q_x = 1 - p_x$ ($x \in \mathbb{Z}$) are random variables on the probability space $(\Omega, \mathbb{P})$. That is to say, given the environment $\omega \in \Omega$, the random walk currently at point $x \in \mathbb{Z}$ will make a one-unit step to the right, with probability $p_x$, or to the left, with probability $q_x$. Here the environment is determined by the sequence of random variables $\{p_x\}$. For most of the article, we assume that the random probabilities $\{p_x, x \in \mathbb{Z}\}$ are i.i.d., which is referred to as "i.i.d. environment." Some extensions to more general environments will be mentioned briefly in the section "Some generalizations and variations."
The study of RWRE is simplified under the following natural condition called "(uniform) ellipticity:"
$$ 0 < \delta \le p_x \le 1 - \delta < 1, \quad x \in \mathbb{Z}, \quad \mathbb{P}\text{-a.s.} \qquad [5] $$
which will be frequently assumed in the sequel.

Transience and Recurrence

In this section, we discuss a criterion for the RWRE to be transient or recurrent. The following theorem is due to Solomon (1975).

Theorem 1 Set $\rho_x := q_x/p_x$, $x \in \mathbb{Z}$, and $\eta := \mathbb{E} \ln \rho_0$.
(i) If $\eta \ne 0$ then $X_t$ is transient ($\mathbf{P}_0$-a.s.); moreover, if $\eta < 0$ then $\lim_{t\to\infty} X_t = +\infty$, while if $\eta > 0$ then $\lim_{t\to\infty} X_t = -\infty$ ($\mathbf{P}_0$-a.s.).
(ii) If $\eta = 0$ then $X_t$ is recurrent ($\mathbf{P}_0$-a.s.); moreover,
$$ \limsup_{t\to\infty} X_t = +\infty, \quad \liminf_{t\to\infty} X_t = -\infty, \quad \mathbf{P}_0\text{-a.s.} $$
Let us sketch the proof. Consider the hitting times $T_x := \min\{t \ge 0 : X_t = x\}$ and denote by $f_{xy}$ the quenched first-passage probability from $x$ to $y$:
$$ f_{xy} := P_\omega^x\{1 \le T_y < \infty\} $$
Starting from 0, the first step of the walk may be either to the right or to the left, hence by the Markov property the return probability $f_{00}$ can be decomposed as
$$ f_{00} = p_0 f_{10} + q_0 f_{-1,0} \qquad [6] $$
To evaluate $f_{10}$, for $n \ge 1$ set
$$ u_x \equiv u_x^{(n)} := P_\omega^x\{T_0 < T_n\}, \quad 0 \le x \le n $$
which is the probability to reach 0 prior to $n$, starting from $x$. Clearly,
$$ f_{10} = \lim_{n\to\infty} u_1^{(n)} \qquad [7] $$
Decomposition with respect to the first step yields the difference equation
$$ u_x = p_x u_{x+1} + q_x u_{x-1}, \quad 0 < x < n \qquad [8] $$
with the boundary conditions
$$ u_0 = 1, \quad u_n = 0 \qquad [9] $$
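The boundary-value problem [8], [9] is easy to solve numerically for a sampled environment. The sketch below is one way to do it, assuming the environment is given as a list `p` of right-step probabilities; it uses the fact that the increments $u_{x+1}-u_x$ obey a first-order recursion with ratio $\rho_x = q_x/p_x$, which follows from $p_x + q_x = 1$. The function name `hitting_probabilities` and the uniform environment are illustrative assumptions.

```python
import random


def hitting_probabilities(p):
    """Solve u_x = p_x u_{x+1} + q_x u_{x-1} for 0 < x < n, with u_0 = 1, u_n = 0.

    p[x] is the probability of a right step at site x; only p[1..n-1] are used,
    where n = len(p). Returns the list [u_0, ..., u_n].
    """
    n = len(p)
    # weights[x] = rho_1 * ... * rho_x (empty product = 1 for x = 0)
    weights = [1.0]
    for x in range(1, n):
        weights.append(weights[-1] * (1.0 - p[x]) / p[x])
    # u_{x+1} - u_x = d0 * weights[x]; summing over x gives u_n - u_0 = -1
    d0 = -1.0 / sum(weights)
    u = [1.0]
    for w in weights:
        u.append(u[-1] + d0 * w)
    return u


rng = random.Random(1)
n = 20
p = [rng.uniform(0.2, 0.8) for _ in range(n)]  # an elliptic sample environment
u = hitting_probabilities(p)

# Largest violation of the difference equation [8] over the interior sites
residual = max(
    abs(u[x] - p[x] * u[x + 1] - (1.0 - p[x]) * u[x - 1]) for x in range(1, n)
)
print("u_1 =", u[1], " max residual =", residual)
```

For a constant environment $p_x \equiv 1/2$ the same routine returns the linear profile $u_x = 1 - x/n$, the classical gambler's ruin answer.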
Using $p_x + q_x = 1$, eqn [8] can be rewritten as
$$ u_{x+1} - u_x = \rho_x (u_x - u_{x-1}) $$
whence by iterations
$$ u_{x+1} - u_x = (u_1 - u_0) \prod_{j=1}^{x} \rho_j \qquad [10] $$
Summing over $x$ and using the boundary conditions [9] we obtain
$$ 1 - u_1 = \left( \sum_{x=0}^{n-1} \prod_{j=1}^{x} \rho_j \right)^{-1} \qquad [11] $$
(if $x = 0$, the product over $j$ is interpreted as 1). In view of eqn [7] it follows that $f_{10} = 1$ if and only if the right-hand side of eqn [11] tends to 0, that is,
$$ \sum_{x=1}^{\infty} \exp(Y_x) = \infty, \qquad Y_x := \sum_{j=1}^{x} \ln \rho_j \qquad [12] $$
Note that the random variables $\ln \rho_j$ are i.i.d., hence by the strong LLN
$$ \lim_{x\to\infty} \frac{Y_x}{x} = \mathbb{E} \ln \rho_0 \equiv \eta, \quad \mathbb{P}\text{-a.s.} $$
That is, the general term of the series [12] for large $x$ behaves like $\exp(x\eta)$; hence, for $\eta > 0$ the condition [12] holds true (and so $f_{10} = 1$), whereas for $\eta < 0$ it fails (and so $f_{10} < 1$).
By interchanging the roles of $p_x$ and $q_x$, we also have $f_{-1,0} < 1$ if $\eta > 0$ and $f_{-1,0} = 1$ if $\eta < 0$. From eqn [6], it then follows that in both cases $f_{00} < 1$, that is, the random walk is transient.
In the critical case, $\eta = 0$, by a general result from probability theory, $Y_x \ge 0$ for infinitely many $x$ ($\mathbb{P}$-a.s.), and so the series in eqn [12] diverges. Hence, $f_{10} = 1$ and, similarly, $f_{-1,0} = 1$, so by eqn [6] $f_{00} = 1$, that is, the random walk is recurrent.
It may be surprising that the critical parameter appears in the form $\eta = \mathbb{E} \ln \rho_0$, as it is probably more natural to expect, by analogy with the ordinary random walk, that the RWRE criterion would be based on the mean drift, $\mathbb{E}(p_0 - q_0)$. In the next section, we will see that the sign of the mean drift may be misleading.
A canonical model of RWRE is specified by the assumption that the random variables $p_x$ take only two values, $\beta$ and $1 - \beta$, with probabilities
$$ \mathbb{P}\{p_x = \beta\} = \alpha, \quad \mathbb{P}\{p_x = 1 - \beta\} = 1 - \alpha \qquad [13] $$
where $0 < \alpha < 1$, $0 < \beta < 1$. Here $\eta = (2\alpha - 1) \ln(1 + (1 - 2\beta)/\beta)$, and it is easy to see that, for example, $\eta < 0$ if $\alpha < 1/2$, $\beta < 1/2$ or $\alpha > 1/2$, $\beta > 1/2$. The recurrent region where $\eta = 0$ splits into two lines, $\beta = 1/2$ and $\alpha = 1/2$. Note that the first case is degenerate and amounts to the ordinary symmetric random walk, while the second one (except where $\beta = 1/2$) corresponds to Sinai's problem (see the section "Sinai's localization"). A "phase diagram" for this model, showing various limiting regimes as a function of the parameters $\alpha$, $\beta$, is presented in Figure 1.

[Figure 1: phase diagram in the unit square of the parameters $(\alpha, \beta)$, divided by the lines $\alpha = 1/2$ and $\beta = 1/2$; the regions are labeled $\eta < 0$, $v > 0$; $\eta < 0$, $v = 0$; $\eta > 0$, $v = 0$; and $\eta > 0$, $v < 0$.]
Figure 1 Phase diagram for the canonical model, eqn [13]. In the regions where $\eta < 0$ or $\eta > 0$, the RWRE is transient to $+\infty$ or $-\infty$, respectively. The recurrent case, $\eta = 0$, arises when $\alpha = 1/2$ or $\beta = 1/2$. The asymptotic velocity $v := \lim_{t\to\infty} X_t/t$ is given by eqn [14]. Adapted from Hughes BD (1996) Random Walks and Random Environments. Volume 2: Random Environments, Ch. 6, p. 391. Oxford: Clarendon, by permission of Oxford University Press.

Asymptotic Velocity

In the transient case the walk escapes to infinity, and it is reasonable to ask at what speed. For a nonrandom environment, $p_x \equiv p$, the answer is given by the LLN, eqn [1]. For the simple RWRE, the asymptotic velocity was obtained by Solomon (1975). Note that by Jensen's inequality, $(\mathbb{E}\rho_0)^{-1} \le \mathbb{E}\rho_0^{-1}$.

Theorem 2 The limit $v := \lim_{t\to\infty} X_t/t$ exists ($\mathbf{P}_0$-a.s.) and is given by
$$ v = \begin{cases} \dfrac{1 - \mathbb{E}\rho_0}{1 + \mathbb{E}\rho_0} & \text{if } \mathbb{E}\rho_0 < 1 \\[2ex] -\dfrac{1 - \mathbb{E}\rho_0^{-1}}{1 + \mathbb{E}\rho_0^{-1}} & \text{if } \mathbb{E}\rho_0^{-1} < 1 \\[2ex] 0 & \text{otherwise} \end{cases} \qquad [14] $$
Thus, the RWRE has a well-defined nonzero asymptotic velocity except when $(\mathbb{E}\rho_0)^{-1} \le 1 \le \mathbb{E}\rho_0^{-1}$. For instance, in the canonical example eqn [13] (see Figure 1), the criterion $\mathbb{E}\rho_0 < 1$ for the velocity $v$ to be positive amounts to the condition that both $(1 - \alpha)/\alpha$ and $(1 - \beta)/\beta$ lie on the same side of point 1.
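For the canonical model [13], Theorems 1 and 2 reduce to closed-form expressions, which makes the phase diagram easy to explore numerically. The following is a minimal sketch under that assumption; the function name `canonical_model` and the sample parameter values are illustrative and not part of the article.

```python
import math


def canonical_model(alpha, beta):
    """Return (eta, E rho_0, E rho_0^{-1}, v) for the two-valued model [13],

    where P{p_x = beta} = alpha and P{p_x = 1 - beta} = 1 - alpha.
    """
    r = (1.0 - beta) / beta  # rho_0 = r when p_0 = beta, and 1/r when p_0 = 1 - beta
    eta = (2.0 * alpha - 1.0) * math.log(r)
    e_rho = alpha * r + (1.0 - alpha) / r
    e_rho_inv = alpha / r + (1.0 - alpha) * r
    if e_rho < 1.0:                      # first case of eqn [14]
        v = (1.0 - e_rho) / (1.0 + e_rho)
    elif e_rho_inv < 1.0:                # second case of eqn [14]
        v = -(1.0 - e_rho_inv) / (1.0 + e_rho_inv)
    else:                                # zero velocity
        v = 0.0
    return eta, e_rho, e_rho_inv, v


for alpha, beta in [(0.1, 0.2), (0.3, 0.2), (0.5, 0.2)]:
    eta, e_rho, e_rho_inv, v = canonical_model(alpha, beta)
    if eta < 0:
        regime = "transient to +infinity"
    elif eta > 0:
        regime = "transient to -infinity"
    else:
        regime = "recurrent"
    print(f"alpha={alpha}, beta={beta}: eta={eta:.4f}, v={v:.4f}, {regime}")
```

The middle parameter pair illustrates the regime $\eta < 0$, $v = 0$: the walk escapes to $+\infty$, yet both $\mathbb{E}\rho_0$ and $\mathbb{E}\rho_0^{-1}$ exceed 1, so the velocity vanishes.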
The key idea of the proof is to analyze the hitting times $T_n$ first, deducing results for the walk $X_t$ later. More specifically, set $\tau_i = T_i - T_{i-1}$, which is the time to hit $i$ after hitting $i - 1$ (provided that $i > X_0$). If $X_0 = 0$ and $n \ge 1$, then $T_n = \tau_1 + \cdots + \tau_n$. Note that in fixed environment $\omega$ the random variables $\{\tau_i\}$ are independent, since the quenched random walk "forgets" its past. Although there is no independence with respect to the annealed probability measure $\mathbf{P}_0$, one can show that, due to the i.i.d. property of the environment, the sequence $\{\tau_i\}$ is ergodic and therefore satisfies the LLN:
$$ \frac{T_n}{n} = \frac{\tau_1 + \cdots + \tau_n}{n} \to \mathbf{E}_0 \tau_1, \quad \mathbf{P}_0\text{-a.s.} $$
In turn, this implies
$$ \frac{X_t}{t} \to \frac{1}{\mathbf{E}_0 \tau_1}, \quad \mathbf{P}_0\text{-a.s.} \qquad [15] $$
(the clue is to note that $X_{T_n} = n$).
To compute the mean value $\mathbf{E}_0 \tau_1$, observe that
$$ \tau_1 = \mathbf{1}_{\{X_1 = 1\}} + \mathbf{1}_{\{X_1 = -1\}} (1 + \tau_0' + \tau_1') \qquad [16] $$
where $\mathbf{1}_A$ is the indicator of event $A$ and $\tau_0'$, $\tau_1'$ are, respectively, the times to get from $-1$ to 0 and then from 0 to 1. Taking expectations in a fixed environment $\omega$, we obtain
$$ E_\omega^0 \tau_1 = p_0 + q_0 (1 + E_\omega^0 \tau_0' + E_\omega^0 \tau_1) \qquad [17] $$
and so
$$ E_\omega^0 \tau_1 = 1 + \rho_0 + \rho_0 E_\omega^0 \tau_0' \qquad [18] $$
Note that $E_\omega^0 \tau_0'$ is a function of $\{p_x, x < 0\}$ and hence is independent of $\rho_0 = q_0/p_0$. Averaging eqn [18] over the environment and using $\mathbf{E}_0 \tau_0' = \mathbf{E}_0 \tau_1$ yields
$$ \mathbf{E}_0 \tau_1 = \begin{cases} \dfrac{1 + \mathbb{E}\rho_0}{1 - \mathbb{E}\rho_0} & \text{if } \mathbb{E}\rho_0 < 1 \\[2ex] \infty & \text{if } \mathbb{E}\rho_0 \ge 1 \end{cases} \qquad [19] $$
and by eqn [15] "half" of eqn [14] follows. The other half, in terms of $\mathbb{E}\rho_0^{-1}$, can be obtained by interchanging the roles of $p_x$ and $q_x$, whereby $\rho_0$ is replaced with $\rho_0^{-1}$.
Let us make a few remarks concerning Theorems 1 and 2. First of all, note that by Jensen's inequality $\mathbb{E} \ln \rho_0 \le \ln \mathbb{E}\rho_0$, with a strict inequality whenever $\rho_0$ is nondegenerate. Therefore, it may be possible that, with $\mathbf{P}_0$-probability 1, $X_t \to \infty$ but $X_t/t \to 0$ (see Figure 1). This is quite unusual as compared to the ordinary random walk (see the subsection "Ordinary random walks: a reminder"), and indicates some kind of slowdown in the transient case.
Furthermore, by Jensen's inequality
$$ \mathbb{E}\rho_0 = \mathbb{E} p_0^{-1} - 1 \ge (\mathbb{E} p_0)^{-1} - 1 $$
so eqn [14] implies that if $\mathbb{E}\rho_0 < 1$, then
$$ 0 < v \le 2\, \mathbb{E} p_0 - 1 = \mathbb{E}(p_0 - q_0) $$
and the inequality is strict if $p_0$ is genuinely random (i.e., does not reduce to a constant). Hence, the asymptotic velocity $v$ is less than the mean drift $\mathbb{E}(p_0 - q_0)$, which is yet another evidence of slowdown. What is even more surprising is that it is possible to have $\mathbb{E}(p_0 - q_0) > 0$ but $\eta = \mathbb{E} \ln \rho_0 > 0$, so that $\mathbf{P}_0$-a.s. $X_t \to -\infty$ (although with velocity $v = 0$). Indeed, following Sznitman (2004) suppose that
$$ \mathbb{P}\{p_0 = \beta\} = \alpha, \quad \mathbb{P}\{p_0 = \gamma\} = 1 - \alpha $$
with $\alpha > 1/2$. Then $\mathbb{E} p_0 \ge \alpha\beta > 1/2$ if $1 > \beta > 1/(2\alpha)$, hence $\mathbb{E}(p_0 - q_0) = 2\, \mathbb{E} p_0 - 1 > 0$. On the other hand,
$$ \mathbb{E} \ln \rho_0 = \alpha \ln \frac{1 - \beta}{\beta} + (1 - \alpha) \ln \frac{1 - \gamma}{\gamma} > 0 $$
if $\gamma$ is sufficiently small.

Critical Exponent, Excursions, and Traps

Extending the previous analysis of the hitting times, one can obtain useful information about the limit distribution of $T_n$ (and hence $X_t$). To appreciate this, note that from the recursion eqn [16] it follows
$$ \tau_1^s = \mathbf{1}_{\{X_1 = 1\}} + \mathbf{1}_{\{X_1 = -1\}} (1 + \tau_0' + \tau_1')^s $$
and, similarly to [17],
$$ E_\omega^0 \tau_1^s = p_0 + q_0 E_\omega^0 (1 + \tau_0' + \tau_1')^s $$
Taking here expectation $\mathbb{E}$, one can deduce that $\mathbf{E}_0 \tau_1^s < \infty$ if and only if $\mathbb{E}\rho_0^s < 1$. Therefore, it is natural to expect that the root $\kappa$ of the equation
$$ \mathbb{E}\rho_0^\kappa = 1 \qquad [20] $$
plays the role of a critical exponent responsible for the growth rate (and hence, for the type of the limit distribution) of the sum $T_n = \tau_1 + \cdots + \tau_n$. In particular, by analogy with sums of i.i.d. random variables one can expect that if $\kappa > 2$, then $T_n$ is asymptotically normal, with the standard scaling $\sqrt{n}$, while for $\kappa < 2$ the limit law of $T_n$ is stable (with index $\kappa$) under scaling $n^{1/\kappa}$.
Alternatively, eqn [20] can be obtained from consideration of excursions of the random walk. Let $T_{11}^L$ be the left-excursion time from site 1, that is the time to return to 1 after moving to the left at the first step. If $\eta = \mathbb{E} \ln \rho_0 < 0$, then $T_{11}^L < \infty$ ($\mathbf{P}_0$-a.s.). Fixing an environment $\omega$, let $w_1 = E_\omega^1 T_{11}^L$ be the
quenched mean duration of the excursion $T_{11}^L$ and observe that $w_1 = 1 + E_\omega^0 \tau_1$, where $\tau_1$ is the time to get back to 1 after stepping to 0.
As a matter of fact, this representation and eqn [19] imply that the annealed mean duration of the left excursion, $\mathbf{E}_0 T_{11}^L$, is given by
$$ \mathbb{E} w_1 = \begin{cases} \dfrac{2}{1 - \mathbb{E}\rho_0} & \text{if } \mathbb{E}\rho_0 < 1 \\[2ex] \infty & \text{if } \mathbb{E}\rho_0 \ge 1 \end{cases} \qquad [21] $$
Note that in the latter case (and bearing in mind $\eta < 0$), the random walk starting from 1 will eventually drift to $+\infty$, thus making only a finite number of visits to 0, but the expected number of such visits is infinite.
In fact, our goal here is to characterize the distribution of $w_1$ under the law $\mathbb{P}$. To this end, observe that the excursion $T_{11}^L$ involves at least two steps (the first and the last ones) and, possibly, several left excursions from 0, each with mean time $w_0 = E_\omega^0 T_{00}^L$. Therefore,
$$ w_1 = 2 + \sum_{j=1}^{\infty} q_0^j p_0\, (j w_0) = 2 + \rho_0 w_0 \qquad [22] $$
By the translation invariance of the environment, the random variables $w_1$ and $w_0$ have the same distribution. Furthermore, similarly to recursion [22], we have $w_0 = 2 + \rho_{-1} w_{-1}$. This implies that $w_0$ is a function of $p_x$ with $x \le -1$ only, and hence $w_0$ and $\rho_0$ are independent random variables. Introducing the Laplace transform $\varphi(s) = \mathbb{E} \exp(-s w_1)$ and conditioning on $\rho_0$, from eqn [22] we get the equation
$$ \varphi(s) = e^{-2s}\, \mathbb{E} \varphi(s \rho_0) \qquad [23] $$
Suppose that
$$ 1 - \varphi(s) \sim a s^\kappa, \quad s \to 0 $$
then eqn [23] amounts to
$$ 1 - a s^\kappa + \cdots = (1 - 2s + \cdots)(1 - a s^\kappa\, \mathbb{E}\rho_0^\kappa + \cdots) $$
Expanding the product on the right, one can see that a solution with $\kappa = 1$ is possible only if $\mathbb{E}\rho_0 < 1$, in which case
$$ a = \mathbb{E} w_1 = \frac{2}{1 - \mathbb{E}\rho_0} $$
We have already obtained this result in eqn [21]. The case $\kappa < 1$ is possible if $\mathbb{E}\rho_0^\kappa = 1$, which is exactly eqn [20]. Returning to $w_1$, one expects a slow decay of the distribution tail,
$$ \mathbb{P}\{w_1 > t\} \sim b\, t^{-\kappa}, \quad t \to \infty $$
In particular, in this case the annealed mean duration of the left excursion appears to be infinite.
Although the above considerations point to the critical parameter $\kappa$, eqn [20], which may be expected to determine the slowdown scale, they provide little explanation of a mechanism of the slowdown phenomenon. Heuristically, it is natural to attribute the slowdown effects to the presence of "traps" in the environment, which may be thought of as regions that are easy to enter but hard to leave. In the one-dimensional case, such a trap would occur, for example, between two long series of successive sites where the probabilities $p_x$ are fairly large (on the left) and small (on the right).
Remarkably, traps can be characterized quantitatively with regard to the properties of the random environment, by linking them to certain large-deviation effects (see Sznitman (2002, 2004)). The key role in this analysis is played by the function $F(u) := \ln \mathbb{E}\rho_0^u$, $u \in \mathbb{R}$. Suppose that $\eta = \mathbb{E} \ln \rho_0 < 0$ (so that by Theorem 1 the RWRE tends to $+\infty$, $\mathbf{P}_0$-a.s.) and also that $\mathbb{E}\rho_0 > 1$ and $\mathbb{E}\rho_0^{-1} > 1$ (so that by Theorem 2, $v = 0$). The latter means that $F(1) > 0$ and $F(-1) > 0$, and since $F$ is a smooth strictly convex function and $F(0) = 0$, it follows that there is the second root $0 < \kappa < 1$, so that $F(\kappa) = 0$, that is, $\mathbb{E}\rho_0^\kappa = 1$ (cf. eqn [20]).
Let us estimate the probability to have a trap in $U = [-L, L]$ where the RWRE will spend anomalously long time. Using eqn [11], observe that
$$ P_\omega^1\{T_0 < T_{L+1}\} \ge 1 - \exp\{-L S_L\} $$
where $S_L := L^{-1} \sum_{x=1}^{L} \ln \rho_x \to \eta < 0$ as $L \to \infty$. However, due to large deviations $S_L$ may exceed level $\varepsilon > 0$ with probability
$$ \mathbb{P}\{S_L > \varepsilon\} \sim \exp\{-L I(\varepsilon)\}, \quad L \to \infty $$
where $I(x) := \sup_u \{ux - F(u)\}$ is the Legendre transform of $F$. We can optimize this estimate by assuming that $L$ is of order $\ln n$ and minimizing the ratio $I(\varepsilon)/\varepsilon$. Note that $F(u)$ can be expressed via the inverse Legendre transform, $F(u) = \sup_x \{xu - I(x)\}$, and it is easy to see that if $\kappa := \min_{\varepsilon > 0} I(\varepsilon)/\varepsilon$, then $F(\kappa) = 0$, so $\kappa$ is the second (positive) root of $F$.
The "left" probability $P_\omega^{-1}\{T_0 < T_{-L-1}\}$ is estimated in a similar fashion, and one can deduce that for some constants $K > 0$, $c > 0$, and any $\kappa' > \kappa$, for large $n$
$$ \mathbb{P}\left\{ P_\omega^0 \left\{ \max_{k \le n} |X_k| \le K \ln n \right\} \ge c \right\} \ge n^{-\kappa'} $$
That is to say, this is a bound on the probability to see a trap centered at 0, of size $\ln n$, which will retain the RWRE for at least time $n$. It can be shown that, typically, there will be many such traps both in $[-n^{\kappa'}, 0]$ and $[0, n^{\kappa'}]$, which will essentially
prevent the RWRE from moving at distance $n^{\kappa'}$ from the origin before time $n$. In particular, it follows that $\lim_{n\to\infty} X_n/n^{\kappa'} = 0$ for any $\kappa' > \kappa$, so recalling that $0 < \kappa < 1$, we have indeed a sublinear growth of $X_n$. This result is more informative as compared to Theorem 2 (the case $v = 0$), and it clarifies the role of traps (see more details in Sznitman (2004)). The nontrivial behavior of the RWRE on the precise growth scale, $n^\kappa$, is characterized in the next section.

Limit Distributions

Considerations in the previous section suggest that the exponent $\kappa$, defined as the solution of eqn [20], characterizes environments in terms of duration of left excursions. These heuristic arguments are confirmed by a limit theorem by Kesten et al. (1975), which specifies the slowdown scale. We state here the most striking part of their result. Denote $\ln^+ u := \max\{\ln u, 0\}$; by an arithmetic distribution one means a probability law on $\mathbb{R}$ concentrated on the set of points of the form $0, \pm c, \pm 2c, \ldots$.

Theorem 3 Assume that $-\infty \le \eta = \mathbb{E} \ln \rho_0 < 0$ and the distribution of $\ln \rho_0$ is nonarithmetic (excluding a possible atom at $-\infty$). Suppose that the root $\kappa$ of eqn [20] is such that $0 < \kappa < 1$ and $\mathbb{E}\rho_0^\kappa \ln^+ \rho_0 < \infty$. Then
$$ \lim_{n\to\infty} \mathbf{P}_0\{n^{-1/\kappa} T_n \le t\} = L_\kappa(t) $$
$$ \lim_{t\to\infty} \mathbf{P}_0\{t^{-\kappa} X_t \le x\} = 1 - L_\kappa(x^{-1/\kappa}) $$
where $L_\kappa(\cdot)$ is the distribution function of a stable law with index $\kappa$, concentrated on $[0, \infty)$.

General information on stable laws can be found in many probability books; we only mention here that the Laplace transform of a stable distribution on $[0, \infty)$ with index $\kappa$ has the form $\varphi(s) = \exp\{-C s^\kappa\}$.
Kesten et al. (1975) also consider the case $\kappa \ge 1$. Note that for $\kappa > 1$, we have $\mathbb{E}\rho_0 < (\mathbb{E}\rho_0^\kappa)^{1/\kappa} = 1$, so $v > 0$ by eqn [14]. For example, if $\kappa > 2$ then, as expected (see the previous section), there exists a nonrandom $\sigma^2 > 0$ such that
$$ \lim_{n\to\infty} \mathbf{P}_0\left\{ \frac{T_n - n/v}{\sigma \sqrt{n}} \le t \right\} = \Phi(t) $$
$$ \lim_{t\to\infty} \mathbf{P}_0\left\{ \frac{X_t - tv}{v^{3/2} \sigma \sqrt{t}} \le x \right\} = \Phi(x) $$
Let us describe an elegant idea of the proof based on a suitable renewal structure. (1) Let $U_i^n$ ($i \le n$) be the number of left excursions starting from $i$ up to time $T_n$, and note that $T_n = n + 2 \sum_i U_i^n$. Since the walk is transient to $+\infty$, the sum $\sum_{i \le 0} U_i^n$ is finite ($\mathbf{P}_0$-a.s.) and so does not affect the limit. (2) Observe that if the environment $\omega$ is fixed then the conditional distribution of $U_j^n$, given $U_{j+1}^n, \ldots, U_n^n = 0$, is the same as the distribution of the sum of $1 + U_{j+1}^n$ i.i.d. random variables $V_1, V_2, \ldots$, each with geometric distribution $P_\omega^0\{V_i = k\} = p_j q_j^k$ ($k = 0, 1, 2, \ldots$). Therefore, the sum $\sum_{i=1}^{n} U_i^n$ (read from right to left) can be represented as $\sum_{t=0}^{n-1} Z_t$, where $Z_0 = 0, Z_1, Z_2, \ldots$ is a branching process (in random environment $\{p_j\}$) with one immigrant at each step and the geometric offspring distribution with parameter $p_j$ for each particle present at time $j$. (3) Consider the successive "regeneration" times $\tau_k^*$, at which the process $Z_t$ vanishes. The partial sums $W_k := \sum_{\tau_k^* \le t < \tau_{k+1}^*} Z_t$ form an i.i.d. sequence, and the proof amounts to showing that the sum of $W_k$ has a stable limit of index $\kappa$. (4) Finally, the distribution of $W_0$ can be approximated using $M_0 := \sum_{t=1}^{\infty} \prod_{j=0}^{t-1} \rho_j$ (cf. eqn [11]), which is the quenched mean number of total progeny of the immigrant at time $t = 0$. Using Kesten's renewal theorem, it can be checked that $\mathbb{P}\{M_0 > x\} \sim K x^{-\kappa}$ as $x \to \infty$, so $M_0$ is in the domain of attraction of a stable law with index $\kappa$, and the result follows.
Let us emphasize the significance of the regeneration times $\tau_i^*$. Returning to the original random walk, one can see that these are times at which the RWRE hits a new "record" on its way to $+\infty$, never to backtrack again. The same idea plays a crucial role in the analysis of the RWRE in higher dimensions (see the subsections "Zero–one laws and LLNs" and "Kalikow's condition and Sznitman's condition $(T')$").
Finally, note that the condition $-\infty \le \eta < 0$ allows $\mathbb{P}\{p_0 = 1\} > 0$, so the distribution of $\rho_0$ may have an atom at 0 (and hence $\ln \rho_0$ at $-\infty$). In view of eqn [20], no atom is possible at $+\infty$. The restriction for the distribution of $\ln \rho_0$ to be nonarithmetic is important. This will be illustrated in the section "Diode model," where we discuss the model of random diodes.

Sinai's Localization

The results discussed in the previous section indicate that the less transient the RWRE is (i.e., the critical exponent $\kappa$ decreasing to zero), the slower it moves. Sinai (1982) proved a remarkable theorem showing that for the recurrent RWRE (i.e., with $\eta = \mathbb{E} \ln \rho_0 = 0$), the slowdown effect is exhibited in a striking way.
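Since the critical exponent $\kappa$ of eqn [20] governs how slowly the walk moves, it is useful to be able to compute it for a concrete environment. The sketch below finds the positive root of $F(u) = \ln \mathbb{E}\rho_0^u$ by bisection, assuming $\rho_0$ takes finitely many values, $\mathbb{E}\ln\rho_0 < 0$, and $\rho_0 > 1$ with positive probability; the function name `critical_exponent` and the bracketing strategy are illustrative choices rather than anything prescribed by the article.

```python
import math


def critical_exponent(values, probs, iterations=200):
    """Positive root kappa of E rho^kappa = 1 for a finite-valued rho >= 0.

    Assumes E ln(rho) < 0 (so F < 0 just to the right of 0) and
    max(values) > 1 (so F(u) > 0 for large u); F is convex with F(0) = 0.
    """
    if max(values) <= 1.0:
        raise ValueError("no positive root: rho never exceeds 1")

    def F(u):
        return math.log(sum(pr * v ** u for v, pr in zip(values, probs)))

    hi = 1.0
    while F(hi) <= 0.0:   # move right until F is positive
        hi *= 2.0
    lo = hi
    while F(lo) >= 0.0:   # move left until F is negative
        lo /= 2.0
    for _ in range(iterations):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if F(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Canonical model [13] with alpha = 0.3, beta = 0.2: rho is 4 or 1/4
kappa = critical_exponent([4.0, 0.25], [0.3, 0.7])
print("kappa =", kappa, " closed form:", math.log(7.0 / 3.0) / math.log(4.0))
```

For the two-valued model the equation $\alpha r^\kappa + (1-\alpha) r^{-\kappa} = 1$ with $r = (1-\beta)/\beta$ is a quadratic in $r^\kappa$, which gives the closed form $\kappa = \ln((1-\alpha)/\alpha)/\ln r$ used for comparison above.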
Theorem 4 Suppose that the environment $\{p_x\}$ is i.i.d. and elliptic, eqn [5], and assume that $\mathbb{E} \ln \rho_0 = 0$, with $\mathbb{P}\{\rho_0 = 1\} < 1$. Denote $\sigma^2 := \mathbb{E} \ln^2 \rho_0$, $0 < \sigma^2 < \infty$. Then there exists a function $W_n = W_n(\omega)$ of the random environment such that for any $\varepsilon > 0$
$$ \lim_{n\to\infty} \mathbf{P}_0\left\{ \left| \frac{\sigma^2 X_n}{\ln^2 n} - W_n \right| > \varepsilon \right\} = 0 \qquad [24] $$
Moreover, $W_n$ has a limit distribution:
$$ \lim_{n\to\infty} \mathbb{P}\{W_n \le x\} = G(x) \qquad [25] $$
and thus also the distribution of $\sigma^2 X_n / \ln^2 n$ under $\mathbf{P}_0$ converges to the same distribution $G(x)$.

Sinai's theorem shows that in the recurrent case, the RWRE considered on the spatial scale $\ln^2 n$ becomes localized near some random point (depending on the environment only). This phenomenon, frequently referred to as "Sinai's localization," indicates an extremely strong slowdown of the motion as compared with the ordinary diffusive behavior.
Following Révész (1990), let us explain heuristically why $X_n$ is measured on the scale $\ln^2 n$. Rewrite eqn [11] as
$$ P_\omega^1\{T_n < T_0\} = \left( 1 + \sum_{x=1}^{n-1} \exp(Y_x) \right)^{-1} \qquad [26] $$
where $Y_x$ is defined in eqn [12]. By the CLT, the typical size of $|Y_x|$ for large $x$ is of order of $\sqrt{x}$, and so eqn [26] yields
$$ P_\omega^1\{T_n < T_0\} \approx \exp\{-\sqrt{n}\} $$
This suggests that the walk started at site 1 will make about $\exp\{\sqrt{n}\}$ visits to the origin before reaching level $n$. Therefore, the first passage to site $n$ takes at least time $\exp\{\sqrt{n}\}$. In other words, one may expect that a typical displacement after $n$ steps will be of order of $\ln^2 n$ (cf. eqn [24]). This argument also indicates, in the spirit of the trapping mechanism of slowdown discussed at the end of the section "Critical exponent, excursions, and traps," that there is typically a trap of size $\ln^2 n$, which retains the RWRE until time $n$.
It has been shown (independently by H Kesten and A O Golosov) that the limit in [25] coincides with the distribution of a certain functional of the standard Brownian motion, with the density function
$$ G'(x) = \frac{2}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \exp\left\{ -\frac{(2k + 1)^2 \pi^2}{8} |x| \right\} $$

Environment Viewed from the Particle

This important technique, dating back to Kozlov and Molchanov (1984), has proved to be quite efficient in the study of random motions in random media. The basic idea is to focus on the evolution of the environment viewed from the current position of the walk.
Let $\theta$ be the shift operator acting on the space of environments $\Omega = \{\omega\}$ as follows:
$$ \omega = \{p_x\} \mapsto \theta\omega = \{p_{x+1}\} $$
Consider the process
$$ \omega_n := \theta^{X_n} \omega, \quad \omega_0 = \omega $$
which describes the state of the environment from the point of view of an observer moving along with the random walk $X_n$. One can show that $\omega_n$ is a Markov chain (with respect to both $P_\omega^0$ and $\mathbf{P}_0$), with the transition kernel
$$ T(\omega, d\omega') = p_0\, \delta_{\theta\omega}(d\omega') + q_0\, \delta_{\theta^{-1}\omega}(d\omega') \qquad [27] $$
and the respective initial law $\delta_\omega$ or $\mathbb{P}$ (here $\delta_\omega$ is the Dirac measure, i.e., unit mass at $\omega$).
This fact as it stands may not seem to be of any practical use, since the state space of this Markov chain is very complex. However, the great advantage is that one can find an explicit invariant probability $Q$ for the kernel $T$ (i.e., such that $QT = Q$), which is absolutely continuous with respect to $\mathbb{P}$.
More specifically, assume that $\mathbb{E}\rho_0 < 1$ and set $Q = f(\omega)\mathbb{P}$, where (cf. eqn [14])
$$ f = v (1 + \rho_0) \sum_{x=0}^{\infty} \prod_{j=1}^{x} \rho_j, \qquad v = \frac{1 - \mathbb{E}\rho_0}{1 + \mathbb{E}\rho_0} \qquad [28] $$
Using independence of $\{\rho_x\}$, we note
$$ \int_\Omega Q(d\omega) = \mathbb{E} f = (1 - \mathbb{E}\rho_0) \sum_{x=0}^{\infty} (\mathbb{E}\rho_0)^x = 1 $$
hence $Q$ is a probability measure on $\Omega$. Furthermore, for any bounded measurable function $g$ on $\Omega$ we have
$$ QTg = \int_\Omega Tg(\omega)\, Q(d\omega) = \mathbb{E} f\, Tg = \mathbb{E} f \left[ p_0 (g \circ \theta) + q_0 (g \circ \theta^{-1}) \right] = \mathbb{E} g \left[ (p_0 f) \circ \theta^{-1} + (q_0 f) \circ \theta \right] \qquad [29] $$
By eqn [28],
$$ (p_0 f) \circ \theta^{-1} = v\, p_{-1} (1 + \rho_{-1}) \sum_{x=0}^{\infty} \prod_{j=1}^{x} \rho_{j-1} = v \left( 1 + \rho_0 \sum_{x=0}^{\infty} \prod_{j=1}^{x} \rho_j \right) = v + \frac{\rho_0}{1 + \rho_0} f $$
and similarly
$$ (q_0 f) \circ \theta = -v + \frac{1}{1 + \rho_0} f $$
So from eqn [29] we obtain
$$ QTg = \mathbb{E}(g f) = \int_\Omega g(\omega)\, Q(d\omega) = Qg $$
which proves the invariance of $Q$.
To illustrate the environment method, let us sketch the proof of Solomon's result on the asymptotic velocity (see Theorem 2). Set $d(x, \omega) := E_\omega^x (X_1 - X_0) = p_x - q_x$. Noting that $d(x, \omega) = d(0, \theta^x \omega)$, define
$$ D_n := \sum_{i=1}^{n} d(X_{i-1}, \omega) = \sum_{i=1}^{n} d(0, \theta^{X_{i-1}} \omega) $$
Due to the Markov property, the process $M_n := X_n - D_n$ is a martingale with respect to the natural filtration $\mathcal{F}_n = \sigma\{X_1, \ldots, X_n\}$ and the law $P_\omega^0$,
$$ E_\omega^0 [M_{n+1} \mid \mathcal{F}_n] = M_n, \quad P_\omega^0\text{-a.s.} $$
and it has bounded jumps, $|M_n - M_{n-1}| \le 2$. By general results, this implies $M_n/n \to 0$ ($P_\omega^0$-a.s.).
On the other hand, by Birkhoff's ergodic theorem
$$ \lim_{n\to\infty} \frac{D_n}{n} = \int_\Omega d(0, \omega)\, Q(d\omega), \quad \mathbf{P}_0\text{-a.s.} $$
The last integral is easily evaluated to yield
$$ \mathbb{E}(p_0 - q_0) f = v\, \mathbb{E} \sum_{x=0}^{\infty} \prod_{j=1}^{x} \rho_j\, (1 - \rho_0) = v (1 - \mathbb{E}\rho_0) \sum_{x=0}^{\infty} (\mathbb{E}\rho_0)^x = v $$
and the first part of the formula [14] follows.
The case $\mathbb{E}\rho_0 \ge 1$ can be handled using a comparison argument (Sznitman 2004). Observe that if $p_x \le \tilde{p}_x$ for all $x$ then for the corresponding random walks we have $X_t \le \tilde{X}_t$ ($P_\omega^0$-a.s.). We now define a suitable dominating random medium by setting (for $\gamma > 0$)
$$ \tilde{p}_x := \frac{p_x}{1 + \gamma} + \frac{\gamma}{1 + \gamma} \ge p_x $$
Then $\mathbb{E}\tilde{\rho}_0 = \mathbb{E}\, q_0/(p_0 + \gamma) < 1$ if $\gamma$ is large enough, so by the first part of the theorem, $P_\omega^0$-a.s.,
$$ \limsup_{n\to\infty} \frac{X_n}{n} \le \lim_{n\to\infty} \frac{\tilde{X}_n}{n} = \frac{1 - \mathbb{E}\tilde{\rho}_0}{1 + \mathbb{E}\tilde{\rho}_0} \qquad [30] $$
Note that $\mathbb{E}\tilde{\rho}_0$ is a continuous function of $\gamma$ with values in $[0, \mathbb{E}\rho_0] \ni 1$, so there exists $\gamma^*$ such that $\mathbb{E}\tilde{\rho}_0$ attains the value 1. Passing to the limit in eqn [30] as $\gamma \downarrow \gamma^*$, we obtain $\limsup_{n\to\infty} X_n/n \le 0$ ($P_\omega^0$-a.s.). Similarly, we get the reverse inequality, which proves the second part of the theorem.
A more prominent advantage of the environment method is that it naturally leads to statements of CLT type. A key step is to find a function $H(x, t, \omega) = x - vt + h(x, \omega)$ (called "harmonic coordinate") such that the process $H(X_n, n, \omega)$ is a martingale. To this end, by the Markov property it suffices to have
$$ E_\omega^{X_n} H(X_{n+1}, n + 1, \omega) = H(X_n, n, \omega), \quad P_\omega^0\text{-a.s.} $$
For $\Delta(x, \omega) := h(x + 1, \omega) - h(x, \omega)$ this condition leads to the equation
$$ \Delta(x, \omega) = \rho_x \Delta(x - 1, \omega) + v - 1 + (1 + v) \rho_x $$
If $\mathbb{E}\rho_0 < 1$ (so that $v > 0$), there exists a bounded solution
$$ \Delta(x, \omega) = v - 1 + 2v \sum_{k=0}^{\infty} \prod_{i=0}^{k} \rho_{x-i} $$
and we note that $\Delta(x, \omega) = \Delta(0, \theta^x \omega)$ is a stationary sequence with mean $\mathbb{E}\Delta(x, \omega) = 0$. Finally, setting $h(0, \omega) = 0$ we find
$$ h(x, \omega) = \begin{cases} \displaystyle\sum_{k=0}^{x-1} \Delta(k, \omega), & x > 0 \\[2ex] \displaystyle -\sum_{k=1}^{-x} \Delta(-k, \omega), & x < 0 \end{cases} $$
As a result, we have the representation
$$ X_n - nv = H(X_n, n, \omega) - h(X_n, \omega) \qquad [31] $$
For a fixed $\omega$, one can apply a suitable CLT for martingale differences to the martingale term in eqn [31], while using that $X_n \sim nv$ ($\mathbf{P}_0$-a.s.), the second term in eqn [31] is approximated by the sum $\sum_{k=0}^{nv} \Delta(k, \omega)$, which can be handled via a CLT for stationary sequences. This way, we arrive at the following result.

Theorem 5 Suppose that the environment is elliptic, eqn [5], and such that $\mathbb{E}\rho_0^{2+\varepsilon} < 1$ for some $\varepsilon > 0$ (which implies that $\mathbb{E}\rho_0 < 1$ and hence $v > 0$). Then there exists a nonrandom $\sigma^2 > 0$ such that
$$ \lim_{n\to\infty} \mathbf{P}_0\left\{ \frac{X_n - nv}{\sqrt{n \sigma^2}} \le x \right\} = \Phi(x) $$
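The law of large numbers behind Theorem 5 can be observed in a direct simulation. The following is a rough sketch, assuming the canonical two-valued environment [13] with illustrative parameters $\alpha = 0.05$, $\beta = 0.25$ (for which $\mathbb{E}\rho_0^2 \approx 0.56 < 1$, so the moment condition of Theorem 5 holds for small $\varepsilon$); the environment is sampled lazily, site by site, as the walk discovers it. The function names are hypothetical.

```python
import random


def asymptotic_velocity(alpha, beta):
    """v from eqn [14] in the case E rho_0 < 1, for the canonical model [13]."""
    r = (1.0 - beta) / beta
    e_rho = alpha * r + (1.0 - alpha) / r
    assert e_rho < 1.0, "this sketch only covers the case E rho_0 < 1"
    return (1.0 - e_rho) / (1.0 + e_rho)


def simulate_rwre(n_steps, alpha, beta, seed=0):
    """Run one walk of n_steps in one sampled environment; return X_n."""
    rng = random.Random(seed)
    env = {}  # site -> p_x, filled in when a site is first visited
    x = 0
    for _ in range(n_steps):
        if x not in env:
            env[x] = beta if rng.random() < alpha else 1.0 - beta
        x += 1 if rng.random() < env[x] else -1
    return x


alpha, beta, n_steps = 0.05, 0.25, 200_000
x_n = simulate_rwre(n_steps, alpha, beta)
v = asymptotic_velocity(alpha, beta)
print("X_n / n =", x_n / n_steps, " predicted v =", v)
```

A single long run should give $X_n/n$ close to $v$; repeating the run over many seeds and histogramming $(X_n - nv)/\sqrt{n}$ would be the natural next step for visualizing the Gaussian limit, with $\sigma^2$ estimated from the samples since the theorem does not give it in closed form.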
Note that this theorem is parallel to the result by Kesten et al. (1975) on asymptotic normality when $\kappa > 2$ (see the section "Limit distributions"). The moment assumptions in Theorem 5 are more restrictive, but they can be relaxed. On the other hand, Theorem 5 does not impose the nonarithmetic condition on the distribution of the environment (cf. Theorem 3). More importantly, the environment method proves to be quite efficient in more general situations, including non-i.i.d. environments and higher dimensions (at least in some cases, e.g., for random bonds RWRE and balanced RWRE discussed subsequently).

Diode Model

In the preceding sections (except in the section "Limit distributions," where however we were limited to a nonarithmetic case), we assumed that $0 < p_x < 1$ and therefore excluded the situation where there are sites through which motion is permitted in one direction only. Allowing for such a possibility leads to the "diode model" (Solomon 1975). Specifically, suppose that
$$ \mathbb{P}\{p_x = \beta\} = \alpha, \quad \mathbb{P}\{p_x = 1\} = 1 - \alpha \qquad [32] $$
with $0 < \alpha < 1$, $0 < \beta < 1$, so that with probability $\alpha$ a point $x \in \mathbb{Z}$ is a usual two-way site and with probability $1 - \alpha$ it is a repelling barrier ("diode"), through which passage is only possible from left to right. This is an interesting example of statistically inhomogeneous medium, where the particle motion is strongly irreversible due to the presence of special semipenetrable nodes. The principal mathematical advantage of such a model is that the random walk can be decomposed into independent excursions from one diode to the next.
Due to diodes, the RWRE will eventually drift to $+\infty$. If $\beta > 1/2$, then on average it moves faster than in a nonrandom environment with $p_x \equiv \beta$. The situation where $\beta \le 1/2$ is potentially more interesting, as then there is a competition between the local drift of the walk to the left (in ordinary sites) and the presence of repelling diodes on its way. Note that $\mathbb{E}\rho_0 = \alpha\rho$, where $\rho := (1 - \beta)/\beta$, so the condition $\mathbb{E}\rho_0 < 1$ amounts to $\beta > \alpha/(1 + \alpha)$. In this case (which includes $\beta > 1/2$), formula [14] for the asymptotic velocity applies.
As explained in the section "Critical exponent, excursions, and traps," the quenched mean duration $w$ of the left excursion has Laplace transform given by eqn [23], which now reads
$$ \varphi(s) = e^{-2s} \{1 - \alpha + \alpha \varphi(\rho s)\} $$
This equation is easily solved by iterations:
$$ \varphi(s) = (1 - \alpha) \sum_{k=0}^{\infty} \alpha^k e^{-s t_k}, \qquad t_k := 2 \sum_{j=0}^{k} \rho^j \qquad [33] $$
hence the distribution of $w$ is given by
$$ \mathbb{P}\{w = t_k\} = (1 - \alpha) \alpha^k, \quad k = 0, 1, \ldots $$
This result has a transparent probabilistic meaning. In fact, the factor $(1 - \alpha)\alpha^k$ is the probability that the nearest diode on the left of the starting point occurs at distance $k + 1$, whereas $t_k$ is the corresponding mean excursion time. Note that formula [33] for $t_k$ easily follows from the recursion $t_k = 2 + \rho t_{k-1}$ (cf. eqn [22]) with the boundary condition $t_0 = 2$.
A self-similar hierarchy of timescales [33] indicates that the process will exhibit temporal oscillations. Indeed, for $\alpha\rho > 1$ the average waiting time until passing through a valley of ordinary sites of length $k$ is asymptotically proportional to $t_k \sim 2\rho^k$, so one may expect the annealed mean displacement $\mathbf{E}_0 X_n$ to have a local minimum at $n \approx t_k$. Passing to logarithms, we note that $\ln t_{k+1} - \ln t_k \sim \ln \rho$, which suggests the occurrence of persistent oscillations on the logarithmic timescale, with period $\ln \rho$ (see Figure 2). This was confirmed by Bernasconi and Schneider (1985) who showed that for $\alpha\rho > 1$
$$ \mathbf{E}_0 X_n \sim n^\kappa F(\ln n), \quad n \to \infty \qquad [34] $$
where $\kappa = -\ln \alpha / \ln \rho < 1$ is the solution of eqn [20] and the function $F$ is periodic with period $\ln \rho$ (see Figure 2).
In contrast, for $\alpha\rho = 1$ one has
$$ \mathbf{E}_0 X_n \sim \frac{n \ln \rho}{2 \ln n}, \quad n \to \infty $$
and there are no oscillations of the above kind.
These results illuminate the earlier analysis of the diode model by Solomon (1975), which in the main has revealed the following. If $\alpha\rho = 1$, then $X_n$ satisfies the strong LLN:
$$ \lim_{n\to\infty} \frac{X_n}{n / \ln n} = \frac{\ln \rho}{2}, \quad \mathbf{P}_0\text{-a.s.} $$
while in the case $\alpha\rho > 1$ the asymptotic behavior of $X_n$ is quite complicated and unusual: if $n_i \to \infty$ is a sequence of integers such that $\{\ln n_i\} \to \gamma$ (here $\{a\} = a - [a]$ denotes the fractional part of $a$), then the distribution of $n_i^{-\kappa} X_{n_i}$ under $\mathbf{P}_0$ converges to a nondegenerate distribution which depends on $\gamma$. Thus, the very existence of the limiting distribution
Random Walks in Random Environments 363

[Figure 2: plot of $\ln(n^{-1/2}\,\mathbf{E}_0 X_n)$ (vertical axis, ticks at 0.1 and 0.2) against $\ln n$ (horizontal axis, 0 to 10).]

Figure 2 Temporal oscillations for the diode model, eqn [32]. Here $\alpha = 0.3$ and $\rho = 1/0.09$, so that $\alpha\rho > 1$ and $\kappa = 1/2$. The dots represent an average of Monte Carlo simulations over 10 000 samples of the environment with a random walk of 200 000 steps in each realization. The broken curve refers to the exact asymptotic solution [34]. The arrows indicate the simulated locations of the minima $t_k$, the asymptotic spacing of which is predicted to be $\ln\rho \approx 2.41$. Reproduced from Bernasconi J and Schneider WR (1982). Diffusion on a one-dimensional lattice with random asymmetric transition rates. Journal of Physics A: Mathematical and General 15: L729–L734, by permission of IOP Publishing Ltd.
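The parameter values quoted in the caption of Figure 2 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `diode_exponents` is hypothetical, and it assumes the relation $\kappa = -\ln\alpha/\ln\rho$ stated in the text for the diode model.

```python
import math


def diode_exponents(alpha: float, rho: float) -> tuple[float, float]:
    """Return (kappa, period) for the diode model.

    kappa = -ln(alpha) / ln(rho) is the exponent in E_0 X_n ~ n^kappa F(ln n);
    period = ln(rho) is the predicted period of F on the ln n scale.
    """
    if not (0.0 < alpha < 1.0) or rho <= 1.0:
        raise ValueError("expected 0 < alpha < 1 and rho > 1")
    kappa = -math.log(alpha) / math.log(rho)
    period = math.log(rho)
    return kappa, period


# Parameters from the caption of Figure 2.
kappa, period = diode_exponents(alpha=0.3, rho=1 / 0.09)
print(f"kappa = {kappa:.4f}, period = {period:.2f}")
```

Since $0.09 = 0.3^2$, the exponent comes out as exactly one half, and the period agrees with the spacing of the minima quoted in the caption.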

of $X_n$ and the limit itself heavily depend on the subsequence $n_i$ chosen to approach infinity.

This should be compared with a more “regular” result, Theorem 3. Note that almost all the conditions of this theorem are satisfied in the diode model, except that here the distribution of $\ln\rho_0$ is arithmetic (recall that the value $\ln\rho_0 = -\infty$ is permissible), so it is the discreteness of the environment distribution that does not provide enough “mixing” and hence leads to such peculiar features of the asymptotics.

Some Generalizations and Variations

Most of the results discussed above in the simplest context of RWRE with nearest-neighbor jumps in an i.i.d. random environment have been extended to some other cases. One natural generalization is to relax the i.i.d. assumption, for example, by considering stationary ergodic environments (see details in Zeitouni (2004)). In this context, one relies on an ergodic theorem instead of the usual strong LLN. For instance, this way one readily obtains an extension of Solomon’s criterion of transience versus recurrence (see Theorem 1). Other examples include an LLN (along with a formula for the asymptotic velocity, cf. Theorem 2), a CLT and stable laws for the asymptotic distribution of $X_n$ (cf. Theorem 3), and Sinai’s localization result for the recurrent RWRE (cf. Theorem 4). Usually, however, ergodic theorems cannot be applied directly (like, e.g., to $X_n$, as the sequence $X_n - X_{n-1}$ is not stationary). In this case, one rather uses the hitting times which possess the desired stationarity (cf. the sections “Asymptotic velocity” and “Critical exponent, excursions, and traps”). In some situations, in addition to stationarity, one needs suitable mixing conditions in order to ensure enough decoupling (e.g., in Sinai’s problem). The method of environment viewed from the particle (discussed earlier) is also suited very well to dealing with stationarity.

In the remainder of this section, we describe some other generalizations including RWRE with bounded jumps, RWRE where randomness is attached to bonds rather than sites, and continuous-time (symmetric) RWRE driven by the randomized master equation.

RWRE with Bounded Jumps

The previous discussion was restricted to the case of RWRE with nearest-neighbor jumps. A natural extension is RWRE with bounded jumps. Let $L$, $R$ be fixed natural numbers, and suppose that from each site $x \in \mathbb{Z}$ jumps are only possible to the sites $x + i$, $i = -L, \dots, R$, with (random) probabilities

$$p_x(i) \ge 0, \qquad \sum_{i=-L}^{R} p_x(i) = 1 \qquad [35]$$

We assume that the random vectors $p_x(\cdot)$ determining the environment are i.i.d. for different $x \in \mathbb{Z}$ (although many results can be extended to the stationary ergodic case).

The study of asymptotic properties of such a model is essentially more complex, as it involves products of certain random matrices and hence must use extensively the theory of Lyapunov exponents (see details and further references in Brémont (2004)). Lyapunov exponents, being natural analogs of logarithms of eigenvalues, characterize the asymptotic action of the product of random matrices along (random) principal directions, as described by Oseledec’s multiplicative ergodic theorem. In most situations, however, the Lyapunov spectrum can

only be accessed implicitly, which makes the analysis rather hard.

To explain how random matrices arise here, let us first consider a particular case $R = 1$, $L \ge 1$. Assume that $p_x(-L),\, p_x(1) \ge \delta > 0$ for all $x \in \mathbb{Z}$ (ellipticity condition, cf. eqn [5]), and consider the hitting probabilities $u_n := P^{\omega}_n\{T_0 < \infty\}$, where $T_0 := \min\{t \ge 0 : X_t \le 0\}$ (cf. the section “Transience and recurrence”). By decomposing with respect to the first step, for $n \ge 1$ we obtain the difference equation

$$u_n = p_n(1)\,u_{n+1} + \sum_{i=0}^{L} p_n(-i)\,u_{n-i} \qquad [36]$$

with the boundary conditions $u_0 = \cdots = u_{-L+1} = 1$. Using that $1 = p_n(1) + \sum_{i=0}^{L} p_n(-i)$, we can rewrite eqn [36] as

$$p_n(1)\,(u_n - u_{n+1}) = \sum_{i=1}^{L} p_n(-i)\,(u_{n-i} - u_n)$$

or, equivalently,

$$v_n = \sum_{i=1}^{L} b_n(i)\,v_{n-i} \qquad [37]$$

where $v_i := u_i - u_{i+1}$ and

$$b_n(i) := \frac{p_n(-i) + \cdots + p_n(-L)}{p_n(1)} \qquad [38]$$

Recursion [37] can be written in a matrix form, $V_n = M_n V_{n-1}$, where $V_n := (v_n, \dots, v_{n-L+1})^{\top}$,

$$M_n := \begin{pmatrix} b_n(1) & \cdots & \cdots & b_n(L) \\ 1 & \cdots & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 \end{pmatrix} \qquad [39]$$

and by iterations we get (cf. eqn [10])

$$V_n = M_n \cdots M_1 V_0, \qquad V_0 = (1 - u_1, 0, \dots, 0)^{\top}$$

Note that $M_n$ depends only on the transition probability vector $p_n(\cdot)$, and hence $M_n \cdots M_1$ is the product of i.i.d. random (non-negative) matrices. By Furstenberg–Kesten’s theorem, the limiting behavior of such a product, as $n \to \infty$, is controlled by the largest Lyapunov exponent

$$\gamma_1 := \lim_{n\to\infty} n^{-1} \ln \|M_n \cdots M_1\| \qquad [40]$$

(by Kingman’s subadditive ergodic theorem, the limit exists $\mathbb{P}$-a.s. and is nonrandom). It follows that, $\mathbf{P}_0$-a.s., the RWRE $X_n$ is transient if and only if $\gamma_1 \ne 0$, and moreover, $\lim_{n\to\infty} X_n = +\infty$ ($-\infty$) when $\gamma_1 < 0$ ($> 0$), whereas $\liminf_{n\to\infty} X_n = -\infty$, $\limsup_{n\to\infty} X_n = +\infty$ when $\gamma_1 = 0$.

For orientation, note that if $p_n(i) = p(i)$ are nonrandom constants, then $\gamma_1 = \ln\lambda_1$, where $\lambda_1 > 0$ is the largest eigenvalue of $M_0$, and so $\gamma_1 < 0$ if and only if $\lambda_1 < 1$. The latter means that the characteristic polynomial $\varphi(\lambda) := \det(M_0 - \lambda I)$ satisfies the condition $(-1)^L \varphi(1) > 0$. To evaluate $\det(M_0 - I)$, replace the first column by the sum of all columns and expand to get $\varphi(1) = (-1)^{L-1}(b_1 + \cdots + b_L - 1)$. Substituting expressions [38] it is easy to see that the above condition amounts to $p(1) - \sum_{i=1}^{L} i\,p(-i) > 0$, that is, the mean drift of the random walk is positive and hence $X_n \to +\infty$ a.s.

In the general case, $L \ge 1$, $R \ge 1$, similar considerations lead to the following matrices of order $d := L + R - 1$ (cf. eqn [39]):

$$M_n = \begin{pmatrix} a_n(R-1) & \cdots & a_n(1) & b_n(1) & \cdots & b_n(L) \\ 1 & 0 & \cdots & \cdots & \cdots & 0 \\ 0 & 1 & 0 & \cdots & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & & \vdots \\ 0 & \cdots & \cdots & 0 & 1 & 0 \end{pmatrix}$$

where $b_n(i)$ are given by eqn [38] and

$$a_n(i) := -\,\frac{p_n(i) + \cdots + p_n(R)}{p_n(R)}$$

Suppose that the ellipticity condition is satisfied in the form $p_n(i) \ge \delta > 0$, $i \ne 0$, $-L \le i \le R$, and let $\gamma_1 \ge \gamma_2 \ge \cdots \ge \gamma_d$ be the (nonrandom) Lyapunov exponents of $\{M_n\}$. The largest exponent $\gamma_1$ is again given by eqn [40], while other exponents are determined recursively from the equalities

$$\gamma_1 + \cdots + \gamma_k = \lim_{n\to\infty} n^{-1} \ln \|\wedge^k (M_n \cdots M_1)\|$$

($1 \le k \le d$). Here $\wedge$ denotes the external (antisymmetric) product: $x \wedge y = -y \wedge x$ ($x, y \in \mathbb{R}^d$), and $\wedge^k M$ acts on the external product space $\wedge^k \mathbb{R}^d$, generated by the canonical basis $\{e_{i_1} \wedge \cdots \wedge e_{i_k},\ 1 \le i_1 < \cdots < i_k \le d\}$, as follows:

$$\wedge^k M\,(x_1 \wedge \cdots \wedge x_k) := M(x_1) \wedge \cdots \wedge M(x_k)$$

One can show that all exponents except $\gamma_R$ are sign-definite: $\gamma_{R-1} > 0 > \gamma_{R+1}$. Moreover, it is the sign of $\gamma_R$ that determines whether the RWRE is transient or recurrent, the dichotomy being the same as in the case $R = 1$ above (with $\gamma_1$ replaced by $\gamma_R$). Let us also mention that an LLN and CLT can be proved here (see Brémont (2004)).

In conclusion, let us point out an alternative approach due to Bolthausen and Goldsheid (2000)
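For the case $R = 1$, the largest Lyapunov exponent of eqn [40] can be estimated numerically by iterating the recursion $V_n = M_n V_{n-1}$ and renormalizing at every step. The following is a minimal sketch under stated assumptions, not the method of any of the cited papers: the names `companion_row`, `lyapunov_exponent` and `sample_env` are hypothetical, the max-norm is used, and the start vector has all entries equal to 1 (for non-negative matrices this reproduces the max-row-sum norm of the product).

```python
import math
import random


def companion_row(p_right, p_left):
    """First row (b(1), ..., b(L)) of M_n as in eqn [38].

    p_right is p(1); p_left[i - 1] is p(-i) for i = 1..L.
    """
    return [sum(p_left[i:]) / p_right for i in range(len(p_left))]


def lyapunov_exponent(sample_env, L, n_steps=20000, seed=0):
    """Estimate n^-1 ln ||M_n ... M_1|| by renormalized iteration.

    sample_env(rng) returns (p(1), [p(-1), ..., p(-L)]) for one site.
    """
    rng = random.Random(seed)
    v = [1.0] * L          # all-ones start vector
    log_norm = 0.0
    for _ in range(n_steps):
        p_right, p_left = sample_env(rng)
        b = companion_row(p_right, p_left)
        # V_n = M_n V_{n-1}: first entry is b . v, the others shift down.
        v = [sum(bi * vi for bi, vi in zip(b, v))] + v[:-1]
        norm = max(v)
        log_norm += math.log(norm)
        v = [x / norm for x in v]
    return log_norm / n_steps


# Nonrandom check with L = 2: the estimate should approach ln(lambda_1),
# where lambda_1 is the largest root of lambda^2 = b1 * lambda + b2.
def const_env(rng):
    return 0.7, [0.2, 0.1]   # p(1), [p(-1), p(-2)]; mean drift = 0.7 - 0.2 - 0.2


estimate = lyapunov_exponent(const_env, L=2)
b1, b2 = companion_row(0.7, [0.2, 0.1])
exact = math.log((b1 + math.sqrt(b1 * b1 + 4 * b2)) / 2)
print(estimate, exact)
```

In this nonrandom example the mean drift is positive and the estimated exponent is negative, in line with the criterion stated in the text. Passing a `sample_env` that draws the jump probabilities at random gives an estimate for a genuinely random environment, for which no closed form is available in general.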



who studied a more general RWRE on a strip $\mathbb{Z} \times \{0, 1, \dots, m - 1\}$. The link between these two models is given by the representation $X_n = mY_n + Z_n$, where $m := \max\{L, R\}$, $Y_n \in \mathbb{Z}$, $Z_n \in \{0, \dots, m - 1\}$. Random matrices arising here are constructed indirectly using an auxiliary stationary sequence. Even though these matrices are nonindependent, thanks to their positivity the criterion of transience can be given in terms of the sign of the largest Lyapunov exponent, which is usually much easier to deal with. An additional attractive feature of this approach is that the condition $p_x(R) > 0$ ($\mathbb{P}$-a.s.), which was essential for the previous technique, can be replaced with a more natural condition $\mathbb{P}\{p_x(R) > 0\} > 0$.

Random Bonds RWRE

Instead of having random probabilities of jumps at each site, one could assign random weights to bonds between the sites. For instance, the transition probabilities $p_x = p(x, x + 1, \omega)$ can be defined by

$$p_x = \frac{c_{x,x+1}}{c_{x-1,x} + c_{x,x+1}} \qquad [41]$$

where $c_{x,x+1} > 0$ are i.i.d. random variables on the environment space $\Omega$.

The difference between the two models may not seem very prominent, but the behavior of the walk in the modified model [41] appears to be quite different. Indeed, working as in the section “Transience and recurrence,” we note that

$$\rho_x = \frac{q_x}{p_x} = \frac{c_{x-1,x}}{c_{x,x+1}}$$

hence, exploiting formulas [11] and [41], we obtain, $\mathbb{P}$-a.s.,

$$\frac{1}{1 - u_1} = \sum_{x=0}^{n-1} \frac{c_{01}}{c_{x,x+1}} \sim c_{01}\, n\, \mathbb{E} c_{01}^{-1} \to \infty \qquad [42]$$

since $\mathbb{E} c_{01}^{-1} > 0$. Therefore, $f_{00} = 1$, that is, the random walk is recurrent ($\mathbf{P}_0$-a.s.).

The method of environment viewed from the particle can also be applied here (see Sznitman (2004)). Similarly to the section “Environment viewed from the particle,” we define a new probability measure $Q = f(\omega)\,\mathbb{P}$ using the density

$$f(\omega) = Z^{-1}\left(c_{-1,0}(\omega) + c_{01}(\omega)\right)$$

where $Z = 2\mathbb{E}c_{01}$ is the normalizing constant (we assume that $\mathbb{E}c_{01} < \infty$). One can check that $Q$ is invariant with respect to the transition kernel eqn [41], and by similar arguments as in that section, we obtain that $\lim_{n\to\infty} X_n/n$ exists ($P^{\omega}_0$-a.s.) and is given by

$$\int_{\Omega} d(0, \omega)\, Q(d\omega) = Z^{-1}\, \mathbb{E}\left(c_{01} - c_{-1,0}\right) = 0$$

so the asymptotic velocity vanishes.

Furthermore, under suitable technical conditions on the environment (e.g., $c_{01}$ being bounded away from 0 and $\infty$, cf. eqn [5]), one can prove the following CLT:

$$\lim_{n\to\infty} \mathbf{P}_0\left\{\frac{X_n}{\sqrt{n\sigma^2}} \le x\right\} = \Phi(x) \qquad [43]$$

where $\sigma^2 = (\mathbb{E}c_{01} \cdot \mathbb{E}c_{01}^{-1})^{-1}$. Note that $\sigma^2 \le 1$ (with a strict inequality if $c_{01}$ is not reduced to a constant), which indicates some slowdown in the spatial spread of the random bonds RWRE, as compared to the ordinary symmetric random walk.

Thus, there is a dramatic distinction between the random bonds RWRE, which is recurrent and diffusive, and the random sites RWRE, with a much more complex asymptotics including both transient and recurrent scenarios, slowdown effects, and subdiffusive behavior. This can be explained heuristically by noting that the random bonds RWRE is reversible, that is, $m(x)\,p(x, y) = m(y)\,p(y, x)$ for all $x, y \in \mathbb{Z}$, with $m(x) := c_{x-1,x} + c_{x,x+1}$ (this property also easily extends to multidimensional versions). Hence, it appears impossible to create extended traps which would retain the particle for a very long time. Instead, the mechanism of the diffusive slowdown in a reversible case is associated with the natural variability of the environment resulting in the occasional occurrence of isolated “screening” bonds with an anomalously small weight $c_{x,x+1}$.

Let us point out that the RWRE determined by eqn [41] can be interpreted in terms of the random conductivity model (see Hughes (1996)). Suppose that each random variable $c_{x,x+1}$ attached to the bond $(x, x + 1)$ has the meaning of the conductance of this bond (the reciprocal, $c_{x,x+1}^{-1}$, being its resistance). If a voltage drop $V$ is applied across the system of $N$ successive bonds, say from 0 to $N$, then the same current $I$ flows in each of the conductors and by Ohm’s law we have $I = c_{x,x+1} V_{x,x+1}$, where $V_{x,x+1}$ is the voltage drop across the corresponding bond. Hence

$$V = \sum_{x=0}^{N} V_{x,x+1} = I \sum_{x=0}^{N} c_{x,x+1}^{-1}$$

which amounts to saying that the total resistance of the system of consecutive elements is given by the sum of the individual resistances. The effective
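Two quantities from the random bonds model are easy to evaluate exactly for a discrete distribution of the bond weights: the variance constant in the CLT [43] and the total resistance of a chain of bonds in series. The sketch below is illustrative only. The function names and the two-point example distribution are assumptions and do not come from the source.

```python
from fractions import Fraction


def clt_variance(values, probs):
    """Variance constant (E c * E c^-1)^-1 for a discrete law of c_{01}."""
    mean_c = sum(p * c for c, p in zip(values, probs))
    mean_inv = sum(p / c for c, p in zip(values, probs))
    return 1 / (mean_c * mean_inv)


def series_resistance(conductances):
    """Total resistance of consecutive bonds: the sum of the reciprocals."""
    return sum(1 / c for c in conductances)


# Hypothetical example: c_{01} takes the values 1 and 4 with probability 1/2.
half = Fraction(1, 2)
sigma2 = clt_variance([Fraction(1), Fraction(4)], [half, half])
print(sigma2)  # a value below 1 reflects the slowdown described in the text

# Three bonds with conductances 1, 2 and 4 connected in series.
print(series_resistance([Fraction(1), Fraction(2), Fraction(4)]))
```

Exact rational arithmetic is used so that the inequality between the variance constant and 1 can be checked without rounding error. For a degenerate (constant) distribution the function returns exactly 1.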

conductivity of the finite system, $c_N$, is defined as the average conductance per bond, so that

$$c_N^{-1} = \frac{1}{N} \sum_{x=0}^{N} c_{x,x+1}^{-1}$$

and by the strong LLN, $c_N^{-1} \to \mathbb{E}c_{01}^{-1}$ as $N \to \infty$ ($\mathbb{P}$-a.s.). Therefore, the effective conductivity of the infinite system is given by $c^* = (\mathbb{E}c_{01}^{-1})^{-1}$, and we note that $c^* < \mathbb{E}c_{01}$ if the random medium is nondegenerate.

Returning to the random bonds RWRE, eqn [41], it is easy to see that a site $j$ is recurrent if and only if the conductance $c_{j,\infty}$ between $j$ and $\infty$ equals zero. Using again Ohm’s law, we have (cf. eqn [42])

$$c_{j,+\infty} = \left(\sum_{x=j}^{\infty} c_{x,x+1}^{-1}\right)^{-1} = 0, \qquad \mathbb{P}\text{-a.s.}$$

and we recover the result about recurrence.

Continuous-Time RWRE

As in the discrete-time case, a random walk on $\mathbb{Z}$ with continuous time is a homogeneous Markov chain $X_t$, $t \in [0, \infty)$, with state space $\mathbb{Z}$ and nearest-neighbor (or at least bounded) jumps. The term “Markov” as usual refers to the “lack of memory” property, which amounts to saying that from the entire history of the process development up to a given time, only the current position of the walk is important for the future evolution while all other information is irrelevant. Since there is no smallest time unit as in the discrete-time case, it is convenient to describe transitions of $X_t$ in terms of transition rates characterizing the likelihood of various jumps during a very short time. More precisely, if $p_{xy}(t) := P\{X_t = y \mid X_0 = x\}$ are the transition probabilities over time $t$, then for $h \to 0$

$$p_{xy}(h) = c_{xy}h + o(h) \quad (x \ne y), \qquad p_{xx}(h) = 1 - h \sum_{y \ne x} c_{xy} + o(h) \qquad [44]$$

Equations for the functions $p_{xy}(t)$ can then be derived by adapting the method of decomposition commonly used for discrete-time Markov chains (cf. the section “Transience and recurrence”). Here it is more convenient to decompose with respect to the “last” step, that is, by considering all possible transitions during a small increment of time at the end of the time interval $[0, t + h]$. Using the Markov property and eqn [44] we can write

$$p_{0x}(t + h) = h \sum_{y \ne x} p_{0y}(t)\, c_{yx} + p_{0x}(t)\left(1 - h \sum_{y \ne x} c_{xy}\right) + o(h)$$

which in the limit $h \to 0$ yields the master equation (or Chapman–Kolmogorov’s forward equation)

$$\frac{d}{dt}\, p_{0x}(t) = \sum_{y \ne x} \left\{ c_{yx}\, p_{0y}(t) - c_{xy}\, p_{0x}(t) \right\}, \qquad p_{0x}(0) = \delta_0(x) \qquad [45]$$

where $\delta_0(x)$ is the Kronecker symbol.

Continuous-time RWRE are therefore naturally described via the randomized master equation, that is, with random transition rates. The canonical example, originally motivated by Dyson’s study of the chain of harmonic oscillators with random couplings, is a symmetric nearest-neighbor RWRE, where the random transition rates $c_{xy}$ are nonzero only for $y = x \pm 1$ and satisfy the condition $c_{x,x+1} = c_{x+1,x}$, otherwise being i.i.d. (see Alexander et al. (1981)). In this case, the problem [45] can be formally solved using the Laplace transform, leading to the equations

$$s + G_0^{+} + G_0^{-} = [\hat{p}_{00}(s)]^{-1} \qquad [46]$$

$$s + G_x^{+} + G_x^{-} = 0 \quad (x \ne 0) \qquad [47]$$

where $G_x^{-}$, $G_x^{+}$ are defined as

$$G_x^{\pm} := c_{x,x\pm 1}\, \frac{\hat{p}_{0x}(s) - \hat{p}_{0,x\pm 1}(s)}{\hat{p}_{0x}(s)} \qquad [48]$$

and $\hat{p}_{0x}(s) := \int_0^{\infty} p_{0x}(t)\, e^{-st}\, dt$. From eqns [47] and [48] one obtains the recursion

$$G_x^{\pm} = \left(\frac{1}{c_{x,x\pm 1}} + \frac{1}{s + G_{x\pm 1}^{\pm}}\right)^{-1}, \qquad x = 0, \pm 1, \pm 2, \dots \qquad [49]$$

The quantities $G_0^{\pm}$ are therefore expressed as infinite continued fractions depending on $s$ and the random variables $c_{0,\pm 1}, c_{\pm 1,\pm 2}, \dots$. The function $\hat{p}_{00}(s)$ can then be found from eqn [46].

In its generality, the problem is far too hard, and we shall only comment on how one can evaluate the annealed mean

$$\mathbb{E}\,\hat{p}_{00}(s) = \mathbb{E}\left(s + G_0^{+} + G_0^{-}\right)^{-1}$$

According to eqn [49], the random variables $G_0^{+}$, $G_0^{-}$ are determined by the same algebraic formula, but involve the rate coefficients from different sides of site 0, and hence are i.i.d. Furthermore, eqn [49] implies that the random variables $G_0^{+}$, $G_1^{+}$ have the same distribution and, moreover, $G_1^{+}$ and $c_{01}$ are independent. Therefore, eqn [49] may be used as an integral equation for the unknown density function of $G_0^{+}$. It can be proved that the suitable solution exists and is unique, and
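The continued-fraction structure of recursion [49] can be made concrete by truncating it after finitely many bonds and iterating backwards towards site 0. The sketch below is an illustration under explicit assumptions: the function names are hypothetical, the tail of the fraction is cut by setting the last $G$ to zero (an approximation), and the closed form used for comparison is the fixed point of [49] for constant rates $c$, namely the positive root of $G^2 + sG - sc = 0$, which together with [46] gives $\hat{p}_{00}(s) = (s^2 + 4cs)^{-1/2}$.

```python
import math


def g_truncated(rates, s):
    """Evaluate G_0 from recursion [49], truncated after len(rates) bonds.

    rates[k] is the rate on the k-th bond away from site 0, and s > 0.
    """
    g = 0.0  # assumption: the fraction is cut off beyond the last bond
    for c in reversed(rates):
        g = 1.0 / (1.0 / c + 1.0 / (s + g))
    return g


def p00_hat(right_rates, left_rates, s):
    """Laplace transform of p_00(t) obtained from eqn [46]."""
    return 1.0 / (s + g_truncated(right_rates, s) + g_truncated(left_rates, s))


# Constant rates c = 1 on both sides: compare with the closed form.
s, c, depth = 0.1, 1.0, 200
approx = p00_hat([c] * depth, [c] * depth, s)
exact = 1.0 / math.sqrt(s * s + 4.0 * c * s)
print(approx, exact)
```

For random rates the same two functions apply unchanged: pass independently sampled lists for the right and left half-lines and average `p00_hat` over many samples to approximate the annealed mean.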

although an explicit solution is not available, one can obtain the asymptotics for small values of $s$, thereby rendering information about the behavior of $p_{00}(t)$ for large $t$. More specifically, one can show that if $c^* := (\mathbb{E}c_{01}^{-1})^{-1} > 0$, then

$$\mathbb{E}\,\hat{p}_{00}(s) \sim (4c^* s)^{-1/2}, \qquad s \to 0$$

and so by a Tauberian theorem

$$\mathbb{E}\,p_{00}(t) \sim (4\pi c^* t)^{-1/2}, \qquad t \to \infty \qquad [50]$$

Note that asymptotics [50] appears to be the same as for an ordinary symmetric random walk with constant transition rates $c_{x,x+1} = c_{x+1,x} = c^*$, suggesting that the latter provides an EMA for the RWRE considered above.

This is further confirmed by the asymptotic calculation of the annealed mean square displacement, $\mathbf{E}_0 X_t^2 \sim 2c^* t$ as $t \to \infty$ (Alexander et al. 1981). Moreover, Kawazu and Kesten (1984) proved that $X_t$ is asymptotically normal:

$$\lim_{t\to\infty} \mathbf{P}_0\left\{\frac{X_t}{\sqrt{2c^* t}} \le x\right\} = \Phi(x) \qquad [51]$$

Therefore, if $c^* > 0$, then the RWRE has the same diffusive behavior as the corresponding ordered system, with a well-defined diffusion constant $D = c^*$.

In the case where $c^* = 0$ (i.e., $\mathbb{E}c_{01}^{-1} = \infty$), one may expect that the RWRE exhibits subdiffusive behavior. For example, if the density function of the transition rates is modeled by

$$f_a(u) = (1 - \alpha)\, u^{-\alpha}\, 1_{\{0 < u < 1\}} \qquad (0 < \alpha < 1)$$

then, as shown by Alexander et al. (1981),

$$\mathbb{E}\,p_{00}(t) \sim C_{\alpha}\, t^{-(1-\alpha)/(2-\alpha)}, \qquad \mathbf{E}_0 X_t^2 \sim C'_{\alpha}\, t^{2(1-\alpha)/(2-\alpha)}$$

In fact, Kawazu and Kesten (1984) proved that in this case $t^{-(1-\alpha)/(2-\alpha)} X_t$ has a (non-Gaussian) limit distribution as $t \to \infty$.

To conclude the discussion of the continuous-time case, let us point out that some useful information about recurrence of $X_t$ can be obtained by considering an imbedded (discrete-time) random walk $\tilde{X}_n$, defined as the position of $X_t$ after $n$ jumps. Note that continuous-time Markov chains admit an alternative description of their evolution in terms of sojourn times and the distribution of transitions at a jump. Namely, if the environment $\omega$ is fixed, then the random sojourn time of $X_t$ in each state $x$ is exponentially distributed with mean $1/c_x$, where $c_x := \sum_{y \ne x} c_{xy}$, while the distribution of transitions from $x$ is given by the probabilities $p_{xy} = c_{xy}/c_x$.

For the symmetric nearest-neighbor RWRE considered above, the transition probabilities of the imbedded random walk are given by

$$p_x := p_{x,x+1} = \frac{c_{x,x+1}}{c_{x-1,x} + c_{x,x+1}}, \qquad q_x := p_{x,x-1} = 1 - p_x$$

and we recognize here the transition law of a random walk in the random bonds environment considered in the previous subsection (cf. eqn [41]). Recurrence and zero asymptotic velocity established there are consistent with the results discussed in the present section (e.g., note that the CLT for both $X_n$, eqn [43], and $X_t$, eqn [51], does not involve any centering). Let us point out, however, that a “naive” discretization of time using the mean sojourn time appears to be incorrect, as this would lead to the scaling $t = n\delta_1$ with $\delta_1 := \mathbb{E}(c_{-1,0} + c_{01})^{-1}$, while from comparing the limit theorems in these two cases, one can conclude that the true value of the effective discretization step is given by $\delta^* := (2c^*)^{-1} = (1/2)\,\mathbb{E}c_{01}^{-1}$. In fact, by the arithmetic–harmonic mean inequality we have $\delta^* > \delta_1$, which is a manifestation of the RWRE’s diffusive slowdown.

RWRE in Higher Dimensions

Multidimensional RWRE with nearest-neighbor jumps are defined in a similar fashion: from site $x \in \mathbb{Z}^d$ the random walk can jump to one of the $2d$ adjacent sites $x + e \in \mathbb{Z}^d$ (such that $|e| = 1$), with probabilities $p_x(e) \ge 0$, $\sum_{|e|=1} p_x(e) = 1$, where the random vectors $p_x(\cdot)$ are assumed to be i.i.d. for different $x \in \mathbb{Z}^d$. As usual, we will also impose the condition of uniform ellipticity:

$$p_x(e) \ge \delta > 0, \qquad |e| = 1,\ x \in \mathbb{Z}^d, \qquad \mathbb{P}\text{-a.s.} \qquad [52]$$

In contrast to the one-dimensional case, the theory of RWRE in higher dimensions is far from maturity. Possible asymptotic behaviors of the RWRE for $d \ge 2$ are not understood well enough, and many basic questions remain open. For instance, no definitive classification of the RWRE is available regarding transience and recurrence. Similarly, LLN and CLT have been proved only for a limited number of specific models, while no general sharp results have been obtained. On a more positive note, there has been considerable progress in recent years in the so-called ballistic case, where powerful techniques have been developed (see Sznitman (2002, 2004) and Zeitouni (2003, 2004)). Unfortunately, not much is

known for nonballistic RWRE, apart from special cases of balanced RWRE in $d \ge 2$ (Lawler 1982), small isotropic perturbations of ordinary symmetric random walks in $d \ge 3$ (Bricmont and Kupiainen 1991), and some examples based on combining components of ordinary random walks and RWRE in $d \ge 7$ (Bolthausen et al. 2003). In particular, there are no examples of subdiffusive behavior in any dimension $d \ge 2$, and in fact it is largely believed that a CLT is always true in any uniformly elliptic, i.i.d. random environment in dimensions $d \ge 3$, with somewhat less certainty about $d = 2$. A heuristic explanation for such a striking difference with the case $d = 1$ is that due to a less restricted topology of space in higher dimensions, it is much harder to force the random walk to visit traps, and hence the slowdown is not so pronounced.

In what follows, we give a brief account of some of the known results and methods in this fast-developing area (for further information and specific references, see an extensive review by Zeitouni (2004)).

Zero–One Laws and LLNs

A natural first step in a multidimensional context is to explore the behavior of the random walk $X_n$ as projected on various one-dimensional straight lines. Let us fix a test unit vector $\ell \in \mathbb{R}^d$, and consider the process $Z^{\ell}_n := X_n \cdot \ell$. Then for the events $A_{\pm\ell} := \{\lim_{n\to\infty} Z^{\ell}_n = \pm\infty\}$ one can show that

$$\mathbf{P}_0(A_{\ell} \cup A_{-\ell}) \in \{0, 1\} \qquad [53]$$

That is to say, for each $\ell$ the probability that the random walk escapes to infinity in the direction $\ell$ is either 0 or 1.

Let us sketch the proof. We say that $\tau$ is a “record time” if $|Z^{\ell}_{\tau}| > |Z^{\ell}_k|$ for all $k < \tau$, and a “regeneration time” if in addition $|Z^{\ell}_{\tau}| \le |Z^{\ell}_n|$ for all $n \ge \tau$. Note that by the ellipticity condition [52], $\limsup_{n\to\infty} |Z^{\ell}_n| = \infty$ ($\mathbf{P}_0$-a.s.), hence there is an infinite sequence of record times $0 = \tau_0 < \tau_1 < \tau_2 < \cdots$. If $\mathbf{P}_0(A_{\ell} \cup A_{-\ell}) > 0$, we can pick a subsequence of record times $\tau'_i$, each of which has a positive $\mathbf{P}_0$-probability to be a regeneration time (because otherwise $|Z^{\ell}_n|$ would persistently backtrack towards the origin and the event $A_{\ell} \cup A_{-\ell}$ could not occur). Since the trials for different record times are independent, it follows that a regeneration time $\tau^*$ occurs $\mathbf{P}_0$-a.s. Repeating this argument, we conclude that there exists an infinite sequence of regeneration times $\tau^*_i$, which implies that $|Z^{\ell}_n| \to \infty$ ($\mathbf{P}_0$-a.s.), that is, $\mathbf{P}_0(A_{\ell} \cup A_{-\ell}) = 1$.

Regeneration structure introduced by the sequence $\{\tau^*_i\}$ plays a key role in further analysis of the RWRE and is particularly useful for proving an LLN and a CLT, due to the fact that pieces of the random walk between consecutive regeneration times (and fragments of the random environment involved thereby) are independent and identically distributed (at least starting from $\tau^*_1$). In this vein, one can prove a “directional” version of the LLN, stating that for each $\ell$ there exist deterministic $v_{\ell}$, $v_{-\ell}$ (possibly zero) such that

$$\lim_{n\to\infty} \frac{Z^{\ell}_n}{n} = v_{\ell}\, 1_{A_{\ell}} + v_{-\ell}\, 1_{A_{-\ell}}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [54]$$

Note that if $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$, then eqn [54] in conjunction with eqn [53] would readily imply

$$\lim_{n\to\infty} \frac{Z^{\ell}_n}{n} = v_{\ell}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [55]$$

Moreover, if $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$ for any $\ell$, then there exists a deterministic $v$ (possibly zero) such that

$$\lim_{n\to\infty} \frac{X_n}{n} = v, \qquad \mathbf{P}_0\text{-a.s.} \qquad [56]$$

Therefore, it is natural to ask if a zero–one law [53] can be enhanced to that for the individual probabilities $\mathbf{P}_0(A_{\ell})$. It is known that the answer is affirmative for i.i.d. environments in $d = 2$, where indeed $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$ for any $\ell$, with counterexamples in certain stationary ergodic (but not uniformly elliptic) environments. However, in the case $d \ge 3$ this is an open problem.

Kalikow’s Condition and Sznitman’s Condition (T′)

An RWRE is called “ballistic” (ballistic in direction $\ell$) if $v \ne 0$ ($v_{\ell} \ne 0$), see eqns [55] and [56]. In this section, we describe conditions on the random environment which ensure that the RWRE is ballistic.

Let $U$ be a connected strict subset of $\mathbb{Z}^d$ containing the origin. For $x \in U$, denote by

$$g(x, \omega) := E^{\omega}_0 \sum_{n=0}^{T_U} 1_{\{X_n = x\}}$$

the quenched mean number of visits to $x$ prior to the exit time $T_U := \min\{n \ge 0 : X_n \notin U\}$. Consider an auxiliary Markov chain $\hat{X}_n$, which starts from 0, makes nearest-neighbor jumps while in $U$, with (nonrandom) probabilities

$$\hat{p}_x(e) = \frac{\mathbb{E}[g(x, \omega)\, p_x(e)]}{\mathbb{E}[g(x, \omega)]}, \qquad x \in U \qquad [57]$$

and is absorbed as soon as it first leaves $U$. Note that the expectations in eqn [57] are finite; indeed, if $\pi_x$ is the probability to return to $x$ before leaving $U$,

then, by the Markov property, the mean number of returns is given by

$$\sum_{k=1}^{\infty} k\, \pi_x^k\, (1 - \pi_x) = \frac{\pi_x}{1 - \pi_x} < \infty$$

since, due to ellipticity, $\pi_x < 1$.

An important property, highlighting the usefulness of $\hat{X}_n$, is that if $\hat{X}_n$ leaves $U$ with probability 1, then the same is true for the original RWRE $X_n$ (under the annealed law $\mathbf{P}_0$), and moreover, the exit points $\hat{X}_{\hat{T}_U}$ and $X_{T_U}$ have the same distribution laws.

Let $\ell \in \mathbb{R}^d$, $|\ell| = 1$. One says that Kalikow’s condition with respect to $\ell$ holds if the local drift of $\hat{X}_n$ in the direction $\ell$ is uniformly bounded away from zero:

$$\inf_{U}\, \inf_{x \in U}\, \sum_{|e|=1} (e \cdot \ell)\, \hat{p}_x(e) > 0 \qquad [58]$$

A sufficient condition for [58] is, for example, that for some $\kappa > 0$

$$\mathbb{E}\left[(d(0, \omega) \cdot \ell)_{+}\right] \ge \kappa\, \mathbb{E}\left[(d(0, \omega) \cdot \ell)_{-}\right]$$

where $d(0, \omega) = E^{\omega}_0 X_1$ and $u_{\pm} := \max\{\pm u, 0\}$.

A natural implication of Kalikow’s condition [58] is that $\mathbf{P}_0(A_{\ell}) = 1$ and $v_{\ell} > 0$ (see eqn [55]). Moreover, noting that eqn [58] also holds for all $\ell'$ in a vicinity of $\ell$ and applying the above result with $d$ noncollinear vectors from that vicinity, we conclude that under Kalikow’s condition there exists a deterministic $v \ne 0$ such that $X_n/n \to v$ as $n \to \infty$ ($\mathbf{P}_0$-a.s.). Furthermore, it can be proved that $(X_n - nv)/\sqrt{n}$ converges in law to a Gaussian distribution (see Sznitman (2004)).

It is not hard to check that in dimension $d = 1$ Kalikow’s condition is equivalent to $v \ne 0$ and therefore characterizes completely all ballistic walks. For $d \ge 2$, the situation is less clear; for instance, it is not known if there exist RWRE with $\mathbf{P}_0(A_{\ell}) > 0$ and $v_{\ell} = 0$ (of course, such RWRE cannot satisfy Kalikow’s condition).

Sznitman (2004) has proposed a more complicated transience condition (T′) involving certain regeneration times $\tau^*_i$ similar to those described in the previous subsection. An RWRE is said to satisfy Sznitman’s condition (T′) relative to direction $\ell$ if $\mathbf{P}_0(A_{\ell}) = 1$ and for some $c > 0$ and all $0 < \gamma < 1$

$$\mathbf{E}_0 \exp\left(c \sup_{n \le \tau^*_1} |X_n|^{\gamma}\right) < \infty \qquad [59]$$

This condition provides a powerful control over $\tau^*_1$ for $d \ge 2$ and in particular ensures that $\tau^*_1$ has finite moments of any order. This is in sharp contrast with the one-dimensional case, and should be viewed as a reflection of much weaker traps in dimensions $d \ge 2$.

Condition [59] can also be reformulated in terms of the exit distribution of the RWRE from infinite thick slabs “orthonormal” to directions $\ell'$ sufficiently close to $\ell$. As it stands, the latter reformulation is difficult to check, but Sznitman (2004) has developed a remarkable “effective” criterion reducing the job to a similar condition in finite boxes, which is much more tractable and can be checked in a number of cases.

In fact, condition (T′) follows from Kalikow’s condition, but not the other way around. In the one-dimensional case, condition (T′) (applied to $\ell = 1$ and $\ell = -1$) proves to be equivalent to the transient behavior of the RWRE, which, as we have seen in Theorem 2, may happen with $v = 0$, that is, in a nonballistic scenario. The situation in $d \ge 2$ is quite different, as condition (T′) implies that the RWRE is ballistic in the direction $\ell$ (with $v_{\ell} > 0$) and satisfies a CLT (under $\mathbf{P}_0$). It is not known whether the ballistic behavior for $d \ge 2$ is completely characterized by condition (T′), although this is expected to be true.

Balanced RWRE

In this section we discuss a particular case of nonballistic RWRE, for which LLN and CLT can be proved. Following Lawler (1982), we say that an RWRE is “balanced” if $p_x(e) = p_x(-e)$ for all $x \in \mathbb{Z}^d$, $|e| = 1$ ($\mathbb{P}$-a.s.). In this case, the local drift vanishes, $d(x, \omega) = 0$, hence the coordinate processes $X^i_n$ ($i = 1, \dots, d$) are martingales with respect to the natural filtration $\mathcal{F}_n = \sigma\{X_0, \dots, X_n\}$. The quenched covariance matrix of the increments $\Delta X^i_n := X^i_{n+1} - X^i_n$ ($i = 1, \dots, d$) is given by

$$E^{\omega}_0\left[\Delta X^i_n\, \Delta X^j_n \mid \mathcal{F}_n\right] = 2\,\delta_{ij}\, p_{X_n}(e_i) \qquad [60]$$

Since the right-hand side of eqn [60] is uniformly bounded, it follows that $X_n/n \to 0$ ($\mathbf{P}_0$-a.s.). Further, it can be proved that there exist deterministic positive constants $a_1, \dots, a_d$ such that for $i = 1, \dots, d$

$$\lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} p_{X_k}(e_i) = \frac{a_i}{2}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [61]$$

Once this is proved, a multidimensional CLT for martingale differences yields that $X_n/\sqrt{n}$ converges in law to a Gaussian distribution with zero mean and the covariances $b_{ij} = \delta_{ij}\, a_i$.

The proof of [61] employs the method of environment viewed from the particle. Namely, define a Markov chain $\omega_n := \theta^{X_n}\omega$ with the transition kernel

$$T(\omega, d\omega') = \sum_{i=1}^{d} \left[\, p_0(e_i)\, \delta_{\theta^{e_i}\omega}(d\omega') + p_0(-e_i)\, \delta_{\theta^{-e_i}\omega}(d\omega') \right]$$
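The one-step moments of a balanced walk, eqn [60], can be verified directly from the jump probabilities at a single site. The sketch below is illustrative only: the function name `local_moments` and the example probabilities are assumptions, and the second-moment matrix it returns coincides with the covariance matrix only because the drift vanishes in the balanced case.

```python
from fractions import Fraction


def local_moments(p):
    """One-step drift and second-moment matrix at a single site.

    p maps unit vectors e (tuples of ints) to jump probabilities p_x(e).
    """
    d = len(next(iter(p)))
    drift = [sum(prob * e[i] for e, prob in p.items()) for i in range(d)]
    second = [
        [sum(prob * e[i] * e[j] for e, prob in p.items()) for j in range(d)]
        for i in range(d)
    ]
    return drift, second


# Hypothetical balanced site in d = 2: p(e) = p(-e) for each unit vector e.
site = {
    (1, 0): Fraction(3, 10), (-1, 0): Fraction(3, 10),
    (0, 1): Fraction(2, 10), (0, -1): Fraction(2, 10),
}
drift, second = local_moments(site)
print(drift)   # zero drift for a balanced site
print(second)  # diagonal matrix with entries 2 * p(e_i)
```

Replacing the example by an unbalanced site gives a nonzero drift, which is the quantity $d(x, \omega)$ used throughout the discussion of ballistic walks.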

(cf. eqn [27]). The next step is to find a probability measure $Q$ on $\Omega$ invariant under $T$ and absolutely continuous with respect to $\mathbb{P}$. Unlike the one-dimensional case, however, an explicit form of $Q$ is not available, and $Q$ is constructed indirectly as the limit of invariant measures of certain periodic modifications of the RWRE. Birkhoff’s ergodic theorem then yields, $\mathbf{P}_0$-a.s.,

$$\frac{1}{n} \sum_{k=0}^{n-1} p_{X_k}(e_i, \omega) = \frac{1}{n} \sum_{k=0}^{n-1} p_0(e_i, \omega_k) \to \int_{\Omega} p_0(e_i, \omega)\, Q(d\omega) \ge \delta$$

by the ellipticity condition [52], and eqn [61] follows.

With regard to transience, balanced RWREs admit a complete and simple classification. Namely, it has been proved (see Zeitouni (2004)) that any balanced RWRE is transient for $d \ge 3$ and recurrent for $d = 2$ ($\mathbf{P}_0$-a.s.). It is interesting to note, however, that these answers may be false for certain balanced random walks in a fixed environment ($\mathbb{P}$-probability of such environments being zero, of course). Indeed, examples can be constructed of balanced random walks in $\mathbb{Z}^2$ and in $\mathbb{Z}^d$ with $d \ge 3$, which are transient and recurrent, respectively (Zeitouni 2004).

RWRE Based on Modification of Ordinary Random Walks

A number of partial results are known for RWRE constructed on the basis of ordinary random walks via certain randomization of the environment. A natural model is obtained by a small perturbation of a simple symmetric random walk. To be more precise, suppose that: (1) $|p_x(e) - 1/2d| < \varepsilon$ for all $x \in \mathbb{Z}^d$ and any $|e| = 1$, where $\varepsilon > 0$ is small enough; (2) $\mathbb{E}p_x(e) = 1/2d$; (3) vectors $p_x(\cdot)$ are i.i.d. for different $x \in \mathbb{Z}^d$; and (4) the distribution of the vector $p_x(\cdot)$ is isotropic, that is, invariant with respect to permutations of its coordinates. Then for $d \ge 3$ Bricmont and Kupiainen (1991) have proved an LLN (with zero asymptotic velocity) and a quenched CLT (with nondegenerate covariance matrix). The proof is based on the renormalization group method, which involves decimation in time combined with a suitable spatial–temporal scaling. This transformation replaces an RWRE by another RWRE with weaker randomness, and it can be shown that iterations converge to a Gaussian fixed point.

that the annealed local drift in some direction is strong enough (see Sznitman (2004)). More precisely, suppose that $d \ge 3$ and $\eta \in (0, 1)$. Then there exists $\varepsilon_0 = \varepsilon_0(d, \eta) > 0$ such that if $|p_x(e) - 1/2d| < \varepsilon$ ($x \in \mathbb{Z}^d$, $|e| = 1$) with $0 < \varepsilon < \varepsilon_0$, and for some $e_0$ one has $\mathbb{E}[d(x, \omega) \cdot e_0] \ge \varepsilon^{2.5 - \eta}$ ($d = 3$) or $\ge \varepsilon^{3 - \eta}$ ($d \ge 4$), then Sznitman’s condition (T′) is satisfied with respect to $e_0$ and therefore the RWRE is ballistic in the direction $e_0$ (cf. the subsection “Kalikow’s condition and Sznitman’s condition (T′)”).

Examples of a different type are constructed in dimensions $d \ge 6$ by letting the first $d_1 \ge 5$ coordinates of the RWRE $X_n$ behave according to an ordinary random walk, while the remaining $d_2 = d - d_1$ coordinates are exposed to a random environment (see Bolthausen et al. (2003)). One can show that there exists a deterministic $v$ (possibly zero) such that $X_n/n \to v$ ($\mathbf{P}_0$-a.s.). Moreover, if $d_1 \ge 13$, then $(X_n - nv)/\sqrt{n}$ satisfies both quenched and annealed CLT. Incidentally, such models can be used to demonstrate the surprising features of the multidimensional RWRE. For instance, for $d \ge 7$ one can construct an RWRE $X_n$ such that the annealed local drift does not vanish, $\mathbb{E}d(x, \omega) \ne 0$, but the asymptotic velocity is zero, $X_n/n \to 0$ ($\mathbf{P}_0$-a.s.), and furthermore, if $d \ge 15$, then in this example $X_n/\sqrt{n}$ satisfies a quenched CLT. (In fact, one can construct such RWRE as small perturbations of a simple symmetric walk.) On the other hand, there exist examples (in high enough dimensions) where the walk is ballistic with a velocity which has an opposite direction to the annealed drift $\mathbb{E}d(x, \omega) \ne 0$. These striking examples provide “experimental” evidence of many unusual properties of the multidimensional RWRE, which, no doubt, will be discovered in the years to come.

See also: Averaging Methods; Growth Processes in Random Matrix Theory; Lagrangian Dispersion (Passive Scalar); Random Dynamical Systems; Random Matrix Theory in Physics; Stochastic Differential Equations; Stochastic Loewner Evolutions.

Further Reading

Alexander S, Bernasconi J, Schneider WR, and Orbach R (1981) Excitation dynamics in random one-dimensional systems. Reviews of Modern Physics 53: 175–198.
Bernasconi J and Schneider WR (1985) Random walks in one-dimensional random media. Helvetica Physica Acta 58: 597–621.
Bolthausen E and Goldsheid I (2000) Recurrence and transience of random walks in random environments on a strip. Communications in Mathematical Physics 214: 429–447.
Bolthausen E, Sznitman A-S, and Zeitouni O (2003) Cut points
Another class of examples is also built using small and diffusive random walks in random environments.
perturbations of simple symmetric random walks, but Annales de l’Institut Henri Poincaré. Probabilités et Statis-
is anisotropic and exhibits ballistic behavior, providing tiques 39: 527–555.

Bouchaud J-P and Georges A (1990) Anomalous diffusion in disordered media: statistical mechanisms, models and physical applications. Physics Reports 195: 127–293.
Brémont J (2004) Random walks in random medium on Z and Lyapunov spectrum. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques 40: 309–336.
Bricmont J and Kupiainen A (1991) Random walks in asymmetric random environments. Communications in Mathematical Physics 142: 345–420.
Hughes BD (1995) Random Walks and Random Environments. Volume 1: Random Walks. Oxford: Clarendon.
Hughes BD (1996) Random Walks and Random Environments. Volume 2: Random Environments. Oxford: Clarendon.
Kawazu K and Kesten H (1984) On birth and death processes in symmetric random environment. Journal of Statistical Physics 37: 561–576.
Kesten H, Kozlov MV, and Spitzer F (1975) A limit law for random walk in a random environment. Compositio Mathematica 30: 145–168.
Kozlov SM and Molchanov SA (1984) On conditions for applicability of the central limit theorem to random walks on a lattice. Soviet Mathematics Doklady 30: 410–413.
Lawler GF (1982) Weak convergence of a random walk in a random environment. Communications in Mathematical Physics 87: 81–87.
Molchanov SA (1994) Lectures on random media. In: Bernard P (ed.) Lectures on Probability Theory, Ecole d'Eté de Probabilités de Saint-Flour XXII-1992, Lecture Notes in Mathematics, vol. 1581, pp. 242–411. Berlin: Springer.
Révész P (1990) Random Walk in Random and Non-Random Environments. Singapore: World Scientific.
Sinai YaG (1982) The limiting behavior of a one-dimensional random walk in a random medium. Theory of Probability and Its Applications 27: 256–268.
Solomon F (1975) Random walks in a random environment. The Annals of Probability 3: 1–31.
Sznitman A-S (2002) Lectures on random motions in random media. In: Bolthausen E and Sznitman A-S. Ten Lectures on Random Media, DMV Seminar, vol. 32. Basel: Birkhäuser.
Sznitman A-S (2004) Topics in random walks in random environment. In: Lawler GF (ed.) School and Conference on Probability Theory (Trieste, 2002), ICTP Lecture Notes Series, vol. XVII, pp. 203–266 (Available at http://www.ictp.trieste.it/~pub_off/lectures/vol17.html).
Zeitouni O (2003) Random walks in random environments. In: Tatsien Li (ed.) Proceedings of the International Congress of Mathematicians (Beijing, 2002), vol. III, pp. 117–127. Beijing: Higher Education Press.
Zeitouni O (2004) Random walks in random environment. In: Picard J (ed.) Lectures on Probability Theory and Statistics, Ecole d'Eté de Probabilités de Saint-Flour XXXI-2001, Lecture Notes in Mathematics, vol. 1837, pp. 189–312. New York: Springer.

Recursion Operators in Classical Mechanics

F Magri, Università di Milano Bicocca, Milan, Italy
M Pedroni, Università di Bergamo, Dalmine (BG), Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

One of the tasks of classical mechanics has always been to identify those Hamiltonian systems which, by their peculiar properties, are considered solvable. The integrable systems of Liouville and the separable systems of Jacobi can serve as representative examples here. The bi-Hamiltonian geometry, a branch of Poisson geometry dealing with a special kind of deformation of Poisson bracket, suggests two further classes of Hamiltonian systems: the bi-Hamiltonian systems and the cyclic systems of Levi-Civita. The purpose of this article is to investigate the second class of systems mentioned above, and to explain why they are relevant for classical mechanics (see Bi-Hamiltonian Methods in Soliton Theory and Multi-Hamiltonian Systems for further details).

To define a cyclic system of Levi-Civita, one must consider a symplectic manifold $(S, \omega)$ endowed with a tensor field of type (1, 1), seen as an endomorphism $N : TS \to TS$ that obeys two conditions. The first condition is that the vector-valued 2-form

\[
T_N(X, Y) = [NX, NY] - N[NX, Y] - N[X, NY] + N^2[X, Y]
\]

(called the Nijenhuis torsion of $N$) vanishes identically. In this case $N$ is termed a "recursion operator." The second condition is that

\[
\omega'(X, Y) = \omega(NX, Y)
\]

is a closed 2-form. The manifolds where these conditions are fulfilled are called $\omega N$ manifolds. On these manifolds, each Hamiltonian vector field $X_h$ is embedded into the distribution

\[
D_h = \langle X_h, NX_h, N^2 X_h, \ldots \rangle
\]

which is the minimal invariant distribution containing $X_h$. This can be called the Levi-Civita distribution generated by $X_h$. Experience has shown that $D_h$ is seldom integrable. The cyclic systems of Levi-Civita are, by definition, the generators of the integrable Levi-Civita distributions. Even though this notion is new in classical mechanics, many interesting classical systems display this property.

The aim of this article is to show that the cyclic systems of Levi-Civita are closely related to

separable systems of Jacobi. To this end, the article is organized in four sections, of which the first three clarify the above-mentioned concepts. In the section "$\omega N$ manifolds," the idea of $\omega N$ manifolds is explained from the viewpoint of bi-Hamiltonian geometry. The section "Cotangent bundles" shows that cotangent bundles provide a large class of $\omega N$ manifolds, proving that such manifolds are not rare. Next, two basic examples of cyclic systems of Levi-Civita are presented. Finally, the relation between cyclic systems of Levi-Civita and separable systems of Jacobi is explained briefly.

ωN Manifolds

Let us consider a symplectic manifold $(S, \omega)$ with its Hamiltonian vector fields $X_h$ defined by

\[
\omega(X_h, \cdot) = -dh
\]

and with the Poisson bracket

\[
\{f, g\} = \omega(X_f, X_g)
\]

Both the Hamiltonian vector fields and the functions on $S$ form a Lie algebra, and these algebras are homomorphic, since

\[
[X_f, X_g] = X_{\{f,g\}}
\]

The bi-Hamiltonian geometry is the study of the deformations of the Lie algebras which preserve the above morphism.

We start from the deformations of the Poisson algebra of functions, by replacing the bracket $\{f, g\}$ with the linear pencil

\[
\{f, g\}_\varepsilon = \{f, g\} + \varepsilon \{f, g\}', \qquad \varepsilon \in \mathbb{R}
\]

The problem is to find $\{f, g\}'$ in such a way that the linear pencil satisfies the Jacobi identity for any value of the parameter $\varepsilon$. To solve this problem it is convenient to represent the bracket $\{f, g\}'$ in the form

\[
\{f, g\}' = \omega'(X_f, X_g)
\]

(which is analogous to the standard representation of the Poisson bracket of $S$) and then to notice that there exists a unique (1, 1) tensor field $N : TS \to TS$ such that

\[
\omega'(X_f, X_g) = \omega(NX_f, X_g)
\]

Due to the skew-symmetry of $\omega'$, the tensor field $N$ must satisfy the condition

\[
\omega(NX_f, X_g) = \omega(X_f, NX_g)
\]

To the first order in $\varepsilon$, the Jacobi identity on $\{f, g\}_\varepsilon$ gives

\[
\{\{f, g\}, h\}' + \{\{f, g\}', h\} + \text{cyclic permutations} = 0
\]

This condition entails a constraint on $\omega'$. One can readily check that $\omega'$ must be a closed 2-form:

\[
d\omega' = 0
\]

In turn, this constraint imposes a condition on $N$. The translation of the closure of $\omega'$ on $N$ is

\[
[NX_f, X_g] + [X_f, NX_g] - N[X_f, X_g] = X_{\{f,g\}'}
\]

To the second order in $\varepsilon$, the Jacobi identity on $\{f, g\}_\varepsilon$ gives

\[
\{\{f, g\}', h\}' + \text{cyclic permutations} = 0
\]

entailing the condition

\[
[NX_f, NX_g] = NX_{\{f,g\}'}
\]

on $N$. Thus, the Jacobi identity is satisfied at any order in $\varepsilon$ if and only if $N$ is torsion free and $\omega'$ is a closed 2-form. Hence, according to the definition given in the "Introduction," the manifold $S$ is an $\omega N$ manifold.

It may be of interest to notice that the bracket

\[
[X, Y]_N = [NX, Y] + [X, NY] - N[X, Y]
\]

is a new (deformed) commutator on vector fields, since the torsion of $N$ vanishes. The same is also true for

\[
[X, Y]_\varepsilon = [X, Y] + \varepsilon [X, Y]_N
\]

since the torsion of $(\mathrm{Id} + \varepsilon N)$ vanishes too. Therefore, one can write

\[
[X_f, X_g]_\varepsilon = X_{\{f,g\}_\varepsilon}
\]

This formula shows that this process of deformation is rigid. For each change of the Poisson bracket, there is a deformation of the commutator of vector fields such that the basic correspondence between functions and Hamiltonian vector fields, established by the symplectic form $\omega$, remains a Lie algebra morphism.

The same phenomenon can be observed in connection with the definition of Hamiltonian vector field. If one introduces the pencil of 2-forms

\[
\omega_\varepsilon = \omega + \varepsilon \omega'
\]

and the pencil of derivations

\[
d_\varepsilon = d + \varepsilon d_N
\]

where $d_N$ is the derivation of type $d$ and degree 1 canonically associated with $N$ according to the

theory of graded derivations of Frölicher and Nijenhuis, one can prove that

\[
d_\varepsilon^2 = 0, \qquad d_\varepsilon \omega_\varepsilon = 0
\]

and that

\[
\omega_\varepsilon(X_h, \cdot) = -d_\varepsilon h
\]

This means that, on an $\omega N$ manifold, the symplectic form $\omega$ and the de Rham differential $d$ are deformed in such a way that the basic relation between functions and Hamiltonian vector fields established by $\omega$ holds true.

Cotangent Bundles

Cotangent bundles are a source of examples of $\omega N$ manifolds. The construction begins on the base manifold $Q$. For any (1, 1) tensor field $L : TQ \to TQ$ with vanishing Nijenhuis torsion, one constructs the deformed Liouville 1-form

\[
\theta' = \sum_{i=1}^{n} y_i\, L^*(dx_i)
\]

and its exterior derivative

\[
\omega' = d\theta'
\]

One can prove that $\omega'$ satisfies the conditions explained in the previous section, and conclude that $T^*Q$, endowed with the pencil of 2-forms $\omega_\varepsilon = \omega + \varepsilon\omega'$, is an $\omega N$ manifold.

A subclass of these structures merits attention. It is related to the polynomials

\[
s(\lambda) = \lambda^n - \left( s_1 \lambda^{n-1} + s_2 \lambda^{n-2} + \cdots + s_n \right)
\]

the coefficients of which are functions on $Q$ satisfying the condition

\[
ds_1 \wedge ds_2 \wedge \cdots \wedge ds_n \ne 0
\]

(almost) everywhere on $Q$. Moreover, it is convenient to assume that the roots $(\lambda_1, \lambda_2, \ldots, \lambda_n)$ of $s(\lambda)$ are distinct and real, so that they are functionally independent and can be used as coordinates on $Q$. Therefore, the choice of $s(\lambda)$ is equivalent to fixing a special system of coordinates on $Q$, as happens in $\mathbb{R}^3$ when one introduces the elliptical coordinates as the roots of the polynomial

\[
s(\lambda) = (\lambda - a)(\lambda - b)(\lambda - c)\left( 1 + \frac{x^2}{\lambda - a} + \frac{y^2}{\lambda - b} + \frac{z^2}{\lambda - c} \right)
\]

The peculiarity of this situation is that there exists a unique recursion operator $L : TQ \to TQ$ whose characteristic polynomial is $s(\lambda)$. Thus, the choice of $s(\lambda)$ also determines an $\omega N$ structure on $T^*Q$ according to the previous prescription. The conclusion is that there is a relation between pencils of Poisson brackets on $T^*Q$ and coordinate systems on $Q$. This relation is the key to understanding the geometry of separable systems of Jacobi.

Cyclic Systems of Levi-Civita

The systems of coupled harmonic oscillators are the first example of cyclic systems of Levi-Civita. Let us consider, for simplicity, a system formed by only two particles, with masses $m_1$ and $m_2$, moving on a line under the action of an internal elastic force. The Lagrangian of the system is

\[
\mathcal{L} = \tfrac{1}{2}\left( m_1 \dot{x}_1^2 + m_2 \dot{x}_2^2 \right) - \tfrac{1}{2} k (x_1 - x_2)^2
\]

and the equations of motion are

\[
M\ddot{x} + Kx = 0, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
\]

where

\[
M = \begin{pmatrix} m_1 & 0 \\ 0 & m_2 \end{pmatrix}, \qquad K = \begin{pmatrix} k & -k \\ -k & k \end{pmatrix}
\]

Under a change of coordinates, the entries of the matrices $M$ and $K$ obey the transformation law of the components of a second-order covariant tensor. Therefore, the entries of the matrix $L = M^{-1}K$ are the components of a tensor field of type (1, 1) on $\mathbb{R}^2$. The defining equations of the associated endomorphism $L : T\mathbb{R}^2 \to T\mathbb{R}^2$ are

\[
L^*(dx_1) = \omega_1^2 (dx_2 - dx_1), \qquad L^*(dx_2) = \omega_2^2 (dx_1 - dx_2)
\]

if $\omega_1^2 = k/m_1$ and $\omega_2^2 = k/m_2$, and these equations clearly show that $L$ is torsion free. The same argument holds for any system of coupled harmonic oscillators. Therefore, the cotangent bundle associated with any system of coupled harmonic oscillators is an $\omega N$ manifold.

To compute the tensor field $N$ in our example, one has to follow the prescription, passing from

\[
\theta' = \left( \omega_1^2 y_1 - \omega_2^2 y_2 \right)(dx_2 - dx_1)
\]

to

\[
\omega' = \left( \omega_1^2\, dy_1 - \omega_2^2\, dy_2 \right) \wedge (dx_2 - dx_1)
\]

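and to

\[
\begin{aligned}
N \frac{\partial}{\partial x_1} &= \omega_1^2 \frac{\partial}{\partial x_1} - \omega_2^2 \frac{\partial}{\partial x_2} \\
N \frac{\partial}{\partial x_2} &= -\omega_1^2 \frac{\partial}{\partial x_1} + \omega_2^2 \frac{\partial}{\partial x_2} \\
N \frac{\partial}{\partial y_1} &= \omega_1^2 \left( \frac{\partial}{\partial y_1} - \frac{\partial}{\partial y_2} \right) \\
N \frac{\partial}{\partial y_2} &= \omega_2^2 \left( -\frac{\partial}{\partial y_1} + \frac{\partial}{\partial y_2} \right)
\end{aligned}
\]

The Levi-Civita distribution $D_h$ is therefore spanned by the vector fields

\[
\begin{aligned}
X_h &= k \left[ \frac{y_1}{\omega_1^2} \frac{\partial}{\partial x_1} + \frac{y_2}{\omega_2^2} \frac{\partial}{\partial x_2} + (x_2 - x_1)\left( \frac{\partial}{\partial y_1} - \frac{\partial}{\partial y_2} \right) \right] \\
NX_h &= k \left[ \left( y_1 - \frac{\omega_1^2}{\omega_2^2}\, y_2 \right) \frac{\partial}{\partial x_1} + \left( y_2 - \frac{\omega_2^2}{\omega_1^2}\, y_1 \right) \frac{\partial}{\partial x_2} + \left( \omega_1^2 + \omega_2^2 \right)(x_2 - x_1)\left( \frac{\partial}{\partial y_1} - \frac{\partial}{\partial y_2} \right) \right]
\end{aligned}
\]

related to the Hamiltonian

\[
h = \frac{y_1^2}{2m_1} + \frac{y_2^2}{2m_2} + \frac{1}{2} k (x_1 - x_2)^2
\]

of the system of coupled oscillators. Since $[X_h, NX_h] = 0$, the distribution is integrable; therefore, the system is a cyclic system of Levi-Civita. This property holds for any system of coupled harmonic oscillators. It will be apparent at the end of this article that this result is due to the eigenvectors of $L$ defining the separation coordinates of the coupled oscillators.

The second and final example of cyclic systems of Levi-Civita is the Neumann system, that is, the anisotropic harmonic oscillator on the sphere $S^2$, whose Lagrangian is

\[
\mathcal{L} = \tfrac{1}{2} m \left( \dot{x}_1^2 + \dot{x}_2^2 + \dot{x}_3^2 \right) - \tfrac{1}{2}\left( a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 \right)
\]

with the constraint

\[
x_1^2 + x_2^2 + x_3^2 = 1
\]

This constraint can be avoided by using the first two Cartesian coordinates $(x_1, x_2)$ as local coordinates on $S^2$. The Hamiltonian of the system can then be written in the form

\[
h = \tfrac{1}{2}\left( 1 - x_1^2 \right) y_1^2 - x_1 x_2 y_1 y_2 + \tfrac{1}{2}\left( 1 - x_2^2 \right) y_2^2 + \tfrac{1}{2}(a_1 - a_3) x_1^2 + \tfrac{1}{2}(a_2 - a_3) x_2^2
\]

where, for simplicity, $m = 1$. Formally one is back in $\mathbb{R}^2$ as in the previous example, but the nonlinearity of the equations of motion prevents us from readily seeing the appropriate recursion operator $L : T\mathbb{R}^2 \to T\mathbb{R}^2$ to be used to construct the $\omega N$ structure on $T^*\mathbb{R}^2$. Let us however recall that, according to Neumann, the system is separable in elliptical spherical (also called spheroconical) coordinates, defined as the roots of the restriction to $S^2$ of the polynomial

\[
s(\lambda) = (\lambda - a_1)(\lambda - a_2)(\lambda - a_3)\left( \frac{x_1^2}{\lambda - a_1} + \frac{x_2^2}{\lambda - a_2} + \frac{x_3^2}{\lambda - a_3} \right) = \lambda^2 - (s_1 \lambda + s_2)
\]

Let us, therefore, use this polynomial to construct the unique recursion operator $L$ having $s(\lambda)$ as its characteristic polynomial. It is given by

\[
L^*(ds_1) = ds_2 + s_1\, ds_1, \qquad L^*(ds_2) = s_2\, ds_1
\]

or, after a brief computation, by

\[
\begin{aligned}
L^*(dx_1) &= a_1\, dx_1 - x_1\, d\!\left( \tfrac{1}{2}(a_1 - a_3) x_1^2 + \tfrac{1}{2}(a_2 - a_3) x_2^2 \right) \\
L^*(dx_2) &= a_2\, dx_2 - x_2\, d\!\left( \tfrac{1}{2}(a_1 - a_3) x_1^2 + \tfrac{1}{2}(a_2 - a_3) x_2^2 \right)
\end{aligned}
\]

The situation stays the same as in the previous example. Accordingly, the recursion operator $N$ on $T^*\mathbb{R}^2$ is now given by

\[
\begin{aligned}
N^* dx_1 &= a_1\, dx_1 - x_1\, df \\
N^* dx_2 &= a_2\, dx_2 - x_2\, df \\
N^* dy_1 &= a_1\, dy_1 - (a_1 - a_3) x_1\, dg + y_1\, df \\
N^* dy_2 &= a_2\, dy_2 - (a_2 - a_3) x_2\, dg + y_2\, df
\end{aligned}
\]

where the shorthand notations

\[
f = \tfrac{1}{2}(a_1 - a_3) x_1^2 + \tfrac{1}{2}(a_2 - a_3) x_2^2, \qquad g = x_1 y_1 + x_2 y_2
\]

have been used. The derivation $d_N$, associated with $N$, is accordingly defined by

\[
\begin{aligned}
d_N x_1 = N^* dx_1 &= \left[ a_1 + (a_3 - a_1) x_1^2 \right] dx_1 + (a_3 - a_2) x_1 x_2\, dx_2 \\
d_N x_2 = N^* dx_2 &= (a_3 - a_1) x_1 x_2\, dx_1 + \left[ a_2 + (a_3 - a_2) x_2^2 \right] dx_2 \\
d_N y_1 = N^* dy_1 &= \left[ (a_3 - a_1) x_1 y_2 - (a_3 - a_2) x_2 y_1 \right] dx_2 + \left[ a_1 + (a_3 - a_1) x_1^2 \right] dy_1 + (a_3 - a_1) x_1 x_2\, dy_2 \\
d_N y_2 = N^* dy_2 &= \left[ (a_3 - a_2) x_2 y_1 - (a_3 - a_1) x_1 y_2 \right] dx_1 + (a_3 - a_2) x_1 x_2\, dy_1 + \left[ a_2 + (a_3 - a_2) x_2^2 \right] dy_2
\end{aligned}
\]

on the coordinate functions. Recalling that $d_N$ anticommutes with $d$, one can then easily check the condition

\[
d d_N h = ds_1 \wedge dh
\]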

where $s_1$ is the first coefficient of the polynomial defining the elliptical spherical coordinates, and $h$ is the Hamiltonian of the Neumann system. By the Frobenius theorem, this equation alone entails the integrability of the distribution $D_h$, without the need of computing $X_h$, $NX_h$, and their commutator $[X_h, NX_h]$. Thus, it can be concluded that the Neumann system too is a cyclic system of Levi-Civita, and that the recursion operator $N$, generating the distribution $D_h$, is closely related to the polynomial defining the separation coordinates of the Neumann system.

Separable System of Jacobi

In 1838, Jacobi noticed that the Hamilton–Jacobi equation

\[
h\!\left( x_1, x_2, \ldots, x_n; \frac{\partial W}{\partial x_1}, \ldots, \frac{\partial W}{\partial x_n} \right) = e
\]

of many Hamiltonian systems splits, owing to an appropriate choice of coordinates, into a set of ordinary differential equations. On account of this property, these systems have been called separable. In 1904, Levi-Civita gave a first partial characterization of separable Hamiltonians by means of his separability conditions. In a letter addressed to Stäckel, he proved that $h$ is separable in a preassigned system of canonical coordinates if and only if the conditions

\[
\frac{\partial^2 h}{\partial x_j \partial x_k} \frac{\partial h}{\partial y_j} \frac{\partial h}{\partial y_k}
- \frac{\partial^2 h}{\partial x_j \partial y_k} \frac{\partial h}{\partial y_j} \frac{\partial h}{\partial x_k}
- \frac{\partial^2 h}{\partial y_j \partial x_k} \frac{\partial h}{\partial x_j} \frac{\partial h}{\partial y_k}
+ \frac{\partial^2 h}{\partial y_j \partial y_k} \frac{\partial h}{\partial x_j} \frac{\partial h}{\partial x_k} = 0
\]

are satisfied by $h$. One must notice the nontensorial character of these conditions; they hold only in a specific coordinate system, and if the coordinates are changed, it is not possible to reconstruct the form of the separability conditions in the new coordinates. The nontensorial character is the major drawback of the separability conditions of Levi-Civita, making them practically useless in the search of separation coordinates.

The contact between the theory of separable system of Jacobi and the theory of cyclic systems of Levi-Civita rests on two occurrences. The first is the form of the integrability conditions of the distribution $D_h$ generated by any vector field $X_h$ on an $\omega N$ manifold. Exploiting the Frobenius integrability conditions and the properties of the differential operator $d_N$ associated with the recursion operator $N$, it can be proved that $D_h$ is integrable if and only if the 2-form $d d_N h$ vanishes on $D_h$:

\[
d d_N h = 0 \quad \text{on } D_h
\]

Suppose now that the dimension of $D_h$ is maximal, that is, equal to $n = (1/2) \dim S$. Then $D_h$ is spanned by the $n$ vector fields $(X_h, NX_h, \ldots, N^{n-1}X_h)$, and the vanishing condition of $d d_N h$ on $D_h$ turns out to be equivalent to

\[
d d_N h\,(N^j X_h, N^k X_h) = 0
\]

for any value of $j$ and $k$ from 0 to $n - 1$. Thus, the number of separability conditions of $h$ and the number of integrability conditions of $D_h$ are equal. This circumstance strongly suggests that the two sets of conditions are related. The nontensorial character of the Levi-Civita conditions, compared with the tensorial character of the integrability conditions of $D_h$, further suggests that the former should be the evaluation of the latter in a specific system of coordinates. These coordinates are the "normal coordinates" of an $\omega N$ manifold, which will be introduced in the following.

Assume that the minimal polynomial of $N$ has real and distinct roots $(l_1, \ldots, l_n)$. In this case, the $\omega N$ manifold is said to be semisimple. A two-dimensional eigenspace is associated with each root $l_k$. Let us consider the distribution $E_k$ spanned by all the eigenvectors of $N$, except those associated with $l_k$. Since $N$ is torsion free, each distribution $E_k$ is integrable. Let us fix the attention on one of these distributions. It turns out that its leaves are symplectic submanifolds of codimension 2. So they are the level surfaces of a pair of (local) functions which are not in involution. By collecting together the pairs of functions associated with the $n$ distributions $(E_1, \ldots, E_n)$, one obtains, at the end, a coordinate system $(\lambda_1, \mu_1, \lambda_2, \mu_2, \ldots, \lambda_n, \mu_n)$ on $S$. Moreover, these functions can be chosen in such a way as to form a system of canonical coordinates. The final result is that, on a semisimple $\omega N$ manifold, one can construct a coordinate system such that

\[
\omega = \sum_{j=1}^{n} d\mu_j \wedge d\lambda_j
\]

and

\[
N^*(d\lambda_j) = l_j\, d\lambda_j, \qquad N^*(d\mu_j) = l_j\, d\mu_j
\]

These coordinates are called the normal coordinates (or sometimes, the Darboux–Nijenhuis coordinates) of the $\omega N$ manifold. One can prove that the separability conditions of Levi-Civita are the integrability

conditions of $D_h$, written in normal coordinates. This result allows us to claim that the cyclic systems of Levi-Civita on semisimple $\omega N$ manifolds are all separable.

The reverse is also true. As has already been shown in the example of the Neumann system, a given separable system of Jacobi can be associated with a recursion operator $N$ in such a way that its phase space (with the possible exclusion of a singular locus) becomes an $\omega N$ manifold, and the Hamiltonian vector field $X_h$ becomes a cyclic system of Levi-Civita. A new interpretation of the process of separation of variables follows from this result. Indeed, to find separation coordinates for a given system on a symplectic manifold $S$ is equivalent to deforming the Poisson bracket of $S$ into a pencil

\[
\{f, g\}_\varepsilon = \{f, g\} + \varepsilon \{f, g\}'
\]

in such a way that the recursion operator $N$ defining the pencil $\{f, g\}_\varepsilon$ generates, with $X_h$, an integrable distribution $D_h$. Therefore, classical mechanics is deeply entangled with the theory of recursion operators, even if the insistence on the use of separation coordinates has hidden this fact for a long time.

See also: Bi-Hamiltonian Methods in Soliton Theory; Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Integrable Systems and Algebraic Geometry; Integrable Systems and Recursion Operators on Symplectic and Jacobi Manifolds; Integrable Systems: Overview; Multi-Hamiltonian Systems; Separation of Variables for Differential Equations; Solitons and Kac–Moody Lie Algebras.

Further Reading

Dubrovin BA, Krichever IM, and Novikov SP (2001) Integrable systems I. In: Arnol'd VI (ed.) Encyclopaedia of Mathematical Sciences. Dynamical Systems IV, pp. 177–332. Berlin: Springer.
Jacobi CGJ (1996) Vorlesungen über analytische Mechanik, Deutsche Mathematiker Vereinigung, Freiburg. Braunschweig: Friedrich Vieweg and Sohn.
Kalnins EG (1986) Separation of Variables for Riemannian Spaces of Constant Curvature. New York: Wiley.
Kolář I, Michor PW, and Slovák J (1993) Natural Operations in Differential Geometry. Berlin: Springer.
Krasilshchik IS and Kersten PHM (2000) Symmetries and Recursion Operators for Classical and Supersymmetric Differential Equations. Dordrecht: Kluwer.
Magri F, Falqui G, and Pedroni M (2003) The method of Poisson pairs in the theory of nonlinear PDEs. In: Conte R, Magri F, Musette M, Satsuma J, and Winternitz P (eds.) Direct and Inverse Methods in Nonlinear Evolution Equations, Lecture Notes in Physics, vol. 632, pp. 85–136. Berlin: Springer.
Miller W (1977) Symmetry and Separation of Variables. Reading, MA–London–Amsterdam: Addison-Wesley.
Olver PJ (1993) Applications of Lie Groups to Differential Equations, 2nd edn. New York: Springer.
Pars LA (1965) A Treatise on Analytical Dynamics. London: Heinemann.
Vaisman I (1994) Lectures on the Geometry of Poisson Manifolds. Basel: Birkhäuser.
Vilasi G (2001) Hamiltonian Dynamics. River Edge, NJ: World Scientific.
Yano K and Ishihara S (1973) Tangent and Cotangent Bundles: Differential Geometry. New York: Dekker.
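
As a computational footnote to the coupled-oscillator example of the section "Cyclic Systems of Levi-Civita," the commutation of $X_h$ and $NX_h$ can be checked with exact rational arithmetic. The Python sketch below is only an illustration under stated assumptions: the coordinates are ordered $(x_1, x_2, y_1, y_2)$, the Hamiltonian vector field is taken directly from Hamilton's equations for $h = y_1^2/2m_1 + y_2^2/2m_2 + k(x_1 - x_2)^2/2$, and $N$ is taken to be the block-diagonal lift of $L = M^{-1}K$ (acting by $L$ on the $\partial/\partial x$ components and by its transpose on the $\partial/\partial y$ components). Sign and normalization conventions may differ from those of the article, and all function names are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def block_diag(a, b):
    """Block-diagonal matrix diag(a, b) for two square blocks of equal size."""
    zero = [Fraction(0)] * len(a)
    return [row + zero for row in a] + [zero + row for row in b]


def oscillator_structures(m1, m2, k):
    """Return (L, N, H) for two masses coupled by a spring of stiffness k.

    L = M^{-1} K on the base, N = diag(L, L^T) on the phase space (assumed
    lift), and H is the matrix of the linear Hamiltonian field X_h(z) = H z.
    """
    m1, m2, k = Fraction(m1), Fraction(m2), Fraction(k)
    o = Fraction(0)
    L = [[k / m1, -k / m1], [-k / m2, k / m2]]
    N = block_diag(L, transpose(L))
    H = [[o, o, 1 / m1, o],      # dx1/dt =  y1 / m1
         [o, o, o, 1 / m2],      # dx2/dt =  y2 / m2
         [-k, k, o, o],          # dy1/dt = -k (x1 - x2)
         [k, -k, o, o]]          # dy2/dt =  k (x1 - x2)
    return L, N, H


def bracket_of_linear_fields(u, v):
    """Matrix of the Lie bracket [U z, V z] = (V U - U V) z of linear fields."""
    vu, uv = matmul(v, u), matmul(u, v)
    return [[vu[i][j] - uv[i][j] for j in range(len(u))] for i in range(len(u))]


if __name__ == "__main__":
    L, N, H = oscillator_structures(2, 3, 5)
    NH = matmul(N, H)
    bracket = bracket_of_linear_fields(H, NH)
    print("[X_h, N X_h] vanishes:", all(e == 0 for row in bracket for e in row))
    print("trace L =", L[0][0] + L[1][1], " det L =", L[0][0] * L[1][1] - L[0][1] * L[1][0])
```

Because both vector fields are linear in this example, the bracket reduces to a matrix commutator, which is why no symbolic differentiation is needed. The trace and determinant of $L$ give its two eigenvalues, $0$ and $\omega_1^2 + \omega_2^2$, corresponding to the center-of-mass and relative motions.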

Reflection Positivity and Phase Transitions

Y Kondratiev, Universität Bielefeld, Bielefeld, Germany
Y Kozitsky, Uniwersytet Marii Curie-Sklodowskiej, Lublin, Poland
© 2006 Elsevier Ltd. All rights reserved.

Phase Transitions in Lattice Systems

Introduction

Phase transitions are among the main objects of equilibrium statistical mechanics, both classical and quantum. There exist several approaches to the description of these phenomena. Their common point is that the macroscopic behavior of a statistical mechanical model can be different at the same values of the model parameters. This corresponds to the multiplicity of equilibrium phases, each of which has its own properties. In the mathematical formulation, models are defined by interaction potentials, and equilibrium phases appear as states, that is, positive linear functionals on algebras of observables. In the classical case the states are defined by means of the probability measures which satisfy equilibrium conditions, formulated in terms of the interaction potentials. Such measures are called Gibbs measures and the corresponding states are called Gibbs states. The observables are then integrable functions. In the quantum case the states are mostly introduced by means of the Kubo–Martin–Schwinger condition, a quantum analog of the equilibrium conditions used for classical models. The quantum observables constitute noncommutative von Neumann algebras.

Infinite systems of particles studied in statistical mechanics fall into two main groups. These are continuous systems and lattice systems. In the latter case, particles are attached to the points of various crystalline lattices. In view of the specifics of our subject, in this article we will deal with lattice systems only.

One of the main problems of the mathematical theory of phase transitions is to prove that the Gibbs states of a given model can be multiple, that is, that this model undergoes a phase transition. To solve this problem one has to elaborate corresponding mathematical tools. Typically, at high temperatures (equivalently, for weak interactions), a model which undergoes a phase transition has only one Gibbs state. This state inherits all the symmetries possessed by the interaction potentials. At low temperatures this model has multiple Gibbs states, which may lose the symmetries. In this case the phase transition is accompanied by a symmetry breaking. Among the symmetries important in the theory of lattice systems, there is the invariance with respect to the lattice translations. If the Gibbs state of a translation-invariant lattice model is unique, it ought to be ergodic with respect to the group of lattice translations. This means in particular that the spatial correlations in this state decay to zero at long distances. Therefore, the lack of the latter property may indicate a phase transition. In a number of lattice models, phase transitions can be established by means of their special property, reflection positivity. The most important consequences of reflection positivity are chessboard (also called checkerboard) estimates, which are extended versions of Hölder's inequalities. The proof of a phase transition is then performed either by means of a combination of such estimates and contour methods, or by means of infrared estimates obtained from the chessboard estimates.

In this article we show how to prove phase transitions by means of the infrared estimates for some simple reflection positive models, both classical and quantum. The details on the reflection positivity method in all its versions may be found in the literature listed at the end of the article. There we also provide short bibliographic comments.

Nonergodicity and Infrared Estimates

The following heuristic arguments should give an idea of how to establish the nonergodicity of a Gibbs state by means of infrared estimates. Let us consider a classical ferromagnetic translation-invariant model. (Of course, we assume that it possesses Gibbs states, which for models with unbounded spins is a nontrivial property. A particular case of this model is described in more detail in the subsection "Gaussian domination.") This model describes the system of interacting $N$-dimensional spins $x_\ell \in \mathbb{R}^N$, indexed by the elements $\ell \in \mathbb{Z}^d$ of the $d$-dimensional simple cubic lattice. The interaction is pairwise, attractive, nearest-neighbor, and invariant with respect to the rotations in $\mathbb{R}^N$. Consider a translation-invariant Gibbs state of this model, which always exists. Let $K(\ell, \ell')$, $\ell, \ell' \in \mathbb{Z}^d$, be the expectation of the scalar product $(x_\ell, x_{\ell'})$ of spins in this state. Then $K(\ell, \ell')$ is also translation invariant and hence may be written as

\[
K(\ell, \ell') = \frac{1}{(2\pi)^d} \int_{(-\pi, \pi]^d} \widehat{K}(p)\, e^{i(p, \ell - \ell')}\, dp, \qquad i = \sqrt{-1} \qquad [1]
\]

where the generalized function $\widehat{K}$ is defined by the Fourier series

\[
\widehat{K}(p) = \sum_{\ell' \in \mathbb{Z}^d} K(\ell, \ell')\, e^{i(p, \ell - \ell')}, \qquad p \in (-\pi, \pi]^d \qquad [2]
\]

As the model is ferromagnetic, $K(\ell, \ell') \ge 0$. The Gibbs state is nonergodic if $K(\ell, \ell')$ does not tend to zero as $|\ell - \ell'| \to \infty$. In this case $\widehat{K}$ should be singular at $p = 0$. Set

\[
\widehat{K}(p) = (2\pi)^d \lambda\, \delta(p) + g(p) \qquad [3]
\]

where $\delta(p)$ is the Dirac $\delta$-function and $g(p)$ is regular at $p = 0$. Then the Gibbs state is nonergodic if $\lambda \ne 0$. Suppose we know that $g(p) \ge 0$ and that the following two estimates hold. The first one is

\[
g(p) \le \frac{\gamma}{J |p|^2}, \qquad p \ne 0 \qquad [4]
\]

where $\gamma > 0$ is a constant and $J > 0$ is the interaction intensity multiplied by the inverse temperature $\beta$. This is the infrared estimate. The second estimate is

\[
K(\ell, \ell) \ge K > 0 \qquad [5]
\]

where the constant $K$ is independent of $J$. By these estimates and [1], [2], we get

\[
\lambda \ge K - \frac{\gamma}{(2\pi)^d J} \int_{(-\pi, \pi]^d} \frac{dp}{|p|^2} \qquad [6]
\]

For $d \ge 3$, the latter integral exists; hence, $\lambda > 0$ for $J$ large enough, which means that the state we consider is nonergodic.

The quantum case is more involved. The infrared bounds are obtained not for functions like $\widehat{K}(p)$ but for the so-called Duhamel two-point functions. Then one has to prove a number of additional statements, which finally lead to the proof of the desired result. In the section on reflection positivity in quantum systems we indicate how to do this for a simple quantum spin model.

Reflection Positivity and Phase Transitions in Classical Systems

We begin by studying reflection positive (RP) functionals. Gibbs states of RP models are such functionals.
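
The dimension restriction in the heuristic argument of the subsection "Nonergodicity and Infrared Estimates" comes entirely from the integral of $1/|p|^2$ over $(-\pi, \pi]^d$ in estimate [6]. The following Python sketch is a rough numerical illustration, not part of the article: it approximates that integral by a midpoint rule whose nodes avoid $p = 0$ and compares how the value changes under grid refinement for $d = 2$ and $d = 3$. The function names and grid sizes are assumptions chosen for the example.

```python
import itertools
import math


def midpoint_sum(d, n):
    """Midpoint-rule approximation of the integral of 1/|p|^2 over (-pi, pi]^d.

    Each half-axis (0, pi] is split into n cells. The integrand is even in
    every coordinate, so only the positive orthant is summed and the result
    is scaled by 2**d. No midpoint coincides with the singular point p = 0.
    """
    h = math.pi / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for p in itertools.product(mids, repeat=d):
        total += 1.0 / sum(c * c for c in p)
    return (2 ** d) * total * h ** d


def refinement_gain(d, n):
    """Change of the approximation when the grid goes from n to 2n cells."""
    return midpoint_sum(d, 2 * n) - midpoint_sum(d, n)


if __name__ == "__main__":
    for d in (2, 3):
        gains = [refinement_gain(d, n) for n in (4, 8, 16)]
        print("d =", d, "gains under refinement:", [round(g, 3) for g in gains])
```

In two dimensions each doubling of the grid adds roughly the same amount, close to $2\pi \ln 2$, which is the signature of a logarithmic divergence; in three dimensions the increments shrink with each refinement, consistent with a finite integral. This matches the statement that [6] yields $\lambda > 0$ only for $d \ge 3$.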

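Reflection Positive Functionals

Let $\Lambda$ be a finite set of indices consisting of an even number $|\Lambda|$ of elements, which label real variables $x_\ell$, $\ell \in \Lambda$. For $\Lambda' \subset \Lambda$, we write $x_{\Lambda'} = (x_\ell)_{\ell \in \Lambda'} \in \mathbb{R}^{|\Lambda'|}$. Suppose we are given a bijection $\theta : \Lambda \to \Lambda$, $\theta \circ \theta = \mathrm{id}$, such that the set $\Lambda$ falls into two disjoint parts $\Lambda_\pm$ with the property $\theta : \Lambda_+ \to \Lambda_-$. Therefore, $|\Lambda_+| = |\Lambda_-|$, and the map $\theta$ may be regarded as a reflection. For $x_\Lambda \in \mathbb{R}^{|\Lambda|}$, we set $\theta(x_\Lambda) = (x_{\theta(\ell)})_{\ell \in \Lambda}$. Now let $\mathcal{A}$ be an algebra of functions $A : \mathbb{R}^{|\Lambda|} \to \mathbb{R}$. Then we define the map $\vartheta : \mathcal{A} \to \mathcal{A}$ by setting

\[
\vartheta(A)(x_\Lambda) = A(\theta(x_\Lambda)) \qquad [7]
\]

Clearly, for all $A, B \in \mathcal{A}$ and $\lambda, \mu \in \mathbb{R}$,

\[
\vartheta(\lambda A + \mu B) = \lambda \vartheta(A) + \mu \vartheta(B), \qquad \vartheta(A \cdot B) = \vartheta(A) \cdot \vartheta(B) \qquad [8]
\]

By $\mathcal{A}^+$ (respectively, $\mathcal{A}^-$), we denote the subalgebra of $\mathcal{A}$ consisting of functions dependent on $x_{\Lambda_+}$ (respectively, $x_{\Lambda_-}$). Then $\vartheta(\mathcal{A}^+) = \mathcal{A}^-$ and $\vartheta \circ \vartheta = \mathrm{id}$.

Definition 1 A linear functional $\varphi : \mathcal{A} \to \mathbb{R}$ is called RP with respect to the maps $\theta$ and $\vartheta$ if

\[
\forall A \in \mathcal{A}^+ : \quad \varphi[A \vartheta(A)] \ge 0 \qquad [9]
\]

Example 2 Let $\mu$ be a Borel measure on the real line (not necessarily positive), with respect to which all real polynomials are integrable. Let also $\mathcal{A}$ be the algebra of all real-valued polynomials on $\mathbb{R}^{|\Lambda|}$, $|\Lambda|$ being even. Finally, let $\theta$ and $\vartheta$ be any of the maps with the properties described above. Then the functional

\[
\varphi(A) = \int_{\mathbb{R}^{|\Lambda|}} A(x_\Lambda)\, d\mu_\Lambda(x_\Lambda), \qquad d\mu_\Lambda(x_\Lambda) = \prod_{\ell \in \Lambda} d\mu(x_\ell) \qquad [10]
\]

is RP. Indeed, let $F : \mathbb{R}^{|\Lambda|/2} \to \mathbb{R}$ be such that $A(x_\Lambda) = F(x_{\Lambda_+})$. Then

\[
\varphi[A \vartheta(A)] = \int F(x_{\Lambda_+}) \prod_{\ell \in \Lambda_+} d\mu(x_\ell) \int F(x_{\Lambda_-}) \prod_{\ell \in \Lambda_-} d\mu(x_\ell) = \left[ \int F(x_{\Lambda_+}) \prod_{\ell \in \Lambda_+} d\mu(x_\ell) \right]^2 \ge 0
\]

In the above example the multiplicative structure of the measure $\mu_\Lambda$ is crucial. It results in the positivity of $\varphi$ with respect to all reflections. If one has just one such reflection, the measure which defines $\varphi$ may be decomposable onto two measures only. Let $\Lambda$, $\mathcal{A}$, $\theta$, and $\vartheta$ be as above. Consider a Borel measure $\nu$ on $\mathbb{R}^{|\Lambda|/2}$ such that every real-valued polynomial on $\mathbb{R}^{|\Lambda|/2}$ is $\nu$-integrable.

Proposition 3 The functional

\[
\varphi(A) = \int_{\mathbb{R}^{|\Lambda|}} A(x_\Lambda)\, d\nu(x_{\Lambda_+})\, d\nu(x_{\Lambda_-}) \qquad [11]
\]

is RP.

In both these examples the states are symmetric, that is,

\[
\varphi[A \vartheta(B)] = \varphi[B \vartheta(A)], \qquad \text{for all } A, B \in \mathcal{A}^+ \qquad [12]
\]

In the sequel we shall suppose that all RP functionals possess this property. Therefore, RP functionals obey a Cauchy–Schwarz type inequality.

Lemma 4 If $\varphi$ is RP, then for any $A, B \in \mathcal{A}^+$,

\[
\{ \varphi[A \vartheta(B)] \}^2 \le \varphi[A \vartheta(A)]\, \varphi[B \vartheta(B)] \qquad [13]
\]

Proof For $\lambda \in \mathbb{R}$, by [8] we have

\[
\varphi[(A + \lambda B)\, \vartheta(A + \lambda B)] = \varphi[(A + \lambda B)(\vartheta(A) + \lambda \vartheta(B))] \ge 0
\]

Since $\varphi$ is linear, the latter can be written as a quadratic trinomial in $\lambda$, whose positivity for all $\lambda \in \mathbb{R}$ is equivalent to [13]. □

Now let an RP functional $\varphi$ be such that for

\[
A, B, C_1, \ldots, C_m, D_1, \ldots, D_m \in \mathcal{A}^+
\]

there exists

\[
\varphi\!\left[ \exp\!\left( A + \vartheta(B) + \sum_{i=1}^{m} C_i \vartheta(D_i) \right) \right]
\]

and that the series

\[
\sum_{n_1, \ldots, n_m = 0}^{\infty} \frac{1}{n_1! \cdots n_m!}\, \varphi\!\left\{ [C_1 \vartheta(C_1)]^{n_1} \cdots [C_m \vartheta(C_m)]^{n_m} \exp[A + \vartheta(B)] \right\} \qquad [14]
\]

as well as the one with all $C_i$ replaced by $D_i$ converge absolutely.

Lemma 5 Let the functional $\varphi$ and the functions $A$, $B$, $C_i$, $D_i$, $i = 1, \ldots, m$, be as above. Then

\[
\left\{ \varphi\!\left[ \exp\!\left( A + \vartheta(B) + \sum_{i=1}^{m} C_i \vartheta(D_i) \right) \right] \right\}^2 \le \varphi\!\left[ \exp\!\left( A + \vartheta(A) + \sum_{i=1}^{m} C_i \vartheta(C_i) \right) \right] \varphi\!\left[ \exp\!\left( B + \vartheta(B) + \sum_{i=1}^{m} D_i \vartheta(D_i) \right) \right] \qquad [15]
\]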

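Proof By the above assumptions

$$\varphi\left[ \exp\left( A + \vartheta(B) + \sum_{i=1}^m C_i\,\vartheta(D_i) \right) \right] = \varphi\left[ F\,\vartheta(G) \exp\left( \sum_{i=1}^m C_i\,\vartheta(D_i) \right) \right] = \sum_{n_1, \dots, n_m = 0}^{\infty} \frac{1}{n_1! \cdots n_m!}\, \varphi\left\{ [F\,\vartheta(G)]\,[C_1 \vartheta(D_1)]^{n_1} \cdots [C_m \vartheta(D_m)]^{n_m} \right\} \qquad [16]$$

where $F = e^A$, $G = e^B$. Then by [13] and the Cauchy–Schwarz inequality for sums we get

$$\begin{aligned}
\mathrm{RHS}\,[16] &\le \sum_{n_1, \dots, n_m = 0}^{\infty} \frac{1}{n_1! \cdots n_m!} \left\{ \varphi\left( [F\,\vartheta(F)]\,[C_1 \vartheta(C_1)]^{n_1} \cdots [C_m \vartheta(C_m)]^{n_m} \right) \right\}^{1/2} \left\{ \varphi\left( [G\,\vartheta(G)]\,[D_1 \vartheta(D_1)]^{n_1} \cdots [D_m \vartheta(D_m)]^{n_m} \right) \right\}^{1/2} \\
&\le \left\{ \sum_{n_1, \dots, n_m = 0}^{\infty} \frac{1}{n_1! \cdots n_m!}\, \varphi\left( [F\,\vartheta(F)]\,[C_1 \vartheta(C_1)]^{n_1} \cdots [C_m \vartheta(C_m)]^{n_m} \right) \right\}^{1/2} \left\{ \sum_{n_1, \dots, n_m = 0}^{\infty} \frac{1}{n_1! \cdots n_m!}\, \varphi\left( [G\,\vartheta(G)]\,[D_1 \vartheta(D_1)]^{n_1} \cdots [D_m \vartheta(D_m)]^{n_m} \right) \right\}^{1/2} \\
&= \left\{ \varphi\left[ \exp\left( A + \vartheta(A) + \sum_{i=1}^m C_i\,\vartheta(C_i) \right) \right] \right\}^{1/2} \left\{ \varphi\left[ \exp\left( B + \vartheta(B) + \sum_{i=1}^m D_i\,\vartheta(D_i) \right) \right] \right\}^{1/2}
\end{aligned}$$

which yields [15]. ∎

Main Estimate

Let $\Lambda$ be a finite set and $\Lambda'$ be its nonempty subset. Let also $\mu$ and $\varrho$ be finite Borel measures on $\mathbb{R}^{N|\Lambda|}$, $N \in \mathbb{N}$. For vectors $b, c \in \mathbb{R}^N$, by $(b, c)$ and $|b|$, $|c|$ we denote their scalar product $\sum_{k=1}^N b^{(k)} c^{(k)}$ and the corresponding norms, respectively. By $x_\Lambda$ we denote $(x_\ell)_{\ell \in \Lambda}$, $x_\ell \in \mathbb{R}^N$; hence, $x_\Lambda \in \mathbb{R}^{N|\Lambda|}$.

Lemma 6 Let the sets $\Lambda$, $\Lambda'$ and the measures $\mu$, $\varrho$ be as above. Then for every $(a_\ell)_{\ell \in \Lambda'} \in \mathbb{R}^{N|\Lambda'|}$ and $J \ge 0$,

$$\left[ \int_{\mathbb{R}^{2N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda'} |x_\ell - y_\ell - a_\ell|^2 \right) d\mu(x_\Lambda)\, d\varrho(y_\Lambda) \right]^2 \le \int_{\mathbb{R}^{2N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda'} |x_\ell - y_\ell|^2 \right) d\mu(x_\Lambda)\, d\mu(y_\Lambda) \times \int_{\mathbb{R}^{2N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda'} |x_\ell - y_\ell|^2 \right) d\varrho(x_\Lambda)\, d\varrho(y_\Lambda) \qquad [17]$$

Proof Take two copies of $\Lambda$ and denote them by $\Lambda_\pm$. Furthermore, by $\Lambda'_\pm \subset \Lambda_\pm$ we denote the subsets consisting of the elements of $\Lambda'$. For an $\ell \in \Lambda_+$, by $\theta(\ell)$ we denote its counterpart in $\Lambda_-$. Then $\theta$ is a reflection and $\theta(\Lambda'_+) = \Lambda'_-$. Let $\Xi = \Lambda_+ \cup \Lambda_-$, $\Xi' = \Lambda'_+ \cup \Lambda'_-$, and $\mathcal{A}$ be the algebra of all polynomials of $(x_{\Xi'}, y_{\Xi'}) \in \mathbb{R}^{2N|\Xi'|}$. Note that $x_{\Xi'}$ may be regarded as the pair $(x_{\Lambda'_+}, x_{\Lambda'_-})$. Let $\mathcal{A}^+$ (respectively, $\mathcal{A}^-$) be the subalgebra of $\mathcal{A}$ consisting of the polynomials which depend on $x_{\Lambda'_+}$, $y_{\Lambda'_+}$ (respectively, $x_{\Lambda'_-}$, $y_{\Lambda'_-}$) only. Introduce the measures

$$d\tilde{\mu}(x_\Lambda) = \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda'} |x_\ell|^2 \right) d\mu(x_\Lambda), \qquad d\tilde{\varrho}(x_\Lambda) = \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda'} |x_\ell|^2 \right) d\varrho(x_\Lambda)$$

and define the following functional on $\mathcal{A}$:

$$\varphi(F) = \int_{\mathbb{R}^{2N|\Xi|}} F(x_{\Xi'}, y_{\Xi'})\, d\tilde{\mu}(x_{\Lambda_+})\, d\tilde{\varrho}(y_{\Lambda_+})\, d\tilde{\mu}(x_{\Lambda_-})\, d\tilde{\varrho}(y_{\Lambda_-}) \qquad [18]$$

It has the same structure as the one described by Proposition 3, hence is RP with respect to the map $\vartheta$ defined by the reflection $\theta$. Set

$$c_\mu = \int_{\mathbb{R}^{N|\Lambda|}} d\tilde{\mu}(x_\Lambda), \qquad c_\varrho = \int_{\mathbb{R}^{N|\Lambda|}} d\tilde{\varrho}(y_\Lambda) \qquad [19]$$

and

$$A \equiv 0, \qquad B = -J \sum_{\ell \in \Lambda'_+} \left[ \frac{1}{2} |a_\ell|^2 + (a_\ell, y_\ell) \right], \qquad C_\ell^{(k)} = \sqrt{J}\, x_\ell^{(k)}, \qquad D_\ell^{(k)} = \sqrt{J} \left( y_\ell^{(k)} + a_\ell^{(k)} \right), \qquad \ell \in \Lambda'_+, \; k = 1, \dots, N \qquad [20]$$

Then the left-hand side of [17] is

$$\mathrm{LHS}\,[17] = \frac{1}{(c_\mu c_\varrho)^2} \left\{ \varphi\left[ \exp\left( A + \vartheta(B) + \sum_{\ell \in \Lambda'_+} \sum_{k=1}^N C_\ell^{(k)}\, \vartheta\big( D_\ell^{(k)} \big) \right) \right] \right\}^2 \qquad [21]$$

with $\varphi$ given by [18]. Applying [15] and taking into account [19], we arrive at

$$\begin{aligned}
\mathrm{LHS}\,[17] \le \frac{1}{(c_\mu c_\varrho)^2} &\int_{\mathbb{R}^{2N|\Xi|}} \exp\left( J \sum_{\ell \in \Lambda'_+} (x_\ell, x_{\theta(\ell)}) \right) d\tilde{\mu}(x_{\Lambda_+})\, d\tilde{\mu}(x_{\Lambda_-})\, d\tilde{\varrho}(y_{\Lambda_+})\, d\tilde{\varrho}(y_{\Lambda_-}) \\
\times &\int_{\mathbb{R}^{2N|\Xi|}} \exp\left( J \sum_{\ell \in \Lambda'_+} (y_\ell, y_{\theta(\ell)}) \right) d\tilde{\mu}(x_{\Lambda_+})\, d\tilde{\mu}(x_{\Lambda_-})\, d\tilde{\varrho}(y_{\Lambda_+})\, d\tilde{\varrho}(y_{\Lambda_-}) = \mathrm{RHS}\,[17]
\end{aligned}$$

which completes the proof. ∎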
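The estimate [17] can be checked numerically in the simplest setting. The following is only a sketch under stated assumptions: a single site (so the index set and its subset coincide and have one element), $N = 1$, and two discrete measures with positive weights, $\sum_i w_i \delta_{x_i}$ in the role of the first measure and $\sum_j v_j \delta_{y_j}$ in the role of the second. The function name `lemma6_sides` and all parameter names are illustrative and are not taken from the text.

```python
import numpy as np


def lemma6_sides(x, w, y, v, a, J):
    """Return (LHS, RHS) of [17] for one site, N = 1, and discrete measures.

    First measure:  sum_i w[i] * delta_{x[i]}
    Second measure: sum_j v[j] * delta_{y[j]}
    a is the shift a_l, J >= 0 the interaction intensity.
    """
    def kernel(s, t):
        # Matrix of Gaussian weights exp(-(J/2) * (s_i - t_j)^2).
        return np.exp(-0.5 * J * (s[:, None] - t[None, :]) ** 2)

    # Square of the mixed integral with the shift a.
    lhs = (w @ kernel(x, y + a) @ v) ** 2
    # Product of the two unshifted "diagonal" integrals.
    rhs = (w @ kernel(x, x) @ w) * (v @ kernel(y, y) @ v)
    return lhs, rhs


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x, w = rng.normal(size=6), rng.uniform(0.1, 1.0, size=6)
    y, v = rng.normal(size=5), rng.uniform(0.1, 1.0, size=5)
    lhs, rhs = lemma6_sides(x, w, y, v, a=0.7, J=1.5)
    print(f"LHS = {lhs:.6f}, RHS = {rhs:.6f}, LHS <= RHS: {lhs <= rhs}")
```

In this one-site setting the inequality reduces to the Cauchy–Schwarz inequality for the positive-definite Gaussian kernel, and the shift drops out of the right-hand side by translation invariance, which is the mechanism that later produces Gaussian domination.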

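Gaussian Domination

Let $\Lambda$ be a finite set, $|\Lambda|$ even, and $E$ be a set of unordered pairs of elements of $\Lambda$, such that the graph $(\Lambda, E)$ is connected. If $e \in E$ connects given $\ell, \ell' \in \Lambda$, we write $e = \langle \ell, \ell' \rangle$. We suppose that $E$ contains no loops $\langle \ell, \ell \rangle$. With each $\ell \in \Lambda$ we associate a random $N$-component vector $x_\ell$, called spin. The joint probability distribution of the spins $(x_\ell)_{\ell \in \Lambda}$ is defined by means of the local Gibbs measure

$$d\nu_\Lambda(x_\Lambda) = \frac{1}{Z_\Lambda} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E} |x_\ell - x_{\ell'}|^2 \right) d\chi_\Lambda(x_\Lambda), \qquad x_\Lambda \in \mathbb{R}^{N|\Lambda|} \qquad [22]$$

Here the measure

$$d\chi_\Lambda(x_\Lambda) = \prod_{\ell \in \Lambda} d\varrho(x_\ell) \qquad [23]$$

describes the system if the interaction intensity $J$ equals zero. In general, $J \ge 0$, that is, the model [22], [23] is ferromagnetic. The single-spin measure $\varrho$ is a probability measure on $\mathbb{R}^N$ and

$$Z_\Lambda = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E} |x_\ell - x_{\ell'}|^2 \right) d\chi_\Lambda(x_\Lambda) \qquad [24]$$

is the partition function. Set

$$Z_\Lambda(h) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E} |x_\ell - x_{\ell'} - h_{\ell\ell'}|^2 \right) d\chi_\Lambda(x_\Lambda) \qquad [25]$$

where $h_{\ell\ell'} = h_{\ell'\ell} \in \mathbb{R}^N$, $\langle \ell, \ell' \rangle \in E$.

Definition 7 The model [22]–[23] admits Gaussian domination if for all $h = (h_{\ell\ell'})_{\langle \ell, \ell' \rangle \in E}$,

$$Z_\Lambda(h) \le Z_\Lambda(0) \qquad [26]$$

We prove that our model admits Gaussian domination if the graph satisfies the following:

Assumption 8 The set of edges $E$ can be decomposed

$$E = \bigcup_{n=1}^m E_n, \qquad E_n \cap E_{n'} = \emptyset \quad \text{if } n \ne n' \qquad [27]$$

in such a way that for every $n = 1, \dots, m$, the graph $(\Lambda, E \setminus E_n)$ is disconnected and falls into two connected components, $(\Lambda_+^{(n)}, E_+^{(n)})$ and $(\Lambda_-^{(n)}, E_-^{(n)})$, which are isomorphic. This means that there exists a bijection $\theta_n : \Lambda \to \Lambda$, $\theta_n \circ \theta_n = \mathrm{id}$, such that $\theta_n(\Lambda_+^{(n)}) = \Lambda_-^{(n)}$ and $\langle \theta_n(\ell), \theta_n(\ell') \rangle \in E_-^{(n)}$ whenever $\langle \ell, \ell' \rangle \in E_+^{(n)}$. Finally, we assume that if $\langle \ell, \ell' \rangle \in E_n$ and $\ell \in \Lambda_+^{(n)}$, then $\theta_n(\ell) = \ell'$.

By this assumption if $\langle \ell, \ell' \rangle \in E_n$, then no other elements of $E_n$ can be of the form $\langle \ell, \ell'' \rangle$ or $\langle \ell'', \ell' \rangle$. The basic example here is the torus which one obtains from a rectangular box $\Lambda \subset \mathbb{Z}^d$, $|\Lambda|$ even, by imposing periodic conditions on its boundaries. The set of edges is $E = \{ \langle \ell, \ell' \rangle \mid |\ell - \ell'|_\Lambda = 1 \}$, where $|\ell - \ell'|_\Lambda$ is the periodic distance on $\Lambda$ (see the next subsection). Then every plane which contains the center of the torus and its axis cuts it out along a family of edges onto two subgraphs with the property desired (see Figure 1).

Figure 1 The torus.

Theorem 9 The model [22]–[23] defined on the graph obeying Assumption 8 admits Gaussian domination.

Proof For $\sigma = \pm 1$, $h = (h_{\ell\ell'})_{\langle \ell, \ell' \rangle \in E}$, and $n = 1, \dots, m$, we define the map

$$\left( T_n^\sigma h \right)_{\ell\ell'} = \begin{cases} h_{\ell\ell'}, & \text{if } \langle \ell, \ell' \rangle \in E_\sigma^{(n)} \\ h_{\theta_n(\ell)\theta_n(\ell')}, & \text{if } \langle \ell, \ell' \rangle \in E_{-\sigma}^{(n)} \\ 0, & \text{if } \langle \ell, \ell' \rangle \in E_n \end{cases} \qquad [28]$$

According to Assumption 8

$$Z_\Lambda(h) = \int_{\mathbb{R}^{N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_1} |x_\ell - x_{\ell'} - h_{\ell\ell'}|^2 \right) d\mu_+^{(1)}\big( x_{\Lambda_+^{(1)}} \big)\, d\mu_-^{(1)}\big( x_{\Lambda_-^{(1)}} \big) \qquad [29]$$

where

$$d\mu_\sigma^{(1)}\big( x_{\Lambda_\sigma^{(1)}} \big) = \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_\sigma^{(1)}} |x_\ell - x_{\ell'} - h_{\ell\ell'}|^2 \right) d\chi_{\Lambda_\sigma^{(1)}}\big( x_{\Lambda_\sigma^{(1)}} \big), \qquad \sigma = \pm 1$$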

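Set

$$d\mu_{-+}^{(1)}\big( x_{\Lambda_+^{(1)}} \big) = \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_+^{(1)}} |x_\ell - x_{\ell'} - h_{\theta_1(\ell)\theta_1(\ell')}|^2 \right) d\chi_{\Lambda_+^{(1)}}\big( x_{\Lambda_+^{(1)}} \big)$$

$$d\mu_{+-}^{(1)}\big( x_{\Lambda_-^{(1)}} \big) = \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_+^{(1)}} |x_{\theta_1(\ell)} - x_{\theta_1(\ell')} - h_{\ell\ell'}|^2 \right) d\chi_{\Lambda_-^{(1)}}\big( x_{\Lambda_-^{(1)}} \big)$$

Then we apply here Lemma 6, with $\Lambda'_+ = \{ \ell \in \Lambda_+^{(1)} \mid \langle \ell, \ell' \rangle \in E_1 \}$, and obtain

$$\begin{aligned}
[Z_\Lambda(h)]^2 \le &\int_{\mathbb{R}^{N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_1} |x_\ell - x_{\ell'}|^2 \right) d\mu_+^{(1)}\big( x_{\Lambda_+^{(1)}} \big)\, d\mu_{+-}^{(1)}\big( x_{\Lambda_-^{(1)}} \big) \\
\times &\int_{\mathbb{R}^{N|\Lambda|}} \exp\left( -\frac{J}{2} \sum_{\langle \ell, \ell' \rangle \in E_1} |x_\ell - x_{\ell'}|^2 \right) d\mu_{-+}^{(1)}\big( x_{\Lambda_+^{(1)}} \big)\, d\mu_-^{(1)}\big( x_{\Lambda_-^{(1)}} \big) = Z_\Lambda(T_1^+ h)\, Z_\Lambda(T_1^- h)
\end{aligned}$$

Next we estimate both $Z_\Lambda(T_1^\pm h)$ employing $E_2$ and $T_2^\pm$. Repeating this procedure for all $n = 1, \dots, m$, we finally get

$$[Z_\Lambda(h)]^{2^m} \le \prod_{\sigma_1, \dots, \sigma_m = \pm 1} Z_\Lambda\left( T_m^{\sigma_m} \cdots T_1^{\sigma_1} h \right) = [Z_\Lambda(0)]^{2^m} \qquad [30]$$

Note that $T_m^{\sigma_m} \cdots T_1^{\sigma_1} h = 0$ for any $h \in \mathbb{R}^{N|E|}$ and any sequence $\sigma_1, \dots, \sigma_m = \pm 1$, which follows from [27] and [28]. ∎

As might be clear from the proof given above, the local Gibbs state

$$\varphi_\Lambda(A) = \int_{\mathbb{R}^{N|\Lambda|}} A(x_\Lambda)\, d\nu_\Lambda(x_\Lambda) \qquad [31]$$

defined by means of the measure [22], is RP with respect to all reflections $\theta_n$, $n = 1, \dots, m$. Indeed, the functional defined by the product measure

$$d\tilde{\chi}_\Lambda(x_\Lambda) \stackrel{\mathrm{def}}{=} \exp\left( -\frac{J}{2} \sum_{\ell \in \Lambda} |x_\ell|^2 \right) d\chi_\Lambda(x_\Lambda) \qquad [32]$$

is RP (see Example 2). The Gibbs measure [22] can be written as

$$d\nu_\Lambda(x_\Lambda) = \frac{1}{Z_\Lambda(0)} \exp\left( \sum_{n=1}^m \sum_{k=1}^N \sum_{\ell \in \Lambda'_{+,n}} C_\ell^{(k)}\, \vartheta_n\big( C_\ell^{(k)} \big) \right) d\tilde{\chi}_\Lambda(x_\Lambda) \qquad [33]$$

where $C_\ell^{(k)}$, $k = 1, \dots, N$, are the same as in [20] and $\Lambda'_{+,n} \stackrel{\mathrm{def}}{=} \{ \ell \in \Lambda_+^{(n)} \mid \langle \ell, \ell' \rangle \in E_n \}$. Then the reflection positivity of the Gibbs state [31] can be obtained along the line of arguments used for proving Lemma 6. It appears that this is the only possible way to construct an RP functional from another RP functional.

Repeated application of the estimate [15] also yields

$$\varphi_\Lambda\left( \prod_{\ell \in \Lambda} F_\ell(x_\ell) \right) \le \prod_{\ell \in \Lambda} \left[ \varphi_\Lambda\left( \prod_{\ell' \in \Lambda} F_\ell(x_{\ell'}) \right) \right]^{1/|\Lambda|} \qquad [34]$$

which holds for any family of functions $\{ F_\ell : \mathbb{R}^N \to [0, +\infty) \}_{\ell \in \Lambda}$, for which the above expressions make sense. The estimate [34] is a chessboard estimate, which is a very important element of the theory of phase transitions in RP models. The estimate [26] may be obtained from [34].

Infrared Bound

Let us show now how to derive the infrared estimates from the Gaussian domination [26]. Consider the system of $N$-dimensional spins indexed by the elements of $\mathbb{Z}^d$ with the nearest-neighbor ferromagnetic interaction and the single-spin measure $\varrho$. To construct the periodic local Gibbs measure of this system, we take the box

$$\Lambda = (-L, L]^d \cap \mathbb{Z}^d, \qquad L \in \mathbb{N} \qquad [35]$$

and impose periodic conditions on its boundaries. This defines the periodic distance

$$|\ell - \ell'|_\Lambda = \left[ \sum_{j=1}^d |\ell_j - \ell'_j|_L^2 \right]^{1/2}, \quad \ell, \ell' \in \Lambda; \qquad |\ell_j - \ell'_j|_L = \min\{ |\ell_j - \ell'_j|,\; 2L - |\ell_j - \ell'_j| \} \qquad [36]$$

(the box has $2L$ sites in each direction) and hence the set of edges $E$, being unordered pairs $\langle \ell, \ell' \rangle$ such that $|\ell - \ell'|_\Lambda = 1$. Thus, we have the graph $(\Lambda, E)$ and the measure [22]. This is the periodic local Gibbs measure of our model. By [31] it defines the periodic local Gibbs state $\varphi_\Lambda$. We have included the inverse temperature $\beta$ into $J$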

and assumed that the single-spin measure is first components h(1)


‘‘0 are nonzero. Then [44]
rotation invariant. Let us introduce the Fourier holds if
transformation X X h  i
ð1Þ ð1Þ ð1Þ ð1Þ ð1Þ ð1Þ
J  x‘1  x‘0 x‘2  x‘0 h‘1 ‘0 h‘2 ‘0
1 X h‘1 ;‘01 i2E h‘2 ;‘02 i2E
1 2 1 2

^ðpÞ ¼ pffiffiffiffiffiffi
x x‘ eið‘;pÞ
jj ‘2 X h i
ð1Þ 2
½37  h‘‘0 ½45
1 X
x‘ ¼ pffiffiffiffiffiffi ^ðpÞeið‘; pÞ
x h‘;‘0 i2E
jj p2
This means that the eigenvalues of the matrix of the
real quadratic form (with respect to h) defined by
 the left-hand side of [45] do not exceed one. The
 same ought to be true for the extension of this form
 ¼ p ¼ ðp1 ; . . . ; pd Þj pj ¼  þ j ;
L to the complex case. Let us show that the complex
 eigenvectors h(1)
‘‘0 (p) of this matrix and the corre-
j ¼ 1; . . . ; 2L; j ¼ 1; . . . ; d ½38 sponding eigenvalues (p) are
ð1Þ 0 pffiffiffiffiffiffi
Then we can set h‘‘0 ðpÞ ¼ ðeiðp;‘Þ  eiðp;‘ Þ Þ= jj
h i p 2  ½46
b ðkÞ ðpÞ ¼  x ðkÞ ðkÞ ðpÞ ¼ 2JEðpÞK b ð1Þ ðpÞ
K  ^ x
ðpÞ^ ðpÞ 

X
N ½39 For j = 1, . . . , d, let j 2 Zd be the unit vector with
b  ðpÞ ¼
K b ðkÞ ðpÞ
K the jth component equal to 1. Then for h‘, ‘0 i 2 E,

k¼1 there exists j such that ‘  ‘0 = j . Since the edge
Thereby, cf. [1], [2], h‘, ‘0 i is an unordered set, let us fix ‘0 = ‘ þ j .
Thereby,
def 1 Xb
X  ð1Þ  
0
K ð‘; ‘0 Þ ¼  ½ðx‘ ; x‘0 Þ ¼ K ðpÞeiðp;‘‘ Þ ½40 1 ð1Þ iðp;‘Þ iðp;‘0 Þ
jj p2 x ‘  x ‘ 0 e  e

jj1=2 h‘;‘0 i2E
By construction, for any ‘0 2 , d h i
2 XX ð1Þ ð1Þ
0 0 ¼ x‘ eiðp;‘Þ  x‘ eiðp;‘Þ cosðp; j Þ
K ð‘; ‘ Þ ¼ K ð‘ þ ‘0 ; ‘ þ ‘0 Þ ½41 jj1=2 ‘2 j¼1

where addition is componentwise modulo 2L. This ¼ 2^ ð1Þ


x ðpÞEðpÞ
means that K (‘, ‘0 ) is invariant with respect to the
translations on the corresponding torus. One can In view of [41], one has
show that K (‘, ‘0 ) converges, as L ! þ1, to K(‘, ‘0 ) b ð1Þ ðpÞ
xð1Þ ðpÞ^
 ½^ xð1Þ ðp0 Þ ¼ 0;pþp0 K
discussed in the Introduction. The corresponding 

Gibbs state of the whole model is called the periodic Then employing the latter two facts and [37], we get
Gibbs state. By construction, it is translation X h  
ð1Þ ð1Þ ð1Þ ð1Þ
invariant. Set J  x‘1  x‘0 x‘2  x‘0 h‘2 ‘02 ðpÞ
1 2
h‘2 ;‘02 i2E
X
d h  i
ð1Þ ð1Þ
EðpÞ ¼ ½1  cos pj ; p 2 ð; d ½42 ¼ 2JEðpÞ  x‘1  x‘0 ^ð1Þ ðpÞ
x
1
j¼1 h i
1 X
¼ 2JEðpÞ  x^ð1Þ ðp0 Þ^
xð1Þ ðpÞ
Theorem 10 For all p 2  n {0}, jj1=2 p0 2
 
b  ðpÞ  N 0 0 0

eiðp ;‘1 Þ  eiðp ;‘1 Þ
K ½43
2JEðpÞ
b ð1Þ ðpÞh‘ ‘0 ðpÞ
¼ 2JEðpÞK  1 1
Proof Consider the function f ( ) = Z ( h), 2 R,
which proves [46]. Then by [45] Kb (1) (p)  1=2JE(p),
where Z (h) is defined by [25]. By Theorem 9 it has 
a maximum at = 0; hence, b (k) (p), k = 2, . . . , N,
for p 6¼ 0. The same holds for K 
which by [39] yields [43]. &
f 00 ð0Þ  0 ½44
The result just proved and the convergence of
Obviously, f 00 (0) depends on h = (h‘‘0 )h‘, ‘0 i2E , K (‘, ‘0 ) ! K(‘, ‘0 ), as L ! þ1, imply the infrared
h‘‘0 2 RN . Let us choose h such that only the bound [4]. It turns out that the estimate [43]

may be used directly to prove the phase transition. Consider

$$P_\Lambda \stackrel{\mathrm{def}}{=} \frac{1}{|\Lambda|^2} \sum_{\ell_1, \ell_2 \in \Lambda} \varphi_\Lambda[(x_{\ell_1}, x_{\ell_2})] = \varphi_\Lambda\left( \left| \frac{1}{|\Lambda|} \sum_{\ell \in \Lambda} x_\ell \right|^2 \right) \ge 0 \qquad [47]$$

where $\Lambda$ is the box [35]. By [40] and [41], we have

$$P_\Lambda = \frac{1}{|\Lambda|}\, \hat{K}_\Lambda(0) \qquad [48]$$

One can show that if $P \stackrel{\mathrm{def}}{=} \lim_{L \to +\infty} P_\Lambda$ is positive, then there exist multiple Gibbs states. By [40], [41], and [48], we get that for any $\ell \in \Lambda$,

$$K_\Lambda(\ell, \ell) = P_\Lambda + \frac{1}{|\Lambda|} \sum_{p \in \Lambda_* \setminus \{0\}} \hat{K}_\Lambda(p) \qquad [49]$$

Suppose that, cf. [5],

$$K_\Lambda(\ell, \ell) \ge K > 0 \qquad [50]$$

with $K$ independent of $\Lambda$ and $J$. Employing in [49] this estimate and [43], and passing to the limit $L \to +\infty$, we get

$$P \ge K - I(d)N/2J \qquad [51]$$

where

$$I(d) \stackrel{\mathrm{def}}{=} \frac{1}{(2\pi)^d} \int_{(-\pi, \pi]^d} \frac{dp}{E(p)} \qquad [52]$$

which is finite for $d \ge 3$. Thereby, we have proved the following:

Theorem 11 For the spin model [22], [23], there exist multiple Gibbs states, and hence multiple phases, if $d \ge 3$ and $J > I(d)N/2K$.

Finally, let us pay some attention to the estimate [50], which is closely related to the properties of the single-spin measure $\varrho$ (note that $\varrho$ played no role in obtaining [26] and [43]). If it is the uniform measure on the unit sphere $S^{N-1} \subset \mathbb{R}^N$, then $K_\Lambda(\ell, \ell) = 1$ and [50] is trivial. In general, one has to employ some technique to obtain such an estimate.

Reflection Positivity and Phase Transitions in Quantum Systems

As in the classical case, the way of proving the phase transition for appropriate models leads from an estimate like [17] to Gaussian domination and then to the infrared bound. However, here this way is much more complicated, so within the scope of this article we can only sketch its main elements, following the original paper by Dyson et al. (1978), where the interested reader can find the details. As above, we start by studying reflection positive functionals.

Reflection Positivity in Nonabelian Case

Again we consider a finite set $\Lambda$, $|\Lambda|$ being even. For every $\ell \in \Lambda$, let a complex Hilbert space $\mathcal{H}_\ell$ be given. This is the single-spin physical Hilbert space for our quantum system. We suppose that all $\mathcal{H}_\ell$, $\ell \in \Lambda$, are the copies of a certain finite-dimensional space $\mathcal{H}$. The physical Hilbert space $\mathcal{H}_\Lambda$ corresponding to $\Lambda$ is the tensor product of $\mathcal{H}_\ell$, $\ell \in \Lambda$. Let $\mathcal{A}_\Lambda$ be the algebra of all linear operators defined on $\mathcal{H}_\Lambda$. This is the algebra of observables in our case; it is noncommutative (nonabelian) and contains the unit element $I$, the identity operator. As above, $\Lambda$ splits into two subsets $\Lambda_\pm$, which are the mirror images of each other, that is, we are given a reflection $\theta : \Lambda \to \Lambda$, such that $\theta(\Lambda_+) = \Lambda_-$. This allows us to introduce the corresponding subalgebras $\mathcal{A}_\Lambda^\pm$ by setting the elements of $\mathcal{A}_\Lambda^+$ to be of the form $A \otimes I$, where $A : \mathcal{H}_{\Lambda_+} \to \mathcal{H}_{\Lambda_+}$ is a linear operator and $I$ is the identity operator on $\mathcal{H}_{\Lambda_-}$. Respectively, the elements of $\mathcal{A}_\Lambda^-$ are to be of the form $I \otimes A$. Then we define the map $\vartheta : \mathcal{A}_\Lambda^+ \to \mathcal{A}_\Lambda^-$ as

$$\vartheta(A \otimes I) = I \otimes \bar{A} \qquad [53]$$

where $A \mapsto \bar{A}$ is complex (not Hermitian) conjugation; it may be realized as transposing and taking Hermitian conjugation. For $A_1, \dots, A_n \in \mathcal{A}_\Lambda$, one has $\overline{A_1 \cdots A_n} = \bar{A}_1 \cdots \bar{A}_n$. We also suppose that $\vartheta$ possesses the properties [8]. A linear functional $\varphi : \mathcal{A}_\Lambda \to \mathbb{R}$ is called RP (with respect to the pair $\theta$, $\vartheta$) if it has the property [9].

Definition 12 A functional $\varphi$ is called generalized reflection positive (GRP) if for any $A_1, \dots, A_n \in \mathcal{A}_\Lambda^+$,

$$\varphi[A_1 \vartheta(A_1) \cdots A_n \vartheta(A_n)] \ge 0 \qquad [54]$$

In principle, this notion differs from the reflection positivity only in the nonabelian case. However, if the algebras $\mathcal{A}_\Lambda^\pm$ commute (they do commute in our case), a functional is RP if and only if it is GRP.

Example 13 Let

$$\varphi(A) = \mathrm{trace}(A), \qquad A \in \mathcal{A}_\Lambda \qquad [55]$$

Since the space $\mathcal{H}_\Lambda$ is finite dimensional, this is well defined. It is GRP. Indeed, as the algebras $\mathcal{A}_\Lambda^\pm$ commute, we have

$$\begin{aligned}
\varphi[A_1 \otimes I\, \vartheta(A_1 \otimes I) \cdots A_n \otimes I\, \vartheta(A_n \otimes I)] &= \varphi[A_1 \otimes I \cdots A_n \otimes I\; \vartheta(A_1 \otimes I) \cdots \vartheta(A_n \otimes I)] \\
&= \varphi[A_1 \otimes I \cdots A_n \otimes I\; \vartheta(A_1 \otimes I \cdots A_n \otimes I)] \\
&= \mathrm{trace}[A_1 \cdots A_n]\; \mathrm{trace}[\bar{A}_1 \cdots \bar{A}_n] \\
&= |\mathrm{trace}[A_1 \cdots A_n]|^2 \ge 0
\end{aligned}$$
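The computation in Example 13 can be reproduced numerically for small matrices. The sketch below is an illustration under assumptions that are not in the text: both halves of the system are modelled by the same space $\mathbb{C}^d$, the tensor product is realized by `numpy.kron`, and the helper names `embed`, `reflect`, and `grp_value` are hypothetical.

```python
import numpy as np


def embed(A):
    """Return A (x) I, an element of the 'plus' subalgebra."""
    d = A.shape[0]
    return np.kron(A, np.eye(d))


def reflect(A):
    """Image of A (x) I under the map [53]: I (x) conj(A), entrywise conjugate."""
    d = A.shape[0]
    return np.kron(np.eye(d), A.conj())


def grp_value(mats):
    """trace[ A_1 theta(A_1) ... A_n theta(A_n) ] for the trace functional [55]."""
    d = mats[0].shape[0]
    prod = np.eye(d * d, dtype=complex)
    for A in mats:
        prod = prod @ embed(A) @ reflect(A)
    return np.trace(prod)


if __name__ == "__main__":
    rng = np.random.default_rng(7)
    mats = [rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)) for _ in range(3)]
    value = grp_value(mats)
    target = abs(np.trace(mats[0] @ mats[1] @ mats[2])) ** 2
    print("GRP value:", value, " |trace(A1 A2 A3)|^2:", target)
```

Because `embed(A)` and `reflect(B)` act on different tensor factors, they commute, which is exactly the property used in the example to regroup the product.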

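The Cauchy–Schwarz inequality [13] obviously holds also in the quantum case. By means of this inequality and the Trotter product formula

$$\exp(A + B) = \lim_{n \to +\infty} [\exp(A/n) \exp(B/n)]^n \qquad [56]$$

one can prove that every RP functional obeys an estimate like [17]. Thereby, we have the following analog of Lemma 6:

Lemma 14 Let $A, B, C_1, \dots, C_m \in \mathcal{A}_\Lambda^+$ be any self-adjoint operators possessing real matrix representation and $a_1, \dots, a_m$ be any real numbers. Then

$$\left[ \mathrm{trace} \exp\left( A + \vartheta(B) - \sum_{n=1}^m [C_n - \vartheta(C_n) - a_n]^2 \right) \right]^2 \le \mathrm{trace} \exp\left( A + \vartheta(A) - \sum_{n=1}^m [C_n - \vartheta(C_n)]^2 \right) \times \mathrm{trace} \exp\left( B + \vartheta(B) - \sum_{n=1}^m [C_n - \vartheta(C_n)]^2 \right) \qquad [57]$$

Gaussian Domination and Phase Transitions

To proceed further we need a concrete model with finite-dimensional physical Hilbert spaces. As every quantum model, it is defined by its Hamiltonian. Let $\Lambda \subset \mathbb{Z}^d$ be the box [35] and $(\Lambda, E)$ be the same graph as in the subsection ‘‘Infrared bound.’’ The periodic Hamiltonian of our model is

$$H_\Lambda = \sum_{\ell \in \Lambda} Q_\ell + \frac{1}{2} \sum_{\langle \ell, \ell' \rangle \in E} |S_\ell - S_{\ell'}|^2 \qquad [58]$$

where at each $\ell \in \Lambda$ we have the copies $Q_\ell$, $S_\ell^{(1)}, \dots, S_\ell^{(N)}$ of $N + 1$ basic operators, acting in the Hilbert space $\mathcal{H}_\ell$, and

$$|S_\ell - S_{\ell'}|^2 = \sum_{k=1}^N \left( S_\ell^{(k)} - S_{\ell'}^{(k)} \right)^2$$

The only condition we impose so far is that all these operators can simultaneously be chosen as real matrices. For $h = (h_{\ell\ell'})_{\langle \ell, \ell' \rangle \in E} \in \mathbb{R}^{N|E|}$, we set

$$Z_\Lambda(h) = \mathrm{trace} \exp\left( -\beta \sum_{\ell \in \Lambda} Q_\ell - \frac{\beta}{2} \sum_{\langle \ell, \ell' \rangle \in E} |S_\ell - S_{\ell'} - h_{\ell\ell'}|^2 \right) \qquad [59]$$

where $\beta > 0$ is the inverse temperature.

Theorem 15 For the model [58] and any $h = (h_{\ell\ell'})_{\langle \ell, \ell' \rangle \in E} \in \mathbb{R}^{N|E|}$,

$$Z_\Lambda(h) \le Z_\Lambda(0) \qquad [60]$$

The proof is performed by means of Lemma 14. The periodic local Gibbs state of the model [58] at the inverse temperature $\beta$, analogous to the state [31], is

$$\varphi_\Lambda(A) = \mathrm{trace}\{ A \exp(-\beta H_\Lambda) \} / Z_\Lambda(0), \qquad A \in \mathcal{A}_\Lambda \qquad [61]$$

As in the classical case, one can define the parameter [47]. However, now the fact that $\lim_{L \to +\infty} P_\Lambda > 0$ does not yet imply the phase transition. One has to prove a more general fact

$$\lim_{L' \to +\infty} \left\{ \lim_{L \to +\infty} \varphi_\Lambda\left( \left| \frac{1}{|\Lambda'|} \sum_{\ell \in \Lambda'} S_\ell \right|^2 \right) \right\} > 0 \qquad [62]$$

where $\Lambda'$ is the box [35] of side $2L'$. Furthermore, in the quantum case the Gaussian domination [60] does not lead directly to the estimate [43], which yields [51]. Instead, one can get a bound like [43] but for the Duhamel two-point function (DTF). Given $A, B \in \mathcal{A}_\Lambda$, their DTF is

$$(A, B)_\Lambda = \int_0^1 \varphi_\Lambda\left( A\, e^{-\tau \beta H_\Lambda}\, B\, e^{\tau \beta H_\Lambda} \right) d\tau \qquad [63]$$

By means of [56] one can show that

$$(A, B)_\Lambda = \frac{1}{Z_\Lambda(0)} \left[ \frac{\partial^2}{\partial \lambda\, \partial \mu}\, \mathrm{trace}[\exp(\lambda A + \mu B - \beta H_\Lambda)] \right]_{\lambda = \mu = 0} \qquad [64]$$

Let $\hat{S}(p) = (\hat{S}^{(1)}(p), \dots, \hat{S}^{(N)}(p))$, $p \in \Lambda_*$, be the Fourier image of $S_\ell$, defined by [37], [38]. Then

$$\left( \hat{S}(p), \hat{S}(-p) \right)_\Lambda = \sum_{k=1}^N \left( \hat{S}^{(k)}(p), \hat{S}^{(k)}(-p) \right)_\Lambda$$

Theorem 16 For all $p \in \Lambda_* \setminus \{0\}$, it follows that

$$\left( \hat{S}(p), \hat{S}(-p) \right)_\Lambda \le \frac{N}{2\beta E(p)} \qquad [65]$$

To prove this statement one has to use the Gaussian bound [60] exactly as in the case of Theorem 10. The second derivative with respect to $\lambda$ gives the corresponding DTF (see [64]).

Now let us indicate how the infrared bound [65] leads to the phase transition. To this end we use the simplest quantum spin model with the Hamiltonian [58], for which $Q_\ell = 0$, $N = 2$, and $S_\ell^{(k)}$, $k = 1, 2$, are the copies of the Pauli matrices

$$S^{(1)} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad S^{(2)} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

Then

$$K_\Lambda^{(k)}(\ell, \ell) = \varphi_\Lambda\left( S_\ell^{(k)} S_\ell^{(k)} \right) = 1 \quad \text{for all } \ell \in \Lambda, \; k = 1, 2 \qquad [66]$$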

which gives the bound $K$ (see [50]). For $A, B \in \mathcal{A}_\Lambda$, by $[A, B]$ we denote the commutator $AB - BA$. Set

$$\gamma_\Lambda^{(k)}(p) = \varphi_\Lambda\left( \left[ \hat{S}^{(k)}(p), \left[ H_\Lambda, \hat{S}^{(k)}(-p) \right] \right] \right), \qquad k = 1, 2 \qquad [67]$$

The phase transition in the model we consider can be established by means of the following statement (see Dyson et al. (1978), Theorem 5.1).

Proposition 17 Suppose there exist $\gamma^{(k)}(p)$, $k = 1, 2$, $p \in (-\pi, \pi]^d$ such that, for all $L \in \mathbb{N}$,

$$\gamma_\Lambda^{(k)}(p) \le \gamma^{(k)}(p), \qquad k = 1, 2, \quad p \in \Lambda_* \qquad [68]$$

Then the model undergoes a phase transition at a certain finite $\beta$ if $d \ge 3$ and

$$\frac{1}{(2\pi)^d} \int_{(-\pi, \pi]^d} \left( \frac{\gamma^{(k)}(p)}{8E(p)} \right)^{1/2} dp < 1 \qquad [69]$$

for a certain, and hence for both, $k = 1, 2$.

Thus to prove the phase transition we have to estimate $\gamma_\Lambda^{(k)}(p)$, $k = 1, 2$. By means of the Cauchy–Schwarz inequality, the estimate [69] may be transformed into the following:

$$\frac{1}{(2\pi)^d} \int_{(-\pi, \pi]^d} \left[ \gamma^{(1)}(p) + \gamma^{(2)}(p) \right] dp < 16/I(d)$$

where $I(d)$ is the same as in [52]. The integral on the left-hand side can be estimated from above by $8\sqrt{d(d+1)}$; hence, the latter inequality holds if

$$I(d) \sqrt{d(d+1)} < 2$$

which holds for all $d \ge 3$. In particular, $I(3) \approx 0.505$.

Bibliographic Notes

As the original sources on the RP method in the theory of phase transitions we mention the papers Fröhlich et al. (1976) (classical case), Dyson et al. (1978) (quantum case), and Fröhlich and Lieb (1978) (both cases). In a unified way and with many examples, this method is described in Fröhlich et al. (1978, 1980). A detailed analysis of the method, especially in its applications to classical models with unbounded spins, was given in Shlosman (1986). The techniques based on the chessboard and contour estimates are described in Fröhlich and Lieb (1978) and Shlosman (1986). As was mentioned above, the quantum case is much more complicated; it gets even more complicated if one deals with quantum models employing infinite-dimensional physical Hilbert spaces and unbounded operators, such as quantum crystals. The adaptation of the RP method to such models was made in Driessler et al. (1979), Pastur and Khoruzhenko (1987), Barbulyak and Kondratiev (1992), and Kondratiev (1994). In the latter two papers a general version of the quantum crystal was studied in the framework of the Euclidean approach, based on functional integrals (see Albeverio et al. (2002)). In this approach the quantum crystal is represented as a lattice spin model with unbounded infinite-dimensional spins. As in the case of classical models with unbounded spins, here establishing the estimate [5] becomes a highly nontrivial task. In particular cases, for example, for $\phi^4$-models, one applies special tools like the Bogoliubov inequalities (see Driessler et al. (1979) and Pastur and Khoruzhenko (1987)). In the general case quasiclassical asymptotics allow us to get the lower bound [5] (see Barbulyak and Kondratiev (1992) and Kondratiev (1994)). There is one more technique based on reflection positivity (see Lieb (1989)). It employs reflections in spin spaces, whereas the properties of the index sets (lattices) play no role. This technique proved to be useful in the theory of strongly correlated electron systems, see Tian (2004). Finally, we mention the books of Georgii (1988), Prum (1986), and Sinai (1982) where different aspects of the RP method are described. In Georgii (1988), one can also find extended bibliographical and historical comments on this subject.

See also: Phase Transition Dynamics; Phase Transitions in Continuous Systems; Quantum Spin Systems; Renormalization: Statistical Mechanics and Condensed Matter.

Further Reading

Albeverio S, Kondratiev YG, Kozitsky Y, and Röckner M (2002) Euclidean Gibbs states of quantum lattice systems. Reviews in Mathematical Physics 14: 1–67.
Barbulyak VS and Kondratiev YG (1992) The quasiclassical limit for the Schrödinger operator and phase transitions in quantum statistical physics. Functional Analysis and its Applications 26(2): 61–64.
Driessler W, Landau L, and Fernando-Perez J (1979) Estimates of critical lengths and critical temperatures for classical and quantum lattice systems. Journal of Statistical Physics 20: 123–162.
Dyson FJ, Lieb EH, and Simon B (1978) Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. Journal of Statistical Physics 18: 335–383.
Fröhlich J, Israel R, Lieb EH, and Simon B (1978) Phase transitions and reflection positivity, I. General theory and long range lattice models. Communications in Mathematical Physics 62: 1–34.
Fröhlich J, Israel R, Lieb EH, and Simon B (1980) Phase transitions and reflection positivity, II. Lattice systems with short-range and Coulomb interactions. Journal of Statistical Physics 22: 297–347.
Fröhlich J and Lieb EH (1978) Phase transitions in anisotropic lattice spin systems. Communications in Mathematical Physics 60: 233–267.

Fröhlich J, Simon B, and Spencer T (1976) Infrared bounds, phase transitions and continuous symmetry breaking. Communications in Mathematical Physics 50: 79–85.
Georgii H-O (1988) Gibbs Measures and Phase Transitions. Studies in Mathematics, vol. 9. Berlin: Walter de Gruyter.
Kondratiev JG (1994) Phase transitions in quantum models of ferroelectrics. In: Stochastic Processes, Physics and Geometry, vol. II, pp. 465–475. Singapore: World Scientific.
Lieb EH (1989) Two Theorems on the Hubbard Model. Physical Review Letters 62: 1201–1204.
Pastur LA and Khoruzhenko BA (1987) Phase transitions in quantum models of rotators and ferroelectrics. Theoretical and Mathematical Physics 73: 111–124.
Prum P (1986) Processus sur un Réseau et Mesures de Gibbs. Applications. Techniques Stochastiques. Paris: Masson.
Prum B and Fort J-C (1991) Stochastic Processes on a Lattice and Gibbs Measures (translated from the French by Bertram Eugene Schwarzbach and revised by the authors). Mathematical Physics Studies, vol. 11. Dordrecht: Kluwer Academic Publishers Group.
Shlosman SB (1986) The method of reflection positivity in the mathematical theory of first-order phase transitions. Russian Mathematical Surveys 41: 83–134.
Sinai Ya (1982) Theory of Phase Transitions. Rigorous Results. Oxford: Pergamon.
Tian G-S (2004) Lieb’s spin-reflection-positivity method and its applications to strongly correlated electron systems. Journal of Statistical Physics 116: 629–680.

Regularization for Dynamical ζ-Functions

V Baladi, Institut Mathématique de Jussieu, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

If $A$ is a finite, say $N \times N$, matrix with complex coefficients, the following easy equality gives an expression for the polynomial $\prod_{k=1}^N (1 - z\lambda_k) = \det(\mathrm{Id} - zA)$:

$$\det(\mathrm{Id} - zA) = \exp\left( -\sum_{n=1}^{\infty} \frac{z^n}{n}\, \mathrm{tr}\, A^n \right) \qquad [1]$$

(here, $\mathrm{Id}$ denotes the identity matrix and $\mathrm{tr}$ is the trace of a matrix). Even in this trivial finite-dimensional case, the $z$-radius of convergence of the logarithm of the right-hand side only gives information about the spectral radius (the modulus of the largest eigenvalue) of $A$. The zeros of the left-hand side (i.e., the inverses $z = 1/\lambda_k$ of the nonzero eigenvalues of $A$) can only be located after extending holomorphically the right-hand side. The purpose of this article is to discuss some dynamical situations in which $A$ is replaced by a linear bounded operator $\mathcal{L}$, acting on an infinite-dimensional space, and for which a dynamical determinant (or dynamical ζ-function), constructed from periodic orbits, takes the part of the right-hand side. In the examples presented, $\mathcal{L}$ will be a transfer operator associated to a weighted discrete-time dynamical system: given a transformation $f : M \to M$ on a compact manifold $M$ and a function $g : M \to \mathbb{C}$, we set

$$\mathcal{L}\varphi = g \cdot \varphi \circ f^{-1} \qquad [2]$$

(If $f$ is not invertible, it is understood, e.g., that $f$ has at most finitely many inverse branches, and that the right-hand side of [2] is the sum over these inverse branches, see the next section.) We let $\mathcal{L}$ act on a Banach space of functions or distributions $\varphi$ on $M$. For suitable $g$ (in particular $g = |\det Tf^{-1}|$ when this Jacobian makes sense), the spectrum of $\mathcal{L}$ is related to the fine statistical properties of the dynamics $f$: existence and uniqueness of equilibrium states (related to the maximal eigenvector of $\mathcal{L}$), decay of correlations (related to the spectral gap), limit laws, entropies, etc.; see, for example, Baladi (1998) or Cvitanović et al. (2005). The operator $\mathcal{L}$ is not always trace-class; indeed, it sometimes is not compact on any reasonable space. Even worse, its essential spectral radius may coincide with its spectral radius. (Recall that the essential spectral radius of a bounded linear operator $\mathcal{L}$ acting on a Banach space is the infimum of those $\rho > 0$, such that the spectrum of $\mathcal{L}$ outside of the disk of radius $\rho$ is a finite set of eigenvalues of finite algebraic multiplicity.) However, various techniques allow us to prove that a suitable dynamically defined replacement for the right-hand side of [1] extends holomorphically to a disk in which its zeros describe at least part of the spectrum of $\mathcal{L}$. Some of these techniques have a ‘‘regularization’’ flavor, and we shall concentrate on them.

In the following section, we present the simplest case: analytic expanding or hyperbolic dynamics, for which no regularization is necessary and the Grothendieck–Fredholm theory can be applied. Next, we consider analytic situations where finitely many neutral periodic orbits introduce branch cuts in the dynamical determinant, and see how to ‘‘regularize’’ them. Finally, we discuss a

kneading operator regularization approach, inspired by the work of Milnor and Thurston, and applicable to dynamical systems with finite smoothness.

Despite the terminology, none of the regularization techniques discussed below match the following ‘‘ζ-regularization’’ formula:

$$\prod_{k=1}^{\infty} a_k = \exp\left( -\frac{d}{ds} \sum_{k=1}^{\infty} a_k^{-s} \Big|_{s=0} \right) \qquad [3]$$

(For information about the above ζ-regularization and its applications to physics, we refer, e.g., to Elizalde (1995). See also Voros (1987) and Fried (1986) for more geometrical approaches and further references, e.g., to the work of Ray and Singer.)

We do not cover all aspects of dynamical ζ-functions here. For more information and references, we refer to our survey Baladi (1998), to the more recent surveys by Pollicott (2001) and Ruelle (2002), and also to the exhaustive account by Cvitanović et al. (2005), which contains a rich array of physical applications.

The Grothendieck–Fredholm Case

Let $M$ be a real analytic compact manifold (e.g., the circle or the $d$-torus), and let $f : M \to M$ be real analytic and $g : M \to \mathbb{C}$ be analytic.

First suppose that $f$ is uniformly expanding, that is, there is $\rho > 1$ so that $\|Tf(v)\| \ge \rho \|v\|$. (For example, $f(z) = z^2$ on the unit circle, or a small analytic perturbation thereof.) Consider

$$\mathcal{L}_{f,g}\varphi(x) = \sum_{y : f(y) = x} g(y)\varphi(y) \qquad [4]$$

(For example, with $g(y) = 1/|\det Tf(y)|$ or $1/|\det Tf(y)|^s$.) Ruelle (1976) proved that an operator $\mathcal{L}_0$, which is essentially the same as $\mathcal{L}_{f,g}$ (the difference, if any, arises from the use of Markov partitions, especially in higher dimensions), acting on a Banach space of holomorphic and bounded functions, is not only compact, but is in fact a nuclear operator in the sense of Grothendieck. In particular, the traces of all its powers are well defined, and the Grothendieck–Fredholm (Gohberg et al. 2000) determinant

$$d_0(z) = \exp\left( -\sum_{n=1}^{\infty} \frac{z^n}{n}\, \mathrm{tr}\, \mathcal{L}_0^n \right) \qquad [5]$$

extends to an entire function of finite order, the zeros of which are exactly the inverses of the nonzero eigenvalues of $\mathcal{L}_0$. (The order of the zero coincides with the algebraic multiplicity of the eigenvalue.) Ruelle also proved that the traces can be written as sums over periodic orbits:

$$\mathrm{tr}\, \mathcal{L}_0^n = {\sum_{x : f^n(x) = x}}^{\!\!\!*}\; \frac{\prod_{k=0}^{n-1} g(f^k x)}{|\det(\mathrm{Id} - Tf_x^{-n})|}$$

where $\sum^*$ means that the fixed points of $f^n$ lying in the intersection of two or more elements of the Markov partition must be counted two or more times. (Note that if $f^n(x) = x$, then this closed orbit gives a natural inverse branch for $f^n$.) Taking into account the periodic orbits on the boundaries of the Markov partition, Ruelle expresses the following ‘‘dynamical determinant’’:

$$d_{f,g}(z) = \exp\left[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x : f^n(x) = x} \frac{\prod_{k=0}^{n-1} g(f^k x)}{|\det(\mathrm{Id} - Tf_x^{-n})|} \right] \qquad [6]$$

as an alternating product of determinants $d_0(z)$ as in [5].

The expression [6] is sometimes also called a ‘‘dynamical ζ-function,’’ but we prefer to reserve this terminology for the following power series:

$$\zeta_{f,g}(z) = \exp\left[ +\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x : f^n(x) = x}\; \prod_{k=0}^{n-1} g(f^k x) \right] \qquad [7]$$

It is not difficult to write $\zeta_{f,g}(z)$ as (Baladi 1998) an alternating product of determinants $d_{f,g_i}$, for $i = 0, \dots, d$, and appropriate weights $g_i$.

In fact, the results just described hold in more generality, for example, for piecewise bijective and analytic interval maps. Such maps, $f$, appear naturally, for example, when considering Schottky subgroups of $PSL(2, \mathbb{Z})$. We mention the recent work of Guillopé–Lin–Zworski (2004), who let the transfer operator associated to such $f$ and weights $g_s(y) = 1/|f'(y)|^s$ act (as trace-class operators) on suitable Hilbert spaces of holomorphic functions. This allows them to obtain precise estimates for the number of zeros of $s \mapsto d_{f,g_s}(1)$ in the complex plane: these zeros are the resonances (in the sense of the spectrum of the Laplacian).

Note that the nuclearity properties extend also to the Gauss map $f(x) = \{1/x\}$, which has infinitely many inverse branches, if the weight $g$ has summability properties over the branches (e.g., $g_s(y) = |1/f'(y)|^s$, where $s$ is a complex parameter, with $\Re s > 1/2$). The dynamical determinant $d_{f,g_s}(z)$ for the transfer operator of the Gauss map is related to the Selberg ζ-function (see e.g., Chang and Mayer (2001) and references therein).

Next, assume that $M$ and $g$ are as before, but $f$ is a uniformly hyperbolic real analytic diffeomorphism.
388 Regularization for Dynamical -Functions

For example, $M$ is the 2-torus and $f$ is a small real analytic perturbation of the linear automorphism

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$

More generally, we may assume that $f$ is a real analytic Anosov diffeomorphism, that is, there are $C \ge 1$ and $\lambda > 1$ such that the tangent bundle decomposes as $TM = E^u \oplus E^s$, where the dynamical bundles $E^u$ and $E^s$ are $Tf$-invariant, with $\|Tf^{n}|_{E^s}\| \le C\lambda^{-n}$ and $\|Tf^{-n}|_{E^u}\| \le C\lambda^{-n}$ for all $n \in \mathbb{Z}_+$. In general, the smoothness of $x \mapsto E^u(x)$ and $E^s(x)$ is only Hölder. Under the very strong additional assumption that $E^u(x)$ and $E^s(x)$ are real analytic, Ruelle (1976) (see also Fried (1986)) showed that the power series $d_{f,g}(z)$ can again be written as a finite alternated product (this product being again an artifact of the Markov partition) of entire functions of finite order. For this, he constructed auxiliary transfer operators associated to the expanding (and analytic!) quotiented dynamics acting on holomorphic functions on disks. The analyticity assumption on the dynamical bundles was later lifted by Rugh (1996) (see also Fried (1995)), who let their transfer operators act on Banach topological tensor products of spaces of holomorphic functions on a disk with the dual of such a space. In all these cases, the transfer operator is a nuclear operator in the sense of Grothendieck and no regularization is needed. (More recent work of Kitaev (1999), when applied to this analytic setting, shows that the “meromorphic” function $d_{f,g}(z)$ in fact does not have poles.)

Regularization and Intermittency

Consider the interval $M = [0, 1]$, and $f$ defined on $M$ by $f(x) = f_1(x) = x/(1 - x)$ on $[0, 1/2]$, and $f(x) = f_2(x) = (1 - x)/x$ on $[1/2, 1]$. (This is the Farey map, which appears naturally when considering continued fractions.) Each of the two branches is an analytic bijection onto $[0, 1]$. The second branch is expanding, but the first one, $f_1$, has a (parabolic) neutral fixed point at $x = 0$ (the expansion is $f(x) = x + x^2 + x^3 + \cdots$). Let $g = g_s$ be an analytic weight of the form $g(y) = 1/|f'(y)|^{s}$ for $\Re s \ge 1/2$. We are interested in the spectrum of the operator $\mathcal{L}_{f,g}$ associated with the pair $(f, g)$ by [4]. Clearly, the expression [6] is not a good candidate for an analog of the Fredholm determinant of $\mathcal{L}_{f,g}$. Rugh (1996) introduced a Banach space $\mathcal{B}$ of functions in a complex neighborhood of $M$, having a controlled singularity at 0, and such that the spectral radius of $\mathcal{L}_{f,g}$ on $\mathcal{B}$ is equal to 1, and such that the following regularized determinant

$$d_{f,g}(z) = \exp\left[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x \in (0,1]:\, f^n(x)=x} \frac{\prod_{k=0}^{n-1} g_s(f^k x)}{1 - Tf_x^{-n}} \right] \qquad [8]$$

is a holomorphic function in the cut complex plane $\{z \in \mathbb{C} \mid z \notin [1, \infty)\}$. Furthermore, its zeros $z$ in this cut plane are in bijection with the spectrum of $\mathcal{L}_{f,g}|_{\mathcal{B}}$ outside of the unit interval $[0, 1]$, and this spectrum consists of eigenvalues $1/z$ of finite multiplicities. Finally, these eigenvalues can only accumulate at 0 or 1, although each point in the unit interval belongs to the spectrum of $\mathcal{L}_{f,g}$. In particular, the essential spectral radius of $\mathcal{L}_{f,g}$ on $\mathcal{B}$ coincides with its spectral radius.

Let us define the Banach space $\mathcal{B}$ and explain the key ideas in the proof of the above result (Rugh's claim is in fact more general than the statement above and applies to a class of maps $f$ with neutral fixed points). The starting point is the decomposition

$$\mathcal{L}_{f,g} = \mathcal{L}_1 + \mathcal{L}_2$$

where $\mathcal{L}_i \varphi = \varphi \circ f_i^{-1} \cdot |(f_i^{-1})'|^{s}$. The operator $\mathcal{L}_2$ is of the type discussed in the previous section, and it is nuclear when acting, for example, on bounded holomorphic functions in a complex neighborhood of $M$. Since $f_1$ is not expanding (because of the parabolic fixed point at 0), other ideas must be used to handle the operator $\mathcal{L}_1$. The change of coordinates (this idea goes back to Fatou) $w = 1/x$ replaces the weak contraction $f_1^{-1}$ by the translation $w \mapsto w + 1$ in a suitable domain containing a half-plane $\Re w > w_0$. In order to take into account the weight $g_s$, it is convenient to use the change of variables $\psi(w) = \varphi(1/w) \cdot w^{-2s}$. Indeed, in the new coordinates the operator $\mathcal{L}_1$ reads as

$$\mathcal{M}_1 \psi(w) = \psi(w + 1)$$

The next step consists in letting $\mathcal{M}_1$ act on the Banach space $\mathcal{B}_w$ of Laplace transforms of $L^1(\mathbb{R}_+, \text{Lebesgue})$, that is, functions

$$\psi(w) = \int_0^{\infty} e^{-(w - w_0)t}\, \sigma(t)\, dt$$

with the induced norm $\|\psi\|_{\mathcal{B}_w} = \int |\sigma(t)|\, dt$. Since $\mathcal{M}_1$ maps $\sigma$ to $e^{-t}\sigma(t)$, it is not difficult to see that the spectrum of $\mathcal{M}_1$ on $\mathcal{B}_w$ (and thus of $\mathcal{L}_1$ on the pullback $\mathcal{B}$ of $\mathcal{B}_w$ by the change of variables, which consists of functions in a complex neighborhood of $[0,1]$, holomorphic in a sector at 0, and with a possible, but controlled, singularity at 0) is the closed unit interval. One can check that $\mathcal{L}_2$ is nuclear on $\mathcal{B}$. Composing a bounded operator with a
nuclear operator gives a nuclear operator. If $1/z \notin [0, 1]$, the resolvent $(1 - z\mathcal{L}_1)^{-1}$ is a bounded operator, and therefore, for such $z$, the operator

$$P(z) := z\mathcal{L}_2 (1 - z\mathcal{L}_1)^{-1} \qquad [9]$$

is nuclear on $\mathcal{B}$. We view $P(z)$ as a “regularized” version of $\mathcal{L}_{f,g} = \mathcal{L}_1 + \mathcal{L}_2$. Now, since

$$(1 - z\mathcal{L}_{f,g})^{-1} = (1 - z(\mathcal{L}_1 + \mathcal{L}_2))^{-1} = (1 - z\mathcal{L}_1)^{-1} \left( 1 - z\mathcal{L}_2 (1 - z\mathcal{L}_1)^{-1} \right)^{-1}$$

it is not surprising that one can prove (Rugh 1996) that the Fredholm determinant

$$u \mapsto \det\left( 1 - \mathcal{L}_2 (u - \mathcal{L}_1)^{-1} \right)$$

(which is holomorphic in $u \notin [0, 1]$) has as its zero set $\mathrm{sp}(\mathcal{L}_{f,g}|_{\mathcal{B}}) \setminus [0, 1]$, and that this set consists in isolated eigenvalues of finite multiplicity (equal to the order of the corresponding zero) for $\mathcal{L}_{f,g}$. Formally,

$$(1 - z\mathcal{L}_1)^{-1} = \sum_{k=0}^{\infty} z^k \mathcal{L}_1^{k} \qquad [10]$$

so that the regularization we just described can be viewed as mirroring an induction (or renormalization) procedure, where the dynamics $f$ is replaced by the first-return map to the “chaotic” part of the phase space [0, 1/2]. (For the Farey map, the induced map is just the Gauss map.) The formal equality [10] is also behind the fact that (Rugh 1996)

$$\operatorname{tr} P(z)^{n} = \sum_{x \ne 0:\, f^n(x)=x} \frac{\prod_{k=0}^{n-1} g_s(f^k x)}{1 - Tf_x^{-n}}$$

An extension of this theory to the two-dimensional setting has been obtained by Baladi, Pujals, and Sambarino.

Regularization and Kneading Determinants

Up to now we have only discussed analytic dynamical systems, for which hyperbolicity (or uniform expansion) guaranteed that the transfer operator (or a regularized version thereof) was compact, even nuclear, on a natural Banach space. When considering hyperbolic invertible (or expanding noninvertible) maps $f$, and weights $g$ with “finite smoothness,” say $C^r$ for some finite $r > 1$, the transfer operator defined by [2] or [4] is usually not compact on any infinite-dimensional space. However, one can often prove a “Lasota–Yorke” type inequality (see e.g., Baladi (1998)) which ensures that the essential spectral radius $\rho_{\mathrm{ess}}(\mathcal{L}_{f,g})$, defined in the “Introduction,” is strictly smaller than the spectral radius. Then, the goal is to prove that the dynamical determinant [6] defines a holomorphic function in the disk of radius $1/\rho_{\mathrm{ess}}$, and that its zeros in this disk are exactly the inverses of the eigenvalues of $\mathcal{L}_{f,g}$. For uniformly expanding $C^r$ maps $f$ on compact manifolds, and $C^r$ weights, denoting by $\lambda > 1$ the expansion coefficient as in the section “The Grothendieck–Fredholm case,” this goal was essentially attained by Ruelle (1990). For $\mathcal{L}_{f,g}$ acting on the Banach space of $C^r$ functions on $M$, Ruelle proved $\rho_{\mathrm{ess}}(\mathcal{L}_{f,g}) \le \lambda^{-r}$ and was able to extend $d_{f,g}(z)$ (and interpret its zeros) in the disk of radius $\lambda^{r}$.

For $C^r$ Anosov diffeomorphisms $f$, and $C^r$ weights $g$, Pollicott, Ruelle, Haydn, and others obtained important results using the symbolic dynamics description (for which the maximal smoothness which can be used is $r \le 1$, because of the metric-space model). Later, Kitaev (1999) was able to show that $d_{f,g}(z)$ extends to a holomorphic function in the disk of radius $\lambda^{r/2}$, but did not give any spectral interpretation of the zeros of $d_{f,g}(z)$. More recently, Liverani (2005) was able to give such an interpretation, in a smaller disk however.

All the works mentioned in the previous paragraph are based on some approximation scheme (Taylor expansion style). In the early 1990s, a new approach, with a regularization flavor, was launched (see e.g., Baladi and Ruelle (1996)), initially for piecewise monotone interval maps. We present it next.

Consider a finite set of local homeomorphisms $\psi_\omega : U_\omega \to \psi_\omega(U_\omega)$, where each $U_\omega$ is a bounded open interval of $\mathbb{R}$, and of associated weight functions $g_\omega$ which are continuous, of bounded variation, and have support inside $U_\omega$. For example, the $\psi_\omega$ can be the inverse branches of a single piecewise monotone interval map $f$, and $g_\omega$ can be $g \circ \psi_\omega$ for a single $g$. (No contraction assumption is required on the $\psi_\omega$: their graph can even coincide with the diagonal on a segment.) The transfer operator is now

$$\mathcal{M}\varphi = \sum_{\omega} g_\omega \cdot (\varphi \circ \psi_\omega)$$

Ruelle obtained an estimate, noted $\hat{R}$, for the essential spectral radius of $\mathcal{M}$ acting on the Banach space $BV$ of functions of bounded variation. The main result of Baladi and Ruelle (1996) links the eigenvalues of $\mathcal{M} : BV \to BV$ outside of the disk of radius $\hat{R}$, with the zeros of the following “sharp determinant”:

$$\det{}^{\#}(\mathrm{Id} - z\mathcal{M}) = \exp\left( -\sum_{n=1}^{\infty} \frac{z^n}{n}\, \operatorname{tr}^{\#} \mathcal{M}^{n} \right) \qquad [11]$$

where (with the understanding that $y/|y| = 0$ if $y = 0$)

$$\operatorname{tr}^{\#} \mathcal{M} = \sum_{\omega} \int \frac{1}{2}\, \frac{\psi_\omega(x) - x}{|\psi_\omega(x) - x|}\, dg_\omega(x)$$
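As a rough numerical illustration of the sharp trace, consider a single strict contraction $\psi$ with fixed point $x^*$. Since $\psi(x) - x$ changes sign from $+$ to $-$ at $x^*$, the Stieltjes integral $\int \tfrac12\,\mathrm{sgn}(\psi(x) - x)\, dg(x)$ telescopes to $g(x^*)$ when $g$ vanishes at both ends of the interval. The sketch below checks this with a plain Riemann–Stieltjes sum. The map `psi`, the weight `g`, the interval, and the helper name `sharp_trace` are all illustrative assumptions and do not come from the article.

```python
def sharp_trace(psi, g, a, b, cells=20000):
    """Riemann-Stieltjes sum for  int_a^b (1/2) sgn(psi(x) - x) dg(x).

    Convention from the text: y/|y| is taken to be 0 when y == 0.
    """
    def half_sign(y):
        if y == 0:
            return 0.0
        return 0.5 if y > 0 else -0.5

    h = (b - a) / cells
    total = 0.0
    for i in range(cells):
        left = a + i * h
        right = left + h
        mid = 0.5 * (left + right)
        # integrand sampled at the midpoint, times the increment of g
        total += half_sign(psi(mid) - mid) * (g(right) - g(left))
    return total


# Hypothetical data: one contraction with fixed point x* = 0.2,
# and a continuous weight of bounded variation supported in [-1, 1].
def psi(x):
    return 0.5 * x + 0.1


def g(x):
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0


x_star = 0.2
approx = sharp_trace(psi, g, -1.0, 1.0)
print(approx, g(x_star))
```

Under these assumptions the two printed numbers should agree to within the discretization error, which is the fixed-point weight one expects a trace of a weighted composition operator to pick out.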
If the $\psi_\omega$ are strict contractions which form the set of inverse branches of a piecewise monotone interval map $f$, and $g_\omega = g \circ \psi_\omega$, then integration by parts together with the key property that

$$d\,\frac{x}{2|x|} = \delta, \quad \text{the Dirac delta at the origin of } \mathbb{R}$$

show that $\det^{\#}(\mathrm{Id} - z\mathcal{M}) = 1/\zeta_{f,g}(z)$ (recall [7]). If one assumes instead only that the graph of each admissible composition $\psi^n_w$ of $n$ successive $\psi_\omega$'s (with $n \ge 1$) intersects the diagonal transversally, then

$$\det{}^{\#}(\mathrm{Id} - z\mathcal{M}) = \exp\left[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{\text{admissible } \psi^n_w}\ \sum_{x:\, \psi^n_w(x)=x} L(x, \psi^n_w) \prod_{k=0}^{n-1} g_{\omega_k}\!\left(\psi^k_w(x)\right) \right] \qquad [12]$$

where $L(x, \psi) \in \{-1, 1\}$ is the Lefschetz number of a transversal fixed point $x = \psi(x)$ (if $\psi$ is $C^1$ this is just $\mathrm{sgn}(1 - \psi'(x))$). Therefore, we call the sharp determinant $\det^{\#}(\mathrm{Id} - z\mathcal{M})$ a Ruelle–Lefschetz (dynamical) determinant. For a class of “unimodal” interval maps $f$ and constant weight $g = 1$, the expression [12] with Lefschetz numbers, coming from the additional transversality assumption, gives that $\det^{\#}(\mathrm{Id} - z\mathcal{M})$ is just $1/\zeta^{-}(z)$, where the “negative ζ-function”

$$\zeta^{-}(z) = \exp\left[ +\sum_{n=1}^{\infty} \frac{z^n}{n} \left( 2\,\#\mathrm{Fix}^{-}(f^n) - 1 \right) \right] \qquad [13]$$

is defined by counting (twice) the sets

$$\mathrm{Fix}^{-}(f^n) = \{ x \mid f^n(x) = x,\ f^n \text{ strictly decreasing in a neighborhood of } x \}$$

of “negative fixed points.” This negative ζ-function was studied by Milnor and Thurston, who proved the remarkable identity

$$(\zeta^{-}(z))^{-1} = \det(1 + \hat{D}(z))$$

where $\hat{D}(z)$ is a $1 \times 1$ “matrix,” which is just a power series in $z$ with coefficients in $\{-1, 0, +1\}$, given by the signed itinerary of the image of the turning point (the so-called “kneading” data).

Returning now to the general setup $\psi_\omega$, $g_\omega$, the crucial step in the proof of the spectral interpretation of the zeros of this Ruelle–Lefschetz determinant consists in establishing the following continuous version of the Milnor–Thurston identity:

$$\det{}^{\#}(\mathrm{Id} - z\mathcal{M}) = \det{}^{*}(\mathrm{Id} + \hat{\mathcal{D}}(z)) \qquad [14]$$

where the “kneading operator” $\hat{\mathcal{D}}(z)$ replaces (formally) the finite kneading matrix of Milnor and Thurston. In a suitable $z$-disk, one proves that this operator $\hat{\mathcal{D}}(z)$ is a Hilbert–Schmidt operator on an $L^2$ space (its kernel is bounded and compactly supported), thus allowing the use of regularized determinants of order 2 (see e.g., Gohberg et al. (2000)). By definition, $\det^{*}(\mathrm{Id} + \hat{\mathcal{D}}(z))$ is the product of this regularized determinant with the exponential of the average of the kernel of $\hat{\mathcal{D}}(z)$ along the diagonal, which is well defined. Another kneading operator, $\mathcal{D}(z)$, is essential. If $1/z$ is not in the spectrum of $\mathcal{M}$ (on $BV$), then $\mathcal{D}(z)$ is also Hilbert–Schmidt, and one can show $\det^{*}(\mathrm{Id} + \hat{\mathcal{D}}(z)) = \det^{*}(\mathrm{Id} + \mathcal{D}(z))^{-1}$. The initial definitions of $\hat{\mathcal{D}}(z)$ and $\mathcal{D}(z)$ were technical and we shall not give them here. However, a more conceptual definition of the $\mathcal{D}(z)$ was later implemented:

$$\mathcal{D}(z) = \mathcal{N} (\mathrm{Id} - z\mathcal{M})^{-1} \mathcal{S} \qquad [15]$$

where $\mathcal{N}$ is an auxiliary transfer operator and $\mathcal{S}$ is the convolution

$$\mathcal{S}\varphi(x) = \int \frac{1}{2}\, \frac{x - y}{|x - y|}\, \varphi(y)\, d\mu(y)$$

where $\mu$ is an auxiliary non-negative finite measure. From [15], it becomes clear that the kneading operator is a regularized (through the convolution $\mathcal{S}$) object which describes the inverse spectrum of the transfer operator: the resolvent $(\mathrm{Id} - z\mathcal{M})^{-1}$ in [15] means that poles can only appear if $1/z$ is an eigenvalue. Since $\det^{*}(\mathrm{Id} + \hat{\mathcal{D}}(z)) = \det^{*}(\mathrm{Id} + \mathcal{D}(z))^{-1}$, this can be translated into a statement for zeros of $\det^{*}(\mathrm{Id} + \hat{\mathcal{D}}(z))$. The Milnor–Thurston identity [14] then implies that any zero of $\det^{\#}(\mathrm{Id} - z\mathcal{M})$ is an inverse eigenvalue of $\mathcal{M}$.

The one-dimensional kneading regularization we just presented is well understood. The higher-dimensional theory is not as developed yet. Let $U_\omega$ be now finitely many bounded open subsets of $\mathbb{R}^d$, $\psi_\omega : U_\omega \to \psi_\omega(U_\omega)$ be local $C^r$ homeomorphisms or diffeomorphisms, while $g_\omega : U_\omega \to \mathbb{C}$ are compactly supported $C^r$ functions, for $r \ge 1$.

In 1995, A Kitaev wrote a two-page sketch proving a higher-dimensional Milnor–Thurston formula, under an additional transversality assumption. This assumption guarantees that the set of fixed points of each fixed period $m$ is finite, so that the Ruelle–Lefschetz determinant $\det^{\#}(\mathrm{Id} - z\mathcal{M})$ can be defined through [12]. Inspired by Kitaev's unpublished note, Baillif (2004) proved the following Milnor–Thurston formula:

$$\det{}^{\#}(\mathrm{Id} - z\mathcal{M}) = \prod_{k=0}^{d-1} \det{}^{\flat}(\mathrm{Id} + \mathcal{D}_k(z))^{(-1)^{k+1}} \qquad [16]$$

Here, the $\mathcal{D}_k(z)$ are kernel operators acting on $(k + 1)$-forms, constructed with the resolvent $(\mathrm{Id} - z\mathcal{M}_k)^{-1}$, together with a convolution operator $\mathcal{S}_k$, mapping
$(k + 1)$-forms to $k$-forms and which satisfies the homotopy equation $d\mathcal{S} + \mathcal{S}d = 1$. The kernel $\sigma_k(x, y)$ of $\mathcal{S}_k$ has singularities of the form $(x - y)/\|x - y\|^{d}$. The transversality assumption allows Baillif to interpret the determinant obtained by integrating the kernels along the diagonal as a flat determinant in the sense of Atiyah and Bott, whence the notation $\det^{\flat}$ in the right-hand side of [16].

Baillif (2004) did not give a spectral interpretation of zeros or poles of the sharp determinant [16], but he noticed that for $|z|$ very small, suitably high iterates of the $\mathcal{D}_k(z)$ are trace-class on $L^2(\mathbb{R}^d)$, showing that the corresponding regularized determinant has a nonzero radius of convergence under weak assumptions. The spectral interpretation of the sharp determinant [12] in arbitrary dimension, but under additional assumptions, was subsequently carried out by Baillif and the author of the present article, giving a new proof of some of the results in Ruelle (1990).

See also: Chaos and Attractors; Dynamical Systems and Thermodynamics; Ergodic Theory; Hyperbolic Dynamical Systems; Number Theory in Physics; Quantum Ergodicity and Mixing of Eigenfunctions; Quillen Determinant; Semi-Classical Spectra and Closed Orbits; Spectral Theory for Linear Operators.

Further Reading

Baillif M (2004) Kneading operators, sharp determinants, and weighted Lefschetz zeta functions in higher dimensions. Duke Mathematical Journal 124: 145–175.
Baladi V (1998) Periodic Orbits and Dynamical Spectra, Ergodic Theory Dynam. Systems, vol. 18, pp. 255–292 (with an addendum by Dolgopyat D and Pollicott M, pp. 293–301).
Baladi V and Ruelle D (1996) Sharp determinants. Inventiones Mathematicae 123: 553–574.
Chang CH and Mayer DH (2001) An extension of the thermodynamic formalism approach to Selberg's zeta function for general modular groups. In: Fiedler B (ed.) Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, pp. 523–562. Springer: Berlin.
Cvitanović P, Artuso R, Mainieri R, Tanner G, and Vattay G (2005) Chaos: Classical and Quantum, ChaosBook.org. Copenhagen: Niels Bohr Institute.
Elizalde E (1995) Ten Physical Applications of Spectral Zeta Functions, Lecture Notes in Physics, New Series m:35. Springer: Berlin.
Fried D (1986) The zeta functions of Ruelle and Selberg. I. Ann. Sci. École Norm. Sup 19: 491–517.
Fried D (1986) Analytic torsion and closed geodesics on hyperbolic manifolds. Inventiones Mathematicae 84: 523–540.
Fried D (1995) Meromorphic zeta functions for analytic flows. Communications in Mathematical Physics 174: 161–190.
Gohberg I, Goldberg S, and Krupnik N (2000) Traces and Determinants of Linear Operators. Basel: Birkhäuser.
Guillopé L, Lin K, and Zworski M (2004) The Selberg zeta function for convex co-compact Schottky groups. Communications in Mathematical Physics 245: 149–176.
Kitaev AY (1999) Fredholm determinants for hyperbolic diffeomorphisms of finite smoothness. Nonlinearity 12: 141–179.
Liverani C (2005) Fredholm determinants, Anosov maps and Ruelle resonances. Discrete and Continuous Dynamical Systems 13: 1203–1215.
Pollicott M (2001) Dynamical zeta functions. In: Katok A, de la Llave R, Pesin Y, and Weiss H (eds.) Smooth Ergodic Theory and Its Applications (Seattle, WA, 1999), Proc. Sympos. Pure Math., vol. 69, pp. 409–427. Providence, RI: American Mathematical Society.
Ruelle D (1976) Zeta functions for expanding maps and Anosov flows. Inventiones Mathematicae 34: 231–242.
Ruelle D (1990) An Extension of the Theory of Fredholm Determinants, Inst. Hautes Études Sci. Publ. Math. 175–193.
Ruelle D (2002) Dynamical Zeta Functions and Transfer Operators, Notices American Mathematical Society: 887–895.
Rugh HH (1996) Generalized Fredholm Determinants and Selberg Zeta Functions for Axiom A Dynamical Systems, Ergodic Theory Dynam. Systems. 805–819.
Rugh HH (1999) Intermittency and regularized Fredholm determinants. Inventiones Mathematicae 135: 1–25.
Voros A (1987) Spectral functions, special functions and the Selberg zeta function. Communications in Mathematical Physics 110: 439–465.

Relativistic Wave Equations Including Higher Spin Fields


R Illge and V Wünsch, Friedrich-Schiller-Universität Jena, Jena, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The description of phenomena at high energies requires the investigation of relativistic wave equations, that is, equations which are invariant under Lorentz transformations. Our discussion will be given classically (i.e., nonquantum). A classification of the wave equations may be based on the spin of the particles (or physical fields), which was discovered for the electron by Goudsmit and Uhlenbeck in 1925. For the greater part of physics, the three spin numbers s = 0, 1/2, and 1 are sufficient; the respective equations are named after their discoverers: Klein–Gordon, Dirac, and Proca for massive fields and D'Alembert, Weyl, and Maxwell for massless fields, respectively (see the following section).

In their original form, these equations look rather different. However, their translation into spinor form shows that the wave equations for bosons and fermions
have the same structure, if s > 0. Therefore, most of the equations dealt with in this article are formulated for spinor fields. (Strictly speaking, the exclusive use of 2-spinors restricts the relativistic invariance to the proper Lorentz group $SO^{+}(1, 3)$. However, all the results presented here can be “translated back” into tensor or bispinor form, respectively (Illge 1993).) Relativistic wave equations for free fields with arbitrary spin s > 0 in Minkowski spacetime are discussed in the section “Higher spin in Minkowski spacetime”; they were first given by Dirac (1936).

In the subsequent section, we explain how the field theory can be extended to curved spacetimes. If a Lagrangian is known, then there exists a well-known mathematical procedure (“Lagrange formalism”) to obtain the field equations, the energy–momentum tensor, etc. All field equations for “low” spin $s \le 1$ arise from an action principle. Consequently, they can be extended to curved spacetime by simply replacing the flat metric and connection with their curved versions. If s > 1, then the wave equations do not follow from a variation principle without supplementary conditions. Nevertheless, one can try to generalize the equations of the section “Higher spin in Minkowski spacetime” to curved spacetime by the “principle of minimal coupling,” too. However, the arising equations are not satisfactory, since there is an algebraic consistency condition in curved space if s > 1 (Buchdahl 1962), and another for charged fields in the presence of electromagnetism if s > 1/2 (Fierz and Pauli 1939).

There have been numerous attempts to avoid these inconsistencies. As a rule, the alternative theories require an extended spacetime structure or additional new fields or they give up some important principle. An extensive literature is devoted to just this problem; unfortunately, a survey article or book is missing.

Finally, we present a possibility to describe fields with arbitrary spin s > 0 within the framework of Einstein's general relativity without any auxiliary fields and subsidiary conditions in a uniform manner. The approach is based on irreducible representations of type $D(s, 0)$ and $D(s - 1/2, 1/2)$ instead of $D(s/2, s/2)$ in the Fierz theory for bosons and $D(s/2 + 1/4, s/2 - 1/4)$ in the Rarita–Schwinger theory for fermions. It was first pointed out by Buchdahl (1982) that this type of field equations can be generalized to a curved spacetime if the mass is positive. After a short time Wünsch (1985) simplified them to their final form:

$$\nabla^{A}{}_{P'}\varphi_{AB\ldots E} + m_1 \chi_{B\ldots EP'} = 0$$
$$\nabla^{P'}{}_{(A}\chi_{B\ldots E)P'} - m_2 \varphi_{AB\ldots E} = 0 \qquad [1]$$

This system contains the well-known wave equations for low spin s = 1/2 and s = 1 as special cases. By iteration we obtain second-order wave equations of normal hyperbolic type. Further, Cauchy's initial-value problem is well posed and a Lagrangian is known. For zero mass, we state the wave equations

$$\nabla^{A}{}_{(A'}\chi_{|A|B'\ldots E')} = 0 \qquad [2]$$

which are just the curved versions of the equations for the potential of a massless field. They are consistent in curved spacetime, too, and the Cauchy problem is well posed (Illge 1988).

Last but not least, let us mention the esthetic aspect. Equations [1] and [2] satisfy Dirac's demand: “Physical laws should have mathematical beauty.”

In the following, we assume that the spacetime and all the spinor and tensor fields are of class $C^{\infty}$. All considerations are purely local. We will call a symmetric (“irreducible”) spinor to be of type $(n, k)$ if and only if it has $n$ unprimed and $k$ primed indices (irrespective of their position). Moreover, we use the notations and conventions of Penrose and Rindler (1984), especially for the curvature spinors $\Psi_{ABCD}$ and $\Phi_{ABA'B'}$.

Wave Equations for Low Spin in Minkowski Spacetime

The spin (or intrinsic angular momentum) of a particle is found to be quantized. Its projection on any fixed direction is an integer or half-integer multiple of Planck's constant $\hbar$; the only possible values are

$$-s\hbar,\ (-s + 1)\hbar,\ \ldots,\ (s - 1)\hbar,\ s\hbar$$

The spin quantum number $s$ so defined can have one of the values s = 0, 1/2, 1, 3/2, 2, ... and is a characteristic for all elementary particles along with their mass $m$ and electric charge $e$. The particles with integer $s$ are called “bosons,” those with half-integer $s$ “fermions.” The three numbers s = 0, 1/2, and 1 are referred to as “low” spin; they are sufficient for the greater part of physics.

The principle of first quantization associates a type of field and a field equation to each type of elementary particles. Massive particles, with rest mass m > 0, and massless particles, with rest mass m = 0, are to be distinguished. Accordingly, we obtain six linear wave equations for $s \le 1$, which read as follows in units such that $c = \hbar = 1$ (see Table 1):

For the sake of simplicity, we consider only free fields in Table 1; no source terms or interaction terms appear here. The associated “free” Lagrangians are given in Table 2.

Since the electromagnetic field tensor $F_{ab}$ satisfies the first part of Maxwell's equations $\partial_{[c}F_{ab]} = 0$, it follows
that a vector field $A_a$ exists such that $F_{ab} = \partial_a A_b - \partial_b A_a$. This vector field is called the “electromagnetic 4-potential.” It is not uniquely determined by the field $F_{ab}$; the freedom in $A_a$ is $A_a \to A_a + \partial_a \lambda$ where $\lambda = \lambda(x)$ is a real-valued function. This gauge transformation of $A_a$ can be used, for example, to obtain the Lorentz gauge condition $\partial^a A_a = 0$.

Table 1 Relativistic wave equations for low spin s = 0, 1/2, and 1

Spin, mass | Wave equation | Associated particles
s = 0, m > 0 | Klein–Gordon eqn.: $(\Box + m^2)u = 0$ | Scalar mesons $\pi, \eta, K, \ldots$
s = 0, m = 0 | D'Alembert eqn.: $\Box u = 0$ | –
s = 1/2, m > 0 | Dirac eqn.: $\partial^{A}{}_{A'}\varphi_A + \frac{im}{\sqrt2}\chi_{A'} = 0$, $\ \partial_{A}{}^{A'}\chi_{A'} - \frac{im}{\sqrt2}\varphi_A = 0$ | Leptons $e, \mu, \tau$; baryons $p, n, \Lambda, \Sigma, \Xi, \ldots$
s = 1/2, m = 0 | Weyl eqn.: $\partial^{AA'}\nu_A = 0$ | Massless(?) neutrinos $\nu_e, \nu_\mu, \nu_\tau$
s = 1, m > 0 | Proca eqn.: $H_{ab} = \partial_a U_b - \partial_b U_a$, $\ \partial^c H_{ca} + m^2 U_a = 0$ | Vector mesons $\rho, \omega, \phi, \psi, \ldots$
s = 1, m = 0 | Maxwell eqn.: $\partial_{[a}F_{bc]} = 0$, $\ \partial_a F^{ab} = 0$ | Photon $\gamma$

Table 2 The Lagrangian densities for free (i.e., noninteracting) fields with low spin

Field | Lagrangian density
Scalar field | $L = \frac12\{(\partial^a u)(\partial_a u) - m^2 u^2\}$
Dirac field | $L = \frac{i}{\sqrt2}\left(\bar\chi_A \partial^{AA'}\chi_{A'} + \bar\varphi_{B'}\partial^{BB'}\varphi_B - \varphi_B \partial^{BB'}\bar\varphi_{B'} - \chi_{A'}\partial^{AA'}\bar\chi_A\right) + m\left(\bar\chi^{A}\varphi_A + \bar\varphi^{A'}\chi_{A'}\right)$
Weyl field | $L = \frac{i}{\sqrt2}\left(\bar\nu_{A'}\partial^{AA'}\nu_A - \nu_A \partial^{AA'}\bar\nu_{A'}\right)$
Proca field | $L = \frac14 H_{ab}H^{ab} - H^{ab}\partial_{[a}U_{b]} + \frac{m^2}{2}U_a U^a$
Maxwell field | $L = -\frac14 F_{ab}F^{ab} = -(\partial_{[a}A_{b]})(\partial^{[a}A^{b]})$

The wave equations listed in Table 1 look rather different, but this formal disadvantage can be overcome. To begin with, we remark that fermions require spinors for their description. The Dirac and Weyl equations are not describable by linear equations for tensor fields. On the other hand, bosons can be described by spinors as well. All tensor equations can be “translated” into spinor form using the mixed spinor–tensor $\sigma^{a}{}_{AA'}$. We will demonstrate this procedure for the Proca field in some detail.

The (possibly complex) skew-symmetric tensor $H_{ab}$ and the vector $U_a$ have the spinor equivalents

$$H_{ab}\,\sigma^{a}{}_{AA'}\sigma^{b}{}_{BB'} = \varphi_{AB}\varepsilon_{A'B'} + \chi_{A'B'}\varepsilon_{AB}$$
$$U_a\,\sigma^{a}{}_{AA'} = \theta_{AA'}$$

where $\varphi$ and $\chi$ are both symmetric spinors: $\varphi_{AB} = \varphi_{(AB)}$, $\chi_{A'B'} = \chi_{(A'B')}$. After a straightforward calculation the Proca equation yields

$$\partial^{C'}{}_{(A}\theta_{B)C'} + \varphi_{AB} = 0, \qquad \partial^{C}{}_{(A'}\theta_{B')C} + \chi_{A'B'} = 0$$
$$\partial^{C}{}_{A'}\varphi_{CA} + \partial_{A}{}^{C'}\chi_{C'A'} + m^2\theta_{AA'} = 0$$

Further, from the equation $\partial_{[c}H_{ab]} = 0$, we obtain $\partial_{A}{}^{C'}\chi_{A'C'} = \partial^{C}{}_{A'}\varphi_{AC}$; thus, the first and second summand in the third equation are equal. Consequently, we find the following spinor form of the Proca equations:

$$\partial^{C}{}_{A'}\varphi_{CA} + \frac{m^2}{2}\theta_{AA'} = 0, \qquad \partial^{C'}{}_{(A}\theta_{B)C'} + \varphi_{AB} = 0$$
$$\partial_{A}{}^{C'}\chi_{C'A'} + \frac{m^2}{2}\theta_{AA'} = 0, \qquad \partial^{C}{}_{(A'}\theta_{B')C} + \chi_{A'B'} = 0 \qquad [3]$$

If the tensor fields $H$ and $U$ are real, then we have $\chi_{A'B'} = \bar\varphi_{A'B'}$, $\theta_{AA'} = \bar\theta_{AA'}$, and the second pair of equations is just the complex conjugate of the first.

Now it is readily seen that the Dirac and Proca equations have the same structure. They are coupled first-order systems of differential equations for pairs of spinor fields. The only decisive difference is that the spinors have one index if s = 1/2 and two indices if s = 1.

We obtain a similar result for Maxwell fields. The real tensor $F_{ab}$ has the spinor equivalent

$$F_{ab}\,\sigma^{a}{}_{AA'}\sigma^{b}{}_{BB'} = \varphi_{AB}\varepsilon_{A'B'} + \bar\varphi_{A'B'}\varepsilon_{AB}$$

with a symmetric spinor $\varphi_{AB}$. The spinor form of Maxwell's equations is (Penrose and Rindler 1984)

$$\partial^{AA'}\varphi_{AB} = 0 \qquad [4]$$

and has the same structure as the Weyl equation. Here we found an example for the power and utility of spinor techniques since they allow the formulation of the wave equations for bosons and fermions in a uniform manner. Only the cases m > 0 and m = 0 are to be distinguished. Moreover, the above results suggest the way for generalizing the wave equations to higher spin. Therefore, we can already end the discussion of the fields with low spin and take them as special cases of those with arbitrary spin.

Higher Spin in Minkowski Spacetime

Massive Fields

Relativistic wave equations for particles with arbitrary spin were first considered by Dirac (1936). His equations read

$$\partial^{A}{}_{P'}\varphi_{AB\ldots DQ'\ldots T'} + m_1\chi_{B\ldots DP'Q'\ldots T'} = 0$$
$$\partial_{A}{}^{P'}\chi_{B\ldots DP'Q'\ldots T'} - m_2\varphi_{AB\ldots DQ'\ldots T'} = 0 \qquad [5]$$
where the spinors ’ and  are of type (n, k) and Then  is symmetric in all its indices since ’ is
(n  1, k þ 1), respectively (corresponding to irredu- divergence-free. Further, we obtain
cible representations of the restricted Lorentz group 0 0

SO$^+$(1, 3)). The constants $m_1$ and $m_2$ are mass parameters ($m^2 = -2m_1 m_2$) and the spin $s$ is one half of the total number of indices of each spinor, $s = \tfrac{1}{2}(n + k)$. As in the preceding section, we assume that electromagnetism and other interactions are absent. We should mention that equations for higher spin were not motivated by observations or empirical facts in that period of time, because only a few elementary particles were known (proton, neutron, electron, positron, and photon), and all of them have low spin (see Table 1). Since that time, particles with $s > 1$ have been found in nature, for example, resonances in scattering experiments.

The system [5] allows a uniform description of free fields with arbitrary spin $s > 0$, including Dirac and Proca fields, as we know from the preceding section. (Remark: The symmetrization in eqns [3] can be omitted since the vector field $U$ is divergence-free as a consequence of the second Proca equation.) Various other field equations proposed subsequently can be comprehended as its special cases (Corson 1953). Examples are the Rarita–Schwinger equations for fermions: if they are written in terms of 2-spinors, then one obtains just the system [5] where the spinor $\varphi$ is of type $(s + 1/2, s - 1/2)$ and the spinor $\chi$ is of type $(s - 1/2, s + 1/2)$.

If we apply $\partial_E{}^{P'}$ to the first of the equations in [5] and use the second, we obtain

$$ (\Box + m^2)\,\varphi_{AB\ldots DQ'\ldots T'} = 0 \qquad [6a] $$

since the second derivatives commute in flat spacetimes. Similarly,

$$ (\Box + m^2)\,\chi_{B\ldots DP'Q'\ldots T'} = 0 \qquad [6b] $$

so both fields $\varphi$ and $\chi$ satisfy a Klein–Gordon type equation. Moreover, eqns [5] imply that each of $\varphi$ and $\chi$ is divergence-free

$$ \partial^{AQ'}\varphi_{AB\ldots DQ'\ldots T'} = 0 = \partial^{BP'}\chi_{B\ldots DP'Q'\ldots T'} \qquad [7] $$

if they have at least one index of each kind.

In a sense, this procedure can be reversed. Let a symmetric spinor field $\varphi$ be given that satisfies [6a] and [7]. (Remark: A significant example is the Fierz system

$$ (\Box + m^2)\,U_{ab\ldots d} = 0; \qquad \partial^a U_{ab\ldots d} = 0 $$

for a symmetric, tracefree tensor field $U$, since the spinor equivalent of $U$ is of type $(k, k)$.) Define

$$ \chi_{B\ldots DP'Q'\ldots T'} := \partial^A{}_{P'}\,\varphi_{AB\ldots DQ'\ldots T'} $$

Then

$$ \partial_E{}^{P'}\chi_{B\ldots DP'Q'\ldots T'} = \partial_E{}^{P'}\partial^A{}_{P'}\,\varphi_{AB\ldots DQ'\ldots T'} = -\frac{1}{2}\,\Box\,\varphi_{EB\ldots DQ'\ldots T'} = \frac{m^2}{2}\,\varphi_{EB\ldots DQ'\ldots T'} $$

since $\varphi$ satisfies the Klein–Gordon equation [6a]. Consequently, the pair $(\varphi, \chi)$ satisfies a system [5]. Obviously, this procedure can be continued: define

$$ \theta_{C\ldots DO'P'Q'\ldots T'} := \partial^B{}_{O'}\,\chi_{B\ldots DP'Q'\ldots T'} $$

etc. We obtain a sequence of spinors of type $(0, 2s), (1, 2s - 1), \ldots, (2s, 0)$ each of which is obtainable from its immediate neighbors by a differentiation contracted on one index. Together, these spinors form an invariant exact set (Penrose and Rindler 1984).

The arguments just given show that there is an ambiguity in the system [5]. The spin $s$ fixes only the total number of indices of $\varphi$ and $\chi$. However, their partition into primed and unprimed ones is not a priori fixed. Therefore, we can choose a "convenient" partition for the respective needs.

Massless Fields

If $m = 0$, then the Dirac system [5] is decoupled. Therefore, we have to state a single equation for a single field. Let $\varphi$ be a spinor field of type $(n, 0)$. The massless free-field equation for spin $\tfrac{1}{2}n$ is then taken to be

$$ \partial^{AA'}\varphi_{AB\ldots E} = 0 \qquad [8] $$

More precisely, the solutions of [8] represent left-handed massless particles with helicity $-\tfrac{1}{2}n\hbar$, whereas the solutions of the complex-conjugate form of this equation are right-handed particles (helicity $+\tfrac{1}{2}n\hbar$). Recall that the Weyl equation ($n = 1$) and the source-free Maxwell equation ($n = 2$) have this form. (Remark: The Bianchi identity in Einstein spaces also falls in this category, with the Weyl spinor $\Psi_{ABCD}$ taking the place of $\varphi_{\ldots}$. Moreover, we may think of [8] with $n = 4$ as the gauge-invariant equation for the weak vacuum gravitational field.)

The massless field equation [8] can be solved using methods of twistor geometry. Moreover, there is an explicit integral formula for representing massless free fields in terms of arbitrarily chosen null data on a light cone (Penrose and Rindler 1984, 1986, Ward and Wells 1990). We do not discuss eqns [8] in detail since they are generally inconsistent in curved spacetimes if $n > 2$ (see the next
section). We only indicate that each solution of [8] satisfies the second-order wave equation

$$ \Box\,\varphi_{AB\ldots E} = 0 $$

Maxwell's equations imply the existence of an electromagnetic potential (cf. section "Wave equations for low spin in Minkowski spacetime"). This concept can be generalized to higher spin. A "potential" for a spinor field $\varphi_{AB\ldots E}$ of type $(n, 0)$ is a spinor field $\chi_{AB'\ldots E'}$ of type $(1, n - 1)$ such that

$$ \partial^A{}_{(A'}\chi_{|A|B'\ldots E')} = 0 \qquad [9] $$

and

$$ \varphi_{AB\ldots E} = \partial_{(B}{}^{B'}\cdots\partial_E{}^{E'}\chi_{A)B'\ldots E'} \qquad [10] $$

One can check in a straightforward manner that a spinor field $\varphi$ that is given by [9] and [10] satisfies the massless equation [8]. If $n > 1$, there is a gauge freedom in these potentials; it turns out to be

$$ \chi_{AB'\ldots E'} \to \chi_{AB'\ldots E'} + \partial_{A(B'}\omega_{C'\ldots E')} $$

for any spinor field $\omega$ of type $(0, n - 2)$. Furthermore, the general massless field $\varphi$ can locally be expressed in this way (Penrose and Rindler 1986).

Wave Equations in Curved Spacetimes, Consistency Conditions

First of all we emphasize that Hamilton's principle of stationary action is extremely important in field theories (see, e.g., Schmutzer (1968)). Assume that the Lagrangian $L$ contains at most first derivatives of a field $\psi$: $L = L(\psi(x), \partial_a\psi(x))$. "Special relativity" states that $L$ is invariant under Lorentz transformations. The Euler–Lagrange equations with respect to variation of $\psi$ read

$$ \frac{\partial L}{\partial\psi} - \partial_a\frac{\partial L}{\partial(\partial_a\psi)} = 0 \qquad [11] $$

and these are the field equations that $\psi$ is required to satisfy.

In "general relativity," the Lagrangian $L$ has to be generally covariant. So we have $L = L(\psi(x), \nabla_a\psi(x))$ and the Euler–Lagrange equations

$$ \frac{\partial L}{\partial\psi} - \nabla_a\frac{\partial L}{\partial(\nabla_a\psi)} = 0 \qquad [12] $$

emerge. If we assume that the Lagrangian $L$ does not contain the curvature tensors and their derivatives explicitly and compare [11] and [12], then it is easily seen how the wave equations in curved spacetime can be obtained: by simply replacing the flat metric and connection with their curved versions. This procedure is called the "principle of minimal coupling."

All equations for low spin in Minkowski spacetime are the Euler–Lagrange equations of a variation principle (see Table 2). Consequently, they can be extended to curved spacetime by simply using the principle of minimal coupling. The arising equations are perfectly acceptable. No complications arise, and so we do not repeat them in this section.

If $s > 1$, then neither the massive nor the massless wave equations follow from a variation principle without supplementary conditions. Nevertheless, we can try to generalize the equations of the previous section to a curved spacetime by formally replacing the flat metric and connection with their curved versions, too. However, serious problems arise:

Let us first consider massless fields of helicity $-\tfrac{1}{2}n\hbar$. The principle of minimal coupling yields

$$ \nabla^A{}_{A'}\varphi_{AB\ldots E} = 0 \qquad [13] $$

If we apply $\nabla_F{}^{A'}$ to this equation, we obtain

$$ \nabla_F{}^{A'}\nabla^A{}_{A'}\varphi_{AB\ldots E} = 0 $$

Since the covariant derivatives do not commute with each other, the term on the left-hand side is not completely symmetric in the unprimed indices. Therefore, this equation can be decomposed into two nontrivial irreducible parts if $n > 1$: symmetrization yields the covariant D'Alembert equation

$$ \nabla_a\nabla^a\varphi_{B\ldots EF} = 0 $$

as required, while antisymmetrization yields by use of the spinor Ricci identities

$$ (n - 2)\,\Psi^{KLM}{}_{(C}\,\varphi_{D\ldots E)KLM} = 0 \qquad [14] $$

where $\Psi_{ABCD}$ is the Weyl spinor. If $n > 2$ and the spacetime is not conformally flat, then this algebraic consistency condition effectively renders eqn [13] useless as physical field equations.

If $m > 0$, the situation is not better. In a somewhat similar way, we obtain the algebraic consistency conditions

$$ (n - 2)\,\Psi^{KLM}{}_{(C}\,\varphi_{D\ldots E)KLMQ'P'\ldots T'} + k\,\Phi^{KLX'}{}_{(Q'}\,\varphi_{|KLC\ldots E|P'\ldots T')X'} = 0 \quad (n > 1) $$

$$ (k - 1)\,\bar\Psi^{X'Y'Z'}{}_{(S'}\,\chi_{|B\ldots DX'Y'Z'|T'\ldots U')} + (n - 1)\,\Phi^{KX'Y'}{}_{(B}\,\chi_{C\ldots D)KX'Y'S'T'\ldots U'} = 0 \quad (k > 0) \qquad [15] $$

if the spinor field $\varphi$ is of type $(n, k)$ (Buchdahl 1962).

We remark that similar consistency conditions occur if we have no gravitation, but an interaction with an electromagnetic field. Then the partial
derivative is to be replaced by $D_a = \partial_a - ieA_a$ and we obtain consistency conditions like [14] and [15], where the curvature spinors are to be replaced by the electromagnetic spinor (Fierz and Pauli 1939).

So far one is left with the problem: "Find the 'correct' laws for arbitrary spin, that means field equations which coincide with the well-known approved ones for low spin and which remain consistent even for higher spin when electromagnetism and/or gravitation is coupled!"

An extensive literature is devoted to just this problem. Let us briefly sketch some means by which the authors tried to solve it:

• derivation of the desired field equations from a variation principle where the original spinor fields are supplemented by auxiliary fields;
• extension of the four-dimensional spacetime geometry to a richer one: higher number of dimensions, complexification, addition of torsion, nonmetrical connection, ...;
• replacement of the algebra of spinors by some richer algebra;
• abandonment of the principle of minimal coupling; and
• supergravity theories.

Some of these attempts are able to solve the problem, at least partially. But, as a rule, they pay a price of new difficulties. In the next section, we offer "good" equations for arbitrary $s > 0$ within the conventional framework of the minimal coupling principle and of a curved spacetime background.

Wave Equations for Arbitrary Spin without Consistency Conditions

Massive Fields

The ansatz which leads to the desired result is surprisingly simple. We avoid the ambiguity in the Dirac system [5] that has been discussed earlier as well as any consistency condition if we state the wave equations

$$ \nabla^A{}_{P'}\varphi_{AB\ldots E} + m_1\chi_{B\ldots EP'} = 0 $$

$$ \nabla^{P'}{}_{(A}\chi_{B\ldots E)P'} - m_2\varphi_{AB\ldots E} = 0 \qquad [16] $$

This system was first proposed by Wünsch (1985); it is equivalent to a pair of equations given by Buchdahl (1982) which contains the Weyl spinor explicitly. As before, $\varphi$ and $\chi$ are symmetric spinor fields, $\varphi$ has $n$ unprimed indices (and no others!) and the constants $m_1$, $m_2$ are mass parameters ($m^2 = -2m_1 m_2$). We assume $m_1 \neq 0$ in this section. Obviously, the Dirac and Proca equations are special cases of [16], choose $n = 1$ and $n = 2$, respectively. (Remark: An electromagnetic field can be included in [16] by $\nabla_a \to D_a = \nabla_a - ieA_a$, and the equations remain consistent (Illge 1993).)

First of all, we remark that eqns [16] are the Euler–Lagrange equations of an action principle. The existence of a Lagrangian is plausible since the number of equations and the number of degrees of freedom are equal. We do not state the Lagrangian, the energy–momentum tensor, and the current vector in this article and refer the reader to Illge (1993).

If $n > 1$, we can apply $\nabla^{BP'}$ to the first equation of [16] and obtain using the spinor Ricci identities:

$$ \nabla^{BP'}\chi_{BC\ldots EP'} = -\frac{1}{m_1}\nabla^{BP'}\nabla^A{}_{P'}\varphi_{ABC\ldots E} = -\frac{n - 2}{m_1}\,\Psi^{KLM}{}_{(C}\,\varphi_{D\ldots E)KLM} \qquad [17] $$

Hence the divergence of $\chi$ vanishes if $n = 2$ or if the spacetime is conformally flat. These are exactly the cases where the symmetrization in the second equation of [16] can be omitted.

Now we are going to derive the second-order equations for $\varphi$ and $\chi$. Substituting

$$ \chi_{BC\ldots EP'} = -\frac{1}{m_1}\nabla^A{}_{P'}\varphi_{AB\ldots E} \qquad [18] $$

into the second equation of [16], we obtain, after a bit of algebra,

$$ \nabla_a\nabla^a\varphi_{AB\ldots E} - 2(n - 1)\,\Psi^{KL}{}_{(AB}\,\varphi_{C\ldots E)KL} + \left(\frac{n + 2}{12}R + m^2\right)\varphi_{AB\ldots E} = 0 \qquad [19] $$

This is a linear second-order equation of normal hyperbolic type for the spinor field $\varphi$. It can be used to solve Cauchy's problem for the system [16]. Similarly, we get a second-order equation for $\chi$:

$$ \nabla_a\nabla^a\chi_{B\ldots EP'} - 2(n - 1)\,\Phi_{(B}{}^{K}{}_{|P'|}{}^{W'}\chi_{C\ldots E)KW'} + \left(\frac{R}{4} + m^2\right)\chi_{B\ldots EP'} = 2\,\frac{n - 1}{n}\,\nabla_{(B|P'|}\nabla^{KW'}\chi_{C\ldots E)KW'} \qquad [20] $$

Seemingly this is not an equation of hyperbolic type if $n > 1$. However, the second derivatives of $\chi$ on the right-hand side of [20] can be eliminated using [17]. Therefore, if the spinor field $\varphi$ is already known by solving [19], then [20] is an equation of Klein–Gordon type, too. However, it is generally inhomogeneous if $n > 2$. A wave equation that contains the spinor field $\chi$ alone exists only if $n = 1$, $n = 2$, or the spacetime is conformally flat.
Now we are going to discuss the "Cauchy problem" for the wave equations [16] (for details see Wünsch (1985)). Let a spacelike hypersurface $S$ be given and let $n^a$ denote the future-directed unit normal vector on $S$ and $\nabla_n = n^a\nabla_a$. The local Cauchy problem is to find a solution $(\varphi, \chi)$ of [16] with given Cauchy data $\varphi^0$, $\chi^0$ on $S$.

In general, the initial data $\varphi^0$ and $\chi^0$ cannot be prescribed arbitrarily. Suppose that a solution $(\varphi, \chi)$ of [16] does exist. Then the differential equations have to be satisfied on $S$, too. Thus, we obtain

$$ (\nabla_n\varphi_{AB\ldots E})\big|_S = 2n_A{}^{A'}\left(\tilde\nabla^F{}_{A'}\varphi_{B\ldots EF} + m_1\chi_{B\ldots EA'}\right)\Big|_S \qquad [21] $$

where the differential operator $\tilde\nabla_{AA'} = \nabla_{AA'} - n_{AA'}\nabla_n$ is just the tangential part of $\nabla_{AA'}$ with respect to $S$. Therefore, the right-hand side of [21] is completely determined by the initial data. Now the symmetry of the solution $\varphi_{AB\ldots E}$ implies the symmetry of $\nabla_n\varphi_{AB\ldots E}$. Consequently, the right-hand side of [21] has to be symmetric with respect to the unprimed indices and so we obtain the following constraints for the initial data if $\varphi$ has at least two indices:

$$ n^{BA'}\left(\tilde\nabla^F{}_{A'}\varphi^0_{B\ldots EF} + m_1\chi^0_{B\ldots EA'}\right)\Big|_S = 0 \qquad [22] $$

Now we can state:

Theorem 1 If the Cauchy data $\varphi^0$ and $\chi^0$ satisfy the constraints [22], then the Cauchy problem has a unique solution in a neighborhood of $S$.

For each differential equation of hyperbolic type we can ask the question whether the wave propagation is "sharp," that is, free of tails. If this property is valid we say that the equation satisfies "Huygens' principle" (for an exact definition, see, e.g., Wünsch (1994)). Using invariant Taylor expansions of the parallel propagator and of the Riesz kernels in normal coordinates we can prove (Wünsch 1985):

Theorem 2 The massive wave equations [16] for spin $s > 0$ satisfy Huygens' principle if and only if the spacetime is of constant curvature and $R = (6m^2/s)$.

Massless Fields

In the preceding section, we have seen that the premise $m_1 \neq 0$ is decisive for the consistency of [16] if $s > 1$. This fact agrees with the result of the previous section, that eqn [13] is inconsistent if $s > 1$ and the spacetime is not conformally flat. On the other hand, $m_2 = 0$ is possible. Therefore we state the wave equations

$$ \nabla^A{}_{(A'}\chi_{|A|B'\ldots E')} = 0 \qquad [23] $$

for a spinor field $\chi$ of type $(1, n - 1)$. This is just eqn [9] for the potential of a massless field. We will show that [23] is a satisfactory equation in a generally curved spacetime (Illge 1988). Unfortunately, no Lagrangian has been found if $n > 1$.

To begin with, we remark that there is a gauge freedom in curved spacetimes, too, since the solution $\chi$ of [23] cannot be uniquely determined if $n > 1$. We use this freedom to prescribe the divergence of $\chi$. So let an arbitrary spinor field $\omega$ of type $(0, n - 2)$ be given. We consider eqns [23] and

$$ \nabla^{AB'}\chi_{AB'C'\ldots E'} = \omega_{C'\ldots E'} $$

or, together,

$$ \nabla^A{}_{A'}\chi_{AB'\ldots E'} = -\frac{n - 1}{n}\,\varepsilon_{A'(B'}\omega_{C'\ldots E')} \qquad [24] $$

If we apply $\nabla_B{}^{A'}$ to this equation, we obtain using the spinor Ricci identities

$$ \nabla_a\nabla^a\chi_{BB'\ldots E'} - 2(n - 1)\,\Phi_B{}^{K}{}_{(B'}{}^{W'}\chi_{|K|C'\ldots E')W'} + \frac{R}{4}\chi_{BB'\ldots E'} = \frac{2(n - 1)}{n}\,\nabla_{B(B'}\omega_{C'\ldots E')} \qquad [25] $$

This is a linear second-order equation of normal hyperbolic type for the spinor field $\chi$ (cf. [20]).

Now let us discuss some particular cases. If $n = 1$, then [23] is just the Weyl equation itself. Therefore, the equations for the field and its potential are identical and there is no gauge freedom. If $n = 2$, then the spinor field $\chi_{AA'}$ is a (complex) vector field and eqn [23] yields

$$ \nabla^A{}_{(A'}\chi_{|A|B')} = 0 $$

The gauge field $\omega$ is just a scalar function; in particular, we can choose $\omega = 0$ (Lorentz gauge). As in eqn [10] we define the field spinor as

$$ \varphi_{AB} = \nabla^{B'}{}_{(A}\chi_{B)B'} $$

Since we have the identity

$$ \nabla^A{}_{B'}\nabla^{A'}{}_{(B}\chi_{A)A'} = \nabla^{A'}{}_{B}\nabla^A{}_{(B'}\chi_{|A|A')} $$

for arbitrary spinor fields $\chi_{AA'}$ (which must not have additional free indices!), the spinor field $\varphi_{AB}$ satisfies the massless free-field equation

$$ \nabla^A{}_{B'}\varphi_{AB} = 0 $$

If $n > 2$, we can define a field $\varphi_{AB\ldots E}$ via the relation [10], too, replacing the partial with the covariant derivatives. But the field equation for $\varphi_{AB\ldots E}$ becomes more complicated than [13]. This
fact is not surprising, since eqn [23] is a consistent one, whereas [13] is inconsistent.

We continue with some remarks on "conformal rescalings of the metric." The equations for massless fields have to be invariant with respect to such transformations. Therefore, the "curved space" scalar wave equation is

$$ \left(\Box + \frac{R}{6}\right)\varphi = 0 \qquad [26] $$

Further, the equations

$$ \nabla^A{}_{(A'}\chi_{|AB\ldots E|B'\ldots F')} = 0 \qquad [27] $$

for any spinor field $\chi$ of type $(n, k)$ are conformally invariant (Penrose and Rindler 1984). In particular, eqns [23] for the massless potential and [13] for the massless field have this property.

We mention a further special case of [27]. If $\chi$ is of type $(k + 1, k)$, then these equations are consistent, too (Frauendiener and Sparling 1999). The Cauchy problem is well posed and a Lagrangian is known. Unfortunately, the solutions do not satisfy a wave equation of second order if $k > 0$.

We conclude with the discussion of the Cauchy problem for eqn [24]. As in the preceding section, let a spacelike hypersurface $S$ and initial data $\chi^0$ on $S$ be given. We can state:

Theorem 3 If a symmetric spinor field $\omega$ of type $(0, n - 2)$ is given, then there exists a neighborhood of $S$ in which eqn [24] has one and only one solution satisfying $\chi|_S = \chi^0$.

The proof is given in Illge (1988). We emphasize that there are no constraints on the Cauchy data for the massless equation [24].

In contrast to massive fields we are far away from an answer to the question whether Huygens' principle is valid for the massless equations. A particular result is Wünsch (1994):

Theorem 4 Huygens' principle for the conformally invariant scalar wave equation [26], the Weyl, and the Maxwell equations is valid only for conformally flat and plane wave metrics within the classes of centrally symmetric, recurrent, (2, 2)-decomposable, Petrov type N, III or D spacetimes as well as those with $\nabla_{[a}R_{b]c} = 0$.

See also: Clifford Algebras and Their Representations; Dirac Fields in Gravitation and Nonabelian Gauge Theory; Euclidean Field Theory; Evolution Equations: Linear and Nonlinear; Spinors and Spin Coefficients; Standard Model of Particle Physics; Twistors.

Further Reading

Buchdahl HA (1962) On the compatibility of relativistic wave equations in Riemann spaces. Nuovo Cimento 25: 486–496.
Buchdahl HA (1982) On the compatibility of relativistic wave equations in Riemann spaces II. Journal of Physics A 15: 1–5.
Corson EM (1953) Introduction to Tensors, Spinors, and Relativistic Wave-Equations. London and Glasgow: Blackie and Son Ltd.
Dirac PAM (1936) Relativistic wave equations. Proceedings of the Royal Society London Series A 155: 447–459.
Fierz M and Pauli W (1939) On relativistic wave equations for particles of arbitrary spin in an electromagnetic field. Proceedings of the Royal Society London Series A 173: 211–232.
Frauendiener J and Sparling GAJ (1999) On a class of consistent higher spin equations on curved manifolds. Journal of Geometry and Physics 30: 54–101.
Greiner W (1997) Relativistic Quantum Mechanics – Wave Equations, 2nd edn. Berlin: Springer.
Illge R (1988) On potentials for several classes of spinor and tensor fields in curved spacetimes. General Relativity and Gravitation 20: 551–564.
Illge R (1993) Massive fields of arbitrary spin in curved space-times. Communications in Mathematical Physics 158: 433–457.
Penrose R and Rindler W (1984) Spinors and Space-Time, Two-Spinor Calculus and Relativistic Fields, vol. 1. Cambridge: Cambridge University Press.
Penrose R and Rindler W (1986) Spinors and Space-Time, Spinor and Twistor Methods in Space-Time Geometry, vol. 2. Cambridge: Cambridge University Press.
Schmutzer E (1968) Relativistische Physik. Leipzig: Teubner-Verlag.
Ward RS and Wells RO (1990) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Wünsch V (1985) Cauchy's problem and Huygens' principle for relativistic higher spin wave equations in an arbitrarily curved space-time. General Relativity and Gravitation 17: 15–38.
Wünsch V (1994) Moments and Huygens' principle for conformally invariant field equations in curved space-times. Annales de l'Institut Henri Poincaré – Physique théorique 60: 433–455.

Renormalization: General Theory

J C Collins, Penn State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Quantum field theories (QFTs) provide a natural framework for quantum theories that obey the principles of special relativity. Among their most striking features are ultraviolet (UV) divergences, which at first sight invalidate the existence of the theories. The divergences arise from Fourier modes of very high wave number, and hence from the structure of the theories at very short distances. In the very restricted class of theories called "renormalizable," the divergences may be removed by a singular redefinition of the parameters of the theory. This is the process of renormalization that defines a QFT as a nontrivial limit of a theory with a UV cutoff.

A very important QFT is the standard model, an accurate and successful theory for all the known interactions except gravity. Calculations using renormalization and related methods are vital to the theory's success.

The basic idea of renormalization predates QFT. Suppose we treat an observed electron as a combination of a bare electron of mass $m_0$ and the associated classical electromagnetic field down to a radius $a$. The observed mass of the electron is its bare mass plus the energy in the field (divided by $c^2$). The field energy is substantial, for example, 0.7 MeV when $a = 10^{-15}$ m, and it diverges when $a \to 0$. The observed mass, 0.5 MeV, is the sum of the large (or infinite) field contribution compensated by a negative and large (or infinite) bare mass. This calculation needs replacing by a more correct version for short distances, of course, but it remains a good motivation.

In this article, we review the theory of renormalization in its classic form, as applied to weak-coupling perturbation theory, or Feynman graphs. It is this method, rather than the Wilsonian approach (see Exact Renormalization Group), that is typically used in practice for perturbative calculations in the standard model, especially its QCD part.

Much of the emphasis is on weak-coupling perturbation theory, where there are well-known algorithmic rules for performing calculations and renormalization. Applications (see Quantum Chromodynamics for some important nontrivial examples) involve further related results, such as the operator product expansion, factorization theorems, and the renormalization group (RG), to go far beyond simple fixed-order perturbation theory. The construction of fully rigorous mathematical treatments for the exact theory is a topic of future research.

Formulation of QFT

A QFT is specified by its Lagrangian density. A simple example is $\phi^4$ theory:

$$ \mathcal{L} \overset{?}{=} \frac{(\partial\phi)^2}{2} - \frac{m^2\phi^2}{2} - \frac{\lambda\phi^4}{4!} \qquad [1] $$

where $\phi(x) = \phi(t, \mathbf{x})$ is a single component Hermitian field. The Lagrangian density and the resulting equation of motion, $\partial^2\phi + m^2\phi + (1/6)\lambda\phi^3 = 0$, are local; they involve only products of fields at the same spacetime point. Such locality is characteristic of relativistic theories, where otherwise it is difficult or impossible to preserve causality, but it is also the source of the UV divergences. The question mark over the equality symbol in eqn [1] is a reminder that renormalization of UV divergences will force us to modify the equation.

The Feynman rules for perturbation theory are given by a free propagator $i/(p^2 - m^2 + i0)$ and an interaction vertex $-i\lambda$. Although we will usually work in four spacetime dimensions, it is useful also to consider the theory in a general spacetime dimensionality $n$, where the coupling has energy dimension $[\lambda] = E^{4-n}$. We use "natural units," that is, with $\hbar = c = 1$. The "$i0$" in the propagator $i/(p^2 - m^2 + i0)$ symbolizes the location of the pole relative to the integration contour; it is often written as $i\epsilon$.

The primary targets of calculations are the vacuum expectation values of time-ordered products of $\phi$; in QFT these are called the Green functions of the theory. From these can be reconstructed the scattering matrix, scattering cross sections, and other measurable quantities.

One-Loop Calculations

Low-order graphs for the connected and amputated four-point Green function are shown in Figure 1. Each one-loop graph has the form

$$ -i\lambda^2 I(p^2) \overset{?}{=} \frac{\lambda^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - m^2 + i0)\,[(p - k)^2 - m^2 + i0]} \qquad [2] $$

where $p$ is a combination of external momenta. There is a divergence from where the loop
momentum $k$ goes to infinity. We define the degree of divergence, $\Delta$, by counting powers of $k$ at large $k$, to get $\Delta = 0$. In an $n$-dimensional spacetime we would have $\Delta = n - 4$. The integral is divergent whenever $\Delta \geq 0$. Comparing the dimensions of the one-loop and tree graphs shows that $\Delta$ equals the negative of the energy dimension of the coupling $\lambda$. Thus, the dimensionlessness of $\lambda$ at the physical spacetime dimension is equivalent to the integral being just divergent.

Figure 1 One-loop approximation to connected and amputated four-point function, before renormalization.

Figure 2 One-loop approximation to renormalized connected and amputated four-point function, with counter-term.

The infinity in the integral implies that the theory in its naive formulation is not defined. With the aid of RG methods, it has been shown that the problem is with the complete theory, not just perturbation theory.

The divergence only arises because we use a continuum spacetime. So suppose that we formulate the theory initially on a lattice of spacing $a$ (in space or spacetime). Our loop graph is now

$$ -i\lambda^2 I(p, m; a) = -\frac{\lambda^2}{32\pi^4}\int d^4k\; S(k, m; a)\, S(p - k, m; a) \qquad [3] $$

where the free propagator $S(k, m; a)$ approaches the usual value $i/(k^2 - m^2 + i0)$ when $k$ is much smaller than $1/a$, and it falls off more rapidly for large $k$. The basic observation that propels the renormalization program is that the divergence as $a \to 0$ is independent of $p$. This is most easily seen by differentiating once with respect to $p$, after which the integral is convergent when $a = 0$, because the differentiated integral has degree of divergence $-1$.

Thus we can cancel the divergence in eqn [2] by replacing the coupling in the first term in Figure 1 by the so-called bare coupling

$$ \lambda_0 = \lambda + 3A(a)\lambda^2 + O(\lambda^3) \qquad [4] $$

Here $A(a)$ is chosen so that the renormalized value of our one-loop graph,

$$ -i\lambda^2 I_R(p^2, m^2) = -i\lambda^2 \lim_{a \to 0}\,[I(p, m; a) + A(a)] \qquad [5] $$

exists, at $a = 0$, with $A(a)$ in fact being real valued. The factor 3 multiplying $A(a)$ in eqn [4] is because there are three one-loop graphs, with equal divergent parts. The replacement for the coupling is made in the tree graph in Figure 1, but not yet at the vertices of the other graphs, because at the moment we are only doing a calculation accurate to order $\lambda^2$; the appropriate expansion parameter of the theory is the finite renormalized coupling $\lambda$, held fixed as $a \to 0$. We call the extra term in eqn [5] a counter-term. The diagrams for the correct renormalized calculation are represented in Figure 2, which has a counter-term graph compared with Figure 1.

In the physics terminology, used here, the cutting-off of the divergence by using a modified theory is called a regularization. This contrasts with the mathematics literature, where "regularized integral" usually means the same as a physicist's "renormalized integral."

There is always freedom to add a finite term to a counter-term. When we discuss the RG, we will see that this corresponds to a reorganization of the perturbation expansion and provides a powerful tool for improving perturbatively based calculations, especially in QCD. Contrary to the impression given in some parts of the literature, it is not necessary that a renormalized mass equal a corresponding physical particle mass, with similar statements for coupling and field renormalization. While such a prescription is common and natural in a simple theory like QED, it is by no means required and certainly may not always be best. If nothing else, the correspondence between fields and stable particles may be poor or nonexistent (as in QCD).

One classic possibility is to subtract the value of the graph at $p = 0$, a prescription associated with Bogoliubov, Parasiuk, and Hepp (BPH), which leads to

$$ -i\lambda^2 I_{R,\mathrm{BPH}}(p^2) = -\frac{i\lambda^2}{32\pi^2}\int_0^1 dx\, \ln\left[1 - p^2x(1 - x)/m^2\right] \qquad [6] $$

In obtaining this from [2], we used a standard Feynman parameter formula,

$$ \frac{1}{AB} = \int_0^1 dx\, \frac{1}{[Ax + B(1 - x)]^2} \qquad [7] $$

to combine the propagator denominators, after which the integral over the momentum variable $k$ is elementary. We then obtain the renormalized one-loop (four-point and amputated) Green function

$$ -i\lambda - i\lambda^2\,[I_R(s) + I_R(t) + I_R(u)] + O(\lambda^3) \qquad [8] $$

where $s$, $t$, and $u$ are the three standard Mandelstam invariants for the Green function. (For a $2 \to 2$
scattering process, or a corresponding off-shell Green function, in which particles of momenta $p_1$ and $p_2$ scatter to particles of momenta $p_1'$ and $p_2'$, the Mandelstam variables are defined as $s = (p_1 + p_2)^2$, $t = (p_1 - p_1')^2$, and $u = (p_1 - p_2')^2$.)

In the general case, with a nonzero degree of divergence, the divergent part of an integral is a polynomial in $p$ and $m$ of degree $D$, where $D$ is the largest non-negative integer less than or equal to $\Delta$. In a higher spacetime dimension, this implies that renormalization of the original, momentum-independent, interaction vertex is not sufficient to cancel the divergences. We would need higher derivative terms, and this is evidence that the theory is not renormalizable in higher than 4 spacetime dimensions. Even so, the terms needed would be local, because of the polynomiality in $p$.

Complete Formulation of Renormalization Program

The full renormalization program motivated by example calculations is:

• the theory is regulated to cut off the divergences;
• the numerical value of each coefficient in $\mathcal{L}$ is allowed to depend on the regulator parameter (e.g., $a$); and
• these dependences are adjusted so that finite results for Green functions are obtained after removal of the regulator.

In $\phi^4$ theory, we therefore replace $\mathcal{L}$ by

$$ \mathcal{L} = \frac{Z}{2}(\partial\phi)^2 - \frac{Zm_0^2}{2}\phi^2 - \frac{Z^2\lambda_0}{4!}\phi^4 \qquad [9] $$

with the bare parameters, $Z$, $m_0$ and $\lambda_0$, having a regulator dependence such that Green functions of $\phi$ are finite at $a = 0$.

The slightly odd labeling of the coefficients in eqn [9] arises because observables like cross sections are invariant under a redefinition of the field by a factor. In terms of the bare field $\phi_0 \overset{\text{def}}{=} \sqrt{Z}\,\phi$, we have

$$ \mathcal{L} = \frac{1}{2}(\partial\phi_0)^2 - \frac{m_0^2}{2}\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4 \qquad [10] $$

The unit coefficient of $(1/2)(\partial\phi_0)^2$ implies that $\phi_0$ has canonical commutation relations (in the regulated theory). This provides a natural standard for the normalization of the bare mass $m_0$ and the bare coupling $\lambda_0$.

All terms in $\mathcal{L}$ have coefficients with dimension zero or larger. This is commonly characterized by saying that the terms in $\mathcal{L}$ "have dimension 4 or less," which refers to the products of field operators and derivatives in each term. A generalization of the power-counting analysis shows that if we start with a theory whose $\mathcal{L}$ only has terms of dimension 4 or less, then no terms of higher dimension are needed as counter-terms, at least not in perturbation theory. This is a very powerful restriction on self-contained QFTs, and was critical in the discovery of the standard model.

Sometimes it is found that the description of some piece of physics appears to need higher-dimension operators, as was the case originally with weak-interaction physics. The lack of renormalizability of such theories indicates that they cannot be complete, and an upper bound on the scale of their applicability can be computed, for example, a few hundred GeV for the four-fermion theory of weak interactions. Eventually, this theory was superseded by the renormalizable Weinberg–Salam theory of weak interactions, now a part of the standard model, to which the four-fermion theory provides a low-energy approximation for charged current weak interactions.

Certain operators of allowed dimensions are missing in eqn [9]: the unit operator, and $\phi$ and $\phi^3$. Symmetry under the transformation $\phi \to -\phi$ implies that Green functions with an odd number of fields vanish, so that no $\phi$ and $\phi^3$ counter-terms are needed. Divergences with the unit operator do appear, but not for ordinary Green functions. In gravitational physics, the coefficient of the unit operator gives renormalization of the cosmological constant.

To implement renormalized perturbation theory, we partition $\mathcal{L}$ (nonuniquely) as

$$ \mathcal{L} = \mathcal{L}_{\text{free}} + \mathcal{L}_{\text{basic interaction}} + \mathcal{L}_{\text{counter-term}} \qquad [11] $$

where the free, the basic interaction, and the counter-term Lagrangians are

$$ \mathcal{L}_{\text{free}} = \frac{1}{2}(\partial\phi)^2 - \frac{m^2}{2}\phi^2 \qquad [12] $$

$$ \mathcal{L}_{\text{basic interaction}} = -\frac{\lambda}{4!}\phi^4 \qquad [13] $$

$$ \mathcal{L}_{\text{counter-term}} = \frac{Z - 1}{2}(\partial\phi)^2 - \frac{Zm_0^2 - m^2}{2}\phi^2 - \frac{Z^2\lambda_0 - \lambda}{4!}\phi^4 \qquad [14] $$

The renormalized coupling and mass, $\lambda$ and $m$, are to be fixed and finite when the UV regulator is removed. Both the basic interaction and the counter-terms are treated as interactions. First we compute "basic graphs" for Green functions using only the basic
402 Renormalization: General Theory

interaction. The counter-terms are expanded in powers of $\lambda$, and then all graphs involving counter-term vertices at the chosen order in $\lambda$ are added to the calculation. The counter-terms are arranged to cancel all the divergences, so that the UV regulator can be removed, with $m$ and $\lambda$ held fixed. The counter-terms cancel the parts of the basic Feynman graphs associated with large loop momenta. An algorithmic specification of the otherwise arbitrary finite parts of the counter-terms is called a renormalization prescription or a renormalization scheme. Thus, it gives a definite relation between the renormalized and bare parameters, and hence a definite specification of the partitioning of $\mathcal{L}$ into its three parts.

It has been proved that this procedure works to all orders in $\lambda$, with corresponding results for other theories. Even in the absence of fully rigorous nonperturbative proofs, it appears clear that the results extend beyond perturbation theory, at least in asymptotically free theories like QCD: see the discussion on Wilsonian RG (see Exact Renormalization Group).

Dimensional Regularization and Minimal Subtraction

The final result for renormalized graphs does not depend on the particular regularization procedure. A particularly convenient procedure, especially in QCD, is dimensional regularization, where divergences are removed by going to a low spacetime dimension $n$. To make a useful regularization method, $n$ is treated as a continuous variable, $n = 4 - 2\epsilon$.

Great advantages of the method are that it preserves Poincaré invariance and many other symmetries (including the gauge symmetry of QCD), and that Feynman graph calculations are minimally more complicated than for finite graphs at $n = 4$, particularly when all the lines are massless, as in many QCD calculations.

Although there is no such object as a genuine vector space of finite noninteger dimension, it is possible to construct an operation that behaves as if it were an integration over such a space. The operation was proved unique by Wilson, and explicit constructions have been made, so that consistency is assured at the level of all Feynman graphs. Whether a satisfactory definition beyond perturbation theory exists remains to be determined.

It is convenient to arrange that the renormalized coupling is dimensionless in the regulated theory. This is done by changing the normalization of $\lambda$ with the aid of an extra parameter, the unit of mass $\mu$:

$$\lambda_0 = \mu^{2\epsilon}(\lambda + \text{counter-terms}) \qquad [15]$$

with $\lambda$ and $\mu$ being held fixed when $\epsilon \to 0$. (Thus, the basic interaction in eqn [13] is changed to $-\lambda\mu^{2\epsilon}\phi^4/4!$.) Then for the one-loop graph of eqn [2], dimensionally regularized Feynman parameter methods give

$$-i\lambda^2 I(p; m, \epsilon) = \frac{i\lambda^2}{32\pi^2}(4\pi)^{\epsilon}\,\Gamma(\epsilon) \int_0^1 dx \left[\frac{m^2 - p^2 x(1-x) - i0}{\mu^2}\right]^{-\epsilon} \qquad [16]$$

A natural renormalization procedure is to subtract the pole at $\epsilon = 0$, but it is convenient to accompany this with other factors to remove some universally occurring finite terms. So $\overline{\mathrm{MS}}$ renormalization (“modified minimal subtraction”) is defined by using the counter-term

$$-iA(\epsilon)\lambda^2 = -i\,\frac{\lambda^2 S_\epsilon}{32\pi^2\epsilon} \qquad [17]$$

where $S_\epsilon \stackrel{\rm def}{=} (4\pi e^{-\gamma_E})^{\epsilon}$, with $\gamma_E = 0.5772\ldots$ being the Euler constant. This gives a renormalized integral (at $\epsilon = 0$)

$$-\frac{i\lambda^2}{32\pi^2}\int_0^1 dx\, \ln\left[\frac{m^2 - p^2 x(1-x)}{\mu^2}\right] \qquad [18]$$

which can be evaluated easily. A particularly simple result is obtained at $m = 0$:

$$\frac{i\lambda^2}{32\pi^2}\left[-\ln\left(\frac{-p^2}{\mu^2}\right) + 2\right] \qquad [19]$$

This formula symptomizes important and very useful algorithmic simplifications in the higher-order massless calculations common in QCD.

The $\overline{\mathrm{MS}}$ scheme amounts to a de facto standard for QCD. At higher orders a factor of $S_\epsilon^L$ is used in the counter-terms, with $L$ being the number of loops.

Coordinate Space

Quantum fields are written as if they are functions of $x$, but they are in fact distributions or generalized functions, with quantum-mechanical operator values. This indicates that using products of fields is dangerous and in need of careful definition. The relation with ordinary distribution theory is simplest in the coordinate-space version of Feynman graphs. Indeed in the 1950s, Bogoliubov and Shirkov formulated renormalization as a problem of defining products of the singular numeric-valued distributions in coordinate-space Feynman graphs; theirs was perhaps the best treatment of renormalization in that era.
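As a numerical cross-check of the step from eqn [18] to eqn [19] in the section on dimensional regularization, the following minimal Python sketch evaluates the Feynman-parameter integral at $m = 0$ with a midpoint rule and compares it with the closed form. It assumes spacelike momentum ($p^2 < 0$) so that the logarithm is real, and it drops the common prefactor $i\lambda^2/(32\pi^2)$. The function names are hypothetical and chosen only for illustration.

```python
import math


def renormalized_bracket(p2_over_mu2, steps=200_000):
    """Midpoint-rule estimate of -int_0^1 dx ln(-p^2 x (1 - x) / mu^2).

    Assumes m = 0 and spacelike momentum, i.e. p2_over_mu2 < 0.
    """
    q = -p2_over_mu2  # positive for spacelike momentum
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoints avoid the endpoint log singularities
        total += math.log(q * x * (1.0 - x))
    return -total * h


def closed_form(p2_over_mu2):
    """Bracket of eqn [19]: -ln(-p^2 / mu^2) + 2."""
    return -math.log(-p2_over_mu2) + 2.0


if __name__ == "__main__":
    for ratio in (-0.25, -1.0, -4.0):
        print(ratio, renormalized_bracket(ratio), closed_form(ratio))
```

The constant 2 in eqn [19] is simply $-\int_0^1 dx\,\ln[x(1-x)]$, which the numerical estimate reproduces to within the accuracy of the midpoint rule.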
For example, the coordinate-space version of eqn [5] is

$$-\lambda^2 \lim_{a\to 0}\int d^4x\, d^4y\; f(x,y)\left[\tfrac{1}{2}\tilde{S}(x-y; m, a)^2 + iA(a)\,\delta^{(4)}(x-y)\right] \qquad [20]$$

where $x$ and $y$ are the coordinates for the interaction vertices, $f(x, y)$ is the product of external-line free propagators, and $\tilde{S}(x-y; m, a)$ is the coordinate-space free propagator, which at $a = 0$ has a singularity

$$\frac{1}{4\pi^2\left[-(x-y)^2 + i0\right]} \qquad [21]$$

as $(x-y)^2 \to 0$. We see in eqn [20] a version of the Hadamard finite part of a divergent integral, and renormalization theory generalizes this to particular kinds of arbitrarily high-dimension integrals. The physical realization and justification of the use of the finite-part procedure is in terms of renormalization of parameters in the Lagrangian; this also gives the procedure a significance that goes beyond the integrals themselves and involves the full nonperturbative formulation of QFT.

General Counter-Term Formulation

We have written $\mathcal{L}$ as a basic Lagrangian density plus counter-terms, and have seen in an example how to cancel divergences at one-loop order. In this section, we will see how the procedure works to all orders. The central mathematical tool is Bogoliubov’s R-operation. Here the counter-terms are expanded as a sum of terms, one for each basic one-particle irreducible (1PI) graph with a non-negative degree of divergence. To each basic graph for a Green function is added a set of counter-term graphs associated with divergences for subgraphs. The central theorem of renormalization is that this procedure does in fact remove all the UV divergences, with the form of the counter-terms being determined by the simple computation of the degree of divergence for 1PI graphs.

To see the essential difficulty to be solved, consider a two-loop graph like the first one in Figure 3. Its divergence is not a polynomial in external momenta, and is therefore not canceled by an allowed counter-term. This is shown by differentiation with respect to external momenta, which does not produce a finite result because of the divergent one-loop subgraph. But for consistency of the theory, the one-loop counter-terms already computed must be themselves put into loop graphs. Among others, this gives the second graph of Figure 3, where the cross denotes that a counter-term contribution is used. The contribution used here is actually 2/3 of the total one-loop counter-term, for reasons of symmetry factors that are not fully evident at first sight. The remainder of the one-loop coupling renormalization cancels a subdivergence in another two-loop graph. It is readily shown that the divergence of the sum of the first two graphs in Figure 3 is momentum independent, and thus can be canceled by a vertex counter-term.

Figure 3 A two-loop graph and its counter-terms. The label B indicates that it is the two-loop overall counter-term for this graph.

This method is fully general, and is formalized in the Bogoliubov R-operation, which gives a recursive specification of the renormalized value $R(G)$ of a graph $G$:

$$R(G) \stackrel{\rm def}{=} G + \sum_{\{\gamma_1,\ldots,\gamma_n\}} G\big|_{\gamma_i \to C(\gamma_i)} \qquad [22]$$

The sum is over all sets of nonintersecting 1PI subgraphs of $G$, and the notation $G|_{\gamma_i \to C(\gamma_i)}$ denotes $G$ with all the subgraphs $\gamma_i$ replaced by associated counter-terms $C(\gamma_i)$. The counter-term $C(\gamma)$ of a 1PI graph $\gamma$ has the form

$$C(\gamma) \stackrel{\rm def}{=} -T\big(\gamma + \text{counter-terms for subdivergences}\big) \qquad [23]$$

Here $T$ is an operation that extracts the divergent part of its argument and whose precise definition gives the renormalization scheme. For example, in minimal subtraction we define

$$T(\Gamma) = \text{pole part at } \epsilon = 0 \text{ of } \Gamma \qquad [24]$$

We formalize the term inside parentheses in eqn [23] as

$$\bar{R}(\gamma) \stackrel{\rm def}{=} \gamma + \text{counter-terms for subdivergences} = \gamma + \sideset{}{'}\sum_{\{\gamma_1,\ldots,\gamma_n\}} \gamma\big|_{\gamma_i \to C(\gamma_i)} \qquad [25]$$

where the prime on the $\sum'$ denotes that we sum over all sets of nonintersecting 1PI subgraphs except for the case that there is a single $\gamma_i$ equal to the whole graph (i.e., the term with $n = 1$ and $\gamma_1 = \gamma$ is omitted).

Note that, for the $\overline{\mathrm{MS}}$ scheme, we define the $T$ operation to be applied to a factor of constant dimension obtained by taking the appropriate power of $\mu$ outside of the pole-part operation. Moreover, it is not a strict pole-part operation; instead each
pole is to be multiplied by $S_\epsilon^L$, where $L$ is the number of loops, and $S_\epsilon$ is defined after eqn [17].

Equations [22]–[25] give a recursive construction of the renormalization of an arbitrary graph. The recursion starts on one-loop graphs, since they have no subdivergences, that is, $C(\gamma) = -T(\gamma)$ for a one-loop 1PI graph.

Each counter-term $C(\gamma)$ is implemented as a contribution to the counter-term Lagrangian. The Feynman rules ensure that once $C(\gamma)$ has been computed, it appears as a vertex in bigger graphs in such a way as to give exactly the counter-terms for subdivergences used in the R-operation. It has been proved that the R-operation does in fact give finite results for Feynman graphs, and that basic power counting in exactly the same fashion as at one-loop determines the relevant operators.

In early treatments of renormalization, a problem was caused by graphs like Figure 4. This graph has three divergent subgraphs which overlap, rather than being nested. Within the R-operation approach, such cases are no harder to deal with than merely nested divergences.

Figure 4 Graph with overlapping divergent subgraphs.

The recursive specification of the R-operation can be converted to a nonrecursive formulation by the forest formula of Zavyalov and Stepanov, later rediscovered by Zimmermann. It is normally the recursive formulation that is suited to all-orders proofs.

Whether these results, proved to all orders of perturbation theory, genuinely extend to the complete theory is not so easy to answer, certainly in a realistic four-dimensional QFT. One illuminating case is of a nonrelativistic quantum mechanics model with a delta-function potential in a two-dimensional space. Renormalization can be applied just as in field theory, but the model can also be treated exactly, and it has been shown that the results agree with perturbation theory.

Perturbation series in relativistic QFTs can at best be expected to be asymptotic, not convergent. So instead of a radius of convergence, we should talk about a region of applicability of a weak-coupling expansion. In a direct calculation of counter-terms, etc., the radius of applicability shrinks to zero as the regulator is removed. However, we can deduce the expansion for a renormalized quantity, whose expansion is expected to have a nonzero range of applicability. We can therefore appeal to the uniqueness of power series expansions to allow the calculation, at intermediate stages, to use bare quantities that are divergent as the regulator is removed.

Renormalizability, Non-Renormalizability, and Super-Renormalizability

The basic power-counting method shows that if a theory with conventional fields (at $n = 4$) has only operators of dimension 4 or less in its $\mathcal{L}$, then the necessary counter-term operators are also of dimension 4 or less. So if we start with a Lagrangian with all possible such operators, given the field content, then the theory is renormalizable. This is not the whole story, as we will see in the discussion of gauge theories.

If we start with a Lagrangian containing operators of dimension higher than 4, then renormalization requires operators of ever higher dimension as counter-terms when one goes to higher orders in perturbation theory. Therefore, such a theory is said to be perturbatively non-renormalizable. Some very powerful methods of cancelation or some nonperturbative effects are needed to evade this result.

In the case of dimension-4 interactions, there is only a finite set of operators given the set of basic fields, but divergences occur at arbitrarily high orders in perturbation theory. If, instead, all the operators have at most dimension 3, then only a finite number of graphs need counter-terms. Such theories are called super-renormalizable. The divergent graphs also occur as subgraphs inside bigger graphs, of course. There is only one such theory in a four-dimensional spacetime: $\phi^3$ theory, which suffers from an energy density that is unbounded from below, so it is not physical. In lower spacetime dimension, where the requirements on operator dimension are different, there are many more known super-renormalizable theories, some with a very rigorous proof of existence.

All the above characterizations rely primarily on perturbative analysis, so they are subject to being not quite accurate in an exact theory, but they form a guide to the relevant issues.

Renormalization and Symmetries: Gauge Theories

In most physical applications, we are interested in QFTs whose Lagrangian is restricted to obey certain symmetry requirements. Are these symmetries preserved by renormalization? That is, is the Lagrangian with all necessary counter-terms still invariant under the symmetry?
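The power-counting classification described in the section on renormalizability can be illustrated with a short Python sketch. It is a simplified picture under stated assumptions: a single scalar field with canonical mass dimension $(n-2)/2$ in $n$ spacetime dimensions, and a monomial interaction $g\,\phi^p$ without derivatives, so that the coupling $g$ has mass dimension $n - p(n-2)/2$. The function names are hypothetical, and the classification is the perturbative one only.

```python
from fractions import Fraction


def coupling_mass_dimension(power, spacetime_dim):
    """Mass dimension of g in the interaction term g * phi**power.

    Assumes a scalar field of canonical dimension (n - 2) / 2 and no derivatives.
    """
    field_dim = Fraction(spacetime_dim - 2, 2)
    return Fraction(spacetime_dim) - power * field_dim


def classify(power, spacetime_dim):
    """Perturbative power-counting classification of g * phi**power."""
    dim = coupling_mass_dimension(power, spacetime_dim)
    if dim > 0:
        return "super-renormalizable"  # operator dimension below n
    if dim == 0:
        return "renormalizable"        # operator dimension equal to n
    return "non-renormalizable"        # operator dimension above n


if __name__ == "__main__":
    for power, n in [(3, 4), (4, 4), (6, 4), (4, 3), (6, 3)]:
        print(f"phi^{power} in n={n}: {classify(power, n)}")
```

At $n = 4$ this reproduces the statements in the text: $\phi^3$ (operator dimension 3) is super-renormalizable, $\phi^4$ (dimension 4) is renormalizable, and $\phi^6$ (dimension 6) is perturbatively non-renormalizable. The $n = 3$ cases show how the requirements shift in lower spacetime dimension.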
We first discuss nonchiral symmetries; these are symmetries in which the left-handed and right-handed parts of Dirac fields transform identically. For Poincaré invariance and simple global internal symmetries, it is simplest to use a regulator, like dimensional regularization, which respects the symmetries. Then it is easily shown that the symmetries are preserved under renormalization. This holds even if the internal symmetries are spontaneously broken (as happens with a “wrong-sign mass term,” e.g., negative $m^2$ in eqn [1]).

The case of local gauge symmetries is harder. But their preservation is more important, because gauge theories contain vector fields which, without a gauge symmetry, generally give unphysical features to the theory. For perturbation theory, BRST quantization is usually used, in which, instead of gauge symmetry, there is a BRST supersymmetry. This is manifested at the Green function level by Slavnov–Taylor identities that are more complicated, in general, than the Ward identities for simple global symmetries and for abelian local symmetries.

Dimensional regularization preserves these symmetries and the Slavnov–Taylor identities. Moreover, the R-operation still produces finite results with local counter-terms, but cancelations and relations occur between divergences for different graphs in order to preserve the symmetry. A simple example is QED, which has an abelian U(1) gauge symmetry, and whose gauge-invariant Lagrangian is

$$\mathcal{L} = -\tfrac{1}{4}\left(\partial_\mu A^{(0)}_\nu - \partial_\nu A^{(0)}_\mu\right)^2 + \bar{\psi}_0\left(i\gamma^\mu\partial_\mu - e_0\gamma^\mu A^{(0)}_\mu - m_0\right)\psi_0 \qquad [26]$$

At the level of individual divergent 1PI graphs, we get counter-terms proportional to $A_\mu^2$ and to $(A_\mu^2)^2$, operators not present in the gauge-invariant Lagrangian. The Ward identities and Slavnov–Taylor identities show that these counter-terms cancel when they are summed over all graphs at a given order of renormalized perturbation theory. Moreover, the renormalization of coupling and the gauge field are inverse, so that $e_0 A^{(0)}_\mu$ equals the corresponding object with renormalized quantities, $\mu^{\epsilon} e A_\mu$. Naturally, sums of contributions to a counter-term in $\mathcal{L}$ can only be quantified with use of a regulator.

In nonabelian theories, the gauge-invariance properties are not just the absence of certain terms in $\mathcal{L}$ but quantitative relations between the coefficients of terms with different numbers of fields. Even so, the argument with Slavnov–Taylor identities generalizes appropriately and proves renormalizability of QCD, for example. But note that the relation concerning the product of the coupling and the gauge field does not generally hold; the form of the gauge transformation is itself renormalized, in a certain sense.

Anomalies

Chiral symmetries, as in the weak-interaction part of the gauge symmetry of the standard model, are much harder to deal with. Chiral symmetries are ones for which the left-handed and right-handed components of Dirac fields transform independently under different components of the symmetry group, local or global as the case may be. Occasionally, some or other of the left-handed or right-handed components may not even be present.

In general, chiral symmetries are not preserved by regularization, at least not without some other pathology. At best one can adjust the finite parts of counter-terms such that in the limit of the removal of the regulator, the Ward or Slavnov–Taylor identities hold. But in general, this cannot be done consistently, and the theory is said to suffer from an anomaly. In the case of chiral gauge theories, the presence of an anomaly prevents the (candidate) theory from being valid. A dramatic and nontrivial result (Adler–Bardeen theorem and some nontrivial generalizations) is that if chiral anomalies cancel at the one-loop level, then they cancel at all orders. Similar results, but more difficult ones, hold for supersymmetries.

The anomaly cancelation conditions in the standard model lead to constraints that relate the lepton content to the quark content in each generation. For example, given the existence of the $b$ quark, and the $\tau$ and $\nu_\tau$ leptons (of masses around 4.5 GeV, 1.8 GeV, and zero respectively), it was strongly predicted on the grounds of anomaly cancelation that there must be a $t$ quark partner of the $b$ to complete the third generation of quark doublets. This prediction was much later vindicated by the discovery of the much heavier top quark with $m_t \simeq 175$ GeV.

Renormalization Schemes

A precise definition of the counter-terms entails a specification of the renormalization prescription (or scheme), so that the finite parts of the counter-terms are determined. This apparently induces extra arbitrariness in the results. However, in the $\phi^4$ Lagrangian (for example), there are really only two independent parameters. (A scaling of the field does not affect any observables, so we do not count $Z$ as a parameter here.) Thus, at fixed regulator parameter $a$ or $\epsilon$, renormalization actually just gives a reparametrization of a two-parameter collection of theories. A renormalization prescription gives the
change of variables between bare and renormalized parameters, a rather singular transformation when the regulator is removed. If we have two different prescriptions, we can deduce a transformation between the renormalized parameters in the two schemes. The renormalized mass and coupling $m_1$ and $\lambda_1$ in one scheme can be obtained as functions of their values $m_2$ and $\lambda_2$ in the other scheme, with the bare parameters, and hence the physics, being the same in both schemes. Since these are renormalized parameters, the removal of the regulator leaves the transformation well behaved.

Generalization to all renormalizable theories is immediate.

Renormalization Group and Applications and Generalizations

One part of the choice of renormalization scheme is that of a scale parameter such as the unit of mass $\mu$ of the $\overline{\mathrm{MS}}$ scheme. The physical predictions of the theory are invariant if a change of $\mu$ is accompanied by a suitable change of the renormalized parameters, now considered as $\mu$-dependent parameters $\lambda(\mu)$ and $m(\mu)$. These are called the effective, or running, coupling and mass. The transformation of the parametrization of the theory is called an RG transformation.

The bare coupling and mass $\lambda_0$ and $m_0$ are RG invariant, and this can be used to obtain equations for the RG evolution of the effective parameters from the perturbatively computed counter-terms. For example, in $\phi^4$ theory, we have (in the renormalized theory after removal of the regulator)

$$\frac{d\lambda}{d\ln\mu^2} = \beta(\lambda) \qquad [27]$$

with $\beta(\lambda) = 3\lambda^2/(16\pi^2) + O(\lambda^3)$. As exemplified in eqns [18] and [19], Feynman diagrams depend logarithmically on $\mu$. By choosing $\mu$ to be comparable to the physical external momentum scale, we remove possible large logarithms in this and higher orders. Thus, provided that the effective coupling at this scale is weak, we get an effective perturbation expansion.

This is a basic technique for exploiting perturbation theory in QCD, for the strong interactions, where the interactions are not automatically weak. In this theory the RG $\beta$ function is negative so that the coupling decreases to zero as $\mu \to \infty$; this is the asymptotic freedom of QCD.

A closely related method is that associated with the Callan–Symanzik equation, which is a formulation of a Ward identity for anomalously broken scale invariance. However, RG methods are the actually used ones, normally, even if sometimes an RG equation is incorrectly labeled as a Callan–Symanzik equation.

The elementary use of the RG is not sufficient for most interesting processes, which involve a set of widely different scales. Then more powerful theorems come into play. Typical are the factorization theorems of QCD (see Quantum Chromodynamics). These express differential cross sections for certain important reactions as a product of quantities that involve a single scale:

$$d\sigma = C(Q, \mu, \lambda(\mu)) \otimes f(m, \mu, \lambda(\mu)) + \text{small correction} \qquad [28]$$

The product is typically a matrix or a convolution product. The factors obey nontrivial RG equations, and these enable different values of $\mu$ to be used in the different factors. Predictions arise because some factors and the kernels of the RG equation are perturbatively calculable, with a weak effective coupling. Other factors, such as $f$ in eqn [28], are not perturbative. These are quantities with names like “parton distribution functions,” and they are universal between many different processes. Thus, the nonperturbative functions can be measured in a limited set of reactions and used to predict cross sections for many other reactions with the aid of calculations of the perturbative factors.

Ultimately, this whole area depends on physical phenomena associated with renormalization.

Concluding Remarks

The actual ability to remove the divergences in certain QFTs to produce consistent, finite, and nontrivial theories is a quite dramatic result. Moreover, associated with the integrals that give the divergences is behavior of the kind that is analyzed with RG methods and generalizations. So the properties of QFTs associated with renormalization get tightly coupled to many interesting consequences of the theories, most notably in QCD.

QFTs are actually very abstruse and difficult theories; only certain aspects currently lend themselves to practical calculations. So the reader should not assume that all aspects of their rigorous mathematical treatment are perfect. Experience, both within the theories and in their comparison with experiment, indicates, nevertheless, that we have a good approximation to the truth.

When one examines the mathematics associated with the R-operation and its generalizations with factorization theorems, there are clearly present some interesting mathematical structures that are not yet formulated in their most general terms. Some
indications of this can be seen in the work by Connes and Kreimer (see Hopf Algebra Structure of Renormalizable Quantum Field Theory), where it is seen that renormalization is associated with a Hopf algebra structure for Feynman graphs.

With such a deep subject, it is not surprising that it lends itself to other approaches, notably the Connes–Kreimer one and the Wilsonian one (see Exact Renormalization Group). Readers new to the subject should not be surprised if it is difficult to get a fully unified view of these different approaches.

Notes on Bibliography

Reliable textbooks on quantum field theory are Sterman (1993) and Weinberg (1995). A clear account of the foundations of perturbative QCD methods is given by Sterman (1996).

A pedagogical account of renormalization and related subjects may be found in Collins (1984).

The best account of renormalization theory before the 1970s is given by Bogoliubov and Shirkov (1959); the viewpoint is very modern, including a coordinate-space distribution-theoretic view. A full account of the Wilsonian method as applied to renormalization is given by Polchinski (1984).

Manuel and Tarrach (1994) give an excellent account of renormalization for a theory with a non-relativistic delta-function potential in 2 space dimensions, which provides a fully tractable model.

Tkachov (1994) reviews a systematic application of distribution theoretic methods to asymptotic problems in QFT. Finally, Weinzierl (1999) provides a construction of dimensional regularization with the aid of K-theory using an underlying vector space of the physical integer dimension. Other constructions, referred to in this paper, follow Wilson and use an infinite-dimensional underlying space.

Acknowledgments

The author would like to thank Professor K Goeke for hospitality and support at the Ruhr-Universität-Bochum. This work was also supported by the US DOE.

See also: Anomalies; BRST Quantization; Effective Field Theories; Electroweak Theory; Euclidean Field Theory; Exact Renormalization Group; High Tc Superconductor Theory; Holomorphic Dynamics; Hopf Algebra Structure of Renormalizable Quantum Field Theory; Lattice Gauge Theory; Operator Product Expansion in Quantum Field Theory; Perturbation Theory and its Techniques; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Quantum Field Theory: A Brief Introduction; Singularities of the Ricci Flow; Standard Model of Particle Physics; Supergravity.

Further Reading

Bogoliubov NN and Shirkov DV (1959) Introduction to the Theory of Quantized Fields. New York: Wiley-Interscience.
Collins JC (1984) Renormalization. Cambridge: Cambridge University Press.
Kraus E (1998) Renormalization of the electroweak standard model to all orders. Annals of Physics 262: 155–259.
Manuel C and Tarrach R (1994) Perturbative renormalization in quantum mechanics. Physics Letters B 328: 113–118.
Polchinski J (1984) Renormalization and effective Lagrangians. Nuclear Physics B 231: 269–295.
Sterman G (1993) An Introduction to Quantum Field Theory. Cambridge: Cambridge University Press.
Sterman G (1996) Partons, factorization and resummation. In: Soper DE (ed.) QCD and Beyond, pp. 327–406. (hep-ph/9606312). Singapore: World Scientific.
Tkachov FV (1994) Theory of asymptotic operation. A summary of basic principles. Soviet Journal of Particles and Nuclei 25: 649.
Weinberg S (1995) The Quantum Theory of Fields, Vol. I. Foundations. Cambridge: Cambridge University Press.
Weinzierl S (1999) Equivariant dimensional regularization, hep-ph/9903380.
Zavyalov OI (1990) Renormalized Quantum Field Theory. Dordrecht: Kluwer.
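Worked example for eqn [27]: the one-loop running of the effective coupling discussed in the section on the renormalization group can be sketched as follows. This is a minimal illustration, assuming that only the leading term of the RG function is kept, $d\lambda/d\ln\mu^2 = b\,\lambda^2$, with the coefficient $b = 3/(16\pi^2)$ taken exactly as quoted after eqn [27]. The closed-form solution is $\lambda(\mu) = \lambda(\mu_0)/[1 - b\,\lambda(\mu_0)\ln(\mu^2/\mu_0^2)]$. The function and constant names are hypothetical.

```python
import math

# Leading coefficient of the RG function, as quoted after eqn [27] (assumption:
# higher-order terms O(lambda^3) are dropped).
B_ONE_LOOP = 3.0 / (16.0 * math.pi ** 2)


def running_coupling(lam0, mu0, mu, b=B_ONE_LOOP):
    """Solve d(lambda)/d(ln mu^2) = b * lambda^2 with lambda(mu0) = lam0."""
    t = math.log(mu ** 2 / mu0 ** 2)
    denom = 1.0 - b * lam0 * t
    if denom <= 0.0:
        # The truncated one-loop solution blows up here; perturbation theory
        # is not applicable at or beyond this scale.
        raise ValueError("scale is beyond the range of the one-loop solution")
    return lam0 / denom


if __name__ == "__main__":
    for mu in (1.0, 10.0, 100.0, 1000.0):
        print(mu, running_coupling(0.5, 1.0, mu))
```

With a positive coefficient, as here, the effective coupling grows slowly with $\mu$. A negative coefficient, as for the QCD coupling, would instead give a coupling that decreases toward zero at large $\mu$, which is the asymptotic freedom mentioned in the text.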

Renormalization: Statistical Mechanics and Condensed Matter


M Salmhofer, Universität Leipzig, Leipzig, Germany
© 2006 Elsevier Ltd. All rights reserved.

Renormalization Group and Condensed Matter

Statistical mechanical systems at critical points exhibit scaling laws of order parameters, susceptibilities, and other observables. The exponents of these laws are universal, that is, independent of most details of the system. For example, the liquid–gas transition for real gases has the same exponents as the magnetization transition in the three-dimensional Ising model.

The renormalization group (RG) was developed by Kadanoff, Wilson, and Wegner, to understand these critical phenomena (Domb and Green 1976). The central idea is that the system becomes scale invariant at the critical point, which makes it natural to average over degrees of freedom on increasing length scales successively in the
calculation of the partition function. This leads to a map between effective interactions associated to different length scales. Thus, the focus shifts from the analysis of a single interaction to that of a flow on a space of interactions. This space is in general much larger than the original formulation of the model would suggest: the description of long-distance or low-energy properties may be in terms of variables that were not even present in the original formulation of the system. Phenomenologically, this corresponds to the emergence of collective degrees of freedom.

Condensed matter theory is itself already an effective theory, and its “microscopic” formulation gets inputs from the underlying theories, which determine in particular the statistics of the particles and their interactions at the scale of atomic energies. At much lower-energy scales, which are relevant for low-temperature phenomena in condensed matter, collective excitations of different, sometimes exotic, statistics may emerge, but the starting point is given naturally in terms of fermionic and bosonic particles. For this reason, the discussion given below will be split in these two cases.

A major difference between high-energy and condensed matter systems is that the latter have a well-defined Hamiltonian which can be used to define the finite-volume ensembles of quantum statistical mechanics and which determines the time evolution, as well as various analyticity properties.

The relevant spatial dimensions in condensed matter are $d \le 3$, but some results in higher dimensions relevant for the development of the method will also be discussed below. The cases $d = 1$ and $d = 2$ have always been of mathematical interest but in recent years have become important for the theory of new materials.

Some interesting topics cannot be covered here due to space restrictions, notably the application of renormalization methods to membrane theory (see Wiese (2001)) and renormalization methods for operators (see Bach et al. (1998)).

The Renormalization Group

In this section we briefly describe the setup of two important versions of the RG, namely the block spin RG and the RG based on scale decompositions of singular covariances.

Block spin RG

Let $\Lambda$ be a finite lattice, for example, a finite subset of $\mathbb{Z}^d$. For the following, it is convenient to take $\Lambda$ to be a cube of side-length $L^K$ for $L > 1$ and some large $K$. Let $T$ be a set and $\Omega_\Lambda = \{\phi : \Lambda \to T\}$ be the set of spin configurations. Common examples for the target space $T$ are $T = \{-1, 1\}$ for the Ising model, $T = S^{N-1}$ for the O(N) model, and $T = \mathbb{R}^n$ for unbounded spins. Let $S_\Lambda : \Omega_\Lambda \to \mathbb{R}$, $\phi \mapsto S_\Lambda(\phi)$ be an interaction and

$$Z(\Lambda, S_\Lambda) = \int \prod_{x\in\Lambda} d\phi(x)\; e^{-S_\Lambda(\phi)} \qquad [1]$$

In the unbounded case, $S_\Lambda$ is assumed to grow sufficiently fast for $|\phi| \to \infty$, so that $Z$ exists; for the case of a finite set $T$, the integral is replaced by a sum. Denote the corresponding Boltzmann factor by $\rho(\Lambda, S_\Lambda)$,

$$\rho(\Lambda, S_\Lambda)(\phi) = \frac{1}{Z(\Lambda, S_\Lambda)}\, e^{-S_\Lambda(\phi)} \qquad [2]$$

The block spin transformation consists of an integration step and a rescaling step. Divide the lattice into cubic blocks of side-length $L$ and define a new lattice $\Lambda'$ by associating one lattice site of the new lattice to each $L$-block of the old lattice. For any $\phi' : \Lambda' \to T$, let

$$\rho'(\phi') = \int \prod_{x\in\Lambda} d\phi(x)\; P(\phi', \phi)\, e^{-S_\Lambda(\phi)} \qquad [3]$$

where $P(\phi', \phi) \ge 0$ and $\int \prod_{x'\in\Lambda'} d\phi'(x')\, P(\phi', \phi) = 1$ for all $\phi$, so that $\rho'$ remains a probability distribution. Since $\rho'$ is positive, one defines

$$S'_{\Lambda'}(\phi') = -\log \rho'(\phi') \qquad [4]$$

By construction, the partition function is invariant: $Z(\Lambda', S'_{\Lambda'}) = Z(\Lambda, S_\Lambda)$. The new lattice $\Lambda'$ has spacing $L$; now rescale to make it a unit lattice. This completes the RG step in finite volume.

In an algorithmic sense, the “blocking rule” $P(\phi', \phi)$ can be viewed as a transition probability of a configuration $\phi$ to a configuration $\phi'$. $P$ may be deterministic, that is, simply fix $\phi'$ as a function of $\phi$. From the intuition of averaging over local fluctuations, $\phi'$ is often taken to be some average of $\phi(x)$ at $x$ in a block around $x'$, hence the name.

Obviously, the thus defined RG transformation often cannot be iterated arbitrarily, since in every application, the number of points of the lattice shrinks by a factor $L^d$, so that after $K$ iterations, a lattice with only a single point is left over. It is necessary to take the infinite-volume limit $L \to \infty$ to obtain a map that operates from a space to itself. However, [4] can become problematic in that limit: Gibbs measures $\rho$ can map to measures $\rho'$ whose large-deviation properties differ from those of Gibbs measures. The discussion of this problem and its solution is reviewed in Bricmont and Kupiainen (2001). The problem can be
solved in different ways, relaxing conditions on Gibbs Again, we assume that the potential v depends on x
measures or, in the Ising model, changing the descrip- and $y$ only via $x - y$, so that translation invariance
tion from the spins to the contours. The crucial point is holds. In both UV and IR cases, naive perturbation
that the difficulties arise only because [4] is applied theory fails even as a formal power series. That is,
globally, that is, to every $\phi'$. The set of bad $\phi'$ has very writing $V = \lambda V_0$, with a coupling constant $\lambda$ which is
small probability. treated as a formal expansion parameter, the singu-
Block spin methods have been used in mathema- larity of C leads to termwise divergences in the series.
tical construction of quantum field theories, for The theory is called perturbatively renormalizable if
example, in the work of Gawedzki and Kupiainen all divergences can be removed by posing counter-
(1985) and Balaban (1988) (see the subsection terms of certain types, which are fixed by physically
‘‘Field theory and statistical mechanics’’). The sensible renormalization conditions. Identifying the
above-mentioned problem was avoided there by UV renormalizable theories was a breakthrough in
not taking a logarithm in the so-called large-field high-energy physics. The IR renormalization problem
region (which has very small probability). is different, and in some respects harder, because
there is almost no freedom to put counter-terms: the
microscopic model is given from the start. This will
Scale Decomposition RG be discussed in more detail below for an example.
The generating functionals of quantum field theory A much more ambitious, and largely open, project
and quantum statistical mechanics can be cast into is to do this renormalization nonperturbatively, that
the form is, to treat $\lambda$ as a real (typically, small) parameter.
$$Z(C, V, \psi) = \int d\mu_C(\psi')\, e^{-V(\psi' + \psi)} \qquad [5]$$
Some results will be discussed below.
The RG is set up by a scale decomposition
$C = \sum_j C_j$. In the example of the massless Gaussian
field, one would take each $\hat C_j$ to be a $C^\infty$ function
Here $d\mu_C$ denotes the Gaussian measure with covar-
iance $C$, and $V$ is the two-body interaction between the supported in the region $\{k \in \mathbb{R}^d : M^j \le k^2 \le M^{j+1}\}$,
particles. The field variables are real or complex for where $M > 1$ is a fixed constant, and the summation
bosons and Grassmann-valued for fermions. Differ- over $j$ runs over $\mathbb{Z}$.
entiating $\log Z$ with respect to the external field $\psi$ The scale decomposition of $C$ leads to a represen-
generates the connected amputated correlation func- tation of [5] by an iteration of Gaussian convolution
tions. The covariance determines the free propagation integrals with covariances $C_j$, hence a sequence of
of particles; the interaction their collisions. effective interactions $V_j$, defined recursively by
$$e^{-V_j(\psi)} = \int d\mu_{C_{j+1}}(\psi')\, e^{-V_{j+1}(\psi' + \psi)}, \qquad V_0 = V \qquad [7]$$
In most cases, such functional integrals are a priori
ill-defined, even if $V$ is small (and bounded from
below) because the covariance $C$ is singular. That is,
the integral kernel $C(X, X')$ of the operator $C$ either For a singular covariance, the scale decomposition is
diverges as $|x - x'| \to 0$ (ultraviolet (UV) problem) or an infinite sum. A formal object like [5] is now
$C(X, X')$ has a slow decay as $|x - x'| \to \infty$ (infrared regularized by starting with a finite sum, that is,
(IR) problem). In our notational convention, $X$ may, imposing a UV and IR cutoff, which is mathemati-
in addition to the configuration variable $x$, also cally well defined, and then taking limits of the thus
contain discrete indices of the fields, such as a spin or defined objects. Again, in condensed matter applica-
color index. The dependence of $C$ on $x$ and $x'$ is tions, imposing an IR cutoff is an operation that
assumed to be of the form $x - x'$. A typical example needs to be justified, for example, by showing that
is the massless Gaussian field in $d$ dimensions, where taking the limit as the cutoff is removed commutes
$C$ is the inverse Fourier transform of $\hat C(k) = 1/k^2$, with the infinite-volume limit.
$k \in \mathbb{R}^d$, which has both a UV and an IR problem, or Note that the RG map, which is the iteration
its lattice analog, $V_j \mapsto V_{j-1}$, goes to lower and lower $j$, corresponding
$$\hat D(k) = \Bigl( \frac{2}{a^2} \sum_{i=1}^{d} \bigl(1 - \cos(a k_i)\bigr) \Bigr)^{-1}$$
to longer and longer length scales. The convention
that the iteration starts at some fixed $j$, for example,
$j = 0$, is appropriate for IR problems. In UV
problems, the iteration would start at some large
with $a$ the lattice constant, which has only an IR $J_{UV}$, which defines a UV cutoff and is taken to
problem. A typical interaction is of the type infinity, to remove the cutoff, at the end.
$$V(\psi) = \int dX\, dY\, \bar\psi(X)\psi(X)\, v(X, Y)\, \bar\psi(Y)\psi(Y) \qquad [6]$$
A variant using a continuous scale decomposition,
$C = \int ds\, \dot C_s$, originally due to Wegner and Houghton,
became very popular after Polchinski (1984) used it

to give a short argument for perturbative renorma- estimates on the rareness of large-field regions
lizability. Polchinski’s equation, the analog of the using cluster expansions. For fermions, the expan-
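recursion [7], reads sion in powers of the fields can be proved to
$$\frac{\partial V}{\partial s} = -\frac{1}{2}\, e^{V} \Delta_{\dot C_s} e^{-V} = \frac{1}{2} \Delta_{\dot C_s} V - \frac{1}{2} \Bigl( \frac{\delta V}{\delta\psi}, \dot C_s \frac{\delta V}{\delta\psi} \Bigr) \qquad [8]$$
converge for regular, summable covariances, which
leads to substantial technical simplifications.
The spatial proliferation of interactions is absent
Here
$$\Delta_C = \Bigl( \frac{\delta}{\delta\psi}, C \frac{\delta}{\delta\psi} \Bigr)$$
only in certain one-dimensional and in specially
constructed higher-dimensional models, the so-
called ‘‘hierarchical models.’’ In these models, the
search for an RG fixed point is still a nonlinear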
fixed-point problem, whose treatment leads to
denotes the Laplacian in field space associated to the interesting mathematical results.
covariance C. Polchinski’s argument has been devel- This article will be restricted to the mathema-
oped into a mathematical tool that applies to many tical use of the RG both in perturbative and
models. For an introduction to perturbative renor- nonperturbative quantum field theory of con-
malization using this method, see Salmhofer (1998). densed matter systems. Many nonrigorous but
Equations of the type [8] have also been very useful very interesting applications have also come out
beyond perturbation theory: much work has been of this method, showing that it also works well in
done based on the beautiful representation of Mayer practice, but they will not be reviewed here. Before
expansions found in Brydges and Kennedy (1987) discussing condensed matter systems, the pioneer-
using RG equations. ing works done on the mathematical RG, which
were largely motivated by high-energy physics,
Mathematical Structure and Difficulties will be reviewed briefly, as they laid the founda-
The RG flow is thus, depending on the implementa- tion of much of the technique used later in the
tion, either a sequence or a continuous flow of condensed matter case.
interactions. Setting up this flow in mathematical
terms is not easy and indeed part of the mathema- Field Theory and Statistical Mechanics
tical RG analysis is to find a suitable space of
interactions that is left invariant by the successive Because of the close connection between quantum
convolutions, and then to control the RG iteration. field theory and statistical mechanics given by
A serious problem is the proliferation of interac- formulas of the Feynman–Kac type, a significant
tions: already a single application of the RG amount of work on the mathematical RG focused
transformation [7] maps a simple interaction, such on models of classical statistical mechanics in
as [6], to a nonlocal functional of the fields, connection with field theories and gauge theories.
$$V_j(\psi) = \sum_{m \ge 0} \int dX_1 \cdots dX_m \; v^{(j)}_m(X_1, \ldots, X_m)\, \psi(X_1) \cdots \psi(X_m) \qquad [9]$$
Here we mention some of the pioneering results in
that field.
The scale decomposition method was developed
in a mathematical form and applied to perturbative
UV renormalization of scalar field theories, as well
Already for perturbative renormalization, one needs as nonperturbative analysis of some models, by
to extract local terms, calculate their flow more Gallavotti and Nicolò (Gallavotti 1985).
explicitly, and control the power counting of the Infrared $\phi^4$ theory in four dimensions was
remainder. The convergence of the series is not an constructed using block spin methods (Gawedzki
issue in formal perturbation theory because in every and Kupiainen 1985) and scale decomposition RG
finite order $r$ in $\lambda$, the sum over $m$ is finite. (Feldman et al. 1987). An essential feature of the $\phi^4_4$
For nonperturbative renormalization, however, model is its IR asymptotic freedom, meaning that
the problem is much more serious. For bosonic the local part of the effective quartic interaction
systems, the expansion in powers of the fields in tends to zero in the IR limit.
[9] is divergent, and one needs a split into small- Block spin methods were used by Balaban (1988)
field and large-field regions and cluster expansions to construct gauge theories in three and four
to obtain a well-defined sequence of effective dimensions. For gauge theories, the block spin RG
actions (Gawedzki and Kupiainen 1985, Feldman has the major advantage that it allows one to define a
et al. 1987, Rivasseau 1993). That is, the local gauge-invariant RG flow. The scale decomposition
parts are extracted and treated explicitly only in violates gauge invariance, which creates substantial
the small-field region, and this is combined with technical problems (Rivasseau 1993).

Condensed Matter: Fermions Perturbative Renormalization

Starting with the seminal work of Feldman and Renormalization of the Fermi surface at zero
Trubowitz (1990, 1991) and Benfatto and temperature In the limit $T \to 0$, the Matsubara
Gallavotti (1995), this field has become one of the frequency $\omega$ becomes a real variable, hence the
most successful applications of the mathematical propagator has a singularity at $\omega = 0$ and $k \in S$,
RG. We use this example to discuss the scale where $S = \{k : e(k) = 0\}$, a codimension-1 subset of
decomposition method in a bit more detail. $B_d$, is the Fermi surface. The existence of a Fermi
We shall mainly focus on models in $d \ge 2$ surface which does not degenerate to a point is a
dimensions (the case d = 1 is described in detail in characteristic feature of systems showing metallic
Benfatto and Gallavotti (1995)). The system is put behavior.
into a finite (very large) box $\Lambda$ of side-length $L$. For The singularity implies that $\hat C \notin L^p(\mathbb{R} \times B_d)$ for
simplicity we take periodic boundary conditions. any $p \ge 2$. Because terms of the type
$$\int d\omega \int dk\; F(\omega, k)\, \hat C(\omega, k) \prod_{i=1}^{p-1} \bigl[ T_i(\omega, k)\, \hat C(\omega, k) \bigr] \qquad [11]$$
The Hilbert space for spin-1/2 electrons is the
fermionic Fock space $\mathcal{F} = \bigoplus_{n \ge 0} \bigwedge^n L^2(\Lambda, \mathbb{C}^2)$. The
grand canonical ensemble in finite volume is given
by the density operator $\rho = Z^{-1} e^{-\beta(H - \mu N)}$, with the
Hamiltonian $H$ and the number operator $N$, in the
usual second quantized form. The parameter
$\beta = T^{-1}$ is the inverse temperature and the chemical appear for all $p \ge 1$ in the formal perturbation
potential $\mu$ is an auxiliary parameter used to fix the expansion, with functions $T_i$ and $F$ that do not
average particle number. vanish on the singularity set of $C$, the perturbation
The grand canonical trace defining the ensemble expansion for observables is termwise divergent.
can be rewritten in functional-integral form. It takes The deeper reason for these problems is that the
the form [5], but now $d\mu_C$ stands for a Grassmann interaction shifts the Fermi surface so that the true
Gaussian ‘‘measure,’’ which is really only a linear propagator has a singularity of the form
functional (for definitions, see, e.g., Salmhofer $G(\omega, k) = (i\omega - e(k) - \Sigma(\omega, k))^{-1}$. If the self-energy $\Sigma$
(1998, chapter 4 and appendix B)). A two-body is a sufficiently regular function, $G$ has the same
interaction corresponds to a quartic interaction integrability properties as $C$, but the singularity of $G$
polynomial $V$, as in [6]. The covariance is (in the is on the set $\tilde S = \{k : e(k) + \Sigma(0, k) = 0\}$ (the singular-
infinite-volume limit $L \to \infty$) ity in $\omega$ remains at $\omega = 0$).
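$$C(\tau, x) = \frac{1}{\beta} \sum_{\omega \in M_F} \int \frac{dk}{(2\pi)^d}\, e^{i(k \cdot x - \omega\tau)}\, \hat C(\omega, k), \qquad \hat C(\omega, k) = \frac{1}{i\omega - e(k)} \qquad [10]$$
Let $1 = \sum_{j \le 0} \chi_j(\omega, k)$ be a $C^\infty$ partition of unity such that
$$\text{for } j < 0: \quad \operatorname{supp} \chi_j \subset \{(\omega, k) : \epsilon_0 M^{j-2} \le |i\omega - e(k)| \le \epsilon_0 M^{j}\} \qquad [12]$$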
where $M > 1$ and $\epsilon_0$ is a fixed constant (an energy
where $\tau \in (0, \beta]$ is a Euclidean time variable and $k$ scale determined by the global properties of the
is the spatial momentum. The summation over $\omega$ function $e$; see Salmhofer (1998, chapter 4)). The
runs over the set of fermionic Matsubara frequen- corresponding covariances $\hat C_j = \hat C \chi_j$ have the prop-
cies $M_F = \pi T(2\mathbb{Z} + 1)$. The function $e(k) = \varepsilon(k) - \mu$, erties that for $j < 0$, $\|\hat C_j\|_1 \le \mathrm{const.}\, M^{j}$ and $\|\hat C_j\|_\infty \le \mathrm{const.}\, M^{-j}$.
where $\varepsilon(k)$ is the band function given by the single- Using these bounds and expanding
particle term in the Hamiltonian. For a lattice $v^{(j)}_m = \sum_{r \ge 1} v^{(j)}_{m,r} \lambda^r$, one can derive estimates for the
system, $k \in B_d$, the momentum space torus (e.g., coefficient functions $v^{(j)}_{m,r}$.
for the lattice $\mathbb{Z}^d$, $B_d = \mathbb{R}^d / 2\pi\mathbb{Z}^d$); for a continuous Of course, the scale decomposition by itself does
system, $k \in \mathbb{R}^d$, hence there is a spatial UV not solve the problem of the moving singularity. It
problem. Electrons in a crystal have a natural only allows us to pinpoint the problematic terms in
spatial UV cutoff (see Salmhofer (1998, chapter 4) the expansion. To construct the self-energy $\Sigma$, as
for a discussion) so we assume in the following well as all higher Green functions, a two-step
that there is either a UV cutoff or that the system is method is used (Feldman and Trubowitz 1990,
on a lattice. A nonperturbative definition of the 1991, Feldman et al. 1996, 2000). First, a counter-
functional integral involves a limit from discrete term function $K$ which modifies $e$ is introduced, so
times (by the Trotter product formula); see, for that all two-point insertions $T_i$ get subtracted on
example, Salmhofer (1998) or Feldman et al. the Fermi surface, hence replaced by $\tilde T_i(\omega, k) =$
(2003, 2004). $T_i(\omega, k) - T_i(0, k')$, with $k'$ obtained from $k$ by a

projection to the Fermi surface (Feldman and has a unique solution. If this is done, the procedure
Trubowitz 1990, 1991). Consequently, the $\tilde T_i$ vanish for renormalization is as follows. For a model given
linearly on the Fermi surface, so that the integral over by dispersion relation and interaction (E, V), solve
k in [11] converges. The effect of the counter-term [14], then add and subtract e in the kinetic term.
function K can be described less technically: it fixes This automatically puts K = E  e as a counter-term,
the Fermi surface to be S, the zero set of e. Thus, K and the expansion is now set up automatically with
forces S to be the Fermi surface of the interacting the right counter-term. The function K describes
system. To achieve this, K must be chosen as a function the shift from the Fermi surface of the free system (the
of e, k, and V. In contrast to the situation for zero set of E) to that of the interacting system
covariances with point singularities, the function K (the zero set of e). Proving that K is sufficiently
will, for a nontrivial Fermi surface, be very different regular and solving [14] is nontrivial. Uniqueness of
from the original e. It can, however, be constructed to the solution follows from the above stated properties
all orders in perturbation theory for a large class of of K as a function of e. Existence was shown for a
Fermi surfaces. More precisely, one can prove: if $e \in$ class of Fermi surfaces with strictly positive curva-
$C^2(B_d, \mathbb{R})$, $\hat v \in C^2(B_d, \mathbb{R})$, and the Fermi surface $S$ ture in Feldman et al. (1996, 2000), to every order
contains no points $k$ with $\nabla e(k) = 0$ and no flat sides, in perturbation theory. This implies a bijective
then $K = \sum_r \lambda^r K_r$ exists as a formal power series in $\lambda$ relation between the Fermi surfaces of the free and
and the map $e \mapsto e + K$ is locally injective on this set the interacting model.
of e’s (Feldman et al. 1996, 2000). With this counter-
term, the order-r m-point functions on scale j satisfy
the bounds Positive temperature and the zero-temperature limit
  One advantage of the functional-integral approach
 ðjÞ 
^vm;r   wm;r Mð4mÞj=2 jjjr is that the setup at positive temperatures is identical
1
to that at zero temperature, save for the discreteness
and of the set MF at T > 0. Because 0 62 MF , the
  temperature effectively provides an IR cutoff, so
 ðjÞ  that all term-by-term divergences are regularized in
^vm;r   w
~ m;r ½13
1
a natural way. However, renormalization is still
with constants $w_{m,r}$ and $\tilde w_{m,r}$. Here $\hat v^{(j)}_{m,r}$ is the
necessary because the temperature is a physical
Fourier transform of $v^{(j)}_{m,r}$ (see [9]), with the momen- parameter and unrenormalized expansions give
tum conservation delta function from translation disastrous bounds for the behavior of observables
invariance removed. as functions of the temperature. Renormalization
Equation [13] implies that in the RG sense, the carries over essentially unchanged (the counter-term
two-point function is relevant, the four-point func- function is constructed slightly differently).
tion is marginal, and all higher m-point functions Because $|\omega| \ge \pi/\beta$ for all $\omega \in M_F$, [12] implies
are irrelevant. $\operatorname{supp} \chi_j = \emptyset$ for $j < -J_\beta$, where
$$J_\beta = \log_M \frac{\beta \epsilon_0}{\pi} \qquad [15]$$
In one dimension, the Fermi ‘‘surface’’ reduces to
two points which are related by a symmetry, so the
counter-term function $K$ is just a constant, that is, an
adjustment of the chemical potential $\mu$, which is Thus, the scale decomposition is now a finite sum
justified because $\mu$ is only an auxiliary parameter over $0 \ge j \ge -J_\beta$. This restriction is inessential for
used to fix the average value of the particle number. the problem of renormalizing the Fermi surface, but
The counter-term function is a constant also in it puts a cutoff on the marginal growth of the four-
higher dimensions in the special case $e(k) = k^2 - \mu$:
there, rotational symmetry implies that K can be  
ðjÞ  0 r
chosen independent of k (if v is also rotationally k^vm;r k1  w
~ m;r log ½16

symmetric). However, in the generic case of non-
spherical Fermi surfaces, $K$ depends nontrivially If one can show that $\tilde w_{m,r} \le A B^r$ with constants $A$
on $k$, and an inversion problem arises: adding the and $B$, this implies that perturbation theory con-
counter-term changes the model. To obtain the verges for $|\lambda| \log(\beta \epsilon_0 / \pi) < B^{-1}$. Such a bound has
Green functions of a model with a given dispersion been shown using constructive methods (Disertori
relation and interaction $(E, V)$, one needs to show and Rivasseau 2000, Feldman et al. 2003, 2004) (see
that given $E$ in a suitable set, the equation below). The logarithm of $\beta$ is due to the Cooper
instability (see Feldman and Trubowitz (1990,
1991) and Salmhofer (1998, section 4.5)).
$$e(k) + K(\lambda, e, V)(k) = E(k) \qquad [14]$$
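The role of the temperature as an IR cutoff can be made concrete with a small numerical sketch. It assumes that [15] reads $J_\beta = \log_M(\beta\epsilon_0/\pi)$ and that the shells of [12] are $\epsilon_0 M^{j-2} \le |i\omega - e(k)| \le \epsilon_0 M^{j}$; the function names and the numerical values of `beta`, `eps0`, `M`, `lam`, and `B` are hypothetical and chosen only for illustration. The sketch lists the scales $j \le 0$ that can still contain a Matsubara frequency and evaluates the convergence criterion $|\lambda| \log(\beta\epsilon_0/\pi) < B^{-1}$.

```python
import math

def j_beta(beta, eps0, M):
    """Scale cutoff J_beta = log_M(beta * eps0 / pi), under the reading of [15] assumed above."""
    return math.log(beta * eps0 / math.pi, M)

def active_scales(beta, eps0, M):
    """Scales j <= 0 whose shell can contain a frequency with |omega| >= pi / beta.

    The shell at scale j requires |i*omega - e(k)| <= eps0 * M**j, so it is
    empty as soon as eps0 * M**j < pi / beta, that is, for j < -J_beta.
    """
    j_min = math.ceil(-j_beta(beta, eps0, M))
    return list(range(j_min, 1))

def within_convergence_region(lam, beta, eps0, B):
    """Check |lam| * log(beta * eps0 / pi) < 1 / B for a given (hypothetical) constant B."""
    return abs(lam) * math.log(beta * eps0 / math.pi) < 1.0 / B

beta, eps0, M = 40.0, 1.0, 2.0   # hypothetical inverse temperature, energy scale, scale factor
scales = active_scales(beta, eps0, M)
weak = within_convergence_region(0.05, beta, eps0, B=1.0)
strong = within_convergence_region(0.5, beta, eps0, B=1.0)
```

Lowering the temperature (increasing `beta`) adds scales only logarithmically, which is why the admissible coupling shrinks like the inverse of $\log \beta$.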

The application of renormalization at positive nonperturbative proof of the corresponding inver-


temperature also led to the solution of a longstanding sion theorem remains open.
puzzle in solid-state physics, namely the (seeming) In d = 3, the proof of Fermi liquid behavior remains
discontinuity of the results of perturbation theory as a an open problem, despite some partial results.
function of the temperature claimed in the early
literature. When renormalization is done correctly,
there is no discontinuity in the temperature. Condensed Matter: Bosons
Recent advances in quantum optics, in particular the
Nonperturbative Renormalization for Fermions trapping of ultracold atoms, have led to the
It is a remarkable feature of fermionic field theories experimental realization of Bose–Einstein condensa-
that for a covariance for which $\|\hat C\|_1$ and $\|C\|_1$ are tion (BEC), which caused a surge of theoretical and
both finite, the effective action defined in [7] exists mathematical works. For bosons, the definition of
and is analytic in the fields and in the original the ensembles is similar to, but more involved than
interaction V, thanks to determinant bounds. For a in, the fermionic case. On a formal level, the
V as in [6], with v weak and of short range, the functional-integral representation is analogous to
skeleton functions (where all relevant m-point fermions, except that the fields are not Grassmann
functions are projected back to their initial values fields but complex fields, and the covariance is given
in the RG iteration) satisfy by a sum as in [10], but now the summation over $\omega$
$$\|\hat v^{(j)}_m\|_\infty \le \mathrm{const.}\; \|\hat C_j\|_1^{-(m/2)+1}\, \|C_j\|_1^{-1} \qquad [17]$$
runs over the bosonic Matsubara frequencies
$M_B = 2\pi T\mathbb{Z}$. The existence of even the free partition
function in finite volume restricts the chemical
For the many-electron covariance [10], with a potential (for free particles, $\mu < \inf_k \varepsilon(k)$ must
positively curved $C^d$ Fermi surface and with the hold). Note that $C$ is complex and Gaussian
scale decomposition [12], $\|\hat C_j\|_1$ is of order $M^{j}$ and measures with complex covariances exist in infinite
$\|C_j\|_1$ is of order $M^{-j(d+1)/2}$. The right-hand side of dimensions only under rather restricted conditions,
[17] then contains $M^{(d+3-m)j/2}$, which agrees (up to which are not satisfied by [10]. This is inessential for
logarithms) with the perturbative power-counting perturbative studies, where everything can be
bounds [13] only for d = 1. In dimension d = 2, the reduced to finite-dimensional integrals involving
method has been refined by dividing the Fermi the covariance, but a nonperturbative definition of
surface into angular sectors. The corresponding functional integrals for such systems requires again a
sectorized propagators have a better decay bound carefully regularized (e.g., discrete-time) definition
kCj k1 , but the trade-off is sector sums at every of the functional integral.
vertex. Momentum conservation restricts these
sector sums sufficiently in two dimensions to allow Bose–Einstein Condensation
for good power-counting bounds. This has allowed
for the construction of an interesting class of The problem was treated to all orders in perturba-
interacting fermionic models. tion theory at positive particle density $\rho > 0$ by
The major results obtained with the RG method Benfatto (Benfatto and Gallavotti 1995). The initial
are as follows. interaction is again quartic, $\varepsilon(k) = k^2$, and one
Luttinger liquid behavior at zero temperature was considers the problem at zero temperature, in the
proved for one-dimensional models with a repulsive limit $\mu \to 0^-$, which is the limit in which BEC occurs
interaction (Benfatto and Gallavotti 1995). for free particles. The interaction is expected to
Fermi liquid behavior in the region where change the value of $\mu$, given the density, so a
$|\lambda| \log(\beta \epsilon_0) \ll 1$ was proved for the two- chemical potential term is included in the action, to
dimensional model with $e(k) = k^2 - 1$, a local poten- give the interaction
$$V(\psi) = \int d\tau\, dx\, dy\; |\psi(\tau, x)|^2\, v(x - y)\, |\psi(\tau, y)|^2 + \nu \int d\tau\, dx\; |\psi(\tau, x)|^2 \qquad [18]$$
tial $V$, and a UV cutoff both on $k$ and the Matsubara
frequencies $\omega$ in Disertori and Rivasseau (2000).
A two-dimensional model with a band function
$e(k)$ that is nonsymmetric under $k \to -k$ and a
general short-range interaction was proved to be a
Fermi liquid at zero temperature (Feldman et al. After writing $\psi(\tau, x) = \xi + \varphi(\tau, x)$, where $\xi$ is indepen-
2003, 2004). Due to the asymmetry under $k \to -k$, dent of $\tau$ and $x$, the density condition becomes
the Cooper instability can be proved to be absent. In $\rho = |\xi|^2$. $\nu$ now needs to be chosen such that the
Feldman et al. (2003, 2004), a counter-term func- free energy has a minimum at $\xi = \sqrt{\rho}$. This can be
tion as in Feldman et al. (1996, 2000) was used. The reformulated in terms of the self-energy of the boson.

Benfatto uses the RG to prove that the propagator of BRST; Phase Transition Dynamics; Reflection Positivity
the interacting system no longer has the singularity and Phase Transitions.
structure $(i\omega - k^2)^{-1}$ but instead $(\omega^2 + c^2 k^2)^{-1}$, where
c is a constant. This requires a nontrivial analysis of Further Reading
Ward identities in the RG flow.
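The change of singularity structure stated above can be illustrated by comparing how the two propagators scale. The following is a minimal sketch, not part of the original analysis: the function names and the sample values of `omega`, `k`, `c`, and `s` are hypothetical. It checks numerically that $(i\omega - k^2)^{-1}$ is homogeneous when $\omega$ scales like $k^2$, while $(\omega^2 + c^2 k^2)^{-1}$ is homogeneous when $\omega$ scales like $k$, which corresponds to a linear, sound-like dispersion.

```python
def free_propagator(omega, k):
    """Free boson propagator (i*omega - k^2)^(-1)."""
    return 1.0 / (1j * omega - k**2)

def interacting_propagator(omega, k, c):
    """Singularity structure (omega^2 + c^2 k^2)^(-1) of the interacting system."""
    return 1.0 / (omega**2 + (c * k) ** 2)

omega, k, c = 0.3, 0.7, 1.5   # hypothetical sample point and constant c
s = 0.5                       # hypothetical rescaling factor

# Free case: rescale k -> s*k and omega -> s^2 * omega (omega scales like k^2).
free_ratio = free_propagator(s**2 * omega, s * k) / free_propagator(omega, k)

# Interacting case: rescale k -> s*k and omega -> s * omega (omega scales like k).
interacting_ratio = (
    interacting_propagator(s * omega, s * k, c) / interacting_propagator(omega, k, c)
)
```

Both ratios equal $s^{-2}$, but they are reached with different relative scalings of frequency and momentum, which is the sense in which the two singularity structures differ.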
BEC has been proved in the Gross–Pitaevskii limit Bach V, Fröhlich J, and Sigal IM (1998) Renormalization group
(Lieb et al. 2002). In the present formulation, this limit analysis of spectral problems in quantum field theory.
Advances in Mathematics 137: 205–298.
corresponds to an infinite-volume limit $L \to \infty$ where Balaban T (1988) Convergent renormalization expansions for
the density $\rho$ is taken to zero as an inverse power of $L$. lattice gauge theories. Communications in Mathematical
A nonperturbative proof of BEC at fixed positive Physics 119: 243–285.
particle density remains an open problem. Balaban T (1995) A low-temperature expansion for classical
N-vector models I. A renormalization group flow. Commu-
nications in Mathematical Physics 167: 103–154.
Benfatto G and Gallavotti G (1995) Renormalization Group.
Superconductivity
Princeton: Princeton University Press.
Superconductivity (SC) occurs in fermionic systems, Bricmont J and Kupiainen A (2001) Renormalizing the renorma-
lization group pathologies. Physics Reports 348: 5–31.
but it happens at energy scales where the relevant
Brydges DC and Kennedy T (1987) Mayer expansions and the
excitations have bosonic character: the Cooper pairs Hamilton–Jacobi equation. Journal of Statistical Physics 48:
are bosons. In the RG framework, they arise naturally 19–49.
when the fermionic RG flow discussed above is Disertori M and Rivasseau V (2000) Interacting Fermi liquid in
stopped before it leaves the weak-coupling region two dimensions at finite temperature I. Convergent contribu-
tions. Communications in Mathematical Physics 215: 251–290.
and the dominant Cooper pairing term is rewritten by
Domb C and Green M (eds.) (1976) Phase Transitions and
a Hubbard–Stratonovich transformation. The fer- Critical Phenomena, vol. 6. London: Academic Press.
mions can then be integrated over, resulting in the Feldman J, Magnen J, Rivasseau V, and Sénéor R (1987)
typical Mexican hat potential of an O(2) nonlinear Construction of infrared $\phi^4_4$ by a phase space expansion.
sigma model. Effectively, one now has to deal with a Communications in Mathematical Physics 109: 437.
Feldman J and Trubowitz E (1990) Perturbation theory for
problem similar to the one for BEC, but the action is
many-fermion systems. Helvetica Physica Acta 63: 157.
considerably more complicated. Feldman J and Trubowitz E (1991) The flow of an electron-
phonon system to the superconducting state. Helvetica
Physica Acta 64: 213.
The Nonlinear Sigma Models Feldman J, Salmhofer M, and Trubowitz E (1996) Perturbation
theory around non-nested Fermi surfaces I. Keeping the Fermi
The prototypical model, into whose universality surface fixed. Journal of Statistical Physics 84: 1209–1336.
class both examples mentioned above fall, is that Feldman J, Salmhofer M, and Trubowitz E (2000) An inversion
of O(N) nonlinear sigma models: both BEC and SC theorem in Fermi surface theory. Communications on Pure
can be reformulated as spontaneous symmetry and Applied Mathematics 53: 1350–1384.
Feldman J, Knörrer H, and Trubowitz E (2003) A class of Fermi
breaking (SSB) in the O(2) model in dimensions
liquids. Reviews in Mathematical Physics 15: 949–1169.
$d \ge 3$. For d = 2, long-range order is possible only at Feldman J, Knörrer H, and Trubowitz E (2004) Communications
zero temperature because only then does the time in Mathematical Physics 247: 1–319.
direction truly represent a third dimension, prevent- Fröhlich J, Simon B, and Spencer T (1976) Infrared bounds, phase
ing the Mermin–Wagner theorem from applying. transitions, and continuous symmetry breaking. Communica-
tions in Mathematical Physics 50: 79.
SSB has been proved for lattice O(N) models by
Gallavotti G (1985) Renormalization theory and ultraviolet
reflection positivity and Gaussian domination meth- stability via renormalization group methods. Reviews of
ods (Fröhlich et al. 1976). The elegance and Modern Physics 57: 471–569.
simplicity of this method is unsurpassed, but only Gawedzki K and Kupiainen A (1985) Massless lattice $\phi^4_4$ theory:
very special actions satisfy reflection positivity, so Rigorous control of a renormalizable asymptotically free model.
Communications in Mathematical Physics 99: 197–252.
that the method cannot be used for the effective
Lieb E, Seiringer R, Solovej JP, and Yngvason J (2002) The ground
actions obtained in condensed matter models. state of the Bose gas. In: Current Developments in Mathematics,
Results in the direction of proving SSB in O(N) 2001, pp. 131–178. Cambridge: International Press.
models for $d \ge 3$ by RG methods, which apply to Polchinski J (1984) Renormalization and effective Lagrangians.
much more general actions, have been obtained by Nuclear Physics B 231: 269.
Rivasseau V (1993) From Perturbative to Constructive Renorma-
Balaban (1995).
lization. Princeton, NJ: Princeton University Press.
Salmhofer M (1998) Renormalization: An Introduction, Springer
See also: Bose–Einstein Condensates; Fermionic Texts and Monographs in Physics. Heidelberg: Springer.
Systems; High Tc Superconductor Theory; Holomorphic Wiese KJ (2001) Polymerized membranes, a review. In: Domb C
Dynamics; Operator Product Expansion in Quantum and Lebowitz J (eds.) Phase Transitions and Critical Phenom-
Field Theory; Perturbative Renormalization Theory and ena, vol. 19. Academic Press.
Resonances 415

Resonances
N Burq, Université Paris-Sud, Orsay, France From the quantum mechanics point of view, both
© 2006 Elsevier Ltd. All rights reserved.
systems are described by the Hamiltonians

$$H_i = -h^2 \frac{d^2}{dx^2} + V_i(x)$$
Introduction acting on $L^2([-1, 1])$ (with boundary conditions) and
In quantum mechanics and wave propagation, $L^2(\mathbb{R})$, respectively. In the first case, $H_1$ has a discrete
eigenvalues (and eigenfunctions) appear naturally spectrum, $\lambda_{j,h} \in \mathbb{R}$ with eigenfunctions $e_{j,h}(x)$, $j \in \mathbb{N}$,
as they describe the behavior of a quantum and the time evolution of the system is given by
$$e^{itH_1} u = \sum_j e^{it\lambda_{j,h}}\, u_{j,h}\, e_{j,h} \qquad [1]$$
system (or the vibration of a structure). There
are however some cases where these simple
notions do not suffice and one has to appeal to
where $u_{j,h}\, e_{j,h}$ is the orthogonal projection of $u$ on
the more subtle notion of resonances. For
example, if the vibration of a drum is well the eigenspace $\mathbb{C} e_{j,h}$. In the second case, $H_2$ has no
understood in terms of eigenvalues (the audible square integrable eigenfunction, and no simple
frequencies) and eigenfunctions (the correspond- description as [1] can consequently hold. However
ing vibrating modes), the notion of resonances is as h ! 0, the correspondence principle tells us that
quantum mechanics should get close to classical
necessary to understand the propagation of waves
mechanics. Since for both quantum problems the
in the exterior of a bounded obstacle. Another
classical limit is the same (at least for initial states
example (taken from Zworski (2002)) which
confined in the well with energy E), we expect that
allows us to understand both the similarities of
resonances with eigenvalues and their differences for the second potential there should exist a
is the following: consider the motion of a quantum state corresponding to the classical one.
classical particle submitted to a force field In fact, this is indeed the case and one can show that
deriving from the potential V1 (x) on a bounded there exist resonant states ej, h associated to reso-
nances Ej, h which are solution of the equation
interval as shown in Figure 1a. If the classical
momentum is denoted by $\xi$, then the classical H2 ej;h  Ej;h ej;h ; Ej;h  E
energy is given by
$$E = |\xi|^2 + V_1(x)$$
are not square integrable, but still have moderate
growth at infinity and are confined in the interior of
and the classical motion is given by the relations of the well (see sections ‘‘Definition’’ and ‘‘Location of
Hamiltonian mechanics: resonances’’). On the other hand, the first quantum
$$\dot x = \frac{\partial E}{\partial \xi} = 2\xi, \qquad \dot \xi = -\frac{\partial E}{\partial x} = -V'(x)$$
system is confined, whereas the second one is not and
we know that even for initial states confined in the
well, tunneling effect allows the quantum particle to
@ @x well, tunneling effect allows the quantum particle to
Since energy is conserved, if the initial energy is escape to infinity. This fact should be described by
smaller than the top of the barrier, then the classical the theory as a main difference between eigenvalues
particle bounces forever in the well. Now we can and resonances. This is indeed the case as the
consider the same example with the potential V2 (x) resonances Ej, h are not real (contrarily to eigenvalues
on R as shown in Figure 1b. Of course, if the of self-adjoint operators) but have a nonvanishing
particle is initially inside the well (with the same imaginary part (see section ‘‘Resonance-free regions’’)
energy as before), the classical motion remains the
Im Ej;h  eC=h
same.
If we assume that a similar description as [1] still
holds for the second system, at least locally in space
(see section ‘‘Resonances and time asymptotics’’),
E then, for time t >> eC=h , the factor eitEh becomes
very small (the quantum particle has left the well
0 π 0 π due to tunneling effect).
There have been several studies on resonances and
(a) (b) scattering theory and the presentation here cannot be
Figure 1a, b A particle trapped in a well. complete. For a more in-depth presentation, one can
consult the books by Lax and Phillips (1989) and Hislop and Sigal (1987), or the reviews on resonances by Vodev (2001) and Zworski (1994) for example.

Definition

There are different (equivalent) definitions of resonances. The most elegant is certainly the Helffer and Sjöstrand (1986) definition (see also the presentation of complex scaling by Combes et al. (1984) and the very general "black box" framework by Sjöstrand and Zworski (1991)). However, it requires a few prerequisites and we preferred to stick to the more elementary (but less general) resolvent point of view. The starting point for this definition of resonances is the fact that the eigenvalues of a (self-adjoint) operator $P$ are the points $z$ where $P - z$ is not injective. The more general resonances will be the points where the operator is not invertible (on suitable spaces).

More precisely, consider a perturbation of the Laplace operator on $\mathbb{R}^n$, $P_0(h) = -h^2\Delta$, in the following sense: let $\Theta \subset \mathbb{R}^d$ be a (possibly empty) smooth obstacle whose complement, $\Omega = \Theta^c$, is connected. Consider a classical self-adjoint operator defined on $L^2(\Omega)$:

\[ P_h u = (-h^2\Delta + V(x))u \tag{2} \]

with boundary conditions (Dirichlet)

\[ u|_{\partial\Omega} = 0 \tag{3} \]

(Neumann boundary conditions could be used too). This setting contains both the Schrödinger operator ($P_h = -h^2\Delta + V(x) - E$ on $\Omega = \mathbb{R}^n$) and the Helmholtz equation with Dirichlet conditions, in the exterior of an obstacle (waves at large frequencies: $P = -\Delta - \lambda^2$; in this case, define $h = \lambda^{-1}$ and $P_h = -h^2\Delta$), which we shall define as acoustical scattering.

We assume that $P$ is a perturbation of $P_0$, that is, $V \to 0$, $|x| \to +\infty$ sufficiently fast (see Sjöstrand and Zworski (1991) for the very general black box assumptions). For example, this perturbation assumption is fulfilled if $V$ has compact support. Then the resolvent $R_h(z) = (P_h - z)^{-1}$ is well defined for $\operatorname{Im} z \neq 0$ as a bounded operator from $L^2(\Omega)$ to

\[ H^2(\Omega) \cap H^1_0(\Omega) \]

(because the operator $P_h$ is self-adjoint). However, it is not bounded for $z > 0$ on $L^2(\Omega)$ because the essential spectrum of $P_h$ is precisely the semiaxis $z > 0$, but it admits a meromorphic continuation from $\operatorname{Im} z > 0$ toward the lower half-plane:

\[ R_h(z) : L^2_{\mathrm{comp}}(\Omega) \to L^2_{\mathrm{loc}}(\Omega) \]

The poles of this resolvent $R_h$ are by definition the semiclassical resonances, $\mathrm{Res}_{sc}(P_h)$.

Remark 1 In the case of acoustical scattering ($P = -\Delta - \lambda^2$, $\lambda = h^{-1}$), the introduction of the additional parameter $z$ is pointless and one works directly with the parameter $\lambda = h^{-1}\sqrt{z}$. In that case the resolvent $R(\lambda) = (-\Delta - \lambda^2)^{-1}$ is well defined for $\operatorname{Im} \lambda < 0$, the essential spectrum is precisely the axis $\lambda \in \mathbb{R}$ and the resolvent admits a meromorphic continuation from $\operatorname{Im} z < 0$ toward the upper half-plane (with possibly a cut at 0):

\[ R(\lambda) : L^2_{\mathrm{comp}}(\Omega) \to L^2_{\mathrm{loc}}(\Omega) \]

The acoustic resonances are by definition the poles of this meromorphic continuation. They are related to semiclassical resonances by the relation

\[ \sqrt{\mathrm{Res}_{sc}} = h\, \mathrm{Res}_{ac} \]

It can also be shown that if $z$ is a resonance, there exists an associated resonant state $e_z$ such that

\[ (P_h - z)e_z = 0 \]

the function $e_z$ satisfies Sommerfeld radiation conditions (in polar coordinates $(r, \theta) \in [0, +\infty) \times S^{n-1}$)

\[ |h\partial_r e_z - i\sqrt{z}\, e_z| \le C\, \frac{|e^{i\sqrt{z}\, r}|}{r^{1+n/2}} \]

and the function

\[ \frac{e_z}{1 + r^{(1/2)+\varepsilon}}\, e^{-i\sqrt{z}\, r} \]

is square integrable.

Resonance-Free Regions

The very first result about resonance-free regions is based on the Rellich uniqueness theorem (uniqueness for solutions of elliptic second-order equations) and says that there are no real resonances (except possibly 0). The more precise determination of resonance-free regions (originally in acoustical scattering) has been a subject of study from the 1960s and it has motivated a large range of works, from the multiplier methods of Morawetz (1975) to the general propagation of singularity theorem of Melrose and Sjöstrand (1978). To state the main result in this direction, we need the notion of nontrapping perturbation.

Definition 1 A generalized bicharacteristic at energy $E$, $(x(s), \xi(s))$, is an integral curve of the Hamiltonian field

\[ H_p = \frac{\partial p}{\partial \xi}\frac{\partial}{\partial x} - \frac{\partial p}{\partial x}\frac{\partial}{\partial \xi} \]

of the principal symbol $p(x, \xi) = |\xi|^2 + V(x)$ of the operator $P$, included in the characteristic set $p(x, \xi) = E$ and which, when hitting the boundary of the obstacle, reflects according to the laws of geometric optics (see Melrose and Sjöstrand (1978)).
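The integral curves of the Hamiltonian field of $p(x,\xi) = |\xi|^2 + V(x)$ can be explored numerically. The following is a minimal sketch in one space dimension without obstacle, not taken from the article: the potential `V`, the function names, the step size, and the two energies are all hypothetical choices made for illustration. It integrates $\dot{x} = 2\xi$, $\dot{\xi} = -V'(x)$ with a leapfrog scheme and compares a trajectory below the top of the barrier (it stays in the well) with one above it (it goes to infinity).

```python
import math

def V(x):
    # Hypothetical potential: a well at x = 0 between two barriers, decaying at infinity.
    return x * x * math.exp(-x * x / 2.0)

def dV(x):
    # Derivative V'(x) of the potential above.
    return (2.0 * x - x ** 3) * math.exp(-x * x / 2.0)

def energy(x, xi):
    # Principal symbol p(x, xi) = xi^2 + V(x); it is constant along the exact flow.
    return xi * xi + V(x)

def flow(x, xi, t_max, dt=1e-3):
    """Leapfrog integration of x' = 2*xi, xi' = -V'(x).

    Returns the final state and the largest |x| reached along the trajectory.
    """
    max_abs_x = abs(x)
    for _ in range(int(round(t_max / dt))):
        xi -= 0.5 * dt * dV(x)   # half kick
        x += dt * 2.0 * xi       # drift, since dx/dt = dp/dxi = 2*xi
        xi -= 0.5 * dt * dV(x)   # half kick
        max_abs_x = max(max_abs_x, abs(x))
    return x, xi, max_abs_x

# The barrier top of this particular V is at x = sqrt(2), with height 2/e.
BARRIER_TOP = V(math.sqrt(2.0))

def run(E, t_max=20.0):
    # Start at the bottom of the well with momentum chosen so that p(x, xi) = E.
    x, xi, max_abs_x = flow(0.0, math.sqrt(E), t_max)
    return {"E0": E, "E_final": energy(x, xi), "max_abs_x": max_abs_x}

if __name__ == "__main__":
    print("barrier top:", BARRIER_TOP)
    for E in (0.3, 1.0):
        print(run(E))
```

With these choices the energy 0.3 lies below the barrier top and the curve never leaves the well, while the energy 1.0 lies above it and $|x(s)|$ keeps growing. In both cases the numerical energy stays close to its initial value, as expected for a curve contained in the set $p(x,\xi) = E$.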
The operator $P$ (or by extension the obstacle in the case of acoustic scattering) is said to be nontrapping at energy $E$ if all generalized bicharacteristics go to infinity:

\[ \lim_{s \to \infty} |x(s)| = +\infty \]

The operator $P$ (or by extension the obstacle in the case of acoustic scattering) is said to be nontrapping near energy $E$ if $P$ is nontrapping at energy $E'$ for $E'$ in a neighborhood of $E$.

The following result was obtained in different generalities by Morawetz (1975), Melrose and Sjöstrand (1978), and others.

Theorem 1 Assume that the operator $P$ is nontrapping near energy $E$. Then for any $N > 0$ there exists $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

\[ \{z;\ |\operatorname{Im} z| \le N h \log(1/h)\} \]

In the case of analytic geometries (and coefficients), this result (see Bardos et al. 1987) can be improved to

Theorem 2 Assume that the operator $P$ is nontrapping. Then there exist $\varepsilon > 0$, $N_0 > 0$ and $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

\[ \{z;\ |\operatorname{Im} z| \le N_0 h^{1-(1/3)}\} \cap \{|z - E| \le \varepsilon\} \]

Remark 2 In the case of acoustical scattering, with the new definition of resonances, $\lambda = h^{-1}\sqrt{z}$, the resonance-free zones have respectively the forms

\[ \{z;\ |\operatorname{Im} z| \le N \log(|z|),\ |z| \gg 1\} \]
\[ \{z;\ |\operatorname{Im} z| \le N_0 |z|^{1/3},\ |z| \gg 1\} \]

In the case of trapping perturbations, the first result was obtained by Burq (1998).

Theorem 3 There exist $C > 0$ and $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

\[ \{z;\ |\operatorname{Im} z| \le N_0 e^{-C/h}\} \cap \{|z - E| \le \varepsilon\} \]

Resonances and Time Asymptotics

The relationship between eigenfunctions/eigenvalues and time asymptotics is straightforward. This is no longer the case for resonances. For nontrapping problems, however, this question has been studied since the late 1960s by Lax and Phillips (1989) and Vainberg (1968). In particular, this approach was decisive for the study of local energy decay in acoustical scattering. As a consequence of Theorem 1, we have

Theorem 4 If the acoustical problem is nontrapping, then there exist $C, \alpha > 0$ such that for any solution of the wave equation

\[ \Box u = 0, \quad u|_{t=0} = u_0, \quad \partial_t u|_{t=0} = u_1, \quad u|_{D} = 0, \quad \frac{\partial u}{\partial n}\Big|_{N} = 0 \]

with compactly supported initial data $(u_0, u_1)$ (in a fixed compact set), one has

\[ E_{\mathrm{loc}}(u) = \int_{\Omega \cap \{|x| \le C\}} |\nabla u|^2 + |\partial_t u|^2 \le \begin{cases} C e^{-\alpha t} & \text{if the space dimension is odd} \\ \dfrac{C}{t^d} & \text{if the space dimension is even} \end{cases} \tag{4} \]

Trapping perturbations were investigated more recently. In that case, the local energy decays, but the rate cannot be uniform. The first trapping example in acoustic scattering was studied by Ikawa (1983): the obstacle is the union of a finite number (at least two) of convex bodies. In that case, one has

Theorem 5 For any $\varepsilon > 0$ there exists $C > 0$ such that for any initial data supported in a fixed compact set

\[ E_{\mathrm{loc}}(u)(t) \le C e^{-\alpha t}\, \|(u_0, u_1)\|^2_{D((1-\Delta)^{(1+\varepsilon)/2})} \]

where $D((1-\Delta)^{(1+\varepsilon)/2})$ is the domain of the operator $(1-\Delta)^{(1+\varepsilon)/2}$. Note that the norm in $D((1-\Delta)^{1/2})$ is the natural energy and consequently the estimate above exhibits a loss of $\varepsilon$ derivatives. For strongly trapping perturbations, the results are worse. They are consequences of Theorem 3.

Theorem 6 For any $k$ there exists $C_k > 0$ such that for any initial data supported in a fixed compact set

\[ E_{\mathrm{loc}}(u)(t) \le \frac{C_k}{\log(t)^{2k}}\, \|(u_0, u_1)\|^2_{D((1-\Delta)^{(1+k)/2})} \]

One can also obtain real asymptotic expansions in terms of resonances (see the work by Tang and Zworski (2000)).

Theorem 7 Let $\chi \in C_c^\infty(\mathbb{R}^n)$ and $\psi \in C_c^\infty((0, \infty))$ and let $\operatorname{chsupp} \psi = [a, b]$. There exists $0 < \delta < c(h) < 2\delta$ such that for every $M > M_0$ there exists $L = L(M)$, and we have

\[ \chi\, e^{-itP/h}\, \chi\, \psi(P) = \sum_{z \in \Omega(h) \cap \mathrm{Res}(P)} \chi\, \mathrm{Res}\big(e^{-it\,\bullet/h} R(\bullet, h), z\big)\, \chi\, \psi(P) + O_{H \to H}(h^{\infty}), \quad \text{for } t > h^{-L} \tag{5} \]

\[ \Omega(h) = (a - c(h), b + c(h)) - i[0, h^M) \]
where $\mathrm{Res}(f(\bullet), z)$ denotes the residue of a meromorphic family of operators, $f$, at $z$.

The function $c(h)$ depends on the distribution of resonances: roughly speaking we cannot "cut" through a dense cloud of resonances. Even in the very well understood case of the modular surface there is, currently at least, a need for some nonexplicit grouping of terms. The same ideas can be applied to acoustic scattering.

Trace Formulas

Trace formulas provide a description of the classical/quantum correspondence: one side is given by the trace of a certain function of the operator $f(P_h)$, whereas the other side is described in terms of classical objects (closed orbits of the classical flow). In the case of discrete eigenvalues, the question is relatively simple and can be solved by using the spectral theorem. In the case of continuous spectrum, the problem is much more subtle (self-adjoint operators with continuous spectrum behave in some ways as non-normal operators). It has been studied by Lax and Phillips (1989), Bardos et al. (1982), and Melrose (1982). More recently, Sjöstrand (1997) introduced a local notion of trace formulas.

Let $W \subset\subset \Omega$ be open precompact subsets of $e^{i[-2\theta_0, 0]}\, ]0, +\infty[$. Assume that the intersections $I$ and $J$ of $W$ and $\Omega$ with the real axis are intervals and that $\Omega$ is simply connected.

Theorem 8 Let $f(z, h)$ be a family of holomorphic functions on $z \in \Omega$ such that $|f|_{\Omega \setminus W}| \le 1$. Let $\chi \in C_0^\infty(\mathbb{R})$ be equal to 1 on a neighborhood of $I$. Then

\[ \operatorname{Trace}\big[(\chi f)(P_h) - (\chi f)(-h^2\Delta)\big] = \sum_{\lambda \in \mathrm{Res}(P_h) \cap \Omega} f(\lambda, h) + O(h^{-n}) \]

The use of this result with a clever choice of functions $f$ allows Sjöstrand to show that an analytic singularity of the function $E \mapsto \mathrm{Vol}(\{x;\ V(x) \ge E\})$ (observe that if $V$ is bounded, this function vanishes for large $E$ and consequently it has analytic singularities) gives a lower bound, for $\Omega$ a neighborhood of $E$,

\[ \#\,\mathrm{Res}(P_h) \cap \Omega \ge c\, h^{-n} \]

which coincides with the upper bound (see Zworski (2002) and the references given there).

Location of Resonances

In some particular cases, one can expect to have a precise description of the location of resonances. This is the case in Ikawa's example in acoustic scattering where the obstacle is the union of two disjoint convex bodies. In this case, the line minimizing the distance, $d$, between the bodies is trapped. However, this trapped trajectory is isolated and of hyperbolic type (unstable). Ikawa (1983) and Gérard (1988) have obtained:

Theorem 9 There exist geometric positive constants $k_p \to +\infty$ as $p \to +\infty$ such that all resonances located above the line $\operatorname{Im} z \ge -C$ ($C$ arbitrarily large but fixed) have an asymptotic expansion

\[ \lambda \sim \lambda_{j,p} + \sum_{l} a_{l,p}\, \lambda_{j,p}^{-l/2} + O(\lambda_{j,p}^{-\infty}), \quad j \to +\infty \]

where the approximate resonances

\[ \lambda_{j,p} = \frac{j\pi}{d} - i k_p \]

are located on horizontal lines.

Another example is when the obstacle is convex. This example is nontrapping and Sjöstrand and Zworski (1999) are able to prove that the resonances in any region $\operatorname{Im} z \ge -N|z|^{1/3}$ ($N$ arbitrarily large) are asymptotically distributed near cubic curves

\[ C_j = \{z \in \mathbb{C};\ \operatorname{Im} z = -c_j |z|^{1/3}\} \]

Finally, the last main example where one can give a precise asymptotic for resonances is when there exists a stable (elliptic) periodic trajectory for the Hamiltonian flow. In that case it had been known from the 1960s (see the works by Babič (1968)) that one can construct quasimodes, that is, compactly supported approximate solutions of the eigenfunction equation:

\[ (P_h - E_h)e_j = O(h^{\infty}) \]

It is only recently that Tang and Zworski (1998) and Stefanov (1999) proved that these quasimode constructions imply the existence of resonances asymptotic to $E_h$, $h \to 0$.

See also: h-Pseudodifferential Operators and Applications; Semi-Classical Spectra and Closed Orbits.

Further Reading

Babič VM (1968) Eigenfunctions which are concentrated in the neighborhood of a closed geodesic. Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Instituta Imeni V.A. Steklova 9: 15–63.
Bardos C, Guillot JC, and Ralston J (1982) La relation de Poisson pour l'équation des ondes dans un ouvert non borné. Application à la théorie de la diffusion. Communications in Partial Differential Equations 7: 905–958.
Bardos C, Lebeau G, and Rauch J (1987) Scattering frequencies and Gevrey 3 singularities. Inventiones Mathematicae 90: 77–114.
Burq N (1998) Décroissance de l'énergie locale de l'équation des ondes pour le problème extérieur et absence de résonance au voisinage du réel. Acta Mathematica 180: 1–29.
Combes J-M, Duclos P, and Seiler R (1984) On the shape resonance. In: Resonances – Models and Phenomena (Bielefeld, 1984), Lecture Notes in Physics, vol. 211, pp. 64–77. Berlin: Springer.
Gérard C (1988) Asymptotique des pôles de la matrice de scattering pour deux obstacles strictement convexes. Supplément au Bulletin de la Société Mathématique de France 116: 146 pp.
Helffer B and Sjöstrand J (1986) Résonances en limite semi-classique. Mémoire de la S.M.F 114(24–25): 228 pp.
Hislop PD and Sigal IM (1987) Shape resonances in quantum mechanics. In: Differential Equations and Mathematical Physics (Birmingham, Ala., 1986), Lecture Notes in Mathematics, vol. 1285, pp. 180–196. Berlin: Springer.
Ikawa M (1983) On the poles of the scattering matrix for two convex obstacles. Journal of Mathematics of the Kyoto University 23: 127–194.
Lax PD and Phillips RS (1989) Scattering Theory, Pure and Applied Mathematics, 2nd edn., vol. 26. Boston: Academic Press.
Melrose RB (1982) Scattering theory and the trace of the wave group. Journal of Functional Analysis 45: 429–440.
Melrose RB and Sjöstrand J (1978) Singularities of boundary value problems. I. Communications on Pure and Applied Mathematics 31: 593–617.
Melrose RB and Sjöstrand J (1982) Singularities of boundary value problems. II. Communications on Pure and Applied Mathematics 35: 129–168.
Morawetz CS (1975) Decay for solutions of the exterior problem for the wave equation. Communications on Pure and Applied Mathematics 28: 229–264.
Sjöstrand J (1997) A trace formula and review of some estimates for resonances. In: Rodino L (ed.) Microlocal Analysis and Spectral Theory (Lucca, 1996), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 490, pp. 377–437. Dordrecht: Kluwer Academic.
Sjöstrand J and Zworski M (1991) Complex scaling and the distribution of scattering poles. Journal of the American Mathematical Society 4(4): 729–769.
Sjöstrand J and Zworski M (1999) Asymptotic distribution of resonances for convex obstacles. Acta Mathematica 183(2): 191–253.
Stefanov P (1999) Quasimodes and resonances: sharp lower bounds. Duke Mathematical Journal 99(1): 75–92.
Tang S-H and Zworski M (1998) From quasimodes to resonances. Mathematical Research Letters 5(3): 261–272.
Tang S-H and Zworski M (2000) Resonance expansions of scattered waves. Communications on Pure and Applied Mathematics 53(10): 1305–1334.
Vainberg BR (1968) On the analytical properties of the resolvent for a certain class of operator pencils. Mathematics of the USSR Sbornik 6(2): 241–273.
Vodev G (2001) Resonances in euclidean scattering. Cubo Matematica Educacional 3: 317–360. http://www.math.sciences.univ-nantes.fr.
Zworski M (1994) Counting scattering poles. In: Ikawa M (ed.) Spectral and Scattering Theory (Sanda, 1992), Lecture Notes in Pure and Applied Mathematics, vol. 161, pp. 301–331. New York: Dekker.
Zworski M (2002) Quantum resonances and partial differential equations. In: Li Ta Tsien (ed.) Proceedings of the International Congress of Mathematicians (Beijing, 2002), vol. III, pp. 243–252. Beijing: Higher Education Press.

Ricci Flow see Singularities of the Ricci Flow

Riemann Surfaces
K Hulek, Universität Hannover, Hannover, Germany

© 2006 Elsevier Ltd. All rights reserved.

Introduction

Riemann surfaces were first studied as the natural domain of definition of (multivalued) holomorphic or meromorphic functions. They were the starting point for the development of the theory of real and complex manifolds (see Weyl (1997)). Nowadays, Riemann surfaces are simply defined as one-dimensional complex manifolds (see the next section). Compact Riemann surfaces can be embedded into projective spaces and are thus, by virtue of Chow's theorem, algebraic curves. By uniformization theory, the universal cover of a connected Riemann surface is either the unit disk, the complex plane, or the Riemann sphere (see the section "Uniformization").

This article discusses the basic theory of compact Riemann surfaces, such as their topology, their periods, and the definition of the Jacobian variety. Studying the zeros and poles of meromorphic functions leads to the notion of divisors and linear systems. In modern language this can be rephrased in terms of line bundles, resp. locally free sheaves (see the section "Divisors, linear systems, and line bundles"). One of the fundamental results is the Riemann–Roch theorem, which expresses the difference between the dimension of a linear system and that of its adjoint system in terms of the degree of the linear system and the genus of the curve. This theorem has been vastly generalized and is truly one of the cornerstones of algebraic geometry. A formulation of this result and some of its applications are also discussed.
A study of the subsets of the Jacobians parametrizing linear systems of given degree and dimension leads to Brill–Noether theory, which is discussed in the section "Brill–Noether theory." This is followed by a brief introduction to the theory of equations and syzygies of canonical curves.

Moduli spaces play a central role in the theory of complex variables and in algebraic geometry. Arguably, the most important of these is the moduli space of curves of genus $g$. This and related moduli problems are treated in the section "Moduli of compact Riemann surfaces." In particular, the space of stable maps is closely related to quantum cohomology. Finally, we present a brief discussion of the Verlinde formula and conformal blocks.

Basic Definitions

Riemann surfaces are one-dimensional complex manifolds. An $n$-dimensional complex manifold $M$ is a topological Hausdorff space (i.e., for any two points $x \neq y$ on $M$, there are disjoint open neighborhoods containing $x$ and $y$), which has a countable basis for its topology, together with a complex atlas $A$. The latter is an open covering $(U_\alpha)_{\alpha \in A}$ together with homeomorphisms $f_\alpha : U_\alpha \to V_\alpha \subset \mathbb{C}^n$, where the $U_\alpha$ are open subsets of $M$ and the $V_\alpha$ are open sets in $\mathbb{C}^n$. The main requirement is that these charts are holomorphically compatible, that is, for $U_\alpha \cap U_\beta \neq \emptyset$, the map shown in Figure 1,

\[ f_\beta \circ f_\alpha^{-1}|_{f_\alpha(U_\alpha \cap U_\beta)} : f_\alpha(U_\alpha \cap U_\beta) \to f_\beta(U_\alpha \cap U_\beta) \subset \mathbb{C}^n \]

is biholomorphic. A map $h : M \to N$ between two complex manifolds is holomorphic if it is so with respect to the local charts. This means the following: for each point $x \in M$, there are charts $f_\alpha^M : U_\alpha^M \to V_\alpha^M \subset \mathbb{C}^n$ near $x$ and $f_\beta^N : U_\beta^N \to V_\beta^N \subset \mathbb{C}^m$ near $h(x)$ with $h(U_\alpha^M) \subset U_\beta^N$ such that the map shown in Figure 2

\[ f_\beta^N \circ h \circ (f_\alpha^M)^{-1} : V_\alpha^M \to V_\beta^N \subset \mathbb{C}^m \]

is holomorphic (one checks easily that this does not depend on the choice of the charts).

Figure 1 Charts of a complex manifold.

Figure 2 Holomorphic map between manifolds.

A Riemann surface is a one-dimensional complex manifold. Trivial examples are given by open sets in $\mathbb{C}$ (where one chart suffices). Another example is the Riemann sphere $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$, which can be covered by the two charts given by $z \neq \infty$ and $z \neq 0$. Both of these charts are homeomorphic to $\mathbb{C}$ with the transition function given by $z \mapsto 1/z$. Historically, Riemann surfaces were viewed as (branched) coverings of $\mathbb{C}$ or of the sphere, where they appear as the natural domain of definition of multivalued holomorphic or meromorphic functions.

Uniformization

If $M$ is a Riemann surface, then its universal covering $\tilde{M}$ is again a Riemann surface. The connected and simply connected Riemann surfaces can be fully classified. Let

\[ E = \{z \in \mathbb{C};\ |z| < 1\} \]

be the unit disk and $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ the Riemann sphere. The latter can be identified with the complex projective line $\mathbb{P}^1_{\mathbb{C}}$.

Theorem 1 (Generalized Riemann mapping theorem). Every connected and simply connected Riemann surface is biholomorphically equivalent
to the unit disk $E$, the complex plane $\mathbb{C}$, or the Riemann sphere $\hat{\mathbb{C}}$.

This theorem was proved rigorously by Koebe and Poincaré at the beginning of the twentieth century.

Compact Riemann Surfaces

The topological structure of a compact Riemann surface $C$ is determined by its genus $g$ (Figure 3). Topologically, a Riemann surface of genus $g$ is a sphere with $g$ handles or, equivalently, a torus with $g$ holes.

Figure 3 Genus of Riemann surfaces.

Analytically, the genus can be characterized as the maximal number of linearly independent holomorphic forms on $C$ (see also the section "The Riemann–Roch theorem and applications").

There exists a very close link with algebraic geometry: every compact Riemann surface $C$ can be embedded into some projective space $\mathbb{P}^n_{\mathbb{C}}$ (in fact already into $\mathbb{P}^3_{\mathbb{C}}$). By Chow's theorem, $C$ is then a (projective) algebraic variety, that is, it can be described by finitely many homogeneous equations. It should be noted that such a phenomenon is special to complex dimension 1. The crucial point is that one can always construct a nonconstant meromorphic function on a Riemann surface (e.g., by Dirichlet's principle). Given such a function, it is not difficult to find a projective embedding of a compact Riemann surface $C$. On the other hand, it is easy to construct a compact two-dimensional torus $T = \mathbb{C}^2/L$ for some suitably chosen lattice $L$, which cannot be embedded into any projective space $\mathbb{P}^n_{\mathbb{C}}$.

The dichotomy Riemann surface/algebraic curve arises from different points of view: analysts think of a real two-dimensional surface with a Riemannian metric which, via isothermal coordinates, defines a holomorphic structure, whereas algebraic geometers think of a complex one-dimensional object.

In this article, the expressions compact Riemann surface and (projective) algebraic curve are both used interchangeably. The choice depends on which expression is more commonly used in the part of the theory which is discussed in the relevant section.

Periods and the Jacobian

On a compact Riemann surface $C$ of genus $g$, there exist $2g$ homologically independent paths, that is, $H_1(C, \mathbb{Z}) \cong \mathbb{Z}^{2g}$. Let $\gamma_1, \ldots, \gamma_{2g}$ be a basis of $H_1(C, \mathbb{Z})$ and let $\omega_1, \ldots, \omega_g$ be a basis of the space of holomorphic 1-forms on $C$. Integrating these forms over the paths $\gamma_1, \ldots, \gamma_{2g}$ defines the period matrix

\[ \Omega = \begin{pmatrix} \int_{\gamma_1}\omega_1 & \cdots & \int_{\gamma_{2g}}\omega_1 \\ \vdots & & \vdots \\ \int_{\gamma_1}\omega_g & \cdots & \int_{\gamma_{2g}}\omega_g \end{pmatrix} \]

If $Q = (\gamma_i, \gamma_j)$ is the intersection matrix of the paths $\gamma_1, \ldots, \gamma_{2g}$, then $\Omega$ satisfies the Riemann bilinear relations

\[ \Omega Q \Omega^t = 0, \qquad \sqrt{-1}\, \Omega Q \bar{\Omega}^t > 0 \tag{1} \]

where the latter condition means positive definite. One can choose (see Figure 4) $\gamma_1, \ldots, \gamma_{2g}$ such that

\[ Q = J = \begin{pmatrix} 0 & 1_g \\ -1_g & 0 \end{pmatrix} \]

where $1_g$ is the $g \times g$ unit matrix. Moreover, $\omega_1, \ldots, \omega_g$ can be chosen such that

\[ \Omega = \begin{pmatrix} 1 & \cdots & 0 & \tau_{11} & \cdots & \tau_{1g} \\ \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 1 & \tau_{g1} & \cdots & \tau_{gg} \end{pmatrix} \]

Let

\[ \Omega' = (\tau_{ij})_{1 \le i, j \le g} \]

Then the Riemann bilinear relations [1] become

\[ \Omega' = \Omega'^t, \qquad \operatorname{Im} \Omega' > 0 \]

that is, $\Omega'$ is an element of the Siegel upper half-space

\[ \mathbb{H}_g = \{\tau \in \mathrm{Mat}(g \times g, \mathbb{C});\ \tau = \tau^t,\ \operatorname{Im} \tau > 0\} \]

Figure 4 Homology of a compact Riemann surface.

The matrix $\Omega'$ is defined by the Riemann surface $C$ only up to the action of the symplectic group

\[ \mathrm{Sp}(2g, \mathbb{Z}) = \{M \in \mathrm{Mat}(2g \times 2g, \mathbb{Z});\ M J M^t = J\} \]
which acts on the Siegel space $\mathbb{H}_g$ by

\[ M = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : \tau \mapsto (A\tau + B)(C\tau + D)^{-1} \]

Here $A, \ldots, D$ are $g \times g$ blocks.

The columns of the matrix $\Omega$ define a rank-$2g$ lattice $L$ in $\mathbb{C}^g$ and the Jacobian of $C$ is the torus

\[ J(C) = \mathbb{C}^g / L \]

More intrinsically, one can define $J(C)$ as follows. Let $H^0(C, \omega_C)$ be the space of holomorphic differential forms on $C$. Then, integration over cycles defines a monomorphism

\[ H_1(C, \mathbb{Z}) \to H^0(C, \omega_C)^*, \qquad \gamma \mapsto \int_\gamma \]

and

\[ J(C) = H^0(C, \omega_C)^* / H_1(C, \mathbb{Z}) \]

For a fixed base point $P_0 \in C$, the Abel–Jacobi map is defined by

\[ u : C \to J(C), \qquad P \mapsto \left( \int_{P_0}^{P} \omega_1, \ldots, \int_{P_0}^{P} \omega_g \right) \]

Here, the integration is taken over some path from $P_0$ to $P$. Obviously, the integral depends on the choice of this path, but since $J(C)$ was obtained by dividing out the periods given by integrating over a basis of $H_1(C, \mathbb{Z})$, the map is well defined.

Let $C^d$ be the $d$th Cartesian product of $C$, that is, the set of all ordered $d$-tuples $(P_1, \ldots, P_d)$. Then, $u$ defines a map

\[ u^d : C^d \to J(C), \qquad (P_1, \ldots, P_d) \mapsto u(P_1) + \cdots + u(P_d) \]

where $+$ is the usual addition on the torus $J(C)$. If $d = g - 1$, then

\[ \Theta = \operatorname{Im}(u^{g-1}) \subset J(C) \]

is a hypersurface (i.e., has codimension 1 in $J(C)$) and is called a theta divisor. A different choice of the base point $P_0$ results in a translation of the theta divisor. Using the theta divisor, one can show that $J(C)$ is an abelian variety, that is, $J(C)$ can be embedded into some projective space $\mathbb{P}^n_{\mathbb{C}}$. The pair $(J(C), \Theta)$ is a principally polarized abelian variety and Torelli's theorem states that $C$ can be reconstructed from its Jacobian $J(C)$ and the theta divisor $\Theta$.

Divisors, Linear Systems, and Line Bundles

A divisor $D$ on $C$ is a formal sum

\[ D = n_1 P_1 + \cdots + n_k P_k, \qquad P_i \in C,\ n_i \in \mathbb{Z} \]

The degree of $D$ is defined as

\[ \deg D = n_1 + \cdots + n_k \]

and $D$ is called "effective" if all $n_i \ge 0$. Every meromorphic function $f \neq 0$ defines a divisor

\[ (f) = f_0 - f_\infty \]

where $f_0$ are the zeros of $f$ and $f_\infty$ the poles (each counted with multiplicity). Divisors of the form $(f)$ are called principal divisors and the degree of any principal divisor is 0 (see the next section). Two divisors $D_1$ and $D_2$ are called linearly equivalent ($D_1 \sim D_2$) if their difference is a principal divisor, that is,

\[ D_1 - D_2 = (f) \]

for some meromorphic function $f \neq 0$. This defines an equivalence relation on the group $\mathrm{Div}(C)$ of all divisors on $C$. Since principal divisors have degree 0, the notion of degree also makes sense for classes of linearly equivalent divisors. We define the divisor class group of $C$ by

\[ \mathrm{Cl}(C) = \mathrm{Div}(C) / \sim \]

The degree map defines an exact sequence

\[ 0 \to \mathrm{Cl}^0(C) \to \mathrm{Cl}(C) \xrightarrow{\ \deg\ } \mathbb{Z} \to 0 \]

where $\mathrm{Cl}^0(C)$ is the subgroup of $\mathrm{Cl}(C)$ of divisor classes of degree 0.

Let $C_d$ be the set of unordered $d$-tuples of points on $C$, that is,

\[ C_d = C^d / S_d \]

where the symmetric group $S_d$ acts on the Cartesian product $C^d$ by permutation. This is again a smooth projective variety and the Abel–Jacobi map $u^d : C^d \to J(C)$ clearly factors through a map

\[ u_d : C_d \to J(C) \]

The fibers of this map are of particular interest.

Theorem 2 (Abel). Two effective divisors $D_1$ and $D_2$ on $C$ of the same degree $d$ are linearly equivalent if and only if $u_d(D_1) = u_d(D_2)$.

One normally denotes the inverse image of $u_d(D)$ by

\[ |D| = u_d^{-1}(u_d(D)) = \{D';\ D' \ge 0,\ D' \sim D\} \]

Note that the latter description also makes sense if $D$ itself is not necessarily effective. One calls $|D|$ the
complete linear system defined by the divisor $D$. If $\deg D < 0$, then automatically $|D| = \emptyset$, but the converse is not necessarily true. Let $\mathcal{M}_C$ be the field of meromorphic (or equivalently rational) functions on $C$. Then, one defines

\[ L(D) = \{f \in \mathcal{M}_C;\ (f) \ge -D\} \]

This is a $\mathbb{C}$-vector space and it is not difficult to see that $L(D)$ has finite dimension. To every function $0 \neq f \in L(D)$, one can associate the effective divisor

\[ D_f = (f) + D \ge 0 \]

Clearly, $D_f \sim D$ and every effective divisor with this property arises in this way. This gives a bijection

\[ \mathbb{P}(L(D)) = |D| \]

showing that the complete linear system $|D|$ has the structure of a projective space. A linear system is a projective subspace of some complete linear system $|D|$.

Clearly, the map $u_d : C_d \to J(C)$ can be extended to the set $\mathrm{Div}^d(C)$ of degree $d$ divisors and Abel's theorem then states that this map factors through $\mathrm{Cl}^d(C)$, that is, that we have a commutative diagram

\[ \mathrm{Div}^d(C) \to \mathrm{Cl}^d(C) \xrightarrow{\ \bar{u}_d\ } J(C), \qquad u_d : \mathrm{Div}^d(C) \to J(C) \text{ the composite} \]

where $\bar{u}_d$ is injective.

Theorem 3 (Jacobi's Inversion Theorem). The map $\bar{u}_d$ is surjective and hence induces an isomorphism

\[ \bar{u}_d : \mathrm{Cl}^d(C) \cong J(C) \]

It should be noted that the definition of the maps $\bar{u}_d$ depends on the choice of a base point $P_0 \in C$. Hence, the maps $\bar{u}_d$ are not canonical, with the exception of the isomorphism $\bar{u}_0 : \mathrm{Cl}^0(C) \cong J(C)$ where the choice of $P_0$ drops out.

The concepts of divisors and linear systems can be rephrased in the language of line bundles. A (holomorphic) vector bundle on a complex manifold $M$ is a complex manifold $E$ together with a projection $p : E \to M$ which is a locally trivial $\mathbb{C}^r$-bundle. This means that an open covering $(U_\alpha)_{\alpha \in A}$ of $M$ and local trivializations

\[ \varphi_\alpha : p^{-1}(U_\alpha) \xrightarrow{\ \cong\ } U_\alpha \times \mathbb{C}^r, \qquad \mathrm{pr}_{U_\alpha} \circ \varphi_\alpha = p \]

exist, such that the transition maps

\[ \varphi_\alpha \circ \varphi_\beta^{-1}|_{(U_\alpha \cap U_\beta) \times \mathbb{C}^r} : (U_\alpha \cap U_\beta) \times \mathbb{C}^r \to (U_\alpha \cap U_\beta) \times \mathbb{C}^r \]

are fiberwise linear isomorphisms. If $M$ is connected, then $r$ is constant and is called the rank of the vector bundle. A line bundle is simply a rank-1 vector bundle.

Alternatively, one can view vector bundles as locally free $\mathcal{O}_M$-modules, where $\mathcal{O}_M$ denotes the structure sheaf of holomorphic (or in the algebro-geometric setting regular) functions on $M$. An $\mathcal{O}_M$-module $\mathcal{E}$ is called locally free of rank $r$ if an open covering $(U_\alpha)_{\alpha \in A}$ of $M$ exists such that $\mathcal{E}|_{U_\alpha} \cong \mathcal{O}_{U_\alpha}^{\oplus r}$. The transition functions of a locally free sheaf can be used to define a vector bundle and vice versa, and hence the concepts of vector bundles and locally free sheaves can be used interchangeably. The open coverings $U_\alpha$ can be viewed either in the complex topology, or, if $M$ is an algebraic variety, in the Zariski topology, thus leading to either holomorphic vector bundles (locally free sheaves in the $\mathbb{C}$-topology) or algebraic vector bundles (locally free sheaves in the Zariski topology). Clearly, every algebraic vector bundle defines a holomorphic vector bundle. Conversely, on a projective variety $M$, Serre's GAGA theorem (géométrie algébrique et géométrie analytique), a vast generalization of Chow's theorem, states that there exists a bijection between the equivalence classes of algebraic and holomorphic vector bundles (locally free sheaves).

The Picard group $\operatorname{Pic} M$ is the set of all isomorphism classes of line bundles on $M$. The tensor product defines a group structure on $\operatorname{Pic} M$ where the neutral element is the trivial line bundle $\mathcal{O}_M$ and the inverse of a line bundle $\mathcal{L}$ is its dual bundle $\mathcal{L}^*$, which is also denoted by $\mathcal{L}^{-1}$. For this reason, locally free sheaves of rank 1 are also called invertible sheaves.

We now return to the case of a compact Riemann surface (algebraic curve) $C$. The concepts of line bundles and divisors can be translated into each other. If $D = \sum n_i P_i$ is a divisor on $C$ and $U$ an open set, then we denote by $D_U$ the restriction of $D$ to $U$, that is, the divisor consisting of all points $P_i \in U$ with multiplicity $n_i$. One then defines a locally free sheaf (line bundle) $\mathcal{L}(D)$ by

\[ \mathcal{L}(D)(U) = \{f \in \mathcal{M}_C(U);\ (f) \ge -D_U\} \]

To see that this is locally free, it is enough to consider for each point $P_i$ a neighborhood $U_i$ on which a holomorphic function $t_i$ exists, which vanishes only at $P_i$ and there of order 1 (i.e., it is a local parameter near the point $P_i$). Then,

\[ \mathcal{L}(D)(U_i) = t_i^{-n_i}\, \mathcal{O}_{U_i} \cong \mathcal{O}_{U_i} \]

This correspondence defines a map

\[ \operatorname{Div} C \to \operatorname{Pic} C, \qquad D \mapsto \mathcal{L}(D) \]

It is not hard to show that:

1. every line bundle $\mathcal{L} \in \mathrm{Pic}\, C$ is of the form $\mathcal{L} = \mathcal{L}(D)$ for some divisor $D$ on the curve $C$;
2. $D_1 \sim D_2 \iff \mathcal{L}(D_1) \cong \mathcal{L}(D_2)$;
3. $\mathcal{L}(D_1) \otimes \mathcal{L}(D_2) \cong \mathcal{L}(D_1 + D_2)$; and
4. $\mathcal{L}(-D) \cong \mathcal{L}(D)^{-1}$.

Hence, there is an isomorphism of abelian groups

$$\mathrm{Cl}(C) \cong \mathrm{Pic}\, C$$

This correspondence allows one to define the degree of a line bundle $\mathcal{L}$. In the complex analytic setting this can also be interpreted as follows. Let $\mathcal{O}_C^*$ be the sheaf of nowhere-vanishing functions. Using cocycles, one easily identifies

$$H^1(C, \mathcal{O}_C^*) \cong \mathrm{Pic}\, C$$

and the exponential sequence

$$0 \to \mathbb{Z} \to \mathcal{O}_C \xrightarrow{\ \exp\ } \mathcal{O}_C^* \to 0$$

induces an exact sequence

$$0 \to H^1(C, \mathbb{Z}) \to H^1(C, \mathcal{O}_C) \to H^1(C, \mathcal{O}_C^*) = \mathrm{Pic}\, C \to H^2(C, \mathbb{Z})$$

The last map in this exact sequence associates to each line bundle $\mathcal{L}$ its first Chern class $c_1(\mathcal{L}) \in H^2(C, \mathbb{Z}) \cong \mathbb{Z}$, which can be identified with the degree of $\mathcal{L}$. Hence, the subgroup $\mathrm{Pic}^0 C$ of degree 0 line bundles on $C$ is isomorphic to

$$\mathrm{Pic}^0 C \cong H^1(C, \mathcal{O}_C)/H^1(C, \mathbb{Z})$$

Altogether there are identifications

$$\mathrm{Pic}^0 C \cong \mathrm{Cl}^0 C \cong J(C)$$

The Riemann–Roch Theorem and Applications

For every divisor $D$ on a compact Riemann surface $C$, the discussion of the preceding section shows that there is an identification of finite-dimensional vector spaces

$$L(D) = H^0(C, \mathcal{L}(D))$$

where $H^0(C, \mathcal{L}(D))$ is the space of global sections of the line bundle $\mathcal{L}(D)$. One defines

$$l(D) = \dim_{\mathbb{C}} L(D)$$

It is a crucial question in the theory of compact Riemann surfaces to study the dimension $l(D)$ as $D$ varies.

The canonical bundle $\omega_C$ of $C$ is defined as the dual of the tangent bundle of $C$. Its global sections are holomorphic 1-forms. Every divisor $K_C$ on $C$ with $\omega_C = \mathcal{L}(K_C)$ is called (a) canonical divisor. The canonical divisors are the divisors of the meromorphic 1-forms on $C$, whereas the effective canonical divisors correspond to the divisors of holomorphic 1-forms (here, we simply write a 1-form locally as $f(z)\,dz$ and define a divisor by taking the zeros, resp. poles of $f(z)$). By abuse of notation, we also denote the divisor class corresponding to canonical divisors by $K_C$. There is a natural identification

$$\mathbb{P}(H^0(C, \omega_C)) = |K_C|$$

For a divisor $D$, the index of speciality is defined by

$$i(D) = l(K_C - D) = \dim_{\mathbb{C}} L(K_C - D)$$

The linear system $|K_C - D|$ is called the adjoint system of $|D|$. A crucial role is played by the following theorem.

Theorem 4 (Riemann–Roch). For any divisor $D$ on a compact Riemann surface $C$ of genus $g$, the equality

$$l(D) - i(D) = \deg D + 1 - g \qquad [2]$$

holds.

This can also be written in terms of line bundles. If $\mathcal{L}$ is any line bundle, then we denote the dimension of the space of global sections by

$$h^0(\mathcal{L}) = \dim_{\mathbb{C}} H^0(C, \mathcal{L})$$

Then, the Riemann–Roch theorem can be written as

$$h^0(\mathcal{L}) - h^0(\omega_C \otimes \mathcal{L}^{-1}) = \deg \mathcal{L} + 1 - g \qquad [3]$$

This can be written yet again in a different way, if we use sheaf cohomology. By Serre duality, there is an isomorphism of cohomology groups

$$H^1(C, \mathcal{L}) \cong H^0(C, \omega_C \otimes \mathcal{L}^{-1})^*$$

and hence if we set

$$h^1(\mathcal{L}) = \dim_{\mathbb{C}} H^1(C, \mathcal{L})$$

then [3] reads

$$h^0(\mathcal{L}) - h^1(\mathcal{L}) = \deg \mathcal{L} + 1 - g \qquad [4]$$

Whereas [2] is the classical formulation of the Riemann–Roch theorem, formula [4] is the formulation which is more suitable for generalizations. From this point of view, the classical Riemann–Roch theorem is a combination of the cohomological formulation [4] together with Serre duality.

The Riemann–Roch theorem has been vastly generalized. This was first achieved by Hirzebruch who proved what is nowadays called the Hirzebruch–Riemann–Roch theorem for vector bundles on projective manifolds. A further generalization is due to Grothendieck, who proved a "relative" version involving maps between varieties. Nowadays, theorems like the Hirzebruch–Riemann–Roch theorem can be

viewed as special cases of the Atiyah–Singer index theorem for elliptic operators. The latter also contains the Gauss–Bonnet theorem from differential geometry as a special case. Moreover, Serre duality holds in much greater generality, namely for coherent sheaves on projective varieties.

Applying the Riemann–Roch theorem [3] to the zero divisor $D = 0$, resp. the trivial line bundle $\mathcal{O}_C$, one obtains

$$h^0(\omega_C) = g \qquad [5]$$

that is, the number of independent global holomorphic 1-forms equals the genus of the curve $C$. Similarly, for $D = K_C$, resp. $\mathcal{L} = \omega_C$, we find from [3] and [5] that

$$\deg K_C = 2g - 2$$

These relations show how the Riemann–Roch theorem links analytic, resp. algebraic, invariants with the topology of the curve $C$.

Finally, if $\deg D > 2g - 2$, then $\deg(K_C - D) < 0$ and hence $i(D) = l(K_C - D) = 0$ and [2] becomes

$$l(D) = \deg D + 1 - g \quad \text{if } \deg D > 2g - 2$$

which is Riemann's original version of the theorem.

Classically, linear series arose in the study of projective embeddings of algebraic curves. For a nonzero effective divisor

$$D = \sum_{i=1}^{k} n_i P_i, \qquad n_i > 0$$

the support of $D$ is defined by

$$\mathrm{supp}(D) = \{P_1, \ldots, P_k\}$$

A complete linear system $|D|$ is called base point free, if no point $P$ exists which is in the support of every divisor $D' \in |D|$. This is the same as saying that for every $P \in C$ a section $s \in H^0(C, \mathcal{L}(D))$ exists which does not vanish at $P$. Let $|D|$ be base point free and let $s_0, \ldots, s_n \in H^0(C, \mathcal{L}(D))$ be a basis of the space of sections. Then, one obtains a map

$$\varphi_{|D|} : C \to \mathbb{P}(H^0(C, \mathcal{L}(D))) = \mathbb{P}^n, \qquad P \mapsto (s_0(P) : \ldots : s_n(P))$$

The divisors $D' \in |D|$ are then exactly the pullbacks of the hyperplanes $H$ of $\mathbb{P}^n$ under the map $\varphi_{|D|}$. Note that the map $\varphi_{|D|}$ as defined here depends on the choice of the basis $s_0, \ldots, s_n$, but any two such choices only differ by an automorphism of $\mathbb{P}^n$. We say that $|D|$, resp. the associated line bundle $\mathcal{L} = \mathcal{L}(D)$, is very ample if $\varphi_{|D|}$ defines an embedding. Using the Riemann–Roch theorem, it is not difficult to prove:

Proposition 1 Let $D$ be a divisor of degree $d$ on the curve $C$. Then

(i) $|D|$ is base point free if $d \ge 2g$ and
(ii) $|D|$ is very ample if $d \ge 2g + 1$.

If the genus $g(C) \ge 2$, then one can prove that $|K_C|$ is base point free and consider the canonical map

$$\varphi_{|K_C|} : C \to \mathbb{P}^{g-1}$$

A curve $C$ is called hyperelliptic if there exists a surjective map $f : C \to \mathbb{P}^1$ which is a covering of degree 2. In genus 2 every curve is hyperelliptic, whereas for genus $g \ge 3$ hyperelliptic curves are special. The connection with the canonical map is given by

Theorem 5 (Clifford). Let $C$ be a curve of genus $g \ge 2$. Then the canonical map is an embedding if and only if $C$ is not hyperelliptic.

We end this section by stating Hurwitz's theorem: Let $f : C \to D$ be a surjective holomorphic map between compact Riemann surfaces (if $f$ is not constant then it is automatically surjective). Then, near a point $P \in C$ the map $f$ is given in local analytic coordinates by $f(t) = t^{n_P}$ and we call $f$ "ramified" of order $n_P$ if $n_P > 1$. The ramification divisor of $f$ is defined as

$$R = \sum_{P \in C} (n_P - 1) P$$

Note that this is a finite sum. If we define

$$f^*(Q) = \sum_{P \in f^{-1}(Q)} n_P P$$

then one can show that

$$\deg f = \deg f^*(Q) = \sum_{P \in f^{-1}(Q)} n_P$$

is independent of the point $Q$. This number is called the degree of the map $f$. (This should not be confused with the degree $\deg(f)$ of the principal divisor $(f)$ defined by $f$.) In fact, applying the above equality to the map $f : C \to \mathbb{P}^1$ associated to a nonconstant meromorphic function $f$ shows that the degree of the principal divisor $(f)$ is zero, since

$$\deg(f) = \deg f^*(0) - \deg f^*(\infty) = 0$$

Theorem 6 (Hurwitz). Let $f : C \to D$ be a surjective holomorphic map between compact Riemann surfaces of genus $g(C)$ and $g(D)$, respectively. Then,

$$2g(C) - 2 = \deg f \cdot (2g(D) - 2) + \deg R$$

where $R$ is the ramification divisor.
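As an illustration of how these formulas are used in practice, the following minimal Python sketch evaluates Hurwitz's formula and Riemann's version of the Riemann–Roch theorem numerically. The function names (`ramification_degree`, `genus_of_cover`, `riemann_dimension`) are hypothetical helpers introduced only for this example, and the ramification data are assumed to be supplied as a plain list of the orders $n_P$.

```python
def ramification_degree(orders):
    """Return deg R = sum over ramification points of (n_P - 1)."""
    if any(n < 1 for n in orders):
        raise ValueError("ramification orders n_P must be >= 1")
    return sum(n - 1 for n in orders)


def genus_of_cover(deg_f, genus_target, orders):
    """Solve Hurwitz's formula 2g(C) - 2 = deg f * (2g(D) - 2) + deg R for g(C)."""
    two_g_minus_2 = deg_f * (2 * genus_target - 2) + ramification_degree(orders)
    if two_g_minus_2 % 2 != 0 or two_g_minus_2 < -2:
        raise ValueError("data are not consistent with Hurwitz's formula")
    return (two_g_minus_2 + 2) // 2


def riemann_dimension(deg_D, g):
    """Return l(D) = deg D + 1 - g, valid only when deg D > 2g - 2."""
    if deg_D <= 2 * g - 2:
        raise ValueError("formula only holds for deg D > 2g - 2")
    return deg_D + 1 - g


if __name__ == "__main__":
    # Double cover of the projective line branched at six points: genus 2.
    print(genus_of_cover(2, 0, [2] * 6))
    # Double cover branched at four points: an elliptic curve.
    print(genus_of_cover(2, 0, [2] * 4))
    # A divisor of degree 5 on a curve of genus 2.
    print(riemann_dimension(5, 2))
```

The double covers in the usage lines are the hyperelliptic situation described above: a degree 2 map to the projective line with $2g + 2$ simple ramification points gives a curve of genus $g$.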

Brill–Noether Theory

In this section, we state the main results of Brill–Noether theory. For a divisor $D$ on a curve $C$ we denote by

$$r(D) = l(D) - 1$$

the projective dimension of the complete linear system $|D|$. The principal objects of Brill–Noether theory are the sets $W_d^r \subset \mathrm{Cl}^d(C) = \mathrm{Pic}^d(C)$ given by

$$W_d^r(C) = \{ D;\ \deg D = d,\ r(D) \ge r \}$$

These sets are subvarieties of $\mathrm{Cl}^d(C) = \mathrm{Pic}^d(C)$. We denote by $g_d^r$ a linear system (not necessarily complete) of degree $d$ and projective dimension $r$. Closely related to the varieties $W_d^r$ are the sets

$$G_d^r(C) = \{ \mathfrak{d};\ \mathfrak{d} \text{ is a } g_d^r \text{ on } C \}$$

These sets also have a natural structure as a projective variety. Clearly, there are maps $G_d^r(C) \to W_d^r(C)$.

If $g = g(C)$ is the genus of the curve $C$, then the Brill–Noether number is defined as

$$\rho(g, r, d) = g - (r + 1)(g - d + r)$$

Its significance is that it is the expected dimension of the varieties $G_d^r(C)$. The two basic results of Brill–Noether theory are:

Theorem 7 (Existence Theorem). Let $C$ be a curve of genus $g$. Let $d, r$ be integers such that $d \ge 1$, $r \ge 0$, and $\rho(g, r, d) \ge 0$. Then $G_d^r(C)$ and hence $W_d^r(C)$ are nonempty and every component of $G_d^r(C)$ has dimension at least $\rho$. If $r \ge d - g$, then the same is true for $W_d^r(C)$.

Theorem 8 (Connectedness Theorem). Let $C$ be a curve of genus $g$ and $d, r$ integers such that $d \ge 1$, $r \ge 0$, and $\rho(g, r, d) \ge 1$. Then $G_d^r(C)$ and hence also $W_d^r(C)$ are connected.

The above theorems hold for all curves $C$. There are other theorems which only hold for general curves (where general means outside a countable union of proper subvarieties in the moduli space, see the section "Moduli of compact Riemann surfaces").

Theorem 9 (Dimension Theorem). Let $C$ be a general curve of genus $g$ and $d \ge 1$, $r \ge 0$ integers. If $\rho(g, r, d) < 0$, then $G_d^r(C) = \emptyset$. If $\rho \ge 0$, then every component of $G_d^r(C)$ has dimension $\rho$.

Theorem 10 (Smoothness Theorem). Let $C$ be a general curve of genus $g$ and $d \ge 1$, $r \ge 0$. Then, $G_d^r(C)$ is smooth of dimension $\rho$. If $\rho \ge 1$, then $G_d^r(C)$ and hence $W_d^r(C)$ are irreducible.

Brill–Noether theory started with a paper of Brill and Noether in 1873. It was, however, only from the 1970s onwards that the main theorems could be proved rigorously, due to the work of Griffiths, Harris, Kleiman, Mumford, and many others. For an extensive treatment of the theory, as well as a list of references, the reader is referred to Arbarello et al. (1985).

Green's Conjecture

In recent years, much progress was achieved in understanding the equations of canonical curves. If the curve $C$ is not hyperelliptic, then the canonical map $\varphi_{|K_C|} : C \to \mathbb{P}^{g-1}$ defines an embedding. We shall, in this case, identify $C$ with its image in $\mathbb{P}^{g-1}$ and call this a canonical curve. The Clifford index (for a precise definition see Lazarsfeld (1989)) is a first measure of how special a curve $C$ is with respect to the canonical map. Hyperelliptic curves, where the canonical map fails to be an embedding, have, by definition, Clifford index 0. The two next special cases are plane quintic curves (they have a $g_5^2$) and trigonal curves. A curve $C$ is called trigonal, if there is a $3 : 1$ map $C \to \mathbb{P}^1$, in which case $C$ has a $g_3^1$. More generally, the gonality of a curve $C$ is the minimal degree of a surjective map $C \to \mathbb{P}^1$. Plane quintics and trigonal curves are precisely the curves which have Clifford index 1.

Theorem 11 (Enriques–Babbage). If $C \subset \mathbb{P}^{g-1}$ is a canonical curve, then $C$ is either defined by quadratic equations, or it is trigonal or isomorphic to a plane quintic curve (i.e., it has Clifford index 1).

One can now ask more refined questions about the equations defining canonical curves and the relations (syzygies) among these equations. This leads to looking at the minimal free resolution of a canonical curve $C$, which is of the form

$$0 \leftarrow \mathcal{I}_C \leftarrow \bigoplus_j \mathcal{O}_{\mathbb{P}^{g-1}}(-j)^{\beta_{0j}} \leftarrow \cdots \leftarrow \bigoplus_j \mathcal{O}_{\mathbb{P}^{g-1}}(-j)^{\beta_{kj}} \leftarrow 0$$

Here, $\mathcal{I}_C$ is the ideal sheaf of $C$ and $\mathcal{O}_{\mathbb{P}^{g-1}}(n)$ is the $n$th power of the dual of the Hopf bundle (or tautological sub-bundle) on $\mathbb{P}^{g-1}$ if $n \ge 0$, resp. the $|n|$th power of the Hopf bundle if $n < 0$. The $\beta_{ij}(C)$ are called the Betti numbers of $C$. The Green conjecture predicts a link between the nonvanishing of certain Betti numbers and geometric properties of the canonical curve, such as the existence of multisecants. Recently, C Voisin and M Teixidor have proved the Green conjecture for general curves of given gonality (see Beauville (2003)).
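The role of the Brill–Noether number in the existence and dimension theorems of the Brill–Noether section can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and `general_gonality` simply searches for the smallest degree $d$ with $\rho(g, 1, d) \ge 0$, which by Theorems 7 and 9 is the smallest degree for which a general curve of genus $g$ carries a $g_d^1$.

```python
def brill_noether_number(g, r, d):
    """rho(g, r, d) = g - (r + 1)(g - d + r), the expected dimension of G^r_d(C)."""
    return g - (r + 1) * (g - d + r)


def general_curve_has_grd(g, r, d):
    """For a general curve of genus g, a g^r_d exists iff rho(g, r, d) >= 0
    (Existence Theorem together with the Dimension Theorem; needs d >= 1, r >= 0)."""
    if d < 1 or r < 0:
        raise ValueError("the theorems require d >= 1 and r >= 0")
    return brill_noether_number(g, r, d) >= 0


def general_gonality(g):
    """Smallest d such that a general curve of genus g carries a g^1_d."""
    d = 1
    while not general_curve_has_grd(g, 1, d):
        d += 1
    return d


if __name__ == "__main__":
    # rho(3, 1, 2) < 0: a general curve of genus 3 has no g^1_2,
    # i.e. it is not hyperelliptic.
    print(brill_noether_number(3, 1, 2))
    # rho(2, 1, 2) = 0: every curve of genus 2 has a g^1_2.
    print(brill_noether_number(2, 1, 2))
    print([general_gonality(g) for g in range(2, 8)])
```

The first two values agree with the earlier statement that every curve of genus 2 is hyperelliptic, whereas hyperelliptic curves are special from genus 3 onwards.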

Moduli of Compact Riemann Surfaces

As a set, the moduli space of compact Riemann surfaces of genus $g$ is defined as

$$M_g = \{ C;\ C \text{ is a compact Riemann surface of genus } g \} / \cong$$

For genus $g = 0$, the only Riemann surface is the Riemann sphere $\hat{\mathbb{C}} = \mathbb{P}^1$ and hence $M_0$ consists of one point only. Every Riemann surface of genus 1 is a torus

$$E = \mathbb{C}/L$$

for some lattice $L$, which can be written in the form

$$L = \mathbb{Z} + \mathbb{Z}\tau, \qquad \mathrm{Im}\, \tau > 0$$

Two elliptic curves $E = \mathbb{C}/L$ and $E' = \mathbb{C}/L'$ are isomorphic if and only if a matrix

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$$

exists with

$$\tau' = \frac{a\tau + b}{c\tau + d}$$

This proves that

$$M_1 = \mathbb{H}_1/\mathrm{SL}(2, \mathbb{Z})$$

and this construction also shows that $M_1$ can itself be given the structure of a Riemann surface. Using the $j$-function, one obtains that

$$M_1 \cong \mathbb{C}$$

The situation is considerably more complicated for genus $g \ge 2$. The space of infinitesimal deformations of a curve $C$ is given by $H^1(C, T_C)$ where $T_C$ is the tangent bundle. By Serre duality

$$H^1(C, T_C) \cong H^0(C, \omega_C^{\otimes 2})^*$$

and by Riemann's theorem it then follows that

$$\dim H^1(C, T_C) = \dim H^0(C, \omega_C^{\otimes 2}) = 3g - 3$$

This shows that a curve of genus $g$ depends on $3g - 3$ parameters or moduli, a dimension count which was first performed by Riemann.

In genus 2 every curve has the hyperelliptic involution, and for a general curve of genus 2 this is the only automorphism. In genus $g \ge 3$ the general curve has no automorphisms, but some curves do. The order of the automorphism group is bounded by $84(g - 1)$. The existence of automorphisms for some curves means that $M_g$ is not a manifold, but has singularities. The singularities are, however, fairly mild. Locally, $M_g$ always looks like $\mathbb{C}^{3g-3}/G$ near the origin, where $G$ is a finite group acting linearly on $\mathbb{C}^{3g-3}$. One expresses this by saying that $M_g$ has only finite quotient singularities. A space with this property is also sometimes referred to as a V-manifold or an orbifold. Moreover, $M_g$ is a quasiprojective variety, that is, a Zariski-open subset of a projective variety. As the above parameter count implies, the dimension of $M_g$ is $3g - 3$. At this point it can also be clarified what is meant by a general curve in the context of Brill–Noether theory: a property is said to hold for the general curve in Brill–Noether theory if it holds outside a countable number of proper subvarieties of $M_g$.

It is often useful to work with projective, rather than quasiprojective, varieties. This means that one wants to compactify $M_g$ to a projective variety $\overline{M}_g$, preferably in such a way that the points one adds still correspond to geometric objects. The crucial concept in this context is that of a stable curve. A stable curve of genus $g$ is a one-dimensional projective variety $C$ with the following properties:

1. $C$ is connected (but not necessarily irreducible),
2. $C$ has at most nodal singularities (i.e., two local analytic branches meet transversally),
3. the arithmetic genus $p_a(C) = h^1(C, \mathcal{O}_C) = g$, and
4. the automorphism group $\mathrm{Aut}(C)$ of $C$ is finite.

The last of these conditions is equivalent to the following: if a component of $C$ is an elliptic curve, then this must either meet another component or have a node, and if a component is a rational curve, then this component must either have at least two nodes or one node and intersect another component, or it is smooth and has at least three points of intersection with other components.

It should be noted that, in contrast to the previous illustrations, Figure 5 is drawn from the complex point of view, that is, the curves appear as one-dimensional objects.

The concept of stable curves leads to what is generally known as the Deligne–Mumford compactification of $M_g$:

$$\overline{M}_g = \{ C;\ C \text{ is a stable curve of genus } g \} / \cong$$

Figure 5 An example of a stable curve of genus 3 (component labels in the figure: g = 2, g = 0).
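The genus 1 description $M_1 = \mathbb{H}_1/\mathrm{SL}(2, \mathbb{Z})$ given earlier in this section can be explored numerically. The sketch below is an illustration only, not part of the original text: it uses the standard reduction by integer translations and the inversion $\tau \mapsto -1/\tau$ to move a lattice parameter $\tau$ into the region $|\mathrm{Re}\,\tau| \le 1/2$, $|\tau| \ge 1$, and it records the matrix in $\mathrm{SL}(2, \mathbb{Z})$ that realizes the move. The function name and the `max_steps` safeguard are assumptions made for the example.

```python
def reduce_to_fundamental_domain(tau, max_steps=1000):
    """Return (tau_reduced, (a, b, c, d)) with
    tau_reduced = (a*tau + b) / (c*tau + d), a*d - b*c = 1,
    |Re tau_reduced| <= 1/2 and |tau_reduced| >= 1."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    a, b, c, d = 1, 0, 0, 1  # identity matrix
    for _ in range(max_steps):
        # Translate by an integer so that |Re tau| <= 1/2.
        n = round(tau.real)
        tau = complex(tau.real - n, tau.imag)
        a, b = a - n * c, b - n * d
        if abs(tau) >= 1.0:
            return tau, (a, b, c, d)
        # Invert: tau -> -1/tau strictly increases Im tau when |tau| < 1.
        tau = -1 / tau
        a, b, c, d = -c, -d, a, b
    raise RuntimeError("reduction did not terminate within max_steps")


if __name__ == "__main__":
    tau0 = complex(0.3, 0.2)
    reduced, (a, b, c, d) = reduce_to_fundamental_domain(tau0)
    print(reduced, (a, b, c, d))
    # The two parameters define isomorphic elliptic curves C/(Z + Z*tau).
    print(abs((a * tau0 + b) / (c * tau0 + d) - reduced) < 1e-9)
```

Two parameters with the same reduced value define isomorphic tori; conversely, points on the boundary of the region still have to be identified with each other, which the sketch does not attempt.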

Theorem 12 (Deligne–Mumford, Knudsen). $\overline{M}_g$ is an irreducible, projective variety of dimension $3g - 3$ with only finite quotient singularities.

The spaces $M_g$ have been studied intensively over the last 30 years. From the point of view of classification, an important question is to determine the Kodaira dimension of these spaces.

Theorem 13 (Harris–Mumford, Eisenbud–Harris). The moduli spaces $M_g$ are of general type for $g > 23$.

On the other hand, it is known that $M_g$ is rational for $g \le 6$, unirational for $g \le 14$, and has negative Kodaira dimension for $g \le 16$.

A further topic is to understand the cohomology of $M_g$, resp. the Chow ring, and to compute the intersection theory on $M_g$. For these topics we refer the reader to Vakil (2003).

Closely related is the moduli problem of stable $n$-pointed curves. A stable $n$-pointed curve (Figure 6) is an $(n + 1)$-tuple $(C, x_1, \ldots, x_n)$, where $C$ is a connected nodal curve and $x_1, \ldots, x_n$ are smooth points of $C$ with the stability condition that the automorphism group of $(C, x_1, \ldots, x_n)$ is finite. These curves can be parametrized by a coarse moduli space $\overline{M}_{g,n}$. These spaces share many properties of the spaces $\overline{M}_g$: they are irreducible, projective varieties with finite quotient singularities and of dimension $3g - 3 + n$.

Figure 6 An example of marked stable curve (component labels in the figure: g = 2, g = 0, g = 0).

A further development, which has become very important in recent years, is that of moduli spaces of stable maps. These were introduced by Kontsevich in the context of quantum cohomology. To define stable maps, one first fixes a projective variety $X$ and then considers $(n + 2)$-tuples $(C, x_1, \ldots, x_n, f)$ where $(C, x_1, \ldots, x_n)$ is an $n$-pointed curve of genus $g$ and $f : C \to X$ a map. The stability condition is that this object allows only finitely many automorphisms $\varphi : C \to C$, fixing the marked points $x_1, \ldots, x_n$, such that $f \circ \varphi = f$. In order to obtain meaningful moduli spaces, one also fixes a class $\beta \in H_2(X, \mathbb{Z})$. One then asks for a space parametrizing all stable $(n + 2)$-tuples $(C, x_1, \ldots, x_n, f)$ with the additional property that $f_*[C] = \beta$. This construction is best treated in the language of stacks, and one can show that this moduli problem gives rise to a proper Deligne–Mumford stack $\overline{M}_{g,n}(X, \beta)$. In general, this stack is very complicated, it need not be connected, can be very singular, and may have several components of different dimensions. Its expected dimension is

$$\text{exp.}\dim \overline{M}_{g,n}(X, \beta) = (\dim X - 3)(1 - g) + n + \int_\beta c_1(T_X)$$

Quantum cohomology can now be rephrased as intersection theory on the stack $\overline{M}_{g,n}(X, \beta)$. In general, these stacks do not have the expected dimension. For this reason, Behrend and Fantechi (1997) have constructed a virtual fundamental class of the right dimension, which is the correct tool for the intersection theory which gives the algebro-geometric definition of quantum cohomology. In addition to this, there is also a symplectic formulation. It was shown by B Siebert that both approaches coincide.

Verlinde Formula and Conformal Blocks

The study of vector bundles (locally free sheaves) on a compact Riemann surface is an area of research in its own right. For a rank-$r$ bundle $E$, the slope of $E$ is defined by

$$\mu(E) = \frac{\deg E}{r}$$

where the degree of $E$ is defined as the degree of the line bundle $\Lambda^r E = \det E$. The bundle $E$ is called stable, resp. semistable, if

$$\mu(F) < \mu(E), \quad \text{resp. } \mu(F) \le \mu(E)$$

for every proper sub-bundle $\{0\} \subsetneq F \subsetneq E$. Let $C$ be a compact Riemann surface of genus $g \ge 2$ and let $SU_C(r)$ be the moduli space of semistable rank-$r$ vector bundles with trivial determinant $\det E = \mathcal{O}_C$. This is a projective variety of dimension $(r^2 - 1)(g - 1)$. It contains a smooth open set, whose points correspond to the isomorphism classes of stable vector bundles. The complement of this set is in general the singular locus of $SU_C(r)$ and its points correspond to direct sums of line bundles of degree 0. These are the so-called graded objects of the semistable, but not stable, bundles. By a theorem of Narasimhan and Seshadri, the points of $SU_C(r)$ are also in one-to-one correspondence with the isomorphism classes of representations $\pi_1(C) \to SU(r)$.

Let $L \in \mathrm{Pic}^{g-1}(C)$ be any line bundle of degree $g - 1$ on $C$. Then, the set

$$\Theta_L = \{ E \in SU_C(r);\ \dim H^0(C, E \otimes L) > 0 \}$$

is a Cartier divisor on $SU_C(r)$ and thus defines a line bundle $\mathcal{L}$ on $SU_C(r)$. This is a natural generalization of the construction of the classical theta divisor. The line bundle $\mathcal{L}$ generates the Picard group of the moduli space $SU_C(r)$.

Theorem 14 (Verlinde Formula). If $C$ has genus $g$ and $k$ is a positive integer, then

$$\dim H^0(SU_C(r), \mathcal{L}^k) = \left( \frac{r}{r + k} \right)^{g} \sum_{\substack{S \sqcup T = \{1, \ldots, r + k\} \\ |S| = r}} \ \prod_{\substack{s \in S \\ t \in T}} \left| 2 \sin \pi \frac{s - t}{r + k} \right|^{g - 1}$$

This formula was first found by Verlinde in the context of conformal field theory. Due to this relationship, the spaces $H^0(SU_C(r), \mathcal{L}^k)$ are also called conformal blocks. These spaces can also be defined for principal bundles. Rigorous proofs for the general case of the Verlinde formula are due to Beauville–Laszlo and Faltings. For a survey, see Beauville (1995).

See also: Characteristic Classes; Cohomology Theories; Index Theorems; Mirror Symmetry: a Geometric Survey; Moduli Spaces: An Introduction; Polygonal Billiards; Several Complex Variables: Basic Geometric Theory; Several Complex Variables: Compact Manifolds; Topological Gravity, Two-Dimensional.

Further Reading

Arbarello E, Cornalba M, Griffiths Ph, and Harris J (1985) Geometry of Algebraic Curves, vol. I. New York: Springer.
Beauville A (1995) Vector bundles on curves and generalized theta functions: recent results and open problems. In: Boutet de Montel A and Morchenko V (eds.) Current Topics in Complex Algebraic Geometry, (Berkeley, CA, 1992/93), Math. Sci. Res. Inst. Publ., vol. 28, pp. 17–33. Cambridge: Cambridge University Press.
Beauville A (2003) La conjecture de Green générique [d'après C. Voisin]. Exposé 924 du Séminaire Bourbaki.
Behrend K and Fantechi B (1997) The intrinsic normal cone. Inventiones Mathematicae 128: 45–88.
Faber C and Looijenga E (1999) Remarks on moduli of curves. In: Faber C and Looijenga E (eds.) Moduli of Curves and Abelian Varieties, Aspects Math. E33, pp. 23–45. Braunschweig: Vieweg.
Farkas H and Kra I (1992) Riemann Surfaces, 2nd edn. New York: Springer.
Forster O (1991) Lectures on Riemann Surfaces (translated from the 1977 German Original by Bruce Gilligan), Reprint of the 1981 English Translation. New York: Springer.
Griffiths Ph and Harris J (1994) Principles of Algebraic Geometry, Reprint of the 1978 Edition, Wiley Classics Library. New York: Wiley.
Hartshorne R (1977) Algebraic Geometry. Heidelberg: Springer.
Jost J (1997) Compact Riemann Surfaces. An Introduction to Contemporary Mathematics (translated from the German Manuscript by Simha RR). Berlin: Springer.
Kirwan F (1992) Complex Algebraic Curves. Cambridge: Cambridge University Press.
Lazarsfeld R (1989) A sampling of vector bundle techniques in the study of linear series. In: Cornalba M, Gomez-Mont X, and Verjovsk (eds.) Lectures on Riemann Surfaces, pp. 500–559. Teaneck, NJ: World Scientific.
Miranda R (1995) Algebraic Curves and Riemann Surfaces. Providence: American Mathematical Society.
Mumford D (1995) Algebraic Geometry. I. Complex Projective Varieties, Reprint of the 1976 Edition. Berlin: Springer.
Vakil R (2003) The moduli space of curves and its tautological ring. Notices of the American Mathematical Society 50(6): 647–658.
Weyl H (1997) Die Idee der Riemannschen Fläche (Reprint of the 1913 German Original, With Essays by Reinhold Remmert, Michael Schneider, Stefan Hildebrandt, Klaus Hulek and Samuel Patterson. Edited and with a Preface and a Biography of Weyl by Remmert). Stuttgart: Teubner.

Riemann–Hilbert Methods in Integrable Systems


D Shepelsky, Institute for Low Temperature Physics and Engineering, Kharkov, Ukraine
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The Riemann–Hilbert (RH) method in mathematical physics and analysis consists in reducing a particular problem to the problem of reconstruction of an analytic, scalar- or matrix-valued function in the complex plane from a prescribed jump across a given curve. More precisely, let an oriented contour $\Sigma$ be given in the complex $\lambda$-plane. The contour $\Sigma$ may have points of self-intersections, and it may consist of several connected components; typical contours appearing in applications to integrable systems are shown in Figure 1.

Figure 1 Typical contours for RH problems.

The orientation of an arc in $\Sigma$ defines the $+$ and the $-$ side of $\Sigma$. Suppose in addition that we are given a map $v : \Sigma \to GL(N, \mathbb{C})$ with $v, v^{-1} \in L^\infty(\Sigma)$. The (normalized) RH problem determined by the pair $(\Sigma, v)$ consists in finding an $N \times N$

matrix-valued function $m(\lambda)$ with the following properties:

$$m(\lambda) \text{ is analytic in } \mathbb{C} \setminus \Sigma \qquad [1a]$$

$$m_+(\lambda) = m_-(\lambda) v(\lambda) \text{ for } \lambda \in \Sigma, \text{ where } m_+(\lambda)\ (m_-(\lambda)) \text{ is the limit of } m \text{ from the } +\ (-) \text{ side of } \Sigma \qquad [1b]$$

$$m(\lambda) \to I \text{ (identity matrix) as } \lambda \to \infty \qquad [1c]$$

The precise sense in which the limit at $\infty$ and the boundary values $m_\pm$ are attained is a technical matter that should be specified for each given RH problem $(\Sigma, v)$.

Concerning the name RH problem we note that in the literature (particularly, in the theory of boundary values of analytic functions), the problem of reconstructing a function from its jump across a curve is often called the Hilbert boundary-value problem. The closely related problem of analytic matrix factorization (given $\Sigma$ and $v$, find $G(\lambda)$ analytic and nondegenerate in $\mathbb{C} \setminus \Sigma$ such that $G_+ G_- = v$ on $\Sigma$) is sometimes called the Riemann problem. The name "RH problem" is also attributed to the reconstruction of a Fuchsian system with given poles and a given monodromy group.

In applications, the jump matrix $v$ also depends on certain parameters, in which the original problem at hand is naturally formulated (e.g., $v = v(\lambda; x, t)$ in applications to the integrable nonlinear differential equations in dimension $1 + 1$, with $x$ being the space variable and $t$ the time variable), and the main concern is the behavior of the solution of the RH problem, $m(\lambda; x, t)$, as a function of $x$ and $t$. Particular interest is in the behavior of $m(\lambda; x, t)$ as $x$ and $t$ become large.

In the scalar case, $N = 1$, rewriting the original multiplicative jump condition in the additive form

$$\log m_+(\lambda) = \log m_-(\lambda) + \log v(\lambda)$$

and using the Cauchy–Plemelj–Sokhotskii formula give an explicit integral representation for the solution

$$m(\lambda) = \exp \left\{ \frac{1}{2\pi i} \int_\Sigma \frac{\log v(\xi)}{\xi - \lambda} \, d\xi \right\} \qquad [2]$$

(in the case of nonzero index, $\Delta \log v|_\Sigma \ne 0$, formula [2] admits a suitable modification).

A generic (nonabelian) matrix RH problem cannot be solved explicitly in terms of contour integrals; however, it can always be reduced to a system of linear singular-integral equations, thus linearizing an originally nonlinear system.

The main benefit of reducing an originally nonlinear problem to the analytic factorization of a given matrix function arises in asymptotic analysis. Typically, the dependence of the jump matrix on the external parameters (say, $x$ and $t$) is oscillatory. In analogy with the asymptotic evaluation of oscillatory contour integrals via the classical method of steepest descent, in the asymptotic evaluation of the solution $m(\lambda; x, t)$ of the matrix RH problem as $x, t \to \infty$, the nonlinear steepest-descent method examines the analytic structure of the jump matrix $v(\lambda; x, t)$ in order to deform the contour $\Sigma$ to contours where the oscillatory factors become exponentially small as $x, t \to \infty$, and hence the original RH problem reduces to a collection of local RH problems associated with the relevant points of stationary phase. Although the method has (in the matrix case) noncommutative and nonlinear elements, the final result of the analysis is as efficient as the asymptotic evaluation of the oscillatory integrals.

Dressing Method

The RH method allows describing the solution of a differential system independently of the theory of differential equations. The solution might be explicit, that is, given in terms of elementary or elliptic or abelian functions and contour integrals of such functions. In the general (transcendental) case, the solution can be represented in terms of the solution of certain linear singular integral equations.

In the modern theory of integrable systems, a system of nonlinear differential equations is often called integrable if it can be represented as a compatibility condition of an auxiliary overdetermined linear system of differential equations called a Lax pair of the given nonlinear system (actually it might involve more than two linear equations). In order that the compatibility condition represents a nontrivial nonlinear system of equations, the Lax pair is required to depend rationally on an auxiliary parameter (called a spectral parameter). The RH problem formulated in the complex plane of the spectral parameter allows one, given a particular solution of the compatibility equations, to construct directly new solutions of the compatibility system by "dressing" the initial one.

For example, let $D(x, \lambda)$, $x \in \mathbb{R}^n$, $\lambda \in \mathbb{C}$ be an $N \times N$ diagonal function, polynomial in $\lambda$ with smooth coefficients, such that $a_j := \partial D/\partial x_j$ are polynomials in $\lambda$ of degree $d_j$. Then $\Psi_0 := \exp D(x, \lambda)$ solves the system of linear equations $\partial \Psi_0/\partial x_j = a_j \Psi_0$, whose compatibility conditions $\partial^2 \Psi_0/\partial x_j \partial x_k = \partial^2 \Psi_0/\partial x_k \partial x_j$ are trivially satisfied. Given a contour $\Sigma$ and a smooth function $v$, consider the matrix RH problem [1]

with the jump matrix $\tilde{v}(\lambda; x) := \exp D(x, \lambda)\, v(\lambda) \exp(-D(x, \lambda))$. Let $m(\lambda; x)$ be the solution of this RH problem. Then $(D_j m)_+ = (D_j m)_- \tilde{v}$, where $D_j f := \partial f/\partial x_j - [a_j, f]$ with $[a, b] := ab - ba$. The Liouville theorem implies that $(D_j m) m^{-1}$ is an entire function which is $o(\lambda^{d_j})$ as $\lambda \to \infty$. Setting $\Psi(x, \lambda) := m(\lambda; x) \exp D(x, \lambda)$ gives the system of linear equations

$$\frac{\partial \Psi}{\partial x_j} = \Big( a_j + \sum_{k < d_j} \lambda^k q_{jk}(x) \Big) \Psi \equiv R_j(x, \lambda) \Psi \qquad [3]$$

the compatibility conditions for which are

$$\frac{\partial R_k}{\partial x_j} - \frac{\partial R_j}{\partial x_k} = [R_j, R_k] \qquad [4]$$

Equating coefficients of various powers of $\lambda$ in [4] gives a (generally) nonlinear system of partial differential equations for the coefficient matrices $q_{jk}$. Thus, given $D(x, \lambda)$, the RH problem, if it is solvable, maps the pair $(\Sigma, v)$ to solutions of [4].

Specializing to $n = 2$ with variables $(x, t) \in \mathbb{R}^2$, the overdetermined system of linear equations and the corresponding compatibility conditions are

$$\Psi_x = U \Psi, \qquad \Psi_t = V \Psi \qquad [5]$$

and

$$U_t - V_x + [U, V] = 0 \qquad [6]$$

respectively. Conditions [6] are sometimes called the zero-curvature conditions.

Equations [5] and [6] with $U$ and $V$ depending rationally on the spectral parameter $\lambda$ represent the integrable nonlinear systems in $1 + 1$ dimension. A typical example of such a system is the (defocusing) nonlinear Schrödinger (NLS) equation

$$i q_t + q_{xx} - 2|q|^2 q = 0 \qquad [7]$$

Starting from the RH problem with the $2 \times 2$ jump matrix

$$v(\lambda; x, t) = e^{-i\theta\sigma_3/2}\, v(\lambda)\, e^{i\theta\sigma_3/2} \qquad [8]$$

where $\theta(\lambda; x, t) = t\lambda^2 + x\lambda$, $\sigma_3 = \mathrm{diag}\{1, -1\}$, and $v(\lambda)$ satisfies the involution $\sigma_3 v^*(\lambda) \sigma_3 = v(\lambda)$, expanding out the limit of the solution of the RH problem as $\lambda \to \infty$

$$m(\lambda; x, t) = I + \frac{m_1(x, t)}{\lambda} + o\!\left( \frac{1}{\lambda} \right) \qquad [9]$$

and arguing as above gives [5], with

$$U = -\frac{i\lambda}{2} \sigma_3 + \begin{pmatrix} 0 & q \\ \bar{q} & 0 \end{pmatrix} \qquad [10]$$

and $q = i(m_1)_{12}$, whereas the compatibility condition [6] reduces to [7].

The relation between the RH problem and the differential equations [5] is local in $x$ and $t$; it is based only on the unique solvability of the RH problem, the Liouville theorem, and the explicit dependence of the jump matrix on $x$ and $t$. The uniqueness of the solution of an RH problem is basically provided by the Liouville theorem: the ratio $m^{(1)} (m^{(2)})^{-1}$ of any two solutions is analytic in $\mathbb{C} \setminus \Sigma$ and continuous across $\Sigma$ and is therefore identically equal to $I$ by the normalization condition [1c].

On the other hand, there are no completely general effective criteria for the solvability. Nevertheless, many RH problems seen in applications to integrable systems satisfy the following sufficient condition: if $\Sigma$ is symmetric with respect to $\mathbb{R}$ and contains $\mathbb{R}$, and if, in addition, $v^*(\bar{\lambda}) = v(\lambda)$ for $\lambda \in \Sigma \setminus \mathbb{R}$ and $\mathrm{Re}\, v(\lambda) > 0$ for $\lambda \in \mathbb{R}$, then the RH problem is solvable.

For nonlinear equations supporting solitons, the RH problem appears naturally in a more general setting, as a meromorphic factorization problem, where $m$ in [1] is sought to be a (piecewise) meromorphic function, with additionally prescribed poles and respective residue conditions. Alternatively, in the Riemann factorization problem $G_+ G_- = v$, one assumes that $G_\pm$ degenerates at some given points $\lambda_1, \ldots, \lambda_n \in \Omega_+$ and $\mu_1, \ldots, \mu_n \in \Omega_-$, where $\mathbb{C} = \Omega_+ \cup \Omega_- \cup \Sigma$, and prescribes two sets of subspaces, the images $\mathrm{Im}\, G_+|_{\lambda_j}$ and the kernels $\mathrm{Ker}\, G_-|_{\mu_j}$. In the case $v \equiv I$, the solution of the factorization problem with zeros (meromorphic RH problem) is purely algebraic, and gives formulas describing multisoliton solutions. In the general case, $v \not\equiv I$, the meromorphic RH problem can be algebraically converted to a holomorphic RH problem, by subsequently removing the poles with the help of the Blaschke–Potapov factors.

Alternatively, a meromorphic RH problem can be converted to a holomorphic one by adding to $\Sigma$ an additional contour $\Sigma_{\mathrm{aux}}$ enclosing all the poles, interpolating the constants involved in the residue conditions inside the region surrounded by $\Sigma_{\mathrm{aux}}$, and defining a new jump matrix on $\Sigma_{\mathrm{aux}}$ using the interpolant and the Blaschke–Potapov factors.

RH problems formulated on the complex plane $\mathbb{C}$ correspond typically to solutions of relevant nonlinear problems decaying at infinity. For other types of boundary conditions (e.g., nonzero constants or periodic or quasiperiodic boundary conditions), the corresponding RH problem is naturally formulated on a Riemann surface. For example, the RH problem associated with finite density conditions $q(x, t) \to \rho e^{i\varphi_\pm}$ as $x \to \pm\infty$ for the NLS equation [7] is naturally formulated on the two-sheet Riemann surface of the function $k(\lambda) = \sqrt{\lambda^2 - 4\rho^2}$ with
the contour $\Sigma$ consisting of the points $(\lambda, \varepsilon)$, where $|\lambda| \ge 2\rho$ and $\varepsilon = \pm 1$ marks the surface sheet.

Inverse-Scattering Transform

The inverse-scattering transform method for solving initial-value problems for integrable nonlinear equations written as the compatibility conditions [6] for linear equations [5] consists in the following: starting from the given initial data, solve the direct problem, that is, determine appropriate eigenfunctions (solutions of the differential x-equation in the Lax pair [5]) having well-controlled analytic properties as functions of the auxiliary (spectral) parameter $\lambda$ and the associated spectral functions of $\lambda$; then, by virtue of the t-equations in the Lax pair [5], the associated functions evolve in a simple, explicit way. Finally, using the explicit evolution of the spectral functions, solve the inverse problem of finding the associated coefficients in the x-equation, which, by [5], evolve according to the given nonlinear equation and thus solve the Cauchy problem for this equation. The last step in this procedure, the inverse-scattering problem, can be effectively solved by reformulating it as an RH problem, which in turn can be related to a system of singular integral equations. The classical Gelfand–Levitan–Marchenko integral equation of the inverse-scattering problem is the Fourier transform of some special cases of these singular integral equations.

To fix ideas, consider the initial-value problem for the NLS equation [7], where the data $q(x, t = 0) = q_0(x)$ are sufficiently smooth and decay as $|x| \to \infty$. For each $\lambda \in \mathbb{C} \setminus \mathbb{R}$, one constructs solutions $\Psi(x, \lambda)$ of $\Psi_x = U\Psi$ with $U$ given by [10], having the properties

$$m(x, \lambda) := \Psi(x, \lambda) \exp\!\left(\frac{i\lambda x \sigma_3}{2}\right) \to I \quad \text{as } x \to -\infty$$

and $m(x, \lambda)$ is bounded as $x \to +\infty$. For each fixed $x$, the $2 \times 2$ matrix function $m(x, \lambda)$ solves the RH problem in $\lambda$, where $\Sigma = \mathbb{R}$ and the jump matrix is

$$v = v(\lambda, x) = \begin{pmatrix} 1 - |r(\lambda)|^2 & -\bar r(\lambda)\, e^{-i\lambda x} \\ r(\lambda)\, e^{i\lambda x} & 1 \end{pmatrix} \qquad [11]$$

Here $r(\lambda)$ is the reflection coefficient of $q_0(x)$.

The direct scattering map $\mathcal{R}$ is described by mapping $q \mapsto r$,

$$q \mapsto m(x, \lambda) = m(x, \lambda; q) \mapsto v(\lambda, x) \mapsto r = \mathcal{R}(q)$$

By virtue of the t-equations in [5], if $q(t) = q(x, t)$ solves the NLS equation, then $r(t) = \mathcal{R}(q(\cdot, t))$ evolves as $r(t) = r(t, \lambda) = e^{it\lambda^2} r_0(\lambda)$, where $r_0 = \mathcal{R}(q_0)$. Given $r$, the inverse-scattering map $\mathcal{R}^{-1}$ is obtained by solving the normalized RH problem (RHP) with the jump matrix [11] and evaluating its solution $m(x, \lambda)$ as $\lambda \to \infty$ [9]:

$$r \mapsto v \mapsto \mathrm{RHP} \mapsto m(x, \lambda) = m(x, \lambda; r) \mapsto m_1(x) \mapsto q(x) = i(m_1(x))_{12}$$

and thus

$$q(x, t) = \mathcal{R}^{-1}\!\left(e^{ix(\cdot) + it(\cdot)^2}\, r(\cdot)\right) \qquad [12]$$

The mathematical rigor to this scheme is provided by the general theory of analytic matrix factorization making use of the relation between the factorization problem and certain singular integral equations; this relation can be established with the help of the Cauchy operators

$$Ch(\lambda) = \int_\Sigma \frac{h(s)}{s - \lambda}\, \frac{ds}{2\pi i}, \qquad \lambda \in \mathbb{C} \setminus \Sigma$$

and

$$C_\pm h(\lambda) = \lim_{\substack{\lambda' \to \lambda \\ \lambda' \in (\pm)\text{ side of } \Sigma}} (Ch)(\lambda'), \qquad \lambda \in \Sigma$$

For a very general class of contours, the Cauchy operators $C_\pm : L^p \to L^p$, $1 < p < \infty$, are bounded, $C_+ - C_- = I$, and $C_+ + C_- = -H$, where

$$Hh(\lambda) := \lim_{\varepsilon \to 0} \int_{|s - \lambda| > \varepsilon} \frac{h(s)}{\lambda - s}\, \frac{ds}{\pi i}$$

is the Hilbert transform.

The map $\mathcal{R}$ is often considered as a nonlinear Fourier-type map; this point of view is supported by the fact that $\mathcal{R}$ is a bijection between the corresponding Schwartz spaces of functions. Making use of the $L^p$ or Hölder theory of the Cauchy operators and the related factorization problems, it is possible to analyze the action of $\mathcal{R}$ and $\mathcal{R}^{-1}$ in various functional spaces. This also requires making more precise the definition of the RH problem: for fixed $1 < p < \infty$, given $\Sigma$ and $v$ such that $v, v^{-1} \in L^\infty(\Sigma \to GL(N, \mathbb{C}))$, we say that $m_\pm$ solves an RH $L^p$-problem if $m_\pm \in I + \partial C(L^p)$ and $m_+(\lambda) = m_-(\lambda) v(\lambda)$ for $\lambda \in \Sigma$. Here a pair of $L^p(\Sigma)$-functions $f_\pm \in \partial C(L^p)$ if there exists a unique function $h \in L^p(\Sigma)$ such that $f_\pm(\lambda) = (C_\pm h)(\lambda)$. Then $f(\lambda) = Ch(\lambda)$, $\lambda \in \mathbb{C} \setminus \Sigma$, is called the extension of $f_\pm$ off $\Sigma$.

Given a factorization of $v = (v_-)^{-1} v_+ = (I - w_-)^{-1}(I + w_+)$ on $\Sigma$ with $v_\pm, (v_\pm)^{-1} \in L^p$, the basic associated singular integral operator is defined by

$$C_w h := C_+(h w_-) + C_-(h w_+)$$

If the operator $I - C_w$ is invertible on $L^p(\Sigma)$, with $\mu \in I + L^p(\Sigma)$, solving $(I - C_w)\mu = I$, then $m(\lambda) = I + (C(\mu(w_+ + w_-)))(\lambda)$ is the unique solution of the
RH problem $(\Sigma, v)$. Although the operator $C_w$ need not be compact, in many cases it is Fredholm with zero index. Then the existence of $(I - C_w)^{-1}$ is equivalent to the solvability of the RH problem $(\Sigma, v)$, and the normalized RH problem ($m \to I$ as $\lambda \to \infty$) has a unique solution if and only if the corresponding homogeneous RH problem (with $m \to 0$ as $\lambda \to \infty$) has only the trivial solution (vanishing lemma).

The most complete theory for RH problem relative to simple contours is the theory when $v$ is in an inverse-closed, decomposing Banach algebra $A$, that is, the algebra of continuous functions with the Hilbert transform bounded in it such that if $f \in A$, then $f^{-1} \in A$. For contours with self-intersections, the RH factorization theory is formulated in terms of a pair of decomposing algebras: choosing the orientation of the contour in such a way that it divides the $\lambda$-plane into two disjoint regions, $\Omega_+$ and $\Omega_-$, and each arc of $\Sigma$ forms part of the positively oriented boundary of $\Omega_+$, the functions in the $+$ ($-$) algebra are continuous up to the boundary in each connected component of $\Omega_+$ ($\Omega_-$).

The choice of functional spaces in the RH problem should be based on the integrable system at hand. For example, an integrable flow connected to the scattering problem for $\Psi_x = U\Psi$, with $U$ defined by [10], has in general the form $e^{-it\lambda^p\sigma_3}\, v(\lambda)\, e^{it\lambda^p\sigma_3}$ (Ablowitz–Kaup–Newell–Segur (AKNS) hierarchy) in the scattering space (for the NLS equation, $p = 2$), so that appropriate spaces are $L^2((1 + x^2)\,dx) \cap H^{p-1}$ for $q(\cdot, t)$ and $L^2((1 + |\lambda|^{2p-2})\,|d\lambda|) \cap H^1$ as the scattering space. Deift and Zhou showed that in this case the scattering map $\mathcal{R}$ and the inverse-scattering map $\mathcal{R}^{-1}$ indeed involve no "loss" of smoothness or decay.

A generalization of the inverse-scattering transform method to the initial boundary-value problems for integrable nonlinear equations (on the half-line or on a finite interval with respect to the space variable $x$) can also be developed on the basis of the RH problem formalism. In this case, the construction of the corresponding RH problem involves simultaneous spectral analysis of both linear equations in the Lax pair [5]. The boundary values generate an additional set of spectral functions, which generally makes the construction of the associated RH problem more complicated than in the case of the corresponding initial-value problem (particularly, the contour is to be enhanced by adding the part coming from the spectral analysis of the t-equation); however, this RH problem again depends explicitly on $x$ and $t$, which makes it possible to develop relevant techniques (such as the nonlinear steepest-descent method for the asymptotic analysis) in the same spirit as in the case of initial-value problems.

An RH problem may be viewed as a special case in a more general setting of problems of reconstructing an analytic function from the known structure of its singularities. The departure from analyticity of a function $m$ of the complex variable $\lambda$ can be described in terms of the "d-bar" derivative, $\partial m/\partial\bar\lambda$. If $\partial m/\partial\bar\lambda$ can be linearly related to $m$ itself, then the use of the extension of Cauchy's formula

$$m(\lambda) = \frac{1}{2\pi i} \iint_D \frac{1}{s - \lambda}\, \frac{\partial m}{\partial \bar s}\, ds \wedge d\bar s + \frac{1}{2\pi i} \int_{\partial D} \frac{m(s)}{s - \lambda}\, ds$$

leads to a linear integral equation for $m$. This is the case for some multidimensional (2 + 1) nonlinear integrable equations. For example, for the Kadomtsev–Petviashvili-I equation (the two-dimensional generalization of the Korteweg–de Vries equation) $(q_t + 6qq_x + q_{xxx})_x = 3q_{yy}$, the appropriate eigenfunctions are still sectionally meromorphic, but their jumps across a contour are connected nonlocally to $m$ on the contour, which leads to a nonlocal RH problem of the type

$$m_+(\lambda) = m_-(\lambda) + \int_\Sigma d\mu\, m_-(\mu) f(\mu, \lambda), \qquad \lambda \in \Sigma$$

with given $f(\mu, \lambda)$ (analogue of scattering data). Contrarily, the eigenfunctions for the Kadomtsev–Petviashvili-II equation $(q_t + 6qq_x + q_{xxx})_x = -3q_{yy}$ are nowhere analytic, with $\partial m/\partial\bar\lambda$ related to $m$ by

$$\frac{\partial m}{\partial \bar\lambda}(\lambda) = F(\mathrm{Re}\,\lambda, \mathrm{Im}\,\lambda)\, m(-\bar\lambda), \qquad \lambda \in \mathbb{C}$$

Nonlinear Steepest-Descent Method

The nonlinear steepest-descent method is based on a direct asymptotic analysis of the relevant RH problem; it is general and algorithmic in the sense that it does not require a priori information (ansatz) about the form of the solution of the asymptotic problem. However, the noncommutativity of the matrix setting requires developing rather sophisticated technical ideas, which, in particular, enable an explicit solution of the associated local RH problems.

To fix ideas, let us again consider the NLS equation. The dependence of the jump matrix $v(\lambda; x, t)$ on $x$ and $t$ is oscillatory; it is the same as in the integral

$$q(x, t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{i(x\xi - t\xi^2)}\, \hat q_0(\xi)\, d\xi \qquad [13]$$

which solves the initial-value problem for the linearized version of [7]:

$$i q_t + q_{xx} = 0, \qquad q(x, 0) = q_0(x) \qquad [14]$$
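The Fourier representation [13] of the linearized problem [14] can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes Gaussian initial data $q_0(x) = e^{-x^2/2}$ (a choice made here, not taken from the text), for which $\hat q_0(\xi) = e^{-\xi^2/2}$ and the solution of [14] is $q(x,t) = (1 + 2it)^{-1/2} \exp(-x^2/(2(1 + 2it)))$. The function names are hypothetical.

```python
import cmath
import math


def q_hat0(xi):
    """Fourier transform of the assumed initial data q0(x) = exp(-x^2 / 2)."""
    return math.exp(-xi * xi / 2.0)


def q_fourier(x, t, half_width=12.0, n=4000):
    """Approximate the integral [13] with the trapezoidal rule on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        xi = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0  # end points get half weight
        total += weight * cmath.exp(1j * (x * xi - t * xi * xi)) * q_hat0(xi)
    return total * h / math.sqrt(2.0 * math.pi)


def q_exact(x, t):
    """Closed-form solution of i q_t + q_xx = 0 with q(x, 0) = exp(-x^2 / 2)."""
    s = 1.0 + 2.0j * t
    return cmath.exp(-x * x / (2.0 * s)) / cmath.sqrt(s)


def stationary_point(x, t):
    """Zero of d/dxi (x*xi - t*xi^2), the stationary phase point of the integrand in [13]."""
    return x / (2.0 * t)


if __name__ == "__main__":
    x, t = 1.5, 0.7
    print("quadrature vs closed form:", abs(q_fourier(x, t) - q_exact(x, t)))
    print("stationary phase point:", stationary_point(x, t))
```

For large $t$ the integrand oscillates rapidly away from the stationary point, which is why the quadrature grid has to be refined as $t$ grows; the steepest-descent argument avoids this by deforming the contour instead.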
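(here $\hat q_0(\xi)$ is the Fourier transform of the initial data $q_0$). The main contribution to [13] as $|x|$ and $t$ tend to $\infty$ comes from the point of stationary phase of $e^{i(x\xi - t\xi^2)}$, that is, the point $\xi = \xi_0 = x/2t$, for which

$$\frac{d}{d\xi}(x\xi - t\xi^2) = 0$$

If $\hat q_0(\xi)$ is analytic in a strip $|\mathrm{Im}\,\xi| < \varepsilon$, then one can use Cauchy's theorem to deform [13] to an integral on a contour $\Gamma_\varepsilon$ such that $|e^{i(x\xi - t\xi^2)}|$ decreases rapidly on $\Gamma_\varepsilon$ away from $\xi = \xi_0$. Hence, as $t \to \infty$, the problem localizes to a neighborhood of $\xi = \xi_0$; this constitutes the standard method of steepest descent.

In the spirit of the oscillatory contour integral case, the nonlinear steepest-descent method for an oscillatory RH problem introduced by Deift and Zhou consists in the following: deform the contour and (rationally) approximate the jump matrix in order to obtain an RH problem with a jump matrix that decays to the identity away from stationary phase points; then, rescaling the problem near the stationary phase points, obtain a (local) RH problem with a piecewise constant jump matrix, which can be solved in closed form, usually in terms of certain special functions.

The contour deformation means the following. Suppose that the jump matrix of an RH problem $(\Sigma, v)$ has a factorization $v = b_-^{-1} v_1 b_+$ between two points on $\Sigma$, where $b_+$ ($b_-$) has holomorphic and nondegenerating continuation to the part $\Omega_+$ ($\Omega_-$) of a disk $\Omega$ supported by these points, see Figure 2a. Then the contour $\Sigma$ may be deformed to the contour $\Sigma' = \Sigma \cup \partial\Omega$, and the jump matrices across $\Sigma'$ may be defined as indicated in Figure 2b. If $m$ solves the RH problem $(\Sigma, v)$, then $m'$ defined by $m' = m b_\pm^{-1}$ in $\Omega_\pm$ and $m' = m$ outside $\Omega$ solves the deformed RH problem associated with $\Sigma'$.

The appropriate factorization of $v$ given by [8] and the contour deformation are to be chosen in accordance with signature table; for the NLS equation, it is given in Figure 3. The key step is to move algebraically the factors $e^{\pm i\theta}$ in $v(\lambda; x, t)$ into regions of the complex plane, where they are exponentially decreasing as $t \to \infty$. The jump matrix admits two algebraic factorizations:

Figure 2 Deformation of an RH problem. (a) The contour $\Sigma$ with jump $v$ and the regions $\Omega_+$, $\Omega_-$; (b) the deformed contour with jumps $b_+$, $v_1$, $b_-^{-1}$, and $v$.

Figure 3 Signature table (signs of $\mathrm{Re}\, i\theta$ in the four sectors around $\lambda_0$).

$$v = \begin{pmatrix} 1 - |r|^2 & -\bar r e^{-i\theta} \\ r e^{i\theta} & 1 \end{pmatrix} = \begin{pmatrix} 1 & -\bar r e^{-i\theta} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ r e^{i\theta} & 1 \end{pmatrix} \qquad (\lambda > \lambda_0)$$

$$v = \begin{pmatrix} 1 & 0 \\ \dfrac{r e^{i\theta}}{1 - |r|^2} & 1 \end{pmatrix} \begin{pmatrix} 1 - |r|^2 & 0 \\ 0 & \dfrac{1}{1 - |r|^2} \end{pmatrix} \begin{pmatrix} 1 & -\dfrac{\bar r e^{-i\theta}}{1 - |r|^2} \\ 0 & 1 \end{pmatrix} \qquad (\lambda < \lambda_0)$$

The diagonal factors $(1 - |r|^2)^{\pm 1}$ can be removed by conjugating $v$ by $\delta^{\sigma_3}$, where $\delta(\lambda)$ solves the scalar, normalized RH problem on $\mathbb{R}$: $\delta_+ = \delta_-(1 - |r|^2)$ for $\lambda < \lambda_0$ and $\delta_+ = \delta_-$ for $\lambda > \lambda_0$; the solution of the latter can be written in a closed form:

$$\delta(\lambda) = \exp\left\{ \frac{1}{2\pi i} \int_{-\infty}^{\lambda_0} \frac{\log(1 - |r(s)|^2)}{s - \lambda}\, ds \right\}$$

Then $\tilde m := m\delta^{-\sigma_3}$ solves the RH problem across $\Sigma = \mathbb{R}$, with the jump matrix

$$\tilde v = \begin{pmatrix} 1 & -\bar r \delta^{2} e^{-i\theta} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ r \delta^{-2} e^{i\theta} & 1 \end{pmatrix} \qquad (\lambda > \lambda_0)$$

$$\tilde v = \begin{pmatrix} 1 & 0 \\ \dfrac{r \delta_-^{-2} e^{i\theta}}{1 - |r|^2} & 1 \end{pmatrix} \begin{pmatrix} 1 & -\dfrac{\bar r \delta_+^{2} e^{-i\theta}}{1 - |r|^2} \\ 0 & 1 \end{pmatrix} \qquad (\lambda < \lambda_0)$$

Replacing $r$, $\bar r$, etc., by appropriate rational approximations $[r]$, $[\bar r]$, matching at $\lambda = \lambda_0$,

$$\tilde m_+ \begin{pmatrix} 1 & 0 \\ [r]\delta^{-2} e^{i\theta} & 1 \end{pmatrix}^{-1}$$

can be continued to the sector above $\mathbb{R}_+ + \lambda_0$ and

$$\tilde m_- \begin{pmatrix} 1 & -[\bar r]\delta^{2} e^{-i\theta} \\ 0 & 1 \end{pmatrix}$$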
can be continued to the sector below $\mathbb{R}_+ + \lambda_0$, where the factors $e^{\pm i\theta}$ are exponentially decreasing. Doing the same for the appropriate factors on $\mathbb{R}_- + \lambda_0$, we obtain an RH problem on a cross, say, $(\lambda_0 + e^{i\pi/4}\mathbb{R}) \cup (\lambda_0 + e^{-i\pi/4}\mathbb{R})$. As $t \to \infty$, the RH problem then localizes at $\lambda_0$.

Performing an appropriate scaling, a straightforward computation shows that, as $t \to \infty$, the problem reduces to an RH problem with the jump matrix that does not depend on $\lambda$ (it is determined by $r(\lambda_0)$), which makes it possible to solve this problem explicitly (in terms of the parabolic cylinder functions, in the case of the NLS equation). Using explicit asymptotics for these functions and controlling the error terms, it is possible to obtain the uniform (for all $x \in \mathbb{R}$) asymptotics for the solution of the initial-value problem for the NLS equation with $q_0 \in L^2((1 + x^2)\,dx) \cap H^1$ of the form

$$q(x, t) = t^{-1/2}\, \alpha(\lambda_0) \exp\!\left(\frac{ix^2}{4t} - i\nu(\lambda_0) \log 2t\right) + O\!\left(t^{-(1/2 + \kappa)}\right)$$

for any fixed $0 < \kappa < 1/4$, where $\alpha$ and $\nu$ are given in terms of $r = \mathcal{R}(q_0)$:

$$\nu(\lambda) = -\frac{1}{2\pi} \log(1 - |r(\lambda)|^2), \qquad |\alpha(\lambda)|^2 = \frac{\nu(\lambda)}{2}$$

and

$$\arg \alpha(\lambda) = \frac{1}{\pi} \int_{-\infty}^{\lambda} \log(\lambda - s)\, d\big(\log(1 - |r(s)|^2)\big) + \frac{\pi}{4} + \arg \Gamma(i\nu(\lambda)) + \arg r(\lambda)$$

The method can be used to obtain asymptotic expansions to all orders. Also, for nonlinear equations supporting solitons, the soliton part of the asymptotics can be incorporated via the dressing method.

Further applications include long-time asymptotics for near-integrable systems, such as the perturbed NLS equation $i q_t + q_{xx} - 2|q|^2 q - \varepsilon |q|^l q = 0$ for $l > 2$ and $\varepsilon > 0$, and the small-dispersion limits of integrable equations (e.g., for the Korteweg–de Vries equation $q_t - 6qq_x + \varepsilon^2 q_{xxx} = 0$ with small dispersion $\varepsilon \searrow 0$).

The RH formalism makes possible a comprehensive global asymptotic analysis of the Painlevé transcendents (which, due to their increasing role in modern mathematical physics, should be considered as new nonlinear special functions), including explicit connection formulas, as $x$ approaches relevant critical points along different directions in the complex plane.

The development of the RH method in the theory of integrable systems has given rise to new analytic and algebraic ideas in other branches of mathematics and theoretical physics. Recent examples are the study of the asymptotics in the theory of orthogonal polynomials and random matrices and in combinatorics (random permutations).

See also: Boundary-Value Problems for Integrable Equations; $\bar\partial$ Approach to Integrable Systems; Integrable Systems and Algebraic Geometry; Integrable Systems and the Inverse Scattering Method; Integrable Systems: Overview; Nonlinear Schrödinger Equations; Painlevé Equations; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory]; Riemann–Hilbert Problem.

Further Reading

Ablowitz MJ and Clarkson PA (1991) Solitons, Nonlinear Evolution Equations and Inverse Scattering, London Math. Soc., Lecture Notes Series, vol. 149. Cambridge: Cambridge University Press.
Beals R, Deift PA, and Tomei C (1988) Direct and Inverse Scattering on the Line. Mathematical Surveys and Monographs 28. Providence, RI: American Mathematical Society.
Belokolos ED, Bobenko AI, Enol'skii VZ, and Its AR (1994) Algebro-Geometric Approach to Nonlinear Integrable Equations. Springer Series in Nonlinear Dynamics. Berlin: Springer.
Deift PA (1999) Orthogonal Polynomials and Random Matrices: A Riemann–Hilbert Approach. Courant Lecture Notes in Mathematics, vol. 3. New York: CIMS.
Deift PA and Zhou X (2003) Long-time asymptotics for solutions of the NLS equation with initial data in a weighted Sobolev space. Communications on Pure and Applied Mathematics 56(8): 1029–1077.
Deift PA, Its AR, and Zhou X (1993) Long-time asymptotics for integrable nonlinear wave equations. In: Fokas AS and Zakharov VE (eds.) Important Developments in Soliton Theory, pp. 181–204. Berlin: Springer.
Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in the Theory of Solitons. Berlin: Springer.
Fokas AS (2000) On the integrability of linear and nonlinear partial differential equations. Journal of Mathematical Physics 41: 4188–4237.
Its AR (2003) The Riemann–Hilbert problem and integrable systems. Notices of the AMS 50(11): 1389–1400.
Novikov SP, Manakov SV, Pitaevskii LP, and Zakharov VE (1984) Theory of Solitons. The Inverse Scattering Method. New York: Consultants Bureau.
Zhou X (1989) The Riemann–Hilbert problem and inverse scattering. SIAM Journal on Mathematical Analysis 20: 966–986.
Riemann–Hilbert Problem
V P Kostov, Université de Nice Sophia Antipolis, Nice, France
© 2006 Elsevier Ltd. All rights reserved.

Regular and Fuchsian Linear Systems on the Riemann Sphere

Consider a system of ordinary linear differential equations with time belonging to the Riemann sphere $\mathbb{CP}^1 = \mathbb{C} \cup \{\infty\}$:

$$dX/dt = A(t)X \qquad [1]$$

The $n \times n$ matrix $A$ is meromorphic on $\mathbb{CP}^1$, with poles at $a_1, \dots, a_{p+1}$; the dependent variables $X$ form an $n \times n$ matrix. One can assume that $\infty$ is not among the poles $a_j$ and it is not a pole of the 1-form $A(t)\,dt$ (this can be achieved by a fractionally-linear transformation of $t$).

P Deligne has introduced a terminology of meromorphic connections and sections which is often preferred in modern literature to the one of meromorphic linear systems and their solutions, and there is a one-to-one correspondence between the two languages.

Definition 1 System [1] is regular at the pole $a_j$ if its solutions have a moderate (or polynomial) growth rate there, that is, for every sector $S$ centered at $a_j$ and not containing other poles of the system and for every solution $X$ restricted to $S$ there exists $N_j \in \mathbb{R}$ such that $\|X(t - a_j)\| = O(|t - a_j|^{N_j})$ for all $t \in S$. System [1] is regular if it is regular at all poles $a_j$. System [1] is Fuchsian if its poles are logarithmic (i.e., of first order). Every Fuchsian system is regular.

Remark 2 The opening of the sector $S$ might be $> 2\pi$. Restricting to a sector is necessary because the solutions are, in general, ramified at the poles $a_j$ and by turning around the poles much faster than approaching them one can obtain any growth rate.

A Fuchsian system can be presented in the form

$$dX/dt = \left( \sum_{j=1}^{p+1} \frac{A_j}{t - a_j} \right) X, \qquad A_j \in gl(n, \mathbb{C}) \qquad [2]$$

The sum of its matrices-residua $A_j$ is 0, that is,

$$A_1 + \cdots + A_{p+1} = 0 \qquad [3]$$

(recall that $\infty$ is not a pole of the system).

Remark 3 The linear equation (with meromorphic coefficients) $\sum_{j=0}^{n} a_j(t) x^{(j)} = 0$ is Fuchsian if $a_j$ has poles of order only $\le n - j$. A linear equation is Fuchsian if and only if it is regular. The best-studied Fuchsian equations are the hypergeometric one and its generalizations and the Jordan–Pochhammer equation.

The linear change of the dependent variables

$$X \mapsto W(t)X \qquad [4]$$

(where $W$ is meromorphic on $\mathbb{CP}^1$) makes system [2] undergo the gauge transformation

$$A \to -W^{-1}(dW/dt) + W^{-1}AW \qquad [5]$$

(Most often one requires $W$ to be holomorphic and holomorphically invertible for $t \ne a_j$, $j = 1, \dots, p + 1$, so that no new singular points appear in the system.) This transformation preserves regularity but not necessarily being Fuchsian. The only invariant under the group of linear transformations [4] is the monodromy group of the system.

Definition 4 Set $\Sigma = \mathbb{CP}^1 \setminus \{a_1, \dots, a_{p+1}\}$. Fix a base point $a_0 \in \Sigma$ and a matrix $B \in GL(n, \mathbb{C})$. Consider a closed contour $\gamma$ with base point $a_0$ and bypassing the poles of the system. The monodromy operator of system [1] defined by this contour is the linear operator $M$ acting on the solution space of the system which maps the solution $X$ with $X|_{t = a_0} = B$ into the value of its analytic continuation along $\gamma$. Notation: $X \overset{\gamma}{\mapsto} XM$.

The monodromy operator depends only on the class of homotopy equivalence of $\gamma$.

The monodromy group is the subgroup of $GL(n, \mathbb{C})$ generated by all monodromy operators. It is defined only up to conjugacy due to the freedom to choose $a_0$ and $B$.

Definition 5 Define the product (concatenation) $\gamma_1\gamma_2$ of two paths $\gamma_1$, $\gamma_2$ in $\Sigma$ (where the end of $\gamma_1$ coincides with the beginning of $\gamma_2$) as the path obtained by running $\gamma_1$ first and $\gamma_2$ next.

Remark 6 The monodromy group is an antirepresentation of the fundamental group $\pi_1(\Sigma)$ into $GL(n, \mathbb{C})$ because one has

$$X \overset{\gamma_1}{\mapsto} XM_1 \overset{\gamma_2}{\mapsto} XM_2M_1 \qquad [6]$$

that is, the concatenation $\gamma_1\gamma_2$ of the two contours defines the monodromy operator $M_2M_1$. In the text, the monodromy group is referred to as to a representation, not an antirepresentation.

One usually chooses a standard set of generators of $\pi_1(\Sigma)$ (see Figure 1) defined by contours $\gamma_j$, $j = 1, \dots, p + 1$, where $\gamma_j$ consists of a segment
Figure 1 The standard set of generators (contours based at $a_0$ around the poles $a_1, a_2, \dots, a_{p+1}$).

$[a_0, a_j']$ ($a_j'$ being a point close to $a_j$), of a small circumference run counterclockwise (centered at $a_j$, passing through $a_j'$ and containing inside no pole of the system other than $a_j$), and of the segment $[a_j', a_0]$. Thus, $\gamma_j$ is freely homotopic to a small loop circumventing counterclockwise $a_j$ (and no other pole $a_i$). The indices of the poles are chosen such that the indices of the contours increase from 1 to $p + 1$ when one turns around $a_0$ clockwise.

For the standard choice of the contours the generators $M_j$ satisfy the relation

$$M_1 \cdots M_{p+1} = I \qquad [7]$$

Indeed, the concatenation of contours $\gamma_{p+1} \cdots \gamma_1$ is homotopy equivalent to 0 and equality [7] results from Remark 6.

Remarks 7

(i) If the matrix-residuum $A_j$ of a Fuchsian system has no eigenvalues differing by a nonzero integer, then the monodromy operator $M_j$ defined as above is conjugate to $\exp(2\pi i A_j)$. It is always true that the eigenvalues $\sigma_{k,j}$ of $M_j$ equal $\exp(2\pi i \lambda_{k,j})$, where $\lambda_{k,j}$ are the eigenvalues of $A_j$.
(ii) If the generators $M_j$ of the monodromy group are defined after a standard set of contours $\gamma_j$, then they are conjugate to the corresponding operators $L_j$ of local monodromy, that is, when the poles $a_j$ are circumvented counterclockwise along small loops. The operators $L_j$ of a regular system can be computed (up to conjugacy) algorithmically – one first makes the system Fuchsian at $a_j$ by means of a change [4] and then carries out the computation. Thus, $M_j = Q_j^{-1} L_j Q_j$ for some $Q_j \in GL(n, \mathbb{C})$ and the difficulty when computing the monodromy group of system [1] consists in computing the matrices $Q_j$ which is a transcendental problem.
(iii) As will be noted in Theorem 9, every component of every solution to a regular linear system is a function of the class of Nilsson, that is, representable as a convergent (on sectors) series $\sum_{k \in \mathbb{N},\, 1 \le i \le n,\, 0 \le \nu \le n-1} a_{i,k,\nu}\, t^{\alpha_i + k} \ln^{\nu} t$, $\alpha_i \in \mathbb{C}$, $a_{i,k,\nu} \in \mathbb{C}$.

Example 8 The Fuchsian system $dX/dt = (A/t)X$, $A \in gl(n, \mathbb{C})$, has two poles – at 0 and at $\infty$, with matrices-residua $A$ and $-A$. Any solution is of the form $X = \exp(A \ln t)G$, $G \in GL(n, \mathbb{C})$. To compute the local monodromy around 0, increase the argument of $t$ by $2\pi$. This results in $\ln t \mapsto \ln t + 2\pi i$ and $X \mapsto XG^{-1}\exp(2\pi i A)G$, that is the monodromy operator at 0 equals $G^{-1}\exp(2\pi i A)G$ (and in the same way the one at $\infty$ equals $G^{-1}\exp(-2\pi i A)G$).

Formulation and History of the Problem

The Riemann–Hilbert problem (or Hilbert's twenty-first problem) is formulated as follows:

Prove that for any set of points $a_1, \dots, a_{p+1} \in \mathbb{CP}^1$ and for any set of matrices $M_1, \dots, M_p \in GL(n, \mathbb{C})$ there exists a Fuchsian linear system with poles at and only at $a_1, \dots, a_{p+1}$ for which the corresponding monodromy operators are $M_1, \dots, M_p$, $M_{p+1} = (M_1 \cdots M_p)^{-1}$.

Historically, the Riemann–Hilbert problem was first stated for Fuchsian equations, not for systems – Riemann mentions in a note at the end of the 1850s the problem how to reconstruct a Fuchsian equation from its monodromy representation and Hilbert includes it in 1900 as the twenty-first problem on his list in a formulation mentioning equations and not systems. However, the number of parameters necessary to parametrize a Fuchsian equation is, in general, smaller than the one necessary to parametrize a monodromy group generated by $p$ matrices. Therefore, one has to allow the presence of additional apparent singularities in the equation, that is, singularities the monodromy around which is trivial.

It had been believed for a long time that the Riemann–Hilbert problem has a positive solution for any $n \in \mathbb{N}$, after J. Plemelj in 1908 gave a proof with a gap. In his proof, Plemelj tries to reduce the Riemann–Hilbert problem to the so-called homogeneous Hilbert boundary-value problem of the theory of singular integral equations. It follows from the correct part of the proof that if one of
the monodromy operators of system [1] is diagonalizable, then system [1] is equivalent to a Fuchsian one; this is due to Yu S Il'yashenko. (In particular, if one allows just one additional apparent singularity, then the Riemann–Hilbert problem is positively solvable. The author has shown that the result still holds if one of the monodromy operators has one Jordan block of size 2 and $n - 2$ Jordan blocks of size 1. The result is sharp – it would be false if one allows one Jordan block of size $\ge 3$ or two blocks of size 2.) It also follows that any finitely generated subgroup of $GL(n, \mathbb{C})$ is the monodromy group of a regular system with prescribed poles which is Fuchsian at all the poles with the possible exception of one (where the system is regular) which can be chosen among them at random.

After the publication of Plemelj's result, the interest shifted basically towards the question how to construct a Fuchsian system given the monodromy operators $M_j$. At the end of the 1920s IA Lappo-Danilevskii expressed the solutions to a Fuchsian system as series of the monodromy operators. These series are convergent for monodromy operators close to the identity matrix and for such operators one can express the residua $A_j$ of the Fuchsian system as convergent series of the monodromy operators.

In 1956 BL Krylov proved that the Riemann–Hilbert problem is solvable for $n = p = 2$ by constructing a Fuchsian system after its monodromy group. In 1983 NP Erugin did the same in the case $n = 2$, $p = 3$, and established a connection between the Riemann–Hilbert problem and Painlevé's equations.

In 1957 H Röhrl reformulated the problem in terms of fibre bundles. His approach is more geometric; however, it does not require the system realizing a given monodromy group to be Fuchsian, but only regular.

In 1978 W Dekkers considered the particular case $n = 2$ of the Riemann–Hilbert problem, and gave a positive answer to it. The gap in Plemelj's proof was detected in the 1980s by AT Kohn and YuS Il'yashenko.

It was proved by AA Bolibrukh in 1989 that, for $n \ge 3$, the problem has a negative answer. For $n = 3$, the answer is negative precisely for those couples (monodromy group, set of poles) for which each monodromy operator $M_1, \dots, M_{p+1}$ is conjugate to a Jordan block of size 3, the monodromy group is reducible, with an invariant subspace or factor-space of dimension 2, the monodromy sub- or factor-representation corresponding to it is irreducible and cannot be realized by a Fuchsian system having all its matrices-residua conjugate to Jordan blocks of size 2. In Bolibrukh's work, the last condition is formulated in a different (but equivalent) way using the notion of Fuchsian weight.

The New Setting of the Problem

After the negative answer to the Riemann–Hilbert problem for $n \ge 3$, it is reasonable to reformulate it as follows:

Find necessary and/or sufficient conditions for the choice of the monodromy operators $M_1, \dots, M_p$ and the points $a_1, \dots, a_{p+1}$ so that there should exist a Fuchsian system with poles at and only at the given points and whose monodromy operators $M_j$ should be the given ones.

In the new setting of the Riemann–Hilbert problem, the answer is positive if the monodromy group is irreducible (for any positions of the poles $a_j$). This has been first proved by Bolibrukh for $n = 3$ and then independently by the author and by him for any $n$.

Bolibrukh found many examples of couples (reducible monodromy group, poles) for which the answer to the Riemann–Hilbert problem is negative. For $n = 3$, the negative answer is due to possible "bad position" of the poles and a small shift from this position while keeping the same monodromy group leads to a couple for which the answer is positive. For $n \ge 4$, there are couples where the negative answer is due to arithmetic properties of the eigenvalues of the matrices-residua and the corresponding monodromy groups are not realizable by Fuchsian systems for any position of the poles. During the last years of his life, Bolibrukh studied upper-triangular monodromy representations and found other examples with negative answer to the Riemann–Hilbert problem.

Bolibrukh also found some sufficient conditions for the positive resolvability of the Riemann–Hilbert problem in the case of a reducible monodromy group. For example, suppose that the monodromy group is a semidirect sum:

$$M_j = \begin{pmatrix} M_j^1 & * \\ 0 & M_j^2 \end{pmatrix}$$

where the matrices $M_j^i$ (of size $l_i \times l_i$, $i = 1, 2$) define the representations $\chi_i$. Suppose that the representation $\chi_2$ is realizable by a Fuchsian system, that the representation $\chi_1$ is irreducible, and that one of the matrices $M_j$ is block-diagonal, with left upper block of size $s \times s$, where $s \le l_1$. Then for any choice of the poles $a_j$ the monodromy group can be realized by some Fuchsian system.
Bolibrukh also gave an estimate of the number $m$ of additional apparent singularities in a Fuchsian equation which are sufficient to realize a given irreducible monodromy group. It follows from his result that

$$m \le \frac{n(n-1)(p-1)}{2} + 1 - n$$

One can ask what the codimension is of the subset in the space (monodromy group, poles) which provides the negative answer to the Riemann–Hilbert problem in its initial setting. The (author's) answer for $p \ge 3$ is $2p(n-1)$, and for $n \ge 7$ this codimension is attained only at couples (monodromy group, poles) for which every monodromy operator $M_j$ is conjugate to a Jordan block of size $n$, the group has an invariant subspace or factor-space of dimension $n-1$, the corresponding sub- or factor-representation is irreducible and cannot be realized by a Fuchsian system in which all matrices-residua are conjugate to Jordan blocks of size $n-1$. For $n \le 6$ there are examples where the same codimension is attained (but cannot be decreased) on other couples as well.

Levelt's Result and Bolibrukh's Method

In 1961, AHM Levelt described the form of the solution to a regular system at its pole. His result is at the core of Bolibrukh's method for solving the Riemann–Hilbert problem.

Theorem 9 In the neighborhood of a pole, the solution to a regular linear system is representable in the form

$$X = U_j(t-a_j)\,(t-a_j)^{D_j}\,(t-a_j)^{E_j}\,G_j$$ [8]

where the matrix $U_j$ is holomorphic in a neighborhood of 0, $D_j = \mathrm{diag}(\varphi_{1,j}, \dots, \varphi_{n,j})$, $\varphi_{k,j} \in \mathbb{Z}$, $\det G_j \ne 0$. The matrix $E_j$ is in upper-triangular form and the real parts of its eigenvalues belong to $[0, 1)$ (by definition, $(t-a_j)^{E_j} = e^{E_j \ln(t-a_j)}$). The numbers $\varphi_{k,j}$ satisfy the condition [10] formulated below. They are valuations in the eigenspaces of the monodromy operator $M_j$ (i.e., in the maximal subspaces invariant for $M_j$ on which it acts as an operator with a single eigenvalue).

A regular system is Fuchsian at $a_j$ if and only if

$$\det U_j(0) \ne 0$$ [9]

The condition on $\varphi_{k,j}$ can be formulated as follows: let $E_j$ have one and the same eigenvalue in the rows with indices $s_1 < s_2 < \cdots < s_q$. Then one has

$$\varphi_{s_1,j} \ge \varphi_{s_2,j} \ge \cdots \ge \varphi_{s_q,j}$$ [10]

Remark 10 Denote by $\beta_{k,j}$ the diagonal entries (i.e., the eigenvalues) of the matrix $E_j$. Then the sums $\beta_{k,j} + \varphi_{k,j}$ are the eigenvalues of the matrix-residuum $A_j$ at $a_j$.

In proving that the Riemann–Hilbert problem is positively solved in the case of an irreducible monodromy group, Bolibrukh (or the author) uses the correct part of Plemelj's proof, namely that the given monodromy group can be realized by a regular system which is Fuchsian at all poles but one. After this, a suitable change [4] is sought which makes the system Fuchsian at the last pole. The criterion to be Fuchsian is provided by the above theorem; one checks how the matrices $D_j$, that is, the exponents $\varphi_{k,j}$, and the matrices $U_j$ change as a result of the transformation [4]. This is easier (one has only to multiply to the left by $W(t)$) than to see how the matrix $A(t)$ of system [1] changes, because one has conjugation in rule [5]. This idea is also due to Bolibrukh.

When Bolibrukh obtains the negative answer to the Riemann–Hilbert problem in some case of reducible monodromy group, he often uses the following two propositions:

Proposition 11 The sum $\sum (\beta_{k,j} + \varphi_{k,j})$ relative to a subspace of the solution space invariant for all monodromy operators is a non-positive integer.

In particular, the sum of all exponents $\beta_{k,j} + \varphi_{k,j}$ is a non-positive integer which is 0 if and only if the system is Fuchsian.

Proposition 12 If some component of some column of some matrix solution to a regular system is identically equal to 0, then the monodromy group of the system is reducible.

A reducible monodromy group can be conjugated to a block upper-triangular form, with the diagonal blocks defining irreducible representations. Thus, the Riemann–Hilbert problem for reducible monodromy groups makes necessary the answer to the question "given the set of poles $a_j$, for which sets of exponents $\varphi_{k,j}$ can a given irreducible monodromy group be realized by such a Fuchsian system?" For $n \ge 2$, an irreducible monodromy group can be a priori realized by infinitely many Fuchsian systems, with different sets of exponents $\varphi_{k,j}$. Consider the case when these exponents are fixed for $j \ne 1$; suppose that $a_1 = 0$. The author has shown that then infinitely many of the a priori possible choices of the exponents $\varphi_{k,1}$ cannot be realized by Fuchsian systems if and only if the given monodromy group is realized by a Fuchsian system which is obtained from another one via the change of time $t \mapsto t^k/(b_k t^k + b_{k-1} t^{k-1} + \cdots + b_0)$, $b_i \in \mathbb{C}$, $b_0 \ne 0$, $k \in \mathbb{N}^*$, $k > 1$. This change increases the number of poles.
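As a small numerical aside, the two counting formulas quoted earlier in this section (Bolibrukh's estimate on the number of additional apparent singularities, and the codimension $2p(n-1)$ stated for $p \ge 3$) can be tabulated directly. The sketch below is only an illustration: the function names are hypothetical, and it assumes the conventions of this article, namely a system of size $n$ with $p + 1$ poles.

```python
def apparent_singularity_bound(n: int, p: int) -> int:
    """Upper bound n(n-1)(p-1)/2 + 1 - n on the number m of additional
    apparent singularities (hypothetical helper; n = size, p + 1 = number of poles)."""
    if n < 1 or p < 1:
        raise ValueError("expected n >= 1 and p >= 1")
    # n(n-1) is always even, so the integer division is exact.
    return n * (n - 1) * (p - 1) // 2 + 1 - n


def negative_answer_codimension(n: int, p: int) -> int:
    """Codimension 2p(n-1) quoted in the text; only stated there for p >= 3."""
    if p < 3:
        raise ValueError("the formula is quoted only for p >= 3")
    if n < 1:
        raise ValueError("expected n >= 1")
    return 2 * p * (n - 1)


def bound_table(max_n: int, max_p: int) -> dict:
    """Map (n, p) to the apparent-singularity bound for small n and p."""
    return {
        (n, p): apparent_singularity_bound(n, p)
        for n in range(2, max_n + 1)
        for p in range(2, max_p + 1)
    }
```

For instance, `apparent_singularity_bound(2, 2)` evaluates to 0, which matches the expectation that a second-order equation with three singular points needs no extra apparent singularity under this estimate.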

Further Developments – The Deligne–Simpson Problem

The Riemann–Hilbert problem can be generalized for irregular systems as follows. One asks whether for given poles $a_j$ there exists a linear system of ordinary differential equations on the Riemann sphere with these and only these poles which is Fuchsian at the regular singular points, which has prescribed formal normal forms, formal monodromies and Stokes multipliers at the irregular singular points, and which has a prescribed global monodromy.

The Riemann–Hilbert problem has been considered in some papers (of H Esnault, E Viehweg, and C Hertling) in the context of algebraic curves of higher genus instead of $\mathbb{CP}^1$.

The study of the so-called Riemann–Hilbert correspondence between the category of holonomic D-modules and the one of perverse sheaves with constructible cohomology has been initiated in the works of J Bernstein in the algebraic aspect and of M Sato, T Kawai, and M Kashiwara in the analytic one. This has been done in the case of a variety of arbitrary dimension (not necessarily $\mathbb{CP}^1$), with codimension one pole divisor. Perversity has been defined by P Deligne, M Goresky, and R MacPherson. Regularity has been defined by M Kashiwara in the analytic aspect and by Z Mebkhout in the geometric one. Important contributions in the domain are due to Ph Maisonobe, M Merle, N Nitsure, C Sabbah, and the list is far from being exhaustive. The Riemann–Hilbert correspondence plays an important role in other trends of mathematics as well.

The Deligne–Simpson problem is formulated like this: Give necessary and sufficient conditions upon the choice of the conjugacy classes $c_j \subset gl(n, \mathbb{C})$ or $C_j \subset GL(n, \mathbb{C})$ so that there should exist an irreducible (i.e., without proper invariant subspace) $(p+1)$-tuple of matrices $A_j \in c_j$ satisfying [3] or of matrices $M_j$ satisfying [7].

The problem was stated in the 1980s by P Deligne for matrices $M_j$ and in the 1990s by the author for matrices $A_j$. C Simpson was the first to obtain results towards its resolution in the case of matrices $M_j$. The problem admits the following geometric interpretation in the case of matrices $M_j$: For which $(p+1)$-tuples of local monodromies does there exist an irreducible global monodromy with such local monodromies?

For generic eigenvalues the problem has found a complete solution in the author's papers in the form of a criterion upon the Jordan normal forms defined by the conjugacy classes. The author has treated the case of nilpotent matrices $A_j$ and the one of unipotent matrices $M_j$ as well. For matrices $A_j$, the problem has been completely solved (for any eigenvalues) by W Crawley-Boevey. The case of matrices $A_j$ with $p = 2$ has been treated by O Gleizer using results of A Klyachko. The case when the matrices $M_j$ are unitary is considered in papers of S Agnihotri, P Belkale, I Biswas, C Teleman, and C Woodward. Several cases of finite groups have been considered by M Dettweiler, S Reiter, K Strambach, J Thompson, and H Völklein. The important rigid case has been studied by NM Katz. Y Haraoka has considered the problem in the context of linear systems in Okubo's normal form. One can find details in the author's survey on the Deligne–Simpson problem (Kostov, 2004).

See also: Affine Quantum Groups; Bicrossproduct Hopf Algebras and Non-Commutative Spacetime; Einstein Equations: Exact Solutions; Holonomic Quantum Fields; Integrable Systems: Overview; Isomonodromic Deformations; Leray–Schauder Theory and Mapping Degree; Painlevé Equations; Riemann–Hilbert Methods in Integrable Systems; Twistors; WDVV Equations and Frobenius Manifolds.

Further Reading

Anosov DV and Bolibruch AA (1994) The Riemann–Hilbert Problem, A Publication from the Moscow Institute of Mathematics, Aspects of Mathematics, Vieweg.
Arnol'd VI and Ilyashenko YuS (1988) Ordinary differential equations. In: Dynamical Systems I, Encyclopedia of Mathematical Sciences, t. 1. Berlin: Springer.
Beukers F and Heckman G (1989) Monodromy for the hypergeometric function ${}_nF_{n-1}$. Inventiones Mathematicae 95: 325–354.
Bolibrukh AA (1990) The Riemann–Hilbert problem. Russian Mathematical Surveys 45(2): 1–49.
Bolibrukh AA (1992) Sufficient conditions for the positive solvability of the Riemann–Hilbert problem. Mathematical Notes 51(1–2): 110–117.
Crawley-Boevey W (2003) On matrices in prescribed conjugacy classes with no common invariant subspace and sum zero. Duke Mathematical Journal 118(2): 339–352.
Dekkers W (1979) The matrix of a connection having regular singularities on a vector bundle of rank 2 on $P^1(\mathbb{C})$. Lecture Notes in Mathematics 712: 33–43.
Deligne P (1970) Equations différentielles à points singuliers réguliers, Lecture Notes in Mathematics, vol. 163, pp. 133. Berlin: Springer.
Dettweiler M and Reiter S (1999) On rigid tuples in linear groups of odd dimension. Journal of Algebra 222(2): 550–560.
Esnault H and Viehweg E (1999) Semistable bundles on curves and irreducible representations of the fundamental group. Algebraic geometry: Hirzebruch 70 (Warsaw, 1998), Contemporary Mathematics AMS, Providence, RI 241: 129–138.
Esnault H and Hertling C (2001) Semistable bundles on curves and reducible representations of the fundamental group. International Journal of Mathematics 12(7): 847–855.
Haraoka Y (1994) Finite monodromy of Pochhammer equation. Annales de l'Institut Fourier 44(3): 767–810.
Katz NM (1995) Rigid Local Systems. Annals of Mathematics, Studies Series, Study, vol. 139. Princeton: Princeton University Press.

Kohn A and Treibich (1983) Un résultat de Plemelj, Mathematics and Physics (Paris 1979/1982). In: Progr. Math., vol. 37, pp. 307–312. Boston: Birkhäuser.
Kostov VP (1992) Fuchsian linear systems on $\mathbb{CP}^1$ and the Riemann–Hilbert problem. Comptes Rendus de l'Académie des Sciences à Paris, 143–148.
Kostov VP (1999) The Deligne–Simpson problem. C.R. Acad. Sci. Paris, t. 329 Série I, 657–662.
Kostov VP (2004) The Deligne–Simpson problem – a survey. Journal of Algebra 281: 83–108.
Levelt AHM (1961) Hypergeometric functions. Indagationes Mathematicae 23: 361–401.
Maisonobe Ph and Narváez-Macarro L (eds.) (2004) Eléments de la théorie des systèmes différentiels géométriques. Cours du C.I.M.P.A. Ecole d'été de Séville. Séminaires et Congrès 8, xx + 430 pages.
Völklein H (1998) Rigid generators of classical groups. Mathematische Annalen 311(3): 421–438.
Wasow WR (1976) Asymptotic Expansions for Ordinary Differential Equations. New York: Huntington.

Riemannian Holonomy Groups and Exceptional Holonomy


D D Joyce, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Riemannian Holonomy Groups

Let $(M, g)$ be a Riemannian $n$-manifold. The holonomy group Hol(g) is a Lie subgroup of $O(n)$, a global invariant of $g$ which measures the constant tensors $S$ on $M$ preserved by the Levi-Civita connection $\nabla$ of $g$. The most well-known examples of metrics with special holonomy are Kähler metrics, with $\mathrm{Hol}(g) \subseteq U(m) \subset O(2m)$. A Kähler manifold $(M, g)$ also carries a complex structure $J$ and Kähler 2-form $\omega$ with $\nabla J = \nabla\omega = 0$.

The classification of Riemannian holonomy groups gives a list of interesting special Riemannian geometries such as Calabi–Yau manifolds and the exceptional holonomy groups $G_2$ and Spin(7), all of which are important in physics. These geometries have many features in common with Kähler geometry, and are characterized by the existence of constant exterior forms.

General Properties of Holonomy Groups

Let $M$ be a connected manifold of dimension $n$ and $g$ a Riemannian metric on $M$, with Levi-Civita connection $\nabla$, regarded as a connection on the tangent bundle $TM$ of $M$. Suppose $\gamma : [0, 1] \to M$ is a smooth path, with $\gamma(0) = x$ and $\gamma(1) = y$. Let $s$ be a smooth section of $\gamma^*(TM)$, so that $s : [0, 1] \to TM$ with $s(t) \in T_{\gamma(t)}M$ for each $t \in [0, 1]$. Then we say that $s$ is parallel if $\nabla_{\dot\gamma(t)} s(t) = 0$ for all $t \in [0, 1]$, where $\dot\gamma(t)$ is $\frac{d}{dt}\gamma(t) \in T_{\gamma(t)}M$.

For each $v \in T_xM$, there is a unique parallel section $s$ of $\gamma^*(TM)$ with $s(0) = v$. Define a map $P_\gamma : T_xM \to T_yM$ by $P_\gamma(v) = s(1)$. Then $P_\gamma$ is well defined and linear, and is called the parallel transport map along $\gamma$. This easily generalizes to continuous, piecewise-smooth paths $\gamma$. As $\nabla g = 0$, we see that $P_\gamma : T_xM \to T_yM$ is orthogonal with respect to the metric $g$ on $T_xM$ and $T_yM$.

Definition 1 Fix a point $x \in M$. $\gamma$ is said to be a loop based at $x$ if $\gamma : [0, 1] \to M$ is a continuous, piecewise-smooth path with $\gamma(0) = \gamma(1) = x$. If $\gamma$ is a loop based at $x$, then the parallel transport map $P_\gamma$ lies in $O(T_xM)$, the group of orthogonal linear transformations of $T_xM$. Define the (Riemannian) holonomy group $\mathrm{Hol}_x(g)$ of $g$ based at $x$ to be

$$\mathrm{Hol}_x(g) = \{P_\gamma : \gamma \text{ is a loop based at } x\} \subset O(T_xM)$$ [1]

Here are some elementary properties of $\mathrm{Hol}_x(g)$. The only difficult part is showing that $\mathrm{Hol}_x(g)$ is a (closed) Lie subgroup.

Theorem 2 $\mathrm{Hol}_x(g)$ is a Lie subgroup of $O(T_xM)$, which is closed and connected if $M$ is simply connected, but need not be closed or connected otherwise. Let $x, y \in M$, and suppose $\gamma : [0, 1] \to M$ is a continuous, piecewise-smooth path with $\gamma(0) = x$ and $\gamma(1) = y$, so that $P_\gamma : T_xM \to T_yM$. Then

$$P_\gamma\, \mathrm{Hol}_x(g)\, P_\gamma^{-1} = \mathrm{Hol}_y(g)$$ [2]

By choosing an orthonormal basis for $T_xM$ we can identify $O(T_xM)$ with the Lie group $O(n)$, and so identify $\mathrm{Hol}_x(g)$ with a Lie subgroup of $O(n)$. Changing the basis changes the subgroups by conjugation by an element of $O(n)$. Thus, $\mathrm{Hol}_x(g)$ may be regarded as a Lie subgroup of $O(n)$ defined up to conjugation. Equation [2] shows that in this sense, $\mathrm{Hol}_x(g)$ is independent of the base point $x$. Therefore, we omit the subscript $x$ and write Hol(g) for the holonomy group of $g$, regarded as a subgroup of $O(n)$ defined up to conjugation.

It is significant that Hol(g) is a global invariant of $g$, that is, it does not vary from point to point like local invariants of $g$ such as the curvature. Generic metrics $g$ on $M$ have Hol(g) = $SO(n)$ if $M$ is orientable, and Hol(g) = $O(n)$ otherwise. But some special metrics $g$ can have Hol(g) a proper

subgroup of $SO(n)$ or $O(n)$. Then $M$ carries some extra geometric structures compatible with $g$. Broadly, the smaller Hol(g) is as a subgroup of $O(n)$, the more special $g$ is, and the more extra geometric structures there are. Therefore, understanding and classifying the possible holonomy groups gives a family of interesting special Riemannian geometries, such as Kähler geometry. All of these special geometries have cropped up in physics.

Define the holonomy algebra $\mathfrak{hol}(g)$ to be the Lie algebra of Hol(g), regarded as a Lie subalgebra of $\mathfrak{o}(n)$, defined up to the adjoint action of $O(n)$. Define $\mathfrak{hol}_x(g)$ to be the Lie algebra of $\mathrm{Hol}_x(g)$, as a Lie subalgebra of $\mathfrak{o}(T_xM) \cong \Lambda^2 T_x^*M$. The holonomy algebra $\mathfrak{hol}(g)$ is intimately connected with the Riemann curvature tensor $R_{abcd} = g_{ae} R^e{}_{bcd}$ of $g$.

Theorem 3 The Riemann curvature tensor $R_{abcd}$ lies in $S^2\mathfrak{hol}_x(g)$ at $x$, where $\mathfrak{hol}_x(g)$ is regarded as a subspace of $\Lambda^2 T_x^*M$. It also satisfies the first and second Bianchi identities

$$R_{abcd} + R_{adbc} + R_{acdb} = 0$$ [3]

$$\nabla_e R_{abcd} + \nabla_c R_{abde} + \nabla_d R_{abec} = 0$$ [4]

A related result is the Ambrose–Singer holonomy theorem, which, roughly speaking, says that $\mathfrak{hol}_x(g)$ may be reconstructed from $R_{abcd}|_y$ for all $y \in M$, moved to $x$ by parallel transport.

If $(M, g)$ and $(N, h)$ are Riemannian manifolds, the product $M \times N$ carries a product metric $g \times h$. It is easy to show that $\mathrm{Hol}(g \times h) = \mathrm{Hol}(g) \times \mathrm{Hol}(h)$. A Riemannian manifold $(M, g)$ is called reducible if every point has an open neighborhood isometric to a Riemannian product and irreducible otherwise.

Theorem 4 Let $(M, g)$ be a Riemannian $n$-manifold. Then the natural representation of Hol(g) on $\mathbb{R}^n$ is reducible if and only if $g$ is reducible.

There is a class of Riemannian manifolds called the "Riemannian symmetric spaces" which are important in the theory of Riemannian holonomy groups. A Riemannian symmetric space is a special kind of Riemannian manifold with a transitive isometry group. The theory of symmetric spaces was worked out by Élie Cartan in the 1920s, who classified them completely, using his own classification of Lie groups and their representations.

A Riemannian metric $g$ is called "locally symmetric" if $\nabla_e R_{abcd} \equiv 0$, and "nonsymmetric" otherwise. Every locally symmetric metric is locally isometric to a Riemannian symmetric space. The relevance of symmetric spaces to holonomy groups is that many possible holonomy groups are the holonomy group of a Riemannian symmetric space, but are not realized by any nonsymmetric metric. Therefore, by restricting attention to nonsymmetric metrics, one considerably reduces the number of possible Riemannian holonomy groups.

A tensor $S$ on $M$ is constant if $\nabla S = 0$. An important property of Hol(g) is that it determines the constant tensors on $M$.

Theorem 5 Let $(M, g)$ be a Riemannian manifold, with Levi-Civita connection $\nabla$. Fix $x \in M$, so that $\mathrm{Hol}_x(g)$ acts on $T_xM$, and so on the tensor powers $\bigotimes^k T_xM \otimes \bigotimes^l T_x^*M$. Suppose $S \in C^\infty(\bigotimes^k TM \otimes \bigotimes^l T^*M)$ is a constant tensor. Then $S|_x$ is fixed by the action of $\mathrm{Hol}_x(g)$. Conversely, if $S|_x \in \bigotimes^k T_xM \otimes \bigotimes^l T_x^*M$ is fixed by $\mathrm{Hol}_x(g)$, it extends to a unique constant tensor $S \in C^\infty(\bigotimes^k TM \otimes \bigotimes^l T^*M)$.

The main idea in the proof is that if $S$ is a constant tensor and $\gamma : [0, 1] \to M$ is a path from $x$ to $y$, then $P_\gamma(S|_x) = S|_y$, that is, "constant tensors are invariant under parallel transport." In particular, they are invariant under parallel transport around closed loops based at $x$, and so under elements of $\mathrm{Hol}_x(g)$.

Berger's Classification of Holonomy Groups

Berger classified Riemannian holonomy groups in 1955.

Theorem 6 Let $M$ be a simply connected, $n$-dimensional manifold, and $g$ an irreducible, nonsymmetric Riemannian metric on $M$. Then
(i) Hol(g) = $SO(n)$,
(ii) $n = 2m$ and Hol(g) = $SU(m)$ or $U(m)$,
(iii) $n = 4m$ and Hol(g) = $Sp(m)$ or $Sp(m)Sp(1)$,
(iv) $n = 7$ and Hol(g) = $G_2$, or
(v) $n = 8$ and Hol(g) = Spin(7).

To simplify the classification, Berger makes three assumptions: $M$ is simply connected, $g$ is irreducible, and $g$ is nonsymmetric. We can make $M$ simply connected by passing to the "universal cover." The holonomy group of a reducible metric is a product of holonomy groups of irreducible metrics, and the holonomy groups of locally symmetric metrics follow from Cartan's classification of Riemannian symmetric spaces. Thus, these three assumptions can easily be removed.

Here is a sketch of Berger's proof of Theorem 6. As $M$ is simply connected, Theorem 2 shows Hol(g) is a closed, connected Lie subgroup of $SO(n)$, and since $g$ is irreducible, Theorem 4 shows the representation of Hol(g) on $\mathbb{R}^n$ is irreducible. So, suppose that $H$ is a closed, connected subgroup of

$SO(n)$ acting irreducibly on $\mathbb{R}^n$, with Lie algebra $\mathfrak{h}$. The classification of all such $H$ follows from the classification of Lie groups (and is of considerable complexity). Berger's method was to take the list of all such groups $H$, and to apply two tests to each possibility to find out if it could be a holonomy group. The only groups $H$ which passed both tests are those in the theorem.

Berger's tests are algebraic and involve the curvature tensor. Suppose that $R_{abcd}$ is the Riemann curvature of a metric $g$ with Hol(g) = $H$. Then Theorem 3 gives $R_{abcd} \in S^2\mathfrak{h}$, and the first Bianchi identity [3] applies. But if $\mathfrak{h}$ has large codimension in $\mathfrak{o}(n)$, then the vector space $\mathfrak{R}^H$ of elements of $S^2\mathfrak{h}$ satisfying [3] will be small, or even zero. However, the "Ambrose–Singer holonomy theorem" shows that $\mathfrak{R}^H$ must be big enough to generate $\mathfrak{h}$. For many of the candidate groups $H$, this does not hold, and so $H$ cannot be a holonomy group. This is the first test.

Now $\nabla_e R_{abcd}$ lies in $(\mathbb{R}^n)^* \otimes \mathfrak{R}^H$, and also satisfies the second Bianchi identity, eqn [4]. Frequently, these imply that $\nabla R = 0$, so that $g$ is locally symmetric. Therefore, we may exclude such $H$, and this is Berger's second test.

Berger's proof does not show that the groups on his list actually occur as Riemannian holonomy groups, only that no others do. It is now known, though this took another thirty years to find out, that all possibilities in Theorem 6 do occur.

The Groups on Berger's List

Here are some brief remarks about each group on Berger's list.

(i) $SO(n)$ is the holonomy group of generic Riemannian metrics.

(ii) Riemannian metrics $g$ with $\mathrm{Hol}(g) \subseteq U(m)$ are called "Kähler metrics." Kähler metrics are a natural class of metrics on complex manifolds, and generic Kähler metrics on a given complex manifold have holonomy $U(m)$.

Metrics $g$ with Hol(g) = $SU(m)$ are called Calabi–Yau metrics. Since $SU(m)$ is a subgroup of $U(m)$, all Calabi–Yau metrics are Kähler. If $g$ is Kähler and $M$ is simply connected, then $\mathrm{Hol}(g) \subseteq SU(m)$ if and only if $g$ is Ricci-flat. Thus, Calabi–Yau metrics are locally more or less the same as Ricci-flat Kähler metrics.

If $(M, J)$ is a compact complex manifold with trivial canonical bundle admitting Kähler metrics, then Yau's solution of the Calabi conjecture gives a unique Ricci-flat Kähler metric in each Kähler class. This gives a way to construct many examples of Calabi–Yau manifolds, and explains why these manifolds are named after Calabi and Yau.

(iii) Metrics $g$ with Hol(g) = $Sp(m)$ are called "hyper-Kähler." As $Sp(m) \subseteq SU(2m) \subset U(2m)$, hyper-Kähler metrics are Ricci-flat and Kähler.

Metrics $g$ with holonomy group $Sp(m)Sp(1)$ for $m \ge 2$ are called "quaternionic Kähler." (Note that quaternionic Kähler metrics are not in fact Kähler.) They are Einstein, but not Ricci-flat.

(iv), (v) $G_2$ and Spin(7) are the exceptional cases, so they are called the "exceptional holonomy groups." Metrics with these holonomy groups are Ricci-flat.

The groups can be understood in terms of the four division algebras: the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$, the quaternions $\mathbb{H}$, and the octonions or Cayley numbers $\mathbb{O}$.

$SO(n)$ is a group of automorphisms of $\mathbb{R}^n$.
$U(m)$ and $SU(m)$ are groups of automorphisms of $\mathbb{C}^m$.
$Sp(m)$ and $Sp(m)Sp(1)$ are automorphism groups of $\mathbb{H}^m$.
$G_2$ is the automorphism group of $\mathrm{Im}\,\mathbb{O} \cong \mathbb{R}^7$.
Spin(7) is a group of automorphisms of $\mathbb{O} \cong \mathbb{R}^8$, preserving part of the structure on $\mathbb{O}$.

The Exceptional Holonomy Groups

For some time after Berger's classification, the exceptional holonomy groups remained a mystery. In 1987, Bryant used the theory of exterior differential systems to show that locally there exist many metrics with these holonomy groups, and gave some explicit, incomplete examples. Then in 1989, Bryant and Salamon found explicit, complete metrics with holonomy $G_2$ and Spin(7) on noncompact manifolds. In 1994–95, the author constructed the first examples of metrics with holonomy $G_2$ and Spin(7) on compact manifolds. For more information on exceptional holonomy, see Joyce (2000, 2002).

The Holonomy Group G2

Let $(x_1, \dots, x_7)$ be coordinates on $\mathbb{R}^7$. Write $dx_{ij\dots l}$ for the exterior form $dx_i \wedge dx_j \wedge \cdots \wedge dx_l$ on $\mathbb{R}^7$. Define a metric $g_0$, a 3-form $\varphi_0$, and a 4-form $*\varphi_0$ on $\mathbb{R}^7$ by

$$g_0 = dx_1^2 + \cdots + dx_7^2$$
$$\varphi_0 = dx_{123} + dx_{145} + dx_{167} + dx_{246} - dx_{257} - dx_{347} - dx_{356}$$ [5]
$$*\varphi_0 = dx_{4567} + dx_{2367} + dx_{2345} + dx_{1357} - dx_{1346} - dx_{1256} - dx_{1247}$$

The subgroup of $GL(7, \mathbb{R})$ preserving $\varphi_0$ is the exceptional Lie group $G_2$. It also preserves $g_0$, $*\varphi_0$,

and the orientation on $\mathbb{R}^7$. It is a compact, semisimple, 14-dimensional Lie group, a subgroup of $SO(7)$.

A $G_2$-structure on a 7-manifold $M$ is a principal sub-bundle of the frame bundle of $M$, with structure group $G_2$. Each $G_2$-structure gives rise to a 3-form $\varphi$ and a metric $g$ on $M$, such that every tangent space of $M$ admits an isomorphism with $\mathbb{R}^7$ identifying $\varphi$ and $g$ with $\varphi_0$ and $g_0$, respectively. By an abuse of notation, $(\varphi, g)$ can be referred to as a $G_2$-structure.

Proposition 7 Let $M$ be a 7-manifold and $(\varphi, g)$ a $G_2$-structure on $M$. Then the following are equivalent:
(i) $\mathrm{Hol}(g) \subseteq G_2$, and $\varphi$ is the induced 3-form;
(ii) $\nabla\varphi = 0$ on $M$, where $\nabla$ is the Levi-Civita connection of $g$; and
(iii) $d\varphi = d(*\varphi) = 0$ on $M$.

The equations $d\varphi = d(*\varphi) = 0$ look like linear partial differential equations on $\varphi$. However, it is better to consider them as nonlinear, for the following reason. The 3-form $\varphi$ determines the metric $g$, and $g$ gives the Hodge star $*$ on $M$. So $*\varphi$ is a nonlinear function of $\varphi$, and $d(*\varphi) = 0$ a nonlinear equation. Thus, constructing and studying $G_2$-manifolds comes down to studying solutions of nonlinear elliptic partial differential equations.

Note that $\mathrm{Hol}(g) \subseteq G_2$ if and only if $\nabla\varphi = 0$ follows from Theorem 5. We call $\nabla\varphi$ the "torsion" of the $G_2$-structure $(\varphi, g)$, and when $\nabla\varphi = 0$ the $G_2$-structure is "torsion-free." A triple $(M, \varphi, g)$ is called a $G_2$-manifold if $M$ is a 7-manifold and $(\varphi, g)$ a torsion-free $G_2$-structure on $M$. If $g$ has holonomy $\mathrm{Hol}(g) \subseteq G_2$, then $g$ is Ricci-flat.

Theorem 8 Let $M$ be a compact 7-manifold, and suppose that $(\varphi, g)$ is a torsion-free $G_2$-structure on $M$. Then Hol(g) = $G_2$ if and only if $\pi_1(M)$ is finite. In this case, the moduli space of metrics with holonomy $G_2$ on $M$, up to diffeomorphisms isotopic to the identity, is a smooth manifold of dimension $b^3(M)$.

The Holonomy Group Spin(7)

Let $\mathbb{R}^8$ have coordinates $(x_1, \dots, x_8)$. Define a 4-form $\Omega_0$ on $\mathbb{R}^8$ by

$$\Omega_0 = dx_{1234} + dx_{1256} + dx_{1278} + dx_{1357} - dx_{1368} - dx_{1458} - dx_{1467} - dx_{2358} - dx_{2367} - dx_{2457} + dx_{2468} + dx_{3456} + dx_{3478} + dx_{5678}$$ [6]

The subgroup of $GL(8, \mathbb{R})$ preserving $\Omega_0$ is the holonomy group Spin(7). It also preserves the orientation on $\mathbb{R}^8$ and the Euclidean metric $g_0 = dx_1^2 + \cdots + dx_8^2$. It is a compact, semisimple, 21-dimensional Lie group, a subgroup of $SO(8)$.

A Spin(7)-structure on an 8-manifold $M$ gives rise to a 4-form $\Omega$ and a metric $g$ on $M$, such that each tangent space of $M$ admits an isomorphism with $\mathbb{R}^8$ identifying $\Omega$ and $g$ with $\Omega_0$ and $g_0$, respectively. By an abuse of notation, the pair $(\Omega, g)$ is referred to as a Spin(7)-structure.

Proposition 9 Let $M$ be an 8-manifold and $(\Omega, g)$ a Spin(7)-structure on $M$. Then the following are equivalent:
(i) $\mathrm{Hol}(g) \subseteq \mathrm{Spin}(7)$ and $\Omega$ is the induced 4-form;
(ii) $\nabla\Omega = 0$ on $M$, where $\nabla$ is the Levi-Civita connection of $g$; and
(iii) $d\Omega = 0$ on $M$.

We call $\nabla\Omega$ the torsion of the Spin(7)-structure $(\Omega, g)$, and $(\Omega, g)$ torsion free if $\nabla\Omega = 0$. A triple $(M, \Omega, g)$ is called a Spin(7)-manifold if $M$ is an 8-manifold and $(\Omega, g)$ a torsion-free Spin(7)-structure on $M$. If $g$ has holonomy $\mathrm{Hol}(g) \subseteq \mathrm{Spin}(7)$, then $g$ is Ricci-flat.

Here is a result on compact 8-manifolds with holonomy Spin(7).

Theorem 10 Let $(M, \Omega, g)$ be a compact Spin(7)-manifold. Then, Hol(g) = Spin(7) if and only if $M$ is simply connected, and $b^3(M) + b^4_+(M) = b^2(M) + 2b^4_-(M) + 25$. In this case, the moduli space of metrics with holonomy Spin(7) on $M$, up to diffeomorphisms isotopic to the identity, is a smooth manifold of dimension $1 + b^4_-(M)$.

The inclusions between the holonomy groups $SU(m)$, $G_2$, Spin(7) are

$$\begin{array}{ccccc} SU(2) & \to & SU(3) & \to & G_2 \\ \downarrow & & \downarrow & & \downarrow \\ SU(2) \times SU(2) & \to & SU(4) & \to & \mathrm{Spin}(7) \end{array}$$ [7]

The meaning of the above equation is illustrated by using the inclusion $SU(3) \hookrightarrow G_2$. As $SU(3)$ acts on $\mathbb{C}^3$, it also acts on $\mathbb{R} \oplus \mathbb{C}^3 \cong \mathbb{R}^7$, taking the $SU(3)$-action on $\mathbb{R}$ to be trivial. Thus, we embed $SU(3)$ as a subgroup of $GL(7, \mathbb{R})$. It turns out that $SU(3)$ is contained in the subgroup $G_2$ of $GL(7, \mathbb{R})$ defined in the section "The Holonomy Group G2."

Constructing Compact G2- and Spin(7)-Manifolds

The author's method of constructing compact 7-manifolds with holonomy $G_2$ is based on the

Kummer construction for Calabi–Yau metrics on the K3 surface and may be divided into four steps.

Step 1. Let $T^7$ be the 7-torus and $(\varphi_0, g_0)$ a flat $G_2$-structure on $T^7$. Choose a finite group $\Gamma$ of isometries of $T^7$ preserving $(\varphi_0, g_0)$. Then the quotient $T^7/\Gamma$ is a singular, compact 7-manifold, an orbifold.

Step 2. For certain special groups $\Gamma$, there is a method to resolve the singularities of $T^7/\Gamma$ in a natural way, using complex geometry. We get a nonsingular, compact 7-manifold $M$, together with a map $\pi : M \to T^7/\Gamma$, the resolving map.

Step 3. On $M$, we explicitly write down a one-parameter family of $G_2$-structures $(\varphi_t, g_t)$ depending on $t \in (0, \epsilon)$. They are not torsion free, but have small torsion when $t$ is small. As $t \to 0$, the $G_2$-structure $(\varphi_t, g_t)$ converges to the singular $G_2$-structure $\pi^*(\varphi_0, g_0)$.

Step 4. We prove using analysis that for sufficiently small $t$, the $G_2$-structure $(\varphi_t, g_t)$ on $M$, with small torsion, can be deformed to a $G_2$-structure $(\tilde\varphi_t, \tilde g_t)$, with zero torsion. Finally, it is shown that $\tilde g_t$ is a metric with holonomy $G_2$ on the compact 7-manifold $M$.

We explain the first two steps in greater detail. For Step 1, an example of a suitable group $\Gamma$ is given here.

Example 11 Let $(x_1, \dots, x_7)$ be coordinates on $T^7 = \mathbb{R}^7/\mathbb{Z}^7$, where $x_i \in \mathbb{R}/\mathbb{Z}$. Let $(\varphi_0, g_0)$ be the flat $G_2$-structure on $T^7$ defined by [5]. Let $\alpha$, $\beta$, and $\gamma$ be the involutions of $T^7$ defined by

$$\alpha : (x_1, \dots, x_7) \mapsto (x_1, x_2, x_3, -x_4, -x_5, -x_6, -x_7)$$ [8]

$$\beta : (x_1, \dots, x_7) \mapsto (x_1, -x_2, -x_3, x_4, x_5, \tfrac12 - x_6, -x_7)$$ [9]

$$\gamma : (x_1, \dots, x_7) \mapsto (-x_1, x_2, -x_3, x_4, \tfrac12 - x_5, x_6, \tfrac12 - x_7)$$ [10]

By inspection, $\alpha$, $\beta$, and $\gamma$ preserve $(\varphi_0, g_0)$, because of the careful choice of exactly which signs to change. Also, $\alpha^2 = \beta^2 = \gamma^2 = 1$, and $\alpha$, $\beta$, and $\gamma$ commute. Thus, they generate a group $\Gamma = \langle\alpha, \beta, \gamma\rangle \cong \mathbb{Z}_2^3$ of isometries of $T^7$ preserving the flat $G_2$-structure $(\varphi_0, g_0)$.

Having chosen a lattice $\Lambda$ and finite group $\Gamma$, the quotient $T^7/\Gamma$ is an orbifold, a singular manifold with only quotient singularities. The singularities of $T^7/\Gamma$ come from the fixed points of nonidentity elements of $\Gamma$. We now describe the singularities in the example.

Lemma 12 In Example 11, $\beta\gamma$, $\gamma\alpha$, $\alpha\beta$, and $\alpha\beta\gamma$ have no fixed points on $T^7$. The fixed points of $\alpha$, $\beta$, $\gamma$ are each 16 copies of $T^3$. The singular set $S$ of $T^7/\Gamma$ is a disjoint union of 12 copies of $T^3$, 4 copies from each of $\alpha$, $\beta$, $\gamma$. Each component of $S$ is a singularity modeled on that of $T^3 \times \mathbb{C}^2/\{\pm 1\}$.

The most important consideration in choosing $\Gamma$ is that we should be able to resolve the singularities of $T^7/\Gamma$ within holonomy $G_2$, in Step 2. We have no idea how to resolve general orbifold singularities of $G_2$-manifolds. However, after fifty years of hard work we understand well how to resolve orbifold singularities of Calabi–Yau manifolds, with holonomy $SU(m)$. This is done by a combination of algebraic geometry, which produces the underlying complex manifold by a crepant resolution, and Calabi–Yau analysis, which produces the Ricci-flat Kähler metric on this complex manifold.

Now the holonomy groups $SU(2)$ and $SU(3)$ are subgroups of $G_2$, as in [7]. Our tactic in Step 2 is to ensure that all of the singular set $S$ of $T^7/\Gamma$ can locally be resolved with holonomy $SU(2)$ or $SU(3)$, and then use Calabi–Yau geometry to do this. In particular, suppose each connected component of $S$ is isomorphic to either

1. $T^3 \times \mathbb{C}^2/G$, for $G$ a finite subgroup of $SU(2)$; or
2. $S^1 \times \mathbb{C}^3/G$, for $G$ a finite subgroup of $SU(3)$ acting freely on $\mathbb{C}^3 \setminus \{0\}$.

One can use complex algebraic geometry to find a crepant resolution $X$ of $\mathbb{C}^2/G$ or $Y$ of $\mathbb{C}^3/G$. Then $T^3 \times X$ or $S^1 \times Y$ gives a local model for how to resolve the corresponding component of $S$ in $T^7/\Gamma$. Thus we construct a nonsingular, compact 7-manifold $M$ by using the patches $T^3 \times X$ or $S^1 \times Y$ to repair the singularities of $T^7/\Gamma$. In the case of Example 11, this means gluing 12 copies of $T^3 \times X$ into $T^7/\Gamma$, where $X$ is the blow-up of $\mathbb{C}^2/\{\pm 1\}$ at its singular point.

By considering different groups $\Gamma$ acting on $T^7$, and also by finding topologically distinct resolutions $M_1, \dots, M_k$ of the same orbifold $T^7/\Gamma$, we can construct many compact Riemannian 7-manifolds with holonomy $G_2$. A good number of examples are given in Joyce (2000, chapter 12). Figure 1 displays the 252 different sets of Betti numbers of compact, simply connected 7-manifolds with holonomy $G_2$ constructed there together with 5 more sets from Kovalev. It seems likely to the author that the Betti numbers given in Figure 1 are only a small proportion of

Figure 1 Betti numbers $(b^2, b^3)$ of compact $G_2$-manifolds, plotted with $b^3(M)$ on the horizontal axis (0 to 200) and $b^2(M)$ on the vertical axis (0 to 25). (From Joyce (2000) and Kovalev (2003).)

the Betti numbers of all compact 7-manifolds with holonomy $G_2$.

A different construction of compact 7-manifolds with holonomy $G_2$ was given by Kovalev (2003), involving gluing together asymptotically cylindrical Calabi–Yau 3-folds. Compact 8-manifolds with holonomy Spin(7) were constructed by the author using two different methods: first, by resolving singularities of torus orbifolds $T^8/\Gamma$ in a similar way to the $G_2$ case (though the details are different and more difficult), and second, by resolving $Y/\langle\sigma\rangle$ for $Y$ a Calabi–Yau 4-orbifold with singularities of a special kind, and $\sigma$ an antiholomorphic isometric involution of $Y$. Details can be found in Joyce (2000).

See also: Calibrated Geometry and Special Lagrangian Submanifolds.

Further Reading

Bryant RL (1987) Metrics with exceptional holonomy. Annals of Mathematics 126: 525–576.
Bryant RL and Salamon SM (1989) On the construction of some complete metrics with exceptional holonomy. Duke Mathematical Journal 58: 829–850.
Gross M, Huybrechts D, and Joyce D (2003) Calabi–Yau Manifolds and Related Geometries, Universitext Series. Berlin: Springer.
Joyce DD (2002) Constructing compact manifolds with exceptional holonomy, math.DG/0203158, 2002 (a survey paper).
Joyce DD (2004) Constructing compact manifolds with exceptional holonomy. In: Douglas M, Gauntlett J, and Gross M (eds.) Strings and Geometry, Clay Mathematics Proceedings 3. pp. 177–191. Providence, RI: American Mathematical Society. math.DG/0203158.
Kovalev AG (2003) Twisted connected sums and special Riemannian holonomy. Journal für die Reine und Angewandte Mathematik 565: 125–160, math.DG/0012189.
S
Saddle Point Problems
M Schechter, University of California at Irvine, Irvine, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Many problems arising in science and engineering call for the solving of the Euler equations of functionals, that is, equations of the form
\[ G'(u) = 0 \qquad [1] \]
where $G(u)$ is a $C^1$-functional (usually representing the energy) arising from the given data. As an illustration, the equation
\[ -\Delta u(x) = f(x, u(x)) \]
is the Euler equation of the functional
\[ G(u) = \tfrac12 \|\nabla u\|^2 - \int F(x, u(x))\, dx \]
on an appropriate space, where
\[ F(x, t) = \int_0^t f(x, s)\, ds \qquad [2] \]
and the norm is that of $L^2$. The solving of the Euler equations is tantamount to finding critical points of the corresponding functional. The classical approach was to look for maxima or minima. If one is looking for a minimum, it is not sufficient to know that the functional is bounded from below, as is easily checked. However, one can show that there is a sequence satisfying
\[ G(u_k) \to a, \quad G'(u_k) \to 0 \qquad [3] \]
for $a = \inf G$. If the sequence has a convergent subsequence, this will produce a minimum.

However, when extrema do not exist, there is no clear way of obtaining critical points. In particular, this happens when the functional is not bounded from either above or below. Until recently, there was no organized procedure for producing critical points which are not extrema. We shall describe an approach which is very useful in such cases.

To illustrate the technique, we consider the problem of finding a solution of
\[ -u''(x) + u(x) = f(x, u(x)) \qquad [4] \]
$x \in I = [0, 2\pi]$, under the conditions
\[ u(0) = u(2\pi), \quad u'(0) = u'(2\pi) \qquad [5] \]
We assume that the function $f(x, t)$ is continuous in $I \times \mathbb{R}$ and is periodic in $x$ with period $2\pi$. The approach begins by asking the question, "does there exist a differentiable function $G$ from a space $H$ to $\mathbb{R}$ such that [4], [5] are equivalent to [1]?" It is hoped that one can mimic the methods of calculus to find critical points and thus solve [1].

Actually, we are asking the following: does there exist a mapping $G$ from a space $H$ to $\mathbb{R}$ such that $G$ has a critical point $u$ satisfying $G'(u) = -u'' + u - f(x, u(x))$? In order to solve the problem one has to

1. find $G(u)$ such that
\[ (G'(u), v)_H = (u, v)_H - (f(\cdot, u), v) \qquad [6] \]
holds for each $u, v \in H$,
2. show that there is a function $u(x)$ such that $G'(u) = 0$,
3. show that $u''$ exists in $I$,
4. show that [1] implies [4].

We used the notation
\[ (u, v) = \int_0^{2\pi} u(x) v(x)\, dx \]
In order to carry out the procedure, we assume that for each $R > 0$ there is a constant $C_R$ such that
\[ |f(x, t)| \le C_R, \quad x \in I,\ t \in \mathbb{R},\ |t| \le R \qquad [7] \]
This assumption is used to carry out step (1). We define
\[ G(u) = \tfrac12 \|u\|_H^2 - \int_0^{2\pi} F(x, u(x))\, dx \qquad [8] \]
where $F(x, t)$ is given by [2] and we take $H$ to be the completion of $C^1(I)$ with respect to the norm
\[ \|u\|_H = \left(\|u'\|^2 + \|u\|^2\right)^{1/2} \qquad [9] \]
where $\|u\|^2 = (u, u)$. We have

Theorem 1 If $f(x, t)$ satisfies [7], then $G(u)$ given by [8] is continuously differentiable and satisfies [6].

Once we have reduced the problem to solving [1], we can search for critical points. The easiest type to locate are "saddle points" which are local minima in some directions and local maxima in all others. For instance, we obtain theorems such as

Theorem 2 Assume that
\[ |f(x, t)| \le C(|t| + 1), \quad x \in I,\ t \in \mathbb{R}; \qquad 2F(x, t)/t^2 \to \beta(x) \ \text{a.e. as } |t| \to \infty \qquad [10] \]
with $\beta(x)$ satisfying
\[ 1 + n^2 \le \beta(x) \le 1 + (n+1)^2, \qquad 1 + n^2 \not\equiv \beta(x) \not\equiv 1 + (n+1)^2 \qquad [11] \]
and $n$ an integer $\ge 0$. If $G(u)$ is given by [8], then there is a $u_0 \in H$ such that
\[ G'(u_0) = 0 \qquad [12] \]
In particular, $u_0$ is a solution of [4] and [5] in the usual sense.

In proving this theorem, we shall make use of

Theorem 3 Let $M$, $N$ be closed subspaces of a Hilbert space $E$ such that $M = N^\perp$. Assume that at least one of these subspaces is finite dimensional. Let $G$ be a continuously differentiable functional on $E$ satisfying
\[ m_0 = \sup_{v \in N} \inf_{w \in M} G(v + w) \ne -\infty \qquad [13] \]
and
\[ m_1 = \inf_{w \in M} \sup_{v \in N} G(v + w) \ne \infty \qquad [14] \]
Then there is a sequence $\{u_k\} \subset E$ such that
\[ G(u_k) \to c, \quad m_0 \le c \le m_1, \quad G'(u_k) \to 0 \qquad [15] \]

Theorem 3 allows us to obtain solutions if we can find subspaces of $H$ such that [13] and [14] hold. We use it to give the proof of Theorem 2.

Proof. Note that
\[ \|u\|_H^2 = \sum (1 + k^2) |\alpha_k|^2, \quad u \in H \qquad [16] \]
where the $\alpha_k$ are given by
\[ \alpha_k = (u, \bar\varphi_k), \quad k = 0, \pm 1, \pm 2, \dots \qquad [17] \]
and
\[ \varphi_k(x) = \frac{1}{\sqrt{2\pi}}\, e^{ikx}, \quad k = 0, \pm 1, \pm 2, \dots \qquad [18] \]
Let
\[ N = \{u \in H : \alpha_k = 0 \text{ for } |k| > n\} \]
Thus,
\[ \|u\|_H^2 = \sum_{|k| \le n} (1 + k^2) |\alpha_k|^2 \le (1 + n^2) \|u\|^2, \quad u \in N \qquad [19] \]
Let
\[ M = \{u \in H : \alpha_k = 0 \text{ for } |k| \le n\} \]
In this case,
\[ \|u\|_H^2 = \sum_{|k| \ge n+1} (1 + k^2) |\alpha_k|^2 \ge (1 + (n+1)^2) \|u\|^2, \quad u \in M \qquad [20] \]
Note that $M$, $N$ are closed subspaces of $H$ and that $M = N^\perp$. Note also that $N$ is finite dimensional. If we consider the functional [8], it is not difficult to show that [11] implies
\[ \inf_M G > -\infty; \quad \sup_N G < \infty \qquad [21] \]
We are now in a position to apply Theorem 3. This produces a saddle point satisfying [1]. □

Minimax

Theorem 3 is very useful when extrema do not exist, but it is not always applicable. One is then forced to search for other ways of obtaining critical points. Again, one is faced with the fact that there is no systematic method of finding them. A useful idea is to try to find sets that separate the functional. By this we mean the following:

Definition 1 Two sets $A$, $B$ separate the functional $G(u)$ if
\[ a_0 := \sup_A G \le b_0 := \inf_B G \qquad [22] \]
We would like to find sets $A$ and $B$ such that [22] will imply
\[ \exists u : G(u) \ge b_0, \quad G'(u) = 0 \qquad [23] \]
This is too much to expect since even semiboundedness does not imply the existence of an extremum. Consequently, we weaken our requirements and look for sets $A$, $B$ such that [22] implies
\[ G(u_k) \to a, \quad G'(u_k) \to 0 \qquad [24] \]
with $a \ge b_0$. This leads to

Definition 2 We shall say that the set $A$ links the set $B$ if [22] implies [24] with $a \ge b_0$ for every $C^1$ functional $G(u)$.
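The minimax levels of Theorem 3 and the separation condition [22] can be illustrated on a finite-dimensional toy problem. The sketch below is an assumed example, not taken from the article: it uses $E = \mathbb{R}^2$, $N$ the $v$-axis, $M$ the $w$-axis, and the model functional $G(v, w) = w^2 - v^2$, whose only critical point is the saddle at the origin. The function names, the grid size, and the radius $R$ are all hypothetical choices made for illustration.

```python
# Toy illustration (assumed example, not from the article):
# E = R^2, N = {(v, 0)}, M = {(0, w)}, G(v, w) = w^2 - v^2.

def G(v, w):
    """Model saddle functional: a minimum along M, a maximum along N."""
    return w * w - v * v

def grid(lo, hi, steps):
    """Evenly spaced sample points on [lo, hi], endpoints included."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]

def minimax_levels(radius=3.0, steps=200):
    """Grid approximations of m0 = sup_N inf_M G and m1 = inf_M sup_N G."""
    pts = grid(-radius, radius, steps)
    m0 = max(min(G(v, w) for w in pts) for v in pts)
    m1 = min(max(G(v, w) for v in pts) for w in pts)
    return m0, m1

def separation_levels(R=2.0, radius=3.0, steps=200):
    """a0 = sup of G over A (the two points |v| = R in N),
    b0 = inf of G over B = M, sampled on a grid."""
    a0 = max(G(v, 0.0) for v in (-R, R))
    b0 = min(G(0.0, w) for w in grid(-radius, radius, steps))
    return a0, b0

m0, m1 = minimax_levels()
a0, b0 = separation_levels()
print("m0 =", m0, "m1 =", m1)   # both levels sit at the saddle value G(0, 0)
print("a0 =", a0, "b0 =", b0)   # a0 <= b0, so A and B separate G as in [22]
```

In this toy case $m_0 = m_1$ and the common value is attained at the critical point, so no Palais–Smale-type compactness is needed; the infinite-dimensional theory in the text is about what survives when that is no longer automatic.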

Of course, [24] is a far cry from [23], but if, for example, the sequence [24] has a convergent subsequence, then [24] implies [23]. Whether or not [24] implies [23] is a property of the functional $G(u)$. We state this as

Definition 3 We say that $G(u)$ satisfies the Palais–Smale (PS) condition if [24] always implies [23].

The usual way of verifying this is to show that every sequence satisfying [24] has a convergent subsequence (there are other ways).

All of this leads to

Theorem 4 If $G$ satisfies the PS condition and is separated by a pair of linking sets, then it has a critical point satisfying [23].

This theorem cannot be applied until one knows if there are linking sets and functionals that satisfy the PS condition. Fortunately, they exist. Examples and sufficient conditions for $A$ to link $B$ are found in the literature. Obviously, the weaker the conditions, the more pairs will qualify. To date, the conditions described in the next section allow all known examples.

The Details

Let $E$ be a Banach space, and let $\Phi$ be the set of all continuous maps $\Gamma = \Gamma(t)$ from $E \times [0, 1]$ to $E$ such that

1. $\Gamma(0) = I$, the identity map;
2. for each $t \in [0, 1)$, $\Gamma(t)$ is a homeomorphism of $E$ onto $E$ and $\Gamma^{-1}(t) \in C(E \times [0, 1), E)$;
3. $\Gamma(1)E$ is a single point in $E$ and $\Gamma(t)A$ converges uniformly to $\Gamma(1)E$ as $t \to 1$ for each bounded set $A \subset E$; and
4. for each $t_0 \in [0, 1)$ and each bounded set $A \subset E$,
\[ \sup_{0 \le t \le t_0,\ u \in A} \left\{ \|\Gamma(t)u\| + \|\Gamma^{-1}(t)u\| \right\} < \infty \qquad [25] \]

We have the following

Theorem 5 A sufficient condition for $A$ to link $B$ is
(i) $A \cap B = \emptyset$ and
(ii) for each $\Gamma \in \Phi$ there is a $t \in (0, 1]$ such that
\[ \Gamma(t)A \cap B \ne \emptyset \]

Theorem 6 Let $G$ be a $C^1$-functional on $E$, and let $A$, $B$ be subsets of $E$ such that $A$, $B$ satisfy [22] and the hypotheses of Theorem 5. Assume that
\[ a := \inf_{\Gamma \in \Phi}\ \sup_{0 \le s \le 1,\ u \in A} G(\Gamma(s)u) \qquad [26] \]
is finite. Let $\psi(t)$ be a positive, locally Lipschitz continuous function on $[0, \infty)$ such that
\[ \int_0^\infty \psi(r)\, dr = \infty \qquad [27] \]
Then there is a sequence $\{u_k\} \subset E$ such that
\[ G(u_k) \to a, \quad G'(u_k)/\psi(\|u_k\|) \to 0 \qquad [28] \]
If $a = b_0$, then we can also require that
\[ d(u_k, B) \to 0 \qquad [29] \]

Corollary 1 Under the hypotheses of Theorem 6 there is a sequence $\{u_k\} \subset E$ such that
\[ G(u_k) \to a, \quad (1 + \|u_k\|)\, G'(u_k) \to 0 \qquad [30] \]

Proof. We merely take $\psi(u) = 1/(1 + \|u\|)$ in Theorem 6. □

A useful criterion for finding linking subsets is

Theorem 7 Let $F$ be a continuous map from a Banach space $E$ to $\mathbb{R}^n$, and let $Q \subset E$ be such that $F_0 = F|_Q$ is a homeomorphism of $Q$ onto the closure of a bounded open subset $\Omega$ of $\mathbb{R}^n$. If $p \in \Omega$, then $F_0^{-1}(\partial\Omega)$ links $F^{-1}(p)$.

Some Examples

The following are examples of sets that link.

Example 1 Let $M$, $N$ be closed subspaces such that $E = M \oplus N$ (with one finite dimensional). Let
\[ B_R = \{u \in E : \|u\| < R\} \]
and take $A = \partial B_R \cap N$, $B = M$. Then $A$ links $B$. To see this, we identify $N$ with some $\mathbb{R}^n$ and take $\Omega = B_R \cap N$, $Q = \bar\Omega$. For $u \in E$, we write
\[ u = v + w, \quad v \in N,\ w \in M \qquad [31] \]
and take $F$ to be the projection
\[ Fu = v \]
Since $F|_Q = I$ and $M = F^{-1}(0)$, we see from Theorem 7 that $A$ links $B$.

Example 2 We take $M$, $N$ as in Example 1. Let $w_0 \ne 0$ be an element of $M$, and take
\[ A = \{v \in N : \|v\| \le R\} \cup \{sw_0 + v : v \in N,\ s \ge 0,\ \|sw_0 + v\| = R\} \]
\[ B = \partial B_\delta \cap M, \quad 0 < \delta < R. \]
Then $A$ links $B$. Again we identify $N$ with some $\mathbb{R}^n$, and we may assume $\|w_0\| = 1$. Let
\[ Q = \{sw_0 + v : v \in N,\ s \ge 0,\ \|sw_0 + v\| \le R\} \]

Then $A = \partial Q$ in $\mathbb{R}^{n+1}$. If $u$ is given by [31], we define
\[ Fu = v + \|w\| w_0 \]
Then $F|_Q = I$ and $B = F^{-1}(\delta w_0)$. We can now apply Theorem 7 to conclude that $A$ links $B$.

Example 3 Take $M$, $N$ as before and let $v_0 \ne 0$ be an element of $N$. We write $N = \{v_0\} \oplus N'$. We take
\[ A = \{v' \in N' : \|v'\| \le R\} \cup \{sv_0 + v' : v' \in N',\ s \ge 0,\ \|sv_0 + v'\| = R\} \]
\[ B = \{w \in M : \|w\| \ge \delta\} \cup \{sv_0 + w : w \in M,\ s \ge 0,\ \|sv_0 + w\| = \delta\} \]
where $0 < \delta < R$. Then $A$ links $B$. To see this, we let
\[ Q = \{sv_0 + v' : v' \in N',\ s \ge 0,\ \|sv_0 + v'\| \le R\} \]
and reason as before. For simplicity, we assume that $\|v_0\| = 1$, $E$ is a Hilbert space and that the splitting $E = N' \oplus \{v_0\} \oplus M$ is orthogonal. If
\[ u = v' + w + sv_0, \quad v' \in N',\ w \in M,\ s \in \mathbb{R} \qquad [32] \]
we define
\[ F(u) = \begin{cases} v' + \left(s + \delta - \sqrt{\delta^2 - \|w\|^2}\right) v_0, & \|w\| \le \delta \\ v' + (s + \delta)\, v_0, & \|w\| > \delta \end{cases} \]
Note that $F|_Q = I$ while $F^{-1}(\delta v_0)$ is precisely the set $B$. Hence we can conclude via Theorem 7 that $A$ links $B$.

Example 4 This is the same as Example 3 with $A$ replaced by $A = \partial B_R \cap N$. The proof is the same with $Q$ replaced by $Q = \bar B_R \cap N$.

Example 5 Let $M$, $N$ be as in Example 1. Take $A = \partial B_\delta \cap N$, and let $v_0$ be any element in $\partial B_1 \cap N$. Take $B$ to be the set of all $u$ of the form
\[ u = w + sv_0, \quad w \in M \]
satisfying any of the following:
(i) $\|w\| \le R$, $s = 0$,
(ii) $\|w\| \le R$, $s = 2R_0$, and
(iii) $\|w\| = R$, $0 \le s \le 2R_0$
where $0 < \delta < \min(R, R_0)$. Then $A$ links $B$. To see this, take $N = \{v_0\} \oplus N'$. Then any $u \in E$ can be written in the form [32]. Define
\[ F(u) = v' + \left( R_0 - \max\left\{ \frac{R_0}{R} \|w\|,\ |s - R_0| \right\} \right) v_0 \]
and $Q = \bar B_\delta \cap N$. Again we may identify $N$ with some $\mathbb{R}^n$. Then $F \in C(E, N)$ and $F|_Q = I$. Moreover, $B = F^{-1}(0)$. Hence, $A$ links $B$ by Theorem 7.

Example 6 Let $M$, $N$ be as in Example 1. Let $v_0$ be in $\partial B_1 \cap N$ and write $N = \{v_0\} \oplus N'$. Let $A = \partial B_\delta \cap N$, $Q = \bar B_\delta \cap N$, and
\[ B = \{w \in M : \|w\| \le R\} \cup \{w + sv_0 : w \in M,\ s \ge 0,\ \|w + sv_0\| = R\} \]
where $0 < \delta < R$. Then $A$ links $B$. To see this, write $u = w + v' + sv_0$, $w \in M$, $v' \in N'$, $s \in \mathbb{R}$ and take
\[ F(u) = \left( cR - \max\{ c\|w + sv_0\|,\ |cR - s| \} \right) v_0 + v' \]
where $c = \delta/(R - \delta)$. Then $F$ is the identity operator on $Q$, and $F^{-1}(0) = B$. Apply Theorem 7.

Some Applications

Many elliptic semilinear problems can be described in the following way. Let $\Omega$ be a domain in $\mathbb{R}^n$, and let $A$ be a self-adjoint operator on $L^2(\Omega)$. We assume that $A \ge \lambda_0 > 0$ and that
\[ C_0^\infty(\Omega) \subset D := D(A^{1/2}) \subset H^{m,2}(\Omega) \qquad [33] \]
for some $m > 0$, where $C_0^\infty(\Omega)$ denotes the set of test functions in $\Omega$ (i.e., infinitely differentiable functions with compact supports in $\Omega$), and $H^{m,2}(\Omega)$ denotes the Sobolev space. If $m$ is an integer, the norm in $H^{m,2}(\Omega)$ is given by
\[ \|u\|_{m,2} := \left( \sum_{|\mu| \le m} \|D^\mu u\|^2 \right)^{1/2} \qquad [34] \]
Here $D^\mu$ represents the generic derivative of order $|\mu|$ and the norm on the right-hand side of [34] is that of $L^2(\Omega)$. We shall not assume that $m$ is an integer.

Let $q$ be any number satisfying
\[ 2 \le q \le 2n/(n - 2m), \quad 2m < n \]
\[ 2 \le q < \infty, \quad n \le 2m \]
and let $f(x, t)$ be a continuous function on $\Omega \times \mathbb{R}$. We make the following assumptions.

Assumption A The function $f(x, t)$ satisfies
\[ |f(x, t)| \le V_0(x)^q |t|^{q-1} + V_0(x) W_0(x) \qquad [35] \]
and
\[ f(x, t)/V_0(x)^q = o(|t|^{q-1}) \quad \text{as } |t| \to \infty \qquad [36] \]
where $V_0(x) > 0$ is a function in $L^q(\Omega)$ such that
\[ \|V_0 u\|_q \le C \|u\|_D, \quad u \in D \qquad [37] \]

and $W_0$ is a function in $L^{q'}(\Omega)$. Here
\[ \|u\|_q := \left( \int_\Omega |u(x)|^q\, dx \right)^{1/q} \qquad [38] \]
\[ \|u\|_D := \|A^{1/2} u\| \qquad [39] \]
and $q' = q/(q - 1)$. With the norm [39], $D$ becomes a Hilbert space. Define $G$ and $F$ by [8] and [2]. It follows that $G$ is a continuously differentiable functional on the whole of $D$.

We assume further that
\[ H(x, t) = 2F(x, t) - t f(x, t) \le W_1(x) \in L^1(\Omega), \quad x \in \Omega,\ t \in \mathbb{R} \qquad [40] \]
and
\[ H(x, t) \to -\infty \quad \text{a.e. as } |t| \to \infty \qquad [41] \]
Moreover, we assume that there are functions $V(x), W(x) \in L^2(\Omega)$ such that multiplication by $V(x)$ is a compact operator from $D$ to $L^2(\Omega)$ and
\[ F(x, t) \le C\left( V(x)^2 |t|^2 + V(x) W(x) |t| \right) \qquad [42] \]
We wish to obtain a solution of
\[ Au = f(x, u), \quad u \in D \qquad [43] \]
By a solution of [43] we shall mean a function $u \in D$ such that
\[ (u, v)_D = (f(\cdot, u), v), \quad v \in D \qquad [44] \]
If $f(x, u)$ is in $L^2(\Omega)$, then a solution of [44] is in $D(A)$ and solves [43] in the classical sense. Otherwise we call it a weak or semistrong solution. We have

Theorem 8 Let $A$ be a self-adjoint operator in $L^2(\Omega)$ such that $A \ge \lambda_0 > 0$ and [33] holds for some $m > 0$. Assume that $\lambda_0$ is an eigenvalue of $A$ with eigenfunction $\varphi_0$. Assume also
\[ 2F(x, t) \le \lambda_0 t^2, \quad |t| \le \delta \ \text{for some } \delta > 0 \qquad [45] \]
and
\[ 2F(x, t) \ge \lambda_0 t^2 - W_0(x), \quad t > 0,\ x \in \Omega \qquad [46] \]
where $W_0 \in L^1(\Omega)$. Assume that $f(x, t)$ satisfies [35], [36], [40], [41], and [42]. Then [43] has a solution $u \ne 0$.

Proof. Under the hypotheses of the theorem, it is known that the following alternative holds: either

(i) there is an infinite number of $y(x) \in D(A) \setminus \{0\}$ such that
\[ Ay = f(x, y) = \lambda_0 y \qquad [47] \]
or
(ii) for each $\rho > 0$ sufficiently small, there is an $\varepsilon > 0$ such that
\[ G(u) \ge \varepsilon, \quad \|u\|_D = \rho \qquad [48] \]

We may assume that option (ii) holds, for otherwise we are done. By [46] we have
\[ G(R\varphi_0) \le R^2 \left( \|\varphi_0\|_D^2 - \lambda_0 \|\varphi_0\|^2 \right) + \int_\Omega W_0(x)\, dx = \int_\Omega W_0(x)\, dx \]
By Theorem 6, there is a sequence satisfying [28]. Taking $\psi(r) = 1/(r + 1)$, we conclude that there is a sequence $\{u_k\} \subset D$ such that
\[ G(u_k) \to c, \quad m_0 \le c \le m_1, \quad (1 + \|u_k\|_D)\, G'(u_k) \to 0 \qquad [49] \]
In particular, we have
\[ \|u_k\|_D^2 - 2 \int_\Omega F(x, u_k)\, dx \to c \qquad [50] \]
and
\[ \|u_k\|_D^2 - (f(\cdot, u_k), u_k) \to 0 \qquad [51] \]
Consequently,
\[ \int_\Omega H(x, u_k)\, dx \to -c \qquad [52] \]
These imply
\[ \int_\Omega H(x, u_k)\, dx \ge -K \qquad [53] \]
If $\rho_k = \|u_k\|_D \to \infty$, let $\tilde u_k = u_k/\rho_k$. Then $\|\tilde u_k\|_D = 1$. Consequently, there is a renamed subsequence such that $\tilde u_k \to \tilde u$ weakly in $D$, strongly in $L^2(\Omega)$, and a.e. in $\Omega$. We have from [42]
\[ 1 \le (m_1 + \delta)/\rho_k^2 + 2C \int_\Omega \left\{ V(x)^2 \tilde u_k^2 + V(x) W(x) |\tilde u_k|\, \rho_k^{-1} \right\} dx \]
Consequently,
\[ 1 \le 2C \int_\Omega V(x)^2 \tilde u^2\, dx \qquad [54] \]
This shows that $\tilde u \not\equiv 0$. Let $\Omega_0$ be the subset of $\Omega$ on which $\tilde u \ne 0$. Then
\[ |u_k(x)| = \rho_k |\tilde u_k(x)| \to \infty, \quad x \in \Omega_0 \qquad [55] \]

If $\Omega_1 = \Omega \setminus \Omega_0$, then we have
\[ \int_\Omega H(x, u_k)\, dx = \int_{\Omega_0} + \int_{\Omega_1} \le \int_{\Omega_0} H(x, u_k)\, dx + \int_{\Omega_1} W_1(x)\, dx \to -\infty \qquad [56] \]
This contradicts [53], and we see that $\rho_k = \|u_k\|_D$ is bounded. Once we know that the $\rho_k$ are bounded, we can apply well-known theorems to obtain the desired conclusion. □

Remark 1 It should be noted that the crucial element in the proof of Theorem 8 was [51]. If we had been dealing with an ordinary Palais–Smale sequence, we could only conclude that
\[ \|u_k\|_D^2 - (f(\cdot, u_k), u_k) = o(\rho_k) \]
which would imply only
\[ \int_\Omega H(x, u_k)\, dx = o(\rho_k) \]
This would not contradict [56], and the argument would not go through.

As another application, we wish to solve
\[ -x''(t) = \nabla_x V(t, x(t)) \qquad [57] \]
where
\[ x(t) = (x_1(t), \dots, x_n(t)) \qquad [58] \]
is a map from $I = [0, 2\pi]$ to $\mathbb{R}^n$ such that each component $x_j(t)$ is a periodic function in $H^1$ with period $2\pi$, and the function
\[ V(t, x) = V(t, x_1, \dots, x_n) \]
is continuous from $\mathbb{R}^{n+1}$ to $\mathbb{R}$ with a gradient
\[ \nabla_x V(t, x) = (\partial V/\partial x_1, \dots, \partial V/\partial x_n) \in C(\mathbb{R}^{n+1}, \mathbb{R}^n) \qquad [59] \]
For each $x \in \mathbb{R}^n$, the function $V(t, x)$ is periodic in $t$ with period $2\pi$. We shall study this problem under the following assumptions:

1. $0 \le V(t, x) \le C(|x|^2 + 1)$, $t \in I$, $x \in \mathbb{R}^n$.
2. There are constants $m > 0$, $\mu \le 3m^2/2\pi^2$ such that
\[ V(t, x) \le \mu, \quad |x| \le m,\ t \in I,\ x \in \mathbb{R}^n \]
3. There are constants $\nu > 1/2$ and $C$ such that
\[ V(t, x) \ge \nu |x|^2 \]
when $|x| > C$, $t \in I$, $x \in \mathbb{R}^n$.
4. The function given by
\[ H(t, x) = 2V(t, x) - \nabla_x V(t, x) \cdot x \qquad [60] \]
satisfies
\[ H(t, x) \le W(t) \in L^1(I), \quad |x| \ge C,\ t \in I,\ x \in \mathbb{R}^n \qquad [61] \]
and
\[ H(t, x) \to -\infty \quad \text{as } |x| \to \infty \qquad [62] \]

We have

Theorem 9 Under the above hypotheses, the system [57] has a nonconstant solution.

Proof. Let $X$ be the set of vector functions $x(t)$ described above. It is a Hilbert space with norm satisfying
\[ \|x\|_X^2 = \sum_{j=1}^n \|x_j\|_{H^1}^2 \]
We also write
\[ \|x\|^2 = \sum_{j=1}^n \|x_j\|^2 \]
where $\|\cdot\|$ is the $L^2(I)$ norm. Let
\[ N = \{x(t) \in X : x_j(t) \equiv \text{constant},\ 1 \le j \le n\} \]
and $M = N^\perp$. The dimension of $N$ is $n$, and $X = M \oplus N$. The following is easily proved.

Lemma 1 If $x \in M$, then
\[ \|x\|_\infty^2 \le \frac{\pi}{6} \|x'\|^2 \]
and
\[ \|x\| \le \|x'\| \]

We define
\[ G(x) = \|x'\|^2 - 2 \int_I V(t, x(t))\, dt, \quad x \in X \qquad [63] \]
For each $x \in X$ write $x = v + w$, where $v \in N$, $w \in M$. For convenience, we shall use the following equivalent norm for $X$:
\[ \|x\|_X^2 = \|w'\|^2 + \|v\|^2 \]
If $x \in M$ and
\[ \|x'\|^2 = \rho^2 = \frac{6}{\pi} m^2 \]
then Lemma 1 implies that $\|x\|_\infty \le m$, and we have by Hypothesis 2 that $V(t, x) \le \mu$.
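As a numerical aside on Lemma 1, the two inequalities can be sanity-checked on random trigonometric polynomials. The sketch below is an illustration under stated assumptions, not part of the proof: it treats the scalar case $n = 1$, represents an element of $M$ as a mean-zero sum $x(t) = \sum_{k=1}^{K} (a_k \cos kt + b_k \sin kt)$, computes the $L^2(I)$ norms exactly from the coefficients, and estimates $\|x\|_\infty$ on a sample grid (which can only underestimate the true supremum). The function names, the number of modes, and the random seed are hypothetical choices.

```python
import math
import random

def lemma1_quantities(a, b, samples=2000):
    """For x(t) = sum_k a[k-1]*cos(k t) + b[k-1]*sin(k t), return
    (grid estimate of sup |x|^2, ||x||^2, ||x'||^2) with L^2(0, 2*pi) norms."""
    K = len(a)
    # Parseval: cos(k t) and sin(k t) each have squared L^2 norm pi on [0, 2*pi].
    l2_sq = math.pi * sum(a[k] ** 2 + b[k] ** 2 for k in range(K))
    deriv_sq = math.pi * sum((k + 1) ** 2 * (a[k] ** 2 + b[k] ** 2) for k in range(K))
    sup_sq = 0.0
    for i in range(samples):
        t = 2.0 * math.pi * i / samples
        x = sum(a[k] * math.cos((k + 1) * t) + b[k] * math.sin((k + 1) * t)
                for k in range(K))
        sup_sq = max(sup_sq, x * x)
    return sup_sq, l2_sq, deriv_sq

def check_lemma1(trials=20, modes=6, seed=0):
    """Return True if both inequalities of Lemma 1 hold on every random sample."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-1.0, 1.0) for _ in range(modes)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(modes)]
        sup_sq, l2_sq, deriv_sq = lemma1_quantities(a, b)
        if sup_sq > math.pi / 6.0 * deriv_sq + 1e-12:   # ||x||_inf^2 <= (pi/6) ||x'||^2
            return False
        if l2_sq > deriv_sq + 1e-12:                    # ||x||^2 <= ||x'||^2
            return False
    return True

print(check_lemma1())
```

A check of this kind cannot prove the lemma, but it is a quick guard against misreading the constant $\pi/6$, which is what makes $\|x'\|^2 = 6m^2/\pi$ the right normalization in the step above.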

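Hence,
\[ G(x) \ge \|x'\|^2 - 2 \int_{|x| < m} \mu\, dt \ge \rho^2 - 2(2\pi)\mu \ge 0 \qquad [64] \]
Note that Hypothesis 3 is equivalent to
\[ V(t, x) \ge \nu |x|^2 - C, \quad t \in I,\ x \in \mathbb{R}^n \qquad [65] \]
for some constant $C$. Next, let
\[ y(t) = v + s w_0 \]
where $v \in N$, $s \ge 0$, and
\[ w_0 = (\sin t, 0, \dots, 0) \]
Then $w_0 \in M$, and
\[ \|w_0\|^2 = \|w_0'\|^2 = \pi \]
Note that
\[ \|y\|^2 = \|v\|^2 + s^2 \pi = 2\pi |v|^2 + \pi s^2 \]
Consequently,
\[ \begin{aligned} G(y) &= s^2 \|w_0'\|^2 - 2 \int_I V(t, y(t))\, dt \\ &\le \pi s^2 - 2\nu \int_I |y(t)|^2\, dt + 2\pi C \\ &= \pi s^2 - 2\nu \left( \|v\|^2 + \pi s^2 \right) + 2\pi C \\ &= \pi (1 - 2\nu) s^2 - 4\pi\nu |v|^2 + 2\pi C \\ &\to -\infty \quad \text{as } s^2 + |v|^2 \to \infty \end{aligned} \]
We also note that Hypothesis 1 implies
\[ G(v) \le 0, \quad v \in N \qquad [66] \]
Take
\[ A = \{v \in N : \|v\| \le R\} \cup \{sw_0 + v : v \in N,\ s \ge 0,\ \|sw_0 + v\|_X = R\} \]
\[ B = \partial B_\rho \cap M, \quad 0 < \rho < R, \quad \rho^2 = 6m^2/\pi \]
where
\[ B_\rho = \{x \in X : \|x\|_X < \rho\} \]
By Example 2, $A$ links $B$. Moreover, if $R$ is sufficiently large,
\[ \sup_A G = 0 \le \inf_B G \qquad [67] \]
Hence, we may conclude that there is a sequence $\{x^{(k)}\} \subset X$ such that
\[ G(x^{(k)}) \to c \ge 0, \quad (1 + \|x^{(k)}\|_X)\, G'(x^{(k)}) \to 0 \]
Hence,
\[ G(x^{(k)}) = \|[x^{(k)}]'\|^2 - 2 \int_I V(t, x^{(k)}(t))\, dt \to c \ge 0 \qquad [68] \]
\[ (G'(x^{(k)}), z)/2 = ([x^{(k)}]', z') - \int_I \nabla_x V(t, x^{(k)}) \cdot z(t)\, dt \to 0, \quad z \in X \qquad [69] \]
and
\[ (G'(x^{(k)}), x^{(k)})/2 = \|[x^{(k)}]'\|^2 - \int_I \nabla_x V(t, x^{(k)}) \cdot x^{(k)}\, dt \to 0 \qquad [70] \]
If
\[ \rho_k = \|x^{(k)}\|_X \le C \]
then there is a renamed subsequence such that $x^{(k)}$ converges to a limit $x \in X$ weakly in $X$ and uniformly on $I$. From [69] we see that
\[ (G'(x), z)/2 = (x', z') - \int_I \nabla_x V(t, x(t)) \cdot z(t)\, dt = 0, \quad z \in X \]
from which we conclude easily that $x$ is a solution of [57]. From [68], we see that
\[ G(x) \ge c \ge 0 \]
showing that $x(t)$ is not a constant. For if $c > 0$ and $x \in N$, then
\[ G(x) = -2 \int_I V(t, x(t))\, dt \le 0 \]
If $c = 0$, we see that $x \in B$ by Theorem 6. Hence, $x \in M$. If
\[ \rho_k = \|x^{(k)}\|_X \to \infty \]
let $\tilde x^{(k)} = x^{(k)}/\rho_k$. Then, $\|\tilde x^{(k)}\|_X = 1$. Let $\tilde x^{(k)} = \tilde w^{(k)} + \tilde v^{(k)}$, where $\tilde w^{(k)} \in M$ and $\tilde v^{(k)} \in N$. There is a renamed subsequence such that $\tilde x^{(k)}$ converges uniformly in $I$ to a limit $\tilde x$ and $\|[\tilde x^{(k)}]'\| \to r$ and $\|\tilde x^{(k)}\| \to \tau$, where $r^2 + \tau^2 = 1$. From [68] and [70], we obtain
\[ \|[\tilde x^{(k)}]'\|^2 - 2 \int_I V(t, x^{(k)}(t))\, dt / \rho_k^2 \to 0 \]
and
\[ \|[\tilde x^{(k)}]'\|^2 - \int_I \nabla_x V(t, x^{(k)}) \cdot x^{(k)}\, dt / \rho_k^2 \to 0 \]
Thus,
\[ 2 \int_I V(t, x^{(k)}(t))\, dt / \rho_k^2 \to r^2 \qquad [71] \]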

and
\[ \int_I \nabla_x V(t, x^{(k)}) \cdot x^{(k)}\, dt / \rho_k^2 \to r^2 \qquad [72] \]
Hence,
\[ \int_I H(t, x^{(k)}(t))\, dt / \rho_k^2 \to 0 \qquad [73] \]
By Hypothesis 3, the left-hand side of [71] is
\[ \ge 2\nu \|\tilde x^{(k)}\|^2 - 4\pi C/\rho_k^2 \]
Thus,
\[ r^2 \ge 2\nu\tau^2 = 2\nu(1 - r^2) \]
showing that $r > 0$. Hence, $\tilde x(t) \not\equiv 0$. Let $\Omega_0 \subset I$ be the set on which $\tilde x(t) \ne 0$. The measure of $\Omega_0$ is positive. Thus, $|x^{(k)}(t)| \to \infty$ as $k \to \infty$ for $t \in \Omega_0$. Hence,
\[ \int_I H(t, x^{(k)}(t))\, dt \le \int_{\Omega_0} H(t, x^{(k)}(t))\, dt + \int_{I \setminus \Omega_0} W(t)\, dt \to -\infty \]
contrary to Hypothesis 4. Thus, the $\rho_k$ are bounded, and the proof is complete. □

Superlinear Problems

Consider the problem
\[ -\Delta u = f(x, u), \quad x \in \Omega; \qquad u = 0 \ \text{on } \partial\Omega \qquad [74] \]
where $\Omega \subset \mathbb{R}^n$ is a bounded domain whose boundary is a smooth manifold, and $f(x, t)$ is a continuous function on $\bar\Omega \times \mathbb{R}$. This semilinear Dirichlet problem has been studied by many authors. It is called "sublinear" if there is a constant $C$ such that
\[ |f(x, t)| \le C(|t| + 1), \quad x \in \Omega,\ t \in \mathbb{R} \]
Otherwise, it is called "superlinear". Assume

(a1) There are constants $c_1, c_2 \ge 0$ such that
\[ |f(x, t)| \le c_1 + c_2 |t|^s \]
where $0 \le s < (n + 2)/(n - 2)$ if $n > 2$.
(a2) $f(x, t) = o(|t|)$ as $t \to 0$.
(a3) Either
\[ F(x, t)/t^2 \to \infty \quad \text{as } t \to \infty \]
or
\[ F(x, t)/t^2 \to \infty \quad \text{as } t \to -\infty. \]

We have

Theorem 10 Under hypotheses (a1)–(a3) the boundary-value problem
\[ -\Delta u = \beta f(x, u), \quad x \in \Omega; \qquad u = 0 \ \text{on } \partial\Omega \qquad [75] \]
has a nontrivial solution for almost every positive $\beta$.

Unfortunately, this theorem does not give any information for any specific $\beta$. It still leaves open the problem of solving [74]. For this purpose, we add the assumption

(a4) There are constants $\mu > 2$, $r \ge 0$ such that
\[ \mu F(x, t) - t f(x, t) \le C(t^2 + 1), \quad |t| \ge r \qquad [76] \]

We have

Theorem 11 Under hypotheses (a1)–(a4) problem [74] has a nontrivial solution.

We also have

Theorem 12 If we replace hypothesis (a4) with
(a4') The function $H(x, t)$ is convex in $t$,
then the problem [74] has at least one nontrivial solution.

Weak Linking

It is not clear if it is possible for $A$ to link $B$ if neither is contained in a finite-dimensional manifold. For instance, if $E = M \oplus N$, where $M$, $N$ are closed infinite-dimensional subspaces of $E$ and $B_R$ is the ball centered at the origin of radius $R$ in $E$, it is unknown if the set $A = M \cap \partial B_R$ links $B = N$. (If either $M$ or $N$ is finite dimensional, then $A$ does link $B$.) Unfortunately, this is the situation which arises in some important applications including Hamiltonian systems, the wave equation and elliptic systems, to name a few.

We now consider linking when both $M$ and $N$ are infinite dimensional and $G'$ has some additional continuity property. A property that is very useful is that of weak-to-weak continuity:
\[ u_k \to u \ \text{weakly in } E \implies G'(u_k) \to G'(u) \ \text{weakly} \qquad [77] \]
We make the following definition:

Definition 3 A subset $A$ of a Banach space $E$ links a subset $B$ of $E$ "weakly" if for every $G \in C^1(E, \mathbb{R})$ satisfying [77] and
\[ a_0 := \sup_A G \le b_0 := \inf_B G \qquad [78] \]
there is a sequence $\{u_k\} \subset E$ and a constant $c$ such that
\[ b_0 \le c < \infty \qquad [79] \]

and
\[ G(u_k) \to c, \quad G'(u_k) \to 0 \qquad [80] \]

We have the following counterpart of Theorem 7.

Theorem 13 Let $E$ be a separable Hilbert space, and let $G$ be a continuous functional on $E$ with a continuous derivative satisfying [77]. Let $N$ be a closed subspace of $E$, and let $Q$ be a bounded open subset of $N$ containing the point $p$. Let $F$ be a continuous map of $E$ onto $N$ such that
(i) $F|_Q = I$, and
(ii) For each finite-dimensional subspace $S \ne \{0\}$ of $E$ containing $p$, there is a finite-dimensional subspace $S_0 \ne \{0\}$ of $N$ containing $p$ such that
\[ v \in \bar Q \cap S_0,\ w \in S \implies F(v + w) \in S_0 \qquad [81] \]
Set $A = \partial Q$, $B = F^{-1}(p)$. If
\[ a_1 = \sup_{\bar Q} G < \infty \qquad [82] \]
and [22] holds, then there is a sequence $\{u_k\} \subset E$ such that [24] holds with $a \le a_1$.

Theorem 13 states that if $Q$, $F$, $p$ satisfy the hypotheses of that theorem, then $A = \partial Q$ links $B = F^{-1}(p)$ weakly. It follows from this theorem that all sets $A$, $B$ known to link when one of the subspaces $M$, $N$ is finite dimensional will link weakly even when $M$, $N$ are both infinite dimensional.

Now we give some applications of Theorem 13 to semilinear boundary-value problems. Let $\Omega$ be a domain in $\mathbb{R}^n$ and let $A$ be a self-adjoint operator in $L^2(\Omega)$ having 0 in its resolvent set (thus, there is an interval $(a, b)$ in its resolvent set satisfying $a < 0 < b$). Let $f(x, t)$ be a continuous function on $\Omega \times \mathbb{R}$ such that
\[ |f(x, t)| \le V(x)^2 |t| + W(x) V(x) \qquad [83] \]
$x \in \Omega$, $t \in \mathbb{R}$, and
\[ f(x, t)/t \to \alpha_\pm(x) \quad \text{as } t \to \pm\infty \qquad [84] \]
where $V, W \in L^2(\Omega)$, and multiplication by $V(x) > 0$ is a compact operator from $D = D(|A|^{1/2})$ to $L^2(\Omega)$. Let
\[ M = \int_b^\infty dE(\lambda) D, \qquad N = \int_{-\infty}^a dE(\lambda) D \]
where $\{E(\lambda)\}$ is the spectral measure of $A$. Then $M$, $N$ are invariant subspaces for $A$ and $D = M \oplus N$. If
\[ \alpha(u, v) = \int_\Omega (\alpha_+ u^+ - \alpha_- u^-)\, v\, dx \qquad [85] \]
and $\alpha(u) = \alpha(u, u)$, then we assume that
\[ \alpha(v) \ge (Av, v), \quad v \in N \qquad [86] \]
\[ (Aw, w) \ge \alpha(w), \quad w \in M \qquad [87] \]
We also assume that the only solution of
\[ Au = \alpha_+ u^+ - \alpha_- u^- \qquad [88] \]
is $u \equiv 0$, where $u^\pm = \max\{\pm u, 0\}$. We have

Theorem 14 Under the above hypotheses there is at least one solution of
\[ Au = f(x, u), \quad u \in D(A) \qquad [89] \]

Next, we consider an application concerning radially symmetric solutions for the problem
\[ u_{tt} - \Delta u = f(t, x, u), \quad t \in \mathbb{R},\ x \in B_R \qquad [90] \]
\[ u(t, x) = 0, \quad t \in \mathbb{R},\ x \in \partial B_R \qquad [91] \]
\[ u(t + T, x) = u(t, x), \quad t \in \mathbb{R},\ x \in B_R \qquad [92] \]
where $B_R = \{x \in \mathbb{R}^n : |x| < R\}$. We assume that the ratio $R/T$ is rational. Let
\[ 8R/T = a/b \qquad [93] \]
where $a$, $b$ are relatively prime positive integers. It can be shown that
\[ n \not\equiv 3 \pmod{(4, a)} \qquad [94] \]
implies that the linear problem corresponding to [90]–[92] has no essential spectrum. If
\[ n \equiv 3 \pmod{(4, a)} \qquad [95] \]
then the essential spectrum of the linear operator consists of precisely one point
\[ \lambda_0 = -(n - 3)(n - 1)/4R^2 \qquad [96] \]
Consider the case
\[ f(t, r, s) = \mu s + p(t, r, s) \qquad [97] \]
where $\mu$ is a point in the resolvent set, $r = |x|$, and
\[ |p(t, r, s)| \le C(|s|^\theta + 1), \quad s \in \mathbb{R} \qquad [98] \]
for some number $\theta < 1$. We then have

Theorem 15 If [94] holds, then [90]–[92] have a weak rotationally invariant solution. If [95] holds and $\lambda_0 < \mu$, assume in addition that $p(t, r, s)$ is nondecreasing in $s$. If $\mu < \lambda_0$, assume that $p(t, r, s)$ is nonincreasing in $s$. Then [90]–[92] have a weak rotationally invariant solution.
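The arithmetic in [93]–[96] can be sketched in a few lines. The snippet below is an illustration under two assumptions that are not stated explicitly in the text: that "(4, a)" in [94] and [95] denotes the greatest common divisor of 4 and $a$, and that the single point of essential spectrum in [96] carries a minus sign, $\lambda_0 = -(n-3)(n-1)/4R^2$. The function name is hypothetical, and $R$ and $T$ are expected as integers or exact fractions so that the reduced form $a/b$ of $8R/T$ is well defined.

```python
from fractions import Fraction
from math import gcd

def essential_spectrum_point(n, R, T):
    """Return lambda_0 from [96] when n = 3 (mod gcd(4, a)) as in [95],
    or None when [94] holds (no essential spectrum).

    Assumption: '(4, a)' is read as gcd(4, a), where 8R/T = a/b in lowest terms.
    """
    R = Fraction(R)
    T = Fraction(T)
    ratio = 8 * R / T            # Fraction reduces to lowest terms, giving a/b as in [93]
    a = ratio.numerator
    g = gcd(4, a)
    if (n - 3) % g != 0:         # condition [94]
        return None
    return -Fraction((n - 3) * (n - 1)) / (4 * R * R)   # formula [96]

# Example: R = 1, T = 2 gives 8R/T = 4, so a = 4 and gcd(4, a) = 4.
for n in (3, 5, 7):
    print(n, essential_spectrum_point(n, 1, 2))
```

When $a$ is odd, $\gcd(4, a) = 1$ and condition [95] holds for every $n$, so under this reading the monotonicity hypotheses of Theorem 15 on $p$ are then always the relevant ones.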

See also: Combinatorics: Overview; Homoclinic Phenomena; Ljusternik–Schnirelman Theory; Minimax Principle in the Calculus of Variations.

Further Reading

Ambrosetti A and Prodi G (1993) A Primer of Nonlinear Analysis. Cambridge Studies in Advanced Mathematics 34. Cambridge: Cambridge University Press.
Chang KC (1993) Infinite Dimensional Morse Theory and Multiple Solution Problems. Boston: Birkhäuser.
Ekeland I and Temam R (1976) Convex Analysis and Variational Problems. Amsterdam: North-Holland.
Ghoussoub N (1993) Duality and Perturbation Methods in Critical Point Theory. Cambridge: Cambridge University Press.
Mawhin J and Willem M (1989) Critical Point Theory and Hamiltonian Systems. Berlin: Springer.
Rabinowitz PH (1986) Minimax Methods in Critical Point Theory with Applications to Differential Equations, Conf. Board of Math. Sci. Reg. Conf. Ser. in Math. No. 65. Providence, RI: American Mathematical Society.
Schechter M (1986) Spectra of Partial Differential Operators, 2nd edn. Amsterdam: North-Holland.
Schechter M (1999) Linking Methods in Critical Point Theory. Boston: Birkhäuser.
Struwe M (1996) Variational Methods. Berlin: Springer.
Willem M (1996) Minimax Theorems. Boston: Birkhäuser.

Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools
D Buchholz, Universität Göttingen, Göttingen, Germany
S J Summers, University of Florida, Gainesville, FL, USA
© 2006 Elsevier Ltd. All rights reserved.

Physical Motivation and Mathematical Setting

The primary connection of relativistic quantum field theory to experimental physics is through scattering theory, that is, the theory of the collision of elementary (or compound) particles. It is therefore a central topic in quantum field theory and has attracted the attention of leading mathematical physicists. Although a great deal of progress has been made in the mathematically rigorous understanding of the subject, there are important matters which are still unclear, some of which will be indicated below.

In the paradigmatic scattering experiment, several particles, which are initially sufficiently distant from each other that the idealization that they are not mutually interacting is physically reasonable, approach each other and interact (collide) in a region of microscopic extent. The products of this collision then fly apart until they are sufficiently well separated that the approximation of noninteraction is again reasonable. The initial and final states of the objects in the scattering experiment are therefore to be modeled by states of noninteracting, that is, free, fields, which are mathematically represented on Fock space. Typically, what is measured in such experiments is the probability distribution (cross section) for the transitions from a specified state of the incoming particles to a specified state of the outgoing particles.

It should be mentioned that until the late 1950s, the scattering theory of relativistic quantum particles relied upon ideas from nonrelativistic quantum-mechanical scattering theory (interaction representation, adiabatic limit, etc.), which were invalid in the relativistic context. Only with the advent of axiomatic quantum field theory did it become possible to properly formulate the concepts and mathematical techniques which will be outlined here.

Scattering theory can be rigorously formulated either in the context of quantum fields satisfying the Wightman axioms (Streater and Wightman 1964) or in terms of local algebras satisfying the Haag–Kastler–Araki axioms (Haag 1992). In brief, the relation between these two settings may be described as follows: in the Wightman setting, the theory is formulated in terms of operator-valued distributions $\phi$ on Minkowski space, the quantum fields, which act on the physical state space. These fields, integrated with test functions $f$ having support in a given region $\mathcal{O}$ of spacetime (only four-dimensional Minkowski space $\mathbb{R}^4$ will be treated here), $\phi(f) = \int d^4x\, f(x)\phi(x)$, form under the operations of addition, multiplication, and Hermitian conjugation a polynomial $*$-algebra $\mathcal{P}(\mathcal{O})$ of unbounded operators. In the Haag–Kastler–Araki setting, one proceeds from these algebras to algebras $\mathcal{A}(\mathcal{O})$ of bounded operators which, roughly speaking, are formed by the bounded functions $A$ of the operators $\phi(f)$. This step requires some mathematical care, but these subtleties will not be discussed here. As the statements and proofs of the results in these two frameworks differ only in technical details, the theory is presented here in the more convenient setting of algebras of bounded operators ($C^*$-algebras).

Central to the theory is the notion of a particle, which, in fact, is a quite complex concept, the full nature of which is not completely understood, cf.

below. In order to maintain the focus on the above essential points, we consider in the subsequent sections primarily a single massive particle of integer spin $s$, that is, a boson. In standard scattering theory based upon Wigner’s characterization, this particle is simply identified with an irreducible unitary representation $U_1$ of the identity component $\mathcal{P}_+^\uparrow$ of the Poincaré group with spin $s$ and mass $m > 0$. The Hilbert space $\mathcal{H}_1$ upon which $U_1(\mathcal{P}_+^\uparrow)$ acts is called the one-particle space and determines the possible states of a single particle, alone in the universe. Assuming that configurations of several such particles do not interact, one can proceed by a standard construction to a Fock space describing freely propagating multiple particle states,

\[ \mathcal{H}_F = \bigoplus_{n \in \mathbb{N}_0} \mathcal{H}_n \]

where $\mathcal{H}_0 = \mathbb{C}$ and $\mathcal{H}_n$ is the $n$-fold symmetrized direct product of $\mathcal{H}_1$ with itself. This space is spanned by vectors $\Psi_1 \otimes_s \cdots \otimes_s \Psi_n$, where $\otimes_s$ denotes the symmetrized tensor product, representing an $n$-particle state wherein the $k$th particle is in the state $\Psi_k \in \mathcal{H}_1$, $k = 1, \dots, n$. The representation $U_1(\mathcal{P}_+^\uparrow)$ induces a unitary representation $U_F(\mathcal{P}_+^\uparrow)$ on $\mathcal{H}_F$ by

\[ U_F(\lambda)(\Psi_1 \otimes_s \cdots \otimes_s \Psi_n) \doteq U_1(\lambda)\Psi_1 \otimes_s \cdots \otimes_s U_1(\lambda)\Psi_n \qquad [1] \]

In interacting theories, the states in the corresponding physical Hilbert space $\mathcal{H}$ do not have such an a priori interpretation in physical terms, however. It is the primary goal of scattering theory to identify in $\mathcal{H}$ those vectors which describe, at asymptotic times, incoming, respectively, outgoing, configurations of freely moving particles. Mathematically, this amounts to the construction of certain specific isometries (generalized Møller operators), $\Omega^{in}$ and $\Omega^{out}$, mapping $\mathcal{H}_F$ onto subspaces $\mathcal{H}^{in} \subset \mathcal{H}$ and $\mathcal{H}^{out} \subset \mathcal{H}$, respectively, and intertwining the unitary actions of the Poincaré group on $\mathcal{H}_F$ and $\mathcal{H}$. The resulting vectors

\[ (\Psi_1 \otimes_s \cdots \otimes_s \Psi_n)^{in/out} \doteq \Omega^{in/out}(\Psi_1 \otimes_s \cdots \otimes_s \Psi_n) \in \mathcal{H} \qquad [2] \]

are interpreted as incoming and outgoing particle configurations in scattering processes wherein the $k$th particle is in the state $\Psi_k \in \mathcal{H}_1$.

If, in a theory, the equality $\mathcal{H}^{in} = \mathcal{H}^{out}$ holds, then every incoming scattering state evolves, after the collision processes at finite times, into an outgoing scattering state. It is then physically meaningful to define on this space of states the scattering matrix, setting $S = \Omega^{in}\,\Omega^{out\,*}$. Physical data such as collision cross sections can be derived from $S$ and the corresponding transition amplitudes $\langle (\Psi_1 \otimes_s \cdots \otimes_s \Psi_m)^{in}, (\Psi'_1 \otimes_s \cdots \otimes_s \Psi'_n)^{out} \rangle$, respectively, by a standard procedure. It should be noted, however, that neither the physically mandatory equality of state spaces nor the more stringent requirement that every state has an interpretation in terms of incoming and outgoing scattering states, that is, $\mathcal{H} = \mathcal{H}^{in} = \mathcal{H}^{out}$ (asymptotic completeness), has been fully established in any interacting relativistic field theoretic model so far. This intriguing problem will be touched upon in the last section of this article.

Before going into details, let us state the few physically motivated postulates entering into the analysis. As discussed, the point of departure is a family of algebras $\mathcal{A}(\mathcal{O})$, more precisely a net, associated with the open subregions $\mathcal{O}$ of Minkowski space and acting on $\mathcal{H}$. Restricting attention to the case of bosons, we may assume that this net is local in the sense that if $\mathcal{O}_1$ is spacelike separated from $\mathcal{O}_2$, then all elements of $\mathcal{A}(\mathcal{O}_1)$ commute with all elements of $\mathcal{A}(\mathcal{O}_2)$. (In the presence of fermions, these algebras contain also fermionic operators which anticommute.) This is the mathematical expression of the principle of Einstein causality. The unitary representation $U$ of $\mathcal{P}_+^\uparrow$ acting on $\mathcal{H}$ is assumed to satisfy the relativistic spectrum condition (positivity of energy in all Lorentz frames) and, in the sense of equality of sets, $U(\lambda)\mathcal{A}(\mathcal{O})U(\lambda)^{-1} = \mathcal{A}(\lambda\mathcal{O})$ for all $\lambda \in \mathcal{P}_+^\uparrow$ and regions $\mathcal{O}$, where $\lambda\mathcal{O}$ denotes the Poincaré transformed region. It is also assumed that the subspace of $U(\mathcal{P}_+^\uparrow)$-invariant vectors is spanned by a single unit vector $\Omega$, representing the vacuum, which has the Reeh–Schlieder property, that is, each set of vectors $\mathcal{A}(\mathcal{O})\Omega$ is dense in $\mathcal{H}$. These standing assumptions will subsequently be amended by further conditions concerning the particle content of the theory.

Haag–Ruelle Theory

Haag and Ruelle were the first to establish the existence of scattering states within this general framework (Jost 1965); further substantial improvements are due to Araki and Hepp (Araki 1999). In all of these investigations, the arguments were given for quantum field theories with associated particles (in the Wigner sense) which have strictly positive mass $m > 0$ and for which $m$ is an isolated eigenvalue of the mass operator (upper and lower mass gap). Moreover, it was assumed that states of a single particle can be created from the vacuum by local operations. In physical terms, these assumptions allow only for theories with short-range interactions and particles carrying strictly localizable charges.

In view of these limitations, Haag–Ruelle theory has been developed in a number of different directions. By now, the scattering theory of massive particles is under complete control, including also

particles carrying nonlocalizable (gauge or topological) charges and particles having exotic statistics (anyons, plektons) which can appear in theories in low spacetime dimensions. Due to constraints of space, these results must go without further mention; we refer the interested reader to the articles Buchholz and Fredenhagen (1982) and Fredenhagen et al. (1996). Theories of massless particles and of particles carrying charges of electric or magnetic type (infraparticles) will be discussed in subsequent sections.

We outline here a recent generalization of Haag–Ruelle scattering theory presented in Dybalski (2005), which covers massive particles with localizable charges without relying on any further constraints on the mass spectrum. In particular, the scattering of electrically neutral, stable particles fulfilling a sharp dispersion law in the presence of massless particles is included (e.g., neutral atoms in their ground states). Mathematically, this assumption can be expressed by the requirement that there exists a subspace $\mathcal{H}_1 \subset \mathcal{H}$ such that the restriction of $U(\mathcal{P}_+^\uparrow)$ to $\mathcal{H}_1$ is a representation of mass $m > 0$. We denote by $P_1$ the projection in $\mathcal{H}$ onto $\mathcal{H}_1$.

To establish notation, let $\mathcal{O}$ be a bounded spacetime region and let $A \in \mathcal{A}(\mathcal{O})$ be any operator such that $P_1 A \Omega \neq 0$. The existence of such localized (in brief, local) operators amounts to the assumption that the particle carries a localizable charge. That the particle is stable, that is, completely decouples from the underlying continuum states, can be cast into a condition first stated by Herbst: for all sufficiently small $\mu > 0$

\[ \| E_\mu (1 - P_1) A \Omega \| \le c\,\mu^{\eta} \qquad [3] \]

for some constants $c, \eta > 0$, where $E_\mu$ is the projection onto the spectral subspace of the mass operator corresponding to spectrum in the interval $(m - \mu, m + \mu)$. In the case originally considered by Haag and Ruelle, where $m$ is isolated from the rest of the mass spectrum, this condition is certainly satisfied.

Setting $A(x) \doteq U(x) A U(x)^{-1}$, where $U(x)$ is the unitary implementing the spacetime translation $x = (x_0, \mathbf{x})$ (the velocity of light and Planck’s constant are set equal to 1 in what follows), one puts, for $t \neq 0$,

\[ A_t(f) \doteq \int d^4x \, g_t(x_0)\, f_{x_0}(\mathbf{x})\, A(x) \qquad [4] \]

Here $x_0 \mapsto g_t(x_0) \doteq g((x_0 - t)/|t|^{\kappa})/|t|^{\kappa}$ induces a time averaging about $t$, $g$ being any test function which integrates to 1 and whose Fourier transform has compact support, and $1/(1 + \eta) < \kappa < 1$ with $\eta$ as above. The Fourier transform of $f_{x_0}$ is given by $\tilde{f}_{x_0}(\mathbf{p}) \doteq \tilde{f}(\mathbf{p})\, e^{-i x_0 \omega(\mathbf{p})}$, where $f$ is some test function on $\mathbb{R}^3$ with $\tilde{f}(\mathbf{p})$ having compact support, and $\omega(\mathbf{p}) = (\mathbf{p}^2 + m^2)^{1/2}$. Note that $(x_0, \mathbf{x}) \mapsto f_{x_0}(\mathbf{x})$ is a solution of the Klein–Gordon equation of mass $m$.

With these assumptions, it follows by a straightforward application of the harmonic analysis of unitary groups that in the sense of strong convergence $A_t(f)\Omega \to P_1 A(f)\Omega$ and $A_t(f)^*\Omega \to 0$ as $t \to \pm\infty$, where $A(f) = \int d^3x \, f(\mathbf{x}) A(0, \mathbf{x})$. Hence, the operators $A_t(f)$ may be thought of as creation operators and their adjoints as annihilation operators. These operators are the basic ingredients in the construction of scattering states. Choosing local operators $A_k$ as above and test functions $f^{(k)}$ with disjoint compact supports in momentum space, $k = 1, \dots, n$, the scattering states are obtained as limits of the Haag–Ruelle approximants

\[ A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega \qquad [5] \]

Roughly speaking, the operators $A_{kt}(f^{(k)})$ are localized in spacelike separated regions at asymptotic times $t$, due to the support properties of the Fourier transforms of the functions $f^{(k)}$. Hence they commute asymptotically because of locality and, by the clustering properties of the vacuum state, the above vector becomes a product state of single-particle states. In order to prove convergence, one proceeds, in analogy to Cook’s method in quantum-mechanical scattering theory, to the time derivatives,

\[ \partial_t\, A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega
 = \sum_{k \neq l} A_{1t}(f^{(1)}) \cdots [\partial_t A_{kt}(f^{(k)}), A_{lt}(f^{(l)})] \cdots A_{nt}(f^{(n)})\,\Omega
 + \sum_{k} A_{1t}(f^{(1)}) \cdots \overset{k}{\vee} \cdots A_{nt}(f^{(n)})\, \partial_t A_{kt}(f^{(k)})\,\Omega \qquad [6] \]

where $\overset{k}{\vee}$ denotes omission of $A_{kt}(f^{(k)})$. Employing techniques of Araki and Hepp, one can prove that the terms in the first summation on the right-hand side (RHS) of [6], involving commutators, decay rapidly in norm as $t$ approaches infinity because of locality, as indicated above. By applying condition [3] and the fact that the vectors $\partial_t A_{kt}(f^{(k)})\Omega$ do not have a component in the single-particle space $\mathcal{H}_1$, the terms in the second summation on the RHS of [6] can be shown to decay in norm like $|t|^{-\kappa(1+\eta)}$. Thus, the norm of the vector [6] is integrable in $t$, implying the existence of the strong limits

\[ \bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out}
 \doteq \lim_{t \to \mp\infty} A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega \qquad [7] \]
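The step from integrability of the norm of [6] to the existence of the limits [7] can be spelled out as a short Cook-type estimate. The following is a minimal sketch only: the shorthand $\Psi_t$ and the constant $C$ are notation introduced here for illustration, and the single power-law bound is an assumption that combines the two decay statements made for the RHS of [6].

```latex
% Shorthand introduced here: \Psi_t is the Haag-Ruelle approximant [5].
\[ \Psi_t := A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega . \]
% Assumed bound for |t| \ge 1 (C > 0 is a hypothetical constant); the rapidly
% decaying commutator terms are dominated by the power law of the second sum.
\[ \|\partial_t \Psi_t\| \le C\,|t|^{-\kappa(1+\eta)} . \]
% Fundamental theorem of calculus and the triangle inequality, for t_2 > t_1 \ge 1:
\[ \|\Psi_{t_2} - \Psi_{t_1}\|
   \le \int_{t_1}^{t_2} \|\partial_t \Psi_t\|\, dt
   \le \frac{C}{\kappa(1+\eta) - 1}\; t_1^{\,1 - \kappa(1+\eta)} . \]
% Because \kappa > 1/(1+\eta), the exponent 1 - \kappa(1+\eta) is negative,
% so the right-hand side tends to 0 as t_1 \to \infty and (\Psi_t) is Cauchy.
```

The same argument applies for $t \to -\infty$. It also indicates why the averaging exponent is restricted to $1/(1 + \eta) < \kappa$: this is exactly the condition $\kappa(1 + \eta) > 1$ under which the time integral converges.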

As indicated by the notation, these limits depend only on the single-particle vectors $P_1 A_k(f^{(k)})\Omega \in \mathcal{H}_1$, $k = 1, \dots, n$, but not on the specific choice of operators and test functions. In order to establish their Fock structure, one employs results on clustering properties of vacuum correlation functions in theories without strictly positive minimal mass. Using this, one can compute inner products of arbitrary asymptotic states and verify that the maps

\[ P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega
 \;\mapsto\; \bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out} \qquad [8] \]

extend by linearity to isomorphisms $\Omega^{in/out}$ from the Fock space $\mathcal{H}_F$ onto the subspaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ generated by the collision states. Moreover, the asymptotic states transform under the Poincaré transformations $U(\mathcal{P}_+^\uparrow)$ as

\[ U(\lambda)\bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out}
 = \bigl(U_1(\lambda) P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s U_1(\lambda) P_1 A_n(f^{(n)})\Omega\bigr)^{in/out} \qquad [9] \]

Thus, the isomorphisms $\Omega^{in/out}$ intertwine the action of the Poincaré group on $\mathcal{H}_F$ and $\mathcal{H}^{in/out}$. We summarize these results, which are vital for the physical interpretation of the underlying theory, in the following theorem.

Theorem 1 Consider a theory of a particle of mass $m > 0$ which satisfies the standing assumptions and the stability condition [3]. Then there exist canonical isometries $\Omega^{in/out}$, mapping the Fock space $\mathcal{H}_F$ based on the single-particle space $\mathcal{H}_1$ onto subspaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ of incoming and outgoing scattering states. Moreover, these isometries intertwine the action of the Poincaré transformations on the respective spaces.

Since the scattering states have been identified with Fock space, asymptotic creation and annihilation operators act on $\mathcal{H}^{in/out}$ in a natural manner. This point will be explained in the following section.

LSZ Formalism

Prior to the results of Haag and Ruelle, an axiomatic approach to scattering theory was developed by Lehmann, Symanzik, and Zimmermann (LSZ), based on time-ordered vacuum expectation values of quantum fields. The relative advantage of their approach with respect to Haag–Ruelle theory is that useful reduction formulas for the S-matrix greatly facilitate computations, in particular in perturbation theory. Moreover, these formulas are the starting point of general studies of the momentum space analyticity properties of the S-matrix (dispersion relations), as outlined in Dispersion Relations (cf. also Iagolnitzer (1993)). Within the present general setting, the LSZ method was established by Hepp.

For simplicity of discussion, we consider again a single particle type of mass $m > 0$ and integer spin $s$, subject to condition [3]. According to the results of the preceding section, one then can consistently define asymptotic creation operators on the scattering states, setting

\[ A(f)^{in/out}\bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out}
 \doteq \lim_{t \to \mp\infty} A_t(f)\bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out}
 = \bigl(P_1 A(f)\Omega \otimes_s P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out} \qquad [10] \]

Similarly, one obtains the corresponding asymptotic annihilation operators,

\[ A(f)^{in/out\,*}\bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out}
 = \lim_{t \to \mp\infty} A_t(f)^{*}\bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in/out} = 0 \qquad [11] \]

where the latter equality holds if the Fourier transforms of the functions $f, f^{(1)}, \dots, f^{(n)}$ have disjoint supports. We mention as an aside that, by replacing the time-averaging function $g$ in the definition of $A_t(f)$ by a delta function, the above formulas still hold. But the convergence is then to be understood in the weak Hilbert space topology. In this form, the above relations were anticipated by LSZ (asymptotic condition).

It is straightforward to proceed from these relations to reduction formulas. Let $B$ be any local operator. Then one has, in the sense of matrix elements between outgoing and incoming scattering states,

\[ B A(f)^{in} - A(f)^{out} B = \lim_{t \to \infty} \bigl(B A(f_{-t}) - A(f_t) B\bigr)
 = \lim_{t \to \infty} \left( \int d^4x \, f_{-t}(x)\, B A(x) - \int d^4x \, f_t(x)\, A(x) B \right) \qquad [12] \]

where $f_t(x) \doteq g_t(x_0)\, f_{x_0}(\mathbf{x})$. Because of the (essential) support properties of the functions $f_{\pm t}$, the contributions to the latter integrals arise, for asymptotic $t$, from spacetime points $x$ where the localization

regions of $A(x)$ and $B$ have a negative timelike (first term), respectively, positive timelike (second term) distance. One may therefore proceed from the products of these operators to the time-ordered products $T(BA(x))$, where $T(BA(x)) = A(x)B$ if the localization region of $A(x)$ lies in the future of that of $B$, and $T(BA(x)) = BA(x)$ if it lies in the past. It is noteworthy that a precise definition of the time ordering for finite $x$ is irrelevant in the present context – any reasonable interpolation between the above relations will do. Similarly, one can define time-ordered products for an arbitrary number of local operators. The preceding limit can then be recast into

\[ \lim_{t \to \infty} \int d^4x \, \bigl(f_{-t}(x) - f_t(x)\bigr)\, T(BA(x)) \qquad [13] \]

The latter expression has a particularly simple form in momentum space. Proceeding to the Fourier transforms of $f_{\pm t}$ and noticing that, in the limit of large $t$,

\[ \bigl(\tilde{f}_{-t}(p) - \tilde{f}_t(p)\bigr) / (p_0 - \omega(\mathbf{p})) \;\to\; -2\pi i\, \tilde{f}(\mathbf{p})\, \delta(p_0 - \omega(\mathbf{p})) \qquad [14] \]

one gets

\[ B A(f)^{in} - A(f)^{out} B = -2\pi i \int d^3p \; \tilde{f}(\mathbf{p})\, (p_0 - \omega(\mathbf{p}))\, T(B\tilde{A}(p)) \Big|_{p_0 = \omega(\mathbf{p})} \qquad [15] \]

Here $T(B\tilde{A}(p))$ denotes the Fourier transform of $T(BA(x))$, and it can be shown that the restriction of $(p_0 - \omega(\mathbf{p}))\, T(B\tilde{A}(p))$ to the manifold $\{p \in \mathbb{R}^4 : p_0 = \omega(\mathbf{p})\}$ (the “mass shell”) is meaningful in the sense of distributions on $\mathbb{R}^3$. By the same token, one obtains

\[ A(f)^{out\,*} B - B A(f)^{in\,*} = -2\pi i \int d^3p \; \overline{\tilde{f}(\mathbf{p})}\, (p_0 - \omega(\mathbf{p}))\, T(\tilde{A}^{*}(-p) B) \Big|_{p_0 = \omega(\mathbf{p})} \qquad [16] \]

Similar relations, involving an arbitrary number of asymptotic creation and annihilation operators, can be established by analogous considerations. Taking matrix elements of these relations in the vacuum state and recalling the action of the asymptotic creation and annihilation operators on scattering states, one arrives at the following result, which is central in all applications of scattering theory.

Theorem 2 Consider the theory of a particle of mass $m > 0$ subject to the conditions stated in the preceding sections and let $f^{(1)}, \dots, f^{(n)}$ be any family of test functions whose Fourier transforms have compact and nonoverlapping supports. Then

\[ \Bigl\langle \bigl(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_k(f^{(k)})\Omega\bigr)^{out},\;
 \bigl(P_1 A_{k+1}(f^{(k+1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega\bigr)^{in} \Bigr\rangle \]
\[ = (-2\pi i)^n \int \cdots \int d^3p_1 \cdots d^3p_n \;
 \overline{\tilde{f}^{(1)}(\mathbf{p}_1)} \cdots \overline{\tilde{f}^{(k)}(\mathbf{p}_k)}\;
 \tilde{f}^{(k+1)}(\mathbf{p}_{k+1}) \cdots \tilde{f}^{(n)}(\mathbf{p}_n) \]
\[ \times \prod_{i=1}^{n} (p_{i0} - \omega(\mathbf{p}_i)) \;
 \bigl\langle \Omega,\; T\bigl(\tilde{A}_1^{*}(-p_1) \cdots \tilde{A}_k^{*}(-p_k)\, \tilde{A}_{k+1}(p_{k+1}) \cdots \tilde{A}_n(p_n)\bigr)\, \Omega \bigr\rangle \Big|_{p_{j0} = \omega(\mathbf{p}_j),\; j = 1, \dots, n} \qquad [17] \]

in an obvious notation.

Thus, the kernels of the scattering amplitudes in momentum space are obtained by restricting the (by the factor $\prod_{i=1}^{n} (p_{i0} - \omega(\mathbf{p}_i))$) amputated Fourier transforms of the vacuum expectation values of the time-ordered products to the positive and negative mass shells, respectively. These are the famous LSZ reduction formulas, which provide a convenient link between the time-ordered (Green’s) functions of a theory and its asymptotic particle interpretation.

Asymptotic Particle Counters

The preceding construction of scattering states applies to a significant class of theories; but even if one restricts attention to the case of massive particles, it does not cover all situations of physical interest. For an essential input in the construction is the existence of local operators interpolating between the vacuum and the single-particle states. There may be no such operators at one’s disposal, however, either because the particle in question carries a nonlocalizable charge, or because the given family of operators is too small. The latter case appears, for example, in gauge theories, where in general only the observables are fixed by the principle of local gauge invariance, and the physical particle content as well as the corresponding interpolating operators are not known from the outset. As observables create from the vacuum only neutral states, the above construction of scattering states then fails if charged particles are present. Nevertheless, thinking in physical terms, one would expect that the observables contain all relevant information in order to determine the features of scattering states, in particular their collision cross section. That this is indeed the case was first shown by Araki and Haag (Araki 1999).

In scattering experiments, the measured data are provided by detectors (e.g., particle counters) and

coincidence arrangements of detectors. Essential features of detectors are their lack of response in the vacuum state and their macroscopic localization. Hence, within the present mathematical setting, a general detector is represented by a positive operator $C$ on the physical Hilbert space $\mathcal{H}$ such that $C\Omega = 0$. Because of the Reeh–Schlieder theorem, these conditions cannot be satisfied by local operators. However, they can be fulfilled by “almost-local” operators. Examples of such operators are easy to produce, putting $C = L^* L$ with

\[ L = \int d^4x \, f(x)\, A(x) \qquad [18] \]

where $A$ is any local operator and $f$ any test function whose Fourier transform has compact support in the complement of the closed forward light cone (and hence in the complement of the energy momentum spectrum of the theory). In view of the properties of $f$ and the invariance of $\Omega$ under translations, it follows that $C = L^* L$ annihilates the vacuum and can be approximated with arbitrary precision by local operators. The algebra generated by these operators $C$ will be denoted by $\mathcal{C}$.

When preparing a scattering experiment, the first thing one must do with a detector is to calibrate it, that is, test its response to sources of single-particle states. Within the mathematical setting, this amounts to computing the matrix elements of $C$ in states $\Psi \in \mathcal{H}_1$:

\[ \langle \Psi, C \Psi \rangle = \int\!\!\int d^3p \, d^3q \; \overline{\Psi(\mathbf{p})}\, \Psi(\mathbf{q})\, \langle \mathbf{p} | C | \mathbf{q} \rangle \qquad [19] \]

Here $\mathbf{p} \mapsto \Psi(\mathbf{p})$ is the momentum space wave function of $\Psi$, $\langle \cdot | C | \cdot \rangle$ is the kernel of $C$ in the single-particle space $\mathcal{H}_1$, and we have omitted (summations over) indices labeling internal degrees of freedom of the particle, if any. The relevant information about $C$ is encoded in its kernel. As a matter of fact, one only needs to know its restriction to the diagonal, $\mathbf{p} \mapsto \langle \mathbf{p} | C | \mathbf{p} \rangle$. It is called the sensitivity function of $C$ and can be shown to be regular under quite general circumstances (Araki 1999, Buchholz and Fredenhagen 1982).

Given a state $\Psi \in \mathcal{H}$ for which the expectation value $\langle \Psi, C(x) \Psi \rangle$ differs significantly from 0, one concludes that this state deviates from the vacuum in a region about $x$. For finite $x$, this does not mean, however, that $\Psi$ has a particle interpretation at $x$. For that spacetime point may, for example, be just the location of a collision center. Yet, if one proceeds to asymptotic times, one expects, in view of the spreading of wave packets, that the probability of finding two or more particles in the same spacetime region is dominated by the single-particle contributions. It is this physical insight which justifies the expectation that the detectors $C(x)$ become particle counters at asymptotic times. Accordingly, one considers for asymptotic $t$ the operators

\[ C_t(h) \doteq \int d^3x \; h(\mathbf{x}/t)\, C(t, \mathbf{x}) \qquad [20] \]

where $h$ is any test function on $\mathbb{R}^3$. The role of the integral is to sum up all single-particle contributions with velocities in the support of $h$ in order to compensate for the decreasing probability of finding such particles at asymptotic times $t$ about the localization center of the detector. That these ideas are consistent was demonstrated by Araki and Haag, who established the following result (Araki 1999).

Theorem 3 Consider, as before, the theory of a massive particle. Let $C^{(1)}, \dots, C^{(n)} \in \mathcal{C}$ be any family of detector operators and let $h^{(1)}, \dots, h^{(n)}$ be any family of test functions on $\mathbb{R}^3$. Then, for any state $\Psi^{out} \in \mathcal{H}^{out}$ of finite energy,

\[ \lim_{t \to \infty} \bigl\langle \Psi^{out},\, C_t^{(1)}(h^{(1)}) \cdots C_t^{(n)}(h^{(n)})\, \Psi^{out} \bigr\rangle
 = \int \cdots \int d^3p_1 \cdots d^3p_n \; \bigl\langle \Psi^{out},\, \rho^{out}(\mathbf{p}_1) \cdots \rho^{out}(\mathbf{p}_n)\, \Psi^{out} \bigr\rangle
 \prod_{k=1}^{n} h^{(k)}(\mathbf{p}_k / \omega(\mathbf{p}_k))\, \langle \mathbf{p}_k | C^{(k)} | \mathbf{p}_k \rangle \qquad [21] \]

where $\rho^{out}(\mathbf{p})$ is the momentum space density (the product of creation and annihilation operators) of outgoing particles of momentum $\mathbf{p}$, and (summations over) possible indices labeling internal degrees of freedom of the particle are omitted. An analogous relation holds for incoming scattering states at negative asymptotic times.

This result shows, first of all, that the scattering states have indeed the desired interpretation with regard to the observables, as anticipated in the preceding sections. Since the assertion holds for all scattering states of finite energy, one may replace in the above theorem the outgoing scattering states by any state of finite energy, if the theory is asymptotically complete, that is, $\mathcal{H} = \mathcal{H}^{in} = \mathcal{H}^{out}$. Then choosing, in particular, any incoming scattering state and making use of the arbitrariness of the test functions $h^{(k)}$ as well as the knowledge of the sensitivity functions of the detector operators, one can compute the probability distributions of outgoing particle momenta in this state, and thereby the corresponding collision cross sections.

The question of how to construct certain specific incoming scattering states by using only local observables was not settled by Araki and Haag,

however. A general method to that effect was outlined in Buchholz et al. (1991). As a matter of fact, for that method only the knowledge of states in the subspace of neutral states is required. Yet in this approach one would need for the computation of, say, elastic collision cross sections of charged particles the vacuum correlation functions involving at least eight local observables. This practical disadvantage of increased computational complexity of the method is offset by the conceptual advantage of making no appeal to quantities which are a priori nonobservable.

Massless Particles and Huygens’ Principle

The preceding general methods of scattering theory apply only to massive particles. Yet taking advantage of the salient fact that massless particles always move with the speed of light, Buchholz succeeded in establishing a scattering theory also for such particles (Haag 1992). Moreover, his arguments lead to a quantum version of Huygens’ principle.

As in the case of massive particles, one assumes that there is a subspace $\mathcal{H}_1 \subset \mathcal{H}$ corresponding to a representation of $U(\mathcal{P}_+^\uparrow)$ of mass $m = 0$ and, for simplicity, integer helicity; moreover, there must exist local operators interpolating between the vacuum and the single-particle states. These assumptions cover, in particular, the important examples of the photon and of Goldstone particles. Picking any suitable local operator $A$ interpolating between $\Omega$ and some vector in $\mathcal{H}_1$, one sets, in analogy to [4],

\[ A_t \doteq \int d^4x \; g_t(x_0)\, (-1/2\pi)\, \varepsilon(x_0)\, \delta(x_0^2 - \mathbf{x}^2)\, \partial_0 A(x) \qquad [22] \]

Here $g_t(x_0) \doteq (1/|\ln t|)\, g((x_0 - t)/|\ln t|)$ with $g$ as in [4], and the solution of the Klein–Gordon equation in [4] has been replaced by the fundamental solution of the wave equation; furthermore, $\partial_0 A(x)$ denotes the derivative of $A(x)$ with respect to $x_0$. Then, once again, the strong limit of $A_t \Omega$ as $t \to \pm\infty$ is $P_1 A \Omega$, with $P_1$ the projection onto $\mathcal{H}_1$.

In order to establish the convergence of $A_t$ as in the LSZ approach, one now uses the fact that these operators are, at asymptotic times $t$, localized in the complement of some forward, respectively, backward, light cone. Because of locality, they therefore commute with all operators which are localized in the interior of the respective cones. More specifically, let $\mathcal{O} \subset \mathbb{R}^4$ be the localization region of $A$ and let $\mathcal{O}_\pm \subset \mathbb{R}^4$ be the two regions having a positive, respectively, negative, timelike distance from all points in $\mathcal{O}$. Then, for any operator $B$ which is compactly localized in $\mathcal{O}_\pm$, respectively, one obtains $\lim_{t \to \pm\infty} A_t B \Omega = \lim_{t \to \pm\infty} B A_t \Omega = B P_1 A \Omega$. This relation establishes the existence of the limits

\[ A^{in/out} = \lim_{t \to \mp\infty} A_t \qquad [23] \]

on the (by the Reeh–Schlieder property) dense sets of vectors $\{B\Omega : B \in \mathcal{A}(\mathcal{O}_\mp)\} \subset \mathcal{H}$. It requires some more detailed analysis to prove that the limits have all of the properties of a (smeared) free massless field, whose translates $x \mapsto A^{in/out}(x)$ satisfy the wave equation and have c-number commutation relations. From these free fields, one can then proceed to asymptotic creation and annihilation operators and construct asymptotic Fock spaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ of massless particles and a corresponding scattering matrix as in the massive case. The details of this construction can be found in the original article, cf. Haag (1992).

It also follows from these arguments that the asymptotic fields $A^{in/out}$ of massless particles emanating from a region $\mathcal{O}$, that is, for which the underlying interpolating operators $A$ are localized in $\mathcal{O}$, commute with all operators localized in $\mathcal{O}_\mp$, respectively. This result may be understood as an expression of Huygens’ principle. More precisely, denoting by $\mathcal{A}^{in/out}(\mathcal{O})$ the algebras of bounded operators generated by the asymptotic fields $A^{in/out}$, respectively, one arrives at the following quantum version of Huygens’ principle.

Theorem 4 Consider a theory of massless particles as described above and let $\mathcal{A}^{in/out}(\mathcal{O})$ be the algebras generated by massless asymptotic fields $A^{in/out}$ with $A \in \mathcal{A}(\mathcal{O})$. Then

\[ \mathcal{A}^{in}(\mathcal{O}) \subset \mathcal{A}(\mathcal{O}_-)' \quad \text{and} \quad \mathcal{A}^{out}(\mathcal{O}) \subset \mathcal{A}(\mathcal{O}_+)' \qquad [24] \]

Here the prime denotes the set of bounded operators commuting with all elements of the respective algebras (i.e., their commutants).

Beyond Wigner’s Concept of Particle

There is by now ample evidence that Wigner’s concept of particle is too narrow in order to cover all particle-like structures appearing in quantum field theory. Examples are the partons which show up in nonabelian gauge theories at very small spacetime scales as constituents of hadrons, but which do not appear at large scales due to the confining forces. Their mathematical description

requires a quite different treatment, which cannot be discussed here. But even at large scales, Wigner’s concept does not cover all stable particle-like systems, the most prominent examples being particles carrying an abelian gauge charge, such as the electron and the proton, which are inevitably accompanied by infinite clouds of (“on-shell”) massless particles.

The latter problem was discussed first by Schroer, who coined the term “infraparticle” for such systems. Later, Buchholz showed in full generality that, as a consequence of Gauss’ law, pure states with an abelian gauge charge can neither have a sharp mass nor carry a unitary representation of the Lorentz group, thereby uncovering the simple origin of results found by explicit computations, notably in quantum electrodynamics (Steinmann 2000). Thus, one is faced with the question of an appropriate mathematical characterization of infraparticles which generalizes the concept of particle invented by Wigner. Some significant steps in this direction were taken by Fröhlich, Morchio, and Strocchi, who based a definition of infraparticles on a detailed spectral analysis of the energy–momentum operators. For an account of these developments and further references, cf. Haag (1992).

We outline here an approach, originated by Buchholz, which covers all stable particle-like structures appearing in quantum field theory at asymptotic times. It is based on Dirac’s idea of improper particle states with sharp energy and momentum. In the standard (rigged Hilbert space) approach to giving mathematical meaning to these quantities, one regards them as vector-valued distributions, whereby one tacitly assumes that the improper states can coherently be superimposed so as to yield normalizable states. This assumption is valid in the case of Wigner particles but fails in the case of infraparticles. A more adequate method of converting the improper states into normalizable ones is based on the idea of acting on them with suitable localizing operators. In the case of quantum mechanics, one could take as a localizing operator any sufficiently rapidly decreasing function of the position operator. It would map the improper “plane-wave states” of sharp momentum into finitely localized states which thereby become normalizable. In quantum mechanics, these two approaches can be shown to be mathematically equivalent. The situation is different, however, in quantum field theory.

In quantum field theory, the appropriate localizing operators $L$ are of the form [18]. They constitute a (nonclosed) left ideal $\mathcal{L}$ in the C*-algebra $\mathcal{A}$ generated by all local operators. Improper particle states of sharp energy–momentum $p$ can then be defined as linear maps $|\cdot\rangle_p : \mathcal{L} \to \mathcal{H}$ satisfying

\[ U(x)\, |L\rangle_p = e^{ipx}\, |L(x)\rangle_p , \quad L \in \mathcal{L} \qquad [25] \]

It is instructive to (formally) replace $L$ here by the identity operator, making it clear that this relation indeed defines improper states of sharp energy–momentum.

In theories of massive particles, one can always find localizing operators $L \in \mathcal{L}$ such that their images $|L\rangle_p \in \mathcal{H}$ are states with a sharp mass. This is the situation covered in Wigner’s approach. In theories with long-range forces there are, in general, no such operators, however, since the process of localization inevitably leads to the production of low-energy massless particles. Yet improper states of sharp momentum still exist in this situation, thereby leading to a meaningful generalization of Wigner’s particle concept.

That this characterization of particles covers all situations of physical interest can be justified in the general setting of relativistic quantum field theory as follows. Picking $g_t$ as in [4] and any vector $\Psi \in \mathcal{H}$ with finite energy, one can show that the functionals $\rho_t$, $t \in \mathbb{R}$, given by

\[ \rho_t(L^* L) \doteq \int d^4x \; g_t(x_0)\, \langle \Psi, (L^* L)(x)\, \Psi \rangle , \quad L \in \mathcal{L} \qquad [26] \]

are well defined and form an equicontinuous family with respect to a certain natural locally convex topology on the algebra $\mathcal{C} = \mathcal{L}^* \mathcal{L}$. This family of functionals therefore has, as $t \to \pm\infty$, weak-* limit points, denoted by $\sigma$. The functionals $\sigma$ are positive on $\mathcal{C}$ but not normalizable. (Technically speaking, they are weights on the underlying algebra $\mathcal{A}$.) Any such $\sigma$ induces a positive-semidefinite scalar product on the left ideal $\mathcal{L}$ given by

\[ \langle L_1 | L_2 \rangle \doteq \sigma(L_1^* L_2) , \quad L_1, L_2 \in \mathcal{L} \qquad [27] \]

After quotienting out elements of zero norm and taking the completion, one obtains a Hilbert space and a linear map $L \mapsto |L\rangle$ from $\mathcal{L}$ into that space. Moreover, the spacetime translations act on this space by a unitary representation satisfying the relativistic spectrum condition.

It is instructive to compute these functionals and maps in theories of massive particles. Making use of relation [21] one obtains, with a slight change of notation,

\[ \langle L_1 | L_2 \rangle = \int d\mu(p) \; \langle p | L_1^* L_2 | p \rangle \qquad [28] \]

where $\mu$ is a measure giving the probability density of finding at asymptotic times in state $\Psi$ a particle of energy–momentum $p$. Once again, possible summations over different particle types and internal degrees of freedom have been omitted here. Thus,

:
setting $|L\rangle_p \doteq L\,|p\rangle$, one concludes that the map $L \mapsto |L\rangle$ can be decomposed into a direct integral of improper particle states of sharp energy–momentum, $|\cdot\rangle = \int d\mu(p)^{1/2}\, |\cdot\rangle_p$. It is crucial that this result can also be established without any a priori input about the nature of the particle content of the theory, thereby providing evidence of the universal nature of the concept of improper particle states of sharp momentum, as outlined here.

Theorem 5 Consider a relativistic quantum field theory satisfying the standing assumptions. Then the maps $L \mapsto |L\rangle$ defined above can be decomposed into improper particle states of sharp energy–momentum $p$,

\[ |\cdot\rangle = \int d\mu(p)^{1/2}\, |\cdot\rangle_p \qquad [29] \]

where $\mu$ is some measure depending on the state $\Phi$ and the respective time limit taken.

It is noteworthy that whenever the space of improper particle states corresponding to fixed energy–momentum $p$ is finite dimensional (finite particle multiplets), then in the corresponding Hilbert space there exists a continuous unitary representation of the little group of $p$. This implies that improper momentum eigenstates of mass $m = (p^2)^{1/2} > 0$ carry definite (half)integer spin, in accordance with Wigner’s classification. However, if $m = 0$, the helicity need not be quantized, in contrast to Wigner’s results.

Though a general scattering theory based on improper particle states has not yet been developed, some progress has been made in Buchholz et al. (1991). There it is outlined how inclusive collision cross sections of scattering states, where an undetermined number of low-energy massless particles remains unobserved, can be defined in the presence of long-range forces, in spite of the fact that a meaningful scattering matrix may not exist.

Asymptotic Completeness

Whereas the description of the asymptotic particle features of any relativistic quantum field theory can be based on an arsenal of powerful methods, the question of when such a theory has a complete particle interpretation remains open to date. Even in concrete models there exist only partial results, cf. Iagolnitzer (1993) for a comprehensive review of the current state of the art. This situation is in striking contrast to the case of quantum mechanics, where the problem of asymptotic completeness has been completely settled.

One may trace the difficulties in quantum field theory back to the possible formation of superselection sectors (Haag 1992) and the resulting complex particle structures, which cannot appear in quantum-mechanical systems with a finite number of degrees of freedom. Thus, the first step in establishing a complete particle interpretation in a quantum field theory has to be the determination of its full particle content. Here the methods outlined in the preceding section provide a systematic tool. From the resulting data, one must then reconstruct the full physical Hilbert space of the theory comprising all superselection sectors. For theories in which only massive particles appear, such a construction has been established in Buchholz and Fredenhagen (1982), and it has been shown that the resulting Hilbert space contains all scattering states. The question of completeness can then be recast into the familiar problem of the unitarity of the scattering matrix. It is believed that phase space (nuclearity) properties of the theory are of relevance here (Haag 1992).

However, in theories with long-range forces, where a meaningful scattering matrix may not exist, this strategy is bound to fail. Nonetheless, as in most high-energy scattering experiments, only some very specific aspects of the particle interpretation are really tested – one may think of other meaningful formulations of completeness. The interpretation of most scattering experiments relies on the existence of conservation laws, such as those for energy and momentum. If a state has a complete particle interpretation, it ought to be possible to fully recover its energy, say, from its asymptotic particle content, that is, there should be no contributions to its total energy which do not manifest themselves asymptotically in the form of particles. Now the mean energy–momentum of a state $\Phi \in \mathcal{H}$ is given by $\langle \Phi, P\Phi \rangle$, $P$ being the energy–momentum operators, and the mean energy–momentum contained in its asymptotic particle content is $\int d\mu(p)\, p$, where $\mu$ is the measure appearing in the decomposition [29]. Hence, in case of a complete particle interpretation, the following should hold:

\[ \langle \Phi, P\Phi \rangle = \int d\mu(p)\, p \qquad [30] \]

Similar relations should also hold for other conserved quantities which can be attributed to particles, such as charge, spin, etc. It seems that such a weak condition of asymptotic completeness suffices for a consistent interpretation of most scattering experiments. One may conjecture that relation [30] and its generalizations hold in all theories admitting a local stress–energy tensor and local currents corresponding to the charges.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Dispersion Relations; Perturbation Theory and its Techniques; Quantum Chromodynamics; Quantum Field Theory in Curved
Spacetime; Quantum Mechanical Scattering Theory; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: The Analytic Program.

Further Reading

Araki H (1999) Mathematical Theory of Quantum Fields. Oxford: Oxford University Press.
Buchholz D and Fredenhagen K (1982) Locality and the structure of particle states. Communications in Mathematical Physics 84: 1–54.
Buchholz D, Porrmann M, and Stein U (1991) Dirac versus Wigner: towards a universal particle concept in quantum field theory. Physics Letters B 267: 377–381.
Dybalski W (2005) Haag–Ruelle scattering theory in presence of massless particles. Letters in Mathematical Physics 72: 27–38.
Fredenhagen K, Gaberdiel MR, and Rüger SM (1996) Scattering states of plektons (particles with braid group statistics) in (2 + 1) dimensional quantum field theory. Communications in Mathematical Physics 175: 319–336.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Iagolnitzer D (1993) Scattering in Quantum Field Theories. Princeton, NJ: Princeton University Press.
Jost R (1965) General Theory of Quantized Fields. Providence, RI: American Mathematical Society.
Steinmann O (2000) Perturbative Quantum Electrodynamics and Axiomatic Field Theory. Berlin: Springer.
Streater RF and Wightman AS (1964) PCT, Spin and Statistics, and All That. Reading, MA: Benjamin/Cummings.

Scattering in Relativistic Quantum Field Theory: The Analytic Program
J Bros, CEA/DSM/SPhT, CEA Saclay, Gif-sur-Yvette, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction to the Analytic Structures of Quantum Field Theory

The importance of complex variables and of the concept of analyticity in theoretical physics finds one of its best illustrations in the analytic structure of relativistic quantum field theory (QFT). These analytic structures have been investigated from several viewpoints in the last 50 years, according to the successive progress in QFT.

In the two main axiomatic frameworks of QFT, namely the one based on Wightman axioms (for a short presentation, see Dispersion Relations and also Axiomatic Quantum Field Theory) and the Haag, Kastler, and Araki theory of “local observables” (see Algebraic Approach to Quantum Field Theory), there are general justifications of analyticity properties for relevant “N-point structure functions” both in complexified spacetime variables and in complexified energy–momentum variables.

In the Wightman framework, relativistic quantum fields are operator-valued distributions $\Phi_j(x)$ on four-dimensional Minkowski spacetime that transform covariantly under a unitary representation of the Poincaré group in the Hilbert space of states. The basic quantities of QFT are (tempered) distributions on $\mathbb{R}^{4N}$ of the form $\langle \Psi, \Phi(x_1) \cdots \Phi(x_N) \Psi' \rangle$, which depend on pairs of states $\Psi$, $\Psi'$ belonging to the Hilbert space of the QFT considered: they can be called N-point structure functions of the field $\Phi$ “in x-space,” namely in Minkowski spacetime (here, for brevity, we assume that the system is defined in terms of a single quantum field). In parallel, it is important to consider the Fourier transform $\tilde\Phi(p) = \int e^{ip\cdot x}\, \Phi(x)\, dx$ of the field in the Minkowskian energy–momentum space ($p\cdot x \doteq p_0 x_0 - \vec{p}\cdot\vec{x}$ denoting the Minkowskian scalar product). The corresponding quantities $\langle \Psi, \tilde\Phi(p_1) \cdots \tilde\Phi(p_N) \Psi' \rangle$ can then be called N-point structure functions of the field $\Phi$ “in p-space,” namely in energy–momentum space.

In the algebraic QFT framework, each basic local observable $B$ affiliated to a certain bounded region of spacetime $O$ generates a Haag–Kastler–Araki quantum field $B(x)$ by the action of the translations of spacetime, namely $B(x) \doteq U(x) B U(x)^{-1}$. Here $U(x)$ denotes the unitary representation of the group of spacetime translations in the Hilbert space of states: $B(x)$ is affiliated to the translated region $O(x) = \{y;\ y - x \in O\}$. Then again one can consider N-point structure functions of the theory of the form $\langle \Psi, B(x_1) \cdots B(x_N) \Psi' \rangle$ and $\langle \Psi, \tilde B(p_1) \cdots \tilde B(p_N) \Psi' \rangle$.

To summarize the situation as it occurs in both cases, one can say the following:

1. A certain postulate of relativistic causality implies the analyticity of structure functions of a certain class, often called “Green functions,” in the complex energy–momentum variables $k_j = p_j + iq_j$, in particular for purely imaginary energies.
2. “Stability properties” of the states $\Psi$, $\Psi'$ such as a “bounded energy content” of these states imply
the analyticity of the previous structure functions in the complex spacetime variables, in particular for purely imaginary times.

In both cases, analyticity is obtained as a basic property of the Fourier–Laplace transformation in several variables. Let $V^+$ denote the forward cone of the Minkowskian space ($V^+ \doteq -V^- \doteq \{x;\ x^2 \doteq x\cdot x > 0,\ x_0 > 0\}$) and let

\[ \tilde f(p + iq) = \int_{V_a^+} e^{i(p+iq)\cdot x}\, f(x)\, dx \qquad [1] \]

\[ g(x + iy) = (2\pi)^{-4} \int_{V_P^+} e^{-ip\cdot(x+iy)}\, \tilde g(p)\, dp \qquad [2] \]

be the associated reciprocal Fourier formulas, applied, respectively, to functions $f(x)$ with support contained in the translated forward cone $V_a^+ = a + V^+$, $a \in V^+$ (or in its closure), and to functions $\tilde g(p)$ with support contained in the translated forward cone $V_P^+ = P + V^+$, $P \in V^+$ of energy–momentum space (or in its closure). Then in view of the convergence properties of the previous integrals, one easily checks that $\tilde f(k)$ is holomorphic in the tube domain $T^+ = \mathbb{R}^4 + iV^+$, with an exponential behavior in the imaginary directions controlled by the bound $e^{-q\cdot a}$; similarly, $g(z)$ is holomorphic in the tube domain $T^- = \mathbb{R}^4 + iV^-$, with a behavior controlled by the exponential bound $e^{y\cdot P}$.

On the one hand, for each $N$ the structure functions $\langle \Psi, \tilde\Phi(p_1) \cdots \tilde\Phi(p_N) \Psi' \rangle$ (or $\langle \Psi, \tilde B(p_1) \cdots \tilde B(p_N) \Psi' \rangle$) have conical support properties of the previous type in the variables $p_j$, as a consequence of the relativistic shape of the energy–momentum spectrum. In both axiomatic frameworks, in fact, one postulates that there is a state of zero energy–momentum $\Omega$, called the vacuum, and that the energy–momentum spectrum $\Sigma$, namely the joint spectrum of the generators $P_\mu$ of the Lie algebra of the group $U(x)$, is contained in the closure of $V^+$: this is the so-called spectral condition. A more refined assumption introduced for the requirements in particle physics is that $\Sigma$ contains discrete parts localized on sheets of (mass-shell) hyperboloids inside $V^+$. These support properties in p-space imply that the corresponding inverse Fourier transforms $\langle \Psi, \Phi(x_1) \cdots \Phi(x_N) \Psi' \rangle$ are boundary values of holomorphic functions in appropriate tube domains of the complex spacetime variables $(z_1, \ldots, z_N)$.

On the other hand, in order to exhibit structure functions with conical support properties in x-space, one needs to build appropriate algebraic combinations of functions $\langle \Psi, \Phi(x_{j_1}) \cdots \Phi(x_{j_N}) \Psi' \rangle$ with permuted arguments in order to take the benefit of the causality postulate, which is always formulated in terms of the commutator of two field operators.

There are two versions of this postulate. In the Wightman framework, causality is expressed by the condition of local commutativity or microcausality,

\[ [\Phi(x_1), \Phi(x_2)] = 0 \quad \text{for } (x_1 - x_2)^2 < 0 \qquad [3] \]

In the algebraic QFT framework, causality is expressed by a similar property in terms of any field $B(x)$ generated by a local observable $B \doteq B(0)$ affiliated to a region of spacetime enclosed in a given “double cone” $O_b = V_{-b}^+ \cap (-V_{-b}^+)$. The corresponding expression of causality is

\[ [B(x_1), B(x_2)] = 0 \quad \text{for } (x_1 - x_2) \notin \bigl(V_{-a}^+ \cup (-V_{-a}^+)\bigr) \qquad [4] \]

for all $a$ such that $a > 2b$.

So, we see that basically, causality and spectral condition generate analyticity respectively in complexified p-space and x-space. However, the situation is more intricate, since for each $N$ there are always several holomorphic branches (two in the case $N = 2$) in the variables $(z_1, \ldots, z_N)$ and also in the variables $(k_1, \ldots, k_N)$: each of these two sets is obtained essentially by permutations of the $N$ vector variables. The important point is that these various branches can be seen to “communicate together,” thanks to the existence of “coincidence regions” of their boundary values on the reals. Here again the roles played by causality and stability are symmetric (but inverted): while causality produces coincidence regions for the holomorphic functions in complex spacetime, spectral conditions produce coincidence regions for the holomorphic functions in complex energy–momentum space.

In view of a basic theorem of several complex variable analysis, called the edge-of-the-wedge theorem (see below in (4)), the two sets of communicating holomorphic branches actually define by mutual analytic continuation two holomorphic functions $H_N^{\Psi,\Psi'}(k_1, \ldots, k_N)$ and $W_N^{\Psi,\Psi'}(z_1, \ldots, z_N)$ in respective domains $D_N^{\Psi,\Psi'}$ and $\Delta_N^{\Psi,\Psi'}$. However, these two primitive domains are not natural holomorphy domains (a phenomenon which is particular to complex geometry in several variables). The problem of finding their holomorphy envelopes, namely the smallest domains $\hat D_N^{\Psi,\Psi'}$ and $\hat\Delta_N^{\Psi,\Psi'}$ in which any functions holomorphic in the primitive domains can be analytically continued, is the idealistic purpose of what has been called the analytic program of axiomatic QFT. So, we see that there is an analytic program in x-space and there is an analytic program in p-space. In practice, except for the case $N = 2$, where the complete answer is known, only a partial knowledge of the holomorphy envelopes has been obtained.
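The Fourier–Laplace mechanism behind formulas [1] and [2] can be illustrated by a one-variable analogue. The following is a minimal sketch and is not taken from the article: the single real variable $x$, the support point $a$, and the choice of the step function $\theta(x - a)$ as a test case are assumptions made for illustration only.

```latex
% One-variable analogue of [1]: f(x) = \theta(x - a), supported in [a, \infty).
% For k = p + iq with q > 0 the integrand decays like e^{-qx}, so the integral converges.
\[
\tilde f(k) = \int_a^{\infty} e^{ikx}\,dx = \frac{i\,e^{ika}}{k},
\qquad k = p + iq,\quad q > 0 .
\]
% Modulus: the factor e^{-qa} is the analogue of the bound e^{-q\cdot a}.
\[
|\tilde f(p+iq)| = \frac{e^{-qa}}{\sqrt{p^2+q^2}} .
\]
% Boundary value on the reals, in the sense of distributions, as q \to 0^+:
\[
\lim_{q\to 0^+}\tilde f(p+iq) = \frac{i\,e^{ipa}}{p+i0} .
\]
```

In this toy case the support condition alone yields holomorphy in the upper half-plane (the analogue of the tube $T^+$), and the function on the real axis is recovered only as a boundary value.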
The analytic program in p-space, which is the only one to be described in the rest of this article, was often considered as physically more interesting, in view of the fact that it aims to establish analyticity properties of the scattering kernels on the complex mass shell. As a matter of fact, an important part of it concerns the derivation of the analyticity domains of dispersion relations for two-particle scattering amplitudes. This part is important from the historical viewpoint as well as from conceptual, physical, and pedagogical viewpoints (the reader may find it useful to first check the article Dispersion Relations, which illustrates how a structure function of the form $H_2^{\Psi,\Psi'}(k_1, k_2)$ can be used for that purpose with a suitable choice of the states $\Psi$ and $\Psi'$). In the general development of the analytic program (in x-space as well as in p-space), it is recommended to consider the infinite set of structure functions $H_N \doteq H_N^{\Omega,\Omega}(k_1, \ldots, k_N)$ and $W_N \doteq W_N^{\Omega,\Omega}(z_1, \ldots, z_N)$ where $\Omega$ is the privileged vacuum state of the theory, in view of the fact that each of these sets characterizes entirely the field theory considered.

Before shifting to the analytic program in p-space, we would like to mention various points of interest of the analytic program in x-space:

1. Various results of this program have been extensively used for proving fundamental properties of QFT, such as the PCT-invariance theorem, the spin–statistics connection, etc. A good part of these can be found in the books by Streater and Wightman (1980) and by Jost (1965).
2. The functions $H_N$ and $W_N$ are holomorphic in their respective p-space and x-space “Euclidean subspaces.” To make this clear, let us assume that a Lorentz frame has been chosen once for all; the linear subspace of complex spacetime (resp. energy–momentum) vectors of the form $z = (iy_0, \vec{x})$ (resp. $k = (iq_0, \vec{p})$) is called the “Euclidean subspace” of the corresponding complex Minkowskian space, in view of the fact that the quadratic form $z^2 \doteq z\cdot z = -(y_0^2 + \vec{x}^{\,2})$ (resp. $k^2 \doteq k\cdot k = -(q_0^2 + \vec{p}^{\,2})$) has a definite (negative) sign on that subspace. Then it has been established that (for each $N$) the restrictions of $H_N$ and $W_N$ to the corresponding N-vector Euclidean subspaces are the Fourier transforms of each other. This fact participates in the foundation of the Euclidean formulation of QFT or “QFT at imaginary times”; the latter has provided many important results in QFT, in particular for the rigorous study of field models (initiated by Glimm and Jaffe in the 1970s).
3. A more recent extension of QFT called thermal QFT (TQFT), which aims to study the behavior of quantum fields in a thermal bath, can be described in terms of a modified analytic program. In the latter, the spectral condition is replaced by the so-called KMS condition, which prescribes x-space analyticity properties of a particular type for the structure functions $W_N$: it requires analyticity together with periodicity conditions with respect to imaginary times, the period being the inverse of the temperature (see Thermal Quantum Field Theory). The usual analytic structure for the theories with vacuum and spectral conditions is recovered in the zero-temperature limit.
4. In more recent investigations concerning quantum fields on (holomorphic) curved spacetimes, analyticity properties of the structure functions similar to those of thermal QFT can be established. This is the case in particular with de Sitter spacetime, for which a notion of “temperature of geometrical origin” is most simply exhibited.

In this article, an account of the general analytic program of axiomatic QFT in complex energy–momentum space will be presented; it will describe some of the methods which have been used for establishing analyticity properties of the N-point structure functions of QFT and corresponding properties of the $(n \to n')$-particle collision processes, for all $n$, $n'$ such that $n \ge 2$, $n' \ge 2$, $n + n' = N$. (For a more detailed study, in particular concerning the microlocal methods, see the book by Iagolnitzer (1992).)

Concerning the important case $N = 4$, this article gives complements to the results described in the article Dispersion Relations. In fact, the program allows one to justify other important analytic structures of the four-point functions and of two-particle scattering functions. They concern

• the field-theoretical basis of analyticity in the complexified variable of angular momentum, first introduced and developed in potential theory (Regge 1959);
• the Bethe–Salpeter (BS-) type structure (based on the additional postulate of asymptotic completeness), which is a relativistic field-theoretical generalization of the Lippmann–Schwinger structure of nonrelativistic scattering theory (for Schrödinger equations with Yukawa-type potentials).

The latter allows one to introduce the concept of composite particle in the field-theoretical framework (including bound states and unstable particles or “resonances”) and also the concept of “Regge particle,” thanks to complex angular momentum analysis.
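The sign statement made for the Euclidean subspaces in point 2 above can be checked directly from the Minkowskian scalar product $p\cdot x = p_0 x_0 - \vec{p}\cdot\vec{x}$. The following lines are a hedged worked check rather than part of the original text; the arrow notation for spatial components is an assumption of this sketch.

```latex
% Euclidean vectors: purely imaginary time (or energy) component, real spatial part.
\[
z = (iy_0, \vec{x}):\qquad
z^2 = (iy_0)^2 - \vec{x}^{\,2} = -\bigl(y_0^2 + \vec{x}^{\,2}\bigr) \le 0 ,
\]
\[
k = (iq_0, \vec{p}):\qquad
k^2 = (iq_0)^2 - \vec{p}^{\,2} = -\bigl(q_0^2 + \vec{p}^{\,2}\bigr) \le 0 .
\]
% Pairing of two Euclidean vectors: real, equal to minus the Euclidean scalar product in R^4.
\[
k\cdot z = (iq_0)(iy_0) - \vec{p}\cdot\vec{x} = -\bigl(q_0 y_0 + \vec{p}\cdot\vec{x}\bigr).
\]
```

Since $k\cdot z$ is real on these subspaces, $e^{ik\cdot z}$ is a pure phase there, which is consistent with the statement that the Euclidean restrictions of $H_N$ and $W_N$ are related by an ordinary Fourier transformation.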
Various Aspects of the General Analytic Program of QFT in Complex Energy–Momentum Space

The N-Point Structure Functions of QFT

It is proved in the Wightman QFT axiomatic framework that any QFT is completely characterized by the (infinite) sequence of its “N-point functions” or “vacuum expectation values” (also called “Wightman functions”)

\[ W_N(x_1, \ldots, x_N) \doteq \langle \Omega, \Phi(x_1) \cdots \Phi(x_N)\, \Omega \rangle \]

which are tempered distributions on $\mathbb{R}^{4N}$ satisfying a set of general properties that can be split up into linear and nonlinear conditions. (This is known as the Wightman reconstruction theorem.)

Linear conditions Each individual N-point function satisfies three sets of linear conditions which result, respectively, from:

1. Poincaré invariance: typically, for every Poincaré transformation $g$ of Minkowski spacetime

\[ W_N(x_1, \ldots, x_N) = W_N(gx_1, \ldots, gx_N) \]

in particular, the $W_N$ are invariant under spacetime translations and therefore defined on the quotient subspace $\mathbb{R}^{4(N-1)} \doteq \mathbb{R}^{4N}/\mathbb{R}^4$ of the differences $x_j - x_k$.
2. Microcausality: support conditions on commutator functions of the following form:

\[ C_{(j,j+1)}(x_1, \ldots, x_N) \doteq W_N(x_1, \ldots, x_j, x_{j+1}, \ldots, x_N) - W_N(x_1, \ldots, x_{j+1}, x_j, \ldots, x_N) = 0 \]

in the region of $\mathbb{R}^{4N}$ defined by $(x_j - x_{j+1})^2 < 0$.
3. Spectral condition: support conditions on the Fourier transform $\tilde W_N(p_1, \ldots, p_N) = \delta(p_1 + \cdots + p_N) \times \hat w_N(p_1, \ldots, p_{N-1})$ of $W_N$, which assert that $\hat w_N(p_1, \ldots, p_{N-1}) = 0$ if any one of the following conditions is fulfilled: $p_1 + \cdots + p_j \notin \Sigma$, for $j = 1, \ldots, N - 1$.

For each $N$, one can then construct a set of distributions $R_N^{(\lambda)}(x_1, \ldots, x_N)$, called “generalized retarded functions” (Araki, Ruelle, Steinmann, 1960 (see Iagolnitzer (1992, ref. [EGS]))), which are appropriate linear combinations of multiple commutator functions built from $W_N$ and multiplied by products of Heaviside step-functions $\theta(x_{j,0} - x_{k,0})$ of the differences of time coordinates. Each of these distributions $R_N^{(\lambda)}(x_1, \ldots, x_N)$ has its support contained in a convex salient cone $C_\lambda$. This construction can be seen as a generalization of the decomposition [23] of the commutator $C_{\Psi,\Psi'}$ in the article Dispersion Relations. Then in view of the Laplace-transform theorem in several variables, the Fourier transform $\tilde R_N^{(\lambda)}(p_1, \ldots, p_N) = \delta(p_1 + \cdots + p_N) \times \tilde r_N^{(\lambda)}([p]_N)$ is such that $\tilde r_N^{(\lambda)}([p]_N)$ is the boundary value of a holomorphic function $\tilde r_N^{(\lambda),(c)}([k]_N)$ defined in a tube $T_\lambda = \mathbb{R}^{4(N-1)} + i\tilde C_\lambda$. Here $[k]_N = [p]_N + i[q]_N$ belongs to a $4(N-1)$-dimensional complex linear space $M_N^{(c)}$: this is the set of complex vectors $[k]_N \doteq (k_1, \ldots, k_N)$ such that $k_1 + \cdots + k_N = 0$. $\tilde C_\lambda$ is the dual cone of $C_\lambda$ in the real ($4(N-1)$-dimensional) $[q]_N$-space. Geometrically, each cone $\tilde C_\lambda$ is defined in terms of a certain “cell” of $[q]_N$-space which is defined by prescribing consistent conditions of the form $\varepsilon_J q_J \in V^+$ with $q_J = \sum_{j \in J} q_j$ and $\varepsilon_J = \pm 1$ for all proper subsets $J$ of the set $\{1, 2, \ldots, N\}$.

This is the expression of the microcausality postulate (summarized in [3] or [4]) in complex energy–momentum space. Concerning the difference between the two formulations [3] and [4], one can see that there is no geometrical difference concerning the analyticity domains, but differences for the type of increase of the structure functions in their tube domains: in the case of [3], they are bounded by powers of the energy–momenta, while in the case of [4] they may have an exponential increase governed by factors of the type $e^{q\cdot a}$.

For each $N$, the linear space generated by all the distributions $\tilde r_N^{(\lambda)}([p]_N)$ is constrained by a set of linear relations (called Steinmann relations) which result from algebraic expressions of discontinuities of the following type, called (generalized) “absorptive parts,”

\[ \tilde r_N^{(\lambda)}([p]_N) - \tilde r_N^{(\lambda')}([p]_N) = \bigl\langle \Omega, \bigl[\tilde R_{J_1}^{(\lambda_1)}([p]_{(J_1)}),\, \tilde R_{J_2}^{(\lambda_2)}([p]_{(J_2)})\bigr]\, \Omega \bigr\rangle \qquad [5] \]

for all pairs of adjacent cells $(\lambda, \lambda')_{(J_1, J_2)}$ in the following sense: $\lambda$ and $\lambda'$ only differ by changing the value of $\varepsilon_{J_1} = -\varepsilon_{J_2}$, $(J_1, J_2)$ denoting any given partition of the set $\{1, 2, \ldots, N\}$. In [5], the symbols $\tilde R_{J_i}^{(\lambda_i)}$ denote generalized retarded operators of lower order and the argument $[p]_{(J)}$ stands for the set of independent 4-momenta $\{p_j;\ j \in J\}$. Formula [5] may be seen as an N-point generalization of formula [26] of Dispersion Relations for the case when the state $\Psi = \Psi'$ is replaced by $\Omega$.

Then by applying to [5] the same argument based on spectral condition as in the exploitation of eqn [26] in Dispersion Relations, one concludes that the two distributions $\tilde r_N^{(\lambda)}$ and $\tilde r_N^{(\lambda')}$ coincide on an open set $\mathcal{R}_{\lambda,\lambda'}$ of the form $p_{J_1}^2 = p_{J_2}^2 < M_{J_1}^2$, where $p_{J_1} \doteq \sum_{j \in J_1} p_j = -p_{J_2}$. It then follows from the general “oblique edge-of-the-wedge theorem” (Epstein, 1960; see below) that the two corresponding holomorphic functions $\tilde r_N^{(\lambda),(c)}([k]_N)$ and $\tilde r_N^{(\lambda'),(c)}([k]_N)$
have a common analytic continuation in the union of their tubes together with a certain complex “connecting set,” bordered by $\mathcal{R}_{\lambda,\lambda'}$. Since this argument applies to all pairs $(\lambda, \lambda')_{(J_1, J_2)}$, the following important property holds (see Iagolnitzer (1992, refs. [B2], [EGS])):

Theorem 1

(i) All the holomorphic functions $\tilde r_N^{(\lambda),(c)}([k]_N)$ admit a common analytic continuation $H_N([k]_N)$, called the N-point structure function (or Green function) of the given quantum field in complex energy–momentum space. It is holomorphic in a “primitive domain” $D_N$ of $M_N^{(c)}$, which is the union of all tubes $T_\lambda$ together with complex “connecting sets” bordered by all the coincidence regions $\mathcal{R}_{\lambda,\lambda'}$ defined previously.
(ii) For each $N$ the complex domain $D_N$ contains the whole Euclidean subspace $E_N$ of $M_N^{(c)}$, which is the set of all complex vectors $[k]_N = (k_1, \ldots, k_N)$ such that $k_j = (k_{j,0}, \vec{k}_j)$; $k_{j,0} = iq_{j,0}$, $\vec{k}_j = \vec{p}_j$ for $j = 1, 2, \ldots, N$. (This Euclidean subspace depends on the choice of a given Lorentz frame in Minkowski spacetime.)

Positivity Conditions The Hilbert space framework which underlies the axioms of QFT implies (an infinite set of) positivity inequalities on the N-point structure functions of the fields. As a typical example related to the previous formula [5] when $|J_1| = |J_2| = N/2$ (for $N$ even), one can mention the positive-definiteness property of the absorptive parts for appropriate pairs of adjacent cells $(\lambda_1, \lambda_2 = \lambda_1)_{(J_1, J_2)}$, which simply expresses the positivity of the following Hilbertian squared norm:

\[ \int \overline{f}([p]_{(J_2)})\, f([p]_{(J_1)})\, \bigl[\tilde r_N^{(\lambda)}([p]_N) - \tilde r_N^{(\lambda')}([p]_N)\bigr]\, d[p]_{(J_1)}\, d[p]_{(J_2)} = \Bigl\| \int f([p]_{(J_1)})\, \tilde R_{J_1}^{(\lambda_1)}([p]_{(J_1)})\, |\Omega\rangle\, d[p]_{(J_1)} \Bigr\|^2 \ge 0 \qquad [6] \]

Scattering Kernels of General $(n \to n')$-Particle Collisions and General Reduction Formulas

The presentation of $(2 \to 2)$-particle scattering kernels in the article Dispersion Relations can be generalized to arbitrary $(n \to n')$-particle collision processes, involving $n$ incoming massive particles ($n \ge 2$) and $n'$ outgoing massive particles ($n' \ge 2$). The big “scattering matrix” or “S-matrix” in the Hilbert space of states is the collection of all partial scattering matrices $S_{n,n'}$ or of the equivalent kernels $S_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})$, defined by a straightforward generalization of formula [20] of the quoted article:

\[ S_{n,n'}(\hat f_{n,\mathrm{in}}, \hat g_{n',\mathrm{out}}) = \int_{M_{n,n'}} \hat f_{n,\mathrm{in}}(p_{n,\mathrm{in}})\, \hat g_{n',\mathrm{out}}(p_{n',\mathrm{out}}) \times S_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})\, \mu_m^{n}(p_{n,\mathrm{in}})\, \mu_m^{n'}(p_{n',\mathrm{out}}) \qquad [7] \]

Here we have considered for simplicity the case of collisions involving a single type of particle with mass $m$. In the arguments of the wave packets, the kernel, and the measures ($\mu_m^{n}$, $\mu_m^{n'}$), $p_{n,\mathrm{in}}$ and $p_{n',\mathrm{out}}$, respectively, denote the sets of incoming and outgoing 4-momenta $(p_1, \ldots, p_n)$ and $(p'_1, \ldots, p'_{n'})$ which all belong to the physical mass shell $H_m^+ = \{p;\ p \in V^+,\ p^2 = m^2\}$. By supplementing these mass-shell constraints with the relativistic law of conservation of total energy–momentum $p_1 + \cdots + p_n = p'_1 + \cdots + p'_{n'}$, one obtains the definition of the mass-shell manifold $M_{n,n'}$ of $(n \to n')$-particle collision processes.

We shall reserve the name of scattering kernel (or scattering amplitude), denoted by $T_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})$, to the so-called “connected component” of the S-matrix kernel $S_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})$. By analogy with the definition of $T$ in terms of $S$ for the two-particle collision processes (see Dispersion Relations), $T_{n,n'}$ is defined by a recursive algorithm, which amounts to subtracting from $S_{n,n'}$ all the components of the $(n \to n')$-collision processes that are decomposable into independent collision processes involving smaller numbers of particles, according to all admissible partitions of the numbers $n$ and $n'$.

For any given $N$, let us consider all the “affiliated” scattering kernels $T_{n,n'}$ such that $n + n' = N$ and whose corresponding collision processes, also called “channels,” are deduced from one another by the relevant exchange of incoming particles and outgoing antiparticles (e.g., $1 + 2 + 3 \to 4 + 5 + 6$, $1 + 2 \to \bar{3} + 4 + 5 + 6$, and $1 + 3 \to \bar{2} + 4 + 5 + 6$). There exist general reduction formulas according to which all these scattering kernels are restrictions to the mass-shell manifold $M_{(N)}$ of appropriate boundary values of the (so-called) “amputated N-point function” $\hat H_N(k_1, \ldots, k_N) \doteq (k_1^2 - m^2) \cdots (k_N^2 - m^2) \times H_N(k_1, \ldots, k_N)$. More precisely, these reduction formulas can be written as follows:

\[ T_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})\big|_{M_{(N)}^{(\lambda)}} = \hat H_N^{(\lambda)}(p_1, \ldots, p_N)\big|_{M_{(N)}^{(\lambda)}} \qquad [8] \]

In the latter, $\hat H_N^{(\lambda)}$ denotes a certain boundary value of $\hat H_N$ on the reals: it is equal to a generalized retarded function $\tilde r_N^{(\lambda)}([p]_N)$ which depends in a specific way on a region of the mass shell, called $M_{(N)}^{(\lambda)}$, in which
the $(n \to n')$-channel is considered. The important thing to be noted in [8] is the sign convention which attributes the notation $-p_j$ to the momentum of any incoming particle and therefore implies that $p_j$ belongs to the negative sheet of hyperboloid $H_m^- \doteq -H_m^+$. This is the price to pay for expressing symmetrically the energy–momentum conservation law as $p_1 + p_2 + \cdots + p_N = 0$ (according to the QFT formalism), but it also displays, as a nice feature, the fact that all the affiliated scattering kernels $T_{n,n'}$ such that $n + n' = N$ are located on the various connected components of the mass shell $M_{(N)}$ ($p_j \in H_m$; $j = 1, 2, \ldots, N$): the choice of the sheet $H_m^-$ or $H_m^+$ of $H_m$ is exactly linked to the incoming or outgoing character of the particle considered.

Remark 1 The reduction formulas are more usually expressed in terms of the Fourier transforms of the (connected parts of the) N-point amputated chronological functions $\tau_N([p]_N)$ (see Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools). As a matter of fact, the latter coincide with the boundary values $\tilde r_N^{(\lambda)}([p]_N)$ of $H_N$ in the corresponding relevant regions $M_{(N)}^{(\lambda)}$.

Remark 2 Coming back to the case of two-particle scattering amplitudes (i.e., $n = n' = 2$, $N = 4$), one can see that the general study presented here implies the consideration of the four-point function $H_4(k_1, k_2, k_3, k_4)$, which is a holomorphic function of three independent complex 4-momenta (since $k_1 + k_2 + k_3 + k_4 = 0$). In that case, the domain $D_4$ contains 32 tubes $T_\lambda$ which are specified by triplets of conditions such as $q_1 \in V^+$, $q_2 \in V^+$, $q_3 \in V^+$, or $q_1 \in V^+$, $q_1 + q_2 \in V^+$, $q_1 + q_3 \in V^+$, and those obtained by permutations of the subscripts $(1, 2, 3, 4)$ and also by a global substitution of the cone $V^-$ for $V^+$.

Remark 3 The logical path from the postulates of QFT to the analyticity properties of two-particle scattering amplitudes that has been followed in the article Dispersion Relations can be seen as a partial exploitation of the general analyticity properties of the four-point function: one was specially interested there in the analyticity properties of $H_4$ in a single 4-momentum $k_1 = -k_3$ (at fixed real values of $p_2 = -p_4$). The “partial reduction formula” [27] of Dispersion Relations corresponds to the restriction of eqn [8] (for $N = 4$) to the linear submanifold ($p_1 = -p_3$, $p_2 = -p_4$). It may also be worthwhile to

in its cut-plane (or crossing) domain; the dispersion relations with two subtractions are still justified in that case (Epstein, Glaser, Martin, 1969 (see Martin (1969, preprint))).

Off-Shell Character of $D_N$: Nontriviality of the Analytic Structure of the Scattering Kernels

One can now see that for each value of $N$ ($N \ge 4$) the situation created by complex geometry in the space $\mathbb{C}^{4(N-1)}$ of $[k]_N$ is a mere generalization of the one described in a simple situation in the article Dispersion Relations.

1. There exists a fundamental $(3N - 4)$-dimensional complex submanifold, namely the complex mass shell $M_{(N)}^{(c)}$ defined by the equations $k_j^2 = m^2$; $j = 1, \ldots, N$, which connects together the various real mass-shell components $M_{(N)}^{(\lambda)}$ interpreted as the various physical regions of a set of affiliated $(n \to n')$-collision processes. The problem of proving the “analyticity of $(n \to n')$-scattering functions” thus amounts to constructing such holomorphic functions on the complex manifold $M_{(N)}^{(c)}$, whose boundary values on the various real regions $M_{(N)}^{(\lambda)}$ would reproduce the relevant scattering kernels $T_{n,n'}(p_{n,\mathrm{in}}; p_{n',\mathrm{out}})$.
2. All the tubes $T_\lambda$ which generate the primitive domain $D_N$ are off-shell domains, namely their intersections with $M_{(N)}^{(c)}$ are empty. This simply comes from the fact that the conditions $q_j \in V^\pm$ (included in their definition) and $k_j^2 = m^2 > 0$ are incompatible. One can also check that adding the coincidence regions $\mathcal{R}_{\lambda,\lambda'}$ between adjacent tubes does not improve the situation. However, one can state as a relevant scope the following program.
3. Linear program (so-called because it only relies on the linear conditions presented in the section “N-point structure functions of QFT”): find parts of the holomorphy envelope of $D_N$ (possibly improved by the exploitation of the Steinmann relations) whose intersections with the complex mass shell $M_{(N)}^{(c)}$ are nonempty. In the best case, show that such intersections can exist which connect two different regions $M_{(N)}^{(\lambda)}$ together, which means “proving the crossing property between these two regions.”
4. We shall see in the following that, except for the case $N = 4$, the results of this linear program have been rather disappointing as far as reaching the complex mass shell is concerned; however,
stress the fact that, in spite of the exponential other interesting analytic structures also coming
bounds on H4 implied by the postulates of algebraic from positivity conditions and from the addi-
QFT, it has been possible to prove that the tional postulate of asymptotic completeness have
scattering function is still bounded by a power of s been investigated under the general name of
nonlinear program. The “synergy” created by the combination of these two programs remains, to a large extent, to be explored.

Results of Analytic Completion in the “Linear Program”

We can only outline here some of the geometrical methods which allow one to compute parts of the holomorphy envelopes of the domains $D_N$. One important method, which may be used after applying suitable conformal mappings, reduces to the following basic theorem.

The tube theorem. The holomorphy envelope of a “tube domain” of the form $T_B = \mathbb{R}^n + iB$, where $B$ is an arbitrary domain in $\mathbb{R}^n$ called the basis of the tube, is the convex tube $T_{\hat B} = \mathbb{R}^n + i\hat B$, where $\hat B$ is the convex hull of $B$.

The opposite or oblique edge-of-the-wedge theorem (Epstein 1960 (see Streater and Wightman (1980, ch. 2, ref. 18))) is a refined local version of the tube theorem, in which the basis $B$ is of the form $B = C_1 \cup C_2$, where $C_1$, $C_2$ are two disjoint (opposite or nonopposite) cones with apex at the origin and where $T_B$ is replaced by a pair of “local tubes” $(T^{(\mathrm{loc})}_{C_1}, T^{(\mathrm{loc})}_{C_2})$. Here the adjective “local” means that the real parts of the variables are confined in a given open set $U$ (which can be arbitrarily small). The connectedness of $T_B$ is now replaced by the consideration of any pair of functions $(f_1, f_2)$ holomorphic in these local tubes whose boundary values on their common real set $U$ coincide. The result is that $f_1$ and $f_2$ admit a common analytic continuation $f$ in a local tube $T^{(\mathrm{loc})}_{C}$, where $C$ is the convex hull of $C_1 \cup C_2$. In the case of opposite cones ($C_1 = -C_2$), $f$ is then analytic in the real set $U$, while in the general oblique case $f$ is only analytic in a complex connecting set bordered by $U$ (namely a set which connects $T^{(\mathrm{loc})}_{C_1}$ and $T^{(\mathrm{loc})}_{C_2}$). There exists an extended version of the edge-of-the-wedge theorem in which the boundary values of $f_1$ and $f_2$ are only defined as distributions.

For simplicity, we shall just give a very rough classification of the type of results obtained. We shall distinguish:

• analyticity domains in the space of several (possibly all) variables: they can be of global type or of microlocal type, namely restricted to complex neighborhoods of real points;
• analyticity domains in special families of one-dimensional complex manifolds; and
• combinations of one-dimensional results which generate domains in several variables by a refined use of the tube theorem, called the Malgrange–Zerner “flat tube theorem,” or “flat edge-of-the-wedge theorem.” In the latter, the local tubes $T^{(\mathrm{loc})}_{C_1}$ and $T^{(\mathrm{loc})}_{C_2}$ of $f_1$ and $f_2$ reduce to one-variable domains of the upper half-plane in separate variables $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$ but with a common range of real parts $(x_1, x_2) \in U$. The data $f_1(z_1, x_2)$ and $f_2(x_1, z_2)$ have coinciding boundary values ($f_1(x_1, x_2) = f_2(x_1, x_2)$) in the limit ($y_1 \to 0$, $y_2 \to 0$). The result is again the existence of a common analytic continuation to $f_1$ and $f_2$, which is a function of two complex variables $f(z_1, z_2)$ in the intersection of the quadrant ($y_1 > 0$, $y_2 > 0$) with a complex neighborhood of $U$. (Note that this result of complex analysis still holds when the real boundary values of the holomorphic functions have singularities, namely are only defined in the sense of distributions).

Global analyticity properties. The following property (discovered by Streater for three-point functions) looks like an extension of the tube theorem. The holomorphy envelope of the union of two tubes $T_\lambda$, $T_{\lambda'}$ corresponding to adjacent pairs of cells $(\lambda, \lambda')(J_1, J_2)$ together with a complex connecting set bordered by $R_{\lambda,\lambda'} = \{[p]_N;\ p_{J_1}^2 < m_{J_1}^2\}$ is the convex hull $T_{\lambda,\lambda'}$ of the union of these tubes minus the following analytic hypersurface $\sigma_{J_1}$ which can be called “a cut”: $\sigma_{J_1} = \{[k]_N :\ k_{J_1}^2 = m_{J_1}^2 + \rho,\ \rho \geq 0\}$. The interest of this result (although it remains by itself an off-shell result) is that it can generate larger cut-domains by additional analytic completions, which may have intersections with the complex mass shell (see below for the case $N = 4$).

Microlocal analyticity properties. In the case of the four-point function $\hat H_4$, it is possible to consider opposite cut-domains of the previous type, for which $J_1 = \{1, 2\}$ is the energy-cut of the channel $(1, 2 \to 3, 4)$, and for which the spectral conditions prescribe an “edge-of-the-wedge situation” in the neighborhood of the corresponding mass-shell component $M_{(1,2 \to 3,4)}$. The result is that $H_4$ is proved to be holomorphic in a full complex cut-neighborhood of $M_{(1,2 \to 3,4)}$ in the ambient complex energy–momentum space. The intersection of this local domain with the complex mass shell $M^{(c)}_{(4)}$ is of course a full cut-neighborhood of $M_{(1,2 \to 3,4)}$ in $M^{(c)}_{(4)}$, and this proves that the corresponding scattering amplitude is the boundary value of an analytic scattering function defined as the restriction $\hat F(s, t) = \hat H_4|_{M^{(c)}_{(4)}}$ of $\hat H_4$: it is holomorphic in a domain of complex $(s, t)$ space deprived from the s-cut.

In the general case $N > 4$, the results are less spectacular, although a more sophisticated microlocal method involving a “generalized edge-of-the-wedge
theorem” has been applied. This method, which was one of the three methods at the origin of the chapter of mathematics called microlocal analysis (the other two being Hörmander's “analytic wave-front” method and Sato's “microfunctions” method) is based on a local version of the Fourier–Laplace transformation called the FBI transformation (see, e.g., the book on “hypo-analytic structures” by Treves (1992) and in the present context the article “Causality and local analyticity” by Bros and Iagolnitzer (1973) (see Iagolnitzer (1992, ref. [BI1]))).

A first positive result (obtained at first by Hepp in 1965) is the fact that the various real boundary values of $\hat H_N$ admit well-defined restrictions as tempered distributions on the corresponding (real) mass shell $M_{(N)}$; this result is in fact crucial for the rigorous proof of general reduction formulas. However (according to Bros, Epstein, Glaser, 1972 (see Iagolnitzer (1992, ref. [BEG2]))), the local existence of an analytic scattering function in $M^{(c)}_{(N)}$ is not ensured at all points of the mass shell, but only in certain regions. A rather favourable situation still occurs for $(2 \to 3)$-particle collision amplitudes (i.e., for $N = 5$), but in the general case there are large regions of the mass shell where it is only possible to prove (at least in this linear program) that the amplitude is a sum of a limited number of boundary values of analytic functions, defined in local domains of $M^{(c)}_{(N)}$ (see in this connection, Iagolnitzer (1992)).

Analyticity at fixed total energy in momentum transfer variables. A remarkably simple situation had already been exploited before the general analysis of $H_N$ leading to Theorem 1 was carried out. It is the section of the domain of the N-point function in the space of the “initial relative 4-momentum” $k = (k_1 - k_2)/2$ of the s-channel with initial 4-momenta $(k_1, k_2)$, when the total energy–momentum $P = (k_1 + k_2)$ with $P^2 = s$ is kept fixed and real. The remaining 4-momenta $p_3, \ldots, p_N$ such that $p_3 + \cdots + p_N = -P$ are also kept fixed and real. Consider the case when $P$ is (positive) timelike and such that $s \geq 4m^2$. Then it can be seen that one obtains analyticity of (a certain “1-vector restriction” of) $H_N$ with respect to the vector variable $k$ in the union of the two opposite tubes $T^+ = \mathbb{R}^4 + iV^+$, $T^- = \mathbb{R}^4 + iV^-$. Moreover, an edge-of-the-wedge situation holds in view of the spectral coincidence region of the form $k_1^2 = (P/2 + k)^2 < M_1^2$, $k_2^2 = (P/2 - k)^2 < M_2^2$. The corresponding holomorphy envelope is given by a Jost–Lehmann–Dyson domain (see Dispersion Relations), whose section by the complex mass shell $k_1^2 = k_2^2 = m^2$ turns out to give a “spherical tube domain” of the form $\{k;\ k = p + iq;\ k \cdot P = 0,\ k^2 = -s/4 + m^2;\ |q^2| < b^2\}$. The $(2 \to N - 2)$-particle scattering kernel is therefore the boundary value of a scattering function holomorphic in the previous spherical domain of complex k-space.

In the special case of the two-particle scattering amplitude $F(s, t)$, one checks that the previous domain yields for each $s$, $s \geq 4m^2$, an ellipse of analyticity for $\hat F(s, t)$ in the t-plane with foci at $t = 0$ and $u = 4m^2 - s - t = 0$; this ellipse is called the Lehmann ellipse. (We have considered for simplicity the case of a single type of particle with mass $m$ and two-particle threshold at $2m$.) In fact, the squared momentum transfer $t$ is equal to $(k - k')^2$, if $k' = (k_3 - k_4)/2$ denotes the “final relative momentum” of the s-channel, which was here taken to be fixed and real. Moreover, by a similar argument the corresponding absorptive part, namely the discontinuity across the s-cut of the scattering amplitude, can be shown to be holomorphic in a larger ellipse with the same foci called the large Lehmann ellipse.

It is interesting to compare the previous result with the one that one obtains when the fixed vector $P$ is chosen to be spacelike, namely when $s$ has a negative, namely “unphysical” value with respect to the distinguished channel $(1, 2 \to 3, 4)$. For that case, the exploitation of the primitive domain $D_4$ shows that for all negative (unphysical) values $\zeta_i = k_i^2 < 0$; $i = 1, 2, 3, 4$, of the squared mass variables, the function $\hat H_4$ is holomorphic in a cut-plane of the variable $t$, where the cuts are the t-cut ($t = 4m^2 + \rho$, $\rho \geq 0$) and the u-cut ($u = 4m^2 - s - t = 4m^2 + \rho'$, $\rho' \geq 0$). This cut-plane has of course to be compared with the off-shell cut-plane domain at the basis of the proof of dispersion relations (see Dispersion Relations). Here, however, the choice of the squared momentum transfer $t$ as the variable of analyticity allows one to shift to another interpretation in terms of the concept of angular momentum.

Analyticity in the complex angular momentum variable. In all the situations previously considered for the case $N = 4$, one can see that at fixed real values of the squared energy $s$ and of the squared masses $\zeta = \{\zeta_i;\ i = 1, 2, 3, 4\}$, the complex initial and final relative 4-momenta $k$ and $k'$ have directions which vary on the complexified sphere $S^{(c)}$. Moreover, the corresponding restriction of $\hat H_4$ to that sphere turns out to be always well defined and analytic on the real part of that sphere: it therefore defines a kernel on the sphere, which, in view of Poincaré invariance, is invariant under the rotations and therefore admits a convergent expansion in Legendre polynomials. Let us call $h_\ell(s; \zeta)$ the corresponding sequence of Legendre coefficients.
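As a numerical aside (a sketch under stated assumptions, not part of the analysis above), Legendre coefficients of a rotation-invariant kernel can be computed by Gauss–Legendre quadrature. The model function $f(z) = 1/(a - z)$ with $a > 1$, the variable $z$ (standing for the cosine of the scattering angle, so that in equal-mass kinematics the foci $t = 0$ and $u = 0$ correspond to $z = \pm 1$) and the helper name `legendre_coefficients` are illustrative choices, not taken from the text. Since this $f$ is analytic inside the ellipse with foci $z = \pm 1$ passing through $z = a$, its coefficients are expected to decrease roughly like $\rho^{-\ell}$, with $\rho = a + \sqrt{a^2 - 1}$.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def legendre_coefficients(f, lmax, nodes=200):
    """Return h_l = (2l+1)/2 * integral_{-1}^{1} f(z) P_l(z) dz for l = 0..lmax."""
    z, w = leg.leggauss(nodes)          # Gauss-Legendre nodes and weights on [-1, 1]
    fz = f(z)
    coeffs = np.empty(lmax + 1)
    for l in range(lmax + 1):
        unit = np.zeros(l + 1)
        unit[l] = 1.0                   # coefficient vector selecting P_l
        coeffs[l] = 0.5 * (2 * l + 1) * np.sum(w * fz * leg.legval(z, unit))
    return coeffs


a = 1.5                                  # hypothetical position of the nearest singularity
h = legendre_coefficients(lambda z: 1.0 / (a - z), lmax=25)
rho = a + np.sqrt(a * a - 1.0)           # sum of the semi-axes of the ellipse through z = a
ratios = h[1:] / h[:-1]                  # successive ratios h_{l+1} / h_l
print(ratios[-5:], 1.0 / rho)
```

The printed ratios should come close to $1/\rho$ as $\ell$ grows: a larger ellipse of analyticity (larger $a$) gives a faster exponential decrease of the coefficients.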
In the first case considered above, the sequence $h_\ell(s; \zeta)$ coincides (all $\zeta_i$ being equal to $m^2$) with what physicists call the set of partial waves $f_\ell(s)$ of the scattering amplitude. The analyticity of $\hat H_4$ on a complex spherical tube of $S^{(c)}$, namely of $\hat F(s, t)$ in the Lehmann ellipse, is then equivalent to a certain exponential decrease property with respect to $\ell$ of the sequence of partial waves.

In the second case, where $s$ and the $\zeta_i$ are negative, it can be seen that the sphere $S$ describes 4-momentum configurations which all belong to a certain Euclidean subspace $E_4$ of $M^{(c)}_4$. But this situation is much more favourable from the viewpoint of analyticity, since $\hat H_4$ can be seen to be holomorphic on the full complex submanifold $S^{(c)} \times S^{(c)}$ minus two sets $\Sigma_t$ and $\Sigma_u$ which correspond to the t- and u-cuts of the complex t-plane. Then this larger analyticity property turns out to be equivalent to the fact that the sequence $h_\ell(s; \zeta)$ admits an interpolation $\tilde H(\lambda; s; \zeta)$ holomorphic in a certain half-plane of the form $\mathrm{Re}\,\lambda > \ell_0$ such that for all integers $\ell > \ell_0$ one has: $\tilde H(\ell; s; \zeta) = h_\ell(s; \zeta)$. The value of $\ell_0$ is linked to the power bound at large momenta that must be satisfied by $\hat H_4$ as a consequence of the temperateness property included in the Wightman axiomatic framework (Bros and Viano 2000).

Of course, this nice analytic structure in a complex angular momentum variable could extend to the set of physical partial waves $f_\ell(s)$ if one could establish the analytic continuation of $\hat F(s, t)$ in a cut-plane of $t$ containing the Lehmann ellipses, but this seems out of the possibilities at least of the linear program.

The “Nonlinear Program” and Its Two Main Aspects

The extension of the analyticity domains by positivity and the derivation of bounds by unitarity. Positivity conditions of the form [6] have been extensively applied to the case $N = 4$ (namely for subsets $J$ with two elements). The main result (Martin 1969) consists in the possibility of differentiating the forward dispersion relations with respect to $t$ and, as a consequence, of enlarging the analyticity domain in $t$ at fixed $s$: the Lehmann ellipse, whose size shrinks to zero when $s$ tends to infinity, can then be replaced by an ellipse (i.e., the Martin ellipse) whose maximal point $t = t_{\max} > 0$ is fixed when $s$ goes to infinity. This justifies the extension of dispersion relations in $s$ to positive values of $t$; then in a second step the use of unitarity relations for the partial waves allows one to obtain Froissart-type bounds on the scattering amplitudes (see Martin (1969)).

Asymptotic completeness and BS-type structural analysis. The BS equations were first introduced as identities of formal series in the perturbative approach of QFT, and the idea of considering such identities as exact equations having a conceptual content in the general axiomatic framework of QFT was introduced and developed by Symanzik in 1960. However, it took a long time before its integration in the analytic program of QFT (Bros 1970 (see Iagolnitzer (1992, ref. [B1]))). These developments belong to the nonlinear program since they rely on quadratic integral equations between the various N-point functions, which express the postulate of asymptotic completeness via the use of appropriate reduction formulas.

For brevity, the general set of BS-type equations for the N-point functions with $N > 4$ will not be presented. The simplest BS-type equation, which concerns the four-point function, can be written as follows:

$$\hat H_4(K; k, k') = B(K; k, k') + (\hat H_4 \circ_s B)(K; k, k') \qquad [9]$$

where

$$(\hat H_4 \circ_s B)(K; k, k') = \int_{\Gamma} \hat H_4(K; k, k'')\, B(K; k'', k')\, G\!\left(\frac{K}{2} + k''\right) G\!\left(\frac{K}{2} - k''\right) d^4k'' \qquad [10]$$

In the latter, the s-channel is privileged, with $s = K^2$, $K = (k_1 + k_2)$; $\hat H_4$ is seen as a K-dependent kernel ($k$ and $k'$ are the initial and final relative 4-momenta already defined), and the new object $B$ to be studied is also a K-dependent kernel. The function $G(k)$ is holomorphic in $k^2$ in a cut-plane except for a pole at $k^2 = m^2$ which plays a crucial role. (It is essentially the “propagator” or two-point function of the field theory considered). Apart from pathologies due to the Fredholm alternative, the correspondence between $\hat H_4$ and $B$ is one-to-one, but the peculiarity concerns the integration cycle $\Gamma$ of [10]: it is a complex cycle of real dimension 4, which coincides with the Euclidean space of the vector variable $k''$ when all the 4-momenta are Euclidean, and can always be distorted inside the analyticity domain of $\hat H_4$ together with the external variables. The exploitation of the Fredholm equation in complex space with “floating integration cycles” then implies that $B$ is holomorphic at least in the primitive domain of $\hat H_4$.

An important geometrical aspect of the integration on the cycle $\Gamma$ in [10] is the fact that this cycle is “pinched” between the pair of poles of the functions $G$ when $K^2$ tends to its threshold value ($s = 4m^2$).
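The algebraic structure of the BS-type equation [9] can be illustrated on a finite-dimensional toy model (a minimal sketch under strong assumptions, not the construction used in the text). Here the kernels are replaced by hypothetical $n \times n$ matrices `H` and `B`, and the integration over $k''$ together with the two factors $G$ in [10] is replaced by a hypothetical diagonal weight matrix `W`, so that [9] becomes `H = B + H @ W @ B`. The cycle $\Gamma$ and its distortion play no role in this toy model. For a rank-one `B = g * outer(v, v)` the solution is `H = B / (1 - g * v @ W @ v)`, so `H` has a pole in the coupling `g` exactly where the determinant `det(1 - W @ B)` vanishes.

```python
import numpy as np


def solve_bs(B, W):
    """Solve the discretized equation H = B + H @ W @ B for H."""
    n = B.shape[0]
    M = np.eye(n) - W @ B               # the equation reads H @ M = B
    return np.linalg.solve(M.T, B.T).T  # transpose trick: M.T @ H.T = B.T


def fredholm_denominator(B, W):
    """Finite-dimensional analogue of a Fredholm determinant: det(1 - W B)."""
    return np.linalg.det(np.eye(B.shape[0]) - W @ B)


# Hypothetical data: three "momentum" points, positive weights, rank-one kernel.
W = np.diag([0.5, 1.0, 1.5])
v = np.ones(3)
g = 0.2
B = g * np.outer(v, v)

H = solve_bs(B, W)
D = fredholm_denominator(B, W)          # equals 1 - g * (v @ W @ v) for rank-one B
print(D)
print(H)
```

With these numbers `v @ W @ v = 3`, so `D = 0.4` and `H = B / D`; letting `g` approach 1/3 drives `D` to zero and `H` to infinity, which is the toy analogue of a pole generated by a zero of a Fredholm-type denominator.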
The mathematical concepts encountered in this pinching phenomenon are closely related to those used in the study of analyticity properties and Landau singularities of the Feynman amplitudes in the perturbative approach of QFT (in this connection, see the books by Hwa and Teplitz (1966) and by F Pham (2005) and references therein).

The first basic result is that it is equivalent for $\hat H_4$ to satisfy an asymptotic completeness equation in the pure two-particle region $4m^2 < s < 9m^2$ and for $B$ to satisfy the following property called two-particle irreducibility: $B$ satisfies dispersion relations in $s$ such that the s-cut begins at the three-particle threshold: $s = 9m^2$.

The consequence of this extended analyticity property of $B$ is that it generates the following type of analyticity properties for $\hat H_4$:

1. The existence of a two-sheeted analytic structure for $\hat H_4$ over a domain of the s-plane containing the interval $4m^2 \leq s < 9m^2$, with a square-root-type branch point at the threshold $s = 4m^2$.
2. Composite particles. There exists a Fredholm-type expression

$$\hat H_4(K; k, k') = \frac{N(K; k, k')}{D(K^2)} \qquad [11]$$

where $N$ and $D$ are expressed in terms of $B$ via Fredholm determinants, which shows that in its second sheet $\hat H_4$ may have poles in $s = K^2$, generated by the zeros of $D$. These poles are interpreted as resonances or unstable particles. The generation of real poles in the first sheet (i.e., bound states) is also possible under special spectral assumptions of QFT.
3. Complex angular momentum diagonalization of BS-type equations (Bros and Viano 2000, 2003). The operation $\circ_s$ in the BS-type equation [9] contains not only an integration over squared-mass variables, but also a convolution product on the sphere $S$; the latter is transformed into a product by the Legendre expansion of four-point functions described previously in the subsection “Analyticity in the complex angular momentum variable.” As a result, there is a partially diagonalized transform of eqn [9] in terms of the functions $\tilde H(\lambda; s; \zeta)$ and $\tilde B(\lambda; s; \zeta)$, which allows one to write a Fredholm formula similar to [11], namely

$$\tilde H(\lambda; s; \zeta) = \frac{\tilde N(\lambda; s; \zeta)}{\tilde D(\lambda; s)} \qquad [12]$$

Then under suitable increase assumptions on $B$, there may exist a half-plane of the form $\mathrm{Re}\,\lambda > \ell_1$ (with $\ell_1 < \ell_0$) such that $\tilde H(\lambda; s; \zeta)$ admits poles in the joint variables $\lambda$ and $s$, corresponding to the concept of Regge particle: the composite particles introduced in (2) might then be integrated in the Regge particle, although they manifest themselves physically only for integral values $\ell$ of $\lambda$ with the corresponding spin interpretation. Of course, this scenario is by no means proven to hold in the general analytic program of QFT, but we have seen that the relevant “embryonary structures” are conceptually built-in, so that the phenomenon might hopefully be produced in a definite quantum field model.
4. Byproducts of BS-type structural analysis for $N = 5$ and $N = 6$. Relativistic exact structural equations for $(3 \to 3)$-particle collision amplitudes, which generalize the Faddeev structural equations of nonrelativistic potential theory, have been shown to be valid in the energy region of “elastic” collisions (i.e., with total energy bounded by $4m$); relevant Landau singularities of tree diagrams and triangular diagrams have been exhibited as a by-product in this low-energy region (Bros, and also Combescure, Dunlop in two-dimensional field models, 1981 (see Iagolnitzer (1992, refs. [B3], [B4], [CD]))). Moreover, crossing domains on the complex mass shell for $(2 \to 3)$-particle collision amplitudes have been obtained (Bros 1986 (see Iagolnitzer (1992, ref. [B1]))) by conjointly using ($N = 5$) BS-type equations together with analytic completion properties (see, e.g., the “Crossing lemma” in Dispersion Relations).

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Dispersion Relations; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Thermal Quantum Field Theory.

Further Reading

Bros J and Viano GA (2000) Complex angular momentum in general quantum field theory. Annales Henri Poincaré 1: 101–172.
Bros J and Viano GA (2003) Complex angular momentum diagonalization of the Bethe–Salpeter structure in general quantum field theory. Annales Henri Poincaré 4: 85–126.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Hwa RC and Teplitz VL (1966) Homology and Feynman integrals. New York: Benjamin.
Iagolnitzer D (1992) Scattering in Quantum Field Theories: The Axiomatic and Constructive Approaches, Princeton Series in Physics. Princeton: Princeton University Press.
Jost R (1965) The General Theory of Quantized Fields. Providence: American Mathematical Society.
Martin A (1969) Scattering Theory: Unitarity, Analyticity and Crossing, Lecture Notes in Physics. Berlin: Springer.
Pham F (2005) Intégrales singulières. Paris: EDP Sciences/CNRS Éditions.
Streater RF and Wightman AS (1964, 1980) PCT, Spin and Statistics, and all that. Princeton: Princeton University Press.
Treves F (1992) Hypoanalytic Structures. Princeton: Princeton University Press.

Scattering, Asymptotic Completeness and Bound States


D Iagolnitzer, CEA/DSM/SPhT, CEA/Saclay, Gif-sur-Yvette, France
J Magnen, Ecole Polytechnique, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Relativistic quantum field theory (QFT) has been mainly developed since the 1950s in the perturbative framework. Quantities of interest then appear as infinite sums of Feynman integrals, corresponding to infinite series expansions with respect to couplings. This approach has led to basic successes for practical purposes, but suffered from crucial defects from conceptual and mathematical viewpoints. First, individual terms were a priori infinite: this was solved by perturbative renormalization. However, even so, the series remain divergent. Two rigorous approaches have been developed since the 1960s. The axiomatic approach aims to establish a general framework independent of any particular model (Lagrangian interaction) and to analyze general properties that can be derived in that framework from basic principles. The “constructive” approach aims to rigorously establish the existence of nontrivial QFT models (theories) and to directly analyze their properties. Some of the fundamental bases are described in this encyclopedia in the articles by J Bros, D Buchholz and J Summers, and by G Gallavotti, respectively. This article aims at a deeper study of particle analysis and scattering. In contrast to the articles by Buchholz and Summers and by G Gallavotti, it is restricted to massive theories, a rather strong restriction, but for such theories it goes much further in particle analysis.

From a purely physical viewpoint, results remain limited: the models rigorously defined so far are weakly coupled models in spacetime dimensions 2 or 3, results on bound states depend on specific kinematical factors in these dimensions, proofs of asymptotic completeness (AC) are not yet complete, etc. On the positive side, we might say that the analysis and results are of interest from both conceptual and physical viewpoints; on the other hand, these works have also largely been related and have contributed to important, purely mathematical developments, for example, in the domain of analytic functions of several complex variables, microlocal analysis, etc.

The general framework of QFT based on Wightman axioms is introduced in the next section. Massive theories are characterized in that framework by a condition on the mass spectrum. Haag–Ruelle asymptotic theory then allows one to define, in the Hilbert space $\mathcal{H}$ of states, two subspaces $\mathcal{H}_{\mathrm{in}}$ and $\mathcal{H}_{\mathrm{out}}$ corresponding to states that are asymptotically tangent, before and after interactions, respectively, to free-particle states. The AC condition $\mathcal{H} = \mathcal{H}_{\mathrm{in}} = \mathcal{H}_{\mathrm{out}}$ introduces a further important implicit particle content in the theory. Collision amplitudes or scattering functions are then well defined in the space of on-mass-shell initial and final energy–momenta (satisfying energy–momentum conservation). The LSZ “reduction formulas” give their link with chronological functions of the fields.

Basic properties of scattering amplitudes that follow from the Wightman axioms are then outlined. In particular, these axioms allow one to define the “N-point functions,” which are analytic in a domain of complex energy–momentum space containing the Euclidean region (imaginary energy components), and from which chronological and scattering functions can be recovered. Other results at that stage include the on-shell physical sheet analyticity properties of four-point functions, as well as general asymptotic causality and local analyticity properties for $N \geq 4$.

Next, we describe results derived from AC and regularity conditions on analyticity and asymptotic causality in terms of particles. In particular, the analysis of the links between analyticity properties of irreducible kernels (satisfying Bethe–Salpeter type equations) and AC in low-energy regions is included, following ideas of K Symanzik.

The final three sections are devoted to the analysis of models.

Models of QFT have been rigorously defined in Euclidean spacetime, through cluster and, more generally, phase-space expansions which are shown
to be convergent at small coupling (and replace the nonconvergent expansions of perturbative QFT). Examples of such models are the super-renormalizable massive $\varphi^4$ models in dimensions 2 or 3 (in the 1970s) and the “just renormalizable” massive (fermionic) Gross–Neveu model in dimension 2 (in the 1980s). The N-point functions of these models can be shown to have exponential fall-off in Euclidean spacetime. By the usual Fourier–Laplace transform theorem, one obtains in turn analyticity properties in corresponding regions away from the Euclidean energy–momentum space.

On the other hand, à la Osterwalder–Schrader properties can be established in Euclidean spacetime. By analytic continuation from imaginary to real times, it is in turn shown that a corresponding nontrivial theory satisfying the Wightman axioms is recovered on the Minkowskian side. This analysis is omitted here. However, no information is obtained in that way on the mass spectrum, AC, energy–momentum space analyticity, etc. Such results can be obtained through the use of irreducible kernels. This was initiated by T Spencer in the 1970s and then developed along the same line (Spencer and Zirilli, Dimock and Eckmann, Koch, Combescure, and Dunlop). We outline here the more general approach of the present authors. In the latter, irreducible kernels are directly defined through “higher-order” cluster expansions which are again convergent at sufficiently small coupling. They are shown to satisfy exponential fall-off in Euclidean spacetime with rates better than those of the N-point functions, and hence corresponding analyticity in larger regions around (and away from) the Euclidean energy–momentum space. Results will then be established by analytic continuation, from the Euclidean up to the Minkowskian energy–momentum space, of structure equations that express the N-point functions in terms of irreducible kernels. These structure equations are infinite series expansions, with again convergence properties at small coupling. In the cases $N = 2$ and $N = 4$ (even theories), the re-summation of these structure equations gives, respectively, the Lippmann–Schwinger and Bethe–Salpeter (BS) integral equations (up to some regularization).

The one-particle irreducible (1PI) two-point kernel $G_1$ is analytic up to $s = (2m)^2 - \varepsilon$, where $\varepsilon$ is small at small coupling ($s$ is the squared center of mass energy of the channel). A simple argument then allows one to show analyticity of the actual two-point function in the same region up to a pole at $k^2 = m_{\mathrm{ph}}^2$: this shows the existence of a first basic physical mass $m_{\mathrm{ph}}$ (close at small coupling to the bare mass $m$). In a free theory (zero coupling) with one mass $m$, there is only one corresponding particle. At small coupling, the existence of other (stable) particles is not a priori expected; nevertheless, we will see that such particles (two-particle bound states) will occur in some models in view of kinematical threshold effects.

The 2PI four-point kernel $G_2$ is shown to be analytic up to $s = (4m)^2 - \varepsilon$ in an even theory. On the other hand, it satisfies a (regularized) BS equation. In a way analogous to the section “AC and analyticity,” starting here from the analyticity of $G_2$, the actual four-point function $F$ is in turn analytic or meromorphic in that region up to the cut at $s \geq 4m^2$, and the discontinuity formula associated with AC in the low-energy region is obtained.

For some models (depending on the signs of some couplings), it will be shown that $F$ has a pole in the physical sheet, below the two-particle threshold (at a distance from it which tends to zero as the coupling itself tends to zero). This pole then corresponds to a further stable particle.

More generally, and up to some technical problems, the structure equations should allow one to derive various discontinuity formulas of N-point functions including those associated with AC in increasingly higher-energy regions. Asymptotic causality in terms of particles and related analyticity properties (Landau singularities, etc.) should also follow. However, in this approach, results should be obtained only for very small couplings as the energy region considered increases.

Note: Notations used are different in the next two sections on the one hand, and the final three sections on the other. These notations follow the use of, respectively, axiomatic and constructive field theory; for instance, $x$ and $p$ are real on the Minkowskian side in the next two sections whereas they are real on the Euclidean side in the last three sections. The mass $m$ in the next two sections is a physical mass, whereas it is a bare mass in the last three sections (where a physical mass is denoted $m_{\mathrm{ph}}$).

The General Framework of Massive Field Theories

We denote by $x = (x_0, \vec{x})$ a (real) point in Minkowski spacetime with respective time and space components $x_0$ and $\vec{x}$ (in a given Lorentz frame); $x^2 = x_0^2 - \vec{x}^2$. Besides the usual spacetime dimension $d = 4$, possible values 2 or 3 will also be considered. In all that follows, the unit system is such that the velocity $c$ of light is equal to 1. Energy–momentum variables, dual (by Fourier transformation) to time and space
Scattering, Asymptotic Completeness and Bound States 477

variables, respectively, are denoted by $p = (p_0, \vec{p})$; $p^2 = p_0^2 - \vec{p}^{\,2}$.
We describe below the Wightman axiomatic framework, though alternative ones such as “local
quantum physics” based on the Araki–Haag–Kastler axioms may be used similarly for present
purposes. For simplicity, unless otherwise stated, we consider a theory with only one basic
(neutral, scalar) field A; A is defined on spacetime as an operator-valued distribution: for each
test function f, A(f) (formally $\int A(x) f(x)\,dx$) is an operator in a Hilbert space H of
states. A physical state is represented by a (normalized) vector in H modulo scalar multiples. It
has to be physically understood as “sub specie aeternitatis” (i.e., “with all its evolution,” the
Heisenberg picture of quantum mechanics being always adopted). It is assumed that there exists
in H a representation of the Poincaré group (semidirect product of pure Lorentz transformations
and spacetime translations).
The Wightman axioms include:
1. local commutativity: A(x) and A(y) commute if $x - y$ is spacelike: $(x - y)^2 < 0$.
2. the spectral condition ( = positivity of the energy in relativistic form): the spectrum of the
energy–momentum operators (infinitesimal generators of spacetime translations) is contained in
the cone $\bar{V}_+$ ($p^2 \geq 0$, $p_0 \geq 0$). In a massive theory, the spectrum is more
precisely assumed to be contained in the union of the origin (that will correspond to the vacuum
vector introduced next), of one or more discrete mass-shell hyperboloids $H_+(m_i)$
($p^2 = m_i^2$, $p_0 > 0$) with strictly positive masses $m_i$, and of a continuum. For
simplicity, and unless otherwise stated, we consider in this section a theory with only one mass
m and a continuum starting at 2m (but this will not be so in a theory with “two-particle bound
states”). This condition introduces a first (partial) particle content of the theory. In models,
physical masses will not be introduced at the outset but will have to be determined.
3. existence in H of a vacuum vector $\Omega$, which is the only invariant vector under Poincaré
transformations up to scalar multiples; it is moreover assumed that the vector space generated by
the action of field operators on the vacuum is dense in H.
4. Poincaré covariance of the theory.
Subspaces $H_{in}$ and $H_{out}$ of H can be defined by limiting procedures. To that purpose, one
considers test functions $f_{j,t}(x)$ with Fourier transforms of the form
$\tilde{f}_j(p)\, e^{i(p_0 - [\vec{p}^{\,2} + m^2]^{1/2})t}$, where the functions $\tilde{f}_j$
have their supports in a neighborhood of the mass-shell $H_+(m)$. It can then be shown that
vectors of the form $\Psi_t = A(f_{1,t}) A(f_{2,t}) \cdots A(f_{n,t})\,\Omega$ converge to
limits in H when $t \to \pm\infty$, respectively, and that these limits depend only on the
mass-shell restrictions $\tilde{f}_j|_{H_+(m)}$ of the test functions.
$H_{in}$ and $H_{out}$ are interpreted physically as subspaces of states that are “asymptotically
tangent,” before and after the interactions, respectively, to free-particle states with particles
of mass m. They are in fact both isomorphic to the free-particle Fock space F, namely the direct
sum of n-particle spaces of “wave functions” depending on n on-mass-shell energy–momenta
$p_1, p_2, \ldots, p_n$.
AC is the assertion that $H = H_{in} = H_{out}$, that is, that each state in H is asymptotically
tangent to a free-particle state, with particles of mass m, both before and after interactions
(the two free-particle states are different if there are interactions). This condition cannot be
expected to always hold in the general framework introduced above, even if we restrict our
attention to “physically reasonable” theories in which states of H are asymptotically tangent to
free-particle states before and after interactions: the absence of other stable particles with
different masses is not guaranteed. For instance, even if A is “neutral,” the action of field
operators on the vacuum might generate pairs of “charged” particles with opposite charges,
whatever “charge” one might imagine. Individual charged particles cannot occur in the neutral
space H and their mass thus does not appear in the spectral condition. Hence, such states of
pairs of charged particles will not belong to $H_{in}$ or $H_{out}$ although they belong to H.
However, if the set of charged particles is known, it can be shown that the above framework might
be enlarged by defining charged fields, in such a way that AC might still be valid in the enlarged
framework (see the article of Buchholz and Summers). For simplicity, we restrict below our
attention to the simplest theories in which AC holds in the way stated above.
If AC holds, it is shown that there exists a linear operator S from H to H, called “collision
operator” or “S-matrix,” that relates the “initial” and “final” free-particle states to which a
state in H is tangent before and after interactions, respectively; if AC does not hold, S can
also be defined as an operator in F. Collision amplitudes or scattering functions are the
energy–momentum kernels of S for given numbers m and n of initial and final particles. As is
easily seen, they are well-defined distributions on the space of all initial and final on-shell
energy–momenta. For convenience, we will denote by $p_k$ the physical energy–momentum of a final
particle with index k ($p_k \in H_+(m)$), and by $-p_k$ the physical energy–momentum of an
initial particle ($-p_k \in H_+(m)$).
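The kinematical conventions used in this section (metric signature with c = 1, spacelike separation as in the local commutativity axiom, and the mass shell $H_+(m)$) can be illustrated by a minimal sketch. The function names below are hypothetical and chosen only for illustration; they are not part of the article.

```python
import math

def minkowski_square(x):
    """Return x0^2 - |x_vec|^2 for x = (x0, x1, ..., x_{d-1}), with c = 1."""
    x0, *xs = x
    return x0 * x0 - sum(c * c for c in xs)

def is_spacelike_separated(x, y):
    """True if (x - y)^2 < 0, the case in which A(x) and A(y) commute."""
    diff = tuple(a - b for a, b in zip(x, y))
    return minkowski_square(diff) < 0

def on_shell_momentum(p_vec, m):
    """Build p = (p0, p_vec) on H_+(m): p0 = sqrt(|p_vec|^2 + m^2) > 0."""
    return (math.sqrt(sum(c * c for c in p_vec) + m * m), *p_vec)

def on_mass_shell(p, m, tol=1e-9):
    """Check p^2 = m^2 with p0 > 0 (hypothetical numerical tolerance)."""
    return p[0] > 0 and abs(minkowski_square(p) - m * m) <= tol * max(1.0, m * m)

def squared_energy(p1, p2):
    """s = (p1 + p2)^2 for two physical energy-momenta."""
    return minkowski_square(tuple(a + b for a, b in zip(p1, p2)))
```

For two physical particles of mass m, `squared_energy` is never below $(2m)^2$, which is where the continuum of the spectral condition starts in the one-mass theory considered here.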
Wightman Functions, Chronological Functions, and LSZ Reduction Formulas

The N-point Wightman “functions” $W_N$ are defined as the vacuum expectation values (VEVs) of
the products of N field operators, namely:
$$W_N(x_1, x_2, \ldots, x_N) = \langle \Omega,\ A(x_1) A(x_2) \cdots A(x_N)\,\Omega \rangle$$
The chronological functions $T_N$ are the VEVs of the chronological products of the fields
$A(x_1), \ldots, A(x_N)$: in the latter, fields are ordered according to decreasing values of the
time components of the points $x_k$. $T_N$ is essentially well defined due to local commutativity
with, however, problems not treated here at coinciding points.
$\tilde{T}_N(p_1, \ldots, p_N)$ will denote the Fourier transform of $T_N$. In view of the
invariance of the theory under spacetime translations, the functions above are invariant under
global spacetime translation of all points $x_k$ together. Hence, their Fourier transforms contain
an energy–momentum conservation (e.m.c.) delta function $\delta(p_1 + p_2 + \cdots + p_N)$.
Connected N-point functions are defined by induction (over N) via a formula expressing each
(nonconnected) function as the sum of the corresponding connected function and of products of
connected functions depending on subsets of points. In contrast to nonconnected functions, the
analysis shows that connected functions in energy–momentum space do not contain in general
e.m.c. delta functions involving subsets of energy–momenta.
It can be shown that the two-point function
$\tilde{T}_2(p_1, p_2) = \delta(p_1 + p_2)\,\tilde{T}_2(p_1)$ has a pole of the form
$1/(p_1^2 - m^2)$ and that $\tilde{T}_N$ has similar poles for each energy–momentum variable
$p_k$ on the mass-shell. The connected, amputated chronological function
$\tilde{T}_N^{amp,c}$ is defined by multiplying
$(\tilde{T}_N)_{connected} = \tilde{T}_N^{c}$ (for $N \geq 2$) by the product of all factors
$p_k^2 - m^2$ that cancel these poles. It is then shown that it can be restricted as a
distribution to the mass-shell of any physical process with m initial and n final particles, with
m + n = N, and that this restriction coincides with the collision amplitude of the process. A
process is here characterized by fixing the initial and final indices.
The analyticity properties of interest (described below) will apply to the connected functions
after factoring out their global e.m.c. delta functions.

The Analytic N-point Functions

The Wightman axioms (without so far AC) yield general analyticity, and also asymptotic causality,
properties that we now describe. The analysis is essentially based on the interplay of support
properties in x-space arising from local commutativity and the definition of chronological
operators, and support properties in p-space due to the spectral condition. Support properties in
x-space apply to cell and more general “paracell” functions which are VEVs of adequate
combinations of products of “partial” chronological operators. It is shown that each such
function has support in x-space in a closed cone $C_S$ (with apex at the origin). Moreover, for
cell functions, the cone $C_S$ is convex and salient. Hence, in view of the usual Laplace
transform theorem, the cell function in p-space (after Fourier transformation) is the boundary
value of a function analytic in complex space in the tube Re p arbitrary, Im p in the open dual
cone $\tilde{C}_S$ of $C_S$. It is also shown that, near any real point
$P = (P_1, \ldots, P_N)$, the chronological function in p-space coincides with one or more cell
functions.
Together with support properties in p-space arising from the spectral condition and the use of
coincidence relations between some cell functions (in adequate real regions in p-space), one then
shows the existence, for each N, of a well-defined, unique analytic function $F_N$, called the
“analytic N-point function,” whose domain of analyticity, the “primitive domain of analyticity,”
in complex p-space contains all the tubes $\mathcal{T}_S$ associated with the cell functions. It
also contains in particular a complex neighborhood of the Euclidean energy–momentum space, which
consists of energy–momenta $P_k$ with real $\vec{P}_k$ and imaginary energies $(P_k)_0$.
Moreover, the chronological function $\tilde{T}_N^{amp,c}$ is the boundary value of $F_N$ at all
real points P, from imaginary directions which include those of the convex envelope of the cones
$\tilde{C}_S$ associated with cell functions that coincide locally with $\tilde{T}_N^{amp,c}$.
However, the primitive domain has an empty intersection with the complex mass-shell, and thus
gives no result on analyticity properties of collision amplitudes on the (real or complex)
mass-shell. For N = 4, it has been possible to largely extend the primitive domain (which is not
a “natural domain of holomorphy”) by computing (parts of) its holomorphy envelope, which now has
a nonempty intersection with the complex mass-shell. It is shown in turn that the four-point
function $F_4$ can be restricted to the complex mass-shell in a one-sheeted domain, called the
“physical sheet,” that admits each (real) physical region on its boundary (there is here one
physical region for each choice of the two initial and the two final indices, the corresponding
physical regions being disconnected from each other). In each physical region, the collision
amplitude is the boundary value of the mass-shell restriction of $F_4$, from the corresponding
half-space of “$+i\varepsilon$” directions Im s > 0, where s is the (squared) energy of the
process.
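The Laplace transform theorem invoked above for cell functions can be stated as the following worked formula. This is a sketch of the standard statement; the sign convention of the Fourier transform and the notation $q \cdot x$ for the pairing between the dual variables are assumptions made here for illustration.

```latex
% Assumptions: f is a tempered distribution with support in a closed, convex,
% salient cone C with apex at the origin; q.x denotes the pairing between the
% dual variables. The open dual cone is
\[
  \tilde{C} \;=\; \{\, q \;:\; q \cdot x > 0 \ \text{ for all } x \in C \setminus \{0\} \,\}.
\]
% For p = Re p + i Im p with Im p in the dual cone, the integrand is damped on
% the support of f, since
\[
  \bigl| e^{\, i p \cdot x} \bigr| \;=\; e^{-\,\mathrm{Im}\, p \,\cdot\, x}
  \;\le\; 1 \qquad (x \in C,\ \mathrm{Im}\, p \in \tilde{C}),
\]
% so that
\[
  \tilde{f}(p) \;=\; \int e^{\, i p \cdot x} f(x)\, dx
\]
% is analytic in the tube { Re p arbitrary, Im p in the dual cone }, and the
% Fourier transform of f at real p is its boundary value as Im p -> 0 inside
% the dual cone.
```

Salience of $C$ is what guarantees that the open dual cone is nonempty, and hence that the tube is a genuine open domain.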
The analyticity domain on the complex mass-shell contains paths of analytic continuation between
the various physical regions (“crossing property”) and admits cuts $s_{ij}$ real
$\geq (2m)^2$ covering the various physical regions. From these analyticity properties in the
physical sheet, one can also derive “dispersion relations” (see Dispersion Relations).

Asymptotic causality and analyticity properties for N ≥ 4

No similar result has been achieved at N > 4, and as a matter of fact, no similar result is
expected if the AC condition is not assumed. The best results achieved so far are decompositions
of the collision amplitude, in various parts of its physical region, as a sum of boundary values
of functions analytic in domains of the complex mass-shell. In contrast to the case N = 4, the
sum reduces to one term only in a certain subset of the physical region. Near other points, the
N-point analytic function cannot be restricted locally to the complex mass-shell, though it can
be decomposed as a sum of terms which, individually, are locally analytic in a larger domain that
intersects the complex mass-shell.
These analyticity properties for N ≥ 4 are a direct consequence of (and equivalent to) an
asymptotic causality property that we now outline. Let $f_{k,\tau}(p)$ be, for each index k, a
test function of the form
$$f_{k,\tau}(p) = e^{i\tau p\cdot u_k}\, e^{-\tau |p - P_k|^2}$$
where each $u_k$ is a point in spacetime, $P_k$ is a given on-shell energy–momentum, and $\tau$
will be a spacetime dilatation parameter ($\tau > 0$). It is well localized in p-space around the
point $P_k$ and its Fourier transform is well localized in x-space around the point $\tau u_k$ up
to an exponential fall-off of width $\sqrt{\tau}$, which is small compared to $\tau$ as
$\tau \to \infty$.
We now consider the action of the (connected, amputated) chronological function on such test
functions. A configuration $u = (u_1, \ldots, u_N)$ will be called “noncausal” at
$P = (P_1, \ldots, P_N)$ if this action decays exponentially as $\tau \to \infty$. In
mathematical terms, u is then outside the “essential support” or “microsupport” at P. The
asymptotic causality property established has roughly the following content: the only possible
causal configurations u at P are those for which energy–momentum can be transferred from the
initial to the final points in future cones. Moreover, at least two initial “extremal” points
must coincide, as must two extremal final points. The simplest example is the case N = 4; if,
for example, indices 1, 2 are initial and 3, 4 final, then the only a priori possible causal
situations are such that $u_3 = u_4$ is in the future cone of $u_1 = u_2$ (in this particular
case Lorentz invariance implies that $u_3 - u_1$ must be proportional to $P_3 + P_4$). In more
general cases, the possible causal configurations u depend on P.

AC and Analyticity

Asymptotic Causality in Terms of Particles and Landau Singularities

As a matter of fact, a better causality property “in terms of particles” (which is the best
possible one) is expected for “physically reasonable” theories if the (stable) particles of the
theory are known. (By physically reasonable, we mean the absence of “à la Martin” pathologies
such as the occurrence of an infinite number of unstable particles with arbitrarily long
lifetimes.) That property expresses the idea that the only causal configurations u at P are those
for which the energy–momentum can be transferred from the initial to the final points via
intermediate stable particles in accordance with classical laws: there should exist a classical
connected multiple scattering diagram in spacetime joining the initial and final points $u_k$,
with physical on-shell energy–momenta for each intermediate particle and energy–momentum
conservation at each (pointwise) interaction vertex.
This property, if it holds, yields in turn (and is equivalent to) improved analyticity of the
analytic N-point function near real physical regions: the (on-shell) collision amplitude is the
boundary value of a unique analytic function in its physical region, at least away from some
“exceptional points.” The boundary value (namely the collision amplitude) is moreover analytic
outside Landau surfaces $L_+(\Gamma)$ of connected multiple scattering graphs $\Gamma$; and along
these surfaces (which are in general smooth codimension-1 surfaces), it is in general obtained
from well-specified “$+i\varepsilon$” directions (that depend in general on the real point P of
$L_+$).
Exceptional points are those that lie at the intersection of two (or several) surfaces
$L_+(\Gamma_1)$, $L_+(\Gamma_2)$, . . . , with opposite causal directions, and hence having no
$+i\varepsilon$ directions in common (in the on-shell framework). Such points do not occur at
N = 4 for two-body processes, in which case the surfaces $L_+$ are the n-particle thresholds
$s = (nm)^2$, with $n \geq 2$, $s = (p_1 + p_2)^2$. They do occur more generally: in a
$3 \to 3$ process, 1, 2, 3 initial, 4, 5, 6 final, this is the case of all points P such that
$P_1 = P_4$, $P_2 = P_5$, $P_3 = P_6$, which all belong to the Landau surfaces of the two graphs
$\Gamma_1$, $\Gamma_2$, with only one internal line joining two interaction
vertices: in the case of $\Gamma_1$ (resp., $\Gamma_2$), the first vertex involves the external
particles 1, 2, 4 (resp., 1, 3, 5), while the second one involves 3, 5, 6 (resp., 2, 4, 6). If
moreover $P_1$, $P_2$, $P_3$ lie in a common plane, previous points P also lie on surfaces $L_+$
of “triangle” graphs with again opposite causal directions at P. The fact that the
$+i\varepsilon$ directions are opposite can equally be checked for the corresponding Feynman
integrals of perturbative field theory.
Remark. The above points are no longer exceptional in spacetime dimension 2. In fact, all
surfaces $L_+$ mentioned then coincide with the (on-shell) codimension-1 surface $p_1 = p_4$,
$p_2 = p_5$, $p_3 = p_6$, with two opposite causal directions. The previous asymptotic causality
property, together with a further “causal factorization” property for causal configurations, then
yields along that surface an actual factorization of the three-body (nonconnected) S-matrix into
a product of two-body scattering functions modulo an analytic background. The latter vanishes
outside the surface, hence is identically zero, for some special two-dimensional models.
In the absence of the AC condition, one clearly sees why the above causality in terms of
particles cannot be established: as we have seen, there is a priori no control on the stable
particles of the theory and on their masses, and pathologies such as those mentioned above cannot
be excluded. Hopefully, the first problem should be solved if AC is assumed, and the second one
should be removed by adequate regularity assumptions. This is the purpose of the so-called
axiomatic nonlinear program, in which one also wishes to examine further problems, for example,
analytic continuation into unphysical sheets, with the occurrence of possible unstable particle
poles and other singularities, nature of singularities, possible multiparticle dispersion
relations, . . . , to cite only a few. Results so far remain limited but provide a first insight
into such problems.

The Nonlinear Axiomatic Program

Results described below are based on discontinuity formulas arising from AC (and essentially
equivalent to it in adequate energy regions), together with some regularity conditions. They can
be established either with or without the introduction of adequate “irreducible” kernels. The
methods rely on some general preliminary results on Fredholm theory in complex space (and with
complex parameters). Irreducible kernels are defined through integral (Fredholm type) equations,
first in the Euclidean region (imaginary energies) and then by local distortions of integration
contours allowing one to reach the Minkowskian region. From discontinuity formulas and algebraic
arguments, these irreducible kernels are shown to have analyticity (or meromorphy) properties
associated with the physical idea of irreducibility (see examples below).
Results obtained so far with or without irreducible kernels are comparable in the simplest cases.
However, the method based on irreducible kernels gives more refined results and seems best
adapted to “extricate” the analytic structure of N-point functions for N > 4.

N = 4, Two-Body Processes in the Low-Energy Region

By an even theory, we mean a theory in which the N-point functions vanish identically for N odd.
Standard results on two-body processes with initial (resp., final) energy–momenta $p_1$, $p_2$
(resp., $p'_1$, $p'_2$) in the low-energy region $(2m)^2 \leq s < (3m)^2$
($s = (p_1 + p_2)^2 = (p'_1 + p'_2)^2$) are based on the “off-shell unitarity equation”
$$F_+ - F_- = F_+ \star F_- \qquad [1]$$
where $F_+(p_1, p_2; p'_1, p'_2)$ and $F_-(p_1, p_2; p'_1, p'_2)$ denote, respectively, the
$+i\varepsilon$ and $-i\varepsilon$ boundary values of the four-point function $F_4$ from above
or below the cut $s \geq (2m)^2$ in the physical sheet, and $\star$ denotes on-shell convolution
over two intermediate energy–momenta. This relation is a direct consequence of AC for s less than
$(3m)^2$, or less than $(4m)^2$ in an even theory. When the four external energy–momentum
vectors $p_1$, $p_2$, $p'_1$, $p'_2$ are put on the mass-shell (on both sides of that relation),
one recovers the usual elastic unitarity relation for the collision amplitude $T_+$ and its
complex conjugate $T_-$:
$$T_+ - T_- = T_+ \star T_-$$
In the exploitation of these relations outlined below, a regularity condition is moreover needed,
for example, the continuity of $F_+$ in the low-energy region.
By considering the unitarity equation as a Fredholm equation for $T_+$ at fixed s (in the complex
mass-shell), one obtains the following result: $T_+$ can be analytically continued as a
meromorphic function of s through the cut (in the low-energy region) in a two-sheeted (d even) or
multisheeted (d odd) domain around the two-particle threshold. Possible poles in the second sheet
(generated by Fredholm theory) will correspond physically to unstable particles. The singularity
at the two-particle threshold is of the square-root type in s for d even, or in
1/log s for d odd. The difference between the two cases is due to the power $(d - 1)/2$ of s,
integer or half-integer, in the kinematical factor arising from on-shell convolution. This result
can also be extended to the off-shell function $F_4$ by applying a further argument of analytic
continuation making use of the off-shell unitarity equation.
Restricting now our attention to an even theory (for simplicity), a similar result also follows
from the introduction of a two-particle irreducible (2PI) BS-type kernel G satisfying (and here
defined from F through) a regularized BS equation of the form
$$F = G + F \circ_M G \qquad [2]$$
where $\circ_M$ denotes convolution over two intermediate energy–momenta with two-point
functions on the internal lines and a regularization factor in order to avoid convergence
problems at infinity (G then depends on the choice of this factor but its properties and the
subsequent analysis do not). Alternatively, one may also introduce a kernel satisfying a
renormalized BS equation, but this is not useful for present purposes.
Starting from the above discontinuity formula [1], one shows in turn that G is indeed “2PI” in
the analytic sense:
$$G_+ = G_- \qquad [3]$$
in the low-energy region. More precisely, G is analytic or meromorphic (with poles that may arise
from Fredholm theory) in a domain that includes the two-particle threshold $s = (2m)^2$, in
contrast to F itself.
The proof of [3] is based on the following relation, which is independent of M (the M dependence
is thus left implicit):
$$\circ_+ - \circ_- = \star \qquad [4]$$
(which is a nontrivial adaptation of the decomposition of a mass-shell delta function as a sum of
plus and minus $i\varepsilon$ poles). A simple algebraic argument then shows essentially the
equivalence between the discontinuity formulas [1] and [3].
In turn, assuming that G has no poles, this analyticity allows one to recover the two-sheetedness
(d even) or multisheetedness (d odd, singularity in 1/log) of F, in view of the BS-type equation.

N = 6, 3–3 Process in the Low-Energy Region (Even Theory)

The result, in the neighborhood of the 3–3 physical region, is here a “structure equation”
expressing the 3–3 function F in the low-energy region as a sum of “à la Feynman contributions”
associated with graphs with one internal line and with triangle graphs, with two-point functions
on internal lines and four-point functions at each vertex, plus a remainder R. The latter is
shown to be a boundary value from $+i\varepsilon$ directions Im s positive, where
$s = (p_1 + p_2 + p_3)^2$, with $p_1$, $p_2$, $p_3$ denoting the energy–momentum vectors of the
initial particles. Further regularity conditions are needed to recover its local physical region
analyticity. The various explicit contributions that we have just mentioned yield the actual
physical region Landau singularities expected in the low-energy 3–3 physical region.
A more refined result, in the approach based on irreducible kernels outlined below, applies in a
larger region and then includes further à la Feynman contributions associated with 2-loop and
3-loop diagrams (the latter do not contribute to “effective” singularities in the neighborhood of
the physical region).
The first result can be established from discontinuity formulas for the three-point function
around two-particle thresholds, arising from AC, and “microsupport” analysis of all terms
involved. In the approach based on irreducible kernels, it is useful to introduce in particular a
3PI kernel $G_3$ that, in contrast to the 3–3 function, will be analytic or meromorphic in a
domain including the three-particle threshold. To that purpose, an adequate set of integral
equations is introduced and the three-particle irreducibility of $G_3$ in “the analytic sense”
is then established. In turn it provides the complete structure equation mentioned above.

More General Analysis

There are so far only preliminary steps in more general situations, in view of the (difficult)
technical problems involved and the need for ad hoc regularity assumptions at each stage. As
already mentioned, the approach based on irreducible kernels seems best adapted. The analysis
should clearly involve more general irreducible kernels with various irreducibility properties
with respect to various channels (and not only with respect to the basic channel considered, such
as the 3–3 channel in the case above). From a heuristic viewpoint, one may first consider to that
purpose adequate formal expansions into (infinite) sums of “à la Feynman contributions” adapted
to the energy regions under investigation. These à la Feynman contributions will involve adequate
irreducible kernels in the graphical sense at each vertex, and the above expansions correspond
formally to the best possible regroupings of Feynman integrals with respect to the energy region
considered. From such expansions, one might
determine adequate sets of integral equations allowing one, together with regularity assumptions,
to carry out an analysis similar to the one above.

The Models

A Euclidean field-theoretical model can be defined by a probability measure $d\mu(\varphi)$ on
the space of tempered distributions $\varphi$ in Euclidean spacetime, whose moments verify the
Osterwalder–Schrader (or similar) axioms. The moments of $d\mu$ are, for each N, the Euclidean
(Schwinger) N-point functions:
$$S(x_1, \ldots, x_N) = \int \varphi(x_1) \cdots \varphi(x_N)\, d\mu(\varphi) \qquad [5]$$
In what follows, the measure $d\mu$ will be a perturbed Gaussian measure which, for the massive
$\varphi^4$ model with a volume cutoff $\Lambda$ and an ultraviolet cutoff $\rho$, is given in d
dimensions by
$$d\mu_{\Lambda,\rho} = e^{-\lambda(\rho)\int_\Lambda \varphi^4(z)\,dz \,+\, a(\rho)\int_\Lambda \varphi^2(z)\,dz}\; d\mu_\rho(\varphi)\,/\,Z_{\Lambda,\rho} \qquad [6]$$
where $Z_{\Lambda,\rho}$ is the normalization factor and where $d\mu_\rho(\varphi)$ is the
Gaussian measure of mean zero ($\int \varphi\, d\mu_\rho = 0$) and covariance
$$C(x - y; \rho) = \int d^d p\; e^{ip(x-y)}\, \frac{e^{-p^2/\rho^2}}{\zeta(\rho)\,p^2 + m^2}$$
where by convention m is called the bare mass.
For d = 2 or 3 one can show that, for $\lambda(\rho) = \lambda$ small enough (depending on m)
and $\zeta(\rho) = 1$, there exists a function $a(\rho)$ ($a(\rho) = O(\lambda)$ as
$\lambda \to 0$) such that, for any set of N distinct points, the function
$S(x_1, \ldots, x_N) = \lim_{\Lambda,\rho \to \infty} S_{\Lambda,\rho}(x_1, \ldots, x_N)$ exists,
is not Gaussian (hence does not correspond to a trivial, free theory), and satisfies the
Osterwalder–Schrader axioms. The connected part $S(x_1, \ldots, x_N)_{connected}$ has the
following perturbative series:
$$\sum_n \frac{(-1)^n}{n!} \lim_{\Lambda,\rho\to\infty} \int \varphi(x_1) \cdots \varphi(x_N) \left[ \int_\Lambda \big(\lambda\varphi^4 - a(\rho)\varphi^2\big)(z)\, dz \right]^n d\mu_\rho(\varphi)_{connected} \qquad [7]$$
which is the (divergent) sum of the connected renormalized (Euclidean) Feynman graphs.
The study of the perturbative series leads to the distinction of:
1. the super-renormalizable theories, where it is possible to take $\lambda(\rho)$,
$\zeta(\rho)$ not depending on $\rho$. In dimension 2, all the models where $\varphi^4$ is
replaced by
$$c_{2p}\varphi^{2p} + c_{2p-1}\varphi^{2p-1} + \cdots + c_5\varphi^5 + \lambda\varphi^4 + c_3\varphi^3 \qquad [8]$$
also exist provided that $c_{2p} > 0$ is small enough depending on m and on the other
coefficients c and $\lambda$, and
2. the just renormalizable theories where $\lambda(\rho)$ (and possibly $\zeta(\rho)$) depend in
general on $\rho$. In models mentioned below $\lambda(\rho) \to 0$ as $\rho \to \infty$; this
characterizes “asymptotic freedom.”
The proof of the existence of the N-point functions makes use of Taylor type expansions with
remainder. The first orders are used to compute $\lambda(\rho)$, $\zeta(\rho)$, $a(\rho)$. The
idea is to consider the functional integral [5], at $\Lambda$, $\rho$ finite, as an integral over
roughly $\rho^d |\Lambda|$ “degrees of freedom” which are weakly coupled. This corresponds to a
decomposition of the phase space (with cutoff both in x-space (the box $\Lambda$) and in p-space
(roughly $|p| < \rho$)). The coupling between different regions in x-space comes from the
propagators C; the coupling between different frequencies in p-space comes from the $\varphi^4$
term (the interaction vertex). The expansion is then, for each degree of freedom, a finite
expansion in the coupling between this degree and the others so that, even if the expansion is
perturbative up to an order comparable to the number of degrees of freedom, the bound on each
term is qualitatively the one on a product of that many finite order-independent expansions, the
order of which can be fixed uniformly in the cutoffs (and depending only on $\lambda$). To
achieve this program, the propagator linking two points of distance of order L must have a
decrease of order $e^{-L^{-1}|x-y|}$, that is, have momentum larger than $L^{-1}$, so that one
must localize both in x-space and p-space; for example, the smallest cells of phase space
correspond to fields $\varphi$ localized in x, p-spaces, the x-boxes being of side $\rho^{-1}$
and the p-localization consisting of values such that roughly $\rho/2 \leq |p| \leq \rho$. More
generally, a generic cell (of index i) corresponds to fields $\varphi$ at point x and momentum p,
with x in a box of side $2^i\rho^{-1}$ and $2^{-i-1}\rho < |p| < 2^{-i}\rho$.
These expansions mimic the renormalization group à la Wilson. For just renormalizable theories
(where $\lambda(\rho)$ depends on $\rho$), one is led to introduce the effective coupling
constant $\lambda(2^{-i}\rho)$ whose perturbative expansion is the value at momentum zero of the
sum of all the (connected, amputated) four-point functions containing only propagators of
momentum (roughly) bigger than $2^{-i}\rho$ (plus $\lambda(\rho)$, which in fact tends to zero as
$\rho \to \infty$).
Then by small coupling we mean a theory where $\lambda(2^{-i}\rho)/\zeta(2^{-i}\rho)^2$ is small
for all i.
By convention we write $\lambda_{ren}$, $\zeta_{ren}$, $a_{ren}$ for the effective parameters of
the theory at zero momentum.
The expansion obtained expresses $S_{connected}$ as a sum of terms each of them being associated
to a
given set of phase-space cells which are ‘‘connected’’ Finally, the external points are by convention z‘
together by ‘‘links’’ that are either propagators or points; then:
vertices. Each term decreases exponentially with the
difference imax  imin of the upper and lower indices S; ðx1 ; . . . ; xN Þconnected
Z X 1 X
of the phase-space cells involved. Moreover, each set ¼ d M ð’Þ
must contain the cells associated to the fields T
jTj! fXv g
’(x1 )    ’(xN ) whose indices are fixed by the order Z 
nonoverlapping
 Y 
of magnitude of the distances between the points. Y
dz‘ dz0‘ CM ð‘Þ
On the other hand, the difference between the ‘ ‘2T
z‘ not external
theory of cutoff  and the one of cutoff 2 are Y
terms containing at least one cell of momentum of KXv ðfz; z0 gv ; ’Þ ½9
order ; these terms are thus small like v2T
cst(x1 , . . . , xN )e(cst) , so that the limit as  ! 1 where for coupling small enough:
exists. Z Y Y
So far, the ‘‘construction’’ of models is possible d M ð’Þ jKXv ðfz; z0 gv ; ’Þj  eMð1ÞjXv j ½10
only at small coupling, apart from special cases. The v2T v2T
’4 theory in dimension 4 is just renormalizable
(from the perturbative viewpoint) but the above The X’s are 2 2 nonoverlapping; however, it will
condition of small coupling cannot be achieved (and suffice to sum over all X’s (without restriction) to
it is generally believed that this model cannot be get a bound showing the convergence of the
defined as a nontrivial theory). A just renormaliz- expansion as  ! 1. In this formula the K(. , ’)’s
able model has been shown to exist, namely the are still coupled by the measure d M (’); all the
Gross–Neveu model which is a fermionic theory in nonperturbativity is hidden in the K’s (in particular
dimension 2. The elementary particle physics models the contribution of momentum bigger than M).
are just renormalizable but their construction has As a consequence of [9] and if a(, ) has been
not been completed so far (in particular in view chosen such that aren = 0, for M large enough and at
of the confinement problem). See Constructive small coupling (depending on M, m):
Quantum Field Theory for details. jSðx; yÞconnected j
To state the result in a form convenient for our Z
purposes here, we introduce a splitting of the  jCM ðx  yÞj þ dz01 dz02 jCM ðx  z01 Þ
covariance in two parts: 0 0
eMð1Þjz1 z2 j CM ðz02  yÞj þ   
Cðx  y; Þ ¼ CM ðx  y; Þ þ C>M ðx  y; Þ; M>m  ðcstÞemð1Þjxyj ½11
2 2 2 2
~ M ðp; Þ ¼ ðep
C = 2 2
=p þ m Þ  ðe p = 2 2
=p þ M Þ More generally, the connected N-point function
satisfies
so that CM (x  y) behaves like C at large distances but
has an ultraviolet cutoff of size M, and jC>M (x  y)j  jSðx1 ; . . . ; xN Þconnected j  cst emð1Þdðx1 ;...;xN Þ ½12
eMjxyj decreases exponentially depending on the
where d(x1 , . . . , xN ) is the length of the smallest tree
(technical) choice of M. Let d M (’) be the Gaussian
joining x1 , . . . , xN , with possibly intermediate points.
measure of covariance CM .
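The covariance splitting above can be checked numerically in momentum space. The following is a minimal sketch, not part of the original construction: it drops the ultraviolet cutoff factor (formally the cutoff is sent to infinity), so that the full covariance is $1/(p^2 + m^2)$ and the short-range part is a propagator of mass $M$. The function names and the sample values $m = 1$, $M = 3$ are illustrative assumptions.

```python
def c_full(p2, m):
    """Free covariance 1/(p^2 + m^2) in momentum space (cutoff removed)."""
    return 1.0 / (p2 + m * m)


def c_high(p2, big_m):
    """Short-range part C_{>M}: a propagator of mass M."""
    return 1.0 / (p2 + big_m * big_m)


def c_low(p2, m, big_m):
    """Long-range part C_M = C - C_{>M}."""
    return c_full(p2, m) - c_high(p2, big_m)


m, big_m = 1.0, 3.0  # illustrative masses with M > m (assumed values)

# The two parts add up to the full covariance at every momentum.
for p2 in (0.0, 0.5, 10.0, 1.0e4):
    assert abs(c_low(p2, m, big_m) + c_high(p2, big_m) - c_full(p2, m)) < 1e-12

# (p^2 + m^2) * C_M(p) at p = 0, to be compared with 1 - m^2/M^2.
low_momentum_limit = (0.0 + m * m) * c_low(0.0, m, big_m)

# C_M falls off like (M^2 - m^2)/p^4 at large p, i.e. it carries an
# effective ultraviolet cutoff at the scale M; the ratio below tends to 1.
p2_large = 1.0e6
uv_ratio = c_low(p2_large, m, big_m) * p2_large**2 / (big_m**2 - m**2)

print(low_momentum_limit, 1.0 - m**2 / big_m**2)
print(uv_ratio)
```

In this simplified setting $C_M$ keeps the pole of the full covariance at $p^2 = -m^2$ (hence the same large-distance behavior), while its large-momentum decay is two powers of $p^2$ instead of one.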
One divides also  in unit cubes and obtains for
the connected N-point function an expansion as a
The Irreducible Kernels
sum over connected trees; a tree T is composed of
lines ‘ and vertices v; each line joins two vertices or The 1PI Kernel and a Lippmann–Schwinger Equation
one of the external points x1 , . . . , xN and a vertex;
To then show that a theory – if the perturbation series
moreover, there are no loops.
heuristically shows it – contains only one particle of
To each line ‘ is associated a propagator
mass smaller than 2m(1  ), it is necessary to expand
CM (z‘ , z0‘ ) = CM (‘).
further the coupling between the K’s in [9]. Each
To each vertex v are associated:
perturbative step relatively to this coupling will
1. two subsets Iv , Iv0 of {‘}, generate a sum of terms such that in each one there is
2. a connected set Xv of unit cubes such that all the a ‘‘new’’ propagator CM between two K’s.
z‘ , ‘ 2 Iv and all the z0‘ , ‘ 2 Iv0 are contained in The fact that in [9] the X’s are nonoverlapping
Xv ; jXv j is the volume of Xv , and has the consequence that an expansion where for
3. a kernel KXv ({z, z0 }v ; ’) each pair of KX the number of propagators CM

remains bounded (say by n þ 1) is convergent (for outgoing xpþ1 , . . . , xN points, this defines a channel.
small enough couplings depending on m, n); this is One then obtains nPI kernels (in the given channel).
because, for a given X, the others must be farther In the same way as above, one obtains a relevant
and farther as their number increases, and in view of structure equation; this equation makes sense only
the exponential decrease (in x-space) of CM . if the kernels KX have a decrease corresponding to
We then consider the expansion where we have n-particle irreducibility; to that purpose we take
further expanded the two-point function S(x, y) such M > nm. The expansion converges for couplings
that each term can be decomposed in the channel small enough depending on m and n.
x ! y in CM propagators and 1PI contributions (in In the case n = 2 this gives a kind of BS equation
the sense that any line cutting such a 1PI contribu- (the Lippmann–Schwinger equation corresponding
tion (and outside the X0 s) cuts at least two to the case n = 1); if we restrict, for simplicity, the
propagators); that means that these 1PI contribu- analysis to even theories one is led to jump directly
tions are no longer coupled by the d M (’) measure. to the case n = 3:
They are made of propagators and of KX which still
Sðx1 ; x2 ; x3 ; x4 Þconnected
have nonoverlapping restrictions; the latter are Z
straightforwardly expanded using a kind of (con- ¼ dz1 dt1 dz2 dt2 ðM Þðx1 ; x2 ; z1 ; t1 Þ
vergent) Mayer expansion; the result is finally a
Lippmann–Schwinger type equation: G2 ðz1 ; t1 ; z2 ; t2 ÞðM Þðz2 ; t2 ; x3 ; x4 Þ þ    ½18
Z
Sðx; yÞconnected ¼ CM ðx  yÞ þ dz1 dz2 CM ðx  z1 Þ
X
S ¼ M ½G2 M p
G1 ðz1 ; z2 Þ CM ðz2  yÞ þ    ½13
p1
or
or
" #
X S ¼ M G2 M þ M G2 S ½19
Sðx; yÞconnected ¼ CM ½G1 CM p ðx; yÞ
p0 where
which is equivalent to ðM Þðx1 ; x2 ; x3 ; x4 Þ ¼ Sðx1 ; x3 ÞSðx2 ; x4 Þ
Sconnected ¼ CM þ CM G1 CM þ CM G1 Sconnected ½14 þ Sðx1 ; x4 ÞSðx2 ; x3 Þ
where G1 is a 1PI kernel that satisfies the bound and where
2mð1Þjtuj
jG1 ðt; uÞj  ren e ½15 jG2 ðt1 ; t2 ; u1 ; u2 Þj
In Fourier transform, eqn [14] becomes  ren expf4mð1  Þ maxðjti  uj jÞg ½20
i;j
~ M ðpÞ þ C
FðpÞ ¼ C ~ M ðpÞG
~ 1 ðpÞC
~ M ðpÞ
Equation [19] once amputated, and after Fourier
þC ~ M ðpÞG1 ðpÞFðpÞ ½16 transformation, is eqn [2].
Denoting by (p þ q) F(p, q) the Fourier transform of
More General Irreducible Kernels
S(x, y)connected , we can then compute F(p): and Structure Equations
~M þ C
ðp2 þ m2 Þ½C ~ MG~ 1 ðpÞ Irreducible kernels with various degrees of irreduci-
FðpÞ ¼ ½17
~ MG
ðp2 þ m2 Þ  ðp2 þ m2 ÞC ~ 1 ðpÞ bility in various channels can be defined in a similar
~ M (p) ! (1  m2 =M2 ) as p ! 0 and way. Corresponding expansions of N-point func-
where (p2 þ m2 )C tions follow, in terms of integrals involving these
~
jG1 (p)j  ren cst(m) so that (as expected) F has no kernels and two-point functions. These kernels are
pole in the Euclidean region at small coupling; but, again convergent at small coupling (! 0 as their
as will be seen in the next section, it has a pole irreducibility ! 1) as well as the corresponding
outside the Euclidean region. structure equations (which generalize eqn [18]).
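The passage from a series of the form $\sum_{p \ge 1} R\,[G R]^p$ to a closed structure equation $S = R G R + R G S$, as in [19], is purely algebraic and can be checked on finite matrices. The sketch below is an illustration under stated assumptions: the 2 x 2 matrices `r` and `g` are hypothetical stand-ins for the two-particle convolution operator and the kernel $G_2$, chosen so that the product $G R$ is small (the analogue of small coupling), and the helper names are not taken from the source.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def neumann_sum(r, g, terms):
    """Partial sum  sum_{p=1}^{terms} R (G R)^p."""
    gr = matmul(g, r)
    term = matmul(r, gr)            # p = 1 term: R G R
    total = term
    for _ in range(terms - 1):
        term = matmul(term, gr)     # append one more factor (G R)
        total = matadd(total, term)
    return total


# Hypothetical stand-ins; the entries of g are small so the series converges.
r = [[1.0, 0.3], [0.2, 0.8]]
g = [[0.05, -0.02], [0.01, 0.04]]

s = neumann_sum(r, g, terms=60)

# Residual of the structure equation S = R G R + R G S.
rg = matmul(r, g)
rhs = matadd(matmul(rg, r), matmul(rg, s))
residual = max(abs(s[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(residual)
```

The residual equals the first neglected term of the series, so it is negligible only when the product $G R$ is a contraction; this mirrors the requirement that the coupling be small enough for the expansion to converge.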

The 2PI Kernel and a BS Equation


Analyticity, AC, and Bound States
From the previous discussion, it is clear that one can
extract from [9] as many propagators as we want As explained in the introduction, we now proceed
between kernels KX . If one considers a splitting of by analytic continuation away from the Euclidean
the external points in incoming x1 , . . . , xp and region in complex energy–momentum space.

First, it is easily seen that the two-point function section ‘‘AC and analyticity,’’ so as to avoid the pole
is analytic in the region s < (2m)2   apart from a singularities of the two-point functions involved in
pole at s = m2ph which defines the physical mass M , the threshold singularities being due to the
mph (m2ph is the zero in p2 of the denominator in pinching of this contour between the two poles as
formula [17]). In view of the bounds of the previous s ! (2mph )2 . If a fixed neighborhood of the thresh-
two sections, mph is close to the ‘‘bare’’ mass m. old is excluded, one does obtain uniform bounds of
The 2PI kernel, for even theories, is shown, again by the form (cst ren )q (for a term with q factors G2 ) in
Laplace transform theorem, to be analytic and bounded any bounded domain, which ensures the conver-
in domains around and away from the Euclidean region gence of the Neumann series.
up to s = (4m)2  , and is of the order of ren . It remains to study the neighborhood of the
As we have seen in the section ‘‘AC and threshold. To that purpose, the following method
analyticity,’’ the analyticity of G2 entails the analytic is convenient. One shows that the convolution
structure of F (two-sheeted or multisheeted at the operator M can be written in the form
threshold). On the other hand, further poles of F can
be generated by the BS integral equation [2] in the M ¼ gðsÞ
þr ½21
physical or unphysical sheets. If a pole in the where
is, as in the section ‘‘AC and analyticity,’’
physical sheet occurs at s < (2mph )2 real, it will on-shell convolution for s > (2mph )2 or is obtained
correspond to a new particle in the theory, namely a by analytic continuation for complex value of s
two-particle bound state. around the threshold; g(s) = 1=2 for d even and, if d
is odd, g(s) = (i=2 ) log , where = 4m2ph  s. In
AC in the Low-Energy Region view of this definition of g(s), the operator r is
The analysis of possible bound states, which will be regular: it is an analytic one-sheeted operation
presented in the following, will show that there around the threshold (this is equivalent to [4]), and
might be at most one two-particle bound state of it has no pole singularities. This property of r can
mass mB < 2mph which tends to 2mph as the be established by geometric methods or by an
couplings tends to zero. explicit evaluation.
On the other hand, for even theories, in view of It is then useful to introduce a new kernel U
the analyticity properties of the two-point function linked to G2 by the integral equation
and of the 2PI kernel G2 , equation [1] holds in the U ¼ G2 þ UrG2 ½22
region (2m2ph ) < s < (4mph )2  , where
is on-shell
convolution with particles of mass mph . In view of the regularity and bounds of r and G2 ,
If there is no two-particle bound state, this one sees (e.g., by a series expansion) that U, like G2 ,
characterizes the AC of the theory for s < (4mph )2  . is analytic in a neighborhood of the threshold and
If there is a bound state of mass mB , AC is behaves in the same way at small ren .
established only in the region s < (3mph )2  . By a simple algebraic argument F and U are
For non-even theories, the analysis is similar but related by the integral equations
requires the introduction of new irreducible kernels
F ¼ U þ gðsÞU
F ¼ U þ gðsÞF
U ½23
in view of the fact that the non-evenness opens new
channels. AC in all cases can be established, for Two-dimensional models We start the analysis with
small couplings, up to s < (3mph )2  . the case d = 2. The mass shell is trivial in this case; let f
be the restriction of F to the mass shell; it depends only
Analysis of Possible Two-Particle Bound States on s = (p3 þ p4 )2 due to the mass shell and e.m.c.
for Even Theories at Small Coupling constraints (as also Lorentz invariance). On the mass
It can be checked that such poles of F, if there are, shell, the operation
becomes a mere multiplication
either lie far away in the unphysical sheet(s) or are and the integral equation [23] becomes
close to the two-particle threshold (s = (2mph )2 ). 1
This is due to the convergence, at small coupling, of f ðsÞ ¼ uðsÞ þ f ðsÞuðsÞ ½24
aðsÞ
the Neumann series F = G2 þ G2 M G2 þ    . Indi-
vidual terms G2 M    M G2 are, in fact, defined where u is the mass shell restriction of U and the
away from the Euclidean region by analytic con- factor a(s) arising from
is of the form
tinuation in a two-sheeted (d even) or multisheeted a(s) = cst s1=2 1=2 , = (2mph )2  s, which gives
(d odd) domain around the threshold: to that
purpose locally distorted integration contours (initi- aðsÞuðsÞ
f ðsÞ ¼ ½25
ally the Euclidean region) are introduced as in the aðsÞ  uðsÞ

In turn one obtains above, there would be no two-particle bound state at


small coupling. In fact, the kinematical factor (d3)=2
Uj j U
F¼Uþ ½26 (for d even) generated by the mass shell convolution
aðsÞ  uðsÞ is no longer equal to 1=2 as in the d = 2 case but
where Uj (resp., j U) is U with p3 , p4 (resp., p1 , p2 ) now to 1=2 . As a consequence, the Neumann series
restricted to the mass shell. Equation [26] comple- giving F in terms of G2 is convergent also in the
tely characterizes the local structure of F in view of neighborhood of the two-particle threshold.
the local analyticity of U. Non-even theories The analysis for the non-even
The analysis of the possible poles follows from the theories follows similar lines. As already mentioned,
fact that U is equal to G2 up to higher order in ren ; the analysis requires the introduction of new irredu-
on the other hand, G2 is equal to a first known term cible kernels. For the models ’4 þ c3 ’3 , which do
plus higher-order corrections in ren (if we expand in exist at small couplings in dimensions 2 and 3, there
ren the expression for G2 obtained in the previous will be either exactly one or no two-particle bound
section), so that the leading contribution of u(s) is state, depending on the respective values of , c3 .
known and the results follow.
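The role of the denominator $a(s) - u(s)$ in [25] can be made concrete with a toy computation. The sketch below is only an illustration under loudly stated assumptions: $u(s)$ is replaced by a positive constant `u0` (a hypothetical stand-in for its leading contribution), the unspecified constant in $a(s)$ is set to `c = 1`, the physical mass is set to `m_ph = 1`, and the square roots are taken positive below threshold. None of these choices comes from the source.

```python
import math


def a(s, m_ph, c):
    """Kinematical factor a(s) = c * sqrt(s) * sqrt(sigma), sigma = (2 m_ph)^2 - s."""
    sigma = (2.0 * m_ph) ** 2 - s
    return c * math.sqrt(s) * math.sqrt(sigma)


def f(s, u0, m_ph, c):
    """Formula [25] with u(s) replaced by the constant u0 (assumption)."""
    return a(s, m_ph, c) * u0 / (a(s, m_ph, c) - u0)


def pole_position(u0, m_ph, c):
    """Root of a(s) = u0 nearest to threshold.

    Squaring gives c^2 * s * (4 m_ph^2 - s) = u0^2, a quadratic in s;
    the larger root is the one close to s = (2 m_ph)^2.
    """
    disc = 4.0 * m_ph**4 - (u0 / c) ** 2
    return 2.0 * m_ph**2 + math.sqrt(disc)


m_ph, c = 1.0, 1.0

# Consistency of [25] with [24]: f = u + f * u / a at a sample point.
s_test, u_test = 3.0, 0.2
lhs = f(s_test, u_test, m_ph, c)
rhs = u_test + lhs * u_test / a(s_test, m_ph, c)
assert abs(lhs - rhs) < 1e-12

for u0 in (0.4, 0.2, 0.1):
    s_b = pole_position(u0, m_ph, c)
    m_b = math.sqrt(s_b)
    # Columns: u0, pole position s_B, distance of m_B below 2 m_ph.
    print(u0, s_b, 2.0 * m_ph - m_b)
```

In this toy model the zero of $a(s) - u_0$ lies just below $(2m_{ph})^2$ and moves toward the threshold as `u0` decreases, with a gap of order `u0**2`, which is the qualitative behavior described in the text for small coupling.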
For a theory (see [8]) containing a ren ’4 term there is Structure Equations and AC in
exactly one pole, which corresponds to the zero of a(s)  Higher-Energy Regions
u(s), lying in the region (2mph )2   < s < (2mph )2 .
The structure equations of the previous section provide,
This pole is either in the physical sheet for ren < 0 or in
after analytical continuation away from the Euclidean
the second sheet if ren > 0. In the case ren < 0, this
region, a rigorous version of the analysis presented at
pole corresponds to a two-particle bound state of
the end of the section ‘‘AC and analyticity.’’ The
physical mass mB which tends to 2mph as ren ! 0.
irreducible kernels can here be defined in a direct way
In a model without ’4 term (ren = 0) the lowest-
following the previous section, together with their
order contribution to G2 , hence to U, is in general of
analyticity properties. One has then to derive the
the order of the square of the leading coupling, in
discontinuity formulas that in turn characterize AC.
which case there is always one bound state.
This program has been carried out in the 3 ! 3 particle
The treatment of the fermionic Gross–Neveu
region, and partly in the general case. It seems possible
model, which involves spin and color indices, is
to complete general proofs up to some technical
analogous, with minor modifications. Equations now
(difficult) problems. As already mentioned, in this
involve, in the two-particle region, 4 4 matrices;
approach, the coupling should be taken smaller and
poles of F are now the zeros of det (a(s)I  m(s)u(s)),
smaller as the energy region considered increases.
where m(s) is the 4 4 matrix obtained from 2 2
residue matrices (whose leading matrix elements are See also: Axiomatic Quantum Field Theory; Constructive
explicitly computable). The detailed analysis, which Quantum Field Theory; Dispersion Relations; Dynamical
requires the consideration of different channels Systems in Mathematical Physics: An Illustration from
(various color and spin indices) is omitted. Water Waves; Perturbation Theory and its Techniques;
Quantum Chromodynamics; Scattering in Relativistic
Three-dimensional models The results are similar:
Quantum Field Theory: Fundamental Concepts and
F is decomposed as F0 þ F00 , where F0 is the ‘ = 0
Tools; Scattering in Relativistic Quantum Field Theory:
‘‘partial
R wave component’’ of F, namely F0 = (1=2 ) the Analytic Program; Schrödinger operators.
F d , where is the ‘‘scattering angle’’ of the
channel; its complement F00 is shown to be locally
bounded in view of a further factor . The analysis Further Reading
is then analogous to the case d = 2 with a(s) now
Bros J (1984) r-particle irreducible kernels, asymptotic complete-
behaving like cst= log as ! 0. There is, a priori,
ness and analyticity properties of several particle collision
either no pole, or one pole in the physical sheet at amplitudes. Physica A 124: 145.
s = m2B < (2mph )2 with mB = 2mph þ O(ecst=ren ), Bros J, Epstein H, and Glaser V (1965) A proof of the crossing
depending again on the signs of the couplings. For property for two-particle amplitudes in general quantum field
the existing even models such as the ’4 model, there theory. Communications in Mathematical Physics 1: 240.
Bros J, Epstein H, and Glaser V (1972) Local analyticity
is no pole, hence no two-particle bound state.
properties of the n-particle scattering amplitude. Helvetica
Four-dimensional models The existence of the ’4 Physica Acta 43: 149.
model in dimension 4 is doubtful. If a four- Epstein H, Glaser V, and Iagolnitzer D (1981) Some analyticity
properties arising from asymptotic completeness in quantum
dimensional model were defined, and if the 2PI field theory. Communications in Mathematical Physics 80: 99.
kernel G2 of a massive channel could be defined and Glimm J and Jaffe A (1981, 1987) Quantum Physics:
shown to satisfy analyticity properties analogous to A Functionnal Integral Point of View. Heidelberg: Springer.
Schrödinger Operators 487

Iagolnitzer D (1992) Scattering in Quantum Field Theories: The renormalization: the Gross–Neveu model. Communications in
Axiomatic and Constructive Approaches, Princeton Series in Mathematical Physics 111: 89.
Physics. Princeton University Press. Martin A (1970) Scattering Theory: Unitarity, Analyticity and
Iagolnitzer D and Magnen J (1987a) Asymptotic completeness Crossing. Heidelberg: Springer.
and multiparticle structure in field theories. Communications Rivasseau V (1991) From Perturbative to Constructive Renorma-
in Mathematical Physics 110: 51. lization. Princeton: Princeton University Press.
Iagolnitzer D and Magnen J (1987b) Asymptotic completeness
and multiparticle structure in field theories. II. Theories with

Schrödinger Operators
V Bach, Johannes Gutenberg-Universität, Mainz, Germany
© 2006 Elsevier Ltd. All rights reserved.

Schrödinger operators are linear partial differential operators of the form

$$H_V = -\Delta + V(x) \qquad [1]$$

acting on a suitable dense domain $\mathrm{dom}(H_V) \subseteq L^2(\Omega)$ in the Hilbert space of square-integrable functions on a spatial domain $\Omega \subseteq \mathbb{R}^d$, where $d \in \mathbb{N}$. Here, $H_0 = -\Delta = -\sum_{\nu=1}^{d} \partial^2/\partial x_\nu^2$ is (minus) the Laplacian on $\Omega$, and the potential $V : \Omega \to \mathbb{R}$ acts as a multiplication operator, $[V\psi](x) := V(x)\psi(x)$.

Historical Origin and Relation to Theoretical Physics

In 1926, Schrödinger formulated quantum theory as wave mechanics and proved later that it is equivalent to Heisenberg's matrix mechanics. He proposed that the state of a physical system at time $t \in \mathbb{R}$ is given by a normalized wave function $\psi_t \in L^2(\Omega)$ whose dynamics is determined by a linear Cauchy problem: $\psi_0$ is the state at time $t = 0$, and for $t > 0$, it evolves according to

$$i\,\frac{\partial \psi_t}{\partial t} = H \psi_t \qquad [2]$$

the Schrödinger equation. More generally, $\psi_0$ is a normalized element of a Hilbert space $\mathcal{H}$, and the Hamiltonian $H_V$ is a self-adjoint operator, that is, $\mathrm{dom}(H_V) = \mathrm{dom}(H_V^*) \subseteq \mathcal{H}$ and $H_V = H_V^*$ on $\mathrm{dom}(H_V)$. Formally, eqn [2] is solved by the evolution operator or propagator $\exp(-itH_V)$ in the form $\psi_t = \exp(-itH_V)\psi_0$. The self-adjointness of $H_V$ insures the existence and unitarity of the propagator $\exp(-itH_V)$, for all $t \in \mathbb{R}$, so $\|\psi_t\| = \|\psi_0\| = 1$. For physics, this unitarity is crucial, because $\|\psi_t\|^2$ is interpreted as the total probability of the system to be at time $t$ in some state in $\mathcal{H}$. The general validity of eqn [2] as the fundamental dynamical law of all physical theories, including, for example, nonrelativistic and (special) relativistic quantum mechanics, quantum field theory, and string theory, deserves appreciation.

If the physical system under consideration is a nonrelativistic point particle of mass $m > 0$ in a potential $\tilde{V} : \mathbb{R}^d \to \mathbb{R}$, then, according to the principles of classical (Newtonian) mechanics, its state is determined by its momentum $p \in \mathbb{R}^d$ and its position $x \in \mathbb{R}^d$, its kinetic energy is $(1/2m)p^2$, its potential energy is $\tilde{V}(x)$, and the dynamics is given by the Hamiltonian flow generated by the Hamiltonian function $H_{\mathrm{class}}(p, x) = (1/2m)p^2 + \tilde{V}(x)$.

Schrödinger derived the Hamiltonian (operator) $H = -(\hbar^2/2m)\Delta + \tilde{V}(x)$ in [2] from the replacement of the momentum $p \in \mathbb{R}^d$ by the momentum operator $-i\hbar\nabla_x$. This prescription is called quantization and is further discussed in the section "Quantization and semiclassical limit." The Schrödinger operator $H_V$ in [1] is then obtained after an additional unitary rescaling, $\psi(x) \mapsto \mu^{d/2}\psi(\mu x)$, by $\mu := \hbar(2m)^{-1/2}$, and a redefinition $V(x) := \tilde{V}(x/\mu)$ of the potential.

For more details, we refer the reader to Schrödinger (1926) and Messiah (1962).

Self-Adjointness

Led by the requirement of unitarity of the propagator, the domain $\mathrm{dom}(H_V)$ in [1] is usually chosen such that $H_V$ is self-adjoint, which, in turn, is most often established by means of the Kato–Rellich perturbation theory, briefly described below. If $V \equiv 0$, then $H_0$ equals the Laplacian $-\Delta$, which is a positive self-adjoint operator, provided $\mathrm{dom}(H_0) = W^2_{\mathrm{b.c.}}(\Omega)$ is the second Sobolev space with suitable conditions on the boundary $\partial\Omega$ of $\Omega$. Typical examples are $\mathrm{dom}(H_0) = W^2(\mathbb{R}^d)$, for $\Omega = \mathbb{R}^d$, and $W^2_{\mathrm{Dir}}(\Omega)$ and $W^2_{\mathrm{Neu}}(\Omega)$ with Dirichlet or Neumann boundary conditions on $\partial\Omega$, respectively, in case that $\Omega$ is a bounded, open domain in

$\mathbb{R}^d$ with smooth boundary $\partial\Omega$. Starting from this situation, $V$ is required to be relatively $H_0$-bounded, that is, that $M(V, r) := V(-\Delta + r\mathbf{1})^{-1}$ defines (extends to) a bounded operator on $L^2(\Omega)$, for any $r > 0$. If $\lim_{r\to\infty} \|M(V, r)\| < 1$, then $H_V$ is self-adjoint on $\mathrm{dom}(H_0)$ and semibounded, that is, the infimum $\inf\sigma(H_V)$ of its spectrum $\sigma(H_V)$ is finite; in other words, $H_V \geq -c\,\mathbf{1}$, for some $c \in \mathbb{R}$, as a quadratic form. (The semiboundedness corresponds to quasidissipativity, as a generator of the semigroup $\exp(-\beta H_V)$.)

A fairly large class of potentials fulfilling these requirements is defined by

$$\lim_{\alpha \searrow 0} \left\{ \sup_{x \in \Omega} \int_{|x-y| \le \alpha} |x - y|^{4-d}\, V(y)^2 \, d^dy \right\} = 0 \qquad [3]$$

for $d \neq 4$, and with $|x - y|^{4-d}$ replaced by $(\ln|x - y|)^{-1}$, for $d = 4$. For $d \le 3$, [3] is equivalent to the uniform local square integrability of $V$, that is, $\sup_{x\in\Omega} \int_{|x-y|\le 1} V(y)^2\, d^dy < \infty$. Note that [3] allows for local singularities of $V$, provided they are not too severe; in this respect, quantum mechanics is more general than classical mechanics. Equation [3] is a sufficient condition for $H_V = -\Delta + V$ to be self-adjoint on $\mathrm{dom}(-\Delta)$ because $\lim_{r\to\infty}\|M(V, r)\| = 0$. Moreover, as eqn [3] only misses some borderline cases, it is also almost necessary for the self-adjointness of $H_V$. By means of Kato's inequality, the conditions on $V$, especially on its positive part $V_+ := \max\{V, 0\}$, can be further relaxed. Also, if one realizes $H_V$ as the Friedrichs extension of a semibounded quadratic form, the conditions to impose on $V$ are milder. One possibly loses, however, control over the operator domain $\mathrm{dom}(H_V)$, and typically $\mathrm{dom}(-\Delta)$ is only a core for $H_V$.

For further details on self-adjointness, we refer the reader to Reed and Simon (1980a, b), Kato (1976), and Cycon et al. (1987).

Spectral Analysis

The self-adjointness of $H_V$ establishes a functional calculus, generalizing the notion of diagonalizability of finite-dimensional self-adjoint matrices: there exists a unitary transformation $W : L^2(\Omega) \to L^2(\sigma(H_V), d\mu_{H_V})$ such that $H_V$ acts on elements $\varphi$ of $L^2(\sigma(H_V), d\mu_{H_V})$ as a multiplication operator, $[H_V\varphi](\omega) = \omega\varphi(\omega)$. The spectral measure $\mu_{H_V}$ decomposes into an absolutely continuous (ac) part $\mu_{H_V,\mathrm{ac}}$, a pure point (pp) part $\mu_{H_V,\mathrm{pp}}$, and a singular continuous (sc) part $\mu_{H_V,\mathrm{sc}}$, mutually disjointly supported on the ac spectrum $\sigma_{\mathrm{ac}}(H_V)$, the pp spectrum $\sigma_{\mathrm{pp}}(H_V)$, and the sc spectrum $\sigma_{\mathrm{sc}}(H_V) \subseteq \mathbb{R}$, respectively, whose union is the spectrum $\sigma(H_V)$ of $H_V$. There is an additional decomposition of the spectrum of $H_V$ into the discrete spectrum $\sigma_{\mathrm{disc}}(H_V)$, which consists of all isolated eigenvalues of $H_V$ of finite multiplicity, and its complement $\sigma_{\mathrm{ess}}(H_V) = \sigma(H_V) \setminus \sigma_{\mathrm{disc}}(H_V)$, the essential spectrum of $H_V$, as its residual spectrum is void. One of the main goals of the spectral analysis is to determine the spectral measure for a given potential $V$ as precisely as possible.

In many applications, $\Omega = \mathbb{R}^d$ and the potential $V$ in $H_V$ is not only relatively $H_0$-bounded, but even relatively $H_0$-compact, that is, $M(V, 1)$ is compact. In this case, $\lim_{r\to\infty}\|M(V, r)\| = 0$, insuring self-adjointness on $\mathrm{dom}(H_0)$ and semiboundedness of $H_V$. Moreover, a theorem of Weyl implies that its essential spectrum agrees with the one of $H_0$, that is, with the positive half-axis $\mathbb{R}_0^+$, and the discrete spectrum is contained in the negative half-axis $\mathbb{R}^-$. If, furthermore, $(H_0 + 1)^{-1}[x \cdot \nabla V(x)](H_0 + 1)^{-1}$ is compact, then the essential spectrum on the positive half-axis is purely absolutely continuous, $\sigma_{\mathrm{ess}}(H_V) \cap \mathbb{R}^+ = \sigma_{\mathrm{ac}}(H_V) \cap \mathbb{R}^+$, and hence $\sigma_{\mathrm{disc}}(H_V) \subseteq \sigma_{\mathrm{pp}}(H_V) \subseteq \sigma_{\mathrm{disc}}(H_V) \cup \{0\}$; the singular continuous spectrum is void.

We remark that the absence of singular continuous spectrum is not understood. Indeed, it is possible to explicitly construct potentials $V$ such that $H(V)$ has singular continuous spectrum. In terms of the Baire category, singular continuous spectrum is even typical. The appearance of singular continuous spectrum can, perhaps, be more easily understood in terms of the dynamical properties of $\exp[-itH_V]$, rather than the spectral analysis of its generator $H_V$: singular continuous spectrum occurs when initially localized states are not bound states, but move out to infinity very slowly.

The reader is referred to Simon (2000), Reed and Simon (1980a, b) and Cycon et al. (1987) for further detail.

Properties of Eigenfunctions

Let us assume $\Omega = \mathbb{R}^d$, that $V \le 0$ is nonpositive, fulfills [3], and that $\lim_{|x|\to\infty} V(x) = 0$. From the statements in the last section we conclude that $H_V = -\Delta + V(x)$ is semibounded, that the essential spectrum is the positive half-axis and that all eigenvalues are negative and of finite multiplicity, possibly accumulating only at 0. We collect some properties of the eigenfunctions $\psi_j \in L^2(\mathbb{R}^d)$ with corresponding eigenvalue $e_j < 0$, that is, $H_V\psi_j = e_j\psi_j$. The smallest eigenvalue $e_0 := \inf\sigma(H_V)$ (coinciding with the bottom of the spectrum) is simple, and the corresponding eigenfunction $\psi_0(x) > 0$ is strictly positive a.e. Elliptic regularity implies that at a given point $x \in \mathbb{R}^d$, the eigenfunction $\psi_j$ is almost $2 - d/2$ degrees more regular than $V$. For example,

if $V \in C^k[B_{2\varepsilon}(x)]$, for some $\varepsilon > 0$, then $\psi_j \in C^{k+\ell}[B_\varepsilon(x)]$, for all $\ell < 2 - d/2$. Agmon estimates (originally obtained by S'nol and also known in mathematical physics as Combes–Thomas argument) furthermore show that, for unbounded $\Omega$, the eigenfunction $\psi_j$ decays exponentially: $|\psi_j(x)| \le C_\alpha e^{-\alpha|x|}$, for any $0 < \alpha < \sqrt{-e_j}$.

For more details, see Reed and Simon (1978, 1980a, b) and Cycon et al. (1987).

One Dimension and Sturm–Liouville Theory

For $d = 1$, the stationary Schrödinger equation reduces to a second-order ordinary differential equation known as a Sturm–Liouville problem,

$$-\psi''(x) + V(x)\psi(x) = E\psi(x) \qquad [4]$$

on $L^2([a, b])$, with $V \in L^1([a, b])$ and independent boundary conditions at $-\infty \le a < b \le \infty$, say. Equation [4] admits an almost explicit solution by means of the Prüfer transformation defined by $\varphi(x) := \arctan[\psi(x)/\psi'(x)]$ and $R(x) := \ln\sqrt{\psi(x)^2 + \psi'(x)^2}$. The key point about the Prüfer transformation is that it effectively reduces the second-order differential equation [4] into a (nonlinear) first-order equation for $\varphi$,

$$\varphi'(x) = (E - V(x))\sin^2[\varphi(x)] + \cos^2[\varphi(x)] \qquad [5]$$

Note that [5] does not involve $R$ and that the boundary conditions on $\psi$ and $\psi'$ at $a$ and $b$ can be easily expressed in terms of $\varphi(a)$ and $\varphi(b)$. Moreover, having determined $\varphi$ on $[a, b]$ from [5], the function $R$ is immediately obtained by integrating $R'(x) = [1 + V(x) - E]\sin[\varphi(x)]\cos[\varphi(x)]$. In case of a bounded interval, $-\infty < a < b < \infty$, or a confining potential, $\lim_{x\to\pm\infty} V(x) = \infty$, it is not difficult to derive from [5] the following basic facts: the spectrum of $H(V)$ consists only of simple eigenvalues $E_0 < E_1 < E_2 < \cdots$ with $\lim_{n\to\infty} E_n = \infty$. Moreover, the corresponding eigenfunction $\psi_n \neq 0$, $n \in \mathbb{N}_0$, with $H(V)\psi_n = E_n\psi_n$, has precisely $n$ zeros, and Sturm's oscillation theorem holds.

See Amrein et al. (2005) for more details.

Quantization and Semiclassical Limit

The quantization procedure postulated by Schrödinger is the replacement of the classical momentum $p \in \mathbb{R}^d$ by the quantum-mechanical momentum operator $-i\hbar\nabla_x$. It is known (and, in fact, easy to see, cf. Messiah (1962)) that the classical Hamiltonian equations of motion are invariant under symplectic transformations, but Schrödinger's quantization procedure does not commute with symplectic changes of the classical variables. The question of the geometrically sound definition of quantization, with a general $d$-dimensional manifold replacing the spatial domain $\Omega$, has attracted many mathematicians and has led to the mathematical fields of geometric quantization and deformation quantization.

It is remarkable, however, that Schrödinger himself discovered already in his early paper the fact that classical dynamics derives as the scaling limit $\hbar \to 0$ from quantum mechanics. The systematic study of the convergence of wave functions and of operators and their spectral properties is known as semiclassical analysis, which is nowadays considered to be part of microlocal analysis. We illustrate the type of results one obtains by the following example on $\Omega = \mathbb{R}^d$.

Let $F \in C_0^\infty(\mathbb{R}; \mathbb{R})$ be a smooth characteristic function, compactly supported in an interval $I \subset \mathbb{R}$ away from the essential spectrum of the semiclassical Schrödinger operator $H_\hbar = -\hbar^2\Delta + V$ with a smooth potential $V \in C_0^\infty(\mathbb{R}^d)$ of compact support. We define the operator $F[H_\hbar]$ by functional calculus (note that the spectrum of $H_\hbar$ in $I$ is discrete and $F[H_\hbar]$ is of trace class). Let, furthermore, $A_\hbar = \sum_{|\alpha| \le M} a_\alpha(x)\,\partial_x^\alpha$ be a differential operator representing an observable. Then $\mathrm{tr}\{A_\hbar F[H_\hbar]\}$, which exists because the eigenfunctions of $H_\hbar$ are smooth and decay exponentially, is, up to normalization, interpreted to be the expectation of the observable $A_\hbar$ in the state represented by the spectral projection of $H_\hbar$ in $I$, approximated by $F[H_\hbar]$. Semiclassical analysis then yields an asymptotic expansion of the form

$$\mathrm{tr}\{A_\hbar F[H_\hbar]\} = \hbar^{-d}\left(c_0 + c_1\hbar + \cdots + c_n\hbar^n + o(\hbar^n)\right)$$

for arbitrarily large integers $n \in \mathbb{N}$. The leading-order coefficient $c_0$ is determined by Bohr's correspondence principle,

$$\mathrm{tr}\{A_\hbar F[H_\hbar]\} = \int_{\mathbb{R}^{2d}} a[x, p]\, F[p^2 + V(x)]\, \frac{d^dx\, d^dp}{(2\pi\hbar)^d} + o\left((2\pi\hbar)^{-d}\right) \qquad [6]$$

Semiclassical analysis thus provides the mathematical link between quantum and classical mechanics. The proof of [6] usually involves pseudodifferential and/or Fourier integral operators, depending on the method. Advanced topics in semiclassical analysis studied more recently are the construction of quasimodes, that is, wave functions $\psi_{E,\hbar,n}$ which solve the eigenvalue problem $(H_\hbar - E)\psi_{E,\hbar,n} = O(\hbar^n)$ up to errors of order $\hbar^n$, for arbitrarily large $n \in \mathbb{N}$, and the relation between semiclassical asymptotics

and the KAM (Kolmogorov–Arnold–Moser) theory includes an external magnetic field, for example,
from classical mechanics. H = (p  A)2  V (see the next and the last section).
For more details, see Dimassi and Sjöstrand The reader is referred to Thirring (1997), Reed and
(1999), and Robert (1987). See also Stability Theory Simon (1978), and Simon (1979) for further details.
and KAM, KAM Theory and Celestial Mechanics in
this encyclopedia.
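The leading-order correspondence between eigenvalue counts and phase-space volume can be illustrated on an exactly solvable case. The sketch below is an illustration only and steps outside the hypotheses used in the text: it takes $d = 1$ and the potential $V(x) = x^2$, which is not compactly supported, and it uses a sharp energy cutoff instead of a smooth function $F$. For this choice the eigenvalues of $-\hbar^2\,d^2/dx^2 + x^2$ are $\hbar(2n + 1)$, $n = 0, 1, 2, \ldots$, and the region $\{p^2 + x^2 < E\}$ is a disk of area $\pi E$. The function names are hypothetical.

```python
import math


def count_eigenvalues_below(energy, hbar):
    """Number of eigenvalues hbar*(2n+1), n = 0, 1, 2, ..., strictly below `energy`."""
    count = 0
    while hbar * (2 * count + 1) < energy:
        count += 1
    return count


def weyl_volume_count(energy, hbar):
    """Area of the disk {p^2 + x^2 < energy} divided by 2*pi*hbar (d = 1)."""
    return math.pi * energy / (2.0 * math.pi * hbar)


energy = 1.0
for hbar in (0.1, 0.01, 0.001):
    n_exact = count_eigenvalues_below(energy, hbar)
    n_weyl = weyl_volume_count(energy, hbar)
    # Columns: hbar, exact count, phase-space estimate.
    print(hbar, n_exact, n_weyl)
```

For the harmonic oscillator the two numbers differ by at most one for every $\hbar$, which is much better than what the general theory guarantees; for other potentials only the leading order in $\hbar$ is expected to agree.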
Magnetic Schrödinger Operators

Lieb–Thirring Inequalities Magnetic Schrödinger operators are Hamiltonians


of the form
Lieb–Thirring inequalities are estimates on eigenva-
lue sums of HV =    V(x), where V  0 is Hmc ðA; VÞ ¼ ðp  AðxÞÞ2 VðxÞ
assumed to be non-negative (note that we changed on L2 ðR3 Þ ½9
the sign of V) and vanishing at 1; the most
important examples for these sums are the number or
of eigenvalues below a given E  0 and the sum of
HPauli ðA; VÞ ¼ ½s ðp  AðxÞÞ2 VðxÞ
its negative eigenvalues, counting multiplicities.
More generally, denoting by [ ]þ := max { , 0} the on L2 ðR3 Þ C2 ½10
positive part of 2 R, Lieb–Thirring inequalities are
estimates on tr{[E  HV ]
þ }, for
 0. The num- where V is the (electrostatic) potential; as before,
ber of eigenvalues below E is then obtained in the A : R3 ! R3 is the vector potential of the magnetic
limit
! 0, and the sum of the negative eigenvalues field B =  ^ A, and s = (1 , 2 , 3 ) are the Pauli
corresponds to E = 0 and
= 1. We henceforth matrices. Hmc (A, V) and Hpauli (A, V) generate the
assume E = 0, for simplicity. A guess inspired by dynamics of a particle moving in an external electro-
[6] with F[ ] := [ ]
þ , A = 1, and  h = 1 then is that magnetic field of spin s = 0 and spin s = 1=2, respec-
tr{[HV ]
þ } is approximately given by tively. The operator HPauli (A, V) is usually called Pauli
Hamiltonian, and we refer to Hmc (A, V) as the
Z
 
dd x dd p magnetic Hamiltonian. To keep the exposition simple,
VðxÞ  p2 þ we assume henceforth that A and @ A are uniformly
R 2d ð2Þd
Z bounded, which suffices to prove the self-adjointness
¼ CSC ð
; dÞ VðxÞðd=2Þþ
dd x ½7 of both Hamiltonians.
Rd At a first glance, the magnetic and the Pauli
for a suitable constant CSC (
, d) > 0 depending only Hamiltonians may seem to differ only marginally,
on
and d (but not on V). While this guess is but in fact, some of their spectral properties are
wrong, it is nevertheless a useful guiding principle. fundamentally different.
Namely, in a rather large range of
and d, there 1. The magnetic Hamiltonian fulfills the diamagnetic
exist constants CLT (
, d) > 0 such that inequality, jeHmc (A, V) (x, y)j  eHmc (0, V) (x, y), for
trf½HV 
þ g almost all x, y 2 R3 , where m(x, y) denotes the
Z integral kernel of an operator m. As a consequence,
 CLT ð
; dÞ VðxÞðd=2Þþ
dd x ½8 inf [Hmc (A, V)]  inf [Hmc (0, V)] = inf [H(V)],
Rd
and the quadratic form of the magnetic Hamilto-
for all V  0, for which the right-hand side is finite nian is semibounded, for all choices of A, provided
(with the understanding that this finiteness also H(V) is.
insure that [HV ]
þ is trace class, in the first place). 2. If inf [Hmc (A, V)] is an eigenvalue, the diamag-
Of course, CLT (
, d)  CSC (
, d), by [6]. The netic inequality reflects the fact that the corre-
Lieb–Thirring conjecture, which is still open today, sponding eigenvector is not positive or of
says that the best possible choice of CLT (1, 3) equals constant phase. The determination of the nodal
CSC (1, 3) in the physically most relevant case
= 1 set of eigenfunctions is a difficult task on its own.
and d = 3. It is known that CLT (
, d) > CSC (
, d), for 3. For V = 0, the diamagnetic inequality and the

< 1 or d < 3. minimax principle imply that p  A has no zero


Lieb–Thirring estimates have been derived for eigenvalue.
various modifications of the original model, depend- 4. The diamagnetic inequality fails to hold for the
ing on the application. One of these are pseudor- Pauli Hamiltonian. On the contrary, if A is
elativistic Hamiltonians
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi of the form H = T(p)  V, carefully adjusted in Hmc (A, Zjxj1 ), and Z is
where T(p) = p2 þ m2 , with m  0, another one sufficiently large, then the corresponding

quadratic form may assume arbitrarily small values (even if the corresponding field energy is added).
5. For many choices of $A$, the (Dirac) operator $\sigma \cdot (p - A)$ has a nontrivial kernel.

From (1)–(4) it is clear that the proof of stability of matter (see the next section) in the presence of a magnetic field is more difficult than in its absence. This can be illustrated by the fact that magnetic Lieb–Thirring inequalities, being the natural analog of eqn [8], are more involved to derive than the original estimate [8]. The currently best bound is of the form

$$\mathrm{tr}\{[-H_V]_+\} \le C_{mLT} \int_{\mathbb{R}^d} \Big\{ [V(x)]_+^{5/2} + |B(x)|\,[V(x)]_+^{3/2} + \big(|B(x)| + L_c(x)^{-2}\big) L_c(x)^{-1}\, [V(x)]_+ \Big\}\, d^dx \qquad [11]$$

for some universal $C_{mLT} < \infty$, where $L_c(x)$ is a local length scale associated with $B$. It is nonlocal in $x$ and somewhat reminiscent of a maximal function.

We further remark that if restricted to two dimensions, $d = 2$, both the magnetic and the Pauli Hamiltonians play an important role in the theory of the (integer) quantum Hall effect.

For more details, see Simon (1979), Cycon et al. (1987), Rauch and Simon (1997), and Erdös and Solovej (2004). See also the article Quantum Hall Effect in this encyclopedia.

N-Body Schrödinger Operators

The origin of quantum mechanics is atomic ($K = 1$ below) or molecular ($K \ge 2$) physics. If we regard the nuclei of the molecule as fixed point charges $Z := (Z_1, \ldots, Z_K) > 0$ at respective positions $R := (R_1, \ldots, R_K)$, $R_k \in \mathbb{R}^3$, then the Hamiltonian (in convenient units) of this molecule with $N \in \mathbb{N}$ electrons is the following Schrödinger operator:

$$H_N(Z, R) = \sum_{n=1}^{N} \Big\{ -\Delta_n - \sum_{k=1}^{K} \frac{Z_k}{|x_n - R_k|} \Big\} + \sum_{1 \le m < n \le N} \frac{1}{|x_m - x_n|} \qquad [12]$$

defined on $\mathcal{H}^{(N)} := \bigwedge_{n=1}^{N} L^2[\mathbb{R}^3 \times \mathbb{Z}_2] \subseteq L^2[(\mathbb{R}^3 \times \mathbb{Z}_2)^N]$, the space of totally antisymmetric, square-integrable wave functions in $N$ space–spin variables $(x_1, \sigma_1), \ldots, (x_N, \sigma_N) \in \mathbb{R}^3 \times \mathbb{Z}_2$. The antisymmetry of the wave function accounts for the fact that electrons are fermions and is of crucial importance. Note that the number $N$ of electrons is possibly very large. It is clear that we cannot expect to carry out the spectral analysis of this Schrödinger operator directly, but rather only suitable approximations.

In spite of the fact that $H_N(Z, R)$ was one of the basic operators of quantum mechanics from its very beginning in the late 1920s, $H_N(Z, R)$ was, strictly speaking, not known to be self-adjoint before Kato developed the perturbation theory (described in the section "Self-adjointness") some 20 years later, which then also yielded the semiboundedness of $H_N(Z, R)$. So, the ground-state energy $E_N(Z, R) := \inf \sigma[H_N(Z, R)] > -\infty$ is finite. From the HVZ (Hunziker–van Winter–Zhislin) theorem it follows that $\inf \sigma_{ess}[H_N(Z, R)] = E_{N-1}(Z, R)$, which in particular implies that $E_N(Z, R)$ is monotonically decreasing in $N$ and negative (because $E_1(Z, R) < 0$).

It is known that $E_N(Z, R) = E_{N+1}(Z, R)$ and that $H_N(Z, R)$ has no eigenvalue, for $N \ge 2Z_{tot} + 1$, where $Z_{tot} := \sum_{k=1}^{K} Z_k$ is the total nuclear charge of the atom. On the other hand, it is known that $E_N(Z, R)$ is an eigenvalue, provided $N < Z_{tot}$. Thus, defining $N_{crit}$ to be the smallest number such that $E_N(Z, R)$ is not an eigenvalue, for all $N \ge N_{crit}$, that is, $N_{crit}$ is the maximal number of electrons the molecule can bind, we have that $Z_{tot} \le N_{crit} \le 2Z_{tot} + 1$. In increasing precision, asymptotic neutrality, $N_{crit} = Z_{tot} + R(Z_{tot})$, with $R(Z_{tot}) = o(Z_{tot})$ and $R(Z) = o(Z^{5/7})$, was shown for atoms and for molecules, respectively. The ionization conjecture states that $N_{crit} \le Z_{tot} + C$, for some universal constant $C$. It is still open for the full model represented by $H_N(Z, R)$, but has been proved in the Hartree–Fock approximation by Solovej.

The semiboundedness of $H_N(Z, R)$, for fixed $Z$, $R$, and $N$, alone does not rule out a physical collapse of the matter described by $H_N(Z, R)$, but the stronger property of stability of matter does. It holds if there exists a constant $C$, possibly depending on $Z$, such that

$$E_N(Z, R) + \sum_{1 \le k < \ell \le K} \frac{Z_k Z_\ell}{|R_k - R_\ell|} \ge -C\,(N + K) \qquad [13]$$

that is, if the ground-state energy plus the repulsive electrostatic energy of the nuclei is bounded below by a constant times the total number $N + K$ of particles in the system. Equation [13] was shown to hold for $H_N(Z, R)$.

In connection with stability of matter, Thomas–Fermi theory and the question of the limit of large nuclear charge came into the focus of research. For simplicity, we restrict ourselves to atoms, $K = 1$, that is, there is one nucleus of charge $Z := Z_1$ at the origin, $R_1 = 0$, and we consider $E(Z) := \min_{N \in \mathbb{N}} E_N(Z, 0)$ (which amounts to fixing $N := N_{crit}$). An asymptotic expansion for $E(Z)$ of increasing

precision in $Z$ was obtained by ever-finer estimates; presently, one knows that

$$E(Z) = E_{TF} Z^{7/3} + \tfrac{1}{4} Z^2 + C_{DS} Z^{5/3} + o(Z^{5/3}) \qquad [14]$$

where the leading contribution $E_{TF} Z^{7/3}$ is the Thomas–Fermi energy, $(1/4) Z^2$ is the Scott correction, and $C_{DS} Z^{5/3}$ is the Dirac–Schwinger term. The computation of this last term requires semiclassical analysis sketched in the section "Quantization and semiclassical limit."

For more details, see Cycon et al. (1987), Rauch and Simon (1997), Thirring (1997), and Solovej (2003). See also the article Stability of Matter in this encyclopedia.

Scattering Theory

The study of the properties of the propagator $\exp(-itH)$ of a self-adjoint operator $H = H^*$, as $t \to \infty$, is the concern of scattering theory. To obtain a well-defined mathematical object in this limit, it is necessary to compose $\exp(-itH)$ with the inverse of some explicitly accessible comparison dynamics before passing to the limit $t \to \infty$. If $V$ is a short-range potential, that is, $V$ is relatively $H_0$-compact and $|V(x)| \le C|x|^{-\mu}$, for some $\mu > 1$ and $C < \infty$, then the comparison dynamics appropriate for $H_V$ is generated by $H_0$: the wave operators $\Omega^\pm$ are defined as the strong limits

$$\Omega^\pm := \lim_{t \to \pm\infty} e^{itH_V} e^{-itH_0} \qquad [15]$$

A general technique in scattering theory to prove the existence of such limits is Cook's argument, which formally amounts to an application of the fundamental theorem of calculus. For example, for the existence of $\Omega^+$, one writes

$$\Omega^+ - 1 = \int_0^\infty dt\, \frac{d}{dt} \big\{ e^{itH_V} e^{-itH_0} \big\} = i \int_0^\infty dt\, \big\{ e^{itH_V}\, V\, e^{-itH_0} \big\} \qquad [16]$$

and additionally proves the absolute integrability of $t \mapsto e^{itH_V} V e^{-itH_0} \varphi$, for $\varphi$ in a dense subset of $\mathcal{H}$, like $\mathrm{dom}(H_0) = \mathrm{dom}(H_V)$.

Research in scattering theory in the past two decades or so was focused around the question of asymptotic completeness, which is a mathematically precise formulation

$$\mathrm{Ran}\,\Omega^+ = \mathrm{Ran}\,\Omega^- = \mathcal{H}_{pp}^{\perp}(H_V) \qquad [17]$$

of the physical expectation that the states in $\mathcal{H}$ are either bound states (eigenvectors) of $H_V$ or scattering states (states in the range of $\Omega^\pm$) of $H_V$. The intertwining property $H_V \Omega^\pm = \Omega^\pm H_0$ (which easily follows from [15]) implies that the restriction of $H_V$ to $\mathrm{Ran}\,\Omega^\pm$ is unitarily equivalent to $H_0$, hence $\mathrm{Ran}\,\Omega^\pm \subseteq \mathcal{H}_{ac}(H_V) \subseteq \mathcal{H}_{pp}^{\perp}(H_V)$. The difficult part of the proof of asymptotic completeness is to show that $\mathcal{H}_{pp}^{\perp}(H_V) \subseteq \mathrm{Ran}\,\Omega^\pm$.

Much effort has been spent to prove asymptotic completeness for $N$-body Schrödinger operators on $\mathcal{H}^{(N)} := \bigotimes_{n=1}^{N} L^2(\mathbb{R}^3)$ of the form

$$H_N(V) = \sum_{n=1}^{N} \frac{-\Delta_n}{2 m_n} + V(x) \quad \text{with} \quad V(x) := \sum_{1 \le m < n \le N} V_{mn}(x_m - x_n) \qquad [18]$$

where each pair potential $V_{mn}$ obeys $|\partial_y^\alpha V_{mn}(y)| \le C(1 + |y|)^{-\mu - |\alpha|}$, with $\alpha \in \mathbb{N}_0^d$ being a multi-index. If $\mu > 1$ for all $m \ne n$ then $V$ is called a short-range potential. Conversely, if $0 < \mu \le 1$ then $V$ is a long-range potential. Note that even though each $V_{mn}$ decays at infinity, $|x|^2 = x_1^2 + x_2^2 + \cdots + x_N^2 \to \infty$ alone does not imply that $V(x) \to 0$. In fact, physical intuition tells us that for a cluster $C$ of $N$ particles, whose dynamics is generated by $H_N(V)$, several scenarios for the long-time asymptotic behavior of the evolution are possible:

1. The $N$ particles stay together in their cluster $C$ whose center of mass moves in space at constant velocity.
2. The cluster breaks up into two (or even more) subclusters, $C_1$ and $C_2$, of $N_1$ and $N_2 = N - N_1$ particles, respectively, whose centers of mass drift apart from each other at constant velocities (in the short-range case). For each subcluster $C_1$ and $C_2$, both scenarios may appear again, after waiting sufficiently longer.
3. In the limit $t \to \infty$, possibly after going through (1) and (2) several times, the initial cluster $C$ is broken up into $1 \le K \le N$ subclusters $C_1, \ldots, C_K$, whose centers of mass drift apart from each other at constant velocities according to a free and independent dynamics of their centers of mass.

In some sense, asymptotic completeness says that nothing else than (1)–(3) can possibly happen. (Strictly speaking, asymptotic completeness is a statement about the limit $t \to \infty$ and only involves (3); the actual behavior of $\exp[-itH_V]$ at intermediate times in terms of (1)–(3) is beyond the reach of current mathematics.) It is a key insight of scattering theory that the asymptotics of the time evolution in the sense of (3) is completely

characterized by the asymptotic velocity defined by the strong limit

$$P^+ := \lim_{t \to \infty} e^{itH_N(V)}\, \frac{x}{t}\, e^{-itH_N(V)} \qquad [19]$$

It is a nontrivial fact that $P^+$ exists, commutes with $H_N(V)$, and that bound states are precisely the states with zero asymptotic velocity, while states with nonzero asymptotic velocity are scattering states in $\mathrm{Ran}\,\Omega^\pm$. This then implies asymptotic completeness for short-range potentials. The proof of this dichotomy builds essentially upon positive commutator or Mourre estimates. Given an interval $J$ localized (in energy) away from any eigenvalue of any possible subcluster configuration $C_1, \ldots, C_K$ (called thresholds), the Mourre estimate asserts the existence of a positive constant $M > 0$ and a compact operator $R \in \mathcal{B}(\mathcal{H}^{(N)})$ such that

$$1_J\, i[H_N(V), A]\, 1_J \ge M\, 1_J - R \qquad [20]$$

as a quadratic form, for some suitable operator $A$. This operator $A$ is often chosen to be the dilation generator $A = (1/2)\{p \cdot x + x \cdot p\}$ or a variant thereof.

Again, the proof of asymptotic completeness for long-range potentials is still more difficult and has been carried out only for $\mu > \sqrt{3} - 1$. The additional problem is the comparison dynamics of the relative motion of the clusters $C_1$ and $C_2$ in (2), which is not the free one; the clusters rather influence each other even at large distances.

For more details, see Reed and Simon (1980c) and Derezinski and Gérard (1997). See also the articles Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools, Scattering, Asymptotic Completeness and Bound States in this encyclopedia.

Random Schrödinger Operators

Schrödinger operators $H(V_\omega)$ on $L^2(\mathbb{R}^d)$ or $\ell^2(\mathbb{Z}^d)$ with a random potential $V_\omega$ are called random Schrödinger operators. (If $H(V_\omega)$ acts on $\ell^2(\mathbb{Z}^d)$, then the (continuum) Laplacian $-\Delta$ is replaced by the discrete Laplacian on $\mathbb{Z}^d$ defined by $[-\Delta_{disc} f](x) = \sum_{\nu=1}^{d} \{2f(x) - f(x - e_\nu) - f(x + e_\nu)\}$.) More precisely, given a probability space $(\Omega, P, \mathcal{F})$ and a random variable $\Omega \ni \omega \mapsto V_\omega$, the family $\{H(V_\omega)\}_{\omega \in \Omega}$ defines an operator-valued random variable that we refer to as a random Schrödinger operator. Random quantum systems are physically relevant as models for amorphous materials, and for solids in very heterogeneous external fields or coupled to quantized fields. Suitable ergodicity assumptions on $\omega \mapsto V_\omega$ ensure that the domain of $H_\omega$ and even many spectral properties (in particular, the spectrum $\sigma(H(V_\omega)) \subseteq \mathbb{R}$ itself) are independent of $\omega$ $P$-almost surely. For example, assuming an independent, identical distribution (i.i.d.) of $V_\omega$ in the discrete case on $\mathbb{Z}^d$, one arrives at the Anderson model, which has been most thoroughly studied. Its counterpart for continuum models is a Poisson-distributed $V_\omega$. A model which also has ergodic properties, although deterministic, is the Hofstadter or the Mathieu problem. Most research has been focused on localization, that is, spatial decay properties of the resolvent $\{H(V_\omega) - E\}^{-1}(x, y)$ of $H(V_\omega)$, as $|x - y| \to \infty$, and particularly the question of presence or absence of exponential decay (localization), as this is an important indicator for the transport properties of the material under consideration. Exponential localization of eigenstates has been established for $d = 1$ or strong disorder or sufficiently high energies $E \gg 1$. Localization is also intimately related to bounds on moments of the form $\| |x|^{\alpha/2} \psi_t \| \le C\, t^{\beta}$. The study of the asymptotic distribution of eigenvalues close to the lowest threshold leads to the so-called Lifshitz tails.

The reader is referred to Figotin and Pastur (1992), Cycon et al. (1987), and Stollmann (2001).

(Pseudo)relativistic Schrödinger Operators

Schrödinger operators of the form $H(V) = p^2 + V(x)$ do not observe the invariance principles of (special) relativity, as their derivation is based in classical (Newtonian) mechanics. The free Dirac operator $D := \alpha \cdot p + \beta m$ (here, $\alpha$ and $\beta$ are self-adjoint $4 \times 4$ matrices) possesses the desired relativistic invariance, but it is not semibounded, and the definition of an interacting Dirac operator is notoriously difficult (and unsolved). The replacement of the kinetic energy $(1/2m)p^2$ by the Klein–Gordon operator $\sqrt{p^2 + m^2}$ is a step towards relativistic invariance, which, at the same time, yields a positive operator. This replacement may also be viewed as the restriction of the free Dirac operator to its positive-energy subspace. The virtue of this replacement is that it immediately allows for the study of interacting $N$-particle operators,

$$H_N^{rel}(Z, R) = \sum_{n=1}^{N} \Big\{ \sqrt{-\Delta_n + m^2} - \sum_{k=1}^{K} \frac{Z_k}{|x_n - R_k|} \Big\} + \sum_{1 \le \ell < n \le N} \frac{1}{|x_\ell - x_n|} \qquad [21]$$

much like in [12]. Since $\sqrt{p^2 + m^2} \sim |p|$, as $p \to \infty$, the pseudorelativistic kinetic energy $\sqrt{p^2 + m^2}$ can

balance only less severe local singularities of the potential $V$ than the nonrelativistic kinetic energy $(1/2m)p^2$. Indeed, already the quadratic form $\sqrt{p^2 + m^2} - g|x|^{-1}$ on $C_0^\infty(\mathbb{R}^3)$ associated to a hydrogen-like atom is unbounded from below if $g > 2/\pi$. Hence, the stability of matter becomes a more subtle property of pseudorelativistic matter. The relaxation of the restriction onto the positive subspace of the free Dirac operator also got into the focus of research.

For more details, we refer the reader to Thirring (1997).

See also: Deformation Quantization; Elliptic Differential Equations: Linear Theory; h-Pseudodifferential Operators and Applications; Localization for Quasiperiodic Potentials; Nonlinear Schrödinger Equations; Normal Forms and Semiclassical Approximation; N-Particle Quantum Scattering; Quantum Hall Effect; Quantum Mechanical Scattering Theory; Scattering, Asymptotic Completeness and Bound States; Stability of Matter; Stationary Phase Approximation.

Further Reading

Amrein W, Hinz A, and Pearson D (2005) Sturm–Liouville Theory – Past and Present. Boston: Birkhäuser.
Cycon H, Froese R, Kirsch W, and Simon B (1987) Schrödinger Operators, 1st edn. Berlin: Springer.
Derezinski J and Gérard C (1997) Scattering Theory of Classical and Quantum N-Particle Systems, Text and Monographs in Physics. Berlin: Springer-Verlag.
Dimassi M and Sjöstrand J (1999) Spectral Asymptotics in the Semi-Classical Limit. London Mathematical Society Lecture Notes Series, vol. 268. Cambridge: Cambridge University Press.
Erdös L and Solovej JP (2004) Uniform Lieb–Thirring inequality for the three-dimensional Pauli operator with a strong non-homogeneous magnetic field. Annales Henri Poincaré 5: 671–741.
Figotin A and Pastur L (1992) Spectra of Random and Almost-Periodic Operators. Grundlehren der Mathematischen Wissenschaften, vol. 297. Berlin: Springer-Verlag.
Kato T (1976) Perturbation Theory of Linear Operators, 2nd edn., Grundlehren der mathematischen Wissenschaften, vol. 132. Berlin: Springer-Verlag.
Messiah A (1962) Quantum Mechanics, 1st edn., vol. 2. Amsterdam: North-Holland.
Rauch J and Simon B (eds.) (1997) Quasiclassical Methods. IMA Volumes in Mathematics and Its Applications, vol. 95. Berlin: Springer-Verlag.
Reed M and Simon B (1978) Methods of Modern Mathematical Physics IV. Analysis of Operators, 1st edn., vol. 4. San Diego: Academic Press.
Reed M and Simon B (1980a) Methods of Modern Mathematical Physics: I. Functional Analysis, 2nd edn., vol. 1. San Diego: Academic Press.
Reed M and Simon B (1980b) Methods of Modern Mathematical Physics: II. Fourier Analysis and Self-Adjointness, 2nd edn., vol. 2. San Diego: Academic Press.
Reed M and Simon B (1980c) Methods of Modern Mathematical Physics: III. Scattering Theory, 2nd edn., vol. 3. San Diego: Academic Press.
Robert D (1987) Autour de l'Approximation Semi-Classique, 1st edn. Boston: Birkhäuser.
Schrödinger E (1926) Quantisierung als Eigenwertproblem. Annalen der Physik 79: 489.
Simon B (1979) Functional Integration and Quantum Physics, Pure and Applied Mathematics. New York: Academic Press.
Simon B (2000) Schrödinger operators in the twentieth century. Journal of Mathematical Physics 41: 3523–3555.
Solovej JP (2003) The ionization conjecture in Hartree–Fock theory. Annals of Mathematics 158: 509–576.
Stollmann P (2001) Caught by Disorder. Progress in Mathematical Physics, vol. 20. Boston: Birkhäuser.
Thirring W (ed.) (1997) The Stability of Matter: From Atoms to Stars – Selecta of Elliott H. Lieb, 2nd edn. Berlin: Springer-Verlag.

Schwarz-Type Topological Quantum Field Theory


R K Kaul and T R Govindarajan, The Institute of Mathematical Sciences, Chennai, India
P Ramadevi, Indian Institute of Technology Bombay, Mumbai, India
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Topological quantum field theories (TQFTs) provide powerful tools to probe topology of manifolds, specifically in low dimensions. This is achieved by incorporating very large gauge symmetries in the theory which lead to gauge-invariant sectors with only topological degrees of freedom. These theories are of two kinds: (1) Schwarz type and (2) Witten type.

In a Witten-type topological field theory, action is a BRST exact form, so is the stress energy tensor $T_{\mu\nu}$ so that their functional averages are zero (Witten 1988). The BRST charge is associated with a certain shift symmetry. The topological observables form cohomological classes and semiclassical approximation turns out to be exact. In four dimensions, such theories involving Yang–Mills gauge fields provide a field-theoretic representation for Donaldson invariants.

On the other hand, Schwarz-type TQFTs are described by local action functionals which are not total derivatives but are explicitly independent of metric (Schwarz 1978, 1979, 1987, Witten 1989).

The examples of such theories are topological Chern–Simons (CS) theories and BF theories. Metric independence of the action $S$ of a Schwarz-type gauge theory implies that stress–energy tensor is zero:

$$\frac{\delta S}{\delta g^{\mu\nu}} \equiv T_{\mu\nu} = 0$$

More generally, in the gauge-fixed version of such theories, stress–energy can be BRST exact, where BRST charge corresponds to gauge fixing in contrast to Witten-type theories where corresponding BRST charge corresponds to a combination of shift symmetry and gauge symmetry. There are no local propagating degrees of freedom; the only degrees of freedom are topological. Expectation values of metric-independent operators $W$ are also independent of the metric:

$$\frac{\delta \langle W \rangle}{\delta g^{\mu\nu}} = 0$$

Three-dimensional CS theories are of particular interest, for these provide a framework for the study of knots and links in any 3-manifold. Pioneering indications of the fact that topological invariants can be found in such a setting came in very early when A S Schwarz demonstrated that a particular topological invariant, Ray–Singer analytic torsion (which is equivalent to combinatorial Reidemeister–Franz torsion) can be interpreted in terms of the partition function of a quantum gauge field theory (Schwarz 1978, 1979). In particular, in the weak-coupling limit of CS theory of gauge group $G$ on a manifold $M$, contribution from each topologically distinct flat connection (characterized by the equivalence classes of homomorphisms: $\pi_1(M) \to G$) to the partition function is given by metric-independent Ray–Singer torsion of the flat connection up to a phase. This phase factor is also a topological invariant of framed 3-manifold $M$ (Witten 1989). It was Schwarz who first discussed CS theory as a topological field theory and also conjectured that the well-known Jones polynomial may be related to it (Schwarz 1987). In his famous paper Witten (1989) not only demonstrated this connection, but also set up a general field-theoretic framework to study the topological properties of knots and links in any arbitrary 3-manifold. In addition, this framework provides a method of obtaining some new manifold invariants. As discussed by A Achúcaro and P K Townsend, CS theory also describes gravity in three-dimensional spacetime (Carlip 2003).

BF theories in three dimensions provide another framework for field-theoretic description of topological properties of knots and links. These theories with bilinear action in fields can also be defined in higher dimensions. In particular in $D = 4$, BF theory, besides describing two-dimensional generalizations of knots and links, also provides a field-theoretic interpretation of Donaldson invariants. This provides a connection of these theories with Witten-type TQFTs of Yang–Mills gauge fields. We shall not discuss BF theories in the following and refer to the article BF Theories in this Encyclopedia.

Witten (1995) has also formulated CS theories in three complex dimensions described in terms of holomorphic 1-forms. Such a theory on Calabi–Yau spaces can be interpreted as a string theory in terms of a Witten-type topological field theory of a sigma model coupled to gravity. General topological sigma models in Batalin–Vilkovisky formalism have been constructed by Alexandrov et al. (1997). This is a Schwarz-type theory. However, in its gauge-fixed version, it can also be interpreted as a Witten-type theory. This construction provides a general formulation from which numerous topological field theories emerge. In particular, the Witten A and B models and also multidimensional CS theories are special cases of this construction.

In the following, we shall survey three-dimensional CS theory as a description of knots/links, indicate how manifold invariants can be constructed from invariants for framed links, and also discuss its application to three-dimensional gravity.

Three-Dimensional CS Theory with Gauge Group U(1)

The simplest Schwarz-type topological field theory is the U(1) CS theory described by the action:

$$S = \frac{1}{8\pi} \int_M A\, dA \qquad [1]$$

where $A$ is a connection 1-form $A = A_\mu\, dx^\mu$ and $M$ is the 3-manifold, which we shall take to be $S^3$ for the discussion below. The action has no dependence on the metric. Besides being the U(1) gauge invariant, it is also general coordinate invariant.

In quantum CS field theory, we are interested in the functional averages of gauge-invariant and metric-independent functionals $W[A]$:

$$\langle W[A] \rangle = \frac{1}{Z} \int [DA]\, W[A]\, \exp\{ikS\}, \qquad Z = \int [DA]\, \exp\{ikS\} \qquad [2]$$

This theory captures some of the simple, but interesting, topological properties of knots and links in three dimensions. For a knot $K$, we associate a knot

operator $\oint_K A$ which is gauge invariant and also does not depend on the metric of the 3-manifold. Then for a link made of two knots $K_1$ and $K_2$, we have the loop correlation function $\langle \oint_{K_1} A \oint_{K_2} A \rangle$, which can be evaluated in terms of two-point correlator $\langle A_\mu(x) A_\nu(y) \rangle$ in $\mathbb{R}^3$ (with flat metric). This correlator in Lorentz gauge ($\partial_\mu A^\mu = 0$) is:

$$\langle A_\mu(x) A_\nu(y) \rangle = \frac{i}{k}\, \epsilon_{\mu\nu\lambda}\, \frac{(x - y)^\lambda}{|x - y|^3}$$

so that for two distinct knots $K_1$ and $K_2$

$$\Big\langle \oint_{K_1} A \oint_{K_2} A \Big\rangle = \frac{4\pi i}{k}\, L(K_1, K_2) \qquad [3]$$

where

$$L(K_1, K_2) = \frac{1}{4\pi} \oint_{K_1} dx^\mu \oint_{K_2} dy^\nu\, \epsilon_{\mu\nu\lambda}\, \frac{(x - y)^\lambda}{|x - y|^3}$$

This integral is the well-known topological invariant called "Gauss linking number" of two distinct closed curves. It is an integer measuring the number of times one knot $K_1$ goes through the other knot $K_2$. Linking number does not depend on the location, size, or shape of the knots. In electrodynamics, it has the physical interpretation of work done to move a monopole around a knot while electric current runs through the other knot.

Abelian CS theory also provides a field-theoretic representation for another topological quantity called "self-linking number," also known as "framing number," of the knot. It is related to the functional average of $\langle \oint_K A \oint_K A \rangle$ where two loop integrals are over the same knot. Coincidence singularity is avoided by a topological loop-splitting regularization. For a knot $K$ given by $x^\mu(s)$ parametrized along the length of the knot by $s$, we associate another closed curve $K_f$ given by $y^\mu(s) = x^\mu(s) + \epsilon\, n^\mu(s)$, where $\epsilon$ is a small parameter and $n^\mu(s)$ is a principal normal to the curve at $s$. The coincidence limit is then obtained at the end by taking the limit $\epsilon \to 0$. Such a limiting procedure is called framing and knot $K_f$ is the "frame" of knot $K$. Linking number of the knot $K$ and its frame $K_f$ is the self-linking number of the knot:

$$SL(K; n) = \frac{1}{4\pi} \oint_{K} dx^\mu \oint_{K_f} dy^\nu\, \epsilon_{\mu\nu\lambda}\, \frac{(x - y)^\lambda}{|x - y|^3}$$

Hence coincidence two loop correlator is

$$\Big\langle \oint_K A \oint_K A \Big\rangle = \frac{4\pi i}{k}\, SL(K; n) \qquad [4]$$

Notice that the self-linking number of a knot is independent of the regularization parameter $\epsilon$, but does depend on the topological character of the normal vector field $n^\mu(s)$. It is also related to two geometric quantities called "twist" $T(K)$ and "writhe" $w(K)$ through a theorem due to Calugareanu:

$$SL(K) = T(K) + w(K) \qquad [5]$$

where

$$T(K) = \frac{1}{2\pi} \oint_K ds\, \epsilon_{\mu\nu\lambda}\, \frac{dx^\mu}{ds}\, n^\nu\, \frac{dn^\lambda}{ds}$$

$$w(K) = \frac{1}{4\pi} \oint_K ds \oint_K dt\, \epsilon_{\mu\nu\lambda}\, e^\mu\, \frac{de^\nu}{ds}\, \frac{de^\lambda}{dt}$$

Here

$$e^\mu(s, t) = \frac{y^\mu(t) - y^\mu(s)}{|y(t) - y(s)|}$$

is a unit map from $K \times K \to S^2$ and $n^\mu(s)$ is a normal unit vector field. $T(K)$ and $w(K)$ are not in general integers and represent the amount of twist and coiling of the knot. These are not topological invariants but their sum, self-linking number, is indeed always an integer and a topological invariant. This result has found interesting applications in the studies of the action of enzymes on circular DNA.

Nonabelian CS Theories

Nonabelian CS theories provide far more information about the topological properties of the manifolds as well as knots and links.

Nonabelian CS theory in a 3-manifold $M$ (which as in last section is taken to be $S^3$) is described by the action functional

$$S = \frac{1}{4\pi} \int_M \mathrm{tr}\Big( A \wedge dA + \tfrac{2}{3}\, A \wedge A \wedge A \Big) \qquad [6]$$

where $A$ is a gauge field 1-form which takes its value in the Lie algebra $LG$ of a compact semisimple Lie group $G$. For example, we may take this group to be SU($N$) and $A = A^a T^a$, where $T^a$ is the fundamental $N$-dimensional representation with $\mathrm{tr}\, T^a T^b = \tfrac{1}{2}\delta^{ab}$. Under homotopically nontrivial gauge transformations this action is not invariant, but changes by an amount $2\pi n$ where integers $n$ are the winding numbers characterizing the gauge transformations which fall in homotopic classes given by $\pi_3(G) = \mathbb{Z}$ for a compact semisimple group $G$. However, for quantum theory what is relevant is $\exp[ikS]$ which is invariant even under homotopically nontrivial gauge transformations provided the coupling $k$ takes integer values. This quantized nature of the coupling was pointed out by Deser et al. (1982a, b) (and also they were first to introduce the non-abelian CS term as a gauge-invariant topological

mass term in gauge theories). So for integer k, the quantum field theory we discuss here is gauge invariant.

The topological operators are Wilson loop operators for an oriented knot K:

\[ W_R[K] = \mathrm{tr}_R \, P \exp \oint_K A_R \qquad [7] \]

where $A_R = A^a T^a_R$, with $T^a_R$ the representation matrices of a finite-dimensional representation R of the LG, and P stands for the path ordering of the exponential. The observable Wilson link operator for a link $L = \bigcup_{i=1}^{n} K_i$, carrying representations $R_i$ on the respective component knots, is

\[ W_{R_1 R_2 \ldots R_n}[L] = \prod_{i=1}^{n} W_{R_i}[K_i] \qquad [8] \]

Expectation values of these operators are:

\[ V_{R_1, R_2 \ldots R_n}[L] = \frac{\int [DA] \, W_{R_1 \ldots R_n}[L] \, e^{ikS}}{\int [DA] \, e^{ikS}} \qquad [9] \]

The measure [DA] has to be metric independent. These expectation values depend not only on the isotopy of the link L but also on the set of the representations $\{R_i\}$. These can be evaluated in principle nonperturbatively. For example, when LG = su(N) and each component knot of the link carries the fundamental N-dimensional representation, the Wilson link expectation values satisfy a recursion relation involving three link diagrams which are identical except for one crossing, where they differ as over crossing ($L_+$), under crossing ($L_-$), and no crossing ($L_0$), as shown in Figure 1.

Figure 1 Skein related links $L_+$, $L_0$, $L_-$.

The expectation values of these links are related as (Witten 1989):

\[ q^{N/2} V_N[L_+] - q^{-N/2} V_N[L_-] = \left( q^{1/2} - q^{-1/2} \right) V_N[L_0] \qquad [10] \]

where

\[ q = \exp\left( \frac{2\pi i}{k+N} \right) \]

This is precisely the well-known skein relation for the HOMFLY polynomial. The famous Jones one-variable polynomial (whose two-variable generalization is the HOMFLY polynomial) corresponds to the case of spin-1/2 representation of SU(2) CS theory: $V_2[L]$ = Jones polynomial [L], up to an overall normalization. These skein relations are sufficient to recursively find all the expectation values of links with only fundamental representation on the components. To obtain invariants for any other representation, more general methods have to be developed. A complete and explicit solution of the CS field theory is thus obtained. One such method has been reviewed in Kaul (1999). The method makes use of the following important statement:

Proposition: CS theory on a 3-manifold M with boundary $\Sigma$ is described by a WZNW (Wess–Zumino–Novikov–Witten) conformal field theory (CFT) on the boundary (Figure 2).

Figure 2 Relation of CS to CFT.

Using the same identification, the functional average for Wilson lines ending at n points on the boundary $\Sigma$ is obtained from WZNW field theory on the boundary with n punctures carrying representations $R_i$ (Figure 3).

Figure 3 CS functional integrals with Wilson lines and CFT on punctured boundary.

We can represent the CS functional integral as a vector (Witten 1989) in the Hilbert space H associated with the n-point vacuum expectation values of primary fields in WZNW conformal field theory on the boundary $\Sigma$. Next, to obtain a complete and explicit nonperturbative solution of the CS theory, the theory of knots and links and their connection to braids is invoked.
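The skein relation [10] can be exercised numerically. The sketch below is illustrative only and rests on one assumption that is not stated above: that the expectation value is multiplicative on unlinked components, so that the two-component unlink has value $V_N[U]^2$. Applying [10] to a diagram of the unknot with a single kink (where $L_+$ and $L_-$ are both unknots and $L_0$ is the two-component unlink) then forces $V_N[U]$ to equal the quantum integer $(q^{N/2} - q^{-N/2})/(q^{1/2} - q^{-1/2})$. A second helper solves [10] for a Hopf link whose distinguished crossing is of type $L_+$, with $L_-$ the unlink and $L_0$ the unknot. All function names are hypothetical.

```python
import cmath
import math


def q_param(k, n):
    """q = exp(2*pi*i/(k+N)), as defined below eqn [10]."""
    return cmath.exp(2j * cmath.pi / (k + n))


def quantum_integer(n, q):
    """Quantum integer (q^{N/2} - q^{-N/2}) / (q^{1/2} - q^{-1/2})."""
    return (q ** (n / 2) - q ** (-n / 2)) / (q ** 0.5 - q ** (-0.5))


def skein_residual(n, q, v_plus, v_minus, v_zero):
    """Left-hand side minus right-hand side of eqn [10]; zero if [10] holds."""
    lhs = q ** (n / 2) * v_plus - q ** (-n / 2) * v_minus
    rhs = (q ** 0.5 - q ** (-0.5)) * v_zero
    return lhs - rhs


def hopf_from_skein(n, q):
    """Solve eqn [10] for V[L+] with L+ = Hopf link, L- = unlink, L0 = unknot.

    Assumption (not from the text): V[unlink] = V[unknot]**2.
    """
    u = quantum_integer(n, q)
    return q ** (-n / 2) * (q ** (-n / 2) * u * u + (q ** 0.5 - q ** (-0.5)) * u)


if __name__ == "__main__":
    for k, n in [(1, 2), (3, 2), (5, 3)]:
        q = q_param(k, n)
        u = quantum_integer(n, q)
        # Closed form of the quantum integer at this root of unity.
        closed = math.sin(math.pi * n / (k + n)) / math.sin(math.pi / (k + n))
        # Kinked unknot: L+ = L- = unknot, L0 = two-component unlink.
        print(k, n, abs(u - closed), abs(skein_residual(n, q, u, u, u * u)))
```

Both printed differences should vanish up to floating-point rounding. The Hopf-link value depends on orientation and framing conventions, so it is best read as a direct consequence of [10] as written rather than as a normalized invariant.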

Knots/Links and Braids

Braids have an intimate connection with knots and links which can be summarized as follows:

1. An n-braid is a collection of nonintersecting strands connecting n points on a horizontal rod to n points on another horizontal rod below, strictly excluding any backward traversing of the strands. A general braid can be written as a word in terms of elementary braid generators.
2. We associate representations $R_i$ of the group with the strands as their colors. We also put an orientation on each strand. When all the representations are identical and also all strands are unoriented, we get ordinary braids; otherwise we get colored oriented braids.
3. The colored oriented braids form a groupoid, where the product of different braids is obtained by joining them with both colors and orientations matching on the joined strands. Unoriented monochromatic braids form a group.
4. A knot/link can be formed from a given braid by a process called platting. We connect adjacent strands, namely the (2i+1)th strand to the 2ith strand, carrying the same color and opposite orientations, in both the rods of an even-strand braid (Figure 4a). There is a theorem due to Birman which states that all colored oriented knots/links can be obtained through platting. This construction is not unique.
5. There is another construction associated with braids which relates them to knots and links. We obtain a closure of a braid by connecting the ends of the first, second, third, . . . strands from above to those of the respective first, second, third, . . . strands from below, as shown in Figure 4b. There is a theorem due to Alexander which states that any knot or link can be obtained as a closure of a braid, though again not uniquely.

Figure 4 (a) Platting and (b) closure of braids.

Link Invariants

This connection of braids to knots and links can be used to construct link invariants, say in $S^3$. To do so, two nonintersecting 3-balls are removed from the 3-manifold $S^3$ to obtain a manifold with two $S^2$ boundaries. Then we arrange 2n Wilson lines of, say, SU(N) CS theory as a 2n-strand oriented braid carrying representations $R_i$ in this manifold. The CS functional integral over this manifold is a state in the tensor product of the Hilbert spaces $H_1 \otimes H_2$ associated with conformal field theory on the two boundaries. These boundaries have 2n punctures carrying the sets of representations $\{R_i\}$ and $\{R'_i\}$, respectively, the two sets being permutations of each other. This state can be expanded in terms of some convenient basis given by the conformal blocks for the 2n-point correlation functions of $SU(N)_k$ WZNW conformal field theory. The duality of these correlation functions represents the transformation between different bases for the Hilbert space. Their monodromy properties allow us to write down representations of the braid generators. Since an arbitrary braid is just a word in terms of these generators, this construction provides us a matrix representation $B(\{R_i\}, \{R'_j\})$ for the colored oriented braid in the manifold with two $S^2$ boundaries. Then we plat this braid by gluing two balls $B_1$ and $B_2$ with Wilson lines as shown in Figure 5. Each of the two caps again represents a state $|\psi(\{R_j\})\rangle$ in the Hilbert space associated with the conformal field theory on the punctured boundary ($S^2$). Platting of the braid then simply is the matrix element of the braid representation $B(\{R_i\}, \{R'_j\})$ with respect to the states $|\psi(\{R_i\})\rangle$ and $|\psi(\{R'_j\})\rangle$ corresponding to the two caps $B_1$, $B_2$. Thus, for a link in $S^3$ the invariant is given by the following theorem:

Figure 5 Construction of the link invariant.

Theorem The vacuum expectation value of the Wilson loop operator of a link L constructed from platting of a colored oriented 2n braid with representation $B(\{R_i\}, \{R'_j\})$ is given by (Kaul 1999):

\[ V[L] = \langle \psi(\{R_i\}) | \, B(\{R_i\}, \{R'_j\}) \, | \psi(\{R'_j\}) \rangle \qquad [11] \]

This theorem can be used to calculate the invariant for any arbitrary link. For an unknot U

carrying an N-dimensional representation in an SU(N) CS theory, the knot invariant is:

\[ V_N[U] = [N], \quad \text{where} \quad [N] = \frac{q^{N/2} - q^{-N/2}}{q^{1/2} - q^{-1/2}} \]

Wilson link expectation values calculated this way depend on the regularization, that is, the definition of framing used in defining coincident loop correlators. One such regularization usually used is the standard framing, where the frame for every knot is so chosen that its self-linking number is zero.

The procedure outlined here has been used for explicit computations of knot/link invariants. This has led to answers to several questions of knot theory. One such question relates to distinguishing chirality of knots (Kaul 1999). In this context, newer invariants constructed with arbitrary representations living on the knots are more powerful than the older polynomial invariants. For example, invariants with spin-3/2 representation in an SU(2) CS theory are sensitive to the chirality of many knots which otherwise is not detected by the Jones, HOMFLY, and Kauffman polynomials. However, invariants obtained from CS theories do not distinguish all chiral knots. There is a class of links known as "mutants" which are not distinguished by CS link invariants (Kaul 1999). A mutant link is obtained by removing a portion of the weaving pattern in a link and then gluing it back after rotating it about any one of three orthogonal axes by an amount $\pi$.

The CS invariants of knots and links can also be used to construct special 3-manifold invariants. Hence, CS theory provides an important tool to study these.

Manifold Invariants from CS Theory

Different 3-manifolds can be constructed through a procedure called "surgery of framed knots and links" in $S^3$ (Lickorish–Wallace theorem). This construction is not unique. That is, there are many framed knots and links which give the same manifold. However, rules of this equivalence are known: these are called "Kirby moves."

Classification of 3-manifolds would involve finding a method of associating a quantity with the manifold obtained by surgery on the corresponding framed knot/link in $S^3$. If the Kirby moves on the framed knot/link leave this quantity unchanged, then it is a 3-manifold invariant. Knot/link invariants of nonabelian CS theories provide a method of finding such 3-manifold invariants. Equivalently, this procedure gives an algebraic meaning to the surgery construction of 3-manifolds. Details of this method for generating manifold invariants are given in Kaul (1999) and Kaul and Ramadevi (2001).

Surgery of Framed Knots/Links and Kirby Moves

As discussed earlier, the frame of a knot K is an associated closed curve $K_f$ going along the length of the knot and wrapping around it a certain number of times. The self-linking number (also called framing number) is equal to the linking number of the knot with its frame. There are several ways of fixing this framing. The "standard" framing is one in which the frame number of the knot, that is, the linking number of the knot and its frame, is zero. On the other hand, "vertical" framing is obtained by choosing the frame vertically above the knot projected on to a plane. In such a frame, the framing number of a knot is the same as its crossing number. In constructing the 3-manifold invariants from CS theories, we need vertical framing. The framing number may be denoted by writing the integer by the side of the knot. We denote a framed r-component link by [L, f], where the framing f = (n(1), n(2), . . . , n(r)) is a set of integers denoting the framing numbers of the component knots $K_1, K_2, \ldots, K_r$ in the link L.

According to the Lickorish–Wallace theorem, surgery over links with vertical framing in $S^3$ yields all the 3-manifolds. This surgery is performed in the following way.

Take a framed r-component link [L, f] in $S^3$. Thicken the component knots $K_1, K_2, \ldots, K_r$ such that the solid tubes $N_1, N_2, \ldots, N_r$ so obtained are nonintersecting. Then the complement $S^3 - (N_1 + N_2 + \cdots + N_r)$ will have r toral boundaries. On the ith toral boundary, we imagine an appropriate curve winding n(i) times around the meridian and once along the longitude. Perform a modular transformation so that this curve bounds a disk. This construction is done with each of the toral boundaries. The tubes $N_1, N_2, \ldots, N_r$ are then glued back in to the respective gaps. This surgery thus yields a new 3-manifold. This construction is not unique. The rules of equivalence for surgery on framed knots/links in $S^3$ are two independent Kirby moves.

Kirby move I Take an arbitrary r-component framed link [L, f] in $S^3$ and consider a curve C with framing number +1 going around the unlinked strands of L as in Figure 6a. We refer to this (r+1)-component link as H[X], where X represents a weaving pattern of the strands. Kirby move I consists of twisting the disk enclosed by C in the clockwise direction from below by an amount $2\pi$. This twisting thereby introduces new crossings

between the curve C and the strands enclosed by it. Then the curve C is removed, giving us a new r-component link U[X] of Figure 6b. Framing numbers n'(i) of the component knots in link U[X] are related to the framing numbers n(i) of the framed link [L, f] as $n'(i) = n(i) - (L(K_i, C))^2$, where $L(K_i, C)$ is the linking number of knot $K_i$ and closed curve C. The surgery of the framed links in Figures 6a and 6b will give the same 3-manifold.

Figure 6 Kirby move I: (a) H[X]; (b) U[X].

Inverse Kirby move I involves removal of a curve C with framing number −1 (instead of +1) after making one complete anticlockwise twist from below on the disk enclosed by C. In the process the unlinked strands get twisted in the anticlockwise direction, leading to changed framing numbers $n'(i) = n(i) + (L(K_i, C))^2$ of the component knots $K_i$.

Kirby move II This move consists of removing a disjoint unknot C with framing −1 from the framed link [L, f] without changing the rest of the link, as in Figure 7. Surgery of the two links in Figure 7 will give the same manifold.

Figure 7 Kirby move II.

Inverse Kirby move II involves removal of a disjoint unknot with framing +1 (instead of −1) from a framed link.

3-Manifold Invariants

Now a 3-manifold invariant can be constructed by an appropriate combination of the invariants of framed links in such a way that this algebraic expression is unchanged under the Kirby moves. We need for this purpose invariants for links in $S^3$ with vertical framing.

Let M be the manifold obtained from surgery of an r-component framed link [L, f] in $S^3$. Then a manifold invariant $\hat{F}^{(G)}[M]$ is given as a linear combination of the framed link invariants $V^{(G)}_{R_1, \ldots, R_r}[L, f]$, with representations $R_1, R_2, \ldots, R_r$ living on the component knots, obtained from CS theory based on a compact semisimple group G:

\[ \hat{F}^{(G)}[M] = \alpha^{-\sigma[L,f]} \sum_{R_1, \ldots, R_r} \left( \prod_{i=1}^{r} \mu_{R_i} \right) V^{(G)}_{R_1, R_2, \ldots, R_r}[L, f] \qquad [12] \]

Here $\sigma[L, f]$ is the signature of the linking matrix and $\mu_{R_i} = S_{0R_i}$, $\alpha = e^{i\pi c/4}$, where c is the central charge of the associated WZNW conformal field theory and $S_{0R_i}$ denotes the matrix element of the modular matrix S. General S-matrix elements for any compact group are given by

\[ S_{R_1 R_2} = (-i)^{(d-r)/2} \, |L_\omega / L|^{-1/2} \, (k + C_v)^{-1/2} \sum_{\omega \in W} \epsilon(\omega) \exp\left( \frac{2\pi i}{k + C_v} \left( \omega(\Lambda_{R_1} + \rho), \, \Lambda_{R_2} + \rho \right) \right) \]

where W denotes the Weyl group and its elements $\omega$ are words constructed using the generators $s_{\alpha_i}$, that is, $\omega = \prod_i s_{\alpha_i}$ and $\epsilon(\omega) = (-1)^{\ell(\omega)}$ with $\ell(\omega)$ the length of the word. Here the $\Lambda_{R_i}$ denote the highest weights of the representations $R_i$ and $\rho$ is the Weyl vector. The action of the Weyl generator $s_\alpha$ on a weight $\Lambda_R$ is

\[ s_\alpha(\Lambda_R) = \Lambda_R - 2\alpha \, \frac{(\Lambda_R, \alpha)}{(\alpha, \alpha)} \]

and $|L_\omega / L|$ is the ratio of weight and coroot lattices (equal to the determinant of the Cartan matrix for simply laced algebras). Also $C_v$ is the quadratic Casimir invariant for the adjoint representation.

It is important to stress that the expression $\hat{F}^{(G)}[M]$ is unchanged under both Kirby moves I and II (for detailed proof, see Kaul (1999) and Kaul and Ramadevi (2001)). Notice that for every compact gauge group, we have a new 3-manifold invariant.

Few examples of 3-manifolds Table 1 lists the algebraic expressions of this invariant calculated explicitly from the formula in eqn [12] for a few 3-manifolds. All these examples can be constructed by surgery on an unknot U(f) with different frame numbers f.

In Table 1 L[p, q] stands for Lens spaces of the type (p, q) and $C_R$ is the quadratic Casimir invariant

Table 1 Invariants for some simple manifolds

U(f)      M                   $\hat{F}^{(G)}[M]$
U(0)      $S^2 \times S^1$    $1/S_{00}$
U(±1)     $S^3$               1
U(+2)     $RP^3$              $\alpha^{-1} \sum_R S_{0R} \, q^{2C_R} \, S_{0R}/S_{00}$
U(+p)     L[p, 1]             $\alpha^{-1} \sum_R S_{0R} \, q^{pC_R} \, S_{0R}/S_{00}$

for representation R of the Lie algebra of the gauge group G.

The partition function of a CS theory on M is also an invariant characterizing the 3-manifold. This has been calculated for several manifolds by different methods. The invariant $\hat{F}^{(G)}[M]$ listed above for various manifolds is related to the CS partition function $Z^{(G)}[M]$: $\hat{F}^{(G)}[M] = S_{00}^{-1} Z^{(G)}[M]$. So the method of constructing 3-manifold invariants above can also be used to calculate the partition function of CS theories.

3D Gravity and CS Theory

Three-dimensional CS theory also provides a description of gravity. 3D gravity including a cosmological constant was first discussed by Deser and Jackiw (1984). The action with cosmological constant $\Lambda = -1/\ell^2$ is:

\[ S = \frac{1}{16\pi G} \int_M d^3x \, \sqrt{g} \, (R - 2\Lambda) \qquad [13] \]

G is Newton's constant, g is the metric on the 3-manifold M, and R is the scalar curvature. Solutions of the Einstein equations of motion have a constant positive (negative) curvature if $\Lambda$ is positive (negative). It is also well known that there are no dynamical degrees of freedom for gravity in dimensions $D \le 3$; it is indeed described by topological field theories. The gravity action above can be rewritten as a CS gauge theory in first-order formulation (Carlip 2003). For triads $e^a_\mu$ and spin connection $\omega^a_\mu$ of Euclidean gravity, we define 1-forms $e = e^a_\mu T^a dx^\mu$, $\omega = \omega^a_\mu T^a dx^\mu$, which have values in the Lie algebra of SU(2) whose generators are $T^a = i\sigma^a/2$ with $\sigma^a$ the three Pauli matrices. In terms of these we define two gauge field 1-forms A and $\bar{A}$ as:

\[ A = \left( \frac{ie}{\ell} + \omega \right), \qquad \bar{A} = \left( \frac{ie}{\ell} - \omega \right) \]

Then the Euclidean gravity action can be written in terms of two CS actions, $S_{CS}[A]$ and $S_{CS}[\bar{A}]$, as

\[ S = k S_{CS}[A] - k S_{CS}[\bar{A}] \qquad [14] \]

where the coupling constant $k = \ell/(4G)$ for negative cosmological constant $\Lambda = -1/\ell^2$. The gauge group for this theory is SL(2, C). Infinitesimal diffeomorphisms are described by field-dependent gauge transformations. The corresponding gauge group for Minkowski gravity with negative cosmological constant $\Lambda$ is SL(2, R) × SL(2, R). For positive $\Lambda$, one gets SO(3, 1) and SO(4) for Minkowski and Euclidean metrics, respectively. For $\Lambda = 0$, we have ISO(2, 1) (ISO(3)) as the gauge group for Minkowski (Euclidean) gravity. Hence, the sign of the cosmological constant determines the gauge group of the CS theory.

Identification of 3D gravity with CS theory can be used with some advantage to find the partition function for a black hole in 3D gravity with negative cosmological constant. This in turn yields an expression for the entropy of the black hole.

BTZ Black Hole and Its Partition Function

Only for negative $\Lambda$ do we have a black hole solution of Einstein's equations. This solution, known as the BTZ black hole (Carlip 2003), in Euclidean gravity is given by the metric

\[ ds^2_E = \left( -M + \frac{r^2}{l^2} - \frac{J^2}{4r^2} \right) d\tau^2 + \left( -M + \frac{r^2}{l^2} - \frac{J^2}{4r^2} \right)^{-1} dr^2 + r^2 \left( d\phi - \frac{J}{2r^2} \, d\tau \right)^2 \]

It is specified by two parameters M and J (the mass and angular momentum). By a coordinate transformation, this metric can be rewritten as $ds^2_E = (l^2/z^2)(dx^2 + dy^2 + dz^2)$, with z > 0. This is the 3D upper-half hyperbolic space and can be rewritten using spherical polar coordinates as

\[ ds^2_E = \frac{l^2}{R^2 \sin^2\chi} \left( dR^2 + R^2 d\chi^2 + R^2 \cos^2\chi \, d\theta^2 \right) \]

We have the identifications $(R, \theta, \chi) \sim (R \exp\{2\pi r_+/l\}, \, \theta + \{2\pi r_-/l\}, \, \chi)$, where $r_+$ and $r_-$ are the outer and inner horizon radii, respectively. It is clear from this identification that topologically the metric corresponds to a solid torus. The functional integral over this manifold represents a state in the Hilbert space specified by the mass and angular momentum. It is the microcanonical ensemble partition function and its logarithm is the entropy of the black hole.

To evaluate this partition function, the connection 1-form is kept at a constant value on the toroidal boundary through a gauge transformation. We

define local coordinates on the torus boundary $z = x + \tau y$ such that $\oint_a dz = 1$, $\oint_b dz = \tau$, where a (b) stands for the contractible (noncontractible) cycle of the solid torus and $\tau = \tau_1 + i\tau_2$ is the modular parameter of the boundary torus. Then the connection describing the black hole is

\[ A = \left( -\frac{i\pi \tilde{u}}{\tau_2} \, dz + \frac{i\pi u}{\tau_2} \, d\bar{z} \right) T^3 \qquad [15] \]

where u and $\tilde{u}$ are canonically conjugate with commutation relation $[\tilde{u}, u] = (2/\pi) \, \tau_2 \, (k+2)^{-1}$. These are related to black hole parameters through holonomies of the gauge field A around the a- and b-cycles (for a classical black hole solution $\Theta = 2\pi$):

\[ u = -\frac{i}{2\pi} \left( -i\tau\Theta + \frac{2\pi (r_+ + i|r_-|)}{l} \right), \qquad \tilde{u} = -\frac{i}{2\pi} \left( -i\bar{\tau}\Theta + \frac{2\pi (r_+ + i|r_-|)}{l} \right) \]

For a fixed value of the connection, namely u, the functional integral is described by a state $\psi_0$ with no Wilson line in the bulk. The states with a Wilson line carrying spin j/2 are given by Labastida and Ramallo:

\[ \psi_j(u, \tau) = \exp\left( \frac{\pi k}{4\tau_2} u^2 \right) \chi_j(u, \tau) \]

where the Weyl–Kac characters for affine su(2) are

\[ \chi_j(u, \tau) = \frac{\Theta^{(k+2)}_{j+1}(u, \tau) - \Theta^{(k+2)}_{-j-1}(u, \tau)}{\Theta^{2}_{1}(u, \tau) - \Theta^{2}_{-1}(u, \tau)} \]

and the $\Theta$ functions are defined by

\[ \Theta^{k}_{\mu}(u, \tau) = \sum_{n \in \mathbb{Z}} \exp\left\{ 2\pi i k \left[ \left( n + \frac{\mu}{2k} \right)^2 \tau + \left( n + \frac{\mu}{2k} \right) u \right] \right\} \]

Given the collection of states $\psi_j$, we write the partition function by choosing an appropriate ensemble for fixed mass and angular momentum. This black hole partition function is:

\[ Z_{BH} = \int d\mu(\tau, \bar{\tau}) \left| \sum_{j=0}^{k} \left( \psi_j(0, \tau) \right)^{*} \psi_j(u, \tau) \right|^2 \]

where the modular invariant measure is $d\mu(\tau, \bar{\tau}) = d\tau \, d\bar{\tau}/\tau_2^2$. This integral can be worked out for large black hole mass and zero angular momentum in the saddle-point approximation. The computation yields (Govindarajan et al. 2001):

\[ Z_{BH} = \frac{l^2}{r_+^2} \sqrt{\frac{8 r_+ G}{\pi l^2}} \, \exp\left( \frac{2\pi r_+}{4G} \right) + \cdots \qquad [16] \]

This gives not only the leading Bekenstein–Hawking behavior of the black hole entropy S but also a subleading logarithmic term:

\[ S = \ln Z_{BH} = \frac{2\pi r_+}{4G} - \frac{3}{2} \ln \frac{2\pi r_+}{4G} + \cdots \]

This is an interesting application of CS theory to 3D gravity. In fact, three-dimensional CS theory also has applications in the study of black holes in four-dimensional gravity: the boundary degrees of freedom of a black hole in 4D are also described by an SU(2) CS theory. This allows a calculation of the degrees of freedom of, for example, the Schwarzschild black hole. For large area black holes, this in turn results in an expression for the entropy which, besides a Bekenstein–Hawking area term, has a logarithmic area correction with the same coefficient −3/2 as above. This suggests a universal, dimension-independent nature of these logarithmic corrections.

See also: BF Theories; The Jones Polynomial; Knot Theory and Physics; Large-N and Topological Strings; Quantum 3-Manifold Invariants; Topological Quantum Field Theory: Overview.

Further Reading

Alexandrov M, Kontsevich M, Schwarz A, and Zaboronsky O (1997) The geometry of the master equations and topological quantum field theory. International Journal of Modern Physics A 12: 1405–1430.
Atiyah M (1989) The Geometry and Physics of Knots. Cambridge: Cambridge University Press.
Carlip S (2003) Quantum Gravity in 2 + 1 Dimensions, Cambridge Monographs on Mathematical Physics. Cambridge: Cambridge University Press.
Deser S and Jackiw R (1984) Three-dimensional cosmological gravity: dynamics of constant curvature. Annals of Physics 153: 405–416.
Deser S, Jackiw R, and Templeton S (1982a) Three-dimensional massive gauge theory. Physical Review Letters 48: 975–978.
Deser S, Jackiw R, and Templeton S (1982b) Topological massive gauge theories. Annals of Physics NY 140: 372–411.
Govindarajan TR, Kaul RK, and Suneeta V (2001) Logarithmic correction to the Bekenstein–Hawking entropy of the BTZ black hole. Classical and Quantum Gravity 18: 2877–2886.
Kaul RK (1999) Chern–Simons theory, knot invariants, vertex models and three-manifold invariants. In: Kaul RK, Maharana J, Mukhi S, and Kalyana Rama S (eds.) Frontiers of Field Theory, Quantum Gravity and Strings, Horizons in World Physics, vol. 227, pp. 45–63. New York: NOVA Science Publishers.

Kaul RK and Ramadevi P (2001) Three-manifold invariants from Chern–Simons field theory with arbitrary semi-simple gauge groups. Communications in Mathematical Physics 217: 295–314.
Schwarz AS (1978) The partition function of degenerate quadratic functional and Ray–Singer invariants. Letters in Mathematical Physics 2: 247–252.
Schwarz AS (1979) The partition function of a degenerate functional. Communications in Mathematical Physics 67: 1–16.
Schwarz AS (1987) New Topological Invariants in the Theory of Quantized Fields. Abstracts in the Proceedings of International Topological Conference, Baku, Part II.
Witten E (1988) Topological quantum field theory. Communications in Mathematical Physics 117: 353–386.
Witten E (1989) Quantum field theory and the Jones polynomial. Communications in Mathematical Physics 121: 351–399.
Witten E (1995) Chern–Simons gauge theory as a string theory. Progress in Mathematics 133: 637–678.
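As a numerical cross-check of eqn [12] and the unknot entries of Table 1 in the article above, the following sketch evaluates the invariant for G = SU(2). It is illustrative only and assumes the standard SU(2) level-k data, which the article does not spell out: representations labelled by n = 2j + 1 = 1, . . . , k + 1, modular matrix elements $S_{0n} = \sqrt{2/(k+2)} \sin(n\pi/(k+2))$, Casimir $C_R = (n^2 - 1)/4$, $q = \exp(2\pi i/(k+2))$, central charge c = 3k/(k+2), and the framed unknot value $V_R[U(f)] = q^{f C_R} S_{0R}/S_{00}$. The linking matrix of the unknot U(f) is the 1 × 1 matrix (f), so its signature is the sign of f. Function names are hypothetical.

```python
import cmath
import math


def su2_data(k):
    """Assumed SU(2) level-k data: S_{0n} and Casimirs for n = 1..k+1."""
    labels = range(1, k + 2)
    s0 = [math.sqrt(2 / (k + 2)) * math.sin(n * math.pi / (k + 2)) for n in labels]
    casimir = [(n * n - 1) / 4 for n in labels]
    return s0, casimir


def unknot_surgery_invariant(k, f):
    """Eqn [12] for surgery on the unknot U(f), with mu_R = S_{0R}.

    Assumes V_R[U(f)] = q^{f C_R} S_{0R} / S_{00} (not stated in the article).
    """
    s0, casimir = su2_data(k)
    c = 3 * k / (k + 2)                       # WZNW central charge for SU(2)_k
    alpha = cmath.exp(1j * math.pi * c / 4)   # alpha = exp(i*pi*c/4)
    sigma = (f > 0) - (f < 0)                 # signature of the 1x1 linking matrix (f)
    total = sum(
        s * cmath.exp(2j * math.pi * f * cr / (k + 2)) * s / s0[0]
        for s, cr in zip(s0, casimir)
    )
    return alpha ** (-sigma) * total


if __name__ == "__main__":
    for k in range(1, 5):
        s0, _ = su2_data(k)
        print(
            k,
            abs(unknot_surgery_invariant(k, 1) - 1),          # S^3 row of Table 1
            abs(unknot_surgery_invariant(k, 0) - 1 / s0[0]),  # S^2 x S^1 row
            unknot_surgery_invariant(k, 2),                   # RP^3 row
        )
```

Under these assumptions the first two printed differences vanish up to rounding, in agreement with the values 1 and $1/S_{00}$ listed for $S^3$ and $S^2 \times S^1$.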

Seiberg–Witten Theory
Siye Wu, University of Colorado, Boulder, CO, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Gauge theory is the cornerstone of the standard model of elementary particles. The original motivation for studying supersymmetric gauge theories was phenomenological (such as the hierarchy problem). They display a large number of interesting phenomena and become the models for the dynamics of strongly coupled field theories. They also offer valuable insights to nonsupersymmetric models. In N = 1 gauge theory, the low-energy effective superpotential is holomorphic both in the superfields and in the coupling constants. This powerful holomorphy principle, together with symmetry and various limits, often determines the effective superpotential completely. Such theories often have quantum moduli spaces where the classical singularities are smoothed out, continuous interpolation between Higgs and confinement phases, massless composite mesons and baryons, and dual theories weakly coupled at low energy. For N = 2 pure gauge theory, the low-energy effective theory is an abelian gauge theory in which both the kinetic term and the coupling constant are determined by a holomorphic prepotential. The electric–magnetic duality is in the ambiguity of the low-energy description. Much physical information, such as the coupling constant, the Kähler metric on the quantum moduli, and the monodromy around the singularities, can be incorporated in a family of elliptic curves. This low-energy exact solution is also useful to topological field theory that can be obtained from the N = 2 theory by twisting. Much of the above was the work of Seiberg and Witten in the mid-1990s. In this article, we review some of the fascinating aspects of N = 1 and N = 2 supersymmetric gauge theories.

N = 1 Gauge Theory and Seiberg Dualities

N = 1 Yang–Mills Theory and QCD

Let G be a compact Lie group and let P be a principal G-bundle over the Minkowski space $R^{3,1}$. In pure gauge theory, the dynamical variable is a connection A in P; two connections are equivalent if they are related by a gauge transformation. Let $F \in \Omega^2(R^{3,1}, \mathrm{ad}\,P)$ be the curvature of A. It decomposes into the self-dual and anti-self-dual parts, that is, $F = F^+ + F^-$, where $F^\pm = (1/2)(F \mp \sqrt{-1}\,{*F})$. With a suitably normalized nondegenerate bilinear form $\langle\,,\rangle$ on the Lie algebra g, the classical action is

\[ S_{YM}[A] = \int_{R^{3,1}} \left( -\frac{1}{2g^2} \langle F \wedge {*F} \rangle + \frac{\theta}{16\pi^2} \langle F \wedge F \rangle \right) = \int_{R^{3,1}} \left( \frac{\bar{\tau}}{8\pi} \langle F^+ \wedge F^+ \rangle + \frac{\tau}{8\pi} \langle F^- \wedge F^- \rangle \right) \]

Here g > 0 is the coupling constant and $\theta \in R$, the $\theta$-angle, and

\[ \tau = \frac{\theta}{2\pi} + \frac{4\pi\sqrt{-1}}{g^2} \]

is a complex number in the upper-half plane that incorporates both. Classically, the theory is conformally invariant and the dynamics is independent of the $\theta$-term. At the quantum level, $\theta$ (mod $2\pi$) appears in the path integral and parametrizes inequivalent vacua. The coupling constant runs as the energy $\mu$ varies, satisfying the renormalization group equation

\[ \mu \frac{dg}{d\mu} = -\frac{b_0}{(4\pi)^2} \, g^3 + o(g^5) \]

where the right-hand side is called the $\beta$-function $\beta(g)$. This introduces, when $b_0 \ne 0$, a mass scale $\Lambda$ given by

\[ (\Lambda/\mu)^{b_0} = e^{-8\pi^2/g(\mu)^2} \]

up to one-loop. Consequently, the classical scale invariance is lost. It is convenient to redefine $\Lambda$ as a complex quantity such that

\[ (\Lambda/\mu)^{b_0} = e^{2\pi\sqrt{-1}\,\tau(\mu)} \]

For pure gauge theory, $b_0 = (11/3) h^\vee$, where $h^\vee$ is the dual Coxeter number of g. At high energy ($\mu \to \infty$), the coupling becomes weak ($g \to 0$); this is known as asymptotic freedom. On the contrary, the interaction becomes strong at low energy. It is believed that the theory exhibits confinement and has a mass gap.

QCD, or quantum chromodynamics, is gauge theory coupled to matter fields. Suppose the boson $\phi$ and the fermion $\psi$ are in the (complex) representations $R_b$ and $R_f$ of G, respectively. That is, $\phi \in \Gamma(P \times_G R_b)$, or $\phi$ is a section of the bundle $P \times_G R_b$, and $\psi \in \Gamma(S \otimes (P \times_G R_f))$, where S is the spinor bundle over $R^{3,1}$. The classical action is

\[ S_{QCD}[A, \phi, \psi] = S_{YM}[A] + \frac{1}{g^2} \int d^4x \left( \frac{1}{2} |\nabla\phi|^2 + \sqrt{-1}\,(\bar{\psi}, \nabla\!\!\!\!/\,\psi) + \cdots \right) \]

where $\nabla$ is the covariant derivative, $\nabla\!\!\!\!/$ is the Dirac operator coupled to A, and we have omitted possible mass and potential terms. The quantum theory depends sensitively on the representations $R_b$ and $R_f$. In the $\beta$-function, we have

\[ b_0 = \frac{11}{3} h^\vee - \frac{1}{6} \nu(R_b) - \frac{2}{3} \nu(R_f) \]

where $\nu(R)$ is the Dynkin index of a representation R. If $b_0 < 0$, the theory is free in the infrared but strongly interacting in the ultraviolet. If $b_0 > 0$, the converse is true; in particular, the theory exhibits asymptotic freedom. If $b_0 = 0$, the situation depends on the sign of the two or higher-loop contributions.

Pure N = 1 supersymmetric gauge theory is one on the superspace $R^{3,1|(2,2)}$ with a constraint that the curvature vanishes in the odd directions. The dynamical variables are in the superfield strength W, a 1|(1, 0)-form valued in ad P. In components, the theory is a gauge field coupled to a Majorana or Weyl fermion in the adjoint representation. Let $S^\pm$ be spinor bundles of positive (negative) chiralities, respectively, and let $\lambda$ be a section of $S^+ \otimes \mathrm{ad}\,P$. The action, written both in superspace and in ordinary spacetime, is

\[ S^{N=1}_{SYM}[A, \lambda] = \frac{1}{4\pi} \, \mathrm{Im} \int d^4x \, d^2\theta \; \tau \langle W, W \rangle = S_{YM}[A] + \frac{1}{g^2} \int d^4x \, \sqrt{-1}\, \langle \bar{\lambda}, \nabla\!\!\!\!/^{+} \lambda \rangle \]

Since $b_0 = 3h^\vee$, the theory is asymptotically free but strongly coupled at low energy. Classically, the theory has a $U(1)_R$ chiral symmetry. However, due to the anomaly, only the subgroup $Z_{2h^\vee}$ survives at the quantum level. The instanton effect yields gaugino condensation $\langle \lambda\lambda \rangle \sim \Lambda^3$. The symmetry is thus further broken to $Z_2$, resulting in $h^\vee$ inequivalent vacua.

The N = 1 QCD has additional chiral superfields $\Phi$ in a representation R, including the bosons $\phi \in \Gamma(P \times_G R)$ and the fermions $\psi \in \Gamma(S^+ \otimes (P \times_G R))$. In the absence of a superpotential, the action is

\[ S^{N=1}_{SQCD}[A, \lambda, \phi, \psi] = S^{N=1}_{SYM}[A, \lambda] + \frac{1}{g^2} \int d^4x \, d^2\theta \, d^2\bar{\theta} \; \frac{1}{2} |\Phi|^2 \]

In components, the second term is

\[ \frac{1}{g^2} \int d^4x \left( \frac{1}{2} |\nabla\phi|^2 + \sqrt{-1}\,(\bar{\psi}, \nabla\!\!\!\!/^{+} \psi) - \frac{1}{2} |D(\phi)|^2 + \cdots \right) \]

where $D : R \to g^*$ is the moment map of the Hamiltonian G-action on R, and we have omitted other terms containing fermionic fields. The moduli space of classical vacua is the symplectic quotient $D^{-1}(0)/G = R/\!/G$. It is the same as the Kähler quotient $R^s/G^C$, where the stable subset $R^s = \{\phi \in R \mid G^C \cdot \phi \cap D^{-1}(0) \ne \emptyset\}$ is open and dense in R. Again, the quantum theory depends on the representation R. Since $b_0 = 3h^\vee - (1/2)\nu(R)$, the theory is asymptotically free, infrared free, or scale invariant (to one-loop) when $\nu(R) < 6h^\vee$, $\nu(R) > 6h^\vee$, or $\nu(R) = 6h^\vee$, respectively. The moduli space may be lifted by a superpotential or modified by other quantum effects.

SU($N_c$) Theories at Low Energy

We now consider N = 1 QCD with G = SU($N_c$); $N_c$ is the number of colors. The matter field consists of $N_f$ copies of quarks $Q_i$ ($1 \le i \le N_f$) in the fundamental representation of SU($N_c$) and $N_f$ copies of antiquarks $Q'_{i'}$ ($1 \le i' \le N_f$) in the conjugate representation. Using the isomorphism of su($N_c$) with its dual, the moment map is

\[ D(Q, Q') = \text{traceless part of } \sqrt{-1}\,(Q Q^\dagger - Q' Q'^\dagger) \]

So $(Q, Q') \in D^{-1}(0)$ if and only if $Q Q^\dagger - Q' Q'^\dagger = c I_{N_c}$ for some $c \in R$. If $N_f < N_c$, then c = 0 and

\[ Q, \; Q' \sim \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_{N_f} \end{pmatrix} \]

for some $a_k \ge 0$. Generically, these $a_k > 0$ and the gauge group SU($N_c$) is broken to SU($N_c - N_f$). If $N_f \ge N_c$, then

\[ Q \sim \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_{N_c} \end{pmatrix}, \qquad Q' \sim \begin{pmatrix} a'_1 & & \\ & \ddots & \\ & & a'_{N_c} \end{pmatrix} \]

where $a_k, a'_k \ge 0$ satisfy $a_k^2 - a'^2_k = c$ for some $c \in R$. The gauge group is completely broken. The low-energy superfields are the mesons $M_{ii'} = Q_i Q'_{i'}$ and, if $N_f \ge N_c$, the baryons

\[ B_{i_{N_c+1} \cdots i_{N_f}} = \frac{1}{N_c!} \, \epsilon_{i_1 \cdots i_{N_f}} \, Q_{i_1} \cdots Q_{i_{N_c}}, \qquad B'^{\,i'_{N_c+1} \cdots i'_{N_f}} = \frac{1}{N_c!} \, \epsilon^{i'_1 \cdots i'_{N_f}} \, Q'_{i'_1} \cdots Q'_{i'_{N_c}} \]

When $N_f < N_c$, Affleck et al. (1984) found a dynamically generated superpotential

\[ W_{\mathrm{eff}}(M) = (N_c - N_f) \left( \frac{\Lambda^{3N_c - N_f}}{\det M} \right)^{1/(N_c - N_f)} \]

generated by the instanton effect when $N_f = N_c - 1$ and by gaugino condensation in the unbroken SU($N_c - N_f$) theory when $N_f < N_c - 1$. It is also the unique superpotential (up to a multiplicative constant) that is consistent with the global symmetry and supersymmetry. The potential pushes the vacuum to infinity. Therefore, contrary to the classical picture, theories with $N_f < N_c$ do not have a vacuum at the quantum level.

When $N_f \ge 3N_c$, the theory is not strongly interacting at low energy, and perturbation methods are reliable. (When $N_f = 3N_c$, the two-loop contribution to the $\beta$-function is negative.) We now look at the range $N_c \le N_f < 3N_c$. The cases $N_f = N_c, N_c + 1$ and $N_c + 2 \le N_f < 3N_c$ were studied in Seiberg (1994) and Seiberg (1995), respectively.

When $N_f = N_c$, the classical moduli space is $\det M = BB'$. The quantum theory at low energy consists of the fields M, B, B' satisfying the constraint $\det M - BB' = \Lambda^{2N_c}$. The quantum moduli space is smooth everywhere, and there are no additional massless particles. So the gluons are heavy throughout the moduli space. This is due to confinement near the origin, where the interaction is strong, and due to the Higgs mechanism far out in the flat direction, where the classical picture is a good approximation. We see a smooth transition between these two effects.

When $N_f = N_c + 1$, there is a dynamically generated superpotential

The stationary points of $W_{\mathrm{eff}}$ are at $BB' - \wedge^{N_c} M = 0$, $BM = 0$, $MB' = 0$; these are precisely the constraints that the classical configuration satisfies. However, the moduli space is interpreted differently: it is embedded into a larger space, and the constraints are satisfied only at stationary points. At the singularity $\langle M \rangle = 0$, the whole global symmetry group is unbroken, and B, B' are the new massless fields resolving the singularity. So we have a continuous transition between confinement (without chiral symmetry breaking) and the Higgs mechanism in the semiclassical regime.

When $N_c + 2 \le N_f \le (3/2)N_c$, the original theory, called the electric theory, is still strongly coupled in the infrared. Seiberg (1995) proposed that there is a dual, magnetic theory, which is infrared free. The two theories are different classically, but are equivalent at the quantum level. The dual theory is an N = 1 SU($\tilde{N}_c$) gauge theory with $\tilde{N}_c = N_f - N_c$, coupled to dual quarks $\tilde{Q}^i$, $\tilde{Q}'^{i'}$, where $1 \le i, i' \le N_f$ are flavor indices. In addition, the mesons $M_{ii'}$ become fundamental fields. They are not coupled to the SU($\tilde{N}_c$) gauge field but interact with the dual quarks through the superpotential

\[ W = \mu^{-1} M_{ii'} \tilde{Q}^i \tilde{Q}'^{i'} \]

The two theories have the same global symmetry and the same gauge-invariant operators. The dual quarks are fundamental in the magnetic theory but are solitonic excitations in the electric theory. At high energy, the electric theory is asymptotically free, while the magnetic theory is strongly coupled. At low energies, the converse is true. Therefore, reliable perturbative calculations can be performed by choosing an appropriate weakly coupled theory.

When $(3/2)N_c < N_f < 3N_c$, the theory has a nontrivial infrared fixed point. This is because up to two-loop,

\[ \beta(g) = -\frac{g^3}{16\pi^2} (3N_c - N_f) + \frac{g^5}{128\pi^4} \left( 2N_c N_f - 3N_c^2 - \frac{N_f}{N_c} \right) + o(g^7) \]

There is a solution $g_* > 0$ to $\beta(g) = 0$. We have $\beta(g) < 0$ when $0 < g < g_*$, and $\beta(g) > 0$ when $g > g_*$. In the infrared limit, the coupling constant flows to $g = g_*$, where we have a nontrivial, interacting superconformal theory in four dimensions. The conformal dimension becomes anomalous and is equal to 3/2 of the charge of the chiral $U(1)_R$; for
1 example, that of the meson 1 M is 3(Nf 
Weff ¼ ðB0 MB  det MÞ
2Nc 1 Nc )=Nf > 1 in this range.
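The fixed-point statement can be checked numerically. The following is a minimal sketch, assuming the two-loop coefficients are exactly those of the $\beta$-function quoted above; the function names are illustrative and not part of the article. Setting $\beta(g) = 0$ with $g \ne 0$ gives $g_*^2 = 8\pi^2 (3N_c - N_f) / (2N_c N_f - 3N_c^2 - N_f/N_c)$, which is only a trustworthy estimate when $g_*$ is small, that is, when $N_f$ is close to $3N_c$.

```python
import math


def beta_two_loop(g, nc, nf):
    """Two-loop beta function for N = 1 SU(nc) QCD with nf flavors, as quoted in the text."""
    one_loop = -(g ** 3) / (16 * math.pi ** 2) * (3 * nc - nf)
    two_loop = (g ** 5) / (128 * math.pi ** 4) * (2 * nc * nf - 3 * nc ** 2 - nf / nc)
    return one_loop + two_loop


def fixed_point_coupling(nc, nf):
    """Positive root g_* of the truncated two-loop beta function, if it exists."""
    b1 = 3 * nc - nf                           # one-loop coefficient
    b2 = 2 * nc * nf - 3 * nc ** 2 - nf / nc   # two-loop coefficient
    if b1 <= 0 or b2 <= 0:
        raise ValueError("no positive two-loop fixed point for these nc, nf")
    return math.sqrt(8 * math.pi ** 2 * b1 / b2)


def meson_dimension(nc, nf):
    """Conformal dimension 3(nf - nc)/nf of the meson at the fixed point."""
    return 3 * (nf - nc) / nf


if __name__ == "__main__":
    nc, nf = 3, 8  # illustrative values inside (3/2) nc < nf < 3 nc
    g_star = fixed_point_coupling(nc, nf)
    print("g_* =", g_star)
    print("beta below g_*:", beta_two_loop(0.5 * g_star, nc, nf))
    print("beta above g_*:", beta_two_loop(2.0 * g_star, nc, nf))
    print("meson dimension:", meson_dimension(nc, nf))
```

The sign pattern printed for couplings below and above $g_*$ is the one the text describes: the coupling grows toward $g_*$ from below and decreases toward it from above as the energy scale is lowered.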

Other Classical Gauge Groups

We now consider $N = 1$ supersymmetric gauge theory and QCD with gauge groups $Sp(N_c)$ and $SO(N_c)$. The $Sp(N_c)$ theories, studied by Intriligator and Pouliot (1995), are the simplest examples of the $N = 1$ theories. We take $2N_f$ chiral superfields $Q_i$ ($i = 1, \ldots, 2N_f$) in the fundamental representation $\mathbb{C}^{2N_c} \cong \mathbb{H}^{N_c}$ of $Sp(N_c)$. The number of copies must be even so that the quantum theory is free from global gauge anomaly. The gauge-invariant quantities are the mesons $M_{ij} = Q^a_i Q^b_j \omega_{ab}$, where $\omega$ is the symplectic form on $\mathbb{C}^{2N_c}$, subject to a constraint $\epsilon^{i_1, \ldots, i_{2N_c+2}} M_{i_1 i_2} \cdots M_{i_{2N_c+1} i_{2N_c+2}} = 0$. Using the decomposition $u(2N_c) = sp(N_c) \oplus \sqrt{-1}\,\{\mathbb{H}\text{-self-adjoint matrices}\}$, the moment map $D(Q)$ is the projection of $\sqrt{-1}\,QQ^\dagger$ on $sp(N_c)$. So $D(Q) = 0$ implies

$$Q \sim \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_{\min\{N_c, N_f\}} \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

where $a_k \ge 0$. At a generic point of the classical moduli space, the gauge group is broken to $Sp(N_c - N_f)$ if $N_c > N_f$; it is completely broken if $N_c \le N_f$. Since $b_0 = 3(N_c + 1) - N_f$, the quantum theory is infrared free if $N_f \ge 3(N_c + 1)$. (When $b_0 = 0$, the two-loop $\beta$-function is negative.) When $N_f \le N_c$, there is a dynamically generated superpotential

$$W_{\mathrm{eff}} = (N_c + 1 - N_f) \left( \frac{2^{N_c - 1} \Lambda^{3(N_c+1) - N_f}}{\operatorname{Pf} M} \right)^{1/(N_c + 1 - N_f)}$$

pushing the vacuum to infinity.

When $N_f = N_c + 1$, the classical moduli space $\operatorname{Pf} M = 0$ has singularities. The quantum moduli space is $\operatorname{Pf} M = 2^{N_c - 1} \Lambda^{2(N_c+1)}$. The singularity is smoothed out and there are no light fields other than the mesons $M$. When $N_f = N_c + 2$, all components of $M$ become dynamical in the low-energy theory, and there is a superpotential

$$W_{\mathrm{eff}} = -\frac{\operatorname{Pf} M}{2^{N_c - 1} \Lambda^{2N_c + 1}}$$

At the most singular point $\langle M \rangle = 0$, the global symmetry is unbroken, and all the light fields in $M$ become massless. In both cases, there is a transition between confinement and Higgs mechanism.

When $N_c + 3 \le N_f \le (3/2)(N_c + 1)$, there is a dual, magnetic theory which is free in the infrared. The dual theory has $2N_f$ quarks $\tilde Q^i$ in the fundamental representation of $Sp(\tilde N_c)$, where $\tilde N_c = N_f - N_c - 2$. In addition, the mesons $M_{ij}$ become elementary and couple to $\tilde Q$ through a superpotential $W = (2\mu)^{-1} M_{ij} \tilde Q^{ia} \tilde Q^{jb} \tilde\omega_{ab}$, where $\tilde\omega$ is the symplectic form on $\mathbb{C}^{2\tilde N_c}$. When $(3/2)(N_c + 1) < N_f < 3(N_c + 1)$, the theory flows to an interacting superconformal field theory in the infrared.

Theories with the $SO(N_c)$ gauge group were studied by Seiberg (1995) and by Intriligator and Seiberg (1995). Since the fundamental representation is real, there is no constraint on the number $N_f$ of quarks $Q^i$ ($1 \le i \le N_f$). The gauge invariants are the mesons $M^{ij} = Q^i_a Q^j_b \delta^{ab}$ and, if $N_f \ge N_c$, the baryons $B_{i_{N_c+1} \cdots i_{N_f}} = \epsilon_{i_1 \cdots i_{N_f}} Q^{i_1} \cdots Q^{i_{N_c}} / N_c!$. They satisfy $\operatorname{rank} M \le N_c$ and $BB = \wedge^{N_c} M$. Using the decomposition $u(N_c) = so(N_c) \oplus \sqrt{-1}\,\{\mathbb{R}\text{-self-adjoint matrices}\}$, the moment map $D(Q)$ is the projection of $\sqrt{-1}\,QQ^\dagger$ on $so(N_c)$. If $D(Q) = 0$, then up to gauge and global symmetries, $Q$ is of the form

$$Q \sim \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_r \end{pmatrix}$$

where $a_1, \ldots, a_r > 0$ if $r = \operatorname{rank} Q < N_c$, and $a_1, \ldots, a_{N_c - 1} > 0$ and $a_{N_c} \ne 0$ if $r = N_c$. Generically, the gauge group is broken to $SO(N_c - N_f)$ if $N_c \ge N_f + 2$ and is totally broken if $N_c < N_f + 2$.

We have $b_0 = 3(N_c - 2) - N_f$ if $N_c \ge 5$. For $N_c = 4$, the group is $(SU(2) \times SU(2))/\mathbb{Z}_2$ and $b_0 = 6 - N_f$ for each $SU(2)$ factor. If $N_c = 3$, the group is $SU(2)/\mathbb{Z}_2$ and $b_0 = 6 - 2N_f$. The theory is asymptotically free if $N_f < 3(N_c - 2)$ and infrared free if $N_f \ge 3(N_c - 2)$.

When $N_f \le N_c - 5$, there is a dynamically generated superpotential

$$W_{\mathrm{eff}} = \frac{1}{2}\,(N_c - 2 - N_f) \left( \frac{16\,\Lambda^{3(N_c - 2) - N_f}}{\det M} \right)^{1/(N_c - 2 - N_f)}$$

lifting the classical vacuum degeneracy. The coefficient is fixed by mass deformation and by matching the $SU(4)$ theory when $N_c = 6$.

When $N_f = N_c - 4$, the unbroken gauge group is $SO(4) = (SU(2) \times SU(2))/\mathbb{Z}_2$ at a generic point of the moduli space. The superpotential of the original theory is

$$W_{\mathrm{eff}} = 2(\epsilon_+ + \epsilon_-) \left( \frac{\Lambda^{2(N_c - 1)}}{\det M} \right)^{1/2}$$

where the choices $\epsilon_+, \epsilon_- = \pm 1$ correspond to the fact that each of the $SU(2)$ theories has two vacua. There are two physically inequivalent branches: $\epsilon_+ = \epsilon_-$ and $\epsilon_+ = -\epsilon_-$. For $\epsilon_+ = \epsilon_-$, the superpotential pushes the vacuum to infinity. For $\epsilon_+ = -\epsilon_-$, $W_{\mathrm{eff}} = 0$. In the quantum theory, the singularity is smoothed out and

all the massless fermions are in M, even at the origin gC =GC = tC =W, where W is the Weyl group. At a
of the moduli space. Hence the quarks are confined. generic  2 tC , the gauge group is broken to T by
When Nf = Nc  3, the unbroken gauge group is the Higgs mechanism. Classically, the massless
SO(3) and the theory has two branches with degrees of freedom are excitations of  and
components of the gauge field in t. So the low-
2Nc 3 energy physics can be described by these massless
Weff ¼ 4ð1 þ
Þ
det M fields. However, the moduli space is singular when 
where
= 1. For
= 1, the quantum theory has no is on the walls of the Weyl chambers. At these
vacuum. For
= 1, Weff = 0, but there are addi- values, the unbroken gauge group is larger and there
~ i coupling to M via the super-
tional light fields Q are extra massless fields that resolve the
potential W
(2)1 Mij Q~ iQ
~ j near M = 0. singularities.
When Nf = Nc  2, the low-energy theory is related Since b0 = 2h > 0, the quantum theory is asymp-
to the N = 2 gauge theory and will be addressed in the totically free but strongly interacting at low energy.
subsection ‘‘Seiberg–Witten’s low-energy solution.’’ It can be shown that N = 1 supersymmetry already
When Nf Nc  1, we define a dual, magnetic forbids a dynamically generated superpotential on
theory whose gauge group is SO(N ~ c ), where tC =W. Therefore, the vacuum degeneracy is not
N~ c = Nf  Nc þ 4. There are Nf dual quarks Q ~ i (1 lifted and the quantum moduli space is still a
i Nf ) in the fundamental representation. This continuum. However, there are corrections to the
theory is infrared free if Nf (3=2)(Nc  2). In the part of classical moduli space where strong interac-
effective theory, the mesons Mij become fundamen- tions occur. The quantum theory has a dynamically
tal and couple with the dual quarks through a generated mass scale . We pick the renormalization
superpotential W = (2)1 Mij Q ~ iQ
~ j if Nf Nc ; there scale  to be jj, the typical energy scale where
is an additional term det M=642Nc 5 if Nf = Nc  1. spontaneous symmetry breaking occurs. Far away
When (3=2)(Nc  2) < Nf < 3(Nc  2), the theory from the origin, that is, when jj  jj, the theory is
flows to an interacting superconformal field theory in weakly interacting and the classical description of
the infrared. the moduli space is a good approximation. How-
ever, when jj is comparable to jj, the classical
language and perturbation methods fail due to
N = 2 Gauge Theory and Seiberg–Witten strong interaction. At  = 0, the full gauge symmetry
Duality is restored classically. But since the theory becomes
strongly interacting at low energy, it cannot be the
N = 2 Yang–Mills Theory
low-energy solution of the original theory.
Pure N = 2 supersymmetric gauge theory is a special The classical U(1)R symmetry extends to U(2)R ,
case of N = 1 QCD when R = gC is the (complex- mixing and . The U(1)R subgroup in U(2)R is
ified) adjoint representation
pffiffiffiffiffiffi of G. The moment map anomalous except for a subgroup Z4h . So we have a
is D() = (1=2 1)[, ]  2 g ffi g ( 2 g). Since the global SU(2)R Z2 Z4h symmetry at the quantum
fermionic fields and are sections of the same level. This is consistent with a continuous moduli
bundle, there is a second set of supersymmetry space of vacua, if the group SU(2)R is to act
transformations by interchanging the roles of and nontrivially. Also, the space is not a single orbit of
. This makes the theory N = 2 supersymmetric. the global symmetry group. pffiffiffiffiThe generator of Z4h


The classical action is acts on tC by a phase e 1=h . The group Z4h is
spontaneously broken to the subgroup which
SN¼2
SYM ½A; ; ;  ¼ SYM ½A acts trivially on tC =W.
Z pffiffiffiffiffiffiffi
1 We study the general form of low-energy effective
þ 2 d4 x 1ðh ; r
= i
g Lagrangian that is consistent with N = 2 super-
1 symmetry. We assume that the quantum effect does
= iÞ þ jrj2
þ h ; r not modify the topology of the moduli space tC =W,
pffiffiffiffiffiffiffi 2
þ 1ðh;  ½ ; i þ h; ½ ;
 iÞ though it may alter the singularity and its nature.
Suppose U is the quantum moduli. At a generic
1  2
 j½; j point in U, the residual gauge group is T. In the
8 N = 1 language, the theory is a supersymmetric
The energy reaches the minimum when  takes a gauged sigma model with target space U. It contains
constant value  2 gC that can be conjugated by G N = 1 vector multiplets W I and chiral multiplets I ,
to the Cartan subalgebra tC . (t is the Lie algebra of where 1 I r, r = dim T being the rank of G.
the maximal torus T.) The classical moduli space is N = 1 supersymmetry requires that U is Kähler, with

possible singularities where the effective theory proposed that this is so for the low-energy effective
breaks down. N = 2 supersymmetry requires further theory of the N = 2 gauge theory. An SL(2, Z)
that U is special Kähler, that is, there is a flat, transformation maps one description of the low-
torsion-free connection r on TU such that the energy theory to another, exchanging electricity and
Kähler form ! is parallel and such that dr J = 0, magnetism. It is however not an exact duality of the
where the complex structure J is viewed as a 1-form full SU(2) theory. Rather, duality is in the ambiguity
valued in TU. See, for example, Freed (1999). of the choice of the low-energy description. More
Locally, there is a holomorphic prepotential F and precisely,  is a section of a flat SL(2, Z) bundle over
special coordinates {zI }. Let ~zI = @F =@zI be the dual U. Thus,  is multivalued and exists as a function in
coordinates and let IJ = @ 2 F =@zI @zJ = @~zI =@zp J
.ffiffiffiffiffiffi
Then local charts only. So we must use different Lagran-
K = Im(~zI zI ) is a Kähler potential and ! = ( 1=2) gians in different regions of the u-plane. Around the
Im(IJ )dzI ^ dzJ is the Kähler form. The effective singularities where  is not defined, nontrivial
action is monodromy can appear.
Z Away from infinity, the electric theory is strongly
1
N¼2
Seff ½W;  ¼ Im d4 x d2  12IJ ðÞðW I ; W J Þ interacting but the magnetic theory is infrared free.
4 The dual field is ~a = dF (a)=da, and eff (u) = d~ a=da.
Z 
4 2 2
þ d x d d KðÞ The group SL(2, Z) is generated by
   
1 0 0 1
Note that both the coupling constants IJ and the P¼ ; S¼
0 1 1 0
metric ImIJ on U are determined by a holomorphic  
function F , which is the hallmark of N = 2 1 1

supersymmetry. 0 1
In the bare theory with abelian gauge group T, the  
To see its action on ~aa , we use the central
action is given by choosing F 0 () = (1=2)IJ hI , J i,
extension of the N = 2 super-Poincaré algebra. In
where the IJ (and hence the metric ImIJ ) are
the classical theory, the central charge is Z = (ne þ
constants. Due to one-loop and instanton effects,
nm )a from the boundary terms at infinity. As the
F is no longer quadratic in the effective theory.
electric–magnetic duality transformation S inter-
Since  varies on U, it cannot be holomorphic
changes ne and nm , we have for any 2 SL(2, Z),
(except at a few singular points), single valued, and
: (nm , ne ) 7! (nm , ne ) 1 . When nm = 0, the classical
having a positive-definite imaginary part. The
formula Z = ne a is valid. Invariance of Z under
solution to this apparent contradiction is that each
SL(2, Z) requires that Z = nm ~a þ ne a at the  quan-
set of special coordinates and the expression of F is
tum level and that SL(2, Z) acts on ~aa homo-
valid only in part of U. Solving the N = 2 gauge
geneously as a column vector.
theory at low energy means understanding the
When u = (1=2)a2 is large, perturbation is reliable.
singularity of U in the strong coupling regime and
The
pffiffiffiffiffiffi classical
pffiffiffiffiffiffi and one-loop results are a(u)

obtaining the explicit form of F or IJ in various


2u, ~a
( 1=)a log a2 . As u goes around infinity,
regions of the moduli space.
the fields transforms as a 7! a, ~a 7! ~a þ 2a. The
monodromy is M1 = PT 2 . The mass M of a
Seiberg–Witten’s Low-Energy Solution
monopole state is bounded by M2 = P P jZj2 ,
We consider N = 2 gauge theory with G = SU(2). which is precisely the Bogomol’nyi bound. Now as a
The Cartan subalgebra is tffi C; each
 a 2 C deter- consequence of the N = 2 supersymmetry, it receives
mines an element  = (1=2) 0a a 0
in t. The Weyl no quantum corrections as long as supersymmetry is
group W ffi Z2 acts on C by a 7!  a. The moduli not broken at the quantum level. The states that
space of classical vacua is the u-plane C=Z2 saturate the bound are the BPS states. The BPS
parametrized by u = tr 2 = (1=2)a2 . When u 6¼ 0, spectrum at u 2 U is a subset of H1 (Eu , Z) ffi Z2
the gauge group is broken to U(1). p The
ffiffiffiffiffiffi generator containing the pairs (nm , ne ) realized by the dyon
of Z4h = Z8  U(1)R acts as a 7! 1a, u 7! u. charges. Near infinity, the condition is that either
The Z8 symmetry is broken to Z4 ; the quotient ne = 1, nm = 0 (for W particles) or nm = 1 (for
Z2 = Z8 =Z4 acts on the u-plane by u 7!  u. monopoles or dyons). This spectrum is invariant
Abelian gauge theory and N = 4 supersymmetric under the monodromy M1 .
gauge theory exhibit exact electric–magnetic duality The nontrivial holonomy at infinity implies the
in the sense that the quantum theories are identical existence of at least one singularity at a finite value
if the coupling constant  undergoes an SL(2, Z) u = u0 , where extra particles become massless.
transformation. Seiberg and Witten (1994a, b) Seiberg and Witten (1994a, b) propose that these

particles are collective excitations in the perturbative monodromy is (T 2 S)T 2 (T 2 S)1 . A pair of dyons E of
regime. Suppose along a path connecting u0 and charges 1 become massless. The effective action is
some base point near infinity, a monopole of charges Weff
(u  162Nc 4 )Eþ E .
(1, ne ) = (0, 1)(T ne S1 )1 becomes massless at u0 . Topological gauge theory is a twisted version of
Then by the renormalization group analysis N = 2 Yang–Mills theory in which the observables
and duality, the monodromy at u0 is Mu0 = (T ne S1 ) at high energy are the Donaldson invariants. The
T 2 (T ne S1 )1 . It turns out that there are two work of Seiberg and Witten (1994a, b) yields new
singularities u = 2 with monodromies M2 = insight to it and has a tremendous impact on the
ST 2 S1 and M2 = (TS)T 2 (TS)1 . The particles that geometry of 4-manifolds. See Witten (1994) for the
become massless at 2 are of charges (nm , ne ) = (1, 0) initial steps.
and (1, 1), respectively. The only BPS states in the After the work of Seiberg and Witten (1994a, b),
strong coupling regime are those which become there has been much progress on theories with other
massless at the singularities; the others decay as u gauge groups. If the gauge group is a compact Lie
deforms towards strong interaction. group of rank r, the u-plane is replaced by tC =W;
The monodromies M2 , M1 (or any two of the singularities are modified by quantum effects.
them) generate the subgroup (2). The family of The duality group is Sp(2r, Z) or its subgroup of
elliptic curves with these monodromies can be finite index, acting on the coupling matrix  = (IJ )
identified with y2 = (x  2 )(x þ 2 )(x  u) called by fractional linear transformations. For example, for
the Seiberg–Witten curve. The singularities are at G = SU(Nc ), the moduli space is parametrized by
u = 2 and u = 1, where the curve degenerates. gauge P invariants u2 , . . . , uNc defined by det (xI  ) =
Let xNc  N i = 2 ui x
c Nc i
= PNc (x, ui ). Classically, the sin-
pffiffiffi gular locus is a simple singularity of type ANc 1 . At
2 y dx the quantum level, the singularity consists of two
¼ copies of such locus shifted by  n in the un
2 x2  4
direction. The monodromies correspond to a family
be the Seiberg–Witten differential (of second kind on of hyperelliptic curves y2 = PNc (x, ui )2  2Nc of
the total space E). Then in a suitable
R basis
R ( , ) of genus Nc  1. The Seiberg–Witten differential is
H1 (Eu =U, Z), we have a = , ~ a =  . At a pffiffiffi
singularity, if  = nm  þ ne is a vanishing cycle, 2 @PNc ðx; ui Þ x dx
¼ pffiffiffiffiffiffiffi þ @ð  Þ
then the dyon of charges (nm , ne ) becomes massless.  1 @x y
This Ris because its central charge is Z = nm ~a þ The Nc  1 independent eigenvalues ai of  and
ne a =  . The monodromy at a singularity where  their duals ~ai = @F =@ai are the periods of along
is a vanishing cycle is given by the Picard–Lefshetz the 2Nc  2 homology cycles in the curve. For more
formula M: 7!  2(  ). At u = 2 , the van- details, the reader is referred to Klemm et al. (1995)
ishing cycles are  and   , respectively. and Argyes and Faraggi (1995).
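The monodromy statements for the pure $SU(2)$ theory can be verified with a few lines of integer matrix arithmetic. The sketch below assumes the conventions of Seiberg and Witten (1994a), namely $P = -I$, $S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $M_\infty = PT^{-2}$, $M_{\Lambda^2} = ST^2S^{-1}$ and $M_{-\Lambda^2} = (TS)T^2(TS)^{-1}$; the signs and the ordering of the product are assumptions, since they depend on the choice of base point and paths. The helper names are illustrative only.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(m):
    """Inverse of a 2x2 integer matrix of determinant 1."""
    (a, b), (c, d) = m
    assert a * d - b * c == 1, "matrix must lie in SL(2, Z)"
    return ((d, -b), (-c, a))


def conjugate(g, m):
    """Return g m g^{-1}."""
    return matmul(matmul(g, m), inverse(g))


def act_on_charges(charges, m):
    """Row vector (n_m, n_e) multiplied on the right by m^{-1}."""
    inv = inverse(m)
    return tuple(sum(charges[k] * inv[k][j] for k in range(2)) for j in range(2))


def in_gamma_2(m):
    """True if m is congruent to the identity modulo 2."""
    return all((m[i][j] - (1 if i == j else 0)) % 2 == 0
               for i in range(2) for j in range(2))


# Assumed generator conventions (see the lead-in).
P = ((-1, 0), (0, -1))
S = ((0, 1), (-1, 0))
T = ((1, 1), (0, 1))
T2 = matmul(T, T)

M_INF = matmul(P, inverse(T2))          # monodromy at infinity
M_PLUS = conjugate(S, T2)               # singularity where the monopole is massless
M_MINUS = conjugate(matmul(T, S), T2)   # singularity where the dyon is massless

if __name__ == "__main__":
    print("M_inf   =", M_INF)
    print("M_plus  =", M_PLUS)
    print("M_minus =", M_MINUS)
    print("product equals M_inf:", matmul(M_PLUS, M_MINUS) == M_INF)
    print("all in Gamma(2):", all(in_gamma_2(m) for m in (M_INF, M_PLUS, M_MINUS)))
    print("monopole (1, 0) under M_plus:", act_on_charges((1, 0), M_PLUS))
    print("dyon (1, -1) under M_minus:", act_on_charges((1, -1), M_MINUS))
```

Each finite monodromy fixes the charge vector of the particle that becomes massless at that singularity, which is the consistency condition behind the Picard–Lefschetz formula.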
We return to the N = 1 SO(Nc ) gauge theory with
Nf = Nc  2. At a generic point in the moduli space,
N = 2 QCD
the gauge group is broken to SO(2), which is
abelian. Much of the above discussion applies to N = 2 supersymmetric QCD is N = 2 Yang–Mills
this case. By N = 1 supersymmetry, the effective theory coupled to N = 2 matter. The latter consists
coupling eff is holomorphic in M but is not single of N = 1 superfields Q that form a quarternionic
valued. In fact, eff depends on u = det M, which is representation R of the gauge group G. The space R
invariant under the (anomalypffiffiffiffi free) SU(Nf ) symme- has a G-invariant hyper-Kähler structure. The
try. For large u, we have e2 1eff = 4Nc 8 =u2 and hyper-Kähler moment map H : R ! g Im H con-
the monodromy around infinity is M1 = PT 2 . sists of a real moment map R : R ! g for the
On the other hand, a large expectation value Kähler structure and a complex moment map
of M of rank Nc  3 breaks the gauge group to C : R ! (g )C for the holomorphic symplectic
SO(3) and the theory is the N = 2 theory discussed structure. As an N = 1 theory, the matter superfields
earlier. Using these facts, Intriligator and Seiberg R  gC with a D-term D(Q, ) =
are valued inpffiffiffiffiffiffi
(1995) identified the family of elliptic curves as R (Q) þ (1=2 
pffiffiffi 1)[, ] and a superpotential
y2 = x(x  162Nc 4 )(x  u). There are two singula- W(Q, ) = 2hC (Q), i þ m(Q), where the mass
rities with inequivalent physics. At u = 0, the mono- term m is a G-invariant quadratic form on R. The
dromy is ST 2 S1 . A pair of monopoles Q ~  becomes classical moduli space of vacua has two branches.
massless. They couple with M through the super- On the Coulomb branch where Q = 0 and  6¼ 0,
potential W
(2)1 Mij Q~ iQ
~ j . At u = 162Nc 4 , the the unbroken gauge group is abelian and the

photons are massless. If Q 6¼ 0 exists in the flat multiply ne by 2 so that it has integer values on Qi
directions, the gauge group is broken according to and Q ~ i , and divide a by 2 to preserve the formula
the value of Q; these are the Higgs branches. If Z = nm ~a þ ne a. The monodromies around the singu-
m = 0, the moduli space of classical vacua is the larities become M2 = STS1 , M2 = (T 2 S)T(T 2 S)1 ,
hyper-Kähler quotient 1 H (0)=G. The branches of M1 = PT 4 . They generate the subgroup 0 (4) of
two types touch at the origin, where the full gauge SL(2, Z). The coupling constant is
group is restored, and at other subvarieties in R. The pffiffiffiffiffiffiffi
global symmetry is the subgroup of U(R) that  8 1
¼ þ
commutes with the G-action on R and preserves  g2
m; it contains U(2)R .
Quantum mechanically, such a theory is free The Seiberg–Witten curve is y2 = x3  ux2 þ
from local gauge anomalies. Consistency under large (1=4)40 x, related to the earlier one y2 = (x  u)(x2 
gauge transformations puts a torsion condition on R, 40 ) by an isogeny. Here and below, Nf is the
such as (R) = 0(mod 2). Since b0 = 2h   (1=2)(R), dynamically generated scale.
the theory is asymptotically free if (R) < 4h.  If For Nf > 0, we consider the case with zero bare
 the quantum theory is scale invariant up masses. The simplest BPS-saturated
(R) = 4h, pffiffiffi states are the
to one-loop (and hence to all loops), and is expected elementary quarks with mass 2jaj, which form
to be so nonperturbatively. If (R) > 4h,  the quan- the vector representation of SO(2Nf ). In addition, the
tum theory may not be defined but it can be the low- quarks have fermion zero modes in the monopole
energy solution of another asymptotically free theory. background. When nm = 1, each SU(2) doublet of
Due to the axial anomaly, the U(2)R global symmetry quarks has one zero mode. With Nf hypermultiplet,
reduces to the subgroup SU(2)R Z2 Z4h(R)  . The there are 2Nf zero modes in the vector representation
metric on the Coulomb branch can be corrected by of SO(2Nf ). Upon quantization, the quantum states
quantum effects, but those on the Higgs branches do are in the spinor representation. So the flavor
not change because of the uniqueness of the hyper- symmetry is really Spin(2Nf ). The spectrum may
Kähler metric. In the quantum theory, the Higgs also include states with nm > 1. For Nf = 2, 3, 4, the
branches still touch the Coulomb branch, but the center Z(Spin(2Nf )) are Z2  Z2 , Z4 , Z2  Z2 ,
photons of the Coulomb branch are the only massless whose generators act onpstates of charges (nm , ne )
ffiffiffiffiffiffinm þ2n
by ((1)ne þnm , (1)ne ), 1 , ((1)nm , (1)ne ),
e
gauge bosons at the point where they meet.
When G = SU(Nc ) we take Nf quarks respectively.
Qi (i = 1, . . . , Nf ) in the fundamental representation Suppose at a singularity on the u-plane, the low-
and Nf antiquarks Q ~ i (i = 1, . . . , Nf ) in the complex- energy theory is QED with k hypermultiplets. Let mi
conjugate representation. The moment map is the be the bare mass and Si , the U(1) charge of the ith
same aspin ffiffiffi N = 1i QCD whereas the superpotential hypermultiplet. Withpffiffiffi the expectation value of , the
is W = 2Q ~ i Q þ P mi Q ~ i Qi . Consider the case actual masses are j 2a þ mi j(1 i k). As the states
i
G = SU(2) as in Seiberg and Witten (1994b). Since form a small representation of the N = 2 algebra, the
b0 = 4  Nf , the asymptotically free theories have pffiffiffi charge is modified as Z = nm ~a þ ne a þ S 
central
Nf 3 whereas the Nf = 4 theory is scale invariant. m= 2, where m = (m1 , . . . , mk ) and S = (S1 , . . . , Sk ).
As the representations on Qi and Q ~ i are isomorphic, Under a duality transformation M 2 SL(2, Z), the
pffiffiffi
the classical global symmetry is O(2Nf )  U(2)R column vector (m=  2 , a
~
 , a) is multiplied by a matrix
when all mi = 0. The appearance of the even number of the form M ^ = Ik 0 . (For example, if M = T, M ^
 M
of fundamental representations is necessary for the can be derived by one-loop analysis.) So the row
consistency of the theory at the quantum level. The vector W = (S, nm , ne ) transforms as W 7! W M ^ 1 . The
U(1)R symmetry is anomalous if Nf 6¼ 4. When Nf > 0, transformation on (nm , ne ) is not homogeneous when
SO(2Nf ) is anomaly free, whereas O(2Nf )=SO(2Nf ) = there are hypermultiplets. This phenomenon persists
Z2 is anomalous. The anomaly free subgroup of Z2  even when all the bare masses mi are zero.
U(1)R is Z4(4Nf ) . Its Z2 subgroup acts in the same way When Nf = 1, the global symmetry of the u-plane
as Z2  Z(SO(2Nf )). A nonzero expectation value of is Z3 . There are three singularities related by this
u = tr 2 further breaks the symmetry to Z4 . The symmetry, where monopoles with charges (nm , ne ) =
quotient group that acts effectively on the u-plane (the (1, 0), (1, 1), and (1, 2) become massless. The low-
Coulomb branch) is Z4Nf if Nf > 0 and Z2 if Nf = 0. energy theory at each singularity is QED with a
When Nf = 4, the U(1)R symmetry is anomaly free but single light hypermultiplet. Besides the photon, no
Z2 = O(8)=SO(8) is still anomalous. other flat directions exist. This is consistent with the
The Nf = 0 theory is the N = 2 pure gauge theory. absence of Higgs branch in the original theory.
In order to compare it to the Nf > 0 theories, we The monodromies at the singularities are STS1 ,

$(TS)T(TS)^{-1}$, $(T^2S)T(T^2S)^{-1}$, respectively, and the corresponding Seiberg–Witten family of curves is $y^2 = x^2(x - u) - (1/64)\Lambda_1^6$. The Seiberg–Witten differential is

$$\lambda = \frac{\sqrt{2}}{4\pi}\, \frac{y\, dx}{x^2}$$

When $N_f = 2$, there are two singularities related by the global symmetry $\mathbb{Z}_2$ of the $u$-plane. The massless states at one singularity have $(n_m, n_e) = (1, 0)$ and form a spinor representation of $SO(4)$ while those at the other have $(n_m, n_e) = (1, 1)$ and form the other spinor representation. The low-energy theory at each singularity is QED with two light hypermultiplets. There are additional flat directions along which $SO(4) \times SU(2)_R$ is broken. They form the two Higgs branches that touch the $u$-plane at the two singularities rather than at the origin. The metric and pattern of symmetry breaking are the same as classically. The monodromies are $ST^2S^{-1}$, $(TS)T^2(TS)^{-1}$. The Seiberg–Witten curve is $y^2 = (x^2 - (1/64)\Lambda_2^4)(x - u)$ and the differential is

$$\lambda = \frac{\sqrt{2}}{4\pi}\, \frac{y\, dx}{x^2 - \Lambda_2^4/64}$$

When $N_f = 3$, the $u$-plane has no global symmetry. There are two singularities. At one of them, a single monopole bound state with $(n_m, n_e) = (2, 1)$ becomes massless and there are no other light particles. At the other singularity, the massless states have $(n_m, n_e) = (1, 0)$ and form a (four-dimensional) spinor representation of $SO(6)$ with a definite chirality. Thus, the low-energy theory is QED with four light hypermultiplets. Along the flat directions, the $SO(6) \times SU(2)_R$ symmetry is further broken. This corresponds to a single Higgs branch touching the $u$-plane at the singularity. Again, the metric on the Higgs branch is not modified by quantum effects. The monodromies at the two singularities are $(ST^2S)T(ST^2S)^{-1}$ and $ST^4S^{-1}$, respectively. The Seiberg–Witten curve is $y^2 = x^2(x - u) - (1/64)\Lambda_3^2(x - u)^2$ and the differential is

$$\lambda = \frac{\sqrt{2}}{\pi\Lambda_3}\, \log\!\left[ y + \sqrt{-1}\, \frac{\Lambda_3}{8} \left( x - u - \frac{32}{\Lambda_3^2}\, x^2 \right) \right] dx$$

When $N_f = 4$, the theory is characterized by the classical coupling constant $\tau$, and there are no corrections to $a = (1/2)\sqrt{2u}$, $\tilde a = \tau a$. There is only one singularity at $u = 0$, where the monodromy is $P$. Seiberg and Witten (1994b) postulate that the full quantum theory is $SL(2, \mathbb{Z})$ invariant, just like the $N = 4$ pure gauge theory. The elementary hypermultiplet has $(n_m, n_e) = (0, 1)$ and forms the vector representation $v$ of $SO(8)$. Fermion zero modes give rise to hypermultiplets with $(n_m, n_e) = (1, 0), (1, 1)$ that transform under the spinor representations $s$, $c$ of $Spin(8)$. $SL(2, \mathbb{Z})$ acts on the spectrum via a homomorphism onto the outer-automorphism group $S_3$ of $Spin(8)$, which then permutes $v$, $s$, and $c$. So duality is mixed in an interesting way with the $SO(8)$ triality. In $v$, $s$, and $c$, the center $\mathbb{Z}_2 \times \mathbb{Z}_2$ acts as $((-1)^{n_m}, (-1)^{n_e}) = (1, -1), (-1, 1), (-1, -1)$, respectively. The full $SL(2, \mathbb{Z})$ invariance predicts the existence of multimonopole bound states: for every pair of relatively prime integers $(p, q)$, there are eight states with $(n_m, n_e) = (p, q)$ that form a representation of $Spin(8)$ on which the center acts as $((-1)^p, (-1)^q)$.

Solutions when the bare masses are nonzero are also obtained by Seiberg and Witten (1994b). The masses can be deformed to relate theories with different values of $N_f$. $N = 2$ QCD with a general classical gauge group has also been studied. By adding to these theories a mass term $m \operatorname{tr} \Phi^2$ that explicitly breaks the supersymmetry to $N = 1$, the dualities of Seiberg can be recovered. For $SU(N_c)$, $SO(N_c)$ and $Sp(2N_c)$ gauge groups, see Hanany and Oz (1995), Argyres et al. (1996), Argyres et al. (1997) and references therein.

See also: Anomalies; Brane Construction of Gauge Theories; Donaldson–Witten Theory; Duality in Topological Quantum Field Theory; Effective Field Theories; Electric–Magnetic Duality; Floer Homology; Gauge Theories from Strings; Gauge Theory: Mathematical Applications; Nonperturbative and Topological Aspects of Gauge Theory; Quantum Chromodynamics; Topological Quantum Field Theory: Overview; Supersymmetric Particle Models.

Further Reading

Affleck I, Dine M, and Seiberg N (1984) Dynamical supersymmetry breaking in supersymmetric QCD. Nuclear Physics B 241: 493–534.
Argyres PC and Faraggi AE (1995) Vacuum structure and spectrum of N = 2 supersymmetric SU(n) gauge theory. Physical Review Letters 74: 3931–3934.
Argyres PC, Plesser MR, and Seiberg N (1996) The moduli space of vacua of N = 2 SUSY QCD and duality in N = 1 SUSY QCD. Nuclear Physics B 471: 159–194.
Argyres PC, Plesser MR, and Shapere AD (1997) N = 2 moduli spaces and N = 1 dualities for SO(n_c) and USp(2n_c) super-QCD. Nuclear Physics B 483: 172–186.
Freed DS (1999) Special Kähler manifolds. Communications in Mathematical Physics 203: 31–52.
Hanany A and Oz Y (1995) On the quantum moduli space of vacua of N = 2 supersymmetric SU(N_c) gauge theories. Nuclear Physics B 452: 283–312.
Klemm A, Lerche W, Yankielowicz S, and Theisen S (1995) Simple singularities and N = 2 supersymmetric Yang–Mills theory. Physics Letters B 344: 169–175.

Intriligator K and Pouliot P (1995) Exact superpotentials, quantum vacua and duality in supersymmetric SP(N_c) gauge theories. Physics Letters B 353: 471–476.
Intriligator K and Seiberg N (1995) Duality, monopoles, dyons, confinement and oblique confinement in supersymmetric SO(N_c) gauge theories. Nuclear Physics B 444: 125–160.
Seiberg N (1994) Exact results on the space of vacua of four-dimensional SUSY gauge theories. Physical Review D 49: 6857–6863.
Seiberg N (1995) Electric–magnetic duality in supersymmetric non-Abelian gauge theories. Nuclear Physics B 435: 129–146.
Seiberg N and Witten E (1994a) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nuclear Physics B 426: 19–52.
Seiberg N and Witten E (1994b) Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nuclear Physics B 431: 484–550.
Witten E (1994) Monopoles and four-manifolds. Mathematical Research Letters 1: 769–796.

Semiclassical Approximation see Stationary Phase Approximation; Normal Forms and Semiclassical Approximation

Semiclassical Spectra and Closed Orbits


Y Colin de Verdière, Université de Grenoble 1, Saint-Martin d'Hères, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The purpose of this article is to describe the so-called "semiclassical trace formula" (SCTF), which relates the "spectrum" of a semiclassical Hamiltonian to the "periods of closed orbits" of its classical limit. The SCTF expresses the asymptotic behavior as $\hbar \to 0$ ($\hbar = h/2\pi$) of the regularized density of states as a sum of oscillatory contributions associated to the closed orbits of the classical limit.

We will mainly present the case of the Schrödinger operator on a Riemannian manifold, which contains the purely Riemannian case.

We start with a section about the history of the subject. We then give a statement of the results and a heuristic proof using Feynman integrals. This proof can be transformed into a mathematical proof which we will not give here. After that we describe some applications of the SCTF.

About the History

The SCTF has several origins: on one side, the Selberg trace formula (1956) is an exact summation formula concerning the case of locally symmetric spaces; this formula was interpreted by H Huber as a formula relating eigenvalues of the Laplace operator and lengths of closed geodesics (also called the "lengths spectrum") on a closed surface of curvature $-1$.

On the other side, around 1970, two groups of physicists independently developed asymptotic trace formulas:

• M Gutzwiller for the Schrödinger operator, using the quasiclassical approximation of the Green function (the "van Vleck's formula"); it is interesting to note that the word "trace formula" is not written, but Gutzwiller instead speaks of a new "quantization method" (the old one being "Einstein–Brillouin–Keller (EBK)" or "Bohr–Sommerfeld rules").
• R Balian and C Bloch, for the eigenfrequencies of a cavity, use what they call a "multiple reflection expansion." They asked about a possible application to Kac's problem.

At the same time, under the influence of Mark Kac's famous paper "Can one hear the shape of a drum?," mathematicians became quite interested in inverse spectral problems, mainly using heat kernel expansions (for the state of the art around 1970, see Berger et al. (1971)).

The SCTF was put into its final mathematical form for the Laplace operator on closed manifolds by three groups of people around 1973–75:

• Y Colin de Verdière in his thesis was using the short-time expansion of the Schrödinger kernel and an approximate Feynman path integral. He proved that the spectrum of the Laplace operator determines generically the lengths of closed geodesics.
• J Chazarain derived the qualitative form of the trace for the wave kernel using Fourier integral operators.
• Using the full power of the symbolic calculus of Fourier integral operators, H Duistermaat and V Guillemin were able to compute the main term of the singularity from the Poincaré map of the closed orbit. Their paper became a canonical reference on the subject.

After that, people were able to extend SCTF to:

• general semiclassical Hamiltonians (Helffer–Robert, Guillemin–Uribe, Meinrenken),
• manifolds with boundary (Guillemin–Melrose),
• surfaces with conical singularities and polygonal billiards (Hillairet), and
• several commuting operators (Charbonnel–Popov).

Recently, some researchers have remarked about the nonprincipal terms in the singularities expansion which come from the semiclassical Birkhoff normal form (Zelditch, Guillemin).

Selberg Trace Formula

We consider a compact hyperbolic surface $X$. "Hyperbolic" means that the Riemannian metric is locally $(dx^2 + dy^2)/y^2$ or is of constant curvature $-1$. Such a surface is the quotient $X = H/\Gamma$ where $\Gamma$ is a discrete co-compact subgroup of the group of isometries of the Poincaré half-plane $H$. Closed geodesics of $X$ are in bijective correspondence with nontrivial conjugacy classes of $\Gamma$. More precisely, the set of loops $C(S^1, X)$ splits into connected components associated to conjugacy classes and each component of nontrivial loops contains exactly one periodic geodesic.

Theorem 1 (Selberg trace formula). If $\rho$ is a real-valued function on $\mathbb{R}$ whose Fourier transform $\hat\rho$ is compactly supported and $\lambda_j = 1/4 + \mu_j^2$ is the spectrum of the Laplace operator on $X$, we have:

\[ \sum_{j=1}^{\infty} \rho(\mu - \mu_j) = \frac{A}{2\pi} \int_{\mathbb{R}} \rho(\mu + s)\, s \tanh(\pi s)\, ds + \sum_{\gamma \in P} \sum_{n=1}^{\infty} \frac{l_\gamma}{2\pi \sinh(n l_\gamma/2)}\, \mathrm{Re}\big(\hat\rho(n l_\gamma)\, e^{i n l_\gamma \mu}\big) \]

where $A$ is the area of $X$, $P$ the set of primitive conjugacy classes of $\Gamma$ and, for $\gamma \in P$, $l_\gamma$ is the length of the unique closed geodesic associated to $\gamma$.

A nice recent presentation of the Selberg trace formula can be found in Marklof (2003).

Semiclassical Schrödinger Operators on Riemannian Manifolds

If $(X, g)$ is a (possibly noncompact) Riemannian manifold and $V : X \to \mathbb{R}$ a smooth function which satisfies $\liminf_{x \to \infty} V(x) = E_\infty > -\infty$, the differential operator $\hat H = (1/2)\hbar^2 \Delta + V$ is semibounded from below and admits self-adjoint extensions. For all those extensions, the spectrum is discrete in the interval $]-\infty, E_\infty[$ and eigenfunctions $\hat H \varphi_j = E_j \varphi_j$ are localized in the domain $V \le E_j$. If $X$ is compact and $V = 0$, we recover the case of the Laplace operator.

We will denote this part of the spectrum by

\[ \inf V < E_1(\hbar) < E_2(\hbar) \le \cdots \le E_j(\hbar) \le \cdots < E_\infty \]

For the Laplace operator, we have $E_j = \hbar^2 \lambda_j$, where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_j \le \cdots$ is the spectrum of the Laplace operator.

The SCTF can also be derived the same way for Schrödinger operators with magnetic field. One can even extend it to Hamiltonian systems which are not obtained by Legendre transform from a regular Lagrangian. In this case, Morse indices have to be replaced by the more general Maslov indices.

Classical Dynamics

Newton Flows

Euler–Lagrange equations for the Lagrangian $L(x, v) := (1/2)\|v\|_g^2 - V(x)$ admit a Hamiltonian formulation on $T^\star X$ whose energy is given by $H = (1/2)\|\xi\|_g^2 + V(x)$. We will denote by $X_H$ the Hamiltonian vector field

\[ X_H := \sum_j \frac{\partial H}{\partial \xi_j}\, \partial_{x_j} - \frac{\partial H}{\partial x_j}\, \partial_{\xi_j} \]

Preservation of $H$ by the dynamics shows immediately that the Hamiltonian flow $\Phi_t$ restricted to $H < E_\infty$ is complete.

The Hamiltonian $H$ is the "classical limit" of $\hat H$; in more technical terms, $H$ is the semiclassical principal symbol of $\hat H$.

If $V = 0$, $H = (1/2) g^{ij} \xi_i \xi_j$ and the flow is the geodesic flow.

Periodic Orbits

Definition 1 A periodic orbit $(\gamma, T)$ (also denoted p.o.) of the Hamiltonian $H$ consists of an orbit $\gamma$ of $X_H$ which is homeomorphic to a circle and a nonzero real number $T$ so that $\Phi_T(z) = z$ for all $z \in \gamma$. We will denote by $T_0(\gamma) > 0$ (the primitive period) the smallest $T > 0$ for which $\Phi_T(z) = z$.
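The notions of Hamiltonian flow, conserved energy, and primitive period can be illustrated numerically. The following is a minimal sketch, not part of the original article: it assumes the one-dimensional harmonic oscillator $H = \xi^2/2 + x^2/2$ (whose primitive period is $2\pi$ at every positive energy) and approximates the flow with a leapfrog scheme. The helper names `flow` and `energy` are hypothetical.

```python
import math

def flow(x, xi, t, grad_V, steps=20000):
    """Leapfrog approximation of the time-t flow of H = xi**2/2 + V(x)."""
    dt = t / steps
    for _ in range(steps):
        xi -= 0.5 * dt * grad_V(x)   # half kick:  d(xi)/dt = -dH/dx
        x += dt * xi                 # full drift: dx/dt    =  dH/d(xi)
        xi -= 0.5 * dt * grad_V(x)   # half kick
    return x, xi

def energy(x, xi, V):
    """Value of the Hamiltonian H = xi**2/2 + V(x)."""
    return 0.5 * xi * xi + V(x)

# Assumed example: harmonic oscillator, V(x) = x**2/2.
V = lambda x: 0.5 * x * x
grad_V = lambda x: x

z0 = (1.0, 0.0)          # initial point on the energy level H = 1/2
T0 = 2.0 * math.pi       # primitive period of this orbit

z_half = flow(z0[0], z0[1], 0.5 * T0, grad_V)   # halfway around: not yet back
z_full = flow(z0[0], z0[1], T0, grad_V)         # back at z0 after time T0
z_twice = flow(z0[0], z0[1], 2.0 * T0, grad_V)  # (gamma, 2*T0) is also a p.o.

print("after T0/2:", z_half)
print("after T0  :", z_full)
print("energy drift:", abs(energy(*z_full, V) - energy(*z0, V)))
```

In this example the pairs $(\gamma, T_0)$ and $(\gamma, 2T_0)$ are both periodic orbits in the sense of Definition 1, while $T_0/2$ is not a period.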

If $(T, E)$ are given, $W_{T,E}$ is the set of $z$'s so that $H(z) = E$ and $\Phi_T(z) = z$.

• The (linear) Poincaré map $\Pi_\gamma$ of a p.o. $(\gamma, T)$ with $H(\gamma) = E$: we restrict the flow to $S_E := \{H = E\}$ and take a hypersurface $\Sigma$ inside $S_E$ transversal to $\gamma$ at the point $z_0$. The associated return map $P$ is a local diffeomorphism fixing $z_0$. Its linearization $\Pi_\gamma := P'(z_0)$ is the linear Poincaré map, an invertible (symplectic) endomorphism of the tangent space $T_{z_0}\Sigma$.
• The Morse index $\mu(\gamma)$: a p.o. $(\gamma, T)$ is a critical point of the action integral $\int_0^T L(\gamma(s), \dot\gamma(s))\, ds$ on the manifold $C^\infty(\mathbb{R}/T\mathbb{Z}, X)$. It always has a finite Morse index (Milnor 1967) which is denoted by $\mu(\gamma)$. For general Hamiltonian systems, the Morse index is replaced by the Conley–Zehnder index.
• The nullity index $\nu(\gamma)$ is the dimension of the space of infinitesimal deformations of the p.o. $\gamma$ by p.o. of the same energy and period. We always have $\nu(\gamma) \ge 1$ and $\nu(\gamma) = 1 + \dim\ker(\mathrm{Id} - \Pi_\gamma)$.

Example 1 (Geodesic flows)

• Riemannian manifold with sectional curvature $< 0$: in this case, we have for all periodic geodesics $\mu(\gamma) = 0$, $\nu(\gamma) = 1$.
• Generic metrics: for a generic metric on a closed manifold, we have $\nu(\gamma) = 1$ for all periodic geodesics.
• For flat tori of dimension $d$: we have $\mu(\gamma) = 0$ and $\nu(\gamma) = d$.
• For the sphere of dimension 2 with constant curvature: if $\gamma_n$ is the $n$th iterate of the great circle, we have $\mu(\gamma_n) = 2|n|$ and $\nu(\gamma_n) = 3$.

It is a beautiful result of J-P Serre that any pair of points on a closed Riemannian manifold are endpoints of infinitely many distinct geodesics. Counting geometrically distinct periodic geodesics is much harder, especially for simple manifolds like the spheres. It is now known that every closed Riemannian manifold admits infinitely many geometrically distinct periodic geodesics (at least, in some cases, for generic metrics; see Berger 2000, chap. V). There exists significant knowledge concerning more general Hamiltonian systems as well.

Nondegeneracy

There are several possible nondegeneracy assumptions. They can be formulated "à la Morse–Bott" (critical point of action integrals) or purely symplectically.

Definition 2 Two submanifolds $Y$ and $Z$ of $X$ intersect cleanly iff $Y \cap Z$ is a manifold whose tangent space is the intersection of the tangent spaces of $Y$ and $Z$.

Fixed points of a smooth map are clean if the graph of the map intersects the diagonal cleanly.

Definition 3 We will denote by (ND) the following property of the p.o. $(\gamma_0, T_0)$: the fixed points of the associated (nonlinear) Poincaré map $P$ are clean. The set $W_{T,E}$ is ND if all p.o.'s inside are ND. $W_{T,E}$ is then a manifold of dimension $\nu(\gamma)$.

Example 2

• Generic case: $\nu = 1$; (ND) is equivalent to "1 is not an eigenvalue of the linear Poincaré map." In this case, we can deform the p.o. smoothly by moving the energy. This family of p.o.'s is called a cylinder of p.o.'s. The period $T(E)$ is then a smooth function of $E$.
• Completely integrable systems: $\nu = d$; (ND) is then a consequence of the so-called "isoenergetic KAM condition": assuming the Hamiltonian is expressed as $H(I_1, \ldots, I_d)$ using action-angle coordinates, this condition is that the mapping $I \to [\nabla H(I)]$ from the energy surface $H = E$ into the projective space is a local diffeomorphism. This condition implies that Diophantine invariant tori are not destroyed by a small perturbation of the Hamiltonian.
• Maximally degenerated systems: it is the case where all orbits are periodic ($\nu = 2d - 1$). For example, the two-body problem with Newtonian potential and the geodesic flows on compact rank-1 symmetric spaces.

Canonical Measures and Symplectic Reduction

Under the hypothesis (ND), the manifold $W_{T,E}$ admits a canonical measure $\mu_c$, invariant by $\Phi_t$. In the case $\nu = 1$, this measure is given by $|dt| / \sqrt{\det(\mathrm{Id} - \Pi_\gamma)}$.

By using a Poincaré section, it is enough to understand the following fact: if $A$ is a symplectic linear map, the space $\ker(\mathrm{Id} - A)$ admits a canonical Lebesgue measure.

We start with the following construction: let $L_1$ and $L_2$ be two Lagrangian subspaces of a symplectic space $E$ and $\omega_j$, $j = 1, 2$, be half-densities on $L_j$, denoted by $\omega_j \in \Omega^{1/2}(L_j)$. If $W = L_1 \cap L_2$, we have the following canonical isomorphisms: $\Omega^{1/2}(L_j) = \Omega^{1/2}(W) \otimes \Omega^{1/2}(L_j/W)$. So $\Omega^{1/2}(L_1) \otimes \Omega^{1/2}(L_2) = \Omega^{1/2}(L_1/W) \otimes \Omega^{1/2}(L_2/W) \otimes \Omega^{1}(W)$. $M_j = L_j/W$ are two Lagrangian subspaces of the reduced space $W^{o}/W$ whose intersection is 0. Hence, by using the Liouville measure on it, we get $\Omega^{1/2}(M_1) \otimes \Omega^{1/2}(M_2) = \mathbb{C}$. Hence, we get a density $\omega_1 \star \omega_2$ on $W$. It turns out that the previous calculation is one of the main algebraic pieces of the symbolic calculus of

Fourier integral operators and the density $\omega_1 \star \omega_2$ arises in stationary-phase computations.

The graph of a symplectic map is equipped with a half-density by pullback of the Liouville half-density. So we can apply the previous construction to the intersection of the graph of $A$ and the graph of the identity map.

Actions

Definition 4 If $(\gamma, T)$ is a p.o., we define the following quantity which is called action of $\gamma$:

\[ A(\gamma) = \int_\gamma \xi\, dx \]

In the (ND) case, $A(\gamma)$ is constant on each connected component of $W_{T,E}$.

In the generic case and if $T'(E) \ne 0$ (cylinder of p.o.), p.o.'s of the cylinder are also parametrized by $T$ (i.e., we denote by $\gamma_E$ the p.o. of the cylinder of energy $E$ and by $\gamma_T$ the p.o. of period $T$). If $a(E) = A(\gamma_E)$ and $b(T) = \int_0^T L(\gamma_T(s), \dot\gamma_T(s))\, ds$, $a(E)$ and $b(T)$ are Legendre transforms of each other.

Playing with Spectral Densities

We will define the "regularized spectral densities." The general idea is as follows: we want to study an $\hbar$-dependent sequence of numbers $E_j(\hbar)$ (a spectrum) in some interval $[a, b]$. We introduce a nonnegative function $\rho \in S(\mathbb{R})$ which satisfies $\int \rho(t)\, dt = 1$, and also $D_{\rho,\varepsilon,\hbar}(E) = \sum \rho_\varepsilon(E - E_j)$, where $\rho_\varepsilon(E) = \varepsilon^{-1} \rho(E/\varepsilon)$. It gives the analysis of the spectrum at the scale $\varepsilon$. Of course, we will adapt the scaling $\varepsilon$ to the small parameter $\hbar$. If the scaling is of the size of the mean spacing of the spectrum, we will get a very precise resolution of the spectrum.

The general philosophy is:

• If $\hbar$ is the semiclassical parameter of a semiclassical Hamiltonian, the mean spacing of the eigenvalues is of order $\hbar^d$ (Weyl's law). The trace formula gives the asymptotic behavior of $D_{\rho,\varepsilon,\hbar}(E)$ for $\varepsilon \sim \hbar$ (and hence $\varepsilon \gg \Delta E$, the mean spacing, except if $d = 1$). This behavior is not "universal" and thus contains a significant amount of information (in our case, on periodic trajectories).
• Better resolution of the spectrum needs the use of the long-time behavior of the classical dynamics and is conjecturally universal. It means that eigenvalues seen at very small scale behave like eigenvalues of an ensemble of random matrices, the most common ones being the Wigner Gaussian orthogonal ensemble (GOE) and Gaussian unitary ensemble (GUE).

We fix some interval $[a, b]$ with $b < E_\infty$.

We define $D(E) := \sum_{a \le E_j \le b} \delta(E_j)$ as the sum of Dirac measures at the points $E_j$ and its $\hbar$-Fourier transform as

\[ Z(t) = \mathrm{trace}'\big(e^{-it\hat H/\hbar}\big) := {\sum}' \exp(-itE_j/\hbar) \qquad [1] \]

where ${\sum}'$ is the sum over $E_j \in [a, b]$.

The Duistermaat–Guillemin trick relates the previous behavior to asymptotics of the regularized density of eigenvalues. Let us give a function $\rho \in S(\mathbb{R})$ so that $\hat\rho(t) = \int e^{-itE} \rho(E)\, dE$ is compactly supported and

\[ \hat\rho(t) = 1 + O(t^\infty), \quad t \to 0 \qquad [2] \]

(all moments of $\rho$ vanish). We introduce, for $E \in [a, b]$, $D_\rho(E) := {\sum}'_j \frac{1}{\hbar} \rho\big((E - E_j)/\hbar\big)$. $D_\rho(E)$ is independent modulo $O(\hbar^\infty)$ of $a$, $b$. By Fourier inversion, we have

\[ D_\rho(E) = \frac{1}{2\pi\hbar} \int e^{itE/\hbar}\, \hat\rho(t)\, Z(t)\, dt \]

The idea is now to start from a semiclassical approximation of $U(t) = e^{-it\hat H/\hbar}$ and to insert it into eqn [1]. We need only a uniform approximation of $U(t)$ for $t \in \mathrm{Support}(\hat\rho)$. From the asymptotic expansion of $Z(t)$, we will deduce the asymptotic expansion of $D_\rho$, the regularized eigenvalue density.

The Smoothed Density of States

The following statement expressing the smoothed density of eigenvalues is the main result of the subject. Under the (ND) assumption, it gives the existence of an asymptotic expansion for $D_\rho(E)$:

Theorem 2 If $E$ is not a critical value of $H$ and the (ND) condition is satisfied for all p.o.'s of energy $E \in [a, b]$ and period inside the support of $\hat\rho$,

\[ D_\rho(E) = D_{\mathrm{Weyl}}(E) + \sum D_{W(T,E)} + O(\hbar^\infty) \qquad [3] \]

where:

(i)
\[ D_{\mathrm{Weyl}}(E) = (2\pi\hbar)^{-d} \left( \sum_{j=0}^{\infty} a_j(E)\, \hbar^j \right) \]
with $a_0(E) = \int_{H = E} dL/dH$

(ii) The sum is over all the manifolds $W_{T,E}$ so that $T \in \mathrm{Support}(\hat\rho)$.

(iii)
\[ D_{W(T,E)} = \frac{\varepsilon}{(2\pi i\hbar)^{(\nu(\gamma)+1)/2}}\, e^{-i\mu(\gamma)\pi/2}\, e^{iA(\gamma)/\hbar} \sum_{j \ge 0} b_j(E)\, \hbar^j \]

with

\[ b_0(E) = \int_{W_{T,E}} \hat\rho(T)\, d\mu_c, \qquad \varepsilon = \begin{cases} 1 & \text{if } T'(E) > 0 \\ i & \text{if } T'(E) < 0 \end{cases} \]

If $\nu(\gamma) = 1$, we get $b_0 = \hat\rho(T)\, T_0\, |\det(\mathrm{Id} - \Pi_\gamma)|^{-1/2}$.

The Weyl Expansion

If $\mathrm{Support}(\hat\rho)$ is contained in $[-T_{\min}, T_{\min}]$, where $T_{\min}$ is the smallest period of a p.o. $\gamma$ with $H(\gamma) = E$, and if $E$ is not a critical value of $H$, formula [3] reduces to

\[ D_\rho(E) \sim (2\pi\hbar)^{-d} \left( \sum_{j=0}^{\infty} a_j(E)\, \hbar^j \right) \]

From the previous formula, it is possible to deduce the following estimates:

Theorem 3 If $a$, $b$ are not critical values of $H$:

\[ \#\{ j \mid a \le E_j(\hbar) \le b \} = (2\pi\hbar)^{-d}\, \mathrm{volume}(a \le H \le b)\, (1 + O(\hbar)) \]

This remainder estimate is optimal and was first shown in rather great generality by Hörmander (1968).

Derivation from the Feynman Integral

The Feynman Integral

R Feynman (Feynman and Hibbs 1965) found a geometric representation of the propagator, that is, the kernel $p(t, x, y)$ of the unitary group $\exp(-it\hat H/\hbar)$, using an integral (FPI := Feynman path integral) on the manifold $\Omega_{t,x,y} := \{ \gamma : [0, t] \to X \mid \gamma(0) = x,\ \gamma(t) = y \}$ of paths from $x$ to $y$ in the time $t$; if $L(\gamma, \dot\gamma)$ is the Lagrangian, we have, for $t > 0$:

\[ p(t, x, y) = \int_{\Omega_{t,x,y}} \exp\left( \frac{i}{\hbar} \int_0^t L(\gamma(s), \dot\gamma(s))\, ds \right) |d\gamma| \]

where $|d\gamma|$ is a "Riemannian measure" on the manifold $\Omega_{t,x,y}$ with the natural Riemannian structure.

There is no justification of FPI as a useful mathematical tool. Nevertheless, FPI gives good heuristics and right formulas.

The Trace and Loop Manifolds

Let us try a formal calculation of the partition function and its semiclassical limit. We get

\[ Z(t) = \int_X |dx| \int_{\Omega_{x,x,t}} \exp\left( \frac{i}{\hbar} \int_0^t L(\gamma(s), \dot\gamma(s))\, ds \right) |d\gamma| \]

If we denote by $\Omega_t$ the manifold of paths $\gamma : \mathbb{R}/t\mathbb{Z} \to X$ (loops) and we apply Fubini (sic !), we get

\[ Z(t) = \int_{\Omega_t} \exp\left( \frac{i}{\hbar} \int_0^t L(\gamma(s), \dot\gamma(s))\, ds \right) |d\gamma| \]

The Semiclassical Limit

We want to apply stationary phase in order to get the asymptotic expansion of $Z(t)$; critical points of $J_t : \Omega_t \to \mathbb{R}$ are the p.o.'s of the Euler–Lagrange flow and hence of the Hamiltonian flow of period $t$. We require the ND assumption (Morse–Bott), the Morse index, and the determinant of the Hessian:

1. The ND assumption is the original Morse–Bott one in Morse theory: we have smooth manifolds of critical points and the Hessian is transversally ND.
2. The Morse index is the Morse index of the action functional on periodic loops: $\mathcal{L}(\gamma) := \int_0^t L(\gamma(s), \dot\gamma(s))\, ds$.
3. The Hessian is associated to a periodic Sturm–Liouville operator for which many regularizations have already been proposed.

In this manner, we get a sum of contributions given by the components $W_{j,t}$ of $W_t$:

\[ Z_j(t) = (i\hbar)^{-\nu_j/2}\, e^{(i/\hbar)\mathcal{L}(\gamma)}\, c_j(\hbar) \]

with $c_j(\hbar) \sim \sum_{l=0}^{\infty} c_{j,l}\, \hbar^l$ and

\[ c_{j,0} = \frac{e^{-i\mu\pi/2}}{|\delta_j|^{1/2}} \]

where $\mu$ is the Morse index and $\delta_j$ is a regularized determinant.

The Integrable Case

As observed by Berry–Tabor, the trace formula in this case comes from the Poisson summation formula using action-angle coordinates. Asymptotics of the eigenvalues to any order can then be given in the so-called quantum integrable case by Bohr–Sommerfeld rules.
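The role of Poisson summation in the integrable case can be checked on the simplest example. The sketch below is an illustration under stated assumptions, not the article's computation: it takes the model spectrum $E_k = \hbar k$, $k \in \mathbb{Z}$ (the Bohr–Sommerfeld levels of $H(I) = I$ on a circle, whose closed orbits have periods $2\pi n$ and actions $2\pi n E$), and a Gaussian $\rho$, whose Fourier transform is not compactly supported but decays fast enough for a numerical comparison. The function names are hypothetical. The regularized density computed from the spectrum is compared with the sum of a Weyl term ($n = 0$) and oscillatory terms $e^{iA/\hbar}$ indexed by the closed orbits.

```python
import math

def rho(x, sigma):
    """Gaussian test function with integral 1 (an assumed choice of rho)."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def rho_hat(t, sigma):
    """Fourier transform of rho with the convention int exp(-i t E) rho(E) dE."""
    return math.exp(-0.5 * (sigma * t) ** 2)

def density_from_spectrum(E, hbar, sigma, kmax=2000):
    """D_rho(E) = sum_k (1/hbar) rho((E - E_k)/hbar) for the levels E_k = hbar*k."""
    return sum(rho((E - hbar * k) / hbar, sigma) for k in range(-kmax, kmax + 1)) / hbar

def density_from_orbits(E, hbar, sigma, nmax=50):
    """Weyl term plus oscillatory terms from the orbits of period 2*pi*n."""
    total = 1.0  # n = 0: the Weyl term
    for n in range(1, nmax + 1):
        period = 2.0 * math.pi * n
        action = period * E  # action of the n-th iterate at energy E
        total += 2.0 * rho_hat(period, sigma) * math.cos(action / hbar)
    return total / hbar

E, hbar, sigma = 0.37, 0.1, 0.3
lhs = density_from_spectrum(E, hbar, sigma)
rhs = density_from_orbits(E, hbar, sigma)
print(lhs, rhs)
```

For this model the two sides agree to rounding error, because the identity is exactly the Poisson summation formula; in the general setting of Theorem 2 the corresponding statement is only asymptotic in $\hbar$.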

The Maximally Degenerated Case

Let us assume that $(X, g)$ is a compact Riemannian manifold for which all geodesics have the same smallest period $T_0 = 2\pi$. Then we have the following clustering property:

Theorem 4 There exists some constant $C$ and some integer $\alpha$ so that

(i) the spectrum of $\Delta$ is contained in the union of the intervals

\[ I_k = \left[ \left(k + \frac{\alpha}{4}\right)^2 - C,\ \left(k + \frac{\alpha}{4}\right)^2 + C \right], \quad k = 1, 2, \ldots \]

(ii) $N(k) = \#\,\mathrm{Spectrum}(\Delta) \cap I_k$ is a polynomial function of $k$ for $k$ large enough.

The property (ii) is a consequence of the trace formula.

Applications to the Inverse Spectral Problem

We will now restrict ourselves to the case of the Laplace operator on a compact Riemannian manifold $(X, g)$. The main result is as follows:

Theorem 5 (Colin de Verdière). If $X$ is given, there exists a generic subset $G_X$, in the sense of Baire category, of the set of smooth Riemannian metrics on $X$, so that, if $g \in G_X$, the length spectrum of $(X, g)$ can be recovered from the Laplace spectrum. The set $G_X$ contains all metrics with $< 0$ sectional curvature and (conjecturally) all metrics with $< 0$ sectional curvature.

We can take for $G_X$ the set of metrics for which all periodic geodesics are nondegenerate and the length spectrum is simple.

Some cancelations may occur between the asymptotic expansions of two ND periodic trajectories with the same actions if the Morse indices differ by 2 mod 4.

The Case with Boundary

If $(X, g)$ is a smooth compact manifold with boundary, one introduces the broken geodesic flow by extending the trajectories by reflection on the boundary. SCTFs have been extended to that case by Guillemin and Melrose. Periodic geodesics which are transversal to the boundary contribute to the density of states in the same way as for periodic manifolds. Periodic geodesics inside the boundary are in general accumulation points of periodic geodesics near the boundary: their contributions are therefore very complicated analytically.

Bifurcations

Let us denote by $C_H \subset \mathbb{R}^2_{T,E}$ the set of pairs $(T, E)$ for which $W_{T,E}$ is not empty. The previous results apply to the "smooth" part of the set $C_H$. Among other interesting points are points $(0, E)$ with critical value $E$ of $H$ (Brummelhuis–Paul–Uribe) and points corresponding to bifurcation of p.o. when moving the energy.

Detailed studies of some of these points have been done, for example, the results of suitable applications of the theory of singularities of functions of finitely many variables, their deformations (catastrophe theory), and applications to stationary-phase method, and a significant body of knowledge on these subjects now exists.

SCTF and Eigenvalue Statistics

One of the main open mathematical problems is: "can one really use appropriate forms of the SCTF as quantization rules and use it in order to derive eigenvalues statistics?"

This problem is related to the fine-scale study of the eigenvalue spacings ($\varepsilon \ll \hbar$). It is one of the important unsolved problems of the so-called "quantum chaos." Many people think that progress in this field will allow us to solve the Bohigas–Giannoni–Schmit conjecture: "if the geodesic flow is hyperbolic, eigenvalue distribution follows random matrix asymptotics."

See also: Billiards in Bounded Convex Domains; h-Pseudodifferential Operators and Applications; Quantum Ergodicity and Mixing of Eigenfunctions; Random Matrix Theory in Physics; Regularization for Dynamical Zeta Functions; Resonances.

Further Reading

Balian R and Bloch C (1970) Distribution of eigenfrequencies for the wave equation in a finite domain I. Annals of Physics 60: 401.
Balian R and Bloch C (1971) Distribution of eigenfrequencies for the wave equation in a finite domain II. Annals of Physics 64: 271.
Balian R and Bloch C (1972) Distribution of eigenfrequencies for the wave equation in a finite domain III. Annals of Physics 69: 76.
Berger M (2000) Riemannian Geometry During the Second Half of the Twentieth Century. University Lecture Series, vol. 17. Providence, RI: American Mathematical Society.
Berger M, Gauduchon P, and Mazet E (1971) Le spectre d'une variété riemannienne compacte. Berlin–Heidelberg–New York: Springer LNM.
Colin de Verdière Y (1973) Spectre du Laplacien et longueurs des géodésiques périodiques II. Compositio Mathematica 27: 159–184.
Duistermaat J (1976) On the Morse index in variational calculus. Advances in Mathematics 21: 173–195.
Duistermaat J and Guillemin V (1975) The spectrum of positive elliptic operators and periodic geodesics. Inventiones Mathematicae 29: 39–79.

Feynman R and Hibbs A (1965) Quantum Mechanics and Path Integrals. New York: McGraw-Hill.
Gutzwiller M (1971) Periodic orbits and classical quantization conditions. Journal of Mathematical Physics 12: 343–358.
Gutzwiller M (1990) Chaos in Classical and Quantum Mechanics. Berlin–Heidelberg–New York: Springer.
Hörmander L (1968) The spectral function of an elliptic operator. Acta Mathematica 121: 193–218.
Marklof J (2003) Selberg's trace formula: an introduction. Lectures given at the International School "Quantum Chaos on Hyperbolic Manifolds," Schloss Reisensburg, Günzburg, Germany, 4–11 October 2003. To appear in Springer LNP, Berlin–Heidelberg, New York. http://fr.arxiv.org
Milnor J (1967) Morse Theory. Annals of Mathematics Studies no. 51. Princeton, NJ: Princeton University Press.
Selberg A (1956) Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with applications to Dirichlet series. Journal of the Indian Mathematical Society 20: 47–87.

Semilinear Wave Equations


P D'Ancona, Università di Roma "La Sapienza," Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A semilinear wave equation is an equation of the form

\[ \Box u = F(u, u'), \qquad u : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R} \qquad [1] \]

where $F : \mathbb{R}^{n+2} \to \mathbb{R}$ is a smooth function, the d'Alembert operator $\Box$ is defined as

\[ \Box = D_t^2 - D_{x_1}^2 - \cdots - D_{x_n}^2, \qquad D_t = \frac{\partial}{\partial t}, \quad D_{x_j} = \frac{\partial}{\partial x_j} \qquad [2] \]

and $u'$ denotes the vector of all first-order derivatives of $u$:

\[ u' = (D_t u, D_{x_1} u, \ldots, D_{x_n} u) \equiv (u_t, u_{x_1}, \ldots, u_{x_n}) \]

Sometimes the term "semilinear" is used in a more restrictive sense and refers to the special class of equations

\[ \Box u = f(u) \qquad [3] \]

The very particular case $f(u) = -mu$, $m > 0$, corresponds to the Klein–Gordon equation, used to model relativistic particles. True nonlinear terms of the form $f(u) = -mu - u^3$, $m \ge 0$ (meson equation), or $f(u) = \sin u$ (sine-Gordon equation) have been proposed as models of self-interacting fields with a local interaction. Notice that for the physical applications it is natural to consider complex-valued functions $u(t, x)$; in the general case of eqn [1], this actually means that we are considering a $2 \times 2$ system in $\Re u$ and $\Im u$. However, the natural physical requirement of gauge invariance restricts the possible nonlinearities to the functions satisfying the condition

\[ f(e^{i\theta} u) = f(u)\, e^{i\theta}, \qquad \forall \theta \in \mathbb{R} \qquad [4] \]

Thus, in particular $f(0) = 0$ and we see that $f$ must be of the form $f(u) = g(|u|^2)u$ for some $g$. Since the gauge-invariant wave equation

\[ \Box u = g(|u|^2)\, u \qquad [5] \]

has essentially the same properties as the real-valued equation [3], it is not too restrictive to study only real-valued functions as we shall mostly do in the following.

The more general equations of the form [1], involving the derivatives of $u$, are encountered in several physical theories, including the nonlinear $\sigma$-models and general relativity.

However, beyond the concrete physical applications, eqn [1] is important since it is a simplified but relevant model of much more general equations and systems of mathematical physics; despite its simple structure, the semilinear wave equation already presents all the main difficulties and phenomena of nonlinear wave interaction, and it represents an ideal laboratory for such problems.

In this article we plan to give a concise but, as far as possible, comprehensive review of the main research directions concerning eqn [1], and in particular we shall focus on the global existence of both large and small nonlinear waves, and the problem of local existence for low-regularity solutions. A large part of the theory extends to nonlinear perturbations of the form $\Box u = F(u, u', u'')$ and to the fully nonlinear case; we have no space here to give an account of these developments and we must refer the reader to the books and papers cited in the "Further reading" section.

Classical Results

Equations [1] and [3] are hyperbolic with respect to the variable $t$. This is a precise way of stating that the "correct" problem for it is an initial-value problem (IVP) with data at some fixed time, or

more generally on some spacelike surface: this means that we assign two functions $u_0(x)$, $u_1(x)$, called the "initial data," and we look for a function $u(t, x)$ satisfying the IVP:

\[ \Box u = F(u, u'), \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x) \qquad [6] \]

This setting is in agreement with the physical picture of an evolution problem: the data represent the complete state of a system at a fixed time, and they uniquely determine the evolution of the system, which is described by the differential equation.

This rough statement of the problem is sufficient when working with smooth functions, as in the classical approach. By purely classical methods, that is, energy inequalities and nonlinear estimates, it is not difficult to prove the following local existence result, where $H^k = H^k(\mathbb{R}^n)$ denotes the Sobolev space of functions with $k$ derivatives in $L^2(\mathbb{R}^n)$:

Theorem 1 Assume $F$ is $C^\infty$. Let $(u_0, u_1) \in H^k \times H^{k-1}$ for some $k > 1 + n/2$. Then there exists a time $T = T(\|u_0\|_{H^k} + \|u_1\|_{H^{k-1}}) > 0$ such that problem [6] has a unique solution belonging to $(u, u_t) \in C([-T, T]; H^k) \times C([-T, T]; H^{k-1})$.

If $F = F(u)$ depends only on $u$, the result holds for all $k > n/2$.

Proof We decided to include a sketchy but complete proof of this result since it shows the basic approach to nonlinear wave equations: many results of the theory, even some of the most delicate ones, are obtained by suitable variations of the contraction method, and are similar in spirit to this classical theorem.

Assume for a moment that the equation is linear so that $F = F(t, x)$ is a given smooth function of $(t, x)$. For the linear equation $\Box u = F$, we can construct a solution $u$ using explicit formulas. Moreover, $u$ satisfies the energy inequality

\[ E_k(t) \le E_k(0) + \int_0^t \|F(s, \cdot)\|_{H^{k-1}}\, ds \qquad [7] \]

where the energy $E_k(t)$ is defined as

\[ E_k(t) = \|u(t, \cdot)\|_{H^k} + \|u_t(t, \cdot)\|_{H^{k-1}} \qquad [8] \]

Now we introduce the space $X_T = C([-T, T]; H^k) \cap C^1([-T, T]; H^{k-1})$, the space $Y_T = C([-T, T]; H^{k-1})$, the mapping $\Phi : F \to u$ that takes the function $F(t, x)$ into the solution of $\Box u = F$ (with fixed data $u_0$, $u_1$), and the mapping $\Psi(u) = F(u, u')$ which is the original right-hand side of the equation.

The energy inequality tells us that $\Phi$ is bounded from $Y_T$ to $X_T$. Actually, for $M$ large enough with respect to $E_k(0)$ (the $H^k$ norm of the data), $\Phi$ takes any ball $B_Y(0, N)$ of $Y_T$ into the ball $B_X(0, M + NT)$ of $X_T$. Moreover, if we apply [7] to the difference of two equations $\Box u = F$ and $\Box v = G$, we also see that $\Phi$ is Lipschitz continuous from $Y_T$ to $X_T$, with a Lipschitz constant $CT$.

On the other hand, $\Psi(u) = F(u, u')$ takes $X_T$ to $Y_T$, provided $k > 1 + n/2$; we can even say that it is Lipschitz continuous from $B_X(0, M)$ to $B_Y(0, C(M))$ for some function $C(M)$, with a Lipschitz constant $C_1(M)$ also depending on $M$. This follows easily from Moser type estimates like

\[ \|F(u, u')\|_{H^{k-1}} \le \phi(\|u\|_{L^\infty})\, \|u\|_{H^k}, \qquad k > \frac{n}{2} + 1 \]

or

\[ \|F(u)\|_{H^k} \le \phi(\|u\|_{L^\infty})\, \|u\|_{H^k}, \qquad k > \frac{n}{2} \]

Now it is easy to conclude: the composition $\Phi \circ \Psi$ maps $X_T$ into itself, and actually is a contraction of $B_X(0, M)$ into itself provided $M$ is large enough with respect to the data, and $T$ is small enough with respect to $M$. The unique fixed point is the required solution. ∎

The wave operator has an additional important property called the finite speed of propagation, which can be stated as follows: given the IVP

\[ \Box u = 0, \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x) \]

if we modify the data "outside" a ball $B(x_0, R) \subset \mathbb{R}^n$, the values of the solution inside the cone

\[ K(x_0, R) = \{ (t, x) : t \ge 0,\ |x - x_0| < R - t \} \]

do not change. Notice that $K(x_0, R)$ is the cone with basis $B(x_0, R)$ and tip $(R, x_0)$; the slope of its mantle represents the speed of propagation of the signals, which for the wave operator $\Box$ is equal to 1. The property extends without modification to the semilinear problem [6], at least for the smooth solutions given by Theorem 1. Actually, it is not difficult to modify the proof of the theorem to work on cones instead of bands $[-T, T] \times \mathbb{R}^n$; in other words, given a ball $B = B(x_0, R)$, we can assign two data $u_0 \in H^k(B)$, $u_1 \in H^{k-1}(B)$ ($k > n/2 + 1$) and prove the existence of a local solution on the cone $K(x_0, R)$ for some time interval $t \in [0, T]$.

In general, the finite speed of propagation allows us to localize in space most of the results and the estimates; as a rule of thumb, we expect that what is true on a band $[0, T] \times \mathbb{R}^n$ should also be true on any truncated cone $K(x_0, R) \cap \{0 \le t \le T\}$.
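The finite speed of propagation can be observed directly in one space dimension, where the linear problem is solved by d'Alembert's formula $u(t,x) = \tfrac12[u_0(x-t) + u_0(x+t)] + \tfrac12\int_{x-t}^{x+t} u_1(s)\,ds$. The following is a minimal sketch under assumptions chosen only for illustration: the particular data, the ball $B(0, 1)$, the helper names, and the trapezoidal quadrature are not taken from the article. Data are perturbed outside the ball, and the two solutions are compared inside and outside the cone $K(0, 1)$.

```python
import math

def dalembert(u0, u1, t, x, n=400):
    """u(t, x) for u_tt - u_xx = 0 on the line, via d'Alembert's formula.

    The integral of u1 over [x - t, x + t] uses the trapezoidal rule.
    """
    a, b = x - t, x + t
    h = (b - a) / n
    integral = 0.5 * (u1(a) + u1(b))
    for i in range(1, n):
        integral += u1(a + i * h)
    integral *= h
    return 0.5 * (u0(a) + u0(b)) + 0.5 * integral

X0, R = 0.0, 1.0  # assumed ball B(x0, R)

def bump(x):
    """Perturbation that vanishes on the closed ball and is nonzero outside it."""
    d = abs(x - X0) - R
    return d * d if d > 0.0 else 0.0

u0 = lambda x: math.exp(-x * x)          # original data
u1 = lambda x: math.sin(x)
u0_mod = lambda x: u0(x) + bump(x)       # data modified outside the ball
u1_mod = lambda x: u1(x) + bump(x)

# A point inside the cone K(x0, R): |x - x0| < R - t.
t_in, x_in = 0.5, 0.3
inside_gap = abs(dalembert(u0, u1, t_in, x_in) - dalembert(u0_mod, u1_mod, t_in, x_in))

# A point outside the cone: its backward light cone reaches beyond the ball.
t_out, x_out = 0.5, 0.9
outside_gap = abs(dalembert(u0, u1, t_out, x_out) - dalembert(u0_mod, u1_mod, t_out, x_out))

print("difference inside the cone :", inside_gap)
print("difference outside the cone:", outside_gap)
```

Inside the cone the interval $[x - t, x + t]$ stays within the ball, so the formula never samples the modified data; outside the cone it does, and the two solutions separate.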

Symmetries

The linear wave equation can be written as the Euler–Lagrange equation of a suitable Lagrangian. This is still true for the semilinear perturbations of the form

$$\Box u + f(u) = 0 \qquad [9]$$

Indeed, denoting with $F(s) = \int_0^s f(\sigma)\,d\sigma$ the primitive of $f$, the Lagrangian of [9] is

$$L(u) = \iint \left( -\frac12 |u_t|^2 + \frac12 |\nabla_x u|^2 + F(u) \right) dt\,dx \qquad [10]$$

The functional $L$ is not positive definite; hence, the variational approach gives only weak results. However, this point of view allows us to apply Noether's principle: any invariance of the functional is related to a conservation law of the equation. These conserved quantities can also be obtained by taking the product of the equation by a suitable multiplier, although this method is far from obvious in many cases. We describe here this circle of ideas briefly.

The functional $L$ is invariant under the Poincaré group, generated by time and space translations and the Lorentz transformations ($\lambda > 1$, $c \ne 0$):

$$t \mapsto \frac{\lambda t - x_j/c}{\sqrt{\lambda^2 - 1}}, \qquad x_j \mapsto \frac{\lambda x_j - ct}{\sqrt{\lambda^2 - 1}} \qquad [11]$$

The infinitesimal generators of the translations are simply the partial derivatives $D_t$ and $D_{x_j}$. The Lorentz transformations can be decomposed as a rotation followed by a boost, and indeed a corresponding complete set of infinitesimal generators are the operators

$$\Omega_{jk} = x_j D_k - x_k D_j, \qquad \Omega_j = x_j D_t + t D_j \qquad [12]$$

All the operators in the Poincaré group commute with $\Box$ exactly.

The conservation law related to time translations (time derivative) is the fundamental "conservation of energy"

$$E(t) = \int \left( \frac12 u_t^2 + \frac12 |\nabla_x u|^2 + F(u) \right) dx = E(0) \qquad [13]$$

while spatial translations (spatial derivatives) lead to the conservation of momenta

$$\int u_t\, u_{x_j}\,dx = \text{const.}, \qquad j = 1, \ldots, n$$

On the other hand, infinitesimal rotations and boost [12] are connected to the conservation of angular momenta

$$\int \left( x_k D_j u - x_j D_k u \right) D_t u \,dx = \text{const.}, \qquad j,k = 1, \ldots, n \qquad [14]$$

and

$$\int \left[ x_k\, e(u) + D_k u\, D_t u \right] dx = \text{const.}, \qquad k = 1, \ldots, n \qquad [15]$$

where

$$e(u) = \frac12 u_t^2 + \frac12 |\nabla_x u|^2 + F(u) \qquad [16]$$

is the energy density.

The Poincaré group does not exhaust the invariance properties of the free wave equation. Among the other transformations which commute or almost commute with $\Box$, we mention the spacetime dilations and inversions (which together with translations and Lorentz transformations generate the larger conformal group), the scaling $u \mapsto \lambda u$, the spatial dilations, and, in the complex-valued case, the gauge transformation $u \mapsto e^{i\theta} u$. In this way several useful conservation laws can be obtained, including the conformal energy identities of K Morawetz.

Strichartz Estimates

Energy estimates are very useful tools but they have some major shortcomings. The main one is clearly the large number of derivatives necessary to estimate the nonlinear term. This is why the modern theory of semilinear wave equations relies mainly on different tools, which go under the umbrella name of Strichartz estimates and express the decay properties of solutions when measured in $L^p$ or related norms. In this section we summarize these estimates in their most general form, and try to give a feeling of the techniques involved.

Consider the following IVP for a homogeneous linear wave equation:

$$\Box u = 0, \qquad u(0,x) = 0, \qquad u_t(0,x) = f(x) \qquad [17]$$

The conservation of energy states that

$$\|u_t(t,\cdot)\|_{L^2}^2 + \|\nabla_x u(t,\cdot)\|_{L^2}^2 \equiv \|f\|_{L^2}^2 \qquad [18]$$

for all times $t$. Thus, we see that $L^2$-type norms of the solution do not decay. The interesting fact is that if we measure the solution $u$ in a different $L^p$-norm, $p > 2$, the norm decays as $t \to \infty$, and the decay is fastest for the $L^\infty$-norm.

To appreciate the dispersive phenomena at their best, let us assume that the Fourier transform of the data is localized in an annulus of order 1:

$$\operatorname{supp} \hat f(\xi) \subseteq \{ 1/2 \le |\xi| \le 2 \} \qquad [19]$$

Then the corresponding solution $u(t,x)$ has the same property, and we see that

$$\|u\|_{L^2} = \|\hat u\|_{L^2} \le 2 \| |\xi| \hat u \|_{L^2} \equiv 2 \|\nabla u\|_{L^2} \le 4 \|u\|_{L^2}$$

We condense the last line in the shorthand notation

$$\|u\|_{L^2} \simeq \|\nabla u\|_{L^2}$$

We shall also write

$$\|v\|_X \lesssim \|w\|_Y \iff \|v\|_X \le C \|w\|_Y \ \text{for some } C$$

We can now rewrite the conservation of energy [18] in a very simple form; for localized data (and hence a localized solution) as in [19], we have

$$\|u(t,\cdot)\|_{L^2} \lesssim \|f\|_{L^2} \qquad [20]$$

The basic $L^\infty$-estimate for a solution of [17] with localized data as in [19] is simply

$$\|u(t,\cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2} \|f\|_{L^1} \qquad [21]$$

This estimate has been well known since the 1960s; it can be proved easily by several techniques, notably by the stationary-phase method. Property [21] measures the fact that as time increases, the total energy of the solution remains constant but spreads over a region of increasing volume, due to the propagation of waves. If we interpolate between [20] and [21], we obtain the full set of dispersive estimates

$$\|u(t,\cdot)\|_{L^q} \lesssim t^{-(n-1)(1/2 - 1/q)} \|f\|_{L^p}, \qquad \frac1q + \frac1p = 1, \quad 2 \le q \le \infty \qquad [22]$$

Recall that we are working with localized solutions on the annulus $|\xi| \sim 1$; it is easy to extend the above estimates to general solutions by a rescaling argument, exploiting the fact that, if $u(t,x)$ is a solution of the homogeneous wave equation, $u(\lambda t, \lambda x)$ is also a solution for any constant $\lambda$. Indeed, if $\hat f$ (and hence $\hat u$) is supported in the annulus $2^{j-1} \le |\xi| \le 2^{j+1}$, $j \in \mathbb{Z}$, by rescaling [21], we obtain

$$\|u(t,\cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2}\, 2^{j(n-1)/2} \|f\|_{L^1} \qquad [23]$$

If $f$ is any smooth function, not localized in frequency, we can still write it as a series

$$f = \sum_{j \in \mathbb{Z}} f_j$$

where $\operatorname{supp} \hat f_j \subseteq \{ 2^{j-1} \le |\xi| \le 2^{j+1} \}$. The quantity

$$\|f\|_{\dot B^s_{1,1}} = \sum_{j \in \mathbb{Z}} 2^{js} \|f_j\|_{L^1}$$

is by definition the $\dot B^s_{1,1}$ Besov norm of $f$. Thus, summing the estimates [23] over $j$, we conclude that a general solution of [17] satisfies the dispersive estimate

$$\|u(t,\cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2} \|f\|_{\dot B^{(n-1)/2}_{1,1}} \qquad [24]$$

The Strichartz estimates can be obtained as a consequence of the above dispersive estimates, plus some subtle functional analytic arguments. In the general form we give here, they were proved by J Ginibre and G Velo, and in the most difficult endpoint cases by M Keel and T Tao. The solution of the homogeneous problem [17] studied above can be written as

$$u(t,x) = \frac{\sin(t|D|)}{|D|} f, \qquad |D| \equiv \mathcal{F}^{-1} |\xi| \mathcal{F}$$

(here $\mathcal{F}$ denotes the Fourier transform). On the other hand, the solution of the complete nonhomogeneous problem

$$\Box u = F(t,x), \qquad u(0,x) = u_0, \qquad u_t(0,x) = u_1(x) \qquad [25]$$

can be written by Duhamel's formula as

$$u(t,x) = \frac{\partial}{\partial t} \frac{\sin(t|D|)}{|D|} u_0 + \frac{\sin(t|D|)}{|D|} u_1 + \int_0^t \frac{\sin((t-s)|D|)}{|D|} F(s,\cdot)\,ds$$

and we see that the above estimates [22] apply to all the operators appearing here. If we consider problem [25] and we assume that the data $F(t,x)$, $u_0$, $u_1$ are localized in frequency so that $\hat F(t,\xi)$, $\hat u_0$, $\hat u_1$ have support in the annulus $|\xi| \sim 1$, the Strichartz estimate takes the following form:

$$\|u\|_{L^p_I L^q} \lesssim \|u_0\|_{L^2} + \|u_1\|_{L^2} + \|F\|_{L^{\tilde p'}_I L^{\tilde q'}} \qquad [26]$$

Here the dimension is $n \ge 2$; $L^p_I L^q$ denotes the space with norm

$$\|u\|_{L^p_I L^q} = \left( \int_I \|u(t,\cdot)\|_{L^q(\mathbb{R}^n)}^p \,dt \right)^{1/p}, \qquad I = [0,T] \ \text{or} \ I = \mathbb{R}$$

the indices $p$, $q$ satisfy the conditions

$$\frac1p + \frac1q \cdot \frac{n-1}{2} \le \frac12 \cdot \frac{n-1}{2}, \qquad p \in [2,\infty], \qquad (n,p,q) \ne (3,2,\infty) \qquad [27]$$

while $\tilde p$, $\tilde q$ satisfy an identical condition (and $p'$ denotes the conjugate index to $p$). The constant in inequality [26] is uniform with respect to the interval $I$.
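As a small aid for reading the index conditions [27] and the decay exponent in [22], the following sketch evaluates them with exact rational arithmetic. The function names `is_wave_admissible` and `dispersive_decay_rate` are illustrative, not taken from the text, and the requirement $q \ge 1$ is an added assumption (needed for $L^q$ to be a norm) that [27] does not spell out.

```python
from fractions import Fraction
import math


def _inverse(r):
    """Return 1/r as an exact Fraction, with 1/inf = 0."""
    return Fraction(0) if r == math.inf else 1 / Fraction(r)


def is_wave_admissible(n, p, q):
    """Check the conditions [27] on (p, q) in space dimension n >= 2."""
    if n < 2 or p < 2 or q < 1:  # q >= 1 is an assumption, see the lead-in
        return False
    if (n, p, q) == (3, 2, math.inf):
        return False  # endpoint excluded in [27]
    lhs = _inverse(p) + Fraction(n - 1, 2) * _inverse(q)
    return lhs <= Fraction(n - 1, 4)


def dispersive_decay_rate(n, q):
    """Exponent (n-1)(1/2 - 1/q) of the factor t^(-exponent) in [22]."""
    return (n - 1) * (Fraction(1, 2) - _inverse(q))


print(is_wave_admissible(3, 4, 4))            # p = q = 2(n+1)/(n-1) for n = 3
print(is_wave_admissible(3, 2, math.inf))     # the excluded triple
print(dispersive_decay_rate(3, math.inf))     # decay exponent for the sup norm
```

For example, the pair $p = q = 2(n+1)/(n-1)$ sits exactly on the boundary of the admissible region, which can be checked with `is_wave_admissible(4, Fraction(10, 3), Fraction(10, 3))`.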

To get the most general form of the estimates, some additional function space trickery is required. As before, a simple rescaling argument extends estimate [26] to the case of data $F$, $u_0$, $u_1$, whose spatial Fourier transforms are localized in the annulus $2^{j-1} \le |\xi| \le 2^{j+1}$; we obtain

$$2^{j(1/p + n/q)} \|u\|_{L^p_I L^q} \lesssim 2^{jn/2} \|u_0\|_{L^2} + 2^{j(n/2 - 1)} \|u_1\|_{L^2} + 2^{j(1/\tilde p' + n/\tilde q' - 2)} \|F\|_{L^{\tilde p'}_I L^{\tilde q'}}$$

Finally, if the data are arbitrary, we may decompose them as series of localized functions, and summing the corresponding estimates we obtain the general Strichartz estimates for the wave equation [25]: for all $(p,q)$ and $(\tilde p, \tilde q)$ as in [27],

$$\|u\|_{L^p_I \dot B^{1/p + n/q}_{q,2}} \lesssim \|u_0\|_{\dot H^{n/2}} + \|u_1\|_{\dot H^{n/2 - 1}} + \|F\|_{L^{\tilde p'}_I \dot B^{1/\tilde p' + n/\tilde q' - 2}_{\tilde q',2}} \qquad [28]$$

Here, given a decomposition $f = \sum_{j \in \mathbb{Z}} f_j$, the homogeneous Besov and Sobolev norms are defined, respectively, by the identities (obvious modification for $r = \infty$):

$$\|f\|_{\dot B^s_{q,r}}^r = \sum_{j \in \mathbb{Z}} 2^{jsr} \|f_j\|_{L^q}^r; \qquad \|u\|_{\dot H^s} = \| |\xi|^s \hat u \|_{L^2} \simeq \|u\|_{\dot B^s_{2,2}}$$

It is easy to convert the estimates [28] into a form that uses only the more traditional norms

$$\|f\|_{\dot H^s_q} \equiv \| |D|^s f \|_{L^q}, \qquad |D|^s \equiv \mathcal{F}^{-1} |\xi|^s \mathcal{F}$$

since by the Besov–Sobolev embedding we have

$$\dot B^s_{q,2} \subseteq \dot H^s_q \ \text{for } 2 \le q < \infty; \qquad \dot B^s_{q,2} \supseteq \dot H^s_q \ \text{for } 1 < q \le 2$$

Notice that if we apply to the equation and the data the operator $|D|^\sigma = \mathcal{F}^{-1} |\xi|^\sigma \mathcal{F}$, which commutes with $\Box$, the Strichartz estimate [28] can be rewritten in an apparently more general form:

$$\|u\|_{L^p_I \dot B^{1/p + n/q + \sigma}_{q,2}} \lesssim \|u_0\|_{\dot H^{n/2 + \sigma}} + \|u_1\|_{\dot H^{n/2 - 1 + \sigma}} + \|F\|_{L^{\tilde p'}_I \dot B^{1/\tilde p' + n/\tilde q' - 2 + \sigma}_{\tilde q',2}} \qquad [29]$$

In particular, it is possible to choose the indices in such a way that no derivatives appear on $u$ and $F$: this choice gives

$$\|u\|_{L^p(\mathbb{R}^{n+1})} \lesssim \|u_0\|_{\dot H^{1/2}} + \|u_1\|_{\dot H^{-1/2}} + \|F\|_{L^{p'}(\mathbb{R}^{n+1})}, \qquad p = \frac{2(n+1)}{n-1}$$

which is the estimate originally proved by Strichartz.

Global Large Waves

As for ordinary differential equations (ODEs), the local solutions constructed in Theorem 1 can be extended to a maximal time interval $[0, T^*)$, and a natural question arises: are these maximal solutions global, that is, is $T^* = \infty$?

For generic nonlinearities and large data, the answer is negative; in a dramatic way, in general the norm $\|u(t,\cdot)\|_{L^\infty}$ is unbounded as $t \uparrow T^* < \infty$. The reason for this is simple: using the finite speed of propagation, we can localize the equation and work on a cone; then if we take constant functions as initial data, the solution inside the cone does not depend on $x$, and the equation restricted to the cone effectively reduces to an ODE:

$$\Box u = f(u) \iff y''(t) = f(y), \qquad y(t) \equiv u(t,x) \qquad [30]$$

By this remark it is elementary to construct solutions of the IVP [6] that blow up in a finite time.

This construction does not apply if the equation has some positive conserved quantity. Indeed, consider a general gauge-invariant equation

$$\Box u + g(|u|^2) u = 0, \qquad u(0,x) = u_0(x), \qquad u_t(0,x) = u_1(x) \qquad [31]$$

for some smooth function $g(s)$. Writing $G(s) = \int_0^s g(\sigma)\,d\sigma$, multiplying the equation by $u_t$, and integrating over $\mathbb{R}^n$, it is easy to check that the nonlinear energy

$$E(t) = \int \left[ |u_t|^2 + |\nabla_x u|^2 + G(|u|^2) \right] dx \equiv E(0) \qquad [32]$$

is constant in time, provided the solution $u$ is smooth enough. When $G(s)$ has no definite sign, we can proceed as above and construct solutions that blow up in finite time; this is usually called the "focusing" case. However, if we assume that $G(s) \ge 0$ ("defocusing" case), the energy $E(t)$ is non-negative. The corresponding ODE, which is $y'' + g(y^2) y = 0$, has only global solutions, and one may guess that also the solutions of [31] can be extended to global ones.

This innocent-looking guess turns out to be one of the most difficult problems of the theory of nonlinear waves, and is actually largely unsolved at present. The only general result for eqns [31] is Segal's theorem, stating that the IVP has always a global weak solution:

Theorem 2 Let $g(s)$ be a $C^1$ non-negative function on $[0, +\infty)$, write $G(s) = \int_0^s g(\sigma)\,d\sigma$ and assume that for some constant $C$

$$s\, g(s^2) \le C\, G(s^2), \qquad \lim_{s \to +\infty} G(s) = +\infty \qquad [33]$$

Then for any $(u_0, u_1) \in H^1 \times L^2$ such that $G(|u_0|^2) \in L^1$, the IVP [31] has a global solution $u(t,x)$ in the sense of distributions, such that $u' \in L^\infty(\mathbb{R}, L^2(\mathbb{R}^n))$ and $F(u) \in L^\infty(\mathbb{R}, L^1(\mathbb{R}^n))$.

The proof (see Shatah and Struwe (1998)) is delicate but elementary in spirit: by truncating the nonlinear term, we can approximate the problem at hand with a sequence of problems with global solution; then the conservation law [32] yields some extra compactness, which allows us to extract a subsequence converging to a solution of the original equation.

Thus we see that, despite its generality, this result does not shed much light on the difficulties of the problem. Indeed, the weak solution obtained might not be unique, nor smooth, and in these questions the real obstruction to solving [31] is hidden.

Notice that in the one-dimensional case $n = 1$ the solution is always unique and smooth when the data are smooth, since in this case $E(t)$ controls the $L^\infty$-norm of $u$. For higher dimensions $n \ge 2$, something more can be proved if we assume that the nonlinear term has a polynomial growth:

$$s\, g(s^2) = |s|^{p-1} s \ \text{for } s \text{ large}, \qquad p > 1 \qquad [34]$$

In particular, the defocusing wave equation with a power nonlinearity

$$\Box u + |u|^{p-1} u = 0 \qquad [35]$$

has been studied extensively. Notice that when $p$ is close to 1, the term $|u|^{p-1} u$ becomes singular near 0; this introduces additional difficulties in the problem; for this reason, it is better to consider a smooth term as in [34].

We can summarize the best-known results concerning [31] under [34] as follows. Let $p_0(n)$ be the number

$$p_0(1) = p_0(2) = \infty, \qquad p_0(n) = 1 + \frac{4}{n-2} \ \text{for } n \ge 3$$

Then

• in the subcritical case $1 \le p < p_0(n)$, for any data $(u_0, u_1) \in H^1 \times L^2$, there exists a unique solution $u \in C(\mathbb{R}; H^1)$ such that $u' \in C(\mathbb{R}; L^2)$;
• the same result holds in the critical case $p = p_0(n)$ for $n \ge 3$; and
• when $3 \le n \le 7$, $1 \le p \le p_0(n)$, the solution is smoother if the data are smoother.

These results have been achieved in the course of more than 30 years through the works of several authors (it is indispensable to mention at least the names of K Jörgens, I Segal, W Strauss, W von Wahl, P Brenner, H Pecher, J Ginibre, G Velo, R Glassey and the more recent contributions of J Shatah, M Struwe, L Kapitanski, M Grillakis, omitting many others). Actually modern proofs are remarkably simple, and are based again on a variation of the fixed-point argument. Roughly speaking, the linear equation $\Box u + g(|v|^2) v = 0$ defines a mapping $v \mapsto u$; the Strichartz estimates localized on a cone imply that this mapping is Lipschitz continuous in suitable spaces, the Lipschitz constant being estimated by the nonlinear energy of the solution restricted to the cone. In order to show that this mapping is actually a contraction, it is sufficient to prove that the localized energy tends to zero near the tip of the cone, that is, it cannot concentrate at a point. Once this is known, it is easy to continue the solution beyond any maximal time of existence and prove the global existence and uniqueness of the solution.

In the supercritical case $p > p_0(n)$, very little is known at present; there is some indication that the problem is much more unstable than in the subcritical case (Kumlin, Brenner, Lebeau), and there is some numerical evidence in the same direction.

Global Small Waves

It was noted already in the 1960s (Segal, Strauss) that the equation in dimension $n \ge 2$

$$\Box u = f(u), \qquad u(0,x) = \varepsilon u_0(x), \qquad u_t(0,x) = \varepsilon u_1(x), \qquad f(u) = O(|u|^\gamma) \ \text{for } u \sim 0$$

with small data can be considered as a perturbation of the free wave equation and admits global solutions. The phenomenon may be regarded as follows: the wave operator tends to spread waves and reduce their size (see [21]); the nonlinear term tends to concentrate the peaks and make them higher, but at the same time it makes small waves smaller. If the rate of dispersion is fast enough, the initial data are small enough, and the power of the nonlinear term is high enough, the peaks have no time to concentrate, and the solution quickly flattens out to 0. Notice that in dimension 1 there is no dispersion, and this kind of mechanism does not occur.

It was, however, F John who initiated the modern study of this question by giving the complete picture in dimension 3: for the IVP

$$\Box u = |u|^\gamma, \qquad u(0,x) = \varepsilon u_0(x), \qquad u_t(0,x) = \varepsilon u_1(x), \qquad n = 3$$

he proved that, for fixed $u_0, u_1 \in C_0^\infty$,

• if $\gamma > 1 + \sqrt2$ and $\varepsilon$ is small enough, the solution is global and
• if $1 < \gamma < 1 + \sqrt2$ and the data are not identically zero, the solutions blow up in a finite time for all $\varepsilon$ (i.e., the $L^\infty$-norm is unbounded).

Later Schaeffer proved that blow-up occurs also at the critical value $\gamma = 1 + \sqrt2$.

W Strauss guessed the correct critical value for all dimensions: $\gamma_0(n)$ is the positive root of the algebraic equation

$$\frac{n-1}{2} \gamma^2 - \frac{n+1}{2} \gamma = 1$$

and conjectured that the same picture as in dimension 3 is valid for all dimensions $n \ge 2$.

Soon Sideris proved that, for $1 < \gamma < \gamma_0(n)$ and quite general small data, one always has blow-up. Also it was proved by Klainerman, Shatah, Christodoulou, and others that the positive part of the conjecture was true for $\gamma > \gamma_0(n)$, with a small gap near the critical value. The gap was closed by Georgiev, Lindblad, Sogge, who proved global existence for all $\gamma > \gamma_0(n)$. We also mention that the solution at the critical value $\gamma = \gamma_0(n)$ always seems to blow up; this is settled for low dimension (Schaeffer, Yordanov, Zhang and others), but the question is still not completely clear for large dimensions.

This problem has spurred a great deal of creativity, eventually leading to very fruitful results: the different approaches have proved useful in a variety of problems, sometimes quite different from the original semilinear equation. We mention a few:

• The weighted estimates of F John are estimates of the solution in spacetime $L^p$ norms with weights of the form $(1 + |t| + |x|)^{\alpha} (1 + \big| |t| - |x| \big|)^{\beta}$. An extension of this method was also used in the final complete proof of the conjecture.
• The vector field approach of S Klainerman. If we regard energy estimates as norms generated by the plain derivatives, it is natural to extend them to more general norms generated by vector fields commuting, or quasicommuting, with the wave operator. The conservation of energy expressed in these generalized norms has a built-in decay that allows us to prove global existence of small waves. This circle of ideas led very far, and we might even regard Christodoulou and Klainerman's proof of the stability of Minkowski space for the Einstein equation as an extreme consequence of this approach.
• The normal forms of J Shatah. The idea is to apply a nonlinear (and nonlocal) transformation to the equation in order to increase the power $\gamma$. This method is effective for a variety of equations, including the semilinear wave, Klein–Gordon, and Schrödinger equations.
• The conformal transform method of D Christodoulou. The Penrose transform takes the wave operator on $\mathbb{R}^{1+n}$ to the wave operator on a bounded subset of $\mathbb{R} \times S^n$, the so-called Einstein diamond (here $S^n$ is the $n$-dimensional sphere). Thanks to the fact that a problem of global existence is converted into a problem of local existence, the proof reduces to showing that the lifespan of the local solution becomes large enough to cover the whole diamond when $\varepsilon$ decreases.

A similar theory has been developed for the more general semilinear equation

$$\Box u = F(u,u'), \qquad F(u,u') = O(|u,u'|^\gamma) \ \text{for } u \sim 0$$

but the results are less complete. The general picture is similar: for $\gamma \ge 2$ when $n \ge 4$, and for $\gamma \ge 3$ when $n = 3$, one has global small solutions, while for $\gamma$ close to 1 one in general has blow-up.

A very interesting phenomenon in this context was discovered by S Klainerman: some nonlinearities with a special structure, called "null structure," behave better than the others. This structure is clearly related to the wave operator, and in the end it can be precisely explained in terms of interaction of waves in phase space. We illustrate these ideas in the most interesting special case. Consider the equation in three dimensions

$$\Box u = F(D_t u, D_x u), \qquad F = O(|u'|^\gamma), \qquad n = 3$$

In the "cubic" case $\gamma = 3$, one has global existence for all data small enough. On the other hand, in the "quadratic" case $\gamma = 2$, it is possible to construct examples where the solution blows up in a finite time no matter how small the data. Now, assume that the nonlinear term has the following structure:

$$F(u') = a\, Q_0(u') + \sum_{0 \le j < k \le 3} c_{jk}\, Q_{jk}(u') + O(|u'|^3) \qquad [36]$$

which is called a "null structure". Here $a$, $c_{jk}$ are constants, and the quadratic forms $Q$ are the following:

$$Q_0(u') = |D_t u|^2 - |D_{x_1} u|^2 - |D_{x_2} u|^2 - |D_{x_3} u|^2 \qquad [37]$$

$$Q_{0j}(u') = D_t u \cdot D_{x_j} u - D_{x_j} u \cdot D_t u, \qquad j = 1, 2, 3 \qquad [38]$$

$$Q_{jk}(u') = D_{x_j} u \cdot D_{x_k} u - D_{x_k} u \cdot D_{x_j} u, \qquad j,k = 1, 2, 3; \ j < k \qquad [39]$$

Then the problem has a global solution for all small enough data. The extensions and applications of this idea are very wide (see the "Further reading" section for further information). Another situation where the null structure plays an important role is discussed in the next section.

Low Regularity

Theorem 1, although optimal in the classical framework, is not satisfactory for a few reasons. From a physicist's point of view, requiring $n/2 + 1$ derivatives of the data is not meaningful, since the measurable quantities involve only low-order derivatives, the most important one being the energy, that is, the $H^1$-norm of the solution. Moreover, the wave equation has a rich set of conserved quantities, symmetries and decay properties which may be useful to prove stronger results, and in particular the global existence. However, many of these structures appear only at a low-regularity level ($H^1$ or even $L^p$); in order to exploit them it is essential to work with low-regularity solutions.

As an example, if we were able to prove Theorem 1 for $k = 1$, then we could deduce that the local solutions can be extended to global ones in all cases when the $H^1$-norm is conserved. For instance, this would allow us to solve globally the equations of the form

$$\Box u + G'(|u|^2) u = 0, \qquad G(s) \ge 0$$

The problem of the lowest value of $s$ such that a unique local solution exists in $H^s$ is quite difficult, and still not completely solved. In order to state the results we make the definition of solution precise as follows: the IVP is said to be locally well posed in $H^s$, if, for all $(u_0, u_1)$ in a bounded set $B$ of $H^s \times H^{s-1}$, there exist a $T > 0$, a Banach space $X_T$ (depending on $B$) continuously embedded in $C([0,T]; H^s)$, and a unique solution $u \in X_T$, such that the map $(u_0, u_1) \mapsto u$ is continuous from $B$ to $X_T$.

For the wave equation with a power nonlinearity

$$\Box u = |u|^p \qquad [40]$$

or more generally

$$\Box u = F(u), \qquad F(0) = 0, \qquad |F(u) - F(v)| \le C |u - v| \left( |u|^{p-1} + |v|^{p-1} \right) \qquad [41]$$

the picture is almost complete. Indeed, by using the scaling

$$t \mapsto \lambda t, \qquad x \mapsto \lambda x \qquad [42]$$

and the Lorentz transformation

$$t \mapsto \frac{\lambda t - x_1}{\sqrt{\lambda^2 - 1}}, \qquad x_1 \mapsto \frac{t - \lambda x_1}{\sqrt{\lambda^2 - 1}} \qquad [43]$$

it is possible to show by explicit constructions that

• the equation is not locally well posed for $p(n/2 - s) > (n/2 + 2 - s)$ (scaling) and
• the equation is not locally well posed for $p(n/4 + 1/4 - s) > n/4 + 5/4 - s$ (Lorentz).

On the positive side, local well-posedness has been almost fully proved in the complementary region of indices, with the exception of a tiny spot near the endpoint $s = 0$, $p = (n+5)/(n+1)$ where the problem is still open (and the conjecture is that the equation is ill posed for indices in that region). These results are due to several authors; among others we cite C Kenig, G Ponce, L Vega, H Lindblad, C Sogge, L Kapitanski, and T Tao.

When the nonlinearity depends also on the first-order derivatives of $u$, the situation becomes more complex. In the general case, the best result available is still the local existence theorem (Theorem 1); the only possible refinement is the use of fractional Sobolev spaces $H^s$, but in general local solvability only holds for $s > n/2 + 1$. If we assume that $F = F(u')$ is a quadratic form in the first-order derivatives, a clever use of Strichartz estimates allows us to prove local solvability down to $s > n/2 + 1/2$ for $n \ge 3$ and $s > 7/4$ for $n = 2$ (Ponce and Sideris).

However, exactly as in the case of the small nonlinear waves examined in the previous section, if the nonlinear term has a null structure the result can be improved. Indeed, when $F(u')$ is a combination of the forms [37]–[39], then local solvability and uniqueness can be proved for all $s > n/2$, as in the case of a nonlinear term of the type $F(u)$. This result is due to Klainerman, Machedon, and Selberg. Again, the proof is based on a variation of the contraction method; the additional ingredient here is the use of suitable function spaces, which are the counterpart for the wave equation of the spaces used by Bourgain in the study of the nonlinear Schrödinger equation. The norm of these spaces is defined as follows:

$$\|u\|_{H^{s,\theta}} \equiv \left\| \langle \xi \rangle^s \langle |\tau| - |\xi| \rangle^\theta\, \tilde u(\tau,\xi) \right\|_{L^2(\mathbb{R}^{n+1})}$$

where $\langle \xi \rangle = (1 + |\xi|^2)^{1/2}$ and $\tilde u$ is the spacetime Fourier transform of $u(t,x)$. The wave operator can be regarded as a spacetime Fourier multiplier of the form $\tau^2 - |\xi|^2 = (|\tau| - |\xi|)(|\tau| + |\xi|)$, and we see that "inverting" the operator $\Box$ has a regularizing effect in the scale of $H^{s,\theta}$ spaces, since it decreases both

$s$ and $\theta$ by one unit. Substantiating this formal argument and complementing it with suitable estimates for the nonlinear term requires some hard work, which is contained in the theory of bilinear estimates developed by Klainerman and his school.

See also: Evolution Equations: Linear and Nonlinear; Symmetric Hyperbolic Systems and Shock Waves; Wave Equations and Diffraction.

Further Reading

Choquet-Bruhat Y (1988) Global existence for nonlinear σ-models, Rend. Sem. Mat. Univ. Pol. Torino, Fascicolo speciale "Hyperbolic equations," 65–86.
D'Ancona P, Georgiev V, and Kubo H (2001) Weighted decay estimates for the wave equation. Journal of Differential Equations 177: 146–208.
Georgiev V, Lindblad H, and Sogge CD (1997) Weighted Strichartz estimates and global existence for semilinear wave equations. American Journal of Mathematics 119: 1291–1319.
Ginibre J and Velo G (1982) The Cauchy problem for the O(N), CP(N − 1), and G_C(N, p) models. Annals of Physics 142: 393–415.
Ginibre J and Velo G (1995) Generalized Strichartz inequalities for the wave equation. Journal of Functional Analysis 133: 50–68.
Hörmander L (1997) Lectures on Nonlinear Hyperbolic Equations. Berlin: Springer.
John F (1979) Blow-up of solutions of nonlinear wave equations in three space dimensions. Manuscripta Mathematica 28(1–3): 235–268.
Keel M and Tao T (1998) Endpoint Strichartz estimates. American Journal of Mathematics 120: 955–980.
Klainerman S (1986) The null condition and global existence to nonlinear wave equations. Lectures in Applied Mathematics 23: 293–326.
Klainerman S and Selberg S (2002) Bilinear estimates and applications to nonlinear wave equations. Communications in Contemporary Mathematics 4: 223–295.
Lindblad H and Sogge C (1995) Existence and scattering with minimal regularity for semilinear wave equations. Journal of Functional Analysis 130: 357–426.
Schiff LI (1951) Nonlinear meson theory of nuclear forces I. Physical Review 84: 1–9.
Segal IE (1963) The global Cauchy problem for a relativistic scalar field with power interaction. Bulletin de la Société Mathématique de France 91: 129–135.
Shatah J (1988) Weak solutions and the development of singularities in the SU(2) σ-model. Communications on Pure and Applied Mathematics 41: 459–469.
Shatah J and Struwe M (1993) Regularity results for nonlinear wave equations. Annals of Mathematics 138: 503–518.
Shatah J and Struwe M (1998) Geometric Wave Equations. Courant Lecture Notes in Mathematics, vol. 2. New York: Courant Institute, New York University.
Strauss W (1989) Nonlinear Wave Equations. CBMS Lecture Notes, vol. 75. Providence, RI: American Mathematical Society.

Separation of Variables for Differential Equations

S Rauch-Wojciechowski, Linköping University, Linköping, Sweden
K Marciniak, Linköping University, Norrköping, Sweden
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The method of separation of variables (SoV) is a way of finding particular and general solutions of certain types of partial differential equations (PDEs). Its main idea is to consider the additive ansatz $u(x) = \sum_i w_i(x_i, \alpha)$ or the multiplicative ansatz $u(x) = \prod_i u_i(x_i, \alpha)$ for a solution of a PDE that allows for reducing this PDE to a set of (uncoupled) ordinary differential equations (ODEs) for the unknown functions $w_i(x_i, \alpha)$ or $u_i(x_i, \alpha)$ of one variable $x_i$, where $x = (x_1, \ldots, x_n)$. Locally, the additive ansatz is, through the change of variables $u(x) = \exp\left(\sum_i w_i(x_i, \alpha)\right)$, equivalent to the multiplicative ansatz.

Many well-known equations of mathematical physics such as the heat equation, the wave equation, the Schrödinger equation, and the Hamilton–Jacobi equation are solved by separating variables in suitably chosen systems of coordinates.

Fourier Method

The SoV method can be attributed to Fourier (1945), who solved the heat equation

$$\partial_t u = \partial_{xx} u \qquad [1]$$

for distribution of temperature $u(x,t)$ in a one-dimensional metal rod (of length $L$) by looking first for special solutions of the product type $u(x,t) = X(x)T(t)$. This ansatz, substituted to [1], reduces it to two ODEs: $\partial_t T = -k^2 T$ and $\partial_{xx} X = -k^2 X$ that can be solved by quadratures:

$$T_k(t) = A e^{-k^2 t}, \qquad X_k(x) = B \cos(kx) + C \sin(kx)$$

Due to linearity of [1], any formal linear combination $u(x,t) = \sum_k c_k X_k(x) T_k(t)$ is again a solution of the heat equation and can be used for solving an initial boundary-value problem (IBVP). For instance,

in the case of the IBVP on the interval $0 \le x \le L$ and with zero boundary conditions

$$\begin{aligned} \partial_t u &= \partial_{xx} u, & 0 < t, \ 0 < x < L \\ u(0,t) &= u(L,t) = 0, & 0 < t \\ u(x,0) &= f(x), & 0 < x < L \end{aligned}$$

only a countable set of values for the separation constant $k$ is admissible: $k_n = n\pi/L$, $n = 1, 2, \ldots$. Then the general solution has the form of the Fourier series

$$u(x,t) = \sum_{n=1}^{\infty} c_n \exp\left(-k_n^2 t\right) \sin(k_n x)$$

where the coefficients $c_n$ are given by the integrals

$$c_n = \frac{2}{L} \int_0^L f(x) \sin(k_n x)\,dx$$

The sequence of functions $\sin(k_n x)$ is complete on the interval $[0, L]$. That means that any regular (continuous and differentiable) initial data function $f(x)$ such that $f(0) = f(L) = 0$ can be uniquely expressed as an infinite convergent sum of the orthogonal set of functions $\sin(k_n x)$. The study of mathematical properties of the Fourier expansion gave rise to the classical theory of Fourier series and Fourier integrals.

Separability of PDEs in General Setting

A general setting for an additive separability of a single, usually nonlinear, PDE has been developed by Levi-Civita (1904) and by Kalnins and Miller (1980) (see also Miller (1983)). Let

$$H(x_1, \ldots, x_n, u, u_i, u_{ij}, u_{ijk}, \ldots) = E, \qquad 1 \le i, j, k \le n \qquad [2]$$

be a finite-order PDE for an unknown function $u(x)$, where $u_i(x) = \partial_{x_i} u$, $u_{ij} = \partial_{x_j} \partial_{x_i} u$, etc., and $E$ is a constant. A separable solution $u(x) = \sum_i W_i(x_i)$ satisfies the simpler equation

$$E = H(x; u, u_i, u_{ii}, \ldots) \equiv H[x, u] \qquad [3]$$

where all mixed derivatives $u_{ij}$, etc., disappear. If a separable solution is admissible by eqn [2], then the function $H(x; u, u_i, u_{ii}, \ldots)$ has to satisfy a set of integrability conditions following from the total derivatives of [3]. Let

$$D_i = \partial_{x_i} + u_{i,1} \partial_u + u_{i,2} \partial_{u_{i,1}} + \cdots + u_{i,m_i+1} \partial_{u_{i,m_i}} \equiv \tilde D_i + u_{i,m_i+1} \partial_{u_{i,m_i}}$$

(where $u_{i,1} = u_i$, $u_{i,j+1} = \partial_{x_i} u_{i,j}$, etc., and $m_i$ is the largest number $l$ such that $\partial_{u_{i,l}} H \ne 0$) denote the operator of total derivative with respect to (w.r.t.) $x_i$; then, $D_i H[x,u] = 0$ or

$$u_{i,m_i+1} = -\frac{\tilde D_i H}{H_{u_i,m_i}}$$

where $H_{u_i,m_i} = \partial_{u_{i,m_i}} H$. The integrability conditions $D_j u_{i,m_i+1} = 0$, $j \ne i$, give rise to a large set of differential conditions to be satisfied by $H[x,u]$:

$$H_{u_i,m_i} H_{u_j,m_j} \left( \tilde D_i \tilde D_j H \right) + H_{u_i,m_i u_j,m_j} \left( \tilde D_i H \right) \left( \tilde D_j H \right) = H_{u_j,m_j} \left( \tilde D_i H \right) \left( \tilde D_j H_{u_i,m_i} \right) + H_{u_i,m_i} \left( \tilde D_j H \right) \left( \tilde D_i H_{u_j,m_j} \right) \qquad [4]$$

In general, the conditions [4] are restrictions for both $H$ and the form of a particular separable solution $u(x)$. If [4] is satisfied identically w.r.t. all $u$, $u_{k,l}$, we say that the corresponding coordinate system $x_i$ is a regular separable coordinate system; then the PDE [3] admits a $\left(\sum_i m_i + 1\right)$-parameter family of separable solutions. Most cases considered in the literature are regular, since the separable solution is then usually sufficiently general for solving various IBVPs.

A given PDE, however, usually does not satisfy [4]; since these equations are not of tensorial type, the natural question arises if there exists a suitable change of coordinates $y(x)$ such that the transformed PDE satisfies [4]. Such separation coordinates may or may not exist; it is usually very difficult to decide.

Here and in what follows, we speak about separability of a single (scalar) PDE. The theory of separability of systems of PDEs is still not developed fully, although it is of relevance in the theory of Maxwell equations and of the Dirac equation.

We present here the most classical part of SoV theory: orthogonal separability of the Hamilton–Jacobi equation for geodesic motions on Riemannian manifolds.

Configurational Separation of Hamilton–Jacobi Equation on Riemannian Manifolds

Around 1842, C G J Jacobi invented the method of generating function for solving the canonical Hamilton equations

$$\dot x = \frac{\partial H(x,y)}{\partial y}, \qquad \dot y = -\frac{\partial H(x,y)}{\partial x}, \qquad x = (x_1, \ldots, x_n), \quad y = (y_1, \ldots, y_n) \qquad [5]$$

where $H(x,y)$ is a Hamiltonian and dot denotes the time derivative (Landau and Lifshitz 1976). In this

method, one looks for a generating function $W(x, \alpha)$ of a canonical transformation

$$y = \frac{\partial W(x, \alpha)}{\partial x}, \qquad \beta = \frac{\partial W(x, \alpha)}{\partial \alpha}$$

that transforms the Hamiltonian equations [5] into simple equations for the new variables $\alpha \in \mathbb{R}^n$, $\beta \in \mathbb{R}^n$. Since the transformation is canonical, the transformed equations are again Hamiltonian with the new Hamiltonian $\tilde H(\alpha, \beta) = H(x(\alpha, \beta), y(\alpha, \beta))$. If we choose this transformation so that $\tilde H(\alpha, \beta) = \alpha_1$, then the transformed Hamilton equations become

$$\dot\beta = \frac{\partial \tilde H(\alpha, \beta)}{\partial \alpha} = (1, 0, \ldots, 0), \qquad \dot\alpha = -\frac{\partial \tilde H(\alpha, \beta)}{\partial \beta} = 0$$

so that $\beta(t) = (t + \beta_{10}, \beta_{20}, \ldots, \beta_{n0})$, $\alpha(t) = (\alpha_{10}, \ldots, \alpha_{n0}) = \text{const.}$, and the solution $x(t)$, $y(t)$ of the Hamilton equations [5] is then given implicitly by the equations

$$\beta(t) = \frac{\partial W(x(t), \alpha)}{\partial \alpha}, \qquad y(t) = \frac{\partial W(x(t), \alpha)}{\partial x}$$

Since

$$y = \frac{\partial W(x, \alpha)}{\partial x}$$

the generating function $W(x, \alpha)$ has to satisfy (identically w.r.t. $(x, \alpha)$) the first-order nonlinear PDE

$$H\!\left(x, \frac{\partial W(x, \alpha)}{\partial x}\right) = \alpha_1 \qquad [6]$$

This equation is called the Hamilton–Jacobi equation for the generating function $W(x, \alpha)$. It is solved once a complete integral $W(x, \alpha)$ depending on $n$ independent constants $\alpha$ is known, where complete means that

$$\det\!\left(\frac{\partial^2 W(x, \alpha)}{\partial x_i\,\partial \alpha_j}\right) \ne 0$$

In general, it is very difficult to find solutions of [6]. The most important method is the method of separation of variables, in which one looks for a solution in the form $W(x, \alpha) = \sum_{k=1}^n W_k(x_k, \alpha)$, a sum of $n$ functions $W_k(x_k, \alpha)$, each depending on a single variable $x_k$ and, possibly, all constants $\alpha$. If the Hamilton–Jacobi equation [6] admits such a solution, then integrating this equation is reduced to integrating $n$ (uncoupled) first-order ODEs for the functions $W_k(x_k, \alpha)$. The constants $\alpha_k$ then acquire the meaning of integration constants.

A separable solution $W(x, \alpha)$ of [6] exists whenever the Hamiltonian $H(x, y)$ satisfies (identically) the integrability conditions [4], which in this case acquire the (nonlinear) form

$$L_{ij}(H) \equiv \partial_i H\,\partial_j H\,\partial^i\partial^j H + \partial^i H\,\partial^j H\,\partial_i\partial_j H - \partial_i H\,\partial^j H\,\partial^i\partial_j H - \partial^i H\,\partial_j H\,\partial_i\partial^j H = 0, \qquad i \ne j, \quad i, j = 1, \ldots, n \qquad [7]$$

($\partial_i = \partial/\partial x_i$, $\partial^i = \partial/\partial y_i$) found by Levi-Civita (1904).

In classical mechanics the most important Hamiltonians are natural ones:

$$H(x, y) = \frac{1}{2}\sum_{i,j} g^{ij}(x)\, y_i y_j + V(x) \equiv G + V \qquad [8]$$

They are defined on the cotangent bundle $T^*Q$ of a configurational Riemannian manifold $Q$ with the metric tensor $g$. The function $G$ is the geodesic Hamiltonian associated with the metric tensor $g$. For such natural Hamiltonians, the Levi-Civita condition $L_{ij}(G + V) = 0$ splits into the condition $L_{ij}(G) = 0$ and a condition for the potential $V(x)$. The condition $L_{ij}(G) = 0$, depending solely on the kinetic energy term, is thus a necessary condition for coordinates $x_i$ on $Q$ to be separation coordinates for [8].

In the fundamental case of orthogonal separation (i.e., when $g^{ij} = 0$ for $i \ne j$), the Levi-Civita conditions $L_{ij}(G + V) = 0$ read

$$\partial_i\partial_j g^{kk} - \partial_i \ln g^{jj}\;\partial_j g^{kk} - \partial_j \ln g^{ii}\;\partial_i g^{kk} = 0, \qquad i \ne j \qquad [9]$$

$$\partial_i\partial_j V - \partial_i \ln g^{jj}\;\partial_j V - \partial_j \ln g^{ii}\;\partial_i V = 0, \qquad i \ne j \qquad [10]$$

The main questions arising here are

1. What is the algebraic form of orthogonally separable Riemannian metrics?
2. What is the form of separable coordinates on Riemannian manifolds?

The first question is answered by the Stäckel theorem (Stäckel 1891) that provides an algebraic characterization of orthogonal separability of a natural Hamiltonian $H = G + V$.

Theorem 1 The Hamilton–Jacobi equation for the natural Hamiltonian

$$H = G + V = \frac{1}{2}\sum_i g^{ii}(x)\, y_i^2 + V(x)$$
is separable in the (orthogonal) coordinates $x$ if and only if

(i) There exists a matrix $\Phi = [\varphi_{ij}(x_i)]$, $\det(\Phi) \ne 0$ (so that the row $i$ depends only on $x_i$), such that $[g^{11}, \ldots, g^{nn}]$ is the first row of the inverse matrix $\Psi = \Phi^{-1}$.
(ii) The potential $V$ has the form $V(x) = \sum_i g^{ii} f_i(x_i)$, where each $f_i(x_i)$ is a function of one variable $x_i$ only.

Such a matrix $\Phi$ is called a Stäckel matrix.

Proof If

$$\left[g^{11}, \ldots, g^{nn}\right] \begin{bmatrix} \varphi_{11}(x_1) & \cdots & \varphi_{1n}(x_1) \\ \vdots & \ddots & \vdots \\ \varphi_{n1}(x_n) & \cdots & \varphi_{nn}(x_n) \end{bmatrix} = [1, 0, \ldots, 0] \qquad [11]$$

then the Hamilton–Jacobi equation for $H$ can be written as

$$\frac{1}{2}\sum_i g^{ii}\left(\frac{\partial W}{\partial x_i}\right)^2 + \sum_i g^{ii} f_i(x_i) = \alpha_1 = \alpha_1 \sum_i g^{ii}\varphi_{i1}(x_i) + \alpha_2 \sum_i g^{ii}\varphi_{i2}(x_i) + \cdots + \alpha_n \sum_i g^{ii}\varphi_{in}(x_i) \qquad [12]$$

This equation admits an additively separable solution $W = \sum_i W_i(x_i)$, where the functions $W_i$ satisfy $n$ ODEs (separation equations):

$$\frac{1}{2}\left(\frac{\partial W_i}{\partial x_i}\right)^2 + f_i(x_i) = \alpha_1\varphi_{i1}(x_i) + \alpha_2\varphi_{i2}(x_i) + \cdots + \alpha_n\varphi_{in}(x_i), \qquad i = 1, \ldots, n \qquad [13]$$

By differentiating [13] w.r.t. $\alpha_j$, we get

$$\varphi_{ij}(x_i) = \frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial \alpha_j}$$

and thus

$$\det\left[\varphi_{ij}(x_i)\right] = \frac{\partial W_1}{\partial x_1} \cdots \frac{\partial W_n}{\partial x_n}\,\det\!\left[\frac{\partial^2 W}{\partial x_i\,\partial \alpha_j}\right] \ne 0$$

so that $W = \sum_i W_i(x_i)$ is indeed a complete integral of the Hamilton–Jacobi equation [12]. Conversely, if $W = \sum_i W_i(x_i)$ is a complete integral of the Hamilton–Jacobi equation [12], then by differentiating it w.r.t. $\alpha_j$ we get for $j = 1$

$$\sum_i g^{ii}\,\frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial \alpha_1} = 1$$

and

$$\sum_i g^{ii}\,\frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial \alpha_j} = 0$$

(for $j = 2, \ldots, n$), that is, the condition [11] for the Stäckel matrix

$$\varphi_{ij} = \frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial \alpha_j}$$

Further, we see that

$$V = \alpha_1 - \frac{1}{2}\sum_i g^{ii}\left(\partial_{x_i} W_i\right)^2 = \sum_i g^{ii}\left[\alpha_1\varphi_{i1}(x_i) - \frac{1}{2}\left(\partial_{x_i} W_i\right)^2\right] = \sum_i g^{ii} f_i(x_i) \qquad \blacksquare$$

Remark 2 The Stäckel characterization of orthogonal separability is equivalent to the Levi-Civita conditions [9] and [10]. It is in fact a solution of these conditions.

Remark 3 With every Stäckel matrix, one can relate a family of $n$ Hamiltonians, quadratic in momenta, defined by the $n$ rows of the inverse Stäckel matrix $\Psi = \Phi^{-1} = [\psi^{kr}]$:

$$H_k = \frac{1}{2}\sum_{r=1}^n \psi^{kr}\, y_r^2, \qquad k = 1, \ldots, n \qquad [14]$$

(so that $H_1 = G$). These Hamiltonians are linearly and functionally independent; they Poisson-commute (so that they form a Liouville integrable system) and are all diagonal so that they have common eigenvectors.

These properties are the main ingredients of an intrinsic (coordinate-independent) characterization of separable geodesic Hamiltonians $G$ in terms of involutive Killing tensors that is due to works of Eisenhart (1934), Kalnins and Miller (1980), and Benenti (1997).

Theorem 4 A necessary and sufficient condition for the existence of an orthogonal additive separable coordinate system $x$ for the Hamilton–Jacobi equation of the geodesic Hamiltonian $H_1 = G$ on an $n$-dimensional (pseudo)-Riemannian manifold is that there exist $n$ quadratic forms $H_r = \sum_{i,j}^n h_r^{ij}(x)\, y_i y_j$ such that

(i) They all Poisson-commute: $\{H_r, H_s\} = 0$, $1 \le r, s \le n$.
(ii) The set $\{H_r\}_{r=1}^n$ is linearly independent.
(iii) There is a basis $\{\omega^{(j)}\}_{j=1}^n$ of $n$ simultaneous eigenforms for all $H_r$.
If conditions (i)–(iii) are satisfied then there exist functions $g_j(x)$ such that $\omega^{(j)} = g_j\, dx_j$, $j = 1, \ldots, n$.

This theorem has been further simplified by Benenti (1997), who has shown that for separability it is sufficient that $g_{ij}$ admits a single Killing 2-tensor with simple eigenvalues and normal eigenvectors. He has also explained the role of ignorable coordinates.

These results are key ingredients of an answer to question 2. Eisenhart (1934), starting from the fact that every separable geodesic Hamiltonian $H = G$ admits $n$ quadratic (w.r.t. momenta $y_i$) integrals of motion, derived a set of nonlinear PDEs characterizing separable Riemannian metrics. He solved these equations for spaces of constant curvature. This solution is the basis of Kalnins and Miller's (1986) diagrammatic classification of all orthogonal separation coordinates on $\mathbb{R}^n$ and the sphere $S^n$. Separable coordinates on the Minkowski space $M^n$ have not been classified yet.

Since the work of Robertson (1927) and Eisenhart (1934), it is known that in $\mathbb{R}^n$, $S^n$ and, in general, in a space with diagonal Ricci tensor, the (additive) separability of the Hamilton–Jacobi equation for the natural Hamiltonian $H = G + V$ is equivalent to multiplicative separability of the stationary Schrödinger equation with the same potential $V$:

$$(\Delta + V(x))\,\psi(x) = E\,\psi(x) \qquad [15]$$

where

$$\Delta = \sum_{i,j=1}^n \frac{1}{\sqrt{\det(g)}}\,\partial_i \sqrt{\det(g)}\; g^{ij}\,\partial_j$$

is the Laplace–Beltrami operator. Usually, multiplicatively separated solutions $\psi(x) = \prod_{i=1}^n \psi_i(x_i)$ are considered, but the change of the dependent variable $u = \ln\psi$ transforms them into additively separable solutions. If we restrict our considerations to orthogonal separation coordinates ($g^{ij} = 0$ for $i \ne j$), eqn [15] becomes

$$\sum_{i=1}^n \left[ g^{ii}\left(u_{ii} + u_i^2\right) + \frac{1}{\sqrt{\det(g)}}\,\partial_i\!\left(\sqrt{\det(g)}\; g^{ii}\right) u_i \right] + V(x) = E$$

where $u_i = \partial_i u$, $u_{ii} = \partial_i\partial_i u$. The integrability conditions [4] for regular separation lead to the Levi-Civita condition [9] on the components $g^{ii}$ of the metric tensor, upon comparison of the coefficients at $u_i^2$. The coefficients at $u_{ii}$ yield the Robertson condition

$$\partial_i\partial_j \ln\!\left(\sqrt{\det(g)}\; g^{ii}\right) = 0, \qquad i \ne j$$

and the constant terms in [4] give the Levi-Civita equation [10], meaning that $V(x) = \sum_{i=1}^n g^{ii} f_i(x_i)$. Eisenhart has shown that the Robertson condition is equivalent to the requirement that the Ricci tensor is diagonal, $R_{ij} = 0$ for $i \ne j$ in the variables $x$, so that the Robertson condition is satisfied automatically in Euclidean space, in spaces of constant curvature and in Einstein spaces. Thus every orthogonal coordinate system permitting multiplicative separation of the Schrödinger equation corresponds to the Stäckel form.

Jacobi Problem of Separability

In order to apply the separability theory to physical Hamiltonians $H = (1/2)p^2 + V(q)$, $p = (p_1, \ldots, p_n)$, $q = (q_1, \ldots, q_n)$, it is essential to solve the following problem: "given a potential $V(q)$, decide if there exists a point transformation $x(q)$ to some curvilinear coordinates $x$ such that the Hamilton–Jacobi equation associated with $H$ is separable in coordinates $x$, and if such transformation exists, determine it and solve the obtained Hamilton–Jacobi equation."

This problem has been raised by Jacobi (1884) in connection with the problem of finding geodesic motions on a 3-axial ellipsoid. For solving this problem Jacobi introduced his "remarkable change of coordinates" to the generalized elliptic coordinates $x(q)$ defined through zeros of the rational function

$$1 + \sum_{i=1}^n \frac{q_i^2}{z - \lambda_i} \equiv \frac{\prod_j (z - x_j)}{\prod_i (z - \lambda_i)} \qquad [16]$$

where the constants $\lambda_i > 0$ are all different. From the graph of the left-hand side of [16], it is easy to see that there are exactly $n$ simple, real zeros. For given values of the elliptic coordinates $x_j$, the values of $q_i^2$ are uniquely determined as residues at $\lambda_i$, while the Cartesian coordinates $q_i$ are determined uniquely only in each $n$-tant of $\mathbb{R}^n$.

The Jacobi elliptic coordinates play a pivotal role in orthogonal separability on $\mathbb{R}^n$ and $S^n$ since they are the mother of all other separation coordinates, which can be obtained through proper and improper degenerations of the $\lambda_i$'s. By using these coordinates Jacobi solved not only the geodesic motions on the ellipsoid but also the motion on the ellipsoid under the action of the harmonic potential $V(q) = (1/2)q^2$. He has also found separation coordinates for a system of three interacting particles on the line known today as the Calogero system. In general, however, Jacobi considered the problem of finding separation coordinates for a given potential $V(q)$ to be very difficult. In Vorlesungen über Dynamik, ch. 26, he writes: "The main difficulty in integrating a given
differential equation lies in introducing convenient variables, which there is no rule for finding. Therefore, we must travel the reverse path and after finding some notable substitution, look for problems to which it can be successfully applied". This statement had a profound influence on further development of SoV theory, which concentrated on characterizing separable Hamiltonians (as expressed in terms of separation coordinates) and on describing and classifying separation coordinates.

The original problem of Jacobi of finding separation variables for a given natural Hamiltonian has been taken up by Rauch-Wojciechowski (1986), who found a characterization of separable potentials $V(q)$ in terms of Cartesian coordinates $q_i$. Its invariant geometric form has been given by Benenti. A complete criterion of separability that allows for effective testing and calculation of separation coordinates (if they exist) for $V(q)$ has been given by Waksjö and Rauch-Wojciechowski (2003). This criterion is directly applicable to the problem of finding SoV for the Schrödinger equation.

Criterion of Separability for n = 2

The criterion of separability for $n = 2$ can be read from the Bertrand–Darboux theorem.

Theorem 5 (Bertrand–Darboux). For the Hamiltonian

$$H = \frac{1}{2}\left(p_1^2 + p_2^2\right) + V(q_1, q_2)$$

the following statements are equivalent:

(i) $H$ has a functionally independent integral of motion, $\{H, K\} = 0$, of the form

$$K = \left(a q_2^2 + b q_2 + c\right) p_1^2 + \left(a q_1^2 + \tilde b q_1 + \tilde c\right) p_2^2 + \left(-2a q_1 q_2 - b q_1 - \tilde b q_2 + d\right) p_1 p_2 + k(q_1, q_2)$$

(ii) The potential $V(q_1, q_2)$ satisfies the following linear second-order PDE with quadratic coefficients

$$0 = 2\left(a q_2^2 - a q_1^2 + b q_2 - \tilde b q_1 + c - \tilde c\right)\partial_1\partial_2 V + \left(-2a q_1 q_2 - b q_1 - \tilde b q_2 + d\right)\left(\partial_2^2 V - \partial_1^2 V\right) + (6a q_2 + 3b)\,\partial_1 V - (6a q_1 + 3\tilde b)\,\partial_2 V \qquad [17]$$

where $a, b, \tilde b, c, \tilde c, d$ are some constants, $\partial_1 = \partial_{q_1}$, $\partial_2 = \partial_{q_2}$.
(iii) The Hamilton–Jacobi equation for $H$ is separable in one of the four orthogonal coordinate systems in the plane: elliptic, parabolic, polar, or Cartesian.

Remark 6 If the potential $V(q_1, q_2)$ is separable, then it admits an integral of motion $K$ that is quadratic w.r.t. momenta and $V$ satisfies (identically w.r.t. $q_1, q_2$) eqn [17] for certain values of the undetermined constants $a, b, \tilde b, c, \tilde c, d$. Since coefficients at linearly independent expressions of $q_1, q_2$ have to be equal to zero, the parameters $a, b, \tilde b, c, \tilde c, d$ have to satisfy a set of linear, algebraic, homogeneous equations. If there is a nonzero solution for $a, b, \tilde b, c, \tilde c, d$, then there exists an integral of motion $K$ and separation coordinates can be determined as characteristic variables for equation [17].

Example 7 Separable cases of the Hénon–Heiles potential

$$V = \frac{1}{2}\left(\omega_1 q_1^2 + \omega_2 q_2^2\right) + \alpha q_1^2 q_2 - \frac{1}{3}\beta q_2^3$$

By substituting this form of $V$ into [17], we get two sets of admissible solutions for the parameters $\alpha$, $\beta$, $\omega_1$, $\omega_2$: (i) $\alpha = -\beta$, $\omega_1 = \omega_2$, with $V$ separable in rotated (by $\pi/4$) Cartesian coordinates; (ii) $\beta = -6\alpha$, $\omega_1$, $\omega_2$ arbitrary, with $V$ separable in shifted parabolic coordinates. In case (ii) eqn [17] becomes

$$2\left(q_2 - \frac{1}{4\alpha}(4\omega_1 - \omega_2)\right)\partial_1\partial_2 V - q_1\left(\partial_2^2 V - \partial_1^2 V\right) + 3\,\partial_1 V = 0$$

and in its characteristic coordinates, defined by $q_1 = \sqrt{\xi\eta}$, $q_2 = (1/2)(\xi - \eta) + (1/4\alpha)(4\omega_1 - \omega_2)$, it takes the form $(\xi + \eta)\,\partial_\xi\partial_\eta V + \partial_\xi V + \partial_\eta V = 0$, solved by $V(\xi, \eta) = (\xi + \eta)^{-1}[f(\xi) + g(\eta)]$, which is separable in the parabolic coordinates.

Effective Criterion of Separability for Arbitrary Dimension

For $n > 2$, a similar theorem characterizing separability in generalized elliptic coordinates has been formulated by Rauch-Wojciechowski (1986).

Theorem 8 (Elliptic Bertrand–Darboux). For a natural Hamiltonian $H = (1/2)p^2 + V(q)$, the following statements are equivalent:

(i) $H$ has $n$ global, functionally independent and involutive integrals of motion, $\{H, K_i\} = 0$, $\{K_i, K_j\} = 0$, $i, j = 1, \ldots, n$, having the form

$$K_i = \sum_{r=1,\, r \ne i}^n (\lambda_i - \lambda_r)^{-1}\, l_{ir}^2 + p_i^2 + k_i(q), \qquad l_{ir} = q_i p_r - q_r p_i \qquad [18]$$
(ii) The potential $V$ satisfies the following system of linear second-order PDEs

$$(\lambda_i - \lambda_j)\,\partial_i\partial_j V - \Im_{ij}(2 + \Re)V = 0, \qquad i, j = 1, \ldots, n, \quad i \ne j \qquad [19]$$

$$\lambda_i\,\partial_i \Im_{jk} V + \lambda_j\,\partial_j \Im_{ki} V + \lambda_k\,\partial_k \Im_{ij} V = 0, \qquad \text{all } i, j, k \text{ different} \qquad [20]$$

where $\Im_{ij} = q_i\partial_j - q_j\partial_i$, $\Re = \sum_{r=1}^n q_r\partial_r$.
(iii) The Hamilton–Jacobi equation for $H$ is separable in the generalized elliptic coordinates [16] with parameters $\lambda_i$.

Remark 9 Equations [19]–[20] follow from the compatibility conditions that mixed derivatives of $k_i(q)$, calculated from the conditions $\{H, K_r\} = 0$, are equal. This leads to an overdetermined system [19]–[20] of PDEs for $V(q)$. Equations [19]–[20] are not linearly independent, but we keep both sets in the formulation of this theorem because eqns [19] give rise to the basic Bertrand–Darboux equations [21] used in the criterion of separability, while eqns [20] give rise to the cyclic Bertrand–Darboux equations [22] used for testing the level of spherical symmetry in the potential.

For testing elliptic separability of any given potential $V(q)$, it is necessary to introduce into eqns [19] and [20] the freedom of choice of the Euclidean reference frame (as described by the Euclidean transformation $\tilde q = A^t(q - b)$, $A \in SO(n)$, $b \in \mathbb{R}^n$). By substituting it into [19]–[20], omitting tildes and summing over one of the indices, we obtain new equations

$$0 = \sum_{k=1}^n \left[\left(\alpha q_i q_k + \beta_i q_k + \beta_k q_i + \gamma_{ik}\right)\partial_k\partial_j V - \left(\alpha q_j q_k + \beta_j q_k + \beta_k q_j + \gamma_{jk}\right)\partial_k\partial_i V\right] + 3\left[(\alpha q_i + \beta_i)\,\partial_j V - (\alpha q_j + \beta_j)\,\partial_i V\right], \qquad i, j = 1, \ldots, n, \quad i \ne j \qquad [21]$$

$$0 = \sum_{l=1}^n \left[\left(\gamma_{il} q_j - \gamma_{jl} q_i\right)\partial_k\partial_l V + \left(\gamma_{jl} q_k - \gamma_{kl} q_j\right)\partial_i\partial_l V + \left(\gamma_{kl} q_i - \gamma_{il} q_k\right)\partial_j\partial_l V\right] \qquad [22]$$

with the new coefficients $\alpha$, $\beta_i$, $\gamma_{ij}$ that are unconstrained even though the orthogonal matrix $A$ satisfies the quadratic algebraic constraint $AA^t = I$.

Theorem 8 provides the following test of elliptic separability for a potential $V(q)$ given in Cartesian coordinates.

1. Insert $V(q)$ into the Bertrand–Darboux equations [21]. This gives a system of linear, homogeneous, algebraic equations for the unknown parameters $\alpha$, $\beta_i$, $\gamma_{ij}$. If $\alpha = 0$, then $V(q)$ is not separable in elliptic coordinates.
2. If $\alpha \ne 0$, set $b = -\alpha^{-1}\beta$, $S = bb^t - \alpha^{-1}\gamma$ and diagonalize $S$: $S = A\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)A^t$. If some eigenvalues $\lambda_i$ coincide, then $V(q)$ is not separable in elliptic coordinates. Otherwise $V(q)$ is separable in the elliptic coordinates $x = (x_1, \ldots, x_n)$ given by

$$1 + \sum_{i=1}^n \frac{\tilde q_i^{\,2}}{z - \lambda_i} \equiv \frac{\prod_j (z - x_j)}{\prod_i (z - \lambda_i)}$$

(compare with [16]), where $q = A\tilde q + b$, with $b$ and $A$ found as above.

If $\alpha = 0$, $\beta \ne 0$, then there exists a similar algorithm for separability in generalized parabolic coordinates, and for $\alpha = 0$, $\beta = 0$, $\gamma \ne tI$, we have separability in Cartesian coordinates if all $\lambda_i$ are different. To give an idea of what happens when degenerations occur, consider the case $\alpha = 0$, $\beta = 0$. Then the Bertrand–Darboux equations [21] are Euclidean equivalent to the canonical form $(\lambda_i - \lambda_j)\,\partial_i\partial_j V = 0$, and if all $\lambda_i$ are different, then the equations $\partial_i\partial_j V = 0$ imply that $V(q)$ is a sum of functions of one variable only: $V(q) = \sum_{i=1}^n V_i(q_i)$.

The main problem is to handle all possible degenerations when certain $\lambda$'s coincide. Let $\lambda_1 = \cdots = \lambda_j < \lambda_{j+1} < \cdots < \lambda_n$, where $1 < j < n$. Then $V(q) = V_j(q_1, \ldots, q_j) + V_{j+1}(q_{j+1}) + \cdots + V_n(q_n)$, which means that the variables $q_{j+1}, \ldots, q_n$ separate off while the potential $V_j(q_1, \ldots, q_j)$ has to be tested again on $\mathbb{R}^j$ with the use of eqns [21]. Degenerations for $\alpha \ne 0$ or $\beta \ne 0$ are more complicated and the cyclic Bertrand–Darboux equations [22] have to be used. They unfold the level of spherical degeneracy of spheres and embedded subspheres. A complete analysis of all possible degenerations is technical. It requires considering all possible degenerations of the sequences $\lambda_1 < \cdots < \lambda_n$ and of the related equations [21]–[22] for the potential $V(q)$.

It has been proved by Waksjö and Rauch-Wojciechowski (2003) that there is a one-to-one correspondence between all possible sets of PDEs [21]–[22] characterizing separable potentials and all possible types of Riemannian metrics (in the Kalnins and Miller (1986) classification of all separable coordinates on $\mathbb{R}^n$ and $S^n$), so that no completely separable case is missed. Most importantly, after at most $n$ steps the separation coordinates are always determined (if they exist) by a sequential use of the Bertrand–Darboux and cyclic Bertrand–Darboux equations [21]–[22].
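As a hedged illustration of the generalized elliptic coordinates [16] that enter the test above, the following Python sketch computes the coordinates $x_j$ of a point with given Cartesian coordinates by bisection on the rational function, and then recovers the values $q_i^2$ as residues at $\lambda_i$. The function names, the choice of bisection brackets and the sample values are assumptions introduced only for this illustration; the sketch also assumes strictly increasing $\lambda_i$ and all $q_i \ne 0$.

```python
from math import prod


def rational_lhs(z, q, lam):
    """Left-hand side of [16]: 1 + sum_i q_i^2 / (z - lam_i)."""
    return 1.0 + sum(qi * qi / (z - li) for qi, li in zip(q, lam))


def elliptic_coordinates(q, lam, iterations=200):
    """Zeros x_1 < lam_1 < x_2 < ... < x_n < lam_n of the function in [16].

    Assumes lam is strictly increasing and every q_i is nonzero
    (restrictions of this sketch, not of the theory).
    """
    # Below lam_1 the function decreases from 1 to -infinity; at this
    # left end point it is still positive.
    left_end = lam[0] - sum(qi * qi for qi in q) - 1.0
    brackets = [(left_end, lam[0])] + list(zip(lam[:-1], lam[1:]))
    roots = []
    for lo, hi in brackets:
        # On each bracket the function is decreasing: positive near lo,
        # negative near hi, so plain bisection applies.
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:
                break  # interval exhausted in floating point
            if rational_lhs(mid, q, lam) > 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots


def squares_from_residues(x, lam):
    """Recover q_i^2 as the residue of the right-hand side of [16] at lam_i."""
    result = []
    for i, li in enumerate(lam):
        numerator = prod(li - xj for xj in x)
        denominator = prod(li - lk for k, lk in enumerate(lam) if k != i)
        result.append(numerator / denominator)
    return result


q = [1.0, 2.0]      # sample Cartesian point (assumed values)
lam = [1.0, 3.0]    # sample parameters lambda_1 < lambda_2 (assumed values)
x = elliptic_coordinates(q, lam)
q_squared = squares_from_residues(x, lam)
```

For this sample the zeros are the roots of $z^2 + z - 4 = 0$, and the residues return $q_1^2 = 1$ and $q_2^2 = 4$ up to rounding, which reflects the statement that the $x_j$ fix the $q_i$ only up to sign.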
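The $n = 2$ criterion of Theorem 5 can also be checked by direct evaluation. The sketch below, in Python with exact rational arithmetic, evaluates the right-hand side of eqn [17] for the Hénon–Heiles potential of Example 7, with its partial derivatives written out by hand, and confirms that it vanishes at a few sample points for the parameter choices of cases (i) and (ii). The function names, the dictionary layout and the sample points are assumptions made for this illustration only; `bt` and `ct` stand for $\tilde b$ and $\tilde c$.

```python
from fractions import Fraction as F


def henon_heiles_derivatives(q1, q2, w1, w2, alpha, beta):
    """Derivatives of V = (w1 q1^2 + w2 q2^2)/2 + alpha q1^2 q2 - beta q2^3 / 3."""
    return {
        "V1": w1 * q1 + 2 * alpha * q1 * q2,
        "V2": w2 * q2 + alpha * q1 ** 2 - beta * q2 ** 2,
        "V11": w1 + 2 * alpha * q2,
        "V22": w2 - 2 * beta * q2,
        "V12": 2 * alpha * q1,
    }


def bertrand_darboux_residual(q1, q2, dv, a, b, bt, c, ct, d):
    """Right-hand side of eqn [17] at the point (q1, q2)."""
    return (
        2 * (a * q2 ** 2 - a * q1 ** 2 + b * q2 - bt * q1 + c - ct) * dv["V12"]
        + (-2 * a * q1 * q2 - b * q1 - bt * q2 + d) * (dv["V22"] - dv["V11"])
        + (6 * a * q2 + 3 * b) * dv["V1"]
        - (6 * a * q1 + 3 * bt) * dv["V2"]
    )


points = [(F(1, 2), F(-3, 7)), (F(2), F(5, 3)), (F(-4, 9), F(1, 11))]
alpha = F(3, 5)

# Case (i): beta = -alpha and w1 = w2; only d is nonzero.
w = F(2)
case_i = [
    bertrand_darboux_residual(
        q1, q2, henon_heiles_derivatives(q1, q2, w, w, alpha, -alpha),
        a=0, b=0, bt=0, c=0, ct=0, d=1)
    for q1, q2 in points
]

# Case (ii): beta = -6 alpha with arbitrary w1, w2; b = 1 together with
# c - ct = -(4 w1 - w2) / (4 alpha) solves the linear system for the constants.
w1, w2 = F(1, 3), F(7, 4)
shift = -(4 * w1 - w2) / (4 * alpha)
case_ii = [
    bertrand_darboux_residual(
        q1, q2, henon_heiles_derivatives(q1, q2, w1, w2, alpha, -6 * alpha),
        a=0, b=1, bt=0, c=shift, ct=0, d=0)
    for q1, q2 in points
]

# The choice alpha = beta = 1, w1 = w2 = 1 does not satisfy [17]
# with the constants of case (i).
generic = [
    bertrand_darboux_residual(
        q1, q2, henon_heiles_derivatives(q1, q2, F(1), F(1), F(1), F(1)),
        a=0, b=0, bt=0, c=0, ct=0, d=1)
    for q1, q2 in points
]
```

A vanishing residual at a handful of points is only evidence, not a proof; the full criterion requires that the coefficient of every monomial in $q_1, q_2$ vanishes, as described in Remark 6.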

Separation of Eigenvalue Problems

Eigenvalue problems (in a given domain $D$) of the form

$$\Delta w(q) + \lambda\rho(q)\,w(q) = 0, \qquad \rho > 0 \qquad [23]$$

(where $\Delta$ is the Laplace operator) arise when separating the wave equation $\rho(q)u_{tt} = \Delta u$ and the diffusion equation $\rho(q)u_t = \Delta u$ (Courant and Hilbert 1989). The multiplicative ansatz $u(q, t) = w(q)g(t)$ yields eqn [23] together with $\ddot g = -\lambda g$ or $\dot g = -\lambda g$. The problem [23] is also used for solving the inhomogeneous equation $\Delta u = f$ with the zero boundary condition $u|_{\partial D} = 0$. In general, the properties of the eigenvalues $\lambda_i$ and of the corresponding eigenfunctions $w_i$ of the problem [23] depend on the regularity requirements for $w_i$ and on the boundary conditions at $\partial D$.

For the zero boundary conditions $w(q)|_{\partial D} = 0$, one seeks a nontrivial ($w \ne 0$) solution having continuous first- and second-order derivatives in the region $D$. General theorems (Courant and Hilbert 1989) state that for such problems there exists a growing sequence $\{\lambda_i\}_{i=1}^\infty$ of positive eigenvalues $\lambda_i$ such that $\lambda_i \to \infty$ as $i$ increases, and that there is a related sequence of normalized eigenfunctions $\sqrt{\rho}\,w_1, \sqrt{\rho}\,w_2, \ldots$ that form a complete weighted-orthogonal (in the sense that $\int_D \rho\, w_i w_j = \delta_{ij}$, $i, j = 1, 2, \ldots$) system of functions, so that every regular initial function $\varphi(q)$ with $\varphi(q)|_{\partial D} = 0$ may be expanded in terms of the eigenfunctions $w_m$ in an absolutely and uniformly convergent series $\varphi(q) = \sum_{m=1}^\infty c_m w_m(q)$ with $c_m = \int_D \rho\,\varphi\, w_m$. This makes it possible to express a solution of the IBVP for the wave or for the diffusion equation with zero boundary conditions:

$$\rho(q)u_{tt} = \Delta u \quad \text{respectively} \quad \rho(q)u_t = \Delta u, \qquad u(q, t)|_{\partial D} = 0, \quad u(q, t = 0) = \varphi(q) \qquad [24]$$

as a convergent infinite series $u(q, t) = \sum_{m=1}^\infty c_m w_m(q) g_m(t)$, where $g_m(t)$ satisfy $\ddot g_m = -\lambda_m g_m$ respectively $\dot g_m = -\lambda_m g_m$. Further determination of properties of the eigenfunctions $w_n$ is possible only in special domains $D$ when the problem [23] can be reduced to one-dimensional eigenvalue problems by separating variables in some suitable coordinates.

Example 10 Consider the spherical domain $r^2 = x^2 + y^2 + z^2 \le 1$. Equation [23] with $\rho = 1$ attains in the spherical coordinates $(r, \varphi, \theta)$ the form

$$\Delta w + \lambda w \equiv \frac{1}{r^2\sin\theta}\left[\partial_r\!\left(r^2\sin\theta\,\partial_r w\right) + \partial_\varphi\!\left(\frac{1}{\sin\theta}\,\partial_\varphi w\right) + \partial_\theta\!\left(\sin\theta\,\partial_\theta w\right)\right] + \lambda w = 0$$

The ansatz $w = f(r)Y(\theta, \varphi)$ gives the separated equation

$$\frac{(r^2 f')' + \lambda r^2 f}{f} = -\frac{1}{Y\sin\theta}\left[\partial_\varphi\!\left(\frac{1}{\sin\theta}\,\partial_\varphi Y\right) + \partial_\theta\!\left(\sin\theta\,\partial_\theta Y\right)\right]$$

so that both of its sides must be equal to a constant $\mu$. Continuity of $Y$ implies that it has to be periodic in $\varphi$ (with period $2\pi$) and regular at $\theta = 0$, $\theta = \pi$. This can only be satisfied for $\mu = n(n + 1)$. The left-hand side of the above equation then yields $(r^2 f')' - n(n + 1)f + \lambda r^2 f = 0$. Solutions that are regular at $r = 0$ are the Bessel functions $(1/\sqrt{r})\,J_{n+(1/2)}(\sqrt{\lambda}\, r)$. The equation for spherical harmonics

$$\frac{1}{\sin\theta}\left[\partial_\varphi\!\left(\frac{1}{\sin\theta}\,\partial_\varphi Y\right) + \partial_\theta\!\left(\sin\theta\,\partial_\theta Y\right)\right] + n(n + 1)Y = 0$$

can be further multiplicatively separated by assuming $Y = \Theta(\theta)\Phi(\varphi)$. The function $P(z = \cos\theta) = \Theta(\theta)$ then satisfies the Legendre equation

$$\left(\left(1 - z^2\right)P'(z)\right)' + \left(n(n + 1) - \frac{\nu}{1 - z^2}\right)P(z) = 0$$

$P(z)$ is regular at $z = \pm 1$ only when $\nu = k^2$, $k = 0, 1, 2, \ldots$. The function $\Phi(\varphi)$ then satisfies $\Phi'' = -k^2\Phi$ with solutions $\Phi_k(\varphi) = a_k\cos(k\varphi) + b_k\sin(k\varphi)$. The full solution of the eigenvalue problem $\Delta w + \lambda w = 0$ has the form of an infinite series

$$w_m(r, \varphi, \theta) = \sum_{n=0}^\infty \frac{1}{\sqrt{r}}\,J_{n+(1/2)}\!\left(\sqrt{\lambda_{m,n}}\; r\right)\left[a_{n,0}P_n(\cos\theta) + \sum_{k=1}^n \left(a_{n,k}\cos(k\varphi) + b_{n,k}\sin(k\varphi)\right)P_{n,k}(\cos\theta)\right]$$

where the constants $\lambda_{m,n}$, $m = 1, 2, \ldots$, are determined by the transcendental equation $J_{n+(1/2)}(\sqrt{\lambda}) = 0$ that follows from the boundary condition $u(q, t)|_{\partial D} = 0$.

Almost all BVPs that can be reduced to one-dimensional eigenvalue problems may be considered as a special or limiting case of the Lamé problem, where the boundary $\partial D$ is given by pieces of confocal quadrics corresponding to some separation coordinates. If $D = \{q(x) \in \mathbb{R}^3 : x_i^0 \le x_i \le x_i^1,\ i = 1, 2, 3\}$ is a domain defined by parametrizing $q$ with the elliptic coordinates $x_i$ given by [16], then the eigenvalue problem $\Delta w + \lambda w = 0$ splits into three one-dimensional equations of the form

$$\varphi(s)Y''(s) + \frac{1}{2}\varphi'(s)Y'(s) + (\lambda s + \mu)Y(s) = 0$$
where $\varphi(s) = 4(s - e_1)(s - e_2)(s - e_3)$ and $e_i$ are parameters of the elliptic coordinates. This is the Lamé equation; its solutions define new transcendental functions that depend on the choice of the constants $\lambda$, $\mu$.

The approach presented here extends to diverse modifications such as vibrations with a forcing term, $\Delta w(q) + \lambda w(q) = f(q)$, vibrations of a nonhomogeneous medium, $\Delta w(q) + \lambda\rho(q)w(q) = 0$, and the stationary Schrödinger equation $\Delta w(q) + V(q)w(q) = \lambda w(q)$, whenever the functions $\rho(q)$, $f(q)$, $V(q)$ are compatible with the separation coordinates.

Separation equations for the second-order BVP are the source of one-dimensional eigenvalue problems of the Sturm–Liouville type

$$\left(p(s)u'\right)' - q(s)u + \lambda\rho(s)u = 0$$

with singularities that may occur at the endpoints of the fundamental domain. The majority of orthogonal polynomials and special functions appearing in mathematical physics are solutions of Sturm–Liouville problems.

In the complex domain the study of singularities of Laurent series solutions of the same equations led to the development of the theory of linear ODEs with singular points of the Fuchs class and the Böcher class.

Constructive Approach to Separability of Liouville Integrable Systems

In the constructive approach to separability, one considers simultaneously all Hamilton–Jacobi equations following from a set of $n$ functionally independent, commuting integrals $H_1(x, y), \ldots, H_n(x, y)$, $\{H_i, H_j\} = 0$, that define a Liouville integrable system (Sklyanin 1995).

One starts with the separation equations, a set of $n$ decoupled ODEs for the functions $W_i(x_i, \alpha)$ depending on one variable $x_i$ and a parameter $\alpha \in \mathbb{R}^n$:

$$f_i\!\left(x_i,\ y_i = \frac{\partial W_i(x_i, \alpha)}{\partial x_i},\ \alpha\right) = 0 \qquad [25]$$

Assume that the dependence on $\alpha_i$ is essential (i.e., that $\det(\partial f_i/\partial\alpha_j) \ne 0$) so that we can resolve eqns [25] w.r.t. $\alpha_i$, giving $\alpha_i = H_i(x, y)$ for some functions $H_i$. If the functions $W_i(x_i, \alpha)$ solve [25] identically w.r.t. $x$ and $\alpha$, then the function $W(x, \alpha) = \sum_{i=1}^n W_i(x_i, \alpha)$ is simultaneously an additively separable solution of eqns [25] and of the equations

$$\alpha_i = H_i\!\left(x,\ y = \frac{\partial W(x, \alpha)}{\partial x}\right), \qquad i = 1, \ldots, n \qquad [26]$$

since solving [25] w.r.t. $\alpha$ is a purely algebraic operation. We can treat eqns [26] as a set of simultaneously separable (in the canonical variables $(x, y)$) Hamilton–Jacobi equations related to the Hamiltonians $H_i$. Assume now that

$$\det\!\left(\frac{\partial^2 W}{\partial x_i\,\partial\alpha_j}\right) = \det\!\left(\frac{\partial^2 W_i}{\partial x_i\,\partial\alpha_j}\right) \ne 0$$

i.e., that $W$ is a complete integral for [26]. Then the Hamiltonians $H_i(x, y) = \alpha_i$ Poisson-commute since the $\alpha_i$ can be treated as new canonical variables obtained by the canonical transformation $(x, y) \to (\alpha, \beta)$ given by

$$y = \frac{\partial W(x, \alpha)}{\partial x}, \qquad \beta = \frac{\partial W(x, \alpha)}{\partial \alpha}$$

Thus, any set of separation relations [25] that is solvable w.r.t. $\alpha$ defines a Liouville integrable system.

If we perform a canonical transformation from $(x, y)$ to new variables $(q, p)$, then the new set of commuting Hamiltonians $\tilde H_i(q, p) = H_i(x(q, p), y(q, p))$ is also called separable.

The main problem for any given set of commuting Hamiltonians $\tilde H_i(q, p)$ is to decide if there exists a canonical transformation $(q, p) \to (x, y)$ to the separation variables $(x, y)$ so that the related Hamilton–Jacobi equations [26] are simultaneously separable. An answer to this problem is known for integrable Hamiltonians solvable through the spectral curve method (Sklyanin 1995) and for the whole class of natural Hamiltonians discussed earlier.

This approach brings a new, wider perspective to the classical separability mechanism stated in the Stäckel theorem. It covers the majority of all known separable Hamiltonian systems. For example, if we specify the separation relations [25] to be affine in $\alpha_i$,

$$\sum_{k=1}^n f_{ik}(x_i, y_i)\,\alpha_k = g_i(x_i, y_i), \qquad i = 1, \ldots, n \qquad [27]$$

then [27] are called generalized Stäckel separability conditions. To recover the explicit form of the Hamiltonians $H_k = \alpha_k$, it is enough to solve relations [27] w.r.t. $\alpha_k$. It has been proved that the Stäckel Hamiltonians in [27] constitute a quasi-bi-Hamiltonian chain. If we specify relations [27] further by assuming that the functions $f_{ik}$ do not depend on $y_i$ and the functions $g_i$ are quadratic in $y_i$, then we obtain the classical Stäckel separability conditions (see Theorem 1)

$$\sum_{k=1}^n f_{ik}(x_i)\,\alpha_k = \frac{1}{2}\,g_i(x_i)\,y_i^2 + h_i(x_i) \qquad [28]$$
that can be solved for $\alpha_k$ yielding

$$\alpha_k(x, y) = \frac{1}{2}\sum_{i=1}^n \left(\Phi^{-1}\right)_{ki}\left(y_i^2 + \frac{2h_i(x_i)}{g_i(x_i)}\right)$$

that is, the Stäckel Hamiltonians [14] with the Stäckel matrix $\Phi = [\varphi_{ik}]$, where $\varphi_{ik} = f_{ik}(x_i)/g_i(x_i)$. By specifying [28] further, we obtain separation relations

$$x_i^{n-1}\alpha_1 + x_i^{n-2}\alpha_2 + \cdots + \alpha_n = \frac{1}{2}\,g(x_i)\,y_i^2 + h(x_i)$$

which give the so-called Benenti systems associated with conformal Killing tensors and cofactor pair systems.

Relations [27], with $g_i(x_i, y_i)$ depending exponentially on the momenta $y$, contain several well-known systems such as the periodic Toda lattice, the KdV dressing chain, and the Ruijsenaars–Schneider system. Relations with $g_i$ cubic in the momenta $y$ yield stationary flows of the Boussinesq hierarchy and integrable systems on the loop algebra $sl(3)$.

See also: Boundary-Value Problems for Integrable Equations; Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Elliptic Differential Equations: Linear Theory; Evolution Equations: Linear and Nonlinear; Integrable Systems: Overview; Multi-Hamiltonian Systems; Ordinary Special Functions; Recursion Operators in Classical Mechanics; Toda Lattices.

Further Reading

Benenti S (1997) Intrinsic characterization of the variable separation in the Hamilton–Jacobi equation. Journal of Mathematical Physics 38(12): 6578–6602.
Courant R and Hilbert D (1989) Methods of Mathematical Physics. vol. II. Partial Differential Equations, Wiley Classics Library. A Wiley-Interscience Publication. New York: Wiley.
Eisenhart LP (1934) Separable systems of Stäckel. Annals of Mathematics 35(2): 284–305.
Fourier J (1945) The Analytical Theory of Heat. New York: G. E. Stechert and Co.
Jacobi CGJ (1884) Vorlesungen über Dynamik. Herausgegeben A. Clebsch., ch. 26. Berlin: Verlag von G. Reimer.
Kalnins EG and Miller W Jr. (1980) Killing tensors and variable separation for Hamilton–Jacobi and Helmholtz equations. SIAM Journal on Mathematical Analysis 11(6): 1011–1026.
Kalnins EG and Miller W Jr. (1986) Separation of variables on n-dimensional Riemannian manifolds. I. The n-sphere $S^n$ and Euclidean n-space $\mathbb{R}^n$. Journal of Mathematical Physics 27(7): 1721–1736.
Landau LD and Lifshitz EM (1976) Course of Theoretical Physics. vol. 1. Mechanics, Third Edition. Translated from the Russian by JB Sykes and JS Bell. Oxford–New York–Toronto, Ont: Pergamon Press.
Levi-Civita T (1904) Sulla integrazione della equazione di Hamilton-Jacobi per separazione di variabili. Mathematische Annalen 59: 383–397.
Miller W Jr. (1983) The technique of variable separation for partial differential equations. In: Wolf BK (ed.) Nonlinear Phenomena (Oaxtepec, 1982), Lecture Notes in Physics, vol. 189, pp. 184–208. Berlin: Springer.
Sklyanin EK (1995) Separation of variables. New trends. Progress in Theoretical Physics 118: 35–60.
Stäckel P (1897) Über die Integration der Hamilton'schen Differentialgleichung mittelst Separation der Variabeln. Mathematische Annalen 49(1): 145–147.
Waksjö C and Rauch-Wojciechowski S (2003) How to find separation coordinates for the Hamilton–Jacobi equation: a criterion of separability for natural Hamiltonian systems. Mathematical Physics, Analysis and Geometry 6(4): 301–348.
Wojciechowski S (1986) Review of the recent results on integrability of natural Hamiltonian systems. In: Winternitz P (ed.) Systèmes dynamiques non linéaires: intégrabilité et comportement qualitatif, Sém. Math. Sup, vol. 102, pp. 294–327. Montreal, QC: Presses Univ. Montréal.

Separatrix Splitting
D Treschev, Moscow State University, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

Separatrices are asymptotic manifolds in dynamical systems. However, this term is applied usually in the case of a small dimension of the phase space, where these manifolds are hypersurfaces. In the context of separatrix splitting, manifolds asymptotic to hyperbolic tori are usually considered, where tori of dimension 0 and 1 are called equilibrium positions and periodic trajectories, respectively. A separatrix can be stable (asymptotic as $t \to +\infty$) and unstable (asymptotic as $t \to -\infty$).

In this article we consider the case of systems with finite-dimensional phase space. Basically we deal with nonautonomous Hamiltonian systems $2\pi$-periodic in time. However, it is useful to keep in mind the fact that the cases of autonomous Hamiltonian systems and symplectic maps are dynamically the same. Some results for non-Hamiltonian perturbations will also be presented. Hamiltonian systems with one-and-a-half or two degrees of freedom as well as area-preserving two-dimensional maps are especially important for us because the results on the separatrix splitting in this case are more clear and complete. Dynamics in such systems is essentially the same. Below we call these systems two dimensional.

We assume that all systems are at least $C^1$-smooth.

Poincaré Integral

Consider a Liouville integrable Hamiltonian system. Then any separatrix either goes to infinity or joins two hyperbolic tori. From a dynamical point of view, the latter case is more interesting. If these tori are different, the situation is called heteroclinic, otherwise homoclinic. Poincaré was the first to notice that after a generic perturbation stable and unstable separatrices become different submanifolds of the phase space. This phenomenon is called the separatrix splitting.

Poincaré (1987) considered perturbations of separatrices homoclinic to a periodic solution in a Hamiltonian system with one-and-a-half degrees of freedom. In this case the system has the form

\[ \dot x = \frac{\partial H}{\partial y}, \qquad \dot y = -\frac{\partial H}{\partial x}, \qquad (x, y) \in D \subset \mathbb{R}^2 \qquad [1] \]

where $D$ is an open domain and

\[ H(x, y, t, \varepsilon) = H_0(x, y) + \varepsilon H_1(x, y, t) + O(\varepsilon^2) \qquad [2] \]

We assume that $H$ is $2\pi$-periodic in $t$ and $\varepsilon$ is a small parameter. Let $(x_0, y_0) \in D$ be an equilibrium position for the unperturbed ($\varepsilon = 0$) system: $\operatorname{grad} H_0(x_0, y_0) = 0$. Without loss of generality, $(x_0, y_0) = 0$. In the extended phase space $D \times \mathbb{T}$ ($\mathbb{T} = \{t \bmod 2\pi\}$ is a one-dimensional torus) instead of the equilibrium we have a $2\pi$-periodic solution $0 \times \mathbb{T}$. Suppose that the equilibrium (and therefore, the periodic solution) is hyperbolic and the corresponding stable and unstable separatrices $\Lambda^s$, $\Lambda^u$ are doubled: $\Lambda^s = \Lambda^u = \Lambda$. Let $\gamma(t)$ be a natural parametrization of $\Lambda$, that is, $(x(t), y(t)) = \gamma(t)$ is a solution of eqns [1]. In the extended phase space, we have the asymptotic surface

\[ (\gamma(t + \tau), t), \qquad t \in \mathbb{T}, \quad \tau \in \mathbb{R} \]

For small values of $\varepsilon$, the perturbed system has a hyperbolic periodic solution $(\sigma_\varepsilon(t), t)$, $\sigma_\varepsilon(t) = O(\varepsilon) \in D$ and the separatrices

\[ (\gamma_\varepsilon^{s,u}(t, \tau), t), \qquad \gamma_0^{s,u}(t, \tau) = \gamma(t + \tau) \]

Since the addition to the Hamiltonian of a function, depending only on $t$ and $\varepsilon$, does not change the dynamics, without loss of generality we can assume that $H_1(0, 0, t) \equiv 0$. Hence the Poincaré integral

\[ P(\tau) = \int_{-\infty}^{+\infty} H_1(\gamma(t + \tau), t)\, dt \]

converges. The function $P$ carries all information on the separatrix splitting in the first approximation in $\varepsilon$.

Periodicity of $H_1$ in $t$ implies $2\pi$-periodicity of $P(\tau)$. There is also the following obvious identity:

\[ \frac{dP(\tau)}{d\tau} = \int_{-\infty}^{+\infty} \{H_0, H_1\}(\gamma(t + \tau), t)\, dt \]

where $\{\,,\}$ is the Poisson bracket.

Melnikov Integral

Melnikov (1963) considered general (not necessarily Hamiltonian) $2\pi$-periodic in $t$ perturbations:

\[ \dot x = \frac{\partial H_0}{\partial y} + \varepsilon v_1(x, y, t) + O(\varepsilon^2) \]
\[ \dot y = -\frac{\partial H_0}{\partial x} + \varepsilon v_2(x, y, t) + O(\varepsilon^2) \]

In this case, information on the separatrix splitting in the first approximation is contained in the Melnikov integral

\[ M(\tau) = \int_{-\infty}^{+\infty} \partial_v H_0(\gamma(t + \tau), t)\, dt \]

where $\partial_v H_0 = v_1\, \partial H_0/\partial x + v_2\, \partial H_0/\partial y$.

Note that if the vector field $v$ is Hamiltonian and $H_1$ is the corresponding Hamiltonian function, we have: $\partial_v H_0 = \{H_0, H_1\}$. Hence in Hamiltonian systems we have: $M(\tau) = dP(\tau)/d\tau$.

A multidimensional version of the Melnikov integral is presented in Wiggins (1988).

Geometric Meaning of $M(\tau)$

Let $\Lambda_T$ be a compact piece of the unperturbed separatrix

\[ \Lambda_T = \{(x, y) \in D : (x, y) = \gamma(t),\ |t| \le T\} \]

Then for any $T > 0$ there exists a neighborhood $U$ of $\Lambda_T$ and symplectic coordinates (time–energy coordinates) $\tau$, $h$ on $U$ such that the section of the perturbed separatrices $\Lambda_\varepsilon^{s,u}$ by the plane $\{t = 0\}$ is as follows:

\[ \Lambda_\varepsilon^{s,u}\big|_{t=0} = \{(\tau, h) : h = h_\varepsilon^{s,u}(\tau)\} \]

where
1. $h_\varepsilon^u(\tau) = O(\varepsilon^2)$,
2. $h_\varepsilon^s(\tau) = \varepsilon M(\tau) + O(\varepsilon^2)$.

Moreover, let $g_\varepsilon^t : D \to D$ be the phase flow of the perturbed system. The map $g_\varepsilon^{2\pi}$ is called the Poincaré map. The following statement holds.
3. For any two points $z_0, z_1 \in U$ such that $z_1 = g_\varepsilon^{2\pi}(z_0)$, let $(\tau_0, h_0)$ and $(\tau_1, h_1)$ be their time–energy coordinates. Then

\[ \tau_1 = \tau_0 + 2\pi + O(\varepsilon), \qquad h_1 = h_0 + O(\varepsilon) \]
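The Poincaré and Melnikov integrals defined above can be checked numerically on a concrete system. The following is a minimal sketch, not a definitive implementation, for the pendulum Hamiltonian [3] of this article with the perturbation $\theta(t) = \cos t$, for which the unperturbed separatrix satisfies $\cos x(t) - 1 = -2\cosh^{-2}(\Omega t)$. It approximates $P(\tau)$ by the trapezoid rule, compares it with the closed form quoted in the article, and checks $M(\tau) = dP(\tau)/d\tau$ by a finite difference. The function names, the truncation window `t_max`, the number of steps `n`, and the difference step `d` are assumptions made for this illustration only.

```python
import math


def sech2(s):
    """Return 1 / cosh(s)^2, with a cutoff that avoids overflow in cosh."""
    if abs(s) > 300.0:
        return 0.0
    c = math.cosh(s)
    return 1.0 / (c * c)


def poincare_integral(tau, omega=1.0, t_max=60.0, n=20000):
    """Trapezoid-rule approximation of
    P(tau) = int theta(t) * (cos x(t + tau) - 1) dt  with  theta(t) = cos t,
    using cos x(t) - 1 = -2 / cosh^2(omega * t) on the unperturbed separatrix.
    The window [-t_max, t_max] and step count n are illustrative choices."""
    h = 2.0 * t_max / n
    total = 0.0
    for i in range(n + 1):
        t = -t_max + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.cos(t) * (-2.0 * sech2(omega * (t + tau)))
    return total * h


def poincare_closed_form(tau, omega=1.0):
    """Closed form of P(tau) quoted in the article for theta(t) = cos t."""
    return -2.0 * math.pi * math.cos(tau) / (
        omega ** 2 * math.sinh(math.pi / (2.0 * omega)))


def melnikov_closed_form(tau, omega=1.0):
    """M(tau) = dP/dtau, obtained by differentiating the closed form."""
    return 2.0 * math.pi * math.sin(tau) / (
        omega ** 2 * math.sinh(math.pi / (2.0 * omega)))


def melnikov_numeric(tau, omega=1.0, d=1e-3):
    """Central finite difference of the numerically integrated P."""
    return (poincare_integral(tau + d, omega)
            - poincare_integral(tau - d, omega)) / (2.0 * d)


if __name__ == "__main__":
    for tau in (0.0, 0.7, 2.0):
        print(tau,
              poincare_integral(tau), poincare_closed_form(tau),
              melnikov_numeric(tau), melnikov_closed_form(tau))
```

In this sketch the simple zeros of $M$ are $\tau = 0$ and $\tau = \pi$, so the first-order lobe area is $\varepsilon\,(P(\pi) - P(0))$, which can be evaluated with `poincare_integral` and compared with the closed-form area stated in the article.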

Figure 1 Perturbed separatrices in time–energy coordinates. (Labels in the figure: axes $\tau$ and $h$; curves $\Lambda^s$ and $\Lambda^u$; Lobe; $\tau_*$.)

Existence of such coordinates has several corollaries.

If $P$ is not identically constant, the separatrices split and this splitting is of the first order in $\varepsilon$.

Let $\tau_*$ be a simple zero of $M$. Then the perturbed separatrices intersect transversally at a point $z_*(\varepsilon)$ with time–energy coordinates $(\tau_* + O(\varepsilon), O(\varepsilon^2), t = 0)$. Such a point $z_*(\varepsilon)$ is called a transversal homoclinic point. It generates a doubly asymptotic solution in the perturbed system.

Consider a lobe domain $L(\tau_*, \varepsilon)$ bounded by two segments of separatrices on the section $\{t = 0\}$ (see Figure 1). Let another ‘‘corner point’’ of the lobe $L(\tau_*, \varepsilon)$ correspond to the simple zero $\tau_*'$ of $M$. Then the symplectic area of $L(\tau_*, \varepsilon)$ equals

\[ A_L(\tau_*, \varepsilon) = \varepsilon \int_{\tau_*}^{\tau_*'} M(\tau)\, d\tau + O(\varepsilon^2) \]

A Standard Example

Consider as an example a pendulum with periodically oscillating suspension point. The Hamiltonian of the system can be presented in the form

\[ H(x, y, t, \varepsilon) = \tfrac{1}{2} y^2 + \Omega^2 \cos x + \varepsilon\, \theta(t) \cos x \qquad [3] \]

where $\Omega$ is the ‘‘internal’’ frequency of the pendulum. The function $\theta$ is $2\pi$-periodic in time. Hence the frequency of the suspension point oscillation equals 1. In this case, the unperturbed homoclinic solution $\gamma(t)$ can be computed explicitly. In particular,

\[ \cos(x(t)) = 1 - 2 \cosh^{-2}(\Omega t) \]

Therefore, $P(\tau) = \int_{-\infty}^{+\infty} \theta(t)\,(\cos(x(t + \tau)) - 1)\, dt$. For example, if $\theta(t) = \cos t$, we have

\[ P(\tau) = -\frac{2\pi \cos \tau}{\Omega^2 \sinh(\pi/(2\Omega))} \]

In this case, different lobes have the same area

\[ A_L = \frac{4\pi\varepsilon}{\Omega^2 \sinh(\pi/(2\Omega))} + O(\varepsilon^2) \]

Multidimensional Case

Multidimensional generalization of the Poincaré–Melnikov construction is strongly connected to the concept of a (partially) hyperbolic torus. Let $(M, \omega, H)$ be a Hamiltonian system on the $2m$-dimensional symplectic manifold $(M, \omega)$.

An invariant $n$-torus $N \subset M$ ($0 \le n < m$) is called hyperbolic if there exist coordinates $x, y, z$ on $M$ in a neighborhood of $N$ such that
1. $y = (y_1, \ldots, y_n)$, $x = (x_1, \ldots, x_n) \bmod 2\pi$, $z = (z^s, z^u)$, $z^{s,u} = (z_1^{s,u}, \ldots, z_l^{s,u})$, $l + n = m$;
2. $\omega = dy \wedge dx + dz^u \wedge dz^s$;
3. $N = \{(x, y, z) : y = 0,\ z = 0\}$; and
4. $H = \langle \nu, y \rangle + (1/2)\langle A y, y \rangle + \langle z^u, \Omega(x) z^s \rangle + O_3(y, z)$,
where $\nu \in \mathbb{R}^n$ is a constant vector, $A$ is a constant $n \times n$ matrix, $\Omega$ is an $l \times l$ matrix such that $\Omega(x) + \Omega^T(x)$ is positive definite for any $x \bmod 2\pi$, the symbol $O_3$ denotes terms of order not less than 3, and $\langle a, b \rangle = \sum a_j b_j$.

If $\det A \ne 0$, the torus is called nondegenerate. If $\nu$ is Diophantine, that is, for some $\alpha, \beta > 0$ and any $0 \ne k \in \mathbb{Z}^n$

\[ |\langle \nu, k \rangle| \ge \alpha |k|^{-\beta} \]

the torus $N$ is called Diophantine. The coordinates $(x, y, z)$ are called canonical for $N$.

Now suppose that the Hamiltonian $H$ depends smoothly on the parameter $\varepsilon$:

\[ H = H_0 + \varepsilon H_1 + O(\varepsilon^2) \]

and for $\varepsilon = 0$ the system is Liouville integrable with the commuting first integrals $F_1, \ldots, F_m$:

\[ \{F_j, F_k\} = 0, \qquad 1 \le j, k \le m \]

Let $M_0 = \{F_1 = \cdots = F_m = 0\} \subset M$ be their zero common level and let $N \subset M$ be an $n$-dimensional nondegenerate Diophantine hyperbolic torus. The torus $N$ generates the invariant Lagrangian asymptotic manifolds $\Lambda^{s,u} \subset M$. Suppose that the separatrices are doubled, that is, there is a Lagrangian manifold $\Lambda \subset \Lambda^s \cap \Lambda^u$.

Consider the perturbed Hamiltonian $H = H_0 + \varepsilon H_1 + O(\varepsilon^2)$. The torus $N$ as well as the asymptotic manifolds $\Lambda^{s,u}$ survive the perturbation. Let $N_\varepsilon$ be the corresponding hyperbolic torus in the perturbed system and $\Lambda_\varepsilon^{s,u}$ its asymptotic manifolds: $N_\varepsilon$ and $\Lambda_\varepsilon^{s,u}$ depend smoothly on $\varepsilon$ and $N_0 = N$, $\Lambda_0^{s,u} = \Lambda^{s,u}$.

Let the function $\chi(x)$ satisfy the equation

\[ \langle \nu, \partial\chi(x)/\partial x \rangle + H_1(x, 0, 0) = \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} H_1(x, 0, 0)\, dx \]
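The equation above is linear in the unknown function and can be solved mode by mode when $H_1(x, 0, 0)$ is a trigonometric polynomial: writing $H_1(x,0,0) = \sum_k h_k e^{i\langle k, x\rangle}$, each nonzero mode of the solution is $i h_k / \langle \nu, k \rangle$, while the zero mode is left free. The following minimal sketch illustrates this under the stated assumption of finitely many Fourier modes; the dictionary representation of the coefficients and all function names are hypothetical choices for this example, not part of the article. The divisors $\langle \nu, k \rangle$ in the code are the quantities that the Diophantine condition bounds away from zero.

```python
import cmath
import math


def inner(a, b):
    """Euclidean pairing <a, b> of two equal-length sequences."""
    return sum(aj * bj for aj, bj in zip(a, b))


def solve_homological(h_coeffs, nu):
    """Solve <nu, d chi/dx> + H1 = mean(H1) for the nonzero Fourier modes.

    h_coeffs maps integer tuples k to the coefficient h_k of exp(i<k, x>).
    The zero mode of chi is left out: the solution is unique only up to
    an additive constant."""
    chi = {}
    for k, h_k in h_coeffs.items():
        if not any(k):
            continue  # the mean value of H1 is removed by the right-hand side
        divisor = inner(nu, k)
        if divisor == 0:
            raise ZeroDivisionError(f"resonant mode k={k}: <nu, k> = 0")
        chi[k] = 1j * h_k / divisor
    return chi


def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial with the given coefficients at x."""
    return sum(c * cmath.exp(1j * inner(k, x)) for k, c in coeffs.items())


def derivative_along_nu(coeffs, nu, x):
    """Evaluate <nu, d/dx> applied to the trigonometric polynomial at x."""
    return sum(1j * inner(nu, k) * c * cmath.exp(1j * inner(k, x))
               for k, c in coeffs.items())


def residual(h_coeffs, nu, x):
    """Left-hand side minus right-hand side of the equation at the point x."""
    chi = solve_homological(h_coeffs, nu)
    mean = h_coeffs.get((0,) * len(nu), 0.0)
    return derivative_along_nu(chi, nu, x) + evaluate(h_coeffs, x) - mean


if __name__ == "__main__":
    nu = (1.0, math.sqrt(2.0))
    h = {(0, 0): 0.5, (1, 0): 0.25, (-1, 0): 0.25, (1, -1): 0.1j, (-1, 1): -0.1j}
    print(abs(residual(h, nu, (0.3, 1.7))))
```

For a rationally dependent frequency vector some divisor vanishes and the sketch raises an error, which is the simplest instance of the small-divisor problem.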

This equation has a smooth solution unique up to an additive constant.

Consider a solution of the unperturbed Hamiltonian equations $\gamma(t) \subset \Lambda$. Let $I_j$, $I_{j,l}$, $1 \le j, l \le m$ be the following quantities (Treschev 1994):

\[ I_j = \lim_{T \to +\infty} \left( -\int_{-T}^{T} \{F_j, H_1\}(\gamma(t))\, dt + \{F_j, \chi\}(\gamma(-T)) - \{F_j, \chi\}(\gamma(T)) \right) \]

\[ I_{j,l} = \lim_{T \to +\infty} \left( -\int_{-T}^{T} \{F_j, \{F_l, H_1\}\}(\gamma(t))\, dt + \{F_j, \{F_l, \chi\}\}(\gamma(-T)) - \{F_j, \{F_l, \chi\}\}(\gamma(T)) \right) \]

The numbers $I_j$, $I_{j,l}$ play the role of the first and second derivatives of the Poincaré integral at some point.

If any of the quantities $I_j$, $I_{j,l}$ does not vanish, the asymptotic manifolds $\Lambda_\varepsilon^{s,u}$ split. Moreover, suppose that $I_1 = \cdots = I_m = 0$ and the rank of the matrix $(I_{j,l})$ equals $m - 1$. Then for small values of $\varepsilon$, the manifolds $\Lambda_\varepsilon^s$ and $\Lambda_\varepsilon^u$ intersect transversally on the energy level at points of the solution $\gamma_\varepsilon(t)$, where $\gamma_\varepsilon \to \gamma$ as $\varepsilon \to 0$.

Poincaré Integral in Multidimensional Case

Suppose that the Hamiltonian from the previous section equals

\[ H(x, y, u, v, t, \varepsilon) = H_0(y, u, v) + \varepsilon H_1(x, y, u, v, t) + O(\varepsilon^2) \]

Here $x = (x_1, \ldots, x_n) \bmod 2\pi$, $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$, and $(u, v) \in \mathbb{R}^2$. The symplectic structure $\omega = dy \wedge dx + dv \wedge du$.

We assume that in the unperturbed integrable system the variables separate:

\[ H_0(y, u, v) = F(y) + f(u, v) \]

and the system with one degree of freedom and Hamiltonian $f$ has a hyperbolic equilibrium $(u, v) = 0$ with a homoclinic solution $\gamma(t)$. Any torus

\[ N(y^0) = \{(x, y, u, v, t) : y = y^0,\ u = v = 0\} \]

is a hyperbolic torus of the unperturbed system with frequency vector

\[ \begin{pmatrix} \nu(y^0) \\ 1 \end{pmatrix}, \qquad \nu(y) = \partial F/\partial y \]

Suppose that $N = N(0)$ is Diophantine and nondegenerate. Then in the perturbed system there is smooth in $\varepsilon$ hyperbolic torus $N_\varepsilon$, $N_0 = N$. Consider the Poincaré function

\[ P(\xi, \tau) = \int_{-\infty}^{+\infty} \Big( H_1(\xi + \nu(t + \tau), 0, \gamma(t + \tau), t) - H_1(\xi + \nu(t + \tau), 0, 0, 0, t) \Big)\, dt \]

Obviously, $P(\xi, \tau)$ is $2\pi$-periodic in $\xi$ and $\tau$.

If $P$ is not identically constant, asymptotic surfaces of $N_\varepsilon$ split in the first approximation in $\varepsilon$. Nondegenerate critical points of $P$ correspond to transversal homoclinic solutions of the perturbed system.

Other results on the splitting of multidimensional asymptotic manifolds are presented in Arnol’d et al. (1988) and Lochak et al. (2003).

Exponentially Small Separatrix Splitting

If in the unperturbed (integrable) system there are no asymptotic manifolds, they can appear after a perturbation. Consider, for example, perturbation of a real-analytic Liouville integrable system near a simple resonance:

\[ \dot x = \frac{\partial H}{\partial y}, \qquad \dot y = -\frac{\partial H}{\partial x}, \qquad x \in \mathbb{T}^m, \quad y \in D \subset \mathbb{R}^m \]
\[ H(x, y, t, \varepsilon) = H_0(y) + \varepsilon H_1(x, y, t, \varepsilon) \]

As usual, we assume $2\pi$-periodicity in $t$. A simple resonance corresponds to a value of the action variable $y = y^0$ such that the frequency vector

\[ \hat\nu = \begin{pmatrix} \nu^0 \\ 1 \end{pmatrix}, \qquad \nu^0 = \frac{\partial H_0}{\partial y}(y^0) \in \mathbb{R}^m \]

(here 1 is the frequency, corresponding to the time variable) admits only one resonance. More precisely, there exists a nonzero $\hat k \in \mathbb{Z}^{m+1}$, satisfying $\langle \hat k, \hat\nu \rangle = 0$ and any $k \in \mathbb{Z}^{m+1}$ such that $\langle k, \hat\nu \rangle = 0$ is collinear with $\hat k$.

Without loss of generality, we can assume that $y^0 = 0$ and $\nu^0 = (0, \tilde\nu)^T$, $\tilde\nu \in \mathbb{R}^{m-1}$. Then the vector $\nu = (\tilde\nu, 1)^T \in \mathbb{R}^m$ is nonresonant.

In a $\sqrt{\varepsilon}$-neighborhood of the resonance we have a system with fast variables $X = (x_2, \ldots, x_m, t) \bmod 2\pi$ and slow variables $Y = (x_1, \varepsilon^{-1/2} y_1, \ldots, \varepsilon^{-1/2} y_m)$:

\[ \dot Y = O(\sqrt{\varepsilon}), \qquad \dot X = \nu + O(\sqrt{\varepsilon}) \qquad [4] \]

If the frequency vector $\nu$ is Diophantine, by using the Neishtadt averaging procedure, we can reduce the dependence of the right-hand sides of eqns [4] on

the fast variables to exponentially small in $\varepsilon$ terms. This means that there exist new symplectic variables

\[ P = Y + O(\sqrt{\varepsilon}), \qquad Q = X + O(\sqrt{\varepsilon}) \]

(new time coincides with the old one) such that system [4] takes the form

\[ \dot P = \sqrt{\varepsilon}\, F(P, \sqrt{\varepsilon}) + O\big(\exp(-a \varepsilon^{-b})\big) \]
\[ \dot Q = \nu + \sqrt{\varepsilon}\, G(P, \sqrt{\varepsilon}) + O\big(\exp(-a \varepsilon^{-b})\big) \]

with positive constants $a$, $b$.

If we neglect the exponentially small remainders, the system turns out to be integrable. Generically, it has a family of hyperbolic $m$-tori of the form $\{(P, Q) : P = \text{const.}\}$ with doubled asymptotic manifolds. However, the terms $O(\exp(-a \varepsilon^{-b}))$ generically cannot be removed completely. They produce an exponentially small splitting of the asymptotic manifolds. This splitting implies nonintegrability, chaotic behavior, Arnol’d diffusion, and other dynamical effects.

It is important to note that exponentially small splitting appears only in the analytic case. In smooth systems the splitting is much stronger.

Unfortunately, at present there are no quantitative methods for studying such splittings except obvious upper estimates and the case of two-dimensional systems.

Exponentially Small Splitting in Two-Dimensional Systems

The main results on exponentially small separatrix splitting were obtained by Lazutkin and his students (Gelfreich and others). Another effective approach was proposed by Treschev. There are no general theorems in this situation; however, many examples were studied. We discuss the splitting in the pendulum with rapidly oscillating suspension point. The Hamiltonian of the system has the form

\[ H = \tfrac{1}{2} y^2 + (1 + 2b \cos(t/\varepsilon)) \cos x \]

(cf. [3]). For any value of $\varepsilon$ the circle $\{(x, y, t) : x = y = 0\}$ is a periodic trajectory. For small $\varepsilon > 0$ the trajectory is hyperbolic.

Poincaré integral can be formally written in this system. It predicts the area of lobes $16\pi b\, \varepsilon^{-1} e^{-\pi (2\varepsilon)^{-1}}$. However, there is no reason to expect that this asymptotics of the splitting is correct. Indeed, its value is exponentially small in $\varepsilon$, while the error of the Poincaré–Melnikov method is in general quadratic in the perturbation. To obtain correct asymptotics of the separatrix splitting, one has to study singularities of the solutions with respect to complex time. Area of lobes in this system equals (Treschev 1997)

\[ A_L = 4\pi b\, f(b, \varepsilon)\, \varepsilon^{-1} e^{-\pi (2\varepsilon)^{-1}} \]

Here $f(b, \varepsilon)$, $\varepsilon \ge 0$ is a smooth function. The function $f(b, 0)$ is even and entire. It can be computed numerically as a solution of a problem which does not contain $\varepsilon$. The value $f(0, 0) = 4$ corresponds to the Poincaré integral, but the function $f(b, 0)$ is not constant. It is possible to prove that $f$ can be expanded in a power series in $\varepsilon$. Apparently, this series diverges for any $b \ne 0$.

Separatrix Splitting and Dynamics

1. Separatrix splitting can be regarded as an obstacle to the integrability of the perturbed system. However, this statement needs some comments. Doubled asymptotic surfaces in an integrable Hamiltonian system can have self-intersections. In the case of equilibrium, such intersections can even be transversal. In the literature, there is no general result saying that separatrix splitting implies nonintegrability. Some particular cases (studied by Kozlov, Ziglin, Bolotin, and others) are presented in Arnol’d et al. (1988). For example, in the two-dimensional case, this is seen to be true.
2. Conceptual reason for the nonintegrability, discussed in the previous item, is a complicated dynamics near the split separatrices. In many situations, it is possible to find in this domain a Smale horseshoe. This implies positive topological entropy, existence of nontrivial hyperbolic sets, symbolic dynamics, etc.
3. Consider a near-integrable area-preserving two-dimensional map. In the perturbed system in the vicinity of the split separatrices of a hyperbolic fixed point $z_\varepsilon$ the so-called stochastic layer is formed. Here we mean the domain bounded by invariant curves, closest to the separatrices. An important quantity, describing the rate of chaos, is the area of the stochastic layer $A_{SL}(\varepsilon)$. It turns out (Treschev 1998b) that $A_{SL}(\varepsilon)$ is connected with the area of the largest lobe $A_L(\varepsilon)$ by the simple formula

\[ c_1 A_{SL}(\varepsilon) < \frac{A_L(\varepsilon) \log(A_L(\varepsilon))}{\log^2 \mu} < c_2 A_{SL}(\varepsilon) \]

with some constants $c_1, c_2 > 0$, where $\mu$ is the largest multiplier (Lyapunov exponent) of the fixed point $z_0$.
4. Let $\hat z$ be a hyperbolic fixed point of an area-preserving two-dimensional map. The point $\hat z$

divides the corresponding separatrices $\Lambda^{s,u}$ in 4 branches $\Lambda_{1,2}^s$ and $\Lambda_{1,2}^u$. Suppose that the pair of branches $\Lambda_j^s$ and $\Lambda_l^u$ satisfies the following conditions:
$\Lambda_j^s$ and $\Lambda_l^u$ lie in a compact invariant domain;
$\Lambda_j^s$ and $\Lambda_l^u$ do not coincide and intersect at a homoclinic point.
Then the closures $\overline{\Lambda_j^s}$, $\overline{\Lambda_l^u}$ are compact invariant sets. Very little is known about these sets. For example, it is not known if their measure is positive. However, by using the Poincaré recurrence theorem, it is possible to prove (Treschev 1998a) that $\overline{\Lambda_j^s} = \overline{\Lambda_l^u}$.

See also: Averaging Methods; Billiards in Bounded Convex Domains; Hamiltonian Systems: Obstructions to Integrability; Hamiltonian Systems: Stability and Instability Theory.

Further Reading
Arnol’d V, Kozlov V, and Neishtadt A (1988) Mathematical aspects of classical and celestial mechanics. In: Encyclopaedia of Mathematical Sciences, vol. 3. Berlin: Springer.
Lochak P, Marco J-P, and Sauzin D (2003) On the splitting of invariant manifolds in multidimensional near-integrable Hamiltonian systems. Memoirs of the American Mathematical Society 163(775): viii+145.
Melnikov V (1963) On the stability of the center for time-periodic perturbations. Trudy Moskovskogo Matem. Obschestva 12: 3–52 (Russian). (English transl.: Trans. Moscow Math. Soc. 1963, pp. 1–56 (1965).)
Poincaré H (1987) Les méthodes nouvelles de la Mécanique Céleste, vols. 1–3. Paris: Gauthier–Villars (Original publication: 1892, 1893, 1899). New Printing: Librairie Scientifique et Technique Albert Blanchard 9, Rue Medecin 75006, Paris.
Treschev D (1994) Hyperbolic tori and asymptotic surfaces in Hamiltonian systems. Russian Journal of Mathematical Physics 2(1): 93–110.
Treschev D (1997) Separatrix splitting for a pendulum with rapidly oscillating suspension point. Russian Journal of Mathematical Physics 5(1): 63–98.
Treschev D (1998a) Closures of asymptotic curves in a two-dimensional symplectic map. J. Dynam. Control Systems 4(3): 305–314.
Treschev D (1998b) Width of stochastic layers in near-integrable two-dimensional symplectic maps. Physica D 116(1–2): 21–43.
Wiggins S (1988) Global Bifurcations and Chaos. Analytical Methods. Applied Mathematical Sciences, vol. 73, 494pp. New York: Springer.

Several Complex Variables: Basic Geometric Theory

A Huckleberry, Ruhr-Universität Bochum, Bochum, Germany
T Peternell, Universität Bayreuth, Bayreuth, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The rubric ‘‘several complex variables’’ is attached to a wide area of mathematics which involves the study of holomorphic phenomena in dimensions higher than one. In this area there are viewpoints, methods and results which range from those on the analytic side, where analytic techniques of partial differential equations (PDEs) are involved, to those of algebraic geometry which pertain to varieties defined over finite fields. Here we outline selected basic methods which are aimed at understanding global geometric phenomena. Detailed presentations of most results discussed here can be found in the basic texts (Demailly, Grauert and Fritzsche 2001, Griffiths and Harris 1978, Grauert et al. 1994, Grauert and Remmert 1979, 1984).

Domains in $\mathbb{C}^n$

Complex analysis begins with the study of holomorphic functions on domains $D$ in $\mathbb{C}^n$. These are smooth complex-valued functions $f$ which satisfy

\[ \bar\partial f := \sum_i \frac{\partial f}{\partial \bar z_i}\, d\bar z_i = 0 \]

Some results from the one-dimensional theory extend to the case where $n > 1$. However, even at the early stages of development, one sees that there are many new phenomena in the higher-dimensional setting.

Extending Results from the One-Dimensional Theory

For local results one may restrict considerations to functions $f$ which are holomorphic in a neighborhood of $0 \in \mathbb{C}^n$. The restriction of $f$ to, for example, any complex line through 0 is holomorphic, and therefore the maximum principle can be immediately transferred to the higher-dimensional setting.

The zero-set $V(f)$ of a nonconstant holomorphic function is one-codimensional over the complex numbers (two-codimensional over the reals). Thus the identity principle must be formulated in a different way from its one-dimensional version. For example, under the usual connectivity assumptions, if $f$ vanishes on a set $E$ with Hausdorff dimension bigger than $2n - 2$, then it vanishes identically. Here

is another useful version: if $M$ is a real submanifold such that the real tangent space $T_z M$ generates the full complex tangent space at one of its points, that is, $T_z M + i T_z M = T_z \mathbb{C}^n$, and $f|_M \equiv 0$, then $f \equiv 0$.

In the one-dimensional theory, after choosing appropriate holomorphic coordinates, $f(z) = z^k$ for some $k$. This local normal form implies that nonconstant holomorphic functions are open mappings. Positive results in the mapping theory of several complex variables are discussed below. The simple example $F : \mathbb{C}^2 \to \mathbb{C}^2$, $(z, w) \mapsto (z, zw)$, shows that the open mapping theorem cannot be transferred without further assumptions.

The local normal-form theorem in several complex variables is called the ‘‘Weierstraß preparation theorem.’’ It states that after appropriate normalization of the coordinates, $f$ is locally the product of a nonvanishing holomorphic function with a ‘‘polynomial’’

\[ P(z, z') = z^k + a_{k-1}(z') z^{k-1} + \cdots + a_0(z') \]

where $z$ is a single complex variable, $z'$ denotes the remaining $n - 1$ variables, and the coefficients are holomorphic in $z'$. This is a strong inductive device for the local theory.

If $D$ is a product $D = D_1 \times \cdots \times D_n$ of relatively compact domains in the complex plane $\mathbb{C}$, then repeated integration transfers the one-variable Cauchy integral formula from the $D_i$ to $D$. The resulting integral is over the product $\mathrm{bd}(D_1) \times \cdots \times \mathrm{bd}(D_n)$ of the boundaries which is topologically a small set in $\mathrm{bd}(D)$. Complex analytically it is, however, large in the sense of the above identity principle.

It follows from, for example, the $n$-variable Cauchy integral formula that holomorphic functions agree with their convergent power series developments. As in the one-variable theory, the appropriate topology on the space $\mathcal{O}(D)$ of holomorphic functions on $D$ is that of uniform convergence on compact subsets. In this way $\mathcal{O}(D)$ is equipped with the topology of a Fréchet space.

First Theorems on Analytic Continuation

Analytic continuation is a fundamental phenomenon in complex geometry. One type of continuation theorem which is known in the one-variable theory is of the following type: If $E$ is a small closed set in $D$ and $f \in \mathcal{O}(D \setminus E)$ is a holomorphic function which satisfies some growth condition near $E$, then it extends holomorphically to $D$. The notion ‘‘small’’ can be discussed in terms of measure, but it is more appropriate to discuss it in complex analytic terms.

An analytic subset $A$ of $D$ is locally the common zero set $\{a \in D;\ f_1(a) = \cdots = f_m(a) = 0\}$ of finitely many holomorphic functions. A function $g$ on $A$ is said to be holomorphic if at each $a \in A$ it is the restriction of a holomorphic function on some neighborhood of $a$ in $D$. There is an appropriate notion of an irreducible component of $A$. If $A$ is irreducible, it contains a dense open set $A_{\mathrm{reg}}$, which is a connected $k$-dimensional complex manifold, that is, at each of its points $a$ there are functions $f_1, \ldots, f_k$ which define a map $F := (f_1, \ldots, f_k)$, which is a holomorphic diffeomorphism of $A_{\mathrm{reg}}$ onto an open set in $\mathbb{C}^k$. The boundary $A_{\mathrm{sing}}$ is the set of singular points of $A$, which is a lower-dimensional analytic set. The dimension of an analytic set is the maximum of the dimensions of its irreducible components.

Here are typical examples of theorems on continuing holomorphic functions across small analytic sets $E$. If $\operatorname{codim} E \ge 2$, then every function which is holomorphic on $D \setminus E$ extends to a holomorphic function on $D$. The same is true of meromorphic functions, that is, functions which are locally defined as quotients $m = f/g$ of holomorphic functions. If $f$ is holomorphic on $D$, then $g := 1/f$ is holomorphic outside the analytic set $E := V(f)$. Thus $g$ cannot be holomorphically continued across this one-codimensional set. However, Riemann’s Hebbarkeitssatz is valid in several complex variables: if $f$ is locally bounded outside an analytic subset $E$ of any positive codimension, then it extends holomorphically to $D$.

With a bit of care, continuation results of this type can be proved for (reduced) complex spaces. These are defined as paracompact Hausdorff spaces which possess charts $(U_\alpha, \varphi_\alpha)$, where the local homeomorphism $\varphi_\alpha$ identifies the open set $U_\alpha$ with a closed analytic subset $A_\alpha$ of a domain $D_\alpha$ in some $\mathbb{C}^{n_\alpha}$. As indicated above, a continuous function on $A_\alpha$ is holomorphic if at each point it can be holomorphically extended to some neighborhood of that point in $D_\alpha$. Finally, just as in the case of manifolds, the compatibility between charts is guaranteed by requiring that coordinate change $\varphi_{\alpha\beta} : U_{\alpha\beta} \to U_{\beta\alpha}$ is biholomorphic, that is, it is a homeomorphism so that it and its inverse are given by holomorphic functions as $F = (f_1, \ldots, f_m)$. The discussion of irreducible components, sets of singularities, and dimension for complex spaces goes exactly in the same way as that above for analytic sets.

If $E$ is everywhere at least two-codimensional, then the above result on continuation of meromorphic functions holds in complete generality. The Hebbarkeitssatz requires the additional condition that the complex space is normal. In many situations this causes no problem at all, because, in general, there is a canonically defined associated normal

complex space $\tilde X$ and a proper, surjective, finite-fibered holomorphic map $\tilde X \to X$ which is biholomorphic outside a nowhere-dense proper analytic subset. Difficulties can be overcome by simply lifting functions to this normalization and applying the Hebbarkeitssatz.

Continuation theorems of Hartogs-type reflect the fact that complex analysis in dimensions larger than one is really quite different from the one-variable version. The following is such a theorem. Let $(z, w)$ be the standard coordinates in $\mathbb{C}^2$ and think of the $z$-axis as a parameter space for geometric figures in the $w$-plane. For example, let $D_z := \{(z, w) : |w| < 1\}$ be a disk and $A_z := \{(z, w) : 1 - \varepsilon < |w| < 1\}$ be an annulus. An example of a Hartogs figure $H$ in $\mathbb{C}^2$ is the union of the family of disks $D_z$ for $|z| < 1 - \delta$ with the family $A_z$ of annuli for $1 - \delta \le |z| < 1$. One should visualize the moving disks which suddenly change to moving annuli. One speaks of filling in the Hartogs figure to obtain the polydisk $\hat H := \{(z, w);\ |z|, |w| < 1\}$. Hartogs’ continuation theorem states that a function which is holomorphic on $H$ extends holomorphically to $\hat H$.

Cartan–Thullen Theorem

One of the major developments in complex analysis in several variables was the realization that certain convexity concepts lie behind the strong continuation properties. At the analytic level one such is defined as follows by the full algebra of holomorphic functions $\mathcal{O}(D)$. If $K$ is a compact subset of $D$, then its holomorphic convex hull $\hat K$ is defined as the intersection of the sets $P(f) := \{p \in D : |f(p)| \le |f|_K\}$ as $f$ runs through $\mathcal{O}(D)$. One says that $D$ is holomorphically convex if $\hat K$ is compact for every compact subset $K$ of $D$.

The theorem of H. Cartan and Thullen relates this concept to analytic continuation phenomena as follows. A domain $D$ is said to be a domain of holomorphy if, given a divergent sequence $\{z_n\} \subset D$, there exists $f \in \mathcal{O}(D)$ which is unbounded along it. In other words, the phenomenon of being able to extend all holomorphic functions on $D$ to a truly larger domain $\hat D$ does not occur. The Cartan–Thullen theorem states that $D$ is a domain of holomorphy if and only if it is holomorphically convex. In the next paragraph the relation between this type of convexity and a certain complex geometric convexity of the boundary $\mathrm{bd}(D)$ will be indicated.

Levi Theorem and the Levi Problem

Consider a smooth (local) real hypersurface $\Sigma$ containing $0 \in \mathbb{C}^n$ with $n > 1$. It is the zero-set $\{\rho = 0\}$ in some neighborhood $U$ of 0 of a smooth function $\rho$ with $d\rho \ne 0$ on $U$. This is viewed as a piece of a boundary of a domain $D$, where $U \cap D = \{\rho < 0\}$. The real tangent space $T_0 \Sigma = \operatorname{Ker}(d\rho(0))$ contains a unique maximal (one-codimensional) complex subspace $T_0^{\mathbb{C}} \Sigma = \operatorname{Ker}(\partial\rho(0)) = H$. The signature of the restriction of the complex Hessian (or Levi form) $i \partial\bar\partial \rho$ to $H$ is a biholomorphic invariant of $\Sigma$. In this notation the Hessian is a real alternating 2-form which is compatible with the complex structure, and its signature is defined to be the signature of the associated symmetric form.

If the restriction of this Levi form to the complex tangent space has a negative eigenvalue, that is, if the boundary $\mathrm{bd}(D)$ has a certain degree of concavity, then there is a map $F : \Delta \to U$ of the unit disk $\Delta$ which is biholomorphic onto its image with $F(0) = 0$ and otherwise $F(\mathrm{cl}(\Delta)) \subset D$. The reader can imagine pushing the image of this map into the domain to obtain a family of disks which are in the domain, and pushing it in the outward pointing direction to obtain annuli which are also in the domain. Making this precise, one builds a (higher-dimensional) Hartogs figure $H$ at the base point 0 so that $\hat H$ is an open neighborhood of 0. In particular this proves the theorem of E. E. Levi: every function holomorphic on $U \cap D$ extends to a neighborhood of 0. This can be globally formulated as follows:

Theorem If $D$ is a domain of holomorphy with smooth boundary in $\mathbb{C}^n$, then $\mathrm{bd}(D)$ is Levi-pseudoconvex.

Here the terminology Levi-pseudoconvex is used to denote the condition that the restriction of the Levi form to the complex tangent space of every boundary point is positive semidefinite.

One of the guiding problems of complex analysis in higher dimensions is the Levi problem. This is the converse statement to that of the Levi’s theorem:

Levi Problem Is a domain $D$ with smooth Levi-pseudoconvex boundary in a complex manifold necessarily a domain of holomorphy?

Stated in this form it is not true, but for domains in $\mathbb{C}^n$ it is true. As will be sketched below, under stronger assumptions on the Levi form it is almost true. However, there are still interesting open problems in complex analysis which are related to the Levi problem.

Bounded Domains and Their Automorphisms

The unit disk in the complex plane is particularly important, because, with the exceptions of projective

space $\mathbb{P}^1(\mathbb{C})$, the complex plane $\mathbb{C}$, the punctured plane $\mathbb{C} \setminus \{0\}$, and compact complex tori, it is the universal cover of every (connected) one-dimensional complex manifold.

In higher dimensions it should first be underlined that, without some further condition, there is no best bounded domain in $\mathbb{C}^n$. For example, two randomly chosen small perturbations of the unit ball $B_2 := \{(z, w);\ |z|^2 + |w|^2 < 1\}$, with, for example, real analytic boundary, are not biholomorphically equivalent.

On the other hand, the following theorem of H. Cartan shows that bounded domains $D$ are good candidates for covering spaces:

Theorem Equipped with the compact open topology, the group $\mathrm{Aut}(D)$ of holomorphic automorphisms of $D$ is a Lie group acting properly on $D$.

The notion of a proper group action of a topological group on a topological space is fundamental and should be underlined. It means that if $\{x_n\}$ is a convergent sequence in the space where the group is acting, then a sequence of group elements $\{g_n\}$, with the property that $\{g_n(x_n)\}$ is convergent, itself possesses a convergent subsequence. As a consequence, isotropy groups are compact and orbits are closed.

In the context of bounded domains $D$ this implies that if $\Gamma$ is a discrete subgroup of $\mathrm{Aut}(D)$, then $X = D/\Gamma$ carries a natural structure of a complex space. If in addition $\Gamma$ is acting freely, something that, with minor modifications, can be arranged, then $X$ is a complex manifold.

Many nontrivial compact complex manifolds arise as quotients $D/\Gamma$ of bounded domains. Even very concrete quotients, for example, where $D = B_2$, are extremely interesting. Conversely, if $\mathrm{Aut}(D)$ contains a discrete subgroup $\Gamma$ so that $D/\Gamma$ is compact, then $D$ is probably very special. For example, it is known to be holomorphically convex!

Any compact quotient $X = D/\Gamma$ of a bounded domain is projective algebraic in the sense that it can be realized as a complex (algebraic) submanifold of some complex projective space. In fact the embedding can be given by quite special $\Gamma$-invariant holomorphic tensors on $D$, and this in turn implies that $X$ is of general type (see below). For further details, in particular on Cartan's theorem on the automorphism group of a bounded domain, the reader is referred to Narasimhan (1971).

Stein Manifolds

The founding fathers of the first phase of "modern complex analysis" (Cartan, Oka, and Thullen) also realized that domains of holomorphy form the basic class of spaces where it would be possible to solve the important problems of the subject concerning the existence of holomorphic or meromorphic functions with reasonably prescribed properties. In fact, Oka formulated a principle which more or less states that if a complex analytic problem which is well formulated on a domain of holomorphy has a continuous solution, then it should have a holomorphic solution. Given the flexibility of continuous functions and the rigidity of holomorphic functions, this would seem impossible but in fact is true!

Beginning in the late 1930s, Stein worked on problems related to this Oka principle, in particular on those related to what we would now call the algebraic topological aspects of the subject, and he was led to formulate conditions on a general complex manifold $X$ which should hold if problems of the above type are to be solved. First, his axiom of holomorphic convexity was simply that, given a divergent sequence $\{x_n\}$ in $X$, there should be a function $f \in \mathcal{O}(X)$ such that $\{f(x_n)\}$ is unbounded. Secondly, holomorphic functions should separate points in the sense that, given distinct points $x_1, x_2 \in X$, there exists $f \in \mathcal{O}(X)$ with $f(x_1) \neq f(x_2)$. Finally, globally defined holomorphic functions should give local coordinates. Assuming that $X$ is $n$-dimensional, this means that, given a point $x \in X$, there exist $f_1, \dots, f_n \in \mathcal{O}(X)$ such that $df_1(x) \wedge \cdots \wedge df_n(x) \neq 0$.

Assuming Stein's axioms, Cartan and Serre then produced a powerful theory in the context of sheaf cohomology which proved certain vanishing theorems that led to the desired existence theorems. This theory and typical applications are sketched below. Before going into this, we would like to mention that Grauert's version of the Cartan–Serre theory requires only very weak versions of Stein's axioms: (1) The connected component containing $K$ of the holomorphic convex hull $\hat{K}$ of every compact set $K$ should be compact. (2) Given $x \in X$, there are functions $f_1, \dots, f_m \in \mathcal{O}(X)$ so that $x$ is an isolated point in the fiber of the map $F := (f_1, \dots, f_m) : X \to \mathbb{C}^m$. Of course the results also hold for complex spaces.

Holomorphically convex domains in $\mathbb{C}^n$ are Stein manifolds, and since closed complex submanifolds of Stein manifolds are Stein, it follows that any closed complex submanifold of $\mathbb{C}^n$ is Stein. In particular, affine varieties are Stein spaces. Remmert's theorem states the converse: an $n$-dimensional Stein manifold can be embedded as a closed complex submanifold of $\mathbb{C}^{2n+1}$. A nontrivial result of Behnke and Stein implies that every noncompact Riemann surface is Stein.
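As a compact restatement, Stein's three axioms for an $n$-dimensional complex manifold $X$ can be written in symbols as follows. The check for $X = \mathbb{C}^n$ with its coordinate functions $z_1, \dots, z_n$ is an illustrative addition, not part of the original argument, and "divergent" is read here as "having no convergent subsequence".

```latex
% Requires amsmath. Stein's axioms for an n-dimensional complex manifold X.
\begin{align*}
&\text{(holomorphic convexity)} && \forall\, \{x_k\} \subset X \text{ divergent}\ \ \exists\, f \in \mathcal{O}(X):\ \sup_k |f(x_k)| = \infty, \\
&\text{(point separation)}      && \forall\, x_1 \neq x_2 \in X\ \ \exists\, f \in \mathcal{O}(X):\ f(x_1) \neq f(x_2), \\
&\text{(local coordinates)}     && \forall\, x \in X\ \ \exists\, f_1, \dots, f_n \in \mathcal{O}(X):\ df_1(x) \wedge \cdots \wedge df_n(x) \neq 0.
\end{align*}
% Illustrative check (assumption: X = C^n, using only the coordinate functions z_j).
% Convexity: a sequence with no convergent subsequence satisfies |x_k| -> infinity,
%            and |x|^2 = sum_j |z_j(x)|^2, so some z_j is unbounded along it.
% Separation: x_1 != x_2 differ in some coordinate, so z_j(x_1) != z_j(x_2) for that j.
% Coordinates: dz_1 \wedge \cdots \wedge dz_n is nowhere zero on C^n.
```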

Basic Formalism

The following first Cousin problem is typical of those which can be solved by Stein theory. Let $X$ be a complex manifold which is covered by open sets $U_i$. Suppose that on each such set a meromorphic function $m_i$ is given so that on the overlap $U_{ij} := U_i \cap U_j$ the difference $m_j - m_i =: f_{ij}$ is holomorphic. This means that the distribution of polar parts of these functions is well defined. The question is whether or not there exists a globally defined meromorphic function $m \in \mathcal{M}(X)$ with these prescribed polar parts, that is, with $m - m_i \in \mathcal{O}(U_i)$ for all $i$.

If one applies the Oka principle, this problem can be easily solved. For this one can assume that the covering is locally finite and take $\{\chi_i\}$ to be a partition of unity subordinate to the cover. Using standard shrinking and cut-off arguments, one extends the $f_{ij}$ to the full space $X$ as smooth functions so that the alternating cocycle relations $f_{ij} + f_{jk} + f_{ki} = 0$ and $f_{ij} = -f_{ji}$ still hold. Then $f_j := \sum_k \chi_k f_{kj}$ is a smooth function on $U_j$ which satisfies $f_j - f_i = f_{ij}$ on the overlap $U_{ij}$. It follows that $f := m_i - f_i = m_j - f_j$ is a globally well-defined "smooth" function with the prescribed polar parts. The Oka principle would then imply that there is a globally defined meromorphic function with the same property.

The basic sheaf cohomological formalism for Stein theory can be seen in the above argument. Suppose that instead of applying extension and cut-off techniques from the smooth category, we could answer positively the question "given the holomorphic functions $\{f_{ij}\}$ on the $U_{ij}$, do there exist holomorphic functions $\{f_i\}$ on the $U_i$ such that $f_j - f_i = f_{ij}$ on the $U_{ij}$?" Then we would immediately have the desired globally defined meromorphic function $m := m_i - f_i$. This question is exactly the question of whether or not the Čech cohomology class of the alternating cocycle $\{f_{ij}\}$ vanishes.

Let us quickly summarize the language of Čech cohomology. A presheaf of abelian groups is a mapping $U \mapsto S(U)$ which associates to every open subset of $X$ an abelian group. Typical examples are $U \mapsto \mathcal{O}(U)$, $U \mapsto C^\infty(U)$, $U \mapsto H^*(U, \mathbb{Z})$, .... The last example, which associates to $U$ its topological cohomology, does not localize well in the sense of the following basic axioms for a sheaf: (1) Given a covering $\{U_i\}$ of an open subset $U$ of $X$ and elements $s_i \in S(U_i)$ with $s_j - s_i = 0$ on $U_{ij}$, there exists $s \in S(U)$ with $s|_{U_i} = s_i$. (2) If $s, t \in S(U)$ are such that $s|_{U_i} = t|_{U_i}$ for all $i$, then $s = t$. For this we have assumed that the restriction mappings have been built into the definition of a presheaf.

Associated to a sheaf $S$ on $X$ and a covering $\mathcal{U} = \{U_i\}$ is the space of alternating $q$-cochains $C^q(\mathcal{U}, S)$, which is the set of alternating maps $\xi$ from the set of $(q+1)$-fold indices of the form $(i_0, \dots, i_q) \mapsto s_{i_0, \dots, i_q} \in S(U_{i_0, \dots, i_q})$. Here $U_{i_0, \dots, i_q} := U_{i_0} \cap \cdots \cap U_{i_q}$. The boundary mapping $\delta : C^q \to C^{q+1}$ is defined by $(\delta\xi)_{i_0, \dots, i_{q+1}} = \sum_k (-1)^k s_{i_0, \dots, i_{k-1}, i_{k+1}, \dots, i_{q+1}}$. It follows that $\delta^2 = 0$, and $H^*(\mathcal{U}, S)$ is defined to be the cohomology of the associated complex.

In any consideration it is necessary to refine coverings, shrink, etc., and therefore one goes to the limit $H^*(X, \mathcal{S})$ over all refinements of the coverings. The script notation $\mathcal{S}$ is used to denote that we have then localized the sheaf to the germ level. Due to a theorem of Leray one can, however, always take a suitable covering so that $H^q(\mathcal{U}, S) = H^q(X, \mathcal{S})$ for all $q$, where now $S(U)$ satisfies the above axioms.

One of the important facts in this cohomology theory is that a short exact sequence of sheaves $0 \to \mathcal{S}' \to \mathcal{S} \to \mathcal{S}'' \to 0$ yields a long exact sequence

$$0 \to H^0(X, \mathcal{S}') \to H^0(X, \mathcal{S}) \to H^0(X, \mathcal{S}'') \to H^1(X, \mathcal{S}') \to H^1(X, \mathcal{S}) \to H^1(X, \mathcal{S}'') \to \cdots$$

in cohomology.

A fundamental theorem of Stein theory, Theorem B, states that on a Stein space $X$, for the basic analytic sheaves $\mathcal{S}$ of complex analysis, the so-called coherent sheaves, all cohomology spaces $H^q(X, \mathcal{S})$ vanish for all $q \geq 1$. In the above example of the first Cousin problem the desired vanishing is that of $H^1(\mathcal{U}, \mathcal{O})$.

Coherent Sheaves

Numerous important sheaves in complex analysis are associated to vector bundles on complex manifolds. A holomorphic vector bundle $\pi : E \to X$ over a complex manifold is a holomorphic surjective maximal rank fibration. Every fiber $E_x := \pi^{-1}(x)$ is a complex vector space, and the vector space structure is defined holomorphically over $X$. For example, addition is a holomorphic map $E \times_X E \to E$. Such bundles are locally trivial, that is, there is a covering $\{U_i\}$ of the base such that $\pi^{-1}(U_i)$ is isomorphic to $U_i \times \mathbb{C}^r$ and on the overlap the gluing maps in the fibers are holomorphic maps $\varphi_{ij} : U_{ij} \to GL_r(\mathbb{C})$. The number $r$ is called the rank of the bundle. Holomorphic bundles of rank 1 are referred to as holomorphic line bundles. Of course all of these definitions make sense in other categories, for example, topological, smooth, real analytic, etc.

A holomorphic section of $E$ over an open set $U$ is a holomorphic map $s : U \to E$ with $\pi \circ s = \mathrm{Id}_U$. The space of these sections is denoted by $\mathcal{E}(U)$, and the map $U \mapsto \mathcal{E}(U)$ defines a sheaf which is locally just $\mathcal{O}_X^r$. It is therefore called a locally free sheaf of

$\mathcal{O}$-modules. Conversely, by taking bases of a locally free sheaf $\mathcal{S}$ on the open sets where it is isomorphic to a direct sum $\mathcal{O}^r$, one builds an associated holomorphic vector bundle $E$ so that $\mathcal{E} = \mathcal{S}$.

It is not possible to restrict our attention to these locally free sheaves or equivalently to holomorphic vector bundles. One important reason is that images of holomorphic vector bundle maps are not necessarily vector bundles. A related reason is that the sheaf of ideals of holomorphic functions which vanish on a given analytic set $A$ is not always a vector bundle. This is caused by the presence of singularities in $A$. There are many other reasons, but these should suffice for this sketch.

The sheaves $\mathcal{S}$ that arise naturally in complex analysis are almost vector bundles. If $X$ is the base complex manifold or complex space under consideration, then $\mathcal{S}$ will come from a vector bundle on some big open subset $X_0$ whose boundary is an analytic set $X_1$, and then on the irreducible components of $X_1$ it will come from vector bundles on such big open sets, etc. These sheaves are called coherent analytic sheaves of $\mathcal{O}_X$-modules. The correct algebraic definition is that locally there exists an exact sequence

$$0 \to \mathcal{O}^{p_d} \to \cdots \to \mathcal{O}^{p_1} \to \mathcal{O}^{p_0} \to \mathcal{S} \to 0 \qquad [1]$$

of sheaves of $\mathcal{O}$-modules. This implies in particular that, although $\mathcal{S}$ might not be locally free, it is locally finitely generated, and the relations among the generators are also finitely generated.

Selected Theorems

The following efficiently formulated fundamental theorem contains a great deal of information about Stein manifolds.

Theorem B A complex space $X$ is Stein if and only if for every coherent sheaf $\mathcal{S}$ of $\mathcal{O}_X$-modules $H^q(X, \mathcal{S}) = 0$ for all $q \geq 1$.

Since $\mathcal{S}$ is a sheaf, it follows that $H^0(X, \mathcal{S}) = \mathcal{S}(X)$. This is referred to as the space of sections of $\mathcal{S}$ over $X$. As a result of Theorem B, we are able to construct sections with prescribed properties. Let us give two concrete applications (there are many more!).

Example Let $A$ be a closed analytic subset of a Stein space $X$, and let $\mathcal{I}$ denote the subsheaf of $\mathcal{O}_X$ which consists of those functions which vanish on $A$. Note that this must be defined for every open subset $U$ of $X$. Then we have the short exact sequence $0 \to \mathcal{I} \to \mathcal{O}_X \to \mathcal{O}_X/\mathcal{I} \to 0$. The restriction of $\mathcal{O}_X/\mathcal{I}$ to $A$ is called the (reduced) structure sheaf $\mathcal{O}_A$ of $A$. In other words, for $U$ open in $A$ the space $\mathcal{O}_A(U)$ should be regarded as the space of holomorphic functions on $U$.

Now, $\mathcal{I}$ is a coherent sheaf on $X$ and therefore by Theorem B the cohomology group $H^1(X, \mathcal{I})$ vanishes. Consequently, the associated long exact sequence in cohomology implies that the restriction mapping $\mathcal{O}_X(X) \to \mathcal{O}_A(A)$ is surjective. This special case of Theorem A means that every (global!) holomorphic function on $A$ is the restriction of a holomorphic function on $X$.

Example Let us consider the multiplicative (second) Cousin problem. In this case meromorphic functions $m_i$ are given on the open subsets $U_i$ of a covering $\mathcal{U}$ with the property that $m_i = f_{ij} m_j$, where $f_{ij}$ is holomorphic and nowhere vanishing on the overlap $U_{ij}$. This is a distribution $D$ of the zero and polar parts of meromorphic functions, which in complex geometry is called a divisor, and the interesting question is whether or not there exists a globally defined meromorphic function which has $D$ as its divisor.

Now we note that $GL_1(\mathbb{C}) = \mathbb{C}^*$ and thus $f_{ij} : U_{ij} \to \mathbb{C}^*$ defines a line bundle $L$ on $X$ and we regard it as an element of the space $H^1(X, \mathcal{O}^*)$ of equivalence classes of line bundles on $X$. Here $\mathcal{O}^*$ is the sheaf of nowhere-vanishing holomorphic functions on $X$. It is not even a sheaf of $\mathcal{O}$-modules; therefore coherence is not discussed in this case.

The long exact sequence in cohomology associated to the short exact sequence $0 \to \mathbb{Z} \to \mathcal{O} \xrightarrow{\exp} \mathcal{O}^* \to 1$ yields an element $c_1(L) \in H^2(X, \mathbb{Z})$, which is a purely topological invariant. It is called the Chern class of $L$, and one knows that $L$ is topologically trivial if and only if $c_1(L) = 0$.

Coming back to the Cousin II problem, using the same argument as in the Cousin I problem, we can solve it if and only if we can find nowhere-vanishing functions $f_i \in \mathcal{O}^*(U_i)$ with $f_i = f_{ij} f_j$. This is equivalent to finding a nowhere-vanishing section of $L$. But a line bundle has a nowhere-vanishing section if and only if it is isomorphic to the trivial bundle. In other words, the Cousin II problem can be solved for a given divisor $D$ if and only if the associated line bundle $L(D)$ is trivial in $H^1(X, \mathcal{O}^*)$. For this, a necessary condition is that the Chern class $c_1(L(D))$ vanishes. But if $X$ is Stein, this is also sufficient, because the vanishing of $H^1(X, \mathcal{O})$ together with the long exact sequence in cohomology shows that $c_1 : H^1(X, \mathcal{O}^*) \to H^2(X, \mathbb{Z})$ is injective.

Hence, in this case we have the following precise formulation of the Oka principle: "A given divisor $D$ on a Stein manifold is the divisor of a globally defined meromorphic function if and only if the associated line bundle is topologically trivial."
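The injectivity of $c_1$ used in the Cousin II example can be read off from one segment of the long exact sequence attached to the exponential sequence. The sketch below writes that segment out. The further remark that $c_1$ is then also surjective is an addition that relies on $H^2(X, \mathcal{O}) = 0$, which is another instance of Theorem B.

```latex
% Requires amsmath. Segment of the long exact sequence for
%   0 -> Z -> O --exp--> O^* -> 1.
\[
H^1(X,\mathcal{O}) \longrightarrow H^1(X,\mathcal{O}^*)
\xrightarrow{\ c_1\ } H^2(X,\mathbb{Z}) \longrightarrow H^2(X,\mathcal{O})
\]
% On a Stein space X, Theorem B gives H^1(X,O) = 0 and H^2(X,O) = 0, so the segment reads
\[
0 \longrightarrow H^1(X,\mathcal{O}^*)
\xrightarrow{\ c_1\ } H^2(X,\mathbb{Z}) \longrightarrow 0 .
\]
% Exactness at H^1(X,O^*): ker(c_1) is the image of H^1(X,O) = 0, so c_1 is injective.
% Exactness at H^2(X,Z):   the image of c_1 is the kernel of the map to H^2(X,O) = 0,
%                          so c_1 is also surjective (added remark).
% Consequence: L(D) is holomorphically trivial if and only if c_1(L(D)) = 0.
```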

A slightly refined statement from that above is the fact that on a Stein manifold the space of topological line bundles is the same as the space of holomorphic line bundles. In the case of (higher rank) vector bundles this is a deep and important theorem of Grauert. It can be formulated as follows.

Grauert's Oka principle On a Stein space the map $F : \mathrm{Vect}_{\mathrm{holo}}(X) \to \mathrm{Vect}_{\mathrm{top}}(X)$ from the space of holomorphic vector bundles to the space of topological vector bundles which forgets the complex structure is bijective.

In closing this section, a few words concerning the proofs of the major theorems, for example, Theorem B, should be mentioned. In all cases one must solve something like an additive Cousin problem and one first does this on special relatively compact subsets. For this step there are at least two different ways to proceed. One is to delicately piece together solutions which are known to exist on very special polyhedral-type domains or build up from lower-dimensional pieces of such.

Another method is to solve certain systems of PDEs on relatively compact domains where control at the boundary is given by the positivity of the Levi form. An example of how such PDEs occur can already be seen at the level of the above Cousin I problem. At the point where we have solved it topologically, that is, the holomorphic cocycle $\{f_{ij}\}$ is a coboundary $f_{ij} = f_j - f_i$ of smooth functions, we observe that since $\bar\partial f_{ij} = 0$, it follows that $\alpha = \bar\partial f_i$ is a globally defined $(0, 1)$-form. It is $\bar\partial$-closed, that is, the compatibility condition for solving the system $\bar\partial u = \alpha$ is fulfilled. If this system can be solved, then we use the solution $u$ to adjust the topological solutions of the Cousin problem by replacing $f_i$ by $f_i - u$. We still have $f_{ij} = f_j - f_i$, but now the $f_i$ are holomorphic on $U_i$.

To obtain the global solution to a Cousin-type problem, one exhausts the Stein space by the special relatively compact subsets $U_n$ where, by one method or another, we have solved the problem with solutions $s_n$. One would like to say that the $s_n$ converge to a global solution $s$. However, there is no way to guarantee this a priori without making some sort of estimates. One main way of handling this problem is to adjust the solutions as $n \to \infty$ by an approximation procedure. For this one needs to know that holomorphic objects, for example, functions on $U_n$, can be approximated on $U_n$ by objects of the same type which are defined on the bigger set $U_{n+1}$. This Runge-type theorem, which is a nontrivial ingredient in the whole theory, requires the introduction of an appropriate Fréchet structure on the spaces of sections of a coherent sheaf. This is in itself a point that needs some attention.

Montel's Theorem and Fredholm Mappings

If $U$ is an open subset of a complex space $X$, then $\mathcal{O}(U)$ has the Fréchet topology of convergence on compact subsets $K$ defined by the seminorms $|\cdot|_K$. Using resolutions of type (1) above, one shows that the space of sections $\mathcal{S}(U)$ of every coherent sheaf $\mathcal{S}$ also possesses a canonical Fréchet topology. This is then extended to the spaces $C^q(\mathcal{U}, \mathcal{S})$, and consequently one is able to equip the cohomology spaces $H^q(X, \mathcal{S})$ with an (often non-Hausdorff) quotient topology.

Elements of such cohomology groups can be regarded as obstructions to solving complex analytic problems. One often expects such obstructions, and is satisfied whenever it can be shown that there are only finitely many, that is, a finiteness theorem of the type $\dim H^q(X, \mathcal{S}) < \infty$ is desirable. Here we sketch two finiteness theorems which hold in seemingly different contexts, but their proofs are based on one principle: use the compactness guaranteed by Montel's theorem as the necessary input for the Fredholm theorem in the context of Fréchet spaces.

Recall that a continuous linear map $T : E \to F$ between topological vector spaces is said to be compact if there is an open neighborhood $U$ of $0 \in E$ such that $T(U)$ is relatively compact in $F$. If $Y$ is a relatively compact open subset of a complex space $X$, then Montel's theorem states that the restriction map $r^X_Y : \mathcal{O}(X) \to \mathcal{O}(Y)$ is compact. This can be extended to coherent sheaves, and using the Fredholm theorem for certain natural restriction and boundary maps, one proves the following fundamental fact.

Lemma 1 If the restriction map $r^X_Y : H^q(X, \mathcal{S}) \to H^q(Y, \mathcal{S})$ is surjective, then $H^q(Y, \mathcal{S})$ is finite dimensional.

Since the methods for the proof are basic in complex analysis, we outline it here. Take a covering $\mathcal{U}$ of $X$ such that $H^q(\mathcal{U}, \mathcal{S}) = H^q(X, \mathcal{S})$. Then intersect its elements with $Y$ to obtain a covering $\tilde{\mathcal{U}}$ of $Y$. Finally, refine that covering with refinement mapping $\tau$ to a covering $\mathcal{V}$ of $Y$ such that $H^q(\mathcal{V}, \mathcal{S}) = H^q(Y, \mathcal{S})$ and so that $U_i$ contains $V_{\tau(i)}$ as a relatively compact subset for all $i$. Let $Z^q(\mathcal{U}, \mathcal{S})$ denote the kernel of the boundary map $\delta$ for the covering $\mathcal{U}$, and consider the map $Z^q(\mathcal{U}, \mathcal{S}) \oplus C^{q-1}(\mathcal{V}, \mathcal{S}) \to Z^q(\mathcal{V}, \mathcal{S})$ which is the direct sum $r \oplus \delta$ of the restriction and boundary maps. By assumption it is surjective. Since $\delta$ is the difference of this map and the compact map $r$, L. Schwartz's version of the Fredholm theorem for Fréchet spaces implies that its image is of finite

codimension, that is, $H^q(Y, \mathcal{S}) = H^q(\mathcal{V}, \mathcal{S})$ is finite dimensional.

Applying this Lemma in the case of compact spaces where $X = Y$, one has the following theorem of Cartan and Serre:

Theorem If $X$ is a compact complex space and $\mathcal{S}$ is a coherent sheaf on $X$, then $\dim H^q(X, \mathcal{S}) < \infty$ for all $q$.

Grauert made use of this technique in solving the Levi problem for a strongly pseudoconvex relatively compact domain $D$ with smooth boundary in a complex manifold $X$. Here strongly pseudoconvex means that the restriction of the Levi form to the complex tangent space of every boundary point is positive definite. To do this he sequentially made "bumps" at boundary points to obtain a finite sequence of domains $D = D_0 \subset D_1 \subset \cdots \subset D_m$ in such a way that the restriction mappings at the level of $q$th cohomology, $q \geq 1$, are all surjective and such that at the last step $D$ is relatively compact in $D_m$. Applying the above Lemma, $\dim H^q(D, \mathcal{S}) < \infty$. Using another bumping procedure, it then follows that $D$ is holomorphically convex and, in fact, that $D$ is almost Stein.

This last statement means that one can guarantee that $\mathcal{O}(D)$ separates points outside of some compact subset which could contain compact subvarieties on which the global holomorphic functions are constant. In this situation one can apply Remmert's reduction theorem which implies that there is a canonically defined proper surjective holomorphic map $\pi : D \to Z$ to a Stein space which is biholomorphic outside of finitely many fibers. One says that, in order to obtain the Stein space $Z$, finitely many compact analytic subsets must be blown down to points.

The above mentioned reduction theorem is a general result which applies to any holomorphically convex complex space $X$. For this one observes that if $X$ is holomorphically convex, then for $x \in X$ the level set $L(x) := \{y \in X;\ f(y) = f(x) \text{ for all } f \in \mathcal{O}(X)\}$ is a compact analytic subset of $X$. One then defines an equivalence relation: $x \sim y$ if and only if the connected component of $L(x)$ containing $x$ and that of $L(y)$ which contains $y$ are the same. One then equips $X/{\sim}$ with the quotient topology and proves that the canonical quotient $\pi : X \to X/{\sim} =: Z$ is proper. Finally, for $U$ open in $Z$ one defines $\mathcal{O}_Z(U) = \mathcal{O}_X(\pi^{-1}(U))$ and proves that, equipped with this structure, $Z$ is a Stein space. This Remmert reduction is universal with respect to holomorphic maps to holomorphically separable complex spaces, that is, if $\varphi : X \to Y$ and $\mathcal{O}_Y(Y)$ separates the points of $Y$, then there exists a uniquely defined holomorphic map $\hat\varphi : Z \to Y$ so that $\hat\varphi \circ \pi = \varphi$. It should be noted that, even if the original space $X$ is a complex manifold, the associated Stein space $Z$ may be singular. This reflects the fact that it is difficult to avoid singularities in complex geometry.

Mapping Theory

Above we have attempted to make it clear that holomorphic maps play a central role in complex geometry. It is even important to regard a holomorphic function as a map. Here we outline the basic background necessary for dealing with maps and then state three basic theorems which involve proper holomorphic mappings.

Basic Facts

A holomorphic map $F : X \to Y$ between (reduced) complex spaces is a continuous map which can be represented locally as a holomorphic map between analytic subsets of the spaces in which $X$ and $Y$ are locally embedded. In other words, $F$ is the restriction of a map $\hat{F} = (f_1, \dots, f_m)$ which is defined by holomorphic functions.

If $X$ is irreducible and $X$ and $Y$ are one-dimensional, then a nonconstant holomorphic map $F : X \to Y$ is an open mapping. This statement is far from being true in the higher-dimensional setting. The reader need only consider the example $F : \mathbb{C}^2 \to \mathbb{C}^2$, $(z, w) \mapsto (zw, z)$.

Despite the fact that holomorphic maps can be quite complicated, they have properties that in certain respects render them tractable. Let us sketch these in the case where $X$ is irreducible. First, one notes that every fiber $F^{-1}(y)$ is a closed analytic subset of $X$. One defines $\mathrm{rank}_x F$ to be the codimension at $x$ of the fiber $F^{-1}(F(x))$. Then $\mathrm{rank}\, F := \max\{\mathrm{rank}_x F;\ x \in X\}$. It then can be shown that $\{x \in X;\ \mathrm{rank}_x F \leq k\}$ is a closed analytic subset of $X$ for every $k$. Applying this for $k = \mathrm{rank}\, F - 1$ we see that, outside a proper closed analytic subset, $F$ has constant maximal rank.

If $F : X \to Y$ has constant rank $k$ in a neighborhood of some point $x \in X$, then one can choose neighborhoods $U$ of $x$ in $X$ and $V$ of $F(x)$ in $Y$ so that $F|_U$ maps $U$ onto a closed analytic subset of $V$. By restricting $F$ to the sets where it has lower rank and applying this local-image theorem, it follows that the local images of the set where $F$ has lower rank are at least two dimensions smaller than those of top rank. Conversely, the fiber dimension $d_F(x) := \dim_x F^{-1}(F(x))$ is semicontinuous in the sense that $d_F(x) \geq d_F(z)$ for all $z$ near $x$. Finally, we note that if $Y$ is $m$-dimensional, then $F : X \to Y$ is an open map if and only if it is of constant rank $m$.
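For the example $F(z, w) = (zw, z)$ mentioned above, the rank and fiber-dimension notions can be computed by hand. The following worked computation is an illustrative addition; it only uses the definitions of $\mathrm{rank}_x F$ and $d_F$ given in this section.

```latex
% Requires amsmath. Worked example: F : C^2 -> C^2, F(z,w) = (zw, z).
\[
DF(z,w) = \begin{pmatrix} w & z \\ 1 & 0 \end{pmatrix},
\qquad \det DF(z,w) = -z .
\]
% Case z != 0: (a,b) = (zw, z) has the unique preimage (z,w) = (b, a/b).
%   The fiber through (z,w) is a point: d_F = 0 and rank_{(z,w)} F = 2.
% Case z = 0: F(0,w) = (0,0) for every w, so the fiber through (0,w) is the line {z = 0}.
%   Hence d_F(0,w) = 1 and rank_{(0,w)} F = 1.
\[
\operatorname{rank} F = 2, \qquad
\{x;\ \operatorname{rank}_x F \le 1\} = \{z = 0\}, \qquad
F(\mathbb{C}^2) = \{(a,b);\ b \neq 0\} \cup \{(0,0)\} .
\]
% The image contains (0,0) but no point (a,0) with a != 0, so it is not open:
% F is not an open map, in line with the criterion "open iff constant rank m" for m = 2.
% The lower-rank set {z = 0} has image {(0,0)}, of dimension 0 = 2 - 2.
```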

Proper Mappings

By definition a mapping $F : X \to Y$ between topological spaces is proper if and only if the inverse image $F^{-1}(K)$ of an arbitrary compact subset in $Y$ is compact in $X$. This is a more delicate condition than meets the eye. For example, if $F : X \to Y$ is a proper map and one removes one point from some fiber, then it is normally no longer proper! On the other hand, the restriction of a proper map to a closed subset is still proper.

Remmert's "Proper mapping theorem" is the first basic theorem on proper holomorphic maps:

Theorem The image of a proper holomorphic map $F : X \to Y$ is a closed analytic subset of $Y$.

Given another basic theorem of complex analysis, the reader can imagine how this might be proved. This is the continuation theorem for analytic sets due to Remmert and Stein:

If $X$ is a complex space and $Y$ is a closed analytic subset with $\dim_y Y \leq k$ for all $y \in Y$ and $Z$ is a closed analytic subset of the complement $X \setminus Y$ with $\dim_z Z \geq k + 1$ at all $z \in Z$, then the topological closure $\mathrm{cl}(Z)$ of $Z$ in $X$ is a closed analytic subset of $X$ with $E = \mathrm{cl}(Z) \setminus Z = \mathrm{cl}(Z) \cap Y$ a proper analytic subset of $\mathrm{cl}(Z)$.

Similar results hold for more general complex analytic objects. For example, closed positive currents with (locally) finite volume can be continued across any proper analytic subset (Skoda 1982). A sketch of the proof of the proper mapping theorem (for $X$ irreducible) goes as follows. From the assumption that $F$ is proper, the image $F(X)$ is closed. If $F$ has constant rank $k$, then, by the local result stated above, its image is everywhere locally a $k$-dimensional analytic set. Since the image is closed, the desired result follows. If $\mathrm{rank}\, F = k$ and $E := \{x \in X;\ \mathrm{rank}_x F < k\} \neq \emptyset$, then by induction $F(E)$ is a closed analytic subset of dimension at most $k - 2$. Let $A := F^{-1}(F(E))$ and apply the previous discussion for constant rank maps to $F|_{X \setminus A} : X \setminus A \to Y \setminus F(E)$. The image is a closed $k$-dimensional analytic subset of $Y \setminus F(E)$ and its Remmert–Stein extension is the full image $F(X)$.

In this framework the Stein factorization theorem is an important tool. Here $F : X \to Y$ is again a proper holomorphic map which we may now assume to be surjective. Analogous to the construction of the reduction of a holomorphically convex space, one says that two points in $X$ are equivalent if they are in the same connected component of an $F$-fiber. This is indeed an equivalence relation, and the quotient $Z := X/{\sim}$ is a complex space equipped with the direct image sheaf. Thus one decomposes $F$ into two maps $X \to Z \to Y$, where $X \to Z$ is a canonically associated surjective map with connected fiber, and $Z \to Y$ is a finite map.

This geometric proper mapping theorem is a preview of one of the deepest results in complex analysis: Grauert's direct image theorem. This concerns the images of sheaves, not just the images of points. For this, given a sheaf $\mathcal{S}$ on $X$ one defines the $q$th direct image sheaf on $Y$ as the sheaf associated to the presheaf which attaches to an open set $U$ in $Y$ the cohomology space $H^q(F^{-1}(U), \mathcal{S})$. Grauert's "Bildgarbensatz" states the following: "If $F : X \to Y$ is a proper holomorphic map, then all direct image sheaves of any coherent sheaf on $X$ are coherent on $Y$."

Complex Analysis and Algebraic Geometry

The interplay between these subjects has motivated research and produced deep results on both sides. Here we indicate just a few results of the type which show that objects which are a priori of an analytic nature are in fact algebraic geometric.

Projective Varieties

Let us begin with the algebraic geometric side of the picture where we consider algebraic subvarieties $X$ of projective space $\mathbb{P}^n(\mathbb{C})$. If $[z_0 : z_1 : \cdots : z_n]$ are homogeneous coordinates of $\mathbb{P}^n$, such a variety is the simultaneous zero-set, $X := V(P_1, \dots, P_m)$, of finitely many (holomorphic) homogeneous polynomials $P_i = P_i(z_0, \dots, z_n)$. Chow's theorem states that in this context there are no further analytic phenomena:

Theorem Closed complex analytic subsets of projective space $\mathbb{P}^n(\mathbb{C})$ are algebraic subvarieties.

This observation has numerous consequences. For example, if $F : X \to Y$ is a holomorphic map between algebraic varieties, then, by applying Chow's theorem to its graph, it follows that $F$ is algebraic.

Chow's theorem can be proved via an application of the Remmert–Stein theorem in a very simple situation. For this, let $\pi : \mathbb{C}^{n+1} \setminus \{0\} \to \mathbb{P}^n(\mathbb{C})$ be the standard projection, and let $Z := \pi^{-1}(X)$. Since $Z$ is positive dimensional, by the Remmert–Stein theorem it can be extended to an analytic subset of $\mathbb{C}^{n+1}$. The resulting subvariety $K(X)$ (the cone over $X$) is invariant by the $\mathbb{C}^*$-action which is defined by $v \mapsto \lambda v$ for $\lambda \in \mathbb{C}^*$. If $f$ is a holomorphic function on $\mathbb{C}^{n+1}$ which vanishes on $K(X)$, then we develop it in homogeneous polynomials $f = \sum_d P_d$ and note that $\lambda(f)(z) = f(\lambda z) = \sum_d \lambda^d P_d(z)$ also vanishes on $K(X)$ for all $\lambda$. Hence, all $P_d$ vanish on $K(X)$ and therefore the ideal of holomorphic functions which vanish on $K(X)$

is generated by the homogeneous polynomials which pseudoconcave, that is, when regarded from outside
vanish on K(X) and consequently finitely many of T, its boundary is strongly pseudoconvex.
these define X as a subvariety of Pn (C). To prove an embedding theorem, one must
Complements of subvarieties in projective varieties produce sections with prescribed properties. Sections
occur in numerous applications and are important of powers Lk are closely related to holomorphic
objects in complex geometry. Even complements Pn nY functions on the dual bundle space L
. This is due to
of subvarieties Y in the full projective space are not the fact that if  : L ! X is the bundle projection,
well understood. If Y is the intersection of a compact 1 (U ) ffi U  C is a local trivialization, and z is
projective variety X with a projective hyperplane, that a fiber coordinate, then a holomorphic function f on
is, Y is a hyperplane section, then XnY is affine. If Y is L
has a Taylor series development
q-codimensional in X, then XnY possesses a certain X
degree of Levi convexity and general theorems of f ðvÞ = s ðnÞððvÞÞzn ðvÞ
Andreotti and Grauert (1962) on the finiteness and
vanishing of cohomology indeed apply. However, not The function f is well defined on L. Hence, the
nearly as much is understood in this case as in the case transformation law for the zn must be canceled out
of a hyperplane section. by a transformation law for the coefficient functions
s (n). This implies that the s (n) are sections of Ln .
Hence, proving the existence of sections in the
Kodaira Embedding Theorem
powers of L with prescribed properties amounts to
Given that analytic subvarieties of projective space the same thing as proving the existence of holo-
are algebraic, one would like to understand whether morphic funtions on L
with analogous properties.
a given compact complex manifold or complex The positivity assumption on L is equivalent to
space can be realized as such a subvariety. Kodaira’s assuming that the tubular neighborhoods of the zero-
theorem is a prototype of such an embedding section in L
defined by the norm function associated
theorem. Most often one formulates projective to the dual metric are strongly pseudoconvex. The
embedding theorems in the language of bundles. solution to the Levi problem, which was sketched
For this, observe that if L ! X is a holomorphic above, then shows that L
is holomorphically convex,
line bundle over a compact complex manifold, then and its Remmert reduction is achieved by simply
its space (X, L) of holomorphic sections is a finite- blowing down its zero-section. In other words, L
is
dimensional vector space V. The zero-set of a section essentially a Stein manifold, and using Stein theory, it
s 2 V is a one-codimensional subvariety of X. is possible to produce enough holomorphic functions
Let us restrict our attention to bundles which are on L to show that some power Lk defines a
generated by their sections which for line bundles holomorphic embedding ’Lk : X ! P((X, Lk )
).
simply means that for every x 2 X there is some Bundles with this property are said to be ample, and
section s 2 V with s(x) 6¼ 0. It then follows that for thus we have outlined the following fact: ‘‘a line
every x ∈ X the space Hx := {s ∈ V; s(x) = 0} is a bundle which is Grauert-positive is ample.’’
one-codimensional vector subspace of V. Thus L It should be underlined that we defined the Chern
defines a holomorphic map ’L : X ! P(V
), x 7! Hx . class of L as the image in H 2 (X, Z) of its equivalence
Note that we must go to the projective space P(V
), class in H 1 (X, O
), that is, in this formulation the
because a linear function defining such an Hx is only Chern class is a Cech cohomology class. It is, however,
unique up to a complex multiple. often more useful to consider it as a deRham class
2
Projective embedding theorems state that under where it lies in the (1, 1)-part of HdeR (X, C). If h is a
certain conditions on L the map ’L is a holomorphic bundle metric as above, then the Levi form of the norm
embedding, that is, it is injective and is everywhere function is a representative c1 (L, h) of the Chern
of maximal rank in the analytic sense that its class of L
. Thus c1 (L, h) is an integral (1, 1)-form
differential has maximal rank. Here we outline a which represents c1 (L). It is called the Chern form of L
complex analytic approach of Grauert for proving associated to the metric h. The following is Kodaira’s
embedding theorems. It makes strong use of the formulation of his embedding theorem:
complex geometry of bundle spaces.
Theorem A line bundle L is ample if and only if it
Let L ! X be a holomorphic line bundle over a
possesses a metric h so that c1 (L, h) is positive definite.
compact complex manifold. A Hermitian bundle metric
is a smoothly varying metric h in the fibers of L. This Kodaira’s proof of this fact follows from his
defines a norm function v 7! jvj2 := h(v, v) on the vanishing theorem (see Several Complex Variables:
bundle space L. One says that L is positive if the tubular Compact Manifolds) in the same way the example
neighborhood T := {v 2 L; jvj3 < 1} is strongly of Theorem A was derived from Theorem B in the

first example in the subsection "Selected theorems." That an ample bundle is positive follows immediately from the fact that if $\varphi_{L^k}$ is an embedding, then its pullback of the (positive) hyperplane bundle on projective space agrees with $L^k$.

Finally, one asks the question "under what natural conditions can one construct a bundle $L$ which is positive?" The following is an example of an answer which is related to geometric quantization.

Suppose that $X$ is a compact complex manifold equipped with a symplectic structure $\omega$, that is, $\omega$ is a $d$-closed, nondegenerate 2-form. One says that $\omega$ is Kählerian if it is compatible with the complex structure $J$ in the sense that $\omega(Jv, Jw) = \omega(v, w)$ and $\omega(Jv, v) > 0$ for every $v$ and $w$ in every tangent space of $X$. Note that if $L$ is a positive line bundle, then it possesses a Hermitian metric $h$ such that $\omega = c_1(L, h)$ is a Kählerian structure on $X$.

It should be underlined that there are Kähler manifolds without positive bundles, for example, every compact complex torus $T = \mathbb{C}^n/\Gamma$ possesses the Kählerian structure which comes from the standard linear structure on $\mathbb{C}^n$. However, for $n > 1$ most such tori are not projective algebraic and therefore do not have positive bundles.

If, on the other hand, the Kählerian structure is integral, a condition that is automatic for the Chern form $c_1(L, h)$ of a bundle, then there is indeed a line bundle $L \to X$ equipped with a Hermitian metric $h$ such that $c_1(L, h) = \omega$. The condition of integrality can be formulated in terms of the integrals of $\omega$ over homology classes being integral or that its deRham class is in the image of the deRham isomorphism from the Čech cohomology $H^2(X, \mathbb{Z}) \otimes \mathbb{C}$ to $H^2_{\mathrm{deR}}(X, \mathbb{C})$. Coupling this with the embedding theorem for positive bundles, we have the following theorem of Kodaira:

Theorem If $(X, \omega)$ is Kählerian and $\omega$ is integral, then $X$ is projective algebraic.

This result has been refined in the following important way (a conjecture of Grauert and Riemenschneider proved with different methods by Siu (1984) and by Demailly (1985)): the same result holds if $\omega$ is only assumed to be semipositive and positive in at least one point.

For Grauert's proof of the Kodaira embedding theorem and a number of other important and beautiful results, we recommend the original paper (Grauert 1962).

Quotients of Bounded Domains

Let $D$ be a bounded domain in $\mathbb{C}^n$ and $\Gamma$ be a discrete subgroup of $\mathrm{Aut}(D)$ which is acting freely on $D$ with a compact quotient $X := D/\Gamma$. For $\gamma \in \Gamma$ let $J(\gamma, z)$ be the determinant of the Jacobian $d\gamma/dz$ and, given a holomorphic function $f$, consider (at least formally) the Poincaré series $\sum_{\gamma \in \Gamma} f(\gamma(z)) J(\gamma, z)^k$ of weight $k$. If $f$ is bounded and $k \geq 2$, then this series converges to a holomorphic function $P(f)$ on $D$ which satisfies the transformation rule $P(f)(\gamma(z)) = J(\gamma, z)^{-k} P(f)(z)$. Now the differential volume form $\Omega := dz_1 \wedge \cdots \wedge dz_n$ transforms in the opposite way (for $k = 1$). Therefore $s(f) = P(f)(\Omega)^k$ is a $\Gamma$-invariant section of the $k$th power of the determinant bundle $K := \Lambda^n T^*D$ of the holomorphic cotangent bundle of $D$. In other words, $s(f) \in \Gamma(X, K^k)$. Since the choice of $f$ may be varied to show that there are sufficiently many sections to separate points and to guarantee the maximal rank condition, it follows that the canonical bundle $K$ of $X$ is ample. Compact complex manifolds with ample canonical bundle are examples of manifolds which are said to be of general type (see Several Complex Variables: Compact Manifolds). Thus, this construction with Poincaré series proves the following: "Every compact quotient $D/\Gamma$ of a bounded domain is of general type and is in particular projective algebraic."

See also: Gauge Theoretic Invariants of 4-Manifolds; Moduli Spaces: An Introduction; Riemann Surfaces; Several Complex Variables: Compact Manifolds; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory].

Further Reading

Andreotti A and Grauert H (1962) Théorèmes de finitude pour la cohomologie des espaces complexes. Bulletin de la Société Mathématique de France 90: 193–259.
Demailly J-P (1985) Champs magnétiques et inégalités de Morse pour la d''-cohomologie. Annales de l'Institut Fourier 35: 189–229.
Demailly J-P, Complex analytic and algebraic geometry, http://www-fourier.ujf-grenoble.fr/demailly.
Grauert H (1962) Über Modifikationen und exzeptionelle analytische Mengen. Mathematische Annalen 129: 331–368.
Grauert H and Fritzsche K (2001) From Holomorphic Functions to Complex Manifolds. Heidelberg: Springer.
Griffiths PhA and Harris J (1978) Principles of Algebraic Geometry. New York: Wiley.
Grauert H and Remmert R (1979) Theory of Stein Spaces. Heidelberg: Springer.
Grauert H and Remmert R (1984) Coherent Analytic Sheaves. Heidelberg: Springer.
Grauert H, Peternell Th, and Remmert R (1994) Several Complex Variables VII. Encyclopedia of Mathematical Science, vol. 74. Heidelberg: Springer.
Narasimhan R (1971) Several Complex Variables. Chicago Lectures in Mathematics. Chicago, IL: University of Chicago Press.
Siu YT (1984) A vanishing theorem for semi-positive line bundles over non-Kähler manifolds. Journal of Differential Geometry 19: 431–452.
Skoda H (1982) Prolongement des courants positifs fermés de masse finie. Inventiones Mathematicae 66: 361–376.

Several Complex Variables: Compact Manifolds


A Huckleberry, Ruhr-Universität Bochum, Bochum, Germany
T Peternell, Universität Bayreuth, Bayreuth, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The aim of this article is to give an overview of the classification theory of compact complex manifolds. Very roughly, compact manifolds can be divided into three disjoint classes:

• Projective manifolds, that is, manifolds which can be embedded into some projective space, or manifolds birational to those, usually called Moishezon manifolds. These manifolds are treated by algebraic geometric methods, but very often transcendental methods are also indispensable.
• Compact (nonalgebraic) Kähler manifolds, that is, manifolds carrying a positive closed (1, 1)-form, or manifolds bimeromorphic to those. This class is treated mainly by transcendental methods from complex analysis and complex differential geometry. However, some algebraic methods are also of use here.
• General compact manifolds which are not bimeromorphic to Kähler manifolds. For two reasons we will essentially ignore this class in our survey. First, because of the lack of methods, not much is known; for example, there is still no complete classification of compact complex surfaces, and it is still unknown whether or not the 6-sphere carries a complex structure. And second, for the purpose of this encyclopedia, this class seems to be less important.

The main problems of classification theory can be described as follows.

• Birational classification: describe all projective (Kähler) manifolds up to birational (bimeromorphic) equivalence; find good models in every equivalence class. This includes the study of invariants.
• Biholomorphic classification: classify all projective (Kähler) manifolds with some nice property, for example, curvature, many symmetries, etc.
• Topological classification and moduli: study all complex structures on a given topological manifold, including the study of topological invariants of complex manifolds; describe complex structures up to deformations and describe moduli spaces.
• Symmetries: describe group actions and invariants; this is deeply related with the moduli problem.

In this article we will assume familiarity with basic notions and methods from several complex variables and/or algebraic geometry. In particular we refer to Several Complex Variables: Basic Geometric Theory in this encyclopedia.

We first note some standard notation used in this article. If $X$ is a complex manifold of dimension $n$, then $T_X$ will denote its holomorphic tangent bundle and $\Omega^p_X$ the sheaf of holomorphic $p$-forms, that is, the sheaf of sections of the bundle $\Lambda^p T^*_X$. The bundle $\Lambda^n T^*_X$ is usually denoted by $K_X$, the canonical bundle of $X$, and its sheaf of sections is the dualizing sheaf $\omega_X$, but frequently we will not distinguish between vector bundles and their sheaves of sections. An effective (Cartier) divisor on a normal space $X$ is a finite linear combination $\sum n_i Y_i$, where $n_i > 0$ and $Y_i \subset X$ are irreducible reduced subvarieties of codimension 1, which are locally given by one equation. If $L$ is a line bundle, then instead of $L^m$ we often write $mL$. If $X$ is a compact variety and $E$ a vector bundle or coherent sheaf, then the dimension of the finite-dimensional vector space $H^q(X, E)$ will be denoted by $h^q(X, E)$.

Birational Classification

Two compact manifolds $X$ and $Y$ are bimeromorphically equivalent if there exist nowhere dense analytic subsets $A \subset X$ and $B \subset Y$ and a biholomorphic map $X \setminus A \to Y \setminus B$ such that the closure of the graph is an analytic set in $X \times Y$. In case $X$ and $Y$ are algebraic, one rather says that $X$ and $Y$ are birationally equivalent. This induces an isomorphism between the function fields of $X$ and $Y$. If $X$ and $Y$ are projective or Moishezon (see below), then conversely an isomorphism of their function fields induces a birational equivalence between $X$ and $Y$.

Important examples are blow-ups of submanifolds; locally they can be described as follows. Suppose that locally $X$ is an open set $U \subset \mathbb{C}^n$ with coordinates $z_1, \ldots, z_n$ and that $A \subset X$ is given by $z_1 = \cdots = z_m = 0$. Then the blow-up $\hat{X} \to X$ is the submanifold $\hat{X} \subset U \times \mathbb{P}^{m-1}$ given by the equations
\[
  z_j t_i - z_i t_j = 0, \qquad 1 \leq i, j \leq m,
\]
where $t_j$ are homogeneous coordinates in $\mathbb{P}^{m-1}$. The Chow lemma says that any birational, even rational, map can be dominated by a sequence of blow-ups with smooth centers. Recently other factorizations ("weak factorization," using blow-ups and blow-downs) have been established.
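As a minimal worked illustration of the local description of a blow-up (the special case chosen here, the origin in $\mathbb{C}^2$ with coordinates $z_1, z_2$ and homogeneous coordinates $[t_1 : t_2]$ on $\mathbb{P}^1$, and the chart names $u$, $v$ are our own assumptions, not taken from the article), the construction can be written out in the two standard charts:

```latex
% Blow-up of the origin in C^2: an illustrative special case.
\[
  \hat{X} = \bigl\{ \bigl((z_1, z_2), [t_1 : t_2]\bigr) \in \mathbb{C}^2 \times \mathbb{P}^1
            \;:\; z_1 t_2 - z_2 t_1 = 0 \bigr\}.
\]
% Chart t_1 \neq 0: put u = t_2 / t_1, so z_2 = u z_1 and (z_1, u) are coordinates.
% Chart t_2 \neq 0: put v = t_1 / t_2, so z_1 = v z_2 and (v, z_2) are coordinates.
\[
  \pi(z_1, u) = (z_1, u z_1), \qquad \pi(v, z_2) = (v z_2, z_2).
\]
% The exceptional set is the fiber over the origin; \pi is biholomorphic elsewhere.
\[
  \pi^{-1}(0) = \{0\} \times \mathbb{P}^1 \cong \mathbb{P}^1, \qquad
  \hat{X} \setminus \pi^{-1}(0) \;\xrightarrow{\;\cong\;}\; \mathbb{C}^2 \setminus \{0\}.
\]
```

In each chart the total space is a copy of $\mathbb{C}^2$, so $\hat{X}$ is smooth, and the origin has been replaced by the projective line of directions through it.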

A projective manifold is a compact manifold which is a submanifold of some projective space $\mathbb{P}^N$. Of course, a projective manifold can be embedded into projective spaces in many ways. According to Chow's theorem (see Several Complex Variables: Basic Geometric Theory), $X \subset \mathbb{P}^N$ is automatically given by polynomial equations and is therefore an algebraic variety. This is part of Serre's GAGA principle which roughly says that all global analytic objects on a projective manifold, for example, vector bundles or coherent sheaves and their cohomology, are automatically algebraic. A compact manifold which is bimeromorphically equivalent to a projective manifold is called a Moishezon manifold. These arise naturally, for example, as quotients of group actions, compactifications, etc.

The most important birational invariant of compact manifolds is certainly the Kodaira dimension $\kappa(X)$. It is defined in three steps:

• $\kappa(X) = -\infty$ iff $h^0(mK_X) = 0$ for all $m \geq 1$.
• $\kappa(X) = 0$ iff $h^0(mK_X) \leq 1$ for all $m$, and $h^0(mK_X) = 1$ for some $m$.
• In all other cases we can consider the meromorphic map $f_m : X \to \mathbb{P}^{N(m)}$ associated to $H^0(mK_X)$ for all those $m$ for which $h^0(mK_X) \geq 2$. Let $V_m$ denote the (closure of the) image of $f_m$. Then $\kappa(X)$ is defined to be the maximal possible $\dim V_m$.

Recall that $f_m$ is defined by $[s_0 : \cdots : s_N]$ for a given basis $s_i$ of $H^0(mK_X)$, cf. Several Complex Variables: Basic Geometric Theory.

In the same way one defines the Kodaira (or Iitaka) dimension $\kappa(L)$ of a holomorphic line bundle $L$ (instead of $L = K_X$).

We are now going to describe geometrically the different birational equivalence classes and how to single out nice models in each class. Using methods in characteristic $p$, Miyaoka and Mori proved the following theorem:

Theorem 1 Let $X$ be a projective manifold and suppose that through a general point $x \in X$ there is a curve $C$ such that $K_X \cdot C < 0$. Then $X$ is uniruled, that is, there is a family of rational curves covering $X$.

A rational curve is simply the image of a nonconstant map $f : \mathbb{P}^1 \to X$. It is a simple matter to prove that uniruled manifolds have $\kappa(X) = -\infty$, but the converse is an important open problem. A step towards this conjecture has recently been made by Boucksom et al. (2004): if $K_X$ is not pseudoeffective, that is, $K_X$ "cannot be approximated by effective divisors," then $X$ is uniruled. Here one also finds a discussion of the case when $K_X$ is pseudoeffective.

Mori theory is central in birational geometry. To state the main results in this theory, we recall the notion of ampleness: a line bundle $L$ is ample if $L$ carries a metric of positive curvature. Alternatively some tensor power of $L$ has enough global sections to separate points and tangents and therefore gives an embedding into some projective space; see Several Complex Variables: Basic Geometric Theory for more details. The notion of nefness, which is in a certain sense the degenerate version of ampleness, plays a central role in Mori theory: a line bundle or divisor $L$ is nef if
\[
  L \cdot C = \deg(L|_C) \geq 0
\]
for all curves $C \subset X$. Examples are those $L$ carrying a metric of semipositive curvature, but the converse is not true. However, if $L$ is nef, there exists for all positive $\varepsilon > 0$ a metric $h_\varepsilon$ with curvature $\Theta_\varepsilon > -\varepsilon\omega$, where $\omega$ is a fixed positive form. In this context singular metrics on $L$ are also important. Locally they are given by $e^{-\varphi}$ with a locally integrable weight function $\varphi$ and they still have a curvature current $\Theta$. If $L$ has a singular metric with $\Theta$ bounded from below as current by a Kähler form, then $L$ is big, that is, $\kappa(L) = \dim X$, the birational version of ampleness. If one simply has $\Theta \geq 0$ as current, then $L$ is pseudoeffective (and vice versa). All these positivity notions only depend on the Chern class $c_1(L)$ of $L$ and therefore one considers the ample cone
\[
  K_{\mathrm{amp}} \subset \bigl(H^{1,1}(X) \cap H^2(X, \mathbb{Z})\bigr) \otimes \mathbb{R}
\]
and the cone of curves
\[
  NE(X) \subset \bigl(H^{n-1,n-1}(X) \cap H^{2n-2}(X, \mathbb{Z})\bigr) \otimes \mathbb{R}.
\]
The ample cone is by definition the closed cone of nef divisors, the interior being the ample classes, while the cone of curves is the closed cone generated by the fundamental classes of irreducible curves. A basic result says that these cones are dual to each other. The structure of $NE(X)$ in the part where $K_X$ is negative is very nice; one has the following cone theorem:

Theorem 2 $NE(X)$ is locally finite polyhedral in the half-space $\{K_X < 0\}$; the (geometrically) extremal rays contain classes of rational curves.

A ray $R = \mathbb{R}_+[a]$ is said to be extremal in a closed cone $K$ if the following holds: given $b, c \in K$ with $b + c \in R$, then $b, c \in R$. Given such an extremal ray $R \subset NE(X)$, one can find an ample line bundle $H$ and a rational number $t$ such that $K_X + tH$ is nef and $(K_X + tH) \cdot R = 0$. Using the Kawamata–Viehweg vanishing theorem, a generalization of Kodaira's vanishing theorem, which is one of the technical corner stones of the theory, one proves the so-called

Base point free theorem Some multiple of $K_X + tH$ is spanned by global sections and therefore defines a holomorphic map $f : X \to Y$ to some normal projective variety $Y$ contracting exactly those curves whose classes belong to $R$.

These maps are called "contractions of extremal rays" or "Mori contractions." In dimension 2 they are classical: either $X = \mathbb{P}^2$ and $f$ is the constant map, or $f$ is a $\mathbb{P}^1$-bundle, or $f$ is birational and the contraction of a $\mathbb{P}^1$ with normal bundle $\mathcal{O}(-1)$, that is, $f$ contracts a $(-1)$-curve. In particular $Y$ is again smooth. In the first two cases $X$ has a very precise structure, but in the third birational case one proceeds by asking whether or not $K_Y$ is nef. If it is not nef, we start again by choosing the contraction of an extremal ray; if $K_Y$ is nef, then a fundamental result says that a multiple of $K_Y$ is spanned. The class of manifolds with this property will be discussed later.

The situation in higher dimensions is much more complicated. For example, $Y$ need no longer be smooth. However the singularities which appear are rather special.

Definition 1 A normal variety $X$ is said to have only terminal singularities if first some multiple of the canonical (Weil) divisor $K_X$ is a Cartier divisor, that is, a line bundle (one says that $X$ is $\mathbb{Q}$-Gorenstein) and second if for some (hence for every) resolution of singularities $\pi : \hat{X} \to X$ the following holds:
\[
  K_{\hat{X}} = \pi^*(K_X) + \sum a_i E_i
\]
where the $E_i$ run over the irreducible $\pi$-exceptional divisors and the $a_i$ are strictly positive.

A brief remark concerning Weil divisors is in order: a Weil divisor is a finite linear combination $\sum a_i Y_i$ with $Y_i$ irreducible of codimension 1, but $Y_i$ is not necessarily locally defined by one equation. Recall that if each $Y_i$ is given locally by one equation, then the Weil divisor is Cartier. On a smooth variety these notions coincide.

One important consequence is that $\kappa(X) = \kappa(\hat{X})$ in case of terminal singularities, which is completely false for arbitrary singularities. Also notice that terminal singularities are rational: $R^q \pi_*(\mathcal{O}_{\hat{X}}) = 0$ for $q \geq 1$. Terminal singularities occur in codimension at least 3. Thus they are not present on surfaces. In dimension 3 terminal singularities are well understood. The main point in this context is that for a birational Mori contraction the image $Y$ often has terminal singularities.

Now the scheme of Mori theory is the following. Start with a projective manifold $X$. If $K_X$ is nef, we stop; this class is discussed later. If $K_X$ is not nef, then perform a Mori contraction $f : X \to Y$. There are two cases:

• If $\dim Y < \dim X$, then the general fiber $F$ is a manifold with ample $-K_F$, that is, a Fano manifold (discussed in the next section). Here we stop and observe that $\kappa(X) = -\infty$. Of course one can still investigate $Y$ and try to say more on the structure of the fibration $f$.
• If $\dim Y = \dim X$, then $Y$ has terminal singularities, unless $f$ is a small contraction, which means that no divisors are contracted. Thus if $f$ is not small, we may attempt to proceed by substituting $X$ by $Y$.

As a result one must develop the entire theory for varieties with terminal singularities. The big problem arises from small contractions $f$. In that case $K_Y$ cannot be $\mathbb{Q}$-Cartier and the machinery stops. So new methods are required. At this stage, other aspects of the theory lead one to attempt a certain surgery procedure which should improve the situation and allow one to continue as above. The expected surgery $Y \dashrightarrow Y'$, which takes place in codimension at least 2, is a "flip." The idea is that we should substitute a small set, namely the exceptional set of a small contraction, by some other small set (on which the canonical bundle will be positive) to improve the situation. Of course $Y'$ should possess only terminal singularities. The existence of flips is very deep and has been proved by S Mori in dimension 3. Moreover, there cannot be an infinite sequence of flips, at least in dimension at most 4.

In summary, by performing contractions and flips one constructs from $X$ a birational model $X'$ with terminal singularities such that either

• $K_{X'}$ is nef, in which case we call $X'$ a minimal model for $X$, or
• $X'$ admits a Fano fibration $f' : X' \to Y'$ (discussed below), in which case $\kappa(X) = \kappa(X') = -\infty$.

Up to now, Mori theory (via the work of Kawamata, Kollár, Mori, Reid, Shokurov, and others) works well in dimension 3 (and possibly in the near future in dimension 4) but in higher dimensions there are big problems with the existence of flips. Of course there might be completely different and possibly less precise ways to construct a minimal model. One way is to consider the canonical ring $R$ of a manifold of general type:
\[
  R = \bigoplus_{m \geq 0} H^0(mK_X)
\]
If $R$ is finitely generated as $\mathbb{C}$-algebra, then $\mathrm{Proj}(R)$ would be at least a canonical model which

has slightly more complicated singularities than a minimal model. However, it is known that this "finite generatedness problem" is equivalent to the existence of minimal models. On the other hand, if $X$ is of general type with $K_X$ nef (hence essentially ample) or more generally when some positive multiple $mK_X$ is generated by global sections, then $R$ is finitely generated.

We now must discuss the case of a nef canonical bundle. The behavior is predicted by the

Abundance conjecture. If $X$ has only terminal singularities and $K_X$ is nef, then some multiple $mK_X$ is spanned.

Up to now this conjecture is known only in dimension 3 (Kawamata, Kollár, Miyaoka). In higher dimensions it is even unknown if there is a single section in some multiple $mK_X$. If $mK_X$ is spanned, one considers the Stein factorization $f : X \to Y$ of the associated map, which is called the Iitaka fibration (if not birational), and we have $\dim Y = \kappa(X)$ by definition. The general fiber $F$ is a variety with $K_F \equiv 0$, a class discussed in the next section. If $f$ is birational, then $Y$ will be slightly singular (so-called canonical singularities) and $K_Y$ will be ample. Essentially we are in the case of negative Ricci curvature.

Everything that was outlined above holds for projective manifolds. In the Kähler case one would expect the same picture, but the methods completely fail, and new, analytic methods must be found. Only very few results are known in this context.

We come back to the case of a Fano fibration $f : X \to Y$. By definition the anticanonical bundle $-K_X$ is relatively ample so that the general fiber is a Fano variety. In this case there are no constraints on $Y$. To see how much of the geometry of $X$ is dictated by the rational curves, one considers the so-called rational quotient of $X$. Here we identify two very general points on $X$ if they can be joined by a chain of rational curves. In that way we obtain the rational quotient
\[
  f : X \dashrightarrow Y
\]
This map is merely meromorphic, but has the remarkable property of being "almost holomorphic," that is, the set of indeterminacies does not project onto $Y$. In other words, one has nice compact fibers not meeting the indeterminacy set. If $Y$ is just a point, then all points of $X$ can be joined by chains of rational curves and $X$ is called rationally connected. This notion is clearly birationally invariant.

A deep theorem of Graber–Harris–Starr states that, given a Fano fibration (or a fibration with rationally connected fibers) $f : X \to Y$, then $X$ is rationally connected if and only if $Y$ is.

Manifolds $X_n$ which are birational to $\mathbb{P}^n$ are called rational. If there merely exists a surjective ("dominant") rational map $\mathbb{P}^n \dashrightarrow X$, then $X$ is said to be unirational. Of course rational (resp. unirational) manifolds are rationally connected, but to decide whether a given manifold is rational/unirational is often a very deep problem. Therefore, rational connectedness is often viewed as a practical substitute for (uni)rationality.

Often it is very important to compute the Kodaira dimension of fiber spaces. Let us fix a holomorphic surjective map $f : X \to Y$ between projective manifolds and suppose that $f$ has connected fibers. Then the so-called conjecture $C_{n,m}$ states that
\[
  \kappa(X) \geq \kappa(F) + \kappa(Y)
\]
where $F$ is the general fiber of $f$. This conjecture is known in many cases, for example, when the general fiber is of general type, but it is wide open in general. It is deeply related to the existence of minimal models (Kawamata).

Biholomorphic Classification

In this section we discuss manifolds $X$ with

• ample anticanonical bundles $-K_X$ (Fano manifolds),
• trivial canonical bundles, and
• ample canonical bundles $K_X$.

Due to the solution of the Calabi conjecture by Yau and Aubin, these classes are characterized by a Kähler metric of positive (resp. zero, resp. negative) Ricci curvature. In principle, in view of the results of Mori theory, one should rather consider varieties with terminal singularities, but we ignore this aspect completely. Philosophically, up to birational equivalence all manifolds are somehow composed of those classes via fibrations, possibly also up to étale coverings.

Examples of Fano manifolds are hypersurfaces of degree at most $n + 1$ in $\mathbb{P}^{n+1}$, Grassmannians, or more generally homogeneous varieties $G/P$ with $G$ semisimple and $P$ a parabolic subgroup. Fano manifolds are simply connected. This can be seen either by classical differential geometric methods using a Kähler metric of positive curvature or via the fundamental

Theorem 3 Fano manifolds are rationally connected.

The only known proof of this fact uses, as in the uniruled criterion mentioned above, characteristic $p$ methods. By just using complex methods it is not

known how to construct a single rational curve (of course, in concrete examples the rational curves are seen immediately). One still has to observe that rationally connected manifolds are simply connected, which is not so surprising, since rational curves lift to the universal cover.

At least in principle, Fano manifolds can be classified:

Theorem 4 There are only finitely many families of Fano manifolds in every dimension.

A family (of Fano manifolds) is a submersion $\pi : X \to S$ (with $S$ irreducible) such that all fibers are Fano manifolds. The essential step is to bound $(-K_X)^n$.

An actual classification has been carried out in dimension up to 3; in dimension 2 one finds $\mathbb{P}^2$, $\mathbb{P}^1 \times \mathbb{P}^1$ and the so-called del Pezzo surfaces ($\mathbb{P}^2$ blown up in at most eight points in general position). In dimension 3 there are already 17 families of Fano 3-folds with $b_2 = 1$ and 88 families with $b_2 \geq 2$.

An extremely hard question is to decide whether a given Fano manifold is rational or unirational. Even in dimension 3 this is not completely decided.

The next class to be discussed are the manifolds with trivial canonical class $K_X$. This means that there is a holomorphic $n$-form without zeros ($n = \dim X$). Important examples are tori and hypersurfaces in $\mathbb{P}^{n+1}$ of degree $n + 2$. Simply connected manifolds with trivial canonical bundles are further divided into irreducible Calabi–Yau manifolds and irreducible symplectic manifolds. The first class is defined by requiring that there are no holomorphic $p$-forms for $0 < p < \dim X$ whereas the second is characterized by the existence of a holomorphic 2-form of everywhere maximal rank. A completely different characterization is by holonomy: an irreducible Calabi–Yau manifold has SU-holonomy whereas irreducible symplectic manifolds have Sp-holonomy (with respect to a suitable Kähler metric).

The splitting theorem of Beauville–Bogomolov–Kobayashi says

Theorem 5 Let $X$ be a projective (or compact Kähler) manifold with trivial canonical bundle. Then there exists a finite unbranched cover $\tilde{X} \to X$ such that
\[
  \tilde{X} = A \times \prod X_i \times \prod Y_j
\]
with $A$ a torus, $X_i$ irreducible Calabi–Yau, and $Y_j$ irreducible symplectic.

The key to the proof of this theorem is the existence of a Ricci-flat Kähler metric on $X$, a Kähler–Einstein metric with zero Ricci curvature. Actually one has a stronger result: instead of assuming $K_X$ to be trivial, just assume that $c_1(X) = 0$ in $H^2(X, \mathbb{R})$. Then there exists a finite unramified cover $\tilde{X} \to X$ such that $K_{\tilde{X}}$ is trivial. In view of Mori theory, normal projective varieties $X$ with at most terminal singularities and $K_X \equiv 0$ (i.e., $K_X \cdot C = 0$ for all curves) should also be investigated. It is expected that similar structure theorems hold; in particular $\pi_1(X)$ should be finite. The main difficulty is that there are no differential methods available; on the other hand an algebraic proof even for the splitting theorem in the smooth case is unknown.

Calabi–Yau manifolds play an important role in string theory and mirror symmetry (see Mirror Symmetry: A Geometric Survey). Here we mention only two basic problems. The first is the problem of boundedness:

Are there only finitely many families of Calabi–Yau manifolds in any dimension?

This problem is wide open; in particular one might ask:

Is the Hodge number $h^{1,2}$ bounded for Calabi–Yau 3-folds?

The other problem asks for the existence of rational curves. In all known examples there are rational curves, but a general existence proof is not known. The case where $b_2(X) = 1$ seems to be particularly difficult. If $b_2(X) \geq 2$, then in many cases one can hope to find a fibration or a birational map, at least for 3-folds. Given such a map, the existence of rational curves is simple. For example, if $D \subset X$ is an irreducible hypersurface which is not nef, choose $H$ ample and consider the a priori positive real number $p$ such that $D + pH$ is on the boundary of the ample cone. Then actually $p$ is rational and a suitable multiple $m(D + pH)$ is spanned and defines a contraction on $X$. This comes from "logarithmic Mori theory."

The above splitting theorem exhibits a torus factor and all holomorphic 1-forms on $X$ come from this torus. This principle generalizes: given any projective or compact Kähler manifold $X$, there exists a "universal object," the Albanese torus
\[
  \mathrm{Alb}(X) = H^0(\Omega^1_X)^* / H_1(X, \mathbb{Z})
\]
(which is algebraic if $X$ is) together with a holomorphic map
\[
  \alpha : X \to \mathrm{Alb}(X)
\]
the Albanese map. This Albanese map is given by integrating 1-forms and is often far from being surjective. The important property is now that, given a holomorphic 1-form $\omega$ on $X$, there exists a holomorphic 1-form $\eta$ on the Albanese torus such that $\omega = \alpha^*(\eta)$. The universal property reads as

follows: every map $X \to T$ to a torus factors via an affine map $\mathrm{Alb}(X) \to T$.

There is a nonabelian analog, the so-called Shafarevich map, but at the moment this map is only known to be meromorphic. It is an important tool to study the fundamental group $\pi_1(X)$. We refer to Campana (1996) and Kollár (1995).

In the following, Chern classes of holomorphic vector bundles will be important. Let $X$ be a compact complex manifold and $E$ a holomorphic vector bundle on $X$. The $j$th Chern class of $E$ is an element
\[ c_j(E) \in H^{2j}(X, \mathbb{Q}) \cap H^{j,j}(X). \]
It can be defined, for example, by putting a Hermitian metric on $E$, computing the curvature of the canonical connection compatible with both the metric and the holomorphic structure, and then applying certain linear operators coming from symmetric functions such as determinant and trace. Actually Chern classes can be attached to every complex topological vector bundle on a topological manifold; then $c_j(E)$ will simply live in $H^{2j}(X, \mathbb{R})$. There is also a purely algebraic construction by Grothendieck. We refer, for example, to Fulton (1984), also for a discussion of the elementary functorial properties of Chern classes. Here we just recall that for a rank-$r$ vector bundle $E$ the first Chern class is
\[ c_1(E) = c_1\Bigl(\bigwedge\nolimits^{r} E\Bigr), \]
where the Chern class of the line bundle $\bigwedge^r E$, as given in Several Complex Variables: Basic Geometric Theory, actually lives in $H^2(X, \mathbb{Z})$.

Finally we discuss manifolds with ample canonical class $K_X$. Here moduli questions often play a central role. Moduli spaces of surfaces with fixed $c_1^2$ and $c_2$ are very intensively studied (by Catanese, Ciliberto, and others). Here, without going into details, we will concentrate on the very interesting topic of Kähler–Einstein metrics.

A Kähler metric $\omega$ is said to be Kähler–Einstein, if its Ricci curvature $\mathrm{Ric}(\omega)$ is proportional to $\omega$. The proportionality factor $\lambda$ can be taken to be $-1$, $0$, $1$. In case $K_X$ is ample or trivial, Kähler–Einstein metrics always exist by Yau and Aubin (cases $\lambda = -1$, resp. $\lambda = 0$). However if $X$ is Fano, there are obstructions, and a Kähler–Einstein metric does not always exist.

An important consequence of the existence of a Kähler–Einstein metric on an $n$-dimensional manifold $X$ with ample canonical class is the Miyaoka–Yau inequality:
\[ c_1^2 \cdot \omega^{n-2} \le \frac{2(n+1)}{n}\, c_2 \cdot \omega^{n-2}. \]
In case of equality, $X$ is covered by the $n$-dimensional unit ball.

The same inequality holds in case $K_X = 0$, and as a consequence the Chern class $c_2(X)$ is in some sense semipositive. If $c_2(X) = 0$, then some finite unramified cover of $X$ is a torus.

There is an interesting relation to stability. Recall that a vector bundle $E$ on an $n$-dimensional compact Kähler manifold $X$ is semistable with respect to a given Kähler form $\omega$, if for all proper coherent subsheaves $\mathcal{F} \subset E$ of rank $r$ the following inequality holds:
\[ \frac{c_1(\mathcal{F}) \cdot \omega^{n-1}}{r} \le \frac{c_1(E) \cdot \omega^{n-1}}{n}. \]
In case of strict inequality, $E$ is said to be stable. The basic observation is now that the tangent bundle of a manifold with a Kähler–Einstein metric is semistable (with respect to the Kähler–Einstein metric). It is expected that Fano manifolds with $b_2 = 1$ have (semi?-)stable tangent bundles, although in certain situations they do not admit a Kähler–Einstein metric.

Again the first two Chern classes of a semistable vector bundle $E$ of rank $r$ fulfill an inequality:
\[ c_1^2(E) \cdot \omega^{n-2} \le \frac{2r}{r-1}\, c_2(E) \cdot \omega^{n-2}. \]
Equally important, semistable bundles with fixed numerical data form moduli spaces, this being the origin of the stability notion (Mumford). In this context, the notion of a Hermite–Einstein bundle is also important. Given a holomorphic vector bundle $E$ with a Hermitian metric $h$, there is a unique connection on $E$ compatible both with $h$ and the complex structure; its curvature $F_h$ is a (1,1)-form with values in $\mathrm{End}(E)$. Now suppose $(X, \omega)$ is Kähler and let $\Lambda F_h$ be the contraction of $F_h$ with $\omega$. Then $(E, h)$ is said to be Hermite–Einstein on $(X, \omega)$, if
\[ \Lambda F_h = \lambda\, \mathrm{id} \]
with some constant $\lambda$ and $\mathrm{id} : E \to E$ the identity. Notice that $(X, \omega)$ is Kähler–Einstein if $(T_X, h)$ is Hermite–Einstein over $(X, \omega)$ with $h$ the Kähler metric with Kähler form $\omega$. It is not so difficult to see that Hermite–Einstein bundles are semistable (with respect to the underlying Kähler form) and actually are direct sums of stable Hermite–Einstein bundles. Conversely, a very deep theorem of Uhlenbeck–Yau says that every stable vector bundle on a compact Kähler manifold is Hermite–Einstein. This is known as the Kobayashi–Hitchin correspondence; see Lübke and Teleman (1995).
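The two Chern-number inequalities quoted on this page (Miyaoka–Yau and the inequality for semistable bundles) are easy to test numerically once the intersection numbers are known. The following is only a minimal sketch: the function names are hypothetical, the inputs are assumed to be the already-computed numbers $c_1^2 \cdot \omega^{n-2}$ and $c_2 \cdot \omega^{n-2}$, and the sample values (a smooth quintic surface in $\mathbb{P}^3$ with $c_1^2 = 5$, $c_2 = 55$, and the tangent bundle of $\mathbb{P}^2$ with $c_1^2 = 9$, $c_2 = 3$) are standard examples supplied here for illustration, not taken from the article.

```python
from fractions import Fraction


def miyaoka_yau_holds(n, c1_sq, c2):
    """Check c1^2.w^(n-2) <= (2(n+1)/n) * c2.w^(n-2).

    n      -- complex dimension of the manifold (n >= 1)
    c1_sq  -- the intersection number c1^2 . w^(n-2)
    c2     -- the intersection number c2 . w^(n-2)
    """
    return Fraction(c1_sq) <= Fraction(2 * (n + 1), n) * Fraction(c2)


def semistable_chern_inequality_holds(r, c1_sq, c2):
    """Check c1(E)^2.w^(n-2) <= (2r/(r-1)) * c2(E).w^(n-2) for rank r >= 2."""
    if r < 2:
        raise ValueError("the inequality is only stated here for rank r >= 2")
    return Fraction(c1_sq) <= Fraction(2 * r, r - 1) * Fraction(c2)


# Assumed sample data: for a surface (n = 2) the factor w^(n-2) is trivial,
# so the inputs are just the Chern numbers c1^2 and c2.
print(miyaoka_yau_holds(2, 5, 55))                 # quintic surface in P^3
print(semistable_chern_inequality_holds(2, 9, 3))  # tangent bundle of P^2
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the borderline case of equality, which the text singles out for ball quotients, is not blurred by floating-point rounding.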

Topology, Invariants and Cohomology

Besides the Kodaira dimension there are other important invariants of compact complex manifolds. Of course there are topological invariants such as the Betti numbers $b_i(X) = \dim H^i(X, \mathbb{R})$ or the fundamental group $\pi_1(X)$. The fundamental group has been studied intensively in the last decade. A central question asks which groups can occur as fundamental groups of compact Kähler manifolds; another problem is the so-called Shafarevich conjecture which says that the universal cover of a compact Kähler manifold should be holomorphically convex. We refer to Campana (1996) and Kollár (1995).

The plurigenera,
\[ P_m(X) = \dim H^0(mK_X), \]
are also extremely important. Here, Siu recently proved that $P_m(X)$ is constant in families of projective manifolds. Other important invariants are $h^0(X, (\Omega^1_X)^{\otimes m})$. For example, it is conjectured that if
\[ h^0\bigl(X, (\Omega^1_X)^{\otimes m}\bigr) = 0 \]
for all positive $m$, then $X$ is rationally connected. Tensor powers of the cotangent bundle somehow capture more of the structure of $X$ than the Kodaira dimension but they are more difficult to treat. The relevance of the dimensions
\[ h^0\bigl(X, \Omega^p_X\bigr) \]
of holomorphic forms is easier to understand. More generally one has the Hodge numbers
\[ h^{p,q}(X) = \dim H^q\bigl(X, \Omega^p_X\bigr). \]
For compact Kähler manifolds, the Hodge decomposition states
\[ H^r(X, \mathbb{C}) = \bigoplus_{p+q=r} H^{p,q}(X). \]
Furthermore, Hodge duality,
\[ H^{p,q}(X) = \overline{H^{q,p}(X)}, \]
holds. These results form a cornerstone for the geometry of compact Kähler manifolds and the starting point of Hodge theory. Hodge theory is, for example, extremely important in the study of families of manifolds and moduli.

Concerning the topology of projective (Kähler) manifolds, the following two questions are very basic.
• Which invariants are topological (or diffeomorphic) invariants?
• What are the projective or Kähler structures on a given compact topological manifold?

Concerning the first, Hodge decomposition implies that the irregularity $h^0(\Omega^1_X)$ is actually a topological invariant. However it is unknown whether the number of holomorphic 2-forms is a topological invariant of Kähler 3-folds. Both questions have been intensively studied in dimension 2. However, in higher dimensions almost nothing is known. For example, it is not known whether there is a projective manifold of general type of even dimension which is homeomorphic to a quadric, that is, a hypersurface of degree 2 in projective space.

Other important tools in the study of projective/Kähler manifolds are listed below.
• Cohomological methods: Riemann–Roch theorem and holomorphic Morse inequalities; vanishing theorems (Kodaira, Kawamata–Viehweg, etc.); Serre duality. References: Demailly (2000), Demailly and Lazarsfeld, Fulton (1984), Grauert et al. (1994), Lazarsfeld (2004).
• $L^2$ methods: extension theorems, singular metrics, multiplier ideals, etc. References: Demailly and Lazarsfeld (2001), Lazarsfeld (2004).
• Theory of currents. Reference: Demailly (2000).
• Cycle space and Douady space, resp. Chow scheme and Hilbert scheme. References: Fulton (1984), Grauert et al. (1994), Kollár (1996).

We restrict our remarks to just one of these topics, vanishing theorems. The classical Kodaira–Nakano vanishing theorem says that if $X$ is a compact manifold of dimension $n$ with a positive (ample) line bundle $L$, then
\[ H^q(X, L \otimes \Omega^p_X) = 0 \]
for $p + q > n$. This is usually proved via harmonic theory, that is, by representing the cohomology space by harmonic $(p, q)$-forms with values in $L$ and by computing integrals of these forms. For many purposes, for example, for Mori theory, it is important to generalize this to line bundles which have some positivity properties but which are not ample. This works only for $p = n$; however, this is the most important part of the Kodaira–Nakano vanishing. The Kawamata–Viehweg vanishing theorem in its most basic version says that given a nef and big line bundle $L$, Kodaira vanishing still holds:
\[ H^q(X, L \otimes K_X) = 0 \]
for $q \ge 1$. But actually it is not necessary to assume $L$ nef; in fact the following is true. Let
\[ D = \sum a_i D_i \]
be an effective $\mathbb{Q}$-divisor, that is, all $a_i$ are positive rational numbers. Let $\langle a_i \rangle$ be the fractional part of $a_i$ and suppose that the $\mathbb{Q}$-divisor $\sum \langle a_i \rangle D_i$ has normal crossings. Let $\lceil a_i \rceil$ be the roundup of $a_i$ and put $L = \sum \lceil a_i \rceil D_i$. If $D$ is big and nef, then
\[ H^q(X, L \otimes K_X) = 0 \]
for $q \ge 1$. Of course $L$ itself need not be nef! This generalization is technically very important and yields substantial freedom for birational manipulations. We refer to Kawamata et al. (1987) and Lazarsfeld (2004). Even this is not the end of the story: the Kawamata–Viehweg theorem is embedded in the broader context of the Nadel vanishing theorem where multiplier ideal sheaves come into play. See Demailly and Lazarsfeld, and Lazarsfeld (2004).

Homogeneous Manifolds

In this section we consider vector fields and holomorphic group actions on compact (Kähler) manifolds. Our main reference is Huckleberry (1990) with further literature given there.

We denote by $\mathrm{Aut}(X)$ the group of holomorphic automorphisms of the compact manifold $X$ (well known to be a complex Lie group), and by $G := \mathrm{Aut}^0(X)$ the connected component containing the identity. The tangent space at any point of $\mathrm{Aut}^0(X)$ can naturally be identified with $H^0(X, T_X)$, the (finite-dimensional) space of holomorphic vector fields on $X$. In fact, by integration, a vector field determines a one-parameter group of automorphisms.

One says that $X$ is homogeneous if $G$ acts transitively on $X$. Therefore, one can write
\[ X = G/H \]
where $H$ is the isotropy subgroup of any point $x_0 \in X$, that is, the subgroup of automorphisms fixing $x_0$. Conversely one can take a complex Lie group $G$ and a closed subgroup $H$ and form the quotient $G/H$ which is again a complex manifold and in fact homogeneous (of course not necessarily compact).

Going back to a compact manifold $X$, the condition to be homogeneous can be rephrased by saying that the tangent bundle is generated by global sections, that is, if $x \in X$ and $e \in T_{X,x}$, then there exists $v \in H^0(X, T_X)$ such that $v(x) = e$. The easiest case is when $T_X$ is trivial. If $X$ is Kähler, this is exactly the case when $X$ is a torus, $X = \mathbb{C}^n/\Gamma$ with $\Gamma \cong \mathbb{Z}^{2n}$ a lattice, but without the Kähler assumption there are many more examples (the so-called parallelizable manifolds).

More generally, let us consider the case that the compact Kähler manifold $X$ admits a vector field $v$ without zeros, but $X$ is not required to be homogeneous. Then a theorem of Lieberman says that there is a finite unramified cover $f : \tilde X \to X$ and a splitting
\[ \tilde X \cong F \times T \]
with $T$ a torus, such that $f^*(v)$ is the pullback of a vector field on $T$. On the other hand, if $v$ has a zero, then a classical theorem of Rosenlicht says that $X$ is covered by rational curves, that is, $X$ is uniruled. In particular $\kappa(X) = -\infty$. Notice also that a manifold of general type can never carry a vector field, in other words, the automorphism group is discrete, even finite.

Coming back to compact homogeneous Kähler manifolds, the first thing to study is the Albanese map. The Borel–Remmert theorem says that
\[ X \cong T \times Q \]
where $T$ is the Albanese torus. This is proved using a maximal compact subgroup $K \subset G$ and by some averaging process over $K$. Moreover, $Q$ is a rational homogeneous manifold. The structure of $Q$ is more precisely the following. One can write $Q = S/P$ with $S$ a semisimple Lie group and $P \subset S$ parabolic, which means that $P$ contains a maximal connected solvable subgroup (the so-called Borel subgroup). The main ingredients of the proof are the Tits fibration, the Levi–Malcev decomposition of a Lie group into its radical and a semisimple group, and the Borel fixed point theorem:

Theorem 6 Let $G \subset \mathrm{GL}_n(\mathbb{C})$ be a connected solvable subgroup and $X \subset \mathbb{P}^{n-1}$ be a $G$-stable subvariety. Then $G$ has a fixed point on $X$.

In the homogeneous Kähler case, the rationality of $Q$ is seen by exhibiting an open subset in $Q$ which is algebraically isomorphic to $\mathbb{C}^n$.

Now things come down to classifying all rational homogeneous manifolds $S/P$, which is of course classical. Notice that all rational homogeneous manifolds are Fano. One knows that a rational homogeneous manifold with Betti number $b_2 \ge 2$ can be fibered over another rational homogeneous manifold with fibers rational homogeneous; this is actually a fiber bundle. The case that $b_2 = 1$ can be rephrased by saying that $P$ is maximal parabolic. This fiber bundle might not be trivial as shown by the projectivized tangent bundle $\mathbb{P}(T_{\mathbb{P}^n})$.

Compact Hermitian symmetric spaces form a particularly interesting subclass of homogeneous Kähler manifolds. A manifold equipped with a
Hermitian metric is called Hermitian symmetric, if for every $x \in X$ there exists an involutive holomorphic isometry fixing $x$. Mok has shown the remarkable fact that the simply connected compact Hermitian symmetric spaces are exactly those simply connected compact manifolds carrying a Kähler metric with semipositive holomorphic bisectional curvature. The only manifold having a metric with positive holomorphic bisectional curvature is $\mathbb{P}^n$ (Siu–Yau, Mori).

See also: Classical Groups and Homogeneous Spaces; Einstein Manifolds; Mirror Symmetry: A Geometric Survey; Moduli Spaces: An Introduction; Riemann Surfaces; Several Complex Variables: Basic Geometric Theory; Topological Sigma Models; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory].

Further Reading

Beltrametti M and Sommese AJ (1995) The Adjunction Theory of Complex Projective Varieties. Berlin: de Gruyter.
Boucksom S, Demailly JP, Paun M, and Peternell T (2004) The pseudo-effective cone of a compact Kähler manifold and varieties of negative Kodaira dimension math.AG/0405285
Campana F (1996) Kodaira dimension and fundamental group of compact Kähler manifolds. In: Andreatta M and Peternell T (eds.) Higher Dimensional Complex Varieties, pp. 89–162. Berlin: de Gruyter.
Demailly JP (2000) Complex analytic and algebraic geometry, http://www-fourier.ujf-grenoble.fr/ demailly.
Demailly JP and Lazarsfeld R (eds.) Vanishing Theorems and Effective Results in Algebraic Geometry, ICTP Lecture Notes, vol. 6, Trieste.
Fulton W (1984) Intersection Theory. Heidelberg: Springer.
Griffiths Ph and Harris J (1978) Principles of Algebraic Geometry. New York: Wiley.
Grauert H, Peternell Th, and Remmert R (1994) Several Complex Variables VII. Encyclopedia of Mathematical Sciences, vol. 74. Heidelberg: Springer.
Huckleberry A (1990) Actions of groups of holomorphic transformations. In: Barth W and Narasimhan R (eds.) Several Complex Variables VI, Encyclopedia of Mathematical Sciences, vol. 69, pp. 143–196. Berlin: Springer.
Kawamata Y, Matsuda K, and Matsuki K (1987) Introduction to the minimal model problem. Advanced Studies in Pure Mathematics 10: 283–360.
Kollár J (1995) Shafarevich Maps and Automorphic Forms. Princeton University Press.
Kollár J (1996) Rational Curves on Algebraic Varieties. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 32. Heidelberg: Springer.
Lazarsfeld R (2004) Positivity in Algebraic Geometry I, II. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 48 and 49. Heidelberg: Springer.
Lübke M and Teleman A (1995) The Kobayashi–Hitchin Correspondence. Singapore: World Scientific.
Matsuki K (2002) Introduction to the Mori Program, Universitext. Heidelberg: Springer.
Mori S (1987) Classification of higher-dimensional varieties. Proceedings of Symposia in Pure Mathematics 46: 269–331.
Parshin AN and Shafarevich IR (eds.) (1999) Algebraic Geometry V – Fano Varieties, vol. 47, Encyclopedia of Mathematical Sciences. Heidelberg: Springer.
Siu YT (1987) Lectures on Hermitian–Einstein Metrics for Stable Bundles and Kähler–Einstein Metrics. DMV Seminar, vol. 8. Basel: Birkhäuser.
Ueno K (1975) Classification Theory of Compact Complex Spaces. Lecture Notes in Math., vol. 439. Heidelberg: Springer.
Viehweg E (1995) Quasi-Projective Moduli for Polarized Manifolds. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 30. Heidelberg: Springer.

Shock Wave Refinement of the Friedman–Robertson–Walker Metric

B Temple, University of California at Davis, Davis, CA, USA
J A Smoller, University of Michigan, Ann Arbor, MI, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In the standard model of cosmology, the expanding universe of galaxies is described by a Friedman–Robertson–Walker (FRW) metric, which in spherical coordinates has a line element given by (Blau and Guth 1987, Weinberg 1972)
\[ ds^2 = -dt^2 + R^2(t)\left\{ \frac{dr^2}{1 - kr^2} + r^2\left[ d\theta^2 + \sin^2\theta\, d\varphi^2 \right] \right\} \qquad [1] \]
In this model, which accounts for things on the largest length scale, the universe is approximated by a space of uniform density and pressure at each fixed time, and the expansion rate is determined by the cosmological scale factor $R(t)$ that evolves according to the Einstein equations. Astronomical observations show that the galaxies are uniform on a scale of about one billion light years, and the expansion is critical (that is, $k = 0$ in [1]), and so, according to [1], on the largest scale, the universe is infinite flat Euclidean space $\mathbb{R}^3$ at each fixed time. Matching the Hubble constant to its observed values, and invoking the Einstein equations, the FRW model implies that the entire infinite universe $\mathbb{R}^3$ emerged all at once from a singularity ($R = 0$), some 14 billion years ago, and this event is referred to as the big bang.
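The figure of "some 14 billion years" can be related to the Hubble constant by a back-of-the-envelope computation of the Hubble time $1/H_0$. The sketch below is illustrative only: the value $H_0 = 70$ km/s/Mpc is an assumed round number, not one quoted in the article, the function name is hypothetical, and $1/H_0$ is an order-of-magnitude estimate of the age rather than the result of integrating the Einstein equations.

```python
KM_PER_MPC = 3.0857e19       # kilometres in one megaparsec
SECONDS_PER_YEAR = 3.156e7   # approximate number of seconds in one year


def hubble_time_gyr(h0_km_s_mpc):
    """Return the Hubble time 1/H0 in billions of years.

    h0_km_s_mpc -- Hubble constant in km/s/Mpc (must be positive)
    """
    if h0_km_s_mpc <= 0:
        raise ValueError("the Hubble constant must be positive")
    seconds = KM_PER_MPC / h0_km_s_mpc          # 1/H0 in seconds
    return seconds / SECONDS_PER_YEAR / 1e9     # convert to Gyr


H0_ASSUMED = 70.0  # km/s/Mpc, an assumed illustrative value
print(f"Hubble time for H0 = {H0_ASSUMED}: {hubble_time_gyr(H0_ASSUMED):.1f} Gyr")
```

With the assumed value the result lands close to 14 billion years; a smaller Hubble constant gives a proportionally larger estimate.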

In this article, which summarizes the work of the authors in Smoller and Temple (1995, 2003), we describe a two-parameter family of exact solutions of the Einstein equations that refine the FRW metric by a spherical shock wave cutoff. In these exact solutions, the expanding FRW metric is reduced to a region of finite extent and finite total mass at each fixed time, and this FRW region is bounded by an entropy-satisfying shock wave that emerges from the origin (the center of the explosion), at the instant of the big bang, $t = 0$. The shock wave, which marks the leading edge of the FRW expansion, propagates outward into a larger ambient spacetime from time $t = 0$ onward. Thus, in this refinement of the FRW metric, the big bang that set the galaxies in motion is an explosion of finite mass that looks more like a classical shock wave explosion than does the big bang of the standard model. (The fact that the entire infinite space $\mathbb{R}^3$ emerges at the instant of the big bang is, loosely speaking, a consequence of the Copernican principle, the principle that the Earth is not in a special place in the universe on the largest scale of things. With a shock wave present, the Copernican principle is violated, in the sense that the Earth then has a special position relative to the shock wave. But, of course, in these shock wave refinements of the FRW metric, there is a spacetime on the other side of the shock wave, beyond the galaxies, and so the scale of uniformity of the FRW metric, the scale on which the density of the galaxies is uniform, is no longer the largest length scale.)

In order to construct a mathematically simple family of shock wave refinements of the FRW metric that meet the Einstein equations exactly, we assume $k = 0$ (critical expansion), and we restrict to the case that the sound speed in the fluid on the FRW side of the shock wave is constant. That is, we assume an FRW equation of state $p = \sigma\rho$, where $\sigma$, the square of the sound speed $\sqrt{\partial p/\partial \rho}$, is constant, $0 < \sigma \le c^2$. At $\sigma = c^2/3$, this catches the important equation of state $p = (c^2/3)\rho$ which is correct at the earliest stage of big bang physics (Weinberg 1972). Also, as $\sigma$ ranges from 0 to $c^2$, we obtain qualitatively correct approximations to general equations of state. Taking $c = 1$ (we use the convention that $c = 1$, and Newton's constant $G = 1$ when convenient), the family of solutions is then determined by two parameters, $0 < \sigma \le 1$ and $r_* \ge 0$. The second parameter, $r_*$, is the FRW radial coordinate $r$ of the shock in the limit $t \to 0$, the instant of the big bang. (Since, when $k = 0$, the FRW metric is invariant under the rescaling $r \to \alpha r$ and $R \to \alpha^{-1} R$, we fix the radial coordinate $r$ by fixing the scale factor $\alpha$ with the condition that $R(t_0) = 1$ for some time $t_0$, say present time.) The FRW radial coordinate $r$ is singular with respect to radial arclength $\bar r = rR$ at the big bang $R = 0$, so setting $r_* > 0$ does not place the shock wave away from the origin at time $t = 0$. The distance from the FRW center to the shock wave tends to zero in the limit $t \to 0$ even when $r_* > 0$. In the limit $r_* \to \infty$, we recover from the family of solutions the usual (infinite) FRW metric with equation of state $p = \sigma\rho$; that is, we recover the standard FRW metric in the limit that the shock wave is infinitely far out. In this sense our family of exact solutions of the Einstein equations considered here represents a two-parameter refinement of the standard FRW metric.

The exact solutions for the case $r_* = 0$ were first constructed in Smoller and Temple (1995) (see also the notes by Smoller and Temple (1999)), and are qualitatively different from the solutions when $r_* > 0$, which were constructed later in Smoller and Temple (2003). The difference is that, when $r_* = 0$, the shock wave lies closer than one Hubble length from the center of the FRW spacetime throughout its motion (Smoller and Temple 2000), but when $r_* > 0$, the shock wave emerges at the big bang at a distance beyond one Hubble length. (The Hubble length depends on time, and tends to zero as $t \to 0$.) We show in Smoller and Temple (2003) that one Hubble length, equal to $c/H$, where $H = \dot R/R$, is a critical length scale in a $k = 0$ FRW metric because the total mass inside one Hubble length has a Schwarzschild radius equal exactly to one Hubble length. (Since $c/H$ is a good estimate for the age of the universe, it follows that the Hubble length $c/H$ is approximately the distance of light travel starting at the big bang up until the present time. In this sense, the Hubble length is a rough estimate for the distance to the furthermost objects visible in the universe.) That is, one Hubble length marks precisely the distance at which the Schwarzschild radius $\bar r_s \equiv 2M$ of the mass $M$ inside a radial shock wave at distance $\bar r$ from the FRW center, crosses from inside ($\bar r_s < \bar r$) to outside ($\bar r_s > \bar r$) the shock wave. If the shock wave is at a distance closer than one Hubble length from the FRW center, then $2M < \bar r$ and we say that the solution lies outside the black hole, but if the shock wave is at a distance greater than one Hubble length, then $2M > \bar r$ at the shock, and we say that the solution lies "inside" the black hole. Since $M$ increases like $\bar r^3$, it follows that $2M < \bar r$ for $\bar r$ sufficiently small, and $2M > \bar r$ for $\bar r$ sufficiently large, so there must be a critical radius at which $2M = \bar r$, and we show in what follows (see also Smoller and Temple (2003)) that when $k = 0$, this critical radius is exactly the Hubble length. When the parameter $r_* = 0$, the family of solutions for $0 < \sigma \le 1$ starts at the big bang, and evolves thereafter
"outside" the black hole, satisfying $2M/\bar r < 1$ everywhere from $t = 0$ onward. But, when $r_* > 0$, the shock wave is further out than one Hubble length at the instant of the big bang, and the solution begins with $2M/\bar r > 1$ at the shock wave. From this time onward, the spacetime expands until eventually the Hubble length catches up to the shock wave at $2M/\bar r = 1$, and then passes the shock wave, making $2M/\bar r < 1$ thereafter. Thus, when $r_* > 0$, the whole spacetime begins inside the black hole (with $2M/\bar r > 1$ for sufficiently large $\bar r$), but eventually evolves to a solution outside the black hole. The time when $\bar r = 2M$ actually marks the event horizon of a white hole (the time reversal of a black hole) in the ambient spacetime beyond the shock wave. We show that, when $r_* > 0$, the time when the Hubble length catches up to the shock wave comes after the time when the shock wave comes into view at the FRW center, and when $2M = \bar r$ (assuming $t$ is so large that we can neglect the pressure from this time onward), the whole solution emerges from the white hole as a finite ball of mass expanding into empty space, satisfying $2M/\bar r < 1$ everywhere thereafter. In fact, when $r_* > 0$, the zero pressure Oppenheimer–Snyder solution outside the black hole gives the large-time asymptotics of the solution (Oppenheimer and Snyder 1939, Smoller and Temple 1988, 2004 and the comments after Theorems 6–8 below).

The exact solutions in the case $r_* = 0$ give a general-relativistic version of an explosion into a static, singular, isothermal sphere of gas, qualitatively similar to the corresponding classical explosion outside the black hole (Smoller and Temple 1995). The main difference physically between the cases $r_* > 0$ and $r_* = 0$ is that, when $r_* > 0$ (the case when the shock wave emerges from the big bang at a distance beyond one Hubble length), a large region of uniform expansion is created behind the shock wave at the instant of the big bang. Thus, when $r_* > 0$, lightlike information about the shock wave propagates inward from the wave, rather than outward from the center, as is the case when $r_* = 0$ and the shock lies inside one Hubble length. (One can imagine that when $r_* > 0$, the shock wave can get out through a great deal of matter early on when everything is dense and compressed, and still not violate the speed of light bound. Thus, when $r_* > 0$, the shock wave "thermalizes," or more accurately "makes uniform," a large region at the center, early on in the explosion.) It follows that, when $r_* > 0$, an observer positioned in the FRW spacetime inside the shock wave will see exactly what the standard model of cosmology predicts, up until the time when the shock wave comes into view in the far field. In this sense, the case $r_* > 0$ gives a black hole cosmology that refines the standard FRW model of cosmology to the case of finite mass. One of the surprising differences between the case $r_* = 0$ and the case $r_* > 0$ is that, when $r_* > 0$, the important equation of state $p = \rho/3$ comes out of the analysis as special at the big bang. When $r_* > 0$, the shock wave emerges at the instant of the big bang at a finite nonzero speed (the speed of light) only for the special value $\sigma = 1/3$. In this case, the equation of state on both sides of the shock wave tends to the correct relation $p = \rho/3$ as $t \to 0$, and the shock wave decelerates to subluminous speed for all positive times thereafter (see Smoller and Temple (2003) and Theorem 8 below).

In all cases $0 < \sigma \le 1$, $r_* \ge 0$, the spacetime metric that lies beyond the shock wave is taken to be a metric of Tolman–Oppenheimer–Volkoff (TOV) form (Oppenheimer and Volkoff 1939):
\[ d\bar s^2 = -B(\bar r)\, d\bar t^2 + A^{-1}(\bar r)\, d\bar r^2 + \bar r^2\left[ d\theta^2 + \sin^2\theta\, d\varphi^2 \right] \qquad [2] \]
The metric [2] is in standard Schwarzschild coordinates (diagonal with radial coordinate equal to the area of the spheres of symmetry), and the metric components depend only on the radial coordinate $\bar r$. Barred coordinates are used to distinguish TOV coordinates from unbarred FRW coordinates for shock matching. The mass function $M(\bar r)$ enters as a metric component through the relation
\[ A = 1 - \frac{2M(\bar r)}{\bar r} \qquad [3] \]
The TOV metric [2] has a very different character depending on whether $A > 0$ or $A < 0$; that is, depending on whether the solution lies outside the black hole or inside the black hole. In the case $A > 0$, $\bar r$ is a spacelike coordinate, and the TOV metric describes a static fluid sphere in general relativity. (When $A > 0$, for example, the metric [2] is the starting point for the stability limits of Buchdahl and Chandrasekhar for stars (Weinberg 1972, Smoller and Temple 1997, 1998).) When $A < 0$, $\bar r$ is the timelike coordinate, and [2] is a dynamical metric that evolves in time. The exact shock wave solutions are obtained by taking $\bar r = R(t)r$ to match the spheres of symmetry, and then matching the metrics [1] and [2] at an interface $\bar r = \bar r(t)$ across which the metrics are Lipschitz continuous. This can be done in general. In order for the interface to be a physically meaningful shock surface, we use the result in Theorem 4 below (see Smoller and Temple (1994)) that a single additional conservation constraint is sufficient to rule out $\delta$-function sources at the shock (the Einstein equations $G = \kappa T$ are second order in the metric, and
so -function sources will in general be present at a bounds on the equations of state imply that the
Lipschitz continuous matching of metrics), and equations of state are qualitatively reasonable, and
guarantee that the matched metric solves the Einstein we expect that this family of solutions will capture
equations in the weak sense. The Lipschitz matching the gross dynamics of solutions when more general
of the metrics, together with the conservation equations of state are imposed. For more general
constraint, leads to a system of ordinary differential equations of state, other waves, such as rarefaction
equations (ODEs) that determine the shock position, waves and entropy waves, would need to be present
together with the TOV density and pressure at the to meet the conservation constraint, and thereby
shock. Since the TOV metric depends only on r̄, the mediate the transition across the shock wave. Such
equations thus determine the TOV spacetime beyond transitional waves would be very difficult to model in
the shock wave. To obtain a physically meaningful an exact solution. But, the fact that we can find
outgoing shock wave, we impose the constriant p̄   global solutions that meet our physical bounds, and
to ensure that the equation of state on the TOV side that are qualitatively the same for all values of  2
of the shock is physically reasonable, and as the (0,1] and all initial shock positions, strongly suggests
entropy condition we impose the condition that the that such a shock wave would be the dominant wave
shock be compressive. For an outgoing shock wave, in a large class of problems.
this is the condition  > , p > p̄, that the pressure In the next section, the FRW solution is derived
and density be larger on the side of the shock that for the case  = const., and the Hubble length is
receives the mass flux – the FRW side when the discussed as a critical length scale. Subsequently,
shock wave is propagating away from the FRW the general theorems in Smoller and Temple (1994)
center. This condition breaks the time-reversal sym- for matching gravitational metrics across shock
metry of the equations, and is sufficient to rule out rarefaction shocks in classical gas dynamics (Smoller 1983, Smoller and Temple 2003). The ODEs, together with the equation-of-state bound and the conservation and entropy constraints, determine a unique solution of the ODEs for every $0 < \sigma \le 1$ and $\bar r_* \ge 0$, and this provides the two-parameter family of solutions discussed here (Smoller and Temple 1995, 2003). The Lipschitz matching of the metrics implies that the total mass M is continuous across the interface, and so when $r_* > 0$, the total mass of the entire solution, inside and outside the shock wave, is finite at each time t > 0, and both the FRW and TOV spacetimes emerge at the big bang. The total mass M on the FRW side of the shock has the meaning of total mass inside the radius $\bar r$ at fixed time, but on the TOV side of the shock, M does not evolve according to equations that give it the interpretation as a total mass because the metric is inside the black hole. Nevertheless, after the spacetime emerges from the black hole, the total mass takes on its usual meaning outside the black hole, and time asymptotically the big bang ends with an expansion of finite total mass in the usual sense. Thus, when $r_* > 0$, our shock wave refinement of the FRW metric leads to a big bang of finite total mass.

A final comment is in order regarding our overall philosophy. The family of exact shock wave solutions described here are rough models in the sense that the equation of state on the FRW side satisfies the condition $\sigma$ = const., and the equation of state on the TOV side is determined by the equations, and therefore cannot be imposed. Nevertheless, the

waves are employed. This is followed by a discussion of the construction of the family of solutions in the case $r_* = 0$. Finally, the case $r_* > 0$ is discussed. (Details can be found in Smoller and Temple (1995, 2003, 2004).)

The FRW Metric

According to Einstein's theory of general relativity, all properties of the gravitational field are determined by a Lorentzian spacetime metric tensor g, whose line element in a given coordinate system $x = (x^0, \ldots, x^3)$ is given by

$$ds^2 = g_{ij}\,dx^i dx^j \qquad [4]$$

(We use the Einstein summation convention, whereby repeated up–down indices are assumed summed from 0 to 3.) The components $g_{ij}$ of the gravitational metric g satisfy the Einstein equations

$$G^{ij} = \kappa T^{ij}, \qquad T^{ij} = (\rho c^2 + p)\,w^i w^j + p\,g^{ij} \qquad [5]$$

where we assume that the stress-energy tensor T corresponds to that of a perfect fluid. Here G is the Einstein curvature tensor,

$$\kappa = \frac{8\pi\mathcal{G}}{c^4} \qquad [6]$$

is the coupling constant, $\mathcal{G}$ is Newton's gravitational constant, c is the speed of light, $\rho c^2$ is the energy density, p is the pressure, and $w = (w^0, \ldots, w^3)$ are the components of the 4-velocity of the fluid (cf. Weinberg 1972), and again we use the convention that c = 1 and $\mathcal{G} = 1$ when convenient.
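As a rough numerical illustration of the coupling constant in [6], the following sketch evaluates $\kappa = 8\pi\mathcal{G}/c^4$ in SI units and in the units c = 1, $\mathcal{G} = 1$ used when convenient. The numerical values of Newton's constant and the speed of light are standard reference values supplied here as assumptions (they are not quoted in the text), and the function and variable names are illustrative only.

```python
import math

# Assumed SI reference values (not taken from the article).
G_NEWTON = 6.67430e-11    # Newton's constant, m^3 kg^-1 s^-2
C_LIGHT = 299_792_458.0   # speed of light, m s^-1


def coupling_constant(G=G_NEWTON, c=C_LIGHT):
    """Return kappa = 8*pi*G/c^4, the coupling constant in [6]."""
    return 8.0 * math.pi * G / c**4


kappa_si = coupling_constant()                     # SI units
kappa_geometric = coupling_constant(G=1.0, c=1.0)  # units with G = c = 1
print(kappa_si, kappa_geometric)
```

In the units with c = 1 and $\mathcal{G} = 1$ the coupling constant reduces to $8\pi$, which is the value assumed wherever $\kappa$ is needed numerically.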
Putting the metric ansatz [1] into the Einstein equations [5] gives the equations for the FRW metric (Weinberg 1972),

$$H^2 = \left(\frac{\dot R}{R}\right)^2 = \frac{\kappa}{3}\rho - \frac{k}{R^2} \qquad [7]$$

and

$$\dot\rho = -3(p + \rho)H \qquad [8]$$

The unknown quantities R, $\rho$, and p are assumed to be functions of the FRW coordinate time t alone, and the "dot" denotes differentiation with respect to t.

To verify that the Hubble length $\bar r_{\rm crit} = 1/H$ is the limit for FRW–TOV shock matching outside a black hole, write the FRW metric [1] in standard Schwarzschild coordinates $x = (\bar r, \bar t)$, where the metric takes the form

$$ds^2 = -B(\bar r, \bar t)\,d\bar t^2 + A(\bar r, \bar t)^{-1}\,d\bar r^2 + \bar r^2\,d\Omega^2 \qquad [9]$$

and the mass function $M(\bar r, \bar t)$ is defined through the relation

$$A = 1 - \frac{2M}{\bar r} \qquad [10]$$

It is well known that a general spherically symmetric metric can be transformed to the form [9] by coordinate transformation (see Weinberg (1972) and Groah and Temple (2004)). Substituting $\bar r = Rr$ into [1] and diagonalizing the resulting metric, we obtain (see Smoller and Temple (2004) for details)

$$ds^2 = -\frac{1}{\psi^2}\left\{\frac{1 - kr^2}{1 - kr^2 - H^2\bar r^2}\right\}d\bar t^2 + \left\{\frac{1}{1 - kr^2 - H^2\bar r^2}\right\}d\bar r^2 + \bar r^2\,d\Omega^2 \qquad [11]$$

where $\psi$ is an integrating factor that solves the equation

$$\frac{\partial}{\partial\bar r}\left(\psi\,\frac{1 - kr^2 - H^2\bar r^2}{1 - kr^2}\right) - \frac{\partial}{\partial t}\left(\psi\,\frac{H\bar r}{1 - kr^2}\right) = 0 \qquad [12]$$

and the time coordinate $\bar t = \bar t(t, \bar r)$ is defined by the exact differential

$$d\bar t = \psi\left\{\frac{1 - kr^2 - H^2\bar r^2}{1 - kr^2}\right\}dt + \psi\left\{\frac{H\bar r}{1 - kr^2}\right\}d\bar r \qquad [13]$$

Now using [10] in [7], it follows that

$$M(t, \bar r) = \frac{\kappa}{2}\int_0^{\bar r}\rho(t)\,s^2\,ds = \frac{1}{3}\,\frac{\kappa}{2}\,\rho\,\bar r^3 \qquad [14]$$

Since in the FRW metric, $\bar r = Rr$ measures arclength along radial geodesics at fixed time, we see from [14] that $M(t, \bar r)$ has the physical interpretation as the total mass inside radius $\bar r$ at time t in the FRW metric. Restricting to the case of critical expansion k = 0, we see from [7], [14], and [13] that $\bar r = H^{-1}$ is equivalent to $2M/\bar r = 1$, and so at fixed time t, the following equivalences are valid:

$$\bar r = H^{-1} \quad\text{iff}\quad \frac{2M}{\bar r} = 1 \quad\text{iff}\quad A = 0 \qquad [15]$$

We conclude that $\bar r = H^{-1}$ is the critical length scale for the FRW metric at fixed time t in the sense that $A = 1 - 2M/\bar r$ changes sign at $\bar r = H^{-1}$, and so the universe lies inside a black hole beyond $\bar r = H^{-1}$, as claimed above. Now, we proved in Smoller and Temple (1998) that the standard TOV metric outside the black hole cannot be continued into A = 0 except in the very special case $\rho = 0$. (It takes an infinite pressure to hold up a static configuration at the event horizon of a black hole.) Thus, shock matching beyond one Hubble length requires a metric of a different character, and for this purpose, we introduce the TOV metric inside the black hole – a metric of TOV form, with A < 0, whose fluid is comoving with the timelike radial coordinate $\bar r$ (Smoller and Temple 2004).

The Hubble length $\bar r_{\rm crit} = c/H$ is also the critical distance at which the outward expansion of the FRW metric exactly cancels the inward advance of a radial light ray impinging on an observer positioned at the origin of a k = 0 FRW metric. Indeed, by [1], a light ray traveling radially inward toward the center of an FRW coordinate system satisfies the condition

$$c^2\,dt^2 = R^2\,dr^2 \qquad [16]$$

so that

$$\frac{d\bar r}{dt} = \dot R r + R\dot r = H\bar r - c = H\left(\bar r - \frac{c}{H}\right) > 0 \qquad [17]$$

if and only if

$$\bar r > \frac{c}{H}$$

Thus, the arclength distance from the origin to an inward moving light ray at fixed time t in a k = 0 FRW metric will actually increase as long as the light ray lies beyond the Hubble length. An inward moving light ray will, however, eventually cross the Hubble length and reach the origin in finite proper time, due to the increase in the Hubble length with time.

We now calculate the infinite redshift limit in terms of the Hubble length. It is well known that light emitted at $(t_e, r_e)$ at wavelength $\lambda_e$ in an FRW spacetime will be observed at $(t_0, r_0)$ at wavelength $\lambda_0$ if

$$\frac{R_0}{R_e} = \frac{\lambda_0}{\lambda_e}$$
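The critical-length statements [15] and [17] can be checked numerically for k = 0. The following is a minimal sketch, assuming units with c = 1 and $\kappa = 8\pi$ (that is, $\mathcal{G} = 1$); the function names and the sample value of H are illustrative assumptions, not part of the original text.

```python
import math

KAPPA = 8.0 * math.pi  # coupling constant with G = c = 1 (assumed units)


def frw_density(H, kappa=KAPPA):
    """Density from [7] with k = 0: H^2 = (kappa/3) * rho."""
    return 3.0 * H**2 / kappa


def mass_function(rho, rbar, kappa=KAPPA):
    """M(t, rbar) from [14]: (kappa/2) * integral_0^rbar rho s^2 ds."""
    return (kappa / 6.0) * rho * rbar**3


def metric_A(H, rbar):
    """A = 1 - 2M/rbar from [10], evaluated on the k = 0 FRW metric."""
    return 1.0 - 2.0 * mass_function(frw_density(H), rbar) / rbar


def light_ray_rate(H, rbar, c=1.0):
    """d(rbar)/dt for an inward radial light ray, from [17]."""
    return H * rbar - c


H = 0.5                    # illustrative value of the Hubble constant
hubble_length = 1.0 / H
for factor in (0.5, 1.0, 2.0):
    rbar = factor * hubble_length
    print(factor, metric_A(H, rbar), light_ray_rate(H, rbar))
```

Inside one Hubble length A is positive and the light ray approaches the origin; beyond it A is negative and the arclength distance of the ray from the origin grows, in line with [15] and [17].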
Moreover, the redshift factor z is defined by

$$z = \frac{\lambda_0}{\lambda_e} - 1$$

Thus, infinite redshifting occurs in the limit $R_e \to 0$, where R = 0, t = 0 is the big bang. Consider now a light ray emitted at the instant of the big bang, and observed at the FRW origin at present time $t = t_0$. Let $r_\infty$ denote the FRW coordinate at time $t \to 0$ of the furthest objects that can be observed at the FRW origin before time $t = t_0$. Then $r_\infty$ marks the position of objects at time t = 0 whose radiation would be observed as infinitely redshifted (assuming no scattering). Note then that a shock wave emanating from $\bar r = 0$ at the instant of the big bang will be observed at the FRW origin before present time $t = t_0$ only if its position r at the instant of the big bang satisfies the condition $r < r_\infty$. To estimate $r_\infty$, note first that from [16] it follows that an incoming radial light ray in an FRW metric follows a lightlike trajectory r = r(t) if

$$r - r_e = -\int_{t_e}^{t}\frac{d\tau}{R(\tau)}$$

and thus

$$r_\infty = \int_0^{t_0}\frac{d\tau}{R(\tau)} \qquad [18]$$

Using this, the following theorem can be proved (Smoller and Temple 2004).

Theorem 1 If the pressure p satisfies the bounds

$$0 \le p \le \tfrac{1}{3}\rho \qquad [19]$$

then, for any equation of state, the age of the universe $t_0$ and the infinite red shift limit $r_\infty$ are bounded in terms of the Hubble length by

$$\frac{1}{2H_0} \le t_0 \le \frac{2}{3H_0} \qquad [20]$$

$$\frac{1}{H_0} \le r_\infty \le \frac{2}{H_0} \qquad [21]$$

(We have assumed in Theorem 1 that R = 0 when t = 0 and R = 1 when $t = t_0$, $H = H_0$.)

The next theorem gives closed-form solutions of the FRW equations [7], [8] in the case when $\sigma$ = const. As a special case, we recover the bounds in [20] and [21] from the cases $\sigma = 0$ and 1/3.

Theorem 2 Assume k = 0 and the equation of state

$$p = \sigma\rho \qquad [22]$$

where $\sigma$ is taken to be constant,

$$0 \le \sigma \le 1$$

then (assuming an expanding universe $\dot R > 0$), the solution of system [7], [8] satisfying R = 0 at t = 0 and R = 1 at $t = t_0$ is given by

$$\rho = \frac{4}{3\kappa(1 + \sigma)^2}\,\frac{1}{t^2} \qquad [23]$$

$$R = \left(\frac{t}{t_0}\right)^{2/[3(1+\sigma)]} \qquad [24]$$

$$\frac{H}{H_0} = \frac{t_0}{t} \qquad [25]$$

Moreover, the age of the universe $t_0$ and the infinite red shift limit $r_\infty$ are given exactly in terms of the Hubble length by

$$t_0 = \frac{2}{3(1 + \sigma)}\,\frac{1}{H_0} \qquad [26]$$

$$r_\infty = \frac{2}{1 + 3\sigma}\,\frac{1}{H_0} \qquad [27]$$

From [27] we conclude that a shock wave will be observed at the FRW origin before present time $t = t_0$ only if its position r at the instant of the big bang satisfies the condition

$$r < \frac{2}{1 + 3\sigma}\,\frac{1}{H_0}$$

Note that $r_\infty$ ranges from one-half to two Hubble lengths as $\sigma$ ranges from 1 to 0, taking the intermediate value of one Hubble length at $\sigma = 1/3$ (cf. [21]).

Note that using [23] and [24] in [14], it follows that

$$M = \frac{\kappa}{2}\int_0^{\bar r}\rho(t)\,s^2\,ds = \frac{2r^3}{9(1 + \sigma)^2\,t_0^{2/(1+\sigma)}}\;t^{-2\sigma/(1+\sigma)} \qquad [28]$$

so $\dot M < 0$ if $\sigma > 0$. It follows that if $p = \sigma\rho$, $\sigma$ = const. > 0, then the total mass inside radius r = const. decreases in time.

The General Theory of Shock Matching

The matching of the FRW and TOV metrics in the next two sections is based on the following theorems that were derived in Smoller and Temple (1994). (Theorems 3 and 4 apply to non-lightlike shock surfaces. The lightlike case was discussed by Scott (2002).)

Theorem 3 Let $\Sigma$ denote a smooth, three-dimensional shock surface in spacetime with spacelike
normal vector n relative to the spacetime metric g; let K denote the second fundamental form on $\Sigma$; and let G denote the Einstein curvature tensor. Assume that the components $g_{ij}$ of the gravitational metric g are smooth on either side of $\Sigma$ (continuous up to the boundary on either side separately), and Lipschitz continuous across $\Sigma$ in some fixed coordinate system. Then the following statements are equivalent:

(i) [K] = 0 at each point of $\Sigma$.
(ii) The curvature tensors $R^i_{jkl}$ and $G_{ij}$, viewed as second-order operators on the metric components $g_{ij}$, produce no $\delta$-function sources on $\Sigma$.
(iii) For each point $P \in \Sigma$, there exists a $C^{1,1}$ coordinate transformation defined in a neighborhood of P, such that, in the new coordinates (which can be taken to be the Gaussian normal coordinates for the surface), the metric components are $C^{1,1}$ functions of these coordinates.
(iv) For each $P \in \Sigma$, there exists a coordinate frame that is locally Lorentzian at P, and can be reached within the class of $C^{1,1}$ coordinate transformations.

Moreover, if any one of these equivalencies holds, then the Rankine–Hugoniot jump conditions, $[G]^{\sigma}_{i}\,n_{\sigma} = 0$ (which express the weak form of conservation of energy and momentum across $\Sigma$ when $G = \kappa T$), hold at each point on $\Sigma$.

Here [f] denotes the jump in the quantity f across $\Sigma$ (this being determined by the metric separately on each side of $\Sigma$ because $g_{ij}$ is only Lipschitz continuous across $\Sigma$), and by $C^{1,1}$ we mean that the first derivatives are Lipschitz continuous.

In the case of spherical symmetry, the following stronger result holds. In this case, the jump conditions $[G^{ij}]n_i = 0$, which express the weak form of conservation across a shock surface, are implied by a single condition $[G^{ij}]n_i n_j = 0$, so long as the shock is non-null, and the areas of the spheres of symmetry match smoothly at the shock and change monotonically as the shock evolves. Note that, in general, assuming that the angular variables are identified across the shock, we expect conservation to entail two conditions, one for the time and one for the radial components. The fact that the smooth matching of the spheres of symmetry reduces conservation to one condition can be interpreted as an instance of the general principle that directions of smoothness in the metric imply directions of conservation of the sources.

Theorem 4 Assume that g and $\bar g$ are two spherically symmetric metrics that match Lipschitz continuously across a three-dimensional shock interface $\Sigma$ to form the matched metric $g \cup \bar g$. That is, assume that g and $\bar g$ are Lorentzian metrics given by

$$ds^2 = -a(t, r)\,dt^2 + b(t, r)\,dr^2 + c(t, r)\,d\Omega^2 \qquad [29]$$

and

$$d\bar s^2 = -\bar a(\bar t, \bar r)\,d\bar t^2 + \bar b(\bar t, \bar r)\,d\bar r^2 + \bar c(\bar t, \bar r)\,d\Omega^2 \qquad [30]$$

and that there exists a smooth coordinate transformation $\Psi : (t, r) \to (\bar t, \bar r)$, defined in a neighborhood of a shock surface $\Sigma$ given by r = r(t), such that the metrics agree on $\Sigma$. (We implicitly assume that $\theta$ and $\varphi$ are continuous across the surface.) Assume that

$$c(t, r) = \bar c(\Psi(t, r)) \qquad [31]$$

in an open neighborhood of the shock surface $\Sigma$, so that, in particular, the areas of the 2-spheres of symmetry in the barred and unbarred metrics agree on the shock surface. Assume also that the shock surface r = r(t) in unbarred coordinates is mapped to the surface $\bar r = \bar r(\bar t)$ by $(\bar t, \bar r(\bar t)) = \Psi(t, r(t))$. Assume, finally, that the normal n to $\Sigma$ is non-null, and that

$$n(c) \ne 0 \qquad [32]$$

where n(c) denotes the derivative of the function c in the direction of the vector n. Then the following are equivalent to the statement that the components of the metric $g \cup \bar g$ in any Gaussian normal coordinate system are $C^{1,1}$ functions of these coordinates across the surface $\Sigma$:

$$[G^{ij}]\,n_i = 0 \qquad [33]$$

$$[G^{ij}]\,n_i n_j = 0 \qquad [34]$$

$$[K] = 0 \qquad [35]$$

Here again, $[f] = \bar f - f$ denotes the jump in the quantity f across $\Sigma$, and K is the second fundamental form on the shock surface.

We assume in Theorem 4 that the areas of the 2-spheres of symmetry change monotonically in the direction normal to the surface. For example, if $c = r^2$, then $\partial c/\partial t = 0$, so the assumption $n(c) \ne 0$ is valid except when $n = \partial/\partial t$, in which case the rays of the shock surface would be spacelike. Thus, the shock speed would be faster than the speed of light if our assumption $n(c) \ne 0$ failed in the case $c = r^2$.

FRW–TOV Shock Matching Outside the Black Hole – The Case $r_* = 0$

To construct the family of shock wave solutions for parameter values $0 < \sigma \le 1$ and $r_* = 0$, we match the exact solution [23]–[25] of the FRW metric [1] to the TOV metric [2] outside the black hole,
assuming A > 0. In this case, we can bypass the problem of deriving and solving the ODEs for the shock surface and constraints discussed above, by actually deriving the exact solution of the Einstein equations of TOV form that meets these equations. This exact solution represents the general-relativistic version of a static, singular isothermal sphere – singular because it has an inverse square density profile, and isothermal because the relationship between the density and pressure is $\bar p = \bar\sigma\bar\rho$, $\bar\sigma$ = const.

Assuming the stress tensor for a perfect fluid, and assuming that the density and pressure depend only on $\bar r$, the Einstein equations for the TOV metric [2] outside the black hole (i.e., when $A = 1 - 2M/\bar r > 0$) are equivalent to the Oppenheimer–Volkoff system

$$\frac{dM}{d\bar r} = 4\pi\bar r^2\bar\rho \qquad [36]$$

$$-\bar r^2\frac{d\bar p}{d\bar r} = \mathcal{G}M\bar\rho\left\{1 + \frac{\bar p}{\bar\rho}\right\}\left\{1 + \frac{4\pi\bar r^3\bar p}{M}\right\}\left\{1 - \frac{2\mathcal{G}M}{\bar r}\right\}^{-1} \qquad [37]$$

Integrating [36], we obtain the usual interpretation of M as the total mass inside radius $\bar r$,

$$M(\bar r) = \int_0^{\bar r} 4\pi\xi^2\bar\rho(\xi)\,d\xi \qquad [38]$$

The metric component $B \equiv B(\bar r)$ is determined from $\bar\rho$ and M through the equation

$$\frac{B'(\bar r)}{B} = -2\,\frac{\bar p'(\bar r)}{\bar p + \bar\rho} \qquad [39]$$

Assuming

$$\bar p = \bar\sigma\bar\rho, \qquad \bar\rho(\bar r) = \frac{\gamma}{\bar r^2} \qquad [40]$$

for some constants $\bar\sigma$ and $\gamma$, and substituting into [3], we obtain

$$M(\bar r) = 4\pi\gamma\bar r \qquad [41]$$

Putting [40] and [41] into [37] and simplifying yields the identity

$$\gamma = \frac{1}{2\pi\mathcal{G}}\left(\frac{\bar\sigma}{1 + 6\bar\sigma + \bar\sigma^2}\right) \qquad [42]$$

From [38] we obtain

$$A = 1 - 8\pi\mathcal{G}\gamma < 1 \qquad [43]$$

Applying [39] leads to

$$B = B_0\left(\frac{\bar\rho}{\bar\rho_0}\right)^{-2\bar\sigma/(1+\bar\sigma)} = B_0\left(\frac{\bar r}{\bar r_0}\right)^{4\bar\sigma/(1+\bar\sigma)} \qquad [44]$$

By rescaling the time coordinate, we can take $B_0 = 1$ at $\bar r_0 = 1$, in which case [44] reduces to

$$B = \bar r^{\,4\bar\sigma/(1+\bar\sigma)} \qquad [45]$$

We conclude that when [42] holds, [40]–[43] and [44] provide an exact solution of the Einstein field equations of TOV type, for each $0 \le \bar\sigma \le 1$. (In this case, an exact solution of TOV type was first found by Tolman (1939), and rediscovered in the case $\bar\sigma = 1/3$ by Misner and Zapolsky (cf. Weinberg (1972, p. 320)).) By [43], these solutions are defined outside the black hole, since $2M/\bar r < 1$. When $\bar\sigma = 1/3$, [42] yields $\gamma = 3/(56\pi\mathcal{G})$ (cf. Weinberg (1972, equation (11.4.13))).

To match the FRW exact solution [23]–[25] with equation of state $p = \sigma\rho$ to the TOV exact solution [40]–[45] with equation of state $\bar p = \bar\sigma\bar\rho$ across a shock interface, we first set $\bar r = Rr$ to match the spheres of symmetry, and then match the timelike and spacelike components of the corresponding metrics in standard Schwarzschild coordinates. The matching of the $d\bar r^2$ coefficient $A^{-1}$ yields the conservation of mass condition that implicitly gives the shock surface $\bar r = \bar r(t)$,

$$M(\bar r) = \frac{4\pi}{3}\rho(t)\,\bar r^3 \qquad [46]$$

Using this together with [41] gives the following two relations that hold at the shock surface:

$$\bar r = \sqrt{\frac{3\gamma}{\rho(t)}}, \qquad \rho = \frac{3}{4\pi}\,\frac{M}{\bar r(t)^3} = \frac{3\gamma}{\bar r(t)^2} = 3\bar\rho \qquad [47]$$

Matching the coefficient B of $d\bar t^2$ on the shock surface determines the integrating factor $\psi$ in a neighborhood of the shock surface by assigning initial conditions for [44]. Finally, the conservation constraint $[T^{ij}]n_i n_j = 0$ leads to the single condition

$$0 = (1 - A)(\rho + \bar p)(p + \bar\rho)^2 + \left(1 - \frac{1}{A}\right)(\bar\rho + \bar p)(\rho + p)^2 + (p - \bar p)(\rho - \bar\rho)^2 \qquad [48]$$

which upon using $p = \sigma\rho$ and $\bar p = \bar\sigma\bar\rho$ is satisfied assuming the condition

$$\bar\sigma = \tfrac{1}{2}\sqrt{9\sigma^2 + 54\sigma + 49} - \tfrac{3}{2}\sigma - \tfrac{7}{2} \equiv H(\sigma) \qquad [49]$$

Alternatively, we can solve for $\sigma$ in [49] and write this relation as

$$\sigma = \frac{\bar\sigma(\bar\sigma + 7)}{3(1 - \bar\sigma)} \qquad [50]$$
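The relation [49] between the FRW and TOV equation-of-state parameters, its inverse [50], and the density coefficient [42] are easy to evaluate directly. The sketch below is illustrative only: the function names are assumptions introduced here, and Newton's constant is set to 1 by default.

```python
import math


def sigma_bar(sigma):
    """TOV parameter from [49]: sigma_bar = H(sigma)."""
    return (0.5 * math.sqrt(9.0 * sigma**2 + 54.0 * sigma + 49.0)
            - 1.5 * sigma - 3.5)


def sigma_from_sigma_bar(sb):
    """Inverse relation [50]: sigma as a function of sigma_bar."""
    return sb * (sb + 7.0) / (3.0 * (1.0 - sb))


def gamma(sb, G=1.0):
    """Density coefficient from [42], where rho_bar = gamma / rbar^2."""
    return sb / (2.0 * math.pi * G * (1.0 + 6.0 * sb + sb**2))


for sigma in (0.0, 1.0 / 3.0, 1.0):
    sb = sigma_bar(sigma)
    # Print sigma, H(sigma), and the round trip through [50].
    print(sigma, sb, sigma_from_sigma_bar(sb))
```

Evaluating `sigma_from_sigma_bar(sigma_bar(sigma))` returns the input value up to rounding, which is a quick consistency check between [49] and [50]; `gamma(1/3)` reproduces the value $3/(56\pi\mathcal{G})$ quoted for $\bar\sigma = 1/3$.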
This guarantees that conservation holds across the shock surface, and so it follows from Theorem 4 that all of the equivalencies in Theorem 3 hold across the shock surface. Note that H(0) = 0, and to leading order $\bar\sigma = (3/7)\sigma + O(\sigma^2)$ as $\sigma \to 0$. Within the physical region $0 \le \sigma, \bar\sigma \le 1$, $H'(\sigma) > 0$, $\bar\sigma < \sigma$, and $H(1/3) = \sqrt{17} - 4 \approx 0.1231$, $H(1) = \sqrt{112}/2 - 5 \approx 0.2915$.

Using the exact formulas for the FRW metric in [23]–[25], and setting $R_0 = 1$ at $\rho = \rho_0$, $t = t_0$, we obtain the following exact formulas for the shock position:

$$\bar r(t) = \alpha t \qquad [51]$$

$$r(t) = \bar r(t)R(t)^{-1} = \beta\,t^{(1+3\sigma)/(3+3\sigma)} \qquad [52]$$

where

$$\alpha = 3(1 + \sigma)\sqrt{\frac{\bar\sigma}{1 + 6\bar\sigma + \bar\sigma^2}}, \qquad \beta = \alpha^{(1+3\sigma)/(3+3\sigma)}\left(\frac{3\gamma}{\rho_0}\right)^{1/(3+3\sigma)} \qquad [53]$$

It follows from [41] that A > 0, and from [52] that $r_* = \lim_{t \to 0} r(t) = 0$. The entropy condition that the shock wave be compressive follows from the fact that $\bar\sigma = H(\sigma) < \sigma$. Thus, we conclude that for each $0 < \sigma \le 1$, $r_* = 0$, the solutions constructed in [40]–[53] define a one-parameter family of shock wave solutions that evolve everywhere outside the black hole, which implies that the distance from the shock wave to the FRW center is less than one Hubble length for all t > 0.

Using [51] and [52], one can determine the shock speed, and check when the Lax characteristic condition (Smoller 1983) holds at the shock. The result is the following theorem. (Note that even when the shock speed is larger than c, only the wave, and not the sound speeds or any other physical motion, exceeds the speed of light. See Scott (2002) for the case when the shock speed is equal to the speed of light.) The reader is referred to Smoller and Temple (1995) for details.

Theorem 5 There exist values $0 < \sigma_1 < \sigma_2 < 1$ ($\sigma_1 \approx 0.458$, $\sigma_2 = \sqrt{5}/3 \approx 0.745$), such that, for $0 < \sigma \le 1$, the Lax characteristic condition holds at the shock if and only if $0 < \sigma < \sigma_1$; and the shock speed is less than the speed of light if and only if $0 < \sigma < \sigma_2$.

The explicit solution in the case $r_* = 0$ can be interpreted as a general-relativistic version of a shock wave explosion into a static, singular, isothermal sphere, known in the Newtonian case as a simple model for star formation (Smoller and Temple 2000). As the scenario goes, a star begins as a diffuse cloud of gas. The cloud slowly contracts under its own gravitational force by radiating energy out through the gas cloud as gravitational potential energy is converted into kinetic energy. This contraction continues until the gas cloud reaches the point where the mean free path for transmission of light is small enough that light is scattered, instead of being transmitted, through the cloud. The scattering of light within the gas cloud has the effect of equalizing the temperature within the cloud, and at this point the gas begins to drift toward the most compact configuration of the density that balances the pressure when the equation of state is isothermal. This configuration is a static, singular, isothermal sphere, the general-relativistic version of which is the exact TOV solution beyond the shock wave when $r_* = 0$. This solution in the Newtonian case is also inverse square in the density and pressure, and so the density tends to infinity at the center of the sphere. Eventually, the high densities at the center ignite thermonuclear reactions. The result is a shock wave explosion emanating from the center of the sphere, and this signifies the birth of the star. The exact solutions when $r_* = 0$ represent a general-relativistic version of such a shock wave explosion.

Shock Wave Solutions Inside the Black Hole – The Case $r_* > 0$

When the shock wave is beyond one Hubble length from the FRW center, we obtain a family of shock wave solutions for each $0 < \sigma \le 1$ and $r_* > 0$ by shock matching the FRW metric [1] to a TOV metric of form [2] under the assumption that

$$A(\bar r) = 1 - \frac{2M(\bar r)}{\bar r} \equiv 1 - N(\bar r) < 0 \qquad [54]$$

In this case, $\bar r$ is the timelike variable. Assuming that the stress tensor T is taken to be that of a perfect fluid comoving with the TOV metric, the Einstein equations $G = \kappa T$, inside the black hole, take the form (see Smoller and Temple (2004) for details)

$$\bar p' = \frac{\bar p + \bar\rho}{2}\,\frac{N'}{N - 1} \qquad [55]$$

$$N' = -\left\{\frac{N}{\bar r} + \kappa\bar p\,\bar r\right\} \qquad [56]$$

$$\frac{B'}{B} = -\frac{1}{N - 1}\left\{\frac{N}{\bar r} + \kappa\bar\rho\,\bar r\right\} \qquad [57]$$
The system [55]–[57] defines the simplest class of gravitational metrics that contain matter, evolve inside the black hole, and such that the mass function $M(\bar r) < \infty$ at each fixed time $\bar r$. System [55]–[57] for A < 0 differs substantially from the TOV equations for A > 0 because, for example, the energy density $T^{00}$ is equated with the timelike component $G^{rr}$ when A < 0, but with $G^{tt}$ when A > 0. In particular, this implies that, inside the black hole, the mass function $M(\bar r)$ does not have the interpretation as a total mass inside the radius $\bar r$ as it does outside the black hole.

Equations [56], [57] do not have the same character as [54], [55] and the relation $\bar p = \bar\sigma\bar\rho$ with $\bar\sigma$ = const. is inconsistent with [56], [57] together with the conservation constraint and the FRW assumption $p = \sigma\rho$ for shock matching. Thus, instead of looking for an exact solution of [56], [57] ahead of time, as in the case $r_* = 0$, we assume the FRW solution [23]–[25], and derive the ODEs that describe the TOV metrics that match this FRW metric Lipschitz-continuously across a shock surface, and then impose the conservation, entropy, and equation of state constraints at the end. Matching a given k = 0 FRW metric to a TOV metric inside the black hole across a shock interface leads to the system of ODEs (see Smoller and Temple (2004) for details),

$$\frac{du}{dN} = -\left\{\frac{1 + u}{2(1 + 3u)N}\right\}\left\{\frac{(3u - 1)(\sigma - u)N + 6u(1 + u)}{(\sigma - u)N + (1 + u)}\right\} \qquad [58]$$

$$\frac{d\bar r}{dN} = -\frac{1}{1 + 3u}\,\frac{\bar r}{N} \qquad [59]$$

with conservation constraint

$$v = \frac{-\sigma(1 + u) + (\sigma - u)N}{(1 + u) + (\sigma - u)N} \qquad [60]$$

where

$$u = \frac{\bar p}{\rho}, \qquad v = \frac{\bar\rho}{\rho}, \qquad \sigma = \frac{p}{\rho} \qquad [61]$$

Here $\rho$ and p denote the (known) FRW density and pressure, and all variables are evaluated at the shock. Solutions of [58]–[60] determine the (unknown) TOV metrics that match the given FRW metric Lipschitz-continuously across a shock interface, such that conservation of energy and momentum hold across the shock, and such that there are no $\delta$-function sources at the shock (Israel 1966, Smoller and Temple 1997). Note that the dependence of [58]–[60] on the FRW metric is only through the variable $\sigma$, and so the advantage of taking $\sigma$ = const. is that the whole solution is determined by the inhomogeneous scalar equation [58] when $\sigma$ = const. We take as the entropy constraint the condition that

$$0 < \bar p < p, \qquad 0 < \bar\rho < \rho \qquad [62]$$

and to ensure a physically reasonable solution, we impose the equation of state constraint on the TOV side of the shock (this is equivalent to the dominant energy condition (Blau and Guth 1987))

$$0 < \bar p < \bar\rho \qquad [63]$$

Condition [62] implies that outgoing shock waves are compressive. Inequalities [62] and [63] are both implied by the single condition (Smoller and Temple 2004),

$$\frac{1}{N} < \left(\frac{1 - u}{1 + u}\right)\left(\frac{\sigma - u}{\sigma + u}\right) \qquad [64]$$

Since $\sigma$ is constant, eqn [58] uncouples from [59], and thus solutions of system [58]–[60] are determined by the scalar nonautonomous equation [58]. Making the change of variable S = 1/N, which transforms the "big bang" $N \to \infty$ over to a rest point at $S \to 0$, we obtain

$$\frac{du}{dS} = \left\{\frac{1 + u}{2(1 + 3u)S}\right\}\left\{\frac{(3u - 1)(\sigma - u) + 6u(1 + u)S}{(\sigma - u) + (1 + u)S}\right\} \qquad [65]$$

Note that the conditions N > 1 and $0 < \bar p < p$ restrict the domain of [65] to the region $0 < u < \sigma < 1$, 0 < S < 1. The next theorem gives the existence of solutions for $0 < \sigma \le 1$, $r_* > 0$, inside the black hole (Smoller and Temple 2003).

Theorem 6 For every $\sigma$, $0 < \sigma < 1$, there exists a unique solution $u_\sigma(S)$ of [65], such that [64] holds on the solution for all S, 0 < S < 1, and on this solution, $0 < u_\sigma(S) < \bar u$, $\lim_{S \to 0} u_\sigma(S) = \bar u$, where

$$\bar u = \mathrm{Min}\{1/3, \sigma\} \qquad [66]$$

and

$$\lim_{S \to 1}\bar p = 0 = \lim_{S \to 1}\bar\rho \qquad [67]$$

For each of these solutions $u_\sigma(S)$, the shock position is determined by the solution of [59], which in turn is determined uniquely by an initial condition which can be taken to be the FRW radial position of the shock wave at the instant of the big bang,

$$r_* = \lim_{S \to 0} r(S) > 0 \qquad [68]$$

Concerning the shock speed, we have
Theorem 7 Let $0 < \sigma < 1$. Then the shock wave is everywhere subluminous, that is, the shock speed $s_\sigma(S) \equiv s(u_\sigma(S)) < 1$ for all $0 < S \le 1$, if and only if $\sigma \le 1/3$.

Concerning the shock speed near the big bang S = 0, the following is true:

Theorem 8 The shock speed at the big bang S = 0 is given by

$$\lim_{S \to 0} s_\sigma(S) = 0, \qquad \sigma < 1/3 \qquad [69]$$

$$\lim_{S \to 0} s_\sigma(S) = \infty, \qquad \sigma > 1/3 \qquad [70]$$

$$\lim_{S \to 0} s_\sigma(S) = 1, \qquad \sigma = 1/3 \qquad [71]$$

Theorem 8 shows that the equation of state $p = \rho/3$ plays a special role in the analysis when $r_* > 0$, and only for this equation of state does the shock wave emerge at the big bang at a finite nonzero speed, the speed of light. Moreover, [66] implies that in this case, the correct relation $\bar p/\bar\rho = \bar\sigma$ is also achieved in the limit $S \to 0$. The result [67] implies that (neglecting the pressure p at this time onward), the solution continues to a k = 0 Oppenheimer–Snyder solution outside the black hole for S > 1.

It follows that the shock wave will first become visible at the FRW center $\bar r = 0$ at the moment $t = t_0$ ($R(t_0) = 1$), when the Hubble length $H_0^{-1} = H^{-1}(t_0)$ satisfies

$$\frac{1}{H_0} = \frac{1 + 3\sigma}{2}\,r_* \qquad [72]$$

where $r_*$ is the FRW position of the shock at the instant of the big bang. At this time, the number of Hubble lengths $\sqrt{N_0}$ from the FRW center to the shock wave at time $t = t_0$ can be estimated by

$$1 \le \frac{2}{1 + 3\sigma} \le \sqrt{N_0} \le \frac{2}{1 + 3\sigma}\,e^{\sqrt{3\sigma}\,((1+3\sigma)/(1+\sigma))}$$

Thus, in particular, the shock wave will still lie beyond the Hubble length $1/H_0$ at the FRW time $t_0$ when it first becomes visible. Furthermore, the time $t_{\rm crit} > t_0$ at which the shock wave will emerge from the white hole, given that $t_0$ is the first instant at which the shock becomes visible at the FRW center, can be estimated by

$$\frac{2}{1 + 3\sigma}\,e^{\sigma/4} \le \frac{t_{\rm crit}}{t_0} \le \frac{2}{1 + 3\sigma}\,e^{2\sqrt{3\sigma}/(1+\sigma)} \qquad [73]$$

for $0 < \sigma \le 1/3$, and by the better estimate

$$e^{\sqrt{6}/4} \le \frac{t_{\rm crit}}{t_0} \le e^{3/2} \qquad [74]$$

in the case $\sigma = 1/3$. Inequalities [73], [74] imply, for example, that at the Oppenheimer–Snyder limit $\sigma = 0$,

$$\sqrt{N_0} = 2, \qquad \frac{t_{\rm crit}}{t_0} = 2$$

and in the limit $\sigma = 1/3$,

$$1.8 \le \frac{t_{\rm crit}}{t_0} \le 4.5, \qquad 1 < \sqrt{N_0} \le 4.5$$

We can conclude that at the moment $t_0$ when the shock wave first becomes visible at the FRW center, the shock wave must lie within 4.5 Hubble lengths of the FRW center. Throughout the expansion up until this time, the expanding universe must lie entirely within a white hole – the universe will eventually emerge from this white hole, but not until some later time $t_{\rm crit}$, where $t_{\rm crit}$ does not exceed $4.5\,t_0$.

Conclusion

We believe that the existence of a wave at the leading edge of the expansion of the galaxies is the most likely possibility. The alternatives are that either the universe of expanding galaxies goes on out to infinity, or else the universe is not simply connected. Although the first possibility has been believed for most of the history of cosmology based on the Friedmann universe, we find this implausible and arbitrary in light of the shock wave refinements of the FRW metric discussed here. The second possibility, that the universe is not simply connected, has received considerable attention recently (Klarreich 2003). However, since we have not seen, and cannot create, any non-simply-connected 3-spaces on any other length scale, and since there is no observational evidence to support this, we view this as less likely than the existence of a wave at the leading edge of the expansion of the galaxies, left over from the big bang. Recent analysis of the microwave background radiation data shows a cutoff in the angular frequencies consistent with a length scale of around one Hubble length (Andy Albrecht, private communication). This certainly makes one wonder whether this cutoff is evidence of a wave at this length scale, especially given the consistency of this possibility with the case $r_* > 0$ of the family of exact solutions discussed here.

Acknowledgments

The work of JS was supported in part by NSF Applied Mathematics Grant Number DMS-010-3998, and that of BT by NSF Applied Mathematics Grant Number DMS-010-2493.
See also: Black Hole Mechanics; Cosmology: Mathematical Aspects; Newtonian Limit of General Relativity; Symmetric Hyperbolic Systems and Shock Waves.

Further Reading

Blau SK and Guth AH (1987) Inflationary cosmology. In: Hawking SW and Israel W (eds.) Three Hundred Years of Gravitation, pp. 524–603. Cambridge: Cambridge University Press.
Groah J, Smoller J, and Temple B (2003) Solving the Einstein equations by Lipschitz continuous metrics: shockwaves in general relativity. In: Friedlander S and Serre D (eds.) Handbook of Mathematical Fluid Dynamics, vol. 2, pp. 501–597. Amsterdam: North Holland.
Groah J and Temple B (2004) Shock-Wave Solutions of the Einstein Equations: Existence and Consistency by a Locally Inertial Glimm Scheme. Memoirs of the AMS, 84 pages, vol. 172, no. B13.
Israel W (1966) Singular hypersurfaces and thin shells in general relativity. IL Nuovo Cimento XLIV B(1): 1–14.
Klarreich E (2003) The shape of space. Science News 164–165.
Oppenheimer JR and Snyder JR (1939) On continued gravitational contraction. Physical Review 56: 455–459.
Scott M (2002) General Relativistic Shock Waves Propagating at the Speed of Light. Ph.D. thesis, UC-Davis.
Smoller J (1983) Shock Waves and Reaction Diffusion Equations. New York, Berlin: Springer.
Smoller J and Temple B (1994) Shock-wave solutions of the Einstein equations: the Oppenheimer–Snyder model of gravitational collapse extended to the case of non-zero pressure. Archives for Rational and Mechanical Analysis 128: 249–297.
Smoller J and Temple B (1995) Astrophysical shock-wave solutions of the Einstein equations. Physical Review D 51(6): 2733–2743.
Smoller J and Temple B (1997) Solutions of the Oppenheimer–Volkoff equations inside 9/8'ths of the Schwarzschild radius. Communications in Mathematical Physics 184: 597–617.
Smoller J and Temple B (1998) On the Oppenheimer–Volkov equations in general relativity. Archives for Rational and Mechanical Analysis 142: 177–191.
Smoller J and Temple B (2000) Cosmology with a shock wave. Communications in Mathematical Physics 210: 275–308.
Smoller J and Temple B (2003) Shock wave cosmology inside a black hole. Proceedings of the National Academy of Sciences of the United States of America 100(20): 11216–11218.
Smoller J and Temple B (2004) Cosmology, black holes, and shock waves beyond the Hubble length. Methods and Applications of Analysis 11(1): 77–132.
Tolman R (1939) Static solutions of Einstein's field equations for spheres of fluid. Physical Review 55: 364–374.
Weinberg S (1972) Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. New York: Wiley.

Shock Waves see Symmetric Hyperbolic Systems and Shock Waves

Short-Range Spin Glasses: The Metastate Approach


C M Newman, New York University, New York, NY, USA
D L Stein, University of Arizona, Tucson, AZ, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The nature of the low-temperature spin glass phase in short-range models remains one of the central problems in the statistical mechanics of disordered systems (Binder and Young 1986, Chowdhury 1986, Mézard et al. 1987, Stein 1989, Fischer and Hertz 1991, Dotsenko 2001, Newman and Stein 2003). While many of the basic questions remain unanswered, analytical and rigorous work over the past decade has greatly streamlined the number of possible scenarios for pure state structure and organization at low temperatures, and has clarified the thermodynamic behavior of these systems.

The unifying concept behind this work is that of the "metastate." It arose independently in two different constructions (Aizenman and Wehr 1990, Newman and Stein 1996b), which were later shown to be equivalent (Newman and Stein 1998a). The metastate is a probability measure on the space of all thermodynamic states. Its usefulness arises in situations where multiple "competing" pure states may be present. In such situations it may be difficult to construct individual states in a measurable and canonical way; the metastate avoids this difficulty by focusing instead on the statistical properties of the states.

An important aspect of the metastate approach is that it relates, by its very construction (Newman and Stein 1996b), the observed behavior of a system in large but finite volumes with its thermodynamic properties. It therefore serves as a (possibly indispensable) tool for analyzing and understanding both the infinite-volume and finite-volume properties of a system, particularly in cases where a straightforward interpolation between the two may be incorrect, or their relation otherwise difficult to analyze.

We will focus on the Edwards–Anderson (EA) Ising spin glass model (Edwards and Anderson
1975), although most of our discussion is relevant to a much larger class of realistic models. The EA model is described by the Hamiltonian

\[ H_{\mathcal J} = -\sum_{\langle x,y\rangle} J_{xy}\,\sigma_x \sigma_y \tag*{[1]} \]

where $\mathcal J$ denotes a particular realization of all of the couplings $J_{xy}$ and the brackets indicate that the sum is over nearest-neighbor pairs only, with $x, y \in \mathbb{Z}^d$. We will take Ising spins $\sigma_x = \pm 1$; although this will affect the details of our discussion, it is unimportant for our main conclusions. The couplings $J_{xy}$ are quenched, independent, identically distributed random variables whose common distribution $\nu$ is symmetric about zero.

States and Metastates

We are interested in both finite-volume and infinite-volume Gibbs states. For the cube of length scale $L$, $\Lambda_L = \{-L, -L+1, \ldots, L\}^d$, we define $H_{\mathcal J, L}$ to be the restriction of the EA Hamiltonian to $\Lambda_L$ with a specified boundary condition such as free, fixed, or periodic. Then the finite-volume Gibbs distribution $\rho^{(L)}_{\mathcal J} = \rho^{(L)}_{\mathcal J,\beta}$ on $\Lambda_L$ (at inverse temperature $\beta = 1/T$) is

\[ \rho^{(L)}_{\mathcal J,\beta}(\sigma) = Z_L^{-1} \exp\{-\beta H_{\mathcal J,L}(\sigma)\} \tag*{[2]} \]

where the partition function $Z_L(\beta)$ is such that the sum of $\rho^{(L)}_{\mathcal J,\beta}$ over all $\sigma$ yields 1. (In this and all succeeding definitions, the dependence on spatial dimension $d$ will be suppressed.)

Thermodynamic states are described by infinite-volume Gibbs measures. At fixed inverse temperature $\beta$ and coupling realization $\mathcal J$, a thermodynamic state $\rho_{\mathcal J,\beta}$ is the limit, as $L \to \infty$, of some sequence of such finite-volume measures (each with a specified boundary condition, which may remain the same or may change with $L$). A thermodynamic state $\rho_{\mathcal J,\beta}$ can also be characterized intrinsically through the Dobrushin–Lanford–Ruelle (DLR) equations (see, e.g., Georgii 1988): for any $\Lambda_L$, the conditional distribution of $\rho_{\mathcal J,\beta}$ (conditioned on the sigma-field generated by $\{\sigma_x : x \in \mathbb{Z}^d \setminus \Lambda_L\}$) is $\rho^{(L),\bar\sigma}_{\mathcal J,\beta}$, where $\bar\sigma$ is given by the conditioned values of $\sigma_x$ for $x$ on the boundary of $\Lambda_L$.

Consider now the set $\mathcal G = \mathcal G(\mathcal J, \beta)$ of all thermodynamic states at a fixed $(\mathcal J, \beta)$. The set of extremal, or pure, Gibbs states is defined by

\[ \mathrm{ex}\,\mathcal G = \mathcal G \setminus \{ a\rho_1 + (1-a)\rho_2 : a \in (0,1);\ \rho_1, \rho_2 \in \mathcal G;\ \rho_1 \neq \rho_2 \} \tag*{[3]} \]

and the number of pure states $\mathcal N(\mathcal J, \beta)$ at $(\mathcal J, \beta)$ is the cardinality $|\mathrm{ex}\,\mathcal G|$ of $\mathrm{ex}\,\mathcal G$. It is not hard to show that, in any $d$ and for a.e. $\mathcal J$, the following two statements are true: (1) $\mathcal N = 1$ at sufficiently low $\beta > 0$; (2) at any fixed $\beta$, $\mathcal N$ is constant a.s. with respect to the $\mathcal J$'s. (The last assertion follows from the measurability and translation invariance of $\mathcal N$, and the translation ergodicity of the disorder distribution of $\mathcal J$.)

A pure state $\rho_\alpha$ (where $\alpha$ is a pure-state index) can also be intrinsically characterized by a "clustering property"; for two-point correlation functions, this reads

\[ \langle \sigma_x \sigma_y \rangle_{\rho_\alpha} - \langle \sigma_x \rangle_{\rho_\alpha} \langle \sigma_y \rangle_{\rho_\alpha} \to 0 \tag*{[4]} \]

as $|x - y| \to \infty$. A simple observation (Newman and Stein 1992), with important consequences for spin glasses, is that if many pure states exist, a sequence of $\rho^{(L)}_{\mathcal J,\beta}$'s, with boundary conditions and $L$'s chosen independently of $\mathcal J$, will generally not have a (single) limit. We call this phenomenon "chaotic size dependence" (CSD).

We will be interested in the properties of $\mathrm{ex}\,\mathcal G$ at low temperatures. If the spin-flip symmetry present in the EA Hamiltonian, eqn [1], is spontaneously broken above some dimension $d_0$ and below some temperature $T_c(d)$, there will be at least a pair of pure states such that their even-spin correlations are identical and their odd-spin correlations have the opposite sign. Assuming that such broken spin-flip symmetry indeed exists for $d > d_0$ and $T < T_c(d)$, the question of whether there exists more than one such pair (of spin-flip related extremal infinite-volume Gibbs distributions) is a central unresolved issue for the EA and related models. If many such pairs should exist, we can ask about the structure of their relations with one another, and how this structure would manifest itself in large but finite volumes. To do this, we use an approach, introduced by Newman and Stein (1996b), to study inhomogeneous and other systems with many competing pure states. This approach, based on an analogy with chaotic dynamical systems, requires the construction of a new thermodynamic quantity which is called the "metastate": a probability measure $\kappa_{\mathcal J}$ on the thermodynamic states. The metastate allows an understanding of CSD by analyzing the way in which $\rho^{(L)}_{\mathcal J,\beta}$ "samples" from its various possible limits as $L \to \infty$.

The analogy with chaotic dynamical systems can be understood as follows. In dynamical systems, the chaotic motion along a deterministic orbit is analyzed in terms of some appropriately selected probability measure, invariant under the dynamics. Time along the orbit is replaced, in our context, by $L$ and the phase space of the dynamical system is replaced by the space of Gibbs states.

Newman and Stein (1996b) considered a "microcanonical ensemble" (as always, at fixed $\beta$, which will hereafter be suppressed for ease of notation) $\kappa_N$ in which each of the finite-volume Gibbs states
$\rho^{(L_1)}_{\mathcal J}, \rho^{(L_2)}_{\mathcal J}, \ldots, \rho^{(L_N)}_{\mathcal J}$ has weight $N^{-1}$. The ensemble $\kappa_N$ converges to a metastate $\kappa_{\mathcal J}$ as $N \to \infty$, in the following sense: for every (nice) function $g$ on states (e.g., a function of finitely many correlations),

\[ \lim_{N\to\infty} N^{-1} \sum_{\ell=1}^{N} g(\rho^{(L_\ell)}) = \int g(\Gamma)\, d\kappa_{\mathcal J}(\Gamma) \tag*{[5]} \]

The information contained in $\kappa_{\mathcal J}$ effectively specifies the fraction of cube sizes $L_\ell$ which the system spends in different (possibly mixed) thermodynamic states $\Gamma$ as $\ell \to \infty$.

A different, but in the end equivalent, approach based on $\mathcal J$-randomness is due to Aizenman and Wehr (1990). Here one considers the random pair $(\mathcal J, \rho^{(L)}_{\mathcal J})$, defined on the underlying probability space of $\mathcal J$, and takes the limit $\kappa^\dagger$ (with conditional distribution $\kappa^\dagger_{\mathcal J}$, given $\mathcal J$), via finite-dimensional distributions along some subsequence. The details are omitted here, and the reader is referred to the work by Aizenman and Wehr (1990) and Newman and Stein (1998a). We note, however, the important result that a "deterministic" subsequence of volumes can be found on which [5] is valid and also $(\mathcal J, \rho^{(L)}_{\mathcal J})$ converges, with $\kappa^\dagger_{\mathcal J} = \kappa_{\mathcal J}$ (Newman and Stein 1998a).

In what follows we use the term "metastate" as shorthand for the $\kappa_{\mathcal J}$ constructed using periodic boundary conditions on a sequence of volumes chosen independently of the couplings, and along which $\kappa_{\mathcal J} = \kappa^\dagger_{\mathcal J}$. We choose periodic boundary conditions for specificity; the results and claims discussed are expected to be independent of the boundary conditions used, as long as they are chosen independently of the couplings.

Low-Temperature Structure of the EA Model

There have been several scenarios proposed for the spin-glass phase of the Edwards–Anderson model at sufficiently low temperature and high dimension. These remain speculative, because it has not even been proved that a phase transition from the high-temperature phase exists at positive temperature in any finite dimension.

As noted earlier, at sufficiently high temperature in any dimension (and at all nonzero temperatures in one and presumably two dimensions, although the latter assertion has not been proved), there is a unique Gibbs state. It is conceivable that this remains the case in all dimensions and at all nonzero temperatures, in which case the metastate $\kappa_{\mathcal J}$ is, for a.e. $\mathcal J$, supported on a single, pure Gibbs state $\rho_{\mathcal J}$. (It is important to note, however, that in principle such a trivial metastate could occur even if $\mathcal N > 1$; indeed, just such a situation of "weak uniqueness" (van Enter and Fröhlich 1985, Campanino et al. 1987) happens in very long range spin glasses at high temperatures (Fröhlich and Zegarlinski 1987, Gandolfi et al. 1993).)

A phase transition has been proved to exist (Aizenman et al. 1987) in the Sherrington–Kirkpatrick (SK) model (Sherrington and Kirkpatrick 1975), which is the infinite-range version of the EA model. Numerical (Ogielski 1985, Ogielski and Morgenstern 1985, Binder and Young 1986, Kawashima and Young 1996) and some analytical (Fisher and Singh 1990, Thill and Hilhorst 1996) work has led to a general consensus that above some dimension (typically around three or four) there does exist a positive-temperature phase transition below which spin-flip symmetry is broken, that is, in which pure states come in pairs, as discussed below eqn [4].

Because much of the literature has focused on this possibility, we assume it in what follows, and the metastate approach turns out to be highly useful in restricting the scenarios that can occur. The simplest such scenario is a two-state picture in which, below the transition temperature $T_c$, there exists a single pair of global flip-related pure states $\rho^{\alpha}_{\mathcal J}$ and $\rho^{\bar\alpha}_{\mathcal J}$. In this case, there is no CSD for periodic boundary conditions and the metastate can be written as

\[ \kappa_{\mathcal J} = \delta_{\frac{1}{2}\rho^{\alpha}_{\mathcal J} + \frac{1}{2}\rho^{\bar\alpha}_{\mathcal J}} \tag*{[6]} \]

That is, the metastate is supported on a single (mixed) thermodynamic state.

The two-state scenario that has received the most attention in the literature is the "droplet/scaling" picture (McMillan 1984, Fisher and Huse 1986, 1988, Bray and Moore 1985). In this picture a low-energy excitation above the ground state in $\Lambda_L$ is a droplet whose surface area scales as $l^{d_s}$, with $l \sim O(L)$ and $d_s < d$, and whose surface energy scales as $l^{\theta}$, with $\theta > 0$ (in dimensions where $T_c > 0$). More recently, an alternative picture has arisen (Krzakala and Martin 2000, Palassini and Young 2000) in which the low-energy excitations differ from those of droplet/scaling, in that their energies scale as $l^{\theta'}$, with $\theta' = 0$.

The low-temperature picture that has perhaps generated the most attention in the literature is the replica symmetry breaking (RSB) scenario (Binder and Young 1986, Marinari et al. 1994, 1997, Franz et al. 1998, Marinari et al. 2000, Marinari and Parisi 2000, 2001, Dotsenko 2001), which assumes a rather complicated pure-state structure, inspired by Parisi's solution of the SK model (Parisi 1979, 1983, Mézard et al. 1984, 1987). This is a many-state picture ($\mathcal N = \infty$ for a.e.
$\mathcal J$) in which the ordering is described in terms of the "overlaps" between states. There has been some ambiguity in how to describe such a picture for short-range models; the prevailing, or standard, view is as follows. Consider any reasonably constructed thermodynamic state $\rho_{\mathcal J}$ (see Newman and Stein (1998a) for more details), e.g., the "average" over the metastate

\[ \rho_{\mathcal J} = \int \Gamma\, d\kappa_{\mathcal J}(\Gamma) \tag*{[7]} \]

Now choose $\sigma$ and $\sigma'$ from the product distribution $\rho_{\mathcal J}(\sigma)\rho_{\mathcal J}(\sigma')$. The overlap $Q$ is defined as

\[ Q = \lim_{L\to\infty} |\Lambda_L|^{-1} \sum_{x\in\Lambda_L} \sigma_x \sigma'_x \tag*{[8]} \]

and $P_{\mathcal J}(q)$ is defined to be its probability distribution.

In the standard RSB picture, $\rho_{\mathcal J}$ is a mixture of infinitely many pure states, each with a specific $\mathcal J$-dependent weight $W$:

\[ \rho_{\mathcal J}(\sigma) = \sum_{\alpha} W^{\alpha}_{\mathcal J}\, \rho^{\alpha}_{\mathcal J}(\sigma) \tag*{[9]} \]

If $\sigma$ is drawn from $\rho^{\alpha}_{\mathcal J}$ and $\sigma'$ from $\rho^{\gamma}_{\mathcal J}$, then the expression in eqn [8] equals its thermal mean,

\[ q^{\alpha\gamma}_{\mathcal J} = \lim_{L\to\infty} |\Lambda_L|^{-1} \sum_{x\in\Lambda_L} \langle\sigma_x\rangle_{\alpha} \langle\sigma_x\rangle_{\gamma} \tag*{[10]} \]

and hence $P_{\mathcal J}$ is given by

\[ P_{\mathcal J}(q) = \sum_{\alpha,\gamma} W^{\alpha}_{\mathcal J} W^{\gamma}_{\mathcal J}\, \delta(q - q^{\alpha\gamma}_{\mathcal J}) \tag*{[11]} \]

The "self-overlap," or EA order parameter, is given by $q_{EA} = q^{\alpha\alpha}_{\mathcal J}$ and (at fixed $T$) is thought to be independent of both $\alpha$ and $\mathcal J$ (with probability 1).

According to the standard RSB scenario, the $W^{\alpha}_{\mathcal J}$'s and $q^{\alpha\gamma}_{\mathcal J}$'s are non-self-averaging (i.e., $\mathcal J$-dependent) quantities, except for $\alpha = \gamma$ or its global flip, where $q^{\alpha\gamma}_{\mathcal J} = \pm q_{EA}$. The average $P_s(q)$ of $P_{\mathcal J}(q)$ over the disorder distribution of $\mathcal J$ is predicted to be a mixture of two delta-function components at $\pm q_{EA}$ and a continuous part between them. However, it was proved by Newman and Stein (1996c) that this scenario cannot occur, because of the translation invariance of $P_{\mathcal J}(q)$ and the translation ergodicity of the disorder distribution. Nevertheless, the metastate approach suggests an alternative, nonstandard, RSB scenario, which is described next.

The idea behind the nonstandard RSB picture (referred to by us as the nonstandard SK picture in earlier papers) is to produce the finite-volume behavior of the SK model to the maximum extent possible. We therefore assume in this picture that in each $\Lambda_L$, the finite-volume Gibbs state $\rho^{(L)}_{\mathcal J}$ is well approximated deep in the interior by a mixed thermodynamic state $\Gamma^{(L)}$, decomposable into many pure states $\rho^{\alpha_L}$ (explicit dependence on $\mathcal J$ is suppressed for ease of notation). More precisely, each $\Gamma$ in $\kappa_{\mathcal J}$ satisfies

\[ \Gamma = \sum_{\alpha} W^{\alpha}_{\Gamma}\, \rho^{\alpha}_{\Gamma} \tag*{[12]} \]

and is presumed to have a nontrivial overlap distribution for $\sigma$, $\sigma'$ from $\Gamma(\sigma)\Gamma(\sigma')$:

\[ P_{\Gamma}(q) = \sum_{\alpha,\gamma} W^{\alpha}_{\Gamma} W^{\gamma}_{\Gamma}\, \delta(q - q^{\alpha\gamma}_{\Gamma}) \tag*{[13]} \]

as did $\rho_{\mathcal J}$ in the standard RSB picture.

Because $\kappa_{\mathcal J}$, like its counterpart $\rho_{\mathcal J}$ in the standard SK picture, is translation covariant, the resulting ensemble of overlap distributions $P_{\Gamma}$ is independent of $\mathcal J$. Because of the CSD present in this scenario, the overlap distribution for $\rho^{(L)}_{\mathcal J}$ varies with $L$, no matter how large $L$ becomes. So, instead of averaging the overlap distribution over $\mathcal J$, the averaging must now be done over the states $\Gamma$ within the metastate $\kappa_{\mathcal J}$, all at fixed $\mathcal J$:

\[ P_{ns}(q) = \int P_{\Gamma}(q)\, \kappa_{\mathcal J}(\Gamma)\, d\Gamma \tag*{[14]} \]

The $P_{ns}(q)$ is the same for a.e. $\mathcal J$, and has a form analogous to the $P_s(q)$ in the standard RSB picture.

However, the nonstandard RSB scenario seems rather unlikely to occur in any natural setting, because of the following result:

Theorem (Newman and Stein 1998b). Consider two metastates constructed along (the same) deterministic sequence of $\Lambda_L$'s, using two different sequences of flip-related, coupling-independent boundary conditions (such as periodic and antiperiodic). Then with probability one, these two metastates are the same.

The proof is given by Newman and Stein (1998b), but the essential idea can be easily described here. As discussed earlier, $\kappa_{\mathcal J} = \kappa^\dagger_{\mathcal J}$; but $\kappa^\dagger_{\mathcal J}$ is constructed by a limit of finite-dimensional distributions, which means averaging over other couplings including the ones near the system boundary, and hence gives the same metastate for two flip-related boundary conditions.

This invariance with respect to different sequences of periodic and antiperiodic boundary conditions means essentially that the frequency of appearance of various thermodynamic states $\Gamma^{(L)}$ in finite volumes $\Lambda_L$ is independent of the choice of boundary conditions. Moreover, this same
invariance property holds among any two sequences of fixed boundary conditions (and the fixed boundary condition of choice may even be allowed to vary arbitrarily along any single sequence of volumes)! It follows that, with respect to changes of boundary conditions, the metastate is extraordinarily robust.

This should rule out all but the simplest overlap structures, and in particular the nonstandard RSB and related pictures (for a full discussion, see Newman and Stein 1998b). It is therefore natural to ask whether the property of metastate invariance allows any many-state picture.

There is one such picture, namely the "chaotic pairs" picture, which is fully consistent with metastate invariance (our belief is that it is the only many-state picture that fits naturally and easily into results obtained about the metastate).

Here the periodic boundary condition metastate is supported on infinitely many pairs of pure states, but instead of eqn [12] one has

\[ \Gamma = (1/2)\rho^{\alpha_\Gamma} + (1/2)\rho^{\bar\alpha_\Gamma} \tag*{[15]} \]

with overlap

\[ P_{\Gamma} = (1/2)\delta(q - q_{EA}) + (1/2)\delta(q + q_{EA}) \tag*{[16]} \]

So there is CSD in the states but not in the overlaps, which have the same form as a two-state picture in every volume. The difference is that, while in the latter case, one has the "same" pair of states in every volume, in chaotic pairs the pure-state pair varies chaotically as volume changes. If the chaotic pairs picture is to be consistent with metastate invariance in a natural way, then the number of pure-state pairs should be "uncountable." This allows for a "uniform" distribution (within the metastate) over all of the pure states, and invariance of the metastate with respect to boundary conditions could follow naturally.

Open Questions

We have discussed how the metastate approach to the EA spin glass has narrowed considerably the set of possible scenarios for low-temperature ordering in any finite dimension, should broken spin-flip symmetry occur. The remaining possibilities are either a two-state scenario, such as droplet/scaling, or the chaotic-pairs picture if there exist many pure states at some $(\beta, d)$. Both have simple overlap structures. The metastate approach appears to rule out more complicated scenarios such as RSB, in which the approximate pure-state decomposition in a typical large, finite volume is a nontrivial mixture of many pure-state pairs.

Of course, this does not answer the question of which, if either, of the remaining pictures actually does occur in real spin glasses. In this section we list a number of open questions relevant to the above discussion.

Open Question 1 Determine whether a phase transition occurs in any finite dimension greater than one. If it does, find the lower critical dimension.

Existence of a phase transition does not necessarily imply two or more pure states below $T_c$. It could happen, for example, that in some dimension there exists a single pure state at all nonzero temperatures, with two-point spin correlations decaying exponentially above $T_c$ and more slowly (e.g., as a power law) below $T_c$. This leads to:

Open Question 2 If there does exist a phase transition above some lower critical dimension, determine whether the low-temperature spin-glass phase exhibits broken spin-flip symmetry.

If broken symmetry does occur in some dimension, then of course an obvious open question is to determine the number of pure-state pairs, and hence the nature of ordering at low temperature. A (possibly) easier question (but still very difficult), and one which does not rely on knowing whether a phase transition occurs, is to determine the zero-temperature (i.e., ground state) properties of spin glasses as a function of dimensionality. A ground state is an infinite-volume spin configuration whose energy (governed by eqn [1]) cannot be lowered by flipping any finite subset of spins. That is, all ground state spin configurations must satisfy the constraint

\[ \sum_{\langle x,y\rangle \in C} J_{xy}\,\sigma_x \sigma_y \ge 0 \tag*{[17]} \]

along any closed loop $C$ in the dual lattice.

Open Question 3 How many ground state pairs is the $T = 0$ periodic boundary condition metastate supported on, as a function of $d$?

The answer is known to be one for 1D, and a partial result (Newman and Stein 2000, 2001a) points towards the answer being one for 2D as well. There are no rigorous, or even heuristic (except based on underlying "ansätze") arguments in higher dimension.

An interesting (but unrealistic) spin-glass model in which the ground state structure can be exactly solved (although not yet completely rigorously) was proposed by the authors (Newman and Stein 1994, 1996a) (see also Banavar et al. 1994). This "highly disordered" spin glass is one in which the coupling magnitudes scale nonlinearly with the volume (and so are no longer distributed independently of the volume, although they remain independent and identically distributed for each volume). The model
displays a transition in ground state multiplicity: below eight dimensions, it has only a single pair of ground states, while above eight it has uncountably many such pairs. The mechanism behind the transition arises from a mapping to invasion percolation and minimal spanning trees (Lenormand and Bories 1980, Chandler et al. 1982, Wilkinson and Willemsen 1983): the number of ground state pairs can be shown to equal $2^{\mathcal N}$, where $\mathcal N = \mathcal N(d)$ is the number of distinct global components in the "minimal spanning forest." The zero-temperature free boundary condition metastate above eight dimensions is supported on a uniform distribution (in a natural sense) on uncountably many ground state pairs.

Interestingly, the high-dimensional ground state multiplicity in this model can be shown to be unaffected by the presence of frustration, although frustration still plays an interesting role: it leads to the appearance of chaotic size dependence when free boundary conditions are used.

Returning to the more difficult problem of ground state multiplicity in the EA model, we note as a final remark that there could, in principle, exist ground state pairs that are not in the support of metastates generated through the use of coupling-independent boundary conditions. If such states exist, they may be of some interest mathematically, but are not expected to play any significant physical role. A discussion of these putative "invisible states" is given by Newman and Stein (2003).

Open Question 4 If there exists broken spin-flip symmetry at a range of positive temperatures in some dimensions, then what is the number of pure-state pairs as a function of $(\beta, d)$?

Again, the answer to this is not known above one dimension; indeed, the prerequisite existence of spontaneously broken spin-flip symmetry has not been proved in any dimension. A speculative paper by the authors (Newman and Stein 2001b), using a variant of the highly disordered model, suggests that there is at most one pair of pure states in the EA model below eight dimensions; but no rigorous arguments are known at this time.

Acknowledgments

This work was supported in part by NSF Grants DMS-01-02587 and DMS-01-02541.

See also: Glassy Disordered Systems: Dynamical Evolution; Mean Field Spin Glasses and Neural Networks; Spin Glasses.

Further Reading

Aizenman M, Lebowitz JL, and Ruelle D (1987) Some rigorous results on the Sherrington–Kirkpatrick spin glass model. Communications in Mathematical Physics 112: 3–20.
Aizenman M and Wehr J (1990) Rounding effects of quenched randomness on first-order phase transitions. Communications in Mathematical Physics 130: 489–528.
Banavar JR, Cieplak M, and Maritan A (1994) Optimal paths and domain walls in the strong disorder limit. Physical Review Letters 72: 2320–2323.
Binder K and Young AP (1986) Spin glasses: experimental facts, theoretical concepts, and open questions. Reviews of Modern Physics 58: 801–976.
Bray AJ and Moore MA (1985) Critical behavior of the three-dimensional Ising spin glass. Physical Review B 31: 631–633.
Campanino M, Olivieri E, and van Enter ACD (1987) One dimensional spin glasses with potential decay $1/r^{1+\epsilon}$. Absence of phase transitions and cluster properties. Communications in Mathematical Physics 108: 241–255.
Chandler R, Koplick J, Lerman K, and Willemsen JF (1982) Capillary displacement and percolation in porous media. Journal of Fluid Mechanics 119: 249–267.
Chowdhury D (1986) Spin Glasses and Other Frustrated Systems. New York: Wiley.
Dotsenko V (2001) Introduction to the Replica Theory of Disordered Statistical Systems. Cambridge: Cambridge University Press.
Edwards S and Anderson PW (1975) Theory of spin glasses. Journal of Physics F 5: 965–974.
Fischer KH and Hertz JA (1991) Spin Glasses. Cambridge: Cambridge University Press.
Fisher DS and Huse DA (1986) Ordered phase of short-range Ising spin-glasses. Physical Review Letters 56: 1601–1604.
Fisher DS and Huse DA (1988) Equilibrium behavior of the spin-glass ordered phase. Physical Review B 38: 386–411.
Fisher ME and Singh RRP (1990) Critical points, large-dimensionality expansions and Ising spin glasses. In: Grimmett G and Welsh DJA (eds.) Disorder in Physical Systems, pp. 87–111. Oxford: Clarendon Press.
Franz S, Mézard M, Parisi G, and Peliti L (1998) Measuring equilibrium properties in aging systems. Physical Review Letters 81: 1758–1761.
Fröhlich J and Zegarlinski B (1987) The high-temperature phase of long-range spin glasses. Communications in Mathematical Physics 110: 121–155.
Gandolfi A, Newman CM, and Stein DL (1993) Exotic states in long-range spin glasses. Communications in Mathematical Physics 157: 371–387.
Georgii HO (1988) Gibbs Measures and Phase Transitions. Berlin: de Gruyter.
Kawashima N and Young AP (1996) Phase transition in the three-dimensional $\pm J$ Ising spin glass. Physical Review B 53: R484–R487.
Krzakala F and Martin OC (2000) Spin and link overlaps in three-dimensional spin glasses. Physical Review Letters 85: 3013–3016.
Lenormand R and Bories S (1980) Description d'un mécanisme de connexion de liaison destiné à l'étude du drainage avec piègeage en milieu poreux. Comptes Rendus de l'Académie des Sciences 291: 279–282.
Marinari E and Parisi G (2000) Effects of changing the boundary conditions on the ground state of Ising spin glasses. Physical Review B 62: 11677–11685.
Marinari E and Parisi G (2001) Effects of a bulk perturbation on the ground state of 3D Ising spin glasses. Physical Review Letters 86: 3887–3890.
Marinari E, Parisi G, Ricci-Tersenghi F, Ruiz-Lorenzo JJ, and Zuliani F (2000) Replica symmetry breaking in spin glasses: Theoretical foundations and numerical evidences. Journal of Statistical Physics 98: 973–1047.
Marinari E, Parisi G, and Ritort F (1994) On the 3D Ising spin glass. Journal of Physics A 27: 2687–2708.
Marinari E, Parisi G, and Ruiz-Lorenzo J (1997) Numerical simulations of spin glass systems. In: Young AP (ed.) Spin Glasses and Random Fields, pp. 59–98. Singapore: World Scientific.
McMillan WL (1984) Scaling theory of Ising spin glasses. Journal of Physics C 17: 3179–3187.
Mézard M, Parisi G, Sourlas N, Toulouse G, and Virasoro M (1984) Nature of spin-glass phase. Physical Review Letters 52: 1156–1159.
Mézard M, Parisi G, and Virasoro MA (eds.) (1987) Spin Glass Theory and Beyond. Singapore: World Scientific.
Newman CM and Stein DL (1992) Multiple states and thermodynamic limits in short-ranged Ising spin glass models. Physical Review B 46: 973–982.
Newman CM and Stein DL (1994) Spin-glass model with dimension-dependent ground state multiplicity. Physical Review Letters 72: 2286–2289.
Newman CM and Stein DL (1996a) Ground state structure in a highly disordered spin glass model. Journal of Statistical Physics 82: 1113–1132.
Newman CM and Stein DL (1996b) Non-mean-field behavior of realistic spin glasses. Physical Review Letters 76: 515–518.
Newman CM and Stein DL (1996c) Spatial inhomogeneity and thermodynamic chaos. Physical Review Letters 76: 4821–4824.
Newman CM and Stein DL (1998a) Simplicity of state and overlap structure in finite volume realistic spin glasses. Physical Review E 57: 1356–1366.
Newman CM and Stein DL (1998b) Thermodynamic chaos and the structure of short-range spin glasses. In: Bovier A and Picco P (eds.) Mathematics of Spin Glasses and Neural Networks, pp. 243–287. Boston: Birkhäuser.
Newman CM and Stein DL (2000) Nature of ground state incongruence in two-dimensional spin glasses. Physical Review Letters 84: 3966–3969.
Newman CM and Stein DL (2001a) Are there incongruent ground states in 2D Edwards–Anderson spin glasses? Communications in Mathematical Physics 224: 205–218.
Newman CM and Stein DL (2001b) Realistic spin glasses below eight dimensions: a highly disordered view. Physical Review E 63: 16101-1–16101-9.
Newman CM and Stein DL (2003) Topical review: Ordering and broken symmetry in short-ranged spin glasses. Journal of Physics: Condensed Matter 15: R1319–R1364.
Ogielski AT (1985) Dynamics of three-dimensional spin glasses in thermal equilibrium. Physical Review B 32: 7384–7398.
Ogielski AT and Morgenstern I (1985) Critical behavior of the three-dimensional Ising spin-glass model. Physical Review Letters 54: 928–931.
Palassini M and Young AP (2000) Nature of the spin glass state. Physical Review Letters 85: 3017–3020.
Parisi G (1979) Infinite number of order parameters for spin-glasses. Physical Review Letters 43: 1754–1756.
Parisi G (1983) Order parameter for spin-glasses. Physical Review Letters 50: 1946–1948.
Sherrington D and Kirkpatrick S (1975) Solvable model of a spin glass. Physical Review Letters 35: 1792–1796.
Stein DL (1989) Disordered systems: Mostly spin glasses. In: Stein DL (ed.) Lectures in the Sciences of Complexity, pp. 301–355. New York: Addison–Wesley.
Thill MJ and Hilhorst HJ (1996) Theory of the critical state of low-dimensional spin glass. Journal of Physics I 6: 67–95.
van Enter ACD and Fröhlich J (1985) Absence of symmetry breaking for N-vector spin glass models in two dimensions. Communications in Mathematical Physics 98: 425–432.
Wilkinson D and Willemsen JF (1983) Invasion percolation: A new form of percolation theory. Journal of Physics A 16: 3365–3376.
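
As a concrete companion to eqns [1], [2], and [8] of the article above, the following minimal Python sketch enumerates all spin configurations of a very small periodic cube, builds the finite-volume Gibbs distribution by brute force, and tabulates the finite-volume overlap distribution of two independent replicas. It is only an illustration under stated assumptions: the function names (`nearest_neighbor_bonds`, `gibbs_distribution`, `overlap_distribution`), the Gaussian choice for the symmetric coupling distribution, and the lattice size are all hypothetical choices made here, and the finite-volume overlap is not the $L \to \infty$ limit appearing in eqn [8].

```python
import itertools
import math
import random


def nearest_neighbor_bonds(L, d):
    """Sites and nearest-neighbor bonds of the periodic cube {0,...,L-1}^d.

    Returns (number_of_sites, sorted list of bonds), each bond being a pair
    (i, j) of site indices with i < j, listed once.
    """
    sites = list(itertools.product(range(L), repeat=d))
    index = {s: i for i, s in enumerate(sites)}
    bonds = set()
    for s in sites:
        for axis in range(d):
            t = list(s)
            t[axis] = (t[axis] + 1) % L  # periodic boundary condition
            i, j = index[s], index[tuple(t)]
            if i != j:
                bonds.add((min(i, j), max(i, j)))
    return len(sites), sorted(bonds)


def gibbs_distribution(n, bonds, J, beta):
    """Finite-volume Gibbs distribution, eqn [2], by exhaustive enumeration."""
    configs = list(itertools.product((-1, 1), repeat=n))
    # Energy from eqn [1]: H = -sum over bonds of J_xy * sigma_x * sigma_y.
    energies = [-sum(J[b] * s[b[0]] * s[b[1]] for b in bonds) for s in configs]
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)  # partition function Z_L(beta)
    return configs, [w / Z for w in weights]


def overlap_distribution(configs, probs):
    """Distribution of sum_x sigma_x * sigma'_x for two independent replicas.

    Keys are the integer sums; divide a key by the number of sites to get
    the finite-volume analog of the overlap in eqn [8].
    """
    dist = {}
    for s, p in zip(configs, probs):
        for t, r in zip(configs, probs):
            k = sum(a * b for a, b in zip(s, t))
            dist[k] = dist.get(k, 0.0) + p * r
    return dist


if __name__ == "__main__":
    n, bonds = nearest_neighbor_bonds(3, 2)      # 3 x 3 periodic square
    rng = random.Random(0)                       # fixed seed: one coupling realization
    J = {b: rng.gauss(0.0, 1.0) for b in bonds}  # distribution symmetric about zero
    configs, probs = gibbs_distribution(n, bonds, J, beta=1.0)
    dist = overlap_distribution(configs, probs)
    for k in sorted(dist):
        print(f"q = {k / n:+.3f}   P(q) = {dist[k]:.4f}")
```

Because the Hamiltonian is invariant under a global spin flip, the tabulated distribution is symmetric under $q \to -q$ for every coupling realization; exhaustive enumeration is only feasible for a handful of spins, so the sketch says nothing about the large-volume behavior that the article is concerned with.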

Sine-Gordon Equation
S N M Ruijsenaars, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The sine-Gordon equation

\[ \left( \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial t^2} \right) \phi = \sin\phi \tag*{[1]} \]

may be viewed as a prototype for a nonlinear integrable field theory. It is manifestly invariant under spacetime translations and Lorentz boosts,

\[ (x,t) \mapsto (x - \xi,\ t - \eta), \qquad (x,t) \mapsto (x\cosh\theta - t\sinh\theta,\ t\cosh\theta - x\sinh\theta) \tag*{[2]} \]

It shares this relativistic invariance property with the linear Klein–Gordon equation, which is obtained upon replacing $\sin\phi$ by $\phi$. (The name sine-Gordon equation is derived from this relation, and was introduced by Kruskal.) The sine-Gordon equation can also be defined and studied in the form

\[ \frac{\partial^2 \tilde\phi}{\partial u\, \partial v} = \sin\tilde\phi, \qquad \tilde\phi(u,v) = \phi(t,x) \tag*{[3]} \]

where

\[ u = (x + t)/2, \qquad v = (x - t)/2 \tag*{[4]} \]

are the so-called light-cone variables.

There are two interpretations of the field $\phi(t,x)$ that are quite different, both from a physical and from a mathematical viewpoint. The first one consists in viewing it as a real-valued function, so
that [1] is simply a nonlinear PDE in two variables. In the second version, one views $\phi(t,x)$ as an operator-valued distribution on a Hilbert space. (Thus, one should smear $\phi(t,x)$ with a test function $f(t,x)$ in Schwartz space to obtain a genuine operator on the Hilbert space.) In spite of their different character, the classical and quantum field theory versions have several striking features in common, including the presence of an infinite number of conservation laws and the occurrence of solitonic excitations.

The classical sine-Gordon equation has been used as a model for various wave phenomena, including the propagation of dislocations in crystals, phase differences across Josephson junctions, torsion waves in strings and pendula, and waves along lipid membranes. It was already studied in the nineteenth century in connection with the theory of pseudospherical surfaces. The quantum version is used as a simple model for solid-state excitations.

The designation "sine-Gordon" is also used for various equations that generalize [1] or bear resemblance to it. These include the so-called homogeneous and symmetric space sine-Gordon models, discrete and supersymmetric versions, and generalizations to higher-dimensional spacetimes (i.e., in [1] the spatial derivative is replaced by the Laplace operator in several variables). In this contribution we focus on [1], however.

Our main goal is to discuss the integrability and solitonic properties, both at the classical and at the quantum level. First, we sketch the inverse-scattering transform (IST) solution to the Cauchy problem for [1]. Following Faddeev and Takhtajan, we emphasize the interpretation of the IST as an action-angle transformation for an infinite-dimensional Hamiltonian system. Next, the particle-like solutions are surveyed by using a description in terms of variables that may be viewed as relativistic action-angle coordinates. This is followed by a section on the quantum field theory version, paying special attention to the factorized scattering that is the quantum analog of the solitonic classical scattering. Finally, we sketch the intimate relation between the N-particle subspaces of the classical and quantum sine-Gordon field theory and certain integrable relativistic systems of N point particles on the line.

The Classical Version: An Integrable Hamiltonian System

In order to tie in the hyperbolic evolution equation [1] with the notion of infinite-dimensional integrable system, it is necessary to restrict attention to initial data $\phi(0,x) = \phi(x)$ and $\partial_t\phi(0,x) = \pi(x)$ with special properties. First of all, the energy functional

\[ H = \int_{-\infty}^{\infty} \left( \frac{1}{2}\pi(x)^2 + \frac{1}{2}\partial_x\phi(x)^2 + (1 - \cos\phi(x)) \right) dx \tag*{[5]} \]

and symplectic form

\[ \omega = \int_{-\infty}^{\infty} d\phi(x) \wedge d\pi(x)\, dx \tag*{[6]} \]

should be well defined on the phase space of initial data. Indeed, in that case [1] amounts to the Hamilton equation associated to [5] via [6].

Second, there exists a sequence of functionals

\[ I_{2l+1}(\phi, \pi), \qquad l \in \mathbb{Z} \tag*{[7]} \]

that formally Poisson-commute with $H$ and among themselves.

In particular, $H$ equals $2(I_1 + I_{-1})$, whereas $2(I_1 - I_{-1})$ equals the momentum functional

\[ P = \int_{-\infty}^{\infty} \pi(x)\, \partial_x\phi(x)\, dx \tag*{[8]} \]

The functional $I_{2l+1}$ contains $x$-derivatives of order up to $|2l+1|$, so one needs to require that the functions $\partial_x\phi(x)$ and $\pi(x)$ be smooth and that all of their derivatives have sufficient decrease for $x \to \pm\infty$.

A natural choice guaranteeing the latter requirements is

\[ \partial_x\phi(x),\ \pi(x) \in S_{\mathbb{R}}(\mathbb{R}) \tag*{[9]} \]

where $S_{\mathbb{R}}(\mathbb{R})$ denotes the Schwartz space of real-valued functions on the line. To render the integral over $1 - \cos\phi(x)$ (and similar integrals occurring for the sequence [7]) finite, one also needs to require

\[ \phi(x) \to 2\pi k_{\pm}, \qquad x \to \pm\infty, \qquad k_{\pm} \in \mathbb{Z} \tag*{[10]} \]

On this phase space $\Omega$ of initial data, the Cauchy problem for the evolution equation [1] is not only well posed, but can be solved in explicit form by using the IST. More generally, the Hamiltonians $I_{2l+1}$ give rise to evolution equations that are simultaneously solved via the IST, yielding an infinite sequence of commuting Hamiltonian flows on $\Omega$.

Before sketching the overall picture resulting from the IST, it should be mentioned at this point that [1] admits explicit solutions of interest that do not belong to $\Omega$. First, there is a class of algebro-geometric solutions that have no limits as $x \to \pm\infty$. These solutions can be obtained via finite-gap integration methods, yielding formulas involving
the Riemann theta functions associated to compact Riemann surfaces. Second, there are the tachyon solutions. They arise from the particle-like solutions that do belong to $\Omega$ by the transformation

$$\phi(t, x) \to \phi(x, t) + \pi \qquad [11]$$

(Observe that the equation of motion [1] is invariant under [11], whereas due to the finite-energy requirement [10] this is not true for solutions evolving in $\Omega$.)

The IST via which the above Cauchy problem can be solved starts from an auxiliary system of two linear ordinary differential equations involving $\phi(0, x)$ and $\partial_t \phi(0, x)$. It is beyond our scope to describe the system in detail. The results derived from it, however, are to a large extent the same as those obtained via a simpler auxiliary linear operator that is associated to the light-cone Cauchy problem. The latter operator is of the Ablowitz–Kaup–Newell–Segur (AKNS) form. That is, the linear operator is an ordinary differential operator of Dirac type given by

$$L = \begin{pmatrix} i\,\dfrac{d}{dx} & -iq \\ ir & -i\,\dfrac{d}{dx} \end{pmatrix} \qquad [12]$$

where the external potentials $r(u)$ and $q(u)$ depend on the evolution equation at hand. For the light-cone sine-Gordon equation [3], one needs to choose

$$r = -q = (\partial_u \tilde{\phi})(u, 0)/2 \qquad [13]$$

In both settings, the associated spectral features are invariant under the sine-Gordon evolution and all of the evolutions generated by the Hamiltonians $I_{2l+1}$, yielding the so-called isospectral flows. More specifically, if the initial data give rise to bound-state solutions of the linear problem (square-integrable wave functions), then the corresponding eigenvalues are time independent. Furthermore, due to the decay requirements on the potential in the linear system, there exist scattering solutions with plane-wave asymptotics for all initial data in $\Omega$. A suitable normalization leads to the so-called Jost solutions $\Psi(x, \lambda)$. (Here $\lambda$ is the spectral parameter, which varies over the real line for scattering solutions.) Their $x \to \pm\infty$ asymptotics is encoded in transition coefficients $a(\lambda)$ and $b(\lambda)$, with $a(\lambda)$ and $|b(\lambda)|$ being time independent, whereas $\arg b(\lambda)$ has a linear dependence on time when the potential evolves according to the sine-Gordon equation. The bound states correspond to special $\lambda$-values $\lambda_1, \ldots, \lambda_N$ with positive imaginary part (namely the zeros of the coefficient $a(\lambda)$, which is analytic in the upper-half $\lambda$-plane); their normalization coefficients $\nu_1, \ldots, \nu_N$ have an essentially linear time evolution, just as $b(\lambda)$.

The crux of the IST is now that the potentials can be reconstructed from the spectral data

$$\{b(\lambda), \lambda_1, \ldots, \lambda_N, \nu_1, \ldots, \nu_N\} \qquad [14]$$

by solving a linear integral equation of Gelfand–Levitan–Marchenko (GLM) type. (Alternatively, Riemann–Hilbert problem techniques can be used.) Hence, the nonlinear Cauchy problem can be replaced by the far simpler linear problems of determining the spectral data [14] of a linear operator (the direct problem) and then solving the linear GLM equation for the time-evolved scattering data (the inverse problem).

From the Hamiltonian perspective, the IST may be reinterpreted as a transformation to action-angle variables. The action variables are defined in terms of $|b(\lambda)|$ and $\lambda_1, \ldots, \lambda_N$. They are time independent under the sine-Gordon and higher Hamiltonian flows. The angle variables are $\arg b(\lambda)$ and suitable functions of the normalization coefficients. They depend linearly on the evolution times of the flows. The Hamiltonians can be explicitly expressed in action variables.

Next, we point out that there is a large subspace of Cauchy data $(\phi(x), \pi(x))$ that do not give rise to bound states in the auxiliary linear problem. The associated solutions are the so-called radiation solutions: they decrease to 0 for large times. These solutions can be obtained from the inverse transform involving the GLM equation by only taking $b(\lambda)$ into account.

The other extreme is to choose $b(\lambda) = 0$ and arbitrary bound states and normalization coefficients in the GLM equation. This special case of vanishing reflection leads to the particle-like solutions that are studied in the next section. For general Cauchy data, one has both $b(\lambda) \neq 0$ and a finite number of bound states. These so-called mixed solutions have a radiation component (encoded in $b(\lambda)$) which decays for asymptotic times, whereas the bound states show up for $t \to \pm\infty$ as isolated solitons, antisolitons, and breathers.

Classical Solitons, Antisolitons, and Breathers

Just as for other classical soliton equations, the case of reflectionless data can be handled in complete detail, since the GLM equation reduces to an $N \times N$ system of linear equations. The case $N = 1$ yields the 1-soliton and 1-antisoliton solutions. Resting at the origin, these one-particle solutions are given by

$$\pm 4 \arctan(e^{-x}) \qquad [15]$$
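As a quick numerical sanity check on [15], the sketch below evaluates the static sine-Gordon residual and the energy of the resting soliton and antisoliton. It assumes that [1] reads $\phi_{xx} - \phi_{tt} = \sin\phi$ (the form that [36] takes for $m = \beta = 1$) and that the energy [5] is $\int (\tfrac{1}{2}\phi_t^2 + \tfrac{1}{2}\phi_x^2 + 1 - \cos\phi)\,dx$. The function names, step sizes, and integration window are illustrative choices, not part of the original text.

```python
import math


def phi(x, sign=1.0):
    """Resting one-particle solution [15]: sign=+1 soliton, sign=-1 antisoliton."""
    return sign * 4.0 * math.atan(math.exp(-x))


def residual(x, sign=1.0, h=1e-3):
    """Static residual phi_xx - sin(phi), using a central second difference."""
    phi_xx = (phi(x + h, sign) - 2.0 * phi(x, sign) + phi(x - h, sign)) / h**2
    return phi_xx - math.sin(phi(x, sign))


def energy(sign=1.0, half_width=30.0, n=20000, h=1e-5):
    """Trapezoidal estimate of the assumed energy integral for a static profile."""
    dx = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * dx
        phi_x = (phi(x + h, sign) - phi(x - h, sign)) / (2.0 * h)
        density = 0.5 * phi_x**2 + 1.0 - math.cos(phi(x, sign))
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * density
    return total * dx


if __name__ == "__main__":
    for s in (1.0, -1.0):
        worst = max(abs(residual(x, s)) for x in (-3.0, -0.5, 0.0, 1.2, 4.0))
        print(f"sign={s:+.0f}  max residual={worst:.2e}  energy={energy(s):.6f}")
```

Under these assumptions the residual vanishes up to discretization error and the energy integral comes out close to 8 for both signs.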
These solutions have energy 8 (cf. [5]). (We normalize all solutions by requiring

$$\lim_{x \to \infty} \phi(t, x) = 0 \qquad [16]$$

Note that one can add arbitrary multiples of $2\pi$ without changing the energy $H$ [5].) A spatial translation and Lorentz boost then yields the general solutions

$$\phi_{\pm}(t, x) = \pm 4 \arctan(\exp(q - x \cosh\theta + t \sinh\theta)) \qquad [17]$$

with energy $8\cosh\theta$ and momentum $8\sinh\theta$ (cf. [8]). Defining the topological charge of a solution (with normalization [16]) by

$$Q = \frac{1}{2\pi} \lim_{x \to -\infty} \phi(t, x) \qquad [18]$$

the different charges $Q = 1$ and $Q = -1$ of the soliton and antisoliton reflect a signature associated to the special value of the spectral parameter on the imaginary axis for which a bound state in the linear problem occurs. More generally, for bound-state eigenvalues on the imaginary axis these signatures must be specified in the IST setting, a point glossed over in the previous section.

Bound states in the linear problem can also arise from $\lambda$-values off the imaginary axis, which come in pairs $ia \pm b$, with $a, b > 0$. Such pairs give rise to solutions containing breathers, which can be viewed as bound states of a soliton and an antisoliton. The one-breather solution breathing at the origin is given by

$$4 \arctan\left( \cot\alpha\, \frac{\sin(t \sin\alpha)}{\cosh(x \cos\alpha)} \right), \quad \alpha \in (0, \pi/2) \qquad [19]$$

and has energy $16\cos\alpha$. A spacetime translation and Lorentz boost then yields the general solution

$$\phi_b(t, x) = 4 \arctan\left( \cot\alpha\, \frac{\sin[\gamma/2 + \sin\alpha\,(t \cosh\theta - x \sinh\theta)]}{\cosh[y/2 - \cos\alpha\,(x \cosh\theta - t \sinh\theta)]} \right) \qquad [20]$$

which has energy $16\cosh\theta\cos\alpha$ and momentum $16\sinh\theta\cos\alpha$. It may be obtained by analytic continuation from the solution describing a collision between a soliton with velocity $\tanh\theta_1$ and an antisoliton with velocity $\tanh\theta_2$, taking $\theta_2 < \theta_1$. The latter is given by

$$\phi_{+-}(t, x) = 4 \arctan\left( \coth((\theta_1 - \theta_2)/2)\, \frac{\sinh((\xi_1 - \xi_2)/2)}{\cosh((\xi_1 + \xi_2)/2)} \right), \quad \theta_2 < \theta_1 \qquad [21]$$

where

$$\xi_j = q_j - x \cosh\theta_j + t \sinh\theta_j, \quad q_j, \theta_j \in \mathbb{R} \qquad [22]$$

and $\phi_b$ results from $\phi_{+-}$ by substituting

$$\theta_1 \to \theta - i\alpha, \quad \theta_2 \to \theta + i\alpha, \quad q_1 \to (y - i\gamma)/2, \quad q_2 \to (y + i\gamma)/2 \qquad [23]$$

(For the case $\theta_1 < \theta_2$, one needs an extra minus sign on the right-hand side of [21].)

There is yet another possibility for an eigenvalue on the imaginary axis we have not mentioned thus far: it may have an arbitrary multiplicity, giving rise to the so-called multipole solutions. This is illustrated by the breather solution $\phi_b$: when one sets $\gamma = 2\alpha q_0$ and lets $\alpha$ tend to 0, one obtains a solution

$$\phi_{\mathrm{sep}}(t, x) = 4 \arctan\left( \frac{q_0 + t \cosh\theta - x \sinh\theta}{\cosh[y/2 - x \cosh\theta + t \sinh\theta]} \right) \qquad [24]$$

From a physical viewpoint, the soliton and antisoliton have just enough energy to prevent a bound state from being formed. Notice that in this case the distance between soliton and antisoliton diverges logarithmically in $|t|$ as $t \to \pm\infty$, whereas for $\phi_{+-}$ one obtains linear increase.

The 2-soliton and 2-antisoliton solutions can also be obtained by analytic continuation of $\phi_{+-}$. They read

$$\phi_{\pm\pm} = \mp 4 \arctan\left( \coth((\theta_1 - \theta_2)/2)\, \frac{\cosh((\xi_1 - \xi_2)/2)}{\sinh((\xi_1 + \xi_2)/2)} \right), \quad \theta_2 < \theta_1 \qquad [25]$$

where $\xi_j$ is given by [22]. Thus, they arise by taking $q_2 \to q_2 + i\pi$ and $q_1 \to q_1 + i\pi$ in [21], resp. The equal-signature eigenvalues corresponding to these two solutions cannot collide and move off the imaginary axis; physically speaking, equal-charge particles repel each other. The energy and momentum of the solutions [25] and [21] are given by $8\cosh\theta_1 + 8\cosh\theta_2$ and $8\sinh\theta_1 + 8\sinh\theta_2$, respectively.

Up to scale factors, the above variables $\theta_1, \theta_2$ and $\theta, \alpha$ are the action variables resulting from the IST, whereas $q_1, q_2$ and $y, \gamma$ are the canonically conjugated angle variables. Accordingly, the time and space translation flows (generated by $H$ [5] and $P$ [8], resp.) shift the angles linearly in the evolution parameters $t$ and $x$.

We conclude this section with a description of the N-soliton solution and its large time asymptotics. It can be expressed in terms of the $N \times N$ matrix

$$L_{jk} = \exp(\xi_j)\, \frac{\prod_{l \neq j} |\coth((\theta_j - \theta_l)/2)|}{\cosh((\theta_j - \theta_k)/2)} \qquad [26]$$
where $\xi_j$ is given by [22] with

$$q_1, \ldots, q_N \in \mathbb{R}, \quad \theta_N < \cdots < \theta_1 \qquad [27]$$

Specifically, one has

$$\phi_{N+}(t, x) = 4\, \mathrm{tr} \arctan(L) = -2i \ln\left( |\mathbf{1}_N + iL| / |\mathbf{1}_N - iL| \right) = -2i \ln\left( \Big( 1 + \sum_{l=1}^{N} i^l S_l(L) \Big) \Big/ \mathrm{c.c.} \right) \qquad [28]$$

where $S_l$ is the $l$th symmetric function of $L$. Using Cauchy's identity, one obtains the explicit formula

$$S_l = \sum_{\substack{I \subset \{1, \ldots, N\} \\ |I| = l}} \exp\Big( \sum_{j \in I} \xi_j \Big) \prod_{\substack{j \in I \\ k \notin I}} |\coth((\theta_j - \theta_k)/2)| \qquad [29]$$

In order to specify the $t \to \pm\infty$ asymptotics of $\phi_{N+}$, we introduce the 1-soliton solutions

$$\phi_j^{\pm}(t, x) = 4 \arctan(\exp(\xi_j \mp \Delta_j/2)) \qquad [30]$$

where

$$\Delta_j = \Big( \sum_{k < j} - \sum_{k > j} \Big)\, \delta(\theta_j - \theta_k) \qquad [31]$$

$$\delta(\theta) = \ln\left( \coth(\theta/2)^2 \right) \qquad [32]$$

Then, one has

$$\sup_{x \in \mathbb{R}} \Big| \phi_{N+}(t, x) - \sum_{j=1}^{N} \phi_j^{\pm}(t, x) \Big| = O(\exp(-|t| r)), \quad t \to \pm\infty \qquad [33]$$

where the decay rate is given by

$$r = \min_{j \neq k} \left( \cosh(\theta_j)\, |\tanh\theta_j - \tanh\theta_k| \right) \qquad [34]$$

Thus, the soliton profile with velocity $\tanh\theta_j$ incurs a shift $\Delta_j / \cosh\theta_j$ as a result of the collision. The factor $1/\cosh\theta_j$ may be viewed as a Lorentz contraction factor.

The Quantum Version: A Soliton Quantum Field Theory

From a perturbation-theoretic viewpoint, the quantum sine-Gordon Hamiltonian is given by

$$H = \int_{-\infty}^{\infty} : \left( \frac{1}{2} (\partial_t \phi)^2 + \frac{1}{2} (\partial_x \phi)^2 + \frac{m^2}{\beta^2} (1 - \cos\beta\phi) \right) : dx, \quad m, \beta > 0 \qquad [35]$$

Here, $\phi(0, x)$ is a neutral Klein–Gordon field with mass $m$ and the double dots denote a suitable ordering prescription. The associated equation of motion

$$\phi_{xx} - \phi_{tt} = \frac{m^2}{\beta} \sin\beta\phi \qquad [36]$$

is equivalent to [1] on the classical level, but not on the quantum level. (If $\phi(t, x)$ is a classical solution to [36], then $\beta\phi(t/m, x/m)$ solves [1].) This difference is due to the extremely singular character of interacting relativistic quantum field theory, a context in which "solving" the field theory has slowly acquired a meaning that is vastly different from the classical notion. Indeed, one can at best hope to verify [36] in the sense of expectation values in suitable quantum states, and this is precisely what has been achieved within the form-factor program sketched later on.

From the perspective of functional analysis, the existence of a well-defined Wightman field theory with all of the features mentioned below is wide open. More precisely, beginning with pioneering work by Fröhlich some 30 years ago, various authors have contributed to a mathematically rigorous construction of a sine-Gordon quantum field theory version, but to date it seems not feasible to verify that the resulting Wightman field theory has any of the explicit features we are going to sketch. (For example, not even the free character of the field theory for $\beta^2 = 4\pi$ has been established; cf. below.)

That said, we proceed to sketch some highlights of the impressive, but partly heuristic lore that has been assembled in a great many theoretical physics papers. A key result we begin with is the equivalence to a field theory that looks very different at face value. This is the massive Thirring model, formally given by the Hamiltonian

$$H_T = \int_{-\infty}^{\infty} : \left( \Psi^* (i\gamma^5 \partial_x + \gamma^0 M) \Psi + \frac{g}{2} (J_0^2 - J_1^2) \right) : dx, \quad M \in (0, \infty), \quad g \in \mathbb{R} \qquad [37]$$

Here, $\Psi(0, x)$ is the charged Dirac field with mass $M$ and the double dots stand for normal ordering. For the $\gamma$-algebra, one may choose

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^1 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^5 = \gamma^0 \gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [38]$$

and $J_\mu$ is the Dirac current,

$$J_0 = \Psi^* \Psi, \quad J_1 = \Psi^* \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Psi \qquad [39]$$
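The matrices in [38] can be checked mechanically. The sketch below, in plain Python, verifies the two-dimensional Clifford relations and the identity $\gamma^5 = \gamma^0\gamma^1$, which is also the matrix appearing in $J_1$ [39]. The off-diagonal signs of $\gamma^1$ are an assumed convention (the opposite choice flips the sign of $\gamma^5$), and the helper names are illustrative only.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))


def anticommutator(a, b):
    """{a, b} = ab + ba."""
    return add(matmul(a, b), matmul(b, a))


IDENTITY = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

GAMMA0 = ((0, 1), (1, 0))
GAMMA1 = ((0, -1), (1, 0))        # assumed sign convention for gamma^1
GAMMA5 = matmul(GAMMA0, GAMMA1)   # gamma^5 = gamma^0 gamma^1

if __name__ == "__main__":
    print("{g0, g0} =", anticommutator(GAMMA0, GAMMA0))   # 2 * identity
    print("{g1, g1} =", anticommutator(GAMMA1, GAMMA1))   # -2 * identity
    print("{g0, g1} =", anticommutator(GAMMA0, GAMMA1))   # zero matrix
    print("g5       =", GAMMA5)
```

With this convention $J^\mu = \Psi^* \gamma^0 \gamma^\mu \Psi$ reproduces both components in [39].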

The equivalence argument (due to Coleman) consists in showing that the quantities

$$-\frac{\beta}{2\pi}\, \epsilon_{\mu\nu} \partial^{\nu} \phi, \quad \frac{m^2}{\beta^2} : \cos\beta\phi : \qquad [40]$$

in the sine-Gordon theory have the same vacuum expectation values (in perturbation theory) as the massive Thirring quantities

$$: J_\mu :, \quad -M : \Psi^* \gamma^0 \Psi : \qquad [41]$$

resp., provided the parameters are related by

$$\frac{4\pi}{\beta^2} = 1 + \frac{g}{\pi} \qquad [42]$$

This yields an equivalence between the charge-0 sector of the massive Thirring model and the sector of the sine-Gordon theory obtained by the action of the fields [40] on the vacuum vector. But the charged sectors of the Thirring model can also be viewed as new sectors in the sine-Gordon theory, obtained by a solitonic field construction (first performed by Mandelstam).

In this picture, the fermions and antifermions in the massive Thirring model correspond to new excitations in the sine-Gordon theory, the quantum solitons and antisolitons. The latter are viewed as coherent states of the sine-Gordon "mesons" in the vacuum sector, the rest masses being related by

$$M = \frac{8m}{\beta^2} \left( 1 - \frac{\beta^2}{8\pi} \right) \qquad [43]$$

in the semiclassical limit $\beta^2 \to 0$.

Even at the formal level involved in the correspondence, the theories are not believed to exist for $\beta^2 > 8\pi$ and $g < -\pi/2$, since there is positivity breakdown for this range of couplings. The free Dirac case $g = 0$ corresponds to $\beta^2 = 4\pi$. In particular, there is no interaction between the sine-Gordon solitons and antisolitons for this $\beta$-value. In the range $\beta^2 \in (4\pi, 8\pi)$ there is interaction, but bound soliton–antisoliton pairs (quantum breathers, alias sine-Gordon mesons) do not occur.

By contrast, for $\beta^2 < 4\pi$ there exist breathers with rest masses

$$m_{n+1} = 2M \sin((n+1)\alpha), \quad \alpha \equiv m/2M, \quad n + 1 = 1, 2, \ldots, L < \pi/2\alpha \qquad [44]$$

Thus, the "particle spectrum" consists of solitons and antisolitons with mass $M$ and mesons $C_1, \ldots, C_L$ with masses $m_1, \ldots, m_L$ given by [44]. The latter formula was first established by semiclassical quantization of the classical breathers (Dashen–Hasslacher–Neveu), and ever since is usually called the DHN formula. Notice that for $\beta$ near zero $m_1$ and $m$ are nearly equal, and that for $\beta^2 \geq 4\pi$ there are no longer any sine-Gordon mesons present in the theory.

A priori, the existence of infinitely many classical conserved Hamiltonians does not even formally imply the same feature for the quantum field theory, as anomalies may occur. For the sine-Gordon and massive Thirring cases, anomalies have been shown to be absent, however. This entails not only that the number of solitons, antisolitons, and breathers in a scattering process is conserved, but also that the set of incoming rapidities equals the set of outgoing rapidities.

The latter stability features and the DHN formula [44] are corroborated by the S-matrix, which is known in complete detail. The two-body amplitudes involving solitons and antisolitons can be written in terms of the function

$$u(z) = \exp\left( i \int_0^{\infty} \frac{dx}{x}\, \frac{\sinh((\alpha - \pi/2)x)}{\sinh(\alpha x) \cosh(\pi x/2)}\, \sin 2xz \right) \qquad [45]$$

They are given by

$$(u_{++}, t_{+-}, r_{+-}, u_{--})(\theta) = u(\theta/2) \left( 1,\ \frac{\sinh(\pi\theta/2\alpha)}{\sinh(\pi(i\pi - \theta)/2\alpha)},\ \frac{i \sin(\pi^2/2\alpha)}{\sinh(\pi(i\pi - \theta)/2\alpha)},\ 1 \right) \qquad [46]$$

where $\theta$ denotes the rapidity difference. (Due to fermion statistics, one gets only one amplitude for a soliton or antisoliton pair. But a soliton and an antisoliton have opposite charge, so they can be distinguished. In that case, therefore, the notion of reflection and transmission coefficients makes sense.)

The S-matrix involving an arbitrary number of solitons, antisolitons, and their bound states is also explicitly known. The amplitudes involving no breathers are readily described in terms of the above two-body amplitudes. Indeed, the S-matrix factorizes as a sum of products of the amplitudes [46], yielding a picture of particles scattering independently in pairs, just as at the classical level. The factorization can be performed irrespective of the temporal ordering assumed for the pair scattering processes, since the four functions occurring inside the parentheses of [46] satisfy the Yang–Baxter equations.

Roughly speaking, the S-matrix for processes involving breathers can be calculated by analytic continuation from the soliton–antisoliton S-matrix. The details are however quite substantial. We only add that scattering amplitudes involving solely breathers can be expressed using only hyperbolic functions.
Since the 1980s, a lot of information has also been gathered concerning matrix elements of suitable sine-Gordon field quantities between special quantum states (form factors). Unfortunately, the correlation functions involve infinite sums of form factors that are quite difficult to control analytically. Hence, it is not known whether the correlation functions associated with the form factors give rise to a Wightman field theory with the usual axiomatic properties.

The Relation to Relativistic Calogero–Moser Systems

The behavior of the special classical solutions discussed earlier is very similar to that of classical point particles. Furthermore, the picture of classical solitons, antisolitons, and their bound states scattering independently in pairs is essentially preserved on the quantum level, just as one would expect for the quantization of an integrable particle system.

Next, we note that from the quantum viewpoint there is no physical distinction between wave functions and point particles, whereas a classical wave is a physical entity that is clearly very different from a point particle. Even so, it is a natural question whether there exist classical Hamiltonian systems of N point particles on the line whose physical characteristics (charges, bound states, scattering, etc.) are the same as those of the particle-like sine-Gordon solutions. If so, a second question is equally obvious: does the quantum version of the N-particle systems still have the same features as that of the quantum sine-Gordon excitations?

As we now sketch, the first question has been answered in the affirmative, whereas the second one has not been completely answered yet. However, all of the information on the pertinent quantum N-particle systems collected thus far points to an affirmative answer. The systems at issue are relativistic versions of the well-known nonrelativistic Calogero–Moser N-particle systems.

To begin with the classical two-particle system, its Hamiltonian is given by

$$H = (\cosh p_1 + \cosh p_2) \coth((x_1 - x_2)/2) \qquad [47]$$

on the phase space

$$\Omega = \{(x, p) \in \mathbb{R}^4 \mid x_2 < x_1\}, \quad \omega = dx \wedge dp \qquad [48]$$

Taking $x_2 \to x_2 + i\pi$ yields the particle–antiparticle Hamiltonian

$$\tilde{H} = (\cosh p_1 + \cosh p_2)\, |\tanh((x_1 - x_2)/2)| \qquad [49]$$

on the phase space

$$\tilde{\Omega} = \{(x, p) \in \mathbb{R}^4\}, \quad \tilde{\omega} = dx \wedge dp \qquad [50]$$

The two-antiparticle Hamiltonian is again given by [47] and [48]. The interaction potential in [47] is repulsive, whereas it is attractive in [49]. Hence, any initial point in $\Omega$ gives rise to a scattering state, whereas points in $\tilde{\Omega}$ yield scattering states if and only if the reduced Hamiltonian

$$\tilde{H}_r = \cosh p\, |\tanh(x/2)|, \quad p = (p_1 - p_2)/2, \quad x = x_1 - x_2 \qquad [51]$$

satisfies $\tilde{H}_r > 1$. More specifically, in both cases the distance $|x_1(t) - x_2(t)|$ increases linearly as $t \to \pm\infty$, the scattering (position shift) being encoded by the same function [32] as for the sine-Gordon solitons. The phase-space points on the separatrix $\{\tilde{H}_r = 1\}$ have the same temporal asymptotics as the multipole solution [24], whereas the bound-state oscillations for $\tilde{H}_r < 1$ match those of the breathers [20].

More generally, the Hamiltonian for $N_+$ particles and $N_-$ antiparticles is given by the function

$$\sum_{j=1}^{N_+} \cosh(p_j^+) \prod_{\substack{k=1 \\ k \neq j}}^{N_+} |\coth((x_j^+ - x_k^+)/2)| \prod_{l=1}^{N_-} |\tanh((x_j^+ - x_l^-)/2)| + \sum_{l=1}^{N_-} \cosh(p_l^-) \prod_{\substack{m=1 \\ m \neq l}}^{N_-} |\coth((x_l^- - x_m^-)/2)| \prod_{j=1}^{N_+} |\tanh((x_l^- - x_j^+)/2)| \qquad [52]$$

on the phase space

$$\Omega_{N_+, N_-} = \left\{ (x^+, p^+) \in \mathbb{R}^{2N_+},\ (x^-, p^-) \in \mathbb{R}^{2N_-} \mid x^+_{N_+} < \cdots < x^+_1,\ x^-_{N_-} < \cdots < x^-_1 \right\} \qquad [53]$$

$$\omega_{N_+, N_-} = dx^+ \wedge dp^+ + dx^- \wedge dp^- \qquad [54]$$

This defining Hamiltonian can be supplemented by $(N_+ + N_- - 1)$ independent Hamiltonians that pairwise commute. The action-angle map of this integrable system can be used to relate the scattering and bound-state behavior to that of the sine-Gordon solutions from an earlier section, yielding an exact correspondence. Indeed, the variables we used to describe the particle-like sine-Gordon solutions amount to the action-angle variables associated to [52]. Moreover, the matrix $L$ [26] with $t = x = 0$
equals the Lax matrix for the N-particle system, which is the manifestation of a remarkable self-duality property of the equal-charge case. There is an equally close relation between the general particle-like solutions and the general systems encoded in [52].

As a matter of fact, the connection can be further strengthened by introducing spacetime trajectories for the solitons, antisolitons, and breathers, which are defined in terms of the evolution of an initial point in $\Omega_{N_+, N_-}$ under the time translation generator [52] and the space translation generator, obtained from [52] by the replacement $\cosh \to \sinh$. These point particle and antiparticle trajectories make it possible to follow the motion of the solitons, antisolitons, and breathers during the temporal interval in which the nonlinear interaction takes place, whereas for large times the trajectories are located at the (then) clearly discernible positions of the individual solitons, antisolitons, and breathers.

Before sketching the soliton-particle correspondence at the quantum level, we add a remark on the finite-gap solutions of the classical sine-Gordon equation, already mentioned in the paragraph containing [11]. These solutions may be viewed as generalizations of the particle-like solutions discussed earlier, and they can also be obtained via relativistic N-particle Calogero–Moser systems. The pertinent systems are generalizations of the hyperbolic systems just described to the elliptic level.

Turning now to the quantum level, we begin by mentioning that the Poisson-commuting Hamiltonians admit a quantization in terms of commuting analytic difference operators. This involves a special ordering choice of the p-dependent and x-dependent factors in the classical Hamiltonians, which is required to preserve commutativity. The resulting quantum two-body problem can be explicitly solved in terms of a generalization of the Gauss hypergeometric function. For the case of equal charges, the scattering is encoded in the sine-Gordon amplitudes $u_{\pm\pm}(\theta)$ (cf. [45] and [46]). For the unequal-charge case, one should distinguish an even and odd channel. The scattering on these channels is encoded in the sine-Gordon amplitudes $t_{+-}(\theta) \pm r_{+-}(\theta)$. Moreover, the bound-state spectrum agrees with the DHN formula [44] and the bound-state wave functions are given by hyperbolic functions.

As a consequence of these results, the physics encoded in the two-body subspace of the sine-Gordon quantum field theory is indistinguishable from that of the corresponding two-body relativistic Calogero–Moser systems. To extend this equivalence to the arbitrary-N case, one needs first of all sufficiently explicit solutions to the N-body Schrödinger equation. To date, this has only been achieved for the case of N equal charges and the special couplings for which the reflection amplitude $r_{+-}$ vanishes. The asymptotics of the pertinent solutions is factorized in terms of $u_{\pm\pm}(\theta)$, in agreement with the sine-Gordon picture.

See also: Bäcklund Transformations; Boundary-Value Problems for Integrable Equations; Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Infinite-dimensional Hamiltonian Systems; Integrability and Quantum Field Theory; Integrable Systems and Discrete Geometry; Integrable Systems and Inverse Scattering Method; Integrable Systems: Overview; Ljusternik–Schnirelman Theory; Solitons and Other Extended Field Configurations; Solitons and Kac–Moody Lie Algebras; Symmetries and Conservation Laws; Two-Dimensional Models; Yang–Baxter Equations.

Further Reading

Ablowitz MJ, Kaup DJ, Newell AC, and Segur H (1974) The inverse scattering transform – Fourier analysis for nonlinear problems. Studies in Applied Mathematics 53: 249–315.
Coleman S (1977) Classical lumps and their quantum descendants. In: Zichichi A (ed.) New Phenomena in Subnuclear Physics, Proceedings Erice 1975, pp. 297–421. New York: Plenum.
Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in the Theory of Solitons. Berlin: Springer.
Flaschka HF and Newell AC (1975) Integrable systems of nonlinear evolution equations. In: Moser J (ed.) Dynamical Systems, Theory and Applications, Lecture Notes in Physics, vol. 38, pp. 355–440. Berlin: Springer.
Karowski M (1979) Exact S-matrices and form factors in 1+1 dimensional field theoretic models with soliton behaviour. Physics Reports 49: 229–237.
Ruijsenaars SNM (2001) Sine-Gordon solitons vs. Calogero–Moser particles. In: Pakuliak S and von Gehlen G (eds.) Proceedings of the Kiev NATO Advanced Study Institute Integrable Structures of Exactly Solvable Two-Dimensional Models of Quantum Field Theory, NATO Science Series, vol. 35, pp. 273–292. Dordrecht: Kluwer.
Scott AC, Chu FYF, and McLaughlin DW (1973) The soliton: a new concept in applied science. Proceedings of the Institute of Electrical and Electronics Engineers 61: 1443–1483.
Smirnov FA (1992) Form Factors in Completely Integrable Models of Quantum Field Theory. Advanced Series in Mathematical Physics, vol. 14. Singapore: World Scientific.
Thacker HB (1981) Exact integrability in quantum field theory and statistical systems. Reviews of Modern Physics 53: 253–285.
Zamolodchikov AB and Zamolodchikov AlB (1979) Factorized S-matrices in two dimensions as the exact solutions of certain relativistic quantum field theory models. Annals of Physics (NY) 120: 253–291.
Singularities of the Ricci Flow

M Anderson, State University of New York at Stony Brook, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Fix a closed n-dimensional manifold $M$, and let $\mathbb{M}$ be the space of Riemannian metrics on $M$. As in the reasoning leading to the Einstein equations in general relativity, there is basically a unique simple and natural vector field on the space $\mathbb{M}$. Namely, the tangent space $T_g \mathbb{M}$ consists of symmetric bilinear forms; besides multiples of the metric itself, the Ricci curvature $\mathrm{Ric}_g$ of $g$ is the only symmetric form that depends on at most the second derivatives of the metric, and is invariant under coordinate changes, that is, a (0, 2)-tensor formed from the metric. Thus, consider

$$X_g = \mu\, \mathrm{Ric}_g + \lambda g$$

where $\mu, \lambda$ are scalars. Setting $\mu = -2$, the corresponding equation for the flow of $X$ is

$$\frac{d}{dt} g(t) = -2\, \mathrm{Ric}_{g(t)} + \lambda g(t) \qquad [1]$$

The Ricci flow, introduced by Hamilton (1982), is obtained by setting $\lambda = 0$:

$$\frac{d}{dt} g(t) = -2\, \mathrm{Ric}_{g(t)} \qquad [2]$$

Rescaling the metric and time variable $t$ transforms [2] into [1], with $\lambda = \lambda(t)$. For example, rescaling the Ricci flow [2] so that the volume of $(M, g(t))$ is preserved leads to the flow equation [1] with $\lambda = \frac{2}{n} \bar{R}$, that is, $2/n$ times the mean value $\bar{R}$ of the scalar curvature $R$.

The Ricci flow [2] bears some relation with the metric part of the $\beta$-function or renormalization group (RG) flow equation

$$\frac{d}{dt} g(t) = -\beta(g(t))$$

for the two-dimensional sigma model of maps $\Sigma^2 \to M$. The $\beta$-function is a vector field on $\mathbb{M}$, invariant under diffeomorphisms, which has an expansion of the form

$$\beta(g) = \mathrm{Ric}_g + \varepsilon\, \mathrm{Riem}^2 + \cdots$$

where $\mathrm{Riem}^2$ is quadratic in the Riemann curvature tensor. The Ricci flow corresponds to the one-loop term or semiclassical limit in the RG flow (cf. D'Hoker (1999) and Friedan (1985)).

Recently, G Perelman (2002, 2003a, b) has developed new insights into the geometry of the Ricci flow which have led to a solution of long-standing mathematical conjectures on the structure of 3-manifolds, namely the Thurston geometrization conjecture (Thurston 1982), and hence the Poincaré conjecture.

Basic Properties of the Ricci Flow

In charts where the coordinate functions are locally defined harmonic functions in the metric $g(t)$, [2] takes the form

$$\frac{d}{dt} g_{ij} = \Delta g_{ij} + Q_{ij}(g, \partial g)$$

where $\Delta$ is the Laplace operator on functions with respect to the metric $g = g(t)$ and $Q$ is a lower-order term quadratic in $g$ and its first-order partial derivatives. This is a nonlinear heat-type equation for $g_{ij}$ and leads to the existence and uniqueness of solutions to the Ricci flow on some time interval starting at any smooth initial metric. This is the reason for the minus sign in [2]; a plus sign gives a backwards heat-type equation, which has no solutions in general.

The flow [2] gives a natural method to try to construct canonical metrics on the manifold $M$. Stationary points of the flow [2] are Ricci-flat metrics, while stationary points of the flow [1] are (Riemannian) Einstein metrics, where $\mathrm{Ric}_g = (R/n) g$, with $R$ the scalar curvature of $g$. One of Hamilton's motivations for studying the Ricci flow was a body of results on an analogous question for nonlinear sigma models. Consider maps $f$ between Riemannian manifolds $M$, $N$ with Lagrangian given by the Dirichlet energy. Eells–Sampson studied the heat equation for this action and proved that when the target $N$ has nonpositive curvature, the flow exists for all time and converges to a stationary point of the action, that is, a harmonic map $f_\infty : M \to N$. The idea is to see if an analogous program can be developed on the space of metrics $\mathbb{M}$.

There are a number of well-known obstructions to the existence of Einstein metrics on manifolds, in particular, in dimensions 3 and 4. Thus, the Ricci flow will not exist for all time on a general manifold. Hence, it must develop singularities. A fundamental issue is to try to relate the structure of the singularities of the flow with the topology of the underlying manifold $M$.

A few simple qualitative features of the Ricci flow [2] are as follows: if $\mathrm{Ric}(x, t) > 0$, then the flow contracts the metric $g(t)$ near $x$, to the future, while
if $\mathrm{Ric}(x, t) < 0$, then the flow expands $g(t)$ near $x$. At a general point, there will be directions of positive and negative Ricci curvature, along which the metric locally contracts or expands. The flow preserves product structures of metrics, and preserves the isometry group of the initial metric.

The form of [2] shows that the Ricci flow continues as long as Ricci curvature remains bounded. On a bounded time interval where $\mathrm{Ric}_{g(t)}$ is bounded, the metrics $g(t)$ are quasi-isometric, that is, they have bounded distortion compared with the initial metric $g(0)$. Thus, one needs to consider evolution equations for the curvature, induced by the flow for the metric. The simplest of these is the evolution equation for the scalar curvature $R$:

$$\frac{d}{dt} R = \Delta R + 2 |\mathrm{Ric}|^2 \qquad [3]$$

Evaluating [3] at a point realizing the minimum $R_{\min}$ of $R$ on $M$ shows that $R_{\min}$ is monotone nondecreasing along the flow. In particular, the Ricci flow preserves positive scalar curvature. Moreover, if $R_{\min}(0) > 0$, then

$$t \leq \frac{n}{2 R_{\min}(0)} \qquad [4]$$

Thus, the Ricci flow exists only up to a maximal time $T \leq n/2R_{\min}(0)$ when $R_{\min}(0) > 0$. In contrast, in regions where the Ricci curvature stays negative definite, the flow exists for infinite time.

Space-form theorem. If $g(0)$ is a metric of positive Ricci curvature on a 3-manifold $M$, then the volume normalized Ricci flow exists for all time, and converges to the round metric on $S^3/\Gamma$, where $\Gamma$ is a finite subgroup of $SO(4)$, acting freely on $S^3$.

Thus, the Ricci flow "geometrizes" 3-manifolds of positive Ricci curvature. There are two further important structural results on the Ricci flow.

Curvature pinching estimate (Hamilton 1982, Ivey 1993). For $g(t)$ a solution to the Ricci flow on a closed 3-manifold $M$, there is a nonincreasing function $\varphi : (-\infty, \infty) \to \mathbb{R}$, tending to 0 at $\infty$, and a constant $C$, depending only on $g(0)$, such that

$$\mathrm{Riem}(x, t) \geq -C - \varphi(R(x, t)) \cdot |R(x, t)| \qquad [6]$$

This estimate does not imply a lower bound on $\mathrm{Riem}(x, t)$ uniform in time. However, when combined with the fact that the scalar curvature $R(x, t)$ is uniformly bounded below (cf. [3]), it implies that $|\mathrm{Riem}|(x, t) \gg 1$ only where $R(x, t) \gg 1$. To control the size of $|\mathrm{Riem}|$, it thus suffices to obtain just an upper bound on $R$. This is remarkable, since the scalar curvature is a much weaker invariant of the metric than the full curvature. Moreover, at points where the curvature is sufficiently large, [6] shows that $\mathrm{Riem}(x, t)/R(x, t) \geq -\delta$, for $\delta$ small. Thus, if one scales the metric to make $R(x, t) = 1$, then $\mathrm{Riem}(x, t) \geq -\delta$. In such a scale, the metric then has almost non-negative curvature near $(x, t)$.

Harnack estimate (Hamilton 1982). Let $(N, g(t))$
The evolution of the Ricci curvature has the same be a solution to the Ricci flow with bounded and
general form as [3]: non-negative curvature Riem  0, and suppose g(t)
is a complete Riemannian metric on N. Then for
d e ij
Rij ¼ Rij þ Q ½5 0 < t1  t2 ,
dt !
The expression for Q e is much more complicated t1 dt21 ðx1 ; x2 Þ
Rðx2 ; t2 Þ  exp  Rðx1 ; t1 Þ ½7
than the Ricci curvature term in [3] but involves t2 2ðt2  t1 Þ
only quadratic expressions in the curvature.
However, Q e involves the full Riemann curvature where dt1 is the distance function on (M, gt1 ). This
tensor Riem of g, and not just the Ricci curvature (as allows one to control the geometry of the solution at
[3] involves Ricci and not just scalar curvature). An different spacetime points, given control at an initial
important feature of dimension 3 is that the full point.
Riemann curvature Riem is determined algebraically
by the Ricci curvature. So the Ricci flow has a much
better chance of ‘‘working’’ in dimension 3. For
Singularity Formation
example, an analysis of Q e shows that the Ricci flow The deeper analysis of the Ricci flow is concerned
preserves positive Ricci curvature in dimension 3; if with the singularities that arise in finite time.
Ricg(0) > 0, then Ricg(t) > 0, for t > 0. This is not the Equation [3] shows that the Ricci flow will not
case in higher dimensions. On the other hand, in any exist for arbitrarily long time in general. In the case
dimension > 2, the Ricci flow does not preserve of initial metrics with positive Ricci curvature, this is
negative Ricci curvature, or even a general lower resolved by rescaling the Ricci flow to constant
bound Ric  , for  > 0. For the remainder of the volume. However, the general situation is necessarily
article, we usually assume then that dim M = 3. much more complicated. For example, any manifold
The first basic result on the Ricci flow is the which is a connected sum of S3 = or S2  S1 factors
following, due to Hamilton (1982). has metrics of positive scalar curvature. For obvious
topological reasons, the volume normalized Ricci flow then cannot converge nicely to a round metric; even the renormalized flow must develop singularities.

The usual method to understand the structure of singularities, particularly in geometric PDEs, is to rescale or renormalize the solution on a sequence converging to the singularity to make the solution bounded, and try to pass to a limit of the renormalization. Such a limit solution models the singularity formation, and one hopes (or expects) that the singularity models have special features making them much simpler than an arbitrary solution of the flow.

A singularity forms for the Ricci flow only where the curvature becomes unbounded. Suppose then that $\lambda_i^2 = |\mathrm{Riem}|(x_i, t_i) \to \infty$, on a sequence of points $x_i \in M$, and times $t_i \to T < \infty$. Consider the rescaled or blow-up metrics and times

$$\bar{g}_i(\bar{t}_i) = \lambda_i^2\,\phi_i^{*}\,g(t), \qquad \bar{t}_i = \lambda_i^2 (t - t_i) \qquad [8]$$

where $\phi_i$ are diffeomorphisms giving local dilations of the manifold near $x_i$ by the factor $\lambda_i$.

The flow $\bar{g}_i$ is also a solution of the Ricci flow, and has bounded curvature at $(x_i, 0)$. For suitable choices of $x_i$ and $t_i$, the curvature will be bounded near $x_i$, and for nearby times to the past, $\bar{t}_i \le 0$; for example, one might choose points $(x_i, t_i)$ where the curvature is maximal on (M, g(t)), $t \le t_i$.

The rescaling [8] expands all distances by the factor $\lambda_i$, and time by the factor $\lambda_i^2$. Thus, in effect one is studying very small regions, of spatial size on the order of $r_i = \lambda_i^{-1}$ about $(x_i, t_i)$, and "using a microscope" to examine the small-scale features in this region on a scale of size about 1.

A limit solution of the Ricci flow, defined at least locally in space and time, will exist provided that the local volumes of the rescalings are bounded below (Gromov compactness). In terms of the original unscaled flow, this requires that the metric g(t) should not be locally collapsed on the scale of its curvature, that is,

$$\mathrm{vol}\,B_{x_i}(r_i, t_i) \ge \nu\, r_i^{\,n} \qquad [9]$$

for some fixed but arbitrary $\nu > 0$. A maximal connected limit $(N, \bar{g}(t), x)$ containing the base point $x = \lim x_i$, is then called a "singularity model." Observe that the topology of the limit N may well be distinct from the original manifold M, most of which may have been blown off to infinity in the rescaling.

To see the potential usefulness of this process, suppose one does have local noncollapse on the scale of the curvature, and that base points of maximal curvature in space and time $t \le t_i$ have been chosen. At least in a subsequence, one then obtains a limit solution to the Ricci flow $(N, \bar{g}(t), x)$, based at x, defined at least for times $(-\infty, 0]$, with $\bar{g}(t)$ a complete Riemannian metric on N. Such solutions are called ancient solutions of the Ricci flow. The estimate [6] shows that the limit has non-negative curvature in dimension 3, and so [7] holds on N. Thus, the limit is indeed quite special. The topology of complete manifolds N of non-negative curvature is completely understood in dimension 3. If N is noncompact, then N is diffeomorphic to $\mathbb{R}^3$, $S^2 \times \mathbb{R}$, or a quotient of these spaces. If N is compact, then a slightly stronger form of the space-form theorem implies N is diffeomorphic to $S^3/\Gamma$, $S^2 \times S^1$, or $S^2 \times_{\mathbb{Z}_2} S^1$.

The study of the formation of singularities in the Ricci flow was initiated by Hamilton (1995). Recently, Perelman has obtained an essentially complete understanding of the singularity behavior of the Ricci flow, at least in dimension 3.

Perelman's Work

Noncollapse

Consider the Einstein–Hilbert action

$$\mathcal{R}(g) = \int_M R(g)\, dV_g \qquad [10]$$

as a functional on $\mathbb{M}$. Critical points of $\mathcal{R}$ are Ricci-flat metrics. It is natural and tempting to try to relate the Ricci flow with the gradient flow of $\mathcal{R}$ (with respect to a natural $L^2$ metric on the space $\mathbb{M}$). However, it has long been recognized that this cannot be done directly. In fact, the gradient flow of $\mathcal{R}$ does not even exist, since it implies a backwards heat-type equation for the scalar curvature R (similar to [3] but with a minus sign before $\Delta$).

Consider however the following functional extending $\mathcal{R}$:

$$\mathcal{F}(g, f) = \int_M (R + |\nabla f|^2)\, e^{-f}\, dV_g \qquad [11]$$

as a functional on the larger space $\mathbb{M} \times C^\infty(M, \mathbb{R})$, or equivalently a family of functionals on $\mathbb{M}$, parametrized by $C^\infty(M, \mathbb{R})$. The functional [11] also arises in string theory as the low-energy effective action; the scalar field f is called the dilaton. Fix any smooth measure dm on M and define the Perelman coupling by requiring that (g, f) satisfy

$$e^{-f}\, dV_g = dm \qquad [12]$$
The resulting functional

$$\mathcal{F}^m(g, f) = \int_M (R + |\nabla f|^2)\, dm \qquad [13]$$

becomes a functional on $\mathbb{M}$. (This coupling does not appear to have been considered in string theory.) The $L^2$ gradient flow of $\mathcal{F}^m$ is given simply by

$$\frac{d\tilde{g}}{dt} = -2\,(\mathrm{Ric}_{\tilde{g}} + \widetilde{D}^2 f) \qquad [14]$$

where $\widetilde{D}^2 f$ is the Hessian of f with respect to $\tilde{g}$. The evolution equation [14] for $\tilde{g}$ is just the Ricci flow [2] modified by an infinitesimal diffeomorphism: $\widetilde{D}^2 f = (d/dt)(\phi_t^{*}\tilde{g})$, where $(d/dt)\phi_t = \nabla f$. Thus, the gradient flow of $\mathcal{F}^m$ is the Ricci flow, up to diffeomorphisms. The evolution equation for the scalar field f,

$$f_t = -\widetilde{\Delta} f - \widetilde{R} \qquad [15]$$

is a backward heat equation (balancing the forward evolution of the volume form of $\tilde{g}(t)$). Thus, this flow will not exist for general f, going forward in t. However, one of the basic points of view is to let the (pure) Ricci flow [2] flow for a time $t_0 > 0$. At $t_0$, one may then take an arbitrary $f = f(t_0)$ and flow this f backward in time ($\tau = t_0 - t$) to obtain an initial value f(0) for f. The choice of $f(t_0)$ determines, together with the choice of volume form of g(0) (or $g(t_0)$), the measure dm and so the choice of $\mathcal{F}^m$. The process of passing from $\mathcal{F}$ to $\mathcal{F}^m$ corresponds to a reduction of the symmetry group of all diffeomorphisms $\mathcal{D}$ of $\mathcal{F}$ to the group $\mathcal{D}_0$ of volume-preserving diffeomorphisms; the quotient space $\mathcal{D}/\mathcal{D}_0$ has been decoupled into a space $C^\infty(M, \mathbb{R})$ of parameters.

The functionals $\mathcal{F}^m$ are not scale invariant. To achieve scale invariance, Perelman includes an explicit insertion of the scale parameter, related to time, by setting

$$\mathcal{W}(g, f, \tau) = \int \left[\tau\,(|\nabla f|^2 + R) + f - n\right] (4\pi\tau)^{-n/2}\, e^{-f}\, dV \qquad [16]$$

with coupling so that $dm = (4\pi\tau)^{-n/2} e^{-f}\, dV$ is fixed. The entropy functional $\mathcal{W}$ is invariant under simultaneous rescaling of $\tau$ and g, and $\tau_t = -1$. Again, the gradient flow of $\mathcal{W}$ is the Ricci flow modulo diffeomorphisms and rescalings and the stationary points of the gradient flow are the gradient Ricci solitons,

$$\mathrm{Ric}_g + D^2 f - \frac{1}{2\tau}\, g = 0$$

for which the metrics evolve by diffeomorphisms and rescalings. Gradient solitons arise naturally as singularity models, due to the rescalings and diffeomorphisms in the blow-up procedure [8]. An important example is the cigar soliton on $\mathbb{R}^2 \times \mathbb{R}$ (or $\mathbb{R}^2 \times S^1$),

$$g = (1 + r^2)^{-1}\, g_{\mathrm{Eucl}} + ds^2 \qquad [17]$$

Perelman then uses the scalar field f to probe the geometry of g(t). For instance, the collapse or noncollapse of the metric g(t) near a point $x \in M$ can be detected from the size of $\mathcal{W}(g(t))$ by choosing $e^{-f}$ to be an approximation to a delta function centered at (x, t). The more collapsed g(t) is near x, the more negative the value of $\mathcal{W}(g(t))$. The collapse of the metric g(t) on any scale in finite time is then ruled out by combining this with the fact that the entropy functional $\mathcal{W}$ is increasing along the Ricci flow.

Much more detailed information can be obtained by studying the path integral associated to the evolution equation [15] for f, given by

$$\mathcal{L}(\gamma) = \int \sqrt{\tau}\,\left[\,|\dot{\gamma}(\tau)|^2 + R(\gamma(\tau))\,\right] d\tau$$

where R and $|\dot{\gamma}(\tau)|$ are computed with respect to the evolving metrics $g(\tau)$. In particular, the study of the geodesics and the associated variational theory of the length functional $\mathcal{L}$ are important in understanding the geometry of the Ricci flow near the singularities.

Singularity Models

A major accomplishment of Perelman is essentially a classification of all complete singularity models (N, g(t)) that arise in finite time. In the simple case where N is compact, then as noted above, N is diffeomorphic to $S^3/\Gamma$, $S^2 \times S^1$, or $S^2 \times_{\mathbb{Z}_2} S^1$.

In the much more important case where N is complete and noncompact, Perelman proves that the geometry of N near infinity is that of a union of $\varepsilon$-necks. Thus, at time 0, and at points x with $r(x) = \mathrm{dist}(x, x_0) \gg 1$, for a fixed base point $x_0$, a region of radius $\varepsilon^{-1}$ about x, in the scale where R(x) = 1, is $\varepsilon$-close to such a region in the standard round product metric on $S^2 \times \mathbb{R}$; $\varepsilon$ may be made arbitrarily small by choosing r(x) sufficiently large. For example, this shows that the cigar soliton [17] cannot arise as a singularity model. Moreover, this structure also holds on a time interval on the order of $\varepsilon^{-1}$ to the past, so that on such regions the solution is close to the (backwards) evolving Ricci flow on $S^2 \times \mathbb{R}$.
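One way to get a feeling for why the cigar soliton [17] is excluded is to look at the noncollapse condition [9] on its two-dimensional factor $(1 + r^2)^{-1} g_{\mathrm{Eucl}}$. The Python sketch below is an illustration and not part of the original argument; the function names are hypothetical. It uses the closed forms for this metric (geodesic distance from the tip $s = \operatorname{arcsinh} r$, Gauss curvature $K = 2/\cosh^2 s$, circumference $2\pi \tanh s$) and bounds the area of a ball of curvature radius $K^{-1/2}$ by the area of the annulus of points whose distance from the tip differs from s by at most that radius.

```python
import math

def log_cosh(x):
    """Numerically stable log(cosh(x)), valid for large |x|."""
    x = abs(x)
    return x + math.log1p(math.exp(-2.0 * x)) - math.log(2.0)

def gauss_curvature(s):
    """Gauss curvature of the 2D cigar at geodesic distance s from the tip."""
    return 2.0 * math.exp(-2.0 * log_cosh(s))

def circumference(s):
    """Length of the circle of points at distance s from the tip."""
    return 2.0 * math.pi * math.tanh(s)

def volume_ratio_bound(s):
    """Upper bound for area(B(p, r)) / r^2, where dist(p, tip) = s and
    r = K(p)^(-1/2) is the curvature scale at p. The ball lies inside the
    annulus max(0, s - r) <= distance from tip <= s + r."""
    r = 1.0 / math.sqrt(gauss_curvature(s))
    inner = max(0.0, s - r)
    area = 2.0 * math.pi * (log_cosh(s + r) - log_cosh(inner))
    return area / r**2

if __name__ == "__main__":
    for s in (0.0, 2.0, 5.0, 10.0):
        print(s, gauss_curvature(s), circumference(s), volume_ratio_bound(s))
```

The circumference stays below $2\pi$ while the curvature radius grows like $\cosh s$, so the ratio tends to 0 far from the tip: no fixed $\nu > 0$ works in [9] on the scale of the curvature.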
Perelman shows that this structural result for the singularity models themselves also holds for the solution g(t) very near any singularity time T. Thus, at any base point (x, t) where the curvature is sufficiently large, the rescaling as in [8] of the spacetime by the curvature is smoothly close, on large compact domains, to corresponding large domains in a complete singularity model. The "ideal" complete singularity models do actually describe the geometry and topology near any singularity. Consequently, one has a detailed understanding of the small-scale geometry and topology in a neighborhood of every point where the curvature is large on (M, g(t)), for t near T.

The main consequence of this analysis is the existence of canonical, almost round 2-spheres $S^2$ in any region of (M, g(t)) where the curvature is sufficiently large; the radius of the $S^2$'s is on the order of the curvature radius. One then disconnects the manifold M into pieces, by cutting M along a judicious choice of such 2-spheres, and gluing in round 3-balls in a natural way. This surgery process allows one to excise the regions of (M, g(t)) where the Ricci flow is almost singular, and thus leads to a naturally defined Ricci flow with surgery, valid for all times $t \in [0, \infty)$.

The surgery process disconnects the original connected 3-manifold M into a collection of disjoint (connected) 3-manifolds $M_i$, with the Ricci flow running on each. However, topologically, there is a canonical relation between M and the components $M_i$; M is the connected sum of $\{M_i\}$. An analysis of the long-time behavior of the volume-normalized Ricci flow confirms the expectation that the flow approaches a fixed point, that is, an Einstein metric, or collapses along 3-manifolds admitting an $S^1$ fibration. This then leads to the proof of Thurston's geometrization conjecture for 3-manifolds and consequently the proof of the Poincaré conjecture. It gives a full classification of all closed 3-manifolds, much like the classification of surfaces given by the classical uniformization theorem.

See also: Einstein Manifolds; Evolution Equations: Linear and Nonlinear; Minimal Submanifolds; Renormalization: General Theory; Topological Sigma Models.

Further Reading

Anderson M (1997) Scalar curvature and geometrization conjectures for 3-manifolds. In: Grove K and Petersen P (eds.) Comparison Geometry, MSRI Publications, vol. 30, pp. 49–82. Cambridge: Cambridge University Press.
Cao HT, Chow B, Cheng SC, and Yau ST (eds.) (2003) Collected Papers on Ricci Flow. Boston: International Press.
D'Hoker E (1999) String theory. In: Deligne P et al. (eds.) Quantum Fields and Strings: A Course for Mathematicians, vol. 2, Providence, RI: American Mathematical Society.
Friedan D (1985) Nonlinear models in 2 + ε dimensions. Annals of Physics 163: 318–419.
Hamilton R (1982) Three manifolds of positive Ricci curvature. Journal of Differential Geometry 17: 255–306.
Hamilton R (1993) The Harnack estimate for the Ricci flow. Journal of Differential Geometry 37: 225–243.
Hamilton R (1995) Formation of singularities in the Ricci flow. In: Yau ST (ed.) Surveys in Differential Geometry vol. 2, pp. 7–136. Boston, MA: International Press.
Ivey T (1993) Ricci solitons on compact three-manifolds. Differential Geometry and Its Applications 3: 301–307.
Perelman G (2002) The entropy formula for the Ricci flow and its geometric applications, math.DG/0211159.
Perelman G (2003a) Ricci flow with surgery on three-manifolds, math.DG/0303109.
Perelman G (2003b) Finite extinction time for the solutions to the Ricci flow on certain three-manifolds, math.DG/0307245.
Thurston W (1982) Three dimensional manifolds, Kleinian groups and hyperbolic geometry. Bulletin of American Mathematical Society 6: 357–381.

Singularity and Bifurcation Theory


J-P Françoise and C Piquet, Université P.-M. Curie, Paris VI, Paris, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Dynamical systems first developed from the geometry of Newton's equations (see Goodstein and Goodstein (1997)) and the question of the stability of the solar system motivated further researches inspired by celestial mechanics (cf. Siegel and Moser (1971)). Then dynamical systems developed intensively from stability theory (Lyapunov's theory) to generic properties (based on functional analysis techniques), hyperbolic structures (Anosov's flows, Smale axiom A) and to perturbation theory (Pugh's closing lemma, KAM theorem). There are many links with ergodic theory dating back to Birkhoff's ergodic theorem (motivated by Boltzmann–Gibbs contributions to thermodynamics). These aspects have been developed in several articles of the encyclopedia (see Generic Properties of
Dynamical Systems; Ergodic Theory; Hyperbolic Dynamical Systems). This article develops another aspect of dynamical systems, namely bifurcation theory. In contrast, the mathematics involved relates more to local analytic geometry in the broad sense and provides local models like normal forms, uses blow-up techniques and asymptotic developments. This contains the singularity theory of functions (related to singularities of gradient flows). A recent development of the whole subject deals with bifurcation theory of fast–slow systems.

Singularity Theory of Functions

A singular point of a gradient dynamics

$$\frac{dx}{dt} = \operatorname{grad} V(x)$$

is a critical point of the function V. Assume that the function $V : U \to \mathbb{R}$ is defined and infinitely differentiable on an open set U. Let $x_0 \in U$ be a critical point of V.

Definition 1 The critical point $x_0$ is said to be of Morse type if the Hessian of V at $x_0$, $D_x^2 V(x_0)$, is of maximal rank n. The corank of a singular point $x_0$ is the corank of the matrix $D_x^2 V(x_0)$.

Denote by $\mathcal{O}$ the local ring of germs of $C^\infty$ functions at point $x_0$.

Definition 2 The Jacobian ideal of the function V at $x_0$, denoted as Jac(V), is the ideal generated in the ring $\mathcal{O}$ by the partial derivatives of V: $\partial V/\partial x_i$, $i = 1, \ldots, n$, considered as elements of the local ring $\mathcal{O}$.

The singularity (or the singular point) is isolated if

$$\dim_{\mathbb{R}} \mathcal{O}/\mathrm{Jac}(V) < \infty$$

In that case, the Milnor number is defined as the dimension

$$\mu = \dim_{\mathbb{R}} \mathcal{O}/\mathrm{Jac}(V)$$

Local models of singularities at a point are simple expressions that germs of functions singular at this point have in local coordinates.

R Thom proposed to focus more particularly on the singularities whose Milnor number is less than or equal to 4 and whose corank is less than or equal to 2. The list of local models $V_\lambda(x)$ of functions whose singularities at 0 display a Milnor number less than or equal to 4 and a corank less than or equal to 2 is the following:

$V_\lambda(x) = \tfrac{1}{3}x^3 + \lambda_1 x$, the fold,
$V_\lambda(x) = \tfrac{1}{4}x^4 + \tfrac{1}{2}\lambda_1 x^2 + \lambda_2 x$, the cusp,
$V_\lambda(x) = \tfrac{1}{5}x^5 + \tfrac{1}{3}\lambda_1 x^3 + \tfrac{1}{2}\lambda_2 x^2 + \lambda_3 x$, the swallow tail,
$V_\lambda(x) = \tfrac{1}{6}x^6 + \tfrac{1}{4}\lambda_1 x^4 + \tfrac{1}{3}\lambda_2 x^3 + \tfrac{1}{2}\lambda_3 x^2 + \lambda_4 x$, the butterfly,
$V_\lambda(x, y) = x^3 - 3xy^2 + \lambda_1 (x^2 + y^2) + \lambda_2 x + \lambda_3 y$, the elliptic umbilic,
$V_\lambda(x, y) = x^3 + y^3 + \lambda_1 xy + \lambda_2 x + \lambda_3 y$, the hyperbolic umbilic, and
$V_\lambda(x, y) = y^4 + x^2 y + \lambda_1 x^2 + \lambda_2 y^2 + \lambda_3 x + \lambda_4 y$, the parabolic umbilic.

Consider more particularly the first four cases. The "state equation" defines the critical points of $V_\lambda$:

$$\frac{\partial V_\lambda}{\partial x} = 0$$

which contains the subset of the stable equilibrium points of the associated gradient dynamics. The nature of these equilibrium states changes at points contained in the set defined by the equation

$$\frac{\partial^2 V_\lambda}{\partial x^2} = 0$$

The projection of this set on the space of parameters contains the set of values of the parameters for which the equilibrium position is susceptible to change of topological type (in other terms to undergo a bifurcation). This set is called the catastrophe set (see Figure 1).

Consider now the case of umbilics where there are two state equations:

$$\frac{\partial V_\lambda}{\partial x} = \frac{\partial V_\lambda}{\partial y} = 0$$

The catastrophe set S is determined by one further equation:

$$\mathrm{Hess}\, V_\lambda = \frac{\partial^2 V_\lambda}{\partial x^2}\,\frac{\partial^2 V_\lambda}{\partial y^2} - \left(\frac{\partial^2 V_\lambda}{\partial x\, \partial y}\right)^{2} = 0$$

In both cases of hyperbolic and elliptic umbilics, the set S is a singular surface. For the last case of the parabolic umbilic, the set S is of dimension 3 and again it is only possible to represent it by a family of its sections by a variable hyperplane (see Figure 2).

All possible deformations (in the space of functions) of a function with an isolated singularity can be induced by a single $\mu$-dimensional family of deformations named the "universal deformation." In general, the "codimension" of a bifurcation is the minimal number of parameters needed to display all possible phase diagrams of all possible unfoldings. Several deep mathematical techniques, like the Malgrange division theorem and preparation theorem, allowed J Mather to prove the theorem (local, then global) of existence of the universal unfolding.
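The Milnor number of Definition 2 can be computed by hand for simple germs, and the count is easy to mechanize in a special case. The following Python sketch is an illustration only: it assumes a germ in two variables (x, y), taken at $\lambda = 0$, whose two partial derivatives are homogeneous polynomials of the same degree e, so that the quotient by the Jacobian ideal can be counted degree by degree with linear algebra over the rationals. The function names and the coefficient-list encoding are hypothetical choices, not notation from the article.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of equal-length rows, by Gaussian elimination over Q."""
    rows = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def milnor_number_homogeneous(gx, gy, e, max_degree=20):
    """Dimension of R[x, y] / (gx, gy) for generators homogeneous of degree e.

    gx and gy are coefficient lists of length e + 1; entry j multiplies
    x**(e - j) * y**j. Monomials of degree d are indexed by the power of y.
    """
    total = 0
    for d in range(max_degree + 1):
        n_monomials = d + 1
        rows = []
        if d >= e:
            for g in (gx, gy):
                # multiply g by the monomial x**(d - e - shift) * y**shift
                for shift in range(d - e + 1):
                    row = [0] * n_monomials
                    for j, c in enumerate(g):
                        row[j + shift] = c
                    rows.append(row)
        dim = n_monomials - rank(rows)
        if d >= e and dim == 0:
            return total  # every monomial of degree >= d lies in the ideal
        total += dim
    raise ValueError("singularity does not look isolated up to max_degree")

if __name__ == "__main__":
    # elliptic umbilic germ x^3 - 3xy^2: V_x = 3x^2 - 3y^2, V_y = -6xy
    print(milnor_number_homogeneous([3, 0, -3], [0, -6, 0], e=2))
    # hyperbolic umbilic germ x^3 + y^3: V_x = 3x^2, V_y = 3y^2
    print(milnor_number_homogeneous([3, 0, 0], [0, 0, 3], e=2))
```

For both umbilic germs the count is 1 + 2 + 1 = 4 (constants, the two linear monomials, and one surviving quadratic class). A non-isolated example such as $x^2 y$ never reaches a degree where the quotient vanishes, which the sketch reports as an error.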

[Figure 1 panels: Swallow tail, Cusp, Hyperbolic umbilic, Elliptic umbilic; axes labeled λ1, λ2, λ3.]


Figure 1 Examples of catastrophe sets. Adapted with permission from Françoise J-P (2005) Oscillations en Biologie: Analyse
Qualitative et Modèles (Mathématiques et Applications, vol. 46). Heidelberg: Springer.
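As a worked illustration of a catastrophe set, take the cusp $V_\lambda(x) = \tfrac{1}{4}x^4 + \tfrac{1}{2}\lambda_1 x^2 + \lambda_2 x$. Eliminating x from $\partial V_\lambda/\partial x = x^3 + \lambda_1 x + \lambda_2 = 0$ and $\partial^2 V_\lambda/\partial x^2 = 3x^2 + \lambda_1 = 0$ gives the curve $4\lambda_1^3 + 27\lambda_2^2 = 0$ in the parameter plane. The Python sketch below is not taken from the article; the function names and the grid-based root counter are illustrative assumptions. It checks that the number of critical points changes from three to one when this curve is crossed.

```python
def dV(x, l1, l2):
    """State equation of the cusp: dV/dx = x^3 + l1*x + l2."""
    return x**3 + l1 * x + l2

def cusp_discriminant(l1, l2):
    """Vanishes exactly on the catastrophe set 4*l1^3 + 27*l2^2 = 0."""
    return 4.0 * l1**3 + 27.0 * l2**2

def count_critical_points(l1, l2, lo=-10.0, hi=10.0, steps=20000):
    """Count the real zeros of dV/dx in [lo, hi) by sign changes on a grid.

    Adequate for simple roots that are well separated compared with the
    grid spacing; it is a sketch, not a robust root finder.
    """
    count = 0
    prev = dV(lo, l1, l2)
    for k in range(1, steps + 1):
        x = lo + (hi - lo) * k / steps
        cur = dV(x, l1, l2)
        if prev == 0.0 or prev * cur < 0.0:
            count += 1
        prev = cur
    return count

if __name__ == "__main__":
    for l1, l2 in [(-3.0, 1.0), (-3.0, 3.0), (1.0, 1.0)]:
        print((l1, l2), cusp_discriminant(l1, l2), count_critical_points(l1, l2))
```

Inside the cusp region (negative discriminant) the potential has two minima separated by a maximum; outside it has a single minimum.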

The theory of unfoldings of singularities can be used, for instance, to provide asymptotic expression of stationary phase integrals when critical points of the phase are not of Morse type. This relates to monodromy, Bernstein polynomials, Milnor fibration near a singular point, and simultaneous local models of forms and functions (cf. Malgrange (1974) and see Feynman Path Integrals).

Singularity Theory of Vector Fields

Transcritical Bifurcation

The transcritical bifurcation is the standard mechanism for changes in stability. The local model is given by

$$\dot{x} = rx - x^2$$

For r < 0, there is an unstable fixed point at x = r and a stable fixed point at x = 0. As r increases, the unstable and the stable fixed points coalesce when r = 0 and when r > 0, they exchange their stability.

A simplified model of the essential physics of a laser is due to Haken (1983). It is given by

$$\dot{n} = GnN - kn$$

where n is the number of photons in the laser field, N is the number of excited atoms, and the gain term comes from the process of stimulated emission which occurs at a rate proportional to the product nN. Furthermore, the number of excited atoms drops down by the emission of photons, $N = N_0 - \alpha n$. Then we obtain

$$\dot{n} = (GN_0 - k)\,n - \alpha G\, n^2$$

This model displays a transcritical bifurcation (at $N_0 = k/G$), which explains in elementary terms the laser threshold.

Pitchfork Bifurcation

The local model for supercritical pitchfork bifurcation is

$$\dot{x} = rx - x^3$$

[Figure 2 panels: Butterfly (section λ1 = 0), Parabolic umbilic (section λ1 = 0), Butterfly (section λ2 = 0), Parabolic umbilic (section λ2 = 0); axes labeled λ1 to λ4.]

Figure 2 Sections of catastrophe sets. Adapted with permission from Françoise J-P (2005) Oscillations en Biologie: Analyse
Qualitative et Modèles (Mathématiques et Applications, vol. 46). Heidelberg: Springer.
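The stability statements for the one-dimensional local models of this section (the transcritical model $\dot{x} = rx - x^2$, the laser equation, and the supercritical pitchfork $\dot{x} = rx - x^3$) all follow from the sign of the derivative of the right-hand side at a fixed point. The Python sketch below is an illustration with hypothetical function names. In the laser model the coefficient written `alpha` is an assumed name for the rate at which photon emission depletes the excited atoms, and the numerical values in the example are arbitrary.

```python
def transcritical(x, r):
    """Local model of the transcritical bifurcation: dx/dt = r*x - x^2."""
    return r * x - x**2

def supercritical_pitchfork(x, r):
    """Local model of the supercritical pitchfork: dx/dt = r*x - x^3."""
    return r * x - x**3

def laser(n, G, k, N0, alpha):
    """Laser model with N eliminated: dn/dt = (G*N0 - k)*n - alpha*G*n^2."""
    return (G * N0 - k) * n - alpha * G * n**2

def is_stable(f, x_star, h=1e-6):
    """A fixed point x* of dx/dt = f(x) is linearly stable when f'(x*) < 0.

    The derivative is estimated with a central difference.
    """
    slope = (f(x_star + h) - f(x_star - h)) / (2.0 * h)
    return slope < 0.0

def lasing_photon_number(G, k, N0, alpha):
    """Nonzero fixed point n* = (G*N0 - k) / (alpha*G); positive above threshold."""
    return (G * N0 - k) / (alpha * G)

if __name__ == "__main__":
    for r in (-1.0, 1.0):
        f = lambda x, r=r: transcritical(x, r)
        print("transcritical, r =", r,
              "| x = 0 stable:", is_stable(f, 0.0),
              "| x = r stable:", is_stable(f, r))
    g = lambda x: supercritical_pitchfork(x, 1.0)
    print("pitchfork, r = 1:", [(x, is_stable(g, x)) for x in (-1.0, 0.0, 1.0)])
    # laser with illustrative values G = k = alpha = 1; the threshold is N0 = k/G = 1
    for N0 in (0.5, 2.0):
        f = lambda n, N0=N0: laser(n, 1.0, 1.0, N0, 1.0)
        print("laser, N0 =", N0, "| n = 0 stable:", is_stable(f, 0.0))
```

Below threshold the state with no photons is stable; above it, that state loses stability to the lasing state $n^{*} > 0$, which is the exchange of stability described for the transcritical model.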

When the parameter r < 0, it displays one stable equilibrium position. As r increases, this equilibrium bifurcates (for r > 0) into two stable equilibria and an unstable equilibrium. Its drawing suggests "the pitchfork." In case of subcritical pitchfork bifurcation

$$\dot{x} = rx + x^3$$

the stable equilibrium x = 0 is flanked, for r < 0, by two unstable equilibria $x = \pm\sqrt{-r}$; these merge with it at r = 0, and for r > 0 only the equilibrium x = 0 remains, now unstable.

Normal Forms

Local analysis of vector fields proceeds with local models called normal forms. A local vector field near a singular point (zero) is seen as a derivation of the local ring of functions which preserves the unique maximal ideal (of the functions which vanish at the singular point). It yields a linear operator of the finite-dimensional vector spaces of truncated Taylor expansions of functions. This leads to decomposition of the vector fields into semisimple and nilpotent parts (at the level of formal series). A normal form is a formal coordinate system in which the semisimple part is linear. If the vector field preserves a structure (like volume form or symplectic form) the change of coordinates which brings it to its normal form is also (volume-preserving, symplectic). The simplicity of the normal form depends on the number of allowed resonances for the eigenvalues of the first-order jet of the vector field at the singular point. The best-known example is the Birkhoff normal form of Hamiltonian vector fields that we recall now, but we should also mention the Sternberg normal form of volume-preserving vector fields.

Local analysis of a Hamiltonian vector field under symplectic changes of coordinates is the same as the local analysis of functions (namely its associated Hamiltonian). Birkhoff normal form deals with the
case of a Hamiltonian that is a perturbation, at the origin, of

$$H_0(p) = \sum_{j=1}^{m} \lambda_j\, p_j, \qquad p_j = x_j^2 + y_j^2, \quad j = 1, \ldots, m$$

where the symplectic form is

$$\omega = \sum_{j=1}^{m} dx_j \wedge dy_j$$

If the eigenvalues $\lambda_j$ are assumed to be independent over the integers (no resonances), then there is a formal system of symplectic coordinates $\hat{p}_j$, $\hat{q}_j$, $j = 1, \ldots, m$, called action-angle variables, in which the Hamiltonian depends only on the action variables $\hat{p}_j$. Such a coordinate system is generically divergent because, under generic assumptions on the 3-jet of the Hamiltonian, the system displays isolated periodic orbits in any neighborhood of the origin (see Moser, Vey, Françoise). Normal forms are usually used in applications (e.g., Nekhoroshev theorem, Hopf bifurcation theorem) in their truncated versions. Birkhoff normal form was conjectured (A Weinstein) to enter in the asymptotic expansion of the fundamental solution of the wave equation on a Riemannian manifold near elliptic geodesics. This conjecture was recently proved by V Guillemin.

Stability Theory of Hamiltonian Systems, Nekhoroshev Theorem, Arnol'd Diffusion

The generic divergence of the Birkhoff normal form does not allow one to conclude about the stability of the elliptic singular point. In the case where it is convergent, the motion is trapped inside invariant tori (conservation of the actions). The KAM theorem (see Gallavotti (1983)) provides the existence of many invariant tori but, except in low dimensions, this does not exclude the existence of trajectories that would escape to infinity. Arnol'd indeed provided a mechanism and examples of such situations (this is now called Arnol'd diffusion) (see Introductory Articles: Classical Mechanics). This diffusion process needs some time, which is estimated below by a theorem of Nekhoroshev.

Consider the Hamiltonian

$$H_\varepsilon(p, q) = h(p) + \varepsilon f(p, q)$$

where h(p) is strictly convex, analytic, anisochronous on the closure $\bar{U}$ of an open bounded region U of $\mathbb{R}^m$ and the perturbation $\varepsilon f(p, q)$ is analytic on $\bar{U} \times \mathbb{R}^m$. Nekhoroshev's theorem states that there are positive constants a, b, d, g, $\tau$ such that for any initial data $p_0$, $q_0$, the actions p do not change by more than $a\varepsilon^{g}$ before a time bounded below by $\tau e^{b/\varepsilon^{d}}$ (see Gallavotti (1983)).

Bifurcations of Periodic Orbits

Consider a one-parameter family of vector fields $X_\lambda$ of class $C^k$, $k \ge 3$,

$$\dot{x} = F(x, \lambda)$$

Assume that $X_\lambda(0) = 0$ and that the linear part of the vector field at 0 has two complex-conjugated eigenvalues $\mu(\lambda)$ and $\bar{\mu}(\lambda)$ such that $\mathrm{Re}(\mu(\lambda)) > 0$ for $\lambda > 0$, $\mathrm{Re}(\mu(0)) = 0$ and $d(\mathrm{Re}(\mu(\lambda)))/d\lambda\,|_{\lambda = 0} \ne 0$. Then, for $\lambda > 0$ but small enough, the vector field $X_\lambda$ has a periodic orbit $\gamma_\lambda$ which tends to 0 as $\lambda$ tends to 0.

This bifurcation of codimension 1 is named Hopf bifurcation and it occurs in many models.

When several oscillators (conservative or dissipative) are weakly coupled, they may display frequency locking (existence of an attractive periodic orbit), phase locking, and synchronization. The fact that we always see the same face of the Moon from the Earth can be explained by a synchronization of the rotation of the Moon onto itself with its rotation around the Earth. Synchronization also plays a fundamental role in living organisms (e.g., heart, population dynamics: see D Attenborough's movie "The Trials of Life"). It is sometimes possible to be convinced of synchronization via computer experiments, but the main theoretical approach is due to Malkin. See Bifurcations of Periodic Orbits, where a full mathematical proof is included.

Homoclinic Bifurcation, Newhouse's Phenomenon

Homoclinic bifurcation occurs in the family $X_\lambda$ at the bifurcation value of the parameter $\lambda = 0$ if $X_0$ displays a singular orbit which tends to 0 both for $t \to +\infty$ and for $t \to -\infty$. In dimension 2, if $\lambda$ is slightly deformed around 0, one periodic orbit may appear (or disappear). For planar systems, the Bogdanov–Takens bifurcation is the codimension-2 bifurcation, which mixes the homoclinic and the Hopf bifurcations. In dimension 3, more complicated phase diagrams may occur (such as in the Shilnikov bifurcation) with the appearance of infinitely many periodic orbits or homoclinic loops (in a stable way: Newhouse phenomenon). This eventually gives rise to strange attractors (the Roessler attractor).

The Poincaré Center-Focus Problem, Local Hilbert's 16th Problem, Abel Equations, Algebraic Moments

Hopf bifurcation theory for two-dimensional systems deals with the first case of a general situation
often referred to as degeneracies of Hopf bifurca- Excitability is also an important feature which occurs
tions or alternatively Hopf–Takens bifurcations.

Consider more generally a planar vector field, tangent at the origin to a linear focus:

\[
\dot{x} = -y + \lambda x + f(x, y), \qquad \dot{y} = x + \lambda y + g(x, y)
\]

The Poincaré center-focus problem asks for necessary and sufficient conditions on the perturbation terms so that all orbits are periodic in a neighborhood of the origin. This problem is still open in the case, for instance, where f and g are homogeneous of degrees 4 and 5. It was solved a long time ago for degrees 2 and 3. Part (b) of Hilbert's 16th Problem asks for a bound, in terms of the degrees of polynomial perturbations, on the number of limit cycles (isolated periodic orbits) in the neighborhood of the origin. In the case of homogeneous perturbations, a Cherkas transformation allows the reduction of both problems to the so-called one-dimensional periodic Abel equations:

\[
\frac{dy}{dx} = p(x)\, y^2 + q(x)\, y^3
\]

where p and q are trigonometric polynomials in x. A perturbative approach was developed over several years and yields a theory of algebraic moments related to Livsic's generalized problem of moments.

Fast–Slow Systems

Fast–slow systems

\[
\varepsilon \dot{x} = f(x, y), \qquad \dot{y} = g(x, y)
\]

are characterized by the existence of two timescales. Variables x are called fast variables and y are called slow variables. Different approximation techniques can be used (averaging method, multiscale approach (see Multiscale Approaches)). The behavior of solutions is approximated as follows (when the scale $\varepsilon$ is small). The orbit jumps to an attractor of the fast dynamics. This attractor may eventually lose its stability and/or bifurcate as time evolves. Then the orbit jumps to another attractor of the fast dynamics. Once again, this attractor may evolve/bifurcate/disappear, depending on the slow variables y. This explains why bifurcation theory enters in the process in a crucial way, and it has to be adapted to this special context where some new phenomena may occur (e.g., singular Hopf bifurcation theory, Canards, etc.). Fundamental tools to be used in this context are Takens theorem, Fenichel central manifold theorem, blowing-up (Dumortier–Roussarie).

in some fast–slow systems. Consider initial data in a neighborhood of an excitable attractive point. For some initial data, the orbit goes very quickly to the attractor. For some others instead (usually below some threshold), the orbit undergoes a long incursion in the phase diagram before turning back to the attractive point.

Singular Hopf bifurcation, hysteresis, and excitability can, for instance, occur in the electrodissolution and passivation of iron in sulfuric acid (see Alligood et al. (1997)).

Sometimes, the orbit leaves the neighborhood of a first attractor to jump to a second one and then this second one disappears and the orbit jumps back to the initial attractor as the slow variables have undergone a cycle. This is called a hysteresis cycle. In case one of the attractors is a point while the other is an attractive periodic orbit, it may lead to bursting oscillations. These oscillations are characterized by the periodic succession of silent phases (attractor of the fast dynamics) and active (pulsatile) phases (periodic attractor of the fast dynamics). They are ubiquitous in physiology, where they were first discovered, and can also be observed in physics (laser beams) and in population dynamics.

Example

The Hindmarsh–Rose model displays bursting oscillations:

\[
\begin{aligned}
\dot{x} &= y - x^3 + 3x^2 + I - z \\
\dot{y} &= 1 - 5x^2 - y \\
\dot{z} &= \varepsilon \left( s(x - x_1) - z \right)
\end{aligned}
\]

The fast dynamics is two dimensional. For some values of the parameters, it displays an attractive node, a saddle and a repulsive focus. Under the slow variation of z, the fast dynamics displays a saddle–node bifurcation, a Hopf bifurcation from which emerges a stable limit cycle which disappears into a homoclinic bifurcation. The fast–slow system undergoes a hysteresis loop which gives rise to bursting oscillations.

Conclusions

Over the past three decades, mathematical techniques gathered under the names of singularity theory and bifurcation theory of dynamical systems have offered a powerful means to explore nonlinear phenomena in diverse settings. These include mechanical vibrations, lasers, superconducting circuits, and chemical oscillators. Many such instances are further developed in this encyclopedia.
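The bursting behavior of the Hindmarsh–Rose example can be explored numerically. The following is a minimal sketch, not taken from the article: it integrates the three equations with a classical fourth-order Runge–Kutta step and counts upward crossings of x = 0 as spikes. The parameter values (I = 2, s = 4, x_1 = -1.6, and a small rate eps = 0.005 for the slow variable z), the initial state, and all function names are assumptions chosen for illustration.

```python
def hindmarsh_rose(state, I=2.0, s=4.0, x1=-1.6, eps=0.005):
    """Right-hand side of the Hindmarsh-Rose system (parameter values are assumed).

    x and y are the fast variables; z evolves on the slow timescale set by eps.
    """
    x, y, z = state
    dx = y - x**3 + 3.0 * x**2 + I - z
    dy = 1.0 - 5.0 * x**2 - y
    dz = eps * (s * (x - x1) - z)
    return (dx, dy, dz)


def rk4_step(f, state, dt):
    """One classical Runge-Kutta (RK4) step for an autonomous system."""
    k1 = f(state)
    k2 = f(tuple(v + 0.5 * dt * k for v, k in zip(state, k1)))
    k3 = f(tuple(v + 0.5 * dt * k for v, k in zip(state, k2)))
    k4 = f(tuple(v + dt * k for v, k in zip(state, k3)))
    return tuple(
        v + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for v, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(t_end=600.0, dt=0.02, state=(-1.6, -11.8, 1.5)):
    """Return the sampled x-component of the orbit, including the initial value."""
    xs = [state[0]]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(hindmarsh_rose, state, dt)
        xs.append(state[0])
    return xs


def count_spikes(xs, threshold=0.0):
    """Count upward crossings of the threshold in the sampled x-component."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a < threshold <= b)


if __name__ == "__main__":
    xs = simulate()
    print("spikes:", count_spikes(xs))
    print("x range:", min(xs), max(xs))
```

Plotting x against time for a longer run is the natural next step; in the bursting regime one expects groups of spikes (active phases) separated by quiet stretches (silent phases) while z drifts slowly.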

See also: Bifurcation Theory; Bifurcations of Periodic Orbits; Chaos and Attractors; Entropy and Quantitative Transversality; Ergodic Theory; Feynman Path Integrals; Generic Properties of Dynamical Systems; Gravitational Lensing; Homoclinic Phenomena; Hyperbolic Dynamical Systems; Multiscale Approaches; Optical Caustics; Poisson Reduction; Stationary Phase Approximation; Symmetry and Symmetry Breaking in Dynamical Systems; Symmetry and Symplectic Reduction; Synchronization of Chaos; Weakly Coupled Oscillators.

Further Reading

Alligood KT, Sauer TD, and Yorke JA (1997) Chaos, An Introduction to Dynamical Systems, Textbooks in Mathematical Sciences. New York: Springer.
Alpay D and Vinnikov V (eds.) (2001) Operator Theory, System Theory and Related Topics, The Moshe Livsic Anniversary Volume, Operator Theory, Advances and Applications vol. 123. Birkhäuser.
Briskin M, Francoise JP, and Yomdin Y (2001) Generalized Moments, Center-Focus Conditions and Compositions of Polynomials. Operator Theory, Advances and Applications 123 (in honor of M Livsic, 80th birthday).
Diener M (1994) The canard unchained, or how fast–slow dynamical systems bifurcate? The Mathematical Intelligencer 6: 38–49.
Francoise JP and Guillemin V (1991) On the period spectrum of a symplectic map. Journal of Functional Analysis 100: 317–358.
Gallavotti G (1983) The Elements of Mechanics. New York: Springer.
Goodstein DL and Goodstein JR (1997) Feynman's Lost Lecture. London: Vintage.
Guckenheimer J (2004) Bifurcations of relaxation oscillations. In: Ilyashenko Y, Rousseau C, and Sabidussi G (eds.) Normal Forms, Bifurcations and Finiteness Problems in Differential Equations. Séminaire de mathématiques supérieures de Montréal, Nato Sciences Series, II. Mathematics, vol. 137, pp. 295–316. Kluwer.
Haken H (1983) Synergetics, 3rd edn. Berlin: Springer.
Keener J and Sneyd J (1998) Mathematical Physiology. Interdisciplinary Applied Mathematics, vol. 8. New York: Springer.
Malgrange B (1974) Intégrales asymptotiques et monodromie. Annales de l'ENS 7: 405–430.
May R-M (1976) Simple mathematical models with very complicated dynamics. Nature 261: 459–467.
Nekhoroshev V (1977) An exponential estimate of the time of stability of nearly integrable Hamiltonian systems. Russian Mathematical Surveys 32(6): 1–65.
Palis J and de Melo W (1982) Geometric Theory of Dynamical Systems, An Introduction. New York: Springer.
Perko L (2000) Differential Equations and Dynamical Systems, 3rd edn, Text in Applied Mathematics, vol. 7. New York: Springer.
Siegel C-L and Moser J (1971) Lectures on Celestial Mechanics, Die Grundlehren der mathematischen Wissenschaften, vol. 187. Berlin: Springer.
Smale S (1998) Mathematical problems for the next century. The Mathematical Intelligencer 20: 7–15.
Smale S. Dynamics retrospective: great problems, attempts that failed. Physica D 51: 267–273.

Sobolev Spaces see Inequalities in Sobolev Spaces

Solitons and Kac–Moody Lie Algebras


E Date, Osaka University, Osaka, Japan
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Solitons and Kac–Moody Lie algebras were born at almost the same time in the 1960s, although they did not have a connection at first. They both have roots in the history of mathematics. From the 1970s on, they became intersection points for many (previously known and new) results.

The notion of solitons has many facets and it is difficult to give a mathematically precise definition; closely related to solitons is the notion of "completely integrable systems." The latter is usually used in a much broader sense.

The terminology "soliton" was originally used for a particular phenomenon in shallow water waves. Now, in its broadest sense, it is used to represent an area of research relating to this particular phenomenon in direct or indirect ways. From the viewpoint of solitons, particular solutions of differential equations are of special interest. Although particular solutions have been studied for a long time, interest in them was overshadowed by the method of functional analysis in the 1950s. In the late nineteenth century, in parallel with the theory of algebraic functions, several studies undertook the solution of mechanical problems by elliptic or hyperelliptic integrals. Subsequently, however, there was a drop in activity in this area of work.

Originally it was hoped that this kind of phenomenon could be used for practical applications. No mention of practical application of solitons will be made in this article.

First we list several topics which constitute the main body of the notion of solitons in the early stages; we will then explain relations with Kac–Moody Lie algebras.

Birth of Solitons

The name "soliton" itself was coined by Martin D Kruskal around 1965. It was originally employed for the solitary wave solution of the Korteweg–de Vries (KdV) equation

\[
u_t - \tfrac{1}{4}\left(6 u u_x + u_{xxx}\right) = 0, \qquad u = u(x, t) \tag{1}
\]

The coefficients here are not important. We can change them arbitrarily. The unknown function u, or rather $-u$, represents the height of the wave. The solitary wave solution in question is given by

\[
u(x, t) = 2c\, \operatorname{sech}^2\!\left(\sqrt{c}\,(x - ct - d)\right) \tag{2}
\]

This is a traveling-wave solution with the height of the wave proportional to the speed. This is one feature of the nonlinearity of this differential equation.

A reason for this nomenclature comes from the particle-like property of solitary waves observed via numerical computations. That is, if we have two solitons [2] with different speeds, with the faster one on the left and the slower one on the right, then after some time they collide and their shapes are distorted. After a long enough time, they are separated and recover their original shapes, the only difference being in the change of the phase shift d in [2].

Solitary waves in shallow water (like a canal) were first observed by Scott Russell in Scotland in the middle of the nineteenth century. Differential equations which possess solitary waves in shallow water as solutions were sought after Scott Russell's report. Boussinesq derived one (now called the Boussinesq equation, which contains second partial derivatives with respect to time) from the Euler equation of water waves; then in 1895 Korteweg and his student de Vries derived the KdV equation. They also showed that the KdV equation possesses solutions expressible in terms of elliptic functions.

In the 1960s Kruskal and Zabusky carried out numerical computations for the Fermi–Pasta–Ulam problem; they also came across the KdV equation and found the aforementioned phenomenon.

Inverse-Scattering Method

Kruskal and his co-workers further pursued the origin of the particle-like property of solitons and proposed the so-called inverse-scattering method. The inverse problem of scattering theory of the one-dimensional Schrödinger operator

\[
L = -\left(\frac{d}{dx}\right)^2 + u(x)
\]

was studied by Gelfand–Dikii, Marchenko, and Krein in the 1950s, motivated by scattering theory in quantum mechanics. It gives a one-to-one correspondence between rapidly decreasing potentials u(x) and scattering data which consist of discrete eigenvalues $\xi_j^2$ and normalization $c_j$, $j = 1, \ldots, n$, of the eigenfunctions corresponding to them and the reflection coefficient $r(\xi)$. The reflection coefficient represents the ratio of reflection of the unit plane wave $e^{i\xi x}$ by the potential field. The scattering data $\{r(\xi), \xi_j, c_j,\ j = 1, \ldots, n\}$ are a mathematical idealization of observable data in quantum scattering. The procedure of reconstructing a potential from given scattering data is called the inverse problem. The heart of this procedure is solving an integral equation (the Gelfand–Dikii–Marchenko equation). In the reflectionless case ($r(\xi) = 0$), this integral equation reduces to a system of linear algebraic equations.

Kruskal and co-workers found that the scattering data of these operators with solutions of [1] as potentials depend very simply on t:

\[
\xi_j(t) = \xi_j(0), \qquad c_j(t) = c_j(0)\, e^{2i\xi_j^3 t}, \qquad r(\xi, t) = r(\xi, 0)\, e^{2i\xi^3 t} \tag{3}
\]

It was realized at the same time that soliton solutions correspond to a reflectionless potential ($r(\xi) = 0$) with only one discrete eigenvalue, while reflectionless potentials correspond to a nonlinear "superposition" of soliton solutions (called multisoliton solutions) and describe the interaction of solitons.

As was pointed out by Zakharov and others, the inverse-scattering method has an intimate relation with the Riemann–Hilbert problem.

Lax Representation

Looking at this invariance of the spectrum, Lax reformulated the KdV equation [1] as an evolution equation for the one-dimensional Schrödinger operator:

\[
\frac{dL}{dt} = [A, L], \qquad L = \left(\frac{d}{dx}\right)^2 + u, \qquad
A = \left(\frac{\partial}{\partial x}\right)^3 + \frac{3}{4}\left(u\,\frac{\partial}{\partial x} + \frac{\partial}{\partial x}\,u\right) \tag{4}
\]

Here we have changed the sign of the operator for later convenience. This form of representation together with the inverse-scattering method gave a framework for finding nonlinear differential (difference) equations that have solutions with properties similar to solitons (soliton equations).

Among such are the sine-Gordon equation

\[
u_{tt} - u_{xx} = \sin u
\]

the nonlinear Schrödinger equation

\[
i u_t + u_{xx} + |u|^2 u = 0
\]

the modified KdV equation

\[
u_t - \tfrac{1}{6}\left(6 u^2 u_x + u_{xxx}\right) = 0
\]

the Toda lattice equation

\[
\frac{dQ_n}{dt} = P_n, \qquad
\frac{dP_n}{dt} = -\exp(Q_n - Q_{n+1}) + \exp(Q_{n-1} - Q_n) \tag{5}
\]

and so on. The first three are obtained by replacing L by a $2 \times 2$ matrix differential operator of first order. For eqn [5], the linear operator corresponding to L in the case of the KdV equation is a difference operator of order 2 and has a connection with the theory of orthogonal polynomials in one variable as well as with the theory of moment problems.

Later it was remarked that the differential operator A in eqn [4] is nothing but the differential operator part of the fractional power of L: $A = (L^{3/2})_+$. By replacing A in [4] by $(L^{(2n+1)/2})_+$ we obtain higher (nth) KdV equations.

Basic Representations of Affine Lie Algebras

In the 1960s Kac and Moody introduced independently a class of infinite-dimensional Lie algebras which are in many respects close to finite-dimensional semisimple Lie algebras. Each of them is constructed for a given generalized Cartan matrix (GCM),

\[
C = (a_{ij}), \qquad a_{ii} = 2, \qquad a_{ij} \le 0 \ \text{for } i \ne j, \qquad \text{and if } a_{ij} = 0 \text{ then } a_{ji} = 0 \tag{6}
\]

There is a special class of Kac–Moody Lie algebras that are now called affine Lie algebras. They correspond to positive-semidefinite GCM and are realized as central extensions of loop algebras (current algebras)

\[
\mathbb{C}[\lambda, \lambda^{-1}] \otimes \mathfrak{g}
\]

of finite-dimensional semisimple Lie algebras $\mathfrak{g}$. They have many applications in physics, in particular as current algebras. The Sugawara construction in current algebra plays an essential role in conformal field theory. Note that finite-dimensional semisimple Lie algebras correspond to positive-definite GCMs.

In the late 1970s, there was interest in constructing representations of these algebras after the general theory of representations was constructed. Among them was the work of Lepowsky–Wilson, who constructed basic representations of the affine Lie algebra $A_1^{(1)}$ ($= \widehat{\mathfrak{sl}}_2$) using differential operators of infinite order in infinitely many variables. These operators were called vertex operators by Garland, in view of the resemblance to objects in string theory. Character formulas for these new Lie algebras were intensively studied and many combinatorial identities were (re)derived.

Geometric Interpretation

How do Kac–Moody Lie algebras enter into this picture?

In the early stages of the history of solitons Kac–Moody Lie algebras appeared rather artificially. Some authors tried to understand solitons from geometric viewpoints. A typical example is the sine-Gordon equation. This equation appears as the Gauß–Codazzi equation in the theory of embeddings of two-dimensional surfaces of constant negative curvature into three-dimensional Euclidean space, while the Gauß–Weingarten equation is the linear equation that appears in the Lax representation of the sine-Gordon equation. Another approach of a geometric nature, involving the prolongation structure, was the direction initiated by Wahlquist–Estabrook. In this approach, the Lie algebra appeared in a natural way, although the nature of such Lie algebras was not so clear. This direction of research is close in spirit to the method of Cartan for treating partial differential equations.

Several authors considered generalizations of the Toda lattice equation. Bogoyavlenskii and others observed that the original Toda lattice equation [5] is related to the Cartan matrix of the affine Lie algebra of type A. Viewed in this way, it was straightforward to generalize the Toda lattice equation to Cartan matrices of another type of affine Lie algebras and also to ordinary Cartan matrices. These were typical appearances of Kac–Moody Lie algebras in the theory of solitons; they were used to produce soliton equations. The climax of this is the work of Drinfeld–Sokolov.

It needed some time to understand another role of affine Lie algebras in the theory of solitons.

Bäcklund Transformation

In the theory of two-dimensional surfaces of constant negative curvature, a method of obtaining another surface of constant negative curvature from the given one with some parameter was known by the work of Bäcklund. If we apply this to the trivial solutions u = 0 of the sine-Gordon equation, we

obtain a one-soliton solution of the sine-Gordon equation. From this fact, the transformation of solutions of soliton equations to other solutions is called a Bäcklund transformation. The original Darboux transformation is a special case of a Bäcklund transformation.

Hamiltonian Formalism

Another discovery of Gardner–Greene–Kruskal–Miura was the Hamiltonian structure of the KdV equation. In the process of showing the existence of infinitely many conservation laws, they used the so-called Miura transformation, which relates the KdV and the modified KdV equation. Faddeev–Zakharov showed that the transformation to scattering data is a canonical transformation, and conserved quantities are obtained from the expansion of the reflection coefficients.

Gelfand–Dikii studied Hamiltonian structures of the KdV equation using the formal variational calculus they initiated.

M Adler was the first to try to study the KdV equation by using the orbit method known for finite-dimensional Lie algebras. It was known by the works of Kostant and Kirillov or even earlier by Lie that the co-adjoint orbits of Lie algebras admit symplectic structures (the Kostant–Kirillov bracket). Adler considered the algebra of pseudodifferential operators in one variable. This acquires the structure of Lie algebra by the commutation relation. This algebra admits a natural triangular decomposition by order. He showed that the KdV equation can be viewed as a Hamiltonian system in the co-adjoint orbit of the one-dimensional Schrödinger operator with the Kostant–Kirillov bracket. By introducing the notion of residue of pseudodifferential operators he rederived conserved quantities. The work of Drinfeld–Sokolov can be regarded as a thorough generalization of this direction. Hamiltonian structures of the KdV equation and other soliton equations are now understood in this way.

The method is also applicable to finite-dimensional Lie algebras. Symes, Kostant, and others treated the finite Toda lattice in this way.

The motion of tops, including that of Kovalevskaya, was also studied in this way.

Hirota's Method

There was another approach to soliton equations, quite different from the above. This was the method initiated by Hirota. He placed stress on the form of multisoliton solutions of the KdV equation, the sine-Gordon equation, and so on. He made a dependent-variable transformation of the KdV equation [1],

\[
u = 2 \left(\frac{d}{dx}\right)^2 \log f
\]

This form naturally arises when we reconstruct the potential of the one-dimensional Schrödinger operator from the scattering data by solving the Gelfand–Dikii–Marchenko integral equation. In this new dependent variable, eqn [1] takes the following form:

\[
\left(D_x^4 - 4 D_x D_t\right) f(x, t) \cdot f(x, t) = 0
\]

where the operator $D_x$ is defined by

\[
D_x (f \cdot g) = \left.\frac{d}{dx'}\, f(x + x')\, g(x - x') \right|_{x' = 0} \tag{7}
\]

This operator is called Hirota's bilinear differential operator. In such transformed form, he tried to solve the resulting equation in a perturbative way,

\[
f = 1 + \sum_{j=1}^{n} \exp\!\left(2 p_j x + 2 p_j^3 t + q_j\right)
+ \sum_{1 \le j < k \le n} c_{jk} \exp\!\left(2 (p_j + p_k) x + 2 (p_j^3 + p_k^3) t + q_j + q_k\right) + \cdots \tag{8}
\]

It is rather miraculous that in the soliton equation case we can truncate such a perturbative procedure at a finite point. The number of steps corresponds to the number of solitons.

Most of the soliton equations are rewritten in bilinear form with such bilinear differentiation after a suitable dependent-variable transformation. (Some equations need several new dependent variables.) Once we have a differential equation in Hirota's bilinear differential form, it always has two-soliton solutions.

Up to 1980, keywords characterizing solitons were: inverse-scattering method, Bäcklund transformation, multisolitons, Hirota's method, quasiperiodic solutions, etc. No explicit mention was made of representation theory.

Hierarchy of Soliton Equations

As was stated above, soliton equations viewed as Hamiltonian systems have infinitely many conservation laws. This implies that we can introduce infinitely many independent time variables consistently. From this viewpoint, it is natural to consider the KdV equation and its higher-order analogs simultaneously. They have many properties in common. For example, the t-dependence of the scattering data of the higher

KdV equation is given by replacing $\xi^3$ by $\xi^{2n+1}$ and $\xi_j^3$ by $\xi_j^{2n+1}$ in eqn [3]. The totality of soliton equations organized in this way is called a hierarchy of soliton equations; in the KdV case, it is called the KdV hierarchy. This notion of hierarchy was introduced by M Sato. He tried to understand the nature of the bilinear method of Hirota. First, he counted the number of Hirota bilinear operators of given degree for hierarchies of soliton equations. For the number of bilinear equations, M Sato and Y Sato made extensive computations and made many conjectures that involve enumeration of partitions.

Kadomtsev–Petviashvili Hierarchy

Although it was included in a family of soliton equations slightly later, the Kadomtsev–Petviashvili (KP) equation is a soliton equation in three independent variables, which first appeared in plasma physics:

\[
\frac{3}{4}\, u_{yy} - \left( u_t - \frac{1}{4}\left(6 u u_x + u_{xxx}\right) \right)_x = 0 \tag{9}
\]

For this equation we have to replace the Lax representation by

\[
\left[ \left(\frac{\partial}{\partial x}\right)^2 + u - \frac{\partial}{\partial y},\;
\left(\frac{\partial}{\partial x}\right)^3 + \frac{3}{2}\, u\, \frac{\partial}{\partial x} + v - \frac{\partial}{\partial t} \right] = 0 \tag{10}
\]

This form of representation was introduced by Zakharov–Shabat. Sometimes it is referred to as the zero-curvature representation or the Zakharov–Shabat representation. The KP equation is universal in the sense that it contains the KdV equation [1] and the Boussinesq equation as special cases. If u does not depend on y, resp. t, this gives the KdV, resp. the Boussinesq equation.

Work of Sato

Sato stressed the importance of the study of the KP equation. He first introduced the KP hierarchy. Instead of the one-dimensional Schrödinger operator in the KdV case consider a pseudo- (micro) differential operator of first order,

\[
L = \partial + u_2(x)\, \partial^{-1} + u_3(x)\, \partial^{-2} + \cdots, \qquad
\partial = \frac{\partial}{\partial x_1}, \qquad x = (x_1, x_2, x_3, \ldots) \tag{11}
\]

Setting $B_n = (L^n)_+$, the KP hierarchy is defined by the Zakharov–Shabat representation

\[
\left[ \frac{\partial}{\partial x_m} - B_m,\; \frac{\partial}{\partial x_n} - B_n \right] = 0, \qquad m, n = 2, 3, \ldots
\]

If we assume that $L^2$ is a differential operator, we have the KdV hierarchy and the constraint that $L^3$ is a differential operator gives the Boussinesq hierarchy. This process is called reduction.

Sato found that character polynomials (Schur functions) solve the KP hierarchy and, based on this observation, he created the theory of the infinite-dimensional (universal) Grassmann manifold and showed that the Hirota bilinear equations are nothing but the Plücker relations for this Grassmann manifold.

Sato also gave an (infinite-dimensional) determinantal formula for Hirota's dependent variable and called the latter the $\tau$-function. Using this $\tau$-function, the wave function (the eigenfunction corresponding to the KP hierarchy) is expressed as

\[
w(x, k) = \exp\!\left( \sum_{n=1}^{\infty} x_n k^n \right) \frac{\tau\!\left(x - \epsilon(k^{-1})\right)}{\tau(x)}, \qquad
\epsilon(k) = \left( k, \frac{k^2}{2}, \frac{k^3}{3}, \ldots \right), \qquad
L w = k w \tag{12}
\]

where L is given by eqn [11].

Affine Lie Algebras as Infinitesimal Transformation Groups for Soliton Equations

Date–Jimbo–Kashiwara–Miwa found another relation among soliton equations and affine Lie algebras. After noticing some similarity between the formula in the paper by Lepowsky–Wilson on the Rogers–Ramanujan identity using the vertex operators for $A_1^{(1)}$ and the formula in the computation of numbers of bilinear operators in Sato's paper, they applied the vertex operator for $A_1^{(1)}$,

\[
X(p) = \exp\!\left( \sum_{j=1}^{\infty} 2\, x_{2j-1}\, p^{2j-1} \right)
\exp\!\left( - \sum_{j=1}^{\infty} \frac{2}{(2j-1)\, p^{2j-1}}\, \frac{\partial}{\partial x_{2j-1}} \right)
\]

to 1 (which is the simplest $\tau$-function for the KdV hierarchy), where p is a parameter. They found that the result is the $\tau$-function corresponding to the one-soliton solution of the KP hierarchy. They also found that successive application of X(p)'s to 1 produced all multisoliton $\tau$-functions. Therefore, applications of vertex operators are precisely

Bäcklund transformations. This implies that the affine Lie algebra $A_1^{(1)}$ is the infinitesimal transformation group for solutions of the KdV hierarchy.

After this discovery, it was realized that the totality of $\tau$-functions of the KdV hierarchy is the group orbit of the highest weight vector (= 1) of the basic representation of $A_1^{(1)}$.

The vertex operators for the KP hierarchy were also found:

\[
X(p, q) = \exp\!\left( \sum_{j=1}^{\infty} x_j \left(p^j - q^j\right) \right)
\exp\!\left( - \sum_{j=1}^{\infty} \left( \frac{1}{j p^j} - \frac{1}{j q^j} \right) \frac{\partial}{\partial x_j} \right)
\]

If we put $q = -p$, the vertex operator $X(p)$ for $A_1^{(1)}$ is recovered.

Viewed in this way the Lie algebra corresponding to the KP hierarchy is $\mathfrak{gl}_\infty$ ($= A_\infty$). An embedding of $A_1^{(1)}$ into $A_\infty$ was also found. Subsequently, the method using free fermions (Clifford algebras) was established. Frenkel–Kac had already used free fermions to construct basic representations. In this approach, the $\tau$-functions are defined as vacuum expectation values. Based on this connection with affine Lie algebras, many conjectures of Sato on the number of bilinear equations are (re)proved by using specialized characters of affine Lie algebras.

The use of free fermions was exploited by Ishibashi–Matsuo–Ooguri to relate soliton equations with conformal field theory on Riemann surfaces. This aspect was further studied by Tsuchiya–Ueno–Yamada using D-modules.

Once such a viewpoint was established, it was easy to construct soliton equations corresponding to other affine Lie algebras. Hierarchies similar to the KP hierarchies (the simplest equation contains three variables) were also found, which correspond to Lie algebras like $\mathfrak{go}_\infty$, $\mathfrak{sp}_\infty$ (the BKP hierarchy, the CKP hierarchy, and so on).

Summarizing these developments, we can say that affine Lie algebras, or slightly larger ones like $\mathfrak{gl}_\infty$, appear naturally as infinitesimal transformation groups for soliton equations and the solution spaces are the (completed) group orbits of highest weight vector $\tau$-functions of level-1 representations. The Hirota bilinear equations are the equations describing these orbits (analogs of Plücker relations).

Soon afterwards, the notion of $\tau$-functions was introduced in the study of Painlevé equations by Okamoto, revealing Hamiltonian structures in Painlevé equations.

The Method of Drinfeld–Sokolov

The KdV or the KP hierarchies are related to scalar linear differential operators. A parallel treatment using matrix differential operators is also possible. In fact, the nonlinear Schrödinger equation, modified KdV equation, the sine-Gordon equation, etc., are treated in this way.

Drinfeld and Sokolov gave a general framework along these lines. The first step is to choose the starting (matrix-valued) linear differential operator of order one. For that they use the language of Lie algebras.

Let us start with a matrix realization of a Lie algebra (for an affine Lie algebra, the elements are Laurent polynomials in one variable). Consider a linear differential operator of the following form:

\[
L = \frac{d}{dx} + q(x) + \Lambda
\]

where q(x) is an element of the Borel subalgebra and $\Lambda$ is a sum of positive Chevalley generators in the case of affine Lie algebras. By using gauge transformations (adjoint group), they consider several normal forms. One normal form is obtained by choosing a node of the corresponding Dynkin diagram. The resulting matrix system is equivalent to the one obtained by scalar Lax representation (or a slight generalization of it). In this way, the generalized KdV equations for affine Lie algebras are obtained. Another normal form is to make q $\mathfrak{h}$-valued. Soliton equations obtained in this way are called the modified KdV equations. This is a generalization of the Miura transformation. They also comment on the construction of partially modified soliton equations, which correspond to taking various parabolic subalgebras. The Hamiltonian formalism is also treated from their viewpoint.

In summary, in their approach affine algebras are used to construct soliton equations, or one can say that they consider the space of initial values of soliton equations.

They also discuss two-dimensional Toda lattices in their setting and show that modified equations in their sense are symmetries of the two-dimensional Toda lattices.

Common Features of the Roles of Affine Lie Algebras in Solitons

In the $\tau$-function approach as well as in the method of Drinfeld–Sokolov, the existence of triangular decomposition of Lie algebras was essential. In the former case, it was basic when considering highest-weight representations and, for the latter, it was used for the setup.

Special Solutions of Soliton Equations (Multisoliton and Rational Solutions)

One of the characteristic features of soliton equations is that they allow rich special solutions. Multisoliton solutions were the starting point of the whole story. They directly relate to vertex operators of affine Lie algebras.

Rational solutions (in terms of the $\tau$-function, polynomial solutions) can be viewed as degenerations of multisoliton solutions. Motions of poles (or zeros) of the solutions are interesting. Airault–McKean–Moser studied the motion of poles of rational solutions of the KdV equation and found that they are identical to the motion of particles on a line (Calogero–Moser–Sutherland system). This viewpoint has now been generalized by Veselov and others.

Another discovery of Sato was that polynomial $\tau$-functions of the KP hierarchy are precisely Schur functions (character polynomials).

In accordance with the process of reduction, polynomial $\tau$-functions of the KdV hierarchy are Schur functions of special type.

Quasiperiodic Solutions of Soliton Equations

As mentioned above, the KdV equation admits solutions expressible in terms of elliptic functions. Dubrovin–Novikov and Its–Matveev, almost at the same time, studied solutions of the KdV equation with periodic initial condition.

To the Sturm–Liouville (i.e., one-dimensional Schrödinger) operator with periodic potential

\[
L = -\left(\frac{\partial}{\partial x}\right)^2 + u(x), \qquad u(x + l) = u(x)
\]

there corresponds the discriminant, which is an entire function of the spectral parameter. Its zeros represent the periodic and antiperiodic spectrum $\lambda_j$ of the operator:

\[
L f_j(x) = \lambda_j f_j(x), \qquad f_j(x + l) = \pm f_j(x)
\]

It turns out that, except for a finite number of zeros, other zeros are double. Such a potential is called a finite-zone potential. These zones correspond to the spectrum of the operator in the $L^2$-sense. To a finite-zone potential u(x) there corresponds a hyperelliptic curve

\[
\nu^2 = \prod_{j=0}^{2n} \left(\lambda - \lambda_j\right)
\]

with simple zeros $\lambda_j$ of the discriminant as zeros of polynomials defining the curve. If we consider the Dirichlet boundary value problem for the operator L,

\[
L f = \mu f, \qquad f(s; \mu) = 0 = f(s + l; \mu)
\]

the eigenvalues are discrete and each eigenvalue $\mu_j$ is located in a zone:

\[
\lambda_{2j-1} \le \mu_j(s) \le \lambda_{2j}
\]

So, for the double zeros ($\lambda_{2j-1} = \lambda_{2j}$), the corresponding Dirichlet eigenvalue $\mu_j(s)$ does not depend on s.

Dubrovin–Novikov also showed that a finite-zone potential is a stationary solution of the higher-order KdV equation (the order being equal to the number of nontrivial zones) and the n-zonal potentials form a finite-dimensional integrable system. In other words, the linear operators L, $A_n$ defining the nth order KdV equations commute,

\[
[L, A_n] = 0
\]

In passing, it was later found that such a pair of commuting linear differential operators was first studied by Burchnall–Chaundy in the 1920s. H F Baker remarked on the corresponding simultaneous eigenfunctions by relating them to multiplicative functions on algebraic curves.

The Work of Krichever

Krichever reversed the above argument, utilizing the properties of corresponding eigenfunctions as a function of the spectral parameter. In this approach, we start with a compact Riemann surface C (= nonsingular algebraic curve) of genus g. Here we apply his method to the KP hierarchy. Take a point $P_0$ on C together with a local parameter $k^{-1}$ at $P_0$. Also take a general divisor $\delta$ on C of degree g. Consider a function $\psi(x, P)$, $x = (x_1, x_2, \ldots)$, with the following properties:

1. $\psi$ is meromorphic on $C \setminus P_0$ with the pole divisor $\delta$, and
2. near $P_0$, $\psi$ behaves like

\[
\psi(x, P) = \exp\!\left( \sum_{j=1}^{\infty} x_j k^j \right) \left( 1 + O(k^{-1}) \right)
\]

Such a $\psi$ exists uniquely and can be constructed using the theory of abelian integrals and the Jacobi inversion problem on algebraic curves. Such a function was called the Baker–Akhiezer function, since Akhiezer constructed it by using abelian integrals and Jacobi's

problem in his study of moment problems (orthogonal polynomials).

It was later realized that Schur had much earlier considered such functions in the study of ordinary differential equations.

It is easy to show that such a function satisfies the following linear differential equations:

\[ \frac{\partial \psi}{\partial x_n} = \left( \left(\frac{\partial}{\partial x_1}\right)^{n} + \sum_{j=0}^{n-1} u_j(x) \left(\frac{\partial}{\partial x_1}\right)^{j} \right) \psi, \qquad n = 2, 3, \ldots \]

In this way, we obtain a solution of the KP hierarchy.

If there exists a rational function $f(P)$ on $C$ with poles only at $P_0$ with singular part $k^{n}$, $\psi$ can be factorized as

\[ \psi(x, P) = \exp\left( x_n f(P) \right) \psi'(x', P) \]

where $x'$ indicates the set of variables other than $x_n$. Consequently, we have

\[ \frac{\partial}{\partial x_n} \psi(x, P) = f(P)\, \psi(x, P) \]

In this way, for a hyperelliptic curve $C$ and a branch point of it, viewed as the double cover of $\mathbb{CP}^1$, we recover the case of the KdV hierarchy.

Multisolitons correspond to rational algebraic curves with ordinary double points, while rational solutions correspond to further degeneration.

The study of quasiperiodic solutions of soliton equations revealed an intimate relationship with the theory of algebraic curves. One particular outcome was the characterization of Jacobian varieties among abelian varieties. This was originally posed by Schottky and subsequently reformulated by S P Novikov using soliton equations (Schottky problem, Novikov conjecture). This problem was solved through studies by Shiota, Mulase, and Arbarello–De Concini.

Another aspect was finding commutative subalgebras in the ring of linear differential operators. This problem is related to the theory of stable vector bundles on algebraic curves.

Similarity Solutions of Soliton Equations

Ablowitz and Segur have shown that the Painlevé transcendent of the second kind solves the KdV equation as a similarity solution. This was the starting point of the study of similarity solutions of soliton equations.

Flaschka and Newell tried to construct the theory of multisimilarity solutions. As a by-product, they discussed modulation of the KdV equation by using the averaging method of Whitham. This opens the way to study the quasiclassical limit of soliton equations. This aspect was further studied by Dubrovin and others in connection with topological field theory.

Quite recently, Noumi and Yamada gave a generalization of the Painlevé equation in many variables by using the idea of similarity solutions of soliton equations. In the work of Noumi–Yamada, the affine Weyl group and $\tau$-functions play an essential role in constructing generalizations of the Painlevé equation. The shift or the unit of difference corresponds to imaginary null roots of affine Lie algebras. The idea is further applied to elliptic Painlevé equations.

Integrable Many-Body Problems

As mentioned in relation with the rational solutions of soliton equations, the theory of integrable many-body problems has an intimate relationship with the theory of solitons. Recently, Veselov and his co-workers introduced the notion of Baker–Akhiezer functions of many variables. This concerns a commutative subring of differential operators in many variables. The structure of vector bundles on algebraic varieties of higher dimensions is quite different from that of algebraic curves. For this reason, a naïve generalization of soliton equations to higher dimensions is not possible. Veselov and others have set up a class of functions which they call multidimensional Baker–Akhiezer functions. They are defined by giving a finite set of vectors in a Euclidean space. The first problem is the existence. For the existence of the multidimensional Baker–Akhiezer function the set must satisfy several constraints. This is quite different from the case of solitons. Root systems satisfy these constraints and the corresponding Baker–Akhiezer function becomes the common eigenfunction of linear differential operators appearing in the Calogero–Sutherland–Moser model corresponding to root systems.

Ball–Box Systems

Satsuma–Takahashi found a soliton-like phenomenon in cellular automata. It took much time to find a mathematical explanation of this. Now it is understood that these systems are obtained by a limiting procedure from soliton equations. Sometimes this is called ultra-discretization. The system thus obtained can also be obtained from the theory of crystal bases of affine Lie algebras. They are now called ball–box systems.
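The soliton-like behaviour of such a cellular automaton can be illustrated with a short simulation. The sketch below is not taken from the text: it assumes the commonly quoted carrier form of the update rule (a carrier sweeps from left to right, picks up every ball it meets, and drops one ball into each empty box while it is loaded), and the names `box_ball_step` and `ball_positions` are hypothetical helpers chosen for this illustration.

```python
def box_ball_step(state):
    """Advance a configuration (list of 0/1, 1 = ball) by one time step.

    Assumed rule: a carrier sweeps left to right, picks up every ball it
    meets, and drops one ball into each empty box while it carries any.
    """
    carrier = 0
    new_state = []
    for box in state:
        if box == 1:
            carrier += 1          # pick the ball up, leaving the box empty
            new_state.append(0)
        elif carrier > 0:
            carrier -= 1          # drop one carried ball into the empty box
            new_state.append(1)
        else:
            new_state.append(0)
    if carrier:
        raise ValueError("pad the state with more empty boxes on the right")
    return new_state


def ball_positions(state):
    """Indices of the occupied boxes."""
    return [i for i, box in enumerate(state) if box == 1]


# A block of three balls followed by a single ball, then empty boxes.
state = [1, 1, 1, 0, 0, 0, 1] + [0] * 13
history = [ball_positions(state)]
for _ in range(3):
    state = box_ball_step(state)
    history.append(ball_positions(state))
```

In this run the block of three balls moves three boxes per step and the single ball one box per step; the larger block overtakes the smaller one and both re-emerge with their sizes intact, which is the soliton-like phenomenon referred to above.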
Other Topics

A quantized version of the inverse-scattering method was initiated by Faddeev and his co-workers, which makes a connection with two-dimensional solvable lattice models and produced the notion of quantum groups. Through the Bethe ansatz, another relation of two-dimensional lattice models and ball–box systems has been discussed.

See also: Affine Quantum Groups; Bäcklund Transformations; Bi-Hamiltonian Methods in Soliton Theory; Coherent States; Current Algebra; Integrable Systems and Algebraic Geometry; Integrable Systems: Overview; Multi-Hamiltonian Systems; Painlevé Equations; Partial Differential Equations: Some Examples; q-Special Functions; Recursion Operators in Classical Mechanics; Sine-Gordon Equation; Toda Lattices.

Further Reading

Cherednik I (1996) Basic Methods of Soliton Theory. Advanced Series in Mathematical Physics, vol. 25. Singapore, New Jersey, London and Hong Kong: World Scientific.
Date E, Kashiwara M, Jimbo M, and Miwa T (1983) Transformation groups for soliton equations. In: Jimbo M and Miwa T (eds.) Proceedings of RIMS Symposium on Non-Linear Integrable Systems – Classical Theory and Quantum theory, pp. 39–119. Singapore: World Scientific.
Drinfel’d VG and Sokolov VV (1985) Lie algebras and equations of Korteweg–de Vries type. Journal of Soviet Mathematics 30: 1975–2036.
Gardner CS, Greene JM, Kruskal MD, and Miura RM (1967) Methods for solving the Korteweg–de Vries equation. Physical Review Letters 19: 1095–1097.
Kac VG (1990) Infinite Dimensional Lie Algebras, 3rd edition. Cambridge: Cambridge University Press.
Manin YuI (1979) Algebraic aspects of nonlinear differential equations. Journal of Soviet Mathematics 11: 1–122.
Miwa T, Jimbo M, and Date E (2000) Solitons, (Translated by Reid, M). Cambridge: Cambridge University Press.
Noumi M (2002) Affine Weyl group approach to Painlevé equations. In: Tatsien LI (ed.) International Congress of Mathematicians (2002, Beijing), Proceedings of the International Congress of Mathematicians, August 20–28, 2002, Beijing, pp. 497–509. Beijing: Higher Education Press.
Novikov SP, Manakov SV, Pitaevskii LP, and Zakharov VE (1984) Theory of Solitons. The Inverse Scattering Method. New York and London: Consultants Bureau.
Sato M and Sato Y (1983) Soliton equations as dynamical systems on infinite-dimensional Grassmann manifold. In: Fujita H, Lax PD, and Strang G (eds.) Nonlinear Partial Differential Equations in Applied Science (Tokyo, 1982), North-Holland Math. Stud., vol. 81, pp. 259–271. Amsterdam: North-Holland.

Solitons and Other Extended Field Configurations


R S Ward, University of Durham, Durham, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

A soliton is a localized lump (or string or wall, etc.) of energy, which can move without distortion, dispersion, or dissipation, and which is stable under perturbations (and collisions with other solitons). The word was coined by Zabusky and Kruskal in 1965 to describe a solitary wave with particle-like properties (as in electron, proton, etc.). Solitons are relevant to numerous areas of physics – condensed matter, cosmology, fluids/plasmas, biophysics (e.g., DNA), nuclear physics, high-energy physics, etc. Mathematically, they are modeled as solutions of appropriate partial differential equations.

Systems which admit solitons may be classified according to the mechanism by which stability is ensured. Such mechanisms include complete integrability, nontrivial topology plus dynamical balancing, and Q-balls/breathers.

Sometimes the term ‘‘soliton’’ is used in a restricted sense, to refer to stable localized lumps which have purely elastic interactions: solitons which collide without any radiation being emitted. This is possible only in very special systems, namely, those that are completely integrable. For these systems, soliton stability (and the elasticity of collisions) arises from a number of characteristic properties, including a precise balance between dispersion and nonlinearity, solvability by the inverse scattering transform from linear data, infinitely many conserved quantities, a Lax formulation (associated linear problem), and Bäcklund transformations. Examples of such integrable soliton systems are the sine-Gordon, Korteweg–de Vries, and nonlinear Schrödinger equations.

The category of topological solitons is the most varied, and includes such examples as kinks, vortices, monopoles, skyrmions, and instantons. The requirement of dynamical balancing for these can be understood in terms of Derrick’s theorem, which provides necessary conditions for a classical field theory to admit static localized solutions. The
Derrick argument involves studying what happens to the energy of a field when one changes the scale of space. If one has a scalar field (or multiplet of scalar fields) $\phi$, and/or a gauge field $F_{\mu\nu}$, then the static energy $E$ is the sum of terms such as

\[ E_0 = \int V(\phi)\, d^n x, \qquad E_d = \int T_d(D_j \phi)\, d^n x, \qquad E_F = \int F_{jk} F_{jk}\, d^n x \]

where each integral is over ($n$-dimensional) space $\mathbb{R}^n$, $D_j \phi$ denotes the covariant spatial derivative of $\phi$, and $T_d(\xi_j)$ is a real-valued polynomial of degree $d$. In particular, for example, we could have $T_2(D_j \phi) = (D_j \phi)(D_j \phi)$, the standard gradient term. Under the dilation $x_j \mapsto \lambda x_j$, these functionals transform as

\[ E_0 \mapsto \lambda^{-n} E_0, \qquad E_d \mapsto \lambda^{d-n} E_d, \qquad E_F \mapsto \lambda^{4-n} E_F \]

In order to have a static solution (critical point of the static energy functional), one needs to have a zero exponent on $\lambda$, and/or a balance between positive and negative exponents. A negative exponent indicates a compressing force (tending to implode a localized lump), whereas a positive exponent indicates an expanding force; so to have a static lump solution, these two forces have to balance each other. For $n = 1$, a system involving only a scalar field, with terms of the form $E_0$ and $E_2$, can admit static solitons (e.g., kinks); the scaling argument implies a virial theorem, which in this case says that $E_0 = E_2$. For $n = 2$, one can have a scalar system with only $E_2$, since in this case the relevant exponent is zero (e.g., the two-dimensional sigma model). Another $n = 2$ example is that of vortices in the abelian Higgs model, where the energy contains terms $E_0$, $E_2$, and $E_F$. For $n = 3$, interesting systems have $E_2$ together with either $E_4$ (e.g., skyrmions) or $E_F$ (e.g., monopoles). An $E_0$ term is optional in these cases; its presence affects, in particular, the long-range properties of the solitons. For $n = 4$, one can have instantons in a pure gauge theory (term $E_F$ only).

It should be noted that if there are no restrictions on the fields $\phi$ and $A_j$ (such as those arising, e.g., from nontrivial topology), then there is a more obvious mode of instability, which will inevitably be present: $\phi \mapsto \mu\phi$ and/or $A_j \mapsto \mu A_j$, where $0 \le \mu \le 1$. In other words, the fields can simply be scaled away altogether, so that the height of the soliton (and its energy) go smoothly to zero. This can be prevented by nontrivial topology.

Another way of preventing solitons from shrinking is to allow the field to have some ‘‘internal’’ time dependence, so that it is stationary rather than static. For example, one could allow the complex scalar field $\phi$ to have the form $\phi = \psi \exp(i\omega t)$, where $\psi$ is independent of time $t$. This leads to something like a centrifugal force, which can have a stabilizing effect in the absence of Skyrme or magnetic terms. The corresponding solitons are Q-balls.

Kinks and Breathers

The simplest topological solitons are kinks, in systems involving a real-valued scalar field $\phi(x)$ in one spatial dimension. The dynamics is governed by the Lagrangian density

\[ \mathcal{L} = \tfrac{1}{2}\left[ (\phi_t)^2 - (\phi_x)^2 - W(\phi)^2 \right] \]

where $W(\phi)$ is a (fixed) smooth function. The system can admit kinks if $W(\phi)$ has at least two zeros, for example, $W(A) = W(B) = 0$ with $W(\phi) > 0$ for $A < \phi < B$. Two well-known systems are: sine-Gordon (where $W(\phi) = 2\sin(\phi/2)$, $A = 0$, and $B = 2\pi$) and $\phi^4$ (where $W(\phi) = 1 - \phi^2$, $A = -1$, and $B = 1$). The corresponding field equations are the Euler–Lagrange equations for $\mathcal{L}$; for example, the sine-Gordon equation is

\[ \phi_{tt} - \phi_{xx} + \sin\phi = 0 \qquad [1] \]

Configurations satisfying the boundary conditions $\phi \to A$ as $x \to -\infty$ and $\phi \to B$ as $x \to \infty$ are called kinks (and the corresponding ones with $x = \infty$ and $x = -\infty$ interchanged are antikinks). For kink (or antikink) configurations, there is a lower bound, called the Bogomol’nyi bound, on the static energy $E[\phi]$; for kink boundary conditions, we have

\[ E[\phi] = \frac{1}{2}\int_{-\infty}^{\infty} \left[ (\phi_x)^2 + W(\phi)^2 \right] dx = \frac{1}{2}\int_{-\infty}^{\infty} \left[ \phi_x - W(\phi) \right]^2 dx + \int_{-\infty}^{\infty} W(\phi)\,\phi_x\, dx \ge \int_A^B W(\phi)\, d\phi \]

with equality if and only if the Bogomol’nyi equation

\[ \frac{d\phi}{dx} = W(\phi) \qquad [2] \]

is satisfied. A static solution of the Bogomol’nyi equation is a kink solution – it is a static minimum of the energy functional in the kink sector. For example, for the sine-Gordon system, we get $E[\phi] \ge 8$, with equality for the sine-Gordon kink

\[ \phi(x) = 4\tan^{-1}\exp(x - x_0) \]

while for the $\phi^4$ system, we get $E[\phi] \ge 4/3$, with equality for the phi-four kink

\[ \phi(x) = \tanh(x - x_0) \]
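The two bounds quoted above can be checked numerically. The following is a minimal sketch, not part of the original article: it evaluates the static energy $\tfrac{1}{2}\int[(\phi_x)^2 + W(\phi)^2]\,dx$ with a central difference and the trapezoidal rule on a truncated interval. The function name `static_energy`, the interval $[-20, 20]$, the grid size, and the trial profile $\tanh(x/2)$ are all choices made for this illustration, with $x_0 = 0$.

```python
import math


def static_energy(phi, W, x_min=-20.0, x_max=20.0, n=40000):
    """Approximate E[phi] = 1/2 * integral of (phi_x^2 + W(phi)^2) dx.

    phi_x is taken by central differences and the integral by the
    trapezoidal rule on [x_min, x_max]; both are illustrative choices.
    """
    h = (x_max - x_min) / n
    total = 0.0
    for i in range(n + 1):
        x = x_min + i * h
        dphi = (phi(x + h) - phi(x - h)) / (2.0 * h)
        density = 0.5 * (dphi ** 2 + W(phi(x)) ** 2)
        weight = 0.5 if i in (0, n) else 1.0   # trapezoid end-point weights
        total += weight * density * h
    return total


# Sine-Gordon: W = 2 sin(phi/2), kink phi = 4 arctan(exp(x)).
sg_energy = static_energy(lambda x: 4.0 * math.atan(math.exp(x)),
                          lambda p: 2.0 * math.sin(p / 2.0))

# phi^4: W = 1 - phi^2, kink phi = tanh(x).
phi4_energy = static_energy(math.tanh, lambda p: 1.0 - p * p)

# A profile with the same boundary values that does not solve [2].
trial_energy = static_energy(lambda x: math.tanh(0.5 * x),
                             lambda p: 1.0 - p * p)
```

Under these assumptions `sg_energy` and `phi4_energy` should agree with the Bogomol’nyi values $8$ and $4/3$ to within the discretization error, while the stretched profile $\tanh(x/2)$ has the larger energy $\tfrac{2}{3}(a + 1/a)$ with $a = 1/2$, that is $5/3$.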
These kinks are stable topological solitons; the nontrivial topology corresponds to the fact that the boundary value of $\phi(t, x)$ at $x = \infty$ is different from that at $x = -\infty$. With trivial boundary conditions (say $\phi \to A$ as $x \to \pm\infty$), stable static solitons are unlikely to exist, but solitons with periodic time dependence (which in this context are called breathers) may exist. For example, the sine-Gordon equation and the nonlinear Schrödinger equation both admit breathers – but these owe their existence to complete integrability. By contrast, the $\phi^4$ system (which is not integrable) does not admit breathers; a collision between a $\phi^4$ kink and an antikink (with suitable impact speed) produces a long-lived state which looks like a breather, but eventually decays into radiation.

In lattice systems, however, breathers are more generic. In a one-dimensional lattice system, the continuous space $\mathbb{R}$ is replaced by the lattice $\mathbb{Z}$, so $\phi(t, x)$ is replaced by $\phi_n(t)$, where $n \in \mathbb{Z}$. The Lagrangian is

\[ L = \frac{1}{2}\sum_n \left[ (\dot\phi_n)^2 - h^{-2}(\phi_{n+1} - \phi_n)^2 - W(\phi_n)^2 \right] \]

where $h$ is a positive parameter, corresponding to the dimensionless ratio between the lattice spacing and the size of a kink. The continuum limit is $h \to 0$. This system admits kink solutions as in the continuum case; and for $h$ large enough, it admits breathers as well, but these disappear as $h$ becomes small.

Interpreted in three dimensions, the kink becomes a domain wall separating two regions in which the order parameter $\phi$ takes distinct values; this has applications in such diverse areas as cosmology and condensed matter physics.

Sigma Models and Skyrmions

In a sigma model or Skyrme system, the field is a map $\phi$ from spacetime to a Riemannian manifold $M$; generally, $M$ is taken to be a Lie group or a symmetric space. The energy density of a static field can be constructed as follows (the Lorentz-invariant extension of this gives a relativistic Lagrangian for fields on spacetime). Let $\phi^a$ be local coordinates on the $m$-dimensional manifold $M$, let $h_{ab}$ denote the metric of $M$, and let $x^j$ denote the spatial coordinates on space $\mathbb{R}^n$. An $m \times m$ matrix $D$ is defined by

\[ D_a{}^b = (\partial_j \phi^c)\, h_{ac}\, (\partial_j \phi^b) \]

where $\partial_j$ denotes derivatives with respect to the $x^j$. Then the invariants $\mathcal{E}_2 = \mathrm{tr}(D) = |\partial_j \phi^a|^2$ and $\mathcal{E}_4 = (1/2)[(\mathrm{tr}\, D)^2 - \mathrm{tr}(D^2)]$ can be terms in the energy density, as well as a zeroth-order term $\mathcal{E}_0 = V(\phi^a)$ not involving derivatives of $\phi$. A term of the form $\mathcal{E}_4$ is called a Skyrme term.

The boundary condition on field configurations is that $\phi$ tends to some constant value $\phi_0 \in M$ as $|x| \to \infty$ in $\mathbb{R}^n$. From the topological point of view, this compactifies $\mathbb{R}^n$ to $S^n$. In other words, $\phi$ extends to a map from $S^n$ to $M$; and such maps are classified topologically by the homotopy group $\pi_n(M)$. For topological solitons to exist, this group has to be nontrivial.

In one spatial dimension ($n = 1$) with $M = S^1$ (say), the expression $\mathcal{E}_4$ is identically zero, and we just have kink-type systems such as sine-Gordon. The simplest two-dimensional example ($n = 2$) is the O(3) sigma model, which has $M = S^2$ with its standard metric. In this system, the field is often expressed as a unit 3-vector field $\boldsymbol{\phi} = (\phi^1, \phi^2, \phi^3)$, with $\mathcal{E}_2 = (\partial_j \boldsymbol{\phi}) \cdot (\partial_j \boldsymbol{\phi})$. Here the configurations are classified topologically by their degree (or winding number, or topological charge) $N \in \pi_2(S^2) \cong \mathbb{Z}$, which equals

\[ N = \frac{1}{4\pi}\int \boldsymbol{\phi} \cdot \partial_1 \boldsymbol{\phi} \times \partial_2 \boldsymbol{\phi}\; dx^1\, dx^2 \]

Instead of $\boldsymbol{\phi}$, it is often convenient to use a single complex-valued function $W$ related to $\boldsymbol{\phi}$ by the stereographic projection $W = (\phi^1 + i\phi^2)/(1 - \phi^3)$. In terms of $W$, the formula for the degree $N$ is

\[ N = \frac{i}{2\pi}\int \frac{W_1 \overline{W}_2 - W_2 \overline{W}_1}{(1 + |W|^2)^2}\; dx^1\, dx^2 \]

and the static energy is (with $z = x^1 + i x^2$)

\[ E = \int \mathcal{E}_2\, d^2x = 8\int \frac{|W_z|^2 + |W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x = 16\int \frac{|W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x + 8\int \frac{|W_z|^2 - |W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x = 16\int \frac{|W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x + 8\pi N \]

From this, one sees that $E$ satisfies the Bogomol’nyi bound $E \ge 8\pi N$, and that minimal-energy solutions correspond to solutions of the Cauchy–Riemann equations $W_{\bar z} = 0$. To have finite energy, $W(z)$ has to be a rational function, and so solutions with winding number $N$ correspond to rational meromorphic functions $W(z)$, of degree $|N|$. (If $N < 0$, then $W$ is a rational function of $\bar z$.) The energy is scale invariant (conformally invariant), and consequently these solutions are not solitons – they are not quite stable, since their size is not fixed. Adding terms $\mathcal{E}_4$ and $\mathcal{E}_0$
to the energy density fixes the soliton size, and the resulting two-dimensional Skyrme systems admit true topological solitons.

The three-dimensional case ($n = 3$), with $M$ being a simple Lie group, is the original Skyrme model of nuclear physics. If $M = SU(2)$, then the integer $N \in \pi_3(SU(2)) \cong \mathbb{Z}$ is interpreted as the baryon number. The (quantum) excitations of the $\phi$-field correspond to the pions, whereas the (semiclassical) solitons correspond to the nucleons. This model emerges as an effective theory of quantum chromodynamics (QCD), in the limit where the number of colors is large. If we express the field as a function $U(x^j)$ taking values in a Lie group, then $L_j = U^{-1}\partial_j U$ takes values in the corresponding Lie algebra, and $\mathcal{E}_2$ and $\mathcal{E}_4$ take the form

\[ \mathcal{E}_2 = -\tfrac{1}{2}\,\mathrm{tr}(L_j L_j) \]
\[ \mathcal{E}_4 = -\tfrac{1}{16}\,\mathrm{tr}\left( [L_j, L_k][L_j, L_k] \right) \]

The static energy density in the basic Skyrme system is the sum of these two terms. The static energy satisfies a Bogomol’nyi bound $E \ge 12\pi^2 |N|$, and it is believed that stable solitons (skyrmions) exist for each value of $N$. Classical skyrmions have been investigated numerically; for values of $N$ up to 25, they turn out to resemble polyhedral shells. Comparison with nucleon phenomenology requires semiclassical quantization, and this leads to results which are at least qualitatively correct.

A variant of the Skyrme model is the Skyrme–Faddeev system, which has $n = 3$ and $M = S^2$; the solitons in this case resemble loops which can be linked or knotted, and which are classified by their Hopf number $N \in \pi_3(S^2)$. In this case, the energy satisfies a lower bound of the form $E \ge cN^{3/4}$. Numerical experiments indicate that for each $N$, there is a minimal-energy solution with Hopf number $N$, and with energy close to this topological lower bound.

Abelian Higgs Vortices

Vortices live in two spatial dimensions; viewed in three dimensions, they are string-like. Two of their applications are as cosmic strings and as magnetic flux tubes in superconductors. They occur as static topological solitons in the abelian Higgs model (or Ginzburg–Landau model), and involve a magnetic field $B = \partial_1 A_2 - \partial_2 A_1$, coupled to a complex scalar field $\phi$, on the plane $\mathbb{R}^2$. The energy density is

\[ \mathcal{E} = \tfrac{1}{2}(D_j \phi)\overline{(D_j \phi)} + \tfrac{1}{2}B^2 + \tfrac{1}{8}\lambda\,(1 - |\phi|^2)^2 \qquad [3] \]

where $D_j \phi := \partial_j \phi - iA_j \phi$, and where $\lambda$ is a positive constant. The boundary conditions are

\[ D_j \phi = 0, \qquad B = 0, \qquad |\phi| = 1 \qquad [4] \]

as $r \to \infty$. If we consider a very large circle $C$ on $\mathbb{R}^2$, so that [4] holds on $C$, then $\phi|_C$ is a map from the circle $C$ to the circle of unit radius in the complex plane, and therefore it has an integer winding number $N$. Thus configurations are labeled by this vortex number $N$.

Note that if $\mathcal{E}$ vanishes, then $B = 0$ and $|\phi| = 1$: the gauge symmetry is spontaneously broken, and the photon ‘‘acquires a mass’’: this is a standard example of spontaneous symmetry breaking.

The total magnetic flux $\int B\, d^2x$ equals $2\pi N$; a proof of this is as follows. Let $\theta$ be the usual polar coordinate around $C$. Because $|\phi| = 1$ on $C$, we can write $\phi = \exp[i f(\theta)]$ for some function $f$; this $f$ need not be single-valued, but must satisfy $f(2\pi) - f(0) = 2\pi N$ with $N$ being an integer (in order that $\phi$ be single-valued). In fact, this defines the winding number. Now since $D_j \phi = \partial_j \phi - iA_j \phi = 0$ on $C$, we have

\[ A_j = -i\phi^{-1}\partial_j \phi = \partial_j f \]

on $C$. So, using Stokes’ theorem, we get

\[ \int_{\mathbb{R}^2} B\, d^2x = \oint_C A_j\, dx^j = \int_0^{2\pi} \frac{df}{d\theta}\, d\theta = 2\pi N \]

If $\lambda = 1$, then the total energy $E = \int \mathcal{E}\, d^2x$ satisfies the Bogomol’nyi bound $E \ge \pi N$; $E = \pi N$ if and only if a set of partial differential equations (the Bogomol’nyi equations) are satisfied. Since like charges repel, the magnetic force between vortices is repulsive. However, there is also a force from the Higgs field, and this is attractive. The balance between the two forces is determined by $\lambda$: if $\lambda > 1$, the vortices repel each other; whereas if $\lambda < 1$, the vortices attract. In the critical case $\lambda = 1$, the force between vortices is exactly balanced, and there exist static multi-vortex solutions. In fact, one has the following: given $N$ points in the plane, there exists an $N$-vortex solution of the Bogomol’nyi equations (and hence of the full field equations) with $\phi$ vanishing at the chosen points (and nowhere else). All static solutions are of this form. These solutions cannot, however, be written down explicitly in terms of elementary functions (except of course for $N = 0$).
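The winding number that labels these configurations can be read off from samples of $\phi$ on a large circle, by accumulating the phase increments $f(\theta_{k+1}) - f(\theta_k)$ as in the flux argument above. The sketch below is an illustration only: `winding_number` and `trial_field` are hypothetical names, the radius and sample count are arbitrary, and the trial field $\prod_k (z - z_k)$ normalized to unit modulus is not a solution of the Bogomol’nyi equations; it merely has the right topology, with zeros at the chosen points.

```python
import cmath
import math


def winding_number(phi, radius=50.0, samples=4000):
    """Winding number of phi restricted to the circle |z| = radius.

    The phase change between neighbouring sample points is summed; this
    assumes the sampling is fine enough that each increment lies in (-pi, pi).
    """
    total_phase = 0.0
    previous = phi(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        current = phi(z)
        total_phase += cmath.phase(current / previous)
        previous = current
    return round(total_phase / (2.0 * math.pi))


def trial_field(zeros):
    """Unit-modulus trial field with the topology of vortices at `zeros`."""
    def phi(z):
        value = complex(1.0, 0.0)
        for z0 in zeros:
            value *= (z - z0)
        return value / abs(value)
    return phi


zeros = [0.0, 2.0 + 1.0j, -3.0j]            # three chosen points in the plane
n_vortex = winding_number(trial_field(zeros))
flux = 2.0 * math.pi * n_vortex              # total flux 2*pi*N from the text
```

With three zeros inside the circle the computed vortex number is 3, so the flux formula gives $6\pi$; replacing $z$ by $\bar z$ in the trial field reverses the sign of $N$.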
Monopoles

The abelian Higgs model does not admit three-dimensional solitons, but a nonabelian generalization does – such nonabelian Higgs solitons are called magnetic monopoles. The field content, in the simplest version, is as follows. First, there is a gauge (Yang–Mills) field $F_{\mu\nu}$, with gauge potential $A_\mu$, and with the gauge group being a simple Lie group $G$. Second, there is a Higgs scalar field $\Phi$, transforming under the adjoint representation of $G$ (thus $\Phi$ takes values in the Lie algebra of $G$). For simplicity, $G$ is taken to be $SU(2)$ in what follows. So we may write $A_\mu = iA_\mu^a \sigma_a$, $F_{\mu\nu} = iF_{\mu\nu}^a \sigma_a$, and $\Phi = i\Phi^a \sigma_a$, where $\sigma_a$ are the Pauli matrices. The energy of static ($\partial_0 \Phi = 0 = \partial_0 A_j$), purely magnetic ($A_0 = 0$) configurations is

\[ E = \int \left[ \tfrac{1}{2} B_j^a B_j^a + \tfrac{1}{2}(D_j \Phi)^a (D_j \Phi)^a + \tfrac{1}{4}\lambda\,(1 - \Phi^a \Phi^a)^2 \right] d^3x \]

where $B_j^a = (1/2)\varepsilon_{jkl} F_{kl}^a$ is the magnetic field. The boundary conditions are $B_j^a \to 0$ and $\Phi^a \Phi^a \to 1$ as $r \to \infty$; so $\Phi$ restricted to a large spatial 2-sphere becomes a map from $S^2$ to the unit 2-sphere in the Lie algebra $su(2)$, and as such it has a degree $N \in \mathbb{Z}$. An analytic expression for $N$ is

\[ \int B_j^a (D_j \Phi)^a\, d^3x = -2\pi N \qquad [5] \]

At long range, the field resembles an isolated magnetic pole (a Dirac magnetic monopole), with magnetic charge $2\pi N$. Asymptotically, the $SU(2)$ gauge symmetry is spontaneously broken to $U(1)$, which is interpreted as the electromagnetic gauge group.

In 1974, it was observed that this system admits a smooth, finite-energy, stable, spherically symmetric $N = 1$ solution – this is the ’t Hooft–Polyakov monopole. There is a Bogomol’nyi lower bound on the energy $E$: from $0 \le (B + D\Phi)^2 = B^2 + (D\Phi)^2 + 2B \cdot D\Phi$, we get

\[ E \ge 2\pi N + \int \tfrac{1}{4}\lambda\,(1 - \Phi^a \Phi^a)^2\, d^3x \qquad [6] \]

where [5] has been used. The inequality [6] is saturated if and only if the Prasad–Sommerfield limit $\lambda = 0$ is used, and the Bogomol’nyi equations

\[ (D_j \Phi)^a = -B_j^a \qquad [7] \]

hold. The corresponding solitons are called Bogomol’nyi–Prasad–Sommerfield (BPS) monopoles.

The Bogomol’nyi equations [7], together with the boundary conditions described above, form a completely integrable elliptic system of partial differential equations. For any positive integer $N$, the space of BPS monopoles of charge $N$, with gauge freedom factored out, is parametrized by a $(4N - 1)$-dimensional manifold $M_N$. This is the moduli space of $N$ monopoles. Roughly speaking, each monopole has a position in space (three parameters) plus a phase (one parameter), making a total of $4|N|$ parameters; an overall phase can be removed by a gauge transformation, leaving $(4|N| - 1)$ parameters. In fact, it is often useful to retain the overall phase, and to work with the corresponding $4|N|$-dimensional manifold $\widetilde{M}_N$. This manifold has a natural metric, which corresponds to the expression for the kinetic energy of the system. A point in $\widetilde{M}_N$ represents an $N$-monopole configuration, and the slow-motion dynamics of $N$ monopoles corresponds to geodesics on $\widetilde{M}_N$; this is the geodesic approximation of monopole dynamics.

The $N = 1$ monopole is spherically symmetric, and the corresponding fields take a simple form; for example, the Higgs field of a 1-monopole located at $r = 0$ is

\[ \Phi^a = \left( \frac{\coth(2r)}{r} - \frac{1}{2r^2} \right) x^a \]

For $N > 1$, the expressions tend to be less explicit; but monopole solutions can nevertheless be characterized in a fairly complete way. The Bogomol’nyi equations [7] are a dimensional reduction of the self-dual Yang–Mills equations in $\mathbb{R}^4$, and BPS monopoles correspond to holomorphic vector bundles over a certain two-dimensional complex manifold (‘‘mini-twistor space’’). This leads to various other characterizations of monopole solutions, for example, in terms of certain curves (‘‘spectral curves’’) on mini-twistor space, and in terms of solutions of a set of ordinary differential equations called the Nahm equations. Having all these descriptions enables one to deduce much about the monopole moduli space, and to characterize many monopole solutions. In particular, there are explicit solutions of the Nahm equations involving elliptic functions, which correspond to monopoles with certain discrete symmetries, such as a 3-monopole with tetrahedral symmetry, and a 4-monopole with the appearance and symmetries of a cube.

Yang–Mills Instantons

Consider gauge fields in four-dimensional Euclidean space $\mathbb{R}^4$, with gauge group $G$. For simplicity, in what follows, $G$ is taken to be $SU(2)$; one can extend much of the structure to more general groups, for example, the simple Lie groups. Let $A_\mu$ and $F_{\mu\nu}$
denote the gauge potential and gauge field. The Yang–Mills action is

\[ S = -\frac{1}{4}\int \mathrm{tr}\left( F_{\mu\nu} F_{\mu\nu} \right) d^4x \qquad [8] \]

where we assume a boundary condition, at infinity in $\mathbb{R}^4$, such that this integral converges. The Euler–Lagrange equations which describe critical points of the functional $S$ are the Yang–Mills equations

\[ D_\mu F_{\mu\nu} = 0 \qquad [9] \]

Finite-action Yang–Mills fields are called instantons. The Euclidean action [8] is used in the path-integral approach to quantum gauge field theory; therefore, instantons are crucial in understanding the path integral.

The dual of the field tensor $F_{\mu\nu}$ is

\[ {*F}_{\mu\nu} = \tfrac{1}{2}\varepsilon_{\mu\nu\alpha\beta} F_{\alpha\beta} \]

The gauge field is self-dual if ${*F}_{\mu\nu} = F_{\mu\nu}$, and anti-self-dual if ${*F}_{\mu\nu} = -F_{\mu\nu}$. In view of the Bianchi identity $D_\mu {*F}_{\mu\nu} = 0$, any self-dual or anti-self-dual gauge field is automatically a solution of the Yang–Mills equations [9]. This fact also follows from the discussion below, where we see that self-dual instantons give local minima of the action.

The Yang–Mills action (and Yang–Mills equations) are conformally invariant; any finite-action solution of the Yang–Mills equations on $\mathbb{R}^4$ extends smoothly to the conformal compactification $S^4$. Gauge fields on $S^4$, with gauge group $SU(2)$, are classified topologically by an integer $N$, namely, the second Chern number

\[ N = c_2 = -\frac{1}{8\pi^2}\int \mathrm{tr}\left( F_{\mu\nu}\, {*F}_{\mu\nu} \right) d^4x \qquad [10] \]

From [8] and [10] a topological lower bound on the action is given as follows:

\[ 0 \le -\int \mathrm{tr}\left[ \left( F_{\mu\nu} - {*F}_{\mu\nu} \right)\left( F_{\mu\nu} - {*F}_{\mu\nu} \right) \right] d^4x = 8S - 16\pi^2 N \]

and so $S \ge 2\pi^2 N$, with equality if and only if the field is self-dual. If $N < 0$, we get $S \ge 2\pi^2 |N|$, with equality if and only if $F$ is anti-self-dual. So the self-dual (or anti-self-dual) fields minimize the action in each topological class.

For the remainder of this section, we restrict to self-dual instantons with instanton number $N > 0$. The space (moduli space) of such instantons, with gauge equivalence factored out, is an $(8N - 3)$-dimensional real manifold. In principle, all these gauge fields can be constructed using algebraic-geometry (twistor) methods: instantons correspond to holomorphic vector bundles over complex projective 3-space (twistor space). One large class of solutions which can be written out explicitly is as follows: for $N = 1$ and $N = 2$ it gives all instantons, while for $N \ge 3$ it gives a $(5N + 4)$-dimensional subfamily of the full $(8N - 3)$-dimensional solution space. The gauge potentials in this class have the form

\[ A_\mu = i\sigma_{\mu\nu}\, \partial_\nu \log\phi \qquad [11] \]

where the $\sigma_{\mu\nu}$ are constant matrices (antisymmetric in $\mu\nu$) defined in terms of the Pauli matrices $\sigma_a$ by

\[ \sigma_{10} = \sigma_{23} = \tfrac{1}{2}\sigma_1, \qquad \sigma_{20} = \sigma_{31} = \tfrac{1}{2}\sigma_2, \qquad \sigma_{30} = \sigma_{12} = \tfrac{1}{2}\sigma_3 \]

The real-valued function $\phi = \phi(x^\mu)$ is a solution of the four-dimensional Laplace equation given by

\[ \phi(x) = \sum_{k=0}^{N} \frac{\lambda_k}{(x^\mu - x_k^\mu)(x^\mu - x_k^\mu)} \]

where the $x_k^\mu$ are $N + 1$ distinct points in $\mathbb{R}^4$, and the $\lambda_k$ are $N + 1$ positive constants: a total of $5N + 5$ parameters. It is clear from [11] that the overall scale of $\phi$ is irrelevant, leaving a $(5N + 4)$-parameter family. For $N = 1$ and $N = 2$, symmetries reduce the parameter count further, to 5 and 13, respectively. Although $\phi$ has poles at the points $x = x_k$, the gauge potentials are smooth (possibly after a gauge transformation).

Finally, it is worth noting that (as one might expect) there is a gravitational analog of the gauge-theoretic structures described here. In other words, one has self-dual gravitational instantons – these are four-dimensional Riemannian spaces for which the conformal-curvature tensor (the Weyl tensor) is self-dual, and the Ricci tensor satisfies Einstein’s equations $R_{\mu\nu} = \Lambda g_{\mu\nu}$. As before, such spaces can be constructed using a twistor-geometrical correspondence.

Q-Balls

A Q-ball (or nontopological soliton) is a soliton which has a periodic time dependence in a degree of freedom which corresponds to a global symmetry. The simplest class of Q-ball systems involves a complex scalar field $\phi$, with an invariance under the constant phase transformation $\phi \mapsto e^{i\theta}\phi$; the Q-balls are soliton solutions of the form

\[ \phi(t, x) = e^{i\omega t}\psi(x) \qquad [12] \]
where $\psi(x)$ is a complex scalar field depending only on the spatial variables $x$. The best-known case is the 1-soliton solution

\[ \phi(t, x) = a\sqrt{2}\, \exp(i a^2 t)\, \mathrm{sech}(ax) \]

of the nonlinear Schrödinger equation $i\phi_t + \phi_{xx} + \phi|\phi|^2 = 0$.

More generally, consider a system (in $n$ spatial dimensions) with Lagrangian

\[ \mathcal{L} = \tfrac{1}{2}(\partial_\mu \phi)(\partial^\mu \bar\phi) - U(|\phi|) \]

where $\phi(x^\mu)$ is a complex-valued field. Associated with the global phase symmetry is the conserved Noether charge $Q = \int \mathrm{Im}(\bar\phi\, \phi_t)\, d^n x$. Minimizing the energy of a configuration subject to $Q$ being fixed implies that $\phi$ has the form [12]. Without loss of generality, we may take $\omega \ge 0$. Note that $Q = \omega I$, where $I = \int |\psi|^2\, d^n x$. The energy of a configuration of the form [12] is $E = E_q + E_k + E_p$, where

\[ E_q = \frac{1}{2}\int |\partial_j \psi|^2\, d^n x \]
\[ E_k = \tfrac{1}{2} I \omega^2 = \tfrac{1}{2} Q^2 / I \]
\[ E_p = \int U(|\psi|)\, d^n x \]

Let us take $U(0) = 0 = U'(0)$, with the field satisfying the boundary condition $\psi \to 0$ as $r \to \infty$.

A stationary Q-lump is a critical point of the energy functional $E[\psi]$, subject to $Q$ having some fixed value. The usual (Derrick) scaling argument shows that any stationary Q-lump must satisfy

\[ (2 - n)E_q - nE_p + nE_k = 0 \qquad [13] \]

For simplicity, in what follows, let us take $n \ge 3$. Define $m > 0$ by $U''(0) = m^2$; then, near spatial infinity, the Euler–Lagrange equations give $\nabla^2 \psi - (m^2 - \omega^2)\psi = 0$. So, in order to satisfy the boundary condition $\psi \to 0$ as $r \to \infty$, we need $\omega < m$.

It is clear from [13] that if $U \ge (1/2)m^2|\psi|^2$ everywhere, then there can be no solution. So $K = \min[2U(|\psi|)/|\psi|^2]$ has to satisfy $K < m^2$. Also, we have

\[ E_p = \int U \ge \tfrac{1}{2} K I = (K/\omega^2)E_k > (K/\omega^2)E_p \qquad [14] \]

where the final inequality comes from [13]. As a consequence, we see that $\omega^2$ is restricted to the range

\[ K < \omega^2 < m^2 \qquad [15] \]

An example which has been studied in some detail is $U(f) = f^2[1 + (1 - f^2)^2]$; here $m^2 = 4$ and $K = 2$, so the range of frequency for Q-balls in this system is $\sqrt{2} < \omega < 2$. The dynamics of Q-balls in systems such as these turns out to be quite complicated.

See also: Abelian Higgs Vortices; Homoclinic Phenomena; Integrable Systems: Overview; Instantons: Topological Aspects; Noncommutative Geometry from Strings; Sine-Gordon Equation; Topological Defects and Their Homotopy Classification.

Further Reading

Atiyah MF and Hitchin NJ (1988) The Geometry and Dynamics of Magnetic Monopoles. Princeton: Princeton University Press.
Coleman S (1988) Aspects of Symmetry. Cambridge: Cambridge University Press.
Drazin PG and Johnson RS (1989) Solitons: An Introduction. Cambridge: Cambridge University Press.
Goddard P and Mansfield P (1986) Topological structures in field theories. Reports on Progress in Physics 49: 725–781.
Jaffe A and Taubes C (1980) Vortices and Monopoles. Boston: Birkhäuser.
Lee TD and Pang Y (1992) Nontopological solitons. Physics Reports 221: 251–350.
Makhankov VG, Rubakov YP, and Sanyuk VI (1993) The Skyrme Model: Fundamentals, Methods, Applications. Berlin: Springer.
Manton NS and Sutcliffe PM (2004) Topological Solitons. Cambridge: Cambridge University Press.
Rajaraman R (1982) Solitons and Instantons. New York: North-Holland.
Rebbi C and Soliani G (1984) Solitons and Particles. Singapore: World Scientific.
Vilenkin A and Shellard EPS (1994) Cosmic Strings and Other Cosmological Defects. Cambridge: Cambridge University Press.
Ward RS and Wells RO Jr. (1990) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Zakrzewski WJ (1989) Low Dimensional Sigma Models. Bristol: IOP.
we have
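As a rough numerical check of the Q-ball example discussed above, the following sketch estimates $m^2 = U''(0)$ and $K = \min[2U(f)/f^2]$ for the potential $U(f) = f^2[1 + (1 - f^2)^2]$ and prints the resulting frequency window [15]. The helper names, the finite-difference step, and the search grid are illustrative assumptions, not part of the article.

```python
import math


def U(f):
    """Example potential from the text: U(f) = f^2 [1 + (1 - f^2)^2]."""
    return f * f * (1.0 + (1.0 - f * f) ** 2)


def second_derivative_at_zero(func, h=1e-4):
    """Central-difference estimate of func''(0); h is an assumed step size."""
    return (func(h) - 2.0 * func(0.0) + func(-h)) / (h * h)


def min_ratio(func, f_max=3.0, steps=30001):
    """Grid estimate of K = min over 0 < f <= f_max of 2 U(f) / f^2."""
    best = float("inf")
    for i in range(1, steps):
        f = f_max * i / (steps - 1)
        best = min(best, 2.0 * func(f) / (f * f))
    return best


m_squared = second_derivative_at_zero(U)  # close to 4
K = min_ratio(U)                          # minimum is attained at f = 1
omega_min, omega_max = math.sqrt(K), math.sqrt(m_squared)
print(f"m^2 ~ {m_squared:.6f}, K ~ {K:.6f}")
print(f"Q-ball window: {omega_min:.6f} < omega < {omega_max:.6f}")
```

The grid search is only a sanity check; for this polynomial potential the minimum of $2U/f^2 = 2[1 + (1 - f^2)^2]$ can also be read off directly at $f = 1$.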

Source Coding in Quantum Information Theory


N Datta, University of Cambridge, Cambridge, UK
T C Dorlas, Dublin Institute for Advanced Studies, Dublin, Republic of Ireland

© 2006 Elsevier Ltd. All rights reserved.

Introduction

Two key issues of classical and quantum information theory are storage and transmission of information. An information source produces some outputs (or signals) more frequently than others. Due to this redundancy, one can reduce the amount of space needed for its storage without compromising on its content. This data compression is done by a suitable encoding of the output of the source. In contrast, in the transmission of information through a channel, it is often advantageous to add redundancy to a message, in order to combat the effects of noise. This is done in the form of error-correcting codes. The amount of redundancy which needs to be added to the original message depends on how much noise is present in the channel (see, e.g., Nielsen and Chuang (2000)). Hence, redundancy plays complementary roles in data compression and transmission of data through a noisy channel. In this review we focus only on data compression in quantum information theory.

In classical information theory, Shannon showed that there is a natural limit to the amount of compression that can be achieved. It is given by the Shannon entropy. The analogous concept in quantum information theory is the von Neumann entropy. Here, we review some of the main results of quantum data compression and the significance of the von Neumann entropy in this context.

The review is structured as follows. We first give a brief introduction to the Shannon entropy and classical data compression. This is followed by a discussion of quantum entropy and the idea behind quantum source coding. We elaborate on data compression schemes for three different classes of quantum sources, namely memoryless sources, ergodic sources, and sources modeled by Gibbs states of quantum spin systems. In the bulk of the review, we concentrate on source-dependent, fixed-length coding schemes. We conclude with a brief discussion of universal and variable-length coding.

We would like to point out that this review article is by no means complete. Due to a restriction on its length, we had to leave out various important aspects and developments of quantum source coding.

Classical Data Compression

Entropy and Source Coding

A simple model of a classical information source consists of a sequence of discrete random variables $X_1, X_2, \dots, X_n$, whose values represent the output of the source. Each random variable $X_i$, $1 \le i \le n$, takes values $x_i$ from a finite set, the source alphabet $\mathcal{X}$. Hence, $X^{(n)} := (X_1, \dots, X_n)$ takes values $x^{(n)} := (x_1, \dots, x_n) \in \mathcal{X}^n$. We recall the definition of entropy (or information content) of a source.

If the discrete random variables $X_1, \dots, X_n$ which take values from a finite alphabet $\mathcal{X}$ have joint probabilities

$$P(X_1 = x_1, \dots, X_n = x_n) = p_n(x_1, \dots, x_n)$$

then the Shannon entropy of this source is defined by

$$H(X_1, \dots, X_n) = -\sum_{x_1 \in \mathcal{X}} \cdots \sum_{x_n \in \mathcal{X}} p_n(x_1, \dots, x_n) \log p_n(x_1, \dots, x_n) \qquad [1]$$

Here and in the following, the logarithm is taken to the base 2. This is because the fundamental unit of classical information is a "bit," which takes two values 0 and 1. Notice that $H(X_1, \dots, X_n)$ in fact only depends on the (joint) probability mass function (p.m.f.) $p_n$ and can also be denoted as $H(p_n)$. There are several other concepts of entropy, for example, relative entropy, conditional entropy, and mutual information. See, for example, Cover and Thomas (1991) and Nielsen and Chuang (2000). It is easy to see that

1. $0 \le H(X_1, \dots, X_n) \le n \log |\mathcal{X}|$, where $|\mathcal{X}|$ denotes the number of letters in the alphabet $\mathcal{X}$. Two other important properties are as follows:
2. $H(X_1, \dots, X_n)$ is jointly concave in $X_1, \dots, X_n$ and
3. $H(X_1, \dots, X_n) \le H(X_1, \dots, X_m) + H(X_{m+1}, \dots, X_n)$ for $m < n$.

The latter property is called subadditivity.

In the next section, analogous quantities are introduced for quantum information and the corresponding properties are stated.

Suppose that the random variables $X_1, X_2, \dots, X_n$ are independent and identically distributed (i.i.d.). Then the entropy of each random variable modeling the source is the same and can be denoted by $H(X)$. From the point of view of classical information theory, the Shannon entropy has an important operational definition. It quantifies the minimal

physical resources needed to store data from a classical information source and provides a limit to which data can be compressed reliably (i.e., in a manner in which the original data can be recovered later with a low probability of error). Shannon showed that the original data can be reliably obtained from the compressed version only if the rate of compression is greater than the Shannon entropy. This result is formulated in Shannon's noiseless channel coding theorem (Shannon 1948, Cover and Thomas 1991, Nielsen and Chuang 2000) given later.

The Asymptotic Equipartition Property

The main idea behind Shannon's noiseless channel coding theorem is to divide the possible values $x_1, x_2, \dots, x_n$ of random variables $X_1, \dots, X_n$ into two classes – one consisting of sequences which have a high probability of occurrence, known as "typical sequences," and the other consisting of sequences which occur rarely, known as "atypical sequences." The idea is that there are far fewer typical sequences than the total number of possible sequences, but they occur with high probability. The existence of typical sequences follows from the so-called "asymptotic equipartition property":

Theorem 1 (AEP). If $X_1, X_2, X_3, \dots$ are i.i.d. random variables with p.m.f. $p(x)$, then

$$-\frac{1}{n} \log p_n(X_1, \dots, X_n) \xrightarrow{P} H(X) \qquad [2]$$

where $H(X)$ is the Shannon entropy for a single variable, and $p_n(X_1, \dots, X_n)$ denotes the random variable taking values $p_n(x_1, \dots, x_n) = \prod_{i=1}^{n} p(x_i)$ with probabilities $p_n(x_1, \dots, x_n)$.

This theorem has been generalized to the case of sequences of dependent variables $(X_n)_{n \in \mathbb{Z}}$ which are ergodic for the shift transformation defined below. It is easiest to formulate this for an information stream which extends from $-\infty$ to $+\infty$:

Definition A sequence $(X_n)_{n \in \mathbb{Z}}$ is called "stationary" if for any $n_1 < n_2$ and any $x_{n_1}, \dots, x_{n_2} \in \mathcal{X}$,

$$P(X_{n_1} = x_{n_1}, \dots, X_{n_2} = x_{n_2}) = P(X_{n_1+1} = x_{n_1}, \dots, X_{n_2+1} = x_{n_2})$$

We define the shift transformation $\tau$ by

$$\tau\big((x_n)_{n \in \mathbb{Z}}\big) = (x'_n)_{n \in \mathbb{Z}}, \qquad x'_n = x_{n-1} \qquad [3]$$

Then $(X_n)_{n \in \mathbb{Z}}$ is called "ergodic" if it is stationary and if every subset $A \subset \mathcal{X}^{\mathbb{Z}}$ such that $\tau(A) = A$ has probability 0 or 1, that is, $P((X_n)_{n \in \mathbb{Z}} \in A) = 0$ or 1.

It is known that $(X_n)_{n \in \mathbb{Z}}$ is ergodic if and only if its probability distribution is extremal in the set of invariant probability measures. The generalization of Theorem 1 (McMillan 1953, Breiman 1957) now reads:

Theorem 2 (Shannon–McMillan–Breiman theorem). Suppose that the sequence $(X_n)_{n \in \mathbb{Z}}$ is ergodic. Then

$$\lim_{n \to \infty} \left[-\frac{1}{n} \log p_n(X_1, \dots, X_n)\right] = h_{KS} \quad \text{with probability 1} \qquad [4]$$

where $h_{KS}$ is the Kolmogorov–Sinai entropy defined by

$$h_{KS} = \lim_{n \to \infty} \frac{1}{n} H(X_1, \dots, X_n) = \inf_{n} \frac{1}{n} H(X_1, \dots, X_n) \qquad [5]$$

Remark. It follows from the subadditivity property (3) above that the sequence $(1/n)H(p_n)$ is decreasing, and it is obviously bounded below by 0.

We now define the set of typical sequences (or more precisely, $\epsilon$-typical sequences) as follows:

Definition Let $X_1, \dots, X_n$ be i.i.d. random variables with p.m.f. $p(x)$. Given $\epsilon > 0$, the $\epsilon$-typical set $T_\epsilon^{(n)}$ is the set of sequences $(x_1 \dots x_n)$ for which

$$2^{-n(H(X)+\epsilon)} \le p(x_1 \dots x_n) \le 2^{-n(H(X)-\epsilon)} \qquad [6]$$

In the case of an ergodic sequence, $H(X)$ is replaced by $h_{KS}$ in [6].

Let $|T_\epsilon^{(n)}|$ denote the total number of typical sequences and $P\{T_\epsilon^{(n)}\}$ denote the probability of the typical set. Then the following is an easy consequence of Theorem 1.

Theorem 3 (Theorem of typical sequences). For any $\delta > 0$ $\exists\, n_0(\delta) > 0$ such that $\forall n \ge n_0(\delta)$ the following hold:

(i) $P\{T_\epsilon^{(n)}\} > 1 - \delta$ and
(ii) $(1 - \delta)\,2^{n(H(X)-\epsilon)} \le |T_\epsilon^{(n)}| \le 2^{n(H(X)+\epsilon)}$

Shannon's Noiseless Channel Coding Theorem

Shannon's noiseless channel coding theorem is a simple application of the theorem of typical sequences and says that the optimal rate at which one can reliably compress data from an i.i.d. classical information source is given by the Shannon entropy $H(X)$ of the source.

A "compression scheme" $C_n$ of rate $R$ maps possible sequences $x = (x_1, \dots, x_n)$ to a binary string of length $\lceil nR \rceil$: $C_n : x \mapsto y = (y_1, \dots, y_{\lceil nR \rceil})$, where $x_i \in \mathcal{X}$; $|\mathcal{X}| = d$ and $y_i \in \{0, 1\}$ $\forall\, 1 \le i \le \lceil nR \rceil$. The corresponding decompression scheme takes the $\lceil nR \rceil$

compressed bits and maps them back to a string of $n$ letters from the alphabet $\mathcal{X}$: $D_n : y \in \{0, 1\}^{\lceil nR \rceil} \mapsto x' = (x'_1, \dots, x'_n)$. A compression–decompression scheme is said to be "reliable" if the probability that $x' \ne x$ tends to 0 as $n \to \infty$. Shannon's noiseless channel coding theorem (Shannon 1948, Cover and Thomas 1991) now states

Theorem 4 (Shannon). Suppose that $\{X_i\}$ is an i.i.d. information source, with $X_i \sim p(x)$ and Shannon entropy $H(X)$. If $R > H(X)$ then there exists a reliable compression scheme of rate $R$ for the source. Conversely, any compression scheme with rate $R < H(X)$ is not reliable.

Proof (sketch). Suppose $R > H(X)$. Choose $\epsilon > 0$ such that $H(X) + \epsilon < R$. Consider the set $T_\epsilon^{(n)}$ of typical sequences. The method of compression is then to examine the output of the source, to see if it belongs to $T_\epsilon^{(n)}$. If the output is a typical sequence, then we compress the data by simply storing an index for the particular sequence using $\lceil nR \rceil$ bits in the obvious way. If the input string is not typical, then we compress the string to some fixed $\lceil nR \rceil$-bit string, for example, $(00 \dots 000)$. In this case, data compression effectively fails, but, in spite of this, the compression–decompression scheme succeeds with probability tending to 1 as $n \to \infty$, since by Theorem 3 the probability of atypical sequences can be made small by choosing $n$ large enough.

If $R < H(X)$, then any compression scheme of rate $R$ is not reliable. This also follows from Theorem 3 by the following argument. Let $S(n)$ be a collection of sequences $x^{(n)}$ of size $|S(n)| \le 2^{\lceil nR \rceil}$. Then the subset of atypical sequences in $S(n)$ is highly improbable, whereas the corresponding subset of typical sequences has probability bounded by $2^{nR}\, 2^{-nH(X)} \to 0$ as $n \to \infty$. ∎

Quantum Data Compression

Quantum Sources and Entropy

In quantum information processing systems, information is stored in quantum states of physical systems. The most general description of a quantum state is provided by a density matrix.

A "density matrix" $\rho$ is a positive semidefinite operator on a Hilbert space $\mathcal{H}$, with $\mathrm{tr}\,\rho = 1$, and the expected value of an operator $A$ on $\mathcal{H}$ is given by

$$\varphi(A) = \mathrm{tr}(\rho A) \qquad [7]$$

The functional $\varphi$ on $M = B(\mathcal{H})$, the algebra of linear operators on $\mathcal{H}$, is positive (i.e., $\varphi(A) \ge 0$, if $A \ge 0$) and maps the identity $I \in M$ to 1. Such a functional is also called a state. Conversely, given such a state on a finite-dimensional algebra $M$, there exists a unique density matrix $\rho_\varphi$ such that [7] holds, so the concepts can be used interchangeably. (This is not true in the infinite-dimensional case.)

The quantum analog of the Shannon entropy is called the von Neumann entropy. For any quantum state $\varphi$ (or equivalently $\rho_\varphi$), it is defined by

$$S(\varphi) \equiv S(\rho_\varphi) := -\mathrm{tr}\, \rho_\varphi \log \rho_\varphi \qquad [8]$$

Here we use $\log$ to denote $\log_2$ and define $0 \log 0 \equiv 0$, as for the Shannon entropy. Let the density matrix $\rho_\varphi$ have a spectral decomposition

$$\rho_\varphi = \sum_{i=1}^{d} \lambda_i\, |\psi_i\rangle\langle\psi_i| \qquad [9]$$

Here $\{|\psi_i\rangle\}$ is the set of eigenvectors of $\rho_\varphi$. They form an orthonormal basis of the Hilbert space $\mathcal{H}$. By the fact that $\rho_\varphi$ is positive semidefinite and has trace 1, the eigenvalues $\lambda_i$ of $\rho_\varphi$ determine a probability distribution. When expressed in terms of the $\lambda_i$, the von Neumann entropy of $\rho_\varphi$ reduces to the Shannon entropy corresponding to this probability distribution (henceforth, the subscript $\varphi$ of $\rho_\varphi$ will be omitted): $S(\rho) = H(\lambda)$, where $\lambda = \{\lambda_1, \dots, \lambda_d\}$.

The von Neumann entropy has properties analogous to $H(X_1, \dots, X_n)$, in particular (Ohya and Petz 1993, Nielsen and Chuang 2000)

1. $0 \le S(\rho) \le \log(\dim(\mathcal{H}))$;
2. $S(\rho)$ is concave in $\rho$; and
3. if $\rho$ is a state on $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ then $S(\rho) \le S(\rho_1) + S(\rho_2)$ if $\rho_1$ and $\rho_2$ are the restrictions of $\rho$ to $\mathcal{H}_1 \otimes I$ and $I \otimes \mathcal{H}_2$ respectively.

A "quantum information source" in general is defined by a sequence of density matrices $\rho^{(n)}$ on Hilbert spaces $\mathcal{H}_n$ of increasing dimensions $N_n$ given by a decomposition

$$\rho^{(n)} = \sum_k p_k^{(n)}\, |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \qquad [10]$$

where the states $|\Psi_k^{(n)}\rangle$ are interpreted as the signal states, and the numbers $p_k^{(n)} \ge 0$ with $\sum_k p_k^{(n)} = 1$, as their probabilities of occurrence. The vectors $|\Psi_k^{(n)}\rangle \in \mathcal{H}_n$ need not be mutually orthogonal.

Compression–Decompression Scheme and Fidelity

To compress data from such a source one encodes each signal state $|\Psi_k^{(n)}\rangle$ by a state $\tilde{\rho}_k^{(n)} \in B(\tilde{\mathcal{H}}_n)$ where $\dim \tilde{\mathcal{H}}_n = d_c(n) < N_n$. Thus, a compression scheme is a map $C^{(n)} : |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \mapsto \tilde{\rho}_k^{(n)} \in B(\tilde{\mathcal{H}}_n)$. The state $\tilde{\rho}_k^{(n)}$ is referred to as the compressed state. A corresponding decompression scheme is a map $D^{(n)} : B(\tilde{\mathcal{H}}_n) \to B(\mathcal{H}_n)$. Both $C^{(n)}$ and $D^{(n)}$ must be

completely positive maps. In particular, this implies that $D^{(n)}$ must be of the form

$$D^{(n)}(\tilde{\rho}) = \sum_i D_i\, \tilde{\rho}\, D_i^\dagger \qquad [11]$$

for linear operators $D_i : \tilde{\mathcal{H}}_n \to \mathcal{H}_n$ such that $\sum_i D_i^\dagger D_i = I$ (see Nielsen and Chuang 2000).

Obviously, in order to achieve the maximum possible compression of Hilbert space dimensions per signal state, the goal must be to make the dimension $d_c(n)$ as small as possible, subject to the condition that the information carried in the signal states can be retrieved with high accuracy upon decompression.

The "rate of compression" is defined as

$$R_n := \frac{\log(\dim \tilde{\mathcal{H}}_n)}{\log(\dim \mathcal{H}_n)} = \frac{\log d_c(n)}{\log N_n}$$

It is natural to consider the original Hilbert space $\mathcal{H}_n$ to be the $n$-qubit space. In this case $N_n = 2^n$ and hence $\log N_n = n$. As in the case of classical data compression, we are interested in finding the optimal limiting rate of data compression, which in this case is given by

$$R_\infty := \lim_{n \to \infty} \frac{\log d_c(n)}{n} \qquad [12]$$

Unlike classical signals, quantum signal states are not completely distinguishable. This is because they are, in general, not mutually orthogonal. As a result, perfectly reconstructing a quantum signal state from its compressed version is often an impossible task and therefore too stringent a requirement for the reliability of a compression–decompression scheme. Instead, a reasonable requirement is that a state can be reconstructed from the compressed version which is nearly indistinguishable from the original signal state. A measure of indistinguishability useful for this purpose is the average fidelity defined as follows:

$$F_n := \sum_k p_k^{(n)}\, \langle\Psi_k^{(n)}|\, D^{(n)}(\tilde{\rho}_k^{(n)})\, |\Psi_k^{(n)}\rangle \qquad [13]$$

This fidelity satisfies $0 \le F_n \le 1$ and $F_n = 1$ if and only if $D^{(n)}(\tilde{\rho}_k^{(n)}) = |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|$ for all $k$. A compression–decompression scheme is said to be reliable if $F_n \to 1$ as $n \to \infty$.

The key idea behind data compression is the fact that some signal states have a higher probability of occurrence than others (these states playing a role analogous to the typical sequences of classical information theory). These signal states span a subspace of the original Hilbert space of the source, which is referred to as the typical subspace.

Schumacher's Theorem for Memoryless Quantum Sources

The notion of a typical subspace was first introduced in the context of quantum information theory by Schumacher (1995) in his seminal paper. He considered the simplest class of quantum information sources, namely quantum memoryless or i.i.d. sources. For such a source the density matrix $\rho^{(n)}$, defined through [10], acts on a tensor product Hilbert space $\mathcal{H}_n = \mathcal{H}^{\otimes n}$ and is itself given by a tensor product

$$\rho^{(n)} = \rho^{\otimes n} \qquad [14]$$

Here $\mathcal{H}$ is a fixed Hilbert space (representing an elementary quantum subsystem) and $\rho$ is a density matrix acting on $\mathcal{H}$; for example, $\mathcal{H}$ can be a single qubit Hilbert space, in which case $\dim \mathcal{H} = 2$, $\mathcal{H}_n$ is the Hilbert space of $n$ qubits and $\rho$ is the density matrix of a single qubit. If the spectral decomposition of $\rho$ is given by

$$\rho = \sum_{i=1}^{\dim \mathcal{H}} q_i\, |\phi_i\rangle\langle\phi_i| \qquad [15]$$

then the eigenvalues and eigenvectors of $\rho^{(n)}$ are given by

$$\lambda_k^{(n)} = q_{k_1} q_{k_2} \cdots q_{k_n} \qquad [16]$$

and

$$|\psi_k^{(n)}\rangle = |\phi_{k_1}\rangle \otimes |\phi_{k_2}\rangle \otimes \cdots \otimes |\phi_{k_n}\rangle \qquad [17]$$

Thus, we can write the spectral decomposition of the density matrix $\rho^{(n)}$ of an i.i.d. source as

$$\rho^{(n)} = \sum_k \lambda_k^{(n)}\, |\psi_k^{(n)}\rangle\langle\psi_k^{(n)}| \qquad [18]$$

where the sum is over all possible sequences $k = (k_1 \dots k_n)$, with each $k_i$ taking $(\dim \mathcal{H})$ values. Hence, we see that the eigenvalues $\lambda_k^{(n)}$ are labeled by a classical sequence of indices $k = k_1 \dots k_n$.

The von Neumann entropy of such a source is given by

$$S(\rho^{(n)}) \equiv S(\rho^{\otimes n}) = nS(\rho) = nH(X) \qquad [19]$$

where $X$ is the classical random variable with probability distribution $\{q_i\}$.

Let $T_\epsilon^{(n)}$ be the classical typical subset of indices $(k_1 \dots k_n)$ for which

$$\left| -\frac{1}{n} \log\big(q_{k_1} \cdots q_{k_n}\big) - S(\rho) \right| \le \epsilon \qquad [20]$$

as in the theorem of typical sequences. Defining $\mathcal{T}_\epsilon^{(n)}$ as the space spanned by the eigenvectors $|\psi_k^{(n)}\rangle$

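with $k \in T_\epsilon^{(n)}$ then immediately yields the quantum analog of the theorem of typical sequences – Theorem 4 given below. We refer to $\mathcal{T}_\epsilon^{(n)}$ as the typical subspace (or more precisely, the $\epsilon$-typical subspace).

Theorem 4 (Typical subspace theorem). Fix $\epsilon > 0$. Then for any $\delta > 0$ $\exists\, n_0(\delta) > 0$ such that $\forall n \ge n_0(\delta)$ and $\rho^{(n)} = \rho^{\otimes n}$, the following are true:

(i) $\mathrm{tr}(P_\epsilon^{(n)} \rho^{(n)}) > 1 - \delta$ and
(ii) $(1 - \delta)\,2^{n(S(\rho)-\epsilon)} \le \dim(\mathcal{T}_\epsilon^{(n)}) \le 2^{n(S(\rho)+\epsilon)}$, where $P_\epsilon^{(n)}$ is the orthogonal projection onto the subspace $\mathcal{T}_\epsilon^{(n)}$.

Note that $\mathrm{tr}(P_\epsilon^{(n)} \rho^{(n)})$ gives the probability of the typical subspace. As $\mathrm{tr}(P_\epsilon^{(n)} \rho^{(n)})$ approaches unity for $n$ sufficiently large, $\mathcal{T}_\epsilon^{(n)}$ carries almost all the weight of $\rho^{(n)}$. Let $\mathcal{T}_\epsilon^{(n)\perp}$ denote the orthocomplement of the typical subspace, that is, for any pair of vectors $|\psi\rangle \in \mathcal{T}_\epsilon^{(n)}$ and $|\phi\rangle \in \mathcal{T}_\epsilon^{(n)\perp}$, $\langle\phi|\psi\rangle = 0$. It follows from the above theorem that the probability of a signal state belonging to $\mathcal{T}_\epsilon^{(n)\perp}$ can be made arbitrarily small for $n$ sufficiently large.

Let $P_\epsilon^{(n)}$ denote the orthogonal projection onto the typical subspace $\mathcal{T}_\epsilon^{(n)}$. The encoding (compression) of the signal states $|\Psi_k^{(n)}\rangle$ of [10] is done in the following manner. $C^{(n)} : |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \mapsto \tilde{\rho}_k^{(n)}$, where

$$\tilde{\rho}_k^{(n)} := \alpha_k^2\, |\tilde{\Psi}_k^{(n)}\rangle\langle\tilde{\Psi}_k^{(n)}| + \beta_k^2\, |\Phi_0\rangle\langle\Phi_0| \qquad [21]$$

Here

$$|\tilde{\Psi}_k^{(n)}\rangle := \frac{P_\epsilon^{(n)} |\Psi_k^{(n)}\rangle}{\| P_\epsilon^{(n)} |\Psi_k^{(n)}\rangle \|}, \qquad \alpha_k := \| P_\epsilon^{(n)} |\Psi_k^{(n)}\rangle \|, \qquad \beta_k = \| (I - P_\epsilon^{(n)}) |\Psi_k^{(n)}\rangle \| \qquad [22]$$

and $|\Phi_0\rangle$ is any fixed state in $\mathcal{T}_\epsilon^{(n)}$.

Obviously $\tilde{\rho}_k^{(n)} \in B(\mathcal{T}_\epsilon^{(n)})$, and hence the typical subspace $\mathcal{T}_\epsilon^{(n)}$ plays the role of the compressed space.

The decompression $D^{(n)}(\tilde{\rho}_k^{(n)})$ is defined as the extension of $\tilde{\rho}_k^{(n)}$ on $\mathcal{T}_\epsilon^{(n)}$ to $\mathcal{H}_n$:

$$D^{(n)}\big(\tilde{\rho}_k^{(n)}\big) = \tilde{\rho}_k^{(n)} \oplus 0$$

The fidelity of this compression–decompression scheme satisfies

$$\begin{aligned}
F_n &= \sum_k p_k^{(n)}\, \langle\Psi_k^{(n)}|\, \tilde{\rho}_k^{(n)}\, |\Psi_k^{(n)}\rangle \\
&= \sum_k p_k^{(n)} \left[ \alpha_k^2\, |\langle\Psi_k^{(n)}|\tilde{\Psi}_k^{(n)}\rangle|^2 + \beta_k^2\, |\langle\Psi_k^{(n)}|\Phi_0\rangle|^2 \right] \\
&\ge \sum_k p_k^{(n)} \alpha_k^2\, |\langle\Psi_k^{(n)}|\tilde{\Psi}_k^{(n)}\rangle|^2 = \sum_k p_k^{(n)} \alpha_k^4 \\
&\ge \sum_k p_k^{(n)} (2\alpha_k^2 - 1) = 2A_n - 1 \qquad [23]
\end{aligned}$$

where $A_n = \mathrm{tr}(P_\epsilon^{(n)} \rho_n)$.

Using the typical subspace theorem, Schumacher (1995) proved the following analog of Shannon's noiseless channel coding theorem for memoryless quantum information sources:

Theorem 5 (Schumacher's quantum coding theorem). Let $\{\rho_n, \mathcal{H}_n\}$ be an i.i.d. quantum source: $\rho_n = \rho^{\otimes n}$ and $\mathcal{H}_n = \mathcal{H}^{\otimes n}$. If $R > S(\rho)$, then there exists a reliable compression scheme of rate $R$. If $R < S(\rho)$, then any compression scheme of rate $R$ is not reliable.

Proof

(i) $R > S(\rho)$. Choose $\epsilon > 0$ such that $R > S(\rho) + \epsilon$. For a given $\delta > 0$, choose the typical subspace as above and choose $n$ large enough so that (i) and (ii) in the typical subspace theorem hold. In particular, $A_n = \mathrm{tr}(P_\epsilon^{(n)} \rho_n) > 1 - \delta$. Thus, by [23], $F_n \ge 2A_n - 1 > 1 - 2\delta$, and the fidelity tends to 1 as $n \to \infty$.

(ii) Suppose $R < S(\rho)$. Let the compression map be $C^{(n)}$. We may assume that $\tilde{\mathcal{H}}_n$ is a subspace of $\mathcal{H}_n$ with $\dim \tilde{\mathcal{H}}_n = 2^{nR}$. We denote the projection onto $\tilde{\mathcal{H}}_n$ as $\tilde{P}_n$ and let $\tilde{\rho}_k^{(n)} = C^{(n)}(|\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|)$. Since $\tilde{\rho}_k^{(n)}$ is concentrated on $\tilde{\mathcal{H}}_n$, we have $\tilde{\rho}_k^{(n)} \le \tilde{P}_n$ and hence $D^{(n)}(\tilde{\rho}_k^{(n)}) \le D^{(n)}(\tilde{P}_n)$, for any decompression map $D^{(n)}$. Inserting into the definition of the fidelity, we then have

$$\begin{aligned}
F_n &\le \sum_k p_k^{(n)}\, \langle\Psi_k^{(n)}|\, D^{(n)}(\tilde{P}_n)\, |\Psi_k^{(n)}\rangle = \mathrm{tr}\big(\rho^{(n)} D^{(n)}(\tilde{P}_n)\big) \\
&\le \sum_{k \in T_\epsilon^{(n)}} \lambda_k^{(n)}\, \langle\psi_k^{(n)}|\, D^{(n)}(\tilde{P}_n)\, |\psi_k^{(n)}\rangle + \sum_{k \notin T_\epsilon^{(n)}} \lambda_k^{(n)} \qquad [24]
\end{aligned}$$

By the typical subspace theorem, the latter sum tends to 0 as $n \to \infty$, and in the sum over $k \in T_\epsilon^{(n)}$ we have $\lambda_k^{(n)} \le 2^{-n(S(\rho)-\epsilon)}$. The first sum can therefore be bounded as follows:

$$\begin{aligned}
\sum_{k \in T_\epsilon^{(n)}} \lambda_k^{(n)}\, \langle\psi_k^{(n)}|\, D^{(n)}(\tilde{P}_n)\, |\psi_k^{(n)}\rangle
&\le 2^{-n(S(\rho)-\epsilon)} \sum_k \langle\psi_k^{(n)}|\, D^{(n)}(\tilde{P}_n)\, |\psi_k^{(n)}\rangle \\
&= 2^{-n(S(\rho)-\epsilon)}\, \mathrm{tr}\big(D^{(n)}(\tilde{P}_n)\big) \\
&= 2^{-n(S(\rho)-\epsilon)}\, \mathrm{tr}\Big(\sum_i D_i \tilde{P}_n D_i^\dagger\Big) \\
&= 2^{-n(S(\rho)-\epsilon)}\, 2^{nR} \qquad [25]
\end{aligned}$$

by the cyclic property of the trace and the fact that $\sum_i D_i^\dagger D_i = I$ and $\dim \tilde{\mathcal{H}}_n = 2^{nR}$. Choosing $\epsilon < S(\rho) - R$, this bound tends to 0 as $n \to \infty$, so the fidelity cannot tend to 1. ∎

Even for a quantum source with memory, reliable data compression is achieved by looking for a typical subspace $\mathcal{T}_\epsilon^{(n)}$ of the Hilbert space $\mathcal{H}_n$ for a given $\epsilon > 0$. In the following subsections, we discuss two different classes of such sources for which one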

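can find typical subspaces $\mathcal{T}_\epsilon^{(n)}$ such that the fidelity $F_n$ tends to 1 as $n \to \infty$.

Ergodic Quantum Sources

A quantum generalization of classical ergodic sources is defined as follows. First consider the analog of an infinite sequence of random variables which is a state on the infinite tensor product of a finite-dimensional $*$-algebra $M$. The latter is given by the norm closure of the increasing sequence of finite tensor products

$$M_\infty = \overline{\bigcup_n \bigotimes_{k=-n}^{n} M} \qquad [26]$$

A translation-invariant state $\varphi_\infty$ on $M_\infty$ is said to be ergodic if it cannot be decomposed as a (nontrivial) convex combination of other translation-invariant states. The analog of the Kolmogorov–Sinai entropy [5] for an ergodic state $\varphi_\infty$ is called the mean entropy and is given by

$$S_M(\varphi_\infty) = \lim_{n \to \infty} \frac{1}{n} S(\varphi_n) = \inf_{n \in \mathbb{N}} \frac{1}{n} S(\varphi_n) \qquad [27]$$

where $\varphi_n$ is the restriction of $\varphi_\infty$ to $M_n := M^{\otimes n}$.

Following Hiai and Petz (1991), we define the following quantity for any state $\varphi$ on an arbitrary finite-dimensional $*$-algebra $M$ and a given $\delta > 0$:

$$\beta_\delta(\varphi) = \inf\{\log \mathrm{tr}(q) : q \in M,\ q^* = q,\ q^2 = q,\ \varphi(q) \ge 1 - \delta\} \qquad [28]$$

We also define a state $\varphi_\infty$ on $M_\infty$ to be completely ergodic if it is ergodic under transformations on $M_\infty$, induced by $l$-fold shifts on $\mathbb{Z}$, for arbitrary $l \in \mathbb{N}$. The following theorem is due to Hiai and Petz (1991), who proved it in a slightly more general setting:

Theorem 6 (Hiai and Petz). Suppose that $\varphi_\infty$ is a completely ergodic state on $M_\infty$ and $d := \dim M < \infty$, and set $\varphi_n = \varphi_\infty|_{M_n}$. Then, for any $\delta > 0$, the following hold:

$$\text{(i)} \quad \limsup_{n \to \infty} \frac{1}{n} \beta_\delta(\varphi_n) \le S_M(\varphi_\infty) \qquad [29]$$

$$\text{(ii)} \quad \liminf_{n \to \infty} \frac{1}{n} \beta_\delta(\varphi_n) \ge S_M(\varphi_\infty) - \delta \log d \qquad [30]$$

Proof of (i) Choose $r > S_M(\varphi_\infty)$ and let $\epsilon < r - S_M(\varphi_\infty)$ and $h = r - \epsilon$. By the definition of $S_M(\varphi_\infty)$, there exists $l \in \mathbb{N}$ such that $S(\varphi_l) < l h$. Let $\{|e_i\rangle\}_{i=1}^{d^l}$ be an orthonormal set of eigenvectors of $\rho_{\varphi_l}$, with corresponding eigenvalues $\lambda_i$, that is, let

$$\rho_{\varphi_l} = \sum_{i=1}^{d^l} \lambda_i\, p_i \qquad [31]$$

where $p_i = |e_i\rangle\langle e_i|$ is the projection onto $|e_i\rangle$, be the spectral decomposition for $\rho_{\varphi_l}$. Denote the spectrum $\mathcal{X} = \{\lambda_i\}_{i=1}^{d^l}$. For $n \in \mathbb{N}$, introduce the probability measures $\mu_n$ on $\mathcal{X}^n$ by

$$\mu_n(A) = \varphi_{nl}(q_A) \qquad [32]$$

where, for any $A \subset \mathcal{X}^n$, the projection $q_A$ is defined by

$$q_A = \sum_{(\lambda_{i_1}, \dots, \lambda_{i_n}) \in A} p_{i_1} \otimes \cdots \otimes p_{i_n} \qquad [33]$$

Similarly, we define $\mu_\infty$ on $\mathcal{X}^{\mathbb{Z}}$. The sequence of random variables $(X_n)_{n \in \mathbb{Z}}$ with distribution $\mu_\infty$ is then ergodic since $\varphi_\infty$ is completely ergodic (and hence $l$-ergodic).

By the Shannon–McMillan–Breiman theorem (Theorem 2),

$$-\frac{1}{n} \log \mu_n(\{(x_1, \dots, x_n)\}) \to h_{KS} \qquad [34]$$

almost surely w.r.t. $\mu_\infty$, where $h_{KS}$ is the Kolmogorov–Sinai entropy. The latter is given by $h_{KS} = \lim_{n \to \infty} (1/n)H_n = \inf_{n \in \mathbb{N}} (1/n)H_n$, where

$$H_n = -\sum_{(x_1, \dots, x_n) \in \mathcal{X}^n} \mu_n(\{(x_1, \dots, x_n)\}) \log \mu_n(\{(x_1, \dots, x_n)\}) \qquad [35]$$

Notice in particular that

$$h_{KS} \le H_1 = S(\varphi_l) < l h \qquad [36]$$

If we let $T_\epsilon^{(n)}$ be the (typical) subset of $\mathcal{X}^n$ such that

$$-\frac{1}{n} \log \mu_n(\{(x_1, \dots, x_n)\}) \in (h_{KS} - \epsilon,\ h_{KS} + \epsilon) \qquad [37]$$

for $(x_1, \dots, x_n) \in T_\epsilon^{(n)}$ then we have $\mu_\infty(T_\epsilon^{(n)}) \ge 1 - \delta$ for $n$ large enough. Moreover, since $\mu_n(\{(x_1, \dots, x_n)\}) \ge e^{-n(h_{KS}+\epsilon)}$ for all $(x_1, \dots, x_n) \in T_\epsilon^{(n)}$, and the total measure is 1,

$$|T_\epsilon^{(n)}| \le e^{n(h_{KS}+\epsilon)} \le e^{n(l h+\epsilon)} \qquad [38]$$

It follows that $\mathrm{tr}(q_{T_\epsilon^{(n)}}) \le e^{n(l h+\epsilon)}$ whereas $\varphi_{nl}(q_{T_\epsilon^{(n)}}) = \mu_n(T_\epsilon^{(n)}) \ge 1 - \delta$ and we conclude that

$$\frac{1}{nl} \beta_\delta(\varphi_{nl}) \le \frac{n(l h + \epsilon)}{nl} < r \qquad [39]$$

from which [29] follows upon taking $n \to \infty$, since $r > S_M(\varphi_\infty)$ was arbitrary. (Notice that $\beta_\delta(\varphi_n)$ is decreasing in $n$ since $M_n \subset M_{n+1}$.) ∎

Proof of (ii) Given $\epsilon, \delta > 0$ and $n \in \mathbb{N}$, choose a projection $q_n$ with $\varphi_n(q_n) \ge 1 - \delta$ and $\log \mathrm{tr}(q_n) < \beta_\delta(\varphi_n) + \epsilon$. Since $S_M(\varphi_\infty) = \inf (1/n)S(\varphi_n)$ we have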

$S_M(\varphi_\infty) \le (1/n)S(\varphi_n)$. We now use the following lemma:

Lemma 7 If $\varphi$ is a state on a finite-dimensional $*$-algebra $M$, and $q \in M$ is a projection, then

$$S(\varphi) \le H(p) + \varphi(q) \log \mathrm{tr}(q) + (1 - \varphi(q)) \log \mathrm{tr}(1 - q) \qquad [40]$$

where $H(p) = -p \log p - (1 - p) \log(1 - p)$ (the binary entropy) with $p = \varphi(q)$.

Proof First notice that if $[\rho_\varphi, q] = 0$ then the result [40] follows from the simple inequality:

$$-\sum_{i=1}^{m} \tilde{\lambda}_i \log \tilde{\lambda}_i \le \log m \quad \text{if} \quad \sum_{i=1}^{m} \tilde{\lambda}_i = 1 \qquad [41]$$

Indeed, diagonalizing $\rho_\varphi$, the eigenvalues $\lambda_i$ divide into two subsets with corresponding eigenvectors belonging to the range of $q$, respectively, its complement. Considering the first set, we have, if $m = \dim(\mathrm{Ran}(q))$, and taking $\tilde{\lambda}_i = \lambda_i / (\sum_{i=1}^{m} \lambda_i)$ in [41],

$$\begin{aligned}
-\sum_{i=1}^{m} \lambda_i \log \lambda_i &\le -\Big(\sum_{i=1}^{m} \lambda_i\Big) \log\Big(\frac{1}{m} \sum_{i=1}^{m} \lambda_i\Big) \\
&= -\mathrm{tr}(q \rho_\varphi)\, \big[\log \mathrm{tr}(q \rho_\varphi) - \log \mathrm{tr}(q)\big]
\end{aligned}$$

Adding the analogous inequality for the part of the spectrum corresponding to $1 - q$, we obtain [40].

In the general case, that is, if $[\rho_\varphi, q] \ne 0$, define the unitary $u = 2q - 1$ and the state

$$\varphi'(x) = \tfrac{1}{2}\,[\varphi(x) + \varphi(u x u)] \qquad [42]$$

Then $[\rho_{\varphi'}, q] = 0$ and by concavity of $S(\varphi)$ and the result for the previous case

$$H(p) + \varphi(q) \log \mathrm{tr}(q) + (1 - \varphi(q)) \log \mathrm{tr}(1 - q) \ge S(\varphi') \ge S(\varphi) \qquad [43]$$

since $\varphi'(q) = \varphi(q)$. ∎

Continuing with the proof of (ii), we conclude that

$$\begin{aligned}
S(\varphi_n) &\le H(p) + \varphi_n(q_n) \log \mathrm{tr}(q_n) + (1 - \varphi_n(q_n)) \log \mathrm{tr}(1 - q_n) \\
&\le 1 + \beta_\delta(\varphi_n) + \epsilon + n\delta \log d
\end{aligned}$$

Dividing by $n$ and taking the limit we obtain [30]. ∎

It follows from this theorem that we can define a typical subspace in the same way as in Schumacher's theorem. Indeed, given $\delta > 0$ and $\epsilon > 0$, we have that for $n$ large enough, there exists a subspace $\mathcal{T}_\epsilon^{(n)}$ equal to the range of a projection $q_n$ such that $\varphi_n(q_n) > 1 - \delta$ and $e^{n(S_M(\varphi_\infty) - \delta \log d)} < \dim(\mathcal{T}_\epsilon^{(n)}) = \mathrm{tr}(q_n) < e^{n(S_M(\varphi_\infty) + \epsilon)}$. The proof of the quantum analog of the Shannon–McMillan theorem is then similar to that of Schumacher's theorem (Petz and Mosonyi 2001, Bjelaković et al. 2004):

Theorem 8 Let $\varphi_\infty$ be a completely ergodic stationary state on the infinite tensor product algebra $M_\infty$. If $R > S_M(\varphi_\infty)$, then for any decomposition of the form

$$\rho^{(n)} = \sum_k p_k^{(n)}\, |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \qquad [44]$$

there exists a reliable quantum code of rate $R$. Conversely, if $R < S_M(\varphi_\infty)$ then any quantum compression–decompression scheme of rate $R$ is not reliable.

Remarks Theorem 6 also holds for higher-dimensional information streams, with essentially the same proof. (The existence of the mean entropy is more complicated in that case.) The condition of complete ergodicity in this theorem is unnecessary. Indeed, Bjelaković et al. (2004) showed that the result remains valid (also in more than one dimension) if the state $\varphi_\infty$ of the source is simply ergodic. They achieved this by decomposing a general ergodic state into a finite number of $l$-ergodic states, and then applying the above strategy to each. It should also be mentioned that a weaker version of Theorem 6 was proved by King and Lesniewski (1998). They considered the entropy of an associated classical source, but did not show that this classical entropy can be optimized to approximate the von Neumann entropy. This had in fact already been proved by Hiai and Petz (1991). The relevance of the latter work for quantum information theory was finally pointed out by Petz and Mosonyi (2001).

Source Coding for Quantum Spin Systems

In this section we consider a class of quantum sources modeled by Gibbs states of a finite strongly interacting quantum spin system in $\Lambda \subset \mathbb{Z}^d$ with $d \ge 2$. Due to the interaction between spins, the density matrix of the source is not given by a tensor product of the density matrices of the individual spins and hence the quantum information source is non-i.i.d. We consider the density matrix to be written in the standard Gibbsian form:

$$\rho^{\omega,\Lambda} = \frac{e^{-\beta H_\Lambda^\omega}}{\Xi^{\omega,\Lambda}} \qquad [45]$$

where $\beta > 0$ is the inverse temperature. Here $\omega$ denotes the boundary condition, that is, the configuration of the spins in $\Lambda^c = \mathbb{Z}^d \setminus \Lambda$, and $H_\Lambda^\omega$ is the Hamiltonian acting on the spin system in $\Lambda$ under this boundary condition (see Datta and Suhov (2002)

for precise definitions of these quantities). The denominator on the right-hand side of [45] is the partition function.

Note that any faithful density matrix can be written in the form [45] for some self-adjoint operator $H^{\omega}_{\Lambda}$ with discrete spectrum, such that $e^{-\beta H^{\omega}_{\Lambda}}$ is trace class. However, we consider $H^{\omega}_{\Lambda}$ to be a small quantum perturbation of a classical Hamiltonian and require it to satisfy certain hypotheses (see Datta and Suhov (2002)). In particular, we assume that $H_{\Lambda} = H_{0\Lambda} + \lambda V_{\Lambda}$, where (1) $H_{0\Lambda}$ is a classical, finite-range, translation-invariant Hamiltonian with a finite number of periodic ground states, and the excitations of these ground states have an energy proportional to the size of their boundaries (Peierls condition); (2) $\lambda V_{\Lambda}$ is a translation-invariant, exponentially decaying, quantum perturbation, $\lambda$ being the perturbation parameter. These hypotheses ensure that the quantum Pirogov–Sinai theory of phase transitions in lattice systems (see, e.g., Datta et al. (1996)) applies.

The power of quantum Pirogov–Sinai theory is such that, in proving reliable data compression for such sources, we do not need to invoke the concept of ergodicity.

Using the concavity of the von Neumann entropy $S(\rho^{\omega,\Lambda})$, one can prove that the von Neumann entropy rate (or mean entropy) of the source

\[ h := \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{S(\rho^{\omega,\Lambda})}{|\Lambda|} \]

exists. For a general van Hove sequence, this follows from the strong subadditivity of the von Neumann entropy (see, e.g., Ohya and Petz (1993)).

Let $\rho^{\omega,\Lambda}$ have a spectral decomposition

\[ \rho^{\omega,\Lambda} = \sum_j \lambda_j\, |\psi_j\rangle\langle\psi_j| \]

where the eigenvalues $\lambda_j$, $1 \le j \le 2^{|\Lambda|}$, and the corresponding eigenstates $|\psi_j\rangle$, depend on $\omega$ and $\Lambda$. Let $P^{\omega,\Lambda}$ denote the probability distribution $\{\lambda_j\}$ and consider a random variable $K^{\omega,\Lambda}$ which takes a value $\lambda_j$ with probability $\lambda_j$:

\[ K^{\omega,\Lambda}(\psi_j) = \lambda_j; \qquad P^{\omega,\Lambda}(K^{\omega,\Lambda} = \lambda_j) = \lambda_j \]

The data compression limit is related to asymptotic properties of the random variables $K^{\omega,\Lambda}$ as $\Lambda \nearrow \mathbb{Z}^d$. As in the case of i.i.d. sources, we prove the reliability of data compression by first proving the existence of a typical subspace. The latter follows from Theorem 9 below. The proof of this crucial theorem relies on results of quantum Pirogov–Sinai theory (Datta et al. 1996).

Theorem 9 Under the above assumptions, for $\beta$ large and $\lambda$ small enough, for all $\epsilon > 0$

\[ \lim_{\Lambda \nearrow \mathbb{Z}^d} P^{\omega,\Lambda}\left( \left| -\frac{1}{|\Lambda|} \log K^{\omega,\Lambda} - h \right| \le \epsilon \right) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \sum_j \lambda_j\, \chi\{ |{-|\Lambda|^{-1}} \log \lambda_j - h| \le \epsilon \} = 1 \qquad [46] \]

where $\chi\{...\}$ denotes an indicator function.

Theorem 9 is essentially a law of large numbers for random variables $(-\log K^{\omega,\Lambda})$. The statement of the theorem can be alternatively expressed as follows. For any $\epsilon > 0$,

\[ \lim_{\Lambda \nearrow \mathbb{Z}^d} P^{\omega,\Lambda}\left( 2^{-|\Lambda|(h+\epsilon)} \le K^{\omega,\Lambda} \le 2^{-|\Lambda|(h-\epsilon)} \right) = 1 \qquad [47] \]

Thus, we can define a typical subspace $T^{\omega,\Lambda}_{\epsilon}$ by

\[ T^{\omega,\Lambda}_{\epsilon} := \mathrm{span}\{ |\psi_j\rangle : 2^{-|\Lambda|(h+\epsilon)} \le \lambda_j \le 2^{-|\Lambda|(h-\epsilon)} \} \qquad [48] \]

It clearly satisfies the analogs of (i) and (ii) of the typical subspace theorem, which implies as before that a compression scheme of rate R is reliable if and only if R > h.

Universal and Variable Length Data Compression

Thus far we discussed source-dependent data compression for various classes of quantum sources. In each case data compression relied on the identification of the typical subspace of the source, which in turn required a knowledge of its density matrix. In classical information theory, there exists a generalization of the theorem of typical sequences due to Csiszár and Körner (1981) where the typical set is universal, in that it is typical for every possible probability distribution with a given entropy. This result was used by Jozsa et al. (1998) to construct a universal compression scheme for quantum i.i.d. sources with a given von Neumann entropy S using a counting argument for symmetric subspaces. This was generalized to ergodic sources by Kaltchenko and Yang (2003) along the lines of Theorem 6. Hayashi and Matsumoto (2002) supplemented the work of Jozsa et al. (1998) with an estimation of the eigenvalues of the source (using the measurement smearing technique) to show that a reliable compression scheme exists for any quantum i.i.d. source, independent of the value of its von Neumann entropy S, the limiting rate of compression being given by S. If one admits variable length coding, the Lempel–Ziv algorithm gives a completely universal compression scheme, independent of the value of the entropy, in the classical case (Cover and Thomas 1991). This algorithm was generalized to the quantum case for i.i.d. sources by Jozsa and Presnell (2003), and to
Spacetime Topology, Causal Structure and Singularities 617

sources modeled by Gibbs states of free bosons or fermions on a lattice by Johnson and Suhov (2002).

Another important question is the efficiency of the various coding schemes. The above-mentioned schemes for quantum i.i.d. sources are not efficient, in the sense that they have no polynomial time implementation. Recently, it was shown by Bennett et al. (2004) that an efficient, universal compression scheme for i.i.d. sources can be constructed by employing quantum state tomography.

Acknowledgment

The authors would like to thank Y M Suhov for helpful discussions.

See also: Capacity for Quantum Information; Channels in Quantum Information Theory; Positive Maps on C*-Algebras.

Further Reading

Bennett CH, Harrow AW, and Lloyd S (2004) Universal quantum data compression via gentle tomography, quant-ph/0403078.
Bjelaković I, Krüger T, Siegmund-Schultze R, and Szkoła A (2004) The Shannon–McMillan theorem for ergodic quantum lattice systems. Inventiones Mathematicae 155: 203–222.
Breiman L (1957) The individual ergodic theorem of information theory. Annals of Mathematical Statistics 28: 809–811.
Breiman L (1960) The individual ergodic theorem of information theory – correction note. Annals of Mathematical Statistics 31: 809–810.
Cover TM and Thomas JA (1991) Elements of Information Theory. New York: Wiley.
Csiszár I and Körner J (1981) Information Theory. Coding Theorems for Discrete Memoryless Systems. Budapest: Akadémiai Kiadó.
Datta N, Fernández R, and Fröhlich J (1996) Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. Journal of Statistical Physics 84: 455–534.
Datta N and Suhov Y (2002) Data compression limit for an information source of interacting qubits. Quantum Information Processing 1(4): 257–281.
Hayashi M and Matsumoto K (2002) Quantum universal variable-length source coding. Physical Review A 66: 022311.
Hiai F and Petz D (1991) The proper formula for the relative entropy and its asymptotics in quantum probability. Communications in Mathematical Physics 143: 257–281.
Johnson O and Suhov YM (2002) The von Neumann entropy and information rate for integrable quantum Gibbs ensembles. Quantum Computers and Computing 3/1: 3–24.
Jozsa R, Horodecki M, Horodecki P, and Horodecki R (1998) Universal quantum information compression. Physical Review Letters 81: 1714–1717.
Jozsa R and Presnell S (2003) Universal quantum information compression and degrees of prior knowledge. Proceedings of Royal Society of London Series A 459: 3061–3077.
Kaltchenko A and Yang E-H (2003) Universal compression of ergodic quantum sources. Quantum Information & Computation 3: 359–375.
King C and Lesniewski A (1998) Quantum sources and a quantum coding theorem. Journal of Mathematical Physics 39: 88–101.
McMillan B (1953) The basic theorems of information theory. Annals of Mathematical Statistics 24: 196–219.
Nielsen MA and Chuang IL (2000) Quantum Computation and Quantum Information. Cambridge: Cambridge University Press.
Ohya M and Petz D (1993) Quantum Entropy and Its Use. Heidelberg: Springer.
Petz D and Mosonyi M (2001) Stationary quantum source coding. Journal of Mathematical Physics 42: 4857–4864.
Schumacher B (1995) Quantum coding. Physical Review A 51: 2738–2747.
Shannon CE (1948) A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–656.

Spacetime Topology, Causal Structure and Singularities


R Penrose, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

The Value of Topological Reasoning in General Relativity

Solving the equations of Einstein's general relativity (see General Relativity: Overview) can be an exceedingly complicated business; it is commonly found necessary to resort to numerical solutions involving very complex computer codes (see Computational Methods in General Relativity: The Theory). The essential content of the basic equations of the theory itself is, however, something that can be phrased in simple geometrical terms, using only basic concepts of differential geometry (see General Relativity: Overview). By virtue of this, it is sometimes the case, in general relativity, that geometrical arguments of various kinds, including purely topological ones (i.e., arguments depending only upon the properties of continuity or smoothness), can be used to great effect to obtain results that are not readily accessible by standard procedures of differential equation theory or by direct numerical calculation.

One particularly significant family of situations where this kind of argument has a key role to play is in the important issue of the singularities that arise in many solutions of the Einstein equations, in which spacetime curvatures may be expected to diverge to infinity. These are exemplified, particularly, by two important classes of solutions of the

Einstein field equations in which singularities arise. In the first instance, we have cosmological models, which tend to exhibit the presence of an initial singularity referred to as the "Big Bang," as was first noted in the standard Friedmann models (which are solutions of the Einstein equations with simple matter sources; see Cosmology: Mathematical Aspects). Secondly, we find a final singularity (for local observers) at the endpoint of gravitational collapse to a black hole (where in the relevant region, outside the collapsing matter, Einstein's vacuum equations are normally taken to hold). In either case, there are canonical exact models, in which considerable symmetry is assumed, and where the models indeed become singular at places where the spacetime curvature diverges to infinity. For many years (prior to 1965), there had been much debate as to whether these singularities were an inevitable feature of the general physical situation under consideration, or whether the presence of singularities might be an artifact of the assumed high symmetry. The use of topological-type arguments has established that, in general terms, the occurrence of a singularity is not merely an artifact of symmetry, and cannot generally be removed by the introduction of small (finite) perturbations.

Figure 1 Spacetime diagram of collapse to a black hole. (One spatial dimension is suppressed.) Matter collapses inwards, through the 3-surface that becomes the (absolute) event horizon. No matter or information can escape the hole once it has been formed. The null cones are tangent to the horizon and allow matter or signals to pass inwards but not outwards. An external observer cannot see inside the hole, but only the matter, vastly dimmed and redshifted, just before it enters the hole. (Reproduced with permission from Penrose R (2004) The Road to Reality: A Complete Guide to the Laws of the Universe. London: Jonathan Cape.) [Diagram labels: Observer; Horizon; Singularity; Collapsing matter.]

Let us first consider the standard picture, put forward in 1939 by Oppenheimer and Snyder (OS), of the gravitational collapse of an over-massive star to a black hole; see Figure 1 (and see Stationary Black Holes). This assumes exact spherical symmetry. The region external to the matter is described by the well-known Schwarzschild solution of the Einstein vacuum equations, appropriately extended to inside the "Schwarzschild radius" $r = 2mG/c^2$ (G being Newton's gravitational constant and c, the speed of light, and where m is the total mass of the collapsing material; from now, for convenience, we choose units so that G = c = 1). In Figure 1, this internal extension is conveniently expressed using Eddington–Finkelstein coordinates $(r, v, \theta, \phi)$ (see Eddington (1924) and Finkelstein (1958)), where $v = t + r + 2m \log (r - 2m)$, the metric form being

\[ ds^2 = (1 - 2m/r)\,dv^2 - 2\,dv\,dr - r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \]

(The signature convention $+ - - -$ is being adopted here; see General Relativity: Overview.) We find that, in this model, there is a singularity (at r = 0) at the future endpoint of each world line of collapsing matter. Moreover, no future-timelike line starting inside the horizon can avoid reaching the singularity when we try to extend it, as a timelike curve, indefinitely into the future, where the "horizon" is the three-dimensional region obtained by rotating, over the $(\theta, \phi)$ 2-sphere, the null (lightlike) line which is r = 2m outside the matter region and which is the extension of this line, as a null line, into the past until it meets the axis. It is easy to see that any observer's world line within this horizon is indeed trapped in this sense.

The question naturally arises: how representative is this model? Here, the singularity occurs at the center (r = 0), the place where all the matter is directed, and where it all reaches without rebounding. So it may be regarded as unsurprising that the density becomes infinite there. Now, let us suppose that the collapsing material is not exactly spherically symmetrical. Even if it is only slightly (though finitely) perturbed away from this symmetrical situation, having slight (but finite) transverse motions, the collapsing matter is now not all directed exactly towards the center, as it is in the OS model. One might imagine that the singularity

could now be avoided, the different portions of matter just "missing" each other and then being finally flung out again, after some complicated motions, where the density and spacetime curvatures might well become large but presumably still finite. To follow such an irregular collapse in full detail would present a very difficult task, and one would have to carry it out by numerical means. As yet, despite enormous advances in computational technique, a fully effective simulation of such a "generic" collapse is still not in hand. In any case, it is hard to make a convincing case as to whether or not a singularity arises, because as soon as metric or curvature quantities begin to diverge, the computation becomes fundamentally unreliable and simply "gives up." So we cannot really tell whether the failure is due to some genuine divergence or whether it is an artifact. It is thus fortunate that other mathematical techniques are available. Indeed, by use of a differential–topological–causal argument, we find that such perturbations do not help, at least so long as they are small enough not to alter the general character of the collapse, which we find has an "unstoppable" character, so long as a certain criterion is satisfied in its early stages.

Trapped Surfaces

But how are we to characterize the collapse as "unstoppable," where no symmetries are to be assumed, and the simple picture illustrated in Figure 1 cannot be appealed to? A convenient characterization is the presence of what is called a "trapped surface." This notion generalizes a key feature of the 0 < r < 2m region inside the horizon of the vacuum (Eddington–Finkelstein) picture of Figure 1. To understand what this feature is, consider fixing a point s in the vacuum region of the (v, r)-plane of Figure 1. We must, of course, bear in mind that, because this plane is to be "rotated" about the central vertical axis (r = 0) by letting $\theta$ and $\phi$ vary as coordinates on a 2-sphere $S^2$, the point s actually describes a closed 2-surface S (coordinatized by $\theta$ and $\phi$) with topology $S^2$ (so S is intrinsically an ordinary 2-sphere). We shall be concerned with the region $I^+(S)$, which is the (chronological) "future" of S, that is, the locus of points q for which a timelike curve exists having a future endpoint at q and a past endpoint on S. We shall also be interested, particularly, in the boundary $\partial I^+(S)$ of $I^+(S)$. This boundary is described, in Figure 1, by the pair of null curves $v = \text{const.}$ and $2r + 4m \log (r - 2m) - v = \text{const.}$, proceeding into the future from s (and rotated in $\theta$ and $\phi$). The region $I^+(S)$ itself is represented by that part of Figure 1 which lies between these null curves.

We observe that, in this symmetrical case (s being chosen in the vacuum region), a characterization of s as being "trapped," in the sense that it lies in a region that is within the horizon, is that the future tangents to these null curves both point "inwards," in the sense of decreasing r. Since r is the metric radius of the $S^2$ of rotation, so that the element of surface area of this sphere is proportional to $r^2$, it follows that the surface area of the boundary $\partial I^+(S)$ reduces, on both branches, as we move away from S into the future. The three-dimensional region $\partial I^+(S)$ consists of two null surfaces joined along S, in the sense that their Lorentzian normals are null 4-vectors. For each fixed value of $\theta$ and $\phi$, this normal is a tangent to one or other of the two null curves of Figure 1, starting at s. For a trapped s, these normals point in the direction of decreasing r, and it follows that the divergence of these normals is negative (so $\rho > 0$ in what follows below).

In the general case, it is this property of negativity of the divergence, at S, of both sets of Lorentzian normals (i.e., of null tangents to $\partial I^+(S)$), that characterizes S as a trapped surface, where in the general case we must also prescribe S to be compact and spacelike. But now there are to be no assumptions of symmetry whatever. Such a characterization is stable against small, but finite, perturbations of the location of S, within the spacetime manifold M, and also against small, but finite, perturbations of M itself.

We can think of a trapped surface in more direct physical/geometrical terms. Imagine a flash of light emitted all over some spacelike compact spherical surface such as S, but now in ordinary flat spacetime, where for simplicity we suppose that S is situated in some spacelike (flat) 3-hypersurface H, of constant time t = 0. There will be one component to the flash proceeding outwards and another proceeding inwards. Provided that S is convex, the outgoing flash will represent an initial increase of the surface area at every point of S and the ingoing flash, an initial decrease. In four-dimensional spacetime terms, we express this as positivity of the divergence of the outward null normal and the negativity of the divergence of the inward one. The characteristic feature of a trapped surface is that whereas the ingoing flash will still have an initially reducing surface area, the "outgoing" flash now has the curious property that its surface area is also initially decreasing, this holding at every point of S.

Locally, this is not particularly strange. For a surface wiggling in and out, we are quite likely to find portions of ingoing flash with increasing area,

and portions of outgoing flash with decreasing area. An extreme case in Minkowski spacetime has S as the intersection of two past light cones. All the null normals to S point along the generators of these past cones, and therefore all converge into the future. Such a surface S (indeed spacelike) looks "trapped" everywhere locally, but fails to count as trapped, not being compact. Since there is nothing causally extreme about Minkowski space, it is appropriate not to count such surfaces as "trapped." What is peculiar about a trapped surface is that both ingoing and outgoing flashes are initially decreasing in area, over the entire compact S. (N.B. Hawking and Ellis (1973) adopt a slightly different terminology; the term "trapped," used here, refers to their "closed trapped.")

The Null Raychaudhuri Equation

What do we deduce from the existence of a trapped surface? A glance at Figure 1 gives us some indication of the trouble. As we trace $\partial I^+(S)$ into the future, we find that its cross-sectional area continues to decrease, until becoming zero at the central singularity. This last feature need not reflect closely what happens in more general cases, with no spherical symmetry. But the reduction in surface area is a general property. This is the first point to appreciate in a theorem (Penrose 1965, 1968, Hawking and Ellis 1973) which indicates the profoundly disturbing physical implications of the existence of a trapped surface in physically realistic gravitational collapse, according to Einstein's general relativity. The surface-area reduction arises from a result known as "Raychaudhuri's equation," which, in the case of null rays, we refer to as the "Sachs" equations. We come to this next.

Although many different notations are used to express the needed quantities, we can here conveniently employ the spin-coefficient formalism, as described elsewhere in this Encyclopedia (see Spinors and Spin Coefficients).

Suppose that we have a congruence (smooth three-parameter family) of rays (null geodesics) in four-dimensional spacetime. Let $\ell^a$ be a real future-null vector, tangent to a null geodesic $\gamma$ of the congruence, and let $m^a$ be complex-null, also defined along $\gamma$, where its real and imaginary parts are unit vectors spanning a 2-surface element orthogonal to $\ell^a$ at each point of $\gamma$, so we have

\[ \ell^a \ell_a = 0, \quad \ell^a m_a = 0, \quad m^a m_a = 0, \quad \bar{m}^a m_a = -1, \quad \ell^a = \bar{\ell}^a \]

where it is assumed that each of $\ell^a$, $m^a$ is parallel-propagated along $\gamma$:

\[ \ell^a \nabla_a \ell^b = 0, \quad \ell^a \nabla_a m^b = 0 \]

($\nabla_a$ denoting covariant derivative). The spin-coefficient quantities

\[ \rho = m^a \bar{m}^b \nabla_a \ell_b \quad \text{and} \quad \sigma = m^a m^b \nabla_a \ell_b \]

are of importance. Here, the real part of $\rho$ measures the convergence of the congruence and the imaginary part defines its rotation; $\sigma$ measures its shear, where the argument of $\sigma$ defines the direction (perpendicular to $\gamma$) of the axis of shear, and whose strength is defined by $|\sigma|$ (see Penrose and Rindler (1986) for a graphic description of these quantities). Defining propagation derivative along $\gamma$ by

\[ D = \ell^a \nabla_a \]

we can write the Sachs equations as

\[ D\rho = \rho^2 + \sigma\bar{\sigma} + \Phi \]
\[ D\sigma = 2\rho\sigma + \Psi \]

where $\Phi = -(1/2) R_{ab} \ell^a \ell^b$ and $\Psi = C_{abcd} \ell^a m^b \ell^c m^d$, conventions for the Ricci tensor $R_{ab}$ and the Weyl tensor $C_{abcd}$ being those of General Relativity: Overview (and of Penrose and Rindler (1984)). We note that it is the real Ricci component $\Phi$ which governs the propagation of the divergence and the complex Weyl component $\Psi$ which governs the propagation of shear, though there are some nonlinear terms. The quantity $\Phi$ is normally taken non-negative, since it measures the energy flux across $\gamma$ (with, in fact, $\Phi = 4\pi G T_{ab} \ell^a \ell^b$, where $T_{ab}$ is the energy tensor). The condition that $\Phi \ge 0$ at all points of spacetime and for all null directions $\ell^a$, is called the "weak energy condition." (Again there is a minor discrepancy with Hawking and Ellis (1973) who adopt a somewhat stronger "weak energy condition," which is the above but where $\ell^a$ is also allowed to be future-timelike. Unfortunately, with this terminology, their "weak energy condition" is not strictly weaker than their "strong energy condition.")

It will now be assumed that $\rho$ is real:

\[ \rho = \bar{\rho} \]

which is always the case for propagation along the generators of a null hypersurface. The weak energy condition then has an important implication for us. We find that if A is an element of 2-surface area within the plane spanned by the real and imaginary parts of $m^a$, then (this area element being propagated by D along the lines $\gamma$)

\[ D A^{1/2} = -\rho A^{1/2} \]
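The first Sachs equation and the area law just stated can be explored numerically. The following is a minimal sketch under simplifying assumptions that are not part of the text: the congruence is taken to be shear-free ($\sigma = 0$, $\Psi = 0$), $\Phi$ is taken constant along $\gamma$, and the area element is normalized so that $A^{1/2} = 1$ at affine parameter u = 0. Writing $x = A^{1/2}$, differentiating $Dx = -\rho x$ once more and substituting $D\rho = \rho^2 + \Phi$ gives $D^2 x = -\Phi x$, which stays regular where $\rho$ itself blows up. The function names `caustic_parameter` and `analytic_caustic` are hypothetical, chosen only for this illustration.

```python
import math


def caustic_parameter(rho0, phi=0.0, du=1e-3, u_max=10.0):
    """Affine parameter at which x = A**0.5 first reaches zero, or None.

    Assumes a shear-free congruence with constant Phi (illustrative only).
    Integrates D x = p, D p = -phi * x with x(0) = 1, p(0) = -rho0,
    using a classical fourth-order Runge-Kutta step.
    """
    def deriv(x, p):
        return p, -phi * x

    u, x, p = 0.0, 1.0, -rho0
    while u < u_max:
        k1x, k1p = deriv(x, p)
        k2x, k2p = deriv(x + 0.5 * du * k1x, p + 0.5 * du * k1p)
        k3x, k3p = deriv(x + 0.5 * du * k2x, p + 0.5 * du * k2p)
        k4x, k4p = deriv(x + du * k3x, p + du * k3p)
        x_new = x + du * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p_new = p + du * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        if x_new <= 0.0:
            # Linear interpolation for the zero crossing inside this step.
            return u + du * x / (x - x_new)
        u, x, p = u + du, x_new, p_new
    return None  # no caustic found before u_max


def analytic_caustic(rho0, phi=0.0):
    """Closed-form caustic location for rho0 > 0 under the same assumptions."""
    if phi == 0.0:
        return 1.0 / rho0
    k = math.sqrt(phi)
    return math.atan(k / rho0) / k


if __name__ == "__main__":
    for phi in (0.0, 1.0):
        print(phi, caustic_parameter(2.0, phi), analytic_caustic(2.0, phi))
```

In this simplified setting an initially converging congruence ($\rho > 0$ at u = 0) reaches zero area no later than $u = 1/\rho$, and a positive $\Phi$ only brings that point closer; an initially diverging congruence with $\Phi = 0$ never focuses, so the function returns `None`.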

As a consequence, assuming $\Phi \ge 0$,

\[ D^2 A^{1/2} = -(\sigma\bar{\sigma} + \Phi) A^{1/2} \le 0 \]

This tells us that once the divergence ($-\rho$) becomes negative, then the area element must reduce to zero sometime in the future along $\gamma$, assuming that $\gamma$ is future-null-complete in the sense that it extends to indefinitely large values of an affine parameter u defined along it, where an affine parameter associated with the parallel-propagated $\ell^a$ satisfies

\[ \ell^a \nabla_a u = 1 \]

Such a place where the cross-sectional area pinches down to zero is a singularity of the congruence or null hypersurface, referred to as a "caustic." (There are also terminological confusions arising from different authors defining the term "caustic" in slightly different ways. The terminology used here is slightly discrepant from that of Arnol'd (1992) (Chapter 3).)

From this property, it follows that if we have a trapped surface S, then every generator of $\partial I^+(S)$, if extended indefinitely into the future, must eventually encounter a caustic. This, so far, tells us nothing about actual singularities in the spacetime M; even Minkowski space contains many null hypersurfaces with multitudes of caustic points. However, caustics do tell us something significant about sets like $\partial I^+(S)$, which are the boundaries of future sets, and we come to this shortly.

Causality Properties

First, consider the basic causal relations. If a and b are two points of M, then if there is a nontrivial future-timelike curve in M from a to b we say that a "chronologically" precedes b and write

\[ a \ll b \]

(so it would be possible for some observer's world line to encounter first a and then b). If there is a future-null curve in M from a to b (trivial or otherwise), we say that a "causally" precedes b and write

\[ a \prec b \]

(so it would be possible for a signal to get from a to b). We have the following elementary properties (see Penrose (1972)):

\[ a \prec a \]
\[ \text{if } a \ll b \text{ then } a \prec b \]
\[ \text{if } a \ll b \text{ and } b \ll c \text{ then } a \ll c \]
\[ \text{if } a \ll b \text{ and } b \prec c \text{ then } a \ll c \]
\[ \text{if } a \prec b \text{ and } b \ll c \text{ then } a \ll c \]
\[ \text{if } a \prec b \text{ and } b \prec c \text{ then } a \prec c \]

We generalize the definition of $I^+(S)$, above, to an arbitrary subset Q in M, obtaining the chronological future $I^+(Q)$ and past $I^-(Q)$ of Q in M by

\[ I^+(Q) = \{ q \mid p \ll q \text{ for some } p \in Q \} \]
\[ I^-(Q) = \{ q \mid q \ll p \text{ for some } p \in Q \} \]

(the notation $\{ q \mid \text{some property of } q \}$ denotes the set of q's with the stated property), and the causal future $J^+(Q)$ and past $J^-(Q)$ of Q in M by

\[ J^+(Q) = \{ q \mid p \prec q \text{ for some } p \in Q \} \]
\[ J^-(Q) = \{ q \mid q \prec p \text{ for some } p \in Q \} \]

The $I^\pm(Q)$ are always open sets, but the $J^\pm(Q)$ are not always closed (though they are for any closed set Q in Minkowski space). Thus, the sets $I^\pm(Q)$ have a more uniform character than the $J^\pm(Q)$, and it is simpler to concentrate, here, on the $I^\pm(Q)$ sets.

The boundary $\partial I^+(Q)$ of $I^+(Q)$ has an elegant characterization:

\[ \partial I^+(Q) = \{ q \mid I^+(q) \subseteq I^+(Q), \text{ but } q \notin I^+(Q) \} \]

and the corresponding statement holds for $\partial I^-(Q)$. Boundaries of futures also have a relatively simple structure, as is exhibited in the following result (for which there is also a version with past and future interchanged):

Lemma Let $Q \subset M$ be closed, and $p \in \partial I^+(Q) - Q$, then there exists a null geodesic on $\partial I^+(Q)$ with future endpoint at p and which either extends along $\partial I^+(Q)$ indefinitely into the past, or until it reaches a point of Q. It can only extend into the future along $\partial I^+(Q)$ if p is not a caustic point of $\partial I^+(Q)$.

Beyond a caustic point, the null geodesic would enter into the interior of $I^+(Q)$, but this also happens (more commonly) when crossing another region of null hypersurface on $\partial I^+(Q)$.

We wish to apply this to $\partial I^+(S)$, for a trapped surface S, but we first need a further assumption that S lies in the interior of the (future) domain of dependence $D^+(H)$ of some spacelike hypersurface H. This region is defined as the totality of points q for which every timelike curve with future endpoint q can be extended into the past until it meets H. One can consider domains of dependence for regions H other than smooth spacelike surfaces, but it is usual to assume, more generally, that H is a closed achronal set, where "achronal" means that H contains no pair of points a, b for which $a \ll b$. We find that every point q in the interior $\mathrm{int}\,D^+(H)$ of $D^+(H)$ has the further property that all null curves into the past from q will also eventually meet H if extended sufficiently. The physical significance of $D^+(H)$ is that, for fields with locally Lorentz-invariant and deterministic evolution equations, the (appropriate) initial data on H will fix the fields throughout $D^+(H)$ (and also

throughout the similarly defined past domain of dependence $D^-(H)$). We find that the future Cauchy horizon $H^+(H)$, which is the future boundary of $D^+(H)$ defined by

\[ H^+(H) = \overline{D^+(H)} - I^-(D^+(H)), \]

has properties similar to the boundary of a past set, in accordance with the above lemma, and likewise for the past Cauchy horizon $H^-(H)$, defined correspondingly.

Singularity Theorems and Related Questions

Now, applying our lemma to $\partial I^+(S)$, for a trapped surface $S \subset \mathrm{int}\,D^+(H)$, we find that every one of its points lies on a null-geodesic segment $\gamma$ on $\partial I^+(S)$, with past endpoint on S (for if $\gamma$ did not terminate at S it would have to reach H, which is impossible). Assuming future-null completeness and weak energy ($\Phi \ge 0$), we conclude that if extended far enough into the future, the family of such null geodesics $\gamma$ must encounter a caustic, and therefore they must leave $\partial I^+(S)$ and enter $I^+(S)$. We finally conclude that $\partial I^+(S)$ must be a compact topological 3-manifold. Using basic theorems, we construct an everywhere timelike vector field in $\mathrm{int}\,D^+(H)$ which provides a (1–1) continuous map from the compact $\partial I^+(S)$ to H, yielding a contradiction if H is noncompact, thereby establishing the following (Penrose 1965, 1968):

Theorem The requirement that there be a trapped surface which, together with its closed future, lies in the interior of the domain of dependence of a noncompact spacelike hypersurface, is incompatible with future null completeness and the weak energy condition.

We notice that this "singularity theorem" gives no indication of the nature of the failure of future null completeness in a spatially open spacetime subject to weak positivity of energy and containing a trapped surface. The natural assumption is that in an actual physical situation of such gravitational collapse, the failure of completeness would arise at places where curvatures mount to such extreme values that classical general relativity breaks down, and must be replaced by the appropriate "quantum geometry" (see Quantum Geometry and its Applications, etc.).

Hawking (1965) showed how this theorem (in time-reversed form) could also be applied on a cosmological scale to provide a strong argument that the Big-Bang singularity of the standard cosmologies is correspondingly stable. He subsequently introduced techniques from "Morse theory" which could be applied to timelike rather than just null geodesics and, using arguments applied to Cauchy horizons, was able to remove assumptions concerning domains of dependence (e.g., Hawking (1967)). A later theorem (Hawking and Penrose 1970) encompassed most of the earlier ones and had, as one of its implications, that virtually all spatially closed universe models, satisfying a reasonable energy condition and without closed timelike curves, would have to be singular, in this sense of "incompleteness," but again the topological-type arguments used give little indication of the nature or location of the singularities.

Another issue that is not addressed by these arguments is whether the singularities arising from gravitational collapse are inevitably "hidden," as in Figure 1, by the presence of a horizon, a conjecture referred to as "cosmic censorship" (see Penrose (1969, 1998)). Without this assumption, one cannot deduce that gravitational collapse, in which a trapped surface forms, will lead to a black hole, or to the alternative which would be a "naked singularity." There are many results in the literature having a bearing on this issue, but it still remains open.

A related issue is that of strong cosmic censorship which has to do with the question of whether singularities might be observable to local observers. Roughly speaking, a naked singularity would be one which is "timelike," whereas the singularities in black holes might in general be expected to be spacelike (or future-null), and in the Big Bang, spacelike (or past-null). There are ways of characterizing these distinctions purely causally, in terms of past sets or future sets (sets Q for which $Q = I^-(Q)$ or $Q = I^+(Q)$); see Penrose (1998). If (strong) cosmic censorship is valid, so there are no timelike singularities, the remaining singularities would be cleanly divided into past-type and future-type. In the observed universe, there appears to be a vast difference between the structure of the two, which is intimately connected with the second law of thermodynamics, there appearing to be an enormous constraint on the Weyl curvature (see General Relativity: Overview) in the initial singularities but not in the final ones.

Despite the likelihood of singularities arising in their time evolution, it is possible to set up initial data for the Einstein vacuum equations for a wide variety of complicated spatial topologies (see Einstein Equations: Initial Value Formulation). On the observational side, however, there seems to be little evidence for anything other than Euclidean spatial topology in our actual universe (which includes black holes). Speculation on the nature of spacetime at the tiniest scales, where quantum gravity might be relevant, often involves non-Euclidean topology. It may be noted that an early theorem of Geroch established that the constraints of classical Lorentzian geometry do not permit the spatial topology to change without violations of causality (closed timelike curves).
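Returning to the spherically symmetric model of Figure 1, the trapped-surface condition that enters the theorem can be checked directly from the Eddington–Finkelstein metric form given earlier. The sketch below is illustrative only: it works in units G = c = 1, restricts attention to radial directions, and assumes that the future direction is the one in which v does not decrease. The function names are hypothetical. A 2-sphere of radius r counts as trapped here when r decreases along both future-directed radial null tangents, so that its area, proportional to $r^2$, initially shrinks for both the ingoing and the "outgoing" flash.

```python
def interval(r, m, dv, dr):
    """ds^2 for a purely radial displacement (dv, dr) at radius r (G = c = 1)."""
    return (1.0 - 2.0 * m / r) * dv**2 - 2.0 * dv * dr


def radial_null_tangents(r, m):
    """Future-directed radial null tangents (dv, dr) at radius r.

    Assumes v is non-decreasing towards the future (illustrative convention).
    """
    ingoing = (0.0, -1.0)                # tangent to a curve v = const.
    outgoing = (2.0, 1.0 - 2.0 * m / r)  # tangent to dr/dv = (1 - 2m/r) / 2
    return ingoing, outgoing


def is_trapped(r, m):
    """True if r decreases along both future-directed radial null directions."""
    return all(dr < 0.0 for _, dr in radial_null_tangents(r, m))


if __name__ == "__main__":
    m = 1.0
    for r in (1.0, 2.0, 3.0):
        print(r, is_trapped(r, m))
```

With m = 1, a sphere at r = 1 (inside the horizon) satisfies the condition, whereas spheres at r = 2 (on the horizon, the marginal case) and r = 3 (outside) do not, in line with the characterization of the 0 < r < 2m region.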
Spectral Sequences 623

See also: Asymptotic Structure and Conformal Infinity; Boundaries for Spacetimes; Computational Methods in General Relativity: The Theory; Cosmology: Mathematical Aspects; Critical Phenomena in Gravitational Collapse; Einstein Equations: Exact Solutions; Einstein Equations: Initial Value Formulation; General Relativity: Overview; Geometric Analysis and General Relativity; Lorentzian Geometry; Quantum Cosmology; Quantum Geometry and its Applications; Spinors and Spin Coefficients; Stationary Black Holes.

Further Reading

Beem JK, Ehrlich PE, and Easley KL (1996) Global Lorentzian Geometry, 2nd edn. New York: Marcel Dekker.
Eddington AS (1924) A comparison of Whitehead's and Einstein's formulas. Nature 113: 192.
Finkelstein D (1958) Past–future asymmetry of the gravitational field of a point particle. Physical Review 110: 965–967.
Hawking SW (1965) Physical Review Letters 15: 689.
Hawking SW (1967) The occurrence of singularities in cosmology III. Causality and singularities. Proceedings of the Royal Society (London) A 300: 187–.
Hawking SW and Ellis GFR (1973) The Large-Scale Structure of Space-Time. Cambridge: Cambridge University Press.
Hawking SW and Penrose R (1970) The singularities of gravitational collapse and cosmology. Proceedings of the Royal Society (London) A 314: 529–548.
Oppenheimer JR and Snyder H (1939) On continued gravitational contraction. Physical Review 56: 455–459.
Penrose R (1965) Gravitational collapse and space-time singularities. Physical Review Letters 14: 57–59.
Penrose R (1972) Techniques of Differential Topology in Relativity. CBMS Regional Conference Series in Applied Mathematics, no. 7. Philadelphia: SIAM.
Penrose R and Rindler W (1984) Spinors and Space-Time, Vol. 1: Two-Spinor Calculus and Relativistic Fields. Cambridge: Cambridge University Press.
Penrose R and Rindler W (1984) Spinors and Space-Time, Vol. 2: Spinor and Twistor Methods in Space-Time Geometry. Cambridge: Cambridge University Press.
Penrose R (1968) Structure of space-time. In: DeWitt CM and Wheeler JA (eds.) Battelle Rencontres, 1967 Lectures in Mathematics and Physics. New York: Benjamin.
Penrose R (1969) Gravitational collapse: the role of general relativity. Rivista del Nuovo Cimento Serie I, Numero Speciale 1: 252–276.
Penrose R (1998) The question of cosmic censorship. In: Wald RM (ed.) Black Holes and Relativistic Stars. Chicago, IL: University of Chicago Press. (Reprinted in (1999) Journal of Astrophysics and Astronomy 20: 233–248.)

Special Lagrangian Submanifolds see Calibrated Geometry and Special Lagrangian Submanifolds

Spectral Sequences
P Selick, University of Toronto, Toronto, ON, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Spectral sequences are a tool for collecting and distilling the information contained in an infinite number of long exact sequences. Their most common use is the calculation of homology by filtering the object under study and using a spectral sequence to pass from knowledge of the homology of the filtration quotients to that of the object itself. This article will discuss the construction of spectral sequences and the notion of convergence including conditions sufficient to guarantee convergence. Some sample applications of spectral sequences are given.

A differential on an abelian group $G$ is a self-map $d : G \to G$ such that $d^2 = 0$. A morphism of differential groups is a map $f : G \to G'$ such that $d'f = fd$. The condition $d^2 = 0$ guarantees that $\operatorname{Im} d \subset \operatorname{Ker} d$, so to the differential group $(G, d)$ we can associate its homology, $H(G, d) := \operatorname{Ker} d/\operatorname{Im} d$. Often $G$ has extra structure and we require $d$ to satisfy some compatibility condition in order that $H(G, d)$ should also have this structure. For example, a differential graded Lie algebra $(L, d)$ requires a differential $d$ which satisfies the condition $d[x, y] = [dx, y] + (-1)^{|x|}[x, dy]$. While, for simplicity, throughout this article we will always assume that $G$ is an abelian group, the concepts are readily extended to the case where $G$ is an object of some abelian category and generalizations to nonabelian situations have also been studied.

An important example of extra structure is the case where $G = \bigoplus_{n=-\infty}^{\infty} G_n$ is a graded abelian group. The appropriate compatibility condition for a differential graded group is that $d$ should be homogeneous of degree $-1$. That is, $d(G_n) \subset G_{n-1}$. In many contexts it is more natural to use superscripts and regard $d$ as having degree $+1$; the two concepts are equivalent via the reindexing convention $G^n := G_{-n}$. Another important example is that
where $G$ forms a graded algebra, meaning that it has a multiplication $G_n \otimes G_k \to G_{n+k}$. To form a differential graded algebra, in addition to having degree $-1$, $d$ is required to satisfy the Leibniz rule $d(xy) = d(x)y + (-1)^{|x|} x\,d(y)$ (where $|x|$ denotes the degree of $x$) familiar from the differentiation of differential forms.

In many cases, $G$ itself is not the main object of interest, but is a relatively large and complicated object, $G = G(X)$, formed by applying some functor $G$ to the object $X$ being studied. For example, $X$ might be some manifold and $G$ could be the set of all differential forms on $X$ with the exterior derivative as $d$. The presumption is that $H(G(X))$ carries the information we want about $X$ in a much simpler form than the whole of $G(X)$.

A spectral sequence (Leray 1946) is defined simply as a sequence $((E^r, d^r))_{r = n_0, n_0+1, \ldots}$ of differential abelian groups such that $E^{r+1} = H(E^r, d^r)$. By reindexing, we could always arrange that $n_0 = 1$, but sometimes it is more natural to begin with some other integer. If all terms $(E^r, d^r)$ of the spectral sequence have the appropriate additional structure, we might refer, for example, to a spectral sequence of Lie algebras. If there exists $N$ such that $E^r = E^N$ for all $r \ge N$ (equivalently $d^r = 0$ for all $r \ge N$), the spectral sequence is said to "collapse" at $E^N$.

The definition of spectral sequence is so broad that we can say almost nothing of interest about them without putting on some additional conditions. We will begin by considering the most common type of spectral sequence, historically the one that formed the motivating example: the spectral sequence of a filtered chain complex.

Filtered Objects

To study a complicated object $X$, it often helps to filter $X$ and study it one filtration at a time. A filtration $F X$ of a group $X$ is a nested collection of subgroups

$$F X := \cdots \subset F_n X \subset F_{n+1} X \subset \cdots \subset X, \qquad -\infty < n < \infty.$$

A morphism $f : F X \to F Y$ of filtered groups is a homomorphism $f : X \to Y$ such that $f(F_n(X)) \subset F_n(Y)$. The groups $F_n X/F_{n-1} X$ are called the "filtration quotients" and their direct sum $\operatorname{Gr}(F X) := \bigoplus_n F_n X/F_{n-1} X$ is called the associated graded group of the filtered group $F X$. In cases where $X$ has additional structure, we might define special types of filtrations satisfying some compatibility conditions so that $\operatorname{Gr}(F X)$ inherits the additional structure. For example, an algebra filtration of an algebra $X$ is defined as one for which $(F_n X)(F_k X) \subset F_{n+k} X$.

Since our plan is to study $X$ by computing $\operatorname{Gr}(F X)$, the first question we need to consider is what conditions we need to place on our filtration so that $\operatorname{Gr}(F X)$ retains enough information to recover $X$. Our experience from the "5-lemma" suggests that the appropriate way to phrase the requirement is to ask for conditions on the filtrations which are sufficient to conclude that $f : X \to Y$ is an isomorphism whenever $f : F X \to F Y$ is a morphism of filtered groups for which the induced $\operatorname{Gr}(f) : \operatorname{Gr}(X) \to \operatorname{Gr}(Y)$ is an isomorphism.

It is clear that $\operatorname{Gr} F X$ can tell us nothing about $X - (\bigcup X_n)$ so we require that $X = \bigcup X_n$. Similarly we need that $\bigcap X_n = 0$. However, the latter condition is insufficient as can be seen from the following example.

Example 1 Let $X := \bigoplus_{k=1}^{\infty} \mathbb{Z}$ and $Y := \prod_{k=1}^{\infty} \mathbb{Z}$. Set

$$F_n X := \begin{cases} X & \text{if } n \ge 0 \\ \bigoplus_{k=-n}^{\infty} \mathbb{Z} & \text{if } n < 0 \end{cases} \qquad F_n Y := \begin{cases} Y & \text{if } n \ge 0 \\ \prod_{k=-n}^{\infty} \mathbb{Z} & \text{if } n < 0 \end{cases}$$

and let $f : X \to Y$ be the inclusion. Then $\operatorname{Gr}(f)$ is an isomorphism but $f$ is not.

To phrase the appropriate condition we need the concept of algebraic limits. Given a sequence of objects $\{X_n\}_{n \in \mathbb{Z}}$ and morphisms $f_n : X_n \to X_{n+1}$ in some category, the "direct limit" or "colimit" of the sequence, written $\varinjlim_n F_n X$, is an object $X$ together with morphisms $g_n : X_n \to X$ satisfying $g_{n+1} \circ f_n = g_n$, having the universal property that given any object $X'$ together with maps $g'_n : X_n \to X'$ satisfying $g'_{n+1} \circ f_n = g'_n$, there exists a unique morphism $h : X \to X'$ such that $g'_n = h \circ g_n$ for all $n$. By the usual categorical argument the object $X$, if it exists, is unique up to isomorphism. The dual concept, "inverse limit" or simply "limit" of the sequence, written $\varprojlim_n F_n X$, is obtained by reversing the directions of the morphisms. For intuition, we note that these notions share, with the notion of limits of sequences in calculus, the properties that changing the terms $X_n$ only for $n < N$ does not affect $\varinjlim_n F_n X$, and if the sequence stabilizes at $N$ (i.e., the morphisms $f_n$ are isomorphisms for all $n \ge N$), then $\varinjlim_n F_n X \cong X_N$. Similarly $\varprojlim_n F_n X$ depends only upon behavior of the sequence as $n \to -\infty$. Limits over partially ordered sets other than $\mathbb{Z}$ can also be taken but we shall not need them in this article. Although limits need not exist in general, in the category of abelian groups, both the direct and inverse limit exist for any sequence and are given explicitly by the following constructions. $\varinjlim_n F_n X = \bigoplus X_n/\!\sim$ where, letting $i_k : X_k \to \bigoplus X_n$ be the
canonical inclusion, the equivalence relation is generated by $i_n(x) \sim i_{n+1} f(x)$ for $x \in X_n$. $\varprojlim_n F_n X = \{(x_n) \in \prod X_n \mid f_n(x_n) = x_{n+1}\ \forall n\}$.

The condition needed is that our filtrations should be bicomplete, defined as follows. $F X$ is called "cocomplete" if the canonical map $\varinjlim_n F_n X \to X$ is an isomorphism and $F X$ is called "complete" if $X \to \varprojlim_n X/F_n X$ is an isomorphism. $F X$ is called bicomplete if it is both complete and cocomplete. Note that $F X$ cocomplete is equivalent to $\bigcup F_n X = X$ but $F X$ complete is stronger than $\bigcap F_n X = 0$.

Theorem 1 (Comparison theorem). Let $F X$ be bicomplete and let $F Y$ be cocomplete with $\bigcap F_n Y = 0$. Suppose that $f : F X \to F Y$ is a morphism such that $\operatorname{Gr}(f) : \operatorname{Gr}(X) \to \operatorname{Gr}(Y)$ is an isomorphism. Then $f : X \to Y$ is an isomorphism.

Filtered Chain Complexes

A chain complex $(C, d)$ of abelian groups consists of abelian groups $C_n$ for $n \in \mathbb{Z}$ together with homomorphisms $d_n : C_n \to C_{n-1}$ such that $d_n \circ d_{n+1} = 0$ for all $n$. To the chain complex $(C, d)$ we can associate the differential (abelian) group $(C_*, d) := \bigoplus_{n=-\infty}^{\infty} C_n$ with $d|_{C_n}$ induced by $d_n$. We often write simply $C$ if the differential is understood. The dual notion in which $d$ has degree $+1$ is called a cochain complex and the concepts are equivalent through our convention $C^n := C_{-n}$.

Theorem 2 (Homology commutes with direct limits). $H(\varinjlim_n C_n) = \varinjlim_n H(C_n)$.

As we shall see later, failure of homology to commute with inverse limits is a source of great complication in working with spectral sequences.

Let $F C$ be a filtered chain complex. In many applications, our goal is to compute $H_*(C)$ from a knowledge of $H_*(F_n C/F_{n-1} C)$ for all $n$. The overall plan, which is not guaranteed to be successful in general, would be:

1. use the given filtration on $C$ to define a filtration on $H_*(C)$,
2. use our knowledge of $H_*(\operatorname{Gr} C)$ to compute $\operatorname{Gr} H_*(C)$,
3. reconstruct $H_*(C)$ from $\operatorname{Gr} H_*(C)$.

To begin, set $F_n(H_* C) := \operatorname{Im}\,(s_n)_*$, where $s_n : F_n(C) \to C$ is the inclusion (chain) map from the filtration. The spectral sequence which we will define for this situation can be regarded as a method of keeping track of the information contained in the infinite collection of long exact homology sequences coming from the short exact sequences $0 \to F_{n-1} C \to F_n C \to F_n C/F_{n-1} C \to 0$. When working with a long exact sequence, knowledge of two of every three terms gives a handle on computing the remaining terms but does not, in general, completely determine those terms, which explains intuitively why we have some reason to hope that a spectral sequence might be useful and also why it is not guaranteed to solve our problem.

Before proceeding with our motivating example, we digress to discuss spectral sequences formed from exact couples.

Exact Couples

In this section, we will define exact couples, show how to associate a spectral sequence to an exact couple, and discuss some properties of spectral sequences coming from exact couples. As we shall see, a filtered chain complex gives rise to an exact couple and we will examine this spectral sequence in greater detail.

Exact couples were invented by Massey and many books use them as a convenient method of constructing spectral sequences. Other books bypass discussion of exact couples and define the spectral sequence coming from a filtered chain complex directly.

Definition 1 An "exact couple" consists of a triangle

$$\begin{array}{ccc} D & \xrightarrow{\ i\ } & D \\ & {}_{k}\nwarrow \quad \swarrow_{j} & \\ & E & \end{array}$$

containing abelian groups $D$ and $E$ together with homomorphisms $i$, $j$, $k$ such that the diagram is exact at each vertex.

In the following, to avoid conflicting notation considering the many superscripts and subscripts which will be needed, we use the convention that an $n$-fold composition will be written $f^{\circ n}$ rather than the usual $f^n$.

Given an exact couple, set $d := jk : E \to E$. By exactness, $kj = 0$, so $d^2 = jkjk = 0$ and therefore $(E, d)$ forms a differential group. To the exact couple we can associate another exact couple, called its derived couple, as follows. Set $D' := \operatorname{Im} i \subset D$ and $E' := H(E, d)$. Define $i' := i|_{D'}$ and let $j' : D' \to E'$ be given by $j'(iy) := \overline{j(y)}$, where $\bar{x}$ denotes the equivalence class of $x$. The map $k' : E' \to D'$ is defined by $k'(\bar{z}) := kz$. One checks that the maps $j'$ and $k'$ are well defined and that $(D', E', i', j', k')$ forms an exact couple. Therefore, from our original exact couple, we can inductively form a sequence of exact couples $(D^r, E^r, i^r, j^r, k^r)_{r=1}^{\infty}$ with $D^1 := D$, $E^1 := E$, $D^r := (D^{r-1})'$
and $E^r := (E^{r-1})'$. This gives a spectral sequence $(E^r, d^r)_{r=1}^{\infty}$ with $d^r = j^r k^r$.

To the filtered chain complex $F C$, we can associate an exact couple as follows. Set $D := \bigoplus_{p,q} D_{p,q}$ where $D_{p,q} = H_{p+q}(F_p C)$ and $E := \bigoplus_{p,q} E_{p,q}$ where $E_{p,q} = H_{p+q}(F_p C/F_{p-1} C)$. The long exact homology sequences coming from the sequences $0 \to F_{p-1} C \xrightarrow{a} F_p C \xrightarrow{b} F_p C/F_{p-1} C \to 0$ give rise, for each $p$ and $q$, to maps $a_* : D_{p-1,q+1} \to D_{p,q}$, $b_* : D_{p,q} \to E_{p,q}$, and $\partial : E_{p,q} \to D_{p-1,q}$. Define $i : D \to D$ to be the map whose restriction to $D_{p-1,q+1}$ is the composition of $a_*$ with the canonical inclusion $D_{p,q} \to D$. Similarly, define $j : D \to E$ and $k : E \to D$ to be the maps whose restrictions to each summand are the compositions of $b_*$ and $\partial$ with the inclusions. The indexing scheme for the bigradations is motivated by the fact that in many applications it causes all of the nonzero terms to appear in the first quadrant, so it is the most common choice, although one sometimes sees other conventions.

There is actually a second exact couple we could associate to $F C$, which yields the same spectral sequence: use the same $E$ as above but replace $D$ by $\bigoplus D_{p,q}$ with $D_{p,q} = H_{p+q+1}(C/F_p C)$, and define $i$, $j$, and $k$ in a manner similar to that above.

When dealing with cohomology rather than homology, the usual starting point would be a system of inclusions of cochain complexes $\cdots \subset F^{n+1} C \subset F^n C \subset F^{n-1} C \subset \cdots \subset C$. This can be reduced to the previous case by replacing the cochain complex $C$ by a chain complex $C_*$ using the convention $C_p := C^{-p}$ and filtering the result by $F_n C_* := F^{-n} C$. The usual practice, equivalent to the above followed by a rotation of 180°, is to leave the original indices and instead reverse the arrows in the exact couple. In this case, it is customary to write $D_r^{p,q}$ and $E_r^{p,q}$ for the terms in the exact couple and spectral sequence.

In applications, it is often the case that $E^1$ is known and that our goal includes computing $D^1$. The example of the filtered chain complex with the assumption that we know $H_*(F_p C/F_{p-1} C)$ for all $p$ is fairly typical.

Since each $D^r$ is contained in $D^{r-1}$ and each $E^r$ is a subquotient of $E^{r-1}$, the terms of these exact couples get smaller as we progress. To get properties of the spectral sequence, we need to examine this process and, in particular, analyze that which remains in the spectral sequence as we let $r$ go to infinity.

For $x \in E$, if $dx = 0$ then $\bar{x}$ belongs to $E^2$ and so $d^2(\bar{x})$ is defined. In the following, we shall usually simplify the notation by writing simply $x$ in place of $\bar{x}$ and writing $d^r x = 0$ to mean "$d^r x$ is defined and equals 0."

If $dx = 0, \ldots, d^{r-1} x = 0$, then $x$ represents an element of $E^r$ and $d^r x$ is defined. Set $Z^r := \{x \in E \mid d^m x = 0\ \forall m \le r\}$. Then $E^{r+1} \cong Z^r/\!\sim$ where $x \sim y$ if there exists $z \in E$ such that for some $t \le r$ we have $d^m z = 0$ for $m < t$ (thus $d^t z$ is defined) and $d^t z = x - y$. With this as motivation, we set $Z^\infty := \bigcap_r Z^r = \{x \in E \mid d^m x = 0\ \forall m\}$ (known as the "infinite cycles") and define $E^\infty := Z^\infty/\!\sim$ where $x \sim y$ if there exists $z \in E$ such that for some $t$ we have $d^m z = 0$ for $m < t$ and $d^t z = x - y$.

Notice that $D^{r+1} = \operatorname{Im} i^{\circ r} \cong D/\operatorname{Ker} i^{\circ r}$. There is no analog of this statement for $r = \infty$. Instead we have separate concepts so we set $D^\infty := D/\bigcup_r \operatorname{Ker} i^{\circ r}$ and ${}^\infty D := \bigcap_r \operatorname{Im} i^{\circ r}$. The analog of the $r$th-derived exact couple when $r = \infty$ is the following exact sequence.

Theorem 3 There are maps induced by $i$, $j$, and $k$ producing an exact sequence

$$0 \to D^\infty \xrightarrow{\ i^\infty\ } D^\infty \xrightarrow{\ j^\infty\ } E^\infty \xrightarrow{\ {}^\infty k\ } {}^\infty D \xrightarrow{\ {}^\infty i\ } {}^\infty D$$

The fact that we were able to add the 0 term to the left of this sequence but not the right can be traced to the fact that $\varinjlim$ preserves exactness but $\varprojlim$ does not.

In our motivating example, the terms of the initial exact couple came with a bigrading $D = \bigoplus D_{p,q}$ and $E = \bigoplus E_{p,q}$ and writing $|f|$ for the bidegree of a morphism $f$ we had: $|i| = (1, -1)$; $|j| = (0, 0)$; $|k| = (-1, 0)$; $|d| = (-1, 0)$. It follows that $|i^r| = (1, -1)$; $|j^r| = (-r+1, r-1)$; $|k^r| = (-1, 0)$; $|d^r| = (-r, r-1)$ which is considered the standard bigrading for a bigraded exact couple. Similarly, the standard bigrading for a bigraded spectral sequence is one such that $|d^r| = (-r, r-1)$.

We observed earlier that terms of an exact couple and its corresponding spectral sequence get smaller as $r \to \infty$ as each is a subquotient of its predecessor. Note that the bigrading is such that this applies to each pair of coordinates individually (e.g., $E^{r+1}_{p,q}$ is a subquotient of $E^r_{p,q}$) and so in particular if the $p, q$-position ever becomes 0 that position remains 0 forevermore.

Convergence of Graded Spectral Sequences

As noted earlier, the definition of spectral sequence is so broad that we need to put some conditions on our spectral sequences to make them useful as a computational tool. From now on, we will restrict attention to spectral sequences arising from exact couples in which $D = \bigoplus D_p$ and $E = \bigoplus E_p$ are graded with $i|_{D_p} \subset D_{p+1}$, $j|_{D_p} \subset E_p$, and $k|_{E_p} \subset D_{p-1}$. All the spectral sequences which have been studied
to date satisfy this condition and in fact most also have a second gradation as in the case of our motivating example. To see how to proceed, we examine that case more closely.

For a filtered chain complex $F C$ with structure maps $s_p : F_p C \to C$ we defined $F_p(H_*(C)) = \operatorname{Im}\,(s_p)_*$. If $x = i^{\circ(r-1)} y$ belongs to

$$D^r_{p,q} = \operatorname{Im}\bigl(i^{\circ(r-1)} : H_{p+q}(F_{p-r+1} C) \to H_{p+q}(F_p C)\bigr)$$

then $(s_p)_* x = (s_p)_* i^{\circ(r-1)} y = (s_{p+1})_* i^{\circ r} y = (s_{p+1})_* i x$. Therefore, we have a commutative diagram

$$\begin{array}{ccc} D^r_{p,q} & \longrightarrow & F_p(H_{p+q}(C)) \\ \downarrow{\scriptstyle i} & & \downarrow \\ D^{r+1}_{p+1,q-1} & \longrightarrow & F_{p+1}(H_{p+q}(C)) \end{array}$$

yielding a map

$$D^{r+1}_{p+1,q-1}/D^r_{p,q} \to F_{p+1}\bigl(H_{p+q}(C)\bigr)/F_p\bigl(H_{p+q}(C)\bigr) = \operatorname{Gr}_{p+1}\bigl(H_{p+q}(C)\bigr)$$

Letting $r$ go to infinity, we get an induced map $\phi : D^\infty/i^\infty(D^\infty) \to \operatorname{Gr}(H(C))$.

Theorem 4 If $F H(C)$ is cocomplete then
(i) $D^\infty_n = F_n(H(C))$;
(ii) $\phi : D^\infty/i^\infty(D^\infty) \to \operatorname{Gr}(H(C))$ is an isomorphism;
(iii) There is an exact sequence $0 \to \operatorname{Gr}(H(C)) \xrightarrow{\ j^\infty\ } E^\infty \xrightarrow{\ {}^\infty k\ } {}^\infty D \xrightarrow{\ {}^\infty i\ } {}^\infty D$.

We say that the spectral sequence $(E^r)$ "abuts" to $F L$ if there is an isomorphism $\operatorname{Gr} L \to E^\infty$. Here we mean an isomorphism of graded abelian groups, which makes sense since under our assumptions $E^r$ inherits a grading from $E^1$ for each $r$. If in addition the filtration on $L$ is cocomplete, we say that $(E^r)$ "weakly converges" to $F L$ and if it is bicomplete we say that $(E^r)$ "converges" (or strongly converges) to $F L$. The notation $(E^r) \Rightarrow F L$ (or simply $(E^r) \Rightarrow L$ when the filtration on $L$ is either understood or unimportant) is often used in connection with convergence but there is no universal agreement as to which of the three concepts (abuts, weakly converges, or converges) it refers to! In this article, we will also use the expression $(E^r)$ "quasiconverges" to $F L$ to mean that the spectral sequence weakly converges to $F L$ with $\bigcap_n F_n L = 0$. (Note: the terminology quasiconverges is nonstandard although the concept has appeared in the literature, sometimes under the name converges.)

While it would be overstating things to claim that convergence of the spectral sequence shows that $E^\infty$ determines $H(C)$, it is clear that convergence is what we need in order to expect that $E^\infty$ contains enough information to possibly reconstruct $H(C)$. The sense in which this is true is stated more precisely in the following theorem.

Theorem 5 (Spectral sequence comparison theorem). Let $f = (f^r) : (E^r) \to (\tilde{E}^r)$ be a morphism of spectral sequences.
(i) If $f^N : E^N \to \tilde{E}^N$ is an isomorphism for some $N$, then $f^r$ is an isomorphism for all $r \ge N$ (including $r = \infty$).
(ii) Suppose in addition that $(E^r)$ converges to $F X$ and $(\tilde{E}^r)$ quasiconverges to $F \tilde{X}$. Let $\psi : F X \to F \tilde{X}$ be a morphism of filtered abelian groups which is compatible with $f$ (i.e., there exist isomorphisms $\eta : \operatorname{Gr} X \cong E^\infty$ and $\tilde{\eta} : \operatorname{Gr} \tilde{X} \cong \tilde{E}^\infty$ such that $f^\infty \circ \eta = \tilde{\eta} \circ \operatorname{Gr}(\psi)$). Then $\psi : X \to \tilde{X}$ is an isomorphism.

Within the constraints provided by Theorem 5, a spectral sequence might have many limits. A typical calculation of some group $Y$ by means of spectral sequences might proceed as an application of Theorem 5 along the lines of the following plan.

1. Subgroups $F_n Y$ forming a filtration of $Y$ are defined, although usually not computable at this point. The subgroups are chosen in a manner that seems natural bearing in mind that to be useful it will be necessary to show convergence properties.
2. Directly or by means of an exact couple, a spectral sequence is defined in a manner that seems to be related to the filtration.
3. Some early term of the spectral sequence (usually $E^1$ or $E^2$) is calculated explicitly and the differentials $d^r$ are calculated successively resulting in a computation of $E^\infty$.
4. With the aid of the knowledge of $E^\infty$, a conjecture $Y = G$ is formulated for some $G$.
5. A suitable filtration on $G$ and a map of filtrations $F G \to F Y$ or $F Y \to F G$ are defined.
6. The spectral sequence arising from $F G$ is demonstrated to converge to $G$.
7. The original spectral sequence is demonstrated to converge to $Y$ and Theorem 5 is applied.

The hardest steps are usually (3) and (7). For step (3), in most cases the calculations require knowledge which cannot be obtained from the spectral sequence itself, although the spectral sequence machinery plays its role in distilling the information and pointing the way to exactly what needs to be calculated. Steps (4)–(6) are frequently very easy, and often not stated explicitly, with "by construction of $G$" being the most common justification of (6). We now discuss the types of considerations involved in step (7).

Convergence of a spectral sequence to a desired $L$ can be difficult to verify in general partly because
the conditions are stated in terms of some filtration (usually understood only in a theoretical sense) on an initially unknown $L$ rather than in terms of properties of the spectral sequence itself or an exact couple from which it arose. Theorems 2 and 4(ii) give us the following extremely important special case in which we can conclude convergence to $H(C)$ of the spectral sequence for $F C$ based on conditions that are often easily checked.

Theorem 6 If $F C$ is a filtered chain complex such that $F C$ is cocomplete and there exists $M$ such that $H(F_n C) = 0$ for $n < M$, then the spectral sequence for $F C$ converges to $H(C)$.

Although the second hypothesis, which implies that ${}^\infty D = 0$, is very strong, it handles the large number of commonly used filtrations which are 0 in negative degrees.

Under the conditions of Theorem 6, inserting the bigradings into Theorem 4 gives a short exact sequence $0 \to D^\infty_{p-1,q+1} \to D^\infty_{p,q} \to E^\infty_{p,q} \to 0$ with $D^\infty_{p,q} \cong F_p(H_{p+q}(C))$; equivalently

$$F_k(H_n(C))/F_{k-1}(H_n(C)) \cong E^\infty_{k,n-k}$$

Thus, the only $E^\infty$-terms relevant to the computation of $H_n(C)$ are those on the diagonal $p + q = n$. In the important case of a first quadrant spectral sequence ($E^r_{p,q} = 0$ if $p < 0$ or $q < 0$), the number of nonzero terms on any diagonal is finite so the $E^\infty$-terms on the diagonal $p + q = n$ give a finite composition series for each $H_n(C)$.

Here is an elementary example of an application of a spectral sequence.

Example 2 Let $S_*(\ )$ denote the singular chain complex, let $H_*(\ ) := H_*(S_*(\ ))$ denote singular homology, and let $H_*^{\mathrm{cell}}(\ )$ denote cellular homology. Let $X$ be a CW-complex with $n$-skeleton $X^{(n)}$. The inclusions $S_*(X^{(n)}) \to S_*(X)$ yield a filtration on $S_*(X)$. In the associated spectral sequence,

$$E^1_{p,q} = H_{p+q}\bigl(X^{(p)}/X^{(p-1)}\bigr) = \begin{cases} \text{free abelian group on the } p\text{-cells of } X & \text{if } q = 0 \\ 0 & \text{if } q \ne 0 \end{cases}$$

The differential

$$d^1_{p,0} : H_p\bigl(X^{(p)}/X^{(p-1)}\bigr) \to H_{p-1}\bigl(X^{(p-1)}/X^{(p-2)}\bigr)$$

is the definition of the differential in cellular homology. Therefore,

$$E^2_{p,q} = \begin{cases} H_p^{\mathrm{cell}}(X) & \text{if } q = 0 \\ 0 & \text{if } q \ne 0 \end{cases}$$

Looking at the bidegrees, the domain or range of $d^2_{p,q}$ is zero for each $p$ and $q$ so $d^2 = 0$, and similarly $d^r = 0$ for all $r > 2$. Therefore, the spectral sequence collapses with $E^2 = E^\infty$. The spectral sequence converges to $H_*(X)$ so the terms on the diagonal $p + q = n$ form a composition series for $H_n(X)$. Since the $(n, 0)$ term is the only nonzero term on this diagonal, $H_n(X) \cong H_n^{\mathrm{cell}}(X)$. That is, "cellular homology equals singular homology."

Returning to the general situation, set $L_\infty := \varinjlim_n D_n$ and $L_{-\infty} := \varprojlim_n D_n$. Filter $L_\infty$ by $F_n L_\infty := \operatorname{Im}(D_n \to L_\infty)$ and filter $L_{-\infty}$ by $F_n L_{-\infty} := \operatorname{Ker}(L_{-\infty} \to D_n)$. It follows from the definitions that $F_n L_\infty = D^\infty_n$ and so $D^\infty_n/i^\infty(D^\infty_{n-1}) = \operatorname{Gr}_n L_\infty$. At the other end, the canonical map $L_{-\infty} \to D_n$ lifts to ${}^\infty D_n$ yielding an injection $L_{-\infty}/F_n L_{-\infty} \to {}^\infty D_n$. Therefore, for each $n$ there is an injection $\operatorname{Gr}_n L_{-\infty} \to K_n$ where $K_n = \operatorname{Ker}({}^\infty D_{n-1} \to {}^\infty D_n)$. In general, the map $L_{-\infty} \to {}^\infty D_n$ need not be surjective (an element could be in the image of $i^{\circ r}$ for each finite $r$ without being part of a consistent infinite sequence), although it is surjective in the special case when ${}^\infty D_s \to {}^\infty D_{s+1}$ is surjective for each $s$. In the latter case we get $\operatorname{Gr} L_{-\infty} \cong K$. As we will see in the next section, the exact sequence of Theorem 3 extends to the right (Theorem 8) giving $\varprojlim^1_r Z^r = 0$ as a sufficient condition that ${}^\infty D_s \to {}^\infty D_{s+1}$ be surjective for each $s$, where $\varprojlim^1$ is described in that section and $(Z^r)$ refers to the system of inclusions $\cdots \subset Z^{r+1} \subset Z^r \subset Z^{r-1} \subset \cdots$. Thus, $\varprojlim^1_r Z^r = 0$ is a sufficient condition for $\operatorname{Gr} L_{-\infty} \cong K$.

Taking into account the short exact sequence $0 \to D^\infty/i^\infty(D^\infty) \to E^\infty \to K \to 0$ coming from Theorem 3, the preceding discussion yields two obvious candidates for a suitable $F L$: $F L_\infty$ or $F L_{-\infty}$. In theory there are other possibilities, but in practice one of these two cases usually occurs. We examine them individually and see what additional conditions are required for convergence.

Case I: Conditions for convergence to $F L_\infty$ It is easily checked from the definitions that $\varinjlim_n D^\infty_n = \varinjlim_n D_n$ so $F L_\infty$ is always cocomplete. Therefore, besides $\operatorname{Gr} L_\infty \cong E^\infty$ (equivalently, $K = 0$), it is required to verify that $F L_\infty$ is complete. As we will see in the next section, the completeness condition can be restated as $\bigcap D_n = 0$ and $\varprojlim^1_n D_n = 0$. According to the preceding discussion, under the assumption that $L_{-\infty} = \bigcap D_n = 0$, which we need anyway as part of the requirement that $F L_\infty$ be complete, $\varprojlim^1_r Z^r = 0$ is sufficient to show $K = 0$.

Case II: Conditions for convergence to $F L_{-\infty}$ Any inverse limit is complete in its canonical filtration, so $F L_{-\infty}$ is always complete and the issues are whether $\operatorname{Gr} L_{-\infty} \cong E^\infty$ and whether $F L_{-\infty}$ is cocomplete. $F L_{-\infty}$ is cocomplete if and only if every element of
$L_{-\infty}$ lies in $\operatorname{Ker}(L_{-\infty} \to D_n)$ for some $n$, for which a sufficient condition is that $L_\infty = 0$ or equivalently $E^\infty \cong K$. Therefore, if the reason for the isomorphism $\operatorname{Gr} L_{-\infty} \cong E^\infty$ is that the maps $E^\infty \to K$ and $\operatorname{Gr} L_{-\infty} \to K$ are isomorphisms, then the rest of the convergence conditions are automatic. In particular, to deduce convergence to $F L_{-\infty}$ it suffices to know that $L_\infty = 0$ and $\varprojlim^1_r Z^r = 0$.

Derived Functors

The left and right derived functors $L_n T$, $R^n T$ of a functor $T$ provide a measure of the amount by which the functor deviates from preserving exactness.

The category $\mathcal{I}nv$ of inverse systems indexed over $\mathbb{Z}$ (i.e., the category whose objects are diagrams of abelian groups $\cdots \to A_{n-1} \to A_n \to A_{n+1} \to \cdots$) forms an abelian category in which a sequence of morphisms $A' \to A \to A''$ is exact if and only if the sequence $A'_n \to A_n \to A''_n$ of abelian groups is exact for each $n$. The functor of interest to us is $\varprojlim : \mathcal{I}nv \to \mathcal{AB}$ where $\mathcal{AB}$ denotes the category of abelian groups.

Let $T : \mathcal{A} \to \mathcal{B}$ be an additive functor between abelian categories. Suppose that $X$ in $\operatorname{Obj} \mathcal{A}$ has an injective resolution $I_X$. The definition of additive functor implies that $T$ takes zero morphisms to zero morphisms, so $T I_X$ forms a cochain complex in $\mathcal{B}$. The right derived functors of $T$ are defined by $(R^n T)(X) := H^n(T I_X)$. The result is independent of the choice of injective resolution (assuming one exists) and satisfies:

1. If $T$ is "left exact" (meaning that $T$ preserves kernels, and in particular monomorphisms), then $R^0 T(X) = T(X)$;
2. If $T$ preserves exactness, then $(R^n T)(X) = 0$ for $n > 0$.

Theorem 7 Let $0 \to X' \to X \to X'' \to 0$ be a short exact sequence in $\mathcal{A}$. Suppose $T$ is left exact and that all the objects have injective resolutions. Then there is a (long) exact sequence

$$0 \to T(X') \to T(X) \to T(X'') \to (R^1 T)(X') \to \cdots \to (R^{n-1} T)(X'') \to (R^n T)(X') \to (R^n T)(X) \to (R^n T)(X'') \to \cdots$$

Similarly, the left derived functors of $T$ are defined by using projective resolutions and have similar properties with respect to the obvious duality.

The functor $\varprojlim_n$ is left exact and in the category $\mathcal{I}nv$ every object has an injective resolution. Therefore $\varprojlim^q_n$ is defined and $\varprojlim^0_n X_n = \varprojlim_n X_n$, where $\varprojlim^q_n$ denotes the derived functor $R^q(\varprojlim_n)$. It turns out that $\varprojlim^q_n$ is 0 for $q > 1$, but we are particularly interested in $\varprojlim^1_n$.

Let $(X_n)$ be an inverse system with structure maps $i_{n-1} : X_{n-1} \to X_n$. An explicit construction for $\varprojlim^1_n X_n$ is as follows. Define $\phi : \prod_n X_n \to \prod_n X_n$ by letting $\phi((x_n))$ be the sequence whose $n$th component is $(x_n - i_{n-1} x_{n-1})$. Then $\varprojlim^1_n X_n \cong \operatorname{Coker} \phi$. Observe that $\operatorname{Ker} \phi \cong \varprojlim_n X_n$ according to the explicit formula for $\varprojlim_n X_n$ given earlier.

Recall that we defined ${}^\infty D = \bigcap_r \operatorname{Im} i^{\circ r} \cong \varprojlim_r D^r$. The exact sequence of Theorem 3 can be extended to give:

Theorem 8 There is an exact sequence

$$0 \to D^\infty \xrightarrow{\ i\ } D^\infty \xrightarrow{\ j\ } E^\infty \xrightarrow{\ k\ } {}^\infty D \xrightarrow{\ i\ } {}^\infty D \xrightarrow{\ j\ } \varprojlim{}^1_r Z^r \xrightarrow{\ k\ } \varprojlim{}^1_r D^r \xrightarrow{\ i\ } \varprojlim{}^1_r D^r \to 0$$

It is clear from the explicit construction that if the system $(X_n)$ stabilizes with $X_n = G$ for all sufficiently small $n$, then $\varprojlim_n X = G$ and $\varprojlim^1_n X = 0$. If the spectral sequence collapses at any stage then the system $(Z^r)$ stabilizes at that point, and so for a spectral sequence which collapses, the condition $\varprojlim^1_r Z^r = 0$, which arose in the discussion of convergence in the previous section, is automatic.

Let $F X$ be a filtered abelian group. Applying Theorem 7 to the short exact sequence $0 \to F_n X \to X \to X/F_n X \to 0$ of inverse systems gives an exact sequence

$$0 \to \varprojlim_n F_n X \to \varprojlim_n X \to \varprojlim_n X/F_n X \to \varprojlim{}^1_n F_n X \to \varprojlim{}^1_n X$$

Since $\varprojlim_n X = X$ and $\varprojlim^1_n X = 0$, we get

Theorem 9 $F X$ is complete if and only if $\varprojlim_n F_n X = 0$ and $\varprojlim^1_n F_n X = 0$.

When working with $\varprojlim^1_n$ the following sufficient condition for its vanishing, known as the Mittag–Leffler condition, is often useful.

Theorem 10 Suppose $A$ is an inverse system in which for each $n$ there exists $k(n) \le n$ such that $\operatorname{Im}(A_i \to A_n)$ equals $\operatorname{Im}(A_{k(n)} \to A_n)$ for all $i \le k(n)$. Then $\varprojlim^1_n A = 0$.

Of course, this will not be (directly) useful in establishing $\varprojlim^1_n F_n X = 0$ since the structure maps in that system are all monomorphisms.

Some Examples of Standard Spectral Sequences and Their Use

To this point we have considered the general theory of spectral sequences. The properties of the spectral sequences arising in many specific situations have

been well studied. Usually the spectral sequence would be defined either directly, through an exact couple, or by giving some filtration on a chain complex. This defines the $E^1$-term. Typically, a theorem would then be proved giving some formula for the resulting $E^2$-term. In many cases, conditions under which the spectral sequence converges may also be well known.

In this section, we shall take a brief look at the Serre spectral sequence, Atiyah–Hirzebruch spectral sequence, spectral sequence of a double complex, Grothendieck spectral sequence, change of ring spectral sequence, and Eilenberg–Moore spectral sequence, and carry out a few sample calculations.

Serre Spectral Sequence

Let $F \to X \xrightarrow{\pi} B$ be a fiber bundle (or more generally a fibration) in which the base $B$ is a CW-complex. Define a filtration on the total space by $F_n X := \pi^{-1}(B^{(n)})$. This yields a filtration on $H_*(X)$ by setting $F_n H_*(X) := \mathrm{Im}(H_*(F_n X) \to H_*(X))$. The spectral sequence coming from the exact couple in which $D^1_{p,q} := H_{p+q}(F_p X)$ and $E^1_{p,q} := H_{p+q}(F_p X, F_{p-1} X)$ is called the "Serre spectral sequence" of the fibration. Theorems from topology guarantee that this filtration is cocomplete and that $E^1_{p,q} = 0$ if either $p < 0$ or $q < 0$. Therefore, the Serre spectral sequence is always a first quadrant spectral sequence converging to $H_*(X)$.

Theorem 11 (Serre). In the Serre spectral sequence of the fibration $F \to X \to B$ there is an isomorphism $E^2_{p,q} \cong H_p(B; {}^{t}H_q(F))$.

Here ${}^{t}H_*(F)$ denotes a "twisted" or "local" coefficient system in which the differential is modified to take into account the action, coming from the fibration, of the fundamental groupoid of the base $B$ on the fiber $F$. In the special case where $B$ is simply connected and $\mathrm{Tor}(H_*(B), H_*(F)) = 0$, the "universal coefficient theorem" says that the $E^2$-term reduces to $E^2_{p,q} \cong H_p(B) \otimes H_q(F)$.

The Serre spectral sequence for cohomology, $E_2^{p,q} \cong H^p(B; {}^{t}H^q(F)) \Rightarrow H^{p+q}(X)$, has the advantage that it is a spectral sequence of algebras. This greatly simplifies calculation of the differentials $d_r$, which are restricted by the requirement that they satisfy the Leibniz rule with respect to the cup product on $H^*(B)$ and $H^*(F)$, and it also allows the computation of the cup product on $H^*(X)$. Since it is a first quadrant spectral sequence, convergence is not an issue.

Frequently in applications of the Serre spectral sequence, instead of using the spectral sequence to calculate $H_*(X)$ from knowledge of $H_*(F)$ and $H_*(B)$ it is instead $H_*(X)$ and one of the other two homologies which is known, and one is working backwards from the spectral sequence to find the homology of the third space.

Example 3 The universal $S^1$-bundle is the bundle $S^1 \to S^\infty \to \mathbb{CP}^\infty$ where $S^\infty$ is contractible. We will calculate $H^*(\mathbb{CP}^\infty)$ from the Serre spectral sequence of this bundle, taking $H^*(S^1)$ and $H^*(S^\infty)$ as known. We also take as known that $\mathbb{CP}^\infty$ is path connected, so $H^0(\mathbb{CP}^\infty) \cong \mathbb{Z}$.

$$E_2^{p,q} \cong H^p(\mathbb{CP}^\infty) \otimes H^q(S^1) \cong \begin{cases} H^p(\mathbb{CP}^\infty) & \text{if } q = 0 \text{ or } 1 \\ 0 & \text{otherwise} \end{cases}$$

The $E_\infty$-terms on the diagonal $p + q = n$ form a composition series for $H^n(S^\infty)$, which is zero for $n \ne 0$. Therefore $E_\infty^{p,q} = 0$ unless $p = 0$ and $q = 0$, with $E_\infty^{0,0} \cong \mathbb{Z}$. Because all nonzero terms lie in the first quadrant, the bidegrees of the differentials show that $d_r(E_r^{1,0}) = 0$ for all $r \ge 2$, so $0 = E_\infty^{1,0} = E_2^{1,0} = H^1(\mathbb{CP}^\infty)$. Since $E_2^{1,q} \cong E_2^{1,0} \otimes E_2^{0,q}$, it follows that $E_2^{1,q} = 0$ for all $q$. Taking into account the known zero terms, the bidegrees of the differentials show that $E_3^{0,1} \cong \mathrm{Ker}(d_2 : E_2^{0,1} \to E_2^{2,0})$ and $E_\infty^{0,1} = E_3^{0,1}$. Similarly, $E_\infty^{2,0} = E_3^{2,0} \cong \mathrm{Coker}(d_2 : E_2^{0,1} \to E_2^{2,0})$. Therefore, the vanishing of these $E_\infty$-terms shows that $d_2 : E_2^{0,1} \cong E_2^{2,0}$ and in particular $H^2(\mathbb{CP}^\infty) \cong E_2^{0,1} = H^1(S^1) \cong \mathbb{Z}$. It follows that $E_2^{2,q} \cong \mathbb{Z} \otimes E_2^{0,q} \cong E_2^{0,q}$ for all $q$. With the aid of the fact that we showed $E_2^{1,1} = 0$, we can repeat the argument used to show $E_2^{1,q} = 0$ for all $q$ to conclude that $E_2^{3,q} = 0$ for all $q$. Repeating the procedure, we inductively find that $E_2^{p,q} \cong E_2^{p-2,q}$ for all $p > 0$ and all $q$ and in particular

$$H^n(\mathbb{CP}^\infty) \cong \begin{cases} \mathbb{Z} & \text{if } n \text{ is even} \\ 0 & \text{if } n \text{ is odd} \end{cases}$$

The cup products in $H^*(\mathbb{CP}^\infty)$ can also be determined by taking advantage of the fact that the spectral sequence is a spectral sequence of algebras. Let $a \in E_2^{0,1} \cong \mathbb{Z}$ be a generator and set $x := d_2 a$. By the preceding calculation, $d_2$ is an isomorphism so $x$ is a generator of $H^2(\mathbb{CP}^\infty)$. Therefore, $x \otimes a$ is a generator of $E_2^{2,1}$ and the isomorphism $d_2$ gives that $d_2(x \otimes a)$ is a generator of $H^4(\mathbb{CP}^\infty)$. However, $d_2(x \otimes a) = d_2\big((x \otimes 1)(1 \otimes a)\big) = d_2(x \otimes 1)\,(1 \otimes a) + (-1)^2 (x \otimes 1)\, d_2 a = 0 + x^2 \otimes 1$ and thus $x^2$ is a generator of $H^4(\mathbb{CP}^\infty)$. Inductively, it follows that $x^n$ is a generator of $H^{2n}(\mathbb{CP}^\infty)$ for all $n$ and so $H^*(\mathbb{CP}^\infty) \cong \mathbb{Z}[x]$.

When working backwards from the Serre or other first quadrant spectral sequences in which $E^2_{p,q} \cong E^2_{p,0} \otimes E^2_{0,q}$ the following analog of the comparison theorem (Theorem 5) is often useful.

Theorem 12 (Zeeman comparison theorem). Let $E$ and $E'$ be first quadrant spectral sequences such that $E^2_{p,q} = E^2_{p,0} \otimes E^2_{0,q}$ and $E'^2_{p,q} = E'^2_{p,0} \otimes E'^2_{0,q}$. Let $f : E \to E'$ be a homomorphism of spectral sequences such that $f^2_{p,q} = f^2_{p,0} \otimes f^2_{0,q}$. Suppose that $f^\infty_{p,q} : E^\infty_{p,q} \to E'^\infty_{p,q}$ is an isomorphism for all $p$ and $q$. Then the following are equivalent:

(i) $f^2_{p,0} : E^2_{p,0} \to E'^2_{p,0}$ is an isomorphism for $p \le n - 1$;
(ii) $f^2_{0,q} : E^2_{0,q} \to E'^2_{0,q}$ is an isomorphism for $q \le n$.

There is a version of the Serre spectral sequence for generalized homology theories coming from the exact couple obtained by applying the generalized homology theory to the Serre filtration of $X$.

Theorem 13 (Serre spectral sequence for generalized homology). Let $F \to X \to B$ be a fibration and let $Y$ be an (unreduced) homology theory satisfying the Milnor wedge axiom. Then there is a (right half-plane) spectral sequence with $E^2_{p,q} \cong H_p(B; {}^{t}Y_q(F))$ converging to $Y_{p+q}(X)$.

Cocompleteness of the filtration follows from the properties of generalized homology theories satisfying the wedge axiom (Milnor 1962), and the rest of the convergence conditions are trivial since the filtration is 0 in negative degrees. Here, unlike the Serre spectral sequence for ordinary homology, the existence of terms in the fourth quadrant opens the possibility for composition series of infinite length, although in the case where $B$ is a finite-dimensional complex all the nonzero terms of the spectral sequence will live in the strip between $p = 0$ and $p = \dim B$ and so the filtrations will be finite.

The special case of the fibration $* \to X \to X$ yields what is known as the "Atiyah–Hirzebruch spectral sequence".

Theorem 14 (Atiyah–Hirzebruch spectral sequence). Let $X$ be a CW-complex and let $Y$ be an (unreduced) homology theory satisfying the Milnor wedge axiom. Then there is a (right half-plane) spectral sequence with $E^2_{p,q} \cong H_p(X; Y_q(*))$ converging to $Y_{p+q}(X)$.

In the cohomology Serre spectral sequence for generalized cohomology (including the cohomology Atiyah–Hirzebruch spectral sequence), convergence of the spectral sequence to $Y^*(X)$ is not guaranteed. Convergence to $\varprojlim_n Y^*(F_n X)$, should that occur, would be of the type discussed in case II in the section "Convergence of graded spectral sequences". Since $X_n = \emptyset$ for $n < 0$, the system defining $L^\infty$ stabilizes to 0. Therefore, $L^\infty = 0$ and, by the discussion in that section, $\varprojlim^1_r Z_r X = 0$ becomes a sufficient condition for convergence to $\varprojlim_n Y^*(F_n X)$. However, since the real object of study is usually $Y^*(X)$, the spectral sequence is most useful when one is also able to show $\varprojlim^1_n Y^*(F_n X) = 0$, in which case the Milnor exact sequence (Milnor 1962)

$$0 \to \varprojlim^1_n Y^*(F_n X) \to Y^*(X) \to \varprojlim_n Y^*(F_n X) \to 0$$

gives $Y^*(X) \cong \varprojlim_n Y^*(F_n X)$.

If $Y^*(\ )$ has cup products then the spectral sequence has the extra structure of a spectral sequence of $Y^*(*)$-algebras. In the case where $B$ is finite dimensional, all convergence problems disappear since the spectral sequence lives in a strip and the filtrations are finite.

Example 4 Let $K^*(\ )$ be complex K-theory. Since $K^*(*) \cong \mathbb{Z}[z, z^{-1}]$ with $|z| = 2$, in the Atiyah–Hirzebruch spectral sequence for $K^*(\mathbb{CP}^n)$ we have

$$E_2^{p,q} = \begin{cases} \mathbb{Z} & \text{if } q \text{ is even and } p \text{ is even with } 0 \le p \le 2n \\ 0 & \text{otherwise} \end{cases}$$

Because $\mathbb{CP}^n$ is a finite complex, the spectral sequence converges to $K^*(\mathbb{CP}^n)$. Since all the nonzero terms have even total degree and all the differentials have total degree $+1$, the spectral sequence collapses at $E_2$ and we conclude that $K^q(\mathbb{CP}^n) = 0$ if $q$ is odd and that it has a composition series consisting of $(n + 1)$ copies of $\mathbb{Z}$ when $q$ is even. Since $\mathbb{Z}$ is a free abelian group, this uniquely identifies the group structure of $K^{\mathrm{even}}(\mathbb{CP}^n)$ as $\mathbb{Z}^{n+1}$. To find the ring structure we can make use of the fact that this is a spectral sequence of $K^*(*)$-algebras. The result is $K^*(\mathbb{CP}^n) \cong K^*(*)[x]/(x^{n+1})$, where $|x| = 2$.

In the Atiyah–Hirzebruch spectral sequence for $K^*(\mathbb{CP}^\infty)$ again all the terms have even total degree so the spectral sequence collapses at $E_2$. We noted earlier that collapse of the spectral sequence implies that $\varprojlim^1_r Z_r X = 0$ and so the spectral sequence converges to $\varprojlim_n K^*(\mathbb{CP}^n)$, where we used $F_{2n}\mathbb{CP}^\infty = \mathbb{CP}^n$. Since our preceding calculation shows that $K^*(\mathbb{CP}^n) \to K^*(\mathbb{CP}^{n-1})$ is onto, Mittag–Leffler (Theorem 10) implies that $\varprojlim^1_n K^*(\mathbb{CP}^n) = 0$. Therefore, the spectral sequence converges to $K^*(\mathbb{CP}^\infty)$ and we find that $K^*(\mathbb{CP}^\infty) \cong \varprojlim_n K^*(\mathbb{CP}^n)$, which is isomorphic to the power series ring $K^*(*)[[x]]$, where $|x| = 2$.

In topology one might be interested in the Atiyah–Hirzebruch spectral sequence in the case where $X$ is a spectrum rather than a space (a spectrum being a generalization in which cells in negative degrees are allowed including the possibility that the dimensions

of the cells are not bounded below). In such cases, the spectral sequence is no longer constrained to lie in the right half-plane and convergence criteria are not well understood for either the homology or cohomology version.

Spectral Sequence of a Double Complex

A double complex is a chain complex of chain complexes. That is, it is a bigraded abelian group $C_{p,q}$ together with two differentials $d' : C_{p,q} \to C_{p-1,q}$ and $d'' : C_{p,q} \to C_{p,q-1}$ satisfying $d' \circ d' = 0$, $d'' \circ d'' = 0$, and $d'd'' = d''d'$. Given a double complex $C$ its total complex $\mathrm{Tot}\,C$ is defined by $(\mathrm{Tot}\,C)_n := \bigoplus_{p+q=n} C_{p,q}$ with differential defined by $d|_{C_{p,q}} := d' + (-1)^p d'' : C_{p,q} \to C_{p-1,q} \oplus C_{p,q-1} \subset \mathrm{Tot}_{n-1} C$.

There are two natural filtrations, $F'\mathrm{Tot}\,C$ and $F''\mathrm{Tot}\,C$, on $\mathrm{Tot}\,C$ given by

$$\big(F'_p(\mathrm{Tot}\,C)\big)_n = \bigoplus_{\substack{s+t=n \\ s \le p}} C_{s,t}$$

$$\big(F''_p(\mathrm{Tot}\,C)\big)_n = \bigoplus_{\substack{s+t=n \\ t \le p}} C_{s,t}$$

yielding two spectral sequences abutting to $H_*(\mathrm{Tot}\,C)$. In the first $E'^2_{p,q} = H_p(H_q(C_{*,*}))$ and in the other $E''^2_{p,q} = H_q(H_p(C_{*,*}))$. Convergence of these spectral sequences is not guaranteed, although the first will always converge if there exists $N$ such that $C_{p,q} = 0$ for $p < N$ and the second will converge if there exists $N$ such that $C_{p,q} = 0$ for $q < N$. From the double complex $C$ one could instead form the product total complex $(\mathrm{Tot}^{\Pi} C)_n := \prod_{p+q=n} C_{p,q}$ and proceed in a similar manner to construct the same spectral sequences with different convergence problems. In the important special case of a first quadrant double complex both spectral sequences converge and information is often obtained by playing one off against the other.

Example 5 Let $M$ and $N$ be $R$-modules. Let $\mathrm{Tor}'^{R}_*(M, N)$ and $\mathrm{Tor}''^{R}_*(M, N)$ be the derived functors of $(\ ) \otimes N$ and $M \otimes (\ )$, respectively. Let $P_*$ and $Q_*$ be projective resolutions of $M$ and $N$ respectively. Define a first quadrant double complex by $C_{p,q} := P_p \otimes Q_q$. Since $P_p$ is projective,

$$H_q(C_{p,*}) = P_p \otimes H_q(Q_*) = \begin{cases} 0 & \text{if } q \ne 0 \\ P_p \otimes N & \text{if } q = 0 \end{cases}$$

and so in the first spectral sequence of the double complex,

$$E'^2_{p,q} = \begin{cases} 0 & \text{if } q \ne 0 \\ \mathrm{Tor}'^{R}_p(M, N) & \text{if } q = 0 \end{cases}$$

Therefore, the spectral sequence collapses to give $H_n(\mathrm{Tot}\,C) \cong \mathrm{Tor}'^{R}_n(M, N)$. Similarly, the second spectral sequence shows that $H_n(\mathrm{Tot}\,C) \cong \mathrm{Tor}''^{R}_n(M, N)$. Thus, $\mathrm{Tor}^R_*(M, N)$ can be computed equally well from a projective resolution of either variable.

The technique of using a double complex in which one spectral sequence yields the homology of the total complex, to which both converge, can be used to prove the following.

Theorem 15 (Grothendieck spectral sequence). Let $\mathcal{C} \xrightarrow{F} \mathcal{B} \xrightarrow{G} \mathcal{A}$ be a composition of additive functors, where $\mathcal{C}$, $\mathcal{B}$, and $\mathcal{A}$ are abelian categories. Assume that all objects in $\mathcal{C}$ and $\mathcal{B}$ have projective resolutions. Suppose that $F$ takes projectives to projectives. Then for all objects $C$ of $\mathcal{C}$ there exists a (first quadrant) spectral sequence with $E^2_{p,q} = (L_p G)((L_q F)(C))$ converging to $(L_{p+q}(GF))(C)$.

Naturally, there is a corresponding version for right derived functors.

An application of the Grothendieck spectral sequence is the following "change of rings spectral sequence." Let $f : R \to S$ be a ring homomorphism, let $M$ be a right $S$-module and let $N$ be a left $R$-module. Let $F(A) = S \otimes_R A$ and $G(B) = M \otimes_S B$, and note that $GF(A) = M \otimes_R A$. Applying the Grothendieck spectral sequence to the composition (left $R$-modules $\xrightarrow{F}$ left $S$-modules $\xrightarrow{G}$ abelian groups) yields a convergent spectral sequence $E^2_{p,q} \cong \mathrm{Tor}^S_p(M, \mathrm{Tor}^R_q(S, N)) \Rightarrow \mathrm{Tor}^R_{p+q}(M, N)$.

Eilenberg–Moore Spectral Sequence

For a topological group $G$, Milnor showed how to construct a universal $G$-bundle $G \to EG \to BG$ in which $EG$ is the infinite join $G^{*\infty}$ with diagonal $G$-action. There is a natural filtration $F_n BG := G^{*(n+1)}/G$ on $BG$ and therefore an induced filtration on the base of any principal $G$-bundle. This filtration yields a spectral sequence including as a special case a tool for calculating $H_*(BG)$ from knowledge of $H_*(G)$.

Theorem 16 Let $G \to X \to B$ be a principal $G$-bundle and let $H_*(\ )$ denote homology with coefficients in a field. Then there is a first quadrant spectral sequence with $E^2_{p,q} = \mathrm{Tor}^{H_*(G)}_{pq}(H_*(X), H_*(*))$ converging to $H_{p+q}(BG)$.

Here the group structure makes $H_*(G)$ into an algebra and $\mathrm{Tor}^A_{pq}(M, N)$ denotes degree $q$ of the graded object formed as the $p$th-derived functor of the tensor product of the graded modules $M$ and $N$ over the graded ring $A$.

There is also a version (Eilenberg and Moore 1962) which, like the Serre spectral sequence, is suitable for computing $H^*(G)$ from $H^*(BG)$.

Theorem 17 Let

$$\begin{array}{ccc} W & \longrightarrow & Y \\ \downarrow & & \downarrow{\scriptstyle \pi} \\ X & \xrightarrow{\ f\ } & B \end{array}$$

be a pullback square in which $\pi$ is a fibration and $X$ and $B$ are simply connected. Suppose that $H^*(X)$, $H^*(Y)$, and $H^*(B)$ are flat $R$-modules of finite type, where $H^*(\ )$ denotes cohomology with coefficients in the Noetherian ring $R$. Then there is a (second quadrant) spectral sequence with $E_2^{p,q} \cong \mathrm{Tor}^{pq}_{H^*(B)}(H^*(X), H^*(Y))$ converging to $H^{p+q}(W)$.

The cohomological version of the Eilenberg–Moore spectral sequence, stated above, contains the more familiar Tor for modules over an algebra. For the homological version, one must dualize these notions appropriately to define the cotensor product of comodules over a coalgebra, and its derived functors Cotor.

Provided the action of the fundamental group of $B$ is sufficiently nice there are extensions of the Eilenberg–Moore spectral sequence to the case where $B$ is not simply connected, although they do not always converge, and extensions to generalized (co)homology theories have also been studied.

See also: Cohomology Theories; Derived Categories; K-Theory; Spectral Theory for Linear Operators.

Further Reading

Adams JF (1974) Stable Homotopy and Generalized Homology. Chicago: Chicago University Press.
Atiyah M and Hirzebruch F (1969) Vector bundles and homogeneous spaces. Proceedings of Symposia in Pure Mathematics 3: 7–38.
Boardman JM (1999) Conditionally convergent spectral sequences. Contemporary Mathematics 239: 49–84.
Bousfield AK and Kan DM (1972) Homotopy Limits, Completions and Localization. Lecture Notes in Math, vol. 304. Berlin: Springer.
Cartan H and Eilenberg S (1956) Homological Algebra. Princeton: Princeton University Press.
Eilenberg S and Moore J (1962) Homology and fibrations I. Coalgebras cotensor product and its derived functors. Commentarii Mathematici Helvetici 40: 199–236.
Grothendieck A (1957) Sur quelques points d’algèbre homologique. Tôhoku Mathematical Journal 9: 119–221.
Hilton PJ and Stammbach U (1970) A Course in Homological Algebra. Graduate Texts in Mathematics, vol. 4. Berlin: Springer.
Koszul J-L (1947) Sur l’opérateurs de derivation dans un anneau. Comptes Rendus de l’Académie des Sciences de Paris 225: 217–219.
Leray J (1946) L’anneau d’une representation; Propriétés d’homologie de la projections d’un espace fibré sur sa base; Sur l’anneau d’homologie de l’espace homogène, quotient d’un groupe clos par un sous-groupe abélien, connexe, maximum. C.R. Acad. Sci. 222: 1366–1368; 1419–1422 and 223: 412–415.
Massey W (1952) Exact couples in algebraic topology I, II. Annals of Mathematics 56: 363–396.
Massey W (1953) Exact couples in algebraic topology III, IV, V. Annals of Mathematics 57: 248–286.
McCleary J (2001) User’s Guide to Spectral Sequences. Cambridge Studies in Advanced Mathematics, vol. 58. Cambridge: Cambridge University Press.
Milnor J (1956) Constructions of universal bundles, II. Annals of Mathematics 63: 430–436.
Milnor J (1962) On axiomatic homology theory. Pacific Journal of Mathematics 12: 337–341.
Selick P (1997) Introduction to Homotopy Theory. Fields Institute Monographs, vol. 9. Providence, RI: American Mathematical Society.
Serre J-P (1951) Homologie singulière des espaces fibrés. Annals of Mathematics 54: 425–505.
Smith L (1969) Lectures on the Eilenberg–Moore Spectral Sequence. Lecture Notes in Math, vol. 134. Berlin: Springer.
Zeeman EC (1958) A proof of the comparison theorem for spectral sequences. Proceedings of the Cambridge Philosophical Society 54: 57–62.

Spectral Theory of Linear Operators


M Schechter, University of California at Irvine, Irvine, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

We begin with the study of linear operators on normed vector spaces (for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article). If the scalars are complex numbers, we shall call the space complex. If the scalars are real, we shall call it real.

Let $X$, $Y$ be normed vector spaces. A mapping $A$ which assigns to each element $x$ of a set $D(A) \subset X$ a unique element $y \in Y$ is called an operator (or transformation). The set $D(A)$ on which $A$ acts is called the domain of $A$. The operator $A$ is called linear if

1. $D(A)$ is a subspace of $X$, and
2. $A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 A x_1 + \alpha_2 A x_2$

for all scalars $\alpha_1$, $\alpha_2$ and all elements $x_1, x_2 \in D(A)$.
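As a small illustration of the second condition in the definition of a linear operator, the following Python sketch tests the identity $A(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 A x_1 + \alpha_2 A x_2$ on a few sample vectors of $\mathbb{R}^2$. The helper names (`rotate`, `translate`, `combine`, `satisfies_linearity`) and the sample values are illustrative assumptions, not part of the article. A finite check of this kind can refute linearity but cannot prove it.

```python
from itertools import product

def rotate(v):
    """Quarter-turn of R^2: (a1, a2) -> (-a2, a1). This map is linear."""
    return (-v[1], v[0])

def translate(v):
    """Shift by (1, 0). Affine but not linear, since it moves the origin."""
    return (v[0] + 1, v[1])

def combine(c1, x1, c2, x2):
    """Return the linear combination c1*x1 + c2*x2 of two vectors in R^2."""
    return (c1 * x1[0] + c2 * x2[0], c1 * x1[1] + c2 * x2[1])

def satisfies_linearity(op, vectors, scalars):
    """Check op(c1*x1 + c2*x2) == c1*op(x1) + c2*op(x2) on every sample choice."""
    for x1, x2 in product(vectors, repeat=2):
        for c1, c2 in product(scalars, repeat=2):
            lhs = op(combine(c1, x1, c2, x2))
            rhs = combine(c1, op(x1), c2, op(x2))
            if lhs != rhs:
                return False
    return True

# Integer samples keep the arithmetic exact, so == comparisons are safe.
VECTORS = [(1, 0), (0, 1), (2, -3)]
SCALARS = [0, 1, -2]

print(satisfies_linearity(rotate, VECTORS, SCALARS))     # the rotation passes
print(satisfies_linearity(translate, VECTORS, SCALARS))  # the translation fails
```

The translation already fails for $\alpha_1 = \alpha_2 = 0$, because a linear operator must send the zero vector to the zero vector.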

To begin, we shall only consider operators $A$ with $D(A) = X$.

An operator $A$ is called bounded if there is a constant $M$ such that

$$\|Ax\| \le M\|x\|, \quad x \in X \tag{1}$$

The norm of such an operator is defined by

$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} \tag{2}$$

It is the smallest $M$ which works in [1]. An operator $A$ is called continuous at a point $x \in X$ if $x_n \to x$ in $X$ implies $Ax_n \to Ax$ in $Y$. A bounded linear operator is continuous at each point. For if $x_n \to x$ in $X$, then

$$\|Ax_n - Ax\| \le \|A\| \cdot \|x_n - x\| \to 0$$

We also have

Theorem 1 If a linear operator $A$ is continuous at one point $x_0 \in X$, then it is bounded, and hence continuous at every point.

We let $B(X, Y)$ be the set of bounded linear operators from $X$ to $Y$. Under the norm [2], one easily checks that $B(X, Y)$ is a normed vector space.

The Adjoint Operator

An assignment $F$ of a number to each element $x$ of a vector space is called a functional and denoted by $F(x)$. If it satisfies

$$F(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 F(x_1) + \alpha_2 F(x_2) \tag{3}$$

for $\alpha_1$, $\alpha_2$ scalars, it is called linear. It is called bounded if

$$|F(x)| \le M\|x\|, \quad x \in X \tag{4}$$

If $F$ is a bounded linear functional on a normed vector space $X$, the norm of $F$ is defined by

$$\|F\| = \sup_{x \in X,\ x \ne 0} \frac{|F(x)|}{\|x\|} \tag{5}$$

It is equal to the smallest number $M$ satisfying [4].

For any normed vector space $X$, let $X'$ denote the set of bounded linear functionals on $X$. If $f, g \in X'$, we say that $f = g$ if

$$f(x) = g(x) \quad \text{for all } x \in X$$

The "zero" functional is the one assigning zero to all $x \in X$. We define $h = f + g$ by

$$h(x) = f(x) + g(x), \quad x \in X$$

and $g = \alpha f$ by

$$g(x) = \alpha f(x), \quad x \in X$$

Under these definitions, $X'$ becomes a vector space. The expression

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}, \quad f \in X' \tag{6}$$

is easily seen to be a norm. Thus, $X'$ is a normed vector space. It is therefore natural to ask when $X'$ will be complete. A rather surprising answer is given by

Theorem 2 $X'$ is a Banach space whether or not $X$ is.

(For the definition of a Banach space, see, e.g., Schechter (2002) or the appendix at the end of this article.)

Suppose $X$, $Y$ are normed vector spaces and $A \in B(X, Y)$. For each $y' \in Y'$, the expression $y'(Ax)$ assigns a scalar to each $x \in X$. Thus, it is a functional $F(x)$. Clearly $F$ is linear. It is also bounded since

$$|F(x)| = |y'(Ax)| \le \|y'\| \cdot \|Ax\| \le \|y'\| \cdot \|A\| \cdot \|x\|$$

Thus, there is an $x' \in X'$ such that

$$y'(Ax) = x'(x), \quad x \in X \tag{7}$$

This functional $x'$ is unique. Thus, to each $y' \in Y'$ we have assigned a unique $x' \in X'$. We designate this assignment by $A'$ and note that it is a linear operator from $Y'$ to $X'$. Thus, [7] can be written in the form

$$y'(Ax) = A'y'(x) \tag{8}$$

The operator $A'$ is called the adjoint (or conjugate) of $A$. We note

Theorem 3 $A' \in B(Y', X')$, and $\|A'\| = \|A\|$.

The adjoint has the following easily verified properties:

$$(A + B)' = A' + B' \tag{9}$$

$$(\alpha A)' = \alpha A' \tag{10}$$

$$(AB)' = B'A' \tag{11}$$

Why should we consider adjoints? One reason is as follows. Many problems in mathematics and its applications can be put in the form: given normed vector spaces $X$, $Y$ and an operator $A \in B(X, Y)$, one wishes to solve

$$Ax = y \tag{12}$$

The set of all $y$ for which one can solve [12] is called the "range" of $A$ and is denoted by $R(A)$. The set of all $x$ for which $Ax = 0$ is called the "null space" of $A$ and is denoted by $N(A)$. Since $A$ is linear, it is easily checked that $N(A)$ and $R(A)$ are subspaces of $X$ and $Y$,

respectively (for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article). The dimension of $N(A)$ is denoted by $\alpha(A)$.

If $y \in R(A)$, there is an $x \in X$ satisfying [12]. For any $y' \in Y'$ we have

$$y'(Ax) = y'(y)$$

Taking adjoints we get

$$A'y'(x) = y'(y)$$

If $y' \in N(A')$, this gives $y'(y) = 0$. Thus, a necessary condition that $y \in R(A)$ is that $y'(y) = 0$ for all $y' \in N(A')$. Obviously, it would be of great interest to know when this condition is also sufficient.

The Spectrum and Resolvent Sets

From this point henceforth we shall assume that $X = Y$. We can then speak of the identity operator $I$ defined by

$$Ix = x, \quad x \in X$$

For a scalar $\lambda$, the operator $\lambda I$ is given by

$$\lambda I x = \lambda x, \quad x \in X$$

We shall denote the operator $\lambda I$ by $\lambda$. We shall denote the space $B(X, X)$ by $B(X)$.

For any operator $A \in B(X)$, a scalar $\lambda$ for which $\alpha(A - \lambda) \ne 0$ is called an eigenvalue of $A$. Any element $x \ne 0$ of $X$ such that $(A - \lambda)x = 0$ is called an eigenvector (or eigenelement). The points $\lambda$ for which $(A - \lambda)$ has a bounded inverse in $B(X)$ comprise the resolvent set $\rho(A)$ of $A$ (for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article). If $X$ is a Banach space, it is the set of those $\lambda$ such that $\alpha(A - \lambda) = 0$ and $R(A - \lambda) = X$. The spectrum $\sigma(A)$ of $A$ consists of all scalars not in $\rho(A)$. The set of eigenvalues of $A$ is sometimes called the point spectrum of $A$ and is denoted by $P\sigma(A)$.

We note that

Theorem 4 For $A$ in $B(X)$, $\sigma(A') = \sigma(A)$.

We are now going to examine the sets $\rho(A)$ and $\sigma(A)$ for arbitrary $A \in B(X)$.

Theorem 5 $\rho(A)$ is an open set and hence $\sigma(A)$ is a closed set.

Does every operator $A \in B(X)$ have points in its resolvent set? Yes. In fact, we have

Theorem 6 For $A$ in $B(X)$, set

$$r_\sigma(A) = \inf_n \|A^n\|^{1/n} \tag{13}$$

Then $\rho(A)$ contains all scalars $\lambda$ such that $|\lambda| > r_\sigma(A)$.

Let $p(t)$ be a polynomial of the form

$$p(t) = \sum_{k=0}^{n} a_k t^k$$

Then for any operator $A \in B(X)$, we define the operator

$$p(A) = \sum_{k=0}^{n} a_k A^k$$

where we take $A^0 = I$. We have

Theorem 7 If $\lambda \in \sigma(A)$, then $p(\lambda) \in \sigma(p(A))$ for any polynomial $p(t)$.

Proof Since $\lambda$ is a root of $p(t) - p(\lambda)$, we have

$$p(t) - p(\lambda) = (t - \lambda)q(t)$$

where $q(t)$ is a polynomial with real coefficients. Hence,

$$p(A) - p(\lambda) = (A - \lambda)q(A) = q(A)(A - \lambda) \tag{14}$$

Now, if $p(\lambda)$ is in $\rho(p(A))$, then [14] shows that $\alpha(A - \lambda) = 0$ and $R(A - \lambda) = X$. This means that $\lambda \in \rho(A)$, and the theorem is proved. ∎

A symbolic way of writing Theorem 7 is

$$p(\sigma(A)) \subset \sigma(p(A)) \tag{15}$$

Note that, in general, there may be points in $\sigma(p(A))$ which may not be of the form $p(\lambda)$ for some $\lambda \in \sigma(A)$. As an example, consider the operator on $\mathbb{R}^2$ given by

$$A(\alpha_1, \alpha_2) = (-\alpha_2, \alpha_1)$$

$A$ has no spectrum; $A - \lambda$ is invertible for all real $\lambda$. However, $A^2$ has $-1$ as an eigenvalue. What is the reason for this? It is simply that our scalars are real. Consequently, imaginary numbers cannot be considered as eigenvalues. We shall see later that in order to obtain a more complete theory, we shall have to consider complex Banach spaces. Another question is whether every operator $A \in B(X)$ has points in its spectrum. For complex Banach spaces, the answer is yes.

The Spectral Mapping Theorem

Suppose we want to solve an equation of the form

$$p(A)x = y, \quad x, y \in X \tag{16}$$

where $p(t)$ is a polynomial and $A \in B(X)$. If 0 is not in the spectrum of $p(A)$, then $p(A)$ has an inverse in $B(X)$ and, hence, [16] can be solved for all $y \in X$. So a natural question to ask is: what is the spectrum of $p(A)$? By Theorem 7 we see that it contains $p(\sigma(A))$,

but by the remark at the end of the preceding section it can contain other points. If it were true that

$$p(\sigma(A)) = \sigma(p(A)) \tag{17}$$

then we could say that [16] can be solved uniquely for all $y \in X$ if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(A)$. For a complex Banach space we have

Theorem 8 If $X$ is a complex Banach space, then $\mu \in \sigma(p(A))$ if and only if $\mu = p(\lambda)$ for some $\lambda \in \sigma(A)$, that is, if [17] holds.

Proof We have proved it in one direction already (Theorem 7). To prove it in the other, let $\gamma_1, \ldots, \gamma_n$ be the (complex) roots of $p(t) - \mu$. For a complex Banach space they are all scalars. Thus,

$$p(A) - \mu = c(A - \gamma_1) \cdots (A - \gamma_n), \quad c \ne 0$$

Now suppose that all of the $\gamma_j$ are in $\rho(A)$. Then each $A - \gamma_j$ has an inverse in $B(X)$. Hence, the same is true for $p(A) - \mu$. In other words, $\mu \in \rho(p(A))$. Thus, if $\mu \in \sigma(p(A))$, then at least one of the $\gamma_j$ must be in $\sigma(A)$, say $\gamma_k$. Hence, $\mu = p(\gamma_k)$, where $\gamma_k \in \sigma(A)$. This completes the proof. ∎

Theorem 8 is called the "spectral mapping theorem" for polynomials. As mentioned before, it has the useful consequence:

Corollary 1 If $X$ is a complex Banach space, then eqn [16] has a unique solution for every $y$ in $X$ if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(A)$.

Operational Calculus

Other things can be done in a complex Banach space that cannot be done in a real Banach space. For instance, we can get a formula for $p(A)^{-1}$ when it exists. To obtain this formula, we first note

Theorem 9 If $X$ is a complex Banach space, then $(z - A)^{-1}$ is a complex analytic function of $z$ for $z \in \rho(A)$.

By this, we mean that in a neighborhood of each $z_0 \in \rho(A)$, the operator $(z - A)^{-1}$ can be expanded in a "Taylor series," which converges in norm to $(z - A)^{-1}$, just like analytic functions of a complex variable. Now, by Theorem 6, $\rho(A)$ contains the set $|z| > \|A\|$. We can expand $(z - A)^{-1}$ in powers of $z^{-1}$ on this set. In fact, we have

Lemma 1 If $|z| > \limsup \|A^n\|^{1/n}$, then

$$(z - A)^{-1} = \sum_{n=1}^{\infty} z^{-n} A^{n-1} \tag{18}$$

where the convergence is in the norm of $B(X)$.

Let $C$ be any circle with center at the origin and radius greater than, say, $\|A\|$. Then, by Lemma 1,

$$\oint_C z^n (z - A)^{-1}\,dz = \sum_{k=1}^{\infty} A^{k-1} \oint_C z^{n-k}\,dz = 2\pi i A^n \tag{19}$$

or

$$A^n = \frac{1}{2\pi i} \oint_C z^n (z - A)^{-1}\,dz \tag{20}$$

where the line integral is taken in the right direction. Note that the line integrals are defined in the same way as is done in the theory of functions of a complex variable. The existence of the integrals and their independence of path (so long as the integrands remain analytic) are proved in the same way. Since $(z - A)^{-1}$ is analytic on $\rho(A)$, we have

Theorem 10 Let $C$ be any closed curve containing $\sigma(A)$ in its interior. Then [20] holds.

As a direct consequence of this, we have

Theorem 11 $r_\sigma(A) = \max_{\lambda \in \sigma(A)} |\lambda|$ and $\|A^n\|^{1/n} \to r_\sigma(A)$ as $n \to \infty$.

We can now put Lemma 1 in the following form:

Theorem 12 If $|z| > r_\sigma(A)$, then [18] holds with convergence in $B(X)$.

Now let $b$ be any number greater than $r_\sigma(A)$, and let $f(z)$ be a complex-valued function that is analytic in $|z| < b$. Thus,

$$f(z) = \sum_{k=0}^{\infty} a_k z^k, \quad |z| < b \tag{21}$$

We can define $f(A)$ as follows: the operators

$$\sum_{k=0}^{n} a_k A^k$$

converge in norm, since

$$\sum_{k=0}^{\infty} |a_k| \cdot \|A^k\| < \infty$$

This last statement follows from the fact that if $c$ is any number satisfying $r_\sigma(A) < c < b$, then

$$\|A^k\|^{1/k} \le c$$

for $k$ sufficiently large, and the series

$$\sum_{k=0}^{\infty} |a_k| c^k$$

is convergent. We define $f(A)$ to be

$$\sum_{k=0}^{\infty} a_k A^k \tag{22}$$

By Theorem 10, this gives

$$f(A) = \frac{1}{2\pi i} \sum_{k=0}^{\infty} a_k \oint_C z^k (z - A)^{-1}\,dz = \frac{1}{2\pi i} \oint_C \sum_{k=0}^{\infty} a_k z^k (z - A)^{-1}\,dz = \frac{1}{2\pi i} \oint_C f(z)(z - A)^{-1}\,dz \tag{23}$$

where $C$ is any circle about the origin with radius greater than $r_\sigma(A)$ and less than $b$.

We can now give the formula that we promised. Suppose $f(z)$ does not vanish for $|z| < b$. Set $g(z) = 1/f(z)$. Then $g(z)$ is analytic in $|z| < b$, and hence $g(A)$ is defined. Moreover,

$$f(A)g(A) = \frac{1}{2\pi i} \oint_C f(z)g(z)(z - A)^{-1}\,dz = \frac{1}{2\pi i} \oint_C (z - A)^{-1}\,dz = I$$

Since $f(A)$ and $g(A)$ clearly commute, we see that $f(A)^{-1}$ exists and equals $g(A)$. Hence,

$$f(A)^{-1} = \frac{1}{2\pi i} \oint_C \frac{1}{f(z)} (z - A)^{-1}\,dz \tag{24}$$

In particular, if

$$g(z) = 1/f(z) = \sum_{k=0}^{\infty} c_k z^k, \quad |z| < b$$

then

$$f(A)^{-1} = \sum_{k=0}^{\infty} c_k A^k \tag{25}$$

Now, suppose $f(z)$ is analytic in an open set $\Omega$ containing $\sigma(A)$, but not analytic in a disk of radius greater than $r_\sigma(A)$. In this case, we cannot say that the series [22] converges in norm to an operator in $B(X)$. However, we can still define $f(A)$ in the following way: there exists an open set $\omega$ whose closure $\bar{\omega} \subset \Omega$ and whose boundary $\partial\omega$ consists of a finite number of simple closed curves that do not intersect, and such that $\sigma(A) \subset \omega$. (That such a set always exists is left as an exercise; see, e.g., Schechter (2002).) We now define $f(A)$ by

$$f(A) = \frac{1}{2\pi i} \oint_{\partial\omega} f(z)(z - A)^{-1}\,dz \tag{26}$$

where the line integrals are to be taken in the proper directions. It is easily checked that $f(A) \in B(X)$ and is independent of the choice of the set $\omega$. By [23], this definition agrees with the one given above for the case when $\Omega$ contains a disk of radius greater than $r_\sigma(A)$. Note that if $\Omega$ is not connected, $f(z)$ need not be the same function on different components of $\Omega$.

Now suppose $f(z)$ does not vanish on $\sigma(A)$. Then we can choose $\omega$ so that $f(z)$ does not vanish on $\bar{\omega}$ (this is also an exercise). Thus, $g(z) = 1/f(z)$ is analytic on an open set containing $\bar{\omega}$ so that $g(A)$ is defined. Since $f(z)g(z) = 1$, one would expect that $f(A)g(A) = g(A)f(A) = I$, in which case, it would follow that $f(A)^{-1}$ exists and is equal to $g(A)$. This follows from

Lemma 2 If $f(z)$ and $g(z)$ are analytic in an open set $\Omega$ containing $\sigma(A)$ and

$$h(z) = f(z)g(z)$$

then $h(A) = f(A)g(A)$.

Therefore, it follows that we have

Theorem 13 If $A$ is in $B(X)$ and $f(z)$ is a function analytic in an open set $\Omega$ containing $\sigma(A)$ such that $f(z) \ne 0$ on $\sigma(A)$, then $f(A)^{-1}$ exists and is given by

$$f(A)^{-1} = \frac{1}{2\pi i} \oint_{\partial\omega} \frac{1}{f(z)} (z - A)^{-1}\,dz$$

where $\omega$ is any open set such that

(i) $\sigma(A) \subset \omega$, $\bar{\omega} \subset \Omega$,
(ii) $\partial\omega$ consists of a finite number of simple closed curves, and
(iii) $f(z) \ne 0$ on $\bar{\omega}$.

Now that we have defined $f(A)$ for functions analytic in a neighborhood of $\sigma(A)$, we can show that the spectral mapping theorem holds for such functions as well (see Theorem 8). We have

Theorem 14 If $f(z)$ is analytic in a neighborhood of $\sigma(A)$, then

$$\sigma(f(A)) = f(\sigma(A)) \tag{27}$$

that is, $\mu \in \sigma(f(A))$ if and only if $\mu = f(\lambda)$ for some $\lambda \in \sigma(A)$.

Complexification

What we have just done is valid for complex Banach spaces. Suppose, however, we are dealing with a real Banach space. What can be said then?
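To make the question concrete, the earlier example $A(\alpha_1, \alpha_2) = (-\alpha_2, \alpha_1)$ on $\mathbb{R}^2$ can be examined numerically once complex scalars are allowed. The Python sketch below is only an illustration under stated assumptions: the helper names (`eigenvalues_2x2`, `matmul`, `det`) are hypothetical, the operator is written as a 2×2 matrix acting on column vectors, and the eigenvalues are taken as the complex roots of the characteristic polynomial.

```python
import cmath

def eigenvalues_2x2(m):
    """Complex roots of the characteristic polynomial t^2 - tr(m) t + det(m)."""
    (a, b), (c, d) = m
    trace, determinant = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace + root) / 2, (trace - root) / 2)

def matmul(m, n):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

# A(a1, a2) = (-a2, a1), written as a matrix acting on column vectors.
A = ((0, -1), (1, 0))
A2 = matmul(A, A)

# p(A) for the polynomial p(t) = t^2 + 1.
P_OF_A = tuple(
    tuple(A2[i][j] + (1 if i == j else 0) for j in range(2)) for i in range(2)
)

eigs_A = eigenvalues_2x2(A)    # purely imaginary, so A has no real eigenvalue
eigs_A2 = eigenvalues_2x2(A2)  # both equal to -1, a real eigenvalue of A^2

print(eigs_A)
print(eigs_A2)
print(det(P_OF_A))  # zero, so A^2 + 1 is not invertible
```

Over the reals no scalar $\lambda$ makes $A - \lambda$ singular, yet $p(A) = A^2 + 1$ is singular. The roots of $p$ are exactly the complex eigenvalues found above, which is the behavior the complexification is designed to capture.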

Let $X$ be a real Banach space. Consider the set $Z$ of all ordered pairs $\langle x, y \rangle$ of elements of $X$. We set

$$\langle x_1, y_1 \rangle + \langle x_2, y_2 \rangle = \langle x_1 + x_2, y_1 + y_2 \rangle$$

$$(\alpha + i\beta)\langle x, y \rangle = \langle (\alpha x - \beta y), (\beta x + \alpha y) \rangle, \quad \alpha, \beta \in \mathbb{R}$$

With these definitions, one checks easily that $Z$ is a complex vector space. The set of elements of $Z$ of the form $\langle x, 0 \rangle$ can be identified with $X$. We would like to introduce a norm on $Z$ that would make $Z$ into a Banach space and satisfy

$$\|\langle x, 0 \rangle\| = \|x\|, \quad x \in X$$

An obvious suggestion is

$$(\|x\|^2 + \|y\|^2)^{1/2}$$

However, it is soon discovered that this is not a norm on $Z$ (why?). We have to be more careful. One that works is given by

$$\|\langle x, y \rangle\| = \max_{\alpha^2 + \beta^2 = 1} \left( \|\alpha x - \beta y\|^2 + \|\beta x + \alpha y\|^2 \right)^{1/2}$$

With this norm, $Z$ becomes a complex Banach space having the desired properties.

Now let $A$ be an operator in $B(X)$. We define an operator $\hat{A}$ in $B(Z)$ by

$$\hat{A}\langle x, y \rangle = \langle Ax, Ay \rangle$$

Then

$$\|\hat{A}\langle x, y \rangle\| = \max_{\alpha^2 + \beta^2 = 1} \left( \|\alpha Ax - \beta Ay\|^2 + \|\beta Ax + \alpha Ay\|^2 \right)^{1/2} = \max_{\alpha^2 + \beta^2 = 1} \left( \|A(\alpha x - \beta y)\|^2 + \|A(\beta x + \alpha y)\|^2 \right)^{1/2}$$

This shows that $\lambda \in \rho(\hat{A})$ if and only if $\lambda \in \rho(A)$. Similarly, if $p(t)$ is a polynomial with real coefficients, then

$$p(\hat{A})\langle x, y \rangle = \langle p(A)x, p(A)y \rangle$$

showing that $p(\hat{A})$ has an inverse in $B(Z)$ if and only if $p(A)$ has an inverse in $B(X)$. Hence, we have

Theorem 15 Equation [16] has a unique solution for each $y$ in $X$ if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(\hat{A})$.

In the example given earlier, the operator $\hat{A}$ has eigenvalues $i$ and $-i$. Hence, $-1$ is in the spectrum of $\hat{A}^2$ and also in that of $A^2$. Thus, the equation

$$(A^2 + 1)x = y$$

cannot be solved uniquely for all $y$.

Compact Operators

Let $X$, $Y$ be normed vector spaces. A linear operator $K$ from $X$ to $Y$ is called compact (or completely continuous) if $D(K) = X$ and for every sequence $\{x_n\} \subset X$ such that $\|x_n\| \le C$, the sequence $\{Kx_n\}$ has a subsequence which converges in $Y$. The set of all compact operators from $X$ to $Y$ is denoted by $K(X, Y)$.

A compact operator is bounded. Otherwise, there would be a sequence $\{x_n\}$ such that $\|x_n\| \le C$, while $\|Kx_n\| \to \infty$. Then $\{Kx_n\}$ could not have a convergent subsequence. The sum of two compact operators is compact, and the same is true of the product of a scalar and a compact operator. Hence, $K(X, Y)$ is a subspace of $B(X, Y)$.

If $A \in B(X, Y)$ and $K \in K(Y, Z)$, then $KA \in K(X, Z)$. Similarly, if $L \in K(X, Y)$ and $B \in B(Y, Z)$,
 kAk  khx; yik then BL 2 K(X, Z).
Suppose K 2 B(X, Y), and there is a sequence {Fn }
Thus, of compact operators such that
^  kAk
kAk kK  Fn k ! 0 as n ! 1 ½28
We claim that if Y is a Banach space, then K is
But,
compact.
^ sup khAx; 0ik Theorem 16 Let X be a normed vector space and
kAk ¼ kAk
x6¼0 khx; 0ik Y a Banach space. If L is in B(X, Y) and there is a
sequence {Kn }  K(X, Y) such that
Hence,
kL  Kn k ! 0 as n ! 0
^ ¼ kAk
kAk then L is in K(X, Y).
If  is real, then Theorem 17 Let X be a Banach space and let K be
an operator in K(X). Set A = I  K. Then, R(A) is
^  Þhx; yi ¼ hðA  Þx; ðA  Þyi
ðA closed in X and dim N(A) = dim N(A0 ) is finite.
In particular, either $R(A) = X$ and $N(A) = \{0\}$, or $R(A) \neq X$ and $N(A) \neq \{0\}$.

The last statement of Theorem 17 is known as the "Fredholm alternative."

Let $X$, $Y$ be Banach spaces. An operator $A \in B(X, Y)$ is said to be a Fredholm operator from $X$ to $Y$ if
1. $\alpha(A) = \dim N(A)$ is finite,
2. $R(A)$ is closed in $Y$, and
3. $\beta(A) = \dim N(A')$ is finite.

The set of Fredholm operators from $X$ to $Y$ is denoted by $\Phi(X, Y)$. If $X = Y$ and $K \in K(X)$, then, clearly, $I - K$ is a Fredholm operator. The index of a Fredholm operator is defined as

\[ i(A) = \alpha(A) - \beta(A) \qquad [29] \]

For $K \in K(X)$, we have shown that $i(I - K) = 0$ (Theorem 17).

Theorem 18 Let $X$, $Y$ be normed vector spaces, and assume that $K$ is in $K(X, Y)$. Then $K'$ is in $K(Y', X')$.

Let $X$ be a Banach space, and suppose $K \in K(X)$. If $\lambda$ is a nonzero scalar, then

\[ \lambda I - K = \lambda(I - \lambda^{-1}K) \in \Phi(X) \qquad [30] \]

For an arbitrary operator $A \in B(X)$, the set of all scalars $\lambda$ for which $\lambda I - A \in \Phi(X)$ is called the $\Phi$-set of $A$ and is denoted by $\Phi_A$. Thus, [30] gives

Theorem 19 If $X$ is a Banach space and $K$ is in $K(X)$, then $\Phi_K$ contains all scalars $\lambda \neq 0$.

Theorem 20 Under the hypothesis of Theorem 19, $\alpha(K - \lambda) = 0$ except for, at most, a denumerable set $S$ of values of $\lambda$. The set $S$ depends on $K$ and has 0 as its only possible limit point. Moreover, if $\lambda \neq 0$ and $\lambda \notin S$, then $\alpha(K - \lambda) = 0$, $R(K - \lambda) = X$ and $K - \lambda$ has an inverse in $B(X)$.

Unbounded Operators

In many applications, one runs into unbounded operators instead of bounded ones. This is particularly true in the case of differential equations. For instance, consider the operator $d/dt$ on $C[0, 1]$ with domain consisting of continuously differentiable functions. It is clearly unbounded. In fact, the sequence $x_n(t) = t^n$ satisfies $\|x_n\| = 1$, $\|dx_n/dt\| = n \to \infty$ as $n \to \infty$. It would, therefore, be useful if some of the results that we have stated for bounded operators would also hold for unbounded ones. We shall see that, indeed, many of them do. Unless otherwise specified, $X$, $Y$, $Z$, and $W$ will denote Banach spaces in this article.

Let $X$, $Y$ be normed vector spaces, and let $A$ be a linear operator from $X$ to $Y$. We now officially lift our restriction that $D(A) = X$. However, if $A \in B(X, Y)$, it is still to be assumed that $D(A) = X$. The operator $A$ is called closed if whenever $\{x_n\} \subset D(A)$ is a sequence satisfying

\[ x_n \to x \text{ in } X; \quad Ax_n \to y \text{ in } Y \qquad [31] \]

then $x \in D(A)$ and $Ax = y$. Clearly, all operators in $B(X, Y)$ are closed.

To define $A'$ for an unbounded operator, we follow the definition for bounded operators, and exercise a bit of care. We want

\[ A'y'(x) = y'(Ax), \quad x \in D(A) \qquad [32] \]

Thus, we say that $y' \in D(A')$ if there is an $x' \in X'$ such that

\[ x'(x) = y'(Ax), \quad x \in D(A) \qquad [33] \]

Then we define $A'y'$ to be $x'$. In order that this definition make sense, we need $x'$ to be unique, that is, that $x'(x) = 0$ for all $x \in D(A)$ should imply that $x' = 0$. This is true if and only if $D(A)$ is dense in $X$. To summarize, we can define $A'$ for any linear operator from $X$ to $Y$ provided $D(A)$ is dense in $X$. We take $D(A')$ to be the set of those $y' \in Y'$ for which there is an $x' \in X'$ satisfying [33]. This $x'$ is unique, and we set $A'y' = x'$. Note that if

\[ |y'(Ax)| \le C\|x\|, \quad x \in D(A) \]

then a simple application of the Hahn–Banach theorem (see e.g., Schechter (2002) or the appendix) shows that $y' \in D(A')$.

We define unbounded Fredholm operators in the following way: let $X$, $Y$ be Banach spaces. Then the set $\Phi(X, Y)$ of Fredholm operators from $X$ to $Y$ consists of linear operators from $X$ to $Y$ such that
1. $D(A)$ is dense in $X$,
2. $A$ is closed,
3. $\alpha(A) = \dim N(A) < \infty$,
4. $R(A)$ is closed in $Y$, and
5. $\beta(A) = \dim N(A') < \infty$.

The Essential Spectrum

Let $A$ be a linear operator on a normed vector space $X$. We say that $\lambda \in \rho(A)$ if $R(A - \lambda)$ is dense in $X$ and there is a $T \in B(X)$ such that

\[ T(A - \lambda) = I \text{ on } D(A), \qquad (A - \lambda)T = I \text{ on } R(A - \lambda) \qquad [34] \]
Otherwise, $\lambda \in \sigma(A)$. As before, $\rho(A)$ and $\sigma(A)$ are called the resolvent set and spectrum of $A$, respectively. To show the relationship of this definition to the one given before, we note the following.

Lemma 3 If $X$ is a Banach space and $A$ is closed, then $\lambda \in \rho(A)$ if and only if

\[ \alpha(A - \lambda) = 0, \quad R(A - \lambda) = X \qquad [35] \]

Throughout the remainder of this section, we shall assume that $X$ is a Banach space, and that $A$ is a densely defined, closed linear operator on $X$. We ask the following question: what points of $\sigma(A)$ can be removed from the spectrum by the addition of a compact operator to $A$? The answer to this question is closely related to the set $\Phi_A$. We define this to be the set of all scalars $\lambda$ such that $A - \lambda \in \Phi(X)$. We have

Theorem 21 The set $\Phi_A$ is open, and $i(A - \lambda)$ is constant on each of its components.

We also have

Theorem 22 $\Phi_{A+K} = \Phi_A$ for all $K$ which are $A$-compact, and $i(A + K - \lambda) = i(A - \lambda)$ for all $\lambda \in \Phi_A$.

Set

\[ \sigma_e(A) = \bigcap_{K \in K(X)} \sigma(A + K) \]

We call $\sigma_e(A)$ the essential spectrum of $A$ (there are other definitions). It consists of those points of $\sigma(A)$ which cannot be removed from the spectrum by the addition of a compact operator to $A$. We now characterize $\sigma_e(A)$.

Theorem 23 $\lambda \notin \sigma_e(A)$ if and only if $\lambda \in \Phi_A$ and $i(A - \lambda) = 0$.

Normal Operators

A sequence of elements $\{\varphi_n\}$ in a Hilbert space is called orthonormal if

\[ (\varphi_m, \varphi_n) = \begin{cases} 0, & m \neq n \\ 1, & m = n \end{cases} \qquad [36] \]

(for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article).

Let $\{\varphi_n\}$ be an orthonormal sequence (finite or infinite) in a Hilbert space $H$. Let $\{\lambda_k\}$ be a sequence (of the same length) of scalars satisfying

\[ |\lambda_k| \le C \]

Then for each element $f \in H$, the series

\[ \sum \lambda_k (f, \varphi_k)\varphi_k \]

converges in $H$. Define the operator $A$ on $H$ by

\[ Af = \sum \lambda_k (f, \varphi_k)\varphi_k \qquad [37] \]

Clearly, $A$ is a linear operator. It is also bounded, since

\[ \|Af\|^2 = \sum |\lambda_k|^2 |(f, \varphi_k)|^2 \le C^2\|f\|^2 \qquad [38] \]

by Bessel's inequality

\[ \sum_{1}^{\infty} |(f, \varphi_k)|^2 \le \|f\|^2 \qquad [39] \]

For convenience, let us assume that each $\lambda_k \neq 0$ (just remove those $\varphi_k$ corresponding to the $\lambda_k$ that vanish). In this case, $N(A)$ consists of precisely those $f \in H$ which are orthogonal to all of the $\varphi_k$. Clearly, such $f$ are in $N(A)$. Conversely, if $f \in N(A)$, then

\[ 0 = (Af, \varphi_k) = \lambda_k (f, \varphi_k) \]

Hence, $(f, \varphi_k) = 0$ for each $k$. Moreover, each $\lambda_k$ is an eigenvalue of $A$ with $\varphi_k$ the corresponding eigenvector. This follows immediately from [37]. Since $\sigma(A)$ is closed, it also contains the limit points of the $\lambda_k$.

Next, we shall see that if $\lambda \neq 0$ is not a limit point of the $\lambda_k$, then $\lambda \in \rho(A)$. To show this, we solve

\[ (\lambda - A)u = f \qquad [40] \]

for any $f \in H$. Any solution of [40] satisfies

\[ \lambda u = f + Au = f + \sum \lambda_k (u, \varphi_k)\varphi_k \qquad [41] \]

Hence,

\[ \lambda (u, \varphi_k) = (f, \varphi_k) + \lambda_k (u, \varphi_k) \]

or

\[ (u, \varphi_k) = \frac{(f, \varphi_k)}{\lambda - \lambda_k} \qquad [42] \]

Substituting back in [41], we obtain

\[ \lambda u = f + \sum \frac{\lambda_k (f, \varphi_k)\varphi_k}{\lambda - \lambda_k} \qquad [43] \]

Since $\lambda$ is not a limit point of the $\lambda_k$, there is a $\delta > 0$ such that

\[ |\lambda - \lambda_k| \ge \delta, \quad k = 1, 2, \ldots \]

Hence, the series in [43] converges for each $f \in H$. It is an easy exercise to verify that [43] is indeed a solution of [40]. To see that $(\lambda - A)^{-1}$ is bounded, note that

\[ |\lambda| \cdot \|u\| \le \|f\| + C\|f\|/\delta \qquad [44] \]

(cf. [38]). Thus, we have proved
Lemma 4 If the operator $A$ is given by [37], then $\sigma(A)$ consists of the points $\lambda_k$, their limit points and possibly 0. $N(A)$ consists of those $u$ which are orthogonal to all of the $\varphi_k$. For $\lambda \in \rho(A)$, the solution of [40] is given by [43].

We see from all this that the operator [37] has many useful properties. Therefore, it would be desirable to determine conditions under which operators are guaranteed to be of that form. For this purpose, we note another property of $A$. It is expressed in terms of the Hilbert space adjoint of $A$.

Let $H_1$ and $H_2$ be Hilbert spaces, and let $A$ be an operator in $B(H_1, H_2)$. For fixed $y \in H_2$, the expression $Fx = (Ax, y)$ is a bounded linear functional on $H_1$. By the Riesz representation theorem (see, e.g., Schechter (2002) or the appendix at the end of this article), there is a $z \in H_1$ such that $Fx = (x, z)$ for all $x \in H_1$. Set $z = A^*y$. Then $A^*$ is a linear operator from $H_2$ to $H_1$ satisfying

\[ (Ax, y) = (x, A^*y) \qquad [45] \]

$A^*$ is called the Hilbert space adjoint of $A$. Note the difference between $A^*$ and the operator $A'$ defined for a Banach space. As in the case of the operator $A'$, we note that $A^*$ is bounded and

\[ \|A^*\| = \|A\| \qquad [46] \]

Returning to the operator $A$, we remove the assumption that each $\lambda_k \neq 0$ and note that

\[ (Au, v) = \sum \lambda_k (u, \varphi_k)(\varphi_k, v) = \Big(u, \sum \bar\lambda_k (v, \varphi_k)\varphi_k\Big) \]

showing that

\[ A^*v = \sum \bar\lambda_k (v, \varphi_k)\varphi_k \qquad [47] \]

(If $H$ is a complex Hilbert space, then the complex conjugates $\bar\lambda_k$ of the $\lambda_k$ are required. If $H$ is a real Hilbert space, then the $\lambda_k$ are real, and it does not matter.) Now, by Lemma 4, we see that each $\bar\lambda_k$ is an eigenvalue of $A^*$ with $\varphi_k$ a corresponding eigenvector. Note also that

\[ \|A^*f\|^2 = \sum |\lambda_k|^2 |(f, \varphi_k)|^2 \qquad [48] \]

showing that

\[ \|A^*f\| = \|Af\|, \quad f \in H \qquad [49] \]

An operator satisfying [49] is called normal. An important characterization is given by

Theorem 24 An operator is normal and compact if and only if it is of the form [37] with $\{\varphi_k\}$ an orthonormal set and $\lambda_k \to 0$ as $k \to \infty$.

We also have

Lemma 5 If $A$ is normal, then

\[ \|(A^* - \bar\lambda)u\| = \|(A - \lambda)u\|, \quad u \in H \qquad [50] \]

Corollary 2 If $A$ is normal and $A\varphi = \lambda\varphi$, then $A^*\varphi = \bar\lambda\varphi$.

Lemma 6 If $A$ is normal and compact, then it has an eigenvalue $\lambda$ such that $|\lambda| = \|A\|$.

We also have

Corollary 3 If $A$ is a normal compact operator, then there is an orthonormal sequence $\{\varphi_k\}$ of eigenvectors of $A$ such that every element $u$ in $H$ can be written in the form

\[ u = h + \sum (u, \varphi_k)\varphi_k \qquad [51] \]

where $h \in N(A)$.

Hyponormal Operators

An operator $A$ in $B(H)$ is called hyponormal if

\[ \|A^*u\| \le \|Au\|, \quad u \in H \qquad [52] \]

or, equivalently, if

\[ ([AA^* - A^*A]u, u) \le 0, \quad u \in H \qquad [53] \]

Of course, a normal operator is hyponormal. An operator $A \in B(H)$ is called seminormal if either $A$ or $A^*$ is hyponormal. We have

Theorem 25 If $A$ is seminormal, then

\[ r_\sigma(A) = \|A\| \qquad [54] \]

We have earlier defined the essential spectrum of an operator $A$ to be

\[ \sigma_e(A) = \bigcap_{K \in K(H)} \sigma(A + K) \qquad [55] \]

It was shown that $\lambda \notin \sigma_e(A)$ if and only if $\lambda \in \Phi_A$ and $i(A - \lambda) = 0$ (Theorem 23). Let us show that we can be more specific in the case of seminormal operators.

Theorem 26 If $A$ is a seminormal operator, then $\lambda \in \sigma(A) \setminus \sigma_e(A)$ if and only if $\lambda$ is an isolated eigenvalue with $r(A - \lambda) = \lim_{n \to \infty} \alpha[(A - \lambda)^n] < \infty$.

Lemma 7 If $A$ is hyponormal, then so is $B = A - \lambda$ for any complex $\lambda$.

Lemma 8 If $B$ is hyponormal with 0 an isolated point of $\sigma(B)$ and either $\alpha(B)$ or $\beta(B)$ is finite, then $B \in \Phi(H)$ and $i(B) = 0$.
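The reason behind Lemma 7 can be seen in a small numerical sketch: the operator $AA^* - A^*A$ appearing in [53] does not change when $A$ is replaced by $A - \lambda$, so condition [53] holds for one exactly when it holds for the other. The 2×2 matrix, the value of `lam`, and the helper names below are hypothetical choices made only for illustration; the check is finite-dimensional and is not a proof.

```python
# Sketch (assumed example data): the self-commutator A A* - A* A from [53]
# is unchanged under the shift A -> A - lam*I, which is the idea behind Lemma 7.

def adjoint(M):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Plain matrix product of two square matrices of equal size."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def self_commutator(M):
    """Return M M* - M* M."""
    Ms = adjoint(M)
    left, right = matmul(M, Ms), matmul(Ms, M)
    n = len(M)
    return [[left[i][j] - right[i][j] for j in range(n)] for i in range(n)]


def shift(M, lam):
    """Return M - lam * I."""
    n = len(M)
    return [[M[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


# Hypothetical sample operator on C^2 and a hypothetical complex shift.
A = [[1 + 2j, 3 + 0j], [0j, -1j]]
lam = 0.5 - 2j

C_A = self_commutator(A)
C_B = self_commutator(shift(A, lam))
max_diff = max(abs(C_A[i][j] - C_B[i][j]) for i in range(2) for j in range(2))
print("largest entrywise difference:", max_diff)
```

The printed difference is at the level of floating-point rounding, consistent with the algebraic identity $(A - \lambda)(A - \lambda)^* - (A - \lambda)^*(A - \lambda) = AA^* - A^*A$.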
There is a simple consequence of Lemma 8.

Corollary 4 If $A$ is seminormal and $\lambda$ is an isolated point of $\sigma(A)$, then $\lambda$ is an eigenvalue of $A$.

We also have the following:

Theorem 27 Let $A$ be a seminormal operator such that $\sigma(A)$ has no nonzero limit points. Then $A$ is compact and normal. Thus, it is of the form [37] with the $\{\varphi_k\}$ orthonormal and $\lambda_k \to 0$.

Corollary 5 If $A$ is seminormal and compact, then it is normal.

Spectral Resolution

We saw in the section "Operational calculus" that, in a Banach space $X$, we can define $f(A)$ for any $A \in B(X)$ provided $f(z)$ is a function analytic in a neighborhood of $\sigma(A)$. In this section, we shall show that we can do better in the case of self-adjoint operators.

A linear operator $A$ on a Hilbert space $X$ is called self-adjoint if it has the property that $x \in D(A)$ and $Ax = f$ if and only if

\[ (x, Ay) = (f, y), \quad y \in D(A) \]

In particular, it satisfies

\[ (Ax, y) = (x, Ay), \quad x, y \in D(A) \]

A bounded self-adjoint operator is normal.

To get an idea, let $A$ be a compact, self-adjoint operator on $H$. Then by Theorem 24,

\[ Au = \sum \lambda_k (u, \varphi_k)\varphi_k \qquad [56] \]

where $\{\varphi_k\}$ is an orthonormal sequence of eigenvectors and the $\lambda_k$ are the corresponding eigenvalues of $A$. Now let $p(t)$ be a polynomial with real coefficients having no constant term

\[ p(t) = \sum_{1}^{m} a_k t^k \qquad [57] \]

Then $p(A)$ is compact and self-adjoint. Let $\mu \neq 0$ be a point in $\sigma(p(A))$. Then $\mu = p(\lambda)$ for some $\lambda \in \sigma(A)$ (Theorem 8). Now $\lambda \neq 0$ (otherwise we would have $\mu = p(0) = 0$). Hence, it is an eigenvalue of $A$ (see the section "The spectrum and resolvent sets"). If $\varphi$ is a corresponding eigenvector, then

\[ [p(A) - \mu]\varphi = \sum a_k A^k\varphi - \mu\varphi = \sum a_k \lambda^k\varphi - \mu\varphi = [p(\lambda) - \mu]\varphi = 0 \]

Thus $\mu$ is an eigenvalue of $p(A)$ and $\varphi$ is a corresponding eigenvector. This shows that

\[ p(A)u = \sum p(\lambda_k)(u, \varphi_k)\varphi_k \qquad [58] \]

Now, the right-hand side of [58] makes sense if $p(t)$ is any function bounded on $\sigma(A)$ (see the section "Normal operators"). Therefore it seems plausible to define $p(A)$ by means of [58]. Of course, for such a definition to be useful, one would need certain relationships to hold. In particular, one would want $f(t)g(t) = h(t)$ to imply $f(A)g(A) = h(A)$. We shall discuss this a bit later.

If $A$ is not compact, we cannot, in general, obtain an expansion in the form [56]. However, we can obtain something similar. In fact, we have

Theorem 28 Let $A$ be a self-adjoint operator in $B(H)$. Set

\[ m = \inf_{\|u\| = 1} (Au, u), \qquad M = \sup_{\|u\| = 1} (Au, u) \]

Then there is a family $\{E(\lambda)\}$ of orthogonal projection operators on $H$ depending on a real parameter $\lambda$ and such that:
(i) $E(\lambda_1) \le E(\lambda_2)$ for $\lambda_1 \le \lambda_2$;
(ii) $E(\lambda)u \to E(\lambda_0)u$ as $\lambda_0 < \lambda \to \lambda_0$, $u \in H$;
(iii) $E(\lambda) = 0$ for $\lambda < m$, $E(\lambda) = I$ for $\lambda \ge M$;
(iv) $AE(\lambda) = E(\lambda)A$; and
(v) if $a < m$, $b \ge M$ and $p(t)$ is any polynomial, then

\[ p(A) = \int_a^b p(\lambda)\,dE(\lambda) \qquad [59] \]

This means the following. Let $a = \lambda_0 < \lambda_1 < \cdots < \lambda_n = b$ be any partition of $[a, b]$, and let $\lambda'_k$ be any number satisfying $\lambda_{k-1} \le \lambda'_k \le \lambda_k$. Then

\[ \sum_{1}^{n} p(\lambda'_k)[E(\lambda_k) - E(\lambda_{k-1})] \to p(A) \qquad [60] \]

in $B(H)$ as $\eta = \max(\lambda_k - \lambda_{k-1}) \to 0$.

Theorem 29 Let $A$ be a self-adjoint operator on $H$. Then there is a family $\{E(\lambda)\}$ of orthogonal projection operators on $H$ satisfying (i) and (ii) of Theorem 28 and

(i) $E(\lambda) \to 0$ as $\lambda \to -\infty$ and $E(\lambda) \to I$ as $\lambda \to +\infty$
(ii) $E(\lambda)A \subset AE(\lambda)$
(iii) $p(A) = \int_{-\infty}^{\infty} p(\lambda)\,dE(\lambda)$

for any polynomial $p(t)$.
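A finite-dimensional sketch may help make [58], [59], and [60] concrete. For a symmetric 2×2 matrix the family $E(\lambda)$ is a step function: it is the sum of the eigenprojections whose eigenvalue does not exceed $\lambda$. The matrix, its eigenpairs, the polynomial, and all function names below are assumptions chosen for illustration, not part of the article.

```python
import math

# Assumed example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so m = 1, M = 3.
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]  # (lambda_k, phi_k), phi_k orthonormal


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def projector(phi):
    """Matrix of the rank-one projection u -> (u, phi) phi."""
    return [[phi[i] * phi[j] for j in range(2)] for i in range(2)]


def E(lam):
    """Spectral family: sum of the projections with eigenvalue <= lam."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam_k, phi in eigenpairs:
        if lam_k <= lam:
            P = projector(phi)
            total = [[total[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    return total


def p(t):
    return t * t + t  # a polynomial with no constant term, as in [57]


def p_direct(M):
    """Evaluate p(M) = M^2 + M by matrix arithmetic."""
    M2 = matmul(M, M)
    return [[M2[i][j] + M[i][j] for j in range(2)] for i in range(2)]


def spectral_sum():
    """Right-hand side of [58] written as a matrix."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam_k, phi in eigenpairs:
        P = projector(phi)
        total = [[total[i][j] + p(lam_k) * P[i][j] for j in range(2)]
                 for i in range(2)]
    return total


def stieltjes_sum(a, b, n):
    """Sum [60] on a uniform partition of [a, b], tags at right endpoints."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    prev = E(a)
    for k in range(1, n + 1):
        lam_k = a + (b - a) * k / n
        cur = E(lam_k)
        total = [[total[i][j] + p(lam_k) * (cur[i][j] - prev[i][j])
                  for j in range(2)] for i in range(2)]
        prev = cur
    return total


print(p_direct(A))
print(spectral_sum())
print(stieltjes_sum(0.0, 4.0, 2000))  # a = 0 < m and b = 4 >= M
```

All three printed matrices agree up to rounding and discretization error. Only the two partition intervals that contain an eigenvalue contribute to the sum, since $E(\lambda_k) - E(\lambda_{k-1})$ vanishes elsewhere.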


These theorems are known as the spectral theorems for self-adjoint operators.

Appendix

Here we include some background material related to the text.

Consider a collection $C$ of elements or "vectors" with the following properties:
1. They can be added. If $f$ and $g$ are in $C$, so is $f + g$.
2. $f + (g + h) = (f + g) + h$, $f, g, h \in C$.
3. There is an element $0 \in C$ such that $h + 0 = h$ for all $h \in C$.
4. For each $h \in C$ there is an element $-h \in C$ such that $h + (-h) = 0$.
5. $g + h = h + g$, $g, h \in C$.
6. For each real number $\alpha$, $\alpha h \in C$.
7. $\alpha(g + h) = \alpha g + \alpha h$.
8. $(\alpha + \beta)h = \alpha h + \beta h$.
9. $\alpha(\beta h) = (\alpha\beta)h$.
10. To each $h \in C$ there corresponds a real number $\|h\|$ with the following properties:
11. $\|\alpha h\| = |\alpha|\,\|h\|$.
12. $\|h\| = 0$ if, and only if, $h = 0$.
13. $\|g + h\| \le \|g\| + \|h\|$.
14. If $\{h_n\}$ is a sequence of elements of $C$ such that $\|h_n - h_m\| \to 0$ as $m, n \to \infty$, then there is an element $h \in C$ such that $\|h_n - h\| \to 0$ as $n \to \infty$.

A collection of objects which satisfies statements (1)–(9) and the additional statement
15. $1h = h$
is called a vector space or linear space.

A set of objects satisfying statements (1)–(13) is called a normed vector space, and the number $\|h\|$ is called the norm of $h$. Although statement (15) is not implied by statements (1)–(9), it is implied by statements (1)–(13). A sequence satisfying

\[ \|h_n - h_m\| \to 0 \quad \text{as } m, n \to \infty \]

is called a Cauchy sequence. Property (14) states that every Cauchy sequence converges in norm to a limit (i.e., satisfies $\|h_n - h\| \to 0$ as $n \to \infty$). Property (14) is called completeness, and a normed vector space satisfying it is called a complete normed vector space or a Banach space.

We shall write

\[ h_n \to h \quad \text{as } n \to \infty \]

when we mean

\[ \|h_n - h\| \to 0 \quad \text{as } n \to \infty \]

A subset $U$ of a vector space $V$ is called a subspace of $V$ if $\alpha_1 x_1 + \alpha_2 x_2$ is in $U$ whenever $x_1$, $x_2$ are in $U$ and $\alpha_1$, $\alpha_2$ are scalars.

A subset $U$ of a normed vector space $X$ is called closed if for every sequence $\{x_n\}$ of elements in $U$ having a limit in $X$, the limit is actually in $U$.

Consider a vector space $X$ having a mapping $(f, g)$ from pairs of its elements to the reals such that
1. $(\alpha f, g) = \alpha(f, g)$
2. $(f + g, h) = (f, h) + (g, h)$
3. $(f, g) = (g, f)$
4. $(f, f) > 0$ unless $f = 0$.

Then

\[ (f, g)^2 \le (f, f)(g, g), \quad f, g \in X \qquad [61] \]

An expression $(f, g)$ that assigns a real number to each pair of elements of a vector space and satisfies the aforementioned properties is called a scalar (or inner) product.

If a vector space $X$ has a scalar product $(f, g)$, then it is a normed vector space with norm $\|f\| = (f, f)^{1/2}$. A vector space which has a scalar product and is complete with respect to the induced norm is called a Hilbert space. Every Hilbert space is a Banach space, but the converse is not true. Inequality [61] is known as the Cauchy–Schwarz inequality. $\mathbb{R}^n$ is a Hilbert space.

Let $H$ be a Hilbert space and let $(x, y)$ denote its scalar product. If we fix $y$, then the expression $(x, y)$ assigns to each $x \in H$ a number. An assignment $F$ of a number to each element $x$ of a vector space is called a functional and denoted by $F(x)$. The scalar product is not the first functional we have encountered. In any normed vector space, the norm is also a functional. The functional $F(x) = (x, y)$ satisfies

\[ F(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 F(x_1) + \alpha_2 F(x_2) \qquad [62] \]

for $\alpha_1$, $\alpha_2$ scalars. A functional satisfying [62] is called linear. Another property is

\[ |F(x)| \le M\|x\|, \quad x \in H \qquad [63] \]

which follows immediately from Schwarz's inequality (cf. [61]). A functional satisfying [63] is called bounded. The norm of such a functional is defined to be

\[ \|F\| = \sup_{x \in H,\, x \neq 0} \frac{|F(x)|}{\|x\|} \]

Thus for $y$ fixed, $F(x) = (x, y)$ is a bounded linear functional in the Hilbert space $H$. We have
Theorem 30 For every bounded linear functional $F$ on a Hilbert space $H$ there is a unique element $y \in H$ such that

\[ F(x) = (x, y) \quad \text{for all } x \in H \qquad [64] \]

Moreover,

\[ \|y\| = \sup_{x \in H,\, x \neq 0} \frac{|F(x)|}{\|x\|} = \|F\| \qquad [65] \]

Theorem 30 is known as the "Riesz representation theorem."

For any normed vector space $X$, let $X'$ denote the set of bounded linear functionals on $X$. If $f, g \in X'$, we say that $f = g$ if

\[ f(x) = g(x) \quad \text{for all } x \in X \]

The "zero" functional is the one assigning zero to all $x \in X$. We define $h = f + g$ by

\[ h(x) = f(x) + g(x), \quad x \in X \]

and $g = \alpha f$ by

\[ g(x) = \alpha f(x), \quad x \in X \]

Under these definitions, $X'$ becomes a vector space. We have been employing the expression

\[ \|f\| = \sup_{x \neq 0} \frac{|f(x)|}{\|x\|}, \quad f \in X' \qquad [66] \]

This is easily seen to be a norm. Thus $X'$ is a normed vector space.

We also have

Theorem 31 Let $M$ be a subspace of a normed vector space $X$, and suppose that $f(x)$ is a bounded linear functional on $M$. Set

\[ \|f\| = \sup_{x \in M,\, x \neq 0} \frac{|f(x)|}{\|x\|} \]

Then there is a bounded linear functional $F(x)$ on the whole of $X$ such that

\[ F(x) = f(x), \quad x \in M \qquad [67] \]

and

\[ \|F\| = \sup_{x \in X,\, x \neq 0} \frac{|F(x)|}{\|x\|} = \|f\| = \sup_{x \in M,\, x \neq 0} \frac{|f(x)|}{\|x\|} \qquad [68] \]

Theorem 31 is known as the "Hahn–Banach theorem."

If $A$ is a linear operator from $X$ to $Y$, with $R(A) = Y$ and $N(A) = \{0\}$ (i.e., consists only of the vector 0), we can assign to each $y \in Y$ the unique solution of

\[ Ax = y \]

This assignment is an operator from $Y$ to $X$ and is usually denoted by $A^{-1}$ and called the inverse operator of $A$. It is linear because of the linearity of $A$. One can ask: "when is $A^{-1}$ continuous?" or, equivalently, "when is it bounded?" A very important answer to this question is given by

Theorem 32 If $X$, $Y$ are Banach spaces and $A$ is a closed linear operator from $X$ to $Y$ with $R(A) = Y$, $N(A) = \{0\}$, then $A^{-1} \in B(Y, X)$.

This theorem is sometimes referred to as the "bounded inverse theorem."

If $A$ is self-adjoint and

\[ (A - \lambda)x = 0, \quad (A - \mu)y = 0 \]

with $\lambda \neq \mu$, then

\[ (x, y) = 0 \]

If $A$ has a compact inverse, its eigenvalues cannot have limit points. If $A^{-1}$ is compact, then the eigenelements corresponding to the same eigenvalue form a finite-dimensional subspace.

See also: Ljusternik–Schnirelman Theory; Quantum Mechanical Scattering Theory; Regularization for Dynamical Zeta Functions; Spectral Sequences; Stochastic Resonance.

Further Reading

Bachman G and Narici L (1966) Functional Analysis. New York: Academic Press.
Banach S (1955) Théorie des Opérations Linéaires. New York: Chelsea.
Berberian SK (1974) Lectures in Functional Analysis and Operator Theory. New York: Springer.
Brown A and Pearcy C (1977) Introduction to Operator Theory. New York: Springer.
Day MM (1958) Normed Linear Spaces. Berlin: Springer.
Dunford N and Schwartz JT (1958, 1963) Linear Operators, I, II. New York: Wiley.
Edwards RE (1965) Functional Analysis. New York: Holt.
Epstein B (1970) Linear Functional Analysis. Philadelphia: W. B. Saunders.
Gohberg IC and Krein MS (1960) The basic propositions on defect numbers, root numbers and indices of linear operators. Amer. Math. Soc. Transl., Ser. 2, 13, 185–264.
Goldberg S (1966) Unbounded Linear Operators. New York: McGraw-Hill.
Halmos PR (1951) Introduction to Hilbert Space. New York: Chelsea.
Hille E and Phillips R (1957) Functional Analysis and Semi-Groups. Providence: American Mathematical Society.
Kato T (1966, 1976) Perturbation Theory for Linear Operators. Berlin: Springer.
Müller V (2003) Spectral Theory of Linear Operators and Spectral Systems in Banach Algebras. Basel: Birkhäuser Verlag.
Reed M and Simon B (1972) Methods of Modern Mathematical Physics, I. Academic Press.
Riesz F and Sz.-Nagy B (1955) Functional Analysis. New York: Ungar.
Schechter M (2002) Principles of Functional Analysis. Providence: American Mathematical Society.
Stone MH (1932) Linear Transformations in Hilbert Space. Providence: American Mathematical Society.
Taylor AE (1958) Introduction to Functional Analysis. New York: Wiley.
Weidmann J (1980) Linear Operators in Hilbert Space. New York: Springer.
Yosida K (1965, 1971) Functional Analysis. Berlin: Springer.

Spin Foams
A Perez, Penn State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In loop quantum gravity (LQG) (see Loop Quantum Gravity), a background independent formulation of quantum gravity, the full quantum dynamics is governed by the following (constraint) operator equations or quantum Einstein equations:

Gauss Law

\[ \hat G_i(A, E)|\Psi\rangle := \widehat{D_a E^a_i}\,|\Psi\rangle = 0 \]

Vector constraint

\[ \hat V_a(A, E)|\Psi\rangle := \widehat{E^b_i F^i_{ab}(A)}\,|\Psi\rangle = 0 \]

Scalar constraint

\[ \hat S(A, E)|\Psi\rangle := \Big[\widehat{\frac{1}{\sqrt{\det E}}\, E^a_i E^b_j F^{ij}_{ab}(A)} + \cdots\Big]|\Psi\rangle = 0 \qquad [1] \]

where $A^i_a$ is an SU(2) connection ($i = 1, 2, 3$, $a = 1, 2, 3$), $E^a_i$ is its conjugate momentum (the triad field), $F^{ij}_{ab}(A)$ is the curvature of $A^i_a$, and $D_a$ is the covariant derivative (see Canonical General Relativity). The hat means that the classical phase-space functions are promoted to operators in a kinematical Hilbert space $H_{kin}$; the solutions are in the so-called physical Hilbert space $H_{phys}$. The goal of the spin foam approach is to construct a mathematically well-defined notion of path integral for LQG as a device for computing the solutions of the previous equations.

The space of solutions of the Gauss and vector constraints [1] is well understood in LQG (see Loop Quantum Gravity), and is often also called the kinematical Hilbert space $H_{kin}$. The solutions of the scalar constraint can be characterized by the definition of the generalized projection operator $P$ from the kinematical Hilbert space $H_{kin}$ into the kernel of the scalar constraint $H_{phys}$. Formally, one can write $P$ as

\[ P = \text{``}\prod_{x \in \Sigma} \delta(\hat S(x))\text{''} = \int D[N] \exp\Big[i \int_\Sigma N(x)\hat S(x)\Big] \qquad [2] \]

A formal argument shows that $P$ can also be defined in a manifestly covariant manner as a regularization of the formal path integral of general relativity. In first-order variables, it becomes

\[ P = \int D[e]\, D[A]\, \mu[A, e] \exp[i S_{GR}(e, A)] \qquad [3] \]

where $e$ is the tetrad field, $A$ is the spacetime connection, and $\mu[A, e]$ denotes the appropriate measure.

In both cases, $P$ characterizes the space of solutions of quantum Einstein equations as for any arbitrary state $|\phi\rangle \in H_{kin}$ then $P|\phi\rangle$ is a (formal) solution of [1]. Moreover, the matrix elements of $P$ define the physical inner product ($\langle\,,\rangle_p$) providing the vector space of solutions of [1] with the Hilbert space structure that defines $H_{phys}$. Explicitly,

\[ \langle s, s'\rangle_p := \langle Ps, s'\rangle \]

for $s, s' \in H_{kin}$.

When these matrix elements are computed in the spin network basis (see Figure 1) (see Loop Quantum Gravity), they can be expressed as a sum over amplitudes of "spin network histories": spin foams (Figure 2). The latter are naturally given by foam-like combinatorial structures whose basic elements carry quantum numbers of geometry (see Loop Quantum Gravity). A spin foam history, from the state $|s\rangle$ to the state $|s'\rangle$, is denoted by a pair $(F_{s \to s'}, \{j\})$, where $F_{s \to s'}$ is the 2-complex with boundary given by the graphs of the spin network states $|s'\rangle$ and $|s\rangle$, respectively, and $\{j\}$ is the set of spin quantum numbers labeling its edges (denoted $e \in F_{s \to s'}$) and faces (denoted $f \in F_{s \to s'}$). Vertices are denoted
$v \in F_{s \to s'}$. The physical inner product can be expressed as a sum over spin foam amplitudes

\[ \langle s', s\rangle_p = \langle Ps', s\rangle = \sum_{F_{s \to s'}} N(F_{s \to s'}) \sum_{\{j\}} \prod_{f \in F_{s \to s'}} A_f(j_f) \prod_{e \in F_{s \to s'}} A_e(j_e) \prod_{v \in F_{s \to s'}} A_v(j_v) \qquad [4] \]

where $N(F_{s \to s'})$ is a (possible) normalization factor, and $A_f(j_f)$, $A_e(j_e)$, and $A_v(j_v)$ are the 2-cell or face amplitude, the edge or 1-cell amplitude, and the 0-cell or vertex amplitude, respectively. These local amplitudes depend on the spin quantum numbers labeling neighboring cells in $F_{s \to s'}$ (e.g., the vertex amplitude of the vertex magnified in Figure 2 is $A_v(j, k, l, m, n, s)$).

Figure 1 A spin network state is given by a graph embedded in space whose links and nodes are labeled by unitary irreducible representations of SU(2). These states form a complete basis of the kinematical Hilbert space of LQG where the operator equations [1] are defined.

Figure 2 A spin foam as the "colored" 2-complex representing the transition between three different spin network states. A transition vertex is magnified on the right.

The underlying discreteness discovered in LQG is crucial: in the spin foam representation, the functional integral for gravity is replaced by a sum over amplitudes of combinatorial objects given by foam-like configurations (spin foams) as in [4]. A spin foam represents a possible history of the gravitational field and can be interpreted as a set of transitions through different quantum states of space. Boundary data in the path integral are given by the polymer-like excitations (spin network states, Figure 1) representing 3-geometry states in LQG.

Spin Foams in 3D Quantum Gravity

Now we introduce the concept of spin foams in a more explicit way in the context of the quantization of three-dimensional (3D) Riemannian gravity. Later in this section we will present the definition of $P$ from the canonical and covariant viewpoint formally stated in the introduction by eqns [2] and [3], respectively.

The Classical Theory

Riemannian gravity in 3D is a theory with no local degrees of freedom, that is, a topological theory (see Topological Quantum Field Theory: Overview). Its action (in the first-order formalism) is given by

\[ S(e, \omega) = \int_M \mathrm{tr}(e \wedge F(\omega)) \qquad [5] \]

where $M = \Sigma \times \mathbb{R}$ (for $\Sigma$ an arbitrary Riemann surface), $\omega$ is an SU(2) connection, and the triad $e$ is an su(2)-valued 1-form. The gauge symmetries of the action are the local SU(2) gauge transformations

\[ \delta e = [e, \alpha], \qquad \delta\omega = d_\omega \alpha \qquad [6] \]

where $\alpha$ is an su(2)-valued 0-form, and the "topological" gauge transformation

\[ \delta e = d_\omega \eta, \qquad \delta\omega = 0 \qquad [7] \]

where $d_\omega$ denotes the covariant exterior derivative and $\eta$ is an su(2)-valued 0-form. The first invariance is manifest from the form of the action, while the second is a consequence of the Bianchi identity, $d_\omega F(\omega) = 0$. The gauge symmetries are so large that all the solutions to the equations of motion are locally pure gauge. The theory has only global or topological degrees of freedom.

Upon the standard 2 + 1 decomposition (see Canonical General Relativity), the phase space in these variables is parametrized by the pullback to $\Sigma$ of $\omega$ and $e$. In local coordinates, one can express them in terms of the 2D connection $A^i_a$ and the triad field $E^b_j = \epsilon^{bc} e^k_c \delta_{jk}$, where $a = 1, 2$ are space coordinate indices and $i, j = 1, 2, 3$ are su(2) indices. The symplectic structure is defined by

\[ \{A^i_a(x), E^b_j(y)\} = \delta^b_a\, \delta^i_j\, \delta^{(2)}(x, y) \qquad [8] \]
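As a brief check of the statement that the "topological" symmetry [7] follows from the Bianchi identity, the variation of the action [5] can be sketched as follows. The sketch assumes that $\eta$ has compact support (or vanishes on the boundary of $M$), which the text does not state explicitly, and it uses only the Leibniz rule for $d_\omega$ together with the invariance of the trace.

```latex
\begin{aligned}
\delta_\eta S
  &= \int_M \mathrm{tr}\big(d_\omega \eta \wedge F(\omega)\big) \\
  &= \int_M d\,\mathrm{tr}\big(\eta\, F(\omega)\big)
     - \int_M \mathrm{tr}\big(\eta\, d_\omega F(\omega)\big) \\
  &= \int_{\partial M} \mathrm{tr}\big(\eta\, F(\omega)\big) = 0
\end{aligned}
```

The second term in the middle line vanishes by $d_\omega F(\omega) = 0$, and the remaining boundary term vanishes under the stated assumption on $\eta$.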
Local symmetries of the theory are generated by the first-class constraints

\[ D_b E^b_j = 0, \qquad F^i_{ab}(A) = 0 \qquad [9] \]

which are referred to as the Gauss law and the curvature constraint, respectively; the quantization of these is the analog of [1] in 4D. This simple theory has been quantized in various ways in the literature; here we will use it to introduce the spin foam quantization.

Kinematical Hilbert Space

In analogy with the 4D case, one follows Dirac's procedure, finding first a representation of the basic variables in an auxiliary or kinematical Hilbert space $\mathcal{H}_{kin}$. The basic states are functionals of the connection depending on the parallel transport along paths $\gamma \subset \Sigma$: the so-called holonomy. Given a connection $A^i_a(x)$ and a path $\gamma$, one defines the holonomy $h_\gamma[A]$ as the path-ordered exponential

\[ h_\gamma[A] = P \exp \int_\gamma A \qquad [10] \]

The kinematical Hilbert space, $\mathcal{H}_{kin}$, corresponds to the Ashtekar–Lewandowski (AL) representation of the algebra of functions of holonomies or generalized connections. This algebra is in fact a C*-algebra and is denoted Cyl (see Loop Quantum Gravity). Functionals of the connection act in the AL representation simply by multiplication. For example, the holonomy operator acts as follows:

\[ \widehat{h_\gamma[A]}\,\Psi[A] = h_\gamma[A]\,\Psi[A] \qquad [11] \]

As in 4D, an orthonormal basis of $\mathcal{H}_{kin}$ is defined by the spin network states. Each spin network is labeled by a graph $\gamma \subset \Sigma$, a set of spins $\{j_\ell\}$ labeling links $\ell \in \gamma$, and a set of intertwiners $\{\iota_n\}$ labeling nodes $n \in \gamma$ (Figure 3), namely:

\[ s_{\gamma,\{j_\ell\},\{\iota_n\}}[A] = \bigotimes_{n \in \gamma} \iota_n \bigotimes_{\ell \in \gamma} \Pi^{j_\ell}(h_\ell[A]) \qquad [12] \]

where $\Pi^j$ is the unitary irreducible representation matrix of spin j (for a precise definition, see Loop Quantum Gravity). For simplicity, we will often denote spin network states $|s\rangle$, omitting the graph and spin labels.

Figure 3 A spin network state in 2 + 1 LQG. The decomposition of a 4-valent node in terms of basic 3-valent intertwiners is shown.

Spin Foams from the Hamiltonian Formulation

The physical Hilbert space, $\mathcal{H}_{phys}$, is defined by those "states" that are annihilated by the constraints. By construction, spin-network states solve the Gauss constraint, $\widehat{D_a E^a_i}\,|s\rangle = 0$, as they are manifestly SU(2) gauge invariant (see Loop Quantum Gravity). To complete the quantization, one needs to characterize the space of solutions of the quantum curvature constraints ($\hat F^i_{ab}$), and to provide it with the physical inner product. The existence of $\mathcal{H}_{phys}$ is granted by the following:

Theorem 1 There exists a normalized positive linear form P over Cyl, that is, $P(\psi^*\psi) \ge 0$ for $\psi \in$ Cyl and P(1) = 1, yielding (through the GNS construction (see Algebraic Approach to Quantum Field Theory)) the physical Hilbert space $\mathcal{H}_{phys}$ and the physical representation $\pi_p$ of Cyl.

The state P contains a very large Gelfand ideal (set of zero norm states) $J := \{\alpha \in \mathrm{Cyl} \ \text{s.t.}\ P(\alpha^*\alpha) = 0\}$. In fact, the physical Hilbert space $\mathcal{H}_{phys} := \mathrm{Cyl}/J$ corresponds to the quantization of finitely many degrees of freedom. This is expected in 3D gravity as the theory does not have local excitations (no "gravitons") (see Topological Quantum Field Theory: Overview). The representation $\pi_p$ of Cyl solves the curvature constraint in the sense that for any functional $f_\gamma[A] \in$ Cyl defined on the subalgebra of functionals defined on contractible graphs $\gamma \subset \Sigma$, one has that

\[ \pi_p[f_\gamma]\,\Psi = f_\gamma[0]\,\Psi \qquad [13] \]

This equation expresses the fact that "$\hat F = 0$" in $\mathcal{H}_{phys}$ (for flat connections, parallel transport is trivial around a contractible region). For $s, s' \in \mathcal{H}_{kin}$, the physical inner product is given by

\[ \langle s, s' \rangle_p := P(s^* s') \qquad [14] \]

where the *-operation and the product are defined in Cyl.

The previous equation admits a "sum over histories" representation. We shall introduce the concept of the spin foam representation as an explicit construction of the positive linear form P which, as in [2], is formally given by

\[ P = \int D[N] \exp\left( i \int_\Sigma \mathrm{tr}[N \hat F(A)] \right) = \prod_{x \in \Sigma} \delta[\hat F(A)] \qquad [15] \]
where $N(x) \in su(2)$. One can make the previous formal expression a rigorous definition if one introduces a regularization. Given a partition of $\Sigma$ in terms of 2D plaquettes of coordinate area $\epsilon^2$, one has that

\[ \int_\Sigma \mathrm{tr}[N F(A)] = \lim_{\epsilon \to 0} \sum_{p^i} \epsilon^2\, \mathrm{tr}[N_{p^i} F_{p^i}] \qquad [16] \]

where $N_{p^i}$ and $F_{p^i}$ are values of $N^i$ and $\epsilon^{ab} F^i_{ab}[A]$ at some interior point of the plaquette $p^i$ and $\epsilon^{ab}$ is the Levi-Civita tensor. Similarly, the holonomy $W_{p^i}[A]$ around the boundary of the plaquette $p^i$ (see Figure 4) is given by

\[ W_{p^i}[A] = 1 + \epsilon^2 F_{p^i}(A) + O(\epsilon^2) \qquad [17] \]

where $F_{p^i} = \tau_j\, \epsilon^{ab} F^j_{ab}(x_{p^i})$ ($\tau_j$ are the generators of su(2) in the fundamental representation).

Figure 4 Cellular decomposition of the space manifold $\Sigma$ (a square lattice in this example), and the infinitesimal plaquette holonomy $W_p[A]$.

The previous two equations lead to the following definition: given $s \in$ Cyl (think of a spin network state based on a graph $\gamma$), the linear form P(s) is defined as

\[ P(s) := \lim_{\epsilon \to 0} \left\langle \Omega \prod_{p^i} \int dN_{p^i} \exp\left( i\, \mathrm{tr}[N_{p^i} W_{p^i}] \right),\ s \right\rangle \qquad [18] \]

where $\langle\ ,\ \rangle$ is the inner product in the AL representation and $|\Omega\rangle$ is the "vacuum" ($1 \in$ Cyl) in the AL representation. The partition is chosen so that the links of the underlying graph $\gamma$ border the plaquettes. One can easily perform the integration over the $N_{p^i}$ using the identity (Peter–Weyl theorem)

\[ \int dN \exp\left( i\, \mathrm{tr}[N W] \right) = \sum_j (2j+1)\, \mathrm{tr}\left[ \Pi^j(W) \right] \qquad [19] \]

Using the previous equation

\[ P(s) := \lim_{\epsilon \to 0} \prod_{p^i} \sum_{j(p^i)} (2j(p^i)+1) \left\langle \Omega\ \mathrm{tr}\left[ \Pi^{j(p^i)}(W_{p^i}) \right],\ s \right\rangle \qquad [20] \]

where $j(p^i)$ is the spin labeling element of the sum [19] associated to the ith plaquette. Since the $\mathrm{tr}[\Pi^j(W)]$ commute, the ordering of plaquette operators in the previous product does not matter. It can be shown that the limit $\epsilon \to 0$ exists and one can give a closed expression of P(s).

Now in the AL representation (see eqn [11]), each $\mathrm{tr}[\Pi^{j(p^i)}(W_{p^i})]$ acts by creating a closed loop in the $j_{p^i}$ representation at the boundary of the corresponding plaquette (Figures 5 and 6).

Figure 5 Graphical notation representing the action of one plaquette holonomy on a spin network state. On the right is the result written in terms of the spin network basis. The amplitude $N_{j,m,k}$ can be expressed in terms of Clebsch–Gordan coefficients.

Figure 6 Graphical notation representing the action of one plaquette holonomy on a spin network vertex. The object in brackets ({}) is a 6j-symbol and $\Delta_j := 2j + 1$.

One can introduce a (nonphysical) time parameter that works simply as a coordinate providing the means of organizing the series of actions of plaquette loop operators in [20]; that is, one assumes that each of the loop actions occurs at different "times." We have introduced an auxiliary time slicing (arbitrary parametrization). If one inserts the AL partition of unity

\[ 1 = \sum_{\gamma \subset \Sigma} \sum_{\{j\}_\gamma} |\gamma, \{j\}\rangle \langle \gamma, \{j\}| \qquad [21] \]

where the sum is over the complete basis of spin network states $\{|\gamma, \{j\}\rangle\}$ (based on all graphs $\gamma \subset \Sigma$ and with all possible spin labeling) between each time
slice, one arrives at a sum over spin network histories representation of P(s). More precisely, P(s) can be expressed as a sum over amplitudes corresponding to a series of transitions that can be viewed as the "time evolution" between the "initial" spin network s and the "final" "vacuum state" $\Omega$. The physical inner product between spin networks s and s' is defined as

\[ \langle s, s' \rangle_p := P(s^* s') \]

and can be expressed as a sum over amplitudes corresponding to transitions interpolating between the "initial" spin network s' and the "final" spin network s (e.g., Figures 7 and 8).
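As a numerical aside on the Peter–Weyl expansion [19] that underlies these sums: the traces $\mathrm{tr}[\Pi^j(W)]$ are the SU(2) characters, and the expansion rests on their orthonormality with respect to the Haar measure. The sketch below checks that orthonormality under an assumption not spelled out in the text, namely the standard class-angle parametrization (an element conjugate to diag($e^{i\theta/2}$, $e^{-i\theta/2}$) with $\theta \in [0, 2\pi]$ and measure $(1/\pi)\sin^2(\theta/2)\,d\theta$). The function names are illustrative only.

```python
import math


def su2_character(j, theta):
    """Character chi_j(theta) = tr Pi^j(g) for g in SU(2) with class angle theta.

    theta in [0, 2*pi] labels the class of diag(e^{i theta/2}, e^{-i theta/2}).
    """
    dim = 2 * j + 1
    denom = math.sin(theta / 2)
    if abs(denom) < 1e-12:
        # Limit at theta = 0 or 2*pi, obtained from l'Hopital's rule.
        return dim * math.cos(dim * theta / 2) / math.cos(theta / 2)
    return math.sin(dim * theta / 2) / denom


def haar_inner_product(j, k, steps=4000):
    """Midpoint-rule estimate of the Haar integral of chi_j * chi_k.

    Assumes the class-angle measure (1/pi) sin^2(theta/2) dtheta on [0, 2*pi].
    """
    h = 2 * math.pi / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * h
        weight = math.sin(theta / 2) ** 2 / math.pi
        total += su2_character(j, theta) * su2_character(k, theta) * weight
    return total * h


if __name__ == "__main__":
    for j, k in [(0.5, 0.5), (1.0, 1.0), (0.5, 1.0)]:
        print(j, k, round(haar_inner_product(j, k), 6))
```

At the identity the character reduces to the dimension 2j + 1, which is the weight multiplying each term in [19].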

Figure 7 A set of discrete transitions in the loop-to-loop physical inner product obtained by a series of transitions as in Figure 5. On the right, the continuous spin foam representation in the limit $\epsilon \to 0$.

Figure 8 A set of discrete transitions representing one of the contributing histories at a fixed value of the regulator. On the right, the continuous spin foam representation when the regulator is removed.
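The 6j-symbol that enters the vertex transitions of Figure 6 can be evaluated in closed form. The following is a minimal sketch, assuming the standard Racah single-sum formula and the usual brace ordering {j1 j2 j3; j4 j5 j6} with admissible triads (j1, j2, j3), (j1, j5, j6), (j4, j2, j6), (j4, j5, j3). Phase and normalization conventions used for spin foam vertex amplitudes may differ from this one, and `six_j` is a hypothetical helper name, not something defined in the text.

```python
from math import factorial, sqrt


def _triangle_delta_sq(a2, b2, c2):
    """Squared triangle coefficient for doubled spins; 0.0 if the triad is not admissible."""
    if (a2 + b2 + c2) % 2 or c2 > a2 + b2 or c2 < abs(a2 - b2):
        return 0.0
    numerator = (factorial((a2 + b2 - c2) // 2)
                 * factorial((a2 - b2 + c2) // 2)
                 * factorial((-a2 + b2 + c2) // 2))
    return numerator / factorial((a2 + b2 + c2) // 2 + 1)


def six_j(j1, j2, j3, j4, j5, j6):
    """Wigner 6j-symbol {j1 j2 j3; j4 j5 j6} from Racah's formula (spins are multiples of 1/2)."""
    # Work with doubled spins so that all arithmetic is on integers.
    a, b, c, d, e, f = (int(round(2 * j)) for j in (j1, j2, j3, j4, j5, j6))
    triads = [(a, b, c), (a, e, f), (d, b, f), (d, e, c)]
    prefactor_sq = 1.0
    for triad in triads:
        delta_sq = _triangle_delta_sq(*triad)
        if delta_sq == 0.0:
            return 0.0
        prefactor_sq *= delta_sq
    lower = [sum(triad) // 2 for triad in triads]
    upper = [(a + b + d + e) // 2, (b + c + e + f) // 2, (c + a + f + d) // 2]
    total = 0.0
    for t in range(max(lower), min(upper) + 1):
        denom = 1
        for x in lower:
            denom *= factorial(t - x)
        for y in upper:
            denom *= factorial(y - t)
        total += (-1) ** t * factorial(t + 1) / denom
    return sqrt(prefactor_sq) * total


if __name__ == "__main__":
    print(six_j(0.5, 0.5, 1, 0.5, 0.5, 1))
```

Working with doubled spins keeps every factorial argument an integer, so half-integer labels need no special handling.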

Spin network nodes evolve into edges while spin network links evolve into 2D faces. Edges inherit the intertwiners associated to the nodes and faces inherit the spins associated to links. Therefore, the series of transitions can be represented by a 2-complex whose 1-cells are labeled by intertwiners and whose 2-cells are labeled by spins. The places where the action of the plaquette loop operators creates new links (Figures 6 and 8) define 0-cells or vertices. These foam-like structures are the so-called spin foams.
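The labeling just described can be pictured as a small data structure. The sketch below is purely illustrative: the class `SpinFoam`, its fields, and its methods are hypothetical names introduced here, and the intertwiner labels are kept as opaque values. It records faces with spins, edges with intertwiner labels, and vertices with the edges that meet there, and evaluates the product of the dimensions 2j + 1 over the faces.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class SpinFoam:
    """Labeled 2-complex: faces carry spins, edges carry intertwiner labels."""

    face_spins: dict = field(default_factory=dict)         # face id -> spin (Fraction)
    edge_intertwiners: dict = field(default_factory=dict)  # edge id -> intertwiner label
    vertex_edges: dict = field(default_factory=dict)       # vertex id -> tuple of edge ids

    def add_face(self, face_id, spin):
        spin = Fraction(spin)
        if spin < 0 or (2 * spin).denominator != 1:
            raise ValueError("spin must be a non-negative half-integer")
        self.face_spins[face_id] = spin

    def add_edge(self, edge_id, intertwiner):
        self.edge_intertwiners[edge_id] = intertwiner

    def add_vertex(self, vertex_id, edge_ids):
        missing = [e for e in edge_ids if e not in self.edge_intertwiners]
        if missing:
            raise KeyError(f"unknown edges: {missing}")
        self.vertex_edges[vertex_id] = tuple(edge_ids)

    def face_weight(self):
        """Product over faces of the dimension 2*j_f + 1."""
        weight = 1
        for spin in self.face_spins.values():
            weight *= int(2 * spin + 1)
        return weight


if __name__ == "__main__":
    foam = SpinFoam()
    foam.add_face("f1", Fraction(1, 2))
    foam.add_face("f2", 1)
    foam.add_edge("e1", "iota_1")
    foam.add_vertex("v1", ["e1"])
    print(foam.face_weight())
```

Storing spins as `Fraction` values keeps half-integer labels exact; vertex amplitudes are deliberately left out, since their form depends on the model.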
The spin foam amplitudes are purely combinatorial and can be explicitly computed from the simple action of the loop operator in the AL representation (see Loop Quantum Gravity). A particularly simple case arises when the spin network states s and s' have only 3-valent nodes. Explicitly,

\[ \langle s, s' \rangle_p := P(s^* s') = \sum_{\{j\}} \prod_{f \subset F_{s \to s'}} (2j_f + 1)^{\nu_f/2} \prod_{v \subset F_{s \to s'}} \{6j\}_v \qquad [22] \]

where the notation is that of [4], and $\nu_f = 0$ if $f \cap s \neq 0 \wedge f \cap s' \neq 0$, $\nu_f = 1$ if $f \cap s \neq 0 \vee f \cap s' \neq 0$, and $\nu_f = 2$ if $f \cap s = 0 \wedge f \cap s' = 0$. The tetrahedral diagram, abbreviated here as $\{6j\}_v$ (with edges labeled $j_1, \ldots, j_6$), denotes a 6j-symbol: the amplitude obtained by means of the natural contraction of the four intertwiners corresponding to the 1-cells converging at a vertex. More generally, for arbitrary spin networks, the vertex amplitude corresponds to 3nj-symbols, and $\langle s, s' \rangle_p$ takes the general form [4].

Spin Foams from the Covariant Path Integral

In this section we re-derive the spin foam representation of the physical scalar product of 2 + 1 (Riemannian) quantum gravity directly as a regularization of the covariant path integral. The formal path integral for 3D gravity can be written as

\[ P = \int D[e]\, D[A] \exp\left( i \int_M \mathrm{tr}[e \wedge F(A)] \right) \qquad [23] \]

Assume $M = \Sigma \times I$, where $I \subset \mathbb{R}$ is a closed (time) interval (for simplicity, we ignore boundary terms).

In order to give a meaning to the formal expression above, one replaces the 3D manifold (with boundary) M with an arbitrary cellular decomposition $\Delta$. One also needs the notion of the associated dual 2-complex of $\Delta$ denoted by $\Delta^*$. The dual 2-complex $\Delta^*$ is a combinatorial object defined by a set of vertices $v \in \Delta^*$ (dual to 3-cells in $\Delta$), edges $e \in \Delta^*$ (dual to 2-cells in $\Delta$), and faces $f \in \Delta^*$ (dual to 1-cells in $\Delta$). The intersection of the dual 2-complex $\Delta^*$ with the boundaries defines two graphs $\gamma_1, \gamma_2 \subset \Sigma$ (see Figure 9). For simplicity, we ignore the boundaries until the end of this section.

Figure 9 The cellular decomposition of $M = \Sigma \times I$ ($\Sigma = T^2$ in this example). The illustration shows part of the induced graph on the boundary and the detail of a tetrahedron in $\Delta$ and a face $f \subset \Delta^*$ in the bulk.

Figure 10 A (2-cell) face $f \subset \Delta^*$ in a cellular decomposition of the spacetime manifold M and the corresponding dual 1-cell. The connection field is discretized by the assignment of the parallel transport group elements $g^i_e \in SU(2)$ to edges $e \subset \Delta^*$ (i = 1, ..., 5 in the face shown here).

The fields e and A are discretized as follows. The su(2)-valued 1-form field e is represented by the assignment of $e_f \in su(2)$ to each 1-cell in $\Delta$. We use the fact that faces in $\Delta^*$ are in one-to-one correspondence with 1-cells in $\Delta$ and label $e_f$ with a face subindex (Figure 9). The connection field A is represented by the assignment of group elements $g_e \in SU(2)$ to each edge $e \subset \Delta^*$ (see Figure 10). With all this, [23] becomes the regularized version $P_\Delta$ defined as

\[ P_\Delta = \int \prod_{f \subset \Delta^*} de_f \prod_{e \subset \Delta^*} dg_e\, \exp\left( i\, \mathrm{tr}[e_f W_f] \right) \qquad [24] \]

where $de_f$ is the regular Lebesgue measure on $\mathbb{R}^3$, $dg_e$ is the Haar measure on SU(2), and $W_f$ denotes the holonomy around (spacetime) faces, that is, $W_f = g^1_e \cdots g^N_e$ for N being the number of edges bounding the corresponding face (see Figure 10). The discretization procedure is reminiscent of the one used in standard lattice gauge theory (see Lattice
Gauge Theory). The previous definition can be motivated by an analysis equivalent to the one presented in [16].

Integrating over $e_f$, and using [19], one obtains

\[ P_\Delta = \sum_{\{j\}} \int \prod_{e \subset \Delta^*} dg_e \prod_{f \subset \Delta^*} (2j_f + 1)\, \mathrm{tr}\left[ \Pi^{j_f}(g^1_e \cdots g^N_e) \right] \qquad [25] \]

Now it remains to integrate over the lattice connection $\{g_e\}$. If an edge $e \subset \Delta^*$ bounds n faces $f \subset \Delta^*$ there will be n traces of the form $\mathrm{tr}[\Pi^{j_f}(\cdots g_e \cdots)]$ in [25] containing $g_e$ in the argument. In order to integrate over $g_e$ we can use the following identity:

\[ I^n_{inv} := \int dg\ \Pi^{j_1}(g) \otimes \Pi^{j_2}(g) \otimes \cdots \otimes \Pi^{j_n}(g) = \sum_\iota C^{\iota}_{j_1 j_2 \cdots j_n}\, C^{*\iota}_{j_1 j_2 \cdots j_n} \qquad [26] \]

where $I^n_{inv}$ is the projector from the tensor product of irreducible representations $\mathcal{H}_{j_1 \cdots j_n} = j_1 \otimes j_2 \otimes \cdots \otimes j_n$ onto the invariant component $\mathcal{H}^0_{j_1 \cdots j_n} = \mathrm{Inv}[j_1 \otimes j_2 \otimes \cdots \otimes j_n]$. On the right-hand side, we have chosen an orthonormal basis of invariant vectors (intertwiners) in $\mathcal{H}_{j_1 \cdots j_n}$ to express the projector. Notice that the assignment of intertwiners to edges is a consequence of the integration over the connection. Using [26] one can write $P_\Delta$ in the general spin foam representation form [4]

\[ P_\Delta = \sum_{\{j_f\}} \prod_{f \subset \Delta^*} (2j_f + 1) \prod_{v \subset \Delta^*} A_v(\iota_v, j_v) \qquad [27] \]

where $A_v(\iota_v, j_v)$ is given by the appropriate trace of the intertwiners corresponding to the edges bounded by the vertex. As in the previous section, this amplitude is given in general by an SU(2) 3Nj-symbol. When $\Delta$ is a simplicial complex, all the edges in $\Delta^*$ are 3-valent and vertices are 4-valent. Consequently, the vertex amplitude is given by the contraction of the corresponding four 3-valent intertwiners, that is, a 6j-symbol. In that case, the path integral takes the (Ponzano–Regge) form

\[ P_\Delta = \sum_{\{j\}} \prod_{f \subset \Delta^*} (2j_f + 1) \prod_{v \in \Delta^*} \{6j\}_v \qquad [28] \]

where $\{6j\}_v$ stands for the tetrahedral diagram with edges labeled $j_1, \ldots, j_6$. The labeling of faces that intersect the boundary naturally induces a labeling of the edges of the graphs $\gamma_1$ and $\gamma_2$ induced by the discretization. Thus, the boundary states are given by spin network states on $\gamma_1$ and $\gamma_2$, respectively. A careful analysis of the boundary contribution shows that only the face amplitude is modified to $(\Delta_{j_\ell})^{\nu_f/2}$, and that the spin foam amplitudes are as in eqn [22].

A crucial property of the path integral in 3D gravity (and of the transition amplitudes in general) is that it does not depend on the discretization $\Delta$; this is due to the absence of local degrees of freedom in 3D gravity and is not expected to hold in 4D. Given two different cellular decompositions $\Delta$ and $\Delta'$, one has

\[ \tau^{-n_0} P_\Delta = \tau^{-n'_0} P_{\Delta'} \qquad [29] \]

where $n_0$ is the number of 0-simplexes in $\Delta$, and $\tau = \sum_j (2j + 1)^2$. As $\tau$ is given by a divergent sum, the discretization independence statement is formal. Moreover, the sum over spins in [28] is typically divergent. Divergences occur due to infinite gauge-volume factors in the path integral corresponding to the topological gauge freedom [7]. Freidel and Louapre have shown how these divergences can be avoided by gauge-fixing unphysical degrees of freedom in [24]. In the case of 3D gravity with positive cosmological constant, the state sum generalizes to the Turaev–Viro invariant (see Topological Quantum Field Theory: Overview) defined in terms of the quantum group $SU_q(2)$ with $q^n = 1$, where the representations are finitely many and thus $\tau < \infty$. Equation [29] is a rigorous statement in that case. No such infrared divergences appear in the canonical treatment of the previous section.

Spin Foams in 4D

Spin Foam from the Canonical Formulation

There is no rigorous construction of the physical inner product of LQG in 4D. The spin foam representation as a device for its definition has been introduced formally by Rovelli. In 4D LQG, difficulties in understanding dynamics are centered around the quantum scalar constraint $\hat S = \sqrt{\det E}^{\,-1} E^a_i E^b_j \hat F^{ij}_{ab}(A) + \cdots$ (see [1]); the vector constraint $\hat V_a(A, E)$ is solved in a simple manner (see Loop Quantum Gravity). The physical inner product formally becomes

\[ \langle P s, s' \rangle_{diff} = \prod_x \delta[\hat S(x)] = \int D[N] \left\langle \exp\left[ i \int_\Sigma N(x) \hat S(x) \right] s,\ s' \right\rangle_{diff} = \int D[N] \sum_{n=0}^{\infty} \frac{i^n}{n!} \left\langle \left[ \int_\Sigma N(x) \hat S(x) \right]^n s,\ s' \right\rangle_{diff} \qquad [30] \]
Figure 11 The action of the scalar constraint and its spin foam representation. $N(x_n)$ is the value of N at the node and $S_{nop}$ are the matrix elements of $\hat S$.

where $\langle\ ,\ \rangle_{diff}$ denotes the inner product in the Hilbert space of solutions of the vector constraint, and the exponential has been expanded in powers in the last expression on the right-hand side.

From early on, it was realized that smooth loop states are naturally annihilated by $\hat S$ (independently of any quantization ambiguity). Consequently, $\hat S$ acts only on spin network nodes. Generically, it does so by creating new links and nodes modifying the underlying graph of the spin network states (Figure 11).

Therefore, each term in the sum [30] represents a series of transitions (given by the local action of $\hat S$ at spin network nodes) through different spin network states interpolating the boundary states s and s', respectively. The action of $\hat S$ can be visualized as an "interaction vertex" in the "time" evolution of the node (Figure 11). As in the explicit 3D case, eqn [30] can be expressed as a sum over "histories" of spin networks pictured as a system of branching surfaces described by a 2-complex whose elements inherit the representation labels on the intermediate states. The value of the "transition" amplitudes is controlled by the matrix elements of $\hat S$. Therefore, although the qualitative picture is independent of quantization ambiguities, transition amplitudes are sensitive to them.

Before even considering the issue of convergence of [30], the problem with this definition is evident: every single term in the sum is a divergent integral! Therefore, this way of presenting spin foams has to be considered as formal until a well-defined regularization of [2] is provided. That is the goal of the spin foam approach.

Instead of dealing with an infinite number of constraints, Thiemann recently proposed to impose one single master constraint defined as

\[ M = \int_\Sigma dx^3\, \frac{S^2(x) - q^{ab} V_a(x) V_b(x)}{\sqrt{\det q(x)}} \qquad [31] \]

Using techniques developed by Thiemann, this constraint can indeed be promoted to a quantum operator acting on $\mathcal{H}_{kin}$. The physical inner product is given by

\[ \langle s, s' \rangle_p := \lim_{T \to \infty} \left\langle s,\ \int_{-T}^{T} dt\ e^{i t \hat M} s' \right\rangle \qquad [32] \]

A spin foam representation of the previous expression could now be achieved by the standard skeletonization that leads to the path-integral representation in quantum mechanics. In this context, one splits the t-parameter in discrete steps and writes

\[ e^{i t \hat M} = \lim_{N \to \infty} \left[ e^{i t \hat M / N} \right]^N = \lim_{N \to \infty} \left[ 1 + i t \hat M / N \right]^N \qquad [33] \]

The spin foam representation follows from the fact that the action of the basic operator $1 + i t \hat M / N$ on a spin network can be written as a linear combination of new spin networks whose graphs and labels have been modified by the creation of new nodes (in a way qualitatively analogous to the local action shown in Figure 11). An explicit derivation of the physical inner product of 4D LQG along these lines is under current investigation.

Spin Foams from the Covariant Formulation

In 4D, the spin foam representation of the dynamics of LQG has been investigated more intensively in the covariant formulation. This has led to a series of constructions which are referred to as spin foam models. These treatments are related more closely to the construction based on the covariant path-integral approach of the last section. Here we illustrate the formulation which has captured much interest in the literature: the Barrett–Crane (BC) model.

Spin foam models for gravity as constrained quantum BF theory The BC model is one of the most extensively studied spin foam models for quantum gravity. To introduce the main ideas involved, we concentrate on the definition of the model in the Riemannian sector. The BC model can be formally
viewed as a spin foam quantization of SO(4) Plebanski's formulation of general relativity. Plebanski's Riemannian action depends on an SO(4) connection A, a Lie-algebra-valued 2-form B, and Lagrange multiplier fields $\lambda$ and $\mu$. Writing explicitly the Lie algebra indices, the action is given by

\[ S[B, A, \lambda, \mu] = \int \left[ B^{IJ} \wedge F_{IJ}(A) + \lambda_{IJKL}\, B^{IJ} \wedge B^{KL} + \mu\, \epsilon^{IJKL} \lambda_{IJKL} \right] \qquad [34] \]

where $\mu$ is a 4-form and $\lambda_{IJKL} = -\lambda_{JIKL} = -\lambda_{IJLK} = \lambda_{KLIJ}$ is a tensor in the internal space. Variation with respect to $\mu$ imposes the constraint $\epsilon^{IJKL} \lambda_{IJKL} = 0$ on $\lambda_{IJKL}$. The Lagrange multiplier tensor $\lambda_{IJKL}$ has then 20 independent components. Variation with respect to $\lambda$ imposes 20 algebraic equations on the 36 components of B. The (non-degenerate) solutions to the equations obtained by varying the multipliers $\lambda$ and $\mu$ are

\[ B^{IJ} = \pm\, \epsilon^{IJKL} e_K \wedge e_L \qquad \text{and} \qquad B^{IJ} = \pm\, e^I \wedge e^J \qquad [35] \]

in terms of the 16 remaining degrees of freedom of the tetrad field $e^I_a$. If one substitutes the first solution into the original action, one obtains Palatini's formulation of general relativity; therefore, on shell (and on the right sector), the action is that of classical gravity.

The key idea in the definition of the model is that the path integral for the theory corresponding to the action S[B, A, 0, 0], namely

\[ P_{topo} = \int D[B]\, D[A] \exp\left[ i \int B^{IJ} \wedge F_{IJ}(A) \right] \qquad [36] \]

can be given a meaning as a spin foam sum, [4], in terms of a simple generalization of the construction of the previous section. In fact, S[B, A, 0, 0] corresponds to a simple theory known as BF theory that is formally very similar to 3D gravity (see BF Theories). The result is independent of the chosen discretization because BF theory does not have local degrees of freedom (just as 3D gravity).

The BC model aims at providing a definition of the path integral of gravity pursuing a well-posed definition of the formal expression

\[ P_{GR} = \int D[B]\, D[A]\ \delta\left[ B \to \epsilon^{IJKL} e_K \wedge e_L \right] \exp\left[ i \int B^{IJ} \wedge F_{IJ}(A) \right] \qquad [37] \]

where $D[B]\,D[A]\,\delta(B \to \epsilon^{IJKL} e_K \wedge e_L)$ means that one must restrict the sum in [36] to those configurations of the topological theory satisfying the constraints $B = {}^*(e \wedge e)$ for some tetrad e. The remarkable fact is that this restriction can be implemented in a systematic way directly on the spin foam configurations that define $P_{topo}$.

In $P_{topo}$ spin foams are labeled with spins corresponding to the unitary irreducible representations of SO(4) (given by two spin quantum numbers $(j_R, j_L)$). Essentially, the factor "$\delta(B \to \epsilon^{IJKL} e_K \wedge e_L)$" restricts the set of spin foam quantum numbers to the so-called simple representations (for which $j_R = j_L = j$). This is the "quantum" version of the solution to the constraints [35]. There are various versions of this model. The simplest definition of the transition amplitudes in the BC model is given by

\[ P(s^* s') = \sum_{\{j\}} \prod_{f \subset F_{s \to s'}} (2j_f + 1)^{\nu_f} \prod_{v \subset F_{s \to s'}} \sum_{\iota_1 \cdots \iota_5} \{15j\}(\iota_a, j_{ab})\ \{15j\}(\iota^*_a, j^*_{ab}) \qquad [38] \]

where we use the notation of [22], the graphs (abbreviated here as $\{15j\}$, with nodes $\iota_1, \ldots, \iota_5$ and links $j_{ab}$, $1 \le a < b \le 5$) denote 15j-symbols, and $\iota_i$ are half-integers labeling SU(2) normalized 4-intertwiners. No rigorous connection with the Hilbert space picture of LQG has yet been established. The self-dual version of Plebanski's action leads, through a similar construction, to Reisenberger's model.

Figure 12 The dual of a 4-simplex.

The simplest amplitude in the BC model corresponds to a single 4-simplex, which can be viewed as the simplest triangulation of the 4D spacetime given by the interior of a 3-sphere (the corresponding 2-complex is shown in Figure 12). States of the 4-simplex are labeled by ten spins j (labeling the ten edges of the boundary spin network, see Figure 12) which can be shown to be related to the area in
Planck units of the ten triangular faces that form the 4-simplex. A first indication of the connection of the model with gravity was that the large-j asymptotics appeared to be dominated by the exponential of the Regge action (the action derived by Regge as a discretization of general relativity). This estimate was done using the stationary-phase approximation to the integral that gives the amplitude of a 4-simplex in the BC model. However, more detailed calculations showed that the amplitude is dominated by configurations corresponding to degenerate 4-simplexes. This seems to invalidate a simple connection to general relativity and is one of the main puzzles in the model.

Spin Foams as Feynman Diagrams

The main problem with the models of the previous section is that they are defined on a discretization $\Delta$ of M and that, contrary to what happens with a topological theory such as 3D gravity (eqn [29]), the amplitudes depend on the discretization $\Delta$. Various possibilities to eliminate this regulator have been discussed in the literature but no explicit results are yet known in 4D. An interesting proposal is a discretization-independent definition of spin foam models achieved by the introduction of an auxiliary field theory living on an abstract group manifold: $Spin(4)^4$ and $SL(2, \mathbb{C})^4$ for Riemannian and Lorentzian gravity, respectively. The action of the auxiliary group field theory (GFT) takes the form

\[ S[\phi] = \int_{G^4} \phi^2 + \frac{\lambda}{5!} \int_{G^{10}} M^{(5)}[\phi] \qquad [39] \]

where $M^{(5)}[\phi]$ is a fifth-order monomial, and G is the corresponding group. In the simplest model, $M^{(5)}[\phi] = \phi(g_1, g_2, g_3, g_4)\, \phi(g_4, g_5, g_6, g_7)\, \phi(g_7, g_3, g_8, g_9)\, \phi(g_9, g_6, g_2, g_{10})\, \phi(g_{10}, g_8, g_5, g_1)$. The field $\phi$ is required to be invariant under the (simultaneous) right action of the group on its four arguments in addition to other symmetries (not described here for simplicity). The perturbative expansion in $\lambda$ of the GFT Euclidean path integral is given by

\[ P = \int D[\phi]\, e^{-S[\phi]} = \sum_{F_N} \frac{\lambda^N}{\mathrm{sym}[F_N]}\, A[F_N] \qquad [40] \]

where $A[F_N]$ corresponds to a sum of Feynman-diagram amplitudes for diagrams with N interaction vertices, and $\mathrm{sym}[F_N]$ denotes the standard symmetry factor. A remarkable property of this expansion is that $A[F_N]$ can be expressed as a sum over spin foam amplitudes, that is, 2-complexes labeled by unitary irreducible representations of G. Moreover, for very simple interaction $M^{(5)}[\phi]$, the spin foam amplitudes are in one-to-one correspondence to those found in the models of the previous section (e.g., the BC model). This duality is regarded as a way of providing a fully combinatorial definition of quantum gravity where no reference to any discretization or even a manifold structure is made. Transition amplitudes between spin network states correspond to n-point functions of the field theory. These models have been inspired by generalizations of matrix models applied to BF theory.

Divergent transition amplitudes can arise by the contribution of "loop" diagrams as in standard quantum field theory. In spin foams, diagrams corresponding to 2D bubbles are potentially divergent because spin labels can be arbitrarily high, leading to unbounded sums in [4]. Such divergences do not occur in certain field theories dual (in the sense above) to the BC model. However, little is known about the convergence of the series in $\lambda$ and the physical meaning of this constant. Nevertheless, Freidel and Louapre have shown that the series can be re-summed in certain models dual to lower-dimensional theories.

Causal Spin Foams

Let us conclude by presenting a fundamentally different construction leading to spin foams. Using the kinematical setting of LQG with the assumption of the existence of a microlocal (in the sense of Planck scale) causal structure, Markopoulou and Smolin define a general class of (causal) spin foam models for gravity. The elementary transition amplitude $A_{s_I \to s_{I+1}}$ from an initial spin network $s_I$ to another spin network $s_{I+1}$ is defined by a set of simple combinatorial rules based on a definition of causal propagation of the information at nodes. The rules and amplitudes have to satisfy certain causal restrictions (motivated by the standard concepts in classical Lorentzian physics). These rules generate surface-like excitations of the same kind one encounters in the previous formulations. Spin foams $F^N_{s_i \to s_f}$ are labeled by the number of times, N, these elementary transitions take place. Transition amplitudes are defined as

\[ \langle s_i, s_f \rangle = \sum_N A(F^N_{s_i \to s_f}) \qquad [41] \]

which is of the generic form [4]. The models are not related to any continuum action. The only guiding principles in the construction are the restrictions imposed by causality, and the requirement of the existence of a nontrivial critical behavior that reproduces general relativity at large scales. Some indirect evidence of a possible nontrivial continuum limit has been obtained in certain versions of these models in 1 + 1 dimensions.
See also: Algebraic Approach to Quantum Field Theory; BF Theories; Canonical General Relativity; Chern–Simons Models: Rigorous Results; Lattice Gauge Theory; Loop Quantum Gravity; Quantum Dynamics in Loop Quantum Gravity; Quantum Geometry and its Applications; Topological Quantum Field Theory: Overview.

Further Reading

Ashtekar A (1991) Lectures on Nonperturbative Canonical Gravity. Singapore: World Scientific.
Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report.
Baez J (1998) Spin foams. Classical and Quantum Gravity 15: 1827–1858.
Baez J (2000) An introduction to spin foam models of quantum gravity and BF theory. Lecture Notes in Physics 543: 25–94.
Baez J and Muniain JP (1995) Gauge fields, Knots and Gravity. Singapore: World Scientific.
Oriti D (2001) Spacetime geometry from algebra: spin foam models for nonperturbative quantum gravity. Reports on Progress in Physics 64: 1489–1544.
Perez A (2003) Spin foam models for quantum gravity. Classical and Quantum Gravity 20: R43.
Rovelli C Quantum Gravity. Cambridge: Cambridge University Press (to appear).
Thiemann T Modern Canonical Quantum General Relativity. Cambridge: Cambridge University Press (to appear).

Spin Glasses
F Guerra, Università di Roma "La Sapienza", Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

From a physical point of view, spin glasses, as dilute magnetic alloys, are very interesting systems. They are characterized by such features as exhibiting a new magnetic phase, where magnetic moments are frozen into disordered equilibrium orientations, without any long-range order. See, for example, Young (1987) for general reviews, and also Stein (1989) for a very readable account of the physical properties of spin glasses. The experimental laboratory study of spin glasses is a very difficult subject, because of their peculiar properties. In particular, the existence of very slowly relaxing modes, with consequent memory effects, makes it difficult to realize the very basic physical concept of a system at thermodynamical equilibrium, at a given temperature.

From a theoretical point of view some models have been proposed, which try to capture the essential physical features of spin glasses, in the frame of very simple assumptions.

The basic model was proposed by Edwards and Anderson (1975). It is a simple extension of the well-known nearest-neighbor Ising model. On a large region $\Lambda$ of the unit lattice in d dimensions, we associate an Ising spin $\sigma(n)$ to each lattice site n, and then we introduce a lattice Hamiltonian

\[ H_\Lambda(\sigma, J) = -\sum_{(n, n')} J(n, n')\, \sigma(n)\, \sigma(n') \qquad [1] \]

Here, the sum runs over all pairs of nearest-neighbor sites in $\Lambda$, and J are quenched random couplings, assumed for simplicity to be independent identically distributed random variables, with centered unit Gaussian distribution. The quenched character of the J means that they do not contribute to thermodynamic equilibrium, but act as a kind of random external noise on the coupling of the $\sigma$ variables. In the expression of the Hamiltonian, we have indicated with $\sigma$ the set of all $\sigma(n)$, and with J the set of all $J(n, n')$. The region $\Lambda$ must be taken very large, by letting it invade the whole lattice in the limit. The physical motivation for this choice is that for real spin glasses the interaction between the spins dissolved in the matrix of the alloy oscillates in sign according to distance. This effect is taken into account in the model through the random character of the couplings between spins.

Even though very drastic simplifications have been introduced in the formulation of this model, as compared to the extremely complicated nature of physical spin glasses, a rigorous study of all properties emerging from the static and dynamic behavior of a thermodynamic system of this kind is far from being complete. In particular, with reference to static equilibrium properties, it is not yet possible to reach a completely substantiated description of the phases emerging in the low-temperature region. Even physical intuition gives completely different guesses for different people.

In the same way as a mean-field version can be associated to the ordinary Ising model, so it is possible for the disordered model described by [1]. Now we consider a number of sites i = 1, 2, ..., N, and let each spin $\sigma(i)$ at site i interact with all other spins, with the intervention of a quenched noise $J_{ij}$. The precise form of the Hamiltonian will be given in the following. This is the mean-field model for spin glasses, introduced by Sherrington and Kirkpatrick (1975).
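As a concrete illustration of the Edwards–Anderson Hamiltonian [1], the sketch below evaluates the energy of a spin configuration on a small box of the d-dimensional unit lattice. The box shape, the open (free) boundary conditions, and the function names are assumptions made for the example; only the form of [1] and the centered unit Gaussian couplings come from the text.

```python
import itertools
import random


def nearest_neighbor_pairs(side, dim):
    """Nearest-neighbor pairs in the box {0, ..., side-1}^dim with open boundaries."""
    pairs = []
    for site in itertools.product(range(side), repeat=dim):
        for axis in range(dim):
            if site[axis] + 1 < side:
                neighbor = site[:axis] + (site[axis] + 1,) + site[axis + 1:]
                pairs.append((site, neighbor))
    return pairs


def sample_couplings(pairs, rng):
    """One quenched draw of independent centered unit Gaussian couplings J(n, n')."""
    return {pair: rng.gauss(0.0, 1.0) for pair in pairs}


def ea_energy(spins, couplings):
    """Energy H = -sum over pairs of J(n, n') * sigma(n) * sigma(n')."""
    return -sum(J * spins[n] * spins[m] for (n, m), J in couplings.items())


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = nearest_neighbor_pairs(side=4, dim=2)
    couplings = sample_couplings(pairs, rng)  # drawn once, then held fixed
    spins = {site: rng.choice((-1, 1))
             for site in itertools.product(range(4), repeat=2)}
    print(ea_energy(spins, couplings))
```

The couplings are drawn once and then held fixed while the spins vary, which is the operational meaning of "quenched" in the text.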

It is a celebrated model. Numerous articles have been devoted to its study over the years, appearing in the theoretical physics literature.

The relevance of the model stems surely from the fact that it is intended to represent some important features of the physical spin glass systems, of great interest for their peculiar properties, at least at the level of the mean-field approximation.

But another important source of interest is connected with the fact that disordered systems of the Sherrington–Kirkpatrick type, and their generalizations, seem to play a very important role for theoretical and practical assessments about hard optimization problems, as shown, for example, by Mézard et al. (2002).

It is interesting to remark that the original paper was entitled "Solvable model of a spin-glass," while a previous draft, as told by David Sherrington, contained the even stronger designation "Exactly solvable." However, it turned out that the very natural solution devised by the authors is valid only at high temperatures, or for large external magnetic fields. At low temperatures, the proposed solution exhibits a nonphysical drawback given by a negative entropy, as properly recognized by the authors in their very first paper.

It took some years to find an acceptable solution. This was done by Giorgio Parisi in a series of papers, marking a radical departure from the previous methods. In fact, a very intense method of "spontaneous replica symmetry breaking" was developed. As a consequence, the physical content of the theory was encoded in a functional order parameter of new type, and a remarkable structure emerged for the pure states of the theory, a kind of hierarchical, ultrametric organization. These very interesting developments, due to Parisi and his coworkers, are explained in a brilliant way in the classical book by Mézard et al. (1987). Part of this structure will be recalled in the following.

It is important to remark that the Parisi solution is presented in the form of an ingenious and clever "ansatz." Until a few years ago, it was not known whether this ansatz would give the true solution for the model in the so-called thermodynamic limit, when the size of the system becomes infinite, or whether it would be only a very good approximation for the true solution.

The general structures offered by the Parisi solution, and their possible generalizations for similar models, exhibit an extremely rich and interesting mathematical content. Very appropriately, Talagrand (2003) has used a strongly suggestive sentence in the title to his recent book: "Spin glasses: a challenge for mathematicians."

As a matter of fact, how to face this challenge is a very difficult problem. Here we would like to recall the main features of a very powerful method, yet extremely simple in its very essence, based on a comparison and interpolation argument on sets of Gaussian random variables.

The method found its first simple application in Guerra (2001), where it was shown that the Sherrington–Kirkpatrick replica symmetric approximate solution was a rigorous lower bound for the quenched free energy of the system, uniformly in the size. Then, it was possible to reach a long-awaited result (Guerra and Toninelli 2002): the convergence of the free energy density in the thermodynamic limit, by an intermediate step where the quenched free energy was shown to be subadditive in the size of the system.

Moreover, still by interpolation on families of Gaussian random variables, the first mentioned result was extended to give a rigorous proof that the expression given by the Parisi ansatz is also a lower bound for the quenched free energy of the system, uniformly in the size (Guerra 2003). The method gives not only the bound, but also the explicit form of the correction in a complex form. As a recent and very important result, along the task of facing the challenge, Michel Talagrand has been able to dominate these correction terms, showing that they vanish in the thermodynamic limit. This milestone achievement was first announced in a short note, containing only a synthetic sketch of the proof, and then presented with all details in a long paper (Talagrand 2006).

The interpolation method is also at the basis of the far-reaching generalized variational principle proved by Aizenman et al. (2003).

In our presentation, we will try to be as self-contained as possible. We will give all definitions, explain the basic structure of the interpolation method, and show how some of the results are obtained. We will concentrate mostly on questions connected with the free energy, its properties of subadditivity, the existence of the infinite-volume limit, and the replica bounds.

For the sake of comparison, and in order to provide a kind of warm-up, we will recall also some features of the standard elementary mean-field model of ferromagnetism, the so-called Curie–Weiss model. We will concentrate also here on the free energy, and systematically exploit elementary comparison and interpolation arguments. This will show the strict analogy between the treatment of the ferromagnetic model and the developments in the mean-field spin glass case. Basic roles will be played in the two cases, but with different expressions, by positivity and convexity properties.
Then, we will consider the problem of connecting results for the mean-field case to the short-range case. An intermediate position is occupied by the so-called diluted models. They can be studied through a generalization of the methods exploited in the mean-field case, as shown, for example, in De Sanctis (2005).

The organization of the paper is as follows. We first introduce the ferromagnetic model and discuss behavior and properties of the free energy in the thermodynamic limit, by emphasizing, in this very elementary case, the comparison and interpolation methods that will be also exploited, in a different context, in the spin glass case.

The basic features of the mean-field spin glass models are discussed next, by introducing all necessary definitions. This is followed by the introduction, for generic Gaussian interactions, of some important formulas, concerning the derivation with respect to the strength of the interaction, and the Gaussian comparison and interpolation method.

We then give simple applications to the mean-field spin glass model, in particular to the existence of the infinite-volume limit of the quenched free energy (Guerra and Toninelli 2002), and to the proof of general variational bounds, by following the useful strategy developed in Aizenman et al. (2003).

The main features of the Parisi representation are recalled briefly, and the main theorem concerning the free energy is stated. This is followed by a brief mention of results for diluted models.

We also attack the problem of connecting the results for the mean-field case to the more realistic short-range models.

Finally we provide conclusions and outlook for future foreseen developments.

Our treatment will be as simple as possible, by relying on the basic structural properties, and by describing methods of presumably very long lasting power. The emphasis given to the mean-field case reflects the status of research. Some years from now this review would perhaps be written according to completely different patterns.

A Warm-up. The Mean-field Ferromagnetic Model: Structure and Results

The mean-field ferromagnetic model is among the simplest models of statistical mechanics. However, it contains very interesting features, in particular a phase transition, characterized by spontaneous magnetization, at low temperatures. We refer to standard textbooks for a full treatment and a complete appreciation of the model in the frame of the theory of ferromagnetism. Here we first consider some properties of the free energy, easily obtained through comparison methods.

The generic configuration of the mean-field ferromagnetic model is defined through Ising spin variables $\sigma_i = \pm 1$, attached to each site $i = 1, 2, \dots, N$.

The Hamiltonian of the model, in some external field of strength $h$, is given by the mean-field expression

$$H_N(\sigma, h) = -\frac{1}{N}\sum_{(i,j)} \sigma_i \sigma_j - h \sum_i \sigma_i \qquad [2]$$

Here, the first sum extends to all $N(N-1)/2$ site pairs, and the second to all sites.

For a given inverse temperature $\beta$, let us now introduce the partition function $Z_N(\beta, h)$ and the free energy per site $f_N(\beta, h)$, according to the well-known definitions

$$Z_N(\beta, h) = \sum_{\sigma_1 \dots \sigma_N} \exp(-\beta H_N(\sigma, h)) \qquad [3]$$

$$-\beta f_N(\beta, h) = N^{-1} \log Z_N(\beta, h) \qquad [4]$$

It is also convenient to define the average spin magnetization

$$m = \frac{1}{N}\sum_i \sigma_i \qquad [5]$$

Then, it is immediately seen that the Hamiltonian in [2] can be equivalently written as

$$H_N(\sigma, h) = -\frac{1}{2} N m^2 - h \sum_i \sigma_i \qquad [6]$$

where an unessential constant term has been neglected. In fact, we have

$$\sum_{(i,j)} \sigma_i \sigma_j = \frac{1}{2}\sum_{i,j;\, i \ne j} \sigma_i \sigma_j = \frac{1}{2} N^2 m^2 - \frac{1}{2} N \qquad [7]$$

where the sum over all pairs has been equivalently written as one half the sum over all $i, j$ with $i \ne j$, and the diagonal terms with $i = j$ have been added and subtracted out. Notice that they give a constant because $\sigma_i^2 = 1$.

Therefore, the partition function in [3] can be equivalently substituted by the expression

$$Z_N(\beta, h) = \sum_{\sigma_1 \dots \sigma_N} \exp\left(\frac{1}{2}\beta N m^2\right) \exp\left(\beta h \sum_i \sigma_i\right) \qquad [8]$$

which will be our starting point.

Our interest will be in the $\lim_{N\to\infty} N^{-1} \log Z_N(\beta, h)$. To this purpose, let us establish the important subadditivity property, holding for the splitting of the
large-$N$ system in two smaller systems with $N_1$ and $N_2$ sites, respectively, with $N = N_1 + N_2$,

$$\log Z_N(\beta, h) \le \log Z_{N_1}(\beta, h) + \log Z_{N_2}(\beta, h) \qquad [9]$$

The proof is very simple. Let us denote, in the most natural way, by $\sigma_1, \dots, \sigma_{N_1}$ the spin variables for the first subsystem, and by $\sigma_{N_1+1}, \dots, \sigma_N$ the $N_2$ spin variables of the second subsystem. Introduce also the subsystem magnetizations $m_1$ and $m_2$, by adapting the definition [5] to the smaller systems, in such a way that

$$N m = N_1 m_1 + N_2 m_2 \qquad [10]$$

Therefore, we see that the large system magnetization $m$ is the linear convex combination of the smaller system ones, according to the obvious

$$m = \frac{N_1}{N} m_1 + \frac{N_2}{N} m_2 \qquad [11]$$

Since the mapping $m \to m^2$ is convex, we also have the general bound, holding for all values of the $\sigma$ variables

$$m^2 \le \frac{N_1}{N} m_1^2 + \frac{N_2}{N} m_2^2 \qquad [12]$$

Then, it is enough to substitute the inequality in the definition [8] of $Z_N(\beta, h)$, and recognize that we achieve factorization with respect to the two subsystems, and therefore the inequality $Z_N \le Z_{N_1} Z_{N_2}$. So we have established [9]. From subadditivity, the existence of the limit follows by standard arguments. In fact, we have

$$\lim_{N\to\infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h) \qquad [13]$$

Now we will calculate explicitly this limit, by introducing an order parameter $M$, a trial function, and an appropriate variational scheme. In order to get a lower bound, we start from the elementary inequality $m^2 \ge 2mM - M^2$, holding for any value of $m$ and $M$. By inserting the inequality in the definition [8] we arrive at a factorization of the sum over $\sigma$'s. The sum can be explicitly calculated, and we arrive immediately at the lower bound, uniform in the size of the system,

$$N^{-1} \log Z_N(\beta, h) \ge \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \qquad [14]$$

holding for any value of the trial order parameter $M$. Clearly, it is convenient to take the supremum over $M$. Then, we establish the optimal uniform lower bound

$$N^{-1} \log Z_N(\beta, h) \ge \sup_M \left( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \right) \qquad [15]$$

It is simple to realize that the supremum coincides with the limit as $N \to \infty$. To this purpose we follow a simple procedure. Let us consider all possible values of the variable $m$. There are $N + 1$ of them, corresponding to any number $K$ of possible spin flips, starting from a given $\sigma$ configuration, $K = 0, 1, \dots, N$. Let us consider the trivial decomposition of the identity, holding for any $m$,

$$1 = \sum_M \delta_{mM} \qquad [16]$$

where $M$ in the sum runs over the $N + 1$ possible values of $m$, and $\delta$ is the Kronecker delta, being equal to 1 if $M = m$, and zero otherwise. Let us now insert [16] in the definition [8] of the partition function inside the sum over $\sigma$'s, and invert the two sums. Because of the forcing $m = M$ given by the $\delta$, we can write $m^2 = 2mM - M^2$ inside the sum. Then if we neglect the $\delta$, by using the trivial $\delta \le 1$, we have an upper bound, where the sum over $\sigma$'s can be explicitly performed as before. Then it is enough to take the upper bound with respect to $M$, and consider that there are $N + 1$ terms in the now trivial sum over $M$, in order to arrive at the upper bound

$$N^{-1} \log Z_N(\beta, h) \le \sup_M \left( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \right) + N^{-1} \log(N + 1) \qquad [17]$$

Therefore, by going to the limit as $N \to \infty$, we can collect all our results in the form of the following theorem giving the full characterization of the thermodynamic limit of the free energy.

Theorem 1 For the mean-field ferromagnetic model we have

$$\lim_{N\to\infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h) \qquad [18]$$

$$= \sup_M \left( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2}\beta M^2 \right) \qquad [19]$$

This ends our discussion about the free energy in the ferromagnetic model.

Other properties of the model can be easily established. Introduce the Boltzmann–Gibbs state

$$\omega_N(A) = Z_N^{-1} \sum_{\sigma_1 \dots \sigma_N} A \exp\left(\frac{1}{2}\beta N m^2\right) \exp\left(\beta h \sum_i \sigma_i\right) \qquad [20]$$

where $A$ is any function of $\sigma_1 \dots \sigma_N$.

The observable $m(\sigma)$ becomes self-averaging under $\omega_N$, in the infinite-volume limit, in the sense that

$$\lim_{N\to\infty} \omega_N\left((m - M(\beta, h))^2\right) = 0 \qquad [21]$$
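As a numerical illustration of the bounds [9], [15], and [17], the following Python sketch evaluates $\log Z_N$ exactly from the form [8], grouping the $2^N$ configurations by the number of spins equal to $-1$, and compares the result with the trial expression of [14]. The function names (`log_z`, `trial`, `sup_trial`), the grid search for the supremum over $M$ restricted to $[-1, 1]$, and the parameter values are assumptions made for this illustration only; they are not part of the original treatment.

```python
import math

def log_z(n, beta, h):
    """log Z_N computed from [8], grouping configurations by magnetization."""
    terms = []
    for k in range(n + 1):                      # k = number of spins equal to -1
        m = (n - 2 * k) / n                     # magnetization, as in [5]
        # log of the binomial multiplicity of configurations with this m
        log_mult = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1))
        terms.append(log_mult + 0.5 * beta * n * m * m + beta * h * n * m)
    top = max(terms)                            # log-sum-exp for stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def trial(big_m, beta, h):
    """Trial expression on the right-hand side of [14]."""
    return (math.log(2.0) + math.log(math.cosh(beta * (h + big_m)))
            - 0.5 * beta * big_m ** 2)

def sup_trial(beta, h, grid=20001):
    """Grid approximation (from below) of the supremum over M in [15]."""
    return max(trial(-1.0 + 2.0 * j / (grid - 1), beta, h) for j in range(grid))

if __name__ == "__main__":
    beta, h = 1.5, 0.1                          # illustrative values (assumed)
    best = sup_trial(beta, h)
    for n in (10, 100, 1000):
        per_site = log_z(n, beta, h) / n
        print(n, best, per_site, best + math.log(n + 1) / n)
```

Each printed row lists $N$, the lower bound of [15], the value of $N^{-1}\log Z_N$, and the upper bound of [17]; the middle value should lie between the other two and approach the first as $N$ grows.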
The self-averaging property of $m$ stated in [21] is the deep reason for the success of the strategy exploited earlier for the convergence of the free energy. Easy consequences are the following. In the infinite-volume limit, for $h \ne 0$, the Boltzmann–Gibbs state becomes a factor state

$$\lim_{N\to\infty} \omega_N(\sigma_1 \dots \sigma_s) = M(\beta, h)^s \qquad [22]$$

A phase transition appears in the form of spontaneous magnetization. In fact, while for $h = 0$ and $\beta \le 1$ we have $M(\beta, h) = 0$, on the other hand, for $\beta > 1$, we have the discontinuity

$$\lim_{h\to 0^+} M(\beta, h) = -\lim_{h\to 0^-} M(\beta, h) \equiv M(\beta) > 0 \qquad [23]$$

Fluctuations can also be easily controlled. In fact, one proves that the rescaled random variable $\sqrt{N}\,(m - M(\beta, h))$ tends in distribution, under $\omega_N$, to a centered Gaussian with variance given by the susceptibility

$$\chi(\beta, h) \equiv \frac{\partial}{\partial h} M(\beta, h) \equiv \frac{\beta(1 - M^2)}{1 - \beta(1 - M^2)} \qquad [24]$$

Notice that the variance becomes infinite only at the critical point $h = 0$, $\beta = 1$, where $M = 0$.

Now we are ready to attack the much more difficult spin glass model. But it will be surprising to see that, by following a simple extension of the methods described here, we will arrive at similar results.

Basic Definitions for the Mean-Field Spin Glass Model

As in the ferromagnetic case, the generic configuration of the mean-field spin glass model is defined through Ising spin variables $\sigma_i = \pm 1$, attached to each site $i = 1, 2, \dots, N$.

But now there is an external quenched disorder given by the $N(N-1)/2$ independent and identically distributed random variables $J_{ij}$, defined for each pair of sites. For the sake of simplicity, we assume each $J_{ij}$ to be a centered unit Gaussian with averages $E(J_{ij}) = 0$, $E(J_{ij}^2) = 1$. By quenched disorder we mean that the $J$ have a kind of stochastic external influence on the system, without contributing to the thermal equilibrium.

Now the Hamiltonian of the model, in some external field of strength $h$, is given by the mean-field expression

$$H_N(\sigma, h, J) = -\frac{1}{\sqrt{N}} \sum_{(i,j)} J_{ij}\, \sigma_i \sigma_j - h \sum_i \sigma_i \qquad [25]$$

Here, the first sum extends to all site pairs, and the second to all sites. Notice the $\sqrt{N}$, necessary to ensure a good thermodynamic behavior to the free energy.

For a given inverse temperature $\beta$, let us now introduce the disorder-dependent partition function $Z_N(\beta, h, J)$ and the quenched average of the free energy per site $f_N(\beta, h)$, according to the definitions

$$Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp(-\beta H_N(\sigma, h, J)) \qquad [26]$$

$$-\beta f_N(\beta, h) = N^{-1} E \log Z_N(\beta, h, J) \qquad [27]$$

Notice that in [27] the average $E$ with respect to the external noise is made "after" the log is taken. This procedure is called quenched averaging. It represents the physical idea that the external noise does not contribute to the thermal equilibrium. Only the $\sigma$'s are thermalized.

For the sake of simplicity, it is also convenient to write the partition function in the following equivalent form. First of all let us introduce a family of centered Gaussian random variables $K(\sigma)$, indexed by the configurations $\sigma$, and characterized by the covariances

$$E(K(\sigma) K(\sigma')) = q^2(\sigma, \sigma') \qquad [28]$$

where $q(\sigma, \sigma')$ are the overlaps between two generic configurations, defined by

$$q(\sigma, \sigma') = N^{-1} \sum_i \sigma_i \sigma'_i \qquad [29]$$

with the obvious bounds $-1 \le q(\sigma, \sigma') \le 1$, and the normalization $q(\sigma, \sigma) = 1$. Then, starting from the definition [25], it is immediately seen that the partition function in [26] can also be written, by neglecting unessential constant terms, in the form

$$Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp\left(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\right) \exp\left(\beta h \sum_i \sigma_i\right) \qquad [30]$$

which will be the starting point of our treatment.

Basic Formulas of Derivation and Interpolation

We work in the following general setting. Let $U_i$ be a family of centered Gaussian random variables, $i = 1, \dots, K$, with covariance matrix given by $E(U_i U_j) \equiv S_{ij}$. We treat the index $i$ now as configuration space for some statistical mechanics system, with partition function $Z$ and quenched free energy given by

$$E \log \sum_i w_i \exp(\sqrt{t}\, U_i) \equiv E \log Z \qquad [31]$$
where $w_i \ge 0$ are generic weights, and $t$ is a parameter ruling the strength of the interaction.

It would be hard to overestimate the relevance of the following derivation formula

$$\frac{d}{dt} E \log \sum_i w_i \exp(\sqrt{t}\, U_i) = \frac{1}{2} E\left( Z^{-1} \sum_i w_i \exp(\sqrt{t}\, U_i)\, S_{ii} \right) - \frac{1}{2} E\left( Z^{-2} \sum_i \sum_j w_i w_j \exp(\sqrt{t}\, U_i) \exp(\sqrt{t}\, U_j)\, S_{ij} \right) \qquad [32]$$

The proof is straightforward. First we perform directly the $t$-derivative. Then, we notice that the random variables appear in expressions of the form $E(U_i F)$, where $F$ are functions of the $U$'s. These can be easily handled through the following integration by parts formula for generic Gaussian random variables, strongly reminiscent of the Wick theorem in quantum field theory,

$$E(U_i F) = \sum_j S_{ij}\, E\left( \frac{\partial}{\partial U_j} F \right) \qquad [33]$$

Therefore, we see that always two derivatives are involved. The two terms in [32] come from the action of the $U_j$ derivatives, the first acting on the Boltzmann factor, and giving rise to a Kronecker $\delta_{ij}$, the second acting on $Z^{-1}$, and giving rise to the minus sign and the duplication of variables.

The derivation formula can be expressed in a more compact form by introducing replicas and suitable averages. In fact, let us introduce the state $\omega$ acting on functions $F$ of $i$ as follows

$$\omega(F(i)) = Z^{-1} \sum_i w_i \exp(\sqrt{t}\, U_i)\, F(i) \qquad [34]$$

together with the associated product state $\Omega$ acting on replicated configuration spaces $i_1, i_2, \dots, i_s$. By performing also a global $E$ average, finally we define the averages

$$\langle F \rangle_t \equiv E\, \Omega(F) \qquad [35]$$

where the subscript is introduced in order to recall the $t$ dependence of these averages.

Then, eqn [32] can be written in a more compact form

$$\frac{d}{dt} E \log \sum_i w_i \exp(\sqrt{t}\, U_i) = \frac{1}{2} \langle S_{i_1 i_1} \rangle - \frac{1}{2} \langle S_{i_1 i_2} \rangle \qquad [36]$$

Our basic comparison argument will be based on the following very simple theorem.

Theorem 2 Let $U_i$ and $\hat U_i$, for $i = 1, \dots, K$, be independent families of centered Gaussian random variables, whose covariances satisfy the inequalities for generic configurations

$$E(U_i U_j) \equiv S_{ij} \ge E(\hat U_i \hat U_j) \equiv \hat S_{ij} \qquad [37]$$

and the equalities along the diagonal

$$E(U_i U_i) \equiv S_{ii} = E(\hat U_i \hat U_i) \equiv \hat S_{ii} \qquad [38]$$

then for the quenched averages we have the inequality in the opposite sense

$$E \log \sum_i w_i \exp(U_i) \le E \log \sum_i w_i \exp(\hat U_i) \qquad [39]$$

where the $w_i \ge 0$ are the same in the two expressions.

Considerations of this kind are present in the mathematical literature, as mentioned, for example, in Talagrand (2003).

The proof is extremely simple and amounts to a straightforward calculation. In fact, let us consider the interpolating expression

$$E \log \sum_i w_i \exp\left(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat U_i\right) \qquad [40]$$

where $0 \le t \le 1$. Clearly, the two expressions under comparison correspond to the values $t = 0$ and $t = 1$, respectively. By taking the derivative with respect to $t$, with the help of the previous derivation formula, we arrive at the evaluation of the $t$ derivative in the form

$$\frac{d}{dt} E \log \sum_i w_i \exp\left(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat U_i\right) = \frac{1}{2} E\left( Z^{-1} \sum_i w_i \exp\left(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat U_i\right) (S_{ii} - \hat S_{ii}) \right) - \frac{1}{2} E\left( Z^{-2} \sum_i \sum_j w_i w_j \exp\left(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat U_i\right) \exp\left(\sqrt{t}\, U_j + \sqrt{1 - t}\, \hat U_j\right) (S_{ij} - \hat S_{ij}) \right) \qquad [41]$$

where $Z$ now denotes the sum appearing in [40]. From the conditions assumed for the covariances, the first term vanishes and the second is nonpositive, so we immediately see that the interpolating function is nonincreasing in $t$, and the theorem follows.

The derivation formula and the comparison theorem are not restricted to the Gaussian case. Generalizations in many directions are possible. For the diluted spin glass models and optimization problems we refer, for example, to Franz and Leone (2003), and to De Sanctis (2005), and references therein.
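The derivation formula [36] and the comparison inequality [39] can be checked numerically in the smallest nontrivial case. The Python sketch below assumes $K = 2$, unit variances $S_{11} = S_{22} = 1$, and $S_{12} = \rho$; under these assumptions the quenched average depends only on the difference $U_2 - U_1$, a centered Gaussian of variance $2(1 - \rho)$, so that a one-dimensional quadrature suffices. This reduction, the function names (`gauss_mean`, `free_energy`, `rhs_36`), the weights, and the trapezoidal rule are choices made for this illustration, not part of the original treatment.

```python
import math

def gauss_mean(func, n=4001, cut=10.0):
    """Trapezoidal approximation of E func(Z), Z a centered unit Gaussian."""
    step = 2.0 * cut / (n - 1)
    total = 0.0
    for j in range(n):
        z = -cut + j * step
        weight = step * (0.5 if j in (0, n - 1) else 1.0)
        total += weight * func(z) * math.exp(-0.5 * z * z)
    return total / math.sqrt(2.0 * math.pi)

def free_energy(t, rho, w1=0.3, w2=0.7):
    """E log(w1 exp(sqrt(t) U1) + w2 exp(sqrt(t) U2)) for K = 2.

    Uses log(w1 e^a + w2 e^b) = a + log(w1 + w2 e^(b - a)) and E(U1) = 0.
    """
    sd = math.sqrt(2.0 * t * (1.0 - rho))   # std. dev. of sqrt(t) (U2 - U1)
    return gauss_mean(lambda z: math.log(w1 + w2 * math.exp(sd * z)))

def rhs_36(t, rho, w1=0.3, w2=0.7):
    """Right-hand side of [36] in the same two-configuration setting."""
    sd = math.sqrt(2.0 * t * (1.0 - rho))

    def integrand(z):
        p2 = w2 * math.exp(sd * z) / (w1 + w2 * math.exp(sd * z))
        p1 = 1.0 - p2                        # Gibbs weights of the two configurations
        diagonal = 1.0                       # <S_{i1 i1}> with unit variances
        two_replicas = p1 * p1 + p2 * p2 + 2.0 * rho * p1 * p2
        return 0.5 * diagonal - 0.5 * two_replicas

    return gauss_mean(integrand)

if __name__ == "__main__":
    t, rho, eps = 0.7, 0.3, 1e-4
    finite_diff = (free_energy(t + eps, rho) - free_energy(t - eps, rho)) / (2 * eps)
    print("d/dt by finite differences:", finite_diff)
    print("right-hand side of [36]:   ", rhs_36(t, rho))
    # Theorem 2: the larger off-diagonal covariance gives the smaller average.
    print(free_energy(1.0, 0.5), "<=", free_energy(1.0, 0.2))
```

The two derivative values should agree up to discretization error, and the last line illustrates [39] with $S_{12} = 0.5$ and $\hat S_{12} = 0.2$.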
Thermodynamic Limit and the Variational Bounds

We give here some striking applications of the basic comparison theorem. Guerra and Toninelli (2002) have given a very simple proof of a long-awaited result, about the convergence of the free energy per site in the thermodynamic limit. Let us show the argument. Let us consider a system of size $N$ and two smaller systems of sizes $N_1$ and $N_2$ respectively, with $N = N_1 + N_2$, as before in the ferromagnetic case. Let us now compare

$$E \log Z_N(\beta, h, J) = E \log \sum_{\sigma_1 \dots \sigma_N} \exp\left(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\right) \exp\left(\beta h \sum_i \sigma_i\right) \qquad [42]$$

with

$$E \log \sum_{\sigma_1 \dots \sigma_N} \exp\left(\beta \sqrt{\frac{N_1}{2}}\, K^{(1)}(\sigma^{(1)})\right) \exp\left(\beta \sqrt{\frac{N_2}{2}}\, K^{(2)}(\sigma^{(2)})\right) \exp\left(\beta h \sum_i \sigma_i\right) \equiv E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J) \qquad [43]$$

where $\sigma^{(1)}$ stands for $\sigma_i$, $i = 1, \dots, N_1$, and $\sigma^{(2)}$ for $\sigma_i$, $i = N_1 + 1, \dots, N$. Covariances for $K^{(1)}$ and $K^{(2)}$ are expressed as in [28], but now the overlaps are substituted with the partial overlaps of the first and second block, $q_1$ and $q_2$, respectively. It is very simple to apply the comparison theorem. All one has to do is to observe that the obvious

$$N q = N_1 q_1 + N_2 q_2 \qquad [44]$$

analogous to [10], implies, as in [12],

$$q^2 \le \frac{N_1}{N} q_1^2 + \frac{N_2}{N} q_2^2 \qquad [45]$$

Therefore, the comparison gives the superadditivity property, to be compared with [9],

$$E \log Z_N(\beta, h, J) \ge E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J) \qquad [46]$$

From the superadditivity property the existence of the limit follows in the form

$$\lim_{N\to\infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J) \qquad [47]$$

to be compared with [13].

The second application is in the form of the Aizenman–Sims–Starr generalized variational principle. Here, we will need to introduce some auxiliary system. The denumerable configuration space is given by the values of $\alpha = 1, 2, \dots$. We introduce also weights $w_\alpha \ge 0$ for the $\alpha$ system, and suitably defined overlaps between two generic configurations $p(\alpha, \alpha')$, with $p(\alpha, \alpha) = 1$.

A family of centered Gaussian random variables $\hat K(\alpha)$, now indexed by the configurations $\alpha$, will be defined by the covariances

$$E(\hat K(\alpha) \hat K(\alpha')) = p^2(\alpha, \alpha') \qquad [48]$$

We will also need a family of centered Gaussian random variables $\eta_i(\alpha)$, indexed by the sites $i$ of our original system and the configurations $\alpha$ of the auxiliary system, so that

$$E(\eta_i(\alpha)\, \eta_{i'}(\alpha')) = \delta_{ii'}\, p(\alpha, \alpha') \qquad [49]$$

Both the probability measure $w_\alpha$, and the overlaps $p(\alpha, \alpha')$ could depend on some additional external quenched noise, which does not appear explicitly in our notation.

In the following, we will denote by $E$ averages with respect to all random variables involved.

In order to start the comparison argument, we will consider first the case where the two $\sigma$ and $\alpha$ systems are not coupled, so as to appear factorized in the form

$$E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\left(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\right) \exp\left(\beta \sqrt{\frac{N}{2}}\, \hat K(\alpha)\right) \exp\left(\beta h \sum_i \sigma_i\right) \equiv E \log Z_N(\beta, h, J) + E \log \sum_\alpha w_\alpha \exp\left(\beta \sqrt{\frac{N}{2}}\, \hat K(\alpha)\right) \qquad [50]$$

In the second case, the $K$ fields are suppressed and the coupling between the two systems will be taken in a very simple form, by allowing the $\eta$ field to act as an external field on the $\sigma$ system. In this way the $\sigma$'s appear as factorized, and the sums can be explicitly performed. The chosen form for the second term in the comparison is

$$E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\left(\beta \sum_i \eta_i(\alpha)\, \sigma_i\right) \exp\left(\beta h \sum_i \sigma_i\right) \equiv N \log 2 + E \log \sum_\alpha w_\alpha\, (c_1 c_2 \dots c_N) \qquad [51]$$
where we have defined

$$c_i = \cosh \beta(h + \eta_i(\alpha)) \qquad [52]$$

as arising from the sums over $\sigma$'s.

Now we apply the comparison theorem. In the first case, the covariances involve the sums of squares of overlaps

$$\frac{1}{2}\left( q^2(\sigma, \sigma') + p^2(\alpha, \alpha') \right) \qquad [53]$$

In the second case, a very simple calculation shows that the covariances involve the overlap products

$$q(\sigma, \sigma')\, p(\alpha, \alpha') \qquad [54]$$

Therefore, the comparison is very easy and, by collecting all expressions, we end up with the useful estimate, as in Aizenman et al. (2003), holding for any auxiliary system as defined before,

$$N^{-1} E \log Z_N(\beta, h, J) \le \log 2 + N^{-1}\left( E \log \sum_\alpha w_\alpha\, (c_1 c_2 \dots c_N) - E \log \sum_\alpha w_\alpha \exp\left(\beta \sqrt{\frac{N}{2}}\, \hat K(\alpha)\right) \right) \qquad [55]$$

The Parisi Representation for the Free Energy

We refer to the original papers, reprinted in the extensive review given in Mézard et al. (2002), for the general motivations, and the derivation of the broken replica ansatz, in the frame of the ingenious replica trick. Here, we limit ourselves to a synthetic description of its general structure, independently from the replica trick.

First of all, let us introduce the convex space $X$ of the functional order parameters $x$, as nondecreasing functions of the auxiliary variable $q$, both $x$ and $q$ taking values on the interval $[0, 1]$, that is,

$$X \ni x : [0, 1] \ni q \to x(q) \in [0, 1] \qquad [56]$$

Notice that we call $x$ the function, and $x(q)$ its values. We introduce a metric on $X$ through the $L^1([0, 1], dq)$-norm, where $dq$ is the Lebesgue measure.

For our purposes, we will consider the case of piecewise constant functional order parameters, characterized by an integer $K$, and two sequences $q_0, q_1, \dots, q_K$, $m_1, m_2, \dots, m_K$ of numbers satisfying

$$0 = q_0 \le q_1 \le \dots \le q_{K-1} \le q_K = 1, \qquad 0 \le m_1 \le m_2 \le \dots \le m_K \le 1 \qquad [57]$$

such that

$$x(q) = \begin{cases} m_1 & \text{for } 0 = q_0 \le q < q_1 \\ m_2 & \text{for } q_1 \le q < q_2 \\ \ \vdots \\ m_K & \text{for } q_{K-1} \le q \le q_K \end{cases} \qquad [58]$$

In the following, we will find it convenient to define also $m_0 \equiv 0$, and $m_{K+1} \equiv 1$. The replica symmetric case of Sherrington and Kirkpatrick corresponds to

$$K = 2, \quad q_1 = \bar q, \quad m_1 = 0, \quad m_2 = 1 \qquad [59]$$

Let us now introduce the function $f$, with values $f(q, y; x, \beta)$, of the variables $q \in [0, 1]$, $y \in R$, depending also on the functional order parameter $x$, and on the inverse temperature $\beta$, defined as the solution of the nonlinear antiparabolic equation

$$(\partial_q f)(q, y) + \frac{1}{2} (\partial_y^2 f)(q, y) + \frac{1}{2}\, x(q)\, (\partial_y f)^2(q, y) = 0 \qquad [60]$$

with final condition

$$f(1, y) = \log\cosh(\beta y) \qquad [61]$$

Here, we have stressed only the dependence of $f$ on $q$ and $y$.

It is very simple to integrate eqn [60] when $x$ is piecewise constant. In fact, consider $x(q) = m_a$, for $q_{a-1} \le q \le q_a$, firstly with $m_a > 0$. Then, it is immediately seen that the correct solution of eqn [60] in this interval, with the right final boundary condition at $q = q_a$, is given by

$$f(q, y) = \frac{1}{m_a} \log \int \exp\left( m_a f(q_a, y + z\sqrt{q_a - q}) \right) d\mu(z) \qquad [62]$$

where $d\mu(z)$ is the centered unit Gaussian measure on the real line. On the other hand, if $m_a = 0$, then [60] loses the nonlinear part and the solution is given by

$$f(q, y) = \int f(q_a, y + z\sqrt{q_a - q})\, d\mu(z) \qquad [63]$$

which can be seen also as deriving from [62] in the limit $m_a \to 0$. Starting from the last interval $K$, and using [62] iteratively on each interval, we easily get the solution of [60], [61], in the case of piecewise constant order parameter $x$, as in [58], through a chain of interconnected Gaussian integrations.

Now, we introduce the following important definitions. The trial auxiliary function, associated to a given mean-field spin glass system, as described
earlier, depending on the functional order parameter $x$, is defined as

$$\log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \qquad [64]$$

Notice that in this expression the function $f$ appears evaluated at $q = 0$, and $y = h$, where $h$ is the value of the external magnetic field. This trial expression should be considered as the analog of that appearing in [14] for the ferromagnetic case.

The Parisi spontaneously broken replica symmetry expression for the free energy is given by the definition

$$-\beta f_P(\beta, h) \equiv \inf_x \left( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \right) \qquad [65]$$

where the infimum is taken with respect to all functional order parameters $x$. Notice that the infimum appears here, as compared to the supremum in the ferromagnetic case.

By exploiting a kind of generalized comparison argument, involving a suitably defined interpolation function, Guerra (2003) has established the following important result.

Theorem 3 For all values of the inverse temperature $\beta$, and the external magnetic field $h$, and for any functional order parameter $x$, the following bound holds:

$$N^{-1} E \log Z_N(\beta, h, J) \le \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq$$

uniformly in $N$. Consequently, we have also

$$N^{-1} E \log Z_N(\beta, h, J) \le \inf_x \left( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \right)$$

uniformly in $N$.

However, this result can also be understood in the framework of the generalized variational principle established by Aizenman–Sims–Starr as described earlier.

In fact, one can easily show that there exist $\alpha$ systems such that

$$N^{-1} E \log \sum_\alpha w_\alpha\, c_1 c_2 \dots c_N \equiv f(0, h; x, \beta) \qquad [66]$$

$$N^{-1} E \log \sum_\alpha w_\alpha \exp\left(\beta \sqrt{\frac{N}{2}}\, \hat K(\alpha)\right) \equiv \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \qquad [67]$$

uniformly in $N$. This result stems from earlier work of Derrida, Ruelle, Neveu, Bolthausen, Sznitman, Aizenman, Contucci, Talagrand, Bovier, and others, and in a sense is implicit in the treatment given in Mézard et al. (1987). It can be reached in a very simple way. Let us sketch the argument.

First of all, let us consider the Poisson point process $y_1 \ge y_2 \ge y_3 \dots$, uniquely characterized by the following conditions. For any interval $A$, introduce the occupation numbers $N(A)$, defined by

$$N(A) = \sum_\alpha \chi(y_\alpha \in A) \qquad [68]$$

where $\chi(\cdot) = 1$, if the random variable $y_\alpha$ belongs to the interval $A$, and $\chi(\cdot) = 0$, otherwise. We assume that $N(A)$ and $N(B)$ are independent if the intervals $A$ and $B$ are disjoint, and moreover that for each $A$, the random variable $N(A)$ has a Poisson distribution with parameter

$$\mu(A) = \int_a^b \exp(-y)\, dy \qquad [69]$$

if $A$ is the interval $(a, b)$, that is,

$$P(N(A) = k) = \exp(-\mu(A))\, \mu(A)^k / k! \qquad [70]$$

We will exploit $y_\alpha$ as energy levels for a statistical mechanics system with configurations indexed by $\alpha$. For a parameter $0 < m < 1$, playing the role of inverse temperature, we can introduce the partition function

$$v = \sum_\alpha \exp\left(\frac{y_\alpha}{m}\right) \qquad [71]$$

For $m$ in the given interval it turns out that $v$ is a very well defined random variable, with the sum over $\alpha$ extending to infinity. In fact, there is a strong inbuilt smooth cutoff in the very definition of the stochastic energy levels.

From the general properties of Poisson point processes, it is very well known that the following basic invariance property holds. Introduce a random variable $b$, independent of $y$, subject to the condition $E(\exp b) = 1$, and let $b_\alpha$ be independent copies. Then, the randomly biased point process $y'_\alpha = y_\alpha + b_\alpha$, $\alpha = 1, 2, \dots$, is equivalent to the original one in distribution. An immediate consequence is the following. Let $f$ be a random variable, independent of $y$, such that $E(\exp f) < \infty$, and let $f_\alpha$ be independent copies. Then, the two random variables

$$\sum_\alpha \exp\left(\frac{y_\alpha}{m}\right) \exp(f_\alpha) \qquad [72]$$

$$\sum_\alpha \exp\left(\frac{y_\alpha}{m}\right) E(\exp(m f))^{1/m} \qquad [73]$$
have the same distribution. In particular, they can be freely substituted under averages.

The auxiliary system which gives rise to the Parisi representation according to [66] and [67], for a piecewise constant order parameter, is expressed in the following way. Now $\alpha$ will be a multi-index $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_K)$, where each $\alpha_a$ runs on $1, 2, 3, \ldots$. Define the Poisson point process $y_{\alpha_1}$, then, independently, for each value of $\alpha_1$, processes $y_{\alpha_1\alpha_2}$, and so on up to $y_{\alpha_1\alpha_2\ldots\alpha_K}$. Notice that in the cascade of independent processes $y_{\alpha_1}, y_{\alpha_1\alpha_2}, \ldots, y_{\alpha_1\alpha_2\ldots\alpha_K}$, the last index refers to the numbering of the various points of the process, while the first indices denote independent copies labeled by the corresponding $\alpha$'s.

The weights $w_\alpha$ have to be chosen according to the definition

\[ w_\alpha = \exp\frac{y_{\alpha_1}}{m_1}\,\exp\frac{y_{\alpha_1\alpha_2}}{m_2}\cdots\exp\frac{y_{\alpha_1\alpha_2\ldots\alpha_K}}{m_K} \qquad [74] \]

The cavity fields $\eta$ and $\kappa$ have the following expression in terms of independent unit Gaussian random variables $J^i_{\alpha_1}, J^i_{\alpha_1\alpha_2}, \ldots, J^i_{\alpha_1\alpha_2\ldots\alpha_K}$, $J'_{\alpha_1}, J'_{\alpha_1\alpha_2}, \ldots, J'_{\alpha_1\alpha_2\ldots\alpha_K}$,

\[ \eta_i(\alpha) = \sqrt{q_1 - q_0}\,J^i_{\alpha_1} + \sqrt{q_2 - q_1}\,J^i_{\alpha_1\alpha_2} + \cdots + \sqrt{q_K - q_{K-1}}\,J^i_{\alpha_1\alpha_2\ldots\alpha_K} \qquad [75] \]

\[ \kappa(\alpha) = \sqrt{q_1^2 - q_0^2}\,J'_{\alpha_1} + \sqrt{q_2^2 - q_1^2}\,J'_{\alpha_1\alpha_2} + \cdots + \sqrt{q_K^2 - q_{K-1}^2}\,J'_{\alpha_1\alpha_2\ldots\alpha_K} \qquad [76] \]

It is immediate to verify that $E(\eta_i(\alpha)\eta_{i'}(\alpha'))$ is zero if $i \neq i'$, while

\[ E(\eta_i(\alpha)\eta_i(\alpha')) = \begin{cases} 0 & \text{if } \alpha_1 \neq \alpha'_1 \\ q_1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 \neq \alpha'_2 \\ q_2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \alpha_3 \neq \alpha'_3 \\ \ \vdots & \\ 1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \ldots,\ \alpha_K = \alpha'_K \end{cases} \qquad [77] \]

Similarly, we have

\[ E(\kappa(\alpha)\kappa(\alpha')) = \begin{cases} 0 & \text{if } \alpha_1 \neq \alpha'_1 \\ q_1^2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 \neq \alpha'_2 \\ q_2^2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \alpha_3 \neq \alpha'_3 \\ \ \vdots & \\ 1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \ldots,\ \alpha_K = \alpha'_K \end{cases} \qquad [78] \]

This ends the definition of the $\alpha$ system, associated to a given piecewise constant order parameter.

Now, it is simple to verify that [66] and [67] hold. Let us consider, for example, [66]. With the $\alpha$ system chosen as before, the repeated application of the stochastic equivalence of [72] and [73] will give rise to a sequence of interchained Gaussian integrations exactly equivalent to those arising from the expression for $f$, as solution of the eqn [60]. For [73], there are equivalent considerations.

Therefore, we see that the estimate in Theorem 3 is also a consequence of the generalized variational principle.

Up to this point we have seen how to obtain upper bounds. The problem arises whether, as in the ferromagnetic case, we can also get lower bounds, so as to shrink the thermodynamic limit to the value given by the $\inf_x$ in Theorem 3. After a short announcement, Talagrand (2005) has firmly established the complete proof of the control of the lower bound. We refer to the original paper for the complete details of this remarkable achievement. About the methods, here we only recall that in Guerra (2003) we have given also the corrections to the bounds appearing in Theorem 3, albeit in a quite complicated form. Talagrand has been able to establish that these corrections do in fact vanish in the thermodynamic limit.

In conclusion, we can establish the following extension of Theorem 1 to spin glasses.

Theorem 4 For the mean-field spin glass model we have

\[ \lim_{N\to\infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J) \qquad [79] \]

\[ = \inf_x \left( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2}\int_0^1 q\,x(q)\,dq \right) \qquad [80] \]

Diluted Models

Diluted models, in a sense, play a role intermediate between the mean-field case and the short-range case. In fact, while in the mean-field model each site is interacting with all other sites, on the other hand, in the diluted model, each site is interacting with only a fixed number of other sites. However, while for the short-range models there is a definition of distance among sites, relevant for the interaction, no such definition appears in the diluted models, where all sites are in any case equivalent. From this point of view, the diluted models are structurally similar to the mean-field models, and most of the
techniques and results explained before can be extended to them.

Let us define a typical diluted model. The quenched noise is described as follows. Let $K$ be a Poisson random variable with parameter $\alpha N$, where $N$ is the number of sites, and $\alpha$ is a parameter entering the theory, together with the temperature. We consider also a sequence of independent centered random variables $J_1, J_2, \ldots$, and a sequence of discrete independent random variables $i_1, j_1, i_2, j_2, \ldots$, uniformly distributed over the set of sites $1, 2, \ldots, N$. Then we assume as Hamiltonian

\[ H_N(\sigma) = -\sum_{k=1}^{K} J_k\,\sigma_{i_k}\sigma_{j_k} \qquad [81] \]

Only the variables $\sigma$ contribute to thermodynamic equilibrium. All noise coming from $K$, $J_k$, $i_k$, $j_k$ is considered quenched, and it is not explicitly indicated in our notation for $H$.

The role played by Gaussian integration by parts in the Sherrington–Kirkpatrick model is assumed here by the following elementary derivation formula, holding for Poisson distributions,

\[ \frac{d}{dt} P(K = k, t\alpha N) \equiv \frac{d}{dt}\,\exp(-t\alpha N)\,\frac{(t\alpha N)^k}{k!} = \alpha N\,\bigl(P(K = k-1, t\alpha N) - P(K = k, t\alpha N)\bigr) \qquad [82] \]

Then, all machinery of interpolation can be easily extended to the diluted models, as first recognized by Franz and Leone (2003).

In this way, the superadditivity property, the thermodynamic limit, and the generalized variational principle can be easily established. We refer to Franz and Leone (2003), and De Sanctis (2005), for a complete treatment.

There is an important open problem here. While in the fully connected case, the Poisson probability cascades provide the right auxiliary $\alpha$ systems to be exploited in the variational principle, on the other hand in the diluted case more complicated probability cascades have been proposed, as shown, for example, in Franz and Leone (2003), and in Panchenko and Talagrand (2004). On the other hand, in De Sanctis (2005), the very interesting proposal has been made that also in the case of diluted models the Poisson probability cascades play a very important role. Of course, here the auxiliary system interacts with the original system differently, and involves a multi-overlap structure as explained in De Sanctis (2005). In this way a kind of very deep universality is emerging. Poisson probability cascades are a kind of universal class of auxiliary systems. The different models require different cavity fields ruling the interaction between the original system and the auxiliary system. But further work will be necessary in order to clarify this very important issue. For results about diluted models in the high-temperature region, we refer to Guerra and Toninelli (2004).

Short-Range Model and Its Connections with the Mean-Field Version

The investigations of the connections between the short-range version of the model and its mean-field version are at the beginning. Here, we limit ourselves to a synthetic description of what should be done, and to a short presentation of the results obtained so far.

First of all, according to the conventional wisdom, the mean-field version should be a kind of limit of the short-range model on a lattice in dimension $d$, when $d \to \infty$, with a proper rescaling of the strength of the Hamiltonian, of the form $d^{-1/2}$. Results of this kind are very well known in the ferromagnetic case, but the present technology of interpolation does not seem sufficient to assure a proof in the spin glass case. So, this very basic result is still missing. In analogy with the ferromagnetic case, it would be necessary to arrive at the notion of a critical dimension, beyond which the features of the mean-field case still hold, for example, in the expression of the critical exponents and in the ultrametric hierarchical structure of the pure phases, or at least for the overlap distributions. For physical dimensions less than the critical one, the short-range model would need corrections with respect to its mean-field version. Therefore, this is a completely open problem.

Moreover, always according to the conventional wisdom, the mean-field version should be a kind of limit of the short-range models, in finite fixed dimensions, as the range of the interaction goes to infinity, with proper rescaling. Important work of Franz and Toninelli shows that this is effectively the case, if a properly defined Kac limit is performed. Here, interpolation methods are effective, and we refer to Franz and Toninelli (2004), and references quoted there, for full details.

Due to the lack of efficient analytical methods, it is clear that numerical simulations play a very important role in the study of the physical properties emerging from short-range spin glass models. In particular, we refer to Marinari et al. (2000) for a detailed account of the evidence, coming from theoretical considerations and extensive computer simulations, that some of the more relevant features of the spontaneous replica breaking scheme of the mean field are also present in
short-range models in three dimensions. Different views are expressed, for example, in Newman and Stein (1998), where it is argued that the phase-space structure of short-range spin glass models is much simpler than that foreseen by the Parisi spontaneous replica symmetry breaking mechanism.

Such very different views, both apparently strongly supported by reasonable theoretical considerations and powerful numerical simulations, are a natural consequence of the extraordinary difficulty of the problem.

It is clear that extensive additional work will be necessary before the clarification of the physical features exhibited by the realistic short-range spin glass models.

Conclusion and Outlook for Future Developments

As we have seen, in these last few years, there has been impressive progress in the understanding of the mathematical structure of spin glass models, mainly due to the systematic exploration of comparison and interpolation methods. However, many important problems are still open. The most important one is to establish rigorously the full hierarchical ultrametric organization of the overlap distributions, as appears in Parisi theory, and to fully understand the decomposition in pure states of the glassy phase, at low temperatures.

Moreover, it would be important to extend these methods to other important disordered models as, for example, neural networks. Here the difficulty is that the positivity arguments, so essential in comparison methods, do not seem to emerge naturally inside the structure of the theory.

Finally, the problem of connecting properties of the short-range model with those arising in the mean-field case is still almost completely open.

Acknowledgment

We gratefully acknowledge useful conversations with Michael Aizenman, Pierluigi Contucci, Giorgio Parisi, and Michel Talagrand. The strategy explained in this report grew out from a systematic exploration of comparison and interpolation methods, developed in collaboration with Fabio Lucio Toninelli, and Luca De Sanctis.

This work was supported in part by MIUR (Italian Ministry of Instruction, University and Research), and by INFN (Italian National Institute for Nuclear Physics).

See also: Glassy Disordered Systems: Dynamical Evolution; Large Deviations in Equilibrium Statistical Mechanics; Mean Field Spin Glasses and Neural Networks; Short-Range Spin Glasses: The Metastate Approach; Statistical Mechanics and Combinatorial Problems.

Further Reading

Aizenman M, Sims R, and Starr S (2003) Extended variational principle for the Sherrington–Kirkpatrick spin-glass model. Physical Review B 68: 214403.
De Sanctis L (2005) Structural Approaches to Spin Glasses and Optimization Problems. Ph.D. thesis, Department of Mathematics, Princeton University.
Edwards SF and Anderson PW (1975) Theory of spin glasses. Journal of Physics F: Metal Physics 5: 965–974.
Franz S and Leone M (2003) Replica bounds for optimization problems and diluted spin systems. Journal of Statistical Physics 111: 535–564.
Franz S and Toninelli FL (2004) The Kac limit for finite-range spin glasses. Physical Review Letters 92: 030602.
Guerra F (2001) Sum rules for the free energy in the mean field spin glass model. Fields Institute Communications 30: 161.
Guerra F (2003) Broken replica symmetry bounds in the mean field spin glass model. Communications in Mathematical Physics 233: 1–12.
Guerra F and Toninelli FL (2002) The thermodynamic limit in mean field spin glass models. Communications in Mathematical Physics 230: 71–79.
Guerra F and Toninelli FL (2004) The high temperature region of the Viana–Bray diluted spin glass model. Journal of Statistical Physics 115: 531–555.
Marinari E, Parisi G, Ricci-Tersenghi F, Ruiz-Lorenzo JJ, and Zuliani F (2000) Replica symmetry breaking in short range spin glasses: A review of the theoretical foundations and of the numerical evidence. Journal of Statistical Physics 98: 973–1074.
Mézard M, Parisi G, and Virasoro MA (1987) Spin Glass Theory and Beyond. Singapore: World Scientific.
Mézard M, Parisi G, and Zecchina R (2002) Analytic and algorithmic solution of random satisfiability problems. Science 297: 812.
Newman CM and Stein DL (1998) Simplicity of state and overlap structure in finite-volume realistic spin glasses. Physical Review E 57: 1356–1366.
Panchenko D and Talagrand M (2004) Bounds for diluted mean-field spin glass models. Probability Theory and Related Fields 130: 319–336.
Sherrington D and Kirkpatrick S (1975) Solvable model of a spin-glass. Physical Review Letters 35: 1792–1796.
Stein DL (1989) Disordered systems: Mostly spin glasses. In: Stein DL (ed.) Lectures in the Sciences of Complexity. New York: Addison-Wesley.
Talagrand M (2003) Spin Glasses: A Challenge for Mathematicians. Mean Field Models and Cavity Method. Berlin: Springer.
Talagrand M (2006) The Parisi formula. Annals of Mathematics 163: 221–263.
Young P (ed.) (1987) Spin Glasses and Random Fields. Singapore: World Scientific.
Spinors and Spin Coefficients


K P Tod, University of Oxford, Oxford, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Spinors were invented by the mathematician E Cartan (see, e.g., Cartan (1981)) in the early years of the last century in the course of his study of rotation groups. The physicist Pauli reinvented what Cartan would have called the spinors of SU(2), which is the double cover of the rotation group SO(3), in order to explain the spectroscopy of alkali atoms and the anomalous Zeeman effect. For this, he needed an essential two-valuedness of the electron, an internal quantum number to contribute to the angular momentum, which is now called spin. Now the wave function becomes a two-component column vector. It is worth noting that, despite the name, Pauli resisted the picture of an electron as a spinning "thing" on the grounds that, as a representation of SU(2) which was not a representation of SO(3), it should have no classical kinematic model, which a spinning object would have.

According to the review article of van der Waerden (1960), the term "spinor" is due to Ehrenfest in 1929, and was introduced in the flurry of interest after the next important step in the evolution of spinors in the physics literature, which was the introduction of a relativistic equation for the electron by Dirac (1928).

Dirac sought a linear, first-order but Lorentz-invariant equation for the electron which was to be the square root of the linear, Lorentz-invariant but second-order Klein–Gordon equation. He assumed the equation for the wave function $\psi$ would take the form

\[ L\psi := (i\gamma^a p_a + mcI)\psi = 0 \qquad [1] \]

where $p_a = -i\hbar\,\partial/\partial x^a$ for $a = 0, 1, 2, 3$, but where $\gamma^a$ are complex square matrices, of a size to be determined, and $I$ is the corresponding identity matrix. Differentiating [1] again, one obtains the Klein–Gordon equation for $\psi$ provided these matrices satisfy the equation

\[ \gamma^a\gamma^b + \gamma^b\gamma^a = -2\eta^{ab} I \qquad [2] \]

where $\eta^{ab}$ is the Minkowski metric, diag(1, −1, −1, −1).

Assuming the $\gamma^a$ have been found, the usual substitution $p \to p - ieA$, for a particle in a magnetic field with vector potential $A$, leads to the correct magnetic moment for the electron, so that this equation does describe an electron with spin in the form made familiar by Pauli.

To decide on the size of the matrices $\gamma^a$ and therefore the dimension of the space of $\psi$'s, one notices, with the aid of [2], that the following are a basis for the algebra generated by the $\gamma^a$:

\[ 1,\quad \gamma^a,\quad \gamma^{[a}\gamma^{b]},\quad \gamma^{[a}\gamma^{b}\gamma^{c]},\quad \gamma^{[a}\gamma^{b}\gamma^{c}\gamma^{d]} \qquad [3] \]

There are 16 elements in this basis, assuming that there are no extra identities among them, so that we might hope to find a representation as 4 × 4 matrices. This can be done, and Dirac gave explicit formulas in terms of Pauli matrices. The space of Dirac spinors is now a complex four-dimensional vector space, which turns out to split as the sum of a complex two-dimensional vector space $S$, which is referred to as a spin space, and its complex conjugate $\bar{S}$ (the relationship between a complex vector space and its complex conjugate is described in the text below and eqn [9]). Under proper, orthochronous Lorentz transformations, $S$ transforms into itself by SL(2, C) transformation, but space and time reflections relate $S$ to $\bar{S}$. The fact that there are two spin spaces $S$ and $\bar{S}$ in dimension 4 is the basis of chirality: an electron is represented by a Dirac spinor, which is a pair of spinors, one in each of $S$ and $\bar{S}$, which are related under space reflection; a particle represented just by a spinor in $S$ cannot be invariant under space reflection.

The Clifford algebra (see Clifford Algebras and their Representations) associated with a vector space $V$ with metric $g$ is defined as the algebra generated by elements $v$, $w$ of $V$ with the multiplication $\vee$ satisfying

\[ v \vee w + w \vee v = -2g(v, w) \qquad [4] \]

The matrices $\gamma^a$ define a representation of the Clifford algebra by associating a covector $v_a$ with a matrix $v = v_a\gamma^a$, since [2] is then equivalent to [4].

This part of the process works in any dimension $n$ and signature $s$. For odd $n$, as, for example, with Pauli spinors, the $\gamma^a$ are square matrices of size $2^N \times 2^N$, where $N = (n - 1)/2$, and there is a single spin space of dimension $2^N$. For even $n$, as with the original Dirac spinors, the $\gamma^a$ are square matrices of size $2^N \times 2^N$, where $N = n/2$, but there are two spin spaces each of dimension $2^{N-1}$. Reality properties of the spin spaces and the existence of other structures on them depend in an intricate way on $n$ and $s$ (Penrose and Rindler 1984, 1986, Benn and Tucker 1987).
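The counting argument behind [2] and [3] can be checked on one explicit choice of 4 × 4 matrices. The sketch below is only an illustration under stated assumptions: it takes the familiar Dirac-representation matrices (written here as `Gamma`, a name introduced for this example), multiplies them by $i$ so that the anticommutator has the sign used in [2], and then tests that the 16 products listed in [3] are linearly independent by checking that they are orthogonal under the trace inner product. The helper names are hypothetical and not taken from the article.

```python
from itertools import combinations

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(A, B):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def close(A, B, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]          # Pauli matrices
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
J2 = [[0, 1], [-1, 0]]
I4 = kron(I2, I2)
ETA = [1, -1, -1, -1]          # diagonal of the Minkowski metric

# Assumption: Dirac-representation matrices Gamma^a, which satisfy
# {Gamma^a, Gamma^b} = +2 eta^{ab} I; multiplying by i flips the sign.
Gamma = [kron(S3, I2)] + [kron(J2, s) for s in (S1, S2, S3)]
gamma = [[[1j * x for x in row] for row in G] for G in Gamma]

def clifford_holds():
    """Check gamma^a gamma^b + gamma^b gamma^a = -2 eta^{ab} I for all a, b."""
    for a in range(4):
        for b in range(4):
            ab, ba = matmul(gamma[a], gamma[b]), matmul(gamma[b], gamma[a])
            lhs = [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]
            eta_ab = ETA[a] if a == b else 0
            rhs = [[-2 * eta_ab * I4[i][j] for j in range(4)] for i in range(4)]
            if not close(lhs, rhs):
                return False
    return True

def basis_elements():
    """Products gamma^{a1}...gamma^{ar} with a1 < ... < ar, for r = 0..4.

    For distinct indices the gammas anticommute, so each ordered product
    equals the antisymmetrized product listed in [3].
    """
    elems = []
    for r in range(5):
        for idx in combinations(range(4), r):
            M = I4
            for a in idx:
                M = matmul(M, gamma[a])
            elems.append(M)
    return elems

def gram_is_four_times_identity():
    """tr(B_i^dagger B_j) = 4 delta_ij implies the elements are independent."""
    B = basis_elements()
    for i, X in enumerate(B):
        for j, Y in enumerate(B):
            tr = sum(complex(X[k][l]).conjugate() * Y[k][l]
                     for k in range(4) for l in range(4))
            if abs(tr - (4 if i == j else 0)) > 1e-12:
                return False
    return True
```

Calling `clifford_holds()` and `gram_is_four_times_identity()` exercises the two claims; the orthogonality test is a design choice made to avoid a general rank computation, and any other faithful 4 × 4 representation could be substituted for `gamma`.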
The dimension of the space of spinors rises rapidly with $n$, which is one reason why historically spinors have been most useful in spaces of dimensions 3 and 4, where the spin space has dimension 2. In a space of dimension 11, a case considered in supergravity, the spin space already has dimension 32.

Spinors in General Relativity: Spinor Algebra

In this section, we start again with a different emphasis. Conventions follow Penrose and Rindler (1984, 1986). To introduce spinors as a calculus in a four-dimensional, Lorentzian spacetime $M$, one can begin by choosing an orthonormal tetrad of vectors $(e_0, e_1, e_2, e_3)$ at a point $p$. The following conventions are used:

\[ g(e_a, e_b) = \eta_{ab} = \mathrm{diag}(1, -1, -1, -1) \]

Any vector $v$ in the tangent space $V = T_pM$ at $p$ has components $v^a$ in this basis, which we arrange as a matrix and label in two ways:

\[ \Psi(v) = \frac{1}{\sqrt{2}}\begin{pmatrix} v^0 + v^3 & v^1 + iv^2 \\ v^1 - iv^2 & v^0 - v^3 \end{pmatrix} = \begin{pmatrix} v^{00'} & v^{01'} \\ v^{10'} & v^{11'} \end{pmatrix} \qquad [5] \]

The reason for the factor $1/\sqrt{2}$ will be seen below, as will the rationale for the second form of the matrix. Note that $\Psi(v)$ is Hermitian and that

\[ 2\det\Psi(v) = g(v, v) = \eta_{ab}v^av^b \qquad [6] \]

Clearly, there is a one-to-one correspondence between elements of $V$ and Hermitian 2 × 2 matrices. Further, if $t$ is any matrix in SL(2, C), then the transformation

\[ \Psi(v) \to t\,\Psi(v)\,t^\dagger \qquad [7] \]

where $t^\dagger$ is the Hermitian conjugate of $t$, is linear in $v$, and preserves both Hermiticity and the norm of $v$. Thus, it must represent a Lorentz transformation. It is straightforward to check that it is a proper, orthochronous Lorentz transformation and that all such transformations arise in this way (recall that "proper" means transformations of determinant 1 so that orientation is preserved, and "orthochronous" means that future-pointing timelike or null vectors are taken to future-pointing timelike or null vectors, so that time orientation is preserved; the proper, orthochronous Lorentz group is equivalently the identity-connected component of the Lorentz group). Since both $t$ and $-t$ give the same Lorentz transformation, this provides an explicit demonstration of the (2–1)-homomorphism of SL(2, C) with the proper, orthochronous Lorentz group $O^{\uparrow}_{+}(1,3)$.

If the vector $v$ in [5] is null, then the matrix has vanishing determinant, or, equivalently, it has rank 1, and so it can be written as the outer product of a two-component column vector $\alpha = (\alpha^0, \alpha^1)^T$ and its Hermitian conjugate:

\[ \Psi(v) = \alpha\alpha^\dagger \qquad [8] \]

Furthermore, under [7], $\alpha$ transforms as

\[ \alpha \to t\alpha \qquad [9] \]

The two-complex-dimensional space to which $\alpha$ belongs is the spin space $S$ at $p$, already met in the previous section, and it follows from [8], since null vectors span $V$, that the tensor product $S \otimes \bar{S}$ of $S$ with its complex conjugate vector space $\bar{S}$ is the complexification of $V$. Complex conjugation gives an antilinear map from $S$ to $\bar{S}$. (One associates the complex-conjugate vector space $\bar{V}$ to any given complex vector space $V$ as follows: scalar multiplication for $V$ can be considered as a function $\mu : \mathbb{C} \times V \to V$ given by $\mu(z, v) = zv$, while vector addition is a map $\sigma : V \times V \to V$ given by $\sigma(u, v) = u + v$. Define another complex vector space by taking the same vectors and the same $\sigma$ but with scalar multiplication $\bar{\mu}$, where $\bar{\mu}(z, v) = \mu(\bar{z}, v)$. This is the complex-conjugate vector space $\bar{V}$. Given a choice of basis, we think of $V$ as, say, $n$-component column vectors of complex numbers, and then $\bar{V}$ is the corresponding complex-conjugate columns.)

Conventionally, $S$ is the space of unprimed spinors and $\bar{S}$ the space of primed spinors, and one also has the two duals $S'$ and $\bar{S}'$ which are associated in the corresponding way to the dual $V'$ of $V$. Analogously to the situation with vectors and covectors, index conventions for spinors are as follows:

\[ \alpha^A \in S; \quad \bar{\alpha}^{A'} \in \bar{S}; \quad \beta_A \in S'; \quad \bar{\beta}_{A'} \in \bar{S}' \]

where $A = 0, 1$, $A' = 0', 1'$.

Spinor algebra mirrors tensor algebra: a spinor $\chi^{A_1 \ldots A_p A'_1 \ldots A'_q}{}_{B_1 \ldots B_r B'_1 \ldots B'_s}$ is an element of the tensor product of $p$ copies of $S$, $q$ copies of $\bar{S}$, $r$ copies of $S'$, and $s$ copies of $\bar{S}'$. The second way of writing the matrix in [5] enables the identification of a vector with a matrix to be conventionally written as

\[ v^a = v^{AA'} \qquad [10] \]

and then extended to any tensor $T^{a \ldots b}{}_{c \ldots d}$ by replacing each vector index, say $b$, with a pair $BB'$ of spinor indices. In particular, from [8], it follows that any real null vector $n^a$ can be written in the form

\[ n^a = \alpha^A\bar{\alpha}^{A'} \]

for some spinor $\alpha^A$.
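The correspondence between vectors and Hermitian matrices in [5]–[9] lends itself to a quick numerical check. The following is a minimal sketch rather than anything taken from the article: the function names, the sample SL(2, C) matrix `t`, and the sample vectors are all chosen here purely for illustration. It builds the Hermitian matrix of [5], confirms that twice its determinant is the Minkowski norm as in [6], applies the action [7], and forms a null vector from a two-component column as in [8].

```python
import math

def psi(v):
    """Hermitian matrix of [5] built from components (v0, v1, v2, v3)."""
    v0, v1, v2, v3 = v
    s = 1 / math.sqrt(2)
    return [[s * (v0 + v3), s * (v1 + 1j * v2)],
            [s * (v1 - 1j * v2), s * (v0 - v3)]]

def components(P):
    """Invert psi: real components of a Hermitian 2x2 matrix."""
    r = math.sqrt(2)
    return [((P[0][0] + P[1][1]) / r).real,
            ((P[0][1] + P[1][0]) / r).real,
            ((P[0][1] - P[1][0]) / r).imag,
            ((P[0][0] - P[1][1]) / r).real]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(t):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(t[j][i]).conjugate() for j in range(2)] for i in range(2)]

def det(P):
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def minkowski(v, w):
    """Inner product with signature (+, -, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2] - v[3] * w[3]

def lorentz(t, v):
    """Action [7]: psi(v) -> t psi(v) t^dagger, returned as components."""
    return components(mul(mul(t, psi(v)), dagger(t)))

def null_vector(alpha):
    """Components of the vector whose matrix is alpha alpha^dagger, as in [8]."""
    P = [[alpha[A] * complex(alpha[B]).conjugate() for B in range(2)]
         for A in range(2)]
    return components(P)

# Illustrative data (assumptions, not from the article).
t = [[2, 1j], [0, 0.5]]                  # det t = 1, so t lies in SL(2, C)
minus_t = [[-x for x in row] for row in t]
v = [2.0, 0.3, -1.1, 0.7]                # a future-pointing timelike vector
w = lorentz(t, v)                        # its image under the Lorentz map
n = null_vector([1 + 2j, 0.5 - 1j])      # a future-pointing null vector
```

Comparing `lorentz(t, v)` with `lorentz(minus_t, v)` shows the two-to-one nature of the map: both matrices yield the same transformed vector, since the signs cancel in $t\,\Psi(v)\,t^\dagger$.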
One must pay attention to the order of spinor indices of a given type, primed or unprimed, but by convention may permute primed and unprimed indices. A spinor with an equal number $n$ of primed and unprimed indices corresponds to a tensor of valence $n$, and the tensor is real if the spinor satisfies a suitable Hermiticity relation.

Spinors may have various symmetries among their indices, much as tensors have. However, since $S$ is two dimensional, there is only a one-dimensional space of 2-forms on $S$. This has two consequences: no spinor can be antisymmetric over more than two indices; and if we make a choice of canonical 2-form, all spinors can be written in terms of symmetric spinors and the canonical 2-form. This is a decomposition of spinors into irreducibles for SL(2, C).

One makes a choice of 2-form $\epsilon_{AB}$ according to

\[ \epsilon_{AB} = -\epsilon_{BA}; \quad \epsilon_{01} = 1 \]

There is an inverse $\epsilon^{AB}$ defined by

\[ \epsilon^{AC}\epsilon_{BC} = \delta_B^A \qquad [11] \]

where $\delta_B^A$ is the Kronecker delta. The complex conjugate of $\epsilon_{AB}$ is conventionally written without an overbar as $\epsilon_{A'B'}$, and analogously $\epsilon^{A'B'}$ is the complex conjugate of $\epsilon^{AB}$.

Because of the antisymmetry of $\epsilon_{AB}$, order of indices is crucial in equations such as [11]. The 2-form $\epsilon_{AB}$ has a role akin to that of a metric as it provides an identification of $S$ and its dual, according to

\[ \alpha^A \to \alpha_B = \alpha^A\epsilon_{AB} \]
\[ \beta_B \to \beta^A = \epsilon^{AB}\beta_B \]

with corresponding formulas for primed spinors. Note that, because of the antisymmetry of $\epsilon_{AB}$, necessarily $\alpha_A\alpha^A = 0$ for any $\alpha^A$.

With conventions made so far, it can be checked that

\[ g_{ab}v^av^b = \epsilon_{AB}\epsilon_{A'B'}v^{AA'}v^{BB'} \qquad [12] \]

for any vector $v^a$, where $g_{ab}$ is the spacetime metric at $p$, so that

\[ g_{ab} = \epsilon_{AB}\epsilon_{A'B'} \]

It is the desire to have this formula without constants that necessitates the choice of the factor $1/\sqrt{2}$ in [5].

One final piece of spinor algebra that we note is the following: given a symmetric spinor $\phi_{A_1 \ldots A_n}$ there is a factorization

\[ \phi_{A_1 \ldots A_n} = \alpha^{(1)}_{(A_1} \cdots \alpha^{(n)}_{A_n)} \qquad [13] \]

where the round brackets indicate symmetrization over the indices $A_1, \ldots, A_n$, and the $n$ spinors $\alpha^{(1)}_{A_1}, \ldots, \alpha^{(n)}_{A_n}$, which are determined only up to reordering and rescaling, are known as the principal spinors of $\phi$. To prove this, note that the principal spinors can be identified with the solutions $\zeta^A$ of the equation

\[ \phi_{A_1 \ldots A_n}\zeta^{A_1} \cdots \zeta^{A_n} = 0 \]

and there are $n$ of these, counting multiplicities, by the "fundamental theorem of algebra."

Spinors in General Relativity: Spinor Calculus

We now want to define spinor fields on the spacetime $M$ as sections of a spinor bundle $\mathbb{S}$ whose fiber at each point is $S$ and such that the tensor product $\mathbb{S} \otimes \bar{\mathbb{S}}$ is the complexified tangent bundle. The existence of such an $\mathbb{S}$ imposes global restrictions on $M$: $M$ must be orientable and time orientable, and a certain characteristic class, the second Stiefel–Whitney class, must vanish (for an explanation of these terms see, e.g., Penrose and Rindler (1984, 1986)). Assuming that $M$ satisfies these conditions, spinor fields can be defined. It is convenient to retain the algebraic formulas from the previous section (e.g., [10] or [12]) but with indices now regarded as abstract (a note on the abstract index convention appears in Twistors).

By an argument analogous to that for the fundamental theorem of Riemannian geometry, there is a unique covariant derivative that satisfies the Leibniz condition, coincides with the Levi-Civita derivative on tensors and the gradient on scalars, and annihilates $\epsilon_{AB}$ and $\epsilon_{A'B'}$. Following the conventions of the previous section, the spinor covariant derivative will be denoted as $\nabla_{AA'}$. The commutator of derivatives can be written in terms of irreducible parts (for SL(2, C)) according to the formula

\[ \nabla_{AA'}\nabla_{BB'} - \nabla_{BB'}\nabla_{AA'} = \epsilon_{A'B'}\Box_{AB} + \epsilon_{AB}\Box_{A'B'} \]

where $\Box_{AB} = \nabla_{C'(A}\nabla_{B)}{}^{C'}$. The definition of the Riemann curvature tensor is in terms of the Ricci identity

\[ (\nabla_a\nabla_b - \nabla_b\nabla_a)v^c = R_{abd}{}^{c}\,v^d \]

and then this translates into two Ricci identities for a spinor field:

\[ \Box_{AB}\alpha_C = X_{ABCD}\alpha^D \]
\[ \Box_{A'B'}\alpha_C = \Phi_{A'B'CD}\alpha^D \]
The curvature spinors $X_{ABCD}$ and $\Phi_{A'B'CD}$ are related to the curvature tensor. The Ricci spinor $\Phi_{A'B'AB}$ is Hermitian and symmetric on both index pairs and is a multiple of the trace-free part of the Ricci tensor:

\[ \Phi_{A'B'AB} = -\tfrac{1}{2}\left(R_{ab} - \tfrac{1}{4}Rg_{ab}\right) \]

The spinor $X_{ABCD}$ is symmetric on the first and last pairs of indices and decomposes into irreducibles according to

\[ X_{ABCD} = \Psi_{ABCD} - 2\Lambda\,\epsilon_{D(A}\epsilon_{B)C} \]

where $\Lambda = R/24$ in terms of the Ricci scalar or scalar curvature $R$, while $\Psi_{ABCD}$, which is totally symmetric and is known as the Weyl spinor, is related to the Weyl tensor $C_{abcd}$ by the equation

\[ C_{abcd} = \Psi_{ABCD}\,\epsilon_{A'B'}\epsilon_{C'D'} + \bar{\Psi}_{A'B'C'D'}\,\epsilon_{AB}\epsilon_{CD} \]

Thus, the ten real components of the Weyl tensor are coded into the five complex components of the Weyl spinor.

Following the last remark in the previous section, the Weyl spinor has four principal spinors, each of which defines a null direction, the principal null directions of the Weyl tensor. There is a classification of Weyl tensors, the Petrov–Pirani–Penrose classification, based on coincidences among the principal null directions (Penrose and Rindler 1984, 1986).

As a final exercise in spinor calculus, we recall the zero-rest-mass equations (see Twistors). In flat spacetime, these are the equations

\[ \nabla^{A'A}\phi_{AB \ldots C} = 0 \]

on a totally symmetric spinor field $\phi_{AB \ldots C}$. The field is said to have spin $s$ if it has $2s$ indices, and the cases $s = 1/2$, 1, or 2, respectively, are the Weyl neutrino equation, the Maxwell equation, and the linearized Einstein equation. In flat spacetime, these hyperbolic equations are well understood and solvable in a variety of ways. In curved spacetime, however, if $s \geq 3/2$, then there are curvature obstructions to the existence of solutions, known as Buchdahl conditions. This can be seen at once by differentiating again, say by $\nabla^{B}{}_{A'}$, and using the spinor Ricci identity. After a little algebra, one finds

\[ \Psi^{ABC}{}_{(D}\,\phi_{E \ldots F)ABC} = 0 \]

so that, whenever the field has three or more indices, there are algebraic constraints on its components in terms of the Weyl spinor.

The Spin-Coefficient Formalism

The spin-coefficient formalism of Newman and Penrose is a formalism for spinor calculus in spacetimes (see, e.g., Penrose and Rindler (1984, 1986) and Stewart (1990)). It finds application in any calculation dealing with curvature tensors, including solving the Einstein equations. The formalism exploits the compression of terminology which the introduction of complex quantities permits.

The formalism starts with a choice of spinor dyad, a basis of spinor fields $(o^A, \iota^A)$ normalized so that $o_A\iota^A = 1$. From the dyad, one constructs a null tetrad, which is a basis of vector fields, according to the scheme

\[ \ell^a = o^A\bar{o}^{A'}; \quad n^a = \iota^A\bar{\iota}^{A'}; \quad m^a = o^A\bar{\iota}^{A'}; \quad \bar{m}^a = \iota^A\bar{o}^{A'} \]

Given the normalization of the spinor dyad, each of the vectors in the null tetrad is null (hence the name) and all inner products are zero, except for

\[ \ell_an^a = 1 = -m_a\bar{m}^a \]

It follows that the metric can be written in the basis as

\[ g_{ab} = 2\ell_{(a}n_{b)} - 2m_{(a}\bar{m}_{b)} \]

The components of the covariant derivative in the null tetrad are given separate names according to the following scheme:

\[ \ell^a\nabla_a = D; \quad n^a\nabla_a = \Delta; \quad m^a\nabla_a = \delta; \quad \bar{m}^a\nabla_a = \bar{\delta} \]

and the spin coefficients are the 12 components of the covariant derivative of the basis. Each is labeled with a Greek letter according to the following scheme:

\[ \begin{aligned} Do^A &= \epsilon o^A - \kappa\iota^A; & \Delta o^A &= \gamma o^A - \tau\iota^A \\ \delta o^A &= \beta o^A - \sigma\iota^A; & \bar{\delta}o^A &= \alpha o^A - \rho\iota^A \\ D\iota^A &= \pi o^A - \epsilon\iota^A; & \Delta\iota^A &= \nu o^A - \gamma\iota^A \\ \delta\iota^A &= \mu o^A - \beta\iota^A; & \bar{\delta}\iota^A &= \lambda o^A - \alpha\iota^A \end{aligned} \qquad [14] \]

The spin coefficients code the 24 real Ricci rotation coefficients into 12 complex quantities. Some of the spin coefficients have direct geometrical interpretation. For example, the vanishing of $\kappa$ is the condition for the integral curves of $\ell^a$ to be geodesic, while, if $\sigma$ is also zero, this congruence of geodesics is shear free. The same role is played by $\nu$ and $\lambda$ for the $n^a$-congruence. The real and imaginary parts of $\rho$ are, respectively (minus), the expansion and the twist of the congruence of integral curves of $\ell^a$.
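The construction of a null tetrad from a spinor dyad can be verified at a single point with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the dyad components are arbitrary sample values chosen here so that $o_A\iota^A = 1$, vectors are represented by their 2 × 2 spinor-index matrices, and the inner product is computed from the pair of antisymmetric 2-forms as in [12]. The helper names are hypothetical. It checks that the only nonzero inner products are $\ell_an^a = 1$ and $m_a\bar{m}^a = -1$, and that the tetrad reproduces the Minkowski metric.

```python
import math

EPS = [[0, 1], [-1, 0]]     # epsilon_{AB} with epsilon_{01} = 1
ETA = [1, -1, -1, -1]       # diagonal of the Minkowski metric

def outer(a, b):
    """Matrix a^A conj(b)^{A'} representing a (possibly complex) vector."""
    return [[a[A] * complex(b[Ap]).conjugate() for Ap in range(2)]
            for A in range(2)]

def inner(U, V):
    """g(u, v) = eps_{AB} eps_{A'B'} u^{AA'} v^{BB'}, following [12]."""
    return sum(EPS[A][B] * EPS[Ap][Bp] * U[A][Ap] * V[B][Bp]
               for A in range(2) for B in range(2)
               for Ap in range(2) for Bp in range(2))

def components(P):
    """Invert [5], allowing complex components (v0, v1, v2, v3)."""
    r = math.sqrt(2)
    return [(P[0][0] + P[1][1]) / r, (P[0][1] + P[1][0]) / r,
            (P[0][1] - P[1][0]) / (1j * r), (P[0][0] - P[1][1]) / r]

def lower(u):
    """Lower the index with the Minkowski metric."""
    return [ETA[a] * u[a] for a in range(4)]

# Assumption: an arbitrary sample dyad with o^0 iota^1 - o^1 iota^0 = 1.
o = [2, 0]
iota = [1j, 0.5]

tetrad = {"l": outer(o, o), "n": outer(iota, iota),
          "m": outer(o, iota), "mbar": outer(iota, o)}

def metric_from_tetrad():
    """Matrix of 2 l_(a n_b) - 2 m_(a mbar_b) in the orthonormal basis."""
    l, n, m, mb = (lower(components(tetrad[k]))
                   for k in ("l", "n", "m", "mbar"))
    return [[l[a] * n[b] + n[a] * l[b] - m[a] * mb[b] - mb[a] * m[b]
             for b in range(4)] for a in range(4)]
```

Evaluating `inner` on every pair of tetrad members and comparing `metric_from_tetrad()` with diag(1, −1, −1, −1) confirms the stated relations for this sample dyad; any other dyad obtained from the columns of a unit-determinant matrix should behave the same way.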
In practice, it is often simpler to calculate the spin D   ¼ ð þ Þ þ ð


 þ Þ  ð þ Þ  ð þ  Þ
coefficients from the commutators of the basis
þ    þ 2   þ 11
vectors, now regarded as directional derivatives, as
 
D   ¼ ð  3 þ Þ þ  þ ð þ   Þ
follows:
   þ 20
 þ Þ  ð þ Þ
D  D ¼ ð þ  ÞD þ ð þ Þ  ð
D   ¼ ð
   Þ þ  þ ð

 þ Þ
D  D ¼ ð
 þ   ÞD þ   ð þ  Þ  
  þ 2 þ 2
   ¼    Þ þ ð   þ  Þ þ 
 D þ ð  
D   ¼ ð þ Þ þ ð
 þ Þ þ ð   Þ
  ¼ ð
   ÞD þ ð   ð
  Þ þ ð  Þ   Þ
 ð3 þ Þ þ 3 þ 21
½15  ¼  ð þ  þ 3   Þ
  
The commutator of second derivatives applied to þ ð3 þ  þ   Þ  4
the spinor dyad expresses the components of the  ¼ ð
    þ ð  Þ
 þ Þ  ð3  Þ
curvature tensor in terms of the derivatives of
the spin coefficients. Before presenting these, we þ ð  Þ
  1 þ 01
adopt a convention for labeling the components of 
   ¼    þ   þ   2 þ ð  Þ
curvature. The components of the Weyl spinor are þ ð  Þ  2 þ  þ 11
given as follows:  ¼ ð  Þ þ ð  Þ þ ð þ Þ
   
0 ¼ ABCD oA oB oC oD þ ð  3Þ  3 þ 21
1 ¼ ABCD o o o A B C D    ¼  ð þ  þ  Þ   þ 
2 ¼ ABCD oA oB C D ½16 þ ð þ 3  Þ  22
3 ¼ ABCD oA B C D    ¼ ð
 þ   Þ   þ  þ  

þ ð    Þ    12
4 ¼ ABCD A B C D
  ð þ   Þ
   ¼  ð  3 þ  Þ   
so that these five complex scalars encode the ten real þ 
  02
components of the Weyl tensor. For the Ricci spinor, set  ¼ ð þ   Þ   þ ð    Þ
  
0 0 0 0
00 ¼ ABA0 B0 oA oB o
A o
B ; 01 ¼ ABA0 B0 oA oB o
A  B þ   2  2
A B A0 B0
02 ¼ ABA0 B0 o o   ; A B A 0 B0
11 ¼ ABA0 B0 o o
  
   ¼ ð þ Þ  ð þ Þ þ ð
  Þ

A B A0 B 0 0 0 
þ ð  Þ  3 ½17
12 ¼ ABA0 B0 o   ; 22 ¼ ABA0 B0 A B A  B
Finally, it is possible to write out the Bianchi
together with 10 = 01 , 20 = 02 , and 21 = 12 .
identities in this formalism. For simplicity, and
The nine components of the trace-free Ricci tensor
with a view to an application, we do this below
are encoded in these scalars of which three are real
only for vacuum, so that the Ricci tensor is zero:
and three complex. The Ricci scalar, as before, is
replaced by the real scalar  = R=24.  0 ¼ ð  4Þ0 þ 2ð2 þ Þ1  3 2
D1  
Now the commutators of covariant derivatives on
the spinor dyad lead to the following system: 0  1 ¼ ð4  Þ0  2ð2 þ Þ1 þ 32
 1 ¼ 0 þ 2ð  Þ1 þ 32  2 3
D2  
 ¼ 2 þ 
D    þ ð þ Þ  
1  2 ¼ 0 þ 2ð  Þ1  32 þ 23
 ð3 þ   Þ þ 00  2 ¼ 21 þ 32 þ 2ð  Þ3  4
D3  
D   ¼ ð þ  þ 3  Þ
2  3 ¼ 21  32 þ 2ð  Þ3 þ 4
 ð   þ  þ 3Þ þ 0  3 ¼ 32 þ 2ð þ 2Þ3 þ ð  4 Þ4
D4  
D   ¼ ð þ Þ þ ð þ Þ þ ð  Þ
3  4 ¼ 32  2ð þ 2Þ3 þ ð4  Þ4
 ð3 þ  Þ þ 1 þ 01
½18
     
D   ¼ ð þ   2 Þ þ   
þ ð þ Þ þ 10 The whole system is then loosely described as the
spin-coefficient equations.
D   ¼ ð þ Þ þ ð
  Þ  ð þ Þ
As a simple application, we shall prove the
 ð
  Þ þ 1 Goldberg–Sachs theorem: for vacuum spacetimes, a
spinor field $o^A$ is geodesic and shear free iff it is a repeated principal spinor of the Weyl spinor.

In the spin-coefficient formalism, $o^A$ is geodesic and shear-free iff $\kappa$ and $\sigma$ vanish, and, from [16], is a repeated principal spinor of the Weyl spinor provided $\Psi_0 = \Psi_1 = 0$. It will be repeated three times if also $\Psi_2 = 0$ and four times if $\Psi_3 = 0$, but one must have $\Psi_k \neq 0$ for some $k$ if the spacetime is not to be flat.

Suppose that $o^A$ is a (twice) repeated principal spinor of the Weyl spinor, then at once from the first two expressions in [18] both $\kappa$ and $\sigma$ vanish. If it is repeated three times, one gets the same result from the third and fourth expressions in [18], while if $o^A$ is repeated four times then the fifth and sixth expressions of [18] should be used.

For the converse, suppose that $\kappa = \sigma = 0$. Then, by the first equation in [14], $o^A$ can be rescaled to ensure that $\epsilon = 0$ and a spinor field $\iota^A$ can be chosen which is normalized against $o^A$ and parallelly propagated along $\ell^a$, so that, by the fifth equation in [14], $\pi = 0$. From the second expression in [17], one can see at once that $\Psi_0 = 0$, so that the first two equations in [18] simplify to give expressions for $D\Psi_1$ and $\delta\Psi_1$. By commuting $D$ and $\delta$ on $\Psi_1$ and using the second expression of [15] with the relevant parts of [17], it can be concluded that $\Psi_1 = 0$, as required.

Another application which is easy to describe is the solution of the type-D vacuum equations. A type-D solution is one for which the Weyl spinor has two (linearly independent) repeated principal spinors. If these are taken as the normalized dyad, then from [16] only $\Psi_2$ is nonzero among the $\Psi_k$. By the Goldberg–Sachs theorem, both spinors are geodesic and shear free, so that the spin coefficients $\kappa$, $\sigma$, $\nu$, and $\lambda$ all vanish. With these conditions, the spin-coefficient equations simplify to the point that careful choices of coordinates and the remaining freedom in the dyad enable the equations to be solved explicitly. One obtains metrics that depend only on a few parameters. Analogous methods reduce the Einstein equations to simpler systems for the other vacuum algebraically special metrics, that is, the other vacuum metrics for which the Weyl spinor does not have four distinct principal null directions (Mason 1998).

The spin-coefficient formalism has also been extensively used in the study of asymptotically flat spacetimes and gravitational radiation (Penrose and Rindler 1984, 1986, Stewart 1990).

The Positive-Mass Theorem

A very important application of spinor calculus in recent years was the proof by Witten (1981) of the positive-mass (or positive-energy) theorem. The proof was motivated by ideas from supergravity and gave rise to an increased interest in spinors in general relativity.

The positive-mass theorem is the following assertion: given an asymptotically flat spacetime $M$ with a spacelike hypersurface $\Sigma$, which is topologically $\mathbb{R}^3$ and in which the dominant energy condition holds, the total (or Arnowitt–Deser–Misner (ADM)) momentum is timelike and future-pointing. (The dominant-energy condition is the requirement that $T_{ab} U^a V^b$ is non-negative for every pair of future-pointing timelike or null vectors $U^a$ and $V^b$.)

We follow the notation of Penrose and Rindler (1984, 1986), where the proof begins by considering the 2-form $\Omega$ defined in terms of a spinor field $\lambda^A$ on $\Sigma$ by

\[
\Omega = i\,\bar{\lambda}_{B'} \nabla_a \lambda_B \, dx^a \wedge dx^b
\]

If $\lambda^A$ tends to a constant spinor at spatial infinity on $\Sigma$, then

\[
\frac{1}{4\pi G} \oint_S \Omega \to p_a \lambda^A \bar{\lambda}^{A'} \qquad [19]
\]

as the spacelike spherical surface $S$ tends to spatial infinity, where $p_a$ is the ADM momentum. Suppose $\Sigma$ has unit normal $t^a$, intrinsic metric $h_{ab} = g_{ab} - t_a t_b$ and the dual-volume 3-form is $d\Sigma_a = t_a\, d\Sigma$. Then Stokes' theorem states that

\[
\oint_S \Omega = \int_\Sigma d\Omega
\]

We calculate

\[
d\Omega = \alpha + \beta
\]

where

\[
\begin{aligned}
\alpha &= 4\pi G\, T_{ab}\, \ell^a \, d\Sigma^b \\
\beta &= i\,\varepsilon^{abcd}\, \nabla_c \bar{\lambda}_{B'}\, \nabla_d \lambda_B \, d\Sigma_a
\end{aligned}
\]

where $\ell^a = \lambda^A \bar{\lambda}^{A'}$ and we have used the Einstein field equations to replace curvature terms in $\alpha$ by the energy–momentum tensor $T_{ab}$. Provided the matter satisfies the dominant-energy condition, $\alpha$ is everywhere a positive multiple of the volume form on $\Sigma$ and its integral is positive (it can vanish only in vacuum). To make the integral of $\beta$ positive, $\lambda^A$ is required to satisfy

\[
D_{AA'} \lambda^A = 0 \qquad [20]
\]

where $D_a = h_a{}^b \nabla_b$, which is the projection of the four-dimensional covariant derivative rather than the intrinsic covariant derivative of $\Sigma$. Equation [20] is the Sen–Witten equation; it is elliptic and reduces to
the Dirac equation on a maximal surface; furthermore, given an asymptotically constant value for $\lambda^A$ on an asymptotically flat 3-surface $\Sigma$ with the topology of $\mathbb{R}^3$, it has a unique solution. Equation [20] removes part of the derivative of $\lambda^A$ from $\beta$ to leave

\[
\beta = h^{ab}\, D_a \lambda^C\, D_b \bar{\lambda}^{C'}\, d\Sigma_c
\]

Now $h_{ab}$ is negative definite and $\Sigma$ has timelike normal so that $\beta$ is a positive multiple of the volume form on $\Sigma$ (unless $\lambda^A$ is covariantly constant, a case which is dealt with separately). Thus, the integral of $d\Omega$ is non-negative and therefore, by [19], so is the inner product of the ADM momentum $p_a$ with any null vector constructed from asymptotically constant spinors. Furthermore, this inner product is strictly positive, except in a vacuum spacetime admitting a constant spinor. Such spacetimes can be found explicitly and cannot be asymptotically flat, so that the ADM momentum is always timelike and future pointing, and vanishes only in flat spacetime.

The basic positive-energy theorem outlined above can be extended in several directions:

• to prove that the total momentum at future null infinity is also timelike and future pointing;
• to deal with surfaces $\Sigma$ which have inner boundaries, for example, at black holes;
• to prove inequalities between charge and mass; and
• to deal with spacetimes which are asymptotically anti-de Sitter rather than flat.

Further Applications of Spinors

Supersymmetry is a symmetry in quantum field theory relating bosons and fermions. In the language of spinors, bosons are represented by fields with an even number of spinor indices and fermions by fields with an odd number of indices. Thus, the gauge transformations of supersymmetry are generated by spinors with a single index.

Supergravity is supersymmetry in the case that one of the fields is the graviton. A supergravity theory is labeled by an integer $N$ for the number of independent supersymmetries and much of the numerology of these theories follows from properties of spinors. $N = 1$ supergravity contains a graviton and a spin-3/2 field coupled together, and the presence of the supersymmetry allows the Buchdahl condition to be evaded. Supergravity theory with one supersymmetry in 11 spacetime dimensions depends on one spinor, which, in 11 dimensions, has 32 components. This is as many components as eight Dirac spinors in a four-dimensional spacetime, and, by a process of dimensional reduction, $N = 1$ supergravity in 11 dimensions is related to $N = 8$ supergravity in four dimensions. For reasons related to the Buchdahl conditions, 8 is the largest $N$ that is considered in four dimensions.

In superstring theory and in some supergravity theories, one often wishes to consider spaces with "residual supersymmetry," by which is meant that there is a spinor field satisfying a condition of covariant constancy in some connection (Candelas et al. 1985). The existence of such constant spinors, as a result of spinor Ricci identities analogous to those given above, typically imposes strong restrictions on the curvature. Riemannian manifolds admitting constant spinors for the Levi-Civita connection are Ricci-flat (Hitchin 1974); Lorentzian ones can often be found in terms of a few functions. Manifolds of special holonomy, which are of interest in superstring theory, can usually be characterized as admitting special spinors (Wang 1989).

See also: Clifford Algebras and Their Representations; Dirac Operator and Dirac Field; Einstein Equations: Exact Solutions; Einstein's Equations with Matter; General Relativity: Overview; Geometric Flows and the Penrose Inequality; Index Theorems; Relativistic Wave Equations Including Higher Spin Fields; Spacetime Topology, Causal Structure and Singularities; Supergravity; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory]; Twistors.

Further Reading

Benn IM and Tucker RW (1987) An Introduction to Spinors and Geometry with Applications in Physics. Bristol: Adam Hilger.
Budinich P and Trautman A (1988) The Spinorial Chessboard. Berlin: Springer.
Candelas P, Horowitz G, Strominger A, and Witten E (1985) Vacuum configurations for superstrings. Nuclear Physics B 258: 46–74.
Cartan E (1981) The Theory of Spinors. New York: Dover.
Chevalley CC (1954) The Algebraic Theory of Spinors. New York: Columbia University Press.
Dirac PAM (1928) The quantum theory of the electron. Proceedings of the Royal Society of London A 117: 610–624.
Harvey FR (1990) Spinors and Calibrations. Boston: Academic Press.
Hitchin NJ (1974) Harmonic spinors. Advances in Mathematics 14: 1–55.
Mason LJ (1998) The asymptotic structure of algebraically special spacetimes. Classical Quantum Gravity 15: 1019–1030.
Penrose R and Rindler W (1984, 1986) Spinors and Space–Time, vol. 1 and 2. Cambridge: Cambridge University Press.
Stewart J (1990) Advanced General Relativity. Cambridge: Cambridge University Press.
van der Waerden BL (1960) Exclusion principle and spin. In: Fierz M and Weisskopf VF (eds.) Theoretical Physics in the Twentieth Century: A Memorial Volume to Wolfgang Pauli, pp. 199–244. New York: Interscience.
Wang MY (1989) Parallel spinors and parallel forms. Annals of Global Analysis and Geometry 7: 59–68.
Witten E (1981) A new proof of the positive energy theorem. Communications in Mathematical Physics 80: 381–402.
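
The component counting quoted in the discussion of 11-dimensional supergravity can be checked with a short calculation. The sketch below assumes the standard rule that a Dirac spinor in d spacetime dimensions has 2^floor(d/2) components; that rule and the function name `dirac_spinor_components` are illustrative assumptions, not taken from the text, and components are counted in the same units in both dimensions.

```python
def dirac_spinor_components(dim: int) -> int:
    """Components of a Dirac spinor in `dim` spacetime dimensions.

    Assumes the standard counting rule 2 ** floor(dim / 2); the function
    name is hypothetical and used only for this illustration.
    """
    if dim < 1:
        raise ValueError("dimension must be a positive integer")
    return 2 ** (dim // 2)


# Spinor size in 11 dimensions and in 4 dimensions.
components_11d = dirac_spinor_components(11)
components_4d = dirac_spinor_components(4)

# How many four-dimensional Dirac spinors carry as many components
# as one spinor in 11 dimensions.
ratio = components_11d // components_4d

print(components_11d, components_4d, ratio)
```

Under this assumption the 11-dimensional spinor has 32 components and a four-dimensional Dirac spinor has 4, so the ratio of 8 agrees with the relation between N = 1 supergravity in 11 dimensions and N = 8 supergravity in four dimensions.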
