Painlevé Equations
N Joshi, University of Sydney, Sydney, NSW, Australia
© 2006 Elsevier Ltd. All rights reserved.

Introduction
The Painlevé equations PI–PVI are six classical second-order ordinary differential equations that appear widely in modern physical applications. Their conventional forms (governing $y(x)$ with derivatives $y' = dy/dx$, $y'' = d^2y/dx^2$) are:

$$\mathrm{P_{I}}:\quad y'' = 6y^2 + x$$
$$\mathrm{P_{II}}:\quad y'' = 2y^3 + xy + \alpha$$
$$\mathrm{P_{III}}:\quad y'' = \frac{y'^2}{y} - \frac{y'}{x} + \frac{1}{x}\left(\alpha y^2 + \beta\right) + \gamma y^3 + \frac{\delta}{y}$$
$$\mathrm{P_{IV}}:\quad y'' = \frac{y'^2}{2y} + \frac{3}{2}y^3 + 4xy^2 + 2(x^2 - \alpha)y + \frac{\beta}{y}$$
$$\mathrm{P_{V}}:\quad y'' = \left(\frac{1}{2y} + \frac{1}{y-1}\right)y'^2 - \frac{y'}{x} + \frac{(y-1)^2}{x^2}\left(\alpha y + \frac{\beta}{y}\right) + \frac{\gamma y}{x} + \frac{\delta\, y(y+1)}{y-1}$$
$$\mathrm{P_{VI}}:\quad y'' = \frac{1}{2}\left(\frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-x}\right)y'^2 - \left(\frac{1}{x} + \frac{1}{x-1} + \frac{1}{y-x}\right)y' + \frac{y(y-1)(y-x)}{x^2(x-1)^2}\left\{\alpha + \frac{\beta x}{y^2} + \frac{\gamma (x-1)}{(y-1)^2} + \frac{\delta\, x(x-1)}{(y-x)^2}\right\}$$

where $\alpha$, $\beta$, $\gamma$, $\delta$ are constants. They were identified and studied by Painlevé and his school in their search for ordinary differential equations (in the class $y'' = R(x, y, y')$, where $R$ is rational in $y'$, $y$ and analytic in $x$) that define new transcendental functions. Painlevé focussed his search on equations that possess what is now known as the Painlevé property: that all solutions are single-valued around all movable singularities (a singularity is "movable" if its location changes with initial conditions).

For the Painlevé equations, all movable singularities are poles. For PI and PII, all solutions are meromorphic functions. However, the solutions of each of the remaining equations have other singularities called "fixed" singularities, with locations that are determined by the singularities of the coefficient functions of the equation. PIII–PVI have a fixed singularity at $x = \infty$. PIII and PV have additional fixed singularities at $x = 0$, and PVI has them at $x = 0$ and $1$. Although each solution of PIII–PVI is single-valued around a movable singularity, it may be multivalued around a fixed singularity.

Painlevé's school considered canonical classes of ordinary differential equations equivalent under linear fractional transformations of $y$ and $x$. Of the fifty canonical classes of equations they found, all except six were found to be solvable in terms of already known functions. These six lead to the Painlevé equations PI–PVI as their canonical representatives.

A resurgence of interest in the Painlevé equations came about from the observation (due to Ablowitz and Segur) that they arise as similarity reductions of well-known integrable partial differential equations (PDEs), or soliton equations, such as the Korteweg–de Vries equation, the sine-Gordon equation, and the self-dual Yang–Mills equations.

As this connection suggests, the Painlevé equations possess many of the special properties that are commonly associated with soliton equations. They have associated linear problems (i.e., Lax pairs) for which they act as compatibility conditions. There exist special transformations (called Bäcklund transformations) mapping a solution of one equation to a solution of another Painlevé equation (or the same equation with changed parameters). There exist Hamiltonian forms that are related to the existence of tau-functions, which are analytic everywhere except at the fixed singularities. They also possess multilinear forms (or Hirota forms) that are satisfied by tau-functions. In the following subsections, for conciseness, we give examples of these properties for the first or second Painlevé equations and briefly indicate differences, if any, with other Painlevé equations.
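To make the notion of a movable pole concrete, the following is a minimal sketch (not taken from the article) that substitutes a Laurent series $y = \sum_m c_m (x - x_0)^m$ with $c_{-2} = 1$ into PI, $y'' = 6y^2 + x$, and solves for the coefficients in exact rational arithmetic. The function name `p1_laurent` and its arguments are illustrative assumptions. The pole location `x0` can be chosen freely, and the recurrence leaves exactly one further coefficient (that of $(x - x_0)^4$) undetermined, which is the second free parameter of the solution.

```python
from fractions import Fraction

def p1_laurent(x0, c_res, order=8):
    """Laurent coefficients c[m] of (x - x0)**m, m = -2..order, for a
    solution of PI (y'' = 6 y**2 + x) with a double pole at x0.
    c_res is the value assigned to the free coefficient c[4]."""
    c = {-2: Fraction(1)}
    for m in range(-1, order + 1):
        n = m - 2  # matching the coefficient of (x - x0)**n in the ODE
        # Quadratic terms c_i * c_j with i + j = n that do not involve c[m].
        s = Fraction(0)
        for i in range(-2, n + 3):
            j = n - i
            if i == m or j == m:
                continue
            s += c[i] * c[j]
        # Contribution of x = x0 + (x - x0) on the right-hand side.
        rhs = 6 * s + (x0 if n == 0 else 0) + (1 if n == 1 else 0)
        # m(m-1) c_m from y'' minus 12 c_m from 6 y**2 gives (m-4)(m+3) c_m.
        factor = (m - 4) * (m + 3)
        if factor == 0:
            # Nothing constrains c[m] here, provided the rest is consistent.
            assert rhs == 0
            c[m] = c_res
        else:
            c[m] = rhs / factor
    return c

coeffs = p1_laurent(Fraction(3), Fraction(7))
print(coeffs[2], coeffs[3], coeffs[4])
```

For any choice of `x0` the computed coefficients of $(x - x_0)^2$ and $(x - x_0)^3$ are $-x_0/10$ and $-1/6$, while the coefficient of $(x - x_0)^4$ is whatever value is supplied.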
Complex Analytic Structure of Solutions
Consider the two-(complex-)parameter manifold of solutions of a Painlevé equation. Each solution is globally determined by two initial values given at a regular point of the solution. However, the solution can also be determined by two pieces of data given at a movable pole. The location $x_0$ of such a pole provides one of the two free parameters. The other free parameter occurs as a coefficient in the Laurent expansion of the solution in a domain punctured at $x_0$. For PI, the Laurent expansion of a solution at a movable singularity $x_0$ is

$$y(x) = \frac{1}{(x - x_0)^2} - \frac{x_0}{10}(x - x_0)^2 - \frac{1}{6}(x - x_0)^3 + c_{\mathrm I}(x - x_0)^4 + \cdots$$ [1]

where $c_{\mathrm I}$ is arbitrary. This second free parameter is normally called a "resonance parameter." For PII, the Laurent expansion of a solution at a movable singularity $x_0$ is

$$y(x) = \frac{\pm 1}{x - x_0} \mp \frac{x_0}{6}(x - x_0) + \frac{\mp 1 - \alpha}{4}(x - x_0)^2 + c_{\mathrm{II}}(x - x_0)^3 + \cdots$$ [2]

where $c_{\mathrm{II}}$ is arbitrary. The symmetric solution of PI that has a pole at the origin and corresponding resonance parameter $c_{\mathrm I} = 0$ has a distribution of poles in the complex $x$-plane shown in Figure 1. (This figure was obtained by searching for zeros of truncated Taylor expansions of the tau-function $\tau_{\mathrm I}$ described in the section "Bäcklund and Miura transformations." One hundred and sixty numerical zeros are shown. The two pairs of closely spaced zeros near the imaginary axis (between $8 < \operatorname{Im} x < 12$) may be numerical artifacts. We used the command NSolve to 32 digits in MATHEMATICA4.)

Figure 1 Poles of a symmetric solution of PI in the complex $x$-plane, with a pole at the origin and zero corresponding resonance parameter, i.e., $x_0 = 0$, $c_{\mathrm I} = 0$.

The rays of symmetry evident in Figure 1 reflect discrete symmetries of PI. The solutions of PI and PII are invariant under the respective discrete symmetries,

$$\mathrm{P_I}:\quad y_n(x) = e^{-2\pi i n/5}\, y\!\left(e^{4\pi i n/5} x\right), \qquad n = \pm 1, \pm 2$$
$$\mathrm{P_{II}}:\quad y_n(x) = e^{-\pi i n/3}\, y\!\left(e^{2\pi i n/3} x\right), \quad \alpha \mapsto e^{\pi i n}\alpha, \qquad n = \pm 1, \pm 2, 3$$

The rays of angle $2n\pi/5$ for PI and $n\pi/3$ for PII related to these symmetries play special roles in the asymptotic behaviors of the corresponding solutions for $|x| \to \infty$.

Linear Problems
The Painlevé equations are regarded as completely integrable because they can be solved through an associated system of linear equations (Jimbo and Miwa 1981).

$$\frac{d\varphi}{d\lambda} = L(x, \lambda)\,\varphi$$ [3a]
$$\frac{d\varphi}{dx} = M(x, \lambda)\,\varphi$$ [3b]

The compatibility condition, that is,

$$L_x - M_\lambda + [L, M] = 0$$ [4]

is equivalent to the corresponding Painlevé equation. The matrices $L$, $M$ for PI and PII are listed below:

$$\mathrm{P_I}:\quad L_{\mathrm I}(x, \lambda) = \lambda^2\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} + \lambda\begin{pmatrix} 0 & y \\ 4 & 0 \end{pmatrix} + \begin{pmatrix} -z & y^2 + x/2 \\ -4y & z \end{pmatrix}$$
$$M_{\mathrm I}(x, \lambda) = \lambda\begin{pmatrix} 0 & 1/2 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & y \\ 2 & 0 \end{pmatrix}$$

where $z = y'$, $z' = 6y^2 + x$.

$$\mathrm{P_{II}}:\quad L_{\mathrm{II}}(x, \lambda) = \lambda^2\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} + \lambda\begin{pmatrix} 0 & u \\ -2z/u & 0 \end{pmatrix} + \begin{pmatrix} z + x/2 & -uy \\ -2(\vartheta + zy)/u & -(z + x/2) \end{pmatrix}$$
$$M_{\mathrm{II}}(x, \lambda) = \lambda\begin{pmatrix} 1/2 & 0 \\ 0 & -1/2 \end{pmatrix} + \begin{pmatrix} 0 & u/2 \\ -z/u & 0 \end{pmatrix}$$

where $u' = -uy$, $z = y' - y^2 - x/2$, and $\vartheta := \tfrac{1}{2} - \alpha$.
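The equivalence between the compatibility condition [4] and PI can be checked directly. The sketch below is an illustration under the sign conventions of the Jimbo–Miwa pair for PI, with the leading entries of the constant matrix taken as $-z$ and $-4y$. It evaluates $L_x - M_\lambda + [L, M]$ in exact rational arithmetic at an arbitrary point. The helper names (`mat_mul`, `lax_residual_p1`) are assumptions made for this example only.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lax_residual_p1(x, lam, y, z, y_x, z_x):
    """Return L_x - M_lam + [L, M] for the PI linear problem.
    y_x and z_x are the x-derivatives of y and z, supplied independently
    so that on-shell and off-shell values can both be tested."""
    half = Fraction(1, 2)
    # L = lam^2 [[0,1],[0,0]] + lam [[0,y],[4,0]] + [[-z, y^2 + x/2],[-4y, z]]
    L = [[-z, lam**2 + lam * y + y**2 + x * half],
         [4 * lam - 4 * y, z]]
    # M = lam [[0,1/2],[0,0]] + [[0,y],[2,0]]
    M = [[0, lam * half + y],
         [2, 0]]
    # Derivatives of L with respect to x and of M with respect to lam.
    L_x = [[-z_x, lam * y_x + 2 * y * y_x + half],
           [-4 * y_x, z_x]]
    M_lam = [[0, half],
             [0, 0]]
    LM, ML = mat_mul(L, M), mat_mul(M, L)
    return [[L_x[i][j] - M_lam[i][j] + LM[i][j] - ML[i][j]
             for j in range(2)] for i in range(2)]

x, lam = Fraction(3, 7), Fraction(-5, 2)
y, z = Fraction(2, 3), Fraction(1, 4)
on_shell = lax_residual_p1(x, lam, y, z, y_x=z, z_x=6 * y**2 + x)
print(on_shell)  # every entry vanishes when y' = z and z' = 6 y^2 + x
```

Supplying any other value for `z_x` leaves a nonzero diagonal entry, which is the sense in which [4] reproduces PI rather than holding identically.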
Alternative linear problems also exist for each equation. For example, for PII, an alternative choice of $L$ and $M$ is (Flaschka and Newell 1980):

$$\mathrm{P_{II}}:\quad L'_{\mathrm{II}}(x, \lambda) = \lambda^2\begin{pmatrix} -4i & 0 \\ 0 & 4i \end{pmatrix} + \lambda\begin{pmatrix} 0 & 4y \\ 4y & 0 \end{pmatrix} + \begin{pmatrix} -i(x + 2y^2) & 2iy' \\ -2iy' & i(x + 2y^2) \end{pmatrix} - \frac{\alpha}{\lambda}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
$$M'_{\mathrm{II}}(x, \lambda) = \begin{pmatrix} -i\lambda & y \\ y & i\lambda \end{pmatrix}$$

The matrix $L$ for each Painlevé equation is singular at a finite number of points $a_i(x)$ in the $\lambda$-plane. For the above choices of $L$ for PI and PII, the point $\lambda = \infty$ is clearly a singularity. For $L'_{\mathrm{II}}$, the origin $\lambda = 0$ is also a singularity. The analytic continuation of a fundamental matrix of solutions $\Phi$ around $a_i$ gives a new solution $\tilde\Phi$ which must be related to the original solution: $\tilde\Phi = \Phi A$. $A$ is called the monodromy matrix and its trace and determinant are called the monodromy data. In general, the data will change with $x$. However, eqn [4] ensures that the monodromy data remain constant in $x$. For this reason, the system [3] is called an isomonodromy problem.

Bäcklund and Miura Transformations
Bäcklund transformations are those that map a solution of a Painlevé equation with one choice of parameter to a solution of the same equation with different parameters. For PI no such transformation is known. For PII, there is one Bäcklund transformation. Let $y = y(x; \alpha)$ denote a solution of PII with parameter $\alpha$. Then $\tilde y = y(x; \alpha - 1)$, which solves PII with parameter $\alpha - 1$, is given by

$$\tilde y := -y + \frac{\alpha - \tfrac12}{y' - y^2 - x/2} \quad \text{if } \alpha \neq 1/2$$ [5]

If $\alpha = 1/2$, then $y' = y^2 + x/2$ and $\tilde y = -y$ (see the next section for this case). Combined with the symmetry $y \mapsto -y$, $\alpha \mapsto -\alpha$, we can write down another version of this Bäcklund transformation which maps $y$ to $\hat y = y(x; \alpha + 1)$:

$$\hat y := -y - \frac{\alpha + \tfrac12}{y' + y^2 + x/2}, \quad \text{if } \alpha \neq -\frac12$$ [6]

If we parametrize $\alpha$ by $c + n$ for arbitrary $c$, and denote the solution for corresponding parameter as $y_n$, we can write a difference equation relating $y_{n-1}$ and $y_{n+1}$ (by eliminating $y'$ from the two transformations $\tilde y$, $\hat y$) as

$$\frac{c + \tfrac12 + n}{y_{n+1} + y_n} + \frac{c - \tfrac12 + n}{y_{n-1} + y_n} + 2y_n^2 + x = 0$$

This is an example of a discrete Painlevé equation (called "alternate" dPI in the literature). In such a discrete Painlevé equation, $x$ is fixed while $n$ varies. Another lesser known Bäcklund transformation for PII is

$$y' - y^2 - \frac{x}{2} - v^2 = 0$$ [7]
$$v' + y\,v = 0$$ [8]

between PII with $\alpha = 1/2$ and

$$v'' + v^3 + \frac{x}{2}v = 0$$

which can be scaled (take $v(x) = a\,y(bx)$ with $b^3 = -1/2$ and $a^2 = -2b^2$) to the usual form of PII with $\alpha = 0$.

Miura transformations are those that map a solution of a Painlevé equation to another equation in the 50 canonical types classified by Painlevé's school. If $y$ is a solution of PII with parameter $\alpha \neq 1/2$, then

$$(2\alpha - 1)\,w = 2\left(y' - y^2 - x/2\right), \qquad y = \frac{1 - w'}{2w}$$

maps between PII and

$$w'' = \frac{(w')^2}{2w} - (2\alpha - 1)\,w^2 - xw - \frac{1}{2w}$$

which represents the 34th canonical class in the Painlevé classification listed in Ince (1927).

The Painlevé equations do not possess continuous symmetries other than Bäcklund and Miura transformations described here. However, they do possess discrete symmetries described in the section "Complex analytic structure of solutions."

Classical Special Solutions
Painlevé showed that there can be no explicit first integral that is rational in $y$ and $y'$ for his eponymous equations. It is known that this statement can be extended to say that no such algebraic first integral exists. But the question whether the Painlevé equations define new transcendental functions remained open until recently.

Form a class of functions consisting of those satisfying linear second-order differential equations, such as the Airy, Bessel, and hypergeometric functions, as well as rational, algebraic, and exponential functions. Extend this class to include arithmetic operations, compositions under such functions, and solutions of linear equations with these earlier functions as coefficients. Members of this class are called classical functions. For general values of the constants $\alpha$, $\beta$, $\gamma$, $\delta$, it is now known (Umemura 1990, Umemura and Watanabe 1997) that the six Painlevé equations cannot be solved in terms of classical functions.

However, there are special values of the constant parameters $\alpha$, $\beta$, $\gamma$, $\delta$ for which classical functions do solve the Painlevé equations. Each Painlevé equation, except PI, has special solutions given by classical functions when the parameters in the Painlevé equation take on special values. For PII, with $\alpha = 1/2$ we have the special integral

$$I_{1/2} \equiv y' - y^2 - \frac{x}{2} = 0$$ [9]

which, modulo PII with $\alpha = 1/2$, satisfies the relation

$$\left(\frac{d}{dx} + 2y\right) I_{1/2} = 0$$

The Riccati eqn [9] can be linearized via $y = -\psi'/\psi$ to yield

$$\psi'' + \frac{x}{2}\psi = 0$$

which gives

$$\psi(x) = a\,\mathrm{Ai}\!\left(-2^{-1/3}x\right) + b\,\mathrm{Bi}\!\left(-2^{-1/3}x\right)$$

for arbitrary constants $a$ and $b$, that is, the well-known Airy function solutions of PII. Iterations of the Bäcklund transformations $\tilde y$ and $\hat y$, [5]–[6], give further classical solutions in terms of Airy functions for the case when $\alpha = (2N + 1)/2$ for integer $N$. Similarly, there is a sequence of rational solutions of the family of equations PII with $\alpha = N$, for integer $N$, if we iterate the Bäcklund transformations $\tilde y$, $\hat y$ by starting with the trivial solution $y \equiv 0$ for the case $\alpha = 0$. For example, for $\alpha = 1$, we have $\hat y = -1/x$. The transformations [7]–[8] give a mapping that shows that this family of rational solutions and the above family of Airy-type solutions of PII both exist for the cases when $\alpha$ is half-integer and when it is integer.

where $E_{\mathrm I}$ and $E_{\mathrm{II}}$ are constants. We choose canonical variables $q_1(t) = y(x)$, $p_1(t) = y'(x)$, where $t = x$. Furthermore, for PI, we take

$$q_2(t) = x, \qquad p_2(t) = \int^x y(\xi)\,d\xi$$

and the Hamiltonian

$$H_{\mathrm I} := \frac{p_1^2}{2} - 2q_1^3 - q_2 q_1 + p_2$$

so that the Hamiltonian equations of motion $\dot q_i = \partial H/\partial p_i$ and $\dot p_i = -\partial H/\partial q_i$ are satisfied. For PII, we take

$$q_2(t) = x/2, \qquad p_2(t) = \int^x y(\xi)^2\,d\xi$$

and the Hamiltonian

$$H_{\mathrm{II}} := \frac{p_1^2}{2} - \frac{q_1^4}{2} - q_2 q_1^2 + \frac{1}{2}p_2 - \alpha q_1$$

We note that these Hamiltonians govern systems with two degrees of freedom and each is conserved. However, no explicit second conserved quantity is known (see comments on first integrals in the last section).

Painlevé's viewpoint of the transcendental solutions of the Painlevé equations as natural generalizations of elliptic functions also led him to search for entire functions that play the role of theta functions in this new setting. He found that analogous functions could be defined which have only zeros at the locations of the movable singularities of the Painlevé transcendents. These functions are now commonly known as tau-functions (also denoted $\tau$-functions).

For PI and PII, the corresponding tau-functions are entire functions (i.e., they are analytic everywhere in the complex $x$-plane). However, for the remaining Painlevé equations, they are singular at the fixed singularities of the respective equation.

For PI, all movable singularities are double poles of strength unity (see eqn [1]). Therefore, the function given by

$$\mathrm{P_I}:\quad \tau_{\mathrm I}(x) = \exp\left(-\int^x\!\!\int^s y(t)\,dt\,ds\right)$$

satisfies $(\log \tau_{\mathrm I})'' = -y$, and substituting this into PI gives the equation

$$\tau_{\mathrm I}\,\tau_{\mathrm I}^{(4)} - 4\,\tau_{\mathrm I}'\,\tau_{\mathrm I}^{(3)} + 3\,\tau_{\mathrm I}''^{\,2} + x\,\tau_{\mathrm I}^2 = 0$$
Note that this equation is bilinear in $\tau$ and its derivatives. Such bilinear, or in general, multilinear, equations are called Hirota-type forms of the Painlevé equations. The special nature of such equations is most simply expressed in terms of the Hirota $D$ ($\equiv D_x$) operator, an antisymmetric differential operator defined here on products of functions of $x$:

$$D_x^n\, f \cdot g = \left(\partial_\xi - \partial_\eta\right)^n f(\xi)\,g(\eta)\Big|_{\xi = \eta = x}$$

Notice that

$$D_x^2\,\tau\cdot\tau = 2\left(\tau\tau'' - \tau'^2\right), \qquad D_x^4\,\tau\cdot\tau = 2\left(\tau\tau^{(4)} - 4\tau'\tau^{(3)} + 3\tau''^2\right)$$

Hence the equation satisfied by $\tau_{\mathrm I}(x)$ can be rewritten more succinctly as

$$\left(D_x^4 + 2x\right)\tau_{\mathrm I}\cdot\tau_{\mathrm I} = 0$$

$$\frac{d}{dz}L_{n+1}\{v\} = \left(\frac{d^3}{dz^3} + 4v\frac{d}{dz} + 2v'\right)L_n\{v\}, \qquad L_1\{v\} = v$$

where primes denote $z$-derivatives. Note that

$$L_2\{v\} = v'' + 3v^2$$
$$L_3\{v\} = v^{(4)} + 10vv'' + 5v'^2 + 10v^3$$

This operator is intimately related to the Korteweg–de Vries equation. (It was first discovered as a method of generating the infinite number of conservation laws associated with this soliton equation.) A scaling of $v$ and $z$ shows that the case $n = 2$ of the sequence of ODEs defined recursively by

$$L_n\{v\} = z$$

is equivalent to PI.
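The Hirota operator acts on a pair of functions through the binomial expansion $D_x^n f\cdot g = \sum_k (-1)^k \binom{n}{k} f^{(n-k)} g^{(k)}$. As a small illustration (a sketch, with polynomials stored as coefficient lists and all helper names such as `hirota` and `p_diff` invented for this example), the code below evaluates the operator exactly on polynomials. It can be used to confirm the expansions of $D_x^2\,\tau\cdot\tau$ and $D_x^4\,\tau\cdot\tau$ and the vanishing of odd-order $D_x^n\,\tau\cdot\tau$.

```python
from fractions import Fraction
from math import comb

def p_add(a, b):
    """Sum of two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def p_scale(c, a):
    return [c * coeff for coeff in a]

def p_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def p_diff(a, k=1):
    """k-th derivative of a polynomial."""
    for _ in range(k):
        a = [i * a[i] for i in range(1, len(a))] or [Fraction(0)]
    return a

def p_trim(a):
    """Drop trailing zero coefficients, keeping at least one entry."""
    a = list(a)
    while len(a) > 1 and a[-1] == 0:
        a.pop()
    return a

def hirota(n, f, g):
    """D_x^n f.g = sum_k (-1)^k C(n,k) f^(n-k) g^(k)."""
    total = [Fraction(0)]
    for k in range(n + 1):
        term = p_mul(p_diff(f, n - k), p_diff(g, k))
        total = p_add(total, p_scale((-1) ** k * comb(n, k), term))
    return p_trim(total)

tau = [Fraction(c) for c in (1, 2, 0, -3, 5, 1)]
d1, d2 = p_diff(tau, 1), p_diff(tau, 2)
expected = p_trim(p_scale(2, p_add(p_mul(tau, d2), p_scale(-1, p_mul(d1, d1)))))
print(hirota(2, tau, tau) == expected)
```

Polynomials are used only because they allow exact arithmetic. The same binomial formula applies to any sufficiently differentiable pair of functions.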
Partial Differential Equations: Some Examples

Introduction
Many physical laws are mathematically expressed in terms of partial differential equations (PDEs); this is, for instance, the case in the realm of classical mechanics and physics of the laws of conservation of angular momentum, mass, and energy.

The object of this short article is to provide an overview and make a few comments on the set of PDEs appearing in classical mechanics, which is tremendously rich and diverse. From the mathematical point of view the PDEs appearing in mechanics range from well-understood PDEs to equations which are still at the frontier of sciences as far as their mathematical theory is concerned. The mathematical theory of PDEs deals primarily with their "well-posedness" in the sense of Hadamard. A well-posed PDE problem is a problem for which existence and uniqueness of solutions in suitable function spaces and continuous dependence on the data have been proved.

For simplicity, let us restrict ourselves to space dimension 2. Several interesting and important PDEs are of the form

$$a\frac{\partial^2 u}{\partial x^2} + b\frac{\partial^2 u}{\partial x\,\partial y} + c\frac{\partial^2 u}{\partial y^2} = 0$$ [1]

Here $a$, $b$, $c$ may depend on $x$ and $y$ or they may be constants, and then eqn [1] is linear; they may also depend on $u$, $\partial u/\partial x$, and $\partial u/\partial y$, in which case the equation is nonlinear.

Such an equation is
elliptic when (where) $b^2 - 4ac < 0$,
hyperbolic when (where) $b^2 - 4ac > 0$,
parabolic when (where) $b^2 - 4ac = 0$.

Among the simplest linear equations, we have the elliptic equation

$$\Delta u = 0$$ [2]

which governs the following phenomena: equation for the potential or stream function of plane, incompressible irrotational fluids; equation for some potential in linear elasticity, or the equation for the temperature in suitable conditions (stationary case; see below for the time-dependent case).

A second equation of type [1] is the linear hyperbolic equation (the wave equation)

$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0$$ [3]

which governs, for example, linear acoustics in one dimension (sound pipes) or the propagation of an elastic wave along an elastic string.

A third equation of type [1] is the linear parabolic equation

$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0$$ [4]

also called the heat equation, which governs, under appropriate circumstances, the temperature ($u(x, t)$ = temperature at $x$ at time $t$).

All these equations are well understood from the mathematical viewpoint and many well-posedness results are available. A fundamental difference between eqns [2], [3], and [4] is that for [2] and [4] the solution is as smooth as allowed by the data (forcing terms, boundary data not mentioned here), whereas the solutions of [3] usually present some discontinuities corresponding to the propagation of a wave or wave front.

A considerable jump of complexity occurs if we consider the equation of transonic flows in which

$$a = 1 - \frac{1}{v^2}\left(\frac{\partial u}{\partial x}\right)^2, \qquad b = -\frac{2}{v^2}\,\frac{\partial u}{\partial x}\,\frac{\partial u}{\partial y}, \qquad c = 1 - \frac{1}{v^2}\left(\frac{\partial u}{\partial y}\right)^2$$ [5]

where $v = v(x, y)$ is the local speed of sound. This is a mixed second-order equation: it is elliptic in the subsonic region where $M < 1$, $M$ the Mach number being the ratio of the velocity

$$|\operatorname{grad} u| = \left[\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right]^{1/2}$$

to the local velocity of sound $v = v(x, y)$; eqn [1] (with [5]) is hyperbolic in the supersonic region, where $M > 1$, and parabolic on the sonic line $M = 1$. Essentially no result of well-posedness is available for this problem, and it is not even totally clear what are the boundary conditions that one should associate to [1]–[5] to obtain a well-posed problem.
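The classification by the sign of $b^2 - 4ac$ is easy to mechanize. The following sketch uses function names of our own choosing (`classify`, `transonic_coefficients`), with a tolerance argument added as an assumption for floating-point input. It classifies the second-order part of eqn [1] and applies the test to the transonic coefficients [5], for which $b^2 - 4ac = 4(M^2 - 1)$.

```python
def classify(a, b, c, tol=0.0):
    """Classify a u_xx + b u_xy + c u_yy = 0 by the sign of b^2 - 4ac."""
    disc = b * b - 4 * a * c
    if disc < -tol:
        return "elliptic"
    if disc > tol:
        return "hyperbolic"
    return "parabolic"

def transonic_coefficients(ux, uy, v):
    """Coefficients [5] for velocity components (ux, uy) and sound speed v."""
    a = 1 - ux**2 / v**2
    b = -2 * ux * uy / v**2
    c = 1 - uy**2 / v**2
    return a, b, c

# Laplace equation [2]: u_xx + u_yy = 0
print(classify(1, 0, 1))
# Wave equation [3], with t playing the role of the first variable
print(classify(1, 0, -1))
# Heat equation [4]: only u_xx appears in the second-order part
print(classify(0, 0, -1))
# Transonic flow at Mach 0.5 and Mach 2 (flow along x, v = 1)
print(classify(*transonic_coefficients(0.5, 0.0, 1.0)))
print(classify(*transonic_coefficients(2.0, 0.0, 1.0)))
```

For nonlinear equations such as the transonic one, the type is a local property: the same flow can be elliptic at one point and hyperbolic at another, which is why the function is evaluated pointwise.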
Path Integral Methods see Functional Integration in Quantum Physics; Feynman Path Integrals
Path Integrals in Noncommutative Geometry
introduce an element h of L2 (N). The map which to where s ! x(s) is a smooth loop in N, !i is of odd
!, a smooth function on N, associates exp[(N þ degree and !1i is of even degree. Let us recall that
!)]h(Re 0) satisfies the requirements (1) and (2) even forms on the free loop space commute. F(n ) is
of the introduction and depends holomorphically on built from even forms on the free loop space, which
. This defines by the Potthoff–Streit theorem a commute. This explains why we have to consider
distribution which depends holomorphically on , the symmetric Fock space.P Therefore, if belongs to
Re 0 with values in L2 (N). By uniqueness of WNs, 1 , then F() = F2r (), where F2r () is a
analytic continuation, we obtain: measurable form on L(N) of degree 2r (see Jones
and Léandre (1991) for an analogous statement in
Theorem 2 If Px (N) is the space of smooth paths
the stochastic context).
starting from x in N, we have
( ) Let us explain why the free loop space is
Z important in this context. Let d
x (1) be the law of
h ; i ¼ x ! FðÞhðxð1ÞÞdx ðÞ ½10 the Brownian bridge on N starting from x and
Px ðNÞ
coming back at x at time 1: this is the law of the
Instead of taking functions, we can consider as Brownian motion x. (x) subject to return in time 1 at
bundle E the space of complex 1-forms on N. We its departure. Let pt (x, y) be the heat kernel
then consider Chen (1973) iterated integrals: associated with xt (x): the law of xt (x) is namely
pt (x, y) dmN (y) (Ikeda and Watanabe 1981). We
Fðn Þðxð:ÞÞ consider the Bismut–Høegh–Krohn measure on the
Z
continuous free loop space L0 (N):
¼ hn ðxðs1 Þ; . . . ; xðsn ÞÞ; dxðs1 Þ; . . . ; dxðsn Þi ½11
n
dP ¼ p1 ðx; xÞdx d
x ð1Þ ½14
such that F maps WNs, 1 into the set of measurable
maps on P(N). These maps are generally not This satisfies
bounded. Namely,
tr½exp½s1 N f1 fn exp½ð1 sn ÞN
Z 1 Z
Fðexp½!Þ ¼ exp h!ðxðsÞÞ; dxðsÞi ½12 ¼ f1 ðxðs1 ÞÞ fn ðxðsn ÞÞdP ½15
0 L0 ðNÞ
R1
instead of exp[ 0 !(x(s))ds] in the previous case. By (We are interested in the trace of the heat semigroup
using the Cameron–Martin–Girsanov–Maruyama for- instead of the heat semigroup itself unlike in the
mula and Kato perturbation theory, we get an analog previous section.)
of Theorem 2 for Chen iterated integrals, but for Since N is spin, we can consider the spin bundle
Re < 0, because we have to deal with a perturbation Sp = Spþ
Sp on it, the Clifford bundle Cl on it with
of N by a drift when we want to check (1) and (2). its natural Z=2Z gradation (Gilkey 1995). Let us recall
The interest of this formalism is that the parallel that the Clifford algebra acts on the spinors. A form !
transport belongs in some sense to the domain of the can be associated with an element !˜ of the Clifford
distribution and that we get the flat Feynman path bundle (Gilkey 1995). We consider the Brownian loop
integral from the curved one by using an analog of [7]. x(.) associated to the Bismut–Høegh–Krohn measure.
If s < t, we can define the stochastic parallel transport
~s, t from x(t) to x(s) (we identify a loop to a path from
Bismut–Chern Character [0, 1] into N with the same end values). We remark
and Path Integrals that with the notations of [13]
Z
Since we are concerned in this part with index theory, ~0;s1 ð!
~1 ðdxðs1 ÞÞ þ ! ~1 ds1 Þ~
s1 ;s2 . . .
we replace the free path space of N by the free smooth n
loop space L(N). We consider the case where V = N is
^ !n ðdxðsn Þ; :Þ þ !1n dsn ½13 of the Dirac operator in terms of the horizontal
We have only to use the formula [21] and

⟨exp[·]; ·_1 ∧ ·_2 ∧ ⋯ ∧ ·_{2n}⟩ = Pf{ω(·_i ∧ ·_j)} [24]

and to estimate the obtained Pfaffians when n → ∞.

Theorem 4 allows us to give a rigorous interpretation of the fermionic Feynman–Kac formula of Rogers (1987). We refer to Roepstorff (1994) for details. exp[·] should give a rigorous interpretation to the Gaussian Berezin integral with formal density $\exp\left[\sqrt{-1}\int_0^1 \sum p_i(s)\,dq_i(s)\right]$.

See also: Equivariant Cohomology and the Cartan Model; Feynman Path Integrals; Functional Integration in Quantum Physics; Hopf Algebras and q-Deformation Quantum Groups; Index Theorems; Measure on Loop Spaces; Positive Maps on C*-Algebras; Stationary Phase Approximation; Stochastic Differential Equations; Supermanifolds; Supersymmetric Quantum Mechanics.

Further Reading
Accardi L and Bożejko M (1998) Interacting Fock spaces and Gaussianization of probability measures. Infinite Dimensional Analysis, Quantum Probability and Related Topics 1: 663–670.
Albeverio S (1996) Wiener and Feynman path integrals and their applications. In: Masani PR (ed.) Norbert Wiener Centenary Congress, Proc. Symp. Appl. Math. vol. 52, pp. 163–194. Providence, RI: American Mathematical Society.
Andersson L and Driver B (1999) Finite dimensional approximation to Wiener measure and path integral formulas on manifolds. Journal of Functional Analysis 165: 430–498.
Atiyah M (1985) Circular symmetry and stationary phase approximation. In: Colloque en l'honneur de L. Schwartz, vol. 131, pp. 43–59. Paris: Asterisque.
Bismut JM (1985) Index theorem and equivariant cohomology on the loop space. Communications in Mathematical Physics 98: 213–237.
Berezansky YM and Kondratiev YO (1995) Spectral Methods in Infinite-Dimensional Analysis, vols. I, II. Dordrecht: Kluwer.
Bismut JM (1986) Localization formula, superconnections and the index theorem for families. Communications in Mathematical Physics 103: 127–166.
Bismut JM (1987) Filtering equation, equivariant cohomology and the Chern character. In: Seneor R (ed.) Proc. VIIIth Int. Cong. Math. Phys., pp. 17–56. Singapore: World Scientific.
Chen KT (1973) Iterated path integrals of differential forms and loop space homology. Annals of Mathematics 97: 213–237.
Connes A (1988) Entire cyclic cohomology of Banach algebras and character of θ-summable Fredholm modules. K-Theory 1: 519–548.
Cuntz J (2001) Cyclic theory, bivariant K-theory and the bivariant Chern–Connes character. In: Cyclic Homology in Noncommutative Geometry, pp. 2–69. Encyclopedia of Mathematical Sciences, 121. Heidelberg: Springer.
Gilkey P (1995) Invariance Theory, the Heat Equation and the Atiyah–Singer Theorem. Boca Raton: CRC Press.
Hida T, Kuo HH, Potthoff J, and Streit L (1993) White Noise: An Infinite Dimensional Calculus. Dordrecht: Kluwer.
Ikeda N and Watanabe S (1981) Stochastic Differential Equations and Diffusion Processes. Amsterdam: North-Holland.
Jones JDS and Léandre R (1991) Lp Chen forms on loop spaces. In: Barlow M and Bingham N (eds.) Stochastic Analysis, pp. 104–162. Cambridge: Cambridge University Press.
Léandre R (2002) White noise analysis, filtering equation and the index theorem for families. In: Heyer H and Saitô (eds.) Infinite Dimensional Harmonic Analysis (to appear).
Léandre R (2003) Theory of distributions in the sense of Connes–Hida and Feynman path integral on a manifold. Inf. Dim. Anal. Quant. Probab. Rel. Top. 6: 505–517.
Roepstorff G (1994) Path Integral Approach to Quantum Physics. An Introduction. Heidelberg: Springer.
Rogers A (1987) Fermionic path integration and Grassmann Brownian motion. Communications in Mathematical Physics 113: 353–368.
Sidorova N, Smolyanov O, von Weizsaecker H, and Wittich O (2004) The surface limit of Brownian motion in tubular neighborhood of an embedded Riemannian manifold. Journal of Functional Analysis 206: 391–413.
Szabo R (2000) Equivariant cohomology and localization of path integrals in physics, Lecture Notes in Physics M63. Berlin: Springer.
Peakons
D D Holm, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Peakons are singular solutions of the dispersionless Camassa–Holm (CH) shallow-water wave equation in one spatial dimension. These are reviewed in the context of asymptotic expansions and Euler–Poincaré (EP) variational principles. The dispersionless CH equation generalizes to the EPDiff equation (defined subsequently in this article), whose singular solutions are peakon wave fronts in higher dimensions. The reduction of these singular solutions of CH and EPDiff to canonical Hamiltonian dynamics on lower-dimensional sets may be understood by realizing that their solution ansatz is a momentum map, and momentum maps are Poisson.

Camassa and Holm (1993) discovered the "peakon" solitary traveling-wave solution for a shallow-water wave:

$$u(x, t) = c\,e^{-|x - ct|/\alpha}$$ [1]

whose fluid velocity $u$ is a function of position $x$ on the real line and time $t$. The peakon traveling wave moves at a speed equal to its maximum height, at which it has a sharp peak (jump in derivative). Peakons are an emergent phenomenon, solving the initial-value problem for a partial differential equation (PDE) derived by an asymptotic expansion of Euler's equations using the small parameters of shallow-water dynamics. Peakons are nonanalytic solitons, which superpose as

$$u(x, t) = \sum_{a=1}^{N} p_a(t)\,e^{-|x - q_a(t)|/\alpha}$$ [2]

for sets $\{p\}$ and $\{q\}$ satisfying canonical Hamiltonian dynamics. Peakons arise for shallow-water waves in the limit of zero linear dispersion in one dimension. Peakons satisfy a PDE arising from Hamilton's principle for geodesic motion on the smooth invertible maps (diffeomorphisms) with respect to the $H^1$ Sobolev norm of the fluid velocity. Peakons generalize to higher dimensions, as well. We explain how peakons were derived in the context of shallow-water asymptotics and describe some of their remarkable mathematical properties.

Shallow-Water Background for Peakons
Euler's equations for irrotational incompressible ideal fluid motion under gravity with a free surface have an asymptotic expansion for shallow-water waves that contains two small parameters, $\epsilon$ and $\delta^2$, with ordering $\epsilon \geq \delta^2$. These small parameters are $\epsilon = a/h_0$ (the ratio of wave amplitude to mean depth) and $\delta^2 = (h_0/l_x)^2$ (the squared ratio of mean depth to horizontal length, or wavelength). Euler's equations are made nondimensional by introducing $x = l_x x'$ for horizontal position, $z = h_0 z'$ for vertical position, $t = (l_x/c_0)t'$ for time, $\eta = a\eta'$ for surface elevation, and $\varphi = (g l_x a/c_0)\varphi'$ for velocity potential, where $c_0 = \sqrt{g h_0}$ is the mean wave speed and $g$ is the constant gravity. The quantity $\sigma = \sigma'/(h_0 \rho c_0^2)$ is the dimensionless Bond number, in which $\rho$ is the mass density of the fluid and $\sigma'$ is its surface tension, both of which are taken to be constants. After dropping primes, this asymptotic expansion yields the nondimensional Korteweg–de Vries (KdV) equation for the horizontal velocity variable $u = \varphi_x(x, t)$ at "linear" order in the small dimensionless ratios $\epsilon$ and $\delta^2$, as the left-hand side of

$$u_t + u_x + \frac{3\epsilon}{2}\,u u_x + \frac{\delta^2}{6}(1 - 3\sigma)\,u_{xxx} = O(\epsilon\delta^2)$$ [3]

Here, partial derivatives are denoted using subscripts, and boundary conditions are $u = 0$ and $u_x = 0$ at spatial infinity on the real line. The famous $\mathrm{sech}^2(x - t)$ traveling-wave solutions (the solitons) for KdV [3] arise in a balance between its (weakly) nonlinear steepening and its third-order linear dispersion, when the quadratic terms in $\epsilon$ and $\delta^2$ on its right-hand side are neglected.

In eqn [3], a normal-form transformation due to Kodama (1985) has been used to remove the other possible quadratic terms of order $O(\epsilon^2)$ and $O(\delta^4)$. The remaining quadratic correction terms in the KdV equation [3] may be collected at order $O(\epsilon\delta^2)$. These terms may be expressed, after introducing a "momentum variable,"

$$m = u - \nu\delta^2 u_{xx}$$ [4]

and neglecting terms of cubic order in $\epsilon$ and $\delta^2$, as

$$m_t + m_x + \frac{\epsilon}{2}\left(u m_x + b\,m u_x\right) + \frac{\delta^2}{6}(1 - 3\sigma)\,u_{xxx} = 0$$ [5]

In the momentum variable $m = u - \nu\delta^2 u_{xx}$, the parameter $\nu$ is given by Dullin et al. (2001):

$$\nu = \frac{19 - 30\sigma - 45\sigma^2}{60(1 - 3\sigma)}$$ [6]

Thus, the effects of $\delta^2$-dispersion also enter the nonlinear terms. After restoring dimensions in eqn [5] and rescaling velocity $u$ by $(b + 1)$, the following "b-equation" emerges,

$$m_t + c_0 m_x + u m_x + b\,m u_x + \Gamma u_{xxx} = 0$$ [7]

where $m = u - \alpha^2 u_{xx}$ is the dimensional momentum variable, and the constants $\alpha^2$ and $\Gamma/c_0$ are squares of length scales. When $\alpha^2 \to 0$, one recovers KdV from the b-equation [7], up to a rescaling of velocity. Any value of the parameter $b \neq -1$ may be achieved in eqn [7] by an appropriate Kodama transformation (Dullin et al. 2001).

As already emphasized, the values of the coefficients in the asymptotic analysis of shallow-water waves at quadratic order in their two small parameters only hold modulo the Kodama normal-form transformations. Hence, these transformations may be used to advance the analysis and thereby gain insight, by optimizing the choices of these coefficients. The freedom introduced by the Kodama transformations among asymptotically equivalent equations at quadratic order in $\epsilon$ and $\delta^2$ also helps to answer the perennial question, "Why are integrable equations so ubiquitous when one uses asymptotics in modeling?"

Integrable Cases of the b-equation [7]
The cases $b = 2$ and $b = 3$ are special values for which the b-equation becomes a completely integrable Hamiltonian system. For $b = 2$, eqn [7]
specializes to the integrable CH equation of Camassa and Holm (1993). The case b = 3 in [7] recovers the integrable equation of Degasperis and Procesi (1999) (henceforth DP equation). These two cases exhaust the integrable candidates for [7], as was shown using Painlevé analysis. The b-family of eqns [7] was also shown in Mikhailov and Novikov (2002) to admit the symmetry conditions necessary for integrability, only in the cases b = 2 for CH and b = 3 for DP.

The b-equation [7] with b = 2 was first derived in Camassa and Holm (1993) by using asymptotic expansions directly in the Hamiltonian for Euler's equations governing inviscid incompressible flow in the shallow-water regime. In this analysis, the CH equation was shown to be bi-Hamiltonian and thereby was found to be completely integrable by the inverse-scattering transform (IST) on the real line. Reviews of IST may be found, for example, in Ablowitz and Clarkson (1991), Dubrovin (1981), and Novikov et al. (1984). For discussions of other related bi-Hamiltonian equations, see Degasperis and Procesi (1999).

Camassa and Holm (1993) also discovered the remarkable peaked soliton (peakon) solutions of [1], [2] for the CH equation on the real line, given by [7] in the case b = 2. The peakons arise as solutions of [7], when $c_0 = 0$ and $\gamma = 0$ in the absence of linear dispersion. Peakons move at a speed equal to their maximum height, at which they have a sharp peak (jump in derivative). Unlike the KdV soliton, the peakon speed is independent of its width ($\alpha$). Periodic peakon solutions of CH were treated in Alber et al. (1999). There, the sharp peaks of periodic peakons were associated with billiards reflecting at the boundary of an elliptical domain. These billiard solutions for the periodic peakons arise from geodesic motion on a triaxial ellipsoid, in the limit that one of its axes shrinks to zero length.

Before Camassa and Holm (1993) derived their shallow-water equation, a class of integrable equations existed, which was later found to contain eqn [7] with b = 2. This class of integrable equations was derived using hereditary symmetries in Fokas and Fuchssteiner (1981). However, eqn [7] was not written explicitly, nor was it derived physically as a shallow-water equation and its solution properties for b = 2 were not studied before Camassa and Holm (1993). (See Fuchssteiner (1996) for an insightful history of how the shallow-water equation [7] in the integrable case with b = 2 relates to the mathematical theory of hereditary symmetries.)

Equation [7] with b = 2 was recently re-derived as a shallow-water equation by using asymptotic methods in three different approaches in Dullin et al. (2001), in Fokas and Liu (1996), and also in Johnson (2002). All three derivations used different variants of the method of asymptotic expansions for shallow-water waves in the absence of surface tension. Only the derivation in Dullin et al. (2001) used the Kodama normal-form transformations to take advantage of the nonuniqueness of the asymptotic expansion results at quadratic order.

The effects of the parameter b on the solutions of eqn [7] were investigated in Holm and Staley (2003), where b was treated as a bifurcation parameter, in the limiting case when the linear dispersion coefficients are set to $c_0 = 0$ and $\gamma = 0$. This limiting case allows several special solutions, including the peakons, in which the two nonlinear terms in eqn [7] balance each other in the "absence" of linear dispersion.

Peakons: Singular Solutions without Linear Dispersion in One Spatial Dimension

Peakons were first found as singular soliton solutions of the completely integrable CH equation. This is eqn [7] with b = 2, now rewritten in terms of the velocity as

$$u_t + c_0 u_x + 3 u u_x + \gamma u_{xxx} = \alpha^2 \left( u_{xxt} + 2 u_x u_{xx} + u u_{xxx} \right) \qquad [8]$$

Peakons were found in Camassa and Holm (1993) to arise in the absence of linear dispersion. That is, they arise when $c_0 = 0$ and $\gamma = 0$ in CH [8]. Specifically, peakons are the individual terms in the peaked N-soliton solution of CH [8] for its velocity

$$u(x,t) = \sum_{b=1}^{N} p_b(t)\, e^{-|x - q_b(t)|/\alpha} \qquad [9]$$

in the absence of linear dispersion. Each term in the sum is a soliton with a sharp peak at its maximum, hence the name "peakon." Expressed using its momentum, $m = (1 - \alpha^2 \partial_x^2) u$, the peakon velocity solution [9] of dispersionless CH becomes a sum over delta functions, supported on a set of points moving on the real line. Namely, the peakon velocity solution [9] implies

$$m(x,t) = 2\alpha \sum_{b=1}^{N} p_b(t)\, \delta(x - q_b(t)) \qquad [10]$$

because of the relation $(1 - \alpha^2 \partial_x^2)\, e^{-|x|/\alpha} = 2\alpha\, \delta(x)$. These solutions satisfy the b-equation [7] for any value of b, provided $c_0 = 0$ and $\gamma = 0$.

Thus, peakons are "singular momentum solutions" of the dispersionless b-equation, although
yields Hamilton's canonical equations for the dynamics of the discrete set of peakon parameters $q_a(t)$ and $p_a(t)$:

$$\dot{q}_a(t) = \frac{\partial h_N}{\partial p_a} \quad \text{and} \quad \dot{p}_a(t) = -\frac{\partial h_N}{\partial q_a} \qquad [12]$$

for a = 1, 2, ..., N, with Hamiltonian given by (Camassa and Holm 1993):

$$h_N = \frac{1}{2} \sum_{a,b=1}^{N} p_a\, p_b\, e^{-|q_a - q_b|/\alpha} \qquad [13]$$

Thus, one finds that the points $x = q_a(t)$ in the peakon solution [9] move with the flow of the fluid velocity u at those points, since $u(q_a(t), t) = \dot{q}_a(t)$. This means the $q_a(t)$ are Lagrangian coordinates. Moreover, the singular momentum solution ansatz [10] is the Lagrange-to-Euler map for an invariant manifold of the dispersionless CH equation [11]. On this finite-dimensional invariant manifold for the PDE [11], the dynamics is canonically Hamiltonian.

With Hamiltonian [13], the canonical equations [12] for the 2N canonically conjugate peakon parameters $p_a(t)$ and $q_a(t)$ were interpreted in Camassa and Holm (1993) as describing "geodesic motion" on the N-dimensional Riemannian manifold whose co-metric is $g^{ij}(\{q\}) = e^{-|q_i - q_j|/\alpha}$. Moreover, the canonical geodesic equations arising from Hamiltonian [13] comprise an integrable system for any number of peakons N. This integrable system was studied in Camassa and Holm (1993) for solutions on the real line, and in Alber et al. (1999) and McKean and Constantin (1999) and references therein, for spatially periodic solutions.

Being a completely integrable Hamiltonian soliton equation, the continuum CH equation [8] has an associated isospectral eigenvalue problem, discovered in Camassa and Holm (1993) for any values of its dispersion parameters $c_0$ and $\gamma$. Remarkably, when $c_0 = 0$ and $\gamma = 0$, this isospectral eigenvalue problem has a purely "discrete" spectrum. Moreover, in this case, each discrete eigenvalue corresponds precisely to the time-asymptotic velocity of a peakon. This discreteness of the CH isospectrum in the absence of linear dispersion implies that only the singular peakon solutions [10] emerge asymptotically in time, in the solution of the initial-value problem for the dispersionless CH equation [11]. This is borne out in numerical simulations of the dispersionless CH equation [11], starting from a smooth initial distribution of velocity (Fringer and Holm 2001, Holm and Staley 2003).

Figure 1 shows the emergence of peakons from an initially Gaussian velocity distribution and their subsequent elastic collisions in a periodic one-dimensional domain. This figure demonstrates that singular solutions dominate the initial-value problem and, thus, that it is imperative to go beyond smooth solutions for the CH equation; the situation is similar for the EPDiff equation.

Peakons as Mechanical Systems

Being governed by canonical Hamiltonian equations, each N-peakon solution can be associated with a mechanical system of moving particles. Calogero (1995) further extended the class of mechanical systems of this type. The r-matrix approach was applied to the Lax pair formulation of the N-peakon system for CH by Ragnisco and Bruschi (1996), who also pointed out the connection
of this system with the classical Toda lattice. A discrete version of the Adler–Kostant–Symes factorization method was used by Suris (1996) to study a discretization of the peakon lattice, realized as a discrete integrable system on a certain Poisson submanifold of gl(N) equipped with an r-matrix Poisson bracket. Beals et al. (1999) used the Stieltjes theorem on continued fractions and the classical moment problem for studying multipeakon solutions of the CH equation. Generalized peakon systems are described for any simple Lie algebra by Alber et al. (1999).

Pulsons: Generalizing the Peakon Solutions of the Dispersionless b-Equation for Other Green's Functions

The Hamiltonian $h_N$ in eqn [13] depends on the Green's function for the relation between velocity u and momentum m. However, the singular momentum solution ansatz [10] is "independent" of this Green's function. Thus, as discovered in Fringer and Holm (2001), the singular momentum solution ansatz [10] for the dispersionless equation

$$m_t + u m_x + 2 m u_x = 0, \quad \text{with } u = g * m \qquad [14]$$

provides an invariant manifold on which canonical Hamiltonian dynamics occurs, for any choice of the Green's function g relating velocity u and momentum m by the convolution $u = g * m$.

The fluid velocity solutions corresponding to the singular momentum ansatz [10] for eqn [14] are the "pulsons". Pulsons are given by the sum over N velocity profiles determined by the Green's function g, as

$$u(x,t) = \sum_{a=1}^{N} p_a(t)\, g(x, q_a(t)) \qquad [15]$$

Again for [14], the singular momentum ansatz [10] results in a finite-dimensional invariant manifold of solutions, whose dynamics is canonically Hamiltonian. The Hamiltonian for the canonical dynamics of the 2N parameters $p_a(t)$ and $q_a(t)$ in the "pulson" solutions [15] of eqn [14] is

$$h_N = \frac{1}{2} \sum_{a,b=1}^{N} p_a\, p_b\, g(q_a, q_b) \qquad [16]$$

Again, for the pulsons, the canonical equations for the invariant manifold of singular momentum solutions provide a phase-space description of geodesic motion, this time with respect to the co-metric given by the Green's function g. Mathematical analysis and numerical results for the dynamics of these pulson solutions are given in Fringer and Holm (2001). These results describe how the collisions of pulsons [15] depend upon their shape.

Compactons in the $1/\alpha^2 \to 0$ Limit of CH

As mentioned earlier, in the limit that $\alpha^2 \to 0$, the CH equation [8] becomes the KdV equation. In contrast, when $1/\alpha^2 \to 0$, CH becomes the Hunter–Zheng equation (Hunter and Zheng 1994):

$$(u_t + u u_x)_{xx} = \tfrac{1}{2}\,(u_x^2)_x$$

This equation has "compacton" solutions, whose collision dynamics was studied numerically and put into the present context in Fringer and Holm (2001). The corresponding Green's function satisfies $-\partial_x^2 g(x) = 2\delta(x)$, so it has the triangular shape, $g(x) = 1 - |x|$ for $|x| < 1$, and vanishes otherwise, for $|x| \ge 1$. That is, the Green's function in this case has compact support, hence the name "compactons" for these pulson solutions, which as a limit of the integrable CH equations are true solitons, solvable by IST.

Pulson Solutions of the Dispersionless b-Equation

Holm and Staley (2003) give the pulson solutions of the traveling-wave problem and their elastic collision properties for the dispersionless b-equation:

$$m_t + u m_x + b\, m u_x = 0, \quad \text{with } u = g * m \qquad [17]$$

with any (symmetric) Green's function g and for any value of the parameter b. Numerically, pulsons and peakons are both found to be stable for b > 1 (Holm and Staley 2003). The reduction to "noncanonical" Hamiltonian dynamics for the invariant manifold of singular momentum solutions [10] of the other integrable case b = 3 with peakon Green's function $g(x, y) = e^{-|x - y|/\alpha}$ is found in Degasperis and Procesi (1999) and Degasperis et al. (2002).

Euler–Poincaré Theory in More Dimensions

Generalizing the Peakon Solutions of the CH Equation to Higher Dimensions

In Holm and Staley (2003), weakly nonlinear analysis and the assumption of columnar motion in the variational principle for Euler's equations are found to produce the two-dimensional generalization of the dispersionless CH equation [11]. This generalization is the EP equation (Holm et al. 1998a, b) for the Lagrangian consisting of the kinetic energy:

$$\ell = \frac{1}{2} \int \left[ |\mathbf{u}|^2 + \alpha^2 (\operatorname{div} \mathbf{u})^2 \right] dx\, dy \qquad [18]$$

in which the fluid velocity $\mathbf{u}$ is a two-dimensional vector. Evolution generated by kinetic energy in
Hamilton's principle results in geodesic motion, with respect to the velocity norm $\|\mathbf{u}\|$, which is provided by the kinetic-energy Lagrangian. For ideal incompressible fluids governed by Euler's equations, the importance of geodesic flow was recognized by Arnol'd (1966) for the $L^2$ norm of the fluid velocity. The EP equation generated by any choice of kinetic-energy norm without imposing incompressibility is called "EPDiff," for "Euler–Poincaré equation for geodesic motion on the diffeomorphisms." EPDiff is given by (Holm et al. 1998a):

$$\left( \frac{\partial}{\partial t} + \mathbf{u} \cdot \nabla \right) \mathbf{m} + \nabla \mathbf{u}^T \cdot \mathbf{m} + \mathbf{m}\, (\operatorname{div} \mathbf{u}) = 0 \qquad [19]$$

with momentum density $\mathbf{m} = \delta\ell/\delta\mathbf{u}$, where $\ell = (1/2)\|\mathbf{u}\|^2$ is given by the kinetic energy, which defines a norm in the fluid velocity $\|\mathbf{u}\|$, yet to be determined. By design, this equation has no contribution from either potential energy or pressure. It conserves the velocity norm $\|\mathbf{u}\|$ given by the kinetic energy. Its evolution describes geodesic motion on the diffeomorphisms with respect to this norm (Holm et al. 1998a).

An alternative way of writing the EPDiff equation [19] in either two or three dimensions is

$$\frac{\partial \mathbf{m}}{\partial t} - \mathbf{u} \times \operatorname{curl} \mathbf{m} + \nabla (\mathbf{u} \cdot \mathbf{m}) + \mathbf{m}\, (\operatorname{div} \mathbf{u}) = 0 \qquad [20]$$

This form of EPDiff involves all three differential operators: curl, gradient, and divergence. For the kinetic-energy Lagrangian $\ell$ given in [18], which is a norm for "irrotational" flow (with $\operatorname{curl} \mathbf{u} = 0$), we have the EPDiff equation [19] with momentum $\mathbf{m} = \delta\ell/\delta\mathbf{u} = \mathbf{u} - \alpha^2 \nabla(\operatorname{div} \mathbf{u})$.

EPDiff [19] may also be written intrinsically as

$$\frac{\partial}{\partial t} \frac{\delta \ell}{\delta \mathbf{u}} = -\operatorname{ad}^*_{\mathbf{u}} \frac{\delta \ell}{\delta \mathbf{u}} \qquad [21]$$

where $\operatorname{ad}^*$ is the $L^2$ dual of the ad-operation (commutator) for vector fields (see Arnol'd and Khesin (1998) and Marsden and Ratiu (1999) for additional discussions of the beautiful geometry underlying this equation).

Reduction to the Dispersionless CH Equation in One Dimension

In one dimension, the EPDiff equations [19]–[21] with Lagrangian $\ell$ given in [18] simplify to the dispersionless CH equation [11]. The dispersionless limit of the CH equation appears, because potential energy and pressure have been ignored.

Strengthening the Kinetic-Energy Norm to Allow for Circulation

The kinetic-energy Lagrangian [18] is a norm for irrotational flow, with $\operatorname{curl} \mathbf{u} = 0$. However, inclusion of rotational flow requires the kinetic-energy norm to be strengthened to the $H^1$ norm of the velocity, defined as

$$\ell = \frac{1}{2} \int \left[ |\mathbf{u}|^2 + \alpha^2 (\operatorname{div} \mathbf{u})^2 + \alpha^2 (\operatorname{curl} \mathbf{u})^2 \right] dx\, dy = \frac{1}{2} \int \left[ |\mathbf{u}|^2 + \alpha^2 |\nabla \mathbf{u}|^2 \right] dx\, dy = \frac{1}{2} \|\mathbf{u}\|^2_{H^1} \qquad [22]$$

Here, we assume boundary conditions that give no contributions upon integrating by parts. The corresponding EPDiff equation is [19] with $\mathbf{m} \equiv \delta\ell/\delta\mathbf{u} = \mathbf{u} - \alpha^2 \Delta \mathbf{u}$. This expression involves inversion of the familiar Helmholtz operator in the (nonlocal) relation between fluid velocity and momentum density. The $H^1$ norm $\|\mathbf{u}\|^2_{H^1}$ for the kinetic energy [22] also arises in three dimensions for turbulence modeling based on Lagrangian averaging and using Taylor's hypothesis that the turbulent fluctuations are "frozen" into the Lagrangian mean flow (Foias et al. 2001).

Generalizing the CH Peakon Solutions to n Dimensions

Building on the peakon solutions [9] for the CH equation and the pulsons [15] for its generalization to other traveling-wave shapes in Fringer and Holm (2001), Holm and Staley (2003) introduced the following measure-valued singular momentum solution ansatz for the n-dimensional solutions of the EPDiff equation [19]:

$$\mathbf{m}(\mathbf{x}, t) = \sum_{a=1}^{N} \int \mathbf{P}^a(s, t)\, \delta(\mathbf{x} - \mathbf{Q}^a(s, t))\, ds \qquad [23]$$

These singular momentum solutions, called "diffeons," are vector density functions supported in $\mathbb{R}^n$ on a set of N surfaces (or curves) of codimension $(n - k)$ for $s \in \mathbb{R}^k$ with $k < n$. They may, for example, be supported on sets of points (vector peakons, k = 0), one-dimensional filaments (strings, k = 1), or two-dimensional surfaces (sheets, k = 2) in three dimensions.

Figure 2 shows the results for the EPDiff equation when a straight peakon segment of finite length is created initially moving rightward (East). Because of propagation along the segment in adjusting to the condition of zero speed at its ends and finite speed in its interior, the initially straight segment expands outward as it propagates and curves into a peakon "bubble." Figure 3 shows an initially straight segment whose velocity distribution is exponential in the transverse
Figure 2 A peakon segment of finite length is initially moving rightward (east). Because its speed vanishes at its ends and it has fully two-dimensional spatial dependence, it expands into a peakon "bubble" as it propagates. (The various shades indicate different speeds. Any transverse slice will show a wave profile with a maximum at the center of the wave, which falls exponentially with distance away from the center.)

Figure 3 An initially straight segment of velocity distribution whose exponential profile is wider than the width $\alpha$ for the peakon solution breaks up into a train of curved peakon "bubbles," each of width $\alpha$. This example illustrates the emergent property of the peakon solutions in two dimensions.
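The canonical peakon equations [12] with Hamiltonian [13], described earlier, can be explored numerically. The following is a minimal sketch, not the method used in the cited papers: it assumes $\alpha = 1$ by default, uses a classical fourth-order Runge–Kutta step, and all function names (`hamiltonian`, `rhs`, `rk4_step`, `integrate`) are hypothetical names chosen for illustration. The right-hand side uses $\dot{q}_a = \sum_b p_b e^{-|q_a - q_b|/\alpha}$ and $\dot{p}_a = (p_a/\alpha) \sum_b p_b\, \mathrm{sgn}(q_a - q_b)\, e^{-|q_a - q_b|/\alpha}$, which follow from differentiating [13].

```python
import math


def sgn(x):
    """Sign function with sgn(0) = 0, so a peakon exerts no force on itself."""
    return (x > 0) - (x < 0)


def hamiltonian(q, p, alpha=1.0):
    """Peakon Hamiltonian h_N of eqn [13]."""
    n = len(q)
    return 0.5 * sum(
        p[a] * p[b] * math.exp(-abs(q[a] - q[b]) / alpha)
        for a in range(n)
        for b in range(n)
    )


def rhs(q, p, alpha=1.0):
    """Right-hand sides of the canonical equations [12]."""
    n = len(q)
    dq = [
        sum(p[b] * math.exp(-abs(q[a] - q[b]) / alpha) for b in range(n))
        for a in range(n)
    ]
    dp = [
        p[a] / alpha * sum(
            p[b] * sgn(q[a] - q[b]) * math.exp(-abs(q[a] - q[b]) / alpha)
            for b in range(n)
        )
        for a in range(n)
    ]
    return dq, dp


def rk4_step(q, p, dt, alpha=1.0):
    """One classical Runge-Kutta step (an illustrative integrator choice)."""
    def shifted(base, slope, h):
        return [x + h * s for x, s in zip(base, slope)]

    k1q, k1p = rhs(q, p, alpha)
    k2q, k2p = rhs(shifted(q, k1q, dt / 2), shifted(p, k1p, dt / 2), alpha)
    k3q, k3p = rhs(shifted(q, k2q, dt / 2), shifted(p, k2p, dt / 2), alpha)
    k4q, k4p = rhs(shifted(q, k3q, dt), shifted(p, k3p, dt), alpha)
    q_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(q, k1q, k2q, k3q, k4q)]
    p_new = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1p, k2p, k3p, k4p)]
    return q_new, p_new


def integrate(q, p, dt, steps, alpha=1.0):
    """Advance the peakon positions q and momenta p by `steps` RK4 steps."""
    for _ in range(steps):
        q, p = rk4_step(q, p, dt, alpha)
    return q, p
```

For a single peakon the sketch reproduces the statement that a peakon moves at a speed equal to its height: `integrate([0.0], [2.0], dt=0.01, steps=100)` leaves the momentum unchanged and advances the position at speed 2. For two peakons, comparing `hamiltonian` before and after the run gives a simple check that the discretization respects the conserved energy to within the integrator's accuracy.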
Figure 4 A single collision is shown involving reconnection as the faster peakon segment initially moving southeast along the diagonal expands, curves, and obliquely overtakes the slower peakon segment initially moving rightward (east). This reconnection illustrates one of the collision rules for the strongly two-dimensional EPDiff flow.

Further Reading

Ablowitz MJ and Clarkson PA (1991) Solitons, Nonlinear Evolution Equations and Inverse Scattering. Cambridge: Cambridge University Press.
Ablowitz MJ and Segur H (1981) Solitons and the Inverse Scattering Transform. Philadelphia: SIAM.
Alber M, Camassa R, Fedorov Y, Holm D, and Marsden JE (1999) On billiard solutions of nonlinear PDEs. Physics Letters A 264: 171–178.
Alber M, Camassa R, Fedorov Y, Holm D, and Marsden JE (2001) The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE's of shallow water and Dym type. Communications in Mathematical Physics 221: 197–227.
Alber M, Camassa R, Holm D, and Marsden JE (1994) The geometry of peaked solitons and billiard solutions of a class of integrable PDEs. Letters in Mathematical Physics 32: 137–151.
Alber MS, Camassa R, and Gekhtman M (2000) On billiard weak solutions of nonlinear PDE's and Toda flows. CRM Proceedings and Lecture Notes 25: 1–11.
Arnol'd VI (1966) Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l'hydrodynamique des fluides parfaits. Annales de l'Institut Fourier, Grenoble 16: 319–361.
Arnol'd VI and Khesin BA (1998) Topological Methods in Hydrodynamics. New York: Springer.
Beals R, Sattinger DH, and Szmigielski J (1999) Multi-peakons and a theorem of Stieltjes. Inverse Problems 15: L1–4.
Beals R, Sattinger DH, and Szmigielski J (2000) Multipeakons and the classical moment problem. Advances in Mathematics 154: 229–257.
Beals R, Sattinger DH, and Szmigielski J (2001) Peakons, strings, and the finite Toda lattice. Communications on Pure and Applied Mathematics 54: 91–106.
Calogero F (1995) An integrable Hamiltonian system. Physics Letters A 201: 306–310.
Calogero F and Francoise J-P (1996) A completely integrable Hamiltonian system. Journal of Mathematics 37: 2863–2871.
Camassa R and Holm DD (1993) An integrable shallow water equation with peaked solitons. Physical Review Letters 71: 1661–1664.
Camassa R, Holm DD, and Hyman JM (1994) A new integrable shallow water equation. Advances in Applied Mechanics 31: 1–33.
Constantin A and Strauss W (2000) Stability of peakons. Communications on Pure and Applied Mathematics 53: 603–610.
Degasperis A and Procesi M (1999) Asymptotic integrability. In: Degasperis A and Gaeta G (eds.) Symmetry and Perturbation Theory, pp. 23–37. Singapore: World Scientific.
Degasperis A, Holm DD, and Hone ANW (2002) A new integrable equation with peakon solutions. Theoretical and Mathematical Physics 133: 1463–1474.
Dubrovin B (1981) Theta functions and nonlinear equations. Russian Mathematical Surveys 36: 11–92.
Dubrovin BA, Novikov SP, and Krichever IM (1985) Integrable systems. I. Itogi Nauki i Tekhniki. Sovr. Probl. Mat. Fund. Naprav. 4. VINITI (Moscow) (Engl. transl. (1989) Encyclopaedia of Mathematical Sciences, vol. 4. Berlin: Springer).
Dullin HR, Gottwald GA, and Holm DD (2001) An integrable shallow water equation with linear and nonlinear dispersion. Physical Review Letters 87: 194501–04.
Dullin HR, Gottwald GA, and Holm DD (2003) Camassa–Holm, Korteweg–de Vries-5 and other asymptotically equivalent equations for shallow water waves. Fluid Dynamics Research 33: 73–95.
Dullin HR, Gottwald GA, and Holm DD (2004) On asymptotically equivalent shallow water wave equations. Physica D 190: 1–14.
Foias C, Holm DD, and Titi ES (2001) The Navier–Stokes-alpha model of fluid turbulence. Physica D 152: 505–519.
Fokas AS and Fuchssteiner B (1981) Bäcklund transformations for hereditary symmetries. Nonlinear Analysis Transactions of the American Mathematical Society 5: 423–432.
Fokas AS and Liu QM (1996) Asymptotic integrability of water waves. Physical Review Letters 77: 2347–2351.
Fuchssteiner B (1996) Some tricks from the symmetry-toolbox for nonlinear equations: generalization of the Camassa–Holm equation. Physica D 95: 229–243.
Fringer O and Holm DD (2001) Integrable vs. nonintegrable geodesic soliton behavior. Physica D 150: 237–263.
Holm DD (2005) The Euler–Poincaré variational framework for modeling fluid dynamics. In: Montaldi J and Ratiu T (eds.) Geometric Mechanics and Symmetry: The Peyresq Lectures, pp. 157–209. London Mathematical Society Lecture Notes Series 306. Cambridge: Cambridge University Press.
Holm DD and Marsden JE (2004) Momentum maps and measure-valued solutions (peakons, filaments and sheets) for the EPDiff equation. In: Marsden JE and Ratiu TS (eds.) The Breadth of Symplectic and Poisson Geometry, A Festschrift for Alan Weinstein, pp. 203–235, Progress in Mathematics, vol. 232. Boston: Birkhäuser.
Holm DD, Marsden JE, and Ratiu TS (1998a) The Euler–Poincaré equations and semidirect products with applications to continuum theories. Advances in Mathematics 137: 1–81.
Holm DD, Marsden JE, and Ratiu TS (1998b) Euler–Poincaré models of ideal fluids with nonlinear dispersion. Physical Review Letters 349: 4173–4177.
Holm DD and Staley MF (2003) Nonlinear balance and exchange of stability in dynamics of solitons, peakons, ramps/cliffs and leftons in a 1+1 nonlinear evolutionary PDE. Physics Letters A 308: 437–444.
Holm DD and Staley MF (2003) Wave structures and nonlinear balances in a family of evolutionary PDEs. SIAM Journal of Applied Dynamical Systems 2(3): 323–380.
Holm DD and Staley MF (2004) Interaction dynamics of singular wave fronts in nonlinear evolutionary fluid equations (in preparation).
Hunter JK and Zheng Y (1994) On a completely integrable hyperbolic variational equation. Physica D 79: 361–386.
Johnson RS (2002) Camassa–Holm, Korteweg–de Vries models for water waves. Journal of Fluid Mechanics 455: 63–82.
Kodama Y (1985) On integrable systems with higher order corrections. Physics Letters A 107: 245–249.
Kodama Y (1985a) Normal forms for weakly dispersive wave equations. Physics Letters A 112: 193–196.
Kodama Y (1987) On solitary-wave interaction. Physics Letters A 123: 276–282.
Liu AK, Chang YS, Hsu M-K, and Liang NK (1998) Evolution of nonlinear internal waves in the east and south China Seas. Journal of Geophysical Research 103: 7995–8008.
Marsden JE and Ratiu TS (1999) Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, 2nd edn. vol. 17, Berlin: Springer.
McKean HP and Constantin A (1999) A shallow water equation on the circle. Communications on Pure and Applied Mathematics 52: 949–982.
Mikhailov AV and Novikov VS (2002) Perturbative symmetry approach. Journal of Physics A 35: 4775–4790.
Novikov SP, Manakov SV, Pitaevski LP, and Zakharov VE (1984) Theory of Solitons. The Inverse Scattering Method, Contemporary Soviet Mathematics. Consultants Bureau (translated from Russian). New York: Plenum.
Ragnisco O and Bruschi M (1996) Peakons, r-matrix and Toda lattice. Physica A 228: 150–159.
Suris YB (1996) A discrete time peakons lattice. Physics Letters A 217: 321–329.
Percolation Theory
V Beffara, Ecole Normale Supérieure de Lyon, Lyon, France
V Sidoravicius, IMPA, Rio de Janeiro, Brazil
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Percolation as a mathematical theory was introduced by Broadbent and Hammersley (1957), as a stochastic way of modeling the flow of a fluid or gas through a porous medium of small channels which may or may not let gas or fluid pass. It is one of the simplest models exhibiting a phase transition, and the occurrence of a critical phenomenon is central to the appeal of percolation. Having truly applied origins, percolation has been used to model the fingering and spreading of oil in water, to estimate whether one can build nondefective integrated circuits, and to model the spread of infections and forest fires. From a mathematical point of view, percolation is attractive because it exhibits relations between probabilistic and algebraic/topological properties of graphs.

To make the mathematical construction of such a system of channels, take a graph G (which originally was taken as $\mathbb{Z}^d$), with vertex set V and edge set E, and make all the edges independently open (or passable) with probability p or closed (or blocked) with probability $1 - p$. Write $P_p$ for the corresponding probability measure on the set of configurations of open and closed edges; that model is called bond percolation. The collection of open edges thus forms a random subgraph of G, and the original question stated by Broadbent was whether the connected component of the origin in that subgraph is finite or infinite.

A path on G is a sequence $v_1, v_2, \ldots$ of vertices of G, such that for all $i \ge 1$, $v_i$ and $v_{i+1}$ are adjacent on G. A path is called open if all the edges $\{v_i, v_{i+1}\}$ between successive vertices are open. The infiniteness of the cluster of the origin is equivalent to the existence of an unbounded open path starting from the origin.

There is an analogous model, called "site percolation," in which all edges are assumed to be passable, but the vertices are independently open or closed with probability p or $1 - p$, respectively. An open path is then a path along which all vertices are open. Site percolation is more general than bond percolation in the sense that the existence of a path for bond percolation on a graph G is equivalent to the existence of a path for site percolation on the covering graph of G. However, site percolation on a given graph may not be equivalent to bond percolation on any other graph.

All graphs under consideration will be assumed to be connected, locally finite and quasitransitive. If $A, B \subset V$, then $A \leftrightarrow B$ means that there exists an open path from some vertex of A to some vertex of B; by a slight abuse of notation, $u \leftrightarrow v$ will stand for the existence of a path between sites u and v, that is, the event $\{u\} \leftrightarrow \{v\}$. The open cluster C(v) of the vertex v is the set of all open vertices which are connected to v by an open path:

$$C(v) = \{ u \in V : u \leftrightarrow v \}$$

The central quantity of the percolation theory is the percolation probability:

$$\theta(p) := P_p\{ 0 \leftrightarrow \infty \} = P_p\{ |C(0)| = \infty \}$$

The most important property of the percolation model is that it exhibits a phase transition, that is, there exists a threshold value $p_c \in [0, 1]$, such that the global behavior of the system is substantially different in the two regions $p < p_c$ and $p > p_c$. To make this precise, observe that $\theta$ is a nondecreasing function. This can be seen using Hammersley's joint construction of percolation systems for all $p \in [0, 1]$ on G: let $\{U(v), v \in V\}$ be independent random variables, uniform in [0, 1]. Declare v to be p-open if $U(v) \le p$, otherwise it is declared p-closed. The configuration of p-open vertices has the distribution $P_p$ for each $p \in [0, 1]$. The collection of p-open vertices is nondecreasing in p, and therefore $\theta(p)$ is nondecreasing as well. Clearly, $\theta(0) = 0$ and $\theta(1) = 1$ (Figure 1).

Figure 1 The behavior of $\theta(p)$ around the critical point (for bond percolation). Axes: p from 0 to 1, with $p_c$ marked, against $\theta(p)$ from 0 to 1.
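Hammersley's joint construction described above lends itself to a short simulation. The sketch below is an illustration under stated assumptions, not part of the original article: it works on the finite box $\{-n, \ldots, n\}^2$ of $\mathbb{Z}^2$ with site percolation, uses "the origin is joined to the boundary of the box by p-open sites" as a finite-volume stand-in for $\{0 \leftrightarrow \infty\}$, and the function names are hypothetical. Because every value of p is evaluated on the same uniforms $U(v)$, the estimated curve is nondecreasing in p for every sample, exactly as the coupling argument predicts.

```python
import random
from collections import deque


def sample_uniforms(n, rng):
    """Draw one independent uniform U(v) for each site v of the box {-n..n}^2."""
    return {(x, y): rng.random()
            for x in range(-n, n + 1)
            for y in range(-n, n + 1)}


def origin_reaches_boundary(U, n, p):
    """True if a path of p-open sites (U(v) <= p) joins the origin to the box boundary."""
    if U[(0, 0)] > p:
        return False
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        if max(abs(x), abs(y)) == n:
            return True
        for v in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if v in U and v not in seen and U[v] <= p:
                seen.add(v)
                queue.append(v)
    return False


def theta_estimates(n, ps, trials, seed=0):
    """Estimate P_p{origin reaches the boundary} for each p in ps, reusing the same U per trial."""
    rng = random.Random(seed)
    hits = [0] * len(ps)
    for _ in range(trials):
        U = sample_uniforms(n, rng)
        for i, p in enumerate(ps):
            hits[i] += origin_reaches_boundary(U, n, p)
    return [h / trials for h in hits]
```

Calling `theta_estimates(20, [0.3, 0.5, 0.7, 0.9], trials=200)` returns a list that never decreases along increasing p. Sampling each p independently would give the same expected values but would lose this sample-by-sample monotonicity, which is the point of the joint construction.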
The critical probability is defined as

$$p_c := p_c(G) = \sup\{ p : \theta(p) = 0 \}$$

By definition, when $p < p_c$, the open cluster of the origin is $P_p$-a.s. finite; hence, all the clusters are also finite. On the other hand, for $p > p_c$ there is a strictly positive $P_p$-probability that the cluster of the origin is infinite. Thus, from Kolmogorov's zero–one law it follows that

$$P_p\{ |C(v)| = \infty \text{ for some } v \in V \} = 1 \quad \text{for } p > p_c$$

Therefore, if the intervals $[0, p_c)$ and $(p_c, 1]$ are both nonempty, there is a phase transition at $p_c$.

Using a so-called Peierls argument it is easy to see that $p_c(G) > 0$ for any graph G of bounded degree. On the other hand, Hammersley proved that $p_c(\mathbb{Z}^d) < 1$ for bond percolation as soon as $d \ge 2$, and a similar argument works for site percolation and various periodic graphs as well. But for some graphs G, it is not so easy to show that $p_c(G) < 1$. One says that the system is in the subcritical (resp. supercritical) phase if $p < p_c$ (resp. $p > p_c$).

It was one of the most remarkable moments in the history of percolation when Kesten (1980) proved, based on results by Harris, Russo, Seymour and Welsh, that the critical parameter for bond percolation on $\mathbb{Z}^2$ is equal to 1/2. Nevertheless, the exact value of $p_c(G)$ is known only for a handful of graphs, all of them periodic and two dimensional (see below).

Percolation in $\mathbb{Z}^d$

The graph on which most of the theory was originally built is the cubic lattice $\mathbb{Z}^d$, and it was not before the late twentieth century that percolation was seriously considered on other kinds of graphs (such as Cayley graphs), on which specific phenomena can appear, such as the coexistence of multiple infinite clusters for some values of the parameter p. In this section, the underlying graph is thus assumed to be $\mathbb{Z}^d$ for $d \ge 2$, although most of the results still hold in the case of a periodic d-dimensional lattice.

The Subcritical Regime

When $p < p_c$, all open clusters are finite almost surely. One of the greatest challenges in percolation theory has been to prove that $\chi(p) := E_p\{|C(v)|\}$ is finite if $p < p_c$ ($E_p$ stands for the expectation with respect to $P_p$). For that one can define another critical probability as the threshold value for the finiteness of the expected cluster size of a fixed vertex:

$$p_T(G) := \sup\{ p : \chi(p) < \infty \}$$

It was an important step in the development of the theory to show that $p_T(G) = p_c(G)$. The fundamental estimate in the subcritical regime, which is a much stronger statement than $p_T(G) = p_c(G)$, is the following:

Theorem 1 (Aizenman and Barsky, Menshikov). Assume that G is periodic. Then for $p < p_c$ there exist constants $0 < C_1, C_2 < \infty$, such that

$$P_p\{ |C(v)| \ge n \} \le C_1 e^{-C_2 n}$$

The last statement can be sharpened to a "local limit theorem" with the help of a subadditivity argument: for each $p < p_c$, there exists a constant $0 < C_3(p) < \infty$, such that

$$\lim_{n \to \infty} -\frac{1}{n} \log P_p\{ |C(v)| = n \} = C_3(p)$$

The Supercritical Regime

Once an infinite open cluster exists, it is natural to ask what it looks like, and how many infinite open clusters exist. It was shown by Newman and Schulman that for periodic graphs, for each p, exactly one of the following three situations prevails: if $N \in \mathbb{Z}_+ \cup \{\infty\}$ is the number of infinite open clusters, then $P_p(N = 0) = 1$, or $P_p(N = 1) = 1$, or $P_p(N = \infty) = 1$.

Aizenman, Kesten, and Newman showed that the third case is impossible on $\mathbb{Z}^d$. By now several proofs exist, perhaps the most elegant of which is due to Burton and Keane, who prove that indeed there cannot be infinitely many infinite open clusters on any amenable graph. However, there are some graphs, such as regular trees, on which coexistence of several infinite clusters is possible.

The geometry of the infinite open cluster can be explored in some depth by studying the behavior of a random walk on it. When d = 2, the random walk is recurrent, and when $d \ge 3$ it is a.s. transient. In all dimensions $d \ge 2$, the walk behaves diffusively, and the "central limit theorem" and the "invariance principle" were established in both the annealed and quenched cases.

Wulff droplets. In the supercritical regime, aside from the infinite open cluster, the configuration contains finite clusters of arbitrarily large sizes. These large finite open clusters can be thought of as droplets swimming in the areas surrounded by an infinite open cluster. The presence at a particular location of a large finite cluster is an event of low probability, namely, on $\mathbb{Z}^d$, $d \ge 2$, for $p > p_c$, there exist positive constants $0 < C_4(p), C_5(p) < \infty$, such that

$$C_4(p) \le -\frac{1}{n^{(d-1)/d}} \log P_p\{ |C(v)| = n \} \le C_5(p)$$
for all large $n$. This estimate is based on the fact that the occurrence of a large finite cluster is due to a surface effect. The typical structure of the large finite cluster is described by the following theorem:

Theorem 2 Let $d \ge 2$, and $p > p_c$. There exists a bounded, closed, convex subset $W$ of $\mathbb{R}^d$ containing the origin, called the normalized Wulff crystal of the Bernoulli percolation model, such that, under the conditional probability $P_p\{\,\cdot \mid n^d \le |C(0)| < \infty\}$, the random measure

$$\frac{1}{n^d} \sum_{x \in C(0)} \delta_{x/n}$$

(where $\delta_x$ denotes a Dirac mass at $x$) converges weakly in probability toward the random measure $\theta(p)\,1_W(x - M)\,dx$ (where $M$ is the rescaled center of mass of the cluster $C(0)$). The deviation probabilities behave as $\exp\{-c\,n^{d-1}\}$ (i.e., they exhibit large deviations of surface order; in dimensions 4 and more it holds up to re-centering).

This result was proved in dimension 2 by Alexander et al. (1990), and in dimensions 3 and more by Cerf (2000).

Percolation Near the Critical Point

Percolation in Slabs The main macroscopic observable in percolation is $\theta(p)$, which is positive above $p_c$, 0 below $p_c$, and continuous on $[0, 1] \setminus \{p_c\}$. Continuity at $p_c$ is an open question in the general case; it is known to hold in two dimensions (cf. below) and in high enough dimension (at the moment $d \ge 19$ though the value of the critical dimension is believed to be 6) using lace expansion methods. The conjecture that $\theta(p_c) = 0$ for $3 \le d \le 18$ remains one of the major open problems.

Efforts to prove that led to some interesting and important results. Barsky, Grimmett, and Newman solved the question in the half-space case, and simultaneously showed that the slab percolation and half-space percolation thresholds coincide. This was complemented by Grimmett and Marstrand showing that

$$p_c(\mathrm{slab}) = p_c(\mathbb{Z}^d)$$

Critical exponents In the subcritical regime, exponential decay of the correlation indicates that there is a finite correlation length $\xi(p)$ associated to the system, and defined (up to constants) by the relation

$$P_p(0 \leftrightarrow nx) \approx \exp\left(-\frac{n\,\varphi(x)}{\xi(p)}\right)$$

where $\varphi$ is bounded on the unit sphere (this is known as Ornstein–Zernike decay). The phase transition can then also be defined in terms of the divergence of the correlation length, leading again to the same value for $p_c$; the behavior at or near the critical point then has no characteristic length, and gives rise to scaling exponents (conjecturally in most cases).

The most usual critical exponents are defined as follows, if $\theta(p)$ is the percolation probability, $C$ the cluster of the origin, and $\xi(p)$ the correlation length:

$$\frac{\partial^3}{\partial p^3} E_p\left[|C|^{-1}\right] \approx |p - p_c|^{-1-\alpha}$$
$$\theta(p) \approx (p - p_c)_+^{\beta}$$
$$\chi^f(p) := E_p\left[|C|\,1_{|C| < \infty}\right] \approx |p - p_c|^{-\gamma}$$
$$P_{p_c}\left[|C| = n\right] \approx n^{-1-1/\delta}$$
$$P_{p_c}\left[x \in C\right] \approx |x|^{2-d-\eta}$$
$$\xi(p) \approx |p - p_c|^{-\nu}$$
$$P_{p_c}\left[\operatorname{diam}(C) = n\right] \approx n^{-1-1/\rho}$$
$$\frac{E_p\left[|C|^{k+1}\,1_{|C| < \infty}\right]}{E_p\left[|C|^{k}\,1_{|C| < \infty}\right]} \approx |p - p_c|^{-\Delta}$$

These exponents are all expected to be universal, that is, to depend only on the dimension of the lattice, although this is not well understood at the mathematical level; the following scaling relations between the exponents are believed to hold:

$$2 - \alpha = \gamma + 2\beta = \beta(\delta + 1); \qquad \Delta = \delta\beta; \qquad \gamma = \nu(2 - \eta)$$

In addition, in dimensions up to $d_c = 6$, two additional hyperscaling relations involving $d$ are strongly conjectured to hold:

$$d\rho = \delta + 1; \qquad d\nu = 2 - \alpha$$

while above $d_c$ the exponents are believed to take their mean-field value, that is, the ones they have for percolation on a regular tree:

$$\alpha = -1; \quad \beta = 1; \quad \gamma = 1; \quad \delta = 2$$
$$\eta = 0; \quad \nu = \tfrac12; \quad \rho = \tfrac12; \quad \Delta = 2$$

Not much is known rigorously on critical exponents in the general case. Hara and Slade (1990) proved that mean field behavior does happen above dimension 19, and the proof can likely be extended to treat the case $d \ge 7$. In the two-dimensional case on the other hand, Kesten (1987) showed that, assuming that the exponents $\delta$ and $\nu$ exist, then so do $\beta$, $\gamma$, $\eta$, and $\rho$, and they satisfy the scaling and hyperscaling relations where they appear.

The incipient infinite cluster When studying long-range properties of a critical model, it is useful to have an object which is infinite at criticality, and such is not the case for percolation clusters. There are two ways to condition the cluster of the origin to
The only other critical parameters that are known exactly are $p_c^{\mathrm{bond}}(\mathcal{T}) = 2\sin(\pi/18)$ (and hence also

Figure 2 Two large critical percolation clusters in a box of the square lattice (first: bond percolation, second: site percolation).
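The exact value quoted above for bond percolation on the triangular lattice can be checked numerically. The sketch below evaluates $2\sin(\pi/18)$ and confirms that it is a root of the cubic $1 - 3p + p^3 = 0$; that cubic characterization is an assumption brought in for illustration (it is not stated in the text), and the names `P_C_BOND_TRIANGULAR` and `star_triangle_cubic` are hypothetical.

```python
import math

# Exact critical value quoted in the text for bond percolation
# on the triangular lattice.
P_C_BOND_TRIANGULAR = 2 * math.sin(math.pi / 18)


def star_triangle_cubic(p: float) -> float:
    """Left-hand side of 1 - 3p + p^3 = 0.

    Assumption: this cubic is used here only as an independent check of
    the closed form; it does not appear in the text itself.
    """
    return 1 - 3 * p + p ** 3


if __name__ == "__main__":
    print(f"p_c^bond(T)    = {P_C_BOND_TRIANGULAR:.6f}")
    print(f"cubic residual = {star_triangle_cubic(P_C_BOND_TRIANGULAR):.2e}")
```

The residual vanishes up to floating-point rounding because, with $p = 2\sin x$ and $x = \pi/18$, the triple-angle formula gives $p^3 - 3p = -2\sin 3x = -1$.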
The first passage time $a(x, y)$ between vertices $x$ and $y$ is given by

$$a(x, y) = \inf\{T(\pi) : \pi \text{ a path from } x \text{ to } y\}$$

and we can define

$$W(t) := \{x \in \mathbb{Z}^d : a(0, x) \le t\}$$

the set of vertices reached by the liquid by time $t$. It turns out that $W(t)$ grows approximately linearly as time passes, and that there exists a nonrandom limit set $B$ such that either $B$ is compact and

$$(1 - \varepsilon) B \subseteq \frac{1}{t}\,\widetilde{W}(t) \subseteq (1 + \varepsilon) B, \quad \text{eventually a.s.}$$

for all $\varepsilon > 0$, or $B = \mathbb{R}^d$, and

The analogy with percolation is strong, the corresponding percolative picture being the following: in $\mathbb{Z}_+^{d+1}$, each edge is open with probability $p \in (0, 1)$, and the question is whether there exists an infinite oriented path (i.e., a path along which the sum of the coordinates is increasing), composed of open edges. Once again, there is a critical parameter customarily denoted by $p_c$, at which no such path exists (compare this to the open question of the continuity of the function $\theta$ at $p_c$ in dimensions $3 \le d \le 18$). This variation of percolation lies in a different universality class than the usual Bernoulli model.

Invasion Percolation
For many analytic investigations, such as those which arise in renormalization theory, one is interested instead in the Green’s functions of the quantum field theory, which measure the response of the system to an external perturbation. For definiteness, let us consider a free real scalar field theory in $d + 1$ dimensions with Lagrangian density

$$\mathcal{L} = \tfrac12\,\partial_\mu \phi\,\partial^\mu \phi - \tfrac12\,m^2 \phi^2 + \mathcal{L}_{\mathrm{int}} \quad [4]$$

where $\mathcal{L}_{\mathrm{int}}$ is the interaction Lagrangian density which we assume has no derivative terms. The interaction Hamiltonian density is then given by $\mathcal{H}_{\mathrm{int}} = -\mathcal{L}_{\mathrm{int}}$. Introducing a real scalar source $J(x)$, we define the normalized ‘‘partition function’’ through the vacuum expectation values,

$$Z[J] = \frac{\langle 0|S[J]|0\rangle}{\langle 0|S[0]|0\rangle} \quad [5]$$

where $|0\rangle$ is the normalized perturbative vacuum state of the quantum field theory given by [4] (defined to be destroyed by all field annihilation operators), and

$$S[J] = T \exp\left( i \int d^{d+1}x\,\bigl(\mathcal{L}_{\mathrm{int}} + J(x)\,\phi(x)\bigr) \right) \quad [6]$$

from the Dyson formula. This partition function is the generating functional for all Green’s functions of the quantum field theory, which are obtained from [5] by taking functional derivatives with respect to the source and then setting $J(x) = 0$. Explicitly, in a formal Taylor series expansion in $J$ one has

$$Z[J] = \sum_{n=0}^{\infty} \frac{i^n}{n!} \prod_{i=1}^{n} \int d^{d+1}x_i\, J(x_i)\; G^{(n)}(x_1, \ldots, x_n) \quad [7]$$

whose coefficients are the Green’s functions

$$G^{(n)}(x_1, \ldots, x_n) := \frac{\langle 0|T\bigl[\exp\bigl(i \int d^{d+1}x\,\mathcal{L}_{\mathrm{int}}\bigr)\,\phi(x_1) \cdots \phi(x_n)\bigr]|0\rangle}{\langle 0|T \exp\bigl(i \int d^{d+1}x\,\mathcal{L}_{\mathrm{int}}\bigr)|0\rangle} \quad [8]$$

in terms of which the expansion [7] reads

$$Z[J] = \sum_{n=0}^{\infty} \frac{i^n}{n!} \prod_{i=1}^{n} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}}\, \tilde J(k_i)\; \tilde G^{(n)}(k_1, \ldots, k_n) \quad [10]$$

The generating functional [10] can be written as a sum of Feynman diagrams with source insertions. Diagrammatically, the Green’s function is an infinite series of graphs which can be represented symbolically as

$$\tilde G^{(n)}(k_1, \ldots, k_n) = [\text{diagram: a bubble with } n \text{ external lines carrying momenta } k_1, k_2, k_3, \ldots, k_n] \quad [11]$$

where the $n$ external lines denote the source insertions of momenta $k_i$ and the bubble denotes the sum over all Feynman diagrams constructed from the interaction vertices of $\mathcal{L}_{\mathrm{int}}$.

This procedure is, however, rather formal in the way that we have presented it, for a variety of reasons. First of all, by Haag’s theorem, it follows that the interaction representation of a quantum field theory does not exist unless a cutoff regularization is introduced into the interaction term in the Lagrangian density (this regularization is described explicitly below). The addition of this term breaks translation covariance. This problem can be remedied via a different definition of the regularized Green’s functions, as we discuss below. Furthermore, the perturbation series of a quantum field theory is typically divergent. The expansion into graphs is, at best, an asymptotic series which is Borel summable. These shortcomings will not be emphasized any further in this article. Some mathematically rigorous approaches to perturbative quantum field theory can be found in the bibliography.

The Green’s functions can also be used to describe scattering amplitudes, but there are two important differences between the graphs [11] and those which appear in scattering theory. In the present case, external lines carry propagators, that is, the free-field Green’s functions
is then given by the multiple on-shell residue of the Green’s function in momentum space as

$$\langle k_1', \ldots, k_n' |\, S - 1 \,| k_1, \ldots, k_l \rangle = \lim_{\substack{k_1'^2, \ldots, k_n'^2 \to m^2 \\ k_1^2, \ldots, k_l^2 \to m^2}} \prod_{i=1}^{n} \frac{1}{\sqrt{i c_i'}} \left(k_i'^2 - m^2\right) \prod_{j=1}^{l} \frac{1}{\sqrt{i c_j}} \left(k_j^2 - m^2\right) \tilde G^{(n+l)}(-k_1', \ldots, -k_n', k_1, \ldots, k_l) \quad [13]$$

where $i c_i'$, $i c_j$ are the residues of the corresponding particle poles in the exact two-point Green’s function.

This article deals with the formal development and computation of perturbative scattering amplitudes in relativistic quantum field theory, along the lines outlined above. Initially we deal only with real scalar field theories of the sort [4] in order to illustrate the concepts and technical tools in as simple and concise a fashion as possible. These techniques are common to most quantum field theories. Fermions and gauge theories are then treated separately, focusing on the methods which are particular to them.

Diagrammatics

The pinnacle of perturbation theory is the technique of Feynman diagrams. Here we develop the basic machinery in a quite general setting and use it to analyze some generic features of the terms comprising the perturbation series.

Wick’s Theorem

The Green’s functions [8] are defined in terms of vacuum expectation values of time-ordered products of the scalar field $\phi(x)$ at different spacetime points. Wick’s theorem expresses such products in terms of normal-ordered products, defined by placing each field creation operator to the left of each field annihilation operator, and in terms of two-point Green’s functions [12] of the free-field theory (propagators). The consequence of this theorem is the Hafnian formula

The formal Taylor series expansion of the scattering operator $S$ may now be succinctly summarized into a diagrammatic notation by using Wick’s theorem. For each spacetime integration $\int d^{d+1}x_i$ we introduce a vertex with label $i$, and from each vertex there emanate some lines corresponding to field insertions at the point $x_i$. If the operators represented by two lines appear in a two-point function according to [14], that is, they are contracted, then these two lines are connected together. The $S$ operator is then represented as a sum over all such Wick diagrams, bearing in mind that topologically equivalent diagrams correspond to the same term in $S$. Two diagrams are said to have the same pattern if they differ only by a permutation of their vertices. For any diagram $D$ with $n(D)$ vertices, the number of ways of interchanging vertices is $n(D)!$. The number of diagrams per pattern never exceeds this number. The symmetry number $S(D)$ of $D$ is the number of permutations of vertices that give the same diagram. The number of diagrams with the pattern of $D$ is then $n(D)!/S(D)$.

In a given pattern, we write the contribution to $S$ of a single diagram $D$ as

$$\frac{1}{n(D)!}\, :\!\theta(D)\!:$$

where the combinatorial factor comes from the Taylor expansion of $S$, the large colons denote normal ordering of quantum operators, and $:\!\theta(D)\!:$ contains spacetime integrals over normal-ordered products of the fields. Then all diagrams with the pattern of $D$ contribute $:\!\theta(D)\!:/S(D)$ to $S$. Only the connected diagrams $D_r$, $r \in \mathbb{N}$ (those in which every vertex is connected to every other vertex) contribute and we can write the scattering operator in a simple form which eliminates contributions from all disconnected diagrams as

$$S = \;:\exp\left( \sum_{r=1}^{\infty} \frac{\theta(D_r)}{S(D_r)} \right): \quad [15]$$
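The counting of diagrams per pattern described above, $n(D)!/S(D)$, can be verified by brute force on small examples. The following is a minimal sketch under simplifying assumptions: a diagram is modelled as an undirected multigraph on vertices $0, \ldots, n-1$ with no external lines or field labels, and the helper names `relabel`, `symmetry_number` and `diagrams_in_pattern` are hypothetical.

```python
from itertools import permutations
from math import factorial


def relabel(edges, perm):
    """Apply a vertex permutation to an undirected multigraph.

    `edges` is a list of vertex pairs; the result is a canonical
    (sorted) tuple so that equal diagrams compare equal.
    """
    return tuple(sorted(tuple(sorted((perm[a], perm[b]))) for a, b in edges))


def symmetry_number(n, edges):
    """Number of vertex permutations that give the same diagram, S(D)."""
    base = relabel(edges, tuple(range(n)))
    return sum(1 for perm in permutations(range(n))
               if relabel(edges, perm) == base)


def diagrams_in_pattern(n, edges):
    """Number of distinct diagrams obtained by permuting the vertices."""
    return len({relabel(edges, perm) for perm in permutations(range(n))})


if __name__ == "__main__":
    # Illustrative patterns (not taken from the text).
    examples = {
        "chain of three vertices": (3, [(0, 1), (1, 2)]),
        "triangle": (3, [(0, 1), (1, 2), (2, 0)]),
        "two vertices, three lines": (2, [(0, 1), (0, 1), (0, 1)]),
    }
    for name, (n, edges) in examples.items():
        s = symmetry_number(n, edges)
        count = diagrams_in_pattern(n, edges)
        # The enumeration must agree with n(D)!/S(D).
        assert count == factorial(n) // s
        print(f"{name}: S(D) = {s}, diagrams in pattern = {count}")
```

The enumeration is factorial in the number of vertices, so it is only meant to illustrate the counting rule on small patterns.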
$$[\text{diagram: 1PI bubble with incoming and outgoing momentum } k] =: \Sigma(k) \quad [17]$$

is called the self-energy. If $G(k)$ is the complete two-point function in momentum space, then one has

$$G(k) := [\text{diagram: full propagator}] = [\text{free line}] + [\text{line, 1PI, line}] + [\text{line, 1PI, line, 1PI, line}] + \cdots = \frac{i}{k^2 - m^2 - \Sigma(k)} \quad [18]$$

and thus it suffices to calculate only 1PI diagrams. The 1PI effective action, defined by the Legendre transformation $\Gamma[\phi] := -i \ln Z[J] - \int d^{d+1}x\, J(x)\,\phi(x)$ of [5], is the generating functional for proper vertex functions and it can be represented as a functional of only the vacuum expectation value of the field $\phi$, that is, its classical value. In the semiclassical (WKB) approximation, the one-loop effective action is given by

Consider an arbitrary proper Feynman diagram $D$ with $n$ internal lines and $v$ vertices. The number, $\ell$, of independent loops in the diagram is the number of independent internal momenta in $D$ when conservation laws at each vertex have been taken into account, and it is given by $\ell = n + 1 - v$. There is an independent momentum integration variable $k_i$ for each loop, and a propagator for each internal line as in [16]. The contribution of $D$ to a proper Green’s function with $r$ incoming external momenta $p_i$, with $\sum_{i=1}^{r} p_i = 0$, is given by

$$\tilde I_D(p) = \frac{V(D)}{S(D)} \prod_{i=1}^{n} \int \frac{d^{d+1}k_i}{(2\pi)^{d+1}}\, \frac{i}{k_i^2 - m^2 + i\epsilon}\; \prod_{j=1}^{v} (2\pi)^{d+1}\, \delta^{(d+1)}(P_j - K_j) \quad [20]$$

where $V(D)$ contains all contributions from the interaction vertices of $\mathcal{L}_{\mathrm{int}}$, and $P_j$ (resp. $K_j$) is the sum of incoming external momenta $p_{l_j}$ (resp. internal momenta $k_{l_j}$) at vertex $j$ with respect to a fixed chosen orientation of the lines of the graph. After resolving the delta-functions in terms of independent internal loop momenta $k_1, \ldots, k_\ell$ and dropping the overall momentum conservation
delta-function along with the symmetry and vertex reduces to the calculation of the parametric
factors in [20], one is left with a set of momentum integrals:
space integrals
n ðdþ1Þ‘ n Z
Y Y
‘ Z
1 ‘
Y ddþ1
ki iY
n 2 1
ID ðpÞ ¼ ½21 ID ðpÞ ¼ dj
Qi ðÞ2
ðdþ1Þ‘
theory above apply to the case of Dirac fermion fields. The Lagrangian density is

$$\mathcal{L}_F = \bar\psi\,(i\,\not\partial - m)\,\psi + \mathcal{L}' \quad [30]$$

where $\psi$ are four-component Dirac fermion fields in $3 + 1$ dimensions, $\bar\psi := \psi^\dagger \gamma^0$ and $\not\partial = \gamma^\mu \partial_\mu$ with $\gamma^\mu$ the generators of the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. The Lagrangian density $\mathcal{L}'$ contains couplings of the Dirac fields to other field theories, such as the scalar field theories considered previously.

Wick’s theorem for anticommuting Fermi fields leads to the Pfaffian formula

$$\langle 0|T[\psi(1) \cdots \psi(n)]|0\rangle = \begin{cases} 0, & n = 2k - 1 \\[4pt] \dfrac{1}{2^k k!} \displaystyle\sum_{\pi \in S_{2k}} \operatorname{sgn}(\pi) \prod_{i=1}^{k} \langle 0|T[\psi(\pi(2i-1))\,\psi(\pi(2i))]|0\rangle, & n = 2k \end{cases} \quad [31]$$

where for compactness we have written in the argument of $\psi(i)$ the spacetime coordinate, the Dirac index, and a discrete index which distinguishes $\psi$ from $\bar\psi$. The nonvanishing contractions in [31] are determined by the free-fermion propagator

$$\Delta_F(x - y) = \langle 0|T[\psi(x)\,\bar\psi(y)]|0\rangle = \langle x|(i\,\not\partial - m)^{-1}|y\rangle = i \int \frac{d^4p}{(2\pi)^4}\, \frac{\not p + m}{p^2 - m^2 + i\epsilon}\, e^{-ip\cdot(x-y)} \quad [32]$$

$$\operatorname{tr} \prod_{i=1}^{n} V(x_i)\,\psi(x_i)\,\bar\psi(x_{i+1})$$

(with $x_{n+1} := x_1$), where tr is the $4 \times 4$ trace over spinor indices. This reordering introduces the familiar minus sign for a closed fermion loop, and one has

$$[\text{diagram: closed fermion loop with vertices } V(x_1), V(x_2), V(x_3), \ldots, V(x_n)] = (-) \int \prod_{i=1}^{n} d^4x_i \;\operatorname{tr} \prod_{j=1}^{n-1} \Delta_F(x_j - x_{j+1})\, V(x_{j+1})\, \Delta_F(x_{j+1} - x_{j+2}) \quad [33]$$

Feynman rules are now described as follows. Fermion lines are oriented to distinguish a particle from its corresponding antiparticle, and carry both a four-momentum label $p$ as well as a spin polarization index $r = 1, 2$. Incoming fermions (resp. antifermions) are described by the wave functions $u_p^{(r)}$ (resp. $\bar v_p^{(r)}$), while outgoing fermions (resp. antifermions) are described by the wave functions $\bar u_p^{(r)}$ (resp. $v_p^{(r)}$). Here $u_p^{(r)}$ and $v_p^{(r)}$ are the classical spinors, that is, the positive and negative-energy solutions of the Dirac equation $(\not p - m)\,u_p^{(r)} = (\not p + m)\,v_p^{(r)} = 0$. Matrices are multiplied along a Fermi line, with the head of the arrow on the left. Closed fermion loops produce an overall minus sign as in [33], and the multiplication rule gives the trace of Dirac matrices along the lines of the loop. Unpolarized scattering amplitudes are summed over the spins of final particles and averaged over the spins of initial particles using the completeness relations for spinors

$$\sum_{r=1,2} u_p^{(r)}\, \bar u_p^{(r)} = \not p + m; \qquad \sum_{r=1,2} v_p^{(r)}\, \bar v_p^{(r)} = \not p - m \quad [34]$$

$$\operatorname{tr} \gamma^\mu \gamma^\nu \gamma^\lambda \gamma^\rho = 4\left(\eta^{\mu\nu}\eta^{\lambda\rho} - \eta^{\mu\lambda}\eta^{\nu\rho} + \eta^{\mu\rho}\eta^{\nu\lambda}\right)$$
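The Clifford algebra relation and the four-matrix trace identity used on this page can be confirmed numerically in an explicit representation. The sketch below is only an illustration: it assumes the Dirac representation of the $\gamma^\mu$ and the metric $\eta = \mathrm{diag}(1, -1, -1, -1)$, neither of which is fixed by the text, and the helper names (`matmul`, `blocks`, `clifford_holds`, `four_gamma_trace_holds`) are hypothetical. Only the Python standard library is used.

```python
from itertools import product


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def trace(a):
    """Trace of a 4x4 matrix."""
    return sum(a[i][i] for i in range(4))


def blocks(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def neg(m):
    """Entrywise negation of a matrix."""
    return [[-x for x in row] for row in m]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation (an assumed choice; the identities are
# representation independent).
GAMMA = [blocks(ONE2, ZERO2, ZERO2, neg(ONE2))] + [
    blocks(ZERO2, s, neg(s), ZERO2) for s in PAULI
]
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def clifford_holds():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} times the identity."""
    for mu, nu in product(range(4), repeat=2):
        ab = matmul(GAMMA[mu], GAMMA[nu])
        ba = matmul(GAMMA[nu], GAMMA[mu])
        for i, j in product(range(4), repeat=2):
            expected = 2 * ETA[mu][nu] * (1 if i == j else 0)
            if ab[i][j] + ba[i][j] != expected:
                return False
    return True


def four_gamma_trace_holds():
    """Check the trace of four gamma matrices for all 256 index choices."""
    for mu, nu, lam, rho in product(range(4), repeat=4):
        lhs = trace(matmul(matmul(GAMMA[mu], GAMMA[nu]),
                           matmul(GAMMA[lam], GAMMA[rho])))
        rhs = 4 * (ETA[mu][nu] * ETA[lam][rho]
                   - ETA[mu][lam] * ETA[nu][rho]
                   + ETA[mu][rho] * ETA[nu][lam])
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print("Clifford algebra holds:", clifford_holds())
    print("Four-gamma trace identity holds:", four_gamma_trace_holds())
```

All matrix entries are 0, ±1 or ±i, so the comparisons are exact and no numerical tolerance is needed.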
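Specific to $D = 4$ dimensions are the trace identities

$$\operatorname{tr} \gamma^5 = \operatorname{tr} \gamma^\mu \gamma^\nu \gamma^5 = 0; \qquad \operatorname{tr} \gamma^\mu \gamma^\nu \gamma^\lambda \gamma^\rho \gamma^5 = 4i\,\epsilon^{\mu\nu\lambda\rho} \quad [36]$$

where $\gamma^5 := i\gamma^0\gamma^1\gamma^2\gamma^3$. Finally, loop diagrams evaluated with the fermion propagator [32] require a generalization of the momentum space integral [29] given by

$$\int \frac{d^Dk}{(2\pi)^D}\, \frac{1}{(k^2 + 2k\cdot p + a^2 + i\epsilon)^r} = \frac{i\,(-\pi)^{D/2}}{(2\pi)^D}\, \frac{\Gamma\!\left(r - \frac{D}{2}\right)}{(r - 1)!}\, \frac{1}{(a^2 - p^2 + i\epsilon)^{r - D/2}} \quad [37]$$

From this formula we can extract expressions for more complicated Feynman integrals which are tensorial, that is, which contain products of momentum components $k^\mu$ in the numerators of their integrands, by differentiating [37] with respect to the external momentum $p_\mu$.

Feynman diagrams. The gauge field propagator is given by

$$\langle 0|T[A_\mu(x)\,A_\nu(y)]|0\rangle = \langle x|\bigl[\eta_{\mu\nu}(\Box + \mu^2) - \partial_\mu\partial_\nu\bigr]^{-1}|y\rangle = i \int \frac{d^4p}{(2\pi)^4}\, \frac{-\eta_{\mu\nu} + \dfrac{p_\mu p_\nu}{\mu^2}}{p^2 - \mu^2 + i\epsilon}\, e^{-ip\cdot(x-y)} \quad [39]$$

and is represented by a wavy line. The fermion–fermion–photon vertex is

$$[\text{diagram: fermion–fermion–photon vertex with photon index } \mu] = -ie\,\gamma^\mu \quad [40]$$

An incoming (resp. outgoing) soft photon of momentum $k$ and polarization $r$ is described by the wave function $e_\mu^{(r)}(k)$ (resp. $e_\mu^{(r)}(k)^*$), where the polarization vectors $e_\mu^{(r)}(k)$, $r = 1, 2, 3$ solve the vector field wave equation $(\Box + \mu^2)A_\mu = \partial^\mu A_\mu = 0$ and obey the orthonormality and completeness conditions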
Figure 1 Feynman rules.
represent gluons and dashed lines represent ghosts. Feynman rules for the fermions are exactly as before, except that now the vertex [40] is multiplied by the color matrix $T^a$. All color indices are contracted along the lines of the Feynman graph. Color factors may be simplified by using the identities

$$\operatorname{Tr} R^a R^b = \frac{\dim R}{\dim G}\, C_2(R)\,\delta^{ab}; \qquad R^a R^a = C_2(R); \qquad R^a R^b R^a = \left( C_2(R) - \tfrac12\, C_2(G) \right) R^b \quad [43]$$

where $R^a := R(T^a)$ and $C_2(R)$ is the quadratic Casimir invariant of the representation $R$ (with value $C_2(G)$ in the adjoint representation). For $G = SU(N)$, one has $C_2(G) = N$ and $C_2(N) = (N^2 - 1)/2N$ for the fundamental representation.

The cancellation of infrared divergences in loop amplitudes of QCD is far more delicate than in QED, as there is no analog of the Bloch–Nordsieck theorem in this case. The Kinoshita–Lee–Nauenberg theorem guarantees that, at the

physical quantity. However, at a given order of perturbation theory, a physical quantity typically involves both virtual and real emission contributions that are separately infrared divergent. Already at two-loop level these divergences have a highly intricate structure. Their precise form is specified by the Catani color-space factorization formula, which also provides an efficient way of organizing amplitudes into divergent parts, which ultimately drop out of physical quantities, and finite contributions.

The computation of multigluon amplitudes in nonabelian gauge theory is rather complicated when one uses polarization states of vector bosons. A much more efficient representation of amplitudes is provided by adopting a helicity (or circular polarization) basis for external gluons. In the spinor–helicity formalism, one expresses positive and negative-helicity polarization vectors in terms of massless Weyl spinors $|k^\pm\rangle := \tfrac12 (1 \pm \gamma^5)\,u_k = \tfrac12 (1 \mp \gamma^5)\,v_k$ through
q
k
by using the method of Padé approximation which requires knowledge of only part of the expansion of the diagram. By construction, the Padé approximation has the same analytic properties as the exact amplitude.

Brown–Feynman Reduction

When considering loop diagrams which involve fermions or gauge bosons, one encounters tensorial Feynman integrals. When these involve more than three distinct denominator factors (propagators), they require more than two Feynman parameters for their evaluation and become increasingly complicated. The Brown–Feynman method simplifies such higher-rank integrals and effectively reduces them to scalar integrals which typically require fewer Feynman parameters for their evaluation.

To illustrate the idea behind this method, consider the one-loop rank-3 tensor Feynman integral

$$J^{\mu\nu\lambda} = \int \frac{d^Dk}{(2\pi)^D}\, \frac{k^\mu k^\nu k^\lambda}{k^2\,(k^2 - \mu^2)\,(q - k)^2\,\bigl((k - q)^2 + \mu^2\bigr)\,(k^2 + 2k\cdot p)} \quad [47]$$

where $p$ and $q$ are external momenta with the mass-shell conditions $p^2 = (p - q)^2 = m^2$. By Lorentz invariance, the general structure of the integral [47] will be of the form

$$J^{\mu\nu\lambda} = a^{\mu\nu} p^\lambda + b^{\mu\nu} q^\lambda + c^\mu s^{\nu\lambda} + c^\nu s^{\mu\lambda} \quad [48]$$

where $a^{\mu\nu}$, $b^{\mu\nu}$ are tensor-valued functions and $c^\mu$ a vector-valued function of $p$ and $q$. The symmetric tensor $s^{\mu\nu}$ is chosen to project out components of vectors transverse to both $p$ and $q$, i.e., $p_\mu s^{\mu\nu} = q_\mu s^{\mu\nu} = 0$, with the normalization $s^\mu{}_\mu = D - 2$. Solving these constraints leads to the explicit form

$$s^{\mu\nu} = \eta^{\mu\nu} - \frac{m^2\, q^\mu q^\nu + q^2\, p^\mu p^\nu - (p\cdot q)\,(q^\mu p^\nu + p^\mu q^\nu)}{m^2 q^2 - (p\cdot q)^2} \quad [49]$$

To determine the as yet unknown functions $a^{\mu\nu}$, $b^{\mu\nu}$ and $c^\mu$ above, we first contract both sides of the decomposition [48] with $p$ and $q$ to get

$$2 p_\lambda J^{\mu\nu\lambda} = 2 m^2\, a^{\mu\nu} + 2 (p\cdot q)\, b^{\mu\nu}$$
$$2 q_\lambda J^{\mu\nu\lambda} = 2 (p\cdot q)\, a^{\mu\nu} + 2 q^2\, b^{\mu\nu} \quad [50]$$

Inside the integrand of [47], we then use the trivial identities

$$2 k\cdot p = (k^2 + 2 k\cdot p) - k^2$$
$$2 q\cdot k = k^2 + q^2 - (k - q)^2 \quad [51]$$

to write the left-hand sides of [50] as the sum of rank-2 Feynman integrals which, with the exception of the one multiplied by $q^2$ from [51], have one less denominator factor. This formally determines the coefficients $a^{\mu\nu}$ and $b^{\mu\nu}$ in terms of a set of rank-2 integrations. The vector function $c^\mu$ is then found from the contraction

$$J^{\mu\nu}{}_{\nu} = p_\nu\, a^{\mu\nu} + q_\nu\, b^{\mu\nu} + (D - 2)\, c^\mu \quad [52]$$

This contraction eliminates the $k^2$ denominator factor in the integrand of [47] and produces a vector-valued integral. Solving the system of algebraic equations [50] and [52] then formally determines the rank-3 Feynman integral [47] in terms of rank-1 and rank-2 Feynman integrals. The rank-2 Feynman integrals thus generated can then be evaluated in the same way by writing a decomposition for them analogous to [48] and solving for them in terms of vector-valued and scalar-valued Feynman integrals. Finally, the rank-1 integrations can be solved for in terms of a set of scalar-valued integrals, most of which have fewer denominator factors in their integrands.

Generally, any one-loop amplitude can be reduced to a set of basic integrals by using the Passarino–Veltman reduction technique. For example, in supersymmetric amplitudes of gluons any tensor Feynman integral can be reduced to a set of scalar integrals, that is, Feynman integrals in a scalar field theory with a massless particle circulating in the loop, with rational coefficients. In the case of $N = 4$ supersymmetric Yang–Mills theory, only scalar box integrals appear.

Reduction to Master Integrals

While the Brown–Feynman and Passarino–Veltman reductions are well suited for dealing with one-loop diagrams, they become rather cumbersome for higher-loop computations. There are other more powerful methods for reducing general tensor integrals into a basis of known integrals called master integrals. Let us illustrate this technique on a scalar example. Any scalar massless two-loop Feynman integral can be brought into the form

$$I(p) = \int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D}\, \prod_{j=1}^{t} \Delta_j^{\,l_j}\, \prod_{i=1}^{q} \Sigma_i^{\,n_i} \quad [53]$$

where $\Delta_j$ are massless scalar propagators depending on the loop momenta $k$, $k'$ and the external momenta $p_1, \ldots, p_n$, and $\Sigma_i$ are scalar products of a loop momentum with an external momentum or of the two loop momenta. The topology of the corresponding Feynman diagram is uniquely determined by specifying the set $\Delta_1, \ldots, \Delta_t$ of $t$ distinct
propagators in the graph, while the integral itself is specified by the powers $l_j \ge 1$ of all propagators, by the selection $\Sigma_1, \ldots, \Sigma_q$ of $q$ scalar products and by their powers $n_i \ge 0$.

The integrals in a class of diagrams of the same topology with the same denominator dimension $r = \sum_j l_j$ and same total scalar product number $s = \sum_i n_i$ are related by various identities. One class follows from the fact that the integral over a total derivative with respect to any loop momentum vanishes in dimensional regularization as

$$\int \frac{d^Dk}{(2\pi)^D}\, \frac{\partial J(k)}{\partial k^\mu} = 0$$

where $J(k)$ is any tensorial combination of propagators, scalar products and loop momenta. The resulting relations are called integration-by-parts identities and for two-loop integrals can be cast into the form

$$\int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D}\; v^\mu\, \frac{\partial f(k, k', p)}{\partial k^\mu} = 0 = \int \frac{d^Dk}{(2\pi)^D} \int \frac{d^Dk'}{(2\pi)^D}\; v^\mu\, \frac{\partial f(k, k', p)}{\partial k'^\mu} \quad [54]$$

where $f(k, k', p)$ is a scalar function containing propagators and scalar products, and $v^\mu$ is any internal or external momentum. For a graph with $\ell$ loops and $n$ independent external momenta, this results in a total of $\ell(n + \ell)$ relations.

In addition to these identities, one can also exploit the fact that all Feynman integrals [53] are Lorentz scalars. Under an infinitesimal Lorentz transformation $p^\mu \to p^\mu + \delta p^\mu$, with $\delta p^\mu = \delta\epsilon^\mu{}_\nu\, p^\nu$, $\delta\epsilon^\mu{}_\nu = -\delta\epsilon^\nu{}_\mu$, one has the invariance condition $I(p + \delta p) = I(p)$, which leads to the linear homogeneous differential equations

$$\sum_{i=1}^{n} \left( p_i^\nu\, \frac{\partial}{\partial p_{i\mu}} - p_i^\mu\, \frac{\partial}{\partial p_{i\nu}} \right) I(p) = 0 \quad [55]$$

This equation can be contracted with all possible antisymmetric combinations of $p_{i\mu}\, p_{j\nu}$ to yield linearly independent Lorentz invariance identities for [53].

Using these two sets of identities, one can either obtain a reduction of integrals of the type [53] to those corresponding to a small number of simpler diagrams of the same topology and diagrams of simpler topology (fewer denominator factors), or a complete reduction to diagrams with simpler topology. The remaining integrals of the topology under consideration are called irreducible master integrals. These momentum integrals cannot be further reduced and have to be computed by different techniques. For instance, one can apply a Mellin–Barnes transformation of all propagators given by

$$\frac{1}{(k^2 + a)^l} = \frac{1}{(l - 1)!} \int_{-i\infty}^{i\infty} \frac{dz}{2\pi i}\, \frac{a^z}{(k^2)^{l+z}}\, \Gamma(l + z)\,\Gamma(-z) \quad [56]$$

where the contour of integration is chosen to lie to the right of the poles of the Euler function $\Gamma(l + z)$ and to the left of the poles of $\Gamma(-z)$ in the complex $z$-plane. Alternatively, one may apply the negative-dimension method in which $D$ is regarded as a negative integer in intermediate calculations and the problem of loop integration is replaced with that of handling infinite series. When combined with the above methods, it may be used to derive powerful recursion relations among scattering amplitudes. Both of these techniques rely on an explicit integration over the loop momenta of the graph, their differences occurring mainly in the representations used for the propagators.

The procedure outlined above can also be used to reduce a tensor Feynman integral to scalar integrals, as in the Brown–Feynman and Passarino–Veltman reductions. The tensor integrals are expressed as linear combinations of scalar integrals of either higher dimension or with propagators raised to higher powers. The projection onto a tensor basis takes the form [53] and can thus be reduced to master integrals.

String Theory Methods

The realization of field theories as the low-energy limits of string theory provides a number of powerful tools for the calculation of multiloop amplitudes. They may be used to provide sets of diagrammatic computational rules, and they also work well for calculations in quantum gravity. In this final part we shall briefly sketch the insights into perturbative quantum field theory that are provided by techniques borrowed from string theory.

String Theory Representation

String theory provides an efficient compact representation of scattering amplitudes. At each loop order there is only a single closed string diagram, which includes within it all Feynman graphs along with the contributions of the infinite tower of massive string excitations. Schematically, at one-loop order, the situation is as shown in Figure 3. The terms arising from the heavy string modes are removed by taking the low-energy limit in which all external momenta lie well below the energy scale set by the string tension. This limit picks out the regions of integration in the string diagram corresponding to particle-like graphs, but with different diagrammatic rules.
problem shows up already in the quantization of the free gauge fields (see the section ‘‘Quantization of free gauge fields’’). In the final (interacting) theory the physical quantities should be independent of how the gauge fixing is done (‘‘gauge independence’’).

Traditionally, the quantization of gauge theories is mostly analyzed in terms of path integrals (e.g., by Faddeev and Popov), where some parts of the arguments are only heuristic. In the original treatment of Becchi, Rouet, and Stora (cf. also Tyutin) (which is called ‘‘BRST-quantization’’), a restriction to purely massive theories was necessary; the generalization to the massless case by Lowenstein’s method is cumbersome.

The BRST quantization is based on earlier work of Feynman, Faddeev, and Popov (introduction of ‘‘ghost fields’’), and of Slavnov. The basic idea is that after adding a term to the Lagrangian which makes the Cauchy problem well posed but which is not gauge-invariant one enlarges the number of fields by infinitesimal gauge transformations (‘‘ghosts’’) and their duals (‘‘anti-ghosts’’). One then adds a further term to the Lagrangian which contains a coupling of the anti-ghosts and ghosts. The BRST transformation acts as an infinitesimal gauge transformation on the original fields and on the gauge transformations themselves and maps the anti-ghosts to the gauge-fixing terms. This is done in such a way that the total Lagrangian is invariant and that the BRST transformation is nilpotent.

The hard problem in the perturbative construction of gauge theories is to show that BRST symmetry can be maintained during renormalization (see the section on perturbative renormalization). By means of the ‘‘quantum action principle’’ of Lowenstein (1971) and Lam (1972, 1973) a cohomological classification of anomalies was worked out (an overview is given, e.g., in the book of Piguet and Sorella (1995)). For more details, see BRST Quantization.

The BRST quantization can be carried out in a transparent way in the framework of algebraic quantum field theory (AQFT, see Algebraic Approach to Quantum Field Theory). The advantage of this formulation is that it allows one to separate the three main problems of perturbative gauge theories:

1. the elimination of unphysical degrees of freedom,
2. positivity (or ‘‘unitarity’’), and
3. the problem of infrared divergences.

In AQFT, the procedure is the following: starting from an algebra of all local fields, including the unphysical ones, one shows that after perturbative quantization the algebra admits the BRST transformation as a graded nilpotent derivation. The algebra of observables is then defined as the cohomology of the BRST transformation. To solve the problem of positivity, one has to show that the algebra of observables, in contrast to the algebra of all fields, has a nontrivial representation on a Hilbert space. Finally, one can attack the infrared problem by investigating the asymptotic behavior of states. The latter problem is nontrivial even in quantum electrodynamics (since an electron is accompanied by a ‘‘cloud of soft photons’’) and may be related to confinement in quantum chromodynamics.

The method of BRST quantization is by no means restricted to gauge theories, but applies to general constrained systems. In particular, massive vector fields, where the masses are usually generated by the Higgs mechanism, can alternatively be treated directly by the BRST formalism, in close analogy to the massless case (cf. the section on quantization of free gauge fields).

Local Operator BRST Formalism

In AQFT, the principal object is the family of operator algebras $\mathcal{O} \to \mathcal{A}(\mathcal{O})$ (where $\mathcal{O}$ runs, e.g., through all double cones in Minkowski space), which fulfills the Haag–Kastler axioms (cf. Algebraic Approach to Quantum Field Theory). To construct these algebras, one considers the algebras $\mathcal{F}(\mathcal{O})$ generated by all local fields including ghosts $u$ and anti-ghosts $\tilde u$. Ghosts and anti-ghosts are scalar fermionic fields. The algebra gets a $\mathbb{Z}_2$ grading with respect to even and odd ghost numbers, where ghosts get ghost number $+1$ and anti-ghosts ghost number $-1$. The BRST transformation $s$ acts on these algebras as a $\mathbb{Z}_2$-graded derivation with $s^2 = 0$, $s(\mathcal{F}(\mathcal{O})) \subset \mathcal{F}(\mathcal{O})$, and $s(F^*) = -(-1)^{\delta_F}\, s(F)^*$, $\delta_F$ denoting the ghost number of $F$.

The observables should be $s$-invariant and may be identified if they differ by a field in the range of $s$. Since the range $\mathcal{A}_{00}$ of $s$ is an ideal in the kernel $\mathcal{A}_0$ of $s$, the algebra of observables is defined as the quotient

$$\mathcal{A} := \mathcal{A}_0 / \mathcal{A}_{00} \quad [1]$$

and the local algebras $\mathcal{A}(\mathcal{O}) \subset \mathcal{A}$ are the images of $\mathcal{A}_0 \cap \mathcal{F}(\mathcal{O})$ under the quotient map $\mathcal{A}_0 \to \mathcal{A}$.

To prove that $\mathcal{A}$ admits a nontrivial representation by operators on a Hilbert space, one may use the BRST operator formalism (Kugo and Ojima 1979, Dütsch and Fredenhagen 1999): one starts from a representation of $\mathcal{F}$ on an inner-product space $(\mathcal{K}, \langle \cdot, \cdot \rangle)$ such that $\langle F^*\phi, \psi \rangle = \langle \phi, F\psi \rangle$
Perturbative Renormalization Theory and BRST 43
and that $s$ is implemented by an operator $Q$ on $\mathcal{K}$, that is,

$$ s(F) = [Q, F] $$ [2]

with $[\cdot,\cdot]$ denoting the graded commutator, such that $Q$ is symmetric and nilpotent. One may then construct the space of physical states as the cohomology of $Q$, $\mathcal{H} := \mathcal{K}_0 / \mathcal{K}_{00}$, where $\mathcal{K}_0$ is the kernel and $\mathcal{K}_{00}$ the range of $Q$. The algebra of observables now has a natural representation $\pi$ on $\mathcal{H}$:

$$ \pi([A])[\phi] := [A\phi] $$ [3]

(where $A \in \mathcal{A}_0$, $\phi \in \mathcal{K}_0$, $[A] := A + \mathcal{A}_{00}$, $[\phi] := \phi + \mathcal{K}_{00}$). The crucial question is whether the scalar product on $\mathcal{H}$ inherited from $\mathcal{K}$ is positive definite.

In free quantum field theories $(\mathcal{K}, \langle\cdot,\cdot\rangle)$ can be chosen in such a way that the positivity can directly be checked by identifying the physical degrees of freedom (see next section). In interacting theories (see the section on perturbative construction of gauge theories), one may argue in terms of scattering states that the free BRST operator on the asymptotic fields coincides with the BRST operator of the interacting theory. This argument, however, is invalidated by infrared problems in massless gauge theories. Instead, one may use a stability property of the construction.

Namely, let $\tilde{\mathcal{F}}$ be the algebra of formal power series with values in $\mathcal{F}$, and let $\tilde{\mathcal{K}}$ be the vector space of formal power series with values in $\mathcal{K}$. $\tilde{\mathcal{K}}$ possesses a natural inner product with values in the ring of formal power series $\mathbb{C}[[\lambda]]$, as well as a representation of $\tilde{\mathcal{F}}$ by operators. One also assumes that the BRST transformation $\tilde{s}$ is a formal power series $\tilde{s} = \sum_n \lambda^n s_n$ of operators $s_n$ on $\mathcal{F}$ and that the BRST operator $\tilde{Q}$ is a formal power series $\tilde{Q} = \sum_n \lambda^n Q_n$ of operators on $\mathcal{K}$. The algebraic construction can then be done in the same way as before, yielding a representation $\tilde{\pi}$ of the algebra of observables $\tilde{\mathcal{A}}$ by endomorphisms of a $\mathbb{C}[[\lambda]]$ module $\tilde{\mathcal{H}}$, which has an inner product with values in $\mathbb{C}[[\lambda]]$.

One now assumes that at $\lambda = 0$ the inner product is positive, in the sense that

(Positivity)
(i) $\langle\phi, \phi\rangle \geq 0 \quad \forall \phi \in \mathcal{K}$ with $Q_0\phi = 0$; and
(ii) $Q_0\phi = 0 \,\wedge\, \langle\phi, \phi\rangle = 0 \implies \phi \in Q_0\mathcal{K}$ [4]

Then the inner product on $\tilde{\mathcal{H}}$ is positive in the sense that for all $\tilde{\phi} \in \tilde{\mathcal{H}}$ the inner product with itself, $\langle\tilde{\phi}, \tilde{\phi}\rangle$, is of the form $\tilde{c}^*\tilde{c}$ with some power series $\tilde{c} \in \mathbb{C}[[\lambda]]$, and $\tilde{c} = 0$ iff $\tilde{\phi} = 0$.

This result guarantees that, within perturbation theory, the interacting theory satisfies positivity, provided the unperturbed theory was positive and BRST symmetry is preserved.

Quantization of Free Gauge Fields

The action of a classical free gauge field $A$,

$$ S_0(A) = -\frac{1}{4}\int dx\, F^{\mu\nu}(x) F_{\mu\nu}(x) = \frac{1}{2}\int dk\, \hat{A}_\mu(k)^* M^{\mu\nu}(k) \hat{A}_\nu(k) $$ [5]

(where $F^{\mu\nu} := \partial^\mu A^\nu - \partial^\nu A^\mu$ and $M^{\mu\nu}(k) := k^2 g^{\mu\nu} - k^\mu k^\nu$) is unsuited for quantization because $M^{\mu\nu}$ is not invertible: due to $M^{\mu\nu} k_\nu = 0$, it has an eigenvalue 0. Therefore, the action is usually modified by adding a Lorentz-invariant gauge-fixing term: $M^{\mu\nu}$ is replaced by $M^{\mu\nu}(k) + \lambda k^\mu k^\nu$, where $\lambda \in \mathbb{R} \setminus \{0\}$ is an arbitrary constant. The corresponding Euler–Lagrange equation reads

$$ \Box A^\mu - (1 - \lambda)\partial^\mu \partial_\nu A^\nu = 0 $$ [6]

For simplicity, let us choose $\lambda = 1$, which is referred to as Feynman gauge. Then the algebra of the free gauge field is the unital $*$-algebra generated by elements $A^\mu(f)$, $f \in \mathcal{D}(\mathbb{R}^4)$, which fulfill the relations:

$$ f \mapsto A^\mu(f) \text{ is linear} $$ [7]

$$ A^\mu(\Box f) = 0 $$ [8]

$$ A^\mu(f)^* = A^\mu(\bar{f}) $$ [9]

$$ [A^\mu(f), A^\nu(g)] = i g^{\mu\nu} \int dx\, dy\, f(x) D(x - y) g(y) $$ [10]

where $D$ is the massless Pauli–Jordan distribution. This algebra does not possess Hilbert space representations which satisfy the microlocal spectrum condition, a condition which in particular requires the singularity of the two-point function to be of the so-called Hadamard form. It possesses, instead, representations on vector spaces with a nondegenerate sesquilinear form, for example, the Fock space over a one-particle space with scalar product

$$ \langle\phi, \psi\rangle = (2\pi)^{-3} \int \frac{d^3p}{2|\mathbf{p}|}\, \overline{\phi^\mu(p)}\, \psi_\mu(p) \Big|_{p_0 = |\mathbf{p}|} $$ [11]

Gupta and Bleuler characterized a subspace of the Fock space on which the scalar product is semidefinite; the space of physical states is then obtained
by dividing out the space of vectors with vanishing norm.

After adding a mass term

$$ \frac{m^2}{2} \int dx\, A_\mu(x) A^\mu(x) $$

to the action [5], it seems to be no longer necessary to add also a gauge-fixing term. The fields then satisfy the Proca equation

$$ \partial_\mu F^{\mu\nu} + m^2 A^\nu = 0 $$ [12]

which is equivalent to the equation $(\Box + m^2)A^\mu = 0$ together with the constraint $\partial_\mu A^\mu = 0$. The Cauchy problem is well posed, and the fields can be represented in a positive-norm Fock space with only physical states (corresponding to the three physical polarizations of $A$). The problem, however, is that the corresponding propagator admits no power-counting renormalizable perturbation series.

The latter problem can be circumvented in the following way: for the algebra of the free quantum field, one takes only the equation $(\Box + m^2)A^\mu = 0$ into account (or, equivalently, one adds the gauge-fixing term $(1/2)(\partial_\mu A^\mu)^2$ to the Lagrangian) and goes over from the physical field $A^\mu$ to

$$ B^\mu := A^\mu + \frac{\partial^\mu \phi}{m} $$ [13]

where $\phi$ is a real scalar field of the same mass $m$, for which the sign of the commutator is reversed (“bosonic ghost field” or “Stückelberg field”). The propagator of $B^\mu$ yields a power-counting renormalizable perturbation series; however, $B^\mu$ is an unphysical field. One obtains four independent components of $B$ which satisfy the Klein–Gordon equation. The constraint $0 = \partial_\mu A^\mu = \partial_\mu B^\mu + m\phi$ is required for the expectation values in physical states only. So, quantization in the case $m > 0$ can be treated in analogy with [8]–[10] by replacing $A^\mu$ by $B^\mu$, the wave operator by the Klein–Gordon operator $(\Box + m^2)$ in [8], and $D$ by the corresponding massive commutator distribution $\Delta_m$ in [10]. Again, the algebra can be nontrivially represented on a space with indefinite metric, but not on a Hilbert space.

One can now use the method of BRST quantization in the massless as well as in the massive case. One introduces a pair of fermionic scalar fields (ghost fields) $(u, \tilde{u})$. $u$, $\tilde{u}$, and (for $m > 0$) $\phi$ fulfill the Klein–Gordon equation to the same mass $m \geq 0$ as the vector field $B$. The free BRST transformation reads

$$ s_0(B^\mu) = i\partial^\mu u, \quad s_0(\phi) = i m u, \quad s_0(u) = 0, \quad s_0(\tilde{u}) = -i(\partial_\nu B^\nu + m\phi) $$ [14]

(see, e.g., Scharf (2001)). It is implemented by the free BRST charge

$$ Q_0 = \int_{x_0 = \text{const.}} d^3x\, j^{(0)}_0(x_0, \mathbf{x}) $$ [15]

where

$$ j^{(0)}_\mu := (\partial_\nu B^\nu + m\phi)\partial_\mu u - \partial_\mu(\partial_\nu B^\nu + m\phi)\, u $$ [16]

is the free BRST current, which is conserved. (The interpretation of the integral in [15] requires some care.) $Q_0$ satisfies the assumptions of the (local) operator BRST formalism, in particular it is nilpotent and positive [4]. Distinguished representatives of the equivalence classes $[\phi] \in \mathrm{Ke}\,Q_0 / \mathrm{Ra}\,Q_0$ are the states built up only from the three spatial (two transversal for $m = 0$, respectively) polarizations of $A$.

Perturbative Renormalization

The starting point for a perturbative construction of an interacting quantum field theory is Dyson’s formula for the evolution operator in the interaction picture. To avoid conflicts with Haag’s theorem on the nonexistence of the interaction picture in quantum field theory, one multiplies the interaction Lagrangian $\mathcal{L}$ with a test function $g$ and studies the local S-matrix,

$$ S(g\mathcal{L}) = 1 + \sum_{n=1}^{\infty} \frac{i^n}{n!} \int dx_1 \cdots dx_n\, g(x_1) \cdots g(x_n)\, T(\mathcal{L}(x_1) \cdots \mathcal{L}(x_n)) $$ [17]

where $T$ denotes a time-ordering prescription. In the limit $g \to 1$ (adiabatic limit), $S(g\mathcal{L})$ tends to the scattering matrix. This limit, however, is plagued by infrared divergences and does not always exist. Interacting fields $F_{g\mathcal{L}}$ are obtained by the Bogoliubov formula:

$$ F_{g\mathcal{L}}(x) = \frac{\delta}{\delta h(x)} \Big|_{h=0} S(g\mathcal{L})^{-1} S(g\mathcal{L} + hF) $$ [18]

The algebraic properties of the interacting fields within a region $\mathcal{O}$ depend only on the interaction within a slightly larger region (Brunetti and Fredenhagen 2000), hence the net of algebras in the sense of AQFT can be constructed in the adiabatic limit without the infrared problems (this is called the “algebraic adiabatic limit”).

The construction of the interacting theory is thus reduced to a definition of time-ordered products of fields. This is the program of causal perturbation theory (CPT), which was developed by Epstein and Glaser (1973) on the basis of previous work by Stückelberg and Petermann (1953) and Bogoliubov
and Shirkov (1959). For simplicity, we describe CPT only for a real scalar field. Let $\varphi$ be a classical real scalar field which is not restricted by any field equation. Let $\mathcal{P}$ denote the algebra of polynomials in $\varphi$ and all its partial derivatives $\partial^a\varphi$ with multi-indices $a \in \mathbb{N}_0^4$. The time-ordered products $(T_n)_{n \in \mathbb{N}}$ are linear and symmetric maps $T_n : (\mathcal{P}$

$\mathcal{D}(\mathbb{R}^{4n} \setminus \Delta_n)$ are maintained in the extension, namely:

(N0) a bound on the degree of singularity near the total diagonal;
(N1) Poincaré covariance;
(N2) unitarity of the local S-matrix;
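As an illustrative aside on the free gauge field discussed earlier, the claim that $M^{\mu\nu}(k) = k^2 g^{\mu\nu} - k^\mu k^\nu$ annihilates $k_\nu$ (and is therefore not invertible), while $M^{\mu\nu}(k) + \lambda k^\mu k^\nu$ with $\lambda \neq 0$ is invertible for $k^2 \neq 0$, can be checked with exact arithmetic. The following Python sketch is not part of the original article. It assumes the metric signature $(+,-,-,-)$, and all function names are hypothetical helpers chosen for this example.

```python
from fractions import Fraction

# Assumed metric signature (+, -, -, -); this is an assumption of the sketch.
METRIC = (1, -1, -1, -1)


def lower(k):
    """Lower the index: k_mu = g_{mu nu} k^nu for the diagonal metric."""
    return [Fraction(s) * c for s, c in zip(METRIC, k)]


def kinetic_matrix(k, lam=0):
    """Return M^{mu nu}(k) + lam * k^mu k^nu with M = k^2 g - k k."""
    k2 = sum(up * down for up, down in zip(k, lower(k)))
    return [
        [
            (k2 * METRIC[mu] if mu == nu else Fraction(0))
            - (1 - Fraction(lam)) * k[mu] * k[nu]
            for nu in range(4)
        ]
        for mu in range(4)
    ]


def apply(matrix, covector):
    """Contract the second index: v^mu = matrix^{mu nu} covector_nu."""
    return [sum(m * c for m, c in zip(row, covector)) for row in matrix]


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result            # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n):
                m[r][j] -= factor * m[col][j]
    return result


k = [3, 1, 2, 1]                        # arbitrary momentum with k^2 != 0
M = kinetic_matrix(k)
print(apply(M, lower(k)))               # zero vector: k_nu is a null direction
print(det(M), det(kinetic_matrix(k, lam=1)))
```

With $\lambda = 1$ (Feynman gauge) the matrix reduces to $k^2 g^{\mu\nu}$, which is why the second determinant is nonzero whenever $k^2 \neq 0$.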
vector $\mathcal{L}_1^\nu \in \mathcal{P}$ associated with the interaction $\mathcal{L}$, such that

$$ [Q_0, T_n(\mathcal{L}(x_1) \cdots \mathcal{L}(x_n))] = i \sum_{l=1}^{n} \partial^{x_l}_\nu T_n(\mathcal{L}(x_1) \cdots \mathcal{L}_1^\nu(x_l) \cdots \mathcal{L}(x_n)) $$ [23]

This is a somewhat stronger condition than [22] but has the advantage that it can be formulated independently of the adiabatic limit. The condition [22] (or perturbative gauge invariance) can be satisfied for tree diagrams (i.e., the corresponding requirement in classical field theory can be fulfilled). In the massive case, this is impossible without a modification of the model; the inclusion of additional physical scalar fields (corresponding to Higgs fields) yields a solution. It is gratifying that, by making a polynomial ansatz for the interaction $\mathcal{L} \in \mathcal{P}$, perturbative gauge invariance [23] for tree diagrams, renormalizability (i.e., the mass dimension of $\mathcal{L}$ is $\leq 4$), and some obvious requirements (e.g., the Lorentz invariance) determine $\mathcal{L}$ to a large extent. In particular, the Lie-algebraic structure need not be put in, as it can be derived in this way (Stora 1997, unpublished). Including loop diagrams (i.e., quantum effects), it has been proved that (N0)–(N2) and perturbative gauge invariance can be fulfilled to all orders for massless SU(N) Yang–Mills theories.

Unfortunately, in the massless case, it is unlikely that the adiabatic limit exists and, hence, an S-matrix formalism is problematic. One should better rely on the construction of local observables in terms of couplings with compact support. However, then the selection of the observables [1] has to be done in terms of the BRST transformation $\tilde{s}$ of the interacting fields.

For the corresponding BRST charge, one makes the ansatz

$$ \tilde{Q} = \int d^4x\, \tilde{j}^\mu_{g\mathcal{L}}(x)\, b_\mu(x), \qquad \mathcal{L} = \sum_{n \geq 1} \mathcal{L}_n \lambda^n $$ [24]

where $(b_\mu)$ is a smooth version of the $\delta$-function characterizing a Cauchy surface and $\tilde{j}^\mu_{g\mathcal{L}}$ is the interacting BRST-current [18] (where $\tilde{j}^\mu = \sum_n j^{\mu(n)} \lambda^n$ ($j^{\mu(n)} \in \mathcal{P}$) is a formal power series with $j^{\mu(0)}$ given by [16]). (Note that there is a volume divergence in this integral, which can be avoided by a spatial compactification. This does not change the abstract algebra $\mathcal{F}_{\mathcal{L}}(\mathcal{O})$.) A crucial requirement is that $\tilde{j}_{g\mathcal{L}}$ is conserved in a suitable sense. This condition is essentially equivalent to perturbative gauge invariance and hence its application to classical field theory determines the interaction $\mathcal{L}$ in the same way, and in addition the deformation $j^{(0)} \to \tilde{j}_{g\mathcal{L}}$. The latter also gives the interacting BRST charge and transformation, $\tilde{Q}$ and $\tilde{s}$, by [24] and [2]. The so-obtained $\tilde{Q}$ is often nilpotent in classical field theory (and hence this holds also for $\tilde{s}$). However, in QFT conservation of $\tilde{j}_{g\mathcal{L}}$ and $\tilde{Q}^2 = 0$ requires the validity of additional Ward identities, beyond the condition of perturbative gauge invariance [23]. All the necessary identities can be derived from the master Ward identity

$$ T_{n+1}(A, F_1, \ldots, F_n) = -\sum_{k=1}^{n} T_n(F_1, \ldots, \delta_A F_k, \ldots, F_n) $$ [25]

where $A = \delta_A S_0$ with a derivation $\delta_A$. The master Ward identity is closely related to the quantum action principle which was formulated in the formalism of generating functionals of Green’s functions. In the latter framework, the anomalies have been classified by cohomological methods. The vanishing of anomalies of the BRST symmetry is a selection criterion for physically acceptable models.

In the particular case of QED, the Ward identity

$$ \partial^y_\mu T(j^\mu(y) F_1(x_1) \cdots F_n(x_n)) = i \sum_{j=1}^{n} \delta(y - x_j)\, T(F_1(x_1) \cdots (\theta F_j)(x_j) \cdots F_n(x_n)) $$ [26]

for the Dirac current $j^\mu := \bar{\psi}\gamma^\mu\psi$, is sufficient for the construction, where $(\theta F) := i(r - s)F$ for $F = \psi^r \bar{\psi}^s B_1 \cdots B_l$ ($B_1, \ldots, B_l$ are nonspinorial fields) and $F_1, \ldots, F_n$ run through all subpolynomials of $\mathcal{L} = j^\mu A_\mu$. (N0)–(N4) and [26] can be fulfilled to all orders (Dütsch and Fredenhagen, 1999).

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Batalin–Vilkovisky Quantization; BRST Quantization; Constrained Systems; Indefinite Metric; Perturbation Theory and its Techniques; Quantum Chromodynamics; Quantum Field Theory: A Brief Introduction; Quantum Fields with Indefinite Metric: Non-Trivial Models; Renormalization: General Theory; Renormalization: Statistical Mechanics and Condensed Matter; Standard Model of Particle Physics.

Further Reading

Becchi C, Rouet A, and Stora R (1975) Renormalization of the abelian Higgs–Kibble model. Communications in Mathematical Physics 42: 127.
Becchi C, Rouet A, and Stora R (1976) Renormalization of gauge theories. Annals of Physics (NY) 98: 287.
Bogoliubov NN and Shirkov DV (1959) Introduction to the Theory of Quantized Fields. New York: Interscience Publishers Inc.
Brunetti R and Fredenhagen K (2000) Microlocal analysis and interacting quantum field theories: renormalization on physical backgrounds. Communications in Mathematical Physics 208: 623.
Dütsch M, Hurth T, Krahe K, and Scharf G (1993) Causal construction of Yang–Mills theories. I. Il Nuovo Cimento A 106: 1029.
Phase Transition Dynamics 47
Dütsch M, Hurth T, Krahe K, and Scharf G (1994) Causal construction of Yang–Mills theories. II. Il Nuovo Cimento A 107: 375.
Dütsch M and Fredenhagen K (1999) A local (perturbative) construction of observables in gauge theories: the example of QED. Communications in Mathematical Physics 203: 71.
Dütsch M and Boas F-M (2002) The Master Ward identity. Reviews in Mathematical Physics 14: 977–1049.
Dütsch M and Fredenhagen K (2003) The master Ward identity and generalized Schwinger–Dyson equation in classical field theory. Communications in Mathematical Physics 243: 275.
Dütsch M and Fredenhagen K (2004) Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity. Reviews in Mathematical Physics 16: 1291–1348.
Epstein H and Glaser V (1973) Annales de l’Institut Henri Poincaré A 19: 211.
Epstein H and Glaser V (1976) Adiabatic limit in perturbation theory. In: Velo G and Wightman AS (eds.) Renormalization Theory, pp. 193–254.
Henneaux M and Teitelboim C (1992) Quantization of Gauge Systems. Princeton: Princeton University Press.
Kugo T and Ojima I (1979) Local covariant operator formalism of nonabelian gauge theories and quark confinement problem. Supplement of the Progress of Theoretical Physics 66: 1.
Lam Y-MP (1972) Perturbation Lagrangian theory for scalar fields – Ward–Takahashi identity and current algebra. Physical Review D 6: 2145.
Lam Y-MP (1973) Equivalence theorem on Bogoliubov–Parasiuk–Hepp–Zimmermann – renormalized Lagrangian field theories. Physical Review D 7: 2943.
Lowenstein JH (1971) Differential vertex operations in Lagrangian field theory. Communications in Mathematical Physics 24: 1.
Piguet O and Sorella S (1995) Algebraic Renormalization: Perturbative Renormalization, Symmetries and Anomalies, Lecture Notes in Physics. Berlin: Springer.
Scharf G (1995) Finite Quantum Electrodynamics. The Causal Approach, 2nd edn. Berlin: Springer.
Scharf G (2001) Quantum Gauge Theories – A True Ghost Story. New York: Wiley.
Stora R (2002) Pedagogical experiments in renormalized perturbation theory. Contribution to the conference. Theory of Renormalization and Regularization. Germany: Hesselberg.
Stückelberg ECG and Petermann A (1953) La normalisation des constantes dans la theorie des quanta. Helvetica Physica Acta 26: 499–520.
Weinberg S (1996) The Quantum Theory of Fields. Cambridge: Cambridge University Press.
We may explain the roles of the terms on the right-hand side of eqn [1] in phase ordering in a simple manner.

1. The linear term $-\tau\psi$ triggers instability for $\tau < 0$.
2. The nonlinear term $-\psi^3$ gives rise to saturation of $\psi$ into $\pm 1$. To see this, we neglect $\nabla^2\psi$ and $\theta$ to have $\partial\psi/\partial t = \psi(1 - \psi^2)$ for $\tau = -1$. This equation is solved to give

$$ \psi(t) = \psi_0 \Big/ \sqrt{\psi_0^2 + (1 - \psi_0^2)e^{-2t}} $$ [15]

where $\psi_0 = \psi(0)$ is the initial value. Thus, $\psi \to 1$ for $\psi_0 > 0$ and $\psi \to -1$ for $\psi_0 < 0$ as $t \to \infty$.
3. The gradient term limits the instability only in the long wavelength region $k < 1$ in the initial stage (see eqn [8]) and creates the interfaces in the late stage (see eqn [7]).
4. The noise term is relevant only in the early stage where $\psi$ is still on the order of the initial thermal fluctuations. The range of the early stage is of order 1 for $\varepsilon > 1$, but weakly grows as $\ln(1/\varepsilon)$ for $\varepsilon \ll 1$. The noise term can be neglected once the fluctuations much exceed the thermal level.
5. If $h$ is a small positive number, it favors growth of regions with $\psi \cong 1$.

Interface Dynamics

At long times $t \gg 1$ domains with typical size $\ell(t)$ are separated by sharp interfaces and the thermal noise is negligible. Allowing the presence of a small positive $h$, we may approximate the free energy $F$ as

$$ F = \sigma S(t) - 2hV_+(t) + \text{const.} $$ [16]

where $\sigma$ is a constant (surface tension), $S(t)$ is the surface area, and $V_+(t)$ is the volume of the regions with $\psi \cong 1$. In this stage the interface velocity $v_{int} = \mathbf{v}_{int} \cdot \mathbf{n}$ is given by the Allen–Cahn formula (Allen and Cahn 1979):

$$ v_{int} = -K + (2/\sigma)h $$ [17]

The normal unit vector $\mathbf{n}$ is from a region with $\psi \cong 1$ to a region with $\psi \cong -1$. The $K$ is the sum of the principal curvatures $1/R_1 + 1/R_2$ in 3D. This equation can be derived from eqn [1]. If the interface position $\mathbf{r}_a$ moves to $\mathbf{r}_a + \mathbf{n}\,\delta\zeta$ infinitesimally, the surface area changes by $\delta S = \int da\, K\, \delta\zeta$, where $\int da$ denotes the surface integral. Therefore, $F$ in eqn [16] changes in time as

$$ \frac{dF}{dt} = \int da\, (\sigma K - 2h)\, v_{int} \leq 0 $$ [18]

which is non-positive owing to eqn [17], since the integrand equals $-\sigma v_{int}^2$. Furthermore, we may draw three results from eqn [17].

1. If we set $v_{int} \sim \ell(t)/t$ and $K \sim 1/\ell(t)$, we obtain $a = 1/2$ in the growth law [9].
2. In phase ordering under very small positive $h$, the balance $1/\ell(t) \sim h/\sigma$ yields the crossover time $t_h \sim h^{-2}$. For $t < t_h$ the effect of $h$ is small, while for $t > t_h$ the region with $\psi \cong 1$ becomes predominant.
3. A spherical droplet with $\psi \cong 1$ evolves as

$$ \frac{\partial R}{\partial t} = -\frac{2}{R} + \frac{2h}{\sigma} $$ [19]

from which the critical radius is determined as

$$ R_c = \sigma/h $$ [20]

A droplet with $R > R_c$ ($R < R_c$) grows (shrinks).

We mention a statistical theory of interface dynamics at $h = 0$ by Ohta et al. (1982). There, a smooth subsidiary field $u(\mathbf{r}, t)$ is introduced to represent surfaces by $u = \text{const.}$ The differential geometry is much simplified in terms of such a field. The two-phase boundaries are represented by $u = 0$. If all the surfaces follow $v_{int} = -K$ in eqn [17] in the whole space, $u$ obeys

$$ \frac{\partial u}{\partial t} = \Big[\nabla^2 - \sum_{ij} n_i n_j \nabla_i \nabla_j\Big] u $$ [21]

where $\nabla_i = \partial/\partial x_i$ and $n_i = \nabla_i u/|\nabla u|$. This equation becomes a linear diffusion equation if $n_i n_j \nabla_i \nabla_j$ is replaced by $d^{-1}\delta_{ij}\nabla^2$. Then $u$ can be expressed in terms of its initial value and the correlation function of $\psi(\mathbf{r}, t)$ ($\cong u(\mathbf{r}, t)/|u(\mathbf{r}, t)|$ in the late stage) is calculated in the form of eqn [13] with

$$ G(x) = \frac{2}{\pi} \sin^{-1}\Big[\exp\Big(-\frac{1}{8(1 - 1/d)}\, x^2\Big)\Big] $$ [22]

which excellently agrees with simulations.

Spinodal Decomposition in Conserved Systems

The order parameter $\psi$ can be a conserved variable such as the density or composition in fluids or alloys. With the same $F$ in eqn [4], a simple dynamic model in such cases reads

$$ \frac{\partial\psi}{\partial t} = \nabla^2 \frac{\delta F}{\delta\psi} - \nabla \cdot \mathbf{j}^R $$ [23]

Here $\mathbf{j}^R$ is the random current characterized by

$$ \big\langle j^R_\alpha(\mathbf{r}, t)\, j^R_\beta(\mathbf{r}', t') \big\rangle = 2\varepsilon\, \delta_{\alpha\beta}\, \delta(\mathbf{r} - \mathbf{r}')\, \delta(t - t') $$ [24]

which ensures the equilibrium distribution [3] of $\psi$. However, the noise $\mathbf{j}^R$ is negligible in late-stage phase separation as in the nonconserved case. Note that $h$ in the conserved case is the chemical potential
conjugate to $\psi$ and, if it is homogeneous, it vanishes in the dynamic equation [23]. In experiments the average order parameter

$$ M = \langle\psi\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})/V $$ [25]

is used as a control parameter instead of $h$, where the integral is within the system with volume $V$. If there is no flux from outside, $M$ is constant in time. Here the instability occurs below the so-called spinodal $M^2 < 1/3$ ($M^2 < |\tau|/3$ for general $\tau < 0$). In fact, small fluctuations $\psi_{\mathbf{k}}$ with wave vector $\mathbf{k}$ grow exponentially as

$$ \psi_{\mathbf{k}}(t) \sim \exp[k^2(1 - 3M^2 - k^2)t] $$ [26]

right after the quenching as in eqn [8]. The growth rate is largest at an intermediate wave number $k = k_m$ with

$$ k_m = [(1 - 3M^2)/2]^{1/2} $$ [27]

This behavior and the exponential growth of the structure factor have been observed in polymer mixtures where the parameter $\varepsilon$ in eqn [3] or [12] is expected to be small (Onuki 2002). In late-stage coarsening the peak position of $S(k, t)$ decreases in time as

$$ k_m(t) \sim 2\pi/\ell(t) $$ [28]

in terms of the domain size $\ell(t)$. The growth exponent in eqn [9] is given by 1/3 for the simple model [23] (see eqn [33] below).

Figure 2 shows the patterns after quenching in 2D. For $M = 0$ the two phases are symmetric and the patterns are bicontinuous, while for $M \neq 0$ the minority phase eventually appears as droplets in the percolating region of the majority phase.

[Figure 2: patterns after quenching in 2D; panel (a) $M = 0$ and a panel for $M = 0.1$, at times 20, 100, and 400.]

Interface Dynamics

Interface dynamics in the conserved case is much more complicated than in the nonconserved case, because the coarsening can proceed only through diffusion. Long-distance correlations arise among the domains and the interface velocity cannot be written in terms of the local quantities like the curvature. As a simple example, we give the counterpart of eqn [19]. In 3D a spherical droplet with $\psi \cong 1$ appears in a nearly homogeneous matrix with $\psi = M$ far from the droplet. The droplet radius $R$ is then governed by (Lifshitz and Slyozov 1961)

$$ \frac{\partial R}{\partial t} = D\Big(\frac{\Delta}{R} - \frac{2d_0}{R^2}\Big) $$ [29]

where $\Delta = (M + 1)/2$ is called the supersaturation, while $D$ and $d_0$ are constants (equal to 2 and $\sigma/8$, respectively, after the scaling). The critical radius is written as

$$ R_c = 2d_0/\Delta $$ [30]

The general definition of the supersaturation is

$$ \Delta = \big(M - \psi_{cx}^{(2)}\big) \Big/ \big(\psi_{cx}^{(1)} - \psi_{cx}^{(2)}\big) $$ [31]

Here the equilibrium values of $\psi$ are written as $\psi_{cx}^{(1)}$ and $\psi_{cx}^{(2)}$ and $M$ is supposed to be slightly different from $\psi_{cx}^{(2)}$.

Lifshitz and Slyozov (1961) analyzed domain coarsening in binary AB alloys when the volume fraction $q$ of the A-rich domains is small. They noticed that the supersaturation around each domain decreases in time with coarsening. That is, the A component atoms in the B-rich matrix are slowly absorbed onto the growing A-rich domains, while a certain fraction of the A-rich domains disappear. Thus, $q(t)$ and $\Delta(t)$ both depend on time, but satisfy the conservation law

$$ q(t) + \Delta(t) = \Delta(0) = (M + 1)/2 $$ [32]

With this overall constraint, they found the asymptotic late-stage behavior
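The linear growth rate of eqn [26], the fastest-growing wave number of eqn [27], and the conserved droplet law of eqns [29] and [30] can be explored with a few lines of Python. This is a minimal sketch rather than anything taken from the article: the function names are hypothetical, and the default values `D = 2.0` and `d0 = 0.125` assume the scaled units quoted above with an illustrative surface tension of 1.

```python
import math


def growth_rate(k, M):
    """Exponent per unit time in eqn [26]: k^2 (1 - 3 M^2 - k^2)."""
    return k * k * (1.0 - 3.0 * M * M - k * k)


def fastest_wave_number(M):
    """k_m of eqn [27]; it exists only inside the spinodal, M^2 < 1/3."""
    if 3.0 * M * M >= 1.0:
        raise ValueError("outside the spinodal region: no unstable modes")
    return math.sqrt((1.0 - 3.0 * M * M) / 2.0)


def droplet_velocity(R, supersaturation, D=2.0, d0=0.125):
    """dR/dt of eqn [29]: D (Delta / R - 2 d0 / R^2)."""
    return D * (supersaturation / R - 2.0 * d0 / R ** 2)


def critical_radius(supersaturation, d0=0.125):
    """R_c of eqn [30]: the radius at which dR/dt changes sign."""
    return 2.0 * d0 / supersaturation


M = 0.1
k_m = fastest_wave_number(M)
print(k_m, growth_rate(k_m, M))      # location and height of the growth peak

delta = 0.05                          # illustrative supersaturation
R_c = critical_radius(delta)
for R in (0.8 * R_c, R_c, 1.2 * R_c):
    print(R, droplet_velocity(R, delta))   # shrinks, stationary, grows
```

Scanning `growth_rate` over a grid of `k` values is a quick way to confirm that the maximum sits at `fastest_wave_number(M)` and that modes with $k^2 > 1 - 3M^2$ decay.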
are stable for infinitesimal fluctuations, but rare spatially localized fluctuations, called critical nuclei, can continue to grow, leading to macroscopic phase ordering (Onuki 2002, Debenedetti 1996). The birth of a critical droplet is governed by the Boltzmann factor $\exp(-F_c/k_BT)$ at finite temperatures, where $F_c$ is the free energy needed to create a critical droplet and $k_BT$ is the thermal energy with $k_B$ being the Boltzmann constant. In this section we explicitly write $k_BT$, but we may scale $\psi$ and space such that $\tau = -1$ at the final temperature.

Droplet Free Energy and Experiments

In the nonconserved case we prepare a spin-down state with $\psi \cong -1$ in the time region $t < 0$ and then apply a small positive field $h$ at $t = 0$. For $t > 0$ a spin-up droplet with radius $R$ requires a free energy change

$$ F(R) = 4\pi\sigma R^2 - \frac{8\pi}{3} h R^3 $$ [34]

The first term is the surface free energy and the second term is the bulk decrease due to $h$. The critical radius $R_c$ in eqn [20] gives the maximum of $F(R)$ given by

$$ F_c = \frac{4\pi}{3}\sigma R_c^2 $$ [35]

In fact, $F'(R) = \partial F(R)/\partial R$ is written as

$$ F'(R) = 8\pi\sigma(R - R^2/R_c) $$ [36]

In conserved systems such as fluids or alloys, we lower the temperature slightly below the coexistence curve with the average order parameter $M$ held fixed. We again obtain the droplet free energy [34], but

$$ h = (\sigma/2d_0)\Delta $$ [37]

in terms of the (initial) supersaturation $\Delta = \Delta(0)$. Let the equilibrium values $\psi_{cx}^{(1)}$ and $\psi_{cx}^{(2)}$ in the two

$M = \psi_{cx}^{(2)} = -A(T_c - T_{cx})^{\beta}$. In nucleation experiments the final temperature $T$ is slightly below $T_{cx}$ and $\delta T \equiv T_{cx} - T$ is a positive temperature increment. For small $\delta T$ we find

distribution $n(R, t)$ then obeys the Fokker–Planck equation

$$ \frac{\partial}{\partial t} n = \frac{\partial}{\partial R} L(R) \Big[\frac{\partial}{\partial R} + \frac{F'(R)}{k_BT}\Big] n $$ [39]

Here $n(R, t)dR$ denotes the droplet number density in the range $[R, R + dR]$. We determine the kinetic coefficient $L(R)$ such that

$$ v(R) \equiv -L(R)F'(R)/k_BT $$ [40]

is the right-hand side of eqn [19] or [29]. It is equal to $\partial R/\partial t$ when the thermal noise is neglected. Thus, $L(R) \propto R^{-2}$ or $R^{-3}$ for the nonconserved or conserved case. The second derivative $(\partial/\partial R)L(R)(\partial/\partial R)$ in eqn [39] stems from the thermal noise and is negligible for $R - R_c \gtrsim 1$ in 3D (Onuki 2002). Hence, for $R - R_c \gtrsim 1$, the droplets follow the deterministic equation [19] or [29] and $n$ obeys

$$ \frac{\partial}{\partial t} n = -\frac{\partial}{\partial R}[v(R)n] $$ [41]

In Figure 3, we plot the solution of eqn [39] for the conserved case with $F_c/k_BT = 17.4$ (Onuki 2002). The time is measured in units of $1/\Gamma_c$, which is the timescale of a critical droplet defined by

$$ \Gamma_c = (\partial v(R)/\partial R)_{R = R_c} $$ [42]

We notice $\Gamma_c \propto R_c^{-3}$ from eqn [29] so $\Gamma_c$ is small. The initial distribution is given by

$$ n(R, 0) = n_0 \exp(-4\pi\sigma R^2/k_BT) $$ [43]

[Figure 3: plot of the solution of eqn [39]; only axis tick labels survive in the source.]
with $n_0$ being a constant number density. This form has been observed in computer simulations as the droplet size distribution on the coexistence curve ($h = 0$). Figure 3 indicates that $n(R, t)$ tends to a steady solution $n_s(R)$ which satisfies

$$ -L(R)\Big[\frac{\partial}{\partial R} + \frac{F'(R)}{k_BT}\Big] n_s = I $$ [44]

where $I$ is a constant. Imposing the condition $n_s(R) \to 0$ as $R \to \infty$, we integrate the above equation as

$$ n_s(R) = I \int_R^{\infty} dR_1\, \frac{1}{L(R_1)} \exp\Big[\frac{F(R_1) - F(R)}{k_BT}\Big] $$ [45]

For $R - R_c \gg 1$ we may replace $F(R_1) - F(R)$ by $F'(R)(R_1 - R)$ in the integrand of eqn [45] to obtain

$$ n_s(R) \cong I/v(R) $$ [46]

which also follows from eqn [41]. Thus

$$ n_s(R)dR = I\,dt \qquad (dR = v(R)dt) $$ [47]

This means that $I$ is the nucleation rate of droplets with radii larger than $R_c$ emerging per unit volume and per unit time. Furthermore, as $R \to 0$, we require $n_s(R) \to n_0 = \text{const.}$ in eqn [43] so that

$$ n_0 = I \int_0^{\infty} dR_1\, \frac{1}{L(R_1)} \exp\Big[\frac{F(R_1)}{k_BT}\Big] $$ [48]

where the integrand becomes maximum around $R_c$. Using the expansion $F(R) = F_c + F''(R_c)(R - R_c)^2/2 + \cdots$, we obtain the famous formula for the nucleation rate

$$ I = I_0 \exp(-F_c/k_BT) $$ [49]

$$ \phantom{I} = I_0 \exp(-C_0/\Delta^2) $$ [50]

where the coefficient $I_0$ is of order $n_0\Gamma_c$. The second line holds in the 3D conserved case. Here, $C_0 \sim 10^{-3}$ typically and $I_0$ is a very large number in units of $\text{cm}^{-3}\,\text{s}^{-1}$, say, $10^{30}$. Then the exponential factor in $I$ changes abruptly from a very small to a very large number with only a slight increase of $\Delta$ at small $\Delta \ll 1$. For example, if $C_0/\Delta^2 = 50$, $I$ is increased by $\exp(100\,\delta\Delta/\Delta)$ with a small increase of $\Delta$ to $\Delta + \delta\Delta$. This factor can be of order $10^3$ even for $\delta\Delta/\Delta = 0.05$. Unless very close to criticality, simple metastable fluids become opaque suddenly with increasing $\Delta$ or $\delta T$ at a rather definite cloud point. In near-critical fluids, however, $I_0$ itself becomes small ($\propto \xi^{-6}$) such that the cloud point considerably depends on the experimental timescale (observation time).

Remarks

The order parameter can be a scalar, a vector as in the Heisenberg spin system, a tensor as in liquid crystals, and a complex number as in superfluids and superconductors. In phase ordering a crucial role is played by topological singularities like interfaces in the scalar case and vortices in the complex number case. Furthermore, a rich variety of phase transition dynamics can be explained if the order parameter is coupled to other relevant variables in the free energy and/or in the dynamic equations. We mention couplings to velocity field in fluids, electrostatic field in charged systems, and elastic field in solids. Phase ordering can also be influenced profoundly by external fields such as electric field or shear flow.

See also: Reflection Positivity and Phase Transitions; Renormalization: Statistical Mechanics and Condensed Matter; Statistical Mechanics of Interfaces; Topological Defects and Their Homotopy Classification.

Further Reading

Allen SM and Cahn JW (1979) Microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metallurgica 27: 1085.
Binder K (1991) Spinodal decomposition. In: Cohen RW, Haasen P, and Kramer EJ (eds.) Material Sciences and Technology, vol. 5. Weinheim: VCH.
Bray AJ (1994) Theory of phase-ordering kinetics. Advances in Physics 43: 357.
Cahn JW (1961) On spinodal decomposition. Acta Metallurgica 9: 795.
Debenedetti PG (1996) Metastable Liquids. Princeton: Princeton University.
Gunton JD, San Miguel M, and Sani PS (1983) The dynamics of first-order phase transitions. In: Domb C and Lebowitz JL (eds.) Phase Transition and Critical Phenomena, vol. 8. London: Academic Press.
Lifshitz IM and Slyozov VV (1961) The kinetics of precipitation from supersaturated solid solutions. Journal of Physics and Chemistry of Solids 19: 35.
Ohta T, Jasnow D, and Kawasaki K (1982) Universal scaling in the motion of random interfaces. Physical Review Letters 49: 1223.
Onuki A (2002) Phase Transition Dynamics. Cambridge: Cambridge University Press.
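The relations between the droplet free energy [34], its maximum [35], and its slope [36] can be verified numerically. The Python sketch below is an illustration added for this purpose and is not taken from the article; the function names are hypothetical, and the values `sigma = 1.0` and `h = 0.1` are arbitrary choices in the scaled units of the text.

```python
import math


def critical_radius(sigma, h):
    """R_c = sigma / h, eqn [20]."""
    return sigma / h


def droplet_free_energy(R, sigma, h):
    """F(R) of eqn [34]: surface cost minus bulk gain."""
    return 4.0 * math.pi * sigma * R ** 2 - (8.0 * math.pi / 3.0) * h * R ** 3


def free_energy_slope(R, sigma, h):
    """F'(R) of eqn [36], written in terms of R_c."""
    return 8.0 * math.pi * sigma * (R - R ** 2 / critical_radius(sigma, h))


def barrier_height(sigma, h):
    """F_c of eqn [35]: the value of F at the critical radius."""
    return (4.0 * math.pi / 3.0) * sigma * critical_radius(sigma, h) ** 2


sigma, h = 1.0, 0.1                    # illustrative values only
R_c = critical_radius(sigma, h)
print(R_c, droplet_free_energy(R_c, sigma, h), barrier_height(sigma, h))

# Central finite difference as an independent check of eqn [36].
R, eps = 7.0, 1e-4
numeric = (droplet_free_energy(R + eps, sigma, h)
           - droplet_free_energy(R - eps, sigma, h)) / (2.0 * eps)
print(numeric, free_energy_slope(R, sigma, h))
```

The slope is positive below $R_c$ and negative above it, which is the statement that subcritical droplets must climb the barrier $F_c$ while supercritical ones grow spontaneously.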
Phase Transitions in Continuous Systems 53
infinite systems, it is proved that the DLR states can is the segment {0 T Tc , h = 0}, in the (T, h)
be directly characterized (i.e., without using limit plane, h being the magnetic field. In the upper-half
procedures) as the solutions of a set of equations, plane, there is a single phase with positive magne-
the ‘‘DLR equations,’’ which generalize the finite- tization, in the lower one with a negative value; at
volume Gibbs prescription. h = 0, positive and negative magnetization states can
In terms of DLR states, the mathematical meaning coexist, if the temperature is lower than the critical
of phase transitions becomes very clear and sharp. value Tc . Correspondingly, there are, simulta-
The starting point is the proof that the physical property that intensive variables in a
pure phase have negligible fluctuations is verified by all the DLR measures which are in a
special class, thus selected by this property, and which are therefore interpreted as
"pure phases." All the other DLR measures are proved to be mixtures, that is, general
convex combinations, of the pure DLR states. Thus, in the DLR theory, the system is in a
single phase when there is only one DLR state, at the given values of the thermodynamic
parameters (e.g., temperature and chemical potential), while the system is at a phase
transition if there are several distinct DLR states.

While the theory beautifully clarifies the meaning of phase transitions, it does not say
whether the phenomenon really occurs! This is maybe the main open problem in equilibrium
statistical mechanics. A general proof of existence of phase diagrams is needed, which
should at least capture the basic property behind the Gibbs phase rule, namely that in
most of the space (of thermodynamic parameters) there is a single phase, with rare
exceptions where several phases coexist. A more refined result should then indicate that
coexistence occurs only on regular surfaces of positive codimension.

There is, however, a general result of existence of the gaseous phase, with a proof of
uniqueness of DLR measures when temperature is large and density low. Coexistence of
phases is much less understood at a general level, but results for particular classes of
models exist, for instance, in lattice systems at low temperatures. The prototype is the
ferromagnetic Ising model in two or more dimensions, where indeed the full diagram has
been determined, see Figure 2. The transition curve

Figure 2 Phase diagram of the Ising ferromagnet (axes $h$ and $T$; $T_c$ marked on the $T$ axis).

[...]neously, a positive and a distinctly negative DLR state, which describe the two phases.

An analogous result is missing for systems of particles in the continuum, but there has
been recent progress on the analysis of the liquid–vapor branch of the phase diagram, and
the issue will be the main focus of this article.

Sensitive Dependence on Boundary Conditions

Phase transitions describe exceptional regimes where the system is in a critical state;
this is why they are so interesting and difficult to study. As in chaotic systems,
criticality corresponds to a "butterfly effect," which, in a statistical-mechanics
setting, means changing far-away boundary conditions. Such changes affect the neighbors,
which in turn influence their neighbors, and so on. In general, the effect decays with
the distance but, at phase transition, it provokes an avalanche which propagates
throughout the system reaching all its points. Its occurrence is not at all obvious, if
we remember the stochastic nature of the theory. The domino effect described above can in
fact, at each step, be subverted by stochastic fluctuations. The latter, in the end, may
completely hide the effect of changing the boundary conditions. This is an instance of a
competition between energy and entropy which is the ruling phenomenon behind phase
transitions.

This intuitive picture also explains the relevance of space dimensionality. In a
many-dimensional space, the influence of the boundary conditions has clearly many more
ways to percolate, in contrast to the one-dimensional case, where in fact there is a
general result on the uniqueness of DLR measures and therefore absence of phase
transitions, for short-range interactions. For pair potentials, "short" means that the
interaction energy between two molecules, respectively at $r$ and $r'$, decays as
$|r - r'|^{-\alpha}$, $\alpha > 2$. There are results on the converse, namely on the
presence of phase transitions when the above condition is not satisfied, mainly for
lattice systems, but with partial extensions also to continuous systems. One-dimensional
and long-range cases are not the main focus of this article, and the issue will not be
discussed further here.
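The decay of boundary influence in one dimension can be made concrete on the simplest lattice example. The sketch below is an illustration that is not part of the article: it assumes a nearest-neighbour Ising chain whose spin at site 0 is fixed to +1 and whose far end is free, and the names `boundary_influence` and `coupling` are hypothetical. It propagates the distribution of the spin site by site with the 2x2 transfer matrix and returns the mean spin at distance `n`, which for this chain equals `tanh(beta * coupling) ** n` and therefore vanishes exponentially at any finite temperature.

```python
import math


def boundary_influence(beta, coupling, n):
    """Mean spin at site n of an Ising chain with the spin at site 0 fixed to +1.

    Assumptions (not from the article): nearest-neighbour coupling, zero
    external field, free boundary at the far end of the chain.
    """
    same = math.exp(beta * coupling)    # Boltzmann weight of two aligned neighbours
    diff = math.exp(-beta * coupling)   # Boltzmann weight of two opposite neighbours
    plus, minus = 1.0, 0.0              # distribution of the spin at site 0
    for _ in range(n):
        # One transfer-matrix step, followed by normalisation.
        plus, minus = plus * same + minus * diff, plus * diff + minus * same
        total = plus + minus
        plus, minus = plus / total, minus / total
    return plus - minus


if __name__ == "__main__":
    for distance in (1, 5, 20):
        print(distance, boundary_influence(beta=1.0, coupling=1.0, n=distance))
```

Because the far end is free, sites beyond `n` do not change the marginal at site `n`, so the loop can stop there. The exponential decay is the lattice counterpart of the uniqueness statement for short-range interactions in one dimension.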
The simpler Ising picture should instead reappear at the liquid–vapor coexistence line.
Looking at the fluid on a proper spatial scale, we should in fact see a density that is
essentially constant, except for small and rare fluctuations. Its value will differ in
the liquid and in the gaseous states, $\rho_{\rm gas} < \rho_{\rm liq}$. Therefore,
density is an order parameter for the transition and plays the role of the spin
magnetization in the Ising picture.

There are general mathematical techniques developed to translate these ideas into proofs;
they involve "coarse graining," "block spin transformations," and "renormalization group"
procedures. The starting point is to ideally divide the space into cells. Their size
should be chosen to be much larger than the typical microscopic distance between
molecules, to depress fluctuations of the particle density in a cell. To study the
probability distribution of the latter, we integrate out all the other degrees of
freedom. After such a coarse graining, we are left with a system of spins on a lattice,
the lattice sites labeling the cells (also called blocks) and each spin (also called
block spin) giving the value of the density of particles in the corresponding cell.
Translated into the language of block spins, the previous physical analysis of the state
of the fluid suggests that most probably, in each block the density is approximately
equal to either $\rho_{\rm liq}$ or $\rho_{\rm gas}$, and the same in different blocks,
except in the case of small and rare fluctuations. If we represent the probability
distribution of the block spins in terms of a Gibbs measure (as always possible if the
system is in a bounded region), the previous picture is compatible with a new Hamiltonian
with a single spin (one-body) potential which favors the two values $\rho_{\rm liq}$ and
$\rho_{\rm gas}$ and an attractive interaction between spins which suppresses changes
from one to the other. A new effective low temperature should finally dampen the
fluctuations.

Thus, after coarse graining, the system should be in the same universality class as the
low-temperature Ising model, and we may hope, in this way, to extend to the liquid–vapor
branch of the phase diagram the Pirogov–Sinai theory of low-temperature lattice systems.
In particular, as in the Ising model, we will then be able to select the liquid or the
vapor phases by the introduction of suitable boundary conditions.

The conditional tense arises because the computation of the coarse graining
transformation is in general very difficult, if not impossible, to carry out, but there
is a class of systems where it has been accomplished. These are systems of identical
point particles in $\mathbb{R}^d$, $d \ge 2$, which interact with "special" two- and
four-body potentials, having finite range and which can be chosen to be rotation and
translation invariant; their specific form will be described later. For such systems, the
above coarse graining picture works and it has been proved that in a "small" region of
the temperature–chemical potential plane, there is a part of the curve where two distinct
phases coexist, while elsewhere in the neighborhood, the phase is unique.

The ideas behind the choice of the Hamiltonian go back to van der Waals, and the
Ginzburg–Landau theory, which are milestones in the theory of phase transitions, while
the mathematics of variational problems also enters here in an important way. These are
briefly discussed in the next sections.

The van der Waals Liquid–Vapor Transition

Let us then take a step back and recall the van der Waals theory of the liquid–vapor
transition. As typical intermolecular forces have a strong repulsive core and a rather
long attractive tail, in a continuum, mesoscopic approximation the system will be
described by a free-energy functional of the type
\[
F(\rho) = \int_\Lambda f^0_{\beta,\lambda}(\rho(r))\,dr
  - \frac{1}{2}\int_\Lambda\!\int_\Lambda J(r,r')\,\rho(r)\,\rho(r')\,dr\,dr'
\tag{1}
\]
where $\rho = \{\rho(r), r \in \Lambda\}$ is the particles density and $\Lambda$ the
region where the system is confined, which, for simplicity, is taken here as a torus in
$\mathbb{R}^d$, consisting of a cube with periodic boundary conditions. The term
$-J(r,r')\rho(r)\rho(r')$, $J(r,r') \ge 0$, is the energy due to the attractive tail of
the interaction, which is periodic in $\Lambda$;
$f^0_{\beta,\lambda}(\rho) = f^0_{\beta,0}(\rho) - \lambda\rho$ is the free-energy
density due to the short, repulsive part of the interaction, $\lambda$ being the chemical
potential.

As noted later, [1] can be rigorously derived by a coarse graining transformation; it
will be used to build a bridge between the van der Waals theory and the previous block
spin analysis of the liquid–vapor phase transition. Let us take for the moment [1] as a
primitive notion. By invoking the second principle of thermodynamics, the equilibrium
states can be found by minimizing the free-energy functional. Supposing $J$ to be
translation invariant, that is, $J(r,r') = J(r+a, r'+a)$, $r, r', a \in \mathbb{R}^d$,
and calling $\alpha = \int J(r,r')\,dr'$ the intensity of $J$, we can rewrite $F(\rho)$
as
\[
F(\rho) = \int_\Lambda \Big\{ f^0_{\beta,\lambda}(\rho(r)) - \frac{\alpha\,\rho(r)^2}{2} \Big\}\,dr
  + \frac{1}{4}\int_\Lambda\!\int_\Lambda J(r,r')\,[\rho(r) - \rho(r')]^2\,dr\,dr'
\tag{2}
\]
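The passage from [1] to [2] is a completion of the square. The following short derivation is a sketch added for convenience; it uses only the translation invariance of $J$ on the torus $\Lambda$ stated above, which makes both $\int J(r,r')\,dr'$ and $\int J(r,r')\,dr$ equal to the intensity $\alpha$ (symbol names follow [1] and [2]).

```latex
\begin{align*}
\rho(r)\rho(r') &= \tfrac{1}{2}\big[\rho(r)^2 + \rho(r')^2\big]
                 - \tfrac{1}{2}\big[\rho(r) - \rho(r')\big]^2 , \\[4pt]
-\frac{1}{2}\iint J(r,r')\,\rho(r)\rho(r')\,dr\,dr'
  &= -\frac{1}{4}\iint J(r,r')\big[\rho(r)^2 + \rho(r')^2\big]\,dr\,dr'
     + \frac{1}{4}\iint J(r,r')\big[\rho(r) - \rho(r')\big]^2\,dr\,dr' \\
  &= -\frac{\alpha}{2}\int \rho(r)^2\,dr
     + \frac{1}{4}\iint J(r,r')\big[\rho(r) - \rho(r')\big]^2\,dr\,dr' .
\end{align*}
```

Adding $\int f^0_{\beta,\lambda}(\rho(r))\,dr$ to both sides reproduces [2]. Since $J \ge 0$, the last double integral is non-negative and it vanishes when $\rho$ is constant.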
This shows that the minimizer must have $\rho(r)$ constant (so that the second integral
is minimized) and equal to any value which minimizes the function
$\{f^0_{\beta,\lambda}(\rho) - \alpha\rho^2/2\}$. By thermodynamic principles, the free
energy $f^0_{\beta,\lambda}(\rho)$ is convex in $\rho$, but, if $\alpha$ is large enough,
the above expression is not convex and, by properly choosing the value of $\lambda$, the
minimizers are no longer unique, hence the van der Waals phase transition.

Kac Potentials

The analogy between the above analysis of [2] and the previous heuristic study of the
fluid based on coarse graining is striking. As customary in continuum theory, each
mesoscopic point $r$ should be regarded as representative of a cell containing many
molecules. Then the functional $F(\rho)$ can be interpreted as the effective Hamiltonian
after coarse graining. The role of the one-body term is played in [2] by the curly
bracket, which selects two values of $\rho$ (its minimizers, to be identified with
$\rho_{\rm liq}$ and $\rho_{\rm gas}$); the attractive two-body potential is then related
to the last term in [2], as it suppresses the variations of $\rho$. The analogy clearly
suggests a strategy for a rigorous proof of phase transitions in the continuum, an
approach which has been and still is actively pursued. It will be discussed briefly in
the sequel.

The first rigorous derivation of the van der Waals theory in a statistical-mechanics
setting goes back to the 1960s and to Kac, who proposed a model where the particle pair
interaction is
\[
-\alpha\,\gamma^d\, e^{-\gamma |q_i - q_j|} + \text{hard core}, \qquad \alpha, \gamma > 0
\tag{3}
\]
The phase diagram of such systems, after the thermodynamic limit, can be quite explicitly
determined in the limit $\gamma \to 0$, where it has been proved to converge to the van
der Waals phase diagram, under a proper choice of $f^0_{\beta,\lambda}(\cdot)$ in [1].

The characteristic features of the first term in [3] are: (1) very long range, which
scales as $\gamma^{-1}$, and (2) very small intensity, which scales as $\gamma^d$, so
that the total intensity of the potential, defined as the integral over the second
position, is independent of $\gamma$. The additional hard-core term (which imposes that
any two particles cannot get closer than $2R_0$, $R_0 > 0$ being the hard-core radius) is
to ensure stability of matter, that is, to avoid collapse of the whole system on an
infinitesimally small region, as it would happen if only the attractive part of the
interaction were present.

Derivation of the van der Waals theory has been proved for a general class of Kac
potentials, where the exponential term in [3] is replaced by functions whose dependence
on $\gamma$ has the same scaling properties as mentioned above (in (1) and (2)), while
the hard core can be replaced by suitably repulsive interactions.

The proof, in the version proposed by Lebowitz and Penrose, uses coarse graining and
shows that the effective Hamiltonian is well approximated by the van der Waals functional
[1], when $\gamma$ is small, while the effective temperature scales as $\gamma^d$. The
approximation becomes exact in the limit $\gamma \to 0$, where it reduces the computation
of the partition function to the analysis of the minima and the ground states of an
effective Hamiltonian which, in the limit $\gamma \to 0$, is exactly the van der Waals
functional.

A true proof of phase transitions requires instead to keep $\gamma > 0$ fixed (instead of
letting $\gamma \to 0$) and thus to control the difference of the effective Hamiltonian
after coarse graining and the van der Waals functional, which is the effective
Hamiltonian, but only in the actual limit $\gamma \to 0$. In general, there is no
symmetry between the two ground states, unlike in the Ising case where they are related
by spin flip, and the Pirogov–Sinai theory thus enters into play. The framework in fact
is exactly similar, with the lattice Hamiltonian replaced by the functional and low
temperatures by small $\gamma$ (recall that the effective temperature scales as
$\gamma^d$). The extension of the theory to such a setting, however, presents
difficulties and success has so far been only partial.

A Model for Phase Transitions in the Continuum

The problem is twofold: to have a good control of (1) the limit theory and (2) the
perturbations induced by a nonzero value of the Kac parameter $\gamma$. The former falls
in the category of variational problems for integral functionals, whose prototype is the
Ginzburg–Landau free energy
\[
F^{\rm gl}(\rho) = \int \big\{ w(\rho) + |\nabla\rho|^2 \big\}\,dr
\tag{4}
\]
which can be regarded as an approximation of [2] with $w$ equal to the curly bracket in
[2] and $J$ replaced by a $\delta$-function. Minimization problems for this and similar
functionals have been widely analyzed in the context of general variational problems
theory and partial differential equations (PDEs), and the study of the limit theory can
benefit from a vast literature on the subject. The analysis of the corrections due to
small $\gamma$ is, however, so far quite limited. To implement the Pirogov–Sinai
strategy, we need, in the case of the interaction [3],
a very detailed knowledge of the system without the Kac part of the interaction and with
only hard cores. This, however, is so far not available when the particle density is near
to close-packing (i.e., the maximal density allowed by the hard-core potential).
Replacing hard cores by other short-range repulsive interactions does not help either,
and this seems the biggest obstacle to the program.

The difficulty, however, can be avoided by replacing the hard-core potential by a
repulsive many-body (more than two) Kac potential, which ensures stability as well. The
class of systems covered by the approach is characterized by Hamiltonians of the form
\[
H_{\gamma,\lambda}(q) = \int_{\mathbb{R}^d} e_\lambda(\rho_\gamma(r))\,dr
\tag{5}
\]
where $e_\lambda(\rho)$ is a polynomial of the scalar field variable $\rho$, a specific
example being
\[
e_\lambda(\rho) = \frac{\rho^4}{4!} - \frac{\rho^2}{2} - \lambda\rho
\tag{6}
\]
This form of the Hamiltonian is familiar from Euclidean field theories. In these
theories, the free distribution of the field is Gaussian; in our case, however, the field
$\rho_\gamma = \rho_\gamma(r)$ is a function of the particle configurations
$q = (q_i, i = 1, \ldots, n)$:
\[
\rho_\gamma(r) = j_\gamma * q(r) = \sum_{i=1}^{n} j_\gamma(r, q_i), \qquad
j_\gamma(r, r') = \gamma^d\, j(\gamma r, \gamma r')
\tag{7}
\]
where $j(r, r')$ is a translation-invariant, symmetric transition probability kernel.
Thus, $\rho_\gamma(r)$ is a non-negative variable which has the meaning of a local
density at $r$, weighted by the Kac kernel $j_\gamma(r, r')$.

Contours and Phase Indicators

The dependence on $\gamma$ yields the scaling properties characteristic of the Kac
potentials and [5] may be regarded as a generalized Kac Hamiltonian, which, in the
polynomial case of [6], involves up to four-body Kac potentials. The phase diagram of the
model, after taking first the thermodynamic limit and then the limit $\gamma \to 0$, is
determined by the free-energy functional
\[
F_{\beta,\lambda}(\rho) = \int \Big\{ e_\lambda(j * \rho(r)) - \frac{S(\rho(r))}{\beta} \Big\}\,dr
\tag{8}
\]
\[
S(\rho) = -\rho(\log\rho - 1)
\tag{9}
\]
where [8] is taken to be defined on a torus (to avoid convergence problems of the
integral), and $j = j_\gamma$, $\gamma = 1$.

Exploiting the concavity of the entropy $S(\rho)$, it is proved that the minimizers of
$F_{\beta,\lambda}(\cdot)$ are constant functions with the constants minimizing
\[
f_{\beta,\lambda}(u) = e_\lambda(u) - \frac{S(u)}{\beta}, \qquad u \ge 0
\tag{10}
\]
In the case of [6], to which we restrict in the sequel, for any $\beta > (3/2)^{3/2}$
there is $\lambda(\beta)$ so that $f_{\beta,\lambda(\beta)}(u)$ is double-well with two
minimizers, $\rho_{\rm gas} < \rho_{\rm liq}$ (dependence on $\beta$ is omitted).

To "recognize" the densities $\rho_{\rm gas}$ and $\rho_{\rm liq}$ in a particle
configuration, we use coarse graining and introduce two partitions of $\mathbb{R}^d$ into
cubes $C^{(\ell_{\pm,\gamma})}$. The cubes $C^{(\ell_{-,\gamma})}$ of the first partition
have side $\ell_{-,\gamma}$, proportional to $\gamma^{-1+\alpha}$, $\alpha > 0$ suitably
small; those of the second one have length $\ell_{+,\gamma}$, proportional to
$\gamma^{-1-\alpha}$; they are chosen so that each cube $C^{(\ell_{+,\gamma})}$ is union
of cubes $C^{(\ell_{-,\gamma})}$. Notice that the small cubes have side much smaller than
the interaction range (for small $\gamma$), while the opposite is true for the large
cubes.

Given a particle configuration $q$, we say that a point $r$ is in the liquid phase and
write $\eta(r; q) = 1$, if
\[
\left| \frac{|q \cap C^{(\ell_{-,\gamma})}|}{\ell_{-,\gamma}^{\,d}} - \rho_{\rm liq} \right| \le a,
\qquad a > 0 \text{ suitably small}
\tag{11}
\]
for any small cube $C^{(\ell_{-,\gamma})}$ contained either in $C_r^{(\ell_{+,\gamma})}$
or in the cubes $C^{(\ell_{+,\gamma})}$ contiguous to $C_r^{(\ell_{+,\gamma})}$:
$|q \cap C^{(\ell_{-,\gamma})}|$ is referred to as the number of particles of $q$ in
$C^{(\ell_{-,\gamma})}$, and $C_r^{(\ell_{+,\gamma})}$ as the large cube which contains
$r$. Thus, $\eta(r; q) = 1$ if the local particle density is constantly close to
$\rho_{\rm liq}$ in a large region around $r$. Defining $\eta(r; q) = -1$ if the above
holds with $\rho_{\rm gas}$ instead of $\rho_{\rm liq}$ and setting $\eta(r; q) = 0$ in
all the other cases, we then have a phase indicator $\eta(r; q)$, which identifies, for
all particle configurations, which spatial regions should be attributed to the liquid and
gas phases. The connected components of the complementary region are called contours and
the definition of $\eta(r; q)$ has been structured in such a way that liquid and gas are
always separated by a contour. The liquid phase will then be represented by a measure
which gives large probability to configurations having mostly $\eta = 1$, while the gas
phase by configurations with mostly $\eta = -1$.

This is quite similar to the Ising picture and, as in the Ising model, the existence of a
phase transition follows from a Peierls estimate that contours have small probability. In
fact, if there are few contours, the phase imposed on the boundaries of the region
where the system is observed percolates inside, invading most of the space. Thus,
boundary conditions select the phase in the whole volume. The absence of the short-range
potential, which was the hard-core interaction in [3], and hence the absence of all the
difficulties which originate from it, allow one to carry through successfully the
Pirogov–Sinai program and prove Peierls estimates on contours and, hence, the existence
of a phase transition. In particular, the statistical weight of a contour is estimated by
first relating the computation to one involving the functional [8] and then computing its
value on density profiles compatible with the existence of the given contour. This part
of the problem needs variational analysis for [8], with constraints and benefits of a
vast literature on the subject.

The phase transition is very sharp, as shown by the following ideal experiment. Having
fixed $\beta > (3/2)^{3/2}$, let $\lambda$ vary in a (suitably) small interval
$[\lambda(\beta) - \delta, \lambda(\beta) + \delta]$, $\delta > 0$, centered around the
mean-field critical value $\lambda(\beta)$. We consider the system in a large region
with, for instance, boundary conditions $\eta = -1$ (i.e., forcing the gas phase) and fix
$\gamma$ small enough. At $\lambda = \lambda(\beta) - \delta$, the system has
$\eta = -1$ in most of the domain, and this persists when we increase $\lambda$ till a
critical value, $\lambda_{\beta,\gamma}$, close to, but not the same as
$\lambda(\beta)$. For $\lambda > \lambda_{\beta,\gamma}$, $\eta = 1$ in most of the
domain, except for a small layer around the boundaries. The analogous picture holds if we
choose boundary conditions $\eta = 1$, and $\lambda = \lambda_{\beta,\gamma}$ is the only
value of the chemical potential where the system is sensitive to the boundary conditions
and both phases can be produced by the right boundary conditions. The fact that the
actual value $\lambda_{\beta,\gamma}$ differs from $\lambda(\beta)$ is characteristic of
the Pirogov–Sinai approach and enlightens the delicate nature of the proofs.

[...] imposing a total density (or magnetization in the case of spins) intermediate
between those of the pure phases. There will then be an interface separating the two
phases with a corresponding surface tension and the geometry will be determined by the
solution of a variational problem and given by the Wulff shape.

Can statistical mechanics explain and describe the phenomenon? Important progress has
been made recently on the subject in the case of lattice systems at low temperatures. The
question has also been widely studied at the mesoscopic level, in the context of
variational problems for Ginzburg and Landau and many other functionals. Therefore, all
the ingredients of further development of the theory in this direction are now present.

We have so far discussed only classical systems; a few words about extensions to the
quantum case are now in order. In the range of values of temperatures and densities where
the liquid–vapor transition occurs, the quantum effects are not expected to be relevant.
Referring to the case of bosons, and away from the Bose condensation regime (and for
systems with Boltzmann statistics as well), the quantum delocalization of particles
caused by the indeterminacy principle should essentially disappear after macroscopic
coarse graining, and the block-spin variables should again behave classically, even
though their underlying constituents are quantal. If this argument proves correct, then
progress along these lines may be expected in the near future.

See also: Cluster Expansion; Ergodic Theory; Finite Group Symmetry Breaking;
Pirogov–Sinai Theory; Reflection Positivity and Phase Transitions; Statistical Mechanics
and Combinatorial Problems; Statistical Mechanics of Interfaces; Symmetry Breaking in
Field Theory; Two-Dimensional Ising Model.
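The threshold $(3/2)^{3/2}$ quoted for the model [6] can be checked directly from [9] and [10]. The sketch below is an added illustration, not the article's argument: it uses the fact that the second derivative of $f_{\beta,\lambda}(u) = e_\lambda(u) - S(u)/\beta$ is $u^2/2 - 1 + 1/(\beta u)$, independent of $\lambda$, and that this expression is smallest at $u = \beta^{-1/3}$. The function names are hypothetical.

```python
import math

# Mean-field critical inverse temperature quoted in the text for the model [6].
BETA_C = 1.5 ** 1.5


def second_derivative(u, beta):
    """d^2/du^2 of f(u) = u**4/24 - u**2/2 - lam*u + u*(log(u) - 1)/beta, for u > 0.

    The chemical potential lam drops out after two derivatives.
    """
    return u * u / 2.0 - 1.0 + 1.0 / (beta * u)


def has_nonconvex_region(beta):
    """True if f is non-convex somewhere, which a double well requires."""
    u_star = beta ** (-1.0 / 3.0)   # minimiser of the second derivative over u > 0
    return second_derivative(u_star, beta) < 0.0


if __name__ == "__main__":
    for beta in (1.5, BETA_C, 2.0, 4.0):
        print(beta, has_nonconvex_region(beta))
```

At `u_star` the second derivative equals `1.5 * beta ** (-2/3) - 1`, which is negative exactly when `beta > BETA_C`. Loss of convexity is only the necessary condition; locating $\lambda(\beta)$ and the two minimizers would need an additional equal-depth (common tangent) computation that is not attempted here.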
Pirogov–Sinai Theory
R Kotecký, Charles University, Prague, Czech Republic, and the University of Warwick, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Pirogov–Sinai theory is a method developed to study the phase diagrams of lattice models
at low temperatures. The general claim is that, under appropriate conditions, the phase
diagram of a lattice model is, at low temperatures, a small perturbation of the
zero-temperature phase diagram designed by ground states. The treatment can be
generalized to cover temperature driven transitions with coexistence of ordered and
disordered phases.

Formulation of the Main Result

Setting

Refraining first from full generality, we formulate the result for a standard class of
lattice models with finite spin state and finite-range interaction. We will mention
different generalizations later.

We consider classical lattice models on the $d$-dimensional hypercubic lattice
$\mathbb{Z}^d$ with $d \ge 2$. A spin configuration
$\sigma = (\sigma_x)_{x \in \mathbb{Z}^d}$ is an assignment of a spin with values in a
finite set $S$ to each lattice site $x \in \mathbb{Z}^d$; the configuration space is
$\Omega = S^{\mathbb{Z}^d}$. For $\sigma \in \Omega$ and $\Lambda \subset \mathbb{Z}^d$,
we use $\sigma_\Lambda \in \Omega_\Lambda = S^\Lambda$ to denote the restriction
$\sigma_\Lambda = \{\sigma_x; x \in \Lambda\}$.

The Hamiltonian is given in terms of a collection of interaction potentials $(\Phi_A)$,
where $\Phi_A$ are real functions on $\Omega$, depending only on $\sigma_x$ with
$x \in A$, and $A$ runs over all finite subsets of $\mathbb{Z}^d$. We assume that the
potential is periodic with finite range of interactions. Namely,
$\Phi_{A'}(\sigma') = \Phi_A(\sigma)$ whenever $A$ and $\sigma$ are related to $A'$ and
$\sigma'$ by a translation from $(a\mathbb{Z})^d$ for some fixed integer $a$ and there
exists $R \ge 1$ such that $\Phi_A \equiv 0$ for all $A$ with diameter exceeding $R$.
Without loss of generality (possibly multiplying the number $a$ by an integer and
increasing $R$), we may assume that $R = a$.

The Hamiltonian $H_\Lambda(\sigma_\Lambda | \bar\sigma)$ in $\Lambda$ with boundary
conditions $\bar\sigma \in \Omega$ is then given by
\[
H_\Lambda(\sigma_\Lambda | \bar\sigma)
  = \sum_{A \cap \Lambda \neq \emptyset} \Phi_A(\sigma_\Lambda \vee \bar\sigma_{\Lambda^c})
\tag{1}
\]
where $\sigma_\Lambda \vee \bar\sigma_{\Lambda^c} \in \Omega$ is the configuration
$\sigma_\Lambda$ extended by $\bar\sigma_{\Lambda^c}$ on $\Lambda^c$. The Gibbs state in
$\Lambda$ under boundary conditions $\bar\sigma \in \Omega$ (and with Hamiltonian $H$) is
the probability $\mu_\Lambda(\cdot | \bar\sigma)$ on $\Omega_\Lambda$ defined by
\[
\mu_\Lambda(\{\sigma_\Lambda\} | \bar\sigma)
  = \frac{\exp\{-\beta H_\Lambda(\sigma_\Lambda | \bar\sigma)\}}{Z(\Lambda | \bar\sigma)}
\tag{2}
\]
with the partition function
\[
Z(\Lambda | \bar\sigma) = \sum_{\sigma_\Lambda} \exp\{-\beta H_\Lambda(\sigma_\Lambda | \bar\sigma)\}
\tag{3}
\]
We use $\mathcal{G}(H)$ to denote the set of all periodic Gibbs states with Hamiltonian
$H$ defined on $\Omega$ by means of the Dobrushin–Lanford–Ruelle (DLR) equations.

Ground-State Phase Diagram and the Removal of Degeneracy

A periodic configuration $\sigma \in \Omega$ is called a (periodic) ground state of a
Hamiltonian $H = (\Phi_A)$ if
\[
H(\tilde\sigma; \sigma) = \sum_{A} \big( \Phi_A(\tilde\sigma) - \Phi_A(\sigma) \big) \ge 0
\tag{4}
\]
for every finite perturbation $\tilde\sigma \neq \sigma$ of $\sigma$ ($\tilde\sigma$
differs from $\sigma$ at a finite number of lattice sites). We use $g(H)$ to denote the
set of all periodic ground states of $H$. For every configuration $\sigma \in g(H)$, we
define the specific energy $e_\sigma(H)$ by
\[
e_\sigma(H) = \lim_{n \to \infty} \frac{1}{|V_n|} \sum_{A \cap V_n \neq \emptyset} \Phi_A(\sigma)
\tag{5}
\]
(with $V_n$ denoting a cube consisting of $n^d$ lattice sites).

To investigate the phase diagram, we will consider a parametric class of Hamiltonians
around a fixed Hamiltonian $H^{(0)}$ with a finite set of periodic ground states
$g(H^{(0)}) = \{\sigma_1, \ldots, \sigma_r\}$. Namely, let $H^{(0)}, H^{(1)}, \ldots$,
and $H^{(r-1)}$ be Hamiltonians determined by potentials
$\Phi^{(0)}, \Phi^{(1)}, \ldots$, and $\Phi^{(r-1)}$, respectively, and consider the
$(r-1)$-parametric set of Hamiltonians
$H_t = H^{(0)} + \sum_{\ell=1}^{r-1} t_\ell H^{(\ell)}$ with
$t = (t_1, \ldots, t_{r-1}) \in \mathbb{R}^{r-1}$.

Using a shorthand $e_m(H) = e_{\sigma_m}(H)$, and introducing the vectors
$e(H) = (e_1(H), \ldots, e_r(H))$ and $h(t) = e(H_t) - \min_m e_m(H_t)$, we notice that
for each $t \in \mathbb{R}^{r-1}$, the vector $h(t) \in \partial Q_r$, the boundary of
the positive octant in $\mathbb{R}^r$. A crucial assumption for such a parametrization
$H_t$ to yield a meaningful phase diagram is the condition of removal of degeneracy: we
assume that $g(H^{(0)} + H^{(\ell)}) \subsetneq g(H^{(0)})$, $\ell = 1, \ldots, r-1$, and
that the vectors $e(H^{(\ell)})$, $\ell = 1, \ldots, r-1$, are linearly independent.

In particular, its immediate consequence is that the mapping
$\mathbb{R}^{r-1} \ni t \mapsto h(t) \in \partial Q_r$ is a bijection. This fact has a
straightforward interpretation in terms of ground-state phase diagram. Viewing the
phase diagram (at zero temperature) as a partition of the parameter space into regions
$K_g$ with a given set $g \subset g(H^{(0)})$ of ground states ("coexistence of
zero-temperature phases from $g$"), the above bijection means that the region $K_g$ is
the preimage of the set
\[
Q_g = \{ h \in \partial Q_r \mid h_m = 0 \text{ for } \sigma_m \in g \text{ and } h_m > 0 \text{ otherwise} \}
\tag{6}
\]
The partition of the set $\partial Q_r$ has a natural hierarchical structure implied by
the fact that $\overline{Q}_{g_1} \cap \overline{Q}_{g_2} = \overline{Q}_{g_1 \cup g_2}$
($\overline{Q}_g$ is the closure of $Q_g$). Namely, the origin
$\{0\} = Q_{g(H^{(0)})}$ is the intersection of $r$ positive coordinate axes
$\overline{Q}_{\{\sigma_{\bar m},\, \bar m \neq m\}}$, $m = 1, \ldots, r$; each of those
half-lines is an intersection of $r-1$ two-dimensional quarter-planes with boundaries on
positive coordinate axes, etc., up to $(r-1)$-dimensional planes
$\overline{Q}_{\{\sigma_m\}}$, $m = 1, \ldots, r$. This hierarchical structure is thus
inherited by the partition of the parameter space $\mathbb{R}^{r-1}$ into the regions
$K_g$. The phase diagrams with such regular structure are sometimes said to satisfy the
Gibbs phase rule.

We can thus summarize in a rather trivial conclusion that the condition of removal of
degeneracy implies that the ground-state phase diagram obeys the Gibbs phase rule. The
task of the Pirogov–Sinai theory is to provide means for proving that this remains true,
at least in a neighborhood of the origin of parameter space, also for small nonzero
temperatures. To achieve this, we need an effective control of excitation energies.

Peierls Condition

A crucial assumption for the validity of the Pirogov–Sinai theory is a lower bound on
energy of excitations of ground states: the Peierls condition.

In spite of the fact that for a study of phase diagram we consider a parametric set of
Hamiltonians whose set of ground states may differ, it is useful to introduce the Peierls
condition with respect to a single fixed collection $G$ of reference configurations
(eventually, it will be identified with the ground states of the Hamiltonian $H^{(0)}$).
Let thus a fixed set $G$ of periodic configurations $\{\sigma_1, \ldots, \sigma_r\}$ be
given. Again, without loss of generality, we may assume that the periodicity of all
configurations $\sigma_m \in G$ is $R$.

Before formulating the Peierls condition, we have to introduce the notion of contours.
Consider the set of all sampling cubes
$C(x) = \{ y \in \mathbb{Z}^d \mid |y_i - x_i| \le R \text{ for } 1 \le i \le d \}$,
$x \in \mathbb{Z}^d$. A bad cube of a configuration $\sigma \in \Omega$ is a sampling
cube $C$ for which $\sigma_C$ differs from $\sigma_m$ restricted to $C$ for every
$\sigma_m \in G$. The boundary $B(\sigma)$ of $\sigma$ is the union of all bad cubes of
$\sigma$. If $\sigma_m \in G$ and $\sigma$ is its finite perturbation (differing from
$\sigma_m$ on a finite set of lattice sites), then, necessarily, $B(\sigma)$ is finite. A
contour of $\sigma$ is a pair $\gamma = (\Gamma, \sigma_\Gamma)$, where $\Gamma$ (the
support of the contour $\gamma$) is a connected component of $B(\sigma)$ (and
$\sigma_\Gamma$ is the restriction of $\sigma$ on $\Gamma$). Here, the connectedness of
$\Gamma$ means that it cannot be split into two parts whose (Euclidean) distance is
larger than 1. We use $\partial(\sigma)$ to denote the set of all contours of $\sigma$,
$B(\sigma) = \bigcup_{\gamma \in \partial(\sigma)} \Gamma$.

Consider a configuration $\sigma^\gamma$ such that $\gamma$ is its unique contour. The
set $\mathbb{Z}^d \setminus \Gamma$ has one infinite component to be denoted
$\mathrm{Ext}\,\gamma$ and a finite number of finite components whose union will be
denoted $\mathrm{Int}\,\gamma$. Observing that the configuration $\sigma^\gamma$
coincides with one of the states $\sigma_m \in G$ on every component of
$\mathbb{Z}^d \setminus B(\sigma^\gamma)$, each of those components can be labeled by the
corresponding $m$. Let $q$ be the label of $\mathrm{Ext}\,\gamma$; we say that $\gamma$
is a $q$-contour, and let $\mathrm{Int}_m\gamma$ be the union of all components of
$\mathrm{Int}\,\gamma$ labeled by $m$, $m = 1, \ldots, r$.

Defining the "energy" $\Phi(\gamma)$ of a $q$-contour $\gamma$ by the equation
\[
\Phi(\gamma) = H(\sigma^\gamma; \sigma_q) + e_q(H)\,|\Gamma|
  - \sum_{m=1}^{r} \big( e_m(H) - e_q(H) \big)\,|\mathrm{Int}_m\gamma|
\tag{7}
\]
the Peierls condition with respect to the set $G$ of reference configurations is an
assumption of the existence of $\rho > 0$ such that
\[
\Phi(\gamma) \ge \big( \rho + \min_m e_m(H) \big)\,|\Gamma|
\tag{8}
\]
for any contour $\gamma$ of any configuration $\sigma$ that is a finite perturbation of
$\sigma_q \in G$. Notice that if $G = g(H)$, the sum on the right-hand side of [7]
vanishes.

Phase Diagram

The main claim of the Pirogov–Sinai theory provides, for $\beta$ sufficiently large, a
construction of regions $K_g(\beta)$ of the parameter space characterized by the
coexistence of phases labeled by configurations $\sigma_m \in g$. This is done similarly
as for the ground-state phase diagram discussed earlier by constructing a homeomorphism
$t \mapsto a(t)$ from a neighborhood of the origin of the parameter space to a
neighborhood of the origin of $\partial Q_r$ that provides the phase diagram (actually,
the function $a(t)$ will turn out to be just a perturbation of $h(t)$ with errors of
order $e^{-\beta}$).

Before stating the result, however, we have to clarify what exactly is meant by existence
of phase $m$ for a given Hamiltonian $H$. Roughly speaking, it is the existence of a
periodic extremal Gibbs state $\mu_m \in \mathcal{G}(H)$, whose typical configurations do
not differ too much from the ground-state configuration $\sigma_m$. In more technical
terms, the existence of such a state is provided once we prove a
suitable bound, for the finite-volume Gibbs state $\mu_\Lambda(\{\cdot\} | \sigma_m)$
under the boundary conditions $\sigma_m$, on the probability that a fixed point in
$\Lambda$ is encircled by a contour from $\partial$. If this is the case, we say that the
phase $m$ is stable. It turns out that such a bound is actually an integral part of the
construction of metastable free energies $f_m(t)$ yielding the homeomorphism
$t \mapsto a(t)$. In this way, we get the main claim formulated as follows:

Theorem 1 Consider a parametric set of Hamiltonians
$H_t = H^{(0)} + \sum_{\ell=1}^{r-1} t_\ell H^{(\ell)}$ with periodic finite-range
interactions satisfying the condition of removal of degeneracy as well as the Peierls
condition with respect to the reference set $G = g(H^{(0)})$. Let $d \ge 2$ and let
$\beta$ be sufficiently large. Then there exists a homeomorphism $t \mapsto a(t)$ of a
neighborhood $V_\beta$ of the origin of the parameter space $\mathbb{R}^{r-1}$ onto a
neighborhood $U_\beta$ of the origin of $\partial Q_r$ such that, for any
$t \in V_\beta$, the set of all stable phases is
$\{ m \in \{1, \ldots, r\} \mid a_m(t) = 0 \}$.

The Peierls condition can be actually assumed only for the Hamiltonian $H^{(0)}$
inferring its validity for $H_t$ on a sufficiently small neighborhood $V_\beta$.

Notice also that the result can be actually stated not as a claim about phase diagram in
a space of parameters, but as a statement about stable phases of a fixed Hamiltonian $H$.
Namely, for a Hamiltonian $H$ satisfying Peierls condition with respect to a reference
set $G$, one can assure the existence of

[...] c and collections $M(\Lambda, q)$ of contours $\partial$ in $\Lambda$ satisfying
the matching condition, and such that the external among them are $q$-contours. Here, a
contour $\gamma \in \partial$ is called an external contour in $\partial$ if
$\Gamma \subset \mathrm{Ext}\,\gamma'$ for all $\gamma' \in \partial$ different from
$\gamma$.

With this observation and using $\Lambda_m(\partial)$ to denote the union of all
components of $\Lambda \setminus \bigcup_{\gamma \in \partial} \Gamma$ with label $m$, we
get
\[
Z(\Lambda | \sigma_q) = \sum_{\partial \in M(\Lambda, q)}
  \prod_{m} e^{-\beta e_m(H)\,|\Lambda_m(\partial)|}
  \prod_{\gamma \in \partial} e^{-\beta \Phi(\gamma)}
\tag{9}
\]
Usefulness of such contour representations stems from an expectation that, for a stable
phase $q$, contours should constitute a suppressed excitation and one should be able to
use cluster expansions to evaluate the behavior of the Gibbs state $\mu_q$. However, the
direct use of the cluster expansion on [9] is trammeled by the presence of the energy
terms $e^{-\beta e_m(H)\,|\Lambda_m(\partial)|}$ and, more seriously, by the requirement
that the contour labels match.

Nevertheless, one can rewrite the partition function in a form that does not involve any
matching condition. Namely, considering first a sum over mutually external contours
$\partial^{\rm ext}$ and resumming over collections of contours which are contained in
their interiors without touching the boundary (being thus prevented to "glue" with
external contours), we get
\[
Z(\Lambda | \sigma_q) = \sum_{\partial^{\rm ext}} e^{-\beta e_q(H)\,|\mathrm{Ext}|} \cdots
\]
apply cluster expansion, provided the contour weights satisfy the necessary convergence assumptions. Even though this is not necessarily the case, there is a way to use this representation. Namely, one can artificially change the weights to satisfy the needed bound, for example, by modifying them to the form

$$w'_q(\gamma) = \min\bigl\{ w_q(\gamma),\, e^{-\tau|\gamma|} \bigr\}$$ [15]

with a suitable constant $\tau$. The modified partition function

$$Z'(\Lambda|q) = e^{-\beta e_q(H)|\Lambda|} \sum_{\partial \in C(\Lambda,q)} \prod_{\gamma \in \partial} w'_q(\gamma)$$ [16]

can then be controlled by cluster expansion, allowing us to define

$$f_q(H) = -\frac{1}{\beta} \lim_{|\Lambda| \to \infty} \frac{1}{|\Lambda|} \log Z'(\Lambda|q)$$ [17]

This is the metastable free energy corresponding to the phase $q$. Applying the cluster expansion to the logarithm of the sum in [16], we get $|f_q(H) - e_q(H)| \le e^{-\tau/2}$. The metastable free energy corresponds to taking the ground state $q$ and its excitations as long as they are sufficiently suppressed. Once $w_q(\gamma)$ exceeds the weight $e^{-\tau|\gamma|}$ (and the contour would have been actually preferred), we suppress it "by hand." The point is that if the phase $q$ is stable, this never happens and $w'_q(\gamma) = w_q(\gamma)$ for all $q$-contours $\gamma$. This is the idea behind the use of the function $f_q(H)$ as an indicator of the stability of the phase $q$ by taking

$$a_q(t) = f_q(H_t) - \min_m f_m(H_t)$$ [18]

with spins $x \in \{-1, 0, 1\}$. Taking into account only the lowest-order excitations, we get:

$$\tilde f_\pm(\lambda, h) = -\lambda \mp h - \frac{1}{\beta}\, e^{-\beta(2d - \lambda \mp h)}$$

(sea of pluses or minuses with a single spin flip $\pm \to 0$) and

$$\tilde f_0(\lambda, h) = -\frac{1}{\beta}\, e^{-\beta(2d + \lambda)} \bigl( e^{\beta h} + e^{-\beta h} \bigr)$$

(sea of zeros with a single spin flip either $0 \to +$ or $0 \to -$).

Since these functions differ from the full metastable free energies $f_\pm(\lambda, h)$, $f_0(\lambda, h)$ by terms of higher order ($\sim e^{-\beta(4d-2)}$), the real phase diagram differs in this order from the one constructed by equating the functions $\tilde f_\pm(\lambda, h)$ and $\tilde f_0(\lambda, h)$. It is particularly interesting to inspect the origin, $\lambda = h = 0$. It is only the phase 0 that is stable there at all small temperatures since

$$f_0(0,0) \approx -\frac{2}{\beta}\, e^{-2d\beta} < f_\pm(0,0) \approx -\frac{1}{\beta}\, e^{-2d\beta}$$ [22]

The only reason why the phase 0 is favored at this point with respect to phases $+$ and $-$ is that there are two excitations of order $e^{-2d\beta}$ for the phase 0, while there is only one such excitation for $+$ or $-$. The entropy of the lowest-order contribution to $f_0(0,0)$ outweighs the entropy of the contribution to $f_\pm(0,0)$ of the same order.
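The comparison at the origin can be checked numerically. The sketch below is only an illustration: it assumes the lowest-order expressions $\tilde f_\pm = -\lambda \mp h - \beta^{-1} e^{-\beta(2d-\lambda\mp h)}$ and $\tilde f_0 = -\beta^{-1} e^{-\beta(2d+\lambda)}(e^{\beta h}+e^{-\beta h})$ (this reading of the signs is an assumption), and the function names and the tolerance `tol` are hypothetical choices made here, not part of the theory.

```python
import math


def f_tilde_pm(lam, h, beta, d, sign):
    """Lowest-order free energy of the + (sign=+1) or - (sign=-1) phase.

    Assumed form: -lam -/+ h - (1/beta) * exp(-beta * (2d - lam -/+ h)).
    """
    return -lam - sign * h - math.exp(-beta * (2 * d - lam - sign * h)) / beta


def f_tilde_0(lam, h, beta, d):
    """Lowest-order free energy of the 0 phase (two possible single flips)."""
    return -math.exp(-beta * (2 * d + lam)) * (
        math.exp(beta * h) + math.exp(-beta * h)
    ) / beta


def stable_phases(lam, h, beta, d, tol=1e-15):
    """Phases whose approximate free energy attains the minimum (within tol)."""
    values = {
        "+": f_tilde_pm(lam, h, beta, d, +1),
        "-": f_tilde_pm(lam, h, beta, d, -1),
        "0": f_tilde_0(lam, h, beta, d),
    }
    f_min = min(values.values())
    return sorted(name for name, value in values.items() if value - f_min <= tol)


if __name__ == "__main__":
    # At the origin the 0 phase wins because it has two excitations of order
    # exp(-2 d beta) while + and - have only one each.
    print(stable_phases(0.0, 0.0, beta=2.0, d=2))
```

In this approximation the two values at the origin differ exactly by a factor of two, which is the entropy effect described in the text.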
between the phases, the use of the Pirogov–Sinai theory is indispensable.

Another interesting class of applications concerns the behavior of the system with periodic boundary conditions. It is based on the fact that the partition function $Z_{T_N}$ on a torus $T_N$ consisting of $N^d$ sites can be, again with the help of the cluster expansions, explicitly and very accurately evaluated in terms of metastable free energies,

$$\Bigl| Z_{T_N} - \sum_{q=1}^{r} e^{-\beta f_q(H) N^d} \Bigr| \le \exp\bigl\{ -\beta \min_m f_m(H) N^d - b\beta N \bigr\}$$ [23]

with a fixed constant $b$. This formula (and its generalization to the case of complex parameters) allows us to obtain various results concerning the behavior of the model in finite volumes.

Finite-Size Effects

Considering as an illustration a perturbation of the Ising model, so that it does not have the $\pm$ symmetry any more (and the value $h_t(\beta)$ of external field at which the phase transition between plus and minus phase occurs is not known), we can pose a natural question that is important for the correct interpretation of simulation data. Namely, what is the asymptotic behavior of the magnetization $m_N^{\rm per}(\beta, h) = \bigl\langle \frac{1}{N^d} \sum_{x \in T_N} \sigma_x \bigr\rangle_{T_N}^{\rm per}$ on a torus? In the thermodynamic limit, the magnetization $m_\infty^{\rm per}(\beta, h)$ displays, as a function of $h$, a discontinuity at $h = h_t(\beta)$. For finite $N$, we get a rounding of the discontinuity: the jump is smoothed. What is the shift of a naturally chosen finite-volume transition point $h_t(N)$ with respect to the limiting value $h_t$? The answer can be obtained with the help of [23] once sufficient care is taken to use the freedom in the definition of the metastable free energies $f_+(h)$ and $f_-(h)$ to replace them with a sufficiently smooth version allowing an approximation of the functions $f_\pm(h)$ around the limiting point $h_t$ in terms of their Taylor expansion.

As a result, in spite of the asymmetry of the model, the finite-volume magnetization $m_N^{\rm per}(\beta, h)$ has a universal behavior in the neighborhood of the transition point $h_t$. With suitable constants $m$ and $m_0$, we have

$$m_N^{\rm per}(\beta, h) \sim m_0 + m \tanh\bigl\{ N^d \beta m (h - h_t) \bigr\}$$ [24]

Choosing the inflection point $h_{\max}(N)$ of $m_N^{\rm per}(\beta, h)$ as a natural finite-volume indicator of the occurrence of the transition, one can show that

$$h_{\max}(N) = h_t + \frac{3\chi}{2\beta^2 m^3}\, N^{-2d} + O(N^{-3d})$$ [25]

Zeros of Partition Functions

The full strength of the formula [23] is revealed when studying the zeros of the partition function $Z_{T_N}(z)$ as a polynomial in a complex parameter $z$ entering the Hamiltonian of the model. To be able to use the theory in this case, one has to extend the definitions of the metastable free energies to complex values of $z$. Indeed, the construction still goes through, now yielding genuinely complex contour models $w_\pm$ with the help of an inductive procedure. Notice that no analytic continuation is involved. An analog of [23] is still valid,

$$\Bigl| Z_{T_N}(z) - \sum_{m=1}^{r} e^{-\beta f_m(z) N^d} \Bigr| \le \exp\bigl\{ -\beta \min_m \Re f_m(z) N^d - b\beta N \bigr\}$$ [26]

Using [26], it is not difficult to convince oneself that the loci of zeros can be traced down to the phase coexistence lines. Indeed, on the line of the coexistence of two phases, $\Re f_m = \Re f_n$, the partition function $Z_{T_N}(z)$ is approximated by $e^{-\beta \Re f N^d} \bigl( e^{-i\beta \Im f_m N^d} + e^{-i\beta \Im f_n N^d} \bigr)$, where $\Re f$ denotes the common real part. The zeros of this approximation are thus given by the equations

$$\Re f_m = \Re f_n < \Re f_\ell \ \text{ for all } \ell \ne m, n; \qquad \beta N^d (\Im f_m - \Im f_n) = \pi \bmod 2\pi$$ [27]

The zeros of the full partition function $Z_{T_N}(z)$ can be proved to be exponentially close, up to a shift of order $O(e^{-b\beta N})$, to those of the discussed approximation.

Briefly, the zeros of $Z_{T_N}(z)$ asymptotically concentrate on the phase coexistence curves with the density $\frac{1}{2\pi} \beta N^d \bigl| \frac{d}{dz}(f_m - f_n) \bigr|$.

Bibliographical Remarks and Generalizations

The original works Pirogov and Sinai (1975, 1976) and Sinai (1982) introduced an analog of the weights $w'_q(\gamma)$ and parameters $a_q(H)$ as a fixed point of a suitable mapping on a Banach space. The inductive definition used here was introduced in Kotecký and Preiss (1983) and Zahradník (1984). The completeness of the phase diagram (the fact that the stable phases exhaust the set of all periodic extremal Gibbs states) was first proved in Zahradník (1984). Extension to complex parameters was first considered in Gawędzki et al. (1987) and Borgs and Imbrie (1989). For a review of the standard Pirogov–Sinai theory, see Sinai (1982) and Slawny (1987).

Application of Pirogov–Sinai theory for finite-size effects was studied in Borgs and Kotecký (1990) and
general theory of zeros of partition functions is presented in Biskup et al. (2004).

The basic statement of the Pirogov–Sinai theory yielding the construction of the full phase diagram has been extended to a large class of models. Let us mention just a few of them (with rather incomplete references):

1. Continuous spins. The main difficulty in these models is that one has to deal with contours immersed in a sea of fluctuating spins (Dobrushin and Zahradník 1986, Borgs and Waxler 1989).
2. Potts model. An example of a system with a transition in temperature with the coexistence of the low-temperature ordered and the high-temperature disordered phases. The contour reformulation employs contours between ordered and disordered regions (Bricmont et al. 1985, Kotecký et al. 1990). The treatment is simplified with the help of the Fortuin–Kasteleyn representation (Laanait et al. 1991).
3. Models with competing interactions. ANNNI model, microemulsions. Systems with a rich phase structure (Dinaburg and Sinai 1985).
4. Disordered systems. An example is a proof of the existence of the phase transition for the three-dimensional random field Ising model (Bricmont and Kupiainen 1987, 1988) using a renormalization group version of the Pirogov–Sinai theory first formulated in Gawędzki et al. (1987).
5. Quantum lattice models. A class of quantum models that can be viewed as a quantum perturbation of a classical model. With the help of the Feynman–Kac formula these are rewritten as a $(d+1)$-dimensional classical model that is, in its turn, treated by the standard Pirogov–Sinai theory (Datta et al. 1996, Borgs et al. 1996).
6. Continuous systems. Gas of particles in continuum interacting with a particular potential of Kac type. Pirogov–Sinai theory is used for a proof of the existence of the phase transitions after a suitable discretisation (Lebowitz et al. 1999).

See also: Cluster Expansion; Falicov–Kimball Model; Phase Transitions in Continuous Systems; Quantum Spin Systems.

Further Reading

Biskup M, Borgs C, Chayes JT, and Kotecký R (2004) Partition function zeros at first-order phase transitions: Pirogov–Sinai theory. Journal of Statistical Physics 116: 97–155.
Borgs C and Imbrie JZ (1989) A unified approach to phase diagrams in field theory and statistical mechanics. Communications in Mathematical Physics 123: 305–328.
Borgs C and Kotecký R (1990) A rigorous theory of finite-size scaling at first-order phase transitions. Journal of Statistical Physics 61: 79–119.
Borgs C, Kotecký R, and Ueltschi D (1996) Low temperature phase diagrams for quantum perturbations of classical spin systems. Communications in Mathematical Physics 181: 409–446.
Borgs C and Waxler R (1989) First order phase transitions in unbounded spin systems: construction of the phase diagram. Communications in Mathematical Physics 126: 291–324.
Bricmont J, Kuroda T, and Lebowitz J (1985) First order phase transitions in lattice and continuum systems: extension of Pirogov–Sinai theory. Communications in Mathematical Physics 101: 501–538.
Bricmont J and Kupiainen A (1987) Lower critical dimensions for the random field Ising model. Physical Review Letters 59: 1829–1832.
Bricmont J and Kupiainen A (1988) Phase transition in the 3D random field Ising model. Communications in Mathematical Physics 116: 539–572.
Datta N, Fernández R, and Fröhlich J (1996) Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. Helv. Phys. Acta 69: 752–820.
Dinaburg EL and Sinaï YaG (1985) An analysis of ANNNI model by Peierls contour method. Communications in Mathematical Physics 98: 119–144.
Dobrushin RL and Zahradník M (1986) Phase diagrams of continuous lattice systems. In: Dobrushin RL (ed.) Math. Problems of Stat. Physics and Dynamics, pp. 1–123. Dordrecht: Reidel.
Gawędzki K, Kotecký R, and Kupiainen A (1987) Coarse-graining approach to first-order phase transitions. Journal of Statistical Physics 47: 701–724.
Kotecký R, Laanait L, Messager A, and Ruiz J (1990) The q-state Potts model in the standard Pirogov–Sinai theory: surface tensions and Wilson loops. Journal of Statistical Physics 58: 199–248.
Kotecký R and Preiss D (1983) An inductive approach to PS theory, Proc. Winter School on Abstract Analysis, Suppl. ai Rend. del Mat. di Palermo.
Lebowitz JL, Mazel A, and Presutti E (1999) Liquid–vapor phase transitions for systems with finite range interactions. Journal of Statistical Physics 94: 955–1025.
Laanait L, Messager A, Miracle-Solé S, Ruiz J, and Shlosman SB (1991) Interfaces in the Potts model I: Pirogov–Sinai theory of the Fortuin–Kasteleyn representation. Communications in Mathematical Physics 140: 81–91.
Pirogov SA and Sinai YaG (1975) Phase diagrams of classical lattice systems (Russian). Theoretical and Mathematical Physics 25(3): 358–369.
Pirogov SA and Sinai YaG (1976) Phase diagrams of classical lattice systems. Continuation (Russian). Theoretical and Mathematical Physics 26(1): 61–76.
Sinai YaG (1982) Theory of Phase Transitions: Rigorous Results. New York: Pergamon.
Slawny J (1987) Low temperature properties of classical lattice systems: phase transitions and phase diagrams. In: Domb C and Lebowitz JL (eds.) Phase Transitions and Critical Phenomena, vol. 11, pp. 127–205. New York: Academic Press.
Zahradník M (1984) An alternate version of Pirogov–Sinai theory. Communications in Mathematical Physics 93: 559–581.
Point-Vortex Dynamics
S Boatto, IMPA, Rio de Janeiro, Brazil
D Crowdy, Imperial College, London, UK
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Vortices have a long fascinating history. Descartes wrote in his Le Monde:

. . .que tous les mouvements qui se font au Monde sont en quelque façon circulaire: c'est à dire que, quand un corps quitte sa place, il entre toujours en celle d'un autre, et celui-ci en celle d'un autre, et ainsi de suite jusques au dernier, qui occupe au même instant le lieu délaissé par le premier.

[. . .that all the motions that take place in the World are in some way circular: that is to say, when a body leaves its place, it always enters that of another, and this one that of another, and so on down to the last, which occupies at the same instant the place vacated by the first.]

In particular, Descartes thought of vortices to model the dynamics of the solar system, as reported by W W R Ball (1940):

Descartes' physical theory of the universe, embodying most of the results contained in his earlier and unpublished Le Monde, is given in his Principia, 1644, . . . He assumes that the matter of the universe must be in motion, and that the motion must result in a number of vortices. He stated that the sun is the center of an immense whirlpool of this matter, in which the planets float and are swept round like straws in a whirlpool of water.

Descartes' theory was later refuted by Newton in his Principia in 1687. Almost two centuries later, W Thomson (1867), the later Lord Kelvin, made use of vortices to formulate his atomic theory: each atom was assumed to be made up of vortices in a sort of ideal fluid. In 1878–79 the American physicist A M Mayer conducted a few experiments with needle magnets placed on floating pieces of cork in an applied magnetic field, as toy models for studying atomic interactions and forms (Mayer 1878, Aref et al. 2003). In 1883, inspired by Mayer's experiments, J J Thomson combined W Thomson's atomic theory with H von Helmholtz's point-vortex theory (Helmholtz 1858): he thought of the electrons as point vortices inside a positively charged shell (see Figure 1), the vortices being located at the vertices of regular polygons, and investigated the stability of such structures (see Thomson (1883, section 2.1)). The vortex-atomic theory survived for quite a few years, until Rutherford's experiments proved that atoms have quite a different structure!

Before continuing this historical/modeling overview, let's address the following question:

what is a vortex and, more specifically, what is a point-vortex?

Roughly speaking, following Descartes, a vortex is an entity which makes particles move along circular-like orbits. Examples are the cyclones and anticyclones in the atmosphere (see Figure 3). Mathematically speaking, let $\mathbf{u} = (u, v, w) \in \mathbb{R}^3$ be a velocity field; the associated vorticity field $\boldsymbol{\omega}$ is defined to be

$$\boldsymbol{\omega} = \nabla \wedge \mathbf{u}$$ [1]

In this article we are considering exclusively inviscid flows which are also incompressible, that is,

$$\nabla \cdot \mathbf{u} = 0$$ [2]

and have constant density $\rho$, which we normalize to be equal to 1 ($\rho = 1$). In two dimensions, a point-vortex field is the simplest of all vorticity fields: it can be thought of as an entity where the vorticity field is concentrated into a point. In other words, point vortices are singularities of the vorticity field! Then, in the plane the vorticity field associated to a system of $N$ point vortices is

$$\omega(\mathbf{r}) = \sum_{\alpha=1}^{N} \Gamma_\alpha\, \delta(\mathbf{r} - \mathbf{r}_\alpha)$$ [3]

Figure 1 Thomson atomic model: (a) atom with three electrons and (b) atom with four electrons. From Thomson JJ (1883) A Treatise on the Motion of Vortex Rings. New York: Macmillan and Thomson JJ (1904) Electricity and Matter. Westminster: Archibald Constable.

Figure 2 Hurricane Jeanne. Reproduced with permission from the National Oceanic and Atmospheric Administration (NOAA) (www.noaanews.noaa.gov).
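To connect the delta-function vorticity [3] with the flow it induces, the following short derivation may help. It is a sketch under stated assumptions: a planar incompressible flow described by a stream function $\psi$ with the sign convention $\mathbf{u} = (\partial_y \psi, -\partial_x \psi)$, and a single vortex of circulation $\Gamma$ at $\mathbf{r}_1$ in the unbounded plane. The convention is chosen here for illustration.

```latex
% Assumption: u = (\partial_y \psi, -\partial_x \psi), so the scalar vorticity is
% \omega = \partial_x v - \partial_y u = -\Delta \psi.
\begin{align*}
-\Delta \psi(\mathbf{r}) &= \Gamma\, \delta(\mathbf{r} - \mathbf{r}_1)
  && \text{(single point vortex)} \\
\psi(\mathbf{r}) &= -\frac{\Gamma}{2\pi} \log \|\mathbf{r} - \mathbf{r}_1\|
  && \text{(since } \Delta \log r = 2\pi\, \delta \text{ in the plane)} \\
|\mathbf{u}(\mathbf{r})| &= |\nabla \psi(\mathbf{r})|
  = \frac{|\Gamma|}{2\pi\, \|\mathbf{r} - \mathbf{r}_1\|}
  && \text{(azimuthal flow around } \mathbf{r}_1 \text{)}
\end{align*}
```

By linearity, the stream function of $N$ vortices in the unbounded plane is the sum of such logarithmic terms, one per vortex.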
Figure 4 (a) Advected by the velocity field of one point vortex, a test particle follows a circular orbit, with a speed proportional to the absolute value of the vortex circulation and inversely proportional to its distance from the vortex. (b) Straight vortex lines.
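The circular orbit of a test particle around a single vortex can be reproduced numerically. The sketch below assumes the standard planar velocity field of point vortices in an unbounded domain, $\mathbf{u} = \sum_\alpha \frac{\Gamma_\alpha}{2\pi} \frac{(-(y-y_\alpha),\, x-x_\alpha)}{\|\mathbf{r}-\mathbf{r}_\alpha\|^2}$, and holds the vortex fixed (a lone vortex in the plane does not move). The function names and the choice of a fourth-order Runge–Kutta integrator are illustrative assumptions.

```python
import math


def induced_velocity(x, y, vortices):
    """Velocity at (x, y) induced by planar point vortices.

    `vortices` is a list of (gamma, xv, yv): circulation and position.
    Assumes an unbounded plane (no walls, no background flow).
    """
    u = v = 0.0
    for gamma, xv, yv in vortices:
        dx, dy = x - xv, y - yv
        r2 = dx * dx + dy * dy
        u += -gamma * dy / (2.0 * math.pi * r2)
        v += gamma * dx / (2.0 * math.pi * r2)
    return u, v


def advect(x, y, vortices, dt, steps):
    """Advect a passive particle with classical RK4 in a frozen vortex field."""
    for _ in range(steps):
        k1 = induced_velocity(x, y, vortices)
        k2 = induced_velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], vortices)
        k3 = induced_velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], vortices)
        k4 = induced_velocity(x + dt * k3[0], y + dt * k3[1], vortices)
        x += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return x, y


if __name__ == "__main__":
    gamma, radius = 1.0, 1.0
    # One revolution takes (circumference) / (speed) = 4 pi^2 r^2 / gamma.
    period = 4.0 * math.pi ** 2 * radius ** 2 / gamma
    print(advect(radius, 0.0, [(gamma, 0.0, 0.0)], period / 2000, 2000))
```

With positive circulation the particle moves counterclockwise and returns to its starting point after one period, up to integration error.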
$$\Psi(\mathbf{r}) = -\frac{1}{4\pi} \sum_{\alpha=1}^{N} \Gamma_\alpha \log \|\mathbf{r} - \mathbf{r}_\alpha\|^2$$ [13]
Boatto and Simó (2004) generalized the stability analysis to the case of a ring with polar vortices and of multiple rings, the key idea being, as we shall discuss in this section, the structure of the Hessian of the Hamiltonian.

How can one infer the linear and nonlinear stability of steadily rotating configurations?

Let us restrict the discussion to a polygonal ring of identical vortices on a sphere as illustrated in Figure 7 (Boatto and Cabral 2003, Boatto and Simó 2004). The reasoning is easily generalized for the planar case. The case of multiple rings is discussed in great detail in Boatto and Simó (2004). A polygonal ring is a relative equilibrium of coordinates $X(t) = (q_1(t), \ldots, q_N(t), p_1(t), \ldots, p_N(t))$, where

$$q_\alpha(t) = \phi_\alpha(t) = \omega t + \phi_{o\alpha}, \qquad p_\alpha(t) = p_o = \cos\theta_o, \qquad \alpha = 1, \ldots, N$$ [18]

with $\omega = (N-1)p_o / r_o^2$, $r_o = \sqrt{1 - p_o^2}$, and $\phi_{o\alpha}$ and $\theta_{o\alpha} = \theta_o$ being the initial longitude and co-latitude of the $\alpha$th vortex.

Theorem 1 (Spherical case) (Boatto and Simó 2004). The relative equilibrium [18] is (linearly and nonlinearly) stable if

$$-4(N-1)(11-N) + 24(N-1)r_o^2 + 2N^2 + 1 + 3(-1)^N < 0$$ [19]

and it is unstable if the inequality is reversed.

Remarks

(i) By Theorem 1 a vortex polygon of $N$ point vortices is stable for $0 < \theta_o < \theta_o^*$ and $(180^\circ - \theta_o^*) < \theta_o < 180^\circ$, where $\theta_o^* = \arcsin(r_o^*)$ and

$$r_o^{*2} < \frac{7-N}{4} \quad \text{for } N \text{ odd}, \qquad r_o^{*2} < -\frac{N^2 - 8N + 8}{4(N-1)} \quad \text{for } N \text{ even}$$

where $r_o^* = \sin\theta_o^*$.

(ii) Theorem 1 includes at once the results of Thomson (1883), Dritschel (1985), and Polvani and Dritschel (1993) (and other authors who have been working in the area (Aref et al. 2003)). We recover the planar case by setting $r_o = 0$ in eqn [19], deducing that stability is guaranteed for $N < 7$.

To prove Theorem 1 it is useful to consider the Hamiltonian equations as in eqn [17]. The first step is to make a change of reference frame: view the dynamics in a frame co-rotating with the relative equilibrium configuration. In the co-rotating reference system, the Hamiltonian takes the form

$$\tilde H = H + \omega M$$

where $M$ is the momentum of the system, and $H$ and $\omega$ are, respectively, the Hamiltonian and the rotational frequency of the relative equilibrium in the original frame of reference. In the new reference frame, the relative equilibrium becomes an equilibrium, $X^*$, and the standard techniques can be used to study its stability.

To study linear stability, the relevant equation is

$$\frac{d\,\Delta X}{dt} = JS\,\Delta X$$ [20]

where $X = X^* + \Delta X$, and $S$ is the Hessian of $\tilde H$ evaluated at the equilibrium $X^*$. Then linear (or spectral) stability is deduced by studying the eigenvalues of the matrix $JS$. For nonlinear stability we make use of a sufficient stability criterion due to Dirichlet (1897) (see G Lejeune Dirichlet (1897). Werke, vol. 2, Berlin, pp. 5–8; Boatto and Cabral (2003) and references therein).

Theorem 2 Let $X^*$ be an equilibrium of an autonomous system of ordinary differential equations

$$\frac{dX}{dt} = f(X), \qquad X \in \mathbb{R}^{2N}$$ [21]

that is, $f(X^*) = 0$. If there exists a positive (or negative) definite integral $F$ of the system [21] in a neighborhood of the equilibrium $X^*$, then $X^*$ is stable.

In our case the Hamiltonian itself is an integral of motion. Then by studying definiteness of its Hessian, $S$, evaluated at $X^*$, we infer minimal stability intervals in $\theta$ and $N$. Details are given in Boatto and Cabral (2003) and Boatto and Simó (2004). The proof is mainly based on the following considerations:

1. Since $S$ is a symmetric matrix it is diagonalizable, that is, there exists an orthogonal matrix $C$ such that $C^T S C = D$, where $D$ is a diagonal matrix, $D = {\rm diag}(\lambda_1, \ldots, \lambda_{2N})$. Furthermore, the matrix $C$ can be chosen to leave invariant the symplectic form (equivalently $J = C^T J C$). Then by the canonical change of variables $Y = C^T \Delta X$ eqn [20] becomes

$$\frac{dY}{dt} = JDY$$ [22]
where $Y = (\tilde q_1, \ldots, \tilde q_N, \tilde p_1, \ldots, \tilde p_N)$ and $(\tilde q_j, \tilde p_j)$, $j = 1, \ldots, N$, are pairs of conjugate variables. Equation [22] can be rewritten as

$$\frac{d^2 \tilde q_j}{dt^2} = -\lambda_j \lambda_{j+N}\, \tilde q_j, \qquad j = 1, \ldots, N$$

2. When evaluated at the equilibrium $X^*$, the Hessian $S$ takes the block structure

$$S = \begin{pmatrix} Q & O \\ O & P \end{pmatrix}$$

where the matrices $Q$ and $P$ are symmetric circulant matrices, that is, $(N \times N)$ matrices of the form

$$A = \begin{pmatrix} a_1 & a_2 & \cdots & a_N \\ a_N & a_1 & \cdots & a_{N-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_2 & a_3 & \cdots & a_1 \end{pmatrix}$$ [23]

Circulant matrices are of special interest to us because we can easily compute their eigenvalues and eigenvectors for all $N$. In fact, it is immediate to show that:

Lemma 3 All circulant matrices [23] have eigenvalues

$$\lambda_j = \sum_{k=1}^{N} a_k r_j^{k-1}, \qquad j = 1, \ldots, N$$

and corresponding eigenvectors $v_j = (1, r_j, \ldots, r_j^{N-1})^T$, $j = 1, \ldots, N$, where $r_j = \exp(2\pi i (j-1)/N)$ are solutions of $r^N = 1$.

Passive Tracers in the Velocity Fields of N Point Vortices: The Restricted (N + 1)-Vortex Problem

The terminology "restricted $(N+1)$-vortex problem" is used in analogy with the celestial mechanics literature, when one of the vorticities is taken to be zero. The zero-vorticity vortex does not affect the dynamics of the remaining $N$ vortices. For this reason, it is said to be passively advected by the flow of the remaining $N$ vortices, and in the fluid mechanics literature the terminology "passive tracer" is also employed. The tracer dynamics is given by the Hamiltonian equations [8]. Notice that in general the Hamiltonian is time dependent, through the vortex variables $\mathbf{r}_j$, $j = 1, \ldots, N$, that is,

$$\Psi(\mathbf{r}, t) = \Psi(\mathbf{r}; \mathbf{r}_1(t), \ldots, \mathbf{r}_N(t))$$

and $(q, p) = (x, y)$ play the role of conjugate canonical variables. There is an extensive literature on the subject both from a theoretical (see, e.g., Boatto and Simó (2004) and Newton (2001)) and an experimental (van Heijst 1993, Ottino 1990) point of view. As discussed in the previous section, there are some vortex configurations, such as the polygonal ones, for which vortices undergo a periodic circular motion. Then by viewing the dynamics in a reference frame co-rotating with the vortices the tracer Hamiltonian is manifestly time independent and, therefore, integrable, since it reduces to a Hamiltonian of one degree of freedom. In such an occurrence, tracer trajectories form a web of homoclinic and heteroclinic orbits. An interesting theoretical problem is to study how the tracer transport properties (i.e., existence of barriers to transport, diffusion, etc.) are affected by perturbing the polygonal vortex configuration, that is, by introducing in $\Psi$ a "genuine" time dependence (periodic, quasi-periodic, or chaotic) (see, e.g., Boatto and Pierrehumbert (1999), Rom-Kedar, Leonard and Wiggins (1990), Kuznetsov and Zaslavsky (2000), and Newton (2001)). Furthermore, in lab experiments, color dyes, which monitor the flow velocity field, are often used as the experimental equivalent of tracer particles. In this context we would like to stress the striking resemblance between theoretical particle trajectories, deduced from point vortex dynamics, and the actual dye visualizations observed by van Heijst and Flor for vortex dipoles in a stratified fluid (see Figures 11 and 12) (van Heijst 1993). Similarly, tripolar structures have been observed both in lab experiments (see Figure 13) and in nature (see Figure 14). Recently, the Danish group of Jansson–Haspang–Jensen–Hersen–Bohr has observed beautiful rotating polygons, such as squares and pentagons, on a fluid surface in the presence of a rotating cylinder (see Figure 15).

Point Vortex Motion with Boundaries

In comparison with the extensive literature on point vortex motion in unbounded domains, the study of point vortex motion in the presence of walls is modest. There is, however, a general theory for such problems, and some recent new developments in this area have resulted in a versatile tool for analyzing point vortex motion with boundaries. Newton (2001) contains a chapter on point vortex motion with boundaries and also features a detailed bibliography. The reader is referred there for standard treatments; here, we focus on more recent developments of the mathematical theory.

The Method of Images

When point vortices move around in bounded domains, it is clear that the motion is subject to the constraint that no fluid should penetrate any of
Kirchhoff–Routh–Lin Theory

$$g(x, y; x_0, y_0) = -G(x, y; x_0, y_0) - \frac{1}{2\pi} \log r_0$$ [27]

is harmonic with respect to $(x, y)$ throughout the region $D$ including at the point $(x_0, y_0)$. Here, $r_0 = \sqrt{(x - x_0)^2 + (y - y_0)^2}$;

2. if $\partial G / \partial n$ is the normal derivative of $G$ on a curve then

$$G(x, y; x_0, y_0) = A_k \ \text{ on } C_k, \qquad \oint_{C_k} \frac{\partial G}{\partial n}\, ds = 0, \qquad k = 1, \ldots, M$$ [28]

where $ds$ denotes an element of arc and $\{A_k\}$ are constants;

3. $G(x, y; x_0, y_0) = 0$ on $C_0$.

Flucher and Gustafsson (1997) refer to this $G$ as the hydrodynamic Green's function. (In fact, it coincides with the modified Green's function arising in abstract potential theory, a function that is dual to the usual first-type Green's function that equals zero on all the domain boundaries.) On the use of $G$, Lin established the following two key results:

Theorem 4 If $N$ vortices of strengths $\{\Gamma_k \mid k = 1, \ldots, N\}$ are present in an incompressible fluid at the points $\{(x_k, y_k) \mid k = 1, \ldots, N\}$ in a general multiply connected region $D$ bounded by fixed boundaries, the stream function of the fluid motion is given by

$$\cdots\; - \frac{1}{2} \sum_{k=1}^{N} \Gamma_k^2\, g(x_k, y_k; x_k, y_k)$$ [31]

In rescaled coordinates $(x_k, \Gamma_k y_k)$, [30] is a Hamiltonian system in canonical form. For historical reasons, $H$ is often called the Kirchhoff–Routh path function. Analyzing the separate contributions to the path function [31] is instructive: the first term is the contribution from flows imposed from outside (e.g., background flows and round-island circulations), the second term is the "free-space" contribution (it is the relevant Hamiltonian when no boundaries are present), while the third term encodes the effect of the boundary walls (or, the effect of the "image vorticity" distribution discussed earlier).

Lin (1941a) went on to show that, with the Hamiltonian in some $D$ given by $H$ in [31], the Hamiltonian relevant to vortex motion in another domain obtained from $D$ by a conformal mapping $z(\zeta)$ consists of [31] with some simple extra additive contributions dependent only on the derivative of the map $z(\zeta)$ evaluated at the point vortex positions.

Flucher and Gustafsson (1997) also introduce the Robin function $R(x_0, y_0)$ defined as the regular part of the above hydrodynamic Green's function evaluated at the point vortex. Indeed, $R(x_0, y_0) \equiv g(x_0, y_0; x_0, y_0)$, where $g$ is defined in [27]. An interesting fact is that, for single-vortex motion in a simply connected domain, $R(x_0, y_0)$ satisfies the quasilinear elliptic Liouville equation everywhere in
$D$ with the boundary condition that it becomes infinite everywhere on the boundary of $D$.

By combining the Kirchhoff–Routh theory with conformal mapping theory, many interesting problems can be studied. What happens, for example, if there is a gap in the wall of Figure 16? In recent work, Johnson and McDonald (2005) show that if the vortex starts off, far from the gap, at a distance of less than half the gap width from the wall, then it will eventually penetrate the gap. Otherwise, it will dip towards the gap but not go through it. The trajectories are shown in Figure 17.

Unfortunately, Lin did not provide any explicit analytical expressions for $G$ in the multiply connected case. This has limited the applicability of his theory to fluid regions that are simply or doubly connected. Recently, however, Lin's theory has been brought to implementational fruition by Crowdy and Marshall (2005a), who, up to conformal mapping, have derived explicit formulas for the hydrodynamic Green's function in multiply connected fluid regions of arbitrary finite connectivity. Their approach makes use of elements of classical function theory dating back to the work of Poincaré, Schottky, and Klein (among others). This allows new problems involving bounded vortex motion to be tackled. For example, the motion of a single vortex around multiple circular islands has been studied in Crowdy and Marshall (2005b), thereby extending recent work on the two-island problem (Johnson and McDonald 2005). If the wall in Figure 17 happens to have two (or more) gaps, then the fluid region is multiply connected. The two-gap (doubly connected) case was recently solved by Johnson and McDonald (2005) using Schwarz–Christoffel maps combined with elements of elliptic function theory (see Figure 18). Crowdy and Marshall have solved the problem of an arbitrary number of gaps in a wall by exploiting the new general theory presented in Crowdy and Marshall (2005a,b) (and related works by the authors). The case of a wall with three gaps represents a triply connected fluid region and the critical vortex trajectory is plotted in Figure 19.

Figure 16 The motion of a point vortex (circulation $\Gamma$) near an infinite straight wall. The vortex moves, at constant speed, maintaining a constant distance from the wall. Other possible trajectories are shown; they are all straight lines parallel to the wall. The motion can be thought of as being induced by an opposite-circulation "image" vortex (circulation $-\Gamma$) at the reflected point in the wall.

Figure 17 Distribution of point vortex trajectories near a wall with a single gap of length 2. There is a critical trajectory which, far from the gap, is unit distance from the wall.

Figure 18 The critical trajectory when there are two symmetric gaps in a wall. The fluid region is now doubly connected. This problem is solved in Johnson and McDonald (2005) and Crowdy and Marshall (2005).

Figure 19 The critical vortex trajectories when there are three gaps in the wall. This time the fluid region is triply connected. This problem is solved in Crowdy and Marshall (2005) using the general methods in Crowdy and Marshall (2005).

Point vortex motion in bounded domains on the surface of a sphere has received scant attention in
the literature, although Kidambi and Newton (2000) and Newton (2001) have recently made a contribution. Such paradigms are clearly relevant to planetary-scale oceanographic flows in which oceanic eddies interact with topography such as ridges and land masses and deserve further study.

Further Reading

Albouy A (1996) The symmetric central configurations of four equal masses. Contemporary Mathematics 198: 131–135.
Aref H and Stremler MA (1999) Four-vortex motion with zero total circulation and impulse. Physics of Fluids 11(12): 3704–3715.
Aref H, Newton PK, Stremler MA, Tokieda T, and Vainchtein DL (2003) Vortex crystals. Advances in Applied Mechanics 39: 1–79.
Ball WWR (1940) A Short Account of the History of Mathematics, 12th edn. London: MacMillan.
Boatto S and Pierrehumbert RT (1999) Dynamics of a passive tracer in the velocity field of four identical point-vortices. Journal of Fluid Mechanics 394: 137–174.
Boatto S and Cabral HE (2003) Nonlinear stability of a latitudinal ring of point vortices on a non-rotating sphere. SIAM Journal of Applied Mathematics 64: 216–230.
Boatto S and Simó C (2004) Stability of latitudinal vortex rings with polar vortices. Mathematical Physics Preprint Archive (mp_arc) 04-67.
Cabral HE and Schmidt DS (1999/00) Stability of relative equilibria in the problem of N + 1 vortices. SIAM Journal of Mathematical Analysis 31(2): 231–250.
Carnevale GF, McWilliams JC, Pomeau Y, Weiss JB, and Young WR (1991) Evolution of vortex statistics in two-dimensional turbulence. Physical Review Letters 66(21): 2735–2737.
Carnevale GF, McWilliams JC, Pomeau Y, Weiss JB, and Young WR (1992) Rates, pathways, and end states of nonlinear evolution in decaying two-dimensional turbulence. Physics of Fluids A 4(6): 1314–1316.
Castilla MSAC, Moauro V, Negrini P, and Oliva WM (1993) The four positive vortices problem – regions of chaotic behavior and non-integrability. Annales de l'Institut Henri Poincaré. Section A. Physique Théorique 59(1): 99–115.
Crowdy DG and Marshall JS (2005a) Analytical formulae for the Kirchhoff–Routh path function in multiply connected domains. Proceedings of the Royal Society A 461: 2477–2501.
Crowdy DG and Marshall JS (2005b) The motion of a point vortex around multiple circular islands. Physics of Fluids 17: 560–602.
von Helmholtz H (1858) On the integrals of the hydrodynamical equations which express vortex motion. Philosophical Magazine 4(33): 485–512.
Jansson TRN, Haspang M, Jensen KH, Hersen P, and Bohr T (2005) Rotating polygons on a fluid surface. Preprint.
Johnson ER and McDonald NR (2005) Vortices near barriers with multiple gaps. Journal of Fluid Mechanics 531: 335–358.
Kidambi R and Newton PK (2000) Point vortex motion on a sphere with solid boundaries. Physics of Fluids 12: 581–588.
Kimura Y (1999) Vortex motion on surfaces with constant curvature. Proceedings of the Royal Society of London A 455: 245–259.
Kirchhoff G (1876) Vorlesungen über mathematische Physik, Mechanik. Leipzig.
Koiller J and Carvalho SP (1989) Non-integrability of the 4-vortex system: analytical proof. Communications in Mathematical Physics 120(4): 643–652.
Kuznetsov L and Zaslavsky GM (2000) Passive tracer transport in three-vortex flow. Physical Review E 61(4): 3777–3792.
Lim C, Montaldi J, and Roberts M (2001) Relative equilibria of point vortices on the sphere. Physica D 148: 97–135.
Lin CC (1941a) On the motion of vortices in two dimensions. I. Existence of the Kirchhoff–Routh function. Proceedings of the National Academy of Sciences 27(12): 570–575.
Lin CC (1941b) On the motion of vortices in two dimensions. II. Some further investigations on the Kirchhoff–Routh function. Proceedings of the National Academy of Sciences 27(12): 575–577.
Marchioro C and Pulvirenti M (1993) Vortices and localization in Euler flows. Communications in Mathematical Physics 154: 49–61.
Marchioro C and Pulvirenti M (1994) Mathematical Theory of Incompressible Non-viscous Fluids. vol. 96, AMS. New York: Springer.
Mayer AM (1878) Floating magnets. Nature 17: 487–488.
Mayer AM (1878) Scientific American 2045–2047.
Mayer AM (1878) On the morphological laws of the configurations formed by magnets floating vertically and subjected to the attraction of a superposed magnet. American Journal of Science 16: 247–256.
Montaldi J, Soulière A, and Tokieda T (2002) Vortex dynamics on a cylinder. SIAM Journal of Applied Dynamical Systems 2(3): 417–430.
Newton PK (2001) The N-Vortex Problem. Analytical Techniques. New York: Springer.
Ottino JM (1990) The Kinematics of Mixing: Stretching, Chaos and Transport. Cambridge: Cambridge University Press.
Pingree RD and Le Cann B (1992) Anticyclonic Eddy X91 in the Southern Bay of Biscay. Journal of Geophysical Research 97: 14353–14362.
Dritschel DG (1985) The stability and energetics of co-rotating Polvani LM and Dritschel DG (1993) Wave and vortex dynamics on
uniform vortices. Journal of Fluid Mechanics 157: 95–134. the surface of a sphere. Journal of Fluid Mechanics 255: 35–64.
Flucher M and Gustafsson B (1997) Vortex Motion in Two Poupaud F (2002) Diagonal defect measures, adhesion dynamics
Dimensional Hydrodynamics. Royal Institute of Technology and Euler equation. Methods and Applications of Analysis
Report No. TRITA-MAT-1997-MA-02. 9(4): 533–562.
Glass K (2000) Symmetry and bifurcations of planar configura- Rom-Kedar V, Leonard A, and Wiggins S (1990) An analytical
tions of the N-body and other problems. Dynamics and study of transport, mixing and chaos in an unsteady vortical
Stability of Systems 15(2): 59–73. flow. Journal of Fluid Mechanics 214: 347–394.
van Heijst GJF and Flor JB (1989) Dipole formation and Routh E (1881) Some applications of conjugate functions.
collisions in a stratified fluid. Nature 340: 212–215. Proceedings of the London Mathematical Society 12: 73–89.
van Heijst GJF, Kloosterziel RC, and Williams CWM (1991) Sakajo T (2004) Transition of global dynamics of a polygonal vortex
Laboratory experiments on the tripolar vortex in a rotating ring on a sphere with pole vortices. Physica D 196: 243–264.
fluid. Journal of Fluid Mechanics 225: 301–331. Schochet S (1995) The weak vorticity formulation of the 2-D
van Heijst GJF (1993) Self-organization of two-dimensional Euler equations and concentration-cancellation. Communi-
flows. Nederlands Tijdschrift voor Natuurkunde 59: cations in Partial Differential Equations 20(5&6):
321–325 (http://www.fluid.tue.nl). 1077–1104.
Poisson Reduction 79
Soulière A and Tokieda T (2002) Periodic motion of vortices on Thomson W (1867) On vortex atoms. Proceedings of the Royal
surfaces with symmetries. Journal of Fluid Mechanics 460: Society of Edinburgh 6: 94–105.
83–92. Yarmchuk EJ, Gordon MJV, and Packard RE (1979) Observation
Thomson JJ (1883) A Treatise on the Motion of Vortex Rings. of stationary vortices arrays in rotating superfluid helium.
New York: Macmillan. Physical Review Letters 43(3): 214–217.
Thomson JJ (1904) Electricity and Matter. Westmister Archibald Ziglin SL (1982) Quasi-periodic motions of vortex systems.
Constable. Physica D 4: 261–269 (addendum to K M Khanin).
Poisson Lie Groups see Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups
Poisson Reduction
J-P Ortega, Université de Franche-Comté, Besançon, France
T S Ratiu, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Poisson reduction techniques allow the construction of new Poisson structures out of a given one by a combination of two operations: “restriction” to submanifolds that satisfy certain compatibility assumptions, and passage to a “quotient space” where certain degeneracies have been eliminated. For certain kinds of reduction, it is necessary to pass first to a submanifold and then take a quotient. Before making this more explicit, we introduce the notations that will be used in this article. All manifolds in this article are finite dimensional.

Poisson Manifolds
A “Poisson manifold” is a pair $(M, \{\cdot,\cdot\})$, where $M$ is a manifold and $\{\cdot,\cdot\}$ is a bilinear operation on $C^\infty(M)$ such that $(C^\infty(M), \{\cdot,\cdot\})$ is a Lie algebra and $\{\cdot,\cdot\}$ is a derivation (i.e., the Leibniz identity holds) in each argument. The pair $(C^\infty(M), \{\cdot,\cdot\})$ is also called a “Poisson algebra.” The functions in the center $C(M)$ of the Lie algebra $(C^\infty(M), \{\cdot,\cdot\})$ are called “Casimir functions.” From the natural isomorphism between derivations on $C^\infty(M)$ and vector fields on $M$, it follows that each $h \in C^\infty(M)$ induces a vector field on $M$ via the expression $X_h = \{\cdot, h\}$, called the “Hamiltonian vector field” associated to the “Hamiltonian function” $h$. The triplet $(M, \{\cdot,\cdot\}, h)$ is called a “Poisson dynamical system.” Any Hamiltonian system on a symplectic manifold is a Poisson dynamical system relative to the Poisson bracket induced by the symplectic structure.
Given a Poisson dynamical system $(M, \{\cdot,\cdot\}, h)$, its “integrals of motion” or “conserved quantities” are defined as the centralizer of $h$ in $(C^\infty(M), \{\cdot,\cdot\})$, that is, the subalgebra of $(C^\infty(M), \{\cdot,\cdot\})$ consisting of the functions $f \in C^\infty(M)$ such that $\{f, h\} = 0$. The terminology is justified since, by Hamilton's equations in Poisson bracket form, we have $\dot f = X_h[f] = \{f, h\} = 0$, that is, $f$ is constant on the flow of $X_h$.
A smooth mapping $\varphi : M_1 \to M_2$ between two Poisson manifolds $(M_1, \{\cdot,\cdot\}_1)$ and $(M_2, \{\cdot,\cdot\}_2)$ is called “canonical” or “Poisson” if for all $g, h \in C^\infty(M_2)$ we have $\varphi^*\{g, h\}_2 = \{\varphi^* g, \varphi^* h\}_1$. Equivalently, $\varphi$ is a Poisson map if and only if $T\varphi \circ X_{h \circ \varphi} = X_h \circ \varphi$ for any $h \in C^\infty(M_2)$, where $T\varphi : TM_1 \to TM_2$ denotes the tangent map (or derivative) of $\varphi$.
Let $(S, \{\cdot,\cdot\}_S)$ and $(M, \{\cdot,\cdot\}_M)$ be two Poisson manifolds such that $S \subset M$ and the inclusion $i_S : S \hookrightarrow M$ is an immersion. The Poisson manifold $(S, \{\cdot,\cdot\}_S)$ is called a “Poisson submanifold” of $(M, \{\cdot,\cdot\}_M)$ if $i_S$ is a canonical map. An immersed submanifold $Q$ of $M$ is called a “quasi-Poisson submanifold” of $(M, \{\cdot,\cdot\}_M)$ if for any $q \in Q$, any open neighborhood $U$ of $q$ in $M$, and any $f \in C^\infty(U)$ we have $X_f(i_Q(q)) \in T_q i_Q(T_q Q)$, where $i_Q : Q \hookrightarrow M$ is the inclusion and $X_f$ is the Hamiltonian vector field of $f$ on $U$ with respect to the Poisson bracket of $M$ restricted to $U$. If $(S, \{\cdot,\cdot\}_S)$ is a Poisson submanifold of $(M, \{\cdot,\cdot\}_M)$, then there is no other bracket $\{\cdot,\cdot\}'$ on $S$ making the inclusion $i : S \hookrightarrow M$ into a canonical map. If $Q$ is a quasi-Poisson submanifold of $(M, \{\cdot,\cdot\})$, then there exists a unique Poisson structure $\{\cdot,\cdot\}_Q$ on $Q$ that makes it into a Poisson submanifold of $(M, \{\cdot,\cdot\})$, but this Poisson structure may be different from the given one on $Q$. Any Poisson submanifold is quasi-Poisson but the converse is not true in general.
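The notions of Poisson bracket, Casimir function, and conserved quantity can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses the Lie-Poisson ("rigid body") bracket on $\mathbb{R}^3$, $\{f, g\}(x) = -x \cdot (\nabla f \times \nabla g)$, which is not discussed in the text above, and all function names (`grad`, `bracket`, `casimir`, `h`, `f`) are hypothetical helpers chosen for the example.

```python
# Illustrative sketch (assumption: the Lie-Poisson bracket on R^3,
# {f, g}(x) = -x . (grad f x grad g); this example is not taken from the article).

def grad(func, x, eps=1e-5):
    """Central-difference gradient of func at the point x in R^3."""
    g = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((func(xp) - func(xm)) / (2 * eps))
    return g

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def bracket(f1, f2, x):
    """Poisson bracket {f1, f2}(x) for the Lie-Poisson structure on R^3."""
    return -dot(x, cross(grad(f1, x), grad(f2, x)))

def casimir(x):
    """Candidate Casimir function: half the squared norm."""
    return 0.5 * dot(x, x)

def h(x):
    """A sample Hamiltonian (rigid-body type, hypothetical inertia values 1, 2, 3)."""
    return 0.5 * (x[0] ** 2 / 1.0 + x[1] ** 2 / 2.0 + x[2] ** 2 / 3.0)

def f(x):
    """An arbitrary test function."""
    return x[0] * x[1] + x[2]

x0 = (0.3, -1.2, 0.7)
print("{C, h}(x0) =", bracket(casimir, h, x0))   # numerically zero: C is conserved
print("{C, f}(x0) =", bracket(casimir, f, x0))   # numerically zero for any f
print("{f, h} + {h, f} =", bracket(f, h, x0) + bracket(h, f, x0))  # antisymmetry
```

Since the bracket of `casimir` with any function vanishes, it lies in the center of the Poisson algebra, and in particular it is an integral of motion for every Hamiltonian `h`.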
The Poisson Tensor and Symplectic Leaves
The derivation property of the Poisson bracket implies that for any two functions $f, g \in C^\infty(M)$, the value of the bracket $\{f, g\}(z)$ at an arbitrary point $z \in M$ (and therefore $X_f(z)$ as well) depends on $f$ only through $df(z)$. This allows us to define a contravariant antisymmetric 2-tensor $B \in \Lambda^2(T^*M)$, called the “Poisson tensor,” by $B(z)(\alpha_z, \beta_z) = \{f, g\}(z)$, where $df(z) = \alpha_z \in T_z^*M$ and $dg(z) = \beta_z \in T_z^*M$. The vector bundle map $B^\sharp : T^*M \to TM$ over the identity naturally associated to $B$ is defined by $B(z)(\alpha_z, \beta_z) = \langle \alpha_z, B^\sharp(\beta_z) \rangle$. Its range $D := B^\sharp(T^*M) \subset TM$ is called the “characteristic distribution” of $(M, \{\cdot,\cdot\})$; $D$ is a generalized smooth integrable distribution. Its maximal integral leaves are called the “symplectic leaves” of $M$, for they carry a symplectic structure that makes them into Poisson submanifolds. As integral leaves of an integrable distribution, the symplectic leaves $L$ are “initial submanifolds” of $M$, that is, the inclusion $i : L \hookrightarrow M$ is an injective immersion such that for any smooth manifold $P$, an arbitrary map $g : P \to L$ is smooth if and only if $i \circ g : P \to M$ is smooth.

symplectic orbit reduced space $M_{\mathcal{O}}$ (see Symmetry and Symplectic Reduction). If, additionally, $G$ is compact, $M$ is connected, and the momentum map $J$ is proper, then $M^c_{\mathcal{O}} = M_{\mathcal{O}}$.
In the remainder of this section, we characterize the situations in which new Poisson manifolds can be obtained out of a given one by a combination of restriction to a submanifold and passage to the quotient with respect to an equivalence relation that encodes the symmetries of the bracket.
Definition 1 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $D \subset TM$ a smooth distribution on $M$. The distribution $D$ is called “Poisson” or “canonical” if the condition $df|_D = dg|_D = 0$, for any $f, g \in C^\infty(U)$ and any open subset $U \subset M$, implies that $d\{f, g\}|_D = 0$.
Unless strong regularity assumptions are invoked, the passage to the leaf space of a canonical distribution destroys the smoothness of the quotient topological space. In such situations, the Poisson algebra of functions is too small and the notion of presheaf of Poisson algebras is needed. See Singularity and Bifurcation Theory for more information on singularity theory.
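The Poisson tensor, the map $B^\sharp$, and the symplectic leaves defined in this section can be made concrete on a standard example. The block below is a sketch under an assumption that is not part of the article: it takes $M = \mathbb{R}^3$ with the Lie-Poisson bracket $\{f, g\}(x) = -x \cdot (\nabla f \times \nabla g)$ and identifies covectors with vectors via the Euclidean inner product.

```latex
% Assumed example: Lie-Poisson structure on M = R^3 (not taken from the article).
\begin{align*}
B(x)(\alpha, \beta) &= -\,x \cdot (\alpha \times \beta)
   = \alpha \cdot (x \times \beta)
   = \langle \alpha, B^\sharp(x)(\beta) \rangle ,
   \qquad\text{so}\qquad B^\sharp(x)(\beta) = x \times \beta , \\
D(x) &= B^\sharp(x)(T_x^*M) = \{\, x \times \beta \mid \beta \in \mathbb{R}^3 \,\}
   = \begin{cases} x^\perp = T_x S^2_{\|x\|}, & x \neq 0, \\ \{0\}, & x = 0 . \end{cases}
\end{align*}
```

Under this assumption the rank of $D$ jumps from 2 to 0 at the origin, which is why $D$ is only a generalized distribution; its maximal integral leaves, the symplectic leaves, are the spheres centered at the origin together with the origin itself.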
be the pieces of this decomposition. The topology of $S$ is not necessarily the relative topology as a subset of $M$. Then $D \subset TM|_S$ is called a “smooth distribution” on $S$ adapted to the decomposition $\{S_i\}_{i \in I}$ if $D \cap TS_i$ is a smooth distribution on $S_i$ for all $i \in I$. The distribution $D$ is said to be “integrable” if $D \cap TS_i$ is integrable for each $i \in I$.
In the situation described by the previous definition and if $D$ is integrable, the integrability of the distributions $D_{S_i} := D \cap TS_i$ on $S_i$ allows us to partition each $S_i$ into the corresponding maximal integral manifolds. Thus, there is an equivalence relation on $S_i$ whose equivalence classes are precisely these maximal integral manifolds. Doing this on each $S_i$, we obtain an equivalence relation $D_S$ on the whole set $S$ by taking the union of the different equivalence classes corresponding to all the $D_{S_i}$. Define the quotient space $S/D_S$ by
\[ S/D_S := \bigcup_{i \in I} S_i / D_{S_i} \]
and let $\pi_{D_S} : S \to S/D_S$ be the natural projection.

The Presheaf of Smooth Functions on $S/D_S$
Define the presheaf of smooth functions $C^\infty_{S/D_S}$ on $S/D_S$ as the map that associates to any open subset $V$ of $S/D_S$ the set of functions $C^\infty_{S/D_S}(V)$ characterized by the following property: $f \in C^\infty_{S/D_S}(V)$ if and only if for any $z \in V$ there exist $m \in \pi_{D_S}^{-1}(V)$, an open neighborhood $U_m$ of $m$ in $M$, and $F \in C^\infty(U_m)$ such that
\[ f \circ \pi_{D_S}\big|_{\pi_{D_S}^{-1}(V) \cap U_m} = F\big|_{\pi_{D_S}^{-1}(V) \cap U_m} \qquad [2] \]
$F$ is called a “local extension” of $f \circ \pi_{D_S}$ at the point $m \in \pi_{D_S}^{-1}(V)$. When the distribution $D$ is trivial, the presheaf $C^\infty_{S/D_S}$ coincides with the presheaf of Whitney smooth functions $C^\infty_{S,M}$ on $S$ induced by the smooth functions on $M$.
The presheaf $C^\infty_{S/D_S}$ is said to have the $(D, D_S)$-local extension property when the topology of $S$ is stronger than the relative topology and, at the same time, the local extensions of $f \circ \pi_{D_S}$ defined in [2] can always be chosen to satisfy

Definition 5 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $S$ a decomposed subset of $M$, and $D \subset TM|_S$ a Poisson-integrable generalized distribution adapted to the decomposition of $S$. Assume that $C^\infty_{S/D_S}$ has the $(D, D_S)$-local extension property. Then $(M, \{\cdot,\cdot\}, D, S)$ is said to be “Poisson reducible” if $(S/D_S, C^\infty_{S/D_S}, \{\cdot,\cdot\}^{S/D_S})$ is a well-defined presheaf of Poisson algebras where, for any open set $V \subset S/D_S$, the bracket $\{\cdot,\cdot\}^{S/D_S}_V : C^\infty_{S/D_S}(V) \times C^\infty_{S/D_S}(V) \to C^\infty_{S/D_S}(V)$ is given by
\[ \{f, g\}^{S/D_S}_V(\pi_{D_S}(m)) := \{F, G\}(m) \]
for any $m \in \pi_{D_S}^{-1}(V)$ and for local $D$-invariant extensions $F, G$ at $m$ of $f \circ \pi_{D_S}$ and $g \circ \pi_{D_S}$, respectively.
Theorem 1 Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold with associated Poisson tensor $B \in \Lambda^2(T^*M)$, $S$ a decomposed space, and $D \subset TM|_S$ a Poisson-integrable generalized distribution adapted to the decomposition of $S$ (see Definitions 4 and 1). Assume that $C^\infty_{S/D_S}$ has the $(D, D_S)$-local extension property. Then $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible if for any $m \in S$
\[ B^\sharp(\Delta_m) \subset \left[\Delta^S_m\right]^\circ \qquad [3] \]
where $\Delta_m := \{dF(m) \mid F \in C^\infty(U_m),\ dF(z)|_{D(z)} = 0 \text{ for all } z \in U_m \cap S, \text{ and for any open neighborhood } U_m \text{ of } m \text{ in } M\}$ and $\Delta^S_m := \{dF(m) \in \Delta_m \mid F|_{U_m \cap V_m} \text{ is constant for an open neighborhood } U_m \text{ of } m \text{ in } M \text{ and an open neighborhood } V_m \text{ of } m \text{ in } S\}$.
If $S$ is endowed with the relative topology, then $\Delta^S_m := \{dF(m) \in \Delta_m \mid F|_{U_m \cap V_m} \text{ is constant for an open neighborhood } U_m \text{ of } m \text{ in } M\}$.

Reduction by Regular Canonical Distributions
Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $S$ an embedded submanifold of $M$. Let $D \subset TM|_S$ be a sub-bundle of the tangent bundle of $M$ restricted to $S$ such that $D_S := D \cap TS$ is a smooth, integrable, regular distribution on $S$ and $D$ is canonical.
Theorem 2 With the above hypotheses, $(M, \{\cdot,\cdot\}, D, S)$ is Poisson reducible if and only if
\[ B^\sharp(D^\circ) \subset TS + D \qquad [4] \]
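A simple special case may help in reading the condition of Theorem 2. The sketch below is an illustration under assumptions that are not part of the text above: a Lie group $G$ (hypothetical notation) acts freely and properly on $M$ by Poisson maps, $S = M$, and $D$ is the bundle tangent to the group orbits, with $\pi : M \to M/G$ the orbit projection.

```latex
% Assumed setting: free, proper, canonical action of a Lie group G on M; S = M.
\begin{align*}
S = M, \quad D_m = T_m(G \cdot m)
  \;&\Longrightarrow\; D_S = D \cap TS = D , \\
B^\sharp(D^\circ) \subset TM = TS + D
  \;&\Longrightarrow\; \text{the condition of Theorem 2 holds automatically}, \\
\{f, g\}^{M/G} \circ \pi &= \{f \circ \pi,\; g \circ \pi\}
  \qquad \text{for } f, g \in C^\infty(M/G).
\end{align*}
```

In this situation $D$ is canonical because the bracket of two locally $G$-invariant functions is again locally $G$-invariant, and the reduced bracket on the orbit space is simply the bracket of the invariant lifts.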
symplectic leaf of $(S, \{\cdot,\cdot\}_S)$ is a symplectic submanifold of the symplectic leaf of $(M, \{\cdot,\cdot\})$ that contains it.
(v) Let $L_s$ and $L^S_s$ be the symplectic leaves of $(M, \{\cdot,\cdot\})$ and $(S, \{\cdot,\cdot\}_S)$, respectively, that contain the point $s \in S$. Let $\omega_{L_s}$ and $\omega_{L^S_s}$ be the corresponding symplectic forms. Then $B^\sharp(s)((T_sS)^\circ)$ is a symplectic subspace of $T_sL_s$ and
\[ B^\sharp(s)((T_sS)^\circ) = \left(T_sL^S_s\right)^{\omega_{L_s}(s)} \qquad [8] \]
where $(T_sL^S_s)^{\omega_{L_s}(s)}$ denotes the $\omega_{L_s}(s)$-orthogonal complement of $T_sL^S_s$ in $T_sL_s$.
(vi) Let $B_S \in \Lambda^2(T^*S)$ be the Poisson tensor associated to $(S, \{\cdot,\cdot\}_S)$. Then
\[ B^\sharp_S = \pi_S \circ B^\sharp|_S \circ \pi_S^* \qquad [9] \]
where $\pi_S^* : T^*S \to T^*M|_S$ is the dual of $\pi_S : TM|_S \to TS$.

The “Dirac constraints formula” is the expression in coordinates for the bracket of a cosymplectic submanifold. Let $(M, \{\cdot,\cdot\})$ be an $n$-dimensional Poisson manifold and let $S$ be a $k$-dimensional cosymplectic submanifold of $M$. Let $z_0$ be an arbitrary point in $S$ and $(U, \kappa)$ a submanifold chart around $z_0$ such that $\kappa = (\varphi, \psi) : U \to V_1 \times V_2$, where $V_1$ and $V_2$ are two open neighborhoods of the origin in two Euclidean spaces such that $\kappa(z_0) = (\varphi(z_0), \psi(z_0)) = (0, 0)$ and
\[ \kappa(U \cap S) = V_1 \times \{0\} \qquad [10] \]
Let $\varphi =: (\bar\varphi^1, \ldots, \bar\varphi^k)$ be the components of $\varphi$ and define $\hat\varphi^1 := \bar\varphi^1|_{U \cap S}, \ldots, \hat\varphi^k := \bar\varphi^k|_{U \cap S}$. Extend $\hat\varphi^1, \ldots, \hat\varphi^k$ to $D$-invariant functions $\varphi^1, \ldots, \varphi^k$ on $U$.
Additionally, since the functions $\varphi^1, \ldots, \varphi^k$ are $D$-invariant, by [6], it follows that
\[ X_{\varphi^1}(s) = X_{\hat\varphi^1}(s) \in T_sS, \ \ldots, \ X_{\varphi^k}(s) = X_{\hat\varphi^k}(s) \in T_sS \]
for any $s \in S$. Consequently, $\{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s), X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\}$ spans $T_sL_s$ with
\[ \{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s)\} \subset T_sS \cap T_sL_s \]
and
\[ \{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\} \subset B^\sharp(s)((T_sS)^\circ) \]
By Proposition 2(i),
\[ \operatorname{span}\{X_{\varphi^1}(s), \ldots, X_{\varphi^k}(s)\} = T_sS \cap T_sL_s \]
and
\[ \operatorname{span}\{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\} = B^\sharp(s)((T_sS)^\circ) \]
Since $\dim(B^\sharp(s)((T_sS)^\circ)) = n - k$ by Proposition 2(iii), it follows that $\{X_{\psi^1}(s), \ldots, X_{\psi^{n-k}}(s)\}$ is a basis of $B^\sharp(s)((T_sS)^\circ)$.
Since $B^\sharp(s)((T_sS)^\circ)$ is a symplectic subspace of $T_sL_s$ by Theorem 3(v), there exists some $r \in \mathbb{N}$ such that $n - k = 2r$ and, additionally, the matrix $C(s)$ with entries
\[ C^{ij}(s) := \{\psi^i, \psi^j\}(s), \qquad i, j \in \{1, \ldots, n - k\} \]
is invertible. Therefore, in the coordinates $(\varphi^1, \ldots, \varphi^k, \psi^1, \ldots, \psi^{n-k})$, the matrix associated to the Poisson tensor $B(s)$ is
\[ B(s) = \begin{pmatrix} B_S(s) & 0 \\ 0 & C(s) \end{pmatrix} \]
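For orientation, the block form of $B(s)$ obtained above is what underlies the classical Dirac bracket. The display below is a sketch of the standard form of that formula, stated here as an assumption since the excerpt does not reach it; $F$ and $G$ denote arbitrary local extensions of $f, g \in C^\infty(S)$ (hypothetical notation), and $C_{ij}(s)$ denotes the entries of the inverse of the matrix $C(s) = (\{\psi^i, \psi^j\}(s))$.

```latex
% Standard Dirac bracket (assumed form); C_{ij} are the entries of C(s)^{-1}.
\[
\{f, g\}_S(s) \;=\; \{F, G\}(s)
  \;-\; \sum_{i,j=1}^{n-k} \{F, \psi^i\}(s)\, C_{ij}(s)\, \{\psi^j, G\}(s),
  \qquad s \in S .
\]
% Consistency check: replacing F by a constraint psi^l gives
% {psi^l, G} - sum_{i,j} C^{li} C_{ij} {psi^j, G} = {psi^l, G} - {psi^l, G} = 0.
```

The check in the comment shows that each constraint function $\psi^l$ has vanishing bracket with every function under this formula, which is consistent with the bracket being intrinsic to $S$.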
See also: Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Cotangent Bundle Reduction; Graded Poisson Algebras; Symmetry and Symplectic Reduction; Hamiltonian Group Actions; Lie, Symplectic, and Poisson Groupoids and their Lie Algebroids; Singularity and Bifurcation Theory.
Polygonal Billiards
S Tabachnikov, Pennsylvania State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Mechanical Examples. Unfolding Billiard Trajectories
The billiard system inside a polygon $P$ has a very simple description: a point moves rectilinearly with unit speed until it hits a side of $P$; there it instantaneously changes its velocity according to the rule “the angle of incidence equals the angle of reflection,” and continues the rectilinear motion. If the point hits a corner, its further motion is not defined (see Billiards in Bounded Convex Domains). From the point of view of the theory of dynamical systems, polygonal billiards provide an example of parabolic dynamics, in which nearby trajectories diverge at a subexponential rate.
One of the motivations for the study of polygonal billiards comes from the mechanics of elastic particles in dimension 1. For example, consider the system of two point-masses $m_1$ and $m_2$ on the positive half-line $x \ge 0$. The collision between the points is elastic, that is, the energy and momentum are conserved. The reflection off the left endpoint of the half-line is also elastic: if a point hits the “wall” $x = 0$, its velocity changes sign. The configuration space of this system is the wedge $0 \le x_1 \le x_2$. After the rescaling $\bar x_i = \sqrt{m_i}\, x_i$, $i = 1, 2$, this system identifies with the billiard inside a wedge with the angle measure $\arctan \sqrt{m_1/m_2}$.
Likewise, the system of two elastic point-masses on a segment is the billiard system in a right triangle; a system of a number of elastic point-masses on the positive half-line or a segment is the billiard inside a multidimensional polyhedral cone or a polyhedron, respectively. The system of three elastic point-masses on a circle has three degrees of freedom; one can reduce one by assuming that the center of mass of the system is fixed. The resulting two-dimensional system is the billiard inside an acute triangle with the angles
\[ \arctan\left( m_i \sqrt{\frac{m_1 + m_2 + m_3}{m_1 m_2 m_3}} \right), \qquad i = 1, 2, 3 \]
For comparison, the more realistic system of elastic balls identifies with the billiard system in a domain with nonflat boundary components.
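The angle formulas above can be checked numerically. The following sketch is an illustration only; the function names and the sample masses and velocities are hypothetical choices made for the example. It computes the wedge and triangle angles and verifies that, in the rescaled coordinates $\bar x_i = \sqrt{m_i}\, x_i$, an elastic collision acts on the velocity as a mirror reflection in the wall $x_1 = x_2$.

```python
import math

def wedge_angle(m1, m2):
    """Opening angle of the wedge for two point-masses on the half-line."""
    return math.atan(math.sqrt(m1 / m2))

def triangle_angles(m1, m2, m3):
    """Angles of the billiard triangle for three point-masses on a circle."""
    k = math.sqrt((m1 + m2 + m3) / (m1 * m2 * m3))
    return [math.atan(m * k) for m in (m1, m2, m3)]

def elastic_collision(m1, m2, v1, v2):
    """Velocities after a 1D elastic collision (energy and momentum conserved)."""
    total = m1 + m2
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total
    return u1, u2

def rescale(m1, m2, v1, v2):
    """Velocity vector in the rescaled coordinates sqrt(m_i) * x_i."""
    return (math.sqrt(m1) * v1, math.sqrt(m2) * v2)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

# Sample masses (hypothetical values).
m1, m2, m3 = 1.0, 2.0, 5.0
angles = triangle_angles(m1, m2, m3)
print("wedge angle:", wedge_angle(m1, m2))
print("triangle angles:", angles, "sum:", sum(angles))

# A collision of the first two masses, seen in rescaled coordinates.
v1, v2 = 1.5, -0.4
u1, u2 = elastic_collision(m1, m2, v1, v2)
w_before = rescale(m1, m2, v1, v2)
w_after = rescale(m1, m2, u1, u2)
tangent = (math.sqrt(m1), math.sqrt(m2))    # direction of the wall x1 = x2 after rescaling
normal = (math.sqrt(m2), -math.sqrt(m1))    # direction orthogonal to that wall
print("tangential component:", dot(w_before, tangent), dot(w_after, tangent))
print("normal component:", dot(w_before, normal), dot(w_after, normal))
```

The tangential component of the rescaled velocity is the total momentum and is unchanged, while the normal component is proportional to the relative velocity and changes sign, which is exactly the billiard reflection law.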
In particular, one may talk about directions on a flat surface.
The group $PSL(2, \mathbb{R})$ acts on the space of flat structures. From the point of view of complex analysis, a flat surface is a Riemann surface with a holomorphic quadratic differential; the set of cone points corresponds to the zeros of the quadratic differential. Not every flat surface is associated with a polygonal billiard.
Concerning ergodicity, one has the theorem of Kerckhoff, Masur, and Smillie: given a flat surface of genus not less than 2, for almost all directions $\theta$ (in the sense of the Lebesgue measure), the flow $F_\theta$ is uniquely ergodic. Furthermore, the Hausdorff dimension of the set of angles $\theta$ for which ergodicity fails does not exceed 1/2, and this bound is sharp. As a consequence, the billiard flow on the invariant surface is uniquely ergodic for almost all directions. Another corollary: there is a dense $G_\delta$ subset in the space of polygons consisting of polygons for which the billiard flow is ergodic. If a billiard polygon admits approximation by rational polygons at a superexponentially fast rate, then the billiard flow in it is ergodic.
Concerning periodic orbits, one has the following theorem due to H Masur: given a flat surface of genus not less than 2, there exists a dense set of angles $\theta$ such that $F_\theta$ has a closed trajectory. As a consequence, for any rational billiard polygon, there is a dense set of directions each with a periodic orbit. Furthermore, periodic points are dense in the phase space of the billiard flow in a rational polygon.
Similarly to the case of a square, let $f(\ell)$ be the number of strips of periodic trajectories of length not greater than $\ell$ in a rational polygon $P$. By a theorem of H Masur, there exist constants $c$ and $C$ such that for sufficiently large $\ell$ one has $c\ell^2 < f(\ell) < C\ell^2$, and likewise for flat surfaces.
There is a class of flat surfaces, called Veech (or lattice) surfaces, for which more refined results are available. The groups of affine transformations of a flat surface determine a subgroup in $SL(2, \mathbb{R})$. If this subgroup is a lattice in $SL(2, \mathbb{R})$, then the flat surface is called a Veech surface. Similarly, one defines a Veech rational polygon. For example, regular polygons and isosceles triangles with equal angles $\pi/n$ are Veech. All acute Veech triangles are described.
For a Veech surface, one has the following Veech dichotomy: for any direction $\theta$, either the flow $F_\theta$ is minimal or its every leaf is closed (unless it is a saddle connection, i.e., a segment connecting cone points).
For a Veech surface (and polygon), the quadratic bounds for the counting function $f(\ell)$ become quadratic asymptotics: $f(\ell)/\ell^2$ has a limit as $\ell \to \infty$. The value of this limit is expressed in arithmetical terms. A generic flat surface also has quadratic asymptotics. The value of the limit depends only on the stratum of the Teichmüller space that contains this surface. These values are known, due to Eskin, Masur, Okounkov, and Zorich. Since a generic flat surface does not correspond to a rational polygon, this result does not immediately apply to polygonal billiards. However, quadratic asymptotics are established for rectangular billiards with barriers.
Note, in conclusion, a close relation between billiards in rational polygons and interval exchange transformations; the reduction of the former to the latter is a particular case of the reduction of the billiard flow to the billiard ball map. On an invariant surface $M$ of the billiard flow, consider a segment $I$ perpendicular to the directional flow. Since “the width of a beam” is an invariant transversal measure for the constant flow, the first return map to $I$ is a piecewise orientation-preserving isometry, that is, an interval exchange transformation.

Acknowledgment
This work was partially supported by NSF.

See also: Billiards in Bounded Convex Domains; Ergodic Theory; Fractal Dimensions in Dynamics; Generic Properties of Dynamical Systems; Holomorphic Dynamics; Hyperbolic Billiards; Riemann Surfaces.
88 Positive Maps on C*-Algebras
$a_1 \oplus a_2 \oplus \cdots \oplus a_k$ is positive if and only if the matrices $a_i$ have positive eigenvalues.
Example 3 When a $C^*$-algebra $A \subseteq B(H)$ is represented as a self-adjoint closed algebra of operators on a Hilbert space $H$, its positive elements are those which have non-negative spectrum.

Positive Maps on $C^*$-Algebras
Among the various relevant classes of maps between $C^*$-algebras, we are going to consider the following ones, whose properties are connected with the underlying structures of ordered vector spaces.
Definition 1 Given two $C^*$-algebras $A$ and $B$, a map $\phi : A \to B$ is called positive if $\phi(A_+) \subseteq B_+$. In other words, a map is positive if and only if it transforms the positive elements of $A$ into positive elements of $B$:
\[ a \in A \ \Rightarrow\ \phi(a^* a) \in B_+ \qquad [1] \]
If $A$ and $B$ have units, the map is called unital provided $\phi(1_A) = 1_B$.

Morphisms and Jordan Morphisms
A $*$-morphism between $C^*$-algebras $\phi : A \to B$ is positive; in fact, $\phi(a^* a) = \phi(a)^* \phi(a) \ge 0$.
This is also the case for Jordan $*$-morphisms, the linear maps satisfying $\phi(a^*) = \phi(a)^*$ and $\phi(\{a, b\}) = \{\phi(a), \phi(b)\}$, where $\{a, b\} = ab + ba$ denotes the Jordan product. In fact, if $a = a^*$ then $\phi(a^2) = \phi(a)^2$ is positive.

Schur's Product of Matrices
Let $A \in M_n(\mathbb{C})$ be a positive matrix and define a linear map $\phi_A : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ through the Schur product of matrices: $\phi_A(B) := [A_{ij} B_{ij}]_{i,j=1}^n$. Since the Schur product of positive matrices is positive too (i.e., the positive cone of $M_n(\mathbb{C})$ is a semigroup under the Schur product), the above map is positive.

Positive-Definite Function on Groups
Positive maps also arise naturally in harmonic analysis. Let $G$ be a locally compact topological group with identity $e$ and left Haar measure $m$. Let $p : G \to \mathbb{C}$ be a continuous positive-definite function on $G$. This just means that for all $n \ge 1$ and all $s_1, \ldots, s_n \in G$, the matrix $\{p(s_i^{-1} s_j)\}_{i,j=1}^n$ belongs to the positive cone of $M_n(\mathbb{C})$: $\sum_{i,j=1}^n p(s_i^{-1} s_j)\, \bar\lambda_i \lambda_j \ge 0$ for all $\lambda_1, \ldots, \lambda_n \in \mathbb{C}$. Such functions are necessarily bounded with $\|p\|_\infty \le p(e)$, so that an operator $\phi : L^1(G, m) \to L^1(G, m)$ is well defined by pointwise multiplication: $\phi(f)(s) := p(s) f(s)$. This map extends to a positive map $\phi : C^*(G) \to C^*(G)$, which is unital when $p(e) = 1$, on the full group $C^*$-algebra $C^*(G)$. When $G$ is amenable, this algebra coincides with the reduced $C^*$-algebra $C^*_r(G)$ so that, if $G$ is also unimodular (as is the case if $G$ is compact), the positive elements can be approximated by positive-definite functions in $L^1(G, m)$ and the positivity of $\phi$ follows exactly as in the previous example.

Positive Maps in Commutative $C^*$-Algebras
Positive maps $\phi : C_0(Y) \to C_0(X)$ between commutative $C^*$-algebras have the following structure: $\phi(a)(x) = \int_Y k(x, dy)\, a(y)$, $a \in C_0(Y)$. Here the kernel $x \mapsto k(x, \cdot)$ is a continuous map from $X$ to the space of positive Radon measures on $Y$. In case $X$ and $Y$ are compact, the map is unital provided $k(x, \cdot)$ is a probability measure for each $x \in X$. In fact, for a fixed $x \in X$, the map $a \mapsto \phi(a)(x)$ is a positive linear functional from $C_0(Y)$ to $\mathbb{C}$ and Riesz's theorem guarantees that it can be represented by a positive Radon measure on $Y$.
In probability theory, one-parameter semigroups $\phi_t \circ \phi_s = \phi_{t+s}$ of positive maps $\phi_t : C_0(X) \to C_0(X)$ such that $\phi_t(1) \le 1$ for all $t \ge 0$ are called Markovian semigroups (conservative, if the maps are unital). They represent the expectation at time $t > 0$ of Markovian stochastic processes on $X$. In this case, the time-dependent kernel $k(t, x, \cdot)$ represents the probability distribution at time $t$ of a particle starting at $x \in X$ at time $t = 0$.
These kinds of maps also arise in potential theory, where the dependence of the solution $\phi(a)$ of a Dirichlet problem on a bounded domain $\Omega$, with nice boundary $\partial\Omega$, upon the continuous boundary data $a \in C(\partial\Omega)$ gives rise to a linear unital map $\phi : C(\partial\Omega) \to C(\Omega \cup \partial\Omega)$, whose positivity and unitality express the “maximum principle” for harmonic functions. When $\Omega$ is the unit disk, $k$ is the familiar Poisson kernel.

Continuity and Algebraic Properties of Positive Maps
Since the order structure of a $C^*$-algebra $A$ is defined by its positive cone $A_+$, positive maps are
1. real: $\phi(a^*) = \phi(a)^*$ and
2. order preserving: $\phi(a) \le \phi(b)$ whenever $a \le b$.
From this follows an important interplay between positivity and continuity:
a positive map $\phi : A \to B$ between $C^*$-algebras is continuous
In case $A$ has a unit, this follows by the fact that $\phi$ is order preserving and that, for self-adjoint $a$, one has
where $K = H - \mu N$, and the spectrum of $H$ is assumed to be discrete and such that $e^{-\beta K}$ is trace-class. For infinite systems, $A$ is a quasilocal C*-algebra generated by a net $\{A_\Lambda\}$ of C*-subalgebras describing observables referred to finite-volume regions. Infinite-volume equilibrium states on $A$ can then be obtained as thermodynamic limits of finite-volume Gibbs equilibrium states of the above type.

Normal and Singular States

When observables with continuous spectrum have to be considered and one chooses the algebra $B(h)$ of all bounded operators, the above formula, although still meaningful, does not describe all states on $B(h)$ but only the important subclass of the normal ones. To this class, which can be considered on any von Neumann algebra $M$, belong states which are $\sigma$-weakly continuous functionals. Equivalently, these are the states $\phi$ such that for every increasing net $a_\alpha \in M_+$ with least upper bound $a \in M_+$, $\phi(a)$ is the least upper bound of the net $\phi(a_\alpha)$.

In general, each state on a von Neumann algebra $M$ splits as a sum of a maximal normal piece and a singular one. Singular traces appear in noncommutative geometry as very useful tools to get back local objects from spectral ones via the familiar principle that local properties of functions depend on the asymptotics of their Fourier coefficients. This is best illustrated on a compact, Riemannian $n$-manifold $M$ by the formula
$$\int_M f\,dm = c_n\,\tau_\omega\bigl(M_f|D|^{-n}\bigr)$$
which expresses the Riemannian integral of a nice function $f$ in terms of the Dirac operator $D$ acting on the Hilbert space of square-integrable spinors, the multiplication operator $M_f$ by $f$, and the singular Dixmier tracial state $\tau_\omega$ on $B(H)$. Here the compactness of $M$ implies the compactness of the operator $M_f|D|^{-n}$ and

There is a symbiotic appearance of states and representations on C*-algebras. In fact, given a representation $\pi: A \to B(H)$, one easily constructs states on $A$ from unit vectors $\xi \in H$ by
$$\phi_\xi(a) = (\xi \mid \pi(a)\xi)$$
In fact, one checks that $\phi_\xi(a^*a) = (\xi \mid \pi(a^*a)\xi) = (\pi(a)\xi \mid \pi(a)\xi) = \|\pi(a)\xi\|^2 \ge 0$ and, at least if a unit exists, that $\phi_\xi(1_A) = \|\xi\|^2 = 1$.

A fundamental construction due to Gelfand, Naimark, and Segal allows one to associate a representation to each state in such a way that each state is a vector state for a suitable representation.

"Let $\omega$ be a state over the C*-algebra $A$. It follows that there exists a cyclic representation $(\pi_\omega, H_\omega, \xi_\omega)$ of $A$ such that
$$\omega(a) = (\xi_\omega \mid \pi_\omega(a)\xi_\omega)$$
Moreover, the representation is unique up to unitary equivalence. It is called the canonical cyclic representation of $A$ associated with $\omega$."

The positivity property of the state allows one to introduce the positive-semidefinite scalar product $\langle a \mid b\rangle = \omega(a^*b)$ on the vector space $A$. Moreover, its kernel $I_\omega = \{a \in A : \omega(a^*a) = 0\}$ is a left ideal of $A$: in fact, if $a \in A$ and $b \in I_\omega$ then $\omega((ab)^*(ab)) \le \|a\|^2\omega(b^*b) = 0$. This allows one to define, on the quotient pre-Hilbert space $A/I_\omega$, an action of the elements $a \in A$: $\pi_\omega(a)(b + I_\omega) := ab + I_\omega$. It is the extension of this action to the Hilbert space completion $H_\omega$ of $A/I_\omega$ that gives the representation associated to $\omega$. When $A$ has a unit, the cyclic vector $\xi_\omega$ with the stated properties is precisely the image of $1_A + I_\omega$. By definition, the cyclicity of the representation amounts to checking that $\pi_\omega(A)\xi_\omega$ is dense in $H_\omega$.

Completely Positive Maps
is positive and completely positive (CP map for short) if this happens for all $n$."

Equivalently, $n$-positive means that $\sum_{i,j=1}^{n} b_i^*\,\phi(a_i^*a_j)\,b_j \ge 0$ for all $a_1, \dots, a_n \in A$ and $b_1, \dots, b_n \in B$. In particular, if $\phi$ is $n$-positive then it is $k$-positive for all $k \le n$. Many positive maps we considered are in fact CP maps:
1. morphisms of C*-algebras are CP maps;
2. positive maps $\phi: A \to B$ are automatically CP maps provided $A$, $B$ or both are commutative and states are, in particular, CP maps; and
3. an important class of CP maps is the following. A norm one projection $\varepsilon: A \to B$, from a C*-algebra $A$ onto a C*-subalgebra $B$, is a contraction such that $\varepsilon(b) = b$ for all $b \in B$. It can be proved that these maps satisfy $\varepsilon(bac) = b\,\varepsilon(a)\,c$ for all $a \in A$ and $b, c \in B$ and for this reason they are called conditional expectations. This property then implies that they are CP maps.

However, the identity map from a C*-algebra $A$ into its opposite $A^\circ$ is positive but not 2-positive unless $A$ is commutative, the transposition $a \mapsto a^t$ in $M_n(\mathbb{C})$ is positive and not 2-positive if $n \ge 2$ and, for all $n$, there exist $n$-positive maps which are not $(n+1)$-positive.

CP Maps in Mathematical Physics

In several fields of application, the transition of a state of a system into another state can be described by a completely positive map $\phi: A \to B$ between C*-algebras: for any given state $\omega$ of $B$, $\omega \circ \phi$ is then a state of $A$.
1. In the theory of quantum communication processes (see Channels in Quantum Information Theory; Optimal Cloning of Quantum States; Source Coding in Quantum Information Theory; Capacity for Quantum Information), for example, $B$ and $A$ represent the input and output systems, respectively, $\omega$ the signal to be transmitted, $\omega \circ \phi$ the received signal, and $\phi$ the system of transmission, called the channel.
2. In quantum probability and in the theory of quantum open systems, continuous semigroups of CP maps (see Quantum Dynamical Semigroups) describe dissipative time evolutions of a system due to interaction with an external one (heat bath).
3. In the theory of measurement in quantum mechanics, an observable can be described by a positive-operator-valued (POV) measure $m$ which assigns a positive element $m(E)$ in a C*-algebra $A$ to each Borel subset $E$ of a topological space $X$. For each $f \in C_0(X)$, one can define its integral $\phi(f) := \int_X f\,dm$ as an element of $A$. The map $\phi: C_0(X) \to A$, called the observation channel, is then a CP map.
4. Another field of mathematical physics in which CP maps play a distinguished role is in the construction and application of the quantum dynamical entropy, an extension of the Kolmogorov–Sinai entropy of measure preserving transformations (see Quantum Entropy). When dealing with a noncommutative dynamical system $(M, \alpha, \tau)$ in which $\tau$ is a normal trace state on a finite von Neumann algebra $M$, the Connes–Størmer entropy $h_\tau(\alpha)$ is defined through the consideration of an entropy functional $H_\tau(N_1, \dots, N_k)$ of finite-dimensional von Neumann subalgebras $N_1, \dots, N_k \subset M$. To extend the definition to more general C*-algebras and states on them, one has to face the fact that C*-algebras may have no nontrivial C*-subalgebras. To circumvent the problem A Connes, H Narnhofer, and W Thirring (CNT) introduced an entropy functional $H_\omega(\gamma_1, \dots, \gamma_k)$ associated to a set $\gamma_i: A_i \to A$ of CP maps (finite channels) from finite-dimensional C*-algebras $A_i$ into $A$. This led to the CNT entropy $h_\omega(\alpha)$ of a noncommutative dynamical system $(A, \alpha, \omega)$, where $\omega$ is a state on $A$ and $\alpha$ is an automorphism or a CP map preserving it: $\omega \circ \alpha = \omega$.

CP Maps and Continuity

Since for an element $a \in A$ of a unital C*-algebra, one has $\|a\| \le 1$ precisely when
$$\begin{pmatrix} 1 & a \\ a^* & 1 \end{pmatrix}$$
is positive in $M_2(A)$, it follows that

2-positive unital maps are contractive

Unital 2-positive maps satisfy, in particular, the generalized Schwarz inequality for all $a \in A$,
$$\phi(a^*)\,\phi(a) \le \phi(a^*a)$$
In particular,

"CP maps are completely bounded as $\sup_n \|\phi \otimes 1_n\| = \|\phi(1_A)\|$ and completely contractive if they are unital. Conversely, unital, completely contractive maps are CP maps."

CP Maps and Matrix Algebras

When the domain or the target space of a map is a matrix algebra, one has the following equivalences concerning positivity. Let $[e_{i,j}]_{i,j}$ denote the standard
matrix units in $M_n(\mathbb{C})$ and $\phi: M_n(\mathbb{C}) \to B$ a linear map into a C*-algebra $B$. The following conditions are equivalent:
1. $\phi$ is a CP map,
2. $\phi$ is $n$-positive, and
3. $[\phi(e_{i,j})]_{i,j}$ is positive in $M_n(B)$.

Associating to a linear map $\phi: A \to M_n(\mathbb{C})$ the linear functional $s_\phi: M_n(A) \to \mathbb{C}$ given by $s_\phi([a_{i,j}]) := \sum_{i,j} \phi(a_{i,j})_{i,j}$, one has the following equivalent properties:
1. $\phi$ is a CP map,
2. $\phi$ is $n$-positive,
3. $s_\phi$ is positive, and
4. $s_\phi$ is positive on $A_+ \otimes M_n(\mathbb{C})_+$.

Stinespring Representation of CP Maps

CP maps are relatively easy to handle, thanks to the following dilation result due to W F Stinespring. It describes a CP map as the compression of a morphism of C*-algebras.

Let $A$ be a unital C*-algebra and $\phi: A \to B(H)$ a linear map. Then $\phi$ is a CP map if and only if it has the form
$$\phi(a) = V^*\pi(a)V$$
for some representation $\pi: A \to B(K)$ on a Hilbert space $K$, and some bounded linear map $V: H \to K$. If $A$ is a von Neumann algebra and $\phi$ is normal then $\pi$ can be taken to be normal. When $A = B(H)$ and $H$ is separable, one has, for some $b_n \in B(H)$,
$$\phi(a) = \sum_{n=1}^{\infty} b_n^*\,a\,b_n$$

The proof of this result is reminiscent of the GNS construction for states and its extension, by G Kasparov, to C*-modules is central in bivariant K-homology theory.

Despite the above satisfactory result, one should be aware that positive but not CP maps are much less understood and only for maps on very low dimensional matrix algebras do we have a definitive classification. To have an idea of the intricacies of the matter, one may consult Størmer (1963).

Positive Semigroups on Standard Forms of von Neumann Algebras and Ground State for Physical Hamiltonians

The above result allows one to derive the structure of generators of norm-continuous dynamical semigroups in terms of dissipative operators.

Strongly continuous positive semigroups, which are KMS symmetric with respect to a KMS state $\omega$ of a given automorphism group of a C*-algebra $A$, can be analyzed as positive semigroups in the standard representation $(M, H, P, J)$ (see Tomita–Takesaki Modular Theory) of the von Neumann algebra $M := \pi_\omega(A)''$. A semigroup on $A$ gives rise to a corresponding $w^*$-continuous positive semigroup on $M$ and to a strongly continuous positive semigroup on the ordered Hilbert space $(H, P)$ of the standard form. In the latter framework, one can develop an infinite-dimensional, noncommutative extension of the classical Perron–Frobenius theory for matrices with positive entries. This applies, in particular, to semigroups generated by physical Hamiltonians and has been used to prove existence and uniqueness of the ground state for boson and fermion systems in quantum field theory (one may consult Gross (1972)).

Nuclear C*-Algebras and Injective von Neumann Algebras

The nonabelian character of the product in C*-algebras may prevent the existence of nontrivial morphisms between them, while one may have an abundance of CP maps. For example, there are no nontrivial morphisms from the algebra of compact operators to $\mathbb{C}$, but there exist sufficiently many states to separate its elements. A much more well-behaved category of C*-algebras is obtained by considering CP maps as morphisms. This is true, in particular, for nuclear C*-algebras: those for which any tensor product $A \otimes B$ with any other C*-algebra $B$ admits a unique C*-cross norm (see C*-Algebras and their Classification). The intimate relation between this class of algebras and CP maps is illustrated by the following characterization; the following conditions are equivalent:
1. $A$ is nuclear;
2. the identity map of $A$ is a pointwise limit of CP maps of finite rank;
3. the identity map of $A$ can be approximately factorized, $\lim_\alpha (T_\alpha \circ S_\alpha)a = a$ for all $a \in A$, through matrix algebras and nets of CP maps $S_\alpha: A \to M_{n_\alpha}(\mathbb{C})$, $T_\alpha: M_{n_\alpha}(\mathbb{C}) \to A$.

A second important relation between nuclear C*-algebras and CP maps emerges in connection to the lifting problem.

"Let $A$ be a nuclear C*-algebra and $J$ a closed two-sided ideal in a C*-algebra $B$. Then every CP map $\phi: A \to B/J$ can be lifted to a CP map $\phi': A \to B$. In other words, $\phi$ factors through $B$ by the quotient map $q: B \to B/J$: $\phi = q \circ \phi'$."
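The criterion that a map on $M_n(\mathbb{C})$ is CP exactly when the block matrix of its values on the matrix units is positive, and the sum form $\sum_n b_n^* a\, b_n$ of a CP map, can be checked by hand in the smallest case. The following is a minimal sketch and not part of the original article: it works with real matrices and exact rational arithmetic (so adjoints become transposes), tests positivity through principal minors, and the helper names `choi`, `is_psd`, and `kraus_map` as well as the matrices `b1`, `b2` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def unit(i, j, n):
    """Matrix unit e_ij of M_n with exact rational entries."""
    return [[Fraction(int(r == i and c == j)) for c in range(n)] for r in range(n)]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """The transposition map a -> a^t."""
    return [list(row) for row in zip(*a)]

def choi(phi, n):
    """Flatten the block matrix [phi(e_ij)]_{i,j} into an n^2 x n^2 matrix."""
    blocks = [[phi(unit(i, j, n)) for j in range(n)] for i in range(n)]
    return [[blocks[i][j][r][c] for j in range(n) for c in range(n)]
            for i in range(n) for r in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_psd(m):
    """A real symmetric matrix is positive semidefinite iff every principal minor is >= 0."""
    size = len(m)
    return all(det([[m[r][c] for c in s] for r in s]) >= 0
               for k in range(1, size + 1)
               for s in combinations(range(size), k))

def kraus_map(bs):
    """The map a -> sum_n b_n^t a b_n (real analogue of sum_n b_n^* a b_n)."""
    def phi(a):
        total = [[Fraction(0)] * len(a) for _ in a]
        for b in bs:
            term = matmul(matmul(transpose(b), a), b)
            total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, term)]
        return total
    return phi

# Hypothetical example matrices, chosen only for illustration.
b1 = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(1)]]
b2 = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(0)]]

transpose_is_cp = is_psd(choi(transpose, 2))        # transposition on M_2
kraus_is_cp = is_psd(choi(kraus_map([b1, b2]), 2))  # a map given in sum form
print(transpose_is_cp, kraus_is_cp)  # prints: False True
```

For the transposition the block matrix is the flip on $\mathbb{C}^2 \otimes \mathbb{C}^2$, which has the principal minor $\det\begin{pmatrix}0&1\\1&0\end{pmatrix} = -1$; this agrees with the statement that transposition is positive but not 2-positive for $n \ge 2$. The principal-minor test is exponential in the matrix size and is used here only because the matrices are $4 \times 4$.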
94 Pseudo-Riemannian Nilpotent Lie Groups
This and related results are used to prove that the Brown–Douglas–Fillmore K-homology invariant Ext($A$) is a group for separable, nuclear C*-algebras.

Our last basic result, due to W Arveson, about CP maps concerns the extension problem.

"Let $A$ be a unital C*-algebra and $N$ a self-adjoint closed subspace of $A$ containing the identity. Then every CP map $\phi: N \to B(H)$ from $N$ into a type I factor $B(H)$ can be extended to a CP map $\tilde\phi: A \to B(H)$."

This result can be restated by saying that type I factors are injective von Neumann algebras. It may suggest how the notion of a completely positive map plays a fundamental role in Connes' proof of one culminating result of the theory of von Neumann algebras, namely the fact that the class of injective von Neumann algebras coincides with the class of approximately finite-dimensional ones (see von Neumann Algebras: Introduction, Modular Theory and Classification Theory).

See also: Capacity for Quantum Information; C*-Algebras and Their Classification; Channels in Quantum Information Theory; Noncommutative Geometry and the Standard Model; Noncommutative Geometry from Strings; Optimal Cloning of Quantum States; Path Integrals in Noncommutative Geometry; Quantum Dynamical Semigroups; Quantum Entropy; Source Coding in Quantum Information Theory; Tomita–Takesaki Modular Theory; von Neumann Algebras: Introduction, Modular Theory, and Classification Theory.

Further Reading

Bratteli O and Robinson DW (1987) Operator Algebras and Quantum Statistical Mechanics 1, 2nd edn., 505 pp. Berlin: Springer; New York: Heidelberg.
Bratteli O and Robinson DW (1997) Operator Algebras and Quantum Statistical Mechanics 2, 2nd edn., 518 pp. Berlin: Springer; New York: Heidelberg.
Connes A (1994) Noncommutative Geometry, 661 pp. San Diego, CA: Academic Press.
Davies EB (1976) Quantum Theory of Open Systems, 171 pp. London: Academic Press.
Gross L (1972) Existence and uniqueness of physical ground states. Journal of Functional Analysis 10: 52–109.
Lance EC (1995) Hilbert C*-Modules, 130 pp. London: Cambridge University Press.
Ohya M and Petz D (1993) Quantum Entropy and Its Use, 335 pp. New York: Academic Press.
Paulsen VI (1996) Completely Bounded Maps and Dilations, 187 pp. Harlow: Longman Scientific-Technical.
Pisier G (2003) Introduction to Operator Space Theory, 479 pp. London: Cambridge University Press.
Størmer E (1963) Positive linear maps of operator algebras. Acta Mathematica 110: 233–278.
Takesaki M (2004a) Theory of Operator Algebras I, Second printing of the 1st edn., 415 pp. Berlin: Springer.
Takesaki M (2004b) Theory of Operator Algebras II, 384 pp. Berlin: Springer.
Takesaki M (2004c) Theory of Operator Algebras III, 548 pp. Berlin: Springer.
Others have made use of nilpotent Lie groups with left-invariant (positive or negative) definite metric tensors, such as Hervig's (2004) constructions of black hole spacetimes from solvmanifolds (related to solvable groups: those with Iwasawa decomposition $G = AN$), including the so-called BTZ constructions. Definite groups and their applications, already having received thorough surveys elsewhere, most notably those of Eberlein, are not included here.

Although the geometric properties of Lie groups with left-invariant definite metric tensors have been studied extensively, the same has not occurred for indefinite metric tensors. For example, while the paper of Milnor (1976) has already become a classic reference, in particular for the classification of positive-definite (Riemannian) metrics on three-dimensional Lie groups, a classification of the left-invariant Lorentzian metric tensors on these groups became available only in 1997. Similarly, only a few partial results in the line of Milnor's study of definite metrics were previously known for indefinite metrics. Moreover, in dimension 3, there are only two types of metric tensors: Riemannian (definite) and Lorentzian (indefinite). But in higher dimensions, there are many distinct types of indefinite metrics while there is still essentially only one type of definite metric. This is another reason why this area has special interest now.

The list in "Further reading" at the end of this article consists of general survey articles and a select few of the more historically important papers. Precise bibliographical information for references merely mentioned or alluded to in this article may be found in those. The main, general reference on pseudo-Riemannian geometry is O'Neill's (1983) book. Eberlein's (2004) article covers the Riemannian case. At this time, there is no other comprehensive survey of the pseudo-Riemannian case. One may use Cordero and Parker (1999) and Guediri (2003) and their reference lists to good advantage, however.

Inner Product and Signature

By an inner product on a vector space $V$ we shall mean a nondegenerate, symmetric bilinear form on $V$, generally denoted by $\langle\,,\rangle$. In particular, we do not assume that it is positive definite. It has become customary to refer to an ordered pair of non-negative integers $(p, q)$ as the signature of the inner product, where $p$ denotes the number of positive eigenvalues and $q$ the number of negative eigenvalues. Then nondegeneracy means that $p + q = \dim V$. Note that there is no real geometric difference between $(p, q)$ and $(q, p)$; indeed, O'Neill gives handy conversion procedures for this and for the other major sign variant (e.g., curvature) (see O'Neill (1983, pp. 92 and 89, respectively)).

A Riemannian inner product has signature $(p, 0)$. In view of the preceding remark, one might as well regard signature $(0, q)$ as also being Riemannian, so that "Riemannian geometry is that of definite metric tensors." Similarly, a Lorentzian inner product has either $p = 1$ or $q = 1$. In this case, both sign conventions are used in relativistic theories with the proviso that the "1" axis is always timelike.

If neither $p$ nor $q$ is 1, there is no physical convention. We shall say that $v \in V$ is timelike if $\langle v, v\rangle > 0$, null if $\langle v, v\rangle = 0$, and spacelike if $\langle v, v\rangle < 0$. (In a Lorentzian example, one may wish to revert to one's preferred relativistic convention.) We shall refer to these collectively as the causal type of a vector (or of a curve to which a vector is tangent).

Considering indefinite inner products (and metric tensors) thus greatly expands one's purview, from one type of geometry (Riemannian), or possibly two (Riemannian and Lorentzian), to a total of $\lfloor (p+q)/2 \rfloor + 1$ distinctly different types of geometries on the same underlying differential manifolds.

Rise of 2-Step Groups

Throughout, $N$ will denote a connected (and simply connected, usually), nilpotent Lie group with Lie algebra $\mathfrak{n}$ having center $\mathfrak{z}$. We shall use $\langle\,,\rangle$ to denote either an inner product on $\mathfrak{n}$ or the induced left-invariant pseudo-Riemannian (indefinite) metric tensor on $N$.

For all nilpotent Lie groups, the exponential map $\exp: \mathfrak{n} \to N$ is surjective. Indeed, it is a diffeomorphism for simply connected $N$; in this case, we shall denote the inverse by $\log$.

One of the earliest papers on the Riemannian geometry of nilpotent Lie groups was Wolf (1964). Since then, a few other papers about general nilpotent Lie groups have appeared, including Karidi (1994) and Pauls (2001), but the area has not seen a lot of progress.

However, everything changed with Kaplan's (1981) publication. Following this paper and its successor (Kaplan 1983), almost all subsequent work on the left-invariant geometry of nilpotent groups has been on two-step groups.

Briefly, Kaplan defined a new class of nilpotent Lie groups, calling them of Heisenberg type. This was soon abbreviated to H-type, and has since also been called Heisenberg-like and (unfortunately) "generalized Heisenberg." (Unfortunate, because that term was already in use for another class, not all of which are of H-type.) What made them so
compelling was that (almost) everything was explicitly calculable, thus making them the next great test bed after symmetric spaces.

Definition 1 We say that $N$ (or $\mathfrak{n}$) is 2-step nilpotent when $[\mathfrak{n}, \mathfrak{n}] \subseteq \mathfrak{z}$. Then $[[\mathfrak{n}, \mathfrak{n}], \mathfrak{n}] = 0$ and the generalization to $k$-step nilpotent is clear:
$$[[\cdots[[[\mathfrak{n}, \mathfrak{n}], \mathfrak{n}], \mathfrak{n}]\cdots], \mathfrak{n}] = 0$$
with $k + 1$ copies of $\mathfrak{n}$ (or $k$ nested brackets, if you prefer).

It soon became apparent that H-type groups comprised a subclass of 2-step groups; for a nice, modern proof see Berndt et al. (1995). By around 1990, they had also attracted the attention of the spectral geometry community, and Eberlein produced the seminal survey (with important new results) from which the modern era began. (It was published in 1994 (Eberlein 1994), but the preprint had circulated widely since 1990.) Since then, activity around 2-step nilpotent Lie groups has mushroomed; see the references in Eberlein (2004).

Finally, turning to pseudo-Riemannian nilpotent Lie groups, with perhaps one or two exceptions, all results so far have been obtained only for 2-step groups. Thus, the remaining sections of this article will be devoted almost exclusively to them.

The Baker–Campbell–Hausdorff formula takes on a particularly simple form in these groups:
$$\exp(x)\exp(y) = \exp\bigl(x + y + \tfrac{1}{2}[x, y]\bigr) \qquad [1]$$

Proposition 1 In a pseudo-Riemannian 2-step nilpotent Lie group, the exponential map preserves causal character. Alternatively, one-parameter subgroups are curves of constant causal character.

Of course, one-parameter subgroups need not be geodesics.

Lattices and Completeness

We shall need some basic facts about lattices in $N$. In nilpotent Lie groups, a lattice is a discrete subgroup $\Gamma$ such that the homogeneous space $M = \Gamma\backslash N$ is compact. Here we follow the convention that a lattice acts on the left, so that the coset space consists of left cosets and this is indicated by the notation. Other subgroups will generally act on the right, allowing better separation of the effects of two simultaneous actions.

Lattices do not always exist in nilpotent Lie groups.

Theorem 1 The simply connected, nilpotent Lie group $N$ admits a lattice if and only if there exists a basis of its Lie algebra $\mathfrak{n}$ for which the structure constants are rational.

Such a group is said to have a rational structure, or simply to be rational.

A nilmanifold is a (compact) homogeneous space of the form $\Gamma\backslash N$, where $N$ is a connected, simply connected (rational) nilpotent Lie group and $\Gamma$ is a lattice in $N$. An infranilmanifold has a nilmanifold as a finite covering space. They are commonly regarded as a noncommutative generalization of tori, the Klein bottle being the simplest example of an infranilmanifold that is not a nilmanifold.

We recall the result of Marsden from O'Neill (1983).

Theorem 2 A compact, homogeneous pseudo-Riemannian space is geodesically complete.

Thus, if a rational $N$ is provided with a bi-invariant metric tensor $\langle\,,\rangle$, then $M$ becomes a compact, homogeneous pseudo-Riemannian space which is therefore complete. It follows that $(N, \langle\,,\rangle)$ is itself complete. In general, however, the metric tensor is not bi-invariant and $N$ need not be complete.

For 2-step nilpotent Lie groups, things work nicely as shown by this result first published by Guediri.

Theorem 3 On a 2-step nilpotent Lie group, all left-invariant pseudo-Riemannian metrics are geodesically complete.

No such general result holds for 3- and higher-step groups, however.

2-Step Groups

In the Riemannian (positive-definite) case, one splits $\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v} = \mathfrak{z} \oplus \mathfrak{z}^\perp$, where the superscript denotes the orthogonal complement with respect to the inner product $\langle\,,\rangle$. In the general pseudo-Riemannian case, however, $\mathfrak{z} \oplus \mathfrak{z}^\perp \ne \mathfrak{n}$. The problem is that $\mathfrak{z}$ might be a degenerate subspace; that is, it might contain a null subspace $U$ for which $U \subseteq U^\perp$.

It turns out that this possible degeneracy of the center causes the essential differences between the Riemannian and pseudo-Riemannian cases. So far, the only general success in studying groups with degenerate centers was in Cordero and Parker (1999) where an adapted Witt decomposition of $\mathfrak{n}$ was used together with an involution exchanging the two null parts.

Observe that if $\mathfrak{z}$ is degenerate, the null subspace $U$ is well defined invariantly. We shall use a decomposition
$$\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v} = U \oplus Z \oplus V \oplus E \qquad [2]$$
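The Baker–Campbell–Hausdorff formula [1] can be verified directly in the simplest 2-step example. The sketch below is an illustration added here, not taken from the article: it assumes the three-dimensional Heisenberg algebra realized as strictly upper triangular $3 \times 3$ matrices, uses exact rational arithmetic, and the helper names (`heis`, `exp_nilpotent`, `bracket`, `combine`) are hypothetical.

```python
from fractions import Fraction

def heis(a, b, c):
    """a*X + b*Y + c*Z in the Heisenberg Lie algebra, as a strictly upper triangular matrix."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    zero = Fraction(0)
    return [[zero, a, c], [zero, zero, b], [zero, zero, zero]]

def matmul(x, y):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def combine(x, y, s=1):
    """Entrywise x + s*y."""
    return [[x[i][j] + s * y[i][j] for j in range(3)] for i in range(3)]

def bracket(x, y):
    """Lie bracket [x, y] = xy - yx."""
    return combine(matmul(x, y), matmul(y, x), -1)

def exp_nilpotent(x):
    """exp(x) = I + x + x^2/2, exact because x^3 = 0 for these matrices."""
    identity = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
    return combine(combine(identity, x), matmul(x, x), Fraction(1, 2))

# Two arbitrary algebra elements, chosen only for illustration.
x = heis(1, 2, Fraction(1, 2))
y = heis(-3, Fraction(1, 4), 4)

lhs = matmul(exp_nilpotent(x), exp_nilpotent(y))
rhs = exp_nilpotent(combine(combine(x, y), bracket(x, y), Fraction(1, 2)))

bch_holds = lhs == rhs                                     # formula [1]
two_step = bracket(bracket(x, y), x) == heis(0, 0, 0)      # [[n, n], n] = 0
print(bch_holds, two_step)  # prints: True True
```

Here $[x, y]$ is a multiple of the central element $Z$, which is why all higher terms of the general Baker–Campbell–Hausdorff series vanish and the two sides agree exactly.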
To the compact nilmanifold $\Gamma\backslash N$ we may associate two flat (possibly degenerate) tori.

Definition 5 Let $N$ be a simply connected, two-step nilpotent Lie group with lattice $\Gamma$ and let $\pi: \mathfrak{n} \to \mathfrak{v}$ denote the projection. Define
$$T_{\mathfrak{z}} = \mathfrak{z}/(\log\Gamma \cap \mathfrak{z})$$
$$T_{\mathfrak{v}} = \mathfrak{v}/\pi(\log\Gamma)$$

Observe that $\dim T_{\mathfrak{z}} + \dim T_{\mathfrak{v}} = \dim\mathfrak{z} + \dim\mathfrak{v} = \dim\mathfrak{n}$.

Let $m = \dim\mathfrak{z}$ and $n = \dim\mathfrak{v}$. It is a consequence of a theorem of Palais and Stewart that $\Gamma\backslash N$ is a principal $T^m$-bundle over $T^n$. The model fiber $T^m$ can be given a geometric structure from its closed embedding in $\Gamma\backslash N$; we denote this geometric $m$-torus by $T_F$. Similarly, we wish to provide the base $n$-torus with a geometric structure so that the projection $p_B: \Gamma\backslash N \twoheadrightarrow T_B$ is the appropriate generalization of a pseudo-Riemannian submersion (O'Neill 1983) to (possibly) degenerate spaces. Observe that the splitting $\mathfrak{n} = \mathfrak{z} \oplus \mathfrak{v}$ induces splittings $TN = \mathfrak{z}N \oplus \mathfrak{v}N$ and $T(\Gamma\backslash N) = \mathfrak{z}(\Gamma\backslash N) \oplus \mathfrak{v}(\Gamma\backslash N)$, and that $p_{B*}$ just mods out $\mathfrak{z}(\Gamma\backslash N)$. Examining O'Neill's definition, we see that the key is to construct the geometry of $T_B$ by defining
$$p_{B*}: \mathfrak{v}_\eta(\Gamma\backslash N) \to T_{p_B(\eta)}(T_B) \ \text{for each } \eta \in \Gamma\backslash N \text{ is an isometry} \qquad [3]$$
and
$$\nabla^{T_B}_{p_{B*}x}\, p_{B*}y = p_{B*}(\pi\nabla_x y) \ \text{for all } x, y \in \mathfrak{v} = V \oplus E \qquad [4]$$
where $\pi: \mathfrak{n} \to \mathfrak{v}$ is the projection. Then the rest of the usual results will continue to hold, provided that sectional curvature is replaced by the numerator of the sectional curvature formula at least when elements of $V$ are involved:
$$\langle R_{T_B}(p_{B*}x, p_{B*}y)\,p_{B*}y,\ p_{B*}x\rangle = \langle R_{\Gamma\backslash N}(x, y)y, x\rangle + \tfrac{3}{4}\langle [x, y], [x, y]\rangle \qquad [5]$$
Now $p_B$ will be a pseudo-Riemannian submersion in the usual sense if and only if $U = V = \{0\}$, as is always the case for Riemannian spaces.

In the Riemannian case, Eberlein showed that $T_F \cong T_{\mathfrak{z}}$ and $T_B \cong T_{\mathfrak{v}}$. In general, $T_B$ is flat only if $N$ has a nondegenerate center or is flat.

Remark 3 Observe that the torus $T_B$ may be decomposed into a topological product $T_E \times T_V$ in the obvious way. It is easy to check that $T_E$ is flat and isometric to $(\log\Gamma \cap E)\backslash E$, and that $T_V$ has a linear connection not coming from a metric and not flat in general. Moreover, the geometry of the product is "twisted" in a certain way. It would be interesting to determine which tori could appear as such a $T_V$ and how.

Theorem 7 Let $N$ be a simply connected, 2-step nilpotent Lie group with lattice $\Gamma$, a left-invariant metric tensor, and tori as above. The fibers $T_F$ of the (generalized) pseudo-Riemannian submersion $\Gamma\backslash N \twoheadrightarrow T_B$ are isometric to $T_{\mathfrak{z}}$. If in addition the center $Z$ of $N$ is nondegenerate, then the base $T_B$ is isometric to $T_{\mathfrak{v}}$.

We recall that elements of $N$ can be identified with elements of the isometry group $I(N)$: namely, $n \in N$ is identified with the isometry $\phi = L_n$ of left translation by $n$. We shall abbreviate this by writing $\phi \in N$.

Definition 6 We say that $\phi \in N$ translates the geodesic $\gamma$ by $\omega$ if and only if $\phi\gamma(t) = \gamma(t + \omega)$ for all $t$. If $\gamma$ is a unit-speed geodesic, we say that $\omega$ is a period of $\phi$.

Recall that unit speed means that $|\dot\gamma| = |\langle\dot\gamma, \dot\gamma\rangle|^{1/2} = 1$. Since there is no natural normalization for null geodesics, we do not define periods for them. In the Riemannian case and in the timelike Lorentzian case in strongly causal spacetimes, unit-speed geodesics are parametrized by arclength and this period is a translation distance. If $\phi$ belongs to a lattice $\Gamma$, it is the length of a closed geodesic in $\Gamma\backslash N$.

In general, recall that if $\gamma$ is a geodesic in $N$ and if $p_N: N \twoheadrightarrow \Gamma\backslash N$ denotes the natural projection, then $p_N\gamma$ is a periodic geodesic in $\Gamma\backslash N$ if and only if some $\phi \in \Gamma$ translates $\gamma$. We say periodic rather than closed here because in pseudo-Riemannian spaces it is possible for a null geodesic to be closed but not periodic. If the space is geodesically complete or Riemannian, however, then this does not occur; the former is in fact the case for our 2-step nilpotent Lie groups. Further, recall that free homotopy classes of closed curves in $\Gamma\backslash N$ correspond bijectively with conjugacy classes in $\Gamma$.

Definition 7 Let $C$ denote either a nontrivial, free homotopy class of closed curves in $\Gamma\backslash N$ or the corresponding conjugacy class in $\Gamma$. We define $\wp(C)$ to be the set of all periods of periodic unit-speed geodesics that belong to $C$.

In the Riemannian case, this is the set of lengths of closed geodesics in $C$, frequently denoted by $\ell(C)$.

Definition 8 The period spectrum of $\Gamma\backslash N$ is the set
$$\mathrm{spec}_\wp(\Gamma\backslash N) = \bigcup_C \wp(C)$$
where the union is taken over all nontrivial, free homotopy classes of closed curves in $\Gamma\backslash N$.

In the Riemannian case, this is the length spectrum $\mathrm{spec}_\ell(\Gamma\backslash N)$.

Example 4 Similar to the Riemannian case, we can compute the period spectrum of a flat torus $\Gamma\backslash\mathbb{R}^m$, where $\Gamma$ is a lattice (of maximal rank, isomorphic to $\mathbb{Z}^m$). Using calculations analogous to those for finding the length spectrum of a Riemannian flat torus, we easily obtain
$$\mathrm{spec}_\wp(\Gamma\backslash\mathbb{R}^m) = \{\,|g| \ne 0 \mid g \in \Gamma\,\}$$
It is also easy to see that the nonzero d'Alembertian spectrum is related to the analogous set produced from the dual lattice $\Gamma^*$, multiplied by factors of $4\pi^2$, almost as in the Riemannian case.

As in this example, simple determinacy of periods of unit-speed geodesics helps make calculation of the period spectrum possible purely in terms of $\log\Gamma \subseteq \mathfrak{n}$.

For the rest of this subsection, we assume that $N$ is a simply connected, two-step nilpotent Lie group with left-invariant pseudo-Riemannian metric tensor $\langle\,,\rangle$. Note that non-null geodesics may be taken to be of unit speed. Most non-identity elements of $N$ translate some geodesic, but not necessarily one of unit speed.

For our special class of flat 2-step nilmanifolds, we can calculate the period spectrum completely.

Theorem 8 If $[\mathfrak{n}, \mathfrak{n}] \subseteq U$ and $E = \{0\}$, then $\mathrm{spec}_\wp(M)$ can be completely calculated from $\log\Gamma$ for any $M = \Gamma\backslash N$.

Thus, we see again just how much these flat, two-step nilmanifolds are like tori. All periods can be calculated purely from $\log\Gamma \subseteq \mathfrak{n}$, although some will not show up from the tori in the fibration.

Corollary 4 $\mathrm{spec}_\wp(T_B)$ (respectively, $T_F$) is $\bigcup_C \wp(C)$ where the union is taken over all those free homotopy classes $C$ of closed curves in $M = \Gamma\backslash N$ that do not (respectively, do) contain an element in the center of $\Gamma \cong \pi_1(M)$, except for those periods arising only from unit-speed geodesics in $M$ that project to null geodesics in both $T_B$ and $T_F$.

We note that one might consider using this to assign periods to some null geodesics in the tori $T_B$ and $T_F$.

When the center is nondegenerate, we obtain results similar to Eberlein's. Here is part of them.

Theorem 9 Assume $U = \{0\}$. Let $\phi \in N$ and write $\log\phi = z^* + e^*$. Assume $\phi$ translates the unit-speed geodesic $\gamma$ by $\omega > 0$. Let $z'$ denote the component of $z^*$ orthogonal to $[e^*, \mathfrak{n}]$ and set $\omega^* = |z' + e^*|$. Let $\dot\gamma(0) = z_0 + e_0$. Then
(i) $|e^*| \le \omega$. In addition, $\omega < \omega^*$ for timelike (spacelike) geodesics with $\omega z_0 - z'$ timelike (spacelike), and $\omega > \omega^*$ for timelike (spacelike) geodesics with $\omega z_0 - z'$ spacelike (timelike);
(ii) $\omega = |e^*|$ if and only if $\gamma(t) = \exp(te^*/|e^*|)$ for all $t \in \mathbb{R}$; and
(iii) $\omega = \omega^*$ if and only if $\omega z_0 - z'$ is null.

Although $\omega^*$ need not be an upper bound for periods as in the Riemannian case, it nonetheless plays a special role among all periods, as seen in (iii) above, and we shall refer to it as the distinguished period associated with $\phi \in N$. When the center is definite, for example, we do have $\omega \le \omega^*$.

Now the following definitions make sense at least for $N$ with a nondegenerate center.

Definition 9 Let $C$ denote either a nontrivial, free homotopy class of closed curves in $\Gamma\backslash N$ or the corresponding conjugacy class in $\Gamma$. We define $\wp^*(C)$ to be the distinguished periods of periodic unit-speed geodesics that belong to $C$.

Definition 10 The distinguished period spectrum of $\Gamma\backslash N$ is the set
$$\mathrm{Dspec}_\wp(\Gamma\backslash N) = \bigcup_C \wp^*(C)$$
where the union is taken over all nontrivial, free homotopy classes of closed curves in $\Gamma\backslash N$.

Then we get this result:

Corollary 5 Assume the center is nondegenerate. If $\mathfrak{n}$ is nonsingular, then $\mathrm{spec}_\wp(T_B)$ (respectively, $T_F$) is precisely the period spectrum (respectively, the distinguished period spectrum) of those free homotopy classes $C$ of closed curves in $M = \Gamma\backslash N$ that do not (respectively, do) contain an element in the center of $\Gamma \cong \pi_1(M)$, except for those periods arising only from unit-speed geodesics in $M$ that project to null geodesics in both $T_B$ and $T_F$.

Conjugate Loci

This is the only general result on conjugate points.

Proposition 8 Let $N$ be a simply connected, 2-step nilpotent Lie group with left-invariant metric tensor $\langle\,,\rangle$, and let $\gamma$ be a geodesic with $\dot\gamma(0) = a \in \mathfrak{z}$. If $\mathrm{ad}^\dagger_a = 0$, then there are no conjugate points along $\gamma$.

In the rest of this subsection, we assume that the center of $N$ is nondegenerate.

For convenience, we shall use the notation $J_z = \mathrm{ad}^\dagger_z$ for any $z \in \mathfrak{z}$. (Since the center is
2 12
t2 Z t02 ¼
jz0 j hx0 ; x0 i
and multcp (t0 ) = dim z 1.
in which case multcp (t) = dim v.
Theorem 11 Let $\gamma$ be such a geodesic in a pseudoH-type group $N$ with $z_0 \neq 0 \neq x_0$.
(i) If $\langle z_0, z_0\rangle = \alpha^2$ with $\alpha > 0$, then $\gamma(t_0)$ is conjugate to $\gamma(0)$ along $\gamma$ if and only if
$$t_0 \in \frac{2\pi}{\alpha}\,\mathbb{Z}^* \cup A_1 \cup A_2$$

This covers all cases for a pseudoH-type group with a center of any dimension.

Some results on other two-step groups and examples (including pictures in dimension 3) may be found in the references cited in Jang et al. (2005). When the groups are not pseudoH-type, however, complete results are available only when the center is one dimensional. Guediri (2004) has results in the timelike Lorentzian case.
Nomizu K (1979) Left-invariant Lorentz metrics on Lie groups. Osaka Mathematical Journal 16: 143–150.
O'Neill B (1983) Semi-Riemannian Geometry. New York: Academic Press.
Pauls SD (2001) The large scale geometry of nilpotent Lie groups. Communications in Analysis and Geometry 9: 951–982.
Wilson EN (1982) Isometry groups on homogeneous manifolds. Geometriae Dedicata 12: 337–346.
Wolf JA (1964) Curvature in nilpotent Lie groups. Proceedings of the American Mathematical Society 15: 271–274.
Q
q-Special Functions
T H Koornwinder, University of Amsterdam, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

If $|q| < 1$ this definition remains meaningful for $k = \infty$ as a convergent infinite product:
$$(a; q)_\infty := \prod_{j=0}^{\infty} (1 - a q^j) \qquad [2]$$
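As a small illustration of the infinite product [2], the following sketch evaluates the q-shifted factorial numerically. The function name `qpochhammer` and the truncation level `terms` are assumptions made for this example only; truncating the product is reasonable because the factors tend to 1 geometrically when $|q| < 1$.

```python
def qpochhammer(a, q, k=None, terms=200):
    """Return (a; q)_k for an integer k >= 0.

    With k=None, return a truncated approximation of (a; q)_infinity.
    The truncation level `terms` is an assumption of this sketch.
    """
    if k is None:
        if abs(q) >= 1:
            raise ValueError("the infinite product needs |q| < 1")
        k = terms
    result = 1.0
    power = 1.0  # holds q**j for the current factor
    for _ in range(k):
        result *= 1.0 - a * power
        power *= q
    return result


if __name__ == "__main__":
    # Finite case: (1/2; 1/2)_2 = (1 - 1/2)(1 - 1/4)
    print(qpochhammer(0.5, 0.5, 2))
    # Infinite case, truncated after 200 factors
    print(qpochhammer(0.5, 0.5))
```

A quick consistency check is the splitting property $(a; q)_\infty = (a; q)_k\,(aq^k; q)_\infty$, which follows directly from the definition.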
and if $|q| < 1$, then the radius of convergence of the power series [4] equals $\infty$ if $r < s + 1$, $1$ if $r = s + 1$, and $0$ if $r > s + 1$.

We can view the q-shifted factorial as a q-analog of the shifted factorial (or Pochhammer symbol) by the limit formula

q-number, q-factorial, and q-Pochhammer symbol:
$$[a]_q := \frac{q^{a/2} - q^{-a/2}}{q^{1/2} - q^{-1/2}}, \qquad [k]_q! := \prod_{j=1}^{k} [j]_q, \qquad ([a]_q)_k := \prod_{j=0}^{k-1} [a+j]_q$$
$$J^{(3)}_\nu(x; q) := \frac{(q^{\nu+1}; q)_\infty}{(q; q)_\infty} \left(\frac{x}{2}\right)^{\nu} {}_1\phi_1\!\left(0; q^{\nu+1}; q, \tfrac{1}{4} q x^2\right) \quad (x > 0) \qquad [22]$$
See [90] for the orthogonality relation for $J^{(3)}_\nu(x; q)$.

If $\exp_q(z)$ denotes one of the three q-exponentials [17]–[19], then $(1/2)(\exp_q(ix) + \exp_q(-ix))$ is a q-analog of the cosine and $-(1/2)i(\exp_q(ix) - \exp_q(-ix))$ is a q-analog of the sine. The three q-cosines are essentially the case $\nu = -1/2$ of the corresponding q-Bessel functions [20]–[22], and the three q-sines are essentially the case $\nu = 1/2$ of $x$ times the corresponding q-Bessel functions.

q-Derivative and q-Integral

The q-derivative of a function $f$ given on a subset of $\mathbb{R}$ or $\mathbb{C}$ is defined by
$$(D_q f)(x) := \frac{f(x) - f(qx)}{(1 - q)x} \quad (x \neq 0,\ q \neq 1) \qquad [23]$$
where $x$ and $qx$ should be in the domain of $f$. By continuity, we set $(D_q f)(0) := f'(0)$, provided $f'(0)$ exists. If $f$ is differentiable on an open interval $I$, then

The q-Gamma and q-Beta Functions

The q-gamma function is defined by
$$\Gamma_q(z) := \frac{(q; q)_\infty (1 - q)^{1-z}}{(q^z; q)_\infty} \quad (z \neq 0, -1, -2, \ldots) \qquad [28]$$
$$= \int_0^{(1-q)^{-1}} t^{z-1} E_q(-(1 - q)qt)\, d_q t \quad (\Re z > 0) \qquad [29]$$
Then
$$\Gamma_q(z + 1) = \frac{1 - q^z}{1 - q}\, \Gamma_q(z) \qquad [30]$$
$$\Gamma_q(n + 1) = \frac{(q; q)_n}{(1 - q)^n} \qquad [31]$$
$$\lim_{q \uparrow 1} \Gamma_q(z) = \Gamma(z) \qquad [32]$$
The q-beta function is defined by
$$B_q(a, b) := \frac{\Gamma_q(a)\Gamma_q(b)}{\Gamma_q(a + b)} = \frac{(1 - q)\,(q, q^{a+b}; q)_\infty}{(q^a, q^b; q)_\infty} \quad (a, b \neq 0, -1, -2, \ldots) \qquad [33]$$
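The definitions [23] and [28] can be tried out numerically. The sketch below is only illustrative: the helper names `q_derivative` and `q_gamma` and the stopping tolerance `tol` are assumptions of this example, and `q_gamma` is restricted to real $z > 0$ and $0 < q < 1$. The product in [28] is evaluated factor by factor as $(1-q)^{1-z}\prod_{j\ge 0}(1-q^{j+1})/(1-q^{z+j})$, which avoids underflow of $(q; q)_\infty$ when $q$ is close to 1.

```python
import math


def q_derivative(f, x, q):
    """(D_q f)(x) as in [23]; requires x != 0 and q != 1."""
    return (f(x) - f(q * x)) / ((1.0 - q) * x)


def q_gamma(z, q, tol=1e-16):
    """Gamma_q(z) from [28] for real z > 0 and 0 < q < 1.

    The infinite products are combined termwise and truncated once
    q**j drops below `tol` (an assumption of this sketch).
    """
    if not 0.0 < q < 1.0:
        raise ValueError("this sketch assumes 0 < q < 1")
    qz = q ** z
    value = (1.0 - q) ** (1.0 - z)
    power = 1.0  # holds q**j
    while power > tol:
        value *= (1.0 - q * power) / (1.0 - qz * power)
        power *= q
    return value


if __name__ == "__main__":
    q = 0.5
    # D_q x^3 = (1 + q + q^2) x^2
    print(q_derivative(lambda t: t ** 3, 2.0, q), (1 + q + q * q) * 4.0)
    # [31] with n = 2: Gamma_q(3) = 1 + q
    print(q_gamma(3.0, q), 1 + q)
    # [32]: Gamma_q(z) approaches Gamma(z) as q increases to 1
    print(q_gamma(2.5, 0.999), math.gamma(2.5))
```

The recurrence [30] gives another check: `q_gamma(z + 1, q)` should agree with `(1 - q**z) / (1 - q) * q_gamma(z, q)` up to rounding.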
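By substitution of [25], formula [35] becomes a transformation formula:
$${}_2\phi_1(a, b; c; q, z) = \frac{(az; q)_\infty (b; q)_\infty}{(z; q)_\infty (c; q)_\infty}\; {}_2\phi_1(c/b, z; az; q, b) \qquad [36]$$

Transformation formulas in the terminating case
$${}_2\phi_1\!\left[\begin{matrix} q^{-n}, b \\ c \end{matrix}; q, z\right] = \frac{(c/b; q)_n}{(c; q)_n}\; {}_3\phi_2\!\left[\begin{matrix} q^{-n}, b, q^{-n}bc^{-1}z \\ q^{1-n}bc^{-1}, 0 \end{matrix}; q, q\right] \qquad [42]$$
$$= (q^{-n}bc^{-1}z; q)_n\; {}_3\phi_2\!\left[\begin{matrix} q^{-n}, cb^{-1}, 0 \\ c, qcb^{-1}z^{-1} \end{matrix}; q, q\right] \qquad [43]$$
$$= \frac{(c/b; q)_n}{(c; q)_n}\, b^n\; {}_3\phi_1\!\left[\begin{matrix} q^{-n}, b, qz^{-1} \\ q^{1-n}bc^{-1} \end{matrix}; q, \frac{z}{c}\right] \qquad [44]$$

Second order q-difference equation
$$z(q^c - q^{a+b+1}z)(D_q^2 u)(z) + \left(\frac{1 - q^c}{1 - q} - \left(q^b\,\frac{1 - q^a}{1 - q} + q^a\,\frac{1 - q^{b+1}}{1 - q}\right) z\right)(D_q u)(z) - \frac{1 - q^a}{1 - q}\,\frac{1 - q^b}{1 - q}\, u(z) = 0 \qquad [45]$$

Some special solutions of [45] are:
$$u_1(z) := {}_2\phi_1(q^a, q^b; q^c; q, z) \qquad [46]$$
$$u_2(z) := z^{1-c}\, {}_2\phi_1(q^{1+a-c}, q^{1+b-c}; q^{2-c}; q, z) \qquad [47]$$

$${}_rW_{r-1}(a_1; a_4, a_5, \ldots, a_r; q, z) := {}_r\phi_{r-1}\!\left[\begin{matrix} a_1, q a_1^{1/2}, -q a_1^{1/2}, a_4, \ldots, a_r \\ a_1^{1/2}, -a_1^{1/2}, q a_1/a_4, \ldots, q a_1/a_r \end{matrix}; q, z\right] \qquad [50]$$

Below only a few of the most important identities are given. See Gasper and Rahman (2004) for many more. An important tool for obtaining complicated identities from more simple ones is Bailey's Lemma, which can moreover be iterated (Bailey chain), see Andrews (1986, ch.3).

The q-Saalschütz sum for a terminating balanced ${}_3\phi_2$
$${}_3\phi_2\!\left[\begin{matrix} a, b, q^{-n} \\ c, q^{1-n}abc^{-1} \end{matrix}; q, q\right] = \frac{(c/a, c/b; q)_n}{(c, c/(ab); q)_n} \qquad [51]$$

Jackson's sum for a terminating balanced ${}_8W_7$
$${}_8W_7(a; b, c, d, q^{n+1}a^2/(bcd), q^{-n}; q, q) = \frac{(qa, qa/(bc), qa/(bd), qa/(cd); q)_n}{(qa/b, qa/c, qa/d, qa/(bcd); q)_n} \qquad [52]$$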
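Summation formulas of this kind are easy to spot-check numerically. The following sketch compares both sides of the q-Saalschütz sum [51] for sample parameters. The helper names `qpoch`, `phi_3_2`, and `saalschutz_sides` are assumptions of this example, and the routine handles only the ${}_3\phi_2$ case, where no extra sign or power factor enters the series.

```python
def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n for an integer n >= 0."""
    prod = 1.0
    for j in range(n):
        prod *= 1.0 - a * q ** j
    return prod


def phi_3_2(a1, a2, a3, b1, b2, q, z, terms):
    """Partial sum (k = 0 .. terms-1) of the 3phi2 series."""
    total = 0.0
    for k in range(terms):
        num = qpoch(a1, q, k) * qpoch(a2, q, k) * qpoch(a3, q, k)
        den = qpoch(b1, q, k) * qpoch(b2, q, k) * qpoch(q, q, k)
        total += num / den * z ** k
    return total


def saalschutz_sides(a, b, c, q, n):
    """Return (left side, right side) of the q-Saalschuetz sum [51]."""
    # The upper parameter q**(-n) makes the series terminate after k = n.
    lhs = phi_3_2(a, b, q ** (-n), c, q ** (1 - n) * a * b / c, q, q, n + 1)
    rhs = (qpoch(c / a, q, n) * qpoch(c / b, q, n)
           / (qpoch(c, q, n) * qpoch(c / (a * b), q, n)))
    return lhs, rhs


if __name__ == "__main__":
    # Sample parameters chosen arbitrarily, avoiding zero denominators
    print(saalschutz_sides(0.3, 0.7, 0.4, 0.5, 3))
```

The same pattern (truncate a terminating series at $k = n$, compare with the closed product) can be reused for other terminating sums, provided the parameters avoid poles of the summand.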
Let $x_1x_2x_3x_4x_5x_6 = q^{1-n}$. Then the following expression is symmetric in $x_1, x_2, x_3, x_4, x_5, x_6$:
$$q^{(1/2)n(n-1)}\,\frac{(x_1x_2x_3x_4, x_1x_2x_3x_5, x_1x_2x_3x_6; q)_n}{(x_1x_2x_3)^n}\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, x_2x_3, x_1x_3, x_1x_2 \\ x_1x_2x_3x_4, x_1x_2x_3x_5, x_1x_2x_3x_6 \end{matrix}; q, q\right] \qquad [55]$$
Similar formulations involving symmetry groups can be given for other transformations, see Van der Jeugt and Srinivasa Rao (1999).

$$\cdots = \frac{\cdots}{(qa/e, qa/f, (qa)^2/(bcdef), (qa)^2/(bcd); q)_n}\; {}_{10}W_9\!\left(\frac{qa^2}{bcd}; \frac{qa}{cd}, \frac{qa}{bd}, \frac{qa}{bc}, e, f, \frac{q^{n+2}a^3}{bcdef}, q^{-n}; q, q\right) \qquad [56]$$

Rogers–Ramanujan Identities
$${}_0\phi_1(-; 0; q, q) = \sum_{k=0}^{\infty} \frac{q^{k^2}}{(q; q)_k} = \frac{1}{(q, q^4; q^5)_\infty} \qquad [57]$$
$${}_0\phi_1(-; 0; q, q^2) = \sum_{k=0}^{\infty} \frac{q^{k(k+1)}}{(q; q)_k} = \frac{1}{(q^2, q^3; q^5)_\infty} \qquad [58]$$

$$= \frac{(q, c/b, bz, q/(bz); q)_\infty}{(c, q/b, z, c/(bz); q)_\infty} \quad (|c/b| < |z| < 1) \qquad [61]$$
This has as a limit case
$${}_0\psi_1(-; c; q, z) = \frac{(q, z, q/z; q)_\infty}{(c, c/z; q)_\infty} \quad (|z| > |c|) \qquad [62]$$
and as a further specialization the Jacobi triple product identity
$$\sum_{k=-\infty}^{\infty} (-1)^k q^{(1/2)k(k-1)} z^k = (q, z, q/z; q)_\infty$$

$$\theta_4(x; q) := \sum_{k=-\infty}^{\infty} (-1)^k q^{k^2} e^{2\pi i k x} = \prod_{k=1}^{\infty} (1 - q^{2k})\left(1 - 2q^{2k-1}\cos(2\pi x) + q^{4k-2}\right) \qquad [64]$$

q-Hypergeometric Orthogonal Polynomials

Here we discuss families of orthogonal polynomials $\{p_n(x)\}$ which are expressible as terminating q-hypergeometric series ($0 < q < 1$) and for which either (1) $P_n(x) := p_n(x)$ or (2) $P_n(x) := p_n$
$((1/2)(x + x^{-1}))$ are eigenfunctions of a second-order q-difference operator, that is,
$$A(x)\,P_n(qx) + B(x)\,P_n(x) + C(x)\,P_n(q^{-1}x) = \lambda_n P_n(x) \qquad [65]$$
where $A(x)$, $B(x)$, and $C(x)$ are independent of $n$, and where the $\lambda_n$ are the eigenvalues. The generic cases are the four-parameter classes of "Askey–Wilson polynomials" (continuous weight function) and q-Racah polynomials (discrete weights on finitely many points). They are of type (2) (quadratic q-lattice). All other cases can be obtained from the generic cases by specialization or limit transition. In particular, one thus obtains the generic three-parameter classes of type (1) (linear q-lattice). These are the big q-Jacobi polynomials (orthogonality by q-integral) and the q-Hahn polynomials (discrete weights on finitely many points).

Askey–Wilson Polynomials

Definition as q-hypergeometric series
$$p_n(\cos\theta) = p_n(\cos\theta; a, b, c, d \mid q) := \frac{(ab, ac, ad; q)_n}{a^n}\; {}_4\phi_3\!\left[\begin{matrix} q^{-n}, q^{n-1}abcd, ae^{i\theta}, ae^{-i\theta} \\ ab, ac, ad \end{matrix}; q, q\right] \qquad [66]$$
This is symmetric in $a, b, c, d$.

Orthogonality relation Assume that $a, b, c, d$ are four reals, or two reals and one pair of complex conjugates, or two pairs of complex conjugates. Also assume that $|ab|, |ac|, |ad|, |bc|, |bd|, |cd| < 1$. Then
$$\int_{-1}^{1} p_n(x)\,p_m(x)\,w(x)\,dx + \sum_k p_n(x_k)\,p_m(x_k)\,\omega_k = h_n\,\delta_{n,m} \qquad [67]$$
where
$$2\pi \sin\theta\; w(\cos\theta) = \left|\frac{(e^{2i\theta}; q)_\infty}{(ae^{i\theta}, be^{i\theta}, ce^{i\theta}, de^{i\theta}; q)_\infty}\right|^2 \qquad [68]$$
$$h_0 = \frac{(abcd; q)_\infty}{(q, ab, ac, ad, bc, bd, cd; q)_\infty}, \qquad \frac{h_n}{h_0} = \frac{1 - abcdq^{n-1}}{1 - abcdq^{2n-1}}\,\frac{(q, ab, ac, ad, bc, bd, cd; q)_n}{(abcd; q)_n} \qquad [69]$$
and the $x_k$ are the points $(1/2)(eq^k + e^{-1}q^{-k})$ with $e$ any of the $a, b, c, d$ of absolute value $> 1$; the sum is over the $k \in \mathbb{Z}_{\geq 0}$ with $|eq^k| > 1$. The $\omega_k$ are certain weights which can be given explicitly. The sum in [67] does not occur if moreover $|a|, |b|, |c|, |d| < 1$.

A more uniform way of writing the orthogonality relation [67] is by the contour integral
$$\frac{1}{2\pi i} \oint_C p_n\!\left(\tfrac12(z + z^{-1})\right) p_m\!\left(\tfrac12(z + z^{-1})\right) \frac{(z^2, z^{-2}; q)_\infty}{(az, az^{-1}, bz, bz^{-1}, cz, cz^{-1}, dz, dz^{-1}; q)_\infty}\,\frac{dz}{z} = 2h_n\,\delta_{n,m} \qquad [70]$$
where $C$ is the unit circle traversed in positive direction with suitable deformations to separate the sequences of poles converging to zero from the sequences of poles diverging to $\infty$.

The case $n = m = 0$ of [70] or [67] is known as the Askey–Wilson integral.

q-Difference equation
$$A(z)P_n(qz) - \left(A(z) + A(z^{-1})\right)P_n(z) + A(z^{-1})P_n(q^{-1}z) = (q^{-n} - 1)(1 - q^{n-1}abcd)\,P_n(z) \qquad [71]$$
where $P_n(z) = p_n(\tfrac12(z + z^{-1}))$ and $A(z) = (1 - az)(1 - bz)(1 - cz)(1 - dz)/((1 - z^2)(1 - qz^2))$.

Special cases These include the continuous q-Jacobi polynomials (two parameters), the continuous q-ultraspherical polynomials (symmetric one-parameter case of continuous q-Jacobi), the Al-Salam–Chihara polynomials (Askey–Wilson with $c = d = 0$), and the continuous q-Hermite polynomials (Askey–Wilson with $a = b = c = d = 0$).

Continuous q-Ultraspherical Polynomials

Definitions as finite Fourier series and as special Askey–Wilson polynomial
$$C_n(\cos\theta; \beta \mid q) := \sum_{k=0}^{n} \frac{(\beta; q)_k\,(\beta; q)_{n-k}}{(q; q)_k\,(q; q)_{n-k}}\, e^{i(n-2k)\theta} \qquad [72]$$
$$= \frac{(\beta; q)_n}{(q; q)_n}\; p_n(\cos\theta; \beta^{1/2}, q^{1/2}\beta^{1/2}, -\beta^{1/2}, -q^{1/2}\beta^{1/2} \mid q) \qquad [73]$$
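The finite Fourier series [72] can be evaluated directly. The sketch below does so and compares two special values that follow from the definition: for $\beta = q$ every coefficient in [72] equals 1, so the sum collapses to $\sin((n+1)\theta)/\sin\theta$, and for $n = 1$ the sum is $2(1-\beta)/(1-q)\cos\theta$. The helper names `qpoch` and `q_ultraspherical` are assumptions of this example.

```python
import cmath
import math


def qpoch(a, q, n):
    """Finite q-shifted factorial (a; q)_n for an integer n >= 0."""
    prod = 1.0
    for j in range(n):
        prod *= 1.0 - a * q ** j
    return prod


def q_ultraspherical(n, theta, beta, q):
    """C_n(cos(theta); beta | q) from the finite Fourier series [72]."""
    total = 0j
    for k in range(n + 1):
        coeff = (qpoch(beta, q, k) * qpoch(beta, q, n - k)
                 / (qpoch(q, q, k) * qpoch(q, q, n - k)))
        total += coeff * cmath.exp(1j * (n - 2 * k) * theta)
    # The coefficients are symmetric under k -> n - k, so the sum is real.
    return total.real


if __name__ == "__main__":
    theta, q = 0.7, 0.3
    # beta = q: Chebyshev polynomial of the second kind
    print(q_ultraspherical(4, theta, q, q), math.sin(5 * theta) / math.sin(theta))
```

Taking the real part is a design shortcut justified by the symmetry of the coefficients; an implementation summing cosines directly would avoid complex arithmetic altogether.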
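$$\frac{1}{2\pi}\int_0^{\pi} C_n(\cos\theta; \beta \mid q)\,C_m(\cos\theta; \beta \mid q)\left|\frac{(e^{2i\theta}; q)_\infty}{(\beta e^{2i\theta}; q)_\infty}\right|^2 d\theta = \frac{(\beta, q\beta; q)_\infty}{(\beta^2, q; q)_\infty}\,\frac{1 - \beta}{1 - \beta q^n}\,\frac{(\beta^2; q)_n}{(q; q)_n}\,\delta_{n,m} \qquad [74]$$

q-Difference equation
$$A(z)P_n(qz) - \left(A(z) + A(z^{-1})\right)P_n(z) + A(z^{-1})P_n(q^{-1}z) = (q^{-n} - 1)(1 - q^n\beta^2)\,P_n(z) \qquad [75]$$
where $P_n(z) = C_n(\tfrac12(z + z^{-1}); \beta \mid q)$ and $A(z) = (1 - \beta z^2)(1 - q\beta z^2)/((1 - z^2)(1 - qz^2))$.

Generating function
$$\frac{(\beta e^{i\theta}z, \beta e^{-i\theta}z; q)_\infty}{(e^{i\theta}z, e^{-i\theta}z; q)_\infty} = \sum_{n=0}^{\infty} C_n(\cos\theta; \beta \mid q)\,z^n \quad (|z| < 1,\ 0 \leq \theta \leq \pi,\ -1 < \beta < 1) \qquad [76]$$

Special case: the continuous q-Hermite polynomials
$$H_n(x \mid q) = (q; q)_n\, C_n(x; 0 \mid q) \qquad [77]$$

Special cases: the Chebyshev polynomials
$$C_n(\cos\theta; q \mid q) = U_n(\cos\theta) := \frac{\sin((n + 1)\theta)}{\sin\theta} \qquad [78]$$
$$\lim_{\beta \uparrow 1} \frac{(q; q)_n}{(\beta; q)_n}\, C_n(\cos\theta; \beta \mid q) = T_n(\cos\theta)$$

$$P_n(x) = P_n(x; a, b, c; q) := {}_3\phi_2\!\left[\begin{matrix} q^{-n}, q^{n+1}ab, x \\ qa, qc \end{matrix}; q, q\right] \qquad [82]$$

Orthogonality relation
$$\int_{qc}^{qa} P_n(x)\,P_m(x)\,\frac{(a^{-1}x, c^{-1}x; q)_\infty}{(x, bc^{-1}x; q)_\infty}\,d_q x = h_n\,\delta_{n,m}, \quad (0 < a < q^{-1},\ 0 < b < q^{-1},\ c < 0) \qquad [83]$$
where $h_n$ can be explicitly given.

q-Difference equation
$$A(x)P_n(qx) - (A(x) + C(x))P_n(x) + C(x)P_n(q^{-1}x) = (q^{-n} - 1)(1 - abq^{n+1})\,P_n(x) \qquad [84]$$
where $A(x) = aq(x - 1)(bx - c)/x^2$ and $C(x) = (x - qa)(x - qc)/x^2$.

Limit case: Jacobi polynomials $P_n^{(\alpha,\beta)}(x)$
$$\lim_{q \uparrow 1} P_n(x; q^\alpha, q^\beta, -q^{-1}d; q) = \frac{n!}{(\alpha + 1)_n}\, P_n^{(\alpha,\beta)}\!\left(\frac{2x + d - 1}{d + 1}\right) \qquad [85]$$

Special case: the little q-Jacobi polynomials

where $\omega_y$ and $h_n$ can be explicitly given.

$$J^{(3)}_\nu(2q^{(1/2)(n+k)}; q) \quad (\nu > -1) \qquad [89]$$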
by which [88] tends to the orthogonality relation for where the contour C is as in [70], and where
J(3) (x; q):
wðzÞ
X
1
Jð3Þ ð2qð1=2ÞðnþkÞ ; qÞ Jð3Þ ð2qð1=2ÞðmþkÞ ; qÞqk ðz2 ; z2 ; abcdez; abcde=z; qÞ1
¼ ½97
k¼1 ðaz; a=z; bz; b=z; cz; c=z; dz; d=z; ez; e=z; qÞ1
¼ n;m qn ðn; m 2 ZÞ ½90
ðbcde; acde; abde; abce; abcd; qÞ1
q-Hahn Polynomials h0 ¼ ½98
ðq; ab; ac; ad; ae; bc; bd; be; cd; ce; de; qÞ1
Definition as q-hypergeometric series and hn =h0 can also be given explicitly. For
n nþ1 ab = qN , n, m 2 {0, 1, . . . , N}, there is a related dis-
q ; q
; x
Qn ðx;
; ; N; qÞ :¼ 3 2 ; q; q crete biorthogonality of the form
q
; qN
ðn ¼ 0; 1; . . . ; NÞ ½91 X N
1 k 1 k
Rn ðaq þ a q Þ; a; b; c; d; e
k¼0
2
Orthogonality relation
1 q
X
N ðq
; qN ; qÞy ðq
Þy Rm ðaqk þ a1 qk Þ; a; b; c; d; wk ¼ 0
y y 2 abcde
Qn ðq ÞQm ðq Þ
y¼0
ðqN 1 ; q; qÞy ðn 6¼ mÞ ½99
¼ hn n;m ½92
where hn can be explicitly given.
Identities and Functions Associated
Stieltjes–Wigert Polynomials with Root Systems
Definition as q-hypergeometric series -Function Identities
n
1 q nþ1
Let R be a root system on a Euclidean space of
Sn ðx; qÞ ¼ 1 1 ; q; q x ½93 dimension l. Then Macdonald (1972) generalizes
ðq; qÞn 0
Weyl’s denominator formula to the case of an affine
The orthogonality measure is not uniquely determined: root system. The resulting formula can be written as
Z 1 an explicit expansion in powers of q of
1
Sn ðq1=2 x; qÞSm ðq1=2 x; qÞwðxÞ dx ¼ n n;m ; !
0 q ðq; qÞn Y1 Y
n l n
T
2Rþ ðe R ; qÞk ðqe ; qÞk dx
They satisfy the biorthogonality relation
T dx
I !
1 1 1 Y Y k
Rn ðz þ z Þ; a; b; c; d; e ¼ CT ð1 qi1 e
Þð1 qi e
Þ
2i C 2
2Rþ i¼1
1 q dz
Rm ðz þ z1 Þ; a; b; c; d; wðzÞ l
Y
2 abcde z kdi
¼ ½100
¼ 2hn n;m ½96 i¼1 k q
where T is a torus determined by R, CT means the Definition For a partition and for 0 t 1, the
constant term in the Laurent expansion in e
, and (analytically defined) Macdonald polynomial P (z) =
the di are the degrees of the fundamental invariants P (z; q, t) is of the form
of the Weyl group of R. The conjecture was X
extended for real k > 0, for several parameters k P ðzÞ ¼ P ðz; q; tÞ ¼ m ðzÞ þ u; m ðzÞ
<
(one for each root length), and for root system BCn ,
where Gustafson’s five-parameter n-variable analog ðu; 2 CÞ
of the Askey–Wilson integral ([70] for n = 0) such that for all <
settles: Z
Z
d1 . . . n P ðzÞ m ðzÞ ðzÞ dz ¼ 0
jðei1 ; . . . ; ein Þj2 ¼ 2n n! T
½0;2 n ð2Þn
Y n where
ðt; tnþj2 abcd; qÞ1
½101 Y ðzi z1
ðt ; q; abtj1 ; actj1 ; . . . ; cdtj1 ; qÞ1 j ; qÞ1
j
j¼1 ðzÞ ¼ ðz; q; tÞ :¼ ½106
i6¼j
ðtzi z1
j ; qÞ1
where
Y ðzi zj ; zi =zj ; qÞ1
ðzÞ :¼ Orthogonality relation
ðtzi zj ; tzi =zj ; qÞ1
1i<jn Z
1
Y
n ðz2j ; qÞ1 P ðzÞ P ðzÞ ðzÞ dz
½102 n! T
j¼1
ðazj ; bzj ; czj ; dzj ; qÞ1 Y ðqi j tji ; qi j þ1 tji ; qÞ
1
¼ i j tjiþ1 ; qi j þ1 tji1 ; qÞ
; ½107
Further extensions were in Macdonald’s conjectures i<j
ðq 1
for the quadratic norms of Macdonald polynomials
associated with root systems (see the subsection
‘‘Macdonald–Koornwinder polynomials’’), and finally q-Difference equation
proved by Cherednik. X
n Y
tzi zj
q;zi P ðz; q; tÞ
i¼1 j6¼i
zi zj
Macdonald Polynomials for Root System An1 !
X
n
Let n 2 Z>0 . We work with partitions = (1 , . . . , n ) i ni
¼ q t P ðz; q; tÞ ½108
of length n, where 1 n 0 are integers. i¼1
On the set of such partitions, we take the partial
where
q, zi is the q-shift operator:
q, zi f (z1 , . . . , zn ) :=
order ) 1 þ þ n = 1 þ þ n and
f (z1 , . . . , qzi , . . . , zn ). See (Macdonald 1995, ch. VI, §3)
1 þ þ i 1 þ þ i (i = 1, . . . , n 1). Write
for the full system of q-difference equations.
< iff and 6¼ . The monomials are
z
= z
1 1 . . . z
n n (
1 , . . . ,
n 2 Z0 ). For a partition
the symmetrized monomials m (z) and the Schur Special value
functions s (z) are defined by: Y
n
X P ð1; t; . . . ; tn1 ; q; tÞ ¼ tði1Þi
m ðzÞ:¼ z
ðsum over all distinct i¼1
þnj
detðzi j Þi;j¼1;...;n Restriction of number of variables
s ðzÞ :¼ nj
½104
detðzi Þi;j¼1;...;n
P1 ;2 ;...;n1 ;0 ðz1 ; . . . ; zn1 ; 0; q; tÞ
n
We integrate a function over the torus T := {z 2 C j ¼ P1 ;2 ;...;n1 ðz1 ; . . . ; zn1 ; q; tÞ ½110
jz1 j = = jzn j = 1} as
Z
1 Homogeneity
f ðzÞ dz :¼
T ð2Þn
Z 2 Z 2 P1 ;...;n ðz; q; tÞ ¼ z1 . . . zn P1 1;...;n 1 ðz; q; tÞ
... f ðei1 ; . . . ; ein Þd1 . . . dn ½105 ðn > 0Þ ½111
0 0
Bilinear sum
Schur functions (see [104]): X 1
P ðx; q; tÞP ðy; q; tÞ
P ðz; q; qÞ ¼ s ðzÞ ½115 hP P iq; t
;
Y ðtxi yj ; qÞ
1
¼ ½122
ðxi y j ; qÞ 1
Hall–Littlewood polynomials (see Macdonald (1995), i; j1
ch. III):
Generalized Kostka numbers The Kostka numbers
P ðz; 0; tÞ ¼ P ðz; tÞ ½116 K, Poccurring as expansion coefficients in
s = K, m were generalized by Macdonald to
coefficients K, (q, t) occurring in connection with
Jack polynomials (see Macdonald (1995), §VI.10): Macdonald polynomials, see Macdonald (1995,
ð1=aÞ §VI.8). Macdonald’s conjecture that K, (q, t) is a
lim P ðz; q; qa Þ ¼ P ðzÞ ½117
q"1 polynomial in q and t with coefficients in Z0 was
fully proved in Haiman (2001).
which are related to double affine Hecke algebras Elliptic Analog of Jackson’s 8W7 Summation
(see Macdonald (2003)).
nþ1 2
10V9 ða; b; c; d; q a =ðbcdÞ; qn ; q; pÞ
ðqa; qa=ðbcÞ; qa=ðbdÞ; qa=ðcdÞ; q; pÞn
Elliptic Hypergeometric Series ¼ ½129
ðqa=b; qa=c; qa=d; qa=ðbcdÞ; q; pÞn
Let p, q 2 C, jpj, jqj < 1. Define a modified Jacobi
theta function by Elliptic Analog of Bailey’s 10W9 Transformation
ðx; pÞ :¼ ðx; p=x; pÞ1 ðx 6¼ 0Þ ½123
qnþ2 a3 n
and the elliptic shifted factorial by V
12 11 a; b; c; d; e; f ; ; q ; q; p
bcdef
ða; q; pÞk :¼ ða; pÞðaq; pÞ . . . ðaqk1 ; pÞ ðqa; qa=ðef Þ;ðqaÞ2 =ðbcdeÞ;ðqaÞ2 =ðbcdf Þ; q; pÞn
¼
ðqa=e; qa=f ;ðqaÞ2 =ðbcdef Þ; ðqaÞ2 =ðbcdÞ;q;pÞn
ðk 2 Z>0 Þ; ða; q; pÞ0 :¼ 1 ½124 2
qa qa qa qa qnþ2 a3 n
12V11 ; ; ; ;e;f ; ; q ; q;p ½130
bcd cd bd bc bcdef
ða1 ; . . . ; ar ; q; pÞk :¼ ða1 ; q; pÞk . . . ðar ; q; pÞk ½125
Suitable 12V11 functions satisfy a discrete biortho-
where a, a1 , ..., ar 6¼ 0. For q = e2i , p= e2i
(=
> 0), gonality relation which is an elliptic analog of [99].
and a 2 C we have
Ruijsenaars’ elliptic gamma function
1
ðae2iðxþ Þ ; e2i
Þ
¼1
ðae2ix ; e2i
Þ
1 Y1
1 z1 qjþ1 pkþ1
ðae2iðxþ
Þ ; e2i
Þ ðz; q; pÞ :¼ ½131
¼ a1 qx ½126 1 zqj pk
ðae2ix ; e2i
Þ j;k¼0
P1
A series k = 0 ck with ckþ1 =ck being an elliptic
which is symmetric in p and q. Then
(i.e., doubly periodic meromorphic) function of k
ðqz; q; pÞ ¼ ðz; pÞðz; q; pÞ
considered as a complex variable is called an elliptic ½132
hypergeometric series. In particular, define the r Er1 ðqn z; q; pÞ ¼ ðz; q; pÞn ðz; q; pÞ
theta hypergeometric series as the formal series
Applications
r Er1 ða1 ; . . . ; ar ; b1 ; . . . ; br1 ; q; p; zÞ
Quantum Groups
X1
ða1 ; . . . ; ar ; q; pÞk zk
:¼ ½127 A specific quantum group is usually a Hopf algebra
k¼0
ðb1 ; . . . ; br1 ; q; pÞk ðq; q; pÞk
which is a q-deformation of the Hopf algebra of
It has g(k):= ckþ1 =ck with functions on a specific Lie group or, dually, of a
universal enveloping algebra (viewed as Hopf
zða1 qx ; pÞ . . . ðar qx ; pÞ algebra) of a Lie algebra. The general philosophy is
gðxÞ ¼
ðqxþ1 ; pÞ ðb1 qx ; pÞ . . . ðbr1 qx ; pÞ that representations of the Lie group or Lie algebra
also deform to representations of the quantum
By [126], g(x) is an elliptic function with periods 1
group, and that special functions associated with
and
1 (q = e2i , p = e2i
) if the balancing condi-
the representations in the classical case deform to
tion a1 . . . ar = qb1 . . . br1 is satisfied.
q-special functions associated with the representa-
The r Vr1 very well-poised theta hypergeometric
tions in the quantum case. Sometimes this is
series (a special r Er1 ) is defined, in case of
straightforward, but often new subtle phenomena
argument 1, as:
occur.
r Vr1 ða1 ; a6 ; . . . ; ar ; q; pÞ The representation-theoretic objects which may
X1 be explicitly written in terms of q-special functions
ða1 q2k ; pÞ ða1 ; a6 ; . . . ; ar ; q; pÞk
:¼ include matrix elements of representations with
k¼0
ða1 ; pÞ ðqa1 =a6 ; . . . ; qa1 =ar ; q; pÞk respect to specific bases (in particular spherical
qk elements), Clebsch–Gordan coefficients and Racah
½128 coefficients. Many one-variable q-hypergeometric
ðq; q; pÞk
functions have found interpretation in some way
The series is called balanced if a26 . . . a2r = ar6
1 q
r4
. in connection with a quantum analog of a three-
n
The series terminates if, for instance, ar = q . dimensional Lie group (generically the Lie group
$SL(2, \mathbb{C})$ and its real forms). Classical by now are: little q-Jacobi polynomials interpreted as matrix elements of irreducible representations of $SU_q(2)$ with respect to the standard basis; Askey–Wilson polynomials similarly interpreted with respect to a certain basis not coming from a quantum subgroup; Jackson's third q-Bessel functions as matrix elements of irreducible representations of $E_q(2)$; q-Hahn polynomials and q-Racah polynomials interpreted as Clebsch–Gordan coefficients and Racah coefficients, respectively, for $SU_q(2)$.

Further developments include: Macdonald polynomials as spherical elements on quantum analogs of compact Riemannian symmetric spaces; q-analogs of Jacobi functions as matrix elements of irreducible unitary representations of $SU_q(1, 1)$; Askey–Wilson polynomials as matrix elements of representations of the $SU(2)$ dynamical quantum group; an interpretation of discrete ${}_{12}V_{11}$ biorthogonality relations on the elliptic $U(2)$ quantum group.

Since the q-deformed Hopf algebras are usually presented by generators and relations, identities for q-special functions involving noncommuting variables satisfying simple relations are important for further interpretations of q-special functions in quantum groups, for instance:

Macdonald's generalization of Weyl's denominator formula to affine root systems has an interpretation as an identity for the denominator of the character of a representation of an affine Kac–Moody algebra.

Partitions of Positive Integers

Let $n$ be a positive integer, $p(n)$ the number of partitions of $n$, $p_N(n)$ the number of partitions of $n$ into parts $\leq N$, $p_{\mathrm{dist}}(n)$ the number of partitions of $n$ into distinct parts, and $p_{\mathrm{odd}}(n)$ the number of partitions of $n$ into odd parts. Then, Euler observed:
$$\frac{1}{(q; q)_\infty} = \sum_{n=0}^{\infty} p(n)\,q^n, \qquad \frac{1}{(q; q)_N} = \sum_{n=0}^{\infty} p_N(n)\,q^n \qquad [136]$$
$$(-q; q)_\infty = \sum_{n=0}^{\infty} p_{\mathrm{dist}}(n)\,q^n, \qquad \frac{1}{(q; q^2)_\infty} = \sum_{n=0}^{\infty} p_{\mathrm{odd}}(n)\,q^n \qquad [137]$$
and
$$(-q; q)_\infty = \frac{1}{(q; q^2)_\infty}, \qquad p_{\mathrm{dist}}(n) = p_{\mathrm{odd}}(n) \qquad [138]$$
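Euler's identities [136]–[138] can be verified coefficient by coefficient, since each generating function is a product of simple factors. In the sketch below, multiplying a truncated power series by $1/(1 - q^m)$ or by $(1 + q^m)$ is done in place on a list of coefficients. The function name `count_partitions` and its parameters are assumptions of this example.

```python
def count_partitions(n_max, allowed_parts, distinct=False):
    """Coefficients of q^0 .. q^n_max of a partition generating function.

    distinct=False multiplies by 1 / (1 - q^m) for each allowed part m
    (each part may repeat); distinct=True multiplies by (1 + q^m)
    (each part is used at most once).
    """
    counts = [1] + [0] * n_max
    for part in allowed_parts:
        if distinct:
            # Descending order so that each part is used at most once.
            for n in range(n_max, part - 1, -1):
                counts[n] += counts[n - part]
        else:
            # Ascending order so that a part may be reused.
            for n in range(part, n_max + 1):
                counts[n] += counts[n - part]
    return counts


if __name__ == "__main__":
    n_max = 10
    p_all = count_partitions(n_max, range(1, n_max + 1))
    p_dist = count_partitions(n_max, range(1, n_max + 1), distinct=True)
    p_odd = count_partitions(n_max, range(1, n_max + 1, 2))
    print(p_all)   # [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]
    print(p_dist)  # [1, 1, 1, 2, 2, 3, 4, 5, 6, 8, 10]
    print(p_dist == p_odd)  # True, in line with [138]
```

Parts larger than `n_max` cannot contribute to the coefficients up to $q^{n_{\max}}$, which is why truncating the infinite products at `n_max` is exact for these coefficients.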
Andrews GE, Askey R, and Roy R (1999) Special Functions. Cambridge: Cambridge University Press.
Andrews GE and Eriksson K (2004) Integer Partitions. Cambridge: Cambridge University Press.
Baxter RJ (1982) Exactly Solved Models in Statistical Mechanics. London: Academic Press.
Gasper G and Rahman M (2004) Basic Hypergeometric Series, 2nd edn. Cambridge: Cambridge University Press.
Haiman M (2001) Hilbert schemes, polygraphs and the Macdonald positivity conjecture. Journal of the American Mathematical Society 14: 941–1006.
Koekoek R and Swarttouw RF (1998) The Askey-Scheme of Hypergeometric Orthogonal Polynomials and Its q-Analogue. Report 98-17, Faculty of Technical Mathematics and Informatics, Delft University of Technology.
Koornwinder TH (1992) Askey–Wilson polynomials for root systems of type BC. In: Richards DStP (ed.) Hypergeometric Functions on Domains of Positivity, Jack Polynomials, and Applications, Contemp. Math., vol. 138, pp. 189–204. Providence, RI: American Mathematical Society.
Koornwinder TH (1994) Compact quantum groups and q-special functions. In: Baldoni V and Picardello MA (eds.) Representations of Lie Groups and Quantum Groups, Pitman Research Notes in Mathematics Series, vol. 311, pp. 46–128. Harlow: Longman Scientific & Technical.
Koornwinder TH (1997) Special functions and q-commuting variables. In: Ismail MEH, Masson DR, and Rahman M (eds.) Special Functions, q-Series and Related Topics, Fields Institute Communications, vol. 14, pp. 131–166. Providence, RI: American Mathematical Society.
Lepowsky J (1982) Affine Lie algebras and combinatorial identities. In: Winter DJ (ed.) Lie Algebras and Related Topics, Lecture Notes in Math., vol. 933, pp. 130–156. Berlin: Springer.
Macdonald IG (1972) Affine root systems and Dedekind's η-function. Inventiones Mathematicae 15: 91–143.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials, 2nd edn. Oxford: Clarendon.
Macdonald IG (2000, 2001) Orthogonal polynomials associated with root systems. Séminaire Lotharingien de Combinatoire 45: Art. B45a.
Macdonald IG (2003) Affine Hecke Algebras and Orthogonal Polynomials. Cambridge: Cambridge University Press.
Stanton D (1984) Orthogonal polynomials and Chevalley groups. In: Askey RA, Koornwinder TH, and Schempp W (eds.) Special Functions: Group Theoretical Aspects and Applications, pp. 87–128. Dordrecht: Reidel.
Suslov SK (2003) An Introduction to Basic Fourier Series. Dordrecht: Kluwer Academic Publishers.
Van der Jeugt J and Srinivasa Rao K (1999) Invariance groups of transformations of basic hypergeometric series. Journal of Mathematical Physics 40: 6692–6700.
Vilenkin NJ and Klimyk AU (1992) Representation of Lie Groups and Special Functions, vol. 3. Dordrecht: Kluwer Academic Publishers.
The category C with duality, braiding, and twist is are isotopy classes of framed oriented tangles.
ribbon, if for any V 2 C, Given a ribbon category C, we can consider C-
labeled tangles, that is, (framed oriented) tangles
ðV idV ÞbV ¼ ðidV V ÞbV whose components are labeled with objects of C.
For an endomorphism f : V ! V of an object V 2 C, They form a category T C . Links appear here as
its trace ‘‘tr(f ) 2 EndC (1)’’ is defined as tangles without endpoints, that is, as morphisms
; ! ;. The link invariant hLi generalizes to a
trðf Þ ¼ dV cV;V ððV f Þ idV ÞbV : 1 ! 1 functor h i : T C ! C.
To define 3-manifold invariants, we need modular
This trace shares a number of properties of the categories (Turaev 1994). Let k be a field. A
standard trace of matrices, in particular, monoidal category C is k-additive if its Hom sets
tr(fg) = tr(gf ) and tr(f g) = tr(f )tr(g). For an object are k-vector spaces, the composition and tensor
V 2 C, set product of the morphisms are bilinear, and
dimðVÞ ¼ trðidV Þ ¼ dV cV;V ðV idV ÞbV EndC (1) = k. An object V 2 C is simple if
EndC (V) = k. A modular category is a k-additive
Ribbon categories nicely fit the theory of knots ribbon category C with a finite family of simple
and links in S3 . A link L S3 is a closed one- objects {V } such that (1) for any object P V2C
dimensional submanifold of S3 . (A manifold is there is a finite expansion idV = i fi gi for
closed if it is compact and has no boundary.) A certain morphisms gi : V ! Vi , fi : Vi ! V and
link is oriented (resp. framed) if all its components (2) the S-matrix (S, ) is invertible over k where
are oriented (resp. provided with a homotopy class S, = tr(cV , V cV , V ). Note that S, = hH(, )i
of nonsingular normal vector fields). Given a framed where H(, ) is the oriented Hopf link with framing 0,
oriented link L S3 whose components are labeled linking number þ1, and labels V , V .
with objects of a ribbon category C, one defines a Axiom (1) implies that every simple object in C is
tensor hLi 2 EndC (1). To compute hLi, present L by isomorphic to exactly one of V . In most interesting
a plane diagram with only double transversal cross- cases (when there is a well-defined direct summa-
ings such that the framing of L is orthogonal to the tion in C), this axiom may be rephrased by saying
plane. Each double point of the diagram is an that C is finite semisimple, that is, C has a finite set
intersection of two branches of L, going over and of isomorphism classes of simple objects and all
under, respectively. Associate with such a crossing objects of C are direct sums of simple objects. A
the tensor (cV, W )1 where V, W 2 C are the labels of weaker version of the axiom (2) yields premodular
these two branches and 1 is the sign of the crossing categories.
determined by the orientation of L. We also The invariant h i of links and tangles extends by
associate certain tensors with the points of the linearity to the case where labels are finite linear
diagram where the tangent line is parallel to a fixed combinations of objects of C withP coefficients in k.
axis on the plane. These tensors are derived from the Such a linear combination = dim (V )V is
evaluation and co-evaluation morphisms and the called the Kirby color. It has the following sliding
twists. Finally, all these tensors are contracted into a property: for any object V 2 C, the two tangles in
single element hLi 2 EndC (1). It does not depend on Figure 1 yield the same morphism V ! V. Here, the
the intermediate choices and is preserved under dashed line represents an arc on the closed compo-
isotopy of L in S3 . For the trivial knot O(V) with nent labeled by . This arc can be knotted or linked
framing 0 and label V 2 C, we have hO(V)i = with other components of the tangle (not shown in
dim (V). the figure).
Further constructions need the notion of a tangle.
An (oriented) tangle is a compact (oriented) one-
dimensional submanifold of R2 [0, 1] with end-
points on R 0 {0, 1}. Near each of its endpoints,
an oriented tangle T is directed either down or up,
and thus acquires a sign 1. One can view T as a
morphism from the sequence of 1’s associated Ω Ω
with its bottom ends to the sequence of 1’s
associated with its top ends. Tangles can be
composed by putting one on top of the other. V V
This defines a category of tangles T whose objects
are finite sequences of 1’s and whose morphisms Figure 1 Sliding property.
Quantum 3-Manifold Invariants 119
Invariants of Closed 3-Manifolds M with @M = (X) q Y (the minus sign indicates the
1 2 3 orientation reversal). A TQFT has to satisfy axioms
Given an embedded solid torus g : S D ,! S ,
which can be expressed by saying that V is a
where D2 is a 2-disk and S1 = @D2 , a 3-manifold can
monoidal functor from the category of surfaces and
be built as follows. Remove from S3 the interior of
cobordisms to the category of vector spaces over k.
g(S1 D2 ) and glue back the solid torus D2 S1
Homeomorphisms of surfaces should induce iso-
along gjS1 S1 . This process is known as ‘‘surgery.’’
morphisms of the corresponding vector spaces
The resulting 3-manifold depends only on the
compatible with the action of cobordisms. From
isotopy class of the framed knot represented by g.
the definition, V(;) = k. Every compact oriented
More generally, a surgery on a framed link
3-manifold M is a cobordism between ; and @M
L = [m i = 1 Li in S
3
with m components yields a
so that V yields a ‘‘vacuum’’ vector V(M) 2 Hom(V(;),
closed oriented 3-manifold ML . A theorem of
V(@M)) = V(@M). If @M = ;, then this gives a
W Lickorish and A Wallace asserts that any closed
numerical invariant V(M) 2 V(;) = k.
connected oriented 3-manifold is homeomorphic to
Interestingly, TQFTs are often defined for
ML for some L. R Kirby proved that two framed
surfaces and 3-cobordisms with additional struc-
links give rise to homeomorphic 3-manifolds if and
ture. The surfaces X are normally endowed with
only if these links are related by isotopy and a finite
Lagrangians, that is, with maximal isotropic
sequence of geometric transformations called Kirby
subspaces in H1 (X; R). For 3-cobordisms, several
moves. There are two Kirby moves: adjoining a
additional structures are considered in the litera-
distant unknot O" with framing " = 1, and sliding
ture: for example, 2-framings, p1 -structures, and
a link component over another one as in Figure 1.
numerical weights. All these choices are equiva-
Let L = [m 3
i = 1 Li S be a framed link and let lent. The TQFTs requiring such additional struc-
(bi, j )i, j = 1,..., m be its linking matrix: for i 6¼ j, bi, j is
tures are said to be ‘‘projective’’ since they provide
the linking number of Li , Lj , and bi, i is the framing
projective linear representations of the mapping
number of Li . Denote by eþ (resp. e ) the number of
class groups of surfaces.
positive (resp. negative) eigenvalues of this matrix.
Every modular category C with ground field k
The sliding property of modular categories implies
and simple objects {V } gives rise to a projective
the following theorem. In its statement, a knot K
three-dimensional TQFT V C . ItP depends on the
with label is denoted by K(). 2
choice of a square root D of (dim (V )) 2 k.
Theorem 1 Let C be a modular category with For a connected surface X of genus g,
Kirby color . Then hO1 ()i 6¼ 0, hO1 ()i 6¼ 0 and 0 1
the expression g
M O
V C ðXÞ ¼ HomC @1; ðVr Vr ÞA
C ðML Þ ¼ hO1 ðÞieþ hO1 ðÞie hL1 ðÞ; . . . ; Lm ðÞi 1 ;...;g r¼1
One calls V Hermitian if it is endowed with It turns out that S(X) is a finitely generated
conjugation such that projective D-module and V(X) = S(X) D C.
A cobordism (M, X, Y) is targeted if all its connected
V ¼ ðV Þ1 ; cV;W ¼ ðcV;W Þ1 components meet Y along a nonempty set. In
bV ¼ dV cV;V ðV 1V Þ this case, V(M)(S(X)) S(Y). Thus, applying S to
surfaces and restricting to targeted cobordisms, we
dV ¼ ð1V 1 1
V ÞcV ;V bV obtain an ‘‘integral version’’ of V. In many interest-
for any objects V, W of V. A Hermitian modular ing cases, the D-module S(X) is free and its basis
category V is unitary if tr(f f ) 0 for any morphism may be described explicitly. A simple Lie algebra g
f in V. The three-dimensional TQFT, derived from a and a primitive rth (in some cases 4rth) root of unity
Hermitian (resp. unitary) modular category, has a q with sufficiently big prime r give rise to an almost
natural structure of a Hermitian (resp. unitary) D-integral TQFT for D = Z[q].
TQFT.
The modular category derived from a simple Lie
algebra g and a root of unity q is always Hermitian. State-Sum Invariants
It may be unitary for some q. For simply laced g,
there are always such roots of unity q of any given Another approach to three-dimensional TQFTs is
sufficiently big order. For non-simply-laced g, this based on the theory of 6j-symbols and state sums on
holds under certain divisibility conditions on the triangulations of 3-manifolds. This approach intro-
order of q. duced by V Turaev and O Viro is a quantum
deformation of the Ponzano–Regge model for the
three-dimensional lattice gravity. The quantum 6j-
Integral Structures in TQFTs symbols derived from representations of Uq (sl2 C) are
C-valued rational functions of the variable q0 = q1=2
The quantum invariants of 3-manifolds have one
fundamental property: up to an appropriate res- i j k
½2
caling, they are algebraic integers. This was l m n
first observed by H Murakami, who proved that
numerated by 6-tuples of non-negative integers i, j,
qsl2 (M) is an algebraic integer, provided the order of
k, l, m, n. One can think of these integers as labels
q is an odd prime and M is a homology sphere. This
sitting on the edges of a tetrahedron (see Figure 3).
extends to an arbitrary closed connected oriented 3-
The 6j-symbol admits various equivalent normal-
manifold M and an arbitrary simple Lie algebra g as
izations and we choose the one which has full
follows (Le 2003): for any sufficiently big prime
tetrahedral symmetry. Now, let q0 2 C be a
integer r and any primitive rth root of unity q,
primitive 2rth root of unity with r 2. Set
qPg ðMÞ 2 Z½q ¼ Z½expð2i=rÞ ½1 I = {0, 1, . . . , r 2}. Given a labeled tetrahedron T
as in Figure 3 with i, j, k, l, m, n 2 I, the 6j-symbol
This inclusion allows one to expand qPg (M) as [2] can be evaluated at q0 and we can obtain a
a polynomial in q. A study of its coefficients leads complex number denoted jTj. Consider a closed
to the Ohtsuki invariants of rational homology three-dimensional manifold M with triangulation t.
spheres and further to perturbative invariants of (Note that all 3-manifolds can be triangulated.) A
3-manifolds due to T Le, J Murakami, and coloring of M is a mapping ’ from the set Edg(t)
T Ohtsuki (see Ohtsuki (2002)). Conjecturally, the of the edges of t to I. Set
inclusion [1] holds for nonprime (sufficiently big) r pffiffiffiffiffi X Y Y
2a
as well. Connections with the algebraic number jMj ¼ ð 2r=ðq0 q1 0 ÞÞ h’ðeÞi jT ’ j
theory (specifically modular forms) were studied by ’ e2EdgðtÞ T
where $a$ is the number of vertices of $t$, $\langle n\rangle = (-1)^n (q_0^{n} - q_0^{-n})(q_0 - q_0^{-1})^{-1}$ for any integer $n$, $T$ runs over all tetrahedra of $t$, and $T^{\varphi}$ is $T$ with the labeling induced by $\varphi$. It is important to note that $|M|$ does not depend on the choice of $t$ and thus yields a topological invariant of $M$.

The invariant $|M|$ is closely related to the quantum invariant $\tau_q^{g}(M)$ for $g = sl_2(\mathbb C)$. Namely, $|M|$ is the square of the absolute value of $\tau_q^{g}(M)$, that is, $|M| = |\tau_q^{g}(M)|^2$. This computes $|\tau_q^{g}(M)|$ inside $M$ without appeal to surgery. No such computation of the phase of $\tau_q^{g}(M)$ is known.

These constructions generalize in two directions. First, they extend to manifolds with boundary. Second, instead of the representation category of $U_q(sl_2\mathbb C)$, one can use an arbitrary modular category $\mathcal C$. This yields a three-dimensional TQFT, which associates to a surface $X$ a vector space $|X|_{\mathcal C}$, and to a 3-cobordism $(M, X, Y)$ a homomorphism $|M|_{\mathcal C}: |X|_{\mathcal C}\to |Y|_{\mathcal C}$ (see Turaev (1994)). When $X = Y = \emptyset$, this homomorphism is multiplication $\mathbb C\to\mathbb C$ by a topological invariant $|M|_{\mathcal C}\in\mathbb C$. The latter is computed as a state sum on a triangulation of $M$ involving the 6j-symbols associated with $\mathcal C$. In general, these 6j-symbols are not numbers but tensors so that, instead of their product, one should use an appropriate contraction of tensors. The vectors in $V(X)$ are geometrically represented by trivalent graphs on $X$ such that every edge is labeled with a simple object of $\mathcal C$ and every vertex is labeled with an intertwiner between the three objects labeling the incident edges. The TQFT $|\cdot|_{\mathcal C}$ is related to the TQFT $V = V_{\mathcal C}$ by $|M|_{\mathcal C} = |V(M)|^2$. Moreover, for any closed oriented surface $X$,

$$|X|_{\mathcal C} = \mathrm{End}(V(X)) = V(X)\otimes (V(X))^{*} = V(X)\otimes V(-X)$$

and for any three-dimensional cobordism $(M, X, Y)$,

$$|M|_{\mathcal C} = V(M)\otimes V(-M):\ V(X)\otimes V(-X)\to V(Y)\otimes V(-Y)$$

J Barrett and B Westbury introduced a generalization of $|M|_{\mathcal C}$ derived from the so-called spherical monoidal categories (which are assumed to be semisimple with a finite set of isomorphism classes of simple objects). This class includes modular categories and a most interesting family of (unitary monoidal) categories arising in the theory of subfactors (see Evans and Kawahigashi (1998) and Kodiyalam and Sunder (2001)). Every spherical category $\mathcal C$ gives rise to a topological invariant $|M|_{\mathcal C}$ of a closed oriented 3-manifold $M$. (It seems that this approach has not yet been extended to cobordisms.) Every monoidal category $\mathcal C$ gives rise to a double (or a center) $Z(\mathcal C)$, which is a braided monoidal category (see Majid (1995)). If $\mathcal C$ is spherical, then $Z(\mathcal C)$ is modular. Conjecturally, $|M|_{\mathcal C} = \tau_{Z(\mathcal C)}(M)$. In the case where $\mathcal C$ arises from a subfactor, this has been recently proved by Y Kawahigashi, N Sato, and M Wakui.

The state sum invariants above are closely related to spin networks, spin foam models, and other models of quantum gravity in dimension 2 + 1 (see Baez (2000) and Carlip (1998)).

See also: Axiomatic Approach to Topological Quantum Field Theory; Braided and Modular Tensor Categories; Chern–Simons Models: Rigorous Results; Finite-type Invariants of 3-Manifolds; Large-N and Topological Strings; Schwarz-Type Topological Quantum Field Theory; Topological Quantum Field Theory: Overview; von Neumann Algebras: Subfactor Theory.

Further Reading

Baez JC (2000) An Introduction to Spin Foam Models of BF Theory and Quantum Gravity, Geometry and Quantum Physics, Lecture Notes in Physics, No. 543, pp. 25–93. Berlin: Springer.
Bakalov B and Kirillov A Jr. (2001) Lectures on Tensor Categories and Modular Functors. University Lecture Series, vol. 21. Providence, RI: American Mathematical Society.
Blanchet C, Habegger N, Masbaum G, and Vogel P (1995) Topological quantum field theories derived from the Kauffman bracket. Topology 34: 883–927.
Carlip S (1998) Quantum Gravity in 2 + 1 Dimensions. Cambridge Monographs on Mathematical Physics. Cambridge: Cambridge University Press.
Carter JS, Flath DE, and Saito M (1995) The Classical and Quantum 6j-Symbols. Mathematical Notes, vol. 43. Princeton: Princeton University Press.
Evans D and Kawahigashi Y (1998) Quantum Symmetries on Operator Algebras, Oxford Mathematical Monographs, Oxford Science Publications. New York: The Clarendon Press, Oxford University Press.
Kauffman LH (2001) Knots and Physics, 3rd edn., Series on Knots and Everything, vol. 1. River Edge, NJ: World Scientific.
Kerler T and Lyubashenko V (2001) Non-Semisimple Topological Quantum Field Theories for 3-Manifolds with Corners. Lecture Notes in Mathematics, vol. 1765. Berlin: Springer.
Kodiyalam V and Sunder VS (2001) Topological Quantum Field Theories from Subfactors. Research Notes in Mathematics, vol. 423. Boca Raton, FL: Chapman and Hall/CRC Press.
Le T (2003) Quantum invariants of 3-manifolds: integrality, splitting, and perturbative expansion. Topology and Its Applications 127: 125–152.
Lickorish WBR (2002) Quantum Invariants of 3-Manifolds. Handbook of Geometric Topology, pp. 707–734. Amsterdam: North-Holland.
Majid S (1995) Foundations of Quantum Group Theory. Cambridge: Cambridge University Press.
Ohtsuki T (2002) Quantum Invariants. A Study of Knots, 3-Manifolds, and Their Sets, Series on Knots and Everything, vol. 29. River Edge, NJ: World Scientific.
Turaev V (1994) Quantum Invariants of Knots and 3-Manifolds. de Gruyter Studies in Mathematics, vol. 18. Berlin: Walter de Gruyter.
Quantum Calogero–Moser Systems 123
Figure 1 Four different types of quantum C–M potentials $V(q)$: rational ($1/q^2$), Calogero ($q^2 + 1/q^2$), Sutherland ($1/\sin^2 q$), and hyperbolic ($1/\sinh^2 q$).

(hyperbolic) counterpart (see Figure 1) $1/(q_j - q_k)^2 \to a^2/\sinh^2 a(q_j - q_k)$, in which $a > 0$ is a real parameter. The $1/\sin^2 q$ potential case (the Sutherland system) corresponds to the $1/(\text{distance})^2$ interaction on a circle of radius $1/2a$, see Figure 2. A harmonic confining potential $\omega^2\sum_{j=1}^{r} q_j^2/2$ can be added to the rational Hamiltonian [1] without breaking the integrability (the Calogero system, see Figure 1). At the classical level, the trigonometric (hyperbolic) and rational C–M systems are obtained from the elliptic potential systems (with the Weierstrass $\wp$ function) as the degenerate limits: $\wp(q_1 - q_2) \to a^2/\sinh^2 a(q_1 - q_2) \to 1/(q_1 - q_2)^2$, namely as one (two) period(s) of the $\wp$ function tends to infinity.

It is remarkable that these equations of motion can be expressed in a matrix form (Lax pair):

$$\frac{i}{\hbar}[\hat H, L] = \frac{dL}{dt} = LM - ML = [L, M]$$

(the first equality being the Heisenberg equation of motion), in which $L$ and $M$ are given by

$$L = \begin{pmatrix} p_1 & \dfrac{ig}{q_1 - q_2} & \cdots & \dfrac{ig}{q_1 - q_r}\\[2mm] \dfrac{ig}{q_2 - q_1} & p_2 & \cdots & \dfrac{ig}{q_2 - q_r}\\ \vdots & \vdots & \ddots & \vdots\\ \dfrac{ig}{q_r - q_1} & \dfrac{ig}{q_r - q_2} & \cdots & p_r \end{pmatrix},\qquad M = \begin{pmatrix} m_1 & -\dfrac{ig}{(q_1 - q_2)^2} & \cdots & -\dfrac{ig}{(q_1 - q_r)^2}\\[2mm] -\dfrac{ig}{(q_2 - q_1)^2} & m_2 & \cdots & -\dfrac{ig}{(q_2 - q_r)^2}\\ \vdots & \vdots & \ddots & \vdots\\ -\dfrac{ig}{(q_r - q_1)^2} & -\dfrac{ig}{(q_r - q_2)^2} & \cdots & m_r \end{pmatrix} \tag{2}$$

The diagonal element $m_j$ of $M$ is given by $m_j = ig\sum_{k\neq j} 1/(q_j - q_k)^2$. The matrix $M$ has a special property $\sum_{j=1}^{r} M_{jk} = \sum_{k=1}^{r} M_{jk} = 0$, which ensures the quantum conserved quantities as the total sum of powers of Lax matrix $L$: $[\hat H, K_n] = 0$, $K_n \equiv \mathrm{Ts}(L^n) \equiv \sum_{j,k}(L^n)_{jk}$, $(n = 1, 2, 3, \ldots)$, $[K_n, K_m] = 0$. It should be stressed that the trace of $L^n$ is not conserved because of the noncommutativity of $q$ and $p$. The Hamiltonian is equivalent to $K_2$, $\hat H \propto K_2 + \text{const}$. In other words, the Lax matrix $L$ is like a "square root" of the Hamiltonian. The quantum equations of motion for the Sutherland and hyperbolic potentials are again expressed by Lax pairs if the following replacements are made: $1/(q_j - q_k) \to a\coth a(q_j - q_k)$ in $L$ and $1/(q_j - q_k)^2 \to a^2/\sinh^2 a(q_j - q_k)$ in $M$. The quantum conserved quantities are obtained in the same manner as above for the systems with the trigonometric and hyperbolic interactions.

The main goal here is to find all the eigenvalues $\{E\}$ and eigenfunctions $\{\psi(q)\}$ of the Hamiltonians with the rational, Calogero, Sutherland, and hyperbolic potentials: $\hat H\psi(q) = E\psi(q)$. The momentum operator $p_j$ acts as differential operators $p_j = -i\hbar\,\partial/\partial q_j$. For example, for the rational model Hamiltonian [1], the eigenvalue equation reads

$$\left[-\frac{\hbar^2}{2}\sum_{j=1}^{r}\frac{\partial^2}{\partial q_j^2} + g(g - \hbar)\sum_{j<k}\frac{1}{(q_j - q_k)^2}\right]\psi(q) = E\,\psi(q) \tag{3}$$

which is a second-order Fuchsian differential equation for each variable $\{q_j\}$ with a regular singularity at each hyperplane $q_j = q_k$ whose exponents are $g/\hbar$, $1 - g/\hbar$. Any solution of [3] is regular at all points, except for those on the union of hyperplanes $q_j = q_k$. Since the structure of the singularity is the same for the other three types of potentials, the same assertion for the regularity and singularity of the solution holds for these cases, too. For the trigonometric (Sutherland) case, there are other singularities at $q_j - q_k = l\pi/a$, $l\in\mathbb Z$, due to the periodicity of the potential. As is clear from the shape of the potentials, see Figure 1, the rational and hyperbolic Hamiltonians have only continuous spectra, whereas the Calogero and Sutherland Hamiltonians have only discrete spectra.
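The rational Lax pair [2] can be sanity-checked numerically in the classical limit, where $q$ and $p$ are ordinary numbers and the Hamiltonian reduces to $H = \frac12\sum_j p_j^2 + g^2\sum_{j<k} 1/(q_j - q_k)^2$ (the $\hbar\to 0$ form of the coupling $g(g-\hbar)$ in [3]). The following Python sketch is illustrative only: the function names are hypothetical, NumPy is assumed, and the sign convention for $M$ (off-diagonal entries $-ig/(q_j - q_k)^2$, diagonal entries $m_j = ig\sum_{k\neq j} 1/(q_j - q_k)^2$) is the one for which $dL/dt = [L, M]$ holds with $L_{jk} = ig/(q_j - q_k)$. It checks that the rows and columns of $M$ sum to zero, that Hamilton's equations reproduce $dL/dt = [L, M]$, and that the total sum $\mathrm{Ts}(L^2)$ equals $2H$ classically.

```python
import numpy as np


def _pair_differences(q):
    """Return the matrix q_j - q_k (diagonal set to 1 to avoid 0-division)
    and a mask that is 1 off the diagonal and 0 on it."""
    q = np.asarray(q, dtype=float)
    dq = q[:, None] - q[None, :]
    np.fill_diagonal(dq, 1.0)
    off = 1.0 - np.eye(len(q))
    return dq, off


def lax_pair(q, p, g):
    """Classical rational Lax pair (hypothetical helper, sign convention as in the lead-in)."""
    dq, off = _pair_differences(q)
    p = np.asarray(p, dtype=float)
    L = np.diag(p).astype(complex) + 1j * g * off / dq
    M_off = -1j * g * off / dq**2
    # Diagonal m_j chosen so that every row (and column) of M sums to zero.
    M = M_off - np.diag(M_off.sum(axis=1))
    return L, M


def classical_hamiltonian(q, p, g):
    """H = (1/2) sum p_j^2 + g^2 sum_{j<k} 1/(q_j - q_k)^2."""
    dq, off = _pair_differences(q)
    p = np.asarray(p, dtype=float)
    return 0.5 * np.sum(p**2) + 0.5 * g**2 * np.sum(off / dq**2)


def lax_time_derivative(q, p, g):
    """dL/dt from Hamilton's equations: dq_j/dt = p_j, dp_j/dt = -dV/dq_j."""
    dq, off = _pair_differences(q)
    p = np.asarray(p, dtype=float)
    p_dot = (2.0 * g**2 * off / dq**3).sum(axis=1)
    dp = p[:, None] - p[None, :]
    return np.diag(p_dot).astype(complex) - 1j * g * off * dp / dq**2


if __name__ == "__main__":
    q = np.array([-1.3, 0.2, 0.9, 2.5])
    p = np.array([0.4, -0.7, 1.1, 0.3])
    g = 1.7
    L, M = lax_pair(q, p, g)
    print("rows/columns of M sum to zero:",
          np.allclose(M.sum(axis=0), 0), np.allclose(M.sum(axis=1), 0))
    print("dL/dt == [L, M]:",
          np.allclose(L @ M - M @ L, lax_time_derivative(q, p, g)))
    print("Ts(L^2) == 2H:",
          np.isclose((L @ L).sum(), 2 * classical_hamiltonian(q, p, g)))
```

The check is purely classical; in the quantum case $q$ and $p$ do not commute, which is why the text uses the total sum Ts rather than the trace.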
Figure 2 Sutherland potential is $1/(\text{distance})^2$ interaction on a circle of radius $R = 1/2a$, with $\text{distance}(q_1, q_2) = \sin a(q_1 - q_2)/a$. The large-radius limit, $a\to 0$, gives the rational potential.

The integrability or more precisely the triangularity of the quantum C–M Hamiltonian was first discovered by Calogero for particles on a line with inverse square potential plus a confining harmonic force and by Sutherland for the particles on a circle with the trigonometric potential. Later, classical integrability of the models in terms of Lax pairs was proved by Moser. Olshanetsky and Perelomov showed that these systems were based on $A_{r-1}$ root systems, that is, $q_j - q_k = \alpha\cdot q$, and $\alpha$ is one of the root vectors of $A_{r-1}$ root system [13]. They also introduced generalizations of the C–M systems based on any root system including the noncrystallographic ones.

As shown by Heckman–Opdam and Sasaki and collaborators, quantum C–M systems with degenerate potentials (i.e., the rational potentials with/without harmonic force, the hyperbolic, and the trigonometric potentials), based on any root system can be formulated and solved universally. To be more precise, the rational and Calogero systems are integrable for all root systems, the crystallographic and noncrystallographic. The hyperbolic and trigonometric (Sutherland) systems are integrable for any crystallographic root system. The universal formulas for the Hamiltonians, Lax pairs, ground state wave functions, conserved quantities, the triangularity, the discrete spectra for the Calogero and Sutherland systems, the creation and annihilation operators, etc., are equally valid for any root system. This will be shown in the next section. Some rudimentary facts of the root systems and reflections are summarized in the appendix.

Universal Formalism

A C–M system is a Hamiltonian dynamical system associated with a root system $\Delta$ of rank $r$, which is a set of vectors in $\mathbb R^r$ with its standard inner product. A brief review of the properties of the root systems and the associated reflections together with explicit realizations of all the classical root systems will be found in the appendix.

Factorized Hamiltonian

The Hamiltonian for the quantum C–M system can be written in terms of a pre-potential $W(q)$ in a "factorized form":

$$H = \frac12\sum_{j=1}^{r}\left(p_j - i\frac{\partial W(q)}{\partial q_j}\right)\left(p_j + i\frac{\partial W(q)}{\partial q_j}\right) \tag{4}$$

The pre-potential is a sum over positive roots:

$$W(q) = \sum_{\alpha\in\Delta_+} g_\alpha \ln|w(\alpha\cdot q)| + \left(-\frac{\omega}{2}q^2\right) \tag{5}$$

The real positive coupling constants $g_\alpha$ are defined on orbits of the corresponding Coxeter group, that is, they are identical for roots in the same orbit. That is, for the simple Lie algebra cases, one coupling constant, $g_\alpha = g$, for all roots in simply laced models and two independent coupling constants, $g_\alpha = g_L$ for long roots and $g_\alpha = g_S$ for short roots, in non-simply laced models. The function $w(u)$ and the other functions $x(u)$ and $y(u)$ appearing in the Lax pair [10], [11] are listed in Table 1 for each type of degenerate potentials. The dynamics of the prepotentials $W(q)$ (eqn [5]) has been discussed by Dyson from a different point of view (random-matrix model). The above factorized Hamiltonian [4] consists of an operator part $\hat H$, which is the Hamiltonian in the usual definition (see the Hamiltonians in the previous section, e.g., [1]), and a constant $E_0$ which is the ground-state energy, $H = \hat H - E_0$. The factorized Hamiltonian [4] also arises within the context of supersymmetric quantum mechanics.

Table 1 Functions appearing in the prepotential and Lax pair

Potential        w(u)       x(u)         y(u)
Rational         u          1/u          -1/u^2
Hyperbolic       sinh au    a coth au    -a^2/sinh^2 au
Trigonometric    sin au     a cot au     -a^2/sin^2 au

The pre-potential and the Hamiltonian are invariant under reflection of the phase space variables in the hyperplane perpendicular to any root: $W(s_\alpha(q)) = W(q)$, $H(s_\alpha(p), s_\alpha(q)) = H(p, q)$, $\forall\alpha\in\Delta$, with $s_\alpha$ defined by [12]. The above Coxeter (Weyl) invariance is the only (discrete) symmetry of the C–M systems. The main problem is, as in the $A_{r-1}$ case, to find all the eigenvalues $\{E\}$ and eigenfunctions $\{\psi(q)\}$ of the above Hamiltonian, $H\psi(q) = E\psi(q)$.

For any root system and for any choice of potential, the C–M system has a hard repulsive potential $\sim 1/(\alpha\cdot q)^2$ near the reflection hyperplane $H_\alpha = \{q\in\mathbb R^r,\ \alpha\cdot q = 0\}$. The C–M eigenvalue equation is a second-order Fuchsian differential equation with regular singularities at each reflection hyperplane $H_\alpha$ and those arising from the periodicity in the case of the Sutherland potential. Near the reflection hyperplane $H_\alpha$, the solution behaves as follows:

$$\psi\sim(\alpha\cdot q)^{g_\alpha/\hbar}(1 + \text{regular terms}),\quad\text{or}\quad \psi\sim(\alpha\cdot q)^{1 - g_\alpha/\hbar}(1 + \text{regular terms})$$

The former solution is chosen for the square integrability. Because of the singularities, the configuration space is restricted to the principal Weyl chamber $PW$ or the principal Weyl alcove $PW_T$ for the trigonometric potential (see Figure 3): $PW = \{q\in\mathbb R^r \mid \alpha\cdot q > 0,\ \alpha\in\Pi\}$, $PW_T = \{q\in\mathbb R^r \mid \alpha\cdot q > 0,\ \alpha\in\Pi,\ \alpha_h\cdot q < \pi/a\}$, ($\Pi$: set of simple roots, see the appendix). Here $\alpha_h$ is the highest root.

Figure 3 Simple roots ($\alpha_1$, $\alpha_2$), the highest root ($\alpha_h$), fundamental weights ($\lambda_1$, $\lambda_2$), and the principal Weyl alcove (grey) and the principal Weyl chamber (light grey, extending to infinity) in a two-dimensional root system.

Ground-State Wave Function and Energy

One straightforward outcome of the factorized Hamiltonian [4] is the universal ground-state wave function, which is given by

$$\Phi_0(q) = e^{W(q)/\hbar} = \prod_{\alpha\in\Delta_+}|w(\alpha\cdot q)|^{g_\alpha/\hbar}\, e^{-(\omega/2\hbar)q^2},\qquad H\Phi_0(q) = 0 \tag{6}$$

The exponential factor $e^{-(\omega/2\hbar)q^2}$ exists only for the Calogero systems. The ground-state energy, that is, the constant part of $H = \hat H - E_0$, has a universal expression for each potential:

$$E_0 = \begin{cases} 0 & \text{rational}\\ \omega\Big(\hbar r/2 + \sum_{\alpha\in\Delta_+} g_\alpha\Big) & \text{Calogero}\end{cases}\qquad E_0 = 2a^2\varrho^2\times\begin{cases} -1 & \text{hyperbolic}\\ \phantom{-}1 & \text{Sutherland}\end{cases} \tag{7}$$

where $\varrho = \frac12\sum_{\alpha\in\Delta_+} g_\alpha\,\alpha$ is called a "deformed Weyl vector." Obviously, $\Phi_0(q)$ is square integrable in the configuration spaces for the Calogero and Sutherland systems and not square integrable for the rational and hyperbolic potentials.

Excited States, Triangularity, and Spectrum

Excited states of the C–M systems can be easily obtained as eigenfunctions of a differential operator $\tilde H$ obtained from $H$ by a similarity transformation:

$$\tilde H = e^{-W/\hbar}\, H\, e^{W/\hbar} = -\frac12\sum_{j=1}^{r}\left(\hbar^2\frac{\partial^2}{\partial q_j^2} + 2\hbar\,\frac{\partial W}{\partial q_j}\frac{\partial}{\partial q_j}\right)$$

The eigenvalue equation for $\tilde H$, $\tilde H\Psi_E = E\Psi_E$, is then equivalent to that of the original Hamiltonian, $H\Psi_E e^{W/\hbar} = E\Psi_E e^{W/\hbar}$. Since all the singularities of the Fuchsian differential equation $H\psi(q) = E\psi(q)$ are contained in the ground-state wave function $e^{W/\hbar}$, $\Psi_E$ must be regular at finite $q$, including all the reflection boundaries. As for the rational and hyperbolic potentials, the energy eigenvalues are only continuous. For the rational case, the eigenfunctions are multivariable generalization of Bessel functions.

Calogero Systems

The similarity-transformed Hamiltonian $\tilde H$ reads

$$\tilde H = \hbar\omega\, q\cdot\frac{\partial}{\partial q} - \frac{\hbar^2}{2}\sum_{j=1}^{r}\frac{\partial^2}{\partial q_j^2} - \hbar\sum_{\alpha\in\Delta_+}\frac{g_\alpha}{\alpha\cdot q}\,\alpha\cdot\frac{\partial}{\partial q} \tag{8}$$

which maps a Coxeter-invariant polynomial in $q$ of degree $d$ to another of degree $d$. Thus, the Hamiltonian $\tilde H$ [8] is lower-triangular in the basis of Coxeter-invariant polynomials and the diagonal elements have values as $\hbar\omega\times$ degree, as given by the first term. Independent Coxeter-invariant polynomials exist at the degrees $f_j$ listed in Table 2: $f_j = 1 + e_j$, $j = 1, \ldots, r$, where $\{e_j\}$, $j = 1, \ldots, r$, are the exponents of $\Delta$.

The eigenvalues of the Hamiltonian $H$ are $\hbar\omega N$ with $N$ a non-negative integer. $N$ can be expressed as $N = \sum_{j=1}^{r} n_j f_j$, $n_j\in\mathbb Z_+$, and the degeneracy of the eigenvalue $\hbar\omega N$ is the number of partitions of $N$. It is remarkable that the coupling constant dependence appears only in the ground-state energy $E_0$. This is a deformation of the isotropic harmonic oscillator confined in the principal Weyl chamber. The eigenpolynomials are generalization of multivariable Laguerre (Hermite) polynomials. One immediate consequence of this spectrum is the periodicity of the quantum motion. If a system has a wave function $\psi(0)$ at $t = 0$, then at $t = T = 2\pi/\omega$ the system has physically the same wave function as $\psi(0)$, that is, $\psi(T) = e^{-iE_0T/\hbar}\psi(0)$. The same assertion holds at the classical level, too.

Table 2 The degrees $f_j$ in which independent Coxeter-invariant polynomials exist

Δ         f_j = 1 + e_j
A_r       2, 3, 4, ..., r + 1
B_r       2, 4, 6, ..., 2r
C_r       2, 4, 6, ..., 2r
D_r       2, 4, ..., 2r - 2, r
E_6       2, 5, 6, 8, 9, 12
E_7       2, 6, 8, 10, 12, 14, 18
E_8       2, 8, 12, 14, 18, 20, 24, 30
F_4       2, 6, 8, 12
G_2       2, 6
I_2(m)    2, m
H_3       2, 6, 10
H_4       2, 12, 20, 30
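The counting of Calogero levels can be sketched in a few lines of Python. The sketch below assumes that "the number of partitions of $N$" means the number of ways of writing $N = \sum_j n_j f_j$ with non-negative integers $n_j$, using the degrees $f_j$ of Table 2. The dictionary `DEGREES`, which covers only a few root systems, and all function names are hypothetical and introduced for illustration.

```python
# Degrees f_j = 1 + e_j for a few root systems, copied from Table 2.
DEGREES = {
    "B2": (2, 4),
    "G2": (2, 6),
    "H3": (2, 6, 10),
    "F4": (2, 6, 8, 12),
    "E6": (2, 5, 6, 8, 9, 12),
}


def degrees_A(r):
    """Degrees 2, 3, ..., r + 1 of the A_r series."""
    return tuple(range(2, r + 2))


def degeneracy(N, degrees):
    """Number of solutions of N = sum_j n_j f_j with n_j = 0, 1, 2, ...

    Standard dynamic programme: ways[n] counts the representations of n
    using the degrees processed so far.
    """
    ways = [1] + [0] * N
    for f in degrees:
        for n in range(f, N + 1):
            ways[n] += ways[n - f]
    return ways[N]


def levels(degrees, n_max):
    """List of (N, degeneracy) for the occupied levels hbar*omega*N, N <= n_max."""
    return [(N, d) for N in range(n_max + 1)
            if (d := degeneracy(N, degrees)) > 0]


if __name__ == "__main__":
    for name, degs in [("A2", degrees_A(2)), ("H3", DEGREES["H3"])]:
        print(name, levels(degs, 12))
```

Under this reading the level $N = 1$ is always empty, because the smallest degree in Table 2 is 2 for every root system.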
Sutherland Systems

The periodicity of the trigonometric potential dictates that the wave function should be a Bloch factor $e^{2ia\mu\cdot q}$ (where $\mu$ is a weight) multiplied by a Fourier series in terms of simple roots. The basis of the Weyl invariant wave functions is specified by a dominant weight $\lambda = \sum_{j=1}^{r} m_j\lambda_j$, $m_j\in\mathbb Z_+$, $\phi_\lambda(q)\equiv\sum_{\mu\in O_\lambda} e^{2ia\mu\cdot q}$, where $O_\lambda$ is the orbit of $\lambda$ by the action of the Weyl group: $O_\lambda = \{g(\lambda) \mid g\in G_\Delta\}$. The set of functions $\{\phi_\lambda\}$ has an order $\succ$: $|\lambda|^2 > |\lambda'|^2 \Rightarrow \lambda\succ\lambda'$. The similarity-transformed Hamiltonian $\tilde H$ given by

$$\tilde H = -\frac{\hbar^2}{2}\sum_{j=1}^{r}\frac{\partial^2}{\partial q_j^2} - a\hbar\sum_{\alpha\in\Delta_+} g_\alpha\cot(a\,\alpha\cdot q)\,\alpha\cdot\frac{\partial}{\partial q} \tag{9}$$

is lower-triangular in this basis: $\tilde H\phi_\lambda = 2a^2(\hbar^2\lambda^2 + 2\hbar\,\varrho\cdot\lambda)\phi_\lambda + \sum_{|\lambda'|<|\lambda|} c_{\lambda'}\phi_{\lambda'}$. That is, the eigenvalue is $E = 2a^2(\hbar^2\lambda^2 + 2\hbar\,\varrho\cdot\lambda)$ or $E + E_0 = 2a^2(\hbar\lambda + \varrho)^2$. Again, the coupling constant dependence comes solely from the deformed Weyl vector $\varrho$. This spectrum is a deformation of the spectrum corresponding to the free motion with momentum $2\hbar a\lambda$ in the principal Weyl alcove. The corresponding eigenfunction is called a generalized Jack polynomial or Heckman–Opdam's Jacobi polynomial. For the rank-2 ($r = 2$) root systems, $A_2$, $B_2\cong C_2$ and $I_2(m)$ (the dihedral group), the complete set of eigenfunctions are known explicitly.

Quantum Lax Pair and Quantum Conserved Quantities

The universal Lax pair for C–M systems is given in terms of the representations of the Coxeter (Weyl) group instead of the Lie algebra. The Lax operators without spectral parameter for the rational, trigonometric, and hyperbolic potentials are

$$L(p, q) = p\cdot\hat H + X(q),\qquad X(q) = i\sum_{\alpha\in\Delta_+} g_\alpha(\alpha\cdot\hat H)\,x(\alpha\cdot q)\,\hat s_\alpha \tag{10}$$

$$M(q) = \frac{i}{2}\sum_{\alpha\in\Delta_+} g_\alpha\,\alpha^2\, y(\alpha\cdot q)(\hat s_\alpha - I) \tag{11}$$

where $I$ is the identity operator and $\{\hat s_\alpha \mid \alpha\in\Delta\}$ are the reflection operators of the root system. They act on a set of $\mathbb R^r$ vectors, $\mathcal R = \{\mu^{(k)}\in\mathbb R^r \mid k = 1, \ldots, d\}$, permuting them under the action of the reflection group. The vectors in $\mathcal R$ form a basis for the representation space $V$ of dimension $d$. The matrix elements of the operators $\{\hat s_\alpha \mid \alpha\in\Delta\}$ and $\{\hat H_j \mid j = 1, \ldots, r\}$ are defined as follows: $(\hat s_\alpha)_{\mu\nu} = \delta_{\mu, s_\alpha(\nu)} = \delta_{\nu, s_\alpha(\mu)}$, $(\hat H_j)_{\mu\nu} = \mu_j\delta_{\mu\nu}$, $\alpha\in\Delta$, $\mu, \nu\in\mathcal R$. The form of the functions $x$, $y$ depends on the chosen potential as given in Table 1. Then the equations of motion can be expressed in a matrix form $dL/dt = (i/\hbar)[H, L] = [L, M]$. The operator $M$ satisfies the relation $\sum_{\mu\in\mathcal R} M_{\mu\nu} = \sum_{\nu\in\mathcal R} M_{\mu\nu} = 0$, which is essential for deriving quantum conserved quantities as the total sum (Ts) of all the matrix elements of $L^n$: $K_n = \mathrm{Ts}(L^n)\equiv\sum_{\mu,\nu\in\mathcal R}(L^n)_{\mu\nu}$, $[H, K_n] = 0$, $[K_m, K_n] = 0$, $n, m = 1, \ldots$ In particular, the power 2 is universal to all the root systems, and the quantum Hamiltonian is given by $H\propto K_2 + \text{const}$. As in the affine Toda molecule systems, a Lax pair with a spectral parameter can also be introduced universally for all the above potentials. The Dunkl operators, or the commuting differential–difference operators, are also used to construct quantum conserved quantities for some root systems. This method is essentially equivalent to the universal Lax operator formalism. As the Lax operators do not contain Planck's constant, the quantum Lax pair is essentially of the same form as the classical Lax pair. The difference between the trace (tr) and the total sum (Ts) vanishes as $\hbar\to 0$.

Lax pair for Calogero systems

The quantum Lax pair for the Calogero systems is obtained from the universal Lax pair [10] by replacement $L\to L^{\pm} = L\pm i\omega Q$, $Q\equiv q\cdot\hat H$, which correspond to the creation and annihilation operators of a harmonic oscillator. The equations of motion are rewritten as $dL^{\pm}/dt = (i/\hbar)[H, L^{\pm}] = [L^{\pm}, M]\pm i\omega L^{\pm}$. Then $\mathcal L^{\pm} = L^{\pm}L^{\mp}$ satisfy the Lax type equation $d\mathcal L^{\pm}/dt = (i/\hbar)[H, \mathcal L^{\pm}] = [\mathcal L^{\pm}, M]$, giving rise to conserved quantities $\mathrm{Ts}(\mathcal L^{\pm})^n$, $n = 1, 2, \ldots$ The Calogero Hamiltonian is given by $H\propto\mathrm{Ts}(\mathcal L^{\pm})$.

All the eigenstates of the Calogero Hamiltonian $H$ with eigenvalues $\hbar\omega N$, $N = \sum_{j=1}^{r} n_j f_j$, $n_j\in\mathbb Z_+$, are simply constructed in terms of $L^{\pm}$: $\prod_{j=1}^{r}(B^{+}_{f_j})^{n_j}e^{W/\hbar}$. Here the integers $\{f_j\}$, $j = 1, \ldots, r$, are listed in Table 2. The creation operators $B^{+}_{f_j}$ and the corresponding annihilation operators $B^{-}_{f_j}$ are defined by $B^{\pm}_{f_j} = \mathrm{Ts}(L^{\pm})^{f_j}$, $j = 1, \ldots, r$. They are Hermitian conjugate to each other, $(B^{\pm}_{f_j})^{\dagger} = B^{\mp}_{f_j}$, with respect to the standard Hermitian inner product of the states defined in $PW$. They satisfy commutation relations $[H, B^{\pm}_k] = \pm\hbar k\omega B^{\pm}_k$, $[B^{+}_k, B^{+}_l] = [B^{-}_k, B^{-}_l] = 0$, $k, l\in\{f_j \mid j = 1, \ldots, r\}$. The ground state is annihilated by all the annihilation operators: $B^{-}_{f_j}e^{W/\hbar} = 0$, $j = 1, \ldots, r$.

Further Developments

Rational Potentials: Superintegrability

The systems with the rational potential have a remarkable property: superintegrability. A rational C–M system based on a rank-$r$ root system has $2r - 1$ independent conserved quantities. Roughly speaking, they are of the form $K_n = \mathrm{Ts}(L^n)$, $J_m = \mathrm{Ts}(QL^m)$, $Q\equiv q\cdot\hat H$, among which only $r$ are involutive. At the classical level, superintegrability can be characterized as algebraic linearizability. Since a commutator of any conserved quantities is again a conserved quantity, these conserved quantities form a nonlinear algebra called a quadratic algebra. It can be considered as a finite-dimensional analog of the W-algebra appearing in certain conformal field theory.

Quantum vs Classical Integrability

In C–M systems, the classical and quantum integrability are very closely related. The quantum discrete spectra of the Calogero and the Sutherland systems are, as shown above, expressed in terms of the coupling constant $(\omega, g_\alpha)$ and the exponents or the weights of the corresponding root systems. Namely, they are integral multiples of coupling constants. The corresponding classical systems with the potential $V(q) = (1/2)\sum_{j=1}^{r}(\partial W(q)/\partial q_j)^2$ share many remarkable properties. As is clear from Figure 1, they always have an equilibrium position. The equilibrium positions ($\bar q$) are described by the zeros of a classical orthogonal polynomial: the Hermite polynomial (A-type Calogero), the Laguerre polynomial (B, C, D-type Calogero), the Chebyshev polynomial (A-type Sutherland) and the Jacobi polynomial (B, C, D-type Sutherland). For the exceptional root systems, the corresponding polynomials were not known for a long time. The minimum energy of the classical potential $V(q)$ at the equilibrium is the quantum ground-state energy $\lim_{\hbar\to 0}E_0$ itself. It is also an integral multiple of coupling constants for both Calogero and Sutherland cases. Near a classical equilibrium, a multiparticle dynamical system is always reduced to a system of coupled harmonic oscillators. For Calogero systems, the eigenfrequencies of these small oscillations are, in fact, exactly the same as the quantum eigenfrequencies, $\omega f_j = \omega(1 + e_j)$. For Sutherland systems, the classical eigenfrequencies are the same as the $o(\hbar)$ part of the quantum spectra corresponding to all the fundamental weights $\lambda_j$: $2a^2\varrho\cdot\lambda_j$. Moreover, the eigenvalues of various Lax matrices $L$ and $M$ at the equilibrium take many "interesting values." These results provide ample explicit examples of the general theorem on the quantum–classical correspondence formulated by Loris–Sasaki.

Spin Models

For any root system $\Delta$ and an irreducible representation $\mathcal R$ of the Coxeter (Weyl) group $G_\Delta$, a spin C–M system can be defined for each of the potentials: rational, Calogero, hyperbolic and Sutherland. For each member $\mu$ of $\mathcal R$, to be called a "site," a vector space $V_\mu$ is associated whose element is called a "spin." The dynamical variables are those of the particles $\{q_j, p_j\}$ and the spin exchange operators $\{\hat P_\alpha\}$ ($\alpha\in\Delta$) which exchange the spins at the sites $\mu$ and $s_\alpha(\mu)$. For each $\Delta$ and $\mathcal R$ a spin exchange model can be defined by "freezing" the particle degrees of freedom at the equilibrium point of the corresponding classical potential, $\{q, p\}\to\{\bar q, 0\}$. These are generalization of Haldane–Shastry model for Sutherland potentials and that of Polychronakos for the Calogero potentials. Universal Lax pair operators for both spin C–M systems and spin exchange models are known and conserved quantities are constructed.

Integrable Deformations

C–M systems allow various integrable deformations at the classical and/or quantum levels. One of the well-known deformations is the so-called "relativistic" C–M system or the Ruijsenaars–Schneider (R–S) system. For degenerate potentials, they are integrable both at the classical and quantum levels. The classical quantities of the R–S systems at equilibrium exhibit many interesting properties, too. The equilibrium positions are described by the zeros of certain deformation of the above-mentioned classical polynomials. The frequencies of small oscillations are also related to the exact quantum spectrum, and they can be expressed as coupling constant times the (q-) integers.

Inozemtsev models are classically integrable multiparticle dynamical systems related to C–M systems based on classical root systems (A, B, C, D) with additional $q^6$ (rational) or $\sin^2 2q$ (trigonometric) potentials. Their quantum versions are not exactly solvable in contrast to the C–M or R–S systems, although there is some evidence of their Liouville integrability (without a proper Hilbert space). Quantum Inozemtsev systems can be deformed to be a widest class of quasi-exactly solvable multiparticle dynamical systems. They possess a form of higher-order supersymmetry for which the method of prepotential is also useful.

Appendix: Root Systems

Some rudimentary facts of the root systems and reflections are recapitulated here. The set of roots $\Delta$ is invariant under reflections in the hyperplane perpendicular to each vector in $\Delta$. In other words, $s_\alpha(\beta)\in\Delta$, $\forall\alpha, \beta\in\Delta$, where

$$s_\alpha(\beta) = \beta - (\alpha^{\vee}\cdot\beta)\,\alpha,\qquad \alpha^{\vee}\equiv 2\alpha/|\alpha|^2 \tag{12}$$
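The reflection formula [12] is easy to experiment with numerically. The following Python sketch is a minimal illustration under stated assumptions: NumPy is assumed, the function names are hypothetical, and the $A$-type roots are taken in the standard realization $\pm(e_j - e_k)$ with $\{e_j\}$ an orthonormal basis. It implements $s_\alpha(\beta) = \beta - (\alpha^{\vee}\cdot\beta)\alpha$ and checks that a given finite set of vectors is mapped into itself by all of its reflections.

```python
import itertools

import numpy as np


def reflect(alpha, beta):
    """Reflection of beta in the hyperplane perpendicular to alpha, eq. [12]."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    coroot = 2.0 * alpha / alpha.dot(alpha)  # alpha^vee = 2 alpha / |alpha|^2
    return beta - coroot.dot(beta) * alpha


def a_type_roots(r):
    """Vectors +-(e_j - e_k), 1 <= j < k <= r (assumed standard A-type realization)."""
    e = np.eye(r)
    roots = []
    for j, k in itertools.combinations(range(r), 2):
        roots.append(e[j] - e[k])
        roots.append(e[k] - e[j])
    return roots


def is_closed_under_reflections(roots, decimals=9):
    """True if s_alpha(beta) stays in the set for every pair alpha, beta."""
    def key(v):
        # Rounded tuple so that floating-point noise does not spoil lookups.
        return tuple(float(x) for x in np.round(v, decimals))

    root_set = {key(v) for v in roots}
    return all(key(reflect(a, b)) in root_set for a in roots for b in roots)


if __name__ == "__main__":
    roots = a_type_roots(3)
    print("number of roots:", len(roots))
    print("closed under reflections:", is_closed_under_reflections(roots))
    print("closed after dropping one root:", is_closed_under_reflections(roots[:-1]))
```

Each reflection is an involution that preserves lengths, so applying `reflect(alpha, ...)` twice returns the original vector up to rounding.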
The set of reflections $\{s_\alpha \mid \alpha\in\Delta\}$ generates a group $G_\Delta$, known as a Coxeter group, or finite reflection group. The orbit of $\beta\in\Delta$ is the set of root vectors resulting from the action of the Coxeter group on it. The set of positive roots $\Delta_+$ may be defined in terms of a vector $U\in\mathbb R^r$, with $\alpha\cdot U\neq 0$, $\forall\alpha\in\Delta$, as the roots $\alpha\in\Delta$ such that $\alpha\cdot U > 0$. Given $\Delta_+$, there is a unique set of $r$ simple roots $\Pi = \{\alpha_j \mid j = 1, \ldots, r\}$ defined such that they span the root space and the coefficients $\{a_j\}$ in $\beta = \sum_{j=1}^{r} a_j\alpha_j$ for $\beta\in\Delta_+$ are all non-negative. The highest root $\alpha_h$, for which $\sum_{j=1}^{r} a_j$ is maximal, is then also determined uniquely. The subset of reflections $\{s_\alpha \mid \alpha\in\Pi\}$ in fact generates the Coxeter group $G_\Delta$. The products of $s_\alpha$, with $\alpha\in\Pi$, are subject solely to the relations $(s_\alpha s_\beta)^{m(\alpha,\beta)} = 1$, $\alpha, \beta\in\Pi$. The interpretation is that $s_\alpha s_\beta$ is a rotation in some plane by $2\pi/m(\alpha,\beta)$. The set of positive integers $m(\alpha,\beta)$ (with $m(\alpha,\alpha) = 1$, $\forall\alpha\in\Pi$) uniquely specifies the Coxeter group. The weight lattice $P(\Delta)$ is defined as the $\mathbb Z$-span of the fundamental weights $\{\lambda_j\}$, defined by $\alpha_j^{\vee}\cdot\lambda_k = \delta_{jk}$, $\forall\alpha_j\in\Pi$.

The root systems for finite reflection groups may be divided into two types: crystallographic and noncrystallographic. Crystallographic root systems satisfy the additional condition $\alpha^{\vee}\cdot\beta\in\mathbb Z$, $\forall\alpha, \beta\in\Delta$. The remaining noncrystallographic root systems are $H_3$, $H_4$, whose Coxeter groups are the symmetry groups of the icosahedron and four-dimensional 600-cell, respectively, and the dihedral group of order $2m$, $\{I_2(m) \mid m\geq 4\}$.

The explicit examples of the classical root systems, that is, A, B, C, and D are given below. For the exceptional and noncrystallographic root systems, the reader is referred to Humphreys' book. In all cases, $\{e_j\}$ denotes an orthonormal basis in $\mathbb R^r$.

1. $A_{r-1}$: This root system is related with the Lie algebra $su(r)$.

$$\Delta = \bigcup_{1\leq j<k\leq r}\{\pm(e_j - e_k)\},\qquad \Pi = \bigcup_{j=1}^{r-1}\{e_j - e_{j+1}\} \tag{13}$$

2. $B_r$: This root system is associated with Lie algebra $so(2r+1)$. The long roots have $(\text{length})^2 = 2$ and short roots have $(\text{length})^2 = 1$:

$$\Delta = \bigcup_{1\leq j<k\leq r}\{\pm e_j\pm e_k\}\ \cup\ \bigcup_{j=1}^{r}\{\pm e_j\},\qquad \Pi = \bigcup_{j=1}^{r-1}\{e_j - e_{j+1}\}\cup\{e_r\} \tag{14}$$

3. $C_r$: This root system is associated with Lie algebra $sp(2r)$. The long roots have $(\text{length})^2 = 4$ and short roots have $(\text{length})^2 = 2$:

$$\Delta = \bigcup_{1\leq j<k\leq r}\{\pm e_j\pm e_k\}\ \cup\ \bigcup_{j=1}^{r}\{\pm 2e_j\},\qquad \Pi = \bigcup_{j=1}^{r-1}\{e_j - e_{j+1}\}\cup\{2e_r\} \tag{15}$$

4. $D_r$: This root system is associated with Lie algebra $so(2r)$:

$$\Delta = \bigcup_{1\leq j<k\leq r}\{\pm e_j\pm e_k\},\qquad \Pi = \bigcup_{j=1}^{r-1}\{e_j - e_{j+1}\}\cup\{e_{r-1} + e_r\} \tag{16}$$

See also: Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Dynamical Systems in Mathematical Physics: An Illustration from Water Waves; Functional Equations and Integrable Systems; Integrable Discrete Systems; Integrable Systems in Random Matrix Theory; Integrable Systems: Overview; Isochronous Systems; Toda Lattices.

Further Reading

Calogero F (1971) Solution of the one-dimensional N-body problem with quadratic and/or inversely quadratic pair potentials. Journal of Mathematical Physics 12: 419–436.
Calogero F (2001) Classical Many-Body Problems Amenable to Exact Treatments. New York: Springer.
Dunkl C (2001) Orthogonal Polynomials of Several Variables. Cambridge: Cambridge University Press.
Humphreys JE (1990) Reflection Groups and Coxeter Groups. Cambridge: Cambridge University Press.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials, 2nd edn. Oxford University Press.
Moser J (1975) Three integrable Hamiltonian systems connected with isospectral deformations. Advances in Mathematics 16: 197–220.
Olshanetsky MA and Perelomov AM (1983) Quantum integrable systems related to Lie algebras. Physics Reports 94: 313–404.
Ruijsenaars SNM (1999) Systems of Calogero–Moser Type. CRM Series in Mathematical Physics 1: 251–352. Springer.
Sasaki R (2001) Quantum Calogero–Moser Models: Complete Integrability for All the Root Systems. Proceedings Quantum Integrable Models and Their Applications, 195–240. World Scientific.
Sasaki R (2002) Quantum vs Classical Calogero–Moser Systems. NATO ARW Proceedings, Elba, Italy.
Stanley R (1989) Some combinatorial properties of Jack symmetric functions. Advances in Mathematics 77: 76–115.
Sutherland B (1972) Exact results for a quantum many-body problem in one dimension. II. Physical Review A 5: 1372–1376.
130 Quantum Central-Limit Theorems
operator, and to specify the algebraic character of the set of all of these.

Based on this quantum central-limit theorem, one notes that locally different microscopic observables do not always yield different fluctuation operators. Hence the central-limit theorem realizes a well-defined procedure of coarse graining, or reduction, which is handled by the mathematical notion of an equivalence relation on the microscopic observables yielding the same fluctuation operator.

In the following sections we discuss the preliminaries and the basic results about normal and abnormal fluctuations. Three model-independent applications are also discussed. In this review, we omit the properties of the so-called modulated fluctuations.

One should remark that we discuss only fluctuations in space. One can also consider timelike fluctuations. The theory of fluctuation operators for these has not been explicitly worked out so far. However, it is clear that for normal fluctuations the clustering properties of the time correlation functions will play a crucial role. On the other hand, typical properties of the structure of this fluctuation algebra may come up.

Another point which one has to stress is that all systems treated in this review are quasilocal systems. Other systems, for example, fermion systems, are not treated. But, in particular, fermion systems share many properties of quasilocality, and many of the results mentioned hold true also for fermion systems.

Preliminaries

Quantum Lattice Systems

Although all results we review can be extended to continuous or more general systems, modulo some technicalities, we limit ourselves to quasilocal quantum dynamical lattice systems.

We consider the quasilocal algebra built on a $\nu$-dimensional lattice $\mathbb{Z}^\nu$. Let $D(\mathbb{Z}^\nu)$ be the directed set of finite subsets of $\mathbb{Z}^\nu$ where the direction is the inclusion. With each point $x \in \mathbb{Z}^\nu$ we associate an algebra ($C^*$- or von Neumann algebra) $\mathcal{A}_x$, all copies of an algebra $\mathcal{A}$. For all $\Lambda \in D(\mathbb{Z}^\nu)$, the tensor product $\bigotimes_{x \in \Lambda} \mathcal{A}_x$ is denoted by $\mathcal{A}_\Lambda$. We take $\mathcal{A}$ to be nuclear; then there exists a unique $C^*$-norm on $\mathcal{A}_\Lambda$. Every copy $\mathcal{A}_x$ is naturally embedded in $\mathcal{A}_\Lambda$. The family $\{\mathcal{A}_\Lambda\}_{\Lambda \in D(\mathbb{Z}^\nu)}$ has the usual relations of locality and isotony:

$$[\mathcal{A}_{\Lambda_1}, \mathcal{A}_{\Lambda_2}] = 0 \quad \text{if } \Lambda_1 \cap \Lambda_2 = \emptyset$$ [1]

$$\mathcal{A}_{\Lambda_1} \subseteq \mathcal{A}_{\Lambda_2} \quad \text{if } \Lambda_1 \subseteq \Lambda_2$$ [2]

Denote by $\mathcal{A}_L$ all local observables, that is,

$$\mathcal{A}_L = \bigcup_{\Lambda} \mathcal{A}_\Lambda$$

This algebra is naturally equipped with a $C^*$-norm $\|\cdot\|$ and its closure

$$\mathcal{B} = \overline{\mathcal{A}_L}$$

is called a quasilocal $C^*$-algebra and considered as the microscopic algebra of observables of the system. Typical examples are spin systems where $\mathcal{A} = M_n$ is the $n \times n$ complex matrix algebra. In this case, every state $\omega$ of $\mathcal{B}$ is then locally normal, that is, there exists a family of density matrices $\{\rho_\Lambda \mid \Lambda \in D(\mathbb{Z}^\nu)\}$ such that

$$\omega(A) = \operatorname{tr} \rho_\Lambda A \quad \text{for all } A \in \mathcal{A}_\Lambda$$

An important group of $*$-automorphisms of $\mathcal{B}$ is the group of space translations $\{\tau_x, x \in \mathbb{Z}^\nu\}$:

$$\tau_x : A_y \in \mathcal{A}_y \to \tau_x A_y = A_{x+y} \in \mathcal{A}_{y+x}$$

for all $A \in \mathcal{A}$.

Note that the quasilocal algebra $\mathcal{B}$ is asymptotically abelian for space translations: that is, for all $A, B \in \mathcal{B}$

$$\lim_{|x| \to \infty} \|[A, \tau_x B]\| = 0$$

A state $\omega$ of $\mathcal{B}$ represents a physical state of the system, assigning to every observable $A$ its expectation value $\omega(A)$. Therefore, this setting can be viewed as the quantum analog of the classical probabilistic setting. Sequences of random variables or observables can be constructed by considering an observable and its translates, that is, $(\tau_x(A))_{x \in \mathbb{Z}^\nu}$ is a noncommutative random field. If a state $\omega$ is translation invariant, that is, $\omega \circ \tau_x = \omega$ for all $x$, then all $\tau_x(A)$ are identically distributed random variables. The mixing property of the random field is then expressed by the spatial correlations tending to zero:

$$\omega(\tau_x(A)\,\tau_y(B)) - \omega(\tau_x(A))\,\omega(\tau_y(B)) \to 0$$ [3]

if $|x - y| \to \infty$.

One of the basic limit theorems of probability theory is the weak law of large numbers. In this noncommutative setting the law of large numbers is translated into the problem of the convergence of space averages of an observable $A \in \mathcal{B}$. A first result was given by the mean ergodic theorem of von Neumann (1929). In Brattelli and Robinson (1979, 2002) one finds the following theorem: if the state $\omega$ is space translation invariant and mixing (see [3]) then for all $A$, $B$, and $C$ in $\mathcal{B}$

$$\lim_{\Lambda \to \mathbb{Z}^\nu} \omega\Big( A \Big( \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x(B) \Big) C \Big) = \omega(AC)\,\omega(B)$$ [4]
That is, in the GNS (Gelfand–Naimark–Segal) representation of the state $\omega$, the sequence $S_\Lambda(B) = \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x B$ converges weakly to a multiple of the identity: $S(B) \equiv \omega(B)\mathbf{1}$. This theorem, called the mean ergodic theorem, characterizes the class of states yielding a weak law of large numbers. Clearly, these limits $\{S(A) \mid A \in \mathcal{B}\}$ form a trivial abelian algebra of macroscopic observables.

Now we go a step further and consider space fluctuations. Define the local fluctuation of an observable $A$ in a homogeneous (spatially invariant) state $\omega$ by

$$F_\Lambda(A) = \frac{1}{|\Lambda|^{1/2}} \sum_{x \in \Lambda} (\tau_x A - \omega(A))$$ [5]

The problem is to give a rigorous meaning to $\lim_\Lambda F_\Lambda(A)$ for $\Lambda$ tending to $\mathbb{Z}^\nu$ in the sense of extending boxes. When does such a limit exist? What are the properties of the fluctuations or the limits $F(A) = \lim_\Lambda F_\Lambda(A)$, etc.? Again, the $F(A)$ are macroscopic variables of the microsystem.

Already we remark the following: if $A$, $B$ are strictly local elements, $A, B \in \mathcal{A}_L$, then

$$\sum_{y \in \mathbb{Z}^\nu} [A, \tau_y B] \in \mathcal{A}_L$$

and an easy computation yields, by [4],

$$\text{weak}\lim_\Lambda\, [F_\Lambda(A), F_\Lambda(B)] = \text{weak}\lim_\Lambda \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x \Big( \sum_{y \in \Lambda} [A, \tau_{y-x} B] \Big)$$
$$= \text{weak}\lim_\Lambda \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \tau_x \Big( \sum_{y \in \mathbb{Z}^\nu} [A, \tau_y B] \Big) = \omega\Big( \sum_{y \in \mathbb{Z}^\nu} [A, \tau_y B] \Big) \mathbf{1} \equiv i\sigma(A, B)\mathbf{1}$$

that is, if the $F(A)$ and $F(B)$ limits do exist, then

$$[F(A), F(B)] = i\sigma(A, B)\mathbf{1}$$ [6]

This property indicates that fluctuations should have the same commutation relations as boson fields. If fluctuations can be characterized as macroscopic observables, they must satisfy the canonical commutation relations (CCRs). Therefore, in the next section we introduce the essentials on CCR representations.

CCR Representations

We present the abstract Weyl CCR $C^*$-algebra. More details can be found in Brattelli and Robinson (1979, 2002) and in particular in Manuceau et al. (1973), where the case of a real test function space $(H, \sigma)$ with a possibly degenerate symplectic form $\sigma$ is treated. Hence, $H$ is a real vector space and $\sigma$ a bilinear, antisymmetric form on $H$.

Denote by $W(H, \sigma)$ the complex vector space generated by the functions $W(f)$, $f \in H$, defined by

$$W(f) : H \to \mathbb{C} : g \to W(f)g = \begin{cases} 0 & \text{if } f \ne g \\ 1 & \text{if } f = g \end{cases}$$

$W(H, \sigma)$ becomes an algebra with unit $W(0)$ for the product

$$W(f)W(g) = W(f + g)\, e^{-(i/2)\sigma(f, g)}, \quad f, g \in H$$

and a $*$-algebra for the involution

$$W(f) \to W(f)^* = W(-f)$$

It becomes a $C^*$-algebra $C^*(H, \sigma)$ following the construction of Verbeure and Zagrebnov (1992).

A linear functional $\omega$ of a $C^*$-algebra $C^*(H, \sigma)$ is called a state if $\omega(I) = 1$ and $\omega(A^*A) \ge 0$ for all $A \in C^*(H, \sigma)$ and $I = W(0)$. Every state gives rise to a representation through the GNS construction (Brattelli and Robinson 1979, 2002). In particular, $\omega$ is a state if for any choice of $A = \sum_j c_j W(f_j)$ we have, expanding $\omega(A^*A)$ with the product rule above,

$$\sum_{jk} c_j \bar{c}_k\, \omega(W(f_j - f_k))\, e^{-(i/2)\sigma(f_j, f_k)} \ge 0, \qquad \omega(W(0)) = 1$$

A remark about the special case that $\sigma$ is degenerate is in order. Denote by $H_0$ the kernel of $\sigma$:

$$H_0 = \{ f \in H \mid \sigma(f, g) = 0 \text{ for all } g \in H \}$$

If $H = H_0 \oplus H_1$ with $\sigma_1$ a nondegenerate symplectic form on $H_1$ and $\sigma_1$ equal to the restriction of $\sigma$ to $H_1$, we have that $C^*(H, \sigma)$ is a tensor product:

$$C^*(H, \sigma) = C^*(H_0, 0) \otimes C^*(H_1, \sigma_1)$$

Note that $C^*(H_0, 0)$ is abelian and that each positive-definite normalized functional $\varphi$,

$$\varphi : h \in H_0 \to \varphi(W(h))$$

defines a state $\omega_\varphi(W(h)) = \varphi(W(h))$ on $C^*(H_0, 0)$.

Let $\chi$ be any character of the abelian additive group $H$; then the map $\tau_\chi$,

$$\tau_\chi W(f) = \chi(f) W(f)$$

extends to a $*$-automorphism of $C^*(H, \sigma)$. Let $s$ be a positive symmetric bilinear form on $H$ such that for all $f, g \in H$:

$$\tfrac{1}{4} |\sigma(f, g)|^2 \le s(f, f)\, s(g, g)$$ [7]
and let $\omega_{s,\chi}$ be the linear functional on $C^*(H, \sigma)$ given by

$$\omega_{s,\chi}(W(h)) = \chi(h)\, e^{-(1/2) s(h, h)}$$ [8]

then it is straightforward (Brattelli and Robinson 1979, 2002) to check that $\omega_{s,\chi}$ is a state on $C^*(H, \sigma)$. All states of the type [8] are called quasifree states on the CCR algebra $C^*(H, \sigma)$.

A state $\omega$ of $C^*(H, \sigma)$ is called a regular state if, for all $f, g \in H$, the map $\lambda \in \mathbb{R} \to \omega(W(\lambda f + g))$ is continuous. The regularity property of a state yields the existence of a Bose field as follows. Let $(\mathcal{H}, \pi, \Omega)$ be the GNS representation (Brattelli and Robinson 1979, 2002) of the state $\omega$; then the regularity of $\omega$ implies that there exists a real linear map $b : H \to L(\mathcal{H})$ (linear operators on $\mathcal{H}$) such that $\forall f \in H$: $b(f)^* = b(f)$ and

$$\pi(W(f)) = \exp(i b(f))$$

The map $b$ is called the Bose field satisfying the Bose field commutation relations:

$$[b(f), b(g)] = i\sigma(f, g)$$ [9]

Note that the Bose fields are state dependent. Note also already that if $\chi$ is a continuous character of $H$, then any quasifree state [8] is a regular state guaranteeing the existence of a Bose field.

Normal Fluctuations

systems with a quasilocal structure (see the section "Quantum lattice systems") and for technical simplicity we assume that the local $C^*$-algebras $\mathcal{A}_x$, $x \in \mathbb{Z}^\nu$, are copies of the matrix algebra $M_n(\mathbb{C})$ of $n \times n$ complex matrices. Most of the results stated can be extended to the case where $\mathcal{A}_x$ is a general $C^*$-algebra (Goderis et al. 1989, 1990, Goderis and Vets 1989).

We consider a physical system $(\mathcal{B}, \omega)$ where $\omega$ is a translation-invariant state of $\mathcal{B}$, that is, $\omega \circ \tau_x = \omega$ for all $x \in \mathbb{Z}^\nu$. Later on we extend the situation to a $C^*$-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ and analyze the properties of the dynamics $\alpha_t$ under the central limit.

For any local $A$ we introduced its local fluctuation in the state $\omega$ of the system:

$$F_\Lambda(A) = \frac{1}{|\Lambda|^{1/2}} \sum_{x \in \Lambda} (\tau_x A - \omega(A))$$ [10]

The main problem is to give a rigorous mathematical meaning to the limits

$$\lim_{\Lambda \to \infty} F_\Lambda(A) \equiv F(A)$$

where the limit is taken for any increasing $\mathbb{Z}^\nu$-absorbing sequence $\{\Lambda\}$ of finite volumes of $\mathbb{Z}^\nu$. The limits $F(A)$ are called the macroscopic fluctuation operators of the system $(\mathcal{B}, \omega)$.

Already earlier work (Cushen and Hudson 1971, Sewell 1986) suggested that the fluctuations behave like bosons. We complete this idea by proving that one gets a well-defined representation of a CCR $C^*$-algebra of fluctuations uniquely defined by the original system $(\mathcal{B}, \omega)$.

Denote by $\mathcal{A}_{L,sa}$ and $\mathcal{B}_{sa}$ the real vector spaces of the self-adjoint elements of $\mathcal{A}_L$ and $\mathcal{B}$, respectively.

Definition 1 An observable $A \in \mathcal{B}_{sa}$ satisfies the central-limit theorem if

(i) $\lim_\Lambda \omega(F_\Lambda(A)^2) \equiv s_\omega(A, A)$ exists and is finite, and
(ii) $\lim_\Lambda \omega(e^{itF_\Lambda(A)}) = e^{-(t^2/2)\, s_\omega(A, A)}$ for all $t \in \mathbb{R}$.

Clearly, our definition coincides with the notion in terms of characteristic functions, for classical systems ($\mathcal{A}$ abelian) equivalent with the notion of convergence in distribution. For quantum systems, there does not exist a standard notion of "convergence in distribution." Only the concept of expectations is relevant. This does not exclude the notion of central-limit theorem in terms of the moments, which is the analog of the moment problem (Giri and von Waldenfels 1978).

Definition 2 The system $(\mathcal{B}, \omega)$ is said to have normal fluctuations if $\omega$ is translation invariant and if

(ii) the central-limit theorem holds for all $A \in \mathcal{A}_{L,sa}$.

Note that (i) implies that the state $\omega$ is mixing for space translations. Also by (i), one can define a sesquilinear form on $\mathcal{A}_L$:

$$\langle A, B \rangle_\omega = \lim_\Lambda \omega(F_\Lambda(A^*) F_\Lambda(B)) = \sum_{x \in \mathbb{Z}^\nu} (\omega(A^* \tau_x B) - \omega(A^*)\omega(B))$$

and denote

$$s_\omega(A, B) = \operatorname{Re} \langle A, B \rangle_\omega$$
$$\sigma_\omega(A, B) = 2 \operatorname{Im} \langle A, B \rangle_\omega$$

For $A, B \in \mathcal{A}_{L,sa}$ one has

$$\sigma_\omega(A, B) = -i \sum_{x \in \mathbb{Z}^\nu} \omega([A, \tau_x B])$$ [11]

$$s_\omega(A, A) = \langle A, A \rangle_\omega$$ [12]

Clearly, $(\mathcal{A}_{L,sa}, \sigma_\omega)$ is a symplectic space and $s_\omega$ a non-negative symmetric bilinear form on $\mathcal{A}_{L,sa}$.
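The quasifree functional [8] and the positivity requirement for states on the Weyl algebra can be checked numerically in a small example. The following Python sketch uses assumptions that are not part of the article: the test function space is taken to be $\mathbb{R}^2$, with $\sigma(f, g) = f_1 g_2 - f_2 g_1$, $s(f, g) = \frac{1}{2} f \cdot g$, and the trivial character $\chi = 1$. All function names are hypothetical. With these choices condition [7] holds, and the sketch evaluates the quadratic form $\sum_{jk} c_j \bar{c}_k\, \omega(W(f_j - f_k))\, e^{-(i/2)\sigma(f_j, f_k)}$, which should be real and non-negative.

```python
import cmath
import math
import random


def sigma(f, g):
    """Symplectic form on R^2 (an assumed example, not taken from the article)."""
    return f[0] * g[1] - f[1] * g[0]


def s(f, g):
    """Assumed symmetric form s(f, g) = (1/2) f.g; it satisfies condition [7]."""
    return 0.5 * (f[0] * g[0] + f[1] * g[1])


def condition_7(f, g):
    """Check (1/4) |sigma(f, g)|^2 <= s(f, f) s(g, g), up to rounding."""
    return 0.25 * sigma(f, g) ** 2 <= s(f, f) * s(g, g) + 1e-12


def omega(h):
    """Quasifree functional [8] on W(h), with trivial character chi = 1."""
    return math.exp(-0.5 * s(h, h))


def positivity_form(cs, fs):
    """Evaluate sum_{jk} c_j conj(c_k) omega(W(f_j - f_k)) exp(-(i/2) sigma(f_j, f_k))."""
    total = 0j
    for cj, fj in zip(cs, fs):
        for ck, fk in zip(cs, fs):
            diff = (fj[0] - fk[0], fj[1] - fk[1])
            phase = cmath.exp(-0.5j * sigma(fj, fk))
            total += cj * ck.conjugate() * omega(diff) * phase
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    fs = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(4)]
    cs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]
    print(positivity_form(cs, fs))
```

A random sample of coefficients and test functions is of course no proof of positivity; the sketch only illustrates how [7] and [8] fit together in one concrete case.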
Following the discussion in the section "CCR representations" we get a natural CCR $C^*$-algebra $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ defined on this symplectic space. The following theorem is an essential step in the construction of a macroscopic physical system of fluctuations of the microsystem $(\mathcal{B}, \omega)$.

Theorem 1 If the system $(\mathcal{B}, \omega)$ has normal fluctuations, then the limits $\{\lim_\Lambda \omega(e^{iF_\Lambda(A)}) = \exp(-(1/2)\, s_\omega(A, A)),\ A \in \mathcal{A}_L\}$ define a quasifree state $\tilde{\omega}$ on the CCR $C^*$-algebra $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ by

$$\tilde{\omega}(W(A)) = \exp\big(-\tfrac{1}{2}\, s_\omega(A, A)\big)$$

Proof The proof is clear from the definition [8] if one can prove that the positivity condition [7] holds. But the latter follows readily from

$$\tfrac{1}{4} |\sigma_\omega(A, B)|^2 = \lim_\Lambda |\operatorname{Im} \omega(F_\Lambda(A) F_\Lambda(B))|^2 \le \lim_\Lambda \omega(F_\Lambda(A)^2)\, \omega(F_\Lambda(B)^2) = s_\omega(A, A)\, s_\omega(B, B)$$

by the Schwarz inequality. □

This theorem indicates that the quantum-mechanical alternative for (classical) Gaussian measures are quasifree states on CCR algebras. However, the following basic question arises: is it possible to take the limits of products of the form

$$\lim_\Lambda \omega\big(e^{iF_\Lambda(A)} e^{iF_\Lambda(B)} \cdots\big)$$

and, if they exist, do they preserve the CCR structure? Clearly, this is a typical noncommutative problem.

Using the following general bounds: for $C = C^*$ and $D = D^*$ norm-bounded operators one has

$$\|e^{i(C+D)} - e^{iC}\| \le \|D\|$$
$$\|[e^{iC}, e^{iD}]\| \le \|[C, D]\|$$
$$\|e^{i(C+D)} - e^{iC} e^{iD}\| \le \tfrac{1}{2} \|[C, D]\|$$

and by the expansion of the exponential function one proves easily that

$$\lim_\Lambda \big\| e^{iF_\Lambda(A)} e^{iF_\Lambda(B)} - e^{i(F_\Lambda(A) + F_\Lambda(B))}\, e^{-(1/2)[F_\Lambda(A), F_\Lambda(B)]} \big\| = 0$$ [13]

if $A$ and $B$ are one-point observables, that is, if $A, B \in \mathcal{A}_{\{0\}}$. For general local elements the proof is somewhat more technical and can be based on a Bernstein-like argument (for details see Goderis and Vets (1989)). The property [13] can be seen as a Baker–Campbell–Hausdorff formula for fluctuations.

From [13], the mean ergodic theorem, and Theorem 1 we get:

Theorem 2 If the system $(\mathcal{B}, \omega)$ has normal fluctuations then for $A, B \in \mathcal{A}_{L,sa}$:

$$\lim_\Lambda \omega\big(e^{iF_\Lambda(A)} e^{iF_\Lambda(B)}\big) = \exp\Big\{ -\tfrac{1}{2}\, s_\omega(A + B, A + B) - \tfrac{i}{2}\, \sigma_\omega(A, B) \Big\} = \tilde{\omega}(W(A) W(B))$$

with $\tilde{\omega}$ a quasifree state on the CCR algebra $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$.

Theorems 1 and 2 describe completely the topological and analytical aspects of the quantum central-limit theorem under the condition of normal fluctuations (Definition 2). In fact, the quantum central limit yields, for every microphysical system $(\mathcal{B}, \omega)$, a macrophysical system $(C^*(\mathcal{A}_{L,sa}, \sigma_\omega), \tilde{\omega})$ defined by the CCR $C^*$-algebra of fluctuation observables $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ in the representation defined by the quasifree state $\tilde{\omega}$. As the state $\tilde{\omega}$ is a quasifree state, it is a regular state, that is, the map $\lambda \in \mathbb{R} \to \tilde{\omega}(W(\lambda A + B))$ is continuous. From the section "CCR representations" we know that this regularity property yields the existence of a Bose field, that is, there exists a real linear map

$$F : A \in \mathcal{A}_{L,sa} \to F(A)$$

where $F(A)$ is a self-adjoint operator on the GNS representation space $\tilde{\mathcal{H}}$ of $\tilde{\omega}$, such that for all $A, B \in \mathcal{A}_{L,sa}$:

$$[F(A), F(B)] = i\sigma_\omega(A, B)$$

Moreover, if one has a complex structure $J$ on $(\mathcal{A}_{L,sa}, \sigma_\omega)$ such that $J^2 = -1$ and for all $A, B \in \mathcal{A}_{L,sa}$:

$$\sigma_\omega(JA, B) = -\sigma_\omega(A, JB)$$
$$\sigma_\omega(A, JA) > 0$$

then one defines the boson creation and annihilation operators

$$F^\pm(A) = \frac{1}{\sqrt{2}} (F(A) \mp iF(JA))$$

satisfying the usual boson commutation relations

$$[F^-(A), F^+(B)] = \sigma_\omega(A, JB) + i\sigma_\omega(A, B)$$

Finally, it is straightforward, nevertheless important, to remark that Theorems 1 and 2 hold true if the linear space of local observables $\mathcal{A}_{L,sa}$ is replaced by any of its subspaces. Some of them can have greater physical importance than others. This means that the quantum central-limit theorems can realize several macrophysical systems of fluctuations. But all of them are Bose field systems.
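The three norm bounds for self-adjoint bounded operators $C$ and $D$ that lead to the Baker–Campbell–Hausdorff-type property [13] can be sanity-checked on small matrices. The sketch below is only an illustration under stated assumptions: it restricts to random $2 \times 2$ Hermitian matrices, uses the closed form $e^{iH} = e^{ia}(\cos r\, \mathbf{1} + i \frac{\sin r}{r} T)$ for $H = a\mathbf{1} + T$ with $T$ traceless and $T^2 = r^2 \mathbf{1}$, and all helper names are hypothetical. It relies only on the Python standard library.

```python
import cmath
import math
import random


def mul(X, Y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]


def dagger(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]


def commutator(X, Y):
    return sub(mul(X, Y), mul(Y, X))


def op_norm(X):
    """Operator norm: square root of the largest eigenvalue of X^* X (2x2 closed form)."""
    P = mul(dagger(X), X)
    tr = (P[0][0] + P[1][1]).real
    det = (P[0][0] * P[1][1] - P[0][1] * P[1][0]).real
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt(max((tr + math.sqrt(disc)) / 2.0, 0.0))


def exp_i(H):
    """exp(iH) for a Hermitian 2x2 matrix H = a*1 + T, where T^2 = r^2 * 1."""
    a = (H[0][0] + H[1][1]).real / 2.0
    T = [[H[0][0] - a, H[0][1]], [H[1][0], H[1][1] - a]]
    r = math.sqrt(abs(T[0][0]) ** 2 + abs(T[0][1]) ** 2)
    sinc = math.sin(r) / r if r > 1e-12 else 1.0
    phase = cmath.exp(1j * a)
    return [[phase * ((math.cos(r) if i == j else 0.0) + 1j * sinc * T[i][j])
             for j in range(2)] for i in range(2)]


def random_hermitian(rng, scale=2.0):
    a, d = rng.uniform(-scale, scale), rng.uniform(-scale, scale)
    b = complex(rng.uniform(-scale, scale), rng.uniform(-scale, scale))
    return [[complex(a), b], [b.conjugate(), complex(d)]]


def bounds(C, D):
    """Return (lhs, rhs) pairs for the three inequalities, in the order stated in the text."""
    eC, eD, eCD = exp_i(C), exp_i(D), exp_i(add(C, D))
    comm = op_norm(commutator(C, D))
    return [
        (op_norm(sub(eCD, eC)), op_norm(D)),
        (op_norm(commutator(eC, eD)), comm),
        (op_norm(sub(eCD, mul(eC, eD))), 0.5 * comm),
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    for lhs, rhs in bounds(random_hermitian(rng), random_hermitian(rng)):
        print(f"{lhs:.6f} <= {rhs:.6f}")
```

The third bound is the one that controls the error of replacing $e^{i(C+D)}$ by $e^{iC} e^{iD}$; for fluctuation operators the commutator stays bounded in the limit, which is why the correction factor $e^{-(1/2)[F_\Lambda(A), F_\Lambda(B)]}$ appears in [13].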
It is also important to remark that these results end up in giving a probabilistic canonical basis of the canonical commutation relations.

Now we analyze the notion of coarse graining due to the quantum central limit. Consider on $\mathcal{A}_L$ the sesquilinear form (see [11], [12]) again

$$\langle A, B \rangle_\omega = \sum_{x \in \mathbb{Z}^\nu} (\omega(A^* \tau_x B) - \omega(A^*)\omega(B)) = s_\omega(A, B) + \tfrac{i}{2}\, \sigma_\omega(A, B)$$ [14]

This form defines a topology on $\mathcal{A}_L$ which is not comparable with the operator topologies induced by $\omega$. In fact, this form is not closable in the weak, strong, ultraweak, or ultrastrong operator topologies.

We call $A$ and $B$ in $\mathcal{A}_L$ equivalent, denoted by $A \sim B$, if $\langle A - B, A - B \rangle_\omega = 0$. Clearly, this defines an equivalence relation on $\mathcal{A}_L$. The property of coarse graining is mathematically characterized by the following: for all $A, B \in \mathcal{A}_{L,sa}$ the relation $A \sim B$ is equivalent with $F(A) = F(B)$. Suppose first that $F(A) = F(B)$, then

$$[W(A), W(B)] = 0$$

hence $\sigma_\omega(A, B) = 0$. Therefore, from Theorem 1:

$$1 = \tilde{\omega}(W(A) W(B)^*) = \tilde{\omega}(W(A) W(-B)) = \tilde{\omega}(W(A - B)) = \exp\big(-\tfrac{1}{2}\, s_\omega(A - B, A - B)\big)$$

and from [12] and [14]: $\langle A - B, A - B \rangle_\omega = 0$. The converse is equally straightforward.

From this property, it follows immediately that, for example, the action of the translation group is trivial or that $F(\tau_x A) = F(A)$ for all $x \in \mathbb{Z}^\nu$. Therefore, the map $F : \mathcal{A}_{L,sa} \to C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ is not injective. This expresses the physical phenomenon of coarse graining and gives a mathematical meaning to the fluctuations being macroscopic observables.

In the above, we have constructed the new macroscopic physical system of quantum fluctuations for any microsystem with the property of normal fluctuations (see Definition 2). The main problem remains: when does the microsystem have normal fluctuations? We end this section with the formulation of a general sufficient clustering condition for the microstate $\omega$ in order that the microsystem $(\mathcal{B}, \omega)$ has normal fluctuations.

Let $\Lambda, \Lambda' \in D(\mathbb{Z}^\nu)$ and $\omega$ a translation invariant state, denote

$$\alpha_\omega(\Lambda, \Lambda') = \sup_{\substack{A \in \mathcal{A}_\Lambda,\ \|A\| = 1 \\ B \in \mathcal{A}_{\Lambda'},\ \|B\| = 1}} |\omega(AB) - \omega(A)\omega(B)|$$

The cluster function $\alpha^\omega_N(d)$ is defined by

$$\alpha^\omega_N(d) = \sup \{ \alpha_\omega(\Lambda, \Lambda') : d(\Lambda, \Lambda') \ge d \text{ and } \max(|\Lambda|, |\Lambda'|) \le N \}$$

where $N, d \in \mathbb{R}^+$ and $d(\Lambda, \Lambda')$ is the Euclidean distance between $\Lambda$ and $\Lambda'$. It is obvious that

$$\alpha^\omega_N(d) \ge \alpha^\omega_N(d') \quad \text{if } d \le d'$$
$$\alpha^\omega_N(d) \le \alpha^\omega_{N'}(d) \quad \text{if } N \le N'$$

The clustering condition is expressed by the following scaling law:

$$\exists \delta > 0 : \lim_{N \to \infty} N^{1/2}\, \alpha^\omega_N\big(N^{1/(2\nu) - \delta}\big) = 0$$ [15]

or, equivalently,

$$\exists \delta > 0 : \lim_{N \to \infty} N^{\nu + \delta}\, \alpha^\omega_{N^{2(\nu + \delta)}}(N) = 0$$ [16]

Note that this condition implies that

$$\sum_{x \in \mathbb{Z}^\nu} \alpha^\omega_N(|x|) < \infty$$

that is, that the function $\alpha^\omega_N(\cdot)$ is an $L^1(\mathbb{Z}^\nu)$-function for all $N$. In fact, this condition corresponds to the uniform mixing condition in the commutative (classical) central-limit theorem (see, e.g., Ibragimov and Linnick (1971)). This condition can also be called the modulus of decoupling.

Product states, for example, equilibrium states of mean-field systems, are uniformly clustering with $\alpha^\omega_N(d) = 0$ for $d > 0$.

The normality of the fluctuations of the microsystem $(\mathcal{B}, \omega)$ for product states is proved and extensively studied in Goderis et al. (1989), and for states satisfying the condition [15] or [16] in Goderis and Vets (1989). In the latter case, the proofs are very technical and based on a generalization of the well-known Bernstein argument (Ibragimov and Linnick 1971) of the classical central-limit theorem to the noncommutative situation. A refinement of these arguments can be found in Goderis et al. (1990). For the sake of formal self-consistency we formulate the theorem:

Theorem 3 (Central-limit theorem) Take the microsystem $(\mathcal{B}, \omega)$ such that $\omega$ is lattice translation invariant and satisfies the clustering condition [15]; then the system has normal fluctuations for all elements of the vector space of local observables $\mathcal{A}_{L,sa}$. □

In Goldshtein (1982) a noncommutative central-limit theorem is derived using similar techniques. The main difference, however, is its strictly local character, namely for one local operator separately. The conditions depend on the spectral properties of the operator. It excludes a global approach resulting in a CCR algebra structure.

Even for quantum lattice systems, it is not straightforward to check whether a state satisfies
the degree of mixing as expressed in conditions [15]–[16]. Clearly, one expects the condition to hold for equilibrium states at high enough temperatures. For quantum spin chains, a theorem analogous to Theorem 3 under weaker conditions than [15] is proved, for example, in Matsui (2003).

So far we have reviewed the quantum central-limit theorem for physical $C^*$-spin systems $(\mathcal{B}, \omega)$ with normal fluctuations.

Now we extend the physical system to a $C^*$-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ (Brattelli and Robinson 1979, 2002) and we investigate the properties of the dynamics $\alpha_t$ under the central limit. As usual, the dynamics is supposed to be of the short-range type in order to guarantee the norm limit:

$$\alpha_t(\cdot) = \text{n-}\lim_\Lambda\, e^{itH_\Lambda} \cdot e^{-itH_\Lambda}$$

and space homogeneous: $\alpha_t \tau_x = \tau_x \alpha_t$, $\forall t \in \mathbb{R}$, $\forall x \in \mathbb{Z}^\nu$. We suppose that the state $\omega$ is both space and time translation invariant. Moreover, we assume that the state $\omega$ satisfies the mixing condition [15] for normal fluctuations.

In [10] we defined, for every local $A \in \mathcal{A}_{L,sa}$, the local fluctuation $F_\Lambda(A)$ and obtained a clear meaning of $F(A) = \lim_\Lambda F_\Lambda(A)$ from the central-limit theorem. Now we are interested in the dynamics of the fluctuations $F(A)$. Clearly, for all $A \in \mathcal{A}_{L,sa}$ and all finite $\Lambda$:

$$\alpha_t F_\Lambda(A) = F_\Lambda(\alpha_t A)$$ [17]

and one is tempted to define the dynamics $\tilde{\alpha}_t$ of the fluctuations in the $\Lambda$-limit by the formula

$$\tilde{\alpha}_t F(A) = F(\alpha_t A)$$ [18]

Note, however, that in general $\alpha_t A$ is not a local element of $\mathcal{A}_{L,sa}$. It is unclear whether the central limit of elements of the type $\alpha_t A$, with $A \in \mathcal{A}_{L,sa}$, exists or not and hence whether one can give a meaning to $F(\alpha_t A)$. Moreover, if $F(\alpha_t A)$ exists, it remains to prove that $(\tilde{\alpha}_t)_t$ defines a weakly continuous group of $*$-automorphisms on the fluctuation CCR algebra $\tilde{\mathcal{M}} = C^*(\mathcal{A}_{L,sa}, \sigma_\omega)''$ (the von Neumann algebra generated by the $\tilde{\omega}$-representation of $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$). All this needs a proof. In Goderis et al. (1990), one finds the proof of the following basic theorem about the dynamics.

Theorem 4 Under the conditions on the dynamics $\alpha_t$ and on the state $\omega$ expressed above, the limit $F(\alpha_t A) = \lim_\Lambda F_\Lambda(\alpha_t A)$ exists as a central limit as in Theorem 2, and the maps $\tilde{\alpha}_t$ defined by [18] extend to a weakly continuous one-parameter group of $*$-automorphisms of the von Neumann algebra $\tilde{\mathcal{M}}$. The quasifree state $\tilde{\omega}$ is $\tilde{\alpha}_t$-invariant (time invariant).

This theorem yields the existence of a dynamics $\tilde{\alpha}_t$ on the fluctuation algebra and shows that it is of the quasifree type

$$\tilde{\alpha}_t F(A) = F(\alpha_t A)$$

where $F(A)$ is a representation of a Bose field in a quasifree state $\tilde{\omega}$, the noncommutative version of a Gaussian distribution. In physical terms, it also means that any microdynamics $\alpha_t$ induces a linear process on the level of its fluctuations.

We can conclude that on the basis of Theorems 3 and 4 the quantum central-limit theorem realizes a map from the microdynamical system $(\mathcal{B}, \omega, \alpha_t)$ to a macrodynamical system $(C^*(\mathcal{A}_{L,sa}, \sigma_\omega), \tilde{\omega}, \tilde{\alpha}_t)$ of the quantum fluctuations. The latter system is a quasifree Boson system.

Note that, contrary to the central-limit theorem, the law of large numbers [4] maps local observables to their averages forming a trivial commutative algebra of macro-observables. The macrodynamics is mapped to a trivial dynamics as well. Therefore, the consideration of the law of large numbers does not allow one to observe genuine quantum phenomena. On the other hand, on the level of the fluctuations, macroscopic quantum phenomena are observable.

Abnormal Fluctuations

The results about normal fluctuations in the last section contain two essential elements. On the one hand, the central limit has to exist. The condition in order that this occurs is the validity of the cluster condition ([15] or [16]) guaranteeing the normality of the fluctuations. On the other hand, there is the reconstruction theorem, identifying the CCR algebra representation of the fluctuation observables or operators in the quasifree state, which is denoted by $\tilde{\omega}$.

The cluster condition is in general not satisfied for systems with long-range correlations, for example, for equilibrium states at low temperatures with phase transitions. It is a challenging question to also study in this case the existence of fluctuation operators and, if they exist, to study their mathematical structure. Here we detect structures other than the CCR structure, other states or distributions different from quasifree states, etc.

Progress in the elucidation of all these questions started with a detailed study of abnormal fluctuations in the harmonic and anharmonic crystal models (Verbeure and Zagrebnov 1992, Momont et al. 1997). More general Lie algebras are obtained than the Heisenberg Lie algebra of the CCR algebra, and more general states $\tilde{\omega}$ or quantum distributions
are computed beyond quasifree states, which is the case for normal fluctuations.

Abnormal fluctuations turn up if one has an ergodic state $\omega$ with long-range correlations. We have in mind continuous (second-order) phase transitions, then typically, for example, the heat capacity or some more general susceptibilities diverge at critical points or lines. This means that normally scaled (with the factor $|\Lambda|^{-1/2}$) fluctuations of some observables diverge. This is equivalent with the divergence of sums of the type

$$\sum_{x \in \mathbb{Z}^\nu} (\omega(A \tau_x A) - \omega(A)^2)$$

for some local observable $A$.

In order to deal with these situations, we rescale the local fluctuations. One determines a scaling index $\delta_A \in (-1/2, 1/2)$, depending on the observable $A$, such that the abnormally scaled local

Suppose now that the indices $\delta_A$ are determined by the existence of the central limit [19]. The next problem is to find out whether also in these cases a reconstruction theorem, comparable to, for example, Theorem 2, can be proved giving again a mathematical meaning to the limits

$$\lim_\Lambda F^{\delta_A}_\Lambda(A) \equiv F^{\delta_A}(A)$$ [21]

as operators, in general unbounded, on a Hilbert space.

Here we develop a proof of the Lie algebra character of the abnormal fluctuations under the conditions: (1) the $\delta$-indices are determined by the existence of the variances (second moments), and (2) the existence of the third moments (for more details see, e.g., Momont et al. (1997)).

Consider a local algebra, namely an $n$-dimensional vector space $G$ with basis $\{v_i\}_{i=1,\dots,n}$ and product

$$\lim_\Lambda \omega_\Lambda\big( F^{\delta_j}_{j,\Lambda}\, F^{\delta_k}_{k,\Lambda}\, F^{\delta_\ell}_{\ell,\Lambda} \big) < \infty$$

We have in mind that the $\omega_\Lambda$'s are Gibbs states for some local Hamiltonians with some specific boundary conditions. The limit $\Lambda \to \mathbb{Z}^\nu$ may depend very strongly on these boundary conditions, in the sense that they are visible in the values of the indices $\delta_j$ (see, e.g., Verbeure and Zagrebnov (1992)). If for some $j \ge 1$, the corresponding $\delta_j = 0$, then the operator $L_j$ has a normal fluctuation operator:

$$F^{\delta_j}_j = \lim_\Lambda F^{\delta_j}_{j,\Lambda}$$ [25]

where the limit is understood in the sense of Condition A, namely a finite nontrivial variance. If, for some $j \ge 1$, the corresponding $\delta_j \ne 0$, then the fluctuation [25] is called an abnormal fluctuation operator. In order to satisfy Condition A, it happens sometimes that $\delta_j$ has to be chosen negative (see, e.g., Verbeure and Zagrebnov (1992)). In this case, it is reasonable to limit our discussion to the situation that all $\delta_j > -1/2$.

On the basis of Condition A, the limit set

constants. When the transformed structure constants approach a well-defined limit, a new nonisomorphic Lie algebra might appear. The limit algebra $G(\mathbb{Z}^\nu)$, called the contracted one of the original one $G$, is always nonsemisimple. This contraction is a typical Inönü–Wigner contraction (Inönü and Wigner 1953). About the limit algebra $G(\mathbb{Z}^\nu)$, the following results are obtained (see Momont et al. (1997)):

$$\lim_\Lambda c^\ell_{jk}(\Lambda) = \begin{cases} 0 & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell > 0 \\ c^\ell_{jk} & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell = 0 \\ 0 & \text{if } \tfrac{1}{2} + \delta_j + \delta_k - \delta_\ell < 0 \end{cases}$$ [28]

It is interesting to distinguish a number of special cases:

1. If all fluctuations are normal, one recovers the Heisenberg algebra of the canonical commutation relations with the right symplectic form $\sigma_\omega$.
2. If $1/2 + \delta_j + \delta_k - \delta_\ell > 0$ for all $j, k, \ell$ one obtains an abelian Lie algebra of fluctuations.
3. One gets the richest structure if $1/2 + \delta_j + \delta_k - \delta_\ell = 0$ for all $j, k, \ell$ or for some of them. One notes a phenomenon of scale invariance: the $c^\ell_{jk}(\Lambda)$ are $\Lambda$-independent. Algebras different from the CCR algebra are observed. A particularly
of any model, but we limit ourselves to mention three applications which are of a general nature and totally model independent.

Conservation of the KMS Property under the Transition from Micro to Macro

Suppose that we start with a micro-dynamical system $(\mathcal{B}, \omega, \alpha_t)$ with normal fluctuations, that is, we are in the situation as treated in the section "Normal fluctuations." Hence, we know that the quantum central-limit theorem maps the system $(\mathcal{B}, \omega, \alpha_t)$ onto the macrodynamical system $(C^*(\mathcal{A}_{L,sa}, \sigma_\omega), \tilde{\omega}, \tilde{\alpha}_t)$ of quantum fluctuations.

If the microstate $\omega$ is $\alpha_t$-time invariant ($\omega \circ \alpha_t = \omega$ for all $t \in \mathbb{R}$), then it also follows readily that the macrostate $\tilde{\omega}$ is $\tilde{\alpha}_t$-time invariant (see Theorem 4, i.e., $\tilde{\omega} \circ \tilde{\alpha}_t = \tilde{\omega}$ for all $t \in \mathbb{R}$).

A less trivial question to pose is: suppose that the microstate $\omega$ is an equilibrium state for the microdynamics $\alpha_t$, is then the macrostate $\tilde{\omega}$ also an equilibrium state for the macrodynamics $\tilde{\alpha}_t$ of the fluctuations? In Goderis et al. (1990) this question is answered positively in the following more technical sense: if $\omega$ is an $\alpha_t$-KMS state of $\mathcal{B}$ at inverse temperature $\beta$, then $\tilde{\omega}$ is an $\tilde{\alpha}_t$-KMS state at the same temperature.

This property proves that the notion of equilibrium is preserved under the operation of coarse graining induced by the central-limit theorem. This statement constitutes a proof of one of the basic assumptions of the phenomenological theory of Onsager about small oscillations around equilibrium.

This result also yields a contribution to the discussion whether or not quantum systems should be described at a macroscopic level by classical observables. The result above states that the macroscopic fluctuation observables behave classically if and only if they are time invariant. In other words, it can only be expected a priori that conserved quantities behave classically. In principle, other observables follow a quantum dynamics.

Linear Response Theory

In particular, in the study of equilibrium states (KMS states) a standard procedure is to perturb the system and to study the response of the system as a function of the perturbation. The response elucidates many, if not all, of the properties of the equilibrium state.

Technically, one considers a perturbation of the dynamics by adding a term to the Hamiltonian. One expands the perturbed dynamics in terms of the perturbation and the unperturbed dynamics. It is often argued that when the perturbation is small, one can limit the study of the response to the first-order term in the perturbation in the corresponding Dyson expansion. This is the basis of what is called the "linear response theory of Kubo."

A long-term debate is going on about the validity of the linear response theory. The question is how to understand from a microscopic point of view the validity of the response theory being linear or not. One must realize that the linear response theory actually observed in macroscopic systems seems to have a significant range of validity beyond the criticism being expressed about it.

Here we discuss the main result of the paper (Goderis et al. 1991) in which contours are sketched for the exactness of the response being linear.

We assume:

1. that the microdynamics $\alpha_t$ is the norm-limit of the local dynamics $\alpha^\Lambda_t = e^{itH_\Lambda} \cdot e^{-itH_\Lambda}$, where $H_\Lambda$ contains only standard finite-range interactions (as in the section "Normal fluctuations");
2. that the $\omega_\Lambda$ are states such that $\omega = \lim_\Lambda \omega_\Lambda$ is a state which is time and space translation invariant; and
3. that $\omega$ satisfies the cluster condition [15] or [16].

From the time invariance of the state, one has a Hamiltonian GNS representation of the dynamics: $\alpha_t = e^{itH} \cdot e^{-itH}$. On the basis of Theorem 4, one has the dynamics $\tilde{\alpha}_t$ of the fluctuation algebra $C^*(\mathcal{A}_{L,sa}, \sigma_\omega)$ in the state $\tilde{\omega}$. This GNS representation yields a Hamiltonian representation for $\tilde{\alpha}_t$:

$$\tilde{\alpha}_t = e^{it\tilde{H}} \cdot e^{-it\tilde{H}}$$

Now take any local perturbation $P \in \mathcal{A}_{L,sa}$ of $\alpha_t$, namely

$$\alpha^P_{t,\Lambda} = e^{it(H_\Lambda + F_\Lambda(P))} \cdot e^{-it(H_\Lambda + F_\Lambda(P))}$$

where $F_\Lambda(P)$ is the local fluctuation of $P$ in $\omega$. Then one proves the following central-limit theorem (Goderis et al. 1991): for all $A$ and $B$ in $\mathcal{A}_{L,sa}$, one has the perturbed dynamics

$$\tilde{\alpha}^P_t = e^{it(\tilde{H} + F(P))} \cdot e^{-it(\tilde{H} + F(P))}$$

of the fluctuation algebra in the sense of [18]:

$$\tilde{\alpha}^P_t F(A) = \lim_\Lambda F(\alpha^P_{t,\Lambda}(A))$$

This proves the existence and the explicit form of the perturbed dynamics lifted to the level of the fluctuations. In particular, one has

$$\lim_\Lambda \omega\big(\alpha^P_{t,\Lambda}(F_\Lambda(A))\big) = \tilde{\omega}\big(\tilde{\alpha}^P_t F(A)\big)$$
This is nothing but the existence of the relaxation function of Kubo but lifted to the level of the fluctuations and instead of dealing with strictly local observables here one considers fluctuations.
Assume, furthermore, that the state $\omega$ is an $(\alpha_t, \beta)$-KMS state; then one derives readily Kubo's famous formula of his linear response theory:
\[ \frac{d}{dt}\,\tilde\omega\big(\tilde\alpha_t^{P} F(A)\big) = i\,\tilde\omega\big([F(P), \tilde\alpha_t F(A)]\big) \]
which shows full linearity in the perturbation observable $P$. Kubo's formula arises as the central limit of the microscopic response to the dynamics perturbed by a fluctuation observable. We remark that if $\omega$ is an equilibrium state, then the right-hand side of the formula above can be expressed in terms of the Duhamel two-point function, which is the common way of doing in linear response theory.

Spontaneous Symmetry Breaking
SSB is one of the basic phenomena accompanying collective phenomena, such as phase transitions in statistical mechanics, or specific ground states in field theory. SSB goes back to the Goldstone theorem. There are many different situations to consider, for example, in the case of short-range interactions, it is typical that SSB yields a dynamics which remains symmetric, whereas for long-range interactions SSB also breaks the symmetry of the dynamics. However, in all cases the physics literature predicts the appearance of a particular particle, namely the Goldstone boson, to appear as a result of SSB. The theory of fluctuation operators allows the construction of the canonical coordinates of this particle. The most general result can be found in Michoel and Verbeure (2001). We sketch the essentials in two cases, namely for systems of long-range interactions (mean fields) and for systems with short-range interactions.

Long-range (mean-field) interactions Here we give explicitly the example of the strong-coupling BCS model in one dimension ($\nu = 1$). The microscopic algebra of observables is $B = \bigotimes_i (M_2)_i$, where $M_2$ is the algebra of $2 \times 2$ complex matrices. The local Hamiltonian of the models is given by
\[ H_N = \epsilon \sum_{i=-N}^{N} \sigma_i^z - \frac{1}{2N+1} \sum_{i,j=-N}^{N} \sigma_i^+ \sigma_j^-, \qquad 0 < \epsilon < \tfrac{1}{2} \]
where $\sigma^z$, $\sigma^\pm$ are the usual $2 \times 2$ Pauli matrices. In the thermodynamic limit, the KMS equation has the following product state solutions: $\omega_\lambda = \bigotimes_i \operatorname{tr} \rho_\lambda$, where
\[ \rho_\lambda = \frac{e^{-\beta h_\lambda}}{\operatorname{tr} e^{-\beta h_\lambda}}, \qquad \lambda = \operatorname{tr} \rho_\lambda \sigma^- = \omega_\lambda(\sigma^-), \qquad h_\lambda = \epsilon \sigma^z - \lambda \sigma^+ - \bar\lambda \sigma^- \]
Note that $\lambda = \operatorname{tr} \rho_\lambda \sigma^-$ is a nonlinear equation for $\lambda$ whose solutions determine the density matrix $\rho_\lambda$. This equation always has the solution $\lambda = 0$, describing the so-called normal phase. For $\beta > \beta_c$, with $\operatorname{th} \beta_c \epsilon = 2\epsilon$, one has a solution $\lambda \neq 0$, describing the superconducting phase. Remark that if $\lambda$ is a solution, then also $\lambda e^{i\varphi}$ for all $\varphi$ is a solution as well. It is clear that $H_N$ is invariant under the continuous gauge transformation automorphism group $G = \{\gamma_\varphi \mid \varphi \in [0, 2\pi]\}$ of $B$:
\[ \gamma_\varphi(\sigma_i^+) = e^{-i\varphi} \sigma_i^+ \]
Hence $G$ is a symmetry group. On the other hand: $\omega_\lambda(\gamma_\varphi(\sigma_i^+)) = e^{-i\varphi} \omega_\lambda(\sigma_i^+) \neq \omega_\lambda(\sigma_i^+)$. The gauge group $G$ is spontaneously broken. Remark also that the gauge transformations are implemented locally by the charges
\[ Q_N = \sum_{j=-N}^{N} \sigma_j^z; \quad \text{i.e.,} \quad \gamma_\varphi(\sigma_i^+) = e^{-i\varphi Q_N} \sigma_i^+ e^{i\varphi Q_N} \]
and $\sigma^z$ is the symmetry generator density. As the states $\omega_\lambda$ are product states, all fluctuations are normal (see the section “Normal fluctuations”). One considers the local operators
\[ Q = \frac{|\lambda|^2}{\mu^2}\,\sigma^z + \frac{\epsilon}{\mu^2}\,(\lambda \sigma^+ + \bar\lambda \sigma^-) \]
\[ P = \frac{i}{\mu}\,(\lambda \sigma^+ - \bar\lambda \sigma^-) \]
where $\mu = (\epsilon^2 + |\lambda|^2)^{1/2}$. Note that $P$ is essentially the order parameter operator, that is, the operator $P$ is breaking the symmetry:
\[ \frac{d}{d\varphi}\,\omega_\lambda(\gamma_\varphi(A)) \neq 0, \qquad \omega_\lambda(A) = 0 \]
On the other hand, $Q$ is essentially the generator of the symmetry $\sigma^z$ normalized to zero, that is, $\omega_\lambda(Q) = 0$.
Michoel and Verbeure (2001) proved in detail that the fluctuations $F_\lambda(Q)$ and $F_\lambda(P)$ form a canonical pair
\[ [F(Q), F(P)] = i\,\frac{4|\lambda|^2}{\mu} \]
and that they behave, under the time evolution, as harmonic oscillator coordinates oscillating with a
frequency equal to $2\mu$. This frequency is called a plasmon frequency. Moreover, the variances are
\[ \tilde\omega_\lambda\big(F(Q)^2\big) = \tilde\omega_\lambda\big(F(P)^2\big) = \frac{|\lambda|^2}{\mu^2} \]
This means that these coordinates vanish or disappear if $\lambda = 0$. The coordinates $F(Q)$ and $F(P)$ are the canonical coordinates of a particle appearing only if there is spontaneous symmetry breakdown. They are the canonical coordinates of the Goldstone boson, which arise if SSB occurs.

Short-range interactions An analogous result, as for long-range interactions, can be derived for systems with short-range interactions. However, in this case we have equilibrium states with poor cluster properties. We are now in the situation as described in the “Abnormal Fluctuations” section. Also in this case we have the phenomenon of SSB, which shows the appearance of a Goldstone particle. Also in this case one is able to construct its canonical coordinates. The details of this construction can be found in Michoel and Verbeure (2001). Here we give a heuristic picture of this construction.
Consider again a microsystem $(B, \omega, \alpha_t)$ and let $\gamma_s$ be a strongly continuous one-parameter symmetry group of $\alpha_t$ which is locally generated by $Q_\Lambda = \sum_{x \in \Lambda} q_x$. SSB amounts to find an equilibrium (KMS) or ground state $\omega$ which breaks the symmetry, that is, there exists a local observable $A \in A_{L,sa}$ such that for $s \neq 0$ holds: $\omega(\gamma_s(A)) \neq \omega(A)$ and $\alpha_t \gamma_s = \gamma_s \alpha_t$. This is equivalent to
\[ \frac{d}{ds}\,\omega(\gamma_s(A))\Big|_{s=0} = c \neq 0 \]
with $c$ a constant.
Now we turn this equation into a relation for fluctuations. Using space translation invariance of the state, one gets
\[ \lim_\Lambda \omega\!\left( \frac{1}{|\Lambda|} \left[ \sum_{x \in \Lambda} (q_x - \omega(q)),\ \sum_{y \in \Lambda} (\tau_y A - \omega(A)) \right] \right) = c \]
We now use another consequence of the Goldstone theorem, namely that SSB implies poor clustering properties for the order parameter $A$, that is, in the line of what is done in the last section, we assume that the lack of clustering is expressed by the existence of a positive index $\delta$ such that
\[ \lim_\Lambda \omega\!\left( \frac{1}{|\Lambda|^{1+2\delta}} \Big( \sum_{x \in \Lambda} (\tau_x A - \omega(A)) \Big)^{2} \right) \]
is nontrivial and finite. This means that the fluctuation $F^{\delta}(A)$ exists. Then we get
\[ \lim_\Lambda \omega\!\left( \left[ \frac{1}{|\Lambda|^{1/2-\delta}} \sum_{x \in \Lambda} (q_x - \omega(q)),\ \frac{1}{|\Lambda|^{1/2+\delta}} \sum_{y \in \Lambda} (\tau_y A - \omega(A)) \right] \right) = c \]
Hence
\[ \tilde\omega\big([F^{-\delta}(q), F^{\delta}(A)]\big) = c \]
which for equilibrium states $\omega$, turns into the operator equation for fluctuations
\[ [F^{-\delta}(q), F^{\delta}(A)] = c\,\mathbf{1} \]
In other words, one obtains a canonical pair $(F^{-\delta}(q), F^{\delta}(A))$ of normal coordinates of the collective Goldstone mode.
Note that the long-range correlation of the order-parameter operator (positive $\delta$) is exactly compensated by a squeezing, described by the negative index $-\delta$, for the fluctuation operator of the local generator of the broken symmetry. This result can also be expressed as typical for SSB, namely that the symmetry is not completely broken, but only partially. More detailed information about all this is found in Michoel and Verbeure (2001).

See also: Algebraic Approach to Quantum Field Theory; Large Deviations in Equilibrium Statistical Mechanics; Macroscopic Fluctuations and Thermodynamic Functionals; Quantum Phase Transitions; Quantum Spin Systems; Symmetry Breaking in Field Theory;

Further Reading
Brattelli O and Robinson DW (1979) Operator Algebras and Quantum Statistical Mechanics, vol. I. New York–Heidelberg–Berlin: Springer.
Brattelli O and Robinson D (2002) Operator Algebras and Quantum Statistical Mechanics, vol. II. New York–Heidelberg–Berlin: Springer.
Cushen CD and Hudson RL (1971) A quantum mechanical central limit theorem. Journal of Applied Probability 8: 454–469.
Fannes M and Quaegebeur J (1983) Central limits of product mappings between CAR-algebras. Publications of the Research Institute for Mathematical Studies Kyoto 19: 469–491.
Giri N and von Waldenfels W (1978) An algebraic version of the central limit theorem. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte gebiete 42: 129–134.
Goderis D, Verbeure A, and Vets P (1989) Non-commutative central limits. Probability and Related Fields 82: 527–544.
Goderis D, Verbeure A, and Vets P (1990) Dynamics of fluctuations for quantum lattice systems. Communications in Mathematical Physics 128: 533–549.
Goderis D, Verbeure A, and Vets P (1991) About the exactness of the linear response theory. Communications in Mathematical Physics 136: 265–583.
Goderis D and Vets P (1989) Central limit theorem for mixing quantum systems and the CCR-algebra of fluctuations. Communications in Mathematical Physics 122: 249.
Goldshtein BG (1982) A central limit theorem of non-commutative probability theory. Theory of Probability and its Applications 27: 703.
Hudson RL (1973) A quantum mechanical central limit theorem for anti-commuting observables. Journal of Applied Probability 10: 502–509.
Ibragimov IA and Linnick YuV (1971) Independent and stationary sequences of random variables. Groningen: Wolters-Noordhoff.
Inönü E and Wigner EP (1953) On the contraction groups and their representations. Proceedings of the National Academy of Sciences, USA 39: 510–524.
Lanford DE and Ruelle D (1969) Observables at infinity and states with short-range correlations in statistical mechanics. Journal of Mathematical Physics 13: 194.
Manuceau J, Sirugue M, Testard D, and Verbeure A (1973) The smallest $C^*$-algebra for canonical commutation relations. Communications in Mathematical Physics 32: 231.
Matsui T (2003) On the algebra of fluctuations in quantum spin chains. Annales Henri Poincaré 4: 63–83.
Michoel T and Verbeure A (2001) Goldstone boson normal coordinates. Communications in Mathematical Physics 216: 461–490.
Momont B, Verbeure A, and Zagrebnov VA (1997) Algebraic structure of quantum fluctuations. Journal of Statistical Physics 89: 633–653.
Quaegebeur J (1984) A non-commutative central limit theorem for CCR-algebras. Journal of Functional Analysis 57: 1–20.
Sewell GL (1986) Quantum theory of collective phenomena. Oxford: Oxford University Press.
Verbeure A and Zagrebnov VA (1992) Phase transitions and algebra of fluctuation operators in an exactly soluble model of a quantum anharmonic crystal. Journal of Statistical Physics 69: 329–359.

Quantum Channels: Classical Capacity
of $n$ parallel and independent uses of a channel $\Phi$, $n$ playing the role of transmission time (Holevo 1998). More generally, one can consider memory channels given by open dynamical systems with a kind of ergodic behavior and the limit where the transmission time goes to infinity (Kretschmann and Werner 2005). Restricting to the memoryless case, encoding is given by a mapping of classical messages $x$ from a given codebook of size $N$ into states (density operators) $\rho_x^{(n)}$ in the input space $H_1^{\otimes n}$ of the block channel $\Phi^{\otimes n}$, and decoding by an observable $M^{(n)}$ in the output space $H_2^{\otimes n}$, that is, a family $\{M_y^{(n)}\}$ of operators constituting a resolution of the identity in $H_2^{\otimes n}$:
\[ M_y^{(n)} \geq 0, \qquad \sum_y M_y^{(n)} = I \]
Here $y$ plays the role of outcomes of the whole decoding procedure involving both the quantum measurement at the output and the possible classical information post-processing. Then the diagram for the classical information transmission is
\[ x \;\longrightarrow\; \underbrace{\rho_x^{(n)}}_{\text{input state}} \;\longrightarrow\; \underbrace{\Phi^{\otimes n}[\rho_x^{(n)}]}_{\text{output state}} \;\xrightarrow{\;M^{(n)}\;}\; y \]
The encoding and decoding described in this way constitute a quantum block code of length $n$ and size $N$ for the memoryless channel. The conditional probability of obtaining an outcome $y$ provided the message $x$ was sent for a chosen block code is given by the statistical formula
\[ p^{(n)}(y|x) = \operatorname{tr}\,\Phi^{\otimes n}[\rho_x^{(n)}]\,M_y^{(n)} \]

$H(\rho) = -\operatorname{tr} \rho \log_2 \rho$ is the binary von Neumann entropy, and the maximum is taken over all probability distributions $\{p_x\}$ and collections of density operators $\{\rho_x\}$ in $H_1$.

The Variety of Capacities
This basic definition and the formulas [1], [2] generalize the definition of the Shannon capacity and the coding theorem for classical memoryless channels. For a quantum channel, there are several different capacities because one may consider sending different kinds (classical or quantum) of information, restrict the admissible coding and decoding operations, and/or allow the use of additional resources, such as shared entanglement, forward or backward communication, leading to really different quantities (Bennett et al. 2004). A few of these resources (such as feedback) also exist for classical channels but usually influence the capacity less dramatically (at least for memoryless channels). Restricting to the transmission of classical information with no additional resources, one can distinguish at least four capacities (Bennett and Shor 1998), according to whether, for each block length $n$, one is allowed to use arbitrary entangled quantum operations on the full block of input (resp. output) systems, or if, for each of the parallel channels, one has to use a separate quantum encoding (resp. decoding), and combine these only by classical pre- (resp. post-) processing:
$C_{\infty\infty}$: full capacity, arbitrary (de)coding
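
The statistical formula $p(y|x) = \operatorname{tr}\,\Phi[\rho_x] M_y$ and the entropy $H(\rho) = -\operatorname{tr}\rho\log_2\rho$ can be tried out on a single qubit. The following is a minimal sketch and not taken from the article: it assumes block length $n = 1$, takes the channel to be the identity map, and uses a hypothetical codebook of two states, $|0\rangle\langle 0|$ and $|+\rangle\langle +|$, with a measurement in the computational basis. All function and variable names are illustrative.

```python
import math


def trace_product(a, b):
    """Return tr(a b) for two 2x2 matrices given as nested lists."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


def qubit_eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian matrix (closed form for 2x2)."""
    mean = (rho[0][0].real + rho[1][1].real) / 2
    radius = math.sqrt(((rho[0][0].real - rho[1][1].real) / 2) ** 2
                       + abs(rho[0][1]) ** 2)
    return [mean + radius, mean - radius]


def von_neumann_entropy(rho):
    """Binary von Neumann entropy H(rho) = -tr rho log2 rho."""
    return -sum(p * math.log2(p) for p in qubit_eigenvalues(rho) if p > 1e-12)


def transition_probabilities(states, observable):
    """p(y|x) = tr rho_x M_y (assumption: identity channel, n = 1)."""
    return [[trace_product(rho, m).real for m in observable] for rho in states]


def average_state(states, probabilities):
    """Mixture sum_x p_x rho_x of the codebook states."""
    return [[sum(p * rho[i][j] for p, rho in zip(probabilities, states))
             for j in range(2)] for i in range(2)]


# Hypothetical codebook of size N = 2: |0><0| and |+><+|.
RHO_ZERO = [[1 + 0j, 0j], [0j, 0j]]
RHO_PLUS = [[0.5 + 0j, 0.5 + 0j], [0.5 + 0j, 0.5 + 0j]]
# Resolution of the identity: projectors onto |0> and |1>.
OBSERVABLE = [[[1 + 0j, 0j], [0j, 0j]], [[0j, 0j], [0j, 1 + 0j]]]

if __name__ == "__main__":
    for row in transition_probabilities([RHO_ZERO, RHO_PLUS], OBSERVABLE):
        print(row)
    mixture = average_state([RHO_ZERO, RHO_PLUS], [0.5, 0.5])
    print(von_neumann_entropy(mixture))
```

Because both codebook states are pure, their individual entropies vanish, so the entropy of the equal-weight mixture is the only contribution an entropy-difference quantity built from $H$ would receive for this ensemble.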
quantity $C_\chi(\Phi)$ given by [2] is the essential content of the HSW theorem, from which [1] is obtained by additional blocking. Since $C_\chi$ is apparently superadditive, $C_\chi(\Phi_1 \otimes \Phi_2) \geq C_\chi(\Phi_1) + C_\chi(\Phi_2)$, one has $C_{\infty\infty} \geq C_\chi$. It is still not known whether the quantity $C_\chi(\Phi)$ is in fact additive for all channels, which would imply the equalities here. Additivity of $C_\chi(\Phi)$ would have an important physical consequence: it would mean that using entangled input states does not increase the classical capacity of a quantum channel. While such a result would be very much welcome, giving a single-letter expression for the classical capacity, it would call for a physical explanation of asymmetry between the effects of entanglement in encoding and decoding procedures. Indeed, the inequality in the lower left is known to be strict sometimes (Holevo 1998), which means that entangled decodings can increase the classical capacity. There is even an intermediate capacity between $C_{11}$ and $C_{1\infty}$ obtained by restricting the quantum block decodings to adaptive ones (Shor 2002). The additivity of the quantity $C_\chi$ for all channels is one of the central open problems in quantum information theory; it was shown to be equivalent to several other important open problems, notably (super)additivity of the entanglement of formation and additivity of the minimal output entropy (Shor 2004).
For infinite-dimensional quantum processing systems, one needs to consider the input constraints such as the power constraint for bosonic Gaussian channels. The definition of the classical capacity and the capacity formula are then modified by introducing the constraint in a way similar to the classical theory (Holevo 1998, Holevo and Werner 2001). Another important extension concerns multiuser quantum information processing systems and their capacity regions (Devetak and Shor 2003).

See also: Capacities Enhanced by Entanglement; Capacity for Quantum Information; Channels in Quantum Information Theory; Entanglement Measures.

Further Reading
Bennett CH and Shor PW (1998) Quantum information theory. IEEE Transactions on Information Theory 44: 2724–2742.
Bennett CH, Devetak I, Shor PW, and Smolin JA Inequality and separation between assisted capacities of quantum channel, e-print quant-ph/0406086.
Devetak I and Shor PW The capacity of quantum channel for simultaneous transmission of classical and quantum information, e-print quant-ph/0311131.
Holevo AS (1998) Quantum Coding Theorems. Russian Math. Surveys vol. 53. pp. 1295–1331, e-print quant-ph/9808023.
Holevo AS (2000) Coding theorems of quantum information theory. In: Grigoryan A, Fokas A, Kibble T, and Zegarlinski B (eds.) Proc. XIII ICMP, pp. 415–422. London: International Press of Boston.
Holevo AS and Werner RF (2001) Evaluating capacities of bosonic Gaussian channels. Physical Review A 63: 032312 (e-print quant-ph/9912067).
Kretschmann D and Werner RF Quantum channels with memory, e-print quant-ph/0502106.
Shor PW The adaptive classical capacity of a quantum channel, or information capacity of 3 symmetric pure states in three dimensions, e-print quant-ph/0206058.
Shor PW (2004) Equivalence of additivity questions in quantum information theory. Communications in Mathematical Physics 246: 4334–4340 (e-print quant-ph/0305035).
Quantum Chromodynamics
G Sterman, Stony Brook University, Stony Brook, NY, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Quantum chromodynamics, or QCD, as it is normally called in high-energy physics, is the quantum field theory that describes the strong interactions. It is the SU(3) gauge theory of the current standard model for elementary particles and forces, $SU(3) \times SU(2)_L \times U(1)$, which encompasses the strong, electromagnetic, and weak interactions. The symmetry group of QCD, with its eight conserved charges, is referred to as color SU(3). As is characteristic of quantum field theories, each field may be described in terms of quantum waves or particles.
Because it is a gauge field theory, the fields that carry the forces of QCD transform as vectors under the Lorentz group. Corresponding to these vector fields are the particles called “gluons,” which carry an intrinsic angular momentum, or spin, of 1 in units of $\hbar$. The strong interactions are understood as the cumulative effects of gluons, interacting among themselves and with the quarks, the spin-1/2 particles of the Dirac quark fields.
There are six quark fields of varying masses in QCD. Of these, three are called “light” quarks, in a sense to be defined below, and three “heavy.” The light quarks are the up (u), down (d), and strange (s), while the heavy quarks are the charm (c), bottom (b),
and top (t). Their well-known electric charges are $e_f = 2e/3$ (u, c, t) and $e_f = -e/3$ (d, s, b), with $e$ the positron charge. The gluons interact with each quark field in an identical fashion, and the relatively light masses of three of the quarks provide the theory with a number of approximate global symmetries that profoundly influence the manner in which QCD manifests itself in the standard model.
These quark and gluon fields and their corresponding particles are enumerated with complete confidence by the community of high-energy physicists. Yet, none of these particles has ever been observed in isolation, as one might observe a photon or an electron. Rather, all known strongly interacting particles are colorless; most are “mesons,” combinations with the quantum numbers of a quark $q$ and an antiquark $\bar q'$, or “baryons” with the quantum numbers of (possibly distinct) combinations of three quarks $qq'q''$. This feature of QCD, that its underlying fields never appear as asymptotic states, is called “confinement.” The very existence of confinement required new ways of thinking about field theory, and only with these was the discovery and development of QCD possible.

The Background of QCD
The strong interactions have been recognized as a separate force of nature since the discovery of the neutron as a constituent of atomic nuclei, along with the proton. Neutrons and protons (collectively, nucleons) possess a force, attractive at intermediate distances and so strong that it overcomes the electric repulsion of the protons, each with charge $e$. A sense of the relative strengths of the electromagnetic and strong interactions may be inferred from the typical

The Lagrangian and Its Symmetries
The QCD Lagrangian may be written as
\[ \mathcal{L} = \sum_{f=1}^{n_f} \bar q_f \left( i D\!\!\!\!/\,[A] - m_f \right) q_f - \frac{1}{2} \operatorname{tr}\left[ F_{\mu\nu}^2(A) \right] - \frac{\lambda}{2} \left( B_a(A) \right)^2 + \bar c_b\, \frac{\delta B_b(A)}{\delta \alpha_a}\, c_a \qquad [1] \]
with $D\!\!\!\!/\,[A] = \gamma \cdot \partial + i g_s\, \gamma \cdot A$ the covariant derivative in QCD. The $\gamma^\mu$ are the Dirac matrices, satisfying the anticommutation relations, $[\gamma_\mu, \gamma_\nu]_+ = 2 g_{\mu\nu}$. The SU(3) gluon fields are $A_\mu = \sum_{a=1}^{8} A_\mu^a T_a$, where $T_a$ are the generators of SU(3) in the fundamental representation. The field strengths $F_{\mu\nu}[A] = \partial_\mu A_\nu - \partial_\nu A_\mu + i g_s [A_\mu, A_\nu]$ specify the three- and four-point gluon couplings of nonabelian gauge theory. In QCD, there are $n_f = 6$ flavors of quark fields, $q_f$, with conjugate $\bar q_f = q_f^\dagger \gamma^0$. The first two terms in the expression [1] make up the classical Lagrangian, followed by the gauge-fixing term, specified by a (usually, but not necessarily linear) function $B_a(A)$, and the ghost Lagrangian. The ghost (anti-ghost) fields $c_a$ ($\bar c_a$) carry the same adjoint index as the gauge fields.
The classical QCD Lagrangian before gauge fixing is invariant under the local gauge transformations
\[ A'_\mu(x) = \frac{i}{g_s}\, \partial_\mu \Omega(x)\, \Omega^{-1}(x) + \Omega(x) A_\mu(x) \Omega^{-1}(x) = A_\mu(x) - \partial_\mu \delta\alpha(x) + i g_s \left[ \delta\alpha(x), A_\mu(x) \right] + \cdots \]
\[ \psi'_i(x) = \Omega(x)_{ij}\, \psi_j(x) = \psi_i(x) + i g_s\, \delta\alpha(x)_{ij}\, \psi_j(x) + \cdots \qquad [2] \]
Here, power $P = 0$ describes phase, and $P = 1$ chiral, transformations. Both transformations can be extended to transformations among the light flavors, by letting $\psi$ become a vector, and $\alpha$ an element in the Lie algebra of SU(M), with $M = 2$ if we take only the u and d quarks, and $M = 3$ if we include the somewhat heavier strange quark. These symmetries, not to be confused with the local symmetries of the standard model, are strong isospin and its extension to the “eightfold way,” which evolved into the (3-)quark model of Gell-Mann and Zweig. The many successes of these formalisms are automatically incorporated into QCD.

The variation of the anti-ghost as in [3] is equivalent to an infinitesimal change in the gauge-fixing term; variations in the remaining fields all cancel single-particle plane wave behavior in the corresponding Green functions. These identities then ensure the gauge invariance of the perturbative S-matrix, a result that turns out to be useful despite confinement.
To go beyond a purely perturbative description of QCD, it is useful to introduce a set of nonlocal operators that are variously called nonabelian phases, ordered exponentials, and Wilson lines,
\[ U_C(z, y) = P \exp\left[ -i g_s \int_y^z dx^\mu A_\mu(x) \right] \qquad [8] \]
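
The ordering prescription $P$ in the Wilson line [8] matters because the gauge field at different points on the path need not commute. As a rough illustration only, the sketch below discretizes a path into a few segments and multiplies one “link” matrix per segment. It is not the article's construction: it uses SU(2) (Pauli matrices) instead of SU(3) to stay short, treats the product of coupling, field, and step length on each segment as a single hypothetical angle `theta`, and assumes the convention that later segments multiply from the left.

```python
import math

PAULI = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def link(axis, theta):
    """exp(-i theta sigma_axis) = cos(theta) 1 - i sin(theta) sigma_axis."""
    sigma = PAULI[axis]
    return [[math.cos(theta) * IDENTITY[i][j] - 1j * math.sin(theta) * sigma[i][j]
             for j in range(2)] for i in range(2)]


def wilson_line(links):
    """Ordered product of links listed from the start point to the end point."""
    u = IDENTITY
    for step in links:
        u = matmul(step, u)  # assumption: later segments stand to the left
    return u


def max_abs_difference(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    forward = wilson_line([link("x", 0.3), link("z", 0.5)])
    swapped = wilson_line([link("z", 0.5), link("x", 0.3)])
    # Nonzero: the two orderings give different matrices.
    print(max_abs_difference(forward, swapped))
```

When every segment points along the same generator, the links commute and the ordered product reduces to an ordinary exponential of the summed angles, which is the abelian limit.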
classical solutions to the equations of motion, known as instantons, that provide nonperturbative contributions to the path integral. Perhaps the most flexible nonperturbative approach approximates the action and the measure at a lattice of points in four-dimensional space. For this purpose, integrals over the gauge fields are replaced by averages over “gauge links,” of the form of eqn [8] between neighboring points.
Perturbation theory is most useful for processes that occur over short timescales and at high relative energies. Lattice QCD, on the other hand, can simulate processes that take much longer times, but is less useful when large momentum transfers are involved. The gap between the two methods remains quite wide, but between the two they have covered enormous ground, enough to more than confirm QCD as the theory of strong interactions.

where in the second form, we have introduced $\Lambda_{QCD}$, the scale parameter of the theory, which embodies the condition that we get the same coupling at scale $\mu_1$ no matter which scale $\mu_0$ we start from. Asymptotic freedom consists of the observation that at larger renormalization masses $\mu$, or correspondingly shorter timescales, the coupling weakens, and indeed vanishes in the limit $\mu \to \infty$. The other side of the coin is that over longer times or lower momenta, the coupling grows. Eventually, near the pole at $\mu_1 = \Lambda_{QCD}$, the lowest-order approximation to the running fails, and the theory becomes essentially nonperturbative. Thus, the discovery of asymptotic freedom suggested, although it certainly does not prove, that QCD is capable of producing very strong forces, and confinement at long distances. Current estimates of $\Lambda_{QCD}$ are $\approx 200$ MeV.
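
The weakening of the coupling at large $\mu$ and its pole at $\mu = \Lambda_{QCD}$ can be made concrete with a short numerical sketch. The formula used here is an assumption, since the excerpt does not display it: the standard lowest-order expression $\alpha_s(\mu) = 4\pi / (b_0 \ln(\mu^2/\Lambda_{QCD}^2))$ with $b_0 = 11 - 2 n_f/3$. The default $\Lambda_{QCD} = 0.2$ GeV follows the estimate quoted in the text; the choice $n_f = 5$ and the function names are illustrative.

```python
import math


def b0(n_f):
    """Lowest-order beta-function coefficient (assumed form: 11 - 2 n_f / 3)."""
    return 11.0 - 2.0 * n_f / 3.0


def alpha_s_one_loop(mu, lambda_qcd=0.2, n_f=5):
    """Lowest-order running coupling at scale mu (GeV); assumed textbook form."""
    if mu <= lambda_qcd:
        # The lowest-order expression has a pole at mu = lambda_qcd and is
        # meaningless at or below it.
        raise ValueError("mu must exceed lambda_qcd")
    return 4.0 * math.pi / (b0(n_f) * math.log(mu ** 2 / lambda_qcd ** 2))


if __name__ == "__main__":
    for scale in (1.0, 10.0, 91.19, 1000.0):
        print(scale, alpha_s_one_loop(scale))
```

The values fall steadily as the scale rises and grow without bound as the scale approaches $\Lambda_{QCD}$ from above, which is where the lowest-order approximation stops being trustworthy.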
QCD) has not yet been demonstrated from first principles, a very satisfactory description of the origin of the condensate, and indeed of much hadronic structure, has been given in terms of the attractive forces between quarks provided by instantons. The actions of instanton solutions provide a dependence $\exp[-8\pi^2/g_s^2]$ in Euclidean path integrals, and so are characteristically nonperturbative.

Mechanisms of Confinement
As described above, confinement is the absence of asymptotic states that transform nontrivially under color transformations. The full spectrum of QCD, however, is a complex thing to study, and so the problem has been approached somewhat indirectly. A difficulty is the same light-quark masses associated with approximate chiral symmetry. Because the masses of the light quarks are far below the scale $\Lambda_{QCD}$ at which the perturbative coupling blows up, light quarks are created freely from the vacuum and the process of “hadronization,” by which quarks and gluons form mesons and baryons, is both nonperturbative and relativistic. It is therefore difficult to approach in both perturbation theory and lattice simulations.
Tests and studies of confinement are thus normally formulated in truncations of QCD, typically with no light quarks. The question is then reformulated in a way that is somewhat more tractable, without relativistic light quarks popping in and out of the vacuum all the time. In the limit that its mass becomes infinite compared to the natural scale of fluctuations in the QCD vacuum, the propagator of a quark becomes identical to a phase operator, [8], with a path $C$ corresponding to a constant velocity. This observation suggests a number of tests for confinement that can be implemented in the lattice theory. The most intuitive is the vacuum expectation value of a “Wilson loop,” consisting of a rectangular path, with sides along the time direction, corresponding to a heavy quark and antiquark at rest a distance $R$ apart, and closed at some starting and ending times with straight lines. The vacuum expectation value of the loop then turns out to be the exponential of the potential energy between the quark pair, multiplied by the elapsed time,
\[ \left\langle 0 \left| P \exp\left[ -i g_s \oint_C A_\mu(x)\, dx^\mu \right] \right| 0 \right\rangle = \exp\left( -V(R)\,T/\hbar \right) \qquad [13] \]
When $V(R) \propto R$ (“area law” behavior), there is a linearly rising, confining potential. This behavior, not yet proven analytically yet well confirmed on the lattice, has an appealing interpretation as the energy of a “string,” connecting the quark and antiquark, whose energy is proportional to its length.
Motivation for such a string picture was also found from the hadron spectrum itself, before any of the heavy quarks were known, and even before the discovery of QCD, from the observation that many mesonic ($q\bar q'$) states lie along “Regge trajectories,” which consist of sets of states of spin $J$ and mass $m_J^2$ that obey a relation
\[ J = \alpha' m_J^2 \qquad [14] \]
for some constant $\alpha'$. Such a relation can be modeled by two light particles (“quarks”) revolving around each other at some constant (for simplicity, fixed nonrelativistic) velocity $v_0$ and distance $2R$, connected by a “string” whose energy per unit length is a constant $\rho$.
Suppose the center of the string is stationary, so the overall system is at rest. Then neglecting the masses, the total energy of the system is $M = 2R\rho$. Meanwhile, the momentum density per unit length at distance $r$ from the center is $\rho v(r) = \rho (r/R) v_0$, and the total angular momentum of the system is
\[ J = \frac{2 \rho v_0}{R} \int_0^R dr\, r^2 = \frac{2 \rho v_0}{3} R^2 = \frac{v_0}{6 \rho} M^2 \qquad [15] \]
and for such a system, [14] is indeed satisfied. Quantized values of angular momentum $J$ give quantized masses $m_J$, and we might take this as a sort of “Bohr model” for a meson. Indeed, string theory has its origin in related consideration in the strong interactions.
Lattice data are unequivocal on the linearly rising potential, but it requires further analysis to take a lattice result and determine what field configurations, stringlike or not, gave that result. Probably the most widely accepted explanation is in terms of an analogy to the Meissner effect in superconductivity, in which type II superconductors isolate magnetic flux in quantized tubes, the result of the formation of a condensate of Cooper pairs of electrons. If the strings of QCD are to be made of the gauge field, they must be electric ($F^{\mu 0}$) in nature to couple to quarks, so the analogy postulates a “dual” Meissner effect, in which electric flux is isolated as the result of a condensate of objects with magnetic charge (producing nonzero $F^{ij}$). Although no proof of this mechanism has been provided yet, the role of magnetic fluctuations in confinement has been widely investigated in lattice simulations, with encouraging results. Of special interest are magnetic field configurations, monopoles or vortices, in the $Z_3$ center of SU(3), $\exp[2\pi i k/3]\, I_{3 \times 3}$, $k = 0, 1, 2$. Such configurations, even when localized, influence closed gauge loops [13] through the nonabelian Aharonov–Bohm effect. Eventually, of course, the role of light quarks must be crucial for any complete
description of confinement in the real world, as emphasized by Gribov.
Another related choice of closed loop is the “Polyakov loop,” implemented at finite temperature, for which the path integral is taken over periodic field configurations with period $1/T$, where $T$ is the temperature. In this case, the curve $C$ extends from times $t = 0$ to $t = 1/T$ at a fixed point in space. In this formulation it is possible to observe a phase transition from a confined phase, where the expectation is zero, to a deconfined phase, where it is nonzero. This phase transition is currently under intense experimental study in nuclear collisions.

Using Asymptotic Freedom: Perturbative QCD
It is not entirely obvious how to use asymptotic freedom in a theory that should (must) have confinement. Such applications of asymptotic freedom go by the term perturbative QCD, which has many applications, not the least as a window to extensions of the standard model.

Lepton Annihilation and Infrared Safety
The electromagnetic current, $J_\mu = \sum_f e_f\, \bar q_f \gamma_\mu q_f$, is a gauge-invariant operator, and its correlation functions are not limited by confinement. Perhaps, the simplest application of asymptotic freedom, yet of great physical relevance, is the scalar two-point function,
\[ \Pi(Q) = -\frac{i}{3} \int d^4x\, e^{-iQ \cdot x} \left\langle 0 \left| T\, J^\mu(0) J_\mu(x) \right| 0 \right\rangle \qquad [16] \]
The imaginary part of this function is related to the total cross section for the annihilation process $e^+e^- \to$ hadrons in the approximation that only one photon takes part in the reaction. The specific relation is $\sigma_{QCD} = (e^4/Q^2)\, \mathrm{Im}\, \Pi(Q^2)$, which follows from the optical theorem, illustrated in Figure 1. The perturbative expansion of the function $\Pi(Q)$ depends, in general, on the mass scales $Q$ and the quark masses $m_f$ as well as on the strong coupling $\alpha_s(\mu)$ and on the renormalization scale $\mu$. We may also worry about the influence of other, truly nonperturbative scales, proportional to powers of $\Lambda_{QCD}$. At large values of $Q^2$, however, the situation simplifies greatly, and dependence on all scales below $Q$ is suppressed by powers of $Q$. This may be expressed in terms of the operator product expansion,
\[ \left\langle 0 \left| T\, J^\mu(0) J_\mu(x) \right| 0 \right\rangle = \sum_{O_I} (x^2)^{-3 + d_I/2}\, C_I\big(x^2 \mu^2, \alpha_s(\mu)\big)\, \langle 0 | O_I(0) | 0 \rangle \qquad [17] \]
where $d_I$ is the mass dimension of operator $O_I$, and where the dimensionless coefficient functions $C_I$ incorporate quantum corrections. The sum over operators begins with the identity ($d_I = 0$), whose coefficient function is identified with the sum of quantum corrections in the approximation of zero masses. The sum continues with quark mass corrections, which are suppressed by powers of at least $m_f^2/Q^2$, for those flavors with masses below $Q$. Any QCD quantity that has this property, remaining finite in perturbation theory when all particle masses are set to zero, is said to be “infrared safe.”

Figure 1 First line: schematic relation of lowest order $e^+e^-$ annihilation to sum over quarks $q$, each with electric charge $e_q$. Second line: perturbative unitarity for the current correlation function $\Pi(Q)$.

The effects of quarks whose masses are above $Q$ are included indirectly, through the couplings and masses observed at the lower scales. In summary, the leading power behavior of $\Pi(Q)$, and hence of the cross section, is a function of $Q$, $\mu$, and $\alpha_s(\mu)$ only. Higher-order operators whose vacuum matrix elements receive nonperturbative corrections include the “gluon condensate,” identified as the product $\alpha_s(\mu)\, G_{\mu\nu} G^{\mu\nu} \propto \Lambda_{QCD}^4$.
Once we have concluded that $Q$ is the only physical scale in $\Pi$, we may expect that the right choice of the renormalization scale is $\mu = Q$. Any observable quantity is independent of the choice of renormalization scale, $\mu$, and neglecting quark masses, the chain rule gives
\[ \mu \frac{d\sigma(Q/\mu, \alpha_s(\mu))}{d\mu} = \mu \frac{\partial \sigma}{\partial \mu} + 2\beta(\alpha_s) \frac{\partial \sigma}{\partial \alpha_s} = 0 \qquad [18] \]
which shows that we can determine the beta function directly from the perturbative expansion of the cross section. Defining $a \equiv \alpha_s(\mu)/\pi$, such a perturbative calculation gives
\[ \mathrm{Im}\, \Pi(Q^2) = \frac{3}{4\pi} \sum_f e_f^2 \left( 1 + a + a^2 \left( 1.986 - 0.115\, n_f - (b_0/4) \ln \frac{Q^2}{\mu^2} \right) \right) \qquad [19] \]
with $b_0$ as above. Now, choosing $\mu = Q$, we see that asymptotic freedom implies that when $Q$ is large, the total cross section is given by the lowest order,
[Figure: plot of $\alpha_s(Q)$ on a vertical scale from 0.1 to 0.5. Legend entries: Lattice, NNLO, NLO, Theory, Data; processes: deep-inelastic scattering, $e^+e^-$ annihilation, hadron collisions, heavy quarkonia. $\Lambda^{(5)}_{\overline{\rm MS}}$ = 245, 210, 180 MeV for $\alpha_S(M_Z)$ = 0.1209, 0.1182, 0.1155; QCD at $O(\alpha_S^4)$.]

Figure 3 Schematic depiction of factorization in deep-inelastic scattering.
n is a light-like vector, and $U_n$ a phase operator whose path C is in the n-direction. The dependence of the parton distribution on the factorization scale is through the renormalization of the composite operator consisting of the quark fields, separated along the light cone, and the nonabelian phase operator $U_n(n, 0)$, which renders the matrix element gauge invariant by eqn [9]. By combining the calculations of the C's and data for $W_N^{\mu\nu}$, we can infer the parton distributions, $f_{i/N}$. Important factorizations of a similar sort also apply to some exclusive processes, including amplitudes for elastic pion or nucleon scattering at large momentum transfer.

Equation [21] has a number of extraordinary consequences. First, because the coefficient function is an expansion in $\alpha_s$, it is natural to choose $\mu_F^2 \sim Q^2 \sim p \cdot q$ (when $x$ is of order unity). When $Q$ is large, we may approximate $C_i^{\mu\nu}$ by its lowest order, which is first order in the electromagnetic coupling of quarks to photons, and zeroth order in $\alpha_s$. In this approximation, dependence on $Q$ is entirely in the parton distributions. But such dependence is of necessity weak (again for $x$ not so small as to produce another scale), because the $\mu_F$ dependence of $f_{i/N}(\xi, \mu_F)$ must be compensated by the $\mu_F$ dependence of $C_i^{\mu\nu}$, which is order $\alpha_s$. This means that the overall $Q$ dependence of the tensor $W_N^{\mu\nu}$ is weak for $Q$ large when $x$ is moderate. This is the scaling phenomenon that played such an important role in the discovery of QCD.

Evolution: Beyond Scaling

Another consequence of the factorization [21], or equivalently of the operator definition [22], is that the $\mu_F$-dependence of the coefficient functions and the parton distributions are linked. As in the lepton annihilation cross section, this may be thought of as due to the independence of the physically observable tensor $W_N^{\mu\nu}$ from the choice of factorization and renormalization scales. This implies that the $\mu_F$-dependence of $f_{i/N}$ may be calculated perturbatively since it must cancel the corresponding dependence in $C_i$. The resulting relation is conventionally expressed in terms of the “evolution equations,”

$$\mu \frac{d f_{a/N}(x, \mu)}{d\mu} = \sum_c \int_x^1 \frac{d\xi}{\xi}\, P_{ac}(x/\xi, \alpha_s(\mu))\, f_{c/N}(\xi, \mu) \qquad [23]$$

where $P_{ac}(\xi)$ are calculable as power series, now known up to $\alpha_s^3$. This relation expands the applicability of QCD from scales where parton distributions can be inferred directly from experiment, to arbitrarily high scales, reachable in accelerators under construction or in the imagination, or even on the cosmic level.

At very high energy, however, the effective values of the variable $x$ can become very small and introduce new scales, so that eventually the evolution of eqn [23] fails. The study of nuclear collisions may provide a new high-density regime for QCD, which blurs the distinction between perturbative and nonperturbative dynamics.

Inclusive Production

Once we have evolution at our disposal, we can take yet another step, and replace electroweak currents with any operator from any extension of QCD, in the standard model or beyond, that couples quarks and gluons to the particles of as-yet unseen fields. Factorization can be extended to these situations as well, providing predictions for the production of new particles, F of mass M, in the form of factorized inclusive cross sections,

$$\sigma_{AB \to F(M)}(M, p_A, p_B) = \sum_{i,j = q_f, \bar q_f, G} \int d\xi_a\, d\xi_b\, f_{i/A}(\xi_a, \mu)\, f_{j/B}(\xi_b, \mu)\, H_{ij \to F(M)}(x_a p_A, x_b p_B, M, \mu, \alpha_s(\mu)) \qquad [24]$$

where the functions $H_{ij \to F}$ may be calculated perturbatively, while the $f_{i/A}$ and $f_{j/B}$ parton distributions are known from a combination of lower-energy observation and evolution. In this context, they are said to be “universal,” in that they are the same functions in hadron–hadron collisions as in the electron–hadron collisions of deep-inelastic scattering. In general, the calculation of hard-scattering functions $H_{ij}$ is quite nontrivial beyond lowest order in $\alpha_s$. The exploration of methods to compute higher orders, currently as far as $\alpha_s^2$, has required extraordinary insight into the properties of multidimensional integrals.

The factorization method helped predict the observation of the W and Z bosons of electroweak theory, and the discovery of the top quark. The extension of factorization from deep-inelastic scattering to hadron production is nontrivial; indeed, it only holds in the limit that the velocities, $\beta_i$, of the colliding particles approach the speed of light in the center-of-momentum frame of the produced particle. Corrections to the relation [24] are then at the level of powers of $\beta_i - 1$, which translates into inverse powers of the invariant mass(es) of the produced particle(s) M. Factorizations of this sort do not apply to low-velocity collisions. Arguments for this
result rely on relativistic causality and the uncertainty principle. The creation of the new state happens over timescales of order 1/M. Before that well-defined event, the colliding particles are approaching at nearly the speed of light, and hence cannot affect the distributions of each others' partons. After the new particle is created, the fragments of the hadrons recede from each other, and the subsequent time development, when summed over all possible final states that include the heavy particle, is finite in perturbation theory as a direct result of the unitarity of QCD.

Structure of Hadronic Final States

A wide range of semi-inclusive cross sections are defined by measuring properties of final states that depend only on the flow of energy, and which bring QCD perturbation theory to the threshold of nonperturbative dynamics. Schematically, for a state $N = |k_1 \ldots k_N\rangle$, we define $S(N) = \sum_i s(\Omega_i)\, k_i^0$, where $s(\Omega)$ is some smooth function of directions. We generalize the $e^+e^-$ annihilation case above, and define a cross section in terms of a related, but highly nonlocal, matrix element,

$$\frac{d\sigma(Q)}{dS} \equiv \sigma_0 \int d^4x\, e^{-iQ\cdot x}\, \langle 0|\, J^\mu(0)\, \delta\!\left( \int d^2\Omega\, s(\Omega)\, E(\Omega) - S \right) J_\mu(x)\, |0\rangle \qquad [25]$$

where $\sigma_0$ is a zeroth-order cross section, and where $E$ is an operator at spatial infinity, which measures the energy flow of any state in direction $\Omega$: $E(\Omega)\, |k_1 \ldots k_N\rangle = (1/Q) \sum_i k_i^0\, \delta^2(\Omega - \Omega_i)\, |k_1 \ldots k_N\rangle$. This may seem a little complicated, but like the total annihilation cross section, the only dimensional scale on which it depends is $Q$. The operator $E$ can be defined in a gauge-invariant manner, through the energy–momentum tensor for example, and has a meaning independent of partonic final states. At the same time, this sort of cross section may be implemented easily in perturbation theory, and like the total annihilation cross section, it is infrared safe. To see why, notice that when a massless ($k^2 = 0$) particle decays into two particles of momenta $xk$ and $(1 - x)k$ ($0 \le x \le 1$), the quantity $S$ is unchanged, since the sum of the new energies is the same as the old. This makes the observable $S(N)$ insensitive to processes at low momentum transfer.

For the case of leptonic annihilation, the lowest-order perturbative contribution to energy flow requires no powers of $\alpha_s$, and consists of an oppositely moving quark and antiquark pair. Any measure of energy flow that includes these configurations will dominate over correlations that require $\alpha_s$ corrections. As a result, QCD predicts that in most leptonic annihilation events, energy will flow in two back-to-back collimated sets of particles, known as “jets.” In this way, quarks and gluons are observed clearly, albeit indirectly.

With varying choices of $S$, many properties of jets, such as their distributions in invariant mass, and the probabilities and angular distributions of multijet events, and even the energy dependence of their particle multiplicities, can be computed in QCD. This is in part because hadronization is dominated by the production of light quarks, whose production from the vacuum requires very little momentum transfer. Paradoxically, the very lightness of quarks is a boon to the use of perturbative methods. All these considerations can be extended to hadronic scattering, and jet and other semi-inclusive properties of final states also computed and compared to experiment.

Conclusions

QCD is an extremely broad field, and this article has hardly scratched the surface. The relation of QCD-like theories to supersymmetric and string theories, and implications of the latter for confinement and the computation of higher-order perturbative amplitudes, have been some of the most exciting developments of recent years. As another example, we note that the reduction of the heavy-quark propagator to a nonabelian phase, noted in our discussion of confinement, is related to additional symmetries of heavy quarks in QCD, with many consequences for the analysis of their bound states. Of the bibliography given below, one may mention the four volumes of Shifman (2001, 2002), which communicate in one place a sense of the sweep of work in QCD.

Our confidence in QCD as the correct description of the strong interactions is based on a wide variety of experimental and observational results. At each stage in the discovery, confirmation, and exploration of QCD, the mathematical analysis of relativistic quantum field theory entered new territory. As is the case for gravity or electromagnetism, this period of exploration is far from complete, and perhaps never will be.

See also: AdS/CFT Correspondence; Aharonov–Bohm Effect; BRST Quantization; Current Algebra; Dirac Operator and Dirac Field; Euclidean Field Theory; Effective Field Theories; Electroweak Theory; Lattice Gauge Theory; Operator Product Expansion in Quantum Field Theory; Perturbation Theory and its Techniques; Perturbative Renormalization Theory and BRST; Quantum Field Theory: A Brief Introduction; Random
Matrix Theory in Physics; Renormalization: General Theory; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Scattering, Asymptotic Completeness and Bound States; Seiberg–Witten Theory; Standard Model of Particle Physics.

Further Reading

Bethke S (2004) $\alpha_s$ at Zinnowitz, 2004. Nuclear Physics Proceedings Supplements 135: 345–352.
Brodsky SJ and Lepage P (1989) Exclusive processes in quantum chromodynamics. In: Mueller AH (ed.) Perturbative Quantum Chromodynamics. Singapore: World Scientific.
Collins JC, Soper DE, and Sterman G (1989) Factorization. In: Mueller AH (ed.) Perturbative Quantum Chromodynamics. Singapore: World Scientific.
Dokshitzer Yu L and Kharzeev DE (2004) Gribov's conception of quantum chromodynamics. Annual Review of Nuclear and Particle Science 54: 487–524.
Dokshitzer Yu L, Khoze V, Troian SI, and Mueller AH (1988) QCD coherence in high-energy reactions. Reviews of Modern Physics 60: 373.
Eidelman S et al. (2004) Review of particle physics. Physics Letters B 592: 1–1109.
Ellis RK, Stirling WJ, and Webber BR (1996) QCD and Collider Physics, Cambridge Monographs on Particle Physics, Nuclear Physics and Cosmology, vol. 8. Cambridge: Cambridge University Press.
Greensite J (2003) The confinement problem in lattice gauge theory. Progress in Particle and Nuclear Physics 51: 1.
Mandelstam S (1976) Vortices and quark confinement in nonabelian gauge theories. Physics Reports 23: 245–249.
Muta T (1986) Foundations of Quantum Chromodynamics. Singapore: World Scientific.
Neubert M (1994) Heavy quark symmetry. Physics Reports 245: 259–396.
Polyakov AM (1977) Quark confinement and topology of gauge groups. Nuclear Physics B 120: 429–458.
Schafer T and Shuryak EV (1998) Instantons and QCD. Reviews of Modern Physics 70: 323–426.
Shifman M (ed.) (2001) At the Frontier of Particle Physics: Handbook of QCD, vols. 1–3. River Edge, NJ: World Scientific.
Shifman M (ed.) (2002) At the Frontier of Particle Physics: Handbook of QCD, vol. 4. River Edge, NJ: World Scientific.
Sterman G (1993) An Introduction to Quantum Field Theory. Cambridge: Cambridge University Press.
't Hooft G (1978) On the phase transition towards permanent quark confinement. Nuclear Physics B 138: 1.
't Hooft G (ed.) (2005) Fifty Years of Yang–Mills Theories. Hackensack: World Scientific.
Weinberg S (1977) The problem of mass. Transactions of the New York Academy of Science 38: 185–201.
Wilson KG (1974) Confinement of quarks. Physical Review D 10: 2445–2459.
Quantum Cosmology
M Bojowald, The Pennsylvania State University, University Park, PA, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Classical gravity, through its attractive nature, leads to a high curvature in important situations. In particular, this is realized in the very early universe where in the backward evolution energy densities are growing until the theory breaks down. Mathematically, this point appears as a singularity where curvature and physical quantities diverge and the evolution breaks down. It is not possible to set up an initial-value formulation at this place in order to determine the further evolution.

In such a regime, quantum effects are expected to play an important role and to modify the classical behavior such as the attractive nature of gravity or the underlying spacetime structure. Any candidate for quantum gravity thus allows us to reanalyze the singularity problem in a new light which implies the tests of the characteristic properties of the respective candidate. Moreover, close to the classical singularity, in the very early universe, quantum modifications will give rise to new equations of motion which turn into Einstein's equations only on larger scales. The analysis of these equations of motion leads to new classes of early universe phenomenology.

The application of quantum theory to cosmology presents a unique problem with not only mathematical but also many conceptual and philosophical ramifications. Since by definition there is only one universe which contains everything accessible, there is no place for an outside observer separate from the quantum system. This eliminates the most straightforward interpretations of quantum mechanics and requires more elaborate, and sometimes also more realistic, constructions such as decoherence. From the mathematical point of view, this situation is often expected to be mirrored by a new type of theory which does not allow one to choose initial or boundary conditions separately from the dynamical laws. Initial or boundary conditions, after all, are meant to specify the physical system prepared for observations which is impossible in cosmology. Since we observe only one universe, the expectation goes, our theories should finally present us with only
one, unique solution without any freedom for further conditions. This solution then contains all the information about observations as well as observers. Mathematically, this is an extremely complicated problem which has received only scant attention. Equations of motion for quantum cosmology are usually of the type of partial differential or difference equations such that new ingredients from quantum gravity are needed to restrict the large freedom of solutions.

Minisuperspace Approximation

In most investigations, the problem of applying full quantum gravity to cosmology is simplified by a symmetry reduction to homogeneous or isotropic geometries. Originally, the reduction was performed at the classical level, leaving in the isotropic case only one gravitational degree of freedom given by the scale factor $a$. Together with homogeneous matter fields, such as a scalar $\phi$, there are then only finitely many degrees of freedom which one can quantize using quantum mechanics. The classical Friedmann equation for the evolution of the scale factor, depending on the spatial curvature $k = 0$ or $\pm 1$, is then quantized to the Wheeler–DeWitt equation, commonly written as

$$\left( \frac{1}{9}\, \ell_P^4\, a^{-x} \frac{\partial}{\partial a}\, a^{x} \frac{\partial}{\partial a} - k a^2 \right) \psi(a, \phi) = -\frac{8\pi G}{3}\, a\, \hat H_{\rm matter}(a)\, \psi(a, \phi) \qquad [1]$$

for the wave function $\psi(a, \phi)$. The matter Hamiltonian $\hat H_{\rm matter}(a)$, such as

$$\hat H_{\rm matter}(a) = -\frac{1}{2}\, \hbar^2 a^{-3} \frac{\partial^2}{\partial \phi^2} + a^3 V(\phi) \qquad [2]$$

is left unspecified here, and $x$ parametrizes factor ordering ambiguities (but not completely). The Planck length $\ell_P = \sqrt{8\pi G \hbar}$ is defined in terms of the gravitational constant $G$ and the Planck constant $\hbar$.

The central conceptual issue then is the generality of effects seen in such a symmetric model and its relation to the full theory of quantum gravity. This is completely open in the Wheeler–DeWitt form since the full theory itself is not even known. On the other hand, such relations are necessary to value any potential physical statement about the origin and early history of the universe. In this context, symmetric situations thus present models, and the degree to which they approximate full quantum gravity remains mostly unknown. There are examples, for instance, of isotropic models in anisotropic but still homogeneous models, where a minisuperspace quantization does not agree at all with the information obtained from the less symmetric model. However, often those effects already have a classical analog such as instability of the more symmetric solutions. A wider investigation of the reliability of models and when correction terms from ignored degrees of freedom have to be included has not been done yet.

With candidates for quantum gravity being available, the current situation has changed to some degree. It is then not only possible to reduce classically and then simply use quantum mechanics, but also perform at least some of the reduction steps at the quantum level. The relation to models is then much clearer, and consistency conditions which arise in the full theory can be made certain to be observed. Moreover, relations between models and the full theory can be studied to elucidate the degree of approximation. Even though new techniques are now available, a detailed investigation of the degree of approximation given by a minisuperspace model has not been completed due to its complexity.

This program has mostly been developed in the context of loop quantum gravity, where the specialization to homogeneous models is known as loop quantum cosmology. More specifically, symmetries can be introduced at the level of states and basic operators, where symmetric states of a model are distributions in the full theory, and basic operators are obtained by the dual action on those distributions. In such a way, the basic representation of models is not assumed but derived from the full theory where it is subject to much stronger consistency conditions. This has implications even in homogeneous models with finitely many degrees of freedom, despite the fact that quantum mechanics is usually based on a unique representation if the Weyl operators $e^{isq}$ and $e^{itp}$ for the variables $q$ and $p$ are represented weakly continuously in the real parameters $s$ and $t$.

The continuity condition, however, is not necessary in general, and so inequivalent representations are possible. In quantum cosmology this is indeed realized, where the Wheeler–DeWitt representation assumes that the conjugate to the scale factor, corresponding to extrinsic curvature of an isotropic slice, is represented through a continuous Weyl operator, while the representation derived for loop quantum cosmology shows that the resulting operator is not weakly continuous. Furthermore, the scale factor has a continuous spectrum in the Wheeler–DeWitt representation but a discrete spectrum in the loop representation. Thus, the underlying geometry
of space is very different, and also evolution takes a new form, now given by a difference equation of the type

$$(V_{\mu+5} - V_{\mu+3})\, e^{ik}\, \psi_{\mu+4}(\phi) - (2 + k^2)(V_{\mu+1} - V_{\mu-1})\, \psi_\mu(\phi) + (V_{\mu-3} - V_{\mu-5})\, e^{-ik}\, \psi_{\mu-4}(\phi) = -\frac{4}{3}\, \pi G\, \ell_P^2\, \hat H_{\rm matter}(\mu)\, \psi_\mu(\phi) \qquad [3]$$

in terms of volume eigenvalues $V_\mu = (\ell_P^2 |\mu|/6)^{3/2}$. For large $\mu$ and smooth wave functions, one can see that the difference equation reduces to the Wheeler–DeWitt equation with $|\mu| \propto a^2$ to leading order in derivatives of $\psi$. At small $\mu$, close to the classical singularity, however, both equations have very different properties and lead to different conclusions. Moreover, the prominent role of difference equations leads to new mathematical problems.

This difference equation is not simply obtained through a discretization of [1], but derived from a constraint operator constructed with methods from full loop quantum gravity. It is, thus, to be regarded as more fundamental, with [1] emerging in a continuum limit. The structure of [3] depends on the properties of the full theory such that its qualitative analysis allows conclusions for full quantum gravity.

Applications

Traditionally, quantum cosmology has focused on three main conceptual issues:

1. the fate of classical singularities,
2. initial conditions and the “prediction” of inflation (or other early universe scenarios), and
3. arrow of time and the emergence of a classical world.

The first issue consists of several subproblems since there are different aspects to a classical singularity. Often, curvature or energy densities diverge and one can expect quantum gravity to provide a natural cutoff. More importantly, however, the classical evolution breaks down at a singularity, and quantum gravity, if it is to cure the singularity problem, has to provide a well-defined evolution which does not stop. Initial conditions are often seen in relation to the singularity problem since early attempts tried to replace the singularity by choosing appropriate conditions for the wave function at $a = 0$. Different proposals then lead to different solutions for the wave function, whose dependence on the scalar $\phi$ can be used to determine its probability distribution such as that for an inflaton. Since initial conditions often provide special properties early on, the combination of evolution and initial conditions has been used to find a possible origin of an arrow of time.

Singularities

While classical gravity is based on spacetime geometry and thus metric tensors, this structure is viewed as emergent only at large scales in canonical quantum gravity. A gravitational system, such as a whole universe, is instead described by a wave function which, at best, yields expectation values for a metric. The singularity problem thus takes a different form since it is not metrics which need to be continued as solutions to Einstein's field equations but the wave function describing the quantum system. In the strong curvature regime around a classical singularity, one does not expect classical geometry to be applicable, such that classical singularities may just be a reflection of the breakdown of this picture, rather than a breakdown of physical evolution. Nevertheless, the basic feature of a singularity as presenting a boundary to the evolution of a system equally applies to the quantum equations. One can thus analyze this issue, using new properties provided by the quantum evolution.

The singularity issue is not resolved in the Wheeler–DeWitt formulation since energy densities, with $a$ being a multiplication operator, diverge and the evolution does not continue anywhere beyond the classical singularity at $a = 0$. In some cases one can formally extend the evolution to negative $a$, but this possibility is not generic and leaves open what negative $a$ means geometrically. This is different in the loop quantization: here, the theory is based on triad rather than metric variables. There is thus a new sign factor corresponding to spatial orientation, which implies the possibility of negative $\mu$ in the difference equation. The equation is then defined on the full real line with the classical singularity $\mu = 0$ in the interior. Outside $\mu = 0$, we have positive volume at both sides, and opposite orientations. Using the difference equation, one can then see that the evolution does not break down at $\mu = 0$, showing that the quantum evolution is singularity free.

For the example [3] shown here, one can follow the evolution, for instance, backward in internal time $\mu$, starting from initial values for $\psi_\mu$ at large positive $\mu$. By successively solving for $\psi_{\mu-4}$, the wave function at lower $\mu$ is determined. This goes on in this manner only until the coefficient $V_{\mu-3} - V_{\mu-5}$ of $\psi_{\mu-4}$ vanishes, which is the case if and only if $\mu = 4$.
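The backward recurrence just described can be sketched numerically. The following is only an illustrative sketch under stated assumptions, not the article's own computation: it takes the vacuum case (right-hand side of [3] set to zero), units with $\ell_P = 1$, and a lattice of $\mu$ values that are multiples of 4. The function names `volume`, `coefficients`, `step_back` and `evolve_back` are hypothetical.

```python
import cmath

L_P = 1.0  # Planck length in arbitrary units (assumption for illustration)


def volume(mu):
    """Volume eigenvalue V_mu = (l_P^2 |mu| / 6)^(3/2)."""
    return (L_P**2 * abs(mu) / 6.0) ** 1.5


def coefficients(mu, k=0.0):
    """Coefficients multiplying psi_{mu+4}, psi_mu, psi_{mu-4} in equation [3]."""
    c_plus = (volume(mu + 5) - volume(mu + 3)) * cmath.exp(1j * k)
    c_zero = -(2.0 + k**2) * (volume(mu + 1) - volume(mu - 1))
    c_minus = (volume(mu - 3) - volume(mu - 5)) * cmath.exp(-1j * k)
    return c_plus, c_zero, c_minus


def step_back(mu, psi_up, psi_mid, k=0.0):
    """Solve the vacuum equation at mu for psi_{mu-4}.

    Returns None when the coefficient of psi_{mu-4} vanishes, i.e. when
    the equation no longer determines that value.
    """
    c_plus, c_zero, c_minus = coefficients(mu, k)
    if c_minus == 0:
        return None
    return -(c_plus * psi_up + c_zero * psi_mid) / c_minus


def evolve_back(psi_top, psi_next, mu_top, k=0.0):
    """Backward evolution from (psi_{mu_top}, psi_{mu_top-4}) down to psi_4.

    mu_top is assumed to be a multiple of 4 (hypothetical lattice choice).
    """
    psi = {mu_top: psi_top, mu_top - 4: psi_next}
    mu = mu_top - 4
    while mu > 4:
        psi[mu - 4] = step_back(mu, psi[mu + 4], psi[mu], k)
        mu -= 4
    return psi


# Arbitrary initial values at mu = 24 and mu = 20, evolved down to mu = 4.
psi = evolve_back(1.0, 0.9, mu_top=24)
print(sorted(psi))                    # lattice points reached by the recurrence
print(coefficients(4)[2])             # coefficient V_1 - V_{-1} of psi_0
print(step_back(4, psi[8], psi[4]))   # None: the mu = 4 equation does not fix psi_0
```

Because $V_\mu$ depends only on $|\mu|$, the difference $V_1 - V_{-1}$ is exactly zero in this sketch, so the check `c_minus == 0` is safe here; with a different volume spectrum a tolerance would be needed.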
The value $\psi_0$ of the wave function exactly at the classical singularity is thus not determined by initial data, but one can easily see that it completely drops out of the evolution. In fact, the wave function at all negative $\mu$ is uniquely determined by initial values at positive $\mu$. Equation [3] corresponds to one particular ordering, which in the Wheeler–DeWitt case is usually parametrized by the parameter $x$ (although the particular ordering obtained from the continuum limit of [3] is not contained in the special family [1]). Other nonsingular orderings exist, such as that after symmetrizing the constraint operator, in which case the coefficients never become 0.

In more complicated systems, this behavior is highly nontrivial but still known to be realized in a similar manner. It is not automatic that the internal time evolution continues, since even in isotropic models one can easily write difference equations for which the evolution breaks down. That the most natural orderings imply nonsingular evolution can be taken as a support of the general framework of loop quantum gravity. It should also be noted that the mechanism described here, providing essentially a new region beyond a classical singularity, presents one mechanism for quantum gravity to remove classical singularities, and so far the only known one. Nevertheless, there is no claim that the ingredients have to be realized in any nonsingular scenario in the same manner. Different scenarios can be imagined, depending on how quantum evolution is understood and what the interpretation of nonsingular behavior is. It is also not claimed that the new region is semiclassical in any sense when one looks at it at large volume. If the initial values for the wave function describe a semiclassical wave packet, its evolution beyond the classical singularity can be deformed and develop many peaks. What this means for the re-emergence of a semiclassical spacetime has to be investigated in particular models, and also in the context of decoherence.

Initial Conditions

Traditional initial conditions in quantum cosmology have been introduced by physical intuition. The main mathematical problem, once such a condition is specified in sufficient detail, then is to study well-posedness, for instance, for the Wheeler–DeWitt equation. Even formulating initial conditions generally, and not just for isotropic models, is complicated, and systematic investigations of the well-posedness have rarely been undertaken. An exception is the historically first such condition, due to DeWitt, that the wave function vanishes at parts of minisuperspace, such as $a = 0$ in the isotropic case, corresponding to classical singularities. This condition, unfortunately, can easily be seen to be ill posed in anisotropic models where in general the only solution vanishes identically. In other models, $\lim_{a \to 0} \psi(a)$ does not even exist. Similar problems of the generality of conditions arise in other scenarios. Most well known are the no-boundary and tunneling proposals where initial conditions are still imposed at $a = 0$, but with a nonvanishing wave function there.

This issue is quite different for difference equations since at first the setup is less restrictive: there are no continuity or differentiability conditions for a solution. Moreover, oscillations that become arbitrarily rapid, which can be responsible for the nonexistence of $\lim_{a \to 0} \psi(a)$, cannot be supported on a discrete lattice. It can then easily happen that a difference equation is well posed, while its continuum limit with an analogous initial condition is ill posed. One example is the dynamical initial conditions of loop quantum cosmology which arise from the dynamical law in the following way: the coefficients in [3] are not always nonzero but vanish if and only if they are multiplied with the value of the wave function at the classical singularity $\mu = 0$. This value thus decouples and plays no role in the evolution. The instance of the difference equation that would determine $\psi_0$, for example, the equation for $\mu = 4$ in the backward evolution, instead implies a condition on the previous two values, $\psi_4$ and $\psi_8$, in the example. Since they have already been determined in previous iteration steps, this translates to a linear condition on the initial values chosen. We thus have one example where indeed initial conditions and the evolution follow from only one dynamical law, which also extends to anisotropic models. Without further conditions, the initial-value problem is always well posed, but may not be complete, in the sense that it results in a unique solution up to norm. Most of the solutions, however, will be rapidly oscillating. In order to guarantee the existence of a continuum approximation, one has to add a condition that these oscillations are suppressed in large volume regimes. Such a condition can be very restrictive, such that the issue of well-posedness appears in a new guise: nonzero solutions do exist, but in some cases all of them may be too strongly oscillating.

In simple cases, one can use generating function techniques advantageously to study oscillating solutions, at least if oscillations are of alternating nature between two subsequent levels of the difference equation. The idea is that a generating function $G(x) = \sum_n \psi_n x^n$ has a stronger pole at $x = -1$ if $\psi_n$
is alternating compared to a solution of constant sign. Choosing initial conditions which reduce the pole order thus implies solutions with suppressed oscillations. As an example, we can look at the difference equation

$$\psi_{n+1} + \frac{2}{n}\, \psi_n - \psi_{n-1} = 0 \qquad [4]$$

whose generating function is

$$G(x) = \frac{\psi_1 x + \psi_0 \left( 1 + 2x (1 - \log(1 - x)) \right)}{(1 + x)^2} \qquad [5]$$

The pole at $x = -1$ is removed for initial values $\psi_1 = \psi_0 (2 \log 2 - 1)$ which corresponds to nonoscillating solutions. In this way, analytical expressions can be used instead of numerical attempts which would be sensitive to rounding errors. Similarly, the issue of finding bounded solutions can be studied by continued fraction methods. This illustrates how an underlying discrete structure leads to new questions and the application of new techniques compared to the analysis of partial differential equations which appear more commonly.

More General Models

Most of the time, homogeneous models have been studied in quantum cosmology since even formulating the Wheeler–DeWitt equation in inhomogeneous cases, the so-called midisuperspace models, is complicated. Of particular interest among homogeneous models is the Bianchi IX model since it has a complicated classical dynamics of chaotic behavior. Moreover, through the Belinskii–Khalatnikov–Lifshitz (BKL) picture, the Bianchi IX mixmaster behavior is expected to play an important role even for general inhomogeneous singularities. The classical chaos then indicates a very complicated approach to classical singularities, with structure on arbitrarily small scales.

complicated ways. Quantization can thus be performed, but transforming back to the metric at the operator level and drawing conclusions is quite involved. The main issue of interest in the recent literature has been the investigation of field theory aspects of quantum gravity in a tractable model. In particular, it turns out that self-adjoint Hamiltonians, and thus unitary evolution, do not exist in general.

Loop quantizations of inhomogeneous models are available even in cases where a reformulation such as a field theory on flat space does not exist, or is not being made use of to avoid special gauges. This is quite valuable in order to see if specific features exploited in reformulations lead to artifacts in the results. So far, the dynamics has not been investigated in detail, even though conclusions for the singularity issue can already be drawn.

From a physical perspective, it is most important to introduce inhomogeneities at a perturbative level in order to study implications for cosmological structure formation. On a homogeneous background, one can perform a mode decomposition of metric and matter fields and quantize the homogeneous modes as well as amplitudes of higher modes. Alternatively, one can first quantize the inhomogeneous system and then introduce the mode decomposition at the quantum level. This gives rise to a system of infinitely many coupled equations of infinitely many variables, which needs to be truncated, for example, for numerical investigations. At this level, one can then study the question to which degree a given minisuperspace model presents a good approximation to the full theory, and where additional correction terms should be introduced. It also allows one to develop concrete models of decoherence, which requires a “bath” of many weakly interacting degrees of freedom usually thought of as being provided by inhomogeneities in cosmology, and an understanding of the semiclassical limit.

On the other hand, the classical chaos relies on a
curvature potential with infinitely high walls, which
can be mapped to a chaotic billiard motion. The
Interpretations
walls arise from the classical divergence of curva-
ture, and so quantum effects have been expected to Due to the complexity of full gravity, investigations
change the picture, and shown to do so in several without symmetry assumptions or perturbative
cases. approximations usually focus on conceptual issues.
Inhomogeneous models (e.g., the polarized As already discussed, cosmology presents a unique
Gowdy models) have mostly been studied in cases situation for physics since there cannot be any
where one can reformulate the problem as that of a outside observer. While this fact has already
massless free scalar on flat Minkowski space. The implications on the interpretation of observations
scalar can then be quantized with familiar techni- at the classical level, its full force is noticed only in
ques in a Fock space representation, and is related to quantum cosmology. Since some traditional inter-
metric components of the original model in rather pretations of quantum mechanics require the role of
observers outside the quantum system, they do not apply to quantum cosmology.

Sometimes, alternative interpretations such as Bohm theory or many-world
scenarios are championed in this situation, but more conventional relational
pictures are most widely adopted. In such an interpretation, the wave function
yields relational probabilities between degrees of freedom rather than absolute
probabilities for measurements done by an outside observer. This has been used,
for instance, to determine the probability of the right initial conditions for
inflation, but it is marred by unresolved interpretational issues and still
disputed. These problems can be avoided by using effective equations, in
analogy to an effective action, which modify classical equations on small
scales. Since the new equations are still of classical type, that is,
differential equations in coordinate time, no interpretational issues arise at
least if one stays in semiclassical regimes. In this manner, new inflationary
scenarios motivated from quantum cosmology have been developed.

In general, a relational interpretation, though preferable conceptually, leads
to technical complications since the situation is much more involved and
evolution is not easy to disentangle. In cosmology, one often tries to single
out one degree of freedom as internal time with respect to which evolution of
other degrees of freedom is measured. In homogeneous models, one can simply
take the volume as internal time, such as a or earlier, but in full no
candidate is known. Even in homogeneous models, the volume is not suitable as
internal time to describe a possible recollapse. One can use extrinsic
curvature around such a point, but then one has to understand what changing the
internal time in quantum cosmology implies, that is, whether evolution pictures
obtained in different internal time formulations are equivalent to each other.

There are thus many open issues at different levels, which, strictly speaking,
do not apply only to quantum cosmology but to all of physics. After all, every
physical system is part of the universe, and thus a potential ingredient of
quantum cosmology. Obviously, physics works well in most situations without
taking into account its being part of one universe. Similarly, much can be
learned about a quantum universe if only some degrees of freedom of gravity are
considered as in mini- or midisuperspace models. In addition, complicated
interpretational issues, as important as they are for a deep understanding of
quantum physics, do not prevent the development of physical applications in
quantum cosmology, just as they did not do so in the early stages of quantum
mechanics.

See also: Canonical General Relativity; Cosmology: Mathematical Aspects; Loop
Quantum Gravity; Quantum Geometry and its Applications; Spacetime Topology,
Causal Structure and Singularities; Wheeler–De Witt Theory.

Further Reading

Bojowald M (2001a) Absence of a singularity in loop quantum cosmology. Physical
Review Letters 86: 5227–5230 (gr-qc/0102069).
Bojowald M (2001b) Dynamical initial conditions in quantum cosmology. Physical
Review Letters 87: 121301 (gr-qc/0104072).
Bojowald M (2003) Initial conditions for a universe. General Relativity and
Gravitation 35: 1877–1883 (gr-qc/0305069).
Bojowald M (2005) Loop quantum cosmology. Living Reviews in Relativity (to
appear).
Bojowald M and Morales-Técotl HA (2004) Cosmological applications of loop
quantum gravity. In: Proceedings of the Fifth Mexican School (DGFM): The Early
Universe and Observational Cosmology, Lecture Notes in Physics, vol. 646,
pp. 421–462. Berlin: Springer (gr-qc/0306008).
DeWitt BS (1967) Quantum theory of gravity. I. The canonical theory. Physical
Review 160: 1113–1148.
Giulini D, Kiefer C, Joos E, Kupsch J, Stamatescu IO et al. (1996) Decoherence
and the Appearance of a Classical World in Quantum Theory. Berlin: Springer.
Hartle JB (2003) What connects different interpretations of quantum mechanics?
quant-ph/0305089.
Hartle JB and Hawking SW (1983) Wave function of the universe. Physical Review
D 28: 2960–2975.
Kiefer C (2005) Quantum cosmology and the arrow of time. In: Proceedings of the
Conference DICE2004, Piombino, Italy, September 2004 (gr-qc/0502016).
Kuchař KV (1992) Time and interpretations of quantum gravity. In: Kunstatter G,
Vincent DE, and Williams JG (eds.) Proceedings of the 4th Canadian Conference
on General Relativity and Relativistic Astrophysics. Singapore: World
Scientific.
McCabe G (2005) The structure and interpretation of cosmology: Part II. The
concept of creation in inflation and quantum cosmology. Studies in History and
Philosophy of Modern Physics 36: 67–102 (gr-qc/0503029).
Vilenkin A (1984) Quantum creation of universes. Physical Review D 30: 509–511.
Wiltshire DL (1996) An introduction to quantum cosmology. In: Robson B,
Visvanathan N, and Woolcock WS (eds.) Cosmology: The Physics of the Universe,
pp. 473–531. Singapore: World Scientific (gr-qc/0101003).
Quantum Dynamical Semigroups 159
Damped and pumped harmonic oscillator The with a single-particle Hamiltonian H1 and a damp-
quantum master equation for a linearly damped ing (pumping) positive operator # (" ) 0. The
and pumped harmonic oscillator with frequency ! operators H1 , # , and " need not be bounded
and the damping (pumping) coefficient # (" ) has provided iH1 (1=2){(# ()" ) generates a
form (contracting in the fermionic case) semigroup
{T(t); t 0} on H1 and the formal solution of
d #
¼ i!½a a; þ ð½a; a þ ½a; a Þ eqn [24]
dt 2
" ðtÞ ¼ TðtÞð0ÞT ðtÞ þ QðtÞ
þ ð½a ; a þ ½a ; aÞ ½19 Z t
2
where a , a are creation and annihilation operators where QðtÞ ¼ TðsÞ" T ðsÞds ½25
0
satisfying [a, a ] = 1. Taking diagonal elements
pn = <n, n> in the ‘‘particle number’’ basis is meaningful. We can now define the quasifree
a ajn >= njn >, n = 0, 1, 2, . . . , which evolve inde- dynamical semigroup for the many-particle system
pendently of the off-diagonal elements, one obtains described by the Fock space F (H1 ) (Alicki and
the birth and death process, Lendi 1987, Alicki and Fannes 2001). The simplest
definition involves Heisenberg evolution of the
dpn ordered monomials in a ( j ) and a(j ):
¼ # ðn þ 1Þpnþ1 þ " npn1
dt
# n þ " ðn þ 1Þ pn ½20 ðtÞa ð 1 Þ a ð m Það1 Þ aðn Þ
X
¼ Det < jk ; QðtÞil > k;l¼1;2;...;r
It is convenient to use the Heisenberg picture and
P
find an explicit solutionpinffiffiffi terms of Weyl unitary
operators W(z) = exp[(i= 2)(za þ za )], a ðT ðtÞ
1 Þ a ðT ðtÞ
mr Þ
a T ðtÞ1 a T ðtÞnr ½26
ðtÞWðzÞ
( )
The sum is taken over all partitions {(j1 , . . . , jr )
jzj2 #
¼ exp 1 eð# " Þt WðzðtÞÞ ½21 (
1 , . . . ,
mr )}, {(i1 , . . . , ir )(1 , . . . , nr } such that
4 # "
j1 < j2 < < jr ,
1 <
2 , < <
mr , i1 < i2 <
where z(t)= exp{(i! þ 12 (# " ))t}, t 0. For # > " < rr , 1 < 2 <
nr ; þ 1, is a product of
the solution of eqn [19] always tends to the stationary signatures of the permutations {1, 2, . . . , m} 7!
Gibbs state {j1 , . . . , jr ,
1 , . . . ,
mr }, {1, 2, . . . , n} 7! {i1 , . . . , ir , 1 , . . . ,
nr }; a permanent Detþ is taken for bosons, a
¼ Z1 e!a a ; Z ¼ tre!a a
determinant Det for fermions.
1 ½22 Introducing an orthonormal basis {ek } in H1 and
¼ lnð# =" Þ using the notation a (ek ) ak , we can write a
!
formal master equation for density matrices on the
Quasifree semigroups The previous example is the Fock space corresponding to eqn [26]:
simplest instance of the dynamical semigroups for d 1 X kl
noninteracting bosons and fermions which are ¼ i½HF ; þ # ½ak ; al
dt 2 k;l
completely determined on the single-particle level.
Such systems are defined by a single-particle Hilbert
þ ½ak ; al þ kl " ½a k ; a l þ ½a
k ; a l ½27
space H1 and a linear map H1 3 7! a () into
creation operators satisfying canonical commutation Again, formally,
or anticommutation relations (CCRs or CARs, X
respectively) for bosons and fermions, respectively HF ¼ < ek ; H1 el > ak al
k;l ½28
½að Þ; a ðÞ ¼ < ; >
½23 kl
# ¼ < ek ; # el >; kl
" ¼ < e k ; " e l >
½A; B AB ð1ÞBA
Often the formulas [27], [28] are not well-
In all expressions containing (), sign (þ) refers to
defined, but replacing the (infinite) matrices by
bosons and () to fermions.
(distribution-valued) integral kernels, sums by inte-
Consider a nonhomogeneous evolution equation
grals, and ak , al by quantum fields, we can obtain
on the trace-class operators 2 T (H1 ):
meaningful objects.
d 1 Quasifree dynamical semigroups find applications
¼ i½H1 ; fð# ðÞ" Þ; g þ " ½24
dt 2 in the theory of unstable particles, quantum linear
optics, solid-state physics, quantum information theory, etc. (Alicki and Lendi
1987, Sewell 2002).

Ergodic Properties

Dynamical semigroups which possess stationary states satisfying
$L\rho_0 = 0$ are of particular interest, for example, in the description of
relaxation processes toward equilibrium states (Frigerio 1977, Spohn 1980,
Alicki and Lendi 1987). The dynamical semigroup $\{\Lambda(t)\}$ with a
stationary state $\rho_0$ is called ergodic if

$$\lim_{t \to \infty} \Lambda(t)\rho = \rho_0, \quad \text{for any initial } \rho \qquad [29]$$

For the case of finite-dimensional $\mathcal{H}$ at least one stationary state
always exists. If, moreover, it is strictly positive, $\rho_0 > 0$, then we
have the following sufficient condition of ergodicity:

$$\{V_j;\ j \in I\}' \equiv \{A;\ A \in B(\mathcal{H}),\ [A, V_j] = 0,\ j \in I\} = \mathbb{C}1 \qquad [30]$$

Open systems interacting with heat baths at the temperature $T$ are described
by the semigroups with generators [4] of the special form

$$\frac{d}{dt}\rho(t) = -i[H, \rho(t)] + \frac{1}{2} \sum_{\omega_j \ge 0} \Bigl\{ [V_j, \rho(t) V_j^\dagger] + [V_j \rho(t), V_j^\dagger] + e^{-\beta\omega_j} \bigl( [V_j^\dagger, \rho(t) V_j] + [V_j^\dagger \rho(t), V_j] \bigr) \Bigr\} \qquad [31]$$

where

$$\beta = \frac{1}{k_B T}, \qquad [H, V_j] = -\omega_j V_j \qquad [32]$$

The Gibbs state $\rho_\beta = Z^{-1} e^{-\beta H}$ is a stationary state for
eqn [31] and the condition $\{V_j, V_j^\dagger;\ j \in I\}' = \mathbb{C}1$
implies ergodicity (return to equilibrium). Moreover, the matrix elements of
$\rho$ diagonal in H-eigenbasis transform independently of the off-diagonal
ones and satisfy the Pauli master equation

unbounded one under technical conditions concerning domains). This allows
spectral decomposition of $L$ and a proper definition of damping rates for the
obtained eigenvectors. The normality condition is one of the possible
definitions of quantum detailed balance. The other, based on the time-reversal
operation, often coincides with the previous one for important examples.

Interesting examples of nonergodic dynamical semigroups are given for open
systems consisting of $N$ identical particles with Hamiltonians $H^{(N)}$ and
operators $V_j^{(N)}$ invariant with respect to particle permutations. Then the
commutant $\{H^{(N)}, V_j^{(N)},\ j \in I\}'$ contains an abelian algebra
generated by projections on irreducible tensors corresponding to Young tables.

From Hamiltonian Dynamics to Semigroups

One of the main tasks in the quantum theory of open systems is to derive master
equations [4] from the model of a ‘‘small’’ open system $S$ interacting with a
‘‘large’’ reservoir $R$ at a certain reference state $\omega_R$ (Davies 1976,
Spohn 1980, Alicki and Lendi 1987, Breuer and Petruccione 2002, Garbaczewski
and Olkiewicz 2002). Starting with the total Hamiltonian
$H_\lambda = H_S \otimes 1_R + 1_S \otimes H_R + \lambda \sum_\alpha S_\alpha \otimes R_\alpha$,
where $S_\alpha = S_\alpha^\dagger$, $R_\alpha = R_\alpha^\dagger$,
$\mathrm{tr}(\omega_R R_\alpha) = 0$, and $\lambda$ is a coupling constant, we
define the reduced dynamics of $S$ by

$$\rho(t) = \Lambda^{(\lambda)}(t)\rho = \mathrm{tr}_R \bigl( U_\lambda(t)\, \rho \otimes \omega_R\, U_\lambda(t)^\dagger \bigr) \qquad [34]$$

with $U_\lambda(t) = \exp(-itH_\lambda)$. Here $\mathrm{tr}_R$ denotes a
partial trace over $R$ defined in terms of an arbitrary basis $\{e_k\}$ of $R$
by the formula
$\langle \phi, (\mathrm{tr}_R A)\psi \rangle = \sum_k \langle \phi \otimes e_k, A\, \psi \otimes e_k \rangle$.
Generally,
$\Lambda^{(\lambda)}(t + s) \ne \Lambda^{(\lambda)}(t)\Lambda^{(\lambda)}(s)$,
but dynamical semigroups can provide good approximations in important cases.

Weak-Coupling Limit
P P
where H = HS þ 2
!2Sp P K
(!)V! V! is a completely positive map , that is, S(j) S(j).
renormalized Hamiltonian, !2Sp denotes the Psum Hence, for the quantum dynamical semigroup (t) with
over eigenfrequencies of [H, ], eitH S
eitH = !2Sp the stationary state 0 we obtain the following relation
V!
ei!t and for the von Neumann entropy S() = tr( ln ):
Z 1
d d d
ei!t tr !R eitHR R
eitHR R dt SððtÞÞ ¼ SððtÞj0 Þ trððtÞ ln 0 Þ ½38
0 dt dt dt
¼ 12 C
ð!Þ þ iK
ð!Þ ½36 where (d=dt)S((t) j 0 ) 0 is an entropy produc-
tion and the second term describes entropy exchange
The rigorous derivation involves van Hove or weak
with environment (Spohn 1980, Alicki and Lendi
coupling limit, ! 0, with = 2 t kept fixed.
1987).
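The entropy balance [38] can be illustrated numerically in a deliberately simple setting. The sketch below is restricted to states that are diagonal in the energy eigenbasis of a two-level system, so that the density matrix reduces to a probability vector, the relative entropy to the classical Kullback–Leibler divergence, and the semigroup to a two-state rate (Pauli-type) equation. The rates `a_down` and `a_up`, the initial population 0.9, and all function names are illustrative assumptions, not quantities taken from the text.

```python
import math

def populations(p_excited0, a_down, a_up, t):
    """Exact solution of the assumed two-state rate equation
    dp1/dt = a_up * p0 - a_down * p1, with p0 + p1 = 1."""
    total = a_down + a_up
    p1_eq = a_up / total                      # stationary excited population
    p1 = p1_eq + (p_excited0 - p1_eq) * math.exp(-total * t)
    return [1.0 - p1, p1]

def entropy(p):
    """Von Neumann entropy of a diagonal state: S = -sum p ln p."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def relative_entropy(p, q):
    """Relative entropy of diagonal states: S(p|q) = sum (p ln p - p ln q)."""
    return sum(x * math.log(x / y) for x, y in zip(p, q) if x > 0.0)

a_down, a_up = 1.0, 0.25                      # hypothetical transition rates
stationary = [a_down / (a_down + a_up), a_up / (a_down + a_up)]
times = [0.2 * k for k in range(11)]
states = [populations(0.9, a_down, a_up, t) for t in times]

# Relative entropy with respect to the stationary state along the trajectory.
rel = [relative_entropy(p, stationary) for p in states]

for t, r in zip(times, rel):
    print(f"t = {t:.1f}   S(rho(t)|rho_0) = {r:.6f}")
```

Along the trajectory the relative entropy to the stationary state decreases, so its negative time derivative (the entropy production in [38]) is non-negative. The decomposition behind [38], S(ρ) = −S(ρ|ρ0) − tr(ρ ln ρ0), follows directly from the definition of the relative entropy and holds for each sampled state.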
It follows from the Bochner theorem that the
Bistochastic dynamical semigroups preserve the
matrix [C
(!)] is positively defined and therefore
maximally mixed state, that is, L(1) = 0. For them,
by its diagonalization we can convert eqn [35]
the von Neumann entropy does not decrease and the
into the standard form [4]. If the reservoir’s state
purity tr 2 never increases (Streater 1995). Two
!R is an equilibrium state (Kubo–Martin–Schwinger
important classes of master equations, used to
state) then C
(!) = e!=kB T C
(!) and therefore
describe decoherence, yield bistochastic dynamical
eqn [35] can be written in a form [31]. Moreover,
semigroups:
transition probabilities akl from eqn [33] coincide
with those obtained using the ‘‘Fermi golden d
rule.’’ ðtÞ ¼ i½H; ðtÞ
dt X
½Aj ; ½Aj ; ðtÞ; Aj ¼ Aj ½39
Low-Density Limit j
If the reservoir can be modeled by a gas of
noninteracting particles (bosons or fermions) at
low density , we can derive the following master d
ðtÞ ¼ i½H; ðtÞ
equation which approximates an exact dynamics dt Z
[34] in the low-density limit ( ! 0, with = t kept
þ ðd
ÞðUð
ÞðtÞU ð
Þ ðtÞÞ ½40
fixed) M
d XZ where U(
) are unitary and () is a (positive)
ðtÞ ¼ i½H; ðtÞ þ
d3 pd3 p0 GðpÞ
dt R 6 measure on M.
!2S
Ep0 Ep þ ! ð½T! ðp; p0 Þ; ðtÞT! ðp; p0 Þ
þ ½T! ðp; p0 ÞðtÞ; T! ðp; p0 Þ Þ ½37 Itô–Schrödinger Equations
Up to technical problems in the case of unbounded
Here H is a renormalized
P Hamiltonian of the system operators, the master equation [4] is completely
S, eitH TeitH = !2S T! ei!t , T is a T-matrix equivalent to the following stochastic differential
describing the scattering process involving S and a equation (in Itô form):
single particle, T = Vþ , where V is a particle-
system potential and þ is a Møller operator. 1X
d ðtÞ ¼ iH ðtÞ dt V Vj ðtÞ dt
T! (p, p0 ) denotes the integral kernel corresponding 2 j2I j
to T! expressed in terms of momenta of the bath X
particle, Ep the kinetic energy of a particle, and G(p) i Vj ðtÞdXj ðtÞ ½41
its probability distribution in the momentum space. j2I
If G(p) exp(Ep =kB T) and microreversibility con- where Xj (t) are arbitrary statistically independent
ditions, Ep = Ep and T! (p, p0 ) = T! (p0 , p), hold, stochastic processes with independent increments
then eqn [37] satisfies the quantum detailed-balance (continuous or jump processes) such that the
condition with the stationary Gibbs state expectation E(dXj (t) dXk (t)) = jk dt. Equation [41]
, = 1=kB T. should be understood as an integral equation
involving stochastic Itô integrals with respect to
{Xj (t)} computed according to the Itô rule:
Entropy and Purity
dXj (t) dXk (t) = jk dt. Taking the average (t) =
The relative entropy S( j ) = tr( ln ln ) is E(j (t) > < (t)j) one can show, using the Itô rule,
monotone with respect to any trace-preserving that (t) satisfies eqn [4]. For numerical
applications, it is convenient to use the nonlinear von Neumann algebras, the most difficult problem
version of eqn [41] for the normalized stochastic of constructing physically relevant semigroups
vector (t) = (t)=k (t)k, which can be easily for generic infinite systems remains unsolved
derived from eqn [41] (Breuer and Petruccione (Majewski and Zegarliński 1996, Garbaczewski
2002). and Olkiewicz 2002).
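A minimal Monte Carlo sketch of the Itô–Schrödinger equation [41] may help to see how averaging over trajectories recovers the master equation. It assumes a single qubit with H = 0, one operator V = √γ σ_z, Wiener noise for X(t), and that eqn [4] has the standard form dρ/dt = −i[H, ρ] + Σ_j (V_j ρ V_j† − ½{V_j†V_j, ρ}), which for this choice predicts that the off-diagonal element of ρ decays as e^{−2γt}. The Euler–Maruyama discretization, the parameter values, and the function names are assumptions made for illustration.

```python
import math
import random

def mean_coherence(gamma=1.0, t_final=0.5, steps=200, trajectories=2000, seed=1):
    """Average of psi_up * conj(psi_down) over simulated noise paths."""
    rng = random.Random(seed)
    dt = t_final / steps
    amp = math.sqrt(gamma)
    total = 0j
    for _ in range(trajectories):
        up = down = complex(1.0 / math.sqrt(2.0))   # initial state (|0> + |1>)/sqrt(2)
        for _ in range(steps):
            dx = rng.gauss(0.0, math.sqrt(dt))      # Ito increment with E[dx^2] = dt
            # d psi = -(1/2) V^dagger V psi dt - i V psi dX, with V = sqrt(gamma) * sigma_z
            up += (-0.5 * gamma * dt - 1j * amp * dx) * up
            down += (-0.5 * gamma * dt + 1j * amp * dx) * down
        total += up * down.conjugate()
    return total / trajectories

def master_equation_coherence(gamma=1.0, t_final=0.5):
    """Off-diagonal element rho_01(t) = rho_01(0) * exp(-2 gamma t) for pure dephasing."""
    return 0.5 * math.exp(-2.0 * gamma * t_final)

estimate = mean_coherence()
exact = master_equation_coherence()
print("trajectory average:", estimate)
print("master equation   :", exact)
```

The two numbers agree only up to the statistical error of the finite sample and the discretization error of the Euler–Maruyama step; increasing `trajectories` and `steps` tightens the agreement. For long simulation times the normalized (nonlinear) version mentioned in the text is usually preferable, since the norm of each unnormalized trajectory fluctuates.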
Introducing quantum noises, for example, quan-
tum Brownian motions defined in terms of bosonic
or fermionic fields and satisfying suitable quantum
Nonlinear Dynamical Semigroups
Itô rules one can develop the theory of noncommu-
tative stochastic differential equations (NSDE) The reduced description of many-body classical or
(Hudson and Parthasarathy 1984). Both, eqn [41] quantum systems in terms of single-particle states
and NSDE, provide examples of unitary dilations – (probability distributions, wave functions, or density
(physically singular) mathematical constructions of matrices) leads to nonlinear dynamics (e.g., Boltz-
the environment R and the R–S coupling which mann, Vlasov, Hartree, or Hartree–Fock equations)
exactly reproduce dynamical semigroups as reduced (Spohn 1980, Garbaczewski and Olkiewicz 2002). A
dynamics [34]. large class of nonlinear evolution equations for
single-particle density matrices can be written as
Alicki and Lendi (1987)
Algebraic Formalism
In order to describe open systems in thermodyna- d
¼ L½ ½43
mical limit (e.g., infinite spin systems) or systems dt
in the quantum field theory one needs the
formalism based on C or von Neumann algebras. where 7! L[] is a map from density matrices to
In the C -algebraic language, by dynamical semi- semigroup generators of the type [4]. Under
group (in the Heisenberg picture) we mean a certain technical conditions the solution of eqn
family {T(t); t 0} of linear maps on the unital [43] exists and defines a nonlinear dynamical
C -algebra A satisfying the following conditions: semigroup – a family {(t); t 0} of maps on the
(1) complete positivity, (2) T(t)T(s) = T(t þ s), set of density matrices satisfying the composition
(3) weak (or strong) continuity, and (4) T(t)1 = 1. law (t þ s) = (t)(s).
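The algebraic conditions (1), (2), and (4) can be checked explicitly in the smallest nontrivial case. The sketch below assumes a pure-dephasing semigroup on 2×2 matrices in the Heisenberg picture, T(t)A, which keeps the diagonal of A and multiplies its off-diagonal entries by e^{−2γt}. This example, the rate `gamma`, and the helper names are illustrative assumptions. Complete positivity is tested through the Choi matrix of the map, whose eigenvalues for this particular map can be written down in closed form; continuity (3) is evident from the exponential and is not tested.

```python
import math

def dephase(a, t, gamma=1.0):
    """Assumed Heisenberg-picture dephasing map T(t) on a 2x2 matrix (nested lists)."""
    k = math.exp(-2.0 * gamma * t)
    return [[a[0][0], k * a[0][1]],
            [k * a[1][0], a[1][1]]]

def choi_eigenvalues(t, gamma=1.0):
    """Eigenvalues of the Choi matrix sum_ij E_ij (x) T(t)E_ij of the dephasing map.
    In the basis |00>, |01>, |10>, |11> this matrix is
    [[1,0,0,k],[0,0,0,0],[0,0,0,0],[k,0,0,1]], with eigenvalues 1+k, 1-k, 0, 0."""
    k = math.exp(-2.0 * gamma * t)
    return [1.0 + k, 1.0 - k, 0.0, 0.0]

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

A = [[1.0, 2.0 - 1.0j], [0.5j, -3.0]]          # arbitrary test observable
identity = [[1.0, 0.0], [0.0, 1.0]]
t, s = 0.3, 0.7

composition_ok = close(dephase(dephase(A, s), t), dephase(A, t + s))  # (2) T(t)T(s) = T(t+s)
unital_ok = close(dephase(identity, t), identity)                     # (4) T(t)1 = 1
cp_ok = min(choi_eigenvalues(t)) >= 0.0                               # (1) Choi matrix >= 0

print(composition_ok, unital_ok, cp_ok)
```

A map on a matrix algebra is completely positive exactly when its Choi matrix is positive semidefinite, which is why a single eigenvalue check suffices here; for a general map one would compute the spectrum numerically instead of using the closed form.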
Assuming the existence of a faithful stationary A simple example is provided by an open N-
state ! = ! T(t) on A, one can use a Gelfand– particle system with the total Hamiltonian invariant
Naimark–Segal (GNS) representation
! (A) of A with respect to particle permutations. The Marko-
in terms of bounded operators on the suitable vian approximation combined with the mean-field
Hilbert space H! with the cyclic and separating method leads to a nonlinear dynamical semigroup
vector satisfying !(A) = <,
! (A)> for all which preserves purity and for initial pure states is
A 2 A. Then the dynamical semigroup can be governed by the nonlinear Schrödinger equation
defined on the von Neumann algebra M (obtained with the following structure:
by a weak closure of
! (A)) as T(t)
^ ! (A)
! (T(t)A). The Kadison inequality valid even for d
¼ iðh þ NUð ÞÞ
2-positive bounded maps on A dt
N X
ðAA Þ ðAÞð1ÞðA Þ ½42 þ < ; Vj > Vj
2 j
implies that !([T(t)A] T(t)A) !(A A), which
allows one to extend the dynamical semigroup < ; Vj > Vj ½44
to the contracting semigroup T(t)[
~ ! (A)]
[
! (T(t)A)] on the GNS Hilbert space H! . Typi-
Here h is a single-particle Hamiltonian, U( ) a
cally, one tries to define the semigroup in terms of
Hartree potential, and Vj are single-particle opera-
the proper limiting procedures T(t) = limn ! 1 Tn (t),
tors describing collective dissipation.
where Tn (t) is well defined on A. However, the limit
may not exist as an operator on A but can be well See also: Boltzmann Equation (Classical and Quantum);
defined on the von Neumann algebra M. If not, the Channels in Quantum Information Theory; Evolution
contracting semigroup on H! may still be a useful Equations: Linear and Nonlinear; Kinetic Equations;
object. Nonequilibrium Statistical Mechanics (Stationary):
Although there exists a rich ergodic theory Overview; Positive Maps on C*-Algebras; Quantum
of dynamical semigroups for the special types of Error Correction and Fault Tolerance; Quantum
Quantum Dynamics in Loop Quantum Gravity 165
Mechanical Scattering Theory; Stochastic Differential Equations.

Further Reading

Alicki R and Fannes M (2001) Quantum Dynamical Systems. Oxford: Oxford
University Press.
Alicki R and Lendi K (1987) Quantum Dynamical Semigroups and Applications. LNP,
vol. 286. Berlin: Springer.
Breuer H-P and Petruccione F (2002) Theory of Open Quantum Systems. Oxford:
Oxford University Press.
Chebotarev AM and Fagnola F (1998) Sufficient conditions for conservativity of
quantum dynamical semigroups. Journal of Functional Analysis 153: 382–404.
Davies EB (1976) Quantum Theory of Open Systems. London: Academic Press.
Frigerio A (1977) Quantum dynamical semigroups and approach to equilibrium.
Letters in Mathematical Physics 2: 79–87.
Garbaczewski P and Olkiewicz R (eds.) (2002) Dynamics of Dissipation. LNP,
vol. 597. Berlin: Springer.
Gorini V, Kossakowski A, and Sudarshan ECG (1976) Completely positive dynamical
semigroups of n-level systems. Journal of Mathematical Physics 17: 821–825.
Hudson R and Parthasarathy KR (1984) Quantum Itô’s formula and stochastic
evolutions. Communications in Mathematical Physics 93: 301–323.
Ingarden RS, Kossakowski A, and Ohya M (1997) Information Dynamics and Open
Systems. Dordrecht: Kluwer.
Lindblad G (1976) On the generators of quantum dynamical semigroups.
Communications in Mathematical Physics 48: 119–130.
Majewski WA and Zegarliński B (1996) Quantum stochastic dynamics II. Reviews in
Mathematical Physics 8: 689–713.
Sewell G (2002) Quantum Mechanics and Its Emergent Macrophysics. Princeton:
Princeton University Press.
Spohn H (1980) Kinetic equations from Hamiltonian dynamics. Reviews of Modern
Physics 52: 569–616.
Streater RF (1995) Statistical Dynamics. London: Imperial College Press.
As said before, strictly speaking, implementing the E are well defined as operators. These issues can
dynamics comprises quantizing and satisfying all the however be dealt with in an elegant way as follows.
constraints. Here we will however focus on C since The first step is to absorb the determinant factor
it is the most challenging, and most closely related into a Poisson bracket,
to standard dynamics in that it generates changes
2 abc
under timelike deformations of the Cauchy surface CE ¼ trðFab fAc ; VgÞ
on which the canonical formulation is based.
The quantum solutions of the other constraints, where V is the volume of the spatial slice . Then
linear combinations of s-knots, lie in a Hilbert space one approximates the curvature by (identity minus)
Kdiff which is part of the dual of the kinematical the holonomy around a small loop. In the present
Hilbert space K of the theory. For details on these case one finds that for a small tetrahedron with
solutions as well as some basic definitions that will base point v, one can approximate
be used without comment below (see Loop Quan- Z
tum Gravity). Since s-knots are labeled, among other CE ðNÞ :¼ 21 N trðF ^ fA; VgÞ
things, by a diffeomorphism equivalence class of a
2
graph, relations to knot theory are emerging at this NðvÞijk trðhij hsk fh1
sk ; VgÞ ½1
level (see Knot Invariants and Quantum Gravity). 3
It is important to note that C does not Poisson- where (see Figure 1a)) the si are edges of incident
commute with the diffeomorphism constraints. at v and the ij loops around the faces of incident
Therefore, in the quantum theory it does matter in at v.
which order the constraints are solved. It turns out This suggests how to define an operator C b E that
that on the quantum solutions to the other con- acts on cylindrical functions on a given graph : one
straints, the scalar constraint can be defined by chooses a triangulation adapted to the graph and
introducing a regulator, and stays well defined even quantizes the CE (N) (where is a tetrahedron of
when the regulator is removed. This ultraviolet this triangulation) using the right-hand side of [1] –
finiteness on Kdiff can be intuitively understood holonomies are quantized by the holonomy opera-
from the diffeomorphism invariance of its elements: tors of the quantum theory, V by the volume
There is no problematic short-distance regime since operator V, b and the Poisson bracket by the
the states do not contain any scale at all. corresponding commutator divided by ih. To be
In the following we will briefly review the imple- more precise, the triangulation is chosen such that
mentation of the scalar constraint in LQG and the sk in [1] are part of , and the operators
comment on some ramifications and open questions. corresponding to the h are creating new edges that
connect the endpoints of the sk (see Figure 1b).
bE
Still this is not sufficient, since the definition of C
The Scalar Constraint Operator depends quite heavily on the choice of the triangula-
tion, and there is no natural way to choose one.
In the Lorentzian theory the scalar constraint C is Furthermore, there is no choice that would guarantee
the sum of the scalar constraint CE of the Euclidean
theory:
that the Cb E for different are consistent in the sense constraint operator along the lines sketched above.
that they correspond to the action of the same The quantization ambiguities include changes in the
operator C b E on two different cylindrical subspaces. power of the volume operator and the spin quantum
Here, the diffeomorphism invariance of the theory number that the constraint creates or annihilates. An
comes to the rescue: a well-defined operator largely interesting check on these quantizations would be to
free of ambiguities can be obtained by letting the inspect the algebra of constraint operators for anoma-
operators above act (by duality) on Kdiff to give lies. In the present situation, this can only be carried
elements in K . When acting on diffeomorphism- out to a certain extent, because C b is defined on
invariant states, the ambiguities in the definition of diffeomorphism-invariant states. The Poisson bracket
the triangulations can be eliminated, and the opera- between two scalar constraints is proportional to a
tors Cb E for different are consistent and together diffeomorphism constraint, and indeed it turns out
define an operator C b E (N). Roughly speaking, for a that in the quantum theory the commutator of two
diffeomorphism-invariant state, it does not matter scalar constraint operators vanishes for quantizations
anymore where on the graph the endpoints of the sk as described above. In that sense they are ambiguity
lie and how they are connected to form the loops . free; however, this criterion is not strong enough to
The final picture looks as follows: for each s-knot s, distinguish between the candidates.
the operator gives a sum of contributions, one for Recently, a slightly different strategy has been
each vertex of s, that is, C b E (N)s = P C cv (N)s. The proposed, which, if successfully implemented, would
v
terms in this sum are not diffeomorphism invariant. eliminate some of the questions regarding the
Their evaluation on a spin network S is of the form constraint algebra. The idea is to combine the
X constraints C(N) for different lapse functions N
cv sÞ½S ¼
ðC cðs0 ÞNðxðvÞÞs0 ½S ½2 into one master constraint
s0
Z
0
where the s are s-knots that differ from s by the M ¼ ðdet qÞ1=2 C2 d3 x
addition or deletion of certain edges, and correspond-
ing changes in coloring (by 1=2) and intertwiners. As M is manifestly diffeomorphism invariant and could
an example, Figure 2 schematically depicts the action replace all the noncommuting constraints C(N),
on a trivalent vertex. The point x(v) on which N is hence simplifying the constraint algebra considerably.
evaluated in the above formula gets determined as follows: the evaluation $s_0[S]$ is zero unless the graph on which $S$ is based is an element in the diffeomorphism equivalence class on which $s_0$ is based. $x(v)$ is the position of the vertex $v$ in this element of the equivalence class. Because of this $x(v)$, the action of $\hat{C}_E(N)$ is not diffeomorphism invariant.

Similar techniques give a quantization $\hat{C}$ of the full constraint. The solutions to the constraint can be determined as the vectors $\psi \in K_{\rm diff}$ that are annihilated by $\hat{C}$ in the sense that $(\hat{C}(N)\psi)[f] = 0$ for all functions $N$ and elements $f$ of $K$. The solutions are more or less explicitly known; however, the task of interpreting them is a hard one and remains an object of current research.

It should be mentioned that, strictly speaking, one can arrive at several slightly different versions of the

Figure 2 A schematic rendering of the action of the operator $\hat{C}_v$ for a trivalent vertex.

The interpretation of the solutions of all the constraints hinges on the construction of observables for the theory. This is already a difficult task in the classical theory, and thus even more so after quantization. Though there is no general solution to this problem available, interesting proposals are being studied.

Finally, it should be said that the quantization of the scalar constraint can be used to obtain a picture that resembles more the standard time evolution in quantum field theory. The (formal) power series expansion of the projector

$$P = \prod_{x \in \Sigma} \delta\bigl(\hat{C}(x)\bigr) = \int D[N]\, \exp\left( i \int N(x)\,\hat{C}(x) \right)$$

onto the kernel of $\hat{C}$ can be described by a spin foam model (see Spin Foams).

For further information on the subject of this article see the references: Thiemann (to appear), Rovelli (2004), and Ashtekar and Lewandowski (2004) for general reviews on LQG (with a systematic exposition of a large class of quantizations of the scalar constraint and their solutions in Ashtekar and Lewandowski (2004)); Thiemann (1998) for a seminal work on the quantization of the scalar constraint; Rovelli (1999) and Reisenberger and Rovelli (1997) on the connection to spin foam models; Di Bartolo et al. (2002) on consistent discretizations; Kodama (1990) and Freidel and Smolin (2004) on the Kodama state; and Thiemann (2003) on the master constraint program.

See also: Constrained Systems; Knot Invariants and Quantum Gravity; Loop Quantum Gravity; Quantum Geometry and its Applications; Spin Foams; Wheeler–De Witt Theory.

Further Reading

Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report. Classical and Quantum Gravity 21: R53.
Di Bartolo C, Gambini R, and Pullin J (2002) Canonical quantization of constrained theories on discrete space-time lattices. Classical and Quantum Gravity 19: 5275.
Freidel L and Smolin L (2004) The linearization of the Kodama state. Classical and Quantum Gravity 21: 3831.
Kodama H (1990) Holomorphic wave function of the universe. Physical Review D 42: 2548.
Reisenberger MP and Rovelli C (1997) "Sum over surfaces" form of loop quantum gravity. Physical Review D 56: 3490.
Rovelli C (1999) The projector on physical states in loop quantum gravity. Physical Review D 59: 104015.
Rovelli C (2004) Quantum Gravity, Cambridge Monographs in Mathematical Physics. Cambridge: Cambridge University Press.
Thiemann T (1998) Quantum spin dynamics (QSD). Classical and Quantum Gravity 15: 839.
Thiemann T (2003) The Phoenix project: master constraint programme for loop quantum gravity, arXiv:gr-qc/0305080.
Thiemann T (2006) Modern Canonical Quantum General Relativity, Cambridge University Press (to appear).

168 Quantum Electrodynamics and Its Precision Tests
(1932), establishing Dirac's equation as one of the cornerstones of theoretical physics.

All the ingredients needed for the evaluation of the perturbative corrections to the QED theory (usually called radiative corrections) were already present at that moment, but radiative corrections were not systematically investigated for several years, due perhaps to the length and difficulty of the calculations and the absence of important disagreements between theoretical predictions and experimental results.

The situation changed in 1947, when two experiments were carried out, measuring the energy difference between the $2^2S_{1/2}$ and $2^2P_{1/2}$ levels of the hydrogen atom and the gyromagnetic ratios of the electron.

Lamb and Retherford (1947), by using the "great wartime advances in microwaves techniques," succeeded in establishing that in the hydrogen atom "the $2^2S_{1/2}$ state is higher than the $2^2P_{1/2}$ by about 1000 Mc/sec.," while (as observed above) according to the Dirac theory the two states are expected to have exactly the same energy. Subsequent refinements of the experiment (Triebwasser et al. 1953) gave for the difference (now referred to as the Lamb shift) the value $1057.77 \pm 0.10$ MHz, with a relative error of $1 \times 10^{-4}$.

The authors of the second 1947 experiment (Kusch and Foley 1947) measured the frequencies associated with the Zeeman splitting of two different states of gallium, finding an inconsistency with the theoretical values of the gyromagnetic ratios of the electron. More exactly, write the magnetic moments $\mu_L$, $\mu_S$ associated to the (dimensionless) orbital and spin angular momenta $L$, $S$ of the electron as

$$\mu_L = -g_L \frac{e\hbar}{2 m_e c}\, L, \qquad \mu_S = -g_S \frac{e\hbar}{2 m_e c}\, S \qquad [1]$$

where $(-e)$ is the charge of the electron ($e > 0$), $m_e$ its mass, $c$ the speed of light and $g_L$, $g_S$, respectively, the orbital and spin gyromagnetic ratios; the Dirac theory then predicts $g_L = 1$ and $g_S = 2$, while the results of Kusch and Foley (1947) gave a discrepancy which could be accounted for by taking $g_S = 2.00229 \pm 0.00008$ and $g_L = 1$, or alternatively $g_S = 2$ and $g_L = 0.99886 \pm 0.00004$. In modern notation the first conjecture can be rewritten as

$$g_S = g_e = 2(1 + a_e), \qquad a_e = 0.001145 \pm 0.00004 \qquad [2]$$

where $a_e$ is the anomalous magnetic moment (or magnetic anomaly) of the electron.

The need to explain the two experimental results gave rise to a rapid development of covariant perturbation theory (which replaced the previous noncovariant "old fashioned" perturbation theory) and of the renormalization theory, which liberated the perturbative expansion from the divergences plaguing the older approach, opening the path to the evaluation of radiative corrections and to the great success of precision predictions of QED.

The formalism improved quickly, evolving into the more general quantum field theory (QFT) approach; three of the main contributors were Sin-Itiro Tomonaga, Julian Schwinger, and Richard P Feynman, awarded a few years later (1965) the Nobel prize "for their fundamental work in quantum electrodynamics, with deep-ploughing consequences for the physics of elementary particles." QFT was then successfully used for describing the weak interactions in the electroweak model and later on also for the theory of strong interactions, dubbed quantum chromodynamics (or QCD, in analogy with the popular QED acronym). For more details and references to original works, the reader is invited to look at any treatise on QED or QFT, such as, for instance, Weinberg (1995).

Initially, the Lamb shift was perhaps more important than the electron magnetic anomaly both for the establishment of renormalization theory and as a test of QED, but in the following years it was supplanted by the latter as a precision test of QED. In 1947 the "best values" for some fundamental constants were indeed

$$c = (2.99776 \pm 0.00004) \times 10^{10}\ {\rm cm\ s^{-1}}$$
$$R_\infty = \frac{m_e c^2 \alpha^2}{2hc} = 109737.303 \pm 0.017\ {\rm cm^{-1}} \qquad [3]$$
$$1/\alpha = 137.030 \pm 0.016$$

where $R_\infty$ is the Rydberg constant for infinite mass, $h$ the Planck constant, and $\alpha$ the fine structure constant (let us observe here in passing that $R_\infty$ was and is still known much better than the separate values of $m_e$, $\alpha$, and $h$ entering in its definition); for comparison, the current (2005) values for $c$ and $R_\infty$ are

$$c = 299792458\ {\rm m\ s^{-1}}$$
$$R_\infty = 109737.31568525(73)\ {\rm cm^{-1}} \qquad [4]$$

where the value of $c$ is exact (it is in fact the definition of the meter), and the relative error in $R_\infty$ is $6.6 \times 10^{-12}$ (the value of $\alpha$ will be discussed later).

The measurement of the Lamb shift, repeated several times, gave results in nice agreement with the original value, and for several years it provided either a test of QED or a precise value for $\alpha$. But the Lamb shift is the energy difference between the metastable level $2S_{1/2}$ (whose lifetime
is about 1/7 s) and the $2P_{1/2}$ level, which has a lifetime of about 1.596 ns or a natural linewidth of 99.7 MHz. Such a large linewidth poses a strong intrinsic limitation to the precision attainable in the measurement of the Lamb shift, which is just ten times larger; as a matter of fact, that precision could never reach the $1 \times 10^{-6}$ relative error level, while in the meantime the relative precision in $a_e$ reached the $10^{-9}$ range, replacing the Lamb shift in the role of the leading quantity in high-precision QED.

The Structure of Radiative Corrections

For obvious reasons of space we can only superficially sketch here the lines along which the perturbative expansion of QED leading to the evaluation of radiative corrections can be built, considering for simplicity only the photon and the electron. One can start from a QED Lagrangian, formally similar to the classical Lagrangian, involving the electron field and the vector potentials of the electromagnetic (or photon) field. The theory is a gauge theory (its physical content should not change if a gradient is added to the vector potentials); it is further an abelian gauge theory as the electromagnetic field does not interact directly with itself.

The QED Lagrangian is separated into a free part and an interaction part. From the free part, one derives the wave functions of the free-particle states and the corresponding time-evolution operators (free Green's functions or propagators; let us just recall here that to obtain a convenient photon propagator one has to break the gauge invariance by adding to the Lagrangian a suitable gauge-breaking term), while the interaction part of the Lagrangian gives the "interaction vertices" of the theory.

The aim of the theory is to build the Green's functions for the various processes in the presence of the interaction; from these Green's functions, one then derives all the physical quantities of interest.

With the free propagators and the interaction vertices, one generates the perturbative expansion of the Green's functions. The result, namely the contributions to the perturbative expansion (or radiative corrections), can be depicted in terms of Feynman graphs: they consist of various particle lines joined in the interaction vertices, with external lines corresponding to the initial and final particles and internal lines corresponding to intermediate or virtual particle states. Each graph stands for an integral on the momenta of all the intermediate states, each vertex implying among other things an interaction constant, which is $(-e)$ in the case of electron QED, and a $\delta$-function imposing the conservation of the momenta at that vertex. For each process, the Feynman graphs are naturally classified by the total number of the interaction vertices they contain. In the simplest graphs for a given process (the so-called tree graphs) the $\delta$-functions at the vertices make the integrations trivial; but when the number of vertices increases, closed loops of virtual particle states appear, whose evaluation quickly becomes extremely demanding. In QED, each loop gives an extra factor $(-e)^2$ with respect to the tree graph; it is customary to express it in terms of $(\alpha/\pi) = (e/2\pi)^2$, so that the resulting power of $(\alpha/\pi)$ corresponds to the number of internal loops. The typical QED prediction for a physical quantity is then expressed as a series of powers of the fine structure constant $\alpha$ (and of its logarithm in bound-state problems). As $\alpha$ is small ($\alpha \simeq 1/137$), and the first coefficients of the expansions are usually of the order of 1, a small number of terms in the expansion is in general sufficient to match the precision of the available experimental data.

But the number of different graphs for a given number of loops grows quickly with the number of loops; in turn, each graph consists in general of a great number of terms and the loop integrations become prohibitively difficult when the number of loops increases, so that the evaluation of radiative corrections proved to be one of the major computational challenges of theoretical physics. As a matter of fact, it prompted the development of computer programs (Veltman 1999) for processing the huge algebraic expressions usually encountered, and of many sophisticated numerical and analytical techniques for performing the loop integrations.

It should be further mentioned here that Feynman graphs written by naively following the rules sketched above are often mathematically ill-defined, taking the form of nonconvergent integrals on the loop momenta. A regularization procedure is needed to give an unambiguous meaning to all the integrals; currently the most powerful regularization is the continuous dimensional regularization scheme, in which the loop integrations are carried out in $d$ continuous dimensions, with $d$ unspecified; renormalization counterterms are also evaluated in the same scheme, and the physical quantities are recovered in the $d \to 4$ limit (unrenormalized loop integrals and renormalization counterterms are usually singular as powers of $1/(d-4)$ in the $d \to 4$ limit, but all those divergences cancel out in the physical combinations of interest).

QED describes the main interaction of the charged leptons ($e$, $\mu$, and $\tau$) which have, however, weak interactions as well. Strictly speaking, pure
QED processes do not exist; it is an essential feature of QFT that any existing particle can contribute to the Feynman graphs for any process, when the approximation is pushed to a sufficiently high degree. In particular the photon, which is the main carrier of the QED interaction, is directly coupled also to the strongly interacting particles (the resulting contributions are referred to as "hadronic vacuum polarization" effects).

The precision tests of QED are then necessarily to be searched for in those phenomena where non-QED contributions are presumably small and which involve quantities already well known independently of QED itself. But such high-precision quantities are not always available, and as QED is known better than the rest of physics, very often it is taken to be correct by assumption, and used as a tool for extracting or measuring some of the non-QED quantities relevant to various physical processes.

In any case, as QED predictions are expressed in terms of the fine structure constant $\alpha$, a determination of $\alpha$ independent of QED is needed; without it, the most precise predictions of QED would simply become measurements of $\alpha$ and not tests of the theory.

Finally, it is to be recalled that, ironically, the problem of the convergence of the expansion in powers of $\alpha$ is still open, even if it is commonly accepted that convergence problems will matter only for precisions and corresponding perturbative orders (say at order $1/\alpha \simeq 137$) absolutely out of reach of present experimental and computational possibilities, involving further extremely high energies, where the other fundamental interactions are expected to be as important as QED, so that it would be meaningless to consider only QED.

In the following we will discuss only the QED predictions for bound states and the anomalous magnetic moments of $\mu$ and $e$.

The Bound States

A very good review of the current status of the theory of hydrogen-like atoms can be found in Eides et al. (2001), to which we refer for more details and citation of the original papers. The starting point for studying the bound-state problem in QED is the scattering amplitude of two charged particles, predicted by perturbative QED (pQED) as a (formal) series expansion in powers of $\alpha$. In the static limit $v \to 0$, where $v$ is the relative velocity of the two particles, some of the pQED terms behave as $\alpha/v$, so that the naive expansion in $\alpha$ becomes meaningless. Fortunately, it is relatively easy to identify the origin of those terms (which are essentially due to the Coulomb interaction between the two charges) and to devise techniques for their resummation. Among them, one can quote the Bethe–Salpeter equation, formally very elegant and complete but difficult to use in practice. Great progress has been achieved with the NRQED (nonrelativistic QED) approach, which is a nonrelativistic theory designed to reproduce the full QED scattering amplitude in the nonrelativistic limit by the ad hoc definition, a posteriori, of a suitable effective Hamiltonian. The Hamiltonian is then divided into a part containing the Coulomb interaction, which is treated exactly and which gives rise to the bound states, and all the rest, to be treated perturbatively. The power of the NRQED approach was further boosted by the continuous dimensional regularization technique of Feynman graph integrals.

Traditionally, the results are expressed in terms of the energies of the bound states, but as in practice the precise measurements concern the transition frequencies between various levels, it is customary to express any energy contribution to some level, say $\Delta E$, also in terms of the associated frequency $\Delta\nu = (\Delta E)/h$, where $h$ is the Planck constant.

The Hydrogen-Like Atoms

Quite generally, a hydrogen-like atom consists of a single electron bound to a positively charged particle, which is a proton for the hydrogen atom, a deuteron nucleus for deuterium, a helium nucleus for a ${\rm He}^+$ ion, a positive muon $\mu^+$ for muonium, or a positron for positronium. Even if QED alone is not sufficient to treat the dynamical properties of the nuclei, their strong interactions can be described by introducing suitable form factors and a few phenomenological parameters; weak interactions could be treated perturbatively, but are not yet required at the precision levels achieved so far.

The QED results for the hydrogen-like atoms can be expressed in terms of the mass $M$ of the positive particle and of its charge $Ze$ (of course $Z = 1$ for hydrogen). When the electron mass $m_e$ is smaller than $M$ (which is always the case, except for positronium) one can take as a starting point the QED electron moving in the external field of the positive particle, and treat all the other aspects of the relativistic two-body problem (the so-called recoil effects) perturbatively in $m_e/M$.

Neglecting the spin of the positive particle, the energy levels of the hydrogen-like atom are identified by the usual principal quantum number $n$, the orbital angular momentum $l$ (with the convention of writing S, P, D, ... instead of $l = 0$, $l = 1$, $l = 2$, ...) and $j$, the total angular momentum including the spin of the electron. It turns out that the bound
levels consist of very many contributions of different kinds; dropping quantum number indices for simplicity, the energy levels can be written as an expression of the form

$$E = \frac{m_e c^2 (Z\alpha)^2}{2}\,\frac{m_r}{m_e}\left[ -\frac{1}{n^2} + (Z\alpha)^2 f_4 + (Z\alpha)^4 f_6 + \cdots \right] + E_{\rm rad} + E_{\rm rec} + E_{\rm nucl} + \cdots \qquad [5]$$

Let us observe that it is convenient to write explicitly the $Z$ factors even when $Z = 1$ for a better bookkeeping of the various corrections. As usual, $m_r$ is the reduced mass of the electron, $m_r = m_e M/(m_e + M)$, the mass of the nucleus being $M$; the first term in the square bracket, $-1/n^2$, the familiar Balmer term, is by far the dominant one, giving for the $n = 1$ level in the $Z = 1$ case an energy of about $-13.6$ eV or a corresponding frequency of $3.3 \times 10^{15}$ Hz. The other terms in the square bracket, $f_4$ and $f_6$, are known coefficients (depending also on the small parameter $m_e/M$; $f_4$ is essentially the fine structure).

The term $E_{\rm rad}$ is the bulk of the radiative QED corrections; it can be written as a multiple expansion in $(\alpha/\pi)$, $(Z\alpha)$, and $L = \ln[1/(Z\alpha)^2]$, which turns out to have the following explicit form:

$$E_{\rm rad} = m_e c^2 (Z\alpha)^4 \frac{1}{n^3} \Bigl\{ \Bigl(\frac{\alpha}{\pi}\Bigr) \bigl[ A_{41} L + A_{40} + (Z\alpha) A_{50} + (Z\alpha)^2 \bigl( A_{62} L^2 + A_{61} L + A_{60} \bigr) + \cdots \bigr]$$
$$\qquad + \Bigl(\frac{\alpha}{\pi}\Bigr)^2 \bigl[ B_{40} + (Z\alpha) B_{50} + (Z\alpha)^2 \bigl( B_{63} L^3 + B_{62} L^2 + B_{61} L + B_{60} \bigr) + \cdots \bigr]$$
$$\qquad + \Bigl(\frac{\alpha}{\pi}\Bigr)^3 \bigl[ C_{40} + (Z\alpha) C_{50} + \cdots \bigr] + \cdots \Bigr\} \qquad [6]$$

The first index of the coefficients refers to the power of $(Z\alpha)$, the second to the power of $L$; as a rule, there are three powers of $(Z\alpha)$ due to the normalization of the wave function and one power of $(Z\alpha)$ for each interaction with the nucleus (in the leading term of eqn [5] one must subtract two powers of $(Z\alpha)$ due to the long-range nature of the Coulomb interaction), while the terms in $L = \ln[1/(Z\alpha)^2]$ are related to the infrared divergences of the scattering amplitude, with the binding energy acting as infrared cutoff. The A-coefficients refer to order $(\alpha/\pi)$ or one-loop virtual corrections (we do not distinguish here between one-loop self-mass and vacuum-polarization contributions, as usually done in the literature), the B-coefficients to two loops, etc. The coefficients are pure numbers, entirely determined within QED, even if their actual calculation is an extremely demanding task. One of the first results obtained in 1947 was $A_{41} = (4/3)\delta_{l0}$, contributing to the 2S but not to the 2P states (quite generally, most corrections are much bigger for $l = 0$ states than for higher-angular-momentum states), which is sufficient to give the right order of magnitude of the ($2S_{1/2}$–$2P_{1/2}$) Lamb shift (about 1000 MHz). The other coefficients are now known, thanks to the strenuous and continued efforts since then (Eides et al. 2001), which it is impossible to review properly here in any detail. The current frontier of the theoretical calculation (around the dots in the previous formula) corresponds to 8–9 total powers of $(\alpha/\pi)$ and $(Z\alpha)$ or some kHz for the 1S state.

The next term in eqn [5], $E_{\rm rec}$, contains contributions of order $m_e c^2 (Z\alpha)^5 (m_e/M)$ or smaller (some care must be taken in classifying the contributions of order $m_e/M$, which can be accounted for by proper use of $m_r$ rather than $m_e$, and genuine $m_e/M$ contributions), and these are sufficiently known for practical purposes; the same is true for many other contributions discussed in Eides et al. (2001) and skipped in eqn [5]. A troublesome contribution comes however from $E_{\rm nucl}$; at leading order, one has

$$E_{\rm nucl} = \frac{2 (Z\alpha)^4 m c^2}{3 n^3} \left( \frac{m c R_p}{\hbar} \right)^2 \delta_{l0}$$

where $R_p$ is the so-called root-mean-square charge radius of the proton, which is not well known experimentally (in the literature, there are indeed two direct measurements, $R_p = 0.805(11)$ fm and $R_p = 0.862(12)$ fm, in poor agreement with each other; a new independent measurement is strongly needed).

The hyperfine splitting

The interaction of the electron with the spin of the positive particle introduces the so-called hyperfine splitting of all the levels. The order of magnitude of the hyperfine splitting of the 1S state is given by the Fermi energy

$$E_F = \frac{4}{3}\, m_e c^2 (Z\alpha)^4 \frac{m_e}{m_p}\, g_p$$

where $g_p \simeq 5.586$ is the g-factor of the proton, which gives $\simeq 1.42$ GHz. It was dubbed hyperfine because it is smaller than the fine structure terms by the factor $m_e/m_p$. Many classes of corrections can be worked out, with patterns similar to those of the previous subsection, and also in this case the nuclear contributions (this time mainly due to the theoretically unknown magnetic form factor and the
so-called polarizability of the proton) prevent predictions with an error less than 1 kHz (or a relative precision better than $1 \times 10^{-6}$).

The comparison with the experiments

Experimentally, one measures transition frequencies among the various levels. For many years the precision record was held by the hyperfine splitting of the ground state of hydrogen, $\nu_{\rm hfs}(1S)$, measured long ago (see Hellwig et al. (1970) and Essen et al. (1971)),

$$\nu_{\rm hfs}(1S) = 1\,420\,405.751\,766\,7(9)\ {\rm kHz} \qquad [7]$$

with a relative error $6 \times 10^{-13}$. The current record in the optical range is the value of the (1S–2S) hydrogen transition frequency, obtained by means of two-photon Doppler-free spectroscopy (Niering et al. 2000),

$$\nu(1S\text{–}2S) = 2\,466\,061\,413\,187.103(46)\ {\rm kHz} \qquad [8]$$

with a relative precision $1.9 \times 10^{-14}$; other optical transitions, such as (2S–8D), (2S–12D), are measured with a precision of about $1 \times 10^{-11}$.

The measurement of the Lamb shift was repeated several times, with results in nice agreement with the original value, such as that of Lundeen and Pipkin (1986), 1057.845(9) MHz. The most precise value, $1057.8514 \pm 0.0019$ MHz, was given in Palchikov et al. (1985) (the result depends, however, on the theoretical value of the lifetime, and should be changed into $1057.8576 \pm 0.0021$ MHz according to subsequent analysis (see Karshenboim (1996))). The experimental ($2S_{1/2}$–$2P_{1/2}$) Lamb shift was also obtained as the difference between the measured fine structure separation ($2P_{3/2}$–$2S_{1/2}$) and the theoretical value of the ($2P_{3/2}$–$2P_{1/2}$) frequency, and the radiative corrections $E_{\rm rad}$ to any level are now referred to as the Lamb shift of that level.

As a somewhat disappointing conclusion, the wonderful experimental results of eqns [7] and [8] cannot be used as a high-precision test of the theory or to obtain precise values of many fundamental constants, as the theoretical calculations depend, unfortunately, on hadronic quantities which are not known accurately. Combining theoretical predictions, the above transitions and Lamb shift data, and the available values of $\alpha$ and $m_e/m_p$, one can indeed obtain a measurement of $R_p$ ($R_p = 0.883 \pm 0.014$ fm, according to Melnikov and van Ritbergen (2000)) and the value of $R_\infty$ already quoted above.

Muonium

Muonium is the bound state of a positive muon $\mu^+$ and an electron. At variance with the proton, the $\mu^+$ lepton has no strong interactions, so the $\mu^+ e^-$ system can be studied theoretically within pure QED, with the weak interactions giving a known and small perturbation. Further, the ratio of the masses $m_e/m_\mu \simeq 4.8 \times 10^{-3}$ is small, so that the external field approximation holds. However, the $\mu$ is unstable (lifetime $\simeq 2.2\ \mu$s), which makes experiments more difficult to carry out. The best measured quantity is the hyperfine splitting of the 1S ground state (see Liu et al. (1999))

$$\nu_{\rm hfs}(\mu e;\, 1S) = 4\,463\,302\,765(53)\ {\rm Hz}$$

with a relative precision of $12 \times 10^{-9}$. The theoretical treatment is similar to the case of hydrogen, with the important advantage that nuclear interactions are absent and everything can be evaluated within QED, so that the bulk of the contribution is given by a formula with the structure of eqn [6]. But the prediction depends, in any case, on the $m_e/m_\mu$ mass ratio, which is not known with the required precision. Indeed, a recent theoretical calculation (Czarnecki et al. 2002) (which includes also a contribution of 0.233(3) kHz from hadronic vacuum polarization) gives 4 463 302 680(510)(30)(220) Hz, where the first (and biggest) error comes from $m_e/m_\mu$, the second from $\alpha$, and the third is the theoretical error (an estimate of higher-order contributions not yet evaluated).

Positronium

Positronium is the bound state of an electron and a positron. Theoretically, it is an ideal system to study, as it can be described entirely within QED, without any unknown parameter of non-QED origin. As the masses of the two constituents, positron and electron, are strictly equal, the reduced mass of the system is exactly equal to half of the electron mass, $m_r = m_e/2$, and the energy scale of the bound states is half of $R_\infty$.

At variance with the muonium case, the external field approximation is not valid, so that positronium must be treated with the full two-body bound-state machinery of QFT, of which it provides an excellent test (Karshenboim 2004).

Experimentally, radioactive positron sources are available, so that positronium is easier to produce than muonium. It is, however, unstable; states with total spin $S$ equal to 0 (also called parapositronium states) annihilate into an even number (mainly two) of gammas, and states with $S = 1$ (orthopositronium) into an odd number (mainly three) of gammas, with short lifetimes (which make precise measurements difficult). Further, as positronium is the lightest
atom, Doppler-broadening effects are very important, reducing the precision of spectroscopic measurements.

Positronium decay rates

There has been a long-standing discrepancy between theory and experiment in the decay rate of ground-state orthopositronium, which prompted thorough theoretical investigations looking for errors in the calculations or flaws in the formalism, but it turned out that the flaw was on the experimental side. The current theoretical prediction for the ground state $S = 1$ decay is (Adkins et al. 2002)

$$\Gamma(1S, {\rm ortho}) = \Gamma_0 \left[ 1 + A \frac{\alpha}{\pi} + \frac{1}{3}\alpha^2 \ln\alpha + B \left(\frac{\alpha}{\pi}\right)^2 - \frac{3\alpha^3}{2\pi} \ln^2\alpha + C \frac{\alpha^3}{\pi} \ln\alpha + \cdots \right] = 7.039979(11)\ \mu{\rm s}^{-1}$$

where $\Gamma_0 = 2(\pi^2 - 9)\, m_e \alpha^6/(9\pi) = 7.2111670(1)\ \mu{\rm s}^{-1}$, $A = -10.286606(10)$, $B = 45.06(26)$, $C = -5.517$, in nice agreement with the less precise experimental result of Karshenboim (2004, ref. 38), $7.0404(10)(8)\ \mu{\rm s}^{-1}$. As a curiosity, the coefficients $A$, $B$ above are among the largest coefficients that have so far appeared in QED radiative corrections.

The agreement between theory and experiment for the ground-state parapositronium decay rate has always been good; the current status is $7990.9(1.7)\ \mu{\rm s}^{-1}$ for the experimental result (Karshenboim 2004, ref. 41) and $7989.64(2)\ \mu{\rm s}^{-1}$ for the theoretical prediction (Karshenboim 2004, ref. 43).

Positronium levels

The quantum number structure of the levels is similar to muonium, with the important difference, however, that the hyperfine splitting (which in hydrogen or muonium is small because it is proportional to the ratio of the masses of the two components) is in fact of the same order as the fine structure. The theoretical evaluation of the energy levels provides a very stringent check of QED and of the overall treatment of the bound-state problem. Corrections have been evaluated, typically, up to order $mc^2\alpha^7$. The best-known quantities are the ground state (hyper)fine splitting, experimental value (Ritter et al. 1984) 203.38910(74) GHz ($3.6 \times 10^{-6}$ relative error), theoretical (Karshenboim 2004) 203.3917(6) GHz, and the 1S–2S transition for orthopositronium, experiment (Fee et al. 1993) 1 233 607 216.4(3.2) MHz, theory 1 233 607 222.2(6) MHz.

The general agreement is good; the precisions achieved are, however, not yet sufficient to allow a determination of $R_\infty$ or $\alpha$ competitive with other measurements.

The Anomalous Magnetic Moments of Leptons

The precision of the measurements requires, for both the $e$ and $\mu$ leptons, that one also take into account graphs with contributions from the other leptons as virtual intermediate states and those of hadronic and weak origin. Quite generally, if the mass of the virtual particle, say $m_v$, is smaller than the mass of the external lepton, say $m_l$, one can have a $\ln(m_l/m_v)$ behavior of the contributions; that is the case of the virtual electron contributions to the muon magnetic anomaly $a_\mu$, which can be enhanced by powers of $\ln(m_\mu/m_e)$. In the opposite case, $m_v > m_l$, the contribution has the behavior $(m_l/m_v)^2$; that is the case of the $(m_\mu/m_\tau)^2$ contributions to $a_\mu$ from $\tau$ loops and of the $(m_e/m_\mu)^2$ contributions from $\mu$ loops to the electron magnetic anomaly, $a_e$. As strong and weak interactions are in general associated with heavy-mass particles, they are expected to be more important for $a_\mu$ than $a_e$; further, a given heavy particle contribution to $a_e$ is smaller by a factor $(m_e/m_\mu)^2$ than the corresponding contribution to $a_\mu$.

The Magnetic Anomaly $a_\mu$ of the $\mu$

The $a_\mu$ has been reviewed in Passera (2005). The present (2005) world average experimental value is

$$a_\mu({\rm exp}) = 116\,592\,080(60) \times 10^{-11}$$

with a relative error $0.5 \times 10^{-6}$.

Theoretically, one can write

$$a_\mu = a_\mu({\rm QED}) + a_\mu({\rm had}) + a_\mu({\rm EW}) \qquad [9]$$

where the three terms stand for the contributions from pure QED, strongly interacting hadrons and electroweak interactions. In turn, one can expand $a_\mu({\rm QED})$ in powers of $\alpha$ as

$$a_\mu({\rm QED}) = \sum_l C_l \left(\frac{\alpha}{\pi}\right)^l = \sum_l \left[ A_1^{(l)} + A_2^{(l)}\!\left(\frac{m_\mu}{m_e}\right) + A_2^{(l)}\!\left(\frac{m_\mu}{m_\tau}\right) + A_3^{(l)}\!\left(\frac{m_\mu}{m_e}, \frac{m_\mu}{m_\tau}\right) \right] \left(\frac{\alpha}{\pi}\right)^l \qquad [10]$$

The coefficients $A_1^{(l)}$ involve only the photon and the external lepton as virtual states, and are identically the same as in $a_e$; they are known up to $l = 4$ included (but, strictly speaking, the contribution of $A_1^{(4)}$ is smaller than the experimental error of $a_\mu$) and will be discussed later for the electron. The $A_2^{(l)}(m_\mu/m_e)$ are very large, being enhanced by powers of $\ln(m_\mu/m_e)$, and are required and known up to $l = 5$; $A_2^{(l)}(m_\mu/m_\tau)$ starts with $A_2^{(2)}(m_\mu/m_\tau) \simeq$
$1/45\,(m_\mu/m_\tau)^2$, contributing about $4.2 \times 10^{-10}$ to $a_\mu$, so that the $A_2^{(l)}(m_\mu/m_\tau)$ with higher values of $l$ are not needed. $A_3^{(l)}(m_\mu/m_e, m_\mu/m_\tau)$, finally, starts from $l = 3$, and gives a negligible contribution $0.7 \times 10^{-11}$. Summing up, one finds $C_1 = 1/2$, $C_2 = 0.765\,857\,410(27)$ (the error is from the experimental errors in the lepton masses), $C_3 = 24.050\,509\,64(43)$, $C_4 = 131.011(8)$, and $C_5 = 677(40)$. As already observed, the coefficients are large due to the presence of $\ln(m_\mu/m_e)$ factors. The last term $C_5$ contributes $4.6(0.3) \times 10^{-11}$ to $a_\mu$, and the total QED contribution is

$$a_\mu({\rm QED}) = 116\,584\,718.8(0.3)(0.4) \times 10^{-11}$$

where the first error is due to the uncertainties in the coefficients $C_2$, $C_3$, and $C_5$ and the second comes from the value of $\alpha$ obtained from atom interferometry measurements (see below).

The hadronic contributions are of two kinds, those due to vacuum polarization, $a_\mu({\rm vac.pol})$, which can be evaluated by sound theoretical methods by using existing experimental data, and those due to light-by-light hadronic scattering, $a_\mu({\rm lbl})$, whose evaluation rests on much less firm ground and is entirely model-dependent. The value of $a_\mu({\rm vac.pol})$ varies slightly among the various authors (see Passera (2005) for reference to original work); let us take as a typical value $a_\mu({\rm vac.pol}) = 6834(92) \times 10^{-11}$ (based on $e^+e^-$ scattering data and including also first-order radiative corrections). The model-dependent value of the light-by-light contribution changed several times over the years (also in sign!) but now there is a general consensus that it should be positive; let us take, somewhat arbitrarily, $a_\mu({\rm lbl}) = 136(25) \times 10^{-11}$, so that the total hadronic contribution becomes

$$a_\mu({\rm had}) = 6970(92) \times 10^{-11}$$

The electroweak contribution, finally, is

$$a_\mu({\rm EW}) = 154(2) \times 10^{-11}$$

which accounts for a one-loop purely weak contribution and a two-loop electromagnetic and weak contribution, which turns out to be very large ($-42 \times 10^{-11}$) because of the presence of logarithms in the masses (the error is due to the uncertainty in the Higgs boson mass).

Summing up, eqn [9] gives $a_\mu = 116\,591\,842(92) \times 10^{-11}$, so that

$$a_\mu({\rm exp}) - a_\mu = 238(60)(90) \times 10^{-11}$$

The substantial agreement can be considered to be a good overall check of QED and electroweak inter-

the scientific community: the validity of QED and electroweak models is taken for granted, and a disagreement, if any, is considered to be an indication of new physics. To obtain significant information in that direction, however, the experimental and the theoretical errors (dominated in turn by the experimental error in $e^+e^-$ scattering data) should be significantly reduced.

The Magnetic Anomaly $a_e$ of the Electron

Experimentally, one has the 1987 value (Kinoshita 2005, ref. 1)

$$a_e({\rm exp}) = 1\,159\,652\,188.4(4.3) \times 10^{-12} \qquad [11]$$

with a relative error $3.7 \times 10^{-9}$, and the preliminary Harvard (2004) measurement (Kinoshita 2005, ref. 3)

$$a_e({\rm Harvard}) = 1\,159\,652\,180.86(0.57) \times 10^{-12} \qquad [12]$$

with $0.5 \times 10^{-9}$ relative error, that is, an increase in precision by a factor 7.

Theoretically, eqns [9] and [10] apply also to the electron; given the smallness of the electron mass, the relevant terms up to the precision of the experimental data are

$$a_e = A_1^{(1)} \left(\frac{\alpha}{\pi}\right) + A_1^{(2)} \left(\frac{\alpha}{\pi}\right)^2 + A_1^{(3)} \left(\frac{\alpha}{\pi}\right)^3 + A_1^{(4)} \left(\frac{\alpha}{\pi}\right)^4 + \cdots + A_2^{(2)}\!\left(\frac{m_e}{m_\mu}\right) \left(\frac{\alpha}{\pi}\right)^2 + a_e({\rm had}) + a_e({\rm EW}) \qquad [13]$$

The explicit calculation gives

$$A_1^{(1)} = \frac{1}{2} \quad \text{(Passera 2005, ref. 1)}$$

$$A_1^{(2)} = \frac{197}{144} + \frac{1}{12}\pi^2 - \frac{1}{2}\pi^2 \ln 2 + \frac{3}{4}\zeta(3) = -0.328\,478\,965\,579\ldots \quad \text{(Passera 2005, ref. 17)}$$

$$A_1^{(3)} = \frac{83}{72}\pi^2\zeta(3) - \frac{215}{24}\zeta(5) + \frac{100}{3}\left[ \left( a_4 + \frac{1}{24}\ln^4 2 \right) - \frac{1}{24}\pi^2 \ln^2 2 \right] - \frac{239}{2160}\pi^4 + \frac{139}{18}\zeta(3) - \frac{298}{9}\pi^2 \ln 2 + \frac{17101}{810}\pi^2 + \frac{28259}{5184} = 1.181\,241\,456\ldots \quad \text{(Laporta and Remiddi 1996)}$$
ð4Þ
actions. But another attitude is often adopted in A1 ¼ 1:7283ð35Þ ðKinoshita 2005Þ
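As a numerical cross-check of the closed forms quoted for the two- and three-loop coefficients, the following Python sketch evaluates them in double precision. It assumes the standard definition $a_4 = \sum_{n \ge 1} 1/(2^n n^4)$, which is not spelled out in this section, and the helper names (`zeta`, `a4`, `a1_two_loop`, `a1_three_loop`) are illustrative only. The last lines simply re-add the QED, hadronic and electroweak central values quoted for the muon.

```python
import math

def zeta(s, terms=1000):
    """Riemann zeta for real s > 1: partial sum plus an Euler-Maclaurin tail estimate."""
    n = terms
    head = math.fsum(k ** -s for k in range(1, n + 1))
    # Tail sum_{k>n} k^-s ~ integral - f(n)/2 - f'(n)/12
    tail = n ** (1 - s) / (s - 1) - 0.5 * n ** -s + s * n ** (-s - 1) / 12
    return head + tail

def a4(terms=60):
    """Assumed definition: a_4 = sum_{n>=1} 1 / (2^n n^4)."""
    return math.fsum(1.0 / (2 ** k * k ** 4) for k in range(1, terms + 1))

def a1_two_loop():
    pi2 = math.pi ** 2
    return 197 / 144 + pi2 / 12 - 0.5 * pi2 * math.log(2) + 0.75 * zeta(3)

def a1_three_loop():
    pi2 = math.pi ** 2
    ln2 = math.log(2)
    return (83 / 72 * pi2 * zeta(3) - 215 / 24 * zeta(5)
            + 100 / 3 * ((a4() + ln2 ** 4 / 24) - pi2 * ln2 ** 2 / 24)
            - 239 / 2160 * pi2 ** 2 + 139 / 18 * zeta(3)
            - 298 / 9 * pi2 * ln2 + 17101 / 810 * pi2 + 28259 / 5184)

# Central values quoted in the text, in units of 1e-11.
a_mu_total = 116_584_718.8 + 6970 + 154

print(a1_two_loop(), a1_three_loop(), a_mu_total)
```

The three-loop expression involves strong cancellations between terms of order 200, so agreement with the quoted value to nine decimal places is a useful sanity check on the signs.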
Quantum Entropy
D Petz, Budapest University of Technology and Economics, Budapest, Hungary
© 2006 Elsevier Ltd. All rights reserved.

In the past 50 years, entropy has broken out of thermodynamics and statistical mechanics and invaded communication theory, ergodic theory, mathematical statistics, and even the social and life sciences. The favorite subjects of entropy concern macroscopic phenomena, irreversibility, and incomplete knowledge. In the strictly mathematical sense entropy is related to the asymptotics of probabilities or concerns the asymptotic behavior of probabilities.

This review is organized as follows. First the history of entropy is discussed generally and then we concentrate on the von Neumann entropy again somewhat historically following the work of von Neumann. Umegaki's quantum relative entropy is discussed both in case of finite systems and in the setting of $C^*$-algebras. An axiomatization is presented. To show physical applications of the concept of entropy, the statistical thermodynamics is reviewed in the setting of spin chains. The relative entropy shows up in the asymptotic theory of hypothesis testing and data compression.

General Introduction to Entropy: From Clausius to von Neumann

The word “entropy” was created by Rudolf Clausius and it appeared in his work Abhandlungen über die mechanische Wärmetheorie published in 1864. The word has a Greek origin, its first part reminds us of “energy” and the second part is from “tropos,” which means “turning point.” Clausius' work is the foundation stone of classical thermodynamics. According to Clausius, the change of entropy of a system is obtained by adding the small portions of heat quantity received by the system divided by the absolute temperature during the heat absorption. This definition is satisfactory from a mathematical point of view and gives nothing other than an integral in precise mathematical terms. Clausius postulated that the entropy of a closed system cannot decrease, which is generally referred to as the second law of thermodynamics.

The concept of entropy was really clarified by Ludwig Boltzmann. His scientific program was to deal with the mechanical theory of heat in connection with probabilities. Assume that a macroscopic system consists of a large number of microscopic ones, we simply call them particles. Since we have ideas of quantum mechanics in mind, we assume that each of the particles is in one of the energy levels $E_1 < E_2 < \cdots < E_m$. The number of particles in the level $E_i$ is $N_i$, so $\sum_i N_i = N$ is the total number of particles. A macrostate of our system is given by the occupation numbers $N_1, N_2, \dots, N_m$. The energy of a macrostate is $E = \sum_i N_i E_i$. A given macrostate can be realized by many configurations of the $N$ particles, each of them at a certain energy level $E_i$. These configurations are called microstates. Many microstates realize the same macrostate. We count the number of ways of arranging $N$ particles in $m$ boxes (i.e., energy levels) such that each box has $N_1, N_2, \dots, N_m$ particles. There are

\[ \binom{N}{N_1, N_2, \dots, N_m} := \frac{N!}{N_1!\,N_2! \cdots N_m!} \qquad [1] \]

such ways. This multinomial coefficient is the number of microstates realizing the macrostate $(N_1, N_2, \dots, N_m)$ and it is proportional to the probability of the macrostate if all configurations are assumed to be equally likely. Boltzmann called [1] the thermodynamical probability of the macrostate, in German “thermodynamische Wahrscheinlichkeit,” hence the letter W was used. Of course, Boltzmann argued in the framework of classical mechanics and the discrete values of energy came from an approximation procedure with “energy cells.”

If we are interested in the thermodynamic limit $N$ increasing to infinity, we use the relative numbers $p_i := N_i/N$ to label a macrostate and, instead of the total energy $E = \sum_i N_i E_i$, we consider the average energy per particle $E/N = \sum_i p_i E_i$. To find the most probable macrostate, we wish to maximize [1] under a certain constraint. The Stirling approximation of the factorials gives

\[ \frac{1}{N} \log \binom{N}{N_1, N_2, \dots, N_m} = H(p_1, p_2, \dots, p_m) + O(N^{-1} \log N) \qquad [2] \]

where

\[ H(p_1, p_2, \dots, p_m) := -\sum_i p_i \log p_i \qquad [3] \]

If $N$ is large then the approximation [2] yields that instead of maximizing the quantity [1] we can maximize [3]. For example, maximizing [3] under the constraint $\sum_i p_i E_i = e$, we get

\[ p_i = \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}} \qquad [4] \]
Here the constant $\lambda$ is the solution of the equation

\[ \sum_i E_i \frac{e^{-\lambda E_i}}{\sum_j e^{-\lambda E_j}} = e \]

Note that the last equation has a unique solution if $E_1 < e < E_m$, and the distribution [4] is now known as the discrete Maxwell–Boltzmann law.

Let $p_1, p_2, \dots, p_n$ be the probabilities of different outcomes of a random experiment. According to Shannon, the expression [3] is a measure of our ignorance prior to the experiment. Hence it is also the amount of information gained by performing the experiment. The quantity [3] is maximum when all the $p_i$'s are equal. In information theory, logarithms with base 2 are used and the unit of information is called bit (from binary digit). As will be seen below, an extra factor equal to Boltzmann's constant is included in the physical definition of entropy.

The comprehensive mathematical formalism of quantum mechanics was first presented in the famous book Mathematische Grundlagen der Quantenmechanik published in 1932 by Johann von Neumann. In the traditional approach to quantum mechanics, a physical system is described in a Hilbert space: observables correspond to self-adjoint operators and statistical operators are associated with the states. In fact, a statistical operator describes a mixture of pure states. Pure states are really the physical states and they are given by rank-1 statistical operators, or equivalently by rays of the Hilbert space.

von Neumann associated an entropy quantity to a statistical operator in 1927 and the discussion was extended in his book (von Neumann 1932). His argument was a gedanken experiment on the grounds of phenomenological thermodynamics. Let us consider a gas of $N$ ($\gg 1$) molecules in a box. Suppose that the gas behaves like a quantum system and is described by a statistical operator $\omega$ which is a mixture $\sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i|$, where $|\varphi_i\rangle \equiv \varphi_i$ are orthogonal state vectors. We may take $\lambda_i N$ molecules in the pure state $\varphi_i$ for every $i$. The gedanken experiment gave

\[ S\Big(\sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i|\Big) = \sum_i \lambda_i S(|\varphi_i\rangle\langle\varphi_i|) - \kappa \sum_i \lambda_i \log \lambda_i \qquad [5] \]

where $\kappa$ is Boltzmann's constant and $S$ is a certain thermodynamical entropy quantity (relative to the fixed temperature and molecule density). After this, von Neumann showed that $S(|\varphi\rangle\langle\varphi|)$ is independent of the state vector $|\varphi\rangle$, so that

\[ S\Big(\sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i|\Big) = -\kappa \sum_i \lambda_i \log \lambda_i \qquad [6] \]

up to an additive constant, which could be chosen to be 0 as a matter of normalization. Equation [6] is von Neumann's celebrated entropy formula; it has a more elegant form

\[ S(\omega) = \kappa\, \mathrm{tr}\, \eta(\omega) \qquad [7] \]

where the state $\omega$ is identified with the corresponding statistical operator, and $\eta : \mathbb{R}^+ \to \mathbb{R}$ is the continuous function $\eta(t) = -t \log t$.

von Neumann solved the maximization problem for $S(\omega)$ under the constraint $\mathrm{tr}\, \omega H = e$. This means the determination of the ensemble of maximal entropy when the expectation of the energy operator $H$ is a prescribed value $e$. It is convenient to rephrase his argument in terms of conditional expectations. $H = H^*$ is assumed to have a discrete spectrum and we have a conditional expectation $E$ determined by the eigenbasis of $H$. If we pass from an arbitrary statistical operator $\omega$ with $\mathrm{tr}\, \omega H = e$ to $E(\omega)$, then the entropy is increasing, on the one hand, and the expectation of the energy does not change, on the other, so the maximizer should be searched among the operators commuting with $H$. In this way we are (and von Neumann was) back to the classical problem of statistical mechanics treated at the beginning of this article. In terms of operators, the solution is in the form

\[ \frac{\exp(-\beta H)}{\mathrm{tr}\, \exp(-\beta H)} \qquad [8] \]

which is called the Gibbs state today.

The von Neumann Entropy

von Neumann was aware of the fact that statistical operators form a convex set whose extreme points are exactly the pure states. He also knew that entropy is a concave functional, so

\[ S\Big(\sum_i \lambda_i \omega_i\Big) \ge \sum_i \lambda_i S(\omega_i) \qquad [9] \]

for any convex combination. To determine the entropy of a statistical operator, he used the Schatten decomposition, which is an orthogonal extremal decomposition in our present language. For a statistical operator $\omega$ there are many ways to write it in the form

\[ \omega = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i| \]

if we do not require the state vectors to be orthogonal. The geometry of the statistical operators, that is, the state space, allows many extremal decompositions and among them there is a unique orthogonal one if the spectrum of $\omega$ is not
degenerate. Nonorthogonal pure states are essentially nonclassical. They are between identical and completely different. Jaynes recognized in 1956 that from the point of view of information the Schatten decomposition is optimal. He proved that

\[ S(\omega) = \sup\Big\{ -\sum_i \lambda_i \log \lambda_i : \omega = \sum_i \lambda_i \omega_i \Big\} \qquad [10] \]

where the supremum is over all convex combinations $\omega = \sum_i \lambda_i \omega_i$ of statistical operators. This is Jaynes' contribution to the von Neumann entropy. By the way, formula [10] may be used to define von Neumann entropy for states of an arbitrary $C^*$-algebra whose states cannot be described by statistical operators.

Certainly the highlight of quantum entropy theory in the 1970s was the discovery of subadditivity. This property is formulated in a tripartite system whose Hilbert space $H$ is a tensor product $H_A \otimes H_B \otimes H_C$. A statistical operator $\omega_{ABC}$ admits several reduced densities, $\omega_{AB}$, $\omega_B$, $\omega_{BC}$, and others. The strong subadditivity is the inequality due to Lieb and Ruskai in 1973:

Theorem 1

\[ S(\omega_{ABC}) + S(\omega_B) \le S(\omega_{AB}) + S(\omega_{BC}) \qquad [11] \]

The strong subadditivity inequality [11] is conveniently rewritten in terms of the relative entropy. For statistical operators $\rho$ and $\omega$,

\[ S(\rho \| \omega) = \mathrm{tr}\, \rho(\log \rho - \log \omega) \qquad [12] \]

if $\mathrm{supp}\, \rho \le \mathrm{supp}\, \omega$, otherwise $S(\rho \| \omega) = +\infty$. The relative entropy expresses statistical distinguishability and therefore it decreases under stochastic mappings:

\[ S(\rho \| \omega) \ge S(E(\rho) \| E(\omega)) \qquad [13] \]

for a completely positive trace-preserving mapping $E$. The strong subadditivity is equivalent to

\[ S(\omega_{AB}, \varphi \otimes \omega_B) \le S(\omega_{ABC}, \varphi \otimes \omega_{BC}) \qquad [14] \]

where $\varphi$ is any state on $B(H_A)$ of finite entropy. This inequality is a consequence of monotonicity of the relative entropy, since $\omega_{AB} = E(\omega_{ABC})$ and $\varphi \otimes \omega_B = E(\varphi \otimes \omega_{BC})$, where $E$ is the partial trace over $H_C$. Clearly, the equality in [11] is equivalent to equality in [14].

Theorem 2 The equality holds in [11] if and only if there is an orthogonal decomposition $p_B H_B = \bigoplus_n H^L_{nB} \otimes H^R_{nB}$, $p_B = \mathrm{supp}\, \omega_B$, such that the density operator of $\omega_{ABC}$ satisfies

\[ \omega_{ABC} = \sum_n \omega_B(p_n)\, \omega^L_n \otimes \omega^R_n \qquad [15] \]

where $\omega^L_n \in B(H_A) \otimes B(H^L_{nB})$ and $\omega^R_n \in B(H^R_{nB}) \otimes B(H_C)$ are density operators and $p_n \in B(H_B)$ are the orthogonal projections $H_B \to H^L_{nB} \otimes H^R_{nB}$.

Quantum Relative Entropy

The quantum relative entropy is an information measure representing the uncertainty of a state with respect to another state. Hence it indicates a kind of distance between the two states. The formal definition [12] is due to Umegaki.

Now we approach quantum relative entropy axiomatically. Our crucial postulate includes the notion of conditional expectation. Let us recall that in the setting of operator algebras conditional expectation (or projection of norm 1) is defined as a positive unital idempotent linear mapping onto a subalgebra.

Now we list the properties of the relative entropy functional which will be used in an axiomatic characterization:

1. Conditional expectation property. Assume that $A$ is a subalgebra of $B$ and there exists a projection of norm 1 $E$ of $B$ onto $A$, such that $\varphi \circ E = \varphi$. Then for every state $\omega$ of $B$, $S(\omega, \varphi) = S(\omega|A, \varphi|A) + S(\omega, \omega \circ E)$ holds.
2. Invariance property. For every automorphism $\alpha$ of $B$ we have $S(\omega, \varphi) = S(\omega \circ \alpha, \varphi \circ \alpha)$.
3. Direct sum property. Assume that $B = B_1 \oplus B_2$. Let $\varphi_{12}(a \oplus b) = \lambda \varphi_1(a) + (1 - \lambda)\varphi_2(b)$ and $\omega_{12}(a \oplus b) = \lambda \omega_1(a) + (1 - \lambda)\omega_2(b)$ for every $a \in B_1$, $b \in B_2$ and some $0 < \lambda < 1$. Then $S(\omega_{12}, \varphi_{12}) = \lambda S(\omega_1, \varphi_1) + (1 - \lambda) S(\omega_2, \varphi_2)$.
4. Nilpotence property. $S(\varphi, \varphi) = 0$.
5. Measurability property. The function $(\omega, \varphi) \mapsto S(\omega, \varphi)$ is measurable on the state space of the finite dimensional $C^*$-algebra $B$ (when $\varphi$ is assumed to be faithful).

Theorem 3 If a real valued functional $R(\omega, \varphi)$ defined for faithful states $\varphi$ and arbitrary states $\omega$ of finite quantum systems shares the properties (1)–(5), then there exists a constant $c \in \mathbb{R}$ such that

\[ R(\omega, \varphi) = c\, \mathrm{Tr}\, D_\omega(\log D_\omega - \log D_\varphi) \]

The relative entropy may be defined for linear functionals of an arbitrary $C^*$-algebra. The general definition may go through von Neumann algebras, normal states and the relative modular operator. Another possibility is based on the monotonicity. Let $\omega$ and $\varphi$ be states of a $C^*$-algebra $A$. Consider finite-dimensional algebras $B$ and completely positive unital mappings $\alpha : B \to A$. Then the supremum
of the relative entropies $S(\omega \circ \alpha \| \varphi \circ \alpha)$ (over all $\alpha$) can be defined as $S(\omega \| \varphi)$.

Theorem 4 The relative entropy of states of $C^*$-algebras shares the following properties.

(i) $(\omega, \varphi) \mapsto S(\omega \| \varphi)$ is convex and weakly lower-semicontinuous.
(ii) $\|\varphi - \omega\|^2 \le 2 S(\omega, \varphi)$.
(iii) For a unital Schwarz map $\alpha : A_0 \to A_1$ the relation $S(\omega \circ \alpha \| \varphi \circ \alpha) \le S(\omega \| \varphi)$ holds.

Property (iii) is Uhlmann's monotonicity theorem, which we have already applied above.

The relative entropy appears in many concepts and problems in the area of quantum information theory (Nielsen and Chuang 2000, Schumacher and Westmoreland 2002).

automorphisms of the quasilocal algebra $A$. Clearly, the covariance condition

\[ \gamma_x(A_\Lambda) = A_{\Lambda + x} \]

holds, where $\Lambda + x$ is the space-translate of the region $\Lambda$ by the displacement $x$.

Having described the kinematical structure of lattice systems, we turn to the dynamics. The local Hamiltonian $H(\Lambda)$ is taken to be the total potential energy between the particles confined to $\Lambda$. This energy may come from many-body interactions of various orders. Most generally, we assume that there exists a global function $\Phi$ such that for any finite subsystem $\Lambda$ the local Hamiltonian takes the form

\[ H(\Lambda) = \sum_{X \subset \Lambda} \Phi(X) \qquad [16] \]
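For a finite system, the variational principle referred to in what follows is the standard statement that $\log \mathrm{tr}\, e^{-\beta H} \ge S(\omega) - \beta\, \omega(H)$ for every state $\omega$, with equality exactly at the Gibbs state [8]. The Python sketch below illustrates this in the simplest commuting (classical) case, where states are probability vectors over spin configurations. The local Hamiltonian is built in the spirit of [16] from a hypothetical interaction with one-body and nearest-neighbour terms; the coupling constants, chain length and function names are assumptions made only for this example.

```python
import itertools
import math

def chain_energy(spins, coupling=1.0, field=0.5):
    """H(Lambda) = sum of Phi(X) over X in Lambda for a hypothetical interaction:
    Phi({i}) = -field * s_i and Phi({i, i+1}) = -coupling * s_i * s_{i+1}."""
    one_body = -field * sum(spins)
    two_body = -coupling * sum(a * b for a, b in zip(spins, spins[1:]))
    return one_body + two_body

def entropy(p):
    """Entropy of a diagonal state, -sum p log p (Boltzmann's constant set to 1)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def gibbs(energies, beta):
    """Diagonal Gibbs state exp(-beta H) / tr exp(-beta H) and log tr exp(-beta H)."""
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], math.log(z)

def functional(p, energies, beta):
    """S(omega) - beta * omega(H) for a diagonal state with eigenvalues p."""
    return entropy(p) - beta * sum(x * e for x, e in zip(p, energies))

configs = list(itertools.product((-1, 1), repeat=4))   # 4-site chain, 16 configurations
energies = [chain_energy(c) for c in configs]
beta = 0.7
p_gibbs, log_z = gibbs(energies, beta)
uniform = [1.0 / len(configs)] * len(configs)

print(log_z, functional(p_gibbs, energies, beta), functional(uniform, energies, beta))
```

The gap between `log_z` and the functional at any other diagonal state equals the relative entropy of that state with respect to the Gibbs state, which is why it is never negative.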
Our aim is to explain this variational principle after the thermodynamic limit is performed.

The thermodynamic limit “$\Lambda$ tends to infinity” may be taken along lattice parallelepipeds. Let $a \in \mathbb{Z}^\nu$ with positive coordinates and define

\[ \Lambda(a) = \{ x \in \mathbb{Z}^\nu : 0 \le x_i < a_i,\ i = 1, 2, \dots, \nu \} \qquad [19] \]

When $a \to \infty$, $\Lambda(a)$ tends to infinity in a manner suitable for the study of thermodynamic limit: the boundary of the parallelepipeds is getting more and more negligible compared with the volume. The notion of limit in the sense of van Hove makes this idea more precise and physically more satisfactory. For the sake of simplicity, we restrict ourselves to thermodynamic limit along parallelepipeds.

Denoting by $|\Lambda|$ the volume of $\Lambda$ (or the number of points in $\Lambda$), we may define the global energy, entropy, and free energy functionals of translationally invariant states to be

\[ e(\varphi) := \lim_{\Lambda \to \infty} E_\Lambda(\varphi)/|\Lambda| \qquad [20] \]

\[ s(\varphi) := \lim_{\Lambda \to \infty} S_\Lambda(\varphi)/|\Lambda| \qquad [21] \]

\[ f(\varphi) := \lim_{\Lambda \to \infty} F_\Lambda(\varphi)/|\Lambda| \qquad [22] \]

The existence of the limit in [21] is guaranteed by the strong subadditivity of entropy, while that of the limits in [20] and [22] is assured if the interaction is suitably tempered, as it certainly is if the interaction is of finite range.

Theorem 5 If $\varphi$ is a translationally invariant state of the quasilocal algebra $A$, then the limit [21] exists and

\[ s(\varphi) = \inf\{ S_{\Lambda(a)}(\varphi)/|\Lambda(a)| : a \in \mathbb{Z}^\nu_+ \} \qquad [23] \]

Moreover, the von Neumann entropy density functional $\varphi \mapsto s(\varphi)$ is affine and upper-semicontinuous when the state space is endowed with the weak$^*$ topology.

Let $\Phi$ be an interaction of finite range. Then the thermodynamic limit [20] exists and the energy density is given by

\[ e(\varphi) = \varphi(E_\Phi) \quad \text{and} \quad E_\Phi = \sum_{0 \in \Lambda} \frac{\Phi(\Lambda)}{|\Lambda|} \]

Furthermore, $e(\varphi)$ is an affine weak$^*$ continuous functional of $\varphi$.

It follows that the free energy density $f(\varphi)$ exists and it is an affine lower-semicontinuous function of the translation-invariant state $\varphi$.

For $0 < \beta < \infty$ the thermodynamic limit

\[ \lim_{\Lambda \to \infty} \frac{1}{|\Lambda|} \log \mathrm{tr}\, e^{-\beta H(\Lambda)} \equiv p(\beta, \Phi) \]

exists.

In accordance with the lattice-gas interpretation of our model, the global quantity $p$ is termed pressure.

In the treatment of quantum spin systems, the set $S_\gamma$ of all translation-invariant states is essential. The global entropy functional $s$ is a continuous affine function on $S_\gamma$ and physically it is a macroscopic quantity which does not have a microscopic (i.e., local) counterpart. Indeed, the local entropy functional is not an observable because it is not affine on the (local) state space. The local internal energy $E_\Lambda(\varphi)$ is a microscopic observable and the energy density functional $e$ of $S_\gamma$ is the corresponding global extensive quantity.

As an analog of the variational principle for finite quantum systems, the global free-energy functional $f$ attains an absolute minimum at a translationally invariant state, and the minimum value of $f$ is equal to the thermodynamic limit of the canonical free-energy densities of the local finite systems. In the next theorem, this global variational principle will be formulated in a slightly different but equivalent way.

Theorem 6 When $\Phi$ is an interaction of finite range, then

\[ p(\beta, \Phi) = \sup\{ s(\omega) - \beta e(\omega) \} \]

holds, when the supremum is over all translationally invariant states $\omega$ on $A$.

The states attaining the supremum on the right-hand side (equivalently, the minimizers of the free energy) are called equilibrium states and they have several different characterizations.

Asymptotical Properties

We keep the notation of the previous section but we consider one-dimensional chains, $\nu = 1$. Let $\omega$ be a translation-invariant state on $A$ and we fix a positive number $\varepsilon < 1$. We have in our mind that $\varepsilon$ is small and say that a sequence of projections $Q_n \in A_{[1,n]}$ is of high probability if $\omega(Q_n) \ge 1 - \varepsilon$. The size of $Q_n$, the cardinality of a maximal pairwise orthogonal family of projections contained in $Q_n$, is given by $\mathrm{tr}_n Q_n$. (The subscript $n$ in $\mathrm{tr}_n$ indicates that the algebraic trace functional on $A_n$ is meant here.) The theorem below says that the entropy density of $\omega$ governs asymptotically the rank of the high-probability projections.

Theorem 7 Assume that $\omega$ is an ergodic translation-invariant state of $A$. Then the limit relation

\[ \lim_{n \to \infty} \frac{1}{n} \inf\{ \log \mathrm{tr}_n Q_n \} = s(\omega) \]

holds, when the infimum is over all projections $Q_n \in A_{[1,n]}$ such that $\omega_n(Q_n) \ge 1 - \varepsilon$.
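The statement of Theorem 7 can be made concrete for the simplest ergodic state, a product state of qubits. The Python sketch below is an illustration under that assumption only: the one-site density matrix is taken to be diagonal with eigenvalues $(1 - q, q)$, so the restriction to $n$ sites has eigenvalues $q^k (1 - q)^{n-k}$ with multiplicity $\binom{n}{k}$, and a projection of (essentially) minimal rank with probability at least $1 - \varepsilon$ is obtained by collecting eigenvectors in order of decreasing eigenvalue. The function names and the parameter values are hypothetical.

```python
import math

def entropy_density(q):
    """Entropy of the one-site state with eigenvalues (1 - q, q); this is s(omega) for the product state."""
    return -q * math.log(q) - (1 - q) * math.log(1 - q)

def min_rank_rate(n, q, eps):
    """(1/n) log rank of a high-probability projection for the n-site product state.

    Assumes 0 < q < 1/2, so eigenvalues decrease as k (the number of factors q) grows.
    The last eigenvalue class is only partially included; its share is rounded up,
    so the returned rank is a very tight upper bound on the minimal one.
    """
    target = 1.0 - eps
    prob, rank = 0.0, 0
    for k in range(n + 1):
        multiplicity = math.comb(n, k)
        log_class = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                     + k * math.log(q) + (n - k) * math.log(1 - q))
        class_prob = math.exp(log_class)
        if prob + class_prob < target:
            prob += class_prob
            rank += multiplicity
            continue
        fraction = (target - prob) / class_prob          # share of this class still needed
        scaled = math.ceil(fraction * 10 ** 12)          # integer arithmetic keeps big ranks exact
        rank += -(-multiplicity * scaled // 10 ** 12)    # ceiling division
        break
    return math.log(rank) / n

q, eps = 0.1, 0.1
s = entropy_density(q)
gap_small = abs(min_rank_rate(200, q, eps) - s)
gap_large = abs(min_rank_rate(2000, q, eps) - s)
print(s, gap_small, gap_large)
```

The convergence is slow, of order $n^{-1/2}$, so the rate at $n = 2000$ is still visibly above the entropy density, but the gap shrinks as $n$ grows, in line with the theorem.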
This result is strongly related to data compression. When $\omega$ is interpreted as a stationary quantum source (with possible memory), then efficient and reliable data compression needs a subspace of small dimension and the range of $Q_n$ can play this role. The entropy density is the maximal rate of reliable compression.

It is interesting that one can impose further requirements on the high-probability projections and the statement of the theorem remains true.

1. The partial trace of $Q_{n+1}$ over $A_{n+1}$ is $Q_n$;
2. $e^{n(s - \varepsilon)} \le \mathrm{tr}\, Q_n \le e^{n(s + \varepsilon)}$ if $n$ is large enough; and
3. if $q \le Q_n$ is a minimal projection (in $A_{[1,n]}$), then $\omega(q) \le e^{-n(s - \varepsilon)}$ if $n$ is large enough.

In (2) and (3) $s$ stands for $s(\omega)$. Let $D_n$ be the density matrix of the restriction of $\omega$ to $A_{[1,n]}$. It follows that for an eigenvalue $\lambda$ of $Q_n D_n Q_n$ the inequality

\[ -\frac{\log \lambda}{n} \ge s - \varepsilon \]

holds.

From the point of view of data compression, it is important if the sequence $Q_n \in A_{[1,n]}$ works universally for many states. Indeed, in this case the compression algorithm can be universal for several quantum sources.

Theorem 8 Let $R > 0$. There is a projection $Q_n \in A_{[1,n]}$ such that

Now we fix a formalism for an asymptotic theory of the hypothesis testing. Suppose that a sequence $(H_n)$ of Hilbert spaces is given, $(\rho_0^{(n)})$ and $(\rho_1^{(n)})$ are density matrices on $H_n$. The typical example we have in mind is $\rho_0^{(n)} = \rho_0 \otimes \rho_0 \otimes \cdots \otimes \rho_0$ and $\rho_1^{(n)} = \rho_1 \otimes \rho_1 \otimes \cdots \otimes \rho_1$. A positive contraction $T_n \in B(H_n)$ is considered as a test on a composite system. Now the errors of the first and second kind depend on $n$: $\alpha_n[T_n] = \mathrm{tr}\, \rho_0^{(n)}(I - T_n)$ and $\beta_n[T_n] = \mathrm{tr}\, \rho_1^{(n)} T_n$. Set

\[ \beta^*(n, \varepsilon) = \inf\{ \mathrm{tr}\, \rho_1^{(n)} A_n \} \qquad [26] \]

where the infimum is over all $A_n \in B(H_n)$ such that $0 \le A_n \le I$ and $\mathrm{tr}\, \rho_0^{(n)}(I - A_n) \le \varepsilon$. In other words, this is the infimum of the error of the second kind when the error of the first kind is at most $\varepsilon$. The importance of this quantity is in the customary approach to hypothesis testing.

The following result is the quantum Stein lemma.

Theorem 9 In the above setting, the relation

\[ \lim_{n \to \infty} \frac{1}{n} \log \beta^*(n, \varepsilon) = -S(\rho_0 \| \rho_1) \]

holds for every $0 < \varepsilon < 1$.
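When $\rho_0$ and $\rho_1$ commute, the optimization in [26] reduces to the classical Neyman–Pearson problem, and the Stein exponent can be computed directly. The following Python sketch covers only this commuting special case, with hypothetical qubit states $\rho_0 = \mathrm{diag}(1 - p_b, p_b)$ and $\rho_1 = \mathrm{diag}(1 - q_b, q_b)$; the optimal test accepts outcomes in order of decreasing likelihood ratio, with a fractional weight on the last class. The function names and numbers are illustrative assumptions.

```python
import math

def relative_entropy(p_b, q_b):
    """S(rho_0 || rho_1) for the commuting states diag(1-p_b, p_b) and diag(1-q_b, q_b)."""
    return p_b * math.log(p_b / q_b) + (1 - p_b) * math.log((1 - p_b) / (1 - q_b))

def log_binomial_pmf(n, k, r):
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(r) + (n - k) * math.log(1 - r))

def stein_exponent(n, p_b, q_b, eps):
    """-(1/n) log beta*(n, eps) in the commuting case, assuming p_b < q_b.

    With p_b < q_b the likelihood ratio decreases in k (the number of 'b' outcomes),
    so classes are accepted for k = 0, 1, 2, ... until the rho_0-mass reaches 1 - eps.
    """
    target = 1.0 - eps
    mass_0 = 0.0
    log_terms = []                      # logs of the rho_1-mass of accepted classes
    for k in range(n + 1):
        class_0 = math.exp(log_binomial_pmf(n, k, p_b))
        log_class_1 = log_binomial_pmf(n, k, q_b)
        if mass_0 + class_0 < target:
            mass_0 += class_0
            log_terms.append(log_class_1)
            continue
        fraction = (target - mass_0) / class_0   # randomized acceptance of the last class
        if fraction > 0:
            log_terms.append(log_class_1 + math.log(fraction))
        break
    top = max(log_terms)
    log_beta = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_beta / n

p_b, q_b, eps = 0.2, 0.6, 0.1
limit = relative_entropy(p_b, q_b)
gap_small = abs(stein_exponent(200, p_b, q_b, eps) - limit)
gap_large = abs(stein_exponent(2000, p_b, q_b, eps) - limit)
print(limit, gap_small, gap_large)
```

The error of the second kind is accumulated in logarithms because it is of order $e^{-n S(\rho_0 \| \rho_1)}$ and would otherwise underflow for large $n$.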
Bibliographic Notes
some algebraic and Gibbs states by Hiai and Petz. The application to data compression was first observed by Schumacher (1995). The chained property of the high-probability subspaces was studied in Bjelaković et al. (2003) and the universality is from Kaltchenko and Yang (2003).

A weak form of the quantum Stein lemma was proved in Hiai and Petz (1991) and the stated form is due to Nagaoka and Ogawa (2000). An extension to the case where $\rho_0^{(n)}$ is not a product was given in Bjelaković and Siegmund-Schultze (2004).

Other surveys about quantum entropy are Petz (1992) and Schumacher and Westmoreland (2002).

See also: Asymptotic Structure and Conformal Infinity; Capacities Enhanced by Entanglement; Channels in Quantum Information Theory; Entropy and Qualitative Transversality; Positive Maps on C*-Algebras; von Neumann Algebras: Introduction, Modular Theory and Classification Theory; von Neumann Algebras: Subfactor Theory.

Further Reading

Bjelaković I, Krüger T, Siegmund-Schultze R, and Szkoła A (2003) Chained typical subspaces – a quantum version of Breiman's theorem. Preprint.
Bjelaković I and Siegmund-Schultze R (2004) An ergodic theorem for the quantum relative entropy. Communications in Mathematical Physics 247: 697–712.
Bratteli O and Robinson DW (1981) Operator Algebras and Quantum Statistical Mechanics. 2. Equilibrium States. Models in Quantum Statistical Mechanics, Texts and Monographs in Physics, (2nd edn., 1997). Berlin: Springer.
Greven A, Keller G, and Warnecke G (2003) Entropy. Princeton and Oxford: Princeton University Press.
Hayden P, Jozsa R, Petz D, and Winter A (2004) Structure of states which satisfy strong subadditivity of quantum entropy with equality. Communications in Mathematical Physics 246: 359–374.
Hiai F and Petz D (1991) The proper formula for relative entropy and its asymptotics in quantum probability. Communications in Mathematical Physics 143: 99–114.
Kaltchenko A and Yang E-H (2003) Universal compression of ergodic quantum sources. Quantum Information and Computation 3: 359–375.
Lieb EH and Ruskai MB (1973) Proof of the strong subadditivity of quantum mechanical entropy. Journal of Mathematical Physics 14: 1938–1941.
Nielsen MA and Petz D (2005) A simple proof of the strong subadditivity inequality. Quantum Information and Computation 5: 507–513.
Ogawa T and Nagaoka H (2000) Strong converse and Stein's lemma in quantum hypothesis testing. IEEE Transactions on Information Theory 46: 2428–2433.
Ohya M and Petz D (1993) Quantum Entropy and Its Use, Texts and Monographs in Physics, (2nd edn., 2004). Berlin: Springer.
Petz D (1992) Entropy in quantum probability. In: Accardi L (ed.) Quantum Probability and Related Topics VII, pp. 275–297. Singapore: World Scientific.
Petz D (2001) Entropy, von Neumann and the von Neumann entropy. In: Rédei M and Stöltzner M (eds.) John von Neumann and the Foundations of Quantum Physics. Dordrecht: Kluwer.
Raggio GA and Werner RF (1991) The Gibbs variational principle for inhomogeneous mean field systems. Helvetica Physica Acta 64: 633–667.
Schumacher B (1995) Quantum coding. Physical Review A 51: 2738–2747.
Schumacher B and Westmoreland MD (2002) Relative entropy in quantum information theory. In: Quantum Computation and Information, (Washington, DC, 2000), Contemp. Math. vol. 305, pp. 265–289. Providence, RI: American Mathematical Society.
Sewell GL (1986) Quantum Theory of Collective Phenomena, New York: Clarendon.
Uhlmann A (1977) Relative entropy and the Wigner–Yanase–Dyson–Lieb concavity in an interpolation theory. Communications in Mathematical Physics 54: 21–32.
von Neumann J (1932) Mathematische Grundlagen der Quantenmechanik. Berlin: Springer. (In English: von Neumann J, Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.)
discrete dynamical system generated by a hyperbolic torus automorphism and its quantization by the metaplectic representation. As these models indicate, the basic problems and phenomena are richly embodied in simple, low-dimensional examples in much the same way that two-dimensional toy statistical mechanical models already illustrate complex problems on phase transitions. The principles established for simple models should apply to far more complex systems such as atoms and molecules in strong magnetic fields.

The conjectural picture which has emerged from many computer experiments and heuristic arguments on these simple model systems is roughly that there exists a length scale in which quantum chaotic systems exhibit universal behavior. At this length scale, the eigenvalues resemble eigenvalues of random matrices of large size and the eigenfunctions resemble random waves. A small sample of the original physics articles suggesting this picture is Berry (1977), Bohigas et al. (1984), Feingold and Peres (1986), and Heller (1984).

This article reviews some of the rigorous mathematical results in quantum chaos, particularly those on eigenfunctions of quantizations of classically ergodic or mixing systems. They support the conjectural picture of random waves up to two moments, that is, on the level of means and variances. A few results also exist on higher moments in very special cases. But from the mathematical point of view, the conjectural links to random matrices or random waves remain very much open at this time. A key difficulty is that the length scale on which universal behavior should occur is far below the resolving power of any known mathematical techniques, even in the simplest model problems. The main evidence for the random matrix and random wave connections comes from numerous computer experiments of model cases in the physics literature. We will not review numerical results here, but to get a well-rounded view of the field, it is important to understand the computer experiments (see, e.g., Bäcker et al. (1998a, b) and Barnett (2005)).

The model quantum systems that have been most intensively studied in mathematical quantum chaos are Laplacians or Schrödinger operators on compact (or finite-volume) Riemannian manifolds, with or without boundary, and quantizations of symplectic maps on compact Kähler manifolds. Similar techniques and results apply in both settings, so for the sake of coherence we concentrate on the Laplacian on a compact Riemannian manifold with “chaotic” geodesic flow and only briefly allude to the setting of “quantum maps.”

Additionally, two main kinds of methods are in use: (1) methods of semiclassical (or microlocal) analysis, which apply to general Laplacians (and more general Schrödinger operators), and (2) methods of number theory and automorphic forms, which apply to arithmetic models such as arithmetic hyperbolic manifolds or quantum cat maps. Arithmetic models are far more “explicitly solvable” than general chaotic systems, and the results obtained for them are far sharper than the results of semiclassical analysis. This article is primarily devoted to the general results on Laplacians obtained by semiclassical analysis; see Arithmetic Quantum Chaos for results by J Marklof. For background on semiclassical analysis, see Heller (1984).

Wave Group and Geodesic Flow

The model quantum Hamiltonians we will discuss are Laplacians $\Delta$ on compact Riemannian manifolds $(M, g)$ (with or without boundary). The classical phase space in this setting is the cotangent bundle $T^*M$ of $M$, equipped with its canonical symplectic form $\sum_i dx_i \wedge d\xi_i$. The metric defines the Hamiltonian

\[ H(x, \xi) = |\xi|_g = \sqrt{\sum_{i,j=1}^{n} g^{ij}(x)\, \xi_i \xi_j} \]

on $T^*M$, where

\[ g_{ij} = g\left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right), \]

$[g^{ij}]$ is the inverse matrix to $[g_{ij}]$. We denote the volume density of $(M, g)$ by $d\mathrm{Vol}$ and the corresponding inner product on $L^2(M)$ by $\langle f, g \rangle$. The unit (co-) ball bundle is denoted $B^*M = \{(x, \xi) : |\xi| \le 1\}$.

The Hamiltonian flow $\Phi^t$ of $H$ is the geodesic flow. By definition, $\Phi^t(x, \xi) = (x_t, \xi_t)$, where $(x_t, \xi_t)$ is the terminal tangent vector at time $t$ of the unit speed geodesic starting at $x$ in the direction $\xi$. Here and below, we often identify $T^*M$ with the tangent bundle $TM$ using the metric to simplify the geometric description. The geodesic flow preserves the energy surfaces $\{H = E\}$ which are the co-sphere bundles $S^*_E M$. Due to the homogeneity of $H$, the flow on any energy surface $\{H = E\}$ is equivalent to that on the co-sphere bundle $S^*M = \{H = 1\}$. (This homogeneity could be broken by adding a potential $V \in C^\infty(M)$ to form a semiclassical Schrödinger operator $h^2 \Delta + V$, whose underlying Hamiltonian flow is generated by $|\xi|^2_g + V(x)$.) See h-Pseudodifferential Operators and Applications.
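To make the definition of the classical Hamiltonian concrete, the following Python sketch evaluates $H(x, \xi) = |\xi|_g$ from a metric given in local coordinates and checks the homogeneity in $\xi$ that makes all energy surfaces equivalent. The round two-sphere in coordinates $(\theta, \phi)$, with $g = d\theta^2 + \sin^2\theta\, d\phi^2$, is chosen here only as a familiar example and is not one of the chaotic models discussed in the article; the function names are hypothetical.

```python
import math

def inverse_2x2(g):
    """Inverse [g^{ij}] of a 2x2 metric matrix [g_{ij}]."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def hamiltonian(metric, x, xi):
    """H(x, xi) = sqrt(sum_{ij} g^{ij}(x) xi_i xi_j) for a two-dimensional manifold."""
    g_inv = inverse_2x2(metric(x))
    quadratic = sum(g_inv[i][j] * xi[i] * xi[j] for i in range(2) for j in range(2))
    return math.sqrt(quadratic)

def round_sphere_metric(x):
    """Metric of the unit sphere in coordinates x = (theta, phi), away from the poles."""
    theta, _phi = x
    return ((1.0, 0.0), (0.0, math.sin(theta) ** 2))

def to_cosphere(metric, x, xi):
    """Rescale a nonzero covector so that it lies on the co-sphere bundle {H = 1}."""
    norm = hamiltonian(metric, x, xi)
    return tuple(component / norm for component in xi)

point = (math.pi / 3, 0.4)
covector = (0.7, -1.2)
h_value = hamiltonian(round_sphere_metric, point, covector)
h_scaled = hamiltonian(round_sphere_metric, point, tuple(3 * c for c in covector))
unit_covector = to_cosphere(round_sphere_metric, point, covector)
print(h_value, h_scaled, unit_covector)
```

On the equator the metric is the identity matrix in these coordinates, so the covector $(3, 4)$ there has $H = 5$, which gives a quick check by hand.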
Quantum Ergodicity and Mixing of Eigenfunctions 185
The quantization of the Hamiltonian $H$ is the square root $\sqrt{\Delta}$ of the positive Laplacian

\[ \Delta = -\frac{1}{\sqrt{g}} \sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\, g^{ij} \sqrt{g}\, \frac{\partial}{\partial x_j} \]

of $(M, g)$. Here, $g = \det[g_{ij}]$. We choose to work with $\sqrt{\Delta}$ rather than $\Delta$ since the former generates the wave group

\[ U_t = e^{it\sqrt{\Delta}} \]

which is the quantization of the geodesic flow $\Phi^t$. By the last statement we mean that $U_t$ is related to $\Phi^t$ in several essentially equivalent ways:

1. singularities of waves, that is, solutions $U_t\psi$ of the wave equation, propagate along geodesics;
2. $U_t$ is a Fourier integral operator (= quantum map) associated to the canonical relation defined by the graph of $\Phi^t$ in $T^*M \times T^*M$; and
3. Egorov’s theorem holds.

We only define the latter since it plays an important role in studying eigenfunctions. As with any quantum theory, there is an algebra of observables on the Hilbert space $L^2(M, d\mathrm{vol}_g)$ which quantizes $T^*M$. Here, $d\mathrm{vol}_g$ is the volume form of the metric. The algebra is that $\Psi^*(M)$ of pseudodifferential operators ($\Psi$DO’s) of all orders, though we often restrict to the subalgebra $\Psi^0$ of $\Psi$DO’s of order zero. We denote by $\Psi^m(M)$ the subspace of pseudodifferential operators of order $m$. The algebra is defined by constructing a quantization Op from an algebra of symbols $a \in S^m(T^*M)$ of order $m$ (polyhomogeneous functions on $T^*M \setminus 0$) to $\Psi^m$. The map Op is not unique. In the reverse direction is the symbol map $\sigma_A : \Psi^m \to S^m(T^*M)$ which takes an operator $\mathrm{Op}(a)$ to the homogeneous term $a_m$ of order $m$ in $a$.

Egorov’s theorem for the wave group concerns the conjugations

\[ \alpha_t(A) := U_t A U_t^*, \quad A \in \Psi^m(M) \qquad [1] \]

Such a conjugation defines the quantum evolution of observables in the Heisenberg picture, and, since the early days of quantum mechanics, it was known to correspond to the classical evolution

\[ V_t(a) := a \circ \Phi^t \qquad [2] \]

of observables $a \in C^\infty(S^*M)$. Egorov’s theorem is the rigorous version of this correspondence: it states that $\alpha_t$ defines an order-preserving automorphism of $\Psi^*(M)$, that is, $\alpha_t(A) \in \Psi^m(M)$ if $A \in \Psi^m(M)$, and that

\[ \sigma_{U_t A U_t^*}(x, \xi) = \sigma_A(\Phi^t(x, \xi)) := V_t(\sigma_A), \quad (x, \xi) \in T^*M \setminus 0 \qquad [3] \]

This formula is almost universally taken to be the definition of quantization of a flow or map in the physics literature.

The key difficulty in quantum chaos is that it involves a comparison between long-time dynamical properties of $\Phi^t$ and $U_t$ through the symbol map and similar classical limits. The classical dynamics defines the ‘‘principal symbol’’ behavior of $U_t$ and the ‘‘error’’ $U_t A U_t^* - \mathrm{Op}(\sigma_A \circ \Phi^t)$ typically grows exponentially in time. This is just the first example of a ubiquitous ‘‘exponential barrier’’ in the subject.

Eigenvalues and Eigenfunctions of $\Delta$

The eigenvalue problem on a compact Riemannian manifold

\[ \Delta \varphi_j = \lambda_j^2 \varphi_j, \qquad \langle \varphi_j, \varphi_k \rangle = \delta_{jk} \]

is dual under the Fourier transform to the wave equation. Here, $\{\varphi_j\}$ is a choice of orthonormal basis of eigenfunctions, which is not unique if the eigenvalues have multiplicities $> 1$. The individual eigenfunctions are difficult to study directly, and so one generally forms the spectral projections kernel,

\[ E(\lambda, x, y) = \sum_{j : \lambda_j \le \lambda} \varphi_j(x)\varphi_j(y) \qquad [4] \]

Semiclassical asymptotics is the study of the $\lambda \to \infty$ limit of the spectral data $\{\varphi_j, \lambda_j\}$ or of $E(\lambda, x, y)$. The (Schwartz) kernel of the wave group can be represented in terms of the spectral data by

\[ U_t(x, y) = \sum_j e^{it\lambda_j} \varphi_j(x)\varphi_j(y) \]

or equivalently as the Fourier transform $\int_{\mathbb{R}} e^{it\lambda}\, dE(\lambda, x, y)$ of the spectral projections. Hence, spectral asymptotics is often studied through the large-time behavior of the wave group.

The link between spectral theory and geometry, and the source of Egorov’s theorem for the wave group, is the construction of a parametrix (or WKB formula) for the wave kernel. For small times $t$, the simplest is the Hadamard parametrix,

\[ U_t(x, y) \sim \int_0^\infty e^{i\theta(r^2(x,y) - t^2)} \sum_{k=0}^{\infty} U_k(x, y)\, \theta^{((d-3)/2) - k}\, d\theta \quad (t < \mathrm{inj}(M, g)) \qquad [5] \]

where $r(x, y)$ is the distance between points, $U_0(x, y) = \Theta^{-1/2}(x, y)$ is the volume 1/2-density, $\mathrm{inj}(M, g)$ is the injectivity radius, and the higher Hadamard coefficients are obtained by solving transport equations along geodesics. The parametrix is asymptotic to the wave kernel in the sense of
smoothness, that is, the difference of the two sides of [5] is smooth. The relation [5] may be iterated using $U_{tm} = U_t^m$ to obtain a parametrix for long times. This is obviously complicated and not necessarily the best long-time parametrix construction, but it illustrates again the difficulty of a long-time analysis.

Weyl Law and Local Weyl Law

A fundamental and classical result in spectral asymptotics is Weyl’s law on counting eigenvalues:

\[ N(\lambda) = \#\{j : \lambda_j \le \lambda\} = \frac{|B_n|}{(2\pi)^n}\, \mathrm{Vol}(M, g)\, \lambda^n + O(\lambda^{n-1}) \qquad [6] \]

Here, $|B_n|$ is the Euclidean volume of the unit ball and $\mathrm{Vol}(M, g)$ is the volume of $M$ with respect to the metric $g$. An equivalent formula which emphasizes the correspondence between classical and quantum mechanics is

\[ \mathrm{tr}\, E_\lambda = \frac{\mathrm{Vol}(|\xi|_g \le \lambda)}{(2\pi)^n} \qquad [7] \]

An important generalization is the ‘‘local Weyl law’’ concerning the traces $\mathrm{tr}\, A E(\lambda)$, where $A \in \Psi^m(M)$. It asserts that

\[ \sum_{\lambda_j \le \lambda} \langle A\varphi_j, \varphi_j \rangle = \frac{1}{(2\pi)^n} \left( \int_{B^*M} \sigma_A \, dx\, d\xi \right) \lambda^n + O(\lambda^{n-1}) \qquad [10] \]

There is also a pointwise local Weyl law:

\[ \sum_{\lambda_j \le \lambda} |\varphi_j(x)|^2 = \frac{1}{(2\pi)^n} |B^n| \lambda^n + R(\lambda, x) \qquad [11] \]

where $R(\lambda, x) = O(\lambda^{n-1})$ uniformly in $x$. Again, when the periodic geodesics form a set of measure zero in $S^*M$, one could average over the shorter interval $[\lambda, \lambda + 1]$. Combining the Weyl and local Weyl law, we find the surface average of $\sigma_A$ is a limit of traces:

\[ \omega(A) := \frac{1}{\mu(S^*M)} \int_{S^*M} \sigma_A \, d\mu = \lim_{\lambda \to \infty} \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} \langle A\varphi_j, \varphi_j \rangle \qquad [12] \]
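As a rough numerical illustration of Weyl’s law [6], one can count eigenvalues on a simple test case. The sketch below assumes the flat torus $\mathbb{R}^2/\mathbb{Z}^2$ (volume 1), where the frequencies are $\lambda = 2\pi|k|$ for $k \in \mathbb{Z}^2$; the function names are illustrative and not taken from the article.

```python
import math


def torus_counting_function(lam: float) -> int:
    """N(lam) = #{k in Z^2 : 2*pi*|k| <= lam} on the flat torus R^2/Z^2."""
    radius = lam / (2 * math.pi)
    r = int(math.floor(radius))
    r2 = int(math.floor(radius * radius))  # integer bound on k1^2 + k2^2
    count = 0
    for k1 in range(-r, r + 1):
        # largest |k2| with k1^2 + k2^2 <= radius^2
        k2_max = math.isqrt(r2 - k1 * k1)
        count += 2 * k2_max + 1
    return count


def weyl_leading_term(lam: float, n: int = 2, volume: float = 1.0) -> float:
    """Leading term |B_n| / (2*pi)^n * Vol * lam^n of Weyl's law [6]."""
    unit_ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return unit_ball / (2 * math.pi) ** n * volume * lam ** n


if __name__ == "__main__":
    for lam in (50.0, 200.0, 800.0):
        exact = torus_counting_function(lam)
        approx = weyl_leading_term(lam)
        print(f"lambda={lam:7.1f}  N={exact:7d}  Weyl={approx:10.1f}  ratio={exact / approx:.4f}")
```

The ratio should approach 1 as $\lambda$ grows, with a relative discrepancy of order $1/\lambda$, in line with the $O(\lambda^{n-1})$ remainder.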
Problem 1 Let $Q$ denote the set of ‘‘quantum limits,’’ that is, weak limit points of the sequence $\{\Phi_k\}$ of distributions on the classical phase space $S^*M$, defined by

\[ \int_X a \, d\Phi_k := \langle \mathrm{Op}(a)\varphi_k, \varphi_k \rangle \]

where $a \in C^\infty(S^*M)$.

The set $Q$ is independent of the definition of Op. It follows almost immediately from Egorov’s theorem that $Q \subset \mathcal{M}_I$, where $\mathcal{M}_I$ is the convex set of invariant probability measures for the geodesic flow. Furthermore, they are time-reversal invariant, that is, invariant under $(x, \xi) \to (x, -\xi)$ since the eigenfunctions are real valued.

To see this, it is helpful to introduce the linear functionals on $\Psi^0$:

\[ \rho_k(A) = \langle \mathrm{Op}(a)\varphi_k, \varphi_k \rangle \qquad [14] \]

We observe that $\rho_k(I) = 1$, $\rho_k(A) \ge 0$ if $A \ge 0$, and that

\[ \rho_k(U_t A U_t^*) = \rho_k(A) \qquad [15] \]

Indeed, if $A \ge 0$ then $A = B^*B$ for some $B \in \Psi^0$ and we can move $B^*$ to the right-hand side. Similarly, [15] is proved by moving $U_t$ to the right-hand side and using [13]. These properties mean that $\rho_j$ is an ‘‘invariant state’’ on the algebra $\Psi^0$. More precisely, one should take the closure of $\Psi^0$ in the operator norm. An invariant state is the analog in quantum statistical mechanics of an invariant probability measure.

The next important fact about the states $\rho_k$ is that any weak limit of the sequence $\{\rho_k\}$ on $\Psi^0$ is an invariant probability measure on $C(S^*M)$, that is, a positive linear functional on $C(S^*M)$ rather than just a state on $\Psi^0$. This follows from the fact that $\langle K\varphi_j, \varphi_j \rangle \to 0$ for any compact operator $K$, and so any limit of $\langle A\varphi_k, \varphi_k \rangle$ is equally a limit of $\langle (A + K)\varphi_k, \varphi_k \rangle$. Hence, any limit is bounded by $\inf_K \|A + K\|$ (the infimum taken over compact operators), and for any $A \in \Psi^0$, $\|\sigma_A\|_{L^\infty} = \inf_K \|A + K\|$. Hence, any weak limit is bounded by a constant times $\|\sigma_A\|_{L^\infty}$ and is therefore continuous on $C(S^*M)$. It is a positive functional since each $\rho_j$, and hence any limit, is a probability measure. By Egorov’s theorem and the invariance of the $\rho_k$, any limit of $\rho_k(A)$ is a limit of $\rho_k(\mathrm{Op}(\sigma_A \circ \Phi^t))$ and hence the limit measure is invariant.

Problem 1 is thus to identify which invariant measures in $\mathcal{M}_I$ show up as weak limits of the functionals $\rho_k$ or equivalently the distributions $d\Phi_k$. The weak limits reflect the concentration and oscillation properties of eigenfunctions. Here are some possibilities:

1. Normalized Liouville measure. In fact, the functional $\omega$ of [12] is also a state on $\Psi^0$ for the reason explained above. A subsequence $\{\varphi_{k_j}\}$ of eigenfunctions is considered diffuse if $\rho_{k_j} \to \omega$.
2. A periodic orbit measure $\mu_\gamma$ defined by $\mu_\gamma(A) = \frac{1}{L_\gamma} \int_\gamma \sigma_A \, ds$, where $L_\gamma$ is the length of $\gamma$. A sequence of eigenfunctions for which $\rho_{k_j} \to \mu_\gamma$ obviously concentrates (or strongly ‘‘scars’’) on the closed geodesic.
3. A finite sum of periodic orbit measures.
4. A delta-function along an invariant Lagrangian manifold $\Lambda \subset S^*M$. The associated eigenfunctions are viewed as ‘‘localizing’’ along $\Lambda$.
5. A more general invariant measure which is singular with respect to $d\mu$.

All of these possibilities can and do happen in different examples. If $d\Phi_{k_j} \to \omega$, then in particular we have

\[ \frac{1}{\mathrm{Vol}(M)} \int_E |\varphi_{k_j}(x)|^2 \, d\mathrm{Vol} \to \frac{\mathrm{Vol}(E)}{\mathrm{Vol}(M)} \]

for any measurable set $E$ whose boundary has measure zero. Interpreting $|\varphi_{k_j}(x)|^2 \, d\mathrm{Vol}$ as the probability density of finding a particle of energy $\lambda_k^2$ at $x$, this result means that the sequence of probabilities tends to uniform measure.

However, $d\Phi_{k_j} \to \omega$ is much stronger since it says that the eigenfunctions become diffuse on the energy surface $S^*M$ and not just on the configuration space $M$. As an example, consider the flat torus $\mathbb{R}^n/\mathbb{Z}^n$. An orthonormal basis of eigenfunctions is furnished by the standard exponentials $e^{2\pi i \langle k, x \rangle}$ with $k \in \mathbb{Z}^n$. Obviously, $|e^{2\pi i \langle k, x \rangle}|^2 = 1$, so the eigenfunctions are already diffuse in configuration space. On the other hand, they are far from diffuse in phase space, and localize on invariant Lagrangian tori in $S^*M$. Indeed, by definition of pseudodifferential operator, $A e^{2\pi i \langle k, x \rangle} = a(x, k)\, e^{2\pi i \langle k, x \rangle}$, where $a(x, k)$ is the complete symbol. Thus,

\[ \langle A e^{2\pi i \langle k, x \rangle}, e^{2\pi i \langle k, x \rangle} \rangle = \int_{\mathbb{R}^n/\mathbb{Z}^n} a(x, k) \, dx \sim \int_{\mathbb{R}^n/\mathbb{Z}^n} \sigma_A\!\left(x, \frac{k}{|k|}\right) dx \]

A subsequence $e^{2\pi i \langle k_j, x \rangle}$ of eigenfunctions has a weak limit if and only if $k_j/|k_j|$ tends to a limit vector $\xi_0$ in the unit sphere in $\mathbb{R}^n$. In this case, the associated
weak limit is $\int_{\mathbb{R}^n/\mathbb{Z}^n} \sigma_A(x, \xi_0) \, dx$, that is, the delta-function on the invariant torus $T_{\xi_0} \subset S^*M$ defined by the constant momentum condition $\xi = \xi_0$. The eigenfunctions are said to localize on this invariant torus for $\Phi^t$.

The flat torus is a model of a completely integrable system on both the classical and quantum levels. Another example is that of the standard round sphere $S^n$. In this case, the author and D Jakobson showed that absolutely any invariant measure $\mu \in \mathcal{M}_I$ can arise as a weak limit of a sequence of eigenfunctions. This reflects the huge degeneracy (multiplicities) of the eigenvalues.

On the other hand, if the geodesic flow is ergodic, one would expect the eigenfunctions to be diffuse in phase space. In the next section, we will discuss the rigorous results on this problem.

Off-diagonal matrix elements

\[ \rho_{jk}(A) = \langle A\varphi_i, \varphi_j \rangle \qquad [16] \]

are also important as transition amplitudes between states. They no longer define states, since $\rho_{jk}(I) = 0$ and they are neither positive nor invariant. Indeed, $\rho_{jk}(U_t A U_t^*) = e^{it(\lambda_j - \lambda_k)} \rho_{jk}(A)$, so they are eigenvectors of the automorphism $\alpha_t$ of [1]. A sequence of such matrix elements cannot have a weak limit unless the spectral gap $\lambda_j - \lambda_k$ tends to a limit $\tau \in \mathbb{R}$. In this case, by the same discussion as above, any weak limit of the functionals $\rho_{jk}$ will be an eigenmeasure of the geodesic flow which transforms by $e^{i\tau t}$ under the action of $\Phi^t$. Examples of such eigenmeasures are orbital Fourier coefficients

\[ \frac{1}{L_\gamma} \int_0^{L_\gamma} e^{-i\tau t} \sigma_A(\Phi^t(x, \xi)) \, dt \]

along a periodic orbit. Here, $\tau \in (2\pi/L_\gamma)\mathbb{Z}$. We denote by $Q_\tau$ such eigenmeasures of the geodesic flow. Problem 1 has the following extension to off-diagonal elements:

Problem 2 Determine the set $Q_\tau$ of ‘‘quantum limits,’’ that is, weak limit points of the sequence $\{\Phi_{kj}\}$ of distributions on the classical phase space $S^*M$, defined by

\[ \int_X a \, d\Phi_{kj} := \langle \mathrm{Op}(a)\varphi_k, \varphi_j \rangle \]

where $\lambda_j - \lambda_k = \tau + o(1)$ and where $a \in C^\infty(S^*M)$, or equivalently of the functionals $\rho_{jk}$.

As will be discussed in the section ‘‘Quantum weak mixing,’’ the asymptotics of off-diagonal elements depends on the weak mixing properties of the geodesic flow and not just its ergodicity.

Matrix elements of eigenfunctions are quadratic forms. More ‘‘nonlinear’’ problems involve the $L^p$-norms or the distribution functions of eigenfunctions. Estimates of the $L^\infty$-norms can be obtained from the pointwise local Weyl law [11]. Since the jump in the left-hand side at $\lambda$ is $\sum_{j : \lambda_j = \lambda} |\varphi_j(x)|^2$ and the jump in the right-hand side is the jump of $R(\lambda, x)$, this implies

\[ \sum_{j : \lambda_j = \lambda} |\varphi_j(x)|^2 = O(\lambda^{n-1}) \implies \|\varphi_j\|_{L^\infty} = O(\lambda^{(n-1)/2}) \qquad [17] \]

For general $L^p$-norms, the following bounds were proved by C Sogge for any compact Riemannian manifold:

\[ \frac{\|\varphi_j\|_p}{\|\varphi\|_2} = O(\lambda^{\delta(p)}), \quad 2 \le p \le \infty \qquad [18] \]

where

\[ \delta(p) = \begin{cases} n\left(\frac{1}{2} - \frac{1}{p}\right) - \frac{1}{2}, & \frac{2(n+1)}{n-1} \le p \le \infty \\[4pt] \frac{n-1}{2}\left(\frac{1}{2} - \frac{1}{p}\right), & 2 \le p \le \frac{2(n+1)}{n-1} \end{cases} \qquad [19] \]

These estimates are sharp on the unit sphere $S^n \subset \mathbb{R}^{n+1}$. The extremal eigenfunctions are the zonal spherical harmonics, which are the $L^2$-normalized spectral projection kernels $\Pi_N(x, x_0)/\|\Pi_N(\cdot, x_0)\|$ centered at any $x_0$. However, they are not sharp for generic $(M, g)$, and it is natural to ask how ‘‘chaotic dynamics’’ might influence $L^p$-norms.

Problem 3 Improve the estimates $\|\varphi_j\|_p/\|\varphi\|_2 = O(\lambda^{\delta(p)})$ for $(M, g)$ with ergodic or mixing geodesic flow.

C Sogge and the author have proved that if a sequence of eigenfunctions attains the bounds in [17], then there must exist a point $x_0$ so that a positive measure of geodesics starting at $x_0$ in $S^*_{x_0}M$ returns to $x_0$ at a fixed time $T$. In the real analytic case, all return so $x_0$ is a perfect recurrent point. In dimension 2, such a perfect recurrent point cannot occur if the geodesic flow is ergodic; hence $\|\varphi_j\|_{L^\infty} = o(\lambda^{(n-1)/2})$ on any real analytic surface with ergodic geodesic flow. This shows that none of the $L^p$-estimates above the critical index are sharp for real analytic surfaces with ergodic geodesic flow, and the problem is the extent to which they can be improved.

The random wave model (see the section ‘‘Random waves and orthonormal bases’’) predicts that eigenfunctions of Riemannian manifolds with chaotic geodesic flow should have the bounds $\|\varphi_\lambda\|_{L^p} = O(1)$ for $p < \infty$ and that $\|\varphi_\lambda\|_{L^\infty} < \sqrt{\log \lambda}$. But there are
no rigorous estimates at this time close to such predictions. The best general estimate to date on negatively curved compact manifolds (which are models of chaotic geodesic flow) is just the logarithmic improvement

\[ \|\varphi_j\|_{L^\infty} = O\!\left( \frac{\lambda^{n-1}}{\log \lambda} \right) \]

on the standard remainder term in the local Weyl law. This was known for compact hyperbolic manifolds from the Selberg trace formula, and similar estimates hold for manifolds without conjugate points (P Bérard). The exponential growth of the geodesic flow again causes a barrier in improving the estimate beyond the logarithm. In the analogous setting of quantum ‘‘cat maps,’’ which are models of chaotic classical dynamics, there exist arbitrarily large eigenvalues with multiplicities of the order $O(\lambda^{n-1}/\log \lambda)$; the $L^\infty$-norm of the $L^2$-normalized projection kernel onto an eigenspace of this multiplicity is of order of the square root of the multiplicity (Faure et al. 2003). This raises doubt that the logarithmic estimate can be improved by general dynamical arguments. Further discussion of $L^\infty$-norms, as well as zeros, will be given at the end

geodesic flow $G^t$ is ergodic on $(S^*M, d\mu)$ if and only if, for every $A \in \Psi^0(M)$, we have:

(i) $\lim_{\lambda \to \infty} \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} |(A\varphi_j, \varphi_j) - \omega(A)|^2 = 0$.
(ii) $(\forall \epsilon)(\exists \delta)\ \limsup_{\lambda \to \infty} \frac{1}{N(\lambda)} \sum_{j \ne k :\ \lambda_j, \lambda_k \le \lambda,\ |\lambda_j - \lambda_k| < \delta} |(A\varphi_j, \varphi_k)|^2 < \epsilon$.

This implies that there exists a subsequence $\{\varphi_{j_k}\}$ of eigenfunctions whose indices $j_k$ have counting density 1 for which $\langle A\varphi_{j_k}, \varphi_{j_k} \rangle \to \omega(A)$. We will call the eigenfunctions in such a sequence ‘‘ergodic eigenfunctions.’’ One can sharpen the results by averaging over eigenvalues in the shorter interval $[\lambda, \lambda + 1]$ rather than in $[0, \lambda]$.

There is also an ergodicity result for boundary values of eigenfunctions on domains with boundary and with Dirichlet, Neumann, or Robin boundary conditions (Gérard–Leichtnam, Hassell–Zelditch, Burq). This corresponds to the fact that the billiard map on $B^*\partial M$ is ergodic.

The first statement (i) is essentially a convexity result. It remains true if one replaces the square by any convex function $\varphi$ on the spectrum of $A$,

\[ \frac{1}{N(E)} \sum_{\lambda_j \le E} \varphi(\langle A\varphi_k, \varphi_k \rangle - \omega(A)) \to 0 \qquad [20] \]
with B = E [hAiT !(A)]E to get where is a closed orbit and T is its period.
X According to the Bowen–Margulis equidistribution
’ðhhAiT !ðAÞ’k ; ’k iÞ theorem for closed orbits of hyperbolic flows, we
j E have
tr ’ðE ½hAiT !ðAÞE Þ ½22
1 X 1
^ corre-
Here, E is the spectral projection for H
!
ðTÞ :T T jdetðI P Þj
sponding to the interval [0, E]. From the Berezin
In these terms, Theorem 1(i) states that

\[ \langle A \rangle = \omega(A) I + K, \quad \text{where } \lim_{\lambda \to \infty} \omega_\lambda(K^*K) \to 0 \qquad [23] \]

where $\omega_\lambda(A) = \mathrm{tr}\, E(\lambda) A$. Thus, the time average equals the space average plus a term $K$ which is semiclassically small in the sense that its Hilbert–Schmidt norm square $\|E_\lambda K\|^2_{HS}$ in the span of the eigenfunctions of eigenvalue $\le \lambda$ is $o(N(\lambda))$.

This is not exactly equivalent to Theorem 1(i) since it is independent of the choice of orthonormal basis, while the previous result depends on the choice of basis. However, when all eigenvalues have multiplicity 1, then the two are equivalent. To see the equivalence, note that $\langle A \rangle$ commutes with $\sqrt{\Delta}$ and hence is diagonal in the basis $\{\varphi_j\}$ of joint eigenfunctions of $\langle A \rangle$ and of $U_t$. Hence, $K$ is the diagonal matrix with entries $\langle A\varphi_k, \varphi_k \rangle - \omega(A)$. The condition is therefore equivalent to

\[ \lim_{E \to \infty} \frac{1}{N(E)} \sum_{\lambda_j \le E} |\langle A\varphi_k, \varphi_k \rangle - \omega(A)|^2 = 0 \]

quantum maps and of Laplacians have much in common, this negative result shows that there cannot exist a universal structural proof of QUE.

The principal positive result available at this time is the recent proof by Lindenstrauss of the QUE property for the orthonormal basis of Laplace–Hecke eigenfunctions on arithmetic hyperbolic surfaces. It is generally believed that the spectrum of the Laplace eigenvalues is of multiplicity 1 for such surfaces, so this should imply QUE completely for these surfaces. Earlier partial results on Hecke eigenfunctions are due to Rudnick–Sarnak, Wolpert, and others. For references and further discussion on Hecke eigenfunctions, see Rudnick and Sarnak (1994) (see Arithmetic Quantum Chaos).

So far we have not mentioned Theorem 1(ii). In the next section, we will describe a similar but more general result for mixing systems and the relevance of (ii) will become clear. An interesting open problem is the extent to which (ii) is actually necessary for the equivalence to classical ergodicity.
sequence of eigenfunctions. Are the following asymptotics valid?

\[ \int_{N_{\varphi_j}} f \, dH^{n-1} \sim \lambda_j \frac{1}{\mathrm{Vol}(M, g)} \int_M f \, d\mathrm{Vol} \]

This is predicted by the random wave model of the section ‘‘Random waves and orthonormal bases.’’ An equidistribution law for the complex zeros is known which gives some evidence for the validity of this limit formula. Let $(M, g)$ be a compact real analytic Riemannian manifold and let $\varphi_j^{\mathbb{C}}$ be the holomorphic extension of the real analytic eigenfunction $\varphi_j$ to the complexification $M_{\mathbb{C}}$ of $M$ (its Grauert tube). Then, if the geodesic flow is ergodic and if $\varphi_j$ is an ergodic sequence of eigenfunctions, the normalized current of integration $(1/\lambda_j) Z_{\varphi_j^{\mathbb{C}}}$ over the complex zero set of $\varphi_j^{\mathbb{C}}$ tends weakly to $(i/\pi)\, \partial\bar{\partial} |\xi|_g$. This current is singular along the zero section.

Finally, we mention some results on $L^\infty$-norms of eigenfunctions on arithmetic hyperbolic manifolds of dimensions 2 and 3. It was proved by Iwaniec–Sarnak that the joint eigenfunctions of $\Delta$ and the Hecke operators on arithmetic hyperbolic surfaces have the upper bound $\|\varphi_j\|_\infty = O_\epsilon(\lambda_j^{5/48 + \epsilon})$ for all $j$ and $\epsilon > 0$, and the lower bound $\|\varphi_j\|_\infty$

The restriction $j \ne k$ is of course redundant unless $\tau = 0$, in which case the statement coincides with quantum ergodicity. This result follows from the general asymptotic formula, valid for any compact Riemannian manifold $(M, g)$, that

\[ \frac{1}{N(\lambda)} \sum_{i \ne j;\ \lambda_i, \lambda_j \le \lambda} |\langle A\varphi_i, \varphi_j \rangle|^2 \left| \frac{\sin T(\lambda_i - \lambda_j - \tau)}{T(\lambda_i - \lambda_j - \tau)} \right|^2 \sim \left\| \frac{1}{2T} \int_{-T}^{T} e^{it\tau} V_t(\sigma_A) \, dt \right\|_2^2 - \left| \frac{\sin T\tau}{T\tau} \right|^2 \omega(A)^2 \qquad [24] \]

In the case of weak-mixing geodesic flows, the right-hand side tends to 0 as $T \to \infty$. As with diagonal sums, the sharper result is true where one averages over the short intervals $[\lambda, \lambda + 1]$.

Spectral Measures and Matrix Elements

Theorem 2 is based on expressing the spectral measures of the geodesic flow in terms of matrix elements. The main limit formula is

\[ \int_{\tau - \epsilon}^{\tau + \epsilon} d\mu_{\sigma_A} := \lim_{\lambda \to \infty} \frac{1}{N(\lambda)} \sum_{i, j :\ \lambda_j \le \lambda,\ |\lambda_i - \lambda_j - \tau| < \epsilon} |\langle A\varphi_i, \varphi_j \rangle|^2 \qquad [25] \]
Since weak-mixing systems are ergodic, it is not necessary to average in both indices along an ergodic subsequence:

\[ \lim_{\lambda_j \to \infty} \langle A_t^* A \varphi_j, \varphi_j \rangle = \sum_i e^{it(\lambda_i - \lambda_j)} |\langle A\varphi_i, \varphi_j \rangle|^2 = \langle V_t \sigma_A, \sigma_A \rangle_{L^2(S^*M)} \qquad [27] \]

Dually, one has

\[ \lim_{\lambda_j \to \infty} \sum_{i :\ |\lambda_i - \lambda_j - \tau| < \epsilon} |\langle A\varphi_i, \varphi_j \rangle|^2 = \int_{\tau - \epsilon}^{\tau + \epsilon} d\mu_{\sigma_A} \qquad [28] \]

For QUE systems, these limit formulas are valid for the full sequence of eigenfunctions.

Rate of Quantum Ergodicity and Mixing

A quantitative refinement of quantum ergodicity is to ask at what rate the sums in Theorem 1(i) tend to zero, that is, to establish a rate of quantum ergodicity. More generally, we consider ‘‘variances’’ of matrix elements. For diagonal matrix elements, we define

\[ V_A(\lambda) := \frac{1}{N(\lambda)} \sum_{j :\ \lambda_j \le \lambda} |\langle A\varphi_j, \varphi_j \rangle - \omega(A)|^2 \qquad [29] \]

In the off-diagonal case, one may view $|\langle A\varphi_i, \varphi_j \rangle|^2$ as analogous to $|\langle A\varphi_j, \varphi_j \rangle - \omega(A)|^2$. However, the sums in [25] are double sums while those of [29] are single. One may also average over the shorter intervals $[\lambda, \lambda + 1]$.

Quantum Chaos Conjectures

First, consider off-diagonal matrix elements. One conjecture is that it is not necessary to sum in $j$ in [28]: each individual term has the asymptotics consistent with [28]. This is implicitly conjectured by Feingold–Peres (1986) (see [11]) in the form

\[ |\langle A\varphi_i, \varphi_j \rangle|^2 \simeq \frac{C_A\!\left( \frac{E_i - E_j}{\hbar} \right)}{2\pi \rho(E)} \qquad [30] \]

where

\[ C_A(\tau) = \int_{-\infty}^{\infty} e^{-i\tau t} \langle V_t \sigma_A, \sigma_A \rangle \, dt \]

In our notation, $\lambda_j = \hbar^{-1} E_j$ and $\rho(E)\, dE \sim dN(\lambda)$. There are $\sim C\lambda^{n-1}$ eigenvalues $\lambda_i$ in the interval $[\lambda_j - \tau - \epsilon, \lambda_j - \tau + \epsilon]$, so [30] states that individual terms have the asymptotics of [28].

On the basis of the analogy between $|\langle A\varphi_i, \varphi_j \rangle|^2$ and $|\langle A\varphi_j, \varphi_j \rangle - \omega(A)|^2$, it is conjectured in Feingold and Peres (1986) that

\[ V_A(\lambda) \sim \frac{C_{A - \omega(A)I}(0)}{\lambda^{n-1}\, \mathrm{vol}(\Omega)} \]

The idea is that $\varphi_\pm = (1/\sqrt{2})(\varphi_i \pm \varphi_j)$ have the same matrix element asymptotics as eigenfunctions when $\lambda_i - \lambda_j$ is sufficiently small. But then $2\langle A\varphi_+, \varphi_- \rangle = \langle A\varphi_i, \varphi_i \rangle - \langle A\varphi_j, \varphi_j \rangle$ when $A^* = A$. Since we are taking a difference, we may replace each matrix element $\langle A\varphi_i, \varphi_i \rangle$ by $\langle A\varphi_i, \varphi_i \rangle - \omega(A)$ (and also for $\varphi_j$). The conjecture then assumes that $\langle A\varphi_i, \varphi_i \rangle - \omega(A)$ has the same order of magnitude as $\langle A\varphi_i, \varphi_i \rangle - \langle A\varphi_j, \varphi_j \rangle$. Dynamical grounds for this conjecture are given in Eckhardt et al. (1995). The order of magnitude is predicted by some natural random wave models, as discussed in the next section.

Rigorous results

At this time, the strongest variance result is an asymptotic formula for the diagonal variance proved by Luo and Sarnak (2004) for special Hecke eigenfunctions on the quotient $H^2/SL(2, \mathbb{Z})$ of the upper half plane by the modular group. Their result pertains to holomorphic Hecke eigenforms, but the analogous statement for smooth Maass–Hecke eigenfunctions is expected to hold by similar methods, so we state the result as a theorem/conjecture. Note that $H^2/SL(2, \mathbb{Z})$ is a noncompact finite-area surface whose Laplacian $\Delta$ has both a discrete and a continuous spectrum. The discrete Hecke eigenfunctions are joint eigenfunctions of $\Delta$ and the Hecke operators $T_p$.

Theorem/Conjecture 1 (Luo and Sarnak 2004). Let $\{\varphi_k\}$ denote the orthonormal basis of Hecke eigenfunctions for $H^2/SL(2, \mathbb{Z})$. Then there exists a quadratic form $B(f)$ on $C_0^\infty(H^2/SL(2, \mathbb{Z}))$ such that

\[ \frac{1}{N(\lambda)} \sum_{\lambda_j \le \lambda} \left| \int_X f |\varphi_j|^2 \, d\mathrm{vol} - \frac{1}{\mathrm{Vol}(X)} \int_X f \, d\mathrm{Vol} \right|^2 = \frac{B(f, f)}{\lambda} + o\!\left( \frac{1}{\lambda} \right) \]

When the multiplier $f = \varphi_\lambda$ is itself an eigenfunction, Luo–Sarnak have shown that

\[ B(\varphi_\lambda, \varphi_\lambda) = C_{\varphi_\lambda}(0)\, L(\tfrac{1}{2}, \varphi_\lambda) \]

where $L(\tfrac{1}{2}, \varphi_\lambda)$ is a certain L-function. Thus, the conjectured classical variance is multiplied by an arithmetic factor depending on the multiplier. A crucial fact in the proof is that the quadratic form $B$ is diagonalized by the $\varphi_\lambda$.
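The identity behind the Feingold–Peres heuristic for $\varphi_\pm = (1/\sqrt{2})(\varphi_i \pm \varphi_j)$ can be checked in one line. The expansion below is a sketch under an extra assumption that the text does not spell out: the matrix element $\langle A\varphi_i, \varphi_j \rangle$ is taken to be real (as happens, for instance, for real-valued eigenfunctions and a self-adjoint $A$ with real kernel), so that the two cross terms cancel.

```latex
\begin{align*}
2\langle A\varphi_+, \varphi_- \rangle
  &= \langle A(\varphi_i + \varphi_j),\, \varphi_i - \varphi_j \rangle \\
  &= \langle A\varphi_i, \varphi_i \rangle - \langle A\varphi_j, \varphi_j \rangle
     + \bigl( \langle A\varphi_j, \varphi_i \rangle - \langle A\varphi_i, \varphi_j \rangle \bigr) \\
  &= \langle A\varphi_i, \varphi_i \rangle - \langle A\varphi_j, \varphi_j \rangle
     - 2i\, \operatorname{Im} \langle A\varphi_i, \varphi_j \rangle
     \qquad (A^* = A) \\
  &= \langle A\varphi_i, \varphi_i \rangle - \langle A\varphi_j, \varphi_j \rangle
     \qquad \text{if } \langle A\varphi_i, \varphi_j \rangle \in \mathbb{R}.
\end{align*}
```

The third line uses $\langle A\varphi_j, \varphi_i \rangle = \overline{\langle A\varphi_i, \varphi_j \rangle}$, which is where self-adjointness enters.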
(Zelditch 1996a). Thus, one can compare the asymptotics for the traces k Ak , (k Ak )2 for any
behavior of sums over eigenvalues of the orthonor- pseudodifferential operator A. Combining the strong
mal basis of eigenfunctions of with that of a Szegö asymptotics with the arguments of Zelditch
random orthonormal basis. Instead of taking (1996a), random orthonormal bases can be proved
Gaussian random combinations of Euclidean plane to satisfy the following variance asymptotics:
waves of a fixed eigenvalue, one takes Gaussian P
P 1. Eð j:j 2Ik jðAU’j ; U’j Þ !ðAÞj2
random combinations j : j 2[,þ1] cj ’j of the eigen-
functions of (M, g) with eigenvalues in a short ð!ðA AÞ !ðAÞ2 Þ;
interval in the sense above. Equivalently, one takes P
P sinTði j
Þ2 2
random combinations with jc j 2
= 1. These 2. Eð i6¼j:j ;i 2Ik Tði
Þ jðAU’j ; U’i Þj
j j j
2 2
random waves are globally adapted to (M, g). The
2sin
T þ 1 P sinTði j
Þ
statistical results depend on the measure of the set of
T NðkÞ i6¼j Tði j
Þ
Sarnak P (1995) Arithmetic quantum chaos. In: The Schur Lectures (Tel Aviv, 1992), Israel Mathematical Conference Proc, vol. 8 (1995), pp. 183–236.
Zelditch S (1996a) A random matrix model for quantum mixing. International Mathematics Research Notices 3: 115–137.
Zelditch S (1996b) Quantum ergodicity of C* dynamical systems. Communications in Mathematical Physics 177: 507–528.
Zelditch S (1996c) Quantum Mixing. Journal of Functional Analysis 140: 68–86.
Zelditch S and Zworski M (1996) Ergodicity of eigenfunctions for ergodic billiards. Communications in Mathematical Physics 175: 673–682.
Frequently, we are interested in codes that correct any error affecting $t$ or fewer physical qubits. In that case, let us consider tensor products of the Pauli matrices

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [2] \]

Define the Pauli group $\mathcal{P}_n$ as the group consisting of tensor products of $I$, $X$, $Y$, and $Z$ on $n$ qubits, with an overall phase of $\pm 1$ or $\pm i$. The weight $\mathrm{wt}(P)$ of a Pauli operator $P \in \mathcal{P}_n$ is the number of qubits on which it acts as $X$, $Y$, or $Z$ (i.e., not as the identity). Then the Pauli operators of weight $t$ or less form a basis for the set of all errors acting on $t$ or fewer qubits, so a QECC which corrects these Pauli operators corrects all errors acting on up to $t$ qubits. If we have a channel which causes errors independently with probability $O(\epsilon)$ on each qubit in the QECC, then the code will allow us to decode a correct state except with probability $O(\epsilon^{t+1})$, which is the probability of having more than $t$ errors. We get a similar result in the case where the noise is a general quantum operation on each qubit which differs from the identity by something of size $O(\epsilon)$.

Definition 2 The distance $d$ of an $((n, K))$ QECC is the smallest weight of a nontrivial Pauli operator $E \in \mathcal{P}_n$ s.t. the equation

\[ \langle \psi_i | E | \psi_j \rangle = C(E)\, \delta_{ij} \qquad [3] \]

fails.

We use the notation $((n, K, d))$ to refer to an $((n, K))$ QECC with distance $d$. Note that for $P, Q \in \mathcal{P}_n$, $\mathrm{wt}(PQ) \le \mathrm{wt}(P) + \mathrm{wt}(Q)$. Then by comparing the definition of distance with the quantum error-correction conditions, we immediately see that a QECC corrects $t$ general errors iff its distance $d > 2t$. If we are instead interested in ‘‘erasure’’ errors, when the location of the error is known but not its precise nature, a distance $d$ code corrects $d - 1$ erasure errors. If we only wish to detect errors, a distance $d$ code can detect errors on up to $d - 1$ qubits.

One of the central problems in the theory of quantum error correction is to find codes which maximize the ratios $(\log K)/n$ and $d/n$, so they can encode as many qubits as possible and correct as many errors as possible. Conversely, we are also interested in the problem of setting upper bounds on achievable values of $(\log K)/n$ and $d/n$. The quantum Singleton bound (or Knill–Laflamme (1997) bound) states that any $((n, K, d))$ QECC must satisfy

\[ n - \log K \ge 2d - 2 \qquad [4] \]

We can set a lower bound on the existence of QECCs using the quantum Gilbert–Varshamov bound, which states that, for large $n$, an $((n, 2^k, d))$ QECC exists provided that

\[ k/n \le 1 - (d/n) \log 3 - h(d/n) \qquad [5] \]

where $h(x) = -x \log x - (1 - x) \log (1 - x)$ is the binary Hamming entropy. Note that the Gilbert–Varshamov bound simply states that codes at least this good exist; it does not suggest that better codes cannot exist.

Stabilizer Codes

In order to better manipulate and discover QECCs, it is helpful to have a more detailed mathematical structure to work with. The most widely used structure gives a class of codes known as ‘‘stabilizer codes’’ (Calderbank et al. 1998, Gottesman 1996). They are less general than arbitrary quantum codes, but have a number of useful properties that make them easier to work with than the general QECC.

Definition 3 Let $S \subset \mathcal{P}_n$ be an abelian subgroup of the Pauli group that does not contain $-1$ or $\pm i$, and let $C(S) = \{ |\psi\rangle \text{ s.t. } P|\psi\rangle = |\psi\rangle \ \forall P \in S \}$. Then $C(S)$ is a stabilizer code and $S$ is its stabilizer.

Because of the simple structure of the Pauli group, any abelian subgroup has order $2^{n-k}$ for some $k$ and can easily be specified by giving a set of $n - k$ commuting generators.

The code words of the QECC are by definition in the $+1$-eigenspace of all elements of the stabilizer, but an error $E$ acting on a code word will move the state into the $-1$-eigenspace of any stabilizer element $M$ which anticommutes with $E$:

\[ M(E|\psi\rangle) = -EM|\psi\rangle = -E|\psi\rangle \qquad [6] \]

Thus, measuring the eigenvalues of the generators of $S$ tells us information about the error that has occurred. The set of such eigenvalues can be represented as an $(n - k)$-dimensional binary vector known as the ‘‘error syndrome.’’ Note that the error syndrome does not tell us anything about the encoded state, only about the error that has occurred.

Theorem 2 Let $S$ be a stabilizer with $n - k$ generators, and let $S^\perp = \{ E \in \mathcal{P}_n \text{ s.t. } [E, M] = 0 \ \forall M \in S \}$. Then $S$ encodes $k$ qubits and has distance $d$, where $d$ is the smallest weight of an operator in $S^\perp \setminus S$.
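The weight function and the bounds [4] and [5] are easy to evaluate numerically. The following is a minimal sketch, assuming Pauli operators are written as strings over `I`, `X`, `Y`, `Z` with the overall phase ignored, and assuming logarithms are taken base 2 (so that $\log K = k$ when $K = 2^k$); all function names are illustrative.

```python
import math

# Each single-qubit Pauli as a pair of bits (x, z); products up to phase are XORs.
_BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
_LETTER = {bits: letter for letter, bits in _BITS.items()}


def weight(pauli: str) -> int:
    """Number of qubits on which the Pauli string acts non-trivially."""
    return sum(1 for c in pauli if c != "I")


def multiply_up_to_phase(p: str, q: str) -> str:
    """Product of two Pauli strings of equal length, discarding the phase."""
    return "".join(
        _LETTER[(_BITS[a][0] ^ _BITS[b][0], _BITS[a][1] ^ _BITS[b][1])]
        for a, b in zip(p, q)
    )


def binary_entropy(x: float) -> float:
    """h(x) = -x log2 x - (1 - x) log2 (1 - x), with h(0) = h(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)


def satisfies_singleton(n: int, k: int, d: int) -> bool:
    """Quantum Singleton bound [4] for K = 2^k: n - k >= 2d - 2."""
    return n - k >= 2 * d - 2


def gilbert_varshamov_rate(delta: float) -> float:
    """Right-hand side of [5]: the rate k/n guaranteed at relative distance delta = d/n."""
    return 1 - delta * math.log2(3) - binary_entropy(delta)


if __name__ == "__main__":
    p, q = "XIZYI", "YZZII"
    pq = multiply_up_to_phase(p, q)
    print(pq, weight(pq), "<=", weight(p) + weight(q))
    print("[[5,1,3]] meets Singleton:", satisfies_singleton(5, 1, 3))
    print("GV rate at d/n = 0.05:", round(gilbert_varshamov_rate(0.05), 4))
```

Since the Gilbert–Varshamov statement is asymptotic in $n$, the last function should be read as a large-$n$ rate estimate rather than a guarantee for any particular small code.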
198 Quantum Error Correction and Fault Tolerance
We use the notation [[n, k, d]] to a refer to such a linear algebra exercise. Another useful representa-
stabilizer code. Note that the square brackets specify tion is to map the single-qubit Pauli operators I, X,
that the code is a stabilizer code, and that the middle Y, Z to the finite field GF(4), which sets up a
term k refers to the number of encoded qubits, and connection between stabilizer codes and a subset of
not the dimension 2k of the encoded subspace, as for classical codes on four-dimensional registers.
the general QECC (whose dimension might not be a
power of 2).
S? is the set of Pauli operators that commute with
CSS Codes
all elements of the stabilizer. They would therefore
appear to be those errors which cannot be detected CSS codes are a very useful class of stabilizer codes
by the code. However, the theorem specifies the invented by Calderbank and Shor (1996), and by
distance of the code by considering S? nS. A Pauli Steane (1996). The construction takes two binary
operator P 2 S cannot be detected by the code, but classical linear codes and produces a quantum code,
there is in fact no need to detect it, since all code and can therefore take advantage of much existing
words remain fixed under P, making it equivalent to knowledge from classical coding theory. In addition,
the identity operation. A distance d stabilizer code CSS codes have some very useful properties which
which has nontrivial $P \in S$ with $\mathrm{wt}(P) < d$ is called degenerate, whereas one which does not is nondegenerate. The phenomenon of degeneracy has no analog for classical error-correcting codes, and makes the study of quantum codes substantially more difficult than the study of classical error correction. For instance, a standard bound on classical error correction is the Hamming bound (or sphere-packing bound), but the analogous quantum Hamming bound

$$\frac{k}{n} \le 1 - \frac{t}{n}\log_2 3 - h\!\left(\frac{t}{n}\right) \qquad [7]$$

for $[[n, k, 2t+1]]$ codes (when $n$ is large) is only known to apply to nondegenerate quantum codes (though in fact we do not know of any degenerate QECCs that violate the quantum Hamming bound).

An example of a stabilizer code is the 5-qubit code, a $[[5,1,3]]$ code whose stabilizer can be generated by

X Z Z X I
I X Z Z X
X I X Z Z
Z X I X Z

The 5-qubit code is a nondegenerate code, and is the smallest possible QECC which corrects 1 error (as one can see from the quantum Singleton bound).

It is frequently useful to consider other representations of stabilizer codes. For instance, $P \in \mathcal{P}_n$ can be represented by a pair of $n$-bit binary vectors $(p_X | p_Z)$, where $p_X$ is 1 for any location where $P$ has an X or Y tensor factor and is 0 elsewhere, and $p_Z$ is 1 for any location where $P$ has a Y or Z tensor factor. Two Pauli operators $P = (p_X | p_Z)$ and $Q = (q_X | q_Z)$ commute iff $p_X \cdot q_Z + p_Z \cdot q_X = 0$. Then the stabilizer for a code becomes a pair of $(n-k) \times n$ binary matrices, and most interesting properties can be determined by an appropriate

make them excellent choices for fault-tolerant quantum computation.

A classical $[n, k, d]$ linear code ($n$ physical bits, $k$ logical bits, classical distance $d$) can be defined in terms of an $(n-k) \times n$ binary "parity check" matrix $H$: every classical code word $v$ must satisfy $Hv = 0$. Each row of the parity check matrix can be converted into a Pauli operator by replacing each 0 with an I operator and each 1 with a Z operator. Then the stabilizer code generated by these operators is precisely a quantum version of the classical error-correcting code given by $H$. If the classical distance $d = 2t + 1$, the quantum code can correct $t$ bit flip (X) errors, just as could the classical code.

If we want to make a QECC that can also correct phase (Z) errors, we should choose two classical codes $C_1$ and $C_2$, with parity check matrices $H_1$ and $H_2$. Let $C_1$ be an $[n, k_1, d_1]$ code and let $C_2$ be an $[n, k_2, d_2]$ code. We convert $H_1$ into stabilizer generators as above, replacing each 0 with I and each 1 with Z. For $H_2$, we perform the same procedure, but each 1 is instead replaced by X. The code will be able to correct bit flip (X) errors as if it had a distance $d_1$ and to correct phase (Z) errors as if it had a distance $d_2$. Since these two operations are completely separate, it can also correct Y errors as both a bit flip and a phase error. Thus, the distance of the quantum code is at least $\min(d_1, d_2)$, but might be higher because of the possibility of degeneracy.

However, in order to have a stabilizer code at all, the generators produced by the above procedure must commute. Define the dual $C^\perp$ of a classical code $C$ as the set of vectors $w$ s.t. $w \cdot v = 0$ for all $v \in C$. Then the Z generators from $H_1$ will all commute with the X generators from $H_2$ iff $C_2^\perp \subseteq C_1$ (or equivalently, $C_1^\perp \subseteq C_2$). When this is true, $C_1$ and $C_2$ define an $[[n, k_1 + k_2 - n, d]]$ stabilizer code, where $d \ge \min(d_1, d_2)$.
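The binary-vector commutation rule and the two-code construction described above can be checked mechanically. The following is a minimal sketch, not the article's own method: the function names are hypothetical, and the parity-check matrix `HAMMING_H` is one assumed choice for the classical Hamming code (its rows are the weight-4 strings 1111000, 1100110, 1010101). It verifies that the four 5-qubit generators commute pairwise, and that using the same Hamming parity-check matrix for both $H_1$ and $H_2$ yields commuting Z-type and X-type generators.

```python
from itertools import combinations

def pauli_to_symplectic(pauli):
    """Map a Pauli string such as 'XZZXI' to its binary vectors (pX, pZ)."""
    px = [1 if c in "XY" else 0 for c in pauli]
    pz = [1 if c in "ZY" else 0 for c in pauli]
    return px, pz

def commute(p, q):
    """Two Paulis commute iff pX.qZ + pZ.qX = 0 (mod 2)."""
    px, pz = pauli_to_symplectic(p)
    qx, qz = pauli_to_symplectic(q)
    overlap = sum(a * b for a, b in zip(px, qz)) + sum(a * b for a, b in zip(pz, qx))
    return overlap % 2 == 0

def all_commute(generators):
    """True if every pair of generators commutes."""
    return all(commute(p, q) for p, q in combinations(generators, 2))

def css_generators(h1, h2):
    """Rows of h1 become Z-type generators; rows of h2 become X-type generators."""
    z_gens = ["".join("Z" if bit else "I" for bit in row) for row in h1]
    x_gens = ["".join("X" if bit else "I" for bit in row) for row in h2]
    return z_gens + x_gens

# Generators of the 5-qubit code as listed in the text.
FIVE_QUBIT = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]

# Assumed parity-check matrix for the classical [7,4,3] Hamming code.
HAMMING_H = [
    [1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 1, 0, 1],
]

if __name__ == "__main__":
    print("5-qubit generators commute:", all_commute(FIVE_QUBIT))
    print("7-qubit CSS generators commute:",
          all_commute(css_generators(HAMMING_H, HAMMING_H)))
```

Working with Pauli strings rather than matrices keeps the check linear in the number of qubits, which is the practical appeal of the binary representation.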
Quantum Error Correction and Fault Tolerance 199
The smallest distance-3 CSS code is the 7-qubit code, a $[[7, 1, 3]]$ QECC created from the classical Hamming code (consisting of all sums of classical strings 1111000, 1100110, 1010101, and 1111111). The encoded $|0\rangle$ for this code consists of the superposition of all even-weight classical code words and the encoded $|1\rangle$ is the superposition of all odd-weight classical code words. The 7-qubit code is much studied because its properties make it particularly well suited to fault-tolerant quantum computation.

Fault Tolerance

Given a QECC, we can attempt to supplement it with protocols for performing fault-tolerant operations. The basic design principle of a fault-tolerant protocol is that an error in a single location (either a faulty gate or noise on a quiescent qubit) should not be able to alter more than a single qubit in each block of the QECC. If this condition is satisfied, $t + 1$ separate single-qubit or single-gate failures are required for a distance $2t + 1$ code to fail.

Particular caution is necessary, as computational gates can cause errors to propagate from their original location onto qubits that were previously correct. In general, a gate coupling pairs of qubits allows errors to spread in both directions across the coupling.

The solution is to use transversal gates whenever possible (Shor 1996). A transversal operation is one in which the $i$th qubit in each block of a QECC interacts only with the $i$th qubit of other blocks of the code or of special ancilla states. An operation consisting only of single-qubit gates is automatically transversal. A transversal operation has the virtue that an error occurring on the third qubit in a block, say, can only ever propagate to the third qubit of other blocks of the code, no matter what other sequence of gates we perform before a complete error-correction procedure.

In the case of certain codes, such as the 7-qubit code, a number of different gates can be performed transversally. Unfortunately, it does not appear to be possible to perform universal quantum computations using just transversal gates. We therefore have to resort to more complicated techniques. First we create special encoded ancilla states in a non-fault-tolerant way, but perform some sort of check on them (in addition to error correction) to make sure they are not too far off from the goal. Then we interact the ancilla with the encoded data qubits using gates from our stock of transversal gates and perform a fault-tolerant measurement. Then we complete the operation with a further transversal gate which depends on the outcome of the measurement.

Fault-Tolerant Gates

We will focus on stabilizer codes. Universal fault tolerance is known to be possible for any stabilizer code, but in most cases the more complicated type of construction is needed for all but a few gates. The Pauli group $\mathcal{P}_k$, however, can be performed transversally on any stabilizer code. Indeed, the set $S^\perp \setminus S$ of undetectable errors is a boon in this case, as it allows us to perform these gates. In particular, each coset in $S^\perp / S$ corresponds to a different logical Pauli operator (with $S$ itself corresponding to the identity). On a stabilizer code, therefore, logical Pauli operations can be performed via a transversal Pauli operation on the physical qubits.

Stabilizer codes have a special relationship to a finite subgroup $\mathcal{C}_n$ of the unitary group $U(2^n)$ frequently called the "Clifford group." The Clifford group on $n$ qubits is defined as the set of unitary operations which conjugate the Pauli group $\mathcal{P}_n$ into itself; $\mathcal{C}_n$ can be generated by the Hadamard transform, the controlled-NOT (CNOT), and the single-qubit $\pi/4$ phase rotation $\mathrm{diag}(1, i)$. The set of stabilizer codes is exactly the set of codes which can be created by a Clifford group encoder circuit using $|0\rangle$ ancilla states.

Some stabilizer codes have interesting symmetries under the action of certain Clifford group elements, and these symmetries result in transversal gate operations. A particularly useful fact is that a transversal CNOT gate (i.e., CNOT acting between the $i$th qubit of one block of the QECC and the $i$th qubit of a second block for all $i$) acts as a logical CNOT gate on the encoded qubits for any CSS code. Furthermore, for the 7-qubit code, transversal Hadamard performs a logical Hadamard, and the transversal $\pi/4$ rotation performs a logical $-\pi/4$ rotation. Thus, for the 7-qubit code, the full logical Clifford group is accessible via transversal operations.

Unfortunately, the Clifford group by itself does not have much computational power: it can be efficiently simulated on a classical computer. We need to add some additional gate outside the Clifford group to allow universal quantum computation; a single gate will suffice, such as the single-qubit $\pi/8$ phase rotation $\mathrm{diag}(1, \exp(i\pi/4))$. Note that this gives us a finite generating set of gates. However, by taking appropriate products, we get an infinite set of gates, one that is dense in the unitary group $U(2^n)$, allowing universal quantum computation.
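The defining property of the Clifford group quoted above (conjugation maps Pauli operators to Pauli operators) can be tested directly on single-qubit gates. The sketch below is illustrative only: the helper names are hypothetical, and it uses plain nested lists so that nothing beyond the Python standard library is assumed. It checks that the Hadamard and the $\pi/4$ phase rotation diag(1, i) send X, Y, Z to Paulis (up to a phase of ±1 or ±i), while the $\pi/8$ rotation does not.

```python
import cmath

# Single-qubit Pauli matrices as nested lists.
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

s = 2 ** -0.5
H = [[s, s], [s, -s]]                              # Hadamard transform
P = [[1, 0], [0, 1j]]                              # pi/4 phase rotation diag(1, i)
T = [[1, 0], [0, cmath.exp(1j * cmath.pi / 4)]]    # pi/8 phase rotation

def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def conjugate(u, m):
    """Return u m u^dagger."""
    return matmul(matmul(u, m), dagger(u))

def as_pauli(m, tol=1e-9):
    """Return (phase, label) if m equals phase * Pauli, else None."""
    for label, pauli in PAULIS.items():
        for phase in (1, -1, 1j, -1j):
            if all(abs(m[r][c] - phase * pauli[r][c]) < tol
                   for r in range(2) for c in range(2)):
                return phase, label
    return None

def maps_paulis_to_paulis(u):
    """True if conjugation by u sends each of X, Y, Z to a Pauli (up to phase)."""
    return all(as_pauli(conjugate(u, PAULIS[k])) is not None for k in "XYZ")

if __name__ == "__main__":
    for name, gate in (("H", H), ("P", P), ("T", T)):
        print(name, "preserves the Pauli group:", maps_paulis_to_paulis(gate))
```

The failure of the check for the $\pi/8$ rotation is consistent with the statement that this gate lies outside the Clifford group and must be supplied by other means.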
The following circuit performs a $\pi/8$ rotation, given an ancilla state $|\psi_{\pi/8}\rangle = |0\rangle + \exp(i\pi/4)|1\rangle$:

[Circuit diagram not reproduced; its recoverable labels are the ancilla input $|\psi_{\pi/8}\rangle$ and the conditional gate $PX$.]

Here $P$ is the $\pi/4$ phase rotation $\mathrm{diag}(1, i)$, and $X$ is the bit flip. The product is in the Clifford group, and is only performed if the measurement outcome is 1. Therefore, given the ability to perform fault-tolerant Clifford group operations, fault-tolerant measurements, and to prepare the encoded $|\psi_{\pi/8}\rangle$ state, we have universal fault-tolerant quantum computation. A slight generalization of the fault-tolerant measurement procedure below can be used to fault-tolerantly verify the $|\psi_{\pi/8}\rangle$ state, which is a $+1$ eigenstate of $PX$. Using this or another verification procedure, we can check a non-fault-tolerant construction.

Fault-Tolerant Measurement and Error Correction

Since all our gates are unreliable, including those used to correct errors, we will need some sort of fault-tolerant quantum error-correction procedure. A number of different techniques have been developed. All of them share some basic features: they involve creation and verification of specialized ancilla states, and use transversal gates which interact the data block with the ancilla state.

The simplest method, due to Shor, is very general but also requires the most overhead and is frequently the most susceptible to noise. Note that the following procedure can be used to measure (non-fault-tolerantly) the eigenvalue of any (possibly multiqubit) Pauli operator $M$: produce an ancilla qubit in the state $|+\rangle = |0\rangle + |1\rangle$. Perform a controlled-$M$ operation from the ancilla to the state being measured. In the case where $M$ is a multiqubit Pauli operator, this can be broken down into a sequence of controlled-X, controlled-Y, and controlled-Z operations. Then measure the ancilla in the basis of $|+\rangle$ and $|-\rangle = |0\rangle - |1\rangle$. If the state is a $+1$ eigenvector of $M$, the ancilla will be $|+\rangle$, and if the state is a $-1$ eigenvector, the ancilla will be $|-\rangle$.

The advantage of this procedure is that it measures just $M$ and nothing more. The disadvantage is that it is not transversal, and thus not fault tolerant. Instead of the unencoded $|+\rangle$ state, we must use a more complex ancilla state $|00\ldots0\rangle + |11\ldots1\rangle$ known as a "cat" state. The cat state contains as many qubits as the operator $M$ to be measured, and we perform the controlled-X, -Y, or -Z operations transversally from the appropriate qubits of the cat state to the appropriate qubits in the data block. Since, assuming the cat state is correct, all of its qubits are either $|0\rangle$ or $|1\rangle$, the procedure either leaves the data state alone or performs $M$ on it uniformly. A $+1$ eigenstate in the data therefore leaves us with $|00\ldots0\rangle + |11\ldots1\rangle$ in the ancilla and a $-1$ eigenstate leaves us with $|00\ldots0\rangle - |11\ldots1\rangle$. In either case, the final state still tells us nothing about the data beyond the eigenvalue of $M$. If we perform a Hadamard transform and then measure each qubit in the ancilla, we get either a random even-weight string (for eigenvalue $+1$) or an odd-weight string (for eigenvalue $-1$).

The procedure is transversal, so an error on a single qubit in the initial cat state or in a single gate during the interaction will only produce one error in the data. However, the initial construction of the cat state is not fault tolerant, so a single-gate error then could eventually produce two errors in the data block. Therefore, we must be careful and use some sort of technique to verify the cat state, for instance, by checking if random pairs of qubits are the same. Also, note that a single phase error in the cat state will cause the final measurement outcome to be wrong (even and odd switch places), so we should repeat the measurement procedure multiple times for greater reliability.

We can then make a full fault-tolerant error-correction procedure by performing the above measurement technique for each generator of the stabilizer. Each measurement gives us one bit of the error syndrome, which we then decipher classically to determine the actual error.

More sophisticated techniques for fault-tolerant error correction involve less interaction with the data but at the cost of more complicated ancilla states. A procedure due to Steane uses (for CSS codes) one ancilla in a logical $|0\rangle$ state of the same code and one ancilla in a logical $|0\rangle + |1\rangle$ state. A procedure due to Knill (for any stabilizer code) teleports the data qubit through an ancilla consisting of two blocks of the QECC containing an encoded Bell state $|00\rangle + |11\rangle$. Because the ancillas in Steane and Knill error correction are more complicated than the cat state, it is especially important to verify the ancillas before using them.

The Threshold for Fault Tolerance

In an unencoded protocol, even one error can destroy the computation, but a fully fault-tolerant protocol will give the right answer unless multiple
errors occur before they can be corrected. On the other hand, the fault-tolerant protocol is larger, requiring more qubits and more time to do each operation, and therefore providing more opportunities for errors. If errors occur on the physical qubits independently at random with probability $p$ per gate or time step, the fault-tolerant protocol has probability of logical error for a single logical gate or time step at most $Cp^2$, where $C$ is a constant that depends on the design of the fault-tolerant circuitry (assume the QECC has distance 3, as for the 7-qubit code). When $p < p_t = 1/C$, the fault tolerance helps, decreasing the logical error rate. $p_t$ is the "threshold" for fault-tolerant quantum computation. If the error rate is higher than the threshold, the extra overhead means that errors will occur faster than they can be reliably corrected, and we are better off with an unencoded system.

To further lower the logical error rate, we turn to a family of codes known as "concatenated codes" (Aharonov and Ben-Or 1999, Kitaev 1997, Knill et al. 1998). Given a code word of a particular $[[n, 1]]$ QECC, we can take each physical qubit and again encode it using the same code, producing an $[[n^2, 1]]$ QECC. We could repeat this procedure to get an $n^3$-qubit code, and so forth. The fault-tolerant procedures concatenate as well, and after $L$ levels of concatenation, the effective logical error rate is $p_t (p/p_t)^{2^L}$ (for a base code correcting 1 error). Therefore, if $p$ is below the threshold $p_t$, we can achieve an arbitrarily good error rate $\epsilon$ per logical gate or time step using only $\mathrm{poly}(\log(1/\epsilon))$ resources, which is excellent theoretical scaling.

Unfortunately, the practical requirements for this result are not nearly so good. The best rigorous proofs of the threshold to date show that the threshold is at least $2 \times 10^{-5}$ (meaning one error per 50,000 operations). Optimized simulations of fault-tolerant protocols suggest that the true threshold may be as high as 5%, but to tolerate this much error, existing protocols require enormous overhead, perhaps increasing the number of gates and qubits by a factor of a million or more for typical computations. For lower physical error rates, overhead requirements are more modest, particularly if we only attempt to optimize for calculations of a given size, but are still larger than one would like.

Furthermore, these calculations make a number of assumptions about the physical properties of the computer. The errors are assumed to be independent and uncorrelated between qubits except when a gate connects them. It is assumed that measurements and classical computations can be performed quickly and reliably, and that quantum gates can be performed between arbitrary pairs of qubits in the computer, irrespective of their physical proximity. Of these, only the assumption of independent errors is at all necessary, and that can be considerably relaxed to allow short-range correlations and certain kinds of non-Markovian environments. However, the effects of relaxing these assumptions on the threshold value and overhead requirements have not been well studied.

Further Reading

Aharonov D and Ben-Or M (1999) Fault-tolerant quantum computation with constant error rate, quant-ph/9906129.
Bennett C, DiVincenzo D, Smolin J, and Wootters W (1996) Mixed state entanglement and quantum error correction. Physical Review A 54: 3824–3851 (quant-ph/9604024).
Calderbank AR and Shor PW (1996) Good quantum error-correcting codes exist. Physical Review A 54: 1098–1105 (quant-ph/9512032).
Calderbank AR, Rains EM, Shor PW, and Sloane NJA (1998) Quantum error correction via codes over GF(4). IEEE Transactions on Information Theory 44: 1369–1387 (quant-ph/9605005).
Gottesman D (1996) Class of quantum error-correcting codes saturating the quantum Hamming bound. Physical Review A 54: 1862–1868 (quant-ph/9604038).
Kitaev AY (1997) Quantum error correction with imperfect gates. In: Hirota O, Holeva AS, and Caves CM (eds.) Quantum Communication, Computing, and Measurement (Proc. 3rd Int. Conf. of Quantum Communication and Measurement), pp. 181–188. New York: Plenum.
Knill E and Laflamme R (1997) A theory of quantum error-correcting codes. Physical Review A 55: 900–911 (quant-ph/9604034).
Knill E, Laflamme R, and Zurek WH (1998) Resilient quantum computation. Science 279: 342–345.
Shor PW (1996) Fault-tolerant quantum computation. In: Proc. 35th Ann. Symp. on Fundamentals of Computer Science, pp. 56–65. (quant-ph/9605011). Los Alamitos: IEEE Press.
Steane AM (1996) Multiple particle interference and quantum error correction. Proceedings of the Royal Society of London A 452: 2551–2577 (quant-ph/9601029).
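As a numerical footnote to the threshold discussion, the concatenation formula $p_t (p/p_t)^{2^L}$ can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the threshold and physical error rate used in the example are assumed round numbers chosen to make the arithmetic easy to follow, not values taken from the text. Block size is counted for a 7-qubit base code.

```python
def logical_error_rate(p, p_t, levels):
    """Effective logical error rate p_t * (p / p_t) ** (2 ** levels)."""
    return p_t * (p / p_t) ** (2 ** levels)

def levels_needed(p, p_t, target):
    """Smallest number of concatenation levels with logical error rate <= target."""
    if not 0 < p < p_t:
        raise ValueError("concatenation only helps below threshold (0 < p < p_t)")
    level = 0
    while logical_error_rate(p, p_t, level) > target:
        level += 1
    return level

def physical_qubits(n, levels):
    """Qubits per logical qubit after concatenating an [[n, 1]] code `levels` times."""
    return n ** levels

if __name__ == "__main__":
    p_t = 1e-4      # assumed threshold, for illustration only
    p = 1e-5        # assumed physical error rate, a factor of 10 below threshold
    target = 1e-15  # assumed target logical error rate
    L = levels_needed(p, p_t, target)
    print("levels:", L)
    print("qubits per logical qubit:", physical_qubits(7, L))
    print("logical error rate:", logical_error_rate(p, p_t, L))
```

Because the exponent doubles with each level while the block size only multiplies by $n$, a handful of levels suffices in this example; the same arithmetic shows why overhead grows quickly as $p$ approaches $p_t$.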
202 Quantum Field Theory in Curved Spacetime
automorphisms, $\alpha(t)$, and, given any stationary state (i.e., one which satisfies $\omega \circ \alpha(t) = \omega\ \forall t \in \mathbb{R}$), these will be implemented by a one-parameter group of unitaries, $U(t)$, on its GNS Hilbert space satisfying $U(t)\Omega = \Omega$. If $U(t)$ is strongly continuous so that it takes the form $e^{-iHt}$ and if the Hamiltonian, $H$, is positive, then $\omega$ is said to be a "ground state." Typically one expects ground states to exist and often be unique.

Another important class of stationary states for the algebra of a stationary spacetime is the class of KMS states, $\omega^\beta$, at inverse temperature $\beta$; these have the physical interpretation of thermal equilibrium states. In the GNS representation of one of these, the automorphisms are also implemented by a strongly continuous unitary group, $e^{-iHt}$, which preserves $\Omega$ but (in place of $H$ positive) there is a complex conjugation, $J$, on $H_\omega$ such that

$$e^{-\beta H/2}\rho_\omega(A)\Omega = J\rho_\omega(A^*)\Omega \qquad [1]$$

for all $A \in \mathcal{A}$. An attractive feature of the subject is that its main qualitative features are already present for linear field theories and, unusually in comparison with other questions in QFT, these are susceptible of a straightforward explicit and rigorous mathematical formulation. In fact, as our principal example, we give, in the next section a construction for the field algebra for the quantized real linear Klein–Gordon equation

$$(\Box_g - m^2 - V)\phi = 0 \qquad [2]$$

of mass $m$ on a globally hyperbolic spacetime $(M, g)$. Here, $\Box_g$ denotes the Laplace–Beltrami operator $g^{ab}\nabla_a\partial_b$ ($= |\det(g)|^{-1/2}\partial_a(|\det(g)|^{1/2} g^{ab}\partial_b)$). We include a scalar external background classical field, $V$, in addition to the external gravitational field represented by $g$. In case $m$ is zero, taking $V$ to equal $R/6$, where $R$ denotes the Riemann scalar, makes the equation conformally invariant.

The main new feature of QFT in curved spacetime (present already for linear field theories) is that, in a general (neither flat nor stationary) spacetime there will not be any single preferred state but rather a family of preferred states, members of which are best regarded as on an equal footing with one another. It is this feature which makes the above algebraic framework particularly suitable, indeed essential, to a clear formulation of the subject. Conceptually, it is this feature which takes the most getting used to. In particular, one must realize that, as we shall explain later, the interpretation of a state as having a particular "particle content" is in general problematic because it can only be relative to a particular choice of "vacuum" state and, depending on the spacetime of interest, there may be one state or several states or, frequently, no states at all which deserve the name "vacuum" and even when there are states which deserve this name, they will often only be defined in some approximate or asymptotic or transient sense or only on some subregion of the spacetime.

Concomitantly, one does not expect global observables such as the "particle number" or the quantum Hamiltonian of flat-spacetime free-field theory to generalize to a curved spacetime context, and for this reason local observables play a central role in the theory. The quantized stress–energy tensor is a particularly natural and important such local observable and the theory of this is central to the whole subject. A brief introduction to it is given in a later section.

This is followed by a further section on the Hawking and Unruh effects and then a brief section on the problems of extending the theory beyond the "default" setting, to nonglobally hyperbolic spacetimes. Finally, we briefly mention a number of other interesting and active areas of the subject as well as issuing a few warnings to be borne in mind when reading the literature.

Construction of $*$-Algebra(s) for a Real Linear Scalar Field on Globally Hyperbolic Spacetimes and Some General Theorems

On a globally hyperbolic spacetime, the classical equation [2] admits well-defined advanced and retarded Green functions (strictly bidistributions) $\Delta^A$ and $\Delta^R$ and the standard covariant quantum free real (or "Hermitian") scalar field commutation relations familiar from Minkowski spacetime free-field theory naturally generalize to the (heuristic) equation

$$[\hat\phi(x), \hat\phi(y)] = i\Delta(x, y)I$$

where $\Delta$ is the Lichnérowicz commutator function $\Delta = \Delta^A - \Delta^R$. Here, the "^" on the quantum field $\hat\phi$ serves to distinguish it from a classical solution $\phi$. In mathematical work, one does not assign a meaning to the field at a point itself, but rather aims to assign meaning to smeared fields $\hat\phi(F)$ for all real-valued test functions $F \in C_0^\infty(M)$ which are then to be interpreted as standing for $\int_M \hat\phi(x)F(x)|\det(g)|^{1/2}\,d^4x$. In fact, it is straightforward to define a minimal field algebra (see below) $\mathcal{A}_{\min}$ generated by such $\hat\phi(F)$ which satisfy the suitably smeared version

$$[\hat\phi(F), \hat\phi(G)] = i\Delta(F, G)I$$
C4′. Wave front set (or microlocal) spectrum condition

$$\mathrm{WF}(G + i\Delta) = \{(x_1, p_1; x_2, p_2) \in T^*(M \times M) \setminus 0 \mid x_1 \text{ and } x_2 \text{ lie on a single null geodesic, } p_1 \text{ is tangent to that null geodesic and future pointing, and } p_2 \text{ when parallel transported along that null geodesic from } x_2 \text{ to } x_1 \text{ equals } -p_1\}$$

For the gist of what this means, it suffices to know that to say that an element $(x, p)$ of the cotangent bundle of a manifold (excluding the zero section 0) is in the wave front set, WF, of a given distribution on that manifold may be expressed informally by saying that that distribution is singular at the point $x$ in the direction $p$. (And here the notion is applied to $G + i\Delta$, thought of as a distribution on the manifold $M \times M$.)

We remark that generically (and, e.g., always if the spatial sections are compact and $m^2 + V(x)$ is everywhere positive) the Weyl algebra for eqn [2] on a given stationary spacetime will have a unique ground state and unique KMS states at each temperature and these will be quasifree and Hadamard.

Quasifree states are important also because of a theorem of R Verch (1994, in verification of another conjecture of Kay) that (in the Weyl algebra framework) on the algebra of any bounded open region, the folia of the quasifree Hadamard states coincide. With this result one can extend the notion of physical admissibility to not-necessarily-quasifree states by demanding that, to be admissible, a state belong to the resulting common folium when restricted to the algebra of each bounded open region; equivalently, that it be a locally normal state on the resulting natural extension of the net of local Weyl algebras to a net of local $W^*$-algebras.

Particle Creation and the Limitations of the Particle Concept

Global hyperbolicity also entails that the Cauchy problem is well posed for the classical field equation [2] in the sense that for every Cauchy surface, $C$, and every pair $(f, p)$ of Cauchy data in $C_0^\infty(C)$, there exists a unique solution $\phi$ in $C_0^\infty(M)$ such that $f = \phi|_C$ and $p = |\det(g)|^{1/2} g^{ab}\partial_b\phi|_C$. Moreover, $\phi$ has compact support on all other Cauchy surfaces. Given a global time coordinate $t$, increasing towards the future, foliating $M$ into a family of constant-$t$ Cauchy surfaces, $C_t$, and given a choice of global timelike vector field $\tau^a$ (e.g., $\tau^a = g^{ab}\partial_b t$) enabling one to identify all the $C_t$, say with $C_0$, by identifying points cut by the same integral curve of $\tau^a$, a single such classical solution $\phi$ may be pictured as a family $\{(f_t, p_t): t \in \mathbb{R}\}$ of time-evolving Cauchy data on $C_0$. Moreover, since [2] implies, for each pair of classical solutions, $\phi_1$, $\phi_2$, the conservation (i.e., $\partial_a j^a = 0$) of the current $j^a = |\det(g)|^{1/2} g^{ab}(\phi_1\partial_b\phi_2 - \phi_2\partial_b\phi_1)$, the symplectic form (on $C_0^\infty(C) \times C_0^\infty(C)$)

$$\sigma((f_t^1, p_t^1); (f_t^2, p_t^2)) = \int_{C_0} (f_t^1 p_t^2 - p_t^1 f_t^2)\,d^3x$$

will be conserved in time.

Corresponding to this picture of classical dynamics, one expects there to be a description of quantum dynamics in terms of a family of sharp-time quantum fields $(\varphi_t, \pi_t)$ on $C_0$, satisfying heuristic canonical commutation relations

$$[\varphi_t(x), \varphi_t(y)] = 0$$
$$[\pi_t(x), \pi_t(y)] = 0$$
$$[\varphi_t(x), \pi_t(y)] = i\delta^3(x, y)I$$

and evolving in time according to the same dynamics as the Cauchy data of a classical solution. (Both these expectations are correct because the field equation is linear.) An elegant way to make rigorous mathematical sense of these expectations is in terms of a $*$-algebra with identity generated by Hermitian objects "$\sigma((\varphi_0, \pi_0); (f, p))$" ("symplectically smeared sharp-time fields at $t = 0$") satisfying linearity in $f$ and $p$ together with the commutation relations

$$[\sigma((\varphi_0, \pi_0); (f^1, p^1)), \sigma((\varphi_0, \pi_0); (f^2, p^2))] = i\sigma((f^1, p^1); (f^2, p^2))I$$

and to define (symplectically smeared) time-$t$ sharp-time fields by demanding

$$\sigma((\varphi_t, \pi_t); (f_t, p_t)) = \sigma((\varphi_0, \pi_0); (f_0, p_0))$$

where $(f_t, p_t)$ is the classical time-evolute of $(f_0, p_0)$. This $*$-algebra of sharp-time fields may be identified with the (minimal) field $*$-algebra of the previous section, the $\hat\phi(F)$ of the previous section being identified with $\sigma((\varphi_0, \pi_0); (f, p))$, where $(f, p)$ are the Cauchy data at $t = 0$ of $\Delta F$. (This identification is of course many–one since $\hat\phi(F) = 0$ whenever $F$ arises as $(\Box_g - m^2 - V)G$ for some test function $G \in C_0^\infty(M)$.)

Specializing momentarily to the case of the free scalar field $(\Box - m^2)\phi = 0$ ($m \neq 0$) in Minkowski space with a flat $t = 0$ Cauchy surface, the "symplectically smeared" two-point function of the usual ground state ("Minkowski vacuum state"), $\omega_0$, is given, in this formalism, by

$$\omega_0(\sigma((\varphi, \pi); (f^1, p^1))\,\sigma((\varphi, \pi); (f^2, p^2))) = \tfrac12\left(\langle f^1 | \mu f^2\rangle + \langle p^1 | \mu^{-1} p^2\rangle + i\sigma((f^1, p^1); (f^2, p^2))\right) \qquad [4]$$
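To make the earlier claim that the symplectic form is conserved in time more concrete, here is a short worked check in a special case. It is a sketch under stated assumptions, not the general argument of the text: it takes flat Minkowski space with $V = 0$, and it assumes (as a convention not fixed above) that the Cauchy data evolve by $\dot f_t = p_t$ and $\dot p_t = (\nabla^2 - m^2) f_t$. With the opposite sign convention for $p$ the same cancellation occurs up to an overall sign.

```latex
\begin{align*}
\frac{d}{dt}\,\sigma\bigl((f_t^1,p_t^1);(f_t^2,p_t^2)\bigr)
 &= \int_{C_0}\bigl(\dot f_t^1 p_t^2 + f_t^1 \dot p_t^2
      - \dot p_t^1 f_t^2 - p_t^1 \dot f_t^2\bigr)\,d^3x \\
 &= \int_{C_0}\bigl(p_t^1 p_t^2 + f_t^1(\nabla^2 - m^2) f_t^2
      - f_t^2(\nabla^2 - m^2) f_t^1 - p_t^1 p_t^2\bigr)\,d^3x \\
 &= \int_{C_0}\bigl(f_t^1 \nabla^2 f_t^2 - f_t^2 \nabla^2 f_t^1\bigr)\,d^3x \\
 &= \int_{C_0}\nabla\cdot\bigl(f_t^1 \nabla f_t^2 - f_t^2 \nabla f_t^1\bigr)\,d^3x = 0 .
\end{align*}
```

The last integral vanishes because the Cauchy data have compact support, so the divergence integrates to a boundary term at infinity that is zero. The mass terms cancel identically, which is why the same computation goes through when $m^2$ is replaced by $m^2 + V(x)$.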
the (partly positive-frequency, partly negative-frequency) Minkowski-space Klein–Gordon solution

$$\phi_{\mathrm{in}}(t, x) = (2\mu)^{-1/2}\exp(-i\mu t)\,\alpha a(x) + (2\mu)^{-1/2}\exp(i\mu t)\,\overline{\beta a}(x)$$

and this could be taken to be the defining equation for the operators $\alpha$ and $\beta$.

It is then known (by a 1962 theorem of Shale) that the automorphism [5] (strictly, its Weyl algebra counterpart) will be unitarily implemented if and only if $\beta$ is a Hilbert–Schmidt operator on $H$. Wald (1979, in case $m \ge 0$) and Dimock (1979, in case $m \neq 0$) have verified that this condition is satisfied in the case of our bump-of-curvature situation. In that case, if we denote the unitary implementor by $U$, we have the following results:

R1. The expectation value $\langle U\Omega^F | N(a) U\Omega^F\rangle_{\mathcal{F}(H)}$ of the number operator, $N(a) = \hat a^\dagger(a)\hat a(a)$, where $a$ is a normalized element of $H$, is equal to $\langle \beta a | \beta a\rangle_H$.
R2. First note that there exists an orthonormal basis of vectors, $e_i$, ($i = 1 \ldots \infty$), in $H$ such that the (Hilbert–Schmidt) operator $\bar\beta\alpha^{-1}$ has the canonical form $\sum_i \lambda_i \langle Ce_i | \cdot\rangle | e_i\rangle$. We then have (up to an undetermined phase)

$$U\Omega^F = N \exp\left(-\frac12 \sum_i \lambda_i\, \hat a^\dagger(e_i)\hat a^\dagger(e_i)\right)\Omega^F$$

where the normalization constant $N$ is chosen so that $\|U\Omega^F\| = 1$. This formula makes manifest that the particles are created in pairs.

We remark that, identifying elements, $a$, of $H$ with positive-frequency solutions (below, we shall call them "modes") as explained above, result (R1) may alternatively be expressed by saying that the expectation value, $\omega_{\mathrm{in}}(N(a))$, in the in-vacuum state of the occupation number, $N(a)$, of a normalized mode, $a$, to the future of the bump, is given by $\langle \beta a | \beta a\rangle_H$.

This formalism and the results, (R1) and (R2) above, will generalize (at least heuristically, and sometimes rigorously; see especially the rigorous scattering-theoretic work in the 1980s by Dimock and Kay and more recently by A Bachelot and others) to more realistic spacetimes which are only asymptotically flat or asymptotically stationary. In favorable cases, one will still have notions of classical solutions which are positive frequency asymptotically towards the future/past, and, in consequence, one will have well-defined asymptotic notions of "vacuum" and "particles." Also, in, for example, cosmological, models where the background spacetime is slowly varying in time, one can define approximate adiabatic notions of classical positive-frequency solutions, and hence also of quantum "vacuum" and "particles" at each finite value of the cosmological time. But, at times where the gravitational field is rapidly varying, one does not expect there to be any sensible notion of "particles." And, in a rapidly time-varying background gravitational field which never settles down, one does not expect there to be any sensible particle interpretation of the theory at all.

Figure 2 Plots of $\varpi$ against $t$ for the two potentials $V$ (continuous line) and $\hat V$ (continuous line up to $s$ and then dashed line) which play a role in our critique of "instantaneous diagonalization."

To understand these statements, it suffices to consider the $(1 + 0)$-dimensional Klein–Gordon equation with an external potential $V$:

$$\left(-\frac{d^2}{dt^2} - m^2 - V(t)\right)\phi = 0$$

which is of course a system of one degree of freedom, mathematically equivalent to the harmonic oscillator with a time-varying angular frequency $\varpi(t) = (m^2 + V(t))^{1/2}$. One could of course express its quantum theory in terms of a time-evolving Schrödinger wave function $\Psi(\varphi, t)$ and attempt to give this a particle interpretation at each time, $s$, by expanding $\Psi(\varphi, s)$ in terms of the harmonic oscillator wave functions for a harmonic oscillator with some particular choice of angular frequency. But the problem is, as is easy to convince oneself, that there is no such good choice. For example, one might think that a good choice would be to take, at time $s$, the set of harmonic oscillator wave functions with angular frequency $\varpi(s)$. (This is sometimes known as the method of "instantaneous diagonalization of the Hamiltonian.") But suppose we were to apply this prescription to the case of a smooth $V(\cdot)$ which is constant in time until time 0 and assume the initial state is the usual vacuum state. Then at some positive time $s$, the number of particles predicted to be present is the same as the number of particles predicted to be present on the same prescription at all times after $s$ for a $\hat V(\cdot)$ which is equal to $V(\cdot)$ up to time $s$ and then takes the constant value $V(s)$ for all later times (see Figure 2). But $\hat V(\cdot)$ will generically have a sharp corner in its graph (i.e., a
discontinuity in its time derivative) at time s, and one would expect a large part of the particle production in the latter situation to be accounted for by the presence of this sharp corner – and therefore a large part of the predicted particle production in the case of V() to be spurious.

Back in 1 + 3 dimensions, even where a good notion of particles is possible, it depends on the choice of time evolution, as is dramatically illustrated by the Unruh effect discussed in the relevant section.

Theory of the Stress–Energy Tensor

To orient ideas, consider first the free (minimally coupled) scalar field, $(\Box - m^2)\phi = 0$, in Minkowski space. If one quantizes this system in the usual Minkowski-vacuum representation, then the expectation value of the renormalized stress–energy tensor (which in this case is the same thing as the normal ordered stress–energy tensor) in a vector state $\Psi$ in the Fock space will be given by the formal point-splitting expression

$$\langle\Psi|T_{ab}(x)\Psi\rangle = \lim_{(x_1,x_2)\to(x,x)} \left(\partial^1_a\partial^2_b - \tfrac{1}{2}\eta_{ab}\left(\eta^{cd}\partial^1_c\partial^2_d + m^2\right)\right)\left(\langle\Psi|\rho_0(\phi(x_1)\phi(x_2))\Psi\rangle - \langle\Omega^F|\rho_0(\phi(x_1)\phi(x_2))\Omega^F\rangle\right) \qquad [6]$$

where $\eta_{ab}$ is the usual Minkowski metric. A sufficient condition for the limit here to be finite and well defined would, for example, be for $\Psi$ to consist of a (normalized) finite superposition of n-particle vectors of form $\hat a^\dagger(a_1), \ldots, \hat a^\dagger(a_n)\Omega^F$ where the smearing functions $a_1, \ldots, a_n$ are all $C^\infty$ elements of $H$ (i.e., of $L^2_{\mathbb C}(\mathbb R^3)$). The reason this works is that the two-point function in such states shares the same short-distance singularity as the Minkowski-vacuum two-point function. For exactly the same reason, one obtains a well-defined finite limit if one defines the expectation value of the stress–energy tensor in any physically admissible quasifree state by the expression

$$\omega(T_{ab}(x)) = \lim_{(x_1,x_2)\to(x,x)} \left(\partial^1_a\partial^2_b - \tfrac{1}{2}\eta_{ab}\left(\eta^{cd}\partial^1_c\partial^2_d + m^2\right)\right)\left(\omega(\phi(x_1)\phi(x_2)) - \omega_0(\phi(x_1)\phi(x_2))\right) \qquad [7]$$

This latter point-splitting formula generalizes to a definition for the expectation value of the renormalized stress–energy tensor for an arbitrary physically admissible quasifree state (or indeed for an arbitrary state whose two-point function has Hadamard form – i.e., whose anticommutator function satisfies condition (C4)) on the minimal field algebra and to other linear field theories (including the stress tensor for a conformally coupled linear scalar field) on a general globally hyperbolic spacetime (and the result obtained agrees with that obtained by other methods, including dimensional regularization and zeta-function regularization). However, the generalization to a curved spacetime involves a number of important new features which we now briefly list (see Wald (1978) for details).

First, the subtraction term which replaces $\omega_0(\phi(x_1)\phi(x_2))$ is, in general, not the expectation value of $\phi(x_1)\phi(x_2)$ in any particular state, but rather a particular locally constructed Hadamard two-point function whose physical interpretation is more subtle; the renormalization is thus in general not to be regarded as a normal ordering. Second, the immediate result of the resulting limiting process will not be covariantly conserved and, in order to obtain a covariantly conserved quantity, one needs to add a particular local geometrical correction term. The upshot of this is that the resulting expected stress–energy tensor is covariantly conserved but possesses a (state-independent) anomalous trace. In particular, for a massless conformally coupled linear scalar field, one has (for all physically admissible quasifree states, $\omega$) the trace anomaly formula

$$\omega(T^a{}_a(x)) = (2880\pi^2)^{-1}\left(C_{abcd}C^{abcd} + R_{ab}R^{ab} - \tfrac{1}{3}R^2\right)$$

plus an arbitrary multiple of $\Box R$. In fact, in general, the thus-defined renormalized stress–energy tensor operator (see below) is only defined up to a finite renormalization ambiguity which consists of the addition of arbitrary multiples of the functional derivatives with respect to $g_{ab}$ of the quantities

$$I_n = \int_M F_n(x)\,|\det(g)|^{1/2}\,d^4x$$

where n ranges from 1 to 4 with $F_1 = 1$, $F_2 = R$, $F_3 = R^2$, and $F_4 = R_{ab}R^{ab}$. In the Minkowski-space case, only the first of these ambiguities arises and it is implicitly resolved in the formulas [6], [7] inasmuch as these effectively incorporate the renormalization condition that $\omega_0(T_{ab}) = 0$. (For the same reason, the locally flat example we give below has no ambiguity.)

One expects, in both flat and curved cases, that, for test functions, $F \in C_0^\infty(M)$, there will exist operators $T_{ab}(F)$ which are affiliated to the net of
local $W^*$-algebras referred to earlier and that it is meaningful to write

$$\int_M \omega(T_{ab}(x))\,F(x)\,|\det(g)|^{1/2}\,d^4x = \omega(T_{ab}(F))$$

provided that, by $\omega$ on the right-hand side, we understand the extension of $\omega$ from the Weyl algebra to this net. ($T_{ab}(F)$ is however not expected to belong to the minimal algebra or be affiliated to the Weyl algebra.)

An interesting simple example of a renormalized stress–energy tensor calculation is the so-called Casimir effect calculation for a linear scalar field on a (for further simplicity, (1 + 1)-dimensional) timelike cylinder spacetime of radius R (see Figure 3). This spacetime is globally hyperbolic and stationary and, while locally flat, globally distinct from Minkowski space. As a result, while – provided the regions O are sufficiently small (such as the diamond region in Figure 3) – elements A(O) of the minimal net of local algebras on this spacetime will be identifiable, in an obvious way, with elements of the minimal net of local algebras on Minkowski space, the stationary ground state $\omega_{\rm cylinder}$ will, when restricted to such thus-identified regions, be distinct from the Minkowski vacuum state $\omega_0$. The resulting renormalized stress–energy tensor (as first pointed out in Kay (1979), definable, once the above identification has been made, exactly as in [7]) turns out, in the massless case, to be nonzero and, interestingly, to have a (in the natural coordinates, constant) negative energy-density $T_{00}$. In fact, in this massless case,

$$\omega_{\rm cylinder}(T_{ab}) = -\frac{1}{24\pi R^2}\,\delta_{ab}$$

Hawking and Unruh Effects

The original calculation by Hawking (1975) concerned a model spacetime for a star which collapses to a black hole. For simplicity, we shall only discuss the spherically symmetric case (see Figure 4). Adopting a similar “mode” viewpoint to that mentioned after results (R1) and (R2) discussed earlier, the result of the calculation may be stated as follows: For a real linear scalar field satisfying [2] with m = 0 (and V = 0) on this spacetime, the expectation value $\omega_{\rm in}(N(a_{\varpi,\ell}))$ of the occupation number of a one-particle outgoing mode $a_{\varpi,\ell}$ localized (as far as a normalized mode can be) around $\varpi$ in angular-frequency space and about retarded time v, and with angular momentum “quantum number” $\ell$, in the in-vacuum state (i.e., on the minimal algebra for a real scalar field on this model spacetime) $\omega_{\rm in}$ is, at late retarded times, given by the formula

$$\omega_{\rm in}(N(a_{\varpi,\ell})) = \frac{\Gamma(\varpi,\ell)}{\exp(8\pi M\varpi) - 1}$$

where M is the mass of the black hole and the absorption factor (alternatively known as gray-body factor) $\Gamma(\varpi,\ell)$ is equal to the norm-squared of that part of the one-particle mode $a_{\varpi,\ell}$ which, viewed as a complex positive-frequency classical solution propagating backwards in time from late retarded times, would be absorbed by the black hole. (Note the independence of the right-hand side of this formula from the retarded time, v.) This calculation can be understood as an application of result (R1)

[Figure residue: labels “Singularity” and “Horizon” from a spacetime diagram.]
$$T_{\rm Hawking} = \frac{\kappa}{2\pi}$$

where $\kappa$ is the surface gravity of the black hole. This result suggests that there is something fundamentally “thermal” about quantum fields on black-hole backgrounds and this is confirmed by a number of mathematical results. In particular, the theorems in the two papers Kay and Wald (1991) and Kay (1993), combined together, tell us that there is a unique state on the Weyl algebra for the maximally extended Schwarzschild spacetime (a.k.a. Kruskal–Szekeres spacetime) (see Figure 5) which is invariant under the Schwarzschild isometry group and whose two-point function has Hadamard form. Moreover, they tell us that this state, when restricted to a single wedge (i.e., the exterior Schwarzschild spacetime) is necessarily a KMS state at the Hawking temperature. This unique state is known as the Hartle–Hawking–Israel state. These results in fact apply more generally to a wide class of globally hyperbolic spacetimes with bifurcate Killing horizons including de Sitter space – where the unique state is sometimes called the Euclidean and sometimes the Bunch–Davies vacuum state – as well as to Minkowski space, in which case the unique state is the usual Minkowski vacuum state, the analog of the exterior Schwarzschild wedge is a so-called Rindler wedge, and the relevant isometry group is a one-parameter family of wedge-preserving Lorentz boosts. In the latter situation, the fact that the Minkowski vacuum state is a KMS state (at “temperature” $1/2\pi$) when restricted to a Rindler wedge and regarded with respect to the time evolution consisting of the wedge-preserving one-parameter family of Lorentz boosts is known as the Unruh effect (1975). This latter property of the Minkowski vacuum in fact generalizes to general Wightman QFTs and is in fact an immediate consequence of a combination of the Reeh–Schlieder theorem (applied to a Rindler wedge) and the Bisognano–Wichmann theorem (1975). The latter theorem says that the defining relation [1] of a KMS state holds if, in [1], we identify the operator J with the complex conjugation which implements wedge reflection and H with the self-adjoint generator of the unitary implementor of Lorentz boosts. We remark that the Unruh effect illustrates how the concept of “vacuum” (when meaningful at all) is dependent on the choice of time evolution under consideration. Thus, the usual Minkowski vacuum is a ground state with respect to the usual Minkowski time evolution but not (when restricted to a Rindler wedge) with respect to a one-parameter family of Lorentz boosts; with respect to these, it is, instead, a KMS state.

Nonglobally Hyperbolic Spacetimes and the “Time Machine” Question

Hawking (1992) argued that a spacetime in which a time machine gets manufactured should be modeled (see Figure 6) by a spacetime with an initial globally
distributions violate the “Hadamard” condition (C4) and which therefore do not have a well-defined finite expectation value for the renormalized stress–energy tensor.

See also: AdS/CFT Correspondence; Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Black Hole Mechanics; Bosons and Fermions in External Fields; Integrability and Quantum Field Theory; Quantum Fields with Indefinite Metric: Non-Trivial Models; Quantum Fields with Topological Defects; Quantum Geometry and Its Applications; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Thermal Quantum Field Theory.

Further Reading

Birrell ND and Davies PCW (1982) Quantum Fields in Curved Space. Cambridge: Cambridge University Press.
Brunetti R, Fredenhagen K, and Verch R (2003) The generally covariant locality principle – a new paradigm for local quantum physics. Communications in Mathematical Physics 237: 31–68.
DeWitt BS (1975) Quantum field theory in curved space-time. Physics Reports 19(6): 295–357.
Dimock J (1980) Algebras of local observables on a manifold. Communications in Mathematical Physics 77: 219–228.
Haag R (1996) Local Quantum Physics. Berlin: Springer.
Hartle JB and Hawking SW (1976) Path-integral derivation of black-hole radiance. Physical Review D 13: 2188–2203.
Hawking SW (1975) Particle creation by black holes. Communications in Mathematical Physics 43: 199–220.
Hawking SW (1992) The chronology protection conjecture. Physical Review D 46: 603–611.
Israel W (1976) Thermo-field dynamics of black holes. Physics Letters A 57: 107–110.
Kay BS (1979) Casimir effect in quantum field theory. (Original title: The Casimir effect without magic.) Physical Review D 20: 3052–3062.
Kay BS (1993) Sufficient conditions for quasifree states and an improved uniqueness theorem for quantum fields on space-times with horizons. Journal of Mathematical Physics 34: 4519–4539.
Kay BS (2000) Application of linear hyperbolic PDE to linear quantum fields in curved spacetimes: especially black holes, time machines and a new semi-local vacuum concept. Journées Équations aux Dérivées Partielles, Nantes, 5–9 juin 2000, GDR 1151 (CNRS): IX1–IX19. (Also available at http://www.math.sciences.univ-nantes.fr or as gr-qc/0103056.)
Kay BS, Radzikowski MJ, and Wald RM (1997) Quantum field theory on spacetimes with a compactly generated Cauchy horizon. Communications in Mathematical Physics 183: 533–556.
Kay BS and Wald RM (1991) Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on spacetimes with a bifurcate Killing horizon. Physics Reports 207(2): 49–136.
Misner CW, Thorne KS, and Wheeler JA (1973) Gravitation. San Francisco: W.H. Freeman.
Unruh W (1976) Notes on black hole evaporation. Physical Review D 14: 870–892.
Visser M (2003) The quantum physics of chronology protection. In: Gibbons GW, Shellard EPS, and Rankin SJ (eds.) The Future of Theoretical Physics and Cosmology. Cambridge: Cambridge University Press.
Wald RM (1978) Trace anomaly of a conformally invariant quantum field in a curved spacetime. Physical Review D 17: 1477–1484.
Wald RM (1994) Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics. Chicago: University of Chicago Press.
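As a small numerical check of the two Hawking-effect formulas quoted in the article above, the following Python sketch evaluates $T_{\rm Hawking} = \kappa/2\pi$ and the late-time occupation number $\Gamma/(\exp(8\pi M\varpi) - 1)$ in geometric units ($G = c = \hbar = k_B = 1$). It is illustrative only and not part of the original article. It assumes the standard Schwarzschild surface gravity $\kappa = 1/(4M)$, which the article does not state explicitly; under that assumption $\exp(8\pi M\varpi) = \exp(\varpi/T_{\rm Hawking})$, so the spectrum is Planckian up to the gray-body factor. The function names, the default gray-body factor of 1, and the SI constants are assumptions made for the example.

```python
import math

# SI constants (assumed values for illustration, not taken from the article)
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 kg^-1 s^-2
K_B = 1.380649e-23       # J / K


def surface_gravity(mass):
    """Schwarzschild surface gravity in geometric units (assumption: kappa = 1/(4M))."""
    return 1.0 / (4.0 * mass)


def hawking_temperature(mass):
    """T_Hawking = kappa / (2 pi), in geometric units."""
    return surface_gravity(mass) / (2.0 * math.pi)


def mean_occupation(varpi, mass, gray_body=1.0):
    """Late-time occupation number Gamma / (exp(8 pi M varpi) - 1).

    gray_body stands in for Gamma(varpi, l); 1.0 is a placeholder, not a computed value.
    """
    return gray_body / math.expm1(8.0 * math.pi * mass * varpi)


def hawking_temperature_si(mass_kg):
    """The same temperature in kelvin: hbar c^3 / (8 pi G M k_B)."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)


if __name__ == "__main__":
    M = 1.0
    T = hawking_temperature(M)
    print("T (geometric units, M = 1):", T)
    print("occupation at varpi = T:", mean_occupation(T, M))
    print("T in kelvin for a mass of 1.989e30 kg:", hawking_temperature_si(1.989e30))
```

Using `math.expm1` rather than `exp(x) - 1` keeps the low-frequency end of the spectrum numerically stable, where the occupation number grows like $1/(8\pi M\varpi)$.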
seems to have taken this as a model for photons. Jordan further proposed that electrons should be treated as the quanta of an electron field, but recognized that their fermionic nature would modify the quantization procedure. This generic idea involved what was called “second quantization” – of a field into a particle.

One of the earliest quantization rules was Bohr’s condition relating to the periodic orbits of electrons in atoms, $J = \int p\,dq = nh$. At the hands of Heisenberg and Dirac this became upgraded to the commutation relation

$$[q, p] = i\hbar$$

where the operators p and q are “observables.” In their papers on quantum field theory, Dirac, Jordan and Wigner, and Heisenberg introduced creation and annihilation operators which had the function, as their name implied, of creating and destroying single particles – quanta of the field. These operators obeyed the commutation rules (with $[A, B] = AB - BA$)

$$[b_r, b_s^\dagger] = \delta_{rs}, \qquad [b_r, b_s] = [b_r^\dagger, b_s^\dagger] = 0$$

when the field quanta were bosons, and the anticommutation rules

$$\{b_r, b_s^\dagger\} = \delta_{rs}, \qquad \{b_r, b_s\} = \{b_r^\dagger, b_s^\dagger\} = 0$$

(with $\{A, B\} = AB + BA$) when the field quanta were fermions (e.g., electrons). These steps constitute second quantization, but it may be noted that the creation and annihilation operators are not observables, as p and q are in the Heisenberg commutation relation. In addition, the second quantization conditions do not involve Planck’s constant. “First” and “second” quantization are therefore not so similar as one might like to think.

The question of what exactly is being quantized was in fact the source of some confusion. In his paper of 1927, Dirac’s attention is focussed on electromagnetic radiation, but he nevertheless discusses the difference between “a light-wave and the de Broglie or Schrödinger wave associated with the light-quanta.” As Dirac points out, “their intensities are to be interpreted in different ways. The number of light quanta per unit volume associated with a monochromatic light-wave equals the energy per unit volume of the wave divided by the energy $(2\pi h)\nu$ of a single light quantum. On the other hand a monochromatic de Broglie wave of amplitude a (multiplied into the imaginary exponential factor) must be interpreted as representing $a^2$ light quanta per unit volume for all frequencies.” There are at least two problematic issues here. First, is the Schrödinger wave function to be considered as a “real” field, whose quanta result in “real” particles, or is it a probability field, whose significance lies in Born’s probabilistic interpretation of quantum mechanics? Born wrote in 1926, “[Einstein said that] the waves are present only to show the corpuscular light quanta the way, and he spoke in the sense of a ‘ghost field’. This determines the probability that a light quantum, the bearer of energy and momentum, takes a certain path; however, the field itself has no energy and no momentum.” This is the first problem. The second one concerns the nature of the quantization itself. Is this a quantization of field energy, or a quantization of the field itself, as a substantial entity? If the field is real, the second of these does not imply the first.

Ambiguities surrounding the idea of second quantization survived into the 1960s. Wigner is recorded as saying, in an interview in 1963, “just as we get photons by quantising the electromagnetic fields, so we should be able to get material particles by quantising the Schrödinger field.” And Rosenfeld, also in an interview in 1963, said, “in some sense or other, Jordan himself took the wave function, the probability amplitude, physically more seriously than most people [did].”

It would seem we are justified in concluding that the idea of second quantization contains flaws, but an even clearer indication of the need for rethinking is provided by the story of the Dirac equation. This is a wave equation for the electron, compatible with special relativity, and taking explicit account of its spin being $(1/2)\hbar$. The equation famously had both positive- and negative-energy solutions. This potential disaster was converted by Dirac into a triumph by reinterpreting the (absence of) negative-energy solutions as (positive-energy) antiparticles – positrons, particles with positive charge but the same mass and spin as the electron. Positrons were eventually discovered by Anderson. It was later shown that the existence of antiparticles is a general feature of quantum field theory, not just a peculiarity of spin-1/2 particles. The significance of this discovery, however, is that the twin requirements of relativity and quantum theory are not compatible with a single-particle state; rather, these requirements result in a two-particle state. Thus, in some sense the requirements of relativity and quantum mechanics already start to take us down the road to a quantum theory of fields.

Quantum field theory is then constructed on the following sort of framework: “classical” theories for fields with any spin may be written down and these are quantized by reinterpreting the field variables as operators and imposing Heisenberg-type commutation relations on the field and its corresponding
214 Quantum Field Theory: A Brief Introduction
“momentum” variable. So, for example, for spinless fields we have the equal-time commutation relation

$$[\phi(\mathbf{x}, t), \pi(\mathbf{y}, t)] = i\hbar\,\delta^{(3)}(\mathbf{x} - \mathbf{y})$$

where $\pi = \partial\mathcal{L}/\partial(\partial_0\phi)$ and $\mathcal{L}$ is the Lagrange density. The mass and spin of particles are defined with reference to the Poincaré group (thereby incorporating special relativity) and the quantum requirement is the familiar one that physical states are represented by vectors in Hilbert space. The rest follows: as Weinberg says, “quantum field theory is the way it is because (with certain qualifications) this is the only way to reconcile quantum mechanics with special relativity.”

Renormalization

A notorious problem in quantum field theory is the occurrence of infinities. In QED, for example, the electron acquires a self-energy – and therefore a contribution to its mass – by virtue of the emission and reabsorption of virtual photons. It turns out that this self-energy is infinite – it is given by a divergent integral – even in the lowest order of perturbation theory. In the early days, this was recognized as being a serious problem, and in fact it turns out to be a generic problem in quantum field theory. It was realized by Dyson, however, that in some field theories these divergences may be dealt with by redefining a small number of parameters (e.g., in QED, the electron mass, charge, and field amplitude) so that thereafter the theory is finite to all orders of perturbation theory. Such theories are called renormalizable, and QED is a renormalizable field theory.

Some important field theories, however, are not renormalizable; an example is Fermi’s theory of weak interactions. To lowest order in perturbation theory, Fermi’s theory works well (e.g., in accounting for the electron spectrum in neutron beta decay), but to higher orders divergent results are obtained, which cannot be waved away by redefining a finite number of parameters; that is to say, as the order of perturbation increases, so also does the number of parameters to be redefined. Nonrenormalizable theories of this type have traditionally been regarded as highly undesirable, not to say rather nasty.

The modern view of renormalization is, however, somewhat different. The problem with nonrenormalizable theories is that, in order to calculate a physical process to all orders in perturbation theory, an infinite number of parameters must be renormalized, so the theory has no predictive power. In practice, however, we do not need to calculate to all orders in perturbation theory, since any physical process (say a scattering process or a particle decay) will only be observed at a finite energy and comparison of theory and experiment therefore only requires calculation up to a finite order of perturbation theory. So even nonrenormalizable theories are perfectly acceptable as low-energy theories. This amounts to a philosophy of effective field theories; an effective field theory is a model which holds good up to a particular energy scale, or equivalently down to a particular length scale.

An important addition to the theoretical armoury is the renormalization group. Renormalization is implemented first of all by a scheme of regularization, which enables the divergences to be exhibited explicitly. The simplest type of regularization is the introduction of a cutoff in the momentum integrals, but in modern particle physics the favored scheme is dimensional regularization. The dimensionality of the integrals in momentum space is taken to be $d = 4 - \varepsilon$ and the divergent quantities have an explicit dependence on $\varepsilon$ (which, of course, as the “real” world is approached, approaches zero). At the same time, a mass parameter $\mu$ is introduced in order to define dimensionless quantities, for example, a dimensionless coupling constant. The renormalized quantities then depend on the “bare” (unrenormalized) quantities and on $\mu$ and $\varepsilon$. The arbitrariness of $\mu$ enables a differential equation, for scattering amplitudes, for example, to be written down. While at first sight this renormalization group equation might seem to have no physical importance, in fact it gives a powerful way of studying scattering behavior at large momenta.

Most interestingly, the concept of the renormalization group also arises in condensed matter physics. Here, rather than, for example, a cutoff in momentum space, the relevant parameter is a distance scale. In the Ising model in statistical mechanics, for example, in which spins are located on a lattice, the parameter is the lattice spacing. To construct a theory that describes the physics on the macroscopic scale involves integrating out the details on the microscopic scale and one way to do this is via the “block spin” transformation originally introduced by Kadanoff. In this way the renormalization group has had a large impact in condensed matter physics, for example, in the study of critical phenomena.

Particle Physics and Cosmology

Probably the most spectacular success of quantum field theory in the twentieth century has been in particle physics. The “standard model” accounts for the strong, electromagnetic, and weak interactions
between elementary particles with outstanding success. The interactions are generalizations of Maxwell’s electrodynamics, which is invariant under a symmetry group U(1) of gauge transformations. An enlargement of this group to SU(2) × U(1) accounts for the unified electroweak interaction (the unification resulting from the fact that the two U(1)’s above are not exactly the same; there is some off-diagonal mixing), and the strong interactions between quarks, which bind them into hadrons, are invariant under an SU(3) group of gauge transformations. The gauge fields are the photon $\gamma$, the W and Z bosons (both heavy; of the order of 100 times the proton mass), and the (massless) gluons mediating the force between quarks (quantum chromodynamics, QCD). An important feature of the standard model is spontaneous symmetry breaking, which is the mechanism by which the W and Z particles acquire a mass (but the photon does not, and neither do the gluons). This goes by the name of the Higgs mechanism.

The quantization of the standard model is most successfully carried out using the path-integral formalism, rather than canonical quantization, and the proof of the renormalizability of the model (of nonabelian gauge theories with spontaneous symmetry breaking) was given by ’t Hooft. Details of these topics are now available in many textbooks.

Confidence that this is a realistic model of elementary particles – that is to say, of quarks and leptons – depends, of course, on particular experiments and their interpretation and an important milestone on this journey was Feynman’s quark–parton model of deep inelastic electron–proton scattering. The interpretation of the data required a picture of an electron scattering from an individual quark in the proton, and this in turn required a negligible interaction between quarks; in other words, that at small distances (inside the proton) the quarks are (almost) free – despite the fact that at large distances they most certainly are not! The proof, by Gross, Politzer, and Wilczek, that nonabelian gauge theories are indeed asymptotically free (asymptotic in momentum space, that is) was therefore an important event in helping to establish the credibility of the standard model.

A characteristic contribution of quantum field theory to our view of the physical world is its picture of the vacuum, as being populated with virtual particle–antiparticle pairs. A consequence of this is the phenomenon of vacuum polarization – that the presence of an electric charge in free space polarizes these virtual pairs. This in turn leads to the phenomenon of screening in QED, and antiscreening in QCD, SU(3) having a more complicated structure than U(1). It also leads to a nonzero (in fact, quadratically divergent!) value for the energy of the vacuum. This is in effect the contribution of the zero-point energies of all the oscillators in the Fourier expansion of the scalar field operator. In any other interaction than gravity, this zero-point energy may be ignored, but in gravity it may be expected to have observable consequences, and indeed it turns out that it plays the same role as a cosmological constant $\Lambda$, and therefore acts as an agent of acceleration, rather than deceleration, of the universe.

A final topic worth noting is one whose existence would have been inconceivable in the early days of this subject. The nonlinearity of the (nonabelian) gauge field equations and the existence of a nontrivial group space allows new types of topologically nontrivial solutions to these equations: solitons, bounces, instantons, sphalerons, and so on. Effects such as fractional spin and nonconservation of fermion number also appear, and, on the cosmological scale, domain walls and cosmic strings. There is something here for theoretical physicists of many differing interests.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; BRST Quantization; Constrained Systems; Constructive Quantum Field Theory; Deformation Quantization; Electroweak Theory; Euclidean Field Theory; Exact Renormalization Group; Integrability and Quantum Field Theory; Nonperturbative and Topological Aspects of Gauge Theory; Perturbative Renormalization Theory and BRST; Quantum Chromodynamics; Quantum Electrodynamics and Its Precision Tests; Quantum Fields with Indefinite Metric: Non-Trivial Models; Quantum Fields with Topological Defects; Renormalization: General Theory; Standard Model of Particle Physics; Symmetries and Conservation Laws; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Topological Defects and Their Homotopy Classification; Topological Quantum Field Theory: Overview; Twistors.

Further Reading

Cao TY (1997) Conceptual Developments of 20th Century Field Theories. Cambridge: Cambridge University Press.
Davies P (ed.) (1989) The New Physics. Cambridge: Cambridge University Press.
Gross F (1993) Relativistic Quantum Mechanics and Field Theory. New York: Wiley.
Huang K (1998) Quantum Field Theory. New York: Wiley.
Itzykson C and Zuber J-B (1980) Quantum Field Theory. New York: McGraw-Hill.
Maggiore M (2005) A Modern Introduction to Quantum Field Theory. Oxford: Oxford University Press.
Rubakov V (2002) Classical Theory of Gauge Fields. Princeton: Princeton University Press.
Schweber SS (1994) QED and the Men Who Made It. Princeton: Princeton University Press.
Schwinger J (ed.) (1958) Quantum Electrodynamics. New York: Dover.
’t Hooft G (1997) In Search of the Ultimate Building Blocks. Cambridge: Cambridge University Press.
216 Quantum Fields with Indefinite Metric: Non-Trivial Models
Weinberg S (1995, 1996) The Quantum Theory of Fields, vol. 1 and 2. Cambridge: Cambridge University Press.
Wentzel G (1960) Quantum theory of fields (until 1947). In: Fierz M and Weisskopf VF (eds.) Theoretical Physics in the Twentieth Century. New York: Interscience. (Reprinted in Mehra J (1973) The Physicist’s Conception of Nature. Dordrecht: Reidel.)
Zee A (2003) Quantum Field Theory in a Nutshell. Princeton: Princeton University Press.
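The second-quantization rules quoted in the article above, $[b_r, b_s^\dagger] = \delta_{rs}$ for bosons and $\{b_r, b_s^\dagger\} = \delta_{rs}$ for fermions, can be checked on explicit matrices for a single mode. The Python sketch below is illustrative only and is not part of the original article. The matrix representations (a 2 × 2 fermionic mode, and a bosonic mode truncated to a finite number of levels) are standard textbook choices assumed here, and all function names are hypothetical. Note that the truncated bosonic commutator equals the identity only away from the last level, which is an artifact of cutting off an infinite-dimensional space.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Transpose; equals the adjoint here because all entries are real."""
    return [list(row) for row in zip(*a)]


def add(a, b, sign=1.0):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    """[A, B] = AB - BA."""
    return add(matmul(a, b), matmul(b, a), sign=-1.0)


def anticommutator(a, b):
    """{A, B} = AB + BA."""
    return add(matmul(a, b), matmul(b, a), sign=1.0)


def boson_annihilation(levels):
    """Annihilation operator on the first `levels` number states: b|n> = sqrt(n)|n-1>."""
    b = [[0.0] * levels for _ in range(levels)]
    for n in range(1, levels):
        b[n - 1][n] = math.sqrt(n)
    return b


# Single fermionic mode on the basis (|0>, |1>): b|1> = |0>, b|0> = 0.
FERMION_B = [[0.0, 1.0], [0.0, 0.0]]

if __name__ == "__main__":
    f, f_dag = FERMION_B, transpose(FERMION_B)
    print("{b, b+} =", anticommutator(f, f_dag))
    print("{b, b}  =", anticommutator(f, f))
    b = boson_annihilation(4)
    print("[b, b+] =", commutator(b, transpose(b)))
```

Neither check involves Planck’s constant, in line with the article’s remark that the second-quantization conditions differ in this respect from $[q, p] = i\hbar$.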
The vacuum expectation values (VEVs), also called Wightman functions, of the quantum field theory with indefinite metric (IMQFT) are defined as

$$W_n(f_1 \otimes \cdots \otimes f_n) = \langle\Omega, \phi(f_1)\cdots\phi(f_n)\Omega\rangle, \qquad f_1, \ldots, f_n \in \mathcal{S} \qquad [1]$$

An axiomatic framework for (unconstrained) IMQFT has been suggested by G Morchio and F Strocchi in terms of the Wightman functions $W_n \in \mathcal{S}_n'$, $n \in \mathbb{N}_0$. Previous work on the topic had been done by J Yngvason. These generalized Wightman axioms of Morchio and Strocchi replace the positivity condition on the Wightman functions by a so-called Hilbert space structure condition (HSSC): for $n \in \mathbb{N}_0$ there exist $p_n$ a Hilbert seminorm on $\mathcal{S}_n$ such that

$$|W_{n+m}(f \otimes h)| \le p_n(f)\,p_m(h) \qquad \forall n, m \in \mathbb{N}_0,\ f \in \mathcal{S}_n,\ h \in \mathcal{S}_m \qquad [2]$$

This condition makes sure that a field algebra on a Krein space with VEVs equal to the given set of Wightman functions can be constructed. The remaining axioms of the Wightman framework – temperedness, covariance, spectral condition, locality, and Hermiticity – remain the same. Clustering of Wightman functions is assumed at least for massive theories:

$$\lim_{t\to\infty} W_{n+m}(f \otimes h_{ta}) = W_n(f)\,W_m(h) \qquad \forall n, m \in \mathbb{N}_0,\ f \in \mathcal{S}_n,\ h \in \mathcal{S}_m \qquad [3]$$

for spacelike $a \in \mathbb{R}^d$. It fails to hold in certain physical contexts where multiple vacua (also called $\theta$-vacua) accompanied with massless Goldstone bosons occur due to spontaneous symmetry breaking.

In the original Wightman axioms, there are essentially two nonlinear axioms: positivity and clustering. Here nonlinear means that checking that condition involves more than one VEV with a given number of field operators. The cluster condition can be linearized by an operation on the Wightman functions called “truncation.” The equations

$$W_n(f_1 \otimes \cdots \otimes f_n) = \sum_{I \in P^{(n)}} \ \prod_{\substack{\{j_1, \ldots, j_l\} \in I \\ j_1 < j_2 < \cdots < j_l}} W^T_l(f_{j_1} \otimes \cdots \otimes f_{j_l}) \qquad [4]$$

recursively define the truncated Wightman functions $W_n^T$ for $n \in \mathbb{N}$. Here $P^{(n)}$ stands for the set of all partitions of $\{1, \ldots, n\}$ into disjoint, nonempty sets. Unfortunately, the positivity condition (at least when combined with nontrivial scattering) becomes highly nonlinear for truncated Wightman functions. This can be seen as one explanation why it is so difficult to find nontrivial (i.e., corresponding to nontrivial interactions) solutions to the Wightman axioms.

But it turns out that, in contrast to positivity, the HSSC is essentially linear for truncated Wightman functions.

Theorem 1 If there exists a Schwartz norm $\|\cdot\|$ on $\mathcal{S}$ such that $W_n^T$ is continuous with respect to $\|\cdot\|^{\otimes n}$ for $n \in \mathbb{N}$ then the associated sequence of Wightman functions $\{W_n\}$ fulfills the HSSC [2].

Note that $\|\cdot\|^{\otimes n}$ is well defined as $\mathcal{S}$ is a nuclear space. This theorem makes it much easier to construct IMQFTs. In particular, all known solutions of the linear program for truncated Wightman functions lead to an abundance of mathematical solutions to the axioms of IMQFT, as long as the singularities of truncated Wightman functions in position and energy–momentum space do not become increasingly stronger with growing n. For example, the perturbative solutions to Wightman functions of Ostendorf and Steinmann provide solutions when the perturbation series is truncated at a given order.

Relativistic Fields from Euclidean Stochastic Equations

In the classical work on constructive quantum field theory, relativistic fields in spacetime dimensions d = 2 and 3 have been constructed by analytic continuation from Euclidean random fields. This, in particular, has led to firm connections between quantum field theory and equilibrium statistical mechanics. Let us discuss one specific class of solutions of the axioms of IMQFT for arbitrary d which also stem from random fields related to an ensemble of statistical mechanics of classical, continuous particles. Mathematically, this is connected with using random fields with Poisson distribution. As in constructive QFT, the moments, also called Schwinger functions, of the random field can be analytically continued from Euclidean imaginary time to relativistic real time. That this is possible results from an explicit calculation. Axiomatic results cannot be used, as they depend on positivity or reflection positivity in the Euclidean spacetime, respectively.

By definition, a mixing Euclidean covariant random field $\varphi$ is an almost surely linear mapping from $\mathcal{S}_{\mathbb{R}} = \mathcal{S}(\mathbb{R}^d, \mathbb{R}^N)$ to the space of real-valued
218 Quantum Fields with Indefinite Metric: Non-Trivial Models
f 2 SR ½5 where
where : RN ! C is a Lévy function, Y
n
@
QEn;1 n ði rn Þ ¼C 1 n
QE;l ;l i ½12
Z @xl
t 2 t l¼1
ðtÞ = ia t þz ðeits 1Þ drðsÞ
2 N
R nf0g with
t2R N
½6
@ n ðtÞ
n
C1 n = ðiÞ ½13
Here the centered dot represents a -invariant scalar @t1 @tn t = 0
product on RN , a positive-semidefinite -invariant
and the Einstein convention of summation and raising/
N
N matrix, z 0 a real number and r is a
lowering of indices on RN with respect to the invariant
-invariant probability measure on Rn nf0g with all
inner product is applied. The Schwinger functions
moments. Further, 2, = (@ 2 (t)=@t @t )jt = 0 ,
fulfill the requirements of -covariance, symmetry,
and p : [0, 1) ! [0, 1) is a polynomial depending
^ 1 , the Fourier-transformed inverse of D, clustering, and Hermiticity from the Osterwalder–
on D. If D
Schrader axioms of Euclidean QFT.
exists, it can be represented by
While there is no known general reason why a
^ 1 ðkÞ = Q QE ðkÞ relativistic QFT should exist for a given set of
D P
½7 Schwinger functions, one can take advantage of the
l=1 ðjkj2 þ m2l Þ l
explicit formulas [10]–[13] in order to calculate the
Here QE (k) is a complex N
N matrix with analytic continuation from Euclidean to relativistic
polynomial entries being -covariant, ()QE times explicitly.
theory in axiomatic QFT, Haag–Ruelle theory, relies experimentally very well tested, apparently has to be
on positivity. In fact, one can show that in the class located in the constraints, that is, in the procedure of
of models under discussion, the LSZ asymptotic implementing a gauge, of the theory and not in the
condition is violated if dipole degrees of freedom are unconstrained IMQFT.
admitted. In that case more complicated asymptotic Second, one can replace somewhat artificially the
conditions have to be used. In any case, the Haag– polynomials QM n in [17] by any other symmetric and
Ruelle theory cannot be adapted to IMQFT. relativistically covariant polynomial. If the sequence of
Nevertheless, asymptotic fields and states can be the ‘‘new’’ QM n is of uniformly bounded degree in any
constructed in IMQFT if one imposes a no-dipole of the arguments k1 , . . . , kn , the redefined Wightman
condition in a mathematically precise way. Then the functions in [17] still fulfill the requirements of
LSZ asymptotic condition leads to the construction of Theorem 1 and thus define a new relativistic, local
mixed VEVs of asymptotic in- and out-fields with local IMQFT. The scattering amplitudes of such a theory
fields. The collection of such VEVs is called the form- are again well defined and given by [20]. For example,
factor functional. After constructing this collection of in the case of only one scalar particle with mass m, one
mixed VEVs, one can try to check the HSSC for this can show that arbitrary Lorentz-invariant scattering
functional and obtains a Krein space representation for behavior of bosonic particles can be reproduced by
the algebra generated by in- local and out-fields. such theories for energies below an arbitrary maximal
Following this line, asymptotic in- and out-particle energy up to arbitrary precision. This kind of
states can be constructed for the given mass spectrum interpolation theorem shows that the outcome of an
in=outy
(m1 , . . . , mP ). If a, l (k), l = 1, . . . , P, denotes the arbitrary scattering experiment can be reproduced
creation operator for an incoming/outgoing particle within the formalism of (unconstrained) IMQFT as
with mass ml , spin component , and energy–momen- long as it is in agreement with the general requirements
tum k, the following scattering amplitude can be derived of Poincaré invariance and statistics.
for r incoming particles with masses ml1 , . . . , mlr and
n r outgoing particles with masses mlrþ1 , . . . , mln : List of Symbols
D ET
iny iny outy outy ! converges to
a1 ;l1 ðk1 Þ ar ;lr ðkr Þ; arþ1 ;lrþ1ðkrþ1 Þ an ;lnðkn Þ L
! convergence in law
¼ ð2ÞiQM 1 ;...;n ðk1 ; . . . ; kr ; krþ1 ; . . . ; kn Þ N set of natural numbers
Y
n N0 set of natural numbers and zero
þ
m l
ðkj Þ
ðKin Kout Þ ½20 R set of real numbers
j
j¼1 C set of complex numbers
Kin=out stand for the total energy–momentum of 1 identity mapping
Pr jD restricted to D
in- and Pout-particles, that is, Kin = j = 1 kj and
n
Kout = j = rþ1 kj .
x0 and x time and spatial part of
Two immediate consequences can be drawn from x = (x0 , x) 2 R
Rd1
[20]. First, choosing a model with nonvanishing rn gradient operator on Rdn
Poisson part such that C1 2 3 6¼ 0 and a differential
operator D containing in its mass spectrum the
masses m and
with m > 2
, one gets a nonvanish- See also: Algebraic Approach to Quantum Field Theory;
ing scattering amplitude for the process Euclidean Field Theory; Indefinite Metric; Perturbative
Renormalization Theory and BRST; Quantum Field
µ
Theory in Curved Spacetime; Quantum Field Theory: A
m ½21
Brief Introduction; Stochastic Differential Equations.
µ
Grothaus M and Streit L (1999) Construction of relativistic quantum fields in the framework of white noise analysis. Journal of Mathematical Physics 40(11): 5387.
Morchio G and Strocchi F (1980) Infrared singularities, vacuum structure and pure phases in local quantum field theory. Ann. Inst. H. Poincaré 33: 251.
Steinmann O (2000) Perturbative Quantum Electrodynamics and Axiomatic Field Theory. Berlin: Springer.
Strocchi F (1993) Selected Topics on the General Properties of Quantum Field Theory. Lecture Notes in Physics, vol. 51. Singapore: World Scientific.
normal to the superconducting phase in a metal. The identification of the state space where the field operators have to be realized is thus a physically nontrivial problem in QFT. In this respect, the structure of QFT is drastically different from that of quantum mechanics (QM). The reason is the following.

The von Neumann theorem (1955) in QM states that for systems with a finite number of degrees of freedom all the irreducible representations of the canonical commutation relations are unitarily equivalent. Therefore, in QM the physical system can only live in one single physical phase: unitary equivalence indeed means physical equivalence, and thus there is no room (no representations) for physically different phases. The situation changes drastically in QFT, where systems with infinitely many degrees of freedom are treated. In such a case, the von Neumann theorem does not hold and infinitely many unitarily inequivalent representations of the canonical commutation relations do in fact exist (Umezawa 1993, Umezawa et al. 1982). It is this richness of QFT that allows the description of different physical phases.

QFT as a Two-Level Theory

In the perturbative approach, any quantum experiment or observation can be schematized as a scattering process where one prepares a set of free (noninteracting) particles (incoming particles or in-fields) which are then made to collide at some later time in some region of space (spacetime region of interaction). The products of the collision are expected to emerge out of the interaction region as free particles (outgoing particles or out-fields). Correspondingly, one has the in-field and the out-field state space. The interaction region is where the dynamics operates: given the in-fields and the in-states, the dynamics determines the out-fields and the out-states.

The incoming particles and the outgoing ones (also called quasiparticles in solid state physics) are well distinguishable and localizable particles only far away from the interaction region, at a time much before ($t = -\infty$) and much after ($t = +\infty$) the interaction time: in- and out-fields are thus said to be asymptotic fields, and for them the interaction forces are assumed not to operate (switched off).

The only regions accessible to observations are those far away (in space and in time) from the interaction region, that is, the asymptotic regions (the in- and out-regions). This is so because, at the quantum level, observations performed in the interaction region or vacuum fluctuations occurring there may drastically interfere with the interacting objects, thus changing their nature. Besides the asymptotic fields, one then also introduces dynamical or Heisenberg fields, that is, the fields in terms of which the dynamics is given. Since the interaction region is precluded from observation, we do not observe Heisenberg fields. Observables are thus solely described in terms of asymptotic fields.

Summing up, QFT is a "two-level" theory: one level is the interaction level, where the dynamics is specified by assigning the equations for the Heisenberg fields. The other level is the physical level, that of the asymptotic fields and of the physical state space directly accessible to observations. The equations for the physical fields are equations for free fields, describing the observed incoming/outgoing particles.

To be specific, let the Heisenberg operator fields be generically denoted by $\psi_H(x)$ and the physical operator fields by $\varphi_{\rm in}(x)$. For definiteness, we choose to work with the in-fields, although the set of out-fields would work equally well. They are both assumed to satisfy equal-time canonical (anti)commutation relations.

For brevity, we omit considerations on the renormalization procedure, which are not essential for the conclusions we will reach. The Heisenberg field equations and the free-field equations are written as

$$\Lambda(\partial)\,\psi_H(x) = J[\psi_H](x) \qquad [1]$$

$$\Lambda(\partial)\,\varphi_{\rm in}(x) = 0 \qquad [2]$$

where $\Lambda(\partial)$ is a differential operator, $x \equiv (t, \mathbf{x})$, and $J$ is some functional of the $\psi_H$ fields, describing the interaction.

Equation [1] can be formally recast in the following integral form (Yang–Feldman equation):

$$\psi_H(x) = \varphi_{\rm in}(x) + \Lambda^{-1}(\partial) * J[\psi_H](x) \qquad [3]$$

where $*$ denotes convolution. The symbol $\Lambda^{-1}(\partial)$ denotes formally the Green function for $\varphi_{\rm in}(x)$. The precise form of the Green function is specified by the boundary conditions. Equation [3] can be solved by iteration, thus giving an expression for the Heisenberg fields $\psi_H(x)$ in terms of powers of the $\varphi_{\rm in}(x)$ fields; this is the Haag expansion in the LSZ formalism (or "dynamical map" in the language of Umezawa 1993 and Umezawa et al. 1982), which might be formally written as

$$\psi_H(x) = F[x; \varphi_{\rm in}] \qquad [4]$$

(A (formal) closed form for the dynamical map is obtained in the closed time path (CTP) formalism (Blasone and Jizba 2002). Then the Haag expansion [4] is directly applicable to both equilibrium and nonequilibrium situations.)
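The statement that the Yang–Feldman equation [3] can be solved by iteration, yielding the Heisenberg field as a power series in the in-field, can be illustrated with a zero-dimensional toy model. The sketch below is only a caricature under stated assumptions, not the field-theoretic computation: the fields are replaced by a single formal variable phi, the convolution with the Green function is replaced by multiplication by a constant `g`, and the source is taken to be J[psi] = psi^2. The names `poly_mul`, `yang_feldman_toy`, `g`, and `order` are hypothetical.

```python
from fractions import Fraction

def poly_mul(a, b, order):
    """Multiply two polynomials in phi (coefficient lists), dropping powers above `order`."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def yang_feldman_toy(g, order):
    """Iterate psi <- phi + g * psi**2 as a truncated power series in phi.

    Toy stand-in for psi_H = phi_in + Lambda^{-1} * J[psi_H]:
    the Green-function convolution is replaced by the number g and
    J[psi] = psi**2 (both are illustrative assumptions).
    Returns the coefficients c[k] of phi**k for k = 0..order (order >= 1).
    """
    g = Fraction(g)
    phi = [Fraction(0)] * (order + 1)
    phi[1] = Fraction(1)
    psi = list(phi)  # zeroth iterate: the free field itself
    for _ in range(order):  # each pass fixes at least one more order in phi
        sq = poly_mul(psi, psi, order)
        psi = [p + g * s for p, s in zip(phi, sq)]
    return psi

coeffs = yang_feldman_toy(g=1, order=6)
print([int(c) for c in coeffs])
```

In this toy model the fixed point is the series expansion of (1 - sqrt(1 - 4 g phi)) / (2 g), so the coefficient of phi^(n+1) is g^n times the n-th Catalan number. Exact rational arithmetic is used so that the truncated coefficients can be compared without rounding issues.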
Quantum Fields with Topological Defects 223
We stress that the equality in the dynamical map [4] is a "weak" equality, which means that it must be understood as an equality among matrix elements computed in the Hilbert space of the physical particles.

We observe that mathematical consistency in the above procedure requires that the set of $\varphi_{\rm in}$ fields be an irreducible set; however, it may happen that not all the elements of the set are known from the beginning. For example, there might be composite (bound-state) fields or even elementary quanta whose existence is ignored in a first recognition. Then the computation of the matrix elements in physical states will lead to the detection of unexpected poles in the Green's functions, which signal the existence of the ignored quanta. One thus introduces the fields corresponding to these quanta and repeats the computation. This way of proceeding is called the self-consistent method (Umezawa 1993, Umezawa et al. 1982). Thus it is not necessary to have a one-to-one correspondence between the sets $\{\psi_H^j\}$ and $\{\varphi_{\rm in}^i\}$, as happens whenever the set $\{\varphi_{\rm in}^i\}$ includes composite particles.

The Dynamical Rearrangement of Symmetry

As already mentioned, in QFT the Fock space for the physical states is not unique since one may have several physical phases, for example, for a metal the normal phase and the superconducting phase, and so on. Fock spaces describing different phases are unitarily inequivalent spaces and correspondingly we have different expectation values for certain observables and even different irreducible sets of physical quanta. Thus, finding the dynamical map involves singling out the Fock space where the dynamics has to be realized.

Let us now suppose that the Heisenberg field equations are invariant under some group $G$ of transformations of $\psi_H$:

$$\psi_H(x) \to \psi_H'(x) = g[\psi_H(x)] \qquad [5]$$

with $g \in G$. The symmetry is spontaneously broken when the vacuum state in the Fock space $\mathcal{H}$ is not invariant under the group $G$ but only under one of its subgroups (Umezawa 1993, Umezawa et al. 1982).

On the other hand, eqn [4] implies that when $\psi_H$ is transformed as in [5], then

$$\varphi_{\rm in}(x) \to \varphi_{\rm in}'(x) = g'[\varphi_{\rm in}(x)] \qquad [6]$$

with $g'$ belonging to some group of transformations $G'$ and such that

$$g[\psi_H(x)] = F[g'[\varphi_{\rm in}(x)]] \qquad [7]$$

When symmetry is spontaneously broken one has $G' \neq G$, with $G'$ the group contraction of $G$; when symmetry is not broken, $G' = G$.

Since $G$ is the invariance group of the dynamics, eqn [4] requires that $G'$ be the group under which the free-field equations are invariant, that is, $\varphi_{\rm in}'$ is also a solution of [2]. Since eqn [4] is a weak equality, $G'$ depends on the choice of the Fock space $\mathcal{H}$ among the physically realizable unitarily inequivalent state spaces. Thus, we see that the (same) original invariance of the dynamics may manifest itself in different symmetry groups for the $\varphi_{\rm in}$ fields according to different choices of the physical state space. Since this process is constrained by the dynamical equations [1], it is called the dynamical rearrangement of symmetry (Umezawa 1993, Umezawa et al. 1982).

In conclusion, different ordering patterns appear to be different manifestations of the same basic dynamical invariance. The discovery of the process of the dynamical rearrangement of symmetry leads to a unified understanding of the dynamical generation of many observable ordered patterns. This is the phenomenon of the dynamical generation of order. The contraction of the symmetry group is the mathematical structure controlling the dynamical rearrangement of the symmetry. For a qualitative presentation see Vitiello (2001).

One can now ask what carries the ordering information among the elementary constituents of the system and how the long-range correlations and the coherence observed in ordered patterns are generated and sustained. The answer lies in the fact that SSB implies the appearance of bosons (Goldstone 1961, Goldstone et al. 1962, Nambu and Jona-Lasinio 1961), the so-called Nambu–Goldstone (NG) modes or quanta. They manifest as long-range correlations and thus they are responsible for the above-mentioned change of scale, from microscopic to macroscopic. The coherent boson condensation of NG modes turns out to be the mechanism by which order is generated, as we will see in an explicit example in a later section.

The "Boson Transformation" Method

We now discuss the quantum origin of extended objects (defects) and show how they naturally emerge as macroscopic objects (inhomogeneous condensates) from the quantum dynamics. At zero temperature, the classical soliton solutions are then recovered in the Born approximation. This approach is known as the "boson transformation" method (Umezawa 1993, Umezawa et al. 1982).
The Boson Transformation Theorem

Let us consider, for simplicity, the case of a dynamical model involving one scalar field $\psi_H$ and one asymptotic field $\varphi_{\rm in}$ satisfying eqns [1] and [2], respectively.

As already remarked, the dynamical map is valid only in a weak sense, that is, as a relation among matrix elements. This implies that eqn [4] is not unique, since different sets of asymptotic fields and the corresponding Hilbert spaces can be used in its construction. Let us indeed consider a c-number function $f(x)$, satisfying the $\varphi_{\rm in}$ equations of motion [2]:

$$\Lambda(\partial)\,f(x) = 0 \qquad [8]$$

The boson transformation theorem (Umezawa 1993, Umezawa et al. 1982) states that the field

$$\psi_H^f(x) = F[x; \varphi_{\rm in} + f] \qquad [9]$$

is also a solution of the Heisenberg equation [1]. The corresponding Yang–Feldman equation takes the form

$$\psi_H^f(x) = \varphi_{\rm in}(x) + f(x) + \Lambda^{-1}(\partial) * J[\psi_H^f](x) \qquad [10]$$

The difference between the two solutions $\psi_H$ and $\psi_H^f$ is only in the boundary conditions. An important point is that the expansion in [9] is obtained from that in [4] by the spacetime-dependent translation

$$\varphi_{\rm in}(x) \to \varphi_{\rm in}(x) + f(x) \qquad [11]$$

The essence of the boson transformation theorem is that the dynamics embodied in eqn [1] contains an internal freedom, represented by the possible choices of the function $f(x)$, satisfying the free-field equation [8].

We also observe that the transformation [11] is a canonical transformation since it leaves invariant the canonical form of the commutation relations.

Let $|0\rangle$ denote the vacuum for the free field $\varphi_{\rm in}$. The vacuum expectation value of eqn [10] gives

$$\phi^f(x) \equiv \langle 0|\psi_H^f(x)|0\rangle = f(x) + \langle 0|\big[\Lambda^{-1}(\partial) * J[\psi_H^f](x)\big]|0\rangle \qquad [12]$$

The c-number field $\phi^f(x)$ is the order parameter. We remark that it is fully determined by the quantum dynamics. In the classical or Born approximation, which consists in taking $\langle 0|J[\psi_H^f]|0\rangle = J[\phi^f]$, that is, neglecting all the contractions of the physical fields, we define $\phi_{\rm cl}^f(x) \equiv \lim_{\hbar \to 0} \phi^f(x)$. In this limit, we have

$$\Lambda(\partial)\,\phi_{\rm cl}^f(x) = J[\phi_{\rm cl}^f](x) \qquad [13]$$

that is, $\phi_{\rm cl}^f(x)$ provides the solution of the classical Euler–Lagrange equation.

Beyond the classical level, in general, the form of this equation changes. The Yang–Feldman equation [10] gives not only the equation for the order parameter, eqn [13], but also, at higher orders in $\hbar$, the dynamics of the physical quanta in the potential generated by the "macroscopic object" $\phi^f(x)$ (Umezawa 1993, Umezawa et al. 1982).

One can show (Umezawa 1993, Umezawa et al. 1982) that the solutions of eqn [8] which lead to topologically nontrivial solutions of eqn [13] (i.e., solutions carrying a nonzero topological charge) are those which have some sort of singularity with respect to the Fourier transform. These can be either divergent singularities or topological singularities. The former are associated with a divergence of $f(x)$ for $|x| = \infty$, at least in some direction. Topological singularities are instead present when $f(x)$ is not single-valued, that is, it is path dependent. In both cases, the macroscopic object described by the order parameter carries a nonzero topological charge.

Topological Singularities and Massless Bosons

An important result is that boson transformation functions carrying topological singularities are only allowed for massless bosons (Umezawa 1993, Umezawa et al. 1982).

Consider a generic boson field $\chi_{\rm in}$ satisfying the equation

$$(\partial^2 + m^2)\,\chi_{\rm in}(x) = 0 \qquad [14]$$

and suppose that the function $f(x)$ for the boson transformation $\chi_{\rm in}(x) \to \chi_{\rm in}(x) + f(x)$ carries a topological singularity. It is then not single-valued and thus path dependent:

$$G^{+}_{\mu\nu}(x) \equiv [\partial_\mu, \partial_\nu]\,f(x) \neq 0 \quad \text{for certain } \mu, \nu, x \qquad [15]$$

On the other hand, $\partial_\mu f(x)$, which is related to observables, is single-valued, that is, $[\partial_\rho, \partial_\nu]\,\partial_\mu f(x) = 0$. Recall that $f(x)$ is a solution of the $\chi_{\rm in}$ equation:

$$(\partial^2 + m^2)\,f(x) = 0 \qquad [16]$$

From the definition of $G^{+}_{\mu\nu}(x)$ and the regularity of $\partial_\mu f(x)$, it follows, by computing $\partial^\mu G^{+}_{\mu\nu}(x)$, that

$$\partial_\nu f(x) = \frac{1}{\partial^2 + m^2}\,\partial^\mu G^{+}_{\mu\nu}(x) \qquad [17]$$

Applying $\partial^\nu$ to this equation, the right-hand side vanishes because $\partial^\nu \partial^\mu$ is symmetric while $G^{+}_{\mu\nu}(x)$ is antisymmetric. This leads to $\partial^2 f(x) = 0$, which together with [16] implies $m = 0$. Thus, we conclude that [15] is only compatible with a massless equation for $\chi_{\rm in}$.
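The interplay between a path-dependent f(x) and the massless equation can be checked numerically on a simple static example. The sketch below assumes, purely for illustration, the two-dimensional function f = arctan(x2/x1) (the polar angle): its gradient is single-valued away from the origin and f is harmonic there, so it solves the static massless equation, yet the line integral of its gradient around any loop enclosing the origin equals 2π, which signals that f itself is not single-valued. The function names, step sizes, and sample points are hypothetical choices.

```python
import math

def grad_f(x1, x2):
    """Gradient of f = atan2(x2, x1); single-valued everywhere except the origin."""
    r2 = x1 * x1 + x2 * x2
    return (-x2 / r2, x1 / r2)

def laplacian_f(x1, x2, h=1e-3):
    """Five-point finite-difference Laplacian of f, to be used away from the branch cut."""
    f = math.atan2  # note the argument order: atan2(x2, x1)
    return (f(x2, x1 + h) + f(x2, x1 - h) + f(x2 + h, x1) + f(x2 - h, x1)
            - 4.0 * f(x2, x1)) / (h * h)

def circulation(cx, cy, radius, steps=20000):
    """Line integral of grad f around a circle centred at (cx, cy), midpoint rule."""
    total = 0.0
    dt = 2.0 * math.pi / steps
    for k in range(steps):
        t = (k + 0.5) * dt
        x1 = cx + radius * math.cos(t)
        x2 = cy + radius * math.sin(t)
        gx, gy = grad_f(x1, x2)
        # dot product of grad f with the tangent vector of the circle, times dt
        total += (-gx * radius * math.sin(t) + gy * radius * math.cos(t)) * dt
    return total

print("Laplacian at (1.0, 0.5):", laplacian_f(1.0, 0.5))
print("loop around the origin: ", circulation(0.0, 0.0, 1.0))
print("loop missing the origin:", circulation(3.0, 0.0, 1.0))
print("off-centre loop around the origin:", circulation(0.3, -0.2, 2.0))
```

Up to numerical error, the Laplacian vanishes, both loops enclosing the origin give 2π independently of their shape and position, and the loop that does not enclose the origin gives zero. By Stokes' theorem this circulation is the surface integral of the commutator of derivatives acting on f, which is therefore concentrated on the singular point.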
The topological charge is defined as B(x) is an auxiliary field which implements the
Z Z gauge-fixing condition (Matsumoto et al. 1975a, b).
NT ¼ dl @ f ¼ dS @ @ f Notice the -term where v is a complex number; its
C S rôle is to specify the condition of symmetry breaking
Z
1 þ under which we want to compute the functional
¼ dS G ½18
2 S integral and it may be given the physical meaning of
Here C is a contour enclosing the singularity and S a a small external field triggering the symmetry
surface with C as boundary. NT does not depend on breaking (Matsumoto et al. 1975a, b). The limit
the path C provided this does not cross the ! 0 must be made at the end of the computations.
singularity. The dual tensor G (x) is We will use the notation
Z
1
G ðxÞ 12 Gþ
ðxÞ ½19 hF½i;J;K ½dA ½d½d ½dBF½
N
and satisfies the continuity equation exp i S½A ; B; ½24
@ G ðxÞ ¼ 0 with hF[]i hF[]i, J =K =0 and hF[]i lim!0
, @ Gþ þ þ
ðxÞ þ @ G ðxÞ þ @ G ðxÞ ¼ 0 ½20 hF[]i .
The fields , A , and B appearing in the generating
Equation [20] completely characterizes the topolo- functional are c-number fields. In the following, the
gical singularity (Umezawa 1993, Umezawa et al. Heisenberg operator fields corresponding to them
1982). will be denoted by H , AH , and BH , respectively.
Thus, the spontaneous symmetry breaking condition
is expressed by h0jH (x)j0i ~v 6¼ 0, with ~v constant.
An Example: The Anderson–Higgs–Kibble Since in the functional integral formalism the
Mechanism and the Vortex Solution functional average of a given c-number field gives
the vacuum expectation value of the corresponding
We consider a model of a complex scalar field (x)
operator field, for example, hF[]i h0jF[H ]j0i, we
interacting with a gauge field A (x) (Anderson 1958,
have lim ! 0 h(x)i h0jH (x)j0i = ~v.
Higgs 1960, Kibble 1967). The lagrangian density
Let us introduce the following decompositions:
L[(x), (x), A (x)] is invariant under the global
and the local U(1) gauge transformations (we do not 1
ðxÞ ¼ pffiffiffi ½ ðxÞ þ iðxÞ
assume a particular form for the Lagrangian density, 2
so the following results are quite general): 1
KðxÞ ¼ pffiffiffi ½K1 ðxÞ þ iK2 ðxÞ
ðxÞ ! ei
ðxÞ; A ðxÞ ! A ðxÞ ½21 2
ðxÞ ðxÞ h ðxÞi
ðxÞ ! eie0 ðxÞ ðxÞ; A ðxÞ ! A ðxÞ þ @ ðxÞ ½22
Note that h(x)i = 0 because of the invariance
respectively, where (x) ! 0 for jx0 j ! 1 and/or under ! .
jxj ! 1 and e0 is the coupling constant. We work
in the Lorentz gauge @ A (x) = 0. The generating
The Goldstone Theorem
functional, including the gauge constraint, is
(Matsumoto et al. 1975a, b) Since the functional integral [23] is invariant under
Z the global transformation [21], we have that
1
Z½ J; K ¼ ½dA ½d½d ½dB @Z[ J, K]=@
= 0 and subsequent derivatives with
N respect to K1 and K2 lead to
exp i S½A ; B; ½23
pffiffiffi Z
Z h ðxÞi ¼ 2v d4 yhðxÞðyÞi
h
S¼ d4 x LðxÞ þ BðxÞ@ A ðxÞ pffiffiffi
¼ 2v ð; 0Þ ½25
þ K ðxÞðxÞ þ KðxÞ ðxÞ In momentum space the propagator for the field
i
þ J ðxÞA ðxÞ þ ijðxÞ vj 2 has the general form
Z
Z
N ¼ ½dA ½d½d ½dB ð0; pÞ ¼ lim 2
!0 p m2 þ ia
Z
4 2
exp i d x LðxÞ þ ijðxÞ vj þ (continuum contributions) ½26
Here Z and a are renormalization constants. The (, A , B), all the other two-point functions
integration in eqn [25] picks up the pole contribu- must vanish.
tion at p2 = 0, and leads to The dynamical maps expressing the Heisenberg
pffiffiffi Z operator fields in terms of the asymptotic operator
v¼
~ 2 v , m ¼ 0; v ¼ 0 , m 6¼ 0
~ ½27 fields are found to be (Matsumoto et al. 1975a, b)
a ( )
Z1=2
The Goldstone theorem (Goldstone 1961, Goldstone H ðxÞ ¼ :exp i in ðxÞ ~v þ Z1=2 in ðxÞ
et al. 1962) is thus proved: if the symmetry is ~v
spontaneously broken (~v 6¼ 0), a massless mode must
þF ½in ; Uin ; @ðin bin Þ : ½32
exist, whose field is (x), that is, the NG boson
mode. Since it is massless, it manifests as a long- Z1=2
1=2
range correlation mode. (Notice that in the present AH ðxÞ ¼Z3 Uin
ðxÞ þ @ bin ðxÞ
case of a complex scalar field model, the NG mode e0 ~v
is an elementary field. In other models, it may þ : F ½in ; Uin ; @ðin bin Þ: ½33
appear as a bound state, for example, the magnon in
(anti)ferromagnets.) Note that e0 ~v
BH ðxÞ ¼ 1=2
½bin ðxÞ in ðxÞ þ c ½34
@ pffiffiffi Z 4 Z
h ðxÞi ¼ 2 d yhðxÞðyÞi ½28
@v where : . . . : denotes the normal ordering and the
functionals F and F are to be determined within a
and because m 6¼ 0, the right-hand side of this
particular model. In eqns [32]–[34], in denotes the
equation vanishes in the limit ! 0; therefore, ~v is
NG mode, bin the ghost mode, Uin the massive
independent of jvj, although the phase of jvj
vector field, and in the massive matter field. In eqn
determines the one of ~ v (from eqn [25]): as in
[34] c is a c-number constant, whose value is
ferromagnets, once an external magnetic field is
irrelevant since only derivatives of B appear in the
switched on, the system is magnetized independently
field equations (see below). Z3 represents the wave
of the strength of the external field.
function renormalization for Uin . The corresponding
The Dynamical Map and the Field Equations field equations are
Observing that the change of variables [21] (and/or @ 2 in ðxÞ ¼ 0; @ 2 bin ðxÞ ¼ 0
[22]) does not affect the generating functional, we may ½35
ð@ 2 þ m2 Þin ðxÞ ¼ 0
obtain the Ward–Takahashi identities. Also, using
B(x) ! B(x) þ (x) in [23] gives h@ A (x)i, J, K = 0.
ð@ 2 þ m2V ÞUin ðxÞ ¼ 0; @ Uin ðxÞ ¼ 0 ½36
One then finds the following two-point function pole
structures (Matsumoto et al. 1975a, b): with mV 2 = (Z3 =Z )(e0 ~v)2 . The field equations for
( Z )
i e0 ~
v BH and AH read (Matsumoto et al. 1975a, b)
hBðxÞðyÞi ¼ lim 4
d4 p eipðxyÞ ½29
!0 ð2Þ p2 þ ia @ 2 BH ðxÞ ¼ 0; @ 2 AH ðxÞ ¼ jH ðxÞ @ BH ðxÞ ½37
Z with jH (x) = L(x)=AH (x). One may then require
i 1
hBðxÞA ðyÞi ¼ @x d4 p eipðxyÞ ½30 that the current jH is the only source of the gauge
ð2Þ4 p2
field AH in any observable process. This amounts to
( Z impose the condition: p hbj@ BH (x)jaip = 0, that is,
i vÞ 2
ðe0 ~
hBðxÞBðyÞi ¼ lim d4 p eipðxyÞ ð@ 2 Þp hbjA0H ðxÞjaip ¼ phbj jH ðxÞjaip ½38
!0 ð2Þ4 Z
This is confirmed by the fact that they are present BH in a combination such that the changes of BH
in the S-matrix in the combination (in bin ) and of Uin compensate each other provided
(Matsumoto et al. 1975a, b). It is to be remarked,
however, that the NG boson does not disappear from m2V
ð@ 2 þ m2V Þa ðxÞ ¼ @ f ðxÞ ½45
the theory: we shall see below that there are situations e0
in which the NG fields do have observable effects. Equation [45] thus obtained is the Maxwell equa-
tion for the massive potential vector a (Matsumoto
The Dynamical Rearrangement of Symmetry et al. 1975a, b). The classical ground state current j
and the Classical Fields and Currents
turns out to be
From eqns [32]–[33] we see that the local gauge
1
transformations of the Heisenberg fields j ðxÞ h0jjH ðxÞj0i ¼ m2V a ðxÞ @ f ðxÞ ½46
e0
H ðxÞ ! eie0 ðxÞ H ðxÞ
½40 The term m2V a (x) is the Meissner current, while
AH ðxÞ ! AH ðxÞ þ @ ðxÞ; BH ðxÞ ! BH ðxÞ (m2V =e0 )@ f (x) is the boson current. The key point
with @ 2 (x) = 0, are induced by the in-field here is that both the macroscopic field and current
transformations are given in terms of the boson condensation
function f (x).
e0 ~
v Two remarks are in order: first, note that the
in ðxÞ ! in ðxÞ þ 1=2
ðxÞ
Z terms proportional to @ f (x) are related to obser-
e0 ~
v ½41 vable effects, for example, the boson current which
bin ðxÞ ! bin ðxÞ þ 1=2
ðxÞ acts as the source of the classical field. Second, note
Z
that the macroscopic ground state effects do not
in ðxÞ ! in ðxÞ; Uin ðxÞ ! Uin ðxÞ occur for regular f (x)(Gþ (x) = 0). In fact, from [45]
On the other hand, the global phase transformation we obtain a (x) = (1=e0 )@ f (x) for regular f (x)
H (x) ! ei
H (x) is induced by which implies zero classical current (j = 0) and
zero classical field (F = @ a @ a ), since the
v
~ Meissner and the boson current cancel each other.
in ðxÞ ! in ðxÞ þ 1=2
f ðxÞ; bin ðxÞ ! bin ðxÞ
Z In conclusion, the vacuum current appears only
in ðxÞ ! in ðxÞ;
Uin
ðxÞ ! Uin ðxÞ ½42 when f (x) has topological singularities and these can
be created only by condensation of massless bosons,
with @ 2 f (x) = 0 and the limit f (x) ! 1 to be performed that is, when SSB occurs. This explains why
at the end of computations. Note that under the above topological defects appear in the process of phase
transformations, the in-field equations and the transitions, where NG modes are present and
S-matrix are invariant and that BH is changed by an gradients in their condensate densities are nonzero
irrelevant c-number (in the limit f ! 1). (Kibble 1976, Zurek 1997).
Consider now the boson transformation On the other hand, the appearance of spacetime
in (x) ! in (x) þ
(x): in local gauge theories the order parameter is no guarantee that persistent
boson transformation must be compatible with the ground state currents (and fields) will exist: if f (x)
Heisenberg field equations but also with the physical is a regular function, the spacetime dependence of ~ v
state condition [39]. Under the boson transforma- can be gauged away by an appropriate gauge
tion with
(x) = ~ vZ1=2
f (x) and @ 2 f (x) = 0, BH transformation.
changes as Since, as already mentioned, the boson transfor-
mation with regular f (x) does not affect observable
v2
e0 ~
BH ðxÞ ! BH ðxÞ f ðxÞ ½43 quantities, the S-matrix is actually given by
Z
1
eqn [38] is thus violated when the Gupta–Bleuler- S ¼ : S in ; Uin @ðin bin Þ : ½47
mV
like condition is imposed. In order to restore it, the
shift in BH must be compensated by means of the This is indeed independent of the boson transforma-
following transformation on Uin : tion with regular f (x):
1=2
Uin ðxÞ ! Uin ðxÞ þ Z3 a ðxÞ; @ a ðxÞ ¼ 0 ½44 0 1
S ! S ¼ :S in ; Uin @ðin bin Þ
mV
with a convenient c-number function a (x). The
dynamical maps of the various Heisenberg operators 1=2 1
þZ3 ða @ f Þ : ½48
are not affected by [44] since they contain Uin and e0
R
since $a_\mu(x) = (1/e_0)\,\partial_\mu f(x)$ for regular $f(x)$. However, $S' \neq S$ for singular $f(x)$: $S'$ includes the interaction of the in-field quanta (such as $U_{\rm in}$) with the classically behaving macroscopic defects (Umezawa 1993, Umezawa et al. 1982).

The Vortex Solution

Below we consider the example of the Nielsen–Olesen vortex string solution. We identify the boson function $f(x)$ that controls the nonhomogeneous NG boson condensation in terms of which the string solution is described. For brevity, we only report the results of the computations. The detailed derivation as well as the discussion of further examples can be found in Umezawa (1993) and Umezawa et al. (1982).

In the present U(1) problem, the electromagnetic tensor and the vacuum current are (Umezawa 1993, Umezawa et al. 1982, Matsumoto et al. 1975a, b)

\[ F_{\mu\nu}(x) = \partial_\mu a_\nu(x) - \partial_\nu a_\mu(x) = 2\pi \frac{m_V^2}{e_0} \int d^4x'\, \Delta_c(x - x')\, G^{+}_{\mu\nu}(x') \qquad [49] \]

\[ j_\mu(x) = -2\pi \frac{m_V^2}{e_0} \int d^4x'\, \Delta_c(x - x')\, \partial^{\nu}_{x'} G^{+}_{\nu\mu}(x') \qquad [50] \]

respectively, and satisfy $\partial^\mu F_{\mu\nu}(x) = -j_\nu(x)$. In these equations,

\[ \Delta_c(x - x') = \frac{1}{(2\pi)^4} \int d^4p\, e^{-ip(x - x')}\, \frac{1}{p^2 - m_V^2 + i\epsilon} \qquad [51] \]

The line singularity for the vortex (or string) solution can be parametrized by a single line parameter $\sigma$ and by the time parameter $\tau$. A static vortex solution is obtained by setting $y_0(\tau, \sigma) = \tau$ and $\mathbf{y}(\tau, \sigma) = \mathbf{y}(\sigma)$, with $\mathbf{y}$ denoting the line coordinate. $G^{+}_{\mu\nu}(x)$ is nonzero only on the line at $\mathbf{y}$ (more lines can be considered, but we limit ourselves to a single line for simplicity). Thus, we have

\[ G_{0i}(x) = \int d\sigma\, \frac{dy_i(\sigma)}{d\sigma}\, \delta^3[\mathbf{x} - \mathbf{y}(\sigma)], \qquad G_{ij}(x) = 0 \]
\[ G^{+}_{ij}(x) = \epsilon_{ijk}\, G_{0k}(x), \qquad G^{+}_{0i}(x) = 0 \qquad [52] \]

Equation [49] shows that these vortices are purely magnetic. We obtain

\[ \partial_0 f(x) = 0 \]
\[ \partial_i f(x) = \frac{1}{(2\pi)^2} \int d\sigma\, \epsilon_{ijk}\, \frac{dy_k(\sigma)}{d\sigma}\, \partial^{x}_{j} \int d^3p\, \frac{e^{i\mathbf{p}\cdot(\mathbf{x} - \mathbf{y}(\sigma))}}{\mathbf{p}^2} \qquad [53] \]

that is, by using the identity $(2\pi)^{-2} \int d^3p\, (e^{i\mathbf{p}\cdot\mathbf{x}}/\mathbf{p}^2) = 1/(2|\mathbf{x}|)$,

\[ \nabla f(x) = -\frac{1}{2} \int d\sigma\, \frac{d\mathbf{y}(\sigma)}{d\sigma} \wedge \nabla_x \frac{1}{|\mathbf{x} - \mathbf{y}(\sigma)|} \qquad [54] \]

Note that $\nabla^2 f(x) = 0$ is satisfied.

A straight infinitely long vortex is specified by $y_i(\sigma) = \sigma\, \delta_{i3}$ with $-\infty < \sigma < \infty$. The only nonvanishing components of $G_{\mu\nu}(x)$ and $G^{+}_{\mu\nu}(x)$ are $G_{03}(x) = G^{+}_{12}(x) = \delta(x_1)\delta(x_2)$. Equation [54] gives (Umezawa 1993, Umezawa et al. 1982, Matsumoto et al. 1975a, b)

\[ \frac{\partial}{\partial x_1} f(x) = \frac{1}{2} \int d\sigma\, \frac{\partial}{\partial x_2} \left[ x_1^2 + x_2^2 + (x_3 - \sigma)^2 \right]^{-1/2} = -\frac{x_2}{x_1^2 + x_2^2} \]
\[ \frac{\partial}{\partial x_2} f(x) = \frac{x_1}{x_1^2 + x_2^2}, \qquad \frac{\partial}{\partial x_3} f(x) = 0 \qquad [55] \]

and then

\[ f(x) = \tan^{-1}\!\left( \frac{x_2}{x_1} \right) = \theta(x) \qquad [56] \]

We have thus determined the boson transformation function corresponding to a particular vortex solution. The vector potential is

\[ a_1(x) = -\frac{m_V^2}{2e_0} \int d^4x'\, \Delta_c(x - x')\, \frac{x_2'}{x_1'^2 + x_2'^2} \]
\[ a_2(x) = \frac{m_V^2}{2e_0} \int d^4x'\, \Delta_c(x - x')\, \frac{x_1'}{x_1'^2 + x_2'^2} \qquad [57] \]
\[ a_3(x) = a_0(x) = 0 \]

and the only nonvanishing component of $F_{\mu\nu}$ is

\[ F_{12}(x) = 2\pi \frac{m_V^2}{e_0} \int d^4x'\, \Delta_c(x - x')\, \delta(x_1')\delta(x_2') = -\frac{m_V^2}{e_0}\, K_0\!\left( m_V \sqrt{x_1^2 + x_2^2} \right) \qquad [58] \]

Finally, the vacuum current eqn [50] is given by

\[ j_1(x) = -\frac{m_V^3}{e_0}\, \frac{x_2}{\sqrt{x_1^2 + x_2^2}}\, K_1\!\left( m_V \sqrt{x_1^2 + x_2^2} \right) \]
\[ j_2(x) = \frac{m_V^3}{e_0}\, \frac{x_1}{\sqrt{x_1^2 + x_2^2}}\, K_1\!\left( m_V \sqrt{x_1^2 + x_2^2} \right) \qquad [59] \]
\[ j_3(x) = j_0(x) = 0 \]

We observe that these results are the same as those of the Nielsen–Olesen vortex solution. Notice that we did not specify the potential in our model but only the invariance properties. Thus, the invariance properties of the dynamics determine the characteristics of the topological solutions. The vortex solution
manifests the original U(1) symmetry through the cylindrical angle $\theta$, which is the parameter of the U(1) representation in the coordinate space.

Conclusions

We have discussed how topological defects arise as inhomogeneous condensates in QFT. Topological defects are shown to have a genuine quantum nature. The approach reviewed here goes under the name of "boson transformation method" and relies on the existence of unitarily inequivalent representations of the field algebra in QFT.

Describing quantum fields with topological defects then amounts to properly choosing the physical Fock space for representing the Heisenberg field operators. Once the boundary conditions corresponding to a particular soliton sector are found, the Heisenberg field operators embodied with such conditions contain the full information about the defects, the quanta, and their mutual interaction. One can thus calculate Green's functions for particles in the presence of defects. The extension to finite temperature is discussed in Blasone and Jizba (2002) and Manka and Vitiello (1990).

As an example we have discussed a model with U(1) gauge invariance and SSB and we have obtained the Nielsen–Olesen vortex solution in terms of localized condensation of Goldstone bosons. These thus appear to play a physical role, although, in the presence of gauge fields, they do not show up in the physical spectrum as excitation quanta. The function $f(x)$ controlling the condensation of the NG bosons must be singular in order to produce observable effects. Boson transformations with regular $f(x)$ only amount to gauge transformations. For the treatment of topological defects in nonabelian gauge theories, see Manka and Vitiello (1990).

Finally, when there are no NG modes, as in the case of the kink solution or the sine-Gordon solution, the boson transformation function has to carry a divergence singularity at spatial infinity (Umezawa 1993, Umezawa et al. 1982, Blasone and Jizba 2002). The boson transformation has also been discussed in connection with the Bäcklund transformation at a classical level and the confinement of the constituent quanta in the coherent condensation domain.

For further reading on quantum fields with topological defects, see Blasone et al. (2006).

Acknowledgments

The authors thank MIUR, INFN, INFM, and the ESF network COSLAB for partial financial support.

See also: Abelian Higgs Vortices; Algebraic Approach to Quantum Field Theory; Quantum Field Theory: A Brief Introduction; Quantum Field Theory in Curved Spacetime; Symmetries in Quantum Field Theory: Algebraic Aspects; Symmetries in Quantum Field Theory of Lower Spacetime Dimensions; Topological Defects and their Homotopy Classification.

Further Reading

Anderson PW (1958) Coherent excited states in the theory of superconductivity: gauge invariance and the Meissner effect. Physical Review 110: 827–835.
Blasone M and Jizba P (2002) Topological defects as inhomogeneous condensates in quantum field theory: kinks in (1 + 1) dimensional $\lambda\psi^4$ theory. Annals of Physics 295: 230–260.
Blasone M, Jizba P, and Vitiello G (2006) Spontaneous Breakdown of Symmetry and Topological Defects. London: Imperial College Press (in preparation).
Goldstone J (1961) Field theories with "superconductor" solutions. Nuovo Cimento 19: 154–164.
Goldstone J, Salam A, and Weinberg S (1962) Broken symmetries. Physical Review 127: 965–970.
Higgs P (1966) Spontaneous symmetry breakdown without massless bosons. Physical Review 145: 1156–1163.
Kibble TWB (1967) Symmetry breaking in non-abelian gauge theories. Physical Review 155: 1554–1561.
Kibble TWB (1976) Topology of cosmic domains and strings. Journal of Physics A 9: 1387–1398.
Kibble TWB (1980) Some implications of a cosmological phase transition. Physics Reports 67: 183–199.
Kleinert H (1989) Gauge Fields in Condensed Matter, vols. I & II. Singapore: World Scientific.
Manka R and Vitiello G (1990) Topological solitons and temperature effects in gauge field theory. Annals of Physics 199: 61–83.
Matsumoto H, Papastamatiou NJ, Umezawa H, and Vitiello G (1975a) Dynamical rearrangement in Anderson–Higgs–Kibble mechanism. Nuclear Physics B 97: 61–89.
Matsumoto H, Papastamatiou NJ, and Umezawa H (1975b) The boson transformation and the vortex solutions. Nuclear Physics B 97: 90–124.
Nambu Y and Jona-Lasinio G (1961) Dynamical model of elementary particles based on an analogy with superconductivity. I. Physical Review 122: 345–358.
Nambu Y and Jona-Lasinio G (1961) Dynamical model of elementary particles based on an analogy with superconductivity. II. Physical Review 124: 246–254.
Rajaraman R (1982) Solitons and Instantons: An Introduction to Solitons and Instantons in Quantum Field Theory. Amsterdam: North-Holland.
Umezawa H (1993) Advanced Field Theory: Micro, Macro and Thermal Physics. New York: American Institute of Physics.
Umezawa H, Matsumoto H, and Tachiki M (1982) Thermo Field Dynamics and Condensed States. Amsterdam: North-Holland.
Vitiello G (2001) My Double Unveiled. Amsterdam: John Benjamins.
Volovik GE (2003) The Universe in a Helium Droplet. Oxford: Clarendon.
von Neumann J (1955) Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.
Zurek WH (1997) Cosmological experiments in condensed matter systems. Physics Reports 276: 177–221.
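As a numerical cross-check of the vortex results in equations [55] and [56], the following minimal Python sketch compares a finite-difference gradient of $f = \tan^{-1}(x_2/x_1)$ with the closed form in [55], checks that the Laplacian vanishes away from the vortex line, and evaluates the circulation of $\nabla f$ around the line. The function names, the test point, and the step sizes are illustrative assumptions and are not taken from the article; `math.atan2` is used as one particular branch of the cylindrical angle, with its cut along the negative $x_1$ axis.

```python
import math

def f(x1, x2):
    # Cylindrical angle theta(x); atan2 selects the branch with a cut on the negative x1 axis.
    return math.atan2(x2, x1)

def grad_f_exact(x1, x2):
    # Closed form of eqn [55]: (-x2 / r^2, x1 / r^2).
    r2 = x1 * x1 + x2 * x2
    return (-x2 / r2, x1 / r2)

def grad_f_numeric(x1, x2, h=1e-5):
    # Central differences; only valid away from the branch cut and the origin.
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return (d1, d2)

def laplacian_numeric(x1, x2, h=1e-3):
    # Five-point stencil for the two-dimensional Laplacian of f.
    return (f(x1 + h, x2) + f(x1 - h, x2) + f(x1, x2 + h) + f(x1, x2 - h)
            - 4.0 * f(x1, x2)) / (h * h)

def circulation(radius=1.0, n=2000):
    # Line integral of grad f around a circle enclosing the vortex line,
    # using the closed-form gradient at segment midpoints.
    total = 0.0
    for k in range(n):
        t0 = 2.0 * math.pi * k / n
        t1 = 2.0 * math.pi * (k + 1) / n
        tm = 0.5 * (t0 + t1)
        g1, g2 = grad_f_exact(radius * math.cos(tm), radius * math.sin(tm))
        dx1 = radius * (math.cos(t1) - math.cos(t0))
        dx2 = radius * (math.sin(t1) - math.sin(t0))
        total += g1 * dx1 + g2 * dx2
    return total

if __name__ == "__main__":
    print(grad_f_numeric(0.7, 0.4), grad_f_exact(0.7, 0.4))
    print(laplacian_numeric(0.7, 0.4))
    print(circulation() / (2.0 * math.pi))
```

The circulation comes out close to $2\pi$ for any radius, which is the numerical counterpart of the statement that $f$ is singular on the vortex line: the function is harmonic away from the line, yet its derivatives fail to commute there, in line with $G^{+}_{12}(x) = \delta(x_1)\delta(x_2)$.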
230 Quantum Geometry and Its Applications
given a 2-surface $S$ on $M$, and an su(2)-valued (test) function $f$ on $M$,

\[ P_{S,f} := \int_S \mathrm{tr}(f P) \qquad [2] \]

is a momentum function on $\Gamma$, where tr is over the su(2) indices. (For simplicity of presentation, all fields are assumed to be smooth and curves/edges $e$ and surfaces $S$, finite and piecewise analytic in a specific sense. The extension to smooth curves and surfaces was carried out by Baez and Sawin, Lewandowski and Thiemann, and Fleischhack. It is technically more involved but the final results are qualitatively the same.) The symplectic structure on $\Gamma$ enables one to calculate the Poisson brackets $\{h_e, P_{S,f}\}$. The result is a linear combination of holonomies and can be written as a Lie derivative,

\[ \{h_e, P_{S,f}\} = \mathcal{L}_{X_{S,f}}\, h_e \qquad [3] \]

where $X_{S,f}$ is a derivation on the ring generated by holonomy functions, and can therefore be regarded as a vector field on the configuration space $\mathcal{A}$ of connections. This is a familiar situation in classical mechanics of systems whose configuration space is a finite-dimensional manifold. Functions $h_e$ and vector fields $X_{S,f}$ generate a Lie algebra. As in quantum mechanics on manifolds, the first step is to promote this algebra to a quantum algebra by demanding that the commutator be given by $i\hbar$ times the Lie bracket. The result is a $\star$-algebra $\mathfrak{a}$, analogous to the algebra generated by operators $\exp i\lambda\hat{x}$ and $\hat{p}$ in quantum mechanics. By exponentiating the momentum operators $\hat{P}_{S,f}$ one obtains $\mathfrak{W}$, the analog of the quantum-mechanical Weyl algebra generated by $\exp i\lambda\hat{x}$ and $\exp i\mu\hat{p}$.

The main task is to obtain the appropriate representation of these algebras. In that representation, quantum Riemannian geometry can be probed through the momentum operators $\hat{P}_{S,f}$, which stem from classical orthonormal triads. As in quantum mechanics on manifolds or simple field theories in flat space, it is convenient to divide the task into two parts. In the first, one focuses on the algebra $\mathcal{C}$ generated by the configuration operators $\hat{h}_c$ and finds all its representations, and in the second one considers the momentum operators $\hat{P}_{S,f}$ to restrict the freedom.

$\mathcal{C}$ is called the holonomy algebra. It is naturally endowed with the structure of an abelian $C^\star$-algebra (with identity), whence one can apply the powerful machinery made available by the Gel'fand theory. This theory tells us that $\mathcal{C}$ determines a unique compact, Hausdorff space $\bar{\mathcal{A}}$ such that the $C^\star$-algebra of all continuous functions on $\bar{\mathcal{A}}$ is naturally isomorphic to $\mathcal{C}$. $\bar{\mathcal{A}}$ is called the Gel'fand spectrum of $\mathcal{C}$. It has been shown to consist of "generalized connections" $\bar{A}$ defined as follows: $\bar{A}$ assigns to any oriented edge $e$ in $M$ an element $\bar{A}(e)$ of SU(2) (a "holonomy") such that $\bar{A}(e^{-1}) = [\bar{A}(e)]^{-1}$; and, if the endpoint of $e_1$ is the starting point of $e_2$, then $\bar{A}(e_1 \circ e_2) = \bar{A}(e_1) \cdot \bar{A}(e_2)$. Clearly, every smooth connection $A$ is a generalized connection. In fact, the space $\mathcal{A}$ of smooth connections has been shown to be dense in $\bar{\mathcal{A}}$ (with respect to the natural Gel'fand topology thereon). But $\bar{\mathcal{A}}$ has many more "distributional elements." The Gel'fand theory guarantees that every representation of the $C^\star$-algebra $\mathcal{C}$ is a direct sum of representations of the following type: the underlying Hilbert space is $\mathcal{H} = L^2(\bar{\mathcal{A}}, d\mu)$ for some measure $\mu$ on $\bar{\mathcal{A}}$ and (regarded as functions on $\bar{\mathcal{A}}$) elements of $\mathcal{C}$ act by multiplication. Since there are many inequivalent measures on $\bar{\mathcal{A}}$, there is a multitude of representations of $\mathcal{C}$. A key question is how many of them can be extended to representations of the full algebra $\mathfrak{a}$ (or $\mathfrak{W}$) without having to introduce any "background fields" which would compromise diffeomorphism covariance. Quite surprisingly, the requirement that the representation be cyclic with respect to a state which is invariant under the action of the (appropriately defined) group Diff $M$ of piecewise-analytic diffeomorphisms on $M$ singles out a unique irreducible representation. This result was established for $\mathfrak{a}$ by Lewandowski, Okołów, Sahlmann and Thiemann, and for $\mathfrak{W}$ by Fleischhack. It is the quantum geometry analog to the seminal results by Segal and others that characterized the Fock vacuum in Minkowskian field theories. However, while that result assumes not only Poincaré invariance but also specific (namely free) dynamics, it is striking that the present uniqueness theorems make no such restriction on dynamics. The requirement of diffeomorphism invariance is surprisingly strong and makes the "background-independent" quantum geometry framework surprisingly tight.

This representation had been constructed by Ashtekar, Baez, and Lewandowski some ten years before its uniqueness was established. The underlying Hilbert space is given by $\mathcal{H} = L^2(\bar{\mathcal{A}}, d\mu_o)$ where $\mu_o$ is a diffeomorphism-invariant, faithful, regular Borel measure on $\bar{\mathcal{A}}$, constructed from the normalized Haar measure on SU(2). Typical quantum states can be visualized as follows. Fix: (1) a graph $\alpha$ on $M$ (by a graph on $M$ we mean a set of a finite number of embedded, oriented intervals called edges; if two edges intersect, they do so only at one or both ends, called vertices), and (2) a smooth function $\psi$ on $[\mathrm{SU}(2)]^n$. Then, the function

\[ \Psi_\alpha(\bar{A}) := \psi(\bar{A}(e_1), \ldots, \bar{A}(e_n)) \qquad [4] \]
on $\bar{\mathcal{A}}$ is an element of $\mathcal{H}$. Such states are said to be "cylindrical" with respect to the graph $\alpha$ and their space is denoted by $\mathrm{Cyl}_\alpha$. These are "typical states" in the sense that $\mathrm{Cyl} := \cup_\alpha \mathrm{Cyl}_\alpha$ is dense in $\mathcal{H}$. Finally, as ensured by the Gel'fand theory, the holonomy (or configuration) operators $\hat{h}_e$ act just by multiplication. The momentum operators $\hat{P}_{S,f}$ act as Lie derivatives: $\hat{P}_{S,f}\Psi = -i\hbar\, \mathcal{L}_{X_{S,f}} \Psi$.

Remark Given any graph $\alpha$ in $M$, and a labeling of each of its edges by a nontrivial irreducible representation of SU(2) (i.e., by a nonzero half integer $j$), one can construct a finite-dimensional Hilbert space $\mathcal{H}_{\alpha,j}$, which can be thought of as the state space of a spin system "living on" the graph $\alpha$. The full Hilbert space admits a simple decomposition: $\mathcal{H} = \oplus_{\alpha,j}\, \mathcal{H}_{\alpha,j}$. This is called the spin-network decomposition. The geometric operators discussed in the next section leave each $\mathcal{H}_{\alpha,j}$ invariant. Therefore, the availability of this decomposition greatly simplifies the task of analyzing their properties.

Geometric Operators

In the classical theory, $E := 8\pi G\gamma P$ has the interpretation of an orthonormal triad field (or a "moving frame") on $M$ (with density weight 1). Here, $\gamma$ is a dimensionless, strictly positive number, called the Barbero–Immirzi parameter, which arises as follows. Because of emphasis on connections, in the classical theory the first-order Palatini action is a more natural starting point than the second-order Einstein–Hilbert action. Now, there is a freedom to add a term to the Palatini action which vanishes when Bianchi identities are satisfied and therefore does not change the equations of motion. $\gamma$ arises as the coefficient of this term. In some respects $\gamma$ is analogous to the $\theta$ parameter of Yang–Mills theory. Indeed, while theories corresponding to any permissible values of $\gamma$ are related by a canonical transformation classically, quantum mechanically this transformation is not unitarily implementable. Therefore, although there is a unique representation of the algebra $\mathfrak{a}$ (or $\mathfrak{W}$), there is a one-parameter family of inequivalent representations of the algebra of geometric operators generated by suitable functions of orthonormal triads $E$, each labeled by the value of $\gamma$. This is a genuine quantization ambiguity. As with the $\theta$ ambiguity in QCD, the actual value of $\gamma$ in nature has to be determined experimentally. The current strategy in quantum geometry is to fix its value through a thought experiment involving black hole thermodynamics (see below).

The basic object in quantum Riemannian geometry is the triad flux operator $\hat{E}_{S,f} := 8\pi G\gamma\, \hat{P}_{S,f}$. It is self-adjoint and all its eigenvalues are discrete. To define other geometric operators such as the area operator $\hat{A}_S$ associated with a surface $S$ or a volume operator $\hat{V}_R$ associated with a region $R$, one first expresses the corresponding phase-space functions in terms of the "elementary" functions $E_{S_i,f_i}$ using suitable surfaces $S_i$ and test functions $f_i$ and then promotes $E_{S_i,f_i}$ to operators. Even though the classical expressions are typically nonpolynomial functions of $E_{S_i,f_i}$, the final operators are all well defined, self-adjoint and with purely discrete eigenvalues. Therefore, in the sense of the word used in elementary quantum mechanics (e.g., of the hydrogen atom), one says that geometry is quantized. Because the theory has no background metric or indeed any other background field, all geometric operators transform covariantly under the action of Diff $M$. This diffeomorphism covariance makes the final expressions of operators rather simple. In the case of the area operator, for example, the action of $\hat{A}_S$ on a state $\Psi_\alpha$ [4] depends entirely on the points of intersection of the surface $S$ and the graph $\alpha$ and involves only right- and left-invariant vector fields on copies of SU(2) associated with edges of $\alpha$ which intersect $S$. In the case of the volume operator $\hat{V}_R$, the action depends on the vertices of $\alpha$ contained in $R$ and, at each vertex, involves the right- and left-invariant vector fields on copies of SU(2) associated with edges that meet at each vertex.

To display the explicit expressions of these operators, let us first define on $\mathrm{Cyl}_\alpha$ three basic operators $\hat{J}_j^{(v,e)}$, with $j \in \{1, 2, 3\}$, associated with the pair consisting of an edge $e$ of $\alpha$ and a vertex $v$ of $e$:

\[ \hat{J}_j^{(v,e)} \Psi_\alpha(\bar{A}) = \begin{cases} i\, \dfrac{d}{dt}\Big|_{t=0} \psi(\ldots, U_e(\bar{A}) \exp(t\tau_j), \ldots) & \text{if } e \text{ begins at } v \\[2ex] i\, \dfrac{d}{dt}\Big|_{t=0} \psi(\ldots, \exp(-t\tau_j)\, U_e(\bar{A}), \ldots) & \text{if } e \text{ ends at } v \end{cases} \qquad [5] \]

where $\tau_j$ denotes a basis in su(2) and "$\ldots$" stands for the rest of the arguments of $\psi$ which remain unaffected. The quantum area operator $\hat{A}_S$ is assigned to a finite two-dimensional submanifold $S$ in $M$. Given a cylindrical state we can always represent it in the form [4] using a graph $\alpha$ adapted to $S$, such that every edge $e$ either intersects $S$ at exactly one endpoint, or is contained in the closure $\bar{S}$, or does not intersect $S$. For each vertex $v$ in $S$ of the graph $\alpha$, the family of edges intersecting $v$ can be divided into three classes: edges $\{e_1, \ldots, e_u\}$ lying on one side (say "above") $S$, edges $\{e_{u+1}, \ldots, e_{u+d}\}$ lying
on the other side (say "below"), and edges contained in $S$. To each $v$ we assign a generalized Laplace operator

\[ \Delta_{S,v} = -\eta^{ij} \left( \sum_{I=1}^{u} \hat{J}_i^{(v,e_I)} - \sum_{I=u+1}^{u+d} \hat{J}_i^{(v,e_I)} \right) \left( \sum_{K=1}^{u} \hat{J}_j^{(v,e_K)} - \sum_{K=u+1}^{u+d} \hat{J}_j^{(v,e_K)} \right) \qquad [6] \]

where $\eta_{ij}$ stands for $-1/2$ the Killing form on su(2). Now, the action of the quantum area operator $\hat{A}_S$ on $\Psi_\alpha$ is defined as follows:

\[ \hat{A}_S \Psi_\alpha = 4\pi\gamma \ell_{\rm Pl}^2 \sum_{v \in S} \sqrt{-\Delta_{S,v}}\; \Psi_\alpha \qquad [7] \]

The quantum area operator has played the most important role in applications. Its complete spectrum is known in a closed form. Consider arbitrary sets $j_I^{(u)}$, $j_I^{(d)}$, and $j_I^{(u+d)}$ of half-integers, subject to the condition

\[ j_I^{(u+d)} \in \{ |j_I^{(u)} - j_I^{(d)}|,\; |j_I^{(u)} - j_I^{(d)}| + 1,\; \ldots,\; j_I^{(u)} + j_I^{(d)} \} \qquad [8] \]

where $I$ runs over any finite number of integers. The general eigenvalues of the area operator are given by:

\[ a_S = 4\pi\gamma \ell_{\rm Pl}^2 \sum_I \left[ 2 j_I^{(u)} (j_I^{(u)} + 1) + 2 j_I^{(d)} (j_I^{(d)} + 1) - j_I^{(u+d)} (j_I^{(u+d)} + 1) \right]^{1/2} \qquad [9] \]

On the physically interesting sector of the SU(2)-gauge-invariant subspace $\mathcal{H}_{\rm inv}$ of $\mathcal{H}$, the lowest eigenvalue of $\hat{A}_S$ ("the area gap") depends on some global properties of $S$. Specifically, it "knows" whether the surface is open, or a 2-sphere, or, if $M$ is a 3-torus, a (nontrivial) 2-torus in $M$. Finally, on $\mathcal{H}_{\rm inv}$, one is often interested only in the subspace of states $\Psi_\alpha$, where $\alpha$ has no edges which lie within a given surface $S$. Then, the expression of eigenvalues simplifies considerably:

\[ a_S = 8\pi\gamma \ell_{\rm Pl}^2 \sum_I \sqrt{j_I (j_I + 1)} \qquad [10] \]

To display the action of the quantum volume operator $\hat{V}_R$, for each vertex $v$ of a given graph $\alpha$, let us first define an operator $\hat{q}_v$ on $\mathrm{Cyl}_\alpha$:

\[ \hat{q}_v = (8\pi\gamma \ell_{\rm Pl}^2)^3\, \frac{1}{48} \sum_{e, e', e''} \kappa(e, e', e'')\, c^{ijk}\, \hat{J}_i^{(v,e)} \hat{J}_j^{(v,e')} \hat{J}_k^{(v,e'')} \qquad [11] \]

where $e$, $e'$, and $e''$ run over the set of edges intersecting $v$, $\kappa(e, e', e'')$ takes values $\pm 1$ or 0 depending on the orientation of the half-lines tangent to the edges at $v$, $[\tau_i, \tau_j] = c^k{}_{ij}\, \tau_k$ and the indices are raised by the tensor $\eta^{ij}$. The action of the quantum volume operator on a cylindrical state [4] is then given by

\[ \hat{V}_R \Psi_\alpha = \kappa_o \sum_{v \in R} \sqrt{|\hat{q}_v|}\; \Psi_\alpha \qquad [12] \]

Here, $\kappa_o$ is an overall constant, independent of the graph, resulting from an averaging.

The volume operator plays an unexpectedly important role in the definition of both the gravitational and matter contributions to the scalar constraint operator which dictates dynamics. Finally, a notable property of the volume operator is the following. Let $R(p, \epsilon)$ be a family of neighborhoods of a point $p \in M$. Then, as indicated above, $\hat{V}_{R(p,\epsilon)} \Psi_\alpha = 0$ if $\alpha$ has no vertex in the neighborhood. However, if $\alpha$ has a vertex at $p$,

\[ \lim_{\epsilon \to 0} \hat{V}_{R(p,\epsilon)} \Psi_\alpha \]

exists but is not necessarily zero. This is a reflection of the "distributional" nature of quantum geometry.

Remark States $\Psi_\alpha \in \mathrm{Cyl}$ have support only on the graph $\alpha$. In particular, they are simply annihilated by geometric operators such as $\hat{A}_S$ and $\hat{V}_R$ if the support of the surface $S$ and the region $R$ does not intersect the support of $\alpha$. In this sense the fundamental excitations of geometry are one dimensional and geometry is polymer-like. States $\Psi_\alpha$, where $\alpha$ is just a "small graph," are highly quantum mechanical, like states in QED representing just a few photons. Just as coherent states in QED require an infinite superposition of such highly quantum states, to obtain a semiclassical state approximating a given classical geometry, one has to superpose a very large number of such elementary states. More precisely, in the Gel'fand triplet $\mathrm{Cyl} \subset \mathcal{H} \subset \mathrm{Cyl}^\star$, semiclassical states belong to the dual $\mathrm{Cyl}^\star$ of Cyl.

Applications

Since quantum Riemannian geometry underlies loop quantum gravity and spin-foam models, all results obtained in these frameworks can be regarded as its applications. Among these, there are two which have led to resolutions of long-standing issues. The first concerns black hole entropy, and the second, the quantum nature of the big bang.

Black Holes

Seminal advances in fundamentals of black hole physics in the mid-1970s suggested that the entropy of large black holes is given by $S_{\rm BH} = (a_{\rm hor}/4\ell_{\rm Pl}^2)$,
where $a_{\rm hor}$ is the horizon area. This immediately raised a challenge to potential quantum gravity theories: give a statistical mechanical derivation of this relation. For familiar thermodynamic systems, a statistical mechanical derivation begins with an identification of the microscopic degrees of freedom. For a classical gas, these are carried by molecules; for the black body radiation, by photons; and for a ferromagnet, by Heisenberg spins. What about black holes? The microscopic building blocks cannot be gravitons because the discussion involves stationary black holes. Furthermore, the number of microscopic states is absolutely huge: some $\exp 10^{77}$ for a solar mass black hole, a number that completely dwarfs the number of states of systems one normally encounters in statistical mechanics. Where does this huge number come from? In loop quantum gravity, this is the number of states of the "quantum horizon geometry."

The idea behind the calculation can be heuristically explained using the "It from Bit" argument, put forward by Wheeler in the 1990s. Divide the black hole horizon into elementary cells, each with one Planck unit of area, $\ell_{\rm Pl}^2$, and assign to each cell two microstates. Then the total number of states $N$ is given by $N = 2^n$, where $n = (a_{\rm hor}/\ell_{\rm Pl}^2)$ is the number of elementary cells, whence entropy is given by $S = \ln N \propto a_{\rm hor}$. Thus, apart from a numerical coefficient, the entropy (It) is accounted for by assigning two states (Bit) to each elementary cell. This qualitative picture is simple and attractive. However, the detailed derivation in quantum geometry has several new features.

First, Wheeler's argument would apply to any 2-surface, while in quantum geometry the surface must represent a horizon in equilibrium. This requirement is encoded in a certain boundary condition that the canonically conjugate pair $(A, P)$ must satisfy at the surface and plays a crucial role in the quantum theory. Second, the area of each elementary cell is not a fixed multiple of $\ell_{\rm Pl}^2$ but is given by [10], where $I$ labels the elementary cells and $j_I$ can be any half-integer (such that the sum is within a small neighborhood of the classical area of the black hole under consideration). Finally, the number of quantum states associated with an elementary cell labeled by $j_I$ is not 2 but $(2j_I + 1)$.

The detailed theory of the quantum horizon geometry and the standard statistical mechanical reasoning is then used to calculate the entropy and the temperature. For large black holes, the leading contribution to entropy is proportional to the horizon area, in agreement with quantum field theory in curved spacetimes. (The subleading term $-(1/2) \ln(a_{\rm hor}/\ell_{\rm Pl}^2)$ is a quantum gravity correction to Hawking's semiclassical result. This correction, with the $-1/2$ factor, is robust in the sense that it also arises in other approaches.) However, as one would expect, the proportionality factor depends on the Barbero–Immirzi parameter $\gamma$ and so far loop quantum gravity does not have an independent way to determine its value. The current strategy is to determine $\gamma$ by requiring that, for the Schwarzschild black hole, the leading term agrees exactly with Hawking's semiclassical answer. This requirement implies that $\gamma$ is the root of an algebraic equation and its value is given by $\gamma \approx 0.2735$. Now, quantum geometry theory is completely fixed. One can calculate the entropy of other black holes, with angular momentum and distortion. A nontrivial check on the strategy is that for all these cases, the coefficient in the leading-order term again agrees with Hawking's semiclassical result.

The detailed analysis involves a number of structures of interest to mathematical physics. First, the intrinsic horizon geometry is described by a U(1) Chern–Simons theory on a punctured 2-sphere (the horizon), the level $k$ of the theory being given by $k = a_{\rm hor}/4\pi\gamma\ell_{\rm Pl}^2$. The punctures are simply the intersections of the excitations of the polymer geometry in the bulk with the horizon 2-surface. Second, because of the horizon boundary conditions, in the classical theory the gauge group SU(2) is reduced to U(1) at the horizon. At each puncture, it is further reduced to the discrete subgroup $Z_k$ of U(1), sometimes referred to as a "quantum U(1) group." Third, the "surface phase space" associated with the horizon is represented by a noncommutative torus. Finally, the surface Chern–Simons theory is entirely unrelated to the bulk quantum geometry theory but the quantum horizon boundary condition requires that the spectrum of a certain operator in the Chern–Simons theory must be identical to that of another operator in the bulk theory. The surprising fact is that there is an exact agreement. Without this seamless matching, a coherent description of the quantum horizon geometry would not have been possible.

The main weakness of this approach to black hole entropy stems from the Barbero–Immirzi ambiguity. The argument would be much more compelling if the value of $\gamma$ were determined by independent considerations, without reference to black hole entropy. (By contrast, for extremal black holes, string theory provides the correct coefficient without any adjustable parameter. The AdS/CFT duality hypothesis (as well as other semiquantitative) arguments have been used to encompass certain black holes which are away from extremality. But in these cases, it is not known if the numerical coefficient is
1/4 as in Hawking's analysis.) Its primary strengths are twofold. First, the calculation encompasses all realistic black holes (not just extremal or near-extremal ones), including the astrophysical ones, which may be highly distorted. Hairy black holes of mathematical physics and cosmological horizons are also encompassed. Second, in contrast to other approaches, one works directly with the physical, curved geometry around black holes rather than with a flat-space system which has the same number of states as the black hole of interest.

The Big Bang

Most of the work in physical cosmology is carried out using spatially homogeneous and isotropic models and perturbations thereon. Therefore, to explore the quantum nature of the big bang, it is natural to begin by assuming these symmetries. Then the spacetime metric is determined simply by the scale factor $a(t)$ and matter fields $\phi(t)$ which depend only on time. Thus, because of symmetries, one is left with only a finite number of degrees of freedom. Therefore, field-theoretic difficulties are bypassed and passage to quantum theory is simplified. This strategy was introduced already in the late 1960s and early 1970s by DeWitt and Misner. Quantum Einstein's equations now reduce to a single differential equation of the type

\[ \frac{\partial^2}{\partial a^2} \left( f(a)\, \Psi(a, \phi) \right) = \mathrm{const.}\; \hat{H}_\phi\, \Psi(a, \phi) \qquad [13] \]

on the wave function $\Psi(a, \phi)$, where $\hat{H}_\phi$ is the matter Hamiltonian and $f(a)$ reflects the freedom in factor ordering. Since the scale factor $a$ vanishes at the big bang, one has to analyze the equation and its solutions near $a = 0$. Unfortunately, because of the standard form of the matter Hamiltonian, coefficients in the equation diverge at $a = 0$ and the evolution cannot be continued across the singularity unless one introduces unphysical matter or a new principle. A well-known example of new input is the Hartle–Hawking boundary condition which posits that the universe starts out without any boundary and a metric with positive-definite signature and later makes a transition to a Lorentzian metric.

Bojowald and others have shown that the situation is quite different in loop quantum cosmology because quantum geometry effects make a qualitative difference near the big bang. As in older quantum cosmologies, one carries out a symmetry reduction at the classical level. The final result differs from older theories only in minor ways. In the homogeneous, isotropic case, the freedom in the choice of the connection is encoded in a single function $c(t)$ and, in that of the momentum/triad, in another function $p(t)$. The scale factor is given by $a^2 = |p|$. (The variable $p$ itself can assume both signs; positive if the triad is left handed and negative if it is right handed. $p$ vanishes at degenerate triads which are permissible in this approach.) The system again has only a finite number of degrees of freedom. However, quantum theory turns out to be inequivalent to that used in older quantum cosmologies.

This surprising result comes about as follows. Recall that in quantum geometry, one has well-defined holonomy operators $\hat{h}$ but there is no operator corresponding to the connection itself. In quantum mechanics, the analog would be for operators $\hat{U}(\lambda)$ corresponding to the classical functions $\exp i\lambda x$ to exist but not be weakly continuous in $\lambda$; the operator $\hat{x}$ would then not exist. Once the requirement of weak continuity is dropped, von Neumann's uniqueness theorem no longer holds and the Weyl algebra can have inequivalent irreducible representations. The one used in loop quantum cosmology is the direct analog of full quantum geometry. While the space $\mathcal{A}$ of smooth connections reduces just to the real line $\mathbb{R}$, the space $\bar{\mathcal{A}}$ of generalized connections reduces to the Bohr compactification $\bar{\mathbb{R}}_{\rm Bohr}$ of the real line. (This space was introduced by the mathematician Harald Bohr (Niels' brother) in his theory of almost-periodic functions. It arises in the present application because holonomies turn out to be almost periodic functions of $c$.) The Hilbert space of states is thus $\mathcal{H} = L^2(\bar{\mathbb{R}}_{\rm Bohr}, d\mu_o)$ where $\mu_o$ is the Haar measure on (the abelian group) $\bar{\mathbb{R}}_{\rm Bohr}$. As in full quantum geometry, the holonomies act by multiplication and the triad/momentum operator $\hat{p}$ via Lie derivatives.

To facilitate comparison with older quantum cosmologies, it is convenient to use a representation in which $\hat{p}$ is diagonal. Then, quantum states are functions $\Psi(p, \phi)$. But the Wheeler–DeWitt equation is now replaced by a difference equation:

\[ C^{+}(p)\, \Psi(p + 4p_o, \phi) + C^{o}(p)\, \Psi(p, \phi) + C^{-}(p)\, \Psi(p - 4p_o, \phi) = \mathrm{const.}\; \hat{H}_\phi\, \Psi(p, \phi) \qquad [14] \]

where $p_o$ is determined by the lowest eigenvalue of the area operator ("area gap") and the coefficients $C^{\pm}(p)$ and $C^{o}(p)$ are functions of $p$. In a backward "evolution," given $\Psi$ at $p + 4$ and $p$, such a "recursion relation" determines $\Psi$ at $p - 4$, provided $C^{-}$ does not vanish at $p - 4$. The coefficients are well behaved and nowhere vanishing, whence the evolution does not stop at any finite $p$, either in the past or in the future. Thus, near $p = 0$ this equation is drastically different from the Wheeler–DeWitt equation [13]. However, for large $p$ (that is, when the universe is large) it is well
approximated by [13] and smooth solutions of [13] are approximate solutions of the fundamental discrete equation [14] in a precise sense.

To complete quantization, one has to introduce a suitable Hilbert space structure on the space of solutions to [14], identify physically interesting operators and analyze their properties. For simple matter fields, this program has been completed. With this machinery at hand, one begins with semiclassical states which are peaked at configurations approximating the classical universe at late times (e.g., now) and evolves backwards. Numerical simulations show that the state remains peaked at the classical solution till very early times when the matter density becomes of the order of Planck density. This provides, in particular, a justification, from first principles, for the assumption that spacetime can be taken to be classical even at the onset of the inflationary era, just a few Planck times after the (classical) big bang. While one would expect a result along these lines to hold on physical grounds, technically it is nontrivial to obtain semiclassicality over such huge domains. However, in the Planck regime near the big bang, there are major deviations from the classical behavior. Effectively, gravity becomes repulsive, the collapse is halted and then the universe re-expands. Thus, rather than modifying spacetime structure just in a tiny region near the singularity, quantum geometry effects open a bridge to another large classical universe. These are dramatic modifications of the classical theory.

For over three decades, hopes have been expressed that quantum gravity would provide new insights into the true nature of the big bang. Thanks to quantum geometry effects, these hopes have been realized and many of the long-standing questions have been answered. While the final picture has some similarities with other approaches (e.g., “cyclic universes,” or pre-big-bang cosmology), only in loop quantum cosmology is there a fully deterministic evolution across what was the classical big bang. However, so far, detailed results have been obtained only in simple models. The major open issue is the inclusion of perturbations and subsequent comparison with observations.

See also: Algebraic Approach to Quantum Field Theory; Black Hole Mechanics; Canonical General Relativity; Knot Invariants and Quantum Gravity; Loop Quantum Gravity; Quantum Cosmology; Quantum Dynamics in Loop Quantum Gravity; Quantum Field Theory in Curved Spacetime; Spacetime Topology, Causal Structure and Singularities; Spin Foams; Wheeler–De Witt Theory.

Further Reading
Ashtekar A (1987) New Hamiltonian formulation of general relativity. Physical Review D 36: 1587–1602.
Ashtekar A and Krishnan B (2004) Isolated and dynamical horizons and their applications. Living Reviews in Relativity 10: 1–78 (gr-qc/0407042).
Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report. Classical Quantum Gravity 21: R53–R152.
Bojowald M and Morales-Tecotl HA (2004) Cosmological applications of loop quantum gravity. Lecture Notes in Physics 646: 421–462 (also available at gr-qc/0306008).
Gambini R and Pullin J (1996) Loops, Knots, Gauge Theories and Quantum Gravity. Cambridge: Cambridge University Press.
Perez A (2003) Spin foam models for quantum gravity. Classical Quantum Gravity 20: R43–R104.
Rovelli C (2004) Quantum Gravity. Cambridge: Cambridge University Press.
Thiemann T (2005) Introduction to Modern Canonical Quantum General Relativity. Cambridge: Cambridge University Press (draft available as gr-qc/0110034).
finite projective modules. The theory is illustrated by two explicit examples that can be viewed as deformations of the classical magnetic monopole and the instanton.

Differential Structures on Algebras

Algebraic Conventions

Throughout this article, A (P etc.,) will be an associative unital complex algebra. To gain some geometric intuition the reader can think of A as an algebra of continuous complex functions on a compact (Hausdorff) space X, C(X), with product given by pointwise multiplication fg(x) = f(x)g(x), and with the unit provided by a constant function $x \mapsto 1$. The algebra C(X) is commutative, but, in what follows, we do not assume that A is a commutative algebra. By an A-bimodule we mean a vector space with mutually commuting left and right actions of A. All modules are unital (i.e., the unit element of A acts trivially). On elements, the multiplication in an algebra or an action of A on a module is denoted by juxtaposition.

Differential Calculus on an Algebra

A first-order differential calculus on A is a pair $(\Omega^1(A), d)$, where $\Omega^1(A)$ is an A-bimodule and $d : A \to \Omega^1(A)$ is a linear map such that:
1. for all $a, b \in A$, $d(ab) = (da)b + a\,db$ (the Leibniz rule); and
2. every $\omega \in \Omega^1(A)$ can be written as $\omega = \sum_i a_i\,db_i$ for some $a_i, b_i \in A$.

Elements of $\Omega^1(A)$ are called differential 1-forms and the map d is called an exterior derivative. As a motivating example, take A = C(X) and $\Omega^1(A)$ the space of 1-forms on X (sections of the cotangent bundle $T^*X$), and d the usual exterior differential. Higher-differential forms corresponding to $(\Omega^1(A), d)$ are defined as elements of a differential graded algebra $\Omega(A)$. This is an algebra which can be decomposed into the direct sum of A-bimodules $\Omega^n(A)$, that is, $\Omega(A) = A \oplus \Omega^1(A) \oplus \Omega^2(A) \oplus \cdots$. In addition to $d : A \to \Omega^1(A)$, there are maps $d_n : \Omega^n(A) \to \Omega^{n+1}(A)$ such that, for all $\omega_n \in \Omega^n(A)$, $\omega_k \in \Omega^k(A)$,
1. $d_1 \circ d = 0$ and $d_{n+1} \circ d_n = 0$, n = 1, 2, . . . ;
2. $\omega_n \omega_k \in \Omega^{n+k}(A)$; and
3. $d_{n+k}(\omega_n \omega_k) = (d_n \omega_n)\omega_k + (-1)^n \omega_n (d_k \omega_k)$.

Elements of $\Omega^n(A)$ are known as “differential n-forms.” $\Omega^n(A)$ contains all linear combinations of expressions $a_0\,da_1\,da_2 \cdots da_n$ with $a_0, \ldots, a_n \in A$. One says that $\Omega(A)$ satisfies the “density condition” if any element of $\Omega^n(A)$ is of the above form, for any n. To simplify notation, one writes d for $d_n$.

As an example of $\Omega(A)$, take A = C(X) and then the exterior algebra $\Omega(X)$ for $\Omega(A)$. The exterior algebra satisfies the density condition as any n-form can be written as $f(x) \wedge dg(x) \wedge dh(x) \wedge \cdots$. The wedge product is anticommutative, but for a noncommutative algebra A, the anticommutativity of the product in $\Omega(A)$ cannot be generally required.

The Universal Differential Calculus

Any algebra A comes equipped with a universal differential calculus denoted by $(\Omega^1 A, d)$. $\Omega^1 A$ is defined as the kernel of the multiplication map, that is, $\Omega^1 A := \{\sum_i a_i \otimes b_i \in A \otimes A \mid \sum_i a_i b_i = 0\} \subseteq A \otimes A$. The derivative is defined by $d(a) = 1 \otimes a - a \otimes 1$. The n-forms are defined as $\Omega^n A = \Omega^1 A \otimes_A \Omega^1 A \otimes_A \cdots \otimes_A \Omega^1 A$ (n copies of $\Omega^1 A$). $\Omega^n A$ can be identified with a subspace of $A \otimes A \otimes \cdots \otimes A$ (n + 1 copies of A) consisting of all such elements that vanish upon multiplication of any two consecutive factors. With this identification, higher derivatives read

$d\left(\sum_i a^i_0 \otimes a^i_1 \otimes \cdots \otimes a^i_n\right) = \sum_{k=0}^{n+1} (-1)^k \sum_i a^i_0 \otimes a^i_1 \otimes \cdots \otimes a^i_{k-1} \otimes 1 \otimes a^i_k \otimes \cdots \otimes a^i_n$

The universal differential calculus satisfies the density condition.

This calculus captures very little (if any) of the geometry of the underlying algebra A, but it has the universality property, that is, any differential calculus on A can be obtained as a quotient of $\Omega A$. In other words, any differential calculus $\Omega(A)$ is fully determined by a system of A-sub-bimodules $N_n \subseteq A^{\otimes (n+1)}$ (or homogeneous ideals in the algebra $\Omega A$), so that $\Omega^n(A) = \Omega^n A / N_n$. The differentials d in $\Omega(A)$ are derived from universal differentials via the canonical projections $\pi_n : \Omega^n A \to \Omega^n(A)$.

Typical examples of algebras in quantum geometry are given by generators and relations, that is, $A = \mathbb{C}\langle x_1, \ldots, x_n \rangle / \langle R_i(x_1, \ldots, x_n) \rangle$, where $\mathbb{C}\langle x_1, \ldots, x_n \rangle$ is a free algebra on generators $x_k$ and $R_i(x_1, \ldots, x_n)$ are polynomials, so that $R_i(x_1, \ldots, x_n) = 0$ in A. Correspondingly, the modules $\Omega^n(A)$ are given by generators and relations. If $\Omega(A)$ satisfies the density condition, then the whole of $\Omega(A)$ must be generated by some 1-forms. The sub-bimodules $N_n$ contain relations satisfied by these generators.
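The definition of the universal calculus can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: it takes A to be the algebra of n × n complex matrices, stores an element of A ⊗ A as a four-index array, and uses helper names (`tensor`, `multiply`, `left_action`, `right_action`, `d`, `check_universal_calculus`) that are hypothetical and do not come from the article. It verifies that d(a) = 1 ⊗ a − a ⊗ 1 lies in the kernel of the multiplication map and obeys the Leibniz rule.

```python
import numpy as np

def tensor(a, b):
    """Elementary tensor a (x) b, stored as T[i, j, k, l] = a[i, j] * b[k, l]."""
    return np.einsum("ij,kl->ijkl", a, b)

def multiply(t):
    """Multiplication map m(a (x) b) = ab, extended linearly to A (x) A."""
    return np.einsum("ijjl->il", t)

def left_action(c, t):
    """Left A-action on A (x) A: c.(a (x) b) = ca (x) b."""
    return np.einsum("im,mjkl->ijkl", c, t)

def right_action(t, c):
    """Right A-action on A (x) A: (a (x) b).c = a (x) bc."""
    return np.einsum("ijkm,ml->ijkl", t, c)

def d(a):
    """Universal derivative d(a) = 1 (x) a - a (x) 1."""
    one = np.eye(a.shape[0])
    return tensor(one, a) - tensor(a, one)

def check_universal_calculus(n=2, seed=0):
    """Return (d(a) lies in ker m, Leibniz rule holds) for random matrices a, b."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    in_kernel = bool(np.allclose(multiply(d(a)), 0))
    leibniz = bool(np.allclose(d(a @ b), right_action(d(a), b) + left_action(a, d(b))))
    return in_kernel, leibniz

if __name__ == "__main__":
    print(check_universal_calculus())
```

The matrix algebra is chosen only because it is noncommutative and finite dimensional; the same two checks hold for any unital algebra, since they follow directly from the formula for d.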
“Woronowicz braiding” $\sigma : \Omega^1(A) \otimes_A \Omega^1(A) \to \Omega^1(A) \otimes_A \Omega^1(A)$ by setting $\sigma(a\omega \otimes_A \eta) = a\eta \otimes_A \omega$ for all $a \in A$, and any left-invariant $\omega$ and right-invariant $\eta$, and then extending it A-linearly to the whole of $\Omega^1(A) \otimes_A \Omega^1(A)$. This operator satisfies the braid relation $(\mathrm{id} \otimes_A \sigma) \circ (\sigma \otimes_A \mathrm{id}) \circ (\mathrm{id} \otimes_A \sigma) = (\sigma \otimes_A \mathrm{id}) \circ (\mathrm{id} \otimes_A \sigma) \circ (\sigma \otimes_A \mathrm{id})$, and is invertible provided the antipode S is invertible. The Woronowicz braiding is used to define symmetric forms as those invariant under $\sigma$. One then defines exterior 2-forms as elements of $\Omega^1(A) \otimes_A \Omega^1(A) / \ker(\mathrm{id} - \sigma)$, and introduces the wedge product. The wedge product is not in general anticommutative, but one does have $\omega \wedge \eta = -\eta \wedge \omega$ for bi-invariant $\omega$, $\eta$. This construction is extended to higher forms and leads to the definition of the exterior algebra $\Omega(A)$. To define exterior n-forms, one maps any permutation on n elements to the corresponding element of the braid group generated by $\sigma$ and then takes the quotient of the nth tensor power of $\Omega^1(A)$ by all elements corresponding to even permutations. The differential $d : A \to \Omega^1(A)$ is extended to an exterior differential in the whole of $\Omega(A)$ in the following way. First, $\Omega^1(A)$ is extended by a one-dimensional A-bimodule generated by a form $\theta$ that is required to be bi-invariant. The resulting extended bimodule (which, in general, is not a first-order differential calculus, as $\theta$ is not necessarily of the form $\sum_i a_i\,db_i$, for some $a_i, b_i \in A$) is then determined from the relation $da = \theta a - a\theta$ for all $a \in A$. Higher exterior derivative is then defined by $d\rho = \theta \wedge \rho - (-1)^n \rho \wedge \theta$, for any $\rho \in \Omega^n(A)$.

The algebra $\Omega(A)$ is a $\mathbb{Z}_2$-graded differential Hopf algebra, that is, it has a coproduct $\Delta$ such that

$\Delta(\omega \wedge \eta) = \sum (-1)^{|\omega_{(2)}||\eta_{(1)}|}\, \omega_{(1)} \wedge \eta_{(1)} \otimes \omega_{(2)} \wedge \eta_{(2)}$

where $|\omega_{(2)}|$ etc., denotes the degree of a homogeneous component in the decomposition of $\Delta(\omega)$. Furthermore,

$\Delta(d\omega) = \sum \left( d\omega_{(1)} \otimes \omega_{(2)} + (-1)^{|\omega_{(1)}|}\, \omega_{(1)} \otimes d\omega_{(2)} \right)$

On the 1-forms this coproduct is simply the sum $\Delta_L + \Delta_R$ of the left and right coactions.

Classification

There is no unique covariant differential calculus on A, so classification of covariant differential calculi is an important problem. For example, it is known that the quantum group $SU_q(2)$ admits a left-covariant three-dimensional calculus, but there is no three-dimensional bicovariant calculus. On the other hand, there are two four-dimensional bicovariant calculi on $SU_q(2)$. Differential calculi are classified for standard quantum groups such as $SL_q(N)$ or $Sp_q(N)$.

General classification results are based on the equivalence between the category of Hopf bimodules of a finite-dimensional Hopf algebra A and that of Yetter–Drinfeld or crossed modules of A. These are the modules of the Drinfeld double of A. As a result, in the case of a finite-dimensional factorizable coquasitriangular Hopf algebra A with a dual Hopf algebra H, the bicovariant $\Omega^1(A)$ are in one-to-one correspondence with two-sided ideals in $H^+$. If, in addition, A is semisimple, then (coirreducible) calculi are in one-to-one correspondence with nontrivial irreducible representations of H. This can be extended to infinite-dimensional algebras, provided one works over a field of formal power series in the deformation parameter.

Quantum Group Principal Bundles

Quantum Principal Bundles

In classical geometry, a (topological) principal bundle is a locally compact Hausdorff space with a (continuous) free and proper action of a locally compact group (e.g., a Lie group). In terms of algebras of functions this gives rise to the following structure. A is a Hopf algebra (the model is functions on a group G), P is a right A-comodule algebra with a coaction $\Delta_P : P \to P \otimes A$ (the model is functions on a total space X). Let $B = \{b \in P \mid \Delta_P(b) = b \otimes 1\}$ be the coinvariant subalgebra (the model is functions on a base manifold M = X/G). Fix a bicovariant calculus $\Omega^1(A)$, with the corresponding Q and $\Lambda^1 = A^+/Q$ as in the subsection “The Woronowicz theorems.” Take a differential calculus $\Omega^1(P) = \Omega^1 P / N_P$ such that:

1. $\Delta_{\Omega^1 P}(N_P) \subseteq N_P \otimes A$, where for all $\sum_i p^i \otimes q^i \in \Omega^1 P$,
$\Delta_{\Omega^1 P}\left(\sum_i p^i \otimes q^i\right) = \sum_i p^i_{(0)} \otimes q^i_{(0)} \otimes p^i_{(1)} q^i_{(1)} \in \Omega^1 P \otimes A$
2. $\tilde{\chi}(N_P) \subseteq N_P \otimes Q$, where
$\tilde{\chi} : \Omega^1 P \to P \otimes A^+, \quad \sum_i p^i \otimes q^i \mapsto \sum_i p^i \Delta_P(q^i) = \sum_i p^i q^i_{(0)} \otimes q^i_{(1)}$
3. $N_B = N_P \cap \Omega^1 B$ gives rise to a differential structure $\Omega^1(B) = \Omega^1 B / N_B$ on B.

Condition (1) ensures that $\Delta_{\Omega^1 P}$ descends to a coaction $\Delta_{\Omega^1(P)} : \Omega^1(P) \to \Omega^1(P) \otimes A$, while (2) allows for defining a map

$\mathrm{ver} : \Omega^1(P) \to P \otimes \Lambda^1, \quad \mathrm{ver}([\omega]) = [\tilde{\chi}(\omega)]$
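The braid relation quoted earlier in this section for the Woronowicz braiding can be explored on a simpler, standard example. The sketch below does not construct the Woronowicz braiding of any particular calculus; as an assumption made purely for illustration, it uses the familiar 4 × 4 braiding matrix associated with $GL_q(2)$ acting on $V \otimes V$ with V two-dimensional, and checks numerically that it satisfies the same braid relation and reduces to the ordinary flip at q = 1. The function names are hypothetical.

```python
import numpy as np

def braiding(q):
    """Standard 4x4 braiding on V (x) V, basis order e1e1, e1e2, e2e1, e2e2."""
    lam = q - 1.0 / q
    return np.array([
        [q, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, lam, 0.0],
        [0.0, 0.0, 0.0, q],
    ])

def braid_defect(sigma):
    """Largest entry of (s (x) id)(id (x) s)(s (x) id) - (id (x) s)(s (x) id)(id (x) s)."""
    eye = np.eye(2)
    s1 = np.kron(sigma, eye)   # sigma acting on tensor factors 1 and 2
    s2 = np.kron(eye, sigma)   # sigma acting on tensor factors 2 and 3
    return float(np.max(np.abs(s1 @ s2 @ s1 - s2 @ s1 @ s2)))

def hecke_defect(sigma, q):
    """Largest entry of (sigma - q)(sigma + 1/q), which vanishes for this braiding."""
    eye = np.eye(4)
    return float(np.max(np.abs((sigma - q * eye) @ (sigma + eye / q))))

if __name__ == "__main__":
    q = 0.7
    print(braid_defect(braiding(q)), hecke_defect(braiding(q), q))
```

At q = 1 the matrix is the flip, which is the classical situation in which invariance under the braiding singles out symmetric tensors and the quotient gives anticommuting forms.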
$S^1$ $\mathbb{Z}_2$). It can be shown, however, that the quantum principal bundle corresponding to the Möbius strip has a trivialization in the above sense.

Generalizations of Quantum Principal Bundles

In the case of the majority of quantum homogeneous spaces, the map $\pi$ in the subsection “Quantum homogeneous bundles” is a coalgebra and right P-module map, but not an algebra map. Thus, the induced coaction is not an algebra map either. To cover examples like these, one needs to introduce a generalization of quantum principal bundles. Consider an algebra P that is also a right comodule of a coalgebra C with coaction $\Delta_P$. Define

$B := \{ b \in P \mid \forall p \in P,\ \Delta_P(bp) = b\Delta_P(p) = \sum b p_{(0)} \otimes p_{(1)} \}$

B is a subalgebra of P. P is a principal coalgebra-bundle over B or $B \subseteq P$ is a coalgebra-Galois extension provided the map

connection as a connection form or a gauge field, that is, a map $\omega : \Lambda^1 \to \Omega^1(P)$ such that:

1. for all $\lambda \in \Lambda^1$, $\mathrm{ver}(\omega(\lambda)) = 1 \otimes \lambda$; and
2. $\Delta_{\Omega^1(P)} \circ \omega = (\omega \otimes \mathrm{id}) \circ \mathrm{Ad}_{\Lambda^1}$ (Ad-covariance), where $\mathrm{Ad}_{\Lambda^1}$ is a projection of the adjoint coaction to $\Lambda^1$, that is, $\mathrm{Ad}_{\Lambda^1}([a]) = [\mathrm{Ad}(a)]$ (well defined, because Q is Ad-invariant for a bicovariant calculus, see the subsection “The Woronowicz theorems”).

The correspondence between connections and connection 1-forms is given by the formula

$\Pi(p\,dq) = \sum p\, q_{(0)}\, \omega([q_{(1)}])$

In the universal differential calculus case, $\Lambda^1 = A^+$, hence $\omega$ can be viewed as a map $\omega : A \to \Omega^1 P$, such that $\omega(1) = 0$. The map $F_\omega : A \to \Omega^2 P$, given by $F_\omega = d\omega + \omega * \omega$ is called a “curvature” of $\omega$. The curvature satisfies the Bianchi identity, $dF_\omega = F_\omega * \omega - \omega * F_\omega$.

In the case of a trivial bundle with trivialization $\Phi$ and universal calculus, any linear map $\beta : A \to \Omega^1 B$ such that $\beta(1) = 0$ defines a connection 1-form

$\omega = \Phi^{-1} * d\Phi + \Phi^{-1} * \beta * \Phi$
described in the subsection “Connections and connection forms” is strong (and every strong connection in a trivial bundle is of this form). Assuming invertibility of the antipode in A, a canonical connection in a quantum homogeneous bundle described in that subsection is strong provided Ad-covariance (3) is replaced by conditions $(\mathrm{id} \otimes \pi) \circ \Delta \circ i = (i \otimes \mathrm{id}) \circ \Delta_A$ (right covariance) and $(\pi \otimes \mathrm{id}) \circ \Delta \circ i = (\mathrm{id} \otimes i) \circ \Delta_A$ (left covariance), where $\Delta$ is a coproduct in P, and $\Delta_A$ is a coproduct in A.

In the universal calculus case, the map D can be extended to a map $D : \Omega^1 P \to \Omega^2 P$ via the formula $D(\rho) = d\rho + \sum \rho_{(0)}\, \omega(\rho_{(1)})$. Then $D \circ D(p) = \sum p_{(0)} F_\omega(p_{(1)})$, where $F_\omega$ is the curvature of $\omega$ (cf. the subsection “Connections and connection forms”). This explains the relationship between a curvature understood as the square of a covariant derivative and $F_\omega$.

Bundle Automorphisms and Gauge Transformations

A quantum bundle automorphism is a left B-linear right A-covariant (i.e., colinear) automorphism $F : P \to P$ such that F(1) = 1. Bundle automorphisms form a group with operation $FG = G \circ F$. This group is isomorphic to the group G(P) of gauge transformations, that is, maps $f : A \to P$ that satisfy the following conditions:
1. $f(1_A) = 1_P$ (unitality);
2. $\Delta_P \circ f = (f \otimes \mathrm{id}) \circ \mathrm{Ad}$ (Ad-covariance); and
3. f is convolution invertible (cf. the subsection “Hopf algebra preliminaries”).

The product in G(P) is the convolution product (cf. the subsection “Hopf algebra preliminaries”). The group of gauge transformations acts on the space of (strong) connection forms $\omega$ via the formula

$f \triangleright \omega = f * \omega * f^{-1} + f * df^{-1}, \quad \forall f \in G(P)$

This resembles the gauge transformation law of a gauge field in the standard gauge theory. The curvature transforms covariantly as $F_{f \triangleright \omega} = f * F_\omega * f^{-1}$.

In the case of a trivial principal bundle, gauge transformations correspond to a change of the trivialization and can be identified with convolution-invertible maps $\gamma : A \to B$ such that $\gamma(1) = 1$. A map $\beta : A \to \Omega^1 B$ that induces a connection as in the subsection “Connections and connection forms” is transformed to $\gamma * \beta * \gamma^{-1} + \gamma * d\gamma^{-1}$, and the curvature $F \mapsto \gamma * F * \gamma^{-1}$.

Associated Bundles: Matter Fields

Given a right A-comodule (corepresentation) $\varrho : V \to V \otimes A$ one defines a quantum vector bundle associated to P as

$E = \left\{ \sum_i v^i \otimes p^i \in V \otimes P \ \middle|\ \sum_i v^i_{(0)} \otimes p^i_{(0)} \otimes v^i_{(1)} p^i_{(1)} = \sum_i v^i \otimes p^i \otimes 1 \in V \otimes P \otimes A \right\}$

E is a right B-module with product $(\sum_i v^i \otimes p^i) b = \sum_i v^i \otimes p^i b$. A right B-linear map $s : E \to B$ is called a section of E. The space of sections $\Gamma(E)$ is a left B-module via (bs)(p) = bs(p).

The theory of associated bundles is particularly rich when A has a bijective antipode and P has a strong connection form $\omega$. In this case, $\Gamma(E)$ is isomorphic to the left B-module $\Gamma_\varrho$ of maps $\varphi : V \to P$ such that $\Delta_P \circ \varphi = (\varphi \otimes \mathrm{id}) \circ \varrho$. If V is finite dimensional, then $\Gamma_\varrho$ is a finite projective B-module, that is, it is a module of sections of a noncommutative vector bundle in the sense of Connes. The strong connection induces a map $\nabla : \Gamma_\varrho \to \Omega^1 B \otimes_B \Gamma_\varrho$, given by $\nabla(\varphi)(v) = d\varphi(v) + \sum \varphi(v_{(0)})\, \omega(v_{(1)})$. $\nabla$ is a connection in the sense of Connes (in a projective left B-module), that is, for all $b \in B$, $\varphi \in \Gamma_\varrho$, $\nabla(b\varphi) = db \otimes_B \varphi + b\nabla(\varphi)$.

In the case of a trivial bundle, $\Gamma_\varrho$ can be identified with the space of linear maps $V \to B$. Thus, sections of an associated bundle correspond to pullbacks of matter fields, as in the classical local gauge theory matter fields are defined as functions on a spacetime with values in a representation (vector) space of the gauge group.

The Dirac q-Monopole

This is an example of a strong connection in a quantum homogeneous bundle (cf. the subsection “Quantum homogeneous bundles”). $P = SU_q(2)$ is a matrix Hopf *-algebra with matrix of generators

$\begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix}$

and relations

$ac = qca, \quad ac^* = qc^*a, \quad cc^* = c^*c$
$a^*a + c^*c = 1, \quad aa^* + q^2 cc^* = 1$

where q is a real parameter. $A = \mathbb{C}[U(1)]$ is a Hopf *-algebra generated by unitary and group-like u (i.e., $uu^* = u^*u = 1$, $\Delta(u) = u \otimes u$). The *-projection $\pi : P \to A$ is defined by $\pi(a) = u$. The coinvariant subalgebra B is generated by $x = cc^*$, $z = ac^*$, $z^* = ca^*$. The elements x and z satisfy relations

$x^* = x, \quad zx = q^2 xz,$
$zz^* = q^2 x(1 - q^2 x), \quad z^*z = x(1 - x)$
Thus, B is the algebra of functions on the standard quantum 2-sphere. A strong connection is obtained from a bicovariant *-map $i : A \to P$ given by $i(u^n) = a^n$ (cf. the subsections “Quantum homogeneous bundles,” “Connections and connection forms,” and “Covariant derivative: strong connections”). Explicitly, the connection form reads

$\omega(u^n) = \sum_{k=0}^{n} \binom{n}{k}_{q^{-2}} (c^*)^k (a^*)^{n-k}\, d(a^{n-k} c^k)$

$\omega(u^{*n}) = \sum_{k=0}^{n} q^{2k} \binom{n}{k}_{q^{-2}} a^{n-k} c^k\, d((c^*)^k (a^*)^{n-k})$

where the deformed binomial coefficients are defined for any number x by

$\binom{n}{k}_x = \frac{(x^n - 1)(x^{n-1} - 1) \cdots (x^{k+1} - 1)}{(x^{n-k} - 1)(x^{n-k-1} - 1) \cdots (x - 1)}$

There is a family $V_n$, $n \in \mathbb{Z}$ of one-dimensional corepresentations of $\mathbb{C}[U(1)]$ with $V_n = \mathbb{C}$ and $\varrho_n(1) = 1 \otimes u^n$, $n \geq 0$ and $\varrho_n(1) = 1 \otimes u^{*-n}$, n < 0. This leads to the family of finite projective modules $\Gamma_n = \Gamma_{\varrho_n}$ as described in the subsection “Associated bundles: matter fields.” The Hermitian projectors e(n) of these modules come out as, for n > 0,

$e(n)_{ij} = \sqrt{\binom{n}{i}_{q^{-2}} \binom{n}{j}_{q^{-2}}}\; a^{n-i} c^i (c^*)^j (a^*)^{n-j}, \quad i, j = 0, 1, \ldots, n$

$e(-n)_{ij} = q^{i+j} \sqrt{\binom{n}{i}_{q^{-2}} \binom{n}{j}_{q^{-2}}}\; (c^*)^i (a^*)^{n-i} a^{n-j} c^j, \quad i, j = 0, 1, \ldots, n$

The e(n) describe q-monopoles of magnetic charge $-n$. For example, the charge-1 projector explicitly reads

$\begin{pmatrix} 1 - x & z^* \\ z & q^2 x \end{pmatrix}$

and reduces to the usual charge-1 Dirac monopole projector when q = 1. The covariant derivatives $\nabla$ are Levi-Civita or Grassmann connections in modules $\Gamma_n$ corresponding to projectors e(n).

The q-Instanton

This is an example of a coalgebra bundle and the associated vector bundle, which is a deformation of an instanton (with instanton number 1). $P = \mathbb{C}[S^7_q]$ is the *-algebra of polynomial functions on the quantum 7-sphere. As a *-algebra it is defined by generators $z_1, z_2, z_3, z_4$ and relations

$z_i z_j = q z_j z_i$ (for i < j)
$z_j^* z_i = q z_i z_j^*$ (for $i \neq j$)
$z_k^* z_k = z_k z_k^* + (1 - q^2) \sum_{j<k} z_j z_j^*$
$\sum_{k=1}^{4} z_k z_k^* = 1$

where $q \in \mathbb{R}$. The coaction of the *-Hopf algebra $A = SU_q(2)$ (cf. the previous subsection) on P is constructed as follows. Start with the quantum group $U_q(4)$, generated by a matrix $t = (t_{ij})_{i,j=1}^{4}$ and view $\mathbb{C}[S^7_q]$ as a right quantum homogeneous space of $U_q(4)$ generated by the bottom row in t. Thus, there is a right coaction of $U_q(4)$ on $\mathbb{C}[S^7_q]$ obtained by the restriction of the coproduct in $U_q(4)$. Next, project $U_q(4)$ to $SU_q(2)$ by a suitable coideal and a right ideal in $U_q(4)$. The corresponding canonical surjection $r : U_q(4) \to SU_q(2)$ is a coalgebra map, characterized as a right $U_q(4)$-module map by $r(t_{11} t_{22} - q t_{12} t_{21}) = 1$ and

$r(t) = \begin{pmatrix} u & 0 \\ 0 & \tilde{u} \end{pmatrix}, \quad \tilde{u} = \begin{pmatrix} u_{22} & u_{21} \\ u_{12} & u_{11} \end{pmatrix}$

where $u = (u_{ij})_{i,j=1}^{2}$ is the matrix of generators of $SU_q(2)$ (cf. the previous subsection). When applied to the coaction of $U_q(4)$ on $\mathbb{C}[S^7_q]$, r induces the required coaction $\Delta_P : \mathbb{C}[S^7_q] \to \mathbb{C}[S^7_q] \otimes SU_q(2)$. Explicitly, the coaction comes out on generators as $\Delta_P(z_j) = \sum_i z_i \otimes r(t_{ij})$. The coaction $\Delta_P$ is not an algebra map. The coinvariant subalgebra B is a *-algebra generated by

$a = z_1 z_4^* - z_2 z_3^*$
$b = z_1 z_3 + q^{-1} z_2 z_4$
$R = z_1 z_1^* + z_2 z_2^*$

The elements $a, a^*, b, b^*, R$ satisfy the following relations:

$Ra = q^{-2} aR, \quad Rb = q^2 bR$
$ab = q^3 ba, \quad ab^* = q^{-1} b^* a$
$aa^* + q^2 bb^* = R(1 - q^2 R)$
$aa^* = q^2 a^* a + (1 - q^2) R^2$
$b^* b = q^4 bb^* + (1 - q^2) R$

Hence B can be understood as a deformation of the algebra of functions on the 4-sphere and is denoted by $\mathbb{C}[\Sigma^4_q]$. One can show that the map “can” in the subsection “Generalizations of quantum principal bundles” is bijective, hence there is an $SU_q(2)$-coalgebra principal bundle with the total space the quantum 7-sphere $\mathbb{C}[S^7_q]$ and the base space the
244 Quantum Hall Effect
quantum 4-sphere $\mathbb{C}[\Sigma^4_q]$. By abstract arguments that involve cosemisimplicity of $SU_q(2)$, one can prove that there exists a strong connection in this bundle; this is the q-deformed instanton field. At the time of writing this article, however, the explicit form of this connection is not known.

On the other hand, following the classical construction of an instanton, one can take the fundamental two-dimensional corepresentation $V = \mathbb{C}^2$ of $SU_q(2)$ and explicitly construct q-instanton projection with instanton number 1. Writing $e_1, e_2$ for the basis of V, the coaction $\varrho : V \to V \otimes SU_q(2)$ is given by

$\varrho(e_j) = \sum_i e_i \otimes u_{ij}$

The associated bundle (cf. the subsection “Associated bundles: matter fields”) is a finite projective left module over $\mathbb{C}[\Sigma^4_q]$. The corresponding q-instanton projector comes out as

$\begin{pmatrix} q^2 R & 0 & qa & -q^2 b \\ 0 & q^2 R & qb^* & q^3 a^* \\ qa^* & qb & 1 - R & 0 \\ -q^2 b^* & q^3 a & 0 & 1 - q^4 R \end{pmatrix}$

See also: Bicrossproduct Hopf Algebras and Noncommutative Spacetime; Hopf Algebras and q-Deformation Quantum Groups; Noncommutative Tori, Yang–Mills, and String Theory.

Further Reading
Bonechi F, Ciccoli N, and Tarlini M (2003) Noncommutative instantons and the 4-sphere from quantum groups. Communications in Mathematical Physics 226: 419–432.
Bonechi F, Ciccoli N, Da̧browski L, and Tarlini M (2004) Bijectivity of the canonical map for the noncommutative instanton bundle. Journal of Geometry and Physics 51: 71–81.
Brzeziński T and Majid S (1993) Quantum group gauge theory on quantum spaces. Communications in Mathematical Physics 157: 591–638.
Brzeziński T and Majid S (1995) Quantum group gauge theory on quantum spaces – Erratum. Communications in Mathematical Physics 167: 235.
Brzeziński T and Majid S (2000) Quantum geometry of algebra factorisations and coalgebra bundles. Communications in Mathematical Physics 213: 491–521.
Calow D and Matthes R (2002) Connections on locally trivial quantum principal fibre bundles. Journal of Geometry and Physics 41: 114–165.
Da̧browski L, Grosse H, and Hajac PM (2001) Strong connections and Chern–Connes pairing in the Hopf–Galois theory. Communications in Mathematical Physics 220: 301–331.
Hajac PM and Majid S (1999) Projective module description of the q-monopole. Communications in Mathematical Physics 206: 247–264.
Heckenberger I and Schmüdgen K (1998) Classification of bicovariant differential calculi on the quantum groups $SL_q(n + 1)$ and $Sp_q(2n)$. Journal für die Reine und Angewandte Mathematik 502: 141–162.
Klimyk A and Schmüdgen K (1997) Quantum Groups and Their Representations. Berlin: Springer.
Majid S (1998) Classification of bicovariant differential calculi. Journal of Geometry and Physics 25: 119–140.
Majid S (1999) Quantum and braided Riemannian geometry. Journal of Geometry and Physics 30: 113–146.
Majid S (2000) Foundations of Quantum Group Theory, 1st pbk. edn. Cambridge: Cambridge University Press.
Wess J and Zumino B (1990) Covariant differential calculus on the quantum hyperplane. Nuclear Physics B. Proceedings Supplement 18B: 302–312.
Woronowicz SL (1989) Differential calculus on compact matrix pseudogroups (quantum groups). Communications in Mathematical Physics 122: 125–170.
Figure 1 Schematic diagram of charge separation in Hall’s experiment. (Labels: magnetic field B; direct current; + and – charges on opposite edges.)

Figure 2 Schematic diagram of the Hall and direct conductivities plotted against the filling factor $\nu$. (Upper panel: $h\sigma_H/e^2$ against $\nu$, both from 0 to 4; lower panel: $\sigma_\parallel$ against $\nu$, to a different scale.)

the sample thickness, the diagonal components of $\sigma$ give the direct conductivity $\sigma_\parallel$ and its off-diagonal elements give the Hall conductivity: $\sigma_H = \sigma_{21}$. (For systems symmetric under 90° rotations, $\sigma_{11} = \sigma_{22}$ and $\sigma_{12} = -\sigma_{21}$.) In quantum theory, one usually works in terms of the filling fraction $\nu = nh/eB$ and then $\sigma_H = \nu e^2/h$.

In 1980 von Klitzing, Dorda, and Pepper discovered that at very low temperatures in very high magnetic fields, the Hall conductivity $\sigma_H$ is quantized as integral multiples of $e^2/h$, a fact known as the integer quantum Hall effect (IQHE). The integer multiples were accurate to 1 part in $10^8$, and the effect was exceptionally robust against changes in the geometry of the samples and in the experimental parameters. Indeed, the unprecedented accuracy of the effect led to its adoption as the international standard for resistance in 1990.

More precisely, the Hall conductivity was no longer proportional to the filling fraction $\nu$, but the graph of $\sigma_H$ against $\nu$ displayed a sequence of jumps, as shown in Figure 2. In this figure, the conductivity has plateaux at the integer multiples of $e^2/h$, and jumps between them within fairly small ranges of the filling fraction. Moreover, the direct conductivity vanishes where the Hall conductivity takes its constant integral values.

These results raise numerous questions.

1. Why does the conductivity take such precise integer values, and why are they so stable under changes of the geometry and physical parameters?
2. Why does the direct conductivity vanish, except in regions where the Hall conductivity jumps between integer values, and how are such jumps possible?

Moreover, any theory must also explain why these features are not present under the more normal conditions of the classical Hall effect. The following features seem to play a role, and in the case of the first three, even in the classical effect.

1. As Hall discovered, the samples must be very thin to exhibit even the classical effect. (Nowadays they are often a surface layer between two semiconductors.)
2. The samples are macroscopic and much larger than the quantum wavelengths appearing in the problem.
3. The electric field is small enough that nonlinear effects are negligible.
4. The quantum effect appears only at a very low temperature.

The first of these suggests that we should idealize to the case where the motion of the charge carriers is restricted to a two-dimensional region, and the second that we may work in the thermodynamic limit where the conducting surface is the whole of $\mathbb{R}^2$. The third and fourth ensure both that the linear Ohm’s law should be adequate, and also that it should be enough to consider the limiting cases of very weak electric fields and zero temperature. Multiple limits of this sort raise delicate mathematical issues. Indeed, many plausible models of the effect turn out, on careful analysis, to predict vanishing Hall conductivity.

A theoretical explanation of the quantization of the conductivity was soon suggested by Laughlin. Exploiting the apparent independence of sample geometry, he considered a cylindrical conductor where quantization followed on consideration of
the flux tubes threading it. Laughlin’s choice of a There are good surveys of the area (Bellissard
particular configuration precluded investigation of et al. 1994, McCann 1998) explaining how the
the influence of changing geometry. This was soon mathematical model arises out of the physics, the
provided by Thouless, Kohmoto, Nightingale, and mathematical models themselves. As well as being
de Nijs, who argued (from a lattice version of the the standard reference for noncommutative geome-
problem) that the conductivity could be identified try, Connes (1994) discusses the Hall effect. These
with the Chern character of a line bundle over a resources contain good bibliographies, which may
Brillouin zone (a quotient of momentum space by be consulted for further references.
the action of the reciprocal crystal lattice), so that it
had to be integral and the stability of the effect was
a consequence of the topological nature of . Electron Motion in a Magnetic Field
Unfortunately, whilst suggestive, this explanation
worked only under the physically implausible con- The following discussion restricts attention to
straint that the magnetic flux through a crystal cell motion in two dimensions, with electrons as the
was rational, offered no explanation of the link charge carriers, and no interactions between them.
between the Hall and direct conductivities, and, (The first condition is essential; the second could be
working with a periodic Hamiltonian, made no relaxed a little to allow sufficiently long-lived
allowance for the impurities and disorder usually quasi-particles.) A single free electron with mass
important in solid-state problems of this sort. m and charge e moving in the x1 –x2 plane with a
Notwithstanding these deficiencies, this model con- constant transverse magnetic field B in the positive
tained important insights, which inspired Bellissard to model the effect using Connes’ newly developed noncommutative geometry (Bellissard 1986, Connes 1986). (Kunz produced a Hilbert space theory at about the same time, but that has been rather less influential.) Connes’ work turned out to contain all the relevant concepts and tools needed to provide a good understanding of the effect, based on interpreting the conductivity as a noncommutative Chern character for a noncommutative version of the Brillouin zone. In fact, the techniques of noncommutative geometry seemed to fit the quantum Hall effect so well that this has become one of the standard examples of the theory.

Even whilst the theorists were struggling to explain the experiments, observations by Tsui, Störmer, and Gossard showed that, with suitable care, fractional Hall conductivities could also be observed, although these were far less stable than those given by integers. One, therefore, distinguishes between the IQHE and the fractional quantum Hall effect (FQHE), and this survey concentrates largely on the former. One simplifying feature of the IQHE is that it seems to be comprehensible at the level of individual noninteracting electrons, whereas the FQHE certainly involves some kind of interaction and many-body theory.

This article presents an outline of the connection between noncommutative differential geometry and the IQHE, and concludes by discussing some of the approaches to the FQHE, and some other applications of noncommutative geometry and mathematical directions suggested by the theory. The sections alternate between the physical model and the mathematical abstraction from it.

$x_3$-direction, can be described by the Landau Hamiltonian

$$H_L = |P - eA|^2/2m \qquad [1]$$

where $A = \tfrac12 B \times X$ is a magnetic vector potential that gives rise to $B$. This problem is exactly solvable by, for example, introducing $K^\pm = (K_1^\pm, K_2^\pm) = P \mp eA$. The components of $K^+$ and $K^-$ commute with each other, but $[K_1^\pm, K_2^\pm] = \pm i\hbar eB$. Comparison with the harmonic oscillator shows that the energy spectrum of $H_L = [(K_1^+)^2 + (K_2^+)^2]/2m$ is $\{(n + \tfrac12)\hbar eB/m : n \in \mathbb{N}\}$. Since $H_L$ commutes with the components of $K^-$, each of these Landau energy levels is infinitely degenerate, and the filling fraction measures what proportion of states in the Landau levels are filled. The frequency $\omega_c = eB/m$ is the cyclotron frequency for classical circular orbits in the magnetic field.

The degeneracy of the Landau Hamiltonian can also be understood in terms of the magnetic translations obtained by exponentiating the connection defined by the magnetic potential $A$: $\nabla_j = \partial_j + ieA_j/\hbar = iK_j^-/\hbar$. More precisely, we set

$$U(a) = \exp(a\cdot\nabla) = \exp(i a\cdot K^-/\hbar) \qquad [2]$$

which clearly commutes with $H_L$, expressing the translational symmetry of this model. The curvature $[\nabla_1, \nabla_2] = ieB/\hbar$ of the connection manifests itself in the identities

$$U(a)U(b) = e^{(1/2)i\phi}\,U(a + b) = e^{i\phi}\,U(b)U(a) \qquad [3]$$

where $\phi = eB\cdot(a \times b)/\hbar$ measures the magnetic flux through the parallelogram spanned by $a$ and $b$. These show that $U$ is a projective representation of $\mathbb{R}^2$ with projective multiplier $\sigma(a, b) = \exp(\tfrac12 i\phi)$.
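The commutation relation $U(a)U(b) = e^{i\phi}U(b)U(a)$ can be illustrated in finite dimensions when the flux is a rational multiple of $2\pi$. The following is only a sketch under that assumption: it uses the standard "clock" and "shift" matrices, which are not the magnetic translation operators of the text but satisfy the same relation with $\phi = 2\pi p/q$. The function names and the values of `p` and `q` are hypothetical choices made for the illustration.

```python
import cmath


def clock_shift(q, p=1):
    """Return q x q matrices (U, V) obeying U V = exp(i*phi) V U with phi = 2*pi*p/q."""
    omega = cmath.exp(2j * cmath.pi * p / q)
    # U: diagonal "clock" matrix with entries omega**r.
    U = [[omega ** r if r == c else 0j for c in range(q)] for r in range(q)]
    # V: cyclic "shift" matrix sending basis vector c to basis vector c + 1 (mod q).
    V = [[1 + 0j if r == (c + 1) % q else 0j for c in range(q)] for r in range(q)]
    return U, V


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def max_abs_diff(A, B):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


# Illustrative (assumed) flux phi = 2*pi*p/q with p/q = 2/5.
q, p = 5, 2
phi = 2 * cmath.pi * p / q
U, V = clock_shift(q, p)
UV = matmul(U, V)
VU = matmul(V, U)
twisted_VU = [[cmath.exp(1j * phi) * entry for entry in row] for row in VU]

# U V agrees with exp(i*phi) V U up to rounding, but differs from V U itself.
commutation_error = max_abs_diff(UV, twisted_VU)
plain_commutator_size = max_abs_diff(UV, VU)
print(commutation_error, plain_commutator_size)
```

When `p` is a multiple of `q` the phase is 1 and the two matrices commute, which mirrors the statement that the algebra is commutative only when the flux is an integer multiple of $2\pi$.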
The significance of this is that, unless $\phi$ is an integer multiple of $2\pi$, $U(a)$ and $U(b)$ generate a noncommutative algebra. This replaces the commutative algebra of functions on two-dimensional momentum space and leads naturally to a noncommutative geometry.

The unembellished Landau Hamiltonian cannot describe the Hall effect without adding an electric potential $eE\cdot X$ to drive the current in the sample. (Alternatively, and useful for the later discussion, one could use the radiation gauge in which, instead of introducing a scalar potential, a time-dependent term is added to $A$ so that $E = -\partial A/\partial t$.)

The quantum Hall effect also depends crucially on the effects of impurities in the conducting material. These can be modeled by adding a random potential $V_\omega$ with $\omega$ in a compact probability space $\Omega$ to obtain $H_\omega = H_L + eE\cdot X + V_\omega(X)$. A continuous function $f$ on $\Omega$ can be interpreted as a random variable, and its expectation $\tau(f)$ gives a trace on the $C^*$-algebra $C(\Omega)$ (i.e., a positive linear functional such that $\tau(AB) = \tau(BA)$).

Although the magnetic translations commute with $H_L$, they do not generally commute with the potentials so they act on $\Omega$, but, on the other hand, the physics of a disordered system and its translates should be the same, so we assume that the probability measure and hence also $\tau$ are invariant under magnetic translations. (As noted earlier, we work in the thermodynamic limit, where the Hall sample expands to fill $\mathbb{R}^2$, so we do not need to worry about translations moving the sample itself.) Then $\Omega$ with the magnetic translation action can be interpreted as the noncommutative Brillouin zone. (A space $\Omega$ can be reconstructed from the magnetic translations of the resolvents of the Hamiltonians (Bellissard et al. 1994).)

The current $J$ may be defined as the functional derivative of the Hamiltonian with respect to the vector potential $A$ or, in components, $J_k = \delta_k H = \delta H/\delta A_k$. For the Landau Hamiltonian, this gives

$$i\hbar\,\delta_k H_L = -ie\hbar(P_k - eA_k)/m = -e[X_k, H_L] \qquad [4]$$

a relation which persists for $H = H_L + V(X)$ whenever the potential $V$ is independent of $A$, so that $\delta_k H = ie[X_k, H]/\hbar = -e\,dX_k/dt$, the charge times velocity, as one might expect. The operator functional calculus delivers a similar formula for derivations of the spectral projections of $H$. We have $\delta_k = e\partial_k/\hbar$, where, in view of the commutation relations, $\partial_k = i[X_k, \cdot\,]$ can be regarded as a momentum-space derivative, confirming that we are dealing with the differential geometry of momentum space.

We now wish to calculate the expected current $\langle J_k\rangle$, in a thermal state with chemical potential $\mu$ at inverse temperature $\beta = 1/kT$ (where $k$ is Boltzmann’s constant and $T$ is the temperature in kelvin). Using the Fermi–Dirac distribution, the grand canonical expectation value is

$$\langle J_k\rangle = \mathrm{tr}\big[(1 + e^{\beta(H-\mu)})^{-1} J_k\big] \qquad [5]$$

Since the quantum Hall effect occurs at low temperatures (large $\beta$) and for weak fields, we formally proceed to those limits. Then $(1 + e^{\beta(H-\mu)})^{-1}$ tends to the projection $P_F$ onto the states with energy less than the Fermi energy $E_F$ in the absence of the electric field. The limiting expected current is, therefore, $\mathrm{tr}(P_F J_k) = \mathrm{tr}(P_F\,\delta_k H)$, where $H$ is now the Hamiltonian including the electric field (without which there would be no current).

A detailed calculation of the Hall conductivity using the Kubo–Greenwood formula shows that the conductivity matrix is actually

$$\sigma_{kj} = -i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F]) \qquad [6]$$

In particular, this immediately implies that the direct conductivity terms $\sigma_{jj}$ vanish, as observation suggested. The derivation of [6] requires great care, and references may be found in the surveys, but a formal argument in the next section may lend this expression some plausibility.

The Noncommutative Geometry

The principal ingredient for noncommutative geometry is an algebra, and thus we shall now consider a class of algebras broad enough to include the physical example.

The action of the magnetic translations on $\Omega$ defines automorphisms of the $C^*$-algebra $C(\Omega)$, which permit the construction of a twisted crossed-product algebra, in which these automorphisms are represented by conjugation. Because much of the theory has been formulated with lattice approximations using $\mathbb{Z}^2$ rather than $\mathbb{R}^2$, it is useful to work more generally with a separable locally compact abelian group $G$ with continuous multiplier $\sigma$, and a homomorphism $\alpha$ to automorphisms of a $C^*$-algebra $A_1$ with trace $\tau_1$, which will in practice be the commutative algebra $C(\Omega)$ with $\tau$. The twisted crossed product $A = C^*(A_1, G, \sigma)$ can be constructed as the norm completion of the continuous compactly
supported functions from $G$ to $A_1$ with the product, adjoint and norm

$$(f * g)(x) = \int_G \sigma(y, x - y)\, f(y)\,(\alpha_y g)(x - y)\,dy \qquad [7]$$

$$f^*(x) = \sigma(x, -x)^{-1} f(-x)^* \qquad [8]$$

$$\|f\| = \max\Big\{\int_G \|f(x)\|_{A_1}\,dx,\ \int_G \|f^*(x)\|_{A_1}\,dx\Big\} \qquad [9]$$

integration being with respect to the Haar measure.

The crossed-product algebra is noncommutative, both because of the action of $G$ and due to the multiplier $\sigma$. It has a trace $\tau[f] = \tau_1[f(0)]$ and, when $G = \mathbb{R}^2$, has derivations given by $(\partial_k f)(x) = i x_k f(x)$.

As an example, consider the case of periodic potentials invariant under translation by vectors $a$ and $b$. Then the group $G \cong \mathbb{Z}^2$ generated by $a$ and $b$ acts trivially on $\Omega$ and the crossed-product algebra is just a product of $A_1$ and the twisted group algebra of complex-valued functions $C^*(\mathbb{C}, G, \sigma)$, generated by $U(a)$ and $U(b)$. We already noted that the algebra is commutative only when the flux $\phi \in 2\pi\mathbb{Z}$, in which case it is just the convolution algebra of $\mathbb{Z}^2$, which by Fourier transforming (effectively setting $U(a) = e^{i\alpha}$ and $U(b) = e^{i\beta}$) is the algebra $C(T^2)$, with torus coordinates $\alpha$ and $\beta$. For fluxes which are rational multiples of $2\pi$ we obtain a matrix algebra, whilst irrational fluxes give an infinite-dimensional irrational rotation algebra or noncommutative torus, a standard example in noncommutative geometry.

Any $*$-representation $\pi$ of $A_1$ on a Hilbert space $\mathcal{H}_\pi$ can be induced to a $*$-representation $\tilde\pi$ of the twisted crossed product on $\mathcal{H} = L^2(G, \mathcal{H}_\pi)$ by setting

$$(\tilde\pi(f)\psi)(x) = \int_G \sigma(x, y - x)^{-1}\,\pi(\alpha_x f)(y - x)\,\psi(y)\,dy \qquad [10]$$

for $f \in A$ and $\psi \in \mathcal{H}$. When $A_1 = C(\Omega)$, we may take $\pi$ to be a one-dimensional irreducible $*$-representation given by evaluating the function at a point $\omega \in \Omega$.

When $G = \mathbb{R}^2$, it is easy to construct a Fredholm module from $\tilde\pi$. The space $\mathcal{H}_2 = \mathcal{H} \otimes \mathbb{C}^2$ has actions of $A$ on the first factor and of the Pauli spin matrices $\sigma_1$, $\sigma_2$, $\sigma_3$ on the second. It may be regarded as a graded module with grading operator $\sigma_3$, and

$$F = (x_1^2 + x_2^2)^{-1/2}(x_1\sigma_1 + x_2\sigma_2) \qquad [11]$$

provides a Connes–Fredholm involution which anticommutes with $\sigma_3$. Detailed technical results of Connes show how to use the supertrace on $\mathcal{H}_2$ and the Dixmier trace to interpret the physically important quantities in this setting.

We now turn to the formal derivation of the key alternative expression for the conductivity. In the abstract algebraic setting, when $p \in A$ is a projection in the domain of a derivation $\delta$, the derivative of $(1 - p)p = 0$ gives

$$0 = \delta((1 - p)p) = (1 - p)(\delta p) - (\delta p)p \qquad [12]$$

and then an easy calculation leads to

$$[p, [\delta p, p]] = 2p(\delta p)p - (\delta p)p^2 - p^2(\delta p) = -\delta p \qquad [13]$$

In the identity for elements $a$, $b$, $c$, and $h \in A$

$$\tau([a, [b, c]]h) - \tau(c[[h, a], b]) = \tau([a, [b, c]h]) + \tau([b, c[h, a]]) = 0 \qquad [14]$$

we set $a = c = p$ and $b = \delta p$ to obtain

$$\tau([p, [\delta p, p]]h) = \tau(p[[h, p], \delta p]) \qquad [15]$$

Combining this with [12] when $\tau\circ\delta = 0$, one obtains

$$\tau(p\,\delta h) = \tau(\delta(ph)) - \tau((\delta p)h) = \tau([p, [\delta p, p]]h) = \tau(p[[h, p], \delta p]) \qquad [16]$$

The Hall Conductivity and Anderson Localisation

Substituting $p = P_F$ and $h = H$ in formula [16] would give the current $\mathrm{tr}(P_F[[H, P_F], \delta P_F])$. Since $\delta_k$ is proportional to the commutator with $X_k$, it is true that $\mathrm{tr}\circ\delta_k = 0$, but, unfortunately, $P_F$ need not lie in the domain of $\delta_k$, and $H$ is unbounded, further compounding the difficulties. These are serious problems, although the situation is not quite as bad as it seems. Without the electrostatic term $eE\cdot X$ in $H$, $P_F$ would have been a spectral projection with which $H$ would commute, so that

$$[H, P_F] = e[E\cdot X, P_F] = eE_j[X_j, P_F] = -ieE_j\,\partial_j P_F \qquad [17]$$

and $H$ disappears from the formula, to be replaced by $\partial_j P_F$. This would give the expected current $-i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])E_j$, and the conductivity matrix

$$\sigma_{kj} = -i(e^2/\hbar)\,\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F]) \qquad [18]$$

given earlier (there is no need to scale by the thickness in two dimensions).
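The algebraic identities used in this formal derivation can be checked numerically in a finite-dimensional model. The sketch below is an illustration under stated assumptions, not the setting of the text: the algebra is taken to be $3\times 3$ complex matrices, the trace is the ordinary matrix trace, and the derivation is the inner derivation $\delta(a) = i[X, a]$ for an arbitrarily chosen Hermitian matrix `X`. The matrices `X`, `h` and the vector `v` are hypothetical test data. It verifies $p(\delta p)p = 0$, $[p, [\delta p, p]] = -\delta p$ and $\tau(p\,\delta h) = \tau(p[[h, p], \delta p])$.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return sub(matmul(A, B), matmul(B, A))


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def max_abs(A):
    return max(abs(a) for row in A for a in row)


# Assumed test data: a Hermitian X defining the inner derivation delta(a) = i[X, a].
X = [[1, 2 - 1j, 0.5], [2 + 1j, -1, 1j], [0.5, -1j, 3]]


def delta(A):
    return [[1j * entry for entry in row] for row in comm(X, A)]


# Rank-one projection p onto the unit vector v, and an arbitrary element h.
v = [0.6 + 0j, 0.8j, 0j]
p = [[v[r] * v[c].conjugate() for c in range(3)] for r in range(3)]
h = [[2, 1j, 0], [-1j, 0, 1], [0, 1, -1]]

dp = delta(p)
err_pdp = max_abs(matmul(p, matmul(dp, p)))        # p (delta p) p = 0
err_13 = max_abs(add(comm(p, comm(dp, p)), dp))    # [p, [delta p, p]] + delta p = 0
lhs_16 = trace(matmul(p, delta(h)))                # tau(p delta h)
rhs_16 = trace(matmul(p, comm(comm(h, p), dp)))    # tau(p [[h, p], delta p])
err_16 = abs(lhs_16 - rhs_16)
print(err_pdp, err_13, err_16)
```

In finite dimensions every trace of a commutator vanishes, so the condition $\tau\circ\delta = 0$ holds automatically; the domain and unboundedness problems discussed in the text do not appear in this toy model.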
However it is derived, this expression for the conductivity only makes sense under suitable conditions, otherwise $\mathrm{tr}(P_F[\partial_j P_F, \partial_k P_F])$ might either be undefined (because $P_F$ is not differentiable) or might not be trace class. There is a simple condition sufficient to handle both these difficulties, which also leads to an interesting physical insight. From the obvious inequality

$$0 \le \mathrm{tr}\big[P_F(\partial_1 P_F \pm i\partial_2 P_F)^*(\partial_1 P_F \pm i\partial_2 P_F)\big] \qquad [19]$$

$$= \mathrm{tr}\big[P_F\big((\partial_1 P_F)^2 + (\partial_2 P_F)^2\big)\big] \pm i\,\mathrm{tr}(P_F[\partial_1 P_F, \partial_2 P_F]) \qquad [20]$$

and the fact that $1 \ge P_F$, we deduce that

$$\mathrm{tr}\big[(\partial_1 P_F)^2 + (\partial_2 P_F)^2\big] \ge \mathrm{tr}\big[P_F\big((\partial_1 P_F)^2 + (\partial_2 P_F)^2\big)\big] \ge |\mathrm{tr}(P_F[\partial_1 P_F, \partial_2 P_F])| \qquad [21]$$

Thus, if $\mathrm{tr}[(\partial_1 P_F)^2 + (\partial_2 P_F)^2]$ exists and is finite, then our expression for the conductivity is well defined. Mathematically, this is a Sobolev type condition. To see the physical significance, we recall that $\partial_k P_F = i[X_k, P_F]$, so that the condition is equivalent to the finiteness of $\mathrm{tr}[(X_1^2 + X_2^2)P_F^2] - \mathrm{tr}[(X_1 P_F)^2 + (X_2 P_F)^2]$. This condition imposes a requirement for some localization in the system (when $P_F$ is a rank-1 projection, it reduces to the requirement that the variance $\langle X_1^2 + X_2^2\rangle - \langle X_1\rangle^2 - \langle X_2\rangle^2$ be finite). This links with a much older observation of Anderson that the interference caused by impurities in a crystal, which cancel at long range, should, at smaller distances, cause localized clumping. The mathematical development of this idea by Pastur provides an appropriate tool for handling the conditions for the validity of the conductivity formula. The impurities generating Anderson localization are provided in this model by the random potential in the Hamiltonian. It also leads us to restrict attention to the dense subalgebra $A_0$ of $f \in A$, where $\tau[(\partial_1 f)^*(\partial_1 f) + (\partial_2 f)^*(\partial_2 f)] < \infty$.

The Integral Quantum Hall Effect

Having identified the features of physical interest, we can return to the abstract algebraic description with conductivity $-i(e^2/\hbar)\,\tau(p[\partial_j p, \partial_k p])$. The key observation is that this can be interpreted as the Connes pairing between a cyclic cocycle $c$ on $A_0$ and the projection $p$ whose stable equivalence class represents an element of the $C^*$-algebraic K-theory, $K_0(A)$. Such pairings give noncommutative Chern characters. The cyclic cocycle is a trilinear form defined on elements $a_0, a_1, a_2 \in A_0$ by

$$c(a_0, a_1, a_2) = \tau[a_0(\partial_1 a_1\,\partial_2 a_2 - \partial_2 a_1\,\partial_1 a_2)] \qquad [22]$$

This is easily shown to be cyclic, $c(a_0, a_1, a_2) = c(a_1, a_2, a_0)$, and to satisfy the cyclic 2-cocycle condition

$$c(a_0 a_1, a_2, a_3) - c(a_0, a_1 a_2, a_3) + c(a_0, a_1, a_2 a_3) - c(a_3 a_0, a_1, a_2) = 0 \qquad [23]$$

The Hall conductivity $\sigma_{21} = -i\,c(p, p, p)\,e^2/\hbar$ can now be interpreted as the noncommutative Chern character defined by the projection $p$.

This interpretation of the Hall conductivity clears the way to prove that it is integral, and there are several different routes to this.

One approach is to identify the conductivity with some kind of index which is clearly integral. Bellissard worked with the Fredholm module where, by results of Connes, the Chern character is interpreted as the index of the Fredholm operator $\tilde\pi(p)F\tilde\pi(p)$. Avron, Seiler and Simon have interpreted the conductivity as a relative index $\dim[\ker(P_F - Q_F - 1)] - \dim[\ker(Q_F - P_F - 1)]$ of the projection $P_F$ and its conjugate $Q_F = uP_F u^*$ by an off-diagonal element $u$ of $F$. This is particularly interesting as the conjugation by $u$ can be interpreted as a nonsingular gauge transformation of exactly the kind introduced by Laughlin in his original explanation of the quantum Hall effect in terms of singular flux tubes piercing a cylindrical conductor.

Xia suggested another approach rewriting $A$ as a repeated crossed product with $\mathbb{R}$, which allows us to calculate $K_0(A)$, using either Connes’ Thom isomorphism theorem or the Takai duality theorem for stable algebras to get

$$K_0(A) = K_0(C^*(A_1, G, \sigma)) \cong K_0(A_1) \qquad [24]$$

which, when $A_1 = C(\Omega)$, is just $K^0(\Omega)$, leading to identification as a topological index. For the simplest case of $\Omega = T^2$, this gives $K^0(\Omega) \cong \mathbb{Z}^2$. The image of $\tau$, and so also $c$, actually sits in just one component, leading to quantization of the Hall conductivity.

The two questions posed in the introduction can now be answered as follows: The Hall conductivity can be identified with a topological index which can take only integer values, and therefore does not respond to continuous changes in any of the physical parameters until the change brings the system into a region where one of the background assumptions fails, such as a breakdown in the localization condition. The same conditions also ensure that the direct current vanishes. Roughly speaking, the
plateaus occur when the Fermi energy is in a gap in the extended (nonlocalized) spectrum.

This brief overview has omitted many of the interesting features of the detailed theory, which can be found in the surveys, such as the fact that low-lying energy levels do not contribute to the conductivity, and Shubin’s theorem identifying $\tau(p)$ as the integrated density of states. Harper’s equation describing a discrete lattice analog of the IQHE has been a test-bed for many of the ideas, and various results were first proved in that setting. The FQHE was discovered during an unsuccessful search for a Wigner crystal phase transition, but analysis of discrete models provides strong evidence that Hall conductors have very complicated phase diagrams.

The Fractional Quantum Hall Effect

As mentioned in the introduction, by the time the IQHE had been understood theoretically, it had been found that, with appropriate care, fractional conductivities could also be observed, although they were much less precise and stable than the integer values, and the plateaus less pronounced. Although there have been many phenomenological explanations, there is as yet no mathematical understanding from quantum field theory as compelling as that for the integer effect. We shall briefly summarize some of the main lines of attack.

The first explanation, again due to Laughlin, has also provided the basis for many subsequent treatments of the problem. The wave functions of the oscillator-like Landau Hamiltonian can conveniently be represented in the Bargmann–Segal Fock space of holomorphic functions $f$ on $\mathbb{R}^2 \cong \mathbb{C}$ which are square-integrable with respect to a Gaussian measure. Incorporating the measure into the functions, these have the form $f(z)\exp(-|z|^2/2)$. Many particle wave functions are similarly realized in terms of holomorphic functions on $\mathbb{C}^N$, and must be antisymmetric under odd permutations of the particles to describe fermions. This quickly leads one to consider functions of the form

$$\prod_{r<s}(z_r - z_s)^k \exp\Big(-\sum_j |z_j|^2/2\Big) \qquad [25]$$

for odd integers $k > 0$, and their multiples by even holomorphic functions. The lowest energy where such a wave function occurs is when $k = 1$, and larger values of $k$ have the effect of dividing the Hall conductivity by $k$, which produces fractional conductivities.

Halperin suggested quite early that counterflowing currents in the interior of a sample would tend to cancel, so that most of the current would be carried near the edge of the sample. There are several mathematical derivations of this, by, for example, Macris, Martin, and Pulé, and by Fröhlich, Graf, and Walcher. The K-theory of the boundary and bulk of a sample can be linked by exact sequences such as those of the commutative theory (Kellendonk et al. 2002), and even in the IQHE boundary and bulk conductivities can be used (Schulz-Baldes et al. 2000).

It has been fairly clear that whilst the IQHE can already be understood in terms of the motion of a single electron, the fractional effect is a many-body cooperative effect. One attempt to simplify the description is to work with an incompressible quantum fluid, and for edge currents one should study the boundary theory of such a fluid, in which the dominant contribution to the action is a Chern–Simons term, with conductivity as a coefficient. For an annular sample, this leads, in a suitable limit, to a chiral Luttinger model on the boundary circles, which can then be tackled mathematically using the representation theory of loop groups. This leads to some elegant mathematics, including extensions to multiple coupled bands, with conductivities described by Cartan matrices, as explained in the International Congress of Mathematicians (ICM) survey (Fröhlich 1995), and in the review by Fröhlich and Studer (1993).

The theory of composite fermions provides another physical approach in which field-theoretic effects result in the electrons sharing their charges in such a way as to produce fractional charges, and there is experimental evidence of such fractional charges in studies of tunneling from one edge to another. Then the FQHE is easily understood by simply replacing the electron charge $e$ by $e/k$ in the appropriate formulas.

Susskind has suggested combining noncommutative geometry with the theory of incompressible quantum fluids, an idea taken up by Polychronakos (2001). There are intriguing mathematical parallels with work by Berest and Wilson on ideals in the Weyl algebra and the Calogero–Moser model.

Further Developments

Bellissard and others have extended the use of noncommutative geometrical methods into other parts of solid-state theory, where they clarify a number of the physical ideas. This is particularly useful in the case of quasicrystals, which are not easily handled by the conventional methods (Bellissard et al. 2000). Some ideas in string theory resemble higher-dimensional analogs, and higher-dimensional versions of the quantum Hall effect have also been studied by Hu and Zhang.

Finally, we conclude with some mathematical extensions of the theory. We have seen that, for periodic systems, the noncommutative Brillouin
zone can be a noncommutative torus, and it is possible to consider noncommutative versions of Riemann surfaces of higher genera. Carey et al. (1998) studied the effect in a noncommutative hyperbolic geometry with a discrete group action, generalizing the action of a Fuchsian group on the unit disc. This provides a tractable example in which one has an edge (albeit rather different from the normal physical situations) and also examples of a Hall effect in higher-genus noncommutative Riemann surfaces closely related to those of Klimek and Lesznewski. Natsumé and Nest have subsequently shown that these are deformation quantizations of the commutative Riemann surface theory in the sense of Rieffel. Coverings of noncommutative Riemann surfaces, which might provide an analog of composite fermions, have been investigated by Marcolli and Mathai (1999, 2001).

See also: C*-Algebras and Their Classification; Chern–Simons Models: Rigorous Results; Fractional Quantum Hall Effect; Hopf Algebras and q-Deformation Quantum Groups; Localization for Quasiperiodic Potentials; Noncommutative Geometry and the Standard Model; Noncommutative Tori, Yang–Mills, and String Theory; Schrödinger Operators.

Further Reading

Bellissard J (1986) K-theory of C*-algebras in solid state physics. In: Dorlas T, Hugenholtz NM, and Winnink M (eds.) Statistical Mechanics and Field Theory: Mathematical Aspects, Springer Lecture Notes in Physics vol. 257, pp. 99–156. Berlin: Springer.
Bellissard J, Herrmann DJL, and Zarrouati M (2000) Hulls of aperiodic solids and gap labelling theorem. In: Baake M and Moody RV (eds.) Directions in Mathematical Quasi-Crystals, CRM Monograph Series vol. 13, pp. 207–258. Providence, RI: American Mathematical Society.
Bellissard J, van Elst A, and Schulz-Baldes H (1994) The non-commutative geometry of the quantum Hall effect. Journal of Mathematical Physics 35: 5373–5457.
Carey AL, Hannabuss KC, Mathai V, and McCann P (1998) Quantum Hall effect on the hyperbolic plane. Communications in Mathematical Physics 190: 629–673.
Connes A (1986) Non-commutative differential geometry. Publications of the Institut des Hautes Etudes Scientifiques 62: 257–360.
Connes A (1994) Non Commutative Geometry. San Diego: Academic Press.
Fröhlich J (1995) The Fractional Quantum Hall Effect, Chern–Simons Theory and Integral Lattices. Proceedings of the International Congress of Mathematicians, 1994, Zürich, 75–105. Basel: Birkhäuser Verlag.
Fröhlich J and Studer UM (1993) Gauge invariance and current algebra in nonrelativistic many-body theory. Reviews of Modern Physics 65: 733–802.
Kellendonk J, Richter T, and Schulz-Baldes H (2002) Edge current channels and Chern numbers in the integral quantum Hall effect. Reviews in Mathematical Physics 14: 87–119.
Marcolli M and Mathai V (1999) Twisted index theory on good orbifolds. I. Non-commutative Bloch theory. Communications in Contemporary Mathematics 1: 553–587.
Marcolli M and Mathai V (2001) Twisted index theory on good orbifolds. II. Fractional quantum numbers. Communications in Mathematical Physics 217: 55–87.
McCann PJ (1998) Geometry and the integer quantum Hall effect. In: Carey AL and Murray MK (eds.) Geometric Analysis and Lie Theory in Mathematics and Physics, pp. 122–208. Australian Mathematical Society Lecture Series. Cambridge: Cambridge University Press.
Polychronakos A (2001) Quantum Hall states as matrix Chern–Simons theory. The Journal of High Energy Physics 4(paper 11): 20.
Schulz-Baldes H, Kellendonk J, and Richter T (2000) Simultaneous quantization of edge and bulk Hall conductivity. Journal of Physics A 33: L27–L32.

Quantum Mechanical Scattering Theory
$u_0(t) = \exp(-iH_0 t)f_0$. Equation [2] leads to a connection between the corresponding initial data $f_0$ and $f$ given by

$$f = \lim_{t\to\pm\infty} \exp(iHt)\exp(-iH_0 t)f_0 \qquad [3]$$

If $f$ is an eigenvector of $H$, that is, $Hf = \lambda f$, then obviously $u(t) = e^{-i\lambda t}f$. On the contrary, if $f$ belongs to the (absolutely) continuous subspace of $H$, then necessarily $u(t)$ has the free asymptotics as $t \to \pm\infty$. This result is known as asymptotic completeness.

The Schrödinger operator $H = -\Delta + V(x)$ in the space $\mathcal{H} = L^2(\mathbb{R}^d)$ with a real potential $V$ decaying at infinity is a typical Hamiltonian of scattering theory. The operator $H$ describes a particle in an external potential $V$ or two interacting particles. Asymptotically (as $t \to +\infty$ or $t \to -\infty$), particles may either form a bound state or be free (a scattering state). Of course, a bound (scattering) state at $-\infty$ remains the same at $+\infty$. To be more precise, suppose that

$$|V(x)| \le C(1 + |x|)^{-\rho} \qquad [4]$$

where $\rho > 1$. Then relation [2] can be justified with the kinetic energy operator $H_0 = -\Delta$ playing the role of the unperturbed operator.

As discussed in Landau and Lifshitz (1965) (see also Amrein et al. (1977), Pearson (1988), and Yafaev (2000)), in scattering experiments one sends a beam of particles of energy $\lambda > 0$ in a direction $\omega$. Such a beam is described by the plane wave

$$\psi_0(x; \omega, \lambda) = \exp(ik\langle\omega, x\rangle), \qquad \lambda = k^2 > 0$$

(which satisfies of course the free equation $-\Delta\psi_0 = \lambda\psi_0$). The scattered particles are described for large distances by the outgoing spherical wave

$$a(\hat{x}, \omega; \lambda)\,|x|^{-(d-1)/2}\exp(ik|x|)$$

Here $\hat{x} = x|x|^{-1}$ is the direction of observation and the coefficient $a(\hat{x}, \omega; \lambda)$ is known as the scattering amplitude. This means that quantum particles subject to a potential $V(x)$ are described by the solution $\psi$ of eqn [5] with asymptotics [6] at infinity:

defined by eqn [7] gives us the part of particles scattered in a solid angle $d\hat{x}$:

$$d\sigma(\hat{x}; \omega, \lambda) = |a(\hat{x}, \omega; \lambda)|^2\,d\hat{x} \qquad [7]$$

As discussed below, the temporal asymptotics of solutions of the time-dependent Schrödinger equation [1] are closely related to the asymptotics at large distances of solutions of the stationary Schrödinger equation [5].

Time-Dependent Scattering Theory and Møller Operators

If $V(x) \to 0$ as $|x| \to \infty$, then the essential spectrum of the Schrödinger operator $H = -\Delta + V(x)$ covers the whole positive half-line, whereas the negative spectrum of $H$ consists of eigenvalues accumulating, perhaps, at the point zero only.

Scattering theory requires a more advanced classification of the spectrum based on measure theory. Consider a self-adjoint operator $H$ defined on domain $D(H)$ in a Hilbert space $\mathcal{H}$. Let $E$ be its spectral family. Then the space $\mathcal{H}$ can be decomposed into the orthogonal sum of invariant subspaces $\mathcal{H}^{(p)}$, $\mathcal{H}^{(sc)}$ and $\mathcal{H}^{(ac)}$. The subspace $\mathcal{H}^{(p)}$ is spanned by eigenvectors of $H$ and the subspaces $\mathcal{H}^{(sc)}$, $\mathcal{H}^{(ac)}$ are distinguished by the condition that the measure $(E(X)f, f)$ (here $X \subset \mathbb{R}$ is a Borel set) is singularly or absolutely continuous with respect to the Lebesgue measure for all $f \in \mathcal{H}^{(sc)}$ or $f \in \mathcal{H}^{(ac)}$. Typically (in applications to quantum-mechanical problems) the singularly continuous part is absent, that is, $\mathcal{H}^{(sc)} = \{0\}$. We denote by $H^{(ac)}$ the restriction of $H$ on its absolutely continuous subspace $\mathcal{H}^{(ac)}$ and by $P^{(ac)}$ the orthogonal projection on this subspace. The same objects for the operator $H_0$ will be endowed with the index ‘‘0.’’

Equation [3] motivates the following fundamental definition. The wave, or Møller, operator $W_\pm = W_\pm(H, H_0)$ for a pair of self-adjoint operators $H_0$ and $H$ is defined by eqn [8] provided that the corresponding strong limit exists:

$$W_\pm = \operatorname*{s-lim}_{t\to\pm\infty} \exp(iHt)\exp(-iH_0 t)P_0^{(ac)} \qquad [8]$$
It is easy to see that the completeness of $W_\pm(H, H_0)$ is equivalent to the existence of the ‘‘inverse’’ wave operator $W_\pm(H_0, H)$. Thus, if the wave operator $W_\pm(H, H_0)$ exists and is complete, then the operators $H_0^{(ac)}$ and $H^{(ac)}$ are unitarily equivalent. We emphasize that scattering theory studies not arbitrary unitary equivalence but only the ‘‘canonical’’ one realized by the wave operators.

Along with the wave operators an important role in scattering theory is played by the scattering operator defined by eqn [11] where $W_+^*$ is the operator adjoint to $W_+$:

$$S = S(H, H_0) = W_+^*(H, H_0)\,W_-(H, H_0) \qquad [11]$$

The operator $S$ commutes with $H_0$ and hence reduces to multiplication by the operator function $S(\lambda) = S(\lambda; H, H_0)$ in a representation of $\mathcal{H}_0^{(ac)}$ which is diagonal for $H_0^{(ac)}$. The operator $S(\lambda)$ is known as the scattering matrix. The scattering operator [11] is unitary on the subspace $\mathcal{H}_0^{(ac)}$ provided the wave operators $W_\pm(H, H_0)$ exist and are complete. The scattering operator $S(H, H_0)$ connects the asymptotics of the solutions of eqn [1] as $t \to -\infty$ and as $t \to +\infty$ in terms of the free problem, that is $S(H, H_0): f_0^- \mapsto f_0^+$, where $f_0^\pm$ are the same as in eqn [2]. The scattering operator and the scattering matrix are usually of great interest in mathematical physics problems, because they connect the ‘‘initial’’ and the ‘‘final’’ characteristics of the process directly, bypassing its consideration for finite times.

The definition of the wave operators can be extended to self-adjoint operators acting in different spaces. Let $H_0$ and $H$ be self-adjoint operators in Hilbert spaces $\mathcal{H}_0$ and $\mathcal{H}$, respectively, and let ‘‘identification’’ $J: \mathcal{H}_0 \to \mathcal{H}$ be a bounded operator. Then the wave operator $W_\pm = W_\pm(H, H_0; J)$ for the triple $H_0$, $H$, and $J$ is defined by eqn [12] provided again that the strong limit there exists:

$$W_\pm = \operatorname*{s-lim}_{t\to\pm\infty} \exp(iHt)\,J\exp(-iH_0 t)P_0^{(ac)} \qquad [12]$$

Intertwining property [9] is preserved for wave operator [12]. This operator is isometric on $\mathcal{H}_0^{(ac)}$ if and only if

$$\lim_{t\to\pm\infty} \|J\exp(-iH_0 t)f_0\| = \|f_0\|$$

for all $f_0 \in \mathcal{H}_0^{(ac)}$. Since

$$\operatorname*{s-lim}_{|t|\to\infty} K\exp(-iH_0 t)P_0^{(ac)} = 0$$

for a compact operator $K$, wave operators [12] corresponding to identifications $J_1$ and $J_2$ coincide if $J_2 - J_1$ is compact or, at least, the operators $(J_2 - J_1)E_0(X)$ are compact for all bounded intervals $X$.

Consideration of wave operators [12] with $J \ne I$ may of course be of interest also in the case $\mathcal{H}_0 = \mathcal{H}$.

It suffices to verify the existence of limits [8] or [12] on some set dense in the absolutely continuous subspace $\mathcal{H}_0^{(ac)}$ of the operator $H_0$. The following simple but convenient condition for the existence of wave operators is usually called Cook’s criterion. Suppose that $H_0 = H_0^{(ac)}$ and that the operator $J$ maps domain $D(H_0)$ of the operator $H_0$ into $D(H)$. Let

$$\int_0^{\pm\infty} \|(HJ - JH_0)\exp(-iH_0 t)f\|\,dt < \infty$$

for all $f$ from some set $D_0 \subset D(H_0)$ dense in $\mathcal{H}_0$. Then the wave operator $W_\pm(H, H_0; J)$ exists.

This result is often useful in applications since the operator $\exp(-iH_0 t)$ is known explicitly. For example, it works with $J = I$ for the pair

$$H_0 = -\Delta, \qquad H = H_0 + V(x) \qquad [13]$$

if $V(x)$ satisfies estimate [4] with $\rho > 1$. On the other hand, different proofs of the existence of the wave operators $W_\pm(H_0, H; J^*)$ require new mathematical tools. There are two essentially different approaches in scattering theory: the trace-class and smooth methods.

Time-Independent Scattering Theory

The approach in scattering theory relying on definition [8] is called time dependent. An alternative possibility is to change the definition of wave operators replacing the unitary groups by the corresponding resolvents $R_0(z) = (H_0 - z)^{-1}$ and $R(z) = (H - z)^{-1}$. They are related by a simple identity

$$R(z) = R_0(z) - R_0(z)VR(z) = R_0(z) - R(z)VR_0(z) \qquad [14]$$

where $V = H - H_0$ and $\operatorname{Im} z \ne 0$. In the stationary approach in place of limits [8] one has to study the boundary values (in a suitable topology) of the resolvents as the spectral parameter $z$ tends to the real axis. An important advantage of the stationary approach is that it gives convenient formulas for the wave operators and the scattering matrix.

Let us discuss here the stationary formulation of the scattering problem for operators [13] in the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d)$ in terms of solutions of the Schrödinger equation [5]. If $V(x)$ satisfies estimate [4] with $\rho > (d + 1)/2$, then for all $\lambda > 0$ and all unit vectors $\omega \in S^{d-1}$, eqn [5] has the solution $\psi(x; \omega, \lambda)$ with asymptotics [6] as $|x| \to \infty$. Moreover, the scattering amplitude $a(\hat{x}, \omega; \lambda)$ belongs to the space
$$(R_0(\lambda \pm i0)f)(x) = c_\pm(\lambda)\,(\Gamma_0(\lambda)f)(\pm\hat{x})\,|x|^{-(d-1)/2}\exp(\pm ik|x|) + O(|x|^{-(d+1)/2})$$

where $f \in C_0^\infty(\mathbb{R}^d)$, $c_\pm(\lambda) = \pi^{1/2}\lambda^{-1/4}e^{\mp\pi i(d-3)/4}$ and the operator $\Gamma_0(\lambda)$ defined by eqn [16] is (up to the numerical factor) the restriction of the Fourier transform $\hat{f} = \mathcal{F}_0 f$ onto the sphere of radius $\lambda^{1/2}$:

$$(\Gamma_0(\lambda)f)(\omega) = 2^{-1/2}\lambda^{(d-2)/4}\,\hat{f}(\lambda^{1/2}\omega), \qquad \omega \in S^{d-1} \qquad [16]$$

which means that each wave operator establishes a one-to-one correspondence between eigenfunctions of the continuous spectrum of the operators $H_0$ and $H$.

The main ideas of the stationary approach go back to Friedrichs (1965), and Povzner. The inverse problem of reconstruction of a potential $V$ given the scattering amplitude $a$ (see eqn [6]) is treated in Faddeev (1976).
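The resolvent identity [14], on which the stationary approach rests, is purely algebraic and can be checked in a finite-dimensional toy model. The sketch below assumes $2\times 2$ real symmetric matrices in place of $H_0$ and $V$ and an arbitrary non-real spectral parameter; all numerical values are hypothetical. Such a model has no continuous spectrum, so it illustrates only the identity itself and not the boundary values as $z$ approaches the real axis.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def max_abs(A):
    return max(abs(a) for row in A for a in row)


def inv2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent(M, z):
    """(M - z)^{-1} for a 2 x 2 matrix M and complex z outside its spectrum."""
    shifted = [[M[r][c] - (z if r == c else 0) for c in range(2)] for r in range(2)]
    return inv2(shifted)


# Assumed toy data: unperturbed operator H0, perturbation V, and Im z != 0.
H0 = [[1.0, 0.0], [0.0, 3.0]]
V = [[0.5, 0.2], [0.2, -0.3]]
z = 2.0 + 1.0j

H = add(H0, V)
R0 = resolvent(H0, z)
R = resolvent(H, z)

# R(z) = R0(z) - R0(z) V R(z) and R(z) = R0(z) - R(z) V R0(z).
err_first = max_abs(sub(R, sub(R0, matmul(matmul(R0, V), R))))
err_second = max_abs(sub(R, sub(R0, matmul(matmul(R, V), R0))))
print(err_first, err_second)
```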
$\mathfrak{S}_1$, then the wave operators $W_\pm(H, H_0)$ exist and are complete. In particular, the operators $H_0^{(ac)}$ and $H^{(ac)}$ are unitarily equivalent. This can be considered as a far advanced extension of the H. Weyl theorem, which states the stability of the essential spectrum under

that its derivative $\varphi'$ is absolutely continuous and $\varphi'(\lambda) > 0$. Then the wave operators $W_\pm(H, H_0)$ exist and eqn [20] holds:

$$W_\pm(H, H_0) = W_\pm(\varphi(H), \varphi(H_0)) \qquad [20]$$

A direct generalization of the Kato–Rosenblum theorem to the operators acting in different spaces is due to Pearson. Suppose that $H_0$ and $H$ are self-adjoint operators in spaces $\mathcal{H}_0$ and $\mathcal{H}$, respectively, $J: \mathcal{H}_0 \to \mathcal{H}$ is a bounded operator and $V = HJ - JH_0 \in \mathfrak{S}_1$. Then the wave operators $W_\pm(H, H_0; J)$ and $W_\pm(H_0, H; J^*)$ exist.

Although rather sophisticated, the proof relies only on the following elementary lemma of Rosenblum. For a self-adjoint operator $H$, consider the set $\mathfrak{R} \subset \mathcal{H}^{(ac)}$ of elements $f$ such that

$$r_H^2(f) := \operatorname*{ess\,sup}_{\lambda}\, d(E(\lambda)f, f)/d\lambda < \infty$$

If $K: \mathcal{H} \to \mathcal{G}$ ($\mathcal{G}$ is some Hilbert space) is a Hilbert–Schmidt operator, then for all $f \in \mathfrak{R}$

$$\int_{-\infty}^{\infty} \|K\exp(-iHt)f\|^2\,dt \le 2\pi\, r_H^2(f)\,\|K\|_2^2 \qquad [21]$$

(cf. eqns [21] and [23]). Here and below, $C$ are different positive numbers whose precise values are inessential. It is important that this definition admits equivalent reformulations in terms of the resolvent or of the spectral family. Thus, $K$ is $H$-smooth if and only if

$$\sup_{\lambda\in\mathbb{R},\ \varepsilon>0} \|K(R(\lambda + i\varepsilon) - R(\lambda - i\varepsilon))K^*\| < \infty$$

or if and only if

$$\sup |X|^{-1}\|KE(X)\|^2 < \infty$$

for all intervals $X \subset \mathbb{R}$.

In applications the assumption of $H$-smoothness of an operator $K$ imposes too stringent conditions on the operator $H$. In particular, the operator $H$ is necessarily absolutely continuous if the kernel of $K$ is trivial. This assumption excludes eigenvalues and other singular points in the spectrum of $H$, for
example, the bottom of the continuous spectrum for the Schrödinger operator with decaying potential or edges of bands if the spectrum has the band structure. The notion of local H-smoothness suggested by Lavine is considerably more flexible. By definition, K is called H-smooth on a Borel set X ⊂ ℝ if the operator KE(X) is H-smooth. Note that, under the assumption

sup_{λ∈X, ε>0} ‖K(R(λ + iε) − R(λ − iε))K*‖ < ∞   [24]

considered in the space ℋ, is continuous in norm in the closed complex plane ℂ cut along (0, ∞) with possible exception of the point z = 0. This implies H0-smoothness of the operator ⟨x⟩^{−l}, l > 1/2, on all compact intervals X ⊂ (0, ∞).

To obtain a similar result for the operator H, we proceed from the resolvent identity [14]. Let ℛ(z) = ⟨x⟩^{−l} R(z) ⟨x⟩^{−l}, and let B be the operator of multiplication by the bounded function (1 + |x|)^ρ V(x). If
positive spectrum of H is necessarily contained in N . where the operator 0 () is defined by eqn [16].
To prove that its continuous part is empty, it suffices Then
to check that the set N consists of eigenvalues of the
operator H. In terms of u = hxil Bf , l = =2, eqn F0 : L2 ðRd Þ ! L2 ðRþ ; N Þ; N ¼ L2 ðSd1 Þ
[28] can be rewritten as
is a unitary operator and (F0 H0 f )() = (F0 f )().
u þ VR0 ð i0Þu ¼ 0 ½29 Under assumption [4] where > 1, the scattering
operator S for pair [13] is defined by eqn [11]. It is
Multiplying this equation by R0 ( i0)u and taking unitary on the space H = L2 (Rd ) and commutes
the imaginary part of the scalar product, we see that with the operator H0 . It follows that (F0 Sf )() =
S()(F0 f )(), > 0, where the unitary operator
dðE0 ðÞu; uÞ=d ¼ ImðR0 ð i0Þu; uÞ ¼ 0
S() : N ! N is known as the scattering matrix. The
According to eqn [26], this implies that scattering matrix S() for the pair H0 , H can be
computed in terms of the scattering amplitude.
u
^ðÞ ¼ 0 for jj ¼ 1=2 ½30 Namely, S() acts in the space L2 (Sd1 ), and S()
I is the integral operator whose kernel is the
It follows from eqn [29] that scattering amplitude. More precisely,
¼ R0 ð i0Þu ½31
ðSðÞf ÞðÞ
Z
that is, ˆ () = (jj2 i0)1 u
^(), is a formal
¼ f ðÞ þ 2i1=2 d ðÞ að; !; Þf ð!Þ d!
(because of the singularity of the denominator) Sd1
solution of Schrödinger equation [5]. Therefore, one
needs only to verify that 2 L2 (R d ). Since u 2 L(l) In operator notation, this representation can be
2 ,
where l = =2, this is a direct consequence of [25] and rewritten as
[30] if > 2. In the general case, one uses that under SðÞ ¼ I 2 i0 ðÞðV VRð þ i0ÞVÞ0 ðÞ ½33
assumption [30] the function (jj2 )1 u^() belongs
(p)
to the space L2 for any p < l 1. By virtue of The right-hand side here is correctly defined as a
condition [4] where > 1, eqn [29] now shows that bounded operator in the space N and is continuous
(p)
actually u 2 L2 for any p < l þ 1. Repeating in > 0. Moreover, the operator S() I is compact
these arguments, we obtain, after n steps, that u 2 since 0 ()hxil : H ! N is compact for l > 1=2 by
L(p)
2 for any p < l þ n( 1). For n large enough, this virtue of the Sobolev trace theorem.
implies that u 2 L(p) 2 for p > 1, and consequently It follows that the spectrum of the operator S()
function [31] belongs to L2 (Rd ). consists of eigenvalues of finite multiplicity, except
Similar arguments show that eigenvalues of H possibly the point 1, lying on the unit circle and
have finite multiplicity and do not have positive accumulating at the point 1 only. In the general
accumulation points. For the proof of boundedness case, eigenvalues of S() play the role of scattering
of the set of eigenvalues, one uses additionally the phases or shifts considered often for radial potentials
estimate V(x) = V(jxj).
The scattering amplitude is singular on the
kR0 ð i0Þk ¼ Oð1=2 Þ; !1 ½32 diagonal = ! only. Moreover, this singularity is
Actually, according to Kato theorem the Schrödin- weaker for potentials with faster decay at infinity
ger operator H does not have positive eigenvalues. (for bigger). If > (d þ 1)=2, then the operator
There exists also a purely time-dependent S() I belongs to the Hilbert–Schmidt class. In this
approach, the Enss method (see Perry (1983)), case the total scattering cross section
which relies on an advanced study of the free Z
evolution operator exp (iH0 t). ð!; Þ ¼ jað; !; Þj2 d
Sd1
Using resolvent identity [14], one deduces from operators W (H, H0 ; J), where J is a pseudodiffer-
eqn [33] the Born expansion ential operator,
Z
X
1
ðJf ÞðxÞ ¼ ð2Þd=2 eihx;i eiðx;Þ
ðx; Þ^f ðÞ d
SðÞ ¼ I 2 i ð1Þn 0 ðÞVðR0 ð þ i0ÞVÞn 0 ðÞ Rd
n¼0
with oscillating symbol exp (i(x, ))
(x, ). Due to the
This series is norm-convergent for small potentials V conservation of energy, we may suppose that
(x, )
and according to estimate [32] for high energies . contains a factor (jj2 ) with 2 C1 0 (0, 1). Set
j@ VðxÞj Cð1 þ jxjÞj j ; 2 ð0; 1 ½34 The notorious difficulty (for d 2) of this method is
that the eikonal equation does not have (even
for all derivatives of V up to some order. In the approximate) solutions such that jrx (x, )j ! 0 as
long-range case, the wave operators W (H, H0 ) do jxj ! 1 and the arising error term is short-range.
not exist, and the asymptotic dynamics should be However, it is easy to construct functions ’ = ’
properly modified. It can be done in a time- satisfying these conditions if a conical neighborhood
dependent way either in the coordinate or momen- of the direction is removed from Rd . For
tum representations. For example, in the coordinate example,
representation, the free evolution exp (iH0 t) Z 1
should be replaced in definition [8] of wave ðx; Þ ¼ 21 ðVðx Þ VðÞÞ d
operators by unitary operators U0 (t) defined by 0
generally complex eigenvalues are accommodated if we allow Q to be normal, that is, QQ* = Q*Q. In each case we require the eigenvectors of Q to span the Hilbert space H.) This “evolution procedure” of the quantum state is very different from U, owing both to its discontinuity and its indeterminacy. The letter R will be used for this, standing for the “reduction” of the quantum state (sometimes referred to as the “collapse of the wave function”). This strange hybrid, whereby U and R are alternated, with U holding between measurements and R holding at measurements, is the standard procedure that is pragmatically adopted in conventional quantum mechanics, and which works so marvelously well, with no known discrepancy between the theory and observation. (In von Neumann’s classic account (1932, 1955), “R” is referred to as his “process I” and “U” as his “process II.”) However, there appears to be no consensus whatever about the relation between this mathematical procedure and what is “really” going on in the physical world. This is the kind of issue that will be of concern to us here.

Quantum Reality

The discussion here will be given only in the Schrödinger picture, for the reason that the issues appear to be clearer with this description. In the Heisenberg picture, the state |ψ⟩ does not evolve in time, and all dynamics is taken up in the time evolution of the dynamical variables. But this evolution does not refer to the evolution of specific systems, the “state” of any particular system being defined to remain constant in time. Since the Schrödinger and Heisenberg pictures are deemed to be equivalent (at least for the “normal” systems that are under consideration here), we do not lose anything substantial by sticking to Schrödinger’s description, whereas there does seem to be a significant gain in understanding of what the formalism is actually telling us.

There are, however, many different attitudes that are expressed as to the “reality” of |ψ⟩. (There is an unfortunate possibility of confusion here in the two uses of the word “real” that come into the discussion here. In the quantum formalism, the state is mathematically a “complex” rather than a “real” entity, whereas our present concern is not directly to do with this, but with the “ontology” of the quantum description.) According to what is commonly regarded as the standard – “Copenhagen” – interpretation of quantum mechanics (due primarily to Bohr, Heisenberg, and Pauli), the quantum state |ψ⟩ is not taken as a description of a quantum-level reality at all, but merely as a description of the observer’s knowledge of the quantum system under consideration. According to this view, the “jumping” that the quantum state undergoes is regarded as unsurprising, since it does not represent a sudden change in the reality of the situation, but merely in the observer’s knowledge, as new information becomes available, when the result of some measurement becomes known to the observer. According to this view, there is no objective quantum reality described by |ψ⟩. Whether or not there might be some objective quantum-level reality with some other mathematical description seems to be left open by this viewpoint, but the impression given is that there might well not be any such quantum-level reality at all, in the sense that it becomes meaningless to ask for a description of “actual reality” at quantum-relevant scales.

Of course some connection with the real world is necessary, in order that the quantum formalism can relate to the results of experiment. In the Copenhagen viewpoint, the experimenter’s measuring apparatus is taken to be a classical-level entity, which can be ascribed a real ontological status. When the Geiger counter “clicks” or when the pointer “points” to some position on a dial, or when the track in the cloud chamber “becomes visible” – these are taken to be real events. The intervening description in terms of a quantum state vector |ψ⟩ is not ascribed a reality. The role of |ψ⟩ is merely to provide a calculational procedure whereby the different outcomes of an experiment can be assigned probabilities. Reality comes about only when the result of the measurement is manifested, not before.

A difficulty with this viewpoint is that it is hard to draw a clear line between those entities which are considered to have an actual reality, such as the experimental apparatus or a human observer, and the elemental constituents of those entities, which are such things as electrons or protons or neutrons or quarks, which are to be treated quantum mechanically and therefore, on the “Copenhagen” view, their mathematical descriptions are denied such an honored ontological status. Moreover, there is no limit to the number of particles that can partake in a quantum state. According to current quantum mechanics, the most accurate mathematical procedure for describing a system with a large number of particles would indeed be to use a unitarily evolving quantum state. What reasons can be presented for or against the viewpoint that this gives us a reasonable description of an actual reality? Can our perceived reality arise as some kind of statistical limit when very large numbers of constituents are involved?

Before entering into the more subtle and contentious issues of the nature of “quantum reality,” it
262 Quantum Mechanics: Foundations
is appropriate that one of the very basic mathematical aspects of the quantum formalism be addressed first. It is an accepted aspect of the quantum formalism that a state-vector such as |ψ⟩ should not, in any case, be thought of as providing a unique mathematical description of a “physical reality” for the simple reason that |ψ⟩ and z|ψ⟩, where z is any nonzero complex number, describe precisely the same physical situation. It is a common, but not really necessary, practice to demand that |ψ⟩ be normalized to unity: ⟨ψ|ψ⟩ = 1, in which case the freedom in |ψ⟩ is reduced to the multiplication by a phase factor |ψ⟩ ↦ e^{iθ}|ψ⟩. Either way, the physically distinguishable states constitute a projective Hilbert space PH, where each point of PH corresponds to a one-dimensional linear subspace of the Hilbert space H. The issue, therefore, is whether quantum reality can be described in terms of the points of a projective Hilbert space PH.

guarantee this answer. (We are, of course, considering only “ideal” measurements, for the purpose of argument.) Moreover, we could imagine that between the two measurements, some appropriate magnetic field had been introduced so as to rotate the spin direction in some very specific way, so that the spin state is now some other direction such as |↗⟩. By rotating our second Stern–Gerlach apparatus to agree with this new direction, we must again get certainty for the YES answer, the guaranteeing of this by the rotated state seeming now to give a “reality” to this new state |↗⟩. The quantum formalism does not allow us to ascertain an unknown direction of spin. But it does allow for us to “confirm” (or “refute”) a proposed direction for the spin state, in the sense that if the proposed direction is incorrect, then there is a nonzero probability of refutation. Only the correct direction can be guaranteed to give the YES answer.
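The two statements above (that |ψ⟩ and z|ψ⟩ describe the same physical situation, and that only the correct spin direction guarantees the YES answer) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the standard two-component representation of a spin-1/2 state pointing along the direction with polar angles (theta, phi), and the helper names spin_state and prob_yes are hypothetical, introduced only for this illustration.

```python
import numpy as np

def spin_state(theta, phi):
    """Spin-1/2 state pointing 'up' along the direction with polar angles (theta, phi)."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)], dtype=complex)

def direction(theta, phi):
    """Unit vector in ordinary 3-space for the same polar angles."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def prob_yes(proposed, actual):
    """Probability of YES when asking 'is the spin along the proposed state?'.

    Written as a ratio of inner products, so it depends only on the rays
    (points of the projective Hilbert space), not on normalization or phase.
    """
    overlap = abs(np.vdot(proposed, actual)) ** 2
    norms = np.vdot(proposed, proposed).real * np.vdot(actual, actual).real
    return overlap / norms

actual = spin_state(0.7, 1.2)
z = 3.0 * np.exp(0.4j)                      # arbitrary nonzero complex number

p_same_ray = prob_yes(actual, z * actual)   # same ray: YES is certain
p_wrong = prob_yes(spin_state(1.5, 0.3), actual)   # wrong direction: YES not certain

# For spin 1/2 the YES probability equals (1 + n.m)/2 for unit vectors n, m.
p_formula = (1 + direction(1.5, 0.3) @ direction(0.7, 1.2)) / 2
```

Here p_same_ray is 1 up to rounding, while p_wrong is strictly less than 1 and agrees with p_formula, so a wrongly proposed direction always carries a nonzero probability of refutation.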
at W, might have been spacelike separated, and because of the requirements of special relativity there would be no meaning to say which of the two measurements – at E or at W – had “actually” occurred first. One seems to obtain a different picture of “reality” depending on this ordering.

In fact, the calculations of probabilities come out the same whichever picture is used, so if one asks only for a calculational procedure for the probabilities, rather than an actual picture of quantum reality, these considerations are not problematic. But they do provide profound difficulties for any view of quantum reality that is entirely local. The difficulty is made particularly clear in a theorem due to John Bell (1964, 1966a, b) which showed that on the basis of the assumptions of local realism, there are particular relations between the conditional probabilities, which must hold in any situation of this kind; moreover, these inequalities can be violated in various situations in standard quantum mechanics. (See, most specifically, Clauser et al. (1969).) Several experiments that were subsequently performed (notably Aspect et al. (1982)) confirmed the expectations of quantum mechanics, thereby presenting profound difficulties for any local realistic model of the world. There are also situations of this kind which involve only yes/no questions, so that actual probabilities do not need to be considered, see Kochen and Specker (1967), Peres (1991), Hardy (1993), Conway and Kochen (2002). Basically: if one insists on realism, then one must give up locality. Moreover, nonlocal realistic models, consistent with the requirements of special relativity, are not easy to construct (see Quantum Mechanics: Generalizations), and have so far proved elusive.

Other Aspects of Quantum Nonlocality

Problems of this kind occur even at the more elementary level of single particles, if one tries to consider that an ordinary particle wave function (position-space description of |ψ⟩) might be just some kind of “local disturbance,” like an ordinary classical wave. Consider the wave function spreading out from a localized source, to be detected at a perpendicular screen some distance away. The detection of the particle at any one place on the screen immediately forbids the detection of that particle at any other place on the screen, and if we are to think of this information as being transmitted as a classical signal to all other places on the screen, then we are confronted with problems of superluminary communication. Again, any “realistic” picture of this process would require nonlocal ingredients, which are difficult to square with the requirements of special relativity. (It is possible that these difficulties might be resolved within some kind of nonlocal geometry, such as that supplied by twistor theory (see Twistors; Twistor Theory: Some Applications); see, particularly, Penrose (2005).)

These types of issues are made even more dramatic and problematic in the procedure of “quantum teleportation,” whereby the information in a quantum state (e.g., the unknown actual direction α in some quantum state |α⟩) can be transported from one experimenter A to another one B, by merely the sending of a small finite number of classical bits of information from A to B, where before this classical information is transmitted, A and B must each be in possession of one member of an EPR pair. More explicitly, we may suppose A (Alice) is presented with a spin-1/2 state |α⟩, but is not told the direction α. She has in her possession another spin-1/2 state which is an EPR–Bohm partner of a spin-1/2 state in the possession of B (Bob). She combines this |α⟩ with her EPR atom and then performs a measurement which distinguishes the four orthogonal “Bell states”

0: |↑⟩|↓⟩ − |↓⟩|↑⟩
1: |↑⟩|↑⟩ − |↓⟩|↓⟩
2: |↑⟩|↑⟩ + |↓⟩|↓⟩
3: |↑⟩|↓⟩ + |↓⟩|↑⟩

where the first state in each product refers to her unknown state and the second refers to her EPR atom. The result of this measurement is conveyed to Bob by an ordinary classical signal, coded by the indicated numbers 0, 1, 2, 3. On receiving Alice’s message, Bob takes the other member of the EPR pair and performs the following rotation on it:

0: leave alone
1: 180° about x-axis
2: 180° about y-axis
3: 180° about z-axis

This achieves the successful “teleporting” of |α⟩ from A to B, despite the fact that only 2 bits of classical information have been signaled. It is the acausal EPR–Bohm connection that provides the transmission of “quantum information” in a classically acausal way. Again, we see the essentially nonlocal (or acausal) nature of any attempted “realistic” picture of quantum phenomena. It may be regarded as inappropriate to use the term “information” for something that is propagated acausally and cannot be directly used for signaling. It has been suggested, accordingly, that a term such as “quanglement” might be more appropriate to use for this concept; see Penrose (2002, 2004).
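The teleportation protocol described above can be simulated directly with small state vectors. The sketch below is illustrative only and rests on two assumptions not spelled out in the text: the shared EPR–Bohm pair is taken to be in the spin singlet state (Bell state 0), and a 180° rotation about axis k is represented by exp(−iπσ_k/2) = −iσ_k, with σ_k the Pauli matrices. The function name teleport and the variable names are hypothetical.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
s = 1 / np.sqrt(2)

# Bell states 0..3 in the order used in the text
# (first factor: unknown state, second factor: Alice's EPR particle).
BELL = [
    s * (np.kron(up, down) - np.kron(down, up)),
    s * (np.kron(up, up) - np.kron(down, down)),
    s * (np.kron(up, up) + np.kron(down, down)),
    s * (np.kron(up, down) + np.kron(down, up)),
]

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)
# Bob's corrections: leave alone, or rotate by 180 degrees about x, y, z.
ROTATIONS = [I2, -1j * SX, -1j * SY, -1j * SZ]

def teleport(alpha):
    """Return [(probability, fidelity)] for Alice's four possible messages 0..3."""
    alpha = alpha / np.linalg.norm(alpha)
    # Particle 1: unknown state; particles 2 and 3: singlet pair (assumed).
    # After the reshape, rows index particles (1, 2) and columns index particle 3.
    total = np.kron(alpha, BELL[0]).reshape(4, 2)
    results = []
    for k in range(4):
        bob = BELL[k].conj() @ total          # Bob's unnormalized state for outcome k
        prob = np.vdot(bob, bob).real         # probability of outcome k
        bob = ROTATIONS[k] @ (bob / np.sqrt(prob))
        fidelity = abs(np.vdot(alpha, bob)) ** 2
        results.append((prob, fidelity))
    return results

rng = np.random.default_rng(0)
alpha = rng.normal(size=2) + 1j * rng.normal(size=2)   # an arbitrary "unknown" state
results = teleport(alpha)
```

Each of the four outcomes occurs with probability 1/4 whatever |α⟩ is, so Alice’s two classical bits carry no information about the direction, yet after the indicated rotation Bob’s particle matches |α⟩ (fidelity 1) up to an overall phase.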
The preceding arguments illustrate how quantum systems involving even just a few particles can exhibit features quite unlike the ordinary behavior of classical particles. This was pointed out by Schrödinger (1935), and he referred to this key property of composite quantum systems as “entanglement.” An entangled quantum state (vector) is an element of a product Hilbert space H^m ⊗ H^n which cannot be written as a tensor product of elements |ψ⟩|φ⟩, with |ψ⟩ ∈ H^m and |φ⟩ ∈ H^n, where H^m refers to one part of the system and H^n refers to another part, usually taken to be physically widely separated from the first. EPR systems are a clear example, and we begin to see very nonclassical, effectively nonlocal behavior with entangled systems generally. A puzzling aspect of this is that the vast majority of states are indeed entangled, and the more parts that a system has, the more entangled it becomes (where the generalization of this notion to more than two parts is evident). One might have expected that “big” quantum systems with large numbers of parts ought to behave more and more like classical systems when they get larger and more complicated. However, we see that this is very far from being the case. There is no good reason why a large quantum system, left on its own to evolve simply according to U should actually resemble a classical system, except in very special circumstances. Something of the nature of the R process seems to be needed in order that classical behavior can “emerge.”

Schrödinger’s Cat

To clarify the nature of the problem we must consider a key feature of the U formalism, namely “linearity,” which is supposed to hold no matter how large or complicated is the quantum system under consideration. Recall the quantum superposition principle, which allows us to construct arbitrary combinations of states

|ψ⟩ = w|χ⟩ + z|φ⟩

from two given states |χ⟩ and |φ⟩. Quantum linearity tells us that if

|χ⟩ ⇝ |χ′⟩ and |φ⟩ ⇝ |φ′⟩

where the symbol “⇝” expresses how a state will have evolved after a specified time period T, then

|ψ⟩ = w|χ⟩ + z|φ⟩ ⇝ |ψ′⟩ = w|χ′⟩ + z|φ′⟩

Let us now consider how this might be applied in a particular, rather outlandish situation. Let us suppose that the |χ⟩-evolution consists of a photon going in one direction, encountering a detector, which is connected to some murderous device which kills a cat. The |φ⟩-evolution, on the other hand, consists of the photon going in some other direction, missing the detector so that the murderous device is not activated, and the cat is left alive. These two alternatives would each be perfectly plausible evolutions which might take place in the physical world. Now, by use of a beam splitter (effectively a “half-silvered mirror”) we can easily arrange for the initial state of the photon to be the superposition w|χ⟩ + z|φ⟩ of the two. Then by quantum linearity we find, as the final result, the superposed state w|χ′⟩ + z|φ′⟩, in which the cat is in a superposition of life and death (a “Schrödinger’s cat”).

We note that the two individual final states |χ′⟩ and |φ′⟩ would each involve not just the cat but also its environment, fully entangled with the cat’s state, and perhaps also some human observer looking at the cat. In the latter case, |χ′⟩ would involve the observer in a state of unhappily perceiving a dead cat, and |φ′⟩ happily perceiving a live one. Two of the “conventional standpoints” with regard to the measurement problem are of relevance here. According to the standpoint of environmental decoherence, the details of the environmental degrees of freedom are completely inaccessible, and it is deemed to be appropriate to construct a density matrix to describe the situation, which is a partial trace D of the quantity |ψ⟩⟨ψ|, constructed by tracing out over all the environmental degrees of freedom:

D = trace over environment {|ψ⟩⟨ψ|}

The density matrix tends to be regarded as a more appropriate quantity than the ket |ψ⟩ to represent the physical situation, although this represents something of an “ontology shift” from the point of view that was being held previously. Under appropriate assumptions, D may now be shown to attain a form that is close to being diagonal in a basis with respect to which the cat is either dead or alive, and then, by a second “ontology shift” D is re-read as describing a probability mixture of these two states.

According to the second “conventional standpoint” under consideration here, it is not logical to take this detour through a density-matrix description, and instead one should maintain a consistent ontology by following the evolution of the state |ψ⟩ itself throughout. The “real” resulting physical state is then taken to be actually |ψ′⟩, which involves the superposition of a dead and live cat. Of course this “reality” does not agree with the reality that we actually perceive, so the position is taken that a conscious mind would not actually be able to function in such a superposed condition, and would have to settle into a state of perception of either a dead cat or a live one, these two alternatives occurring with probabilities as given by the Born rule stated above. It may be argued that this conclusion depends
Quantum Mechanics: Generalizations 265
upon some appropriate theory of how conscious minds actually perceive things, and this appears to be lacking.

A good many physicists might argue that none of these attempts at resolution of the measurement problem is satisfactory, including “Copenhagen,” although the latter at least has the advantage of offering a pragmatic, if not fully logical, stance. Such physicists might take the position that it is necessary to move away from the precise version of quantum theory that we have at present, and turn to one of its modifications. Some major candidates for modification are discussed in Quantum Mechanics: Generalizations. Most of these actually make predictions that, at some stage, would differ from those of standard quantum mechanics. So it becomes an experimental matter to ascertain the plausibility of these schemes. In addition, there are reinterpretations which do not change quantum theory’s predictions, such as the de Broglie–Bohm model. In this, there are two levels of “reality,” a firmer one with a particle or position-space ontology, and a secondary one containing waves which guide the behavior at the firmer level. It is clear, however, that these issues will remain the subject of debate for many years to come.

See also: Functional Integration in Quantum Physics; Normal Forms and Semiclassical Approximation; Quantum Mechanics: Generalizations; Twistor Theory: Some Applications [In Integrable Systems, Complex Geometry and String Theory]; Twistors.

Further Reading

Aspect A, Grangier P, and Roger G (1982) Experimental realization of Einstein–Podolsky–Rosen–Bohm Gedankenexperiment: a new violation of Bell’s inequalities. Physical Review Letters 49: 91–94.
Bell JS (1964) On the Einstein Podolsky Rosen paradox. Physics 1: 195–200. Reprinted in Quantum Theory and Measurement, eds., Wheeler JA and Zurek WH (Princeton Univ. Press, Princeton, 1983).
Bell JS (1966a) On the problem of hidden variables in quantum theory. Reviews of Modern Physics 38: 447–452.
Bell JS (1966b) Speakable and Unspeakable in Quantum Mechanics. Cambridge: Cambridge University Press. Reprint 1987.
Clauser JF, Horne MA, Shimony A, and Holt RA (1969) Proposed experiment to test local hidden-variable theories. Physical Review Letters 23: 880–884.
Conway J and Kochen S (2002) The geometry of the quantum paradoxes. In: Bertlmann RA and Zeilinger A (eds.) Quantum [Un]speakables: From Bell to Quantum Information, Ch. 18 (ISBN 3-540-42756-2). Berlin: Springer.
Hardy L (1993) Nonlocality for two particles without inequalities for almost all entangled states. Physical Review Letters 71(11): 1665.
Kochen S and Specker EP (1967) Journal of Mathematics and Mechanics 17: 59.
Penrose R (2002) John Bell, state reduction, and quanglement. In: Bertlmann RA and Zeilinger A (eds.) Quantum [Un]speakables: From Bell to Quantum Information, pp. 319–331. Berlin: Springer.
Penrose R (2004) The Road to Reality: A Complete Guide to the Laws of the Universe. London: Jonathan Cape.
Penrose R (2005) The twistor approach to space-time structures. In: Ashtekar A (ed.) 100 Years of Relativity; Space-Time Structure: Einstein and Beyond. Singapore: World Scientific.
Peres A (1991) Two simple proofs of the Kochen–Specker theorem. Journal of Physics A: Mathematical and General 24: L175–L178.
Schrödinger E (1935) Probability relations between separated systems. Proceedings of the Cambridge Philosophical Society 31: 555–563.
von Neumann J (1932) Mathematische Grundlagen der Quantenmechanik. Berlin: Springer.
von Neumann J (1955) Mathematical Foundations of Quantum Mechanics. Princeton: Princeton University Press.
formalism. Rather, they utilize structures compatible with standard quantum theory to elucidate S. These approaches, which will not be discussed in this article, have arguably been less successful so far at achieving S than approaches that introduce significant alterations to quantum theory.

This article will largely deal with the two most well-developed realistic models that reproduce quantum theory in some limit and yield potentially new and testable physics outside that limit. First, the pilot-wave model, which will be discussed in the broader context of “hidden-variables theories.” Second, the continuous spontaneous localization (CSL) model, which describes wave-function collapse as a physical process. Other related models will also be discussed briefly.

Due to bibliographic space limitations, this article contains a number of uncited references, of the form “[author] in [year].” Those in the next section can be found in Valentini (2002b, 2004a,b) or at www.arxiv.org. Those in the subsequent sections can be found in Adler (2004), Bassi and Ghirardi (2003), Pearle (1999) (or in subsequent papers by these authors, or directly, at www.arxiv.org), and in Wallstrom (1994).

Hidden Variables and Quantum Nonequilibrium

A deterministic hidden-variables theory defines a mapping ω = ω(M, λ) from initial hidden parameters λ (defined, e.g., at the time of preparation of a quantum state) to final outcomes ω of quantum measurements. The mapping depends on macroscopic experimental settings M, and fixes the outcome for each run of the experiment. Bell’s theorem of 1964 shows that, for entangled quantum states of widely separated systems, the mapping must be nonlocal: some outcomes for (at least) one system must depend on the setting for another distant system.

In a viable theory, the statistics of quantum measurement outcomes – over an ensemble of experimental trials with fixed settings M – will agree with quantum theory for some special distribution ρ_QT(λ) of hidden variables. For example, expectation values will coincide with the predictions of the Born rule

⟨ω⟩_QT ≡ ∫ dλ ρ_QT(λ) ω(M, λ) = tr(ρ̂ Ω̂)

for an appropriate density operator ρ̂ and Hermitian observable Ω̂. (As is customary in this context, ∫ dλ is to be understood as a generalized sum.)

However, given the mapping ω = ω(M, λ) for individual trials, one may, in principle, consider nonstandard distributions ρ(λ) ≠ ρ_QT(λ) that yield statistics outside the domain of ordinary quantum theory (Valentini 1991, 2002a). We may say that such distributions correspond to a state of quantum nonequilibrium.

Quantum nonequilibrium is characterized by the breakdown of a number of basic quantum constraints. In particular, nonlocal signals appear at the statistical level. We shall first illustrate this for the hidden-variables model of de Broglie and Bohm. Then we shall generalize the discussion to all (deterministic) hidden-variables theories.

At present there is no experimental evidence for quantum nonequilibrium in nature. However, from a hidden-variables perspective, it is natural to explore the theoretical properties of nonequilibrium distributions, and to search experimentally for the statistical anomalies associated with them.

From this point of view, quantum theory is a special case of a wider physics, much as thermal physics is a special case of a wider (nonequilibrium) physics. (The special distribution ρ_QT(λ) is analogous to, say, Maxwell’s distribution of molecular speeds.) Quantum physics may be compared with the physics of global thermal equilibrium, which is characterized by constraints – such as the impossibility of converting heat into work (in the absence of temperature differences) – that are not fundamental but contingent on the state. Similarly, quantum constraints such as statistical locality (the impossibility of converting entanglement into a practical signal) are seen as contingencies of ρ_QT(λ).

Pilot-Wave Theory

The de Broglie–Bohm “pilot-wave theory” – as it was originally called by de Broglie, who first presented it at the Fifth Solvay Congress in 1927 – is the classic example of a deterministic hidden-variables theory of broad scope (Bohm 1952, Bell 1987, Holland 1993). We shall use it to illustrate the above ideas. Later, the discussion will be generalized to arbitrary theories.

In pilot-wave dynamics, an individual closed system with (configuration-space) wave function Ψ(X, t) satisfying the Schrödinger equation

iħ ∂Ψ/∂t = ĤΨ   [1]

has an actual configuration X(t) with velocity

Ẋ(t) = J(X, t) / |Ψ(X, t)|²   [2]
where $J = J[\Psi] = J(X, t)$ satisfies the continuity equation

$$\frac{\partial |\Psi|^2}{\partial t} + \nabla \cdot J = 0 \qquad [3]$$

(which follows from [1]). In quantum theory, $J$ is the "probability current." In pilot-wave theory, $\Psi$ is an objective physical field (on configuration space) guiding the motion of an individual system.

Here, the objective state (or ontology) for a closed system is given by $\Psi$ and $X$. A probability distribution for $X$ – discussed below – completes an unambiguous specification S (as mentioned in the introduction).

Pilot-wave dynamics may be applied to any quantum system with a locally conserved current in configuration space. Thus, $X$ may represent a many-body system, or the configuration of a continuous field, or perhaps some other entity.

For example, at low energies, for a system of $N$ particles with positions $x_i(t)$ and masses $m_i$ ($i = 1, 2, \ldots, N$), with an external potential $V$, [1] (with $X \equiv (x_1, x_2, \ldots, x_N)$) reads

$$i\hbar \frac{\partial \Psi}{\partial t} = \sum_{i=1}^{N} -\frac{\hbar^2}{2m_i} \nabla_i^2 \Psi + V\Psi \qquad [4]$$

while [2] has components

$$\frac{dx_i}{dt} = \frac{\hbar}{m_i}\, \mathrm{Im}\, \frac{\nabla_i \Psi}{\Psi} = \frac{\nabla_i S}{m_i} \qquad [5]$$

(where $\Psi = |\Psi| e^{(i/\hbar)S}$).

In general, [1] and [2] determine $X(t)$ for an individual system, given the initial conditions $X(0)$, $\Psi(X, 0)$ at $t = 0$. For an arbitrary initial distribution $P(X, 0)$, over an ensemble with the same wave function $\Psi(X, 0)$, the evolution $P(X, t)$ of the distribution is given by the continuity equation

$$\frac{\partial P}{\partial t} + \nabla \cdot (P\dot{X}) = 0 \qquad [6]$$

The outcome of an experiment is determined by $X(0)$, $\Psi(X, 0)$, which may be identified with $\lambda$. For an ensemble with the same $\Psi(X, 0)$, we have $\lambda = X(0)$.

Quantum equilibrium From [3] and [6], if we assume $P(X, 0) = |\Psi(X, 0)|^2$ at $t = 0$, we obtain $P(X, t) = |\Psi(X, t)|^2$ – the Born-rule distribution of configurations – at all times $t$.

Quantum measurements are, like any other process, described and explained in terms of evolving configurations. For measurement devices whose pointer readings reduce to configurations, the distribution of outcomes of quantum measurements will match the statistical predictions of quantum theory (Bohm 1952, Bell 1987, Dürr et al. 2003). Thus, quantum theory emerges phenomenologically for a "quantum equilibrium" ensemble with distribution $P(X, t) = |\Psi(X, t)|^2$ (or $\rho(\lambda) = \rho_{\rm QT}(\lambda)$).

Quantum nonequilibrium In principle, as we saw for general hidden-variables theories, we may consider a nonequilibrium distribution $P(X, 0) \neq |\Psi(X, 0)|^2$ of initial configurations while retaining the same deterministic dynamics [1], [2] for individual systems (Valentini 1991). The time evolution of $P(X, t)$ will be determined by [6].

As we shall see, in appropriate circumstances (with a sufficiently complicated velocity field $\dot{X}$), [6] generates relaxation $P \to |\Psi|^2$ on a coarse-grained level, much as the analogous classical evolution on phase space generates thermal relaxation. But for as long as the ensemble is in nonequilibrium, the statistics of outcomes of quantum measurements will disagree with quantum theory.

Quantum nonequilibrium may have existed in the very early universe, with relaxation to equilibrium occurring soon after the big bang. Thus, a hidden-variables analog of the classical thermodynamic "heat death of the universe" may have actually taken place (Valentini 1991). Even so, relic cosmological particles that decoupled sufficiently early could still be in nonequilibrium today, as suggested by Valentini in 1996 and 2001. It has also been speculated that nonequilibrium could be generated in systems entangled with degrees of freedom behind a black-hole event horizon (Valentini 2004a).

Experimental searches for nonequilibrium have been proposed. Nonequilibrium could be detected by the statistical analysis of random samples of particles taken from a parent population of (for example) relics from the early universe. Once the parent distribution is known, the rest of the population could be used as a resource, to perform tasks that are currently impossible (Valentini 2002b).

H-Theorem: Relaxation to Equilibrium

Before discussing the potential uses of nonequilibrium, we should first explain why all systems probed so far have been found in the equilibrium state $P = |\Psi|^2$. This distribution may be accounted for along the lines of classical statistical mechanics, noting that all currently accessible systems have had a long and violent astrophysical history.

Dividing configuration space into small cells, and introducing coarse-grained quantities $\bar{P}$, $\overline{|\Psi|^2}$, a general argument for relaxation $\bar{P} \to \overline{|\Psi|^2}$ is based on an
analog of the classical coarse-graining H-theorem. The coarse-grained H-function

$$\bar{H} = \int dX\, \bar{P} \ln\left(\bar{P} / \overline{|\Psi|^2}\right) \qquad [7]$$

(minus the relative entropy of $\bar{P}$ with respect to $\overline{|\Psi|^2}$) obeys the H-theorem (Valentini 1991)

$$\bar{H}(t) \leq \bar{H}(0)$$

(assuming no initial fine-grained microstructure in $P$ and $|\Psi|^2$). Here, $\bar{H} \geq 0$ for all $\bar{P}$, $\overline{|\Psi|^2}$ and $\bar{H} = 0$ if and only if $\bar{P} = \overline{|\Psi|^2}$ everywhere.

The H-theorem expresses the fact that $P$ and $|\Psi|^2$ behave like two "fluids" that are "stirred" by the same velocity field $\dot{X}$, so that $P$ and $|\Psi|^2$ tend to become indistinguishable on a coarse-grained level. Like its classical analog, the theorem provides a general understanding of how equilibrium is approached, while not proving that equilibrium is actually reached. (And of course, for some simple systems – such as a particle in the ground state of a box, for which the velocity field $\nabla S/m$ vanishes – there is no relaxation at all.) A strict decrease of $\bar{H}(t)$ immediately after $t = 0$ is guaranteed if $\dot{X}_0 \cdot \nabla(P_0/|\Psi_0|^2)$ has nonzero spatial variance over a coarse-graining cell, as shown by Valentini in 1992 and 2001.

A relaxation timescale $\tau$ may be defined by $1/\tau^2 \equiv -(d^2\bar{H}/dt^2)_0/\bar{H}_0$. For a single particle with quantum energy spread $\Delta E$, a crude estimate given by Valentini in 2001 yields $\tau \sim (1/\varepsilon)\, \hbar^2/m^{1/2}(\Delta E)^{3/2}$, where $\varepsilon$ is the coarse-graining length. For wave functions that are superpositions of many energy eigenfunctions, the velocity field (generally) varies rapidly, and detailed numerical simulations (in two dimensions) show that relaxation occurs with an approximately exponential decay $\bar{H}(t) \approx \bar{H}_0 e^{-t/t_c}$, with a time constant $t_c$ of order $\tau$ (Valentini and Westman 2005).

Equilibrium is then to be expected for particles emerging from the violence of the big bang. The possibility is still open that relics from very early times may not have reached equilibrium before decoupling.

Nonlocal Signaling

We now show how nonequilibrium, if it were ever discovered, could be used for nonlocal signaling.

Pilot-wave dynamics is nonlocal. For a pair of particles A, B with entangled wave function $\Psi(x_A, x_B, t)$, the velocity $\dot{x}_A(t) = \nabla_A S(x_A, x_B, t)/m_A$ of A depends instantaneously on $x_B$, and local operations at B – such as switching on a potential – instantaneously affect the motion of A. For an ensemble $P(x_A, x_B, t) = |\Psi(x_A, x_B, t)|^2$, local operations at B have no statistical effect at A: the individual nonlocal effects vanish upon averaging over an equilibrium ensemble.

Nonlocality is (generally) hidden by statistical noise only in quantum equilibrium. If instead $P(x_A, x_B, 0) \neq |\Psi(x_A, x_B, 0)|^2$, a local change in the Hamiltonian at B generally induces an instantaneous change in the marginal $p_A(x_A, t) \equiv \int d^3x_B\, P(x_A, x_B, t)$ at A. For example, in one dimension a sudden change $\hat{H}_B \to \hat{H}'_B$ in the Hamiltonian at B induces a change $\Delta p_A \equiv p_A(x_A, t) - p_A(x_A, 0)$ (for small $t$) (Valentini 1991),

$$\Delta p_A = -\frac{t^2}{4m} \frac{\partial}{\partial x_A} \left( a(x_A) \int dx_B\, b(x_B)\, \frac{P(x_A, x_B, 0) - |\Psi(x_A, x_B, 0)|^2}{|\Psi(x_A, x_B, 0)|^2} \right) \qquad [8]$$

(Here $m_A = m_B = m$, $a(x_A)$ depends on $\Psi(x_A, x_B, 0)$, while $b(x_B)$ also depends on $\hat{H}'_B$ and vanishes if $\hat{H}'_B = \hat{H}_B$.) The signal is generally nonzero if $P_0 \neq |\Psi_0|^2$.

Nonlocal signals do not lead to causal paradoxes if, at the hidden-variable level, there is a preferred foliation of spacetime with a time parameter that defines a fundamental causal sequence. Such signals, if they were observed, would define an absolute simultaneity as discussed by Valentini in 1992 and 2005. Note that in pilot-wave field theory, Lorentz invariance emerges as a phenomenological symmetry of the equilibrium state, conditional on the structure of the field-theoretical Hamiltonian (as discussed by Bohm and Hiley in 1984, Bohm, Hiley and Kaloyerou in 1987, and Valentini in 1992 and 1996).

Subquantum Measurement

In principle, nonequilibrium particles could also be used to perform "subquantum measurements" on ordinary, equilibrium systems. We illustrate this with an exactly solvable one-dimensional model (Valentini 2002b).

Consider an apparatus "pointer" coordinate $y$, with known wave function $g_0(y)$ and known (ensemble) distribution $\pi_0(y) \neq |g_0(y)|^2$, where $\pi_0(y)$ has been deduced by statistical analysis of random samples from a parent population with known wave function $g_0(y)$. (We assume that relaxation may be neglected: for example, if $g_0$ is a box ground state, $\dot{y} = 0$ and $\pi_0(y)$ is static.) Consider also a "system" coordinate $x$ with known wave function $\psi_0(x)$ and known distribution $\rho_0(x) = |\psi_0(x)|^2$. If $\pi_0(y)$ is arbitrarily narrow, $x_0$ can be measured without
disturbing $\psi_0(x)$, to arbitrary accuracy (violating the uncertainty principle).

To do this, at $t = 0$ we switch on an interaction Hamiltonian $\hat{H} = a\hat{x}\hat{p}_y$, where $a$ is a constant and $p_y$ is canonically conjugate to $y$. For relatively large $a$, we may neglect the Hamiltonians of $x$ and $y$. For $\Psi = \Psi(x, y, t)$, we then have $\partial\Psi/\partial t = -ax\,\partial\Psi/\partial y$. For $|\Psi|^2$ we have the continuity equation $\partial|\Psi|^2/\partial t = -ax\,\partial|\Psi|^2/\partial y$, which implies the hidden-variable velocity fields $\dot{x} = 0$, $\dot{y} = ax$ and trajectories $x(t) = x_0$, $y(t) = y_0 + ax_0 t$.

The initial product $\Psi_0(x, y) = \psi_0(x) g_0(y)$ evolves into $\Psi(x, y, t) = \psi_0(x) g_0(y - axt)$. For $at \to 0$ (with $a$ large but fixed), $\Psi(x, y, t) \to \psi_0(x) g_0(y)$ and $\psi_0(x)$ is undisturbed: for small $at$, a standard quantum pointer with the coordinate $y$ would yield negligible information about $x_0$. Yet, for arbitrarily small $at$, the hidden-variable pointer coordinate $y(t) = y_0 + ax_0 t$ does contain complete information about $x_0$ (and $x(t) = x_0$). This "subquantum" information will be visible to us if $\pi_0(y)$ is sufficiently narrow.

For, over an ensemble of similar experiments, with initial joint distribution $P_0(x, y) = |\psi_0(x)|^2 \pi_0(y)$ (equilibrium for $x$ and nonequilibrium for $y$), the continuity equation $\partial P/\partial t = -ax\,\partial P/\partial y$ implies that $P(x, y, t) = |\psi_0(x)|^2 \pi_0(y - axt)$. If $\pi_0(y)$ is localized around $y = 0$ ($\pi_0(y) = 0$ for $|y| > w/2$), then a standard (faithful) measurement of $y$ with result $y_{\rm meas}$ will imply that $x$ lies in the interval $(y_{\rm meas}/at - w/2at,\; y_{\rm meas}/at + w/2at)$ (so that $P(x, y, t) \neq 0$). Taking the simultaneous limits $at \to 0$, $w \to 0$, with $w/at \to 0$, the midpoint $y_{\rm meas}/at \to x_0$ (since $y_{\rm meas} = y_0 + ax_0 t$ and $|y_0| \leq w/2$), while the error $w/2at \to 0$.

If $w$ is arbitrarily small, a sequence of such measurements will determine the hidden trajectory $x(t)$ without disturbing $\psi(x, t)$, to arbitrary accuracy.

Subquantum Information and Computation

From a hidden-variables perspective, immense physical resources are hidden from us by equilibrium statistical noise. Quantum nonequilibrium would probably be as useful technologically as thermal or chemical nonequilibrium.

Distinguishing nonorthogonal states In quantum theory, nonorthogonal states $|\psi_1\rangle$, $|\psi_2\rangle$ ($\langle\psi_1|\psi_2\rangle \neq 0$) cannot be distinguished without disturbing them. This theorem breaks down in quantum nonequilibrium (Valentini 2002b). For example, if $|\psi_1\rangle$, $|\psi_2\rangle$ are distinct states of a single spinless particle, then the associated de Broglie–Bohm velocity fields will in general be different, even if $\langle\psi_1|\psi_2\rangle \neq 0$, and so will the hidden-variable trajectories. Subquantum measurement of the trajectories could then distinguish the states $|\psi_1\rangle$, $|\psi_2\rangle$.

Breaking quantum cryptography The security of standard protocols for quantum key distribution depends on the validity of the laws of quantum theory. These protocols would become insecure given the availability of nonequilibrium systems (Valentini 2002b).

The protocols known as BB84 and B92 depend on the impossibility of distinguishing nonorthogonal quantum states without disturbing them. An eavesdropper in possession of nonequilibrium particles could distinguish the nonorthogonal states being transmitted between two parties, and so read the supposedly secret key. Further, if subquantum measurements allow an eavesdropper to predict quantum measurement outcomes at each "wing" of a (bipartite) entangled state, then the EPR (Einstein–Podolsky–Rosen) protocol also becomes insecure.

Subquantum computation It has been suggested that nonequilibrium physics would be computationally more powerful than quantum theory, because of the ability to distinguish nonorthogonal states (Valentini 2002b). However, this ability depends on the (less-than-quantum) dispersion $w$ of the nonequilibrium ensemble. A well-defined model of computational complexity requires that the resources be quantified in some way. Here, a key question is how the required $w$ scales with the size of the computational task. So far, no rigorous results are known.

Extension to All Deterministic Hidden-Variables Theories

Let us now discuss arbitrary (deterministic) theories.

Nonlocal signaling Consider a pair of two-state quantum systems A and B, which are widely separated and in the singlet state. Quantum measurements of observables $\hat{\sigma}_A \equiv \mathbf{m}_A \cdot \hat{\boldsymbol{\sigma}}_A$, $\hat{\sigma}_B \equiv \mathbf{m}_B \cdot \hat{\boldsymbol{\sigma}}_B$ (where $\mathbf{m}_A$, $\mathbf{m}_B$ are unit vectors in Bloch space and $\hat{\boldsymbol{\sigma}}_A$, $\hat{\boldsymbol{\sigma}}_B$ are Pauli spin operators) yield outcomes $\sigma_A, \sigma_B = \pm 1$, in the ratio 1 : 1 at each wing, with a correlation $\langle \hat{\sigma}_A \hat{\sigma}_B \rangle = -\mathbf{m}_A \cdot \mathbf{m}_B$. Bell's theorem shows that for a hidden-variables theory to reproduce this correlation – upon averaging over an equilibrium ensemble with distribution $\rho_{\rm QT}(\lambda)$ – it must take the nonlocal form

$$\sigma_A = \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda), \qquad \sigma_B = \sigma_B(\mathbf{m}_A, \mathbf{m}_B, \lambda) \qquad [9]$$

More precisely, to obtain $\langle \sigma_A \sigma_B \rangle_{\rm QT} = -\mathbf{m}_A \cdot \mathbf{m}_B$ (where $\langle \sigma_A \sigma_B \rangle_{\rm QT} \equiv \int d\lambda\, \rho_{\rm QT}(\lambda)\, \sigma_A \sigma_B$), at least one of
$\sigma_A$, $\sigma_B$ must depend on the measurement setting at the distant wing. Without loss of generality, we assume that $\sigma_A$ depends on $\mathbf{m}_B$.

For an arbitrary nonequilibrium ensemble with distribution $\rho(\lambda) \neq \rho_{\rm QT}(\lambda)$, in general $\langle \sigma_A \sigma_B \rangle \equiv \int d\lambda\, \rho(\lambda)\, \sigma_A \sigma_B$ differs from $-\mathbf{m}_A \cdot \mathbf{m}_B$, and the outcomes $\sigma_A, \sigma_B = \pm 1$ occur in a ratio different from 1 : 1. Further, a change of setting $\mathbf{m}_B \to \mathbf{m}'_B$ at B will generally induce a change in the outcome statistics at A, yielding a nonlocal signal at the statistical level. To see this, note that, in a nonlocal theory, the "transition sets"

$$T_A(-,+) \equiv \{\lambda \mid \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda) = -1,\; \sigma_A(\mathbf{m}_A, \mathbf{m}'_B, \lambda) = +1\}$$
$$T_A(+,-) \equiv \{\lambda \mid \sigma_A(\mathbf{m}_A, \mathbf{m}_B, \lambda) = +1,\; \sigma_A(\mathbf{m}_A, \mathbf{m}'_B, \lambda) = -1\}$$

cannot be empty for arbitrary settings. Yet, in quantum equilibrium, the outcomes $\sigma_A = \pm 1$ occur in the ratio 1 : 1 for all settings, so the transition sets must have equal equilibrium measure, $\mu_{\rm QT}[T_A(-,+)] = \mu_{\rm QT}[T_A(+,-)]$ ($d\mu_{\rm QT} \equiv \rho_{\rm QT}(\lambda)\,d\lambda$). That is, the fraction of the equilibrium ensemble making the transition $\sigma_A = -1 \to \sigma_A = +1$ under $\mathbf{m}_B \to \mathbf{m}'_B$ must equal the fraction making the reverse transition $\sigma_A = +1 \to \sigma_A = -1$. (This "detailed balancing" is analogous to the principle of detailed balance in statistical mechanics.) Since $T_A(-,+)$, $T_A(+,-)$ are fixed by the deterministic mapping, they are independent of the ensemble distribution $\rho(\lambda)$. Thus, for $\rho(\lambda) \neq \rho_{\rm QT}(\lambda)$, in general $\mu[T_A(-,+)] \neq \mu[T_A(+,-)]$ ($d\mu \equiv \rho(\lambda)\,d\lambda$): the fraction of the nonequilibrium ensemble making the transition $\sigma_A = -1 \to \sigma_A = +1$ will not in general balance the fraction making the reverse transition. The outcome ratio at A will then change under $\mathbf{m}_B \to \mathbf{m}'_B$ and there will be an instantaneous signal at the statistical level from B to A (Valentini 2002a).

Thus, in any deterministic hidden-variables theory, nonequilibrium distributions $\rho(\lambda) \neq \rho_{\rm QT}(\lambda)$ generally allow entanglement to be used for nonlocal signalling (just as, in ordinary statistical physics, differences of temperature make it possible to convert heat into work).

Experimental signature of nonequilibrium Quantum expectations are additive, $\langle c_1\hat{\Omega}_1 + c_2\hat{\Omega}_2 \rangle = c_1\langle\hat{\Omega}_1\rangle + c_2\langle\hat{\Omega}_2\rangle$, even for noncommuting observables ($[\hat{\Omega}_1, \hat{\Omega}_2] \neq 0$, with $c_1$, $c_2$ real). As emphasized by Bell in 1966, this seemingly trivial consequence of the (linearity of the) Born rule $\langle\hat{\Omega}\rangle = \mathrm{tr}(\hat{\rho}\hat{\Omega})$ is remarkable because it relates statistics from distinct, "incompatible" experiments. In nonequilibrium, such additivity generically breaks down (Valentini 2004b).

Further, for a two-state system with observables $\mathbf{m} \cdot \hat{\boldsymbol{\sigma}}$, the "dot-product" structure of the quantum expectation $\langle \mathbf{m} \cdot \hat{\boldsymbol{\sigma}} \rangle = \mathrm{tr}(\hat{\rho}\, \mathbf{m} \cdot \hat{\boldsymbol{\sigma}}) = \mathbf{m} \cdot \mathbf{P}$ (for some Bloch vector $\mathbf{P}$) is equivalent to expectation additivity (Valentini 2004b). Nonadditive expectations then provide a convenient signature of nonequilibrium for any two-state system. For example, the sinusoidal modulation of the quantum transmission probability for a single photon through a polarizer

$$p^+_{\rm QT}(\Theta) = \tfrac{1}{2}\left(1 + \langle \mathbf{m} \cdot \hat{\boldsymbol{\sigma}} \rangle\right) = \tfrac{1}{2}\left(1 + P \cos 2\Theta\right) \qquad [10]$$

(where an angle $\theta$ on the Bloch sphere corresponds to a physical angle $\Theta = \theta/2$) will generically break down in nonequilibrium. Deviations from [10] would provide an unambiguous violation of quantum theory (Valentini 2004b).

Such deviations were searched for by Papaliolios in 1967, using laboratory photons and successive polarization measurements over very short times, to test a hidden-variables theory (distinct from pilot-wave theory) due to Bohm and Bub (1966), in which quantum measurements generate nonequilibrium for short times. Experimentally, successive measurements over timescales $\sim 10^{-13}$ s agreed with the (quantum) sinusoidal modulation $\cos^2\Theta$ to $\lesssim 1\%$. Similar tests might be performed with photons of a more exotic origin.

Continuous Spontaneous Localization Model (CSL)

The basic postulate of CSL is that the state vector $|\psi, t\rangle$ represents reality. Since, for example, in describing a measurement, the usual Schrödinger evolution readily takes a real state into a nonreal state, that is, into a superposition of real states (such as apparatus states describing different experimental outcomes), CSL requires a modification of Schrödinger's evolution. To the Hamiltonian is added a term which depends upon a classical randomly fluctuating field $w(x, t)$ and a mass-density operator $\hat{A}(x, t)$. This term acts to collapse a superposition of states, which differ in their spatial distribution of mass density, to one of these states. The rate of collapse is very slow for a superposition involving a few particles, but very fast for a superposition of macroscopically different states. Thus, very rapidly, what you see (in nature) is what you get (from the theory). Each state vector evolving under each $w(x, t)$ corresponds to a realizable state, and a rule is given for how to associate a probability with each. In this way, an
To describe collapse to a joint eigenstate of a set of mutually commuting operators $\hat{A}^r$, replace $(4\lambda)^{-1}[w(t') - 2\lambda\hat{A}]^2$ in the exponent of [12] by $\sum_r (4\lambda)^{-1}[w^r(t') - 2\lambda\hat{A}^r]^2$. The interaction picture state vector in this case is [12] multiplied by $\exp(i\hat{H}t)$:

$$|\psi, t\rangle_w = \mathcal{T} \exp\left( -(4\lambda)^{-1} \int_0^t dt' \sum_r [w^r(t') - 2\lambda\hat{A}^r(t')]^2 \right) |\psi, 0\rangle \qquad [15]$$

where $\hat{A}^r(t') \equiv \exp(i\hat{H}t')\, \hat{A}^r \exp(-i\hat{H}t')$. The density matrix follows from [15], and [13]:

$$\hat{\rho}(t) \equiv \int P_t(w)\, Dw\, \frac{|\psi, t\rangle_w\, {}_w\langle\psi, t|}{P_t(w)} = \mathcal{T} \exp\left( -\frac{\lambda}{2} \int_0^t dt' \sum_r [\hat{A}^r_L(t') - \hat{A}^r_R(t')]^2 \right) \hat{\rho}(0) \qquad [16]$$

where $\hat{A}^r_L(t')$ ($\hat{A}^r_R(t')$) appears to the left (right) of $\hat{\rho}(0)$, and is time-ordered (time reverse-ordered). In the example described by [14], the density matrix [16] is

$$\hat{\rho}(t) = \sum_{n,m} e^{-(\lambda t/2)(a_n - a_m)^2}\, \alpha_n \alpha_m^*\, |a_n\rangle\langle a_m|$$

which encapsulates the ensemble's collapse behavior.

CSL

The CSL proposal (Pearle 1989) is that collapse is engendered by distinctions between states at each point of space, so the index $r$ of $\hat{A}^r$ in [15] becomes $x$,

$$|\psi, t\rangle_w = \mathcal{T} \exp\left( -(4\lambda)^{-1} \int_0^t dt' \int dx'\, [w(x', t') - 2\lambda\hat{A}(x', t')]^2 \right) |\psi, 0\rangle \qquad [17]$$

and the distinction looked at is mass density. However, one cannot make the choice $\hat{A}(x, 0) = \hat{M}(x)$, where $\hat{M}(x) = \sum_i m_i\, \hat{\xi}_i^\dagger(x)\, \hat{\xi}_i(x)$ is the mass-density operator ($m_i$ is the mass of the $i$th type of particle, so $m_e$, $m_p$, $m_n$, ... are the masses, respectively, of electrons, protons, neutrons, ..., and $\hat{\xi}_i^\dagger(x)$ is the creation operator for such a particle at location $x$), because this entails an infinite rate of energy increase of particles ([23] with $a = 0$). Instead, adapting a "Gaussian smearing" idea from the Ghirardi et al. (1986) spontaneous localization (SL) model (see the subsection "Spontaneous localization model"), choose $\hat{A}(x)$ as, essentially, proportional to the mass in a sphere of radius $a$ about $x$:

$$\hat{A}(x, t) \equiv e^{i\hat{H}t}\, \frac{1}{(\pi a^2)^{3/4}} \int dz\, \frac{\hat{M}(z)}{m_p}\, e^{-(2a^2)^{-1}(x - z)^2}\, e^{-i\hat{H}t} \qquad [18]$$

The parameter value choices of SL, $\lambda \approx 10^{-16}\ \mathrm{s}^{-1}$ (according to [17] and [18], the collapse rate for protons) and $a \approx 10^{-5}$ cm are, so far, consistent with experiment (see the next subsection), and will be adopted here.

The density matrix associated with [17] is, as in [16],

$$\hat{\rho}(t) = \mathcal{T} \exp\left( -\frac{\lambda}{2} \int_0^t dt' \int dx'\, [\hat{A}_L(x', t') - \hat{A}_R(x', t')]^2 \right) \hat{\rho}(0) \qquad [19]$$

which satisfies the differential equation

$$\frac{d\hat{\rho}(t)}{dt} = -\frac{\lambda}{2} \int dx'\, [\hat{A}(x', t), [\hat{A}(x', t), \hat{\rho}(t)]] \qquad [20]$$

of Lindblad–Kossakowski form.

Consequences of CSL

Since the state vector dynamics of CSL is different from that of standard quantum theory, there are phenomena for which the two make different predictions, allowing for experimental tests. Consider an $N$-particle system with position operators $\hat{X}_i$ ($\hat{X}_i|x\rangle = x_i|x\rangle$). Substitution of $\hat{A}(x')$ from [18] in the Schrödinger picture version of [20], integration over $x'$, and utilization of

$$f(z)\hat{M}(z)|x\rangle = \sum_{i=1}^{N} m_i\, f(\hat{X}_i)\, \delta(z - \hat{X}_i)|x\rangle$$

results in

$$\frac{d\hat{\rho}(t)}{dt} = i[\hat{\rho}(t), \hat{H}] - \frac{\lambda}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{m_i}{m_p} \frac{m_j}{m_p} \left[ e^{-(4a^2)^{-1}(\hat{X}_{Li} - \hat{X}_{Lj})^2} + e^{-(4a^2)^{-1}(\hat{X}_{Ri} - \hat{X}_{Rj})^2} - 2e^{-(4a^2)^{-1}(\hat{X}_{Li} - \hat{X}_{Rj})^2} \right] \hat{\rho}(t) \qquad [21]$$

which is a useful form for calculations first suggested by Pearle and Squires in 1994.

Interference Consider the collapse rate of an initial state $|\psi\rangle = \alpha_1|1\rangle + \alpha_2|2\rangle$, where $|1\rangle$, $|2\rangle$ describe a
clump of matter, of size $\ll a$, at different locations with separation $\gg a$. Electrons may be neglected because of their small collapse rate compared to the much more massive nucleons, and the nucleon mass difference may be neglected. In using [21] to calculate $d\langle 1|\hat{\rho}(t)|2\rangle/dt$, since $\exp[-(4a^2)^{-1}(\hat{X}_i - \hat{X}_j)^2] \approx 1$ when acting on state $|1\rangle$ or $|2\rangle$, and $\approx 0$ when $\hat{X}_i$ acts on $|1\rangle$ and $\hat{X}_j$ acts on $|2\rangle$, [21] yields, for $N$ nucleons, the collapse rate $\lambda N^2$:

$$\frac{d\langle 1|\hat{\rho}(t)|2\rangle}{dt} = i\langle 1|[\hat{\rho}(t), \hat{H}]|2\rangle - \lambda N^2 \langle 1|\hat{\rho}(t)|2\rangle \qquad [22]$$

If the clump undergoes a two-slit interference experiment, where the size and separation conditions above are satisfied for a time $T$, and if the result agrees with the standard quantum theory prediction to 1%, it also agrees with CSL provided $\lambda^{-1} > 100 N^2 T$. So far, interference experiments with $N$ as large as $\approx 10^3$ have been performed, by Nairz, Arndt, and Zeilinger in 2000. The SL value of $\lambda^{-1} \approx 10^{16}$ would be testable, that is, the quantum-predicted interference pattern would be "washed out" to 1% accuracy, if the clump were an $\approx 10^{-6}$ cm radius sphere of mercury, which contains $N \approx 10^8$ nucleons, interfered for $T = 0.01$ s. Currently envisioned but not yet performed experiments (e.g., by Marshall, Simon, Penrose, and Bouwmeester in 2003) have been analyzed (e.g., by Bassi, Ippoliti, and Adler in 2004 and by Adler in 2005), which involve a superposition of a larger clump of matter in slightly displaced positions, entangled with a photon whose interference pattern is measured: these proposed experiments are still too crude to detect the SL value of $\lambda$, or the gravitationally based collapse rate proposed by Penrose in 1996 (see the next section and papers by Christian in 1999 and 2005).

Bound state excitation Collapse narrows wave packets, thereby imparting energy to particles. If $\hat{H} = \sum_{i=1}^{N} \hat{P}_i^2/2m_i + \hat{V}(x_1, \ldots, x_N)$, it is straightforward to calculate from [21] that

$$\frac{d}{dt}\langle\hat{H}\rangle \equiv \frac{d}{dt}\, \mathrm{tr}[\hat{H}\hat{\rho}(t)] = \sum_{i=1}^{N} \frac{3\lambda\hbar^2}{4m_i a^2} \qquad [23]$$

For a nucleon, the mean rate of energy increase is quite small, $\approx 3 \times 10^{-25}\ \mathrm{eV\,s^{-1}}$. However, deviations from the mean can be significantly greater.

Equation [21] predicts excitation of atoms and nuclei. Let $|E_0\rangle$ be an initial bound energy eigenstate. Expanding [21] in a power series in (bound state size/$a$)$^2$, the excitation rate of state $|E\rangle$ is

$$\left.\frac{d\langle E|\hat{\rho}(t)|E\rangle}{dt}\right|_{t=0} = \frac{\lambda}{2a^2} \left\langle E \left| \sum_{i=1}^{N} \frac{m_i}{m_p}\hat{X}_i \right| E_0 \right\rangle \cdot \left\langle E_0 \left| \sum_{i=1}^{N} \frac{m_i}{m_p}\hat{X}_i \right| E \right\rangle + O(\mathrm{size}/a)^4 \qquad [24]$$

Since $|E_0\rangle$, $|E\rangle$ are eigenstates of the center-of-mass operator $\sum_{i=1}^{N} m_i\hat{X}_i / \sum_{i=1}^{N} m_i$ with eigenvalue 0, the dipole contribution explicitly given in [24] vanishes identically. This leaves the quadrupole contribution as the leading term, which is too small to be measured at present.

However, the choice of $\hat{A}(x)$ as mass-density operator was made only after experimental indication. Let $g_i$ replace $m_i/m_p$ in [21] and [24], so that $\lambda g_i^2$ is the collapse rate for the $i$th particle. Then, experiments looking for the radiation expected from "spontaneously" excited atoms and nuclei, in large amounts of matter for a long time, as shown by Collett, Pearle, Avignone, and Nussinov in 1995, Pearle, Ring, Collar, and Avignone in 1999, and Jones, Pearle, and Ring in 2004, have placed the following limits:

$$\left| \frac{g_e}{g_p} - \frac{m_e}{m_p} \right| < \frac{12 m_e}{m_p}, \qquad \left| \frac{g_n}{g_p} - \frac{m_n}{m_p} \right| < \frac{3(m_n - m_p)}{m_p}$$

Random walk According to [17] and [13], the center-of-mass wave packet, of a piece of matter of size $a$ or smaller, containing $N$ nucleons, achieves equilibrium size $s$ in a characteristic time $\tau_s$, and undergoes a random walk through a root-mean-square distance $\Delta Q$:

$$s \approx \left( \frac{a^2\hbar}{\lambda m_p N^3} \right)^{1/4}, \qquad \tau_s \approx \frac{N m_p s^2}{\hbar}, \qquad \Delta Q \approx \frac{\hbar \lambda^{1/2} t^{3/2}}{m_p a} \qquad [25]$$

The results in [25] were obtained by Collett and Pearle in 2003. These quantitative results can be qualitatively understood as follows.

In time $t$, the usual Schrödinger equation expands a wave packet of size $s$ to $\approx s + (\hbar/Nm_p s)t$. CSL collapse, by itself, narrows the wave packet to $\approx s[1 - \lambda N^2(s/a)^2 t]$. The condition of no change in $s$ is the result quoted above. $\tau_s$ is the time it takes the Schrödinger evolution to expand a wave packet near size $s$ to size $\approx s$: $(\hbar/Nm_p s)\tau_s \approx s$.
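As a rough numerical check of the estimates quoted in this section, the following Python sketch evaluates the collapse rate $\lambda N^2$ of [22], the single-particle heating rate of [23], and the random-walk quantities of [25], using the SL parameter values $\lambda \approx 10^{-16}\ \mathrm{s}^{-1}$ and $a \approx 10^{-5}$ cm. The physical constants are standard CGS values supplied here as assumptions, and the function names are purely illustrative; order-unity factors omitted from [25] are omitted here too.

```python
import math

# Standard CGS constants (assumed values, not taken from the article).
HBAR = 1.0546e-27        # reduced Planck constant, erg s
M_P = 1.6726e-24         # proton mass, g
EV_PER_ERG = 6.2415e11   # conversion factor, eV per erg

# SL/CSL parameter choices quoted in the text.
LAMBDA = 1e-16           # collapse rate for a proton, s^-1
A = 1e-5                 # smearing length a, cm


def collapse_rate(n_nucleons):
    """Decay rate lambda * N^2 of the off-diagonal element in [22], in s^-1."""
    return LAMBDA * n_nucleons ** 2


def heating_rate_ev(mass_g):
    """Mean energy gain 3*lambda*hbar^2/(4*m*a^2) of one particle, from [23], in eV/s."""
    return 3 * LAMBDA * HBAR ** 2 / (4 * mass_g * A ** 2) * EV_PER_ERG


def equilibrium_size(n_nucleons):
    """Equilibrium wave-packet size s from [25], in cm."""
    return (A ** 2 * HBAR / (LAMBDA * M_P * n_nucleons ** 3)) ** 0.25


def relaxation_time(n_nucleons):
    """Characteristic time tau_s = N*m_p*s^2/hbar from [25], in s."""
    s = equilibrium_size(n_nucleons)
    return n_nucleons * M_P * s ** 2 / HBAR


def rms_walk(t_seconds):
    """Root-mean-square random-walk distance from [25], in cm."""
    return HBAR * math.sqrt(LAMBDA) * t_seconds ** 1.5 / (M_P * A)


if __name__ == "__main__":
    n = 1e8  # nucleon number of the mercury sphere discussed above
    s = equilibrium_size(n)
    # Qualitative argument: Schrödinger spreading balances CSL narrowing at size s.
    spreading = HBAR / (n * M_P * s)                  # cm/s
    narrowing = s * collapse_rate(n) * (s / A) ** 2   # cm/s
    print("fractional washout over 0.01 s:", collapse_rate(n) * 0.01)
    print("nucleon heating rate (eV/s):", heating_rate_ev(M_P))
    print("equilibrium size s (cm):", s)
    print("tau_s (s):", relaxation_time(n))
    print("spreading vs narrowing (cm/s):", spreading, narrowing)
    print("rms walk after 1 s (cm):", rms_walk(1.0))
```

The spreading and narrowing rates printed at the end agree by construction, which is just the "no change in $s$" condition used to obtain the first expression in [25].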
This "hydrodynamical" interpretation suffers from many difficulties, especially for many-body systems. In any case, a criticism by Wallstrom (1994) seems decisive: [26] and [27] (and their higher-dimensional analogs) are not, in fact, equivalent to the Schrödinger equation. For, as usually understood, the quantum wave function $\psi$ is a single-valued and continuous complex field, which typically possesses nodes ($\psi = 0$), in the neighborhood of which the phase $S$ is multivalued, with values differing by integral multiples of $2\pi\hbar$. If one allows $S$ in [26], [27] to be multivalued, there is no reason why the allowed values should differ by integral multiples of $2\pi\hbar$, and in general $\psi$ will not be single-valued. On the other hand, if one restricts $S$ in [26], [27] to be single-valued, one will exclude wave functions – such as those of nonzero angular momentum – with a multivalued phase. (This problem does not exist in pilot-wave theory as we have presented it here, where $\psi$ is regarded as a basic entity.)

Stochastic mechanics, introduced by Fényes in 1952 and Nelson (1966), has particle trajectories $x(t)$ obeying a "forward" stochastic differential equation $dx(t) = b(x(t), t)\,dt + dw(t)$, where $b$ is a drift (equal to the mean forward velocity) and $w$ a Wiener process, and also a similar "backward" equation. Defining the "current velocity" $v = (1/2)(b + b_*)$, where $b_*$ is the mean backward velocity, and using an appropriate time-symmetric definition of mean acceleration, one may impose a stochastic version of Newton's second law. If one assumes, in addition, that $v$ is a gradient ($v = \nabla S/m$ for some $S$), then one obtains [26], [27] with $R \equiv \sqrt{\rho}$, where $\rho$ is the particle density. Defining $\psi \equiv \sqrt{\rho}\, e^{(i/\hbar)S}$, it appears that one recovers the Schrödinger equation for the derived quantity $\psi$. However, again, there is no reason why $S$ should have the specific multivalued structure required for the phase of a single-valued complex field. It then seems that, despite appearances, quantum theory cannot in fact be recovered from stochastic mechanics (Wallstrom 1994). The same problem occurs in models that use stochastic mechanics as an intermediate step (e.g., Markopoulou and Smolin in 2004): the Schrödinger equation is obtained only for exceptional, nodeless wave functions.

Bohm and Bub (1966) first proposed dynamical wave-function collapse through deterministic evolution. Their collapse outcome is determined by the value of a Wiener–Siegel hidden variable (a variable distributed uniformly over the unit hypersphere in a Hilbert space identical to that of the state vector). In 1976, Pearle proposed dynamical wave-function collapse equations where the collapse outcome is determined by a random variable, and suggested (Pearle 1979) that the modified Schrödinger equation be formulated as an Itô stochastic differential equation, a suggestion which has been widely followed. (The equation for the state vector given here, which is physically more transparent, has its time derivative equivalent to a Stratonovich stochastic differential equation, which is readily converted to the Itô form.) The importance of requiring that the density matrix describing collapse be of the Lindblad–Kossakowski form was emphasized by Gisin in 1984 and Diosi in 1988. The stochastic differential Schrödinger equation that achieves this was found independently by Diosi in 1988 and by Belavkin, Gisin, and Pearle in separate papers in 1989 (see Ghirardi et al. 1990).

A gravitationally motivated stochastic collapse dynamics was proposed by Diosi in 1989 (and somewhat corrected by Ghirardi et al. in 1990). Penrose emphasized in 1996 that a quantum state, such as that describing a mass in a superposition of two places, puts the associated spacetime geometry also in a superposition, and has argued that this should lead to wave-function collapse. He suggests that the collapse time should be $\sim \hbar/\Delta E$, where $\Delta E$ is the gravitational potential energy change obtained by actually displacing two such masses: for example, the collapse time $\approx \hbar/(Gm^2/R)$, where the mass is $m$, its size is $R$, and the displacement is $\approx R$ or larger. No specific dynamics is offered, just the vision that this will be a property of a correct future quantum theory of gravity.

Collapse to energy eigenstates was first proposed by Bedford and Wang in 1975 and 1977 and, in the context of stochastic collapse (e.g., [11] with $\hat{A} = \hat{H}$), by Milburn in 1991 and Hughston in 1996, but it has been argued by Finkelstein in 1993 and Pearle in 2004 that such energy-driven collapse cannot give a satisfactory picture of the macroscopic world. Percival in 1995 and in a 1998 book, and Fivel in 1997 have discussed energy-driven collapse for microscopic situations.

Adler (2004) has presented a classical theory (a hidden-variables theory) from which it is argued that quantum theory "emerges" at the ensemble level. The classical variables are $N \times N$ matrix field amplitudes at points of space. They obey appropriate classical Hamiltonian dynamical equations which he calls "trace dynamics," since the expressions for Hamiltonian, Lagrangian, Poisson bracket, etc., have the form of the trace of products of matrices and their sums with constant coefficients. Using classical statistical mechanics, canonical ensemble averages of (suitably projected) products of fields are analyzed and it is argued that they obey all the properties associated with Wightman functions, from which quantum field theory, and its nonrelativistic-limit quantum mechanics, may be derived. As well as obtaining the algebra of quantum theory in this way,
276 Quantum Mechanics: Weak Measurements
it is argued that statistical fluctuations around the canonical ensemble can give rise to the behavior of wave-function collapse, of the kind discussed here, both energy-driven and CSL-type mass-density-driven collapse so that, with the latter, comes the Born probability interpretation of the algebra. The Hamiltonian needed for this theory to work is not provided but, as the argument progresses, its necessary features are delimited.

See also: Quantum Mechanics: Foundations.

Further Reading

Adler SL (2004) Quantum Theory as an Emergent Phenomenon. Cambridge: Cambridge University Press.
Bassi A and Ghirardi GC (2003) Dynamical reduction models. Physics Reports 379: 257–426 (ArXiv: quant-ph/0302164).
Bell JS (1987) Speakable and Unspeakable in Quantum Mechanics. Cambridge: Cambridge University Press.
Bohm D (1952) A suggested interpretation of the quantum theory in terms of ‘hidden’ variables. I and II. Physical Review 85: 166–179 and 180–193.
Bohm D and Bub J (1966) A proposed solution of the measurement problem in quantum mechanics by a hidden variable theory. Reviews of Modern Physics 38: 453–469.
Dürr D, Goldstein S, and Zanghì N (2003) Quantum equilibrium and the role of operators as observables in quantum theory, ArXiv: quant-ph/0308038.
Ghirardi G, Rimini A, and Weber T (1986) Unified dynamics for microscopic and macroscopic systems. Physical Review D 34: 470–491.
Ghirardi G, Pearle P, and Rimini A (1990) Markov processes in Hilbert space and continuous spontaneous localization of systems of identical particles. Physical Review A 42: 78–89.
Holland PR (1993) The Quantum Theory of Motion: An Account of the de Broglie–Bohm Causal Interpretation of Quantum Mechanics. Cambridge: Cambridge University Press.
Nelson E (1966) Derivation of the Schrödinger equation from Newtonian mechanics. Physical Review 150: 1079–1085.
Pearle P (1979) Toward explaining why events occur. International Journal of Theoretical Physics 18: 489–518.
Pearle P (1989) Combining stochastic dynamical state-vector reduction with spontaneous localization. Physical Review A 39: 2277–2289.
Pearle P (1999) Collapse models. In: Petruccione F and Breuer HP (eds.) Open Systems and Measurement in Relativistic Quantum Theory, pp. 195–234. Heidelberg: Springer. (ArXiv: quant-ph/9901077).
Valentini A (1991) Signal-locality, uncertainty, and the subquantum H-theorem. I and II. Physics Letters A 156: 5–11 and 158: 1–8.
Valentini A (2002a) Signal-locality in hidden-variables theories. Physics Letters A 297: 273–278.
Valentini A (2002b) Subquantum information and computation. Pramana – Journal of Physics 59: 269–277 (ArXiv: quant-ph/0203049).
Valentini A (2004a) Black holes, information loss, and hidden variables, ArXiv: hep-th/0407032.
Valentini A (2004b) Universal signature of non-quantum systems. Physics Letters A 332: 187–193 (ArXiv: quant-ph/0309107).
Valentini A and Westman H (2005) Dynamical origin of quantum probabilities. Proceedings of the Royal Society of London Series A 461: 253–272.
Wallstrom TC (1994) Inequivalence between the Schrödinger equation and the Madelung hydrodynamic equations. Physical Review A 49: 1613–1617.
conclusions on postselected ensembles. Weak measurements have been instrumental in the interpretation of time-continuous quantum measurements on single states as well. Yet, weak measurement itself can properly be illuminated in the context of classical statistics. Classical weak measurement as well as postselection and time-continuous measurement are straightforward concepts leading to conclusions that are natural in classical statistics. In quantum context, the case is radically different and certain paradoxical conclusions follow from weak measurements. Therefore, we first introduce the classical notion of weak measurement on postselected ensembles and, alternatively, in time-continuous measurement on a single state. Certain idioms from statistical physics will be borrowed and certain not genuinely quantum notions from quantum theory will be anticipated. The quantum counterpart of weak measurement, postselection, and continuous measurement will be presented afterwards. The apparent redundancy of the parallel presentations is deliberate: the reader can separate what is common in classical and quantum weak measurements from what is genuinely quantum.

Classical Weak Measurement

Given a normalized probability density $\rho(X)$ over the phase space $\{X\}$, which we call the state, the mean value of a real function $A(X)$ is defined as

$$\langle A\rangle_\rho =: \int dX\, A\rho \qquad [3]$$

Let the outcome of an (unbiased) measurement of $A$ be denoted by $a$. Its stochastic expectation value $E[a]$ coincides with the mean [3]:

$$E[a] = \langle A\rangle_\rho \qquad [4]$$

respectively. Here $G_\sigma$ is the central Gaussian distribution of variance $\sigma^2$. Note that, as expected, eqn [5] implies eqn [4]. Nonzero $\sigma$ means that the measurement is nonideal, yet the expectation value $E[a]$ remains calculable reliably if the statistics $N$ is suitably large.

Suppose the spread of $A$ in state $\rho$ is finite:

$$\Delta_\rho^2 A =: \langle A^2\rangle_\rho - \langle A\rangle_\rho^2 < \infty \qquad [7]$$

Weak measurement will be defined in the asymptotic limit (eqns [8] and [9]) where both the stochastic error of the measurement and the measurement statistics go to infinity. It is crucial that their rate is kept constant:

$$\sigma, N \to \infty \qquad [8]$$

$$\Delta^2 =: \frac{\sigma^2}{N} = \text{const.} \qquad [9]$$

Obviously for asymptotically large $\sigma$, the precision of individual measurements becomes extremely weak. This incapacity is fully compensated by the asymptotically large statistics $N$. In the weak measurement limit (eqns [8] and [9]), the probability distribution $p_w$ of the arithmetic mean $\bar a$ of the $N$ independent outcomes converges to a Gaussian distribution:

$$p_w(\bar a) \to G_\Delta\big(\bar a - \langle A\rangle_\rho\big) \qquad [10]$$

The Gaussian is centered at the mean $\langle A\rangle_\rho$, and the variance of the Gaussian is given by the constant rate [9]. Consequently, the mean [3] is reliably calculable on a statistics $N$ growing like $\sigma^2$.

With an eye on quantum theory, we consider two situations of weak measurement in classical statistics: postselection and time-continuous measurement.
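The weak measurement limit of eqns [8]–[10] can be illustrated with a small Monte Carlo sketch. It is not part of the original argument: the two-valued state (A = +1 with probability 0.7, A = -1 otherwise), the function name `weak_measurement_means` and all numerical parameters are assumptions chosen for illustration. Each run averages N noisy outcomes whose error σ is tied to N through eqn [9], and the sketch then looks at how the averages scatter over many runs.

```python
import math
import random
import statistics


def weak_measurement_means(p_plus=0.7, delta=0.5, n_stat=200, trials=1000, seed=0):
    """Return (mean, spread) of the arithmetic means a-bar over repeated runs.

    Hypothetical state: A = +1 with probability p_plus, A = -1 otherwise,
    so that <A> = 2 * p_plus - 1.
    """
    rng = random.Random(seed)
    sigma = delta * math.sqrt(n_stat)  # eqn [9]: Delta^2 = sigma^2 / N
    means = []
    for _ in range(trials):
        outcomes = []
        for _ in range(n_stat):
            A = 1.0 if rng.random() < p_plus else -1.0  # draw X from the state
            outcomes.append(A + rng.gauss(0.0, sigma))  # unbiased noisy outcome a
        means.append(statistics.fmean(outcomes))
    return statistics.fmean(means), statistics.pstdev(means)


mean_of_means, spread = weak_measurement_means()
print(f"mean of a-bar = {mean_of_means:.3f}, spread of a-bar = {spread:.3f}")
```

Under these assumptions the averages should cluster around ⟨A⟩ = 0.4 with a spread close to Δ = 0.5, although each single outcome carries an error σ of about 7. At finite N the spread sits slightly above Δ, because the intrinsic spread of A adds a contribution of order 1/N to the variance.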
with probability $1-\pi$. Then the coincidence of $E[a]$ and ${}_\Pi\langle A\rangle_\rho$, as in eqn [4], remains valid:

$$E[a] = {}_\Pi\langle A\rangle_\rho \qquad [12]$$

Therefore, a large ensemble of postselected states allows one to estimate the postselected mean ${}_\Pi\langle A\rangle_\rho$.

Classical postselection allows introducing the effective postselected state:

$$\rho_\Pi =: \frac{\Pi\rho}{\langle\Pi\rangle_\rho} \qquad [13]$$

Then the postselected mean [11] of $A$ in state $\rho$ can, by eqn [14], be expressed as the common mean of $A$ in the effective postselected state $\rho_\Pi$:

$${}_\Pi\langle A\rangle_\rho = \langle A\rangle_{\rho_\Pi} \qquad [14]$$

As we shall see later, quantum postselection is more subtle and cannot be reduced to common statistics, that is, to that without postselection. The quantum counterpart of postselected mean does not exist unless we combine postselection and weak measurement.

Time-Continuous Measurement

For time-continuous measurement, one abandons the ensemble of identical states. One supposes that a single time-dependent state $\rho_t$ is undergoing an infinite sequence of measurements (eqns [5] and [6]) of $A$ employed at times $t = \delta t$, $t = 2\delta t$, $t = 3\delta t$, .... The rate $\nu =: 1/\delta t$ goes to infinity together with the mean squared error $\sigma^2$. Their rate is kept constant:

$$\sigma, \nu \to \infty \qquad [15]$$

$$g^2 =: \frac{\sigma^2}{\nu} = \text{const.} \qquad [16]$$

In the weak measurement limit (eqns [15] and [16]), the infinitely frequent weak measurements of $A$ constitute the model of time-continuous measurement. Even the weak measurements will significantly influence the original state $\rho_0$, due to the accumulated effect of the infinitely many Bayesian updates [6]. The resulting theory of time-continuous measurement is described by coupled Gaussian processes [17] and [18] for the primitive function $\alpha_t$ of the time-dependent measurement outcome and, respectively, for the time-dependent Bayesian conditional state $\rho_t$:

$$d\alpha_t = \langle A\rangle_{\rho_t}\,dt + g\,dW_t \qquad [17]$$

$$d\rho_t = g^{-1}\big(A - \langle A\rangle_{\rho_t}\big)\rho_t\,dW_t \qquad [18]$$

Here $dW_t$ is the Itô differential of the Wiener process.

Equations [17] and [18] are the special case of the Kushner–Stratonovich equations of time-continuous Bayesian inference conditioned on the continuous measurement of $A$ yielding the time-dependent outcome value $a_t$. Formal time derivatives of both sides of eqn [17] yield the heuristic equation

$$a_t = \langle A\rangle_{\rho_t} + g\,\xi_t \qquad [19]$$

Accordingly, the current measurement outcome is always equal to the current mean plus a term proportional to standard white noise $\xi_t$. This plausible feature of the model survives in the quantum context as well. As for the other equation [18], it describes the gradual concentration of the distribution $\rho_t$ in such a way that the variance $\Delta^2_{\rho_t}A$ tends to zero while $\langle A\rangle_{\rho_t}$ tends to a random asymptotic value. The details of the convergence depend on the character of the continuously measured function $A(X)$. Consider a stepwise $A(X)$:

$$A(X) = \sum_\lambda a^\lambda P^\lambda(X) \qquad [20]$$

The real values $a^\lambda$ are step heights all differing from each other. The indicator functions $P^\lambda$ take values 0 or 1 and form a complete set of pairwise disjoint functions on the phase space:

$$\sum_\lambda P^\lambda \equiv 1 \qquad [21]$$

$$P^\lambda P^\mu = \delta_{\lambda\mu} P^\lambda \qquad [22]$$

In a single ideal measurement of $A$, the outcome $a$ is one of the $a^\lambda$'s singled out at random. The probability distribution of the measurement outcome and the corresponding Bayesian update of the state are given by

$$p^\lambda = \langle P^\lambda\rangle_{\rho_0} \qquad [23]$$

$$\rho_0 \to \frac{1}{p^\lambda} P^\lambda \rho_0 =: \rho^\lambda \qquad [24]$$

respectively. Equations [17] and [18] of time-continuous measurement are a connatural time-continuous resolution of the “sudden” ideal measurement (eqns [23] and [24]) in a sense that they reproduce it in the limit $t \to \infty$. The states $\rho^\lambda$ are trivial stationary states of eqn [18]. It can be shown that they are indeed approached with probability $p^\lambda$ for $t \to \infty$.

Quantum Weak Measurement

In quantum theory, states in a given complex Hilbert space $\mathcal{H}$ are represented by non-negative density operators $\hat\rho$, normalized by $\mathrm{tr}\,\hat\rho = 1$. Like the
classical states $\rho$, the quantum state $\hat\rho$ is interpreted statistically, referring to an ensemble of states with the same $\hat\rho$. Given a Hermitian operator $\hat A$, called observable, its theoretical mean value in state $\hat\rho$ is defined by

$$\langle\hat A\rangle_{\hat\rho} = \mathrm{tr}(\hat A\hat\rho) \qquad [25]$$

Let the outcome of an (unbiased) quantum measurement of $\hat A$ be denoted by $a$. Its stochastic expectation value $E[a]$ coincides with the mean [25]:

$$E[a] = \langle\hat A\rangle_{\hat\rho} \qquad [26]$$

Performing a large number $N$ of independent measurements of $\hat A$ on the elements of the ensemble of identically prepared states, the arithmetic mean $\bar a$ of the outcomes yields a reliable estimate of $E[a]$ and, this way, of the theoretical mean $\langle\hat A\rangle_{\hat\rho}$. If the measurement outcome $a$ contains a Gaussian stochastic error of standard dispersion $\sigma$, then the probability distribution of $a$ and the update, called collapse in quantum theory, of the state are described by eqns [27] and [28], respectively. (We adopt the notational convenience of physics literature to omit the unit operator $\hat I$ from trivial expressions like $a\hat I$.)

$$p(a) = \big\langle G_\sigma(a - \hat A)\big\rangle_{\hat\rho} \qquad [27]$$

$$\hat\rho \to \frac{1}{p(a)}\, G_\sigma^{1/2}(a - \hat A)\,\hat\rho\, G_\sigma^{1/2}(a - \hat A) \qquad [28]$$

Nonzero $\sigma$ means that the measurement is nonideal, but the expectation value $E[a]$ remains calculable reliably if $N$ is suitably large.

Weak quantum measurement, like its classical counterpart, requires finite spread of the observable $\hat A$ on state $\hat\rho$:

$$\Delta^2_{\hat\rho}\hat A =: \langle\hat A^2\rangle_{\hat\rho} - \langle\hat A\rangle_{\hat\rho}^2 < \infty \qquad [29]$$

Weak quantum measurement, too, will be defined in the asymptotic limit [8] introduced for classical weak measurement. Single quantum measurements can no longer distinguish between the eigenvalues of $\hat A$. Yet, the expectation value $E[a]$ of the outcome $a$ remains calculable on a statistics $N$ growing like $\sigma^2$.

Both in quantum theory and classical statistics, the emergence of nonideal measurements from ideal ones is guaranteed by general theorems. For completeness of this article, we prove the emergence of the nonideal quantum measurement (eqns [27] and [28]) from the standard von Neumann theory of ideal quantum measurements (von Neumann 1955). The source of the statistical error of dispersion $\sigma$ is associated with the state $\hat\rho_M$ in the complex Hilbert space $L^2$ of a hypothetic meter. Suppose $R \in (-\infty, \infty)$ is the position of the “pointer.” Let its initial state $\hat\rho_M$ be a pure central Gaussian state of width $\sigma$; then the density operator $\hat\rho_M$ in Dirac position basis takes the form

$$\hat\rho_M = \int dR \int dR'\, G_\sigma^{1/2}(R)\, G_\sigma^{1/2}(R')\, |R\rangle\langle R'| \qquad [30]$$

We are looking for a certain dynamical interaction to transmit the “value” of the observable $\hat A$ onto the pointer position $\hat R$. To model the interaction, we define the unitary transformation [31] to act on the tensor space $\mathcal{H} \otimes L^2$:

$$\hat U = \exp(-i\hat A \otimes \hat K) \qquad [31]$$

Here $\hat K$ is the canonical momentum operator conjugated to $\hat R$:

$$\exp(-ia\hat K)\,|R\rangle = |R + a\rangle \qquad [32]$$

The unitary operator $\hat U$ transforms the initial uncorrelated quantum state into the desired correlated composite state:

$$\hat\Sigma =: \hat U\,(\hat\rho \otimes \hat\rho_M)\,\hat U^\dagger \qquad [33]$$

Equations [30]–[33] yield the expression [34] for the state $\hat\Sigma$:

$$\hat\Sigma = \int dR \int dR'\, G_\sigma^{1/2}(R - \hat A)\,\hat\rho\, G_\sigma^{1/2}(R' - \hat A) \otimes |R\rangle\langle R'| \qquad [34]$$

Let us write the pointer's coordinate operator $\hat R$ into the standard form [35] in Dirac position basis:

$$\hat R = \int a\,|a\rangle\langle a|\,da \qquad [35]$$

The notation anticipates that, when pointer $\hat R$ is measured ideally, the outcome $a$ plays the role of the nonideally measured value of the observable $\hat A$. Indeed, let us consider the ideal von Neumann measurement of the pointer position on the correlated composite state $\hat\Sigma$. The probability of the outcome $a$ and the collapse of the composite state are given by the following standard equations:

$$p(a) = \mathrm{tr}\big[(\hat I \otimes |a\rangle\langle a|)\,\hat\Sigma\big] \qquad [36]$$

$$\hat\Sigma \to \frac{1}{p(a)}\big[(\hat I \otimes |a\rangle\langle a|)\,\hat\Sigma\,(\hat I \otimes |a\rangle\langle a|)\big] \qquad [37]$$

respectively. We insert eqn [34] into eqns [36] and [37]. Furthermore, we take the trace over $L^2$ of both sides of eqn [37]. In such a way, as expected, eqns [36] and [37] of ideal measurement of $\hat R$ yield the
earlier postulated eqns [27] and [28] of nonideal measurement of $\hat A$.

Quantum Postselection

A quantum postselection is defined by a Hermitian operator $\hat\Pi$ satisfying $0 \le \hat\Pi \le \hat I$. The corresponding postselected mean value of a certain observable $\hat A$ is defined by

$${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho} =: \mathrm{Re}\,\frac{\langle\hat\Pi\hat A\rangle_{\hat\rho}}{\langle\hat\Pi\rangle_{\hat\rho}} \qquad [38]$$

The denominator $\langle\hat\Pi\rangle_{\hat\rho}$ is the rate of quantum postselection. Quantum postselection means that after the measurement of $\hat A$, we measure the observable $\hat\Pi$ in ideal quantum measurement and we make a statistical decision on the basis of the outcome $\pi$. With probability $\pi$, we include the case in question into the statistics while we discard it with probability $1 - \pi$. By analogy with the classical case [12], one may ask whether the stochastic expectation value $E[a]$ of the postselected measurement outcome does coincide with

$$E[a] \overset{?}{=} {}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho} \qquad [39]$$

Contrary to the classical case, the quantum equation [39] does not hold. The quantum counterparts of classical equations [12]–[14] do not exist at all. Nonetheless, the quantum postselected mean ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$ possesses statistical interpretation although restricted to the context of weak quantum measurements. In the weak measurement limit (eqns [8] and [9]), a postselected analog of classical equation [10] holds for the arithmetic mean $\bar a$ of postselected weak quantum measurements:

$$p_w(\bar a) \to G_\Delta\big(\bar a - {}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}\big) \qquad [40]$$

The Gaussian is centered at the postselected mean ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$, and the variance of the Gaussian is given by the constant rate [9]. Consequently, the mean [38] becomes calculable on a statistics $N$ growing like $\sigma^2$.

Since the statistical interpretation of the postselected quantum mean [38] is only possible for weak measurements, ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$ is called the (real) weak value of $\hat A$. Consider the special case when both the state $\hat\rho = |i\rangle\langle i|$ and the postselected operator $\hat\Pi = |f\rangle\langle f|$ are pure states. Then the weak value ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$ takes, in usual notations, a particular form [41] yielding the real part of the complex weak value $A_w$ [1]:

$${}_f\langle\hat A\rangle_i =: \mathrm{Re}\,\frac{\langle f|\hat A|i\rangle}{\langle f|i\rangle} \qquad [41]$$

The interpretation of postselection itself reduces to a simple procedure. One performs the von Neumann ideal measurement of the Hermitian projector $|f\rangle\langle f|$, then includes the case if the outcome is 1 and discards it if the outcome is 0. The rate of postselection is $|\langle f|i\rangle|^2$. We note that a certain statistical interpretation of $\mathrm{Im}\,A_w$, too, exists although it relies upon the details of the “meter.”

We outline a heuristic proof of the central equation [40]. One considers the nonideal measurement (eqns [27] and [28]) of $\hat A$ followed by the ideal measurement of $\hat\Pi$. Then the joint distribution of the corresponding outcomes is given by eqn [42]. The probability distribution of the postselected outcomes is defined by eqn [43], and takes the concrete form [44]. The constant $\mathcal{N}$ assures normalization:

$$p(\pi, a) = \mathrm{tr}\Big[\delta(\pi - \hat\Pi)\, G_\sigma^{1/2}(a - \hat A)\,\hat\rho\, G_\sigma^{1/2}(a - \hat A)\Big] \qquad [42]$$

$$p_\Pi(a) =: \frac{1}{\mathcal{N}} \int \pi\, p(\pi, a)\, d\pi \qquad [43]$$

$$p_\Pi(a) = \frac{1}{\mathcal{N}} \Big\langle G_\sigma^{1/2}(a - \hat A)\,\hat\Pi\, G_\sigma^{1/2}(a - \hat A)\Big\rangle_{\hat\rho} \qquad [44]$$

Suppose, for simplicity, that $\hat A$ is bounded. When $\sigma \to \infty$, eqn [44] yields the first two moments of the outcome $a$:

$$E[a] \to {}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho} \qquad [45]$$

$$E[a^2] \sim \sigma^2 \qquad [46]$$

Hence, by virtue of the central limit theorem, the probability distribution [40] follows for the average $\bar a$ of postselected outcomes in the weak measurement limit (eqns [8] and [9]).

Quantum Weak-Value Anomaly

Unlike in classical postselection, effective postselected quantum states cannot be introduced. We can ask whether eqn [47] defines a correct postselected quantum state:

$$\hat\rho^{\,?}_{\hat\Pi} =: \mathrm{Herm}\,\frac{\hat\Pi\hat\rho}{\langle\hat\Pi\rangle_{\hat\rho}} \qquad [47]$$

This pseudo-state satisfies the quantum counterpart of the classical equation [14]:

$${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho} = \mathrm{tr}\big(\hat A\,\hat\rho^{\,?}_{\hat\Pi}\big) \qquad [48]$$

In general, however, the operator $\hat\rho^{\,?}_{\hat\Pi}$ is not a density operator since it may be indefinite. Therefore, eqn [47] does not define a quantum state. Equation [48] does not guarantee that the quantum weak value
eqns [48] and [51]:

$${}_f\langle\hat A\rangle_i = \frac{1}{\cos\varphi} \qquad [53]$$

This weak value of $\hat A$ lies outside the range of the eigenvalues of $\hat A$. The anomaly can be arbitrarily large if the rate $\cos^2\varphi$ of postselection decreases.

Striking consequences follow from this anomaly if we turn to the statistical interpretation. For concreteness, suppose $\varphi = 2\pi/3$ so that ${}_f\langle\hat A\rangle_i = -2$. On average, 75% of the statistics $N$ will be lost in postselection. We learnt from eqn [40] that the arithmetic mean $\bar a$ of the postselected outcomes of independent weak measurements converges stochastically to the weak value up to the Gaussian fluctuation $\Delta$, as expressed symbolically by

$$\bar a = -2 \pm \Delta \qquad [54]$$

Let us approximate the asymptotically large error $\sigma$ of our weak measurements by $\sigma = 10$ which is already well beyond the scale of the eigenvalues $\pm 1$ of the observable $\hat A$. The Gaussian error derives

Equation [56] and its classical counterpart [17] are perfectly similar. There is a remarkable difference between eqn [57] and its classical counterpart [18]. In the latter, the stochastic average of the state is constant: $E[d\rho_t] = 0$, expressing the fact that classical measurements do not alter the original ensemble if we “ignore” the outcomes of the measurements. On the contrary, quantum measurements introduce irreversible changes to the original ensemble, a phenomenon called decoherence in the physics literature. Equation [57] implies the closed linear first-order differential equation [58] for the stochastic average of the quantum state $\hat\rho_t$ under time-continuous measurement of the observable $\hat A$:

$$\frac{dE[\hat\rho_t]}{dt} = -\frac{1}{8g^2}\big[\hat A, [\hat A, E[\hat\rho_t]]\big] \qquad [58]$$

This is the basic irreversible equation to model the gradual loss of quantum coherence (decoherence) under time-continuous measurement. In fact, the very equation models decoherence under the influence of a large class of interactions, for example, with thermal reservoirs or complex environments. In
two-dimensional Hilbert space, for instance, we can consider the initial pure state $\langle i| =: [\cos\varphi, \sin\varphi]$ and the time-continuous measurement of the diagonal observable [59] on it. The solution of eqn [58] is given by eqn [60]:

$$\hat A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad [59]$$

$$E[\hat\rho_t] = \begin{bmatrix} \cos^2\varphi & e^{-t/4g^2}\cos\varphi\sin\varphi \\ e^{-t/4g^2}\cos\varphi\sin\varphi & \sin^2\varphi \end{bmatrix} \qquad [60]$$

The off-diagonal elements of this density matrix go to zero, that is, the coherent superposition represented by the initial pure state becomes an incoherent mixture represented by the diagonal density matrix $\hat\rho_\infty$.

Apart from the phenomenon of decoherence, the stochastic equations show remarkable similarity with the classical equations of time-continuous measurement. The heuristic form of eqn [56] is eqn [61] of invariable interpretation with respect to the classical equation [19]:

$$a_t = \langle\hat A\rangle_{\hat\rho_t} + g\,\xi_t \qquad [61]$$

Equation [57] describes what is called the time-continuous collapse of the quantum state under time-continuous quantum measurement of $\hat A$. For concreteness, we assume discrete spectrum for $\hat A$ and consider the spectral expansion

$$\hat A = \sum_\lambda a^\lambda \hat P^\lambda \qquad [62]$$

The real values $a^\lambda$ are nondegenerate eigenvalues. The Hermitian projectors $\hat P^\lambda$ form a complete orthogonal set:

$$\sum_\lambda \hat P^\lambda \equiv \hat I \qquad [63]$$

$$\hat P^\lambda \hat P^\mu = \delta_{\lambda\mu}\hat P^\lambda \qquad [64]$$

In a single ideal measurement of $\hat A$, the outcome $a$ is one of the $a^\lambda$'s singled out at random. The probability distribution of the measurement outcome and the corresponding collapse of the state are given by

$$p^\lambda = \langle\hat P^\lambda\rangle_{\hat\rho_0} \qquad [65]$$

$$\hat\rho_0 \to \frac{1}{p^\lambda}\hat P^\lambda\hat\rho_0\hat P^\lambda =: \hat\rho^\lambda \qquad [66]$$

respectively. Equations [56] and [57] of continuous measurements are an obvious time-continuous resolution of the “sudden” ideal quantum measurement (eqns [65] and [66]) in a sense that they reproduce it in the limit $t \to \infty$. The states $\hat\rho^\lambda$ are stationary states of eqn [57]. It can be shown that they are indeed approached with probability $p^\lambda$ for $t \to \infty$ (Gisin 1984).

Related Contexts

In addition to the two particular examples as in postselection and in time-continuous measurement, respectively, presented above, the weak measurement limit itself has further variants. A most natural example is the usual thermodynamic limit in standard statistical physics. Then weak measurements concern a certain additive microscopic observable (e.g., the spin) of each constituent and the weak value represents the corresponding additive macroscopic parameter (e.g., the magnetization) in the infinite volume limit. This example indicates that weak values have natural interpretation despite the apparent artificial conditions of their definition. It is important that the weak value, with or without postselection, plays the physical role similar to that of the common mean $\langle\hat A\rangle_{\hat\rho}$. If, between their pre- and postselection, the states $\hat\rho$ become weakly coupled with the state of another quantum system via the observable $\hat A$, their average influence will be as if $\hat A$ took the weak value ${}_{\hat\Pi}\langle\hat A\rangle_{\hat\rho}$. Weak measurements also open a specific loophole to circumvent quantum limitations related to the irreversible disturbances that quantum measurements cause to the measured state. Noncommuting observables become simultaneously measurable in the weak limit: simultaneous weak values of noncommuting observables will exist.

Literally, weak measurement had been coined in 1988 for quantum measurements with (pre- and) postselection, and became the tool of a certain time-symmetric statistical interpretation of quantum states. Foundational applications target the paradoxical problem of pre- and retrodiction in quantum theory. In a broad sense, however, the very principle of weak measurement encapsulates the trade between asymptotically weak precision and asymptotically large statistics. Its relevance in different fields has not yet been fully explored and a growing number of foundational, theoretical, and experimental applications are being considered in the literature, predominantly in the context of quantum physics. Since specialized monographs or textbooks on quantum weak measurement are not yet available, the reader is mostly referred to research articles, like the recent one by Aharonov and Botero (2005), covering many topics of postselected quantum weak values.
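The weak value [41] and the anomaly [53]–[54] discussed earlier can be checked numerically with a short sketch. The two-state vectors used here, |i⟩ = (cos(φ/2), sin(φ/2)) and |f⟩ = (cos(φ/2), -sin(φ/2)) with an observable of eigenvalues ±1, are an assumption chosen so that eqn [41] gives 1/cos φ. The function names and the sample size are likewise illustrative. The simulation draws each outcome from eqn [27], applies the collapse [28] to the pure state, and then postselects with an ideal measurement of |f⟩⟨f|.

```python
import math
import random


def weak_value(f, eigenvalues, i):
    """Real weak value of eqn [41] for an observable diagonal in the chosen basis."""
    num = sum(fk.conjugate() * a * ik for fk, a, ik in zip(f, eigenvalues, i))
    den = sum(fk.conjugate() * ik for fk, ik in zip(f, i))
    return (num / den).real


def postselected_mean(phi, sigma, n_runs, seed=0):
    """Monte Carlo estimate of (mean postselected outcome, postselection rate)."""
    rng = random.Random(seed)
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    i_state, f_state, eig = (c, s), (c, -s), (1.0, -1.0)  # assumed states
    total, kept = 0.0, 0
    for _ in range(n_runs):
        # Nonideal measurement [27]: Born-weighted eigenvalue plus Gaussian error.
        lam = 0 if rng.random() < c * c else 1
        a = eig[lam] + rng.gauss(0.0, sigma)
        # Collapse [28]: amplitudes times G_sigma^{1/2}(a - eigenvalue), up to a constant.
        psi = [amp * math.exp(-((a - e) ** 2) / (4 * sigma ** 2))
               for amp, e in zip(i_state, eig)]
        norm2 = sum(x * x for x in psi)
        overlap = sum(fk * x for fk, x in zip(f_state, psi))
        # Ideal measurement of |f><f|: keep the run with probability |<f|psi>|^2 / <psi|psi>.
        if rng.random() < overlap * overlap / norm2:
            total += a
            kept += 1
    return total / kept, kept / n_runs


phi = 2 * math.pi / 3
c, s = math.cos(phi / 2), math.sin(phi / 2)
wv = weak_value((c, -s), (1.0, -1.0), (c, s))
mean_a, rate = postselected_mean(phi, sigma=10.0, n_runs=200_000)
print(f"weak value = {wv:.3f}, postselected mean = {mean_a:.2f}, rate = {rate:.3f}")
```

With σ = 10 the postselected average should fall near the anomalous value -2, outside the eigenvalue range [-1, 1], while roughly a quarter of the runs survive postselection. The residual scatter of the average is the Gaussian fluctuation Δ of eqn [54], set by σ and the number of surviving runs.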
Quantum n-Body Problem 283
many-body problem,” which usually refers to the quantum mechanics of large numbers of identical particles, such as the electrons in a solid.

Of particular interest is the “reduction” of the Hamiltonian [1], that is, the elimination of those degrees of freedom that can be eliminated due to the continuous symmetries of translations and rotations. A basic problem is to write down the reduced Hamiltonian and to make its analytical and geometrical properties clear. In the following we shall present this reduction in two stages, dealing first with the translations and second with the proper rotations. In each stage, we shall describe the reduction first in coordinate language and then in geometrical language. The discrete symmetries of parity, time reversal, and permutation of identical particles are handled by standard methods of group representation theory, and will not be discussed here.

There has been considerable interest in mathematical circles in recent years in the reduction of dynamical systems with symmetry, and the quantum n-body problem is one of the most important such systems from a physical standpoint. As such, the basic theory of the quantum n-body problem has received considerable attention in the physical literature going back to the birth of quantum mechanics, and continues to be of great practical importance. This article and the bibliography attempt to bridge these two centers of interest.

Reduction by Translations: Coordinate Description

We begin with a coordinate description of the reduction of the system [1] by translations. The coordinates $(\mathbf{R}_1, \ldots, \mathbf{R}_n)$ are coordinates on the configuration space of the system, called the “original configuration space” or OCS. The OCS is $\mathbb{R}^{3n}$. The original system has $3n$ degrees of freedom. The translation group acts on configuration space by $\mathbf{R}_\alpha \mapsto \mathbf{R}_\alpha + \boldsymbol{\xi}$, for $\alpha = 1, \ldots, n$, where $\boldsymbol{\xi}$ is a displacement vector. It acts on wave functions by $\Psi(\mathbf{R}_1, \ldots, \mathbf{R}_n) \mapsto \Psi(\mathbf{R}_1 - \boldsymbol{\xi}, \ldots, \mathbf{R}_n - \boldsymbol{\xi})$.

To reduce the system by translations, we perform a linear coordinate transformation on the OCS, taking us from the original vectors $(\mathbf{R}_1, \ldots, \mathbf{R}_n)$ to a new set of $n$ vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$, where $\mathbf{R}_{CM}$ is the center-of-mass position,

$$\mathbf{R}_{CM} = \frac{1}{M}\sum_{\alpha=1}^{n} m_\alpha \mathbf{R}_\alpha \qquad [2]$$

where $M = \sum_\alpha m_\alpha$ is the total mass of the system, and the other $n - 1$ vectors of the new coordinate system, $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$, are required to be translationally invariant, that is, independent linear functions of the relative particle positions $\mathbf{R}_\alpha - \mathbf{R}_\beta$. We denote the momenta conjugate to $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$ by $(\mathbf{p}_1, \ldots, \mathbf{p}_{n-1}, \mathbf{P}_{CM})$, of which $\mathbf{P}_{CM}$ turns out to be the total momentum of the system,

$$\mathbf{P}_{CM} = \sum_{\alpha=1}^{n} \mathbf{P}_\alpha \qquad [3]$$

Under such a coordinate transformation, the potential energy becomes simply a function of the $n - 1$ relative vectors, $V(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$, whereas the kinetic energy becomes

$$T = \frac{|\mathbf{P}_{CM}|^2}{2M} + \frac{1}{2}\sum_{\alpha,\beta=1}^{n-1} K_{\alpha\beta}\,\mathbf{p}_\alpha \cdot \mathbf{p}_\beta \qquad [4]$$

where $K_{\alpha\beta}$ is a symmetric tensor (the “inverse mass tensor”).

The vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ specify the positions of $n$ particles relative to their center of mass. As described so far, these vectors need only be independent, translationally invariant linear combinations of the particle positions. However, it is convenient to choose them so that the inverse mass tensor becomes proportional to the identity, $K_{\alpha\beta} = (1/M)\,\delta_{\alpha\beta}$. An elegant way of doing this is the method of Jacobi vectors, which involves splitting the original set of particles into two nonempty subsets, which are then split into smaller subsets, etc., until only subsets of a single particle remain. The process can be represented by a tree growing downward, with the original $n$ particles as the root, and the ends of the branches at the bottom each containing one particle. Then the vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ (the Jacobi vectors) are chosen to be proportional to the differences between the centers of mass of the two subsets at each splitting. With the right constants of proportionality, the kinetic energy becomes

$$T = \frac{1}{2M}|\mathbf{P}_{CM}|^2 + \frac{1}{2M}\sum_{\alpha=1}^{n-1} |\mathbf{p}_\alpha|^2 \qquad [5]$$

Henceforth, we shall assume that the vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ are Jacobi vectors with conjugate momenta $(\mathbf{p}_1, \ldots, \mathbf{p}_{n-1})$.

The choice of Jacobi vectors is not unique. In the first place, there is a discrete set of possible ways of splitting the original set of $n$ particles into subsets (of forming trees), each of which leads to the same form [5] of the kinetic energy. More generally, the kinetic energy [5] is invariant under transformations

$$\mathbf{r}'_\alpha = \sum_{\beta=1}^{n-1} Q_{\alpha\beta}\,\mathbf{r}_\beta \qquad [6]$$
where $Q_{\alpha\beta}$ is an orthogonal matrix, $Q \in O(n-1)$. Such transformations are called “kinematic rotations.” The discrete choices of trees in forming the Jacobi vectors are equivalent to a discrete set of kinematic rotations $Q_{\alpha\beta}$ that map one standard choice of Jacobi vectors into the others.

Since the momentum $\mathbf{P}_{CM}$ of the center of mass commutes with $H$, the eigenfunctions of $H$ can be chosen to have the form

$$\Psi(\mathbf{R}_1, \ldots, \mathbf{R}_n) = \exp(i\,\mathbf{R}_{CM} \cdot \mathbf{P}_{CM}/\hbar)\,\psi(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) \qquad [7]$$

This causes $\psi$ to be an eigenfunction of the “translation-reduced Hamiltonian,” $H_{tr}\psi = E_{tr}\psi$, where

$$H_{tr} = \frac{1}{2M}\sum_{\alpha=1}^{n-1} |\mathbf{p}_\alpha|^2 + V(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) \qquad [8]$$

The kinetic energy of the center of mass, $|\mathbf{P}_{CM}|^2/2M$, has been discarded from both $H_{tr}$ and $E_{tr}$, which represent physically the energy of the system about its center of mass.

Reduction by Translations: Geometrical Description

The kinetic energy $T$ in eqn [1] specifies a metric $ds^2 = \sum_\alpha m_\alpha |d\mathbf{R}_\alpha|^2$ on the OCS ($= \mathbb{R}^{3n}$). The translation group ($= \mathbb{R}^3$) acts freely on the OCS, with an action that is generated by $\mathbf{P}_{CM}$. This action defines an orthogonal decomposition of the OCS, $\mathbb{R}^{3n} = \mathbb{R}^3 \oplus \mathbb{R}^{3n-3}$, where $\mathbb{R}^3$ is the orbit of the origin (the other orbits of the translation group action are parallel spaces), and $\mathbb{R}^{3n-3}$ is the orthogonal subspace (henceforth the “translation-reduced configuration space” or TRCS for short). The TRCS is physically the space of configurations relative to the center of mass. The vectors $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1})$ are coordinates on the TRCS. The TRCS possesses a metric which is the projection of the metric on the OCS onto the TRCS by means of the translation group action. The metric can be projected because translations preserve the original metric (they are isometries). Jacobi vectors are Euclidean coordinates on the TRCS with respect to this metric.

The tree method of constructing Jacobi vectors can be understood in terms of certain group actions which take place as each subset of particles is split into two further subsets. The group action in question leaves the center of mass of the original subset invariant, while moving the two new subsets apart along a line. This motion in the configuration space is orthogonal to all the other group actions that are created in the process of splitting subsets of particles, including the original action of the translation group. Thus, each splitting of a subset of particles generates a three-dimensional subspace of the OCS, on which one of the $\mathbf{r}_\alpha$ are coordinates. The conjugate momentum $\mathbf{p}_\alpha$ is the generator of the group action moving the two new subsets apart. The final result is that the OCS is decomposed into $n$ orthogonal, three-dimensional subspaces, one of which contains the action of the original translation group, and the others of which represent the decomposition of the TRCS into $n - 1$ three-dimensional orthogonal subspaces.

The TRCS can also be seen as a global section of a flat, trivial, principal fiber bundle created by the action of the translation group on the OCS. Alternatively, the TRCS can be seen as the quotient space, $\mathbb{R}^{3n}/\mathbb{R}^3$. The construction is fairly simple because the translation group is Abelian.

The wave function $\psi$ can be seen as a member of the Hilbert space of wave functions on the TRCS, upon which the reduced Hamiltonian $H_{tr}$ of eqn [8] acts. Alternatively, it can be seen as the function obtained by restricting $\Psi$ on the OCS to the TRCS, where $\Psi$ has a dependence along the orbits of the translation group given by $\exp(i\,\mathbf{R}_{CM} \cdot \mathbf{P}_{CM}/\hbar)$, that is, by an irreducible representation (irrep) of the translation group.

Reduction by Rotations: Coordinate Description

The Hamiltonian $H_{tr}$ acts on wave functions $\psi$ defined on the TRCS and has $3n - 3$ degrees of freedom. Consider a coordinate transformation to eliminate further degrees of freedom due to the rotational invariance. This coordinate transformation takes us from the Jacobi vectors $\{\mathbf{r}_\alpha, \alpha = 1, \ldots, n - 1\}$ to orientational and shape coordinates. Shape coordinates are a set of $3n - 6$ coordinates $\{q^\mu, \mu = 1, \ldots, 3n - 6\}$ that specify the shape of the n-particle system, that is, they are $3n - 6$ independent functions of the interparticle distances (hence rotationally invariant). We will call the space upon which the $q^\mu$ are coordinates “shape space.” For example, in the case of the three-body problem, shape space is the space of all triangles.

As for orientational coordinates, to define them it is necessary first to define a “body frame.” We assume we are already given one frame, the “space frame,” a fixed inertial frame. The body frame is a 3-frame attached in a conventional way to each shape of the system of particles, which rotates with the particles. The orientational coordinates, to be
286 Quantum n-Body Problem
denoted by $\{\theta^i, i = 1, 2, 3\}$, are three coordinates (e.g., Euler angles) specifying the SO(3) rotation that maps the space frame into the body frame. We shall write the new coordinates collectively as $\{\theta^i, q^\mu\}$.

There is a great deal of arbitrariness in the choice of a body frame, since for a given shape a body frame can be attached in many ways, the different choices being related by proper rotations. The only requirement is that the body frame should change smoothly as the shape changes. Popular choices for the body frame are the principal axis and Eckart frames.

When the potential energy is transformed to the new coordinates, it becomes a function only of the $\{q^\mu\}$, that is, of the shape. The potential can be written as $V = V(q)$. $V$ is a scalar field on shape space.

The transformation of the kinetic energy is more complicated. When the (Euclidean) metric tensor on the TRCS is transformed to orientational and shape coordinates there results a $(3n-3) \times (3n-3)$ component matrix which may be partitioned into blocks according to the coordinates $\{\theta^i, q^\mu\}$, that is, according to $3n-3 = 3 + (3n-6)$. This matrix cannot be made diagonal or even block diagonal by any choice of orientational or shape coordinates, or by any choice of body frame.

The components of the metric tensor in the new coordinates are conveniently expressed in terms of three fields on shape space. The first is the moment-of-inertia tensor $\mathsf{E}$, which describes the $3 \times 3$ upper block of the metric tensor. Its components are given by

$$E_{ij} = \sum_{\alpha=1}^{n-1} M_\alpha \left( |\mathbf{r}_\alpha|^2 \delta_{ij} - r_{\alpha i} r_{\alpha j} \right) \qquad [9]$$

The vectors and tensors in this equation can be referred either to the space frame or the body frame, but the body frame is more convenient because then the components of the vectors $\mathbf{r}_\alpha$ are functions only

The third field is the $(3n-6) \times (3n-6)$ lower block of the metric tensor on the TRCS, an object with two shape indices. It is given by

$$g_{\mu\nu} = \sum_{\alpha=1}^{n-1} M_\alpha \frac{\partial \mathbf{r}_\alpha}{\partial q^\mu} \cdot \frac{\partial \mathbf{r}_\alpha}{\partial q^\nu} - \mathbf{A}_\mu \cdot \mathsf{E} \cdot \mathbf{A}_\nu \qquad [11]$$

where again the vectors are referred to the body frame. The notation suggests (correctly) that $g_{\mu\nu}$ is the metric tensor on shape space.

On transforming the wave function from the Jacobi vectors to coordinates $(\theta^i, q^\mu)$, it is convenient to introduce a Jacobian factor, $\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}) = D^{-1/4}\,\Phi(\theta^i, q^\mu)$, where $D = (\det \mathsf{E})(\det g_{\mu\nu})$. This causes the new wave function $\Phi$ to have the normalization

$$\int dR \left( \prod_{\mu=1}^{3n-6} dq^\mu \right) |\Phi|^2 \qquad [12]$$

where $dR$ is the Haar measure on the group SO(3). The factor $D$ depends only on the $q^\mu$, not the $\theta^i$. Then the Schrödinger equation can be written as $H_{tr}\Phi = E_{tr}\Phi$, where $H_{tr}$ is a differential operator involving $\partial/\partial\theta^i$ and $\partial/\partial q^\mu$.

The orientational derivatives $\partial/\partial\theta^i$ in $H_{tr}$ are conveniently expressed in terms of the angular momentum operator $\mathbf{L}$. When acting on the original wave function $\Psi$ on the OCS, the angular momentum is

$$\mathbf{L} = \sum_{\alpha=1}^{n} \mathbf{R}_\alpha \times \mathbf{P}_\alpha \qquad [13]$$

When this is transformed to the coordinates $(\mathbf{r}_1, \ldots, \mathbf{r}_{n-1}, \mathbf{R}_{CM})$, it becomes $\mathbf{L} = \mathbf{L}_{CM} + \mathbf{L}_{tr}$, where $\mathbf{L}_{CM} = \mathbf{R}_{CM} \times \mathbf{P}_{CM}$, and
both for the space and the body components of $\mathbf{L}$, although the differential operators are not the same in the two cases. The space components of $\mathbf{L}$ satisfy the usual angular momentum commutation relations, $[L_i, L_j] = i\hbar\,\epsilon_{ijk} L_k$, while the body components satisfy $[L_i, L_j] = -i\hbar\,\epsilon_{ijk} L_k$ (with a minus sign relative to the space commutation relations).

Thus, the Hamiltonian can be expressed in terms of $\mathbf{L}$ and the shape momentum operators, $p_\mu = -i\hbar\,\partial/\partial q^\mu$. The result is

as differential operators in $\theta^i$, but as $(2l+1) \times (2l+1)$ matrices that act on the "spinor" $\chi$. These matrices are the transposes of the usual angular momentum matrices in angular momentum theory, that is, $(L_i)_{kk'} = \langle k'|L_i|k\rangle$.

This is the final form of the Schrödinger equation after all reductions by all continuous symmetries have been carried out. The fully reduced system has $3n-5$ degrees of freedom ($3n-6$ for the shape coordinates, and one for the "spinor" index $k$).
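The link between the transposed matrices and the minus sign in the body-frame commutation relations can be checked numerically. The following is only a minimal sketch under stated assumptions: it works in units with $\hbar = 1$, takes $l = 1$, and uses the standard matrices in the basis $m = l, \ldots, -l$; the helper names (`angular_momentum_matrices`, `commutator`, and so on) are hypothetical and not taken from the article.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def angular_momentum_matrices(l):
    """Usual J_x, J_y, J_z (hbar = 1) in the basis m = l, l-1, ..., -l."""
    dim = 2 * l + 1
    ms = [l - k for k in range(dim)]
    jz = [[complex(ms[i]) if i == j else 0j for j in range(dim)]
          for i in range(dim)]
    # Raising operator: J_+ |m> = sqrt(l(l+1) - m(m+1)) |m+1>
    jp = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = ms[col]
        jp[col - 1][col] = complex(math.sqrt(l * (l + 1) - m * (m + 1)))
    jm = transpose(jp)  # J_+ is real, so its adjoint is its transpose
    jx = [[(jp[i][j] + jm[i][j]) / 2 for j in range(dim)] for i in range(dim)]
    jy = [[(jp[i][j] - jm[i][j]) / 2j for j in range(dim)] for i in range(dim)]
    return jx, jy, jz

jx, jy, jz = angular_momentum_matrices(1)

# Usual matrices: [J_x, J_y] = +i J_z
usual_ok = close(commutator(jx, jy), scale(1j, jz))

# Transposed matrices: [J_x^T, J_y^T] = -i J_z^T, the body-frame sign
tx, ty, tz = transpose(jx), transpose(jy), transpose(jz)
transposed_ok = close(commutator(tx, ty), scale(-1j, tz))

print(usual_ok, transposed_ok)
```

The sign flip follows from $(AB)^T = B^T A^T$, so transposing reverses the order of every product inside the commutator.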
impose coordinates on each (nonsingular) rotation fiber, that is, we label points on the fiber by the rotation that takes us from the section to the actual configuration in question. This is why a choice of body frame is necessary before defining orientational coordinates. Sections are only defined locally. Popular choices of body frame, such as the principal axis frame, imply multivalued sections, unless branch cuts are introduced. Orientational coordinates are simply coordinates on the group manifold SO(3), transferred to the nonsingular rotation fibers, with the group identity element mapped onto the point where the fiber intersects the section.

The metric tensor determines much of the geometry of the reduction by rotations. Since the metric on the TRCS is SO(3)-invariant, horizontal subspaces in the SO(3) fiber bundle (the TRCS minus the singular orbits) can be defined as the spaces orthogonal to the fibers (hence orthogonal to the vertical subspaces). This is a standard construction in Kaluza–Klein theories, which reappears here. Thus, the bundle has a connection, induced by the metric.

The moment-of-inertia tensor is the metric tensor restricted to a fiber, evaluated in a basis of left- (body frame) or right-invariant (space frame) vector fields on SO(3), which are transported to the fibers to create a basis of vertical vector fields.

The coordinate description of the connection is the gauge potential $\mathbf{A}_\mu$, in which the index $\mu$ refers to shape coordinates $q^\mu$, and the components of the 3-vector $\mathbf{A}_\mu$ refer to the standard set of left- or right-invariant vector fields on SO(3). The coordinate representative of the curvature 2-form is conveniently denoted by $\mathbf{B}_{\mu\nu}$, defined by

$$\mathbf{B}_{\mu\nu} = \frac{\partial \mathbf{A}_\nu}{\partial q^\mu} - \frac{\partial \mathbf{A}_\mu}{\partial q^\nu} - \mathbf{A}_\mu \times \mathbf{A}_\nu \qquad [18]$$

where it is understood that body frame components are used. Direct calculation shows that it is nonzero, hence the fiber bundle is not flat, for any value of $n \ge 3$. The curvature form $\mathbf{B}_{\mu\nu}$ appears in the classical equation of motion and in the quantum commutation relations.

The field $\mathbf{B}_{\mu\nu}$ satisfies differential equations on shape space that have the form of Yang–Mills field equations. It is interesting that the sources of this field are singularities of the monopole type, located on the singular shapes. In the case $n = 3$, the source is a single monopole located at the three-body collision, which is similar to a Dirac monopole in electromagnetic theory.

The $(3n-6)$-dimensional horizontal subspaces of the TRCS are annihilated by three differential forms, whose values on a velocity vector of the system are the components of the classical angular momentum $\mathbf{L}$ (body or space components, depending on the basis of forms). Thus, horizontal motions are those for which $\mathbf{L} = 0$, and horizontal lifts of curves in shape space are motions of the system with vanishing angular momentum. Since angular momentum is conserved, such motions are generated by the classical equations of motion and are physically allowed. For loops in shape space, the holonomy generated by the horizontal lift is physically the rotation that a flexible body experiences when it is carried under conditions of vanishing angular momentum from an initial shape, through intermediate shapes and back to the initial shape. An example is the rotation generated by the "falling cat."

Since the metric on the TRCS is SO(3)-invariant, it may be projected onto shape space, which therefore is a Riemannian manifold in its own right. The projected metric is $ds^2 = g_{\mu\nu}\,dq^\mu dq^\nu$. This metric is not flat (the Riemann curvature tensor is nonzero for all values $n \ge 3$). Geodesics in shape space have horizontal lifts that are free particle motions ($V = 0$) of zero angular momentum. Conversely, such motions project onto geodesics on shape space.

A popular choice of body frame in molecular physics is the Eckart frame, which has advantages for the description of small vibrations and other purposes. The section defining the Eckart frame is a flat vector subspace of the TRCS of dimension $3n-6$ that is orthogonal (horizontal) to a particular fiber (over an equilibrium shape) at a particular orientation.

The geometrical meaning of eqn [17] is that rotations act on a set of wave functions $\Psi^l_m$ that span an irrep of SO(3) by multiplication by the representative element of the group. In standard physics notation, $l$ indexes the irrep, and $m$ indexes the basis vectors spanning the irrep. Thus, the values of these wave functions at any point on the fiber are known once their values are given at a reference point. A convenient choice for the reference point is the point on the section, and the wave functions $\chi^l_k$ are simply the values of the $\Psi^l_m$ on this reference point (with a change of notation, $m \to k$). Thus, the wave functions $\chi^l_k$ are properly not "wave functions on shape space," but rather wave functions on the section.

Shape space in the case $n = 3$ is homeomorphic to the region $x_3 \ge 0$ of $\mathbb{R}^3$, and in the case $n = 4$ to $\mathbb{R}^6$. A convenient tool for understanding the structure of shape space is by its foliation under the action of the kinematic rotations, eqn [5]. The kinematic rotations commute with ordinary rotations, and hence have an action on shape space. This action preserves the eigenvalues of the moment-of-inertia tensor.
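As a small numerical illustration of the last statement, the sketch below evaluates the moment-of-inertia tensor of eqn [9] for a three-body configuration (two Jacobi vectors) and compares it before and after a kinematic rotation. Eqn [5] is not reproduced in this section, so the form used here is an assumption: a kinematic rotation is taken to be an orthogonal mixing of the mass-weighted Jacobi vectors $\sqrt{M_\alpha}\,\mathbf{r}_\alpha$. The function names, masses, and vectors are hypothetical values chosen for illustration.

```python
import math

def inertia_tensor(masses, vectors):
    """E_ij = sum_a M_a (|r_a|^2 delta_ij - r_ai r_aj), following eqn [9]."""
    e = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, vectors):
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                e[i][j] += m * (r2 * delta - r[i] * r[j])
    return e

def mix_jacobi_vectors(masses, vectors, angle):
    """Assumed kinematic rotation: rotate the pair of mass-weighted
    Jacobi vectors into each other by `angle`, then undo the weighting."""
    m1, m2 = masses
    r1, r2 = vectors
    rho1 = [math.sqrt(m1) * x for x in r1]
    rho2 = [math.sqrt(m2) * x for x in r2]
    c, s = math.cos(angle), math.sin(angle)
    new1 = [c * a + s * b for a, b in zip(rho1, rho2)]
    new2 = [-s * a + c * b for a, b in zip(rho1, rho2)]
    return [[x / math.sqrt(m1) for x in new1],
            [x / math.sqrt(m2) for x in new2]]

masses = [1.0, 2.0]                           # reduced masses M_1, M_2
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # Jacobi vectors r_1, r_2

E = inertia_tensor(masses, vectors)
E_mixed = inertia_tensor(masses, mix_jacobi_vectors(masses, vectors, 0.7))
max_diff = max(abs(E[i][j] - E_mixed[i][j])
               for i in range(3) for j in range(3))
print(E)
print(max_diff)
```

With space-frame components, as used here, the whole tensor is unchanged under the assumed mixing, which in particular leaves its eigenvalues unchanged. In a body frame the components may differ by a rotation, which is why only the eigenvalues are invariant in general.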
Quantum Phase Transitions 289
This article will not consider such nonzero temperature phase transitions, but will instead describe second-order phase transitions at the absolute zero of temperature. Such transitions are driven by quantum fluctuations mandated by the Heisenberg uncertainty principle: one can imagine moving across the quantum critical point by effectively "tuning the value of Planck's constant, $\hbar$." Clearly, quantum mechanics plays a central role at such transitions, unlike the situation at nonzero temperatures. The reader may object that absolute zero is an idealization not realized by any experimental system; hence, the study of quantum phase transitions is a subject only of academic interest. As we will illustrate below, knowledge of the zero-temperature quantum critical points of a system is often the key to understanding its finite-temperature properties, and in some cases the influence of a zero-temperature critical point can be detected at temperatures as high as ambient room temperature.

We will begin in the following section by introducing some simple lattice models which exhibit quantum phase transitions. Next, the theory of the critical point in these models, which is based upon a natural extension of the Landau–Ginzburg–Wilson (LGW) method, will be presented. This section will also describe the consequences of a zero-temperature critical point on the nonzero temperature properties. Finally, we will consider more complex models in which quantum interference effects play a more subtle role, and which cannot be described in the LGW framework: such quantum critical points are likely to play a central role in understanding many of the correlated electron systems of current interest.

Simple Models

Quantum Ising Chain

This is a simple model of $N$ qubits, labeled by the index $j = 1, \ldots, N$. On each "site" $j$ there are two qubit quantum states $|\uparrow\rangle_j$ and $|\downarrow\rangle_j$ (in practice, these could be two magnetic states of an ion at site $j$ in a crystal). The Hilbert space therefore consists of $2^N$ states, each consisting of a tensor product of the states on each site. We introduce the Pauli spin operators, $\hat\sigma^\alpha_j$, on each site $j$, with $\alpha = x, y, z$:

$$\hat\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \quad \hat\sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \quad \hat\sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad [1]$$

These operators clearly act on the two states of the qubit on site $j$, and the Pauli operators on different sites commute.

The quantum Ising chain is defined by the simple Hamiltonian

$$H_I = -J \sum_{j=1}^{N-1} \hat\sigma^z_j \hat\sigma^z_{j+1} - gJ \sum_{j=1}^{N} \hat\sigma^x_j \qquad [2]$$

where $J > 0$ sets the energy scale, and $g \ge 0$ is a dimensionless coupling constant. In the thermodynamic limit ($N \to \infty$), the ground state of $H_I$ exhibits a second-order quantum phase transition as $g$ is tuned across a critical value $g = g_c$ (for the specific case of $H_I$ it is known that $g_c = 1$), as we will now illustrate.

First, consider the ground state of $H_I$ for $g \ll 1$. At $g = 0$, there are two degenerate "ferromagnetically ordered" ground states

$$|\Uparrow\rangle = \prod_{j=1}^{N} |\uparrow\rangle_j; \quad |\Downarrow\rangle = \prod_{j=1}^{N} |\downarrow\rangle_j \qquad [3]$$

Each of these states breaks a discrete "Ising" symmetry of the Hamiltonian: rotations of all spins by $180^\circ$ about the $x$-axis. These states are more succinctly characterized by defining the ferromagnetic moment, $N_0$, by

$$N_0 = \langle\Uparrow|\hat\sigma^z_j|\Uparrow\rangle = -\langle\Downarrow|\hat\sigma^z_j|\Downarrow\rangle \qquad [4]$$

At $g = 0$ we clearly have $N_0 = 1$. A key point is that in the thermodynamic limit, this simple picture of the ground state survives for a finite range of small $g$ (indeed, for all $g < g_c$), but with $0 < N_0 < 1$. The quantum tunneling between the two ferromagnetic ground states is exponentially small in $N$ (and so can be neglected in the thermodynamic limit), and so the ground state remains 2-fold degenerate and the discrete Ising symmetry remains broken. The change in the wave functions of these states from eqn [3] can be easily determined by perturbation theory in $g$: these small $g$ quantum fluctuations reduce the value of $N_0$ from unity but do not cause the ferromagnetism to disappear.

Now consider the ground state of $H_I$ for $g \gg 1$. At $g = \infty$ there is a single nondegenerate ground state which fully preserves all symmetries of $H_I$:

$$|\Rightarrow\rangle = 2^{-N/2} \prod_{j=1}^{N} \left( |\uparrow\rangle_j + |\downarrow\rangle_j \right) \qquad [5]$$

It is easy to verify that this state has no ferromagnetic moment, $N_0 = \langle\Rightarrow|\hat\sigma^z_j|\Rightarrow\rangle = 0$. Further, perturbation theory in $1/g$ shows that these features of the ground state are preserved for a finite range of large $g$ values
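The two limiting ground states can be checked directly on a short chain. The sketch below is illustrative only: it assumes an open chain of $N = 4$ sites with $J = 1$, stores a state as a list of $2^N$ real amplitudes indexed by bit patterns, and separates the Hamiltonian of eqn [2] into a bond coupling and a field coupling so that the limits $g = 0$ and $g \to \infty$ can each be probed. The names `apply_ising`, `moment`, `bond`, and `field` are hypothetical.

```python
def apply_ising(vec, n, bond, field):
    """Apply H = -bond * sum_j sz_j sz_{j+1} - field * sum_j sx_j to vec.

    Basis states are integers; bit j set means spin up on site j (0-based).
    With bond = J and field = g * J this is eqn [2] on an open chain.
    """
    out = [0.0] * len(vec)
    for state, amp in enumerate(vec):
        if amp == 0.0:
            continue
        spins = [1 if (state >> j) & 1 else -1 for j in range(n)]
        zz = sum(spins[j] * spins[j + 1] for j in range(n - 1))
        out[state] -= bond * zz * amp              # diagonal Ising bonds
        for j in range(n):
            out[state ^ (1 << j)] -= field * amp   # sigma^x flips spin j
    return out

def moment(vec, j):
    """<sigma^z_j> for a normalized state with real amplitudes."""
    return sum((1 if (s >> j) & 1 else -1) * a * a for s, a in enumerate(vec))

n = 4
dim = 2 ** n

up = [0.0] * dim
up[dim - 1] = 1.0                 # all spins up, first state of eqn [3]
down = [0.0] * dim
down[0] = 1.0                     # all spins down, second state of eqn [3]
right = [2 ** (-n / 2)] * dim     # uniform superposition, eqn [5]

# g = 0: only the bond term survives (J = 1).
h_up = apply_ising(up, n, bond=1.0, field=0.0)
h_down = apply_ising(down, n, bond=1.0, field=0.0)

# g -> infinity: only the field term matters; energies in units of g * J.
h_right = apply_ising(right, n, bond=0.0, field=1.0)

energy_up = h_up[dim - 1]
energy_down = h_down[0]
energy_right = h_right[0] / right[0]
print(energy_up, energy_down, energy_right)
print(moment(up, 0), moment(down, 0), moment(right, 0))
```

In this sketch the two ferromagnetic states come out as degenerate eigenstates with energy $-J(N-1)$ and moments $\pm 1$, while the state of eqn [5] is an eigenstate of the field term with energy $-gJN$ and zero moment, in line with eqn [4] and the discussion above.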
Figure 3 The triplon excitation of the $g > g_c$ paramagnet. The stationary triplon is an eigenstate only for $g = \infty$ but it becomes mobile for finite $g$.

triplet can hop from link to link, creating a gapped "triplon" quasiparticle excitation. This is similar to the large $g$ paramagnet for $H_I$, with the important difference that each quasiparticle is now 3-fold degenerate.

At $g = 1$, the ground state of $H_d$ is not known exactly. However, at this point $H_d$ becomes equivalent to the nearest-neighbor square lattice antiferromagnet, and this is known to have antiferromagnetic order in the ground state, as illustrated in Figure 4. This state is similar to the ferromagnetic ground state of $H_I$, with the difference that the magnetic moment now acquires a staggered pattern on the two sublattices, rather than the uniform moment of the ferromagnet. Thus, in this ground state

$$\langle \mathrm{AF}|\hat\sigma^\alpha_j|\mathrm{AF}\rangle = N_0\,\eta_j\,n_\alpha \qquad [10]$$

where $0 < N_0 < 1$ is the antiferromagnetic moment, $\eta_j = \pm 1$ identifies the two sublattices in Figure 4, and $n_\alpha$ is an arbitrary unit vector specifying the

Figure 4 Schematic of the ground state with antiferromagnetic order with $g < g_c$.

Quantum Criticality

The simple considerations of the previous section have given a rather complete description (based on the quasiparticle picture) of the physics for $g \ll g_c$ and $g \gg g_c$. We turn, finally, to the region $g \approx g_c$. For the specific models discussed in the previous section, a useful description is obtained by a method that is a generalization of the LGW method developed earlier for thermal phase transitions. However, some aspects of the critical behavior (e.g., the general forms of eqns [13]–[15]) will apply also to the quantum critical point of the section "Beyond LGW theory."

Following the canonical LGW strategy, we need to identify a collective order parameter which distinguishes the two phases. This is clearly given by the ferromagnetic moment in eqn [4] for the quantum Ising chain, and the antiferromagnetic moment in eqn [10] for the coupled dimer antiferromagnet. We coarse-grain these moments over some finite averaging region, and at long wavelengths this yields a real order parameter field $\phi_a$, with the index $a = 1, \ldots, n$. For the Ising case we have $n = 1$ and $\phi_a$ is a measure of the local average of $N_0$ as defined in eqn [4]. For the antiferromagnet, $a$ extends over the three values $x, y, z$ (so $n = 3$), and three components of $\phi_a$ specify the magnitude and orientation of the local antiferromagnetic order in eqn [10]; note the average orientation of a specific spin at site $j$ is $\eta_j$ times the local value of $\phi_a$.

The second step in the LGW approach is to write down a general field theory for the order parameter, consistent with all symmetries of the underlying model. As we are dealing with a quantum transition, the field theory has to extend over spacetime, with the temporal fluctuations representing the sum over histories in the Feynman path-integral approach. With this reasoning, the proposed partition function
for the vicinity of the critical point takes the following form:

$$Z_\phi = \int \mathcal{D}\phi_a(x, \tau)\, \exp\left( -\int d^d x\, d\tau \left[ \frac{1}{2}\left( (\partial_\tau \phi_a)^2 + c^2 (\nabla_x \phi_a)^2 + s\,\phi_a^2 \right) + \frac{u}{4!}\left( \phi_a^2 \right)^2 \right] \right) \qquad [11]$$

Here $\tau$ is imaginary time; there is an implied summation over the $n$ values of the index $a$, $c$ is a velocity, and $s$ and $u > 0$ are coupling constants. This is a field theory in $d + 1$ spacetime dimensions, in which the Ising chain corresponds to $d = 1$ and the dimer antiferromagnet to $d = 2$. The quantum phase transition is accessed by tuning the "mass" $s$: there is a quantum critical point at $s = s_c$ and the $s < s_c$ ($s > s_c$) regions correspond to the $g < g_c$ ($g > g_c$) regions of the lattice models. The $s < s_c$ phase has $\langle\phi_a\rangle \ne 0$ and this corresponds to the spontaneous breaking of spin rotation symmetry noted in eqns [4] and [10] for the lattice models. The $s > s_c$ phase is the paramagnet with $\langle\phi_a\rangle = 0$. The excitations in this phase can be understood as small harmonic oscillations of $\phi_a$ about the point (in field space) $\phi_a = 0$. A glance at eqn [11] shows that there are $n$ such oscillators for each wave vector. These oscillators clearly constitute the $g > g_c$ quasiparticles found earlier in eqn [7] for the Ising chain (with $n = 1$) and the triplon quasiparticle (with $n = 3$) illustrated in Figure 3 for the dimer antiferromagnet.

We have now seen that there is a perfect correspondence between the phases of the quantum field theory $Z_\phi$ and those of the lattice models $H_I$ and $H_d$. The power of the representation in eqn [11] is that it also allows us to get a simple description of the quantum critical point. In particular, readers may already have noticed that if we interpret the temporal direction $\tau$ in eqn [11] as another spatial direction, then $Z_\phi$ is simply the classical partition function for a thermal phase transition in a ferromagnet in $d + 1$ dimensions: this is the canonical model for which the LGW theory was originally developed. We can now take over standard results for this classical critical point, and obtain some useful predictions for the quantum critical point of $Z_\phi$. It is useful to express these in terms of the dynamic susceptibility defined by

Here $\hat\phi$ is the Heisenberg field operator corresponding to the path integral in eqn [11], the square brackets represent a commutator, and the angular brackets an average over the partition function at a temperature $T$. The structure of $\chi$ can be deduced from the knowledge that the quantum correlators of $Z_\phi$ are related by analytic continuation in time to the corresponding correlators of the classical statistical mechanics problem in $d + 1$ dimensions. The latter are known to diverge at the critical point as $\sim 1/p^{2-\eta}$ where $p$ is the $(d+1)$-dimensional momentum, $\eta$ is defined to be the anomalous dimension of the order parameter ($\eta = 1/4$ for the quantum Ising chain). Knowing this, we can deduce the form of the quantum correlator in eqn [12] at the zero-temperature quantum critical point

$$\chi(k, \omega) \sim \frac{1}{(c^2 k^2 - \omega^2)^{1 - \eta/2}}; \quad T = 0,\ g = g_c \qquad [13]$$

The most important property of eqn [13] is the absence of a quasiparticle pole in the spectral density. Instead, $\mathrm{Im}(\chi(k, \omega))$ is nonzero for all $\omega > ck$, reflecting the presence of a continuum of critical excitations. Thus the stable quasiparticles found at low enough energies for all $g \ne g_c$ are absent at the quantum critical point.

Figure 5 [plot residue: temperature $T$ versus coupling $g$, with $g_c$ marked on the $g$ axis; regions labeled "Domain wall quasiparticles," "Quantum critical," and "Flipped-spin quasiparticles"]

We now briefly discuss the nature of the phase diagram for $T > 0$ with $g$ near $g_c$. In general, the interplay between quantum and thermal fluctuations near a quantum critical point can be quite complicated, and we cannot discuss it in any detail here. However, the physics of the quantum Ising chain is relatively simple, and also captures many key features found in more complex situations, and is summarized in Figure 5. For all $g \ne g_c$ there is a range of low temperatures ($T < |g - g_c|$) where the long time dynamics can be described using a dilute gas of thermally excited quasiparticles. Further, the
dynamics of these quasiparticles is quasiclassical, although we reiterate that the nature of the quasiparticles is entirely distinct on opposite sides of the quantum critical point. Most interesting, however, is the novel quantum critical region, $T > |g - g_c|$, where neither the quasiparticle picture nor a quasiclassical description is appropriate. Instead, we have to understand the influence of temperature on the critical continuum associated with eqn [13]. This is aided by scaling arguments which show that the only important frequency scale which characterizes the spectrum is $k_B T/\hbar$, and the crossovers near this scale are universal, that is, independent of specific microscopic details of the lattice Hamiltonian. Consequently, the zero-momentum dynamic susceptibility in the quantum critical region takes the following form at small frequencies:

$$\chi(k = 0, \omega) \sim \frac{1}{T^{2-\eta}}\, \frac{1}{(1 - i\omega/\Gamma_R)} \qquad [14]$$

This has the structure of the response of an overdamped oscillator, and the damping frequency, $\Gamma_R$, is given by the universal expression

$$\Gamma_R = \left( 2 \tan\frac{\pi}{16} \right) \frac{k_B T}{\hbar} \qquad [15]$$

The numerical proportionality constant in eqn [15] is specific to the quantum Ising chain; other models also obey eqn [15] but with a different numerical value for this constant.

Beyond LGW Theory

The quantum transitions discussed so far have turned out to have a critical theory identical to that found for classical thermal transitions in $d + 1$ dimensions. Over the last decade it has become clear that there are numerous models, of key physical importance, for which such a simple classical correspondence does not exist. In these models, quantum Berry phases are crucial in establishing the nature of the phases, and of the critical boundaries between them. In less technical terms, a signature of this subtlety is an important simplifying feature which was crucial in the analyses of the section "Simple models": both models had a straightforward $g \to \infty$ limit in which we were able to write down a simple, nondegenerate, ground-state wave function of the "disordered" paramagnet. In many other models, identification of the disordered phase is not as straightforward: specifying absence of a particular magnetic order is not enough to identify a quantum state, as we still need to write down a suitable wave function. Often, subtle quantum interference effects induce new types of order in the disordered state, and such effects are entirely absent in the LGW theory.

Figure 6 Phase diagram of $H_s$ (antiferromagnetic order for $g < g_c$, VBS order for $g > g_c$). Two possible VBS states are shown: one which is the analog of Figure 2, and the other in which spins form singlets in a plaquette pattern. Both VBS states have a 4-fold degeneracy due to breaking of square lattice symmetry. So the novel critical point at $g = g_c$ (described by $Z_z$) has the antiferromagnetic and VBS orders vanishing as it is approached from either side: this coincident vanishing of orders is generically forbidden in LGW theories.

An important example of a system displaying such phenomena is the $S = 1/2$ square lattice antiferromagnet with additional frustrating interactions. The quantum degrees of freedom are identical to those of the coupled dimer antiferromagnet, but the Hamiltonian preserves the full point-group symmetry of the square lattice:

$$H_s = \sum_{j<k} J_{jk} \left( \hat\sigma^x_j \hat\sigma^x_k + \hat\sigma^y_j \hat\sigma^y_k + \hat\sigma^z_j \hat\sigma^z_k \right) + \ldots \qquad [16]$$

Here the $J_{jk} > 0$ are short-range exchange interactions which preserve the square lattice symmetry, and the ellipses represent possible further multiple spin terms. Now imagine tuning all the non-nearest-neighbor terms as a function of some generic coupling constant $g$. For small $g$, when $H_s$ is nearly the square lattice antiferromagnet, the ground state has antiferromagnetic order as in Figure 4 and eqn [10]. What is now the disordered ground state for large $g$? One natural candidate is the spin-singlet paramagnet in Figure 2. However, because all nearest neighbor bonds of the square lattice are now equivalent, the state in Figure 2 is degenerate with three other states obtained by successive $90^\circ$ rotations about a lattice site. In other words, the state in Figure 2, when transferred to the square lattice, breaks the symmetry of lattice rotations by $90^\circ$. Consequently it has a new type of order, often called valence-bond-solid (VBS) order. It is now believed that a large class of models like $H_s$ do indeed exhibit a second-order quantum phase transition between the antiferromagnetic state and a VBS state (see Figure 6). Both the existence of VBS order in the paramagnet, and of a second-order quantum transition, are features that are not predicted by LGW theory: these can only be
understood by a careful study of quantum interference effects associated with Berry phases of spin fluctuations about the antiferromagnetic state. We will not enter into details of this analysis here, but will conclude our discussion by writing down the theory so obtained for the quantum critical point in Figure 6:

$$Z_z = \int \mathcal{D}z_\alpha(x, \tau)\, \mathcal{D}A_\mu(x, \tau)\, \exp\left( -\int d^2 x\, d\tau \left[ |(\partial_\mu - iA_\mu) z_\alpha|^2 + s|z_\alpha|^2 + \frac{u}{2}\left( |z_\alpha|^2 \right)^2 + \frac{1}{2e^2}\left( \epsilon_{\mu\nu\lambda} \partial_\nu A_\lambda \right)^2 \right] \right) \qquad [17]$$

Here $\mu, \nu, \lambda$ are spacetime indices which extend over the two spatial directions and $\tau$, $\alpha$ is a spinor index which extends over $\uparrow, \downarrow$, and $z_\alpha$ is a complex spinor field. In comparing $Z_z$ to $Z_\phi$, note that the vector order parameter $\phi_a$ has been replaced by a spinor $z_\alpha$, and these are related by $\phi_a = z^*_\alpha \sigma^a_{\alpha\beta} z_\beta$, where $\sigma^a$ are the Pauli matrices. So the order parameter has fractionalized into the $z_\alpha$. A second novel property of $Z_z$ is the presence of a U(1) gauge field $A_\mu$: this gauge force emerges near the critical point, even though the underlying model in eqn [16] only has simple two spin interactions. The study of fractionalized critical theories like $Z_z$ in other models with spin and/or charge excitations is an exciting avenue for further theoretical research.

See also: Bose–Einstein Condensates; Boundary Conformal Field Theory; Fractional Quantum Hall Effect; Ginzburg–Landau Equation; High Tc Superconductor Theory; Quantum Central-Limit Theorems; Quantum Spin Systems; Quantum Statistical Mechanics: Overview.

Further Reading

Matsumoto M, Yasuda C, Todo S, and Takayama H (2002) Physical Review B 65: 014407.
Sachdev S (1999) Quantum Phase Transitions. Cambridge: Cambridge University Press.
Senthil T, Balents L, Sachdev S, Vishwanath A, and Fisher MPA, http://arxiv.org/abs/cond-mat/0312617.

Quantum Spin Systems
angular momentum called spin (Compton 1921, Goudsmit and Uhlenbeck 1925).

The second development was the attempt in statistical mechanics to explain ferromagnetism and the phase transition associated with it on the basis of a microscopic theory (Lenz and Ising 1925). The fundamental interaction between spins, the so-called exchange operator which is a subtle consequence of the Pauli exclusion principle, was introduced independently by Dirac and Heisenberg in 1926. With this discovery, it was realized that magnetism is a quantum effect and that a fundamental theory of magnetism requires the study of quantum-mechanical models. This realization and a large amount of subsequent work notwithstanding, some of the most fundamental questions, such as a derivation of ferromagnetism from first principles, remain open.

The first and most important quantum spin model is the Heisenberg model, so named after Heisenberg. It has been studied intensely ever since the early 1930s and its study has led to an impressive variety of new ideas in both mathematics and physics. Here, we limit ourselves to listing only some landmark developments.

Spin waves were discovered independently by Bloch and Slater in 1930 and they continue to play an essential role in our understanding of the excitation spectrum of quantum spin Hamiltonians. In two papers published in 1956, Dyson advanced the theory of spin waves by showing how interactions between spin waves can be taken into account.

In 1931, Bethe introduced the famous Bethe ansatz to show how the exact eigenvectors of the spin-1/2 Heisenberg model on the one-dimensional lattice can be found. This exact solution, directly and indirectly, led to many important developments in statistical mechanics, combinatorics, representation theory, quantum field theory and more. Hulthén used the Bethe ansatz to compute the ground-state energy of the antiferromagnetic spin-1/2 Heisenberg chain in 1938.

In their famous 1961 paper, Lieb, Schultz, and Mattis showed that some quantum spin models in one dimension can be solved exactly by mapping them into a problem of free fermions. This paper is still one of the most cited in the field.

Robinson, in 1967, laid the foundation for the mathematical framework, which we describe in the next section. Using this framework, Araki established the absence of phase transitions at positive temperatures in a large class of one-dimensional quantum spin models in 1969.

During the more recent decades, the mathematical and computational techniques used to study quantum spin models have fanned out in many directions. When it was realized in the 1980s that the magnetic properties of complex materials play an important role in high-$T_c$ superconductivity, the variety of quantum spin models studied in the literature proliferated. This motivated a large number of theoretical and experimental studies of materials with exotic properties that are often based on quantum effects that do not have a classical analog. An example of unexpected behavior is the prediction by Haldane of the spin liquid ground state of the spin-1 Heisenberg antiferromagnetic chain in 1983. In the quest for a mathematical proof of this prediction (a quest still ongoing today), Affleck, Kennedy, Lieb, and Tasaki introduced the AKLT model in 1987. They were able to prove that the ground state of this model has all the characteristic properties predicted by Haldane for the Heisenberg chain: a unique ground state with exponential decay of correlations and a spectral gap above the ground state.

There are also particle models that are defined on a lattice, or more generally, a graph. Unlike spins, particles can hop from one site to another. These models are closely related to quantum spin systems and, in some cases, are mathematically equivalent. The best-known example of a model of lattice fermions is the Hubbard model. Such systems are not discussed further in this article.

Mathematical Framework

Quantum spin systems present an area of mathematical physics where the demands of mathematical rigor can be fully met and, in many cases, this can be done without sacrificing the ability to include all physically relevant models and phenomena. This does not mean, however, that there are few open problems remaining. But it does mean that, in general, these open problems are precisely formulated mathematical questions.

In this section we review the standard mathematical framework for quantum spin systems, in which the topics discussed in the subsequent section can be given a precise mathematical formulation. It is possible, however, to skip this section and read the rest with only a physical or intuitive understanding of the notions of observable, Hamiltonian, dynamics, symmetry, ground state, etc.

The most common mathematical setup is as follows. Let $d \ge 1$, and let $\mathcal{L}$ denote the family of finite subsets of the $d$-dimensional integer lattice $\mathbb{Z}^d$. For simplicity we will assume that the Hilbert space of the "spin" associated with each $x \in \mathbb{Z}^d$ has the same dimension $n \ge 2$: $\mathcal{H}_{\{x\}} \cong \mathbb{C}^n$. The Hilbert space associated with the finite volume $\Lambda \in \mathcal{L}$ is then $\mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{H}_{\{x\}}$. The algebra of observables for the spin of site $x$ consists of the $n \times n$ complex matrices: $\mathcal{A}_{\{x\}} \cong M_n(\mathbb{C})$. For any
$\Lambda \in \mathcal{L}$, the algebra of observables for the system in $\Lambda$ is given by $\mathcal{A}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{A}_{\{x\}}$. The primary observables for a quantum spin model are the spin-S matrices $S^1$, $S^2$, and $S^3$, where $S$ is the half-integer such that $n = 2S + 1$. They are defined as Hermitian matrices satisfying the SU(2) commutation relations. Instead of $S^1$ and $S^2$, one often works with the spin-raising and -lowering operators, $S^+$ and $S^-$, defined by the relations $S^1 = (S^+ + S^-)/2$, and $S^2 = (S^+ - S^-)/(2i)$. In terms of these, the SU(2) commutation relations are

$[S^+, S^-] = 2S^3, \quad [S^3, S^\pm] = \pm S^\pm$   [1]

where we have used the standard notation for the commutator for two elements $A$ and $B$ in an algebra: $[A, B] = AB - BA$. In the standard basis $S^3$, $S^+$, and $S^-$ are given by the following matrices:

$S^3 = \begin{pmatrix} S & & & \\ & S-1 & & \\ & & \ddots & \\ & & & -S \end{pmatrix}$

$S^- = (S^+)^*$, and

$S^+ = \begin{pmatrix} 0 & c_S & & & \\ & 0 & c_{S-1} & & \\ & & \ddots & \ddots & \\ & & & 0 & c_{-S+1} \\ & & & & 0 \end{pmatrix}$

where, for $m = -S, -S+1, \ldots, S$,

$c_m = \sqrt{S(S+1) - m(m-1)}$

In the case $n = 2$, one often works with the Pauli matrices, $\sigma^1, \sigma^2, \sigma^3$, simply related to the spin matrices by $\sigma^j = 2S^j$, $j = 1, 2, 3$.

Most physical observables are expressed as finite sums and products of the spin matrices $S^j_x$, $j = 1, 2, 3$, associated with the site $x \in \Lambda$:

$S^j_x = \bigotimes_{y \in \Lambda} A_y$

with $A_x = S^j$, and $A_y = 1$ if $y \neq x$.

The $\mathcal{A}_\Lambda$ are finite-dimensional C*-algebras for the usual operations of sum, product, and Hermitian conjugation of matrices and with identity $1_\Lambda$.

If $\Lambda_0 \subset \Lambda_1$, there is a natural embedding of $\mathcal{A}_{\Lambda_0}$ into $\mathcal{A}_{\Lambda_1}$, given by

$\mathcal{A}_{\Lambda_0} \cong \mathcal{A}_{\Lambda_0} \otimes 1_{\Lambda_1 \setminus \Lambda_0} \subset \mathcal{A}_{\Lambda_1}$

The algebra of local observables is then defined by

$\mathcal{A}_{loc} = \bigcup_{\Lambda \in \mathcal{L}} \mathcal{A}_\Lambda$

Its completion is the C*-algebra of quasilocal observables, which we will simply denote by $\mathcal{A}$.

The dynamics and symmetries of a quantum spin model are described by (groups of) automorphisms of the C*-algebra $\mathcal{A}$, that is, bijective linear transformations on $\mathcal{A}$ that preserve the product and * operations. Translation invariance, for example, is expressed by the translation automorphisms $\tau_x$, $x \in \mathbb{Z}^d$, which map any subalgebra $\mathcal{A}_\Lambda$ to $\mathcal{A}_{\Lambda + x}$, in the natural way. They form a representation of the additive group $\mathbb{Z}^d$ on $\mathcal{A}$.

A translation-invariant interaction, or potential, defining a quantum spin model, is a map $\phi : \mathcal{L} \to \mathcal{A}$ with the following properties: for all $X \in \mathcal{L}$, we have $\phi(X) \in \mathcal{A}_X$, $\phi(X) = \phi(X)^*$, and for $x \in \mathbb{Z}^d$, $\phi(X + x) = \tau_x(\phi(X))$. An interaction is called finite range if there exists $R > 0$ such that $\phi(X) = 0$ whenever $\mathrm{diam}(X) > R$. The Hamiltonian in $\Lambda$ is the self-adjoint element of $\mathcal{A}_\Lambda$ defined by

$H_\Lambda = \sum_{X \subset \Lambda} \phi(X)$

For the standard Heisenberg model the interaction is given by

$\phi(\{x, y\}) = -J\, S_x \cdot S_y, \quad \text{if } |x - y| = 1$   [2]

and $\phi(X) = 0$ in all other cases. Here, $S_x \cdot S_y$ is the conventional notation for $S^1_x S^1_y + S^2_x S^2_y + S^3_x S^3_y$. The magnitude of the coupling constant $J$ sets a natural unit of energy and is irrelevant from the mathematical point of view. Its sign, however, determines whether the model is ferromagnetic ($J > 0$), or antiferromagnetic ($J < 0$). For the classical Heisenberg model, where the role of $S_x$ is played by a unit vector in $\mathbb{R}^3$, and which can be regarded, after rescaling by a factor $S^{-2}$, as the limit $S \to \infty$ of the quantum Heisenberg model, there is a simple transformation relating the ferro- and antiferromagnetic models (just map $S_x$ to $-S_x$ for all $x$ in the even sublattice of $\mathbb{Z}^d$). It is easy to see that there does not exist an automorphism of $\mathcal{A}$ mapping $S_x$ to $-S_x$, since that would be inconsistent with the commutation relations [1]. Not only is there no exact mapping between the ferro- and the antiferromagnetic models, their ground states and equilibrium states have radically different properties. See below for the definitions and further discussion.

The dynamics (or time evolution), of the system in finite volume $\Lambda$ is the one-parameter group of automorphisms of $\mathcal{A}_\Lambda$ given by

$\alpha^{(\Lambda)}_t(A) = e^{itH_\Lambda} A e^{-itH_\Lambda}, \quad t \in \mathbb{R}$
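The spin matrices and the commutation relations [1] can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original article: the function names (`spin_matrices`, `comm`, `heisenberg_two_site`) are illustrative choices, and the two-site Hamiltonian assumes the sign convention $\phi(\{x,y\}) = -J\,S_x \cdot S_y$ of eqn [2].

```python
import numpy as np


def spin_matrices(two_s):
    """Return (S3, S_plus, S_minus) for spin S = two_s / 2.

    The basis is ordered m = S, S-1, ..., -S (the standard basis of the text).
    """
    s = two_s / 2.0
    m = s - np.arange(two_s + 1)                  # m = S, S-1, ..., -S
    s3 = np.diag(m)
    # c_m = sqrt(S(S+1) - m(m-1)) for m = S, ..., -S+1 fills the superdiagonal.
    c = np.sqrt(s * (s + 1) - m[:-1] * (m[:-1] - 1))
    s_plus = np.diag(c, k=1)
    s_minus = s_plus.conj().T                     # S^- = (S^+)^*
    return s3, s_plus, s_minus


def comm(a, b):
    """Commutator [A, B] = AB - BA."""
    return a @ b - b @ a


def heisenberg_two_site(two_s, coupling):
    """H = -J S_x . S_y on two neighbouring sites (assumed sign convention)."""
    s3, s_plus, s_minus = spin_matrices(two_s)
    s1 = (s_plus + s_minus) / 2
    s2 = (s_plus - s_minus) / 2j
    dot = sum(np.kron(a, a) for a in (s1, s2, s3))
    return -coupling * dot


if __name__ == "__main__":
    for two_s in (1, 2, 3):
        s3, s_plus, s_minus = spin_matrices(two_s)
        assert np.allclose(comm(s_plus, s_minus), 2 * s3)
        assert np.allclose(comm(s3, s_plus), s_plus)
    # Spectrum of the two-site spin-1/2 ferromagnet (J = 1).
    print(np.linalg.eigvalsh(heisenberg_two_site(1, 1.0)))
```

For spin 1/2 and $J > 0$ the lowest eigenvalue belongs to the three-dimensional triplet, while for $J < 0$ it belongs to the singlet, which is a small instance of the difference between the ferro- and antiferromagnetic models mentioned above.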
observable algebra of infinite quantum spin systems on $\mathbb{Z}^d$. One can also define translation automorphisms for finite systems with periodic boundary conditions, which are defined on the torus $\mathbb{Z}^d / T\mathbb{Z}^d$, where $T = (T_1, \ldots, T_d)$ is a positive integer vector representing the periods.

Other graph automorphisms. In general, if $G$ is a group of automorphisms of the graph $\Lambda$, and $\mathcal{H}_\Lambda = \bigotimes_{x \in \Lambda} \mathbb{C}^n$ is the Hilbert space of a system of identical spins defined on $\Lambda$, then, for each $g \in G$, one can define a unitary $U_g$ on $\mathcal{H}_\Lambda$ by linear extension of $U_g \bigotimes_{x \in \Lambda} \varphi_x = \bigotimes_{x \in \Lambda} \varphi_{g^{-1}(x)}$, where $\varphi_x \in \mathbb{C}^n$, for all $x \in \Lambda$. These unitaries form a representation of $G$. With the unitaries one can immediately define automorphisms of the algebra of observables: for $A \in \mathcal{A}_\Lambda$, and $U \in \mathcal{A}_\Lambda$ unitary, $\tau(A) = U^* A U$ defines an automorphism, and if $U_g$ is a group representation, the corresponding $\tau_g$ will be, too. Common examples of graph automorphisms are the lattice symmetries of rotation and reflection. Translation symmetry and other graph automorphisms are often referred to collectively as spatial symmetries.

Local symmetries (also called gauge symmetries). Let $G$ be a group and $u_g$, $g \in G$, a unitary representation of $G$ on $\mathbb{C}^n$. Then, $U_g = \bigotimes_{x \in \Lambda} u_g$ is a representation on $\mathcal{H}_\Lambda$. The Heisenberg model [2], for example, commutes with such a representation of SU(2). It is often convenient, and generally equivalent, to work with a representation of the Lie algebra. In that case the SU(2) invariance of the Heisenberg model is expressed by the fact that $H_\Lambda$ commutes with the following three operators:

$S^i = \sum_{x \in \Lambda} S^i_x, \quad i = 1, 2, 3$

Note: sometimes the Hamiltonian is only symmetric under certain combinations of spatial and local symmetries. CP symmetry is an example.

For an automorphism $\tau$, we say that a state $\omega$ is $\tau$-invariant if $\omega \circ \tau = \omega$. If $\omega$ is $\tau_g$-invariant for all $g \in G$, we say that $\omega$ is G-invariant.

It is easy to see that if a quantum spin model has a symmetry $G$, then the set of all ground states or all $\beta$-KMS states will be G-invariant, meaning that if $\omega$ is in the set, then so is $\omega \circ \tau_g$, for all $g \in G$. By a suitable averaging procedure, it is usually easy to establish that the sets of ground states or equilibrium states contain at least one G-invariant element.

An interesting situation occurs if the model is G-invariant, but there are ground states or KMS states that are not. This means that, for some $g \in G$, and some $\omega$ in the set (of ground states or KMS states), $\omega \circ \tau_g \neq \omega$. When this happens, one says that there is spontaneous symmetry breaking, a phenomenon that also plays an important role in quantum field theory.

The famous Hohenberg–Mermin–Wagner theorem, applied to quantum spin models, states that, as long as the interactions do not have very long range and the dimension of the lattice is 2 or less, continuous symmetries cannot be spontaneously broken in a $\beta$-KMS state for any finite $\beta$.

Quantum group symmetries. We restrict ourselves to one important example: the $SU_q(2)$ invariance of the spin-1/2 XXZ Heisenberg chain with $q \in [0, 1]$, and with special boundary terms. The Hamiltonian of the $SU_q(2)$-invariant XXZ chain of length $L$ is given by

$H_L = \sum_{x=1}^{L-1} \left[ -\frac{1}{\Delta}\left(S^1_x S^1_{x+1} + S^2_x S^2_{x+1}\right) - \left(S^3_x S^3_{x+1} - \frac{1}{4}\right) + \frac{1}{2}\sqrt{1 - \Delta^{-2}}\,\left(S^3_{x+1} - S^3_x\right) \right]$

where $q \in (0, 1]$ is related to the parameter $\Delta \geq 1$ by the relation $\Delta = (q + q^{-1})/2$. When $q = 0$, $H_L$ is equivalent to the Ising chain. Thus, the XXZ model interpolates between the Ising model (the primordial classical spin system) and the isotropic Heisenberg model (the most widely studied quantum spin model). In the limit of infinite spin ($S \to \infty$), the model converges to the classical Heisenberg model (XXZ or isotropic). An interesting feature of the XXZ model is the existence of non-translation-invariant ground states, called kink states.

In this family of models, one can see how aspects of discreteness (quantized spins) and continuous symmetry (SU(2), or quantum symmetry $SU_q(2)$) are present at the same time in the quantum Heisenberg models, and the two classical limits ($q \to 0$ and $S \to \infty$) can be used as a starting point to study its properties.

Quantum group symmetry is not a special case of invariance under the action of a group. There is no group, but there is an algebra represented on the Hilbert space of each spin, for which there is a good definition of tensor product of representations, and "many" irreducible representations. In this example, the representation of $SU_q(2)$ on $\mathcal{H}_{[1, L]}$ commuting with $H_L$ is generated by

$S^3 = \sum_{x=1}^{L} 1_1 \otimes \cdots \otimes S^3_x \otimes 1_{x+1} \otimes \cdots \otimes 1_L$

$S^+ = \sum_{x=1}^{L} t_1 \otimes \cdots \otimes t_{x-1} \otimes S^+_x \otimes 1_{x+1} \otimes \cdots \otimes 1_L$

$S^- = \sum_{x=1}^{L} 1_1 \otimes \cdots \otimes S^-_x \otimes t^{-1}_{x+1} \otimes \cdots \otimes t^{-1}_L$
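As an illustration of the XXZ chain discussed above, the following Python/NumPy sketch builds $H_L$ for a short spin-1/2 chain and checks that it commutes with the total third component $S^3 = \sum_x S^3_x$. It is only a sketch under stated assumptions: the Hamiltonian is taken in the form $H_L = \sum_x [-\Delta^{-1}(S^1_x S^1_{x+1} + S^2_x S^2_{x+1}) - (S^3_x S^3_{x+1} - 1/4) + \tfrac{1}{2}\sqrt{1-\Delta^{-2}}\,(S^3_{x+1} - S^3_x)]$, and the helper names (`site_op`, `xxz_hamiltonian`, `total_s3`) are hypothetical.

```python
import numpy as np

# Spin-1/2 matrices S^j = sigma^j / 2.
S1 = np.array([[0, 0.5], [0.5, 0]], dtype=complex)
S2 = np.array([[0, -0.5j], [0.5j, 0]], dtype=complex)
S3 = np.array([[0.5, 0], [0, -0.5]], dtype=complex)
ID2 = np.eye(2, dtype=complex)


def site_op(op, x, length):
    """Embed the one-site operator `op` at site x (0-based) of a chain of `length` sites."""
    out = np.eye(1, dtype=complex)
    for y in range(length):
        out = np.kron(out, op if y == x else ID2)
    return out


def xxz_hamiltonian(length, delta):
    """Assumed form of H_L for delta >= 1; sign conventions as stated in the lead-in."""
    boundary = 0.5 * np.sqrt(1.0 - delta ** -2.0)
    dim = 2 ** length
    h = np.zeros((dim, dim), dtype=complex)
    for x in range(length - 1):
        a1, b1 = site_op(S1, x, length), site_op(S1, x + 1, length)
        a2, b2 = site_op(S2, x, length), site_op(S2, x + 1, length)
        a3, b3 = site_op(S3, x, length), site_op(S3, x + 1, length)
        h += -(a1 @ b1 + a2 @ b2) / delta
        h += -(a3 @ b3 - 0.25 * np.eye(dim))
        h += boundary * (b3 - a3)
    return h


def total_s3(length):
    """Total third spin component, the sum of S^3_x over all sites."""
    return sum(site_op(S3, x, length) for x in range(length))


if __name__ == "__main__":
    h = xxz_hamiltonian(4, 1.5)
    s3 = total_s3(4)
    print("||[H_L, S^3]|| =", np.linalg.norm(h @ s3 - s3 @ h))
```

With these conventions each nearest-neighbour term is an orthogonal projection, so the spectrum of $H_L$ is non-negative; this holds for either sign of the boundary term.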
"old quantum theory" for the atomic structure is based. The electromagnetic field is represented, via Fourier analysis, as a set of infinitely many independent harmonic oscillators, two for every wave vector $\mathbf{k}$, to take into account the polarization. The frequency depends linearly on the wave number $k = |\mathbf{k}|$ (linear dispersion law), and the spacing becomes negligible for macroscopic dimensions of the cavity. The key idea for computing the partition function is the discretization of the phase space of each oscillator (of frequency $\nu = \omega/2\pi$). Putting there the adimensionalized Lebesgue measure $dp\,dq/h$, where $h$ is a constant with physical dimensions of an action, we consider the regions $R_E$ bounded by the constant-energy ellipses and their areas $|R_E|$, and find

$|R_E| = \int_{R_E} \frac{dp\,dq}{h} = \frac{2\pi E}{h\omega} = \frac{E}{h\nu}$

If these adimensional areas have integer values, that is, $E = nh\nu$, $n = 0, 1, 2, \ldots$, the annular region ("cell" $C_n$) between $R_{E_n}$ and $R_{E_{n+1}}$ has unit area and so we approximate the partition function with the series ($\beta = 1/(k_B T)$, the ubiquitous parameter in statistical mechanics, often called "inverse temperature")

$Z_{discr} = \sum_n \exp(-n\beta h\nu) = \frac{1}{1 - \exp(-\beta h\nu)}$

Figure 1 Dependence of the electromagnetic energy density $u(\nu, T_i)$ on $\nu$, for $T_1 < T_2 < T_3$.

representing the radiating system was one of a gas of noninteracting photons, carrying energy and momentum, and being continuously created and absorbed.

A slightly different approach was used about the same time, for the problem of specific heat of crystalline solids.

The simpler model considers $N$ points on the nodes of the lattice $\mathbb{Z}^3$, in a cubic box of side $L$, and interacting through harmonic forces; similarly to the radiation problem, the system is represented by a collection of independent harmonic oscillators (normal modes), which are "quantized" as before: the corresponding quanta were called phonons (by Fraenkel, in 1932) for the role of the acoustic band of frequencies. In this simplified approach (by Debye, in 1913) the different phonons are determined by a finite set of wave vectors
The presence of an external field, like the periodic one given by the ionic lattice of a crystal, changes the situation in a relevant way, as the one-particle spectrum generally gets a band structure, and the allowed momenta are described in the reciprocal lattice: the Fermi sphere becomes a surface, and its structure is central for further developments.

For massive bosons, the strange superfluid features of liquid $^4$He at low temperature, that is, below the critical value 2.17 K, led F London, just after Kapitza's discovery in 1937, to speculate that these were related to a macroscopic occupation of the ground state (B–E condensation). A more realistic model has to take into account interaction between bosons (see last section) as the microscopic interactions in superfluid liquid $^4$He are not negligible.

Quantum N-Body Properties: Second Quantization

The main step in analyzing a quantum N-body system is its energy spectrum, and in particular its ground state, as it may represent a good approximation of the low-temperature states: its structure, the relations with possible symmetries of the Hamiltonian, its degeneracy, the dependence of its energy on the number of particles, are further relevant questions. The last one is related to the possibility of defining a thermodynamics for the system (Ruelle 1969). As a physically very interesting example, consider a system of electrically charged particles, $N$ electrons with negative unit charge, and $K$ atoms with positive charge $z$, say, interacting through electrostatic forces; the classical Coulomb potential as a function of distance behaves badly, as it diverges at zero and decreases slowly at infinity. The first question is about the stability: thanks to the exclusion principle, for the ground-state energy $E^0_{N,K}$ an extensive estimate from below is valid:

$E^0_{N,K} \geq -c_0 (N + Kz)$

so that a finite-volume grand partition function exists, while for the thermodynamic limit, which involves large distances, we need more, that is, charge neutrality, which allows for screening, and a fast-decreasing effective interaction.

Let us see an example (quantum spin, Heisenberg model) belonging to the class of lattice models, where the identical microscopic elements are distinguishable by their fixed positions, that is, the nodes of a lattice like $\mathbb{Z}^d$. To any site $x \in \mathbb{Z}^d$ is associated a copy $\mathcal{H}_x$ of a $(2s + 1)$-dimensional Hilbert space $\mathcal{H}$, where an irreducible unitary representation of SU(2) is given, so that the nonzero values for $s$ are $1/2, 1, 3/2, \ldots$. For any $x$, the generators $S_\alpha(x)$ ($\alpha = 1, 2, 3$) satisfy the well-known commutation relations of the angular momentum; moreover, $\sum_\alpha S_\alpha^2(x) = s(s + 1)\mathbf{1}$, and operators related to different sites commute. The ferromagnetic, isotropic, next-neighbors, magnetic field Hamiltonian for the finite system is

$H_\Lambda = -J \sum_{\langle x, y \rangle \subset \Lambda} \mathbf{S}(x) \cdot \mathbf{S}(y) - h \sum_{x \in \Lambda} S_3(x)$   [9]

where $J$ is the positive strength of the next-neighbors coupling ($\langle x, y \rangle$ means that $x$ and $y$ are next neighbors); $h$ is the intensity of the magnetic field oriented along the third axis. This model is considerably studied even now with several variants regarding possible anisotropies of the interaction, the possibly infinite range of the interaction, and the sign of $J$, for other (e.g., antiferromagnetic) couplings. Among the relevant results, the Mermin–Wagner theorem, at variance with the analogous classical spin model, states the absence of spontaneous magnetization in this zero-field model for $d = 2$ for any positive temperature; this can also be formulated as absence of symmetry breaking for this model (Fröhlich and Pfister in 1981 shed more light on this point).

As mentioned earlier, a useful mathematical tool for dealing with quantum systems of many particles or quasiparticles, is the occupation-number representation for the state of the system. The vector space for a system with an indefinite number of particles is the Fock space: it is the direct sum of all spaces with any number of particles, starting with the zero-particle, vacuum state. The operators which connect these subspaces are the creation and annihilation operators, very similar to the raising and lowering operators introduced by Dirac for the spectral analysis of the harmonic-oscillator Hamiltonian and the angular momentum, in the context of one-particle quantum theory.

It is perhaps worth sketching the action of these operators on the Fock space.

We consider spinless bosons first, as spin might easily be taken into account, if necessary. We suppose that a one-particle Hamiltonian has eigenfunctions labeled by a set of quantum numbers $k$, say, as the wave vector for the purely kinetic one-particle Hamiltonian. Let $|n_{k_1}, n_{k_2}, \ldots, n_{k_p}\rangle$ denote a vector state with $\sum_{i=1,\ldots,p} n_{k_i}$ particles, where $n_{k_i}$ denotes the number of particles with wave vector $k_i$, $i = 1, \ldots, p$; $|0\rangle$ denotes the no-particle, vacuum state. We define the creation operators $a^*_k$ as follows:

$a^*_k\, |\ldots, n_k, \ldots\rangle = \sqrt{n_k + 1}\; |\ldots, n_k + 1, \ldots\rangle$   [10]
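The action [10] of a creation operator on a single mode can be written as a matrix once the occupation number is cut off at some maximum value. The sketch below, in Python with NumPy, is illustrative only: the cut-off `n_max` and the function name `creation_operator` are assumptions introduced here, and the truncation spoils the commutator $[a, a^*] = 1$ in the highest state.

```python
import numpy as np


def creation_operator(n_max):
    """Matrix of a* on span{|0>, ..., |n_max>}, with a*|n> = sqrt(n+1) |n+1>.

    The top state |n_max> is sent to zero; this cut-off is an artefact of the
    finite matrix, not part of the definition [10].
    """
    return np.diag(np.sqrt(np.arange(1.0, n_max + 1)), k=-1)


n_max = 6
a_dag = creation_operator(n_max)
a = a_dag.T                                   # annihilation operator (real entries)
number = a_dag @ a                            # occupation-number operator

vacuum = np.zeros(n_max + 1)
vacuum[0] = 1.0

# Three applications of a* to the vacuum give sqrt(3!) |3>.
three = np.linalg.matrix_power(a_dag, 3) @ vacuum / np.sqrt(6.0)

if __name__ == "__main__":
    print("diagonal of a* a:", np.diag(number))
    print("a|0> =", a @ vacuum)
    print("n |3> = 3 |3>:", np.allclose(number @ three, 3 * three))
```

Away from the cut-off the matrices satisfy $a a^* - a^* a = 1$, and the vacuum is annihilated by `a`, in line with the statement that the whole space is generated by applying creation operators to the vacuum.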
momentum: for any $k$

$a^*_k a_k\, |\ldots, n_k, \ldots\rangle = n_k\, |\ldots, n_k, \ldots\rangle$

$a^*_k a_k := \hat{n}_k$ (the occupation-number operator)

The vacuum state belongs to $\mathrm{Ker}\, a_k$ for any $k$, and the whole space is generated by application of creation operators on the vacuum state.

The following basic commutation relations, for any $k, k'$, are valid:

$[a_k, a^*_{k'}] = \delta(k, k'), \quad [a_k, a_{k'}] = [a^*_k, a^*_{k'}] = 0$   [12]

For fermions, multiple occupancy is forbidden, so that the analogous annihilation ($\alpha_k$) and creation ($\alpha^*_k$) operators satisfy anticommutation relations:

$[\alpha_k, \alpha^*_{k'}]_+ = \delta(k, k'), \quad [\alpha_k, \alpha_{k'}]_+ = [\alpha^*_k, \alpha^*_{k'}]_+ = 0$   [13]

The presence of spin is handled by adding a spin label $\sigma$ to these symbols, and a factor $\delta(\sigma, \sigma')$, where necessary.

The Hamiltonian for a system of particles, say spinless bosons, in a box $\Lambda$, made of its kinetic part together with a two-body ($v(x - y)$) interaction, is written in terms of the "field operators"; if $\{\varphi_k(x)\}$ are the one-particle eigenfunctions of the single-particle purely kinetic Hamiltonian for the spinless case, and their complex conjugates are $\{\varphi^*_k(x)\}$, we define the fields

$\psi(x) = \sum_k \varphi_k(x)\, a_k, \quad \psi^*(x) = \sum_k \varphi^*_k(x)\, a^*_k$   [14]

So that the full Hamiltonian is given by

$H_\Lambda = \int_\Lambda dx\; \psi^*(x) \left( -\frac{\hbar^2}{2m} \Delta \right) \psi(x) + \int_\Lambda \int_\Lambda dx\, dy\; v(x - y)\, \psi^*(x)\psi(x)\psi^*(y)\psi(y)$   [15]

We mention that a theoretical breakthrough in the analysis of superfluidity was made by Bogoliubov (1946), who, starting from the Hamiltonian in [15], introduced the following Hamiltonian in the momentum representation:

$H_\Lambda = \sum_k \varepsilon_k\, a^*_k a_k + \frac{1}{2} \sum_{k, k', q} \hat{v}_q\, a^*_{k-q} a^*_{k'+q} a_k a_{k'}$   [16]

where $\varepsilon_k$ is the one-particle kinetic energy and $\hat{v}_q$ is the Fourier transform of the two-body potential. To study the excitation spectrum above the ground state, he introduced an approximation about the persistency of a macroscopic occupation of the ground state and a diagonalization procedure leading to new quasiparticles with a characteristic energy spectrum, linearly increasing near $|k| = 0$, then presenting a positive minimum before the subsequent increase (see Figure 4).

Figure 4 Excitation spectrum for superfluids (vertical axis: $\varepsilon/k_B$; horizontal axis: $k$).

Some Mathematical Tools for Macroscopic Quantum Systems

The formal apparatus of second quantization, born in the context of the quantum field theory, brought to statistical mechanics new ideas and techniques and related difficulties. For instance, the renormalization group was conceived in the 1970s to deal both with critical phenomena (i.e., power singularities of thermodynamic quantities around the critical point) and with divergences in quantum field theory. This subject is currently being developed and applied in models of quantum statistical mechanics (QSM) (Benfatto and Gallavotti, 1995).

Another issue, which has again strong relations with quantum field theory, is the algebraic formulation of QSM. This point of view, which is well suited for the analysis of infinitely extended quantum systems, uses a unified, synthetic, and rigorous language. The procedure for passing from a finite quantum system to its infinitely extended version deserves some attention.

It is well known that, for finite quantum systems, say $N$ particles in a box $\Lambda$, an observable is represented by a self-adjoint operator $A$ on a Hilbert space $\mathcal{H}_\Lambda$, and the normalized elements $\{|\psi\rangle\}$ of this space are the pure states which define the expectations

$\rho_\psi(A) := \langle \psi | A \psi \rangle$
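The canonical anticommutation relations [13] can be realized by explicit matrices on a finite number of modes. The sketch below uses the Jordan–Wigner construction, which is an assumption of this example and is not mentioned in the article; the names `annihilation` and `anticomm` are likewise illustrative.

```python
import numpy as np

Z = np.diag([1.0, -1.0])                     # parity (-1)^n on one mode
LOWER = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps the occupied state |1> to |0>
ID2 = np.eye(2)


def annihilation(k, modes):
    """Jordan-Wigner matrix of alpha_k on `modes` fermionic modes (k is 0-based).

    Parity factors Z on the modes before k produce the sign changes that make
    operators on different modes anticommute.
    """
    factors = [Z] * k + [LOWER] + [ID2] * (modes - k - 1)
    out = np.eye(1)
    for f in factors:
        out = np.kron(out, f)
    return out


def anticomm(a, b):
    """Anticommutator [A, B]_+ = AB + BA."""
    return a @ b + b @ a


if __name__ == "__main__":
    modes = 3
    ops = [annihilation(k, modes) for k in range(modes)]
    identity = np.eye(2 ** modes)
    for k in range(modes):
        for l in range(modes):
            expected = identity if k == l else 0 * identity
            assert np.allclose(anticomm(ops[k], ops[l].T), expected)
            assert np.allclose(anticomm(ops[k], ops[l]), 0)
    print("relations [13] hold on", modes, "modes")
```

Since the matrices are real, the creation operator $\alpha^*_k$ is simply the transpose. The relation $[\alpha^*_k, \alpha^*_k]_+ = 0$ gives $(\alpha^*_k)^2 = 0$, which is the matrix form of the statement that multiple occupancy is forbidden.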
The mixed states (mixtures) are defined by convex combinations of pure states, the coefficients having an obvious statistical meaning.

Among the observables, the Hamiltonian plays a special role, as it generates the dynamics of the system, which evolves the pure states through the unitary group (Schrödinger picture)

$|\psi(t)\rangle = \exp\left(-\frac{it}{\hbar} H_\Lambda\right) |\psi\rangle$

To the notion of equilibrium probability measure on the phase space of a classical system, corresponds the mixed state $\rho_{H_\Lambda, \beta}$, such that

$\rho_{H_\Lambda, \beta}(A) := Z_{\Lambda, \beta}^{-1}\, \mathrm{tr}\left(\exp(-\beta H_\Lambda) A\right)$   [17]

The normalization factor $Z_{\Lambda, \beta} = \mathrm{tr}(\exp(-\beta H_\Lambda))$ is

This relation is suitably extended for infinite size, and therefore defines a KMS state; it implies some physically relevant properties like stability with respect to local disturbances and dissipativity (Sewell 2002).

A final issue in this section concerns another formalism stemming from the Feynman path-integral formulation of quantum mechanics: here a functional integral represents the statistical equilibrium density operator $W_\beta = \exp(-\beta H)$. For a d-dimensional system of $N$ particles in a potential field ($X \in \mathbb{R}^{dN}$) $V = V(X) = \sum_{i<j} \phi(x_i - x_j)$ and Hamiltonian $H = -(1/2)\Delta + V$ the Feynman–Kac formula which, for a test function $\psi$, may be written as follows:

$(W_\beta \psi)(X) = \int P_{X,Y}(d\omega)\, \exp\left(-\int_0^\beta ds\, V(\omega(s))\right) \psi(Y)\, dY$
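For a finite-dimensional Hamiltonian the equilibrium state of eqn [17] is an explicit density matrix. The following Python/NumPy sketch is a minimal illustration, assuming a Hermitian matrix `h` stands in for $H_\Lambda$; the function names `gibbs_state` and `expectation` are introduced here and are not from the article.

```python
import numpy as np


def gibbs_state(h, beta):
    """Density matrix exp(-beta h) / tr exp(-beta h) for a Hermitian matrix h."""
    energies, vectors = np.linalg.eigh(h)
    # Subtracting the lowest energy leaves the normalized state unchanged
    # and avoids overflow for large beta.
    weights = np.exp(-beta * (energies - energies.min()))
    weights /= weights.sum()
    # Scale eigenvector column j by weights[j], then recombine.
    return (vectors * weights) @ vectors.conj().T


def expectation(rho, a):
    """Expectation tr(rho a) of an observable a in the state rho."""
    return np.trace(rho @ a).real


if __name__ == "__main__":
    h = np.diag([0.0, 1.0])                 # two-level system with gap 1
    for beta in (0.1, 1.0, 10.0):
        rho = gibbs_state(h, beta)
        print(beta, expectation(rho, h))
```

For the two-level example the mean energy is $e^{-\beta}/(1 + e^{-\beta})$, which tends to $1/2$ at high temperature and to $0$ at low temperature.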
properties of this system, an important tool is the Gross–Pitaevskii energy functional for the condensate wave function $\phi$,

$E[\phi] = \int dx \left[ \frac{\hbar^2}{2m} |\nabla \phi|^2 + V_{ext}(x) |\phi|^2 + \frac{g}{2} |\phi|^4 \right]$

where the quartic term represents the reduced (mean-field) interaction among particles.

The second issue, that is, the high-temperature superconductivity, certainly deserves much attention. It has been observed recently in some ceramic materials well above 100 K, and a clear model which takes into account the formation of pairs and the peculiar isotropy–anisotropy aspects of the normal conductivity and superconductivity is still lacking (Mattis 2003).

Finally, let us consider the fractional quantum Hall effect; recall that the integer version, that is, a discretization of the Hall resistivity $R_H$ by multiples of $h/e^2$, finds an explanation in terms of band spectra, formation of magnetic Landau levels, and localization from surface impurities, that is, without taking into account direct interactions among electrons.

The fractional discretization of $R_H$ (Störmer 1999) has a theoretical interpretation, in terms of subtle collective behavior of the two-dimensional semiconductor electron system: the quasiparticles which represent the excitations may behave as composite fermions or bosons, or exhibit a fractional statistics (see Fractional Quantum Hall Effect).

This brief excursion through these new fascinating phenomena shows the rich interplay between theory and experiments: these phenomena are a source of new ideas and suggest new models for further progress.

See also: Bose–Einstein Condensates; Dynamical Systems and Thermodynamics; Exact Renormalization Group; Falicov–Kimball Model; Fermionic Systems; Finitely Correlated States; Fractional Quantum Hall Effect; High Tc Superconductor Theory; Hubbard Model; Quantum Phase Transitions; Quantum Spin Systems; Stability of Matter.

Further Reading

Benfatto G and Gallavotti G (1995) Renormalization Group. Princeton: Princeton University Press.
Gallavotti G (1999) Statistical Mechanics: A Short Treatise. Berlin: Springer.
Glimm J and Jaffe A (1981) Quantum Physics. A Functional Integral Point of View. New York: Springer.
Landau LD and Lifschitz EN (2000) Statistical Physics, Course of Theoretical Physics, 3rd edn., vol 5. Parts I and II. Oxford: Butterworth-Heinemann.
Mattis DC (2003) Statistical Mechanics Made Simple. NJ: World Scientific.
Ruelle D (1969) Statistical Mechanics. Rigorous Results. New York: Benjamin.
Sewell GL (2002) Quantum Mechanics and Its Emergent Macrophysics. Princeton: Princeton University Press.
Sinai YaG (1982) Theory of Phase Transitions: Rigorous Results. Budapest: Akadémiai Kiadó.
Störmer HL (1999) Nobel lecture: the quantum Hall effect. Reviews of Modern Physics 71: 875–889.
Taylor PL and Heinonen O (2002) A Quantum Approach to Condensed Matter Physics. Cambridge: Cambridge University Press.
Quasiperiodic Systems
P Kramer, Universität Tübingen, Tübingen, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction: From Periodic to Quasiperiodic Systems

Periodic systems occur in many branches of physics. Their mathematical analysis was stimulated in particular by the analysis of the periodic translational symmetry of crystals. The systematic study of the compatibility between translational and crystallographic point or reflection symmetry leads to the concept of space group symmetry. Mathematical crystallography in three dimensions (3D) culminated in 1892 in the complete classification of the 230 space groups due to Fedorov and Schoenflies (see Schwarzenberger (1980, pp. 132–135)). One characteristic property of periodic systems is that their Fourier transform has a pure point spectrum. Since the Fourier spectrum is experimentally accessible through diffraction experiments, it provides a main tool for the structure determination of crystals.

With quantum mechanics in the twentieth century, it became possible to describe crystal structures quantitatively as ordered systems of atomic nuclei and electrons with electromagnetic interactions. The representation theory of crystallographic space groups now opened the way to verify the
space group symmetry of atomic systems for example from the band structure of crystals. It was then believed that in physics atomic long-range order is linked to periodicity and hence to the paradigm of the 230 space groups in 3D.

Mathematical analysis beyond this paradigm started independently in various directions. Bohr (1925) studied quasiperiodic functions and their Fourier transform. He interpreted them as restrictions of periodic functions in nD to their values on a linear subspace of orientation irrational with respect to a lattice. Mathematical crystallography in general dimension $n > 3$, including point group symmetry, was started around 1949 in work by Hermann and by Zassenhaus (see Schwarzenberger (1980)), and completed in 1978 for $n = 4$ in Brown et al. (1978).

A different route was taken by Penrose (1974). He constructed an aperiodic tiling (covering without gaps or overlaps) of the plane. Its tiles in two rhombus shapes provide global 5-fold point symmetry and make the tiling incompatible with any periodic lattice in 2D. The connection between Penrose's aperiodic tiling and irrational subspaces in periodic structures was made by de Bruijn (1981). He interpreted the Penrose rhombus tiling as the intersection of geometric objects from cells of a hypercubic lattice in 5D with a 2D subspace, irrational and invariant under 5-fold noncrystallographic point symmetry. Kramer and Neri (1984) embedded the icosahedral group as a point group into the hypercubic lattice in 6D and constructed a 3D irrational subspace invariant under the noncrystallographic icosahedral point group. From intersections of boundaries of the hypercubic lattice cells with this subspace, they constructed a 3D tiling of global icosahedral point symmetry with two rhombohedral tiles.

Shechtman et al. (1984) discovered in the system AlMn diffraction patterns of icosahedral point symmetry. Since icosahedral symmetry is incompatible with a lattice in 3D, they concluded that there exists atomic long-range order without a lattice. The new paradigm of quasiperiodic long-range order in quasicrystals was established and since then stimulated a broad range of theoretical and experimental research.

The interplay between the notions – (1) of crystallographic symmetry in nD, $n > 3$, (2) of subspaces invariant under a point group but irrational and hence incompatible with a lattice, and (3) of discrete geometric periodic objects in nD providing quasiperiodic tilings on these subspaces – forms the mathematical basis for a new quasiperiodic long-range order found in quasicrystals. The present-day theory of quasicrystals offers the most elaborate study of quasiperiodic systems. Therefore, we shall focus in what follows on the concepts developed in this theory.

In the following section, we briefly review basic concepts of periodic systems and lattices in nD, their classification in terms of point symmetry and space groups, and their cell structure. In a section on quasiperiodic point sets and functions, a quasiperiodic system is taken as a geometric object on an irrational mD subspace in an n-dimensional space and lattice. Noncrystallographic point symmetry is shown to select the irrational subspace. Next, scaling symmetry in quasiperiodic systems is demonstrated. Then, examples of quasiperiodic systems with point and scaling symmetry are given. The penultimate section discusses quasiperiodic tilings and their windows. Finally, the notion of a fundamental domain for quasiperiodic functions compatible with a tiling is illustrated.

Concepts from Periodic Systems

A distribution $f^p(x)$ of geometric objects on Euclidean space $E^n$ (a real linear space equipped with standard Euclidean scalar product $\langle\,,\rangle$ and metric) with coordinates $x \in E^n$ is called "periodic" if it is invariant under translations $b_i$ in $n$ linearly independent directions,

$(p) : f^p : \quad f^p(x + b_i) = f^p(x), \quad i = 1, \ldots, n$   [1]

The set of all translations on $E^n$ forms the discrete additive abelian translation group

$T = \left\{ b \in E^n : b = \sum_{i=1}^{n} m_i b_i, \; (m_1, \ldots, m_n) \in \mathbb{Z}^n \right\}$   [2]

Any orbit (set of all images of an initial point) under the action $T \times E^n \to E^n$ yields a lattice $\Lambda$ on $E^n$. Since $T$ acts fixpoint-free, there is a one-to-one correspondence $\Lambda \leftrightarrow T$. A fundamental domain on $E^n$ is defined as a subset of points $x \in E^n$ which contains a representative point from any orbit under $T$. Such a fundamental domain can be chosen, for example, as the unit cell of the lattice or as the Voronoi cell (eqn [5]). By eqn [1], the functional values on $E^n$ of a periodic function $f^p(x)$ are completely determined from its values on a fundamental domain of $E^n$.

Given the lattice basis $(b_1, \ldots, b_n)$ of eqn [2] in $E^n$, the vector components of the basis form the $n \times n$ basis matrix $B$ of $\Lambda$. The most general change of the basis preserving the lattice is given by acting with any element $h$ of the general linear group
$Gl(n, \mathbb{Z})$, with integral matrix entries and determinant $\pm 1$, on the lattice basis,

$Gl(n, \mathbb{Z}) \ni h : \quad B \to B' = Bh$   [3]

The crystallographic classification of inequivalent lattices in $E^n$ starts from $Gl(n, \mathbb{Z})$. In addition to translations, it employs crystallographic point symmetry operations (Brown et al. 1978, p. 9). A crystallographic point group operation of a lattice is a Euclidean isometry $g$ which belongs to a group $G \ni g$ with representations $D : G \to O(n, \mathbb{R})$ and $\mathcal{D} : G \to Gl(n, \mathbb{Z})$ such that

$G = \{ g : D(g) B = B \mathcal{D}(g) \}$   [4]

The maximal crystallographic point group for given lattice $\Lambda$ is the holohedry of $\Lambda$. The group generated by $T$ and $G$ is a space group which classifies the lattice. For finer details in the classification of space groups, we refer to Brown et al. (1978). For crystallography in $E^3$, this classification yields 230 space groups. Crystallography in $E^n$ is described in Schwarzenberger (1980) and in Brown et al. (1978) where it is elaborated for $E^4$.

From a lattice $\Lambda \in E^n$ and from the Euclidean metric, one constructs a cell structure as follows: the Voronoi cell $V(b)$, centered at a lattice point $b \in \Lambda$, known in physics as the Wigner–Seitz cell, is the set of points

$V(b) = \{ x \in E^n : |x - b| \leq |x - b'|, \; b' \in \Lambda \}$   [5]

Any Voronoi cell has a hierarchy of boundaries $X_p$ of dimension $p$, $0 \leq p \leq n$, which we denote as p-boundaries.

The set of Voronoi cells at all lattice points form the $\Lambda$-periodic Voronoi complex of $\Lambda \in E^n$. The Voronoi cells and complexes associated with a lattice admit a notion of geometric duality. We denote dual objects by a star, $\star$. They are built from convex hulls of sets of lattice points (Kramer and Schlottmann 1989) as follows. A Voronoi p-boundary $X_p$ is shared by several Voronoi cells $V(b)$ and determines a set of lattice points

$S(X_p) : \{ b \in \Lambda : X_p \in V(b) \}$   [6]

The boundary dual to $X_p$ is defined as the convex hull $X^\star_{(n-p)} := \mathrm{conv}\{ b : b \in S(X_p) \}$. $X^\star_{(n-p)}$ can be shown to be an $(n - p)$-boundary of a dual Delone cell. A Delone cell $D$ is defined as the convex hull of all lattice points whose Voronoi cells share a single vertex, called a hole of the lattice.

spectrum is a pure point spectrum and the Fourier coefficients can be referred to the points of a reciprocal lattice $\Lambda^*$ (eqn [7]) in Fourier space $E^n_*$. We denote objects belonging to this Fourier space by the index $*$. The basis matrix $B^*$ of the reciprocal lattice $\Lambda^* \in E^n_*$ is obtained from $B$ as the inverse transpose,

$\langle b^*_i, b_j \rangle = \delta_{ij} \;\leftrightarrow\; B^* = (B^{-1})^T$   [7]

The values of the Fourier coefficients of $f^p(x)$ reduce to integrals over the fundamental domain of the lattice $\Lambda$. From eqns [4] and [7] it follows that the orthogonal representation of a point group $G$ in coordinate and in Fourier space coincides. The Fourier spectrum and its point symmetry in crystals are observed in diffraction experiments.

Quasiperiodic Point Sets and Functions

Quasiperiodic functions are characterized from their Fourier spectrum (Bohr 1925) by

(qp) The Fourier point spectrum of a quasiperiodic function forms a $\mathbb{Z}$-module $M^*$ of rank $n$, $n > m$, on Fourier space $E^m_*$.

A $\mathbb{Z}$-module of rank $n$, $n > m$, on $E^m_*$ is defined as a set

$M^* = \left\{ b^* : b^* = \sum_{j=1}^{n} m_j b^*_j, \; (m_1, \ldots, m_n) \in \mathbb{Z}^n \right\}$   [8]

with the $\mathbb{Z}$-module basis $(b^*_1, \ldots, b^*_n)$ linearly independent with respect to integral linear combinations. The step from a lattice $\Lambda^*$ to a module $M^*$ is nontrivial since the set of all module points becomes dense on $E^m_*$. The Fourier coefficients of a quasiperiodic function are assigned to the discrete set of module points (eqn [8]).

Bohr in his analysis of quasiperiodic functions (Bohr 1925, II, pp. 111–125) shows that a general $\mathbb{Z}$-module $M^*$ of rank $n$ can be taken as the projection to a subspace $E^m_*$ of dimension $m$ of a (nonunique) lattice $\Lambda^* \in E^n_*$, $n > m$. It is convenient to consider in Fourier space $E^n_*$ an orthogonal splitting which we denote as

$E^n_* = E^m_{\parallel *} + E^{(n-m)}_{\perp *}, \quad E^m_{\parallel *} \perp E^{(n-m)}_{\perp *}$   [9]

A characterization of a quasiperiodic function $f^{qp}(x)$ on coordinate space is obtained as follows. From $\Lambda^*$ one can construct with the help of eqn [7]
Since these vertices fall into classes of orbits under the lattice := ( ) reciprocal to on a coordi-
translations, they determine translationally inequi- nate space En and associate to it via the Fourier
valent classes of Delone cells D , D , . . . . series a quasiperiodic function on a coordinate
(nm)
Fourier analysis applied to a periodic function f p (x) subspace Em n m
k of E = Ek þ E? , equipped with a
on En reduces to an n-fold Fourier series. The Fourier Z-module M (eqn [11]). As a result one finds a
Point Symmetry in Quasiperiodic The left of eqn [18] generates two 2D inequivalent
Systems representations of 5-fold planar rotations which are
incompatible with any 2D lattice.
Quasiperiodic systems with noncrystallographic The lattice A4 in addition has a scaling symmetry
point symmetry provide the structure theory and with a factor . The scaling transformation may be
physics of quasicrystals. We illustrate the general expressed in terms of the basis (eqn [16]) and an
scheme (qp) of eqn [12] by examples of 5-fold and element h 2 Gl(4, Z) as
icosahedral point symmetry. For generalizations, see
Janssen (1986). 2 3
0 0 0
6 0 0 0 7
6 7 1 2 3 4
6 7ðb ; b ; b ; b Þ
Example 2: 5-Fold Point Symmetry from 4 0 0 1 0 5
the Root Lattice A4
0 0 0 1
4
The A4 root lattice basis in E may be derived (Baake
et al. 1990) from five orthonormal unit vectors
(e1 , e2 , e3 , e4 , e5 ) in E5 as 2 3
0 1 0 1
6 0 1 1 17
B ¼ ðb1 ; b2 ; b3 ; b4 Þ ¼ ðb1 ; b2 ; b3 ; b4 Þ6
41
7 ½19
1 1 05
:¼ ðe1 e2 ; e2 e3 ; e3 e4 ; e4 e5 Þ ½16 1 0 1 0
As the generator of the cyclic group C5 of 5-fold It is easily verified that the operations of scaling and
rotations, we take the cyclic permutation (12345) in of 5-fold rotation (eqns [19] and [18]) commute
cycle notation acting on the vectors (e1 , e2 , e3 , e4 , e5 ). with one another.
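The 5-fold symmetry of Example 2 can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original treatment; the names `B`, `P`, and `M` are chosen here for illustration. It builds the basis of eqn [16], lets the cyclic permutation (12345) act on it, and confirms that the action is given by an integral matrix of order 5 and determinant 1, that is, by an element of Gl(4, Z). The final line uses the standard fact that a planar rotation by 2π/5 has trace 2 cos(2π/5) = 1/τ, assuming τ denotes the golden ratio (1 + √5)/2; since an integral 2 × 2 matrix has integer trace, this is one way to see why such a rotation cannot preserve a 2D lattice.

```python
import numpy as np

# Orthonormal vectors e_1..e_5 of E^5, taken as the columns of the identity.
e = np.eye(5)

# A4 basis of eqn [16]: b_i = e_i - e_{i+1}, stored as the columns of a 5 x 4 matrix.
B = np.stack([e[:, i] - e[:, i + 1] for i in range(4)], axis=1)

# Cyclic permutation (12345) acting on E^5: e_i -> e_{i+1}, e_5 -> e_1.
P = np.roll(np.eye(5), 1, axis=0)

# Solve P B = B M for M. The system is consistent, so least squares is exact
# up to rounding, and rounding recovers the integer entries.
M, *_ = np.linalg.lstsq(B, P @ B, rcond=None)
M = np.rint(M).astype(int)

print(M)
print("order 5:", np.array_equal(np.linalg.matrix_power(M, 5), np.eye(4, dtype=int)))
print("determinant 1:", abs(np.linalg.det(M) - 1.0) < 1e-9)

# Trace of a planar rotation by 2*pi/5 versus the golden ratio (assumed meaning of tau).
tau = (1 + np.sqrt(5)) / 2
print("2cos(2pi/5) = 1/tau:", np.isclose(2 * np.cos(2 * np.pi / 5), 1 / tau))
```

The matrix `M` is the companion matrix of the cyclotomic polynomial x^4 + x^3 + x^2 + x + 1, whose roots are the four primitive fifth roots of unity.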
Example 3: Icosahedral Point Symmetry from Example 4: The Quasiperiodic Fibonacci Point Set
Lattices = Z 6 , D6
If in the Fibonacci system (Figure 1), one attaches to
The icosahedral group G = H3 has two inequivalent any point b of the square lattice as a window the
3D noncrystallographic representations. H3 allows characteristic function of the perpendicular pro-
for an induced embedding representation D : H3 ! jection V? (b) of the unit square attached to b, the
Gl(6, Z), (Kramer and Neri 1984, Kramer et al. function f qp (xk ) becomes the standard quasiperiodic
1992, Kramer and Papadopolos 1997) into a Fibonacci sequence of points.
hypercubic lattice = Z6 . This representation The dual cell geometry of Voronoi and Delone
reduces into two 3D orthogonal inequivalent irre- cells and their dual boundaries (eqns [5] and [6])
ducible noncrystallographic representations Dk : allows us to construct dual canonical quasiperiodic
H3 ! O(3, R), D? : H3 !O(3, R). The irrational tilings (T , ), (T , ) (Kramer and Schlottmann
basis matrix of eqn [12] for = Z6 becomes 1989). To this end one constructs from local projec-
(Kramer et al. 1992, p. 185, eqn (7)) tions of pairs of dual boundaries Xm, k , X(nm), ? or
Xm, k , X(nm), ? the direct product polytopes Xm, k
B ¼ ðb1 ; b2 ; b3 ; b4 ; b5 ; b6 Þ X(nm), ? or Xk X? called ‘‘klotz polytopes.’’ The
2 3 characteristic functions on these polytopes form the
0 1 1 0
61 windows for the tiles Xm, k , X(nm), k , respectively.
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi6 0 1 077
6 7
1 6 0 0 1 17 ½20 Example 5: The Quasiperiodic Fibonacci Tiling
¼ 6 7
2ð þ 2Þ6 60 1 0 177 The Voronoi cells V of the square lattice are squares
6 7
4 1 1 0 05 centered at lattice points, the Delone cells D are
1 0 0 1 squares centered at the vertices of Voronoi squares.
The product polytopes X1, ? X1, k from projections
with = , 1 = 1. The six basis vectors with of dual 1-boundaries X1 , X1 of Delone and Voronoi
components in the upper three rows span the so- squares (cf. Figure 1) become the two types of square
called primitive icosahedral Z-module associated windows A, B. If a parallel line section x = xk þ c?
with Dk in E3k in the sense of eqn [11]. In this crosses one of these squares, the tile Ak or Bk is
space they point along the directions of six 5-fold formed. The standard Fibonacci tiling results.
axes of the icosahedron. Example 6: Canonical Tilings from the Root
A second lattice in E6 which admits icosahedral Lattices A4 , D6
point symmetry is the root lattice D6 . The basis of
this root lattice, often denoted as the P-lattice, is The two rhombus tiles of the planar quasiperiodic
obtained from eqn [20] by a centering matrix Penrose pattern (Penrose 1974) (T , A4 ) are the projec-
given in Kramer et al. (1992, p. 185, eqn (8)). The tions of 2-boundaries of the Voronoi complex of the
corresponding Z-module is inequivalent to the root lattice A4 2 E4 (Baake et al. 1990). The triangle
module projected from eqn [20]. The third tiles of the dual tiling (T , A4 ) are shown in Figure 2.
lattice of icosahedral point symmetry in E6 is They are projections of 2-boundaries from the
= I := P reciprocal to the root lattice D6 . All Delone complex of the same lattice. A full analysis
three icosahedral modules admit (powers of) of dual Voronoi and Delone boundaries of the root
-scaling. lattice D6 is given in Kramer et al. (1992). It leads to
icosahedral tilings (T , D6 ) and (T , D6 ) of E3 , (Kramer
et al. 1992, Kramer and Papadopolos 1997, Kramer
and Schlottmann 1989) and to models of icosahedral
Quasiperiodic Tilings and Their Windows quasicrystals.
Quasiperiodic sets of points arise from the general
scheme (qp) (eqn [12]) by choosing particular
Fundamental Domains for Quasiperiodic
periodic functions in the embedding space En , called
Tilings
the ‘‘windows,’’ whose intersections with Ek are the
quasiperiodic sets of points. Canonical tilings allow us to construct quasiperiodic
The window for the construction of a discrete functions equipped with a quasiperiodic counterpart
quasiperiodic point set based on eqn [12] is given by of fundamental domains or cells in crystals: assume
the characteristic function (x? ) on the projection that the tiles of a tiling (T , ) all are translates in Em
V? (x? ) := ? (V(b)) of the Voronoi cell (eqn [5]), of a finite minimal set of prototiles (X1 , . . . , Xr ).
attached to any lattice point b 2 . Consider the class of quasiperiodic functions which
take identical values on any translate of a prototile. These values are prescribed on the finite set of prototiles in $E^m$, which define a fundamental domain for this class of quasiperiodic functions. Only this class of quasiperiodic functions is compatible with the tiling. It can be characterized in the scheme (qp) (eqn [12]) by $\Lambda$-periodic functions on $E^n$ whose values on the tile windows of the previous section are independent of the perpendicular coordinate. A fundamental domain for the triangle tiling $(T^*, A_4)$ is given by the shaded parts in Figure 2. The fundamental domain property appears in relation with the theory of covering of quasiperiodic sets (see Kramer and Papadopolos (2000)).

Figure 2 A patch of the planar quasiperiodic triangle tiling $(T^*, A_4)$ obtained from the root lattice $A_4 \in E^4$. The tiles are two triangles, projections of 2-boundaries from the Delone cells of $A_4$. The vertices are projections of lattice points. The 20 shaded triangles form a set of prototiles such that any other tile is a translate of one of them. The shaded set forms a fundamental domain for the tiling.

Example 7: Fundamental Domain for the Fibonacci Tiling

Attach to the squares A, B in Figure 1 a periodic function $f^p(x)$ with functional values independent of the perpendicular coordinate $x_\perp$ within the two squares. Consider the functional values $f^{qp}(x_\parallel) = f^p(x_\parallel + c_\perp)$ picked up on a parallel line. Clearly, these values become independent of the perpendicular coordinate of any intersection with a square A, B. The general prescription of values on a fundamental domain of $\Lambda \in E^2$ needed for a quasiperiodic function reduces to a prescription of its functional values in $E_\parallel$ on the fundamental domain formed by the two prototiles $A_\parallel$, $B_\parallel$.

Conclusion

For quasiperiodic systems, the general construction was introduced in the section "Quasiperiodic point sets and functions", and illustrations were given in four subsequent sections. Further reading resources are provided by the references given at the end. Here, we mention some of the many possible generalizations.

Bohr (1925) considers quasiperiodic systems as special cases of almost periodic systems. The module of an almost periodic function has a countable basis.

Moody (1997) discusses the notion of Meyer sets. These describe discrete sets on locally compact abelian groups and as particular cases encompass quasiperiodic systems.

Lagarias (2000) studies aperiodic sets characterized by the following properties, shared with periodic and quasiperiodic sets:
(ap1): inequivalent patches of points are volume bounded,
(ap2): pure point Fourier spectrum,
(ap3): linear repetitivity of patches, and
(ap4): self-similarity.

See also: Compact Groups and Their Representations; Finite Group Symmetry Breaking; Lie Groups: General Theory; Localization for Quasiperiodic Potentials; Symmetries and Conservation Laws; Symmetry and Symmetry Breaking in Dynamical Systems.

Further Reading

Baake M, Kramer P, Schlottmann M, and Zeidler D (1990) Planar patterns with fivefold symmetry as sections of periodic structures in 4-space. International Journal of Modern Physics B 4: 2217–2268.
Bohr H (1925) Zur Theorie der fastperiodischen Funktionen. I. Acta Mathematica 45: 29–127.
Bohr H (1925) Zur Theorie der fastperiodischen Funktionen. II. Acta Mathematica 46: 101–214.
Brown R, Bülow R, Neubüser J, Wondratschek H, and Zassenhaus H (1978) Crystallographic Groups of Four-Dimensional Space. New York: Wiley.
de Bruijn NG (1981) Algebraic theory of Penrose's non-periodic tilings of the plane. I. Proceedings Koninklijke Nederlandse Akademie van Wetenschappen 84: 39–52.
de Bruijn NG (1981) Algebraic theory of Penrose's non-periodic tilings of the plane. II. Proceedings Koninklijke Nederlandse Akademie van Wetenschappen 84: 53–66.
Janssen T (1986) Crystallography of quasicrystals. Acta Crystallographica A 42: 261–271.
Kramer P and Neri R (1984) On periodic and non-periodic space fillings of $E^n$ obtained by projection. Acta Crystallographica A 40: 580–587.
Kramer P and Papadopolos Z (1997) Symmetry concepts for quasicrystals and non-commutative crystallography. In: Moody RV (ed.) The Mathematics of Long-Range Aperiodic Order, pp. 307–330. Dordrecht: Kluwer.
Kramer P and Papadopolos Z (eds.) (2000) Coverings of Discrete Quasiperiodic Sets, Theory and Applications to Quasicrystals. New York: Springer.
Kramer P, Papadopolos Z, and Zeidler D (1992) The root lattice $D_6$ and icosahedral quasicrystals. In: Frank A, Seligman TH, and Wolf KB (eds.) Group Theory in Physics, American Institute of Physics Conference Proceedings, vol. 266, pp. 179–200. New York.
Kramer P and Schlottmann M (1989) Dualization of Voronoi domains and klotz construction: a general method for the generation of proper space filling. Journal of Physics A: Mathematical and General 22: L1097–L1102.
Lagarias J (2000) The impact of aperiodic order on mathematics. Materials Science and Engineering A 294–296: 186–191.
Moody RV (1997) Meyer sets and their duals. In: Moody RV (ed.) The Mathematics of Long-Range Aperiodic Order, pp. 403–441. Dordrecht: Kluwer.
Penrose R (1974) The role of aesthetics in pure and applied mathematical research. Bulletin of the Institute of Mathematics and its Applications 10: 266–271.
Schwarzenberger RLE (1980) N-Dimensional Crystallography. San Francisco: Pitman.
Shechtman D, Blech I, Gratias D, and Cahn JW (1984) Metallic phase with long-range orientational order and no translational symmetry. Physical Review Letters 53: 1951–1953.
Quillen Determinant
S Scott, King's College London, London, UK
© 2006 Elsevier Ltd. All rights reserved.

generator for $L_A$; specifically, if $\dim V = m$ and $\dim W = n$, then $\det A = 0$ if $m \neq n$ (by "fiat"), while if $m = n$, then $\det A = 0$ precisely when $A$ is not invertible. For the moment, set $m = n$.

For $k \in \{0, 1, \ldots, n\}$ the $k$th exterior power operator is defined by

$$\wedge^k A : \wedge^k V \to \wedge^k W, \qquad \wedge^k A(v_1 \wedge v_2 \wedge \cdots \wedge v_k) := Av_1 \wedge Av_2 \wedge \cdots \wedge Av_k$$ [4]

where $v_1, \ldots, v_k \in V$ and $\wedge^0 V := \mathbb{C}$ and $\wedge^0 A := 1$. When $k = n$, $\mathrm{Det}\, V := \wedge^n V$ and $\mathrm{Det}\, W := \wedge^n W$ are complex lines and the determinant line of $A$ is

$$L_A := \mathrm{Det}\, V^* \otimes \mathrm{Det}\, W$$ [5]

while for any basis $\{e_1, \ldots, e_n\}$ for $V$, with dual basis $\{e_1^*, \ldots, e_n^*\}$ for $V^*$,

$$\det A := e_1^* \wedge \cdots \wedge e_n^* \otimes (\wedge^n A)(e_1 \wedge \cdots \wedge e_n) \in L_A$$ [6]

where the sum is over permutations of $\{1, \ldots, n\}$ and $(a_{i,j})$ is the matrix of $A$ with respect to any basis of $V$. Changing the basis may change the summands on the right-hand side of [10], but not their sum. It is fundamental that when $V = W$ the classical determinant is an intrinsic invariant of the operator $A$, independent of the choice of basis for $V$; when $V \neq W$ that is no longer so, since there is then no canonical bilinear pairing $\mathrm{Det}\, V \times \mathrm{Det}\, W \to \mathbb{C}$; the choice of a nondegenerate pairing is equivalent to the choice made in [1].

The identification of [10] from [6] and [8] amounts to the identity in $\mathrm{Det}\, V$

$$(\wedge^n A)(e_1 \wedge \cdots \wedge e_n) = \det{}_{\mathbb{C}} A \cdot e_1 \wedge \cdots \wedge e_n$$ [11]
Since $\wedge^n(AB) = \wedge^n A\, \wedge^n B$, [11] in turn implies the characterizing multiplicativity property of the classical determinant

$$\det{}_{\mathbb{C}}(AB) = \det{}_{\mathbb{C}} A \cdot \det{}_{\mathbb{C}} B$$ [12]

for $A, B \in \mathrm{End}(V)$, specializing the general fact in [7]. Similarly, the group $Gl(V, \mathbb{C})$ of invertible elements of $\mathrm{End}(V)$ is identified with those $A$ with $\det_{\mathbb{C}} A \neq 0$.

The classical determinant can also be thought of in the following ways. First, the direct sum of the operators defined in [4] yields the total exterior power operator $\wedge A : \wedge V \to \wedge V$ on the exterior algebra $\wedge V = \bigoplus_{k=0}^{n} \wedge^k V$, and this has trace

$$\mathrm{tr}(\wedge A) = \det{}_{\mathbb{C}}(I + A)$$ [13]

where $I$ is the identity. Alternatively, one can do something a little more sophisticated and use the holomorphic functional calculus to define the logarithm $\log_\theta B$ of $B \in \mathrm{End}(V)$ by

$$\log_\theta B = \frac{i}{2\pi} \int_{\Gamma_\theta} \log_\theta \lambda \, (B - \lambda I)^{-1} \, d\lambda$$ [14]

Here $\log_\theta \lambda$ is the branch of the complex logarithm defined by $\theta - 2\pi < \arg(\lambda) \le \theta$, and $\Gamma_\theta$ is a positively oriented contour enclosing $\mathrm{spec}(B)$ but not any point of the spectral cut $R_\theta = \{re^{i\theta} \mid r \ge 0\}$. Then, if $B$ is invertible,

$$\mathrm{tr}(\log_\theta B) = \log_\theta \det{}_{\mathbb{C}} B$$ [15]

$\langle\, , \rangle$, let $C(H)$ be the algebra of compact operators on $H$, and let

$$L^1 = \Big\{ A \in C(H) \;\Big|\; \|A\|_1 := \sum_{i=1}^{\infty} \lambda_i(A^*A)^{1/2} < \infty \Big\}$$ [17]

be the ideal of trace-class operators, where the sum is over the real discrete eigenvalues $\lambda_i(A^*A) \searrow 0^+$ of the compact self-adjoint operator $A^*A$. For any orthonormal basis $\{\xi_j\}$ of $H$ the map

$$\mathrm{tr} : L^1 \to \mathbb{C}, \qquad A \mapsto \mathrm{tr}(A) := \sum_j \langle \xi_j, A\xi_j \rangle$$

is a trace functional on $L^1(H)$, independent of the choice of basis. Lidskii's theorem states that

$$\mathrm{tr}(A) = \sum_{\lambda \in \mathrm{spec}(A)} \lambda$$ [18]

with the sum over the eigenvalues of $A$ counted up to algebraic multiplicity; for general trace-class operators this equality is highly nontrivial.

If $A$ is trace class, then for each non-negative integer $k$ so is each of the exterior power operators $\wedge^k A : \wedge^k H \to \wedge^k H$, defined as in [4]. Following [13], a determinant can therefore be defined on the semigroup $I + L^1 := \{I + A \mid A \in L^1\}$ of determinant-class operators by the absolutely convergent sum

$$\det{}_F(I + A) := \mathrm{tr}(\wedge A) = 1 + \sum_{k=1}^{\infty} \mathrm{tr}(\wedge^k A)$$ [19]
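The identity behind eqns [13] and [19] can be checked in finite dimensions. The sketch below (Python with NumPy; the helper name `trace_exterior_power` is introduced here for illustration) uses the standard fact that tr(∧^k A) equals the sum of the principal k × k minors of A, and compares the total with det(I + A). As a second, hedged illustration of the infinite-dimensional sum, it takes a diagonal model operator with eigenvalues 1/j², which is an assumption made for this example only, and compares a truncated product with the classical value sinh(π)/π given by Euler's product formula.

```python
from itertools import combinations
import math

import numpy as np


def trace_exterior_power(A, k):
    """tr(wedge^k A), computed as the sum of the principal k x k minors of A."""
    n = A.shape[0]
    if k == 0:
        return 1.0
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in combinations(range(n), k))


rng = np.random.default_rng(0)
n = 5
A = rng.normal(size=(n, n))

# Finite-dimensional version of [13]: tr(wedge A) = det(I + A).
total_trace = sum(trace_exterior_power(A, k) for k in range(n + 1))
det_I_plus_A = np.linalg.det(np.eye(n) + A)
print("tr(wedge A) == det(I + A):", np.isclose(total_trace, det_I_plus_A))

# Diagonal trace-class model with eigenvalues 1/j^2 (illustrative assumption):
# det_F(I + A) is the infinite product of (1 + 1/j^2), which equals sinh(pi)/pi.
j = np.arange(1, 200_001)
truncated_product = np.prod(1.0 + 1.0 / j**2)
exact_value = math.sinh(math.pi) / math.pi
print("truncated product close to sinh(pi)/pi:",
      math.isclose(truncated_product, exact_value, rel_tol=1e-4))
```

The first check also illustrates Lidskii's theorem in its elementary matrix form, since the k = 1 term is the ordinary trace.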
the higher-dimensional analog of the winding number of the determinant.

Fredholm Operators and Determinant Line Bundles

The operators whose determinants are considered in this article are all Fredholm operators. Recall that a linear operator $A : H_1 \to H_2$ between Hilbert spaces is Fredholm if it is invertible modulo compact operators; that is, there is a "parametrix" $Q : H_2 \to H_1$ such that $QA - I$ and $AQ - I$ are compact operators on $H_1$ and $H_2$, respectively. Equivalently, the range $A(H_1)$ of $A$ is closed in $H_2$, and the kernel $\mathrm{Ker}(A) = \{\xi \in H_1 \mid A\xi = 0\}$ and cokernel $\mathrm{Coker}(A) = H_2/A(H_1)$ of $A$ are finite dimensional. (This is equally true for Banach and Fréchet spaces; we restrict our attention to Hilbert spaces for brevity.) The space Fred of all such Fredholm operators with the norm topology has the homotopy type of the classifying space $\mathbb{Z} \times BGl(\infty)$. The first factor parametrizes the connected components of Fred: two Fredholm operators are in the same component if and only if they have the same index

$$\mathrm{index}(A) = \dim \mathrm{Ker}(A) - \dim \mathrm{Coker}(A)$$

Mostly we restrict our attention to the connected component $\mathrm{Fred}_0$ of operators of index zero. The cohomology of $\mathrm{Fred}_0 \simeq BGl(\infty)$ is a polynomial ring whose generators may be formally realized as the even degree components of the Chern character of an infinite-dimensional bundle over $\mathrm{Fred}_0$. In fact, the generators $\omega_{2j-1}$ of $H^*(Gl_1(H))$ are related to the $\mathrm{ch}_j$ through transgression, see Chern and Simons (1974). We shall be interested here in the first generator $\mathrm{ch}_1$, a transgression of the Fredholm determinant "winding number 1-form" $\omega_1$, which coincides with the real Chern class of a canonical complex line bundle $\mathrm{DET}_0 \to \mathrm{Fred}_0$. The fiber of $\mathrm{DET}_0$ at $A \in \mathrm{Fred}_0$ is the determinant line $\mathrm{Det}(A)$ of the Fredholm operator $A$, which is defined as follows (Segal 2004).

Just as for finite-rank operators (see the subsection "Determinants in finite dimensions"), the determinant of a Fredholm operator $A : H^1 \to H^2$ exists abstractly not as a number but as an element $\det A$ of a complex line $\mathrm{Det}(A)$. For simplicity, we suppose that $\mathrm{index}(A) = 0$. Elements of the determinant line $\mathrm{Det}(A)$ are equivalence classes $[E, \lambda]$ of pairs $(E, \lambda)$, where $E : H^1 \to H^2$ is such that $A - E$ is trace class, relative to the equivalence relation $(Eq, \lambda) \sim (E, \det_F(q)\lambda)$ for $q : H^1 \to H^1$ of determinant class, where $\det_F(q)$ is the Fredholm determinant of $q$. Complex multiplication on $\mathrm{Det}(A)$ is defined by $\mu[A, \lambda] = [A, \mu\lambda]$. The abstract, or Quillen, determinant of $A$ is the preferred element $\det A := [A, 1]$ in $\mathrm{Det}(A)$.

Here are some essential properties of the determinant line. First, $\det A$ is nonzero if and only if $A$ is invertible. Next, quotients of abstract determinants in $\mathrm{Det}(A)$ are given by Fredholm determinants; for if $A_1 : H^1 \to H^2$, $A_2 : H^1 \to H^2$ are Fredholm operators such that $A_i - A$ are trace class, then if $A_2$ is invertible we see that $A_2^{-1}A_1$ is determinant class and hence from the definition that

$$\frac{\det(A_1)}{\det(A_2)} = \det{}_F(A_2^{-1}A_1)$$ [27]

where the quotient on the left-hand side is taken in $\mathrm{Det}(A)$. The principal functorial property of the determinant line is that given a commutative diagram with exact rows and Fredholm columns

$$\begin{array}{ccccccccc} 0 & \to & H_1 & \to & H_1' & \to & H_1'' & \to & 0 \\ & & \downarrow A & & \downarrow A' & & \downarrow A'' & & \\ 0 & \to & H_2 & \to & H_2' & \to & H_2'' & \to & 0 \end{array}$$ [28]

then there is a canonical isomorphism of complex lines $\mathrm{Det}(A') \cong \mathrm{Det}(A) \otimes \mathrm{Det}(A'')$ preserving the Quillen determinants, $\det(A') \leftrightarrow \det(A) \otimes \det(A'')$. A consequence of this property is that given Fredholm operators $A : H_2 \to H_3$ and $B : H_1 \to H_2$, then

$$\mathrm{Det}(AB) \cong \mathrm{Det}(A) \otimes \mathrm{Det}(B)$$

with $\det(AB) \leftrightarrow \det(A) \otimes \det(B)$, generalizing the elementary property [9].

The principal context of interest for studying determinant lines is the case where one has a family $\mathcal{A} = \{A_x \mid x \in B\}$ of Fredholm operators parametrized by a manifold $B$, satisfying suitable continuity properties, and one aims to make sense of the determinant as a function $\mathcal{A} \to \mathbb{C}$. It is then of no difficulty to show that the corresponding family of determinant lines $\mathrm{DET}(\mathcal{A}) = \bigcup_{x \in B} \mathrm{Det}(A_x)$ defines a complex line bundle over $B$ endowed with a canonical section $\det : B \to \mathrm{DET}(\mathcal{A})$
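The index and the quotient formula [27] both have elementary finite-dimensional analogs, which the following sketch illustrates (Python with NumPy; the function name `index` and the test matrices are illustrative choices, not taken from the article). For a matrix viewed as a map from C^m to C^n, rank-nullity gives index = m − n whatever the rank is, so the index is unchanged when the operator is deformed. For invertible square matrices, the quotient of determinants equals the determinant of A_2^{-1} A_1, the finite-rank counterpart of [27].

```python
import numpy as np


def index(A):
    """dim Ker(A) - dim Coker(A) for a matrix A : C^m -> C^n of shape (n, m)."""
    n, m = A.shape
    r = np.linalg.matrix_rank(A)
    dim_ker = m - r      # rank-nullity
    dim_coker = n - r    # codimension of the range
    return dim_ker - dim_coker


rng = np.random.default_rng(1)

# The index depends only on the shape, not on the rank of the operator.
A_generic = rng.normal(size=(3, 5))   # rank 3 with probability 1
A_zero = np.zeros((3, 5))             # rank 0
print(index(A_generic), index(A_zero))

# Finite-dimensional analog of [27]: det(A1)/det(A2) = det(A2^{-1} A1).
A1 = rng.normal(size=(4, 4))
A2 = rng.normal(size=(4, 4))
quotient = np.linalg.det(A1) / np.linalg.det(A2)
det_of_ratio = np.linalg.det(np.linalg.solve(A2, A1))
print(np.isclose(quotient, det_of_ratio))
```

In the infinite-dimensional setting neither determinant on the left of [27] is a number; only their quotient is, which is what the finite-dimensional check cannot show.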
The Quillen determinant has been of particular interest in the case of families of Dirac operators. Such a family is associated to a $C^\infty$ fibration $\pi : M \to B$ of closed boundaryless finite-dimensional Riemannian manifolds of even dimension. If there is a graded Hermitian vector bundle $E = E^+ \oplus E^- \to M$ of Clifford modules, then from the Riemannian structure one can construct a Levi-Civita connection on the vertical tangent bundle $T(M/B)$ which can be lifted to a Clifford connection on $E$; for example, the spinor connection if we have a family of spin manifolds. This data yields a smooth family of

and where $f_* : H^i(M) \to H^{i-1}(B)$ is integration over the fibers. That is, with $\xi = c_1(T)$,

$$\mathrm{ch}(f_!(T^m)) = f_*\Big[\big(1 + m\xi + \tfrac{1}{2}m^2\xi^2 + \cdots\big)\big(1 + \tfrac{1}{2}\xi + \tfrac{1}{12}\xi^2 + \cdots\big)\Big] = f_*\Big[1 + \big(m + \tfrac{1}{2}\big)\xi + \tfrac{1}{2}\big(m^2 + m + \tfrac{1}{6}\big)\xi^2 + \cdots\Big]$$

So we have

$$c_1(f_!(T^m)) = \tfrac{1}{12}(6m^2 + 6m + 1)\, f_*(\xi^2) \in H^2(B)$$ [32]
But for any element of K-theory, $c_1(E) = c_1(\mathrm{DET}(E))$, and so the left-hand side of [32] is the first Chern class of the determinant line bundle $\mathrm{DET}(D)$. If we take, in particular, $B = \mathrm{Conf}(\Sigma)$, the space of conformal classes of metrics on $\Sigma$ (or compact subsets of this space), and couple the family $D$ to a background trivial real bundle of rank $d/2$, or its negative in K-theory, then taking $m = 1$, [32] is easily seen to be modified to

$$c_1(D_{\Sigma, d/2}) = -\frac{(d - 26)}{24}\, f_*(\xi^2)$$

It follows that for this topological anomaly to vanish one must have background spacetime of dimension $d = 26$. The idea here is that $\mathrm{Conf}(\Sigma)$ is a configuration space for bosonic strings in $\mathbb{R}^d$ with the requirement that the determinant section of the determinant line bundle be conformally invariant, corresponding to the classical invariance of the string Lagrangian defining the string path integral from which the determinant arises. That is, in order to evaluate the path integral on the reduced configuration space, one requires a trivialization of the determinant line bundle which defines a conformally invariant regularized determinant function. The above calculation says that there is a topological obstruction to this occurring when the background space dimension differs from 26.

This is the most basic example of determinant anomaly computations, which have acquired considerably more sophisticated constructions in modern versions of string theory and QFT. One immediate deficiency in the approach explained so far is that not all anomalies are topological, and so even though the first Chern class of the determinant line bundle may vanish, there may still be local and global obstructions to the existence of a determinant function with the correct symmetry properties. To be more precise, one needs to say not just that a trivialization of the determinant line bundle formally exists, but to actually be able to construct a specific preferred trivialization. For this more refined objective, one needs to know more about the differential geometry of the determinant line. One approach is to fix a canonical choice of connection and, if the determinant bundle is topologically trivial, to construct a determinant section (up to phase) using the parallel transport of the connection.

The principal contribution to such a theory was made in a remarkable four-page paper by Quillen (1985) in which, using zeta-function regularization, he presented a construction of a metric and connection on the determinant line bundle for a family of $\bar\partial$-operators over a Riemann surface coupled to a holomorphic vector bundle. (This is the first paper one should read on determinant line bundles; Quillen's motivation, in fact, did not come from physics but from a problem in number theory.)

To outline this construction, which was extended to general families of Dirac-type operators in Bismut and Freed (1986), first we recall that if $\Delta$ is an invertible Laplacian-type second-order elliptic differential operator acting on the space of sections of a vector bundle over a compact manifold of dimension $n$, then it has a spectrum consisting of real discrete eigenvalues $\{\lambda\}$ forming an unbounded subset of the positive real line. The zeta function of $\Delta$ is defined in the complex half-plane $\mathrm{Re}(s) > n/2$ by

$$\zeta(\Delta, s) = \mathrm{tr}(\Delta^{-s}) = \sum_{\lambda} \lambda^{-s}, \qquad \mathrm{Re}(s) > \frac{n}{2}$$

and extends to a meromorphic function of $s$ on the whole complex plane. It turns out that the extension has no pole at $s = 0$ and this means that we may define the zeta-function regularized determinant of $\Delta$ by

$$\det{}_\zeta(\Delta) := \exp\Big(-\frac{d}{ds}\Big|_{s=0} \zeta(\Delta, s)\Big)$$

Since $-(d/ds)|_{s=0}\, \lambda^{-s} = \log \lambda$, this formally represents a regularized product of the eigenvalues of $\Delta$. A metric is now defined on the determinant line bundle $\mathrm{DET}(D)$ by defining the norm square of the element $\det(D_x) \in \mathrm{Det}(D_x)$ by

$$\|\det(D_x)\|^2 := \det{}_\zeta(D_x^* D_x)$$

over the subset $B_0$ of $x \in B$ where $D_x$ is invertible. Elsewhere in $B$, one includes a factor defined by the induced $L^2$ metric in the kernel and cokernel. See Quillen (1985) and Bismut and Freed (1986) for full details.

A connection is defined by similarly constructing a regularized version of the connection we would define if we were working with finite-rank bundles. First, one includes in the data associated to the fibration $\pi : M \to B$ defining the family of operators $D$ a splitting of the tangent bundle $TM = T(M/B) \oplus \pi^*(TB)$. This assumption and the Riemannian geometry of the fibration yield a connection $\nabla^{(\pi)}$ defined along the fibers of the fibration. The connection form over $B_0$ is then defined by

$$\omega(x) = \mathrm{tr}_\zeta\big(D_x^{-1} \nabla^{(\pi)} D_x\big)$$
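The zeta-function regularized determinant used for the Quillen metric can be illustrated on a toy spectrum. The sketch below (Python with NumPy) assumes a finite list of positive eigenvalues, chosen arbitrarily for this example, so that no analytic continuation is needed: the zeta function is then an entire function of s, and exp(−ζ′(0)) must reproduce the ordinary product of the eigenvalues. The derivative is approximated by a central difference; the function names and the step size `h` are illustrative choices.

```python
import numpy as np


def zeta(spectrum, s):
    """zeta(s) = sum over eigenvalues of lambda^(-s), for a finite positive spectrum."""
    return np.sum(spectrum ** (-s))


def zeta_determinant(spectrum, h=1e-6):
    """exp(-zeta'(0)), with zeta'(0) approximated by a central difference."""
    dzeta_at_0 = (zeta(spectrum, h) - zeta(spectrum, -h)) / (2.0 * h)
    return np.exp(-dzeta_at_0)


# Arbitrary positive eigenvalues standing in for the spectrum of an operator.
spectrum = np.array([0.5, 1.0, 2.0, 3.5, 7.0])

print(zeta_determinant(spectrum))   # regularized determinant
print(np.prod(spectrum))            # ordinary product of the eigenvalues
```

For a genuine elliptic operator the sum defining the zeta function diverges at s = 0, and the meromorphic continuation replaces the direct evaluation used here; the finite case only shows that the definition is consistent with the usual determinant.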
Here the zeta-regularized trace $\mathrm{tr}_\zeta$ is defined on a vertical bundle endomorphism-valued 1-form $x \mapsto A_x$ on $M$ by

$$\mathrm{tr}_\zeta(A_x) := \mathrm{fp}_{s=0}\, \mathrm{tr}\big(A_x (D_x^* D_x)^{-s}\big)\big|^{\mathrm{mer}}$$

where the superscript indicates we are considering the meromorphically extended form, and $\mathrm{fp}_{s=0}(G(s))$ means the finite part of a meromorphic function $G$ on $\mathbb{C}$; that is, the constant term in the Laurent expansion of $G(s)$ near $s = 0$.

A theorem of Bismut and Freed, generalizing Quillen's original computation, computes the curvature $\Omega^{(\mathrm{DET}(D))}$ of this connection to be the 2-form component in the local Atiyah–Singer families index density. This is a refined version of the topological version of that theorem which we utilized earlier; it expresses the characteristic classes on $B$ in terms of specific canonical differential forms constructed by integrating, along the fibers of the fibration, canonically defined vertical characteristic forms. More precisely, they prove the formula (Bismut and Freed 1986 and Berline et al. 1992)

$$\Omega^{(\mathrm{DET}(D))} = (2\pi i)^{-n/2} \Big( \int_{M/B} \hat{A}(M/B)\, \mathrm{ch}(E) \Big)_{[2]}$$ [33]

where $(\,\cdot\,)_{[2]} \in \Omega^2(B)$ means the 2-form component of a differential form on $B$. Here $\hat{A}(M/B) = \det{}^{1/2}\big((R^{M/B}/2)/\sinh(R^{M/B}/2)\big)$ is the vertical $\hat{A}$-genus differential form, while $\mathrm{ch}(E)$ is the vertical Chern character form associated to the curvature form of the bundle $E$.

This theory seems a long way from the classical theory of stable characteristic classes and the Fredholm determinant discussed in earlier sections. There are, however, interesting parallels which may guide the search for an understanding of the geometry of families of elliptic operators, of which determinants form a component. The prototypical situation where determinants arise in the quantization of gauge theory is the following. Consider the infinite-dimensional affine space $\mathcal{A}$ of connections on a complex vector bundle $E$ with structure group $G$ over the $n$-sphere $S^n$. The Lie group $G$ is assumed to be compact. For each connection $A \in \mathcal{A}$, we consider a Dirac operator $D_A : C^\infty(S^n, S^+ \otimes E) \to C^\infty(S^n, S^- \otimes E)$, where $E$ is a Hermitian vector bundle coupled to the spinor bundles $S^\pm$. The group $\mathcal{G}$ of based gauge transformations acts on $\mathcal{A}$, and symmetry properties of conservation laws lead one to be interested in constructing a determinant function on the quotient space $\mathcal{A}/\mathcal{G}$. More precisely, $g \in \mathcal{G}$ transforms $D_A$ to $D_{g.A}$ and by equivariance the Quillen determinant section pushes down to a section of a reduced determinant line bundle over $\mathcal{A}/\mathcal{G}$. As seen earlier, the topological obstruction to realizing this determinant section as a function on $\mathcal{A}/\mathcal{G}$ can be computed from the Atiyah–Singer index theorem for families applied to the corresponding index bundle $\mathrm{Ind}(D_{\mathcal{A}/\mathcal{G}})$ in $K(\mathcal{A}/\mathcal{G})$ by picking out the degree-2 component in $H^2(\mathcal{A}/\mathcal{G})$ of the Chern character $\mathrm{ch}(\mathrm{Ind}(D_{\mathcal{A}/\mathcal{G}}))$. On the other hand, it turns out that this characteristic class is the transgression of the element of $H^1(\mathcal{G}, \mathbb{Z})$ defined by the zeta-determinant trace

$$\mathrm{tr}_\zeta\Big( (D_A^* D_{g.A})^{-1}\, d_{\mathcal{G}} (D_A^* D_{g.A}) \Big) := \mathrm{fp}_{s=0}\, \mathrm{tr}\Big( (D_A^* D_{g.A})^{-1}\, d_{\mathcal{G}} (D_A^* D_{g.A})\, (D_A^* D_{g.A})^{-s} \Big)\Big|^{\mathrm{mer}}$$

which counts the winding number of the zeta determinant $\mathcal{G} \to \mathbb{C}$ defined by $\det_\zeta(D_A^* D_{g.A})$. This provides an interesting parallel of the classical theory described in the section "The Fredholm determinant." For more details of this and more advanced ideas, see Singer (1985). (A similar parallel holds between the topological derivation of the conformal anomaly outlined at the beginning of this section and what is called the Polyakov multiplicative anomaly formula for the zeta determinant of the Laplacian with respect to conformal changes in the metric on the surface.)

Aspects of more recent work in this direction have been the extension of the theory to manifolds with boundary, and how it encodes into the structures of topological and conformal field theories, see Segal (2004) and Mickelsson and Scott (2001), and more generally into M-theory (Freed and Moore 2004).

See also: Anomalies; Feynman Path Integrals; Index Theorems; Regularization for Dynamical ζ-Functions.

Further Reading

Alvarez O and Singer IM (1984) Gravitational anomalies and the families index theorem. Communications in Mathematical Physics 96: 409–417.
Atiyah MF and Singer IM (1984) Dirac operators coupled to vector potentials. Proceedings of the National Academy of Sciences of the USA 81: 2597–2600.
Bott R (1959) The stable homotopy of the classical groups. Annals of Mathematics 70: 313–337.
Berline N, Getzler E, and Vergne M (1992) Heat Kernels and Dirac Operators, Grundlehren der Mathematischen Wissenschaften, vol. 298. Berlin: Springer.
Bismut J-M and Freed DS (1986) The analysis of elliptic families I. Communications in Mathematical Physics 106: 159–176.
Chern SS and Simons J (1974) Characteristic forms and geometric invariants. Annals of Mathematics 99: 48–69.
Freed DS and Moore G (2004) Setting the quantum integrand of M-theory, arXiv:hep-th/0409135.
Mickelsson J and Scott S (2001) Functorial QFT, gauge anomalies and the Dirac determinant bundle. Communications in Mathematical Physics 219: 567–605.
Quillen D (1985) Determinants of Cauchy–Riemann operators over a Riemann surface. Functional Analysis and its Applications 19: 31–34.
Segal G (2004) The definition of conformal field theory. In: Tillmann U (ed.) Topology, Geometry and Quantum Field Theory, pp. 421–577. Cambridge University Press. Topological field theory, "Stanford lecture notes," http://www.cgtp.duke.edu.
Singer IM (1985) Families of Dirac operators with applications to physics. Astérisque, hors série, 323–340.
$$d\mu_0(z) = \frac{d^2z}{\pi}\left[\frac{1}{(1 - z\bar z)^2} - \frac{(N+1)^2 (z\bar z)^N}{\left(1 - (z\bar z)^{N+1}\right)^2}\right] \qquad [3]$$

We see that as $N \to \infty$, the zeros concentrate on the unit circle $|z| = 1$ (Hammersley 1954).

A similar formula can be derived for the distribution of roots of a real polynomial on the real axis, using $d\mu_0(t) = E[\delta(f(t))\,|df/dt|]$. One obtains (Kac 1943):

$$d\mu_0(t) = \frac{dt}{\pi}\sqrt{\frac{1}{(1 - t^2)^2} - \frac{(N+1)^2\, t^{2N}}{\left(1 - t^{2N+2}\right)^2}}$$

Integrating, one finds the expected number of real zeros of a degree $N$ random real polynomial is $E_N \sim (2/\pi)\log N$, and as $N \to \infty$ the zeros are concentrated at $t = \pm 1$.

While concentration of measure is a fairly generic property for random polynomials, it is by no means universal. Let us consider another Gaussian ensemble, with variance $\sigma_n = N!/(n!\,(N-n)!)$. This choice leads to a particularly simple two-point function,

$$G(z, \bar z) = (1 + z\bar z)^N \qquad [4]$$

and the distribution of zeros

$$d\mu_0 = \frac{1}{\pi}\,\partial\bar\partial \log G = \frac{N}{\pi}\,\frac{d^2z}{(1 + z\bar z)^2} \qquad [5]$$

Rather than concentrate the zeros, in this ensemble zeros are uniformly distributed according to the

We can then compute the two-point function

$$G(z_1, \bar z_2) \equiv E[s(z_1)\, s^*(\bar z_2)] = \sum_{i=1}^{N} s_i(z_1)\, s_i^*(\bar z_2) \qquad [7]$$

and proceed as before.

In these terms, the simplest way to describe the measure for our first example is that it follows from the inner product on the unit circle,

$$(f, g) = \oint_{|z|=1} \frac{dz}{2\pi z}\, f^*(z)\, g(z)$$

Thus, we might suspect that this has something to do with the concentration of eqn [3] on the unit circle. Indeed, this idea is made precise and generalized in Shiffman and Zelditch (2003).

Our second example belongs to a class of problems in which $M$ is compact and $L$ positive. In this case, the space $H^0(M, L)$ of holomorphic sections is finite dimensional, so we can take the basis to consist of all sections. Then, if $M$ is in addition Kähler, we can derive all the other data from a choice of Hermitian metric $h(f, g)$ on $L$. In particular, this determines a Kähler form $\omega$ as the curvature of the metric compatible connection, and thus a volume form $\mathrm{Vol}_\omega = \omega^n/n!$. We then define the inner product to be

$$(f, g) = \int_M \mathrm{Vol}_\omega\, h(f, g)$$

Thus, the measure equation [1] and the final distribution equation [2] are entirely determined by $h$. In
Random Algebraic Geometry, Attractors and Flux Vacua 325
these terms, the underlying reason for the simplicity of eqn [5] is that we started with the SU(2)-invariant metric $h$, so the final distribution must be invariant as well. More generally, eqn [7] is a Szegö kernel. Taking $L = L_1^{\otimes N}$ for $N$ large, this has a known asymptotic expansion, enabling a rather complete treatment (Zelditch 2001).

Our two examples also make the larger point that a wide variety of distributions are possible. Thus, to get convincing results, we must put in some information about the ensemble of random polynomials or sections which appear in the problem at hand.

The basic computation we just discussed can be vastly generalized to multiple variables, multipoint correlation functions, many different ensembles, and different counting problems. We will discuss the distribution of critical points of holomorphic sections below.

The Attractor Problem

We now turn to our physical problems. Both are posed in the context of compactification of the type IIb superstring theory on a Calabi–Yau 3-fold $M$. This leads to a four-dimensional effective field theory with N = 2 supersymmetry, determined by the geometry of $M$.

Let us begin by stating the attractor problem mathematically, and afterwards give its physical background. We begin by reviewing a bit of the theory of Calabi–Yau manifolds. By Yau’s proof of the Calabi conjecture, the moduli space of Ricci-flat metrics on $M$ is determined by a choice of complex structure on $M$, denote this $J$, and a choice of Kähler class. Using deformation theory, it can be shown that the moduli space of complex structures, denote this $\mathcal{M}_c(M)$, is locally a complex manifold of dimension $h^{2,1}(M)$. A point $J$ in $\mathcal{M}_c(M)$ picks out a holomorphic 3-form $\Omega_J \in H^{3,0}(M, \mathbb{C})$, unique up to an overall choice of normalization. The converse is also true; this can be made precise by defining the period map $\mathcal{M}_c(M) \to \mathbb{P}(H^3(M, \mathbb{Z}) \otimes \mathbb{C})$ to be the class of $\Omega$ in $H^3(M, \mathbb{Z}) \otimes \mathbb{C}$ up to projective equivalence. One can prove that the period map is injective (the Torelli theorem), locally in general and globally in certain cases such as the quintic in $\mathbb{CP}^4$.

Now, the data for the attractor problem is a charge, a class $\gamma \in H^3(M, \mathbb{Z})$. An attractor point for $\gamma$ is then a complex structure $J$ on $M$ such that

$$\gamma \in H_J^{3,0}(M, \mathbb{C}) \oplus H_J^{0,3}(M, \mathbb{C}) \qquad [8]$$

This amounts to $h^{2,1}$ complex conditions on the $h^{2,1}$ complex structure moduli, so picks out isolated points in $\mathcal{M}_c(M)$, the attractor points.

There are many mathematical and physical questions one can ask about attractor points, and it would be very interesting to have a general method to find them. As emphasized by G Moore, this is one of the simplest problems arising from string theory in which integrality (here due to charge quantization) plays a central role, and thus it provides a natural point of contact between string theory and number theory. For example, one might suspect that attractor Calabi–Yau’s are arithmetic, that is, are projective varieties whose defining equations live in an algebraic number field. This can be shown to always be true for $K3 \times T^2$, and there are conjectures about when this is true more generally (Moore 2004).

A simpler problem is to characterize the distribution of attractor points in $\mathcal{M}_c(M)$. As these are infinite in number, one must introduce some control parameter. While the first idea which might come to mind is to bound the magnitude of $\gamma$, since the intersection form on $H^3(M, \mathbb{Z})$ is antisymmetric, there is no natural way to do this. A better way to get a finite set is to bound the period of $\gamma$, and consider the attractor points satisfying

$$Z_{\max}^2 \ge |Z(\gamma; z)|^2 \equiv \frac{\left|\int_M \gamma \wedge \Omega\right|^2}{\int_M \Omega \wedge \bar\Omega} \qquad [9]$$

As an example of the type of result we will discuss below, one can show that for large $Z_{\max}$, the density of such attractor points asymptotically approaches the Weil–Petersson volume form on $\mathcal{M}_c$.

We now briefly review the origins of this problem, in the physics of 1/2 BPS (Bogomol’nyi–Prasad–Sommerfield) black holes in N = 2 supergravity. We begin by introducing local complex coordinates $z^i$ on $\mathcal{M}_c(M)$. Physically, these can be thought of as massless complex scalar fields. These sit in vector multiplets of N = 2 supersymmetry, so there must be $h^{2,1}(M)$ vector potentials to serve as their bosonic partners under supersymmetry. These appear because the massless modes of the type IIb string include various higher rank-p form gauge potentials, in particular a self-dual 4-form which we denote $C$. Self-duality means that $dC = *\,dC$ up to nonlinear terms, where $*$ is the Hodge star operator in ten dimensions. Now, Kaluza–Klein reduction of this 4-form potential produces $b_3(M)$ 1-form vector potentials $A_I$ in four dimensions. Given an explicit basis of 3-forms $\omega_I$ for $H^3(M, \mathbb{R}) \cap H^3(M, \mathbb{Z})$, this follows from the decomposition

$$C = \sum_{I=1}^{b_3} A_I \wedge \omega_I + \text{massive modes}$$
However, because of the self-duality relation, only half of these vector potentials are independent; the other half are determined in terms of them by four-dimensional electric–magnetic duality. Explicitly, given the intersection form $\eta_{ij}$ on $H^3 \otimes H^3$, we have

$$dA_i = \eta_{ij} *_4 dA_j \qquad [10]$$

where $*_4$ denotes the Hodge star in $d = 4$. Thus we have $h^{2,1} + 1$ independent vector potentials. One of these sits in the N = 2 supergravity multiplet, and the rest are the correct number to pair with the complex structure moduli. We now consider 1/2 BPS black hole solutions of this four-dimensional N = 2 theory. Choosing any $S^2$ which surrounds the horizon, we can define the charge $\gamma$ as the class in $H^3(M, \mathbb{Z})$ which reproduces the corresponding magnetic charges

$$Q_i = \frac{1}{2\pi}\int_{S^2} dA_i \equiv \int_M \omega_i \wedge \gamma$$

Using eqn [10], this includes all charges.

One can show that the mass $M$ of any charged object in supergravity satisfies a BPS bound,

$$M^2 \ge |Z(\gamma; z)|^2 \qquad [11]$$

The quantity $|Z(\gamma; z)|^2$, defined in eqn [9], depends explicitly on $\gamma$, and implicitly on the complex structure moduli $z$ through $\Omega$. A 1/2 BPS solution by definition saturates this bound.

We now explain the ‘‘attractor paradox.’’ According to Bekenstein and Hawking, the entropy of any black hole is proportional to the area of its event horizon. This area can be found by finding the black hole as an explicit solution of four-dimensional supergravity, which clearly depends on the charge $\gamma$. In fact, we must fix boundary conditions for all the fields at infinity, in particular the complex structure moduli, to get a particular black hole solution. Now, normally varying the boundary conditions varies all the data of a solution in a continuous way. On the other hand, if the entropy has any microscopic interpretation as the logarithm of the number of quantum states of the black hole, one would expect $e^S$ to be integrally quantized. Thus, it must remain fixed as the boundary conditions on complex structure moduli are varied, in contradiction with naive expectations for the area of the horizon, and seemingly contradicting Bekenstein and Hawking.

The resolution of this paradox is the attractor mechanism. Let us work in coordinates for which the four-dimensional metric takes the form

$$ds^2 = -f(r)\,dt^2 + dr^2 + \frac{A(r)}{4\pi}\,d\Omega^2_{S^2}$$

With some work, one can see that in the 1/2 BPS case, the equations of motion imply that as $r$ decreases, the complex structure moduli $z$ follow gradient flow with respect to $|Z(\gamma, z)|^2$ in eqn [11], and the area $A(r)$ of an $S^2$ at radius $r$ decreases. Finally, at the horizon, $z$ reaches a value $z_*$ at which $|Z(\gamma, z_*)|^2$ is a local minimum, and the area of the event horizon is $A = 4\pi |Z(\gamma, z_*)|^2$. Since $z_*$ is determined by minimization, this area will not change under small variations of the initial $z$, resolving the paradox.

A little algebra shows that the problem of finding nonzero critical points of $|Z(\gamma, z)|^2$ is equivalent to that of finding critical points $D_i Z = 0$ of the period associated to $\gamma$,

$$Z = \int_M \gamma \wedge \Omega \qquad [12]$$

usually called the central charge, with respect to the covariant derivative

$$D_i Z = \partial_i Z + (\partial_i K) Z \qquad [13]$$

Here

$$e^{-K} \equiv \int \Omega \wedge \bar\Omega \qquad [14]$$

The mathematical significance of this rephrasing is that $K$ is a Kähler potential for the Weil–Petersson Kähler metric on $\mathcal{M}_c(M)$, with Kähler form $\omega = \partial\bar\partial K$, and eqn [13] is the unique connection on $H^{(3,0)}(M, \mathbb{C})$ regarded as a line bundle over $\mathcal{M}_c(M)$, whose curvature is $\omega$. These facts can be used to show that $D_i\Omega$ provides a basis for $H^{(2,1)}(M, \mathbb{C})$, so that the critical point condition forces the projection of $\gamma$ on $H^{(2,1)}$ to vanish. This justifies our original definition eqn [8].

Flux Vacua in IIb String Theory

We will not describe our second problem in as much detail, but just give the analogous final formulation. In this problem, a ‘‘choice of flux’’ is a pair of elements of $H^3(M, \mathbb{Z})$, or equivalently a single element

$$F \in H^3(M, \mathbb{Z} \oplus \tau\mathbb{Z}) \qquad [15]$$

where $\tau \in \mathcal{H} \equiv \{\tau \in \mathbb{C} \mid \mathrm{Im}\,\tau > 0\}$ is the so-called ‘‘dilaton-axion.’’

A flux vacuum is then a choice of complex structure $J$ and $\tau$ for which

$$F \in H_J^{3,0}(M, \mathbb{C}) \oplus H_J^{1,2}(M, \mathbb{C}) \qquad [16]$$

Now we have $h^{2,1} + h^{0,3} = h^{2,1} + 1$ complex conditions on the joint choice of $h^{2,1}$ complex structure
moduli and $\tau$, so this condition also picks out special points, now in $\mathcal{M}_c \times \mathcal{H}$.

The critical point formulation of this problem is that of finding critical points of

$$W = \int \Omega \wedge F \qquad [17]$$

under the covariant derivatives eqn [13] and

$$D_\tau W = \partial_\tau W + (\partial_\tau K) W$$

with $K$ the sum of eqn [14] and the Kähler potential $-\log \mathrm{Im}\,\tau$ for the metric on the upper half-plane of constant curvature $-1$.

This is a sort of complexified version of the previous problem and arises naturally in IIb compactification by postulating a nonzero value $F$ for a certain 3-form gauge field strength, the flux. The quantity eqn [17] is the superpotential of the resulting N = 1 supergravity theory, and it is a standard fact in this context that supersymmetric vacua (critical points of the effective potential) are critical points of $W$ in the sense we just stated.

We can again pose the question of finding the distribution of flux vacua in $\mathcal{M}_c(M) \times \mathcal{H}$. Besides $|W|^2$, which physically is one of the contributions to the vacuum energy, we can also use the ‘‘length of the flux’’

$$L = \frac{1}{\mathrm{Im}\,\tau}\int \mathrm{Re}\,F \wedge \mathrm{Im}\,F \qquad [18]$$

as a control parameter, and count flux vacua for which $L \le L_{\max}$. In fact, this parameter arises naturally in the actual IIb problem, as the ‘‘orientifold three-plane charge.’’

What makes this problem particularly interesting physically is that it (and its analogs in other string theories) may bear on the solution of the cosmological constant problem. This begins with Einstein’s famous observation that the equations of general relativity admit a one-parameter generalization,

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R = 8\pi T_{\mu\nu} + \Lambda g_{\mu\nu}$$

One of the standard solutions of this problem is the ‘‘anthropic solution,’’ initiated in work of Weinberg and others, and discussed in string theory in Bousso and Polchinski (2000). Suppose that we are discussing a theory with a large number of vacuum states, all of which are otherwise candidates to describe our universe, but which differ in $\Lambda$. If the number of these vacuum states were sufficiently large, the claim that a few of these states realize a small $\Lambda$ would not be surprising. But one might still feel a need to explain why our universe is a vacuum with small $\Lambda$, and not one of the multitude with large $\Lambda$.

The anthropic argument is that, according to accepted models for early cosmology, if the value of $|\Lambda|$ were even 100 times larger than what is observed, galaxies and stars could not form. Thus, the known laws of physics guarantee that we will observe a universe with $\Lambda$ within this bound; it is irrelevant whether other possible vacuum states ‘‘exist’’ in any sense.

While such anthropic arguments are controversial, one can avoid them in this case by simply asking whether or not any vacuum state fits the observed value of $\Lambda$. Given a precise definition of vacuum state, this is a question of mathematics. Still, answering it for any given vacuum state is extremely difficult, as it would require computing $\Lambda$ to $10^{-122}$ precision. But it is not out of reach to argue that out of a large number of vacua, some of them are expected to realize small $\Lambda$. For example, if we could show that the number of otherwise physically acceptable vacua was larger than $10^{122}$, and that the distribution of $\Lambda$ among these was approximately uniform over the range $(-M^4_{\mathrm{Planck}}, M^4_{\mathrm{Planck}})$, we would have made a good case for this expectation. This style of reasoning can be vastly generalized and, given favorable assumptions about the number of vacua in a theory, could lead to falsifiable predictions independent of any a priori assumptions about the choice of vacuum state (Douglas 2003).
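The counting argument for a small cosmological constant can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that the values of $\Lambda$ in different vacua are independent draws from the uniform distribution on $(-M^4_{\mathrm{Planck}}, M^4_{\mathrm{Planck}})$, and that the acceptable window is $|\Lambda| \le 10^{-122} M^4_{\mathrm{Planck}}$, that is, a fraction $10^{-122}$ of the range. Independence and the function names are assumptions of the sketch, not statements about string vacua.

```python
import math


def expected_vacua_in_window(num_vacua: float, window_fraction: float) -> float:
    """Expected number of vacua whose Lambda falls inside the window."""
    return num_vacua * window_fraction


def prob_some_vacuum_in_window(num_vacua: float, window_fraction: float) -> float:
    """Probability that at least one of num_vacua independent uniform draws
    lands in a window covering window_fraction of the allowed range.

    Evaluates 1 - (1 - p)**N as -expm1(N * log1p(-p)), so that a tiny p
    combined with a huge N does not lose all floating-point precision.
    """
    return -math.expm1(num_vacua * math.log1p(-window_fraction))


if __name__ == "__main__":
    # Assumed window: |Lambda| <= 1e-122 in Planck units, uniform distribution.
    fraction = 1e-122
    for exponent in (120, 122, 125):
        n = 10.0 ** exponent
        print(exponent,
              expected_vacua_in_window(n, fraction),
              prob_some_vacuum_in_window(n, fraction))
```

Under these assumptions, $10^{122}$ vacua give an expected count of one and a probability of about 0.63 that some vacuum fits, which is why the text asks for a number of vacua larger than $10^{122}$.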
just discussed (real or complex, and with the appropriate control parameters). We are then interested in the expected distribution of critical points for a random period. This brings our problem into the framework of random algebraic geometry.

Before proceeding to use this framework, let us first point out some differences with the toy problems we discussed. First, while eqn [12] and eqn [17] are sums of the form eqn [6], we take not an orthonormal basis but instead a basis $s_i$ of integral periods of $\Omega$. Second, the coefficients $c_i$ are not normally distributed but instead drawn from a discrete uniform distribution, that is, correspond to a choice of $\gamma$ in $H^3(M, \mathbb{Z})$ or $F$ as in eqn [15], satisfying the bounds on $|Z|$ or $L$. Finally, we do not normalize the distribution (which is thus not a probability measure) but instead take each choice with unit weight.

These choices can of course be modified, but are made in order to answer the question, ‘‘how many attractor points (or flux vacua) sit within a specified region of moduli space?’’ The answer we will get is a density $\mu(Z_{\max})$ or $\mu(L_{\max})$ on moduli space, such that as the control parameter becomes large, the number of critical points within a region $R$ asymptotes to

$$N(R; Z_{\max}) \sim \int_R \mu(Z_{\max})$$

The key observation is that to get such asymptotics, we can start with a Gaussian random element $s$ of $H^3(M, \mathbb{R})$ or $H^3(M, \mathbb{C})$. In other words, we neglect the integral quantization of the charge or flux. Intuitively, this might be expected to make little difference in the limit that the charge or flux is large, and in fact one can prove that this simplification reproduces the leading large $L$ or $|Z|$ asymptotics for the density of critical points, using standard ideas in lattice point counting.

This justifies starting with a two-point function like eqn [7]. While the integral periods $s_i$ of $\Omega$ can be computed in principle (and have been in many examples) by solving a system of linear PDEs, the Picard–Fuchs equations, it turns out that one does not need such detailed results. Rather, one can use the following ansatz for the two-point function,

$$G(z_1, \bar z_2) = \sum_{I=1}^{b_3} \eta^{IJ} s_I(z_1)\, s_J^*(\bar z_2) = \int_M \Omega(z_1) \wedge \bar\Omega(\bar z_2) = \exp\left(-K(z_1, \bar z_2)\right)$$

In words, the two-point function is the formal continuation of the Kähler potential on $\mathcal{M}_c(M)$ to independent holomorphic and antiholomorphic variables. This incorporates the quadratic form appearing in eqn [18] and can be used to count sections with such a bound.

We can now follow the same strategy as before, by introducing an expected density of critical points,

$$d\mu(z) = E\left[\delta^{(n)}(D_i s(z))\, \delta^{(n)}(\bar D_{\bar i} \bar s(z))\, \left|\det_{1 \le i,j \le 2n} H_{ij}\right|\right] \qquad [19]$$

where the ‘‘complex Hessian’’ $H$ is the $2n \times 2n$ matrix of second derivatives

$$H \equiv \begin{pmatrix} \partial_i \bar D_{\bar j} \bar s(z) & \partial_i D_j s(z) \\ \bar\partial_{\bar i} \bar D_{\bar j} \bar s(z) & \bar\partial_{\bar i} D_j s(z) \end{pmatrix} \qquad [20]$$

(note that $\partial D s = D D s$ at a critical point). One can then compute this density along the same lines. The holomorphy of $s$ implies that $\partial_i \bar D_{\bar j} \bar s = \omega_{i\bar j}\, \bar s$, which is one simplification. Other geometric simplifications follow from the fact that eqn [19] depends only on $s$ and a finite number of its derivatives at the point $z$.

For the attractor problem, using the identity

$$D_i D_j s = F_{ijk}\, \omega^{k\bar k}\, \bar D_{\bar k} \bar s = 0$$

from special geometry of Calabi–Yau 3-folds, the Hessian becomes trivial, and $\det H = |s|^{2n}$. One thus finds (Denef and Douglas 2004) that the asymptotic density of attractor points with large $|Z| \le Z_{\max}$ in a region $R$ is

$$N(R, |Z| \le Z_{\max}) \sim \frac{2^{n+1}}{(n+1)\,\pi^n}\, Z_{\max}^{n+1} \cdot \mathrm{vol}(R)$$

where $\mathrm{vol}(R) = \int_R \omega^n/n!$ is the volume of $R$ in the Weil–Petersson metric. The total volume is known to be finite for Calabi–Yau 3-fold moduli spaces, and thus so is the number of attractor points under this bound.

The flux vacuum problem is complicated by the fact that $DDs$ is nonzero and thus the determinant of the Hessian does not take a definite sign, and implementing the absolute value in eqn [19] is nontrivial. The result (Douglas, et al. 2004) is

$$\mu(z) \sim \frac{1}{b_3!\,\sqrt{\det \Lambda(z)}} \int_{\mathcal{H}(z) \times \mathbb{C}} \left|\det\left(H H^* - |x|^2 \cdot \mathbf{1}\right)\right|\, e^{-H^t \Lambda(z)^{-1} H - |x|^2}\, dH\, dx$$

where $\mathcal{H}(z)$ is the subspace of Hessian matrices eqn [20] obtainable from periods at the point $z$, and $\Lambda(z)$ is a covariance matrix computable from the period data.
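The step of neglecting the integral quantization of the charge or flux rests on a standard fact of lattice point counting: the number of lattice points in a large region is asymptotic to its volume. A minimal sketch of this in the simplest toy setting, integer points in a disc of radius $r$ compared with the area $\pi r^2$, is given below. It is an analogy chosen for illustration and not the actual count in $H^3(M, \mathbb{Z})$; the function name is hypothetical.

```python
import math


def lattice_points_in_disc(radius: float) -> int:
    """Count integer points (m, n) with m**2 + n**2 <= radius**2."""
    r2 = radius * radius
    m_max = int(math.floor(radius))
    count = 0
    for m in range(-m_max, m_max + 1):
        # For fixed m, the admissible n satisfy n**2 <= floor(r2 - m**2).
        n_max = math.isqrt(int(math.floor(r2 - m * m)))
        count += 2 * n_max + 1
    return count


if __name__ == "__main__":
    for r in (1, 2, 10, 100, 1000):
        exact = lattice_points_in_disc(r)
        continuum = math.pi * r * r
        # The ratio tends to 1 as r grows: the continuum volume
        # captures the leading asymptotics of the discrete count.
        print(r, exact, exact / continuum)
```

The relative error of the continuum approximation shrinks as the radius grows, which is the sense in which the Gaussian (continuous) ensemble reproduces the leading asymptotics.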
A simpler lower bound for the number of solutions can be obtained by instead computing the index density

$$I(z) = E\left[\delta^{(n)}(D_i s)\, \delta^{(n)}(\bar D_{\bar i} \bar s)\, \det_{1 \le i,j \le 2n} H_{ij}\right] \qquad [21]$$

so-called because it weighs the vacua with a Morse–Witten sign factor. This admits a simple explicit formula (Ashok and Douglas 2004),

study of more or less any class of superstring vacua leads to similar questions of counting and distribution, less well understood at present. Some of these are discussed in Douglas (2003), Acharya et al. (2005), Denef and Douglas (2005), Blumenhagen et al. (2005).

See also: Black Hole Mechanics; Chaos and Attractors; Compactification of Superstring Theory; Supergravity.
$(T_0^k(x_0))_{k \ge 1}$, the orbit of $x_0$.

Due to our inaccurate knowledge of the particular physical system or due to computational or theoretical limitations (e.g., lack of sufficient computational power, inefficient algorithms, or insufficiently developed mathematical or physical theory), the mathematical models never correspond exactly to the phenomenon they are meant to model. More-

probability measures on $M$ such that the support of $p^\varepsilon(\cdot \mid x)$ is close to $T_0(x)$, the random orbits are

space of orbits. Hence, we may use the ergodic theorem and get that time averages of all continuous observables $\varphi : M \to \mathbb{R}$, that is, writing $x = (x_k)_{k \ge 0}$ and

$$\tilde\varphi(x) = \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \varphi(x_k)$$

volume form on $M$. Then $p^\varepsilon(\cdot \mid x)$ is the normalized volume restricted to the $\varepsilon$-neighborhood of $T_0(x)$.
Random Dynamical Systems 331
This defines a family of transition probabilities allowing the points to ‘‘jump’’ from $T_0(x)$ to any point in the $\varepsilon$-neighborhood of $T_0(x)$ following a uniform distribution law.

Random Maps

Alternatively, we may choose maps $T_1, T_2, \ldots, T_k$ independently at random near $T_0$ according to a probability law $\nu$ on the space $T(M)$ of maps, whose support is close to $T_0$ in some topology, and consider sequences $x_k = T_k \circ \cdots \circ T_1(x_0)$ obtained through random iteration, $k \ge 1$, $x_0 \in M$.

This is again a Markov chain whose transition probabilities are given for any $x \in M$ by

$$p(A \mid x) = \nu(\{T \in T(M) : T(x) \in A\})$$

so this model may be reduced to the first one. However, in the random-maps setting, we may associate, with each random orbit, a sequence of maps which are iterated, enabling us to use ‘‘robust properties’’ of the transformation $T_0$ (i.e., properties which are known to hold for $T_0$ and for every nearby map $T$) to derive properties of the random orbits.

Under some regularity conditions on the map $x \mapsto p(A \mid x)$ for every Borel subset $A$, it is possible to represent random noise by random maps on suitably chosen spaces of transformations. In fact, the transition probability measures obtained in the random-maps setting exhibit strong spatial correlation: $p(\cdot \mid x)$ is close to $p(\cdot \mid y)$ as $x$ is near $y$.

If we have a parametrized family $T : U \times M \to M$ of maps, we can specify the law $\nu$ by giving a probability $\theta$ on $U$. Then with every sequence $T_1, \ldots, T_k, \ldots$ of maps of the given family, we associate a sequence $\omega_1, \ldots, \omega_k, \ldots$ of parameters in $U$ since

$$T_k \circ \cdots \circ T_1 = T_{\omega_k} \circ \cdots \circ T_{\omega_1} = T^k_{\omega_1, \ldots, \omega_k}$$

for all $k \ge 1$, where we write $T_\omega(x) = T(\omega, x)$. In this setting, the shift map $S$ becomes a skew-product transformation

$$S : M \times U^{\mathbb{N}} \circlearrowleft, \quad (x, \omega) \mapsto (T_{\omega_1}(x), \sigma(\omega))$$

to which many of the standard methods of dynamical systems and ergodic theory can be applied, yielding stronger results that can be interpreted in random terms.

Example 2 (Parametric noise). Let $T : P \times M \to M$ be a smooth map where $P$, $M$ are finite-dimensional Riemannian manifolds. We fix $p_0 \in P$, denote by $m$ some choice of Riemannian volume form on $P$, set $T_w(x) = T(w, x)$, and for every $\varepsilon > 0$ write $\theta_\varepsilon = (m(B(p_0, \varepsilon)))^{-1} \cdot (m \mid B(p_0, \varepsilon))$, the normalized restriction of $m$ to the $\varepsilon$-neighborhood of $p_0$. Then $(T_w)_{w \in P}$, together with $\theta_\varepsilon$, defines a random perturbation of $T_{p_0}$, for every small enough $\varepsilon > 0$.

Example 3 (Global additive perturbations). Let $M$ be a homogeneous space, that is, a compact connected Lie group admitting an invariant Riemannian metric. Fixing a neighborhood $U$ of the identity $e \in M$, we can define a map $T : U \times M \to M$, $(u, x) \mapsto L_u(T_0(x))$, where $L_u(x) = u \cdot x$ is the left translation associated with $u \in M$. The invariance of the metric means that left (and also right) translations are isometries, hence fixing $u \in U$ and taking any $(x, v) \in TM$, we get

$$\|DT_u(x) \cdot v\| = \|DL_u(T_0(x))(DT_0(x) \cdot v)\| = \|DT_0(x) \cdot v\|$$

In the particular case of $M = \mathbb{T}^d$, the d-dimensional torus, we have $T_u(x) = T_0(x) + u$, and this simplest case suggests the name ‘‘additive random perturbations’’ for random perturbations defined using families of maps of this type.

For the probability measure on $U$, we may take $\theta_\varepsilon$, any probability measure supported in the $\varepsilon$-neighborhood of $e$ and absolutely continuous with respect to the Riemannian metric on $M$, for any $\varepsilon > 0$ small enough.

Example 4 (Local additive perturbations). If $M = \mathbb{R}^d$ and $U_0$ is a bounded open subset of $M$ strictly invariant under a diffeomorphism $T_0$, that is, $\mathrm{closure}(T_0(U_0)) \subset U_0$, then we can define an isometric random perturbation setting:

(i) $V = T_0(U_0)$ (so that $\mathrm{closure}(V) = \mathrm{closure}(T_0(U_0)) \subset U_0$);
(ii) $G \simeq \mathbb{R}^d$ the group of translations of $\mathbb{R}^d$; and
(iii) $\mathcal{V}$ a small enough neighborhood of 0 in $G$.

Then for $v \in \mathcal{V}$ and $x \in V$, we set $T_v(x) = x + v$, with the standard notation for vector addition, and clearly $T_v$ is an isometry. For $\theta_\varepsilon$, we may take any probability measure on the $\varepsilon$-neighborhood of 0, supported in $\mathcal{V}$ and absolutely continuous with respect to the volume in $\mathbb{R}^d$, for every small enough $\varepsilon > 0$.

Random Perturbations of Flows

In the continuous-time case, the basic model to start with is an ordinary differential equation $dX_t = f(t, X_t)\,dt$, where $f : [0, +\infty) \to \mathcal{X}(M)$ and $\mathcal{X}(M)$ is the family of vector fields in $M$. We embed randomness in the differential equation
basically through ‘‘diffusion,’’ the perturbation is given by white noise or Brownian motion ‘‘added’’ to the ordinary solution.

In this setting, assuming for simplicity that $M = \mathbb{R}^n$, the random orbits are solutions of stochastic differential equations

$$dX_t = f(t, X_t)\,dt + \varepsilon \cdot \sigma(t, X_t)\,dW_t, \quad 0 \le t \le T, \quad X_0 = Z$$

where $Z$ is a random variable, $\varepsilon, T > 0$ and both $f : [0, T] \times \mathbb{R}^n \to \mathbb{R}^n$ and $\sigma : [0, T] \times \mathbb{R}^n \to L(\mathbb{R}^k, \mathbb{R}^n)$ are measurable functions. The space of linear maps $\mathbb{R}^k \to \mathbb{R}^n$ is written $L(\mathbb{R}^k, \mathbb{R}^n)$ and $W_t$ is the white-noise process on $\mathbb{R}^k$. The solution of this equation is a stochastic process:

$$X : \mathbb{R} \times \Omega \to M, \quad (t, \omega) \mapsto X_t(\omega)$$

for some (abstract) probability space $\Omega$, given by

$$X_t = Z + \int_0^t f(s, X_s)\,ds + \int_0^t \varepsilon \cdot \sigma(s, X_s)\,dW_s$$

where the last term is a stochastic integral in the sense of Itô. Under reasonable conditions on $f$ and $\sigma$, there exists a unique solution with continuous paths, that is,

$$[0, +\infty) \ni t \mapsto X_t(\omega)$$

is continuous for almost all $\omega \in \Omega$ (in general these paths are nowhere differentiable).

Setting $Z = \delta_{x_0}$, the probability measure concentrated on the point $x_0$, the initial point of the path is $x_0$ with probability 1. We write $X_t(\omega)x_0$ for paths of this type. Hence, $x \mapsto X_t(\omega)x$ defines a map $X_t(\omega) : M \circlearrowleft$ which can be shown to be a homeomorphism and even a diffeomorphism under suitable conditions on $f$ and $\sigma$. These maps satisfy a cocycle property

$$X_0(\omega) = \mathrm{Id}_M \ \text{(identity map of } M\text{)}, \qquad X_{t+s}(\omega) = X_t(\theta(s)(\omega)) \circ X_s(\omega)$$

for $s, t \ge 0$ and $\omega \in \Omega$, for a family of measure-preserving transformations $\theta(s) : (\Omega, P) \circlearrowleft$ on a suitably chosen probability space $(\Omega, P)$. This enables us to write the solution of this kind of equations also as a skew product.

system). A random dynamical system is a skew product

$$S_t : \Omega \times M \circlearrowleft, \quad (\omega, x) \mapsto (\theta(t)(\omega), \varphi(t, \omega)(x))$$

for all $t \in \mathbb{T}$, where $\theta : \mathbb{T} \times \Omega \to \Omega$ is a family of measure-preserving maps $\theta(t) : (\Omega, P) \circlearrowleft$ and $\varphi : \mathbb{T} \times \Omega \times M \to M$ is a family of maps $\varphi(t, \omega) : M \circlearrowleft$ satisfying the cocycle property: for $s, t \in \mathbb{T}$, $\omega \in \Omega$,

$$\varphi(0, \omega) = \mathrm{Id}_M, \qquad \varphi(t + s, \omega) = \varphi(t, \theta(s)(\omega)) \circ \varphi(s, \omega)$$

In this general setting an invariant measure for the random dynamical system is any probability measure $\mu$ on $\Omega \times M$ which is $S_t$-invariant for all $t \in \mathbb{T}$ and whose marginal is $P$, that is, $\mu(S_t^{-1}(U)) = \mu(U)$ and $\mu(\pi_\Omega^{-1}(U)) = P(U)$ for every measurable $U \subset \Omega \times M$, respectively, with $\pi_\Omega : \Omega \times M \to \Omega$ the natural projection.

Example 5 In the setting of the previous examples of random perturbations of maps, the product measure $\eta = P \times \mu$ on $\Omega \times M$, with $\Omega = U^{\mathbb{N}}$, $P = \theta_\varepsilon^{\mathbb{N}}$ and $\mu$ any stationary measure, is clearly invariant. However, not all invariant measures are product measures of this type.

Naturally an invariant measure is ergodic if every $S_t$-invariant function is $\mu$-almost everywhere constant. That is, if $\psi : \Omega \times M \to \mathbb{R}$ satisfies $\psi \circ S_t = \psi$ $\mu$-almost everywhere for every $t \in \mathbb{T}$, then $\psi$ is $\mu$-almost everywhere constant.

Applications

The well-established applications of both probability or stochastic differential equations (solution of boundary value problems, optimal stopping, stochastic control etc.) and dynamical systems (all kinds of models of physical, economic or biological phenomena, solutions of differential equations, control systems etc.) will not be presented here. Instead, this section focuses on topics where the subject sheds new light on these areas.

Products of Random Matrices and the Multiplicative Ergodic Theorem
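As a numerical illustration of the exponential growth rates that products of random matrices give rise to, the sketch below estimates the largest growth rate $\lim_{n \to +\infty} \frac{1}{n} \log \|A_n \cdots A_1 v\|$ for independent, identically distributed $2 \times 2$ matrices. It multiplies a vector by one sampled matrix at a time and renormalizes after each step, accumulating the logarithms of the norms. The Gaussian sampler, the starting vector, and the function names are assumptions of this sketch; for generic starting vectors the estimate targets the top exponent.

```python
import math
import random


def gaussian_matrix(rng: random.Random):
    """Sample a 2x2 matrix with independent standard normal entries."""
    return ((rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)),
            (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))


def top_growth_rate(sample_matrix, steps: int, rng: random.Random) -> float:
    """Estimate (1/n) log ||A_n ... A_1 v|| by iterating and renormalizing."""
    v = (0.6, 0.8)  # arbitrary unit starting vector
    log_growth = 0.0
    for _ in range(steps):
        (a, b), (c, d) = sample_matrix(rng)
        w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
        norm = math.hypot(w[0], w[1])
        # Accumulate the log of the stretch factor, then rescale to unit
        # length so the product never overflows or underflows.
        log_growth += math.log(norm)
        v = (w[0] / norm, w[1] / norm)
    return log_growth / steps


if __name__ == "__main__":
    rng = random.Random(0)
    print(top_growth_rate(gaussian_matrix, 20000, rng))
    # Sanity check with a fixed matrix diag(2, 1/2): the rate is log 2.
    print(top_growth_rate(lambda _rng: ((2.0, 0.0), (0.0, 0.5)), 2000, rng))
```

Renormalizing at every step is the design choice that matters here: the norm of the raw product grows or decays exponentially, while the accumulated logarithms stay of moderate size.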
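$L(\mathbb{R}^k, \mathbb{R}^k)$. Writing $\varphi_n(\omega) = X_n(\omega) \circ \cdots \circ X_1(\omega)$ for all $n \ge 1$ and $\omega \in \Omega$ we obtain a cocycle. If we set

$$B = \left\{(\omega, y) \in \Omega \times \mathbb{R}^k : \lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| \ \text{exists and is finite or is } -\infty \right\},$$

$$\Omega' = \{\omega \in \Omega : (\omega, y) \in B \ \text{for all } y \in \mathbb{R}^k\}$$

then $\Omega'$ contains a subset $\Omega''$ of full probability and there exist random variables (which might take the value $-\infty$) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$ with the following properties.

1. Let $I = \{k + 1 = i_1 > i_2 > \cdots > i_{l+1} = 1\}$ be any $(l + 1)$-tuple of integers and then we define

$$\Omega_I = \{\omega \in \Omega'' : \lambda_i(\omega) = \lambda_j(\omega),\ i_h > i, j \ge i_{h+1},\ \text{and}\ \lambda_{i_h}(\omega) > \lambda_{i_{h+1}}(\omega)\ \text{for all } 1 < h < l\}$$

the set of elements where the sequence $\lambda_i$ jumps exactly at the indexes in $I$. Then for $\omega \in \Omega_I$, $1 < h \le l$,

$$\Sigma_{I,h}(\omega) = \left\{y \in \mathbb{R}^k : \lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| \le \lambda_{i_h}(\omega)\right\}$$

is a vector subspace with dimension $i_{h-1} - 1$.

2. Setting $\Sigma_{I,k+1}(\omega) = \{0\}$, then

$$\lim_{n \to +\infty} \frac{1}{n} \log \|\varphi_n(\omega) y\| = \lambda_{i_h}(\omega)$$

for every $y \in \Sigma_{I,h}(\omega) \setminus \Sigma_{I,h+1}(\omega)$.

3. For all $\omega \in \Omega''$ there exists the matrix

$$A(\omega) = \lim \left[(\varphi_n(\omega))^* \varphi_n(\omega)\right]^{1/2n}$$

we obtain a stationary sequence to which we can apply the previous result, obtaining the existence of Lyapunov exponents and of Lyapunov subspaces on a full measure subset for any $C^1$ measure-preserving dynamical system.

By a standard extension of the previous setup, we obtain a random version of the multiplicative ergodic theorem. We take a family of skew-product maps $S_t : \Omega \times M \circlearrowleft$ as in the section ‘‘The abstract framework’’ with an invariant probability measure $\mu$ and such that $\varphi(t, \omega) : M \circlearrowleft$ is (for simplicity) a local diffeomorphism. We then consider the stationary family

$$X_t : \Omega \to L(TM), \quad \omega \mapsto D\varphi(t, \omega) : TM \circlearrowleft, \quad t \in \mathbb{T}$$

where $D\varphi(t, \omega)$ is the tangent map to $\varphi(t, \omega)$. This is a cocycle since for all $t, s \in \mathbb{T}$, $\omega \in \Omega$ we have

$$X(s + t, \omega) = X(s, \theta(t)\omega) \circ X(t, \omega)$$

If we assume that

$$\sup_{0 \le t \le 1}\, \sup_{x \in M} \log^+ \|D\varphi(t, \omega)(x)\| \in L^1(\Omega, P)$$

where $\|\cdot\|$ denotes the norm on the corresponding space of linear maps given by the induced norm (from the Riemannian metric) on the appropriate tangent spaces, then we obtain a sequence of random variables (which might take the value $-\infty$) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$, with $k$ being the dimension of $M$, such that

$$\lim_{t \to +\infty} \frac{1}{t} \log \|X_t(\omega, x) y\| = \lambda_i(\omega, x)$$

for every $y \in E_i(\omega, x) = \Sigma_i(\omega, x) \setminus \Sigma_{i+1}(\omega, x)$ and $i = 1, \ldots, k + 1$, where $(\Sigma_i(\omega, x))_i$ is a sequence of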
differential equations with a hyperbolic singularity of saddle type, as the Lorenz flow, exhibit sensitive dependence on initial conditions, a common feature of chaotic dynamics: small initial differences are rapidly augmented as time passes, causing two trajectories originally coming from practically indistinguishable points to behave in a completely different manner after a short while. Long-term predictions based on such models are unfeasible, since it is not possible to both specify initial conditions with arbitrary accuracy and numerically calculate with arbitrary precision.

Physical measures Inspired by an analogous situation of unpredictability faced in the field of statistical mechanics/thermodynamics, researchers focused on the statistics of the data provided by the time averages of some observable (a continuous function on the manifold) of the system. Time averages are guaranteed to exist for a positive-volume subset of initial states (also called an observable subset) on the mathematical model if the transformation, or the flow associated with the ordinary differential equation, admits a smooth invariant measure (a density) or a physical measure.

Indeed, if $\mu_0$ is an ergodic invariant measure for the transformation $T_0$, then the ergodic theorem ensures that for every $\mu_0$-integrable function $\varphi : M \to \mathbb{R}$ and for $\mu_0$-almost every point $x$ in the manifold $M$, the time average

$$\tilde\varphi(x) = \lim_{n\to+\infty} \frac{1}{n} \sum_{j=0}^{n-1} \varphi\big(T_0^j(x)\big)$$

exists and equals the space average $\int \varphi \, d\mu_0$. A physical measure $\mu$ is an invariant probability measure for which it is required that time averages of every continuous function $\varphi$ exist for a positive Lebesgue measure (volume) subset of the space and be equal to the space average $\mu(\varphi)$.

We note that if $\mu$ is a density, that is, absolutely continuous with respect to the volume measure, then the ergodic theorem ensures that $\mu$ is physical. However, not every physical measure is absolutely continuous. To see why in a simple example, consider a singularity $p$ of a vector field which is an attracting fixed point (a sink). Then the Dirac mass $\delta_p$ concentrated on $p$ is a physical probability measure, since every orbit in the basin of attraction of $p$ will have asymptotic time averages for any continuous observable $\varphi$ given by $\varphi(p) = \delta_p(\varphi)$.

Physical measures need not be unique or even exist in general but, when they do exist, it is desirable that the set of points whose asymptotic time averages are described by physical measures (such a set is called the basin of the physical measures) be of full Lebesgue measure, so that only an exceptional set of points with zero volume would not have a well-defined asymptotic behavior. This is yet far from being proved for most dynamical systems, in spite of much recent progress in this direction.

There are robust examples of systems admitting several physical measures whose basins together are of full Lebesgue measure, where ‘‘robust’’ means that there are whole open sets of maps of a manifold in the $C^2$ topology exhibiting these features. For typical parametrized families of one-dimensional unimodal maps (maps of the circle or of the interval with a unique critical point), it is known that the above scenario holds true for Lebesgue almost every parameter. It is known that there are systems admitting no physical measure, but the only known cases are not robust, that is, there are systems arbitrarily close which admit physical measures.

It is hoped that conclusions drawn from models admitting physical measures will be effectively observable in the physical processes being modeled. In order to lend more weight to this expectation, researchers demand stability properties from such invariant measures.

Stochastic stability There are two main issues concerning a mathematical model, both from theoretical and practical standpoints. The first one is to describe the asymptotic behavior of most orbits, that is, to understand what happens to orbits when time tends to infinity. The second and equally important one is to ascertain whether the asymptotic behavior is stable under small changes of the system, that is, whether the limiting behavior is still essentially the same after small changes to the law of evolution. In fact, since models are always simplifications of the real system (we cannot ever take into account the whole state of the universe in any model), the lack of stability considerably weakens the conclusions drawn from such models, because some properties might be specific to the model and not in any way resemble the real system.

Random dynamical systems come into play in this setting when we need to check whether a given model is stable under small random changes to the law of evolution.

In more precise terms, we suppose that there is a dynamical system (a transformation or a flow) admitting a physical measure $\mu_0$ and we take any random dynamical system obtained from this one through the introduction of small random perturbations on the dynamics, as in Examples 1–4 or in the section on ‘‘Random perturbations of flows,’’ with the noise level $\varepsilon > 0$ close to zero.

In this setting if, for any choice of invariant measure $\mu^\varepsilon$ for the random dynamical system for all $\varepsilon > 0$ small enough, the set of accumulation points of
Random Dynamical Systems 335
the family $(\mu^\varepsilon)_{\varepsilon>0}$, when $\varepsilon$ tends to 0 (also known as zero-noise limits), is formed by physical measures or, more generally, by convex linear combinations of physical measures, then the original unperturbed dynamical system is stochastically stable.

This intuitively means that the asymptotic behavior measured through time averages of continuous observables for the random system is close to the

measures such that the same map $T_0$ is not stochastically stable.

It is well known that $\lambda$ is the unique absolutely continuous invariant measure for $T_0$ and also the unique physical measure. Given $\varepsilon > 0$ small, let us define transition probability measures as follows:
j ½ ðzÞ ; ðzÞ þ
where the convention that $0 \log 0 = 0$ has been used. Given another finite partition $\zeta$, we write $\xi \vee \zeta$ to indicate the partition obtained through intersection of every element of $\xi$ with every element of $\zeta$, and analogously for any finite number of partitions. If $\mu$ is also a stationary measure for a random-maps model (see the section ‘‘Random maps’’), then for any finite measurable partition $\xi$ of $M$,

$$h_\mu(\xi) = \inf_{n\ge 1} \frac{1}{n} \int H_\mu\Big( \bigvee_{i=0}^{n-1} (T_\omega^i)^{-1}(\xi) \Big) \, dp^{\mathbb{N}}(\omega)$$

is finite and is called the entropy of the random dynamical system with respect to $\xi$ and to $\mu$.

We define $h_\mu = \sup_\xi h_\mu(\xi)$ as the metric entropy of the random dynamical system, where the supremum is taken over all $\mu$-measurable partitions. An important point here is the following notion: setting $\mathcal{A}$ the Borel $\sigma$-algebra of $M$, we say that a finite partition $\xi$ of $M$ is a random generating partition for $\mathcal{A}$ if

$$\bigvee_{i=0}^{+\infty} (T_\omega^i)^{-1}(\xi) = \mathcal{A}$$

(except $\mu$-null sets) for $p^{\mathbb{N}}$-almost all $\omega \in \Omega = U^{\mathbb{N}}$. Then a classical result from ergodic theory ensures that we can calculate the entropy using only a random generating partition $\xi$, that is, $h_\mu = h_\mu(\xi)$.

The entropy formula There exists a general relation ensuring that the entropy of a measure-preserving differentiable transformation $(T_0, \mu)$ on a compact Riemannian manifold is bounded from above by the sum of the positive Lyapunov exponents of $T_0$

$$h_\mu(T_0) \le \int \sum_{\lambda_i(x)>0} \lambda_i(x) \, d\mu(x)$$

The equality (entropy formula) was first shown to hold for diffeomorphisms preserving a measure equivalent to the Riemannian volume, and then the measures satisfying the entropy formula were characterized: for $C^2$ diffeomorphisms the equality holds if and only if the disintegration of $\mu$ along the unstable manifolds is formed by measures absolutely continuous with respect to the Riemannian volume restricted to those submanifolds. The unstable manifolds are the submanifolds of $M$ everywhere tangent to the Lyapunov subspaces corresponding to all positive Lyapunov exponents, analogous to ‘‘integrating the distribution of Lyapunov subspaces corresponding to positive exponents’’; this particular point is a main subject of smooth ergodic theory for nonuniformly hyperbolic dynamics.

Both the inequality and the characterization of stationary measures satisfying the entropy formula were extended to random iterations of independent and identically distributed $C^2$ maps (noninjective and admitting critical points), and the inequality reads

$$h_\mu \le \iint \sum_{\lambda_i(x,\omega)>0} \lambda_i(x,\omega) \, d\mu(x) \, dp^{\mathbb{N}}(\omega)$$

where the functions $\lambda_i$ are the random variables provided by the random multiplicative ergodic theorem.

Construction of Physical Measures as Zero-Noise Limits

The characterization of measures which satisfy the entropy formula enables us to construct physical measures as zero-noise limits of random invariant measures in some settings, outlined in the following, obtaining in the process that the physical measures so constructed are also stochastically stable.

The physical measures obtained in this manner arguably are natural measures for the system, since they are both stable under (certain types of) random perturbations and describe the asymptotic behavior of the system for a positive-volume subset of initial conditions. This is a significant contribution to the state-of-the-art of present knowledge on dynamics from the perspective of random dynamical systems.

Hyperbolic measures and the entropy formula The main idea is that an ergodic invariant measure $\mu$ for a diffeomorphism $T_0$ which satisfies the entropy formula and whose Lyapunov exponents are everywhere nonzero (known as hyperbolic measure) necessarily is a physical measure for $T_0$. This follows from standard arguments of smooth nonuniformly hyperbolic ergodic theory.

Indeed $\mu$ satisfies the entropy formula if and only if $\mu$ disintegrates into densities along the unstable submanifolds of $T_0$. The unstable manifolds $W^u(x)$ are tangent to the subspace corresponding to every positive Lyapunov exponent at $\mu$-almost every point $x$, they are an invariant family, that is, $T_0(W^u(x)) = W^u(T_0(x))$ for $\mu$-almost every $x$, and distances on them are uniformly contracted under iteration by $T_0^{-1}$.

If the exponents along the complementary directions are nonzero, then they must be negative and smooth ergodic theory ensures that there exist stable manifolds, which are submanifolds $W^s(x)$ of
$M$ everywhere tangent to the subspace of negative Lyapunov exponents at $\mu$-almost every point $x$, form a $T_0$-invariant family ($T_0(W^s(x)) = W^s(T_0(x))$, $\mu$-almost everywhere), and distances on them are uniformly contracted under iteration by $T_0$.

We still need to understand that time averages are constant along both stable and unstable manifolds, and that the families of stable and unstable manifolds are absolutely continuous, in order to realize how a hyperbolic measure is a physical measure.

Given $y \in W^s(x)$, the time averages of $x$ and $y$ coincide for continuous observables simply because $\operatorname{dist}(T_0^n(x), T_0^n(y)) \to 0$ when $n \to +\infty$. For unstable manifolds, the same holds when considering time averages for $T_0^{-1}$. Since forward and backward time averages are equal $\mu$-almost everywhere, the set of points having asymptotic time averages given by $\mu$ has positive Lebesgue measure if the set

$$B = \bigcup \{ W^s(y) : y \in W^u(x) \cap \operatorname{supp}(\mu) \}$$

has positive volume in $M$, for some $x$ whose time averages are well defined.

Now, stable and unstable manifolds are transverse everywhere where they are defined, but they are only defined $\mu$-almost everywhere and depend measurably on the base point, so we cannot use transversality arguments from differential topology, in spite of $W^u(x) \cap \operatorname{supp}(\mu)$ having positive volume in $W^u(x)$ by the existence of a smooth disintegration of $\mu$ along the unstable manifolds. However, it is known for smooth ($C^2$) transformations that the families of stable and unstable manifolds are absolutely continuous, meaning that projections along leaves preserve sets of zero volume. This is precisely what is needed for measure-theoretic arguments to show that $B$ has positive volume.

that $T_0$ belongs to the family. Letting $T_x(u) = T(u, x)$ for all $(u, x) \in U \times M$, we then have that $T_x(\theta_\varepsilon)$ is absolutely continuous. This means that sets of perturbations of positive $\theta_\varepsilon$-measure send points of $M$ onto positive-volume subsets of $M$. Such a perturbation can be constructed for every continuous map of any manifold.

In this setting, any invariant probability measure for the associated skew-product map $S : \Omega \times M \to \Omega \times M$ of the form $\theta_\varepsilon^{\mathbb{N}} \times \mu^\varepsilon$ is such that $\mu^\varepsilon$ is absolutely continuous with respect to volume on $M$. Then the entropy formula holds:

$$h_{\mu^\varepsilon} = \iint \sum_{\lambda_i(x,\omega)>0} \lambda_i(x,\omega) \, d\mu^\varepsilon(x) \, d\theta_\varepsilon^{\mathbb{N}}(\omega)$$

Having this and knowing the characterization of measures satisfying the entropy formula, it is natural to look for conditions under which we can guarantee that the above inequality extends to any zero-noise limit $\mu^0$ of $\mu^\varepsilon$ when $\varepsilon \to 0$. In this case, $\mu^0$ satisfies the entropy formula for $T_0$.

If, in addition, we are able to show that $\mu^0$ is a hyperbolic measure, then we obtain a physical measure for $T_0$ which is stochastically stable by construction.

These ideas can be carried out completely for hyperbolic diffeomorphisms, that is, maps admitting a continuous invariant splitting of the tangent space into two sub-bundles $E \oplus F$ defined everywhere with bounded angles, whose Lyapunov exponents are negative along $E$ and positive along $F$. Recently, maps satisfying weaker conditions were shown to admit stochastically stable physical measures following the same ideas.

These ideas also have applications to the construction and stochastic stability of physical measures for strange attractors and for all mathematical models involving ordinary differential equations or iterations of maps.
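The notions of time average, physical measure and zero-noise limit used above can be illustrated numerically in the simplest case mentioned in the text, a sink. The following is a minimal sketch, not part of the original article: it assumes the hypothetical one-dimensional map $T_0(x) = x/2$ (whose physical measure is the Dirac mass at 0) and an additive noise term $\varepsilon u$ with $u$ uniform in $[-1, 1]$. The function name `noisy_time_average` and all parameter values are illustrative choices.

```python
import math
import random


def noisy_time_average(phi, x0, eps, n_steps, seed=0):
    """Time average of phi along one orbit of the randomly perturbed sink.

    Unperturbed map (an assumption for illustration): T0(x) = x / 2.
    Perturbed map: x -> x / 2 + eps * u, with u uniform in [-1, 1].
    """
    rng = random.Random(seed)
    x = x0
    total = 0.0
    for _ in range(n_steps):
        total += phi(x)
        x = 0.5 * x + eps * rng.uniform(-1.0, 1.0)
    return total / n_steps


if __name__ == "__main__":
    # The space average of phi with respect to the Dirac mass at 0 is phi(0).
    target = math.cos(0.0)
    for eps in (0.1, 0.01, 0.001):
        avg = noisy_time_average(math.cos, x0=1.0, eps=eps, n_steps=200_000)
        print(f"eps = {eps:<6} |time average - phi(0)| = {abs(avg - target):.2e}")
```

In this toy setting the random time averages approach $\varphi(0) = \delta_0(\varphi)$ as the noise level decreases, which is the behavior that stochastic stability asks for; it says nothing about the much harder hyperbolic and nonuniformly hyperbolic cases treated in the text.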
Bonatti C, Díaz L, and Viana M (2005) Dynamics Beyond Uniform Hyperbolicity. A Global Geometric and Probabilistic Perspective, Encyclopaedia of Mathematical Sciences, 102 Mathematical Physics III. Berlin: Springer.
Doob J (1953) Stochastic Processes. New York: Wiley.
Fathi A, Herman M-R, and Yoccoz J-C (1983) A proof of Pesin’s stable manifold theorem. In: Palis J (ed.) Geometric Dynamics (Rio de Janeiro, 1981), Lecture Notes in Mathematics, vol. 1007, 177–215. Berlin: Springer.
Kifer Y (1986) Ergodic Theory of Random Perturbations. Boston: Birkhäuser.
Kifer Y (1988) Random Perturbations of Dynamical Systems. Boston: Birkhäuser.
Kunita H (1990) Stochastic Flows and Stochastic Differential Equations. Cambridge: Cambridge University Press.
Ledrappier F and Young L-S (1988) Entropy formula for random transformations. Probability Theory and Related Fields 80(2): 217–240.
Liu P-D and Qian M (1995) Smooth Ergodic Theory of Random Dynamical Systems, Lecture Notes in Mathematics, vol. 1606. Berlin: Springer.
Øksendal B (1992) Stochastic Differential Equations. Universitext, 3rd edn. Berlin: Springer.
Viana M (2000) What’s new on Lorenz strange attractor. Mathematical Intelligencer 22(3): 6–19.
Walters P (1982) An Introduction to Ergodic Theory. Berlin: Springer.
Figure 1 Original (top) and unfolded (bottom) spectrum. Axes: x (original), ξ (unfolded).
Figure 2 Wigner surmise (solid) and Poisson law (dashed). Horizontal axis: spacing s.
Random Matrix Theory in Physics 339
whose elements, the eigenvalues $x_n$, are uncorrelated random numbers. To model the presence of correlations, we insert off-diagonal matrix elements,

$$H = \begin{bmatrix} H_{11} & \cdots & H_{1N} \\ \vdots & & \vdots \\ H_{N1} & \cdots & H_{NN} \end{bmatrix} \qquad [4]$$

We require that $H$ is real symmetric, $H^T = H$. The independent elements $H_{nm}$ are random numbers. The random matrix $H$ is diagonalized to obtain the energy levels $x_n$, $n = 1, 2, \ldots, N$. Indeed, a numerical simulation shows that these two models yield, after unfolding, the Poisson law and the Wigner surmise for large $N$, that is, the absence or presence of correlations. This is the most important insight into the phenomenology of RMT.

In this article, we set up RMT in a more formal way; we discuss analytical calculations of correlation functions, demonstrate how this relates to supersymmetry and stochastic field theory and show the connection to chaos, and we briefly sketch the numerous applications in many-body physics, in disordered and mesoscopic systems, in models for interacting fermions, and in quantum chromodynamics. We also mention applications in other fields, even beyond physics.

Random Matrix Theory

Classical Gaussian Ensembles

For now, we consider a system whose energy levels are correlated. The $N \times N$ matrix $H$ modeling it has no fixed zeros but random entries everywhere. There are three possible symmetry classes of random matrices in standard Schrödinger quantum mechanics. They are labeled by the Dyson index $\beta$. If the system is not time-reversal invariant, $H$ has to be Hermitian and the random entries $H_{nm}$ are complex ($\beta = 2$). If time-reversal invariance holds, two possibilities must be distinguished: if either the system is rotationally symmetric, or it has integer spin and rotational symmetry is broken, the Hamilton matrix $H$ can be chosen to be real symmetric ($\beta = 1$). This is the case in eqn [4]. If, on the other hand, the system has half-integer spin and rotational symmetry is broken, $H$ is self-dual ($\beta = 4$) and the random entries $H_{nm}$ are $2 \times 2$ quaternionic. The Dyson index $\beta$ is the dimension of the number field over which $H$ is constructed.

As we are interested in the eigenvalue correlations, we diagonalize the random matrix, $H = U^{-1} x U$. Here, $x = \operatorname{diag}(x_1, \ldots, x_N)$ is the diagonal matrix of the $N$ eigenvalues. For $\beta = 4$, every eigenvalue is doubly degenerate. This is Kramers’ degeneracy. The diagonalizing matrix $U$ is in the orthogonal group $O(N)$ for $\beta = 1$, in the unitary group $U(N)$ for $\beta = 2$ and in the unitary–symplectic group $USp(2N)$ for $\beta = 4$. Accordingly, the three symmetry classes are referred to as orthogonal, unitary, and symplectic.

We have not yet chosen the probability densities for the random entries $H_{nm}$. To keep our assumptions about the system at a minimum, we treat all entries on equal footing. This is achieved by rotational invariance of the probability density $P_N^{(\beta)}(H)$, not to be confused with the rotational symmetry employed above to define the symmetry classes. No basis for the matrices is preferred in any way if we construct $P_N^{(\beta)}(H)$ from matrix invariants, that is, from traces and determinants, such that it depends only on the eigenvalues, $P_N^{(\beta)}(H) = P_N^{(\beta)}(x)$. A particularly convenient choice is the Gaussian

$$P_N^{(\beta)}(H) = C_N^{(\beta)} \exp\left( -\frac{\beta}{4v^2} \operatorname{tr} H^2 \right) \qquad [5]$$

where the constant $v$ sets the energy scale and the constant $C_N^{(\beta)}$ ensures normalization. The three symmetry classes together with the probability densities [5] define the Gaussian ensembles: the Gaussian orthogonal (GOE), unitary (GUE) and symplectic (GSE) ensemble for $\beta = 1, 2, 4$.

The phenomenology of the three Gaussian ensembles differs considerably. The higher $\beta$, the stronger the level repulsion between the eigenvalues $x_n$. Numerical simulation quickly shows that the nearest-neighbor spacing distribution behaves like $p^{(\beta)}(s) \sim s^\beta$ for small spacings $s$. This also becomes obvious by working out the differential probability $P_N^{(\beta)}(H)\,d[H]$ of the random matrices $H$ in eigenvalue–angle coordinates $x$ and $U$. Here, $d[H]$ is the invariant measure or volume element in the matrix space. When writing $d[\cdot]$, we always mean the product of all differentials of independent variables for the quantity in the square brackets. Up to constants, we have

$$d[H] = |\Delta_N(x)|^\beta \, d[x] \, d\mu(U) \qquad [6]$$

where $d\mu(U)$ is, apart from certain phase contributions, the invariant or Haar measure on $O(N)$, $U(N)$, or $USp(2N)$, respectively. The Jacobian of the transformation is the modulus of the Vandermonde determinant

$$\Delta_N(x) = \prod_{n<m} (x_n - x_m) \qquad [7]$$

raised to the power $\beta$. Thus, the differential probability $P_N^{(\beta)}(H)\,d[H]$ vanishes whenever any two eigenvalues $x_n$ degenerate. This is the level
repulsion. It immediately explains the behavior of the nearest-neighbor spacing distribution for small spacings.

Additional symmetry constraints lead to new random matrix ensembles relevant in physics, the Andreev and the chiral Gaussian ensembles. If one refers to the classical Gaussian ensembles, one usually means the three ensembles introduced above.

Correlation Functions

The probability density to find $k$ energy levels at positions $x_1, \ldots, x_k$ is the $k$-level correlation function $R_k^{(\beta)}(x_1, \ldots, x_k)$. We find it by integrating out $N - k$ levels in the $N$-level differential probability $P_N^{(\beta)}(H)\,d[H]$. We also have to average over the bases, that is, over the diagonalizing matrices $U$. Due to rotational invariance, this simply yields the group volume. Thus, we have

$$R_k^{(\beta)}(x_1, \ldots, x_k) = \frac{N!}{(N-k)!} \int_{-\infty}^{+\infty} dx_{k+1} \cdots \int_{-\infty}^{+\infty} dx_N \, |\Delta_N(x)|^\beta \, P_N^{(\beta)}(x) \qquad [8]$$

Once more, we used rotational invariance which implies that $P_N^{(\beta)}(x)$ is invariant under permutation of the levels $x_n$. Since the same then also holds for the correlation functions [8], it is convenient to normalize them to the combinatorial factor in front of the integrals. A constant ensuring this has been absorbed into $P_N^{(\beta)}(x)$.

Remarkably, the integrals in eqn [8] can be done in closed form. The GUE case ($\beta = 2$) is mathematically the simplest, and one finds the determinant structure

$$R_k^{(2)}(x_1, \ldots, x_k) = \det\left[ K_N^{(2)}(x_p, x_q) \right]_{p,q=1,\ldots,k} \qquad [9]$$

All entries of the determinant can be expressed in terms of the kernel $K_N^{(2)}(x_p, x_q)$, which depends on two energy arguments $(x_p, x_q)$. Analogous but more complicated formulae are valid for the GOE ($\beta = 1$) and the GSE ($\beta = 4$), involving quaternion determinants and integrals and derivatives of the kernel.

As argued in the Introduction, we are interested in the energy correlations on the unfolded energy scale. The level density is formally the one-level correlation function. For the three Gaussian ensembles it is, to leading order in the level number $N$, the Wigner semicircle

$$R_1^{(\beta)}(x_1) = \frac{1}{2\pi v^2} \sqrt{4Nv^2 - x_1^2} \qquad [10]$$

for $|x_1| \le 2\sqrt{N}\,v$ and zero for $|x_1| > 2\sqrt{N}\,v$. None of the common systems in physics has such a level density. When unfolding, we also want to take the limit of infinitely many levels $N \to \infty$ to remove cutoff effects due to the finite dimension of the random matrices. It suffices to stay in the center of the semicircle where the mean level spacing is $D = 1/R_1^{(\beta)}(0) = \pi v/\sqrt{N}$. We introduce the dimensionless energies $\xi_p = x_p/D$, $p = 1, \ldots, k$, which have to be held fixed when taking the limit $N \to \infty$. The unfolded correlation functions are given by

$$X_k^{(\beta)}(\xi_1, \ldots, \xi_k) = \lim_{N\to\infty} D^k R_k^{(\beta)}(D\xi_1, \ldots, D\xi_k) \qquad [11]$$

As we are dealing with probability densities, the Jacobians $dx_p/d\xi_p$ enter the reformulation in the new energy variables. This explains the factor $D^k$. Unfolding makes the correlation functions translation invariant; they depend only on the differences $\xi_p - \xi_q$. The unfolded correlation functions can be written in a rather compact form. For the GUE ($\beta = 2$), they read

$$X_k^{(2)}(\xi_1, \ldots, \xi_k) = \det\left[ \frac{\sin \pi(\xi_p - \xi_q)}{\pi(\xi_p - \xi_q)} \right]_{p,q=1,\ldots,k} \qquad [12]$$

There are similar, but more complicated, formulae for the GOE ($\beta = 1$) and the GSE ($\beta = 4$). By construction, one has $X_1^{(\beta)}(\xi_1) = 1$.

It is useful to formulate the case where correlations are absent, that is, the Poisson case, accordingly. The level density $R_1^{(P)}(x_1)$ is simply $N$ times the (smooth) probability density chosen for the entries in the diagonal matrix [4]. Lack of correlations means that the $k$-level correlation function only involves one-level correlations,

$$R_k^{(P)}(x_1, \ldots, x_k) = \frac{N!}{(N-k)!\,N^k} \prod_{p=1}^{k} R_1^{(P)}(x_p) \qquad [13]$$

The combinatorial factor is important, since we always normalize to $N!/(N-k)!$. Hence, one finds

$$X_k^{(P)}(\xi_1, \ldots, \xi_k) = 1 \qquad [14]$$

for all unfolded correlation functions.

Statistical Observables

The unfolded correlation functions yield all statistical observables. The two-level correlation function $X_2(r)$ with $r = \xi_1 - \xi_2$ is of particular interest in applications. If we do not write the superscript $(\beta)$ or $(P)$, we mean either of the functions. For the Gaussian ensembles, $X_2^{(\beta)}(r)$ is shown in Figure 3. One often writes $X_2(r) = 1 - Y_2(r)$. The two-level cluster function $Y_2(r)$ nicely measures the deviation from the uncorrelated Poisson case, where one has $X_2^{(P)}(r) = 1$ and $Y_2^{(P)}(r) = 0$.
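The determinant formula [12] for the unfolded GUE correlation functions is easy to evaluate for small k. The following is a minimal sketch in plain Python, not taken from the article: the function names (`sine_kernel`, `gue_unfolded_correlation`, `gue_cluster_function`) are illustrative, and the determinant is computed by naive Laplace expansion, which is only sensible for small k.

```python
import math


def sine_kernel(r):
    """Kernel sin(pi r) / (pi r) of eqn [12], with its limiting value 1 at r = 0."""
    if r == 0.0:
        return 1.0
    return math.sin(math.pi * r) / (math.pi * r)


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def gue_unfolded_correlation(xis):
    """X_k^(2)(xi_1, ..., xi_k) = det[sine_kernel(xi_p - xi_q)], eqn [12]."""
    matrix = [[sine_kernel(p - q) for q in xis] for p in xis]
    return det(matrix)


def gue_cluster_function(r):
    """Two-level cluster function Y_2(r) = 1 - X_2(r) for the GUE."""
    return 1.0 - gue_unfolded_correlation([r, 0.0])


if __name__ == "__main__":
    for r in (0.0, 0.25, 0.5, 1.0, 2.5):
        x2 = gue_unfolded_correlation([r, 0.0])
        print(f"r = {r:4.2f}  X2 = {x2:.6f}  Y2 = {gue_cluster_function(r):.6f}")
```

For k = 2 the determinant reduces to $X_2^{(2)}(r) = 1 - (\sin \pi r / \pi r)^2$, so $X_2^{(2)}$ vanishes at $r = 0$ (level repulsion) and approaches the Poisson value 1 for large $r$.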
Figure 3 Two-level correlation function $X_2^{(\beta)}(r)$ for GOE (solid), GUE (dashed) and GSE (dotted).

By construction, the average level number in an interval of length $L$ in the unfolded spectrum is $L$. The level number variance $\Sigma^2(L)$ is shown to be an average over the two-level cluster function,

$$\Sigma^2(L) = L - 2 \int_0^L (L - r)\, Y_2(r)\, dr \qquad [15]$$

We find $L \pm \sqrt{\Sigma^2(L)}$ levels in an interval of length $L$. In the uncorrelated Poisson case, one has $\Sigma^{2\,(P)}(L) = L$. This is just Poisson’s error law. For the Gaussian ensembles $\Sigma^{2\,(\beta)}(L)$ behaves logarithmically for large $L$. The spectrum is said to be more rigid than in the Poisson case. As Figure 4 shows, the level number variance probes longer distances in the spectrum, in contrast to the nearest-neighbor spacing distribution.

Figure 4 Level number variance $\Sigma^2(L)$ for GOE (solid) and Poisson case (dashed).

Many more observables, also sensitive to higher order, $k > 2$ correlations, have been defined. In practice, however, one is often restricted to analyzing two-level correlations. An exception is, to some extent, the nearest-neighbor spacing distribution $p(s)$. It is the two-level correlation function with the additional requirement that the two levels in question are adjacent, that is, that there are no levels between them. Thus, all correlation functions are needed if one wishes to calculate the exact nearest-neighbor spacing distribution $p^{(\beta)}(s)$ for the Gaussian ensembles. These considerations explain that we have $X_2^{(\beta)}(s) \simeq p^{(\beta)}(s)$ for small $s$. But while $X_2^{(\beta)}(s)$ saturates for large $s$, $p^{(\beta)}(s)$ quickly goes to zero in a Gaussian fashion. Thus, although the nearest-neighbor spacing distribution mathematically involves all correlations, it makes in practice only a meaningful statement about the two-level correlations. Luckily, $p^{(\beta)}(s)$ differs only very slightly from the heuristic Wigner surmise [2] (corresponding to $\beta = 1$), respectively from its extensions (corresponding to $\beta = 2$ and $\beta = 4$).

Ergodicity and Universality

We constructed the correlation functions as averages over an ensemble of random matrices. But this is not how we proceeded in the data analysis sketched in the Introduction. There, we started from one single spectrum with very many levels and obtained the statistical observable just by sampling and, if necessary, smoothing. Do these two averages, the ensemble average and the spectral average, yield the same? Indeed, one can show that the answer is affirmative, if the level number $N$ goes to infinity. This is referred to as ergodicity in RMT.

Moreover, as already briefly indicated in the Introduction, very many systems from different areas of physics are well described by RMT. This seems to be at odds with the Gaussian assumption [5]. There is hardly any system whose Hamilton matrix elements follow a Gaussian probability density. The solution for this puzzle lies in the unfolding. Indeed, it has been shown that almost all functional forms of the probability density $P_N^{(\beta)}(H)$ yield the same unfolded correlation functions, if no new scale comparable to the mean level spacing is present in $P_N^{(\beta)}(H)$. This is the mathematical side of the empirically found universality.

Ergodicity and universality are of crucial importance for the applicability of RMT in data analysis.

Wave Functions

By modeling the Hamiltonian of a system with a random matrix $H$, we do not only make an assumption about the statistics of the energies, but also about those of the wave functions. Because of the eigenvalue equation $H u_n = x_n u_n$, $n = 1, \ldots, N$, the wave function belonging to the eigenenergy $x_n$ is modeled by the eigenvector $u_n$. The columns of the diagonalizing matrix $U = [u_1\ u_2\ \cdots\ u_N]$ are these eigenvectors. The probability density of the components $u_{nm}$ of the eigenvector $u_n$ can be calculated rather easily. For large $N$ it approaches a Gaussian. This is equivalent to the Porter–Thomas distribution. While wave functions are often not accessible in an experiment, one can measure transition amplitudes and widths, giving information about the matrix elements of a transition operator and a
projection of the wave functions onto a certain state in Hilbert space. If the latter are represented by a fixed matrix $A$ or a fixed vector $a$, respectively, one can calculate the RMT prediction for the probability densities of the matrix elements $u_n^\dagger A u_m$ or the widths $a^\dagger u_n$ from the probability density of the eigenvectors.

Scattering Systems

It is important that RMT can be used as a powerful tool in scattering theory, because the major part of the experimental information about quantum systems comes from scattering experiments. Consider an example from compound nucleus scattering. In an accelerator, a proton is shot on a nucleus, with which it forms a compound nucleus. This then decays by emitting a neutron. More generally, the ingoing channel (the proton in our example) connects to the interaction region (the nucleus), which also connects to an outgoing channel (the neutron). There are $\Lambda$ channels with channel wave functions which are labeled $a = 1, \ldots, \Lambda$. The interaction region is described by an $N \times N$ Hamiltonian matrix $H$ whose eigenvalues $x_n$ are bound-state energies labeled $n = 1, \ldots, N$. The dimension $N$ is a cutoff which has to be taken to infinity at the end of a calculation. The $\Lambda \times \Lambda$ scattering matrix $S$ contains the information about how the ingoing channels are transformed into the outgoing channels. The scattering matrix $S$ is unitary. Under certain and often justified assumptions, a scattering matrix element can be cast into the form

We have not made any statistical assumptions yet. Often, one can understand generic features of a scattering system by assuming that the Hamiltonian $H$ is a random matrix, taken from one of the three classical ensembles. This is one RMT approach used in scattering theory.

Another RMT approach is based on the scattering matrix itself: $S$ is modeled by a $\Lambda \times \Lambda$ unitary random matrix. Taking into account additional symmetries, one arrives at the three circular ensembles, circular orthogonal (COE), unitary (CUE) and symplectic (CSE). They correspond to the three classical Gaussian ensembles and are also labeled with the Dyson index $\beta = 1, 2, 4$. The eigenphases of the random scattering matrix correspond to the eigenvalues of the random Hamiltonian matrix. The unfolded correlation functions of the circular ensembles are identical to those of the Gaussian ensembles.

Supersymmetry

Apart from the symmetries, random matrices contain nothing but random numbers. Thus, a certain type of redundancy is present in RMT. Remarkably, this redundancy can be removed, without losing any piece of information by using supersymmetry, that is, by a reformulation of the random matrix model involving commuting and anticommuting variables. For the sake of simplicity, we sketch the main ideas for the GUE, but they apply to the GOE and the GSE accordingly.

One defines the $k$-level correlation functions by using the resolvent of the Schrödinger equation,
which depends on the energies and k new source The RMT models discussed up to now describe
variables Jp , p = 1, . . . , k, ordered in 2k 2k diag- four extreme situations, the absence of correla-
onal matrices tions in the Poisson case and the presence of
correlations as in the three fully rotational
x ¼ diagðx1 ; x1 ; . . . ; xk ; xk Þ invariant models GOE, GUE, and GSE. A real
½21
$$J = \mathrm{diag}(+J_1, -J_1, \ldots, +J_k, -J_k)$$
We notice the normalization $Z_k^{(2)}(x) = 1$ at $J = 0$. The generating function [20] is an integral over an ordinary $N \times N$ matrix $H$. It can be exactly rewritten as an integral over a $2k \times 2k$ supermatrix $\sigma$ containing commuting and anticommuting variables,
$$Z_k^{(2)}(x + J) = \int Q_k^{(2)}(\sigma)\, \mathrm{sdet}^{-N}(x + J - \sigma)\, d[\sigma] \qquad [22]$$
The integrals over the commuting variables are of the ordinary Riemann–Stieltjes type, while those over the anticommuting variables are Berezin integrals. The Gaussian probability density [5] is mapped onto its counterpart in superspace
$$Q_k^{(2)}(\sigma) = c_k^{(2)} \exp\left(-\frac{1}{2v^2}\, \mathrm{str}\, \sigma^2\right) \qquad [23]$$
where $c_k^{(2)}$ is a normalization constant. The supertrace str and the superdeterminant sdet generalize the corresponding invariants for ordinary matrices. The total number of integrations in eqn [22] is drastically reduced as compared to eqn [20]. Importantly, it is independent of the level number $N$, which now only appears as the negative power of the superdeterminant in eqn [22], that is, as an explicit parameter. This most convenient feature makes it possible to take the limit of infinitely many levels by means of a saddle point approximation to the generating function.
Loosely speaking, the supersymmetric formulation can be viewed as an irreducible representation of RMT which yields a clearer insight into the mathematical structures. The same is true for applications in scattering theory and in models for crossover transitions to be discussed below. This explains why supersymmetry is so often used in RMT calculations.
It should be emphasized that the rôle of supersymmetry in RMT is quite different from the one in high-energy physics, where the commuting and anticommuting variables represent physical particles, bosons and fermions, respectively. This is not so in the RMT context. The commuting and
physics system, however, is often between these extreme situations. The corresponding RMT models can vary considerably, depending on the specific situation. Nevertheless, those models in which the random matrices for two extreme situations are simply added with some weight are useful in so many applications that they acquired a rather generic standing. One writes
$$H(\alpha) = H^{(0)} + \alpha H^{(\beta)} \qquad [24]$$
where $H^{(0)}$ is a random matrix drawn from an ensemble with a completely arbitrary probability density $P_N^{(0)}(H^{(0)})$. The case of a fixed matrix is included, because one may choose a product of $\delta$-distributions for the probability density. The matrix $H^{(\beta)}$ is random and drawn from the classical Gaussian ensembles with probability density $P_N^{(\beta)}(H^{(\beta)})$ for $\beta = 1, 2, 4$. One requires that the group diagonalizing $H^{(0)}$ is a subgroup of the one diagonalizing $H^{(\beta)}$. The model [24] describes a crossover transition. The weight $\alpha$ is referred to as the transition parameter. It is useful to choose the spectral support of $H^{(0)}$ and $H^{(\beta)}$ equal. One can then view $\alpha$ as the root-mean-square matrix element of $H^{(\beta)}$. At $\alpha = 0$, one has the arbitrary ensemble. The Gaussian ensembles are formally recovered in the limit $\alpha \to \infty$, to be taken in a proper way such that the energies remain finite.
We are always interested in the unfolded correlation functions. Thus, $\alpha$ has to be measured in units of the mean level spacing $D$ such that $\lambda = \alpha/D$ is the physically relevant transition parameter. It means that, depending on the numerical value of $D$, even a small effect on the original energy scale can have sizeable impact on the spectral statistics. This is referred to as statistical enhancement. The nearest-neighbor spacing distribution is already very close to $p^{(\beta)}(s)$ for the Gaussian ensembles if $\lambda$ is larger than 0.5 or so. In the long-range observables such as the level number variance $\Sigma^2(L)$, the deviation from the Gaussian ensemble statistics becomes visible at interval lengths $L$ comparable to $\lambda$.
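The crossover model [24] can be explored numerically. The following is a minimal sketch, not the method of the article: it assumes a Poisson-type $H^{(0)}$ (a diagonal matrix with independent uniform entries) and a GOE-type $H^{(\beta)}$ with $\beta = 1$, and it measures short-range correlations with the mean ratio of consecutive level spacings, a statistic chosen here because it needs no unfolding. The function names, matrix size, and sample counts are illustrative assumptions.

```python
import numpy as np


def goe_matrix(n, rng):
    """Real symmetric Gaussian random matrix (GOE type, beta = 1)."""
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2.0


def mean_spacing_ratio(levels):
    """Mean of min(s_i, s_{i+1}) / max(s_i, s_{i+1}) over consecutive spacings."""
    s = np.diff(np.sort(levels))
    return float(np.mean(np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])))


def crossover_statistic(alpha, n=200, samples=20, seed=0):
    """Average spacing ratio for H(alpha) = H0 + alpha * H_goe (model [24])."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(samples):
        # Assumption: H0 is diagonal with i.i.d. uniform levels (Poisson statistics).
        h0 = np.diag(rng.uniform(-1.0, 1.0, size=n))
        h = h0 + alpha * goe_matrix(n, rng)
        levels = np.linalg.eigvalsh(h)
        # Keep the central half of the spectrum, away from the edges.
        values.append(mean_spacing_ratio(levels[n // 4 : 3 * n // 4]))
    return float(np.mean(values))


if __name__ == "__main__":
    for alpha in (0.0, 0.002, 0.01, 0.05):
        print(alpha, crossover_statistic(alpha))
```

With these settings the mean level spacing of $H^{(0)}$ is $D = 2/n = 0.01$, so $\alpha = 0.05$ already corresponds to a transition parameter $\lambda$ of order a few units. The statistic should then move from the Poisson value (about 0.39) toward the GOE value (about 0.53), which illustrates the statistical enhancement described above.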
344 Random Matrix Theory in Physics
Crossover transitions can be interpreted as diffusion processes. With the fictitious time $t = \alpha^2/2$, the probability density $P_N(x, t)$ of the eigenvalues $x$ of the total Hamilton matrix $H = H(t) = H(\alpha)$ satisfies the diffusion equation
$$\Delta_x P_N(x, t) = \frac{4}{\beta} \frac{\partial}{\partial t} P_N(x, t) \qquad [25]$$
where the probability density for the arbitrary ensemble is the initial condition $P_N(x, 0) = P_N^{(0)}(x)$. The Laplacian
$$\Delta_x = \sum_{n=1}^{N} \frac{\partial^2}{\partial x_n^2} + \sum_{n<m} \frac{\beta}{x_n - x_m} \left( \frac{\partial}{\partial x_n} - \frac{\partial}{\partial x_m} \right) \qquad [26]$$
lives in the curved space of the eigenvalues $x$. This diffusion process is Dyson's Brownian motion in slightly simplified form. It has a rather general meaning for harmonic analysis on symmetric spaces, connecting to the spherical functions of Gelfand and Harish-Chandra, Itzykson–Zuber integrals, and to Calogero–Sutherland models of interacting particles. All this generalizes to superspace. In the supersymmetric version of Dyson's Brownian motion the generating function of the correlation functions is propagated,
$$\Delta_s Z_k(s, t) = \frac{4}{\beta} \frac{\partial}{\partial t} Z_k(s, t) \qquad [27]$$
where the initial condition $Z_k(s, 0) = Z_k^{(0)}(s)$ is the generating function of the correlation functions for the arbitrary ensemble. Here, $s$ denotes the eigenvalues of some supermatrices, not to be confused with the spacing between adjacent levels. Since the Laplacian $\Delta_s$ lives in this curved eigenvalue space, this diffusion process establishes an intimate connection to harmonic analysis on superspaces. Advantageously, the diffusion [27] is the same on the original and on the unfolded energy scales.
Fields of Application
Many-Body Systems
Numerous studies apply RMT to nuclear physics, which is also the field of its origin. If the total number of nucleons, that is, protons and neutrons, is not too small, nuclei show single-particle and collective motion. Roughly speaking, the former is decoherent out-of-phase motion of the nucleons confined in the nucleus, while the latter is coherent in-phase motion of all nucleons or of large groups of them such that any additional individual motion of the nucleons becomes largely irrelevant. It has been shown empirically that the single-particle excitations lead to GOE statistics, while collective excitations produce different statistics, often of the Poisson type. Mixed statistics as described by crossover transitions are then of particular interest to investigate the character of excitations. For example, one applies the model [24] with $H^{(0)}$ drawn from a Poisson ensemble and $H^{(\beta)}$ from a GOE. Another application of crossover transitions is breaking of time-reversal invariance in nuclei. Here, $H^{(0)}$ is from a GOE and $H^{(\beta)}$ from a GUE. Indeed, a fit of spectral data to this model yields an upper bound for the time-reversal invariance violating root-mean-square matrix element in nuclei. Yet another application is breaking of symmetries such as parity or isospin. In the case of two quantum numbers, positive and negative parity, say, one chooses $H^{(0)} = \mathrm{diag}(H^{(+)}, H^{(-)})$ block-diagonal with $H^{(+)}$ and $H^{(-)}$ drawn from two uncorrelated GOE and $H^{(\beta)}$ from a third uncorrelated GOE which breaks the block structure. Again, root-mean-square matrix elements for symmetry breaking have been derived from the data.
Nuclear excitation spectra are extracted from scattering experiments. An analysis as described above is only possible if the resonances are isolated. Often, this is not the case and the resonance widths are comparable to or even much larger than the mean level spacing, making it impossible to obtain the excitation energies directly from the cross sections. One then analyzes the latter and their fluctuations as measured and applies the concepts sketched above for scattering systems. This approach has also been successful for crossover transitions.
Due to the complexity of the nuclear many-body problem, one has to use effective or phenomenological interactions when calculating spectra. Hence, one often studies whether the statistical features found in the experimental data are also present in the calculated spectra which result from the various models for nuclei.
Other many-body systems, such as complex atoms and molecules, have also been studied with RMT concepts, but the main focus has always been on nuclei.
Quantum Chaos
Originally, RMT was intended for modeling systems with many degrees of freedom such as nuclei. Surprisingly, RMT proved useful for systems with few degrees of freedom as well. Most of these studies aim at establishing a link between RMT and classical chaos. Consider as an example the classical motion of a point-like particle in a rectangle billiard. Ideal reflection at the boundaries and absence of friction are assumed, implying that the particle is reflected infinitely many times. A second billiard is built by taking a rectangle and replacing one corner with a quarter circle as shown in Figure 5. The motion of the particle in this Sinai
Brody TA, Flores J, French JB, Mello PA, Pandey A, and Wong SSM (1981) Random-matrix physics: spectrum and strength fluctuations. Reviews of Modern Physics 53: 385–479.
Efetov K (1997) Supersymmetry in Disorder and Chaos. Cambridge: Cambridge University Press.
Forrester PJ, Snaith NC, and Verbaarschot JJM (eds.) (2003) Special issue: random matrix theory. Journal of Physics A 36.
Guhr T, Müller-Groeling A, and Weidenmüller HA (1998) Random-matrix theories in quantum physics: common concepts. Physics Reports 299: 189–428.
Haake F (2001) Quantum Signatures of Chaos, 2nd edn. Berlin: Springer.
Mehta ML (2004) Random Matrices, 3rd edn. New York: Academic Press.
Stöckmann HJ (1999) Quantum Chaos: An Introduction. Cambridge: Cambridge University Press.
Verbaarschot JJM and Wettig T (2000) Random matrix theory and chiral symmetry in QCD. Annual Review of Nuclear and Particle Science 50: 343–410.
Random Partitions
A Okounkov, Princeton University, Princeton, NJ, USA
© 2006 A Okounkov. Published by Elsevier Ltd. All rights reserved.
Partitions
A partition of $n$ is a monotone sequence of non-negative integers,
$$\lambda = (\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge 0)$$
with sum $n$. The number $n$ is also denoted by $|\lambda|$ and is called the size of $\lambda$. The number of nonzero terms in $\lambda$ is called the length of $\lambda$ and often denoted by $\ell(\lambda)$. It is convenient to make the sequence $\lambda$ infinite by adding a string of zeros at the end.
A geometric object associated to a partition is its diagram. The diagram of $\lambda = (4, 2, 2, 1)$ is shown in Figure 1. A larger diagram, flipped and rotated by 135°, can be seen in Figure 2. Flipping the diagram introduces an involution on the set of partitions of $n$ known as transposition. The transposed partition is denoted by $\lambda'$.
Partitions serve as natural combinatorial labels for many basic objects in mathematics and physics. For example, partitions of $n$ index both conjugacy classes and irreducible representations of the symmetric group $S(n)$. Partitions $\lambda$ with $\ell(\lambda) \le n$ index irreducible polynomial representations of the general linear group $GL(n)$. More generally, the highest weight of a rational representation of $GL(n)$ can be naturally viewed as two partitions of total length $\le n$.
For an even more basic example, partitions $\lambda$ with $\lambda_1 \le m$ and $\ell(\lambda) \le n$ are the same as upright lattice paths making $n$ steps up and $m$ steps to the right (just follow the boundary of $\lambda$). In particular, there are $\binom{n+m}{n}$ of such. By a variation on this theme, partitions label the standard basis of fermionic Fock space (Miwa et al. 2000). They also label a standard basis of the bosonic Fock space.
In most instances, partitions naturally occur together with some weight function. For example, the dimension, $\dim \lambda$, of an irreducible representation of $S(n)$, or some power of it, is what always appears in harmonic analysis on $S(n)$. By a theorem of Burnside,
$$M_{\mathrm{Planch}}(\lambda) = \frac{(\dim \lambda)^2}{n!} \qquad [1]$$
is a probability measure on the set of partitions of $n$; it is known as the Plancherel measure. Besides harmonic analysis, there are many other contexts in which it appears. For example, by a theorem of Schensted (see Sagan (2001) and Stanley (1999)), the distribution of the first part $\lambda_1$ of a Plancherel random partition $\lambda$ is the same as the distribution of the longest increasing subsequence in a uniformly random permutation of $\{1, 2, \ldots, n\}$.
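Schensted's theorem can be checked on a computer. The sketch below assumes the standard Robinson–Schensted row-insertion algorithm (not spelled out in the text): inserting a permutation into a tableau produces a diagram whose first row has the length of the longest increasing subsequence, and the shape obtained from a uniformly random permutation is Plancherel distributed. All function names are illustrative.

```python
from bisect import bisect_left
import random


def rsk_shape(perm):
    """Shape of the tableau built by Robinson-Schensted row insertion."""
    rows = []
    for x in perm:
        for row in rows:
            i = bisect_left(row, x)  # first entry of the row that is >= x
            if i == len(row):
                row.append(x)  # x fits at the end of this row
                break
            row[i], x = x, row[i]  # bump the displaced entry into the next row
        else:
            rows.append([x])  # bumped through every row: start a new one
    return [len(row) for row in rows]


def longest_increasing_subsequence(perm):
    """Length of the longest increasing subsequence (quadratic reference version)."""
    best = []
    for i, x in enumerate(perm):
        best.append(1 + max((best[j] for j in range(i) if perm[j] < x), default=0))
    return max(best, default=0)


def plancherel_partition(n, seed=None):
    """Sample a partition of n from the Plancherel measure via a random permutation."""
    rng = random.Random(seed)
    perm = list(range(1, n + 1))
    rng.shuffle(perm)
    return rsk_shape(perm)


if __name__ == "__main__":
    lam = plancherel_partition(1000, seed=0)
    print(lam[0], lam[0] / 1000 ** 0.5)
```

For large $n$ the printed ratio $\lambda_1/\sqrt{n}$ is expected to be close to 2, with fluctuations on the much smaller scale $n^{1/6}$.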
Figure 1 Diagram of a partition.
Figure 2 A Plancherel-random partition of 1000 and the limit shape.
Partitions of $n$ being just a finite set, one is often interested in letting $n \to \infty$. Even if the original problem was not of a probabilistic origin, one can still often benefit from adopting a probabilistic viewpoint because of the intuition and techniques that it brings. This is best illustrated by concrete examples, which is what we now turn to. These examples are not meant to be a panorama of random partitions. This is an old and still rapidly growing field and a simple list of all major contributions will take more space than is allowed. The books Kerov (2003), Pitman (n.d.), Sagan (2001), and Stanley (1999) offer much more information on the topics discussed below.
Plancherel Measure
Dimension of a Diagram
There are several formulas and interpretations for the number $\dim \lambda$ in [1]; see Sagan (2001) and Stanley (1999). The one that often appears in the context of growth processes is the following: $\dim \lambda$ is the number of ways to grow the diagram $\lambda$ from the empty diagram $\emptyset$ by adding a square at a time. That is, $\dim \lambda$ is the number of chains of the form
$$\emptyset = \lambda^{(0)} \subset \lambda^{(1)} \subset \cdots \subset \lambda^{(n-1)} \subset \lambda^{(n)} = \lambda$$
where $|\lambda^{(k)}| = k$ and $\subset$ means inclusion of diagrams.
From the classical formula
$$\dim \lambda = \frac{|\lambda|!}{\prod_{i \le k} (\lambda_i + k - i)!} \prod_{i < j \le k} (\lambda_i - \lambda_j + j - i) \qquad [2]$$
where $k$ is any number such that $\lambda_{k+1} = 0$, one sees that the Plancherel measure is a discrete analog of the eigenvalue density
$$e^{-(1/2) \sum x_i^2} \prod_{i<j} (x_i - x_j)^2$$
of a GUE random matrix (Mehta 1991). Indeed, the first factor in [2], which looks like a multinomial coefficient, is the analog of the Gaussian weight. Kerov (2003) and Johansson were among the first to recognize the analogy between Plancherel measure and GUE. One comes across many partition sums that are discrete analogs of random matrix integrals.
The most compact formula for $\dim \lambda$ is the hook formula
$$\frac{\dim \lambda}{|\lambda|!} = \prod_{\square \in \lambda} h(\square)^{-1} \qquad [3]$$
Here the product is over all squares $\square$ in the diagram of $\lambda$ and
$$h(\square) = 1 + a(\square) + l(\square)$$
where $a(\square)$ and $l(\square)$ are the numbers of squares to the right of the square $\square$ and below it, respectively. (These are known as arm-length and leg-length.)
Limit Shape and Edge Scaling
When the diagram of $\lambda$ is very large, the logarithm of the hook product approximates a double integral. The analysis of the corresponding integral plays the central role (see Kerov (2003), chapter 3) in the proof of the following law of large numbers for the Plancherel measure.
Take the diagram of $\lambda$, flip and rotate it as in Figure 1 and rescale by a factor of $\sqrt{n}$ so that it has unit area. In this way one obtains a measure on continuous and, in fact, Lipschitz functions. By a result of Logan and Shepp and, independently, Vershik and Kerov these measures converge as $n \to \infty$ to the $\delta$-measure on a single function $\Omega(x)$. This limit shape for the Plancherel measure is also plotted in Figure 2. Explicitly,
$$\Omega(x) = \begin{cases} \dfrac{2}{\pi} \left( x \arcsin(x/2) + \sqrt{4 - x^2} \right), & |x| \le 2 \\ |x|, & |x| > 2 \end{cases}$$
This is an analog of Wigner's semicircle law (Mehta 1991) for spectra of random matrices. The Gaussian correction to the limit shape was also found by Kerov (2003).
The limit shape result can be refined to show that $\lambda_1/\sqrt{n} \to 2$ in probability. Together with Schensted's theorem, this answers the question posed by Ulam about the longest increasing subsequence in a random permutation. Further progress came in the work of Baik, Deift, and Johansson (see Deift (2000)), who conjectured (and proved for $i = 1$ and 2) that as $n \to \infty$ the joint distribution of
$$\frac{\lambda_i - 2\sqrt{n}}{n^{1/6}}, \quad i = 1, 2, \ldots$$
becomes exactly the same as the distribution of largest eigenvalues of a GUE random matrix. In particular, the longest increasing subsequence, suitably scaled, is distributed exactly like the largest eigenvalue. The distribution of the latter is known as the Tracy–Widom distribution; it is given in terms of a particular solution of the Painlevé II equation. For more information about the proof of the full conjecture, see Aldous and Diaconis (1999), Deift (2000), and Okounkov (2002).
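The dimension formulas of this section are easy to cross-check for small $n$. The sketch below is an illustration only: it computes $\dim \lambda$ from the hook formula [3] and from the classical formula [2] (taking $k$ equal to the length of $\lambda$), and verifies that the Plancherel weights [1] sum to 1. The helper names are assumptions made for this example.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield all partitions of n as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def hook_lengths(lam):
    """Hook lengths h = 1 + arm + leg for every square of the diagram."""
    conj = [sum(1 for part in lam if part > j) for j in range(lam[0])] if lam else []
    return [
        1 + (lam[i] - j - 1) + (conj[j] - i - 1)
        for i in range(len(lam))
        for j in range(lam[i])
    ]


def dim_hook(lam):
    """dim(lambda) from the hook formula [3]."""
    prod = 1
    for h in hook_lengths(lam):
        prod *= h
    return factorial(sum(lam)) // prod


def dim_classical(lam):
    """dim(lambda) from formula [2], with k = number of parts (0-based indices)."""
    k = len(lam)
    num = factorial(sum(lam))
    den = 1
    for i in range(k):
        den *= factorial(lam[i] + k - 1 - i)
        for j in range(i + 1, k):
            num *= lam[i] - lam[j] + j - i
    return num // den


if __name__ == "__main__":
    n = 6
    weights = [dim_hook(lam) ** 2 / factorial(n) for lam in partitions(n)]
    print(sum(weights))  # total Plancherel mass
```

For the partition $(4, 2, 2, 1)$ of Figure 1 the hook lengths multiply to 1680, so both formulas give $\dim \lambda = 9!/1680 = 216$.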
Note that only Bessel function of integral order Schur Functions and Cauchy Identity
enter this formula. pffiffiffi Schur functions s (x1 , . . . , xn ), where is a parti-
For large
pffiffiffi argument , Jn (2 ) has sine asymptotics tion with at most n parts, form a distinguished
ifpnffiffiffi 2 and Airy function asymptotics if n
linear basis of the algebra of symmetric polyno-
2 . Consequently, one gets the random matrix mials in x1 , . . . , xn . Various definitions and many
behavior near the edge of the limit shape and remarkable properties of these function are dis-
discrete sine kernel asymptotics of correlations in cussed in, for example, Sagan (2001) and Stanley
the bulk of the limit shape. (1999). One of them is that s (x) is the trace of a
matrix with eigenvalues {xi } in an irreducible
GL(n) module with highest weight . The follow-
Permutation Enumeration
ing stability of s ,
A basic combinatorial problem is to count per-
s ðx1 ; . . . ; xn ; 0Þ ¼ s ðx1 ; . . . ; xn Þ; ‘ðÞ n
mutations 1 , . . . , p 2 S(n) of given cycle types
(1) , . . . , (p) such that allows one to define Schur functions in infinitely
many variables. The formulas
1 p ¼ 1 ½6
X X
A geometric interpretation of this problem is to count p ¼ s ; s ¼ p
covers of the sphere S2 = CP1 branched over p given
zðÞ
points with monodromy (1) , . . . , (p) . Elementary
where
character theory of S(n) gives (Jones 1998)
Y DY E jj! Y
#fi 2 CðiÞ ; i ¼ 1g ¼ f ðiÞ ½7 zðÞ ¼ ¼ jAutðÞj i
Planch jC j
where $q$ is a parameter (it is more common to use $\dim_{n, q^{1/2}}$ instead). Obviously, $\dim_{n,q} \to \dim_n$ as $q \to 1$. The function $\dim_{n,q}$ is an important building block of, for example, quantum invariants of knots and 3-folds, and various related objects (see, e.g., Bakalov and Kirillov (2001)). The Verlinde formula (Bakalov and Kirillov 2001) can be viewed as an analog of Burnside's formula with weight $\dim_{n,q}$. When $q$ is a root of unity the summation over $\lambda$ is naturally truncated to a finite sum.
The next level of generalization is obtained by deforming Schur functions to Jack and, more generally, Macdonald symmetric functions (Macdonald 1995). In particular, the Jack polynomial analog of the Plancherel measure is
$$M_{\mathrm{Jack}}(\lambda) = \prod_{\square \in \lambda} \frac{n!\,(t_1 t_2)^n}{\bigl((a(\square) + 1)\, t_1 + l(\square)\, t_2\bigr)\bigl(a(\square)\, t_1 + (l(\square) + 1)\, t_2\bigr)}$$
where the factor $n!\,(t_1 t_2)^n$ is taken once, outside the product over squares, $t_1, t_2$ are parameters, and $a(\square)$ and $l(\square)$ denote, as above, the arm- and leg-length of a square $\square$. This measure depends only on the ratio $t_2/t_1$, which is the usual parameter of Jack polynomials. To continue the analogy with random matrices, this should be viewed as a general $\beta$ analog of the Plancherel measure.
The measure $M_{\mathrm{Jack}}$ naturally arises in Atiyah–Bott localization computations on the Hilbert scheme of $n$ points in $\mathbb{C}^2$. By definition, this Hilbert scheme parametrizes ideals $I \subset \mathbb{C}[x, y]$ of codimension $n$ as linear spaces. The torus $(\mathbb{C}^*)^2$ acts on it by rescaling $x$ and $y$ and the fixed points of this action are
$$I_\lambda = \text{Span of } \{x^{j-1} y^{i-1}\}_{(i,j) \notin \lambda}$$
where $\lambda$ is a partition of $n$. The weight of this fixed point in the Atiyah–Bott formula is proportional to $M_{\mathrm{Jack}}(\lambda)$, the parameters $t_1$ and $t_2$ being the standard torus weights. Corresponding formulas in K-theory involve a Macdonald polynomial analog of $\dim \lambda$.
Nekrasov defines the partition functions of N = 2 supersymmetric gauge theories by formally applying the Atiyah–Bott localization formula to (noncompact) instanton moduli spaces. The resulting expression is a sum over partitions with a weight which is a generalization of $M_{\mathrm{Jack}}$. In this way, random partitions enter gauge theory. What is more, statistical properties of these random partitions are reflected in the dynamics of gauge theories. For example, the limit shape turns out to be precisely the Seiberg–Witten curve (see Nekrasov and Okounkov (2003), Okounkov (2002), and also Nakajima and Yoshioka (2003)).
Harmonic Functions on Young Graph
Definitions
Partitions form a natural directed graph $Y$, known as the Young graph, in which there is an edge from $\mu$ to $\lambda$ if $\lambda$ is obtained from $\mu$ by adding a square. We will denote this by $\mu \nearrow \lambda$. Let $\kappa$ be a non-negative function (called multiplicity) on edges of $Y$. A function $\varphi$ on the vertices of $Y$ is harmonic if it satisfies
$$\varphi(\mu) = \sum_{\lambda:\ \mu \nearrow \lambda} \kappa(\mu, \lambda)\, \varphi(\lambda) \qquad [12]$$
for any $\mu$. For given edge multiplicities $\kappa$, non-negative harmonic functions normalized by $\varphi(\emptyset) = 1$ form a convex compact (with respect to pointwise convergence) set, which we will denote by $H(\kappa)$. The extreme points of $H(\kappa)$ are the indecomposable or ergodic harmonic functions. They are the most important ones. One defines
$$\dim_\kappa \lambda/\mu = \sum_{\mu = \nu_0 \nearrow \nu_1 \nearrow \cdots \nearrow \nu_{|\lambda| - |\mu|} = \lambda} \ \prod_i \kappa(\nu_i, \nu_{i+1})$$
and $\dim_\kappa \lambda = \dim_\kappa \lambda/\emptyset$. For example, if $\kappa \equiv 1$ then $\dim_\kappa \lambda = \dim \lambda$. Any function $\varphi \in H(\kappa)$ defines a probability measure on partitions of fixed size $n$, $n = 0, 1, 2, \ldots$, by
$$M_{\varphi,n}(\lambda) = \varphi(\lambda) \dim_\kappa \lambda, \quad |\lambda| = n \qquad [13]$$
The mean value property [12] implies a certain coherence of these measures for different values of $n$, which, in general, does not hold for measures like $M_{\mathrm{Schur}}$. Two multiplicity functions $\kappa$ and $\kappa'$ are gauge equivalent if
$$\kappa'(\mu, \lambda) = f(\mu)\, \kappa(\mu, \lambda)\, f(\lambda)^{-1}$$
for some function $f$. In this case, $H(\kappa)$ and $H(\kappa')$ are naturally isomorphic and the measures $M$ are the same.
First Example: Thoma Theorem
Let $F$ be a central function on the infinite symmetric group $S(\infty) = \bigcup_n S(n)$, normalized by $F(1) = 1$. Restricted to $S(n)$, $F$ is a linear combination of irreducible characters
$$F|_{S(n)} = \sum_{|\lambda| = n} \varphi(\lambda)\, \chi^\lambda$$
The branching rule $\chi^\lambda|_{S(n-1)} = \sum_{\mu \nearrow \lambda} \chi^\mu$ implies that the Fourier coefficients $\varphi$ are harmonic with respect to $\kappa \equiv 1$. They are non-negative if and only if $F$ is a positive-definite function on $S(\infty)$, which means that the matrix $(F(g_i g_j^{-1}))$ is non-negative definite for any $\{g_i\} \subset S(\infty)$. The description of all indecomposable positive-definite central functions on $S(\infty)$ was first
See also: Determinantal Random Fields; Growth Processes in Random Matrix Theory; Integrable Systems in Random Matrix Theory; Random Matrix Theory in Physics; Symmetry Classes in Random Matrix Theory.
Further Reading
Aldous D and Diaconis P (1999) Longest increasing subsequences: from patience sorting to the Baik–Deift–Johansson theorem. Bulletin of the American Mathematical Society 36(4): 413–432.
Bakalov B and Kirillov A Jr. (2001) Lectures on Tensor Categories and Modular Functors. University Lecture Series, vol. 21. American Mathematical Society.
Deift P (2000) Integrable systems and combinatorial theory. Notices of the American Mathematical Society 47(6): 631–640.
Jones G (1998) Characters and Surfaces: A Survey, The Atlas of Finite Groups: Ten Years on (Birmingham, 1995), London Mathematical Society Lecture Note Series, vol. 249, pp. 90–118. Cambridge: Cambridge University Press.
Kazakov V (2001) Solvable Matrix Models, Random Matrices and Their Applications, vol. 40. MSRI Publications; Cambridge: Cambridge University Press.
Kerov S (2003) Asymptotic Representation Theory of the Symmetric Group and its Applications in Analysis. American Mathematical Society.
Kerov S, Okounkov A, and Olshanski G (1998) The boundary of the Young graph with Jack edge multiplicities. International Mathematics Research Notices 4: 173–199.
Macdonald IG (1995) Symmetric Functions and Hall Polynomials. Oxford: Clarendon.
Mehta ML (1991) Random Matrices, 2nd edn. Boston, MA: Academic Press.
Miwa T, Jimbo M, and Date E (2000) Solitons. Differential Equations, Symmetries and Infinite-Dimensional Algebras. Cambridge: Cambridge University Press.
Nakajima H and Yoshioka K (2003) Lectures on instanton counting, math.AG/0311058.
Nekrasov N and Okounkov A (2003) Seiberg–Witten theory and random partitions, hep-th/0306238.
Okounkov A (2002) Symmetric Functions and Random Partitions, Symmetric Functions 2001: Surveys of Developments and Perspectives, pp. 223–252, NATO Sci. Ser. II Math. Phys. Chem., 74. (math.CO/0309074). Dordrecht: Kluwer Academic.
Olshanski G (2003) An introduction to harmonic analysis on the infinite symmetric group, math.RT/0311369.
Okounkov A (2002) The uses of random partitions, math-ph/0309015.
Pitman J (n.d.) Combinatorial Stochastic Processes, Lecture Notes from St. Flour Course, available from www.stat.berekeley.edu.
Sagan B (2001) The Symmetric Group. Representations, Combinatorial Algorithms, and Symmetric Functions. Graduate Texts in Mathematics, 2nd edn., vol. 203, New York: Springer.
Stanley R (1999) Enumerative Combinatorics, II. Cambridge: Cambridge University Press.
Witten E (1991) On quantum gauge theories in two dimensions. Communications in Mathematical Physics 141(1): 153–209.
Woodward C (2004) Localization for the norm-square of the moment map and the two-dimensional Yang–Mills integral, math.SG/0404413.
be successful for systems exposed to a relatively small disorder of the environment. However, in certain circumstances, EMA may fail due to atypical environment configurations ("large deviations") leading to various anomalous effects. For instance, with small but positive probability a realization of the environment may create "traps" that would hold the particle for an anomalously long time, resulting in subdiffusive behavior, with the mean square displacement growing slower than linearly in time.
RWRE models have been studied by various nonrigorous methods including Monte Carlo simulations, series expansions, and the renormalization group techniques (see more details in the above references), but only a few models have been analyzed rigorously, especially in dimensions greater than one. The situation is much more satisfactory in the one-dimensional case, where the mathematical theory has matured and the RWRE dynamics has been understood fairly well.
The goal of this article is to give a brief introduction to the beautiful area of RWRE. The principal model to be discussed is a random walk with nearest-neighbor jumps in independent and identically distributed (i.i.d.) random environment in one dimension, although we shall also comment on some generalizations. The focus is on rigorous results; however, heuristics will be used freely to motivate the ideas and explain the approaches and proofs. In a few cases, sketches of the proofs have been included, which should help appreciate the flavor of the results and methods.
Ordinary Random Walks: A Reminder
To put our exposition in perspective, let us give a brief account of a few basic concepts and facts for ordinary random walks, that is, evolving in a nonrandom environment (see further details in Hughes (1995)). In such models, space is modeled using a suitable graph, for example, a $d$-dimensional integer lattice $\mathbb{Z}^d$, while time may be discrete or continuous. The latter distinction is not essential, and in this article we will mostly focus on the discrete-time case. The random mechanism of spatial motion is then determined by the given transition probabilities (probabilities of jumps) at each site of the graph. In the lattice case, it is usually assumed that the walk is translation invariant, so that at each step the distribution of jumps is the same, with no regard to the current location of the walk.
In one dimension ($d = 1$), the simple (nearest-neighbor) random walk may move one step to the right or to the left at a time, with some probabilities $p$ and $q = 1 - p$, respectively. An important assumption is that only the current location of the walk determines the random motion mechanism, whereas the past history is not relevant. In terms of probability theory, such a process is referred to as a "Markov chain." Thus, assuming that the walk starts at the origin, its position after $n$ steps can be represented as the sum of consecutive displacements, $X_n = Z_1 + \cdots + Z_n$, where $Z_i$ are independent random variables with the same distribution $P\{Z_i = 1\} = p$, $P\{Z_i = -1\} = q$.
The strong law of large numbers (LLN) states that almost surely (i.e., with probability 1)
$$\lim_{n \to \infty} \frac{X_n}{n} = E Z_1 = p - q, \quad P\text{-a.s.} \qquad [1]$$
where $E$ denotes expectation (mean value) with respect to $P$. This result shows that the random walk moves with the asymptotic average velocity close to $p - q$. It follows that if $p - q \ne 0$, then the process $X_n$, with probability 1, will ultimately drift to infinity (more precisely, $+\infty$ if $p - q > 0$ and $-\infty$ if $p - q < 0$). In particular, in this case, the random walk may return to the origin (and in fact visit any site on $\mathbb{Z}$) only finitely many times. Such behavior is called "transient." However, in the symmetric case (i.e., $p = q = 0.5$) the average velocity vanishes, so the above argument fails. In this case, the walk behavior appears to be more complicated, as it makes increasingly large excursions both to the right and to the left, so that $\limsup_{n \to \infty} X_n = +\infty$, $\liminf_{n \to \infty} X_n = -\infty$ ($P$-a.s.). This implies that a symmetric random walk in one dimension is "recurrent," in that it visits the origin (and indeed any site on $\mathbb{Z}$) infinitely often. Moreover, it can be shown to be "null-recurrent," which means that the expected time to return to the origin is infinite. That is to say, return to the origin is guaranteed, but it takes very long until this happens.
Fluctuations of the random walk can be characterized further via the central limit theorem (CLT), which amounts to saying that the probability distribution of $X_n$ is asymptotically normal, with mean $n(p - q)$ and variance $4npq$:
$$\lim_{n \to \infty} P\left\{ \frac{X_n - n(p - q)}{\sqrt{4npq}} \le x \right\} = \Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy \qquad [2]$$
These results can be extended to more general walks in one dimension, and also to higher dimensions. For instance, the criterion of recurrence for a general one-dimensional random walk is that it is unbiased, $E(X_1 - X_0) = 0$. In the two-dimensional case, in addition one needs $E|X_1 - X_0|^2 < \infty$. In higher dimensions, any random walk (which does not reduce to lower dimension) is transient.
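The LLN [1] and the CLT [2] for the simple random walk can be illustrated with a short simulation. This is a minimal sketch under one shortcut that is an assumption of the example, not part of the text: since $X_n$ equals the number of right steps minus the number of left steps, it is sampled as $2K - n$ with $K$ binomially distributed, instead of summing individual steps. Function names and parameter values are illustrative.

```python
import numpy as np


def sample_positions(p, n_steps, n_walks, seed=0):
    """Positions X_n of independent simple random walks started at the origin."""
    rng = np.random.default_rng(seed)
    # K = number of +1 steps ~ Binomial(n, p); then X_n = K - (n - K) = 2K - n.
    k = rng.binomial(n_steps, p, size=n_walks)
    return 2 * k - n_steps


def standardize(x, p, n_steps):
    """Centering and scaling used in the CLT [2]."""
    q = 1.0 - p
    return (x - n_steps * (p - q)) / np.sqrt(4.0 * n_steps * p * q)


if __name__ == "__main__":
    p, n = 0.6, 10_000
    x = sample_positions(p, n, n_walks=20_000)
    print("mean velocity:", x.mean() / n, "expected p - q =", 2 * p - 1)
    z = standardize(x, p, n)
    print("standardized mean and std:", z.mean(), z.std())
```

The empirical velocity should be close to $p - q$, and the standardized positions should have mean near 0 and standard deviation near 1, as [2] predicts.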
Random Walks in Random Environments 355
Random Environments and Random Walks
The definition of an RWRE involves two ingredients: (1) the environment, which is randomly chosen but remains fixed throughout the time evolution, and (2) the random walk, whose transition probabilities are determined by the environment. The set of environments (sample space) is denoted by $\Omega = \{\omega\}$, and we use $\mathbb{P}$ to denote the probability distribution on this space. For each $\omega \in \Omega$, we define the random walk in the environment $\omega$ as the (time-homogeneous) Markov chain $\{X_t, t = 0, 1, 2, \ldots\}$ on $\mathbb{Z}^d$ with certain (random) transition probabilities
$$p(x, y, \omega) = P^\omega\{X_1 = y \mid X_0 = x\} \qquad [3]$$
The probability measure $P^\omega$ that determines the distribution of the random walk in a given environment $\omega$ is referred to as the "quenched" law. We often use a subindex to indicate the initial position of the walk, so that, for example, $P_x^\omega\{X_0 = x\} = 1$. By averaging the quenched probability $P_x^\omega$ further, with respect to the environment distribution, we obtain the "annealed" measure $\mathbf{P}_x = \mathbb{P} \times P_x^\omega$, which determines the probability law of the RWRE:
$$\mathbf{P}_x(A) = \int_\Omega P_x^\omega(A)\, \mathbb{P}(d\omega) = \mathbb{E} P_x^\omega(A) \qquad [4]$$
Expectation with respect to the annealed measure $\mathbf{P}_x$ will be denoted by $\mathbf{E}_x$.
Equation [4] implies that if some property $A$ of the RWRE holds almost surely with respect to the quenched law $P_x^\omega$ for almost all environments (i.e., for all $\omega \in \Omega'$ such that $\mathbb{P}(\Omega') = 1$), then this property is also true with probability 1 under the annealed law $\mathbf{P}_x$.
Note that the random walk $X_n$ is a Markov chain only conditionally on the fixed environment (i.e., with respect to $P_x^\omega$), but the Markov property fails under the annealed measure $\mathbf{P}_x$. This is because the past history cannot be neglected, as it tells what information about the medium must be taken into account when averaging with respect to environment. That is to say, the walk learns more about the environment by taking more steps. (This idea motivates the method of "environment viewed from the particle," see related section below.)
The simplest model is the nearest-neighbor one-dimensional walk, with transition probabilities
$$p(x, y, \omega) = \begin{cases} p_x & \text{if } y = x + 1 \\ q_x & \text{if } y = x - 1 \\ 0 & \text{otherwise} \end{cases}$$
where $p_x$ and $q_x = 1 - p_x$ ($x \in \mathbb{Z}$) are random variables on the probability space $(\Omega, \mathbb{P})$. That is to say, given the environment $\omega \in \Omega$, the random walk currently at point $x \in \mathbb{Z}$ will make a one-unit step to the right, with probability $p_x$, or to the left, with probability $q_x$. Here the environment is determined by the sequence of random variables $\{p_x\}$. For most of the article, we assume that the random probabilities $\{p_x, x \in \mathbb{Z}\}$ are i.i.d., which is referred to as "i.i.d. environment." Some extensions to more general environments will be mentioned briefly in the section "Some generalizations and variations." The study of RWRE is simplified under the following natural condition called "(uniform) ellipticity:"
$$0 < \delta \le p_x \le 1 - \delta < 1, \quad x \in \mathbb{Z}, \quad \mathbb{P}\text{-a.s.} \qquad [5]$$
which will be frequently assumed in the sequel.
Transience and Recurrence
In this section, we discuss a criterion for the RWRE to be transient or recurrent. The following theorem is due to Solomon (1975).
Theorem 1 Set $\rho_x := q_x/p_x$, $x \in \mathbb{Z}$, and $\eta := \mathbb{E} \ln \rho_0$.
(i) If $\eta \ne 0$ then $X_t$ is transient ($\mathbf{P}_0$-a.s.); moreover, if $\eta < 0$ then $\lim_{t \to \infty} X_t = +\infty$, while if $\eta > 0$ then $\lim_{t \to \infty} X_t = -\infty$ ($\mathbf{P}_0$-a.s.).
(ii) If $\eta = 0$ then $X_t$ is recurrent ($\mathbf{P}_0$-a.s.); moreover,
$$\limsup_{t \to \infty} X_t = +\infty, \quad \liminf_{t \to \infty} X_t = -\infty, \quad \mathbf{P}_0\text{-a.s.}$$
Let us sketch the proof. Consider the hitting times $T_x := \min\{t \ge 0 : X_t = x\}$ and denote by $f_{xy}$ the quenched first-passage probability from $x$ to $y$:
$$f_{xy} := P_x^\omega\{1 \le T_y < \infty\}$$
Starting from 0, the first step of the walk may be either to the right or to the left, hence by the Markov property the return probability $f_{00}$ can be decomposed as
$$f_{00} = p_0 f_{10} + q_0 f_{-1,0} \qquad [6]$$
To evaluate $f_{10}$, for $n \ge 1$ set
$$u_x \equiv u_x^{(n)} := P_x^\omega\{T_0 < T_n\}, \quad 0 \le x \le n$$
which is the probability to reach 0 prior to $n$, starting from $x$. Clearly,
$$f_{10} = \lim_{n \to \infty} u_1^{(n)} \qquad [7]$$
Decomposition with respect to the first step yields the difference equation
$$u_x = p_x u_{x+1} + q_x u_{x-1}, \quad 0 < x < n \qquad [8]$$
with the boundary conditions
$$u_0 = 1, \quad u_n = 0 \qquad [9]$$
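For a fixed (quenched) environment, the boundary value problem [8]–[9] is a small linear system that can be solved numerically. The sketch below is an illustration under stated assumptions: the environment is passed as an array `p` with `p[x]` playing the role of $p_x$ for $x = 0, \ldots, n$ (the end values are unused), and a dense solver is used for simplicity even though the system is tridiagonal. The function name is hypothetical.

```python
import numpy as np


def exit_probabilities(p):
    """Solve u_x = p_x u_{x+1} + q_x u_{x-1} (0 < x < n) with u_0 = 1, u_n = 0.

    Returns the vector (u_0, ..., u_n), where u_x is the quenched probability
    of reaching 0 before n when starting from x.
    """
    p = np.asarray(p, dtype=float)
    n = len(p) - 1
    a = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    a[0, 0] = 1.0  # boundary condition u_0 = 1
    b[0] = 1.0
    a[n, n] = 1.0  # boundary condition u_n = 0
    for x in range(1, n):
        a[x, x] = 1.0
        a[x, x + 1] = -p[x]          # step to the right with probability p_x
        a[x, x - 1] = -(1.0 - p[x])  # step to the left with probability q_x
    return np.linalg.solve(a, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    env = rng.uniform(0.3, 0.7, size=21)  # an elliptic i.i.d. environment on 0..20
    u = exit_probabilities(env)
    print(u[1])  # approximation u_1^{(n)} of the first-passage probability f_10
```

In the nonrandom symmetric case $p_x \equiv 1/2$ the solution reduces to the gambler's-ruin formula $u_x = 1 - x/n$, which gives a quick sanity check. Letting $n$ grow approximates the limit in [7].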
Note that the random variables $\ln \rho_j$ are i.i.d., hence by the strong LLN

$$\lim_{x \to \infty} \frac{Y_x}{x} = \mathbb{E} \ln \rho_0 \equiv \eta, \qquad \mathbb{P}\text{-a.s.}$$

That is, the general term of the series [12] for large $x$ behaves like $\exp(x\eta)$; hence, for $\eta > 0$ the condition [12] holds true (and so $f_{10} = 1$), whereas for $\eta < 0$ it fails (and so $f_{10} < 1$).

By interchanging the roles of $p_x$ and $q_x$, we also have $f_{-1,0} < 1$ if $\eta > 0$ and $f_{-1,0} = 1$ if $\eta < 0$. From eqn [6], it then follows that in both cases $f_{00} < 1$, that is, the random walk is transient.

In the critical case, $\eta = 0$, by a general result from probability theory, $Y_x \ge 0$ for infinitely many $x$ ($\mathbb{P}$-a.s.), and so the series in eqn [12] diverges. Hence, $f_{10} = 1$ and, similarly, $f_{-1,0} = 1$, so by eqn [6] $f_{00} = 1$, that is, the random walk is recurrent.

It may be surprising that the critical parameter appears in the form $\eta = \mathbb{E} \ln \rho_0$, as it is probably more natural to expect, by analogy with the ordinary random walk, that the RWRE criterion would be based on the mean drift, $\mathbb{E}(p_0 - q_0)$. In the next section, we will see that the sign of $d$ may be misleading.

A canonical model of RWRE is specified by the assumption that the random variables $p_x$ take only two values, $\beta$ and $1 - \beta$, with probabilities

$$\mathbb{P}\{p_x = \beta\} = \alpha, \qquad \mathbb{P}\{p_x = 1 - \beta\} = 1 - \alpha \qquad [13]$$

where $0 < \alpha < 1$, $0 < \beta < 1$. Here $\eta = (2\alpha - 1) \ln(1 + (1 - 2\beta)/\beta)$, and it is easy to see that, for example, $\eta < 0$ if $\alpha < 1/2$, $\beta < 1/2$ or $\alpha > 1/2$, $\beta > 1/2$. The recurrent region where $\eta = 0$ splits into two lines, $\beta = 1/2$ and $\alpha = 1/2$. Note that the first case is degenerate and amounts to the ordinary symmetric random walk, while the second one (except where $\beta = 1/2$) corresponds to Sinai’s problem (see the section ‘‘Sinai’s localization’’). A ‘‘phase diagram’’ for this model, showing various limiting regimes as a function of the parameters $\alpha$, $\beta$, is presented in Figure 1.

Asymptotic Velocity

In the transient case the walk escapes to infinity, and it is reasonable to ask at what speed. For a nonrandom environment, $p_x \equiv p$, the answer is given by the LLN, eqn [1]. For the simple RWRE, the asymptotic velocity was obtained by Solomon (1975). Note that by Jensen’s inequality, $(\mathbb{E}\rho_0)^{-1} \le \mathbb{E}\rho_0^{-1}$.

Theorem 2 The limit $v := \lim_{t \to \infty} X_t/t$ exists ($\mathbf{P}_0$-a.s.) and is given by

$$v = \begin{cases} \dfrac{1 - \mathbb{E}\rho_0}{1 + \mathbb{E}\rho_0} & \text{if } \mathbb{E}\rho_0 < 1 \\[2ex] -\dfrac{1 - \mathbb{E}\rho_0^{-1}}{1 + \mathbb{E}\rho_0^{-1}} & \text{if } \mathbb{E}\rho_0^{-1} < 1 \\[2ex] 0 & \text{otherwise} \end{cases} \qquad [14]$$

Thus, the RWRE has a well-defined nonzero asymptotic velocity except when $(\mathbb{E}\rho_0)^{-1} \le 1 \le \mathbb{E}\rho_0^{-1}$. For instance, in the canonical example eqn [13] (see Figure 1), the criterion $\mathbb{E}\rho_0 < 1$ for the velocity $v$ to be positive amounts to the condition that both $(1 - \alpha)/\alpha$ and $(1 - \beta)/\beta$ lie on the same side of point 1.
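The regimes of the canonical two-valued model can be explored with a few lines of code. The sketch below simply evaluates $\eta = \mathbb{E}\ln\rho_0$ and formula [14] for the environment [13], with $\rho_0$ equal to $(1-\beta)/\beta$ with probability $\alpha$ and to its reciprocal with probability $1-\alpha$. The function names `eta` and `velocity` are illustrative choices, not notation from the article.

```python
import math


def eta(alpha, beta):
    """eta = E ln(rho_0) for the two-valued environment of eqn [13]."""
    return (2.0 * alpha - 1.0) * math.log((1.0 - beta) / beta)


def velocity(alpha, beta):
    """Asymptotic velocity v from eqn [14] for the environment of eqn [13]."""
    r = (1.0 - beta) / beta                       # value of rho_0 when p_0 = beta
    e_rho = alpha * r + (1.0 - alpha) / r         # E rho_0
    e_rho_inv = alpha / r + (1.0 - alpha) * r     # E rho_0^{-1}
    if e_rho < 1.0:
        return (1.0 - e_rho) / (1.0 + e_rho)
    if e_rho_inv < 1.0:
        return -(1.0 - e_rho_inv) / (1.0 + e_rho_inv)
    return 0.0


# alpha = 0.7, beta = 0.8: eta < 0 (transient to the right) but v = 0.
print(eta(0.7, 0.8), velocity(0.7, 0.8))
```

The parameter pair in the last line is one example of a walk that is transient to $+\infty$ and yet has zero asymptotic velocity, since both $\mathbb{E}\rho_0$ and $\mathbb{E}\rho_0^{-1}$ exceed 1 there.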
The key idea of the proof is to analyze the hitting times $T_n$ first, deducing results for the walk $X_t$ later. More specifically, set $\tau_i = T_i - T_{i-1}$, which is the time to hit $i$ after hitting $i - 1$ (providing that $i > X_0$). If $X_0 = 0$ and $n \ge 1$, then $T_n = \tau_1 + \cdots + \tau_n$. Note that in fixed environment $\omega$ the random variables $\{\tau_i\}$ are independent, since the quenched random walk ‘‘forgets’’ its past. Although there is no independence with respect to the annealed probability measure $\mathbf{P}_0$, one can show that, due to the i.i.d. property of the environment, the sequence $\{\tau_i\}$ is ergodic and therefore satisfies the LLN:

$$\frac{T_n}{n} = \frac{\tau_1 + \cdots + \tau_n}{n} \to \mathbf{E}_0 \tau_1, \qquad \mathbf{P}_0\text{-a.s.}$$

In turn, this implies

Furthermore, by Jensen’s inequality

$$\mathbb{E}\rho_0 = \mathbb{E}p_0^{-1} - 1 \ge (\mathbb{E}p_0)^{-1} - 1$$

so eqn [14] implies that if $\mathbb{E}\rho_0 < 1$, then

$$0 < v \le 2\,\mathbb{E}p_0 - 1 = \mathbb{E}(p_0 - q_0)$$

and the inequality is strict if $p_0$ is genuinely random (i.e., does not reduce to a constant). Hence, the asymptotic velocity $v$ is less than the mean drift $\mathbb{E}(p_0 - q_0)$, which is further evidence of slowdown. What is even more surprising is that it is possible to have $\mathbb{E}(p_0 - q_0) > 0$ but $\eta = \mathbb{E} \ln \rho_0 > 0$, so that $\mathbf{P}_0$-a.s. $X_t \to -\infty$ (although with velocity $v = 0$). Indeed, following Sznitman (2004) suppose that

$$\mathbb{P}\{p_0 = \beta\} = \alpha, \qquad \mathbb{P}\{p_0 = \gamma\} = 1 - \alpha$$
quenched mean duration of the excursion $T_{11}^L$ and observe that $w_1 = 1 + E^\omega_0 \tau_1$, where $\tau_1$ is the time to get back to 1 after stepping to 0.

As a matter of fact, this representation and eqn [19] imply that the annealed mean duration of the left excursion, $\mathbf{E}_0 T_{11}^L$, is given by

$$\mathbb{E}w_1 = \begin{cases} \dfrac{2}{1 - \mathbb{E}\rho_0} & \text{if } \mathbb{E}\rho_0 < 1 \\[2ex] \infty & \text{if } \mathbb{E}\rho_0 \ge 1 \end{cases} \qquad [21]$$

Note that in the latter case (and bearing in mind $\eta < 0$), the random walk starting from 1 will eventually drift to $+\infty$, thus making only a finite number of visits to 0, but the expected number of such visits is infinite.

In fact, our goal here is to characterize the distribution of $w_1$ under the law $\mathbb{P}$. To this end, observe that the excursion $T_{11}^L$ involves at least two steps (the first and the last ones) and, possibly, several left excursions from 0, each with mean time $w_0 = E^\omega_0 T_{00}^L$. Therefore,

$$w_1 = 2 + \sum_{j=1}^{\infty} q_0^j p_0 \,(j w_0) = 2 + \rho_0 w_0 \qquad [22]$$

By the translation invariance of the environment, the random variables $w_1$ and $w_0$ have the same distribution. Furthermore, similarly to recursion [22], we have $w_0 = 2 + \rho_{-1} w_{-1}$. This implies that $w_0$ is a function of $p_x$ with $x \le -1$ only, and hence $w_0$ and $\rho_0$ are independent random variables. Introducing the Laplace transform $\varphi(s) = \mathbb{E} \exp(-s w_1)$ and conditioning on $\rho_0$, from eqn [22] we get the equation

$$\varphi(s) = e^{-2s}\, \mathbb{E}\varphi(s\rho_0) \qquad [23]$$

If

$$1 - \varphi(s) \sim a s^{\kappa}, \qquad s \to 0$$

then eqn [23] amounts to

$$1 - a s^{\kappa} + \cdots = (1 - 2s + \cdots)(1 - a s^{\kappa}\, \mathbb{E}\rho_0^{\kappa} + \cdots)$$

Expanding the product on the right, one can see that a solution with $\kappa = 1$ is possible only if $\mathbb{E}\rho_0 < 1$, in which case

$$a = \mathbb{E}w_1 = \frac{2}{1 - \mathbb{E}\rho_0}$$

We have already obtained this result in eqn [21]. The case $\kappa < 1$ is possible if $\mathbb{E}\rho_0^{\kappa} = 1$, which is

Although the above considerations point to the critical parameter $\kappa$, eqn [20], which may be expected to determine the slowdown scale, they provide little explanation of a mechanism of the slowdown phenomenon. Heuristically, it is natural to attribute the slowdown effects to the presence of ‘‘traps’’ in the environment, which may be thought of as regions that are easy to enter but hard to leave. In the one-dimensional case, such a trap would occur, for example, between two long series of successive sites where the probabilities $p_x$ are fairly large (on the left) and small (on the right).

Remarkably, traps can be characterized quantitatively with regard to the properties of the random environment, by linking them to certain large-deviation effects (see Sznitman (2002, 2004)). The key role in this analysis is played by the function $F(u) := \ln \mathbb{E}\rho_0^{u}$, $u \in \mathbb{R}$. Suppose that $\eta = \mathbb{E} \ln \rho_0 < 0$ (so that by Theorem 1 the RWRE tends to $+\infty$, $\mathbf{P}_0$-a.s.) and also that $\mathbb{E}\rho_0 > 1$ and $\mathbb{E}\rho_0^{-1} > 1$ (so that by Theorem 2, $v = 0$). The latter means that $F(1) > 0$ and $F(-1) > 0$, and since $F$ is a smooth strictly convex function and $F(0) = 0$, it follows that there is the second root $0 < \kappa < 1$, so that $F(\kappa) = 0$, that is, $\mathbb{E}\rho_0^{\kappa} = 1$ (cf. eqn [20]).

Let us estimate the probability to have a trap in $U = [-L, L]$ where the RWRE will spend anomalously long time. Using eqn [11], observe that

$$P^\omega_1\{T_0 < T_{L+1}\} \ge 1 - \exp\{-L S_L\}$$

where $S_L := L^{-1} \sum_{x=1}^{L} \ln \rho_x \to \eta < 0$ as $L \to \infty$. However, due to large deviations $S_L$ may exceed level $\gamma > 0$ with probability of order $\exp\{-L\, I(\gamma)\}$, where $I(x) := \sup_u \{ux - F(u)\}$ is the Legendre transform of $F$. We can optimize this estimate by assuming that $\gamma L \ge \ln n$ and minimizing the ratio $I(\gamma)/\gamma$. Note that $F(u)$ can be expressed via the inverse Legendre transform, $F(u) = \sup_x \{xu - I(x)\}$, and it is easy to see that if $\kappa := \min_{\gamma > 0} I(\gamma)/\gamma$, then $F(\kappa) = 0$, so $\kappa$ is the second (positive) root of $F$.

The ‘‘left’’ probability $P^\omega_{-1}\{T_0 < T_{-L-1}\}$ is estimated in a similar fashion, and one can deduce that for some constants $K > 0$, $c > 0$, and any $\kappa' > \kappa$, for large $n$

$$\mathbb{P}\Big\{ P^\omega_0 \Big\{ \max_{k \le n} |X_k| \le K \ln n \Big\} \ge c \Big\} \ge n^{-\kappa'}$$
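To make the definition of the critical exponent concrete, the second root of $F(u) = \ln \mathbb{E}\rho_0^{u}$ can be located numerically. The sketch below does this by bisection for the two-valued environment of eqn [13], under the assumption $\eta < 0$. It is an illustration only; the names `log_moment` and `kappa` are hypothetical, and the bisection relies on the convexity of $F$ noted above ($F < 0$ between the roots 0 and $\kappa$, $F > 0$ beyond $\kappa$).

```python
import math


def log_moment(u, alpha, beta):
    """F(u) = ln E rho_0^u for the two-valued environment of eqn [13]."""
    r = (1.0 - beta) / beta
    return math.log(alpha * r ** u + (1.0 - alpha) * r ** (-u))


def kappa(alpha, beta, tol=1e-12):
    """Positive root of F found by bisection; requires eta = E ln rho_0 < 0."""
    eta = (2.0 * alpha - 1.0) * math.log((1.0 - beta) / beta)
    if eta >= 0.0:
        raise ValueError("a positive root is only sought here when eta < 0")
    hi = 1.0
    while log_moment(hi, alpha, beta) <= 0.0:   # move right until F > 0
        hi *= 2.0
    lo = hi
    while log_moment(lo, alpha, beta) >= 0.0:   # move left until F < 0
        lo /= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_moment(mid, alpha, beta) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(kappa(0.7, 0.8))
```

For this particular environment the equation $\mathbb{E}\rho_0^{\kappa} = 1$ is a quadratic in $r^{\kappa}$ with $r = (1-\beta)/\beta$, which gives $\kappa = \ln((1-\alpha)/\alpha)/\ln r$ and provides a check on the numerical root.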
prevent the RWRE from moving at distance $n^{\kappa'}$ from the origin before time $n$. In particular, it follows that $\lim_{n \to \infty} X_n/n^{\kappa'} = 0$ for any $\kappa' > \kappa$, so recalling that $0 < \kappa < 1$, we have indeed a sublinear growth of $X_n$. This result is more informative as compared to Theorem 2 (the case $v = 0$), and it clarifies the role of traps (see more details in Sznitman (2004)). The nontrivial behavior of the RWRE on the precise growth scale, $n^{\kappa}$, is characterized in the next section.

Limit Distributions

Considerations in the previous section suggest that the exponent $\kappa$, defined as the solution of eqn [20], characterizes environments in terms of duration of left excursions. These heuristic arguments are confirmed by a limit theorem by Kesten et al. (1975), which specifies the slowdown scale. We state here the most striking part of their result. Denote $\ln^+ u := \max\{\ln u, 0\}$; by an arithmetic distribution one means a probability law on $\mathbb{R}$ concentrated on the set of points of the form $0, \pm c, \pm 2c, \ldots$.

Theorem 3 Assume that $-\infty \le \eta = \mathbb{E} \ln \rho_0 < 0$ and the distribution of $\ln \rho_0$ is nonarithmetic (excluding a possible atom at $-\infty$). Suppose that the root $\kappa$ of eqn [20] is such that $0 < \kappa < 1$ and $\mathbb{E}\rho_0^{\kappa} \ln^+ \rho_0 < \infty$. Then

$$\lim_{n \to \infty} \mathbf{P}_0\{n^{-1/\kappa}\, T_n \le t\} = L_\kappa(t)$$

$$\lim_{t \to \infty} \mathbf{P}_0\{t^{-\kappa} X_t \le x\} = 1 - L_\kappa(x^{-1/\kappa})$$

where $L_\kappa(\cdot)$ is the distribution function of a stable law with index $\kappa$, concentrated on $[0, \infty)$.

General information on stable laws can be found in many probability books; we only mention here that the Laplace transform of a stable distribution on $[0, \infty)$ with index $\kappa$ has the form $\varphi(s) = \exp\{-C s^{\kappa}\}$.

Kesten et al. (1975) also consider the case $\kappa \ge 1$. Note that for $\kappa > 1$, we have $\mathbb{E}\rho_0 < (\mathbb{E}\rho_0^{\kappa})^{1/\kappa} = 1$, so $v > 0$ by eqn [14]. For example, if $\kappa > 2$ then, as expected (see the previous section), there exists a nonrandom $\sigma^2 > 0$ such that

$$\lim_{n \to \infty} \mathbf{P}_0\left\{\frac{T_n - n/v}{\sigma \sqrt{n}} \le t\right\} = \Phi(t)$$

$$\lim_{t \to \infty} \mathbf{P}_0\left\{\frac{X_t - tv}{v^{3/2} \sigma \sqrt{t}} \le x\right\} = \Phi(x)$$

Let us describe an elegant idea of the proof based on a suitable renewal structure. (1) Let $U_i^n$ ($i \le n$) be the number of left excursions starting from $i$ up to time $T_n$, and note that $T_n = n + 2 \sum_i U_i^n$. Since the walk is transient to $+\infty$, the sum $\sum_{i \le 0} U_i^n$ is finite ($\mathbf{P}_0$-a.s.) and so does not affect the limit. (2) Observe that if the environment $\omega$ is fixed then the conditional distribution of $U_j^n$, given $U_{j+1}^n, \ldots, U_n^n = 0$, is the same as the distribution of the sum of $1 + U_{j+1}^n$ i.i.d. random variables $V_1, V_2, \ldots$, each with geometric distribution $P^\omega_0\{V_i = k\} = p_j q_j^k$ ($k = 0, 1, 2, \ldots$). Therefore, the sum $\sum_{i=1}^{n} U_i^n$ (read from right to left) can be represented as $\sum_{t=0}^{n-1} Z_t$, where $Z_0 = 0, Z_1, Z_2, \ldots$ is a branching process (in random environment $\{p_j\}$) with one immigrant at each step and the geometric offspring distribution with parameter $p_j$ for each particle present at time $j$. (3) Consider the successive ‘‘regeneration’’ times $\tau^*_k$, at which the process $Z_t$ vanishes. The partial sums $W_k := \sum_{\tau^*_k \le t < \tau^*_{k+1}} Z_t$ form an i.i.d. sequence, and the proof amounts to showing that the sum of $W_k$ has a stable limit of index $\kappa$. (4) Finally, the distribution of $W_0$ can be approximated using $M_0 := \sum_{t=1}^{\infty} \prod_{j=0}^{t-1} \rho_j$ (cf. eqn [11]), which is the quenched mean number of total progeny of the immigrant at time $t = 0$. Using Kesten’s renewal theorem, it can be checked that $\mathbb{P}\{M_0 > x\} \sim K x^{-\kappa}$ as $x \to \infty$, so $M_0$ is in the domain of attraction of a stable law with index $\kappa$, and the result follows.

Let us emphasize the significance of the regeneration times $\tau^*_i$. Returning to the original random walk, one can see that these are times at which the RWRE hits a new ‘‘record’’ on its way to $+\infty$, never to backtrack again. The same idea plays a crucial role in the analysis of the RWRE in higher dimensions (see the subsections ‘‘Zero–one laws and LLNs’’ and ‘‘Kalikow’s condition and Sznitman’s condition (T′)’’).

Finally, note that the condition $-\infty \le \eta < 0$ allows $\mathbb{P}\{p_0 = 1\} > 0$, so the distribution of $\rho_0$ may have an atom at 0 (and hence $\ln \rho_0$ at $-\infty$). In view of eqn [20], no atom is possible at $+\infty$. The restriction for the distribution of $\ln \rho_0$ to be nonarithmetic is important. This will be illustrated in the section ‘‘Diode model,’’ where we discuss the model of random diodes.

Sinai’s Localization

The results discussed in the previous section indicate that the less transient the RWRE is (i.e., the critical exponent decreasing to zero), the slower it moves. Sinai (1982) proved a remarkable theorem showing that for the recurrent RWRE (i.e., with $\eta = \mathbb{E} \ln \rho_0 = 0$), the slowdown effect is exhibited in a striking way.
Theorem 4 Suppose that the environment $\{p_x\}$ is i.i.d. and elliptic, eqn [5], and assume that $\mathbb{E} \ln \rho_0 = 0$, with $\mathbb{P}\{\rho_0 = 1\} < 1$. Denote $\sigma^2 := \mathbb{E} \ln^2 \rho_0$, $0 < \sigma^2 < \infty$. Then there exists a function $W_n = W_n(\omega)$ of the random environment such that for any $\varepsilon > 0$

$$\lim_{n \to \infty} \mathbf{P}_0\left\{\left|\frac{\sigma^2 X_n}{\ln^2 n} - W_n\right| > \varepsilon\right\} = 0 \qquad [24]$$

Moreover, $W_n$ has a limit distribution:

$$\lim_{n \to \infty} \mathbb{P}\{W_n \le x\} = G(x) \qquad [25]$$

and thus also the distribution of $\sigma^2 X_n / \ln^2 n$ under $\mathbf{P}_0$ converges to the same distribution $G(x)$.

Sinai’s theorem shows that in the recurrent case, the RWRE considered on the spatial scale $\ln^2 n$ becomes localized near some random point (depending on the environment only). This phenomenon, frequently referred to as ‘‘Sinai’s localization,’’ indicates an extremely strong slowdown of the motion as compared with the ordinary diffusive behavior.

Following Révész (1990), let us explain heuristically why $X_n$ is measured on the scale $\ln^2 n$. Rewrite eqn [11] as

$$P^\omega_1\{T_n < T_0\} = \left(1 + \sum_{x=1}^{n-1} \exp(Y_x)\right)^{-1} \qquad [26]$$

where $Y_x$ is defined in eqn [12]. By the CLT, the typical size of $|Y_x|$ for large $x$ is of order of $\sqrt{x}$, and so eqn [26] yields

$$P^\omega_1\{T_n < T_0\} \approx \exp\{-\sqrt{n}\}$$

This suggests that the walk started at site 1 will make about $\exp\{\sqrt{n}\}$ visits to the origin before reaching level $n$. Therefore, the first passage to site $n$ takes at least time of order $\exp\{\sqrt{n}\}$. In other words, one may expect that a typical displacement after $n$ steps will be of order of $\ln^2 n$ (cf. eqn [24]). This argument also indicates, in the spirit of the trapping mechanism of slowdown discussed at the end of the section ‘‘Critical exponent, excursions, and traps,’’ that there is typically a trap of size of order $\ln^2 n$, which retains the RWRE until time $n$.

It has been shown (independently by H Kesten and A O Golosov) that the limit in [25] coincides with the distribution of a certain functional of the standard Brownian motion, with the density function

$$G'(x) = \frac{2}{\pi} \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1} \exp\left\{-\frac{(2k + 1)^2 \pi^2}{8}\, |x|\right\}$$

Environment Viewed from the Particle

This important technique, dating back to Kozlov and Molchanov (1984), has proved to be quite efficient in the study of random motions in random media. The basic idea is to focus on the evolution of the environment viewed from the current position of the walk.

Let $\theta$ be the shift operator acting on the space of environments $\Omega = \{\omega\}$ as follows:

$$\omega = \{p_x\} \mapsto \theta\omega = \{p_{x+1}\}$$

Consider the process

$$\omega_n := \theta^{X_n} \omega, \qquad \omega_0 = \omega$$

which describes the state of the environment from the point of view of an observer moving along with the random walk $X_n$. One can show that $\omega_n$ is a Markov chain (with respect to both $P^\omega_0$ and $\mathbf{P}_0$), with the transition kernel

$$T(\omega, d\omega') = p_0\, \delta_{\theta\omega}(d\omega') + q_0\, \delta_{\theta^{-1}\omega}(d\omega') \qquad [27]$$

and the respective initial law $\delta_\omega$ or $\mathbb{P}$ (here $\delta_\omega$ is the Dirac measure, i.e., unit mass at $\omega$).

This fact as it stands may not seem to be of any practical use, since the state space of this Markov chain is very complex. However, the great advantage is that one can find an explicit invariant probability $Q$ for the kernel $T$ (i.e., such that $QT = Q$), which is absolutely continuous with respect to $\mathbb{P}$.

More specifically, assume that $\mathbb{E}\rho_0 < 1$ and set $Q = f(\omega)\mathbb{P}$, where (cf. eqn [14])

$$f = v\,(1 + \rho_0) \sum_{x=0}^{\infty} \prod_{j=1}^{x} \rho_j, \qquad v = \frac{1 - \mathbb{E}\rho_0}{1 + \mathbb{E}\rho_0} \qquad [28]$$

Using independence of $\{\rho_x\}$, we note

$$\int_\Omega Q(d\omega) = \mathbb{E}f = (1 - \mathbb{E}\rho_0) \sum_{x=0}^{\infty} (\mathbb{E}\rho_0)^x = 1$$

hence $Q$ is a probability measure on $\Omega$. Furthermore, for any bounded measurable function $g$ on $\Omega$ we have

$$QTg = \int_\Omega Tg(\omega)\, Q(d\omega) = \mathbb{E}\, f\, Tg = \mathbb{E}\, f \left[ p_0\, (g \circ \theta) + q_0\, (g \circ \theta^{-1}) \right] = \mathbb{E}\, g \left[ (p_0 f) \circ \theta^{-1} + (q_0 f) \circ \theta \right] \qquad [29]$$
The last integral is easily evaluated to yield As a result, we have the representation
1 Y
X x
Eðp0 q0 Þf ¼ vE j ð1 0 Þ Xn nv ¼ HðXn ; n; !Þ þ hðXn ; !Þ ½31
x¼0 j¼1 For a fixed !, one can apply a suitable CLT for
X
1
martingale differences to the martingale term in eqn
¼ vð1 E0 Þ ðE0 Þx ¼ v
[31], while using that Xn
nv (P 0 -a.s.), the second
P
x¼0
term in eqn [31] is approximated by the sum nv k=0
and the first part of the formula [14] follows. (k, !), which can be handled via a CLT for stationary
The case E0 1 can be handled using a sequences. This way, we arrive at the following result.
comparison argument (Sznitman 2004). Observe
that if px p~x for all x then for the corresponding Theorem 5 Suppose that the environment is
random walks we have Xt X ~ t (P! - a.s.). We now elliptic, eqn [5], and such that E2þ"
0 < 1 for some
0
define a suitable dominating random medium by " > 0 (which implies that E0 < 1 and hence v > 0).
setting (for
> 0) Then there exists a nonrandom 2 > 0 such that
px
Xn nv
~x :¼
p þ px lim P0 pffiffiffiffiffiffiffiffi x ¼ ðxÞ
1þ
1þ
n!1 n2
Note that this theorem is parallel to the result by Kesten et al. (1975) on asymptotic normality when $\kappa > 2$ (see the section ‘‘Limit distributions’’). The moment assumptions in Theorem 5 are more restrictive, but they can be relaxed. On the other hand, Theorem 5 does not impose the nonarithmetic condition on the distribution of the environment (cf. Theorem 3). More importantly, the environment method proves to be quite efficient in more general situations, including non-i.i.d. environments and higher dimensions (at least in some cases, e.g., for random bonds RWRE and balanced RWRE discussed subsequently).

Diode Model

In the preceding sections (except in the section ‘‘Limit distributions,’’ where however we were limited to a nonarithmetic case), we assumed that $0 < p_x < 1$ and therefore excluded the situation where there are sites through which motion is permitted in one direction only. Allowing for such a possibility leads to the ‘‘diode model’’ (Solomon 1975). Specifically, suppose that

$$\mathbb{P}\{p_x = \beta\} = \alpha, \qquad \mathbb{P}\{p_x = 1\} = 1 - \alpha \qquad [32]$$

with $0 < \alpha < 1$, $0 < \beta < 1$, so that with probability $\alpha$ a point $x \in \mathbb{Z}$ is a usual two-way site and with probability $1 - \alpha$ it is a repelling barrier (‘‘diode’’), through which passage is only possible from left to right. This is an interesting example of statistically inhomogeneous medium, where the particle motion is strongly irreversible due to the presence of special semipenetrable nodes. The principal mathematical advantage of such a model is that the random walk can be decomposed into independent excursions from one diode to the next.

Due to diodes, the RWRE will eventually drift to $+\infty$. If $\beta > 1/2$, then on average it moves faster than in a nonrandom environment with $p_x \equiv \beta$. The situation where $\beta \le 1/2$ is potentially more interesting, as then there is a competition between the local drift of the walk to the left (in ordinary sites) and the presence of repelling diodes on its way. Note that $\mathbb{E}\rho_0 = \alpha\rho$, where $\rho := (1 - \beta)/\beta$, so the condition $\mathbb{E}\rho_0 < 1$ amounts to $\beta > \alpha/(1 + \alpha)$. In this case (which includes $\beta > 1/2$), formula [14] for the asymptotic velocity applies.

As explained in the section ‘‘Critical exponent, excursions, and traps,’’ the quenched mean duration $w$ of the left excursion has Laplace transform given by eqn [23], which now reads

$$\varphi(s) = e^{-2s}\{1 - \alpha + \alpha\,\varphi(\rho s)\}$$

This equation is easily solved by iterations:

$$\varphi(s) = (1 - \alpha) \sum_{k=0}^{\infty} \alpha^k e^{-s t_k}, \qquad t_k := 2 \sum_{j=0}^{k} \rho^j \qquad [33]$$

hence the distribution of $w$ is given by

$$\mathbb{P}\{w = t_k\} = (1 - \alpha)\,\alpha^k, \qquad k = 0, 1, \ldots$$

This result has a transparent probabilistic meaning. In fact, the factor $(1 - \alpha)\alpha^k$ is the probability that the nearest diode on the left of the starting point occurs at distance $k + 1$, whereas $t_k$ is the corresponding mean excursion time. Note that formula [33] for $t_k$ easily follows from the recursion $t_k = 2 + \rho\, t_{k-1}$ (cf. eqn [22]) with the boundary condition $t_0 = 2$.

A self-similar hierarchy of timescales [33] indicates that the process will exhibit temporal oscillations. Indeed, for $\alpha\rho > 1$ the average waiting time until passing through a valley of ordinary sites of length $k$ is asymptotically proportional to $t_k \sim 2\rho^k$, so one may expect the annealed mean displacement $\mathbf{E}_0 X_n$ to have a local minimum at $n \approx t_k$. Passing to logarithms, we note that $\ln t_{k+1} - \ln t_k \sim \ln \rho$, which suggests the occurrence of persistent oscillations on the logarithmic timescale, with period $\ln \rho$ (see Figure 2). This was confirmed by Bernasconi and Schneider (1985) who showed that for $\alpha\rho > 1$

$$\mathbf{E}_0 X_n \sim n^{\kappa} F(\ln n), \qquad n \to \infty \qquad [34]$$

where $\kappa = -\ln \alpha / \ln \rho < 1$ is the solution of eqn [20] and the function $F$ is periodic with period $\ln \rho$ (see Figure 2).

In contrast, for $\alpha\rho = 1$ one has

$$\mathbf{E}_0 X_n \sim \frac{n \ln \rho}{2 \ln n}, \qquad n \to \infty$$

and there are no oscillations of the above kind.

These results illuminate the earlier analysis of the diode model by Solomon (1975), which in the main has revealed the following. If $\alpha\rho = 1$, then $X_n$ satisfies the strong LLN:

$$\lim_{n \to \infty} \frac{X_n}{n / \ln n} = \frac{\ln \rho}{2}, \qquad \mathbf{P}_0\text{-a.s.}$$

while in the case $\alpha\rho > 1$ the asymptotic behavior of $X_n$ is quite complicated and unusual: if $n_i \to \infty$ is a sequence of integers such that $\{\ln n_i / \ln \rho\} \to \gamma$ (here $\{a\} = a - [a]$ denotes the fractional part of $a$), then the distribution of $n_i^{-\kappa} X_{n_i}$ under $\mathbf{P}_0$ converges to a nondegenerate distribution which depends on $\gamma$. Thus, the very existence of the limiting distribution
Figure 2 Temporal oscillations for the diode model, eqn [32], plotted as $\ln(n^{-1/2}\, \mathbf{E}_0 X_n)$ against $\ln n$. Here $\alpha = 0.3$ and $\rho = 1/0.09$, so that $\alpha\rho > 1$ and $\kappa = 1/2$. The dots represent an average of Monte Carlo simulations over 10 000 samples of the environment with a random walk of 200 000 steps in each realization. The broken curve refers to the exact asymptotic solution [34]. The arrows indicate the simulated locations of the minima $t_k$, the asymptotic spacing of which is predicted to be $\ln \rho \approx 2.41$. Reproduced from Bernasconi J and Schneider WR (1982). Diffusion on a one-dimensional lattice with random asymmetric transition rates. Journal of Physics A: Mathematical and General 15: L729–L734, by permission of IOP Publishing Ltd.
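The hierarchy of timescales in the diode model is easy to reproduce numerically. The following sketch, which is an illustration rather than anything taken from the article, builds the mean excursion times $t_k$ from the recursion $t_k = 2 + \rho\, t_{k-1}$, $t_0 = 2$, sums the series $\sum_k (1-\alpha)\alpha^k t_k$ for the mean of $w$ (truncated at a hypothetical cutoff `k_max`, which is adequate only when $\alpha\rho$ is well below 1), and evaluates the exponent $\kappa = -\ln\alpha/\ln\rho$. All function names are assumptions made for this example.

```python
import math


def excursion_times(rho, k_max):
    """t_k = 2 * sum_{j=0}^{k} rho**j for k = 0..k_max, via t_k = 2 + rho * t_{k-1}."""
    t = [2.0]
    for _ in range(k_max):
        t.append(2.0 + rho * t[-1])
    return t


def mean_excursion(alpha, rho, k_max=200):
    """Truncated series for E w = sum_k (1 - alpha) * alpha**k * t_k (needs alpha*rho < 1)."""
    t = excursion_times(rho, k_max)
    return sum((1.0 - alpha) * alpha ** k * tk for k, tk in enumerate(t))


def diode_kappa(alpha, rho):
    """Exponent kappa = -ln(alpha) / ln(rho), i.e. the solution of alpha * rho**kappa = 1."""
    return -math.log(alpha) / math.log(rho)


print(excursion_times(2.0, 3))        # first few timescales for rho = 2
print(mean_excursion(0.5, 1.5))       # compare with 2 / (1 - alpha * rho)
print(diode_kappa(0.3, 1 / 0.09))     # parameters quoted in the caption of Figure 2
```

When $\alpha\rho < 1$ the series sums to $2/(1 - \alpha\rho)$, in agreement with eqn [21] since $\mathbb{E}\rho_0 = \alpha\rho$; for the parameters of Figure 2 the exponent evaluates to $1/2$.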
of $X_n$ and the limit itself heavily depend on the subsequence $n_i$ chosen to approach infinity.

This should be compared with a more ‘‘regular’’ result, Theorem 3. Note that almost all the conditions of this theorem are satisfied in the diode model, except that here the distribution of $\ln \rho_0$ is arithmetic (recall that the value $\ln \rho_0 = -\infty$ is permissible), so it is the discreteness of the environment distribution that does not provide enough ‘‘mixing’’ and hence leads to such peculiar features of the asymptotics.

Some Generalizations and Variations

Most of the results discussed above in the simplest context of RWRE with nearest-neighbor jumps in an i.i.d. random environment have been extended to some other cases. One natural generalization is to relax the i.i.d. assumption, for example, by considering stationary ergodic environments (see details

conditions in order to ensure enough decoupling (e.g., in Sinai’s problem). The method of environment viewed from the particle (discussed earlier) is also suited very well to dealing with stationarity.

In the remainder of this section, we describe some other generalizations including RWRE with bounded jumps, RWRE where randomness is attached to bonds rather than sites, and continuous-time (symmetric) RWRE driven by the randomized master equation.

RWRE with Bounded Jumps

The previous discussion was restricted to the case of RWRE with nearest-neighbor jumps. A natural extension is RWRE with bounded jumps. Let $L$, $R$ be fixed natural numbers, and suppose that from each site $x \in \mathbb{Z}$ jumps are only possible to the sites $x + i$, $i = -L, \ldots, R$, with (random) probabilities
only be accessed implicitly, which makes the For orientation, note that if pn (i) = p(i) are
analysis rather hard. nonrandom constants, then
1 = ln 1 , where 1 > 0
To explain how random matrices arise here, let us first is the largest eigenvalue of M0 , and so
1 < 0 if and
consider a particular case R = 1, L 1. Assume that only if 1 < 1. The latter means that the character-
px (L), px (1) > 0 for all x 2 Z (ellipticity condi- istic polynomial ’() := det (M0 I) satisfies the
tion, cf. eqn [5]), and consider the hitting probabilities condition (1)L ’(1) > 0. To evaluate det (M0 I),
un := P!n {T0 < 1}, where T0 := min {t 0 : Xt 0} replace the first column by the sum of all columns
(cf. the section ‘‘Transience and recurrence’’). By and expand to get ’(1) = (1)L1 (b1 þ þ bL ).
decomposing with respect to the first step, for n 1 Substituting expressions [38] it is easy toPsee that
we obtain the difference equation the above condition amounts to p(1) Li= 1 ip
(i) > 0, that is, the mean drift of the random
X
L
un ¼ pn ð1Þunþ1 þ pn ðiÞuni ½36 walk is positive and hence Xn ! þ1 a.s.
i¼0 In the general case, L 1, R 1, similar con-
siderations lead to the following matrices of order
with the boundary conditions
P u0 = = u Lþ1 = 1. d := L þ R 1 (cf. eqn [39]):
Using that 1 = pn (1) þ Li= 0 pn (i), we can rewrite 0 1
eqn [36] as an ðR 1Þ an ð1Þ bn ð1Þ bn ðLÞ
B C
X
L B 1 0 0 C
B C
pn ð1Þðun unþ1 Þ ¼ pn ðiÞðuni un Þ B C
B C
i¼1 B 0 1 0 0 C
Mn ¼ B B .. .. .. .. ..
C
.. C
or, equivalently, B . . . . . . C
B C
B . . . . . . C
X
L B .. .. .. .. .. .. C
vn ¼ bn ðiÞvni ½37 @ A
i¼1 0 0 1 0
where vi := ui uiþ1 and where bn (i) are given by eqn [38] and
pn ðiÞ þ þ pn ðLÞ pn ðiÞ þ þ pn ðRÞ
bn ðiÞ:¼ ½38 an ðiÞ :¼
pn ð1Þ pn ðRÞ
Recursion [37] can be written in a matrix form, Suppose that the ellipticity condition is satisfied in
Vn = Mn Vn 1 , where Vn := (vn , . . . , vn Lþ1 )> , the form pn (i) > 0, i 6¼ 0, L i R, and let
0 1
1
2
d be the (nonrandom) Lyapunov
bn ð1Þ bn ðLÞ
B 1 ... 0 0 C exponents of {Mn }. The largest exponent
1 is again
Mn :¼B @ ... .. .. .. C ½39 given by eqn [40], while other exponents are
. . . A determined recursively from the equalities
0 1 0
1 þ þ
k ¼ lim n1 ln k^k ðMn M1 Þk
and by iterations we get (cf. eqn [10]) n!1
who studied a more general RWRE on a strip section, we obtain that limn ! 1 Xn =n exists
Z {0, 1, . . . , m 1}. The link between these two (P!0 -a.s.) and is given by
models is given by the representation Xn = mYn þ Zn , Z
where m := max {L, R}, Yn 2 Z, Zn 2 {0, . . . , m 1}. dð0; !Þ Qðd!Þ ¼ Z1 E c01 c1;0 ¼ 0
Random matrices arising here are constructed in-
conductivity of the finite system, cN , is defined as which in the limit h ! 0 yields the master equation
the average conductance per bond, so that (or Chapman–Kolmogorov’s forward equation)
d X
1X N
p0x ðtÞ ¼ cyx p0y ðtÞ cxy p0x ðtÞ
c1
N ¼ c1 dt
N x¼0 x;xþ1 y 6¼ x ½45
and by the strong LLN, cN1 ! Ec1 p0x ð0Þ ¼ 0 ðxÞ
01 as N ! 1 (P-a.s.).
Therefore, the effective conductivity of the infinite where 0 (x) is the Kronecker symbol.
1 1
system is given by c = (Ec01 ) , and we note that Continuous-time RWRE are therefore naturally
c < Ec01 if the random medium is nondegenerate. described via the randomized master equation, that
Returning to the random bonds RWRE, eqn [41], is, with random transition rates. The canonical
it is easy to see that a site j is recurrent if and only if example, originally motivated by Dyson’s study of
the conductance cj, 1 between x and 1 equals zero. the chain of harmonic oscillators with random
Using again Ohm’s law, we have (cf. eqn [42]) couplings, is a symmetric nearest-neighbor RWRE,
X
1 where the random transition rates cxy are nonzero
c1
j; þ1 ¼ c1
x;xþ1 ¼ 0; P-a.s. only for y = x 1 and satisfy the condition
x¼j cx, xþ1 = cxþ1, x , otherwise being i.i.d. (see Alexander
and we recover the result about recurrence. et al. (1981)). In this case, the problem [45] can be
formally solved using the Laplace transform, leading
to the equations
Continuous-Time RWRE
1
As in the discrete-time case, a random walk on Z with s þ Gþ ^
0 þ G0 ¼ ½p0 ðsÞ ½46
continuous time is a homogeneous Markov chain
Xt , t 2 [0, 1), with state space Z and nearest-neighbor s þ G þ
x þ Gx ¼ 0 ðx 6¼ 0Þ ½47
(or at least bounded) jumps. The term ‘‘Markov’’ as
where Gx , Gþ
x are defined as
usual refers to the ‘‘lack of memory’’ property, which
amounts to saying that from the entire history of the ^0x ðsÞ p
p ^0;x1 ðsÞ
process development up to a given time, only the G
x :¼ cx;x1 ½48
^0x ðsÞ
p
current position of the walk is important for the future R
evolution while all other information is irrelevant. and p^0x (s) := 01 p0x (t) e st dt. From eqns [47] and
Since there is no smallest time unit as in the discrete- [48] one obtains the recursion
time case, it is convenient to describe transitions of Xt
1
1 1
in terms of transition rates characterizing the G
x ¼ þ
cx;x1 s þ Gþ ½49
likelihood of various jumps during a very short time. x1
although an explicit solution is not available, one For the symmetric nearest-neighbor RWRE con-
can obtain the asymptotics of small values of s, sidered above, the transition probabilities of the
thereby rendering information about the behavior of imbedded random walk are given by
p00 (t) for large t. More specifically, one can show cx;xþ1
1 1
that if c := (Ec01 ) > 0, then px :¼ px;xþ1 ¼
cx1;x þ cx;xþ1
^00 ðsÞ
ð4c sÞ1=2 ;
Ep s!0 qx :¼ px;x1 ¼ 1 px
and so by a Tauberian theorem and we recognize here the transition law of a
random walk in the random bonds environment
Ep00 ðtÞ
ð4c tÞ1=2 ; t!1 ½50 considered in the previous subsection (cf. eqn [41]).
Note that asymptotics [50] appears to be the same Recurrence and zero asymptotic velocity established
as for an ordinary symmetric random walk with there are consistent with the results discussed in the
constant transition rates cx, xþ1 = cxþ1, x = c , suggest- present section (e.g., note that the CLT for both Xn ,
ing that the latter provides an EMA for the RWRE eqn [43], and Xt , eqn [51], does not involve any
considered above. centering). Let us point out, however, that a ‘‘naive’’
This is further confirmed by the asymptotic discretization of time using the mean sojourn time
calculation of the annealed mean square displace- appears to be incorrect, as this would lead to the
ment, E0 X2t
2c t as t ! 1 (Alexander et al. 1981). scaling t = n1 with 1 := E(c 1, 0 þ c01 )1 , while
Moreover, Kawazu and Kesten (1984) proved that from comparing the limit theorems in these two
Xt is asymptotically normal: cases, one can conclude that the true value of the
effective discretization step is given by
Xt := (2c ) 1 = (1=2)Ec1
01 . In fact, by the arith-
lim P 0 pffiffiffiffiffiffiffiffiffi x ¼ ðxÞ ½51
t!1 2c t metic–harmonic mean inequality we have > 1 ,
which is a manifestation of the RWRE’s diffusive
Therefore, if c > 0, then the RWRE has the same slowdown.
diffusive behavior as the corresponding ordered
system, with a well-defined diffusion constant
D = c .
1
In the case where c = 0 (i.e., Ec01 = 1), one may RWRE in Higher Dimensions
expect that the RWRE exhibits subdiffusive beha- Multidimensional RWRE with nearest-neighbor
vior. For example, if the density function of the jumps are defined in a similar fashion: from site
transition rates is modeled by x 2 Zd the random walk can jump to one of the 2d
d
adjacent sites x þ e 2 ZP (such that jej = 1), with
fa ðuÞ ¼ ð1 Þ u 1f0<u<1g ð0 < < 1Þ
probabilities px (e) 0, jej = 1 px (e) = 1, where the
then, as shown by Alexander et al. (1981), random vectors px () are assumed to be i.i.d. for
different x 2 Zd . As usual, we will also impose the
Ep00 ðtÞ
C tð1Þ=ð2Þ condition of uniform ellipticity:
E0 X2t
C0 t2ð1Þ=ð2Þ px ðeÞ > 0; P-a.s.
½52
In fact, Kawazu and Kesten (1984) proved that in jej ¼ 1; x2Z d
known for nonballistic RWRE, apart from special cases of balanced RWRE in $d \ge 2$ (Lawler 1982), small isotropic perturbations of ordinary symmetric random walks in $d \ge 3$ (Bricmont and Kupiainen 1991), and some examples based on combining components of ordinary random walks and RWRE in $d \ge 7$ (Bolthausen et al. 2003). In particular, there are no examples of subdiffusive behavior in any dimension $d \ge 2$, and in fact it is largely believed that a CLT is always true in any uniformly elliptic, i.i.d. random environment in dimensions $d \ge 3$, with somewhat less certainty about $d = 2$. A heuristic explanation for such a striking difference with the case $d = 1$ is that due to a less restricted topology of space in higher dimensions, it is much harder to force the random walk to visit traps, and hence the slowdown is not so pronounced.

In what follows, we give a brief account of some of the known results and methods in this fast-developing area (for further information and specific references, see an extensive review by Zeitouni (2004)).

Zero–One Laws and LLNs

A natural first step in a multidimensional context is to explore the behavior of the random walk $X_n$ as projected on various one-dimensional straight lines. Let us fix a test unit vector $\ell \in \mathbb{R}^d$, and consider the process $Z_n^{\ell} := X_n \cdot \ell$. Then for the events $A_{\pm\ell} := \{\lim_{n\to\infty} Z_n^{\ell} = \pm\infty\}$ one can show that

$$ \mathbf{P}_0(A_{\ell} \cup A_{-\ell}) \in \{0, 1\} \qquad [53] $$

That is to say, for each $\ell$ the probability that the random walk escapes to infinity in the direction $\ell$ is either 0 or 1.

Let us sketch the proof. We say that $\tau$ is a "record time" if $|Z_{\tau}^{\ell}| > |Z_k^{\ell}|$ for all $k < \tau$, and a "regeneration time" if in addition $|Z_{\tau}^{\ell}| \le |Z_n^{\ell}|$ for all $n \ge \tau$. Note that by the ellipticity condition [52], $\limsup_{n\to\infty} |Z_n^{\ell}| = \infty$ ($\mathbf{P}_0$-a.s.), hence there is an infinite sequence of record times $0 = \tau_0 < \tau_1 < \tau_2 < \cdots$. If $\mathbf{P}_0(A_{\ell} \cup A_{-\ell}) > 0$, we can pick a subsequence of record times $\tau_i'$, each of which has a positive $\mathbf{P}_0$-probability to be a regeneration time (because otherwise $|Z_n^{\ell}|$ would persistently backtrack towards the origin and the event $A_{\ell} \cup A_{-\ell}$ could not occur). Since the trials for different record times are independent, it follows that a regeneration time occurs $\mathbf{P}_0$-a.s. Repeating this argument, we conclude that there exists an infinite sequence of regeneration times $\tau_i^{*}$, which implies that $|Z_n^{\ell}| \to \infty$ ($\mathbf{P}_0$-a.s.), that is, $\mathbf{P}_0(A_{\ell} \cup A_{-\ell}) = 1$.

Regeneration structure introduced by the sequence $\{\tau_i^{*}\}$ plays a key role in further analysis of the RWRE and is particularly useful for proving an LLN and a CLT, due to the fact that pieces of the random walk between consecutive regeneration times (and fragments of the random environment involved thereby) are independent and identically distributed (at least starting from $\tau_1^{*}$). In this vein, one can prove a "directional" version of the LLN, stating that for each $\ell$ there exist deterministic $v_{\ell}$, $v_{-\ell}$ (possibly zero) such that

$$ \lim_{n\to\infty} \frac{Z_n^{\ell}}{n} = v_{\ell}\,\mathbf{1}_{A_{\ell}} + v_{-\ell}\,\mathbf{1}_{A_{-\ell}}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [54] $$

Note that if $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$, then eqn [54] in conjunction with eqn [53] would readily imply

$$ \lim_{n\to\infty} \frac{Z_n^{\ell}}{n} = v_{\ell}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [55] $$

Moreover, if $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$ for any $\ell$, then there exists a deterministic $v$ (possibly zero) such that

$$ \lim_{n\to\infty} \frac{X_n}{n} = v, \qquad \mathbf{P}_0\text{-a.s.} \qquad [56] $$

Therefore, it is natural to ask if a zero–one law [53] can be enhanced to that for the individual probabilities $\mathbf{P}_0(A_{\ell})$. It is known that the answer is affirmative for i.i.d. environments in $d = 2$, where indeed $\mathbf{P}_0(A_{\ell}) \in \{0, 1\}$ for any $\ell$, with counterexamples in certain stationary ergodic (but not uniformly elliptic) environments. However, in the case $d \ge 3$ this is an open problem.

Kalikow's Condition and Sznitman's Condition (T′)

An RWRE is called "ballistic" (ballistic in direction $\ell$) if $v \ne 0$ ($v_{\ell} \ne 0$), see eqns [55] and [56]. In this section, we describe conditions on the random environment which ensure that the RWRE is ballistic.

Let $U$ be a connected strict subset of $\mathbb{Z}^d$ containing the origin. For $x \in U$, denote by

$$ g(x, \omega) := E_0^{\omega} \sum_{n=0}^{T_U} \mathbf{1}_{\{X_n = x\}} $$

the quenched mean number of visits to $x$ prior to the exit time $T_U := \min\{n \ge 0 : X_n \notin U\}$. Consider an auxiliary Markov chain $\hat{X}_n$, which starts from 0, makes nearest-neighbor jumps while in $U$, with (nonrandom) probabilities

$$ \hat{p}_x(e) = \frac{\mathbb{E}[g(x, \omega)\,p_x(e)]}{\mathbb{E}[g(x, \omega)]}, \qquad x \in U \qquad [57] $$

and is absorbed as soon as it first leaves $U$. Note that the expectations in eqn [57] are finite; indeed, if $\pi_x$ is the probability to return to $x$ before leaving $U$,
then, by the Markov property, the mean number of returns is given by

$$ \sum_{k=1}^{\infty} k\,\pi_x^{k}\,(1 - \pi_x) = \frac{\pi_x}{1 - \pi_x} < \infty $$

since, due to ellipticity, $\pi_x < 1$.

An important property, highlighting the usefulness of $\hat{X}_n$, is that if $\hat{X}_n$ leaves $U$ with probability 1, then the same is true for the original RWRE $X_n$ (under the annealed law $\mathbf{P}_0$), and moreover, the exit points $\hat{X}_{\hat{T}_U}$ and $X_{T_U}$ have the same distribution laws.

Let $\ell \in \mathbb{R}^d$, $|\ell| = 1$. One says that Kalikow's condition with respect to $\ell$ holds if the local drift of $\hat{X}_n$ in the direction $\ell$ is uniformly bounded away from zero:

$$ \inf_{U}\,\inf_{x \in U} \sum_{|e|=1} (e \cdot \ell)\,\hat{p}_x(e) > 0 \qquad [58] $$

A sufficient condition for [58] is, for example, that for some $\kappa > 0$

$$ \mathbb{E}\,(d(0, \omega) \cdot \ell)_{+} \ge \kappa\,\mathbb{E}\,(d(0, \omega) \cdot \ell)_{-} $$

where $d(0, \omega) = E_0^{\omega} X_1$ and $u_{\pm} := \max\{\pm u, 0\}$.

A natural implication of Kalikow's condition [58] is that $\mathbf{P}_0(A_{\ell}) = 1$ and $v_{\ell} > 0$ (see eqn [55]). Moreover, noting that eqn [58] also holds for all $\ell'$ in a vicinity of $\ell$ and applying the above result with $d$ noncollinear vectors from that vicinity, we conclude that under Kalikow's condition there exists a deterministic $v \ne 0$ such that $X_n/n \to v$ as $n \to \infty$ ($\mathbf{P}_0$-a.s.). Furthermore, it can be proved that $(X_n - nv)/\sqrt{n}$ converges in law to a Gaussian distribution (see Sznitman (2004)).

It is not hard to check that in dimension $d = 1$ Kalikow's condition is equivalent to $v \ne 0$ and therefore characterizes completely all ballistic walks. For $d \ge 2$, the situation is less clear; for instance, it is not known if there exist RWRE with $\mathbf{P}_0(A_{\ell}) > 0$ and $v_{\ell} = 0$ (of course, such RWRE cannot satisfy Kalikow's condition).

Sznitman (2004) has proposed a more complicated transience condition (T′) involving certain regeneration times $\tau_i^{*}$ similar to those described in the previous subsection. An RWRE is said to satisfy Sznitman's condition (T′) relative to direction $\ell$ if $\mathbf{P}_0(A_{\ell}) = 1$ and for some $c > 0$ and all $0 < \gamma < 1$

$$ \mathbf{E}_0 \exp\Bigl( c \sup_{n \le \tau_1^{*}} |X_n|^{\gamma} \Bigr) < \infty \qquad [59] $$

This condition provides a powerful control over $\tau_1^{*}$ for $d \ge 2$ and in particular ensures that $\tau_1^{*}$ has finite moments of any order. This is in sharp contrast with the one-dimensional case, and should be viewed as a reflection of much weaker traps in dimensions $d \ge 2$.

Condition [59] can also be reformulated in terms of the exit distribution of the RWRE from infinite thick slabs "orthonormal" to directions $\ell'$ sufficiently close to $\ell$. As it stands, the latter reformulation is difficult to check, but Sznitman (2004) has developed a remarkable "effective" criterion reducing the job to a similar condition in finite boxes, which is much more tractable and can be checked in a number of cases.

In fact, condition (T′) follows from Kalikow's condition, but not the other way around. In the one-dimensional case, condition (T′) (applied to $\ell = 1$ and $\ell = -1$) proves to be equivalent to the transient behavior of the RWRE, which, as we have seen in Theorem 2, may happen with $v = 0$, that is, in a nonballistic scenario. The situation in $d \ge 2$ is quite different, as condition (T′) implies that the RWRE is ballistic in the direction $\ell$ (with $v_{\ell} > 0$) and satisfies a CLT (under $\mathbf{P}_0$). It is not known whether the ballistic behavior for $d \ge 2$ is completely characterized by condition (T′), although this is expected to be true.

Balanced RWRE

In this section we discuss a particular case of nonballistic RWRE, for which LLN and CLT can be proved. Following Lawler (1982), we say that an RWRE is "balanced" if $p_x(e) = p_x(-e)$ for all $x \in \mathbb{Z}^d$, $|e| = 1$ ($\mathbb{P}$-a.s.). In this case, the local drift vanishes, $d(x, \omega) = 0$, hence the coordinate processes $X_n^{i}$ ($i = 1, \dots, d$) are martingales with respect to the natural filtration $\mathcal{F}_n = \sigma\{X_0, \dots, X_n\}$. The quenched covariance matrix of the increments $\Delta X_n^{i} := X_{n+1}^{i} - X_n^{i}$ ($i = 1, \dots, d$) is given by

$$ E_0^{\omega}\bigl[\Delta X_n^{i}\,\Delta X_n^{j} \,\big|\, \mathcal{F}_n\bigr] = 2\,\delta_{ij}\,p_{X_n}(e_i) \qquad [60] $$

Since the right-hand side of eqn [60] is uniformly bounded, it follows that $X_n/n \to 0$ ($\mathbf{P}_0$-a.s.). Further, it can be proved that there exist deterministic positive constants $a_1, \dots, a_d$ such that for $i = 1, \dots, d$

$$ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} p_{X_k}(e_i) = \frac{a_i}{2}, \qquad \mathbf{P}_0\text{-a.s.} \qquad [61] $$

Once this is proved, a multidimensional CLT for martingale differences yields that $X_n/\sqrt{n}$ converges in law to a Gaussian distribution with zero mean and the covariances $b_{ij} = \delta_{ij}\,a_i$.

The proof of [61] employs the method of environment viewed from the particle. Namely, define a Markov chain $\omega_n := \theta_{X_n}\omega$ with the transition kernel

$$ T(\omega, d\omega') = \sum_{i=1}^{d} \Bigl[ p_0(e_i)\,\delta_{\theta_{e_i}\omega}(d\omega') + p_0(-e_i)\,\delta_{\theta_{e_i}^{-1}\omega}(d\omega') \Bigr] $$
(cf. eqn [27]). The next step is to find a probability measure $Q$ on $\Omega$ invariant under $T$ and absolutely continuous with respect to $\mathbb{P}$. Unlike the one-dimensional case, however, an explicit form of $Q$ is not available, and $Q$ is constructed indirectly as the limit of invariant measures of certain periodic modifications of the RWRE. Birkhoff's ergodic theorem then yields, $\mathbf{P}_0$-a.s.,

$$ \frac{1}{n} \sum_{k=0}^{n-1} p_{X_k}(e_i, \omega) = \frac{1}{n} \sum_{k=0}^{n-1} p_0(e_i, \omega_k) \to \int_{\Omega} p_0(e_i, \omega)\,Q(d\omega) \ge \delta $$

by the ellipticity condition [52], and eqn [61] follows.

With regard to transience, balanced RWREs admit a complete and simple classification. Namely, it has been proved (see Zeitouni (2004)) that any balanced RWRE is transient for $d \ge 3$ and recurrent for $d = 2$ ($\mathbf{P}_0$-a.s.). It is interesting to note, however, that these answers may be false for certain balanced random walks in a fixed environment ($\mathbb{P}$-probability of such environments being zero, of course). Indeed, examples can be constructed of balanced random walks in $\mathbb{Z}^2$ and in $\mathbb{Z}^d$ with $d \ge 3$, which are transient and recurrent, respectively (Zeitouni 2004).

RWRE Based on Modification of Ordinary Random Walks

A number of partial results are known for RWRE constructed on the basis of ordinary random walks via certain randomization of the environment. A natural model is obtained by a small perturbation of a simple symmetric random walk. To be more precise, suppose that: (1) $|p_x(e) - 1/2d| < \varepsilon$ for all $x \in \mathbb{Z}^d$ and any $|e| = 1$, where $\varepsilon > 0$ is small enough; (2) $\mathbb{E}\,p_x(e) = 1/2d$; (3) vectors $p_x(\cdot)$ are i.i.d. for different $x \in \mathbb{Z}^d$; and (4) the distribution of the vector $p_x(\cdot)$ is isotropic, that is, invariant with respect to permutations of its coordinates. Then for $d \ge 3$ Bricmont and Kupiainen (1991) have proved an LLN (with zero asymptotic velocity) and a quenched CLT (with nondegenerate covariance matrix). The proof is based on the renormalization group method, which involves decimation in time combined with a suitable spatial–temporal scaling. This transformation replaces an RWRE by another RWRE with weaker randomness, and it can be shown that iterations converge to a Gaussian fixed point.

Another class of examples is also built using small perturbations of simple symmetric random walks, but is anisotropic and exhibits ballistic behavior, providing that the annealed local drift in some direction is strong enough (see Sznitman (2004)). More precisely, suppose that $d \ge 3$ and $\eta \in (0, 1)$. Then there exists $\varepsilon_0 = \varepsilon_0(d, \eta) > 0$ such that if $|p_x(e) - 1/2d| < \varepsilon$ ($x \in \mathbb{Z}^d$, $|e| = 1$) with $0 < \varepsilon < \varepsilon_0$, and for some $e_0$ one has $\mathbb{E}[d(x, \omega) \cdot e_0] \ge \varepsilon^{2.5 - \eta}$ ($d = 3$) or $\ge \varepsilon^{3 - \eta}$ ($d \ge 4$), then Sznitman's condition (T′) is satisfied with respect to $e_0$ and therefore the RWRE is ballistic in the direction $e_0$ (cf. the subsection "Kalikow's condition and Sznitman's condition (T′)").

Examples of a different type are constructed in dimensions $d \ge 6$ by letting the first $d_1 \ge 5$ coordinates of the RWRE $X_n$ behave according to an ordinary random walk, while the remaining $d_2 = d - d_1$ coordinates are exposed to a random environment (see Bolthausen et al. (2003)). One can show that there exists a deterministic $v$ (possibly zero) such that $X_n/n \to v$ ($\mathbf{P}_0$-a.s.). Moreover, if $d_1 \ge 13$, then $(X_n - nv)/\sqrt{n}$ satisfies both quenched and annealed CLT. Incidentally, such models can be used to demonstrate the surprising features of the multidimensional RWRE. For instance, for $d \ge 7$ one can construct an RWRE $X_n$ such that the annealed local drift does not vanish, $\mathbb{E}\,d(x, \omega) \ne 0$, but the asymptotic velocity is zero, $X_n/n \to 0$ ($\mathbf{P}_0$-a.s.), and furthermore, if $d \ge 15$, then in this example $X_n/\sqrt{n}$ satisfies a quenched CLT. (In fact, one can construct such RWRE as small perturbations of a simple symmetric walk.) On the other hand, there exist examples (in high enough dimensions) where the walk is ballistic with a velocity which has an opposite direction to the annealed drift $\mathbb{E}\,d(x, \omega) \ne 0$. These striking examples provide "experimental" evidence of many unusual properties of the multidimensional RWRE, which, no doubt, will be discovered in the years to come.

See also: Averaging Methods; Growth Processes in Random Matrix Theory; Lagrangian Dispersion (Passive Scalar); Random Dynamical Systems; Random Matrix Theory in Physics; Stochastic Differential Equations; Stochastic Loewner Evolutions.

Further Reading

Alexander S, Bernasconi J, Schneider WR, and Orbach R (1981) Excitation dynamics in random one-dimensional systems. Reviews of Modern Physics 53: 175–198.
Bernasconi J and Schneider WR (1985) Random walks in one-dimensional random media. Helvetica Physica Acta 58: 597–621.
Bolthausen E and Goldsheid I (2000) Recurrence and transience of random walks in random environments on a strip. Communications in Mathematical Physics 214: 429–447.
Bolthausen E, Sznitman A-S, and Zeitouni O (2003) Cut points and diffusive random walks in random environments. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques 39: 527–555.
Recursion Operators in Classical Mechanics 371
Bouchaud J-P and Georges A (1990) Anomalous diffusion in disordered media: statistical mechanisms, models and physical applications. Physics Reports 195: 127–293.
Brémont J (2004) Random walks in random medium on Z and Lyapunov spectrum. Annales de l'Institut Henri Poincaré. Probabilités et Statistiques 40: 309–336.
Bricmont J and Kupiainen A (1991) Random walks in asymmetric random environments. Communications in Mathematical Physics 142: 345–420.
Hughes BD (1995) Random Walks and Random Environments. Volume 1: Random Walks. Oxford: Clarendon.
Hughes BD (1996) Random Walks and Random Environments. Volume 2: Random Environments. Oxford: Clarendon.
Kawazu K and Kesten H (1984) On birth and death processes in symmetric random environment. Journal of Statistical Physics 37: 561–576.
Kesten H, Kozlov MV, and Spitzer F (1975) A limit law for random walk in a random environment. Compositio Mathematica 30: 145–168.
Kozlov SM and Molchanov SA (1984) On conditions for applicability of the central limit theorem to random walks on a lattice. Soviet Mathematics Doklady 30: 410–413.
Lawler GF (1982) Weak convergence of a random walk in a random environment. Communications in Mathematical Physics 87: 81–87.
Molchanov SA (1994) Lectures on random media. In: Bernard P (ed.) Lectures on Probability Theory, Ecole d'Eté de Probabilités de Saint-Flour XXII-1992, Lecture Notes in Mathematics, vol. 1581, pp. 242–411. Berlin: Springer.
Révész P (1990) Random Walk in Random and Non-Random Environments. Singapore: World Scientific.
Sinai YaG (1982) The limiting behavior of a one-dimensional random walk in a random medium. Theory of Probability and Its Applications 27: 256–268.
Solomon F (1975) Random walks in a random environment. The Annals of Probability 3: 1–31.
Sznitman A-S (2002) Lectures on random motions in random media. In: Bolthausen E and Sznitman A-S. Ten Lectures on Random Media, DMV Seminar, vol. 32. Basel: Birkhäuser.
Sznitman A-S (2004) Topics in random walks in random environment. In: Lawler GF (ed.) School and Conference on Probability Theory (Trieste, 2002), ICTP Lecture Notes Series, vol. XVII, pp. 203–266 (Available at http://www.ictp.trieste.it/~pub_off/lectures/vol17.html).
Zeitouni O (2003) Random walks in random environments. In: Tatsien Li (ed.) Proceedings of the International Congress of Mathematicians (Beijing, 2002), vol. III, pp. 117–127. Beijing: Higher Education Press.
Zeitouni O (2004) Random walks in random environment. In: Picard J (ed.) Lectures on Probability Theory and Statistics, Ecole d'Eté de Probabilités de Saint-Flour XXXI-2001, Lecture Notes in Mathematics, vol. 1837, pp. 189–312. New York: Springer.
separable systems of Jacobi. To this end, the article is organized in four sections, of which the first three clarify the above-mentioned concepts. In the section "ωN manifolds," the idea of ωN manifolds is explained from the viewpoint of bi-Hamiltonian geometry. The section "Cotangent bundles" shows that cotangent bundles provide a large class of ωN manifolds, proving that such manifolds are not rare. Next, two basic examples of cyclic systems of Levi-Civita are presented. Finally, the relation between cyclic systems of Levi-Civita and separable systems of Jacobi is explained briefly.

The problem is to find $\{f, g\}'$ in such a way that the linear pencil satisfies the Jacobi identity for any value of the parameter $\varepsilon$. To solve this problem it is convenient to represent the bracket $\{f, g\}'$ in the form

$$ \{f, g\}' = \omega'(X_f, X_g) $$

(which is analogous to the standard representation of the Poisson bracket of $S$) and then to notice that there exists a unique (1, 1) tensor field $N : TS \to TS$ such that

$$ \omega'(X_f, X_g) = \omega(NX_f, X_g) $$

Due to the skew-symmetry of $\omega'$, the tensor field $N$ must satisfy the condition

$$ \omega(NX_f, X_g) = \omega(X_f, NX_g) $$

To the first order in $\varepsilon$, the Jacobi identity on $\{f, g\}_{\varepsilon}$ gives

$$ \{\{f, g\}, h\}' + \{\{f, g\}', h\} + \text{cyclic permutations} = 0 $$

This condition entails a constraint on $\omega'$. One can readily check that $\omega'$ must be a closed 2-form:

$$ d\omega' = 0 $$

In turn, this constraint imposes a condition on $N$. The translation of the closure of $\omega'$ on $N$ is

$$ [NX_f, X_g] + [X_f, NX_g] - N[X_f, X_g] = X_{\{f, g\}'} $$

To the second order in $\varepsilon$, the Jacobi identity on $\{f, g\}_{\varepsilon}$ gives

This formula shows that this process of deformation is rigid. For each change of the Poisson bracket, there is a deformation of the commutator of vector fields such that the basic correspondence between functions and Hamiltonian vector fields, established by the symplectic form $\omega$, remains a Lie algebra morphism.

The same phenomenon can be observed in connection with the definition of Hamiltonian vector field. If one introduces the pencil of 2-forms

$$ \omega_{\varepsilon} = \omega + \varepsilon\,\omega' $$

and the pencil of derivations

$$ d_{\varepsilon} = d + \varepsilon\,d_N $$

where $d_N$ is the derivation of type $d$ and degree 1 canonically associated with $N$ according to the
theory of graded derivations of Frölicher and Nijenhuis, one can prove that

$$ d_{\varepsilon}^{2} = 0, \qquad d_{\varepsilon}\,\omega_{\varepsilon} = 0 $$

and that

$$ \omega_{\varepsilon}(X_h, \cdot) = -d_{\varepsilon} h $$

This means that, on an ωN manifold, the symplectic form $\omega$ and the de Rham differential $d$ are deformed in such a way that the basic relation between functions and Hamiltonian vector fields established by $\omega$ holds true.

Cotangent Bundles

Cotangent bundles are a source of examples of ωN manifolds. The construction begins on the base manifold $Q$. For any (1, 1) tensor field $L : TQ \to TQ$ with vanishing Nijenhuis torsion, one constructs the deformed Liouville 1-form

$$ \theta' = \sum_{i=1}^{n} y_i\,L^{*}(dx_i) $$

and its exterior derivative

$$ \omega' = d\theta' $$

It can be proved that $\omega'$ satisfies the conditions explained in the previous section, and one concludes that $T^{*}Q$, endowed with the pencil of 2-forms $\omega_{\varepsilon} = \omega + \varepsilon\,\omega'$, is an ωN manifold.

A subclass of these structures merits attention. It is related to the polynomials

$$ s(\lambda) = \lambda^{n} - \bigl( s_1 \lambda^{n-1} + s_2 \lambda^{n-2} + \cdots + s_n \bigr) $$

the coefficients of which are functions on $Q$ satisfying the condition

$$ ds_1 \wedge ds_2 \wedge \cdots \wedge ds_n \ne 0 $$

characteristic polynomial is $s(\lambda)$. Thus, the choice of $s(\lambda)$ also determines an ωN structure on $T^{*}Q$ according to the previous prescription. The conclusion is that there is a relation between pencils of Poisson brackets on $T^{*}Q$ and coordinate systems on $Q$. This relation is the clue to understand the geometry of separable systems of Jacobi.

Cyclic Systems of Levi-Civita

The systems of coupled harmonic oscillators are the first example of cyclic systems of Levi-Civita. Let us consider, for simplicity, a system formed by only two particles, with masses $m_1$ and $m_2$, moving on a line under the action of an internal elastic force. The Lagrangian of the system is

$$ L = \tfrac{1}{2}\bigl( m_1 \dot{x}_1^{2} + m_2 \dot{x}_2^{2} \bigr) - \tfrac{1}{2}\,k\,(x_1 - x_2)^{2} $$

and the equations of motion are

$$ M\ddot{x} + Kx = 0, \qquad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} $$

where

$$ M = \begin{pmatrix} m_1 & 0 \\ 0 & m_2 \end{pmatrix}, \qquad K = \begin{pmatrix} k & -k \\ -k & k \end{pmatrix} $$

Under a change of coordinates, the entries of the matrices $M$ and $K$ obey the transformation law of the components of a second-order covariant tensor. Therefore, the entries of the matrix $L = M^{-1}K$ are the components of a tensor field of type (1, 1) on $\mathbb{R}^{2}$. The defining equations of the associated endomorphism $L^{*} : T^{*}\mathbb{R}^{2} \to T^{*}\mathbb{R}^{2}$ are

$$ L^{*}(dx_1) = \omega_1^{2}\,(dx_1 - dx_2) $$
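The tensor $L = M^{-1}K$ of the two-particle example can be computed explicitly. The sketch below is illustrative only: the function names are hypothetical, the identification $\omega_i^{2} = k/m_i$ is an assumption made here for the example, and $L^{*}(dx_i)$ is read off as row $i$ of the matrix, which is one common convention for the action on 1-forms.

```python
import math


def recursion_matrix(m1, m2, k):
    """Return L = M^{-1} K for M = diag(m1, m2) and K = [[k, -k], [-k, k]]."""
    return [[k / m1, -k / m1],
            [-k / m2, k / m2]]


def eigenvalues_2x2(a):
    """Eigenvalues of a real 2x2 matrix with real spectrum (trace/determinant formula)."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    return ((trace - root) / 2.0, (trace + root) / 2.0)


def pullback_of_dx(a, i):
    """Components (c_1, c_2) with L^*(dx_i) = c_1 dx_1 + c_2 dx_2 (row i of a)."""
    return list(a[i])


# Example values chosen arbitrarily for illustration.
L_matrix = recursion_matrix(m1=1.0, m2=2.0, k=3.0)
low, high = eigenvalues_2x2(L_matrix)
```

With these values the spectrum consists of 0 and $k(1/m_1 + 1/m_2)$: the zero eigenvalue belongs to the free motion of the centre of mass, and the other one is the squared frequency of the relative oscillation.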
where $s_1$ is the first coefficient of the polynomial defining the elliptical spherical coordinates, and $h$ is the Hamiltonian of the Neumann system. By the Frobenius theorem, this equation alone entails the integrability of the distribution $D_h$, without the need of computing $X_h$, $NX_h$, and their commutator $[X_h, NX_h]$. Thus, it can be concluded that the Neumann system too is a cyclic system of Levi-Civita, and that the recursion operator $N$, generating the distribution $D_h$, is closely related to the polynomial defining the separation coordinates of the Neumann system.

Separable System of Jacobi

In 1838, Jacobi noticed that the Hamilton–Jacobi equation

$$ h\Bigl( x_1, x_2, \dots, x_n;\ \frac{\partial W}{\partial x_1}, \dots, \frac{\partial W}{\partial x_n} \Bigr) = e $$

of many Hamiltonian systems splits owing to an appropriate choice of coordinates in a set of ordinary differential equations. On account of this property, these systems have been called separable. In 1904, Levi-Civita gave a first partial characterization of separable Hamiltonians by means of his separability conditions. In a letter addressed to Stäckel, he proved that $h$ is separable in a preassigned system of canonical coordinates if and only if the conditions

$$ \frac{\partial^{2} h}{\partial x_j \partial x_k} \frac{\partial h}{\partial y_j} \frac{\partial h}{\partial y_k} - \frac{\partial^{2} h}{\partial x_j \partial y_k} \frac{\partial h}{\partial y_j} \frac{\partial h}{\partial x_k} - \frac{\partial^{2} h}{\partial y_j \partial x_k} \frac{\partial h}{\partial x_j} \frac{\partial h}{\partial y_k} + \frac{\partial^{2} h}{\partial y_j \partial y_k} \frac{\partial h}{\partial x_j} \frac{\partial h}{\partial x_k} = 0 $$

are satisfied by $h$. One must notice the nontensorial character of these conditions; they hold only in a specific coordinate system, and if the coordinates are changed, it is not possible to reconstruct the form of the separability conditions in the new coordinates. The nontensorial character is the major drawback of the separability conditions of Levi-Civita, making them practically useless in the search of separation coordinates.

The contact between the theory of separable system of Jacobi and the theory of cyclic systems of Levi-Civita rests on two occurrences. The first is the form of the integrability conditions of the distribution $D_h$ generated by any vector field $X_h$ on an ωN manifold. Exploiting the Frobenius integrability conditions and the properties of the differential operator $d_N$ associated with the recursion operator $N$, it can be proved that $D_h$ is integrable if and only if the 2-form $dd_N h$ vanishes on $D_h$:

$$ dd_N h = 0 \quad \text{on } D_h $$

Suppose now that the dimension of $D_h$ is maximal, that is, equal to $n = (1/2) \dim S$. Then $D_h$ is spanned by the $n$ vector fields $(X_h, NX_h, \dots, N^{n-1}X_h)$, and the vanishing condition of $dd_N h$ on $D_h$ turns out to be equivalent to

$$ dd_N h\,(N^{j} X_h, N^{k} X_h) = 0 $$

for any value of $j$ and $k$ from 0 to $n - 1$. Thus, the number of separability conditions of $h$ and the number of integrability conditions of $D_h$ are equal. This circumstance strongly suggests that the two sets of conditions are related. The nontensorial character of the Levi-Civita conditions, compared with the tensorial character of the integrability conditions of $D_h$, further suggests that the former should be the evaluation of the latter in a specific system of coordinates. These coordinates are the "normal coordinates" of an ωN manifold, that will be introduced in the following.

Assume that the minimal polynomial of $N$ has real and distinct roots $(l_1, \dots, l_n)$. In this case, the ωN manifold is said to be semisimple. A two-dimensional eigenspace is associated with each root $l_k$. Let us consider the distribution $E_k$ spanned by all the eigenvectors of $N$, except those associated with $l_k$. Since $N$ is torsion free, each distribution $E_k$ is integrable. Let us fix the attention on one of these distributions. It turns out that its leaves are symplectic submanifolds of codimension 2. So they are the level surfaces of a pair of (local) functions which are not in involution. By collecting together the pairs of functions associated with the $n$ distributions $(E_1, \dots, E_n)$, one obtains, at the end, a coordinate system $(\xi_1, \eta_1, \xi_2, \eta_2, \dots, \xi_n, \eta_n)$ on $S$. Moreover, these functions can be chosen in such a way to form a system of canonical coordinates. The final result is that, on a semisimple ωN manifold, one can construct a coordinate system such that

$$ \omega = \sum_{j=1}^{n} d\xi_j \wedge d\eta_j $$

and

$$ N^{*}(d\xi_j) = l_j\,d\xi_j, \qquad N^{*}(d\eta_j) = l_j\,d\eta_j $$

These coordinates are called the normal coordinates (or sometimes, the Darboux–Nijenhuis coordinates) of the ωN manifold. One can prove that the separability conditions of Levi-Civita are the integrability
376 Reflection Positivity and Phase Transitions
conditions of $D_h$, written in normal coordinates. This result allows us to claim that the cyclic systems of Levi-Civita on semisimple ωN manifolds are all separable.

The reverse is also true. As has already been shown in the example of the Neumann system, a given separable system of Jacobi can be associated with a recursion operator $N$ in such a way that its phase space (with the possible exclusion of a singular locus) becomes an ωN manifold, and the Hamiltonian vector field $X_h$ becomes a cyclic system of Levi-Civita. A new interpretation of the process of separation of variables follows from this result. Indeed, to find separation coordinates for a given system on a symplectic manifold $S$ is equivalent to deforming the Poisson bracket of $S$ into a pencil

$$ \{f, g\}_{\varepsilon} = \{f, g\} + \varepsilon\,\{f, g\}' $$

in such a way that the recursion operator $N$ defining the pencil $\{f, g\}_{\varepsilon}$ generates, with $X_h$, an integrable distribution $D_h$. Therefore, classical mechanics is deeply entangled with the theory of recursion operators, even if the insistence on the use of separation coordinates has hidden this fact for a long time.

See also: Bi-Hamiltonian Methods in Soliton Theory; Classical r-Matrices, Lie Bialgebras, and Poisson Lie Groups; Integrable Systems and Algebraic Geometry; Integrable Systems and Recursion Operators on Symplectic and Jacobi Manifolds; Integrable Systems: Overview; Multi-Hamiltonian Systems; Separation of Variables for Differential Equations; Solitons and Kac–Moody Lie Algebras.

Further Reading

Dubrovin BA, Krichever IM, and Novikov SP (2001) Integrable systems I. In: Arnol'd VI (ed.) Encyclopaedia of Mathematical Sciences. Dynamical Systems IV, pp. 177–332. Berlin: Springer.
Jacobi CGJ (1996) Vorlesungen über analytische Mechanik, Deutsche Mathematiker Vereinigung, Freiburg. Braunschweig: Friedrich Vieweg and Sohn.
Kalnins EG (1986) Separation of Variables for Riemannian Spaces of Constant Curvature. New York: Wiley.
Kolář I, Michor PW, and Slovák J (1993) Natural Operations in Differential Geometry. Berlin: Springer.
Krasilshchik IS and Kersten PHM (2000) Symmetries and Recursion Operators for Classical and Supersymmetric Differential Equations. Dordrecht: Kluwer.
Magri F, Falqui G, and Pedroni M (2003) The method of Poisson pairs in the theory of nonlinear PDEs. In: Conte R, Magri F, Musette M, Satsuma J, and Winternitz P (eds.) Direct and Inverse Methods in Nonlinear Evolution Equations, Lecture Notes in Physics, vol. 632, pp. 85–136. Berlin: Springer.
Miller W (1977) Symmetry and Separation of Variables. Reading, MA–London–Amsterdam: Addison-Wesley.
Olver PJ (1993) Applications of Lie Groups to Differential Equations, 2nd edn. New York: Springer.
Pars LA (1965) A Treatise on Analytical Dynamics. London: Heinemann.
Vaisman I (1994) Lectures on the Geometry of Poisson Manifolds. Basel: Birkhäuser.
Vilasi G (2001) Hamiltonian Dynamics. River Edge, NJ: World Scientific.
Yano K and Ishihara S (1973) Tangent and Cotangent Bundles: Differential Geometry. New York: Dekker.
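As a numerical companion to the Levi-Civita separability conditions quoted in the section on separable systems of Jacobi, the following sketch evaluates the left-hand side of those conditions by central finite differences. It is a rough illustration under stated assumptions: the function names are hypothetical, the step size `eps` is an arbitrary choice, the two test Hamiltonians are made up for the example, and a small residual at one point is only a necessary check, not a proof of separability.

```python
def partial(f, point, i, eps=1e-4):
    """Central-difference estimate of the i-th partial derivative of f at point."""
    up = list(point)
    dn = list(point)
    up[i] += eps
    dn[i] -= eps
    return (f(up) - f(dn)) / (2.0 * eps)


def mixed_partial(f, point, i, j, eps=1e-4):
    """Estimate of the second derivative d^2 f / (dz_i dz_j) by nested central differences."""
    return partial(lambda z: partial(f, z, j, eps), point, i, eps)


def levi_civita_residual(h, x, y, j, k, eps=1e-4):
    """Left-hand side of the Levi-Civita condition for the index pair (j, k).

    h(x, y) is a Hamiltonian in the preassigned canonical coordinates,
    with x and y given as lists of equal length n.
    """
    n = len(x)
    z = list(x) + list(y)

    def flat(w):
        return h(w[:n], w[n:])

    xj, xk, yj, yk = j, k, n + j, n + k

    def d(i):
        return partial(flat, z, i, eps)

    def d2(a, b):
        return mixed_partial(flat, z, a, b, eps)

    return (d2(xj, xk) * d(yj) * d(yk)
            - d2(xj, yk) * d(yj) * d(xk)
            - d2(yj, xk) * d(xj) * d(yk)
            + d2(yj, yk) * d(xj) * d(xk))


def h_separable(x, y):
    # Sum of two one-degree-of-freedom Hamiltonians (illustrative choice).
    return 0.5 * (y[0] ** 2 + y[1] ** 2) + x[0] ** 2 + x[1] ** 4


def h_coupled(x, y):
    # The coupling term x_1 * x_2 spoils separability in these coordinates.
    return 0.5 * (y[0] ** 2 + y[1] ** 2) + x[0] * x[1]


res_separable = levi_civita_residual(h_separable, [0.3, -0.7], [1.0, 2.0], 0, 1)
res_coupled = levi_civita_residual(h_coupled, [0.3, -0.7], [1.0, 2.0], 0, 1)
```

The second Hamiltonian does separate after a rotation of the coordinates, which illustrates the point made in the text: the conditions test separability only in the preassigned coordinate system.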
positivity method in all its versions may be found For d 3, the latter integral exists; hence, > 0 for
in the literature listed at the end of the article. There J large enough, which means that the state we
we also provide short bibliographic comments. consider is nonergodic.
The quantum case is more involved. The infrared
bounds are obtained not for functions like K(p)b but
Nonergodicity and Infrared Estimates
for the so-called Duhamel two-point functions. Then
The following heuristic arguments should give an idea one has to prove a number of additional statements,
how to establish the nonergodicity of a Gibbs state by which finally lead to the proof of the result desired.
means of infrared estimates. Let us consider a classical In the section on reflection positivity in quantum
ferromagnetic translation-invariant model. (Of systems we indicate how to do this for a simple
course, we assume that it possesses Gibbs states, quantum spin model.
which for models with unbounded spins is a
nontrivial property. A particular case of this model
is described in more detail in the subsection ‘‘Gaus-
Reflection Positivity and Phase
sian domination.’’) This model describes the system
Transitions in Classical Systems
of interacting N-dimensional spins x‘ 2 RN , indexed
by the elements ‘ 2 Zd of the d-dimensional simple We begin by studying reflection positive (RP)
cubic lattice. The interaction is pairwise, attractive, functionals. Gibbs states of RP models are such
nearest-neighbor, and invariant with respect to the functionals.
#ð A þ
BÞ ¼ #ðAÞ þ
#ðBÞ Lemma 4 If is RP, then for any A, B 2 Aþ ,
½8
#ðA BÞ ¼ #ðAÞ #ðBÞ f½A#ðBÞg2 ½A#ðAÞ ½B#ðBÞ ½13
þ
By A (respectively, A ), we denote the sub-
algebra of A consisting of functions dependent Proof For 2 R, by [8] we have
on xþ (respectively, x ). Then #(Aþ ) = A and ½ðA þ BÞ#ðA þ BÞ
# # = id.
¼ ½ðA þ BÞð#ðAÞ þ #ðBÞÞ 0
Definition 1 A linear functional : A ! R is called
RP with respect to the maps and #, if Since is linear, the latter can be written as a
3-nomial, whose positivity for all 2 R is equivalent
8A 2 Aþ: ½A#ðAÞ 0 ½9 to [13]. &
Example 2 Let be a Borel measure on the real Now let an RP functional be such that for
line (not necessarily positive), with respect to which A; B; C1 ; . . . ; Cm ; D1 ; . . . ; Dm 2 Aþ
all real polynomials are integrable. Let also A be the
algebra of all real-valued polynomials on Rjj , jj there exists
being even. Finally, let and # be any of the maps " !#
X
m
with the properties described above. Then the exp A þ #ðBÞ þ Ci #ðDi Þ
functional i¼1
Z
and that the series
ðAÞ ¼ Aðx Þ d ðx Þ
R jj X
1
Y ½10 1
f½C1 #ðC1 Þn1 ½Cm #ðCm Þnm
d ðx Þ ¼ dðx‘ Þ n ! n !
n1 ;...;nm ¼0 1 m
‘2
exp½A þ #ðBÞg ½14
is RP. Indeed, let F : R jj=2 ! R be such that
A(x ) = F(xþ ). Then as well as the one with all Ci s replaced by Di s
Z Y Z Y converge absolutely.
½A#ðAÞ ¼ Fðxþ Þ dðx‘ Þ Fðx Þ dðx‘ Þ Lemma 5 Let the functional and the functions
‘2þ ‘2
A, B, Ci , Di , i = 1, . . . , m, be as above. Then
"Z #2
Y ( " !#)2
¼ Fðxþ Þ dðx‘ Þ 0 Xm
‘2þ
exp A þ #ðBÞ þ Ci #ðDi Þ
i¼1
In the above example the multiplicative structure of " !#
X
m
the measure is crucial. It results in the positivity exp A þ #ðAÞ þ Ci #ðCi Þ
of with respect to all reflections. If one has just i¼1
" !#
one such reflection, the measure which defines X
m
may be decomposable onto two measures only. Let
exp B þ #ðBÞ þ Di #ðDi Þ ½15
,A,, and # be as above. Consider a Borel measure i¼1
Proof By the above assumptions 0þ [ 0 , and A be the algebra of all polynomials
0
" !# of (x0 , y0 ) 2 R2Nj j . Note that x0 may be regarded
Xm
exp A þ #ðBÞ þ Ci #ðDi Þ as the pair (x0þ , x0 ). Let Aþ (respectively, A ) be
i¼1 the subalgebra of A consisting of the polynomials
" !# which depend on x0þ , y0þ (respectively, x0 , y0 )
X
m
¼ F#ðGÞ exp Ci #ðDi Þ only. Introduce the measures
i¼1 !
X
1 JX 2
1 d~ðx Þ ¼ exp jx‘ j dðx Þ
¼ ½F#ðGÞ½C1 #ðD1 Þn1 2 ‘20
n1 ;...;nm
n
¼0 1
! nm ! !
JX 2
½Cm #ðDm Þnm ½16 d~
ðx Þ ¼ exp jxj d
ðx Þ
2 ‘20 ‘
where F = eA , G = eB . Then by [13] and the Cauchy–
Schwarz inequality for sums we get and define the following functional on A:
Z
RHS½16
1=2 ðFÞ ¼ Fðx0 ; y0 Þ dðx
~ þ Þ
X
1
1 R 2Njj
½F#ðFÞ½C1 #ðC1 Þn1 ½Cm #ðCm Þnm
n1 ;...;nm ¼0
n1 ! nm !
d~
ðyþ Þ d~
ðx Þ d~
ðy Þ ½18
1=2
1
½G#ðGÞ½D1 #ðD1 Þn1 ½Dm #ðDm Þnm It has the same structure as the one described by
n1 ! nm !
(
X1
)1=2 Proposition 3, hence is RP with respect to the map #
1
½F#ðFÞ½C1 #ðC1 Þn1 ½Cm #ðCm Þnm defined by the reflection . Set
n1 ;...;nm ¼0 1
n ! nm ! Z Z
( )1=2
X1
1 ¼ d~
ðx Þ;
¼ d~
ðy Þ ½19
½G#ðGÞ½D1 #ðD1 Þn1 ½Dm #ðDm Þnm R Njj R Njj
n ! nm !
n1 ;...;nm ¼0 1
( " !#)1=2 and
Xm
X 1
¼ exp A þ #ðAÞ þ Ci #ðCi Þ
i¼1 2
( " !#)1=2 A 0; B ¼ J ja‘ j þ ða‘ ; y‘ Þ
X
m
‘20þ
2
exp B þ #ðBÞ þ Di #ðDi Þ
i¼1
ðkÞ
pffiffi ðkÞ ðkÞ
pffiffi ðkÞ ðkÞ
½20
C‘ ¼ J x‘ ; D‘ ¼ J y‘ þ a ‘
which yields [15]. &
‘ 2 0þ ; k ¼ 1; . . . ; N
Main Estimate
Then the left-hand side of [17] is
Let be a finite set and 0 be its nonempty subset.
LHS ½17
Let also and
be finite Borel measures on "
RNjj , N 2 N. For vectors b, c 2 RN , by
P (b, c)(k)and ¼
1
exp A þ #ðBÞ
jbj, jcj we denote their scalar product N k=1 b c
(k)
ð
Þ2
and the corresponding norms, respectively. By x !#
we denote (x‘ )‘2 , x‘ 2 RN ; hence, x 2 RNjj . XX N 2
ðkÞ ðkÞ
þ C‘ # D‘ ½21
Lemma 6 Let the sets , 0 and the measures ,
be 0
‘2þ k¼1
0
as above. Then for every (a‘ )‘20 2 RNj j and J 0,
"Z ! #2 with given by [18]. Applying [15] and taking into
JX 2
account [19], we arrive at
exp jx‘ y‘ a‘ j dðx Þd
ðy Þ
R2Njj 2 ‘20 LHS ½17
Z ! 0 1
JX 2 Z X
exp jx‘ y‘ j dðx Þ dðy Þ 1
R2Njj 2 ‘20 exp@ J x‘ xð‘Þ A
Z ! ð
Þ2 R 2Njj ‘20þ
JX 2
exp jx‘ y‘ j d
ðx Þ d
ðy Þ ½17
dðx
~ þ Þ dðx
~ Þ d~
ðyþ Þ d~
ðy Þ
R 2Njj 2 ‘20 0 1
Z X
Proof Take two copies of and denote them by
exp@ J y‘ yð‘Þ A
. Furthermore, by 0 we denote the subsets R 2Njj ‘20þ
consisting of the elements of 0 . For an ‘ 2 þ ,
dðx
~ þ Þ dðx
~ Þ d~
ðyþ Þ d~
ðy Þ ¼ RHS ½17
by (‘) we denote its counterpart in . Then is a
reflection and (0þ ) = 0 . Let = þ [ , 0 = which completes the proof. &
Gaussian Domination
Let be a finite set, jj even, and E be a set of
unordered pairs of elements of , such that the
graph (, E) is connected. If e 2 E connects given
‘, ‘0 2 , we write e = h‘, ‘0 i. We suppose that E
contains no loops h‘, ‘i. With each ‘ 2 we
associate a random N-component vector x‘ , called
spin. The joint probability distribution of the spins
(x‘ )‘2 is defined by means of the local Gibbs
measure Figure 1 The torus.
0 1
1 J X
d ðx Þ ¼ exp@ jx‘ x‘0 j2 Ad ðx Þ; bijection n : ! , n n = id, such that
Z 2 h‘;‘0 i2E
n ((n)
þ ) = (n)
and h n (‘), n (‘ 0
)i 2 E(n)
whenever
h‘, ‘0 i 2 E(n)
þ . Finally, we assume that if h‘, ‘ i 2 En
0
x 2 RNjj ½22 (n) 0
and ‘ 2 þ , then n (‘) = ‘ .
Here the measure By this assumption if h‘, ‘0 i 2 En , then no other
Y elements of En can be of the form h‘, ‘00 i or h‘00 , ‘0 i.
d ðx Þ ¼ dðx‘ Þ ½23
The basic example here is the torus which one obtains
‘2
from a rectangular box Zd , jj even, by imposing
describes the system if the interaction intensity periodic conditions on its boundaries. The set of edges
J equals zero. In general, J 0, that is, the model is E = {h‘, ‘0 ijj‘ ‘0 j = 1}, where j‘ ‘0 j is the
[22], [23] is ferromagnetic. The single-spin measure periodic distance on (see the next subsection).
is a probability measure on R N and Then every plane which contains the center of the
0 1 torus and its axis cuts it out along a family of
Z X edges onto two subgraphs with the property
J
Z ¼ exp@ jx‘ x‘0 j2A d ðx Þ ½24 desired (see Figure 1).
R Njj 2 h‘;‘0 i2E
Theorem 9 The model [22]–[23] defined on the
is the partition function. Set graph obeying Assumption 8 admits Gaussian
0 1 domination.
Z X
J Proof For = 1, h = (h‘‘0 )h‘, ‘0 i2E , and n = 1, . . . , m,
Z ðhÞ ¼ exp@ jx‘ x‘0 h‘‘0 j2 A
R Njj 2 h‘;‘0 i2E we define the map
8
d ðx Þ ½25 >
< h‘‘0 ; if h‘; ‘0 i 2 EðnÞ
Tn h ‘‘0 ¼ hn ð‘Þn ð‘0 Þ ; if h‘; ‘0 i 2 EðnÞ ½28
where h‘‘0 = h‘0 ‘ 2 RN , h‘, ‘0 i 2 E. >
:
0
0; if h‘; ‘ i 2 En
Definition 7 The model [22]–[23] admits Gaussian
domination if for all h = (h‘‘0 )h‘, ‘0 i2E , According to Assumption 8
0 1
Z ðhÞ Z ð0Þ ½26 Z X
J 2
Z ðhÞ ¼ exp@ jx‘ x‘0 h‘‘0 j A
We prove that our model admits Gaussian domina- R Njj 2 h‘;‘0 i2E
1
tion if the graph satisfies the following:
d
þð1Þ xð1Þ d
ð1Þ xð1Þ ½29
Assumption 8 The set of edges E can be þ þ
decomposed
where
[
m \
E¼ En ; En En0 ¼ ;; if n 6¼ n0 ½27 d
ð1Þ xð1Þ
n¼1 0 1
J X
in such a way that for every n = 1, . . . , m, the graph ¼ exp@ jx‘ x‘0 h‘‘0 j2 A dð1Þ xð1Þ ;
(, EnEn ) is disconnected and falls into two con- 2 0 ð1Þ
h‘;‘ i2E
nected components, ((n) (n) (n) (n)
þ , Eþ ) and ( , E ), which
are isomorphic. This means that there exists a ¼ 1
dð1Þ ðxð1Þ Þ
d
~ ðx Þ ½33
þ þ
0 1 where C(k) ‘ , k = 1, . . . , N, are the same as in [20] and
X = {‘ 2 (n)
0þ,n def 0
B J C þ jh‘, ‘ i 2 En }. Then the reflection
d
þð1Þ xð1Þ ¼ exp@ jxð‘Þ xð‘0 Þ h‘‘0 j2A
2 ð1Þ
positivity of the Gibbs state [31] can be obtained
h‘;‘0 i2Eþ
along the line of arguments used for proving Lemma
6. It appears that this is the only possible way to
dð1Þ ðxð1Þ Þ
construct an RP functional from another RP
Then we apply here Lemma 6, with 0þ = {‘ 2 functional.
(1) 0
þ jh‘, ‘ i 2 E1 }, and obtain
Repeated application of the estimate [15] also
yields
0 1
Z X ! " !#1=jj
J Y Y Y
½Z ðhÞ2 exp@ jx‘ x‘0 j2 A F‘ ðx‘ Þ F‘ ðx‘0 Þ ½34
R Njj 2 h‘;‘0 i2E 1
‘2 ‘2 ‘0 2
d
þð1Þ xð1Þ d
þð1Þ xð1Þ which holds for any family of functions
þ
{F‘ : RN ! [0, þ1)}‘2 , for which the above
þ
0 1
Z expressions make sense. The estimate [34] is a
J X
exp@ jx‘ x‘0 j2A chessboard estimate, which is a very important
R Njj 2 h‘;‘0 i2E element of the theory of phase transitions in
1
RP models. The estimate [26] may be obtained
d
ð1Þ xð1Þ
d
ð1Þ xð1Þ from [34].
þ þ
Next we estimate both Z (T1 h) employing E2 and Let us show now how to derive the infrared
T2 . Repeating this procedure due times we finally estimates from the Gaussian domination [26].
get Consider the system of N-dimensional spins
indexed by the elements of Zd with the nearest-
Y
½Z ðhÞ2
m
m
Z ðTm T11 hÞ ¼ ½Z ð0Þ2
m
½30 neighbor ferromagnetic interaction and the sin-
1 ;...;m ¼1 gle-spin measure . To construct the periodic
local Gibbs measure of this system, we take the
Note that Tmm
T11 h = 0 for any h 2 RNjEj and any box
sequence 1 , ..., m = 1, which follows from [27] \
and [28]. & ¼ ðL; Ld Zd ; L 2 N ½35
As might be clear from the proof given above, the and impose periodic conditions on its boundaries.
local Gibbs state This defines the periodic distance
Z " #1=2
Xd
ðAÞ ¼ Aðx Þ d ðx Þ ½31 j‘ ‘0 j ¼ j‘j ‘0j j2L ; ‘; ‘0 2
R Njj ½36
j¼1
defined by means of the measure [22], is RP j‘j ‘0j jL ¼ minfj‘j ‘0j j; L j‘j ‘0j jg
with respect to all reflections n , n = 1, . . . , m.
Indeed, the functional defined by the product and hence the set of edges E, being unordered
measure pairs h‘, ‘0 i such that j‘ ‘0 j = 1. Thus, we have
the graph (, E) and the measure [22]. This is the
!
JX periodic local Gibbs measure of our model. By
def
d
~ ðx Þ ¼ exp jx‘ j2 d ðx Þ ½32 [31] it defines the periodic local Gibbs state .
2 ‘2
We have included the inverse temperature into J
^ðpÞ ¼ pffiffiffiffiffiffi
x x‘ eið‘;pÞ
jj ‘2 X h i
ð1Þ 2
½37 h‘‘0 ½45
1 X
x‘ ¼ pffiffiffiffiffiffi ^ðpÞeið‘; pÞ
x h‘;‘0 i2E
jj p2
This means that the eigenvalues of the matrix of the
real quadratic form (with respect to h) defined by
the left-hand side of [45] do not exceed one. The
same ought to be true for the extension of this form
¼ p ¼ ðp1 ; . . . ; pd Þj pj ¼ þ j ;
L to the complex case. Let us show that the complex
eigenvectors h(1)
‘‘0 (p) of this matrix and the corre-
j ¼ 1; . . . ; 2L; j ¼ 1; . . . ; d ½38 sponding eigenvalues (p) are
ð1Þ 0 pffiffiffiffiffiffi
Then we can set h‘‘0 ðpÞ ¼ ðeiðp;‘Þ eiðp;‘ Þ Þ= jj
h i p 2
½46
b ðkÞ ðpÞ ¼ x ðkÞ ðkÞ ðpÞ ¼ 2JEðpÞK b ð1Þ ðpÞ
K ^ x
ðpÞ^ ðpÞ
X
N ½39 For j = 1, . . . , d, let j 2 Zd be the unit vector with
b ðpÞ ¼
K b ðkÞ ðpÞ
K the jth component equal to 1. Then for h‘, ‘0 i 2 E,
k¼1 there exists j such that ‘ ‘0 = j . Since the edge
Thereby, cf. [1], [2], h‘, ‘0 i is an unordered set, let us fix ‘0 = ‘ þ j .
Thereby,
def 1 Xb
X ð1Þ
0
K ð‘; ‘0 Þ ¼ ½ðx‘ ; x‘0 Þ ¼ K ðpÞeiðp;‘‘ Þ ½40 1 ð1Þ iðp;‘Þ iðp;‘0 Þ
jj p2 x ‘ x ‘ 0 e e
jj1=2 h‘;‘0 i2E
By construction, for any ‘0 2 , d h i
2 XX ð1Þ ð1Þ
0 0 ¼ x‘ eiðp;‘Þ x‘ eiðp;‘Þ cosðp; j Þ
K ð‘; ‘ Þ ¼ K ð‘ þ ‘0 ; ‘ þ ‘0 Þ ½41 jj1=2 ‘2 j¼1
Gibbs state of the whole model is called the periodic Then employing the latter two facts and [37], we get
Gibbs state. By construction, it is translation X h
ð1Þ ð1Þ ð1Þ ð1Þ
invariant. Set J x‘1 x‘0 x‘2 x‘0 h‘2 ‘02 ðpÞ
1 2
h‘2 ;‘02 i2E
X
d h i
ð1Þ ð1Þ
EðpÞ ¼ ½1 cos pj ; p 2 ð; d ½42 ¼ 2JEðpÞ x‘1 x‘0 ^ð1Þ ðpÞ
x
1
j¼1 h i
1 X
¼ 2JEðpÞ x^ð1Þ ðp0 Þ^
xð1Þ ðpÞ
Theorem 10 For all p 2
n {0}, jj1=2 p0 2
b ðpÞ N 0 0 0
eiðp ;‘1 Þ eiðp ;‘1 Þ
K ½43
2JEðpÞ
b ð1Þ ðpÞh‘ ‘0 ðpÞ
¼ 2JEðpÞK 1 1
Proof Consider the function f ( ) = Z ( h), 2 R,
which proves [46]. Then by [45] Kb (1) (p) 1=2JE(p),
where Z (h) is defined by [25]. By Theorem 9 it has
a maximum at = 0; hence, b (k) (p), k = 2, . . . , N,
for p 6¼ 0. The same holds for K
which by [39] yields [43]. &
f 00 ð0Þ 0 ½44
The result just proved and the convergence of
Obviously, f 00 (0) depends on h = (h‘‘0 )h‘, ‘0 i2E , K (‘, ‘0 ) ! K(‘, ‘0 ), as L ! þ1, imply the infrared
h‘‘0 2 RN . Let us choose h such that only the bound [4]. It turns out that the estimate [43]
may be used directly to prove the phase transi- article we can only sketch its main elements basing
tion. Consider on the original paper by Dyson et al. (1978), where
X the interested reader can find the details. As above,
def 1
P ¼ 2 ½ðx‘1 ; x‘2 Þ we start by studying reflection positive functionals.
jj ‘1 ;‘2 2
0 1
1 X 2 Reflection Positivity in Nonabelian Case
¼ @ x A0 ½47
jj ‘2 ‘ Again we consider a finite set , jj being even. For every
‘ 2 , let a complex Hilbert space H‘ be given. This is
where is the box [35]. By [40] and [41], we have the single-spin physical Hilbert space for our quantum
1 b system. We suppose that all H‘ , ‘ 2 , are the copies of a
P ¼ K ð0Þ ½48 certain finite-dimensional space H. The physical Hilbert
jj
def
space H corresponding to is the tensor product of
One can show that if P = limL!þ1 P is positive, H‘ , ‘ 2 . Let A be the algebra of all linear operators
then there exist multiple Gibbs states. By [40], [41], defined on H . This is the algebra of observables in our
and [48], we get that for any ‘ 2 , case; it is noncommutative (nonabelian) and contains the
1 X b unit element I – the identity operator. As above, splits
K ð‘; ‘Þ ¼ P þ KðpÞ ½49 into two subsets , which are the mirror images of each
jj p2 nf0g
other, that is, we are given a reflection : ! , such
that (þ ) = . This allows us to introduce the
Suppose that, cf. [5], corresponding subalgebras A by setting the elements
K ð‘; ‘Þ K > 0 ½50 of Aþ to be of the form A I, where A : Hþ ! Hþ is a
linear operator and I is the identity operator on H .
with K independent of and J. Employing in [49] Respectively, the elements of A are to be of the form
this estimate and [43], and passing to the limit I A. Then we define the map # : Aþ
! A as
L ! þ1, we get
#ðA IÞ ¼ I A ½53
P K I ðdÞN=2J ½51
where A 7! A is complex (not Hermitian) conjugation; it
where
may be realized as transposing and taking Hermitian
Z 1 A
n =
def 1 dp conjugation. For A1 , . . . , An 2 A, one has A
I ðdÞ ¼ d
½52 A1 , An . We also suppose that # possesses the
ð2Þ ð;d EðpÞ
properties [8]. A linear functional : A ! R is called
which is finite for d 3. Thereby, we have proved RP (with respect to the pair , #) if it has the property [9].
the following:
Definition 12 A functional is called generalized
Theorem 11 For the spin model [22], [23], there reflection positive (GRP) if for any A1 , . . . , An 2 Aþ
,
exist multiple Gibbs states, and hence multiple
phases, if d 3 and J > I (d)N=2 K. ½A1 #ðA1 Þ An #ðAn Þ 0 ½54
Finally, let us pay some attention to the estimate In principle, this notion differs from the reflection
[50], which is closely related with the properties of the positivity only in the nonabelian case. However, if
single-spin measure (note that played no role in the algebras A commute (they do commute in our
obtaining [26] and [43]). If it is the uniform measure case), a functional is RP if and only if it is GRP.
on the unit sphere SN1 RN , then K (‘, ‘) = 1 and Example 13 Let
[50] is trivial. In general, one has to employ some
technique to obtain such an estimate. ðAÞ ¼ traceðAÞ; A 2 A ½55
Since the space H is finite dimensional, this is
well defined. It is GRP. Indeed, as the algebras A
Reflection Positivity and Phase commute, we have
Transitions in Quantum Systems ½A1 I #ðA1 IÞ An I #ðAn IÞ
As in the classical case, the way of proving the phase ¼ ½A1 I An I #ðA1 IÞ #ðAn IÞ
transition for appropriate models leads from an ¼ ½A1 I An I #ðA1 I An IÞ
estimate like [17] to Gaussian domination and then 1 A
n
¼ trace½A1 An trace½A
to the infrared bound. However, here this way is
much more complicated, so in the frames of this ¼ jtrace½A1 An j2 0
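The computation in Example 13 can be checked numerically. The following is only a sketch under stated assumptions: the Hilbert space attached to one half of the lattice is taken to be three-dimensional, the operators are random complex matrices, and the helper names (`embed`, `reflect`, `grp_value`) are hypothetical. It builds A ⊗ I and its reflected image I ⊗ Ā as in [53], with the bar denoting entrywise complex conjugation, and compares the trace of the alternating product in [54] with |trace(A_1 ... A_n)|^2.

```python
from functools import reduce
import numpy as np

DIM = 3                 # assumed dimension of the space for one half of the lattice
ID = np.eye(DIM)

def embed(a):
    # A acting on the "+" half only: A ⊗ I
    return np.kron(a, ID)

def reflect(a):
    # image of A ⊗ I under the reflection map [53]: I ⊗ conj(A)
    return np.kron(ID, a.conj())

def grp_value(ops):
    # trace of A_1 #(A_1) ... A_n #(A_n), the quantity appearing in [54]
    m = np.eye(DIM * DIM, dtype=complex)
    for a in ops:
        m = m @ embed(a) @ reflect(a)
    return np.trace(m)

rng = np.random.default_rng(0)
ops = [rng.normal(size=(DIM, DIM)) + 1j * rng.normal(size=(DIM, DIM))
       for _ in range(4)]

lhs = grp_value(ops)
rhs = abs(np.trace(reduce(np.matmul, ops))) ** 2
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, because the operators on the two halves commute and the trace of a tensor product factorizes.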
The Cauchy–Schwarz inequality [13] obviously The proof is performed by means of Lemma 14.
holds also in the quantum case. By means of this The periodic local Gibbs state of the model [58] at
inequality and the Trotter product formula the inverse temperature , analogous to the state [31], is
expðA þ BÞ ¼ lim ½expðA=nÞ expðB=nÞn ½56 ðAÞ ¼ tracefA expðH Þg=Z ð0Þ; A 2 A ½61
n!þ1
As in the classical case, one can define the parameter
one can prove that every RP functional obeys an
[47]. However, now the fact that limL!þ1 P > 0
estimate like [17]. Thereby, we have the following
does not yet imply the phase transition. One has to
analog of Lemma 6:
prove a more general fact
Lemma 14 Let A, B, C1 , . . . , Cn 2 Aþ be any self-
8 0 19
adjoint operators possessing real matrix representa- < 1 X 2 =
lim lim @ 0 S A >0 ½62
tion and a1 , . . . , am be any real numbers. Then L0 !þ1:L!þ1 j j ‘20 ‘ ;
" ( !)#2
X
m
trace exp A þ #ðBÞ ½Cn #ðCn Þ an 2 where 0 is the box [35] of side 2L0 . Furthermore, in
(
n¼1
!) the quantum case the Gaussian domination [60]
X
m does not lead directly to the estimate [43], which
trace exp A þ #ðAÞ ½Cn #ðCn Þ2 yields [51]. Instead, one can get a bound like [43]
n¼1
( !) but for the Duhamel two-point function (DTF).
X
m
trace exp B þ #ðBÞ ½Cn #ðCn Þ2 ½57 Given A, B 2 A , their DTF is
n¼1 Z 1
ðA; BÞ ¼ ðAe H Be H Þd ½63
0
Gaussian Domination and Phase Transitions By means of [56] one can show that
To proceed further we need a concrete model with 1
ðA; BÞ ¼
finite-dimensional physical Hilbert spaces. As every Z ð0Þ
2
quantum model, it is defined by its Hamiltonian. Let @
trace½expð A þ
B H Þ ½64
Zd be the box [35] and (, E) be the same @ @
¼
¼0
graph as in the subsection ‘‘Infrared bound.’’ The
periodic Hamiltonian of our model is Let ^S(p) = (^S(1) (p),. .., ^S(N) (p)), p 2
, be the Fourier
image of S‘ , defined by [37], [38]. Then
X 1 X
H ¼ Q‘ þ jS‘ S‘0 j2 ½58 XN
‘2
2 0
h‘;‘ i2E ^SðpÞ; ^SðpÞ ¼ ^SðkÞ ðpÞ; ^SðkÞ ðpÞ
k¼1
where at each ‘ 2 we have the copies Q‘ ,
S(1) (N)
‘ , . . . , S‘ of N þ 1 basic operators, acting in the Theorem 16 For all p 2
n{0}, it follows that
Hilbert space H‘ , and
^SðpÞ; ^SðpÞ N ½65
N
X 2EðpÞ
ðkÞ ðkÞ 2
jS‘ S‘0 j2 ¼ S ‘ S ‘0
k¼1 To prove this statement one has to use the
Gaussian bound [60] exactly as in the case of
The only condition we impose so far is that all these
Theorem 10. The second derivative with respect to
operators can simultaneously be chosen as real
gives the corresponding DTF (see [64]).
matrices. For h = (h‘‘0 )h‘,‘0 i2E 2 R NjEj , we set
Now let us indicate how the infrared bound [65]
(
X leads to the phase transition. To this end we use the
Z ðhÞ ¼ trace exp Q‘ simplest quantum spin model with the Hamiltonian
‘2
!) [58], for which Q‘ = 0, N = 2, and S(k) ‘ , k = 1, 2,
X 2 being the copies of the Pauli matrices
jS‘ S‘0 h‘‘0 j ½59
2 h‘;‘0 i2E 0 1 1 0
ð1Þ ð2Þ
S ¼ ; S ¼
1 0 0 1
where > 0 is the inverse temperature.
Then
Theorem 15 For the model [58] and any
ðkÞ ðkÞ ðkÞ
h = (h‘‘0 )h‘, ‘0 i2E 2 RNjEj , K ð‘; ‘Þ ¼ S‘ S‘ ¼1
Z ðhÞ Z ð0Þ ½60 for all ‘ 2 ; k ¼ 1; 2 ½66
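The Trotter product formula [56] is easy to observe on small matrices. The sketch below is an illustration only: the matrices are random real symmetric ones (so that the exponential can be computed from an eigendecomposition), and the helper names `expm_sym` and `trotter` are hypothetical. It measures how far [exp(A/n) exp(B/n)]^n is from exp(A + B) for increasing n.

```python
import numpy as np

def expm_sym(s):
    # exponential of a real symmetric matrix via its eigendecomposition
    w, v = np.linalg.eigh(s)
    return (v * np.exp(w)) @ v.T

def trotter(a, b, n):
    # n-th Trotter approximant [exp(A/n) exp(B/n)]^n
    step = expm_sym(a / n) @ expm_sym(b / n)
    return np.linalg.matrix_power(step, n)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 4))
y = rng.normal(size=(4, 4))
A = (x + x.T) / 2
B = (y + y.T) / 2

exact = expm_sym(A + B)
errors = [np.linalg.norm(trotter(A, B, n) - exact) for n in (10, 100, 1000)]
print(errors)
```

For noncommuting A and B the error is expected to shrink roughly in proportion to 1/n, which is the rate that makes the limit in [56] usable inside trace estimates.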
which gives the bound K (see [50]). For A, B 2 A , was made in Driessler et al. (1979), Pastur and
by [A, B] we denote the commutator AB BA. Set Khoruzhenko (1987), Barbulyak and Kondratiev
h h ii (1992), and Kondratiev (1994). In the latter two
ðkÞ
ðpÞ ¼ ^ SðkÞ ðpÞ; H ; ^
SðkÞ ðpÞ papers a general version of the quantum crystal was
k ¼ 1; 2 ½67 studied in the framework of the Euclidean approach,
based on functional integrals (see Albeverio et al.
The phase transition in the model we consider can (2002)). In this approach the quantum crystal is
be established by means of the following statement represented as a lattice spin model with unbounded
(see Dyson 1978, Theorem 5.1). infinite-dimensional spins. Like in the case of classical
models with unbounded spins, here establishing the
Proposition 17 Suppose there exist (k) (p), k = 1, 2,
estimate [5] becomes a highly nontrivial task. In
p 2 (, ]d such that, for all L 2 N,
particular cases, for example, for 4 -models, one
ðkÞ
ðpÞ ðkÞ ðpÞ; k ¼ 1; 2; p 2
½68 applies special tools like the Bogoliubov inequalities
(see Driessler et al. (1979) and Pastur and Khoruz-
Then the model undergoes a phase transition at a henko (1987)). In the general case quasiclassical
certain finite if d 3 and asymptotics allow us to get the lower bound [5] (see
Z ðkÞ 1=2 Barbulyak and Kondratiev (1992) and Kondratiev
1 ðpÞ (1994)). There is one more technique based on
d
dp < 1 ½69
ð2Þ ð; 8EðpÞ
d
reflection positivity (see Lieb (1989)). It employs
reflections in spin spaces, whereas the properties of
for a certain, and hence for both, k = 1, 2.
the index sets (lattices) play no role. This technique
Thus to prove the phase transition we have to proved to be useful in the theory of strongly correlated
estimate (k)
(p), k = 1, 2. By means of the Cauchy– electron systems, see Tian (2004). Finally, we mention
Schwarz inequality, the estimate [69] may be the books of Georgii (1988), Prum (1986), and Sinai
transformed into the following: (1982) where different aspects of the RP method are
Z h i described. In Georgii (1988), one can also find
1 ð1Þ ð2Þ
ðpÞ þ ðpÞ dp < 16=I ðdÞ extended bibliographical and historical comments on
ð2Þd ð;d this subject.
where $I(d)$ is the same as in [52]. The integral on the left-hand side can be estimated from above by $8\sqrt{d(d+1)}$; hence, the latter inequality holds if

$$I(d)\,\sqrt{d(d+1)} < 2$$

See also: Phase Transition Dynamics; Phase Transitions in Continuous Systems; Quantum Spin Systems; Renormalization: Statistical Mechanics and Condensed Matter.
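The sufficient condition I(d) sqrt(d(d+1)) < 2 can be explored numerically. The sketch below is an assumption-laden illustration, not part of the original argument: it approximates I(d) from [52] by the finite-volume sum (1/|Λ|) Σ 1/E(p) over the nonzero momenta of the periodic box of side 2L, with E(p) as in [42], and the function name `lattice_I` and the choice L = 16 are hypothetical. For d = 3 the infinite-volume value is the classical Watson integral, about 0.505, and the finite sum approaches it slowly from below.

```python
import numpy as np

def lattice_I(d, L):
    # momenta p_j = -pi + (pi / L) * k, k = 1, ..., 2L, for the box of side 2L
    p1 = -np.pi + (np.pi / L) * np.arange(1, 2 * L + 1)
    grids = np.meshgrid(*([p1] * d), indexing="ij")
    energy = sum(1.0 - np.cos(g) for g in grids)      # E(p) = sum_j (1 - cos p_j)
    nonzero = energy > 1e-12                          # drop p = 0, where E(p) = 0
    return np.sum(1.0 / energy[nonzero]) / energy.size

I3 = lattice_I(3, 16)
criterion = I3 * np.sqrt(3 * 4)
print(I3, criterion)
```

Under these assumptions the printed criterion for d = 3 stays below 2, in line with the statement that the condition can be met in three dimensions.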
Fröhlich J, Simon B, and Spencer T (1976) Infrared bounds, phase transitions and continuous symmetry breaking. Communications in Mathematical Physics 50: 79–85.
Georgii H-O (1988) Gibbs Measures and Phase Transitions. Studies in Mathematics, vol. 9. Berlin: Walter de Gruyter.
Kondratiev JG (1994) Phase transitions in quantum models of ferroelectrics. In: Stochastic Processes, Physics and Geometry, vol. II, pp. 465–475. Singapore: World Scientific.
Lieb EH (1989) Two theorems on the Hubbard model. Physical Review Letters 62: 1201–1204.
Pastur LA and Khoruzhenko BA (1987) Phase transitions in quantum models of rotators and ferroelectrics. Theoretical and Mathematical Physics 73: 111–124.
Prum B (1986) Processus sur un Réseau et Mesures de Gibbs. Applications. Techniques Stochastiques. Paris: Masson.
Prum B and Fort J-C (1991) Stochastic Processes on a Lattice and Gibbs Measures (translated from the French by Bertram Eugene Schwarzbach and revised by the authors). Mathematical Physics Studies, vol. 11. Dordrecht: Kluwer Academic Publishers Group.
Shlosman SB (1986) The method of reflection positivity in the mathematical theory of first-order phase transitions. Russian Mathematical Surveys 41: 83–134.
Sinai Ya (1982) Theory of Phase Transitions. Rigorous Results. Oxford: Pergamon.
Tian G-S (2004) Lieb's spin-reflection-positivity method and its applications to strongly correlated electron systems. Journal of Statistical Physics 116: 629–680.
kneading operator regularization approach, inspired by the work of Milnor and Thurston, and applicable to dynamical systems with finite smoothness.

Despite the terminology, none of the regularization techniques discussed below match the following "$\zeta$-regularization" formula:

$$\prod_{k=1}^{\infty} a_k = \exp\left( -\frac{d}{ds} \sum_{k=1}^{\infty} a_k^{-s} \Big|_{s=0} \right) \qquad [3]$$

(For information about the above $\zeta$-regularization and its applications to physics, we refer, e.g., to Elizalde (1995). See also Voros (1987) and Fried (1986) for more geometrical approaches and further references, e.g., to the work of Ray and Singer.)

We do not cover all aspects of dynamical $\zeta$-functions here. For more information and references, we refer to our survey Baladi (1998), to the more recent surveys by Pollicott (2001) and Ruelle (2002), and also to the exhaustive account by Cvitanović et al. (2005), which contains a rich array of physical applications.

The Grothendieck–Fredholm Case

Let $M$ be a real analytic compact manifold (e.g., the circle or the $d$-torus), and let $f : M \to M$ be real analytic and $g : M \to \mathbb{C}$ be analytic.

First suppose that $f$ is uniformly expanding, that is, there is $\lambda > 1$ so that $\|Tf(v)\| \ge \lambda \|v\|$. (For example, $f(z) = z^2$ on the unit circle, or a small analytic perturbation thereof.) Consider

$$\mathcal{L}_{f,g}\varphi(x) = \sum_{y:\, f(y)=x} g(y)\,\varphi(y) \qquad [4]$$

(For example, with $g(y) = 1/|\det Tf(y)|$ or $1/|\det Tf(y)|^s$.) Ruelle (1976) proved that an operator $\mathcal{L}_0$, which is essentially the same as $\mathcal{L}_{f,g}$ (the difference, if any, arises from the use of Markov partitions, especially in higher dimensions), acting on a Banach space of holomorphic and bounded functions, is not only compact, but is in fact a nuclear operator in the sense of Grothendieck. In particular, the traces of all its powers are well defined, and the Grothendieck–Fredholm (Gohberg et al. 2000) determinant

$$d_0(z) = \exp\left( -\sum_{n=1}^{\infty} \frac{z^n}{n}\, \mathrm{tr}\, \mathcal{L}_0^n \right) \qquad [5]$$

extends to an entire function of finite order, the zeros of which are exactly the inverses of the nonzero eigenvalues of $\mathcal{L}_0$. (The order of the zero coincides with the algebraic multiplicity of the eigenvalue.) Ruelle also proved that the traces can be written as sums over periodic orbits:

$$\mathrm{tr}\, \mathcal{L}_0^n = \sum^{*}_{x:\, f^n(x)=x} \frac{\prod_{k=0}^{n-1} g(f^k x)}{|\det(\mathrm{Id} - Tf_x^{-n})|}$$

where $\sum^{*}$ means that the fixed points of $f^n$ lying in the intersection of two or more elements of the Markov partition must be counted two or more times. (Note that if $f^n(x) = x$, then this closed orbit gives a natural inverse branch for $f^n$.) Taking into account the periodic orbits on the boundaries of the Markov partition, Ruelle expresses the following "dynamical determinant":

$$d_{f,g}(z) = \exp\left[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x:\, f^n(x)=x} \frac{\prod_{k=0}^{n-1} g(f^k x)}{|\det(\mathrm{Id} - Tf_x^{-n})|} \right] \qquad [6]$$

as an alternated product of determinants $d_0(z)$ as in [5].

The expression [6] is sometimes also called a "dynamical $\zeta$-function," but we prefer to reserve this terminology for the following power series:

$$\zeta_{f,g}(z) = \exp\left[ +\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x:\, f^n(x)=x} \prod_{k=0}^{n-1} g(f^k x) \right] \qquad [7]$$

It is not difficult to write $\zeta_{f,g}(z)$ as (Baladi 1998) an alternated product of determinants $d_{f,g_i}$, for $i = 0, \ldots, d$, and appropriate weights $g_i$.

In fact, the results just described hold in more generality, for example, for piecewise bijective and analytic interval maps. Such maps, $f$, appear naturally, for example, when considering Schottky subgroups of $PSL(2, \mathbb{Z})$. We mention the recent work of Guillopé–Lin–Zworski (2004), who let the transfer operator associated to such $f$ and weights $g_s(y) = 1/|f'(y)|^s$ act (as trace-class operators) on suitable Hilbert spaces of holomorphic functions. This allows them to obtain precise estimates for the number of zeros of $s \mapsto d_{f,g_s}(1)$ in the complex plane: these zeros are the resonances (in the sense of the spectrum of the Laplacian).

Note that the nuclearity properties extend also to the Gauss map $f(x) = \{1/x\}$, which has infinitely many inverse branches, if the weight $g$ has summability properties over the branches (e.g., $g_s(y) = |1/f'(y)|^s$, where $s$ is a complex parameter, with $\Re s > 1/2$). The dynamical determinant $d_{f,g_s}(z)$ for the transfer operator of the Gauss map is related to the Selberg $\zeta$-function (see e.g., Chang and Mayer (2001) and references therein).

Next, assume that $M$ and $g$ are as before, but $f$ is a uniformly hyperbolic real analytic diffeomorphism.
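The periodic-orbit formulas [6] and [7] can be made concrete on the simplest expanding example mentioned above, the map $f(z) = z^2$ on the unit circle (equivalently $x \mapsto 2x \bmod 1$). The sketch below is an illustration under stated assumptions: it uses the facts that $f^n$ has $2^n - 1$ fixed points on the circle, each with derivative $2^n$, and the weights $g = 1/|f'| = 1/2$ for [6] and $g = 1$ for [7]; the function names are hypothetical. It expands $\exp(-\sum_n t_n z^n / n)$ as a power series with exact rational arithmetic.

```python
from fractions import Fraction

def orbit_sum(n, g, with_determinant):
    # Sum over the 2^n - 1 fixed points of f^n for f(x) = 2x mod 1 on the circle.
    # All fixed points carry the same weight g^n and the same derivative 2^n.
    term = Fraction(g) ** n
    if with_determinant:
        term /= 1 - Fraction(1, 2 ** n)     # |det(Id - Tf_x^{-n})| = 1 - 2^{-n}
    return (2 ** n - 1) * term

def exp_series(t, order):
    # Coefficients c_0..c_order of exp(-sum_{n>=1} t[n] z^n / n),
    # from the recurrence m c_m = -sum_{k=1}^{m} t[k] c_{m-k}.
    c = [Fraction(1)] + [Fraction(0)] * order
    for m in range(1, order + 1):
        c[m] = -sum(t[k] * c[m - k] for k in range(1, m + 1)) / m
    return c

ORDER = 8
# dynamical determinant [6] with g = 1/2
det_coeffs = exp_series(
    {n: orbit_sum(n, Fraction(1, 2), True) for n in range(1, ORDER + 1)}, ORDER)
# reciprocal of the zeta function [7] with g = 1
inv_zeta_coeffs = exp_series(
    {n: orbit_sum(n, 1, False) for n in range(1, ORDER + 1)}, ORDER)
print(det_coeffs)
print(inv_zeta_coeffs)
```

Under these assumptions every orbit sum in [6] equals 1, so the determinant collapses to $1 - z$, while the second series is the expansion of $(1 - 2z)/(1 - z)$, that is, $\zeta_{f,1}(z) = (1 - z)/(1 - 2z)$.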
388 Regularization for Dynamical ζ-Functions
For example, $M$ is the 2-torus and $f$ is a small real analytic perturbation of the linear automorphism

$$\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$

More generally, we may assume that $f$ is a real analytic Anosov diffeomorphism, that is, there are $C \ge 1$ and $\lambda > 1$ such that the tangent bundle decomposes as $TM = E^u \oplus E^s$, where the dynamical bundles $E^u$ and $E^s$ are $Tf$-invariant, with $\|Tf^n|_{E^s}\| \le C\lambda^{-n}$ and $\|Tf^{-n}|_{E^u}\| \le C\lambda^{-n}$ for all $n \in \mathbb{Z}_+$. In general, the smoothness of $x \mapsto E^u(x)$ and $E^s(x)$ is only Hölder. Under the very strong additional assumption that $E^u(x)$ and $E^s(x)$ are real analytic, Ruelle (1976) (see also Fried (1986)) showed that the power series $d_{f,g}(z)$ can again be written as a finite alternated product (this product being again an artifact of the Markov partition) of entire functions of finite order. For this, he constructed auxiliary transfer operators associated to the expanding (and analytic!) quotiented dynamics acting on holomorphic functions on disks. The analyticity assumption on the dynamical bundles was later lifted by Rugh (1996) (see also Fried (1995)), who let their transfer operators act on Banach topological tensor products of spaces of holomorphic functions on a disk with the dual of such a space. In all these cases, the transfer operator is a nuclear operator in the sense of Grothendieck and no regularization is needed. (More recent work of Kitaev (1999), when applied to this analytic setting, shows that the "meromorphic" function $d_{f,g}(z)$ in fact does not have poles.)

Regularization and Intermittency

Consider the interval $M = [0, 1]$, and $f$ defined on $M$ by $f(x) = f_1(x) = x/(1 - x)$ on $[0, 1/2]$, and $f(x) = f_2(x) = (1 - x)/x$ on $[1/2, 1]$. (This is the Farey map, which appears naturally when considering continued fractions.) Each of the two branches is an analytic bijection onto $[0, 1]$. The second branch is expanding, but the first one, $f_1$, has a (parabolic) neutral fixed point at $x = 0$ (the expansion is $f(x) = x + x^2 + x^3 + \cdots$). Let $g = g_s$ be an analytic weight of the form $g(y) = 1/|f'(y)|^s$ for $\Re s \ge 1/2$. We are interested in the spectrum of the operator $\mathcal{L}_{f,g}$ associated with the pair $(f, g)$ by [4]. Clearly, the expression [6] is not a good candidate for an analog of the Fredholm determinant of $\mathcal{L}_{f,g}$. Rugh (1996) introduced a Banach space $\mathcal{B}$ of functions in a complex neighborhood of $M$, having a controlled singularity at 0, and such that the spectral radius of $\mathcal{L}_{f,g}$ on $\mathcal{B}$ is equal to 1, and such that the following regularized determinant

$$d_{f,g}(z) = \exp\left[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{x \in (0,1]:\, f^n(x)=x} \frac{\prod_{k=0}^{n-1} g_s(f^k x)}{1 - Tf_x^{-n}} \right] \qquad [8]$$

is a holomorphic function in the cut complex plane $\{z \in \mathbb{C} \mid z \notin [1, \infty)\}$. Furthermore, its zeros $z$ in this cut plane are in bijection with the spectrum of $\mathcal{L}_{f,g}|_{\mathcal{B}}$ outside of the unit interval $[0, 1]$, and this spectrum consists of eigenvalues $1/z$ of finite multiplicities. Finally, these eigenvalues can only accumulate at 0 or 1, although each point in the unit interval belongs to the spectrum of $\mathcal{L}_{f,g}$. In particular, the essential spectral radius of $\mathcal{L}_{f,g}$ on $\mathcal{B}$ coincides with its spectral radius.

Let us define the Banach space $\mathcal{B}$ and explain the key ideas in the proof of the above result (Rugh's claim is in fact more general than the statement above and applies to a class of maps $f$ with neutral fixed points). The starting point is the decomposition

$$\mathcal{L}_{f,g} = \mathcal{L}_1 + \mathcal{L}_2$$

where $\mathcal{L}_i \varphi = (\varphi \circ f_i^{-1})\, |(f_i^{-1})'|^s$. The operator $\mathcal{L}_2$ is of the type discussed in the previous section, and it is nuclear when acting, for example, on bounded holomorphic functions in a complex neighborhood of $M$. Since $f_1$ is not expanding (because of the parabolic fixed point at 0), other ideas must be used to handle the operator $\mathcal{L}_1$. The change of coordinates (this idea goes back to Fatou) $w = 1/x$ replaces the weak contraction $f_1^{-1}$ by the translation $w \mapsto w + 1$ in a suitable domain containing a half-plane $\Re w > w_0$. In order to take into account the weight $g_s$, it is convenient to use the change of variables $\psi(w) = \varphi(1/w)\, w^{-2s}$. Indeed, in the new coordinates the operator $\mathcal{L}_1$ reads as

$$\mathcal{M}_1 \psi(w) = \psi(w + 1)$$

The next step consists in letting $\mathcal{M}_1$ act on the Banach space $\mathcal{B}_w$ of Laplace transforms of $L^1(\mathbb{R}_+, \text{Lebesgue})$, that is, functions

$$\psi(w) = \int_0^{\infty} e^{-(w - w_0)t}\, \hat\psi(t)\, dt$$

with the induced norm $\|\psi\|_{\mathcal{B}_w} = \int |\hat\psi(t)|\, dt$. Since $\mathcal{M}_1$ maps $\hat\psi$ to $e^{-t} \hat\psi(t)$, it is not difficult to see that the spectrum of $\mathcal{M}_1$ on $\mathcal{B}_w$ (and thus of $\mathcal{L}_1$ on the pullback $\mathcal{B}$ of $\mathcal{B}_w$ by $\varphi \mapsto \psi$, which consists of functions in a complex neighborhood of $[0, 1]$, holomorphic in a sector at 0, and with a possible, but controlled, singularity at 0) is the closed unit interval. One can check that $\mathcal{L}_2$ is nuclear on $\mathcal{B}$. Composing a bounded operator with a
nuclear operator gives a nuclear operator. If 1=z 62 smaller than the spectral radius. Then, the goal is to
[0, 1], the resolvent (1 zL1 )1 is a bounded operator, prove that the dynamical determinant [6] defines a
and therefore, for such z, the operator holomorphic function in the disk of radius 1=ess , and
that its zeros in this disk are exactly the inverses of the
PðzÞ :¼ zL2 ð1 zL1 Þ1 ½9 eigenvalues of Lf , g . For uniformly expanding Cr maps
is nuclear on B. We view P(z) as a ‘‘regularized’’ f on compact manifolds, and Cr weights, denoting by
version of Lf , g = L1 þ L2 . Now, since > 1 the expansion coefficient as in the section ‘‘The
Grothendieck–Fredholm case,’’ this goal was essen-
ð1 zLf ; g Þ1 ¼ ð1 zðL1 þ L2 ÞÞ1 tially attained by Ruelle (1990). For Lf , g acting on the
1 Banach space of Cr functions on M, Ruelle proved
¼ ð1 zL1 Þ1 1 zL2 ð1 zL1 Þ1 ess (Lf , g )
r and was able to extend df , g (z) (and
interpret its zeros) in the disk of radius r .
it is not surprising that one can prove (Rugh 1996)
For Cr Anosov diffeomorphisms f, and Cr weights g,
that the Fredholm determinant
Pollicott, Ruelle, Haydn, and others obtained important
u 7! det 1 L2 ðu L1 Þ1 results using the symbolic dynamics description (for
which the maximal smoothness which can be used is
(which is holomorphic in u 62 [0, 1]) has as its zero set r
1, because of the metric-space model). Later, Kitaev
sp(Lf , g jB ) n [0, 1], and that this set consists in isolated (1999) was able to show that df , g (z) extends to a
eigenvalues of finite multiplicity (equal to the order of holomorphic function in the disk of radius r=2 ,
the corresponding zero) for Lf , g . Formally, but did not give any spectral interpretation of the
zeros of df , g (z). More recently, Liverani (2005) was able
X
1
ð1 zL1 Þ1 ¼ z k L k1 ½10 to give such an interpretation, in a smaller disk however.
k¼0 All the works mentioned in the previous paragraph
are based on some approximation scheme (Taylor
so that the regularization we just described can be expansion style). In the early 1990s, a new approach,
viewed as mirroring an induction (or renormaliza- with a regularization flavor, was launched (see e.g.,
tion) procedure, where the dynamics f is replaced by Baladi and Ruelle (1996)), initially for piecewise
the first-return map to the ‘‘chaotic’’ part of the monotone interval maps. We present it next.
phase space [0, 1/2]. (For the Farey map, the induced Consider a finite set of local homeomorphisms
map is just the Gauss map.) The formal equality [10]
! : U! ! ! (U! ), where each U! is a bounded
is also behind the fact that (Rugh 1996) open interval of R, and of associated weight functions
Q n1 g! which are continuous, of bounded variation, and
X k
n k¼0 gs ðf xÞ
tr PðzÞ ¼ have support inside U! . For example, the ! can be the
x6¼0: f n ðxÞ¼x
1 Tfxn
inverse branches of a single piecewise monotone
An extension of this theory to the two-dimensional interval map f, and g! can be g ! for a single g.
setting has been obtained by Baladi, Pujals, and (No contraction assumption is required on the ! :
Sambarino. their graph can even coincide with the diagonal on a
segment.) The transfer operator is now
X
M’ ¼ g! ð’ ! Þ
Regularization and Kneading !
Determinants b for the essential
Ruelle obtained an estimate, noted R,
Up to now we have only discussed analytic dynamical spectral radius of M acting on the Banach space BV of
systems, for which hyperbolicity (or uniform expan- functions of bounded variation. The main result of
sion) guaranteed that the transfer operator (or a Baladi and Ruelle (1996) links the eigenvalues of
regularized version thereof) was compact, even M : BV ! BV outside of the disk of radius R, b with
nuclear, on a natural Banach space. When considering the zeros of the following ‘‘sharp determinant’’:
hyperbolic invertible (or expanding noninvertible) !
maps f, and weights g with ‘‘finite smoothness,’’ say #
X1
zn # n
det ðId zMÞ ¼ exp tr M ½11
Cr for some finite r > 1, the transfer operator defined n¼1
n
by [2] or [4] is usually not compact on any infinite-
dimensional space. However, one can often prove a where (with the understanding that y=jyj = 0 if y = 0)
‘‘Lasota–Yorke’’ type inequality (see e.g., Baladi X Z 1 ! ðxÞ x
(1998)) which ensures that the essential spectral radius tr# M ¼ dg! ðxÞ
ess (Lf , g ), defined in the ‘‘Introduction,’’ is strictly !
2 j ! ðxÞ xj
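As a sanity check on the sharp trace, the following sketch (an illustration added here, not taken from the cited works) evaluates $\int \tfrac12\,\mathrm{sgn}(\psi(x) - x)\,dg(x)$ by a Riemann–Stieltjes sum for a single branch. The contraction `psi`, the weight `bump`, the interval $[0, 1]$ and the grid size are all hypothetical choices. For a strict contraction with fixed point $x^*$ and a smooth weight vanishing at both ends of the interval, integration by parts predicts the value $g(x^*)$.

```python
import math


def bump(x, center=0.5, half_width=0.5):
    """Smooth weight supported in (center - half_width, center + half_width)."""
    t = (x - center) / half_width
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))


def sign(y):
    """Sign with the convention sign(0) = 0, matching y/|y| = 0 at y = 0."""
    return (y > 0) - (y < 0)


def sharp_trace(psi, g, a, b, n_cells=100_000):
    """Riemann-Stieltjes sum for the integral of (1/2) sgn(psi(x) - x) dg(x) on [a, b]."""
    h = (b - a) / n_cells
    total = 0.0
    for i in range(n_cells):
        left = a + i * h
        right = left + h
        mid = 0.5 * (left + right)
        total += 0.5 * sign(psi(mid) - mid) * (g(right) - g(left))
    return total


def psi(x):
    """Hypothetical strict contraction with fixed point x* = 0.4."""
    return 0.5 * x + 0.2


fixed_point = 0.4
value = sharp_trace(psi, bump, 0.0, 1.0)
print(value, bump(fixed_point))
```

The two printed numbers should agree up to discretization error, which is the one-branch version of the statement that the sharp trace reduces to a sum of weights over fixed points when the branches are contractions.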
390 Regularization for Dynamical -Functions
If the $\psi_\omega$ are strict contractions which form the set of inverse branches of a piecewise monotone interval map $f$, and $g_\omega = g \circ \psi_\omega$, then integration by parts together with the key property that

\[ d\,\frac{x}{2|x|} = \delta, \quad \text{the Dirac delta at the origin of } \mathbb{R}, \]

shows that $\det^\#(\mathrm{Id} - zM) = 1/\zeta_{f,g}(z)$ (recall [7]). If one assumes instead only that the graph of each admissible composition $\psi^n_w$ of $n$ successive $\psi_\omega$'s (with $n \ge 1$) intersects the diagonal transversally, then

\[ \det{}^\#(\mathrm{Id} - zM) = \exp\Biggl[ -\sum_{n=1}^{\infty} \frac{z^n}{n} \sum_{\text{admissible } \psi^n_w} \; \sum_{x:\, \psi^n_w(x) = x} L(x, \psi^n_w) \prod_{k=0}^{n-1} g_{\omega_k}\bigl(\psi^k_w(x)\bigr) \Biggr] \qquad [12] \]

where $L(x, \psi) \in \{-1, 1\}$ is the Lefschetz number of a transversal fixed point $x = \psi(x)$ (if $\psi$ is $C^1$ this is just $\mathrm{sgn}(1 - \psi'(x))$). Therefore, we call the sharp determinant $\det^\#(\mathrm{Id} - zM)$ a Ruelle–Lefschetz (dynamical) determinant. For a class of "unimodal" interval maps $f$ and constant weight $g = 1$, the expression [12] with Lefschetz numbers, coming from the additional transversality assumption, gives that $\det^\#(\mathrm{Id} - zM)$ is just $1/\zeta^-(z)$, where the "negative $\zeta$-function"

\[ \zeta^-(z) = \exp\Biggl[ \sum_{n=1}^{\infty} \frac{z^n}{n} \bigl( 2\,\#\mathrm{Fix}^-(f^n) - 1 \bigr) \Biggr] \qquad [13] \]

is defined by counting (twice) the sets

\[ \mathrm{Fix}^-(f^n) = \{ x \mid f^n(x) = x,\ f^n \text{ strictly decreasing in a neighborhood of } x \} \]

of "negative fixed points." This negative $\zeta$-function was studied by Milnor and Thurston, who proved the remarkable identity

\[ (\zeta^-(z))^{-1} = \det(1 + \hat D(z)) \]

where $\hat D(z)$ is a $1 \times 1$ "matrix," which is just a power series in $z$ with coefficients in $\{-1, 0, +1\}$, given by the signed itinerary of the image of the turning point (the so-called "kneading" data).

Returning now to the general setup $\psi_\omega$, $g_\omega$, the crucial step in the proof of the spectral interpretation of the zeros of this Ruelle–Lefschetz determinant consists in establishing the following continuous version of the Milnor–Thurston identity:

\[ \det{}^\#(\mathrm{Id} - zM) = \det(\mathrm{Id} + \hat D(z)) \qquad [14] \]

where the "kneading operator" $\hat D(z)$ replaces (formally) the finite kneading matrix of Milnor and Thurston. In a suitable $z$-disk, one proves that this operator $\hat D(z)$ is a Hilbert–Schmidt operator on an $L^2$ space (its kernel is bounded and compactly supported), thus allowing the use of regularized determinants of order 2 (see e.g., Gohberg et al. (2000)). By definition, $\det(\mathrm{Id} + \hat D(z))$ is the product of this regularized determinant with the exponential of the average of the kernel of $\hat D(z)$ along the diagonal, which is well defined. Another kneading operator, $D(z)$, is essential. If $1/z$ is not in the spectrum of $M$ (on BV), then $D(z)$ is also Hilbert–Schmidt, and one can show $\det(\mathrm{Id} + \hat D(z)) = \det(\mathrm{Id} + D(z))^{-1}$. The initial definitions of $\hat D(z)$ and $D(z)$ were technical and we shall not give them here. However, a more conceptual definition of $D(z)$ was later implemented:

\[ D(z) = N (\mathrm{Id} - zM)^{-1} S \qquad [15] \]

where $N$ is an auxiliary transfer operator and $S$ is the convolution

\[ S\varphi(x) = \int \frac{1}{2}\, \frac{x - y}{|x - y|}\, \varphi(y)\, d\mu(y) \]

where $\mu$ is an auxiliary non-negative finite measure. From [15], it becomes clear that the kneading operator is a regularized (through the convolution $S$) object which describes the inverse spectrum of the transfer operator: the resolvent $(\mathrm{Id} - zM)^{-1}$ in [15] means that poles can only appear if $1/z$ is an eigenvalue. Since $\det(\mathrm{Id} + \hat D(z)) = \det(\mathrm{Id} + D(z))^{-1}$, this can be translated into a statement for zeros of $\det(\mathrm{Id} + \hat D(z))$. The Milnor–Thurston identity [14] then implies that any zero of $\det^\#(\mathrm{Id} - zM)$ is an inverse eigenvalue of $M$.

The one-dimensional kneading regularization we just presented is well understood. The higher-dimensional theory is not as developed yet. Let $U_\omega$ be now finitely many bounded open subsets of $\mathbb{R}^d$, $\psi_\omega : U_\omega \to \psi_\omega(U_\omega)$ be local $C^r$ homeomorphisms or diffeomorphisms, while $g_\omega : U_\omega \to \mathbb{C}$ are compactly supported $C^r$ functions, for $r \ge 1$.

In 1995, A Kitaev wrote a two-page sketch proving a higher-dimensional Milnor–Thurston formula, under an additional transversality assumption. This assumption guarantees that the set of fixed points of each fixed period $m$ is finite, so that the Ruelle–Lefschetz determinant $\det^\#(\mathrm{Id} - zM)$ can be defined through [12]. Inspired by Kitaev's unpublished note, Baillif (2004) proved the following Milnor–Thurston formula:

\[ \det{}^\#(\mathrm{Id} - zM) = \prod_{k=0}^{d-1} \det{}^\flat\bigl(\mathrm{Id} + D_k(z)\bigr)^{(-1)^{k+1}} \qquad [16] \]

Here, the $D_k(z)$ are kernel operators acting on $(k + 1)$-forms, constructed with the resolvent $(\mathrm{Id} - zM_k)^{-1}$, together with a convolution operator $S_k$, mapping
Relativistic Wave Equations Including Higher Spin Fields 391
$(k + 1)$-forms to $k$-forms and which satisfies the homotopy equation $dS + Sd = 1$. The kernel of $S_k$ has singularities of the form $(x - y)/\|x - y\|^d$. The transversality assumption allows Baillif to interpret the determinant obtained by integrating the kernels along the diagonal as a flat determinant in the sense of Atiyah and Bott, whence the notation $\det^\flat$ in the right-hand side of [16].

Baillif (2004) did not give a spectral interpretation of zeros or poles of the sharp determinant [16], but he noticed that for $|z|$ very small, suitably high iterates of the $D_k(z)$ are trace-class on $L^2(\mathbb{R}^d)$, showing that the corresponding regularized determinant has a nonzero radius of convergence under weak assumptions. The spectral interpretation of the sharp determinant [12] in arbitrary dimension, but under additional assumptions, was subsequently carried out by Baillif and the author of the present article, giving a new proof of some of the results in Ruelle (1990).

See also: Chaos and Attractors; Dynamical Systems and Thermodynamics; Ergodic Theory; Hyperbolic Dynamical Systems; Number Theory in Physics; Quantum Ergodicity and Mixing of Eigenfunctions; Quillen Determinant; Semi-Classical Spectra and Closed Orbits; Spectral Theory for Linear Operators.

Further Reading

Baillif M (2004) Kneading operators, sharp determinants, and weighted Lefschetz zeta functions in higher dimensions. Duke Mathematical Journal 124: 145–175.
Baladi V (1998) Periodic Orbits and Dynamical Spectra, Ergodic Theory Dynam. Systems, vol. 18, pp. 255–292 (with an addendum by Dolgopyat D and Pollicott M, pp. 293–301.)
Baladi V and Ruelle D (1996) Sharp determinants. Inventiones Mathematicae 123: 553–574.
Chang CH and Mayer DH (2001) An extension of the thermodynamic formalism approach to Selberg's zeta function for general modular groups. In: Fiedler B (ed.) Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, pp. 523–562. Springer: Berlin.
Cvitanović P, Artuso R, Mainieri R, Tanner G, and Vattay G (2005) Chaos: Classical and Quantum, ChaosBook.org. Copenhagen: Niels Bohr Institute.
Elizalde E (1995) Ten Physical Applications of Spectral Zeta Functions, Lecture Notes in Physics, New Series m:35. Springer: Berlin.
Fried D (1986) The zeta functions of Ruelle and Selberg. I. Ann. Sci. École Norm. Sup 19: 491–517.
Fried D (1986) Analytic torsion and closed geodesics on hyperbolic manifolds. Inventiones Mathematicae 84: 523–540.
Fried D (1995) Meromorphic zeta functions for analytic flows. Communications in Mathematical Physics 174: 161–190.
Gohberg I, Goldberg S, and Krupnik N (2000) Traces and Determinants of Linear Operators. Basel: Birkhäuser.
Guillopé L, Lin K, and Zworski M (2004) The Selberg zeta function for convex co-compact Schottky groups. Communications in Mathematical Physics 245: 149–176.
Kitaev AY (1999) Fredholm determinants for hyperbolic diffeomorphisms of finite smoothness. Nonlinearity 12: 141–179.
Liverani C (2005) Fredholm determinants, Anosov maps and Ruelle resonances. Discrete and Continuous Dynamical Systems 13: 1203–1215.
Pollicott M (2001) Dynamical zeta functions. In: Katok A, de la Llave R, Pesin Y, and Weiss H (eds.) Smooth Ergodic Theory and Its Applications (Seattle, WA, 1999), Proc. Sympos. Pure Math., vol. 69, pp. 409–427. Providence, RI: American Mathematical Society.
Ruelle D (1976) Zeta functions for expanding maps and Anosov flows. Inventiones Mathematicae 34: 231–242.
Ruelle D (1990) An Extension of the Theory of Fredholm Determinants, Inst. Hautes Études Sci. Publ. Math. 175–193.
Ruelle D (2002) Dynamical Zeta Functions and Transfer Operators, Notices American Mathematical Society: 887–895.
Rugh HH (1996) Generalized Fredholm Determinants and Selberg Zeta Functions for Axiom A Dynamical Systems, Ergodic Theory Dynam. Systems. 805–819.
Rugh HH (1999) Intermittency and regularized Fredholm determinants. Inventiones Mathematicae 135: 1–25.
Voros A (1987) Spectral functions, special functions and the Selberg zeta function. Communications in Mathematical Physics 110: 439–465.
have the same structure, if $s > 0$. Therefore, most of the equations dealt with in this article are formulated for spinor fields. (Strictly speaking, the exclusive use of 2-spinors restricts the relativistic invariance to the proper Lorentz group $SO^+(1, 3)$. However, all the results presented here can be "translated back" into tensor or bispinor form, respectively (Illge 1993).)

Relativistic wave equations for free fields with arbitrary spin $s > 0$ in Minkowski spacetime are discussed in the section "Higher spin in Minkowski spacetime"; they were first given by Dirac (1936).

In the subsequent section, we explain how the field theory can be extended to curved spacetimes. If a Lagrangian is known, then there exists a well-known mathematical procedure ("Lagrange formalism") to obtain the field equations, the energy–momentum tensor, etc. All field equations for "low" spin $s \le 1$ arise from an action principle. Consequently, they can be extended to curved spacetime by simply replacing the flat metric and connection with their curved versions.

If $s > 1$, then the wave equations do not follow from a variation principle without supplementary conditions. Nevertheless, one can try to generalize the equations of the section "Higher spin in Minkowski spacetime" to curved spacetime by the "principle of minimal coupling," too. However, the arising equations are not satisfactory, since there is an algebraic consistency condition in curved space if $s > 1$ (Buchdahl 1962), and another for charged fields in the presence of electromagnetism if $s > 1/2$ (Fierz and Pauli 1939).

There have been numerous attempts to avoid these inconsistencies. As a rule, the alternative theories require an extended spacetime structure or additional new fields or they give up some important principle. An extensive literature is devoted to just this problem; unfortunately, a survey article or book is missing.

Finally, we present a possibility to describe fields with arbitrary spin $s > 0$ within the framework of Einstein's general relativity without any auxiliary fields and subsidiary conditions in a uniform manner. The approach is based on irreducible representations of type $D(s, 0)$ and $D(s - 1/2, 1/2)$ instead of $D(s/2, s/2)$ in the Fierz theory for bosons and $D(s/2 + 1/4, s/2 - 1/4)$ in the Rarita–Schwinger theory for fermions. It was first pointed out by Buchdahl (1982) that this type of field equations can be generalized to a curved spacetime if the mass is positive. After a short time Wünsch (1985) simplified them to their final form:

\[ \begin{aligned} \nabla^{A}{}_{P'}\, \varphi_{AB\ldots E} + m_1\, \chi_{B\ldots EP'} &= 0 \\ \nabla^{P'}{}_{(A}\, \chi_{B\ldots E)P'} - m_2\, \varphi_{AB\ldots E} &= 0 \end{aligned} \qquad [1] \]

This system contains the well-known wave equations for low spin $s = 1/2$ and $s = 1$ as special cases. By iteration we obtain second-order wave equations of normal hyperbolic type. Further, Cauchy's initial-value problem is well posed and a Lagrangian is known. For zero mass, we state the wave equations

\[ \nabla^{A}{}_{(A'}\, \chi_{|A|B'\ldots E')} = 0 \qquad [2] \]

which are just the curved versions of the equations for the potential of a massless field. They are consistent in curved spacetime, too, and the Cauchy problem is well posed (Illge 1988).

Last but not least, let us mention the esthetic aspect. Equations [1] and [2] satisfy Dirac's demand: "Physical laws should have mathematical beauty."

In the following, we assume that the spacetime and all the spinor and tensor fields are of class $C^\infty$. All considerations are purely local. We will call a symmetric ("irreducible") spinor to be of type $(n, k)$ if and only if it has $n$ unprimed and $k$ primed indices (irrespective of their position). Moreover, we use the notations and conventions of Penrose and Rindler (1984), especially for the curvature spinors $\Psi_{ABCD}$ and $\Phi_{ABA'B'}$.

Wave Equations for Low Spin in Minkowski Spacetime

The spin (or intrinsic angular momentum) of a particle is found to be quantized. Its projection on any fixed direction is an integer or half-integer multiple of Planck's constant $\hbar$; the only possible values are

\[ -s\hbar,\ (-s + 1)\hbar,\ \ldots,\ (s - 1)\hbar,\ s\hbar \]

The spin quantum number $s$ so defined can have one of the values $s = 0, 1/2, 1, 3/2, 2, \ldots$ and is a characteristic for all elementary particles along with their mass $m$ and electric charge $e$. The particles with integer $s$ are called "bosons," those with half-integer $s$ "fermions." The three numbers $s = 0$, $1/2$, and $1$ are referred to as "low" spin; they are sufficient for the greater part of physics.

The principle of first quantization associates a type of field and a field equation to each type of elementary particles. Massive particles, with rest mass $m > 0$, and massless particles, with rest mass $m = 0$, are to be distinguished. Accordingly, we obtain six linear wave equations for $s \le 1$, which read as follows in units such that $c = \hbar = 1$ (see Table 1):

For the sake of simplicity, we consider only free fields in Table 1; no source terms or interaction terms appear here. The associated "free" Lagrangians are given in Table 2.

Since the electromagnetic field tensor $F_{ab}$ satisfies the first part of Maxwell's equations $\partial_{[c} F_{ab]} = 0$, it follows
that a vector field $A_a$ exists such that $F_{ab} = \partial_a A_b - \partial_b A_a$. This vector field is called the "electromagnetic 4-potential." It is not uniquely determined by the field $F_{ab}$; the freedom in $A_a$ is $A_a \to A_a + \partial_a \lambda$ where $\lambda = \lambda(x)$ is a real-valued function. This gauge transformation of $A_a$ can be used, for example, to obtain the Lorentz gauge condition $\partial^a A_a = 0$.

Table 1 Relativistic wave equations for low spin $s = 0$, $1/2$, and $1$

Spin, mass | Wave equation | Associated particles
$s = 0$, $m > 0$ | Klein–Gordon eqn. $(\Box + m^2)u = 0$ | Scalar mesons $\pi$, $\eta$, $K$, ...
$s = 0$, $m = 0$ | D'Alembert eqn. $\Box u = 0$ | –
$s = 1/2$, $m > 0$ | Dirac eqn. $\partial^{A}{}_{A'} \varphi_A + \frac{im}{\sqrt2} \chi_{A'} = 0$, $\partial_{A}{}^{A'} \chi_{A'} - \frac{im}{\sqrt2} \varphi_A = 0$ | Leptons $e$, $\mu$, $\tau$; Baryons $p$, $n$, $\Lambda$, $\Sigma$, $\Xi$, ...
$s = 1/2$, $m = 0$ | Weyl eqn. $\partial^{AA'} \nu_A = 0$ | Massless(?) neutrinos $\nu_e$, $\nu_\mu$, $\nu_\tau$
$s = 1$, $m > 0$ | Proca eqn. $H_{ab} = \partial_a U_b - \partial_b U_a$, $\partial^c H_{ca} + m^2 U_a = 0$ | Vector mesons $\rho$, $\omega$, ...
$s = 1$, $m = 0$ | Maxwell eqn. $\partial_{[a} F_{bc]} = 0$, $\partial_a F^{ab} = 0$ | Photon $\gamma$

Table 2 The Lagrangian densities for free (i.e., noninteracting) fields with low spin

Field | Lagrangian density
Scalar field | $L = \frac12 \{ (\partial^a u)(\partial_a u) - m^2 u^2 \}$
Dirac field | $L = \frac{i}{\sqrt2} \bigl( \bar\chi_A \partial^{AA'} \chi_{A'} + \bar\varphi_{B'} \partial^{BB'} \varphi_B - \varphi_B \partial^{BB'} \bar\varphi_{B'} - \chi_{A'} \partial^{AA'} \bar\chi_A \bigr) + m \bigl( \bar\chi_A \varphi^A + \bar\varphi^{A'} \chi_{A'} \bigr)$
Weyl field | $L = \frac{i}{\sqrt2} \bigl( \bar\nu_{A'} \partial^{AA'} \nu_A - \nu_A \partial^{AA'} \bar\nu_{A'} \bigr)$
Proca field | $L = \frac14 H_{ab} H^{ab} - H^{ab} \partial_{[a} U_{b]} + \frac12 m^2 U_a U^a$
Maxwell field | $L = -\frac14 F_{ab} F^{ab} = -(\partial_{[a} A_{b]})(\partial^{[a} A^{b]})$

The wave equations listed in Table 1 look rather different, but this formal disadvantage can be overcome. To begin with, we remark that fermions require spinors for their description. The Dirac and Weyl equations are not describable by linear equations for tensor fields. On the other hand, bosons can be described by spinors as well. All tensor equations can be "translated" into spinor form using the mixed spinor–tensor $\sigma^a{}_{AA'}$. We will demonstrate this procedure for the Proca field in some detail.

The (possibly complex) skew-symmetric tensor $H_{ab}$ and the vector $U_a$ have the spinor equivalents

\[ H_{ab}\, \sigma^a{}_{AA'}\, \sigma^b{}_{BB'} = \varphi_{AB}\, \varepsilon_{A'B'} + \chi_{A'B'}\, \varepsilon_{AB}, \qquad U_a\, \sigma^a{}_{AA'} = \theta_{AA'} \]

where $\varphi$ and $\chi$ are both symmetric spinors: $\varphi_{AB} = \varphi_{(AB)}$, $\chi_{A'B'} = \chi_{(A'B')}$. After a straightforward calculation the Proca equation yields

\[ \partial^{C'}{}_{(A}\, \theta_{B)C'} + \varphi_{AB} = 0, \qquad \partial^{C}{}_{(A'}\, \theta_{B')C} + \chi_{A'B'} = 0 \]
\[ \partial^{C}{}_{A'}\, \varphi_{CA} + \partial_{A}{}^{C'} \chi_{C'A'} + m^2\, \theta_{AA'} = 0 \]

Further, from the equation $\partial_{[c} H_{ab]} = 0$, we obtain $\partial_{A}{}^{C'} \chi_{A'C'} = \partial^{C}{}_{A'} \varphi_{AC}$; thus, the first and second summand in the third equation are equal. Consequently, we find the following spinor form of the Proca equations:

\[ \begin{aligned} \partial^{C}{}_{A'}\, \varphi_{CA} + \frac{m^2}{2}\, \theta_{AA'} &= 0, & \partial^{C'}{}_{(A}\, \theta_{B)C'} + \varphi_{AB} &= 0 \\ \partial_{A}{}^{C'} \chi_{C'A'} + \frac{m^2}{2}\, \theta_{AA'} &= 0, & \partial^{C}{}_{(A'}\, \theta_{B')C} + \chi_{A'B'} &= 0 \end{aligned} \qquad [3] \]

If the tensor fields $H$ and $U$ are real, then we have $\chi_{A'B'} = \bar\varphi_{A'B'}$, $\theta_{AA'} = \bar\theta_{AA'}$, and the second pair of equations is just the complex conjugate of the first.

Now it is readily seen that the Dirac and Proca equations have the same structure. They are coupled first-order systems of differential equations for pairs of spinor fields. The only decisive difference is that the spinors have one index if $s = 1/2$ and two indices if $s = 1$.

We obtain a similar result for Maxwell fields. The real tensor $F_{ab}$ has the spinor equivalent

\[ F_{ab}\, \sigma^a{}_{AA'}\, \sigma^b{}_{BB'} = \varphi_{AB}\, \varepsilon_{A'B'} + \bar\varphi_{A'B'}\, \varepsilon_{AB} \]

with a symmetric spinor $\varphi_{AB}$. The spinor form of Maxwell's equations is (Penrose and Rindler 1984)

\[ \partial^{AA'} \varphi_{AB} = 0 \qquad [4] \]

and has the same structure as the Weyl equation. Here we found an example for the power and utility of spinor techniques since they allow the formulation of the wave equations for bosons and fermions in a uniform manner. Only the cases $m > 0$ and $m = 0$ are to be distinguished. Moreover, the above results suggest the way for generalizing the wave equations to higher spin. Therefore, we can already end the discussion of the fields with low spin and take them as special cases of those with arbitrary spin.

Higher Spin in Minkowski Spacetime

Massive Fields

Relativistic wave equations for particles with arbitrary spin were first considered by Dirac (1936). His equations read

\[ \begin{aligned} \partial^{A}{}_{P'}\, \varphi_{AB\ldots DQ'\ldots T'} + m_1\, \chi_{B\ldots DP'Q'\ldots T'} &= 0 \\ \partial_{A}{}^{P'} \chi_{B\ldots DP'Q'\ldots T'} - m_2\, \varphi_{AB\ldots DQ'\ldots T'} &= 0 \end{aligned} \qquad [5] \]
where the spinors $\varphi$ and $\chi$ are of type $(n, k)$ and $(n - 1, k + 1)$, respectively (corresponding to irreducible representations of the restricted Lorentz group $SO^+(1, 3)$). The constants $m_1$ and $m_2$ are mass parameters ($m^2 = -2 m_1 m_2$) and the spin $s$ is one half of the total number of indices of each spinor, $s = (1/2)(n + k)$. As in the preceding section, we assume that electromagnetism and other interactions are absent. We should mention that equations for higher spin were not motivated by observations or empirical facts in that period of time, because only a few elementary particles were known (proton, neutron, electron, positron, and photon), and all of them have low spin (see Table 1). Since that time, particles with $s > 1$ were found in nature, for example, resonances in scattering experiments.

The system [5] allows a uniform description of free fields with arbitrary spin $s > 0$, including Dirac and Proca fields, as we know from the preceding section. (Remark: The symmetrization in eqns [3] can be omitted since the vector field $U$ is divergence-free as a consequence of the second Proca equation.) Various other field equations proposed subsequently can be comprehended as its special cases (Corson 1953). Examples are the Rarita–Schwinger equations for fermions: if they are written in terms of 2-spinors, then one obtains just the system [5] where the spinor $\varphi$ is of type $(s + 1/2, s - 1/2)$ and the spinor $\chi$ is of type $(s - 1/2, s + 1/2)$.

If we apply $\partial_{E}{}^{P'}$ to the first of the equations in [5] and use the second, we obtain

\[ (\Box + m^2)\, \varphi_{AB\ldots DQ'\ldots T'} = 0 \qquad [6a] \]

since the second derivatives commute in flat spacetimes. Similarly,

\[ (\Box + m^2)\, \chi_{B\ldots DP'Q'\ldots T'} = 0 \qquad [6b] \]

so both fields $\varphi$ and $\chi$ satisfy a Klein–Gordon type equation. Moreover, eqns [5] imply that each of $\varphi$ and $\chi$ is divergence-free

\[ \partial^{AQ'} \varphi_{AB\ldots DQ'\ldots T'} = 0 = \partial^{BP'} \chi_{B\ldots DP'Q'\ldots T'} \qquad [7] \]

if they have at least one index of each kind.

In a sense, this procedure can be reversed. Let a symmetric spinor field $\varphi$ be given that satisfies [6a] and [7]. (Remark: A significant example is the Fierz system

\[ (\Box + m^2)\, U_{ab\ldots d} = 0, \qquad \partial^a U_{ab\ldots d} = 0 \]

for a symmetric, tracefree tensor field $U$, since the spinor equivalent of $U$ is of type $(k, k)$.) Define

\[ \chi_{B\ldots DP'Q'\ldots T'} := \partial^{A}{}_{P'}\, \varphi_{AB\ldots DQ'\ldots T'} \]

Then $\chi$ is symmetric in all its indices since $\varphi$ is divergence-free. Further, we obtain

\[ \partial_{E}{}^{P'} \chi_{B\ldots DP'Q'\ldots T'} = \partial_{E}{}^{P'} \partial^{A}{}_{P'}\, \varphi_{AB\ldots DQ'\ldots T'} = -\frac12\, \Box\, \varphi_{EB\ldots DQ'\ldots T'} = \frac{m^2}{2}\, \varphi_{EB\ldots DQ'\ldots T'} \]

since $\varphi$ satisfies the Klein–Gordon equation [6a]. Consequently, the pair $(\varphi, \chi)$ satisfies a system [5]. Obviously, this procedure can be continued: define

\[ \xi_{C\ldots DO'P'Q'\ldots T'} := \partial^{B}{}_{O'}\, \chi_{B\ldots DP'Q'\ldots T'} \]

etc. We obtain a sequence of spinors of type $(0, 2s), (1, 2s - 1), \ldots, (2s, 0)$ each of which is obtainable from its immediate neighbors by a differentiation contracted on one index. Together, these spinors form an invariant exact set (Penrose and Rindler 1984).

The just given arguments show that there is an ambiguity in the system [5]. The spin $s$ fixes only the total number of indices of $\varphi$ and $\chi$. However, their partition into primed and unprimed ones is not a priori fixed. Therefore, we can choose a "convenient" partition for the respective needs.

Massless Fields

If $m = 0$, then the Dirac system [5] is decoupled. Therefore, we have to state a single equation for a single field. Let $\varphi$ be a spinor field of type $(n, 0)$. The massless free-field equation for spin $(1/2)n$ is then taken to be

\[ \partial^{AA'} \varphi_{AB\ldots E} = 0 \qquad [8] \]

More precisely, the solutions of [8] represent left-handed massless particles with helicity $-(1/2)n\hbar$, whereas the solutions of the complex-conjugate form of this equation are right-handed particles (helicity $+(1/2)n\hbar$). Recall that the Weyl equation ($n = 1$) and the source-free Maxwell equation ($n = 2$) have this form. (Remark: The Bianchi identity in Einstein spaces also falls in this category, with the Weyl spinor $\Psi_{ABCD}$ taking the place of $\varphi$. Moreover, we may think of [8] with $n = 4$ as the gauge-invariant equation for the weak vacuum gravitational field.)

The massless field equation [8] can be solved using methods of twistor geometry. Moreover, there is an explicit integral formula for representing massless free fields in terms of arbitrarily chosen null data on a light cone (Penrose and Rindler 1984, 1986, Ward and Wells 1990). We do not discuss eqns [8] in detail since they are generally inconsistent in curved spacetimes if $n > 2$ (see the next
section). We only indicate that each solution of [8] satisfies the second-order wave equation

\[ \Box\, \varphi_{AB\ldots E} = 0 \]

Maxwell's equations imply the existence of an electromagnetic potential (cf. section "Wave equations for low spin in Minkowski spacetime"). This concept can be generalized to higher spin. A "potential" for a spinor field $\varphi_{AB\ldots E}$ of type $(n, 0)$ is a spinor field $\chi_{AB'\ldots E'}$ of type $(1, n - 1)$ such that

\[ \partial^{A}{}_{(A'}\, \chi_{|A|B'\ldots E')} = 0 \qquad [9] \]

and

\[ \varphi_{AB\ldots E} = \partial^{B'}{}_{(B} \cdots \partial^{E'}{}_{E}\, \chi_{A)B'\ldots E'} \qquad [10] \]

One can check in a straightforward manner that a spinor field $\varphi$ that is given by [9] and [10] satisfies the massless equation [8]. If $n > 1$, there is a gauge freedom in these potentials; it turns out to be

\[ \chi_{AB'\ldots E'} \to \chi_{AB'\ldots E'} + \partial_{A(B'}\, \omega_{C'\ldots E')} \]

for any spinor field $\omega$ of type $(0, n - 2)$. Furthermore, the general massless field $\varphi$ can locally be expressed in this way (Penrose and Rindler 1986).

Wave Equations in Curved Spacetimes, Consistency Conditions

First of all we emphasize that Hamilton's principle of stationary action is extremely important in field theories (see, e.g., Schmutzer (1968)). Assume that the Lagrangian $L$ contains at most first derivatives of a field $\psi$: $L = L(\psi(x), \partial_a \psi(x))$. "Special relativity" states that $L$ is invariant under Lorentz transformations. The Euler–Lagrange equations with respect to variation of $\psi$ read

\[ \frac{\partial L}{\partial \psi} - \partial_a \frac{\partial L}{\partial(\partial_a \psi)} = 0 \qquad [11] \]

and these are the field equations that $\psi$ is required to satisfy.

In "general relativity," the Lagrangian $L$ has to be generally covariant. So we have $L = L(\psi(x), \nabla_a \psi(x))$ and the Euler–Lagrange equations

\[ \frac{\partial L}{\partial \psi} - \nabla_a \frac{\partial L}{\partial(\nabla_a \psi)} = 0 \qquad [12] \]

emerge. If we assume that the Lagrangian $L$ does not contain the curvature tensors and their derivatives explicitly and compare [11] and [12], then it is easily seen how the wave equations in curved spacetime can be obtained: by simply replacing the flat metric and connection with their curved versions. This procedure is called the "principle of minimal coupling."

All equations for low spin in Minkowski spacetime are the Euler–Lagrange equations of a variation principle (see Table 2). Consequently, they can be extended to curved spacetime by simply using the principle of minimal coupling. The arising equations are perfectly acceptable. No complications arise, and so we do not repeat them in this section.

If $s > 1$, then neither the massive nor the massless wave equations follow from a variation principle without supplementary conditions. Nevertheless, we can try to generalize the equations of the previous section to a curved spacetime by formally replacing the flat metric and connection with their curved versions, too. However, serious problems arise: Let us first consider massless fields of helicity $-(1/2)n\hbar$. The principle of minimal coupling yields

\[ \nabla^{A}{}_{A'}\, \varphi_{AB\ldots E} = 0 \qquad [13] \]

If we apply $\nabla^{A'}{}_{F}$ to this equation, we obtain

\[ \nabla^{A'}{}_{F}\, \nabla^{A}{}_{A'}\, \varphi_{AB\ldots E} = 0 \]

Since the covariant derivatives do not commute with each other, the term on the left-hand side is not completely symmetric in the unprimed indices. Therefore, this equation can be decomposed into two nontrivial irreducible parts if $n > 1$: symmetrization yields the covariant D'Alembert equation

\[ \nabla^a \nabla_a\, \varphi_{B\ldots EF} = 0 \]

as required, while antisymmetrization yields by use of the spinor Ricci identities

\[ (n - 2)\, \Psi^{KLM}{}_{(C}\, \varphi_{D\ldots E)KLM} = 0 \qquad [14] \]

where $\Psi_{ABCD}$ is the Weyl spinor. If $n > 2$ and the spacetime is not conformally flat, then this algebraic consistency condition effectively renders eqn [13] useless as physical field equations.

If $m > 0$, the situation is not better. In a somewhat similar way, we obtain the algebraic consistency conditions

\[ \begin{aligned} (n - 2)\, \Psi^{KLM}{}_{(C}\, \varphi_{D\ldots E)KLMQ'P'\ldots T'} + k\, \Phi^{KLX'}{}_{(Q'}\, \varphi_{|KLC\ldots E|P'\ldots T')X'} &= 0 \quad (n > 1) \\ (k - 1)\, \bar\Psi^{X'Y'Z'}{}_{(S'}\, \chi_{|B\ldots DX'Y'Z'|T'\ldots U')} + (n - 1)\, \Phi^{KX'Y'}{}_{(B}\, \chi_{C\ldots D)KX'Y'S'T'\ldots U'} &= 0 \quad (k > 0) \end{aligned} \qquad [15] \]

if the spinor field $\varphi$ is of type $(n, k)$ (Buchdahl 1962).

We remark that similar consistency conditions occur if we have no gravitation, but an interaction with an electromagnetic field. Then the partial
Wave Equations for Arbitrary Spin without Consistency Conditions

Massive Fields

The ansatz which leads to the desired result is surprisingly simple. We avoid the ambiguity in the Dirac system [5] that has been discussed earlier as well as any consistency condition if we state the wave equations

\[ \begin{aligned} \nabla^{A}{}_{P'}\, \varphi_{AB\ldots E} + m_1\, \chi_{B\ldots EP'} &= 0 \\ \nabla^{P'}{}_{(A}\, \chi_{B\ldots E)P'} - m_2\, \varphi_{AB\ldots E} &= 0 \end{aligned} \qquad [16] \]

This system was first proposed by Wünsch (1985); it is equivalent to a pair of equations given by Buchdahl (1982) which contains the Weyl spinor explicitly. As before, $\varphi$ and $\chi$ are symmetric spinor fields, $\varphi$ has $n$ unprimed indices (and no others) and the constants $m_1$, $m_2$ are mass parameters ($m^2 = -2 m_1 m_2$). We assume $m_1 \ne 0$ in this section. Obviously, the Dirac and Proca equations are

This is a linear second-order equation of normal hyperbolic type for the spinor field $\varphi$. It can be used to solve Cauchy's problem for the system [16]. Similarly, we get a second-order equation for $\chi$:

\[ \nabla^a \nabla_a\, \chi_{B\ldots EP'} - 2(n - 1)\, \Phi_{(B}{}^{K}{}_{|P'|}{}^{W'} \chi_{C\ldots E)KW'} + \Bigl( \frac{R}{4} + m^2 \Bigr) \chi_{B\ldots EP'} = 2\, \frac{n - 1}{n}\, \nabla_{(B|P'|}\, \nabla^{KW'} \chi_{C\ldots E)KW'} \qquad [20] \]

Seemingly this is not an equation of hyperbolic type if $n > 1$. However, the second derivatives of $\chi$ on the right-hand side of [20] can be eliminated using [17]. Therefore, if the spinor field $\varphi$ is already known by solving [19], then [20] is an equation of Klein–Gordon type, too. However, it is generally inhomogeneous if $n > 2$. A wave equation that contains the spinor field $\chi$ alone exists only if $n = 1$, $n = 2$, or the spacetime is conformally flat.
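In the flat-space limit, the second-order equations obtained by iterating a first-order system of this kind rest on the algebraic identity $\partial_{E}{}^{P'} \partial^{A}{}_{P'} = -\tfrac12\, \delta_E{}^A\, \Box$, which is also what ties the mass parameters together as $m^2 = -2 m_1 m_2$. The sketch below checks the momentum-space version, $p_{E}{}^{P'} p^{A}{}_{P'} = -\tfrac12\, p_a p^a\, \delta_E{}^A$, for a random 4-vector. It is an illustration only: the explicit components chosen for $\varepsilon_{AB}$ and for $p^{AA'}$ are assumptions of this sketch (Penrose–Rindler-style conventions), and all function names are hypothetical.

```python
import math
import random

# eps_{AB}; in the convention assumed here eps^{AB} has the same components.
EPS = [[0.0, 1.0], [-1.0, 0.0]]


def momentum_spinor(p):
    """Components p^{AA'} of a real 4-vector p = (p0, p1, p2, p3) (assumed convention)."""
    p0, p1, p2, p3 = p
    s = 1.0 / math.sqrt(2.0)
    return [[s * (p0 + p3), s * complex(p1, p2)],
            [s * complex(p1, -p2), s * (p0 - p3)]]


def contracted_square(P):
    """M[E][A] = p_E^{P'} p^A_{P'} with p_E^{P'} = p^{BP'} eps_{BE}
    and p^A_{P'} = p^{AQ'} eps_{Q'P'}."""
    M = [[0j, 0j], [0j, 0j]]
    for E in range(2):
        for A in range(2):
            for Pp in range(2):
                first = sum(P[B][Pp] * EPS[B][E] for B in range(2))
                second = sum(P[A][Q] * EPS[Q][Pp] for Q in range(2))
                M[E][A] += first * second
    return M


def minkowski_square(p):
    """p_a p^a with signature (+, -, -, -)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def mass_squared(m1, m2):
    """Mass relation m^2 = -2 m1 m2 for the first-order system."""
    return -2 * m1 * m2


random.seed(0)
p = [random.uniform(-1.0, 1.0) for _ in range(4)]
M = contracted_square(momentum_spinor(p))
expected = -0.5 * minkowski_square(p)
print(M, expected)
```

With the Dirac choice $m_1 = m_2 = im/\sqrt2$, `mass_squared` returns $m^2$, which is the consistency check between the first-order system and the Klein–Gordon equation.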
Now we are going to discuss the "Cauchy problem" for the wave equations [16] (for details see Wünsch (1985)). Let a spacelike hypersurface $S$ be given and let $n^a$ denote the future-directed unit normal vector on $S$ and $\nabla_n = n^a \nabla_a$. The local Cauchy problem is to find a solution $(\varphi, \chi)$ of [16] with given Cauchy data $\varphi_0$, $\chi_0$ on $S$.

In general, the initial data $\varphi_0$ and $\chi_0$ cannot be prescribed arbitrarily. Suppose that a solution $(\varphi, \chi)$ of [16] does exist. Then the differential equations have to be satisfied on $S$, too. Thus, we obtain

\[ (\nabla_n \varphi_{AB\ldots E})\big|_S = 2\, n_{A}{}^{A'} \bigl( \tilde\nabla^{F}{}_{A'}\, \varphi_{B\ldots EF} + m_1\, \chi_{B\ldots EA'} \bigr)\big|_S \qquad [21] \]

where the differential operator $\tilde\nabla_{AA'} = \nabla_{AA'} - n_{AA'} \nabla_n$ is just the tangential part of $\nabla_{AA'}$ with respect to $S$. Therefore, the right-hand side of [21] is completely determined by the initial data. Now the symmetry of the solution $\varphi_{AB\ldots E}$ implies the symmetry of $\nabla_n \varphi_{AB\ldots E}$. Consequently, the right-hand side of [21] has to be symmetric with respect to the unprimed indices and so we obtain the following constraints for the initial data if $\varphi$ has at least two indices:

\[ n^{BA'} \bigl( \tilde\nabla^{F}{}_{A'}\, (\varphi_0)_{B\ldots EF} + m_1\, (\chi_0)_{B\ldots EA'} \bigr)\big|_S = 0 \qquad [22] \]

for a spinor field $\chi$ of type $(1, n - 1)$. This is just eqn [9] for the potential of a massless field. We will show that [23] is a satisfactory equation in a generally curved spacetime (Illge 1988). Unfortunately, no Lagrangian has been found if $n > 1$.

To begin with, we remark that there is a gauge freedom in curved spacetimes, too, since the solution of [23] cannot be uniquely determined if $n > 1$. We use this freedom to prescribe the divergence of $\chi$. So let an arbitrary spinor field $\omega$ of type $(0, n - 2)$ be given. We consider eqns [23] and

\[ \nabla^{AB'} \chi_{AB'C'\ldots E'} = \omega_{C'\ldots E'} \]

or, together,

\[ \nabla^{A}{}_{A'}\, \chi_{AB'\ldots E'} = -\frac{n - 1}{n}\, \varepsilon_{A'(B'}\, \omega_{C'\ldots E')} \qquad [24] \]

If we apply $\nabla^{A'}{}_{B}$ to this equation, we obtain using the spinor Ricci identities

\[ \nabla^a \nabla_a\, \chi_{BB'\ldots E'} - 2(n - 1)\, \Phi_{B}{}^{K}{}_{(B'}{}^{W'} \chi_{|K|C'\ldots E')W'} + \frac{R}{4}\, \chi_{BB'\ldots E'} = \frac{2(n - 1)}{n}\, \nabla_{B(B'}\, \omega_{C'\ldots E')} \qquad [25] \]
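The step that combines the symmetrized equation [23] and the divergence condition into the single equation [24] is purely algebraic: a spinor $T_{A'B'\ldots E'}$ that is symmetric in its last $n - 1$ indices splits into its totally symmetric part plus a multiple of $\varepsilon_{A'(B'}\, \omega_{C'\ldots E')}$, where $\omega_{C'\ldots E'} = \varepsilon^{B'A'} T_{A'B'C'\ldots E'}$. The sketch below checks this numerically for $n = 3$. It assumes that $\varepsilon^{A'B'}$ and $\varepsilon_{A'B'}$ share the components $\varepsilon_{0'1'} = 1$ and that indices are raised with $\xi^{A'} = \varepsilon^{A'B'} \xi_{B'}$; under those assumptions the coefficient comes out as $-(n - 1)/n$, and its sign depends on them. All names in the code are hypothetical.

```python
import itertools
import random

# eps_{A'B'}; in the convention assumed here eps^{A'B'} has the same components.
EPS = [[0.0, 1.0], [-1.0, 0.0]]


def random_tensor():
    """Random T_{A'B'C'} that is symmetric in its last two indices (case n = 3)."""
    T = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for a in range(2):
        for b in range(2):
            for c in range(b, 2):
                T[a][b][c] = T[a][c][b] = random.uniform(-1.0, 1.0)
    return T


def divergence(T):
    """omega_{C'} = eps^{B'A'} T_{A'B'C'}: raise the first index, contract with the second."""
    return [sum(EPS[b][a] * T[a][b][c] for a in range(2) for b in range(2))
            for c in range(2)]


def symmetric_part(T):
    """Total symmetrization T_{(A'B'C')}."""
    S = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for a, b, c in itertools.product(range(2), repeat=3):
        perms = list(itertools.permutations((a, b, c)))
        S[a][b][c] = sum(T[i][j][k] for i, j, k in perms) / len(perms)
    return S


def trace_part(omega, n=3):
    """-(n - 1)/n * eps_{A'(B'} omega_{C')} for n = 3."""
    coeff = -(n - 1) / n
    return [[[coeff * 0.5 * (EPS[a][b] * omega[c] + EPS[a][c] * omega[b])
              for c in range(2)] for b in range(2)] for a in range(2)]


random.seed(1)
T = random_tensor()
S = symmetric_part(T)
R = trace_part(divergence(T))
max_error = max(abs(T[a][b][c] - S[a][b][c] - R[a][b][c])
                for a, b, c in itertools.product(range(2), repeat=3))
print(max_error)
```

If the totally symmetric part vanishes, as [23] demands, only the $\varepsilon\,\omega$ term survives, which is the content of [24].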
fact is not surprising, since eqn [23] is a consistent one, whereas [13] is inconsistent.

We continue with some remarks on "conformal rescalings of the metric." The equations for massless fields have to be invariant with respect to such transformations. Therefore, the "curved space" scalar wave equation is

\[ \Bigl( \Box + \frac{R}{6} \Bigr) \varphi = 0 \qquad [26] \]

Further, the equations

\[ \nabla^{A}{}_{(A'}\, \chi_{|AB\ldots E|B'\ldots F')} = 0 \qquad [27] \]

for any spinor field $\chi$ of type $(n, k)$ are conformally invariant (Penrose and Rindler 1984). Especially, eqns [23] for the massless potential and [13] for the massless field have this property.

We mention a further special case of [27]. If $\chi$ is of type $(k + 1, k)$, then these equations are consistent, too (Frauendiener and Sparling 1999). The Cauchy problem is well posed and a Lagrangian is known. Unfortunately, the solutions do not satisfy a wave equation of second order if $k > 0$.

We conclude with the discussion of the Cauchy problem for eqn [24]. As in the preceding section, let a spacelike hypersurface $S$ and initial data $\chi_0$ on $S$ be given. We can state:

Theorem 3 If a symmetric spinor field $\omega$ of type $(0, n - 2)$ is given, then there exists a neighborhood of $S$ in which eqn [24] has one and only one solution $\chi$ satisfying $\chi|_S = \chi_0$.

The proof is given in Illge (1988). We emphasize that there are no constraints on the Cauchy data for the massless equation [24].

Petrov type N, III or D spacetimes as well as those with $\nabla_{[a} R_{b]c} = 0$.

See also: Clifford Algebras and Their Representations; Dirac Fields in Gravitation and Nonabelian Gauge Theory; Euclidean Field Theory; Evolution Equations: Linear and Nonlinear; Spinors and Spin Coefficients; Standard Model of Particle Physics; Twistors.

Further Reading

Buchdahl HA (1962) On the compatibility of relativistic wave equations in Riemann spaces. Nuovo Cimento 25: 486–496.
Buchdahl HA (1982) On the compatibility of relativistic wave equations in Riemann spaces II. Journal of Physics A 15: 1–5.
Corson EM (1953) Introduction to Tensors, Spinors, and Relativistic Wave-Equations. London and Glasgow: Blackie and Son Ltd.
Dirac PAM (1936) Relativistic wave equations. Proceedings of the Royal Society London Series A 155: 447–459.
Fierz M and Pauli W (1939) On relativistic wave equations for particles of arbitrary spin in an electromagnetic field. Proceedings of the Royal Society London Series A 173: 211–232.
Frauendiener J and Sparling GAJ (1999) On a class of consistent higher spin equations on curved manifolds. Journal of Geometry and Physics 30: 54–101.
Greiner W (1997) Relativistic Quantum Mechanics – Wave Equations, 2nd edn. Berlin: Springer.
Illge R (1988) On potentials for several classes of spinor and tensor fields in curved spacetimes. General Relativity and Gravitation 20: 551–564.
Illge R (1993) Massive fields of arbitrary spin in curved space-times. Communications in Mathematical Physics 158: 433–457.
Penrose R and Rindler W (1984) Spinors and Space-Time, Two-Spinor Calculus and Relativistic Fields, vol. 1. Cambridge: Cambridge University Press.
Penrose R and Rindler W (1986) Spinors and Space-Time, Spinor and Twistor Methods in Space-Time Geometry, vol. 2. Cambridge: Cambridge University Press.
In contrast to massive fields we are far away from Schmutzer E (1968) Relativistische Physik. Leipzig: Teubner-
an answer to the question whether Huygens princi- Verlag.
ple is valid for the massless equations. A particular Ward RS and Wells RO (1990) Twistor Geometry and Field
result is Wünsch (1994): Theory. Cambridge: Cambridge University Press.
Wünsch V (1985) Cauchy’s problem and Huygens’ principle for
Theorem 4 Huygens’ principle for the conformally relativistic higher spin wave equations in an arbitrarily curved
invariant scalar wave equation [26], the Weyl, and space-time. General Relativity and Gravitation 17: 15–38.
Wünsch V (1994) Moments and Huygens’ principle for
the Maxwell equations is valid only for conformally conformally invariant field equations in curved space-times.
flat and plane wave metrics within the classes of Annales de l’Institut Henri Poincaré – Physique théorique
centrally symmetric, recurrent, (2, 2)-decomposable, 60: 433–455.
Renormalization: General Theory 399
Introduction

Quantum field theories (QFTs) provide a natural framework for quantum theories that obey the principles of special relativity. Among their most striking features are ultraviolet (UV) divergences, which at first sight invalidate the existence of the theories. The divergences arise from Fourier modes of very high wave number, and hence from the structure of the theories at very short distances. In the very restricted class of theories called “renormalizable,” the divergences may be removed by a singular redefinition of the parameters of the theory. This is the process of renormalization that defines a QFT as a nontrivial limit of a theory with a UV cutoff.

A very important QFT is the standard model, an accurate and successful theory for all the known interactions except gravity. Calculations using renormalization and related methods are vital to the theory’s success.

The basic idea of renormalization predates QFT. Suppose we treat an observed electron as a combination of a bare electron of mass $m_0$ and the associated classical electromagnetic field down to a radius $a$. The observed mass of the electron is its bare mass plus the energy in the field (divided by $c^2$). The field energy is substantial, for example, 0.7 MeV when $a = 10^{-15}$ m, and it diverges when $a \to 0$. The observed mass, 0.5 MeV, is the sum of the large (or infinite) field contribution compensated by a negative and large (or infinite) bare mass. This calculation needs replacing by a more correct version for short distances, of course, but it remains a good motivation.

In this article, we review the theory of renormalization in its classic form, as applied to weak-coupling perturbation theory, or Feynman graphs. It is this method, rather than the Wilsonian approach (see Exact Renormalization Group), that is typically used in practice for perturbative calculations in the standard model, especially its QCD part.

Much of the emphasis is on weak-coupling perturbation theory, where there are well-known algorithmic rules for performing calculations and renormalization. Applications (see Quantum Chromodynamics for some important nontrivial examples) involve further related results, such as the operator

Formulation of QFT

A QFT is specified by its Lagrangian density. A simple example is $\phi^4$ theory:

\[ L \overset{?}{=} \frac{(\partial\phi)^2}{2} - \frac{m^2\phi^2}{2} - \frac{\lambda\phi^4}{4!} \qquad [1] \]

where $\phi(x) = \phi(t, \mathbf{x})$ is a single component Hermitian field. The Lagrangian density and the resulting equation of motion, $\partial^2\phi + m^2\phi + (1/6)\lambda\phi^3 = 0$, are local; they involve only products of fields at the same spacetime point. Such locality is characteristic of relativistic theories, where otherwise it is difficult or impossible to preserve causality, but it is also the source of the UV divergences. The question mark over the equality symbol in eqn [1] is a reminder that renormalization of UV divergences will force us to modify the equation.

The Feynman rules for perturbation theory are given by a free propagator $i/(p^2 - m^2 + i0)$ and an interaction vertex $-i\lambda$. Although we will usually work in four spacetime dimensions, it is useful also to consider the theory in a general spacetime dimensionality $n$, where the coupling has energy dimension $[\lambda] = E^{4-n}$. We use “natural units,” that is, with $\hbar = c = 1$. The “$i0$” in the propagator $i/(p^2 - m^2 + i0)$ symbolizes the location of the pole relative to the integration contour; it is often written as $i\epsilon$.

The primary targets of calculations are the vacuum expectation values of time-ordered products of $\phi$; in QFT these are called the Green functions of the theory. From these can be reconstructed the scattering matrix, scattering cross sections, and other measurable quantities.

One-Loop Calculations

Low-order graphs for the connected and amputated four-point Green function are shown in Figure 1. Each one-loop graph has the form

\[ -i\lambda^2 I(p^2) \overset{?}{=} \frac{\lambda^2}{2} \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{(k^2 - m^2 + i0)\,[(p-k)^2 - m^2 + i0]} \qquad [2] \]

where $p$ is a combination of external momenta. There is a divergence from where the loop
momentum $k$ goes to infinity. We define the degree of divergence, $\Delta$, by counting powers of $k$ at large $k$, to get $\Delta = 0$. In an $n$-dimensional spacetime we would have $\Delta = n - 4$. The integral is divergent whenever $\Delta \ge 0$. Comparing the dimensions of the one-loop and tree graphs shows that $\Delta$ equals the negative of the energy dimension of the coupling $\lambda$. Thus, the dimensionlessness of $\lambda$ at the physical spacetime dimension is equivalent to the integral being just divergent.

Figure 1 One-loop approximation to connected and amputated four-point function, before renormalization.

Figure 2 One-loop approximation to renormalized connected and amputated four-point function, with counter-term.

The infinity in the integral implies that the theory in its naive formulation is not defined. With the aid of RG methods, it has been shown that the problem is with the complete theory, not just perturbation theory.

The divergence only arises because we use a continuum spacetime. So suppose that we formulate the theory initially on a lattice of spacing $a$ (in space or spacetime). Our loop graph is now

\[ -i\lambda^2 I(p, m; a) = -\frac{\lambda^2}{32\pi^4} \int d^4k\; S(k, m; a)\, S(p - k, m; a) \qquad [3] \]

where the free propagator $S(k, m; a)$ approaches the usual value $i/(k^2 - m^2 + i0)$ when $k$ is much smaller than $1/a$, and it falls off more rapidly for large $k$. The basic observation that propels the renormalization program is that the divergence as $a \to 0$ is independent of $p$. This is most easily seen by differentiating once with respect to $p$, after which the integral is convergent when $a = 0$, because the differentiated integral has degree of divergence $-1$. Thus we can cancel the divergence in eqn [2] by replacing the coupling $\lambda$ in the first term in Figure 1, by the so-called bare coupling

\[ \lambda_0 = \lambda + 3A(a)\lambda^2 + O(\lambda^3) \qquad [4] \]

Here $A(a)$ is chosen so that the renormalized value of our one-loop graph,

\[ -i\lambda^2 I_R(p^2, m^2) = -i\lambda^2 \lim_{a \to 0}\, [\, I(p, m; a) + A(a)\, ] \qquad [5] \]

exists, at $a = 0$, with $A(a)$ in fact being real valued. The factor 3 multiplying $A(a)$ in eqn [4] is because there are three one-loop graphs, with equal divergent parts. The replacement for the coupling is made in the tree graph in Figure 1, but not yet at the vertices of the other graphs, because at the moment we are only doing a calculation accurate to order $\lambda^2$; the appropriate expansion parameter of the theory is the finite renormalized coupling $\lambda$, held fixed as $a \to 0$. We call the extra term in eqn [5] a counter-term. The diagrams for the correct renormalized calculation are represented in Figure 2, which has a counter-term graph compared with Figure 1.

In the physics terminology, used here, the cutting-off of the divergence by using a modified theory is called a regularization. This contrasts with the mathematics literature, where “regularized integral” usually means the same as a physicist’s “renormalized integral.”

There is always freedom to add a finite term to a counter-term. When we discuss the RG, we will see that this corresponds to a reorganization of the perturbation expansion and provides a powerful tool for improving perturbatively based calculations, especially in QCD. Contrary to the impression given in some parts of the literature, it is not necessary that a renormalized mass equal a corresponding physical particle mass, with similar statements for coupling and field renormalization. While such a prescription is common and natural in a simple theory like QED, it is by no means required and certainly may not always be best. If nothing else, the correspondence between fields and stable particles may be poor or nonexistent (as in QCD).

One classic possibility is to subtract the value of the graph at $p = 0$, a prescription associated with Bogoliubov, Parasiuk, and Hepp (BPH), which leads to

\[ -i\lambda^2 I_{R,\mathrm{BPH}}(p^2) = -\frac{i\lambda^2}{32\pi^2} \int_0^1 dx\, \ln\!\left[ 1 - \frac{p^2 x(1-x)}{m^2} \right] \qquad [6] \]

In obtaining this from [2], we used a standard Feynman parameter formula,

\[ \frac{1}{AB} = \int_0^1 dx\, \frac{1}{[Ax + B(1-x)]^2} \qquad [7] \]

to combine the propagator denominators, after which the integral over the momentum variable $k$ is elementary. We then obtain the renormalized one-loop (four-point and amputated) Green function

\[ -i\lambda - i\lambda^2\, [\, I_R(s) + I_R(t) + I_R(u)\, ] + O(\lambda^3) \qquad [8] \]

where $s$, $t$, and $u$ are the three standard Mandelstam invariants for the Green function. (For a $2 \to 2$
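The Feynman parameter formula of eqn [7] and the BPH-subtracted integral of eqn [6] are easy to spot-check numerically. The following is a minimal sketch, not part of the original treatment: the function names are hypothetical, the denominators A and B are assumed real and positive, and the momentum is assumed spacelike ($p^2 \le 0$) so that the $i0$ prescription plays no role.

```python
import math

def midpoint_integral(f, n=20000):
    """Midpoint-rule estimate of the integral of f over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def feynman_parameter_rhs(A, B):
    """Right-hand side of eqn [7], assuming A > 0 and B > 0."""
    return midpoint_integral(lambda x: 1.0 / (A * x + B * (1.0 - x)) ** 2)

def bph_bracket(p2, m2):
    """Integral of ln[1 - p^2 x(1-x)/m^2] from eqn [6], assuming p2 <= 0 < m2."""
    return midpoint_integral(lambda x: math.log(1.0 - p2 * x * (1.0 - x) / m2))

if __name__ == "__main__":
    A, B = 2.0, 5.0
    print(feynman_parameter_rhs(A, B), 1.0 / (A * B))  # the two values should agree
    print(bph_bracket(0.0, 1.0))                       # subtraction point p = 0
```

By construction the BPH bracket vanishes at $p = 0$, which is the statement that this scheme subtracts the value of the graph at zero momentum.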
interaction. The counter-terms are expanded in powers of $\lambda$, and then all graphs involving counter-term vertices at the chosen order in $\lambda$ are added to the calculation. The counter-terms are arranged to cancel all the divergences, so that the UV regulator can be removed, with $m$ and $\lambda$ held fixed. The counter-terms cancel the parts of the basic Feynman graphs associated with large loop momenta. An algorithmic specification of the otherwise arbitrary finite parts of the counter-terms is called a renormalization prescription or a renormalization scheme. Thus, it gives a definite relation between the renormalized and bare parameters, and hence a definite specification of the partitioning of $L$ into its three parts.

It has been proved that this procedure works to all orders in $\lambda$, with corresponding results for other theories. Even in the absence of fully rigorous nonperturbative proofs, it appears clear that the results extend beyond perturbation theory, at least in asymptotically free theories like QCD: see the discussion on Wilsonian RG (see Exact Renormalization Group).

Dimensional Regularization and Minimal Subtraction

The final result for renormalized graphs does not depend on the particular regularization procedure. A particularly convenient procedure, especially in QCD, is dimensional regularization, where divergences are removed by going to a low spacetime dimension $n$. To make a useful regularization method, $n$ is treated as a continuous variable, $n = 4 - 2\epsilon$.

Great advantages of the method are that it preserves Poincaré invariance and many other symmetries (including the gauge symmetry of QCD), and that Feynman graph calculations are minimally more complicated than for finite graphs at $n = 4$, particularly when all the lines are massless, as in many QCD calculations.

Although there is no such object as a genuine vector space of finite noninteger dimension, it is possible to construct an operation that behaves as if it were an integration over such a space. The operation was proved unique by Wilson, and explicit constructions have been made, so that consistency is assured at the level of all Feynman graphs. Whether a satisfactory definition beyond perturbation theory exists remains to be determined.

It is convenient to arrange that the renormalized coupling is dimensionless in the regulated theory. This is done by changing the normalization of $\lambda$ with the aid of an extra parameter, the unit of mass $\mu$:

\[ \lambda_0 = \mu^{2\epsilon}\, (\lambda + \text{counter-terms}) \qquad [15] \]

with $\lambda$ and $\mu$ being held fixed when $\epsilon \to 0$. (Thus, the basic interaction in eqn [13] is changed to $-\mu^{2\epsilon}\lambda\phi^4/4!$.) Then for the one-loop graph of eqn [2], dimensionally regularized Feynman parameter methods give

\[ -i\lambda^2 I(p, m, \epsilon) = \frac{i\lambda^2}{32\pi^2}\, (4\pi)^{\epsilon}\, \Gamma(\epsilon) \int_0^1 dx \left[ \frac{m^2 - p^2 x(1-x) - i0}{\mu^2} \right]^{-\epsilon} \qquad [16] \]

A natural renormalization procedure is to subtract the pole at $\epsilon = 0$, but it is convenient to accompany this with other factors to remove some universally occurring finite terms. So $\overline{\mathrm{MS}}$ renormalization (“modified minimal subtraction”) is defined by using the counter-term

\[ -iA(\epsilon)\lambda^2 = -i\, \frac{\lambda^2 S_\epsilon}{32\pi^2 \epsilon} \qquad [17] \]

where $S_\epsilon \overset{\mathrm{def}}{=} (4\pi e^{-\gamma_E})^{\epsilon}$, with $\gamma_E = 0.5772\ldots$ being the Euler constant. This gives a renormalized integral (at $\epsilon = 0$)

\[ -\frac{i\lambda^2}{32\pi^2} \int_0^1 dx\, \ln\!\left[ \frac{m^2 - p^2 x(1-x)}{\mu^2} \right] \qquad [18] \]

which can be evaluated easily. A particularly simple result is obtained at $m = 0$:

\[ \frac{i\lambda^2}{32\pi^2} \left[ -\ln\!\left( \frac{-p^2}{\mu^2} \right) + 2 \right] \qquad [19] \]

This formula symptomizes important and very useful algorithmic simplifications in the higher-order massless calculations common in QCD.

The $\overline{\mathrm{MS}}$ scheme amounts to a de facto standard for QCD. At higher orders a factor of $S_\epsilon^L$ is used in the counter-terms, with $L$ being the number of loops.

Coordinate Space

Quantum fields are written as if they are functions of $x$, but they are in fact distributions or generalized functions, with quantum-mechanical operator values. This indicates that using products of fields is dangerous and in need of careful definition. The relation with ordinary distribution theory is simplest in the coordinate-space version of Feynman graphs. Indeed in the 1950s, Bogoliubov and Shirkov formulated renormalization as a problem of defining products of the singular numeric-valued distributions in coordinate-space Feynman graphs; theirs was perhaps the best treatment of renormalization in that era.
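As a numerical cross-check of the minimal-subtraction results of the preceding section, the Feynman-parameter integral of eqn [18] can be evaluated directly and compared with the massless closed form of eqn [19]. This is a minimal sketch under stated assumptions: the function names are hypothetical, the common prefactor $i\lambda^2/32\pi^2$ is stripped off, and the momentum is taken spacelike ($p^2 < 0$) so that the logarithm is real.

```python
import math

def msbar_bracket(p2, m2, mu2, n=20000):
    """Midpoint-rule estimate of -int_0^1 dx ln[(m2 - p2*x*(1-x))/mu2].

    Assumes p2 <= 0 so the argument of the logarithm stays positive.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoints avoid the endpoint singularity at m2 = 0
        total += math.log((m2 - p2 * x * (1.0 - x)) / mu2)
    return -total * h

def massless_closed_form(p2, mu2):
    """Bracket of eqn [19]: -ln(-p^2/mu^2) + 2, for p2 < 0."""
    return -math.log(-p2 / mu2) + 2.0

if __name__ == "__main__":
    p2, mu2 = -4.0, 1.0
    print(msbar_bracket(p2, 0.0, mu2), massless_closed_form(p2, mu2))
```

The constant 2 in eqn [19] comes from $\int_0^1 dx\, \ln[x(1-x)] = -2$. Choosing $\mu^2$ close to $-p^2$ makes the logarithm small, which is the observation later exploited by RG methods.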
For example, the coordinate-space version of eqn [5] is

\[ -\lambda^2 \lim_{a \to 0} \int d^4x\, d^4y\; f(x, y) \left[ \tfrac{1}{2}\, \tilde{S}(x - y; m, a)^2 + iA(a)\, \delta^{(4)}(x - y) \right] \qquad [20] \]

where $x$ and $y$ are the coordinates for the interaction vertices, $f(x, y)$ is the product of external-line free propagators, and $\tilde{S}(x - y; m, a)$ is the coordinate-space free propagator, which at $a = 0$ has a singularity

\[ \frac{1}{4\pi^2\, [-(x - y)^2 + i0]} \qquad [21] \]

as $(x - y)^2 \to 0$. We see in eqn [20] a version of the Hadamard finite part of a divergent integral, and renormalization theory generalizes this to particular kinds of arbitrarily high-dimension integrals. The physical realization and justification of the use of the finite-part procedure is in terms of renormalization of parameters in the Lagrangian; this also gives the procedure a significance that goes beyond the integrals themselves and involves the full nonperturbative formulation of QFT.

General Counter-Term Formulation

We have written $L$ as a basic Lagrangian density plus counter-terms, and have seen in an example how to cancel divergences at one-loop order. In this section, we will see how the procedure works to all orders. The central mathematical tool is Bogoliubov’s R-operation. Here the counter-terms are expanded as a sum of terms, one for each basic one-particle irreducible (1PI) graph with a non-negative degree of divergence. To each basic graph for a Green function is added a set of counter-term graphs associated with divergences for subgraphs. The central theorem of renormalization is that this procedure does in fact remove all the UV divergences, with the form of the counter-terms being determined by the simple computation of the degree of divergence for 1PI graphs.

To see the essential difficulty to be solved, consider a two-loop graph like the first one in Figure 3. Its divergence is not a polynomial in external momenta, and is therefore not canceled by an allowed counter-term. This is shown by differentiation with respect to external momenta, which does not produce a finite result because of the divergent one-loop subgraph. But for consistency of the theory, the one-loop counter-terms already computed must be themselves put into loop graphs. Among others, this gives the second graph of Figure 3, where the cross denotes that a counter-term contribution is used. The contribution used here is actually 2/3 of the total one-loop counter-term, for reasons of symmetry factors that are not fully evident at first sight. The remainder of the one-loop coupling renormalization cancels a subdivergence in another two-loop graph. It is readily shown that the divergence of the sum of the first two graphs in Figure 3 is momentum independent, and thus can be canceled by a vertex counter-term.

Figure 3 A two-loop graph and its counter-terms. The label B indicates that it is the two-loop overall counter-term for this graph.

This method is fully general, and is formalized in the Bogoliubov R-operation, which gives a recursive specification of the renormalized value $R(G)$ of a graph $G$:

\[ R(G) \overset{\mathrm{def}}{=} G + \sum_{\{\gamma_1, \ldots, \gamma_n\}} G\big|_{\gamma_i \to C(\gamma_i)} \qquad [22] \]

The sum is over all sets of nonintersecting 1PI subgraphs of $G$, and the notation $G|_{\gamma_i \to C(\gamma_i)}$ denotes $G$ with all the subgraphs $\gamma_i$ replaced by associated counter-terms $C(\gamma_i)$. The counter-term $C(\gamma)$ of a 1PI graph $\gamma$ has the form

\[ C(\gamma) \overset{\mathrm{def}}{=} -T\,(\gamma + \text{counter-terms for subdivergences}) \qquad [23] \]

Here $T$ is an operation that extracts the divergent part of its argument and whose precise definition gives the renormalization scheme. For example, in minimal subtraction we define

\[ T(\Gamma) = \text{pole part at } \epsilon = 0 \text{ of } \Gamma \qquad [24] \]

We formalize the term inside parentheses in eqn [23] as

\[ \bar{R}(\gamma) \overset{\mathrm{def}}{=} \gamma + \text{counter-terms for subdivergences} = \gamma + {\sum_{\{\gamma_1, \ldots, \gamma_n\}}}' \gamma\big|_{\gamma_i \to C(\gamma_i)} \qquad [25] \]

where the prime on the $\sum'$ denotes that we sum over all sets of nonintersecting 1PI subgraphs except for the case that there is a single $\gamma_i$ equal to the whole graph (i.e., the term with $n = 1$ and $\gamma_1 = \gamma$ is omitted).

Note that, for the $\overline{\mathrm{MS}}$ scheme, we define the $T$ operation to be applied to a factor of constant dimension obtained by taking the appropriate power of $\mu$ outside of the pole-part operation. Moreover, it is not a strict pole-part operation; instead each
We first discuss nonchiral symmetries; these are symmetries in which the left-handed and right-handed parts of Dirac fields transform identically. For Poincaré invariance and simple global internal symmetries, it is simplest to use a regulator, like dimensional regularization, which respects the symmetries. Then it is easily shown that the symmetries are preserved under renormalization. This holds even if the internal symmetries are spontaneously broken (as happens with a “wrong-sign mass term,” e.g., negative $m^2$ in eqn [1]).

The case of local gauge symmetries is harder. But their preservation is more important, because gauge theories contain vector fields which, without a gauge symmetry, generally give unphysical features to the theory. For perturbation theory, BRST quantization is usually used, in which, instead of gauge symmetry, there is a BRST supersymmetry. This is manifested at the Green function level by Slavnov–Taylor identities that are more complicated, in general, than the Ward identities for simple global symmetries and for abelian local symmetries.

Dimensional regularization preserves these symmetries and the Slavnov–Taylor identities. Moreover, the R-operation still produces finite results with local counter-terms, but cancelations and relations occur between divergences for different graphs in order to preserve the symmetry. A simple example is QED, which has an abelian U(1) gauge symmetry, and whose gauge-invariant Lagrangian is

\[ L = -\tfrac{1}{4} \left( \partial_\mu A^{(0)}_\nu - \partial_\nu A^{(0)}_\mu \right)^2 + \bar{\psi}_0 \left( i\gamma^\mu \partial_\mu - e_0 \gamma^\mu A^{(0)}_\mu - m_0 \right) \psi_0 \qquad [26] \]

At the level of individual divergent 1PI graphs, we get counter-terms proportional to $A_\mu^2$ and to $(A_\mu^2)^2$, operators not present in the gauge-invariant Lagrangian. The Ward identities and Slavnov–Taylor identities show that these counter-terms cancel when they are summed over all graphs at a given order of renormalized perturbation theory. Moreover, the renormalization of coupling and the gauge field are inverse, so that $e_0 A^{(0)}_\mu$ equals the corresponding object with renormalized quantities, $\mu^{\epsilon} e A_\mu$. Naturally, sums of contributions to a counter-term in $L$ can only be quantified with use of a regulator.

In nonabelian theories, the gauge-invariance properties are not just the absence of certain terms in $L$ but quantitative relations between the coefficients of terms with different numbers of fields. Even so, the argument with Slavnov–Taylor identities generalizes appropriately and proves renormalizability of QCD, for example. But note that the relation concerning the product of the coupling and the gauge field does not generally hold; the form of the gauge transformation is itself renormalized, in a certain sense.

Anomalies

Chiral symmetries, as in the weak-interaction part of the gauge symmetry of the standard model, are much harder to deal with. Chiral symmetries are ones for which the left-handed and right-handed components of Dirac field transform independently under different components of the symmetry group, local or global as the case may be. Occasionally, some or other of the left-handed or right-handed components may not even be present.

In general, chiral symmetries are not preserved by regularization, at least not without some other pathology. At best one can adjust the finite parts of counter-terms such that in the limit of the removal of the regulator, the Ward or Slavnov–Taylor identities hold. But in general, this cannot be done consistently, and the theory is said to suffer from an anomaly. In the case of chiral gauge theories, the presence of an anomaly prevents the (candidate) theory from being valid. A dramatic and nontrivial result (Adler–Bardeen theorem and some nontrivial generalizations) is that if chiral anomalies cancel at the one-loop level, then they cancel at all orders. Similar results, but more difficult ones, hold for supersymmetries.

The anomaly cancelation conditions in the standard model lead to constraints that relate the lepton content to the quark content in each generation. For example, given the existence of the $b$ quark, and the $\tau$ and $\nu_\tau$ leptons (of masses around 4.5 GeV, 1.8 GeV, and zero respectively), it was strongly predicted on the grounds of anomaly cancelation that there must be a $t$ quark partner of the $b$ to complete the third generation of quark doublets. This prediction was much later vindicated by the discovery of the much heavier top quark with $m_t \simeq 175$ GeV.

Renormalization Schemes

A precise definition of the counter-terms entails a specification of the renormalization prescription (or scheme), so that the finite parts of the counter-terms are determined. This apparently induces extra arbitrariness in the results. However, in the $\phi^4$ Lagrangian (for example), there are really only two independent parameters. (A scaling of the field does not affect any observables, so we do not count $Z$ as a parameter here.) Thus, at fixed regulator parameter $a$ or $\epsilon$, renormalization actually just gives a reparametrization of a two-parameter collection of theories. A renormalization prescription gives the
change of variables between bare and renormalized parameters, a rather singular transformation when the regulator is removed. If we have two different prescriptions, we can deduce a transformation between the renormalized parameters in the two schemes. The renormalized mass and coupling $m_1$ and $\lambda_1$ in one scheme can be obtained as functions of their values $m_2$ and $\lambda_2$ in the other scheme, with the bare parameters, and hence the physics, being the same in both schemes. Since these are renormalized parameters, the removal of the regulator leaves the transformation well behaved.

Generalization to all renormalizable theories is immediate.

Renormalization Group and Applications and Generalizations

One part of the choice of renormalization scheme is that of a scale parameter such as the unit of mass $\mu$ of the $\overline{\mathrm{MS}}$ scheme. The physical predictions of the theory are invariant if a change of $\mu$ is accompanied by a suitable change of the renormalized parameters, now considered as $\mu$-dependent parameters $\lambda(\mu)$ and $m(\mu)$. These are called the effective, or running, coupling and mass. The transformation of the parametrization of the theory is called an RG transformation.

The bare coupling and mass $\lambda_0$ and $m_0$ are RG invariant, and this can be used to obtain equations for the RG evolution of the effective parameters from the perturbatively computed counter-terms. For example, in $\phi^4$ theory, we have (in the renormalized theory after removal of the regulator)

\[ \frac{d\lambda}{d \ln \mu^2} = \beta(\lambda) \qquad [27] \]

with $\beta(\lambda) = 3\lambda^2/(16\pi^2) + O(\lambda^3)$. As exemplified in eqns [18] and [19], Feynman diagrams depend logarithmically on $\mu$. By choosing $\mu$ to be comparable to the physical external momentum scale, we remove possible large logarithms in this and higher orders. Thus, provided that the effective coupling at this scale is weak, we get an effective perturbation expansion.

This is a basic technique for exploiting perturbation theory in QCD, for the strong interactions, where the interactions are not automatically weak. In this theory the RG $\beta$ function is negative so that the coupling decreases to zero as $\mu \to \infty$; this is the asymptotic freedom of QCD.

A closely related method is that associated with the Callan–Symanzik equation, which is a formulation of a Ward identity for anomalously broken scale invariance. However, RG methods are the actually used ones, normally, even if sometimes an RG equation is incorrectly labeled as a Callan–Symanzik equation.

The elementary use of the RG is not sufficient for most interesting processes, which involve a set of widely different scales. Then more powerful theorems come into play. Typical are the factorization theorems of QCD (see Quantum Chromodynamics). These express differential cross sections for certain important reactions as a product of quantities that involve a single scale:

\[ d\sigma = C(Q, \mu, \lambda(\mu)) \otimes f(m, \mu, \lambda(\mu)) + \text{small correction} \qquad [28] \]

The product is typically a matrix or a convolution product. The factors obey nontrivial RG equations, and these enable different values of $\mu$ to be used in the different factors. Predictions arise because some factors and the kernels of the RG equation are perturbatively calculable, with a weak effective coupling. Other factors, such as $f$ in eqn [28], are not perturbative. These are quantities with names like “parton distribution functions,” and they are universal between many different processes. Thus, the nonperturbative functions can be measured in a limited set of reactions and used to predict cross sections for many other reactions with the aid of calculations of the perturbative factors.

Ultimately, this whole area depends on physical phenomena associated with renormalization.

Concluding Remarks

The actual ability to remove the divergences in certain QFTs to produce consistent, finite, and nontrivial theories is a quite dramatic result. Moreover, associated with the integrals that give the divergences is behavior of the kind that is analyzed with RG methods and generalizations. So the properties of QFTs associated with renormalization get tightly coupled to many interesting consequences of the theories, most notably in QCD.

QFTs are actually very abstruse and difficult theories; only certain aspects currently lend themselves to practical calculations. So the reader should not assume that all aspects of their rigorous mathematical treatment are perfect. Experience, both within the theories and in their comparison with experiment, indicates, nevertheless, that we have a good approximation to the truth.

When one examines the mathematics associated with the R-operation and its generalizations with factorization theorems, there are clearly present some interesting mathematical structures that are not yet formulated in their most general terms. Some
Renormalization: Statistical Mechanics and Condensed Matter 407
calculation of the partition function. This leads to a map between effective interactions associated to different length scales. Thus, the focus shifts from the analysis of a single interaction to that of a flow on a space of interactions. This space is in general much larger than the original formulation of the model would suggest: the description of long-distance or low-energy properties may be in terms of variables that were not even present in the original formulation of the system. Phenomenologically, this corresponds to the emergence of collective degrees of freedom.

Condensed matter theory is itself already an effective theory, and its “microscopic” formulation gets inputs from the underlying theories, which determine in particular the statistics of the particles and their interactions at the scale of atomic energies. At much lower-energy scales, which are relevant for low-temperature phenomena in condensed matter, collective excitations of different, sometimes exotic, statistics may emerge, but the starting point is given naturally in terms of fermionic and bosonic particles. For this reason, the discussion given below will be split in these two cases.

A major difference between high-energy and condensed matter systems is that the latter have a well-defined Hamiltonian which can be used to define the finite-volume ensembles of quantum statistical mechanics and which determines the time evolution, as well as various analyticity properties.

The relevant spatial dimensions in condensed matter are $d \le 3$, but some results in higher dimensions relevant for the development of the method will also be discussed below. The cases $d = 1$ and $d = 2$ have always been of mathematical interest but in recent years have become important for the theory of new materials.

Some interesting topics cannot be covered here due to space restrictions, notably the application of renormalization methods to membrane theory (see Wiese (2001)) and renormalization methods for operators (see Bach et al. (1998)).

The Renormalization Group

In this section we briefly describe the setup of two important versions of the RG, namely the block spin RG and the RG based on scale decompositions of singular covariances.

Block spin RG

Let $\Lambda$ be a finite lattice, for example, a finite subset of $\mathbb{Z}^d$. For the following, it is convenient to take $\Lambda$ to be a cube of side-length $L^K$ for $L > 1$ and some large $K$. Let $T$ be a set and $\Omega_\Lambda = \{\phi : \Lambda \to T\}$ be the set of spin configurations. Common examples for the target space $T$ are $T = \{1, -1\}$ for the Ising model, $T = S^{N-1}$ for the O(N) model, and $T = \mathbb{R}^n$ for unbounded spins. Let $S_\Lambda : \Omega_\Lambda \to \mathbb{R}$, $\phi \mapsto S_\Lambda(\phi)$ be an interaction and

\[ Z(\Lambda, S_\Lambda) = \int \prod_{x \in \Lambda} d\phi(x)\; e^{-S_\Lambda(\phi)} \qquad [1] \]

In the unbounded case, $S_\Lambda$ is assumed to grow sufficiently fast for $|\phi| \to \infty$, so that $Z$ exists; for the case of a finite set $T$, the integral is replaced by a sum. Denote the corresponding Boltzmann factor by $\rho(\Lambda, S_\Lambda)$,

\[ \rho(\Lambda, S_\Lambda)(\phi) = \frac{1}{Z(\Lambda, S_\Lambda)}\; e^{-S_\Lambda(\phi)} \qquad [2] \]

The block spin transformation consists of an integration step and a rescaling step. Divide the lattice into cubic blocks of side-length $L$ and define a new lattice $\Lambda'$ by associating one lattice site of the new lattice to each $L$-block of the old lattice. For any $\phi' : \Lambda' \to T$, let

\[ \rho'(\phi') = \int \prod_{x \in \Lambda} d\phi(x)\; P(\phi', \phi)\; e^{-S_\Lambda(\phi)} \qquad [3] \]

where $P(\phi', \phi) \ge 0$ and $\int \prod_{x' \in \Lambda'} d\phi'(x')\, P(\phi', \phi) = 1$ for all $\phi$, so that $\rho'$ remains a probability distribution. Since $\rho'$ is positive, one defines

\[ S'_{\Lambda'}(\phi') = -\log \rho'(\phi') \qquad [4] \]

By construction, the partition function is invariant: $Z(\Lambda', S'_{\Lambda'}) = Z(\Lambda, S_\Lambda)$. The new lattice $\Lambda'$ has spacing $L$; now rescale to make it a unit lattice. This completes the RG step in finite volume.

In an algorithmic sense, the “blocking rule” $P(\phi', \phi)$ can be viewed as a transition probability of a configuration $\phi$ to a configuration $\phi'$. $P$ may be deterministic, that is, simply fix $\phi'$ as a function of $\phi$. From the intuition of averaging over local fluctuations, $\phi'$ is often taken to be some average of $\phi(x)$ at $x$ in a block around $x'$, hence the name.

Obviously, the thus defined RG transformation often cannot be iterated arbitrarily, since in every application, the number of points of the lattice shrinks by a factor $L^d$, so that after $K$ iterations, a lattice with only a single point is left over. It is necessary to take the infinite-volume limit $L \to \infty$ to obtain a map that operates from a space to itself. However, [4] can become problematic in that limit: Gibbs measures $\rho$ can map to measures $\rho'$ whose large-deviation properties differ from those of Gibbs measures. The discussion of this problem and its solution is reviewed in Bricmont and Kupiainen (2001). The problem can be
solved in different ways, relaxing conditions on Gibbs measures or, in the Ising model, changing the description from the spins to the contours. The crucial point is that the difficulties arise only because [4] is applied globally, that is, to every $\phi'$. The set of bad $\phi'$ has very small probability.

Block spin methods have been used in mathematical construction of quantum field theories, for example, in the work of Gawedzki and Kupiainen (1985) and Balaban (1988) (see the subsection “Field theory and statistical mechanics”). The above-mentioned problem was avoided there by not taking a logarithm in the so-called large-field region (which has very small probability).

Scale Decomposition RG

The generating functionals of quantum field theory and quantum statistical mechanics can be cast into the form

$$Z(C, V, \phi) = \int d\mu_C(\phi')\, e^{-V(\phi' + \phi)} \qquad [5]$$

Here $d\mu_C$ denotes the Gaussian measure with covariance $C$, and $V$ is the two-body interaction between the particles. The field variables are real or complex for bosons and Grassmann-valued for fermions. Differentiating $\log Z$ with respect to the external field $\phi$ generates the connected amputated correlation functions. The covariance determines the free propagation of particles; the interaction their collisions.

In most cases, such functional integrals are a priori ill-defined, even if $V$ is small (and bounded from below) because the covariance $C$ is singular. That is, the integral kernel $C(X, X')$ of the operator $C$ either diverges as $|x - x'| \to 0$ (ultraviolet (UV) problem) or $C(X, X')$ has a slow decay as $|x - x'| \to \infty$ (infrared (IR) problem). In our notational convention, $X$ may, in addition to the configuration variable $x$, also contain discrete indices of the fields, such as a spin or color index. The dependence of $C$ on $x$ and $x'$ is assumed to be of the form $x - x'$. A typical example is the massless Gaussian field in $d$ dimensions, where $C$ is the inverse Fourier transform of $\hat C(k) = 1/k^2$, $k \in \mathbb{R}^d$, which has both a UV and an IR problem, or its lattice analog,

$$\hat D(k) = \left( \frac{2}{a^2} \sum_{i=1}^{d} \bigl(1 - \cos(a k_i)\bigr) \right)^{-1}$$

with $a$ the lattice constant, which has only an IR problem. A typical interaction is of the type

$$V(\phi) = \int dX\, dY\, \bar\phi(X)\phi(X)\, v(X, Y)\, \bar\phi(Y)\phi(Y) \qquad [6]$$

Again, we assume that the potential $v$ depends on $x$ and $y$ only via $x - y$, so that translation invariance holds. In both UV and IR cases, naive perturbation theory fails even as a formal power series. That is, writing $V = \lambda V_0$, with a coupling constant $\lambda$ which is treated as a formal expansion parameter, the singularity of $C$ leads to termwise divergences in the series. The theory is called perturbatively renormalizable if all divergences can be removed by posing counter-terms of certain types, which are fixed by physically sensible renormalization conditions. Identifying the UV renormalizable theories was a breakthrough in high-energy physics. The IR renormalization problem is different, and in some respects harder, because there is almost no freedom to put counter-terms: the microscopic model is given from the start. This will be discussed in more detail below for an example.

A much more ambitious, and largely open, project is to do this renormalization nonperturbatively, that is, to treat $\lambda$ as a real (typically, small) parameter. Some results will be discussed below.

The RG is set up by a scale decomposition $C = \sum_j C_j$. In the example of the massless Gaussian field, one would take each $\hat C_j$ to be a $C^\infty$ function supported in the region $\{k \in \mathbb{R}^d : M^j \le k^2 \le M^{j+1}\}$, where $M > 1$ is a fixed constant, and the summation over $j$ runs over $\mathbb{Z}$.

The scale decomposition of $C$ leads to a representation of [5] by an iteration of Gaussian convolution integrals with covariances $C_j$, hence a sequence of effective interactions $V_j$, defined recursively by

$$e^{-V_j(\phi)} = \int d\mu_{C_{j+1}}(\phi')\, e^{-V_{j+1}(\phi' + \phi)}, \qquad V_0 = V \qquad [7]$$

For a singular covariance, the scale decomposition is an infinite sum. A formal object like [5] is now regularized by starting with a finite sum, that is, imposing a UV and IR cutoff, which is mathematically well defined, and then taking limits of the thus defined objects. Again, in condensed matter applications, imposing an IR cutoff is an operation that needs to be justified, for example, by showing that taking the limit as the cutoff is removed commutes with the infinite-volume limit.

Note that the RG map, which is the iteration $V_j \mapsto V_{j-1}$, goes to lower and lower $j$, corresponding to longer and longer length scales. The convention that the iteration starts at some fixed $j$, for example, $j = 0$, is appropriate for IR problems. In UV problems, the iteration would start at some large $J_{UV}$, which defines a UV cutoff and is taken to infinity, to remove the cutoff, at the end.

A variant using a continuous scale decomposition, $C = \int ds\, \dot C_s$, originally due to Wegner and Houghton, became very popular after Polchinski (1984) used it
to give a short argument for perturbative renormalizability. Polchinski’s equation, the analog of the recursion [7], reads

$$\frac{\partial V}{\partial s} = -\frac{1}{2}\, e^{V} \Delta_{\dot C_s} e^{-V} = \frac{1}{2} \Delta_{\dot C_s} V - \frac{1}{2} \left( \frac{\delta V}{\delta \phi},\, \dot C_s \frac{\delta V}{\delta \phi} \right) \qquad [8]$$

Here

$$\Delta_C = \left( \frac{\delta}{\delta \phi},\, C \frac{\delta}{\delta \phi} \right)$$

denotes the Laplacian in field space associated to the covariance $C$. Polchinski’s argument has been developed into a mathematical tool that applies to many models. For an introduction to perturbative renormalization using this method, see Salmhofer (1998). Equations of the type [8] have also been very useful beyond perturbation theory: much work has been done based on the beautiful representation of Mayer expansions found in Brydges and Kennedy (1987) using RG equations.

Mathematical Structure and Difficulties

The RG flow is thus, depending on the implementation, either a sequence or a continuous flow of interactions. Setting up this flow in mathematical terms is not easy and indeed part of the mathematical RG analysis is to find a suitable space of interactions that is left invariant by the successive convolutions, and then to control the RG iteration.

A serious problem is the proliferation of interactions: already a single application of the RG transformation [7] maps a simple interaction, such as [6], to a nonlocal functional of the fields,

$$V_j(\phi) = \sum_{m \ge 0} \int dX_1 \cdots dX_m\, v^{(j)}_m(X_1, \ldots, X_m)\, \phi(X_1) \cdots \phi(X_m) \qquad [9]$$

Already for perturbative renormalization, one needs to extract local terms, calculate their flow more explicitly, and control the power counting of the remainder. The convergence of the series is not an issue in formal perturbation theory because in every finite order $r$ in $\lambda$, the sum over $m$ is finite.

For nonperturbative renormalization, however, the problem is much more serious. For bosonic systems, the expansion in powers of the fields in [9] is divergent, and one needs a split into small-field and large-field regions and cluster expansions to obtain a well-defined sequence of effective actions (Gawedzki and Kupiainen 1985, Feldman et al. 1987, Rivasseau 1993). That is, the local parts are extracted and treated explicitly only in the small-field region, and this is combined with estimates on the rareness of large-field regions using cluster expansions. For fermions, the expansion in powers of the fields can be proved to converge for regular, summable covariances, which leads to substantial technical simplifications.

The spatial proliferation of interactions is absent only in certain one-dimensional and in specially constructed higher-dimensional models, the so-called “hierarchical models.” In these models, the search for an RG fixed point is still a nonlinear fixed-point problem, whose treatment leads to interesting mathematical results.

This article will be restricted to the mathematical use of the RG both in perturbative and nonperturbative quantum field theory of condensed matter systems. Many nonrigorous but very interesting applications have also come out of this method, showing that it also works well in practice, but they will not be reviewed here. Before discussing condensed matter systems, the pioneering works done on the mathematical RG, which were largely motivated by high-energy physics, will be reviewed briefly, as they laid the foundation of much of the technique used later in the condensed matter case.

Field Theory and Statistical Mechanics

Because of the close connection between quantum field theory and statistical mechanics given by formulas of the Feynman–Kac type, a significant amount of work on the mathematical RG focused on models of classical statistical mechanics in connection with field theories and gauge theories. Here we mention some of the pioneering results in that field.

The scale decomposition method was developed in a mathematical form and applied to perturbative UV renormalization of scalar field theories, as well as nonperturbative analysis of some models, by Gallavotti and Nicolò (Gallavotti 1985).

Infrared $\phi^4$ theory in four dimensions was constructed using block spin methods (Gawedzki and Kupiainen 1985) and scale decomposition RG (Feldman et al. 1987). An essential feature of the $\phi^4_4$ model is its IR asymptotic freedom, meaning that the local part of the effective quartic interaction tends to zero in the IR limit.

Block spin methods were used by Balaban (1988) to construct gauge theories in three and four dimensions. For gauge theories, the block spin RG has the major advantage that it allows one to define a gauge-invariant RG flow. The scale decomposition violates gauge invariance, which creates substantial technical problems (Rivasseau 1993).
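
As a concrete illustration of the block spin step [3] and [4] from the subsection “Block spin RG”, the following minimal sketch carries out one integration step for a toy model and checks that the partition function is unchanged. Everything specific here is an assumption made for the example and not taken from the article: the model (a periodic one-dimensional Ising chain with spin set $T = \{-1, +1\}$, so integrals become sums), the coupling value, the block length, and the deterministic blocking rule (sign of the block sum, ties broken by the first spin of the block). The function names are hypothetical.

```python
import itertools
import math


def action(phi, coupling=0.7):
    """Toy action S(phi) = -coupling * sum_x phi(x) phi(x+1) on a periodic chain."""
    n = len(phi)
    return -coupling * sum(phi[x] * phi[(x + 1) % n] for x in range(n))


def partition_function(n_sites, coupling=0.7):
    """Z = sum over phi in {-1,+1}^n of exp(-S(phi)); the spin set is finite, so [1] is a sum."""
    return sum(math.exp(-action(phi, coupling))
               for phi in itertools.product((-1, 1), repeat=n_sites))


def block_spin_weights(n_sites, block, coupling=0.7):
    """Return rho'(phi') = sum_phi P(phi', phi) exp(-S(phi)), cf. [3].

    Assumed blocking rule: phi'(x') is the sign of the block sum, with ties
    broken by the first spin of the block. P is then 0 or 1 and sums to 1
    over phi' for every phi, as required of a blocking rule.
    """
    if n_sites % block != 0:
        raise ValueError("n_sites must be a multiple of the block length")
    weights = {}
    for phi in itertools.product((-1, 1), repeat=n_sites):
        blocks = [phi[i:i + block] for i in range(0, n_sites, block)]
        phi_prime = tuple(
            (1 if sum(b) > 0 else -1) if sum(b) != 0 else b[0] for b in blocks
        )
        weights[phi_prime] = weights.get(phi_prime, 0.0) + math.exp(-action(phi, coupling))
    return weights


def effective_action(weights):
    """S'(phi') = -log rho'(phi'), cf. [4]."""
    return {phi_prime: -math.log(w) for phi_prime, w in weights.items()}


if __name__ == "__main__":
    n_sites, block = 8, 2
    s_prime = effective_action(block_spin_weights(n_sites, block))
    z_old = partition_function(n_sites)
    z_new = sum(math.exp(-s) for s in s_prime.values())
    print("Z on the original lattice:", z_old)
    print("Z on the blocked lattice: ", z_new)
```

The effective action returned here is a table over all block configurations rather than a sum of local terms, which is a small-scale picture of the proliferation of interactions discussed above. The rescaling step is trivial in this sketch because the blocked configuration is simply stored as a tuple indexed by block number.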
Starting with the seminal work of Feldman and Trubowitz (1990, 1991) and Benfatto and Gallavotti (1995), this field has become one of the most successful applications of the mathematical RG. We use this example to discuss the scale decomposition method in a bit more detail.

We shall mainly focus on models in $d \ge 2$ dimensions (the case $d = 1$ is described in detail in Benfatto and Gallavotti (1995)). The system is put into a finite (very large) box $\Lambda$ of side-length $L$. For simplicity we take periodic boundary conditions. The Hilbert space for spin-1/2 electrons is the fermionic Fock space $\mathcal{F} = \bigoplus_{n \ge 0} \bigwedge^n L^2(\Lambda, \mathbb{C}^2)$. The grand canonical ensemble in finite volume is given by the density operator $\rho = Z^{-1} e^{-\beta(H - \mu N)}$, with the Hamiltonian $H$ and the number operator $N$, in the usual second quantized form. The parameter $\beta = T^{-1}$ is the inverse temperature and the chemical potential $\mu$ is an auxiliary parameter used to fix the average particle number.

The grand canonical trace defining the ensemble can be rewritten in functional-integral form. It takes the form [5], but now $d\mu_C$ stands for a Grassmann Gaussian “measure,” which is really only a linear functional (for definitions, see, e.g., Salmhofer (1998, chapter 4 and appendix B)). A two-body interaction corresponds to a quartic interaction polynomial $V$, as in [6]. The covariance is (in the infinite-volume limit $L \to \infty$)

$$C(\tau, x) = \frac{1}{\beta} \sum_{\omega \in \mathbb{M}_F} \int \frac{dk}{(2\pi)^d}\, e^{i(kx - \omega\tau)}\, \hat C(\omega, k), \qquad \hat C(\omega, k) = \frac{1}{i\omega - e(k)} \qquad [10]$$

where $\tau \in (0, \beta]$ is a Euclidean time variable and $k$ is the spatial momentum. The summation over $\omega$ runs over the set of fermionic Matsubara frequencies $\mathbb{M}_F = \pi T(2\mathbb{Z} + 1)$. The function $e(k) = \varepsilon(k) - \mu$, where $\varepsilon(k)$ is the band function given by the single-particle term in the Hamiltonian. For a lattice system, $k \in \mathcal{B}_d$, the momentum space torus (e.g., for the lattice $\mathbb{Z}^d$, $\mathcal{B}_d = \mathbb{R}^d / 2\pi\mathbb{Z}^d$); for a continuous system, $k \in \mathbb{R}^d$, hence there is a spatial UV problem. Electrons in a crystal have a natural spatial UV cutoff (see Salmhofer (1998, chapter 4) for a discussion) so we assume in the following that there is either a UV cutoff or that the system is on a lattice. A nonperturbative definition of the functional integral involves a limit from discrete times (by the Trotter product formula); see, for example, Salmhofer (1998) or Feldman et al. (2003, 2004).

Renormalization of the Fermi surface at zero temperature

In the limit $T \to 0$, the Matsubara frequency $\omega$ becomes a real variable, hence the propagator has a singularity at $\omega = 0$ and $k \in S$, where $S = \{k : e(k) = 0\}$, a codimension-1 subset of $\mathcal{B}_d$, is the Fermi surface. The existence of a Fermi surface which does not degenerate to a point is a characteristic feature of systems showing metallic behavior.

The singularity implies that $\hat C \notin L^p(\mathbb{R} \times \mathcal{B}_d)$ for any $p \ge 2$. Because terms of the type

$$\int d\omega \int dk\, F(\omega, k)\, \hat C(\omega, k) \prod_{i=1}^{p-1} \bigl[ T_i(\omega, k)\, \hat C(\omega, k) \bigr] \qquad [11]$$

appear for all $p \ge 1$ in the formal perturbation expansion, with functions $T_i$ and $F$ that do not vanish on the singularity set of $C$, the perturbation expansion for observables is termwise divergent. The deeper reason for these problems is that the interaction shifts the Fermi surface so that the true propagator has a singularity of the form $G(\omega, k) = (i\omega - e(k) - \Sigma(\omega, k))^{-1}$. If the self-energy $\Sigma$ is a sufficiently regular function, $G$ has the same integrability properties as $C$, but the singularity of $G$ is on the set $\tilde S = \{k : e(k) + \Sigma(0, k) = 0\}$ (the singularity in $\omega$ remains at $\omega = 0$).

Let $1 = \sum_{j \le 0} \chi_j(\omega, k)$ be a $C^\infty$ partition of unity such that

$$\text{for } j < 0: \quad \operatorname{supp} \chi_j \subset \{(\omega, k) : \varepsilon_0 M^{j-2} \le |i\omega - e(k)| \le \varepsilon_0 M^{j}\} \qquad [12]$$

where $M > 1$ and $\varepsilon_0$ is a fixed constant (an energy scale determined by the global properties of the function $e$; see Salmhofer (1998, chapter 4)). The corresponding covariances $\hat C_j = \hat C \chi_j$ have the properties that for $j < 0$, $\|\hat C_j\|_\infty \le \text{const.}\, M^{-j}$ and $\|\hat C_j\|_1 \le \text{const.}\, M^{j}$. Using these bounds and expanding $v^{(j)}_m = \sum_{r \ge 1} v^{(j)}_{m,r} \lambda^r$, one can derive estimates for the coefficient functions $v^{(j)}_{m,r}$.

Of course, the scale decomposition by itself does not solve the problem of the moving singularity. It only allows us to pinpoint the problematic terms in the expansion. To construct the self-energy $\Sigma$, as well as all higher Green functions, a two-step method is used (Feldman and Trubowitz 1990, 1991, Feldman et al. 1996, 2000). First, a counter-term function $K$ which modifies $e$ is introduced, so that all two-point insertions $T_i$ get subtracted on the Fermi surface, hence replaced by $\tilde T_i(\omega, k) = T_i(\omega, k) - T_i(0, k')$, with $k'$ obtained from $k$ by a
projection to the Fermi surface (Feldman and Trubowitz 1990, 1991). Consequently, the $\tilde T_i$ vanish linearly on the Fermi surface, so that the integral over $k$ in [11] converges. The effect of the counter-term function $K$ can be described less technically: it fixes the Fermi surface to be $S$, the zero set of $e$. Thus, $K$ forces $S$ to be the Fermi surface of the interacting system. To achieve this, $K$ must be chosen as a function of $e$, $k$, and $V$. In contrast to the situation for covariances with point singularities, the function $K$ will, for a nontrivial Fermi surface, be very different from the original $e$. It can, however, be constructed to all orders in perturbation theory for a large class of Fermi surfaces. More precisely, one can prove: if $e \in C^2(\mathcal{B}_d, \mathbb{R})$, $\hat v \in C^2(\mathcal{B}_d, \mathbb{R})$, and the Fermi surface $S$ contains no points $k$ with $\nabla e(k) = 0$ and no flat sides, then $K = \sum_r \lambda^r K_r$ exists as a formal power series in $\lambda$ and the map $e \mapsto e + K$ is locally injective on this set of $e$’s (Feldman et al. 1996, 2000). With this counter-term, the order-$r$ $m$-point functions on scale $j$ satisfy the bounds

$$\bigl\| \hat v^{(j)}_{m,r} \bigr\|_\infty \le w_{m,r}\, M^{(4-m)j/2}\, |j|^r$$

and

$$\bigl\| \hat v^{(j)}_{m,r} \bigr\|_1 \le \tilde w_{m,r} \qquad [13]$$

with constants $w_{m,r}$ and $\tilde w_{m,r}$. Here $\hat v^{(j)}_{m,r}$ is the Fourier transform of $v^{(j)}_{m,r}$ (see [9]), with the momentum conservation delta function from translation invariance removed.

Equation [13] implies that in the RG sense, the two-point function is relevant, the four-point function is marginal, and all higher $m$-point functions are irrelevant.

In one dimension, the Fermi “surface” reduces to two points which are related by a symmetry, so the counter-term function $K$ is just a constant, that is, an adjustment of the chemical potential $\mu$, which is justified because $\mu$ is only an auxiliary parameter used to fix the average value of the particle number. The counter-term function is a constant also in higher dimensions in the special case $e(k) = k^2 - \mu$: there, rotational symmetry implies that $K$ can be chosen independent of $k$ (if $v$ is also rotationally symmetric). However, in the generic case of nonspherical Fermi surfaces, $K$ depends nontrivially on $k$, and an inversion problem arises: adding the counter-term changes the model. To obtain the Green functions of a model with a given dispersion relation and interaction $(E, V)$, one needs to show that given $E$ in a suitable set, the equation

$$e(k) + K(\lambda, e, V)(k) = E(k) \qquad [14]$$

has a unique solution. If this is done, the procedure for renormalization is as follows. For a model given by dispersion relation and interaction $(E, V)$, solve [14], then add and subtract $e$ in the kinetic term. This automatically puts $K = E - e$ as a counter-term, and the expansion is now set up automatically with the right counter-term. The function $K$ describes the shift from the Fermi surface of the free system (the zero set of $E$) to that of the interacting system (the zero set of $e$). Proving that $K$ is sufficiently regular and solving [14] is nontrivial. Uniqueness of the solution follows from the above stated properties of $K$ as a function of $e$. Existence was shown for a class of Fermi surfaces with strictly positive curvature in Feldman et al. (1996, 2000), to every order in perturbation theory. This implies a bijective relation between the Fermi surfaces of the free and the interacting model.

Positive temperature and the zero-temperature limit

One advantage of the functional-integral approach is that the setup at positive temperatures is identical to that at zero temperature, save for the discreteness of the set $\mathbb{M}_F$ at $T > 0$. Because $0 \notin \mathbb{M}_F$, the temperature effectively provides an IR cutoff, so that all term-by-term divergences are regularized in a natural way. However, renormalization is still necessary because the temperature is a physical parameter and unrenormalized expansions give disastrous bounds for the behavior of observables as functions of the temperature. Renormalization carries over essentially unchanged (the counter-term function is constructed slightly differently).

Because $|\omega| \ge \pi/\beta$ for all $\omega \in \mathbb{M}_F$, [12] implies $\operatorname{supp} \chi_j = \emptyset$ for $j < -J_\beta$, where

$$J_\beta = \log_M \frac{\beta \varepsilon_0}{\pi} \qquad [15]$$

Thus, the scale decomposition is now a finite sum over $0 \ge j \ge -J_\beta$. This restriction is inessential for the problem of renormalizing the Fermi surface, but it puts a cutoff on the marginal growth of the four-point function: [15] and [13] imply that

$$\bigl\| \hat v^{(j)}_{m,r} \bigr\|_\infty \le \tilde w_{m,r} \left( \log \frac{\beta \varepsilon_0}{\pi} \right)^{r} \qquad [16]$$

If one can show that $\tilde w_{m,r} \le A B^r$ with constants $A$ and $B$, this implies that perturbation theory converges for $|\lambda| \log(\beta \varepsilon_0 / \pi) < B^{-1}$. Such a bound has been shown using constructive methods (Disertori and Rivasseau 2000, Feldman et al. 2003, 2004) (see below). The logarithm of $\beta$ is due to the Cooper instability (see Feldman and Trubowitz (1990, 1991) and Salmhofer (1998, section 4.5)).
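
To make the role of the temperature as an IR cutoff more tangible, the sketch below evaluates the number of scales $J_\beta = \log_M(\beta\varepsilon_0/\pi)$ from [15] and the coupling range $|\lambda| \log(\beta\varepsilon_0/\pi) < B^{-1}$ quoted above. It is only an illustration: the numerical values of $\beta$, $\varepsilon_0$, $M$, and $B$ are placeholders, the function names are hypothetical, and units with $k_B = 1$ are assumed.

```python
import math


def matsubara_frequencies(beta, n_max):
    """Fermionic Matsubara frequencies (pi / beta) * (2n + 1) for n = -n_max .. n_max - 1."""
    return [math.pi / beta * (2 * n + 1) for n in range(-n_max, n_max)]


def number_of_scales(beta, eps0, M):
    """J_beta = log_M(beta * eps0 / pi), cf. [15]; requires beta * eps0 > pi."""
    return math.log(beta * eps0 / math.pi) / math.log(M)


def active_scales(beta, eps0, M):
    """Integer scales j with 0 >= j >= -J_beta that can contribute at inverse temperature beta."""
    j_min = -math.floor(number_of_scales(beta, eps0, M))
    return list(range(j_min, 1))


def max_coupling(beta, eps0, B):
    """Upper bound on |lambda| from |lambda| * log(beta * eps0 / pi) < 1 / B."""
    return 1.0 / (B * math.log(beta * eps0 / math.pi))


if __name__ == "__main__":
    beta, eps0, M, B = 100.0, 1.0, 2.0, 1.0  # placeholder values, not from the article
    print("smallest |omega|:", min(abs(w) for w in matsubara_frequencies(beta, 3)))
    print("J_beta:", number_of_scales(beta, eps0, M))
    print("active scales:", active_scales(beta, eps0, M))
    print("|lambda| must stay below:", max_coupling(beta, eps0, B))
```

Lowering the temperature (increasing $\beta$) adds scales only logarithmically, and the admissible coupling shrinks like the inverse of that logarithm, which is the quantitative content of the convergence condition stated above.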
Benfatto uses the RG to prove that the propagator of the interacting system no longer has the singularity structure $(i\omega - k^2)^{-1}$ but instead $(\omega^2 + c^2 k^2)^{-1}$, where $c$ is a constant. This requires a nontrivial analysis of Ward identities in the RG flow.

BEC has been proved in the Gross–Pitaevskii limit (Lieb et al. 2002). In the present formulation, this limit corresponds to an infinite-volume limit $L \to \infty$ where the density is taken to zero as an inverse power of $L$. A nonperturbative proof of BEC at fixed positive particle density remains an open problem.

Superconductivity

Superconductivity (SC) occurs in fermionic systems, but it happens at energy scales where the relevant excitations have bosonic character: the Cooper pairs are bosons. In the RG framework, they arise naturally when the fermionic RG flow discussed above is stopped before it leaves the weak-coupling region and the dominant Cooper pairing term is rewritten by a Hubbard–Stratonovich transformation. The fermions can then be integrated over, resulting in the typical Mexican hat potential of an O(2) nonlinear sigma model. Effectively, one now has to deal with a problem similar to the one for BEC, but the action is considerably more complicated.

The Nonlinear Sigma Models

The prototypical model, into whose universality class both examples mentioned above fall, is that of O(N) nonlinear sigma models: both BEC and SC can be reformulated as spontaneous symmetry breaking (SSB) in the O(2) model in dimensions $d \ge 3$. For $d = 2$, long-range order is possible only at zero temperature because only then does the time direction truly represent a third dimension, preventing the Mermin–Wagner theorem from applying.

SSB has been proved for lattice O(N) models by reflection positivity and Gaussian domination methods (Fröhlich et al. 1976). The elegance and simplicity of this method is unsurpassed, but only very special actions satisfy reflection positivity, so that the method cannot be used for the effective actions obtained in condensed matter models. Results in the direction of proving SSB in O(N) models for $d \ge 3$ by RG methods, which apply to much more general actions, have been obtained by Balaban (1995).

See also: Bose–Einstein Condensates; Fermionic Systems; High Tc Superconductor Theory; Holomorphic Dynamics; Operator Product Expansion in Quantum Field Theory; Perturbative Renormalization Theory and BRST; Phase Transition Dynamics; Reflection Positivity and Phase Transitions.

Further Reading

Bach V, Fröhlich J, and Sigal IM (1998) Renormalization group analysis of spectral problems in quantum field theory. Advances in Mathematics 137: 205–298.
Balaban T (1988) Convergent renormalization expansions for lattice gauge theories. Communications in Mathematical Physics 119: 243–285.
Balaban T (1995) A low-temperature expansion for classical N-vector models I. A renormalization group flow. Communications in Mathematical Physics 167: 103–154.
Benfatto G and Gallavotti G (1995) Renormalization Group. Princeton: Princeton University Press.
Bricmont J and Kupiainen A (2001) Renormalizing the renormalization group pathologies. Physics Reports 348: 5–31.
Brydges DC and Kennedy T (1987) Mayer expansions and the Hamilton–Jacobi equation. Journal of Statistical Physics 48: 19–49.
Disertori M and Rivasseau V (2000) Interacting Fermi liquid in two dimensions at finite temperature I. Convergent contributions. Communications in Mathematical Physics 215: 251–290.
Domb C and Green M (eds.) (1976) Phase Transitions and Critical Phenomena, vol. 6. London: Academic Press.
Feldman J, Magnen J, Rivasseau V, and Sénéor R (1987) Construction of infrared $\phi^4_4$ by a phase space expansion. Communications in Mathematical Physics 109: 437.
Feldman J and Trubowitz E (1990) Perturbation theory for many-fermion systems. Helvetica Physica Acta 63: 157.
Feldman J and Trubowitz E (1991) The flow of an electron-phonon system to the superconducting state. Helvetica Physica Acta 64: 213.
Feldman J, Salmhofer M, and Trubowitz E (1996) Perturbation theory around non-nested Fermi surfaces I. Keeping the Fermi surface fixed. Journal of Statistical Physics 84: 1209–1336.
Feldman J, Salmhofer M, and Trubowitz E (2000) An inversion theorem in Fermi surface theory. Communications on Pure and Applied Mathematics 53: 1350–1384.
Feldman J, Knörrer H, and Trubowitz E (2003) A class of Fermi liquids. Reviews in Mathematical Physics 15: 949–1169.
Feldman J, Knörrer H, and Trubowitz E (2004) Communications in Mathematical Physics 247: 1–319.
Fröhlich J, Simon B, and Spencer T (1976) Infrared bounds, phase transitions, and continuous symmetry breaking. Communications in Mathematical Physics 50: 79.
Gallavotti G (1985) Renormalization theory and ultraviolet stability via renormalization group methods. Reviews of Modern Physics 57: 471–569.
Gawedzki K and Kupiainen A (1985) Massless lattice $\phi^4_4$ theory: Rigorous control of a renormalizable asymptotically free model. Communications in Mathematical Physics 99: 197–252.
Lieb E, Seiringer R, Solovej JP, and Yngvason J (2002) The ground state of the Bose gas. In: Current Developments in Mathematics, 2001, pp. 131–178. Cambridge: International Press.
Polchinski J (1984) Renormalization and effective Lagrangians. Nuclear Physics B 231: 269.
Rivasseau V (1993) From Perturbative to Constructive Renormalization. Princeton, NJ: Princeton University Press.
Salmhofer M (1998) Renormalization: An Introduction, Springer Texts and Monographs in Physics. Heidelberg: Springer.
Wiese KJ (2001) Polymerized membranes, a review. In: Domb C and Lebowitz J (eds.) Phase Transitions and Critical Phenomena, vol. 19. Academic Press.
Resonances
N Burq, Université Paris-Sud, Orsay, France
© 2006 Elsevier Ltd. All rights reserved.

Introduction

In quantum mechanics and wave propagation, eigenvalues (and eigenfunctions) appear naturally as they describe the behavior of a quantum system (or the vibration of a structure). There are, however, some cases where these simple notions do not suffice and one has to appeal to the more subtle notion of resonances. For example, while the vibration of a drum is well understood in terms of eigenvalues (the audible frequencies) and eigenfunctions (the corresponding vibrating modes), the notion of resonances is necessary to understand the propagation of waves in the exterior of a bounded obstacle. Another example (taken from Zworski (2002)), which allows us to understand both the similarities of resonances with eigenvalues and their differences, is the following: consider the motion of a classical particle subject to a force field deriving from the potential $V_1(x)$ on a bounded interval as shown in Figure 1a. If the classical momentum is denoted by $\xi$, then the classical energy is given by

$$E = |\xi|^2 + V_1(x)$$

and the classical motion is given by the relations of Hamiltonian mechanics:

$$\dot x = \frac{\partial E}{\partial \xi} = 2\xi, \qquad \dot \xi = -\frac{\partial E}{\partial x} = -V'(x)$$

Since energy is conserved, if the initial energy is smaller than the top of the barrier, then the classical particle bounces forever in the well. Now we can consider the same example with the potential $V_2(x)$ on $\mathbb{R}$ as shown in Figure 1b. Of course, if the particle is initially inside the well (with the same energy as before), the classical motion remains the same.

Figure 1 (a, b) A particle trapped in a well.

From the quantum mechanics point of view, both systems are described by the Hamiltonians

$$H_i = -h^2 \frac{d^2}{dx^2} + V_i(x)$$

acting on $L^2([-1, 1])$ (with boundary conditions) and $L^2(\mathbb{R})$, respectively. In the first case, $H_1$ has a discrete spectrum, $\lambda_{j,h} \in \mathbb{R}$ with eigenfunctions $e_{j,h}(x)$, $j \in \mathbb{N}$, and the time evolution of the system is given by

$$e^{itH_1} u = \sum_j e^{it\lambda_{j,h}}\, u_{j,h}\, e_{j,h} \qquad [1]$$

where $u_{j,h} e_{j,h}$ is the orthogonal projection of $u$ on the eigenspace $\mathbb{C} e_{j,h}$. In the second case, $H_2$ has no square integrable eigenfunction, and consequently no simple description such as [1] can hold. However, as $h \to 0$, the correspondence principle tells us that quantum mechanics should get close to classical mechanics. Since for both quantum problems the classical limit is the same (at least for initial states confined in the well with energy $E$), we expect that for the second potential there should exist a quantum state corresponding to the classical one. This is indeed the case, and one can show that there exist resonant states $e_{j,h}$ associated to resonances $E_{j,h}$ which solve the equation

$$H_2 e_{j,h} = E_{j,h} e_{j,h}, \qquad E_{j,h} \sim E$$

are not square integrable, but still have moderate growth at infinity and are confined in the interior of the well (see sections “Definition” and “Location of resonances”). On the other hand, the first quantum system is confined, whereas the second one is not, and we know that even for initial states confined in the well, the tunneling effect allows the quantum particle to escape to infinity. This fact should be described by the theory as a main difference between eigenvalues and resonances. This is indeed the case, as the resonances $E_{j,h}$ are not real (contrary to eigenvalues of self-adjoint operators) but have a nonvanishing imaginary part (see section “Resonance-free regions”)

$$\operatorname{Im} E_{j,h} \sim e^{-C/h}$$

If we assume that a description similar to [1] still holds for the second system, at least locally in space (see section “Resonances and time asymptotics”), then, for time $t \gg e^{C/h}$, the factor $e^{itE_{j,h}}$ becomes very small (the quantum particle has left the well due to the tunneling effect).

There have been several studies on resonances and scattering theory and the presentation here cannot be complete. For a more in-depth presentation, one can
consult the books by Lax and Phillips (1989) and Hislop and Sigal (1987), or the reviews on resonances by Vodev (2001) and Zworski (1994) for example.

Definition

There are different (equivalent) definitions of resonances. The most elegant is certainly the Helffer and Sjöstrand (1986) definition (see also the presentation of complex scaling by Combes et al. (1984) and the very general ‘‘black box’’ framework by Sjöstrand and Zworski (1991)). However, it requires a few prerequisites and we preferred to stick to the more elementary (but less general) resolvent point of view. The starting point for this definition of resonances is the fact that the eigenvalues of a (self-adjoint) operator $P$ are the points $\lambda$ where $P - \lambda$ is not injective. The more general resonances will be the points where the operator is not invertible (on suitable spaces).

More precisely, consider a perturbation of the Laplace operator on $\mathbb{R}^n$, $P_0(h) = -h^2\Delta$, in the following sense: let $\Theta \subset \mathbb{R}^d$ be a (possibly empty) smooth obstacle whose complementary, $\Omega = \Theta^c$, is connected. Consider a classical self-adjoint operator defined on $L^2(\Omega)$:

Remark 1 In the case of acoustical scattering ($P = -\lambda^{-2}\Delta$, $\lambda = h^{-1}$), the introduction of the additional parameter $z$ is pointless and one works directly with the parameter $\lambda = h^{-1}\sqrt{z}$. In that case the resolvent $R(\lambda) = (-\Delta - \lambda^2)^{-1}$ is well defined for $\operatorname{Im} \lambda < 0$, the essential spectrum is precisely the axis $\lambda \in \mathbb{R}$ and the resolvent admits a meromorphic continuation from $\operatorname{Im} z < 0$ toward the upper half-plane (with possibly a cut at 0):

$$R(\lambda) : L^2(\Omega)_{\mathrm{comp}} \to L^2(\Omega)_{\mathrm{loc}}$$

The acoustic resonances are by definition the poles of this meromorphic continuation. They are related to semiclassical resonances by the relation

$$\sqrt{\mathrm{Res}_{sc}} = h\,\mathrm{Res}_{ac}$$

It can also be shown that if $z$ is a resonance, there exists an associated resonant state $e_z$ such that

$$(P_h - z)\,e_z = 0$$

the function $e_z$ satisfies Sommerfeld radiation conditions (in polar coordinates $(r, \theta) \in [0, +\infty) \times S^{n-1}$)

$$|h\,\partial_r e - i\sqrt{z}\,e| \le C\,|e^{i\sqrt{z}\,r}| / r^{1+n/2}$$
The operator $P$ (or by extension the obstacle in the case of acoustic scattering) is said to be nontrapping at energy $E$ if all generalized bicharacteristics go to infinity:

$$\lim_{s \to \pm\infty} |x(s)| = +\infty$$

The operator $P$ (or by extension the obstacle in the case of acoustic scattering) is said to be nontrapping near energy $E$ if $P$ is nontrapping at energy $E'$ for $E'$ in a neighborhood of $E$.

The following result was obtained in different generalities by Morawetz (1975), Melrose and Sjöstrand (1978), and others.

Theorem 1 Assume that the operator $P$ is nontrapping near energy $E$. Then for any $N > 0$ there exists $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

$$\{z;\ |\operatorname{Im} z| \le -Nh\log(h)\}$$

In the case of analytic geometries (and coefficients), this result (see Bardos et al. 1987) can be improved to

Theorem 2 Assume that the operator $P$ is nontrapping. Then there exist $\varepsilon > 0$, $N_0 > 0$ and $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

$$\{z;\ |\operatorname{Im} z| \le N_0 h^{1-(1/3)}\} \cap \{|z - E| \le \varepsilon\}$$

Remark 2 In the case of acoustical scattering, with the new definition of resonances, $\lambda = h^{-1}\sqrt{z}$, the resonance-free zones have respectively the forms

$$\{z;\ |\operatorname{Im} z| \le N\log(|z|),\ |z| \gg 1\}$$

$$\{z;\ |\operatorname{Im} z| \le N_0 |z|^{1/3},\ |z| \gg 1\}$$

In the case of trapping perturbations, the first result was obtained by Burq (1998).

Theorem 3 There exist $C > 0$ and $h_0 > 0$ such that for $0 < h < h_0$ there are no resonances in the set

$$\{z;\ |\operatorname{Im} z| \le N_0 e^{-C/h}\} \cap \{|z - E| \le \varepsilon\}$$

Resonances and Time Asymptotics

The relationship between eigenfunctions/eigenvalues and time asymptotics is straightforward. This is no longer the case for resonances. For nontrapping problems however, this question has been studied in the late 1960s by Lax and Phillips (1989) and Vainberg (1968). In particular, this approach was decisive to study the local energy decay in acoustical scattering. As a consequence of Theorem 1, we have

Theorem 4 If the acoustical problem is nontrapping, then there exist $C, \alpha > 0$ such that for any solution of the wave equation

$$\Box u = 0;\quad u|_{t=0} = u_0;\quad \partial_t u|_{t=0} = u_1;\quad u|_{\Gamma_D} = 0;\quad \frac{\partial u}{\partial n}\Big|_{\Gamma_N} = 0$$

with compactly supported initial data $(u_0, u_1)$ (in a fixed compact), one has

$$E_{\mathrm{loc}}(u) = \int_{\Omega \cap \{|x| \le C\}} |\nabla u|^2 + |\partial_t u|^2 \le \begin{cases} C e^{-\alpha t} & \text{if the space dimension is odd} \\ \dfrac{C}{t^d} & \text{if the space dimension is even} \end{cases} \qquad [4]$$

Trapping perturbations were investigated more recently. In that case, the local energy decays, but the rate cannot be uniform. The first trapping example in acoustic scattering was studied by Ikawa (1983): the obstacle is the union of a finite number (and at least two) convex bodies. In that case, one has

Theorem 5 For any $\varepsilon > 0$ there exists $C > 0$ such that for any initial data supported in a fixed compact set

$$E_{\mathrm{loc}}(u)(t) \le C e^{-\alpha t}\, \|(u_0, u_1)\|^2_{D((1-\Delta)^{(1+\varepsilon)/2})}$$

where $D((1-\Delta)^{(1+\varepsilon)/2})$ is the domain of the operator $(1-\Delta)^{(1+\varepsilon)/2}$. Remark that the norm in $D((1-\Delta)^{1/2})$ is the natural energy and consequently the estimate above exhibits a loss of $\varepsilon$ derivatives.

For strongly trapping perturbations, the results are worse. They are consequences of Theorem 3.

Theorem 6 For any $k$ there exists $C_k > 0$ such that for any initial data supported in a fixed compact set

$$E_{\mathrm{loc}}(u)(t) \le \frac{C_k}{\log(t)^{2k}}\, \|(u_0, u_1)\|^2_{D((1-\Delta)^{(1+k)/2})}$$

One can also obtain real asymptotic expansions in terms of resonances (see the work by Tang and Zworski (2000)).

Theorem 7 Let $\chi \in C_c^\infty(\mathbb{R}^n)$ and $\psi \in C_c^\infty((0, \infty))$ and let $\operatorname{chsupp} \psi = [a, b]$. There exists $0 < \delta < c(h) < 2\delta$ such that for every $M > M_0$ there exists $L = L(M)$, and we have

$$\chi\, e^{-itP/h}\, \chi\, \psi(P) = \sum_{z \in \Omega(h) \cap \mathrm{Res}(P)} \chi\, \mathrm{Res}\big(e^{-it\,\bullet/h} R(\bullet, h), z\big)\, \chi\, \psi(P) + O_{H \to H}(h^\infty), \quad \text{for } t > h^{-L} \qquad [5]$$

$$\Omega(h) = (a - c(h),\ b + c(h)) - i[0, h^M)$$
where $\mathrm{Res}(f(\bullet), z)$ denotes the residue of a meromorphic family of operators, $f$, at $z$.

The function $c(h)$ depends on the distribution of resonances: roughly speaking we cannot ‘‘cut’’ through a dense cloud of resonances. Even in the very well understood case of the modular surface there is, currently at least, a need for some nonexplicit grouping of terms. The same ideas can be applied to acoustic scattering.

Trace Formulas

Trace formulas provide a description of the classical/quantum correspondence: one side is given by the trace of a certain function of the operator $f(P_h)$, whereas the other side is described in terms of classical objects (closed orbits of the classical flow). In the case of discrete eigenvalues, the question is relatively simple and can be solved by using the spectral theorem. In the case of continuous spectrum, the problem is much more subtle (self-adjoint operators with continuous spectrum behave in some ways as non-normal operators). It has been studied by Lax and Phillips (1989), Bardos et al. (1982), and Melrose (1982). More recently, Sjöstrand (1997) introduced a local notion of trace formulas.

Let $W \subset \Omega$ be open precompact subsets of $e^{i[-2\theta_0, 0]}\,]0, +\infty[$. Assume that the intersections $I$ and $J$ of $W$ and $\Omega$ with the real axis are intervals and that $\Omega$ is simply connected.

Theorem 8 Let $f(z, h)$ be a family of holomorphic functions on $z \in \Omega$ such that $|f|_{\Omega \setminus W}| \le 1$. Let $\chi \in C_0^\infty(\mathbb{R})$ be equal to 1 on a neighborhood of $I$. Then

$$\operatorname{Trace}\big[(\chi f)(P_h) - (\chi f)(-h^2\Delta)\big] = \sum_{\lambda \text{ a resonance of } P_h \,\cap\, \Omega} f(\lambda, h) + O(h^{-n})$$

The use of this result with a clever choice of functions $f$ allows Sjöstrand to show that an analytic singularity of the function $E \mapsto \mathrm{Vol}(\{x;\ V(x) \ge E\})$ (observe that if $V$ is bounded, this function vanishes for large $E$ and consequently it has analytic singularities) gives a lower bound for a neighborhood $\Omega$ of $E$

$$\#\,\mathrm{Res}(P_h) \cap \Omega \ge c\,h^{-n}$$

which coincides with the upper bound (see Zworski (2002) and the references given there).

Location of Resonances

In some particular cases, one can expect to have a precise description of the location of resonances. This is the case in Ikawa’s example in acoustic scattering where the obstacle is the union of two disjoint convex bodies. In this case, the line minimizing the distance, $d$, between the bodies is trapped. However, this trapped trajectory is isolated and of hyperbolic type (unstable). Ikawa (1983) and Gérard (1988) have obtained:

Theorem 9 There exist geometric positive constants $k_p \to +\infty$ as $p \to +\infty$ such that all resonances located above the line $\operatorname{Im} z \ge -C$ ($C$ arbitrary large but fixed) have an asymptotic expansion

$$\lambda_{j,p} \sim \mu_{j,p} + \sum_l a_{l,p}\, \mu_{j,p}^{-l/2} + O(\mu_{j,p}^{-\infty}), \quad j \to +\infty$$

where the approximate resonances

$$\mu_{j,p} = j\frac{\pi}{d} - i k_p$$

are located on horizontal lines.

Another example is when the obstacle is convex. This example is nontrapping and Sjöstrand and Zworski (1999) are able to prove that the resonances in any region $\operatorname{Im} z \ge -N|z|^{1/3}$ ($N$ arbitrary large) are asymptotically distributed near cubic curves

$$C_j = \{z \in \mathbb{C};\ \operatorname{Im} z = -c_j |z|^{1/3}\}$$

Finally, the last main example where one can give a precise asymptotic for resonances is when there exists a stable (elliptic) periodic trajectory for the Hamiltonian flow. In that case it had been known from the 1960s (see the works by Babič (1968)) that one can construct quasimodes, that is, compactly supported approximate solutions of the eigenfunctions equation:

$$(P_h - E_h)\,e_j = O(h^\infty)$$

It is only recently that Tang and Zworski (1998) and Stefanov (1999) proved that these quasimodes constructions imply the existence of resonances asymptotic to $E_h$, $h \to 0$.

See also: h-Pseudodifferential Operators and Applications; Semi-Classical Spectra and Closed Orbits.

Further Reading

Babič VM (1968) Eigenfunctions which are concentrated in the neighborhood of a closed geodesic. Zapiski Nauchnykh Seminarov Leningradskogo Otdeleniya Matematicheskogo Instituta Imeni V.A. Steklova 9: 15–63.
Bardos C, Guillot JC, and Ralston J (1982) La relation de Poisson pour l’équation des ondes dans un ouvert non borné. Application à la théorie de la diffusion. Communications in Partial Differential Equations 7: 905–958.
Bardos C, Lebeau G, and Rauch J (1987) Scattering frequencies and Gevrey 3 singularities. Inventiones Mathematicae 90: 77–114.
Burq N (1998) Décroissance de l’énergie locale de l’équation des ondes pour le problème extérieur et absence de résonance au voisinage du réel. Acta Mathematica 180: 1–29.
Combes J-M, Duclos P, and Seiler R (1984) On the shape resonance. In: Resonances – Models and Phenomena (Bielefeld, 1984), Lecture Notes in Physics, vol. 211, pp. 64–77. Berlin: Springer.
Gérard C (1988) Asymptotique des pôles de la matrice de scattering pour deux obstacles strictement convexes. Supplément au Bulletin de la Société Mathématique de France 116: 146 pp.
Helffer B and Sjöstrand J (1986) Resonances en limite semi-classique. Mémoire de la S.M.F 114(24–25): 228 pp.
Hislop PD and Sigal IM (1987) Shape resonances in quantum mechanics. In: Differential Equations and Mathematical Physics (Birmingham, Ala., 1986), Lecture Notes in Mathematics, vol. 1285, pp. 180–196. Berlin: Springer.
Ikawa M (1983) On the poles of the scattering matrix for two convex obstacles. Journal of Mathematics of the Kyoto University 23: 127–194.
Lax PD and Phillips RS (1989) Scattering Theory, Pure and Applied Mathematics, 2nd edn., vol. 26. Boston: Academic Press.
Melrose RB (1982) Scattering theory and the trace of the wave group. Journal of Functional Analysis 45: 429–440.
Melrose RB and Sjöstrand J (1978) Singularities of boundary value problems. I. Communications in Pure and Applied Mathematics 31: 593–617.
Melrose RB and Sjöstrand J (1982) Singularities of boundary value problems. II. Communications in Pure and Applied Mathematics 35: 129–168.
Morawetz CS (1975) Decay for solutions of the exterior problem for the wave equation. Communications in Pure and Applied Mathematics 28: 229–264.
Sjöstrand J (1997) A trace formula and review of some estimates for resonances. In: Rodino L (ed.) Microlocal Analysis and Spectral Theory (Lucca, 1996), NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., vol. 490, pp. 377–437. Dordrecht: Kluwer Academic.
Sjöstrand J and Zworski M (1991) Complex scaling and the distribution of scattering poles. Journal of the American Mathematical Society 4(4): 729–769.
Sjöstrand J and Zworski M (1999) Asymptotic distribution of resonances for convex obstacles. Acta Mathematica 183(2): 191–253.
Stefanov P (1999) Quasimodes and resonances: sharp lower bounds. Duke Mathematical Journal 99(1): 75–92.
Tang S-H and Zworski M (1998) From quasimodes to resonances. Mathematical Research Letters 5(3): 261–272.
Tang S-H and Zworski M (2000) Resonance expansions of scattered waves. Communications in Pure and Applied Mathematics 53(10): 1305–1334.
Vainberg BR (1968) On the analytical properties of the resolvent for a certain class of operator pencils. Mathematics of the USSR Sbornik 6(2): 241–273.
Vodev G (2001) Resonances in euclidean scattering. Cubo Matematica Educacional 3: 317–360. http://www.math.sciences.univ-nantes.fr.
Zworski M (1994) Counting scattering poles. In: Ikawa M (ed.) Spectral and Scattering Theory (Sanda, 1992), Lecture Notes in Pure and Applied Mathematics, vol. 161, pp. 301–331. New York: Dekker.
Zworski M (2002) Quantum resonances and partial differential equations. In: Li Ta Tsien (ed.) Proceedings of the International Congress of Mathematicians (Beijing, 2002), vol. III, pp. 243–252. Beijing: Higher Education Press.

Riemann Surfaces
K Hulek, Universität Hannover, Hannover, Germany
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Riemann surfaces were first studied as the natural domain of definition of (multivalued) holomorphic or meromorphic functions. They were the starting point for the development of the theory of real and complex manifolds (see Weyl (1997)). Nowadays, Riemann surfaces are simply defined as one-dimensional complex manifolds (see the next section). Compact Riemann surfaces can be embedded into projective spaces and are thus, by virtue of Chow’s theorem, algebraic curves. By uniformization theory, the universal cover of a connected Riemann surface is either the unit disk, the complex plane, or the Riemann sphere (see the section ‘‘Uniformization’’).

This article discusses the basic theory of compact Riemann surfaces, such as their topology, their periods, and the definition of the Jacobian variety. Studying the zeros and poles of meromorphic functions leads to the notion of divisors and linear systems. In modern language this can be rephrased in terms of line bundles, resp. locally free sheaves (see the section ‘‘Divisors, linear systems, and line bundles’’). One of the fundamental results is the Riemann–Roch theorem which expresses the difference between the dimension of a linear system and that of its adjoint system in terms of the degree of the linear system and the genus of the curve. This theorem has been vastly generalized and is truly one of the cornerstones of algebraic geometry. A formulation of this result and some of its applications are also discussed.
Figure 1 Charts of a complex manifold (charts $f_\alpha : U_\alpha \to V_\alpha \subset \mathbb{C}^n$ and $f_\beta : U_\beta \to V_\beta \subset \mathbb{C}^n$ on $M$, with transition map $f_\beta \circ f_\alpha^{-1}$).

Uniformization

If $M$ is a Riemann surface, then its universal covering $\tilde{M}$ is again a Riemann surface. The connected and simply connected Riemann surfaces can be fully classified. Let

$$E = \{z \in \mathbb{C};\ |z| < 1\}$$

be the unit disk and $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ the Riemann sphere. The latter can be identified with the complex projective line $\mathbb{P}^1_{\mathbb{C}}$.

Theorem 1 (Generalized Riemann mapping theorem). Every connected and simply connected Riemann surface is biholomorphically equivalent
to the unit disk $E$, the complex plane $\mathbb{C}$, or the Riemann sphere $\hat{\mathbb{C}}$.

This theorem was proved rigorously by Koebe and Poincaré at the beginning of the twentieth century.

Compact Riemann Surfaces

The topological structure of a compact Riemann surface $C$ is determined by its genus $g$ (Figure 3). Topologically, a Riemann surface of genus $g$ is a sphere with $g$ handles or, equivalently, a torus with $g$ holes.

Figure 3 Genus of Riemann surfaces (panels: $g = 0$, $g = 1$, genus $g$).

Analytically, the genus can be characterized as the maximal number of linearly independent holomorphic forms on $C$ (see also the section ‘‘The Riemann–Roch theorem and applications’’).

There exists a very close link with algebraic geometry: every compact Riemann surface $C$ can be embedded into some projective space $\mathbb{P}^n_{\mathbb{C}}$ (in fact already into $\mathbb{P}^3_{\mathbb{C}}$). By Chow’s theorem, $C$ is then a (projective) algebraic variety, that is, it can be described by finitely many homogeneous equations. It should be noted that such a phenomenon is special to complex dimension 1. The crucial point is that one can always construct a nonconstant meromorphic function on a Riemann surface (e.g., by Dirichlet’s principle). Given such a function, it is not difficult to find a projective embedding of a compact Riemann surface $C$. On the other hand, it is easy to construct a compact two-dimensional torus $T = \mathbb{C}^2/L$ for some suitably chosen lattice $L$, which cannot be embedded into any projective space $\mathbb{P}^n_{\mathbb{C}}$.

The dichotomy Riemann surface/algebraic curve arises from different points of view: analysts think of a real two-dimensional surface with a Riemannian metric which, via isothermal coordinates, defines a holomorphic structure, whereas algebraic geometers think of a complex one-dimensional object.

In this article, the expressions compact Riemann surface and (projective) algebraic curve are both used interchangeably. The choice depends on which expression is more commonly used in the part of the theory which is discussed in the relevant section.

Periods and the Jacobian

On a compact Riemann surface $C$ of genus $g$, there exist $2g$ homologically independent paths, that is, $H_1(C, \mathbb{Z}) \cong \mathbb{Z}^{2g}$. Let $\gamma_1, \ldots, \gamma_{2g}$ be a basis of $H_1(C, \mathbb{Z})$ and let $\omega_1, \ldots, \omega_g$ be a basis of the space of holomorphic 1-forms on $C$. Integrating these forms over the paths $\gamma_1, \ldots, \gamma_{2g}$ defines the period matrix

$$\Omega = \begin{pmatrix} \int_{\gamma_1} \omega_1 & \cdots & \int_{\gamma_{2g}} \omega_1 \\ \vdots & \ddots & \vdots \\ \int_{\gamma_1} \omega_g & \cdots & \int_{\gamma_{2g}} \omega_g \end{pmatrix}$$

If $Q = (\gamma_i, \gamma_j)$ is the intersection matrix of the paths $\gamma_1, \ldots, \gamma_{2g}$, then $\Omega$ satisfies the Riemann bilinear relations

$$\Omega Q \Omega^t = 0; \qquad \sqrt{-1}\,\Omega Q \bar{\Omega}^t > 0 \qquad [1]$$

where the latter condition means positive definite. One can choose (see Figure 4) $\gamma_1, \ldots, \gamma_{2g}$ such that

$$Q = J = \begin{pmatrix} 0 & 1_g \\ -1_g & 0 \end{pmatrix}$$

where $1_g$ is the $g \times g$ unit matrix.

Figure 4 Homology of a compact Riemann surface (paths $\gamma_1$, $\gamma_2$, $\gamma_3$, $\gamma_4$).

Moreover, $\omega_1, \ldots, \omega_g$ can be chosen such that

$$\Omega = \begin{pmatrix} 1 & \cdots & 0 & \tau_{11} & \cdots & \tau_{1g} \\ \vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1 & \tau_{g1} & \cdots & \tau_{gg} \end{pmatrix}$$

Let

$$\Omega' = (\tau_{ij})_{1 \le i, j \le g}$$

Then the Riemann bilinear relations [1] become

$$\Omega' = {\Omega'}^t; \qquad \operatorname{Im} \Omega' > 0$$

that is, $\Omega'$ is an element of the Siegel upper half-space

$$H_g = \{\tau \in \mathrm{Mat}(g \times g, \mathbb{C});\ \tau = \tau^t,\ \operatorname{Im} \tau > 0\}$$

The matrix $\Omega'$ is defined by the Riemann surface $C$ only up to the action of the symplectic group

$$\mathrm{Sp}(2g, \mathbb{Z}) = \{M \in \mathrm{Mat}(2g \times 2g, \mathbb{Z});\ M J M^t = J\}$$
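The conditions defining the Siegel upper half-space and the symplectic group can be checked mechanically for small examples. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; all function names are illustrative and not part of the article. It tests the condition M J M^t = J for integer matrices, and it tests whether a matrix tau is symmetric with positive definite imaginary part (using a Cholesky-type test, which is one possible choice).

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def standard_J(g):
    """The 2g x 2g block matrix J = [[0, 1_g], [-1_g, 0]]."""
    J = [[0] * (2 * g) for _ in range(2 * g)]
    for i in range(g):
        J[i][g + i] = 1
        J[g + i][i] = -1
    return J


def is_symplectic(M):
    """True if the square integer matrix M of even size satisfies M J M^t = J."""
    g = len(M) // 2
    J = standard_J(g)
    return matmul(matmul(M, J), transpose(M)) == J


def is_positive_definite(S, tol=1e-12):
    """Cholesky-type test for a real symmetric matrix S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False  # a non-positive pivot: not positive definite
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


def in_siegel_upper_half_space(tau, tol=1e-12):
    """True if tau = tau^t and Im(tau) is positive definite."""
    g = len(tau)
    symmetric = all(
        abs(tau[i][j] - tau[j][i]) <= tol for i in range(g) for j in range(g)
    )
    imag = [[complex(z).imag for z in row] for row in tau]
    return symmetric and is_positive_definite(imag, tol)


# Two standard generators of Sp(2, Z) = SL(2, Z) (genus 1).
S_GEN = [[0, 1], [-1, 0]]
T_GEN = [[1, 1], [0, 1]]
```

For genus 1 the two matrices `S_GEN` and `T_GEN` pass `is_symplectic`, while a matrix of determinant 2 such as `[[2, 0], [0, 1]]` does not.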
is a hypersurface (i.e., has codimension 1 in $J(C)$) and is called a theta divisor. A different choice of the base point $P_0$ results in a translation of the theta divisor. Using the theta divisor, one can show that $J(C)$ is an abelian variety, that is, $J(C)$ can be embedded into some projective space $\mathbb{P}^n_{\mathbb{C}}$. The pair $(J(C), \Theta)$ is a principally polarized abelian variety and Torelli’s theorem states that $C$ can be reconstructed from its Jacobian $J(C)$ and the theta divisor $\Theta$.

The fibers of this map are of particular interest.

Theorem 2 (Abel). Two effective divisors $D_1$ and $D_2$ on $C$ of the same degree $d$ are linearly equivalent if and only if $u_d(D_1) = u_d(D_2)$.

One normally denotes the inverse image of $u_d(D)$ by

$$|D| = u_d^{-1}(u_d(D)) = \{D';\ D' \ge 0,\ D' \sim D\}$$

Note that the latter description also makes sense if $D$ itself is not necessarily effective. One calls $|D|$ the
complete linear system defined by the divisor $D$. If $\deg D < 0$, then automatically $|D| = \emptyset$, but the converse is not necessarily true. Let $M_C$ be the field of meromorphic (or equivalently rational) functions on $C$. Then, one defines

$$L(D) = \{f \in M_C;\ (f) \ge -D\}$$

This is a $\mathbb{C}$-vector space and it is not difficult to see that $L(D)$ has finite dimension. To every function $0 \ne f \in L(D)$, one can associate the effective divisor

$$D_f = (f) + D \ge 0$$

Clearly, $D_f \sim D$ and every effective divisor with this property arises in this way. This gives a bijection

$$\mathbb{P}(L(D)) = |D|$$

showing that the complete linear system $|D|$ has the structure of a projective space. A linear system is a projective subspace of some complete linear system $|D|$.

Clearly, the map $u_d : C_d \to J(C)$ can be extended to the set $\mathrm{Div}^d(C)$ of degree $d$ divisors and Abel’s theorem then states that this map factors through $\mathrm{Cl}^d(C)$, that is, that we have a commutative diagram

$$\begin{array}{ccc} \mathrm{Div}^d(C) & \longrightarrow & \mathrm{Cl}^d(C) \\ & {\scriptstyle u_d}\searrow & \downarrow {\scriptstyle \bar{u}_d} \\ & & J(C) \end{array}$$

where $\bar{u}_d$ is injective.

Theorem 3 (Jacobi’s Inversion Theorem). The map $\bar{u}_d$ is surjective and hence induces an isomorphism

$$\bar{u}_d : \mathrm{Cl}^d(C) \cong J(C)$$

It should be noted that the definition of the maps $u_d$ depends on the choice of a base point $P_0 \in C$. Hence, the maps $\bar{u}_d$ are not canonical, with the exception of the isomorphism $\bar{u}_0 : \mathrm{Cl}^0(C) \cong J(C)$ where the choice of $P_0$ drops out.

The concepts of divisors and linear systems can be rephrased in the language of line bundles. A (holomorphic) vector bundle on a complex manifold $M$ is a complex manifold $E$ together with a projection $p : E \to M$ which is a locally trivial $\mathbb{C}^r$-bundle. This means that an open covering $(U_\alpha)_{\alpha \in A}$ of $M$ and local trivializations

$$\varphi_\alpha : p^{-1}(U_\alpha) \xrightarrow{\ \cong\ } U_\alpha \times \mathbb{C}^r, \qquad \mathrm{pr}_{U_\alpha} \circ \varphi_\alpha = p$$

exist, such that the transition maps

$$\varphi_\alpha \circ \varphi_\beta^{-1}\big|_{(U_\alpha \cap U_\beta) \times \mathbb{C}^r} : (U_\alpha \cap U_\beta) \times \mathbb{C}^r \to (U_\alpha \cap U_\beta) \times \mathbb{C}^r$$

are fiberwise linear isomorphisms. If $M$ is connected, then $r$ is constant and is called the rank of the vector bundle. A line bundle is simply a rank-1 vector bundle.

Alternatively, one can view vector bundles as locally free $\mathcal{O}_M$-modules, where $\mathcal{O}_M$ denotes the structure sheaf of holomorphic (or in the algebro-geometric setting regular) functions on $M$. An $\mathcal{O}_M$-module $\mathcal{E}$ is called locally free of rank $r$, if an open covering $(U_\alpha)_{\alpha \in A}$ of $M$ exists such that $\mathcal{E}|_{U_\alpha} \cong \mathcal{O}_{U_\alpha}^{\oplus r}$. The transition functions of a locally free sheaf can be used to define a vector bundle and vice versa, and hence the concepts of vector bundles and locally free sheaves can be used interchangeably. The open coverings $U_\alpha$ can be viewed either in the complex topology, or, if $M$ is an algebraic variety, in the Zariski topology, thus leading to either holomorphic vector bundles (locally free sheaves in the $\mathbb{C}$-topology) or algebraic vector bundles (locally free sheaves in the Zariski topology). Clearly, every algebraic vector bundle defines a holomorphic vector bundle. Conversely, on a projective variety $M$, Serre’s GAGA theorem (géométrie algébrique et géométrie analytique), a vast generalization of Chow’s theorem, states that there exists a bijection between the equivalence classes of algebraic and holomorphic vector bundles (locally free sheaves).

The Picard group $\mathrm{Pic}\, M$ is the set of all isomorphism classes of line bundles on $M$. The tensor product defines a group structure on $\mathrm{Pic}\, M$ where the neutral element is the trivial line bundle $\mathcal{O}_M$ and the inverse of a line bundle $\mathcal{L}$ is its dual bundle $\mathcal{L}^{*}$, which is also denoted by $\mathcal{L}^{-1}$. For this reason, locally free sheaves of rank 1 are also called invertible sheaves.

We now return to the case of a compact Riemann surface (algebraic curve) $C$. The concepts of line bundles and divisors can be translated into each other. If $D = \sum n_i P_i$ is a divisor on $C$ and $U$ an open set, then we denote by $D_U$ the restriction of $D$ to $U$, that is, the divisor consisting of all points $P_i \in U$ with multiplicity $n_i$. One then defines a locally free sheaf (line bundle) $\mathcal{L}(D)$ by

$$\mathcal{L}(D)(U) = \{f \in M_C(U);\ (f) \ge -D_U\}$$

To see that this is locally free, it is enough to consider for each point $P_i$ a neighborhood $U_i$ on which a holomorphic function $t_i$ exists, which vanishes only at $P_i$ and there of order 1 (i.e., it is a local parameter near the point $P_i$). Then,

$$\mathcal{L}(D)(U_i) = t_i^{-n_i}\, \mathcal{O}_{U_i} \cong \mathcal{O}_{U_i}$$

This correspondence defines a map

$$\mathrm{Div}\, C \to \mathrm{Pic}\, C, \qquad D \mapsto \mathcal{L}(D)$$
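The bookkeeping behind divisors (formal sums of points, degree, effectivity, restriction to an open set) can be sketched in a few lines. The class below is a hypothetical illustration only: points are represented by arbitrary hashable labels and open sets by Python sets of labels, which is an assumption made for the example and not a construction from the article.

```python
class Divisor:
    """Formal finite sum D = sum n_i P_i with integer multiplicities n_i."""

    def __init__(self, coeffs):
        # Drop points with multiplicity zero so that equal divisors compare equal.
        self.coeffs = {p: n for p, n in dict(coeffs).items() if n != 0}

    def degree(self):
        """deg D = sum of the multiplicities n_i."""
        return sum(self.coeffs.values())

    def is_effective(self):
        """D >= 0 means every multiplicity is non-negative."""
        return all(n >= 0 for n in self.coeffs.values())

    def restrict(self, U):
        """D_U: keep only the points P_i lying in the set U."""
        return Divisor({p: n for p, n in self.coeffs.items() if p in U})

    def __add__(self, other):
        points = set(self.coeffs) | set(other.coeffs)
        return Divisor(
            {p: self.coeffs.get(p, 0) + other.coeffs.get(p, 0) for p in points}
        )

    def __neg__(self):
        return Divisor({p: -n for p, n in self.coeffs.items()})

    def __eq__(self, other):
        return isinstance(other, Divisor) and self.coeffs == other.coeffs


D = Divisor({"P1": 2, "P2": -1})   # the divisor 2*P1 - P2
D_U = D.restrict({"P1", "P3"})     # restriction to a set containing only P1
```

In this toy model the degree is additive under `+`, and `D + (-D)` is the zero divisor.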
It is not hard to show that:

1. every line bundle $\mathcal{L} \in \mathrm{Pic}\, C$ is of the form $\mathcal{L} = \mathcal{L}(D)$ for some divisor $D$ on the curve $C$;
2. $D_1 \sim D_2 \iff \mathcal{L}(D_1) \cong \mathcal{L}(D_2)$;
3. $\mathcal{L}(D_1) \otimes \mathcal{L}(D_2) \cong \mathcal{L}(D_1 + D_2)$; and
4. $\mathcal{L}(-D) \cong \mathcal{L}(D)^{-1}$.

Hence, there is an isomorphism of abelian groups

$$\mathrm{Cl}(C) \cong \mathrm{Pic}\, C$$

This correspondence allows to define the degree of a line bundle $\mathcal{L}$. In the complex analytic setting this can also be interpreted as follows. Let $\mathcal{O}_C^{*}$ be the sheaf of nowhere-vanishing functions. Using cocycles, one easily identifies

$$H^1(C, \mathcal{O}_C^{*}) \cong \mathrm{Pic}\, C$$

and the exponential sequence

$$0 \to \mathbb{Z} \to \mathcal{O}_C \xrightarrow{\ \exp\ } \mathcal{O}_C^{*} \to 0$$

induces an exact sequence

$$0 \to H^1(C, \mathbb{Z}) \to H^1(C, \mathcal{O}_C) \to H^1(C, \mathcal{O}_C^{*}) = \mathrm{Pic}\, C \to H^2(C, \mathbb{Z})$$

The last map in this exact sequence associates to each line bundle $\mathcal{L}$ its first Chern class $c_1(\mathcal{L}) \in H^2(C, \mathbb{Z}) \cong \mathbb{Z}$, which can be identified with the degree of $\mathcal{L}$. Hence, the subgroup $\mathrm{Pic}^0 C$ of degree 0 line bundles on $C$ is isomorphic to

$$\mathrm{Pic}^0 C \cong H^1(C, \mathcal{O}_C)/H^1(C, \mathbb{Z})$$

Altogether there are identifications

canonical divisors are the divisors of the meromorphic 1-forms on $C$, whereas the effective canonical divisors correspond to the divisors of holomorphic 1-forms (here, we simply write a 1-form locally as $f(z)\,dz$ and define a divisor by taking the zeros, resp. poles of $f(z)$). By abuse of notation, we also denote the divisor class corresponding to canonical divisors by $K_C$. There is a natural identification

$$\mathbb{P}(H^0(C, \omega_C)) = |K_C|$$

For a divisor $D$, the index of speciality is defined by

$$i(D) = l(K_C - D) = \dim_{\mathbb{C}} L(K_C - D)$$

The linear system $|K_C - D|$ is called the adjoint system of $|D|$. A crucial role is played by the

Theorem 4 (Riemann–Roch). For any divisor $D$ on a compact Riemann surface $C$ of genus $g$, the equality

$$l(D) - i(D) = \deg D + 1 - g \qquad [2]$$

holds.

This can also be written in terms of line bundles. If $\mathcal{L}$ is any line bundle, then we denote the dimension of the space of global sections by

$$h^0(\mathcal{L}) = \dim_{\mathbb{C}} H^0(C, \mathcal{L})$$

Then, the Riemann–Roch theorem can be written as

$$h^0(\mathcal{L}) - h^0(\omega_C \otimes \mathcal{L}^{-1}) = \deg \mathcal{L} + 1 - g \qquad [3]$$

This can be written yet again in a different way, if we use sheaf cohomology. By Serre duality, there is an isomorphism of cohomology groups

$$H^1(C, \mathcal{L}) \cong H^0(C, \omega_C \otimes \mathcal{L}^{-1})$$
viewed as special cases of the Atiyah–Singer index theorem for elliptic operators. The latter also contains the Gauss–Bonnet theorem from differential geometry as a special case. Moreover, Serre duality holds in much greater generality, namely for coherent sheaves on projective varieties.

Applying the Riemann–Roch theorem [3] to the zero divisor $D = 0$, resp. the trivial line bundle $\mathcal{O}_C$, one obtains

Proposition 1 Let $D$ be a divisor of degree $d$ on the curve $C$. Then

(i) $|D|$ is base point free if $d \ge 2g$ and
(ii) $|D|$ is very ample if $d \ge 2g + 1$.

If the genus $g(C) \ge 2$, then one can prove that $|K_C|$ is base point free and consider the canonical map

$$\varphi_{|K_C|} : C \to \mathbb{P}^{g-1}$$
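The numerical content of the Riemann–Roch formula [2] and of the degree thresholds in Proposition 1 can be tabulated with a short script. This is a minimal sketch with hypothetical function names. It relies on the standard fact that a canonical divisor has degree 2g − 2 (an assumption not restated in the surrounding text), so that i(D) = 0 whenever deg D > 2g − 2, and on the observation that |D| is empty when deg D < 0. In the intermediate range the value of l(D) depends on the divisor itself, so the function returns `None` there.

```python
def riemann_roch_difference(g, d):
    """Right-hand side of [2]: l(D) - i(D) = deg D + 1 - g."""
    return d + 1 - g


def l_from_degree(g, d):
    """Return l(D) when it is determined by g and d = deg D alone, else None.

    d < 0       : |D| is empty, so l(D) = 0.
    d > 2g - 2  : deg(K_C - D) < 0, so i(D) = 0 and l(D) = d + 1 - g.
    otherwise   : l(D) depends on the particular divisor D.
    """
    if d < 0:
        return 0
    if d > 2 * g - 2:
        return riemann_roch_difference(g, d)
    return None


def guaranteed_by_proposition_1(g, d):
    """Sufficient degree conditions of Proposition 1 (not necessary ones)."""
    return {
        "base_point_free": d >= 2 * g,
        "very_ample": d >= 2 * g + 1,
    }
```

For example, on a curve of genus 1 a divisor of degree 3 has `l_from_degree(1, 3) == 3` and satisfies both conditions of Proposition 1, which matches the classical embedding of an elliptic curve as a plane cubic.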
Moduli of Compact Riemann Surfaces

As a set, the moduli space of compact Riemann surfaces of genus $g$ is defined as

$$M_g = \{C;\ C \text{ is a compact Riemann surface of genus } g\}/\cong$$

For genus $g = 0$, the only Riemann surface is the Riemann sphere $\hat{\mathbb{C}} = \mathbb{P}^1$ and hence $M_0$ consists of one point only. Every Riemann surface of genus 1 is a torus

$$E = \mathbb{C}/L$$

for some lattice $L$, which can be written in the form

$$L = \mathbb{Z}\tau + \mathbb{Z}, \qquad \operatorname{Im} \tau > 0$$

Two elliptic curves $E = \mathbb{C}/L$ and $E' = \mathbb{C}/L'$ are isomorphic if and only if a matrix

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{Z})$$

exists with

looks like $\mathbb{C}^{3g-3}/G$ near the origin, where $G$ is a finite group acting linearly on $\mathbb{C}^{3g-3}$. One expresses this by saying that $M_g$ has only finite quotient singularities. A space with this property is also sometimes referred to as a V-manifold or an orbifold. Moreover, $M_g$ is a quasiprojective variety, that is, a Zariski-open subset of a projective variety. As the above parameter count implies, the dimension of $M_g$ is $3g - 3$. At this point it can also be clarified what is meant by a general curve in the context of Brill–Noether theory: a property is said to hold for the general curve in Brill–Noether theory if it holds outside a countable number of proper subvarieties of $M_g$.

It is often useful to work with projective, rather than quasiprojective, varieties. This means that one wants to compactify $M_g$ to a projective variety $\overline{M}_g$, preferably in such a way that the points one adds still correspond to geometric objects. The crucial concept in this context is that of a stable curve. A stable curve of genus $g$ is a one-dimensional projective variety with the following properties:
is a Cartier divisor on $SU_C(r)$ and thus defines a line bundle $\mathcal{L}$ on $SU_C(r)$. This is a natural generalization of the construction of the classical theta divisor. The line bundle $\mathcal{L}$ generates the Picard group of the moduli space $SU_C(r)$.

Theorem 14 (Verlinde Formula). If $C$ has genus $g$ and $k$ is a positive integer, then

$$\dim H^0(SU_C(r), \mathcal{L}^k) = \left(\frac{r}{r+k}\right)^{g} \sum_{\substack{S \sqcup T = \{1, \ldots, r+k\} \\ |S| = r}} \ \prod_{\substack{s \in S \\ t \in T}} \left| 2 \sin \pi \frac{s - t}{r + k} \right|^{g-1}$$

This formula was first found by Verlinde in the context of conformal field theory. Due to this relationship, the spaces $H^0(SU_C(r), \mathcal{L}^k)$ are also called conformal blocks. These spaces can also be defined for principal bundles. Rigorous proofs for the general case of the Verlinde formula are due to Beauville–Laszlo and Faltings. For a survey, see Beauville (1995).

See also: Characteristic Classes; Cohomology Theories; Index Theorems; Mirror Symmetry: a Geometric Survey; Moduli Spaces: An Introduction; Polygonal Billiards; Several Complex Variables: Basic Geometric Theory; Several Complex Variables: Compact Manifolds; Topological Gravity, Two-Dimensional.

Further Reading

Arbarello E, Cornalba M, Griffiths Ph, and Harris J (1985) Geometry of Algebraic Curves, vol. I. New York: Springer.
Beauville A (1995) Vector bundles on curves and generalized theta functions: recent results and open problems. In: Boutet de Montel A and Morchenko V (eds.) Current Topics in Complex Algebraic Geometry, (Berkeley, CA, 1992/93), Math. Sci. Res. Inst. Publ., vol. 28, pp. 17–33. Cambridge: Cambridge University Press.
Beauville A (2003) La conjecture de Green générique [d’après C. Voisin]. Exposé 924 du Séminaire Bourbaki.
Behrend K and Fantechi B (1997) The intrinsic normal cone. Inventiones Mathematicae 128: 45–88.
Faber C and Looijenga E (1999) Remarks on moduli of curves. In: Faber C and Looijenga E (eds.) Moduli of Curves and Abelian Varieties, Aspects Math. E33, pp. 23–45. Braunschweig: Vieweg.
Farkas H and Kra I (1992) Riemann Surfaces, 2nd edn. New York: Springer.
Forster O (1991) Lectures on Riemann Surfaces (translated from the 1977 German Original by Bruce Gilligan), Reprint of the 1981 English Translation. New York: Springer.
Griffiths Ph and Harris J (1994) Principles of Algebraic Geometry, Reprint of the 1978 Edition, Wiley Classics Library. New York: Wiley.
Hartshorne R (1977) Algebraic Geometry. Heidelberg: Springer.
Jost J (1997) Compact Riemann Surfaces. An Introduction to Contemporary Mathematics (translated from the German Manuscript by Simha RR). Berlin: Springer.
Kirwan F (1992) Complex Algebraic Curves. Cambridge: Cambridge University Press.
Lazarsfeld R (1989) A sampling of vector bundle techniques in the study of linear series. In: Cornalba M, Gomez-Mont X, and Verjovsky A (eds.) Lectures on Riemann Surfaces, pp. 500–559. Teaneck, NJ: World Scientific.
Miranda R (1995) Algebraic Curves and Riemann Surfaces. Providence: American Mathematical Society.
Mumford D (1995) Algebraic Geometry. I. Complex Projective Varieties, Reprint of the 1976 Edition. Berlin: Springer.
Vakil R (2003) The moduli space of curves and its tautological ring. Notices of the American Mathematical Society 50(6): 647–658.
Weyl H (1997) Die Idee der Riemannschen Fläche (Reprint of the 1913 German Original, With Essays by Reinhold Remmert, Michael Schneider, Stefan Hildebrandt, Klaus Hulek and Samuel Patterson. Edited and with a Preface and a Biography of Weyl by Remmert). Stuttgart: Teubner.

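Returning to the Verlinde formula of Theorem 14: the sum over decompositions of {1, …, r + k} into S and T with |S| = r is finite, so it can be evaluated directly for small g, r and k. The sketch below is an illustration under the assumption that the formula is read with the factor |2 sin(π(s − t)/(r + k))| inside the product; the function name is hypothetical, and floating-point arithmetic is used, so the result should be compared with an integer only up to rounding.

```python
from itertools import combinations
from math import pi, sin


def verlinde_dimension(g, r, k):
    """Evaluate (r/(r+k))^g * sum_{S,T} prod_{s,t} |2 sin(pi (s-t)/(r+k))|^(g-1).

    S ranges over the r-element subsets of {1, ..., r+k}; T is the complement.
    """
    n = r + k
    indices = range(1, n + 1)
    total = 0.0
    for S in combinations(indices, r):
        T = [t for t in indices if t not in S]
        product = 1.0
        for s in S:
            for t in T:
                # s != t and |s - t| < n, so this factor is never zero.
                product *= abs(2.0 * sin(pi * (s - t) / n))
        total += product ** (g - 1)
    return (r / n) ** g * total
```

For rank r = 2 and genus g = 2 this evaluates (up to rounding) to 4 at level k = 1 and to 10 at level k = 2; for genus g = 1 and rank 2 it gives k + 1.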
matrix-valued function $m(\lambda)$ with the following properties:

$$m(\lambda) \text{ is analytic in } \mathbb{C} \setminus \Gamma \qquad [1a]$$

$$m_+(\lambda) = m_-(\lambda)\, v(\lambda) \text{ for } \lambda \in \Gamma, \text{ where } m_+(\lambda)\ (m_-(\lambda)) \text{ is the limit of } m \text{ from the } + (-) \text{ side of } \Gamma \qquad [1b]$$

$$m(\lambda) \to I \text{ (identity matrix) as } \lambda \to \infty \qquad [1c]$$

The precise sense in which the limit at $\infty$ and the boundary values $m_\pm$ are attained is a technical matter that should be specified for each given RH problem $(\Gamma, v)$.

Concerning the name RH problem we note that in literature (particularly, in the theory of boundary values of analytic functions), the problem of reconstructing a function from its jump across a curve is often called the Hilbert boundary-value problem. The closely related problem of analytic matrix factorization (given $\Gamma$ and $v$, find $G(\lambda)$ analytic and nondegenerate in $\mathbb{C} \setminus \Gamma$ such that $G_+ G_- = v$ on $\Gamma$) is sometimes called the Riemann problem. The name ‘‘RH problem’’ is also attributed to the reconstruction of a Fuchsian system with given poles and a given monodromy group.

In applications, the jump matrix $v$ also depends on certain parameters, in which the original problem at hand is naturally formulated (e.g., $v = v(\lambda; x, t)$ in applications to the integrable nonlinear differential equations in dimension $1 + 1$, with $x$ being the space variable and $t$ the time variable), and the main concern is the behavior of the solution of the RH problem, $m(\lambda; x, t)$, as a function of $x$ and $t$. Particular interest is in the behavior of $m(\lambda; x, t)$ as $x$ and $t$ become large.

In the scalar case, $N = 1$, rewriting the original multiplicative jump condition in the additive form

$$\log m_+(\lambda) = \log m_-(\lambda) + \log v(\lambda)$$

and using the Cauchy–Plemelj–Sokhotskii formula give an explicit integral representation for the solution

$$m(\lambda) = \exp\left\{ \frac{1}{2\pi i} \int_\Gamma \frac{\log v(\zeta)}{\zeta - \lambda}\, d\zeta \right\} \qquad [2]$$

(in the case of nonzero index, $\log v|_{\partial\Gamma} \ne 0$, formula

The main benefit of reducing an originally nonlinear problem to the analytic factorization of a given matrix function arises in asymptotic analysis. Typically, the dependence of the jump matrix on the external parameters (say, $x$ and $t$) is oscillatory. In analogy with the asymptotic evaluation of oscillatory contour integrals via the classical method of steepest descent, in the asymptotic evaluation of the solution $m(\lambda; x, t)$ of the matrix RH problem as $x, t \to \infty$, the nonlinear steepest-descent method examines the analytic structure of the jump matrix $v(\lambda; x, t)$ in order to deform the contour $\Gamma$ to contours where the oscillatory factors become exponentially small as $x, t \to \infty$, and hence the original RH problem reduces to a collection of local RH problems associated with the relevant points of stationary phase. Although the method has (in the matrix case) noncommutative and nonlinear elements, the final result of the analysis is as efficient as the asymptotic evaluation of the oscillatory integrals.

Dressing Method

The RH method allows describing the solution of a differential system independently of the theory of differential equations. The solution might be explicit, that is, given in terms of elementary or elliptic or abelian functions and contour integrals of such functions. In the general (transcendental) case, the solution can be represented in terms of the solution of certain linear singular integral equations.

In the modern theory of integrable systems, a system of nonlinear differential equations is often called integrable if it can be represented as a compatibility condition of an auxiliary overdetermined linear system of differential equations called a Lax pair of the given nonlinear system (actually it might involve more than two linear equations). In order that the compatibility condition represents a nontrivial nonlinear system of equations, the Lax pair is required to depend rationally on an auxiliary parameter (called a spectral parameter). The RH problem formulated in the complex plane of the spectral parameter allows, given a particular solution of the compatibility equations, to construct directly new solutions of the compatibility system by ‘‘dressing’’ the initial one.

For example, let $D(x, \lambda)$, $x \in \mathbb{R}^n$, $\lambda \in \mathbb{C}$ be an $N \times N$ diagonal, polynomial in $\lambda$ with smooth coefficients,
[2] admits a suitable modification). function such that aj := @D=@xj are polynomials in
A generic (nonabelian) matrix RH problem of degree dj . Then 0 := exp D(x, ) solves the
cannot be solved explicitly in terms of contour system of linear equations @0 =@xj = aj 0 , whose
integrals; however, it can always be reduced to a compatibility conditions @ 2 0 =@xj @xk = @ 2 0 =@xk @xj
system of linear singular-integral equations, thus are trivially satisfied. Given a contour and a smooth
linearizing an originally nonlinear system. function v, consider the matrix RH problem [1]
Riemann–Hilbert Methods in Integrable Systems 431
with the jump matrix ~ v(; x) := expD(x, )v() The relation between the RH problem and the
exp D(x, ). Let m(; x) be the solution of this RH differential equations [5] is local in x and t; it is based
problem. Then (Dj m)þ = (Dj m) ~ v, where Dj f := only on the unique solvability of the RH problem,
@f =@xj þ [aj , f ] with [a, b] := ab ba. The Liouville the Liouville theorem, and the explicit dependence of
theorem implies that (Dj m)m1 is an entire function the jump matrix in x and t. The uniqueness of the
which is o(dj ) as ! 1. Setting (x, ) := m(; x) solution of an RH problem is basically provided by
exp D(x, ) gives the system of linear equations the Liouville theorem: the ratio m(1) (m(2) )1 of any
X two solutions is analytic in Cn and continuous
@
¼ aj þ k qjk ðxÞ Rj ðx; Þ ½3 across and is therefore identically equal to I by the
@xj k<d normalization condition [1c].
j
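The scalar formula [2] lends itself to a direct numerical check. The sketch below is only an illustration under stated assumptions: the contour is taken to be the real line, and the jump function $v(\xi) = (\xi^2 + 4)/(\xi^2 + 1)$ is a hypothetical choice (positive, tending to 1, zero index) for which the factorization $v = m_+ / m_-$ is known in closed form. The function names are not from the text.

```python
import cmath
import math


def scalar_rh_solution(v, lam, n=2000):
    """Evaluate formula [2] on the real line for a point lam off the contour.

    The line is mapped to (-pi/2, pi/2) by xi = tan(theta) and the integral
    is approximated by the midpoint rule, so xi = +-infinity is never sampled.
    """
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        xi = math.tan(theta)
        jacobian = 1.0 / math.cos(theta) ** 2  # d(xi) = sec^2(theta) d(theta)
        total += cmath.log(v(xi)) / (xi - lam) * jacobian * h
    return cmath.exp(total / (2j * math.pi))


def v(xi):
    # Illustrative jump function (an assumption, not taken from the text).
    return (xi * xi + 4.0) / (xi * xi + 1.0)


def m_plus(lam):
    # Closed-form factor: analytic and nonvanishing in the upper half-plane.
    return (lam + 2j) / (lam + 1j)


def m_minus(lam):
    # Closed-form factor: analytic and nonvanishing in the lower half-plane.
    return (lam - 1j) / (lam - 2j)


upper = scalar_rh_solution(v, 1j)    # point on the + side of the real line
lower = scalar_rh_solution(v, -1j)   # point on the - side of the real line
print(abs(upper - m_plus(1j)), abs(lower - m_minus(-1j)))
```

For this particular $v$ one has $m_+(\lambda)/m_-(\lambda) = v(\lambda)$ on the real line, so agreement of the quadrature with the two closed-form factors confirms that [2] produces the sectionally analytic solution with the jump [1b].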
the contour consisting of the points (, "), where jump matrix [11] and evaluating its solution m(x, ) as
jj 2 and " = 1 marks the surface sheet. ! 1 [9]:
r 7! v 7! RHP 7! mðx; Þ
Inverse-Scattering Transform ¼ mðx; ; rÞ 7! m1 ðxÞ 7! qðxÞ
The inverse-scattering transform method for solving ¼ iðm1 ðxÞÞ12
initial-value problems for integrable nonlinear equa-
and thus
tions written as the compatibility conditions [6] for
linear equations [5] consists in the following: starting 2
qðx; tÞ ¼ R1 eixð Þitð Þ rð Þ ½12
from the given initial data, solve the direct problem,
that is, determine appropriate eigenfunctions (solu- The mathematical rigor to this scheme is provided
tions of the differential x-equation in the Lax pair [5]) by the general theory of analytic matrix factoriza-
having well-controlled analytic properties as functions tion making use of the relation between the
of the auxiliary (spectral) parameter and the factorization problem and certain singular integral
associated spectral functions of ; then, by virtue of equations; this relation can be established with the
the t-equations in the Lax pair [5], the associated help of the Cauchy operators
functions evolve in a simple, explicit way. Finally, Z
using the explicit evolution of the spectral functions, hðÞ d
ChðÞ ¼ ; 2 Cn
solve the inverse problem of finding the associated 2i
coefficients in the x-equation, which, by [5], evolve and
according to the given nonlinear equation and thus
solve the Cauchy problem for this equation. The last C hðÞ ¼ lim
0
ðChÞð0 Þ
!
0 2ðÞside of
step in this procedure, the inverse-scattering problem,
can be effectively solved by reformulating it as an RH For a very general class of contours, the Cauchy
problem, which in turn can be related to a system of operators C : Lp ! Lp , 1 < p < 1, are bounded,
singular integral equations. The classical Gelfand– Cþ C = I, and Cþ þ C = H, where
Levitan–Marchenko integral equation of the inverse- Z
scattering problem is the Fourier transform of some hðÞ d
HhðÞ :¼ lim
special cases of these singular integral equations. "!0
jj>"
i
To fix ideas, consider the initial-value problem for
the NLS equation [7], where the data q(x, t = 0) = is the Hilbert transform.
q0 (x) have sufficient smooth and decay as jxj ! 1. The map R is often considered as a nonlinear
For each 2 CnR, one constructs solutions (x, ) Fourier-type map; this point of view is supported by
of x = U with U given by [10], having the the fact that R is a bijection between the corre-
properties sponding Schwartz spaces of functions. Making use
of the Lp or Hölder theory of the Cauchy operators
ix3 and the related factorization problems, it is possible
mðx; Þ :¼ ðx; Þ exp ! I as x ! 1
2 to analyze the action of R and R1 in various
functional spaces. This also requires making more
and m(x, ) is bounded as x ! 1. For each fixed x,
precise the definition of the RH problem: for fixed
the 2 2 matrix function m(x, ) solves the RH
1 < p < 1, given and v such that v, v1 2
problem in , where = R and the jump matrix is
! L1 ( ! GL(N, C)), we say that m solves an RH
1 jrðÞj2 rðÞ eix Lp -problem if m 2 I þ @C(Lp ) and mþ () =
v ¼ vð; xÞ ¼ ½11 m ()v() for 2 . Here a pair of Lp ()-functions
rðÞ eix 1
f 2 @C(Lp ) if there exists a unique function
Here r() is the reflection coefficient of q0 (x). h 2 Lp () such that f () = (C h)(). Then f () =
The direct scattering map R is described by Ch(), 2 Cn, is called the extension of f off .
mapping q 7! r, Given a factorization of v = (v )1 vþ = (I w )1
(I þ wþ ) on with v , (v )1 2 Lp , the basic
q 7! mðx; Þ ¼ mðx; ; qÞ 7! vð; xÞ 7! r ¼ RðqÞ associated singular integral operator is defined by
By virtue of the t-equations in [5], if q(t) = q(x, t)
Cw h :¼ Cþ ðhw Þ þ C ðhwþ Þ
solves the NLS equation, then r(t) = R(q( , t)) evolves
2
as r(t) = r(t, ) = eit r0 (), where r0 = R(q0 ). Given If the operator I Cw is invertible on Lp (), with
r, the inverse-scattering map R1 is obtained by 2 I þ Lp (), solving (I Cw )m = I, then m() =
solving the normalized RH problem (RHP) with the I þ (C((wþ þ w )))() is the unique solution of the
RH problem (, v). Although the operator Cw need An RH problem may be viewed as a special case
not be compact, in many cases it is Fredholm with in a more general setting of problems of recon-
zero index. Then the existence of (I Cw )1 is structing an analytic function from the known
equivalent to the solvability of the RH problem structure of its singularities. The departure from
(, v), and the normalized RH problem (m ! I as analyticity of a function m of the complex variable
! 1) has a unique solution if and only if the can be described in terms of the ‘‘d-bar’’
corresponding homogeneous RH problem (with If @m=@ can be linearly related
derivative, @m=@ .
m ! 0 as ! 1) has only the trivial solution to m itself, then the use of the extension of
(vanishing lemma). Cauchy’s formula
The most complete theory for RH problem relative Z Z
1 1 @m 1 mðÞ
to simple contours is the theory when v is in an mðÞ ¼ d ^ d þ d
inverse, closed, decomposing Banach algebra A, that 2i D @ 2i @D
is, the algebra of continuous functions with the leads to a linear integral equation for m. This is the
Hilbert transform bounded in it such that if f 2 A, case for some multidimensional (2 þ 1) nonlinear
then f 1 2 A. For contours with self-intersections, the integrable equations. For example, for the Kadomtsev–
RH factorization theory is formulated in terms of a Petviashvili-I equation (the two-dimensional general-
pair of decomposing algebras: choosing the orienta- ization of the Korteweg–de Vries equation) (qt þ
tion of the contour in such a way that it divides the 6qqx þ qxxx )x = 3qyy , the appropriate eigenfunctions
-plane into two disjoint regions, þ and , and are still sectionally meromorphic, but their jumps
each arc of forms part of the positively oriented across a contour are connected nonlocally to m on
boundary of þ , the functions in the þ () algebra the contour, which leads to nonlocal RH problem of
are continuous up to the boundary in each connected the type
component of þ ( ). Z
The choice of functional spaces in the RH problem mþ ðÞ ¼ m ðÞ þ dm ðÞf ð; Þ; 2
should be based on the integrable system at hand. For
example, an integrable flow connected to the scatter- with given f (, ) (analogue of scattering data).
ing problem for x = U, with U defined by [10], Contrarily, the eigenfunctions for the Kadomtsev–
p p
has in general the form eit 3 v()eit 3 (Ablowitz– Petviashvili-II equation (qt þ 6qqx þ qxxx )x = 3qyy
Kaup–Newell–Segur (AKNS) hierarchy) in the scat- are nowhere analytic, with @m=@ related to m by
tering space (for the NLS equation, p = 2), so that
appropriate spaces are L2 ((1 þ x2 ) dx) \ H p1 for @m
ðÞ ¼ FðRe ; Im ÞmðÞ; 2C
q( , t) and L2 ((1 þ jj2p2 )jdj) \ H1 as the scatter- @
ing space. Deift and Zhou showed that in this case
the scattering map R and the inverse-scattering map
R1 indeed involve no ‘‘loss’’ of smoothness or decay. Nonlinear Steepest-Descent Method
A generalization of the inverse-scattering trans- The nonlinear steepest-descent method is based on a
form method to the initial boundary-value problems direct asymptotic analysis of the relevant RH
for integrable nonlinear equations (on the half-line problem; it is general and algorithmic in the sense
or on a finite interval with respect to the space that it does not require a priori information (anzatz)
variable x) can be also developed on the basis of the about the form of the solution of the asymptotic
RH problem formalism. It this case, the construction problem. However, the noncommutativity of the
of the corresponding RH problem involves simulta- matrix setting requires developing rather sophisti-
neous spectral analysis of the both linear equations cated technical ideas, which, in particular, enable an
in the Lax pair [5]. The boundary values generate an explicit solution of the associated local RH problems.
additional set of spectral functions, which generally To fix ideas, let us again consider the NLS
makes the construction of the associated RH equation. The dependence of the jump matrix
problem more complicated than in the case of the v(; x, t) on x and t is oscillatory; it is the same as
corresponding initial-value problem (particularly, in the integral
the contour is to be enhanced by adding the part Z
coming from the spectral analysis of the t-equation); 1 2
qðx; tÞ ¼ pffiffiffiffiffiffi eiðxt Þ q
^0 ðÞd ½13
however, this RH problem again depends explicitly 2 R
on x and t, which makes it possible to develop which solves the initial-value problem for the
relevant techniques (such as the nonlinear steepest- linearized version of [7]:
descent method for the asymptotic analysis) in the
same spirit as in the case of initial-value problems. iqt þ qxx ¼ 0; qðx; 0Þ ¼ q0 ðxÞ ½14
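The integral [13] for the linearized problem [14] can be evaluated numerically and compared with a closed-form solution. The following sketch assumes the Gaussian initial datum $q_0(x) = e^{-x^2}$ (a hypothetical choice, not from the text), for which the unitary Fourier transform is $\hat q_0(\lambda) = e^{-\lambda^2/4}/\sqrt{2}$ and the exact solution is $q(x,t) = (1+4it)^{-1/2} e^{-x^2/(1+4it)}$. The truncation length and step count are illustrative and are adequate only for moderate $t$, since the integrand oscillates faster as $t$ grows.

```python
import cmath
import math


def q_hat0(lam):
    # Unitary Fourier transform of the assumed initial datum q0(x) = exp(-x^2).
    return math.exp(-lam * lam / 4.0) / math.sqrt(2.0)


def q_linear(x, t, lam_max=20.0, n=4000):
    """Trapezoidal approximation of formula [13] on [-lam_max, lam_max]."""
    h = 2.0 * lam_max / n
    total = 0j
    for k in range(n + 1):
        lam = -lam_max + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(1j * (lam * x - t * lam * lam)) * q_hat0(lam)
    return total * h / math.sqrt(2.0 * math.pi)


def q_exact(x, t):
    # Closed-form solution of i q_t + q_xx = 0 with q(x, 0) = exp(-x^2).
    s = 1.0 + 4j * t
    return cmath.exp(-x * x / s) / cmath.sqrt(s)


err = abs(q_linear(1.5, 2.0) - q_exact(1.5, 2.0))
print(err)
```

The oscillatory factor $e^{i(\lambda x - t\lambda^2)}$ in this integral is the same one that enters the jump matrix of the nonlinear problem, which is why the linear stationary-phase picture is a useful guide.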
can be continued to the sector below $\mathbb{R}_+ + \lambda_0$, where the corresponding oscillatory exponential factors are exponentially decreasing. Doing the same for the appropriate factors on $\mathbb{R}_- + \lambda_0$, we obtain an RH problem on a cross, say, $(\lambda_0 + e^{i\pi/4}\mathbb{R}) \cup (\lambda_0 + e^{-i\pi/4}\mathbb{R})$. As $t \to \infty$, the RH problem then localizes at $\lambda_0$.
Performing an appropriate scaling, a straightforward computation shows that, as $t \to \infty$, the problem reduces to an RH problem with the jump matrix that does not depend on $\lambda$ (it is determined by $r(\lambda_0)$), which makes it possible to solve this problem explicitly (in terms of the parabolic cylinder functions, in the case of the NLS equation). Using explicit asymptotics for these functions and controlling the error terms, it is possible to obtain the uniform (for all $x \in \mathbb{R}$) asymptotics for the solution of the initial-value problem for the NLS equation with $q_0 \in L^2((1 + x^2)\,dx) \cap H^1$ of the form
$$q(x, t) = t^{-1/2}\, \alpha(\lambda_0) \exp\left( \frac{i x^2}{4t} - i \nu(\lambda_0) \log 2t \right) + O\left(t^{-(1/2 + \kappa)}\right)$$
for any fixed $0 < \kappa < 1/4$, where $\alpha$ and $\nu$ are given in terms of $r = \mathcal{R}(q_0)$:
$$\nu(\lambda) = -\frac{1}{2\pi} \log(1 - |r(\lambda)|^2)$$
$$|\alpha(\lambda)|^2 = \frac{\nu(\lambda)}{2}$$
and
$$\arg \alpha(\lambda) = \frac{1}{\pi} \int_{-\infty}^{\lambda} \log(\lambda - \xi)\, d\left(\log(1 - |r(\xi)|^2)\right) + \frac{\pi}{4} + \arg \Gamma(i\nu(\lambda)) + \arg r(\lambda)$$
The method can be used to obtain asymptotic expansions to all orders. Also, for nonlinear equations supporting solitons, the soliton part of the asymptotics can be incorporated via the dressing method.
Further applications include long-time asymptotics for near-integrable systems, such as the perturbed NLS equation $i q_t + q_{xx} - 2|q|^2 q - \varepsilon |q|^l q = 0$ for $l > 2$ and $\varepsilon > 0$, and the small-dispersion limits of integrable equations (e.g., for the Korteweg–de Vries equation $q_t - 6 q q_x + \varepsilon^2 q_{xxx} = 0$ with small dispersion $\varepsilon \searrow 0$).
The RH formalism makes possible a comprehensive global asymptotic analysis of the Painlevé transcendents (which, due to their increasing role in modern mathematical physics, should be considered as new nonlinear special functions), including explicit connection formulas, as $x$ approaches relevant critical points along different directions in the complex plane.
The development of the RH method in the theory of integrable systems has given rise to new analytic and algebraic ideas in other branches of mathematics and theoretical physics. Recent examples are the study of asymptotics in the theory of orthogonal polynomials and random matrices and in combinatorics (random permutations).
See also: Boundary-Value Problems for Integrable Equations; $\bar\partial$ Approach to Integrable Systems; Integrable Systems and Algebraic Geometry; Integrable Systems and the Inverse Scattering Method; Integrable Systems: Overview; Nonlinear Schrödinger Equations; Painlevé Equations; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory]; Riemann–Hilbert Problem.
Further Reading
Ablowitz MJ and Clarkson PA (1991) Solitons, Nonlinear Evolution Equations and Inverse Scattering, London Math. Soc., Lecture Notes Series, vol. 149. Cambridge: Cambridge University Press.
Beals R, Deift PA, and Tomei C (1988) Direct and Inverse Scattering on the Line. Mathematical Surveys and Monographs 28. Providence, RI: American Mathematical Society.
Belokolos ED, Bobenko AI, Enol’skii VZ, and Its AR (1994) Algebro-Geometric Approach to Nonlinear Integrable Equations. Springer Series in Nonlinear Dynamics. Berlin: Springer.
Deift PA (1999) Orthogonal Polynomials and Random Matrices: A Riemann–Hilbert Approach. Courant Lecture Notes in Mathematics, vol. 3. New York: CIMS.
Deift PA and Zhou X (2003) Long-time asymptotics for solutions of the NLS equation with initial data in a weighted Sobolev space. Communications on Pure and Applied Mathematics 56(8): 1029–1077.
Deift PA, Its AR, and Zhou X (1993) Long-time asymptotics for integrable nonlinear wave equations. In: Fokas AS and Zakharov VE (eds.) Important Developments in Soliton Theory, pp. 181–204. Berlin: Springer.
Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in the Theory of Solitons. Berlin: Springer.
Fokas AS (2000) On the integrability of linear and nonlinear partial differential equations. Journal of Mathematical Physics 41: 4188–4237.
Its AR (2003) The Riemann–Hilbert problem and integrable systems. Notices of the AMS 50(11): 1389–1400.
Novikov SP, Manakov SV, Pitaevskii LP, and Zakharov VE (1984) Theory of Solitons. The Inverse Scattering Method. New York: Consultants Bureau.
Zhou X (1989) The Riemann–Hilbert problem and inverse scattering. SIAM Journal on Mathematical Analysis 20: 966–986.
Riemann–Hilbert Problem
V P Kostov, Université de Nice Sophia Antipolis, Nice, France
© 2006 Elsevier Ltd. All rights reserved.
Regular and Fuchsian Linear Systems on the Riemann Sphere
Consider a system of ordinary linear differential equations with time belonging to the Riemann sphere $\mathbb{CP}^1 = \mathbb{C} \cup \infty$:
$$dX/dt = A(t) X   [1]$$
The $n \times n$ matrix $A$ is meromorphic on $\mathbb{CP}^1$, with poles at $a_1, \ldots, a_{p+1}$; the dependent variables $X$ form an $n \times n$ matrix. One can assume that $\infty$ is not among the poles $a_j$ and it is not a pole of the 1-form $A(t)\,dt$ (this can be achieved by a fractionally-linear transformation of $t$).
P Deligne has introduced a terminology of meromorphic connections and sections which is often preferred in modern literature to the one of meromorphic linear systems and their solutions, and there is a one-to-one correspondence between the two languages.
Definition 1 System [1] is regular at the pole $a_j$ if its solutions have a moderate (or polynomial) growth rate there, that is, for every sector $S$ centered at $a_j$ and not containing other poles of the system and for every solution $X$ restricted to $S$ there exists $N_j \in \mathbb{R}$ such that $\|X(t - a_j)\| = O(|t - a_j|^{N_j})$ for all $t \in S$. System [1] is regular if it is regular at all poles $a_j$. System [1] is Fuchsian if its poles are logarithmic (i.e., of first order). Every Fuchsian system is regular.
Remark 2 The opening of the sector $S$ might be $> 2\pi$. Restricting to a sector is necessary because the solutions are, in general, ramified at the poles $a_j$ and by turning around the poles much faster than approaching them one can obtain any growth rate.
A Fuchsian system can be presented in the form
$$dX/dt = \left( \sum_{j=1}^{p+1} \frac{A_j}{t - a_j} \right) X, \quad A_j \in gl(n, \mathbb{C})   [2]$$
The sum of its matrices-residua $A_j$ is 0, that is,
$$A_1 + \cdots + A_{p+1} = 0   [3]$$
(recall that $\infty$ is not a pole of the system).
Remark 3 The linear equation (with meromorphic coefficients) $\sum_{j=0}^{n} a_j(t)\, x^{(j)} = 0$ is Fuchsian if $a_j$ has poles of order only $\leq n - j$. A linear equation is Fuchsian if and only if it is regular. The best-studied Fuchsian equations are the hypergeometric one and its generalizations and the Jordan–Pochhammer equation.
The linear change of the dependent variables
$$X \mapsto W(t) X   [4]$$
(where $W$ is meromorphic on $\mathbb{CP}^1$) makes system [2] undergo the gauge transformation
$$A \to -W^{-1} (dW/dt) + W^{-1} A W   [5]$$
(Most often one requires $W$ to be holomorphic and holomorphically invertible for $t \neq a_j$, $j = 1, \ldots, p + 1$, so that no new singular points appear in the system.) This transformation preserves regularity but not necessarily being Fuchsian. The only invariant under the group of linear transformations [4] is the monodromy group of the system.
Definition 4 Set $\Sigma = \mathbb{CP}^1 \setminus \{a_1, \ldots, a_{p+1}\}$. Fix a base point $a_0 \in \Sigma$ and a matrix $B \in GL(n, \mathbb{C})$. Consider a closed contour $\gamma$ with base point $a_0$ and bypassing the poles of the system. The monodromy operator of system [1] defined by this contour is the linear operator $M$ acting on the solution space of the system which maps the solution $X$ with $X|_{t = a_0} = B$ into the value of its analytic continuation along $\gamma$. Notation: $X \mapsto XM$. The monodromy operator depends only on the class of homotopy equivalence of $\gamma$.
The monodromy group is the subgroup of $GL(n, \mathbb{C})$ generated by all monodromy operators. It is defined only up to conjugacy due to the freedom to choose $a_0$ and $B$.
Definition 5 Define the product (concatenation) $\gamma_1 \gamma_2$ of two paths $\gamma_1$, $\gamma_2$ in $\Sigma$ (where the end of $\gamma_1$ coincides with the beginning of $\gamma_2$) as the path obtained by running $\gamma_1$ first and $\gamma_2$ next.
Remark 6 The monodromy group is an antirepresentation of the fundamental group $\pi_1(\Sigma) \to GL(n, \mathbb{C})$ because one has
$$X \overset{\gamma_1}{\mapsto} X M_1 \overset{\gamma_2}{\mapsto} X M_2 M_1   [6]$$
that is, the concatenation $\gamma_1 \gamma_2$ of the two contours defines the monodromy operator $M_2 M_1$. In the text, the monodromy group is referred to as a representation, not an antirepresentation.
One usually chooses a standard set of generators of $\pi_1(\Sigma)$ (see Figure 1) defined by contours $\gamma_j$, $j = 1, \ldots, p + 1$, where $\gamma_j$ consists of a segment
the monodromy operators of system [1] is diagonalizable, then system [1] is equivalent to a Fuchsian one; this is due to Yu S Il’yashenko. (In particular, if one allows just one additional apparent singularity, then the Riemann–Hilbert problem is positively solvable. The author has shown that the result still holds if one of the monodromy operators has one Jordan block of size 2 and $n - 2$ Jordan blocks of size 1. The result is sharp: it would be false if one allows one Jordan block of size 3 or two blocks of size 2.) It also follows that any finitely generated subgroup of $GL(n, \mathbb{C})$ is the monodromy group of a regular system with prescribed poles which is Fuchsian at all the poles with the possible exception of one (where the system is regular) which can be chosen among them at random.
After the publication of Plemelj’s result, the interest shifted basically towards the question of how to construct a Fuchsian system given the monodromy operators $M_j$. At the end of the 1920s IA Lappo-Danilevskii expressed the solutions to a Fuchsian system as series of the monodromy operators. These series are convergent for monodromy operators close to the identity matrix and for such operators one can express the residua $A_j$ of the Fuchsian system as convergent series of the monodromy operators.
In 1956 BL Krylov proved that the Riemann–Hilbert problem is solvable for $n = p = 2$ by constructing a Fuchsian system after its monodromy group. In 1983 NP Erugin did the same in the case $n = 2$, $p = 3$, and established a connection between the Riemann–Hilbert problem and Painlevé’s equations.
In 1957 H Röhrl reformulated the problem in terms of fibre bundles. His approach is more geometric; however, it does not require the system realizing a given monodromy group to be Fuchsian, but only regular.
In 1978 W Dekkers considered the particular case $n = 2$ of the Riemann–Hilbert problem, and gave a positive answer to it. The gap in Plemelj’s proof was detected in the 1980s by AT Kohn and YuS Il’yashenko.
It was proved by AA Bolibrukh in 1989 that, for $n \geq 3$, the problem has a negative answer. For $n = 3$, the answer is negative precisely for those couples (monodromy group, set of poles) for which each monodromy operator $M_1, \ldots, M_{p+1}$ is conjugate to a Jordan block of size 3, the monodromy group is reducible, with an invariant subspace or factor-space of dimension 2, the monodromy sub- or factor-representation corresponding to it is irreducible and cannot be realized by a Fuchsian system having all its matrices-residua conjugate to Jordan blocks of size 2. In Bolibrukh’s work, the last condition is formulated in a different (but equivalent) way using the notion of Fuchsian weight.
The New Setting of the Problem
After the negative answer to the Riemann–Hilbert problem for $n \geq 3$, it is reasonable to reformulate it as follows:
Find necessary and/or sufficient conditions for the choice of the monodromy operators $M_1, \ldots, M_p$ and the points $a_1, \ldots, a_{p+1}$ so that there should exist a Fuchsian system with poles at and only at the given points and whose monodromy operators $M_j$ should be the given ones.
In the new setting of the Riemann–Hilbert problem, the answer is positive if the monodromy group is irreducible (for any positions of the poles $a_j$). This was first proved by Bolibrukh for $n = 3$ and then independently by the author and by him for any $n$.
Bolibrukh found many examples of couples (reducible monodromy group, poles) for which the answer to the Riemann–Hilbert problem is negative. For $n = 3$, the negative answer is due to a possible ‘‘bad position’’ of the poles, and a small shift from this position while keeping the same monodromy group leads to a couple for which the answer is positive. For $n \geq 4$, there are couples where the negative answer is due to arithmetic properties of the eigenvalues of the matrices-residua and the corresponding monodromy groups are not realizable by Fuchsian systems for any position of the poles. During the last years of his life, Bolibrukh studied upper-triangular monodromy representations and found other examples with negative answer to the Riemann–Hilbert problem.
Bolibrukh also found some sufficient conditions for the positive resolvability of the Riemann–Hilbert problem in the case of a reducible monodromy group. For example, suppose that the monodromy group is a semidirect sum:
$$M_j = \begin{pmatrix} M_j^1 & * \\ 0 & M_j^2 \end{pmatrix}$$
where the matrices $M_j^i$ (of size $l_i \times l_i$, $i = 1, 2$) define the representations $\chi_i$. Suppose that the representation $\chi_2$ is realizable by a Fuchsian system, that the representation $\chi_1$ is irreducible, and that one of the matrices $M_j$ is block-diagonal, with left upper block of size $s \times s$, where $s \leq l_1$. Then for any choice of the poles $a_j$ the monodromy group can be realized by some Fuchsian system.
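The monodromy operators $M_j$ discussed above can be approximated numerically by continuing a fundamental matrix around a pole. The sketch below is a minimal illustration, not a method from the text: it assumes a $2 \times 2$ Fuchsian system with finite poles at $t = 0$ and $t = 1$ and hypothetical residues `A1`, `A2` (so that, contrary to the normalization used in the article, $\infty$ is also a pole), and integrates along the circle $|t| = 1/2$ with a classical Runge–Kutta scheme starting from $X(a_0) = I$ at $a_0 = 1/2$.

```python
import cmath
import math

A1 = [[0.3, 1.0], [0.0, -0.1]]  # residue at t = 0 (illustrative values)
A2 = [[0.2, 0.0], [0.5, 0.4]]   # residue at t = 1 (illustrative values)


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_axpy(a, b, scale):
    # Returns a + scale * b for 2x2 matrices.
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]


def rhs(theta, x, radius=0.5):
    # Along t = radius * exp(i*theta): dX/dtheta = (dt/dtheta) * A(t) * X.
    t = radius * cmath.exp(1j * theta)
    dt = 1j * t
    a = [[dt * (A1[i][j] / t + A2[i][j] / (t - 1.0)) for j in range(2)]
         for i in range(2)]
    return mat_mul(a, x)


def monodromy(steps=4000):
    """RK4 continuation of X(a0) = I once around t = 0; the result is M."""
    x = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    h = 2.0 * math.pi / steps
    for n in range(steps):
        th = n * h
        k1 = rhs(th, x)
        k2 = rhs(th + h / 2, mat_axpy(x, k1, h / 2))
        k3 = rhs(th + h / 2, mat_axpy(x, k2, h / 2))
        k4 = rhs(th + h, mat_axpy(x, k3, h))
        for i in range(2):
            for j in range(2):
                x[i][j] += h / 6 * (k1[i][j] + 2 * k2[i][j]
                                    + 2 * k3[i][j] + k4[i][j])
    return x


M = monodromy()
trace_M = M[0][0] + M[1][1]
det_M = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(trace_M, det_M)
```

Because the eigenvalues of `A1` do not differ by a nonzero integer, the computed $M$ should be conjugate to $\exp(2\pi i A_1)$, so its trace and determinant can be compared with $e^{0.6\pi i} + e^{-0.2\pi i}$ and $e^{0.4\pi i}$ respectively.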
Bolibrukh also gave an estimate of the number $m$ of additional apparent singularities in a Fuchsian equation which are sufficient to realize a given irreducible monodromy group. It follows from his result that
$$m \leq \frac{n(n - 1)(p - 1)}{2} + 1 - n$$
One can ask what the codimension is of the subset in the space (monodromy group, poles) which provides the negative answer to the Riemann–Hilbert problem in its initial setting. The (author’s) answer for $p \geq 3$ is $2p(n - 1)$, and for $n \geq 7$ this codimension is attained only at couples (monodromy group, poles) for which every monodromy operator $M_j$ is conjugate to a Jordan block of size $n$, the group has an invariant subspace or factor-space of dimension $n - 1$, the corresponding sub- or factor-representation is irreducible and cannot be realized by a Fuchsian system in which all matrices-residua are conjugate to Jordan blocks of size $n - 1$. For $n \leq 6$ there are examples where the same codimension is attained (but cannot be decreased) on other couples as well.
Levelt’s Result and Bolibrukh’s Method
In 1961, AHM Levelt described the form of the solution to a regular system at its pole. His result is at the core of Bolibrukh’s method for solving the Riemann–Hilbert problem.
Theorem 9 In the neighborhood of a pole, the solution to a regular linear system is representable in the form
$$X = U_j(t - a_j)\,(t - a_j)^{D_j}\,(t - a_j)^{E_j}\, G_j   [8]$$
where the matrix $U_j$ is holomorphic in a neighborhood of 0, $D_j = \mathrm{diag}(\varphi_{1,j}, \ldots, \varphi_{n,j})$, $\varphi_{k,j} \in \mathbb{Z}$, $\det G_j \neq 0$. The matrix $E_j$ is in upper-triangular form and the real parts of its eigenvalues belong to $[0, 1)$ (by definition, $(t - a_j)^{E_j} = e^{E_j \ln(t - a_j)}$). The numbers $\varphi_{k,j}$ satisfy the condition [10] formulated below. They are valuations in the eigenspaces of the monodromy operator $M_j$ (i.e., in the maximal subspaces invariant for $M_j$ on which it acts as an operator with a single eigenvalue).
A regular system is Fuchsian at $a_j$ if and only if
$$\det U_j(0) \neq 0   [9]$$
The condition on $\varphi_{k,j}$ can be formulated as follows: let $E_j$ have one and the same eigenvalue in the rows with indices $s_1 < s_2 < \cdots < s_q$. Then one has
$$\varphi_{s_1,j} \geq \varphi_{s_2,j} \geq \cdots \geq \varphi_{s_q,j}   [10]$$
Remark 10 Denote by $\beta_{k,j}$ the diagonal entries (i.e., the eigenvalues) of the matrix $E_j$. Then the sums $\beta_{k,j} + \varphi_{k,j}$ are the eigenvalues of the matrix-residuum $A_j$ at $a_j$.
In proving that the Riemann–Hilbert problem is positively solved in the case of an irreducible monodromy group, Bolibrukh (or the author) uses the correct part of Plemelj’s proof, namely, that the given monodromy group can be realized by a regular system which is Fuchsian at all poles but one. After this, a suitable change [4] is sought which makes the system Fuchsian at the last pole. The criterion for being Fuchsian is provided by the above theorem; one checks how the matrices $D_j$, that is, the exponents $\varphi_{k,j}$, and the matrices $U_j$ change as a result of the transformation [4]. This is easier (one has only to multiply on the left by $W(t)$) than to see how the matrix $A(t)$ of system [1] changes, because one has conjugation in rule [5]. This idea is also due to Bolibrukh.
When Bolibrukh obtains the negative answer to the Riemann–Hilbert problem in some case of reducible monodromy group, he often uses the following two propositions:
Proposition 11 The sum $\sum (\beta_{k,j} + \varphi_{k,j})$ relative to a subspace of the solution space invariant for all monodromy operators is a non-positive integer.
In particular, the sum of all exponents $\beta_{k,j} + \varphi_{k,j}$ is a non-positive integer which is 0 if and only if the system is Fuchsian.
Proposition 12 If some component of some column of some matrix solution to a regular system is identically equal to 0, then the monodromy group of the system is reducible.
A reducible monodromy group can be conjugated to a block upper-triangular form, with the diagonal blocks defining irreducible representations. Thus, the Riemann–Hilbert problem for reducible monodromy groups makes it necessary to answer the question ‘‘given the set of poles $a_j$, for which sets of exponents $\varphi_{k,j}$ can a given irreducible monodromy group be realized by such a Fuchsian system?’’ For $n \geq 2$, an irreducible monodromy group can be a priori realized by infinitely many Fuchsian systems, with different sets of exponents $\varphi_{k,j}$. Consider the case when these exponents are fixed for $j \neq 1$; suppose that $a_1 = 0$. The author has shown that then infinitely many of the a priori possible choices of the exponents $\varphi_{k,1}$ cannot be realized by Fuchsian systems if and only if the given monodromy group is realized by a Fuchsian system which is obtained from another one via the change of time $t \mapsto t^k / (b_k t^k + b_{k-1} t^{k-1} + \cdots + b_0)$, $b_i \in \mathbb{C}$, $b_0 \neq 0$, $k \in \mathbb{N}^*$, $k > 1$. This change increases the number of poles.
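As a small worked illustration of the estimate on the number $m$ of additional apparent singularities quoted above, the following sketch evaluates the bound, read here as $m \leq n(n-1)(p-1)/2 + 1 - n$, for a few values of $n$ and $p$. The function name is hypothetical, and the sketch only does the arithmetic; it says nothing about whether the bound is attained.

```python
def max_apparent_singularities(n: int, p: int) -> int:
    """Upper bound n(n-1)(p-1)/2 + 1 - n for a system of size n with p + 1 poles."""
    # n * (n - 1) is always even, so the integer division is exact.
    return n * (n - 1) * (p - 1) // 2 + 1 - n


for n, p in [(2, 2), (2, 3), (3, 3)]:
    print(n, p, max_apparent_singularities(n, p))
```

For $n = 2$ and three poles ($p = 2$) the bound is 0, which is consistent with the classical fact that an irreducible two-dimensional monodromy group with three singular points is realized by a hypergeometric-type Fuchsian equation without apparent singularities.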
Kohn A and Treibich (1983) Un résultat de Plemelj, Mathematics and Physics (Paris 1979/1982). In: Progr. Math., vol. 37, pp. 307–312. Boston: Birkhäuser.
Kostov VP (1992) Fuchsian linear systems on $\mathbb{CP}^1$ and the Riemann–Hilbert problem. Comptes Rendus de l’Académie des Sciences à Paris, 143–148.
Kostov VP (1999) The Deligne–Simpson problem. C.R. Acad. Sci. Paris, t. 329 Série I, 657–662.
Kostov VP (2004) The Deligne–Simpson problem – a survey. Journal of Algebra 281: 83–108.
Levelt AHM (1961) Hypergeometric functions. Indagationes Mathematicae 23: 361–401.
Maisonobe Ph and Narváez-Macarro L (eds.) (2004) Eléments de la théorie des systèmes différentiels géométriques. Cours du C.I.M.P.A. Ecole d’été de Séville. Séminaires et Congrès 8, xx + 430 pages.
Völklein H (1998) Rigid generators of classical groups. Mathematische Annalen 311(3): 421–438.
Wasow WR (1976) Asymptotic Expansions for Ordinary Differential Equations. New York: Huntington.
subgroup of SO(n) or O(n). Then M carries some extra geometric structures compatible with g. Broadly, the smaller Hol(g) is as a subgroup of O(n), the more special g is, and the more extra geometric structures there are. Therefore, understanding and classifying the possible holonomy groups gives a family of interesting special Riemannian geometries, such as Kähler geometry. All of these special geometries have cropped up in physics.

Define the holonomy algebra hol(g) to be the Lie algebra of Hol(g), regarded as a Lie subalgebra of o(n), defined up to the adjoint action of O(n). Define hol_x(g) to be the Lie algebra of Hol_x(g), as a Lie subalgebra of $\mathfrak{o}(T_xM) \cong \Lambda^2 T_x^*M$. The holonomy algebra hol(g) is intimately connected with the Riemann curvature tensor $R_{abcd} = g_{ae} R^e{}_{bcd}$ of g.

Theorem 3 The Riemann curvature tensor $R_{abcd}$ lies in $S^2\,\mathfrak{hol}_x(g)$ at x, where hol_x(g) is regarded as a subspace of $\Lambda^2 T_x^*M$. It also satisfies the first and second Bianchi identities

$R_{abcd} + R_{adbc} + R_{acdb} = 0$   [3]

$\nabla_e R_{abcd} + \nabla_c R_{abde} + \nabla_d R_{abec} = 0$   [4]

is that many possible holonomy groups are the holonomy group of a Riemannian symmetric space, but are not realized by any nonsymmetric metric. Therefore, by restricting attention to nonsymmetric metrics, one considerably reduces the number of possible Riemannian holonomy groups.

A tensor S on M is constant if $\nabla S = 0$. An important property of Hol(g) is that it determines the constant tensors on M.

Theorem 5 Let (M, g) be a Riemannian manifold, with Levi-Civita connection $\nabla$. Fix $x \in M$, so that Hol_x(g) acts on $T_xM$, and so on the tensor powers $\bigotimes^k T_xM \otimes \bigotimes^l T_x^*M$. Suppose $S \in C^\infty(\bigotimes^k TM \otimes \bigotimes^l T^*M)$ is a constant tensor. Then $S|_x$ is fixed by the action of Hol_x(g). Conversely, if $S|_x \in \bigotimes^k T_xM \otimes \bigotimes^l T_x^*M$ is fixed by Hol_x(g), it extends to a unique constant tensor $S \in C^\infty(\bigotimes^k TM \otimes \bigotimes^l T^*M)$.

The main idea in the proof is that if S is a constant tensor and $\gamma : [0, 1] \to M$ is a path from x to y, then $P_\gamma(S|_x) = S|_y$, that is, "constant tensors are invariant under parallel transport." In particular, they are invariant under parallel transport around closed loops based at x, and so under elements of Hol_x(g).
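As a small illustration of the first Bianchi identity [3], the following sketch checks it for one simple algebraic curvature tensor. The tensor used, $R_{abcd} = \delta_{ac}\delta_{bd} - \delta_{ad}\delta_{bc}$ (the constant-curvature model with g the identity matrix), and the function names are assumptions chosen for the example; they do not come from the article.

```python
from itertools import product


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def constant_curvature_tensor(n):
    """Model tensor R_abcd = d_ac d_bd - d_ad d_bc in dimension n (illustrative choice)."""
    return {
        (a, b, c, d): delta(a, c) * delta(b, d) - delta(a, d) * delta(b, c)
        for a, b, c, d in product(range(n), repeat=4)
    }


def first_bianchi_defect(R, n):
    """Largest value of |R_abcd + R_adbc + R_acdb| over all index choices."""
    return max(
        abs(R[a, b, c, d] + R[a, d, b, c] + R[a, c, d, b])
        for a, b, c, d in product(range(n), repeat=4)
    )


if __name__ == "__main__":
    R = constant_curvature_tensor(4)
    print("Bianchi defect:", first_bianchi_defect(R, 4))
```

A defect of zero means the cyclic sum in [3] vanishes for every choice of indices. The same helper can be applied to any tensor stored in this dictionary form.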
SO(n) acting irreducibly on $\mathbb{R}^n$, with Lie algebra $\mathfrak{h}$. The classification of all such H follows from the classification of Lie groups (and is of considerable complexity). Berger's method was to take the list of all such groups H, and to apply two tests to each possibility to find out if it could be a holonomy group. The only groups H which passed both tests are those in the theorem.

Berger's tests are algebraic and involve the curvature tensor. Suppose that $R_{abcd}$ is the Riemann curvature of a metric g with Hol(g) = H. Then Theorem 3 gives $R_{abcd} \in S^2\mathfrak{h}$, and the first Bianchi identity [3] applies. But if $\mathfrak{h}$ has large codimension in o(n), then the vector space $\mathfrak{R}^H$ of elements of $S^2\mathfrak{h}$ satisfying [3] will be small, or even zero. However, the "Ambrose–Singer holonomy theorem" shows that $\mathfrak{R}^H$ must be big enough to generate $\mathfrak{h}$. For many of the candidate groups H, this does not hold, and so H cannot be a holonomy group. This is the first test.

Now $\nabla_e R_{abcd}$ lies in $(\mathbb{R}^n)^* \otimes \mathfrak{R}^H$, and also satisfies the second Bianchi identity, eqn [4]. Frequently, these imply that $\nabla R = 0$, so that g is locally symmetric. Therefore, we may exclude such H, and this is Berger's second test.

Berger's proof does not show that the groups on his list actually occur as Riemannian holonomy groups, only that no others do. It is now known, though this took another thirty years to find out, that all possibilities in Theorem 6 do occur.

The Groups on Berger's List

Here are some brief remarks about each group on Berger's list.

(i) SO(n) is the holonomy group of generic Riemannian metrics.

(ii) Riemannian metrics g with Hol(g) $\subseteq$ U(m) are called "Kähler metrics." Kähler metrics are a natural class of metrics on complex manifolds, and generic Kähler metrics on a given complex manifold have holonomy U(m).

Metrics g with Hol(g) = SU(m) are called Calabi–Yau metrics. Since SU(m) is a subgroup of U(m), all Calabi–Yau metrics are Kähler. If g is Kähler and M is simply connected, then Hol(g) $\subseteq$ SU(m) if and only if g is Ricci-flat. Thus, Calabi–Yau metrics are locally more or less the same as Ricci-flat Kähler metrics.

If (M, J) is a compact complex manifold with trivial canonical bundle admitting Kähler metrics, then Yau's solution of the Calabi conjecture gives a unique Ricci-flat Kähler metric in each Kähler class. This gives a way to construct many examples of Calabi–Yau manifolds, and explains why these have been named after them.

(iii) Metrics g with Hol(g) = Sp(m) are called "hyper-Kähler." As Sp(m) $\subseteq$ SU(2m) $\subset$ U(2m), hyper-Kähler metrics are Ricci-flat and Kähler.

Metrics g with holonomy group Sp(m)Sp(1) for $m \ge 2$ are called "quaternionic Kähler." (Note that quaternionic Kähler metrics are not in fact Kähler.) They are Einstein, but not Ricci-flat.

(iv), (v) G2 and Spin(7) are the exceptional cases, so they are called the "exceptional holonomy groups." Metrics with these holonomy groups are Ricci-flat.

The groups can be understood in terms of the four division algebras: the real numbers $\mathbb{R}$, the complex numbers $\mathbb{C}$, the quaternions $\mathbb{H}$, and the octonions or Cayley numbers $\mathbb{O}$.

SO(n) is a group of automorphisms of $\mathbb{R}^n$.
U(m) and SU(m) are groups of automorphisms of $\mathbb{C}^m$.
Sp(m) and Sp(m)Sp(1) are automorphism groups of $\mathbb{H}^m$.
G2 is the automorphism group of $\mathrm{Im}\,\mathbb{O} \cong \mathbb{R}^7$.
Spin(7) is a group of automorphisms of $\mathbb{O} \cong \mathbb{R}^8$, preserving part of the structure on $\mathbb{O}$.

The Exceptional Holonomy Groups

For some time after Berger's classification, the exceptional holonomy groups remained a mystery. In 1987, Bryant used the theory of exterior differential systems to show that locally there exist many metrics with these holonomy groups, and gave some explicit, incomplete examples. Then in 1989, Bryant and Salamon found explicit, complete metrics with holonomy G2 and Spin(7) on noncompact manifolds. In 1994–95, the author constructed the first examples of metrics with holonomy G2 and Spin(7) on compact manifolds. For more information on exceptional holonomy, see Joyce (2000, 2002).

The Holonomy Group G2

Let $(x_1, \ldots, x_7)$ be coordinates on $\mathbb{R}^7$. Write $dx_{ij\ldots l}$ for the exterior form $dx_i \wedge dx_j \wedge \cdots \wedge dx_l$ on $\mathbb{R}^7$. Define a metric $g_0$, a 3-form $\varphi_0$, and a 4-form $*\varphi_0$ on $\mathbb{R}^7$ by

$g_0 = dx_1^2 + \cdots + dx_7^2$
$\varphi_0 = dx_{123} + dx_{145} + dx_{167} + dx_{246} - dx_{257} - dx_{347} - dx_{356}$   [5]
$*\varphi_0 = dx_{4567} + dx_{2367} + dx_{2345} + dx_{1357} - dx_{1346} - dx_{1256} - dx_{1247}$

The subgroup of GL(7, $\mathbb{R}$) preserving $\varphi_0$ is the exceptional Lie group G2. It also preserves $g_0$, $*\varphi_0$,
444 Riemannian Holonomy Groups and Exceptional Holonomy
$\Omega_0 = dx_{1234} + dx_{1256} + dx_{1278} + dx_{1357} - dx_{1368} - dx_{1458} - dx_{1467} - dx_{2358} - dx_{2367} - dx_{2457} + dx_{2468} + dx_{3456} + dx_{3478} + dx_{5678}$   [6]

Constructing Compact G2- and Spin(7)-Manifolds

The author's method of constructing compact 7-manifolds with holonomy G2 is based on the
Kummer construction for Calabi–Yau metrics on the K3 surface and may be divided into four steps.

Step 1. Let $T^7$ be the 7-torus and $(\varphi_0, g_0)$ a flat G2-structure on $T^7$. Choose a finite group $\Gamma$ of isometries of $T^7$ preserving $(\varphi_0, g_0)$. Then the quotient $T^7/\Gamma$ is a singular, compact 7-manifold, an orbifold.

Step 2. For certain special groups $\Gamma$, there is a method to resolve the singularities of $T^7/\Gamma$ in a natural way, using complex geometry. We get a nonsingular, compact 7-manifold M, together with a map $\pi : M \to T^7/\Gamma$, the resolving map.

Step 3. On M, we explicitly write down a one-parameter family of G2-structures $(\varphi_t, g_t)$ depending on $t \in (0, \epsilon)$. They are not torsion free, but have small torsion when t is small. As $t \to 0$, the G2-structure $(\varphi_t, g_t)$ converges to the singular G2-structure $\pi^*(\varphi_0, g_0)$.

Step 4. We prove using analysis that for sufficiently small t, the G2-structure $(\varphi_t, g_t)$ on M, with small torsion, can be deformed to a G2-structure $(\tilde\varphi_t, \tilde g_t)$, with zero torsion. Finally, it is shown that $\tilde g_t$ is a metric with holonomy G2 on the compact 7-manifold M.

We explain the first two steps in greater detail. For Step 1, an example of a suitable group $\Gamma$ is given here.

Example 11 Let $(x_1, \ldots, x_7)$ be coordinates on $T^7 = \mathbb{R}^7/\mathbb{Z}^7$, where $x_i \in \mathbb{R}/\mathbb{Z}$. Let $(\varphi_0, g_0)$ be the flat G2-structure on $T^7$ defined by [5]. Let $\alpha$, $\beta$, and $\gamma$ be the involutions of $T^7$ defined by

elements of $\Gamma$. We now describe the singularities in the example.

Lemma 12 In Example 11, $\beta\gamma$, $\gamma\alpha$, $\alpha\beta$, and $\alpha\beta\gamma$ have no fixed points on $T^7$. The fixed points of $\alpha$, $\beta$, $\gamma$ are each 16 copies of $T^3$. The singular set S of $T^7/\Gamma$ is a disjoint union of 12 copies of $T^3$, 4 copies from each of $\alpha$, $\beta$, $\gamma$. Each component of S is a singularity modeled on that of $T^3 \times \mathbb{C}^2/\{\pm 1\}$.

The most important consideration in choosing $\Gamma$ is that we should be able to resolve the singularities of $T^7/\Gamma$ within holonomy G2, in Step 2. We have no idea how to resolve general orbifold singularities of G2-manifolds. However, after fifty years of hard work we understand well how to resolve orbifold singularities of Calabi–Yau manifolds, with holonomy SU(m). This is done by a combination of algebraic geometry, which produces the underlying complex manifold by a crepant resolution, and Calabi–Yau analysis, which produces the Ricci-flat Kähler metric on this complex manifold.

Now the holonomy groups SU(2) and SU(3) are subgroups of G2, as in [7]. Our tactic in Step 2 is to ensure that all of the singular set S of $T^7/\Gamma$ can locally be resolved with holonomy SU(2) or SU(3), and then use Calabi–Yau geometry to do this. In particular, suppose each connected component of S is isomorphic to either

1. $T^3 \times \mathbb{C}^2/G$, for G a finite subgroup of SU(2); or
2. $S^1 \times \mathbb{C}^3/G$, for G a finite subgroup of SU(3) acting freely on $\mathbb{C}^3 \setminus \{0\}$.
Figure 1 Betti numbers $(b^2, b^3)$ of compact G2-manifolds, plotted as $b^2(M)$ (vertical axis) against $b^3(M)$ (horizontal axis). (From Joyce (2000) and Kovalev (2003).)
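Returning to the flat G2-structure of [5], the relation between the 3-form and the 4-form can be checked mechanically. The sketch below assumes the metric $g_0$ and the orientation $dx_1 \wedge \cdots \wedge dx_7$ on $\mathbb{R}^7$, stores each form as a dictionary from increasing index tuples to coefficients, and verifies that the Hodge star of the 3-form is the stated 4-form. The data layout and function names are illustrative assumptions.

```python
def perm_sign(seq):
    """Sign of the permutation taking sorted(seq) to seq, by counting inversions."""
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return -1 if inversions % 2 else 1


def hodge_star(form, n=7):
    """Hodge star for the Euclidean metric on R^n with orientation dx_1 ^ ... ^ dx_n.

    `form` maps increasing index tuples I to coefficients; the dual basis form
    is sign(I, J) dx_J, where J is the increasing complement of I.
    """
    result = {}
    for idx, coeff in form.items():
        comp = tuple(i for i in range(1, n + 1) if i not in idx)
        result[comp] = coeff * perm_sign(idx + comp)
    return result


# The 3-form and 4-form of [5], keyed by their index strings.
PHI0 = {
    (1, 2, 3): 1, (1, 4, 5): 1, (1, 6, 7): 1, (2, 4, 6): 1,
    (2, 5, 7): -1, (3, 4, 7): -1, (3, 5, 6): -1,
}
STAR_PHI0 = {
    (4, 5, 6, 7): 1, (2, 3, 6, 7): 1, (2, 3, 4, 5): 1, (1, 3, 5, 7): 1,
    (1, 3, 4, 6): -1, (1, 2, 5, 6): -1, (1, 2, 4, 7): -1,
}

if __name__ == "__main__":
    print("star(phi0) matches the 4-form of [5]:", hodge_star(PHI0) == STAR_PHI0)
```

Because each monomial of the 3-form pairs with exactly one monomial of the 4-form, the same dictionaries also make it easy to experiment with other sign conventions.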
for a = inf G. If the sequence has a convergent subsequence, this will produce a minimum.

However, when extrema do not exist, there is no clear way of obtaining critical points. In particular, this happens when the functional is not bounded from either above or below. Until recently, there was no organized procedure for producing critical points which are not extrema. We shall describe an approach which is very useful in such cases.

This assumption is used to carry out step (1). We define

$G(u) = \tfrac{1}{2}\|u\|_H^2 - \int_0^{2\pi} F(x, u(x))\,dx$   [8]

where F(x, t) is given by [2] and we take H to be the completion of $C^1(I)$ with respect to the norm

$\|u\|_H = (\|u'\|^2 + \|u\|^2)^{1/2}$   [9]

where $\|u\|^2 = (u, u)$. We have
448 Saddle Point Problems
and n an integer $\ge 0$. If G(u) is given by [8], then there is a $u_0 \in H$ such that

$G'(u_0) = 0$   [12]

In particular, $u_0$ is a solution of [4] and [5] in the usual sense.

In proving this theorem, we shall make use of

Theorem 3 Let M, N be closed subspaces of a Hilbert space E such that $M = N^\perp$. Assume that at least one of these subspaces is finite dimensional. Let G be a continuously differentiable functional on E satisfying

$m_0 = \sup_{v \in N} \inf_{w \in M} G(v + w) \neq -\infty$   [13]

and

$m_1 = \inf_{w \in M} \sup_{v \in N} G(v + w) \neq \infty$   [14]

Then there is a sequence $\{u_k\} \subset E$ such that

$G(u_k) \to c, \quad m_0 \le c \le m_1, \quad G'(u_k) \to 0$   [15]

$\alpha_k = (u, \bar\varphi_k), \quad k = 0, \pm 1, \pm 2, \ldots$   [17]

and

$\varphi_k(x) = \frac{1}{\sqrt{2\pi}}\, e^{ikx}, \quad k = 0, \pm 1, \pm 2, \ldots$   [18]

Note that M, N are closed subspaces of H and that $M = N^\perp$. Note also that N is finite dimensional. If we consider the functional [8], it is not difficult to show that [11] implies

$\inf_M G > -\infty; \quad \sup_N G < \infty$   [21]

We are now in a position to apply Theorem 3. This produces a saddle point satisfying [1]. ∎

Minimax

Theorem 3 is very useful when extrema do not exist, but it is not always applicable. One is then forced to search for other ways of obtaining critical points. Again, one is faced with the fact that there is no systematic method of finding them. A useful idea is to try to find sets that separate the functional. By this we mean the following:

Definition 1 Two sets A, B separate the functional G(u) if

$a_0 := \sup_A G \le b_0 := \inf_B G$   [22]

$G(u_k) \to a, \quad G'(u_k) \to 0$   [24]

with $a \ge b_0$. This leads to

Definition 2 We shall say that the set A links the set B if [22] implies [24] with $a \ge b_0$ for every $C^1$ functional G(u).
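The quantities $m_0$ and $m_1$ of Theorem 3 can be made concrete on a toy problem. The sketch below assumes $E = \mathbb{R}^2$ with N the v-axis and M the w-axis, and the functional $G(v + w) = w^2 - v^2$, which is bounded neither above nor below. The functional, the bounded search grid, and the helper names are assumptions made for illustration only; a grid search on a bounded box is not a proof.

```python
def G(v, w):
    """Toy saddle functional on R^2: convex along M (w-axis), concave along N (v-axis)."""
    return w * w - v * v


def grid(lo, hi, steps):
    """Evenly spaced sample points on [lo, hi], both endpoints included."""
    return [lo + (hi - lo) * i / steps for i in range(steps + 1)]


def sup_inf(func, vs, ws):
    """Grid approximation of m0 = sup over v in N of inf over w in M of func(v, w)."""
    return max(min(func(v, w) for w in ws) for v in vs)


def inf_sup(func, vs, ws):
    """Grid approximation of m1 = inf over w in M of sup over v in N of func(v, w)."""
    return min(max(func(v, w) for v in vs) for w in ws)


if __name__ == "__main__":
    vs = grid(-2.0, 2.0, 40)
    ws = grid(-2.0, 2.0, 40)
    m0 = sup_inf(G, vs, ws)
    m1 = inf_sup(G, vs, ws)
    print("m0 =", m0, "m1 =", m1)
```

For this functional both values coincide with G at the origin, which is the only critical point and is neither a maximum nor a minimum. In general one only has $m_0 \le m_1$, and [15] locates the limiting level c between them.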
Of course, [24] is a far cry from [23], but if, for example, the sequence [24] has a convergent subsequence, then [24] implies [23]. Whether or not [24] implies [23] is a property of the functional G(u). We state this as

The Details

Let E be a Banach space, and let $\Phi$ be the set of all continuous maps $\Gamma = \Gamma(t)$ from $E \times [0, 1]$ to E such that

1. $\Gamma(0) = I$, the identity map;
2. for each $t \in [0, 1)$, $\Gamma(t)$ is a homeomorphism of E onto E and $\Gamma^{-1}(t) \in C(E \times [0, 1), E)$;
3. $\Gamma(1)E$ is a single point in E and $\Gamma(t)A$ converges uniformly to $\Gamma(1)E$ as $t \to 1$ for each bounded set $A \subset E$; and
4. for each $t_0 \in [0, 1)$ and each bounded set $A \subset E$,

$\sup_{0 \le t \le t_0,\; u \in A} \{\|\Gamma(t)u\| + \|\Gamma^{-1}(t)u\|\} < \infty$   [25]

We have the following

Theorem 5 A sufficient condition for A to link B is
(i) $A \cap B = \emptyset$ and
(ii) for each $\Gamma \in \Phi$ there is a $t \in (0, 1]$ such that $\Gamma(t)A \cap B \neq \emptyset$.

Theorem 6 Let G be a $C^1$-functional on E, and let A, B be subsets of E such that A, B satisfy [22] and the hypotheses of Theorem 5. Assume that

$a := \inf_{\Gamma \in \Phi}\; \sup_{0 \le s \le 1,\; u \in A} G(\Gamma(s)u)$   [26]

is finite. Let $\psi(t)$ be a positive, locally Lipschitz continuous function on $[0, \infty)$ such that

$\int_0^\infty \psi(r)\,dr = \infty$   [27]

Some Examples

The following are examples of sets that link.

Example 1 Let M, N be closed subspaces such that $E = M \oplus N$ (with one finite dimensional). Let

$B_R = \{u \in E : \|u\| < R\}$

and take $A = \partial B_R \cap N$, $B = M$. Then A links B. To see this, we identify N with some $\mathbb{R}^n$ and take $\Omega = B_R \cap N$, $Q = \bar\Omega$. For $u \in E$, we write

$u = v + w, \quad v \in N, \; w \in M$   [31]

and take F to be the projection

$Fu = v$

Since $F|_Q = I$ and $M = F^{-1}(0)$, we see from Theorem 7 that A links B.

Example 2 We take M, N as in Example 1. Let $w_0 \neq 0$ be an element of M, and take

$A = \{v \in N : \|v\| \le R\} \cup \{sw_0 + v : v \in N,\; s \ge 0,\; \|sw_0 + v\| = R\}$
$B = \partial B_\delta \cap M, \quad 0 < \delta < R.$

Then A links B. Again we identify N with some $\mathbb{R}^n$, and we may assume $\|w_0\| = 1$. Let

$Q = \{sw_0 + v : v \in N,\; s \ge 0,\; \|sw_0 + v\| \le R\}$
0
and W0 is a function in Lq (). Here (ii) for each > 0 sufficiently small, there is an " > 0
Z 1=q such that
kukq :¼ juðxÞjq dx ½38
GðuÞ "; kukD ¼ ½48
We may assume that option (ii) holds, for otherwise
kukD :¼ kA1=2 uk ½39 we are done. By [46] we have
and q0 = q=(q 1). With the norm [39], D becomes Z
a Hilbert space. Define G and F by [8] and [2]. It GðR’0 Þ R2 ðk’0 k2D 0 k’0 k2 Þ þ W0 ðxÞ dx
follows that G is a continuously differentiable Z
functional on the whole of D. ¼ W0 ðxÞ dx
We assume further that
or uk ðxÞj ! 1;
juk ðxÞj ¼ k j~ x 2 0 ½55
Hence, Hence,
Z
GðxðkÞ Þ ¼ k½xðkÞ 0 k2
GðxÞ kx0 k2 2 dt Z
jxj<m
2 Vðt; xðkÞ ðtÞÞ dt ! c 0 ½68
2
2ð2Þ 0 ½64 I
GðyÞ ¼ s2 kw00 k2 2 Vðt; yðtÞÞ dt from which we conclude easily that x is a solution of
Z I
[57]. From [68], we see that
s 2 jyðtÞj2 dt þ 2C
2
I GðxÞ c 0
¼ s2 2ðkvk2 þ s2 Þ þ 2C
showing that x(t) is not a constant. For if c > 0 and
ð1 2Þs2 4jvj2 þ 2C x 2 N, then
! 1 as s2 þ jvj2 ! 1 Z
GðxÞ ¼ 2 Vðt; xðtÞÞ dt 0
We also note that Hypothesis 1 implies I
See also: Combinatorics: Overview; Homoclinic Phenomena; Ljusternik–Schnirelman Theory; Minimax Principle in the Calculus of Variations.
below. In order to maintain the focus on the essential points, we consider in the subsequent sections primarily a single massive particle of integer spin s, that is, a boson. In standard scattering theory based upon Wigner's characterization, this particle is simply identified with an irreducible unitary representation $U_1$ of the identity component $\mathcal{P}_+^\uparrow$ of the Poincaré group with spin s and mass m > 0. The Hilbert space $\mathcal{H}_1$ upon which $U_1(\mathcal{P}_+^\uparrow)$ acts is called the one-particle space and determines the possible states of a single particle, alone in the universe. Assuming that configurations of several such particles do not interact, one can proceed by a standard construction to a Fock space describing freely propagating multiple particle states,

$\mathcal{H}_F = \bigoplus_{n \in \mathbb{N}_0} \mathcal{H}_n$

where $\mathcal{H}_0 = \mathbb{C}$ and $\mathcal{H}_n$ is the n-fold symmetrized direct product of $\mathcal{H}_1$ with itself. This space is spanned by vectors $\Phi_1 \otimes_s \cdots \otimes_s \Phi_n$, where $\otimes_s$ denotes the symmetrized tensor product, representing an n-particle state wherein the kth particle is in the state $\Phi_k \in \mathcal{H}_1$, k = 1, ..., n. The representation $U_1(\mathcal{P}_+^\uparrow)$ induces a unitary representation $U_F(\mathcal{P}_+^\uparrow)$ on $\mathcal{H}_F$ by

$U_F(\lambda)(\Phi_1 \otimes_s \cdots \otimes_s \Phi_n) := U_1(\lambda)\Phi_1 \otimes_s \cdots \otimes_s U_1(\lambda)\Phi_n$   [1]

In interacting theories, the states in the corresponding physical Hilbert space $\mathcal{H}$ do not have such an a priori interpretation in physical terms, however. It is the primary goal of scattering theory to identify in $\mathcal{H}$ those vectors which describe, at asymptotic times, incoming, respectively, outgoing, configurations of freely moving particles. Mathematically, this amounts to the construction of certain specific isometries (generalized Møller operators), $\Omega^{in}$ and $\Omega^{out}$, mapping $\mathcal{H}_F$ onto subspaces $\mathcal{H}^{in} \subset \mathcal{H}$ and $\mathcal{H}^{out} \subset \mathcal{H}$, respectively, and intertwining the unitary actions of the Poincaré group on $\mathcal{H}_F$ and $\mathcal{H}$. The resulting vectors

$(\Phi_1 \otimes_s \cdots \otimes_s \Phi_n)^{in/out} := \Omega^{in/out}(\Phi_1 \otimes_s \cdots \otimes_s \Phi_n) \in \mathcal{H}$   [2]

are interpreted as incoming and outgoing particle configurations in scattering processes wherein the kth particle is in the state $\Phi_k \in \mathcal{H}_1$.

If, in a theory, the equality $\mathcal{H}^{in} = \mathcal{H}^{out}$ holds, then every incoming scattering state evolves, after the collision processes at finite times, into an outgoing scattering state. It is then physically meaningful to define on this space of states the scattering matrix, setting $S = \Omega^{in}\,\Omega^{out\,*}$. Physical data such as collision cross sections can be derived from S and the corresponding transition amplitudes $\langle (\Phi_1 \otimes_s \cdots \otimes_s \Phi_m)^{in}, (\Phi'_1 \otimes_s \cdots \otimes_s \Phi'_n)^{out} \rangle$, respectively, by a standard procedure. It should be noted, however, that neither the above physically mandatory equality of state spaces nor the more stringent requirement that every state has an interpretation in terms of incoming and outgoing scattering states, that is, $\mathcal{H} = \mathcal{H}^{in} = \mathcal{H}^{out}$ (asymptotic completeness), has been fully established in any interacting relativistic field theoretic model so far. This intriguing problem will be touched upon in the last section of this article.

Before going into details, let us state the few physically motivated postulates entering into the analysis. As discussed, the point of departure is a family of algebras $\mathcal{A}(\mathcal{O})$, more precisely a net, associated with the open subregions $\mathcal{O}$ of Minkowski space and acting on $\mathcal{H}$. Restricting attention to the case of bosons, we may assume that this net is local in the sense that if $\mathcal{O}_1$ is spacelike separated from $\mathcal{O}_2$, then all elements of $\mathcal{A}(\mathcal{O}_1)$ commute with all elements of $\mathcal{A}(\mathcal{O}_2)$. (In the presence of fermions, these algebras contain also fermionic operators which anticommute.) This is the mathematical expression of the principle of Einstein causality. The unitary representation U of $\mathcal{P}_+^\uparrow$ acting on $\mathcal{H}$ is assumed to satisfy the relativistic spectrum condition (positivity of energy in all Lorentz frames) and, in the sense of equality of sets, $U(\lambda)\mathcal{A}(\mathcal{O})U(\lambda)^{-1} = \mathcal{A}(\lambda\mathcal{O})$ for all $\lambda \in \mathcal{P}_+^\uparrow$ and regions $\mathcal{O}$, where $\lambda\mathcal{O}$ denotes the Poincaré transformed region. It is also assumed that the subspace of $U(\mathcal{P}_+^\uparrow)$-invariant vectors is spanned by a single unit vector $\Omega$, representing the vacuum, which has the Reeh–Schlieder property, that is, each set of vectors $\mathcal{A}(\mathcal{O})\Omega$ is dense in $\mathcal{H}$. These standing assumptions will subsequently be amended by further conditions concerning the particle content of the theory.

Haag–Ruelle Theory

Haag and Ruelle were the first to establish the existence of scattering states within this general framework (Jost 1965); further substantial improvements are due to Araki and Hepp (Araki 1999). In all of these investigations, the arguments were given for quantum field theories with associated particles (in the Wigner sense) which have strictly positive mass m > 0 and for which m is an isolated eigenvalue of the mass operator (upper and lower mass gap). Moreover, it was assumed that states of a single particle can be created from the vacuum by local operations. In physical terms, these assumptions allow only for theories with short-range interactions and particles carrying strictly localizable charges.

In view of these limitations, Haag–Ruelle theory has been developed in a number of different directions. By now, the scattering theory of massive particles is under complete control, including also
458 Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools
particles carrying nonlocalizable (gauge or topological) charges and particles having exotic statistics (anyons, plektons) which can appear in theories in low spacetime dimensions. Due to constraints of space, these results must go without further mention; we refer the interested reader to the articles Buchholz and Fredenhagen (1982) and Fredenhagen et al. (1996). Theories of massless particles and of particles carrying charges of electric or magnetic type (infraparticles) will be discussed in subsequent sections.

We outline here a recent generalization of Haag–Ruelle scattering theory presented in Dybalski (2005), which covers massive particles with localizable charges without relying on any further constraints on the mass spectrum. In particular, the scattering of electrically neutral, stable particles fulfilling a sharp dispersion law in the presence of massless particles is included (e.g., neutral atoms in their ground states). Mathematically, this assumption can be expressed by the requirement that there exists a subspace $\mathcal{H}_1 \subset \mathcal{H}$ such that the restriction of $U(\mathcal{P}_+^\uparrow)$ to $\mathcal{H}_1$ is a representation of mass m > 0. We denote by $P_1$ the projection in $\mathcal{H}$ onto $\mathcal{H}_1$.

To establish notation, let $\mathcal{O}$ be a bounded spacetime region and let $A \in \mathcal{A}(\mathcal{O})$ be any operator such that $P_1 A\Omega \neq 0$. The existence of such localized (in brief, local) operators amounts to the assumption that the particle carries a localizable charge. That the particle is stable, that is, completely decouples from the underlying continuum states, can be cast into a condition first stated by Herbst: for all sufficiently small $\mu > 0$

$\|E_\mu (1 - P_1) A\Omega\| \le c\,\mu^\eta$   [3]

for some constants $c, \eta > 0$, where $E_\mu$ is the projection onto the spectral subspace of the mass operator corresponding to spectrum in the interval $(m - \mu, m + \mu)$. In the case originally considered by Haag and Ruelle, where m is isolated from the rest of the mass spectrum, this condition is certainly satisfied.

Setting $A(x) := U(x) A U(x)^{-1}$, where U(x) is the unitary implementing the spacetime translation $x = (x_0, \mathbf{x})$ (the velocity of light and Planck's constant are set equal to 1 in what follows), one puts, for $t \neq 0$,

$A_t(f) := \int d^4x\; g_t(x_0)\, f_{x_0}(\mathbf{x})\, A(x)$   [4]

Here $x_0 \mapsto g_t(x_0) := g((x_0 - t)/|t|^\kappa)/|t|^\kappa$ induces a time averaging about t, g being any test function which integrates to 1 and whose Fourier transform has compact support, and $1/(1 + \eta) < \kappa < 1$ with $\eta$ as above. The Fourier transform of $f_{x_0}$ is given by $\tilde f_{x_0}(\mathbf{p}) := \tilde f(\mathbf{p})\, e^{-i x_0 \omega(\mathbf{p})}$, where f is some test function on $\mathbb{R}^3$ with $\tilde f(\mathbf{p})$ having compact support, and $\omega(\mathbf{p}) = (\mathbf{p}^2 + m^2)^{1/2}$. Note that $(x_0, \mathbf{x}) \mapsto f_{x_0}(\mathbf{x})$ is a solution of the Klein–Gordon equation of mass m.

With these assumptions, it follows by a straightforward application of the harmonic analysis of unitary groups that in the sense of strong convergence $A_t(f)\Omega \to P_1 A(f)\Omega$ and $A_t(f)^*\Omega \to 0$ as $t \to \pm\infty$, where $A(f) = \int d^3x\, f(\mathbf{x}) A(0, \mathbf{x})$. Hence, the operators $A_t(f)$ may be thought of as creation operators and their adjoints as annihilation operators. These operators are the basic ingredients in the construction of scattering states. Choosing local operators $A_k$ as above and test functions $f^{(k)}$ with disjoint compact supports in momentum space, k = 1, ..., n, the scattering states are obtained as limits of the Haag–Ruelle approximants

$A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega$   [5]

Roughly speaking, the operators $A_{kt}(f^{(k)})$ are localized in spacelike separated regions at asymptotic times t, due to the support properties of the Fourier transforms of the functions $f^{(k)}$. Hence they commute asymptotically because of locality and, by the clustering properties of the vacuum state, the above vector becomes a product state of single-particle states. In order to prove convergence, one proceeds, in analogy to Cook's method in quantum-mechanical scattering theory, to the time derivatives,

$\partial_t\, A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega = \sum_{k \neq l} A_{1t}(f^{(1)}) \cdots [\partial_t A_{kt}(f^{(k)}), A_{lt}(f^{(l)})] \cdots A_{nt}(f^{(n)})\,\Omega + \sum_k A_{1t}(f^{(1)}) \cdots \overset{k}{\vee} \cdots A_{nt}(f^{(n)})\, \partial_t A_{kt}(f^{(k)})\,\Omega$   [6]

where $\overset{k}{\vee}$ denotes omission of $A_{kt}(f^{(k)})$. Employing techniques of Araki and Hepp, one can prove that the terms in the first summation on the right-hand side (RHS) of [6], involving commutators, decay rapidly in norm as t approaches infinity because of locality, as indicated above. By applying condition [3] and the fact that the vectors $\partial_t A_{kt}(f^{(k)})\Omega$ do not have a component in the single-particle space $\mathcal{H}_1$, the terms in the second summation on the RHS of [6] can be shown to decay in norm like $|t|^{-\kappa(1+\eta)}$. Thus, the norm of the vector [6] is integrable in t, implying the existence of the strong limits

$(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} := \lim_{t \to \mp\infty} A_{1t}(f^{(1)}) \cdots A_{nt}(f^{(n)})\,\Omega$   [7]
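The role of the constraint $1/(1+\eta) < \kappa < 1$ in this Cook-type argument is purely arithmetic: it makes the decay exponent $\kappa(1+\eta)$ larger than 1, so that $|t|^{-\kappa(1+\eta)}$ is integrable at large $|t|$. The following sketch spells this out; the function names and the sample values of $\kappa$ and $\eta$ are assumptions for illustration.

```python
def decay_exponent(kappa, eta):
    """Return kappa * (1 + eta) after checking eta > 0 and 1/(1 + eta) < kappa < 1."""
    if eta <= 0 or not (1.0 / (1.0 + eta) < kappa < 1.0):
        raise ValueError("need eta > 0 and 1/(1 + eta) < kappa < 1")
    return kappa * (1.0 + eta)


def tail_integral(exponent, T):
    """Closed form of the integral of t**(-exponent) over [T, infinity), for exponent > 1 and T > 0."""
    if exponent <= 1 or T <= 0:
        raise ValueError("the tail integral converges only for exponent > 1 and T > 0")
    return T ** (1.0 - exponent) / (exponent - 1.0)


if __name__ == "__main__":
    exponent = decay_exponent(kappa=0.8, eta=0.5)
    print("decay exponent:", exponent)
    print("tail integral from T = 1:", tail_integral(exponent, 1.0))
```

Choosing $\kappa$ at or below $1/(1+\eta)$ would give an exponent of at most 1 and a divergent tail, which is why the time-averaging width $|t|^\kappa$ cannot be made too narrow.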
As indicated by the notation, these limits depend only on the single-particle vectors $P_1 A_k(f^{(k)})\Omega \in \mathcal{H}_1$, k = 1, ..., n, but not on the specific choice of operators and test functions. In order to establish their Fock structure, one employs results on clustering properties of vacuum correlation functions in theories without strictly positive minimal mass. Using this, one can compute inner products of arbitrary asymptotic states and verify that the maps

$P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega \mapsto (P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out}$   [8]

extend by linearity to isomorphisms $\Omega^{in/out}$ from the Fock space $\mathcal{H}_F$ onto the subspaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ generated by the collision states. Moreover, the asymptotic states transform under the Poincaré transformations $U(\mathcal{P}_+^\uparrow)$ as

$U(\lambda)(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} = (U_1(\lambda) P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s U_1(\lambda) P_1 A_n(f^{(n)})\Omega)^{in/out}$   [9]

Thus, the isomorphisms $\Omega^{in/out}$ intertwine the action of the Poincaré group on $\mathcal{H}_F$ and $\mathcal{H}^{in/out}$. We summarize these results, which are vital for the physical interpretation of the underlying theory, in the following theorem.

Theorem 1 Consider a theory of a particle of mass m > 0 which satisfies the standing assumptions and the stability condition [3]. Then there exist canonical isometries $\Omega^{in/out}$, mapping the Fock space $\mathcal{H}_F$ based on the single-particle space $\mathcal{H}_1$ onto subspaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ of incoming and outgoing scattering states. Moreover, these isometries intertwine the action of the Poincaré transformations on the respective spaces.

Since the scattering states have been identified with Fock space, asymptotic creation and annihilation operators act on $\mathcal{H}^{in/out}$ in a natural manner. This point will be explained in the following section.

LSZ Formalism

Prior to the results of Haag and Ruelle, an axiomatic approach to scattering theory was developed by Lehmann, Symanzik, and Zimmermann (LSZ), based on time-ordered vacuum expectation values of quantum fields. The relative advantage of their approach with respect to Haag–Ruelle theory is that useful reduction formulas for the S-matrix greatly facilitate computations, in particular in perturbation theory. Moreover, these formulas are the starting point of general studies of the momentum space analyticity properties of the S-matrix (dispersion relations), as outlined in Dispersion Relations (cf. also Iagolnitzer (1993)). Within the present general setting, the LSZ method was established by Hepp.

For simplicity of discussion, we consider again a single particle type of mass m > 0 and integer spin s, subject to condition [3]. According to the results of the preceding section, one then can consistently define asymptotic creation operators on the scattering states, setting

$A(f)^{in/out}\,(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} := \lim_{t \to \mp\infty} A_t(f)\,(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} = (P_1 A(f)\Omega \otimes_s P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out}$   [10]

Similarly, one obtains the corresponding asymptotic annihilation operators,

$A(f)^{in/out\,*}\,(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} = \lim_{t \to \mp\infty} A_t(f)^*\,(P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in/out} = 0$   [11]

where the latter equality holds if the Fourier transforms of the functions $f, f^{(1)}, \ldots, f^{(n)}$ have disjoint supports. We mention as an aside that, by replacing the time-averaging function g in the definition of $A_t(f)$ by a delta function, the above formulas still hold. But the convergence is then to be understood in the weak Hilbert space topology. In this form, the above relations were anticipated by LSZ (asymptotic condition).

It is straightforward to proceed from these relations to reduction formulas. Let B be any local operator. Then one has, in the sense of matrix elements between outgoing and incoming scattering states,

$B A(f)^{in} - A(f)^{out} B = \lim_{t \to \infty} (B A(f_{-t}) - A(f_t) B) = \lim_{t \to \infty} \left( \int d^4x\, f_{-t}(x)\, B A(x) - \int d^4x\, f_t(x)\, A(x) B \right)$   [12]

where $f_t(x) := g_t(x_0) f_{x_0}(\mathbf{x})$. Because of the (essential) support properties of the functions $f_{\pm t}$, the contributions to the latter integrals arise, for asymptotic t, from spacetime points x where the localization
regions of A(x) and B have a negative timelike (first term), respectively, positive timelike (second term) distance. One may therefore proceed from the products of these operators to the time-ordered products T(BA(x)), where T(BA(x)) = A(x)B if the localization region of A(x) lies in the future of that of B, and T(BA(x)) = BA(x) if it lies in the past. It is noteworthy that a precise definition of the time ordering for finite x is irrelevant in the present context; any reasonable interpolation between the above relations will do. Similarly, one can define time-ordered products for an arbitrary number of local operators. The preceding limit can then be recast into

$\lim_{t \to \infty} \int d^4x\, (f_{-t}(x) - f_t(x))\, T(BA(x))$   [13]

The latter expression has a particularly simple form in momentum space. Proceeding to the Fourier transforms of $f_{\pm t}$ and noticing that, in the limit of large t,

$(\tilde f_{-t}(p) - \tilde f_t(p)) / (p_0 - \omega(\mathbf{p})) \to -2\pi i\, \tilde f(\mathbf{p})\, \delta(p_0 - \omega(\mathbf{p}))$   [14]

one gets

$\langle (P_1 A_1(f^{(1)})\Omega \otimes_s \cdots \otimes_s P_1 A_k(f^{(k)})\Omega)^{out},\; (P_1 A_{k+1}(f^{(k+1)})\Omega \otimes_s \cdots \otimes_s P_1 A_n(f^{(n)})\Omega)^{in} \rangle$
$\quad = (-2\pi i)^n \int \cdots \int d^3p_1 \cdots d^3p_n\; \overline{\tilde f^{(1)}(\mathbf{p}_1)} \cdots \overline{\tilde f^{(k)}(\mathbf{p}_k)}\; \tilde f^{(k+1)}(\mathbf{p}_{k+1}) \cdots \tilde f^{(n)}(\mathbf{p}_n)$
$\quad \times \prod_{i=1}^n (p_{i0} - \omega(\mathbf{p}_i))\; \langle \Omega,\, T\, \tilde A_1^*(-p_1) \cdots \tilde A_k^*(-p_k)\, \tilde A_{k+1}(p_{k+1}) \cdots \tilde A_n(p_n)\, \Omega \rangle \Big|_{p_{j0} = \omega(\mathbf{p}_j),\; j = 1, \ldots, n}$   [17]

in an obvious notation.

Thus, the kernels of the scattering amplitudes in momentum space are obtained by restricting the (by the factor $\prod_{i=1}^n (p_{i0} - \omega(\mathbf{p}_i))$) amputated Fourier transforms of the vacuum expectation values of the time-ordered products to the positive and negative mass shells, respectively. These are the famous LSZ reduction formulas, which provide a convenient link between the time-ordered (Green's) functions of a theory and its asymptotic particle interpretation.
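The delta function in [14] comes from a standard distributional limit: a difference of two oscillating phases divided by the energy variable behaves like $\sin(ts)/s$, which tends to $\pi\,\delta(s)$ as $t \to \infty$. The sketch below checks this numerically against a Gaussian test function. The Gaussian, the truncation interval, and the function name are assumptions for illustration, and the time-averaging factor coming from g is ignored.

```python
import math


def smeared_kernel(t, phi, half_width=10.0, steps=20000):
    """Trapezoid approximation of the integral of phi(s) * sin(t*s)/s over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        s = -half_width + i * h
        # sin(t*s)/s has the removable singularity value t at s = 0.
        kernel = t if s == 0.0 else math.sin(t * s) / s
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * kernel * phi(s)
    return total * h


def gaussian(s):
    """Test function with gaussian(0) = 1."""
    return math.exp(-s * s)


if __name__ == "__main__":
    for t in (1.0, 5.0, 40.0):
        print(t, smeared_kernel(t, gaussian), "target:", math.pi * gaussian(0.0))
```

For the Gaussian the smeared value is $\pi\,\mathrm{erf}(t/2)$ in closed form, so the printed numbers approach $\pi\,\varphi(0) = \pi$ rapidly as t grows.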
however. A general method to that effect was outlined in Buchholz et al. (1991). As a matter of fact, for that method only the knowledge of states in the subspace of neutral states is required. Yet in this approach one would need for the computation of, say, elastic collision cross sections of charged particles the vacuum correlation functions involving at least eight local observables. This practical disadvantage of increased computational complexity of the method is offset by the conceptual advantage of making no appeal to quantities which are a priori nonobservable.

Massless Particles and Huygens' Principle

The preceding general methods of scattering theory apply only to massive particles. Yet taking advantage of the salient fact that massless particles always move with the speed of light, Buchholz succeeded in establishing a scattering theory also for such particles (Haag 1992). Moreover, his arguments lead to a quantum version of Huygens' principle.

As in the case of massive particles, one assumes that there is a subspace $\mathcal{H}_1 \subset \mathcal{H}$ corresponding to a representation of $U(\mathcal{P}_+^{\uparrow})$ of mass $m = 0$ and, for simplicity, integer helicity; moreover, there must exist local operators interpolating between the vacuum and the single-particle states. These assumptions cover, in particular, the important examples of the photon and of Goldstone particles. Picking any suitable local operator $A$ interpolating between $\Omega$ and some vector in $\mathcal{H}_1$, one sets, in analogy to [4],

$A_t \doteq \int d^4x \, g_t(x_0) \, \left(-\tfrac{1}{2\pi}\right) \varepsilon(x_0) \, \delta(x_0^2 - \mathbf{x}^2) \, \partial_0 A(x)$   [22]

Here $g_t(x_0) \doteq (1/|\ln t|) \, g((x_0 - t)/|\ln t|)$ with $g$ as in [4], and the solution of the Klein–Gordon equation in [4] has been replaced by the fundamental solution of the wave equation; furthermore, $\partial_0 A(x)$ denotes the derivative of $A(x)$ with respect to $x_0$. Then, once again, the strong limit of $A_t \Omega$ as $t \to \pm\infty$ is $P_1 A \Omega$, with $P_1$ the projection onto $\mathcal{H}_1$.

In order to establish the convergence of $A_t$ as in the LSZ approach, one now uses the fact that these operators are, at asymptotic times $t$, localized in the complement of some forward, respectively, backward, light cone. Because of locality, they therefore commute with all operators which are localized in the interior of the respective cones. More specifically, let $\mathcal{O} \subset \mathbb{R}^4$ be the localization region of $A$ and let $\mathcal{O}_\pm \subset \mathbb{R}^4$ be the two regions having a positive, respectively, negative, timelike distance from all points in $\mathcal{O}$. Then, for any operator $B$ which is compactly localized in $\mathcal{O}_\pm$, respectively, one obtains $\lim_{t \to \pm\infty} A_t B \Omega = \lim_{t \to \pm\infty} B A_t \Omega = B P_1 A \Omega$. This relation establishes the existence of the limits

$A^{in/out} = \lim_{t \to \mp\infty} A_t$   [23]

on the (by the Reeh–Schlieder property) dense sets of vectors $\{B\Omega : B \in \mathcal{A}(\mathcal{O}_\mp)\} \subset \mathcal{H}$. It requires some more detailed analysis to prove that the limits have all of the properties of a (smeared) free massless field, whose translates $x \mapsto A^{in/out}(x)$ satisfy the wave equation and have c-number commutation relations. From these free fields, one can then proceed to asymptotic creation and annihilation operators and construct asymptotic Fock spaces $\mathcal{H}^{in/out} \subset \mathcal{H}$ of massless particles and a corresponding scattering matrix as in the massive case. The details of this construction can be found in the original article, cf. Haag (1992).

It also follows from these arguments that the asymptotic fields $A^{in/out}$ of massless particles emanating from a region $\mathcal{O}$, that is, for which the underlying interpolating operators $A$ are localized in $\mathcal{O}$, commute with all operators localized in $\mathcal{O}_\mp$, respectively. This result may be understood as an expression of Huygens' principle. More precisely, denoting by $\mathcal{A}^{in/out}(\mathcal{O})$ the algebras of bounded operators generated by the asymptotic fields $A^{in/out}$, respectively, one arrives at the following quantum version of Huygens' principle.

Theorem 4 Consider a theory of massless particles as described above and let $\mathcal{A}^{in/out}(\mathcal{O})$ be the algebras generated by massless asymptotic fields $A^{in/out}$ with $A \in \mathcal{A}(\mathcal{O})$. Then

$\mathcal{A}^{in}(\mathcal{O}) \subset \mathcal{A}(\mathcal{O}_-)'$ and $\mathcal{A}^{out}(\mathcal{O}) \subset \mathcal{A}(\mathcal{O}_+)'$   [24]

Here the prime denotes the set of bounded operators commuting with all elements of the respective algebras (i.e., their commutants).

Beyond Wigner's Concept of Particle

There is by now ample evidence that Wigner's concept of particle is too narrow in order to cover all particle-like structures appearing in quantum field theory. Examples are the partons which show up in nonabelian gauge theories at very small spacetime scales as constituents of hadrons, but which do not appear at large scales due to the confining forces. Their mathematical description
requires a quite different treatment, which cannot be discussed here. But even at large scales, Wigner's concept does not cover all stable particle-like systems, the most prominent examples being particles carrying an abelian gauge charge, such as the electron and the proton, which are inevitably accompanied by infinite clouds of ("on-shell") massless particles.

The latter problem was discussed first by Schroer, who coined the term "infraparticle" for such systems. Later, Buchholz showed in full generality that, as a consequence of Gauss' law, pure states with an abelian gauge charge can neither have a sharp mass nor carry a unitary representation of the Lorentz group, thereby uncovering the simple origin of results found by explicit computations, notably in quantum electrodynamics (Steinmann 2000). Thus, one is faced with the question of an appropriate mathematical characterization of infraparticles which generalizes the concept of particle invented by Wigner. Some significant steps in this direction were taken by Fröhlich, Morchio, and Strocchi, who based a definition of infraparticles on a detailed spectral analysis of the energy–momentum opera-

$U(x) |L\rangle_p = e^{ipx} \, |L(x)\rangle_p , \quad L \in \mathcal{L}$   [25]

It is instructive to (formally) replace $L$ here by the identity operator, making it clear that this relation indeed defines improper states of sharp energy–momentum.

In theories of massive particles, one can always find localizing operators $L \in \mathcal{L}$ such that their images $|L\rangle_p \in \mathcal{H}$ are states with a sharp mass. This is the situation covered in Wigner's approach. In theories with long-range forces there are, in general, no such operators, however, since the process of localization inevitably leads to the production of low-energy massless particles. Yet improper states of sharp momentum still exist in this situation, thereby leading to a meaningful generalization of Wigner's particle concept.

That this characterization of particles covers all situations of physical interest can be justified in the general setting of relativistic quantum field theory as follows. Picking $g_t$ as in [4] and any vector $\Psi \in \mathcal{H}$ with finite energy, one can show that the functionals $\rho_t$, $t \in \mathbb{R}$, given by

$\rho_t(L^* L) \doteq \int d^4x \, g_t(x_0) \, \langle \Psi, (L^* L)(x) \Psi \rangle , \quad L \in \mathcal{L}$   [26]
setting $|L\rangle_p \doteq L\,|p\rangle$, one concludes that the map $L \mapsto |L\rangle$ can be decomposed into a direct integral of improper particle states of sharp energy–momentum, $|\cdot\rangle = \int d\mu(p)^{1/2} \, |\cdot\rangle_p$. It is crucial that this result can also be established without any a priori input about the nature of the particle content of the theory, thereby providing evidence of the universal nature of the concept of improper particle states of sharp momentum, as outlined here.

Theorem 5 Consider a relativistic quantum field theory satisfying the standing assumptions. Then the maps $L \mapsto |L\rangle$ defined above can be decomposed into improper particle states of sharp energy–momentum $p$,

$|\cdot\rangle = \int d\mu(p)^{1/2} \, |\cdot\rangle_p$   [29]

where $\mu$ is some measure depending on the state $\Psi$ and the respective time limit taken.

It is noteworthy that whenever the space of improper particle states corresponding to fixed energy–momentum $p$ is finite dimensional (finite particle multiplets), then in the corresponding Hilbert space there exists a continuous unitary representation of the little group of $p$. This implies that improper momentum eigenstates of mass $m = (p^2)^{1/2} > 0$ carry definite (half)integer spin, in accordance with Wigner's classification. However, if $m = 0$, the helicity need not be quantized, in contrast to Wigner's results.

Though a general scattering theory based on improper particle states has not yet been developed, some progress has been made in Buchholz et al. (1991). There it is outlined how inclusive collision cross sections of scattering states, where an undetermined number of low-energy massless particles remains unobserved, can be defined in the presence of long-range forces, in spite of the fact that a meaningful scattering matrix may not exist.

Asymptotic Completeness

Whereas the description of the asymptotic particle features of any relativistic quantum field theory can be based on an arsenal of powerful methods, the question of when such a theory has a complete particle interpretation remains open to date. Even in concrete models there exist only partial results, cf. Iagolnitzer (1993) for a comprehensive review of the current state of the art. This situation is in striking contrast to the case of quantum mechanics, where the problem of asymptotic completeness has been completely settled.

One may trace the difficulties in quantum field theory back to the possible formation of superselection sectors (Haag 1992) and the resulting complex particle structures, which cannot appear in quantum-mechanical systems with a finite number of degrees of freedom. Thus, the first step in establishing a complete particle interpretation in a quantum field theory has to be the determination of its full particle content. Here the methods outlined in the preceding section provide a systematic tool. From the resulting data, one must then reconstruct the full physical Hilbert space of the theory comprising all superselection sectors. For theories in which only massive particles appear, such a construction has been established in Buchholz and Fredenhagen (1982), and it has been shown that the resulting Hilbert space contains all scattering states. The question of completeness can then be recast into the familiar problem of the unitarity of the scattering matrix. It is believed that phase space (nuclearity) properties of the theory are of relevance here (Haag 1992).

However, in theories with long-range forces, where a meaningful scattering matrix may not exist, this strategy is bound to fail. Nonetheless, as in most high-energy scattering experiments only some very specific aspects of the particle interpretation are really tested, one may think of other meaningful formulations of completeness. The interpretation of most scattering experiments relies on the existence of conservation laws, such as those for energy and momentum. If a state has a complete particle interpretation, it ought to be possible to fully recover its energy, say, from its asymptotic particle content, that is, there should be no contributions to its total energy which do not manifest themselves asymptotically in the form of particles. Now the mean energy–momentum of a state $\Psi \in \mathcal{H}$ is given by $\langle \Psi, P \Psi \rangle$, $P$ being the energy–momentum operators, and the mean energy–momentum contained in its asymptotic particle content is $\int d\mu(p) \, p$, where $\mu$ is the measure appearing in the decomposition [29]. Hence, in case of a complete particle interpretation, the following should hold:

$\langle \Psi, P \Psi \rangle = \int d\mu(p) \, p$   [30]

Similar relations should also hold for other conserved quantities which can be attributed to particles, such as charge, spin, etc. It seems that such a weak condition of asymptotic completeness suffices for a consistent interpretation of most scattering experiments. One may conjecture that relation [30] and its generalizations hold in all theories admitting a local stress–energy tensor and local currents corresponding to the charges.

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Dispersion Relations; Perturbation Theory and its Techniques; Quantum Chromodynamics; Quantum Field Theory in Curved
Spacetime; Quantum Mechanical Scattering Theory; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: The Analytic Program.

Further Reading

Araki H (1999) Mathematical Theory of Quantum Fields. Oxford: Oxford University Press.
Buchholz D and Fredenhagen K (1982) Locality and the structure of particle states. Communications in Mathematical Physics 84: 1–54.
Buchholz D, Porrmann M, and Stein U (1991) Dirac versus Wigner: towards a universal particle concept in quantum field theory. Physics Letters B 267: 377–381.
Dybalski W (2005) Haag–Ruelle scattering theory in presence of massless particles. Letters in Mathematical Physics 72: 27–38.
Fredenhagen K, Gaberdiel MR, and Ruger SM (1996) Scattering states of plektons (particles with braid group statistics) in (2 + 1) dimensional quantum field theory. Communications in Mathematical Physics 175: 319–336.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Iagolnitzer D (1993) Scattering in Quantum Field Theories. Princeton, NJ: Princeton University Press.
Jost R (1965) General Theory of Quantized Fields. Providence, RI: American Mathematical Society.
Steinmann O (2000) Perturbative Quantum Electrodynamics and Axiomatic Field Theory. Berlin: Springer.
Streater RF and Wightman AS (1964) PCT, Spin and Statistics, and All That. Reading, MA: Benjamin/Cummings.

Scattering in Relativistic Quantum Field Theory: The Analytic Program
the analyticity of the previous structure functions There are two versions of this postulate. In the
in the complex spacetime variables, in particular Wightman framework, causality is expressed by the
for purely imaginary times. condition of local commutativity or microcausality,
In both cases, analyticity is obtained as a basic pro- ½ðx1 Þ; ðx2 Þ ¼ 0 for ðx1 x2 Þ2 < 0 ½3
perty of the Fourier–Laplace transformation in several
variables. Let V þ denote the forward cone of the In the algebraic QFT framework, causality is
: : : expressed by a similar property in terms of any
Minkowskian space (V þ ¼ V ¼ {x; x2 ¼ x x > 0, :
x0 > 0}) and let field B(x) generated by a local observable B ¼ B(0)
Z affiliated to a region of spacetime enclosed in a
~f ðp þ iqÞ ¼ given ‘‘double cone’’ Ob = Vbþ \ (Vbþ ). The corres-
eiðpþiqÞx f ðxÞdx ½1
Vaþ ponding expression of causality is
Z
4
½Bðx1 Þ; Bðx2 Þ ¼ 0
gðx þ iyÞ ¼ ð2Þ eipðxþiyÞ ~
gðpÞdp ½2
Vpþ for ðx1 x2 Þ 2= ðVaþ [ ðVaþ Þ ½4
be the associated reciprocal Fourier formulas, for all a such that a > 2b.
applied, respectively, to functions f (x) with support So, we see that basically, causality and spectral
contained in the translated forward cone Vaþ = a þ condition generate analyticity respectively in com-
V þ , a 2 V þ (or in its closure), and to functions ~g(p) plexified p-space and x-space. However, the situa-
with support contained in the translated forward tion is more intricate, since for each N there are
cone VPþ = P þ V þ , P 2 V þ of energy–momentum always several holomorphic branches (two in the
space (or in its closure). Then in view of the case N = 2) in the variables (z1 , . . . , zn ) and also in
convergence properties of the previous integrals, one the variables (k1 , . . . , kn ): each of these two sets is
easily checks that ~f (k) is holomorphic with possible obtained essentially by permutations of the N vector
exponential increase in the imaginary directions variables. The important point is that these various
controlled by the bound eqa in the tube domain branches can be seen to ‘‘communicate together,’’
T þ = R4 þ iV þ ; similarly, g(z) is holomorphic with thanks to the existence of ‘‘coincidence regions’’ of
an increase controlled by the exponential bound eyP their boundary values on the reals. Here again the
in the tube domain T = R4 þ iV . roles played by causality and stability are symmetric
On the one hand, for each N the structure functions (but inverted): while causality produces coincidence
<, (p˜ 1 ) (p
˜ N )0 > (or <, B(p ~ 1 ) B(p
~ N )0 >) regions for the holomorphic functions in complex
have conical support properties of the previous type in spacetime, spectral conditions produce coincidence
the variables pj , as a consequence of the relativistic regions for the holomorphic functions in complex
shape of the energy–momentum spectrum. In both energy–momentum space.
axiomatic frameworks, in fact, one postulates that In view of a basic theorem of several complex
there is a state of zero energy–momentum , called the variable analysis, called the edge-of-the-wedge the-
vacuum, and that the energy–momentum spectrum , orem (see below in (4)), the two sets of commu-
namely the joint spectrum of the generators P of the nicating holomorphic branches actually define by
Lie algebra of the group U(x), is contained in the mutual analytic continuation two holomorphic
, 0 0
closure of V þ : this is the so-called spectral condition. function HN (k1 , . . . , kN ) and W ,N (z1 , . . . , zN ) in
, 0 , 0
A more refined assumption introduced for the require- respective domains DN and N . However, these
ments in particle physics is that contains discrete two primitive domains are not natural holomorphy
parts localized on sheets of (mass-shell) hyperboloids domains (a phenomenon which is particular to
inside V þ . These support properties in p-space imply complex geometry in several variables). The prob-
that the corresponding inverse Fourier transforms lem of finding their holomorphy envelopes, namely
<, (x1 ) (xN )0 > are boundary values of holo- the smallest domains D ^ , 0 and
^ , 0 in which any
N N
morphic functions in appropriate tube domains of the functions holomorphic in the primitive domains can
complex space variables (z1 , . . . , zn ). be analytically continued, is the idealistic purpose of
On the other hand, in order to exhibit structure what has been called the analytic program of
functions with conical support properties in x-space, axiomatic QFT. So, we see that there is an analytic
one needs to build appropriate algebraic combina- program in x-space and there is an analytic program
tions of functions < , (xj1 ) (xjN )0 > with in p-space. In practice, except for the case N = 2,
permuted arguments in order to take the benefit of where the complete answer is known, only a partial
the causality postulate, which is always formulated knowledge of the holomorphy envelopes has been
in terms of the commutator of two field operators. obtained.
The analytic program in p-space, which is the only one to be described in the rest of this article, was often considered as physically more interesting, in view of the fact that it aims to establish analyticity properties of the scattering kernels on the complex mass shell. As a matter of fact, an important part of it concerns the derivation of the analyticity domains of dispersion relations for two-particle scattering amplitudes. This part is important from the historical viewpoint as well as from conceptual, physical, and pedagogical viewpoints (the reader may find it useful to first check the article Dispersion Relations, which illustrates how a structure function of the form $H_2^{\Psi,\Psi'}(k_1, k_2)$ can be used for that purpose with a suitable choice of the states $\Psi$ and $\Psi'$). In the general development of the analytic program (in x-space as well as in p-space), it is recommended to consider the infinite set of structure functions $H_N \doteq H_N^{\Omega,\Omega}(k_1, \ldots, k_N)$ and $\mathcal{W}_N \doteq \mathcal{W}_N^{\Omega,\Omega}(z_1, \ldots, z_N)$ where $\Omega$ is the privileged vacuum state of the theory, in view of the fact that each of these sets characterizes entirely the field theory considered.

Before shifting to the analytic program in p-space, we would like to mention various points of interest of the analytic program in x-space:

1. Various results of this program have been extensively used for proving fundamental properties of QFT, such as the PCT-invariance theorem, the spin–statistics connection, etc. A good part of these can be found in the books by Streater and Wightman (1980) and by Jost (1965).
2. The functions $H_N$ and $\mathcal{W}_N$ are holomorphic in their respective p-space and x-space "Euclidean subspaces." To make this clear, let us assume that a Lorentz frame has been chosen once for all; the linear subspace of complex spacetime (resp. energy–momentum) vectors of the form $z = (iy_0, \mathbf{x})$ (resp. $k = (iq_0, \mathbf{p})$) is called the "Euclidean subspace" of the corresponding complex Minkowskian space, in view of the fact that the quadratic form $z^2 \doteq z \cdot z = -(y_0^2 + \mathbf{x}^2)$ (resp. $k^2 \doteq k \cdot k = -(q_0^2 + \mathbf{p}^2)$) has a definite (negative) sign on that subspace. Then it has been established that (for each $N$) the restrictions of $H_N$ and $\mathcal{W}_N$ to the corresponding N-vector Euclidean subspaces are the Fourier transforms of each other. This fact participates in the foundation of the Euclidean formulation of QFT or "QFT at imaginary times"; the latter has provided many important results in QFT, in particular for the rigorous study of field models (initiated by Glimm and Jaffe in the 1970s).
3. A more recent extension of QFT called thermal QFT (TQFT), which aims to study the behavior of quantum fields in a thermal bath, can be described in terms of a modified analytic program. In the latter, the spectral condition is replaced by the so-called KMS condition, which prescribes x-space analyticity properties of a particular type for the structure functions $\mathcal{W}_N$: it requires analyticity together with periodicity conditions with respect to imaginary times, the period being the inverse of the temperature (see Thermal Quantum Field Theory). The usual analytic structure for the theories with vacuum and spectral conditions is recovered in the zero-temperature limit.
4. In more recent investigations concerning quantum fields on (holomorphic) curved spacetimes, analyticity properties of the structure functions similar to those of thermal QFT can be established. This is the case in particular with de Sitter spacetime, for which a notion of "temperature of geometrical origin" is most simply exhibited.

In this article, an account of the general analytic program of axiomatic QFT in complex energy–momentum space will be presented; it will describe some of the methods which have been used for establishing analyticity properties of the N-point structure functions of QFT and corresponding properties of the $(n \to n')$-particle collision processes, for all $n, n'$ such that $n \ge 2$, $n' \ge 2$, $n + n' = N$. (For a more detailed study, in particular concerning the microlocal methods, see the book by Iagolnitzer (1992)).

Concerning the important case $N = 4$, this article gives complements to the results described in the article Dispersion Relations. In fact, the program allows one to justify other important analytic structures of the four-point functions and of two-particle scattering functions. They concern

• the field-theoretical basis of analyticity in the complexified variable of angular momentum, first introduced and developed in potential theory (Regge 1959);
• the Bethe–Salpeter (BS-) type structure (based on the additional postulate of asymptotic completeness), which is a relativistic field-theoretical generalization of the Lippmann–Schwinger structure of nonrelativistic scattering theory (for Schrödinger equations with Yukawa-type potentials).

The latter allows one to introduce the concept of composite particle in the field-theoretical framework (including bound states and unstable particles or "resonances") and also the concept of "Regge particle," thanks to complex angular momentum analysis.
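As a quick check of the sign claim in point 2 above, the quadratic form can be evaluated directly on a vector of the Euclidean subspace. The sketch assumes the Minkowski metric with signature $(+,-,-,-)$, which is consistent with the convention $x^2 = x\cdot x > 0$ used for the forward cone; the boldface notation for the spatial components is an assumption of this illustration.

```latex
\begin{align*}
z &= (i y_0, \mathbf{x}), \qquad y_0 \in \mathbb{R},\ \mathbf{x} \in \mathbb{R}^3, \\
z^2 &= z\cdot z = (i y_0)^2 - \mathbf{x}\cdot\mathbf{x}
     = -\left( y_0^2 + \mathbf{x}^2 \right) \le 0, \\
k &= (i q_0, \mathbf{p})
  \ \Longrightarrow\
  k^2 = (i q_0)^2 - \mathbf{p}\cdot\mathbf{p}
      = -\left( q_0^2 + \mathbf{p}^2 \right) \le 0 .
\end{align*}
```

Up to the overall sign, the Minkowskian quadratic form therefore reduces to the Euclidean norm on $\mathbb{R}^4$ in the real coordinates $(y_0, \mathbf{x})$, respectively $(q_0, \mathbf{p})$, which is what motivates the name.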
Various Aspects of the General Dispersion Relations. Then in view of the Laplace-
Analytic Program of QFT in Complex transform theorem in several variables, the Fourier
transform ~ () (p1 , . . . , pN ) = (p1 þ þ pN )
R
Energy–Momentum Space N
~rN ([p]N ) is such that ~r()
()
N ([p]N ) is the boundary value
The N-Point Structure Functions of QFT (), (c)
of a holomorphic function ~rN ([k]N ) defined in a
tube T = R 4(N1)
þ iC~ . Here [k] = [p] þ i[q]
It is proved in the Wightman QFT axiomatic frame- N N N
work that any QFT is completely characterized by the belongs to a 4(N 1)-dimensional complex linear
(infinite) sequence of its ‘‘N-point functions’’ or space M(c) : this is the set of complex vectors
: N ~ is
‘‘vacuum expectation values’’ (also called ‘‘Wightman [k]N ¼ (k1 , . . . , kN ) such that k1 þ þ kN = 0. C
functions’’) the dual cone of C in the real (4(N 1)-dimen-
sional) [q]N -space. Geometrically, each cone C ~ is
:
WN ðx1 ; . . . ; xN Þ ¼ < ; ðx1 Þ ðxN Þ > defined in terms of a certain ‘‘cell’’ of [q]N -space
which are tempered distributions on R4N satisfying a which is defined by prescribing consistent P conditions
set of general properties that can be split up into of the form "J qJ 2 V þ with qJ = j2J qj and "J = 1
linear and nonlinear conditions. (This is known as for all proper subsets J of the set {1, 2, . . . , N}.
the Wightman reconstruction theorem). This is the expression of the microcausality postu-
late (summarized in [3] or [4]) in complex energy–
Linear conditions Each individual N-point func- momentum space. Concerning the difference
tion satisfies three sets of linear conditions which between the two formulations [3] and [4], one can
result, respectively, from: see that there is no geometrical difference concern-
ing the analyticity domains, but differences for the
1. Poincaré invariance: typically, for every Poincaré type of increase of the structure functions in their
transformation g of Minkowski spacetime tube domains: in the case of [3], they are bounded
WN ðx1 ; . . . ; xN Þ ¼ WN ðgx1 ; . . . ; gxN Þ by powers of the energy–momenta, while in the case
of [4] they may have an exponential increase
in particular, the WN are invariant under space- governed by factors of the type eqa .
time translations and therefore defined on the For each N, the linear space generated by all the
:
quotient subspace R4(N1) ¼ R4N =R4 of the differ- distributions ~r() N ([ p]N ) is constrained by a set of
ences xj xk . linear relations (called Steinmann relations) which
2. Microcausality: support conditions on commu- result from algebraic expressions of discontinuities
tator functions of the following form: of the following type, called (generalized) ‘‘absorp-
: tive parts,’’
Cðj;jþ1Þ ðx1 ; . . . ; xn Þ ¼ WN ðx1 ; . . . ; xj ; xjþ1 ; . . . ; xN Þ
ðÞ ð0 Þ
WN ðx1 ; . . . ; xjþ1 ; ~rN ð½ pN Þ ~rN ð½ pN Þ
xj ; . . . ; xN Þ ¼ 0 ~ ð1 Þ ð½ p Þ; R
¼ < ; ½R ~ ð2 Þ ð½ p Þ > ½5
J1 ðJ1 Þ J2 ðJ2 Þ
4N 2
in the region of R defined by (xj xjþ1 ) < 0. for all pairs of adjacent cells (, 0 )( J1 , J2 ) in the
3. Spectral condition: support conditions on the following sense: and 0 only differ by changing the
Fourier transform W ~ N (p1 , . . . , pN ) = (p1 þ þ
value of "J1 = "J2 , ( J1 , J2 ) denoting any given
pN ) w ^ N (p1 , . . . , pN1 ) of WN , which assert that partition of the set {1, 2, . . . , N}. In [5], the symbols
w^ N (p1 , . . . , pN1 ) = 0 if either one of the follow- ~ (i ) denote generalized retarded operators of lower
R Ji
ing conditions is fulfilled: p1 þ þ pj 62 , for order and the argument [ p](J) stands for the set of
j = 1, . . . , N 1. independent 4-momenta { pj ; j 2 J}. Formula [5] may
For each N, one can then construct a set of be seen as an N-point generalization of formula [26]
distributions R()
N (x1 , . . . , xN ), called ‘‘generalized
of Dispersion Relations for the case when the state
retarded functions’’ (Araki, Ruelle, Steinmann, = 0 is replaced by .
1960 (see Iagolnitzer (1992, ref. [EGS])) which are Then by applying to [5] the same argument based
appropriate linear combinations of multiple com- on spectral condition as in the exploitation of
mutator functions built from WN and multiplied by eqn [26] in Dispersion Relations, one concludes
(0 )
products of Heaviside step-functions (xj,0 xk,0 ) of that the two distributions ~r() N and ~ rN coincide on
the differences of time coordinates. Each of these an open set R, 0 of the form p2J1 = p2J2 < M2J1 , where
: P
distributions R()
N (x1 , . . . , xN ) has its support con-
pJ1 ¼ j2J1 pj = pJ2 . It then follows from the gen-
tained in a convex salient cone C . This construction eral ‘‘oblique edge-of-the-wedge theorem’’ (Epstein,
can be seen as a generalization of the decomposition 1960; see below) that the two corresponding
(0 ), (c)
[23] of the commutator C, 0 in the article holomorphic functions ~r(), N
(c)
([k]N ) and ~rN ([k]N )
have a common analytic continuation in the union of Sn, n0 (pn, in ; pn0 , out ), defined by a straightforward gen-
their tubes together with a certain complex ‘‘connecting eralization of formula [20] of the quoted article:
set,’’ bordered by R, 0 . Since this argument applies to
all pairs (, 0 )( J1 , J2 ) , the following important property Sn;n0 ð^fn;in ; ^gn0 ;out Þ
Z
holds (see Iagolnitzer (1992, refs. [B2], [EGS])): ^fn;in ðpn;in Þ^gn0 ;out ðpn0 ;out Þ
¼
Theorem 1 Mn;n0
0
the (n ! n0 )-channel is considered. The important in its cut-plane (or crossing) domain; the dispersion
thing to be noted in [8] is the sign convention which relations with two subtractions are still justified in
attributes the notation pj to the momentum of any that case (Epstein, Glaser, Martin, 1969 (see Martin
incoming particle and therefore implies that pj (1969, preprint))).
:
belongs to the negative sheet of hyperboloid Hm ¼
þ
Hm . This is the price to pay for expressing Off-Shell Character of DN : Nontriviality of the
symmetrically the energy–momentum conservation Analytic Structure of the Scattering Kernels
law as p1 þ p2 þ þ pN = 0 (according to the QFT
One can now see that for each value of N(N 4)
formalism), but it also displays, as a nice feature,
the situation created by complex geometry in the
the fact that all the affiliated scattering kernels
space C4(N1) of [k]N is a mere generalization of the
Tn, n0 such that n þ n0 = N are located on the
one described in a simple situation in the article
various connected components of the mass shell
Dispersion Relations.
M(N) (pj 2 Hm ; j = 1, 2, . . . , N): the choice of the
þ
sheet Hm or Hm of Hm is exactly linked to the 1. There exists a fundamental (3N 4)-dimensional
incoming or outgoing character of the particle complex submanifold, namely the complex mass
considered. shell M(c) (N) defined by the equations kj = m ;
2 2
nonlinear program. The ‘‘synergy’’ created by the Zerner ‘‘flat tube theorem,’’ or ‘‘flat edge-of-the-
combination of these two programs remains, to a wedge theorem.’’ In the latter, the local tubes
large extent, to be explored. TC(loc)
1
and TC(loc)
2
of f1 and f2 reduce to one-variable
domains of the upper half-plane in separate
Results of Analytic Completion variables z1 = x1 þ iy1 , z2 = x2 þ iy2 but with a
in the ‘‘Linear Program’’ common range of real parts (x1 , x2 ) 2 U. The data
f1 (z1 , x2 ) and f2 (x1 , z2 ) have coinciding boundary
We can only outline here some of the geometrical
values (f1 (x1 , x2 ) = f2 (x1 , x2 )) in the limit (y1 ! 0,
methods which allow one to compute parts of the
y2 ! 0). The result is again the existence of a
holomorphy envelopes of the domains DN . One
common analytic continuation to f1 and f2 , which
important method, which may be used after apply-
is a function of two complex variables f (z1 , z2 ) in
ing suitable conformal mappings, reduces to the
the intersection of the quadrant (y1 > 0, y2 > 0)
following basic theorem.
with a complex neighborhood of U. (Note that this result of complex analysis still holds when the real boundary values of the holomorphic functions have singularities, namely are only defined in the sense of distributions).

The tube theorem The holomorphy envelope of a "tube domain" of the form $T_B = \mathbb{R}^n + iB$, where $B$ is an arbitrary domain in $\mathbb{R}^n$ called the basis of the tube, is the convex tube $T_{\hat B} = \mathbb{R}^n + i\hat B$, where $\hat B$ is the convex hull of $B$.
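A simple illustration of the tube theorem, not taken from the article, is a tube in $\mathbb{C}^2$ whose basis is an annulus. The choice of basis and the radii 1 and 2 are assumptions made purely for the example.

```latex
\begin{align*}
B &= \{\, y \in \mathbb{R}^2 : 1 < |y| < 2 \,\}, &
T_B &= \mathbb{R}^2 + iB \subset \mathbb{C}^2, \\
\hat B &= \{\, y \in \mathbb{R}^2 : |y| < 2 \,\}, &
T_{\hat B} &= \mathbb{R}^2 + i\hat B .
\end{align*}
```

According to the theorem, every function holomorphic in $T_B$ extends holomorphically to $T_{\hat B}$, including the points with $|\mathrm{Im}\, z| \le 1$ where nothing was assumed. For $n = 1$ a connected basis is an interval and is already convex, so the statement has content only for $n \ge 2$.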
Global analyticity properties The following prop-
The opposite or oblique edge-of-the-wedge theo-
erty (discovered by Streater for three-point func-
rem (Epstein 1960 (see Streater and Wightman
tions) looks like an extension of the tube theorem.
(1980, ch. 2, ref. 18))) is a refined local version of
The holomorphy envelope of the union of two tubes
the tube theorem, in which the basis B is of the form
T , T 0 corresponding to adjacent pairs of cells
B = C1 [ C2 , where C1 , C2 are two disjoint (opposite
(, 0 )(J1 , J2 ) together with a complex connecting set
or nonopposite) cones with apex at the origin
bordered by R, 0 = {[ p]N ; p2J1 < m2J1 } is the convex
and where TB is replaced by a pair of ‘‘local tubes’’
hull T , 0 of the union of these tubes minus the
(TC(loc) , TC(loc) ). Here the adjective ‘‘local’’ means that
1 2 following analytic hypersurface J1 which can be
the real parts of the variables are confined in a given
called ‘‘a cut’’: J1 = {[k]N : k2J1 = m2J1 þ , 0}. The
open set U (which can be arbitrarily small). The
interest of this result (although it remains by itself an
connectedness of TB is now replaced by the
off-shell result) is that it can generate larger cut-
consideration of any pair of functions (f1 , f2 )
domains by additional analytic completions, which
holomorphic in these local tubes whose boundary
may have intersections with the complex mass shell
values on their common real set U coincide. The
(see below for the case N = 4).
result is that f1 and f2 admit a common analytic
continuation f in a local tube TC(loc) , where C is the
Microlocal analyticity properties In the case of the
convex hull of C1 [ C2 . In the case of opposite cones ^ 4 , it is possible to consider
four-point function H
(C1 = C2 ), f is then analytic in the real set U, while
opposite cut-domains of the previous type, for which
in the general oblique case f is only analytic in a
J1 = {1, 2} is the energy-cut of the channel (1, 2 !
complex connecting set bordered by U (namely a set
3, 4), and for which the spectral conditions prescribe
which connects TC(loc) and TC(loc) ). There exists an
1 2 an ‘‘edge-of-the-wedge situation’’ in the neighbor-
extended version of the edge-of-the-wedge theorem
hood of the corresponding mass-shell component
in which the boundary values of f1 and f2 are only
M(1, 2 ! 3, 4) . The result is that H4 is proved to be
defined as distributions.
holomorphic in a full complex cut-neighborhood of
For simplicity, we shall just give a very rough
M(1, 2 ! 3, 4) in the ambient complex energy–momen-
classification of the type of results obtained. We
tum space. The intersection of this local domain
shall distinguish:
with the complex mass shell M(c) (4) is of course a full
analyticity domains in the space of several complex cut-neighborhood of M(1, 2 ! 3, 4) in M(c) (4) , and
(possibly all) variables: they can be of global this proves that the corresponding scattering amplitude
type or of microlocal type, namely restricted to is the boundary value of an analytic scattering function
^ t) ¼ : ^ ^ 4 : it is
complex neighborhoods of real points; defined as the restriction F(s, H4 jM(c) of H
(4)
analyticity domains in special families of one- holomorphic in a domain of complex (s, t) space
dimensional complex manifolds; and deprived from the s–cut.
combinations of one-dimensional results which In the general case N > 4, the results are less
generate domains in several variables by a refined spectacular, although a more sophisticated microlocal
use of the tube theorem, called the Malgrange– method involving a ‘‘generalized edge-of-the-wedge
472 Scattering in Relativistic Quantum Field Theory: The Analytic Program
theorem’’ has been applied. This method, which was one of the three methods at the origin of the chapter of mathematics called microlocal analysis (the other two being Hörmander’s ‘‘analytic wave-front’’ method and Sato’s ‘‘microfunctions’’ method) is based on a local version of the Fourier–Laplace transformation called the FBI transformation (see, e.g., the book on ‘‘hypo-analytic structures’’ by Treves (1992) and in the present context the article ‘‘Causality and local analyticity’’ by Bros and Iagolnitzer (1973) (see Iagolnitzer (1992, ref. [BI1]))).

A first positive result (obtained at first by Hepp in 1965) is the fact that the various real boundary values of $\hat{H}_N$ admit well-defined restrictions as tempered distributions on the corresponding (real) mass shell $M_{(N)}$; this result is in fact crucial for the rigorous proof of general reduction formulas. However (according to Bros, Epstein, Glaser, 1972 (see Iagolnitzer (1992, ref. [BEG2]))), the local existence of an analytic scattering function in $M^{(c)}_{(N)}$ is not ensured at all points of the mass shell, but only in certain regions. A rather favourable situation still occurs for $(2 \to 3)$-particle collision amplitudes (i.e., for $N = 5$), but in the general case there are large regions of the mass shell where it is only possible to prove (at least in this linear program) that the amplitude is a sum of a limited number of boundary values of analytic functions, defined in local domains of $M^{(c)}_{(N)}$ (see in this connection, Iagolnitzer (1992)).

Analyticity at fixed total energy in momentum transfer variables. A remarkably simple situation had already been exploited before the general analysis of $H_N$ leading to Theorem 1 was carried out. It is the section of the domain of the $N$-point function in the space of the ‘‘initial relative 4-momentum’’ $k = (k_1 - k_2)/2$ of the $s$-channel with initial 4-momenta $(k_1, k_2)$, when the total energy–momentum $P = (k_1 + k_2)$ with $P^2 = s$ is kept fixed and real. The remaining 4-momenta $p_3, \ldots, p_N$ such that $p_3 + \cdots + p_N = P$ are also kept fixed and real. Consider the case when $P$ is (positive) timelike and such that $s \ge 4m^2$. Then it can be seen that one obtains analyticity of (a certain ‘‘1-vector restriction’’ of) $H_N$ with respect to the vector variable $k$ in the union of the two opposite tubes $T^+ = \mathbb{R}^4 + iV^+$, $T^- = \mathbb{R}^4 + iV^-$. Moreover, an edge-of-the-wedge situation holds in view of the spectral coincidence region of the form $k_1^2 = (P/2 + k)^2 < M_1^2$, $k_2^2 = (P/2 - k)^2 < M_2^2$. The corresponding holomorphy envelope is given by a Jost–Lehmann–Dyson domain (see Dispersion Relations), whose section by the complex mass shell $k_1^2 = k_2^2 = m^2$ turns out to give a ‘‘spherical tube domain’’ of the form $\{k;\ k = p + iq;\ k \cdot P = 0,\ k^2 = -s/4 + m^2;\ |q^2| < b^2\}$. The $(2 \to N - 2)$-particle scattering kernel is therefore the boundary value of a scattering function holomorphic in the previous spherical domain of complex $k$-space.

In the special case of the two-particle scattering amplitude $F(s, t)$, one checks that the previous domain yields for each $s$, $s \ge 4m^2$, an ellipse of analyticity for $\hat{F}(s, t)$ in the $t$-plane with foci at $t = 0$ and $u = 4m^2 - s - t = 0$; this ellipse is called the Lehmann ellipse. (We have considered for simplicity the case of a single type of particle with mass $m$ and two-particle threshold at $2m$.) In fact, the squared momentum transfer $t$ is equal to $(k - k')^2$, if $k' = (k_3 - k_4)/2$ denotes the ‘‘final relative momentum’’ of the $s$-channel, which was here taken to be fixed and real. Moreover, by a similar argument the corresponding absorptive part, namely the discontinuity across the $s$-cut of the scattering amplitude, can be shown to be holomorphic in a larger ellipse with the same foci called the large Lehmann ellipse.

It is interesting to compare the previous result with the one that one obtains when the fixed vector $P$ is chosen to be spacelike, namely when $s$ has a negative, namely ‘‘unphysical’’ value with respect to the distinguished channel $(1, 2 \to 3, 4)$. For that case, the exploitation of the primitive domain $D_4$ shows that for all negative (unphysical) values $\zeta_i = k_i^2 < 0$, $i = 1, 2, 3, 4$, of the squared mass variables, the function $\hat{H}_4$ is holomorphic in a cut-plane of the variable $t$, where the cuts are the $t$-cut ($t = 4m^2 + \rho$, $\rho \ge 0$) and the $u$-cut ($u = 4m^2 - s - t = 4m^2 + \rho'$, $\rho' \ge 0$). This cut-plane has of course to be compared with the off-shell cut-plane domain at the basis of the proof of dispersion relations (see Dispersion Relations). Here, however, the choice of the squared momentum transfer $t$ as the variable of analyticity allows one to shift to another interpretation in terms of the concept of angular momentum.

Analyticity in the complex angular momentum variable. In all the situations previously considered for the case $N = 4$, one can see that at fixed real values of the squared energy $s$ and of the squared masses $\zeta = \{\zeta_i;\ i = 1, 2, 3, 4\}$, the complex initial and final relative 4-momenta $k$ and $k'$ have directions which vary on the complexified sphere $S^{(c)}$. Moreover, the corresponding restriction of $\hat{H}_4$ to that sphere turns out to be always well defined and analytic on the real part of that sphere: it therefore defines a kernel on the sphere, which, in view of Poincaré invariance, is invariant under the rotations and therefore admits a convergent expansion in Legendre polynomials. Let us call $h_\ell(s; \zeta)$ the corresponding sequence of Legendre coefficients.
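The relation between an ellipse of analyticity and the behaviour of Legendre coefficients can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: it uses elastic equal-mass kinematics, $t = -2q^2(1 - \cos\theta)$ and $u = -2q^2(1 + \cos\theta)$ with $q^2 = s/4 - m^2$, so that the foci $t = 0$ and $u = 0$ sit at $\cos\theta = \pm 1$; the toy amplitude $1/(z_0 - \cos\theta)$, the normalization $h_\ell = \tfrac{1}{2}\int_{-1}^{1} f(x) P_\ell(x)\,dx$, and all function names are hypothetical choices, not taken from the article.

```python
import math

def mandelstam_t_u(s, cos_theta, m=1.0):
    """Assumed elastic equal-mass kinematics: t and u from s and cos(theta)."""
    q2 = s / 4.0 - m * m                  # squared centre-of-mass momentum
    t = -2.0 * q2 * (1.0 - cos_theta)     # t = 0 at cos(theta) = +1
    u = -2.0 * q2 * (1.0 + cos_theta)     # u = 0 at cos(theta) = -1
    return t, u

def legendre_p(l, x):
    """Legendre polynomial P_l(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def legendre_coefficient(f, l, intervals=4000):
    """h_l = (1/2) * integral of f(x) P_l(x) over [-1, 1], composite Simpson rule."""
    h = 2.0 / intervals
    total = 0.0
    for i in range(intervals + 1):
        x = -1.0 + i * h
        weight = 1 if i in (0, intervals) else (4 if i % 2 else 2)
        total += weight * f(x) * legendre_p(l, x)
    return 0.5 * total * h / 3.0

Z0 = 1.25  # hypothetical position of the nearest singularity in cos(theta)

def toy_amplitude(x):
    """Toy function analytic inside the ellipse with foci +-1 passing through Z0."""
    return 1.0 / (Z0 - x)

coeffs = [legendre_coefficient(toy_amplitude, l) for l in range(11)]
rho = Z0 + math.sqrt(Z0 * Z0 - 1.0)  # sum of the semi-axes of that ellipse

if __name__ == "__main__":
    for l in range(1, 11):
        print(l, coeffs[l] / coeffs[l - 1], 1.0 / rho)
```

In this toy case the successive ratios $h_{\ell+1}/h_\ell$ approach $1/\rho$ from below, so the coefficients decrease geometrically at a rate fixed by the size of the ellipse.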
In the first case considered above, this sequence coincides (all $\zeta_i$ being equal to $m^2$) with what the physicists call the set of partial waves $f_\ell(s)$ of the scattering amplitude. The analyticity of $\hat{H}_4$ on a complex spherical tube of $S^{(c)}$, namely of $\hat{F}(s, t)$ in the Lehmann ellipse, is then equivalent to a certain exponential decrease property with respect to $\ell$ of the sequence of partial waves.

In the second case, where $s$ and the $\zeta_i$ are negative, it can be seen that the sphere $S$ describes 4-momentum configurations which all belong to a certain Euclidean subspace $E_4$ of $M^{(c)}_4$. But this situation is much more favourable from the viewpoint of analyticity, since $\hat{H}_4$ can be seen to be holomorphic on the full complex submanifold $S^{(c)} \times S^{(c)}$ minus two sets $\Sigma_t$ and $\Sigma_u$ which correspond to the $t$- and $u$-cuts of the complex $t$-plane. Then this larger analyticity property turns out to be equivalent to the fact that the sequence $h_\ell(s; \zeta)$ admits an interpolation $\tilde{H}(\lambda; s; \zeta)$ holomorphic in a certain half-plane of the form $\mathrm{Re}\,\lambda > \ell_0$ such that for all integers $\ell > \ell_0$ one has: $\tilde{H}(\ell; s; \zeta) = h_\ell(s; \zeta)$. The value of $\ell_0$ is linked to the power bound at large momenta that must be satisfied by $\hat{H}_4$ as a consequence of the temperateness property included in the Wightman axiomatic framework (Bros and Viano 2000).

Of course, this nice analytic structure in a complex angular momentum variable could extend to the set of physical partial waves $f_\ell(s)$ if one could establish the analytic continuation of $\hat{F}(s, t)$ in a cut-plane of $t$ containing the Lehmann ellipses, but this seems out of the possibilities at least of the linear program.

The ‘‘Nonlinear Program’’ and Its Two Main Aspects

The extension of the analyticity domains by positivity and the derivation of bounds by unitarity. Positivity conditions of the form [6] have been extensively applied to the case $N = 4$ (namely for subsets $J$ with two elements). The main result (Martin 1969) consists in the possibility of differentiating the forward dispersion relations with respect to $t$ and, as a consequence, to enlarge the analyticity domain in $t$ at fixed $s$: the Lehmann ellipse, whose size shrinks to zero when $s$ tends to infinity, can then be replaced by an ellipse (i.e., the Martin ellipse) whose maximal point $t = t_{max} > 0$ is fixed when $s$ goes to infinity. This justifies the extension of dispersion relations in $s$ to positive values of $t$; then in a second step the use of unitarity relations for the partial waves allows one to obtain Froissart-type bounds on the scattering amplitudes (see Martin (1969)).

Asymptotic completeness and BS-type structural analysis. The BS equations have been at first introduced as identities of formal series in the perturbative approach of QFT, and the idea of considering such identities as exact equations having a conceptual content in the general axiomatic framework of QFT has been introduced and developed by Symanzik in 1960. However, it took a long time before its integration in the analytic program of QFT (Bros 1970 (see Iagolnitzer (1992, ref. [B1]))). These developments belong to the nonlinear program since they rely on quadratic integral equations between the various $N$-point functions, which express the postulate of asymptotic completeness via the use of appropriate reduction formulas.

For brevity, the general set of BS-type equations for the $N$-point functions with $N > 4$ will not be presented. The simplest BS-type equation, which concerns the four-point function, can be written as follows:

$\hat{H}_4(K; k, k') = B(K; k, k') + (\hat{H}_4 \circ_s B)(K; k, k')$   [9]

where

$(\hat{H}_4 \circ_s B)(K; k, k') = \int \hat{H}_4(K; k, k'')\, B(K; k'', k')\, G\!\left(\frac{K}{2} + k''\right) G\!\left(\frac{K}{2} - k''\right) d^4k''$   [10]

In the latter, the $s$-channel is privileged, with $s = K^2$, $K = (k_1 + k_2)$; $\hat{H}_4$ is seen as a $K$-dependent kernel ($k$ and $k'$ are the initial and final relative 4-momenta already defined), and the new object $B$ to be studied is also a $K$-dependent kernel. The function $G(k)$ is holomorphic in $k^2$ in a cut-plane except for a pole at $k^2 = m^2$ which plays a crucial role. (It is essentially the ‘‘propagator’’ or two-point function of the field theory considered.) Apart from pathologies due to the Fredholm alternative, the correspondence between $\hat{H}_4$ and $B$ is one-to-one, but the peculiarity concerns the integration cycle of [10]: it is a complex cycle of real dimension 4, which coincides with the Euclidean space of the vector variable $k''$ when all the 4-momenta are Euclidean, and can always be distorted inside the analyticity domain of $\hat{H}_4$ together with the external variables. The exploitation of the Fredholm equation in complex space with ‘‘floating integration cycles’’ then implies that $B$ is holomorphic at least in the primitive domain of $\hat{H}_4$.

An important geometrical aspect of the integration on the cycle in [10] is the fact that this cycle is ‘‘pinched’’ between the pair of poles of the functions $G$ when $K^2$ tends to its threshold value ($s = 4m^2$).
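A finite-dimensional analogue may help to see the algebra behind [9] and [10]. In the Python sketch below, which is only an illustration under stated assumptions, the integral over $k''$ is replaced by a sum over two sample points, the product of the two factors $G$ with the integration measure is replaced by a diagonal weight matrix `W`, and the kernels become 2×2 matrices; the numerical values and all names are hypothetical. The discretized equation $H = B + H W B$ is then solved by $H = B(1 - WB)^{-1}$ and inverted by $B = (1 + HW)^{-1}H$, and it fails to be solvable exactly when $\det(1 - WB) = 0$, the matrix counterpart of the Fredholm alternative.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def inv2(a):
    """Inverse of a 2x2 matrix; raises when the determinant vanishes."""
    d = det2(a)
    if abs(d) < 1e-12:
        raise ZeroDivisionError("singular matrix (Fredholm-alternative analogue)")
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.5, 0.0], [0.0, 0.8]]   # hypothetical weights standing in for G*G*d^4k''
B = [[0.3, 0.1], [0.1, 0.2]]   # hypothetical discretized kernel B

def solve_for_h(b, w):
    """Solve H = B + H W B, i.e. H = B (1 - W B)^(-1)."""
    return mat_mul(b, inv2(mat_sub(IDENTITY, mat_mul(w, b))))

def recover_b(h, w):
    """Invert the correspondence: B = (1 + H W)^(-1) H."""
    return mat_mul(inv2(mat_add(IDENTITY, mat_mul(h, w))), h)

H = solve_for_h(B, W)
D = det2(mat_sub(IDENTITY, mat_mul(W, B)))  # H becomes singular where D vanishes

if __name__ == "__main__":
    print("H =", H)
    print("det(1 - W B) =", D)
    print("B recovered =", recover_b(H, W))
```

The determinant plays the role of a denominator here: every entry of `H` is a ratio with `D` below, so singularities of `H` can only come from zeros of `D`.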
The type of mathematical concept encountered here is closely related to those used in the study of analyticity properties and Landau singularities of the Feynman amplitudes in the perturbative approach of QFT (in this connection, see the books by Hwa and Teplitz (1966) and by F Pham (2005) and references therein).

The first basic result is that it is equivalent for $\hat{H}_4$ to satisfy an asymptotic completeness equation in the pure two-particle region $4m^2 < s < 9m^2$ and for $B$ to satisfy the following property called two-particle irreducibility: $B$ satisfies dispersion relations in $s$ such that the $s$-cut begins at the three-particle threshold: $s = 9m^2$.

The consequence of this extended analyticity property of $B$ is that it generates the following type of analyticity properties for $\hat{H}_4$:

1. The existence of a two-sheeted analytic structure for $\hat{H}_4$ over a domain of the $s$-plane containing the interval $4m^2 \le s < 9m^2$, with a square-root-type branch point at the threshold $s = 4m^2$.

2. Composite particles. There exists a Fredholm-type expression

$\hat{H}_4(K; k, k') = \dfrac{N(K; k, k')}{D(K^2)}$   [11]

where $N$ and $D$ are expressed in terms of $B$ via Fredholm determinants, which shows that in its second sheet $\hat{H}_4$ may have poles in $s = K^2$, generated by the zeros of $D$. These poles are interpreted as resonances or unstable particles. The generation of real poles in the first sheet (i.e., bound states) is also possible under special spectral assumptions of QFT.

3. Complex angular momentum diagonalization of BS-type equations (Bros and Viano 2000, 2003). The operation $\circ_s$ in the BS-type equation [9] contains not only an integration over squared-mass variables, but also a convolution product on the sphere $S$; the latter is transformed into a product by the Legendre expansion of four-point functions described previously in the subsection ‘‘Analyticity in the complex angular momentum variable.’’ As a result, there is a partially diagonalized transform of eqn [9] in terms of the functions $\tilde{H}(\lambda; s; \zeta)$ and $\tilde{B}(\lambda; s; \zeta)$, which allows one to write a Fredholm formula similar to [11], namely

$\tilde{H}(\lambda; s; \zeta) = \dfrac{\tilde{N}(\lambda; s; \zeta)}{\tilde{D}(\lambda; s)}$   [12]

Then under suitable increase assumptions on $B$, there may exist a half-plane of the form $\mathrm{Re}\,\lambda > \ell_1$ (with $\ell_1 < \ell_0$) such that $\tilde{H}(\lambda; s; \zeta)$ admits poles in the joint variables $\lambda$ and $s$, corresponding to the concept of Regge particle: the composite particles introduced in (2) might then be integrated in the Regge particle, although they manifest themselves physically only for integral values $\ell$ of $\lambda$ with the corresponding spin interpretation. Of course, this scenario is by no means proven to hold in the general analytic program of QFT, but we have seen that the relevant ‘‘embryonary structures’’ are conceptually built-in, so that the phenomenon might hopefully be produced in a definite quantum field model.

4. Byproducts of BS-type structural analysis for $N = 5$ and $N = 6$. Relativistic exact structural equations for $(3 \to 3)$-particle collision amplitudes, which generalize the Faddeev structural equations of nonrelativistic potential theory, have been shown to be valid in the energy region of ‘‘elastic’’ collisions (i.e., with total energy bounded by $4m$); relevant Landau singularities of tree diagrams and triangular diagrams have been exhibited as a by-product in this low-energy region (Bros, and also Combescure, Dunlop in two-dimensional field models, 1981 (see Iagolnitzer (1992, refs. [B3], [B4], [CD]))). Moreover, crossing domains on the complex mass shell for $(2 \to 3)$-particle collision amplitudes have been obtained (Bros 1986 (see Iagolnitzer (1992, ref. [B1]))) by conjointly using ($N = 5$) BS-type equations together with analytic completion properties (see, e.g., the ‘‘Crossing lemma’’ in Dispersion Relations).

See also: Algebraic Approach to Quantum Field Theory; Axiomatic Quantum Field Theory; Dispersion Relations; Scattering, Asymptotic Completeness and Bound States; Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools; Thermal Quantum Field Theory.

Further Reading

Bros J and Viano GA (2000) Complex angular momentum in general quantum field theory. Annales Henri Poincaré 1: 101–172.
Bros J and Viano GA (2003) Complex angular momentum diagonalization of the Bethe–Salpeter structure in general quantum field theory. Annales Henri Poincaré 4: 85–126.
Haag R (1992) Local Quantum Physics. Berlin: Springer.
Hwa RC and Teplitz VL (1966) Homology and Feynman Integrals. New York: Benjamin.
Iagolnitzer D (1992) Scattering in Quantum Field Theories: The Axiomatic and Constructive Approaches, Princeton Series in Physics. Princeton: Princeton University Press.
Jost R (1965) The General Theory of Quantized Fields. Providence: American Mathematical Society.
Scattering, Asymptotic Completeness and Bound States 475
Martin A (1969) Scattering Theory: Unitarity, Analyticity and Crossing, Lecture Notes in Physics. Berlin: Springer.
Pham F (2005) Intégrales singulières. Paris: EDP Sciences/CNRS Éditions.
Streater RF and Wightman AS (1964, 1980) PCT, Spin and Statistics, and all that. Princeton: Princeton University Press.
Treves F (1992) Hypoanalytic Structures. Princeton: Princeton University Press.
to be convergent at small coupling (and replace the nonconvergent expansions of perturbative QFT). Examples of such models are the super-renormalizable massive $\varphi^4$ models in dimensions 2 or 3 (in the 1970s) and the ‘‘just renormalizable’’ massive (fermionic) Gross–Neveu model, in dimension 2, in the 1980s. The $N$-point functions of these models can be shown to have exponential fall-off in Euclidean spacetime. By the usual Fourier–Laplace transform theorem, one obtains in turn analyticity properties in corresponding regions away from the Euclidean energy–momentum space.

On the other hand, properties à la Osterwalder–Schrader can be established in Euclidean spacetime. By analytic continuation from imaginary to real times, it is in turn shown that a corresponding nontrivial theory satisfying the Wightman axioms is recovered on the Minkowskian side. This analysis is omitted here. However, no information is obtained in that way on the mass spectrum, AC, energy–momentum space analyticity, and so on. Such results can be obtained through the use of irreducible kernels. This was initiated by T Spencer in the 1970s and then developed along the same line (Spencer and Zirilli, Dimock and Eckmann, Koch, Combescure, and Dunlop). We outline here the more general approach of the present authors. In the latter, irreducible kernels are directly defined through ‘‘higher-order’’ cluster expansions which are again convergent at sufficiently small coupling. They are shown to satisfy exponential fall-off in Euclidean spacetime with rates better than those of the $N$-point functions, and hence corresponding analyticity in larger regions around (and away from) the Euclidean energy–momentum space. Results will then be established by analytic continuation, from the Euclidean up to the Minkowskian energy–momentum space, of structure equations that express the $N$-point functions in terms of irreducible kernels. These structure equations are infinite series expansions, with again convergence properties at small coupling. In the cases $N = 2$ and $N = 4$ (even theories), the re-summation of these structure equations gives, respectively, the Lippmann–Schwinger and Bethe–Salpeter (BS) integral equations (up to some regularization).

The one-particle irreducible (1PI) two-point kernel $G_1$ is analytic up to $s = (2m)^2 - \varepsilon$, where $\varepsilon$ is small at small coupling ($s$ is the squared center of mass energy of the channel). A simple argument then allows one to show analyticity of the actual two-point function in the same region up to a pole at $k^2 = m_{ph}^2$: this shows the existence of a first basic physical mass $m_{ph}$ (close at small coupling to the bare mass $m$). In a free theory (zero coupling) with one mass $m$, there is only one corresponding particle. At small coupling, the existence of other (stable) particles is not a priori expected; nevertheless, we will see that such particles (two-particle bound states) will occur in some models in view of kinematical threshold effects.

The 2PI four-point kernel $G_2$ is shown to be analytic up to $s = (4m)^2 - \varepsilon$ in an even theory. On the other hand, it satisfies a (regularized) BS equation. In a way analogous to the section ‘‘AC and analyticity,’’ starting here from the analyticity of $G_2$, the actual four-point function $F$ is in turn analytic or meromorphic in that region up to the cut at $s \ge 4m^2$, and the discontinuity formula associated with AC in the low-energy region is obtained.

For some models (depending on the signs of some couplings), it will be shown that $F$ has a pole in the physical sheet, below the two-particle threshold (at a distance from it which tends to zero as the coupling itself tends to zero). This pole then corresponds to a further stable particle.

More generally, and up to some technical problems, the structure equations should allow one to derive various discontinuity formulas of $N$-point functions including those associated with AC in increasingly higher-energy regions. Asymptotic causality in terms of particles and related analyticity properties (Landau singularities, etc.) should also follow. However, in this approach, results should be obtained only for very small couplings as the energy region considered increases.

Note: Notations used are different in the next two sections on the one hand, and the final three sections on the other. These notations follow the use of, respectively, axiomatic and constructive field theory; for instance, $x$ and $p$ are real on the Minkowskian side in the next two sections whereas they are real on the Euclidean side in the last three sections. The mass $m$ in the next two sections is a physical mass, whereas it is a bare mass in the last three sections (where a physical mass is noted $m_{ph}$).

The General Framework of Massive Field Theories

We denote by $x = (x_0, \vec{x})$ a (real) point in Minkowski spacetime with respective time and space components $x_0$ and $\vec{x}$ (in a given Lorentz frame); $x^2 = x_0^2 - \vec{x}^2$. Besides the usual spacetime dimension $d = 4$, possible values 2 or 3 will also be considered. In all that follows, the unit system is such that the velocity $c$ of light is equal to 1. Energy–momentum variables, dual (by Fourier transformation) to time and space
variables, respectively, are denoted by $p = (p_0, \vec{p})$; $p^2 = p_0^2 - \vec{p}^2$.

We describe below the Wightman axiomatic framework, though alternative ones such as ‘‘local quantum physics’’ based on the Araki–Haag–Kastler axioms may be used similarly for present purposes. For simplicity, unless otherwise stated, we consider a theory with only one basic (neutral, scalar) field $A$; $A$ is defined on spacetime as an operator-valued distribution: for each test function $f$, $A(f)$ (formally $\int A(x) f(x)\,dx$) is an operator in a Hilbert space $\mathcal{H}$ of states. A physical state is represented by a (normalized) vector in $\mathcal{H}$ modulo scalar multiples. It has to be physically understood as ‘‘sub specie aeternitatis’’ (i.e., ‘‘with all its evolution,’’ the Heisenberg picture of quantum mechanics being always adopted). It is assumed that there exists in $\mathcal{H}$ a representation of the Poincaré group (semidirect product of pure Lorentz transformations and spacetime translations).

The Wightman axioms include:

1. local commutativity: $A(x)$ and $A(y)$ commute if $x - y$ is spacelike: $(x - y)^2 < 0$.

2. the spectral condition (= positivity of the energy in relativistic form): the spectrum of the energy–momentum operators (infinitesimal generators of spacetime translations) is contained in the cone $\bar{V}_+$ ($p^2 \ge 0$, $p_0 \ge 0$). In a massive theory, the spectrum is more precisely assumed to be contained in the union of the origin (that will correspond to the vacuum vector introduced next), of one or more discrete mass-shell hyperboloids $H_+(m_i)$ ($p^2 = m_i^2$, $p_0 > 0$) with strictly positive masses $m_i$, and of a continuum. For simplicity, and unless otherwise stated, we consider in this section a theory with only one mass $m$ and a continuum starting at $2m$ (but this will not be so in a theory with ‘‘two-particle bound states’’). This condition introduces a first (partial) particle content of the theory. In models, physical masses will not be introduced at the outset but will have to be determined.

3. existence in $\mathcal{H}$ of a vacuum vector $\Omega$, which is the only invariant vector under Poincaré transformations up to scalar multiples; it is moreover assumed that the vector space generated by the action of field operators on the vacuum is dense in $\mathcal{H}$.

4. Poincaré covariance of the theory.

Subspaces $\mathcal{H}_{in}$ and $\mathcal{H}_{out}$ of $\mathcal{H}$ can be defined by limiting procedures. To that purpose, one considers test functions $f_{j,t}(x)$ with Fourier transforms of the form $\tilde{f}_j(p)\, e^{i(p_0 - [\vec{p}^2 + m^2]^{1/2})t}$, where the functions $\tilde{f}_j$ have their supports in a neighborhood of the mass-shell $H_+(m)$. It can then be shown that vectors of the form $\Psi_t = A(f_{1,t}) A(f_{2,t}) \cdots A(f_{n,t})\Omega$ converge to limits in $\mathcal{H}$ when $t \to \mp\infty$, respectively, and that these limits depend only on the mass-shell restrictions of the test functions $\tilde{f}_j|_{H_+(m)}$.

$\mathcal{H}_{in}$ and $\mathcal{H}_{out}$ are interpreted physically as subspaces of states that are ‘‘asymptotically tangent’’ before, respectively, after the interactions, to free-particle states with particles of mass $m$. They are in fact both isomorphic to the free-particle Fock space $\mathcal{F}$, namely the direct sum of $n$-particle spaces of ‘‘wave functions’’ depending on $n$ on-mass-shell energy–momenta $p_1, p_2, \ldots, p_n$.

AC is the assertion that $\mathcal{H} = \mathcal{H}_{in} = \mathcal{H}_{out}$, that is, that each state in $\mathcal{H}$ is asymptotically tangent to a free-particle state, with particles of mass $m$, both before and after interactions (the two free-particle states are different if there are interactions). This condition cannot be expected to always hold in the general framework introduced above, even if we restrict our attention to ‘‘physically reasonable’’ theories in which states of $\mathcal{H}$ are asymptotically tangent to free-particle states before and after interactions: the absence of other stable particles with different masses is not guaranteed. For instance, even if $A$ is ‘‘neutral,’’ the action of field operators on the vacuum might generate pairs of ‘‘charged’’ particles with opposite charges, whatever ‘‘charge’’ one might imagine. Individual charged particles cannot occur in the neutral space $\mathcal{H}$ and their mass thus does not appear in the spectral condition. Hence, such states of pairs of charged particles will not belong to $\mathcal{H}_{in}$ or $\mathcal{H}_{out}$ although they belong to $\mathcal{H}$. However, if the set of charged particles is known, it can be shown that the above framework might be enlarged by defining charged fields, in such a way that AC might still be valid in the enlarged framework (see the article of Buchholz and Summers). For simplicity, we restrict below our attention to the simplest theories in which AC holds in the way stated above.

If AC holds, it is shown that there exists a linear operator $S$ from $\mathcal{H}$ to $\mathcal{H}$, called ‘‘collision operator’’ or ‘‘S-matrix,’’ that relates the ‘‘initial’’ and ‘‘final’’ free-particle states to which a state in $\mathcal{H}$ is tangent before and after interactions, respectively; if AC does not hold, $S$ can also be defined as an operator in $\mathcal{F}$. Collision amplitudes or scattering functions are the energy–momentum kernels of $S$ for given numbers $m$ and $n$ of initial and final particles. As easily seen, they are well-defined distributions on the space of all initial and final on-shell energy–momenta. For convenience, we will denote by $p_k$ the physical energy–momentum of a final particle with index $k$ ($p_k \in H_+(m)$), and by $-p_k$ the physical energy–momentum of an initial particle ($-p_k \in H_+(m)$).
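The kinematical notions used in this section (the Minkowski square $p^2 = p_0^2 - \vec{p}^2$, spacelike separation, the closed forward cone, and the mass-shell hyperboloid $H_+(m)$) can be made concrete with a short Python sketch. It is only an illustration: the function names and the numerical tolerance are hypothetical, and vectors are represented as tuples with the time (or energy) component first.

```python
import math

def minkowski_square(v):
    """v^2 = v_0^2 - |vec v|^2 for a vector given as (v_0, v_1, ..., v_{d-1})."""
    return v[0] ** 2 - sum(c * c for c in v[1:])

def is_spacelike_separated(x, y):
    """True when (x - y)^2 < 0, the case in which A(x) and A(y) commute."""
    diff = tuple(a - b for a, b in zip(x, y))
    return minkowski_square(diff) < 0.0

def in_closed_forward_cone(p, tol=1e-9):
    """Membership in the closed cone p^2 >= 0, p_0 >= 0 of the spectral condition."""
    return minkowski_square(p) >= -tol and p[0] >= 0.0

def on_mass_shell(p, m, tol=1e-9):
    """Membership in H_+(m): p^2 = m^2 and p_0 > 0 (up to a numerical tolerance)."""
    return abs(minkowski_square(p) - m * m) <= tol and p[0] > 0.0

def on_shell_momentum(m, spatial):
    """Energy-momentum on H_+(m) with the given spatial part."""
    energy = math.sqrt(m * m + sum(c * c for c in spatial))
    return (energy,) + tuple(spatial)

if __name__ == "__main__":
    m = 1.0
    p1 = on_shell_momentum(m, (3.0, 0.0, 0.0))
    p2 = on_shell_momentum(m, (-1.0, 2.0, 0.0))
    total = tuple(a + b for a, b in zip(p1, p2))
    print(on_mass_shell(p1, m), on_mass_shell(p2, m))
    # The squared total energy-momentum of two particles of mass m is at least (2m)^2.
    print(minkowski_square(total) >= (2 * m) ** 2)
```

The last check reflects why, in a theory with a single mass $m$, the continuum of the spectrum is expected to start at $2m$: the sum of two on-shell energy–momenta always has invariant mass at least $2m$.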
Wightman Functions, Chronological Functions, and LSZ Reduction Formulas

The $N$-point Wightman ‘‘functions’’ $W_N$ are defined as the vacuum expectation values (VEVs) of the products of $N$ field operators, namely:

$W_N(x_1, x_2, \ldots, x_N) = \langle \Omega, A(x_1) A(x_2) \cdots A(x_N) \Omega \rangle$

The chronological functions $T_N$ are the VEVs of the chronological products of the fields $A(x_1), \ldots, A(x_N)$: in the latter, fields are ordered according to decreasing values of the time components of the points $x_k$. $T_N$ is essentially well defined due to local commutativity with, however, problems not treated here at coinciding points.

$\tilde{T}_N(p_1, \ldots, p_N)$ will denote the Fourier transform of $T_N$. In view of the invariance of the theory under spacetime translations, functions above are invariant under global spacetime translation of all points $x_k$ together. Hence, their Fourier transforms contain an energy–momentum conservation (e.m.c.) delta function $\delta(p_1 + p_2 + \cdots + p_N)$. Connected $N$-point functions are defined by induction (over $N$) via a formula expressing each (nonconnected) function as the sum of the corresponding connected function and of products of connected functions depending on subsets of points. In contrast to nonconnected functions, the analysis shows that connected functions in energy–momentum space do not contain in general e.m.c. delta functions involving subsets of energy–momenta.

It can be shown that the two-point function $\tilde{T}_2(p_1, p_2) = \delta(p_1 + p_2)\,\tilde{T}_2(p_1)$ has a pole of the form $1/(p_1^2 - m^2)$ and that $\tilde{T}_N$ has similar poles for each energy–momentum variable $p_k$ on the mass-shell. The connected, amputated chronological function $\tilde{T}_N^{amp,c}$ is defined by multiplying $(\tilde{T}_N)_{connected} = \tilde{T}_N^{c}$ (for $N \ge 2$) by the product of all factors $p_k^2 - m^2$ that cancel these poles. It is then shown that it can be restricted as a distribution to the mass-shell of any physical process with $m$ initial and $n$ final particles, with $m + n = N$, and that this restriction coincides with the collision amplitude of the process. A process is here characterized by fixing the initial and final indices.

The analyticity properties of interest (described below) will apply to the connected functions after factoring out their global e.m.c. delta functions.

the definition of chronological operators, and support properties in $p$-space due to the spectral condition. Support properties in $x$-space apply to cell and more general ‘‘paracell’’ functions which are VEVs of adequate combinations of products of ‘‘partial’’ chronological operators. It is shown that each such function has support in $x$-space in a closed cone $C_S$ (with apex at the origin). Moreover, for cell functions, the cone $C_S$ is convex and salient. Hence, in view of the usual Laplace transform theorem, the cell function in $p$-space (after Fourier transformation) is the boundary value of a function analytic in complex space in the tube $\mathrm{Re}\,p$ arbitrary, $\mathrm{Im}\,p$ in the open dual cone $\tilde{C}_S$ of $C_S$. It is also shown that, near any real point $P = (P_1, \ldots, P_N)$, the chronological function in $p$-space coincides with one or more cell functions.

Together with support properties in $p$-space arising from the spectral condition and the use of coincidence relations between some cell functions (in adequate real regions in $p$-space), one then shows the existence, for each $N$, of a well-defined, unique analytic function $F_N$, called the ‘‘analytic $N$-point function,’’ whose domain of analyticity, the ‘‘primitive domain of analyticity,’’ in complex $p$-space contains all the tubes $T_S$ associated with the cell functions. It also contains in particular a complex neighborhood of the Euclidean energy–momentum space which consists of energy momenta $P_k$ with real $\vec{P}_k$ and imaginary energies $(P_k)_0$. Moreover, the chronological function $\tilde{T}_N^{amp,c}$ is the boundary value of $F_N$ at all real points $P$, from imaginary directions which include those of the convex envelope of the cones $\tilde{C}_S$ associated with cell functions that coincide locally with $\tilde{T}_N^{amp,c}$.

However, the primitive domain has an empty intersection with the complex mass-shell, and thus gives no result on analyticity properties of collision amplitudes on the (real or complex) mass-shell. For $N = 4$, it has been possible to largely extend the primitive domain (which is not a ‘‘natural domain of holomorphy’’) by computing (parts of) its holomorphy envelope, which now has a nonempty intersection with the complex mass shell. It is shown in turn that the four-point function $F_4$ can be restricted to the complex mass-shell in a one-sheeted domain, called the ‘‘physical sheet,’’ that admits each (real) physical region on its boundary (there is here one physical region for each choice of the two initial and the two
The Analytic N-point Functions
final indices, the corresponding physical regions being
The Wightman axioms (without so far AC) yield disconnected from each other). In each physical
general analyticity, as also asymptotic causality, region, the collision amplitude is the boundary value
properties that we now describe. The analysis is of the mass-shell restriction of F4 , from the corre-
essentially based on the interplay of support proper- sponding half-space of ‘‘þi"’’ directions Im s > 0,
ties in x-space arising from local commutativity and where s is the (squared) energy of the process.
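The inductive definition of connected N-point functions described above (each nonconnected function is the connected one plus products of connected functions over subsets of points) can be sketched numerically. The block below is only an illustration under stated assumptions: the helper names `set_partitions`, `connected`, `wick`, `cov` and `full` are hypothetical, and the nonconnected functions are those of a toy free (Gaussian) field with an arbitrary two-point function, for which all connected functions beyond N = 2 should vanish.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of `items` (distinct labels) into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Choose which of the remaining labels share a block with `first`.
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            block = (first,) + others
            remaining = [x for x in rest if x not in others]
            for sub in set_partitions(remaining):
                yield [block] + sub


def connected(points, full, cache=None):
    """Connected function of `points`, obtained by induction on len(points).

    `full(block)` must return the nonconnected function for any tuple of labels.
    """
    cache = {} if cache is None else cache
    points = tuple(points)
    if points in cache:
        return cache[points]
    total = full(points)
    for partition in set_partitions(list(points)):
        if len(partition) == 1:
            continue  # the single-block term is the connected function itself
        product = 1.0
        for block in partition:
            product *= connected(block, full, cache)
        total -= product
    cache[points] = total
    return total


def wick(points, cov):
    """Toy nonconnected function of a free (Gaussian) field: sum over pairings."""
    if len(points) == 0:
        return 1.0
    if len(points) % 2 == 1:
        return 0.0
    first, rest = points[0], points[1:]
    return sum(
        cov(first, rest[j]) * wick(rest[:j] + rest[j + 1:], cov)
        for j in range(len(rest))
    )


def cov(i, j):
    """Hypothetical two-point function, chosen only for illustration."""
    return 1.0 / (1.0 + abs(i - j))


def full(block):
    """Nonconnected function of the toy free field for a tuple of labels."""
    return wick(tuple(block), cov)


two_point = connected((0, 1), full)          # equals cov(0, 1)
four_point = connected((0, 1, 2, 3), full)   # vanishes for a free field
```

In this toy case the connected two-point function reproduces the covariance and the connected four-point function vanishes up to rounding, which is the sense in which connected functions isolate the genuinely interacting part.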
Scattering, Asymptotic Completeness and Bound States 479
The analyticity domain on the complex mass-shell contains paths of analytic continuation between the various physical regions (“crossing property”) and admits cuts $s_{ij}$ real $\geq (2m)^2$ covering the various physical regions. From these analyticity properties in the physical sheet, one can also derive “dispersion relations” (see Dispersion Relations).

Asymptotic causality and analyticity properties for N ≥ 4

No similar result has been achieved at N > 4, and as a matter of fact, no similar result is expected if the AC condition is not assumed. The best results achieved so far are decompositions of the collision amplitude, in various parts of its physical region, as a sum of boundary values of functions analytic in domains of the complex mass-shell. In contrast to the case N = 4, the sum reduces to one term only in a certain subset of the physical region. Near other points, the N-point analytic function cannot be restricted locally to the complex mass-shell, though it can be decomposed as a sum of terms which, individually, are locally analytic in a larger domain that intersects the complex mass-shell.

These analyticity properties for N ≥ 4 are a direct consequence of (and equivalent to) an asymptotic causality property that we now outline. Let $f_{k,\tau}(p)$ be, for each index k, a test function of the form

$f_{k,\tau}(p) = e^{i p \cdot \tau u_k}\, e^{-\tau |p_k - P_k|^2}$

where each $u_k$ is a point in spacetime, $P_k$ is a given on-shell energy–momentum, and $\tau$ will be a spacetime dilatation parameter ($\tau > 0$). It is well localized in p-space around the point $P_k$ and its Fourier transform is well localized in x-space around the point $\tau u_k$ up to an exponential fall-off of width $\sqrt{\tau}$, which is small compared to $\tau$ as $\tau \to \infty$.

We now consider the action of the (connected, amputated) chronological function on such test functions. A configuration $u = (u_1, \ldots, u_N)$ will be called “noncausal” at $P = (P_1, \ldots, P_N)$ if this action decays exponentially as $\tau \to \infty$. In mathematical terms, u is then outside the “essential support” or “microsupport” at P. The asymptotic causality property established has roughly the following content: the only possible causal configurations u at P are those for which energy–momentum can be transferred from the initial to the final points in future cones. Moreover, at least two initial “extremal” points must coincide, as also two extremal final points. The simplest example is the case N = 4; if, for example, indices 1, 2 are initial and 3, 4 final, then the only a priori possible causal situations are such that $u_3 = u_4$ is in the future cone of $u_1 = u_2$ (in this particular case Lorentz invariance implies that $u_3 - u_1$ must be proportional to $P_3 + P_4$). In more general cases, the possible causal configurations u depend on P.

AC and Analyticity

Asymptotic Causality in Terms of Particles and Landau Singularities

As a matter of fact, a better causality property “in terms of particles” – which is the best possible one – is expected for “physically reasonable” theories if the (stable) particles of the theory are known. (By physically reasonable, we mean the absence of “à la Martin” pathologies such as the occurrence of an infinite number of unstable particles with arbitrarily long lifetime.) That property expresses the idea that the only causal configurations u at P are those for which the energy–momentum can be transferred from the initial to the final points via intermediate stable particles in accordance with classical laws: there should exist a classical connected multiple scattering diagram in spacetime joining the initial and final points $u_k$, with physical on-shell energy–momenta for each intermediate particle and energy–momentum conservation at each (pointwise) interaction vertex.

This property, if it holds, yields in turn (and is equivalent to) improved analyticity of the analytic N-point function near real physical regions: the (on-shell) collision amplitude is the boundary value of a unique analytic function in its physical region, at least away from some “exceptional points.” The boundary value (namely the collision amplitude) is moreover analytic outside Landau surfaces $L_+(\gamma)$ of connected multiple scattering graphs $\gamma$; and along these surfaces (which are in general smooth codimension-1 surfaces), it is in general obtained from well-specified “$+i\varepsilon$” directions (that depend in general on the real point P of $L_+$).

Exceptional points are those that lie at the intersection of two (or several) surfaces $L_+(\gamma_1)$, $L_+(\gamma_2)$, ..., with opposite causal directions, and hence having no $+i\varepsilon$ directions in common (in the on-shell framework). Such points do not occur at N = 4 for two-body processes, in which case the surfaces $L_+$ are the n-particle thresholds $s = (nm)^2$, with $n \geq 2$, $s = (p_1 + p_2)^2$. They do occur more generally: in a 3 → 3 process, 1, 2, 3 initial, 4, 5, 6 final, this is the case of all points P such that $P_1 = P_4$, $P_2 = P_5$, $P_3 = P_6$, which all belong to the Landau surfaces of the two graphs $\gamma_1$, $\gamma_2$, with only one internal line joining two interaction
vertices: in the case of $\gamma_1$ (resp., $\gamma_2$), the first vertex involves the external particles 1, 2, 4 (resp., 1, 3, 5), while the second one involves 3, 5, 6 (resp., 2, 4, 6). If moreover $P_1$, $P_2$, $P_3$ lie in a common plane, previous points P also lie on surfaces $L_+$ of “triangle” graphs with again opposite causal directions at P. The fact that $+i\varepsilon$ directions are opposite can equally be checked for the corresponding Feynman integrals of perturbative field theory.

Remark The above points are no longer exceptional in spacetime dimension 2. In fact, all surfaces $L_+$ mentioned then coincide with the (on-shell) codimension-1 surface $p_1 = p_4$, $p_2 = p_5$, $p_3 = p_6$, with two opposite causal directions. The previous asymptotic causality property, together with a further “causal factorization” property for causal configurations, then yields along that surface an actual factorization of the three-body (nonconnected) S-matrix into a product of two-body scattering functions modulo an analytic background. The latter vanishes outside the surface, hence is identically zero, for some special two-dimensional models.

In the absence of the AC condition, one clearly sees why the above causality in terms of particles cannot be established: as we have seen, there is a priori no control on the stable particles of the theory and on their masses, and pathologies such as those mentioned above cannot be excluded. Hopefully, the first problem should be solved if AC is assumed, and the second one should be removed by adequate regularity assumptions. This is the purpose of the so-called axiomatic nonlinear program, in which one also wishes to examine further problems, for example, analytic continuation into unphysical sheets, with the occurrence of possible unstable particle poles and other singularities, nature of singularities, possible multiparticle dispersion relations, ..., to cite only a few. Results so far remain limited but provide a first insight into such problems.

The Nonlinear Axiomatic Program

Results described below are based on discontinuity formulas arising from – and essentially equivalent in adequate energy regions to – AC, together with some regularity conditions. They can be established either with or without the introduction of adequate “irreducible” kernels. The methods rely on some general preliminary results on Fredholm theory in complex space (and with complex parameters). Irreducible kernels are defined through integral (Fredholm type) equations, first in the Euclidean region (imaginary energies) and then by local distortions of integration contours allowing one to reach the Minkowskian region. From discontinuity formulas and algebraic arguments, these irreducible kernels are shown to have analyticity (or meromorphy) properties associated with the physical idea of irreducibility (see examples below).

Results obtained so far with or without irreducible kernels are comparable in the simplest cases. However, the method based on irreducible kernels gives more refined results and seems best adapted to “extricate” the analytic structure of N-point functions for N > 4.

N = 4, Two-Body Processes in the Low-Energy Region

By even theories, we mean theories in which the N-point functions vanish identically for N odd.

Standard results on two-body processes with initial (resp., final) energy–momenta $p_1$, $p_2$ (resp., $p'_1$, $p'_2$) in the low-energy region $(2m)^2 \leq s < (3m)^2$ ($s = (p_1 + p_2)^2 = (p'_1 + p'_2)^2$) are based on the “off-shell unitarity equation”

$F_+ - F_- = F_+ \star F_-$ [1]

where $F_+(p_1, p_2; p'_1, p'_2)$ and $F_-(p_1, p_2; p'_1, p'_2)$ denote, respectively, the $+i\varepsilon$ and $-i\varepsilon$ boundary values of the four-point function $F_4$ from above or below the cut $s \geq (2m)^2$ in the physical sheet, and $\star$ denotes on-shell convolution over two intermediate energy–momenta. This relation is a direct consequence of AC for s less than $(3m)^2$, or less than $(4m)^2$ in an even theory. When the four external energy–momentum vectors $p_1$, $p_2$, $p'_1$, $p'_2$ are put on the mass-shell (on both sides of that relation), one recovers the usual elastic unitarity relation for the collision amplitude $T_+$ and its complex conjugate $T_-$:

$T_+ - T_- = T_+ \star T_-$

In the exploitation of these relations outlined below, a regularity condition is moreover needed, for example, the continuity of $F_+$ in the low-energy region.

By considering the unitarity equation as a Fredholm equation for $T_+$ at fixed s (in the complex mass-shell), one obtains the following result: $T_+$ can be analytically continued as a meromorphic function of s through the cut (in the low-energy region) in a two-sheeted (d even) or multisheeted (d odd) domain around the two-particle threshold. Possible poles in the second sheet (generated by Fredholm theory) will correspond physically to unstable particles. The singularity at the two-particle threshold is of the square-root type in s for d even, or in
$1/\log s$ for d odd. The difference between the two cases is due to the power $(d - 1)/2$ of s, integer or half-integer, in the kinematical factor arising from on-shell convolution. This result can also be extended to the off-shell function $F_4$ by applying a further argument of analytic continuation making use of the off-shell unitarity equation.

Restricting now our attention to an even theory (for simplicity), a similar result also follows from the introduction of a 2PI BS type kernel G satisfying (and here defined from F through) a regularized BS equation of the form

$F = G + F \circ_M G$ [2]

where $\circ_M$ denotes convolution over two intermediate energy–momenta with two-point functions on the internal lines and a regularization factor in order to avoid convergence problems at infinity (G then depends on the choice of this factor but its properties and the subsequent analysis do not). Alternatively, one may also introduce a kernel satisfying a renormalized BS equation, but this is not useful for present purposes.

Starting from the above discontinuity formula [1], one shows in turn that G is indeed “2PI” in the analytic sense:

$G_+ = G_-$ [3]

in the low-energy region. More precisely, G is analytic or meromorphic (with poles that may arise from Fredholm theory) in a domain that includes the two-particle threshold $s = (2m)^2$, in contrast to F itself.

The proof of [3] is based on the following relation, which is independent of M (the M dependence is thus left implicit):

$\circ_+ - \circ_- = \star$ [4]

(which is a nontrivial adaptation of the decomposition of a mass-shell delta function as a sum of plus and minus $i\varepsilon$ poles). A simple algebraic argument then shows essentially the equivalence between the discontinuity formulas [1] and [3].

In turn, assuming that G has no poles, this analyticity allows one to recover the two-sheetedness (d even) or multisheetedness (d odd, singularity in 1/log) of F, in view of the BS type equation.

N = 6, 3–3 Process in the Low-Energy Region (Even Theory)

The result, in the neighborhood of the 3–3 physical region, is here a “structure equation” expressing the 3–3 function F in the low-energy region as a sum of “à la Feynman contributions” associated with graphs with one internal line and with triangle graphs, with two-point functions on internal lines and four-point functions at each vertex, plus a remainder R. The latter is shown to be a boundary value from $+i\varepsilon$ directions Im s positive, where $s = (p_1 + p_2 + p_3)^2$, with $p_1$, $p_2$, $p_3$ denoting the energy–momentum vectors of the initial particles. Further regularity conditions are needed to recover its local physical region analyticity. The various explicit contributions that we have just mentioned yield the actual physical region Landau singularities expected in the low-energy 3–3 physical region.

A more refined result, in the approach based on irreducible kernels outlined below, applies in a larger region and then includes further à la Feynman contributions associated with 2-loop and 3-loop diagrams (the latter do not contribute to “effective” singularities in the neighborhood of the physical region).

The first result can be established from discontinuity formulas for the three-point function around two-particle thresholds, arising from AC, and “microsupport” analysis of all terms involved. In the approach based on irreducible kernels, it is useful to introduce in particular a 3PI kernel $G_3$ that, in contrast to the 3–3 function, will be analytic or meromorphic in a domain including the three-particle threshold. To that purpose, an adequate set of integral equations is introduced and the three-particle irreducibility of $G_3$ in “the analytic sense” is then established. In turn it provides the complete structure equation mentioned above.

More General Analysis

There are so far only preliminary steps in more general situations, in view of (difficult) technical problems involved and the need of ad hoc regularity assumptions at each stage. As already mentioned, the approach based on irreducible kernels seems best adapted. The analysis should clearly involve more general irreducible kernels with various irreducibility properties with respect to various channels (and not only with respect to the basic channel considered such as the 3–3 channel in the case above). From a heuristic viewpoint, one may first consider to that purpose adequate formal expansions into (infinite) sums of “à la Feynman contributions” adapted to the energy regions under investigation. These à la Feynman contributions will involve adequate irreducible kernels in the graphical sense at each vertex, and the above expansions correspond formally to the best possible regroupings of Feynman integrals with respect to the energy region considered. From such expansions, one might
determine adequate sets of integral equations allowing one, together with regularity assumptions, to carry out an analysis similar to above.

The Models

A Euclidean field-theoretical model can be defined by a probability measure $d\mu(\varphi)$ on the space of tempered distributions $\varphi$ in Euclidean spacetime, whose moments verify the Osterwalder–Schrader (or similar) axioms. The moments of $d\mu$ are, for each N, the Euclidean (Schwinger) N-point functions:

$S(x_1, \ldots, x_N) = \int \varphi(x_1) \cdots \varphi(x_N)\, d\mu(\varphi)$ [5]

In what follows, the measure $d\mu$ will be a perturbed Gaussian measure which, for the massive $\varphi^4$ model with a volume cutoff $\Lambda$ and an ultraviolet cutoff $\rho$, is given in d dimensions by

$d\mu_{\Lambda,\rho} = e^{-\lambda(\rho) \int_\Lambda \varphi^4(z)\,dz + a(\rho) \int_\Lambda \varphi^2(z)\,dz}\; d\mu_\rho(\varphi) / Z_{\Lambda,\rho}$, [6]

where $Z_{\Lambda,\rho}$ is the normalization factor and where $d\mu_\rho(\varphi)$ is the Gaussian measure of mean zero ($\int \varphi\, d\mu_\rho = 0$) and covariance

$C(x - y; \rho) = \int d^d p\; e^{ip(x-y)}\, e^{-p^2/\rho^2} / (\zeta(\rho) p^2 + m^2)$

where by convention m is called the bare mass.

For d = 2 or 3 one can show that, for $\lambda(\rho) = \lambda$ small enough (depending on m) and $\zeta(\rho) = 1$, there exists a function $a(\rho)$ ($a(\rho) = O(\lambda)$ as $\lambda \to 0$) such that, for any set of N distinct points, the function $S(x_1, \ldots, x_N) = \lim_{\Lambda,\rho \to \infty} S_{\Lambda,\rho}(x_1, \ldots, x_N)$ exists, is not Gaussian (hence does not correspond to a trivial, free theory), and satisfies the Osterwalder–Schrader axioms. The connected part $S(x_1, \ldots, x_N)_{\rm connected}$ has the following perturbative series:

$\lim_{\Lambda,\rho \to \infty} \sum_n \frac{(-1)^n}{n!} \int \varphi(x_1) \cdots \varphi(x_N) \left[ \int_\Lambda \left( \lambda\varphi^4 - a(\rho)\varphi^2 \right)(z)\, dz \right]^n d\mu_\rho(\varphi)_{\rm connected}$ [7]

which is the (divergent) sum of the connected renormalized (Euclidean) Feynman graphs.

The study of the perturbative series leads to the distinction of:

1. the super-renormalizable theories, where it is possible to take $\lambda(\rho)$, $\zeta(\rho)$ not depending on $\rho$. In dimension 2, all the models where $\varphi^4$ is replaced by

$c_{2p}\varphi^{2p} + c_{2p-1}\varphi^{2p-1} + \cdots + c_5\varphi^5 + \lambda\varphi^4 + c_3\varphi^3$ [8]

also exist provided that $c_{2p} > 0$ is small enough depending on m and on the other coefficients c and $\lambda$, and

2. the just renormalizable theories where $\lambda(\rho)$ (and possibly $\zeta(\rho)$) depend in general on $\rho$. In models mentioned below $\lambda(\rho) \to 0$ as $\rho \to \infty$; this characterizes “asymptotic freedom.”

The proof of the existence of the N-point functions makes use of Taylor type expansions with remainder. The first orders are used to compute $\lambda(\rho)$, $\zeta(\rho)$, $a(\rho)$. The idea is to consider the functional integral [5] – at $\Lambda$, $\rho$ finite – as an integral over roughly $\Lambda\rho^d$ “degrees of freedom” which are weakly coupled. This corresponds to a decomposition of the phase space (with cutoff both in x-space (the box $\Lambda$) and in p-space (roughly $|p| < \rho$)). The coupling between different regions in x-space comes from the propagators C; the coupling between different frequencies in p-space comes from the $\varphi^4$ term (the interaction vertex). The expansion is then, for each degree of freedom, a finite expansion in the coupling between this degree and the others so that, even if the expansion is perturbative up to the order $\Lambda\rho^d$, the bound on each term is qualitatively the one on a product of $\Lambda\rho^d$ finite order-independent expansions, the order of which can be fixed uniformly in $\rho$ (and depending only on $\lambda$). To achieve this program, the propagator linking two points of distance of order L must have a decrease of order $e^{-L^{-1}|x-y|}$, that is, have momentum larger than $L^{-1}$, so that one must localize both in x-space and p-space; for example, the smallest cells of phase space correspond to fields $\varphi$ localized in x, p-spaces, the x-boxes being of side $\rho^{-1}$ and the p-localization consisting of values such that roughly $(\rho/2) \leq |p| \leq \rho$. More generally, a generic cell (of index i) corresponds to fields $\varphi$ at point x and momentum p, with x in a box of side $2^{-i}$ and $2^{i-1} < |p| < 2^i$.

These expansions are mimicking the à la Wilson renormalization group. For just renormalizable theories (where $\lambda(\rho)$ depends on $\rho$), one is led to introduce the effective coupling constant $\lambda(2^i)$ whose perturbative expansion is the value at momentum zero of the sum of all the (connected, amputated) four-point functions containing only propagators of momentum (roughly) bigger than $2^i$ (plus $\lambda(\rho)$ which in fact tends to zero as $\rho \to \infty$).

Then by small coupling we mean a theory where $\lambda(2^i)/\zeta(2^i)^2$ is small for all i. By convention we write $\lambda_{\rm ren}$, $\zeta_{\rm ren}$, $a_{\rm ren}$ for the effective parameters of the theory at zero momentum.

The expansion obtained expresses $S_{\rm connected}$ as a sum of terms each of them being associated to a
given set of phase-space cells which are “connected” together by “links” that are either propagators or vertices. Each term decreases exponentially with the difference $i_{\max} - i_{\min}$ of the upper and lower indices of the phase-space cells involved. Moreover, each set must contain the cells associated to the fields $\varphi(x_1) \cdots \varphi(x_N)$ whose indices are fixed by the order of magnitude of the distances between the points.

On the other hand, the difference between the theory of cutoff $\rho$ and the one of cutoff $2\rho$ consists of terms containing at least one cell of momentum of order $\rho$; these terms are thus small like $\mathrm{cst}(x_1, \ldots, x_N)\, e^{-(\mathrm{cst})\rho}$, so that the limit as $\rho \to \infty$ exists.

So far, the “construction” of models is possible only at small coupling, apart from special cases. The $\varphi^4$ theory in dimension 4 is just renormalizable (from the perturbative viewpoint) but the above condition of small coupling cannot be achieved (and it is generally believed that this model cannot be defined as a nontrivial theory). A just renormalizable model has been shown to exist, namely the Gross–Neveu model, which is a fermionic theory in dimension 2. The elementary particle physics models are just renormalizable but their construction has not been completed so far (in particular in view of the confinement problem). See Constructive Quantum Field Theory for details.

To state the result in a form convenient for our purposes here, we introduce a splitting of the covariance in two parts:

$C(x - y; \rho) = C_M(x - y; \rho) + C_{>M}(x - y; \rho)$, $M > m$

$\tilde C_M(p; \rho) = \dfrac{e^{-p^2/\rho^2}}{p^2 + m^2} - \dfrac{e^{-p^2/\rho^2}}{p^2 + M^2}$

so that $C_M(x - y)$ behaves like C at large distances but has an ultraviolet cutoff of size M, and $|C_{>M}(x - y)| \lesssim e^{-M|x-y|}$ decreases exponentially depending on the (technical) choice of M. Let $d\mu_M(\varphi)$ be the Gaussian measure of covariance $C_M$.

One divides also $\Lambda$ in unit cubes and obtains for the connected N-point function an expansion as a sum over connected trees; a tree T is composed of lines $\ell$ and vertices v; each line joins two vertices or one of the external points $x_1, \ldots, x_N$ and a vertex; moreover, there are no loops.

To each line $\ell$ is associated a propagator $C_M(z_\ell, z'_\ell) = C_M(\ell)$.

To each vertex v are associated:

1. two subsets $I_v$, $I'_v$ of $\{\ell\}$,
2. a connected set $X_v$ of unit cubes such that all the $z_\ell$, $\ell \in I_v$ and all the $z'_\ell$, $\ell \in I'_v$ are contained in $X_v$; $|X_v|$ is the volume of $X_v$, and
3. a kernel $K_{X_v}(\{z, z'\}_v; \varphi)$

Finally, the external points are by convention $z_\ell$ points; then:

$S_{\Lambda,\rho}(x_1, \ldots, x_N)_{\rm connected} = \int d\mu_M(\varphi) \sum_T \frac{1}{|T|!} \sum_{\{X_v\}\ \text{nonoverlapping}} \int \prod_{\ell,\ z_\ell\ \text{not external}} dz_\ell\, dz'_\ell \prod_{\ell \in T} C_M(\ell) \prod_{v \in T} K_{X_v}(\{z, z'\}_v; \varphi)$ [9]

where for coupling small enough:

$\int d\mu_M(\varphi) \prod_{v \in T} |K_{X_v}(\{z, z'\}_v; \varphi)| \leq \prod_{v \in T} e^{-M(1-\varepsilon)|X_v|}$ [10]

The X's are 2 × 2 (that is, pairwise) nonoverlapping; however, it will suffice to sum over all X's (without restriction) to get a bound showing the convergence of the expansion as $\Lambda \to \infty$. In this formula the $K(\cdot\,, \varphi)$'s are still coupled by the measure $d\mu_M(\varphi)$; all the nonperturbativity is hidden in the K's (in particular the contribution of momentum bigger than M).

As a consequence of [9] and if $a(\lambda, \rho)$ has been chosen such that $a_{\rm ren} = 0$, for M large enough and at small coupling (depending on M, m):

$|S(x, y)_{\rm connected}| \leq |C_M(x - y)| + \int dz'_1\, dz'_2\, |C_M(x - z'_1)\, e^{-M(1-\varepsilon)|z'_1 - z'_2|}\, C_M(z'_2 - y)| + \cdots \leq (\mathrm{cst})\, e^{-m(1-\varepsilon)|x-y|}$ [11]

More generally, the connected N-point function satisfies

$|S(x_1, \ldots, x_N)_{\rm connected}| \leq \mathrm{cst}\; e^{-m(1-\varepsilon)\, d(x_1, \ldots, x_N)}$ [12]

where $d(x_1, \ldots, x_N)$ is the length of the smallest tree joining $x_1, \ldots, x_N$, with possibly intermediate points.

The Irreducible Kernels

The 1PI Kernel and a Lippmann–Schwinger Equation

To then show that a theory – if the perturbation series heuristically shows it – contains only one particle of mass smaller than $2m(1 - \varepsilon)$, it is necessary to expand further the coupling between the K's in [9]. Each perturbative step relatively to this coupling will generate a sum of terms such that in each one there is a “new” propagator $C_M$ between two K's.

The fact that in [9] the X's are nonoverlapping has the consequence that an expansion where for each pair of $K_X$ the number of propagators $C_M$
remains bounded (say by n + 1) is convergent (for small enough couplings depending on m, n); this is because, for a given X, the others must be farther and farther as their number increases, and in view of the exponential decrease (in x-space) of $C_M$.

We then consider the expansion where we have further expanded the two-point function S(x, y) such that each term can be decomposed in the channel x → y in $C_M$ propagators and 1PI contributions (in the sense that any line cutting such a 1PI contribution (and outside the X's) cuts at least two propagators); that means that these 1PI contributions are no longer coupled by the $d\mu_M(\varphi)$ measure. They are made of propagators and of $K_X$ which still have nonoverlapping restrictions; the latter are straightforwardly expanded using a kind of (convergent) Mayer expansion; the result is finally a Lippmann–Schwinger type equation:

$S(x, y)_{\rm connected} = C_M(x - y) + \int dz_1\, dz_2\; C_M(x - z_1)\, G_1(z_1, z_2)\, C_M(z_2 - y) + \cdots$ [13]

or

$S(x, y)_{\rm connected} = \left[ \sum_{p \geq 0} C_M [G_1 C_M]^p \right](x, y)$

which is equivalent to

$S_{\rm connected} = C_M + C_M G_1 C_M + C_M G_1 S_{\rm connected}$ [14]

where $G_1$ is a 1PI kernel that satisfies the bound

$|G_1(t, u)| \leq \lambda_{\rm ren}\, e^{-2m(1-\varepsilon)|t-u|}$ [15]

In Fourier transform, eqn [14] becomes

$F(p) = \tilde C_M(p) + \tilde C_M(p)\, \tilde G_1(p)\, \tilde C_M(p) + \tilde C_M(p)\, \tilde G_1(p)\, F(p)$ [16]

Denoting by $\delta(p + q) F(p)$ the Fourier transform of $S(x, y)_{\rm connected}$, we can then compute F(p):

$F(p) = \dfrac{(p^2 + m^2)\,[\tilde C_M + \tilde C_M \tilde G_1(p) \tilde C_M]}{(p^2 + m^2) - (p^2 + m^2)\, \tilde C_M \tilde G_1(p)}$ [17]

where $(p^2 + m^2)\, \tilde C_M(p) \to (1 - m^2/M^2)$ as $p \to 0$ and $|\tilde G_1(p)| \leq \lambda_{\rm ren}\, \mathrm{cst}(m)$ so that (as expected) F has no pole in the Euclidean region at small coupling; but, as will be seen in the next section, it has a pole outside the Euclidean region.

outgoing $x_{p+1}, \ldots, x_N$ points, this defines a channel. One then obtains nPI kernels (in the given channel). In the same way as above, one obtains a relevant structure equation; this equation makes sense only if the kernels $K_X$ have a decrease corresponding to n-particle irreducibility; to that purpose we take M > nm. The expansion converges for couplings small enough depending on m and n.

In the case n = 2 this gives a kind of BS equation (the Lippmann–Schwinger equation corresponding to the case n = 1); if we restrict, for simplicity, the analysis to even theories one is led to jump directly to the case n = 3:

$S(x_1, x_2, x_3, x_4)_{\rm connected} = \int dz_1\, dt_1\, dz_2\, dt_2\; (\circ_M)(x_1, x_2, z_1, t_1)\, G_2(z_1, t_1, z_2, t_2)\, (\circ_M)(z_2, t_2, x_3, x_4) + \cdots$ [18]

or

$S = \sum_{p \geq 1} \circ_M [G_2 \circ_M]^p$

or

$S = \circ_M G_2 \circ_M + \circ_M G_2 S$ [19]

where

$(\circ_M)(x_1, x_2, x_3, x_4) = S(x_1, x_3) S(x_2, x_4) + S(x_1, x_4) S(x_2, x_3)$

and where

$|G_2(t_1, t_2, u_1, u_2)| \leq \lambda_{\rm ren} \exp\{ -4m(1 - \varepsilon) \max_{i,j} (|t_i - u_j|) \}$ [20]

Equation [19] once amputated, and after Fourier transformation, is eqn [2].

More General Irreducible Kernels and Structure Equations

Irreducible kernels with various degrees of irreducibility in various channels can be defined in a similar way. Corresponding expansions of N-point functions follow, in terms of integrals involving these kernels and two-point functions. These kernels are again convergent at small coupling (→ 0 as their irreducibility → ∞) as well as the corresponding structure equations (which generalize eqn [18]).
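The statement that F has no pole in the Euclidean region at small coupling, but does have one outside it, can be illustrated with a toy version of the denominator of [17]. The sketch below is not the actual model: it assumes a constant kernel $\tilde G_1(p) = g$ (a hypothetical stand-in for the true 1PI kernel), takes the ultraviolet cutoff to infinity so that $\tilde C_M(p) = 1/(p^2 + m^2) - 1/(p^2 + M^2)$, and uses illustrative function names. The zero of $1 - \tilde C_M \tilde G_1$ then solves a quadratic in $p^2$.

```python
import math


def c_tilde_M(p2, m, M):
    """Fourier transform of C_M with the ultraviolet cutoff removed (assumption)."""
    return 1.0 / (p2 + m * m) - 1.0 / (p2 + M * M)


def denominator(p2, m, M, g):
    """Denominator of [17] divided by (p^2 + m^2), for a constant kernel G_1 = g."""
    return 1.0 - c_tilde_M(p2, m, M) * g


def physical_mass_squared(m, M, g):
    """Return s = -p^2 at the zero of the denominator closest to s = m^2.

    The zero solves (x + m^2)(x + M^2) = g (M^2 - m^2) with x = p^2.
    """
    b = m * m + M * M
    c = m * m * M * M - g * (M * M - m * m)
    x = (-b + math.sqrt(b * b - 4.0 * c)) / 2.0
    return -x


m, M, g = 1.0, 3.0, 0.05          # illustrative values only
s_pole = physical_mass_squared(m, M, g)

# Euclidean region p^2 >= 0: the denominator stays away from zero.
euclidean_values = [denominator(0.1 * k, m, M, g) for k in range(200)]
```

With these illustrative numbers the zero sits at a negative value of $p^2$, that is at a positive s slightly shifted from $m^2$, while the denominator stays bounded away from zero for $p^2 \geq 0$. The shift is of the order of g, which mirrors the statement that the physical mass is close to the bare mass at small coupling.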
First, it is easily seen that the two-point function is analytic in the region $s < (2m)^2$ apart from a pole at $s = m_{\rm ph}^2$ which defines the physical mass $m_{\rm ph}$ ($m_{\rm ph}^2$ is the zero in $p^2$ of the denominator in formula [17]). In view of the bounds of the previous two sections, $m_{\rm ph}$ is close to the “bare” mass m.

The 2PI kernel, for even theories, is shown, again by the Laplace transform theorem, to be analytic and bounded in domains around and away from the Euclidean region up to $s = (4m)^2$, and is of the order of $\lambda_{\rm ren}$.

As we have seen in the section “AC and analyticity,” the analyticity of $G_2$ entails the analytic structure of F (two-sheeted or multisheeted at the threshold). On the other hand, further poles of F can be generated by the BS integral equation [2] in the physical or unphysical sheets. If a pole in the physical sheet occurs at $s < (2m_{\rm ph})^2$ real, it will correspond to a new particle in the theory, namely a two-particle bound state.

AC in the Low-Energy Region

The analysis of possible bound states, which will be presented in the following, will show that there might be at most one two-particle bound state of mass $m_B < 2m_{\rm ph}$ which tends to $2m_{\rm ph}$ as the coupling tends to zero.

On the other hand, for even theories, in view of the analyticity properties of the two-point function and of the 2PI kernel $G_2$, equation [1] holds in the region $(2m_{\rm ph})^2 < s < (4m_{\rm ph})^2$, where $\star$ is on-shell convolution with particles of mass $m_{\rm ph}$.

If there is no two-particle bound state, this characterizes the AC of the theory for $s < (4m_{\rm ph})^2$. If there is a bound state of mass $m_B$, AC is established only in the region $s < (3m_{\rm ph})^2$.

For non-even theories, the analysis is similar but requires the introduction of new irreducible kernels in view of the fact that the non-evenness opens new channels. AC in all cases can be established, for small couplings, up to $s < (3m_{\rm ph})^2$.

Analysis of Possible Two-Particle Bound States for Even Theories at Small Coupling

It can be checked that such poles of F, if there are any, either lie far away in the unphysical sheet(s) or are close to the two-particle threshold ($s = (2m_{\rm ph})^2$). This is due to the convergence, at small coupling, of the Neumann series $F = G_2 + G_2 \circ_M G_2 + \cdots$. Individual terms $G_2 \circ_M \cdots \circ_M G_2$ are, in fact, defined away from the Euclidean region by analytic continuation in a two-sheeted (d even) or multisheeted (d odd) domain around the threshold: to that purpose locally distorted integration contours (initially the Euclidean region) are introduced as in the section “AC and analyticity,” so as to avoid the pole singularities of the two-point functions involved in $\circ_M$, the threshold singularities being due to the pinching of this contour between the two poles as $s \to (2m_{\rm ph})^2$. If a fixed neighborhood of the threshold is excluded, one does obtain uniform bounds of the form $(\mathrm{cst}\, \lambda_{\rm ren})^q$ (for a term with q factors $G_2$) in any bounded domain, which ensures the convergence of the Neumann series.

It remains to study the neighborhood of the threshold. To that purpose, the following method is convenient. One shows that the convolution operator $\circ_M$ can be written in the form

$\circ_M = g(s)\, \star + \nabla$ [21]

where $\star$ is, as in the section “AC and analyticity,” on-shell convolution for $s > (2m_{\rm ph})^2$ or is obtained by analytic continuation for complex values of s around the threshold; $g(s) = \sigma^{1/2}$ for d even and, if d is odd, $g(s) = (i/2\pi) \log \sigma$, where $\sigma = 4m_{\rm ph}^2 - s$. In view of this definition of g(s), the operator $\nabla$ is regular: it is an analytic one-sheeted operation around the threshold (this is equivalent to [4]), and it has no pole singularities. This property of $\nabla$ can be established by geometric methods or by an explicit evaluation.

It is then useful to introduce a new kernel U linked to $G_2$ by the integral equation

$U = G_2 + U \nabla G_2$ [22]

In view of the regularity and bounds of $\nabla$ and $G_2$, one sees (e.g., by a series expansion) that U, like $G_2$, is analytic in a neighborhood of the threshold and behaves in the same way at small $\lambda_{\rm ren}$.

By a simple algebraic argument F and U are related by the integral equations

$F = U + g(s)\, U \star F$
$F = U + g(s)\, F \star U$ [23]

Two-dimensional models We start the analysis with the case d = 2. The mass-shell is trivial in this case; let f be the restriction of F to the mass-shell; it depends only on $s = (p_3 + p_4)^2$ due to the mass-shell and e.m.c. constraints (as also Lorentz invariance). On the mass-shell, the operation $\star$ becomes a mere multiplication and the integral equation [23] becomes

$f(s) = u(s) + \dfrac{1}{a(s)}\, f(s)\, u(s)$ [24]

where u is the mass-shell restriction of U and the factor a(s) arising from $\star$ is of the form $a(s) = \mathrm{cst}\; s^{1/2} \sigma^{1/2}$, $\sigma = (2m_{\rm ph})^2 - s$, which gives

$f(s) = \dfrac{a(s)\, u(s)}{a(s) - u(s)}$ [25]
Iagolnitzer D (1992) Scattering in Quantum Field Theories: The Axiomatic and Constructive Approaches, Princeton Series in Physics. Princeton University Press.
Iagolnitzer D and Magnen J (1987a) Asymptotic completeness and multiparticle structure in field theories. Communications in Mathematical Physics 110: 51.
Iagolnitzer D and Magnen J (1987b) Asymptotic completeness and multiparticle structure in field theories. II. Theories with renormalization: the Gross–Neveu model. Communications in Mathematical Physics 111: 89.
Martin A (1970) Scattering Theory: Unitarity, Analyticity and Crossing. Heidelberg: Springer.
Rivasseau V (1991) From Perturbative to Constructive Renormalization. Princeton: Princeton University Press.
Schrödinger Operators
V Bach, Johannes Gutenberg-Universität, Mainz, Germany
© 2006 Elsevier Ltd. All rights reserved.
Schrödinger operators are linear partial differential operators of the form
$$H_V = -\Delta + V(x) \qquad [1]$$
acting on a suitable dense domain $\mathrm{dom}(H_V) \subseteq L^2(\Lambda)$ in the Hilbert space of square-integrable functions on a spatial domain $\Lambda \subseteq \mathbb{R}^d$, where $d \in \mathbb{N}$. Here, $H_0 = -\Delta = -\sum_{\nu=1}^{d} \partial^2/\partial x_\nu^2$ is (minus) the Laplacian on $\Lambda$, and the potential $V : \Lambda \to \mathbb{R}$ acts as a multiplication operator, $[V\psi](x) := V(x)\psi(x)$.
Historical Origin and Relation to Theoretical Physics
In 1926, Schrödinger formulated quantum theory as wave mechanics and proved later that it is equivalent to Heisenberg's matrix mechanics. He proposed that the state of a physical system at time $t \in \mathbb{R}$ is given by a normalized wave function $\psi_t \in L^2(\Lambda)$ whose dynamics is determined by a linear Cauchy problem: $\psi_0$ is the state at time $t = 0$, and for $t > 0$, it evolves according to
$$i\,\frac{\partial \psi_t}{\partial t} = H \psi_t \qquad [2]$$
the Schrödinger equation. More generally, $\psi_0$ is a normalized element of a Hilbert space $\mathcal{H}$, and the Hamiltonian $H_V$ is a self-adjoint operator, that is, $\mathrm{dom}(H_V) = \mathrm{dom}(H_V^*) \subseteq \mathcal{H}$ and $H_V = H_V^*$ on $\mathrm{dom}(H_V)$. Formally, eqn [2] is solved by the evolution operator or propagator $\exp(-itH_V)$ in the form $\psi_t = \exp(-itH_V)\psi_0$. The self-adjointness of $H_V$ ensures the existence and unitarity of the propagator $\exp(-itH_V)$, for all $t \in \mathbb{R}$, so $\|\psi_t\| = \|\psi_0\| = 1$. For physics, this unitarity is crucial, because $\|\psi_t\|^2$ is interpreted as the total probability of the system to be at time $t$ in some state in $\mathcal{H}$. The general validity of eqn [2] as the fundamental dynamical law of all physical theories, including, for example, nonrelativistic and (special) relativistic quantum mechanics, quantum field theory, and string theory, deserves appreciation.
If the physical system under consideration is a nonrelativistic point particle of mass $m > 0$ in a potential $\tilde V : \mathbb{R}^d \to \mathbb{R}$, then, according to the principles of classical (Newtonian) mechanics, its state is determined by its momentum $p \in \mathbb{R}^d$ and its position $x \in \mathbb{R}^d$, its kinetic energy is $(1/2m)p^2$, its potential energy is $\tilde V(x)$, and the dynamics is given by the Hamiltonian flow generated by the Hamiltonian function $H_{\mathrm{class}}(p, x) = (1/2m)p^2 + \tilde V(x)$.
Schrödinger derived the Hamiltonian (operator) $H = -(\hbar^2/2m)\Delta + \tilde V(x)$ in [2] from the replacement of the momentum $p \in \mathbb{R}^d$ by the momentum operator $-i\hbar\nabla_x$. This prescription is called quantization and is further discussed in the section "Quantization and semiclassical limit." The Schrödinger operator $H_V$ in [1] is then obtained after an additional unitary rescaling, $\psi(x) \mapsto \lambda^{d/2}\psi(\lambda x)$, by $\lambda := \hbar(2m)^{-1/2}$, and a redefinition $V(x) := \tilde V(\lambda x)$ of the potential.
For more details, we refer the reader to Schrödinger (1926) and Messiah (1962).
Self-Adjointness
Led by the requirement of unitarity of the propagator, the domain $\mathrm{dom}(H_V)$ in [1] is usually chosen such that $H_V$ is self-adjoint, which, in turn, is most often established by means of the Kato–Rellich perturbation theory, briefly described below. If $V \equiv 0$, then $H_0$ equals the Laplacian $-\Delta$, which is a positive self-adjoint operator, provided $\mathrm{dom}(H_0) = W^2_{\mathrm{b.c.}}(\Lambda)$ is the second Sobolev space with suitable conditions on the boundary $\partial\Lambda$ of $\Lambda$. Typical examples are $\mathrm{dom}(H_0) = W^2(\mathbb{R}^d)$, for $\Lambda = \mathbb{R}^d$, and $W^2_{\mathrm{Dir}}(\Lambda)$ and $W^2_{\mathrm{Neu}}(\Lambda)$ with Dirichlet or Neumann boundary conditions on $\partial\Lambda$, respectively, in case that $\Lambda$ is a bounded, open domain in
$\mathbb{R}^d$ with smooth boundary $\partial\Lambda$. Starting from this situation, $V$ is required to be relatively $H_0$-bounded, that is, that $M(V, r) := V(-\Delta + r\mathbf{1})^{-1}$ defines (extends to) a bounded operator on $L^2(\Lambda)$, for any $r > 0$. If $\lim_{r\to\infty}\|M(V, r)\| < 1$, then $H_V$ is self-adjoint on $\mathrm{dom}(H_0)$ and semibounded, that is, the infimum $\inf\sigma(H_V)$ of its spectrum $\sigma(H_V)$ is finite; in other words, $H_V \ge c\,\mathbf{1}$, for some $c \in \mathbb{R}$, as a quadratic form. (The semiboundedness corresponds to quasidissipativity, as a generator of the semigroup $\exp(-\beta H_V)$.)
A fairly large class of potentials fulfilling these requirements is defined by
$$\lim_{\varepsilon \searrow 0} \Big\{ \sup_{x \in \Lambda} \int_{|x-y| \le \varepsilon} |x-y|^{4-d}\, V(y)^2\, d^dy \Big\} = 0 \qquad [3]$$
for $d \ne 4$, and with $|x-y|^{4-d}$ replaced by $\ln(|x-y|^{-1})$, for $d = 4$. For $d \le 3$, [3] is equivalent to the uniform local square integrability of $V$, that is, $\sup_{x \in \Lambda} \int_{|x-y| \le 1} V(y)^2\, d^dy < \infty$. Note that [3] allows for local singularities of $V$, provided they are not too severe; in this respect, quantum mechanics is more general than classical mechanics. Equation [3] is a sufficient condition for $H_V = -\Delta + V$ to be self-adjoint on $\mathrm{dom}(-\Delta)$ because $\lim_{r\to\infty}\|M(V, r)\| = 0$. Moreover, as eqn [3] only misses some borderline cases, it is also almost necessary for the self-adjointness of $H_V$. By means of Kato's inequality, the conditions on $V$, especially on its positive part $V_+ := \max\{V, 0\}$, can be further relaxed. Also, if one realizes $H_V$ as the Friedrichs extension of a semibounded quadratic form, the conditions to impose on $V$ are milder. One possibly loses, however, control over the operator domain $\mathrm{dom}(H_V)$, and typically $\mathrm{dom}(-\Delta)$ is only a core for $H_V$.
For further details on self-adjointness, we refer the reader to Reed and Simon (1980a, b), Kato (1976), and Cycon et al. (1987).
Spectral Analysis
The self-adjointness of $H_V$ establishes a functional calculus, generalizing the notion of diagonalizability of finite-dimensional self-adjoint matrices: there exists a unitary transformation $W : L^2(\Lambda) \to L^2(\sigma(H_V), d\mu_{H_V})$ such that $H_V$ acts on elements $\varphi$ of $L^2(\sigma(H_V), d\mu_{H_V})$ as a multiplication operator, $[H_V\varphi](\omega) = \omega\varphi(\omega)$. The spectral measure $\mu_{H_V}$ decomposes into an absolutely continuous (ac) part $\mu_{H_V,\mathrm{ac}}$, a pure point (pp) part $\mu_{H_V,\mathrm{pp}}$, and a singular continuous (sc) part $\mu_{H_V,\mathrm{sc}}$, mutually disjointly supported on the ac spectrum $\sigma_{\mathrm{ac}}(H_V)$, the pp spectrum $\sigma_{\mathrm{pp}}(H_V)$, and the sc spectrum $\sigma_{\mathrm{sc}}(H_V) \subseteq \mathbb{R}$, respectively, whose union is the spectrum $\sigma(H_V)$ of $H_V$. There is an additional decomposition of the spectrum of $H_V$ into the discrete spectrum $\sigma_{\mathrm{disc}}(H_V)$, which consists of all isolated eigenvalues of $H_V$ of finite multiplicity, and its complement $\sigma_{\mathrm{ess}}(H_V) = \sigma(H_V) \setminus \sigma_{\mathrm{disc}}(H_V)$, the essential spectrum of $H_V$, as its residual spectrum is void. One of the main goals of the spectral analysis is to determine the spectral measure for a given potential $V$ as precisely as possible.
In many applications, $\Lambda = \mathbb{R}^d$ and the potential $V$ in $H_V$ is not only relatively $H_0$-bounded, but even relatively $H_0$-compact, that is, $M(V, 1)$ is compact. In this case, $\lim_{r\to\infty}\|M(V, r)\| = 0$, ensuring self-adjointness on $\mathrm{dom}(H_0)$ and semiboundedness of $H_V$. Moreover, a theorem of Weyl implies that its essential spectrum agrees with the one of $H_0$, that is, with the positive half-axis $\mathbb{R}_0^+$, and the discrete spectrum is contained in the negative half-axis $\mathbb{R}^-$. If, furthermore, $(H_0 + 1)^{-1}[x \cdot \nabla V(x)](H_0 + 1)^{-1}$ is compact, then the essential spectrum on the positive half-axis is purely absolutely continuous, $\sigma_{\mathrm{ess}}(H_V) \cap \mathbb{R}^+ = \sigma_{\mathrm{ac}}(H_V) \cap \mathbb{R}^+$, and hence $\sigma_{\mathrm{disc}}(H_V) \subseteq \sigma_{\mathrm{pp}}(H_V) \subseteq \sigma_{\mathrm{disc}}(H_V) \cup \{0\}$; the singular continuous spectrum is void.
We remark that the absence of singular continuous spectrum should not be taken for granted. Indeed, it is possible to explicitly construct potentials $V$ such that $H(V)$ has singular continuous spectrum. In terms of the Baire category, singular continuous spectrum is even typical. The appearance of singular continuous spectrum can, perhaps, be more easily understood in terms of the dynamical properties of $\exp[-itH_V]$, rather than the spectral analysis of its generator $H_V$: singular continuous spectrum occurs when initially localized states are not bound states, but move out to infinity very slowly.
The reader is referred to Simon (2000), Reed and Simon (1980a, b) and Cycon et al. (1987) for further detail.
Properties of Eigenfunctions
Let us assume $\Lambda = \mathbb{R}^d$, that $V \le 0$ is nonpositive, fulfills [3], and that $\lim_{|x|\to\infty} V(x) = 0$. From the statements in the last section we conclude that $H_V = -\Delta + V(x)$ is semibounded, that the essential spectrum is the positive half-axis and that all eigenvalues are negative and of finite multiplicity, possibly accumulating only at 0. We collect some properties of the eigenfunctions $\psi_j \in L^2(\mathbb{R}^d)$ with corresponding eigenvalue $e_j < 0$, that is, $H_V\psi_j = e_j\psi_j$. The smallest eigenvalue $e_0 := \inf\sigma(H_V)$ (coinciding with the bottom of the spectrum) is simple, and the corresponding eigenfunction $\psi_0(x) > 0$ is strictly positive a.e. Elliptic regularity implies that at a given point $x \in \mathbb{R}^d$, the eigenfunction $\psi_j$ is almost $2 - d/2$ degrees more regular than $V$. For example,
if $V \in C^k[B_{2\varepsilon}(x)]$, for some $\varepsilon > 0$, then $\psi_j \in C^{k+\ell}[B_\varepsilon(x)]$, for all $\ell < 2 - d/2$. Agmon estimates (originally obtained by S’nol and also known in mathematical physics as Combes–Thomas argument) furthermore show that, for unbounded $\Lambda$, the eigenfunction $\psi_j$ decays exponentially: $|\psi_j(x)| \le C e^{-\alpha|x|}$, for any $0 < \alpha < \sqrt{-e_j}$.
For more details, see Reed and Simon (1978, 1980a, b) and Cycon et al. (1987).
One Dimension and Sturm–Liouville Theory
For $d = 1$, the stationary Schrödinger equation reduces to a second-order ordinary differential equation known as a Sturm–Liouville problem,
$$-\psi''(x) + V(x)\psi(x) = E\psi(x) \qquad [4]$$
on $L^2([a, b])$, with $V \in L^1([a, b])$ and independent boundary conditions at $-\infty \le a < b \le \infty$, say. Equation [4] admits an almost explicit solution by means of the Prüfer transformation defined by $\varphi(x) := \arctan[\psi(x)/\psi'(x)]$ and $R(x) := \ln\sqrt{\psi(x)^2 + \psi'(x)^2}$. The key point about the Prüfer transformation is that it effectively reduces the second-order differential equation [4] to a (nonlinear) first-order equation for $\varphi$,
$$\varphi'(x) = (E - V(x))\sin^2[\varphi(x)] + \cos^2[\varphi(x)] \qquad [5]$$
Note that [5] does not involve $R$ and that the boundary conditions on $\psi$ and $\psi'$ at $a$ and $b$ can be easily expressed in terms of $\varphi(a)$ and $\varphi(b)$. Moreover, having determined $\varphi$ on $[a, b]$ from [5], the function $R$ is immediately obtained by integrating $R'(x) = [1 + V(x) - E]\sin[\varphi(x)]\cos[\varphi(x)]$. In case of a bounded interval, $-\infty < a < b < \infty$, or a confining potential, $\lim_{x\to\pm\infty} V(x) = \infty$, it is not difficult to derive from [5] the following basic facts: the spectrum of $H(V)$ consists only of simple eigenvalues $E_0 < E_1 < E_2 < \cdots$ with $\lim_{n\to\infty} E_n = \infty$. Moreover, the corresponding eigenfunction $\psi_n \ne 0$, $n \in \mathbb{N}_0$, with $H(V)\psi_n = E_n\psi_n$, has precisely $n$ zeros, and Sturm's oscillation theorem holds.
See Amrein et al. (2005) for more details.
Quantization and Semiclassical Limit
The quantization procedure postulated by Schrödinger is the replacement of the classical momentum $p \in \mathbb{R}^d$ by the quantum-mechanical momentum operator $-i\hbar\nabla_x$. It is known (and, in fact, easy to see, cf. Messiah (1962)) that the classical Hamiltonian equations of motion are invariant under symplectic transformations, but Schrödinger's quantization procedure does not commute with symplectic changes of the classical variables. The question of the geometrically sound definition of quantization, with a general $d$-dimensional manifold replacing the spatial domain $\Lambda$, has attracted many mathematicians and has led to the mathematical fields of geometric quantization and deformation quantization.
It is remarkable, however, that Schrödinger himself discovered already in his early paper the fact that classical dynamics derives as the scaling limit $\hbar \to 0$ from quantum mechanics. The systematic study of the convergence of wave functions and of operators and their spectral properties is known as semiclassical analysis, which is nowadays considered to be part of microlocal analysis. We illustrate the type of results one obtains by the following example on $\Lambda = \mathbb{R}^d$.
Let $F \in C_0^\infty(\mathbb{R}; \mathbb{R})$ be a smooth characteristic function, compactly supported in an interval $I \subset \mathbb{R}$ away from the essential spectrum of the semiclassical Schrödinger operator $H_\hbar = -\hbar^2\Delta + V$ with a smooth potential $V \in C_0^\infty(\mathbb{R}^d)$ of compact support. We define the operator $F[H_\hbar]$ by functional calculus (note that $\sigma(H_\hbar) \cap I \subseteq \sigma_{\mathrm{disc}}(H_\hbar)$ and $F[H_\hbar]$ is of trace class). Let, furthermore, $A_\hbar = \sum_{|\alpha| \le M} a_\alpha(x)\,\partial_x^\alpha$ be a differential operator representing an observable. Then $\mathrm{tr}\{A_\hbar F[H_\hbar]\}$, which exists because the eigenfunctions of $H_\hbar$ are smooth and decay exponentially, is, up to normalization, interpreted to be the expectation of the observable $A_\hbar$ in the state represented by the spectral projection of $H_\hbar$ in $I$, approximated by $F[H_\hbar]$. Semiclassical analysis then yields an asymptotic expansion of the form
$$\mathrm{tr}\{A_\hbar F[H_\hbar]\} = \hbar^{-d}\big(c_0 + c_1\hbar + \cdots + c_n\hbar^n + o(\hbar^n)\big)$$
for arbitrarily large integers $n \in \mathbb{N}$. The leading-order coefficient $c_0$ is determined by Bohr's correspondence principle,
$$\mathrm{tr}\{A_\hbar F[H_\hbar]\} = \int_{\mathbb{R}^{2d}} a[x, p]\, F[p^2 + V(x)]\, \frac{d^dp\, d^dx}{(2\pi\hbar)^d} + o\big((2\pi\hbar)^{-d}\big) \qquad [6]$$
Semiclassical analysis thus provides the mathematical link between quantum and classical mechanics. The proof of [6] usually involves pseudodifferential and/or Fourier integral operators, depending on the method. Advanced topics in semiclassical analysis studied more recently are the construction of quasimodes, that is, wave functions $\psi_{E,\hbar,n}$ which solve the eigenvalue problem $(H_\hbar - E)\psi_{E,\hbar,n} = O(\hbar^n)$ up to errors of order $\hbar^n$, for arbitrarily large $n \in \mathbb{N}$, and the relation between semiclassical asymptotics
and the KAM (Kolmogorov–Arnold–Moser) theory from classical mechanics.
For more details, see Dimassi and Sjöstrand (1999), and Robert (1987). See also Stability Theory and KAM, KAM Theory and Celestial Mechanics in this encyclopedia.
includes an external magnetic field, for example, $H = (p - A)^2 - V$ (see the next and the last section).
The reader is referred to Thirring (1997), Reed and Simon (1978), and Simon (1979) for further details.
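As a numerical illustration of the leading-order semiclassical formula [6] from the section "Quantization and Semiclassical Limit", the following minimal sketch compares an eigenvalue count with the corresponding phase-space volume in dimension $d = 1$. Several simplifications are assumptions of this sketch and not part of the text: the observable is the identity, the smooth cutoff $F$ is replaced by a sharp count of eigenvalues below an energy $E$, the potential is the confining choice $V(x) = x^2$ rather than a compactly supported one, and $H_\hbar = -\hbar^2 d^2/dx^2 + V$ is discretized by finite differences with Dirichlet conditions on a finite interval. The function names `count_eigenvalues_below` and `weyl_count` are hypothetical. The count uses a Sturm sequence (the signs of the pivots of the shifted tridiagonal matrix), so only the standard library is needed.

```python
import math


def count_eigenvalues_below(hbar, energy, potential, half_width=3.0, n=1200):
    """Count eigenvalues < energy of the finite-difference matrix for
    H = -hbar^2 d^2/dx^2 + V(x) on [-half_width, half_width] (Dirichlet).

    By Sylvester's law of inertia, the number of negative pivots in the
    LDL^T factorization of (H - energy) equals the number of eigenvalues
    of H below `energy`.
    """
    dx = 2.0 * half_width / (n + 1)
    kinetic = hbar ** 2 / dx ** 2          # size of the off-diagonal entries
    count = 0
    pivot = 1.0
    for i in range(n):
        x = -half_width + (i + 1) * dx     # interior grid point
        diag = 2.0 * kinetic + potential(x) - energy
        pivot = diag - (kinetic ** 2 / pivot if i > 0 else 0.0)
        if pivot == 0.0:
            pivot = -1e-300                # avoid division by zero
        if pivot < 0.0:
            count += 1
    return count


def weyl_count(hbar, energy, potential, half_width=3.0, n=1200):
    """Phase-space area of {p^2 + V(x) <= energy} divided by 2*pi*hbar,
    computed with the midpoint rule in x."""
    dx = 2.0 * half_width / n
    area = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        gap = energy - potential(x)
        if gap > 0.0:
            area += 2.0 * math.sqrt(gap) * dx   # p in [-sqrt(gap), sqrt(gap)]
    return area / (2.0 * math.pi * hbar)


def harmonic(x):
    # Assumed test potential V(x) = x^2; exact eigenvalues are hbar*(2n+1).
    return x * x


if __name__ == "__main__":
    hbar, energy = 0.05, 1.0
    print(count_eigenvalues_below(hbar, energy, harmonic))
    print(weyl_count(hbar, energy, harmonic))
```

For this assumed potential the exact eigenvalues are $\hbar(2n+1)$, so both numbers can be checked by hand: the phase-space region $\{p^2 + x^2 \le E\}$ has area $\pi E$, which gives $E/(2\hbar)$ after division by $2\pi\hbar$.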
Magnetic Schrödinger Operators
quadratic form may assume arbitrarily small values (even if the corresponding field energy is added).
5. For many choices of $A$, the (Dirac) operator $\sigma \cdot (p - A)$ has a nontrivial kernel.
From (1)–(4) it is clear that the proof of stability of matter (see the next section) in presence of a magnetic field is more difficult than in absence of it. This can be illustrated by the fact that magnetic Lieb–Thirring inequalities, being the natural analog of eqn [8], are more involved to derive than the original estimate [8]. The currently best bound is of the form
$$\mathrm{tr}\{[-H_V]_+\} \le C_{\mathrm{mLT}} \int_{\mathbb{R}^d} \Big\{ [V(x)]_+^{5/2} + |B(x)|\,[V(x)]_+^{3/2} + \big(|B(x)| + L_c(x)^{-2}\big)\, L_c(x)^{-1}\, [V(x)]_+ \Big\}\, d^dx \qquad [11]$$
for some universal $C_{\mathrm{mLT}} < \infty$, where $L_c(x)$ is a local length scale associated with $B$. It is nonlocal in $x$ and somewhat reminiscent of a maximal function.
We further remark that if restricted to two dimensions, $d = 2$, both the magnetic and the Pauli Hamiltonians play an important role in the theory of the (integer) quantum Hall effect.
For more details, see Simon (1979), Cycon et al. (1987), Rauch and Simon (1997), and Erdös and Solovej (2004). See also the article Quantum Hall Effect in this encyclopedia.
N-Body Schrödinger Operators
The origin of quantum mechanics is atomic ($K = 1$ below) or molecular ($K \ge 2$) physics. If we regard the nuclei of the molecule as fixed point charges $Z := (Z_1, \ldots, Z_K) > 0$ at respective positions $R := (R_1, \ldots, R_K)$, $R_k \in \mathbb{R}^3$, then the Hamiltonian (in convenient units) of this molecule with $N \in \mathbb{N}$ electrons is the following Schrödinger operator:
$$H_N(Z, R) = \sum_{n=1}^{N} \Big\{ -\Delta_n - \sum_{k=1}^{K} \frac{Z_k}{|x_n - R_k|} \Big\} + \sum_{1 \le m < n \le N} \frac{1}{|x_m - x_n|} \qquad [12]$$
defined on $\mathcal{H}^{(N)} := \bigwedge_{n=1}^{N} L^2[\mathbb{R}^3 \times \mathbb{Z}_2] \subseteq L^2[(\mathbb{R}^3 \times \mathbb{Z}_2)^N]$, the space of totally antisymmetric, square-integrable wave functions in $N$ space–spin variables $(x_1, \sigma_1), \ldots, (x_N, \sigma_N) \in \mathbb{R}^3 \times \mathbb{Z}_2$. The antisymmetry of the wave function accounts for the fact that electrons are fermions and is of crucial importance. Note that the number $N$ of electrons is possibly very large. It is clear that we cannot expect to carry out the spectral analysis of this Schrödinger operator directly, but rather only suitable approximations.
In spite of the fact that $H_N(Z, R)$ was one of the basic operators of quantum mechanics from its very beginning in the late 1920s, $H_N(Z, R)$ was, strictly speaking, not known to be self-adjoint before Kato developed the perturbation theory (described in the section "Self-adjointness") some 20 years later, which then also yielded the semiboundedness of $H_N(Z, R)$. So, the ground-state energy $E_N(Z, R) := \inf\sigma[H_N(Z, R)] > -\infty$ is finite. From the HVZ (Hunziker–van Winter–Zhislin) theorem it follows that $\inf\sigma_{\mathrm{ess}}[H_N(Z, R)] = E_{N-1}(Z, R)$, which in particular implies that $E_N(Z, R)$ is monotonically decreasing in $N$ and negative (because $E_1(Z, R) < 0$).
It is known that $E_N(Z, R) = E_{N+1}(Z, R)$ and that $H_N(Z, R)$ has no eigenvalue, for $N \ge 2Z_{\mathrm{tot}} + 1$, where $Z_{\mathrm{tot}} := \sum_{k=1}^{K} Z_k$ is the total nuclear charge of the atom. On the other hand, it is known that $E_N(Z, R)$ is an eigenvalue, provided $N < Z_{\mathrm{tot}}$. Thus, defining $N_{\mathrm{crit}}$ to be the smallest number such that $E_N(Z, R)$ is not an eigenvalue, for all $N \ge N_{\mathrm{crit}}$, that is, $N_{\mathrm{crit}}$ is the maximal number of electrons the molecule can bind, we have that $Z_{\mathrm{tot}} \le N_{\mathrm{crit}} \le 2Z_{\mathrm{tot}} + 1$. In increasing precision, asymptotic neutrality, $N_{\mathrm{crit}} = Z_{\mathrm{tot}} + R(Z_{\mathrm{tot}})$, with $R(Z_{\mathrm{tot}}) = o(Z_{\mathrm{tot}})$ and $R(Z) = o(Z^{5/7})$, was shown for atoms and for molecules, respectively. The ionization conjecture states that $N_{\mathrm{crit}} \le Z_{\mathrm{tot}} + C$, for some universal constant $C$. It is still open for the full model represented by $H_N(Z, R)$, but has been proved in the Hartree–Fock approximation by Solovej (2003).
The semiboundedness of $H_N(Z, R)$, for fixed $Z$, $R$, and $N$, alone does not rule out a physical collapse of the matter described by $H_N(Z, R)$, but the stronger property of stability of matter does. It holds if there exists a constant $C$, possibly depending on $Z$, such that
$$E_N(Z, R) + \sum_{1 \le k < \ell \le K} \frac{Z_k Z_\ell}{|R_k - R_\ell|} \ge -C(N + K) \qquad [13]$$
that is, if the ground-state energy plus the repulsive electrostatic energy of the nuclei is bounded below by a constant times the total number $N + K$ of particles in the system. Equation [13] was shown to hold for $H_N(Z, R)$.
In connection with stability of matter, Thomas–Fermi theory and the question of the limit of large nuclear charge came into the focus of research. For simplicity, we restrict ourselves to atoms, $K = 1$, that is, there is one nucleus of charge $Z := Z_1$ at the origin, $R_1 = 0$, and we consider $E(Z) := \min_{N \in \mathbb{N}} E_N(Z, 0)$ (which amounts to fixing $N := N_{\mathrm{crit}}$). An asymptotic expansion for $E(Z)$ of increasing
precision in $Z$ was obtained by ever-finer estimates; presently, one knows that
$$E(Z) = E_{\mathrm{TF}} Z^{7/3} + \tfrac{1}{4} Z^2 + C_{\mathrm{DS}} Z^{5/3} + o(Z^{5/3}) \qquad [14]$$
where the leading contribution $E_{\mathrm{TF}} Z^{7/3}$ is the Thomas–Fermi energy, $(1/4)Z^2$ is the Scott correction, …
Scattering Theory
The study of the properties of the propagator $\exp(-itH)$ of a self-adjoint operator $H = H^*$, as $t \to \infty$, is the concern of scattering theory. To obtain a well-defined mathematical object in this limit, it is necessary to compose $\exp(-itH)$ with the inverse of some explicitly accessible comparison dynamics before passing to the limit $t \to \infty$. If $V$ is a short-range potential, that is, $V$ is relatively $H_0$-compact and $|V(x)| \le C|x|^{-\mu}$, for some $\mu > 1$ and $C < \infty$, then the comparison dynamics appropriate for $H_V$ is generated by $H_0$: the wave operators $\Omega^\pm$ are defined as the strong limits
$$\Omega^\pm := \lim_{t \to \pm\infty} e^{itH_V} e^{-itH_0} \qquad [15]$$
A general technique in scattering theory to prove the existence of such limits is Cook's argument, which formally amounts to an application of the fundamental theorem of calculus. For example, for the existence of $\Omega^+$, one writes
$$\Omega^+ - \mathbf{1} = \int_0^\infty dt\, \frac{d}{dt}\big\{ e^{itH_V} e^{-itH_0} \big\} = i \int_0^\infty dt\, \big\{ e^{itH_V}\, V\, e^{-itH_0} \big\} \qquad [16]$$
and additionally proves the absolute integrability of $t \mapsto e^{itH_V} V e^{-itH_0}\varphi$, for $\varphi$ in a dense subset of $\mathcal{H}$, like $\mathrm{dom}(H_0) = \mathrm{dom}(H_V)$.
Research in scattering theory in the past two decades or so was focused around the question of asymptotic completeness, which is a mathematically precise formulation
$$\mathrm{Ran}\,\Omega^+ = \mathrm{Ran}\,\Omega^- = \mathcal{H}_{\mathrm{pp}}^\perp(H_V) \qquad [17]$$
of the physical expectation that the states in $\mathcal{H}$ are either bound states (eigenvectors) of $H_V$ or scattering states (states in the range of $\Omega^\pm$) of $H_V$. The intertwining property $H_V\Omega^\pm = \Omega^\pm H_0$ (which easily follows from [15]) implies that the restriction of $H_V$ to $\mathrm{Ran}\,\Omega^\pm$ is unitarily equivalent to $H_0$, hence $\mathrm{Ran}\,\Omega^\pm \subseteq \mathcal{H}_{\mathrm{ac}}(H_V) \subseteq \mathcal{H}_{\mathrm{pp}}^\perp(H_V)$. The difficult part of the proof of asymptotic completeness is to show that $\mathcal{H}_{\mathrm{pp}}^\perp(H_V) \subseteq \mathrm{Ran}\,\Omega^\pm$.
where each pair potential $V_{mn}$ obeys $|\partial_y^\alpha V_{mn}(y)| \le C(1 + |y|)^{-\mu - |\alpha|}$, with $\alpha \in \mathbb{N}_0^d$ being a multi-index. If $\mu > 1$ for all $m \ne n$ then $V$ is called a short-range potential. Conversely, if $0 < \mu \le 1$ then $V$ is a long-range potential. Note that even though each $V_{mn}$ decays at infinity, $|x|^2 = x_1^2 + x_2^2 + \cdots + x_N^2 \to \infty$ alone does not imply that $V(x) \to 0$. In fact, physical intuition tells us that for a cluster $C$ of $N$ particles, whose dynamics is generated by $H_N(V)$, several scenarios for the long-time asymptotic behavior of the evolution are possible:
1. The $N$ particles stay together in their cluster $C$ whose center of mass moves in space at constant velocity.
2. The cluster breaks up into two (or even more) subclusters, $C_1$ and $C_2$, of $N_1$ and $N_2 = N - N_1$ particles, respectively, whose centers of mass drift apart from each other at constant velocities (in the short-range case). For each subcluster $C_1$ and $C_2$, both scenarios may appear again, after waiting sufficiently longer.
3. In the limit $t \to \infty$, possibly after going through (1) and (2) several times, the initial cluster $C$ is broken up into $1 \le K \le N$ subclusters $C_1, \ldots, C_K$, whose centers of mass drift apart from each other at constant velocities according to a free and independent dynamics of their centers of mass.
In some sense, asymptotic completeness says that nothing else than (1)–(3) can possibly happen. (Strictly speaking, asymptotic completeness is a statement about the limit $t \to \infty$ and only involves (3); the actual behavior of $\exp[-itH_V]$ at intermediate times in terms of (1)–(3) is beyond the reach of current mathematics.) It is a key insight of scattering theory that the asymptotics of the time evolution in the sense of (3) is completely
characterized by the asymptotic velocity defined by the strong limit
$$P^+ := \lim_{t \to \infty} e^{itH_N(V)}\, \frac{x}{t}\, e^{-itH_N(V)} \qquad [19]$$
It is a nontrivial fact that $P^+$ exists, commutes with $H_N(V)$, and that bound states are precisely the states with zero asymptotic velocity, while states with nonzero asymptotic velocity are scattering states in $\mathrm{Ran}\,\Omega^\pm$. This then implies asymptotic completeness for short-range potentials. The proof of this dichotomy builds essentially upon positive commutator or Mourre estimates. Given an interval $J$ localized (in energy) away from any eigenvalue of any possible subcluster configuration $C_1, \ldots, C_K$ (called thresholds), the Mourre estimate asserts the existence of a positive constant $M > 0$ and a compact operator $R \in \mathcal{B}(\mathcal{H}^{(N)})$ such that
$$\mathbf{1}_J\, i[H_N(V), A]\, \mathbf{1}_J \ge M\,\mathbf{1}_J - R \qquad [20]$$
as a quadratic form, for some suitable operator $A$. This operator $A$ is often chosen to be the dilation generator $A = (1/2)\{p \cdot x + x \cdot p\}$ or a variant thereof.
Again, the proof of asymptotic completeness for long-range potentials is still more difficult and has been carried out only for $\mu > \sqrt{3} - 1$. The additional problem is the comparison dynamics of the relative motion of the clusters $C_1$ and $C_2$ in (2), which is not the free one; the clusters rather influence each other even at large distances.
For more details, see Reed and Simon (1980c) and Derezinski and Gérard (1997). See also the articles Scattering in Relativistic Quantum Field Theory: Fundamental Concepts and Tools, Scattering, Asymptotic Completeness and Bound States in this encyclopedia.
Random Schrödinger Operators
Schrödinger operators $H(V_\omega)$ on $L^2(\mathbb{R}^d)$ or $\ell^2(\mathbb{Z}^d)$ with a random potential $V_\omega$ are called random Schrödinger operators. (If $H(V_\omega)$ acts on $\ell^2(\mathbb{Z}^d)$, then the (continuum) Laplacian is replaced by the discrete Laplacian on $\mathbb{Z}^d$ defined by $[-\Delta_{\mathrm{disc}} f](x) = \sum_{\nu=1}^{d} \{2f(x) - f(x - e_\nu) - f(x + e_\nu)\}$.) More precisely, given a probability space $(\Omega, P, \Sigma)$ and a random variable $\Omega \ni \omega \mapsto V_\omega$, the family $\{H(V_\omega)\}_{\omega \in \Omega}$ defines an operator-valued random variable that we refer to as a random Schrödinger operator. Random quantum systems are physically relevant as models for amorphous materials, and for solids in very heterogeneous external fields or coupled to quantized fields. Suitable ergodicity assumptions on $\omega \mapsto V_\omega$ ensure that the domain of $H_\omega$ and even many spectral properties (in particular, the spectrum $\sigma(H(V_\omega)) \subseteq \mathbb{R}$ itself) are independent of $\omega$ $P$-almost surely. For example, assuming an independent, identical distribution (i.i.d.) of $V_\omega$ in the discrete case on $\mathbb{Z}^d$, one arrives at the Anderson model, which has been most thoroughly studied. Its counterpart for continuum models is a Poisson-distributed $V_\omega$. A model which also has ergodic properties, although deterministic, is the Hofstadter or the Mathieu problem. Most research has been focused on localization, that is, spatial decay properties of the resolvent $\{H(V_\omega) - E\}^{-1}(x, y)$ of $H(V_\omega)$, as $|x - y| \to \infty$, and particularly the question of presence or absence of exponential decay (localization), as this is an important indicator for the transport properties of the material under consideration. Exponential localization of eigenstates has been established for $d = 1$ or strong disorder or sufficiently high energies $E \gg 1$. Localization is also intimately related to bounds on moments of the form $\| |x|^{\alpha/2} \psi_t \| \le C\, t^{\beta}$. The study of the asymptotic distribution of eigenvalues close to the lowest threshold leads to the so-called Lifshitz tails.
The reader is referred to Figotin and Pastur (1992), Cycon et al. (1987), and Stollmann (2001).
(Pseudo)relativistic Schrödinger Operators
Schrödinger operators of the form $H(V) = p^2 + V(x)$ do not observe the invariance principles of (special) relativity, as their derivation is based in classical (Newtonian) mechanics. The free Dirac operator $D := \alpha \cdot p + \beta m$ (here, $\alpha$ and $\beta$ are self-adjoint $4 \times 4$ matrices) possesses the desired relativistic invariance, but it is not semibounded, and the definition of an interacting Dirac operator is notoriously difficult (and unsolved). The replacement of the kinetic energy $(1/2m)p^2$ by the Klein–Gordon operator $\sqrt{p^2 + m^2}$ is a step towards relativistic invariance, which, at the same time, yields a positive operator. This replacement may also be viewed as the restriction of the free Dirac operator to its positive-energy subspace. The virtue of this replacement is that it immediately allows for the study of interacting N-particle operators,
$$H_N^{\mathrm{rel}}(Z, R) = \sum_{n=1}^{N} \Big\{ \sqrt{-\Delta_n + m^2} - \sum_{k=1}^{K} \frac{Z_k}{|x_n - R_k|} \Big\} + \sum_{1 \le \ell < n \le N} \frac{1}{|x_\ell - x_n|} \qquad [21]$$
much like in [12]. Since $\sqrt{p^2 + m^2} \sim |p|$, as $p \to \infty$, the pseudorelativistic kinetic energy $\sqrt{p^2 + m^2}$ can
balance only less severe local singularities of the potential $V$ than the nonrelativistic kinetic energy $(1/2m)p^2$. Indeed, already the quadratic form $\sqrt{p^2 + m^2} - g|x|^{-1}$ on $C_0^\infty(\mathbb{R}^3)$ associated to a hydrogen-like atom is unbounded from below if $g > 2/\pi$. Hence, the stability of matter becomes a more subtle property of pseudorelativistic matter. The relaxation of the restriction onto the positive subspace of the free Dirac operator also got into the focus of research.
For more details, we refer the reader to Thirring (1997).
See also: Deformation Quantization; Elliptic Differential Equations: Linear Theory; h-Pseudodifferential Operators and Applications; Localization for Quasiperiodic Potentials; Nonlinear Schrödinger Equations; Normal Forms and Semiclassical Approximation; N-Particle Quantum Scattering; Quantum Hall Effect; Quantum Mechanical Scattering Theory; Scattering, Asymptotic Completeness and Bound States; Stability of Matter; Stationary Phase Approximation.
Further Reading
Amrein W, Hinz A, and Pearson D (2005) Sturm–Liouville Theory: Past and Present. Boston: Birkhäuser.
Cycon H, Froese R, Kirsch W, and Simon B (1987) Schrödinger Operators, 1st edn. Berlin: Springer.
Derezinski J and Gérard C (1997) Scattering Theory of Classical and Quantum N-Particle Systems, Texts and Monographs in Physics. Berlin: Springer-Verlag.
Dimassi M and Sjöstrand J (1999) Spectral Asymptotics in the Semi-Classical Limit. London Mathematical Society Lecture Notes Series, vol. 268. Cambridge: Cambridge University Press.
Erdös L and Solovej JP (2004) Uniform Lieb–Thirring inequality for the three-dimensional Pauli operator with a strong non-homogeneous magnetic field. Annales Henri Poincaré 5: 671–741.
Figotin A and Pastur L (1992) Spectra of Random and Almost-Periodic Operators. Grundlehren der Mathematischen Wissenschaften, vol. 297. Berlin: Springer-Verlag.
Kato T (1976) Perturbation Theory of Linear Operators, 2nd edn., Grundlehren der Mathematischen Wissenschaften, vol. 132. Berlin: Springer-Verlag.
Messiah A (1962) Quantum Mechanics, 1st edn., vol. 2. Amsterdam: North-Holland.
Rauch J and Simon B (eds.) (1997) Quasiclassical Methods. IMA Volumes in Mathematics and Its Applications, vol. 95. Berlin: Springer-Verlag.
Reed M and Simon B (1978) Methods of Modern Mathematical Physics IV. Analysis of Operators, 1st edn., vol. 4. San Diego: Academic Press.
Reed M and Simon B (1980a) Methods of Modern Mathematical Physics: I. Functional Analysis, 2nd edn., vol. 1. San Diego: Academic Press.
Reed M and Simon B (1980b) Methods of Modern Mathematical Physics: II. Fourier Analysis and Self-Adjointness, 2nd edn., vol. 2. San Diego: Academic Press.
Reed M and Simon B (1980c) Methods of Modern Mathematical Physics: III. Scattering Theory, 2nd edn., vol. 3. San Diego: Academic Press.
Robert D (1987) Autour de l’Approximation Semi-Classique, 1st edn. Boston: Birkhäuser.
Schrödinger E (1926) Quantisierung als Eigenwertproblem. Annalen der Physik 79: 489.
Simon B (1979) Functional Integration and Quantum Physics, Pure and Applied Mathematics. New York: Academic Press.
Simon B (2000) Schrödinger operators in the twentieth century. Journal of Mathematical Physics 41: 3523–3555.
Solovej JP (2003) The ionization conjecture in Hartree–Fock theory. Annals of Mathematics 158: 509–576.
Stollmann P (2001) Caught by Disorder. Progress in Mathematical Physics, vol. 20. Boston: Birkhäuser.
Thirring W (ed.) (1997) The Stability of Matter: From Atoms to Stars. Selecta of Elliott H. Lieb, 2nd edn. Berlin: Springer-Verlag.
The examples of such theories are topological Chern–Simons (CS) theories and BF theories. Metric independence of the action $S$ of a Schwarz-type gauge theory implies that the stress–energy tensor is zero:
\[ T_{\mu\nu} \equiv \frac{\delta S}{\delta g^{\mu\nu}} = 0 \]
More generally, in the gauge-fixed version of such theories, stress–energy can be BRST exact, where BRST charge corresponds to gauge fixing in contrast to Witten-type theories where corresponding BRST charge corresponds to a combination of shift symmetry and gauge symmetry. There are no local propagating degrees of freedom; the only degrees of freedom are topological. Expectation values of metric-independent operators $W$ are also independent of the metric:
\[ \frac{\delta \langle W \rangle}{\delta g^{\mu\nu}} = 0 \]
Three-dimensional CS theories are of particular interest, for these provide a framework for the study of knots and links in any 3-manifold. Pioneering indications of the fact that topological invariants can be found in such a setting came very early, when A S Schwarz demonstrated that a particular topological invariant, Ray–Singer analytic torsion (which is equivalent to combinatorial Reidemeister–Franz torsion) can be interpreted in terms of the partition function of a quantum gauge field theory (Schwarz 1978, 1979). In particular, in the weak-coupling limit of CS theory of gauge group $G$ on a manifold $M$, contribution from each topologically distinct flat connection (characterized by the equivalence classes of homomorphisms: $\pi_1(M) \to G$) to the partition function is given by metric-independent Ray–Singer torsion of the flat connection up to a phase. This phase factor is also a topological invariant of framed 3-manifold $M$ (Witten 1989). It was Schwarz who first discussed CS theory as a topological field theory and also conjectured that the well-known Jones polynomial may be related to it (Schwarz 1987). In his famous paper Witten (1989) not only demonstrated this connection, but also set up a general field-theoretic framework to study the topological properties of knots and links in any arbitrary 3-manifold. In addition, this framework provides a method of obtaining some new manifold invariants. As discussed by A Achúcaro and P K Townsend, CS theory also describes gravity in three-dimensional spacetime (Carlip 2003).
BF theories in three dimensions provide another framework for field-theoretic description of topological properties of knots and links. These theories with bilinear action in fields can also be defined in higher dimensions. In particular in $D = 4$, BF theory, besides describing two-dimensional generalizations of knots and links, also provides a field-theoretic interpretation of Donaldson invariants. This provides a connection of these theories with Witten-type TQFTs of Yang–Mills gauge fields. We shall not discuss BF theories in the following and refer to the article BF Theories in this Encyclopedia.
Witten (1995) has also formulated CS theories in three complex dimensions described in terms of holomorphic 1-forms. Such a theory on Calabi–Yau spaces can be interpreted as a string theory in terms of a Witten-type topological field theory of a sigma model coupled to gravity. General topological sigma models in Batalin–Vilkovisky formalism have been constructed by Alexandrov et al. (1997). This is a Schwarz-type theory. However, in its gauge-fixed version, it can also be interpreted as a Witten-type theory. This construction provides a general formulation from which numerous topological field theories emerge. In particular, the Witten A and B models and also multidimensional CS theories are special cases of this construction.
In the following, we shall survey three-dimensional CS theory as a description of knots/links, indicate how manifold invariants can be constructed from invariants for framed links, and also discuss its application to three-dimensional gravity.
Three-Dimensional CS Theory with Gauge Group U(1)
The simplest Schwarz-type topological field theory is the U(1) CS theory described by the action:
\[ S = \frac{1}{8\pi} \int_M A \, dA \qquad [1] \]
where $A$ is a connection 1-form $A = A_\mu \, dx^\mu$ and $M$ is the 3-manifold, which we shall take to be $S^3$ for the discussion below. The action has no dependence on the metric. Besides being U(1) gauge invariant, it is also general coordinate invariant.
In quantum CS field theory, we are interested in the functional averages of gauge-invariant and metric-independent functionals $W[A]$:
\[ \langle W[A] \rangle = \frac{1}{Z} \int [DA] \, W[A] \exp\{ikS\}, \qquad Z = \int [DA] \exp\{ikS\} \qquad [2] \]
This theory captures some of the simple, but interesting, topological properties of knots and links in three dimensions. For a knot $K$, we associate a knot
496 Schwarz-Type Topological Quantum Field Theory
operator $\oint_K A$ which is gauge invariant and also does not depend on the metric of the 3-manifold. Then for a link made of two knots $K_1$ and $K_2$, we have the loop correlation function $\langle \oint_{K_1} A \oint_{K_2} A \rangle$, which can be evaluated in terms of the two-point correlator $\langle A_\mu(x) A_\nu(y) \rangle$ in $R^3$ (with flat metric). This correlator in Lorentz gauge ($\partial_\mu A^\mu = 0$) is:
\[ \langle A_\mu(x) A_\nu(y) \rangle = \frac{i}{k} \, \epsilon_{\mu\nu\rho} \frac{(x - y)^\rho}{|x - y|^3} \]
so that for two distinct knots $K_1$ and $K_2$
\[ \left\langle \oint_{K_1} A \oint_{K_2} A \right\rangle = \frac{4\pi i}{k} \, L(K_1, K_2) \qquad [3] \]
where
\[ L(K_1, K_2) = \frac{1}{4\pi} \oint_{K_1} dx^\mu \oint_{K_2} dy^\nu \, \epsilon_{\mu\nu\rho} \frac{(x - y)^\rho}{|x - y|^3} \]
This integral is the well-known topological invariant called ‘‘Gauss linking number’’ of two distinct closed curves. It is an integer measuring the number of times one knot $K_1$ goes through the other knot $K_2$. Linking number does not depend on the location, size, or shape of the knots. In electrodynamics, it has the physical interpretation of work done to move a monopole around a knot while electric current runs through the other knot.
Abelian CS theory also provides a field-theoretic representation for another topological quantity called ‘‘self-linking number,’’ also known as ‘‘framing number,’’ of the knot. It is related to the functional average $\langle \oint_K A \oint_K A \rangle$ where the two loop integrals are over the same knot. Coincidence singularity is avoided by a topological loop-splitting regularization. For a knot $K$ given by $x^\mu(s)$ parametrized along the length of the knot by $s$, we associate another closed curve $K_f$ given by $y^\mu(s) = x^\mu(s) + \varepsilon \, n^\mu(s)$, where $\varepsilon$ is a small parameter and $n^\mu(s)$ is a principal normal to the curve at $s$. The coincidence limit is then obtained at the end by taking the limit $\varepsilon \to 0$. Such a limiting procedure is called framing and knot $K_f$ is the ‘‘frame’’ of knot $K$. Linking number of the knot $K$ and its frame $K_f$ is the self-linking number of the knot:
\[ SL(K; n) = \frac{1}{4\pi} \oint_K dx^\mu \oint_{K_f} dy^\nu \, \epsilon_{\mu\nu\rho} \frac{(x - y)^\rho}{|x - y|^3} \]
Hence the coincidence two-loop correlator is
\[ \left\langle \oint_K A \oint_K A \right\rangle = \frac{4\pi i}{k} \, SL(K; n) \qquad [4] \]
Notice that the self-linking number of a knot is independent of the regularization parameter $\varepsilon$, but does depend on the topological character of the normal vector field $n^\mu(s)$. It is also related to two geometric quantities called ‘‘twist’’ $T(K)$ and ‘‘writhe’’ $w(K)$ through a theorem due to Calugareanu:
\[ SL(K) = T(K) + w(K) \qquad [5] \]
where
\[ T(K) = \frac{1}{2\pi} \oint_K ds \, \epsilon_{\mu\nu\rho} \frac{dx^\mu}{ds} \, n^\nu \frac{dn^\rho}{ds} \]
\[ w(K) = \frac{1}{4\pi} \oint_K ds \oint_K dt \, \epsilon_{\mu\nu\rho} \, e^\mu \frac{de^\nu}{ds} \frac{de^\rho}{dt} \]
Here
\[ e^\mu(s, t) = \frac{y^\mu(t) - y^\mu(s)}{|y(t) - y(s)|} \]
is a unit map from $K \times K \to S^2$ and $n^\mu(s)$ is a normal unit vector field. $T(K)$ and $w(K)$ are not in general integers and represent the amount of twist and coiling of the knot. These are not topological invariants but their sum, the self-linking number, is indeed always an integer and a topological invariant. This result has found interesting applications in the studies of the action of enzymes on circular DNA.
Nonabelian CS Theories
Nonabelian CS theories provide far more information about the topological properties of the manifolds as well as knots and links.
Nonabelian CS theory in a 3-manifold $M$ (which as in the last section is taken to be $S^3$) is described by the action functional
\[ S = \frac{1}{4\pi} \int_M \mathrm{tr} \left( A \wedge dA + \tfrac{2}{3} A \wedge A \wedge A \right) \qquad [6] \]
where $A$ is a gauge field 1-form which takes its value in the Lie algebra $LG$ of a compact semisimple Lie group $G$. For example, we may take this group to be SU(N) and $A = A^a T^a$, where $T^a$ is the fundamental N-dimensional representation with $\mathrm{tr} \, T^a T^b = -\tfrac{1}{2} \delta^{ab}$. Under homotopically nontrivial gauge transformations this action is not invariant, but changes by an amount $2\pi n$ where integers $n$ are the winding numbers characterizing the gauge transformations which fall in homotopic classes given by $\pi_3(G) = \mathbb{Z}$ for a compact semisimple group $G$. However, for quantum theory what is relevant is $\exp[ikS]$ which is invariant even under homotopically nontrivial gauge transformations provided the coupling $k$ takes integer values. This quantized nature of the coupling was pointed out by Deser et al. (1982a, b) (and also they were first to introduce the nonabelian CS term as a gauge-invariant topological
mass term in gauge theories). So for integer $k$, the quantum field theory we discuss here is gauge invariant.
The topological operators are Wilson loop operators for an oriented knot $K$:
\[ W_R[K] = \mathrm{tr} \, P \exp \oint_K A_R \qquad [7] \]
where $A_R = A^a T^a_R$ with $T^a_R$ as the representation matrices of a finite-dimensional representation $R$ of the $LG$. $P$ stands for the path ordering of the exponential. The observable Wilson link operator for a link $L = \bigcup_{i=1}^{n} K_i$, carrying representations $R_i$ on the respective component knots, is
\[ W_{R_1 R_2 \ldots R_n}[L] = \prod_{i=1}^{n} W_{R_i}[K_i] \qquad [8] \]
Expectation values of these operators are:
\[ V_{R_1, R_2 \ldots R_n}[L] = \frac{\int [DA] \, W_{R_1 \ldots R_n}[L] \, e^{ikS}}{\int [DA] \, e^{ikS}} \qquad [9] \]
The measure $[DA]$ has to be metric independent. These expectation values depend not only on the isotopy of the link $L$ but also on the set of the representations $\{R_i\}$. These can be evaluated in principle nonperturbatively. For example, when $LG = su(N)$ and each of the component knots of the links carries the fundamental N-dimensional representation, the Wilson link expectation values satisfy a recursion relation involving three link diagrams which are identical except for one crossing where they differ as over crossing ($L_+$), under crossing ($L_-$), and no crossing ($L_0$) as shown in Figure 1.
Figure 1 Skein related links ($L_+$, $L_0$, $L_-$).
The expectation values of these links are related as (Witten 1989):
\[ q^{N/2} V_N[L_+] - q^{-N/2} V_N[L_-] = \left( q^{1/2} - q^{-1/2} \right) V_N[L_0] \qquad [10] \]
where $q = \exp[2\pi i/(k + N)]$. The Jones polynomial (whose two-variable generalization is the HOMFLY polynomial) corresponds to the case of spin-1/2 representation of SU(2) CS theory: $V_2[L]$ = Jones polynomial $[L]$, up to an overall normalization. These skein relations are sufficient to recursively find all the expectation values of links with only fundamental representation on the components. To obtain invariants for any other representation, more general methods have to be developed. A complete and explicit solution of the CS field theory is thus obtained. One such method has been reviewed in Kaul (1999). The method makes use of the following important statement:
Proposition: CS theory on a 3-manifold $M$ with boundary $\Sigma$ is described by a WZNW (Wess–Zumino–Novikov–Witten) conformal field theory (CFT) on the boundary (Figure 2).
Using the same identification, functional average for Wilson lines ending at $n$ points on the boundary is obtained from WZNW field theory on the boundary with $n$ punctures carrying representations $R_i$ (Figure 3):
Figure 3 CS functional integrals with Wilson lines and CFT on punctured boundary.
We can represent CS functional integral as a vector (Witten 1989) in the Hilbert space $\mathcal{H}$ associated with the n-point vacuum expectation values of primary fields in WZNW conformal field theory on the boundary $\Sigma$. Next, to obtain a complete and explicit nonperturbative solution of the CS theory, the theory of knots and links and their connection to braids is invoked.
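As a minimal numerical sketch of how the skein relation [10] constrains the invariants, the following Python fragment checks two simple cases. It rests on assumptions that are not spelled out in the text above: the value $q = \exp[2\pi i/(k+N)]$ (the choice made in Witten (1989)), the factorization of the invariant of two unlinked unknots into a product of unknot invariants, and one particular assignment of diagrams to $L_+$, $L_-$, $L_0$ for a curl and for the Hopf link. All function names are illustrative only.

```python
import cmath
import math


def q_power(exponent, k, N):
    """q**exponent for the ASSUMED root of unity q = exp(2*pi*i/(k + N))."""
    theta = 2.0 * math.pi / (k + N)
    return cmath.exp(1j * theta * exponent)


def quantum_number(n, k, N):
    """[n] = (q^{n/2} - q^{-n/2}) / (q^{1/2} - q^{-1/2})."""
    numerator = q_power(n / 2, k, N) - q_power(-n / 2, k, N)
    denominator = q_power(0.5, k, N) - q_power(-0.5, k, N)
    return numerator / denominator


def unknot(k, N):
    """Candidate unknot invariant in the fundamental representation: [N]."""
    return quantum_number(N, k, N)


def skein_residual(v_plus, v_minus, v_zero, k, N):
    """Left-hand side minus right-hand side of the skein relation [10]."""
    lhs = q_power(N / 2, k, N) * v_plus - q_power(-N / 2, k, N) * v_minus
    rhs = (q_power(0.5, k, N) - q_power(-0.5, k, N)) * v_zero
    return lhs - rhs


def hopf_link(k, N):
    """Solve [10] for V[L+], ASSUMING L+ = Hopf link,
    L- = two unlinked unknots (invariant u*u) and L0 = one unknot (u)."""
    u = unknot(k, N)
    z = q_power(0.5, k, N) - q_power(-0.5, k, N)
    return q_power(-N / 2, k, N) * (z * u + q_power(-N / 2, k, N) * u * u)


if __name__ == "__main__":
    k, N = 3, 2
    u = unknot(k, N)
    h = hopf_link(k, N)
    # Curl diagram: L+ and L- are both unknots, L0 is two unlinked unknots.
    print("unknot value:", u)
    print("curl residual:", abs(skein_residual(u, u, u * u, k, N)))
    print("Hopf residual:", abs(skein_residual(h, u * u, u, k, N)))
```

Under these assumptions, the curl diagram forces the unknot value to be $[N]$, and for $k = 3$, $N = 2$ this evaluates to $2\cos(\pi/5)$. The Hopf link value is then fixed by one further application of [10], which illustrates the recursive use of the relation described above.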
Knots/Links and Braids
Braids have an intimate connection with knots and links which can be summarized as follows:
1. An n-braid is a collection of nonintersecting strands connecting $n$ points on a horizontal rod to $n$ points on another horizontal rod below strictly excluding any backward traversing of the strands. A general braid can be written as a word in terms of elementary braid generators.
2. We associate representations $R_i$ of the group with the strands as their colors. We also put an orientation on each strand. When all the representations are identical and also all strands are unoriented, we get ordinary braids, otherwise we get colored oriented braids.
3. The colored oriented braids form a groupoid where product of the different braids is obtained by joining them with both colors and orientations matching on the joined strands. Unoriented monochromatic braids form a group.
4. A knot/link can be formed from a given braid by a process called platting. We connect adjacent strands namely the $(2i + 1)$th strand to $2i$th strand carrying the same color and opposite orientations in both the rods of an even-strand braid (Figure 4a). There is a theorem due to Birman which states that all colored oriented knots/links can be obtained through platting. This construction is not unique.
5. There is another construction associated with braids which relates them to knots and links. We obtain a closure of a braid by connecting the ends of the first, second, third, . . . strands from above to those of the respective first, second, third, . . . strands from below as shown in the Figure 4b. There is theorem due to Alexander which states that any knot or link can be obtained as a closure of a braid, though again not uniquely.
Figure 4 (a) Platting and (b) closure of braids.
Link Invariants
This connection of braids to knots and links can be used to construct link invariants, say in $S^3$. To do so, two nonintersecting 3-balls are removed from the 3-manifold $S^3$ to obtain a manifold with two $S^2$ boundaries. Then we arrange $2n$ Wilson lines of, say SU(N) CS theory, as a 2n-strand oriented braid carrying representations $R_i$ in this manifold. The CS functional integral over this manifold is a state in the tensor product of the Hilbert spaces $\mathcal{H}_1 \otimes \mathcal{H}_2$ associated with conformal field theory on the two boundaries. These boundaries have $2n$ punctures carrying the set of representations $\{R_i\}$ and $\{R'_i\}$, respectively, the two sets being permutations of each other. This state can be expanded in terms of some convenient basis given by the conformal blocks for the 2n-point correlation functions of $SU(N)_k$ WZNW conformal field theory. The duality of these correlation functions represents the transformation between different bases for the Hilbert space. Their monodromy properties allow us to write down representations of the braid generators. Since an arbitrary braid is just a word in terms of these generators, this construction provides us a matrix representation $B(\{R_i\}, \{R'_j\})$ for the colored oriented braid in the manifold with two $S^2$ boundaries. Then we plat this braid by gluing two balls $B_1$ and $B_2$ with Wilson lines as shown in Figure 5. Each of the two caps again represents a state $|\psi(\{R_j\})\rangle$ in the Hilbert space associated with the conformal field theory on punctured boundary ($S^2$). Platting of the braid then simply is the matrix element of braid representation $B(\{R_i\}, \{R'_j\})$ with respect to these states $|\psi(\{R_i\})\rangle$ and $|\psi(\{R'_j\})\rangle$ corresponding to two caps $B_1$, $B_2$. Thus, for a link in $S^3$ the invariant is given by the following theorem:
Theorem The vacuum expectation value of Wilson loop operator of a link $L$ constructed from platting of a colored oriented $2n$ braid with representation $B(\{R_i\}, \{R'_j\})$ is given by (Kaul 1999):
\[ V[L] = \langle \psi(\{R_i\}) | \, B(\{R_i\}, \{R'_j\}) \, | \psi(\{R'_j\}) \rangle \qquad [11] \]
Figure 5 Construction of the link invariant.
This theorem can be used to calculate the invariant for any arbitrary link. For an unknot $U$
carrying an N-dimensional representation in an SU(N) CS theory, the knot invariant is:
\[ V_N[U] = [N], \quad \text{where} \quad [N] = \frac{q^{N/2} - q^{-N/2}}{q^{1/2} - q^{-1/2}} \]
Wilson link expectation values calculated this way depend on the regularization, that is, the definition of framing used in defining coincident loop correlators. One such regularization usually used is the standard framing, where the frame for every knot is so chosen that its self-linking number is zero.
The procedure outlined here has been used for explicit computations of knot/link invariants. This has led to answers to several questions of knot theory. One such question relates to distinguishing chirality of knots (Kaul 1999). In this context, newer invariants constructed with arbitrary representations living on the knots are more powerful than the older polynomial invariants. For example, invariants with spin-3/2 representation in an SU(2) CS theory are sensitive to chirality of many knots which otherwise is not detected by Jones, HOMFLY, and Kauffman polynomials. However, invariants obtained from CS theories do not distinguish all chiral knots. There is a class of links known as ‘‘mutants’’ which are not distinguished by CS link invariants (Kaul 1999). A mutant link is obtained by removing a portion of weaving pattern in a link and then gluing it back after rotating it about any one of three orthogonal axes by an amount $\pi$.
The CS invariants of knots and links can also be used to construct special 3-manifold invariants. Hence, CS theory provides an important tool to study these.
Manifold Invariants from CS Theory
Different 3-manifolds can be constructed through a procedure called ‘‘surgery of framed knots and links’’ in $S^3$ (Lickorish–Wallace theorem). This construction is not unique. That is, there are many framed knots and links which give the same manifold. However, rules of this equivalence are known: these are called ‘‘Kirby moves.’’ Classification of 3-manifolds would involve finding a method of associating a quantity with the manifold obtained by surgery on the corresponding framed knot/link on $S^3$. If the Kirby moves on the framed knot/link leave this quantity unchanged, then it is a 3-manifold invariant. Knot/link invariants of nonabelian CS theories provide a method of finding such 3-manifold invariants. Equivalently, this procedure gives an algebraic meaning to the surgery construction of 3-manifolds. Details of this method for generating manifold invariants are given in Kaul (1999) and Kaul and Ramadevi (2001).
Surgery of Framed Knots/Links and Kirby Moves
As discussed earlier, frame of a knot $K$ is an associated closed curve $K_f$ going along the length of the knot wrapping around it certain number of times. Self-linking number (also called framing number) is equal to the linking number of the knot with its frame. There are several ways of fixing this framing. The ‘‘standard’’ framing is one in which the frame number of the knot, that is, the linking number of the knot and its frame is zero. On the other hand, ‘‘vertical’’ framing is obtained by choosing the frame vertically above the knot projected on to a plane. In such a frame, the framing number of a knot is the same as its crossing number. In constructing the 3-manifold invariants from CS theories, we need vertical framing. The framing number may be denoted by writing the integer by the side of knot. We denote a framed r-component link by $[L, f]$ where framing $f = (n(1), n(2), \ldots, n(r))$ is a set of integers denoting the framing number of component knots $K_1, K_2, \ldots, K_r$ in the link $L$.
According to the Lickorish–Wallace theorem, surgery over links with vertical framing in $S^3$ yields all the 3-manifolds. This surgery is performed in the following way.
Take a framed r-component link $[L, f]$ in $S^3$. Thicken the component knots $K_1, K_2, \ldots, K_r$ such that the solid tubes $N_1, N_2, \ldots, N_r$ so obtained are nonintersecting. Then the complement $S^3 - (N_1 + N_2 + \cdots + N_r)$ will have $r$ toral boundaries. On the $i$th toral boundary, we imagine an appropriate curve winding $n(i)$ times around the meridian and once along the longitude. Perform a modular transformation so that this curve bounds a disk. This construction is done with each of the toral boundaries. The tubes $N_1, N_2, \ldots, N_r$ are then glued back in to the respective gaps. This surgery thus yields a new 3-manifold. This construction is not unique. The rules of equivalence for surgery on framed knots/links in $S^3$ are two independent Kirby moves.
Kirby move I Take an arbitrary r-component framed link $[L, f]$ in $S^3$ and consider a curve $C$ with framing number $+1$ going around the unlinked strands of $L$ as in Figure 6a. We refer to this $(r + 1)$-component link as $H[X]$, where $X$ represents a weaving pattern of the strands. Kirby move I consists of twisting the disk enclosed by $C$ in the clockwise direction from below by an amount $2\pi$. This twisting thereby introduces new crossings
The surgery of the framed links in Figures 6a and 6b will give the same 3-manifold.
Inverse Kirby move I involves removal of a curve $C$ with framing number $-1$ (instead of $+1$) after making one complete anticlockwise twist from below on the disk enclosed by $C$. In the process the unlinked strands get twisted in the anticlockwise direction leading to changed framing numbers $n'(i) = n(i) + (L(K_i, C))^2$ of the component knots $K_i$.
Kirby move II This move consists of removing a disjoint unknot $C$ with framing $-1$ from framed link $[L, f]$ without changing the rest of the link as in Figure 7. Surgery of the two links in Figure 7 will give the same manifold.
Inverse Kirby move II involves removal of a disjoint unknot with framing $+1$ (instead of $-1$) from a framed link.
Figure 7 Kirby move II.
3-Manifold Invariants
Now a 3-manifold invariant can be constructed by an appropriate combination of the invariants of framed links in such a way that this algebraic expression is unchanged under the Kirby moves. We
\[ \sum_{w \in W} \epsilon(w) \exp \left( \frac{-2\pi i}{k + C_v} \big( w(\Lambda_{R_1} + \rho), \, \Lambda_{R_2} + \rho \big) \right) \]
where $W$ denotes the Weyl group and its elements $w$ are words constructed using the generator $s_{\alpha_i}$, that is, $w = \prod_i s_{\alpha_i}$ and $\epsilon(w) = (-1)^{\ell(w)}$ with $\ell(w)$ as length of the word. Here $\Lambda_{R_i}$’s denote the highest weights of the representations $R_i$’s and $\rho$ is the Weyl vector. The action of the Weyl generator $s_\alpha$ on a weight $\Lambda_R$ is
\[ s_\alpha(\Lambda_R) = \Lambda_R - 2\alpha \, \frac{(\Lambda_R, \alpha)}{(\alpha, \alpha)} \]
and $|L_w/L|$ is the ratio of weight and coroot lattices (equal to the determinant of the Cartan matrix for simply laced algebras). Also $C_v$ is quadratic Casimir invariant for the adjoint representation.
It is important to stress that the expression $\hat{F}^{(G)}[M]$ is unchanged under both Kirby moves I and II (for detailed proof, see Kaul (1999) and Kaul and Ramadevi (2001)). Notice that for every compact gauge group, we have a new 3-manifold invariant.
Few examples of 3-manifolds Table 1 lists the algebraic expressions of this invariant calculated explicitly from the formula in eqn [12] for a few 3-manifolds. All these examples can be constructed by surgery on an unknot $U(f)$ with different frame numbers $f$.
In Table 1 $L[p, q]$ stands for Lens spaces of the type $(p, q)$ and $C_R$ is the quadratic Casimir invariant
Table 1 Invariants for some simple manifolds where the coupling constant k = ‘=(4G) for negative
^ (G)
cosmological constant = 1=‘2 . The gauge group
U(f ) M F [M]
for this theory is SL(2, C). Infinitesimal diffeo-
U(0) S2 S1 1=S00 morphisms are described by field-dependent gauge
U(1) S3 1 P transformations. The corresponding gauge group for
2CR
U(þ2) RP 3 1 S0R qS00 S0R Minkowski gravity with negative cosmological con-
PR
U(þp) L[p, 1]
pC R
1 S0R qS00 S0R stant is SO(2, R) SO(2, R). For positive , one
R
gets SO(3, 1) and SO(4) for Minkowski and Euclidean
metrics, respectively. For = 0, we have ISO(2, 1)
for representation R of the Lie algebra of the gauge (ISO(3)) as the gauge group for Minkowski
group G. (Euclidean) gravity. Hence, the sign of cosmological
Partition function of a CS theory on M is also an constant determines the gauge group of the CS
invariant characterizing the 3-manifold. This has theory.
been calculated for several manifolds by different Identification of 3D gravity with CS theory can be
methods. Invariant F ^ (G) [M] listed above for various used with some advantage to find the partition
manifolds is related to the CS partition function function for a black hole in 3D gravity with negative
^ (G) [M] = S1 Z(G) [M]. So the method of
Z(G) [M]: F cosmological constant. This in turn yields an
00
constructing 3-manifold invariants above can also expression for entropy of the black hole.
be used to calculate the partition function of CS
theories.
BTZ Black Hole and Its Partition Function
Only for negative we have a black hole solution of
3D Gravity and CS Theory the Einstein’s equations. This solution, known as the
Three-dimensional CS theory also provides a BTZ black hole (Carlip 2003), in Euclidean gravity
description of gravity. The 3D gravity including is given by the metric
cosmological constant has been first discussed by
16G M l 4r 2r
G is the Newton’s constant, g is the metric on the
It is specified by two parameters M and J (the mass
3-manifold M, and R is scalar curvature. Solutions
and angular momentum). By a coordinate transfor-
of Einstein equations of motion have a constant
mation, this metric can be rewritten as ds2E =
positive (negative) curvature if is positive (nega-
(l2 =z2 )(dx2 þ dy2 þ dz2 ), with z > 0. This is the 3D
tive). It is also well known that there are no
upper-half hyperbolic space and can be rewritten
dynamical degrees of freedom for gravity in dimen-
using spherical polar coordinates as
sions D 3; it is indeed described by topological
field theories. The gravity action above can be
l2 2
rewritten as a CS gauge theory in first-order ds2E ¼ dR þ R2 d
2 þ R2 sin2
d2
2
formulation (Carlip 2003). For triads ea and spin R2 sin
connection !a of Euclidean gravity, we define
1-forms e = ea T a dx , ! = !a T a dx , which have We have the identifications (R,
, )
(R exp {2rþ =l},
values in the Lie algebra of SU(2) whose generators
þ {2r =l}, ) where rþ and r are the outer and
are T a = i a =2 with a as three Pauli matrices. inner horizon radii, respectively. It is clear from this
In terms of these we define two gauge field 1-forms identification that topologically the metric corre-
A and A as: sponds to a solid torus. Functional integral over
this manifold represents a state in the Hilbert space
ie ie specified by the mass and angular momentum. It is
A¼ þ! ; A¼ !
‘ ‘ the microcanonical ensemble partition function and
Then the Euclidean gravity action can be written its logarithm is the entropy of the black hole.
as
in terms of two CS actions, SCS [A] and SCS [A], To evaluate this partition function, the connection
1-form is kept at a constant value on the toroidal
S ¼ kSCS ½A kSCS ½A ½14 boundary through a gauge transformation. We
define local coordinatesR on the R torus boundary black hole mass and zero angular momentum in
z = x þ
y such that a dz = 1, b dz =
, where saddle-point approximation. The computation yields
a (b) stands for the contractible (noncontractible) (Govindarajan et al. 2001):
cycle of solid torus and
=
1 þ i
2 is the modular rffiffiffiffiffiffiffiffiffiffiffiffi
i u
~ i u This gives not only the leading Bekenstein–Hawking
A¼ dz þ dz T 3 ½15 behavior of the black hole entropy S but also a
2
2
subleading logarithmic term:
where u and u ~ are canonically conjugate with 2rþ 3 2rþ
commutation relation: [~ u, u] = (2=)
2 (k þ 2)1 . S ¼ ln ZBH ¼ ln þ
4G 2 4G
These are related to black hole parameters
through holonomies of gauge field A around the This is an interesting application of CS theory to
a- and b-cycles (for a classical black hole solution 3D gravity. In fact, three-dimensional CS theory also
= 2): has applications in the study of black holes in four-
dimensional gravity: the boundary degrees of free-
Kaul RK and Ramadevi P (2001) Three-manifold invariants from Chern–Simons field theory with arbitrary semi-simple gauge groups. Communications in Mathematical Physics 217: 295–314.
Schwarz AS (1978) The partition function of degenerate quadratic functional and Ray–Singer invariants. Letters in Mathematical Physics 2: 247–252.
Schwarz AS (1979) The partition function of a degenerate functional. Communications in Mathematical Physics 67: 1–16.
Schwarz AS (1987) New Topological Invariants in the Theory of Quantized Fields. Abstracts in the Proceedings of International Topological Conference, Baku, Part II.
Witten E (1988) Topological quantum field theory. Communications in Mathematical Physics 117: 353–386.
Witten E (1989) Quantum field theory and the Jones polynomial. Communications in Mathematical Physics 121: 351–399.
Witten E (1995) Chern–Simons gauge theory as a string theory. Progress in Mathematics 133: 637–678.
Seiberg–Witten Theory
Siye Wu, University of Colorado, Boulder, CO, USA
© 2006 Elsevier Ltd. All rights reserved.
N = 1 Gauge Theory and Seiberg Dualities
N = 1 Yang–Mills Theory and QCD
Let G be a compact Lie group and let P be a principal
Introduction G-bundle over the Minkowski space R3, 1 . In pure
gauge theory, the dynamical variable is a connection A
Gauge theory is the cornerstone of the standard in P; two connections are equivalent if they are related
model of elementary particles. The original motiva- by a gauge transformation. Let F 2 2 (R3, 1 , ad P) be
tion for studying supersymmetric gauge theories was phenomenological (such as the hierarchy problem). They display a large number of interesting phenomena and become the models for the dynamics of strongly coupled field theories. They also offer valuable insights to nonsupersymmetric models. In N = 1 gauge theory, the low-energy effective superpotential is holomorphic both in the superfields and in the coupling constants. This powerful holomorphy principle, together with symmetry and various limits, often determines the effective superpotential completely. Such theories often have quantum moduli spaces where the classical singularities are smoothed out, continuous interpolation between Higgs and confinement phases, massless composite mesons and baryons, and dual theories weakly coupled at low energy. For N = 2 pure gauge theory, the low-energy effective theory is an abelian gauge theory in which both the kinetic term and the coupling constant are determined by a holomorphic prepotential. The electric–magnetic duality is in the ambiguity of the low-energy description. Much physical information, such as the coupling constant, the Kähler metric on the quantum moduli, the monodromy around the singularities, can be incorporated in a family of elliptic curves. This low-energy exact solution is also useful to topological field theory that can be obtained from the N = 2 theory by twisting. Much of the above was the work of Seiberg and Witten in the mid-1990s. In this article, we review some of the fascinating aspects of N = 1 and N = 2 supersymmetric gauge theories.

the curvature of A. It decomposes into the self-dual and anti-self-dual parts, that is, $F = F^+ + F^-$, where $F^\pm = \tfrac12(F \mp \sqrt{-1}\,{*F})$. With a suitably normalized nondegenerate bilinear form $\langle\cdot,\cdot\rangle$ on the Lie algebra $\mathfrak g$, the classical action is

$$S_{YM}[A] = \int_{R^{3,1}} \Big( -\frac{1}{2g^2}\langle F\wedge {*F}\rangle + \frac{\theta}{16\pi^2}\langle F\wedge F\rangle \Big) = \int_{R^{3,1}} \Big( \frac{\bar\tau}{8\pi}\langle F^+\wedge F^+\rangle + \frac{\tau}{8\pi}\langle F^-\wedge F^-\rangle \Big)$$

Here $g > 0$ is the coupling constant and $\theta \in R$, the $\theta$-angle, and

$$\tau = \frac{\theta}{2\pi} + \frac{4\pi\sqrt{-1}}{g^2}$$

is a complex number in the upper-half plane that incorporates both. Classically, the theory is conformally invariant and the dynamics is independent of the $\theta$-term. At the quantum level, $\theta$ (mod $2\pi$) appears in the path integral and parametrizes inequivalent vacua. The coupling constant runs as energy $\mu$ varies, satisfying the renormalization group equation

$$\mu\frac{dg}{d\mu} = -\frac{b_0}{(4\pi)^2}\,g^3 + O(g^5)$$

where the right-hand side is called the $\beta$-function $\beta(g)$. This introduces, when $b_0 \neq 0$, a mass scale $\Lambda$ given by

$$(\Lambda/\mu)^{b_0} = e^{-8\pi^2/g(\mu)^2}$$
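The relation between the running coupling and the mass scale can be checked numerically. The sketch below is only an illustration, assuming the one-loop relations $\mu\,dg/d\mu = -b_0 g^3/(4\pi)^2$ and $(\Lambda/\mu)^{b_0} = e^{-8\pi^2/g(\mu)^2}$ with $b_0 > 0$; the function names and the sample value of `b0` are hypothetical choices made for the example.

```python
import math


def one_loop_coupling_sq(mu, lam, b0):
    """g(mu)^2 solving (lam/mu)**b0 == exp(-8*pi**2 / g**2)."""
    if not (mu > lam > 0 and b0 > 0):
        raise ValueError("need mu > lam > 0 and b0 > 0 for a positive g^2")
    return 8 * math.pi ** 2 / (b0 * math.log(mu / lam))


def scale_from_coupling(mu, g_sq, b0):
    """Mass scale Lambda reconstructed from g(mu)^2 at the reference energy mu."""
    return mu * math.exp(-8 * math.pi ** 2 / (b0 * g_sq))


def beta_one_loop(g, b0):
    """One-loop right-hand side of mu dg/dmu."""
    return -b0 * g ** 3 / (4 * math.pi) ** 2


if __name__ == "__main__":
    b0, lam = 9.0, 1.0  # illustrative values only
    for mu in (10.0, 100.0, 1000.0):
        g_sq = one_loop_coupling_sq(mu, lam, b0)
        print(mu, g_sq, scale_from_coupling(mu, g_sq, b0))
```

The printed coupling decreases as `mu` grows, while the reconstructed scale stays fixed: the scale is independent of the reference energy at which the coupling is measured.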
This holds up to one-loop. Consequently, the classical scale invariance is lost. It is convenient to redefine $\Lambda$ as a complex quantity such that

$$(\Lambda/\mu)^{b_0} = e^{2\pi\sqrt{-1}\,\tau(\mu)}$$

For pure gauge theory, $b_0 = (11/3)h$, where $h$ is the dual Coxeter number of $\mathfrak g$. At high energy ($\mu \to \infty$), the coupling becomes weak ($g \to 0$); this is known as asymptotic freedom. On the contrary, the interaction becomes strong at low energy. It is believed that the theory exhibits confinement and has a mass gap.

QCD, or quantum chromodynamics, is gauge theory coupled to matter fields. Suppose the boson $\phi$ and the fermion $\psi$ are in the (complex) representations $R_b$ and $R_f$ of G, respectively. That is, $\phi \in \Gamma(P \times_G R_b)$, or $\phi$ is a section of the bundle $P \times_G R_b$, and $\psi \in \Gamma(S \otimes (P \times_G R_f))$, where S is the spinor bundle over $R^{3,1}$. The classical action is

$$S_{QCD}[A,\phi,\psi] = S_{YM}[A] + \frac{1}{g^2}\int d^4x \Big( \tfrac12|\nabla\phi|^2 + \sqrt{-1}\,(\bar\psi, \nabla\!\!\!\!/\,\psi) + \cdots \Big)$$

where $\nabla$ is the covariant derivative, $\nabla\!\!\!\!/$ is the Dirac operator coupled to A, and we have omitted possible mass and potential terms. The quantum theory depends sensitively on the representations $R_b$ and $R_f$. In the $\beta$-function, we have

$$b_0 = \tfrac{11}{3}h - \tfrac16\,\iota(R_b) - \tfrac23\,\iota(R_f)$$

where $\iota(R)$ is the Dynkin index of a representation R. If $b_0 < 0$, the theory is free in the infrared but strongly interacting in the ultraviolet. If $b_0 > 0$, the converse is true; in particular, the theory exhibits asymptotic freedom. If $b_0 = 0$, the situation depends on the sign of the two or higher-loop contributions.

Pure N = 1 supersymmetric gauge theory is one on the superspace $R^{3,1|(2,2)}$ with a constraint that the curvature vanishes in the odd directions. The dynamical variables are in the superfield strength W, a 1|(1, 0)-form valued in ad P. In components, the theory is gauge field coupled to a Majorana or Weyl fermion in the adjoint representation. Let $S^\pm$ be spinor bundles of positive (negative) chiralities, respectively, and let $\lambda$ be a section of $S^+ \otimes \mathrm{ad}\,P$. The action, written both in superspace and in ordinary

Since $b_0 = 3h$, the theory is asymptotically free but strongly coupled at low energy. Classically, the theory has a $U(1)_R$ chiral symmetry. However, due to anomaly, only the subgroup $Z_{2h}$ survives at the quantum level. Instanton effect yields gaugino condensation $\langle\lambda\lambda\rangle \sim \Lambda^3$. The symmetry is thus further broken to $Z_2$, resulting in h inequivalent vacua.

The N = 1 QCD has additional chiral superfields $\Phi$ in a representation R, including the bosons $\phi \in \Gamma(P \times_G R)$ and the fermions $\psi \in \Gamma(S^+ \otimes (P \times_G R))$. In the absence of superpotential, the action is

$$S^{N=1}_{QCD}[A,\lambda,\phi,\psi] = S^{N=1}_{SYM}[A,\lambda] + \frac{1}{g^2}\int d^4x\, d^2\theta\, d^2\bar\theta\; \tfrac12|\Phi|^2$$

In components, the second term is

$$\frac{1}{g^2}\int d^4x \Big( \tfrac12|\nabla\phi|^2 + \sqrt{-1}\,(\bar\psi, \nabla\!\!\!\!/^{\,+}\psi) - \tfrac12|D(\phi)|^2 + \cdots \Big)$$

where $D : R \to \mathfrak g^*$ is the moment map of the Hamiltonian G-action on R, and we have omitted other terms containing fermionic fields. The moduli space of classical vacua is the symplectic quotient $D^{-1}(0)/G = R/\!/G$. It is the same as the Kähler quotient $R^s/G^C$, where the stable subset $R^s = \{\phi \in R \mid G^C\cdot\phi \cap D^{-1}(0) \neq \emptyset\}$ is open and dense in R. Again, the quantum theory depends on the representation R. Since $b_0 = 3h - (1/2)\iota(R)$, the theory is asymptotically free, infrared free, scale invariant (to one-loop) when $\iota(R) < 6h$, $\iota(R) > 6h$, $\iota(R) = 6h$, respectively. The moduli space may be lifted by a superpotential or modified by other quantum effects.

SU(Nc) Theories at Low Energy

We now consider N = 1 QCD with $G = SU(N_c)$; $N_c$ is the number of colors. The matter field consists of $N_f$ copies of quarks $Q_i$ ($1 \le i \le N_f$) in the fundamental representation of $SU(N_c)$ and $N_f$ copies of antiquarks $Q'_{i'}$ ($1 \le i' \le N_f$) in the conjugate representation. Using the isomorphism of $su(N_c)$ with its dual, the moment map is

$$D(Q, Q') = \text{traceless part of } \sqrt{-1}\,(QQ^\dagger - Q'Q'^\dagger)$$

So $(Q, Q') \in D^{-1}(0)$ if and only if $QQ^\dagger - Q'Q'^\dagger = cI_{N_c}$
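The vacuum condition above (vanishing traceless part of $QQ^\dagger - Q'Q'^\dagger$) is easy to test on explicit configurations. The following sketch assumes, purely for illustration, that $Q$ and $Q'$ are stored as $N_c \times N_f$ complex matrices given as nested lists; all helper names are hypothetical, and the overall factor $\sqrt{-1}$ is dropped because it does not affect whether the result vanishes.

```python
def dagger(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def traceless_part(h):
    """Subtract trace/n times the identity from a square matrix."""
    n = len(h)
    mean = sum(h[i][i] for i in range(n)) / n
    return [[h[i][j] - (mean if i == j else 0) for j in range(n)] for i in range(n)]


def d_term(q, q_prime):
    """Traceless part of Q Q^dagger - Q' Q'^dagger (factor sqrt(-1) omitted)."""
    a = matmul(q, dagger(q))
    b = matmul(q_prime, dagger(q_prime))
    diff = [[a[i][j] - b[i][j] for j in range(len(a))] for i in range(len(a))]
    return traceless_part(diff)


def diagonal_vev(values, n_f):
    """N_c x N_f matrix with the given values on the diagonal, zeros elsewhere."""
    return [[complex(values[i]) if i == j else 0j for j in range(n_f)]
            for i in range(len(values))]


if __name__ == "__main__":
    c = 2.0
    a_prime = [1.0, 2.0, 0.5]
    a = [(x * x + c) ** 0.5 for x in a_prime]  # enforces a_k^2 - a'_k^2 = c
    d = d_term(diagonal_vev(a, 4), diagonal_vev(a_prime, 4))
    print(max(abs(entry) for row in d for entry in row))
```

Diagonal expectation values whose squared entries differ by a common constant $c$ give a vanishing result up to rounding, whereas a generic choice does not.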
for some $a_k \ge 0$. Generically, these $a_k > 0$ and the gauge group $SU(N_c)$ is broken to $SU(N_c - N_f)$. If $N_f \ge N_c$, then

$$Q \sim \begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_{N_c} \end{pmatrix}, \qquad Q' \sim \begin{pmatrix} a'_1 & & \\ & \ddots & \\ & & a'_{N_c} \end{pmatrix}$$

where $a_k, a'_k \ge 0$ satisfy $a_k^2 - a_k'^2 = c$ for some $c \in R$. The gauge group is completely broken. The low-energy superfields are the mesons $M_{ii'} = Q_i \cdot Q'_{i'}$ and, if $N_f \ge N_c$, the baryons

$$B^{i_{N_c+1}\cdots i_{N_f}} = \frac{1}{N_c!}\,\epsilon^{i_1\cdots i_{N_f}}\, Q_{i_1}\cdots Q_{i_{N_c}}, \qquad B'^{\,i'_{N_c+1}\cdots i'_{N_f}} = \frac{1}{N_c!}\,\epsilon^{i'_1\cdots i'_{N_f}}\, Q'_{i'_1}\cdots Q'_{i'_{N_c}}$$

When $N_f < N_c$, Affleck et al. (1984) found a dynamically generated superpotential

$$W_{\rm eff}(M) = (N_c - N_f)\left(\frac{\Lambda^{3N_c-N_f}}{\det M}\right)^{1/(N_c-N_f)}$$

generated by instanton effect when $N_f = N_c - 1$ and by gaugino condensation in the unbroken $SU(N_c - N_f)$ theory when $N_f < N_c - 1$. It is also the unique superpotential (up to a multiplicative constant) that is consistent with the global symmetry and supersymmetry. The potential pushes the vacuum to infinity. Therefore, contrary to the classical picture, theories with $N_f < N_c$ do not have a vacuum at the quantum level.

When $N_f \ge 3N_c$, the theory is not strongly interacting at low energy, and perturbation methods are reliable. (When $N_f = 3N_c$, the two-loop contribution to the $\beta$-function is negative.) We now look at the range $N_c \le N_f < 3N_c$. The cases $N_f = N_c,\ N_c + 1$ and $N_c + 2 \le N_f < 3N_c$ were studied in Seiberg (1994) and Seiberg (1995), respectively.

When $N_f = N_c$, the classical moduli space is $\det M = BB'$. The quantum theory at low energy consists of the fields M, B, B' satisfying the constraint $\det M - BB' = \Lambda^{2N_c}$. The quantum moduli space is smooth everywhere, and there are no additional massless particles. So the gluons are heavy throughout the moduli space. This is due to confinement near the origin, where the interaction is strong, and due to the Higgs mechanism far out in the flat direction, where the classical picture is a good approximation. We see a smooth transition between these two effects.

When $N_f = N_c + 1$, there is a dynamically generated superpotential

$$W_{\rm eff} = \frac{1}{\Lambda^{2N_c-1}}\,(B'MB - \det M)$$

The stationary points of $W_{\rm eff}$ are at $BB' - \wedge^{N_c}M = 0$, $BM = 0$, $MB' = 0$; these are precisely the constraints that the classical configuration satisfies. However, the moduli space is interpreted differently: it is embedded into a larger space, and the constraints are satisfied only at stationary points. At the singularity $\langle M\rangle = 0$, the whole global symmetry group is unbroken, and B, B' are the new massless fields resolving the singularity. So we have a continuous transition between confinement (without chiral symmetry breaking) and the Higgs mechanism in the semiclassical regime.

When $N_c + 2 \le N_f \le (3/2)N_c$, the original theory, called the electric theory, is still strongly coupled in the infrared. Seiberg (1995) proposed that there is a dual, magnetic theory, which is infrared free. The two theories are different classically, but are equivalent at the quantum level. The dual theory is an N = 1 $SU(\tilde N_c)$ gauge theory with $\tilde N_c = N_f - N_c$, coupled to dual quarks $\tilde Q_i$, $\tilde Q'_{i'}$, where $1 \le i, i' \le N_f$ are flavor indices. In addition, the mesons $M_{ii'}$ become fundamental fields. They are not coupled to the $SU(\tilde N_c)$ gauge field but interact with the dual quarks through the superpotential

$$W = \mu^{-1} M_{ii'}\, \tilde Q_i \cdot \tilde Q'_{i'}$$

The two theories have the same global symmetry and the same gauge-invariant operators. The dual quarks are fundamental in the magnetic theory but are solitonic excitations in the electric theory. At high energy, the electric theory is asymptotically free, while the magnetic theory is strongly coupled. At low energies, the converse is true. Therefore, reliable perturbative calculations can be performed by choosing an appropriate weakly coupled theory.

When $(3/2)N_c < N_f < 3N_c$, the theory has a nontrivial infrared fixed point. This is because up to two-loop,

$$\beta(g) = -\frac{g^3}{16\pi^2}\,(3N_c - N_f) + \frac{g^5}{128\pi^4}\left(2N_cN_f - 3N_c^2 - \frac{N_f}{N_c}\right) + O(g^7)$$

There is a solution $g_* > 0$ to $\beta(g) = 0$. We have $\beta(g) < 0$ when $0 < g < g_*$, $\beta(g) > 0$ when $g > g_*$. In the infrared limit, the coupling constant flows to $g = g_*$, where we have a nontrivial, interacting superconformal theory in four dimensions. The conformal dimension becomes anomalous and is equal to 3/2 of the charge of the chiral $U(1)_R$; for example, that of the meson $\mu^{-1}M$ is $3(N_f - N_c)/N_f > 1$ in this range.
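The fixed-point statement can be made concrete with a few lines of arithmetic. The sketch below assumes the two-loop expression $\beta(g) = -\frac{g^3}{16\pi^2}(3N_c - N_f) + \frac{g^5}{128\pi^4}\big(2N_cN_f - 3N_c^2 - N_f/N_c\big)$ and the meson dimension $3(N_f - N_c)/N_f$; the function names are hypothetical. The two-loop zero is only a perturbatively trustworthy estimate when $3N_c - N_f$ is small compared with $N_c$, so the numbers are indicative rather than exact.

```python
import math
from fractions import Fraction


def beta_two_loop(g, n_c, n_f):
    """Two-loop beta function of SU(n_c) N = 1 QCD with n_f flavors."""
    one_loop = -(g ** 3) / (16 * math.pi ** 2) * (3 * n_c - n_f)
    two_loop = (g ** 5) / (128 * math.pi ** 4) * (2 * n_c * n_f - 3 * n_c ** 2 - n_f / n_c)
    return one_loop + two_loop


def fixed_point_coupling_sq(n_c, n_f):
    """g_*^2 with beta_two_loop(g_*) = 0, or None if no positive solution exists."""
    b = 3 * n_c - n_f
    c2 = 2 * n_c * n_f - 3 * n_c ** 2 - n_f / n_c
    if b <= 0 or c2 <= 0:
        return None
    return 8 * math.pi ** 2 * b / c2


def meson_dimension(n_c, n_f):
    """Conformal dimension 3(n_f - n_c)/n_f of the meson at the fixed point."""
    return Fraction(3 * (n_f - n_c), n_f)


if __name__ == "__main__":
    n_c, n_f = 3, 8
    g_star = math.sqrt(fixed_point_coupling_sq(n_c, n_f))
    print(g_star, beta_two_loop(g_star, n_c, n_f), meson_dimension(n_c, n_f))
```

For $N_c = 3$, $N_f = 8$ the beta function changes sign from negative to positive at the computed coupling, and the meson dimension is 15/8, which lies above 1 as stated.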
Other Classical Gauge Groups

We now consider N = 1 supersymmetric gauge theory and QCD with gauge groups $Sp(N_c)$ and $SO(N_c)$. The $Sp(N_c)$ theories, studied by Intriligator and Pouliot (1995), are the simplest examples of the N = 1 theories. We take $2N_f$ chiral superfields $Q_i$ ($i = 1, \ldots, 2N_f$) in the fundamental representation $C^{2N_c} \cong H^{N_c}$ of $Sp(N_c)$. The number of copies must be even so that the quantum theory is free from global gauge anomaly. The gauge-invariant quantities are the mesons $M_{ij} = Q^a_i Q^b_j \omega_{ab}$, where $\omega$ is the symplectic form on $C^{2N_c}$, subject to a constraint

form on $C^{2\tilde N_c}$. When $(3/2)(N_c + 1) < N_f < 3(N_c + 1)$, the theory flows to an interacting superconformal field theory in the infrared.

Theories with the $SO(N_c)$ gauge group were studied by Seiberg (1995) and by Intriligator and Seiberg (1995). Since the fundamental representation is real, there is no constraint on the number $N_f$ of quarks $Q_i$ ($1 \le i \le N_f$). The gauge invariants are the mesons $M_{ij} = Q^a_i Q^b_j \delta_{ab}$ and, if $N_f \ge N_c$, the baryons $B^{i_{N_c+1}\cdots i_{N_f}} = \epsilon^{i_1\cdots i_{N_f}} Q_{i_1}\cdots Q_{i_{N_c}}/N_c!$. They satisfy $\operatorname{rank} M \le N_c$ and $BB = \wedge^{N_c}M$. Using the decomposition $u(N_c) = so(N_c) \oplus \sqrt{-1}\,\{R\text{-self-adjoint matrices}\}$, the moment map D(Q) is the projection
all the massless fermions are in M, even at the origin of the moduli space. Hence the quarks are confined.

When $N_f = N_c - 3$, the unbroken gauge group is SO(3) and the theory has two branches with

$$W_{\rm eff} = 4(1 + \epsilon)\,\frac{\Lambda^{2N_c-3}}{\det M}$$

where $\epsilon = \pm 1$. For $\epsilon = 1$, the quantum theory has no vacuum. For $\epsilon = -1$, $W_{\rm eff} = 0$, but there are additional light fields $\tilde Q_i$ coupling to M via the superpotential $W \sim (2\mu)^{-1} M_{ij}\,\tilde Q_i \cdot \tilde Q_j$ near $M = 0$.

When $N_f = N_c - 2$, the low-energy theory is related to the N = 2 gauge theory and will be addressed in the subsection "Seiberg–Witten's low-energy solution."

When $N_f \ge N_c - 1$, we define a dual, magnetic theory whose gauge group is $SO(\tilde N_c)$, where $\tilde N_c = N_f - N_c + 4$. There are $N_f$ dual quarks $\tilde Q_i$ ($1 \le i \le N_f$) in the fundamental representation. This theory is infrared free if $N_f \le (3/2)(N_c - 2)$. In the effective theory, the mesons $M_{ij}$ become fundamental and couple with the dual quarks through a superpotential $W = (2\mu)^{-1} M_{ij}\,\tilde Q_i \cdot \tilde Q_j$ if $N_f \ge N_c$; there is an additional term $-\det M/64\Lambda^{2N_c-5}$ if $N_f = N_c - 1$. When $(3/2)(N_c - 2) < N_f < 3(N_c - 2)$, the theory flows to an interacting superconformal field theory in the infrared.

N = 2 Gauge Theory and Seiberg–Witten Duality

N = 2 Yang–Mills Theory

Pure N = 2 supersymmetric gauge theory is a special case of N = 1 QCD when $R = \mathfrak g^C$ is the (complexified) adjoint representation of G. The moment map is $D(\phi) = (1/2\sqrt{-1})[\phi, \bar\phi] \in \mathfrak g \cong \mathfrak g^*$ ($\phi \in \mathfrak g^C$). Since the fermionic fields $\lambda$ and $\psi$ are sections of the same bundle, there is a second set of supersymmetry transformations by interchanging the roles of $\lambda$ and $\psi$. This makes the theory N = 2 supersymmetric. The classical action is

$$S^{N=2}_{SYM}[A,\lambda,\phi,\psi] = S_{YM}[A] + \frac{1}{g^2}\int d^4x \Big( \sqrt{-1}\big(\langle\bar\lambda, \nabla\!\!\!\!/\,\lambda\rangle + \langle\bar\psi, \nabla\!\!\!\!/\,\psi\rangle\big) + \tfrac12|\nabla\phi|^2 + \sqrt{-1}\big(\langle\bar\phi, [\lambda, \psi]\rangle + \langle\phi, [\bar\lambda, \bar\psi]\rangle\big) - \tfrac18\big|[\phi, \bar\phi]\big|^2 \Big)$$

The energy reaches the minimum when $\phi$ takes a constant value $\phi \in \mathfrak g^C$ that can be conjugated by G to the Cartan subalgebra $\mathfrak t^C$. ($\mathfrak t$ is the Lie algebra of the maximal torus T.) The classical moduli space is $\mathfrak g^C/G^C = \mathfrak t^C/W$, where W is the Weyl group. At a generic $\phi \in \mathfrak t^C$, the gauge group is broken to T by the Higgs mechanism. Classically, the massless degrees of freedom are excitations of $\phi$ and components of the gauge field in $\mathfrak t$. So the low-energy physics can be described by these massless fields. However, the moduli space is singular when $\phi$ is on the walls of the Weyl chambers. At these values, the unbroken gauge group is larger and there are extra massless fields that resolve the singularities.

Since $b_0 = 2h > 0$, the quantum theory is asymptotically free but strongly interacting at low energy. It can be shown that N = 1 supersymmetry already forbids a dynamically generated superpotential on $\mathfrak t^C/W$. Therefore, the vacuum degeneracy is not lifted and the quantum moduli space is still a continuum. However, there are corrections to the part of classical moduli space where strong interactions occur. The quantum theory has a dynamically generated mass scale $\Lambda$. We pick the renormalization scale $\mu$ to be $|\phi|$, the typical energy scale where spontaneous symmetry breaking occurs. Far away from the origin, that is, when $|\phi| \gg |\Lambda|$, the theory is weakly interacting and the classical description of the moduli space is a good approximation. However, when $|\phi|$ is comparable to $|\Lambda|$, the classical language and perturbation methods fail due to strong interaction. At $\phi = 0$, the full gauge symmetry is restored classically. But since the theory becomes strongly interacting at low energy, it cannot be the low-energy solution of the original theory.

The classical $U(1)_R$ symmetry extends to $U(2)_R$, mixing $\lambda$ and $\psi$. The $U(1)_R$ subgroup in $U(2)_R$ is anomalous except for a subgroup $Z_{4h}$. So we have a global $SU(2)_R \times_{Z_2} Z_{4h}$ symmetry at the quantum level. This is consistent with a continuous moduli space of vacua, if the group $SU(2)_R$ is to act nontrivially. Also, the space is not a single orbit of the global symmetry group. The generator of $Z_{4h}$ acts on $\mathfrak t^C$ by a phase $e^{\pi\sqrt{-1}/h}$. The group $Z_{4h}$ is spontaneously broken to the subgroup which acts trivially on $\mathfrak t^C/W$.

We study the general form of low-energy effective Lagrangian that is consistent with N = 2 supersymmetry. We assume that the quantum effect does not modify the topology of the moduli space $\mathfrak t^C/W$, though it may alter the singularity and its nature. Suppose U is the quantum moduli. At a generic point in U, the residual gauge group is T. In the N = 1 language, the theory is a supersymmetric gauged sigma model with target space U. It contains N = 1 vector multiplets $W^I$ and chiral multiplets $\Phi^I$, where $1 \le I \le r$, $r = \dim T$ being the rank of G. N = 1 supersymmetry requires that U is Kähler, with
possible singularities where the effective theory breaks down. N = 2 supersymmetry requires further that U is special Kähler, that is, there is a flat, torsion-free connection $\nabla$ on TU such that the Kähler form $\omega$ is parallel and such that $d_\nabla J = 0$, where the complex structure J is viewed as a 1-form valued in TU. See, for example, Freed (1999). Locally, there is a holomorphic prepotential $\mathcal F$ and special coordinates $\{z^I\}$. Let $\tilde z_I = \partial\mathcal F/\partial z^I$ be the dual coordinates and let $\tau_{IJ} = \partial^2\mathcal F/\partial z^I\partial z^J = \partial\tilde z_I/\partial z^J$. Then $K = \mathrm{Im}(\tilde z_I \bar z^I)$ is a Kähler potential and $\omega = (\sqrt{-1}/2)\,\mathrm{Im}(\tau_{IJ})\,dz^I \wedge d\bar z^J$ is the Kähler form. The effective action is

$$S^{N=2}_{\rm eff}[W,\Phi] = \frac{1}{4\pi}\,\mathrm{Im}\int d^4x\,d^2\theta\;\tfrac12\tau_{IJ}(\Phi)(W^I, W^J) + \int d^4x\,d^2\theta\,d^2\bar\theta\;K(\Phi)$$

Note that both the coupling constants $\tau_{IJ}$ and the metric $\mathrm{Im}\,\tau_{IJ}$ on U are determined by a holomorphic function $\mathcal F$, which is the hallmark of N = 2 supersymmetry.

In the bare theory with abelian gauge group T, the action is given by choosing $\mathcal F_0(\Phi) = (1/2)\tau_{IJ}\langle\Phi^I, \Phi^J\rangle$, where the $\tau_{IJ}$ (and hence the metric $\mathrm{Im}\,\tau_{IJ}$) are constants. Due to one-loop and instanton effects, $\mathcal F$ is no longer quadratic in the effective theory. Since $\tau$ varies on U, it cannot at the same time be holomorphic (except at a few singular points), be single valued, and have a positive-definite imaginary part. The solution to this apparent contradiction is that each set of special coordinates and the expression of $\mathcal F$ is valid only in part of U. Solving the N = 2 gauge theory at low energy means understanding the singularity of U in the strong coupling regime and

proposed that this is so for the low-energy effective theory of the N = 2 gauge theory. An SL(2, Z) transformation maps one description of the low-energy theory to another, exchanging electricity and magnetism. It is however not an exact duality of the full SU(2) theory. Rather, duality is in the ambiguity of the choice of the low-energy description. More precisely, $\tau$ is a section of a flat SL(2, Z) bundle over U. Thus, $\tau$ is multivalued and exists as a function in local charts only. So we must use different Lagrangians in different regions of the u-plane. Around the singularities where $\tau$ is not defined, nontrivial monodromy can appear.

Away from infinity, the electric theory is strongly interacting but the magnetic theory is infrared free. The dual field is $\tilde a = d\mathcal F(a)/da$, and $\tau_{\rm eff}(u) = d\tilde a/da$. The group SL(2, Z) is generated by

$$P = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$

To see its action on $\binom{\tilde a}{a}$, we use the central extension of the N = 2 super-Poincaré algebra. In the classical theory, the central charge is $Z = (n_e + \tau n_m)a$ from the boundary terms at infinity. As the electric–magnetic duality transformation S interchanges $n_e$ and $n_m$, we have for any $\gamma \in SL(2, Z)$, $\gamma : (n_m, n_e) \mapsto (n_m, n_e)\gamma^{-1}$. When $n_m = 0$, the classical formula $Z = n_e a$ is valid. Invariance of Z under SL(2, Z) requires that $Z = n_m\tilde a + n_e a$ at the quantum level and that SL(2, Z) acts on $\binom{\tilde a}{a}$ homogeneously as a column vector.

When $u = (1/2)a^2$ is large, perturbation is reliable. The classical and one-loop results are a(u)
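The SL(2, Z) action on charges and periods described above lends itself to a quick integer-arithmetic check. The sketch below assumes the sign convention $S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $P = -I$; the helper names are hypothetical. It evaluates conjugates of $T^2$ of the kind that appear as monodromies in this article and confirms that $Z = n_m\tilde a + n_e a$ is unchanged when the charge row vector is multiplied by $\gamma^{-1}$ and the period column vector by $\gamma$.

```python
IDENTITY = ((1, 0), (0, 1))
P = ((-1, 0), (0, -1))
S = ((0, 1), (-1, 0))   # assumed sign convention
T = ((1, 1), (0, 1))


def mat_mul(a, b):
    """Product of two 2x2 integer matrices stored as tuples of rows."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))


def mat_inv(a):
    """Inverse of a determinant-one 2x2 integer matrix."""
    (p, q), (r, s) = a
    if p * s - q * r != 1:
        raise ValueError("matrix is not in SL(2, Z)")
    return ((s, -q), (-r, p))


def mat_pow(a, n):
    """Integer power, negative exponents allowed."""
    base = a if n >= 0 else mat_inv(a)
    result = IDENTITY
    for _ in range(abs(n)):
        result = mat_mul(result, base)
    return result


def conjugate(g, m):
    """g m g^{-1}."""
    return mat_mul(mat_mul(g, m), mat_inv(g))


def act_on_charges(charges, gamma):
    """Row vector (n_m, n_e) multiplied on the right by gamma^{-1}."""
    n_m, n_e = charges
    inv = mat_inv(gamma)
    return (n_m * inv[0][0] + n_e * inv[1][0], n_m * inv[0][1] + n_e * inv[1][1])


def act_on_periods(periods, gamma):
    """Column vector (a_dual, a) multiplied on the left by gamma."""
    a_dual, a = periods
    return (gamma[0][0] * a_dual + gamma[0][1] * a, gamma[1][0] * a_dual + gamma[1][1] * a)


def central_charge(charges, periods):
    """Z = n_m * a_dual + n_e * a."""
    return charges[0] * periods[0] + charges[1] * periods[1]


if __name__ == "__main__":
    t2 = mat_pow(T, 2)
    print(conjugate(S, t2), conjugate(mat_mul(T, S), t2), mat_mul(P, mat_pow(T, -2)))
```

With this convention the product of the two conjugates equals $PT^{-2}$, and both conjugates are congruent to the identity modulo 2.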
particles are collective excitations in the perturbative regime. Suppose along a path connecting $u_0$ and some base point near infinity, a monopole of charges $(-1, n_e) = (0, 1)(T^{n_e}S^{-1})^{-1}$ becomes massless at $u_0$. Then by the renormalization group analysis and duality, the monodromy at $u_0$ is $M_{u_0} = (T^{n_e}S^{-1})\,T^2\,(T^{n_e}S^{-1})^{-1}$. It turns out that there are two singularities $u = \pm\Lambda^2$ with monodromies $M_{\Lambda^2} = ST^2S^{-1}$ and $M_{-\Lambda^2} = (TS)T^2(TS)^{-1}$. The particles that become massless at $\pm\Lambda^2$ are of charges $(n_m, n_e) = (1, 0)$ and $(1, -1)$, respectively. The only BPS states in the strong coupling regime are those which become massless at the singularities; the others decay as u deforms towards strong interaction.

The monodromies $M_{\pm\Lambda^2}$, $M_\infty$ (or any two of them) generate the subgroup $\Gamma(2)$. The family of elliptic curves with these monodromies can be identified with $y^2 = (x - \Lambda^2)(x + \Lambda^2)(x - u)$, called the Seiberg–Witten curve. The singularities are at $u = \pm\Lambda^2$ and $u = \infty$, where the curve degenerates. Let

$$\lambda = \frac{\sqrt2}{2\pi}\,\frac{y\,dx}{x^2 - \Lambda^4}$$

be the Seiberg–Witten differential (of second kind on the total space E). Then in a suitable basis $(\alpha, \beta)$ of $H_1(E_u/U, Z)$, we have $a = \int_\alpha\lambda$, $\tilde a = \int_\beta\lambda$. At a singularity, if $\delta = n_m\beta + n_e\alpha$ is a vanishing cycle, then the dyon of charges $(n_m, n_e)$ becomes massless. This is because its central charge is $Z = n_m\tilde a + n_e a = \int_\delta\lambda$. The monodromy at a singularity where $\delta$ is a vanishing cycle is given by the Picard–Lefschetz formula $M : \gamma \mapsto \gamma - 2(\gamma\cdot\delta)\delta$. At $u = \pm\Lambda^2$, the vanishing cycles are $\beta$ and $\beta - \alpha$, respectively.

We return to the N = 1 $SO(N_c)$ gauge theory with $N_f = N_c - 2$. At a generic point in the moduli space, the gauge group is broken to SO(2), which is abelian. Much of the above discussion applies to this case. By N = 1 supersymmetry, the effective coupling $\tau_{\rm eff}$ is holomorphic in M but is not single valued. In fact, $\tau_{\rm eff}$ depends on $u = \det M$, which is invariant under the (anomaly free) $SU(N_f)$ symmetry. For large u, we have $e^{2\pi\sqrt{-1}\,\tau_{\rm eff}} = \Lambda^{4N_c-8}/u^2$ and the monodromy around infinity is $M_\infty = PT^{-2}$. On the other hand, a large expectation value of M of rank $N_c - 3$ breaks the gauge group to SO(3) and the theory is the N = 2 theory discussed earlier. Using these facts, Intriligator and Seiberg (1995) identified the family of elliptic curves as $y^2 = x(x - 16\Lambda^{2N_c-4})(x - u)$. There are two singularities with inequivalent physics. At $u = 0$, the monodromy is $ST^2S^{-1}$. A pair of monopoles $\tilde Q^\pm$ becomes massless. They couple with M through the superpotential $W \sim (2\mu)^{-1}M_{ij}\,\tilde Q^+_i\tilde Q^-_j$. At $u = 16\Lambda^{2N_c-4}$, the monodromy is $(T^2S)T^2(T^2S)^{-1}$. A pair of dyons $E^\pm$ of charges $\pm1$ become massless. The effective action is $W_{\rm eff} \sim (u - 16\Lambda^{2N_c-4})E^+E^-$.

Topological gauge theory is a twisted version of N = 2 Yang–Mills theory in which the observables at high energy are the Donaldson invariants. The work of Seiberg and Witten (1994a, b) yields new insight to it and has a tremendous impact on the geometry of 4-manifolds. See Witten (1994) for the initial steps.

After the work of Seiberg and Witten (1994a, b), there has been much progress on theories with other gauge groups. If the gauge group is a compact Lie group of rank r, the u-plane is replaced by $\mathfrak t^C/W$; the singularities are modified by quantum effects. The duality group is Sp(2r, Z) or its subgroup of finite index, acting on the coupling matrix $\tau = (\tau_{IJ})$ by fractional linear transformations. For example, for $G = SU(N_c)$, the moduli space is parametrized by gauge invariants $u_2, \ldots, u_{N_c}$ defined by $\det(xI - \phi) = x^{N_c} - \sum_{i=2}^{N_c} u_i x^{N_c-i} = P_{N_c}(x, u_i)$. Classically, the singular locus is a simple singularity of type $A_{N_c-1}$. At the quantum level, the singularity consists of two copies of such locus shifted by $\pm\Lambda^{N_c}$ in the $u_{N_c}$ direction. The monodromies correspond to a family of hyperelliptic curves $y^2 = P_{N_c}(x, u_i)^2 - \Lambda^{2N_c}$ of genus $N_c - 1$. The Seiberg–Witten differential is

$$\lambda = \frac{\sqrt2}{\pi\sqrt{-1}}\,\frac{\partial P_{N_c}(x, u_i)}{\partial x}\,\frac{x\,dx}{y} + \partial(\cdots)$$

The $N_c - 1$ independent eigenvalues $a_i$ of $\phi$ and their duals $\tilde a_i = \partial\mathcal F/\partial a_i$ are the periods of $\lambda$ along the $2N_c - 2$ homology cycles in the curve. For more details, the reader is referred to Klemm et al. (1995) and Argyres and Faraggi (1995).

N = 2 QCD

N = 2 supersymmetric QCD is N = 2 Yang–Mills theory coupled to N = 2 matter. The latter consists of N = 1 superfields Q that form a quaternionic representation R of the gauge group G. The space R has a G-invariant hyper-Kähler structure. The hyper-Kähler moment map $\mu_H : R \to \mathfrak g^* \otimes \mathrm{Im}\,H$ consists of a real moment map $\mu_R : R \to \mathfrak g^*$ for the Kähler structure and a complex moment map $\mu_C : R \to (\mathfrak g^*)^C$ for the holomorphic symplectic structure. As an N = 1 theory, the matter superfields are valued in $R \oplus \mathfrak g^C$ with a D-term $D(Q, \phi) = \mu_R(Q) + (1/2\sqrt{-1})[\phi, \bar\phi]$ and a superpotential $W(Q, \phi) = \sqrt2\,\langle\mu_C(Q), \phi\rangle + m(Q)$, where the mass term m is a G-invariant quadratic form on R. The classical moduli space of vacua has two branches. On the Coulomb branch where $Q = 0$ and $\phi \neq 0$, the unbroken gauge group is abelian and the
photons are massless. If $Q \neq 0$ exists in the flat directions, the gauge group is broken according to the value of Q; these are the Higgs branches. If $m = 0$, the moduli space of classical vacua is the hyper-Kähler quotient $\mu_H^{-1}(0)/G$. The branches of two types touch at the origin, where the full gauge group is restored, and at other subvarieties in R. The global symmetry is the subgroup of U(R) that commutes with the G-action on R and preserves m; it contains $U(2)_R$.

Quantum mechanically, such a theory is free from local gauge anomalies. Consistency under large gauge transformations puts a torsion condition on R, such as $\iota(R) = 0 \pmod 2$. Since $b_0 = 2h - (1/2)\iota(R)$, the theory is asymptotically free if $\iota(R) < 4h$. If $\iota(R) = 4h$, the quantum theory is scale invariant up to one-loop (and hence to all loops), and is expected to be so nonperturbatively. If $\iota(R) > 4h$, the quantum theory may not be defined but it can be the low-energy solution of another asymptotically free theory. Due to the axial anomaly, the $U(2)_R$ global symmetry reduces to the subgroup $SU(2)_R \times_{Z_2} Z_{4h-\iota(R)}$. The metric on the Coulomb branch can be corrected by quantum effects, but those on the Higgs branches do not change because of the uniqueness of the hyper-Kähler metric. In the quantum theory, the Higgs branches still touch the Coulomb branch, but the photons of the Coulomb branch are the only massless gauge bosons at the point where they meet.

When $G = SU(N_c)$ we take $N_f$ quarks $Q_i$ ($i = 1, \ldots, N_f$) in the fundamental representation and $N_f$ antiquarks $\tilde Q_i$ ($i = 1, \ldots, N_f$) in the complex-conjugate representation. The moment map is the same as in N = 1 QCD whereas the superpotential is $W = \sqrt2\,\tilde Q_i\phi Q^i + \sum_i m_i\tilde Q_iQ^i$. Consider the case $G = SU(2)$ as in Seiberg and Witten (1994b). Since $b_0 = 4 - N_f$, the asymptotically free theories have $N_f \le 3$ whereas the $N_f = 4$ theory is scale invariant. As the representations on $Q_i$ and $\tilde Q_i$ are isomorphic, the classical global symmetry is $O(2N_f) \times U(2)_R$ when all $m_i = 0$. The appearance of the even number of fundamental representations is necessary for the consistency of the theory at the quantum level. The $U(1)_R$ symmetry is anomalous if $N_f \neq 4$. When $N_f > 0$, $SO(2N_f)$ is anomaly free, whereas $O(2N_f)/SO(2N_f) = Z_2$ is anomalous. The anomaly free subgroup of $Z_2 \times U(1)_R$ is $Z_{4(4-N_f)}$. Its $Z_2$ subgroup acts in the same way as $Z_2 \subset Z(SO(2N_f))$. A nonzero expectation value of $u = \mathrm{tr}\,\phi^2$ further breaks the symmetry to $Z_4$. The quotient group that acts effectively on the u-plane (the Coulomb branch) is $Z_{4-N_f}$ if $N_f > 0$ and $Z_2$ if $N_f = 0$. When $N_f = 4$, the $U(1)_R$ symmetry is anomaly free but $Z_2 = O(8)/SO(8)$ is still anomalous.

The $N_f = 0$ theory is the N = 2 pure gauge theory. In order to compare it to the $N_f > 0$ theories, we multiply $n_e$ by 2 so that it has integer values on $Q_i$ and $\tilde Q_i$, and divide a by 2 to preserve the formula $Z = n_m\tilde a + n_e a$. The monodromies around the singularities become $M_{\Lambda^2} = STS^{-1}$, $M_{-\Lambda^2} = (T^2S)T(T^2S)^{-1}$, $M_\infty = PT^{-4}$. They generate the subgroup $\Gamma_0(4)$ of SL(2, Z). The coupling constant is

$$\tau = \frac{\theta}{\pi} + \frac{8\pi\sqrt{-1}}{g^2}$$

The Seiberg–Witten curve is $y^2 = x^3 - ux^2 + (1/4)\Lambda_0^4x$, related to the earlier one $y^2 = (x - u)(x^2 - \Lambda_0^4)$ by an isogeny. Here and below, $\Lambda_{N_f}$ is the dynamically generated scale.

For $N_f > 0$, we consider the case with zero bare masses. The simplest BPS-saturated states are the elementary quarks with mass $\sqrt2|a|$, which form the vector representation of $SO(2N_f)$. In addition, the quarks have fermion zero modes in the monopole background. When $n_m = 1$, each SU(2) doublet of quarks has one zero mode. With $N_f$ hypermultiplets, there are $2N_f$ zero modes in the vector representation of $SO(2N_f)$. Upon quantization, the quantum states are in the spinor representation. So the flavor symmetry is really $Spin(2N_f)$. The spectrum may also include states with $n_m > 1$. For $N_f = 2, 3, 4$, the centers $Z(Spin(2N_f))$ are $Z_2 \times Z_2$, $Z_4$, $Z_2 \times Z_2$, whose generators act on states of charges $(n_m, n_e)$ by $((-1)^{n_e+n_m}, (-1)^{n_e})$, $\sqrt{-1}^{\,n_m+2n_e}$, $((-1)^{n_m}, (-1)^{n_e})$, respectively.

Suppose at a singularity on the u-plane, the low-energy theory is QED with k hypermultiplets. Let $m_i$ be the bare mass and $S_i$, the U(1) charge of the ith hypermultiplet. With the expectation value of $\phi$, the actual masses are $|\sqrt2a + m_i|$ ($1 \le i \le k$). As the states form a small representation of the N = 2 algebra, the central charge is modified as $Z = n_m\tilde a + n_e a + S\cdot m/\sqrt2$, where $m = (m_1, \ldots, m_k)$ and $S = (S_1, \ldots, S_k)$. Under a duality transformation $M \in SL(2, Z)$, the column vector $(m/\sqrt2, \tilde a, a)$ is multiplied by a matrix of the form $\hat M = \begin{pmatrix} I_k & 0 \\ * & M \end{pmatrix}$. (For example, if $M = T$, $\hat M$ can be derived by one-loop analysis.) So the row vector $W = (S, n_m, n_e)$ transforms as $W \mapsto W\hat M^{-1}$. The transformation on $(n_m, n_e)$ is not homogeneous when there are hypermultiplets. This phenomenon persists even when all the bare masses $m_i$ are zero.

When $N_f = 1$, the global symmetry of the u-plane is $Z_3$. There are three singularities related by this symmetry, where monopoles with charges $(n_m, n_e) = (1, 0)$, $(1, 1)$, and $(1, 2)$ become massless. The low-energy theory at each singularity is QED with a single light hypermultiplet. Besides the photon, no other flat directions exist. This is consistent with the absence of Higgs branch in the original theory. The monodromies at the singularities are $STS^{-1}$,
(TS)T(TS)1 , (T 2 S)T(T 2 S)1 , respectively, and the hypermultiplet has (nm , ne ) = (0, 1) and form the
corresponding Seiberg–Witten family of curves is vector representation v of SO(8). Fermion zero
y2 = x2 (x u) (1=64)61 . The Seiberg–Witten dif- modes give rise to hypermultiplets with
ferential is (nm , ne ) = (1, 0), (1, 1) that transform under the spinor
pffiffiffi representations s, c of Spin(8). SL(2, Z) acts on the
2 y dx spectrum via a homomorphism onto the outer-auto-
¼
4 x2 morphism group S3 of Spin(8), which then permutes v,
s, and c. So duality is mixed in an interesting way with
When Nf = 2, there are two singularities related by the SO(8) triality. In v, s, and c, the center Z2 Z2
the global symmetry Z2 of the u-plane. The massless acts as ((1)nm , (1)ne ) = (1, 1), (1, 1), (1, 1),
states at one singularity have (nm , ne ) = (1, 0) and respectively. The full SL(2, Z) invariance predicts the
form a spinor representation of SO(4) while those at existence of multimonopole bound states: for every
the other have (nm , ne ) = (1, 1) and form the other pair of relatively prime integers (p, q), there are eight
spinor representation. The low-energy theory at each states with (nm , ne ) = (p, q) that form a representation
singularity is QED with two light hypermultiplets. of Spin(8) on which the center acts as ((1)p , (1)q ).
There are additional flat directions along which Solutions when the bare masses are nonzero are
SO(4) SU(2)R is broken. They form the two Higgs also obtained by Seiberg and Witten (1994b). The
branches that touch the u-plane at the two singula- masses can be deformed to relate theories with
rities rather than at the origin. The metric and pattern different values of Nf . N = 2 QCD with a general
of symmetry breaking are the same as classically. classical gauge group has also been studied. By
The monodromies are ST 2 S1 , (TS)T 2 (TS)1 . The adding to these theories a mass term m tr 2
Seiberg–Witten curve is y2 = (x2 u) (1=64)42 ) that explicitly breaks the supersymmetry to N = 1,
(x u) and the differential is the dualities of Seiberg can be recovered. For
pffiffiffi SU(Nc ), SO(Nc ) and Sp(2Nc ) gauge groups,
2 y dx see Hanany and Oz (1995), Argyes et al. (1996),
¼
4 x2 42 =64 Argyes et al. (1997) and references therein.
When Nf = 3, the u-plane has no global symme- See also: Anomalies; Brane Construction of Gauge
try. There are two singularities. At one of them, a Theories; Donaldson–Witten Theory; Duality in
single monopole bound state with (nm , ne ) = (2, 1) Topological Quantum Field Theory; Effective Field
becomes massless and there are no other light Theories; Electric–Magnetic Duality; Floer Homology;
particles. At the other singularity, the massless states Gauge Theories from Strings; Gauge Theory:
Mathematical Applications; Nonperturbative and
have (nm , ne ) = (1, 0) and form a (four-dimensional)
Topological Aspects of Gauge Theory; Quantum
spinor representation of SO(6) with a definite Chromodynamics; Topological Quantum Field Theory:
chirality. Thus, the low-energy theory is QED with Overview; Supersymmetric Particle Models.
four light hypermultiplets. Along the flat directions,
the SO(6) SU(2)R symmetry is further broken.
This corresponds to a single Higgs branch touching Further Reading
the u-plane at the singularity. Again, the metric on Affleck I, Dine M, and Seiberg N (1984) Dynamical super-
the Higgs branch is not modified by quantum symmetric breaking in supersymmetric QCD. Nuclear Physics
effects. The monodromies at the two singularities B 241: 493–534.
are (ST 2 S)T(ST 2 S)1 and ST 4 S1 , respectively. The Argyes PC and Faraggi AE (1995) Vacuum structure and
Seiberg–Witten curve is y2 = x2 (x u) (1=64) spectrum of N = 2 supersymmetric SU(n) gauge theory.
Physical Review Letters 74: 3931–3934.
23 (x u)2 and the differential is Argyes PC, Plesser MR, and Seiberg N (1996) The moduli space
pffiffiffi of vacua of N = 2 SUSY QCD and duality in N = 1 SUSY
2 pffiffiffiffiffiffiffi 3 32 QCD. Nuclear Physics B 471: 159–194.
¼ log y þ 1 x u 2 x2 dx Argyes PC, Plesser MR, and Shapere AD (1997) N = 2 moduli
3 8 3
spaces and N = 1 dualities for SOðnc Þ and USpð2nc Þ super-
QCD. Nuclear Physics B 483: 172–186.
When $N_f = 4$, the theory is characterized by the classical coupling constant $\tau$, and there are no corrections to $a = (1/2)\sqrt{2u}$, $\tilde{a} = \tau a$. There is only one singularity at $u = 0$, where the monodromy is $P$.
Seiberg and Witten (1994b) postulate that the full quantum theory is $SL(2, \mathbb{Z})$ invariant, just like the $N = 4$ pure gauge theory. The elementary
Freed DS (1999) Special Kähler manifolds. Communications in Mathematical Physics 203: 31–52.
Hanany A and Oz Y (1995) On the quantum moduli space of vacua of N = 2 supersymmetric SU(N_c) gauge theories. Nuclear Physics B 452: 283–312.
Klemm A, Lerche W, Yankielowicz S, and Theisen S (1995) Simple singularities and N = 2 supersymmetric Yang–Mills theory. Physics Letters B 344: 169–175.
512 Semiclassical Spectra and Closed Orbits
Intriligator K and Pouliot P (1995) Exact superpotential, quantum vacua and duality in supersymmetric SP(N_c) gauge theories. Physics Letters B 353: 471–476.
Intriligator K and Seiberg N (1995) Duality, monopoles, dyons, confinement and oblique confinement in supersymmetric SO(N_c) gauge theories. Nuclear Physics B 444: 125–160.
Seiberg N (1994) Exact results on the space of vacua of four-dimensional SUSY gauge theories. Physical Review D 49: 6857–6863.
Seiberg N (1995) Electric–magnetic duality in supersymmetric non-Abelian gauge theories. Nuclear Physics B 435: 129–146.
Seiberg N and Witten E (1994a) Electric–magnetic duality, monopole condensation, and confinement in N = 2 supersymmetric Yang–Mills theory. Nuclear Physics B 426: 19–52.
Seiberg N and Witten E (1994b) Monopoles, duality and chiral symmetry breaking in N = 2 supersymmetric QCD. Nuclear Physics B 431: 484–550.
Witten E (1994) Monopoles and four-manifolds. Mathematical Research Letters 1: 769–796.
Semiclassical Approximation see Stationary Phase Approximation; Normal Forms and Semiclassical Approximation
Using the full power of the symbolic calculus of Fourier integral operators, H Duistermaat and V Guillemin were able to compute the main term of the singularity from the Poincaré map of the closed orbit. Their paper became a canonical reference on the subject.

After that, people were able to extend SCTF to:

- general semiclassical Hamiltonians (Helffer–Robert, Guillemin–Uribe, Meinrenken),
- manifolds with boundary (Guillemin–Melrose),
- surfaces with conical singularities and polygonal billiards (Hillairet), and
- several commuting operators (Charbonnel–Popov).

Recently, some researchers have drawn attention to the nonprincipal terms in the singularity expansion, which come from the semiclassical Birkhoff normal form (Zelditch, Guillemin).

Selberg Trace Formula

We consider a compact hyperbolic surface $X$. "Hyperbolic" means that the Riemannian metric is locally $(dx^2 + dy^2)/y^2$ or is of constant curvature $-1$. Such a surface is the quotient $X = H/\Gamma$ where $\Gamma$ is a discrete co-compact subgroup of the group of isometries of the Poincaré half-plane $H$. Closed geodesics of $X$ are in bijective correspondence with nontrivial conjugacy classes of $\Gamma$. More precisely, the set of loops $C(S^1, X)$ splits into connected components associated to conjugacy classes and each component of nontrivial loops contains exactly one periodic geodesic.

Theorem 1 (Selberg trace formula). If $\rho$ is a real-valued function on $\mathbb{R}$ whose Fourier transform $\hat\rho$ is compactly supported and $\lambda_j = 1/4 + \mu_j^2$ is the spectrum of the Laplace operator on $X$, we have:

$$\sum_{j=1}^{\infty} \rho(\mu - \mu_j) = \frac{A}{2\pi} \int_{\mathbb{R}} \rho(\mu + s)\, s \tanh(\pi s)\, ds + \sum_{\gamma \in \mathcal{P}} \sum_{n=1}^{\infty} \frac{l_\gamma}{2 \sinh(n l_\gamma / 2)}\, \mathrm{Re}\big(\hat\rho(n l_\gamma)\, e^{i n l_\gamma \mu}\big)$$

where $A$ is the area of $X$, $\mathcal{P}$ the set of primitive conjugacy classes of $\Gamma$ and, for $\gamma \in \mathcal{P}$, $l_\gamma$ is the length of the unique closed geodesic associated to $\gamma$.

A nice recent presentation of the Selberg trace formula can be found in Marklof (2003).

Semiclassical Schrödinger Operators on Riemannian Manifolds

If $(X, g)$ is a (possibly noncompact) Riemannian manifold and $V : X \to \mathbb{R}$ a smooth function which satisfies $\liminf_{x \to \infty} V(x) = E_\infty > -\infty$, the differential operator $\hat{H} = (1/2)\hbar^2 \Delta + V$ is semibounded from below and admits self-adjoint extensions. For all those extensions, the spectrum is discrete in the interval $(-\infty, E_\infty)$ and eigenfunctions $\hat{H}\varphi_j = E_j \varphi_j$ are localized in the domain $V \le E_j$. If $X$ is compact and $V = 0$, we recover the case of the Laplace operator.

We will denote this part of the spectrum by

$$\inf V < E_1(\hbar) < E_2(\hbar) \le \cdots \le E_j(\hbar) \le \cdots < E_\infty$$

For the Laplace operator, we have $E_j = \hbar^2 \lambda_j$, where $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_j \le \cdots$ is the spectrum of the Laplace operator.

The SCTF can also be derived the same way for Schrödinger operators with magnetic field. One can even extend it to Hamiltonian systems which are not obtained by Legendre transform from a regular Lagrangian. In this case, Morse indices have to be replaced by the more general Maslov indices.

Classical Dynamics

Newton Flows

Euler–Lagrange equations for the Lagrangian $L(x, v) := (1/2)\|v\|_g^2 - V(x)$ admit a Hamiltonian formulation on $T^\star X$ whose energy is given by $H = (1/2)\|\xi\|_g^2 + V(x)$. We will denote by $X_H$ the Hamiltonian vector field

$$X_H := \sum_j \left( \frac{\partial H}{\partial \xi_j}\, \partial_{x_j} - \frac{\partial H}{\partial x_j}\, \partial_{\xi_j} \right)$$

Preservation of $H$ by the dynamics shows immediately that the Hamiltonian flow $\Phi_t$ restricted to $H < E_\infty$ is complete.

The Hamiltonian $H$ is the "classical limit" of $\hat{H}$; in more technical terms, $H$ is the semiclassical principal symbol of $\hat{H}$.

If $V = 0$, $H = (1/2) g^{ij} \xi_i \xi_j$ and the flow is the geodesic flow.

Periodic Orbits

Definition 1 A periodic orbit $(\gamma, T)$ (also denoted p.o.) of the Hamiltonian $H$ consists of an orbit $\gamma$ of $X_H$ which is homeomorphic to a circle and a nonzero real number $T$ so that $\Phi_T(z) = z$ for all $z \in \gamma$. We will denote by $T_0(\gamma) > 0$ (the primitive period) the smallest $T > 0$ for which $\Phi_T(z) = z$.
If $(T, E)$ are given, $W_{T,E}$ is the set of $z$'s so that $H(z) = E$ and $\Phi_T(z) = z$.

The (linear) Poincaré map $\Pi_\gamma$ of a p.o. $(\gamma, T)$ with $H(\gamma) = E$: we restrict the flow to $S_E := \{H = E\}$ and take a hypersurface $\Sigma$ inside $S_E$ transversal to $\gamma$ at the point $z_0$. The associated return map $P$ is a local diffeomorphism fixing $z_0$. Its linearization $\Pi_\gamma := P'(z_0)$ is the linear Poincaré map, an invertible (symplectic) endomorphism of the tangent space $T_{z_0}\Sigma$.

The Morse index $\mu(\gamma)$: a p.o. $(\gamma, T)$ is a critical point of the action integral $\int_0^T L(\gamma(s), \dot\gamma(s))\, ds$ on the manifold $C^\infty(\mathbb{R}/T\mathbb{Z}, X)$. It always has a finite Morse index (Milnor 1967) which is denoted by $\mu(\gamma)$. For general Hamiltonian systems, the Morse index is replaced by the Conley–Zehnder index.

The nullity index $\nu(\gamma)$ is the dimension of the space of infinitesimal deformations of the p.o. $\gamma$ by p.o. of the same energy and period. We always have $\nu(\gamma) \ge 1$ and $\nu(\gamma) = 1 + \dim \ker(\mathrm{Id} - \Pi_\gamma)$.

Example 1 (Geodesic flows)

- Riemannian manifold with sectional curvature $< 0$: in this case, we have for all periodic geodesics $\mu(\gamma) = 0$, $\nu(\gamma) = 1$.
- Generic metrics: for a generic metric on a closed manifold, we have $\nu(\gamma) = 1$ for all periodic geodesics.
- For flat tori of dimension $d$: we have $\mu(\gamma) = 0$ and $\nu(\gamma) = d$.
- For the sphere of dimension 2 with constant curvature: if $\gamma_n$ is the $n$th iterate of the great circle, we have $\mu(\gamma_n) = 2|n|$ and $\nu(\gamma_n) = 3$.

It is a beautiful result of J-P Serre that any pair of points on a closed Riemannian manifold are endpoints of infinitely many distinct geodesics. Counting geometrically distinct periodic geodesics is much harder, especially for simple manifolds like the spheres. It is now known that every closed Riemannian manifold admits infinitely many geometrically distinct periodic geodesics (at least, in some cases, for generic metrics; see Berger 2000, chap. V). There exists significant knowledge concerning more general Hamiltonian systems as well.

Nondegeneracy

There are several possible nondegeneracy assumptions. They can be formulated "à la Morse–Bott" (critical point of action integrals) or purely symplectically.

Definition 2 Two submanifolds $Y$ and $Z$ of $X$ intersect cleanly iff $Y \cap Z$ is a manifold whose tangent space is the intersection of the tangent spaces of $Y$ and $Z$.

Fixed points of a smooth map are clean if the graph of the map intersects the diagonal cleanly.

Definition 3 We will denote by (ND) the following property of the p.o. $(\gamma_0, T_0)$: the fixed points of the associated (nonlinear) Poincaré map $P$ are clean. The set $W_{T,E}$ is ND if all p.o.'s inside are ND. $W_{T,E}$ is then a manifold of dimension $\nu(\gamma)$.

Example 2

- Generic case: $\nu = 1$; (ND) is equivalent to "1 is not an eigenvalue of the linear Poincaré map." In this case, we can deform the p.o. smoothly by moving the energy. This family of p.o.'s is called a cylinder of p.o.'s. The period $T(E)$ is then a smooth function of $E$.
- Completely integrable systems: $\nu = d$; (ND) is then a consequence of the so-called "isoenergetic KAM condition": assuming the Hamiltonian is expressed as $H(I_1, \ldots, I_d)$ using action-angle coordinates, this condition is that the mapping $I \to [\nabla H(I)]$ from the energy surface $H = E$ into the projective space is a local diffeomorphism. This condition implies that Diophantine invariant tori are not destroyed by a small perturbation of the Hamiltonian.
- Maximally degenerate systems: this is the case where all orbits are periodic ($\nu = 2d - 1$). Examples are the two-body problem with Newtonian potential and the geodesic flows on compact rank-1 symmetric spaces.

Canonical Measures and Symplectic Reduction

Under the hypothesis (ND), the manifold $W_{T,E}$ admits a canonical measure $\mu_c$, invariant by $\Phi_t$. In the case $\nu = 1$, this measure is given by $|dt| / \sqrt{\det(\mathrm{Id} - \Pi_\gamma)}$.

By using a Poincaré section, it is enough to understand the following fact: if $A$ is a symplectic linear map, the space $\ker(\mathrm{Id} - A)$ admits a canonical Lebesgue measure.

We start with the following construction: let $L_1$ and $L_2$ be two Lagrangian subspaces of a symplectic space $E$ and $\omega_j$, $j = 1, 2$, be half-densities on $L_j$, denoted by $\omega_j \in \Omega^{1/2}(L_j)$. If $W = L_1 \cap L_2$, we have the following canonical isomorphisms: $\Omega^{1/2}(L_j) = \Omega^{1/2}(W) \otimes \Omega^{1/2}(L_j/W)$. So $\Omega^{1/2}(L_1) \otimes \Omega^{1/2}(L_2) = \Omega^{1/2}(L_1/W) \otimes \Omega^{1/2}(L_2/W) \otimes \Omega^{1}(W)$. $M_j = L_j/W$ are two Lagrangian subspaces of the reduced space $W^o/W$ whose intersection is 0. Hence, by using the Liouville measure on it, we get $\Omega^{1/2}(M_1) \otimes \Omega^{1/2}(M_2) = \mathbb{C}$. Hence, we get a density $\omega_1 \star \omega_2$ on $W$. It turns out that the previous calculation is one of the main algebraic pieces of the symbolic calculus of
Fourier integral operators and the density $\omega_1 \star \omega_2$ arises in stationary-phase computations.

The graph of a symplectic map is equipped with a half-density by pullback of the Liouville half-density. So we can apply the previous construction to the intersection of the graph of $A$ and the graph of the identity map.

Actions

Definition 4 If $(\gamma, T)$ is a p.o., we define the following quantity which is called action of $\gamma$:

$$A(\gamma) = \int_\gamma \xi\, dx$$

In the (ND) case, $A(\gamma)$ is constant on each connected component of $W_{T,E}$.

In the generic case and if $T'(E) \ne 0$ (cylinder of p.o.), p.o.'s of the cylinder are also parametrized by $T$ (i.e., we denote by $\gamma_E$ the p.o. of the cylinder of energy $E$ and by $\gamma_T$ the p.o. of period $T$). If $a(E) = A(\gamma_E)$ and $b(T) = \int_0^T L(\gamma_T(s), \dot\gamma_T(s))\, ds$, then $a(E)$ and $b(T)$ are Legendre transforms of each other.

Playing with Spectral Densities

We will define the "regularized spectral densities." The general idea is as follows: we want to study an $\hbar$-dependent sequence of numbers $E_j(\hbar)$ (a spectrum) in some interval $[a, b]$. We introduce a non-negative function $\rho \in \mathcal{S}(\mathbb{R})$ which satisfies $\int \rho(t)\, dt = 1$, and also $D_{\rho,\varepsilon,\hbar}(E) = \sum \rho_\varepsilon(E - E_j)$, where $\rho_\varepsilon(E) = \varepsilon^{-1} \rho(E/\varepsilon)$. It gives the analysis of the spectrum at the scale $\varepsilon$. Of course, we will adapt the scaling $\varepsilon$ to the small parameter $\hbar$. If the scaling is of the size of the mean spacing of the spectrum, we will get a very precise resolution of the spectrum.

The general philosophy is:

- If $\hbar$ is the semiclassical parameter of a semiclassical Hamiltonian, the mean spacing of the eigenvalues is of order $\hbar^d$ (Weyl's law). The trace formula gives the asymptotic behavior of $D_{\rho,\varepsilon,\hbar}(E)$ for $\varepsilon \sim \hbar$ (and hence $\varepsilon \gg \Delta E$ except if $d = 1$). This behavior is not "universal" and thus contains a significant amount of information (in our case, on periodic trajectories).
- Better resolution of the spectrum needs the use of the long-time behavior of the classical dynamics and is conjecturally universal. It means that eigenvalues seen at very small scale behave like eigenvalues of an ensemble of random matrices, the most common ones being the Wigner Gaussian orthogonal ensemble (GOE) and the Gaussian unitary ensemble (GUE).

We fix some interval $[a, b]$ with $b < E_\infty$.

We define $D(E) := \sum_{a \le E_j \le b} \delta(E_j)$ as the sum of Dirac measures at the points $E_j$ and its $\hbar$-Fourier transform as

$$Z(t) = \mathrm{trace}'\big(e^{-it\hat{H}/\hbar}\big) := {\sum}' \exp(-itE_j/\hbar) \qquad [1]$$

where $\sum'$ is the sum over $E_j \in [a, b]$.

The Duistermaat–Guillemin trick relates the previous behavior to asymptotics of the regularized density of eigenvalues. Let us give a function $\rho \in \mathcal{S}(\mathbb{R})$ so that $\hat\rho(t) = \int e^{-itE} \rho(E)\, dE$ is compactly supported and

$$\hat\rho(t) = 1 + O(t^\infty), \qquad t \to 0 \qquad [2]$$

(all moments of $\rho$ vanish). We introduce, for $E \in [a, b]$, $D_\rho(E) := \sum'_j \frac{1}{\hbar} \rho\big((E - E_j)/\hbar\big)$. $D_\rho(E)$ is independent modulo $O(\hbar^\infty)$ of $a$, $b$. We have

$$D_\rho(E) = \frac{1}{2\pi\hbar} \int \hat\rho(t)\, Z(t)\, e^{itE/\hbar}\, dt$$

The idea is now to start from a semiclassical approximation of $U(t) = e^{-it\hat{H}/\hbar}$ and to insert it into eqn [1]. We need only a uniform approximation of $U(t)$ for $t \in \mathrm{Support}(\hat\rho)$. From the asymptotic expansion of $Z(t)$, we will deduce the asymptotic expansion of $D_\rho$, the regularized eigenvalue density.

The Smoothed Density of States

The following statement expressing the smoothed density of eigenvalues is the main result of the subject. Under the (ND) assumption, it gives the existence of an asymptotic expansion for $D_\rho(E)$:

Theorem 2 If $E$ is not a critical value of $H$ and the (ND) condition is satisfied for all p.o.'s of energy $E \in [a, b]$ and period inside the support of $\hat\rho$,

$$D_\rho(E) = D_{\mathrm{Weyl}}(E) + \sum D_{W(T,E)} + O(\hbar^\infty) \qquad [3]$$

where:

(i) $$D_{\mathrm{Weyl}}(E) = (2\pi\hbar)^{-d} \Big( \sum_{j=0}^{\infty} a_j(E)\, \hbar^j \Big)$$ with $a_0(E) = \int_{H = E} dL/dH$.

(ii) The sum is over all the manifolds $W_{T,E}$ so that $T \in \mathrm{Support}(\hat\rho)$.

(iii) $$D_{W(T,E)} = \frac{\varepsilon}{(2\pi i \hbar)^{(\nu(\gamma)+1)/2}}\, e^{-i\mu(\gamma)\pi/2}\, e^{iA(\gamma)/\hbar} \Big( \sum_{j \ge 0} b_j(E)\, \hbar^j \Big)$$
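The role of the scale $\varepsilon$ in the regularized density $D_{\rho,\varepsilon,\hbar}(E) = \sum \rho_\varepsilon(E - E_j)$ can be seen on a toy spectrum. The sketch below is only an illustration under stated assumptions: the Gaussian choice of $\rho$, the one-dimensional harmonic-oscillator levels $E_j = \hbar(j + 1/2)$, and the function names are not taken from the text. For this spectrum the mean spacing is $\hbar$ and the leading Weyl density is $1/\hbar$.

```python
import math

def gaussian(x):
    # A non-negative Schwartz function with integral 1 (illustrative choice of rho).
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def smoothed_density(E, levels, eps, rho=gaussian):
    # D_{rho,eps}(E) = sum_j rho_eps(E - E_j), with rho_eps(E) = rho(E / eps) / eps.
    return sum(rho((E - Ej) / eps) for Ej in levels) / eps

def oscillator_levels(hbar, n_levels):
    # Toy spectrum (assumption): E_j = hbar * (j + 1/2), j = 0, 1, ...
    return [hbar * (j + 0.5) for j in range(n_levels)]

if __name__ == "__main__":
    hbar = 0.1
    levels = oscillator_levels(hbar, 200)
    E_mid = 5.0          # halfway between two consecutive levels
    E_level = levels[50]  # exactly on a level
    for eps in (5.0 * hbar, 0.1 * hbar):
        print("eps =", eps,
              " hbar*D(mid) =", hbar * smoothed_density(E_mid, levels, eps),
              " hbar*D(level) =", hbar * smoothed_density(E_level, levels, eps))
```

With $\varepsilon$ well above the mean spacing the smoothed density is close to the Weyl value $1/\hbar$ everywhere in the bulk; with $\varepsilon$ well below the spacing the same sum resolves the individual eigenvalues.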
Feynman R and Hibbs A (1965) Quantum Mechanics and Path Integrals. New York: McGraw-Hill.
Gutzwiller M (1971) Periodic orbits and classical quantization conditions. Journal of Mathematical Physics 12: 343–358.
Gutzwiller M (1990) Chaos in Classical and Quantum Mechanics. Berlin–Heidelberg–New York: Springer.
Hörmander L (1968) The spectral function of an elliptic operator. Acta Mathematica 121: 193–218.
Marklof J (2003) Selberg's trace formula: an introduction. Lectures given at the International School "Quantum Chaos on Hyperbolic Manifolds," Schloss Reisensburg, Günzburg, Germany, 4–11 October 2003. To appear in Springer LNP, Berlin–Heidelberg, New York. http://fr.arxiv.org
Milnor J (1967) Morse Theory. Annals of Mathematics Studies no. 51. Princeton, NJ: Princeton University Press.
Selberg A (1956) Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with applications to Dirichlet series. Journal of the Indian Mathematical Society 20: 47–87.
Introduction

A semilinear wave equation is an equation of the form

$$\Box u = F(u, u'), \qquad u : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R} \qquad [1]$$

where $F : \mathbb{R}^{n+2} \to \mathbb{R}$ is a smooth function, the d'Alembert operator $\Box$ is defined as

$$\Box = D_t^2 - D_{x_1}^2 - \cdots - D_{x_n}^2, \qquad D_t = \frac{\partial}{\partial t}, \quad D_{x_j} = \frac{\partial}{\partial x_j} \qquad [2]$$

and $u'$ denotes the vector of all first-order derivatives of $u$:

$$u' = (D_t u, D_{x_1} u, \ldots, D_{x_n} u) \equiv (u_t, u_{x_1}, \ldots, u_{x_n})$$

Sometimes the term "semilinear" is used in a more restrictive sense and refers to the special class of equations

$$\Box u = f(u) \qquad [3]$$

The very particular case $f(u) = -mu$, $m > 0$, corresponds to the Klein–Gordon equation, used to model relativistic particles. True nonlinear terms of the form $f(u) = -mu - u^3$, $m \ge 0$ (meson equation), or $f(u) = -\sin u$ (sine-Gordon equation) have been proposed as models of self-interacting fields with a local interaction. Notice that for the physical applications it is natural to consider complex-valued functions $u(t, x)$; in the general case of eqn [1], this actually means that we are considering a $2 \times 2$ system in $\Re u$ and $\Im u$. However, the natural physical requirement of gauge invariance restricts the possible nonlinearities to the functions satisfying the condition

$$f(e^{i\theta} u) = f(u)\, e^{i\theta}, \qquad \forall \theta \in \mathbb{R} \qquad [4]$$

Since the resulting equation

$$\Box u = g(|u|^2)\, u \qquad [5]$$

has essentially the same properties as the real-valued equation [3], it is not too restrictive to study only real-valued functions, as we shall mostly do in the following.

The more general equations of the form [1], involving the derivatives of $u$, are encountered in several physical theories, including the nonlinear $\sigma$-models and general relativity.

However, beyond the concrete physical applications, eqn [1] is important since it is a simplified but relevant model of much more general equations and systems of mathematical physics; despite its simple structure, the semilinear wave equation already presents all the main difficulties and phenomena of nonlinear wave interaction, and it represents an ideal laboratory for such problems.

In this article we plan to give a concise but, as far as possible, comprehensive review of the main research directions concerning eqn [1], and in particular we shall focus on the global existence of both large and small nonlinear waves, and the problem of local existence for low-regularity solutions. A large part of the theory extends to nonlinear perturbations of the form $\Box u = F(u, u', u'')$ and to the fully nonlinear case; we have no space here to give an account of these developments and we must refer the reader to the books and papers cited in the "Further reading" section.

Classical Results

Equations [1] and [3] are hyperbolic with respect to the variable $t$. This is a precise way of stating that the "correct" problem for it is an initial-value problem (IVP) with data at some fixed time, or
Semilinear Wave Equations 519
more generally on some spacelike surface: this means that we assign two functions $u_0(x)$, $u_1(x)$, called the "initial data," and we look for a function $u(t, x)$ satisfying the IVP:

$$\Box u = F(u, u'), \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x) \qquad [6]$$

This setting is in agreement with the physical picture of an evolution problem: the data represent the complete state of a system at a fixed time, and they uniquely determine the evolution of the system, which is described by the differential equation.

This rough statement of the problem is sufficient when working with smooth functions, as in the classical approach. By purely classical methods, that is, energy inequalities and nonlinear estimates, it is not difficult to prove the following local existence result, where $H^k = H^k(\mathbb{R}^n)$ denotes the Sobolev space of functions with $k$ derivatives in $L^2(\mathbb{R}^n)$:

Theorem 1 Assume $F$ is $C^\infty$. Let $(u_0, u_1) \in H^k \times H^{k-1}$ for some $k > 1 + n/2$. Then there exists a time $T = T(\|u_0\|_{H^k} + \|u_1\|_{H^{k-1}}) > 0$ such that problem [6] has a unique solution belonging to $(u, u_t) \in C([-T, T]; H^k) \times C([-T, T]; H^{k-1})$.
If $F = F(u)$ depends only on $u$, the result holds for all $k > n/2$.

Proof We decided to include a sketchy but complete proof of this result since it shows the basic approach to nonlinear wave equations: many results of the theory, even some of the most delicate ones, are obtained by suitable variations of the contraction method, and are similar in spirit to this classical theorem.

Assume for a moment that the equation is linear so that $F = F(t, x)$ is a given smooth function of $(t, x)$. For the linear equation $\Box u = F$, we can construct a solution $u$ using explicit formulas. Moreover, $u$ satisfies the energy inequality

$$E_k(t) \le E_k(0) + \int_0^t \|F(s, \cdot)\|_{H^{k-1}}\, ds \qquad [7]$$

where the energy $E_k(t)$ is defined as

$$E_k(t) = \|u(t, \cdot)\|_{H^k} + \|u_t(t, \cdot)\|_{H^{k-1}} \qquad [8]$$

Now we introduce the space $X_T = C([-T, T]; H^k) \cap C^1([-T, T]; H^{k-1})$, the space $Y_T = C([-T, T]; H^{k-1})$, the mapping $\Phi : F \to u$ that takes the function $F(t, x)$ into the solution of $\Box u = F$ (with fixed data $u_0$, $u_1$), and the mapping $\Psi(u) = F(u, u')$ which is the original right-hand side of the equation.

The energy inequality tells us that $\Phi$ is bounded from $Y_T$ to $X_T$. Actually, for $M$ large enough with respect to $E_k(0)$ (the $H^k$ norm of the data), $\Phi$ takes any ball $B_Y(0, N)$ of $Y_T$ into the ball $B_X(0, M + NT)$ of $X_T$. Moreover, if we apply [7] to the difference of two equations $\Box u = F$ and $\Box v = G$, we also see that $\Phi$ is Lipschitz continuous from $Y_T$ to $X_T$, with a Lipschitz constant $CT$.

On the other hand, $\Psi(u) = F(u, u')$ takes $X_T$ to $Y_T$, provided $k > 1 + n/2$; we can even say that it is Lipschitz continuous from $B_X(0, M)$ to $B_Y(0, C(M))$ for some function $C(M)$, with a Lipschitz constant $C_1(M)$ also depending on $M$. This follows easily from Moser type estimates like

$$\|F(u, u')\|_{H^{k-1}} \le \phi(\|u\|_{L^\infty})\, \|u\|_{H^k}, \qquad k > \frac{n}{2} + 1$$

or

$$\|F(u)\|_{H^k} \le \phi(\|u\|_{L^\infty})\, \|u\|_{H^k}, \qquad k > \frac{n}{2}$$

Now it is easy to conclude: the composition $\Phi \circ \Psi$ maps $X_T$ into itself, and actually is a contraction of $B_X(0, M)$ into itself provided $M$ is large enough with respect to the data, and $T$ is small enough with respect to $M$. The unique fixed point is the required solution. ∎

The wave operator has an additional important property called the finite speed of propagation, which can be stated as follows: given the IVP

$$\Box u = 0, \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x)$$

if we modify the data "outside" a ball $B(x_0, R) \subset \mathbb{R}^n$, the values of the solution inside the cone

$$K(x_0, R) = \{(t, x) : t \ge 0,\ |x - x_0| < R - t\}$$

do not change. Notice that $K(x_0, R)$ is the cone with basis $B(x_0, R)$ and tip $(R, x_0)$; the slope of its mantle represents the speed of propagation of the signals, which for the wave operator $\Box$ is equal to 1. The property extends without modification to the semilinear problem [6], at least for the smooth solutions given by Theorem 1. Actually, it is not difficult to modify the proof of the theorem to work on cones instead of bands $[-T, T] \times \mathbb{R}^n$; in other words, given a ball $B = B(x_0, R)$, we can assign two data $u_0 \in H^k(B)$, $u_1 \in H^{k-1}(B)$ ($k > n/2 + 1$) and prove the existence of a local solution on the cone $K(x_0, R)$ for some time interval $t \in [0, T]$.

In general, the finite speed of propagation allows us to localize in space most of the results and the estimates; as a rule of thumb, we expect that what is true on a band $[0, T] \times \mathbb{R}^n$ should also be true on any truncated cone $K(x_0, R) \cap \{0 \le t \le T\}$.
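The finite speed of propagation can be checked on a discrete model. The following is a minimal sketch, not the construction used in the text: it assumes one space dimension, a periodic grid, and the standard leapfrog scheme with time step equal to the mesh size, for which the numerical domain of dependence coincides with the light cone. The function names and the particular data are hypothetical.

```python
import math

def solve_wave_1d(u0, u1, dx, n_steps):
    """Leapfrog scheme for u_tt = u_xx on a periodic grid with dt = dx.

    u0, u1 are lists of samples of u(0, x) and u_t(0, x).
    Returns the grid values of u at time n_steps * dx.
    """
    n = len(u0)
    dt = dx  # CFL number 1: information moves exactly one cell per step
    if n_steps == 0:
        return list(u0)
    prev = list(u0)
    # First step from a Taylor expansion; with dt = dx it reduces to this form.
    curr = [dt * u1[i] + 0.5 * (u0[(i + 1) % n] + u0[i - 1]) for i in range(n)]
    for _ in range(n_steps - 1):
        nxt = [curr[(i + 1) % n] + curr[i - 1] - prev[i] for i in range(n)]
        prev, curr = curr, nxt
    return curr

def max_diff_in_ball(ua, ub, i0, radius):
    # Largest difference over the grid points with |i - i0| <= radius.
    return max(abs(ua[i] - ub[i]) for i in range(i0 - radius, i0 + radius + 1))

if __name__ == "__main__":
    n, dx, i0, r, m = 200, 0.05, 100, 40, 25
    x = [i * dx for i in range(n)]
    u0_a = [math.exp(-(xi - x[i0]) ** 2) for xi in x]
    u1_a = [0.0] * n
    # Second data set: modified only outside the ball |i - i0| <= r.
    u0_b = [u0_a[i] + (1.0 if abs(i - i0) > r else 0.0) for i in range(n)]
    u1_b = [(-2.0 if abs(i - i0) > r else 0.0) for i in range(n)]
    ua = solve_wave_1d(u0_a, u1_a, dx, m)
    ub = solve_wave_1d(u0_b, u1_b, dx, m)
    print("difference inside the cone:", max_diff_in_ball(ua, ub, i0, r - m))
    print("difference on the whole grid:", max(abs(a - b) for a, b in zip(ua, ub)))
```

After $m$ steps the two discrete solutions coincide on the slice $|i - i_0| \le r - m$ of the cone, although the data differ by an amount of order one outside the ball.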
Symmetries

The linear wave equation can be written as the Euler–Lagrange equation of a suitable Lagrangian. This is still true for the semilinear perturbations of the form

$$\Box u + f(u) = 0 \qquad [9]$$

Indeed, denoting with $F(s) = \int_0^s f(\sigma)\, d\sigma$ the primitive of $f$, the Lagrangian of [9] is

$$L(u) = \iint \left[ -\frac{1}{2} |u_t|^2 + \frac{1}{2} |\nabla_x u|^2 + F(u) \right] dt\, dx \qquad [10]$$

The functional $L$ is not positive definite; hence, the variational approach gives only weak results. However, this point of view allows us to apply Noether's principle: any invariance of the functional is related to a conservation law of the equation. These conserved quantities can also be obtained by taking the product of the equation by a suitable multiplier, although this method is far from obvious in many cases. We describe here this circle of ideas briefly.

The functional $L$ is invariant under the Poincaré group, generated by time and space translations and the Lorentz transformations ($\gamma > 1$, $c \ne 0$):

$$t \mapsto \frac{\gamma t - x_j/c}{\sqrt{\gamma^2 - 1}}, \qquad x_j \mapsto \frac{\gamma x_j - ct}{\sqrt{\gamma^2 - 1}} \qquad [11]$$

The infinitesimal generators of the translations are simply the partial derivatives $D_t$ and $D_{x_j}$. The Lorentz transformations can be decomposed as a rotation followed by a boost, and indeed a corresponding complete set of infinitesimal generators are the operators

$$\Omega_{jk} = x_j D_k - x_k D_j, \qquad \Omega_j = x_j D_t + t D_j \qquad [12]$$

All the operators in the Poincaré group commute with $\Box$ exactly.

The conservation law related to time translations (time derivative) is the fundamental "conservation of energy"

$$E(t) = \int \left[ \frac{1}{2} u_t^2 + \frac{1}{2} |\nabla_x u|^2 + F(u) \right] dx = E(0) \qquad [13]$$

while spatial translations (spatial derivatives) lead to the conservation of momenta

$$\int u_t\, u_{x_j}\, dx = \text{const.}, \qquad j = 1, \ldots, n$$

On the other hand, infinitesimal rotations and boosts [12] are connected to the conservation of angular momenta

$$\int (x_k D_j u - x_j D_k u)\, D_t u\, dx = \text{const.}, \qquad j, k = 1, \ldots, n \qquad [14]$$

and

$$\int \left[ x_k\, e(u) + t\, D_k u\, D_t u \right] dx = \text{const.}, \qquad k = 1, \ldots, n \qquad [15]$$

where

$$e(u) = \frac{1}{2} u_t^2 + \frac{1}{2} |\nabla_x u|^2 + F(u) \qquad [16]$$

is the energy density.

The Poincaré group does not exhaust the invariance properties of the free wave equation. Among the other transformations which commute or almost commute with $\Box$, we mention the spacetime dilations and inversions (which together with translations and Lorentz transformations generate the larger conformal group), the scaling $u \mapsto \lambda u$, the spatial dilations, and, in the complex-valued case, the gauge transformation $u \mapsto e^{i\theta} u$. In this way several useful conservation laws can be obtained, including the conformal energy identities of K Morawetz.

Strichartz Estimates

Energy estimates are very useful tools but they have some major shortcomings. The main one is clearly the large number of derivatives necessary to estimate the nonlinear term. This is why the modern theory of semilinear wave equations relies mainly on different tools, which go under the umbrella name of Strichartz estimates and express the decay properties of solutions when measured in $L^p$ or related norms. In this section we summarize these estimates in their most general form, and try to give a feeling of the techniques involved.

Consider the following IVP for a homogeneous linear wave equation:

$$\Box u = 0, \qquad u(0, x) = 0, \quad u_t(0, x) = f(x) \qquad [17]$$

The conservation of energy states that

$$\|u_t(t, \cdot)\|_{L^2}^2 + \|\nabla_x u(t, \cdot)\|_{L^2}^2 \equiv \|f\|_{L^2}^2 \qquad [18]$$

for all times $t$. Thus, we see that $L^2$-type norms of the solution do not decay. The interesting fact is that if we measure the solution $u$ in a different $L^p$-norm, $p > 2$, the norm decays as $t \to \infty$, and the decay is fastest for the $L^\infty$-norm.

To appreciate the dispersive phenomena at their best, let us assume that the Fourier transform of the data is localized in an annulus of order 1:

$$\operatorname{supp} \hat{f}(\xi) \subset \{1/2 \le |\xi| \le 2\} \qquad [19]$$
Then the corresponding solution $u(t, x)$ has the same property, and we see that

$$\|u\|_{L^2} = \|\hat{u}\|_{L^2} \le 2 \||\xi| \hat{u}\|_{L^2} \equiv 2 \|\nabla u\|_{L^2} \le 4 \|u\|_{L^2}$$

We condense the last line in the shorthand notation

$$\|u\|_{L^2} \simeq \|\nabla u\|_{L^2}$$

We shall also write

$$\|v\|_X \lesssim \|w\|_Y \iff \|v\|_X \le C \|w\|_Y \text{ for some } C$$

We can now rewrite the conservation of energy [18] in a very simple form; for localized data (and hence a localized solution) as in [19], we have

$$\|u(t, \cdot)\|_{L^2} \lesssim \|f\|_{L^2} \qquad [20]$$

The basic $L^\infty$-estimate for a solution of [17] with localized data as in [19] is simply

$$\|u(t, \cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2} \|f\|_{L^1} \qquad [21]$$

This estimate has been well known since the 1960s; it can be proved easily by several techniques, notably by the stationary-phase method. Property [21] measures the fact that as time increases, the total energy of the solution remains constant but spreads over a region of increasing volume, due to the propagation of waves. If we interpolate between [20] and [21], we obtain the full set of dispersive estimates

$$\|u(t, \cdot)\|_{L^q} \lesssim t^{-(n-1)(1/2 - 1/q)} \|f\|_{L^p}, \qquad \frac{1}{q} + \frac{1}{p} = 1, \quad 2 \le q \le \infty \qquad [22]$$

Recall that we are working with localized solutions on the annulus $|\xi| \sim 1$; it is easy to extend the above estimates to general solutions by a rescaling argument, exploiting the fact that, if $u(t, x)$ is a solution of the homogeneous wave equation, $u(\lambda t, \lambda x)$ is also a solution for any constant $\lambda$. Indeed, if $\hat{f}$ (and hence $\hat{u}$) is supported in the annulus $2^{j-1} \le |\xi| \le 2^{j+1}$, $j \in \mathbb{Z}$, by rescaling [21], we obtain

$$\|u(t, \cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2}\, 2^{j(n-1)/2}\, \|f\|_{L^1} \qquad [23]$$

If $f$ is any smooth function, not localized in frequency, we can still write it as a series

$$f = \sum_{j \in \mathbb{Z}} f_j$$

where $\operatorname{supp} \hat{f}_j \subset \{2^{j-1} \le |\xi| \le 2^{j+1}\}$. The quantity

$$\|f\|_{\dot{B}^s_{1,1}} = \sum_{j \in \mathbb{Z}} 2^{js} \|f_j\|_{L^1}$$

is by definition the $\dot{B}^s_{1,1}$ Besov norm of $f$. Thus, summing the estimates [23] over $j$, we conclude that a general solution of [17] satisfies the dispersive estimate

$$\|u(t, \cdot)\|_{L^\infty} \lesssim t^{-(n-1)/2} \|f\|_{\dot{B}^{(n-1)/2}_{1,1}} \qquad [24]$$

The Strichartz estimates can be obtained as a consequence of the above dispersive estimates, plus some subtle functional analytic arguments. In the general form we give here, they were proved by J Ginibre and G Velo, and in the most difficult endpoint cases by M Keel and T Tao. The solution of the homogeneous problem [17] studied above can be written as

$$u(t, x) = \frac{\sin(t|D|)}{|D|} f, \qquad |D| \equiv \mathcal{F}^{-1} |\xi| \mathcal{F}$$

(here $\mathcal{F}$ denotes the Fourier transform). On the other hand, the solution of the complete nonhomogeneous problem

$$\Box u = F(t, x), \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x) \qquad [25]$$

can be written by Duhamel's formula as

$$u(t, x) = \frac{\partial}{\partial t} \frac{\sin(t|D|)}{|D|} u_0 + \frac{\sin(t|D|)}{|D|} u_1 + \int_0^t \frac{\sin((t - s)|D|)}{|D|} F(s, \cdot)\, ds$$

and we see that the above estimates [22] apply to all the operators appearing here. If we consider problem [25] and we assume that the data $F(t, x)$, $u_0$, $u_1$ are localized in frequency so that $\hat{F}(t, \xi)$, $\hat{u}_0$, $\hat{u}_1$ have support in the annulus $|\xi| \sim 1$, the Strichartz estimate takes the following form:

$$\|u\|_{L^p_I L^q} \lesssim \|u_0\|_{L^2} + \|u_1\|_{L^2} + \|F\|_{L^{\tilde{p}'}_I L^{\tilde{q}'}} \qquad [26]$$

Here the dimension is $n \ge 2$; $L^p_I L^q$ denotes the space with norm

$$\|u\|_{L^p_I L^q} = \left( \int_I \|u(t, \cdot)\|^p_{L^q(\mathbb{R}^n)}\, dt \right)^{1/p}, \qquad I = [0, T]$$

the indices $p$, $q$ satisfy the conditions

$$\frac{1}{p} + \frac{1}{q} \cdot \frac{n-1}{2} \le \frac{1}{2} \cdot \frac{n-1}{2}, \qquad p \in [2, \infty], \quad (n, p, q) \ne (3, 2, \infty) \qquad [27]$$

while $\tilde{p}$, $\tilde{q}$ satisfy an identical condition (and $p'$ denotes the conjugate index to $p$). The constant in inequality [26] is uniform with respect to the interval $I$.
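The admissibility condition [27] is easy to misapply when choosing exponents, so a small checker can help. The sketch below is an illustration only: it encodes condition [27] in the form $1/p + (n-1)/(2q) \le (n-1)/4$ with $p \in [2, \infty]$, $n \ge 2$ and the excluded triple $(3, 2, \infty)$; the function names are hypothetical, and exact rational arithmetic is a design choice made here to avoid rounding at the borderline cases.

```python
import math
from fractions import Fraction

def _inverse(x):
    # 1/x as an exact fraction, with the convention 1/infinity = 0.
    return Fraction(0) if x == math.inf else 1 / Fraction(x)

def is_wave_admissible(n, p, q):
    """Check condition [27] for the pair (p, q) in space dimension n >= 2."""
    if n < 2 or p < 2 or q <= 0:
        return False
    if (n, p, q) == (3, 2, math.inf):
        return False  # excluded endpoint
    lhs = _inverse(p) + _inverse(q) * Fraction(n - 1, 2)
    return lhs <= Fraction(n - 1, 4)

def strichartz_exponent(n):
    # The diagonal exponent p = q = 2(n + 1)/(n - 1) of the original Strichartz estimate.
    return Fraction(2 * (n + 1), n - 1)

if __name__ == "__main__":
    for n, p, q in [(3, 4, 4), (3, math.inf, 2), (4, 2, 6), (3, 2, 4), (3, 2, math.inf)]:
        print(n, p, q, is_wave_admissible(n, p, q))
```

The diagonal pair $p = q = 2(n+1)/(n-1)$ satisfies [27] with equality in every dimension $n \ge 2$, which is the case singled out at the end of this section.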
To get the most general form of the estimates, Global Large Waves
some additional function space trickery is required.
As for ordinary differential equations (ODEs), the
As before, a simple rescaling argument extends
local solutions constructed in Theorem 1 can be
estimate [26] to the case of data F, u0 , u1 , whose
extended to a maximal time interval [0, T ], and a
spatial Fourier transforms are localized in the
natural question arises: are these maximal solutions
annulus 2j1 jj 2jþ1 ; we obtain
global, that is, is T = 1?
2jð1=pþn=qÞ kukLp Lq < jn=2
2 ku0 kL2 For generic nonlinearities and large data, the
I
answer is negative; in a dramatic way, in general
þ 2jðn=21Þ ku1 kL2
the norm ku(t, )kL1 is unbounded as t"T < 1.
~0 0
þ 2jð1=p þn=~q 2Þ kFkLp~0 Lq~0 The reason for this is simple: using the finite speed
I
of propagation, we can localize the equation and
Finally, if the data are arbitrary, we may decompose them as series of localized functions, and summing the corresponding estimates we obtain the general Strichartz estimates for the wave equation [25]: for all $(p, q)$ and $(\tilde p, \tilde q)$ as in [27],

$$\|u\|_{L^p_I \dot B^{1/p+n/q}_{q,2}} \lesssim \|u_0\|_{\dot H^{n/2}} + \|u_1\|_{\dot H^{n/2-1}} + \|F\|_{L^{\tilde p'}_I \dot B^{1/\tilde p'+n/\tilde q'-2}_{\tilde q',2}} \qquad [28]$$

Here, given a decomposition $f = \sum_{j\in\mathbb{Z}} f_j$, the homogeneous Besov and Sobolev norms are defined, respectively, by the identities (obvious modification for $r = \infty$):

$$\|f\|^r_{\dot B^s_{q,r}} = \sum_{j\in\mathbb{Z}} 2^{jsr}\,\|f_j\|^r_{L^q}, \qquad \|u\|_{\dot H^s} = \||\xi|^s \hat u\|_{L^2} \simeq \|u\|_{\dot B^s_{2,2}}$$

It is easy to convert the estimates [28] into a form that uses only the more traditional norms

$$\|f\|_{\dot H^s_q} \equiv \||D|^s f\|_{L^q}, \qquad |D|^\sigma \equiv \mathcal{F}^{-1}|\xi|^\sigma \mathcal{F}$$

since by the Besov–Sobolev embedding we have

$$\dot B^s_{q,2} \subseteq \dot H^s_q \ \text{ for } 2 \le q < \infty; \qquad \dot B^s_{q,2} \supseteq \dot H^s_q \ \text{ for } 1 < q \le 2$$

Notice that if we apply to the equation and the data the operator $|D|^\sigma = \mathcal{F}^{-1}|\xi|^\sigma\mathcal{F}$, which commutes with $\Box$, the Strichartz estimate [28] can be rewritten in an apparently more general form:

$$\|u\|_{L^p_I \dot B^{1/p+n/q+\sigma}_{q,2}} \lesssim \|u_0\|_{\dot H^{n/2+\sigma}} + \|u_1\|_{\dot H^{n/2-1+\sigma}} + \|F\|_{L^{\tilde p'}_I \dot B^{1/\tilde p'+n/\tilde q'-2+\sigma}_{\tilde q',2}} \qquad [29]$$

work on a cone; then if we take constant functions as initial data, the solution inside the cone does not depend on $x$, and the equation restricted to the cone effectively reduces to an ODE:

$$\Box u = f(u) \iff y''(t) = f(y), \qquad y(t) \equiv u(t, x) \qquad [30]$$

By this remark it is elementary to construct solutions of the IVP [6] that blow up in a finite time.

This construction does not apply if the equation has some positive conserved quantity. Indeed, consider a general gauge-invariant equation

$$\Box u + g(|u|^2)u = 0, \qquad u(0, x) = u_0(x), \quad u_t(0, x) = u_1(x) \qquad [31]$$

for some smooth function $g(s)$. Writing $G(s) = \int_0^s g(\sigma)\,d\sigma$, multiplying the equation by $u_t$, and integrating over $\mathbb{R}^n$, it is easy to check that the nonlinear energy

$$E(t) \equiv \int \left[\,|u_t|^2 + |\nabla_x u|^2 + G(|u|^2)\right] dx = E(0) \qquad [32]$$

is constant in time, provided the solution $u$ is smooth enough. When $G(s)$ has no definite sign, we can proceed as above and construct solutions that blow up in finite time; this is usually called the ‘‘focusing’’ case. However, if we assume that $G(s) \ge 0$ (‘‘defocusing’’ case), the energy $E(t)$ is non-negative. The corresponding ODE, which is $y'' + g(y^2)y = 0$, has only global solutions, and one may guess that also the solutions of [31] can be extended to global ones.

This innocent-looking guess turns out to be one of the most difficult problems of the theory of nonlinear waves, and is actually largely unsolved at present.
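As a concrete illustration of the ODE reduction [30] and of the energy [32], the following minimal Python sketch looks at two model cases. Both are assumptions chosen here for illustration and are not taken from the text: the ODE $y'' = 6y^2$, whose explicit solution $y(t) = (T - t)^{-2}$ blows up at the finite time $T$, and the defocusing ODE $y'' + y^3 = 0$ (that is, $g(s) = s$ and $G(s) = s^2/2$), for which $y'^2 + G(y^2)$ should stay constant. The function names and the velocity-Verlet integrator are likewise choices of this sketch.

```python
def blowup_solution(t, T=1.0):
    """Explicit solution y(t) = (T - t)**(-2) of y'' = 6*y**2, valid for t < T."""
    return (T - t) ** -2


def second_derivative(f, t, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


def ode_energy(y, v):
    """ODE analogue of the energy [32] for g(s) = s: y'^2 + G(y^2), G(s) = s^2/2."""
    return v * v + 0.5 * y ** 4


def integrate_defocusing(y0, v0, dt=1e-3, steps=20000):
    """Velocity-Verlet integration of y'' = -y**3, recording the energy at each step."""
    y, v = y0, v0
    a = -y ** 3
    energies = [ode_energy(y, v)]
    for _ in range(steps):
        v_half = v + 0.5 * dt * a   # half kick
        y = y + dt * v_half         # drift
        a = -y ** 3                 # new acceleration
        v = v_half + 0.5 * dt * a   # second half kick
        energies.append(ode_energy(y, v))
    return y, v, energies
```

In the first case the solution leaves every bounded set as $t \to T$, while in the second the recorded energies stay close to their initial value, so $|y|$ remains bounded for all times reached by the integrator.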
In particular, it is possible to choose the indices in [29] in such a way that no derivatives appear on $u$ and $F$: this choice gives

$$\|u\|_{L^p(\mathbb{R}^{n+1})} \lesssim \|u_0\|_{\dot H^{1/2}} + \|u_1\|_{\dot H^{-1/2}} + \|F\|_{L^{p'}(\mathbb{R}^{n+1})}, \qquad p = \frac{2(n+1)}{n-1}$$

which is the estimate originally proved by Strichartz.

The only general result for eqns [31] is Segal’s theorem, stating that the IVP always has a global weak solution:

Theorem 2 Let $g(s)$ be a $C^1$ non-negative function on $[0, +\infty)$, write $G(s) = \int_0^s g(\sigma)\,d\sigma$ and assume that for some constant $C$

$$s\,g(s^2) \le C\,G(s^2), \qquad \lim_{s\to+\infty} G(s) = +\infty \qquad [33]$$
Semilinear Wave Equations 523
Then for any $(u_0, u_1) \in H^1 \times L^2$ such that $G(|u_0|^2) \in L^1$, the IVP [31] has a global solution $u(t, x)$ in the sense of distributions, such that $u' \in L^\infty(\mathbb{R}, L^2(\mathbb{R}^n))$ and $F(u) \in L^\infty(\mathbb{R}, L^1(\mathbb{R}^n))$.

The proof (see Shatah and Struwe (1998)) is delicate but elementary in spirit: by truncating the nonlinear term, we can approximate the problem at hand with a sequence of problems with global solution; then the conservation law [32] yields some extra compactness, which allows us to extract a subsequence converging to a solution of the original equation.

Thus we see that, despite its generality, this result does not shed much light on the difficulties of the problem. Indeed, the weak solution obtained might not be unique, nor smooth, and in these questions the real obstruction to solving [31] is hidden.

Notice that in the one-dimensional case $n = 1$ the solution is always unique and smooth when the data are smooth, since in this case $E(t)$ controls the $L^\infty$-norm of $u$. For higher dimensions $n \ge 2$, something more can be proved if we assume that the nonlinear term has a polynomial growth:

$$s\,g(s^2) = |s|^{p-1}s \quad \text{for } s \text{ large}, \qquad p > 1 \qquad [34]$$

In particular, the defocusing wave equation with a power nonlinearity

names of K Jörgens, I Segal, W Strauss, W von Wahl, P Brenner, H Pecher, J Ginibre, G Velo, R Glassey and the more recent contributions of J Shatah, M Struwe, L Kapitanski, M Grillakis, omitting many others). Actually modern proofs are remarkably simple, and are based again on a variation of the fixed-point argument. Roughly speaking, the linear equation $\Box u + g(|v|^2)v = 0$ defines a mapping $v \mapsto u$; the Strichartz estimates localized on a cone imply that this mapping is Lipschitz continuous in suitable spaces, the Lipschitz constant being estimated by the nonlinear energy of the solution restricted to the cone. In order to show that this mapping is actually a contraction, it is sufficient to prove that the localized energy tends to zero near the tip of the cone, that is, it cannot concentrate at a point. Once this is known, it is easy to continue the solution beyond any maximal time of existence and prove the global existence and uniqueness of the solution.

In the supercritical case $p > p_0(n)$, very little is known at present; there is some indication that the problem is much more unstable than in the subcritical case (Kumlin, Brenner, Lebeau), and there is some numerical evidence in the same direction.
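The fixed-point scheme described above can be illustrated on the ODE model $y'' + g(y^2)y = 0$ rather than on the wave equation itself. The sketch below is only an analogy under stated assumptions: it takes $g(s) = s$, a short time interval, a trapezoidal quadrature, and hypothetical function names. It iterates the map $v \mapsto u$, where $u$ solves the linear problem $u'' = -v^3$ with fixed data, and records how fast successive iterates approach each other. It involves neither Strichartz estimates nor localization on cones.

```python
def duhamel_map(v, ts, y0, y1):
    """One step v -> u, where u'' = -v**3, u(0) = y0, u'(0) = y1.

    Uses u(t) = y0 + y1*t - integral_0^t (t - s) * v(s)**3 ds,
    evaluated with the trapezoidal rule on the grid ts.
    """
    u = []
    for k, t in enumerate(ts):
        integral = 0.0
        for j in range(k):
            left = (t - ts[j]) * v[j] ** 3
            right = (t - ts[j + 1]) * v[j + 1] ** 3
            integral += 0.5 * (left + right) * (ts[j + 1] - ts[j])
        u.append(y0 + y1 * t - integral)
    return u


def picard_iterates(T=0.5, n=100, y0=1.0, y1=0.0, iterations=12):
    """Iterate the map starting from the free solution y0 + y1*t."""
    ts = [T * k / n for k in range(n + 1)]
    v = [y0 + y1 * t for t in ts]
    gaps = []  # sup-norm distance between consecutive iterates
    for _ in range(iterations):
        u = duhamel_map(v, ts, y0, y1)
        gaps.append(max(abs(a - b) for a, b in zip(u, v)))
        v = u
    return ts, v, gaps
```

On a short interval the recorded gaps shrink rapidly, which is the behaviour a contraction produces; for the wave equation the analogous smallness comes from the localized energy, as described in the text.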
The weighted estimates of F John are estimates of the solution in spacetime $L^p$ norms with weights of the form $(1 + |t| + |x|)^{\alpha}(1 + ||t| - |x||)^{\beta}$. An extension of this method was also used in the final complete proof of the conjecture.

The vector field approach of S Klainerman. If we regard energy estimates as norms generated by the plain derivatives, it is natural to extend them to more general norms generated by vector fields commuting, or quasicommuting, with the wave operator. The conservation of energy expressed in these generalized norms has a built-in decay that allows us to prove global existence of small waves. This circle of ideas led very far, and we might even regard Christodoulou and Klainerman’s proof of the stability of Minkowski space for the Einstein equation as an extreme consequence of this approach.

The normal forms of J Shatah. The idea is to apply a nonlinear (and nonlocal) transformation

In the ‘‘cubic’’ case (exponent equal to 3), one has global existence for all data small enough. On the other hand, in the ‘‘quadratic’’ case (exponent equal to 2), it is possible to construct examples where the solution blows up in a finite time no matter how small the data. Now, assume that the nonlinear term has the following structure:

$$F(u') = a\,Q_0(u') + \sum_{0 \le j < k \le 3} c_{jk}\,Q_{jk}(u') + O(|u'|^3) \qquad [36]$$

which is called a ‘‘null structure’’. Here $a$, $c_{jk}$ are constants, and the quadratic forms $Q$ are the following:

$$Q_0(u') = |D_t u|^2 - |D_{x_1} u|^2 - |D_{x_2} u|^2 - |D_{x_3} u|^2 \qquad [37]$$

$$Q_{0j}(u') = D_t u\, D_{x_j} u - D_{x_j} u\, D_t u, \qquad j = 1, 2, 3 \qquad [38]$$

$$Q_{jk}(u') = D_{x_j} u\, D_{x_k} u - D_{x_k} u\, D_{x_j} u, \qquad j, k = 1, 2, 3;\ j < k \qquad [39]$$
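One way to see why the form $Q_0$ in [37] is special is to evaluate it on a plane wave $u(t, x) = \varphi(t - \omega\cdot x)$ with $|\omega| = 1$: there $|D_t u|^2 - \sum_j |D_{x_j} u|^2$ vanishes identically. The sketch below checks this with central finite differences. The profile, the directions, the sample point, and the function names are illustrative assumptions, not taken from the text.

```python
import math


def plane_wave(t, x, omega, profile=math.sin):
    """u(t, x) = profile(t - omega . x) for a direction vector omega."""
    return profile(t - sum(w * xi for w, xi in zip(omega, x)))


def q0(u, t, x, h=1e-5):
    """Finite-difference value of |D_t u|^2 - sum_j |D_{x_j} u|^2 at (t, x)."""
    dt = (u(t + h, x) - u(t - h, x)) / (2 * h)
    total = dt * dt
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        dxj = (u(t, xp) - u(t, xm)) / (2 * h)
        total -= dxj * dxj
    return total


unit = (2 / 3, 1 / 3, 2 / 3)   # |omega| = 1: the wave travels at unit speed
slow = (0.5, 0.0, 0.0)         # |omega| != 1: used only for comparison
point = [0.1, -0.2, 0.4]
null_value = q0(lambda t, x: plane_wave(t, x, unit), 0.3, point)
generic_value = q0(lambda t, x: plane_wave(t, x, slow), 0.3, point)
```

Here `null_value` is zero up to discretization error, whereas `generic_value` is not, since for $|\omega| \ne 1$ the form equals $(1 - |\omega|^2)\,\varphi'^2$.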
Then the problem has a global solution for all small enough data. The extensions and applications of this idea are very wide (see the ‘‘Further reading’’ section for further information). Another situation where the null structure plays an important role is discussed in the next section.

Low Regularity

Theorem 1, although optimal in the classical framework, is not satisfactory for a few reasons. From a physicist’s point of view, requiring $n/2 + 1$ derivatives of the data is not meaningful, since the measurable quantities involve only low-order derivatives, the most important one being the energy, that is, the $H^1$-norm of the solution. Moreover, the wave equation has a rich set of conserved quantities, symmetries and decay properties which may be useful to prove stronger results, and in particular the global existence. However, many of these structures appear only at a low-regularity level ($H^1$ or even $L^p$); in order to exploit them it is essential to work with low-regularity solutions.

As an example, if we were able to prove Theorem 1 for $k = 1$, then we could deduce that the local solutions can be extended to global ones in all cases when the $H^1$-norm is conserved. For instance, this would allow us to solve globally the equations of the form

$$\Box u + G'(|u|^2)u = 0, \qquad G(s) \ge 0$$

The problem of the lowest value of $s$ such that a unique local solution exists in $H^s$ is quite difficult, and still not completely solved. In order to state the results we make the definition of solution precise as follows: the IVP is said to be locally well posed in $H^s$ if, for all $(u_0, u_1)$ in a bounded set $B$ of $H^s \times H^{s-1}$, there exist a $T > 0$, a Banach space $X_T$ (depending on $B$) continuously embedded in $C([0, T]; H^s)$, and a unique solution $u \in X_T$, such that the map $(u_0, u_1) \mapsto u$ is continuous from $B$ to $X_T$.

For the wave equation with a power nonlinearity

$$\Box u = |u|^p \qquad [40]$$

or more generally

$$\Box u = F(u), \qquad F(0) = 0, \qquad |F(u) - F(v)| \le C|u - v|\left(|u|^{p-1} + |v|^{p-1}\right) \qquad [41]$$

the picture is almost complete. Indeed, by using the scaling

$$t \mapsto \lambda t, \qquad x \mapsto \lambda x \qquad [42]$$

and the Lorentz transformation

$$t \mapsto \frac{\lambda t - x_1}{\sqrt{\lambda^2 - 1}}, \qquad x_1 \mapsto \frac{t - \lambda x_1}{\sqrt{\lambda^2 - 1}} \qquad [43]$$

it is possible to show by explicit constructions that the equation is not locally well posed for $p(n/2 - s) > n/2 + 2 - s$ (scaling) and that the equation is not locally well posed for $p(n/4 + 1/4 - s) > n/4 + 5/4 - s$ (Lorentz).

On the positive side, local well-posedness has been almost fully proved in the complementary region of indices, with the exception of a tiny spot near the endpoint $s = 0$, $p = (n + 5)/(n + 1)$ where the problem is still open (and the conjecture is that the equation is ill posed for indices in that region). These results are due to several authors, among whom we cite C Kenig, G Ponce, L Vega, H Lindblad, C Sogge, L Kapitanski, and T Tao.

When the nonlinearity depends also on the first-order derivatives of $u$, the situation becomes more complex. In the general case, the best result available is still the local existence theorem (Theorem 1); the only possible refinement is the use of fractional Sobolev spaces $H^s$, but in general local solvability only holds for $s > n/2 + 1$. If we assume that $F = F(u')$ is a quadratic form in the first-order derivatives, a clever use of Strichartz estimates allows us to prove local solvability down to $s > n/2 + 1/2$ for $n \ge 3$ and $s > 7/4$ for $n = 2$ (Ponce and Sideris).

However, exactly as in the case of the small nonlinear waves examined in the previous section, if the nonlinear term has a null structure the result can be improved. Indeed, when $F(u')$ is a combination of the forms [37]–[39], then local solvability and uniqueness can be proved for all $s > n/2$, as in the case of a nonlinear term of the type $F(u)$. This result is due to Klainerman, Machedon, and Selberg. Again, the proof is based on a variation of the contraction method; the additional ingredient here is the use of suitable function spaces, which are the counterpart for the wave equation of the spaces used by Bourgain in the study of the nonlinear Schrödinger equation. The norm of these spaces is defined as follows:

$$\|u\|_{H^{s,\theta}} \equiv \left\|\langle\xi\rangle^{s}\,\langle|\tau| - |\xi|\rangle^{\theta}\,\tilde u(\tau, \xi)\right\|_{L^2(\mathbb{R}^{n+1})}$$

where $\langle\xi\rangle = (1 + |\xi|^2)^{1/2}$ and $\tilde u$ is the spacetime Fourier transform of $u(t, x)$. The wave operator can be regarded as a spacetime Fourier multiplier of the form $\tau^2 - |\xi|^2 = (|\tau| - |\xi|)(|\tau| + |\xi|)$, and we see that ‘‘inverting’’ the operator $\Box$ has a regularizing effect in the scale of $H^{s,\theta}$ spaces, since it decreases both
526 Separation of Variables for Differential Equations
$s$ and $\theta$ by one unit. Substantiating this formal argument and complementing it with suitable estimates for the nonlinear term requires some hard work, which is contained in the theory of bilinear estimates developed by Klainerman and his school.

See also: Evolution Equations: Linear and Nonlinear; Symmetric Hyperbolic Systems and Shock Waves; Wave Equations and Diffraction.

Further Reading

Choquet-Bruhat Y (1988) Global existence for nonlinear σ-models. Rend. Sem. Mat. Univ. Pol. Torino, Fascicolo speciale ‘‘Hyperbolic equations,’’ 65–86.
D’Ancona P, Georgiev V, and Kubo H (2001) Weighted decay estimates for the wave equation. Journal of Differential Equations 177: 146–208.
Georgiev V, Lindblad H, and Sogge CD (1997) Weighted Strichartz estimates and global existence for semilinear wave equations. American Journal of Mathematics 119: 1291–1319.
Ginibre J and Velo G (1982) The Cauchy problem for the O(N), CP(N − 1), and G_C(N, p) models. Annals of Physics 142: 393–415.
Ginibre J and Velo G (1995) Generalized Strichartz inequalities for the wave equation. Journal of Functional Analysis 133: 50–68.
Hörmander L (1997) Lectures on Nonlinear Hyperbolic Equations. Berlin: Springer.
John F (1979) Blow-up of solutions of nonlinear wave equations in three space dimensions. Manuscripta Mathematica 28(1–3): 235–268.
Keel M and Tao T (1998) Endpoint Strichartz estimates. American Journal of Mathematics 120: 955–980.
Klainerman S (1986) The null condition and global existence to nonlinear wave equations. Lectures in Applied Mathematics 23: 293–326.
Klainerman S and Selberg S (2002) Bilinear estimates and applications to nonlinear wave equations. Communications in Contemporary Mathematics 4: 223–295.
Lindblad H and Sogge C (1995) Existence and scattering with minimal regularity for semilinear wave equations. Journal of Functional Analysis 130: 357–426.
Schiff LI (1951) Nonlinear meson theory of nuclear forces I. Physical Review 84: 1–9.
Segal IE (1963) The global Cauchy problem for a relativistic scalar field with power interaction. Bulletin de la Société Mathématique de France 91: 129–135.
Shatah J (1988) Weak solutions and the development of singularities in the SU(2) σ-model. Communications on Pure and Applied Mathematics 41: 459–469.
Shatah J and Struwe M (1993) Regularity results for nonlinear wave equations. Annals of Mathematics 138: 503–518.
Shatah J and Struwe M (1998) Geometric Wave Equations. Courant Lecture Notes in Mathematics, vol. 2. New York: Courant Institute, New York University.
Strauss W (1989) Nonlinear Wave Equations. CBMS Lecture Notes, vol. 75. Providence, RI: American Mathematical Society.
in the case of the IBVP on the interval $0 \le x \le L$ and with zero boundary conditions

$$\partial_t u = \partial_{xx} u, \quad 0 < t,\ 0 < x < L; \qquad u(0, t) = u(L, t) = 0, \quad 0 < t; \qquad u(x, 0) = f(x), \quad 0 < x < L$$

only a countable set of values for the separation constant $k$ is admissible: $k_n = n\pi/L$, $n = 1, 2, \ldots$. Then the general solution has the form of the Fourier series

$$u(x, t) = \sum_{n=1}^{\infty} c_n \exp(-k_n^2 t)\,\sin(k_n x)$$

operator of total derivative with respect to (w.r.t.) $x_i$; then, $D_i H[x, u] = 0$ or

$$u_{i,m_i+1} = -\frac{\tilde D_i H}{H_{u_{i,m_i}}}$$

where $H_{u_{i,m_i}} = \partial_{u_{i,m_i}} H$. The integrability conditions $D_j u_{i,m_i+1} = 0$, $j \ne i$, give rise to a large set of differential conditions to be satisfied by $H[x, u]$:

$$H_{u_{i,m_i}} H_{u_{j,m_j}}\,\tilde D_i \tilde D_j H + H_{u_{i,m_i} u_{j,m_j}}\,\tilde D_i H\,\tilde D_j H = H_{u_{j,m_j}}\,\tilde D_i H\,\tilde D_j H_{u_{i,m_i}} + H_{u_{i,m_i}}\,\tilde D_j H\,\tilde D_i H_{u_{j,m_j}} \qquad [4]$$
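Returning to the heat-equation IBVP above, the Fourier series solution can be evaluated numerically. The following sketch rests on assumptions made here: the coefficients are taken to be the sine coefficients of the initial datum, $c_n = (2/L)\int_0^L f(x)\sin(k_n x)\,dx$ (the standard formula, which the text does not state), the integral is approximated by the midpoint rule, and the series is truncated after a few terms. The function names and the sample datum are hypothetical.

```python
import math


def sine_coefficients(f, L, n_terms, n_quad=2000):
    """c_n = (2/L) * integral_0^L f(x) sin(k_n x) dx with k_n = n*pi/L (midpoint rule)."""
    dx = L / n_quad
    coeffs = []
    for n in range(1, n_terms + 1):
        k = n * math.pi / L
        total = sum(
            f((m + 0.5) * dx) * math.sin(k * (m + 0.5) * dx) for m in range(n_quad)
        )
        coeffs.append(2.0 / L * total * dx)
    return coeffs


def heat_solution(coeffs, L, x, t):
    """Truncated series u(x, t) = sum_n c_n exp(-k_n^2 t) sin(k_n x)."""
    u = 0.0
    for n, c in enumerate(coeffs, start=1):
        k = n * math.pi / L
        u += c * math.exp(-k * k * t) * math.sin(k * x)
    return u


L = 2.0


def f(x):
    """Sample initial datum: a single sine mode, so only c_1 is nonzero."""
    return math.sin(math.pi * x / L)


coeffs = sine_coefficients(f, L, n_terms=5)
```

For this single-mode datum the series collapses to $u(x, t) = \exp(-(\pi/L)^2 t)\sin(\pi x/L)$, which gives a simple way to check the code.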
method, one looks for a generating function $W(x, \alpha)$ of a canonical transformation

$$y = \frac{\partial W(x, \alpha)}{\partial x}, \qquad \beta = \frac{\partial W(x, \alpha)}{\partial \alpha}$$

that transforms Hamiltonian equations [5] into simple equations for the new variables $\beta \in \mathbb{R}^n$, $\alpha \in \mathbb{R}^n$. Since the transformation is canonical, the transformed equations are again Hamiltonian with the new Hamiltonian $\tilde H(\beta, \alpha) = H(x(\beta, \alpha), y(\beta, \alpha))$. If we choose this transformation so that $\tilde H(\beta, \alpha) = \alpha_1$, then the transformed Hamilton equations become

$$\dot\beta = \frac{\partial \tilde H(\beta, \alpha)}{\partial \alpha} = (1, 0, \ldots, 0), \qquad \dot\alpha = -\frac{\partial \tilde H(\beta, \alpha)}{\partial \beta} = 0$$

so that $\beta(t) = (t + \beta_1^0, \beta_2^0, \ldots, \beta_n^0)$, $\alpha(t) = (\alpha_1^0, \ldots, \alpha_n^0) = \text{const.}$ and the solution $x(t)$, $y(t)$ of the Hamilton equations [5] is then given implicitly by the equations

$$\beta(t) = \frac{\partial W(x(t), \alpha)}{\partial \alpha}, \qquad y(t) = \frac{\partial W(x(t), \alpha)}{\partial x}$$

Since

$$y = \frac{\partial W(x, \alpha)}{\partial x}$$

the generating function $W(x, \alpha)$ has to satisfy (identically w.r.t. $(x, \alpha)$) the first-order nonlinear PDE

$$H\!\left(x, \frac{\partial W(x, \alpha)}{\partial x}\right) = \alpha_1 \qquad [6]$$

This equation is called the Hamilton–Jacobi equation for the generating function $W(x, \alpha)$. It is solved when its complete integral $W(x, \alpha)$, depending on $n$ independent constants, is known; complete means that

$$\det\left[\frac{\partial^2 W(x, \alpha)}{\partial x_i\,\partial \alpha_j}\right] \ne 0$$

In general, it is very difficult to find solutions of [6]. The most important method is the method of separation of variables when one looks for a solution in the form $W(x, \alpha) = \sum_{k=1}^n W_k(x_k, \alpha)$ which is a sum of $n$ functions $W_k(x_k, \alpha)$, each depending on a single variable $x_k$ and, possibly, all constants $\alpha$. If the Hamilton–Jacobi equation [6] admits such a solution, then integrating this equation is reduced to integrating $n$ (uncoupled) first-order ODEs for functions $W_k(x_k, \alpha)$. The constants $\alpha_k$ acquire then the meaning of integration constants.

A separable solution $W(x, \alpha)$ of [6] exists whenever the Hamiltonian $H(x, y)$ satisfies (identically) the integrability conditions [4] which in this case acquire the (nonlinear) form

$$L_{ij}(H) \equiv \partial_i H\,\partial_j H\,\partial^i\partial^j H + \partial^i H\,\partial^j H\,\partial_i\partial_j H - \partial_i H\,\partial^j H\,\partial^i\partial_j H - \partial^i H\,\partial_j H\,\partial_i\partial^j H = 0 \quad \text{for all } i, j = 1, \ldots, n \qquad [7]$$

($\partial_i = \partial/\partial x_i$, $\partial^i = \partial/\partial y_i$) found by Levi-Civita (1904).

In classical mechanics the most important Hamiltonians are natural ones:

$$H(x, y) = \frac12\sum_{i,j} g^{ij}(x)\,y_i y_j + V(x) \equiv G + V \qquad [8]$$

They are defined on the cotangent bundle $T^*Q$ of a configurational Riemannian manifold $Q$ with the metric tensor $g$. The function $G$ is the geodesic Hamiltonian associated with the metric tensor $g$. For such natural Hamiltonians, the Levi-Civita condition $L_{ij}(G + V) = 0$ splits into the condition $L_{ij}(G) = 0$ and a condition for the potential $V(x)$. The condition $L_{ij}(G) = 0$, depending solely on the kinetic energy term, is thus a necessary condition for coordinates $x_i$ on $Q$ to be separation coordinates for [8].

In the fundamental case of orthogonal separation (i.e., when $g^{ij} = 0$ for $i \ne j$), the Levi-Civita conditions $L_{ij}(G + V) = 0$ read

$$\partial_i\partial_j g^{kk} - \partial_i \ln g^{jj}\,\partial_j g^{kk} - \partial_j \ln g^{ii}\,\partial_i g^{kk} = 0, \qquad i \ne j \qquad [9]$$

$$\partial_i\partial_j V - \partial_i \ln g^{jj}\,\partial_j V - \partial_j \ln g^{ii}\,\partial_i V = 0, \qquad i \ne j \qquad [10]$$

The main questions arising here are

1. What is the algebraic form of orthogonally separable Riemannian metrics?
2. What is the form of separable coordinates on Riemannian manifolds?

The first question is answered by the Stäckel theorem (Stäckel 1891) that provides an algebraic characterization of orthogonal separability of a natural Hamiltonian $H = G + V$.

Theorem 1 The Hamilton–Jacobi equation for the natural Hamiltonian

$$H = G + V = \frac12\sum_i g^{ii}(x)\,y_i^2 + V(x)$$
Proof If

$$\left[g^{11}, \ldots, g^{nn}\right]\begin{bmatrix}\varphi_{11}(x_1) & \cdots & \varphi_{1n}(x_1)\\ \vdots & \ddots & \vdots\\ \varphi_{n1}(x_n) & \cdots & \varphi_{nn}(x_n)\end{bmatrix} = [1, 0, \ldots, 0] \qquad [11]$$

then the Hamilton–Jacobi equation for $H$ can be written as

$$\frac12\sum_i g^{ii}\left(\frac{\partial W}{\partial x_i}\right)^2 + \sum_i g^{ii} f_i(x_i) = \alpha_1 = \alpha_1\sum_i g^{ii}\varphi_{i1}(x_i) + \alpha_2\sum_i g^{ii}\varphi_{i2}(x_i) + \cdots + \alpha_n\sum_i g^{ii}\varphi_{in}(x_i) \qquad [12]$$

This equation admits an additively separable solution $W = \sum_i W_i(x_i)$, where the functions $W_i$ satisfy $n$ ODEs (separation equations):

$$\frac12\left(\frac{\partial W_i}{\partial x_i}\right)^2 + f_i(x_i) = \alpha_1\varphi_{i1}(x_i) + \alpha_2\varphi_{i2}(x_i) + \cdots + \alpha_n\varphi_{in}(x_i), \qquad i = 1, \ldots, n \qquad [13]$$

By differentiating [13] w.r.t. $\alpha_j$, we get

$$\frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial\alpha_j} = \varphi_{ij}(x_i)$$

and thus

$$\det\left[\varphi_{ij}(x_i)\right] = \frac{\partial W_1}{\partial x_1}\cdots\frac{\partial W_n}{\partial x_n}\,\det\left[\frac{\partial^2 W}{\partial x_i\,\partial\alpha_j}\right] \ne 0$$

so that $W = \sum_i W_i(x_i)$ is indeed a complete integral of the Hamilton–Jacobi equation [12]. Conversely, if $W = \sum_i W_i(x_i)$ is a complete integral of the Hamilton–Jacobi equation [12], then by differentiating it w.r.t. $\alpha_j$ we get for $j = 1$

$$\sum_i g^{ii}\,\frac{\partial W_i}{\partial x_i}\,\frac{\partial^2 W_i}{\partial x_i\,\partial\alpha_j} = 1$$

$$V = \alpha_1 - \frac12\sum_i g^{ii}\left(\partial_{x_i} W_i\right)^2 = \sum_i g^{ii}\left[\alpha_1\varphi_{i1}(x_i) - \frac12\left(\partial_{x_i} W_i\right)^2\right] = \sum_i g^{ii} f_i(x_i) \qquad \blacksquare$$

Remark 2 The Stäckel characterization of orthogonal separability is equivalent to Levi-Civita conditions [9] and [10]. It is in fact a solution of these conditions.

Remark 3 With every Stäckel matrix $\Phi$, one can relate a family of $n$ Hamiltonians, quadratic in momenta, defined by the $n$ rows of the inverse Stäckel matrix $\Phi^{-1}$ with entries $(\Phi^{-1})^{kr}$:

$$H_k = \frac12\sum_{r=1}^n (\Phi^{-1})^{kr}\,y_r^2, \qquad k = 1, \ldots, n \qquad [14]$$

(so that $H_1 = G$). These Hamiltonians are linearly and functionally independent; they Poisson-commute (so that they form a Liouville integrable system) and are all diagonal so that they have common eigenvectors.

These properties are the main ingredients of an intrinsic (coordinate-independent) characterization of separable geodesic Hamiltonians $G$ in terms of involutive Killing tensors that is due to works of Eisenhart (1934), Kalnins and Miller (1980), and Benenti (1997).

Theorem 4 A necessary and sufficient condition for the existence of an orthogonal additive separable coordinate system $x$ for the Hamilton–Jacobi equation of the geodesic Hamiltonian $H_1 = G$ on an $n$-dimensional (pseudo)-Riemannian manifold is that there exist $n$ quadratic forms $H_r = \sum_{i,j}^n h_r^{ij}(x)\,y_i y_j$ such that

(i) They all Poisson-commute: $\{H_r, H_s\} = 0$, $1 \le r, s \le n$.
(ii) The set $\{H_r\}_{r=1}^n$ is linearly independent.
(iii) There is a basis $\{\omega^{(j)}\}_{j=1}^n$ of $n$ simultaneous eigenforms for all $H_r$.
If conditions (i)–(iii) are satisfied then there exist functions $g_j(x)$ such that $\omega^{(j)} = g_j\,dx^j$, $j = 1, \ldots, n$.

This theorem has been further simplified by Benenti (1997), who has shown that for separability it is sufficient that $g_{ij}$ admits a single Killing 2-tensor with simple eigenvalues and normal eigenvectors. He has also explained the role of ignorable coordinates.

These results are key ingredients of an answer to the question (2). Eisenhart (1934), starting from the fact that every separable geodesic Hamiltonian $H = G$ admits $n$ quadratic (w.r.t. momenta $y_i$) integrals of motion, derived a set of nonlinear PDEs characterizing separable Riemannian metrics. He has solved these equations for spaces of constant curvature. This solution is the basis of Kalnins and Miller’s (1986) diagrammatic classification of all orthogonal separation coordinates on $\mathbb{R}^n$ and the sphere $S^n$. Separable coordinates on the Minkowski space $M^n$ have not been classified yet.

Since the work of Robertson (1927) and Eisenhart (1934), it is known that in $\mathbb{R}^n$, $S^n$ and, in general, in the space with diagonal Ricci tensor, the (additive) separability of the Hamilton–Jacobi equation for the natural Hamiltonian $H = G + V$ is equivalent to multiplicative separability of the stationary Schrödinger equation with the same potential $V$:

$$(-\Delta + V(x))\,\psi(x) = E\,\psi(x) \qquad [15]$$

where

$$\Delta = \sum_{i,j=1}^n \frac{1}{\sqrt{\det(g)}}\,\partial_i\,\sqrt{\det(g)}\,g^{ij}\,\partial_j$$

is the Laplace–Beltrami operator. Usually, multiplicatively separated solutions $\psi(x) = \prod_{i=1}^n \psi_i(x^i)$ are considered, but the change of the dependent variable $u = \ln\psi$ transforms them into additively separable solutions. If we restrict our considerations to orthogonal separation coordinates ($g^{ij} = 0$ for $i \ne j$), eqn [15] becomes

$$-\sum_{i=1}^n\left[g^{ii}\left(u_{ii} + u_i^2\right) + \frac{1}{\sqrt{\det(g)}}\,\partial_i\!\left(\sqrt{\det(g)}\,g^{ii}\right)u_i\right] + V(x) = E$$

where $u_i = \partial_i u$, $u_{ii} = \partial_i\partial_i u$. The integrability conditions [4] for regular separation lead to the Levi-Civita condition [9] on the components $g^{ii}$ of the metric tensor, upon comparison of the coefficients at $u_i^2$. The coefficients at $u_{ii}$ yield the Robertson condition

$$\partial_i\partial_j \ln\!\left(\sqrt{\det(g)}\,g^{ii}\right) = 0, \qquad i \ne j$$

and the constant terms in [4] give the Levi-Civita equation [10] meaning that $V(x) = \sum_{i=1}^n g^{ii} f_i(x_i)$. Eisenhart has shown that the Robertson condition is equivalent to the requirement that the Ricci tensor is diagonal: $R_{ij} = 0$, $i \ne j$ in variables $x$ so that the Robertson condition is satisfied automatically in the Euclidean space, in spaces of constant curvature and in Einstein spaces. Thus every orthogonal coordinate system permitting multiplicative separation of the Schrödinger equation corresponds to the Stäckel form.

Jacobi Problem of Separability

In order to apply the separability theory to physical Hamiltonians $H = (1/2)p^2 + V(q)$, $p = (p_1, \ldots, p_n)$, $q = (q_1, \ldots, q_n)$, it is essential to solve the following problem: ‘‘given a potential $V(q)$, decide if there exists a point transformation $x(q)$ to some curvilinear coordinates $x$ such that the Hamilton–Jacobi equation associated with $H$ is separable in coordinates $x$, and if such transformation exists, determine it and solve the obtained Hamilton–Jacobi equation.’’

This problem has been raised by Jacobi (1884) in connection with the problem of finding geodesic motions on a 3-axial ellipsoid. For solving this problem Jacobi introduced his ‘‘remarkable change of coordinates’’ to the generalized elliptic coordinates $x(q)$ defined through zeros of the rational function

$$1 + \sum_{i=1}^n \frac{(q^i)^2}{z - \lambda_i} \equiv \frac{\prod_j (z - x^j)}{\prod_i (z - \lambda_i)} \qquad [16]$$

where the constants $\lambda_i > 0$ are all different. From the graph of the left-hand side of [16], it is easy to see that there are exactly $n$ simple, real zeros. For given values of elliptic coordinates $x^j$, the values of $(q^i)^2$ are uniquely determined as residues at $\lambda_i$ while Cartesian coordinates $q^i$ are determined uniquely only in each $n$-tant of $\mathbb{R}^n$.

The Jacobi elliptic coordinates play a pivotal role in orthogonal separability on $\mathbb{R}^n$ and $S^n$ since they are the mother of all other separation coordinates that can be obtained through proper and improper degenerations of $\lambda_i$’s. By using these coordinates Jacobi solved not only the geodesic motions on the ellipsoid but also the motion on the ellipsoid under the action of harmonic potential $V(q) = (1/2)q^2$. He has also found separation coordinates for a system of three interacting particles on the line known today as the Calogero system. In general, however, Jacobi considered the problem of finding separation coordinates for a given potential $V(q)$ to be very difficult. In Vorlesungen über Dynamik, ch. 26, he writes: ‘‘The main difficulty in integrating a given
differential equation lies in introducing convenient variables, which there is no rule for finding. Therefore, we must travel the reverse path and after finding some notable substitution, look for problems to which it can be successfully applied’’. This statement had a profound influence on further development of SoV theory that concentrated on characterizing separable Hamiltonians (as expressed in terms of separation coordinates) and on describing and classifying separation coordinates.

The original problem of Jacobi of finding separation variables for a given natural Hamiltonian has been taken up by Rauch-Wojciechowski (1986), who found a characterization of separable potentials $V(q)$ in terms of Cartesian coordinates $q_i$. Its invariant geometric form has been given by Benenti. A complete criterion of separability that allows for an effective testing and calculation of separation coordinates (if they exist) for $V(q)$ has been given by Waksjö and Rauch-Wojciechowski (2003). This criterion is directly applicable to the problem of finding SoV for the Schrödinger equation.

Criterion of Separability for n = 2

The criterion of separability for $n = 2$ can be read from the Bertrand–Darboux theorem.

Theorem 5 (Bertrand–Darboux). For the Hamiltonian

$$H = \tfrac12\left(p_1^2 + p_2^2\right) + V(q_1, q_2)$$

the following statements are equivalent:

(i) $H$ has a functionally independent integral of motion $\{H, K\} = 0$ of the form

$$K = \left(a q_2^2 + b q_2 + c\right)p_1^2 + \left(a q_1^2 + \tilde b q_1 + \tilde c\right)p_2^2 + \left(-2a q_1 q_2 - b q_1 - \tilde b q_2 + d\right)p_1 p_2 + k(q_1, q_2)$$

able in one of the four orthogonal coordinate systems in the plane: elliptic, parabolic, polar, or Cartesian.

Remark 6 If the potential $V(q_1, q_2)$ is separable, then it admits an integral of motion $K$ that is quadratic w.r.t. momenta and $V$ satisfies (identically w.r.t. $q_1$, $q_2$) eqn [17] for certain values of the undetermined constants $a, b, \tilde b, c, \tilde c, d$. Since coefficients at linearly independent expressions of $q_1$, $q_2$ have to be equal to zero, the parameters $a, b, \tilde b, c, \tilde c, d$ have to satisfy a set of linear, algebraic, homogeneous equations. If there is a nonzero solution for $a, b, \tilde b, c, \tilde c, d$, then there exists an integral of motion $K$ and separation coordinates can be determined as characteristic variables for equation [17].

Example 7 Separable cases of the Hénon–Heiles potential

$$V = \tfrac12\left(\omega_1 q_1^2 + \omega_2 q_2^2\right) + \alpha\,q_1^2 q_2 - \tfrac13\beta\,q_2^3$$

By substituting this form of $V$ into [17], we get two sets of admissible solutions for parameters $\alpha$, $\beta$, $\omega_1$, $\omega_2$: (i) $\alpha = -\beta$, $\omega_1 = \omega_2$ with $V$ separable in rotated (by $\pi/4$) Cartesian coordinates; (ii) $\beta = -6\alpha$, $\omega_1$, $\omega_2$ arbitrary with $V$ separable in the shifted parabolic coordinates. In case (ii) eqn [17] becomes

$$2\left[q_2 - \frac{1}{4\alpha}\left(4\omega_1 - \omega_2\right)\right]\partial_1\partial_2 V + q_1\left(\partial_1^2 V - \partial_2^2 V\right) + 3\,\partial_1 V = 0$$

and in its characteristic coordinates defined as $q_1 = \sqrt{\xi\eta}$, $q_2 = (1/2)(\xi - \eta) + (1/4\alpha)(4\omega_1 - \omega_2)$ it takes the form $(\xi + \eta)\,\partial_\xi\partial_\eta V + \partial_\xi V + \partial_\eta V = 0$ solved by $V(\xi, \eta) = (\xi + \eta)^{-1}[f(\xi) + g(\eta)]$ which is separable in the parabolic coordinates.

Effective Criterion of Separability for Arbitrary Dimension

$$K_i = \sum_{r=1,\,r\ne i}^{n} (\lambda_i - \lambda_r)^{-1}\,l_{ir}^2 + p_i^2 + k_i(q), \qquad l_{ir} = q_i p_r - q_r p_i \qquad [18]$$
where $\varphi(s) = 4(s - e_1)(s - e_2)(s - e_3)$ and $e_i$ are parameters of the elliptic coordinates. This is the Lamé equation; its solutions define new transcendental functions that depend on the choice of the two constants.

The approach presented here extends to diverse modifications such as vibrations with forcing term $\Delta w(q) + \lambda w(q) = f(q)$, vibrations of a nonhomogeneous medium $\Delta w(q) + \lambda\rho(q)w(q) = 0$, the stationary Schrödinger equation $-\Delta w(q) + V(q)w(q) = \lambda w(q)$ whenever the functions $\rho(q)$, $f(q)$, $V(q)$ are compatible with the separation coordinates.

Separation equations for the second-order BVP are the source of one-dimensional eigenvalue problems of the Sturm–Liouville type

$$(p(s)u')' - q(s)u + \lambda\rho(s)u = 0$$

with singularities that may occur at the endpoints of the fundamental domain. The majority of orthogonal polynomials and special functions appearing in mathematical physics are solutions of Sturm–Liouville problems.

In the complex domain the study of singularities of Laurent series solutions of the same equations led to the development of the theory of linear ODEs with singular points of the Fuchs class and the Böcher class.

Constructive Approach to Separability of Liouville Integrable Systems

In the constructive approach to separability, one considers simultaneously all Hamilton–Jacobi equations following from a set of $n$, functionally independent, commuting integrals $H_1(x, y), \ldots, H_n(x, y)$, $\{H_i, H_j\} = 0$, that define a Liouville integrable system (Sklyanin 1995).

One starts with the separation equations, a set of $n$ decoupled ODEs for the functions $W_i(x_i, \alpha)$ depending on one variable $x_i$ and parametric $\alpha \in \mathbb{R}^n$:

$$f_i\!\left(x_i,\ y_i = \frac{\partial W_i(x_i, \alpha)}{\partial x_i},\ \alpha\right) = 0 \qquad [25]$$

Assume that the dependence on $\alpha_i$ is essential (i.e., that $\det(\partial f_i/\partial\alpha_j) \ne 0$) so that we can resolve eqns [25] w.r.t. $\alpha_i$ so that $\alpha_i = H_i(x, y)$ for some functions $H_i$. If the functions $W_i(x_i, \alpha)$ solve [25] identically w.r.t. $x$ and $\alpha$, then the function $W(x, \alpha) = \sum_{i=1}^n W_i(x_i, \alpha)$ is simultaneously an additively separable solution of eqns [25] and of the equations

$$\alpha_i = H_i\!\left(x,\ y = \frac{\partial W(x, \alpha)}{\partial x}\right), \qquad i = 1, \ldots, n \qquad [26]$$

since solving [25] w.r.t. $\alpha$ is a purely algebraic operation. We can treat eqns [26] as a set of simultaneously separable (in the canonical variables $(x, y)$) Hamilton–Jacobi equations related to the Hamiltonians $H_i$. Assume now that

$$\det\left[\frac{\partial^2 W}{\partial x_i\,\partial\alpha_j}\right] = \det\left[\frac{\partial^2 W_i}{\partial x_i\,\partial\alpha_j}\right] \ne 0$$

i.e. that $W$ is a complete integral for [26]. Then the Hamiltonians $H_i(x, y) = \alpha_i$ Poisson-commute since $\alpha_i$ can be treated as new canonical variables obtained by the canonical transformation $(x, y) \to (\beta, \alpha)$ given by

$$y = \frac{\partial W(x, \alpha)}{\partial x}, \qquad \beta = \frac{\partial W(x, \alpha)}{\partial \alpha}$$

Thus, any set of separation relations [25] that is solvable w.r.t. $\alpha$ defines a Liouville integrable system.

If we perform a canonical transformation from $(x, y)$ to new variables $(q, p)$, then the new set of commuting Hamiltonians $\tilde H_i(q, p) = H_i(x(q, p), y(q, p))$ is also called separable.

The main problem for any given set of commuting Hamiltonians $\tilde H_i(q, p)$ is to decide if there exists a canonical transformation $(q, p) \to (x, y)$ to the separation variables $(x, y)$ so that the related Hamilton–Jacobi equations [26] are simultaneously separable. An answer to this problem is known for integrable Hamiltonians solvable through the spectral curve method (Sklyanin 1995) and for the whole class of natural Hamiltonians discussed earlier.

This approach brings a new, wider perspective to the classical separability mechanism stated in the Stäckel theorem. It contains the majority of all known separable Hamiltonian systems. For example, if we specify the separation relations [25] to be affine in $\alpha_i$,

$$\sum_{k=1}^n f_{ik}(x_i, y_i)\,\alpha_k = g_i(x_i, y_i), \qquad i = 1, \ldots, n \qquad [27]$$

then [27] are called generalized Stäckel separability conditions. To recover the explicit form of Hamiltonians $H_k = \alpha_k$, it is enough to solve relations [27] w.r.t. $\alpha_k$. It has been proved that the Stäckel Hamiltonians in [27] constitute a quasi-bi-Hamiltonian chain. If we specify further relations [27] by assuming that functions $f_{ik}$ do not depend on $y_i$ and functions $g_i$ are quadratic in $y_i$, then we obtain the classical Stäckel separability conditions (see Theorem 1)

$$\sum_{k=1}^n f_{ik}(x_i)\,\alpha_k = \frac12 g_i(x_i)\,y_i^2 + h_i(x_i) \qquad [28]$$
Separatrix Splitting 535
that can be solved for $\alpha_k$ yielding

$$\alpha_k(x, y) = \sum_{i=1}^n (\Phi^{-1})^{ki}\left[\frac12 y_i^2 + \frac{h_i(x_i)}{g_i(x_i)}\right]$$

that is, the Stäckel Hamiltonians [14] with the Stäckel matrix $\Phi = [\varphi_{ik}]$, where $\varphi_{ik} = f_{ik}(x_i)/g_i(x_i)$. By specifying [28] further, we obtain separation relations

$$x_i^{n-1}\alpha_1 + x_i^{n-2}\alpha_2 + \cdots + \alpha_n = \frac12 g(x_i)\,y_i^2 + h(x_i)$$

which give the so-called Benenti systems associated with conformal Killing tensors and cofactor pair systems.

Relations [27], with $g_i(x_i, y_i)$ depending exponentially on momenta $y$, contain several well-known systems such as the periodic Toda lattice, the KdV dressing chain, and the Ruijsenaars–Schneider system. Relations with $g_i$ cubic in momenta $y$ yield stationary flows of the Boussinesq hierarchy and integrable systems on the loop algebra sl(3).

See also: Boundary-Value Problems for Integrable Equations; Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Elliptic Differential Equations: Linear Theory; Evolution Equations: Linear and Nonlinear; Integrable Systems: Overview; Multi-Hamiltonian Systems; Ordinary Special Functions; Recursion Operators in Classical Mechanics; Toda Lattices.

Courant R and Hilbert D (1989) Methods of Mathematical Physics. vol. II. Partial Differential Equations, Wiley Classics Library. A Wiley-Interscience Publication. New York: Wiley.
Eisenhart LP (1934) Separable systems of Stäckel. Annals of Mathematics 35(2): 284–305.
Fourier J (1945) The Analytical Theory of Heat. New York: G. E. Stechert and Co.
Jacobi CGJ (1884) Vorlesungen über Dynamik. Herausgegeben A. Clebsch., ch. 26. Berlin: Verlag von G. Reimer.
Kalnins EG and Miller W Jr. (1980) Killing tensors and variable separation for Hamilton–Jacobi and Helmholtz equations. SIAM Journal on Mathematical Analysis 11(6): 1011–1026.
Kalnins EG and Miller W Jr. (1986) Separation of variables on n-dimensional Riemannian manifolds. I. The n-sphere S^n and Euclidean n-space R^n. Journal of Mathematical Physics 27(7): 1721–1736.
Landau LD and Lifshitz EM (1976) Course of Theoretical Physics. vol. 1. Mechanics, Third Edition. Translated from the Russian by JB Sykes and JS Bell. Oxford–New York–Toronto, Ont: Pergamon Press.
Levi-Civita T (1904) Sulla integrazione della equazione di Hamilton-Jacobi per separazione di variabili. Mathematische Annalen 59: 383–397.
Miller W Jr. (1983) The technique of variable separation for partial differential equations. In: Wolf BK (ed.) Nonlinear Phenomena (Oaxtepec, 1982), Lecture Notes in Physics, vol. 189, pp. 184–208. Berlin: Springer.
Sklyanin EK (1995) Separation of variables. New trends. Progress in Theoretical Physics 118: 35–60.
Stäckel P (1897) Über die Integration der Hamilton’schen Differentialgleichung mittelst Separation der Variabeln. Mathematische Annalen 49(1): 145–147.
Waksjö C and Rauch-Wojciechowski S (2003) How to find separation coordinates for the Hamilton–Jacobi equation: a criterion of separability for natural Hamiltonian systems. Mathematical Physics, Analysis and Geometry 6(4): 301–348.
Wojciechowski S (1986) Review of the recent results on
Further Reading integrability of natural Hamiltonian systems. In: Winternitz P
Benenti S (1997) Intrinsic characterization of the variable (ed.) Systèmes dynamiques non linéaires: intégrabilité
separation in the Hamilton–Jacobi equation. Journal of et comportement qualitatif, Sém. Math. Sup, vol. 102,
Mathematical Physics 38(12): 6578–6602. pp. 294–327. Montreal, QC: Presses Univ. Montréal.
Separatrix Splitting
D Treschev, Moscow State University, Moscow, Russia
© 2006 Elsevier Ltd. All rights reserved.

Separatrices are asymptotic manifolds in dynamical systems. However, this term is applied usually in the case of a small dimension of the phase space, where these manifolds are hypersurfaces. In the context of separatrix splitting manifolds asymptotic to hyperbolic tori are usually considered, where tori of dimension 0 and 1 are called equilibrium positions and periodic trajectories, respectively. A separatrix can be stable (asymptotic as $t \to +\infty$) and unstable (asymptotic as $t \to -\infty$).

In this article we consider the case of systems with finite-dimensional phase space. Basically we deal with nonautonomous Hamiltonian systems $2\pi$-periodic in time. However, it is useful to keep in mind the fact that the cases of autonomous Hamiltonian systems and symplectic maps are dynamically the same. Some results for non-Hamiltonian perturbations will also be presented. Hamiltonian systems with one-and-a-half or two degrees of freedom as well as area-preserving two-dimensional maps are especially important for us because the results on the separatrix splitting in this case are more clear and complete. Dynamics in such systems is essentially the same. Below we call these systems two dimensional. We assume that all systems are at least C1-smooth.
Since the addition to the Hamiltonian of a function, depending only on $t$ and $\varepsilon$, does not change the dynamics, without loss of generality we can assume that $H_1(0, 0, t) \equiv 0$. Hence the Poincaré integral

$$P(\tau) = \int_{-\infty}^{+\infty} H_1(\gamma(t + \tau), t)\, dt$$

converges. The function $P$ carries all information on the separatrix splitting in the first approximation in $\varepsilon$.

where

1. $h^u_\varepsilon(\tau) = O(\varepsilon^2)$,
2. $h^s_\varepsilon(\tau) = \varepsilon M(\tau) + O(\varepsilon^2)$.

Moreover, let $g^t_\varepsilon : D \to D$ be the phase flow of the perturbed system. The map $g^{2\pi}_\varepsilon$ is called the Poincaré map. The following statement holds.

3. For any two points $z_0, z_1 \in U$ such that $z_1 = g^{2\pi}_\varepsilon(z_0)$, let $(\tau_0, h_0)$ and $(\tau_1, h_1)$ be their time–energy coordinates. Then

$$\tau_1 = \tau_0 + 2\pi + O(\varepsilon), \qquad h_1 = h_0 + O(\varepsilon)$$
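The Poincaré integral can be checked numerically in a concrete case. The sketch below is only an illustration under stated assumptions: it takes the pendulum with oscillating suspension point, $H = \tfrac12 y^2 + \Omega^2 \cos x + \varepsilon \cos t \cos x$ (the forcing $\cos t$ and the symbol $\Omega$ for the pendulum frequency are choices made here), uses the unperturbed separatrix relation $\cos x(t) = 1 - 2\cosh^{-2}(\Omega t)$, and compares a trapezoid-rule estimate of $P(\tau)$ with the closed form $-2\pi \cos\tau / (\Omega^2 \sinh(\pi/(2\Omega)))$. The function names and quadrature parameters are hypothetical.

```python
import math


def poincare_integral(tau, omega=1.0, half_width=40.0, n=20001):
    """Trapezoid-rule estimate of
        P(tau) = integral of cos(t) * (cos x(t + tau) - 1) dt over the real line,
    where cos x(s) - 1 = -2 / cosh(omega * s)**2 on the unperturbed separatrix.

    Assumption: the forcing is theta(t) = cos t; omega is the pendulum frequency.
    """
    # The integrand is concentrated near t = -tau, so center the window there.
    a = -tau - half_width / omega
    b = -tau + half_width / omega
    h = (b - a) / (n - 1)
    total = 0.0
    for i in range(n):
        t = a + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.cos(t) * (-2.0 / math.cosh(omega * (t + tau)) ** 2)
    return total * h


def poincare_closed_form(tau, omega=1.0):
    """Closed form of the same integral for theta(t) = cos t."""
    return -2.0 * math.pi * math.cos(tau) / (
        omega ** 2 * math.sinh(math.pi / (2.0 * omega))
    )


if __name__ == "__main__":
    for tau in (0.0, 0.7, 2.5):
        numeric = poincare_integral(tau)
        exact = poincare_closed_form(tau)
        print(f"tau={tau:4.1f}  numeric={numeric:+.10f}  closed form={exact:+.10f}")
```

A plain truncated trapezoid rule is adequate for this sketch because the integrand is smooth and decays like $e^{-2\Omega |t|}$; for other forcings a library quadrature routine may be preferable.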
Figure 1 Perturbed separatrices in time–energy coordinates. (Axes: $\tau$, $h$; labels: Lobe, $\Lambda^s$, $\Lambda^u$, $\tau_*$.)

Existence of such coordinates has several corollaries.

If $P$ is not identically constant, the separatrices split and this splitting is of the first order in $\varepsilon$.

Let $\tau_*$ be a simple zero of $M$. Then the perturbed separatrices intersect transversally at a point $z_*(\varepsilon)$ with time–energy coordinates $(\tau_* + O(\varepsilon), O(\varepsilon^2), t = 0)$. Such a point $z_*(\varepsilon)$ is called a transversal homoclinic point. It generates a doubly asymptotic solution in the perturbed system.

Consider a lobe domain $L(\tau_*, \varepsilon)$ bounded by two segments of separatrices on the section $\{t = 0\}$ (see Figure 1). Let another "corner point" of the lobe $L(\tau_*, \varepsilon)$ correspond to the simple zero $\tau_*'$ of $M$. Then the symplectic area of $L(\tau_*, \varepsilon)$ equals

$$A_L(\tau_*, \varepsilon) = \varepsilon \int_{\tau_*}^{\tau_*'} M(\tau)\, d\tau + O(\varepsilon^2)$$

A Standard Example

Consider as an example a pendulum with periodically oscillating suspension point. The Hamiltonian of the system can be presented in the form

$$H(x, y, t, \varepsilon) = \tfrac{1}{2} y^2 + \Omega^2 \cos x + \varepsilon \theta(t) \cos x \qquad [3]$$

where $\Omega$ is the "internal" frequency of the pendulum. The function $\theta$ is $2\pi$-periodic in time. Hence the frequency of the suspension point oscillation equals 1. In this case, the unperturbed homoclinic solution $\gamma(t)$ can be computed explicitly. In particular,

$$\cos(x(t)) = 1 - 2\cosh^{-2}(\Omega t)$$

Therefore, $P(\tau) = \int_{-\infty}^{+\infty} \theta(t)\,(\cos(x(t + \tau)) - 1)\, dt$. For example, if $\theta(t) = \cos t$, we have

$$P(\tau) = -\frac{2\pi \cos\tau}{\Omega^2 \sinh(\pi/(2\Omega))}$$

In this case, different lobes have the same area

$$A_L = \frac{4\pi\varepsilon}{\Omega^2 \sinh(\pi/(2\Omega))} + O(\varepsilon^2)$$

Multidimensional Case

Multidimensional generalization of the Poincaré–Melnikov construction is strongly connected to the concept of a (partially) hyperbolic torus. Let $(M, \omega, H)$ be a Hamiltonian system on the $2m$-dimensional symplectic manifold $(M, \omega)$.

An invariant $n$-torus $N \subset M$ ($0 \le n < m$) is called hyperbolic if there exist coordinates $x, y, z$ on $M$ in a neighborhood of $N$ such that

1. $y = (y_1, \ldots, y_n)$, $x = (x_1, \ldots, x_n) \bmod 2\pi$, $z = (z^s, z^u)$, $z^{s,u} = (z^{s,u}_1, \ldots, z^{s,u}_l)$, $l + n = m$;
2. $\omega = dy \wedge dx + dz^u \wedge dz^s$;
3. $N = \{(x, y, z) : y = 0, z = 0\}$; and
4. $H = \langle \nu, y \rangle + (1/2)\langle Ay, y \rangle + \langle z^u, \Omega(x) z^s \rangle + O_3(y, z)$,

where $\nu \in \mathbb{R}^n$ is a constant vector, $A$ is a constant $n \times n$ matrix, $\Omega$ is an $l \times l$ matrix such that $\Omega(x) + \Omega^T(x)$ is positive definite for any $x \bmod 2\pi$, the symbol $O_3$ denotes terms of order not less than 3, and $\langle a, b \rangle = \sum a_j b_j$.

If $\det A \ne 0$, the torus is called nondegenerate. If $\nu$ is Diophantine, that is, for some $\alpha, \beta > 0$ and any $0 \ne k \in \mathbb{Z}^n$

$$|\langle \nu, k \rangle| \ge \alpha |k|^{-\beta}$$

the torus $N$ is called Diophantine. The coordinates $(x, y, z)$ are called canonical for $N$.

Now suppose that the Hamiltonian $H$ depends smoothly on the parameter $\varepsilon$:

$$H = H_0 + \varepsilon H_1 + O(\varepsilon^2)$$

and for $\varepsilon = 0$ the system is Liouville integrable with the commuting first integrals $F_1, \ldots, F_m$:

$$\{F_j, F_k\} = 0, \qquad 1 \le j, k \le m$$

Let $M_0 = \{F_1 = \cdots = F_m = 0\} \subset M$ be their zero common level and let $N \subset M$ be an $n$-dimensional nondegenerate Diophantine hyperbolic torus. The torus $N$ generates the invariant Lagrangian asymptotic manifolds $\Lambda^{s,u} \subset M$. Suppose that the separatrices are doubled, that is, there is a Lagrangian manifold $\Lambda \subset \Lambda^s \cap \Lambda^u$.

Consider the perturbed Hamiltonian $H = H_0 + \varepsilon H_1 + O(\varepsilon^2)$. The torus $N$ as well as the asymptotic manifolds $\Lambda^{s,u}$ survive the perturbation. Let $N_\varepsilon$ be the corresponding hyperbolic torus in the perturbed system and $\Lambda^{s,u}_\varepsilon$ its asymptotic manifolds: $N_\varepsilon$ and $\Lambda^{s,u}_\varepsilon$ depend smoothly on $\varepsilon$ and $N_0 = N$, $\Lambda^{s,u}_0 = \Lambda^{s,u}$.

Let the function $\chi(x)$ satisfy the equation

$$\langle \nu, \partial\chi(x)/\partial x \rangle + H_1(x, 0, 0) = \frac{1}{(2\pi)^n} \int_{T^n} H_1(x, 0, 0)\, dx$$
This equation has a smooth solution unique up to an additive constant.

Consider a solution $\gamma(t)$ of the unperturbed Hamiltonian equations. Let $I_j$, $I_{j,l}$, $1 \le j, l \le m$ be the following quantities (Treschev 1994):

$$I_j = \lim_{T \to +\infty} \left( \int_{-T}^{T} \{F_j, H_1\}(\gamma(t))\, dt + \{F_j, \chi\}(\gamma(T)) - \{F_j, \chi\}(\gamma(-T)) \right)$$

$$I_{j,l} = \lim_{T \to +\infty} \left( \int_{-T}^{T} \{F_j, \{F_l, H_1\}\}(\gamma(t))\, dt + \{F_j, \{F_l, \chi\}\}(\gamma(T)) - \{F_j, \{F_l, \chi\}\}(\gamma(-T)) \right)$$

The numbers $I_j$, $I_{j,l}$ play the role of the first and second derivatives of the Poincaré integral at some point.

If any of the quantities $I_j$, $I_{j,l}$ does not vanish, the asymptotic manifolds $\Lambda^{s,u}$ split. Moreover, suppose that $I_1 = \cdots = I_m = 0$ and the rank of the matrix $(I_{j,l})$ equals $m - 1$. Then for small values of $\varepsilon$, the manifolds $\Lambda^s$ and $\Lambda^u$ intersect transversally on the energy level at points of the solution $\gamma_\varepsilon(t)$, where $\gamma_\varepsilon \to \gamma$ as $\varepsilon \to 0$.

Poincaré Integral in Multidimensional Case

Suppose that the Hamiltonian from the previous section equals

$$H(x, y, u, v, t, \varepsilon) = H_0(y, u, v) + \varepsilon H_1(x, y, u, v, t) + O(\varepsilon^2)$$

Here $x = (x_1, \ldots, x_n) \bmod 2\pi$, $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$, and $(u, v) \in \mathbb{R}^2$. The symplectic structure is $\omega = dy \wedge dx + dv \wedge du$.

We assume that in the unperturbed integrable system the variables separate:

$$H_0(y, u, v) = F(y) + f(u, v)$$

and the system with one degree of freedom and Hamiltonian $f$ has a hyperbolic equilibrium $(u, v) = 0$ with a homoclinic solution $\gamma(t)$. Any torus

$$N(y^0) = \{(x, y, u, v, t) : y = y^0,\ u = v = 0\}$$

is a hyperbolic torus of the unperturbed system with frequency vector

$$\begin{pmatrix} \nu(y^0) \\ 1 \end{pmatrix}, \qquad \nu(y) = \partial F/\partial y$$

Suppose that $N = N(0)$ is Diophantine and nondegenerate. Then in the perturbed system there is a hyperbolic torus $N_\varepsilon$, smooth in $\varepsilon$, with $N_0 = N$. Consider the Poincaré function

$$P(\xi, \tau) = \int_{-\infty}^{+\infty} \Big( H_1(\xi + \nu (t + \tau), 0, \gamma(t + \tau), t) - H_1(\xi + \nu (t + \tau), 0, 0, 0, t) \Big)\, dt$$

Obviously, $P(\xi, \tau)$ is $2\pi$-periodic in $\xi$ and $\tau$.

If $P$ is not identically constant, asymptotic surfaces of $N_\varepsilon$ split in the first approximation in $\varepsilon$. Nondegenerate critical points of $P$ correspond to transversal homoclinic solutions of the perturbed system.

Other results on the splitting of multidimensional asymptotic manifolds are presented in Arnol'd et al. (1988) and Lochak et al. (2003).

Exponentially Small Separatrix Splitting

If in the unperturbed (integrable) system there are no asymptotic manifolds, they can appear after a perturbation. Consider, for example, perturbation of a real-analytic Liouville integrable system near a simple resonance:

$$\dot x = \frac{\partial H}{\partial y}, \qquad \dot y = -\frac{\partial H}{\partial x}, \qquad x \in T^m,\ y \in D \subset \mathbb{R}^m$$

$$H(x, y, t, \varepsilon) = H_0(y) + \varepsilon H_1(x, y, t, \varepsilon)$$

As usual, we assume $2\pi$-periodicity in $t$. A simple resonance corresponds to a value of the action variable $y = y^0$ such that the frequency vector

$$\hat\nu = \begin{pmatrix} \nu^0 \\ 1 \end{pmatrix}, \qquad \nu^0 = \frac{\partial H_0}{\partial y}(y^0) \in \mathbb{R}^m$$

(here 1 is the frequency corresponding to the time variable) admits only one resonance. More precisely, there exists a nonzero $\hat k \in \mathbb{Z}^{m+1}$ satisfying $\langle \hat k, \hat\nu \rangle = 0$, and any $k \in \mathbb{Z}^{m+1}$ such that $\langle k, \hat\nu \rangle = 0$ is collinear with $\hat k$.

Without loss of generality, we can assume that $y^0 = 0$ and $\nu^0 = \begin{pmatrix} 0 \\ \tilde\nu \end{pmatrix}$, $\tilde\nu \in \mathbb{R}^{m-1}$. Then the vector $\nu = \begin{pmatrix} \tilde\nu \\ 1 \end{pmatrix} \in \mathbb{R}^m$ is nonresonant.

In a $\sqrt{\varepsilon}$-neighborhood of the resonance we have a system with fast variables $X = (x_2, \ldots, x_m, t) \bmod 2\pi$ and slow variables $Y = (x_1, \varepsilon^{-1/2} y_1, \ldots, \varepsilon^{-1/2} y_m)$:

$$\dot Y = O(\sqrt{\varepsilon}), \qquad \dot X = \nu + O(\sqrt{\varepsilon}) \qquad [4]$$

If the frequency vector $\nu$ is Diophantine, by using the Neishtadt averaging procedure, we can reduce the dependence of the right-hand sides of eqns [4] on
the fast variables to exponentially small in $\varepsilon$ terms. This means that there exist new symplectic variables

$$P = Y + O(\sqrt{\varepsilon}), \qquad Q = X + O(\sqrt{\varepsilon})$$

(new time coincides with the old one) such that system [4] takes the form

$$\dot P = \sqrt{\varepsilon}\, F(P, \sqrt{\varepsilon}) + O(\exp(-a\varepsilon^{-b}))$$
$$\dot Q = \nu + \sqrt{\varepsilon}\, G(P, \sqrt{\varepsilon}) + O(\exp(-a\varepsilon^{-b}))$$

with positive constants $a$, $b$.

If we neglect the exponentially small remainders, the system turns out to be integrable. Generically, it has a family of hyperbolic $m$-tori of the form $\{(P, Q) : P = \text{const.}\}$ with doubled asymptotic manifolds. However, the terms $O(\exp(-a\varepsilon^{-b}))$ generically cannot be removed completely. They produce an exponentially small splitting of the asymptotic manifolds. This splitting implies nonintegrability, chaotic behavior, Arnol'd diffusion, and other dynamical effects.

It is important to note that exponentially small splitting appears only in the analytic case. In smooth systems the splitting is much stronger.

Unfortunately, at present there are no quantitative methods for studying such splittings except obvious upper estimates and the case of two-dimensional systems.

Exponentially Small Splitting in Two-Dimensional Systems

The main results on exponentially small separatrix splitting were obtained by Lasutkin and his students (Gelfreich and others). Another effective approach was proposed by Treschev. There are no general theorems in this situation; however, many examples were studied. We discuss the splitting in the pendulum with rapidly oscillating suspension point. The Hamiltonian of the system has the form

$$H = \tfrac{1}{2} y^2 + (1 + 2b \cos(t/\varepsilon)) \cos x$$

(cf. [3]). For any value of $\varepsilon$ the circle $\{(x, y, t) : x = y = 0\}$ is a periodic trajectory. For small $\varepsilon > 0$ the trajectory is hyperbolic.

The Poincaré integral can be formally written in this system. It predicts the area of lobes $16\pi b\, \varepsilon^{-1} e^{-\pi/(2\varepsilon)}$. However, there is no reason to expect that this asymptotics of the splitting is correct. Indeed, its value is exponentially small in $\varepsilon$, while the error of the Poincaré–Melnikov method is in general quadratic in the perturbation. To obtain correct asymptotics of the separatrix splitting, one has to study singularities of the solutions with respect to complex time. The area of lobes in this system equals (Treschev 1997)

$$A_L = 4\pi b\, f(b, \varepsilon)\, \varepsilon^{-1} e^{-\pi/(2\varepsilon)}$$

Here $f(b, \varepsilon)$, $\varepsilon \ge 0$, is a smooth function. The function $f(b, 0)$ is even and entire. It can be computed numerically as a solution of a problem which does not contain $\varepsilon$. The value $f(0, 0) = 4$ corresponds to the Poincaré integral, but the function $f(b, 0)$ is not constant. It is possible to prove that $f$ can be expanded in a power series in $\varepsilon$. Apparently, this series diverges for any $b \ne 0$.

Separatrix Splitting and Dynamics

1. Separatrix splitting can be regarded as an obstacle to the integrability of the perturbed system. However, this statement needs some comments. Doubled asymptotic surfaces in an integrable Hamiltonian system can have self-intersections. In the case of equilibrium, such intersections can even be transversal. In the literature, there is no general result saying that separatrix splitting implies nonintegrability. Some particular cases (studied by Kozlov, Ziglin, Bolotin, and others) are presented in Arnol'd et al. (1988). For example, in the two-dimensional case, this is seen to be true.

2. The conceptual reason for the nonintegrability discussed in the previous item is the complicated dynamics near the split separatrices. In many situations, it is possible to find in this domain a Smale horseshoe. This implies positive topological entropy, existence of nontrivial hyperbolic sets, symbolic dynamics, etc.

3. Consider a near-integrable area-preserving two-dimensional map. In the perturbed system, in the vicinity of the split separatrices of a hyperbolic fixed point $z_\varepsilon$, the so-called stochastic layer is formed. Here we mean the domain bounded by invariant curves closest to the separatrices. An important quantity, describing the rate of chaos, is the area of the stochastic layer $A_{SL}(\varepsilon)$. It turns out (Treschev 1998b) that $A_{SL}(\varepsilon)$ is connected with the area of the largest lobe $A_L(\varepsilon)$ by the simple formula

$$c_1 A_{SL}(\varepsilon) < \frac{A_L(\varepsilon)\, |\log A_L(\varepsilon)|}{\log^2 \mu} < c_2 A_{SL}(\varepsilon)$$

with some constants $c_1, c_2 > 0$, where $\mu$ is the largest multiplier (Lyapunov exponent) of the fixed point $z_0$.

4. Let $\hat z$ be a hyperbolic fixed point of an area-preserving two-dimensional map. The point $\hat z$
540 Several Complex Variables: Basic Geometric Theory
divides the corresponding separatrices $\Lambda^{s,u}$ in 4 branches $\Lambda^s_{1,2}$ and $\Lambda^u_{1,2}$. Suppose that the pair of branches $\Lambda^s_j$ and $\Lambda^u_l$ satisfies the following conditions:

$\Lambda^s_j$ and $\Lambda^u_l$ lie in a compact invariant domain;
$\Lambda^s_j$ and $\Lambda^u_l$ do not coincide and intersect at a homoclinic point.

Then the closures $\bar\Lambda^s_j$, $\bar\Lambda^u_l$ are compact invariant sets. Very little is known about these sets. For example, it is not known if their measure is positive. However, by using the Poincaré recurrence theorem, it is possible to prove (Treschev 1998a) that $\bar\Lambda^s_j = \bar\Lambda^u_l$.

See also: Averaging Methods; Billiards in Bounded Convex Domains; Hamiltonian Systems: Obstructions to Integrability; Hamiltonian Systems: Stability and Instability Theory.

Further Reading

Arnol'd V, Kozlov V, and Neishtadt A (1988) Mathematical aspects of classical and celestial mechanics. In: Encyclopaedia of Mathematical Sciences, vol. 3. Berlin: Springer.
Lochak P, Marco J-P, and Sauzin D (2003) On the splitting of invariant manifolds in multidimensional near-integrable Hamiltonian systems. Memoirs of the American Mathematical Society 163(775): viii+145.
Melnikov V (1963) On the stability of the center for time-periodic perturbations. Trudy Moskovskogo Matem. Obschestva 12: 3–52 (Russian). (English transl.: Trans. Moscow Math. Soc. 1963, pp. 1–56 (1965).)
Poincaré H (1987) Les méthodes nouvelles de la Mécanique Céleste, vols. 1–3. Paris: Gauthier–Villars (Original publication: 1892, 1893, 1899). New Printing: Librairie Scientifique et Technique Albert Blanchard 9, Rue Medecin 75006, Paris.
Treschev D (1994) Hyperbolic tori and asymptotic surfaces in Hamiltonian systems. Russian Journal of Mathematical Physics 2(1): 93–110.
Treschev D (1997) Separatrix splitting for a pendulum with rapidly oscillating suspension point. Russian Journal of Mathematical Physics 5(1): 63–98.
Treschev D (1998a) Closures of asymptotic curves in a two-dimensional symplectic map. J. Dynam. Control Systems 4(3): 305–314.
Treschev D (1998b) Width of stochastic layers in near-integrable two-dimensional symplectic maps. Physica D 116(1–2): 21–43.
Wiggins S (1988) Global Bifurcations and Chaos. Analytical Methods. Applied Mathematical Sciences, vol. 73, 494pp. New York: Springer.
Levi Theorem and the Levi Problem

Consider a smooth (local) real hypersurface containing $0 \in \mathbb{C}^n$ with $n > 1$. It is the zero-set

Bounded Domains and Their Automorphisms

The unit disk in the complex plane is particularly important, because, with the exceptions of projective
space $P_1(\mathbb{C})$, the complex plane $\mathbb{C}$, the punctured plane $\mathbb{C} \setminus \{0\}$, and compact complex tori, it is the universal cover of every (connected) one-dimensional complex manifold.

In higher dimensions it should first be underlined that, without some further condition, there is no best bounded domain in $\mathbb{C}^n$. For example, two randomly chosen small perturbations of the unit ball $B_2 := \{(z, w);\ |z|^2 + |w|^2 < 1\}$, with, for example, real analytic boundary, are not biholomorphically equivalent.

On the other hand, the following theorem of H. Cartan shows that bounded domains $D$ are good candidates for covering spaces:

Theorem Equipped with the compact open topology, the group $\mathrm{Aut}(D)$ of holomorphic automorphisms of $D$ is a Lie group acting properly on $D$.

The notion of a proper group action of a topological group on a topological space is fundamental and should be underlined. It means that if $\{x_n\}$ is a convergent sequence in the space where the group is acting, then a sequence of group elements $\{g_n\}$, with the property that $\{g_n(x_n)\}$ is convergent, itself possesses a convergent subsequence. As a consequence, isotropy groups are compact and orbits are closed.

In the context of bounded domains $D$ this implies that if $\Gamma$ is a discrete subgroup of $\mathrm{Aut}(D)$, then $X = D/\Gamma$ carries a natural structure of a complex space. If in addition $\Gamma$ is acting freely, something that, with minor modifications, can be arranged, then $X$ is a complex manifold.

Many nontrivial compact complex manifolds arise as quotients $D/\Gamma$ of bounded domains. Even very concrete quotients, for example, where $D = B_2$, are extremely interesting. Conversely, if $\mathrm{Aut}(D)$ contains a discrete subgroup $\Gamma$ so that $D/\Gamma$ is compact, then $D$ is probably very special. For example, it is known to be holomorphically convex!

Any compact quotient $X = D/\Gamma$ of a bounded domain is projective algebraic in the sense that it can be realized as a complex (algebraic) submanifold of some complex projective space. In fact the embedding can be given by quite special $\Gamma$-invariant holomorphic tensors on $D$, and this in turn implies that $X$ is of general type (see below). For further details, in particular on Cartan's theorem on the automorphism group of a bounded domain, the reader is referred to Narasimhan (1971).

Stein Manifolds

The founding fathers of the first phase of "modern complex analysis" (Cartan, Oka, and Thullen) realized that domains of holomorphy form the basic class of spaces where it would be possible to solve the important problems of the subject concerning the existence of holomorphic or meromorphic functions with reasonably prescribed properties. In fact, Oka formulated a principle which more or less states that if a complex analytic problem which is well formulated on a domain of holomorphy has a continuous solution, then it should have a holomorphic solution. Given the flexibility of continuous functions and the rigidity of holomorphic functions, this would seem impossible but in fact is true!

Beginning in the late 1930s, Stein worked on problems related to this Oka principle, in particular on those related to what we would now call the algebraic topological aspects of the subject, and he was led to formulate conditions on a general complex manifold $X$ which should hold if problems of the above type are to be solved. First, his axiom of holomorphic convexity was simply that, given a divergent sequence $\{x_n\}$ in $X$, there should be a function $f \in \mathcal{O}(X)$ such that $\{f(x_n)\}$ is unbounded. Secondly, holomorphic functions should separate points in the sense that, given distinct points $x_1, x_2 \in X$, there exists $f \in \mathcal{O}(X)$ with $f(x_1) \ne f(x_2)$. Finally, globally defined holomorphic functions should give local coordinates. Assuming that $X$ is $n$-dimensional, this means that, given a point $x \in X$, there exist $f_1, \ldots, f_n \in \mathcal{O}(X)$ such that $df_1(x) \wedge \cdots \wedge df_n(x) \ne 0$.

Assuming Stein's axioms, Cartan and Serre then produced a powerful theory in the context of sheaf cohomology which proved certain vanishing theorems that led to the desired existence theorems. This theory and typical applications are sketched below. Before going into this, we would like to mention that Grauert's version of the Cartan–Serre theory requires only very weak versions of Stein's axioms: (1) The connected component containing $K$ of the holomorphic convex hull $\hat K$ of every compact set should be compact. (2) Given $x \in X$, there are functions $f_1, \ldots, f_m \in \mathcal{O}(X)$ so that $x$ is an isolated point in the fiber of the map $F := (f_1, \ldots, f_m) : X \to \mathbb{C}^m$. Of course the results also hold for complex spaces.

Holomorphically convex domains in $\mathbb{C}^n$ are Stein manifolds, and since closed complex submanifolds of Stein manifolds are Stein, it follows that any closed complex submanifold of $\mathbb{C}^n$ is Stein. In particular, affine varieties are Stein spaces. Remmert's theorem states the converse: an $n$-dimensional Stein manifold can be embedded as a closed complex submanifold of $\mathbb{C}^{2n+1}$. A nontrivial result of Behnke and Stein implies that every noncompact Riemann surface is Stein.
$\mathcal{O}$-modules. Conversely, by taking bases of a locally free sheaf $S$ on the open sets where it is isomorphic to a direct sum $\mathcal{O}^r$, one builds an associated holomorphic vector bundle $E$ so that $E = S$.

It is not possible to restrict our attention to these locally free sheaves or equivalently to holomorphic vector bundles. One important reason is that images of holomorphic vector bundle maps are not necessarily vector bundles. A related reason is that the sheaf of ideals of holomorphic functions which vanish on a given analytic set $A$ is not always a vector bundle. This is caused by the presence of singularities in $A$. There are many other reasons, but these should suffice for this sketch.

The sheaves $S$ that arise naturally in complex analysis are almost vector bundles. If $X$ is the base complex manifold or complex space under consideration, then $S$ will come from a vector bundle on some big open subset $X_0$ whose boundary is an analytic set $X_1$, and then on the irreducible components of $X_1$ it will come from vector bundles on such big open sets, etc. These sheaves are called coherent analytic sheaves of $\mathcal{O}_X$-modules. The correct algebraic definition is that locally there exists an exact sequence

other words, for $U$ open in $A$ the space $\mathcal{O}_A(U)$ should be regarded as the space of holomorphic functions on $U$.

Now, $\mathcal{I}$ is a coherent sheaf on $X$ and therefore by Theorem B the cohomology group $H^1(X, \mathcal{I})$ vanishes. Consequently, the associated long exact sequence in cohomology implies that the restriction mapping $\mathcal{O}_X(X) \to \mathcal{O}_A(A)$ is surjective. This special case of Theorem A means that every (global!) holomorphic function on $A$ is the restriction of a holomorphic function on $X$.

Example Let us consider the multiplicative (second) Cousin problem. In this case meromorphic functions $m_i$ are given on the open subsets $U_i$ of a covering $\mathcal{U}$ with the property that $m_i = f_{ij} m_j$, where $f_{ij}$ is holomorphic and nowhere vanishing on the overlap $U_{ij}$. This is a distribution $D$ of the zero and polar parts of meromorphic functions, which in complex geometry is called a divisor, and the interesting question is whether or not there exists a globally defined meromorphic function which has $D$ as its divisor.

Now we note that $GL_1(\mathbb{C}) = \mathbb{C}^*$ and thus $f_{ij} : U_{ij} \to \mathbb{C}^*$ defines a line bundle $L$ on $X$ and we regard it as an element of the space $H^1(X, \mathcal{O}^*)$ of equivalence classes of line bundles on $X$. Here $\mathcal{O}^*$
A slightly refined statement from that above is the fact that on a Stein manifold the space of topological line bundles is the same as the space of holomorphic line bundles. In the case of (higher rank) vector bundles this is a deep and important theorem of Grauert. It can be formulated as follows.

Grauert's Oka principle On a Stein space the map $F : \mathrm{Vect}_{holo}(X) \to \mathrm{Vect}_{top}(X)$ from the space of holomorphic vector bundles to the space of topological vector bundles which forgets the complex structure is bijective.

In closing this section, a few words concerning the proofs of the major theorems, for example, Theorem B, should be mentioned. In all cases one must solve something like an additive Cousin problem and one first does this on special relatively compact subsets. For this step there are at least two different ways to proceed. One is to delicately piece together solutions which are known to exist on very special polyhedral-type domains or build up from lower-dimensional pieces of such.

Another method is to solve certain systems of PDEs on relatively compact domains where control at the boundary is given by the positivity of the Levi-form. An example of how such PDEs occur can already be seen at the level of the above Cousin I problem. At the point where we have solved it topologically, that is, the holomorphic cocycle $\{f_{ij}\}$ is a coboundary $f_{ij} = f_j - f_i$ of smooth functions, we observe that since $\bar\partial f_{ij} = 0$, it follows that $\alpha = \bar\partial f_i$ is a globally defined $(0, 1)$-form. It is $\bar\partial$-closed, that is, the compatibility condition for solving the system $\bar\partial u = \alpha$ is fulfilled. If this system can be solved, then we use the solution $u$ to adjust the topological solutions of the Cousin problem by replacing $f_i$ by $f_i - u$. We still have $f_{ij} = f_j - f_i$, but now the $f_i$ are holomorphic on $U_i$.

To obtain the global solution to a Cousin-type problem, one exhausts the Stein space by the special relatively compact subsets $U_n$ where, by one method or another, we have solved the problem with solutions $s_n$. One would like to say that the $s_n$ converge to a global solution $s$. However, there is no way to a priori guarantee this without making some sort of estimates. One main way of handling this problem is to adjust the solutions as $n \to \infty$ by an approximation procedure. For this one needs to know that holomorphic objects, for example, functions on $U_n$, can be approximated on $U_n$ by objects of the same type which are defined on the bigger set $U_{n+1}$. This Runge-type theorem, which is a nontrivial ingredient in the whole theory, requires the introduction of an appropriate Fréchet structure on the spaces of sections of a coherent sheaf. This is in itself a point that needs some attention.

Montel's Theorem and Fredholm Mappings

If $U$ is an open subset of a complex space $X$, then $\mathcal{O}(U)$ has the Fréchet topology of convergence on compact subsets $K$ defined by the seminorms $|\cdot|_K$. Using resolutions of type (1) above, one shows that the space of sections $S(U)$ of every coherent sheaf $S$ also possesses a canonical Fréchet topology. This is then extended to the spaces $C^q(\mathcal{U}, S)$, and consequently one is able to equip the cohomology spaces $H^q(X, S)$ with (often non-Hausdorff) quotient topology.

Elements of such cohomology groups can be regarded as obstructions to solving complex analytic problems. One often expects such obstructions, and is satisfied whenever it can be shown that there are only finitely many, that is, a finiteness theorem of the type $\dim H^q(X, S) < \infty$ is desirable. Here we sketch two finiteness theorems which hold in seemingly different contexts, but their proofs are based on one principle: use the compactness guaranteed by Montel's theorem as the necessary input for the Fredholm theorem in the context of Fréchet spaces.

Recall that a continuous linear map $T : E \to F$ between topological vector spaces is said to be compact if there is an open neighborhood $U$ of $0 \in E$ such that $T(U)$ is relatively compact in $F$. If $Y$ is a relatively compact open subset of a complex space $X$, then Montel's theorem states that the restriction map $r^X_Y : \mathcal{O}(X) \to \mathcal{O}(Y)$ is compact. This can be extended to coherent sheaves, and using the Fredholm theorem for certain natural restriction and boundary maps, one proves the following fundamental fact.

Lemma 1 If the restriction map $r^X_Y : H^q(X, S) \to H^q(Y, S)$ is surjective, then $H^q(Y, S)$ is finite dimensional.

Since the methods for the proof are basic in complex analysis, we outline it here. Take a covering $\mathcal{U}$ of $X$ such that $H^q(\mathcal{U}, S) = H^q(X, S)$. Then intersect its elements with $Y$ to obtain a covering $\tilde{\mathcal{U}}$ of $Y$. Finally, refine that covering with refinement mapping $\tau$ to a covering $\mathcal{V}$ of $Y$ such that $H^q(\mathcal{V}, S) = H^q(Y, S)$ and so that $U_i$ contains $V_{\tau(i)}$ as a relatively compact subset for all $i$. Let $Z^q(\mathcal{U}, S)$ denote the kernel of the boundary map for the covering $\mathcal{U}$, and consider the map $Z^q(\mathcal{U}, S) \oplus C^{q-1}(\mathcal{V}, S) \to C^q(\mathcal{V}, S)$ which is the direct sum $\alpha \oplus \delta$ of the restriction and boundary maps. By assumption it is surjective. Since $\delta$ is the difference of this map and the compact map $\alpha$, L Schwartz's version of the Fredholm theorem for Fréchet spaces implies that its image is of finite
codimension, that is, $H^q(Y, S) = H^q(V, S)$ is finite dimensional.

Applying this Lemma in the case of compact spaces where X = Y, one has the following theorem
of Cartan and Serre:

Theorem If X is a compact complex space and S is a coherent sheaf on X, then
$\dim H^q(X, S) < \infty$ for all q.

Grauert made use of this technique in solving the Levi problem for a strongly pseudoconvex
relatively compact domain D with smooth boundary in a complex manifold X. Here strongly
pseudoconvex means that the restriction of the Levi form to the complex tangent space of every
boundary point is positive definite. To do this he sequentially made “bumps” at boundary points
to obtain a finite sequence of domains $D = D_0 \subset D_1 \subset \cdots \subset D_m$ in
such a way that the restriction mappings at the level of qth cohomology, $q \ge 1$, are all
surjective and such that at the last step D is relatively compact in $D_m$. Applying the above
Lemma, $\dim H^q(D, S) < \infty$. Using another bumping procedure, it then follows that D is
holomorphically convex and, in fact, that D is almost Stein.

This last statement means that one can guarantee that $\mathcal{O}(D)$ separates points outside
of some compact subset which could contain compact subvarieties on which the global holomorphic
functions are constant. In this situation one can apply Remmert’s reduction theorem which
implies that there is a canonically defined proper surjective holomorphic map $\pi : D \to Z$
to a Stein space which is biholomorphic outside of finitely many fibers. One says that, in order
to obtain the Stein space Z, finitely many compact analytic subsets must be blown down to
points.

The above mentioned reduction theorem is a general result which applies to any holomorphically
convex complex space X. For this one observes that if X is holomorphically convex, then for
$x \in X$ the level set $L(x) := \{y \in X;\ f(y) = f(x) \text{ for all } f \in \mathcal{O}(X)\}$
is a compact analytic subset of X. One then defines an equivalence relation: $x \sim y$ if and
only if the connected component of L(x) containing x and that of L(y) which contains y are the
same. One then equips $X/\!\sim$ with the quotient topology and proves that the canonical
quotient $\pi : X \to X/\!\sim\ =: Z$ is proper. Finally, for U open in Z one defines
$\mathcal{O}_Z(U) = \mathcal{O}_X(\pi^{-1}(U))$ and proves that, equipped with this structure,
Z is a Stein space. This Remmert reduction is universal with respect to holomorphic maps to
holomorphically separable complex spaces, that is, if $\varphi : X \to Y$ and
$\mathcal{O}_Y(Y)$ separates the points of Y, then there exists a uniquely defined holomorphic
map $Z \to Y$ whose composition with $\pi$ is $\varphi$. It should be noted that, even if the
original space X is a complex manifold, the associated Stein space Z may be singular. This
reflects the fact that it is difficult to avoid singularities in complex geometry.

Mapping Theory

Above we have attempted to make it clear that holomorphic maps play a central role in complex
geometry. It is even important to regard a holomorphic function as a map. Here we outline the
basic background necessary for dealing with maps and then state three basic theorems which
involve proper holomorphic mappings.

Basic Facts

A holomorphic map $F : X \to Y$ between (reduced) complex spaces is a continuous map which can
be represented locally as a holomorphic map between analytic subsets of the spaces in which X
and Y are locally embedded. In other words, F is the restriction of a map
$F = (f_1, \ldots, f_m)$ which is defined by holomorphic functions.

If X is irreducible and X and Y are one-dimensional, then a nonconstant holomorphic map
$F : X \to Y$ is an open mapping. This statement is far from being true in the
higher-dimensional setting. The reader need only consider the example
$F : \mathbb{C}^2 \to \mathbb{C}^2$, $(z, w) \mapsto (zw, z)$.

Despite the fact that holomorphic maps can be quite complicated, they have properties that in
certain respects render them tenable. Let us sketch these in the case where X is irreducible.
First, one notes that every fiber $F^{-1}(y)$ is a closed analytic subset of X. One defines
$\mathrm{rank}_x F$ to be the codimension at x of the fiber $F^{-1}(F(x))$ at x. Then
$\mathrm{rank}\, F := \max\{\mathrm{rank}_x F;\ x \in X\}$. It then can be shown that
$\{x \in X;\ \mathrm{rank}_x F \le k\}$ is a closed analytic subset of X for every k. Applying
this for $k = \mathrm{rank}\, F - 1$ we see that, outside a proper closed analytic subset, F has
constant maximal rank.

If $F : X \to Y$ has constant rank k in a neighborhood of some point $x \in X$, then one can
choose neighborhoods U of x in X and V of F(x) in Y so that $F|U$ maps U onto a closed analytic
subset of V. By restricting F to the sets where it has lower rank and applying this local-image
theorem, it follows that the local images of the set where F has lower rank are at least two
dimensions smaller than those of top rank. Conversely, the fiber dimension
$d_F(x) := \dim_x F^{-1}(F(x))$ is semicontinuous in the sense that $d_F(x) \ge d_F(z)$ for all
z near x. Finally, we note that if Y is m-dimensional, then $F : X \to Y$ is an open map if and
only if it is of constant rank m.
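As a sketch of how these notions behave on the example $F(z, w) = (zw, z)$ mentioned above, one can compute the differential directly. The symbol $DF$ for the Jacobian matrix and the target coordinates $(a, b)$ are assumptions introduced here for illustration; they do not appear in the text.

```latex
\[
DF(z,w) = \begin{pmatrix} w & z \\ 1 & 0 \end{pmatrix},
\qquad \det DF(z,w) = -z .
\]
% Off the line {z = 0} every fiber is a single point:
% F(z,w) = (a,b) with b \neq 0 forces z = b and w = a/b.
% On the line {z = 0} the fiber through the point is the whole line.
\[
\operatorname{rank}_{(z,w)} F =
\begin{cases}
2, & z \neq 0,\\
1, & z = 0,
\end{cases}
\qquad
F^{-1}(0,0) = \{ z = 0 \},
\qquad
F(\mathbb{C}^2) = \{ (a,b) ;\ b \neq 0 \} \cup \{ (0,0) \} .
\]
```

The image contains $(0,0)$ but no other point of the line $\{b = 0\}$, so it is not open, in line with the criterion that an open map to a two-dimensional target must have constant rank 2. The low-rank set $\{z = 0\}$ is mapped to a single point, which is two dimensions below the top-rank image.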
Given another basic theorem of complex analysis, the reader can imagine how this might be
proved. This is the continuation theorem for analytic sets due to Remmert and Stein:

If X is a complex space and Y is a closed analytic subset with $\dim_y Y \le k$ for all
$y \in Y$ and Z is a closed analytic subset of the complement $X \setminus Y$ with
$\dim_z Z \ge k + 1$ at all $z \in Z$, then the topological closure cl(Z) of Z in X is a closed
analytic subset of X with $E = \mathrm{cl}(Z) \setminus Z = \mathrm{cl}(Z) \cap Y$ a proper
analytic subset of cl(Z).

Similar results hold for more general complex analytic objects. For example, closed positive
currents with (locally) finite volume can be continued across any proper analytic subset
(Skoda 1982). A sketch of the proof of the proper mapping theorem (for X irreducible) goes as
follows. From the assumption that F is proper, the image F(X) is closed. If F has constant rank
k, then, by the local result stated above, its image is everywhere locally a k-dimensional
analytic set. Since the image is closed, the desired result follows. If rank F = k and
$E := \{x \in X;\ \mathrm{rank}_x F < k\} \neq \emptyset$, then by induction F(E) is a closed
analytic subset of dimension at most $k - 2$. Let $A := F^{-1}(F(E))$ and apply the previous
discussion for constant rank maps to $F|(X \setminus A) : X \setminus A \to Y \setminus F(E)$.
The image is a closed k-dimensional analytic subset of $Y \setminus F(E)$ and its Remmert–Stein
extension is the full image F(X).

In this framework the Stein factorization theorem is an important tool. Here $F : X \to Y$ is
again a proper holomorphic map which we may now assume to be surjective. Analogous to the
construction of the reduction of a holomorphically convex space, one says that two points in X
are equivalent if they are in the same connected component of an F-fiber. This is indeed an
equivalence relation, and the quotient $Z := X/\!\sim$ is a complex space equipped with the
direct image sheaf. Thus one decomposes F

Complex Analysis and Algebraic Geometry

The interplay between these subjects has motivated research and produced deep results on both
sides. Here we indicate just a few results of the type which show that objects which are a
priori of an analytic nature are in fact algebraic geometric.

Projective Varieties

Let us begin with the algebraic geometric side of the picture where we consider algebraic
subvarieties X of projective space $\mathbb{P}_n(\mathbb{C})$. If $[z_0 : z_1 : \cdots : z_n]$
are homogeneous coordinates of $\mathbb{P}_n$, such a variety is the simultaneous zero-set,
$X := V(P_1, \ldots, P_m)$, of finitely many (holomorphic) homogeneous polynomials
$P_i = P_i(z_0, \ldots, z_n)$. Chow’s theorem states that in this context there are no further
analytic phenomena:

Theorem Closed complex analytic subsets of projective space $\mathbb{P}_n(\mathbb{C})$ are
algebraic subvarieties.

This observation has numerous consequences. For example, if $F : X \to Y$ is a holomorphic map
between algebraic varieties, then, by applying Chow’s theorem to its graph, it follows that F is
algebraic.

Chow’s theorem can be proved via an application of the Remmert–Stein theorem in a very simple
situation. For this, let $\pi : \mathbb{C}^{n+1} \setminus \{0\} \to \mathbb{P}_n(\mathbb{C})$
be the standard projection, and let $Z := \pi^{-1}(X)$. Since Z is positive dimensional, by the
Remmert–Stein theorem it can be extended to an analytic subset of $\mathbb{C}^{n+1}$. The
resulting subvariety K(X) (the cone over X) is invariant by the $\mathbb{C}^*$-action which is
defined by $v \mapsto \lambda v$ for $\lambda \in \mathbb{C}^*$. If f is a holomorphic function
on $\mathbb{C}^{n+1}$ which vanishes on K(X), then we develop it in homogeneous polynomials
$f = \sum P_d$ and note that $\lambda(f)(z) = f(\lambda z) = \sum \lambda^d P_d(z)$ also
vanishes on K(X) for all $\lambda$. Hence, all $P_d$ vanish on K(X) and therefore the ideal of
holomorphic functions which vanish on K(X)
is generated by the homogeneous polynomials which vanish on K(X) and consequently finitely many
of these define X as a subvariety of $\mathbb{P}_n(\mathbb{C})$.

Complements of subvarieties in projective varieties occur in numerous applications and are
important objects in complex geometry. Even complements $\mathbb{P}_n \setminus Y$ of
subvarieties Y in the full projective space are not well understood. If Y is the intersection of
a compact projective variety X with a projective hyperplane, that is, Y is a hyperplane section,
then $X \setminus Y$ is affine. If Y is q-codimensional in X, then $X \setminus Y$ possesses a
certain degree of Levi convexity and general theorems of Andreotti and Grauert (1962) on the
finiteness and vanishing of cohomology indeed apply. However, not nearly as much is understood
in this case as in the case of a hyperplane section.

Kodaira Embedding Theorem

Given that analytic subvarieties of projective space are algebraic, one would like to understand
whether a given compact complex manifold or complex space can be realized as such a subvariety.
Kodaira’s theorem is a prototype of such an embedding theorem. Most often one formulates
projective embedding theorems in the language of bundles. For this, observe that if $L \to X$
is a holomorphic line bundle over a compact complex manifold, then its space $\Gamma(X, L)$ of
holomorphic sections is a finite-dimensional vector space V. The zero-set of a section
$s \in V$ is a one-codimensional subvariety of X.

Let us restrict our attention to bundles which are generated by their sections which for line
bundles simply means that for every $x \in X$ there is some section $s \in V$ with
$s(x) \neq 0$. It then follows that for every $x \in X$ the space
$H_x := \{s \in V;\ s(x) = 0\}$ is a one-codimensional vector subspace of V. Thus L defines a
holomorphic map $\varphi_L : X \to \mathbb{P}(V^*)$, $x \mapsto H_x$. Note that we must go to
the projective space $\mathbb{P}(V^*)$, because a linear function defining such an $H_x$ is only
unique up to a complex multiple.

Projective embedding theorems state that under certain conditions on L the map $\varphi_L$ is a
holomorphic embedding, that is, it is injective and is everywhere of maximal rank in the
analytic sense that its differential has maximal rank. Here we outline a complex analytic
approach of Grauert for proving embedding theorems. It makes strong use of the complex geometry
of bundle spaces.

Let $L \to X$ be a holomorphic line bundle over a compact complex manifold. A Hermitian bundle
metric is a smoothly varying metric h in the fibers of L. This defines a norm function
$v \mapsto |v|^2 := h(v, v)$ on the bundle space L. One says that L is positive if the tubular
neighborhood $T := \{v \in L;\ |v|^2 < 1\}$ is strongly pseudoconcave, that is, when regarded
from outside of T, its boundary is strongly pseudoconvex.

To prove an embedding theorem, one must produce sections with prescribed properties. Sections of
powers $L^k$ are closely related to holomorphic functions on the dual bundle space $L^*$. This is
due to the fact that if $\pi : L \to X$ is the bundle projection,
$\pi^{-1}(U_\alpha) \cong U_\alpha \times \mathbb{C}$ is a local trivialization, and
$z_\alpha$ is a fiber coordinate, then a holomorphic function f on $L^*$ has a Taylor series
development

$$ f(v) = \sum_n s_\alpha(n)(\pi(v))\, z_\alpha^n(v) $$

The function f is well defined on L. Hence, the transformation law for the $z_\alpha^n$ must be
canceled out by a transformation law for the coefficient functions $s_\alpha(n)$. This implies
that the $s_\alpha(n)$ are sections of $L^n$. Hence, proving the existence of sections in the
powers of L with prescribed properties amounts to the same thing as proving the existence of
holomorphic functions on $L^*$ with analogous properties.

The positivity assumption on L is equivalent to assuming that the tubular neighborhoods of the
zero-section in $L^*$ defined by the norm function associated to the dual metric are strongly
pseudoconvex. The solution to the Levi problem, which was sketched above, then shows that $L^*$
is holomorphically convex, and its Remmert reduction is achieved by simply blowing down its
zero-section. In other words, $L^*$ is essentially a Stein manifold, and using Stein theory, it
is possible to produce enough holomorphic functions on L to show that some power $L^k$ defines a
holomorphic embedding $\varphi_{L^k} : X \to \mathbb{P}(\Gamma(X, L^k)^*)$. Bundles with this
property are said to be ample, and thus we have outlined the following fact: “a line bundle
which is Grauert-positive is ample.”

It should be underlined that we defined the Chern class of L as the image in
$H^2(X, \mathbb{Z})$ of its equivalence class in $H^1(X, \mathcal{O}^*)$, that is, in this
formulation the Chern class is a Čech cohomology class. It is, however, often more useful to
consider it as a de Rham class where it lies in the (1, 1)-part of
$H^2_{\mathrm{deR}}(X, \mathbb{C})$. If h is a bundle metric as above, then the Levi form of the
norm function is a representative $c_1(L, h)$ of the Chern class of $L^*$. Thus $c_1(L, h)$ is
an integral (1, 1)-form which represents $c_1(L)$. It is called the Chern form of L associated
to the metric h. The following is Kodaira’s formulation of his embedding theorem:

Theorem A line bundle L is ample if and only if it possesses a metric h so that $c_1(L, h)$ is
positive definite.

Kodaira’s proof of this fact follows from his vanishing theorem (see Several Complex Variables:
Compact Manifolds) in the same way the example of Theorem A was derived from Theorem B in the
first example in the subsection “Selected theorems.” That an ample bundle is positive follows
immediately from the fact that if $\varphi_{L^k}$ is an embedding, then its pullback of the
(positive) hyperplane bundle on projective space agrees with $L^k$.

Finally, one asks the question “under what natural conditions can one construct a bundle L which
is positive?” The following is an example of an answer which is related to geometric
quantization.

Suppose that X is a compact complex manifold equipped with a symplectic structure $\omega$,
that is, $\omega$ is a d-closed, nondegenerate 2-form. One says that $\omega$ is Kählerian if it
is compatible with the complex structure J in the sense that
$\omega(Jv, Jw) = \omega(v, w)$ and $\omega(Jv, v) > 0$ for every v and w in every tangent
space of X. Note that if L is a positive line bundle, then it possesses a Hermitian metric h
such that $\omega = c_1(L, h)$ is a Kählerian structure on X.

It should be underlined that there are Kähler manifolds without positive bundles, for example,
every compact complex torus $T = \mathbb{C}^n/\Gamma$ possesses the Kählerian structure which
comes from the standard linear structure on $\mathbb{C}^n$. However, for n > 1 most such tori
are not projective algebraic and therefore do not have positive bundles.

If, on the other hand, the Kählerian structure is integral, a condition that is automatic for
the Chern form $c_1(L, h)$ of a bundle, then there is indeed a line bundle $L \to X$ equipped
with a Hermitian metric h such that $c_1(L, h) = \omega$. The condition of integrality can be
formulated in terms of the integrals of $\omega$ over homology classes being integral or that
its de Rham class is in the image of the de Rham isomorphism from the Čech cohomology
$H^2(X, \mathbb{Z}) \otimes \mathbb{C}$ to $H^2_{\mathrm{deR}}(X, \mathbb{C})$. Coupling this
with the embedding theorem for positive bundles, we have the following theorem of Kodaira:

Theorem If $(X, \omega)$ is Kählerian and $\omega$ is integral, then X is projective algebraic.

This result has been refined in the following important way (a conjecture of Grauert and
Riemenschneider proved with different methods by Siu (1984) and by Demailly (1985)): the same
result holds if $\omega$ is only assumed to be semipositive and positive in at least one point.

For Grauert’s proof of the Kodaira embedding theorem and a number of other important and
beautiful results, we recommend the original paper (Grauert 1962).

Quotients of Bounded Domains

Let D be a bounded domain in $\mathbb{C}^n$ and $\Gamma$ be a discrete subgroup of Aut(D) which
is acting freely on D with a compact quotient $X := D/\Gamma$. For $\gamma \in \Gamma$ let
$J(\gamma, z)$ be the determinant of the Jacobian $d\gamma/dz$ and, given a holomorphic function
f, consider (at least formally) the Poincaré series
$\sum_{\gamma \in \Gamma} f(\gamma(z))\, J(\gamma, z)^k$ of weight k. If f is bounded and
$k \ge 2$, then this series converges to a holomorphic function P(f) on D which satisfies the
transformation rule $P(f)(\gamma(z)) = J(\gamma, z)^{-k} P(f)(z)$. Now the differential volume
form $\Omega := dz_1 \wedge \cdots \wedge dz_n$ transforms in the opposite way (for k = 1).
Therefore $s(f) = P(f)(\Omega)^k$ is a $\Gamma$-invariant section of the kth power of the
determinant bundle $K := \Lambda^n T^*D$ of the holomorphic cotangent bundle of D. In other
words, $s(f) \in \Gamma(X, K^k)$. Since the choice of f may be varied to show that there are
sufficiently many sections to separate points and to guarantee the maximal rank condition, it
follows that the canonical bundle K of X is ample. Compact complex manifolds with ample
canonical bundle are examples of manifolds which are said to be of general type (see Several
Complex Variables: Compact Manifolds). Thus, this construction with Poincaré series proves the
following: “Every compact quotient $D/\Gamma$ of a bounded domain is of general type and is in
particular projective algebraic.”

See also: Gauge Theoretic Invariants of 4-Manifolds; Moduli Spaces: An Introduction; Riemann
Surfaces; Several Complex Variables: Compact Manifolds; Twistor Theory: Some Applications [in
Integrable Systems, Complex Geometry and String Theory].

Further Reading

Andreotti A and Grauert H (1962) Théorèmes de finitude pour la cohomologie des espaces complexes. Bulletin de la Société Mathématique de France 90: 193–259.
Demailly J-P (1985) Champs magnétiques et inégalités de Morse pour la d''-cohomologie. Annales de l’Institut Fourier 35: 189–229.
Demailly J-P, Complex analytic and algebraic geometry, http://www-fourier.ujf-grenoble.fr/demailly.
Grauert H (1962) Über Modifikationen und exzeptionelle analytische Mengen. Mathematische Annalen 146: 331–368.
Grauert H and Fritzsche K (2001) From Holomorphic Functions to Complex Manifolds. Heidelberg: Springer.
Griffiths PhA and Harris J (1978) Principles of Algebraic Geometry. New York: Wiley.
Grauert H and Remmert R (1979) Theory of Stein Spaces. Heidelberg: Springer.
Grauert H and Remmert R (1984) Coherent Analytic Sheaves. Heidelberg: Springer.
Grauert H, Peternell Th, and Remmert R (1994) Several Complex Variables VII. Encyclopedia of Mathematical Science, vol. 74. Heidelberg: Springer.
Narasimhan R (1971) Several Complex Variables. Chicago Lectures in Mathematics. Chicago, IL: University of Chicago Press.
Siu YT (1984) A vanishing theorem for semi-positive line bundles over non-Kähler manifolds. Journal of Differential Geometry 19: 431–452.
Skoda H (1982) Prolongement des courants positifs fermés de masse finie. Inventiones Mathematicae 66: 361–376.
Several Complex Variables: Compact Manifolds 551
A projective manifold is a compact manifold which is a submanifold of some projective space
$\mathbb{P}_N$. Of course, a projective manifold can be embedded into projective spaces in many
ways. According to Chow’s theorem (see Several Complex Variables: Basic Geometric Theory),
$X \subset \mathbb{P}_N$ is automatically given by polynomial equations and is therefore an
algebraic variety. This is part of Serre’s GAGA principle which roughly says that all global
analytic objects on a projective manifold, for example, vector bundles or coherent sheaves and
their cohomology are automatically algebraic. A compact manifold which is bimeromorphically
equivalent to a projective manifold is called a Moishezon manifold. These arise naturally, for
example, as quotients of group actions, compactifications, etc.

The most important birational invariant of compact manifolds is certainly the Kodaira dimension
$\kappa(X)$. It is defined in three steps:

$\kappa(X) = -\infty$ iff $h^0(mK_X) = 0$ for all $m \ge 1$.

$\kappa(X) = 0$ iff $h^0(mK_X) \le 1$ for all m, and $h^0(mK_X) = 1$ for some m.

In all other cases we can consider the meromorphic map $f_m : X \to \mathbb{P}_{N(m)}$
associated to $H^0(mK_X)$ for all those m for which $h^0(mK_X) \ge 2$. Let $V_m$ denote the
(closure of the) image of $f_m$. Then $\kappa(X)$ is defined to be the maximal possible
$\dim V_m$.

Recall that $f_m$ is defined by $[s_0 : \cdots : s_N]$ for a given basis $s_i$ of $H^0(mK_X)$,
cf. Several Complex Variables: Basic Geometric Theory.

In the same way one defines the Kodaira (or Iitaka) dimension $\kappa(L)$ of a holomorphic line
bundle L (instead of $L = K_X$).

We are now going to describe geometrically the different birational equivalence classes and how
to single out nice models in each class. Using methods in characteristic p, Miyaoka and Mori
proved the following theorem:

Theorem 1 Let X be a projective manifold and suppose that through a general point $x \in X$
there is a curve C such that $K_X \cdot C < 0$. Then X is uniruled, that is, there is a family
of rational curves covering X.

A rational curve is simply the image of a nonconstant map $f : \mathbb{P}_1 \to X$. It is a
simple matter to prove that uniruled manifolds have $\kappa(X) = -\infty$, but the converse is
an important open problem. A step towards this conjecture has recently been made by Boucksom et
al. (2004): if $K_X$ is not pseudoeffective, that is, $K_X$ “cannot be approximated by effective
divisors,” then X is uniruled. Here one also finds a discussion of the case when $K_X$ is
pseudoeffective.

Mori theory is central in birational geometry. To state the main results in this theory, we
recall the notion of ampleness: a line bundle L is ample if L carries a metric of positive
curvature. Alternatively some tensor power of L has enough global sections to separate points
and tangents and therefore gives an embedding into some projective space; see Several Complex
Variables: Basic Geometric Theory for more details. The notion of nefness, which is in a certain
sense the degenerate version of ampleness, plays a central role in Mori theory: a line bundle or
divisor L is nef if

$$ L \cdot C = \deg(L|_C) \ge 0 $$

for all curves $C \subset X$. Examples are those L carrying a metric of semipositive curvature,
but the converse is not true. However, if L is nef, there exists for all $\varepsilon > 0$ a
metric $h_\varepsilon$ with curvature $\Theta_{h_\varepsilon} > -\varepsilon\omega$, where
$\omega$ is a fixed positive form. In this context singular metrics on L are also important.
Locally they are given by $e^{-\varphi}$ with a locally integrable weight function $\varphi$
and they still have a curvature current $\Theta$. If L has a singular metric with $\Theta$
bounded from below as current by a Kähler form, then L is big, that is,
$\kappa(L) = \dim X$, the birational version of ampleness. If one simply has $\Theta \ge 0$ as
current, then L is pseudoeffective (and vice versa). All these positivity notions only depend on
the Chern class $c_1(L)$ of L and therefore one considers the ample cone

$$ K_{\mathrm{amp}} \subset (H^{1,1}(X) \cap H^2(X, \mathbb{Z})) \otimes \mathbb{R} $$

and the cone of curves

$$ NE(X) \subset (H^{n-1,n-1}(X) \cap H^{2n-2}(X, \mathbb{Z})) \otimes \mathbb{R} $$

The ample cone is by definition the closed cone of nef divisors, the interior being the ample
classes, while the cone of curves is the closed cone generated by the fundamental classes of
irreducible curves. A basic result says that these cones are dual to each other. The structure
of NE(X) in the part where $K_X$ is negative is very nice; one has the following cone theorem:

Theorem 2 NE(X) is locally finite polyhedral in the half-space $\{K_X < 0\}$; the
(geometrically) extremal rays contain classes of rational curves.

A ray $R = \mathbb{R}_+[a]$ is said to be extremal in a closed cone K if the following holds:
given $b, c \in K$ with $b + c \in R$, then $b, c \in R$. Given such an extremal ray
$R \subset NE(X)$, one can find an ample line bundle H and a rational number t such that
$K_X + tH$ is nef and $(K_X + tH) \cdot R = 0$. Using the Kawamata–Viehweg vanishing theorem, a
generalization of Kodaira’s vanishing theorem, which is one of the technical cornerstones of
the theory, one proves the so-called
Base point free theorem Some multiple of $K_X + tH$ is spanned by global sections and therefore
defines a holomorphic map $f : X \to Y$ to some normal projective variety Y contracting exactly
those curves whose classes belong to R.

These maps are called “contractions of extremal rays” or “Mori contractions.” In dimension 2
they are classical: either $X = \mathbb{P}_2$ and f is the constant map, or f is a
$\mathbb{P}_1$-bundle or f is birational and the contraction of a $\mathbb{P}_1$ with normal
bundle $\mathcal{O}(-1)$, that is, f contracts a $(-1)$-curve. In particular Y is again smooth.
In the first two cases X has a very precise structure, but in the third birational case one
proceeds by asking whether or not $K_Y$ is nef. If it is not nef, we start again by choosing the
contraction of an extremal ray; if $K_Y$ is nef, then a fundamental result says that a multiple
of $K_Y$ is spanned. The class of manifolds with this property will be discussed later.

The situation in higher dimensions is much more complicated. For example, Y need no longer be
smooth. However the singularities which appear are rather special.

Definition 1 A normal variety X is said to have only terminal singularities if first some
multiple of the canonical (Weil) divisor $K_X$ is a Cartier divisor, that is, a line bundle (one
says that X is $\mathbb{Q}$-Gorenstein) and second if for some (hence for every) resolution of
singularities $\pi : \hat{X} \to X$ the following holds:

$$ K_{\hat{X}} = \pi^*(K_X) + \sum a_i E_i $$

where the $E_i$ run over the irreducible $\pi$-exceptional divisors and the $a_i$ are strictly
positive.

A brief remark concerning Weil divisors is in order: a Weil divisor is a finite linear
combination $\sum a_i Y_i$ with $Y_i$ irreducible of codimension 1, but $Y_i$ is not necessarily
locally defined by one equation. Recall that if each $Y_i$ is given locally by one equation,
then the Weil divisor is Cartier. On a smooth variety these notions coincide.

One important consequence is that $\kappa(X) = \kappa(\hat{X})$ in case of terminal
singularities, which is completely false for arbitrary singularities. Also notice that terminal
singularities are rational: $R^q \pi_*(\mathcal{O}_{\hat{X}}) = 0$ for $q \ge 1$. Terminal
singularities occur in codimension at least 3. Thus they are not present on surfaces. In
dimension 3 terminal singularities are well understood. The main point in this context is that
for a birational Mori contraction the image Y often has terminal singularities.

Now the scheme of Mori theory is the following. Start with a projective manifold X. If $K_X$ is
nef, we stop; this class is discussed later. If $K_X$ is not nef, then perform a Mori
contraction $f : X \to Y$. There are two cases:

If dim Y < dim X, then the general fiber F is a manifold with ample $-K_F$, that is, a Fano
manifold (discussed in the next section). Here we stop and observe that
$\kappa(X) = -\infty$. Of course one can still investigate Y and try to say more on the
structure of the fibration f.

If dim Y = dim X, then Y has terminal singularities, unless f is a small contraction which means
that no divisors are contracted. Thus if f is not small, we may attempt to proceed by
substituting X by Y.

As a result one must develop the entire theory for varieties with terminal singularities. The
big problem arises from small contractions f. In that case $K_Y$ cannot be $\mathbb{Q}$-Cartier
and the machinery stops. So new methods are required. At this stage, other aspects of the theory
lead one to attempt a certain surgery procedure which should improve the situation and allow one
to continue as above. The expected surgery $Y \dashrightarrow Y'$, which takes place in
codimension at least 2, is a “flip.” The idea is that we should substitute a small set, namely
the exceptional set of a small contraction, by some other small set (on which the canonical
bundle will be positive) to improve the situation. Of course $Y'$ should possess only terminal
singularities. The existence of flips is very deep and has been proved by S Mori in dimension 3.
Moreover, there cannot be an infinite sequence of flips, at least in dimension at most 4.

In summary, by performing contractions and flips one constructs from X a birational model $X'$
with terminal singularities such that either

$K_{X'}$ is nef in which case we call $X'$ a minimal model for X, or

$X'$ admits a Fano fibration $f' : X' \to Y'$ (discussed below), in which case
$\kappa(X) = \kappa(X') = -\infty$.

Up to now, Mori theory (via the work of Kawamata, Kollár, Mori, Reid, Shokurov, and others)
works well in dimension 3 (and possibly in the near future in dimension 4) but in higher
dimensions there are big problems with the existence of flips. Of course there might be
completely different and possibly less precise ways to construct a minimal model. One way is to
consider the canonical ring R of a manifold of general type:

$$ R = \sum_m H^0(mK_X) $$

If R is finitely generated as $\mathbb{C}$-algebra, then Proj(R) would be at least a canonical
model which
has slightly more complicated singularities than a minimal model. However, it is known that this
“finite generatedness problem” is equivalent to the existence of minimal models. On the other
hand, if X is of general type with $K_X$ nef (hence essentially ample) or more generally when
some positive multiple $mK_X$ is generated by global sections, then R is finitely generated.

We now must discuss the case of a nef canonical bundle. The behavior is predicted by the

Abundance conjecture. If X has only terminal singularities and $K_X$ is nef, then some multiple
$mK_X$ is spanned.

Up to now this conjecture is known only in dimension 3 (Kawamata, Kollár, Miyaoka). In higher
dimensions it is even unknown if there is a single section in some multiple $mK_X$. If $mK_X$
is spanned, one considers the Stein factorization $f : X \to Y$ of the associated map, which is
called the Iitaka fibration (if not birational) and we have $\dim Y = \kappa(X)$ by definition.
The general fiber F is a variety with $K_F \equiv 0$, a class discussed in the next section. If
f is birational, then Y will be slightly singular (so-called canonical singularities) and $K_Y$
will be ample. Essentially we are in the case of negative Ricci curvature.

Everything that was outlined above holds for projective manifolds. In the Kähler case one would
expect the same picture, but the methods completely fail, and new, analytic methods must be
found. Only very few results are known in this context.

We come back to the case of a Fano fibration $f : X \to Y$. By definition the anticanonical
bundle $-K_X$ is relatively ample so that the general fiber is a Fano variety. In this case
there are no constraints on Y. To see how much of the geometry of X is dictated by the rational
curves, one considers the so-called rational quotient of X. Here we identify two very general
points on X if they can be joined by a chain of rational curves. In that way we obtain the
rational quotient

$$ f : X \dashrightarrow Y $$

This map is merely meromorphic, but has the remarkable property of being “almost holomorphic,”
that is, the set of indeterminacies does not project onto Y. In other words, one has nice
compact fibers not meeting the indeterminacy set. If Y is just a point, then all points of X can
be joined by chains of rational curves and X is called rationally connected. This notion is
clearly birationally invariant.

A deep theorem of Graber–Harris–Starr states that, given a Fano fibration (or a fibration with
rationally connected fibers) $f : X \to Y$, then X is rationally connected if and only if Y is.

Manifolds $X_n$ which are birational to $\mathbb{P}_n$ are called rational. If there merely
exists a surjective (“dominant”) rational map $\mathbb{P}_n \dashrightarrow X$, then X is said
to be unirational. Of course rational (resp. unirational) manifolds are rationally connected,
but to decide whether a given manifold is rational/unirational is often a very deep problem.
Therefore, rational connectedness is often viewed as a practical substitute for
(uni)rationality.

Often it is very important to compute the Kodaira dimension of fiber spaces. Let us fix a
holomorphic surjective map $f : X \to Y$ between projective manifolds and we suppose f has
connected fibers. Then the so-called conjecture $C_{n,m}$ states that

$$ \kappa(X) \ge \kappa(F) + \kappa(Y) $$

where F is the general fiber of f. This conjecture is known in many cases, for example, when the
general fiber is of general type, but it is wide open in general. It is deeply related to the
existence of minimal models (Kawamata).

Biholomorphic Classification

In this section we discuss manifolds X with

ample anticanonical bundles $-K_X$ (Fano manifolds),

trivial canonical bundles, and

ample canonical bundles $K_X$.

Due to the solution of the Calabi conjecture by Yau and Aubin, these classes are characterized
by a Kähler metric of positive (resp. zero, resp. negative) Ricci curvature. In principle, in
view of the results of Mori theory, one should rather consider varieties with terminal
singularities, but we ignore this aspect completely. Philosophically, up to birational
equivalence all manifolds are somehow composed of those classes via fibrations, possibly also up
to étale coverings.

Examples of Fano manifolds are hypersurfaces of degree at most $n + 1$ in $\mathbb{P}_{n+1}$,
Grassmannians, or more generally homogeneous varieties G/P with G semisimple and P a parabolic
subgroup. Fano manifolds are simply connected. This can be seen either by classical differential
geometric methods using a Kähler metric of positive curvature or via the fundamental

Theorem 3 Fano manifolds are rationally connected.

The only known proof of this fact uses, as in the uniruled criterion mentioned above,
characteristic p methods. By just using complex methods it is not
Several Complex Variables: Compact Manifolds 555
known how to construct a single rational curve (of course, in concrete examples the rational curves are seen immediately). One still has to observe that rationally connected manifolds are simply connected, which is not so surprising, since rational curves lift to the universal cover.

At least in principle, Fano manifolds can be classified:

Theorem 4 There are only finitely many families of Fano manifolds in every dimension.

A family (of Fano manifolds) is a submersion $\pi : X \to S$ (with $S$ irreducible) such that all fibers are Fano manifolds. The essential step is to bound $(-K_X)^n$.

An actual classification has been carried out only in dimension up to 3; in dimension 2 one finds $\mathbb{P}^2$, $\mathbb{P}^1 \times \mathbb{P}^1$ and the so-called del Pezzo surfaces ($\mathbb{P}^2$ blown up in at most eight points in general position). In dimension 3 there are already 17 families of Fano 3-folds with $b_2 = 1$ and 88 families with $b_2 \ge 2$.

An extremely hard question is to decide whether a given Fano manifold is rational or unirational. Even in dimension 3 this is not completely decided.

The next class to be discussed consists of the manifolds with trivial canonical class $K_X$. This means that there is a holomorphic $n$-form without zeros ($n = \dim X$). Important examples are tori and hypersurfaces in $\mathbb{P}^{n+1}$ of degree $n + 2$. Simply connected manifolds with trivial canonical bundles are further divided into irreducible Calabi–Yau manifolds and irreducible symplectic manifolds. The first class is defined by requiring that there are no holomorphic $p$-forms for $p < \dim X$ whereas the second is characterized by the existence of a holomorphic 2-form of everywhere maximal rank. A completely different characterization is by holonomy: an irreducible Calabi–Yau manifold has SU-holonomy whereas irreducible symplectic manifolds have Sp-holonomy (with respect to a suitable Kähler metric). The splitting theorem of Beauville–Bogomolov–Kobayashi says

Theorem 5 Let $X$ be a projective (or compact Kähler) manifold with trivial canonical bundle. Then there exists a finite unbranched cover $\tilde{X} \to X$ such that

$$\tilde{X} = A \times \prod X_i \times \prod Y_j$$

$c_1(X) = 0$ in $H^2(X, \mathbb{R})$. Then there exists a finite unramified cover $\tilde{X} \to X$ such that $K_{\tilde{X}}$ is trivial. In view of Mori theory, normal projective varieties $X$ with at most terminal singularities and $K_X \equiv 0$ (i.e., $K_X \cdot C = 0$ for all curves) should also be investigated. It is expected that similar structure theorems hold; in particular $\pi_1(X)$ should be finite. The main difficulty is that there are no differential methods available; on the other hand an algebraic proof even for the splitting theorem in the smooth case is unknown.

Calabi–Yau manifolds play an important role in string theory and mirror symmetry (see Mirror Symmetry: A Geometric Survey). Here we mention two basic problems. The first is the problem of boundedness:

Are there only finitely many families of Calabi–Yau manifolds in any dimension?

This problem is wide open; in particular one might ask:

Is the Hodge number $h^{1,2}$ bounded for Calabi–Yau 3-folds?

The other problem asks for the existence of rational curves. In all known examples there are rational curves, but a general existence proof is not known. The case where $b_2(X) = 1$ seems to be particularly difficult. If $b_2(X) \ge 2$, then in many cases one can hope to find a fibration or a birational map, at least for 3-folds. Given such a map, the existence of rational curves is simple to establish. For example, if $D \subset X$ is an irreducible hypersurface which is not nef, choose $H$ ample and consider the a priori positive real number $p$ such that $D + pH$ is on the boundary of the ample cone. Then actually $p$ is rational and a suitable multiple $m(D + pH)$ is spanned and defines a contraction on $X$. This comes from “logarithmic Mori theory.”

The above splitting theorem exhibits a torus factor and all holomorphic 1-forms on $X$ come from this torus. This principle generalizes: given any projective or compact Kähler manifold $X$, there exists a “universal object,” the Albanese torus

$$\mathrm{Alb}(X) = H^0(\Omega^1_X)^* / H_1(X, \mathbb{Z})$$

(which is algebraic if $X$ is) together with a holomorphic map
follows: every map $X \to T$ to a torus factors via an affine map $\mathrm{Alb}(X) \to T$.

There is a nonabelian analog, the so-called Shafarevich map, but at the moment this map is only known to be meromorphic. It is an important tool to study the fundamental group $\pi_1(X)$. We refer to Campana (1996) and Kollár (1995).

In the following, Chern classes of holomorphic vector bundles will be important. Let $X$ be a compact complex manifold and $E$ a holomorphic vector bundle on $X$. The $j$th Chern class of $E$ is an element

$$c_j(E) \in H^{2j}(X, \mathbb{Q}) \cap H^{j,j}(X)$$

It can be defined, for example, by putting a Hermitian metric on $E$, computing the curvature of the canonical connection compatible with both the metric and the holomorphic structure and then by applying certain linear operators coming from symmetric functions such as determinant and trace. Actually Chern classes can be attached to every complex topological vector bundle on a topological manifold; then $c_j(E)$ will simply live in $H^{2j}(X, \mathbb{R})$. There is also a purely algebraic construction by Grothendieck. We refer, for example, to Fulton (1984), also for a discussion of the elementary functorial properties of Chern classes. Here we just recall that for a rank-$r$ vector bundle $E$ the first Chern class

$$c_1(E) = c_1\Big(\bigwedge^r E\Big)$$

where the Chern class of the line bundle $\bigwedge^r E$ as given in Several Complex Variables: Basic Geometric Theory actually lives in $H^2(X, \mathbb{Z})$.

Finally we discuss manifolds with ample canonical class $K_X$. Here moduli questions often play a central role. Moduli spaces of surfaces with fixed $c_1^2$ and $c_2$ are very intensively studied (by Catanese, Ciliberto, and others). Here, without going into details, we will concentrate on the very interesting topic of Kähler–Einstein metrics.

A Kähler metric $\omega$ is said to be Kähler–Einstein, if its Ricci curvature $\mathrm{Ric}(\omega)$ is proportional to $\omega$. The proportionality factor can be taken to be $-1$, $0$, $1$. In case $K_X$ is ample or trivial, Kähler–Einstein metrics always exist by Yau and Aubin (proportionality factor $-1$, resp. $0$). However if $X$ is Fano, there are obstructions, and a Kähler–Einstein metric does not always exist.

An important consequence of the existence of a Kähler–Einstein metric on a manifold $X_n$ with ample canonical class is the Miyaoka–Yau inequality:

$$c_1^2 \cdot \omega^{n-2} \le \frac{2(n+1)}{n}\, c_2 \cdot \omega^{n-2}$$

In case of equality, $X$ is covered by the $n$-dimensional unit ball.

The same inequality holds in case $K_X = 0$, and as a consequence the Chern class $c_2(X)$ is in some sense semipositive. If $c_2(X) = 0$, then some finite unramified cover of $X$ is a torus.

There is an interesting relation to stability. Recall that a vector bundle $E$ on a compact Kähler manifold $X_n$ is semistable with respect to a given Kähler form $\omega$, if for all proper coherent subsheaves $\mathcal{F} \subset E$ of rank $r$ the following inequality holds:

$$\frac{c_1(\mathcal{F}) \cdot \omega^{n-1}}{r} \le \frac{c_1(E) \cdot \omega^{n-1}}{\operatorname{rk} E}$$

In case of strict inequality, $E$ is said to be stable. The basic observation is now that the tangent bundle of a manifold with a Kähler–Einstein metric is semistable (with respect to the Kähler–Einstein metric). It is expected that Fano manifolds with $b_2 = 1$ have (semi?-)stable tangent bundles, although in certain situations they do not admit a Kähler–Einstein metric.

Again the first two Chern classes of a semistable vector bundle of rank $r$ fulfill an inequality:

$$c_1^2(E) \cdot \omega^{n-2} \le \frac{2r}{r-1}\, c_2(E) \cdot \omega^{n-2}$$

Equally important, semistable bundles with fixed numerical data form moduli spaces, this being the origin of the stability notion (Mumford). In this context, the notion of a Hermite–Einstein bundle is also important. Given a holomorphic vector bundle $E$ with a Hermitian metric $h$, there is a unique connection on $E$ compatible both with $h$ and the complex structure. Its curvature $F_h$ is a (1,1)-form with values in $\mathrm{End}(E)$. Now suppose $(X, \omega)$ is Kähler and let $\Lambda F_h$ be the contraction of $F_h$ with $\omega$. Then $(E, h)$ is said to be Hermite–Einstein on $(X, \omega)$, if

$$\Lambda F_h = \lambda\, \mathrm{id}$$

with some constant $\lambda$ and $\mathrm{id} : E \to E$ the identity. Notice that $(X, \omega)$ is Kähler–Einstein if $(T_X, h)$ is Hermite–Einstein over $(X, \omega)$ with $h$ the Kähler metric with Kähler form $\omega$. It is not so difficult to see that Hermite–Einstein bundles are semistable (with respect to the underlying Kähler form) and actually are direct sums of stable Hermite–Einstein bundles. Conversely, a very deep theorem of Uhlenbeck–Yau says that every stable vector bundle on a compact Kähler manifold is Hermite–Einstein. This is known as the Kobayashi–Hitchin correspondence; see Lübke and Teleman (1995).
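As a small numerical illustration of the Miyaoka–Yau inequality, one can look at the surface case $n = 2$, where it reads $c_1^2 \le 3 c_2$. The sketch below is not taken from the article: it uses the standard Chern numbers $c_1^2 = d(d-4)^2$ and $c_2 = d(d^2 - 4d + 6)$ of a smooth surface of degree $d$ in $\mathbb{P}^3$ (canonical class ample for $d \ge 5$), and the function names are purely illustrative.

```python
def surface_chern_numbers(d):
    """Chern numbers (c1^2, c2) of a smooth degree-d surface in P^3.

    Standard formulas obtained from adjunction, K = O(d - 4);
    they are an assumption of this sketch, not quoted from the article.
    """
    c1_sq = d * (d - 4) ** 2
    c2 = d * (d * d - 4 * d + 6)
    return c1_sq, c2


def satisfies_miyaoka_yau(c1_sq, c2, n=2):
    """Check n * c1^2 <= 2(n+1) * c2, i.e. c1^2 <= (2(n+1)/n) * c2."""
    return n * c1_sq <= 2 * (n + 1) * c2


# Surfaces of degree d >= 5 have ample canonical class.
for d in range(5, 9):
    c1_sq, c2 = surface_chern_numbers(d)
    print(d, c1_sq, c2, satisfies_miyaoka_yau(c1_sq, c2))
```

For these hypersurfaces the inequality is always strict, since $3c_2 - c_1^2 = 2d(d-1)^2 > 0$ for $d \ge 2$; equality would require a quotient of the unit ball.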
be an effective $\mathbb{Q}$-divisor, that is, all $a_i$ are positive rational numbers. Let $\langle a_i \rangle$ be the fractional part of $a_i$ and suppose that the $\mathbb{Q}$-divisor $\sum \langle a_i \rangle D_i$ has normal crossings. Let $\lceil a_i \rceil$ be the roundup of $a_i$ and put $L = \sum \lceil a_i \rceil D_i$. If $D$ is big and nef, then

$$H^q(X, L \otimes K_X) = 0$$

for $q \ge 1$. Of course $L$ itself need not be nef! This generalization is technically very important and yields substantial freedom for birational manipulations. We refer to Kawamata et al. (1987) and Lazarsfeld (2004). Even this is not the end of the story: the Kawamata–Viehweg theorem is embedded in the broader context of the Nadel vanishing theorem where multiplier ideal sheaves come into play. See Demailly and Lazarsfeld (eds.) and Lazarsfeld (2004).

Homogeneous Manifolds

More generally, let us consider the case that the compact Kähler manifold $X$ admits a vector field $v$ without zeros, but $X$ is not required to be homogeneous. Then a theorem of Lieberman says that there is a finite unramified cover $f : \tilde{X} \to X$ and a splitting

$$\tilde{X} \simeq F \times T$$

with $T$ a torus, such that $f^*(v)$ is the pullback of a vector field on $T$. On the other hand, if $v$ has a zero, then a classical theorem of Rosenlicht says that $X$ is covered by rational curves, that is, $X$ is uniruled. In particular $\kappa(X) = -\infty$. Notice also that a manifold of general type can never carry a vector field; in other words, the automorphism group is discrete, even finite.

Coming back to compact homogeneous Kähler manifolds, the first thing to study is the Albanese map. The Borel–Remmert theorem says that
Hermitian metric is called Hermitian symmetric, if for every $x \in X$ there exists an involutive holomorphic isometry fixing $x$. Mok has shown the remarkable fact that the simply connected compact Hermitian symmetric spaces are exactly those simply connected compact manifolds carrying a Kähler metric with semipositive holomorphic bisectional curvature. The only manifold having a metric with positive holomorphic bisectional curvature is $\mathbb{P}^n$ (Siu–Yau, Mori).

See also: Classical Groups and Homogeneous Spaces; Einstein Manifolds; Mirror Symmetry: A Geometric Survey; Moduli Spaces: An Introduction; Riemann Surfaces; Several Complex Variables: Basic Geometric Theory; Topological Sigma Models; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory].

Further Reading
Beltrametti M and Sommese AJ (1995) The Adjunction Theory of Complex Projective Varieties. Berlin: de Gruyter.
Boucksom S, Demailly JP, Paun M, and Peternell T (2004) The pseudo-effective cone of a compact Kähler manifold and varieties of negative Kodaira dimension math.AG/0405285
Campana F (1996) Kodaira dimension and fundamental group of compact Kähler manifolds. In: Andreatta M and Peternell T (eds.) Higher Dimensional Complex Varieties, pp. 89–162. Berlin: de Gruyter.
Demailly JP (2000) Complex analytic and algebraic geometry, http://www-fourier.ujf-grenoble.fr/ demailly.
Demailly JP and Lazarsfeld R (eds.) Vanishing Theorems and Effective Results in Algebraic Geometry, ICTP Lecture Notes, vol. 6, Trieste.
Fulton W (1984) Intersection Theory. Heidelberg: Springer.
Griffiths Ph and Harris J (1978) Principles of Algebraic Geometry. New York: Wiley.
Grauert H, Peternell Th, and Remmert R (1994) Several Complex Variables VII. Encyclopedia of Mathematical Sciences, vol. 74. Heidelberg: Springer.
Huckleberry A (1990) Actions of groups of holomorphic transformations. In: Barth W and Narasimhan R (eds.) Several Complex Variables VI, Encyclopedia of Mathematical Sciences, vol. 69, pp. 143–196. Berlin: Springer.
Kawamata Y, Matsuda K, and Matsuki K (1987) Introduction to the minimal model problem. Advanced Studies in Pure Mathematics 10: 283–360.
Kollár J (1995) Shafarevich Maps and Automorphic Forms. Princeton University Press.
Kollár J (1996) Rational Curves on Algebraic Varieties. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 32. Heidelberg: Springer.
Lazarsfeld R (2004) Positivity in Algebraic Geometry I, II. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 48 and 49. Heidelberg: Springer.
Lübke M and Teleman A (1995) The Kobayashi–Hitchin Correspondence. Singapore: World Scientific.
Matsuki K (2002) Introduction to the Mori Program, Universitext. Heidelberg: Springer.
Mori S (1987) Classification of higher-dimensional varieties. Proceedings of Symposia in Pure Mathematics 46: 269–331.
Parshin AN and Shafarevich IR (eds.) (1999) Algebraic Geometry V – Fano Varieties, vol. 47, Encyclopedia of Mathematical Sciences. Heidelberg: Springer.
Siu YT (1987) Lectures on Hermitian–Einstein Metrics for Stable Bundles and Kähler–Einstein Metrics. DMV Seminar, vol. 8. Basel: Birkhäuser.
Ueno K (1975) Classification Theory of Compact Complex Spaces. Lecture Notes in Math., vol. 439. Heidelberg: Springer.
Viehweg E (1995) Quasi-Projective Moduli for Polarized Manifolds. Ergebnisse der Mathematik und ihrer Grenzgebiete, vol. 30. Heidelberg: Springer.
In this article, which summarizes the work of the authors in Smoller and Temple (1995, 2003), we describe a two-parameter family of exact solutions of the Einstein equations that refine the FRW metric by a spherical shock wave cutoff. In these exact solutions, the expanding FRW metric is reduced to a region of finite extent and finite total mass at each fixed time, and this FRW region is bounded by an entropy-satisfying shock wave that emerges from the origin (the center of the explosion), at the instant of the big bang, $t = 0$. The shock wave, which marks the leading edge of the FRW expansion, propagates outward into a larger ambient spacetime from time $t = 0$ onward. Thus, in this refinement of the FRW metric, the big bang that set the galaxies in motion is an explosion of finite mass that looks more like a classical shock wave explosion than does the big bang of the standard model. (The fact that the entire infinite space $\mathbb{R}^3$ emerges at the instant of the big bang is, loosely speaking, a consequence of the Copernican principle, the principle that the Earth is not in a special place in the universe on the largest scale of things. With a shock wave present, the Copernican principle is violated, in the sense that the Earth then has a special position relative to the shock wave. But, of course, in these shock wave refinements of the FRW metric, there is a spacetime on the other side of the shock wave, beyond the galaxies, and so the scale of uniformity of the FRW metric, the scale on which the density of the galaxies is uniform, is no longer the largest length scale.)

In order to construct a mathematically simple family of shock wave refinements of the FRW metric that meet the Einstein equations exactly, we assume $k = 0$ (critical expansion), and we restrict to the case that the sound speed in the fluid on the FRW side of the shock wave is constant. That is, we assume an FRW equation of state $p = \sigma\rho$, where $\sigma$, the square of the sound speed $\sqrt{\partial p/\partial \rho}$, is constant, $0 < \sigma \le c^2$. At $\sigma = c^2/3$, this catches the important equation of state $p = (c^2/3)\rho$ which is correct at the earliest stage of big bang physics (Weinberg 1972). Also, as $\sigma$ ranges from 0 to $c^2$, we obtain qualitatively correct approximations to general equations of state. Taking $c = 1$ (we use the convention that $c = 1$, and Newton’s constant $G = 1$ when convenient), the family of solutions is then determined by two parameters, $0 < \sigma \le 1$ and $r_* \ge 0$. The second parameter, $r_*$, is the FRW radial coordinate $r$ of the shock in the limit $t \to 0$, the instant of the big bang. (Since, when $k = 0$, the FRW metric is invariant under the rescaling $r \to \alpha r$ and $R \to \alpha^{-1} R$, we fix the radial coordinate $r$ by fixing the scale factor with the condition that $R(t_0) = 1$ for some time $t_0$, say present time.) The FRW radial coordinate $r$ is singular with respect to radial arclength $\bar{r} = rR$ at the big bang $R = 0$, so setting $r_* > 0$ does not place the shock wave away from the origin at time $t = 0$. The distance from the FRW center to the shock wave tends to zero in the limit $t \to 0$ even when $r_* > 0$. In the limit $r_* \to \infty$, we recover from the family of solutions the usual (infinite) FRW metric with equation of state $p = \sigma\rho$; that is, we recover the standard FRW metric in the limit that the shock wave is infinitely far out. In this sense our family of exact solutions of the Einstein equations considered here represents a two-parameter refinement of the standard FRW metric.

The exact solutions for the case $r_* = 0$ were first constructed in Smoller and Temple (1995) (see also the notes by Smoller and Temple (1999)), and are qualitatively different from the solutions when $r_* > 0$, which were constructed later in Smoller and Temple (2003). The difference is that, when $r_* = 0$, the shock wave lies closer than one Hubble length from the center of the FRW spacetime throughout its motion (Smoller and Temple 2000), but when $r_* > 0$, the shock wave emerges at the big bang at a distance beyond one Hubble length. (The Hubble length depends on time, and tends to zero as $t \to 0$.) We show in Smoller and Temple (2003) that one Hubble length, equal to $c/H$, where $H = \dot{R}/R$, is a critical length scale in a $k = 0$ FRW metric because the total mass inside one Hubble length has a Schwarzschild radius equal exactly to one Hubble length. (Since $1/H$ is a good estimate for the age of the universe, it follows that the Hubble length $c/H$ is approximately the distance of light travel starting at the big bang up until the present time. In this sense, the Hubble length is a rough estimate for the distance to the farthest objects visible in the universe.) That is, one Hubble length marks precisely the distance at which the Schwarzschild radius $\bar{r}_s \equiv 2M$ of the mass $M$ inside a radial shock wave at distance $\bar{r}$ from the FRW center crosses from inside ($\bar{r}_s < \bar{r}$) to outside ($\bar{r}_s > \bar{r}$) the shock wave. If the shock wave is at a distance closer than one Hubble length from the FRW center, then $2M < \bar{r}$ and we say that the solution lies outside the black hole, but if the shock wave is at a distance greater than one Hubble length, then $2M > \bar{r}$ at the shock, and we say that the solution lies “inside” the black hole. Since $M$ increases like $\bar{r}^3$, it follows that $2M < \bar{r}$ for $\bar{r}$ sufficiently small, and $2M > \bar{r}$ for $\bar{r}$ sufficiently large, so there must be a critical radius at which $2M = \bar{r}$, and we show in what follows (see also Smoller and Temple (2003)) that when $k = 0$, this critical radius is exactly the Hubble length. When the parameter $r_* = 0$, the family of solutions for $0 < \sigma \le 1$ starts at the big bang, and evolves thereafter
Shock Wave Refinement of the Friedman–Robertson–Walker Metric 561
“outside” the black hole, satisfying $2M/\bar{r} < 1$ everywhere from $t = 0$ onward. But, when $r_* > 0$, the shock wave is further out than one Hubble length at the instant of the big bang, and the solution begins with $2M/\bar{r} > 1$ at the shock wave. From this time onward, the spacetime expands until eventually the Hubble length catches up to the shock wave at $2M/\bar{r} = 1$, and then passes the shock wave, making $2M/\bar{r} < 1$ thereafter. Thus, when $r_* > 0$, the whole spacetime begins inside the black hole (with $2M/\bar{r} > 1$ for sufficiently large $\bar{r}$), but eventually evolves to a solution outside the black hole. The time when $\bar{r} = 2M$ actually marks the event horizon of a white hole (the time reversal of a black hole) in the ambient spacetime beyond the shock wave. We show that, when $r_* > 0$, the time when the Hubble length catches up to the shock wave comes after the time when the shock wave comes into view at the FRW center, and when $2M = \bar{r}$ (assuming $t$ is so large that we can neglect the pressure from this time onward), the whole solution emerges from the white hole as a finite ball of mass expanding into empty space, satisfying $2M/\bar{r} < 1$ everywhere thereafter. In fact, when $r_* > 0$, the zero pressure Oppenheimer–Snyder solution outside the black hole gives the large-time asymptotics of the solution (Oppenheimer and Snyder 1939, Smoller and Temple 1988, 2004 and the comments after Theorems 6–8 below).

The exact solutions in the case $r_* = 0$ give a general-relativistic version of an explosion into a static, singular, isothermal sphere of gas, qualitatively similar to the corresponding classical explosion outside the black hole (Smoller and Temple 1995). The main difference physically between the cases $r_* > 0$ and $r_* = 0$ is that, when $r_* > 0$ (the case when the shock wave emerges from the big bang at a distance beyond one Hubble length), a large region of uniform expansion is created behind the shock wave at the instant of the big bang. Thus, when $r_* > 0$, lightlike information about the shock wave propagates inward from the wave, rather than outward from the center, as is the case when $r_* = 0$ and the shock lies inside one Hubble length. (One can imagine that when $r_* > 0$, the shock wave can get out through a great deal of matter early on when everything is dense and compressed, and still not violate the speed of light bound. Thus, when $r_* > 0$, the shock wave “thermalizes,” or more accurately “makes uniform,” a large region at the center, early on in the explosion.) It follows that, when $r_* > 0$, an observer positioned in the FRW spacetime inside the shock wave will see exactly what the standard model of cosmology predicts, up until the time when the shock wave comes into view in the far field. In this sense, the case $r_* > 0$ gives a black hole cosmology that refines the standard FRW model of cosmology to the case of finite mass. One of the surprising differences between the case $r_* = 0$ and the case $r_* > 0$ is that, when $r_* > 0$, the important equation of state $p = \rho/3$ comes out of the analysis as special at the big bang. When $r_* > 0$, the shock wave emerges at the instant of the big bang at a finite nonzero speed (the speed of light) only for the special value $\sigma = 1/3$. In this case, the equation of state on both sides of the shock wave tends to the correct relation $p = \rho/3$ as $t \to 0$, and the shock wave decelerates to subluminous speed for all positive times thereafter (see Smoller and Temple (2003) and Theorem 8 below).

In all cases $0 < \sigma \le 1$, $r_* \ge 0$, the spacetime metric that lies beyond the shock wave is taken to be a metric of Tolman–Oppenheimer–Volkoff (TOV) form (Oppenheimer and Volkoff 1939):

$$ds^2 = -B(\bar{r})\,d\bar{t}^2 + A^{-1}(\bar{r})\,d\bar{r}^2 + \bar{r}^2\,[d\theta^2 + \sin^2\theta\,d\phi^2] \qquad [2]$$

The metric [2] is in standard Schwarzschild coordinates (diagonal with radial coordinate equal to the area of the spheres of symmetry), and the metric components depend only on the radial coordinate $\bar{r}$. Barred coordinates are used to distinguish TOV coordinates from unbarred FRW coordinates for shock matching. The mass function $M(\bar{r})$ enters as a metric component through the relation

$$A = 1 - \frac{2M(\bar{r})}{\bar{r}} \qquad [3]$$

The TOV metric [2] has a very different character depending on whether $A > 0$ or $A < 0$; that is, depending on whether the solution lies outside the black hole or inside the black hole. In the case $A > 0$, $\bar{r}$ is a spacelike coordinate, and the TOV metric describes a static fluid sphere in general relativity. (When $A > 0$, for example, the metric [2] is the starting point for the stability limits of Buchdahl and Chandrasekhar for stars (Weinberg 1972, Smoller and Temple 1997, 1998).) When $A < 0$, $\bar{r}$ is the timelike coordinate, and [2] is a dynamical metric that evolves in time. The exact shock wave solutions are obtained by taking $\bar{r} = R(t)r$ to match the spheres of symmetry, and then matching the metrics [1] and [2] at an interface $\bar{r} = \bar{r}(t)$ across which the metrics are Lipschitz continuous. This can be done in general. In order for the interface to be a physically meaningful shock surface, we use the result in Theorem 4 below (see Smoller and Temple (1994)) that a single additional conservation constraint is sufficient to rule out $\delta$-function sources at the shock (the Einstein equations $G = \kappa T$ are second order in the metric, and
so $\delta$-function sources will in general be present at a Lipschitz continuous matching of metrics), and guarantee that the matched metric solves the Einstein equations in the weak sense. The Lipschitz matching of the metrics, together with the conservation constraint, leads to a system of ordinary differential equations (ODEs) that determine the shock position, together with the TOV density and pressure at the shock. Since the TOV metric depends only on $\bar{r}$, the equations thus determine the TOV spacetime beyond the shock wave. To obtain a physically meaningful outgoing shock wave, we impose the constraint $\bar{p} \le \bar{\rho}$ to ensure that the equation of state on the TOV side of the shock is physically reasonable, and as the entropy condition we impose the condition that the shock be compressive. For an outgoing shock wave, this is the condition $\rho > \bar{\rho}$, $p > \bar{p}$, that the pressure and density be larger on the side of the shock that receives the mass flux (the FRW side when the shock wave is propagating away from the FRW center). This condition breaks the time-reversal symmetry of the equations, and is sufficient to rule out rarefaction shocks in classical gas dynamics (Smoller 1983, Smoller and Temple 2003). The ODEs, together with the equation-of-state bound and the conservation and entropy constraints, determine a unique solution of the ODEs for every $0 < \sigma \le 1$ and $\bar{r}_* \ge 0$, and this provides the two-parameter family of solutions discussed here (Smoller and Temple 1995, 2003). The Lipschitz matching of the metrics implies that the total mass $M$ is continuous across the interface, and so when $r_* > 0$, the total mass of the entire solution, inside and outside the shock wave, is finite at each time $t > 0$, and both the FRW and TOV spacetimes emerge at the big bang. The total mass $M$ on the FRW side of the shock has the meaning of total mass inside the radius $\bar{r}$ at fixed time, but on the TOV side of the shock, $M$ does not evolve according to equations that give it the interpretation as a total mass because the metric is inside the black hole. Nevertheless, after the spacetime emerges from the black hole, the total mass takes on its usual meaning outside the black hole, and, time asymptotically, the big bang ends with an expansion of finite total mass in the usual sense. Thus, when $r_* > 0$, our shock wave refinement of the FRW metric leads to a big bang of finite total mass.

A final comment is in order regarding our overall philosophy. The exact shock wave solutions described here are rough models in the sense that the equation of state on the FRW side satisfies the condition $\sigma = \text{const.}$, and the equation of state on the TOV side is determined by the equations, and therefore cannot be imposed. Nevertheless, the bounds on the equations of state imply that the equations of state are qualitatively reasonable, and we expect that this family of solutions will capture the gross dynamics of solutions when more general equations of state are imposed. For more general equations of state, other waves, such as rarefaction waves and entropy waves, would need to be present to meet the conservation constraint, and thereby mediate the transition across the shock wave. Such transitional waves would be very difficult to model in an exact solution. But the fact that we can find global solutions that meet our physical bounds, and that are qualitatively the same for all values of $\sigma \in (0, 1]$ and all initial shock positions, strongly suggests that such a shock wave would be the dominant wave in a large class of problems.

In the next section, the FRW solution is derived for the case $\sigma = \text{const.}$, and the Hubble length is discussed as a critical length scale. Subsequently, the general theorems in Smoller and Temple (1994) for matching gravitational metrics across shock waves are employed. This is followed by a discussion of the construction of the family of solutions in the case $r_* = 0$. Finally, the case $r_* > 0$ is discussed. (Details can be found in Smoller and Temple (1995, 2003, 2004).)

The FRW Metric

According to Einstein’s theory of general relativity, all properties of the gravitational field are determined by a Lorentzian spacetime metric tensor $g$, whose line element in a given coordinate system $x = (x^0, \ldots, x^3)$ is given by

$$ds^2 = g_{ij}\,dx^i\,dx^j \qquad [4]$$

(We use the Einstein summation convention, whereby repeated up–down indices are assumed summed from 0 to 3.) The components $g_{ij}$ of the gravitational metric $g$ satisfy the Einstein equations

$$G^{ij} = \kappa T^{ij}, \qquad T^{ij} = (\rho c^2 + p)\,w^i w^j + p\,g^{ij} \qquad [5]$$

where we assume that the stress-energy tensor $T$ corresponds to that of a perfect fluid. Here $G$ is the Einstein curvature tensor,

$$\kappa = \frac{8\pi G}{c^4} \qquad [6]$$

is the coupling constant, $G$ is Newton’s gravitational constant, $c$ is the speed of light, $\rho c^2$ is the energy density, $p$ is the pressure, and $w = (w^0, \ldots, w^3)$ are the components of the 4-velocity of the fluid (cf. Weinberg 1972), and again we use the convention that $c = 1$ and $G = 1$ when convenient.
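To make the perfect-fluid stress tensor in [5] and the coupling constant in [6] concrete, the following minimal sketch evaluates them for a fluid at rest. It is only an illustration under stated assumptions: the flat Minkowski inverse metric diag(−1, 1, 1, 1) stands in for $g^{ij}$, units are $c = G = 1$, the sample density is arbitrary, and the function names are hypothetical.

```python
import math


def coupling_constant(G=1.0, c=1.0):
    """kappa = 8 pi G / c^4, equation [6]."""
    return 8.0 * math.pi * G / c ** 4


def perfect_fluid_stress(rho, p, w, g_inv, c=1.0):
    """T^{ij} = (rho c^2 + p) w^i w^j + p g^{ij}, equation [5]."""
    return [
        [(rho * c ** 2 + p) * w[i] * w[j] + p * g_inv[i][j] for j in range(4)]
        for i in range(4)
    ]


# Assumption for illustration only: flat inverse metric, signature (-,+,+,+).
MINKOWSKI_INV = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

sigma = 1.0 / 3.0          # equation of state p = sigma * rho
rho = 2.0                  # arbitrary sample density
p = sigma * rho
w_rest = (1.0, 0.0, 0.0, 0.0)   # 4-velocity of a fluid at rest (c = 1)

T = perfect_fluid_stress(rho, p, w_rest, MINKOWSKI_INV)
kappa = coupling_constant()
print("kappa =", kappa)
print("T^00 =", T[0][0], " T^11 =", T[1][1])
```

In the rest frame the tensor reduces to diag($\rho$, $p$, $p$, $p$), which is the form that produces equations [7] and [8] once the FRW ansatz is substituted.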
Putting the metric ansatz [1] into the Einstein [14] that M(t, r̄) has the physical interpretation as
equations [5] gives the equations for the FRW metric (Weinberg 1972),

$$H^2 = \left(\frac{\dot R}{R}\right)^2 = \frac{\kappa}{3}\rho - \frac{k}{R^2} \qquad [7]$$

and

$$\dot\rho = -3(p + \rho)H \qquad [8]$$

The unknown quantities $R$, $\rho$, and $p$ are assumed to be functions of the FRW coordinate time $t$ alone, and the "dot" denotes differentiation with respect to $t$.

To verify that the Hubble length $\bar r_{\rm crit} = 1/H$ is the limit for FRW–TOV shock matching outside a black hole, write the FRW metric [1] in standard Schwarzschild coordinates $x = (\bar r, \bar t)$, where the metric takes the form

$$ds^2 = -B(\bar r, \bar t)\,d\bar t^2 + A(\bar r, \bar t)^{-1}\,d\bar r^2 + \bar r^2\,d\Omega^2 \qquad [9]$$

and the mass function $M(\bar r, \bar t)$ is defined through the relation

$$A = 1 - \frac{2M}{\bar r} \qquad [10]$$

It is well known that a general spherically symmetric metric can be transformed to the form [9] by coordinate transformation (see Weinberg (1972) and Groah and Temple (2004)). Substituting $\bar r = Rr$ into [1] and diagonalizing the resulting metric, we obtain (see Smoller and Temple (2004) for details)

$$ds^2 = -\frac{1}{\psi^2}\left\{\frac{1 - kr^2}{1 - kr^2 - H^2\bar r^2}\right\}d\bar t^2 + \left\{\frac{1}{1 - kr^2 - H^2\bar r^2}\right\}d\bar r^2 + \bar r^2\,d\Omega^2 \qquad [11]$$

where $\psi$ is an integrating factor that solves the equation

$$\frac{\partial}{\partial \bar r}\left(\psi\,\frac{1 - kr^2 - H^2\bar r^2}{1 - kr^2}\right) - \frac{\partial}{\partial t}\left(\psi\,\frac{H\bar r}{1 - kr^2}\right) = 0 \qquad [12]$$

and the time coordinate $\bar t = \bar t(t, \bar r)$ is defined by the exact differential

$$d\bar t = \left(\psi\,\frac{1 - kr^2 - H^2\bar r^2}{1 - kr^2}\right)dt + \left(\psi\,\frac{H\bar r}{1 - kr^2}\right)d\bar r \qquad [13]$$

Now using [10] in [7], it follows that

$$M(t, \bar r) = \frac{\kappa}{2}\int_0^{\bar r} \rho(t)\,s^2\,ds = \frac{1}{3}\,\frac{\kappa}{2}\,\rho\,\bar r^3 \qquad [14]$$

Since in the FRW metric, $\bar r = Rr$ measures arclength along radial geodesics at fixed time, we see from [14] that $M(t, \bar r)$ has the physical interpretation as the total mass inside radius $\bar r$ at time $t$ in the FRW metric. Restricting to the case of critical expansion $k = 0$, we see from [7], [14], and [13] that $\bar r = H^{-1}$ is equivalent to $2M/\bar r = 1$, and so at fixed time $t$, the following equivalences are valid:

$$\bar r = H^{-1} \quad\text{iff}\quad \frac{2M}{\bar r} = 1 \quad\text{iff}\quad A = 0 \qquad [15]$$

We conclude that $\bar r = H^{-1}$ is the critical length scale for the FRW metric at fixed time $t$ in the sense that $A = 1 - 2M/\bar r$ changes sign at $\bar r = H^{-1}$, and so the universe lies inside a black hole beyond $\bar r = H^{-1}$, as claimed above. Now, we proved in Smoller and Temple (1998) that the standard TOV metric outside the black hole cannot be continued into $A = 0$ except in the very special case $\bar\rho = 0$. (It takes an infinite pressure to hold up a static configuration at the event horizon of a black hole.) Thus, shock matching beyond one Hubble length requires a metric of a different character, and for this purpose, we introduce the TOV metric inside the black hole: a metric of TOV form, with $A < 0$, whose fluid is comoving with the timelike radial coordinate $\bar r$ (Smoller and Temple 2004).

The Hubble length $\bar r_{\rm crit} = c/H$ is also the critical distance at which the outward expansion of the FRW metric exactly cancels the inward advance of a radial light ray impinging on an observer positioned at the origin of a $k = 0$ FRW metric. Indeed, by [1], a light ray traveling radially inward toward the center of an FRW coordinate system satisfies the condition

$$c^2\,dt^2 = R^2\,dr^2 \qquad [16]$$

so that

$$\frac{d\bar r}{dt} = \dot R r + R\dot r = H\bar r - c = H\left(\bar r - \frac{c}{H}\right) > 0 \qquad [17]$$

if and only if

$$\bar r > \frac{c}{H}$$

Thus, the arclength distance from the origin to an inward moving light ray at fixed time $t$ in a $k = 0$ FRW metric will actually increase as long as the light ray lies beyond the Hubble length. An inward moving light ray will, however, eventually cross the Hubble length and reach the origin in finite proper time, due to the increase in the Hubble length with time.

We now calculate the infinite redshift limit in terms of the Hubble length. It is well known that light emitted at $(t_e, r_e)$ at wavelength $\lambda_e$ in an FRW spacetime will be observed at $(t_0, r_0)$ at wavelength $\lambda_0$ if

$$\frac{R_0}{R_e} = \frac{\lambda_0}{\lambda_e}$$
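The role of the Hubble length in [15] and [17] can be illustrated numerically. The following is a minimal Python sketch, not part of the original derivation: it assumes $k = 0$, so that [7] and [14] give $2M/\bar r = H^2\bar r^2$, and it uses units with $c = 1$. The function names and the sample value of $H$ are hypothetical, chosen only for illustration.

```python
def frw_A(r_bar, H):
    """A = 1 - 2M/r_bar for a k = 0 FRW metric.

    Uses 2M/r_bar = H^2 r_bar^2, which follows from [7] and [14] when k = 0.
    """
    return 1.0 - (H * r_bar) ** 2


def inward_ray_rate(r_bar, H, c=1.0):
    """d(r_bar)/dt for an inward radial light ray, equation [17]."""
    return H * r_bar - c


H = 0.5                    # illustrative Hubble constant (arbitrary units)
hubble_length = 1.0 / H    # with c = 1

for r_bar in (0.5 * hubble_length, hubble_length, 2.0 * hubble_length):
    print(r_bar, frw_A(r_bar, H), inward_ray_rate(r_bar, H))
```

Inside the Hubble length, $A$ is positive and the inward ray approaches the origin; beyond it, $A$ is negative and the arclength distance of the ray from the origin grows.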
Moreover, the redshift factor $z$ is defined by

$$z = \frac{\lambda_0}{\lambda_e} - 1$$

Thus, infinite redshifting occurs in the limit $R_e \to 0$, where $R = 0$, $t = 0$ is the big bang. Consider now a light ray emitted at the instant of the big bang, and observed at the FRW origin at present time $t = t_0$. Let $r_\infty$ denote the FRW coordinate at time $t \to 0$ of the furthest objects that can be observed at the FRW origin before time $t = t_0$. Then $r_\infty$ marks the position of objects at time $t = 0$ whose radiation would be observed as infinitely redshifted (assuming no scattering). Note then that a shock wave emanating from $\bar r = 0$ at the instant of the big bang will be observed at the FRW origin before present time $t = t_0$ only if its position $r$ at the instant of the big bang satisfies the condition $r < r_\infty$. To estimate $r_\infty$, note first that from [16] it follows that an incoming radial light ray in an FRW metric follows a lightlike trajectory $r = r(t)$ if

$$r - r_e = -\int_{t_e}^{t} \frac{d\tau}{R(\tau)}$$

and thus

$$r_\infty = \int_0^{t_0} \frac{d\tau}{R(\tau)} \qquad [18]$$

Using this, the following theorem can be proved (Smoller and Temple 2004).

Theorem 1 If the pressure $p$ satisfies the bounds

$$0 \le p \le \tfrac{1}{3}\rho \qquad [19]$$

then, for any equation of state, the age of the universe $t_0$ and the infinite red shift limit $r_\infty$ are bounded in terms of the Hubble length by

$$\frac{1}{2H_0} \le t_0 \le \frac{2}{3H_0} \qquad [20]$$

$$\frac{1}{H_0} \le r_\infty \le \frac{2}{H_0} \qquad [21]$$

(We have assumed in Theorem 1 that $R = 0$ when $t = 0$ and $R = 1$ when $t = t_0$, $H = H_0$.)

The next theorem gives closed-form solutions of the FRW equations [7], [8] in the case when $\sigma = $ const. As a special case, we recover the bounds in [20] and [21] from the cases $\sigma = 0$ and $1/3$.

Theorem 2 Assume $k = 0$ and the equation of state

$$p = \sigma\rho \qquad [22]$$

where $\sigma$ is taken to be constant,

$$0 \le \sigma \le 1$$

then (assuming an expanding universe $\dot R > 0$), the solution of system [7], [8] satisfying $R = 0$ at $t = 0$ and $R = 1$ at $t = t_0$ is given by

$$\rho = \frac{4}{3\kappa(1+\sigma)^2}\,\frac{1}{t^2} \qquad [23]$$

$$R = \left(\frac{t}{t_0}\right)^{2/[3(1+\sigma)]} \qquad [24]$$

$$\frac{H}{H_0} = \frac{t_0}{t} \qquad [25]$$

Moreover, the age of the universe $t_0$ and the infinite red shift limit $r_\infty$ are given exactly in terms of the Hubble length by

$$t_0 = \frac{2}{3(1+\sigma)}\,\frac{1}{H_0} \qquad [26]$$

$$r_\infty = \frac{2}{1+3\sigma}\,\frac{1}{H_0} \qquad [27]$$

From [27] we conclude that a shock wave will be observed at the FRW origin before present time $t = t_0$ only if its position $r$ at the instant of the big bang satisfies the condition

$$r < \frac{2}{1+3\sigma}\,\frac{1}{H_0}$$

Note that $r_\infty$ ranges from one-half to two Hubble lengths as $\sigma$ ranges from 1 to 0, taking the intermediate value of one Hubble length at $\sigma = 1/3$ (cf. [21]).

Note that using [23] and [24] in [14], it follows that

$$M = \frac{\kappa}{2}\int_0^{\bar r} \rho(t)\,s^2\,ds = \frac{2r^3}{9(1+\sigma)^2\,t_0^{2/(1+\sigma)}}\;t^{-2\sigma/(1+\sigma)} \qquad [28]$$

so $\dot M < 0$ if $\sigma > 0$. It follows that if $p = \sigma\rho$, $\sigma = $ const. $> 0$, then the total mass inside radius $r = $ const. decreases in time.

The General Theory of Shock Matching

The matching of the FRW and TOV metrics in the next two sections is based on the following theorems that were derived in Smoller and Temple (1994). (Theorems 3 and 4 apply to non-lightlike shock surfaces. The lightlike case was discussed by Scott (2002).)

Theorem 3 Let $\Sigma$ denote a smooth, three-dimensional shock surface in spacetime with spacelike
normal vector $n$ relative to the spacetime metric $g$; let $K$ denote the second fundamental form on $\Sigma$; and let $G$ denote the Einstein curvature tensor. Assume that the components $g_{ij}$ of the gravitational metric $g$ are smooth on either side of $\Sigma$ (continuous up to the boundary on either side separately), and Lipschitz continuous across $\Sigma$ in some fixed coordinate system. Then the following statements are equivalent:

(i) $[K] = 0$ at each point of $\Sigma$.
(ii) The curvature tensors $R^i_{jkl}$ and $G_{ij}$, viewed as second-order operators on the metric components $g_{ij}$, produce no $\delta$-function sources on $\Sigma$.
(iii) For each point $P \in \Sigma$, there exists a $C^{1,1}$ coordinate transformation defined in a neighborhood of $P$, such that, in the new coordinates (which can be taken to be the Gaussian normal coordinates for the surface), the metric components are $C^{1,1}$ functions of these coordinates.
(iv) For each $P \in \Sigma$, there exists a coordinate frame that is locally Lorentzian at $P$, and can be reached within the class of $C^{1,1}$ coordinate transformations.

Moreover, if any one of these equivalencies hold, then the Rankine–Hugoniot jump conditions, $[G]^\sigma_i n_\sigma = 0$ (which express the weak form of conservation of energy and momentum across $\Sigma$ when $G = \kappa T$), hold at each point on $\Sigma$.

Here $[f]$ denotes the jump in the quantity $f$ across $\Sigma$ (this being determined by the metric separately on each side of $\Sigma$ because $g_{ij}$ is only Lipschitz continuous across $\Sigma$), and by $C^{1,1}$ we mean that the first derivatives are Lipschitz continuous.

In the case of spherical symmetry, the following stronger result holds. In this case, the jump conditions $[G^{ij}]n_i = 0$, which express the weak form of conservation across a shock surface, are implied by a single condition $[G^{ij}]n_i n_j = 0$, so long as the shock is non-null, and the areas of the spheres of symmetry match smoothly at the shock and change monotonically as the shock evolves. Note that, in general, assuming that the angular variables are identified across the shock, we expect conservation to entail two conditions, one for the time and one for the radial components. The fact that the smooth matching of the spheres of symmetry reduces conservation to one condition can be interpreted as an instance of the general principle that directions of smoothness in the metric imply directions of conservation of the sources.

Theorem 4 Assume that $g$ and $\bar g$ are two spherically symmetric metrics that match Lipschitz continuously across a three-dimensional shock interface $\Sigma$ to form the matched metric $g \cup \bar g$. That is, assume that $g$ and $\bar g$ are Lorentzian metrics given by

$$ds^2 = -a(t, r)\,dt^2 + b(t, r)\,dr^2 + c(t, r)\,d\Omega^2 \qquad [29]$$

and

$$d\bar s^2 = -\bar a(\bar t, \bar r)\,d\bar t^2 + \bar b(\bar t, \bar r)\,d\bar r^2 + \bar c(\bar t, \bar r)\,d\Omega^2 \qquad [30]$$

and that there exists a smooth coordinate transformation $\Psi : (t, r) \to (\bar t, \bar r)$, defined in a neighborhood of a shock surface $\Sigma$ given by $r = r(t)$, such that the metrics agree on $\Sigma$. (We implicitly assume that $\theta$ and $\varphi$ are continuous across the surface.) Assume that

$$c(t, r) = \bar c(\Psi(t, r)) \qquad [31]$$

in an open neighborhood of the shock surface $\Sigma$, so that, in particular, the areas of the 2-spheres of symmetry in the barred and unbarred metrics agree on the shock surface. Assume also that the shock surface $r = r(t)$ in unbarred coordinates is mapped to the surface $\bar r = \bar r(\bar t)$ by $(\bar t, \bar r(\bar t)) = \Psi(t, r(t))$. Assume, finally, that the normal $n$ to $\Sigma$ is non-null, and that

$$n(c) \ne 0 \qquad [32]$$

where $n(c)$ denotes the derivative of the function $c$ in the direction of the vector $n$. Then the following are equivalent to the statement that the components of the metric $g \cup \bar g$ in any Gaussian normal coordinate system are $C^{1,1}$ functions of these coordinates across the surface $\Sigma$:

$$[G^i_j]\,n_i = 0 \qquad [33]$$

$$[G^{ij}]\,n_i n_j = 0 \qquad [34]$$

$$[K] = 0 \qquad [35]$$

Here again, $[f] = \bar f - f$ denotes the jump in the quantity $f$ across $\Sigma$, and $K$ is the second fundamental form on the shock surface.

We assume in Theorem 4 that the areas of the 2-spheres of symmetry change monotonically in the direction normal to the surface. For example, if $c = r^2$, then $\partial c/\partial t = 0$, so the assumption $n(c) \ne 0$ is valid except when $n = \partial/\partial t$, in which case the rays of the shock surface would be spacelike. Thus, the shock speed would be faster than the speed of light if our assumption $n(c) \ne 0$ failed in the case $c = r^2$.

FRW–TOV Shock Matching Outside the Black Hole – The Case $r_* = 0$

To construct the family of shock wave solutions for parameter values $0 < \bar\sigma \le 1$ and $r_* = 0$, we match the exact solution [23]–[25] of the FRW metric [1] to the TOV metric [2] outside the black hole,
assuming $A > 0$. In this case, we can bypass the problem of deriving and solving the ODEs for the shock surface and constraints discussed above, by actually deriving the exact solution of the Einstein equations of TOV form that meets these equations. This exact solution represents the general-relativistic version of a static, singular isothermal sphere: singular because it has an inverse square density profile, and isothermal because the relationship between the density and pressure is $\bar p = \bar\sigma\bar\rho$, $\bar\sigma = $ const.

Assuming the stress tensor for a perfect fluid, and assuming that the density and pressure depend only on $\bar r$, the Einstein equations for the TOV metric [2] outside the black hole (i.e., when $A = 1 - 2M/\bar r > 0$) are equivalent to the Oppenheimer–Volkoff system

$$\frac{dM}{d\bar r} = 4\pi\bar r^2\bar\rho \qquad [36]$$

$$-\bar r^2\frac{d\bar p}{d\bar r} = GM\bar\rho\left(1 + \frac{\bar p}{\bar\rho}\right)\left(1 + \frac{4\pi\bar r^3\bar p}{M}\right)\left(1 - \frac{2GM}{\bar r}\right)^{-1} \qquad [37]$$

Integrating [36], we obtain the usual interpretation of $M$ as the total mass inside radius $\bar r$,

$$M(\bar r) = \int_0^{\bar r} 4\pi\xi^2\,\bar\rho(\xi)\,d\xi \qquad [38]$$

The metric component $B \equiv B(\bar r)$ is determined from $\bar\rho$ and $M$ through the equation

$$\frac{B'(\bar r)}{B} = -2\,\frac{\bar p'(\bar r)}{\bar p + \bar\rho} \qquad [39]$$

Assuming

$$\bar p = \bar\sigma\bar\rho, \qquad \bar\rho(\bar r) = \frac{\gamma}{\bar r^2} \qquad [40]$$

for some constants $\bar\sigma$ and $\gamma$, and substituting into [3], we obtain

$$M(\bar r) = 4\pi\gamma\bar r \qquad [41]$$

Putting [40] and [41] into [37] and simplifying yields the identity

$$\gamma = \frac{1}{2\pi G}\left(\frac{\bar\sigma}{1 + 6\bar\sigma + \bar\sigma^2}\right) \qquad [42]$$

From [38] we obtain

$$A = 1 - 8\pi G\gamma < 1 \qquad [43]$$

Applying [39] leads to

$$B = B_0\left(\frac{\bar\rho}{\bar\rho_0}\right)^{-2\bar\sigma/(1+\bar\sigma)} = B_0\left(\frac{\bar r}{\bar r_0}\right)^{4\bar\sigma/(1+\bar\sigma)} \qquad [44]$$

By rescaling the time coordinate, we can take $B_0 = 1$ at $\bar r_0 = 1$, in which case [44] reduces to

$$B = \bar r^{4\bar\sigma/(1+\bar\sigma)} \qquad [45]$$

We conclude that when [42] holds, [40]–[43] and [44] provide an exact solution of the Einstein field equations of TOV type, for each $0 \le \bar\sigma \le 1$. (In this case, an exact solution of TOV type was first found by Tolman (1939), and rediscovered in the case $\bar\sigma = 1/3$ by Misner and Zapolsky (cf. Weinberg (1972 p. 320)).) By [43], these solutions are defined outside the black hole, since $2M/\bar r < 1$. When $\bar\sigma = 1/3$, [42] yields $\gamma = 3/(56\pi G)$ (cf. Weinberg (1972, equation (11.4.13))).

To match the FRW exact solution [23]–[25] with equation of state $p = \sigma\rho$ to the TOV exact solution [40]–[45] with equation of state $\bar p = \bar\sigma\bar\rho$ across a shock interface, we first set $\bar r = Rr$ to match the spheres of symmetry, and then match the timelike and spacelike components of the corresponding metrics in standard Schwarzschild coordinates. The matching of the $d\bar r^2$ coefficient $A^{-1}$ yields the conservation of mass condition that implicitly gives the shock surface $\bar r = \bar r(t)$,

$$M(\bar r) = \frac{4\pi}{3}\rho(t)\,\bar r^3 \qquad [46]$$

Using this together with [41] gives the following two relations that hold at the shock surface:

$$\bar r = \sqrt{\frac{3\gamma}{\rho(t)}}$$

$$\rho = \frac{3}{4\pi}\,\frac{M}{\bar r(t)^3} = \frac{3\gamma}{\bar r(t)^2} = 3\bar\rho \qquad [47]$$

Matching the coefficient $B$ of $d\bar t^2$ on the shock surface determines the integrating factor $\psi$ in a neighborhood of the shock surface by assigning initial conditions for [44]. Finally, the conservation constraint $[T^{ij}]n_i n_j = 0$ leads to the single condition

$$0 = (1 - A)(\rho + \bar p)(p + \bar\rho)^2 + \left(1 - \frac{1}{A}\right)(\bar\rho + \bar p)(\rho + p)^2 + (p - \bar p)(\rho - \bar\rho)^2 \qquad [48]$$

which upon using $p = \sigma\rho$ and $\bar p = \bar\sigma\bar\rho$ is satisfied assuming the condition

$$\sigma = \tfrac{1}{2}\sqrt{9\bar\sigma^2 + 54\bar\sigma + 49} - \tfrac{3}{2}\bar\sigma - \tfrac{7}{2} \equiv H(\bar\sigma) \qquad [49]$$

Alternatively, we can solve for $\bar\sigma$ in [49] and write this relation as

$$\bar\sigma = \frac{\sigma(\sigma + 7)}{3(1 - \sigma)} \qquad [50]$$
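As a quick numerical check that [49] and [50] are inverse to one another, here is a minimal Python sketch. The function names `H_of` and `sigma_bar_of` are hypothetical, introduced only for this illustration; the formulas themselves are [49] and [50].

```python
import math


def H_of(sigma_bar):
    """FRW parameter sigma = H(sigma_bar), equation [49]."""
    return (0.5 * math.sqrt(9 * sigma_bar**2 + 54 * sigma_bar + 49)
            - 1.5 * sigma_bar - 3.5)


def sigma_bar_of(sigma):
    """TOV parameter sigma_bar as a function of sigma, equation [50] (sigma != 1)."""
    return sigma * (sigma + 7) / (3 * (1 - sigma))


# Evaluate [49] at a few sample values and map back through [50].
for sb in (0.0, 1 / 3, 1.0):
    s = H_of(sb)
    print(f"sigma_bar={sb:.4f}  sigma={s:.4f}  round trip={sigma_bar_of(s):.4f}")
```

The round trip recovers each input $\bar\sigma$, and the computed $\sigma$ stays below $\bar\sigma$ for the positive sample values.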
Condition [49] guarantees that conservation holds across the shock surface, and so it follows from Theorem 4 that all of the equivalencies in Theorem 3 hold across the shock surface. Note that $H(0) = 0$, and to leading order $\sigma = (3/7)\bar\sigma + O(\bar\sigma^2)$ as $\bar\sigma \to 0$. Within the physical region $0 \le \sigma, \bar\sigma \le 1$, $H'(\bar\sigma) > 0$, $\sigma < \bar\sigma$, and $H(1/3) = \sqrt{17} - 4 \approx 0.1231$, $H(1) = \sqrt{112}/2 - 5 \approx 0.2915$.

Using the exact formulas for the FRW metric in [23]–[25], and setting $R_0 = 1$ at $\rho = \rho_0$, $t = t_0$, we obtain the following exact formulas for the shock position:

$$\bar r(t) = \alpha t \qquad [51]$$

$$r(t) = \bar r(t)R(t)^{-1} = \beta\,t^{(1+3\sigma)/(3+3\sigma)} \qquad [52]$$

where

$$\alpha = 3(1+\sigma)\sqrt{\frac{\bar\sigma}{1 + 6\bar\sigma + \bar\sigma^2}}, \qquad \beta = \alpha^{(1+3\sigma)/(3+3\sigma)}\left(\frac{3\gamma}{\rho_0}\right)^{1/(3+3\sigma)} \qquad [53]$$

It follows from [41] that $A > 0$, and from [52] that $r_* = \lim_{t \to 0} r(t) = 0$. The entropy condition that the shock wave be compressive follows from the fact that $\sigma = H(\bar\sigma) < \bar\sigma$. Thus, we conclude that for each $0 < \bar\sigma \le 1$, $r_* = 0$, the solutions constructed in [40]–[53] define a one-parameter family of shock wave solutions that evolve everywhere outside the black hole, which implies that the distance from the shock wave to the FRW center is less than one Hubble length for all $t > 0$.

Using [51] and [52], one can determine the shock speed, and check when the Lax characteristic condition (Smoller 1983) holds at the shock. The result is the following theorem. (Note that even when the shock speed is larger than $c$, only the wave, and not the sound speeds or any other physical motion, exceeds the speed of light. See Scott (2002) for the case when the shock speed is equal to the speed of light.) The reader is referred to Smoller and Temple (1995) for details.

Theorem 5 There exist values $0 < \sigma_1 < \sigma_2 < 1$ ($\sigma_1 \approx 0.458$, $\sigma_2 = \sqrt{5}/3 \approx 0.745$), such that, for $0 < \bar\sigma \le 1$, the Lax characteristic condition holds at the shock if and only if $0 < \bar\sigma < \sigma_1$; and the shock speed is less than the speed of light if and only if $0 < \bar\sigma < \sigma_2$.

The explicit solution in the case $r_* = 0$ can be interpreted as a general-relativistic version of a shock wave explosion into a static, singular, isothermal sphere, known in the Newtonian case as a simple model for star formation (Smoller and Temple 2000). As the scenario goes, a star begins as a diffuse cloud of gas. The cloud slowly contracts under its own gravitational force by radiating energy out through the gas cloud as gravitational potential energy is converted into kinetic energy. This contraction continues until the gas cloud reaches the point where the mean free path for transmission of light is small enough that light is scattered, instead of being transmitted, through the cloud. The scattering of light within the gas cloud has the effect of equalizing the temperature within the cloud, and at this point the gas begins to drift toward the most compact configuration of the density that balances the pressure when the equation of state is isothermal. This configuration is a static, singular, isothermal sphere, the general-relativistic version of which is the exact TOV solution beyond the shock wave when $r_* = 0$. This solution in the Newtonian case is also inverse square in the density and pressure, and so the density tends to infinity at the center of the sphere. Eventually, the high densities at the center ignite thermonuclear reactions. The result is a shock wave explosion emanating from the center of the sphere, and this signifies the birth of the star. The exact solutions when $r_* = 0$ represent a general-relativistic version of such a shock wave explosion.

Shock Wave Solutions Inside the Black Hole – The Case $r_* > 0$

When the shock wave is beyond one Hubble length from the FRW center, we obtain a family of shock wave solutions for each $0 < \sigma \le 1$ and $r_* > 0$ by shock matching the FRW metric [1] to a TOV metric of form [2] under the assumption that

$$A(\bar r) = 1 - \frac{2M(\bar r)}{\bar r} \equiv 1 - N(\bar r) < 0 \qquad [54]$$

In this case, $\bar r$ is the timelike variable. Assuming that the stress tensor $T$ is taken to be that of a perfect fluid comoving with the TOV metric, the Einstein equations $G = \kappa T$, inside the black hole, take the form (see Smoller and Temple (2004) for details)

$$\bar p' = \frac{\bar p + \bar\rho}{2}\,\frac{N'}{N - 1} \qquad [55]$$

$$N' = -\left\{\frac{N}{\bar r} + \kappa\bar p\,\bar r\right\} \qquad [56]$$

$$\frac{B'}{B} = -\frac{1}{N - 1}\left\{\frac{N}{\bar r} + \kappa\bar\rho\,\bar r\right\} \qquad [57]$$
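One way to gain confidence in the form of [56] and [57] is to check their vacuum limit. The Python sketch below is an illustration under stated assumptions, not part of the source derivation: it takes $\kappa = 8\pi$ (units with $G = c = 1$), sets $\bar p = \bar\rho = 0$, and compares the right-hand sides against the Schwarzschild functions $N = 2M/\bar r$ and $B = 1 - 2M/\bar r$ at a sample radius inside the horizon. The function names and the sample values are hypothetical.

```python
import math

KAPPA = 8 * math.pi  # assumption: units with G = c = 1


def dN_dr(N, r_bar, p_bar):
    """Right-hand side of [56]: N' as a function of N, r_bar and the TOV pressure."""
    return -(N / r_bar + KAPPA * p_bar * r_bar)


def dlogB_dr(N, r_bar, rho_bar):
    """Right-hand side of [57]: B'/B as a function of N, r_bar and the TOV density."""
    return -(N / r_bar + KAPPA * rho_bar * r_bar) / (N - 1.0)


# Vacuum comparison: Schwarzschild with constant mass M, at r_bar < 2M so that N > 1.
M = 1.0


def N_schw(r):
    return 2.0 * M / r


def B_schw(r):
    return 1.0 - 2.0 * M / r


r0, h = 1.2, 1e-6
# Central finite differences of the Schwarzschild functions.
fd_N = (N_schw(r0 + h) - N_schw(r0 - h)) / (2 * h)
fd_logB = (B_schw(r0 + h) - B_schw(r0 - h)) / (2 * h) / B_schw(r0)

print(fd_N, dN_dr(N_schw(r0), r0, 0.0))
print(fd_logB, dlogB_dr(N_schw(r0), r0, 0.0))
```

With zero pressure and density, [56] reduces to $N' = -N/\bar r$, whose solutions $N \propto 1/\bar r$ correspond to a constant mass function, and [57] reproduces the logarithmic derivative of the Schwarzschild $B$.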
The system [55]–[57] defines the simplest class of gravitational metrics that contain matter, evolve inside the black hole, and such that the mass function $M(\bar r) < \infty$ at each fixed time $\bar r$. System [55]–[57] for $A < 0$ differs substantially from the TOV equations for $A > 0$ because, for example, the energy density $T^{00}$ is equated with the timelike component $G^{rr}$ when $A < 0$, but with $G^{tt}$ when $A > 0$. In particular, this implies that, inside the black hole, the mass function $M(\bar r)$ does not have the interpretation as a total mass inside the radius $\bar r$ as it does outside the black hole.

Equations [56], [57] do not have the same character as [54], [55] and the relation $\bar p = \bar\sigma\bar\rho$ with $\bar\sigma = $ const. is inconsistent with [56], [57] together with the conservation constraint and the FRW assumption $p = \sigma\rho$ for shock matching. Thus, instead of looking for an exact solution of [56], [57] ahead of time, as in the case $r_* = 0$, we assume the FRW solution [23]–[25], and derive the ODEs that describe the TOV metrics that match this FRW metric Lipschitz-continuously across a shock surface, and then impose the conservation, entropy, and equation of state constraints at the end. Matching a given $k = 0$ FRW metric to a TOV metric inside the black hole across a shock interface leads to the system of ODEs (see Smoller and Temple (2004) for details),

$$\frac{du}{dN} = -\left\{\frac{(1+u)}{2(1+3u)N}\right\}\left\{\frac{(3u-1)(\sigma-u)N + 6u(1+u)}{(\sigma-u)N + (1+u)}\right\} \qquad [58]$$

$$\frac{dr}{dN} = -\frac{1}{1+3u}\,\frac{r}{N} \qquad [59]$$

with conservation constraint

$$v = \frac{-\sigma(1+u) + (\sigma-u)N}{(1+u) + (\sigma-u)N} \qquad [60]$$

where

$$u = \frac{\bar p}{\rho}, \qquad v = \frac{\bar\rho}{\rho}, \qquad \sigma = \frac{p}{\rho} \qquad [61]$$

Here $\rho$ and $p$ denote the (known) FRW density and pressure, and all variables are evaluated at the shock. Solutions of [58]–[60] determine the (unknown) TOV metrics that match the given FRW metric Lipschitz-continuously across a shock interface, such that conservation of energy and momentum hold across the shock, and such that there are no $\delta$-function sources at the shock (Israel 1966, Smoller and Temple 1997). Note that the dependence of [58]–[60] on the FRW metric is only through the variable $\sigma$, and so the advantage of taking $\sigma = $ const. is that the whole solution is determined by the inhomogeneous scalar equation [58] when $\sigma = $ const. We take as the entropy constraint the condition that

$$0 < \bar p < p, \qquad 0 < \bar\rho < \rho \qquad [62]$$

and to ensure a physically reasonable solution, we impose the equation of state constraint on the TOV side of the shock (this is equivalent to the dominant energy condition (Blau and Guth 1987))

$$0 < \bar p < \bar\rho \qquad [63]$$

Condition [62] implies that outgoing shock waves are compressive. Inequalities [62] and [63] are both implied by the single condition (Smoller and Temple 2004),

$$\frac{1}{N} < \left(\frac{1-u}{1+u}\right)\left(\frac{\sigma-u}{\sigma+u}\right) \qquad [64]$$

Since $\sigma$ is constant, eqn [58] uncouples from [59], and thus solutions of system [58]–[60] are determined by the scalar nonautonomous equation [58]. Making the change of variable $S = 1/N$, which transforms the "big bang" $N \to \infty$ over to a rest point at $S \to 0$, we obtain

$$\frac{du}{dS} = \left\{\frac{(1+u)}{2(1+3u)S}\right\}\left\{\frac{(3u-1)(\sigma-u) + 6u(1+u)S}{(\sigma-u) + (1+u)S}\right\} \qquad [65]$$

Note that the conditions $N > 1$ and $0 < \bar p < p$ restrict the domain of [65] to the region $0 < u < \sigma < 1$, $0 < S < 1$. The next theorem gives the existence of solutions for $0 < \sigma \le 1$, $r_* > 0$, inside the black hole (Smoller and Temple 2003).

Theorem 6 For every $\sigma$, $0 < \sigma < 1$, there exists a unique solution $u_\sigma(S)$ of [65], such that [64] holds on the solution for all $S$, $0 < S < 1$, and on this solution, $0 < u_\sigma(S) < \bar u$, $\lim_{S \to 0} u_\sigma(S) = \bar u$, where

$$\bar u = \mathrm{Min}\{1/3, \sigma\} \qquad [66]$$

and

$$\lim_{S \to 1} \bar p = 0 = \lim_{S \to 1} \bar\rho \qquad [67]$$

For each of these solutions $u_\sigma(S)$, the shock position is determined by the solution of [59], which in turn is determined uniquely by an initial condition which can be taken to be the FRW radial position of the shock wave at the instant of the big bang,

$$r_* = \lim_{S \to 0} r(S) > 0 \qquad [68]$$

Concerning the shock speed, we have
Theorem 7 Let $0 < \sigma < 1$. Then the shock wave is everywhere subluminous, that is, the shock speed $s_\sigma(S) \equiv s(u_\sigma(S)) < 1$ for all $0 < S \le 1$, if and only if $\sigma \le 1/3$.

Concerning the shock speed near the big bang $S = 0$, the following is true:

Theorem 8 The shock speed at the big bang $S = 0$ is given by

$$\lim_{S \to 0} s_\sigma(S) = 0, \qquad \sigma < 1/3 \qquad [69]$$

$$\lim_{S \to 0} s_\sigma(S) = \infty, \qquad \sigma > 1/3 \qquad [70]$$

$$\lim_{S \to 0} s_\sigma(S) = 1, \qquad \sigma = 1/3 \qquad [71]$$

Theorem 8 shows that the equation of state $p = \rho/3$ plays a special role in the analysis when $r_* > 0$, and only for this equation of state does the shock wave emerge at the big bang at a finite nonzero speed, the speed of light. Moreover, [66] implies that in this case, the correct relation $\bar p/\bar\rho = \bar\sigma$ is also achieved in the limit $S \to 0$. The result [67] implies that (neglecting the pressure $p$ at this time onward), the solution continues to a $k = 0$ Oppenheimer–Snyder solution outside the black hole for $S > 1$.

It follows that the shock wave will first become visible at the FRW center $\bar r = 0$ at the moment $t = t_0$ ($R(t_0) = 1$), when the Hubble length $H_0^{-1} = H^{-1}(t_0)$ satisfies

$$\frac{1}{H_0} = \frac{1+3\sigma}{2}\,r_* \qquad [72]$$

where $r_*$ is the FRW position of the shock at the instant of the big bang. At this time, the number of Hubble lengths $\sqrt{N_0}$ from the FRW center to the shock wave at time $t = t_0$ can be estimated by

$$1 \le \frac{2}{1+3\sigma} \le \sqrt{N_0} \le \frac{2}{1+3\sigma}\,e^{\sqrt{3\sigma}\,((1+3\sigma)/(1+\sigma))}$$

Thus, in particular, the shock wave will still lie beyond the Hubble length $1/H_0$ at the FRW time $t_0$ when it first becomes visible. Furthermore, the time $t_{\rm crit} > t_0$ at which the shock wave will emerge from the white hole, given that $t_0$ is the first instant at which the shock becomes visible at the FRW center, can be estimated by

$$\frac{2}{1+3\sigma}\,e^{\sigma/4} \le \frac{t_{\rm crit}}{t_0} \le \frac{2}{1+3\sigma}\,e^{2\sqrt{3\sigma}/(1+\sigma)} \qquad [73]$$

for $0 < \sigma \le 1/3$, and by the better estimate

$$e^{\sqrt{6}/4} \le \frac{t_{\rm crit}}{t_0} \le e^{3/2} \qquad [74]$$

in the case $\sigma = 1/3$. Inequalities [73], [74] imply, for example, that at the Oppenheimer–Snyder limit $\sigma = 0$,

$$\sqrt{N_0} = 2, \qquad \frac{t_{\rm crit}}{t_0} = 2$$

and in the limit $\sigma = 1/3$,

$$1.8 \le \frac{t_{\rm crit}}{t_0} \le 4.5, \qquad 1 < \sqrt{N_0} \le 4.5$$

We can conclude that at the moment $t_0$ when the shock wave first becomes visible at the FRW center, the shock wave must lie within 4.5 Hubble lengths of the FRW center. Throughout the expansion up until this time, the expanding universe must lie entirely within a white hole. The universe will eventually emerge from this white hole, but not until some later time $t_{\rm crit}$, where $t_{\rm crit}$ does not exceed $4.5t_0$.

Conclusion

We believe that the existence of a wave at the leading edge of the expansion of the galaxies is the most likely possibility. The alternatives are that either the universe of expanding galaxies goes on out to infinity, or else the universe is not simply connected. Although the first possibility has been believed for most of the history of cosmology based on the Friedmann universe, we find this implausible and arbitrary in light of the shock wave refinements of the FRW metric discussed here. The second possibility, that the universe is not simply connected, has received considerable attention recently (Klarreich 2003). However, since we have not seen, and cannot create, any non-simply-connected 3-spaces on any other length scale, and since there is no observational evidence to support this, we view this as less likely than the existence of a wave at the leading edge of the expansion of the galaxies, left over from the big bang. Recent analysis of the microwave background radiation data shows a cutoff in the angular frequencies consistent with a length scale of around one Hubble length (Andy Albrecht, private communication). This certainly makes one wonder whether this cutoff is evidence of a wave at this length scale, especially given the consistency of this possibility with the case $r_* > 0$ of the family of exact solutions discussed here.

Acknowledgments

The work of JS was supported in part by NSF Applied Mathematics Grant Number DMS-010-3998, and that of BT by NSF Applied Mathematics Grant Number DMS-010-2493.
570 Short-Range Spin Glasses: The Metastate Approach
See also: Black Hole Mechanics; Cosmology: Mathematical Aspects; Newtonian Limit of General Relativity; Symmetric Hyperbolic Systems and Shock Waves.

Further Reading

Blau SK and Guth AH (1987) Inflationary cosmology. In: Hawking SW and Israel W (eds.) Three Hundred Years of Gravitation, pp. 524–603. Cambridge: Cambridge University Press.
Groah J, Smoller J, and Temple B (2003) Solving the Einstein equations by Lipschitz continuous metrics: shockwaves in general relativity. In: Friedlander S and Serre D (eds.) Handbook of Mathematical Fluid Dynamics, vol. 2, pp. 501–597. Amsterdam: North Holland.
Groah J and Temple B (2004) Shock-Wave Solutions of the Einstein Equations: Existence and Consistency by a Locally Inertial Glimm Scheme. Memoirs of the AMS, 84 pages, vol. 172, no. 813.
Israel W (1966) Singular hypersurfaces and thin shells in general relativity. Il Nuovo Cimento XLIV B(1): 1–14.
Klarreich E (2003) The shape of space. Science News 164–165.
Oppenheimer JR and Snyder H (1939) On continued gravitational contraction. Physical Review 56: 455–459.
Scott M (2002) General Relativistic Shock Waves Propagating at the Speed of Light. Ph.D. thesis, UC-Davis.
Smoller J (1983) Shock Waves and Reaction Diffusion Equations. New York, Berlin: Springer.
Smoller J and Temple B (1994) Shock-wave solutions of the Einstein equations: the Oppenheimer–Snyder model of gravitational collapse extended to the case of non-zero pressure. Archive for Rational Mechanics and Analysis 128: 249–297.
Smoller J and Temple B (1995) Astrophysical shock-wave solutions of the Einstein equations. Physical Review D 51(6): 2733–2743.
Smoller J and Temple B (1997) Solutions of the Oppenheimer–Volkoff equations inside 9/8'ths of the Schwarzschild radius. Communications in Mathematical Physics 184: 597–617.
Smoller J and Temple B (1998) On the Oppenheimer–Volkov equations in general relativity. Archive for Rational Mechanics and Analysis 142: 177–191.
Smoller J and Temple B (2000) Cosmology with a shock wave. Communications in Mathematical Physics 210: 275–308.
Smoller J and Temple B (2003) Shock wave cosmology inside a black hole. Proceedings of the National Academy of Sciences of the United States of America 100(20): 11216–11218.
Smoller J and Temple B (2004) Cosmology, black holes, and shock waves beyond the Hubble length. Methods and Applications of Analysis 11(1): 77–132.
Tolman R (1939) Static solutions of Einstein's field equations for spheres of fluid. Physical Review 55: 364–374.
Weinberg S (1972) Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. New York: Wiley.
1975), although most of our discussion is relevant to a much larger class of realistic models. The EA model is described by the Hamiltonian

$$\mathcal{H}_{\mathcal J} = -\sum_{\langle x, y\rangle} J_{xy}\,\sigma_x\sigma_y \qquad [1]$$

where $\mathcal J$ denotes a particular realization of all of the couplings $J_{xy}$ and the brackets indicate that the sum is over nearest-neighbor pairs only, with $x, y \in \mathbb{Z}^d$. We will take Ising spins $\sigma_x = \pm 1$; although this will affect the details of our discussion, it is unimportant for our main conclusions. The couplings $J_{xy}$ are quenched, independent, identically distributed random variables whose common distribution is symmetric about zero.

States and Metastates

We are interested in both finite-volume and infinite-volume Gibbs states. For the cube of length scale $L$, $\Lambda_L = \{-L, -L+1, \ldots, L\}^d$, we define $\mathcal{H}_{\mathcal J, L}$ to be the restriction of the EA Hamiltonian to $\Lambda_L$ with a specified boundary condition such as free, fixed, or periodic. Then the finite-volume Gibbs distribution $\rho^{(L)}_{\mathcal J} = \rho^{(L)}_{\mathcal J, \beta}$ on $\Lambda_L$ (at inverse temperature $\beta = 1/T$) is

$$\rho^{(L)}_{\mathcal J, \beta}(\sigma) = Z_L^{-1}\exp\{-\beta\mathcal{H}_{\mathcal J, L}(\sigma)\} \qquad [2]$$

where the partition function $Z_L(\beta)$ is such that the sum of $\rho^{(L)}_{\mathcal J, \beta}$ over all $\sigma$ yields 1. (In this and all succeeding definitions, the dependence on spatial dimension $d$ will be suppressed.)

Thermodynamic states are described by infinite-volume Gibbs measures. At fixed inverse temperature $\beta$ and coupling realization $\mathcal J$, a thermodynamic state $\rho_{\mathcal J, \beta}$ is the limit, as $L \to \infty$, of some sequence of such finite-volume measures (each with a specified boundary condition, which may remain the same or may change with $L$). A thermodynamic state $\rho_{\mathcal J, \beta}$ can also be characterized intrinsically through the Dobrushin–Lanford–Ruelle (DLR) equations (see, e.g., Georgii 1988): for any $\Lambda_L$, the conditional distribution of $\rho_{\mathcal J, \beta}$ (conditioned on the sigma-field generated by $\{\sigma_x : x \in \mathbb{Z}^d \setminus \Lambda_L\}$) is $\rho^{(L), \bar\sigma}_{\mathcal J, \beta}$, where $\bar\sigma$ is given by the conditioned values of $\sigma_x$ for $x$ on the boundary of $\Lambda_L$.

Consider now the set $\mathcal{G} = \mathcal{G}(\mathcal J, \beta)$ of all thermodynamic states at a fixed $(\mathcal J, \beta)$. The set of extremal, or pure, Gibbs states is defined by

$$\mathrm{ex}\,\mathcal{G} = \mathcal{G} \setminus \{a\rho_1 + (1 - a)\rho_2 : a \in (0, 1);\ \rho_1, \rho_2 \in \mathcal{G};\ \rho_1 \ne \rho_2\} \qquad [3]$$

and the number of pure states $\mathcal{N}(\mathcal J, \beta)$ at $(\mathcal J, \beta)$ is the cardinality $|\mathrm{ex}\,\mathcal{G}|$ of $\mathrm{ex}\,\mathcal{G}$. It is not hard to show that, in any $d$ and for a.e. $\mathcal J$, the following two statements are true: (1) $\mathcal{N} = 1$ at sufficiently low $\beta > 0$; (2) at any fixed $\beta$, $\mathcal{N}$ is constant a.s. with respect to the $\mathcal J$'s. (The last assertion follows from the measurability and translation invariance of $\mathcal{N}$, and the translation ergodicity of the disorder distribution of $\mathcal J$.)

A pure state $\rho_\alpha$ (where $\alpha$ is a pure-state index) can also be intrinsically characterized by a "clustering property"; for two-point correlation functions, this reads

$$\langle\sigma_x\sigma_y\rangle_{\rho_\alpha} - \langle\sigma_x\rangle_{\rho_\alpha}\langle\sigma_y\rangle_{\rho_\alpha} \to 0 \qquad [4]$$

as $|x - y| \to \infty$. A simple observation (Newman and Stein 1992), with important consequences for spin glasses, is that if many pure states exist, a sequence of $\rho^{(L)}_{\mathcal J, \beta}$'s, with boundary conditions and $L$'s chosen independently of $\mathcal J$, will generally not have a (single) limit. We call this phenomenon "chaotic size dependence" (CSD).

We will be interested in the properties of $\mathrm{ex}\,\mathcal{G}$ at low temperatures. If the spin-flip symmetry present in the EA Hamiltonian equation [1] is spontaneously broken above some dimension $d_0$ and below some temperature $T_c(d)$, there will be at least a pair of pure states such that their even-spin correlations are identical and their odd-spin correlations have the opposite sign. Assuming that such broken spin-flip symmetry indeed exists for $d > d_0$ and $T < T_c(d)$, the question of whether there exists more than one such pair (of spin-flip related extremal infinite-volume Gibbs distributions) is a central unresolved issue for the EA and related models. If many such pairs should exist, we can ask about the structure of their relations with one another, and how this structure would manifest itself in large but finite volumes. To do this, we use an approach, introduced by Newman and Stein (1996b), to study inhomogeneous and other systems with many competing pure states. This approach, based on an analogy with chaotic dynamical systems, requires the construction of a new thermodynamic quantity which is called the "metastate": a probability measure $\kappa_{\mathcal J}$ on the thermodynamic states. The metastate allows an understanding of CSD by analyzing the way in which $\rho^{(L)}_{\mathcal J, \beta}$ "samples" from its various possible limits as $L \to \infty$.

The analogy with chaotic dynamical systems can be understood as follows. In dynamical systems, the chaotic motion along a deterministic orbit is analyzed in terms of some appropriately selected probability measure, invariant under the dynamics. Time along the orbit is replaced, in our context, by $L$ and the phase space of the dynamical system is replaced by the space of Gibbs states.

Newman and Stein (1996b) considered a "microcanonical ensemble" (as always, at fixed $\beta$, which will hereafter be suppressed for ease of notation) $\kappa_N$ in which each of the finite-volume Gibbs states
which $\kappa_J = \kappa_J^\dagger$. We choose periodic boundary conditions for specificity; the results and claims discussed are expected to be independent of the boundary conditions used, as long as they are chosen independently of the couplings.

Low-Temperature Structure of the EA Model

There have been several scenarios proposed for the spin-glass phase of the Edwards–Anderson model at sufficiently low temperature and high dimension. These remain speculative, because it has not even been proved that a phase transition from the high-temperature phase exists at positive temperature in any finite dimension.
As noted earlier, at sufficiently high temperature in any dimension (and at all nonzero temperatures in one and presumably two dimensions, although the latter assertion has not been proved), there is a unique Gibbs state. It is conceivable that this remains the case in all dimensions and at all nonzero temperatures, in which case the metastate $\kappa_J$ is, for a.e. $J$, supported on a single, pure Gibbs state $\rho_J$. (It is important to note, however, that in principle

That is, the metastate is supported on a single (mixed) thermodynamic state.
The two-state scenario that has received the most attention in the literature is the “droplet/scaling” picture (McMillan 1984, Fisher and Huse 1986, 1988, Bray and Moore 1985). In this picture a low-energy excitation above the ground state in $\Lambda_L$ is a droplet whose surface area scales as $l^{d_s}$, with $l \sim O(L)$ and $d_s < d$, and whose surface energy scales as $l^{\theta}$, with $\theta > 0$ (in dimensions where $T_c > 0$). More recently, an alternative picture has arisen (Krzakala and Martin 2000, Palassini and Young 2000) in which the low-energy excitations differ from those of droplet/scaling, in that their energies scale as $l^{\theta'}$, with $\theta' = 0$.
The low-temperature picture that has perhaps generated the most attention in the literature is the replica symmetry breaking (RSB) scenario (Binder and Young 1986, Marinari et al. 1994, 1997, Franz et al. 1998, Marinari et al. 2000, Marinari and Parisi 2000, 2001, Dotsenko 2001), which assumes a rather complicated pure-state structure, inspired by Parisi’s solution of the SK model (Parisi 1979, 1983, Mézard et al. 1984, 1987). This is a many-state picture ($N = \infty$ for a.e.
$J$) in which the ordering is described in terms of the “overlaps” between states. There has been some ambiguity in how to describe such a picture for short-range models; the prevailing, or standard, view. Consider any reasonably constructed thermodynamic state $\rho_J$ (see Newman and Stein (1998a) for more details) – e.g., the “average” over the metastate $\kappa_J$

each $\Lambda_L$, the finite-volume Gibbs state $\rho^{(L)}_J$ is well approximated deep in the interior by a mixed thermodynamic state $\Gamma^{(L)}$, decomposable into many pure states $\rho_{\alpha_L}$ (explicit dependence on $J$ is suppressed for ease of notation). More precisely, each $\Gamma$ in $\kappa_J$ satisfies
$$\Gamma = \sum_{\alpha} W_\Gamma^{\alpha}\, \rho_\alpha \qquad [12]$$
invariance property holds among any two sequences of fixed boundary conditions (and the fixed boundary condition of choice may even be allowed to vary arbitrarily along any single sequence of volumes)! It follows that, with respect to changes of boundary conditions, the metastate is extraordinarily robust.
This should rule out all but the simplest overlap structures, and in particular the nonstandard RSB and related pictures (for a full discussion, see Newman and Stein 1998b). It is therefore natural to ask whether the property of metastate invariance allows any many-state picture.
There is one such picture, namely the “chaotic pairs” picture, which is fully consistent with metastate invariance (our belief is that it is the only many-state picture that fits naturally and easily into results obtained about the metastate).
Here the periodic boundary condition metastate is supported on infinitely many pairs of pure states, but instead of eqn [12] one has
$$\Gamma = (1/2)\rho_\alpha + (1/2)\rho_{-\alpha} \qquad [15]$$
with overlap
$$P_\Gamma = (1/2)\delta(q - q_{EA}) + (1/2)\delta(q + q_{EA}) \qquad [16]$$
So there is CSD in the states but not in the overlaps, which have the same form as a two-state picture in every volume. The difference is that, while in the latter case, one has the “same” pair of states in every volume, in chaotic pairs the pure-state pair varies chaotically as volume changes. If the chaotic pairs picture is to be consistent with metastate invariance in a natural way, then the number of pure-state pairs should be “uncountable.” This allows for a “uniform” distribution (within the metastate) over all of the pure states, and invariance of the metastate with respect to boundary conditions could follow naturally.

Open Questions

We have discussed how the metastate approach to the EA spin glass has narrowed considerably the set of possible scenarios for low-temperature ordering in any finite dimension, should broken spin-flip symmetry occur. The remaining possibilities are either a two-state scenario, such as droplet/scaling, or the chaotic-pairs picture if there exist many pure states at some $(\beta, d)$. Both have simple overlap structures. The metastate approach appears to rule out more complicated scenarios such as RSB, in which the approximate pure-state decomposition in a typical large, finite volume is a nontrivial mixture of many pure-state pairs.
Of course, this does not answer the question of which, if either, of the remaining pictures actually does occur in real spin glasses. In this section we list a number of open questions relevant to the above discussion.

Open Question 1 Determine whether a phase transition occurs in any finite dimension greater than one. If it does, find the lower critical dimension.

Existence of a phase transition does not necessarily imply two or more pure states below $T_c$. It could happen, for example, that in some dimension there exists a single pure state at all nonzero temperatures, with two-point spin correlations decaying exponentially above $T_c$ and more slowly (e.g., as a power law) below $T_c$. This leads to:

Open Question 2 If there does exist a phase transition above some lower critical dimension, determine whether the low-temperature spin-glass phase exhibits broken spin-flip symmetry.

If broken symmetry does occur in some dimension, then of course an obvious open question is to determine the number of pure-state pairs, and hence the nature of ordering at low temperature. A (possibly) easier question (but still very difficult), and one which does not rely on knowing whether a phase transition occurs, is to determine the zero-temperature – i.e., ground state – properties of spin glasses as a function of dimensionality. A ground state is an infinite-volume spin configuration whose energy (governed by eqn [1]) cannot be lowered by flipping any finite subset of spins. That is, all ground state spin configurations must satisfy the constraint
$$\sum_{\langle x,y\rangle \in C} J_{xy}\,\sigma_x \sigma_y \geq 0 \qquad [17]$$
along any closed loop $C$ in the dual lattice.

Open Question 3 How many ground state pairs is the $T = 0$ periodic boundary condition metastate supported on, as a function of $d$?

The answer is known to be one for 1D, and a partial result (Newman and Stein 2000, 2001a) points towards the answer being one for 2D as well. There are no rigorous, or even heuristic (except based on underlying “ansätze”) arguments in higher dimension.
An interesting – but unrealistic – spin-glass model in which the ground state structure can be exactly solved (although not yet completely rigorously) was proposed by the authors (Newman and Stein 1994, 1996a) (see also Banavar 1994). This “highly disordered” spin glass is one in which the coupling magnitudes scale nonlinearly with the volume (and so are no longer distributed independently of the volume, although they remain independent and identically distributed for each volume). The model
Marinari E and Parisi G (2001) Effects of a bulk perturbation on the ground state of 3D Ising spin glasses. Physical Review Letters 86: 3887–3890.
Marinari E, Parisi G, Ricci-Tersenghi F, Ruiz-Lorenzo JJ, and Zuliani F (2000) Replica symmetry breaking in spin glasses: Theoretical foundations and numerical evidences. Journal of Statistical Physics 98: 973–1047.
Marinari E, Parisi G, and Ritort F (1994) On the 3D Ising spin glass. Journal of Physics A 27: 2687–2708.
Marinari E, Parisi G, and Ruiz-Lorenzo J (1997) Numerical simulations of spin glass systems. In: Young AP (ed.) Spin Glasses and Random Fields, pp. 59–98. Singapore: World Scientific.
McMillan WL (1984) Scaling theory of Ising spin glasses. Journal of Physics C 17: 3179–3187.
Mézard M, Parisi G, Sourlas N, Toulouse G, and Virasoro M (1984) Nature of spin-glass phase. Physical Review Letters 52: 1156–1159.
Mézard M, Parisi G, and Virasoro MA (eds.) (1987) Spin Glass Theory and Beyond. Singapore: World Scientific.
Newman CM and Stein DL (1992) Multiple states and thermodynamic limits in short-ranged Ising spin glass models. Physical Review B 46: 973–982.
Newman CM and Stein DL (1994) Spin-glass model with dimension-dependent ground state multiplicity. Physical Review Letters 72: 2286–2289.
Newman CM and Stein DL (1996a) Ground state structure in a highly disordered spin glass model. Journal of Statistical Physics 82: 1113–1132.
Newman CM and Stein DL (1996b) Non-mean-field behavior of realistic spin glasses. Physical Review Letters 76: 515–518.
Newman CM and Stein DL (1996c) Spatial inhomogeneity and thermodynamic chaos. Physical Review Letters 76: 4821–4824.
Newman CM and Stein DL (1998a) Simplicity of state and overlap structure in finite volume realistic spin glasses. Physical Review E 57: 1356–1366.
Newman CM and Stein DL (1998b) Thermodynamic chaos and the structure of short-range spin glasses. In: Bovier A and Picco P (eds.) Mathematics of Spin Glasses and Neural Networks, pp. 243–287. Boston: Birkhauser.
Newman CM and Stein DL (2000) Nature of ground state incongruence in two-dimensional spin glasses. Physical Review Letters 84: 3966–3969.
Newman CM and Stein DL (2001a) Are there incongruent ground states in 2D Edwards–Anderson spin glasses? Communications in Mathematical Physics 224: 205–218.
Newman CM and Stein DL (2001b) Realistic spin glasses below eight dimensions: a highly disordered view. Physical Review E 63: 16101-1–16101-9.
Newman CM and Stein DL (2003) Topical review: Ordering and broken symmetry in short-ranged spin glasses. Journal of Physics: Condensed Matter 15: R1319–R1364.
Ogielski AT (1985) Dynamics of three-dimensional spin glasses in thermal equilibrium. Physical Review B 32: 7384–7398.
Ogielski AT and Morgenstern I (1985) Critical behavior of the three-dimensional Ising spin-glass model. Physical Review Letters 54: 928–931.
Palassini M and Young AP (2000) Nature of the spin glass state. Physical Review Letters 85: 3017–3020.
Parisi G (1979) Infinite number of order parameters for spin-glasses. Physical Review Letters 43: 1754–1756.
Parisi G (1983) Order parameter for spin-glasses. Physical Review Letters 50: 1946–1948.
Sherrington D and Kirkpatrick S (1975) Solvable model of a spin glass. Physical Review Letters 35: 1792–1796.
Stein DL (1989) Disordered systems: Mostly spin glasses. In: Stein DL (ed.) Lectures in the Sciences of Complexity, pp. 301–355. New York: Addison–Wesley.
Thill MJ and Hilhorst HJ (1996) Theory of the critical state of low-dimensional spin glass. Journal of Physics I 6: 67–95.
van Enter ACD and Fröhlich J (1985) Absence of symmetry breaking for N-vector spin glass models in two dimensions. Communications in Mathematical Physics 98: 425–432.
Wilkinson D and Willemsen JF (1983) Invasion percolation: A new form of percolation theory. Journal of Physics A 16: 3365–3376.
Sine-Gordon Equation
S N M Ruijsenaars, Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Introduction

The sine-Gordon equation
$$\frac{\partial^2 \phi}{\partial x^2} - \frac{\partial^2 \phi}{\partial t^2} = \sin\phi \qquad [1]$$
may be viewed as a prototype for a nonlinear integrable field theory. It is manifestly invariant under spacetime translations and Lorentz boosts,
$$(x,t) \mapsto (x - \xi,\ t - \eta), \qquad (x,t) \mapsto (x\cosh\theta - t\sinh\theta,\ t\cosh\theta - x\sinh\theta) \qquad [2]$$
It shares this relativistic invariance property with the linear Klein–Gordon equation, which is obtained upon replacing $\sin\phi$ by $\phi$. (The name sine-Gordon equation is derived from this relation, and was introduced by Kruskal.) The sine-Gordon equation can also be defined and studied in the form
$$\frac{\partial^2 \tilde\phi}{\partial u\,\partial v} = \sin\tilde\phi, \qquad \tilde\phi(u,v) = \phi(t,x) \qquad [3]$$
where
$$u = (x + t)/2, \qquad v = (x - t)/2 \qquad [4]$$
are the so-called light-cone variables.
There are two interpretations of the field $\phi(t,x)$ that are quite different, both from a physical and from a mathematical viewpoint. The first one consists in viewing it as a real-valued function, so
that [1] is simply a nonlinear PDE in two variables. In the second version, one views $\phi(t,x)$ as an operator-valued distribution on a Hilbert space. (Thus, one should smear $\phi(t,x)$ with a test function $f(t,x)$ in Schwartz space to obtain a genuine operator on the Hilbert space.) In spite of their different character, the classical and quantum field theory versions have several striking features in common, including the presence of an infinite number of conservation laws and the occurrence of solitonic excitations.
The classical sine-Gordon equation has been used as a model for various wave phenomena, including the propagation of dislocations in crystals, phase differences across Josephson junctions, torsion waves in strings and pendula, and waves along lipid membranes. It was already studied in the nineteenth century in connection with the theory of pseudospherical surfaces. The quantum version is used as a simple model for solid-state excitations.
The designation “sine-Gordon” is also used for various equations that generalize [1] or bear resemblance to it. These include the so-called homogeneous and symmetric space sine-Gordon models, discrete and supersymmetric versions, and generalizations to higher-dimensional spacetimes (i.e., in [1] the spatial derivative is replaced by the Laplace operator in several variables). In this contribution we focus on [1], however.
Our main goal is to discuss the integrability and solitonic properties, both at the classical and at the quantum level. First, we sketch the inverse-scattering transform (IST) solution to the Cauchy problem for [1]. Following Faddeev and Takhtajan, we emphasize the interpretation of the IST as an action-angle transformation for an infinite-dimensional Hamiltonian system. Next, the particle-like solutions are surveyed by using a description in terms of variables that may be viewed as relativistic action-angle coordinates. This is followed by a section on the quantum field theory version, paying special attention to the factorized scattering that is the quantum analog of the solitonic classical scattering. Finally, we sketch the intimate relation between the N-particle subspaces of the classical and quantum sine-Gordon field theory and certain integrable relativistic systems of N point particles on the line.

The Classical Version: An Integrable Hamiltonian System

In order to tie in the hyperbolic evolution equation [1] with the notion of infinite-dimensional integrable system, it is necessary to restrict attention to initial data $\phi(0,x) = \phi(x)$ and $\partial_t\phi(0,x) = \pi(x)$ with special properties. First of all, the energy functional
$$H = \int_{-\infty}^{\infty} \left( \frac{1}{2}\pi(x)^2 + \frac{1}{2}\partial_x\phi(x)^2 + (1 - \cos\phi(x)) \right) dx \qquad [5]$$
and symplectic form
$$\omega = \int_{-\infty}^{\infty} d\phi(x) \wedge d\pi(x)\, dx \qquad [6]$$
should be well defined on the phase space of initial data. Indeed, in that case [1] amounts to the Hamilton equation associated to [5] via [6].
Second, there exists a sequence of functionals
$$I_{2l+1}(\phi, \pi), \qquad l \in \mathbb{Z} \qquad [7]$$
that formally Poisson-commute with $H$ and among themselves. In particular, $H$ equals $2(I_1 + I_{-1})$, whereas $2(I_1 - I_{-1})$ equals the momentum functional
$$P = \int_{-\infty}^{\infty} \pi(x)\,\partial_x\phi(x)\, dx \qquad [8]$$
The functional $I_{2l+1}$ contains x-derivatives of order up to $|2l+1|$, so one needs to require that the functions $\partial_x\phi(x)$ and $\pi(x)$ be smooth and that all of their derivatives have sufficient decrease for $x \to \pm\infty$.
A natural choice guaranteeing the latter requirements is
$$\partial_x\phi(x),\ \pi(x) \in S_{\mathbb{R}}(\mathbb{R}) \qquad [9]$$
where $S_{\mathbb{R}}(\mathbb{R})$ denotes the Schwartz space of real-valued functions on the line. To render the integral over $1 - \cos\phi(x)$ (and similar integrals occurring for the sequence [7]) finite, one also needs to require
$$\phi(x) \to 2\pi k_\pm, \qquad x \to \pm\infty, \qquad k_\pm \in \mathbb{Z} \qquad [10]$$
On this phase space $\Omega$ of initial data, the Cauchy problem for the evolution equation [1] is not only well posed, but can be solved in explicit form by using the IST. More generally, the Hamiltonians $I_{2l+1}$ give rise to evolution equations that are simultaneously solved via the IST, yielding an infinite sequence of commuting Hamiltonian flows on $\Omega$.
Before sketching the overall picture resulting from the IST, it should be mentioned at this point that [1] admits explicit solutions of interest that do not belong to $\Omega$. First, there is a class of algebro-geometric solutions that have no limits as $x \to \pm\infty$. These solutions can be obtained via finite-gap integration methods, yielding formulas involving
the Riemann theta functions associated to compact Riemann surfaces. Second, there are the tachyon solutions. They arise from the particle-like solutions that do belong to $\Omega$ by the transformation
$$\phi(t,x) \to \phi(x,t) + \pi \qquad [11]$$
(Observe that the equation of motion [1] is invariant under [11], whereas due to the finite-energy requirement [10] this is not true for solutions evolving in $\Omega$.)
The IST via which the above Cauchy problem can be solved starts from an auxiliary system of two linear ordinary differential equations involving $\phi(0,x)$ and $\partial_t\phi(0,x)$. It is beyond our scope to describe the system in detail. The results derived from it, however, are to a large extent the same as those obtained via a simpler auxiliary linear operator that is associated to the light-cone Cauchy problem. The latter operator is of the Ablowitz–Kaup–Newell–Segur (AKNS) form. That is, the linear operator is an ordinary differential operator of Dirac type given by
$$L = \begin{pmatrix} i\,\dfrac{d}{dx} & -iq \\ ir & -i\,\dfrac{d}{dx} \end{pmatrix} \qquad [12]$$
where the external potentials $r(u)$ and $q(u)$ depend on the evolution equation at hand. For the light-cone sine-Gordon equation [3], one needs to choose
$$r = -q = (\partial_u\tilde\phi)(u,0)/2 \qquad [13]$$
In both settings, the associated spectral features are invariant under the sine-Gordon evolution and all of the evolutions generated by the Hamiltonians $I_{2l+1}$, yielding the so-called isospectral flows. More specifically, if the initial data give rise to bound-state solutions of the linear problem (square-integrable wave functions), then the corresponding eigenvalues are time independent. Furthermore, due to the decay requirements on the potential in the linear system, there exist scattering solutions with plane-wave asymptotics for all initial data in $\Omega$. A suitable normalization leads to the so-called Jost solutions $\Psi(x,\lambda)$. (Here $\lambda$ is the spectral parameter, which varies over the real line for scattering solutions.) Their $x \to \pm\infty$ asymptotics is encoded in transition coefficients $a(\lambda)$ and $b(\lambda)$, with $a(\lambda)$ and $|b(\lambda)|$ being time independent, whereas $\arg b(\lambda)$ has a linear dependence on time when the potential evolves according to the sine-Gordon equation. The bound states correspond to special $\lambda$-values $\lambda_1, \ldots, \lambda_N$ with positive imaginary part (namely the zeros of the coefficient $a(\lambda)$, which is analytic in the upper-half $\lambda$-plane); their normalization coefficients $\nu_1, \ldots, \nu_N$ have an essentially linear time evolution, just as $b(\lambda)$.
The crux of the IST is now that the potentials can be reconstructed from the spectral data
$$\{b(\lambda),\ \lambda_1, \ldots, \lambda_N,\ \nu_1, \ldots, \nu_N\} \qquad [14]$$
by solving a linear integral equation of Gelfand–Levitan–Marchenko (GLM) type. (Alternatively, Riemann–Hilbert problem techniques can be used.) Hence, the nonlinear Cauchy problem can be replaced by the far simpler linear problems of determining the spectral data [14] of a linear operator (the direct problem) and then solving the linear GLM equation for the time-evolved scattering data (the inverse problem).
From the Hamiltonian perspective, the IST may be reinterpreted as a transformation to action-angle variables. The action variables are defined in terms of $|b(\lambda)|$ and $\lambda_1, \ldots, \lambda_N$. They are time independent under the sine-Gordon and higher Hamiltonian flows. The angle variables are $\arg b(\lambda)$ and suitable functions of the normalization coefficients. They depend linearly on the evolution times of the flows. The Hamiltonians can be explicitly expressed in action variables.
Next, we point out that there is a large subspace of Cauchy data $(\phi(x), \pi(x))$ that do not give rise to bound states in the auxiliary linear problem. The associated solutions are the so-called radiation solutions: they decrease to 0 for large times. These solutions can be obtained from the inverse transform involving the GLM equation by only taking $b(\lambda)$ into account.
The other extreme is to choose $b(\lambda) = 0$ and arbitrary bound states and normalization coefficients in the GLM equation. This special case of vanishing reflection leads to the particle-like solutions that are studied in the next section. For general Cauchy data, one has both $b(\lambda) \neq 0$ and a finite number of bound states. These so-called mixed solutions have a radiation component (encoded in $b(\lambda)$) which decays for asymptotic times, whereas the bound states show up for $t \to \pm\infty$ as isolated solitons, antisolitons, and breathers.

Classical Solitons, Antisolitons, and Breathers

Just as for other classical soliton equations, the case of reflectionless data can be handled in complete detail, since the GLM equation reduces to an $N \times N$ system of linear equations. The case $N = 1$ yields the 1-soliton and 1-antisoliton solutions. Resting at the origin, these one-particle solutions are given by
$$4\arctan(e^{\mp x}) \qquad [15]$$
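The one-particle profile [15] can be checked numerically. The following is a minimal sketch, assuming the branch $4\arctan(e^{x})$ and a time-independent field, so that [1] reduces to $\phi'' = \sin\phi$ and the energy [5] is evaluated with $\pi(x) = 0$. The function names, the step size `h`, and the integration window are illustrative choices, not taken from the article.

```python
import math


def kink(x):
    """Static profile 4*arctan(exp(x)); the sign branch of [15] is an assumption."""
    return 4.0 * math.atan(math.exp(x))


def static_residual(x, h=1e-3):
    """Central-difference estimate of phi''(x) - sin(phi(x)).

    For a time-independent solution of [1] this residual should vanish
    up to discretization and rounding error.
    """
    second = (kink(x + h) - 2.0 * kink(x) + kink(x - h)) / (h * h)
    return second - math.sin(kink(x))


def kink_energy(x_max=20.0, n=40000):
    """Trapezoidal estimate of the energy functional [5] with pi(x) = 0."""
    dx = 2.0 * x_max / n
    total = 0.0
    for i in range(n + 1):
        x = -x_max + i * dx
        dphi = 2.0 / math.cosh(x)  # exact derivative of 4*arctan(exp(x))
        density = 0.5 * dphi * dphi + (1.0 - math.cos(kink(x)))
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * density * dx
    return total
```

In this normalization the energy density works out to $4\,\mathrm{sech}^2 x$, so the estimate should come out close to 8, and the boundary values 0 and $2\pi$ are consistent with [10].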
where j is given by [22] with Here, (0, x) is a neutral Klein–Gordon field with
mass m and the double dots denote a suitable
q1 ; . . . ; qN 2 R; N < < 1 ½27
ordering prescription. The associated equation of
Specifically, one has motion
The equivalence argument (due to Coleman) consists the DHN formula. Notice that for near zero m1 and
in showing that the quantities m are nearly equal, and that for 2 4 there are no
longer any sine-Gordon mesons present in the theory.
m2 A priori, the existence of infinitely many classical
@ ; : cos : ½40
2 2 conserved Hamiltonians does not even formally
in the sine-Gordon theory have the same vacuum imply the same feature for the quantum field theory,
expectation values (in perturbation theory) as the as anomalies may occur. For the sine-Gordon and
massive Thirring quantities massive Thirring cases, anomalies have been shown
to be absent, however. This entails not only that the
: J :; M :
0 : ½41 number of solitons, antisolitons, and breathers in a
scattering process is conserved, but also that the set
resp., provided the parameters are related by of incoming rapidities equals the set of outgoing
4 g rapidities.
¼1þ ½42 The latter stability features and the DHN formula
2
[44] are corroborated by the S-matrix, which is
This yields an equivalence between the charge-0 known in complete detail. The two-body amplitudes
sector of the massive Thirring model and the sector involving solitons and antisolitons can be written in
of the sine-Gordon theory obtained by the action of terms of the function
the fields [40] on the vacuum vector. But the
charged sectors of the Thirring model can also be uðzÞ
Z 1
viewed as new sectors in the sine-Gordon theory, dx sinhð =2Þx
¼ exp i sin 2xz ½45
obtained by a solitonic field construction (first 0 x sinh x cosh x=2
performed by Mandelstam).
In this picture, the fermions and antifermions in They are given by
the massive Thirring model correspond to new ðuþþ ; tþ ; rþ ; u ÞðÞ
excitations in the sine-Gordon theory, the quantum
solitons and antisolitons. The latter are viewed as sinhð=2Þ
¼ uð=2Þ 1; ;
coherent states of the sine-Gordon ‘‘mesons’’ in the sinhðði Þ=2Þ
vacuum sector, the rest masses being related by
i sinð2 =2Þ
;1 ½46
8m 2 sinhðði Þ=2Þ
M¼ 2 1 ½43
8 where denotes the rapidity difference. (Due to
in the semiclassical limit ! 0.2 fermion statistics, one gets only one amplitude for a
Even at the formal level involved in the corre- soliton or antisoliton pair. But a soliton and an
spondence, the theories are not believed to exist for antisoliton have opposite charge, so they can be
2 > 8 and g < =2, since there is positivity distinguished. In that case, therefore, the notion of
breakdown for this range of couplings. The free reflection and transmission coefficients makes sense.)
Dirac case g = 0 corresponds to 2 = 4. In parti- The S-matrix involving an arbitrary number of
cular, there is no interaction between the sine- solitons, antisolitons, and their bound states is also
Gordon solitons and antisolitons for this -value. explicitly known. The amplitudes involving no
In the range 2 2 (4, 8) there is interaction, but breathers are readily described in terms of the above
bound soliton–antisoliton pairs (quantum breathers, two-body amplitudes. Indeed, the S-matrix factorizes
alias sine-Gordon mesons) do not occur. as a sum of products of the amplitudes [46], yielding a
By contrast, for 2 < 4 there exist breathers with picture of particles scattering independently in pairs,
rest masses just as at the classical level. The factorization can be
performed irrespective of the temporal ordering
mn ¼ 2M sinðn þ 1Þ;
m=2M; assumed for the pair scattering processes, since the
n þ 1 ¼ 1; 2; . . . ; L < =2 ½44 four functions occurring inside the parentheses of
[46] satisfy the Yang–Baxter equations.
Thus, the ‘‘particle spectrum’’ consists of solitons Roughly speaking, the S-matrix for processes invol-
and antisolitons with mass M and mesons C1 , . . . ,CL ving breathers can be calculated by analytic continua-
with masses m1 , . . . ,mL given by [44]. The latter tion from the soliton–antisoliton S-matrix. The details
formula was first established by semiclassical quan- are however quite substantial. We only add that
tization of the classical breathers (Dashen– scattering amplitudes involving solely breathers can
Hasslacher–Neveu), and ever since is usually called be expressed using only hyperbolic functions.
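The breather spectrum described by the DHN formula can be tabulated with a few lines of code. This is a sketch under stated assumptions: it reads [43] as $M = (8m/\beta^2)(1 - \beta^2/8\pi)$ and [44] as $m_n = 2M\sin(n\alpha)$ with $\alpha = m/2M$ and $n = 1, \ldots, L < \pi/2\alpha$, which is the standard Dashen–Hasslacher–Neveu form; the printed formulas are damaged, so this reading is not guaranteed to match the article's indexing. The function names are hypothetical.

```python
import math


def soliton_mass(m, beta_sq):
    """Soliton rest mass M, assuming [43] reads M = (8m/beta^2)(1 - beta^2/(8*pi))."""
    if not 0.0 < beta_sq < 8.0 * math.pi:
        raise ValueError("beta_sq must lie in (0, 8*pi)")
    return (8.0 * m / beta_sq) * (1.0 - beta_sq / (8.0 * math.pi))


def breather_masses(m, beta_sq):
    """Breather masses m_n = 2*M*sin(n*alpha), alpha = m/(2*M), for n*alpha < pi/2.

    The list is empty when no n satisfies the bound, which in this reading
    happens for beta_sq >= 4*pi.
    """
    big_m = soliton_mass(m, beta_sq)
    alpha = m / (2.0 * big_m)
    masses = []
    n = 1
    while n * alpha < math.pi / 2.0:
        masses.append(2.0 * big_m * math.sin(n * alpha))
        n += 1
    return masses
```

For small $\beta^2$ the lightest mass is close to $m$, in line with the remark that $m_1$ and $m$ are nearly equal for $\beta$ near zero, and every breather mass stays below the soliton–antisoliton threshold $2M$.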
Since the 1980s, a lot of information has also been gathered concerning matrix elements of suitable sine-Gordon field quantities between special quantum states (form factors). Unfortunately, the correlation functions involve infinite sums of form factors that are quite difficult to control analytically. Hence, it is not known whether the correlation functions associated with the form factors give rise to a Wightman field theory with the usual axiomatic properties.

The Relation to Relativistic Calogero–Moser Systems

The behavior of the special classical solutions discussed earlier is very similar to that of classical point particles. Furthermore, the picture of classical solitons, antisolitons, and their bound states scattering independently in pairs is essentially preserved on the quantum level, just as one would expect for the quantization of an integrable particle system.
Next, we note that from the quantum viewpoint there is no physical distinction between wave functions and point particles, whereas a classical wave is a physical entity that is clearly very different from a point particle. Even so, it is a natural question whether there exist classical Hamiltonian systems of N point particles on the line whose physical characteristics (charges, bound states, scattering, etc.) are the same as those of the particle-like sine-Gordon solutions. If so, a second question is equally obvious: does the quantum version of the N-particle systems still have the same features as that of the quantum sine-Gordon excitations?
As we now sketch, the first question has been answered in the affirmative, whereas the second one has not been completely answered yet. However, all of the information on the pertinent quantum N-particle systems collected thus far points to an affirmative answer. The systems at issue are relativistic versions of the well-known nonrelativistic Calogero–Moser N-particle systems.
To begin with the classical two-particle system, its Hamiltonian is given by
$$H = (\cosh p_1 + \cosh p_2)\coth((x_1 - x_2)/2) \qquad [47]$$
on the phase space
$$\Omega = \{(x,p) \in \mathbb{R}^4 \mid x_2 < x_1\}, \qquad \omega = dx \wedge dp \qquad [48]$$
Taking $x_2 \to x_2 + i\pi$ yields the particle–antiparticle Hamiltonian
$$\tilde{H} = (\cosh p_1 + \cosh p_2)\,|\tanh((x_1 - x_2)/2)| \qquad [49]$$
on the phase space
$$\tilde\Omega = \{(x,p) \in \mathbb{R}^4\}, \qquad \tilde\omega = dx \wedge dp \qquad [50]$$
The two-antiparticle Hamiltonian is again given by [47] and [48]. The interaction potential in [47] is repulsive, whereas it is attractive in [49]. Hence, any initial point in $\Omega$ gives rise to a scattering state, whereas points in $\tilde\Omega$ yield scattering states if and only if the reduced Hamiltonian
$$\tilde{H}_r = \cosh p\,|\tanh(x/2)|, \qquad p = (p_1 - p_2)/2, \qquad x = x_1 - x_2 \qquad [51]$$
satisfies $\tilde{H}_r > 1$. More specifically, in both cases the distance $|x_1(t) - x_2(t)|$ increases linearly as $t \to \pm\infty$, the scattering (position shift) being encoded by the same function [32] as for the sine-Gordon solitons. The phase-space points on the separatrix $\{\tilde{H}_r = 1\}$ have the same temporal asymptotics as the multipole solution [24], whereas the bound-state oscillations for $\tilde{H}_r < 1$ match those of the breathers [20].
More generally, the Hamiltonian for $N_+$ particles and $N_-$ antiparticles is given by the function
$$\sum_{j=1}^{N_+} \cosh(p_j^+) \prod_{\substack{k=1 \\ k \neq j}}^{N_+} |\coth((x_j^+ - x_k^+)/2)| \prod_{l=1}^{N_-} |\tanh((x_j^+ - x_l^-)/2)| + \sum_{l=1}^{N_-} \cosh(p_l^-) \prod_{\substack{m=1 \\ m \neq l}}^{N_-} |\coth((x_l^- - x_m^-)/2)| \prod_{j=1}^{N_+} |\tanh((x_l^- - x_j^+)/2)| \qquad [52]$$
on the phase space
$$\Omega_{N_+,N_-} = \{(x^+, p^+) \in \mathbb{R}^{2N_+},\ (x^-, p^-) \in \mathbb{R}^{2N_-} \mid x^+_{N_+} < \cdots < x^+_1,\ x^-_{N_-} < \cdots < x^-_1\} \qquad [53]$$
$$\omega_{N_+,N_-} = dx^+ \wedge dp^+ + dx^- \wedge dp^- \qquad [54]$$
This defining Hamiltonian can be supplemented by $(N_+ + N_- - 1)$ independent Hamiltonians that pairwise commute. The action-angle map of this integrable system can be used to relate the scattering and bound-state behavior to that of the sine-Gordon solutions from an earlier section, yielding an exact correspondence. Indeed, the variables we used to describe the particle-like sine-Gordon solutions amount to the action-angle variables associated to [52]. Moreover, the matrix $L$ [26] with $t = x = 0$
equals the Lax matrix for the N-particle system, which is the manifestation of a remarkable self-duality property of the equal-charge case. There is an equally close relation between the general particle-like solutions and the general systems encoded in [52].
As a matter of fact, the connection can be further strengthened by introducing spacetime trajectories for the solitons, antisolitons, and breathers, which are defined in terms of the evolution of an initial point in N+ ,N under the time translation generator [52] and the space translation generator, obtained from [52] by the replacement cosh → sinh. These point particle and antiparticle trajectories make it possible to follow the motion of the solitons, antisolitons, and breathers during the temporal interval in which the nonlinear interaction takes place, whereas for large times the trajectories are located at the (then) clearly discernible positions of the individual solitons, antisolitons, and breathers.
Before sketching the soliton-particle correspondence at the quantum level, we add a remark on the finite-gap solutions of the classical sine-Gordon equation, already mentioned in the paragraph containing [11]. These solutions may be viewed as generalizations of the particle-like solutions discussed earlier, and they can also be obtained via relativistic N-particle Calogero–Moser systems. The pertinent systems are generalizations of the hyperbolic systems just described to the elliptic level.
Turning now to the quantum level, we begin by mentioning that the Poisson-commuting Hamiltonians admit a quantization in terms of commuting analytic difference operators. This involves a special ordering choice of the p-dependent and x-dependent factors in the classical Hamiltonians, which is required to preserve commutativity. The resulting quantum two-body problem can be explicitly solved in terms of a generalization of the Gauss hypergeometric function. For the case of equal charges, the scattering is encoded in the sine-Gordon amplitudes u () (cf. [45] and [46]). For the unequal-charge case, one should distinguish an even and odd channel. The scattering on these channels is encoded in the sine-Gordon amplitudes t+ () r+ (). Moreover, the bound-state spectrum agrees with the DHN formula [44] and the bound-state wave functions are given by hyperbolic functions.
As a consequence of these results, the physics encoded in the two-body subspace of the sine-Gordon quantum field theory is indistinguishable from that of the corresponding two-body relativistic Calogero–Moser systems. To extend this equivalence to the arbitrary-N case, one needs first of all sufficiently explicit solutions to the N-body Schrödinger equation. To date, this has only been achieved for the case of N equal charges and the special couplings for which the reflection amplitude r+ vanishes. The asymptotics of the pertinent solutions is factorized in terms of u (), in agreement with the sine-Gordon picture.

See also: Bäcklund Transformations; Boundary-Value Problems for Integrable Equations; Calogero–Moser–Sutherland Systems of Nonrelativistic and Relativistic Type; Infinite-dimensional Hamiltonian Systems; Integrability and Quantum Field Theory; Integrable Systems and Discrete Geometry; Integrable Systems and Inverse Scattering Method; Integrable Systems: Overview; Ljusternik–Schnirelman Theory; Solitons and Other Extended Field Configurations; Solitons and Kac–Moody Lie Algebras; Symmetries and Conservation Laws; Two-Dimensional Models; Yang–Baxter Equations.

Further Reading
Ablowitz MJ, Kaup DJ, Newell AC, and Segur H (1974) The inverse scattering transform – Fourier analysis for nonlinear problems. Studies in Applied Mathematics 53: 249–315.
Coleman S (1977) Classical lumps and their quantum descendants. In: Zichichi A (ed.) New Phenomena in Subnuclear Physics, Proceedings Erice 1975, pp. 297–421. New York: Plenum.
Faddeev LD and Takhtajan LA (1987) Hamiltonian Methods in the Theory of Solitons. Berlin: Springer.
Flaschka HF and Newell AC (1975) Integrable systems of nonlinear evolution equations. In: Moser J (ed.) Dynamical Systems, Theory and Applications, Lecture Notes in Physics, vol. 38, pp. 355–440. Berlin: Springer.
Karowski M (1979) Exact S-matrices and form factors in 1+1 dimensional field theoretic models with soliton behaviour. Physics Reports 49: 229–237.
Ruijsenaars SNM (2001) Sine-Gordon solitons vs. Calogero–Moser particles. In: Pakuliak S and von Gehlen G (eds.) Proceedings of the Kiev NATO Advanced Study Institute Integrable Structures of Exactly Solvable Two-Dimensional Models of Quantum Field Theory, NATO Science Series, vol. 35, pp. 273–292. Dordrecht: Kluwer.
Scott AC, Chu FYF, and McLaughlin DW (1973) The soliton: a new concept in applied science. Proceedings of the Institute of Electrical and Electronics Engineers 61: 1443–1483.
Smirnov FA (1992) Form Factors in Completely Integrable Models of Quantum Field Theory. Advanced Series in Mathematical Physics, vol. 14. Singapore: World Scientific.
Thacker HB (1981) Exact integrability in quantum field theory and statistical systems. Reviews of Modern Physics 53: 253–285.
Zamolodchikov AB and Zamolodchikov AlB (1979) Factorized S-matrices in two dimensions as the exact solutions of certain relativistic quantum field theory models. Annals of Physics (NY) 120: 253–291.
584 Singularities of the Ricci Flow
if Ric(x, t) < 0, then the flow expands g(t) near x. At a general point, there will be directions of positive and negative Ricci curvature, along which the metric locally contracts or expands. The flow preserves product structures of metrics, and preserves the isometry group of the initial metric.
The form of [2] shows that the Ricci flow continues as long as Ricci curvature remains bounded. On a bounded time interval where $\mathrm{Ric}_{g(t)}$ is bounded, the metrics g(t) are quasi-isometric, that is, they have bounded distortion compared with the initial metric g(0). Thus, one needs to consider evolution equations for the curvature, induced by the flow for the metric. The simplest of these is the evolution equation for the scalar curvature R:
\[ \frac{d}{dt} R = \Delta R + 2|\mathrm{Ric}|^2 \qquad [3] \]
Evaluating [3] at a point realizing the minimum $R_{\min}$ of R on M shows that $R_{\min}$ is monotone nondecreasing along the flow. In particular, the Ricci flow preserves positive scalar curvature. Moreover, if $R_{\min}(0) > 0$, then
\[ t \le \frac{n}{2R_{\min}(0)} \qquad [4] \]
Thus, the Ricci flow exists only up to a maximal time $T \le n/2R_{\min}(0)$ when $R_{\min}(0) > 0$. In contrast, in regions where the Ricci curvature stays negative definite, the flow exists for infinite time.
The evolution of the Ricci curvature has the same general form as [3]:
\[ \frac{d}{dt} R_{ij} = \Delta R_{ij} + \widetilde{Q}_{ij} \qquad [5] \]
The expression for $\widetilde{Q}$ is much more complicated than the Ricci curvature term in [3] but involves only quadratic expressions in the curvature. However, $\widetilde{Q}$ involves the full Riemann curvature tensor Riem of g, and not just the Ricci curvature (as [3] involves Ricci and not just scalar curvature). An important feature of dimension 3 is that the full Riemann curvature Riem is determined algebraically by the Ricci curvature. So the Ricci flow has a much better chance of “working” in dimension 3. For example, an analysis of $\widetilde{Q}$ shows that the Ricci flow preserves positive Ricci curvature in dimension 3; if $\mathrm{Ric}_{g(0)} > 0$, then $\mathrm{Ric}_{g(t)} > 0$, for t > 0. This is not the case in higher dimensions. On the other hand, in any dimension > 2, the Ricci flow does not preserve negative Ricci curvature, or even a general lower bound $\mathrm{Ric} \ge -\lambda$, for $\lambda > 0$. For the remainder of the article, we usually assume then that dim M = 3.
The first basic result on the Ricci flow is the following, due to Hamilton (1982).
Space-form theorem. If g(0) is a metric of positive Ricci curvature on a 3-manifold M, then the volume normalized Ricci flow exists for all time, and converges to the round metric on $S^3/\Gamma$, where $\Gamma$ is a finite subgroup of SO(4), acting freely on $S^3$.
Thus, the Ricci flow “geometrizes” 3-manifolds of positive Ricci curvature. There are two further important structural results on the Ricci flow.
Curvature pinching estimate (Hamilton 1982, Ivey 1993). For g(t) a solution to the Ricci flow on a closed 3-manifold M, there is a nonincreasing function $\phi : (-\infty, \infty) \to \mathbb{R}$, tending to 0 at $\infty$, and a constant C, depending only on g(0), such that,
\[ \mathrm{Riem}(x,t) \ge -C - \phi(R(x,t))\,|R(x,t)| \qquad [6] \]
This estimate does not imply a lower bound on Riem(x, t) uniform in time. However, when combined with the fact that the scalar curvature R(x, t) is uniformly bounded below (cf. [3]), it implies that $|\mathrm{Riem}|(x,t) \gg 1$ only where $R(x,t) \gg 1$. To control the size of |Riem|, it thus suffices to obtain just an upper bound on R. This is remarkable, since the scalar curvature is a much weaker invariant of the metric than the full curvature. Moreover, at points where the curvature is sufficiently large, [6] shows that $\mathrm{Riem}(x,t)/R(x,t) \ge -\delta$, for $\delta$ small. Thus, if one scales the metric to make R(x, t) = 1, then $\mathrm{Riem}(x,t) \ge -\delta$. In such a scale, the metric then has almost non-negative curvature near (x, t).
Harnack estimate (Hamilton 1982). Let (N, g(t)) be a solution to the Ricci flow with bounded and non-negative curvature $\mathrm{Riem} \ge 0$, and suppose g(t) is a complete Riemannian metric on N. Then for $0 < t_1 \le t_2$,
\[ R(x_2,t_2) \ge \frac{t_1}{t_2}\exp\!\left(-\frac{d_{t_1}^2(x_1,x_2)}{2(t_2-t_1)}\right) R(x_1,t_1) \qquad [7] \]
where $d_{t_1}$ is the distance function on $(M, g_{t_1})$. This allows one to control the geometry of the solution at different spacetime points, given control at an initial point.

Singularity Formation
The deeper analysis of the Ricci flow is concerned with the singularities that arise in finite time. Equation [3] shows that the Ricci flow will not exist for arbitrarily long time in general. In the case of initial metrics with positive Ricci curvature, this is resolved by rescaling the Ricci flow to constant volume. However, the general situation is necessarily much more complicated. For example, any manifold which is a connected sum of $S^3/\Gamma$ or $S^2 \times S^1$ factors has metrics of positive scalar curvature. For obvious
topological reasons, the volume normalized Ricci flow then cannot converge nicely to a round metric; even the renormalized flow must develop singularities.
The usual method to understand the structure of singularities, particularly in geometric PDEs, is to rescale or renormalize the solution on a sequence converging to the singularity to make the solution bounded, and try to pass to a limit of the renormalization. Such a limit solution models the singularity formation, and one hopes (or expects) that the singularity models have special features making them much simpler than an arbitrary solution of the flow.
A singularity forms for the Ricci flow only where the curvature becomes unbounded. Suppose then that $\lambda_i^2 = |\mathrm{Riem}|(x_i, t_i) \to \infty$, on a sequence of points $x_i \in M$, and times $t_i \to T < \infty$. Consider the rescaled or blow-up metrics and times
\[ \bar g_i(\bar t_i) = \lambda_i^2\, \phi_i^{*} g(t), \qquad \bar t_i = \lambda_i^2 (t - t_i) \qquad [8] \]
where $\phi_i$ are diffeomorphisms giving local dilations of the manifold near $x_i$ by the factor $\lambda_i$.
The flow $\bar g_i$ is also a solution of the Ricci flow, and has bounded curvature at $(x_i, 0)$. For suitable choices of $x_i$ and $t_i$, the curvature will be bounded near $x_i$, and for nearby times to the past, $\bar t_i \le 0$; for example, one might choose points $(x_i, t_i)$ where the curvature is maximal on (M, g(t)), $t \le t_i$.
The rescaling [8] expands all distances by the factor $\lambda_i$, and time by the factor $\lambda_i^2$. Thus, in effect one is studying very small regions, of spatial size on the order of $r_i = \lambda_i^{-1}$ about $(x_i, t_i)$, and “using a microscope” to examine the small-scale features in this region on a scale of size about 1.
A limit solution of the Ricci flow, defined at least locally in space and time, will exist provided that the local volumes of the rescalings are bounded below (Gromov compactness). In terms of the original unscaled flow, this requires that the metric g(t) should not be locally collapsed on the scale of its curvature, that is,
\[ \mathrm{vol}\, B_{x_i}(r_i, t_i) \ge \nu\, r_i^{\,n} \qquad [9] \]
for some fixed but arbitrary $\nu > 0$. A maximal connected limit $(\bar N, \bar g(t), \bar x)$ containing the base point $\bar x = \lim x_i$, is then called a “singularity model.” Observe that the topology of the limit $\bar N$ may well be distinct from the original manifold M, most of which may have been blown off to infinity in the rescaling.
To see the potential usefulness of this process, suppose one does have local noncollapse on the scale of the curvature, and that base points of maximal curvature in space and time $t \le t_i$ have been chosen. At least in a subsequence, one then obtains a limit solution to the Ricci flow $(\bar N, \bar g(t), \bar x)$, based at $\bar x$, defined at least for times $(-\infty, 0]$, with $\bar g(t)$ a complete Riemannian metric on $\bar N$. Such solutions are called ancient solutions of the Ricci flow. The estimate [6] shows that the limit has non-negative curvature in dimension 3, and so [7] holds on $\bar N$. Thus, the limit is indeed quite special. The topology of complete manifolds N of non-negative curvature is completely understood in dimension 3. If N is noncompact, then N is diffeomorphic to $\mathbb{R}^3$, $S^2 \times \mathbb{R}$, or a quotient of these spaces. If N is compact, then a slightly stronger form of the space-form theorem implies N is diffeomorphic to $S^3/\Gamma$, $S^2 \times S^1$, or $S^2 \times_{\mathbb{Z}_2} S^1$.
The study of the formation of singularities in the Ricci flow was initiated by Hamilton (1995). Recently, Perelman has obtained an essentially complete understanding of the singularity behavior of the Ricci flow, at least in dimension 3.

Perelman’s Work
Noncollapse
Consider the Einstein–Hilbert action
\[ \mathcal{R}(g) = \int_M R(g)\, dV_g \qquad [10] \]
as a functional on $\mathbb{M}$. Critical points of $\mathcal{R}$ are Ricci-flat metrics. It is natural and tempting to try to relate the Ricci flow with the gradient flow of $\mathcal{R}$ (with respect to a natural $L^2$ metric on the space $\mathbb{M}$). However, it has long been recognized that this cannot be done directly. In fact, the gradient flow of $\mathcal{R}$ does not even exist, since it implies a backwards heat-type equation for the scalar curvature R (similar to [3] but with a minus sign before $\Delta$).
Consider however the following functional extending $\mathcal{R}$:
\[ \mathcal{F}(g, f) = \int_M \left(R + |\nabla f|^2\right) e^{-f}\, dV_g \qquad [11] \]
as a functional on the larger space $\mathbb{M} \times C^\infty(M, \mathbb{R})$, or equivalently a family of functionals on $\mathbb{M}$, parametrized by $C^\infty(M, \mathbb{R})$. The functional [11] also arises in string theory as the low-energy effective action; the scalar field f is called the dilaton. Fix any smooth measure dm on M and define the Perelman coupling by requiring that (g, f) satisfy
\[ e^{-f}\, dV_g = dm \qquad [12] \]
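As a concrete check of the blow-up procedure [8] and of the extinction bound [4], the round sphere can be worked out by hand. The following is only a sketch: it assumes that the flow [2] has the standard form $\partial g/\partial t = -2\,\mathrm{Ric}$, and the symbols $r(t)$, $r_0$, $\hat g$ and the normalization $\lambda_i = r(t_i)^{-1}$ are introduced here for illustration (they are not the article's notation).

```latex
% Round metric of radius r(t) on S^n (n >= 2); \hat g is the unit round metric.
\begin{align*}
g(t) &= r(t)^2\,\hat g, &
\mathrm{Ric}(g(t)) &= (n-1)\,\hat g, &
R(g(t)) &= \frac{n(n-1)}{r(t)^2}. \\
% The Ricci flow reduces to an ODE for r^2:
\frac{d}{dt}\, r(t)^2 &= -2(n-1), &
r(t)^2 &= r_0^2 - 2(n-1)\,t, &
T &= \frac{r_0^2}{2(n-1)} = \frac{n}{2R_{\min}(0)}.
\end{align*}
% Blow-up at times t_i -> T with \lambda_i^2 = r(t_i)^{-2}, which is
% proportional to |Riem|(x_i, t_i) on the round sphere:
\[
\bar g_i(\bar t) = \lambda_i^2\, g\!\left(t_i + \lambda_i^{-2}\,\bar t\right)
                 = \bigl(1 - 2(n-1)\,\bar t\bigr)\,\hat g,
\qquad -\lambda_i^2 t_i \le \bar t < \frac{1}{2(n-1)}.
\]
```

In this example the extinction time equals the bound in [4], so that bound cannot be improved in general. Every rescaled flow is the same shrinking unit sphere, and since $\lambda_i^2 t_i \to \infty$ as $t_i \to T$, the limit is defined for all $\bar t \in (-\infty, 0]$, which illustrates the notion of an ancient solution.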
Perelman shows that this structural result for the singularity models themselves also holds for the solution g(t) very near any singularity time T. Thus, at any base point (x, t) where the curvature is sufficiently large, the rescaling as in [8] of the spacetime by the curvature is smoothly close, on large compact domains, to corresponding large domains in a complete singularity model. The “ideal” complete singularity models do actually describe the geometry and topology near any singularity. Consequently, one has a detailed understanding of the small-scale geometry and topology in a neighborhood of every point where the curvature is large on (M, g(t)), for t near T.
The main consequence of this analysis is the existence of canonical, almost round 2-spheres $S^2$ in any region of (M, g(t)) where the curvature is sufficiently large; the radius of the $S^2$’s is on the order of the curvature radius. One then disconnects the manifold M into pieces, by cutting M along a judicious choice of such 2-spheres, and gluing in round 3-balls in a natural way. This surgery process allows one to excise out the regions of (M, g(t)) where the Ricci flow is almost singular, and thus leads to a naturally defined Ricci flow with surgery, valid for all times $t \in [0, \infty)$.
The surgery process disconnects the original connected 3-manifold M into a collection of disjoint (connected) 3-manifolds $M_i$, with the Ricci flow running on each. However, topologically, there is a canonical relation between M and the components $M_i$; M is the connected sum of $\{M_i\}$. An analysis of the long-time behavior of the volume-normalized Ricci flow confirms the expectation that the flow approaches a fixed point, that is, an Einstein metric, or collapses along 3-manifolds admitting an $S^1$ fibration. This then leads to the proof of Thurston’s geometrization conjecture for 3-manifolds and consequently the proof of the Poincaré conjecture. It gives a full classification of all closed 3-manifolds, much like the classification of surfaces given by the classical uniformization theorem.

See also: Einstein Manifolds; Evolution Equations: Linear and Nonlinear; Minimal Submanifolds; Renormalization: General Theory; Topological Sigma Models.

Further Reading
Anderson M (1997) Scalar curvature and geometrization conjectures for 3-manifolds. In: Grove K and Petersen P (eds.) Comparison Geometry, MSRI Publications, vol. 30, pp. 49–82. Cambridge: Cambridge University Press.
Cao HT, Chow B, Cheng SC, and Yau ST (eds.) (2003) Collected Papers on Ricci Flow. Boston: International Press.
D’Hoker E (1999) String theory. In: Deligne P et al. (eds.) Quantum Fields and Strings: A Course for Mathematicians, vol. 2, Providence, RI: American Mathematical Society.
Friedan D (1985) Nonlinear models in 2 + ε dimensions. Annals of Physics 163: 318–419.
Hamilton R (1982) Three manifolds of positive Ricci curvature. Journal of Differential Geometry 17: 255–306.
Hamilton R (1993) The Harnack estimate for the Ricci flow. Journal of Differential Geometry 37: 225–243.
Hamilton R (1995) Formation of singularities in the Ricci flow. In: Yau ST (ed.) Surveys in Differential Geometry vol. 2, pp. 7–136. Boston, MA: International Press.
Ivey T (1993) Ricci solitons on compact three-manifolds. Differential Geometry and Its Applications 3: 301–307.
Perelman G (2002) The entropy formula for the Ricci flow and its geometric applications, math.DG/0211159.
Perelman G (2003a) Ricci flow with surgery on three-manifolds, math.DG/0303109.
Perelman G (2003b) Finite extinction time for the solutions to the Ricci flow on certain three-manifolds, math.DG/0307245.
Thurston W (1982) Three dimensional manifolds, Kleinian groups and hyperbolic geometry. Bulletin of American Mathematical Society 6: 357–381.
Definition 1 The critical point $x_0$ is said to be of Morse type if the Hessian of V at $x_0$: $D_x^2 V(x_0)$ is of maximal rank n. The corank of a singular point $x_0$ is the corank of the matrix $D_x^2 V(x_0)$.
Denote by O the local ring of germs of $C^\infty$ functions at point $x_0$.
Definition 2 The Jacobian ideal of the function V at $x_0$, denoted as Jac(V), is the ideal generated in the ring O by the partial derivatives of V: $\partial V/\partial x_i$, $i = 1, \dots, n$, considered as elements of the local ring O.
The singularity (or the singular point) is isolated if
\[ \dim_{\mathbb{R}} O/\mathrm{Jac}(V) < \infty \]
In that case, the Milnor number is defined as the dimension
\[ \mu = \dim_{\mathbb{R}} O/\mathrm{Jac}(V) \]
Local models of singularities at a point are simple expressions that germs of functions singular at this point have in local coordinates.
R Thom proposed to focus more particularly on the singularities whose Milnor number is less than or equal to 4 and whose corank is less than or equal to 2. The list of local models $V_\lambda(x)$ of functions whose singularities at 0 display a Milnor number less than or equal to 4 and a corank less than or equal to 2 is the following:
$V_\lambda(x) = \tfrac{1}{3}x^3 + \lambda_1 x$, the fold,
$V_\lambda(x) = \tfrac{1}{4}x^4 + \tfrac{1}{2}\lambda_1 x^2 + \lambda_2 x$, the cusp,
The projection of this set on the space of parameters contains the set of values of the parameters for which the equilibrium position is susceptible to change of topological type (in other terms to undergo a bifurcation). This set is called the catastrophe set (see Figure 1).
Consider now the case of umbilics where there are two state equations:
\[ \frac{\partial V}{\partial x} = \frac{\partial V}{\partial y} = 0 \]
The catastrophe set S is determined by one further equation:
\[ \mathrm{Hess}\, V = \frac{\partial^2 V}{\partial x^2}\,\frac{\partial^2 V}{\partial y^2} - \left(\frac{\partial^2 V}{\partial x\, \partial y}\right)^2 = 0 \]
In both cases of hyperbolic and elliptic umbilics, the set S is a singular surface. For the last case of the parabolic umbilic, the set S is of dimension 3 and again it is only possible to represent it by a family of its sections by a variable hyperplane (see Figure 2).
All possible deformations (in the space of functions) of a function with an isolated singularity can be induced by a single $\mu$-dimensional family of deformations named the “universal deformation.” In general, the “codimension” of a bifurcation is the minimal number of parameters needed to display all possible phase diagrams of all possible unfoldings.
Several deep mathematical techniques, like the Malgrange division theorem and preparation theorem, allowed J Mather to prove the theorem (local, then global) of existence of the universal unfolding.
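For the cusp model $V_\lambda(x) = \tfrac14 x^4 + \tfrac12 \lambda_1 x^2 + \lambda_2 x$ listed above, the equilibria solve the cubic $x^3 + \lambda_1 x + \lambda_2 = 0$, and the number of equilibria changes exactly where this cubic has a multiple root, that is, where its discriminant $-4\lambda_1^3 - 27\lambda_2^2$ vanishes. The short Python sketch below counts equilibria from the sign of that discriminant. The function names and the tolerance `tol` are illustrative choices, not part of the article.

```python
def cusp_discriminant(lam1: float, lam2: float) -> float:
    """Discriminant of the cubic x**3 + lam1*x + lam2 (the equation dV/dx = 0)."""
    return -4.0 * lam1**3 - 27.0 * lam2**2


def count_equilibria(lam1: float, lam2: float, tol: float = 1e-12) -> int:
    """Number of distinct real equilibria of the cusp potential.

    disc > 0 : three distinct real roots (inside the cusp region)
    disc < 0 : one real root (outside the cusp region)
    disc = 0 : on the catastrophe set; a double root plus a simple root,
               or a single triple root at the cusp point lam1 = lam2 = 0.
    """
    disc = cusp_discriminant(lam1, lam2)
    if disc > tol:
        return 3
    if disc < -tol:
        return 1
    if abs(lam1) <= tol and abs(lam2) <= tol:
        return 1  # triple root x = 0
    return 2      # double root and one simple root


if __name__ == "__main__":
    for lam in [(-3.0, 0.0), (-3.0, 2.0), (1.0, 0.0), (0.0, 0.0)]:
        print(lam, count_equilibria(*lam))
```

The point (-3, 2) lies on the catastrophe set because $x^3 - 3x + 2 = (x-1)^2(x+2)$ has a double root; crossing that curve changes the count between one and three.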
590 Singularity and Bifurcation Theory
[Figure 1: catastrophe sets in parameter space, with axes λ1, λ2, λ3; image not reproduced.]
The theory of unfoldings of singularities can be used, for instance, to provide asymptotic expression of stationary phase integrals when critical points of the phase are not of Morse type. This relates to monodromy, Bernstein polynomials, Milnor fibration near a singular point, and simultaneous local models of forms and functions (cf. Malgrange (1974) and see Feynman Path Integrals).

Singularity Theory of Vector Fields
Transcritical Bifurcation
The transcritical bifurcation is the standard mechanism for changes in stability. The local model is given by
\[ \dot x = rx - x^2 \]
For r < 0, there is an unstable fixed point at x = r and a stable fixed point at x = 0. As r increases, the unstable and the stable fixed points coalesce when r = 0 and when r > 0, they exchange their stability.
A simplified model of the essential physics of a laser is due to Haken (1983). It is given by
\[ \dot n = GnN - kn \]
where n is the number of photons in the laser field, N is the number of excited atoms, and the gain term comes from the process of stimulated emission which occurs at a rate proportional to the product nN. Furthermore, the number of excited atoms drops down by the emission of photons $N = N_0 - \alpha n$. Then we obtain
\[ \dot n = (GN_0 - k)\,n - \alpha G n^2 \]
This model displays a transcritical bifurcation, which explains in elementary terms the laser threshold.

Pitchfork Bifurcation
The local model for supercritical pitchfork bifurcation is
\[ \dot x = rx - x^3 \]
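The two local models above, transcritical ($\dot x = rx - x^2$) and supercritical pitchfork ($\dot x = rx - x^3$), can be explored with a few lines of Python. The sketch below lists the fixed points for a given r and classifies each one by the sign of the derivative of the right-hand side (negative means stable, positive means unstable). The function names are illustrative and do not come from the article.

```python
import math


def classify(slope: float) -> str:
    """Linear stability of a fixed point of x' = f(x), given f'(x*)."""
    if slope < 0:
        return "stable"
    if slope > 0:
        return "unstable"
    return "non-hyperbolic"


def transcritical_fixed_points(r: float):
    """Fixed points of x' = r*x - x**2 with their stability (f'(x) = r - 2x)."""
    points = sorted({0.0, float(r)})  # x* = 0 and x* = r (they merge at r = 0)
    return [(x, classify(r - 2.0 * x)) for x in points]


def pitchfork_fixed_points(r: float):
    """Fixed points of x' = r*x - x**3 with their stability (f'(x) = r - 3x^2)."""
    points = [0.0]
    if r > 0:
        s = math.sqrt(r)
        points = [-s, 0.0, s]  # the two extra branches exist only for r > 0
    return [(x, classify(r - 3.0 * x * x)) for x in points]


if __name__ == "__main__":
    for r in (-1.0, 0.0, 1.0):
        print("transcritical", r, transcritical_fixed_points(r))
    for r in (-1.0, 4.0):
        print("pitchfork", r, pitchfork_fixed_points(r))
```

If the saturation coefficient in the laser model is written α (an assumed symbol), the substitution x = αGn brings $\dot n = (GN_0 - k)n - \alpha G n^2$ to the transcritical form with r = GN0 − k, so the threshold r = 0 corresponds to N0 = k/G.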
Figure 2 Sections of catastrophe sets (parameter axes λ1 to λ4). Adapted with permission from Françoise J-P (2005) Oscillations en Biologie: Analyse Qualitative et Modèles (Mathématiques et Applications, vol. 46). Heidelberg: Springer.
When the parameter r < 0, it displays one stable equilibrium position. As r increases, this equilibrium bifurcates (for r > 0) into two stable equilibria and an unstable equilibrium. Its drawing suggests “the pitchfork.” In case of subcritical pitchfork bifurcation
\[ \dot x = rx + x^3 \]
the stable state x = 0 coexists for r < 0 with two unstable equilibria $x = \pm\sqrt{-r}$; these merge with it at r = 0, and for r > 0 only the unstable equilibrium x = 0 remains.

Normal Forms
Local analysis of vector fields proceeds with local models called normal forms. A local vector field near a singular point (zero) is seen as a derivation of the local ring of functions which preserves the unique maximal ideal (of the functions which vanish at the singular point). It yields a linear operator of the finite-dimensional vector spaces of truncated Taylor expansions of functions. This leads to decomposition of the vector fields into semisimple and nilpotent parts (at the level of formal series). A normal form is a formal coordinate system in which the semisimple part is linear. If the vector field preserves a structure (like volume form or symplectic form) the change of coordinates which brings it to its normal form is also (volume-preserving, symplectic). The simplicity of the normal form depends on the number of allowed resonances for the eigenvalues of the first-order jet of the vector field at the singular point. The best-known example is the Birkhoff normal form of Hamiltonian vector fields that we recall now, but we should also mention the Sternberg normal form of volume-preserving vector fields.
Local analysis of a Hamiltonian vector field under symplectic changes of coordinates is the same as the local analysis of functions (namely its associated Hamiltonian). Birkhoff normal form deals with the
case of a Hamiltonian that is a perturbation at the origin:
\[ H_0(p) = \sum_{j=1}^{m} \lambda_j p_j, \qquad p_j = x_j^2 + y_j^2, \quad j = 1, \dots, m, \qquad \omega = \sum_{j=1}^{m} dx_j \wedge dy_j \]
If the eigenvalues $\lambda_j$ are assumed to be independent over the integers (no resonances), then there is a formal system of symplectic coordinates $\hat p_j, \hat q_j$, $j = 1, \dots, m$, called action-angle variables, in which the Hamiltonian only depends on the action variables $\hat p_j$. Such a coordinate system is generically divergent because, under generic assumptions on the 3-jet of the Hamiltonian, the system displays isolated periodic orbits in any neighborhood of the origin (see Moser, Vey, Francoise). Normal forms are normally used in applications (e.g., Nekhoroshev theorem, Hopf bifurcation theorem) in their truncated versions. Birkhoff normal form was conjectured (A Weinstein) to enter in the asymptotic expansion of the fundamental solution of the wave equation on a Riemannian manifold near elliptic geodesics. This conjecture was recently proved by V Guillemin.

Stability Theory of Hamiltonian Systems, Nekhoroshev Theorem, Arnol’d Diffusion
The generic divergence of the Birkhoff normal form does not allow one to conclude about the stability of the elliptic singular point. In the case where it is convergent, the motion is trapped inside invariant tori (conservation of the actions). The KAM theorem (see Gallavotti (1983)) provides the existence of many invariant tori but, except in low dimensions, this does not exclude the existence of trajectories that would escape to infinity. Arnol’d indeed provided a mechanism and examples of such situations (this is now called Arnol’d diffusion) (see Introductory Articles: Classical Mechanics). This diffusion process needs some time, which is estimated below by a theorem of Nekhoroshev.
Consider the Hamiltonian
\[ H_\varepsilon(p, q) = h(p) + \varepsilon f(p, q) \]
where h(p) is strictly convex, analytic, anisochronous on the closure $\bar U$ of an open bounded region U of $\mathbb{R}^m$ and the perturbation f(p, q) is analytic on $\bar U \times \mathbb{R}^m$. Nekhoroshev’s theorem tells that there are positive constants a, b, d, g, such that for any initial data $p_0, q_0$, the actions p do not change by more than $a\varepsilon^{g}$ before a time bounded below by $e^{b/\varepsilon^{d}}$ (see Gallavotti (1983)).

Bifurcations of Periodic Orbits
Consider a one-parameter family of vector fields $X_\lambda$ of class $C^k$, $k \ge 3$.
Assume that $X_\lambda(0) = 0$ and that the linear part of the vector field at 0 has two complex-conjugated eigenvalues $\mu(\lambda)$ and $\bar\mu(\lambda)$ such that $\mathrm{Re}(\mu(\lambda)) > 0$ for $\lambda > 0$, $\mathrm{Re}(\mu(0)) = 0$ and $d(\mathrm{Re}(\mu(\lambda)))/d\lambda\,|_{\lambda = 0} \ne 0$.
Then, for $\lambda > 0$ but small enough, the vector field $X_\lambda$ has a periodic orbit which tends to 0 as $\lambda$ tends to 0.
This bifurcation of codimension 1 is named Hopf bifurcation and it occurs in many models.
When several oscillators (conservative or dissipative) are weakly coupled, they may display frequency locking (existence of an attractive periodic orbit), phase locking, and synchronization. The fact that we always see the same face of the Moon from the Earth can be explained by a synchronization of the rotation of the Moon onto itself with its rotation around the Earth. Synchronization also plays a fundamental role in living organisms (e.g., heart, population dynamics: see D Attenborough’s movie “The Trials of Life”). It is sometimes possible to be convinced of synchronization via computer experiments, but the main theoretical approach is due to Malkin. See Bifurcations of Periodic Orbits, where a full mathematical proof is included.

Homoclinic Bifurcation, Newhouse’s Phenomenon
Homoclinic bifurcation occurs in the family $X_\lambda$ at the bifurcation value of the parameter $\lambda = 0$ if $X_0$ displays a singular orbit which tends to 0 both for $t \to +\infty$ and for $t \to -\infty$. In dimension 2, if $\lambda$ is slightly deformed around 0, one periodic orbit may appear (or disappear). For planar systems, the Bogdanov–Takens bifurcation is the codimension-2 bifurcation, which mixes the homoclinic and the Hopf bifurcations. In dimension 3, more complicated phase diagrams may occur (such as in the Shilnikov bifurcation) with the appearance of infinitely many periodic orbits or homoclinic loops (in a stable way: Newhouse phenomenon). This eventually gives rise to strange attractors (the Roessler attractor).

The Poincaré Center-Focus Problem, Local Hilbert’s 16th Problem, Abel Equations, Algebraic Moments
Hopf bifurcation theory for two-dimensional systems deals with the first case of a general situation
often referred to as degeneracies of Hopf bifurca- Excitability is also an important feature which occurs
tions or alternatively Hopf–Takens bifurcations. in some fast–slow systems. Consider initial data in a
Consider more generally a planar vector field, neighborhood of an excitable attractive point. For some
tangent at the origin to a linear focus: initial data, the orbit goes very quickly to the attractor.
For some others instead (usually below some threshold),
x_ ¼ y þ x þ f ðx; yÞ the orbit undergoes a long incursion in the phase
y_ ¼ x þ y þ gðx; yÞ diagram before turning back to the attractive point.
Singular Hopf bifurcation, hysteresis, and excit-
The Poincaré center-focus problem asks for ability can, for instance, occur in the electrodissolu-
necessary and sufficient conditions on the perturba- tion and passivation of iron in sulfuric acid
tion terms so that all orbits are periodic in a (see Alligood et al. (1997)).
neighborhood of the origin. This problem is still Sometimes, the orbit leaves the neighborhood of a
pending in the case, for instance, where f and g are first attractor to jump to a second one and then this
homogeneous of degrees 4 and 5. It was solved a second one disappears and the orbit jumps back to
long time ago for degrees 2 and 3. Part (b) of the initial attractor as the slow variables have
Hilbert’s 16th Problem asks for finding a bound in undergone a cycle. This is called a hysteresis cycle.
terms of the degrees of polynomial perturbations for In case one of the attractors is a point while the
the number of limit cycles (isolated periodic orbits) other is an attractive periodic orbit, it may lead to
in the neighborhood of the origin. In the case of bursting oscillations. These oscillations are charac-
homogeneous perturbations, a Cherkas transforma- terized by the periodic succession of silent phases
tion allows the reduction of both problems to the (attractor of the fast dynamics) and active (pulsatile)
so-called one-dimensional periodic Abel equations: phases (periodic attractor of the fast dynamics).
They are ubiquitous in physiology, where they were
dy=dx ¼ pðxÞy2 þ qðxÞy3 first discovered and can be also observed in physics
(laser beams) and in population dynamics.
where p and q are trigonometric polynomials in x.
A perturbative approach was developed for several
years and yields a theory of algebraic moments Example
related to Livsic’s generalized problem of moments.
The Hindmarsh–Rose model displays bursting
oscillations:
See also: Bifurcation Theory; Bifurcations of Periodic Gallavotti G (1983) The Elements of Mechanics. New York: Springer.
Orbits; Chaos and Attractors; Entropy and Quantitative Goodstein DL and Goodstein JR (1997) Feynman’s lost lecture.
Transversality; Ergodic Theory; Feynman Path Integrals; London: Vintage.
Generic Properties of Dynamical Systems; Gravitational Guckenheimer J (2004) Bifurcations of relaxation oscillations. In:
Ilyashenko Y, Rousseau C, and Sabidussi G (eds.) Normal Forms,
Lensing; Homoclinic Phenomena; Hyperbolic Dynamical
Bifurcations and Finiteness Problems in Differential Equations.
Systems; Multiscale Approaches; Optical Caustics; Séminaire de mathématiques supérieures de Montréal, Nato
Poisson Reduction; Stationary Phase Approximation; Sciences Series, II. Mathematics, vol. 137, pp. 295–316. Kluwer.
Symmetry and Symmetry Breaking in Dynamical Haken H (1983) Synergetics, 3rd edn. Berlin: Springer.
Systems; Symmetry and Symplectic Reduction; Keener J and Sneyd J (1998) Mathematical Physiology. Inter-
Synchronization of Chaos; Weakly Coupled Oscillators. disciplinary Applied Mathematics, vol. 8. New York: Springer.
Malgrange B (1974) Intégrales asymptotiques et monodromie.
Annales de l’ENS 7: 405–430.
Further Reading May R-M (1976) Simple mathematical models with very
complicated dynamics. Nature 261: 459–467.
Alligood KT, Sauer TD, and Yorke JA (1997) Chaos, An Nekhoroshev V (1977) An exponential estimate of the time of
Introduction to Dynamical Systems, Textbooks in Mathema- stability of nearly integrable Hamiltonian systems. Russian
tical Sciences. New York: Springer. Mathematical Surveys 32(6): 1–65.
Alpay D and Vinikov V (eds.) (2001) Operator Theory, System Palis J and de Melo W (1982) Geometric Theory of Dynamical
Theory and Related Topics, The Mosche Livsic Anniversary Systems, An Introduction. New York: Springer.
Volume, Operator Theory, Advances and Applications Perko L (2000) Differential Equations and Dynamical Systems, 3rd
vol. 123. Birkhauser. edn, Text in Applied Mathematics, vol. 7. New York: Springer.
Briskin M, Francoise JP, and Yomdin (2001) Generalized Siegel C-L and Moser J (1971) Lectures on Celestial Mechanics,
Moments, Center-Focus Conditions and Compositions of Die Grundlehren der mathematischen Wissenschaften,
Polynomials. Operator Theory, Advances and Applications vol. 187. Berlin: Springer.
123 ( in honor of M Livsic, 80th birthday). Smale S (1998) Mathematical problems for the next century. The
Diener M (1994) The canard unchained, or how fast–slow dynamical Mathematical Intelligencer 20: 7–15.
systems bifurcate? The Mathematical Intelligencer 6: 38–49. Smale S. Dynamics retrospective: great problems, attempts that
Francoise JP and Guillemin V (1991) On the period spectrum of a failed. Physica D 51: 267–273.
symplectic map. Journal of Functional Analysis 100: 317–358.
some time they collide and their shapes are distorted. After a long enough time, they are separated and recover their original shapes, the only difference being in the change of the phase shift $\delta$ in [2].

Solitary waves in shallow water (like a canal) were first observed by Scott Russell in Scotland in the middle of the nineteenth century. Differential equations which possess solitary waves in shallow water as solutions were sought after Scott Russell’s report. Boussinesq derived one (now called the Boussinesq equation, which contains second partial derivatives with respect to time) from the Euler equations for water waves; then in 1895 Korteweg and his student de Vries derived the KdV equation. They also showed that the KdV equation possesses solutions expressible in terms of elliptic functions.

In the 1960s Kruskal and Zabusky carried out numerical computations for the Fermi–Pasta–Ulam problem; they also came across the KdV equation and found the aforementioned phenomenon.

Inverse-Scattering Method

Kruskal and his co-workers further pursued the origin of the particle-like property of solitons and proposed the so-called inverse-scattering method. The inverse problem of scattering theory of the one-dimensional Schrödinger operator

$L = -\frac{d^2}{dx^2} + u(x)$

$r(\xi, t) = r(\xi, 0)\, e^{2i\xi^3 t}$   [3]

It was realized at the same time that soliton solutions correspond to a reflectionless potential ($r(\xi) = 0$) with only one discrete eigenvalue, while reflectionless potentials correspond to a nonlinear "superposition" of soliton solutions (called multisoliton solutions) and describe the interaction of solitons.

As was pointed out by Zakharov and others, the inverse-scattering method has an intimate relation with the Riemann–Hilbert problem.

Lax Representation

Looking at this invariance of the spectrum, Lax reformulated the KdV equation [1] as an evolution equation for the one-dimensional Schrödinger operator:

$\frac{dL}{dt} = [A, L], \qquad L = \frac{d^2}{dx^2} + u, \qquad A = \frac{\partial^3}{\partial x^3} + \frac{3}{4}\left(u\,\frac{\partial}{\partial x} + \frac{\partial}{\partial x}\,u\right)$   [4]

Here we have changed the sign of the operator for later convenience. This form of representation together with the inverse-scattering method gave a framework for finding nonlinear differential (difference) equations that have solutions with properties similar to solitons (soliton equations).

Among such are the sine-Gordon equation

$u_{tt} - u_{xx} = \sin u$
596 Solitons and Kac–Moody Lie Algebras
the nonlinear Schrödinger equation

$i u_t + u_{xx} + |u|^2 u = 0$

the modified KdV equation

$u_t \pm 6u^2 u_x + u_{xxx} = 0$

the Toda lattice equation

$\frac{dQ_n}{dt} = P_n, \qquad \frac{dP_n}{dt} = -\exp(Q_n - Q_{n+1}) + \exp(Q_{n-1} - Q_n)$   [5]

and so on. The first three are obtained by replacing $L$ by a $2 \times 2$ matrix differential operator of first order. For eqn [5], the linear operator corresponding to $L$ in the case of the KdV equation is a difference operator of order 2 and has a connection with the theory of orthogonal polynomials in one variable as well as with the theory of moment problems.

Later it was remarked that the differential operator $A$ in eqn [4] is nothing but the differential operator part of the fractional power of $L$: $A = (L^{3/2})_+$. By replacing $A$ in [4] by $(L^{(2n+1)/2})_+$ we obtain higher ($n$th) KdV equations.

Basic Representations of Affine Lie Algebras

In the 1960s Kac and Moody independently introduced a class of infinite-dimensional Lie algebras which are in many respects close to finite-dimensional semisimple Lie algebras. Each of them is constructed for a given generalized Cartan matrix (GCM),

$C = (a_{ij}), \quad a_{ii} = 2, \quad a_{ij} \le 0 \ \text{for } i \ne j, \quad \text{and if } a_{ij} = 0 \text{ then } a_{ji} = 0$   [6]

There is a special class of Kac–Moody Lie algebras that are now called affine Lie algebras. They correspond to positive-semidefinite GCM and are realized as central extensions of loop algebras (current algebras)

$\mathbb{C}[\lambda, \lambda^{-1}] \otimes \mathfrak{g}$

of finite-dimensional semisimple Lie algebras $\mathfrak{g}$. They have many applications in physics, in particular as current algebras. The Sugawara construction in current algebra plays an essential role in conformal field theory. Note that finite-dimensional semisimple Lie algebras correspond to positive-definite GCMs.

In the late 1970s, there was interest in constructing representations of these algebras after the general theory of representations was constructed. Among them was the work of Lepowsky–Wilson, who constructed basic representations of the affine Lie algebra $A_1^{(1)}$ ($= \widehat{\mathfrak{sl}}_2$) using differential operators of infinite order in infinitely many variables. These operators were called vertex operators by Garland, in view of the resemblance to objects in string theory. Character formulas for these new Lie algebras were intensively studied and many combinatorial identities were (re)derived.

Geometric Interpretation

How do Kac–Moody Lie algebras enter into this picture?

In the early stages of the history of solitons Kac–Moody Lie algebras appeared rather artificially. Some authors tried to understand solitons from geometric viewpoints. A typical example is the sine-Gordon equation. This equation appears as the Gauß–Codazzi equation in the theory of embeddings of two-dimensional surfaces of constant negative curvature into three-dimensional Euclidean space, while the Gauß–Weingarten equation is the linear equation that appears in the Lax representation of the sine-Gordon equation. Another approach of a geometric nature, involving the prolongation structure, was the direction initiated by Wahlquist–Estabrook. In this approach, the Lie algebra appeared in a natural way, although the nature of such Lie algebras was not so clear. This direction of research is close in spirit to the method of Cartan for treating partial differential equations.

Several authors considered generalizations of the Toda lattice equation. Bogoyavlenskii and others observed that the original Toda lattice equation [5] is related to the Cartan matrix of the affine Lie algebra of type A. Viewed in this way, it was straightforward to generalize the Toda lattice equation to Cartan matrices of other types of affine Lie algebras and also to ordinary Cartan matrices. These were typical appearances of Kac–Moody Lie algebras in the theory of solitons; they were used to produce soliton equations. The climax of this is the work of Drinfel’d–Sokolov.

It took some time to understand another role of affine Lie algebras in the theory of solitons.

Bäcklund Transformation

In the theory of two-dimensional surfaces of constant negative curvature, a method of obtaining another surface of constant negative curvature from the given one with some parameter was known by the work of Bäcklund. If we apply this to the trivial solutions $u = 0$ of the sine-Gordon equation, we
obtain a one-soliton solution of the sine-Gordon equation. From this fact, the transformation of solutions of soliton equations to other solutions is called a Bäcklund transformation. The original Darboux transformation is a special case of a Bäcklund transformation.

Hamiltonian Formalism

Another discovery of Gardner–Greene–Kruskal–Miura was the Hamiltonian structure of the KdV equation. In the process of showing the existence of infinitely many conservation laws, they used the so-called Miura transformation, which relates the KdV and the modified KdV equation. Faddeev–Zakharov showed that the transformation to scattering data is a canonical transformation, and conserved quantities are obtained from the expansion of the reflection coefficients.

Gelfand–Dikii studied Hamiltonian structures of the KdV equation using the formal variational calculus they initiated.

M Adler was the first to try to study the KdV equation by using the orbit method known for finite-dimensional Lie algebras. It was known by the works of Kostant and Kirillov or even earlier by Lie that the co-adjoint orbits of Lie algebras admit symplectic structures (the Kostant–Kirillov bracket). Adler considered the algebra of pseudodifferential operators in one variable. This acquires the structure of a Lie algebra by the commutation relation. This algebra admits a natural triangular decomposition by order. He showed that the KdV equation can be viewed as a Hamiltonian system in the co-adjoint orbit of the one-dimensional Schrödinger operator with the Kostant–Kirillov bracket. By introducing the notion of residue of pseudodifferential operators he rederived conserved quantities. The work of Drinfel’d–Sokolov can be regarded as a thorough generalization of this direction. Hamiltonian structures of the KdV equation and other soliton equations are now understood in this way.

The method is also applicable to finite-dimensional Lie algebras. Symes, Kostant, and others treated the finite Toda lattice in this way.

The motion of tops, including that of Kovalevskaya, was also studied in this way.

Hirota’s Method

There was another approach to soliton equations, quite different from the above. This was the method initiated by Hirota. He placed stress on the form of multisoliton solutions of the KdV equation, the sine-Gordon equation, and so on. He made a dependent-variable transformation of the KdV equation [1],

$u = 2\,\frac{d^2}{dx^2} \log f$

This form naturally arises when we reconstruct the potential of the one-dimensional Schrödinger operator from the scattering data by solving the Gelfand–Levitan–Marchenko integral equation. In this new dependent variable, eqn [1] takes the following form:

$(D_x^4 - 4 D_x D_t)\, f(x,t) \cdot f(x,t) = 0$

where the operator $D_x$ is defined by

$D_x (f \cdot g) = \left.\frac{d}{dx'}\, f(x + x')\, g(x - x')\right|_{x'=0}$   [7]

This operator is called Hirota’s bilinear differential operator. In such transformed form, he tried to solve the resulting equation in a perturbative way,

$f = 1 + \sum_{j=1}^{n} \exp(2p_j x + 2p_j^3 t + q_j) + \sum_{1 \le j < k \le n} c_{jk} \exp\bigl(2(p_j + p_k)x + 2(p_j^3 + p_k^3)t + q_j + q_k\bigr) + \cdots$   [8]

It is rather miraculous that in the soliton equation case we can truncate such a perturbative procedure at a finite point. The number of steps corresponds to the number of solitons.

Most of the soliton equations are rewritten in bilinear form with such bilinear differentiation after a suitable dependent-variable transformation. (Some equations need several new dependent variables.) Once we have a differential equation in Hirota’s bilinear differential form, it always has two-soliton solutions.

Up to 1980, keywords characterizing solitons were: inverse-scattering method, Bäcklund transformation, multisolitons, Hirota’s method, quasi-periodic solutions, etc. No explicit mention was made of representation theory.

Hierarchy of Soliton Equations

As was stated above, soliton equations viewed as Hamiltonian systems have infinitely many conservation laws. This implies that we can introduce infinitely many independent time variables consistently. From this viewpoint, it is natural to consider the KdV equation and its higher-order analogs simultaneously. They have many properties in common. For example, the t-dependence of the scattering data of the higher
KdV equation is given by replacing $\xi^3$ by $\xi^{2n+1}$ and $\eta_j^3$ by $\eta_j^{2n+1}$ in eqn [3]. The totality of soliton equations organized in this way is called a hierarchy of soliton equations; in the KdV case, it is called the KdV hierarchy. This notion of hierarchy was introduced by M Sato. He tried to understand the nature of the bilinear method of Hirota. First, he counted the number of Hirota bilinear operators of given degree for hierarchies of soliton equations. For the number of bilinear equations, M Sato and Y Sato made extensive computations and made many conjectures that involve enumeration of partitions.

Kadomtsev–Petviashvili Hierarchy

Although it was included in a family of soliton equations slightly later, the Kadomtsev–Petviashvili (KP) equation is a soliton equation in three independent variables, which first appeared in plasma physics:

$\frac{3}{4}\, u_{yy} - \left(u_t - \frac{1}{4}(6uu_x + u_{xxx})\right)_x = 0$   [9]

For this equation we have to replace the Lax representation by

$\left[\frac{\partial^2}{\partial x^2} + u - \frac{\partial}{\partial y},\ \frac{\partial^3}{\partial x^3} + \frac{3}{2}\, u\, \frac{\partial}{\partial x} + v - \frac{\partial}{\partial t}\right] = 0$   [10]

This form of representation was introduced by Zakharov–Shabat. Sometimes it is referred to as the zero-curvature representation or the Zakharov–Shabat representation. The KP equation is universal in the sense that it contains the KdV equation [1] and the Boussinesq equation as special cases. If $u$ does not depend on $y$, resp. $t$, this gives the KdV, resp. the Boussinesq equation.

Work of Sato

Sato stressed the importance of the study of the KP equation. He first introduced the KP hierarchy. Instead of the one-dimensional Schrödinger operator in the KdV case consider a pseudo- (micro) differential operator of first order,

$L = \partial + u_2(x)\,\partial^{-1} + u_3(x)\,\partial^{-2} + \cdots, \qquad \partial = \frac{\partial}{\partial x_1}, \qquad x = (x_1, x_2, x_3, \ldots)$   [11]

Setting $B_n = (L^n)_+$, the KP hierarchy is defined by the Zakharov–Shabat representation

$\left[\frac{\partial}{\partial x_m} - B_m,\ \frac{\partial}{\partial x_n} - B_n\right] = 0, \qquad m, n = 2, 3, \ldots$

If we assume that $L^2$ is a differential operator, we have the KdV hierarchy and the constraint that $L^3$ is a differential operator gives the Boussinesq hierarchy. This process is called reduction.

Sato found that character polynomials (Schur functions) solve the KP hierarchy and, based on this observation, he created the theory of the infinite-dimensional (universal) Grassmann manifold and showed that the Hirota bilinear equations are nothing but the Plücker relations for this Grassmann manifold.

Sato also gave an (infinite-dimensional) determinantal formula for Hirota’s dependent variable and called the latter the $\tau$-function. Using this $\tau$-function, the wave function (the eigenfunction corresponding to the KP hierarchy) is expressed as

$w(x, k) = \exp\left(\sum_{n=1}^{\infty} x_n k^n\right) \frac{\tau\bigl(x - \epsilon(k^{-1})\bigr)}{\tau(x)}, \qquad \epsilon(k) = \left(k, \frac{k^2}{2}, \frac{k^3}{3}, \ldots\right), \qquad Lw = kw$   [12]

where $L$ is given by eqn [11].

Affine Lie Algebras as Infinitesimal Transformation Groups for Soliton Equations

Date–Jimbo–Kashiwara–Miwa found another relation among soliton equations and affine Lie algebras. After noticing some similarity between the formula in the paper by Lepowsky–Wilson on the Rogers–Ramanujan identity using the vertex operators for $A_1^{(1)}$ and the formula in the computation of numbers of bilinear operators in Sato’s paper, they applied the vertex operator for $A_1^{(1)}$,

$X(p) = \exp\left(2 \sum_{j=1}^{\infty} x_{2j-1}\, p^{2j-1}\right) \exp\left(-\sum_{j=1}^{\infty} \frac{2}{(2j-1)\, p^{2j-1}}\, \frac{\partial}{\partial x_{2j-1}}\right)$

to 1 (which is the simplest $\tau$-function for the KdV hierarchy), where $p$ is a parameter. They found that the result is the $\tau$-function corresponding to the one-soliton solution of the KP hierarchy. They also found that successive application of $X(p)$’s to 1 produced all multisoliton $\tau$-functions. Therefore, applications of vertex operators are precisely
Special Solutions of Soliton Equations (Multisoliton and Rational Solutions)

One of the characteristic features of soliton equations is that they allow rich special solutions. Multisoliton solutions were the starting point of the whole story. They directly relate to vertex operators of affine Lie algebras.

Rational solutions (in terms of the $\tau$-function, polynomial solutions) can be viewed as degenerations of multisoliton solutions. Motions of poles (or zeros) of the solutions are interesting. Airault–McKean–Moser studied the motion of poles of rational solutions of the KdV equation and found that they are identical to the motion of particles on a line (Calogero–Moser–Sutherland system). This viewpoint has now been generalized by Veselov and others.

Another discovery of Sato was that polynomial $\tau$-functions of the KP hierarchy are precisely Schur functions (character polynomials).

In accordance with the process of reduction, polynomial $\tau$-functions of the KdV hierarchy are Schur functions of special type.

Quasiperiodic Solutions of Soliton Equations

As mentioned above, the KdV equation admits solutions expressible in terms of elliptic functions. Dubrovin–Novikov and Its–Matveev, almost at the same time, studied solutions of the KdV equation with periodic initial condition.

To the Sturm–Liouville (i.e., one-dimensional Schrödinger) operator with periodic potential

$L = -\frac{\partial^2}{\partial x^2} + u(x), \qquad u(x + l) = u(x)$

there corresponds the discriminant, which is an entire function of the spectral parameter. Its zeros represent the periodic and antiperiodic spectrum $\lambda_j$ of the operator:

$L f_j(x) = \lambda_j f_j(x), \qquad f_j(x + l) = \pm f_j(x)$

When all but a finite number of the zeros are double, the potential is called a finite-zone potential. These zones correspond to the spectrum of the operator in the $L^2$-sense. To a finite-zone potential $u(x)$ there corresponds a hyperelliptic curve

$\mu^2 = \prod_{j=0}^{2n} (\lambda - \lambda_j)$

with simple zeros $\lambda_j$ of the discriminant as zeros of polynomials defining the curve. If we consider the Dirichlet boundary value problem for the operator $L$,

$L f = \lambda f, \qquad f(s, \lambda) = 0 = f(s + l, \lambda)$

the eigenvalues are discrete and each eigenvalue $\gamma_j$ is located in a zone:

$\lambda_{2j-1} \le \gamma_j(s) \le \lambda_{2j}$

So, for the double zeros ($\lambda_{2j-1} = \lambda_{2j}$), the corresponding Dirichlet eigenvalue $\gamma_j(s)$ does not depend on $s$.

Dubrovin–Novikov also showed that a finite-zone potential is a stationary solution of the higher-order KdV equation (the order being equal to the number of nontrivial zones) and the n-zonal potentials form a finite-dimensional integrable system. In other words, the linear operators $L$, $A_n$ defining the $n$th order KdV equations commute,

$[L, A_n] = 0$

In passing, it was later found that such a pair of commuting linear differential operators was first studied by Burchnall–Chaundy in the 1920s. H F Baker remarked on the corresponding simultaneous eigenfunctions by relating them to multiplicative functions on algebraic curves.

The Work of Krichever

Krichever reversed the above argument, utilizing the properties of corresponding eigenfunctions as a function of the spectral parameter. In this approach, we start with a compact Riemann surface $C$ (= nonsingular algebraic curve) of genus $g$. Here we apply his method to the KP hierarchy. Take a point $P_0$ on $C$ together with the inverse of a local parameter $k^{-1}$. Also take a general divisor $\delta$ on $C$ of degree $g$. Consider a function $\psi(x, P)$, $x = (x_1, x_2, \ldots)$, with the following properties:

1. $\psi$ is meromorphic on $C \setminus P_0$ with the pole divisor $\delta$, and
2. near $P_0$, $\psi$ behaves like

$\psi(x, P) = \exp\left(\sum_{j=1}^{\infty} x_j k^j\right)\bigl(1 + O(k^{-1})\bigr)$

Such a $\psi$ exists uniquely and can be constructed using the theory of abelian integrals and the Jacobi inversion problem on algebraic curves. Such a function was called the Baker–Akhiezer function, since Akhiezer constructed it by using abelian integrals and Jacobi’s
problem in his study of moment problems (orthogonal polynomials).

It was later realized that Schur had much earlier considered such functions in the study of ordinary differential equations.

It is easy to show that such a function satisfies the following linear differential equations:

$\frac{\partial \psi}{\partial x_n} = \left(\left(\frac{\partial}{\partial x_1}\right)^{n} + \sum_{j=0}^{n-1} u_j(x) \left(\frac{\partial}{\partial x_1}\right)^{j}\right)\psi, \qquad n = 2, 3, \ldots$

In this way, we obtain a solution of the KP hierarchy.

If there exists a rational function $f(P)$ on $C$ with poles only at $P_0$ with singular part $k^n$, $\psi$ can be factorized as

discussed modulation of the KdV equation by using the averaging method of Whitham. This opens the way to study the quasiclassical limit of soliton equations. This aspect was further studied by Dubrovin and others in connection with topological field theory.

Quite recently, Noumi and Yamada gave a generalization of the Painlevé equation in many variables by using the idea of similarity solutions of soliton equations. In the work of Noumi–Yamada, the affine Weyl group and $\tau$-functions play an essential role in constructing generalizations of the Painlevé equation. The shift or the unit of difference corresponds to imaginary null roots of affine Lie algebras. The idea is further applied to elliptic Painlevé equations.
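As a numerical sanity check of Hirota’s method described earlier in this article, the following sketch evaluates the bilinear KdV equation $(D_x^4 - 4D_xD_t) f \cdot f = 0$ on the two-soliton truncation of [8]. The function names are illustrative only, and the coefficient $c_{12} = ((p_1 - p_2)/(p_1 + p_2))^2$ is the standard two-soliton value, assumed here because the text does not write it out. The expanded forms $D_x^4 f \cdot f = 2(f f_{xxxx} - 4 f_x f_{xxx} + 3 f_{xx}^2)$ and $D_x D_t f \cdot f = 2(f f_{xt} - f_x f_t)$ follow from definition [7].

```python
import math


def two_soliton_derivative(x, t, p1, p2, q1=0.0, q2=0.0):
    """Return d(nx, nt), the mixed derivative of the two-soliton f at (x, t).

    f = 1 + e1 + e2 + c12*e1*e2 with e_j = exp(2 p_j x + 2 p_j^3 t + q_j).
    Every term is an exponential a*exp(k*x + w*t), so derivatives are exact.
    c12 is the standard two-soliton coefficient (an assumption, see lead-in).
    """
    c12 = ((p1 - p2) / (p1 + p2)) ** 2
    terms = [  # (amplitude a, wavenumber k, frequency w)
        (1.0, 0.0, 0.0),
        (math.exp(q1), 2 * p1, 2 * p1 ** 3),
        (math.exp(q2), 2 * p2, 2 * p2 ** 3),
        (c12 * math.exp(q1 + q2), 2 * (p1 + p2), 2 * (p1 ** 3 + p2 ** 3)),
    ]

    def d(nx, nt):
        return sum(a * k ** nx * w ** nt * math.exp(k * x + w * t)
                   for a, k, w in terms)

    return d


def hirota_kdv_residual(x, t, p1, p2):
    """(D_x^4 - 4 D_x D_t) f.f divided by f^2, which should vanish."""
    d = two_soliton_derivative(x, t, p1, p2)
    f, fx, fxx, fxxx, fxxxx = d(0, 0), d(1, 0), d(2, 0), d(3, 0), d(4, 0)
    ft, fxt = d(0, 1), d(1, 1)
    dx4 = 2 * (f * fxxxx - 4 * fx * fxxx + 3 * fxx ** 2)  # D_x^4 f.f
    dxdt = 2 * (f * fxt - fx * ft)                        # D_x D_t f.f
    return (dx4 - 4 * dxdt) / f ** 2


residual = hirota_kdv_residual(0.4, -0.2, 0.7, 0.3)
```

The residual is zero up to rounding error, which illustrates the truncation of the perturbative series at second order for two solitons.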
Derrick argument involves studying what happens to the energy of a field when one changes the scale of space. If one has a scalar field (or multiplet of scalar fields) $\phi$, and/or a gauge field $F_{\mu\nu}$, then the static energy $E$ is the sum of terms such as

$E_0 = \int V(\phi)\, d^n x, \qquad E_d = \int T_d(D_j\phi)\, d^n x, \qquad E_F = \int F_{jk} F_{jk}\, d^n x$

where each integral is over ($n$-dimensional) space $\mathbb{R}^n$, $D_j\phi$ denotes the covariant spatial derivative of $\phi$, and $T_d(\xi_j)$ is a real-valued polynomial of degree $d$. In particular, for example, we could have $T_2(D_j\phi) = (D_j\phi)(D_j\phi)$, the standard gradient term. Under the dilation $x^j \mapsto \lambda x^j$, these functionals transform as

$E_0 \mapsto \lambda^{-n} E_0, \qquad E_d \mapsto \lambda^{d-n} E_d, \qquad E_F \mapsto \lambda^{4-n} E_F$

In order to have a static solution (critical point of the static energy functional), one needs to have a zero exponent on $\lambda$, and/or a balance between positive and negative exponents. A negative exponent indicates a compressing force (tending to implode a localized lump), whereas a positive exponent indicates an expanding force; so to have a static lump solution, these two forces have to balance each other. For $n = 1$, a system involving only a scalar field, with terms of the form $E_0$ and $E_2$, can admit static solitons (e.g., kinks); the scaling argument implies a virial theorem, which in this case says that $E_0 = E_2$. For $n = 2$, one can have a scalar system with only $E_2$, since in this case the relevant exponent is zero (e.g., the two-dimensional sigma model). Another $n = 2$ example is that of vortices in the abelian Higgs model, where the energy contains terms $E_0$, $E_2$, and $E_F$. For $n = 3$, interesting systems have $E_2$ together with either $E_4$ (e.g., skyrmions) or $E_F$ (e.g., monopoles). An $E_0$ term is optional in these cases; its presence affects, in particular, the long-range properties of the solitons. For $n = 4$, one can have instantons in a pure gauge theory (term $E_F$ only).

It should be noted that if there are no restrictions on the fields $\phi$ and $A_j$ (such as those arising, e.g., from nontrivial topology), then there is a more obvious mode of instability, which will inevitably be present: $\phi \mapsto \mu\phi$ and/or $A_j \mapsto \mu A_j$, where $0 \le \mu \le 1$. In other words, the fields can simply be scaled away altogether, so that the height of the soliton (and its energy) go smoothly to zero. This can be prevented by nontrivial topology.

Another way of preventing solitons from shrinking is to allow the field to have some "internal" time dependence, so that it is stationary rather than static. For example, one could allow the complex scalar field $\phi$ to have the form $\phi = \psi \exp(i\omega t)$, where $\psi$ is independent of time $t$. This leads to something like a centrifugal force, which can have a stabilizing effect in the absence of Skyrme or magnetic terms. The corresponding solitons are Q-balls.

Kinks and Breathers

The simplest topological solitons are kinks, in systems involving a real-valued scalar field $\phi(x)$ in one spatial dimension. The dynamics is governed by the Lagrangian density

$L = \frac{1}{2}\left[(\phi_t)^2 - (\phi_x)^2 - W(\phi)^2\right]$

where $W(\phi)$ is a (fixed) smooth function. The system can admit kinks if $W(\phi)$ has at least two zeros, for example, $W(A) = W(B) = 0$ with $W(\phi) > 0$ for $A < \phi < B$. Two well-known systems are: sine-Gordon (where $W(\phi) = 2\sin(\phi/2)$, $A = 0$, and $B = 2\pi$) and $\phi^4$ (where $W(\phi) = 1 - \phi^2$, $A = -1$, and $B = 1$). The corresponding field equations are the Euler–Lagrange equations for $L$; for example, the sine-Gordon equation is

$\phi_{tt} - \phi_{xx} + \sin\phi = 0$   [1]

Configurations satisfying the boundary conditions $\phi \to A$ as $x \to -\infty$ and $\phi \to B$ as $x \to \infty$ are called kinks (and the corresponding ones with $x = \infty$ and $x = -\infty$ interchanged are antikinks). For kink (or antikink) configurations, there is a lower bound, called the Bogomol’nyi bound, on the static energy $E[\phi]$; for kink boundary conditions, we have

$E[\phi] = \frac{1}{2}\int_{-\infty}^{\infty} \left[(\phi_x)^2 + W(\phi)^2\right] dx = \frac{1}{2}\int_{-\infty}^{\infty} \left[\phi_x - W(\phi)\right]^2 dx + \int_{-\infty}^{\infty} W(\phi)\,\phi_x\, dx \ \ge\ \int_A^B W(\phi)\, d\phi$

with equality if and only if the Bogomol’nyi equation

$\frac{d\phi}{dx} = W(\phi)$   [2]

is satisfied. A static solution of the Bogomol’nyi equation is a kink solution; it is a static minimum of the energy functional in the kink sector. For example, for the sine-Gordon system, we get $E[\phi] \ge 8$, with equality for the sine-Gordon kink

$\phi(x) = 4\tan^{-1}\exp(x - x_0)$

while for the $\phi^4$ system, we get $E[\phi] \ge 4/3$, with equality for the phi-four kink

$\phi(x) = \tanh(x - x_0)$
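The two Bogomol’nyi bounds quoted above can be checked numerically. The following sketch integrates the static energy $E[\phi] = \frac{1}{2}\int[(\phi_x)^2 + W(\phi)^2]\,dx$ for both kink profiles (with $x_0 = 0$) using a trapezoid rule. The helper name `kink_energy`, the integration window, and the step count are illustrative choices rather than anything prescribed by the text.

```python
import math


def kink_energy(phi, W, x_min=-20.0, x_max=20.0, n=40000):
    """Trapezoid-rule estimate of E = (1/2) * integral of [phi_x^2 + W(phi)^2] dx."""
    h = (x_max - x_min) / n
    total = 0.0
    for i in range(n + 1):
        x = x_min + i * h
        # central difference for phi_x with half-step h/2
        dphi = (phi(x + 0.5 * h) - phi(x - 0.5 * h)) / h
        density = 0.5 * (dphi ** 2 + W(phi(x)) ** 2)
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end weights
        total += weight * density * h
    return total


def sine_gordon_kink(x):
    return 4.0 * math.atan(math.exp(x))


def sine_gordon_W(p):
    return 2.0 * math.sin(p / 2.0)


def phi4_kink(x):
    return math.tanh(x)


def phi4_W(p):
    return 1.0 - p * p


E_sg = kink_energy(sine_gordon_kink, sine_gordon_W)   # Bogomol'nyi value 8
E_phi4 = kink_energy(phi4_kink, phi4_W)               # Bogomol'nyi value 4/3
```

Both estimates agree with the bounds 8 and 4/3 to within the discretization error, consistent with the kinks satisfying $d\phi/dx = W(\phi)$.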
604 Solitons and Other Extended Field Configurations
These kinks are stable topological solitons; the nontrivial topology corresponds to the fact that the boundary value of $\phi(t, x)$ at $x = \infty$ is different from that at $x = -\infty$. With trivial boundary conditions (say $\phi \to A$ as $x \to \pm\infty$), stable static solitons are unlikely to exist, but solitons with periodic time dependence (which in this context are called breathers) may exist. For example, the sine-Gordon equation and the nonlinear Schrödinger equation both admit breathers, but these owe their existence to complete integrability. By contrast, the $\phi^4$ system (which is not integrable) does not admit breathers; a collision between a $\phi^4$ kink and an antikink (with suitable impact speed) produces a long-lived state which looks like a breather, but eventually decays into radiation.

In lattice systems, however, breathers are more generic. In a one-dimensional lattice system, the continuous space $\mathbb{R}$ is replaced by the lattice $\mathbb{Z}$, so $\phi(t, x)$ is replaced by $\phi_n(t)$, where $n \in \mathbb{Z}$. The Lagrangian is

$L = \frac{1}{2}\sum_n \left[(\dot\phi_n)^2 - h^{-2}(\phi_{n+1} - \phi_n)^2 - W(\phi_n)^2\right]$

where $h$ is a positive parameter, corresponding to the dimensionless ratio between the lattice spacing and the size of a kink. The continuum limit is $h \to 0$. This system admits kink solutions as in the continuum case; and for $h$ large enough, it admits breathers as well, but these disappear as $h$ becomes small.

Interpreted in three dimensions, the kink becomes a domain wall separating two regions in which the order parameter takes distinct values; this has applications in such diverse areas as cosmology and condensed matter physics.

Sigma Models and Skyrmions

In a sigma model or Skyrme system, the field is a map $\phi$ from spacetime to a Riemannian manifold $M$; generally, $M$ is taken to be a Lie group or a symmetric space. The energy density of a static field can be constructed as follows (the Lorentz-invariant extension of this gives a relativistic Lagrangian for fields on spacetime). Let $\phi^a$ be local coordinates on the $m$-dimensional manifold $M$, let $h_{ab}$ denote the metric of $M$, and let $x^j$ denote the spatial coordinates on space $\mathbb{R}^n$. An $m \times m$ matrix $D$ is defined by

$D_a{}^b = (\partial_j \phi^c)\, h_{ac}\, (\partial_j \phi^b)$

where $\partial_j$ denotes derivatives with respect to the $x^j$. Then the invariants $\mathcal{E}_2 = \mathrm{tr}(D) = |\partial_j \phi^a|^2$ and $\mathcal{E}_4 = (1/2)[(\mathrm{tr}\, D)^2 - \mathrm{tr}(D^2)]$ can be terms in the energy density, as well as a zeroth-order term $\mathcal{E}_0 = V(\phi^a)$ not involving derivatives of $\phi$. A term of the form $\mathcal{E}_4$ is called a Skyrme term.

The boundary condition on field configurations is that $\phi$ tends to some constant value $\phi_0 \in M$ as $|x| \to \infty$ in $\mathbb{R}^n$. From the topological point of view, this compactifies $\mathbb{R}^n$ to $S^n$. In other words, $\phi$ extends to a map from $S^n$ to $M$; and such maps are classified topologically by the homotopy group $\pi_n(M)$. For topological solitons to exist, this group has to be nontrivial.

In one spatial dimension ($n = 1$) with $M = S^1$ (say), the expression $\mathcal{E}_4$ is identically zero, and we just have kink-type systems such as sine-Gordon. The simplest two-dimensional example ($n = 2$) is the O(3) sigma model, which has $M = S^2$ with its standard metric. In this system, the field is often expressed as a unit 3-vector field $\boldsymbol{\phi} = (\phi^1, \phi^2, \phi^3)$, with $\mathcal{E}_2 = (\partial_j \boldsymbol{\phi}) \cdot (\partial_j \boldsymbol{\phi})$. Here the configurations are classified topologically by their degree (or winding number, or topological charge) $N \in \pi_2(S^2) \cong \mathbb{Z}$, which equals

$N = \frac{1}{4\pi}\int \boldsymbol{\phi} \cdot \partial_1 \boldsymbol{\phi} \times \partial_2 \boldsymbol{\phi}\ dx^1\, dx^2$

Instead of $\boldsymbol{\phi}$, it is often convenient to use a single complex-valued function $W$ related to $\boldsymbol{\phi}$ by the stereographic projection $W = (\phi^1 + i\phi^2)/(1 - \phi^3)$. In terms of $W$, the formula for the degree $N$ is

$N = \frac{i}{2\pi}\int \frac{W_1 \bar{W}_2 - W_2 \bar{W}_1}{(1 + |W|^2)^2}\ dx^1\, dx^2$

and the static energy is (with $z = x^1 + ix^2$)

$E = \int \mathcal{E}_2\, d^2x = 8\int \frac{|W_z|^2 + |W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x = 16\int \frac{|W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x + 8\int \frac{|W_z|^2 - |W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x = 16\int \frac{|W_{\bar z}|^2}{(1 + |W|^2)^2}\, d^2x + 8\pi N$

From this, one sees that $E$ satisfies the Bogomol’nyi bound $E \ge 8\pi N$, and that minimal-energy solutions correspond to solutions of the Cauchy–Riemann equations $W_{\bar z} = 0$. To have finite energy, $W(z)$ has to be a rational function, and so solutions with winding number $N$ correspond to rational meromorphic functions $W(z)$, of degree $|N|$. (If $N < 0$, then $W$ is a rational function of $\bar z$.) The energy is scale invariant (conformally invariant), and consequently these solutions are not solitons; they are not quite stable, since their size is not fixed. Adding terms $\mathcal{E}_4$ and $\mathcal{E}_0$
to the energy density fixes the soliton size, and the where Dj := @j iAj , and where is a positive
resulting two-dimensional Skyrme systems admit constant. The boundary conditions are
true topological solitons.
The three-dimensional case (n = 3), with M being Dj ¼ 0; B ¼ 0; jj ¼ 1 ½4
a simple Lie group, is the original Skyrme model of as r ! 1. If we consider a very large circle C on R2 ,
nuclear physics. If M = SU(2), then the integer N 2 so that [4] holds on C, then jC is a map from the
3 (SU(2)) ffi Z is interpreted as the baryon number. circle C to the circle of unit radius in the complex
The (quantum) excitations of the -field correspond plane, and therefore it has an integer winding
to the pions, whereas the (semiclassical) solitons number N. Thus configurations are labeled by this
correspond to the nucleons. This model emerges as vortex number N.
an effective theory of quantum chromodynamics Note that if E vanishes, then B = 0 and jj = 1: the
(QCD), in the limit where the number of colors is gauge symmetry is spontaneously broken, and the
large. If we express the field as a function U(xj ) photon ‘‘acquires a mass’’: this is a standard
taking values in a Lie group, then Lj = U1 @j U takes example of spontaneous symmetry breaking.
values in the corresponding Lie algebra, and E 2 and R
The total magnetic flux B d2 x equals 2N; a
E 4 take the form proof of this is as follows. Let be the usual polar
coordinate around C. Because jj = 1 on C, we can
E 2 ¼ 12 trðLj Lj Þ write = exp [if ()] for some function f; this f need
E 4 ¼ 161
tr ½Lj ; Lk ½Lj ; Lk not be single-valued, but must satisfy f (2)
$f(0) = 2\pi N$ with $N$ being an integer (in order that $\Phi$ be single-valued). In fact, this defines the winding number. Now since $D_j\Phi = \partial_j\Phi - iA_j\Phi = 0$ on $C$, we have
    $A_j = -i\,\Phi^{-1}\partial_j\Phi = \partial_j f$
on $C$. So, using Stokes' theorem, we get
    $\int_{\mathbb{R}^2} B\,d^2x = \oint_C A_j\,dx^j = \int_0^{2\pi} \frac{df}{d\theta}\,d\theta = 2\pi N$
If $\lambda = 1$, then the total energy $E = \int \mathcal{E}\,d^2x$ satisfies the Bogomol'nyi bound $E \ge \pi N$; $E = \pi N$ if and only if a set of partial differential equations (the Bogomol'nyi equations) is satisfied. Since like charges repel, the magnetic force between vortices is repulsive. However, there is also a force from the Higgs field, and this is attractive. The balance between the two forces is determined by $\lambda$: if $\lambda > 1$, the vortices repel each other; whereas if $\lambda < 1$, the vortices attract. In the critical case $\lambda = 1$, the force between vortices is exactly balanced, and there exist static multi-vortex solutions. In fact, one has the following: given $N$ points in the plane, there exists an $N$-vortex solution of the Bogomol'nyi equations (and hence of the full field equations) with $\Phi$ vanishing at the chosen points (and nowhere else). All static solutions are of this form. These solutions cannot, however, be written down explicitly in terms of elementary functions (except of course for $N = 0$).

The static energy density in the basic Skyrme system is the sum of these two terms. The static energy satisfies a Bogomol'nyi bound $E \ge 12\pi^2 |N|$, and it is believed that stable solitons (skyrmions) exist for each value of $N$. Classical skyrmions have been investigated numerically; for values of $N$ up to 25, they turn out to resemble polyhedral shells. Comparison with nucleon phenomenology requires semiclassical quantization, and this leads to results which are at least qualitatively correct.
A variant of the Skyrme model is the Skyrme–Faddeev system, which has $n = 3$ and $M = S^2$; the solitons in this case resemble loops which can be linked or knotted, and which are classified by their Hopf number $N \in \pi_3(S^2)$. In this case, the energy satisfies a lower bound of the form $E \ge cN^{3/4}$. Numerical experiments indicate that for each $N$, there is a minimal-energy solution with Hopf number $N$, and with energy close to this topological lower bound.

Abelian Higgs Vortices

Vortices live in two spatial dimensions; viewed in three dimensions, they are string-like. Two of their applications are as cosmic strings and as magnetic flux tubes in superconductors. They occur as static topological solitons in the abelian Higgs model (or Ginzburg–Landau model), and involve a magnetic field $B = \partial_1 A_2 - \partial_2 A_1$, coupled to a complex scalar field $\Phi$, on the plane $\mathbb{R}^2$. The energy density is
    $\mathcal{E} = \frac{1}{2}(D_j\Phi)\overline{(D_j\Phi)} + \frac{1}{2}B^2 + \frac{1}{8}\lambda(1 - |\Phi|^2)^2$   [3]
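The winding number $N$ of a vortex configuration, defined through the phase $f$ of $\Phi$ on a large circle $C$, can be estimated numerically from samples of the field on that circle. The following is a minimal sketch, not a solution of the Bogomol'nyi equations: the function name `winding_number` and the test field (a polynomial with three zeros inside the circle, normalized to unit modulus) are assumptions chosen only for illustration.

```python
import numpy as np

def winding_number(phi):
    """Winding number of complex samples taken once around a closed loop.

    Sums the phase increments between consecutive samples, each wrapped
    into (-pi, pi], and divides the total by 2*pi.
    """
    phi = np.asarray(phi, dtype=complex)
    steps = np.angle(np.roll(phi, -1) / phi)
    return int(round(steps.sum() / (2.0 * np.pi)))

# Sample points on a circle of radius 5 (playing the role of C).
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
loop = 5.0 * np.exp(1j * theta)

# Illustrative field with three zeros inside the circle, scaled to |phi| = 1.
phi = (loop - 1.0) * (loop + 2.0j) * (loop - (0.5 - 0.5j))
phi = phi / np.abs(phi)

n_wind = winding_number(phi)
print(n_wind)
```

For this sample field the phase advances by $2\pi$ once for each enclosed zero, so the computed winding number counts the zeros inside the circle, in line with the statement that an $N$-vortex solution vanishes at $N$ chosen points.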
denote the gauge potential and gauge field. The Yang–Mills action is
    $S = -\frac{1}{4}\int \mathrm{tr}\,(F_{\mu\nu}F_{\mu\nu})\,d^4x$   [8]
where we assume a boundary condition, at infinity in $\mathbb{R}^4$, such that this integral converges. The Euler–Lagrange equations which describe critical points of the functional $S$ are the Yang–Mills equations

bundles over complex projective 3-space (twistor space). One large class of solutions which can be written out explicitly is as follows: for $N = 1$ and $N = 2$ it gives all instantons, while for $N \ge 3$ it gives a $(5N + 4)$-dimensional subfamily of the full $(8N - 3)$-dimensional solution space. The gauge potentials in this class have the form
    $A_\mu = i\,\sigma_{\mu\nu}\,\partial_\nu \log\phi$   [11]
where $\psi(x)$ is a complex scalar field depending only on the spatial variables $x$. The best-known case is the 1-soliton solution
    $\Phi(t, x) = a\sqrt{2}\,\exp(ia^2 t)\,\mathrm{sech}(ax)$
of the nonlinear Schrödinger equation $i\Phi_t + \Phi_{xx} + \Phi|\Phi|^2 = 0$.
More generally, consider a system (in $n$ spatial dimensions) with Lagrangian
    $L = \frac{1}{2}(\partial_\mu\Phi)(\partial^\mu\bar\Phi) - U(|\Phi|)$
where $\Phi(x^\mu)$ is a complex-valued field. Associated with the global phase symmetry is the conserved Noether charge $Q = \int \mathrm{Im}(\bar\Phi\,\Phi_t)\,d^n x$. Minimizing the energy of a configuration subject to $Q$ being fixed implies that $\Phi$ has the form [12]. Without loss of generality, we may take $\omega \ge 0$. Note that $Q = \omega I$, where $I = \int |\psi|^2\,d^n x$. The energy of a configuration of the form [12] is $E = E_q + E_k + E_p$, where
    $E_q = \frac{1}{2}\int |\partial_j\psi|^2\,d^n x$
    $E_k = \frac{1}{2}I\omega^2 = \frac{1}{2}Q^2/I$
    $E_p = \int U(|\psi|)\,d^n x$
Let us take $U(0) = 0 = U'(0)$, with the field satisfying the boundary condition $\psi \to 0$ as $r \to \infty$.
A stationary Q-lump is a critical point of the energy functional $E[\psi]$, subject to $Q$ having some fixed value. The usual (Derrick) scaling argument shows that any stationary Q-lump must satisfy
    $(2 - n)E_q - nE_p + nE_k = 0$   [13]
For simplicity, in what follows, let us take $n \ge 3$. Define $m > 0$ by $U''(0) = m^2$; then, near spatial infinity, the Euler–Lagrange equations give $\nabla^2\psi - (m^2 - \omega^2)\psi = 0$. So, in order to satisfy the boundary condition $\psi \to 0$ as $r \to \infty$, we need $\omega < m$.
It is clear from [13] that if $U \ge \frac{1}{2}m^2|\psi|^2$ everywhere, then there can be no solution. So $K = \min[2U(|\psi|)/|\psi|^2]$ has to satisfy $K < m^2$. Also, we have
    $E_p = \int U \ge \frac{1}{2}KI = (K/\omega^2)E_k > (K/\omega^2)E_p$   [14]
where the final inequality comes from [13]. As a consequence, we see that $\omega^2$ is restricted to the range
    $K < \omega^2 < m^2$   [15]
An example which has been studied in some detail is $U(f) = f^2[1 + (1 - f^2)^2]$; here $m^2 = 4$ and $K = 2$, so the range of frequency for Q-balls in this system is $\sqrt{2} < \omega < 2$. The dynamics of Q-balls in systems such as these turns out to be quite complicated.

See also: Abelian Higgs Vortices; Homoclinic Phenomena; Integrable Systems: Overview; Instantons: Topological Aspects; Noncommutative Geometry from Strings; Sine-Gordon Equation; Topological Defects and Their Homotopy Classification.

Further Reading
Atiyah MF and Hitchin NJ (1988) The Geometry and Dynamics of Magnetic Monopoles. Princeton: Princeton University Press.
Coleman S (1988) Aspects of Symmetry. Cambridge: Cambridge University Press.
Drazin PG and Johnson RS (1989) Solitons: An Introduction. Cambridge: Cambridge University Press.
Goddard P and Mansfield P (1986) Topological structures in field theories. Reports on Progress in Physics 49: 725–781.
Jaffe A and Taubes C (1980) Vortices and Monopoles. Boston: Birkhäuser.
Lee TD and Pang Y (1992) Nontopological solitons. Physics Reports 221: 251–350.
Makhankov VG, Rubakov YP, and Sanyuk VI (1993) The Skyrme Model: Fundamentals, Methods, Applications. Berlin: Springer.
Manton NS and Sutcliffe PM (2004) Topological Solitons. Cambridge: Cambridge University Press.
Rajaraman R (1982) Solitons and Instantons. New York: North-Holland.
Rebbi C and Soliani G (1984) Solitons and Particles. Singapore: World Scientific.
Vilenkin A and Shellard EPS (1994) Cosmic Strings and Other Cosmological Defects. Cambridge: Cambridge University Press.
Ward RS and Wells RO Jr. (1990) Twistor Geometry and Field Theory. Cambridge: Cambridge University Press.
Zakrzewski WJ (1989) Low Dimensional Sigma Models. Bristol: IOP.
Source Coding in Quantum Information Theory 609
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Two key issues of classical and quantum information theory are storage and transmission of information. An information source produces some outputs (or signals) more frequently than others. Due to this redundancy, one can reduce the amount of space needed for its storage without compromising on its content. This data compression is done by a suitable encoding of the output of the source. In contrast, in the transmission of information through a channel, it is often advantageous to add redundancy to a message, in order to combat the effects of noise. This is done in the form of error-correcting codes. The amount of redundancy which needs to be added to the original message depends on how much noise is present in the channel (see, e.g., Nielsen and Chuang (2000)). Hence, redundancy plays complementary roles in data compression and transmission of data through a noisy channel. In this review we focus only on data compression in quantum information theory.
In classical information theory, Shannon showed that there is a natural limit to the amount of compression that can be achieved. It is given by the Shannon entropy. The analogous concept in quantum information theory is the von Neumann entropy. Here, we review some of the main results of quantum data compression and the significance of the von Neumann entropy in this context.
The review is structured as follows. We first give a brief introduction to the Shannon entropy and classical data compression. This is followed by a discussion of quantum entropy and the idea behind quantum source coding. We elaborate on data compression schemes for three different classes of quantum sources, namely memoryless sources, ergodic sources, and sources modeled by Gibbs states of quantum spin systems. In the bulk of the review, we concentrate on source-dependent, fixed-length coding schemes. We conclude with a brief discussion of universal and variable-length coding.
We would like to point out that this review article is by no means complete. Due to a restriction on its length, we had to leave out various important aspects and developments of quantum source coding.

A simple model of a classical information source consists of a sequence of discrete random variables $X_1, X_2, \ldots, X_n$, whose values represent the output of the source. Each random variable $X_i$, $1 \le i \le n$, takes values $x_i$ from a finite set, the source alphabet $\mathcal{X}$. Hence, $X^{(n)} := (X_1, \ldots, X_n)$ takes values $x^{(n)} := (x_1, \ldots, x_n) \in \mathcal{X}^n$. We recall the definition of entropy (or information content) of a source.
If the discrete random variables $X_1, \ldots, X_n$ which take values from a finite alphabet $\mathcal{X}$ have joint probabilities
    $P(X_1 = x_1, \ldots, X_n = x_n) = p_n(x_1, \ldots, x_n)$
then the Shannon entropy of this source is defined by
    $H(X_1, \ldots, X_n) = -\sum_{x_1 \in \mathcal{X}} \cdots \sum_{x_n \in \mathcal{X}} p_n(x_1, \ldots, x_n) \log p_n(x_1, \ldots, x_n)$   [1]
Here and in the following, the logarithm is taken to the base 2. This is because the fundamental unit of classical information is a "bit," which takes two values 0 and 1. Notice that $H(X_1, \ldots, X_n)$ in fact only depends on the (joint) probability mass function (p.m.f.) $p_n$ and can also be denoted as $H(p_n)$. There are several other concepts of entropy, for example, relative entropy, conditional entropy, and mutual information. See, for example, Cover and Thomas (1991) and Nielsen and Chuang (2000). It is easy to see that
1. $0 \le H(X_1, \ldots, X_n) \le n \log|\mathcal{X}|$, where $|\mathcal{X}|$ denotes the number of letters in the alphabet $\mathcal{X}$. Two other important properties are as follows:
2. $H(X_1, \ldots, X_n)$ is jointly concave in $X_1, \ldots, X_n$ and
3. $H(X_1, \ldots, X_n) \le H(X_1, \ldots, X_m) + H(X_{m+1}, \ldots, X_n)$ for $m < n$.
The latter property is called subadditivity.
In the next section, analogous quantities are introduced for quantum information and the corresponding properties are stated.
Suppose that the random variables $X_1, X_2, \ldots, X_n$ are independent and identically distributed (i.i.d.). Then the entropy of each random variable modeling the source is the same and can be denoted by $H(X)$. From the point of view of classical information theory, the Shannon entropy has an important operational definition. It quantifies the minimal
physical resources needed to store data from a classical information source and provides a limit to which data can be compressed reliably (i.e., in a manner in which the original data can be recovered later with a low probability of error). Shannon showed that the original data can be reliably obtained from the compressed version only if the rate of compression is greater than the Shannon entropy. This result is formulated in Shannon's noiseless channel coding theorem (Shannon 1948, Cover and Thomas 1991, Nielsen and Chuang 2000) given later.

It is known that $(X_n)_{n \in \mathbb{Z}}$ is ergodic if and only if its probability distribution is extremal in the set of invariant probability measures. The generalization of Theorem 1 (McMillan 1953, Breiman 1957) now reads:
Theorem 2 (Shannon–McMillan–Breiman theorem). Suppose that the sequence $(X_n)_{n \in \mathbb{Z}}$ is ergodic. Then
    $\lim_{n \to \infty} -\frac{1}{n} \log p_n(X_1, \ldots, X_n) = h_{KS}$ with probability 1   [4]
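For an i.i.d. source the limit in [4] is simply the Shannon entropy $H(X)$ of a single letter, and the convergence can be observed in a small simulation. The following is a sketch under stated assumptions: the four-letter distribution, the sample size, and the helper names `shannon_entropy` and `empirical_rate` are illustrative choices and are not taken from the article.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector (0 log 0 := 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def empirical_rate(p, n, rng):
    """Draw n i.i.d. letters with distribution p and return
    -(1/n) log2 p_n(x_1, ..., x_n) for the sampled sequence."""
    p = np.asarray(p, dtype=float)
    x = rng.choice(len(p), size=n, p=p)
    # For an i.i.d. source p_n factorizes, so the log is a sum over letters.
    return float(-np.log2(p[x]).sum() / n)

rng = np.random.default_rng(0)
p = [0.5, 0.25, 0.125, 0.125]          # illustrative source distribution
H = shannon_entropy(p)                  # 1.75 bits for this distribution
rate = empirical_rate(p, 200_000, rng)
print(H, rate)
```

With this many samples the empirical rate differs from $H(X)$ only by a small statistical fluctuation, which is the content of the theorem in the memoryless case.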
compressed bits and maps them back to a string of $n$ letters from the alphabet $\mathcal{X}$: $D_n : y \in \{0, 1\}^{\lceil nR \rceil} \mapsto x' = (x'_1, \ldots, x'_n)$. A compression–decompression scheme is said to be "reliable" if the probability that $x' \ne x$ tends to 0 as $n \to \infty$. Shannon's noiseless channel coding theorem (Shannon 1948, Cover and Thomas 1991) now states
Theorem 4 (Shannon). Suppose that $\{X_i\}$ is an i.i.d. information source, with $X_i \sim p(x)$ and Shannon entropy $H(X)$. If $R > H(X)$ then there exists a reliable compression scheme of rate $R$ for the source. Conversely, any compression scheme with rate $R < H(X)$ is not reliable.
Proof (sketch). Suppose $R > H(X)$. Choose $\varepsilon > 0$ such that $H(X) + \varepsilon < R$. Consider the set $T_\varepsilon^{(n)}$ of typical sequences. The method of compression is then to examine the output of the source, to see if it belongs to $T_\varepsilon^{(n)}$. If the output is a typical sequence, then we compress the data by simply storing an index for the particular sequence using $\lceil nR \rceil$ bits in the obvious way. If the input string is not typical, then we compress the string to some fixed $\lceil nR \rceil$ bit string, for example, $(00 \ldots 000)$. In this case, data compression effectively fails, but, in spite of this, the compression–decompression scheme succeeds with probability tending to 1 as $n \to \infty$, since by Theorem 3 the probability of atypical sequences can be made small by choosing $n$ large enough.
If $R < H(X)$, then any compression scheme of rate $R$ is not reliable. This also follows from Theorem 3 by the following argument. Let $S(n)$ be a collection of sequences $x^{(n)}$ of size $|S(n)| \le 2^{\lceil nR \rceil}$. Then the subset of atypical sequences in $S(n)$ is highly improbable, whereas the corresponding subset of typical sequences has probability bounded by $2^{nR}\,2^{-nH(X)} \to 0$ as $n \to \infty$. □

Quantum Data Compression

Quantum Sources and Entropy

In quantum information processing systems, information is stored in quantum states of physical systems. The most general description of a quantum state is provided by a density matrix.
A "density matrix" $\rho$ is a positive semidefinite operator on a Hilbert space $\mathcal{H}$, with $\mathrm{tr}\,\rho = 1$, and the expected value of an operator $A$ on $\mathcal{H}$ is given by
    $\varphi(A) = \mathrm{tr}(\rho A)$   [7]
The functional $\varphi$ on $\mathcal{M} = \mathcal{B}(\mathcal{H})$, the algebra of linear operators on $\mathcal{H}$, is positive (i.e., $\varphi(A) \ge 0$, if $A \ge 0$) and maps the identity $I \in \mathcal{M}$ to 1. Such a functional is also called a state. Conversely, given such a state on a finite-dimensional algebra $\mathcal{M}$, there exists a unique density matrix $\rho_\varphi$ such that [7] holds, so the concepts can be used interchangeably. (This is not true in the infinite-dimensional case.)
The quantum analog of the Shannon entropy is called the von Neumann entropy. For any quantum state $\varphi$ (or equivalently $\rho_\varphi$), it is defined by
    $S(\varphi) \equiv S(\rho_\varphi) := -\mathrm{tr}\,\rho_\varphi \log \rho_\varphi$   [8]
Here we use $\log$ to denote $\log_2$ and define $0 \log 0 \equiv 0$, as for the Shannon entropy. Let the density matrix $\rho_\varphi$ have a spectral decomposition
    $\rho_\varphi = \sum_{i=1}^{d} \lambda_i |\psi_i\rangle\langle\psi_i|$   [9]
Here $\{|\psi_i\rangle\}$ is the set of eigenvectors of $\rho_\varphi$. They form an orthonormal basis of the Hilbert space $\mathcal{H}$. By the fact that $\rho_\varphi$ is positive semidefinite and has trace 1, the eigenvalues $\lambda_i$ of $\rho_\varphi$ determine a probability distribution. When expressed in terms of the $\lambda_i$, the von Neumann entropy of $\rho_\varphi$ reduces to the Shannon entropy corresponding to this probability distribution (henceforth, the subscript $\varphi$ of $\rho_\varphi$ will be omitted): $S(\rho) = H(\bar\lambda)$, where $\bar\lambda = \{\lambda_1, \ldots, \lambda_d\}$.
The von Neumann entropy has properties analogous to $H(X_1, \ldots, X_n)$, in particular (Ohya and Petz 1993, Nielsen and Chuang 2000)
1. $0 \le S(\rho) \le \log(\dim(\mathcal{H}))$;
2. $S(\rho)$ is concave in $\rho$; and
3. if $\rho$ is a state on $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$ then $S(\rho) \le S(\rho_1) + S(\rho_2)$ if $\rho_1$ and $\rho_2$ are the restrictions of $\rho$ to $\mathcal{H}_1 \otimes I$ and $I \otimes \mathcal{H}_2$ respectively.
A "quantum information source" in general is defined by a sequence of density matrices $\rho^{(n)}$ on Hilbert spaces $\mathcal{H}_n$ of increasing dimensions $N_n$ given by a decomposition
    $\rho^{(n)} = \sum_k p_k^{(n)} |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|$   [10]
where the states $|\Psi_k^{(n)}\rangle$ are interpreted as the signal states, and the numbers $p_k^{(n)} \ge 0$ with $\sum_k p_k^{(n)} = 1$, as their probabilities of occurrence. The vectors $|\Psi_k^{(n)}\rangle \in \mathcal{H}_n$ need not be mutually orthogonal.

Compression–Decompression Scheme and Fidelity

To compress data from such a source one encodes each signal state $|\Psi_k^{(n)}\rangle$ by a state $\tilde\rho_k^{(n)} \in \mathcal{B}(\tilde{\mathcal{H}}_n)$ where $\dim \tilde{\mathcal{H}}_n = d_c(n) < N_n$. Thus, a compression scheme is a map $C^{(n)} : |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \mapsto \tilde\rho_k^{(n)} \in \mathcal{B}(\tilde{\mathcal{H}}_n)$. The state $\tilde\rho_k^{(n)}$ is referred to as the compressed state. A corresponding decompression scheme is a map $D^{(n)} : \mathcal{B}(\tilde{\mathcal{H}}_n) \to \mathcal{B}(\mathcal{H}_n)$. Both $C^{(n)}$ and $D^{(n)}$ must be
completely positive maps. In particular, this implies that $D^{(n)}$ must be of the form
    $D^{(n)}(\rho) = \sum_i D_i\,\rho\,D_i^{\dagger}$   [11]
for linear operators $D_i : \tilde{\mathcal{H}}_n \to \mathcal{H}_n$ such that $\sum_i D_i^{\dagger} D_i = I$ (see Nielsen and Chuang 2000). Obviously, in order to achieve the maximum possible compression of Hilbert space dimensions per signal state, the goal must be to make the dimension $d_c(n)$ as small as possible, subject to the condition that the information carried in the signal states can be retrieved with high accuracy upon decompression.
The "rate of compression" is defined as
    $R_n := \frac{\log(\dim \tilde{\mathcal{H}}_n)}{\log(\dim \mathcal{H}_n)} = \frac{\log d_c(n)}{\log N_n}$
It is natural to consider the original Hilbert space $\mathcal{H}_n$ to be the $n$-qubit space. In this case $N_n = 2^n$ and hence $\log N_n = n$. As in the case of classical data compression, we are interested in finding the optimal limiting rate of data compression, which in this case is given by
    $R_\infty := \lim_{n \to \infty} \frac{\log d_c(n)}{n}$   [12]
Unlike classical signals, quantum signal states are not completely distinguishable. This is because they are, in general, not mutually orthogonal. As a result, perfectly reconstructing a quantum signal state from its compressed version is often an impossible task and therefore too stringent a requirement for the reliability of a compression–decompression scheme. Instead, a reasonable requirement is that a state can be reconstructed from the compressed version which is nearly indistinguishable from the original signal state. A measure of indistinguishability useful for this purpose is the average fidelity defined as follows:
    $F_n := \sum_k p_k^{(n)} \langle\Psi_k^{(n)}| D^{(n)}(\tilde\rho_k^{(n)}) |\Psi_k^{(n)}\rangle$   [13]
This fidelity satisfies $0 \le F_n \le 1$ and $F_n = 1$ if and only if $D^{(n)}(\tilde\rho_k^{(n)}) = |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|$ for all $k$. A compression–decompression scheme is said to be reliable if $F_n \to 1$ as $n \to \infty$.
The key idea behind data compression is the fact that some signal states have a higher probability of occurrence than others (these states playing a role analogous to the typical sequences of classical information theory). These signal states span a subspace of the original Hilbert space of the source, which is referred to as the typical subspace.

Schumacher's Theorem for Memoryless Quantum Sources

The notion of a typical subspace was first introduced in the context of quantum information theory by Schumacher (1995) in his seminal paper. He considered the simplest class of quantum information sources, namely quantum memoryless or i.i.d. sources. For such a source the density matrix $\rho^{(n)}$, defined through [10], acts on a tensor product Hilbert space $\mathcal{H}_n = \mathcal{H}^{\otimes n}$ and is itself given by a tensor product
    $\rho^{(n)} = \pi^{\otimes n}$   [14]
Here $\mathcal{H}$ is a fixed Hilbert space (representing an elementary quantum subsystem) and $\pi$ is a density matrix acting on $\mathcal{H}$; for example, $\mathcal{H}$ can be a single qubit Hilbert space, in which case $\dim \mathcal{H} = 2$, $\mathcal{H}_n$ is the Hilbert space of $n$ qubits and $\pi$ is the density matrix of a single qubit. If the spectral decomposition of $\pi$ is given by
    $\pi = \sum_{i=1}^{\dim \mathcal{H}} q_i |\phi_i\rangle\langle\phi_i|$   [15]
then the eigenvalues and eigenvectors of $\rho^{(n)}$ are given by
    $\lambda_k^{(n)} = q_{k_1} q_{k_2} \cdots q_{k_n}$   [16]
and
    $|\psi_k^{(n)}\rangle = |\phi_{k_1}\rangle \otimes |\phi_{k_2}\rangle \otimes \cdots \otimes |\phi_{k_n}\rangle$   [17]
Thus, we can write the spectral decomposition of the density matrix $\rho^{(n)}$ of an i.i.d. source as
    $\rho^{(n)} = \sum_k \lambda_k^{(n)} |\psi_k^{(n)}\rangle\langle\psi_k^{(n)}|$   [18]
where the sum is over all possible sequences $k = (k_1 \ldots k_n)$, with each $k_i$ taking $(\dim \mathcal{H})$ values. Hence, we see that the eigenvalues $\lambda_k^{(n)}$ are labeled by a classical sequence of indices $k = k_1 \ldots k_n$.
The von Neumann entropy of such a source is given by
    $S(\rho^{(n)}) \equiv S(\pi^{\otimes n}) = nS(\pi) = nH(X)$   [19]
where $X$ is the classical random variable with probability distribution $\{q_i\}$.
Let $T_\varepsilon^{(n)}$ be the classical typical subset of indices $(k_1 \ldots k_n)$ for which
    $\left| -\frac{1}{n} \log(q_{k_1} \cdots q_{k_n}) - S(\pi) \right| \le \varepsilon$   [20]
as in the theorem of typical sequences. Defining $\mathcal{T}_\varepsilon^{(n)}$ as the space spanned by the eigenvectors $|\psi_k^{(n)}\rangle$
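with $k \in T_\varepsilon^{(n)}$ then immediately yields the quantum analog of the theorem of typical sequences, Theorem 4 given below. We refer to $\mathcal{T}_\varepsilon^{(n)}$ as the typical subspace (or more precisely, the $\varepsilon$-typical subspace).
Theorem 4 (Typical subspace theorem). Fix $\varepsilon > 0$. Then for any $\delta > 0$ $\exists\, n_0(\delta) > 0$ such that $\forall n \ge n_0(\delta)$ and $\rho^{(n)} = \pi^{\otimes n}$, the following are true:
(i) $\mathrm{tr}(P_\varepsilon^{(n)} \rho^{(n)}) > 1 - \delta$ and
(ii) $(1 - \delta)\,2^{n(S(\pi) - \varepsilon)} \le \dim(\mathcal{T}_\varepsilon^{(n)}) \le 2^{n(S(\pi) + \varepsilon)}$, where $P_\varepsilon^{(n)}$ is the orthogonal projection onto the subspace $\mathcal{T}_\varepsilon^{(n)}$.
Note that $\mathrm{tr}(P_\varepsilon^{(n)} \rho^{(n)})$ gives the probability of the typical subspace. As $\mathrm{tr}(P_\varepsilon^{(n)} \rho^{(n)})$ approaches unity for $n$ sufficiently large, $\mathcal{T}_\varepsilon^{(n)}$ carries almost all the weight of $\rho^{(n)}$. Let $\mathcal{T}_\varepsilon^{(n)\perp}$ denote the orthocomplement of the typical subspace, that is, for any pair of vectors $|\psi\rangle \in \mathcal{T}_\varepsilon^{(n)}$ and $|\phi\rangle \in \mathcal{T}_\varepsilon^{(n)\perp}$, $\langle\phi|\psi\rangle = 0$. It follows from the above theorem that the probability of a signal state belonging to $\mathcal{T}_\varepsilon^{(n)\perp}$ can be made arbitrarily small for $n$ sufficiently large.
Let $P_\varepsilon^{(n)}$ denote the orthogonal projection onto the typical subspace $\mathcal{T}_\varepsilon^{(n)}$. The encoding (compression) of the signal states $|\Psi_k^{(n)}\rangle$ of [10] is done in the following manner. $C^{(n)} : |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}| \mapsto \tilde\rho_k^{(n)}$, where
    $\tilde\rho_k^{(n)} := \alpha_k^2\, |\tilde\Psi_k^{(n)}\rangle\langle\tilde\Psi_k^{(n)}| + \beta_k^2\, |\Phi_0\rangle\langle\Phi_0|$   [21]
Here
    $|\tilde\Psi_k^{(n)}\rangle := \frac{P_\varepsilon^{(n)} |\Psi_k^{(n)}\rangle}{\| P_\varepsilon^{(n)} |\Psi_k^{(n)}\rangle \|}$   [22]
    $\alpha_k := \| P_\varepsilon^{(n)} |\Psi_k^{(n)}\rangle \|; \quad \beta_k = \| (I - P_\varepsilon^{(n)}) |\Psi_k^{(n)}\rangle \|$
and $|\Phi_0\rangle$ is any fixed state in $\mathcal{T}_\varepsilon^{(n)}$.
Obviously $\tilde\rho_k^{(n)} \in \mathcal{B}(\mathcal{T}_\varepsilon^{(n)})$, and hence the typical subspace $\mathcal{T}_\varepsilon^{(n)}$ plays the role of the compressed space. The decompression $D^{(n)}(\tilde\rho_k^{(n)})$ is defined as the extension of $\tilde\rho_k^{(n)}$ on $\mathcal{T}_\varepsilon^{(n)}$ to $\mathcal{H}_n$:
    $D^{(n)}(\tilde\rho_k^{(n)}) = \tilde\rho_k^{(n)} \oplus 0$
The fidelity of this compression–decompression scheme satisfies
    $F_n = \sum_k p_k^{(n)} \langle\Psi_k^{(n)}| \tilde\rho_k^{(n)} |\Psi_k^{(n)}\rangle$
    $\phantom{F_n} = \sum_k p_k^{(n)} \left[ \alpha_k^2\, |\langle\Psi_k^{(n)}|\tilde\Psi_k^{(n)}\rangle|^2 + \beta_k^2\, |\langle\Psi_k^{(n)}|\Phi_0\rangle|^2 \right]$
    $\phantom{F_n} \ge \sum_k p_k^{(n)} \alpha_k^2\, |\langle\Psi_k^{(n)}|\tilde\Psi_k^{(n)}\rangle|^2 = \sum_k p_k^{(n)} \alpha_k^4$
    $\phantom{F_n} \ge \sum_k p_k^{(n)} (2\alpha_k^2 - 1) = 2A_n - 1$   [23]
where $A_n = \mathrm{tr}(P_\varepsilon^{(n)} \rho^{(n)})$.

Using the typical subspace theorem, Schumacher (1995) proved the following analog of Shannon's noiseless channel coding theorem for memoryless quantum information sources:
Theorem 5 (Schumacher's quantum coding theorem). Let $\{\rho_n, \mathcal{H}_n\}$ be an i.i.d. quantum source: $\rho_n = \pi^{\otimes n}$ and $\mathcal{H}_n = \mathcal{H}^{\otimes n}$. If $R > S(\pi)$, then there exists a reliable compression scheme of rate $R$. If $R < S(\pi)$, then any compression scheme of rate $R$ is not reliable.
Proof
(i) $R > S(\pi)$. Choose $\varepsilon > 0$ such that $R > S(\pi) + \varepsilon$. For a given $\delta > 0$, choose the typical subspace as above and choose $n$ large enough so that (i) and (ii) in the typical subspace theorem hold. In particular, $A_n = \mathrm{tr}(P_\varepsilon^{(n)} \rho_n) > 1 - \delta$, so that $F_n \ge 2A_n - 1 > 1 - 2\delta$ by [23]. Thus, the fidelity tends to 1 as $n \to \infty$.
(ii) Suppose $R < S(\pi)$. Let the compression map be $C^{(n)}$. We may assume that $\tilde{\mathcal{H}}_n$ is a subspace of $\mathcal{H}_n$ with $\dim \tilde{\mathcal{H}}_n = 2^{nR}$. We denote the projection onto $\tilde{\mathcal{H}}_n$ as $\tilde P_n$ and let $\tilde\rho_k^{(n)} = C^{(n)}(|\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|)$. Since $\tilde\rho_k^{(n)}$ is concentrated on $\tilde{\mathcal{H}}_n$, we have $\tilde\rho_k^{(n)} \le \tilde P_n$ and hence $D^{(n)}(\tilde\rho_k^{(n)}) \le D^{(n)}(\tilde P_n)$, for any decompression map $D^{(n)}$. Inserting into the definition of the fidelity, we then have
    $F \le \sum_k p_k^{(n)} \langle\Psi_k^{(n)}| D^{(n)}(\tilde P_n) |\Psi_k^{(n)}\rangle = \mathrm{tr}\left( \rho^{(n)} D^{(n)}(\tilde P_n) \right)$
    $\phantom{F} \le \sum_{k \in T_\varepsilon^{(n)}} \lambda_k^{(n)} \langle\psi_k^{(n)}| D^{(n)}(\tilde P_n) |\psi_k^{(n)}\rangle + \sum_{k \notin T_\varepsilon^{(n)}} \lambda_k^{(n)}$   [24]
By the typical subspace theorem, the latter sum tends to 0 as $n \to \infty$, and in the sum over $k \in T_\varepsilon^{(n)}$ we have $\lambda_k^{(n)} \le 2^{-n(S(\pi) - \varepsilon)}$. The first sum can therefore be bounded as follows:
    $\sum_{k \in T_\varepsilon^{(n)}} \lambda_k^{(n)} \langle\psi_k^{(n)}| D^{(n)}(\tilde P_n) |\psi_k^{(n)}\rangle \le 2^{-n(S(\pi) - \varepsilon)} \sum_k \langle\psi_k^{(n)}| D^{(n)}(\tilde P_n) |\psi_k^{(n)}\rangle$
    $\phantom{\sum} = 2^{-n(S(\pi) - \varepsilon)}\, \mathrm{tr}\, D^{(n)}(\tilde P_n) = 2^{-n(S(\pi) - \varepsilon)}\, \mathrm{tr}\left( \sum_i D_i \tilde P_n D_i^{\dagger} \right) = 2^{-n(S(\pi) - \varepsilon)}\, 2^{nR}$   [25]
by the cyclic property of the trace and the fact that $\sum_i D_i^{\dagger} D_i = I$ and $\dim \tilde{\mathcal{H}}_n = 2^{nR}$. Choosing $\varepsilon > 0$ such that $R < S(\pi) - \varepsilon$, this bound tends to 0 as $n \to \infty$, so the fidelity cannot tend to 1. □
Even for a quantum source with memory, reliable data compression is achieved by looking for a typical subspace $\mathcal{T}_\varepsilon^{(n)}$ of the Hilbert space $\mathcal{H}_n$ for a given $\varepsilon > 0$. In the following subsections, we discuss two different classes of such sources for which one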
We also define a state 1 on M1 to be completely If let T(n) be the (typical) subset of X n such that
ergodic if it is ergodic under transformations on M1 ,
induced by l-fold shifts on Z, for arbitrary l 2 N. The 1
log n ðfðx1 ; . . . ; xn ÞgÞ 2 ðhKS ; hKS þ Þ ½37
following theorem is due to Hiai and Petz (1991), n
who proved it in a slightly more general setting: for (x1 , . . . , xn ) 2 T(n) then we have 1 (T(n) ) 1
Theorem 6 (Hiai and Petz). Suppose that 1 is a for n large enough. Moreover, since n ({(x1 , . . . , xn )})
completely ergodic state on M1 and d := dim M < 1, en(hKS þ) for all (x1 , . . . , xn ) 2 T(n) , and the total
measure is 1,
(
X
ld Proof of (ii) Given , > 0 and n 2 N, choose a
l ¼ i p i ½31 projection qn with n (qn ) 1 and log tr(qn ) <
i¼1
(n ) þ . Since SM (1 ) = inf (1=n)S(n ) we have
$S_M(\varphi_\infty) \le (1/n)\,S(\varphi_n)$. We now use the following lemma:
Lemma 7 If $\varphi$ is a state on a finite-dimensional $*$-algebra $\mathcal{M}$, and $q \in \mathcal{M}$ is a projection, then
    $S(\varphi) \le H(p) + \varphi(q) \log \mathrm{tr}(q) + (1 - \varphi(q)) \log \mathrm{tr}(1 - q)$   [40]
where $H(p) = -p \log p - (1 - p) \log(1 - p)$ (the binary entropy) with $p = \varphi(q)$.
Proof First notice that if $[\rho_\varphi, q] = 0$ then the result [40] follows from the simple inequality:
    $-\sum_{i=1}^{m} \tilde\lambda_i \log \tilde\lambda_i \le \log m$ if $\sum_{i=1}^{m} \tilde\lambda_i = 1$   [41]
Indeed, diagonalizing $\rho_\varphi$, the eigenvalues $\lambda_i$ divide into two subsets with corresponding eigenvectors belonging to the range of $q$, respectively, its complement. Considering the first set, we have, if $m = \dim(\mathrm{Ran}(q))$, and taking $\tilde\lambda_i = \lambda_i / (\sum_{i=1}^{m} \lambda_i)$ in [41],
    $-\sum_{i=1}^{m} \lambda_i \log \lambda_i \le -\left( \sum_{i=1}^{m} \lambda_i \right) \log\left( \frac{1}{m} \sum_{i=1}^{m} \lambda_i \right) = -\mathrm{tr}(q\rho_\varphi) \left[ \log \mathrm{tr}(q\rho_\varphi) - \log \mathrm{tr}(q) \right]$
Adding the analogous inequality for the part of the spectrum corresponding to $1 - q$, we obtain [40].
In the general case, that is, if $[\rho_\varphi, q] \ne 0$, define the unitary $u = 2q - 1$ and the state
    $\varphi'(x) = \frac{1}{2} [\varphi(x) + \varphi(uxu)]$   [42]
Then $[\rho_{\varphi'}, q] = 0$ and by concavity of $S(\varphi)$ and the result for the previous case
    $H(p) + \varphi(q) \log \mathrm{tr}(q) + (1 - \varphi(q)) \log \mathrm{tr}(1 - q) \ge S(\varphi') \ge S(\varphi)$   [43]
since $\varphi'(q) = \varphi(q)$. □

similar to that of Schumacher's theorem (Petz and Mosonyi 2001, Bjelaković et al. 2004):
Theorem 8 Let $\varphi_\infty$ be a completely ergodic stationary state on the infinite tensor product algebra $\mathcal{M}_\infty$. If $R > S_M(\varphi_\infty)$, then for any decomposition of the form
    $\rho^{(n)} = \sum_k p_k^{(n)} |\Psi_k^{(n)}\rangle\langle\Psi_k^{(n)}|$   [44]
there exists a reliable quantum code of rate $R$. Conversely, if $R < S_M(\varphi_\infty)$ then any quantum compression–decompression scheme of rate $R$ is not reliable.
Remarks Theorem 6 also holds for higher-dimensional information streams, with essentially the same proof. (The existence of the mean entropy is more complicated in that case.) The condition of complete ergodicity in this theorem is unnecessary. Indeed, Bjelaković et al. (2004) showed that the result remains valid (also in more than one dimension) if the state $\varphi_\infty$ of the source is simply ergodic. They achieved this by decomposing a general ergodic state into a finite number of $l$-ergodic states, and then applying the above strategy to each. It should also be mentioned that a weaker version of Theorem 6 was proved by King and Lesniewski (1998). They considered the entropy of an associated classical source, but did not show that this classical entropy can be optimized to approximate the von Neumann entropy. This had in fact already been proved by Hiai and Petz (1991). The relevance of the latter work for quantum information theory was finally pointed out by Petz and Mosonyi (2001).

Source Coding for Quantum Spin Systems
In this section we consider a class of quantum
Continuing with the proof of (ii), we conclude that sources modeled by Gibbs states of a finite strongly
interacting quantum spin system in Zd with
Sðn Þ HðpÞ þ n ðqn Þ log trðqn Þ
d 2. Due to the interaction between spins, the
þ ð1 ðqn ÞÞ log trð1 qn Þ density matrix of the source is not given by a tensor
1 þ
ðn Þ þ þ n log d product of the density matrices of the individual
spins and hence the quantum information source is
Dividing by n and taking the limit we obtain (30).
non-i.i.d. We consider the density matrix to be
&
written in the standard Gibbsian form:
It follows from this theorem that we can define a !
for precise definitions of these quantities). The Theorem 9 Under the above assumptions, for
denominator on the right-hand side of [45] is the large and small enough, for all > 0
partition function.
Note that any faithful density matrix can be !; 1 !;
lim P log K h
written in the form [45] for some self-adjoint %Zd jj
X
operator H! with discrete spectrum, such that ¼ lim j fjjj1 log j hjg ¼ 1 ½46
!
e
H is trace class. However, we consider H! to %Zd j
be a small quantum perturbation of a classical
where {...} denotes an indicator function.
Hamiltonian and require it to satisfy certain
hypotheses (see Datta and Suhov (2002)). In Theorem 9 is essentially a law of large numbers
particular, we assume that H = H0 þ V , where for random variables (log K!, ). The statement of
(1) H0 is a classical, finite-range, translation- the theorem can be alternatively expressed as
invariant Hamiltonian with a finite number of follows. For any > 0,
periodic ground states, and the excitations of these
ground states have an energy proportional to the lim P !; 2jjðhþÞ K!; 2jjðhÞ ¼ 1 ½47
%Zd
size of their boundaries (Peierls condition); (2) V
is a translation-invariant, exponentially decaying, Thus, we can define a typical subspace T !, by
quantum perturbation, being the perturbation jjðhþÞ
parameter. These hypotheses ensure that the quan- T !;
:¼ span fj j i : 2 j 2jjðhÞ g ½48
tum Pirogov–Sinai theory of phase transitions in It clearly satisfies the analogs of (i) and (ii) of the
lattice systems (see, e.g., Datta et al. (1996)) applies. typical subspace theorem, which implies as before
The power of quantum Pirogov–Sinai theory is that a compression scheme of rate R is reliable if and
such that, in proving reliable data compression for only if R > h.
such sources, we do not need to invoke the concept
of ergodicity.
Universal and Variable Length Data Compression
Using the concavity of the von Neumann entropy
S(!, ), one can prove that the von Neumann Thus far we discussed source-dependent data com-
entropy rate (or mean entropy) of the source pression for various classes of quantum sources. In
each case data compression relied on the identifica-
Sð!; Þ tion of the typical subspace of the source, which in
h :¼ lim
%Zd jj turn required a knowledge of its density matrix. In
classical information theory, there exists a general-
exists. For a general van Hove sequence, this follows ization of the theorem of typical sequences due to
from the strong subadditivity of the von Neumann Csiszár and Körner (1981) where the typical set is
entropy (see, e.g., Ohya and Petz (1993)). universal, in that it is typical for every possible
Let !, have a spectral decomposition probability distribution with a given entropy. This
X result was used by Jozsa et al. (1998) to construct a
!; ¼ j j j ih j j universal compression scheme for quantum i.i.d
j
sources with a given von Neumann entropy S using
where the eigenvalues j , 1 j 2jj , and the a counting argument for symmetric subspaces. This
corresponding eigenstates j j i, depend on ! and . was generalized to ergodic sources by Kaltchenko
Let P !, denote the probability distribution {j } and and Yang (2003) along the lines of Theorem 6.
consider a random variable K!, which takes a value Hayashi and Matsumoto (2002) supplemented the
j with probability j : work of Jozsa et al. (1998) with an estimation of the
eigenvalues of the source (using the measurement
K!; ð j Þ ¼ j ; P !; ðK!; ¼ j Þ ¼ j smearing technique) to show that a reliable compres-
sion scheme exists for any quantum i.i.d source,
The data compression limit is related to asympto- independent of the value of its von Neumann entropy
tical properties of the random variables K!, as S, the limiting rate of compression being given by S. If
% Zd . As in the case of i.i.d. sources, we prove one admits variable length coding, the Lempel–Ziv
the reliability of data compression by first proving algorithm gives a completely universal compression
the existence of a typical subspace. The latter scheme, independent of the value of the entropy, in
follows from Theorem 9 below. The proof of this the classical case (Cover and Thomas 1991). This
crucial theorem relies on results of quantum algorithm was generalized to the quantum case for
Pirogov–Sinai theory (Datta et al. 1996). i.i.d sources by Jozsa and Presnell (2003), and to
sources modeled by Gibbs states of free bosons or fermions on a lattice by Johnson and Suhov (2002).
Another important question is the efficiency of the various coding schemes. The above-mentioned schemes for quantum i.i.d. sources are not efficient, in the sense that they have no polynomial time implementation. Recently, it was shown by Bennett et al. (2004) that an efficient, universal compression scheme for i.i.d. sources can be constructed by employing quantum state tomography.

Acknowledgment

The authors would like to thank Y M Suhov for helpful discussions.

See also: Capacity for Quantum Information; Channels in Quantum Information Theory; Positive Maps on C*-Algebras.

Further Reading

Bennett CH, Harrow AW, and Lloyd S (2004) Universal quantum data compression via gentle tomography, quant-ph/0403078.
Bjelaković I, Krüger T, Siegmund-Schultze R, and Szkoła A (2004) The Shannon–McMillan theorem for ergodic quantum lattice systems. Inventiones Mathematicae 155: 203–222.
Breiman L (1957) The individual ergodic theorem of information theory. Annals of Mathematical Statistics 28: 809–811.
Breiman L (1960) The individual ergodic theorem of information theory – correction note. Annals of Mathematical Statistics 31: 809–810.
Cover TM and Thomas JA (1991) Elements of Information Theory. New York: Wiley.
Datta N, Fernández R, and Fröhlich J (1996) Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. Journal of Statistical Physics 84: 455–534.
Datta N and Suhov Y (2002) Data compression limit for an information source of interacting qubits. Quantum Information Processing 1(4): 257–281.
Hayashi M and Matsumoto K (2002) Quantum universal variable-length source coding. Physical Review A 66: 022311.
Hiai F and Petz D (1991) The proper formula for the relative entropy and its asymptotics in quantum probability. Communications in Mathematical Physics 143: 99–114.
Johnson O and Suhov YM (2002) The von Neumann entropy and information rate for integrable quantum Gibbs ensembles. Quantum Computers and Computing 3/1: 3–24.
Jozsa R, Horodecki M, Horodecki P, and Horodecki R (1998) Universal quantum information compression. Physical Review Letters 81: 1714–1717.
Jozsa R and Presnell S (2003) Universal quantum information compression and degrees of prior knowledge. Proceedings of Royal Society of London Series A 459: 3061–3077.
Kaltchenko A and Yang E-H (2003) Universal compression of ergodic quantum sources. Quantum Information & Computation 3: 359–375.
King C and Lesniewski A (1998) Quantum sources and a quantum coding theorem. Journal of Mathematical Physics 39: 88–101.
McMillan B (1953) The basic theorems of information theory. Annals of Mathematical Statistics 24: 196–219.
Nielsen MA and Chuang IL (2000) Quantum Computation and Quantum Information. Cambridge: Cambridge University Press.
Ohya M and Petz D (1993) Quantum Entropy and Its Use. Heidelberg: Springer.
Petz D and Mosonyi M (2001) Stationary quantum source coding. Journal of Mathematical Physics 42: 4857–4864.
Schumacher B (1995) Quantum coding. Physical Review A 51: 2738–2747.
Shannon CE (1948) A mathematical theory of communication. Bell System Technical Journal 27: 379–423, 623–656.
Csiszár I and Körner J (1981) Information Theory. Coding Theorems for Discrete Memoryless Systems. Budapest:
Akadémiai Kiadó.
noted in the standard Friedmann models (which are solutions of the Einstein equations with simple matter sources; see Cosmology: Mathematical Aspects). Secondly, we find a final singularity (for local observers) at the endpoint of gravitational collapse to a black hole (where in the relevant region, outside the collapsing matter, Einstein’s vacuum equations are normally taken to hold). In either case, there are canonical exact models, in which considerable symmetry is assumed, and where the models indeed become singular at places where the spacetime curvature diverges to infinity. For many years (prior to 1965), there had been much debate as to whether these singularities were an inevitable feature of the general physical situation under consideration, or whether the presence of singularities might be an artifact of the assumed high symmetry. The use of topological-type arguments has established that, in general terms, the occurrence of a singularity is not merely an artifact of symmetry, and cannot generally be removed by the introduction of small (finite) perturbations.

Figure 1 Spacetime diagram of collapse to a black hole. (One spatial dimension is suppressed.) Matter collapses inwards, through the 3-surface that becomes the (absolute) event horizon. No matter or information can escape the hole once it has been formed. The null cones are tangent to the horizon and allow matter or signals to pass inwards but not outwards. An external observer cannot see inside the hole, but only the matter – vastly dimmed and redshifted – just before it enters the hole. (Reproduced with permission from Penrose R (2004) The Road to Reality: A Complete Guide to the Laws of the Universe. London: Jonathan Cape.) [Diagram labels: singularity, horizon, collapsing matter.]

Let us first consider the standard picture, put forward in 1939 by Oppenheimer and Snyder (OS), of the gravitational collapse of an over-massive star to a black hole; see Figure 1 (and see Stationary Black Holes). This assumes exact spherical symmetry. The region external to the matter is described by the well-known Schwarzschild solution of the Einstein vacuum equations, appropriately extended to inside the “Schwarzschild radius” $r = 2mG/c^2$ ($G$ being Newton’s gravitational constant and $c$ the speed of light, and where $m$ is the total mass of the collapsing material; from now on, for convenience, we choose units so that $G = c = 1$). In Figure 1, this internal extension is conveniently expressed using Eddington–Finkelstein coordinates $(r, v, \theta, \phi)$ (see Eddington (1924) and Finkelstein (1958)), where $v = t + r + 2m \log (r - 2m)$, the metric form being

\[ ds^2 = (1 - 2m/r)\,dv^2 - 2\,dv\,dr - r^2 (d\theta^2 + \sin^2\theta\, d\phi^2) \]

(The signature convention $+ - - -$ is being adopted here; see General Relativity: Overview.) We find that, in this model, there is a singularity (at $r = 0$) at the future endpoint of each world line of collapsing matter. Moreover, no future-timelike line starting inside the horizon can avoid reaching the singularity when we try to extend it, as a timelike curve, indefinitely into the future, where the “horizon” is the three-dimensional region obtained by rotating, over the $(\theta, \phi)$ 2-sphere, the null (lightlike) line which is $r = 2m$ outside the matter region and which is the extension of this line, as a null line, into the past until it meets the axis. It is easy to see that any observer’s world line within this horizon is indeed trapped in this sense.

The question naturally arises: how representative is this model? Here, the singularity occurs at the center ($r = 0$), the place where all the matter is directed, and where it all reaches without rebounding. So it may be regarded as unsurprising that the density becomes infinite there. Now, let us suppose that the collapsing material is not exactly spherically symmetrical. Even if it is only slightly (though finitely) perturbed away from this symmetrical situation, having slight (but finite) transverse motions, the collapsing matter is now not all directed exactly towards the center, as it is in the OS model. One might imagine that the singularity
could now be avoided, the different portions of matter just “missing” each other and then being finally flung out again, after some complicated motions, where the density and spacetime curvatures might well become large but presumably still finite.

To follow such an irregular collapse in full detail would present a very difficult task, and one would have to carry it out by numerical means. As yet, despite enormous advances in computational technique, a fully effective simulation of such a “generic” collapse is still not in hand. In any case, it is hard to make a convincing case as to whether or not a singularity arises, because as soon as metric or curvature quantities begin to diverge, the computation becomes fundamentally unreliable and simply “gives up.” So we cannot really tell whether the failure is due to some genuine divergence or whether it is an artifact. It is thus fortunate that other mathematical techniques are available. Indeed, by use of a differential–topological–causal argument, we find that such perturbations do not help, at least so long as they are small enough not to alter the general character of the collapse, which we find has an “unstoppable” character, so long as a certain criterion is satisfied in its early stages.

Trapped Surfaces

But how are we to characterize the collapse as “unstoppable,” where no symmetries are to be assumed, and the simple picture illustrated in Figure 1 cannot be appealed to? A convenient characterization is the presence of what is called a “trapped surface.” This notion generalizes a key feature of the $0 < r < 2m$ region inside the horizon of the vacuum (Eddington–Finkelstein) picture of Figure 1. To understand what this feature is, consider fixing a point $s$ in the vacuum region of the $(v, r)$-plane of Figure 1. We must, of course, bear in mind that, because this plane is to be “rotated” about the central vertical axis ($r = 0$) by letting $\theta$ and $\phi$ vary as coordinates on a 2-sphere $S^2$, the point $s$ actually describes a closed 2-surface $S$ (coordinatized by $\theta$ and $\phi$) with topology $S^2$ (so $S$ is intrinsically an ordinary 2-sphere). We shall be concerned with the region $I^+(S)$, which is the (chronological) “future” of $S$, that is, the locus of points $q$ for which a timelike curve exists having a future endpoint at $q$ and a past endpoint on $S$. We shall also be interested, particularly, in the boundary $\partial I^+(S)$ of $I^+(S)$. This boundary is described, in Figure 1, by the pair of null curves $v = \text{const.}$ and $v - 2r - 4m \log (r - 2m) = \text{const.}$, proceeding into the future from $s$ (and rotated in $\theta$ and $\phi$). The region $I^+(S)$ itself is represented by that part of Figure 1 which lies between these null curves.

We observe that, in this symmetrical case ($s$ being chosen in the vacuum region), a characterization of $s$ as being “trapped,” in the sense that it lies in a region that is within the horizon, is that the future tangents to these null curves both point “inwards,” in the sense of decreasing $r$. Since $r$ is the metric radius of the $S^2$ of rotation, so that the element of surface area of this sphere is proportional to $r^2$, it follows that the surface area of the boundary $\partial I^+(S)$ reduces, on both branches, as we move away from $S$ into the future. The three-dimensional region $\partial I^+(S)$ consists of two null surfaces joined along $S$, in the sense that their Lorentzian normals are null 4-vectors. For each fixed value of $\theta$ and $\phi$, this normal is a tangent to one or other of the two null curves of Figure 1, starting at $s$. For a trapped $s$, these normals point in the direction of decreasing $r$, and it follows that the divergence of these normals is negative (so $\rho > 0$ in what follows below).

In the general case, it is this property of negativity of the divergence, at $S$, of both sets of Lorentzian normals (i.e., of null tangents to $\partial I^+(S)$), that characterizes $S$ as a trapped surface, where in the general case we must also prescribe $S$ to be compact and spacelike. But now there are to be no assumptions of symmetry whatever. Such a characterization is stable against small, but finite, perturbations of the location of $S$, within the spacetime manifold $M$, and also against small, but finite, perturbations of $M$ itself.

We can think of a trapped surface in more direct physical/geometrical terms. Imagine a flash of light emitted all over some spacelike compact spherical surface such as $S$, but now in ordinary flat spacetime, where for simplicity we suppose that $S$ is situated in some spacelike (flat) 3-hypersurface $H$, of constant time $t = 0$. There will be one component to the flash proceeding outwards and another proceeding inwards. Provided that $S$ is convex, the outgoing flash will represent an initial increase of the surface area at every point of $S$ and the ingoing flash, an initial decrease. In four-dimensional spacetime terms, we express this as positivity of the divergence of the outward null normal and the negativity of the divergence of the inward one. The characteristic feature of a trapped surface is that whereas the ingoing flash will still have an initially reducing surface area, the “outgoing” flash now has the curious property that its surface area is also initially decreasing, this holding at every point of $S$.

Locally, this is not particularly strange. For a surface wiggling in and out, we are quite likely to find portions of ingoing flash with increasing area,
and portions of outgoing flash with decreasing area. An extreme case in Minkowski spacetime has $S$ as the intersection of two past light cones. All the null normals to $S$ point along the generators of these past cones, and therefore all converge into the future. Such a surface $S$ (indeed spacelike) looks “trapped” everywhere locally, but fails to count as trapped, not being compact. Since there is nothing causally extreme about Minkowski space, it is appropriate not to count such surfaces as “trapped.” What is peculiar about a trapped surface is that both ingoing and outgoing flashes are initially decreasing in area, over the entire compact $S$. (N.B. Hawking and Ellis (1973) adopt a slightly different terminology; the term “trapped,” used here, refers to their “closed trapped.”)

The Null Raychaudhuri Equation

What do we deduce from the existence of a trapped surface? A glance at Figure 1 gives us some indication of the trouble. As we trace $\partial I^+(S)$ into the future, we find that its cross-sectional area continues to decrease, until becoming zero at the central singularity. This last feature need not reflect closely what happens in more general cases, with no spherical symmetry. But the reduction in surface area is a general property. This is the first point to appreciate in a theorem (Penrose 1965, 1968, Hawking and Ellis 1973) which indicates the profoundly disturbing physical implications of the existence of a trapped surface in physically realistic gravitational collapse, according to Einstein’s general relativity. The surface-area reduction arises from a result known as “Raychaudhuri’s equation,” in the case of null rays – where we refer to this as the “Sachs” equations. We come to this next.

Although many different notations are used to express the needed quantities, we can here conveniently employ the spin-coefficient formalism, as described elsewhere in this Encyclopedia (see Spinors and Spin Coefficients).

Suppose that we have a congruence (smooth three-parameter family) of rays (null geodesics) in four-dimensional spacetime. Let $\ell^a$ be a real future-null vector, tangent to a null geodesic $\gamma$ of the congruence, and let $m^b$ be complex-null, also defined along $\gamma$, where its real and imaginary parts are unit vectors spanning a 2-surface element orthogonal to $\ell^a$ at each point of $\gamma$, so we have

\[ \ell_a \ell^a = 0, \quad \ell_a m^a = 0, \quad m_a m^a = 0, \quad \bar{m}_a m^a = -1, \quad \ell^a = \bar{\ell}^a \]

where it is assumed that each of $\ell^a$, $m^a$ is parallel-propagated along $\gamma$:

\[ \ell^a \nabla_a \ell^b = 0, \quad \ell^a \nabla_a m^b = 0 \]

($\nabla_a$ denoting covariant derivative). The spin-coefficient quantities

\[ \rho = m^a \bar{m}^b \nabla_a \ell_b \quad \text{and} \quad \sigma = m^a m^b \nabla_a \ell_b \]

are of importance. Here, the real part of $\rho$ measures the convergence of the congruence and the imaginary part defines its rotation; $\sigma$ measures its shear, where the argument of $\sigma$ defines the direction (perpendicular to $\gamma$) of the axis of shear, and whose strength is defined by $|\sigma|$ (see Penrose and Rindler (1986) for a graphic description of these quantities). Defining propagation derivative along $\gamma$ by

\[ D = \ell^a \nabla_a \]

we can write the Sachs equations as

\[ D\rho = \rho^2 + \sigma\bar{\sigma} + \Phi \]
\[ D\sigma = 2\rho\sigma + \Psi \]

where $\Phi = -(1/2) R_{ab} \ell^a \ell^b$ and $\Psi = C_{abcd} \ell^a m^b \ell^c m^d$, conventions for the Ricci tensor $R_{ab}$ and the Weyl tensor $C_{abcd}$ being those of General Relativity: Overview (and of Penrose and Rindler (1984)). We note that it is the real Ricci component $\Phi$ which governs the propagation of the divergence and the complex Weyl component $\Psi$ which governs the propagation of shear, though there are some nonlinear terms. The quantity $\Phi$ is normally taken non-negative, since it measures the energy flux across $\gamma$ (with, in fact, $\Phi = 4\pi G T_{ab} \ell^a \ell^b$, where $T_{ab}$ is the energy tensor). The condition that $\Phi \geq 0$ at all points of spacetime and for all null directions $\ell^a$ is called the “weak energy condition.” (Again there is a minor discrepancy with Hawking and Ellis (1973), who adopt a somewhat stronger “weak energy condition,” which is the above but where $\ell^a$ is also allowed to be future-timelike. Unfortunately, with this terminology, their “weak energy condition” is not strictly weaker than their “strong energy condition.”)

It will now be assumed that $\rho$ is real:

\[ \rho = \bar{\rho} \]

which is always the case for propagation along the generators of a null hypersurface. The weak energy condition then has an important implication for us. We find that if $A$ is an element of 2-surface area within the plane spanned by the real and imaginary parts of $m^a$, then (this area element being propagated by $D$ along the lines $\gamma$)

\[ D A^{1/2} = -\rho A^{1/2} \]
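The focusing consequence of the weak energy condition can be sketched with a short calculation. This is a standard argument written here as an illustration, not a quotation of the article: it combines the area-propagation law $D A^{1/2} = -\rho A^{1/2}$ with the first Sachs equation $D\rho = \rho^2 + \sigma\bar{\sigma} + \Phi$ for real $\rho$. The affine parameter $u$ (with $D = d/du$) and the initial values $\rho_0$, $A_0$ are labels assumed for this sketch.

```latex
% Second propagation derivative of A^{1/2}, using D A^{1/2} = -\rho A^{1/2}
% and D\rho = \rho^2 + \sigma\bar\sigma + \Phi with \rho real:
\begin{align*}
D^2 A^{1/2}
  &= -(D\rho)\,A^{1/2} - \rho\, D A^{1/2} \\
  &= -(\rho^2 + \sigma\bar{\sigma} + \Phi)\,A^{1/2} + \rho^2 A^{1/2} \\
  &= -(\sigma\bar{\sigma} + \Phi)\,A^{1/2} \;\le\; 0
  \qquad \text{whenever } \Phi \ge 0 .
\end{align*}
% So A^{1/2} is a concave function of the affine parameter u.
% If \rho = \rho_0 > 0 at u = 0 (initially converging rays), concavity gives
\[
A^{1/2}(u) \;\le\; A_0^{1/2}\,(1 - \rho_0 u),
\]
% so the area element must reach zero at some u \le 1/\rho_0,
% provided the ray can be extended that far.
```

Under these assumptions, initially converging rays are forced to refocus within a finite affine distance, which is the property used in what follows.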
Such a place where the cross-sectional area pinches down to zero is a singularity of the congruence or null hypersurface, referred to as a “caustic.” (There are also terminological confusions arising from different authors defining the term “caustic” in slightly different ways. The terminology used here is slightly discrepant from that of Arnol’d (1992) (Chapter 3).)

From this property, it follows that if we have a trapped surface $S$, then every generator of $\partial I^+(S)$, if extended indefinitely into the future, must eventually encounter a caustic. This, so far, tells us nothing about actual singularities in the spacetime $M$; even Minkowski space contains many null hypersurfaces with multitudes of caustic points. However, caustics do tell us something significant about sets like $\partial I^+(S)$, which are the boundaries of future sets, and we come to this shortly.

Causality Properties

First, consider the basic causal relations. If $a$ and $b$ are two points of $M$, then if there is a nontrivial future-timelike curve in $M$ from $a$ to $b$ we say that $a$ “chronologically” precedes $b$ and write

\[ a \ll b \]

(so it would be possible for some observer’s world line to encounter first $a$ and then $b$). If there is a future-null curve in $M$ from $a$ to $b$ (trivial or otherwise), we say that $a$ “causally” precedes $b$ and write

\[ a \prec b \]

(so it would be possible for a signal to get from $a$ to $b$). We have the following elementary properties (see Penrose (1972)):

$a \prec a$
if $a \ll b$ then $a \prec b$
if $a \ll b$ and $b \ll c$ then $a \ll c$
if $a \ll b$ and $b \prec c$ then $a \ll c$
if $a \prec b$ and $b \ll c$ then $a \ll c$
if $a \prec b$ and $b \prec c$ then $a \prec c$

We generalize the definition of $I^+(S)$, above, to an

The $I^\pm(Q)$ are always open sets, but the $J^\pm(Q)$ are not always closed (though they are for any closed set $Q$ in Minkowski space). Thus, the sets $I^\pm(Q)$ have a more uniform character than the $J^\pm(Q)$, and it is simpler to concentrate, here, on the $I^\pm(Q)$ sets.

The boundary $\partial I^+(Q)$ of $I^+(Q)$ has an elegant characterization:

\[ \partial I^+(Q) = \{ q \mid I^+(q) \subseteq I^+(Q), \text{ but } q \notin I^+(Q) \} \]

and the corresponding statement holds for $\partial I^-(Q)$. Boundaries of futures also have a relatively simple structure, as is exhibited in the following result (for which there is also a version with past and future interchanged):

Lemma Let $Q \subset M$ be closed, and $p \in \partial I^+(Q) - Q$; then there exists a null geodesic on $\partial I^+(Q)$ with future endpoint at $p$ and which either extends along $\partial I^+(Q)$ indefinitely into the past, or until it reaches a point of $Q$. It can only extend into the future along $\partial I^+(Q)$ if $p$ is not a caustic point of $\partial I^+(Q)$.

Beyond a caustic point, the null geodesic would enter into the interior of $I^+(Q)$, but this also happens (more commonly) when crossing another region of null hypersurface on $\partial I^+(Q)$.

We wish to apply this to $\partial I^+(S)$, for a trapped surface $S$, but we first need a further assumption that $S$ lies in the interior of the (future) domain of dependence $D^+(H)$ of some spacelike hypersurface $H$. This region is defined as the totality of points $q$ for which every timelike curve with future endpoint $q$ can be extended into the past until it meets $H$. One can consider domains of dependence for regions $H$ other than smooth spacelike surfaces, but it is usual to assume, more generally, that $H$ is a closed achronal set, where “achronal” means that $H$ contains no pair of points $a$, $b$ for which $a \ll b$. We find that every point $q$ in the interior $\operatorname{int} D^+(H)$ of $D^+(H)$ has the further property that all null curves into the past from $q$ will also eventually meet $H$ if extended sufficiently. The physical significance of $D^+(H)$ is that, for fields with locally Lorentz-invariant and deterministic evolution equations, the (appropriate) initial data on $H$ will fix the fields throughout $D^+(H)$ (and also
throughout the similarly defined past domain of dependence $D^-(H)$). We find that the future Cauchy horizon $H^+(H)$, which is the future boundary of $D^+(H)$ defined by

\[ H^+(H) = D^+(H) - I^-(D^+(H)), \]

has properties similar to the boundary of a past set, in accordance with the above lemma, and likewise for the past Cauchy horizon $H^-(H)$, defined correspondingly.

Singularity Theorems and Related Questions

Now, applying our lemma to $\partial I^+(S)$, for a trapped surface $S \subset \operatorname{int} D^+(H)$, we find that every one of its points lies on a null-geodesic segment $\gamma$ on $\partial I^+(S)$, with past endpoint on $S$ (for if $\gamma$ did not terminate at $S$ it would have to reach $H$, which is impossible). Assuming future-null completeness and weak energy ($\Phi \geq 0$), we conclude that if extended far enough into the future, the family of such null geodesics must encounter a caustic, and therefore they must leave $\partial I^+(S)$ and enter $I^+(S)$. We finally conclude that $\partial I^+(S)$ must be a compact topological 3-manifold. Using basic theorems, we construct an everywhere timelike vector field in $\operatorname{int} D^+(H)$ which provides a (1–1) continuous map from the compact $\partial I^+(S)$ to $H$, yielding a contradiction if $H$ is noncompact, thereby establishing the following (Penrose 1965, 1968):

Theorem The requirement that there be a trapped surface which, together with its closed future, lies in the interior of the domain of dependence of a noncompact spacelike hypersurface, is incompatible with future null completeness and the weak energy condition.

We notice that this “singularity theorem” gives no indication of the nature of the failure of future null completeness in a spatially open spacetime subject to weak positivity of energy and containing a trapped surface. The natural assumption is that in an actual physical situation of such gravitational collapse, the failure of completeness would arise at places where curvatures mount to such extreme values that classical general relativity breaks down, and must be replaced by the appropriate “quantum geometry” (see Quantum Geometry and its Applications, etc.).

Hawking (1965) showed how this theorem (in time-reversed form) could also be applied on a cosmological scale to provide a strong argument that the Big-Bang singularity of the standard cosmologies is correspondingly stable. He subsequently introduced techniques from “Morse theory” which could be applied to timelike rather than just null geodesics and, using arguments applied to Cauchy horizons, was able to remove assumptions concerning domains of dependence (e.g., Hawking (1967)). A later theorem (Hawking and Penrose 1970) encompassed most of the earlier ones and had, as one of its implications, that virtually all spatially closed universe models, satisfying a reasonable energy condition and without closed timelike curves, would have to be singular, in this sense of “incompleteness,” but again the topological-type arguments used give little indication of the nature or location of the singularities.

Another issue that is not addressed by these arguments is whether the singularities arising from gravitational collapse are inevitably “hidden,” as in Figure 1, by the presence of a horizon – a conjecture referred to as “cosmic censorship” (see Penrose (1969, 1998)). Without this assumption, one cannot deduce that gravitational collapse, in which a trapped surface forms, will lead to a black hole rather than to the alternative, which would be a “naked singularity.” There are many results in the literature having a bearing on this issue, but it still remains open.

A related issue is that of strong cosmic censorship, which has to do with the question of whether singularities might be observable to local observers. Roughly speaking, a naked singularity would be one which is “timelike,” whereas the singularities in black holes might in general be expected to be spacelike (or future-null), and in the Big Bang, spacelike (or past-null). There are ways of characterizing these distinctions purely causally, in terms of past sets or future sets (sets $Q$ for which $Q = I^-(Q)$ or $Q = I^+(Q)$); see Penrose (1998). If (strong) cosmic censorship is valid, so there are no timelike singularities, the remaining singularities would be cleanly divided into past-type and future-type. In the observed universe, there appears to be a vast difference between the structure of the two, which is intimately connected with the second law of thermodynamics, there appearing to be an enormous constraint on the Weyl curvature (see General Relativity: Overview) in the initial singularities but not in the final ones.

Despite the likelihood of singularities arising in their time evolution, it is possible to set up initial data for the Einstein vacuum equations for a wide variety of complicated spatial topologies (see Einstein Equations: Initial Value Formulation). On the observational side, however, there seems to be little evidence for anything other than Euclidean spatial topology in our actual universe (which includes black holes). Speculation on the nature of spacetime at the tiniest scales, where quantum gravity might be relevant, often involves non-Euclidean topology, however. It may be noted that an early theorem of Geroch established that the constraints of classical Lorentzian geometry do not permit the spatial topology to change without violations of causality (closed timelike curves).
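As a closing illustration, the trapped-surface criterion described earlier can be checked numerically in the spherically symmetric vacuum model of Figure 1. The sketch below is not taken from the article. It assumes the Eddington–Finkelstein metric form with $G = c = 1$, for which setting $ds^2 = 0$ with $d\theta = d\phi = 0$ gives two radial null families: $v = \text{const.}$ (ingoing, with $r$ decreasing into the future) and $dr/dv = (1 - 2m/r)/2$ (the “outgoing” family). The function names, the choice $m = 1$, the step size, and the starting radii are all hypothetical choices made for illustration.

```python
import math


def outgoing_dr_dv(r, m=1.0):
    """Slope dr/dv of the 'outgoing' radial null ray in ingoing
    Eddington-Finkelstein coordinates (units G = c = 1)."""
    return 0.5 * (1.0 - 2.0 * m / r)


def trace_outgoing_ray(r0, m=1.0, dv=1e-3, steps=500):
    """Integrate dr/dv with a fixed-step RK4 scheme, starting at radius r0.

    Returns the list of radii visited along the ray, in order of increasing v.
    """
    radii = [r0]
    r = r0
    for _ in range(steps):
        k1 = outgoing_dr_dv(r, m)
        k2 = outgoing_dr_dv(r + 0.5 * dv * k1, m)
        k3 = outgoing_dr_dv(r + 0.5 * dv * k2, m)
        k4 = outgoing_dr_dv(r + dv * k3, m)
        r += dv * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        radii.append(r)
    return radii


def sphere_area(r):
    """Area of the 2-sphere of metric radius r."""
    return 4.0 * math.pi * r * r


# One ray starting inside the horizon (r0 < 2m) and one outside (r0 > 2m).
inside = trace_outgoing_ray(1.5)
outside = trace_outgoing_ray(3.0)

print("inside:  area", sphere_area(inside[0]), "->", sphere_area(inside[-1]))
print("outside: area", sphere_area(outside[0]), "->", sphere_area(outside[-1]))
```

In this sketch the “outgoing” flash starting inside $r = 2m$ has a steadily shrinking area, just like the ingoing one, while outside the horizon its area grows. At $r = 2m$ the slope vanishes, reflecting the fact that the horizon is itself generated by these null rays.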
See also: Asymptotic Structure and Conformal Infinity; Boundaries for Spacetimes; Computational Methods in General Relativity: The Theory; Cosmology: Mathematical Aspects; Critical Phenomena in Gravitational Collapse; Einstein Equations: Exact Solutions; Einstein Equations: Initial Value Formulation; General Relativity: Overview; Geometric Analysis and General Relativity; Lorentzian Geometry; Quantum Cosmology; Quantum Geometry and its Applications; Spinors and Spin Coefficients; Stationary Black Holes.

Further Reading
Arnol’d, Beem JK, and Ehrlich PE (1996) Global Lorentzian Geometry, 2nd edn. New York: Marcel Dekker.
Eddington AS (1924) A comparison of Whitehead’s and Einstein’s formulas. Nature 113: 192.
Finkelstein D (1958) Past–future asymmetry of the gravitational field of a point particle. Physical Review 110: 965–967.
Hawking SW (1965) Physical Review Letters 15: 689.
Hawking SW (1967) The occurrence of singularities in cosmology III. Causality and singularities. Proceedings of the Royal Society (London) A 300: 187–.
Hawking SW and Ellis GFR (1973) The Large-Scale Structure of Space-Time. Cambridge: Cambridge University Press.
Hawking SW and Penrose R (1970) The singularities of gravitational collapse and cosmology. Proceedings of the Royal Society (London) A 314: 529–548.
Oppenheimer JR and Snyder H (1939) On continued gravitational contraction. Physical Review 56: 455–459.
Penrose R (1965) Gravitational collapse and space-time singularities. Physical Review Letters 14: 57–59.
Penrose R (1968) Structure of space-time. In: DeWitt CM and Wheeler JA (eds.) Battelle Rencontres, 1967 Lectures in Mathematics and Physics. New York: Benjamin.
Penrose R (1969) Gravitational collapse: the role of general relativity. Rivista del Nuovo Cimento Serie I, Numero Speciale 1: 252–276.
Penrose R (1972) Techniques of Differential Topology in Relativity. CBMS Regional Conference Series in Applied Mathematics, no. 7. Philadelphia: SIAM.
Penrose R (1998) The question of cosmic censorship. In: Wald RM (ed.) Black Holes and Relativistic Stars. Chicago, IL: University of Chicago Press. (Reprinted in (1999) Journal of Astrophysics and Astronomy 20: 233–248.)
Penrose R and Rindler W (1984) Spinors and Space-Time, Vol. 1: Two-Spinor Calculus and Relativistic Fields. Cambridge: Cambridge University Press.
Penrose R and Rindler W (1986) Spinors and Space-Time, Vol. 2: Spinor and Twistor Methods in Space-Time Geometry. Cambridge: Cambridge University Press.
Special Lagrangian Submanifolds see Calibrated Geometry and Special Lagrangian Submanifolds
Spectral Sequences
P Selick, University of Toronto, Toronto, ON, Canada
© 2006 Elsevier Ltd. All rights reserved.

Introduction

Spectral sequences are a tool for collecting and distilling the information contained in an infinite number of long exact sequences. Their most common use is the calculation of homology by filtering the object under study and using a spectral sequence to pass from knowledge of the homology of the filtration quotients to that of the object itself. This article will discuss the construction of spectral sequences and the notion of convergence including conditions sufficient to guarantee convergence. Some sample applications of spectral sequences are given.

A differential on an abelian group $G$ is a self-map $d : G \to G$ such that $d^2 = 0$. A morphism of differential groups is a map $f : G \to G'$ such that $d' f = f d$. The condition $d^2 = 0$ guarantees that $\operatorname{Im} d \subset \operatorname{Ker} d$, so to the differential group $(G, d)$ we can associate its homology, $H(G, d) := \operatorname{Ker} d / \operatorname{Im} d$. Often $G$ has extra structure and we require $d$ to satisfy some compatibility condition in order that $H(G, d)$ should also have this structure. For example, a differential graded Lie algebra $(L, d)$ requires a differential $d$ which satisfies the condition $d[x, y] = [dx, y] + (-1)^{|x|} [x, dy]$. While, for simplicity, throughout this article we will always assume that $G$ is an abelian group, the concepts are readily extended to the case where $G$ is an object of some abelian category and generalizations to nonabelian situations have also been studied.

An important example of extra structure is the case where $G = \bigoplus_{n=-\infty}^{\infty} G_n$ is a graded abelian group. The appropriate compatibility condition for a differential graded group is that $d$ should be homogeneous of degree $-1$. That is, $d(G_n) \subset G_{n-1}$. In many contexts it is more natural to use superscripts and regard $d$ as having degree $+1$; the two concepts are equivalent via the reindexing convention $G^n := G_{-n}$. Another important example is that
where $G$ forms a graded algebra, meaning that it has a multiplication $G_n \otimes G_k \to G_{n+k}$. To form a differential graded algebra, in addition to having degree $-1$, $d$ is required to satisfy the Leibniz rule $d(xy) = d(x)y + (-1)^{|x|} x\,d(y)$ (where $|x|$ denotes the degree of $x$) familiar from the differentiation of differential forms.

In many cases, $G$ itself is not the main object of interest, but is a relatively large and complicated object, $G = G(X)$, formed by applying some functor $G$ to the object $X$ being studied. For example, $X$ might be some manifold and $G$ could be the set of all differential forms on $X$ with the exterior derivative as $d$. The presumption is that $H(G(X))$ carries the information we want about $X$ in a much simpler form than the whole of $G(X)$.

A spectral sequence (Leray 1946) is defined simply as a sequence $((E^r, d^r))_{r = n_0, n_0+1, \ldots}$ of differential abelian groups such that $E^{r+1} = H(E^r, d^r)$. By reindexing, we could always arrange that $n_0 = 1$, but sometimes it is more natural to begin with some other integer. If all terms $(E^r, d^r)$ of the spectral sequence have the appropriate additional structure, we might refer, for example, to a spectral sequence of Lie algebras. If there exists $N$ such that $E^r = E^N$ for all $r \geq N$ (equivalently $d^r = 0$ for all $r \geq N$), the spectral sequence is said to “collapse” at $E^N$.

The definition of spectral sequence is so broad that we can say almost nothing of interest about them without putting on some additional conditions. We will begin by considering the most common type of spectral sequence, historically the one that formed the motivating example: the spectral sequence of a filtered chain complex.

Filtered Objects

To study a complicated object $X$, it often helps to filter $X$ and study it one filtration at a time. A filtration $F_* X$ of a group $X$ is a nested collection of subgroups

\[ F_* X := \cdots \subset F_n X \subset F_{n+1} X \subset \cdots \subset X, \qquad -\infty < n < \infty \]

A morphism $f : F_* X \to F_* Y$ of filtered groups is a homomorphism $f : X \to Y$ such that $f(F_n(X)) \subset F_n(Y)$. The groups $F_n X / F_{n-1} X$ are called the “filtration quotients” and their direct sum $\operatorname{Gr}(F_* X) := \bigoplus_n F_n X / F_{n-1} X$ is called the associated graded group of the

Since our plan is to study $X$ by computing $\operatorname{Gr}(F_* X)$, the first question we need to consider is what conditions we need to place on our filtration so that $\operatorname{Gr}(F_* X)$ retains enough information to recover $X$. Our experience from the “5-lemma” suggests that the appropriate way to phrase the requirement is to ask for conditions on the filtrations which are sufficient to conclude that $f : X \to Y$ is an isomorphism whenever $f : F_* X \to F_* Y$ is a morphism of filtered groups for which the induced $\operatorname{Gr}(f) : \operatorname{Gr}(X) \to \operatorname{Gr}(Y)$ is an isomorphism.

It is clear that $\operatorname{Gr} F_* X$ can tell us nothing about $X - (\bigcup X_n)$ so we require that $X = \bigcup X_n$. Similarly we need that $\bigcap X_n = 0$. However, the latter condition is insufficient as can be seen from the following example.

Example 1 Let $X := \bigoplus_{k=1}^{\infty} \mathbb{Z}$ and $Y := \prod_{k=1}^{\infty} \mathbb{Z}$. Set

\[ F_n X := \begin{cases} X & \text{if } n \geq 0 \\ \bigoplus_{k=-n}^{\infty} \mathbb{Z} & \text{if } n < 0 \end{cases} \qquad F_n Y := \begin{cases} Y & \text{if } n \geq 0 \\ \prod_{k=-n}^{\infty} \mathbb{Z} & \text{if } n < 0 \end{cases} \]

and let $f : X \to Y$ be the inclusion. Then $\operatorname{Gr}(f)$ is an isomorphism but $f$ is not.

To phrase the appropriate condition we need the concept of algebraic limits. Given a sequence of objects $\{X_n\}_{n \in \mathbb{Z}}$ and morphisms $f_n : X_n \to X_{n+1}$ in some category, the “direct limit” or “colimit” of the sequence, written $\varinjlim_n F_n X$, is an object $X$ together with morphisms $g_n : X_n \to X$ satisfying $g_{n+1} \circ f_n = g_n$, having the universal property that given any object $X'$ together with maps $g'_n : X_n \to X'$ satisfying $g'_{n+1} \circ f_n = g'_n$, there exists a unique morphism $h : X \to X'$ such that $g'_n = h \circ g_n$ for all $n$. By the usual categorical argument the object $X$, if it exists, is unique up to isomorphism. The dual concept, “inverse limit” or simply “limit” of the sequence, written $\varprojlim_n F_n X$, is obtained by reversing the directions of the morphisms. For intuition, we note that these notions share, with the notion of limits of sequences in calculus, the properties that changing the terms $X_n$ only for $n < N$ does not affect $\varinjlim_n F_n X$, and if the sequence stabilizes at $N$ (i.e., the morphisms $f_n$ are isomorphisms for all $n \geq N$), then $\varinjlim_n F_n X \cong X_N$. Similarly $\varprojlim_n F_n X$ depends only upon behavior of the sequence as $n \to -\infty$. Limits over partially ordered sets other than $\mathbb{Z}$ can
filtered group F X . In cases where X has additional also be taken but we shall not need them in this
structure, we might define special types of filtra- article. Although limits need not exist in general, in
tions satisfying some compatibility conditions so the category of abelian groups, both the direct and
that Gr(F X ) inherits the additional structure. For inverse limit exist for any sequence and are given
example, an algebra filtration of an algebra X is explicitly
L by the following constructions. L lim
! n
defined as one for which (Fn X)(Fk X) Fn þ k X. Fn X = Xn = where, letting ik : Xk ! Xn be the
canonical inclusion, the equivalence relation is generated by $i_n(x) \sim i_{n+1} f_n(x)$ for $x \in X_n$. $\varprojlim_n X_n = \{(x_n) \in \prod X_n \mid f_n(x_n) = x_{n+1}\ \forall n\}$.
The condition needed is that our filtrations should be bicomplete, defined as follows. $FX$ is called "cocomplete" if the canonical map $\varinjlim_n F_nX \to X$ is an isomorphism and $FX$ is called "complete" if $X \to \varprojlim_n X/F_nX$ is an isomorphism. $FX$ is called bicomplete if it is both complete and cocomplete. Note that $FX$ cocomplete is equivalent to $\bigcup F_nX = X$ but $FX$ complete is stronger than $\bigcap F_nX = 0$.
Theorem 1 (Comparison theorem). Let $FX$ be bicomplete and let $FY$ be cocomplete with $\bigcap F_nY = 0$. Suppose that $f : FX \to FY$ is a morphism such that $\operatorname{Gr}(f) : \operatorname{Gr}(X) \to \operatorname{Gr}(Y)$ is an isomorphism. Then $f : X \to Y$ is an isomorphism.
Filtered Chain Complexes
A chain complex $(C, d)$ of abelian groups consists of abelian groups $C_n$ for $n \in \mathbb{Z}$ together with homomorphisms $d_n : C_n \to C_{n-1}$ such that $d_n \circ d_{n+1} = 0$ for all $n$. To the chain complex $(C, d)$ we can associate the differential (abelian) group $(C_*, d) := \bigoplus_{n=-\infty}^{\infty} C_n$ with $d|_{C_n}$ induced by $d_n$. We often write simply $C$ if the differential is understood. The dual notion in which $d$ has degree $+1$ is called a cochain complex and the concepts are equivalent through our convention $C^n := C_{-n}$.
Theorem 2 (Homology commutes with direct limits). $H(\varinjlim_n C_n) = \varinjlim_n H(C_n)$.
As we shall see later, failure of homology to commute with inverse limits is a source of great complication in working with spectral sequences.
Let $FC$ be a filtered chain complex. In many applications, our goal is to compute $H_*(C)$ from a knowledge of $H_*(F_nC/F_{n-1}C)$ for all $n$. The overall plan, which is not guaranteed to be successful in general, would be:
1. use the given filtration on $C$ to define a filtration on $H_*(C)$,
2. use our knowledge of $H_*(\operatorname{Gr} C)$ to compute $\operatorname{Gr} H_*(C)$,
3. reconstruct $H_*(C)$ from $\operatorname{Gr} H_*(C)$.
To begin, set $F_n(H_*C) := \operatorname{Im}(s_n)_*$, where $s_n : F_n(C) \to C$ is the inclusion (chain) map from the filtration. The spectral sequence which we will define for this situation can be regarded as a method of keeping track of the information contained in the infinite collection of long exact homology sequences coming from the short exact sequences $0 \to F_{n-1}C \to F_nC \to F_nC/F_{n-1}C \to 0$. When working with a long exact sequence, knowledge of two of every three terms gives a handle on computing the remaining terms but does not, in general, completely determine those terms, which explains intuitively why we have some reason to hope that a spectral sequence might be useful and also why it is not guaranteed to solve our problem.
Before proceeding with our motivating example, we digress to discuss spectral sequences formed from exact couples.
Exact Couples
In this section, we will define exact couples, show how to associate a spectral sequence to an exact couple, and discuss some properties of spectral sequences coming from exact couples. As we shall see, a filtered chain complex gives rise to an exact couple and we will examine this spectral sequence in greater detail.
Exact couples were invented by Massey and many books use them as a convenient method of constructing spectral sequences. Other books bypass discussion of exact couples and define the spectral sequence coming from a filtered chain complex directly.
Definition 1 An "exact couple" consists of a triangle
$$\begin{array}{ccc} D & \xrightarrow{\ i\ } & D \\ {\scriptstyle k}\nwarrow & & \swarrow{\scriptstyle j} \\ & E & \end{array}$$
containing abelian groups $D$, $E$, together with homomorphisms $i$, $j$, $k$ such that the diagram is exact at each vertex.
In the following, to avoid conflicting notation considering the many superscripts and subscripts which will be needed, we use the convention that an $n$-fold composition will be written $f^{\circ n}$ rather than the usual $f^n$.
Given an exact couple, set $d := jk : E \to E$. By exactness, $kj = 0$, so $d \circ d = jkjk = 0$ and therefore $(E, d)$ forms a differential group. To the exact couple we can associate another exact couple, called its derived couple, as follows. Set $D' := \operatorname{Im} i \subset D$ and $E' := H(E, d)$. Define $i' := i|_{D'}$ and let $j' : D' \to E'$ be given by $j'(iy) := \overline{j(y)}$, where $\bar{x}$ denotes the equivalence class of $x$. The map $k' : E' \to D'$ is defined by $k'(\bar{z}) := kz$. One checks that the maps $j'$ and $k'$ are well defined and that $(D', E', i', j', k')$ forms an exact couple. Therefore, from our original exact couple, we can inductively form a sequence of exact couples $(D^r, E^r, i^r, j^r, k^r)_{r=1}^{\infty}$ with $D^1 := D$, $E^1 := E$, $D^r := (D^{r-1})'$
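
The check that $j'$ and $k'$ are well defined is routine but worth seeing once. The following sketch spells it out, using only exactness of the couple ($\operatorname{Im} i = \operatorname{Ker} j$, $\operatorname{Im} j = \operatorname{Ker} k$, $\operatorname{Im} k = \operatorname{Ker} i$) and $d = jk$. Here $\bar{x}$ denotes the class in $H(E,d)$ of a cycle $x$; the element names $y_1, y_2, z, w$ are chosen for illustration, and the block assumes the amsmath package.

```latex
\begin{align*}
% j'(iy) := \overline{j(y)} is well defined
&\text{(a) } j(y) \text{ is a cycle: } d\,j(y) = j\,k\,j(y) = 0 \text{ since } kj = 0.\\
&\text{(b) If } i y_1 = i y_2, \text{ then } y_1 - y_2 \in \operatorname{Ker} i = \operatorname{Im} k,
  \text{ say } y_1 - y_2 = k z \text{ with } z \in E.\\
&\qquad \text{Hence } j(y_1) - j(y_2) = jk(z) = d z,
  \text{ so } \overline{j(y_1)} = \overline{j(y_2)} \text{ in } H(E,d).\\[4pt]
% k'(\bar z) := kz is well defined and lands in D'
&\text{(c) If } z \text{ is a cycle, then } j(kz) = dz = 0,
  \text{ so } kz \in \operatorname{Ker} j = \operatorname{Im} i = D'.\\
&\text{(d) If } z = d w = jk(w) \text{ is a boundary, then } kz = (kj)(kw) = 0,
  \text{ so } kz \text{ depends only on the class } \bar z.
\end{align*}
```

Steps (a) and (b) show that $j'$ takes values in $E'$ and does not depend on the choice of $y$; steps (c) and (d) do the same for $k'$.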
to date satisfy this condition and in fact most also have a second gradation as in the case of our motivating example. To see how to proceed, we examine that case more closely.
For a filtered chain complex $FC$ with structure maps $s_p : F_pC \to C$ we defined $F_p(H_*(C)) = \operatorname{Im}\,(s_p)_*$. If $x = i^{\circ(r-1)} y$ belongs to
$$D^r_{p,q} = \operatorname{Im}\bigl( i^{\circ(r-1)} : H_{p+q}(F_{p-r+1}C) \to H_{p+q}(F_pC) \bigr)$$
then $(s_p)_* x = (s_p)_* i^{\circ(r-1)} y = (s_{p+1})_* i^{\circ r} y = (s_{p+1})_* i x$. Therefore, we have a commutative diagram
$$\begin{array}{ccc} D^r_{p,q} & \longrightarrow & F_p(H_{p+q}(C)) \\ \downarrow{\scriptstyle i} & & \downarrow \\ D^{r+1}_{p+1,q-1} & \longrightarrow & F_{p+1}(H_{p+q}(C)) \end{array}$$
yielding a map
$$D^{r+1}_{p+1,q-1}/D^r_{p,q} \to F_{p+1}H_{p+q}(C)/F_pH_{p+q}(C) = \operatorname{Gr}_{p+1} H_{p+q}(C).$$
Letting $r$ go to infinity, we get an induced map $\lambda : D^\infty/i^\infty(D^\infty) \to \operatorname{Gr}(H(C))$.
Theorem 4 If $FH(C)$ is cocomplete then
(i) $D^\infty_n = F_n(H(C))$;
(ii) $\lambda : D^\infty/i^\infty(D^\infty) \to \operatorname{Gr}(H(C))$ is an isomorphism;
(iii) There is an exact sequence $0 \to \operatorname{Gr}(H(C)) \xrightarrow{\ j^\infty\ } E^\infty \xrightarrow{\ {}^\infty k\ } {}^\infty D \xrightarrow{\ {}^\infty i\ } {}^\infty D$.
We say that the spectral sequence $(E^r)$ "abuts" to $FL$ if there is an isomorphism $\operatorname{Gr} L \to E^\infty$. Here we mean an isomorphism of graded abelian groups, which makes sense since under our assumptions $E^r$ inherits a grading from $E^1$ for each $r$. If in addition the filtration on $L$ is cocomplete, we say that $(E^r)$ "weakly converges" to $FL$ and if it is bicomplete we say that $(E^r)$ "converges" (or strongly converges) to $FL$. The notation $(E^r) \Rightarrow FL$ (or simply $(E^r) \Rightarrow L$ when the filtration on $L$ is either understood or unimportant) is often used in connection with convergence but there is no universal agreement as to which of the three concepts (abuts, weakly converges, or converges) it refers to! In this article, we will also use the expression $(E^r)$ "quasiconverges" to $FL$ to mean that the spectral sequence weakly converges to $FL$ with $\bigcap_n F_nL = 0$. (Note: the terminology quasiconverges is nonstandard although the concept has appeared in the literature, sometimes under the name converges.)
While it would be overstating things to claim that convergence of the spectral sequence shows that $E^\infty$ determines $H(C)$, it is clear that convergence is what we need in order to expect that $E^\infty$ contains enough information to possibly reconstruct $H(C)$. The sense in which this is true is stated more precisely in the following theorem.
Theorem 5 (Spectral sequence comparison theorem). Let $f = (f^r) : (E^r) \to (\tilde{E}^r)$ be a morphism of spectral sequences.
(i) If $f^N : E^N \to \tilde{E}^N$ is an isomorphism for some $N$, then $f^r$ is an isomorphism for all $r \ge N$ (including $r = \infty$).
(ii) Suppose in addition that $(E^r)$ converges to $FX$ and $(\tilde{E}^r)$ quasiconverges to $F\tilde{X}$. Let $\phi : FX \to F\tilde{X}$ be a morphism of filtered abelian groups which is compatible with $f$ (i.e., there exist isomorphisms $\eta : \operatorname{Gr} X \cong E^\infty$ and $\tilde{\eta} : \operatorname{Gr} \tilde{X} \cong \tilde{E}^\infty$ such that $f^\infty \circ \eta = \tilde{\eta} \circ \operatorname{Gr}(\phi)$). Then $\phi : X \to \tilde{X}$ is an isomorphism.
Within the constraints provided by Theorem 5, a spectral sequence might have many limits. A typical calculation of some group $Y$ by means of spectral sequences might proceed as an application of Theorem 5 along the lines of the following plan.
1. Subgroups $F_nY$ forming a filtration of $Y$ are defined, although usually not computable at this point. The subgroups are chosen in a manner that seems natural bearing in mind that to be useful it will be necessary to show convergence properties.
2. Directly or by means of an exact couple, a spectral sequence is defined in a manner that seems to be related to the filtration.
3. Some early term of the spectral sequence (usually $E^1$ or $E^2$) is calculated explicitly and the differentials $d^r$ are calculated successively resulting in a computation of $E^\infty$.
4. With the aid of the knowledge of $E^\infty$, a conjecture $Y = G$ is formulated for some $G$.
5. A suitable filtration on $G$ and a map of filtrations $FG \to FY$ or $FY \to FG$ are defined.
6. The spectral sequence arising from $FG$ is demonstrated to converge to $G$.
7. The original spectral sequence is demonstrated to converge to $Y$ and Theorem 5 is applied.
The hardest steps are usually (3) and (7). For step (3), in most cases the calculations require knowledge which cannot be obtained from the spectral sequence itself, although the spectral sequence machinery plays its role in distilling the information and pointing the way to exactly what needs to be calculated. Steps (4)–(6) are frequently very easy, and often not stated explicitly, with "by construction of $G$" being the most common justification of (6). We now discuss the types of considerations involved in step (7).
Convergence of a spectral sequence to a desired $L$ can be difficult to verify in general partly because
the conditions are stated in terms of some filtration (usually understood only in a theoretical sense) on an initially unknown $L$ rather than in terms of properties of the spectral sequence itself or an exact couple from which it arose. Theorems 2 and 4(ii) give us the following extremely important special case in which we can conclude convergence to $H(C)$ of the spectral sequence for $FC$ based on conditions that are often easily checked.
Theorem 6 If $FC$ is a filtered chain complex such that $FC$ is cocomplete and there exists $M$ such that $H(F_nC) = 0$ for $n < M$, then the spectral sequence for $FC$ converges to $H(C)$.
Although the second hypothesis, which implies that ${}^\infty D = 0$, is very strong, it handles the large number of commonly used filtrations which are 0 in negative degrees.
Under the conditions of Theorem 6, inserting the bigradings into Theorem 4 gives a short exact sequence $0 \to D^\infty_{p-1,q+1} \to D^\infty_{p,q} \to E^\infty_{p,q} \to 0$ with $D^\infty_{p,q} \cong F_p(H_{p+q}(C))$; equivalently
$$F_k(H_n(C))/F_{k-1}(H_n(C)) \cong E^\infty_{k,\,n-k}.$$
Thus, the only $E^\infty$-terms relevant to the computation of $H_n(C)$ are those on the diagonal $p + q = n$. In the important case of a first quadrant spectral sequence ($E^r_{p,q} = 0$ if $p < 0$ or $q < 0$), the number of nonzero terms on any diagonal is finite so the $E^\infty$-terms on the diagonal $p + q = n$ give a finite composition series for each $H_n(C)$.
Here is an elementary example of an application of a spectral sequence.
Example 2 Let $S_*(\ )$ denote the singular chain complex, let $H_*(\ ) := H_*(S_*(\ ))$ denote singular homology, and let $H_*^{\mathrm{cell}}(\ )$ denote cellular homology. Let $X$ be a CW-complex with $n$-skeleton $X^{(n)}$. The inclusions $S_*(X^{(n)}) \to S_*(X)$ yield a filtration on $S_*(X)$. In the associated spectral sequence,
$$E^1_{p,q} = H_{p+q}\bigl(X^{(p)}/X^{(p-1)}\bigr) \cong \begin{cases} \text{free abelian group on the $p$-cells of $X$} & \text{if } q = 0,\\ 0 & \text{if } q \ne 0. \end{cases}$$
The differential
$$d^1_{p,0} : H_p\bigl(X^{(p)}/X^{(p-1)}\bigr) \to H_{p-1}\bigl(X^{(p-1)}/X^{(p-2)}\bigr)$$
is the definition of the differential in cellular homology. Therefore,
$$E^2_{p,q} = \begin{cases} H_p^{\mathrm{cell}}(X) & \text{if } q = 0,\\ 0 & \text{if } q \ne 0. \end{cases}$$
Looking at the bidegrees, the domain or range of $d^2_{p,q}$ is zero for each $p$ and $q$ so $d^2 = 0$, and similarly $d^r = 0$ for all $r > 2$. Therefore, the spectral sequence collapses with $E^2 = E^\infty$. The spectral sequence converges to $H_*(X)$ so the terms on the diagonal $p + q = n$ form a composition series for $H_n(X)$. Since the $(n, 0)$ term is the only nonzero term on this diagonal, $H_n(X) \cong H_n^{\mathrm{cell}}(X)$. That is, "cellular homology equals singular homology."
Returning to the general situation, set $L_\infty := \varinjlim_n D_n$ and $L_{-\infty} := \varprojlim_n D_n$. Filter $L_\infty$ by $F_nL_\infty := \operatorname{Im}(D_n \to L_\infty)$ and filter $L_{-\infty}$ by $F_nL_{-\infty} := \operatorname{Ker}(L_{-\infty} \to D_n)$. It follows from the definitions that $F_nL_\infty = D^\infty_n$ and so $D^\infty_n/i^\infty(D^\infty_{n-1}) = \operatorname{Gr}_n L_\infty$. At the other end, the canonical map $L_{-\infty} \to D_n$ lifts to ${}^\infty D_n$ yielding an injection $L_{-\infty}/F_nL_{-\infty} \to {}^\infty D_n$. Therefore, for each $n$ there is an injection $\operatorname{Gr}_n L_{-\infty} \to K_n$ where $K_n = \operatorname{Ker}({}^\infty D_{n-1} \to {}^\infty D_n)$. In general, the map $L_{-\infty} \to {}^\infty D_n$ need not be surjective (an element could be in the image of $i^{\circ r}$ for each finite $r$ without being part of a consistent infinite sequence), although it is surjective in the special case when ${}^\infty D_s \to {}^\infty D_{s+1}$ is surjective for each $s$. In the latter case we get $\operatorname{Gr} L_{-\infty} \cong K$. As we will see in the next section, the exact sequence of Theorem 3 extends to the right (Theorem 8) giving ${\varprojlim_r}^1 Z^r = 0$ as a sufficient condition that ${}^\infty D_s \to {}^\infty D_{s+1}$ be surjective for each $s$, where ${\varprojlim}^1$ is described in that section and $(Z^r)$ refers to the system of inclusions $\cdots \subset Z^{r+1} \subset Z^r \subset Z^{r-1} \subset \cdots$. Thus, ${\varprojlim_r}^1 Z^r = 0$ is a sufficient condition for $\operatorname{Gr} L_{-\infty} \cong K$.
Taking into account the short exact sequence $0 \to D^\infty/i^\infty(D^\infty) \to E^\infty \to K \to 0$ coming from Theorem 3, the preceding discussion yields two obvious candidates for a suitable $FL$: $FL_\infty$ or $FL_{-\infty}$. In theory there are other possibilities, but in practice one of these two cases usually occurs. We examine them individually and see what additional conditions are required for convergence.
Case I: Conditions for convergence to $FL_\infty$. It is easily checked from the definitions that $\varinjlim_n D^\infty_n = \varinjlim_n D_n$ so $FL_\infty$ is always cocomplete. Therefore, besides $\operatorname{Gr} L_\infty \cong E^\infty$ (equivalently, $K = 0$), it is required to verify that $FL_\infty$ is complete. As we will see in the next section, the completeness condition can be restated as $\bigcap_n D^\infty_n = 0$ and ${\varprojlim_n}^1 D^\infty_n = 0$. According to the preceding discussion, under the assumption that $L_{-\infty} = \bigcap_n D_n = 0$, which we need anyway as part of the requirement that $FL_\infty$ be complete, ${\varprojlim_r}^1 Z^r = 0$ is sufficient to show $K = 0$.
Case II: Conditions for convergence to $FL_{-\infty}$. Any inverse limit is complete in its canonical filtration, so $FL_{-\infty}$ is always complete and the issues are whether $\operatorname{Gr} L_{-\infty} \cong E^\infty$ and whether $FL_{-\infty}$ is cocomplete. $FL_{-\infty}$ is cocomplete if and only if every element of
$L_{-\infty}$ lies in $\operatorname{Ker}(L_{-\infty} \to D_n)$ for some $n$, for which a sufficient condition is that $L_\infty = 0$ or equivalently $E^\infty \cong K$. Therefore, if the reason for the isomorphism $\operatorname{Gr} L_{-\infty} \cong E^\infty$ is that the maps $E^\infty \to K$ and $\operatorname{Gr} L_{-\infty} \to K$ are isomorphisms, then the rest of the convergence conditions are automatic. In particular, to deduce convergence to $FL_{-\infty}$ it suffices to know that $L_\infty = 0$ and ${\varprojlim_r}^1 Z^r = 0$.
Derived Functors
The left and right derived functors $L_nT$, $R^nT$ of a functor $T$ provide a measure of the amount by which the functor deviates from preserving exactness.
The category $\mathcal{I}nv$ of inverse systems indexed over $\mathbb{Z}$ (i.e., the category whose objects are diagrams of abelian groups $\cdots \to A_{n-1} \to A_n \to A_{n+1} \to \cdots$) forms an abelian category in which a sequence of morphisms $A' \to A \to A''$ is exact if and only if the sequence $A'_n \to A_n \to A''_n$ of abelian groups is exact for each $n$. The functor of interest to us is $\varprojlim : \mathcal{I}nv \to \mathcal{AB}$ where $\mathcal{AB}$ denotes the category of abelian groups.
Let $T : \underline{A} \to \underline{B}$ be an additive functor between abelian categories. Suppose that $X$ in $\operatorname{Obj} \underline{A}$ has an injective resolution $I_X$. The definition of additive functor implies that $T$ takes zero morphisms to zero morphisms, so $TI_X$ forms a cochain complex in $\underline{B}$. The right derived functors of $T$ are defined by $(R^nT)(X) := H^n(TI_X)$. The result is independent of the choice of injective resolution (assuming one exists) and satisfies:
1. If $T$ is "left exact" (meaning that $T$ preserves kernels), then $R^0T(X) = T(X)$;
2. If $T$ preserves exactness, then $(R^nT)(X) = 0$ for $n > 0$.
Theorem 7 Let $0 \to X' \to X \to X'' \to 0$ be a short exact sequence in $\underline{A}$. Suppose $T$ is left exact and that all the objects have injective resolutions. Then there is a (long) exact sequence
$$0 \to T(X') \to T(X) \to T(X'') \to (R^1T)(X') \to \cdots \to (R^{n-1}T)(X'') \to (R^nT)(X') \to (R^nT)(X) \to (R^nT)(X'') \to \cdots$$
Similarly, the left derived functors of $T$ are defined by using projective resolutions and have similar properties with respect to the obvious duality.
The functor $\varprojlim_n$ is left exact and in the category $\mathcal{I}nv$ every object has an injective resolution. Therefore ${\varprojlim_n}^q$ is defined and ${\varprojlim_n}^0 X_n = \varprojlim_n X_n$, where ${\varprojlim_n}^q$ denotes the derived functor $R^q(\varprojlim_n)$. It turns out that ${\varprojlim_n}^q$ is 0 for $q > 1$, but we are particularly interested in ${\varprojlim_n}^1$.
Let $(X_n)$ be an inverse system with structure maps $i_{n-1} : X_{n-1} \to X_n$. An explicit construction for ${\varprojlim_n}^1 X_n$ is as follows. Define $\phi : \prod_n X_n \to \prod_n X_n$ by letting $\phi((x_n))$ be the sequence whose $n$th component is $(x_n - i_{n-1}x_{n-1})$. Then ${\varprojlim_n}^1 X_n \cong \operatorname{Coker} \phi$. Observe that $\operatorname{Ker} \phi \cong \varprojlim_n X_n$ according to the explicit formula for $\varprojlim_n X_n$ given earlier.
Recall that we defined ${}^\infty D = \bigcap_r \operatorname{Im} i^{\circ r} \cong \varprojlim_r D^r$. The exact sequence of Theorem 3 can be extended to give:
Theorem 8 There is an exact sequence
$$0 \to D^\infty \xrightarrow{\ i\ } D^\infty \xrightarrow{\ j\ } E^\infty \xrightarrow{\ k\ } {}^\infty D \xrightarrow{\ i\ } {}^\infty D \xrightarrow{\ j\ } {\varprojlim_r}^1 Z^r \xrightarrow{\ k\ } {\varprojlim_r}^1 D^r \xrightarrow{\ i\ } {\varprojlim_r}^1 D^r \to 0$$
It is clear from the explicit construction that if the system $(X_n)$ stabilizes with $X_n = G$ for all sufficiently small $n$, then $\varprojlim_n X = G$ and ${\varprojlim_n}^1 X = 0$. If the spectral sequence collapses at any stage then the system $(Z^r)$ stabilizes at that point, and so for a spectral sequence which collapses, the condition ${\varprojlim_r}^1 Z^r = 0$, which arose in the discussion of convergence in the previous section, is automatic.
Let $FX$ be a filtered abelian group. Applying Theorem 7 to the short exact sequence $0 \to F_nX \to X \to X/F_nX \to 0$ of inverse systems gives an exact sequence
$$0 \to \varprojlim_n F_nX \to \varprojlim_n X \to \varprojlim_n X/F_nX \to {\varprojlim_n}^1 F_nX \to {\varprojlim_n}^1 X$$
Since $\varprojlim_n X = X$ and ${\varprojlim_n}^1 X = 0$, we get
Theorem 9 $FX$ is complete if and only if $\varprojlim_n F_nX = 0$ and ${\varprojlim_n}^1 F_nX = 0$.
When working with ${\varprojlim_n}^1$ the following sufficient condition for its vanishing, known as the Mittag–Leffler condition, is often useful.
Theorem 10 Suppose $A$ is an inverse system in which for each $n$ there exists $k(n) \le n$ such that $\operatorname{Im}(A_i \to A_n)$ equals $\operatorname{Im}(A_{k(n)} \to A_n)$ for all $i \le k(n)$. Then ${\varprojlim_n}^1 A = 0$.
Of course, this will not be (directly) useful in establishing ${\varprojlim_n}^1 F_nX = 0$ since the structure maps in that system are all monomorphisms.
Some Examples of Standard Spectral Sequences and Their Use
To this point we have considered the general theory of spectral sequences. The properties of the spectral sequences arising in many specific situations have
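
Returning briefly to the derived functor ${\varprojlim}^1$: as a hedged illustration of why it need not vanish, consider the following standard example, which is not taken from this article. In the indexing used here, let $A_n = \mathbb{Z}$ for $n \le 0$ with every structure map $A_{n-1} \to A_n$ equal to multiplication by a fixed prime $p$ (extended by identity maps for $n > 0$, which affects neither limit). The names $A$, $B$, $C$ and the prime $p$ are chosen for illustration, and the block assumes the amsmath package. The sketch applies the long exact sequence of derived functors to a short exact sequence of inverse systems.

```latex
\begin{align*}
&A:\ \cdots \xrightarrow{\,p\,} \mathbb{Z} \xrightarrow{\,p\,} \mathbb{Z} \xrightarrow{\,p\,} \mathbb{Z} = A_0,
\qquad
B:\ \cdots \xrightarrow{\,1\,} \mathbb{Z} \xrightarrow{\,1\,} \mathbb{Z} = B_0,
\qquad
C_n := \mathbb{Z}/p^{-n}\mathbb{Z} \quad (n \le 0).\\
% the map A_n -> B_n is multiplication by p^{-n}; its cokernel is C_n
&0 \to A \to B \to C \to 0 \text{ is exact, where } A_n \to B_n \text{ is multiplication by } p^{-n}.\\
&\varprojlim_n A_n = 0,\qquad \varprojlim_n B_n = \mathbb{Z},\qquad
 {\varprojlim_n}^{1} B_n = 0,\qquad
 \varprojlim_n C_n = \mathbb{Z}_p \ (\text{the } p\text{-adic integers}).\\
&\text{Long exact sequence: } 0 \to 0 \to \mathbb{Z} \to \mathbb{Z}_p \to {\varprojlim_n}^{1} A_n \to 0
\ \Longrightarrow\ {\varprojlim_n}^{1} A_n \cong \mathbb{Z}_p/\mathbb{Z} \neq 0.
\end{align*}
```

The Mittag–Leffler condition fails here, as it must: the image of $A_i \to A_n$ is $p^{\,n-i}\mathbb{Z}$, which keeps shrinking as $i \to -\infty$.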
Theorem 12 (Zeeman comparison theorem). Let $E$ and $E'$ be first quadrant spectral sequences such that $E^2_{p,q} = E^2_{p,0} \otimes E^2_{0,q}$ and $E'^{2}_{p,q} = E'^{2}_{p,0} \otimes E'^{2}_{0,q}$. Let $f : E \to E'$ be a homomorphism of spectral sequences such that $f^2_{p,q} = f^2_{p,0} \otimes f^2_{0,q}$. Suppose that $f^\infty_{p,q} : E^\infty_{p,q} \to E'^{\infty}_{p,q}$ is an isomorphism for all $p$ and $q$. Then the following are equivalent:
(i) $f^2_{p,0} : E^2_{p,0} \to E'^{2}_{p,0}$ is an isomorphism for $p \le n - 1$;
(ii) $f^2_{0,q} : E^2_{0,q} \to E'^{2}_{0,q}$ is an isomorphism for $q \le n$.
There is a version of the Serre spectral sequence for generalized homology theories coming from the exact couple obtained by applying the generalized homology theory to the Serre filtration of $X$.
Theorem 13 (Serre spectral sequence for generalized homology). Let $F \to X \to B$ be a fibration and let $Y$ be an (unreduced) homology theory satisfying the Milnor wedge axiom. Then there is a (right half-plane) spectral sequence with $E^2_{p,q} \cong H_p(B;\, {}^tY_q(F))$ converging to $Y_{p+q}(X)$.
Cocompleteness of the filtration follows from the properties of generalized homology theories satisfying the wedge axiom (Milnor 1962), and the rest of the convergence conditions are trivial since the filtration is 0 in negative degrees. Here, unlike the Serre spectral sequence for ordinary homology, the existence of terms in the fourth quadrant opens the possibility for composition series of infinite length, although in the case where $B$ is a finite-dimensional complex all the nonzero terms of the spectral sequence will live in the strip between $p = 0$ and $p = \dim B$ and so the filtrations will be finite.
The special case of the fibration $* \to X \to X$ yields what is known as the "Atiyah–Hirzebruch spectral sequence".
Theorem 14 (Atiyah–Hirzebruch spectral sequence). Let $X$ be a CW-complex and let $Y$ be an (unreduced) homology theory satisfying the Milnor wedge axiom. Then there is a (right half-plane) spectral sequence with $E^2_{p,q} \cong H_p(X; Y_q(*))$ converging to $Y_{p+q}(X)$.
In the cohomology Serre spectral sequence for generalized cohomology (including the cohomology Atiyah–Hirzebruch spectral sequence), convergence of the spectral sequence to $Y^*(X)$ is not guaranteed. Convergence to $\varprojlim_n Y^*(F_nX)$, should that occur, would be of the type discussed in case II in the section "Convergence of graded spectral sequences". Since $X_n = \emptyset$ for $n < 0$, the system defining $L_\infty$ stabilizes to 0. Therefore, $L_\infty = 0$ and, by the discussion in that section, ${\varprojlim_r}^1 Z^r = 0$ becomes a sufficient condition for convergence to $\varprojlim_n Y^*(F_nX)$. However, since the real object of study is usually $Y^*(X)$, the spectral sequence is most useful when one is also able to show ${\varprojlim_n}^1 Y^*(F_nX) = 0$, in which case the Milnor exact sequence (Milnor 1962)
$$0 \to {\varprojlim_n}^1 Y^{*-1}(F_nX) \to Y^*(X) \to \varprojlim_n Y^*(F_nX) \to 0$$
gives $Y^*(X) \cong \varprojlim_n Y^*(F_nX)$.
If $Y^*(\ )$ has cup products then the spectral sequence has the extra structure of a spectral sequence of $Y^*(*)$-algebras. In the case where $B$ is finite dimensional, all convergence problems disappear since the spectral sequence lives in a strip and the filtrations are finite.
Example 4 Let $K^*(\ )$ be complex K-theory. Since $K^*(*) \cong \mathbb{Z}[z, z^{-1}]$ with $|z| = 2$, in the Atiyah–Hirzebruch spectral sequence for $K^*(\mathbb{CP}^n)$ we have
$$E^2_{p,q} = \begin{cases} \mathbb{Z} & \text{if $q$ is even and $p$ is even with } 0 \le p \le 2n,\\ 0 & \text{otherwise.} \end{cases}$$
Because $\mathbb{CP}^n$ is a finite complex, the spectral sequence converges to $K^*(\mathbb{CP}^n)$. Since all the nonzero terms have even total degree and all the differentials have total degree $+1$, the spectral sequence collapses at $E^2$ and we conclude that $K^q(\mathbb{CP}^n) = 0$ if $q$ is odd and that it has a composition series consisting of $(n + 1)$ copies of $\mathbb{Z}$ when $q$ is even. Since $\mathbb{Z}$ is a free abelian group, this uniquely identifies the group structure of $K^{\mathrm{even}}(\mathbb{CP}^n)$ as $\mathbb{Z}^{n+1}$. To find the ring structure we can make use of the fact that this is a spectral sequence of $K^*(*)$-algebras. The result is $K^*(\mathbb{CP}^n) \cong K^*(*)[x]/(x^{n+1})$, where $|x| = 2$.
In the Atiyah–Hirzebruch spectral sequence for $K^*(\mathbb{CP}^\infty)$ again all the terms have even total degree so the spectral sequence collapses at $E^2$. We noted earlier that collapse of the spectral sequence implies that ${\varprojlim_r}^1 Z^r = 0$ and so the spectral sequence converges to $\varprojlim_n K^*(\mathbb{CP}^n)$, where we used $F_{2n}\mathbb{CP}^\infty = \mathbb{CP}^n$. Since our preceding calculation shows that $K^*(\mathbb{CP}^n) \to K^*(\mathbb{CP}^{n-1})$ is onto, Mittag–Leffler (Theorem 10) implies that ${\varprojlim_n}^1 K^*(\mathbb{CP}^n) = 0$. Therefore, the spectral sequence converges to $K^*(\mathbb{CP}^\infty)$ and we find that $K^*(\mathbb{CP}^\infty) \cong \varprojlim_n K^*(\mathbb{CP}^n)$, which is isomorphic to the power series ring $K^*(*)[[x]]$, where $|x| = 2$.
In topology one might be interested in the Atiyah–Hirzebruch spectral sequence in the case where $X$ is a spectrum rather than a space (a spectrum being a generalization in which cells in negative degrees are allowed including the possibility that the dimensions
of the cells are not bounded below). In such cases, the spectral sequence is no longer constrained to lie in the right half-plane and convergence criteria are not well understood for either the homology or cohomology version.
Spectral Sequence of a Double Complex
A double complex is a chain complex of chain complexes. That is, it is a bigraded abelian group $C_{p,q}$ together with two differentials $d' : C_{p,q} \to C_{p-1,q}$ and $d'' : C_{p,q} \to C_{p,q-1}$ satisfying $d' \circ d' = 0$, $d'' \circ d'' = 0$, and $d'd'' = d''d'$. Given a double complex $C$ its total complex $\operatorname{Tot} C$ is defined by $(\operatorname{Tot} C)_n := \bigoplus_{p+q=n} C_{p,q}$ with differential defined by $d|_{C_{p,q}} := d' + (-1)^p d'' : C_{p,q} \to C_{p-1,q} \oplus C_{p,q-1} \subset \operatorname{Tot}_{n-1} C$.
There are two natural filtrations, $F'\operatorname{Tot} C$ and $F''\operatorname{Tot} C$, on $\operatorname{Tot} C$ given by
$$F'_p(\operatorname{Tot} C)_n = \bigoplus_{\substack{s+t=n\\ s \le p}} C_{s,t}, \qquad F''_p(\operatorname{Tot} C)_n = \bigoplus_{\substack{s+t=n\\ t \le p}} C_{s,t}$$
yielding two spectral sequences abutting to $H_*(\operatorname{Tot} C)$. In the first $E'^{2}_{p,q} = H_p(H_q(C_{*,*}))$ and in the other $E''^{2}_{p,q} = H_q(H_p(C_{*,*}))$. Convergence of these spectral sequences is not guaranteed, although the first will always converge if there exists $N$ such that $C_{p,q} = 0$ for $p < N$ and the second will converge if there exists $N$ such that $C_{p,q} = 0$ for $q < N$. From the double complex $C$ one could instead form the product total complex $(\operatorname{Tot}^{\Pi} C)_n := \prod_{p+q=n} C_{p,q}$ and proceed in a similar manner to construct the same spectral sequences with different convergence problems. In the important special case of a first quadrant double complex both spectral sequences converge and information is often obtained by playing one off against the other.
Example 5 Let $M$ and $N$ be $R$-modules. Let $\operatorname{Tor}'^{R}_*(M, N)$ and $\operatorname{Tor}''^{R}_*(M, N)$ be the derived functors of $(\ ) \otimes N$ and $M \otimes (\ )$, respectively. Let $P_*$ and $Q_*$ be projective resolutions of $M$ and $N$ respectively. Define a first quadrant double complex by $C_{p,q} := P_p \otimes Q_q$. Since $P_p$ is projective,
$$H_q(C_{p,*}) = P_p \otimes H_q(Q_*) = \begin{cases} 0 & \text{if } q \ne 0,\\ P_p \otimes N & \text{if } q = 0, \end{cases}$$
and so in the first spectral sequence of the double complex,
$$E'^{2}_{p,q} = \begin{cases} 0 & \text{if } q \ne 0,\\ \operatorname{Tor}'^{R}_p(M, N) & \text{if } q = 0. \end{cases}$$
Therefore, the spectral sequence collapses to give $H_n(\operatorname{Tot} C) \cong \operatorname{Tor}'^{R}_n(M, N)$. Similarly, the second spectral sequence shows that $H_n(\operatorname{Tot} C) \cong \operatorname{Tor}''^{R}_n(M, N)$. Thus, $\operatorname{Tor}^R_*(M, N)$ can be computed equally well from a projective resolution of either variable.
The technique of using a double complex in which one spectral sequence yields the homology of the total complex to which both converge can be used to prove:
Theorem 15 (Grothendieck spectral sequence). Let $\underline{C} \xrightarrow{\ F\ } \underline{B} \xrightarrow{\ G\ } \underline{A}$ be a composition of additive functors, where $\underline{C}$, $\underline{B}$, and $\underline{A}$ are abelian categories. Assume that all objects in $\underline{C}$ and $\underline{B}$ have projective resolutions. Suppose that $F$ takes projectives to projectives. Then for all objects $C$ of $\underline{C}$ there exists a (first quadrant) spectral sequence with $E^2_{p,q} = (L_pG)((L_qF)(C))$ converging to $(L_{p+q}(GF))(C)$.
Naturally, there is a corresponding version for right derived functors.
An application of the Grothendieck spectral sequence is the following "change of rings spectral sequence." Let $f : R \to S$ be a ring homomorphism, let $M$ be a right $S$-module and let $N$ be a left $R$-module. Let $F(A) = S \otimes_R A$ and $G(B) = M \otimes_S B$, and note that $GF(A) = M \otimes_R A$. Applying the Grothendieck spectral sequence to the composition (left $R$-modules $\xrightarrow{\ F\ }$ left $S$-modules $\xrightarrow{\ G\ }$ abelian groups) yields a convergent spectral sequence $E^2_{p,q} \cong \operatorname{Tor}^S_p(M, \operatorname{Tor}^R_q(S, N)) \Rightarrow \operatorname{Tor}^R_{p+q}(M, N)$.
Eilenberg–Moore Spectral Sequence
For a topological group $G$, Milnor showed how to construct a universal $G$-bundle $G \to EG \to BG$ in which $EG$ is the infinite join $G^{*\infty}$ with diagonal $G$-action. There is a natural filtration $F_nBG := G^{*(n+1)}/G$ on $BG$ and therefore an induced filtration on the base of any principal $G$-bundle. This filtration yields a spectral sequence including as a special case a tool for calculating $H_*(BG)$ from knowledge of $H_*(G)$.
Theorem 16 Let $G \to X \to B$ be a principal $G$-bundle and let $H_*(\ )$ denote homology with coefficients in a field. Then there is a first quadrant spectral sequence with $E^2_{p,q} = \operatorname{Tor}^{H_*(G)}_{pq}(H_*(X), H_*(*))$ converging to $H_{p+q}(B)$.
Here the group structure makes $H_*(G)$ into an algebra and $\operatorname{Tor}^A_{pq}(M, N)$ denotes degree $q$ of the graded object formed as the $p$th-derived functor of the tensor product of the graded modules $M$ and $N$ over the graded ring $A$.
There is also a version (Eilenberg and Moore 1962) which, like the Serre spectral sequence, is suitable for computing $H_*(G)$ from $H_*(BG)$.
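
As a hedged illustration of Theorem 16 (a standard computation, not taken from this article), take $G = S^1$ and $X = EG$, which is contractible, so that the base is $BG = \mathbb{CP}^\infty$. Over a field $k$, $H_*(S^1) \cong \Lambda[x]$ is an exterior algebra on one generator with $|x| = 1$, and $H_*(EG) = H_*(*) = k$. The names $k$, $\Lambda$, $x$, $\Sigma$, and $y_p$ are notation chosen here, and the block assumes the amsmath package.

```latex
\begin{align*}
% free resolution of k over the exterior algebra; each map is multiplication by x
&\text{Resolution of } k \text{ over } \Lambda = \Lambda[x]:\quad
\cdots \xrightarrow{\ \cdot x\ } \Sigma^{2}\Lambda \xrightarrow{\ \cdot x\ } \Sigma\Lambda
\xrightarrow{\ \cdot x\ } \Lambda \longrightarrow k \to 0,\\
&\text{where } \Sigma^{p}\Lambda \text{ is } \Lambda \text{ with its generator placed in internal degree } p.\\
% tensoring with k sets x to zero, so all differentials vanish
&\text{Applying } k \otimes_{\Lambda} (\ ) \text{ kills every differential, so }
\operatorname{Tor}^{\Lambda}_{p,q}(k,k) \cong
\begin{cases} k\{y_p\} & q = p,\\ 0 & q \ne p. \end{cases}\\
&\text{Thus } E^{2}_{p,q} \ne 0 \text{ only for } q = p,\ \text{i.e. in total degree } p+q = 2p.\\
&\text{Each } d^{r} \text{ lowers total degree by } 1, \text{ so } d^{r} = 0 \text{ for all } r \ge 2
 \text{ and } E^{2} = E^{\infty}.\\
&\text{Hence } H_{n}(BS^{1};k) \cong k \text{ for } n \text{ even},\ n \ge 0, \text{ and } 0 \text{ for } n \text{ odd}.
\end{align*}
```

This agrees with the cell structure of $\mathbb{CP}^\infty$, which has one cell in each even dimension. Because the coefficients form a field and each diagonal contains a single nonzero term, no extension problem arises in passing from $E^\infty$ to the homology groups.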
Spectral Theory of Linear Operators 633
To begin, we shall only consider operators $A$ with $D(A) = X$.
An operator $A$ is called bounded if there is a constant $M$ such that
$$\|Ax\| \le M\|x\|, \quad x \in X \qquad [1]$$
The norm of such an operator is defined by
$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} \qquad [2]$$
It is the smallest $M$ which works in [1]. An operator $A$ is called continuous at a point $x \in X$ if $x_n \to x$ in $X$ implies $Ax_n \to Ax$ in $Y$. A bounded linear operator is continuous at each point. For if $x_n \to x$ in $X$, then
$$\|Ax_n - Ax\| \le \|A\|\,\|x_n - x\| \to 0$$
We also have
Theorem 1 If a linear operator $A$ is continuous at one point $x_0 \in X$, then it is bounded, and hence continuous at every point.
We let $B(X, Y)$ be the set of bounded linear operators from $X$ to $Y$. Under the norm [2], one easily checks that $B(X, Y)$ is a normed vector space.
The Adjoint Operator
An assignment $F$ of a number to each element $x$ of a vector space is called a functional and denoted by $F(x)$. If it satisfies
$$F(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 F(x_1) + \alpha_2 F(x_2) \qquad [3]$$
for $\alpha_1$, $\alpha_2$ scalars, it is called linear. It is called bounded if
$$|F(x)| \le M\|x\|, \quad x \in X \qquad [4]$$
If $F$ is a bounded linear functional on a normed vector space $X$, the norm of $F$ is defined by
$$\|F\| = \sup_{x \in X,\ x \ne 0} \frac{|F(x)|}{\|x\|} \qquad [5]$$
It is equal to the smallest number $M$ satisfying [4].
For any normed vector space $X$, let $X'$ denote the set of bounded linear functionals on $X$. If $f, g \in X'$, we say that $f = g$ if
$$f(x) = g(x) \quad \text{for all } x \in X$$
The "zero" functional is the one assigning zero to all $x \in X$. We define $h = f + g$ by
$$h(x) = f(x) + g(x), \quad x \in X$$
and $g = \alpha f$ by
$$g(x) = \alpha f(x), \quad x \in X$$
Under these definitions, $X'$ becomes a vector space. The expression
$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}, \quad f \in X' \qquad [6]$$
is easily seen to be a norm. Thus, $X'$ is a normed vector space. It is therefore natural to ask when $X'$ will be complete. A rather surprising answer is given by
Theorem 2 $X'$ is a Banach space whether or not $X$ is.
(For the definition of a Banach space, see, e.g., Schechter (2002) or the appendix at the end of this article.)
Suppose $X$, $Y$ are normed vector spaces and $A \in B(X, Y)$. For each $y' \in Y'$, the expression $y'(Ax)$ assigns a scalar to each $x \in X$. Thus, it is a functional $F(x)$. Clearly $F$ is linear. It is also bounded since
$$|F(x)| = |y'(Ax)| \le \|y'\|\,\|Ax\| \le \|y'\|\,\|A\|\,\|x\|$$
Thus, there is an $x' \in X'$ such that
$$y'(Ax) = x'(x), \quad x \in X \qquad [7]$$
This functional $x'$ is unique. Thus, to each $y' \in Y'$ we have assigned a unique $x' \in X'$. We designate this assignment by $A'$ and note that it is a linear operator from $Y'$ to $X'$. Thus, [7] can be written in the form
$$y'(Ax) = A'y'(x) \qquad [8]$$
The operator $A'$ is called the adjoint (or conjugate) of $A$. We note
Theorem 3 $A' \in B(Y', X')$, and $\|A'\| = \|A\|$.
The adjoint has the following easily verified properties:
$$(A + B)' = A' + B' \qquad [9]$$
$$(\alpha A)' = \alpha A' \qquad [10]$$
$$(AB)' = B'A' \qquad [11]$$
Why should we consider adjoints? One reason is as follows. Many problems in mathematics and its applications can be put in the form: given normed vector spaces $X$, $Y$ and an operator $A \in B(X, Y)$, one wishes to solve
$$Ax = y \qquad [12]$$
The set of all $y$ for which one can solve [12] is called the "range" of $A$ and is denoted by $R(A)$. The set of all $x$ for which $Ax = 0$ is called the "null space" of $A$ and is denoted by $N(A)$. Since $A$ is linear, it is easily checked that $N(A)$ and $R(A)$ are subspaces of $X$ and $Y$,
Spectral Theory of Linear Operators 635
respectively (for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article). The dimension of N(A) is denoted by $\alpha(A)$.

If $y \in R(A)$, there is an $x \in X$ satisfying [12]. For any $y' \in Y'$ we have

$$y'(Ax) = y'(y)$$

Taking adjoints we get

$$A'y'(x) = y'(y)$$

If $y' \in N(A')$, this gives $y'(y) = 0$. Thus, a necessary condition that $y \in R(A)$ is that $y'(y) = 0$ for all $y' \in N(A')$. Obviously, it would be of great interest to know when this condition is also sufficient.

The Spectrum and Resolvent Sets

From this point henceforth we shall assume that X = Y. We can then speak of the identity operator I defined by

$$Ix = x, \quad x \in X$$

For a scalar $\lambda$, the operator $\lambda I$ is given by

$$\lambda I x = \lambda x, \quad x \in X$$

We shall denote the operator $\lambda I$ by $\lambda$. We shall denote the space B(X, X) by B(X).

For any operator $A \in B(X)$, a scalar $\lambda$ for which $\alpha(A - \lambda) \ne 0$ is called an eigenvalue of A. Any element $x \ne 0$ of X such that $(A - \lambda)x = 0$ is called an eigenvector (or eigenelement). The points $\lambda$ for which $(A - \lambda)$ has a bounded inverse in B(X) comprise the resolvent set $\rho(A)$ of A (for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article). If X is a Banach space, it is the set of those $\lambda$ such that $\alpha(A - \lambda) = 0$ and $R(A - \lambda) = X$. The spectrum $\sigma(A)$ of A consists of all scalars not in $\rho(A)$. The set of eigenvalues of A is sometimes called the point spectrum of A and is denoted by $P\sigma(A)$.

We note that

Theorem 4 For A in B(X), $\sigma(A') = \sigma(A)$.

We are now going to examine the sets $\rho(A)$ and $\sigma(A)$ for arbitrary $A \in B(X)$.

Theorem 5 $\rho(A)$ is an open set and hence $\sigma(A)$ is a closed set.

Does every operator $A \in B(X)$ have points in its resolvent set? Yes. In fact, we have

Theorem 6 For A in B(X), set

$$r_\sigma(A) = \inf_n \|A^n\|^{1/n}$$ [13]

Then $\rho(A)$ contains all scalars $\lambda$ such that $|\lambda| > r_\sigma(A)$.

Let p(t) be a polynomial of the form

$$p(t) = \sum_{k=0}^{n} a_k t^k$$

Then for any operator $A \in B(X)$, we define the operator

$$p(A) = \sum_{k=0}^{n} a_k A^k$$

where we take $A^0 = I$. We have

Theorem 7 If $\lambda \in \sigma(A)$, then $p(\lambda) \in \sigma(p(A))$ for any polynomial p(t).

Proof Since $\lambda$ is a root of $p(t) - p(\lambda)$, we have

$$p(t) - p(\lambda) = (t - \lambda)q(t)$$

where q(t) is a polynomial with real coefficients. Hence,

$$p(A) - p(\lambda) = (A - \lambda)q(A) = q(A)(A - \lambda)$$ [14]

Now, if $p(\lambda)$ is in $\rho(p(A))$, then [14] shows that $\alpha(A - \lambda) = 0$ and $R(A - \lambda) = X$. This means that $\lambda \in \rho(A)$, and the theorem is proved. □

A symbolic way of writing Theorem 7 is

$$p(\sigma(A)) \subset \sigma(p(A))$$ [15]

Note that, in general, there may be points in $\sigma(p(A))$ which may not be of the form $p(\lambda)$ for some $\lambda \in \sigma(A)$. As an example, consider the operator on $R^2$ given by

$$A(\alpha_1, \alpha_2) = (-\alpha_2, \alpha_1)$$

A has no spectrum; $A - \lambda$ is invertible for all real $\lambda$. However, $A^2$ has $-1$ as an eigenvalue. What is the reason for this? It is simply that our scalars are real. Consequently, imaginary numbers cannot be considered as eigenvalues. We shall see later that in order to obtain a more complete theory, we shall have to consider complex Banach spaces. Another question is whether every operator $A \in B(X)$ has points in its spectrum. For complex Banach spaces, the answer is yes.

The Spectral Mapping Theorem

Suppose we want to solve an equation of the form

$$p(A)x = y, \quad x, y \in X$$ [16]

where p(t) is a polynomial and $A \in B(X)$. If 0 is not in the spectrum of p(A), then p(A) has an inverse in B(X) and, hence, [16] can be solved for all $y \in X$. So a natural question to ask is: what is the spectrum of p(A)? By Theorem 7 we see that it contains $p(\sigma(A))$,
but by the remark at the end of the preceding section it can contain other points. If it were true that

$$p(\sigma(A)) = \sigma(p(A))$$ [17]

then we could say that [16] can be solved uniquely for all $y \in X$ if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(A)$. For a complex Banach space we have

Theorem 8 If X is a complex Banach space, then $\mu \in \sigma(p(A))$ if and only if $\mu = p(\lambda)$ for some $\lambda \in \sigma(A)$, that is, if [17] holds.

Proof We have proved it in one direction already (Theorem 7). To prove it in the other, let $\gamma_1, \ldots, \gamma_n$ be the (complex) roots of $p(t) - \mu$. For a complex Banach space they are all scalars. Thus,

$$p(A) - \mu = c(A - \gamma_1) \cdots (A - \gamma_n), \quad c \ne 0$$

Now suppose that all of the $\gamma_j$ are in $\rho(A)$. Then each $A - \gamma_j$ has an inverse in B(X). Hence, the same is true for $p(A) - \mu$. In other words, $\mu \in \rho(p(A))$. Thus, if $\mu \in \sigma(p(A))$, then at least one of the $\gamma_j$ must be in $\sigma(A)$, say $\gamma_k$. Hence, $\mu = p(\gamma_k)$, where $\gamma_k \in \sigma(A)$. This completes the proof. □

Theorem 8 is called the "spectral mapping theorem" for polynomials. As mentioned before, it has the useful consequence:

Corollary 1 If X is a complex Banach space, then eqn [16] has a unique solution for every y in X if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(A)$.

Operational Calculus

$$(z - A)^{-1} = \sum_{n=1}^{\infty} z^{-n} A^{n-1}$$ [18]

where the convergence is in the norm of B(X).

Let C be any circle with center at the origin and radius greater than, say, $\|A\|$. Then, by Lemma 1,

$$\oint_C z^n (z - A)^{-1}\, dz = \sum_{k=1}^{\infty} A^{k-1} \oint_C z^{n-k}\, dz = 2\pi i A^n$$ [19]

or

$$A^n = \frac{1}{2\pi i} \oint_C z^n (z - A)^{-1}\, dz$$ [20]

where the line integral is taken in the right direction. Note that the line integrals are defined in the same way as is done in the theory of functions of a complex variable. The existence of the integrals and their independence of path (so long as the integrands remain analytic) are proved in the same way. Since $(z - A)^{-1}$ is analytic on $\rho(A)$, we have

Theorem 10 Let C be any closed curve containing $\sigma(A)$ in its interior. Then [20] holds.

As a direct consequence of this, we have

Theorem 11 $r_\sigma(A) = \max_{\lambda \in \sigma(A)} |\lambda|$ and $\|A^n\|^{1/n} \to r_\sigma(A)$ as $n \to \infty$.

We can now put Lemma 1 in the following form:

Theorem 12 If $|z| > r_\sigma(A)$, then [18] holds with convergence in B(X).

Now let b be any number greater than $r_\sigma(A)$, and let f(z) be a complex-valued function that is analytic in $|z| < b$. Thus,

for k sufficiently large, and the series

$$\sum_{k=0}^{\infty} |a_k| c^k$$
is convergent. We define f(A) to be

$$\sum_{k=0}^{\infty} a_k A^k$$ [22]

By Theorem 10, this gives

$$f(A) = \frac{1}{2\pi i} \sum_{k=0}^{\infty} a_k \oint_C z^k (z - A)^{-1}\, dz = \frac{1}{2\pi i} \oint_C \sum_{k=0}^{\infty} a_k z^k (z - A)^{-1}\, dz = \frac{1}{2\pi i} \oint_C f(z)(z - A)^{-1}\, dz$$ [23]

where C is any circle about the origin with radius greater than $r_\sigma(A)$ and less than b.

We can now give the formula that we promised. Suppose f(z) does not vanish for $|z| < b$. Set g(z) = 1/f(z). Then g(z) is analytic in $|z| < b$, and hence g(A) is defined. Moreover,

$$f(A)g(A) = \frac{1}{2\pi i} \oint_C f(z)g(z)(z - A)^{-1}\, dz = \frac{1}{2\pi i} \oint_C (z - A)^{-1}\, dz = I$$

Since f(A) and g(A) clearly commute, we see that $f(A)^{-1}$ exists and equals g(A). Hence,

$$f(A)^{-1} = \frac{1}{2\pi i} \oint_C \frac{1}{f(z)} (z - A)^{-1}\, dz$$ [24]

In particular, if

$$g(z) = 1/f(z) = \sum_{k=0}^{\infty} c_k z^k, \quad |z| < b$$

then

$$f(A)^{-1} = \sum_{k=0}^{\infty} c_k A^k$$ [25]

Now, suppose f(z) is analytic in an open set $\Omega$ containing $\sigma(A)$, but not analytic in a disk of radius greater than $r_\sigma(A)$. In this case, we cannot say that the series [22] converges in norm to an operator in B(X). However, we can still define f(A) in the following way: there exists an open set $\omega$ whose closure $\bar\omega \subset \Omega$ and whose boundary $\partial\omega$ consists of a finite number of simple closed curves that do not intersect, and such that $\sigma(A) \subset \omega$. (That such a set always exists is left as an exercise; see, e.g., Schechter (2002).) We now define f(A) by

$$f(A) = \frac{1}{2\pi i} \oint_{\partial\omega} f(z)(z - A)^{-1}\, dz$$ [26]

where the line integrals are to be taken in the proper directions. It is easily checked that $f(A) \in B(X)$ and is independent of the choice of the set $\omega$. By [23], this definition agrees with the one given above for the case when $\Omega$ contains a disk of radius greater than $r_\sigma(A)$. Note that if $\Omega$ is not connected, f(z) need not be the same function on different components of $\Omega$.

Now suppose f(z) does not vanish on $\sigma(A)$. Then we can choose $\omega$ so that f(z) does not vanish on $\bar\omega$ (this is also an exercise). Thus, g(z) = 1/f(z) is analytic on an open set containing $\bar\omega$ so that g(A) is defined. Since f(z)g(z) = 1, one would expect that f(A)g(A) = g(A)f(A) = I, in which case, it would follow that $f(A)^{-1}$ exists and is equal to g(A). This follows from

Lemma 2 If f(z) and g(z) are analytic in an open set $\Omega$ containing $\sigma(A)$ and

$$h(z) = f(z)g(z)$$

then h(A) = f(A)g(A).

Therefore, it follows that we have

Theorem 13 If A is in B(X) and f(z) is a function analytic in an open set $\Omega$ containing $\sigma(A)$ such that $f(z) \ne 0$ on $\sigma(A)$, then $f(A)^{-1}$ exists and is given by

$$f(A)^{-1} = \frac{1}{2\pi i} \oint_{\partial\omega} \frac{1}{f(z)} (z - A)^{-1}\, dz$$

where $\omega$ is any open set such that
(i) $\sigma(A) \subset \omega$, $\bar\omega \subset \Omega$,
(ii) $\partial\omega$ consists of a finite number of simple closed curves, and
(iii) $f(z) \ne 0$ on $\bar\omega$.

Now that we have defined f(A) for functions analytic in a neighborhood of $\sigma(A)$, we can show that the spectral mapping theorem holds for such functions as well (see Theorem 8). We have

Theorem 14 If f(z) is analytic in a neighborhood of $\sigma(A)$, then

$$\sigma(f(A)) = f(\sigma(A))$$ [27]

that is, $\mu \in \sigma(f(A))$ if and only if $\mu = f(\lambda)$ for some $\lambda \in \sigma(A)$.

Complexification

What we have just done is valid for complex Banach spaces. Suppose, however, we are dealing with a real Banach space. What can be said then?
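To make the question concrete, the operator $A(\alpha_1, \alpha_2) = (-\alpha_2, \alpha_1)$ on $R^2$ from the earlier example can be examined numerically. The following is a minimal sketch in Python, assuming A is represented by its 2×2 matrix in the standard basis; the helper names (`matmul`, `real_eigenvalues`, `complex_eigenvalues`) are illustrative choices, not notation from the text. It checks that A has no real eigenvalues, that $A^2$ has $-1$ as an eigenvalue, and that once complex scalars are allowed the eigenvalues of A square to $-1$, as the spectral mapping theorem for $p(t) = t^2$ would suggest.

```python
import cmath
import math

# Matrix of A(a1, a2) = (-a2, a1) in the standard basis of R^2
# (an assumed representation, chosen for illustration).
A = [[0.0, -1.0],
     [1.0, 0.0]]


def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def _trace_det(M):
    """Trace and determinant of a 2x2 matrix."""
    trace = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return trace, det


def real_eigenvalues(M):
    """Real roots of t^2 - trace*t + det, the characteristic polynomial of M."""
    trace, det = _trace_det(M)
    disc = trace * trace - 4.0 * det
    if disc < 0:
        return []  # M - t is invertible for every real scalar t
    root = math.sqrt(disc)
    return sorted({(trace - root) / 2.0, (trace + root) / 2.0})


def complex_eigenvalues(M):
    """Both roots of the characteristic polynomial over the complex numbers."""
    trace, det = _trace_det(M)
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return [(trace - root) / 2.0, (trace + root) / 2.0]


A2 = matmul(A, A)
print("real eigenvalues of A:   ", real_eigenvalues(A))
print("real eigenvalues of A^2: ", real_eigenvalues(A2))
print("complex eigenvalues of A:", complex_eigenvalues(A))
print("their squares:           ", [z * z for z in complex_eigenvalues(A)])
```

With real scalars the first list is empty while the second contains $-1$, which is exactly the mismatch between $p(\sigma(A))$ and $\sigma(p(A))$ noted earlier. Passing to complex scalars restores the correspondence.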
Let X be a real Banach space. Consider the set Z of all ordered pairs $\langle x, y \rangle$ of elements of X. We set

$$\langle x_1, y_1 \rangle + \langle x_2, y_2 \rangle = \langle x_1 + x_2, y_1 + y_2 \rangle$$

$$(\alpha + i\beta)\langle x, y \rangle = \langle (\alpha x - \beta y), (\beta x + \alpha y) \rangle, \quad \alpha, \beta \in R$$

With these definitions, one checks easily that Z is a complex vector space. The set of elements of Z of the form $\langle x, 0 \rangle$ can be identified with X. We would like to introduce a norm on Z that would make Z into a Banach space and satisfy

$$\|\langle x, 0 \rangle\| = \|x\|, \quad x \in X$$

An obvious suggestion is

$$(\|x\|^2 + \|y\|^2)^{1/2}$$

However, it is soon discovered that this is not a norm on Z (why?). We have to be more careful. One that works is given by

$$\|\langle x, y \rangle\| = \max_{\alpha^2 + \beta^2 = 1} (\|\alpha x - \beta y\|^2 + \|\beta x + \alpha y\|^2)^{1/2}$$

With this norm, Z becomes a complex Banach space having the desired properties.

Now let A be an operator in B(X). We define an operator $\hat A$ in B(Z) by

$$\hat A \langle x, y \rangle = \langle Ax, Ay \rangle$$

Then

$$\|\hat A \langle x, y \rangle\| = \max_{\alpha^2 + \beta^2 = 1} (\|\alpha Ax - \beta Ay\|^2 + \|\beta Ax + \alpha Ay\|^2)^{1/2} = \max_{\alpha^2 + \beta^2 = 1} (\|A(\alpha x - \beta y)\|^2 + \|A(\beta x + \alpha y)\|^2)^{1/2} \le \|A\|\,\|\langle x, y \rangle\|$$

Thus,

$$\|\hat A\| \le \|A\|$$

But,

$$\|\hat A\| \ge \sup_{x \ne 0} \frac{\|\langle Ax, 0 \rangle\|}{\|\langle x, 0 \rangle\|} = \|A\|$$

Hence,

$$\|\hat A\| = \|A\|$$

If $\lambda$ is real, then

$$(\hat A - \lambda)\langle x, y \rangle = \langle (A - \lambda)x, (A - \lambda)y \rangle$$

This shows that $\lambda \in \rho(\hat A)$ if and only if $\lambda \in \rho(A)$. Similarly, if p(t) is a polynomial with real coefficients, then

$$p(\hat A)\langle x, y \rangle = \langle p(A)x, p(A)y \rangle$$

showing that $p(\hat A)$ has an inverse in B(Z) if and only if p(A) has an inverse in B(X). Hence, we have

Theorem 15 Equation [16] has a unique solution for each y in X if and only if $p(\lambda) \ne 0$ for all $\lambda \in \sigma(\hat A)$.

In the example given earlier, the operator $\hat A$ has eigenvalues $i$ and $-i$. Hence, $-1$ is in the spectrum of $\hat A^2$ and also in that of $A^2$. Thus, the equation

$$(A^2 + 1)x = y$$

cannot be solved uniquely for all y.

Compact Operators

Let X, Y be normed vector spaces. A linear operator K from X to Y is called compact (or completely continuous) if D(K) = X and for every sequence $\{x_n\} \subset X$ such that $\|x_n\| \le C$, the sequence $\{Kx_n\}$ has a subsequence which converges in Y. The set of all compact operators from X to Y is denoted by K(X, Y).

A compact operator is bounded. Otherwise, there would be a sequence $\{x_n\}$ such that $\|x_n\| \le C$, while $\|Kx_n\| \to \infty$. Then $\{Kx_n\}$ could not have a convergent subsequence. The sum of two compact operators is compact, and the same is true of the product of a scalar and a compact operator. Hence, K(X, Y) is a subspace of B(X, Y).

If $A \in B(X, Y)$ and $K \in K(Y, Z)$, then $KA \in K(X, Z)$. Similarly, if $L \in K(X, Y)$ and $B \in B(Y, Z)$, then $BL \in K(X, Z)$.

Suppose $K \in B(X, Y)$, and there is a sequence $\{F_n\}$ of compact operators such that

$$\|K - F_n\| \to 0 \quad \text{as } n \to \infty$$ [28]

We claim that if Y is a Banach space, then K is compact.

Theorem 16 Let X be a normed vector space and Y a Banach space. If L is in B(X, Y) and there is a sequence $\{K_n\} \subset K(X, Y)$ such that

$$\|L - K_n\| \to 0 \quad \text{as } n \to \infty$$

then L is in K(X, Y).

Theorem 17 Let X be a Banach space and let K be an operator in K(X). Set A = I − K. Then, R(A) is closed in X and $\dim N(A) = \dim N(A')$ is finite.
In particular, either R(A) = X and N(A) = {0}, or $R(A) \ne X$ and $N(A) \ne \{0\}$.

The last statement of Theorem 17 is known as the "Fredholm alternative."

Let X, Y be Banach spaces. An operator $A \in B(X, Y)$ is said to be a Fredholm operator from X to Y if
1. $\alpha(A) = \dim N(A)$ is finite,
2. R(A) is closed in Y, and
3. $\beta(A) = \dim N(A')$ is finite.

The set of Fredholm operators from X to Y is denoted by $\Phi(X, Y)$. If X = Y and $K \in K(X)$, then, clearly, I − K is a Fredholm operator. The index of a Fredholm operator is defined as

$$i(A) = \alpha(A) - \beta(A)$$ [29]

For $K \in K(X)$, we have shown that i(I − K) = 0 (Theorem 17).

Theorem 18 Let X, Y be normed vector spaces, and assume that K is in K(X, Y). Then $K'$ is in $K(Y', X')$.

Let X be a Banach space, and suppose $K \in K(X)$. If $\lambda$ is a nonzero scalar, then

$$\lambda I - K = \lambda(I - \lambda^{-1}K) \in \Phi(X)$$ [30]

For an arbitrary operator $A \in B(X)$, the set of all scalars $\lambda$ for which $\lambda I - A \in \Phi(X)$ is called the $\Phi$-set of A and is denoted by $\Phi_A$. Thus, [30] gives

otherwise specified, X, Y, Z, and W will denote Banach spaces in this article.

Let X, Y be normed vector spaces, and let A be a linear operator from X to Y. We now officially lift our restriction that D(A) = X. However, if $A \in B(X, Y)$, it is still to be assumed that D(A) = X. The operator A is called closed if whenever $\{x_n\} \subset D(A)$ is a sequence satisfying

$$x_n \to x \text{ in } X; \quad Ax_n \to y \text{ in } Y$$ [31]

then $x \in D(A)$ and Ax = y. Clearly, all operators in B(X, Y) are closed.

To define $A'$ for an unbounded operator, we follow the definition for bounded operators, and exercise a bit of care. We want

$$A'y'(x) = y'(Ax), \quad x \in D(A)$$ [32]

Thus, we say that $y' \in D(A')$ if there is an $x' \in X'$ such that

$$x'(x) = y'(Ax), \quad x \in D(A)$$ [33]

Then we define $A'y'$ to be $x'$. In order that this definition make sense, we need $x'$ to be unique, that is, that $x'(x) = 0$ for all $x \in D(A)$ should imply that $x' = 0$. This is true if and only if D(A) is dense in X. To summarize, we can define $A'$ for any linear operator from X to Y provided D(A) is dense in X. We take $D(A')$ to be the set of those $y' \in Y'$ for which there is an $x' \in X'$ satisfying [33]. This $x'$ is unique, and we set $A'y' = x'$. Note that if
Otherwise, $\lambda \in \sigma(A)$. As before, $\rho(A)$ and $\sigma(A)$ are called the resolvent set and spectrum of A, respectively. To show the relationship of this definition to the one given before, we note the following.

Lemma 3 If X is a Banach space and A is closed, then $\lambda \in \rho(A)$ if and only if

$$\alpha(A - \lambda) = 0, \quad R(A - \lambda) = X$$ [35]

Throughout the remainder of this section, we shall assume that X is a Banach space, and that A is a densely defined, closed linear operator on X. We ask the following question: what points of $\sigma(A)$ can be removed from the spectrum by the addition of a compact operator to A? The answer to this question is closely related to the set $\Phi_A$. We define this to be the set of all scalars $\lambda$ such that $A - \lambda \in \Phi(X)$. We have

Theorem 21 The set $\Phi_A$ is open, and $i(A - \lambda)$ is constant on each of its components.

We call $\sigma_e(A)$ the essential spectrum of A (there are other definitions). It consists of those points of $\sigma(A)$ which cannot be removed from the spectrum by the addition of a compact operator to A. We now characterize $\sigma_e(A)$.

Theorem 23 $\lambda \notin \sigma_e(A)$ if and only if $\lambda \in \Phi_A$ and $i(A - \lambda) = 0$.

Normal Operators

A sequence of elements $\{\varphi_n\}$ in a Hilbert space is called orthonormal if

$$(\varphi_m, \varphi_n) = \begin{cases} 0, & m \ne n \\ 1, & m = n \end{cases}$$ [36]

(for definitions, see, e.g., Schechter (2002) or the appendix at the end of this article).

Let $\{\varphi_n\}$ be an orthonormal sequence (finite or infinite) in a Hilbert space H. Let $\{\lambda_k\}$ be a sequence (of the same length) of scalars satisfying

$$|\lambda_k| \le C$$

Then for each element $f \in H$, the series

$$\sum \lambda_k (f, \varphi_k)\varphi_k$$

converges in H. Define the operator A on H by

$$Af = \sum \lambda_k (f, \varphi_k)\varphi_k$$ [37]

Clearly, A is a linear operator. It is also bounded, since

$$\|Af\|^2 = \sum |\lambda_k|^2 |(f, \varphi_k)|^2 \le C^2 \|f\|^2$$ [38]

by Bessel's inequality

$$\sum_{k=1}^{\infty} |(f, \varphi_k)|^2 \le \|f\|^2$$ [39]

For convenience, let us assume that each $\lambda_k \ne 0$ (just remove those $\varphi_k$ corresponding to the $\lambda_k$ that vanish). In this case, N(A) consists of precisely those $f \in H$ which are orthogonal to all of the $\varphi_k$. Clearly, such f are in N(A). Conversely, if $f \in N(A)$, then

$$0 = (Af, \varphi_k) = \lambda_k (f, \varphi_k)$$

for any $f \in H$. Any solution of [40] satisfies

$$\lambda u = f + Au = f + \sum \lambda_k (u, \varphi_k)\varphi_k$$ [41]

Hence,

$$\lambda (u, \varphi_k) = (f, \varphi_k) + \lambda_k (u, \varphi_k)$$

or

$$(u, \varphi_k) = \frac{(f, \varphi_k)}{\lambda - \lambda_k}$$ [42]

Substituting back in [41], we obtain

$$\lambda u = f + \sum \frac{\lambda_k (f, \varphi_k)\varphi_k}{\lambda - \lambda_k}$$ [43]

Since $\lambda$ is not a limit point of the $\lambda_k$, there is a $\delta > 0$ such that

$$|\lambda - \lambda_k| \ge \delta, \quad k = 1, 2, \ldots$$

Hence, the series in [43] converges for each $f \in H$. It is an easy exercise to verify that [43] is indeed a solution of [40]. To see that $(\lambda - A)^{-1}$ is bounded, note that

$$|\lambda|\,\|u\| \le \|f\| + C\|f\|/\delta$$ [44]

(cf. [38]). Thus, we have proved
These theorems are known as the spectral theorems for self-adjoint operators.

Appendix

Here we include some background material related to the text.

Consider a collection C of elements or "vectors" with the following properties:
1. They can be added. If f and g are in C, so is f + g.
2. f + (g + h) = (f + g) + h, $f, g, h \in C$.
3. There is an element $0 \in C$ such that h + 0 = h for all $h \in C$.
4. For each $h \in C$ there is an element $-h \in C$ such that h + (−h) = 0.
5. g + h = h + g, $g, h \in C$.
6. For each real number $\alpha$, $\alpha h \in C$.
7. $\alpha(g + h) = \alpha g + \alpha h$.
8. $(\alpha + \beta)h = \alpha h + \beta h$.
9. $\alpha(\beta h) = (\alpha\beta)h$.
10. To each $h \in C$ there corresponds a real number $\|h\|$ with the following properties:
11. $\|\alpha h\| = |\alpha|\,\|h\|$.
12. $\|h\| = 0$ if, and only if, h = 0.
13. $\|g + h\| \le \|g\| + \|h\|$.
14. If $\{h_n\}$ is a sequence of elements of C such that $\|h_n - h_m\| \to 0$ as $m, n \to \infty$, then there is an element $h \in C$ such that $\|h_n - h\| \to 0$ as $n \to \infty$.

A collection of objects which satisfies statements (1)–(9) and the additional statement
15. 1h = h
is called a vector space or linear space.

A set of objects satisfying statements (1)–(13) is called a normed vector space, and the number $\|h\|$ is called the norm of h. Although statement (15) is not implied by statements (1)–(9), it is implied by statements (1)–(13). A sequence satisfying

$$h_n \to h \quad \text{as } n \to \infty$$

when we mean

$$\|h_n - h\| \to 0 \quad \text{as } n \to \infty$$

A subset U of a vector space V is called a subspace of V if $\alpha_1 x_1 + \alpha_2 x_2$ is in U whenever $x_1, x_2$ are in U and $\alpha_1, \alpha_2$ are scalars.

A subset U of a normed vector space X is called closed if for every sequence $\{x_n\}$ of elements in U having a limit in X, the limit is actually in U.

Consider a vector space X having a mapping (f, g) from pairs of its elements to the reals such that
1. $(\alpha f, g) = \alpha(f, g)$
2. (f + g, h) = (f, h) + (g, h)
3. (f, g) = (g, f)
4. (f, f) > 0 unless f = 0.

Then

$$(f, g)^2 \le (f, f)(g, g), \quad f, g \in X$$ [61]

An expression (f, g) that assigns a real number to each pair of elements of a vector space and satisfies the aforementioned properties is called a scalar (or inner) product.

If a vector space X has a scalar product (f, g), then it is a normed vector space with norm $\|f\| = (f, f)^{1/2}$. A vector space which has a scalar product and is complete with respect to the induced norm is called a Hilbert space. Every Hilbert space is a Banach space, but the converse is not true. Inequality [61] is known as the Cauchy–Schwarz inequality. $R^n$ is a Hilbert space.

Let H be a Hilbert space and let (x, y) denote its scalar product. If we fix y, then the expression (x, y) assigns to each $x \in H$ a number. An assignment F of a number to each element x of a vector space is called a functional and denoted by F(x). The scalar product is not the first functional we have encountered. In any normed vector space, the norm is also a functional. The functional F(x) = (x, y) satisfies

$$F(\alpha_1 x_1 + \alpha_2 x_2) = \alpha_1 F(x_1) + \alpha_2 F(x_2)$$ [62]

$$\|F\| = \sup_{x \in H,\; x \ne 0} \frac{|F(x)|}{\|x\|}$$

Thus for y fixed, F(x) = (x, y) is a bounded linear functional in the Hilbert space H. We have
Theorem 30 For every bounded linear functional F on a Hilbert space H there is a unique element $y \in H$ such that

$$F(x) = (x, y) \quad \text{for all } x \in H$$ [64]

Moreover,

$$\|y\| = \sup_{x \in H,\; x \ne 0} \frac{|F(x)|}{\|x\|} = \|F\|$$ [65]

Theorem 30 is known as the "Riesz representation theorem."

For any normed vector space X, let $X'$ denote the set of bounded linear functionals on X. If $f, g \in X'$, we say that f = g if

$$f(x) = g(x) \quad \text{for all } x \in X$$

The "zero" functional is the one assigning zero to all $x \in X$. We define h = f + g by

$$h(x) = f(x) + g(x), \quad x \in X$$

and $g = \alpha f$ by

$$g(x) = \alpha f(x), \quad x \in X$$

Under these definitions, $X'$ becomes a vector space. We have been employing the expression

$$\|f\| = \sup_{x \ne 0} \frac{|f(x)|}{\|x\|}, \quad f \in X'$$ [66]

This is easily seen to be a norm. Thus $X'$ is a normed vector space.

We also have

Theorem 31 Let M be a subspace of a normed vector space X, and suppose that f(x) is a bounded linear functional on M. Set

$$\|f\| = \sup_{x \in M,\; x \ne 0} \frac{|f(x)|}{\|x\|}$$

Then there is a bounded linear functional F(x) on the whole of X such that

$$F(x) = f(x), \quad x \in M$$ [67]

and

$$\|F\| = \sup_{x \in X,\; x \ne 0} \frac{|F(x)|}{\|x\|} = \|f\| = \sup_{x \in M,\; x \ne 0} \frac{|f(x)|}{\|x\|}$$ [68]

Theorem 31 is known as the "Hahn–Banach theorem."

If A is a linear operator from X to Y, with R(A) = Y and N(A) = {0} (i.e., consists only of the vector 0), we can assign to each $y \in Y$ the unique solution of

$$Ax = y$$

This assignment is an operator from Y to X and is usually denoted by $A^{-1}$ and called the inverse operator of A. It is linear because of the linearity of A. One can ask: "when is $A^{-1}$ continuous?" or, equivalently, "when is it bounded?" A very important answer to this question is given by

Theorem 32 If X, Y are Banach spaces and A is a closed linear operator from X to Y with R(A) = Y, N(A) = {0}, then $A^{-1} \in B(Y, X)$.

This theorem is sometimes referred to as the "bounded inverse theorem."

If A is self-adjoint and

$$(A - \lambda)x = 0, \quad (A - \mu)y = 0$$

with $\lambda \ne \mu$, then

$$(x, y) = 0$$

If A has a compact inverse, its eigenvalues cannot have limit points. If $A^{-1}$ is compact, then the eigenelements corresponding to the same eigenvalue form a finite-dimensional subspace.

See also: Ljusternik–Schnirelman Theory; Quantum Mechanical Scattering Theory; Regularization for Dynamical Zeta Functions; Spectral Sequences; Stochastic Resonance.

Further Reading

Bachman G and Narici L (1966) Functional Analysis. New York: Academic Press.
Banach S (1955) Théorie des Opérations Linéaires. New York: Chelsea.
Berberian SK (1974) Lectures in Functional Analysis and Operator Theory. New York: Springer.
Brown A and Pearcy C (1977) Introduction to Operator Theory. New York: Springer.
Day MM (1958) Normed Linear Spaces. Berlin: Springer.
Dunford N and Schwartz JT (1958, 1963) Linear Operators, I, II. New York: Wiley.
Edwards RE (1965) Functional Analysis. New York: Holt.
Epstein B (1970) Linear Functional Analysis. Philadelphia: W. B. Saunders.
Gohberg IC and Krein MS (1960) The basic propositions on defect numbers, root numbers and indices of linear operators. Amer. Math. Soc. Transl, Ser. 2, 13, 185–264.
Goldberg S (1966) Unbounded Linear Operators. New York: McGraw-Hill.
Halmos PR (1951) Introduction to Hilbert Space. New York: Chelsea.
Hille E and Phillips R (1957) Functional Analysis and Semi-Groups. Providence: American Mathematical Society.
Kato T (1966, 1976) Perturbation Theory for Linear Operators. Berlin: Springer.
Spin Foams 645
Müller V (2003) Spectral Theory of Linear Operators and Spectral Systems in Banach Algebras. Basel: Birkhäuser Verlag.
Reed M and Simon B (1972) Methods of Modern Mathematical Physics, I. Academic Press.
Riesz F and Sz.-Nagy B (1955) Functional Analysis. New York: Ungar.
Schechter M (2002) Principles of Functional Analysis. Providence: American Mathematical Society.
Stone MH (1932) Linear Transformations in Hilbert Space. Providence: American Mathematical Society.
Taylor AE (1958) Introduction to Functional Analysis. New York: Wiley.
Weidmann J (1980) Linear Operators in Hilbert Space. New York: Springer.
Yosida K (1965, 1971) Functional Analysis. Berlin: Springer.
Spin Foams
A Perez, Penn State University, University Park, PA, USA

© 2006 Elsevier Ltd. All rights reserved.

Introduction

In loop quantum gravity (LQG) (see Loop Quantum Gravity), a background independent formulation of quantum gravity, the full quantum dynamics is governed by the following (constraint) operator equations or quantum Einstein equations:

Gauss Law

$$\hat G_i(A, E)|\Psi\rangle := \widehat{D_a E^a_i}\,|\Psi\rangle = 0$$

Vector constraint

$$\hat V_a(A, E)|\Psi\rangle := \widehat{E^b_i F^i_{ab}(A)}\,|\Psi\rangle = 0$$

Scalar constraint

$$\hat S(A, E)|\Psi\rangle := \widehat{\left[\sqrt{\det E}^{\,-1} E^a_i E^b_j F^{ij}_{ab}(A) + \cdots\right]}\,|\Psi\rangle = 0$$ [1]

where $A^i_a$ is an SU(2) connection (i = 1, 2, 3, a = 1, 2, 3), $E^a_i$ is its conjugate momentum (the triad field), $F^{ij}_{ab}(A)$ is the curvature of $A^i_a$, and $D_a$ is the covariant derivative (see Canonical General Relativity). The hat means that the classical phase-space functions are promoted to operators in a kinematical Hilbert space $H_{kin}$; the solutions are in the so-called physical Hilbert space $H_{phys}$. The goal of the spin foam approach is to construct a mathematically well-defined notion of path integral for LQG as a device for computing the solutions of the previous equations.

The space of solutions of the Gauss and vector constraints [1] is well understood in LQG (see Loop Quantum Gravity), and often also called kinematical Hilbert space $H_{kin}$. The solutions of the scalar constraint can be characterized by the definition of the generalized projection operator P from the kinematical Hilbert space $H_{kin}$ into the kernel of the scalar constraint $H_{phys}$. Formally, one can write P as

$$P = \text{``}\prod_{x \in \Sigma} \delta(\hat S(x))\text{''} = \int D[N] \exp\left(i \int_\Sigma \widehat{N(x) S(x)}\right)$$ [2]

A formal argument shows that P can also be defined in a manifestly covariant manner as a regularization of the formal path integral of general relativity. In first-order variables, it becomes

$$P = \int D[e]\, D[A]\, \mu[A, e] \exp[i S_{GR}(e, A)]$$ [3]

where e is the tetrad field, A is the spacetime connection, and $\mu[A, e]$ denotes the appropriate measure.

In both cases, P characterizes the space of solutions of quantum Einstein equations as for any arbitrary state $|\phi\rangle \in H_{kin}$ then $P|\phi\rangle$ is a (formal) solution of [1]. Moreover, the matrix elements of P define the physical inner product ($\langle\,,\rangle_p$) providing the vector space of solutions of [1] with the Hilbert space structure that defines $H_{phys}$. Explicitly,

$$\langle s, s' \rangle_p := \langle Ps, s' \rangle$$

for $s, s' \in H_{kin}$.

When these matrix elements are computed in the spin network basis (see Figure 1) (see Loop Quantum Gravity), they can be expressed as a sum over amplitudes of "spin network histories": spin foams (Figure 2). The latter are naturally given by foam-like combinatorial structures whose basic elements carry quantum numbers of geometry (see Loop Quantum Gravity). A spin foam history, from the state $|s\rangle$ to the state $|s'\rangle$, is denoted by a pair $(F_{s \to s'}, \{j\})$, where $F_{s \to s'}$ is the 2-complex with boundary given by the graphs of the spin network states $|s'\rangle$ and $|s\rangle$, respectively, and $\{j\}$ is the set of spin quantum numbers labeling its edges (denoted $e \in F_{s \to s'}$) and faces (denoted $f \in F_{s \to s'}$). Vertices are denoted
j
Local symmetries of the theory are generated by the where is the unitary irreducible representation matrix
first-class constraints of spin j (for a precise definition, see Loop Quantum
Gravity). For simplicity, we will often denote spin
Db E bj ¼ 0; F iab ðAÞ ¼ 0 ½9 network states js > omitting the graph and spin labels.
which are referred to as the Gauss law and the
curvature constraint, respectively – the quantization Spin Foams from the Hamiltonian Formulation
of these is the analog of [1] in 4D. This simple
theory has been quantized in various ways in the The physical Hilbert space, Hphys , is defined by
literature; here we will use it to introduce the spin those ‘‘states’’ that are annihilated by the con-
foam quantization. straints. By construction, spin-network states solve
the Gauss constraint – Dd a
a Ei js > = 0 – as they
are manifestly SU(2) gauge invariant (see Loop
Kinematical Hilbert Space Quantum Gravity). To complete the quantization,
In analogy with the 4D case, one follows Dirac’s one needs to characterize the space of solutions of
the quantum curvature constraints (F bi ), and to
procedure finding first a representation of the basic ab
variables in an auxiliary or kinematical Hilbert provide it with the physical inner product. The
space Hkin . The basic states are functionals of the existence of Hphys is granted by the following:
connection depending on the parallel transport Theorem 1 There exists a normalized positive
along paths : the so-called holonomy. Given linear form P over Cyl, that is, P( ) 0 for 2
a connection Aia (x) and a path , one defines the Cyl and P(1) = 1, yielding (through the GNS
holonomy h [A] as the path-ordered exponential construction (see Algebraic Approach to Quantum
Z Field Theory)) the physical Hilbert space Hphys and
h ½A ¼ P exp A ½10 the physical representation
p of Cyl.
The state P contains a very large Gelfand ideal (set
The kinematical Hilbert space, Hkin , corresponds
of zero norm states) J := { 2 Cyl s.t. P( ) = 0}. In
to the Ashtekar–Lewandowski (AL) representation
fact, the physical Hilbert space Hphys := Cyl=J corre-
of the algebra of functions of holonomies or
sponds to the quantization of finitely many degrees of
generalized connections. This algebra is in fact a
freedom. This is expected in 3D gravity as the theory
C -algebra and is denoted Cyl (see Loop Quantum
does not have local excitations (no ‘‘gravitons’’) (see
Gravity). Functionals of the connection act in the
Topological Quantum Field Theory: Overview). The
AL representation simply by multiplication. For
representation
p of Cyl solves the curvature con-
example, the holonomy operator acts as follows:
straint in the sense that for any functional f [A] 2 Cyl
hd
½A½A ¼ h ½A½A ½11 defined on the subalgebra of functionals defined on
contractible graphs 2 , one has that
As in 4D, an orthonormal basis of Hkin is defined
p ½f ¼ f ½0 ½13
by the spin network states. Each spin network is
labeled by a graph , a set of spins {j‘ } labeling b = 0’’ in Hphys
This equation expresses the fact that ‘‘F
links ‘ 2 , and a set of intertwiners { n } labeling (for flat connections, parallel transport is trivial
nodes n 2 (Figure 3), namely: around a contractible region). For s, s0 2 Hkin , the
physical inner product is given by
O j‘
OY
s;fj‘ g;f n g ½A ¼ n ðh‘ ½AÞ ½12 <s; s0>p :¼ Pðs sÞ ½14
where the $*$-operation and the product are defined in Cyl.

The previous equation admits a "sum over histories" representation. We shall introduce the concept of the spin foam representation as an explicit construction of the positive linear form P which, as in [2], is formally given by

$$P = \int D[N] \exp\left(i \int_\Sigma \mathrm{tr}[N \hat F(A)]\right) = \prod_{x \in \Sigma} \delta[\hat F(A)]$$ [15]

Figure 3 A spin network state in 2 + 1 LQG. The decomposition of a 4-valent node in terms of basic 3-valent intertwiners is shown.
k
tr[∏(Wp)] p k = Σ N k
Δ
j = j m j,m,k j
m
Figure 5 Graphical notation representing the action of one plaquette holonomy on a spin network state. On the right is the result
written in terms of the spin network basis. The amplitude Nj,m,k can be expressed in terms of Clebsch–Gordan coefficients.
j k j k j k
n
p n 1 j k m
k
= = Σ
o,p Δn Δj Δk Δm
p o
tr[∏(Wp)] no p
Δ
m m m
Figure 6 Graphical notation representing the action of one plaquette holonomy on a spin network vertex. The object in brackets (fg)
is a 6j-symbol and j := 2j þ 1.
slice, one arrives at a sum over spin network histories representation of P(s). More precisely, P(s) can be expressed as a sum over amplitudes corresponding to a series of transitions that can be viewed as the ‘‘time evolution’’ between the ‘‘initial’’ spin network s and the ‘‘final’’ ‘‘vacuum state’’. The physical inner product between spin networks s and s′ is defined as

    \langle s, s' \rangle_p := P(s^* s')

and can be expressed as a sum over amplitudes corresponding to transitions interpolating between the ‘‘initial’’ spin network s′ and the ‘‘final’’ spin network s (e.g., Figures 7 and 8).
Figure 7 A set of discrete transitions in the loop-to-loop physical inner product obtained by a series of transitions as in Figure 5. On the right, the continuous spin foam representation in the limit where the regulator goes to 0.

Figure 8 A set of discrete transitions representing one of the contributing histories at a fixed value of the regulator. On the right, the continuous spin foam representation when the regulator is removed.
    = \sum_{\{j\}} \prod_{f \in F_{s \to s'}} (2j_f + 1)^{\nu_f/2} \prod_{v \in F_{s \to s'}} [tetrahedral diagram with edges labeled j_1, ..., j_6]    [22]

where the notation is that of [4], and ν_f = 0 if f ∩ s ≠ 0 ∧ f ∩ s′ ≠ 0, ν_f = 1 if f ∩ s ≠ 0 ∨ f ∩ s′ ≠ 0, and ν_f = 2 if f ∩ s = 0 ∧ f ∩ s′ = 0. The tetrahedral diagram denotes a 6j-symbol: the amplitude obtained by means of the natural contraction of the four intertwiners corresponding to the 1-cells converging at a vertex. More generally, for arbitrary spin networks, the vertex amplitude corresponds to 3nj-symbols, and ⟨s, s′⟩_p takes the general form [4].

Spin Foams from the Covariant Path Integral

In this section we re-derive the spin foam representation of the physical scalar product of 2 + 1 (Riemannian) quantum gravity directly as a regularization of the covariant path integral. The formal path integral for 3D gravity can be written as

    P = \int D[e] D[A] \exp\left( i \int_M \mathrm{tr}[e \wedge F(A)] \right)    [23]

(dual to 1-cells in Δ). The intersection of the dual 2-complex with the boundaries defines two graphs γ_1, γ_2 ∈ Σ (see Figure 9). For simplicity, we ignore the boundaries until the end of this section. The fields e and A are discretized as follows. The su(2)-valued 1-form field e is represented by the assignment of e_f ∈ su(2) to each 1-cell in Δ. We use the fact that faces in the dual 2-complex are in one-to-one correspondence with 1-cells in Δ and label e_f with a face subindex (Figure 9). The connection field A is represented by the assignment of group elements g_e ∈ SU(2) to each edge e of the dual 2-complex (see Figure 10). With all this, [23] becomes the regularized version P_Δ defined as

    P_\Delta = \int \prod_f de_f \prod_e dg_e \exp\left( i \, \mathrm{tr}[e_f W_f] \right)    [24]

where de_f is the regular Lebesgue measure on R^3, dg_e is the Haar measure on SU(2), and W_f denotes the holonomy around (spacetime) faces, that is, W_f = g_e^1 ... g_e^N for N being the number of edges bounding the corresponding face (see Figure 10). The discretization procedure is reminiscent of the one used in standard lattice gauge theory (see Lattice Gauge Theory). The previous definition can be motivated by an analysis equivalent to the one presented in [16].

Integrating over e_f, and using [19], one obtains

    P_\Delta = \sum_{\{j\}} \int \prod_e dg_e \prod_f (2j_f + 1) \, \mathrm{tr}\left[ \Pi^{j_f}(g_e^1 \cdots g_e^N) \right]    [25]

Now it remains to integrate over the lattice connection {g_e}. If an edge e bounds n faces f, there will be n traces of the form tr[Π^{j_f}(··· g_e ···)] in [25] containing g_e in the argument. In order to integrate over g_e we can use the following identity:

    I^n_{inv} := \int dg \, \Pi^{j_1}(g) \otimes \Pi^{j_2}(g) \otimes \cdots \otimes \Pi^{j_n}(g) = \sum_{\iota} C^{\iota}_{j_1 j_2 \cdots j_n} C^{*\iota}_{j_1 j_2 \cdots j_n}    [26]

where I^n_inv is the projector from the tensor product of irreducible representations H_{j_1 ··· j_n} = j_1 ⊗ j_2 ⊗ ··· ⊗ j_n onto the invariant component H^0_{j_1 ··· j_n} = Inv[j_1 ⊗ j_2 ⊗ ··· ⊗ j_n]. On the right-hand side, we have chosen an orthonormal basis of invariant vectors (intertwiners) in H_{j_1 ··· j_n} to express the projector. Notice that the assignment of intertwiners to edges is a consequence of the integration over the connection. Using [26] one can write P_Δ in the general spin foam representation form [4]

    P_\Delta = \sum_{\{f\}} \prod_f (2j_f + 1) \prod_v A_v(j_v)    [27]

states on γ_1 and γ_2, respectively. A careful analysis of the boundary contribution shows that only the face amplitude is modified to (Δ_{j_ℓ})^{ν_f/2}, and that the spin foam amplitudes are as in eqn [22].

A crucial property of the path integral in 3D gravity (and of the transition amplitudes in general) is that it does not depend on the discretization Δ – this is due to the absence of local degrees of freedom in 3D gravity and not expected to hold in 4D. Given two different cellular decompositions Δ and Δ′, one has

    \tau^{-n_0} P_\Delta = \tau^{-n_0'} P_{\Delta'}    [29]

where n_0 is the number of 0-simplexes in Δ, and τ = Σ_j (2j + 1)^2. As τ is given by a divergent sum, the discretization independence statement is formal. Moreover, the sum over spins in [28] is typically divergent. Divergences occur due to infinite gauge-volume factors in the path integral corresponding to the topological gauge freedom [7]. Freidel and Louapre have shown how these divergences can be avoided by gauge-fixing unphysical degrees of freedom in [24]. In the case of 3D gravity with positive cosmological constant, the state sum generalizes to the Turaev–Viro invariant (see Topological Quantum Field Theory: Overview) defined in terms of the quantum group SU_q(2) with q^n = 1 where the representations are finitely many and thus τ < ∞. Equation [29] is a rigorous statement in that case. No such infrared divergences appear in the canonical treatment of the previous section.
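The projector in [26] has rank equal to the number of independent intertwiners, that is, the dimension of Inv[j_1 ⊗ j_2 ⊗ ··· ⊗ j_n]. The sketch below counts that dimension for SU(2) by repeated Clebsch–Gordan decomposition. It is an illustration under stated assumptions rather than part of the article: the function name `invariant_dimension` is hypothetical, and spins are passed as twice their value so that half-integers become integers.

```python
from collections import Counter


def invariant_dimension(twice_spins):
    """Dimension of Inv[j_1 x ... x j_n] for SU(2), spins given as 2*j."""
    # multiplicities[tJ] = how many times spin J = tJ/2 occurs so far.
    multiplicities = Counter({0: 1})
    for tj in twice_spins:
        updated = Counter()
        for tJ, count in multiplicities.items():
            # J x j decomposes into |J - j|, |J - j| + 1, ..., J + j.
            for tk in range(abs(tJ - tj), tJ + tj + 1, 2):
                updated[tk] += count
        multiplicities = updated
    # The invariant component is the spin-0 part of the decomposition.
    return multiplicities[0]
```

For four spin-1/2 representations, `invariant_dimension([1, 1, 1, 1])` counts the two independent 4-valent intertwiners; an odd number of spin-1/2 factors admits none.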
Figure 11 The action of the scalar constraint and its spin foam representation. N(x_n) is the value of N at the node and S_{nop} are the matrix elements of Ŝ.
where ⟨ , ⟩_diff denotes the inner product in the Hilbert space of solutions of the vector constraint, and the exponential has been expanded in powers in the last expression on the right-hand side.

From early on, it was realized that smooth loop states are naturally annihilated by Ŝ (independently of any quantization ambiguity). Consequently, Ŝ acts only on spin network nodes. Generically, it does so by creating new links and nodes modifying the underlying graph of the spin network states (Figure 11).

Therefore, each term in the sum [30] represents a series of transitions – given by the local action of Ŝ at spin network nodes – through different spin network states interpolating the boundary states s and s′, respectively. The action of Ŝ can be visualized as an ‘‘interaction vertex’’ in the ‘‘time’’ evolution of the node (Figure 11). As in the explicit 3D case, eqn [30] can be expressed as sum over ‘‘histories’’ of spin networks pictured as a system of branching surfaces described by a 2-complex whose elements inherit the representation labels on the intermediate states. The value of the ‘‘transition’’ amplitudes is controlled by the matrix elements of Ŝ. Therefore, although the qualitative picture is independent of quantization ambiguities, transition amplitudes are sensitive to them.

Before even considering the issue of convergence of [30], the problem with this definition is evident: every single term in the sum is a divergent integral! Therefore, this way of presenting spin foams has to be considered as formal until a well-defined regularization of [2] is provided. That is the goal of the spin foam approach.

Instead of dealing with an infinite number of constraints Thiemann recently proposed to impose one single master constraint defined as

    M = \int dx^3 \, \frac{S^2(x) - q^{ab} V_a(x) V_b(x)}{\sqrt{\det q(x)}}    [31]

Using techniques developed by Thiemann, this constraint can indeed be promoted to a quantum operator M̂ acting on H_kin. The physical inner product is given by

    \langle s, s' \rangle_p := \lim_{T \to \infty} \left\langle s, \int_{-T}^{T} dt \, e^{it\hat{M}} s' \right\rangle    [32]

A spin foam representation of the previous expression could now be achieved by the standard skeletonization that leads to the path-integral representation in quantum mechanics. In this context, one splits the t-parameter in discrete steps and writes

    e^{it\hat{M}} = \lim_{N \to \infty} [e^{it\hat{M}/N}]^N = \lim_{N \to \infty} [1 + it\hat{M}/N]^N    [33]

The spin foam representation follows from the fact that the action of the basic operator 1 + itM̂/N on a spin network can be written as a linear combination of new spin networks whose graphs and labels have been modified by the creation of new nodes (in a way qualitatively analogous to the local action shown in Figure 11). An explicit derivation of the physical inner product of 4D LQG along these lines is under current investigation.

Spin Foams from the Covariant Formulation

In 4D, the spin foam representation of the dynamics of LQG has been investigated more intensively in the covariant formulation. This has led to a series of constructions which are referred to as spin foam models. These treatments are related more closely to the construction based on the covariant path-integral approach of the last section. Here we illustrate the formulation which has captured much interest in the literature: the Barrett–Crane (BC) model.

Spin foam models for gravity as constrained quantum BF theory

The BC model is one of the most extensively studied spin foam models for quantum gravity. To introduce the main ideas involved, we concentrate on the definition of the model in the Riemannian sector. The BC model can be formally
viewed as a spin foam quantization of SO(4) Plebanski’s formulation of general relativity. Plebanski’s Riemannian action depends on an SO(4) connection A, a Lie-algebra-valued 2-form B, and Lagrange multiplier fields λ and μ. Writing explicitly the Lie algebra indices, the action is given by

    S[B, A, \lambda, \mu] = \int \left[ B^{IJ} \wedge F_{IJ}(A) + \lambda_{IJKL} B^{IJ} \wedge B^{KL} + \mu \, \epsilon^{IJKL} \lambda_{IJKL} \right]    [34]

where μ is a 4-form and λ_{IJKL} = −λ_{JIKL} = −λ_{IJLK} = λ_{KLIJ} is a tensor in the internal space. Variation with respect to μ imposes the constraint ε^{IJKL} λ_{IJKL} = 0 on λ_{IJKL}. The Lagrange multiplier tensor λ_{IJKL} has then 20 independent components. Variation with respect to λ imposes 20 algebraic equations on the 36 components of B. The (non-degenerate) solutions to the equations obtained by varying the multipliers λ and μ are

    B^{IJ} = \pm \epsilon^{IJKL} e_K \wedge e_L    and    B^{IJ} = \pm e^I \wedge e^J    [35]

in terms of the 16 remaining degrees of freedom of the tetrad field e^I_a. If one substitutes the first solution into the original action, one obtains Palatini’s formulation of general relativity; therefore, on shell (and on the right sector), the action is that of classical gravity.

The key idea in the definition of the model is that the path integral for the theory corresponding to the action S[B, A, 0, 0], namely

    P_{topo} = \int D[B] D[A] \exp\left( i \int B^{IJ} \wedge F_{IJ}(A) \right)    [36]

can be given a meaning as a spin foam sum, [4], in terms of a simple generalization of the construction of the previous section. In fact, S[B, A, 0, 0] corresponds to a simple theory known as BF theory that is formally very similar to 3D gravity (see BF Theories). The result is independent of the chosen discretization because BF theory does not have local degrees of freedom (just as 3D gravity).

The BC model aims at providing a definition of the path integral of gravity pursuing a well-posed definition of the formal expression

    P_{GR} = \int D[B] D[A] \, \delta[B \to \epsilon^{IJKL} e_K \wedge e_L] \exp\left( i \int B^{IJ} \wedge F_{IJ}(A) \right)    [37]

Figure 12 The dual of a 4-simplex.

where D[B]D[A] δ(B → ε^{IJKL} e_K ∧ e_L) means that one must restrict the sum in [36] to those configurations of the topological theory satisfying the constraints B = *(e ∧ e) for some tetrad e. The remarkable fact is that this restriction can be implemented in a systematic way directly on the spin foam configurations that define P_topo.

In P_topo spin foams are labeled with spins corresponding to the unitary irreducible representations of SO(4) (given by two spin quantum numbers (j_R, j_L)). Essentially, the factor ‘‘δ(B → ε^{IJKL} e_K ∧ e_L)’’ restricts the set of spin foam quantum numbers to the so-called simple representations (for which j_R = j_L ≡ j). This is the ‘‘quantum’’ version of the solution to the constraints [35]. There are various versions of this model. The simplest definition of the transition amplitudes in the BC model is given by

    P(s^* s') = \sum_{\{j\}} \prod_{f \in F_{s \to s'}} (2j_f + 1)^{\nu_f} \prod_{v \in F_{s \to s'}} \sum_{\iota_1 \cdots \iota_5} [15j diagram with spins j_{12}, ..., j_{45} and intertwiners \iota_1, ..., \iota_5] [the same diagram with starred labels j^*_{12}, ..., j^*_{45} and \iota^*_1, ..., \iota^*_5]    [38]

where we use the notation of [22], the graphs denote 15j-symbols, and ι_i are half-integers labeling SU(2) normalized 4-intertwiners. No rigorous connection with the Hilbert space picture of LQG has yet been established. The self-dual version of Plebanski’s action leads, through a similar construction, to Reisenberger’s model.

The simplest amplitude in the BC model corresponds to a single 4-simplex, which can be viewed as the simplest triangulation of the 4D spacetime given by the interior of a 3-sphere (the corresponding 2-complex is shown in Figure 12). States of the 4-simplex are labeled by ten spins j (labeling the ten edges of the boundary spin network, see Figure 12) which can be shown to be related to the area in
Planck units of the ten triangular faces that form the 4-simplex. A first indication of the connection of the model with gravity was that the large-j asymptotics appeared to be dominated by the exponential of the Regge action (the action derived by Regge as a discretization of general relativity). This estimate was done using the stationary-phase approximation to the integral that gives the amplitude of a 4-simplex in the BC model. However, more detailed calculations showed that the amplitude is dominated by configurations corresponding to degenerate 4-simplexes. This seems to invalidate a simple connection to general relativity and is one of the main puzzles in the model.

Spin Foams as Feynman Diagrams

The main problem with the models of the previous section is that they are defined on a discretization of M and that – contrary to what happens with a topological theory, for example, 3D gravity (eqn [29]) – the amplitudes depend on the discretization. Various possibilities to eliminate this regulator have been discussed in the literature but no explicit results are yet known in 4D. An interesting proposal is a discretization-independent definition of spin foam models achieved by the introduction of an auxiliary field theory living on an abstract group manifold – Spin(4)^4 and SL(2, C)^4 for Riemannian and Lorentzian gravity, respectively. The action of the auxiliary group field theory (GFT) takes the form

    S[\phi] = \int_{G^4} \phi^2 + \frac{\lambda}{5!} \int_{G^{10}} M^{(5)}[\phi]    [39]

where M^{(5)}[φ] is a fifth-order monomial, and G is the corresponding group. In the simplest model, M^{(5)}[φ] = φ(g_1, g_2, g_3, g_4) φ(g_4, g_5, g_6, g_7) φ(g_7, g_3, g_8, g_9) φ(g_9, g_6, g_2, g_10) φ(g_10, g_8, g_5, g_1). The field φ is required to be invariant under the (simultaneous) right action of the group on its four arguments in addition to other symmetries (not described here for simplicity). The perturbative expansion in λ of the GFT Euclidean path integral is given by

    P = \int D[\phi] \, e^{-S[\phi]} = \sum_{F_N} \frac{\lambda^N}{\mathrm{sym}[F_N]} A[F_N]    [40]

where A[F_N] corresponds to a sum of Feynman-diagram amplitudes for diagrams with N interaction vertices, and sym[F_N] denotes the standard symmetry factor. A remarkable property of this expansion is that A[F_N] can be expressed as a sum over spin foam amplitudes, that is, 2-complexes labeled by unitary irreducible representations of G. Moreover, for very simple interaction M^{(5)}[φ], the spin foam amplitudes are in one-to-one correspondence to those found in the models of the previous section (e.g., the BC model). This duality is regarded as a way of providing a fully combinatorial definition of quantum gravity where no reference to any discretization or even a manifold structure is made. Transition amplitudes between spin network states correspond to n-point functions of the field theory. These models have been inspired by generalizations of matrix models applied to BF theory.

Divergent transition amplitudes can arise by the contribution of ‘‘loop’’ diagrams as in standard quantum field theory. In spin foams, diagrams corresponding to 2D bubbles are potentially divergent because spin labels can be arbitrarily high leading to unbounded sums in [4]. Such divergences do not occur in certain field theories dual (in the sense above) to the BC model. However, little is known about the convergence of the series in λ and the physical meaning of this constant. Nevertheless, Freidel and Louapre have shown that the series can be re-summed in certain models dual to lower-dimensional theories.

Causal Spin Foams

Let us conclude by presenting a fundamentally different construction leading to spin foams. Using the kinematical setting of LQG with the assumption of the existence of a microlocal (in the sense of Planck scale) causal structure, Markopoulou and Smolin define a general class of (causal) spin foam models for gravity. The elementary transition amplitude A_{s_I → s_{I+1}} from an initial spin network s_I to another spin network s_{I+1} is defined by a set of simple combinatorial rules based on a definition of causal propagation of the information at nodes. The rules and amplitudes have to satisfy certain causal restrictions (motivated by the standard concepts in classical Lorentzian physics). These rules generate surface-like excitations of the same kind one encounters in the previous formulations. Spin foams F^N_{s_i → s_f} are labeled by the number of times, N, these elementary transitions take place. Transition amplitudes are defined as

    \langle s_i, s_f \rangle = \sum_N A(F^N_{s_i \to s_f})    [41]

which is of the generic form [4]. The models are not related to any continuum action. The only guiding principles in the construction are the restrictions imposed by causality, and the requirement of the existence of a nontrivial critical behavior that reproduces general relativity at large scales. Some indirect evidence of a possible nontrivial continuum limit has been obtained in certain versions of these models in 1 + 1 dimensions.
See also: Algebraic Approach to Quantum Field Theory; BF Theories; Canonical General Relativity; Chern–Simons Models: Rigorous Results; Lattice Gauge Theory; Loop Quantum Gravity; Quantum Dynamics in Loop Quantum Gravity; Quantum Geometry and its Applications; Topological Quantum Field Theory: Overview.

Further Reading

Ashtekar A (1991) Lectures on Nonperturbative Canonical Gravity. Singapore: World Scientific.
Ashtekar A and Lewandowski J (2004) Background independent quantum gravity: a status report.
Baez J (1998) Spin foams. Classical and Quantum Gravity 15: 1827–1858.
Baez J (2000) An introduction to spin foam models of quantum gravity and BF theory. Lecture Notes in Physics 543: 25–94.
Baez J and Muniain JP (1995) Gauge fields, Knots and Gravity. Singapore: World Scientific.
Oriti D (2001) Spacetime geometry from algebra: spin foam models for nonperturbative quantum gravity. Reports on Progress in Physics 64: 1489–1544.
Perez A (2003) Spin foam models for quantum gravity. Classical and Quantum Gravity 20: R43.
Rovelli C Quantum Gravity. Cambridge: Cambridge University Press (to appear).
Thiemann T Modern Canonical Quantum General Relativity. Cambridge: Cambridge University Press (to appear).

Spin Glasses
F Guerra, Università di Roma ‘‘La Sapienza’’, Rome, Italy
© 2006 Elsevier Ltd. All rights reserved.

Introduction

From a physical point of view, spin glasses, as dilute magnetic alloys, are very interesting systems. They are characterized by such features as exhibiting a new magnetic phase, where magnetic moments are frozen into disordered equilibrium orientations, without any long-range order. See, for example, Young (1987) for general reviews, and also Stein (1989) for a very readable account about the physical properties of spin glasses. The experimental laboratory study of spin glasses is a very difficult subject, because of their peculiar properties. In particular, the existence of very slowly relaxing modes, with consequent memory effects, makes it difficult to realize the very basic physical concept of a system at thermodynamical equilibrium, at a given temperature.

From a theoretical point of view some models have been proposed, which try to capture the essential physical features of spin glasses, in the frame of very simple assumptions.

The basic model has been proposed by Edwards and Anderson (1975) many years ago. It is a simple extension of the well-known nearest-neighbor Ising model. On a large region Λ of the unit lattice in d dimensions, we associate an Ising spin σ(n) to each lattice site n, and then we introduce a lattice Hamiltonian

    H_\Lambda(\sigma, J) = -\sum_{(n, n')} J(n, n') \sigma(n) \sigma(n')    [1]

Here, the sum runs over all couples of nearest-neighbor sites in Λ, and J are quenched random couplings, assumed for simplicity to be independent identically distributed random variables, with centered unit Gaussian distribution. The quenched character of the J means that they do not contribute to thermodynamic equilibrium, but act as a kind of random external noise on the coupling of the σ variables. In the expression of the Hamiltonian, we have indicated with σ the set of all σ(n), and with J the set of all J(n, n′). The region Λ must be taken very large, by letting it invade all lattice in the limit. The physical motivation for this choice is that for real spin glasses the interaction between the spins dissolved in the matrix of the alloy oscillates in sign according to distance. This effect is taken into account in the model through the random character of the couplings between spins.

Even though very drastic simplifications have been introduced in the formulation of this model, as compared to the extremely complicated nature of physical spin glasses, nevertheless a rigorous study of all properties emerging from the static and dynamic behavior of a thermodynamic system of this kind is far from being complete. In particular, with reference to static equilibrium properties, it is not yet possible to reach a completely substantiated description of the phases emerging in the low-temperature region. Even physical intuition gives completely different guesses for different people.

In the same way as a mean-field version can be associated to the ordinary Ising model, so it is possible for the disordered model described by [1]. Now we consider a number of sites i = 1, 2, ..., N, and let each spin σ(i) at site i interact with all other spins, with the intervention of a quenched noise J_ij. The precise form of the Hamiltonian will be given in the following. This is the mean-field model for spin glasses, introduced by Sherrington and Kirkpatrick (1975).
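As an illustration of the Edwards–Anderson Hamiltonian [1], the following minimal sketch samples centered unit Gaussian couplings on the nearest-neighbor bonds of a small lattice and evaluates the energy of a spin configuration. The choices made here are assumptions for the example, not part of the article: the lattice is two-dimensional (d = 2) with open boundaries, and the names `sample_couplings` and `energy` are hypothetical.

```python
import random


def sample_couplings(side, rng):
    """One Gaussian coupling J(n, n') per nearest-neighbor bond of a side x side lattice."""
    couplings = {}
    for x in range(side):
        for y in range(side):
            if x + 1 < side:
                couplings[((x, y), (x + 1, y))] = rng.gauss(0.0, 1.0)
            if y + 1 < side:
                couplings[((x, y), (x, y + 1))] = rng.gauss(0.0, 1.0)
    return couplings


def energy(sigma, couplings):
    """H(sigma, J) = - sum over bonds of J(n, n') * sigma(n) * sigma(n')."""
    return -sum(j * sigma[a] * sigma[b] for (a, b), j in couplings.items())


rng = random.Random(0)
J = sample_couplings(3, rng)  # quenched: sampled once, then held fixed
sigma = {(x, y): rng.choice((-1, 1)) for x in range(3) for y in range(3)}
print(energy(sigma, J))
```

The couplings are drawn once and then kept fixed while the spins vary, which is the sense in which they are quenched. Flipping all spins at once leaves the energy unchanged, since each bond term contains two spin factors.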
It is a celebrated model. Numerous articles have been devoted to its study during the years, appearing in the theoretical physics literature.

The relevance of the model stems surely from the fact that it is intended to represent some important features of the physical spin glass systems, of great interest for their peculiar properties, at least at the level of the mean-field approximation.

But another important source of interest is connected with the fact that disordered systems, of the Sherrington–Kirkpatrick type, and their generalizations, seem to play a very important role for theoretical and practical assessments about hard optimization problems, as it is shown, for example, by Mézard et al. (2002).

It is interesting to remark that the original paper was entitled ‘‘Solvable model of a spin-glass,’’ while a previous draft, as told by David Sherrington, contained the even stronger designation ‘‘Exactly solvable.’’ However, it turned out that the very natural solution devised by the authors is valid only at high temperatures, or for large external magnetic fields. At low temperatures, the proposed solution exhibits a nonphysical drawback given by a negative entropy, as properly recognized by the authors in their very first paper.

It took some years to find an acceptable solution. This was done by Giorgio Parisi in a series of papers, marking a radical departure from the previous methods. In fact, a very intense method of ‘‘spontaneous replica symmetry breaking’’ was developed. As a consequence, the physical content of the theory was encoded in a functional order parameter of new type, and a remarkable structure emerged for the pure states of the theory, a kind of hierarchical, ultrametric organization. These very interesting developments, due to Parisi and his coworkers, are explained in a brilliant way in the classical book by Mézard et al. (1987). Part of this structure will be recalled in the following.

It is important to remark that the Parisi solution is presented in the form of an ingenious and clever ‘‘ansatz.’’ Until a few years ago, it was not known whether this ansatz would give the true solution for the model, in the so-called thermodynamic limit, when the size of the system becomes infinite, or it would be only a very good approximation for the true solution.

The general structures offered by the Parisi solution, and their possible generalizations for similar models, exhibit an extremely rich and interesting mathematical content. Very appropriately, Talagrand (2003) has used a strongly suggestive sentence in the title to his recent book: ‘‘Spin glasses: a challenge for mathematicians.’’

As a matter of fact, how to face this challenge is a very difficult problem. Here we would like to recall the main features of a very powerful method, yet extremely simple in its very essence, based on a comparison and interpolation argument on sets of Gaussian random variables.

The method found its first simple application in Guerra (2001), where it was shown that the Sherrington–Kirkpatrick replica symmetric approximate solution was a rigorous lower bound for the quenched free energy of the system, uniformly in the size. Then, it was possible to reach a long-awaited result (Guerra and Toninelli 2002): the convergence of the free energy density in the thermodynamic limit, by an intermediate step where the quenched free energy was shown to be subadditive in the size of the system.

Moreover, still by interpolation on families of Gaussian random variables, the first mentioned result was extended to give a rigorous proof that the expression given by the Parisi ansatz is also a lower bound for the quenched free energy of the system, uniformly in the size (Guerra 2003). The method gives not only the bound, but also the explicit form of the correction in a complex form. As a recent and very important result, along the task of facing the challenge, Michel Talagrand has been able to dominate these correction terms, showing that they vanish in the thermodynamic limit. This milestone achievement was first announced in a short note, containing only a synthetic sketch of the proof, and then presented with all details in a long paper (Talagrand 2006).

The interpolation method is also at the basis of the far-reaching generalized variational principle proved by Aizenman et al. (2003).

In our presentation, we will try to be as self-contained as possible. We will give all definitions, explain the basic structure of the interpolation method, and show how some of the results are obtained. We will concentrate mostly on questions connected with the free energy, its properties of subadditivity, the existence of the infinite-volume limit, and the replica bounds.

For the sake of comparison, and in order to provide a kind of warm-up, we will recall also some features of the standard elementary mean-field model of ferromagnetism, the so-called Curie–Weiss model. We will concentrate also here on the free energy, and systematically exploit elementary comparison and interpolation arguments. This will show the strict analogy between the treatment of the ferromagnetic model and the developments in the mean-field spin glass case. Basic roles will be played in the two cases, but with different expressions, by positivity and convexity properties.
Then, we will consider the problem of connecting results for the mean-field case to the short-range case. An intermediate position is occupied by the so-called diluted models. They can be studied through a generalization of the methods exploited in the mean-field case, as shown, for example, in De Sanctis (2005).

The organization of the paper is as follows. We first introduce the ferromagnetic model and discuss behavior and properties of the free energy in the thermodynamic limit, by emphasizing, in this very elementary case, the comparison and interpolation methods that will be also exploited, in a different context, in the spin glass case.

The basic features of the mean-field spin glass models are discussed next, by introducing all necessary definitions. This is followed by the introduction, for generic Gaussian interactions, of some important formulas, concerning the derivation with respect to the strength of the interaction, and the Gaussian comparison and interpolation method.

We then give simple applications to the mean-field spin glass model, in particular to the existence of the infinite-volume limit of the quenched free energy (Guerra and Toninelli 2002), and to the proof of general variational bounds, by following the useful strategy developed in Aizenman et al. (2003).

The main features of the Parisi representation are recalled briefly, and the main theorem concerning the free energy is stated. This is followed by a brief mention of results for diluted models.

We also attack the problem of connecting the results for the mean-field case to the more realistic short-range models.

Finally we provide conclusions and outlook for future foreseen developments.

Our treatment will be as simple as possible, by relying on the basic structural properties, and by describing methods of presumably very long lasting power. The emphasis given to the mean-field case reflects the status of research. After some years from now this review would perhaps be written according to completely different patterns.

A Warm-up. The Mean-field Ferromagnetic Model: Structure and Results

The mean-field ferromagnetic model is among the simplest models of statistical mechanics. However, it contains very interesting features, in particular a phase transition, characterized by spontaneous magnetization, at low temperatures. We refer to standard textbooks for a full treatment and a complete appreciation of the model in the frame of the theory of ferromagnetism. Here we first consider some properties of the free energy, easily obtained through comparison methods.

The generic configuration of the mean-field ferromagnetic model is defined through Ising spin variables σ_i = ±1, attached to each site i = 1, 2, ..., N.

The Hamiltonian of the model, in some external field of strength h, is given by the mean-field expression

    H_N(\sigma, h) = -\frac{1}{N} \sum_{(i,j)} \sigma_i \sigma_j - h \sum_i \sigma_i    [2]

Here, the first sum extends to all N(N − 1)/2 site couples, and the second to all sites.

For a given inverse temperature β, let us now introduce the partition function Z_N(β, h) and the free energy per site f_N(β, h), according to the well-known definitions

    Z_N(\beta, h) = \sum_{\sigma_1 \ldots \sigma_N} \exp(-\beta H_N(\sigma, h))    [3]

    -\beta f_N(\beta, h) = N^{-1} E \log Z_N(\beta, h)    [4]

It is also convenient to define the average spin magnetization

    m = \frac{1}{N} \sum_i \sigma_i    [5]

Then, it is immediately seen that the Hamiltonian in [2] can be equivalently written as

    H_N(\sigma, h) = -\frac{1}{2} N m^2 - h \sum_i \sigma_i    [6]

where an unessential constant term has been neglected. In fact, we have

    \sum_{(i,j)} \sigma_i \sigma_j = \frac{1}{2} \sum_{i,j;\, i \ne j} \sigma_i \sigma_j = \frac{1}{2} N^2 m^2 - \frac{1}{2} N    [7]

where the sum over all couples has been equivalently written as one half the sum over all i, j with i ≠ j, and the diagonal terms with i = j have been added and subtracted out. Notice that they give a constant because σ_i^2 = 1.

Therefore, the partition function in [3] can be equivalently substituted by the expression

    Z_N(\beta, h) = \sum_{\sigma_1 \ldots \sigma_N} \exp\left( \frac{1}{2} \beta N m^2 \right) \exp\left( \beta h \sum_i \sigma_i \right)    [8]

which will be our starting point.

Our interest will be in the lim_{N→∞} N^{-1} log Z_N(β, h). To this purpose, let us establish the important subadditivity property, holding for the splitting of the
large-N system in two smaller systems with $N_1$ and $N_2$ sites, respectively, with $N = N_1 + N_2$,

$$ \log Z_N(\beta, h) \le \log Z_{N_1}(\beta, h) + \log Z_{N_2}(\beta, h) \qquad [9] $$

The proof is very simple. Let us denote, in the most natural way, by $\sigma_1, \dots, \sigma_{N_1}$ the spin variables for the first subsystem, and by $\sigma_{N_1+1}, \dots, \sigma_N$ the $N_2$ spin variables of the second subsystem. Introduce also the subsystem magnetizations $m_1$ and $m_2$, by adapting the definition [5] to the smaller systems, in such a way that

$$ N m = N_1 m_1 + N_2 m_2 \qquad [10] $$

Therefore, we see that the large system magnetization $m$ is the linear convex combination of the smaller system ones, according to the obvious

$$ m = \frac{N_1}{N} m_1 + \frac{N_2}{N} m_2 \qquad [11] $$

Since the mapping $m \to m^2$ is convex, we also have the general bound, holding for all values of the $\sigma$ variables

$$ m^2 \le \frac{N_1}{N} m_1^2 + \frac{N_2}{N} m_2^2 \qquad [12] $$

Then, it is enough to substitute the inequality in the definition [8] of $Z_N(\beta, h)$, and recognize that we achieve factorization with respect to the two subsystems, and therefore the inequality $Z_N \le Z_{N_1} Z_{N_2}$. So we have established [9]. From subadditivity, the existence of the limit follows by standard arguments. In fact, we have

$$ \lim_{N \to \infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h) \qquad [13] $$

Now we will calculate explicitly this limit, by introducing an order parameter $M$, a trial function, and an appropriate variational scheme. In order to get a lower bound, we start from the elementary inequality $m^2 \ge 2mM - M^2$, holding for any value of $m$ and $M$. By inserting the inequality in the definition [8] we arrive at a factorization of the sum over $\sigma$'s. The sum can be explicitly calculated, and we arrive immediately at the lower bound, uniform in the size of the system,

$$ N^{-1} \log Z_N(\beta, h) \ge \log 2 + \log\cosh \beta(h + M) - \frac{1}{2} \beta M^2 \qquad [14] $$

holding for any value of the trial order parameter $M$. Clearly, it is convenient to take the supremum over $M$. Then, we establish the optimal uniform lower bound

$$ N^{-1} \log Z_N(\beta, h) \ge \sup_M \Big( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2} \beta M^2 \Big) \qquad [15] $$

It is simple to realize that the supremum coincides with the limit as $N \to \infty$. To this purpose we follow the following simple procedure. Let us consider all possible values of the variable $m$. There are $N + 1$ of them, corresponding to any number $K$ of possible spin flips, starting from a given configuration, $K = 0, 1, \dots, N$. Let us consider the trivial decomposition of the identity, holding for any $m$,

$$ 1 = \sum_M \delta_{mM} \qquad [16] $$

where $M$ in the sum runs over the $N + 1$ possible values of $m$, and $\delta$ is the Kronecker delta, being equal to 1 if $M = m$, and zero otherwise. Let us now insert [16] in the definition [8] of the partition function inside the sum over $\sigma$'s, and invert the two sums. Because of the forcing $m = M$ given by the $\delta$, we can write $m^2 = 2mM - M^2$ inside the sum. Then if we neglect the $\delta$, by using the trivial $\delta \le 1$, we have an upper bound, where the sum over $\sigma$'s can be explicitly performed as before. Then it is enough to take the upper bound with respect to $M$, and consider that there are $N + 1$ terms in the now trivial sum over $M$, in order to arrive at the upper bound

$$ N^{-1} \log Z_N(\beta, h) \le \sup_M \Big( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2} \beta M^2 \Big) + N^{-1} \log(N + 1) \qquad [17] $$

Therefore, by going to the limit as $N \to \infty$, we can collect all our results in the form of the following theorem giving the full characterization of the thermodynamic limit of the free energy.

Theorem 1 For the mean-field ferromagnetic model we have

$$ \lim_{N \to \infty} N^{-1} \log Z_N(\beta, h) = \inf_N N^{-1} \log Z_N(\beta, h) \qquad [18] $$

$$ = \sup_M \Big( \log 2 + \log\cosh \beta(h + M) - \frac{1}{2} \beta M^2 \Big) \qquad [19] $$

This ends our discussion about the free energy in the ferromagnetic model.

Other properties of the model can be easily established. Introduce the Boltzmann–Gibbs state

$$ \omega_N(A) = Z_N^{-1} \sum_{\sigma_1 \dots \sigma_N} A \exp\Big(\frac{1}{2} \beta N m^2\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \qquad [20] $$

where $A$ is any function of $\sigma_1 \dots \sigma_N$.

The observable $m(\sigma)$ becomes self-averaging under $\omega_N$, in the infinite-volume limit, in the sense that

$$ \lim_{N \to \infty} \omega_N\big((m - M(\beta, h))^2\big) = 0 \qquad [21] $$
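The two-sided bounds [15] and [17], and the self-averaging statement [21], can be checked numerically for moderate N. The following is a minimal sketch, not part of the original treatment: it evaluates [8] exactly by grouping configurations according to the number of down spins, approximates the supremum over M on a grid restricted to [-1, 1] (an assumption, justified because the stationary point satisfies M = tanh β(h + M)), and takes h > 0 so that the fixed-point iteration selects the maximizer. All function names are hypothetical.

```python
import math

def magnetization_log_weights(n, beta, h):
    """Log of the total weight in [8] carried by each of the n + 1 values of m."""
    result = []
    for k in range(n + 1):                      # k = number of spins equal to -1
        m = (n - 2 * k) / n
        log_count = (math.lgamma(n + 1) - math.lgamma(k + 1)
                     - math.lgamma(n - k + 1))  # log of the binomial coefficient
        result.append((m, log_count + 0.5 * beta * n * m * m + beta * h * n * m))
    return result

def log_partition_per_site(n, beta, h):
    """Exact N^{-1} log Z_N, computed with a log-sum-exp for stability."""
    logs = [lw for _, lw in magnetization_log_weights(n, beta, h)]
    top = max(logs)
    return (top + math.log(sum(math.exp(x - top) for x in logs))) / n

def trial_value(beta, h, big_m):
    """Right-hand side of [14] for a trial order parameter M."""
    return (math.log(2.0) + math.log(math.cosh(beta * (h + big_m)))
            - 0.5 * beta * big_m ** 2)

def variational_sup(beta, h, grid=4001):
    """Grid approximation (from below) of the supremum in [15]."""
    return max(trial_value(beta, h, -1.0 + 2.0 * j / (grid - 1))
               for j in range(grid))

def optimal_order_parameter(beta, h, sweeps=500):
    """Iterate M = tanh(beta (h + M)) from M = 1; assumes h > 0."""
    big_m = 1.0
    for _ in range(sweeps):
        big_m = math.tanh(beta * (h + big_m))
    return big_m

def magnetization_spread(n, beta, h):
    """omega_N((m - M)^2) as in [21], by exact summation over m."""
    pairs = magnetization_log_weights(n, beta, h)
    top = max(lw for _, lw in pairs)
    norm = sum(math.exp(lw - top) for _, lw in pairs)
    target = optimal_order_parameter(beta, h)
    return sum(math.exp(lw - top) * (m - target) ** 2 for m, lw in pairs) / norm
```

For instance, with β = 1.5 and h = 0.1 one can compare `log_partition_per_site(n, 1.5, 0.1)` with `variational_sup(1.5, 0.1)` for growing n, and watch `magnetization_spread` shrink as n increases.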
This property of $m$ is the deep reason for the success of the strategy exploited earlier for the convergence of the free energy. Easy consequences are the following. In the infinite-volume limit, for $h \neq 0$, the Boltzmann–Gibbs state becomes a factor state

$$ \lim_{N \to \infty} \omega_N(\sigma_1 \dots \sigma_s) = M(\beta, h)^s \qquad [22] $$

A phase transition appears in the form of spontaneous magnetization. In fact, while for $h = 0$ and $\beta \le 1$ we have $M(\beta, h) = 0$, on the other hand, for $\beta > 1$, we have the discontinuity

$$ \lim_{h \to 0^+} M(\beta, h) = -\lim_{h \to 0^-} M(\beta, h) \equiv M(\beta) > 0 \qquad [23] $$

Fluctuations can also be easily controlled. In fact, one proves that the rescaled random variable $\sqrt{N}\,(m - M(\beta, h))$ tends in distribution, under $\omega_N$, to a centered Gaussian with variance given by the susceptibility

$$ \chi(\beta, h) \equiv \frac{\partial}{\partial h} M(\beta, h) = \frac{\beta (1 - M^2)}{1 - \beta (1 - M^2)} \qquad [24] $$

Notice that the variance becomes infinite only at the critical point $h = 0$, $\beta = 1$, where $M = 0$.

Now we are ready to attack the much more difficult spin glass model. But it will be surprising to see that, by following a simple extension of the methods described here, we will arrive at similar results.

Basic Definitions for the Mean-Field Spin Glass Model

As in the ferromagnetic case, the generic configuration of the mean-field spin glass model is defined through Ising spin variables $\sigma_i = \pm 1$, attached to each site $i = 1, 2, \dots, N$.

But now there is an external quenched disorder given by the $N(N - 1)/2$ independent and identically distributed random variables $J_{ij}$, defined for each pair of sites. For the sake of simplicity, we assume each $J_{ij}$ to be a centered unit Gaussian with averages $E(J_{ij}) = 0$, $E(J_{ij}^2) = 1$. By quenched disorder we mean that the $J$ have a kind of stochastic external influence on the system, without contributing to the thermal equilibrium.

Now the Hamiltonian of the model, in some external field of strength $h$, is given by the mean-field expression

$$ H_N(\sigma, h, J) = -\frac{1}{\sqrt{N}} \sum_{(i,j)} J_{ij} \sigma_i \sigma_j - h \sum_i \sigma_i \qquad [25] $$

Here, the first sum extends to all site pairs, and the second to all sites. Notice the $\sqrt{N}$, necessary to ensure a good thermodynamic behavior to the free energy.

For a given inverse temperature $\beta$, let us now introduce the disorder-dependent partition function $Z_N(\beta, h, J)$ and the quenched average of the free energy per site $f_N(\beta, h)$, according to the definitions

$$ Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp(-\beta H_N(\sigma, h, J)) \qquad [26] $$

$$ -\beta f_N(\beta, h) = N^{-1} E \log Z_N(\beta, h, J) \qquad [27] $$

Notice that in [27] the average $E$ with respect to the external noise is made "after" the log is taken. This procedure is called quenched averaging. It represents the physical idea that the external noise does not contribute to the thermal equilibrium. Only the $\sigma$'s are thermalized.

For the sake of simplicity, it is also convenient to write the partition function in the following equivalent form. First of all let us introduce a family of centered Gaussian random variables $K(\sigma)$, indexed by the configurations $\sigma$, and characterized by the covariances

$$ E(K(\sigma) K(\sigma')) = q^2(\sigma, \sigma') \qquad [28] $$

where $q(\sigma, \sigma')$ are the overlaps between two generic configurations, defined by

$$ q(\sigma, \sigma') = N^{-1} \sum_i \sigma_i \sigma'_i \qquad [29] $$

with the obvious bounds $-1 \le q(\sigma, \sigma') \le 1$, and the normalization $q(\sigma, \sigma) = 1$. Then, starting from the definition [25], it is immediately seen that the partition function in [26] can also be written, by neglecting unessential constant terms, in the form

$$ Z_N(\beta, h, J) = \sum_{\sigma_1 \dots \sigma_N} \exp\Big(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \qquad [30] $$

which will be the starting point of our treatment.

Basic Formulas of Derivation and Interpolation

We work in the following general setting. Let $U_i$ be a family of centered Gaussian random variables, $i = 1, \dots, K$, with covariance matrix given by $E(U_i U_j) \equiv S_{ij}$. We treat the index $i$ now as configuration space for some statistical mechanics system, with partition function $Z$ and quenched free energy given by

$$ E \log \sum_i w_i \exp(\sqrt{t}\, U_i) \equiv E \log Z \qquad [31] $$
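Returning to the step from [25] to [30]: the only input is the covariance of the coupling term in [25] between two configurations, which equals (N/2) q² minus a configuration-independent constant 1/2. The sketch below checks this identity by brute force for a small system, using E(J_ij J_kl) = 1 for equal pairs and 0 otherwise. It is an illustrative check under these assumptions; the function names are hypothetical.

```python
from itertools import product

def overlap(sigma, tau):
    """Overlap q of [29] between two spin configurations."""
    return sum(a * b for a, b in zip(sigma, tau)) / len(sigma)

def coupling_covariance(sigma, tau):
    """Covariance of N^{-1/2} sum_{(i,j)} J_ij s_i s_j taken at sigma and tau.

    Only equal pairs (i, j) survive the average over independent unit J_ij.
    """
    n = len(sigma)
    return sum(sigma[i] * sigma[j] * tau[i] * tau[j]
               for i in range(n) for j in range(i + 1, n)) / n

def largest_mismatch(n):
    """Largest deviation from (N/2) q^2 - 1/2 over all pairs of configurations."""
    configs = list(product((-1, 1), repeat=n))
    return max(abs(coupling_covariance(s, t) - (0.5 * n * overlap(s, t) ** 2 - 0.5))
               for s in configs for t in configs)
```

The constant shift by 1/2 does not depend on the configurations, which is why it can be dropped as an unessential constant when passing to the fields K(σ) with covariance q².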
The proof is straightforward. First we perform directly the $t$-derivative. Then, we notice that the random variables appear in expressions of the form $E(U_i F)$, where $F$ are functions of the $U$'s. These can be easily handled through the following integration by parts formula for generic Gaussian random variables, strongly reminiscent of the Wick theorem in quantum field theory,

$$ E(U_i F) = \sum_j S_{ij}\, E\Big(\frac{\partial}{\partial U_j} F\Big) \qquad [33] $$

Therefore, we see that always two derivatives are involved. The two terms in [32] come from the action of the $U_j$ derivatives, the first acting on the Boltzmann factor, and giving rise to a Kronecker $\delta_{ij}$, the second acting on $Z^{-1}$, and giving rise to the minus sign and the duplication of variables.

The derivation formula can be expressed in a more compact form by introducing replicas and suitable averages. In fact, let us introduce the state $\omega$ acting on functions $F$ of $i$ as follows

$$ \omega(F(i)) = Z^{-1} \sum_i w_i \exp(\sqrt{t}\, U_i) F(i) \qquad [34] $$

together with the associated product state $\Omega$ acting on replicated configuration spaces $i_1, i_2, \dots, i_s$. By performing also a global $E$ average, finally we define the averages

$$ \langle F \rangle_t \equiv E\, \Omega(F) \qquad [35] $$

where the subscript is introduced in order to recall the $t$ dependence of these averages.

Then, eqn [32] can be written in a more compact form

$$ \frac{d}{dt} E \log \sum_i w_i \exp(\sqrt{t}\, U_i) = \frac{1}{2} \langle S_{i_1 i_1} \rangle - \frac{1}{2} \langle S_{i_1 i_2} \rangle \qquad [36] $$

Our basic comparison argument will be based on the following very simple theorem.

where the $w_i \ge 0$ are the same in the two expressions.

Considerations of this kind are present in the mathematical literature, as mentioned, for example, in Talagrand (2003).

The proof is extremely simple and amounts to a straightforward calculation. In fact, let us consider the interpolating expression

$$ E \log \sum_i w_i \exp\big(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat{U}_i\big) \qquad [40] $$

where $0 \le t \le 1$. Clearly, the two expressions under comparison correspond to the values $t = 0$ and $t = 1$, respectively. By taking the derivative with respect to $t$, with the help of the previous derivation formula, we arrive at the evaluation of the $t$ derivative in the form

$$ \frac{d}{dt} E \log \sum_i w_i \exp\big(\sqrt{t}\, U_i + \sqrt{1 - t}\, \hat{U}_i\big) = \frac{1}{2} E\Big( Z^{-1} \sum_i w_i \exp(\sqrt{t}\, U_i)\,(S_{ii} - \hat{S}_{ii}) \Big) - \frac{1}{2} E\Big( Z^{-2} \sum_i \sum_j w_i w_j \exp(\sqrt{t}\, U_i) \exp(\sqrt{t}\, U_j)\,(S_{ij} - \hat{S}_{ij}) \Big) \qquad [41] $$

From the conditions assumed for the covariances, we immediately see that the interpolating function is nonincreasing in $t$, and the theorem follows.

The derivation formula and the comparison theorem are not restricted to the Gaussian case. Generalizations in many directions are possible. For the diluted spin glass models and optimization problems we refer, for example, to Franz and Leone (2003), and to De Sanctis (2005), and references therein.
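The derivation formula [36] can be checked numerically in the simplest nontrivial case. The sketch below assumes two configurations with independent unit Gaussian variables, so that S is the identity matrix, ⟨S_{i1 i1}⟩ = 1 and ⟨S_{i1 i2}⟩ is the average of the sum of squared Boltzmann weights. The weights, the grid, and all names are hypothetical choices made for illustration only; Gaussian averages are approximated by a Riemann sum on a uniform grid.

```python
import math

WEIGHTS = (0.3, 0.7)                          # hypothetical w_1, w_2
STEP = 0.1
GRID = [-8.0 + STEP * k for k in range(161)]  # uniform grid on [-8, 8]

def gauss_average(func):
    """E func(U1, U2) for independent unit Gaussians, by a Riemann sum."""
    total = 0.0
    for u1 in GRID:
        for u2 in GRID:
            density = math.exp(-0.5 * (u1 * u1 + u2 * u2)) / (2.0 * math.pi)
            total += density * func(u1, u2)
    return total * STEP * STEP

def free_energy(t):
    """E log sum_i w_i exp(sqrt(t) U_i), the left-hand quantity in [36]."""
    root = math.sqrt(t)
    return gauss_average(lambda u1, u2: math.log(
        WEIGHTS[0] * math.exp(root * u1) + WEIGHTS[1] * math.exp(root * u2)))

def derivative_formula(t):
    """Right-hand side of [36] when S is the 2 x 2 identity matrix."""
    root = math.sqrt(t)

    def integrand(u1, u2):
        b1 = WEIGHTS[0] * math.exp(root * u1)
        b2 = WEIGHTS[1] * math.exp(root * u2)
        z = b1 + b2
        # <S_{i1 i1}> contributes 1; <S_{i1 i2}> contributes sum_i omega_i^2
        return 0.5 - 0.5 * ((b1 / z) ** 2 + (b2 / z) ** 2)

    return gauss_average(integrand)

def finite_difference(t, eps=1e-4):
    """Central difference approximation of the t-derivative of free_energy."""
    return (free_energy(t + eps) - free_energy(t - eps)) / (2.0 * eps)
```

Comparing `finite_difference(0.5)` with `derivative_formula(0.5)` gives agreement up to discretization error, which illustrates the minus sign and the duplication of variables mentioned in the proof.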
Thermodynamic Limit and the Variational Bounds

We give here some striking applications of the basic comparison theorem. Guerra and Toninelli (2002) have given a very simple proof of a long-awaited result, about the convergence of the free energy per site in the thermodynamic limit. Let us show the argument. Let us consider a system of size $N$ and two smaller systems of sizes $N_1$ and $N_2$ respectively, with $N = N_1 + N_2$, as before in the ferromagnetic case. Let us now compare

$$ E \log Z_N(\beta, h, J) = E \log \sum_{\sigma_1 \dots \sigma_N} \exp\Big(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \qquad [42] $$

with

$$ E \log \sum_{\sigma_1 \dots \sigma_N} \exp\Big(\beta \sqrt{\frac{N_1}{2}}\, K^{(1)}(\sigma^{(1)})\Big) \exp\Big(\beta \sqrt{\frac{N_2}{2}}\, K^{(2)}(\sigma^{(2)})\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \equiv E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J) \qquad [43] $$

where $\sigma^{(1)}$ stands for $\sigma_i$, $i = 1, \dots, N_1$, and $\sigma^{(2)}$ for $\sigma_i$, $i = N_1 + 1, \dots, N$. Covariances for $K^{(1)}$ and $K^{(2)}$ are expressed as in [28], but now the overlaps are substituted with the partial overlaps of the first and second block, $q_1$ and $q_2$, respectively. It is very simple to apply the comparison theorem. All one has to do is to observe that the obvious

$$ N q = N_1 q_1 + N_2 q_2 \qquad [44] $$

analogous to [10], implies, as in [12],

$$ q^2 \le \frac{N_1}{N} q_1^2 + \frac{N_2}{N} q_2^2 \qquad [45] $$

Therefore, the comparison gives the superadditivity property, to be compared with [9],

$$ E \log Z_N(\beta, h, J) \ge E \log Z_{N_1}(\beta, h, J) + E \log Z_{N_2}(\beta, h, J) \qquad [46] $$

From the superadditivity property the existence of the limit follows in the form

$$ \lim_{N \to \infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J) \qquad [47] $$

to be compared with [13].

The second application is in the form of the Aizenman–Sims–Starr generalized variational principle. Here, we will need to introduce some auxiliary system. The denumerable configuration space is given by the values of $\alpha = 1, 2, \dots$. We introduce also weights $w_\alpha \ge 0$ for the $\alpha$ system, and suitably defined overlaps between two generic configurations $p(\alpha, \alpha')$, with $p(\alpha, \alpha) = 1$.

A family of centered Gaussian random variables $\hat{K}(\alpha)$, now indexed by the configurations $\alpha$, will be defined by the covariances

$$ E(\hat{K}(\alpha) \hat{K}(\alpha')) = p^2(\alpha, \alpha') \qquad [48] $$

We will also need a family of centered Gaussian random variables $\eta_i(\alpha)$, indexed by the sites $i$ of our original system and the configurations $\alpha$ of the auxiliary system, so that

$$ E(\eta_i(\alpha) \eta_{i'}(\alpha')) = \delta_{ii'}\, p(\alpha, \alpha') \qquad [49] $$

Both the probability measure $w_\alpha$, and the overlaps $p(\alpha, \alpha')$ could depend on some additional external quenched noise, which does not appear explicitly in our notation.

In the following, we will denote by $E$ averages with respect to all random variables involved.

In order to start the comparison argument, we will consider first the case where the two $\sigma$ and $\alpha$ systems are not coupled, so as to appear factorized in the form

$$ E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\Big(\beta \sqrt{\frac{N}{2}}\, K(\sigma)\Big) \exp\Big(\beta \sqrt{\frac{N}{2}}\, \hat{K}(\alpha)\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \equiv E \log Z_N(\beta, h, J) + E \log \sum_\alpha w_\alpha \exp\Big(\beta \sqrt{\frac{N}{2}}\, \hat{K}(\alpha)\Big) \qquad [50] $$

In the second case, the $K$ fields are suppressed and the coupling between the two systems will be taken in a very simple form, by allowing the $\eta$ field to act as an external field on the $\sigma$ system. In this way the $\sigma$'s appear as factorized, and the sums can be explicitly performed. The chosen form for the second term in the comparison is

$$ E \log \sum_{\sigma_1 \dots \sigma_N} \sum_\alpha w_\alpha \exp\Big(\beta \sum_i \eta_i(\alpha) \sigma_i\Big) \exp\Big(\beta h \sum_i \sigma_i\Big) \equiv N \log 2 + E \log \sum_\alpha w_\alpha\, (c_1 c_2 \dots c_N) \qquad [51] $$
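The factorization behind [51] can be made concrete: for fixed values of the cavity fields, the sum over the σ's of exp(β Σ_i (h + η_i) σ_i) is a product of single-site sums, each equal to 2 cosh β(h + η_i). The sketch below checks this by brute force. It reads the factors c_i in [51] as cosh β(h + η_i(α)), which is what the explicit sum produces and is stated here as an assumption; the field values and names are hypothetical.

```python
import math
from itertools import product

def brute_force_sum(fields, beta, h):
    """Sum over all spin configurations of exp(beta * sum_i (h + eta_i) sigma_i)."""
    total = 0.0
    for sigma in product((-1, 1), repeat=len(fields)):
        total += math.exp(beta * sum((h + x) * s for x, s in zip(fields, sigma)))
    return total

def factorized_sum(fields, beta, h):
    """Closed form 2^N * prod_i cosh(beta (h + eta_i)) of the same sum."""
    result = 1.0
    for x in fields:
        result *= 2.0 * math.cosh(beta * (h + x))
    return result

# Hypothetical values of the fields eta_i for one fixed configuration alpha.
EXAMPLE_FIELDS = (0.4, -1.1, 0.7, 0.2, -0.3, 1.5)
```

Taking logarithms of the factorized form gives the N log 2 term and the product c_1 c_2 ... c_N appearing on the right-hand side of [51].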
earlier, depending on the functional order parameter $x$, is defined as

$$ \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \qquad [64] $$

Notice that in this expression the function $f$ appears evaluated at $q = 0$, and $y = h$, where $h$ is the value of the external magnetic field. This trial expression should be considered as the analog of that appearing in [14] for the ferromagnetic case.

The Parisi spontaneously broken replica symmetry expression for the free energy is given by the definition

$$ -\beta f_P(\beta, h) \equiv \inf_x \Big( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Big) \qquad [65] $$

where the infimum is taken with respect to all functional order parameters $x$. Notice that the infimum appears here, as compared to the supremum in the ferromagnetic case.

By exploiting a kind of generalized comparison argument, involving a suitably defined interpolation function, Guerra (2003) has established the following important result.

Theorem 3 For all values of the inverse temperature $\beta$, and the external magnetic field $h$, and for any functional order parameter $x$, the following bound holds:

$$ N^{-1} E \log Z_N(\beta, h, J) \le \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq $$

uniformly in $N$. Consequently, we have also

$$ N^{-1} E \log Z_N(\beta, h, J) \le \inf_x \Big( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Big) $$

uniformly in $N$.

However, this result can also be understood in the framework of the generalized variational principle established by Aizenman–Sims–Starr as described earlier.

In fact, one can easily show that there exist $\alpha$ systems such that

$$ N^{-1} E \log \sum_\alpha w_\alpha\, c_1 c_2 \dots c_N \equiv f(0, h; x, \beta) \qquad [66] $$

$$ N^{-1} E \log \sum_\alpha w_\alpha \exp\Big(\beta \sqrt{\frac{N}{2}}\, \hat{K}(\alpha)\Big) \equiv \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \qquad [67] $$

uniformly in $N$. This result stems from earlier work of Derrida, Ruelle, Neveu, Bolthausen, Sznitman, Aizenman, Contucci, Talagrand, Bovier, and others, and in a sense is implicit in the treatment given in Mézard et al. (1987). It can be reached in a very simple way. Let us sketch the argument.

First of all, let us consider the Poisson point process $y_1 \ge y_2 \ge y_3 \dots$, uniquely characterized by the following conditions. For any interval $A$, introduce the occupation numbers $N(A)$, defined by

$$ N(A) = \sum_\alpha \chi(y_\alpha \in A) \qquad [68] $$

where $\chi(\cdot) = 1$, if the random variable $y_\alpha$ belongs to the interval $A$, and $\chi(\cdot) = 0$, otherwise. We assume that $N(A)$ and $N(B)$ are independent if the intervals $A$ and $B$ are disjoint, and moreover that for each $A$, the random variable $N(A)$ has a Poisson distribution with parameter

$$ \mu(A) = \int_a^b \exp(-y)\, dy \qquad [69] $$

if $A$ is the interval $(a, b)$, that is,

$$ P(N(A) = k) = \exp(-\mu(A))\, \mu(A)^k / k! \qquad [70] $$

We will exploit $-y_\alpha$ as energy levels for a statistical mechanics system with configurations indexed by $\alpha$. For a parameter $0 < m < 1$, playing the role of inverse temperature, we can introduce the partition function

$$ v = \sum_\alpha \exp\Big(\frac{y_\alpha}{m}\Big) \qquad [71] $$

For $m$ in the given interval it turns out that $v$ is a very well defined random variable, with the sum over $\alpha$ extending to infinity. In fact, there is a strong inbuilt smooth cutoff in the very definition of the stochastic energy levels.

From the general properties of Poisson point processes, it is very well known that the following basic invariance property holds. Introduce a random variable $b$, independent of $y$, subject to the condition $E(\exp b) = 1$, and let $b_\alpha$ be independent copies. Then, the randomly biased point process $y'_\alpha = y_\alpha + b_\alpha$, $\alpha = 1, 2, \dots$, is equivalent to the original one in distribution. An immediate consequence is the following. Let $f$ be a random variable, independent of $y$, such that $E(\exp f) < \infty$, and let $f_\alpha$ be independent copies. Then, the two random variables

$$ \sum_\alpha \exp\Big(\frac{y_\alpha}{m}\Big) \exp(f_\alpha) \qquad [72] $$

$$ E(\exp(m f))^{1/m} \sum_\alpha \exp\Big(\frac{y_\alpha}{m}\Big) \qquad [73] $$
have the same distribution. In particular, they can be freely substituted under averages.

The auxiliary system which gives rise to the Parisi representation according to [66] and [67], for a piecewise constant order parameter, is expressed in the following way. Now $\alpha$ will be a multi-index $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_K)$, where each $\alpha_a$ runs on $1, 2, 3, \dots$. Define the Poisson point process $y_{\alpha_1}$, then, independently, for each value of $\alpha_1$ processes $y_{\alpha_1 \alpha_2}$, and so on up to $y_{\alpha_1 \alpha_2 \dots \alpha_K}$. Notice that in the cascade of independent processes $y_{\alpha_1}, y_{\alpha_1 \alpha_2}, \dots, y_{\alpha_1 \alpha_2 \dots \alpha_K}$, the last index refers to the numbering of the various points of the process, while the first indices denote independent copies labeled by the corresponding $\alpha$'s.

The weights $w_\alpha$ have to be chosen according to the definition

$$ w_\alpha = \exp\Big(\frac{y_{\alpha_1}}{m_1}\Big) \exp\Big(\frac{y_{\alpha_1 \alpha_2}}{m_2}\Big) \cdots \exp\Big(\frac{y_{\alpha_1 \alpha_2 \dots \alpha_K}}{m_K}\Big) \qquad [74] $$

The cavity fields $\eta$ and $\hat{K}$ have the following expression in terms of independent unit Gaussian random variables $J^i_{\alpha_1}, J^i_{\alpha_1 \alpha_2}, \dots, J^i_{\alpha_1 \alpha_2 \dots \alpha_K}$, $J'_{\alpha_1}, J'_{\alpha_1 \alpha_2}, \dots, J'_{\alpha_1 \alpha_2 \dots \alpha_K}$,

$$ \eta_i(\alpha) = \sqrt{q_1 - q_0}\, J^i_{\alpha_1} + \sqrt{q_2 - q_1}\, J^i_{\alpha_1 \alpha_2} + \cdots + \sqrt{q_K - q_{K-1}}\, J^i_{\alpha_1 \alpha_2 \dots \alpha_K} \qquad [75] $$

$$ \hat{K}(\alpha) = \sqrt{q_1^2 - q_0^2}\, J'_{\alpha_1} + \sqrt{q_2^2 - q_1^2}\, J'_{\alpha_1 \alpha_2} + \cdots + \sqrt{q_K^2 - q_{K-1}^2}\, J'_{\alpha_1 \alpha_2 \dots \alpha_K} \qquad [76] $$

It is immediate to verify that $E(\eta_i(\alpha) \eta_{i'}(\alpha'))$ is zero if $i \neq i'$, while

$$ E(\eta_i(\alpha) \eta_i(\alpha')) = \begin{cases} 0 & \text{if } \alpha_1 \neq \alpha'_1 \\ q_1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 \neq \alpha'_2 \\ q_2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \alpha_3 \neq \alpha'_3 \\ \ \vdots \\ 1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \dots,\ \alpha_K = \alpha'_K \end{cases} \qquad [77] $$

Similarly, we have

$$ E(\hat{K}(\alpha) \hat{K}(\alpha')) = \begin{cases} 0 & \text{if } \alpha_1 \neq \alpha'_1 \\ q_1^2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 \neq \alpha'_2 \\ q_2^2 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \alpha_3 \neq \alpha'_3 \\ \ \vdots \\ 1 & \text{if } \alpha_1 = \alpha'_1,\ \alpha_2 = \alpha'_2,\ \dots,\ \alpha_K = \alpha'_K \end{cases} \qquad [78] $$

This ends the definition of the $\alpha$ system, associated to a given piecewise constant order parameter.

Now, it is simple to verify that [66] and [67] hold. Let us consider, for example, [66]. With the $\alpha$ system chosen as before, the repeated application of the stochastic equivalence of [72] and [73] will give rise to a sequence of interchained Gaussian integrations exactly equivalent to those arising from the expression for $f$, as solution of the eqn [60]. For [67], there are equivalent considerations.

Therefore, we see that the estimate in Theorem 3 is also a consequence of the generalized variational principle.

Up to this point we have seen how to obtain upper bounds. The problem arises whether, as in the ferromagnetic case, we can also get lower bounds, so as to shrink the thermodynamic limit to the value given by the $\inf_x$ in Theorem 3. After a short announcement, Talagrand (2006) has firmly established the complete proof of the control of the lower bound. We refer to the original paper for the complete details of this remarkable achievement. About the methods, here we only recall that in Guerra (2003) we have given also the corrections to the bounds appearing in Theorem 3, albeit in a quite complicated form. Talagrand has been able to establish that these corrections do in fact vanish in the thermodynamic limit.

In conclusion, we can establish the following extension of Theorem 1 to spin glasses.

Theorem 4 For the mean-field spin glass model we have

$$ \lim_{N \to \infty} N^{-1} E \log Z_N(\beta, h, J) = \sup_N N^{-1} E \log Z_N(\beta, h, J) \qquad [79] $$

$$ = \inf_x \Big( \log 2 + f(0, h; x, \beta) - \frac{\beta^2}{2} \int_0^1 q\, x(q)\, dq \Big) \qquad [80] $$

Diluted Models

Diluted models, in a sense, play a role intermediate between the mean-field case and the short-range case. In fact, while in the mean-field model each site is interacting with all other sites, on the other hand, in the diluted model, each site is interacting with only a fixed number of other sites. However, while for the short-range models there is a definition of distance among sites, relevant for the interaction, no such definition appears in the diluted models, where all sites are in any case equivalent. From this point of view, the diluted models are structurally similar to the mean-field models, and most of the
techniques and results explained before can be extended to them.

Let us define a typical diluted model. The quenched noise is described as follows. Let $K$ be a Poisson random variable with parameter $\alpha N$, where $N$ is the number of sites, and $\alpha$ is a parameter entering the theory, together with the temperature. We consider also a sequence of independent centered random variables $J_1, J_2, \dots$, and a sequence of discrete independent random variables $i_1, j_1, i_2, j_2, \dots$, uniformly distributed over the set of sites $1, 2, \dots, N$. Then we assume as Hamiltonian

$$ H_N(\sigma) = -\sum_{k=1}^{K} J_k\, \sigma_{i_k} \sigma_{j_k} \qquad [81] $$

Only the $\sigma$ variables contribute to thermodynamic equilibrium. All noise coming from $K$, $J_k$, $i_k$, $j_k$ is considered quenched, and it is not explicitly indicated in our notation for $H$.

The role played by Gaussian integration by parts in the Sherrington–Kirkpatrick model is here assumed by the following elementary derivation formula, holding for Poisson distributions,

$$ \frac{d}{dt} P(K = k, t \alpha N) \equiv \frac{d}{dt} \Big[ \exp(-t \alpha N)\, (t \alpha N)^k / k! \Big] = \alpha N \big( P(K = k - 1, t \alpha N) - P(K = k, t \alpha N) \big) \qquad [82] $$

Then, all machinery of interpolation can be easily extended to the diluted models, as first recognized by Franz and Leone (2003).

In this way, the superadditivity property, the thermodynamic limit, and the generalized variational principle can be easily established. We refer to Franz and Leone (2003), and De Sanctis (2005), for a complete treatment.

There is an important open problem here. While in the fully connected case, the Poisson probability cascades provide the right auxiliary systems to be exploited in the variational principle, on the other hand in the diluted case more complicated probability cascades have been proposed, as shown, for example, in Franz and Leone (2003), and in Panchenko and Talagrand (2004). On the other hand, in De Sanctis (2005), the very interesting proposal has been made that also in the case of diluted models the Poisson probability cascades play a very important role. Of course, here the auxiliary system interacts with the original system differently, and involves a multi-overlap structure as explained in De Sanctis (2005). In this way a kind of very deep universality is emerging. Poisson probability cascades are a kind of universal class of auxiliary systems. The different models require different cavity fields ruling the interaction between the original system and the auxiliary system. But further work will be necessary in order to clarify this very important issue. For results about diluted models in the high-temperature region, we refer to Guerra and Toninelli (2004).

Short-Range Model and Its Connections with the Mean-Field Version

The investigations of the connections between the short-range version of the model and its mean-field version are at the beginning. Here, we limit ourselves to a synthetic description of what should be done, and to a short presentation of the results obtained so far.

First of all, according to the conventional wisdom, the mean-field version should be a kind of limit of the short-range model on a lattice in dimension $d$, when $d \to \infty$, with a proper rescaling of the strength of the Hamiltonian, of the form $d^{-1/2}$. Results of this kind are very well known in the ferromagnetic case, but the present technology of interpolation does not seem sufficient to assure a proof in the spin glass case. So, this very basic result is still missing. In analogy with the ferromagnetic case, it would be necessary to arrive at the notion of a critical dimension, beyond which the features of the mean-field case still hold, for example, in the expression of the critical exponents and in the ultrametric hierarchical structure of the pure phases, or at least for the overlap distributions. For physical dimensions less than the critical one, the short-range model would need corrections with respect to its mean-field version. Therefore, this is a completely open problem.

Moreover, always according to the conventional wisdom, the mean-field version should be a kind of limit of the short-range models, in finite fixed dimensions, as the range of the interaction goes to infinity, with proper rescaling. Important work of Franz and Toninelli shows that this is effectively the case, if a properly defined Kac limit is performed. Here, interpolation methods are effective, and we refer to Franz and Toninelli (2004), and references quoted there, for full details.

Due to the lack of efficient analytical methods, it is clear that numerical simulations play a very important role in the study of the physical properties emerging from short-range spin glass models. In particular, we refer to Marinari et al. (2000) for a detailed account of the evidence, coming from theoretical considerations and extensive computer simulations, that some of the more relevant features of the spontaneous replica breaking scheme of the mean field are also present in
short-range models in three dimensions. Different views are expressed, for example, in Newman and Stein (1998), where it is argued that the phase-space structure of short-range spin glass models is much simpler than that foreseen by the Parisi spontaneous replica symmetry mechanism.

Such very different views, both apparently strongly supported by reasonable theoretical considerations and powerful numerical simulations, are a natural consequence of the extraordinary difficulty of the problem.

It is clear that extensive additional work will be necessary before the clarification of the physical features exhibited by the realistic short-range spin glass models.

Conclusion and Outlook for Future Developments

As we have seen, in these last few years, there has been impressive progress in the understanding of the mathematical structure of spin glass models, mainly due to the systematic exploration of comparison and interpolation methods. However, many important problems are still open. The most important one is to establish rigorously the full hierarchical ultrametric organization of the overlap distributions, as appears in Parisi theory, and to fully understand the decomposition in pure states of the glassy phase, at low temperatures.

Moreover, it would be important to extend these methods to other important disordered models as, for example, neural networks. Here the difficulty is that the positivity arguments, so essential in comparison methods, do not seem to emerge naturally inside the structure of the theory.

Finally, the problem of connecting properties of the short-range model, with those arising in the mean-field case, is still almost completely open.

Acknowledgment

We gratefully acknowledge useful conversations with Michael Aizenman, Pierluigi Contucci, Giorgio Parisi, and Michel Talagrand. The strategy explained in this report grew out from a systematic exploration of comparison and interpolation methods, developed in collaboration with Fabio Lucio Toninelli, and Luca De Sanctis.

This work was supported in part by MIUR (Italian Ministry of Instruction, University and Research), and by INFN (Italian National Institute for Nuclear Physics).

See also: Glassy Disordered Systems: Dynamical Evolution; Large Deviations in Equilibrium Statistical Mechanics; Mean Field Spin Glasses and Neural Networks; Short-Range Spin Glasses: The Metastate Approach; Statistical Mechanics and Combinatorial Problems.

Further Reading

Aizenman M, Sims R, and Starr S (2003) Extended variational principle for the Sherrington–Kirkpatrick spin-glass model. Physical Review B 68: 214403.
De Sanctis L (2005) Structural Approaches to Spin Glasses and Optimization Problems. Ph.D. thesis, Department of Mathematics, Princeton University.
Edwards SF and Anderson PW (1975) Theory of spin glasses. Journal of Physics F: Metal Physics 5: 965–974.
Franz S and Leone M (2003) Replica bounds for optimization problems and diluted spin systems. Journal of Statistical Physics 111: 535–564.
Franz S and Toninelli FL (2004) The Kac limit for finite-range spin glasses. Physical Review Letters 92: 030602.
Guerra F (2001) Sum rules for the free energy in the mean field spin glass model. Fields Institute Communications 30: 161.
Guerra F (2003) Broken replica symmetry bounds in the mean field spin glass model. Communications in Mathematical Physics 233: 1–12.
Guerra F and Toninelli FL (2002) The thermodynamic limit in mean field spin glass models. Communications in Mathematical Physics 230: 71–79.
Guerra F and Toninelli FL (2004) The high temperature region of the Viana–Bray diluted spin glass model. Journal of Statistical Physics 115: 531–555.
Marinari E, Parisi G, Ricci-Tersenghi F, Ruiz-Lorenzo JJ, and Zuliani F (2000) Replica symmetry breaking in short range spin glasses: A review of the theoretical foundations and of the numerical evidence. Journal of Statistical Physics 98: 973–1074.
Mézard M, Parisi G, and Virasoro MA (1987) Spin Glass Theory and Beyond. Singapore: World Scientific.
Mézard M, Parisi G, and Zecchina R (2002) Analytic and algorithmic solution of random satisfiability problems. Science 297: 812.
Newman CM and Stein DL (1998) Simplicity of state and overlap structure in finite-volume realistic spin glasses. Physical Review E 57: 1356–1366.
Panchenko D and Talagrand M (2004) Bounds for diluted mean-field spin glass models. Probability Theory Related Fields 130: 319–336.
Sherrington D and Kirkpatrick S (1975) Solvable model of a spin-glass. Physical Review Letters 35: 1792–1796.
Stein DL (1989) Disordered systems: Mostly spin glasses. In: Stein DL (ed.) Lectures in the Sciences of Complexity. New York: Addison-Wesley.
Talagrand M (2003) Spin Glasses: A Challenge for Mathematicians. Mean Field Models and Cavity Method. Berlin: Springer.
Talagrand M (2006) The Parisi formula. Annals of Mathematics 163: 221–263.
Young P (ed.) (1987) Spin Glasses and Random Fields. Singapore: World Scientific.
Spinors and Spin Coefficients 667
The dimension of the space of spinors rises rapidly with n, which is one reason why historically spinors have been most useful in spaces of dimensions 3 and 4, where the spin space has dimension 2. In a space of dimension 11, a case considered in supergravity, the spin space already has dimension 32.

Spinors in General Relativity: Spinor Algebra

In this section, we start again with a different emphasis. Conventions follow Penrose and Rindler (1984, 1986). To introduce spinors as a calculus in a four-dimensional, Lorentzian spacetime $M$, one can begin by choosing an orthonormal tetrad of vectors $(e_0, e_1, e_2, e_3)$ at a point $p$. The following conventions are used:

$$g(e_a, e_b) = \eta_{ab} = \mathrm{diag}(1, -1, -1, -1)$$

Any vector $v$ in the tangent space $V = T_pM$ at $p$ has components $v^a$ in this basis, which we arrange as a matrix and label in two ways:

$$\Psi(v) = \frac{1}{\sqrt{2}} \begin{pmatrix} v^0 + v^3 & v^1 + i v^2 \\ v^1 - i v^2 & v^0 - v^3 \end{pmatrix} = \begin{pmatrix} v^{00'} & v^{01'} \\ v^{10'} & v^{11'} \end{pmatrix} \qquad [5]$$

The reason for the factor $1/\sqrt{2}$ will be seen below, as will the rationale for the second form of the matrix. Note that $\Psi(v)$ is Hermitian and that

$$2 \det \Psi(v) = g(v, v) = \eta_{ab} v^a v^b \qquad [6]$$

Clearly, there is a one-to-one correspondence between elements of $V$ and Hermitian $2 \times 2$ matrices. Further, if $t$ is any matrix in $SL(2, \mathbb{C})$, then the transformation

$$\Psi(v) \to t\, \Psi(v)\, t^{\dagger} \qquad [7]$$

where $t^{\dagger}$ is the Hermitian conjugate of $t$, is linear in $v$, and preserves both Hermiticity and the norm of $v$. Thus, it must represent a Lorentz transformation. It is straightforward to check that it is a proper, orthochronous Lorentz transformation and that all such transformations arise in this way (recall that "proper" means transformations of determinant 1 so that orientation is preserved, and "orthochronous" means that future-pointing timelike or null vectors are taken to future-pointing timelike or null vectors, so that time orientation is preserved; the proper, orthochronous Lorentz group is equivalently the identity-connected component of the Lorentz group). Since both $t$ and $-t$ give the same Lorentz transformation, this provides an explicit demonstration of the (2–1)-homomorphism of $SL(2, \mathbb{C})$ with the proper, orthochronous Lorentz group $O^{\uparrow}_{+}(1,3)$.

If the vector $v$ in [5] is null, then the matrix has vanishing determinant, or, equivalently, it has rank 1, and so it can be written as the outer product of a two-component column vector $\alpha = (\alpha^0, \alpha^1)^T$ and its Hermitian conjugate:

$$\Psi(v) = \alpha \alpha^{\dagger} \qquad [8]$$

Furthermore, under [7], $\alpha$ transforms as

$$\alpha \to t \alpha \qquad [9]$$

The two-complex-dimensional space to which $\alpha$ belongs is the spin space $S$ at $p$, already met in the previous section, and it follows from [8], since null vectors span $V$, that the tensor product $S \otimes \bar{S}$ of $S$ with its complex conjugate vector space $\bar{S}$ is the complexification of $V$. Complex conjugation gives an antilinear map from $S$ to $\bar{S}$. (One associates the complex-conjugate vector space $\bar{V}$ to any given complex vector space $V$ as follows: scalar multiplication for $V$ can be considered as a function $\mu : \mathbb{C} \times V \to V$ given by $\mu(z, v) = zv$, while vector addition is a map $\sigma : V \times V \to V$ given by $\sigma(u, v) = u + v$. Define another complex vector space by taking the same vectors and the same $\sigma$ but with scalar multiplication $\bar{\mu}$, where $\bar{\mu}(z, v) = \mu(\bar{z}, v)$. This is the complex-conjugate vector space $\bar{V}$. Given a choice of basis, we think of $V$ as, say, n-component column vectors of complex numbers, and then $\bar{V}$ is the corresponding complex-conjugate columns.)

Conventionally, $S$ is the space of unprimed spinors and $\bar{S}$ the space of primed spinors, and one also has the two duals $S'$ and $\bar{S}'$ which are associated in the corresponding way to the dual $V'$ of $V$. Analogously to the situation with vectors and covectors, index conventions for spinors are as follows:

$$\alpha^A \in S; \quad \bar{\alpha}^{A'} \in \bar{S}; \quad \beta_A \in S'; \quad \bar{\beta}_{A'} \in \bar{S}'$$

where $A = 0, 1$, $A' = 0', 1'$.

Spinor algebra mirrors tensor algebra: a spinor $\chi^{A_1 \ldots A_p A'_1 \ldots A'_q}{}_{B_1 \ldots B_r B'_1 \ldots B'_s}$ is an element of the tensor product of p copies of $S$, q copies of $\bar{S}$, r copies of $S'$, and s copies of $\bar{S}'$. The second way of writing the matrix in [5] enables the identification of a vector with a matrix to be conventionally written as

$$v^a = v^{AA'} \qquad [10]$$

and then extended to any tensor $T^{a \ldots b}{}_{c \ldots d}$ by replacing each vector index, say $b$, with a pair $BB'$ of spinor indices. In particular, from [8], it follows that any real null vector $n^a$ can be written in the form

$$n^a = \alpha^A \bar{\alpha}^{A'}$$

for some spinor $\alpha^A$.
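The correspondence in [5]–[8] can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`psi`, `unpsi`, `lorentz`, and so on) and the sample matrix and vectors are illustrative assumptions, not part of the original text. It verifies that $2\det\Psi(v) = g(v,v)$, that the action [7] of a unit-determinant matrix preserves the Lorentzian inner product, that $t$ and $-t$ induce the same transformation, and that an outer product as in [8] yields a null vector.

```python
import cmath

ETA = (1.0, -1.0, -1.0, -1.0)  # diag(1, -1, -1, -1), as in the text

def psi(v):
    """Hermitian matrix of [5] built from real components (v0, v1, v2, v3)."""
    v0, v1, v2, v3 = v
    s = 2 ** -0.5
    return [[s * (v0 + v3), s * (v1 + 1j * v2)],
            [s * (v1 - 1j * v2), s * (v0 - v3)]]

def unpsi(m):
    """Inverse of psi: recover the real components from a Hermitian matrix."""
    r = 2 ** 0.5
    return ((m[0][0] + m[1][1]).real / r,
            (m[0][1] + m[1][0]).real / r,
            (m[0][1] - m[1][0]).imag / r,
            (m[0][0] - m[1][1]).real / r)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def eta(u, v):
    """Lorentzian inner product eta_ab u^a v^b."""
    return sum(e * x * y for e, x, y in zip(ETA, u, v))

def lorentz(t, v):
    """Action [7]: Psi(v) -> t Psi(v) t^dagger, returned as components."""
    return unpsi(matmul(matmul(t, psi(v)), dagger(t)))

# An arbitrary complex matrix rescaled to unit determinant (sample data).
a = [[1 + 2j, 0.5 - 1j], [0.3j, 2 - 0.7j]]
scale = cmath.sqrt(det(a))
t = [[x / scale for x in row] for row in a]
minus_t = [[-x for x in row] for row in t]

u = (2.0, 0.3, -1.1, 0.7)   # sample vectors
v = (1.5, -0.4, 0.2, 0.9)

# A null vector obtained from a two-component column as in [8].
alpha = (1 + 1j, 2 - 0.5j)
outer = [[alpha[i] * alpha[j].conjugate() for j in range(2)] for i in range(2)]
null_v = unpsi(outer)
```

Comparing `eta(lorentz(t, u), lorentz(t, v))` with `eta(u, v)`, and `lorentz(t, v)` with `lorentz(minus_t, v)`, illustrates the two-to-one covering described above; `eta(null_v, null_v)` should vanish up to rounding.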
One must pay attention to the order of spinor indices of a given type, primed or unprimed, but by convention may permute primed and unprimed indices. A spinor with an equal number n of primed and unprimed indices corresponds to a tensor of valence n, and the tensor is real if the spinor satisfies a suitable Hermiticity relation.

Spinors may have various symmetries among their indices, much as tensors have. However, since $S$ is two dimensional, there is only a one-dimensional space of 2-forms on $S$. This has two consequences: no spinor can be antisymmetric over more than two indices; and if we make a choice of canonical 2-form, all spinors can be written in terms of symmetric spinors and the canonical 2-form. This is a decomposition of spinors into irreducibles for $SL(2, \mathbb{C})$.

One makes a choice of 2-form $\varepsilon_{AB}$ according to

$$\varepsilon_{AB} = -\varepsilon_{BA}; \quad \varepsilon_{01} = 1$$

There is an inverse $\varepsilon^{AB}$ defined by

$$\varepsilon_{AC}\, \varepsilon^{BC} = \delta_A{}^B \qquad [11]$$

where $\delta_A{}^B$ is the Kronecker delta. The complex conjugate of $\varepsilon_{AB}$ is conventionally written without an overbar as $\varepsilon_{A'B'}$, and analogously $\varepsilon^{A'B'}$ is the complex conjugate of $\varepsilon^{AB}$.

Because of the antisymmetry of $\varepsilon_{AB}$, order of indices is crucial in equations such as [11]. The 2-form $\varepsilon_{AB}$ has a role akin to that of a metric as it provides an identification of $S$ and its dual, according to

$$\alpha^A \to \alpha_B = \alpha^A \varepsilon_{AB}$$
$$\beta_B \to \beta^A = \varepsilon^{AB} \beta_B$$

with corresponding formulas for primed spinors. Note that, because of the antisymmetry of $\varepsilon_{AB}$, necessarily $\alpha_A \alpha^A = 0$ for any $\alpha^A$.

With conventions made so far, it can be checked that

$$g_{ab} v^a v^b = \varepsilon_{AB}\, \varepsilon_{A'B'}\, v^{AA'} v^{BB'} \qquad [12]$$

for any vector $v^a$, where $g_{ab}$ is the spacetime metric at $p$, so that

$$g_{ab} = \varepsilon_{AB}\, \varepsilon_{A'B'}$$

It is the desire to have this formula without constants that necessitates the choice of the factor $1/\sqrt{2}$ in [5].

One final piece of spinor algebra that we note is the following: given a symmetric spinor $\phi_{A_1 \ldots A_n}$ there is a factorization

$$\phi_{A_1 \ldots A_n} = \alpha^{(1)}_{(A_1} \cdots \alpha^{(n)}_{A_n)} \qquad [13]$$

where the round brackets indicate symmetrization over the indices $A_1, \ldots, A_n$, and the n spinors $\alpha^{(1)}_{A_1}, \ldots, \alpha^{(n)}_{A_n}$, which are determined only up to reordering and rescaling, are known as the principal spinors of $\phi$. To prove this, note that the principal spinors can be identified with the solutions $\zeta^A$ of the equation

$$\phi_{A_1 \ldots A_n}\, \zeta^{A_1} \cdots \zeta^{A_n} = 0$$

and there are n of these, counting multiplicities, by the "fundamental theorem of algebra."

Spinors in General Relativity: Spinor Calculus

We now want to define spinor fields on the spacetime $M$ as sections of a spinor bundle $\mathbb{S}$ whose fiber at each point is $S$ and such that the tensor product $\mathbb{S} \otimes \bar{\mathbb{S}}$ is the complexified tangent bundle. The existence of such an $\mathbb{S}$ imposes global restrictions on $M$: $M$ must be orientable and time orientable, and a certain characteristic class, the second Stiefel–Whitney class, must vanish (for an explanation of these terms see, e.g., Penrose and Rindler (1984, 1986)). Assuming that $M$ satisfies these conditions, spinor fields can be defined. It is convenient to retain the algebraic formulas from the previous section (e.g., [10] or [12]) but with indices now regarded as abstract (a note on the abstract index convention appears in Twistors).

By an argument analogous to that for the fundamental theorem of Riemannian geometry, there is a unique covariant derivative that satisfies the Leibniz condition, coincides with the Levi-Civita derivative on tensors and the gradient on scalars, and annihilates $\varepsilon_{AB}$ and $\varepsilon_{A'B'}$. Following the conventions of the previous section, the spinor covariant derivative will be denoted as $\nabla_{AA'}$. The commutator of derivatives can be written in terms of irreducible parts (for $SL(2, \mathbb{C})$) according to the formula

$$\nabla_{AA'} \nabla_{BB'} - \nabla_{BB'} \nabla_{AA'} = \varepsilon_{A'B'}\, \Box_{AB} + \varepsilon_{AB}\, \Box_{A'B'}$$

where $\Box_{AB} = \nabla_{C'(A} \nabla_{B)}{}^{C'}$. The definition of the Riemann curvature tensor is in terms of the Ricci identity

$$(\nabla_a \nabla_b - \nabla_b \nabla_a) v^c = R_{abd}{}^{c}\, v^d$$

and then this translates into two Ricci identities for a spinor field:

$$\Box_{AB}\, \kappa_C = X_{ABCD}\, \kappa^D$$
$$\Box_{A'B'}\, \kappa_C = \Phi_{A'B'CD}\, \kappa^D$$
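Returning briefly to the factorization [13] of a symmetric spinor into principal spinors, a small numerical sketch may make the statement concrete. It assumes the conventions quoted in the text ($\varepsilon_{01} = 1$, lowering by $\alpha_B = \alpha^A \varepsilon_{AB}$); the function names and the four sample spinors are hypothetical. The code builds the symmetrized product of four lowered spinors and confirms that contracting it with any one of the original (raised) spinors gives zero, because $\alpha_A \alpha^A = 0$.

```python
from itertools import permutations, product
from math import factorial

def lower(alpha_up):
    """alpha_B = alpha^A eps_AB with eps_01 = 1 = -eps_10."""
    a0, a1 = alpha_up
    return (-a1, a0)

def symmetrized_product(spinors_down):
    """phi_{A1...An} = alpha(1)_(A1 ... alpha(n)_An), keyed by index tuples."""
    n = len(spinors_down)
    phi = {}
    for idx in product((0, 1), repeat=n):
        total = 0
        for perm in permutations(range(n)):
            term = 1
            for slot, k in enumerate(perm):
                term *= spinors_down[k][idx[slot]]
            total += term
        phi[idx] = total / factorial(n)
    return phi

def contract_all(phi, zeta_up):
    """phi_{A1...An} zeta^{A1} ... zeta^{An}."""
    total = 0
    for idx, comp in phi.items():
        term = comp
        for a in idx:
            term *= zeta_up[a]
        total += term
    return total

def pairing(alpha_up, zeta_up):
    """alpha_A zeta^A for a raised spinor alpha^A."""
    down = lower(alpha_up)
    return down[0] * zeta_up[0] + down[1] * zeta_up[1]

# Four sample principal spinors alpha^A (arbitrary illustrative values).
alphas_up = [(1, 2j), (1 - 1j, 0.5), (0, 1), (3, -1 + 2j)]
phi = symmetrized_product([lower(a) for a in alphas_up])

# Each principal spinor solves phi_{A...D} zeta^A ... zeta^D = 0.
residuals = [abs(contract_all(phi, a)) for a in alphas_up]

# For a generic zeta the contraction equals the product of the pairings.
zeta = (1, 1)
generic = contract_all(phi, zeta)
expected = 1
for a in alphas_up:
    expected *= pairing(a, zeta)
```

The brute-force symmetrization is only meant for small n; the converse direction (finding principal spinors of a given symmetric spinor) amounts to finding the roots of a degree-n polynomial, as the text indicates.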
The curvature spinors $X_{ABCD}$ and $\Phi_{A'B'CD}$ are related to the curvature tensor. The Ricci spinor $\Phi_{A'B'AB}$ is Hermitian and symmetric on both index pairs and is a multiple of the trace-free part of the Ricci tensor:

$$\Phi_{A'B'AB} = -\tfrac{1}{2} \left( R_{ab} - \tfrac{1}{4} R\, g_{ab} \right)$$

The spinor $X_{ABCD}$ is symmetric on the first and last pairs of indices and decomposes into irreducibles according to

$$X_{ABCD} = \Psi_{ABCD} - 2 \Lambda\, \varepsilon_{D(A}\, \varepsilon_{B)C}$$

where $\Lambda = R/24$ in terms of the Ricci scalar or scalar curvature $R$, while $\Psi_{ABCD}$, which is totally symmetric and is known as the Weyl spinor, is related to the Weyl tensor $C_{abcd}$ by

$$C_{abcd} = \Psi_{ABCD}\, \varepsilon_{A'B'}\, \varepsilon_{C'D'} + \bar{\Psi}_{A'B'C'D'}\, \varepsilon_{AB}\, \varepsilon_{CD}$$

Thus, the ten real components of the Weyl tensor are coded into the five complex components of the Weyl spinor.

Following the last remark in the previous section, the Weyl spinor has four principal spinors, each of which defines a null direction, the principal null directions of the Weyl tensor. There is a classification of Weyl tensors, the Petrov–Pirani–Penrose classification, based on coincidences among the principal null directions (Penrose and Rindler 1984, 1986).

As a final exercise in spinor calculus, we recall the zero-rest-mass equations (see Twistors). In flat spacetime, these are the equations

$$\nabla^{AA'} \phi_{AB \ldots C} = 0$$

on a totally symmetric spinor field $\phi_{AB \ldots C}$. The field is said to have spin s if it has 2s indices, and the cases s = 1/2, 1, or 2, respectively, are the Weyl neutrino equation, the Maxwell equation, and the linearized Einstein equation. In flat spacetime, these hyperbolic equations are well understood and solvable in a variety of ways. In curved spacetime, however, if $s \geq 3/2$, then there are curvature obstructions to the existence of solutions, known as Buchdahl conditions. This can be seen at once by differentiating again, say by $\nabla^{B}{}_{A'}$, and using the spinor Ricci identity. After a little algebra, one finds

$$\Psi^{ABC}{}_{(D}\, \phi_{E \ldots F)ABC} = 0$$

so that, whenever the field has three or more indices, there are algebraic constraints on its components in terms of the Weyl spinor.

The Spin-Coefficient Formalism

The spin-coefficient formalism of Newman and Penrose is a formalism for spinor calculus in spacetimes (see, e.g., Penrose and Rindler (1984, 1986) and Stewart (1990)). It finds application in any calculation dealing with curvature tensors, including solving the Einstein equations. The formalism exploits the compression of terminology which the introduction of complex quantities permits.

The formalism starts with a choice of spinor dyad, a basis of spinor fields $(o^A, \iota^A)$ normalized so that $o_A \iota^A = 1$. From the dyad, one constructs a null tetrad, which is a basis of vector fields, according to the scheme

$$\ell^a = o^A \bar{o}^{A'}; \quad n^a = \iota^A \bar{\iota}^{A'}; \quad m^a = o^A \bar{\iota}^{A'}; \quad \bar{m}^a = \iota^A \bar{o}^{A'}$$

Given the normalization of the spinor dyad, each of the vectors in the null tetrad is null (hence the name) and all inner products are zero, except for

$$\ell_a n^a = 1 = -m_a \bar{m}^a$$

It follows that the metric can be written in the basis as

$$g_{ab} = 2 \ell_{(a} n_{b)} - 2 m_{(a} \bar{m}_{b)}$$

The components of the covariant derivative in the null tetrad are given separate names according to the following scheme:

$$\ell^a \nabla_a = D; \quad n^a \nabla_a = \Delta; \quad m^a \nabla_a = \delta; \quad \bar{m}^a \nabla_a = \bar{\delta}$$

and the spin coefficients are the 12 components of the covariant derivative of the basis. Each is labeled with a Greek letter according to the following scheme:

$$\begin{aligned}
D o^A &= \epsilon\, o^A - \kappa\, \iota^A; & \Delta o^A &= \gamma\, o^A - \tau\, \iota^A \\
\delta o^A &= \beta\, o^A - \sigma\, \iota^A; & \bar{\delta} o^A &= \alpha\, o^A - \rho\, \iota^A \\
D \iota^A &= \pi\, o^A - \epsilon\, \iota^A; & \Delta \iota^A &= \nu\, o^A - \gamma\, \iota^A \\
\delta \iota^A &= \mu\, o^A - \beta\, \iota^A; & \bar{\delta} \iota^A &= \lambda\, o^A - \alpha\, \iota^A
\end{aligned} \qquad [14]$$

The spin coefficients code the 24 real Ricci rotation coefficients into 12 complex quantities. Some of the spin coefficients have direct geometrical interpretation. For example, the vanishing of $\kappa$ is the condition for the integral curves of $\ell^a$ to be geodesic, while, if $\sigma$ is also zero, this congruence of geodesics is shear free. The same role is played by $\nu$ and $\lambda$ for the $n^a$-congruence. The real and imaginary parts of $\rho$ are, respectively (minus), the expansion and the twist of the congruence of integral curves of $\ell^a$.
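The construction of a null tetrad from a spinor dyad can be illustrated at a single point of flat spacetime. The sketch below is an illustration under stated assumptions: it uses the matrix labelling of [5] to convert $v^{AA'}$ into components $v^a$, takes $o_A \iota^A = o^0 \iota^1 - o^1 \iota^0$ (which follows from $\varepsilon_{01} = 1$), and picks an arbitrary normalized dyad. All function and variable names are hypothetical. It checks the nonzero inner products and the expression of the metric in terms of the tetrad.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diag(1, -1, -1, -1)

def components(M):
    """Invert the labelling of [5]: 2x2 matrix v^{AA'} -> (v0, v1, v2, v3).
    The result may be complex, as it is for m^a."""
    r = 2 ** 0.5
    return ((M[0][0] + M[1][1]) / r,
            (M[0][1] + M[1][0]) / r,
            (M[0][1] - M[1][0]) / (1j * r),
            (M[0][0] - M[1][1]) / r)

def outer(a, b):
    """Matrix a^A conj(b)^{A'} for two spinors a, b."""
    return [[a[i] * complex(b[j]).conjugate() for j in range(2)]
            for i in range(2)]

def eta(u, v):
    """Bilinear (not sesquilinear) inner product eta_ab u^a v^b."""
    return sum(e * x * y for e, x, y in zip(ETA, u, v))

# A sample dyad, with iota fixed so that o^0 iota^1 - o^1 iota^0 = 1.
o = (1 + 1j, 0.5)
iota0 = 0.2j
iota = (iota0, (1 + o[1] * iota0) / o[0])
normalization = o[0] * iota[1] - o[1] * iota[0]

l = components(outer(o, o))           # l^a = o^A obar^{A'}
n = components(outer(iota, iota))     # n^a = iota^A iotabar^{A'}
m = components(outer(o, iota))        # m^a = o^A iotabar^{A'}
mbar = components(outer(iota, o))     # mbar^a = iota^A obar^{A'}

# Inverse metric rebuilt from the tetrad: 2 l^(a n^b) - 2 m^(a mbar^b).
g_inv = [[l[a] * n[b] + n[a] * l[b] - m[a] * mbar[b] - mbar[a] * m[b]
          for b in range(4)] for a in range(4)]
```

Evaluating `eta(l, n)` and `eta(m, mbar)` should reproduce $\ell_a n^a = 1$ and $m_a \bar{m}^a = -1$, with all other pairings vanishing, and `g_inv` should agree with $\mathrm{diag}(1, -1, -1, -1)$ up to rounding.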
spinor field $o^A$ is geodesic and shear free iff it is a repeated principal spinor of the Weyl spinor.

In the spin-coefficient formalism, $o^A$ is geodesic and shear-free iff $\kappa$ and $\sigma$ vanish, and, from [16], is a repeated principal spinor of the Weyl spinor provided $\Psi_0 = \Psi_1 = 0$. It will be repeated three times if also $\Psi_2 = 0$ and four times if $\Psi_3 = 0$, but one must have $\Psi_k \neq 0$ for some k if the spacetime is not to be flat.

Suppose that $o^A$ is a (twice) repeated principal spinor of the Weyl spinor, then at once from the first two expressions in [18] both $\kappa$ and $\sigma$ vanish. If it is repeated three times, one gets the same result from the third and fourth expressions in [18], while if $o^A$ is repeated four times then the fifth and sixth expressions of [18] should be used.

For the converse, suppose that $\kappa = \sigma = 0$. Then, by the first equation in [14], $o^A$ can be rescaled to ensure that $\epsilon = 0$ and a spinor field $\iota^A$ can be chosen which is normalized against $o^A$ and parallelly propagated along $\ell^a$, so that, by the fifth equation in [14], $\pi = 0$. From the second expression in [17], one can see at once that $\Psi_0 = 0$, so that the first two equations in [18] simplify to give expressions for $D\Psi_1$ and $\delta\Psi_1$. By commuting $D$ and $\delta$ on $\Psi_1$ and using the second expression of [15] with the relevant parts of [17], it can be concluded that $\Psi_1 = 0$, as required.

Another application which is easy to describe is the solution of the type-D vacuum equations. A type-D solution is one for which the Weyl spinor has two (linearly independent) repeated principal spinors. If these are taken as the normalized dyad, then from [16] only $\Psi_2$ is nonzero among the $\Psi_k$. By the Goldberg–Sachs theorem, both spinors are geodesic and shear free, so that the spin coefficients $\kappa$, $\sigma$, $\lambda$, and $\nu$ all vanish. With these conditions, the spin-coefficient equations simplify to the point that careful choices of coordinates and the remaining freedom in the dyad enable the equations to be solved explicitly. One obtains metrics that depend only on a few parameters. Analogous methods reduce the Einstein equations to simpler systems for the other vacuum algebraically special metrics, that is, the other vacuum metrics for which the Weyl spinor does not have four distinct principal null directions (Mason 1998).

The spin-coefficient formalism has also been extensively used in the study of asymptotically flat spacetimes and gravitational radiation (Penrose and Rindler 1984, 1986, Stewart 1990).

The Positive-Mass Theorem

A very important application of spinor calculus in recent years was the proof by Witten (1981) of the positive-mass (or positive-energy) theorem. The proof was motivated by ideas from supergravity and gave rise to an increased interest in spinors in general relativity.

The positive-mass theorem is the following assertion: given an asymptotically flat spacetime $M$ with a spacelike hypersurface $\Sigma$, which is topologically $\mathbb{R}^3$ and in which the dominant energy condition holds, the total (or Arnowitt–Deser–Misner (ADM)) momentum is timelike and future-pointing. (The dominant-energy condition is the requirement that $T_{ab} U^a V^b$ is non-negative for every pair of future-pointing timelike or null vectors $U^a$ and $V^b$.)

We follow the notation of Penrose and Rindler (1984, 1986), where the proof begins by considering the 2-form $\Theta$ defined in terms of a spinor field $\lambda^A$ on $\Sigma$ by

$$\Theta = -i\, \bar{\lambda}_{B'} \nabla_a \lambda_B \, dx^a \wedge dx^b$$

If $\lambda^A$ tends to a constant spinor at spatial infinity on $\Sigma$, then

$$\frac{1}{4\pi G} \oint_S \Theta \to p_a \lambda^A \bar{\lambda}^{A'} \qquad [19]$$

as the spacelike spherical surface $S$ tends to spatial infinity, where $p_a$ is the ADM momentum. Suppose $\Sigma$ has unit normal $t^a$, intrinsic metric $h_{ab} = g_{ab} - t_a t_b$ and the dual-volume 3-form is $d\Sigma_a = t_a\, d\Sigma$. Then Stokes' theorem states that

$$\oint_S \Theta = \int_\Sigma d\Theta$$

We calculate

$$d\Theta = \alpha + \beta$$

where

$$\alpha = 4\pi G\, T_{ab}\, \ell^a\, d\Sigma^b$$
$$\beta = i\, \varepsilon_{ab}{}^{cd}\, \nabla_c \bar{\lambda}^{B'}\, \nabla_d \lambda^{B}\, d\Sigma^a$$

where $\ell^a = \lambda^A \bar{\lambda}^{A'}$ and we have used the Einstein field equations to replace curvature terms in $\alpha$ by the energy–momentum tensor $T_{ab}$. Provided the matter satisfies the dominant-energy condition, $\alpha$ is everywhere a positive multiple of the volume form on $\Sigma$ and its integral is positive (it can vanish only in vacuum). To make the integral of $\beta$ positive, $\lambda^A$ is required to satisfy

$$D_{AA'} \lambda^A = 0 \qquad [20]$$

where $D_a = h_a{}^b \nabla_b$, which is the projection of the four-dimensional covariant derivative rather than the intrinsic covariant derivative of $\Sigma$. Equation [20] is the Sen–Witten equation; it is elliptic and reduces to
the Dirac equation on a maximal surface; furthermore, given an asymptotically constant value for $\lambda^A$ on an asymptotically flat 3-surface $\Sigma$ with the topology of $\mathbb{R}^3$, it has a unique solution. Equation [20] removes part of the derivative of $\lambda^A$ from $\beta$ to leave

$$\beta = -h^{ab}\, D_a \lambda_C\, D_b \bar{\lambda}_{C'}\, d\Sigma^c$$

Now $h_{ab}$ is negative definite and $\Sigma$ has timelike normal so that $\beta$ is a positive multiple of the volume form on $\Sigma$ (unless $\lambda^A$ is covariantly constant, a case which is dealt with separately). Thus, the integral of $d\Theta$ is non-negative and therefore, by [19], so is the inner product of the ADM momentum $p_a$ with any null vector constructed from asymptotically constant spinors. Furthermore, this inner product is strictly positive, except in a vacuum spacetime admitting a constant spinor. Such spacetimes can be found explicitly and cannot be asymptotically flat, so that the ADM momentum is always timelike and future pointing, and vanishes only in flat spacetime.

The basic positive-energy theorem outlined above can be extended in several directions:
 to prove that the total momentum at future null infinity is also timelike and future pointing;
 to deal with surfaces $\Sigma$ which have inner boundaries, for example, at black holes;
 to prove inequalities between charge and mass; and
 to deal with spacetimes which are asymptotically anti-de Sitter rather than flat.

Further Applications of Spinors

Supersymmetry is a symmetry in quantum field theory relating bosons and fermions. In the language of spinors, bosons are represented by fields with an even number of spinor indices and fermions by fields with an odd number of indices. Thus, the gauge transformations of supersymmetry are generated by spinors with a single index.

Supergravity is supersymmetry in the case that one of the fields is the graviton. A supergravity theory is labeled by an integer N for the number of independent supersymmetries and much of the numerology of these theories follows from properties of spinors. N = 1 supergravity contains a graviton and a spin-3/2 field coupled together, and the presence of the supersymmetry allows the Buchdahl condition to be evaded. Supergravity theory with one supersymmetry in 11 spacetime dimensions depends on one spinor, which, in 11 dimensions, has 32 components. This is as many components as eight Dirac spinors in a four-dimensional spacetime, and, by a process of dimensional reduction, N = 1 supergravity in 11 dimensions is related to N = 8 supergravity in four dimensions. For reasons related to the Buchdahl conditions, 8 is the largest N that is considered in four dimensions.

In superstring theory and in some supergravity theories, one often wishes to consider spaces with "residual supersymmetry," by which is meant that there is a spinor field satisfying a condition of covariant constancy in some connection (Candelas et al. 1985). The existence of such constant spinors, as a result of spinor Ricci identities analogous to those given above, typically imposes strong restrictions on the curvature. Riemannian manifolds admitting constant spinors for the Levi-Civita connection are Ricci-flat (Hitchin 1974); Lorentzian ones can often be found in terms of a few functions. Manifolds of special holonomy, which are of interest in superstring theory, can usually be characterized as admitting special spinors (Wang 1989).

See also: Clifford Algebras and Their Representations; Dirac Operator and Dirac Field; Einstein Equations: Exact Solutions; Einstein's Equations with Matter; General Relativity: Overview; Geometric Flows and the Penrose Inequality; Index Theorems; Relativistic Wave Equations Including Higher Spin Fields; Spacetime Topology, Causal Structure and Singularities; Supergravity; Twistor Theory: Some Applications [in Integrable Systems, Complex Geometry and String Theory]; Twistors.

Further Reading

Benn IM and Tucker RW (1987) An Introduction to Spinors and Geometry with Applications in Physics. Bristol: Adam Hilger.
Budinich P and Trautman A (1988) The Spinorial Chessboard. Berlin: Springer.
Candelas P, Horowitz G, Strominger A, and Witten E (1985) Vacuum configurations for superstrings. Nuclear Physics B 258: 46–74.
Cartan E (1981) The Theory of Spinors. New York: Dover.
Chevalley CC (1954) The Algebraic Theory of Spinors. New York: Columbia University Press.
Dirac PAM (1928) The quantum theory of the electron. Proceedings of the Royal Society of London A 117: 610–624.
Harvey FR (1990) Spinors and Calibrations. Boston: Academic Press.
Hitchin NJ (1974) Harmonic spinors. Advances in Mathematics 14: 1–55.
Mason LJ (1998) The asymptotic structure of algebraically special spacetimes. Classical Quantum Gravity 15: 1019–1030.
Penrose R and Rindler W (1984, 1986) Spinors and Space–Time, vol. 1 and 2. Cambridge: Cambridge University Press.
Stewart J (1990) Advanced General Relativity. Cambridge: Cambridge University Press.
van der Waerden BL (1960) Exclusion principle and spin. In: Fierz M and Weisskopf VF (eds.) Theoretical Physics in the Twentieth Century: A Memorial Volume to Wolfgang Pauli, pp. 199–244. New York: Interscience.
Wang MY (1989) Parallel spinors and parallel forms. Annals of Global Analysis and Geometry 7: 59–68.
Witten E (1981) A new proof of the positive energy theorem. Communications in Mathematical Physics 80: 381–402.