
Manifolds

[7CCMMS18/ CM437Z - Semester 1, 2013]

Lecturer : George Papadopoulos


Department of Mathematics, King's College London
Strand, London
WC2R 2LS
Email:

Notes most recently edited by Jan Gutowski and George Papadopoulos

Contents

1. Manifolds
   1.1 Elementary Topology and Definitions
   1.2 Manifolds
2. The Tangent Space
   2.1 Maps from M to R
   2.2 Tangent Vectors
   2.3 Curves and their Tangents
3. Maps Between Manifolds
   3.1 Diffeomorphisms
   3.2 Push Forward of Tangent Vectors
4. Vector Fields
   4.1 Vector Fields
   4.2 Integral Curves and Local Flows
   4.3 Lie Derivatives
5. Tensors
   5.1 Co-Tangent Vectors
   5.2 Pull-back and Lie Derivative of a co-vector
   5.3 Tensors
6. Differential Forms
   6.1 Forms
   6.2 Exterior Derivative
   6.3 Integration on Manifolds
7. Connections, Curvature and Metrics
   7.1 Connections, Curvature and Torsion
   7.2 Riemannian Manifolds
   7.3 Symplectic Manifolds
   7.4 Applications to classical mechanics
8. Spheres with different differentiable structures

Recommended Books
C. Isham, Modern Differential Geometry for Physicists, World Scientific, 1989.
M. Nakahara, Geometry, Topology and Physics, IOP, 1990.
I. Madsen and J. Tornehave, From Calculus to Cohomology, CUP, 1997.
M. Göckeler and T. Schücker, Differential Geometry, Gauge Theories and Gravity, CUP, 1987.
S. Kobayashi and K. Nomizu, Foundations of Differential Geometry, vol. I, Wiley,
1963.
A useful review of differential geometry (which includes much of the course material,
but also significant amounts of other material) can be found in
http://empg.maths.ed.ac.uk/Activities/GT/EGH.pdf

For an introduction to topological spaces, see


W. A. Sutherland, Introduction to Metric and Topological Spaces, OUP, 2009.

Course information can be found at:

http://keats.kcl.ac.uk/

1. Manifolds
1.1 Elementary Topology and Definitions
This section should be a review of concepts (hence it is all definitions and no theorems).
Definition 1.1. A topological space $X$ is a set equipped with a collection $\mathcal{U} = \{U_i\}$ of subsets of $X$, which are called open sets, that satisfy
(i) $\mathcal{U}$ is closed under finite intersections and arbitrary unions
(ii) $\emptyset, X \in \mathcal{U}$
Definition 1.2. A set is closed if its complement in X is open.
Definition 1.3. An open cover of X is a collection of open sets whose union is X.
A topology allows us to define the notions of locality, convergence and continuity. Local properties are those of open sets; there is a definition of convergence of sequences in a topological space, and of a continuous function between topological spaces. Some other important concepts are:
Definition 1.4. $X$ is connected if it is not the union of two disjoint non-empty open sets.
Definition 1.5. A map $f : X \to Y$ between two topological spaces is continuous iff $f^{-1}(V) = \{x \in X : f(x) \in V\}$ is open for any open set $V \subset Y$.
Definition 1.6. If $X \subset Y$, where $Y$ is a topological space, then $X$ can be made into a topological space too by considering the induced topology: the open sets in $X$ are the sets $U \cap X \subset X$ where $U$ is an open set of $Y$.
Problem 1.1. Show that the induced topology indeed satisfies the definition of a topology.
Definition 1.7. A Hausdorff space is a topological space with the additional property that points can be separated: for any two distinct points $x, y \in X$, there exist open sets $U_x$ and $U_y$ such that $x \in U_x$, $y \in U_y$ and $U_x \cap U_y = \emptyset$. Hausdorff spaces are also known as $T_2$ spaces (as there are also weaker notions of separability).
Non-Hausdorff spaces have various pathologies that we do not want to consider. Therefore in what follows we will take all topological spaces to be Hausdorff unless otherwise
mentioned.
Definition 1.8. A function $f : X \to Y$ is one-to-one iff $f(x) = f(y)$ implies $x = y$.
This guarantees the existence of a left inverse $f_L^{-1} : f(X) \subset Y \to X$ such that $f_L^{-1} \circ f(x) = x$ for all $x \in X$, since every element in the image $f(X)$ comes from a unique point in $X$.
Definition 1.9. A function $f : X \to Y$ is onto iff $f(X) = Y$, i.e. for all $y \in Y$ there exists an $x \in X$ such that $f(x) = y$.
This guarantees the existence of a right inverse $f_R^{-1} : Y \to X$ such that $f \circ f_R^{-1}(y) = y$ for all $y \in Y$, since every element in $Y$ has some $x \in X$ (not necessarily unique) which is mapped to it by $f$. Observe that $f_R^{-1}$ is in general not uniquely determined: it involves a choice of preimage for each $y$.
Definition 1.10. A bijection is a map which is both one-to-one and onto.
Definition 1.11. A homeomorphism is a bijection $f : X \to Y$ which is continuous and whose inverse is continuous.
Examples:
$(X, \mathcal{U})$ with $\mathcal{U} = \{\emptyset, X\}$
$(X, \mathcal{U})$ with $\mathcal{U}$ the set of all subsets of $X$
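For a finite set the axioms of Definition 1.1 can be checked by brute force. The following is only an illustrative sketch: the function name `is_topology` and the three sample collections are assumptions made for this example, not part of the course material.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check the axioms of Definition 1.1 for a finite set X (hypothetical helper)."""
    X = frozenset(X)
    opens = {frozenset(U) for U in opens}
    # every open set must be a subset of X
    if any(not U <= X for U in opens):
        return False
    # (ii) the empty set and X itself must be open
    if frozenset() not in opens or X not in opens:
        return False
    # (i) for a finite collection, closure under pairwise unions and
    # intersections implies closure under all unions and finite intersections
    for A, B in combinations(opens, 2):
        if A | B not in opens or A & B not in opens:
            return False
    return True

X = {1, 2, 3}
trivial = [set(), X]                                   # first example above
discrete = [set(c) for r in range(len(X) + 1)          # second example above
            for c in combinations(X, r)]
not_a_topology = [set(), {1}, {2}, X]                  # {1} | {2} = {1, 2} is missing

print(is_topology(X, trivial), is_topology(X, discrete), is_topology(X, not_a_topology))
```

The third collection fails only because it is not closed under unions, which shows that condition (i) is a genuine restriction.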
We often have much more structure. For example if there is a notion of distance then
the usual topology is that defined by the open balls.
Definition 1.12. A metric space is a point set $X$ together with a map $d : X \times X \to \mathbb{R}$ such that
(i) $d(x, y) = d(y, x)$
(ii) $d(x, y) \geq 0$ with equality iff $x = y$
(iii) $d(x, y) \leq d(x, z) + d(z, y)$
$U$ is an open subset of $X$ iff for every $x \in U$ there is an $\epsilon > 0$ such that the open ball
\[ U_\epsilon(x) = \{y \in X : d(x, y) < \epsilon\} \tag{1.1} \]
is contained in $U$. Note that it follows that $X$ is open since it is a union of open balls
\[ X = \bigcup_{x \in X} U_\epsilon(x) \tag{1.2} \]
for any $\epsilon > 0$ of your choosing. These spaces are always Hausdorff (property (ii) ensures that any two distinct points have a non-zero distance between them and hence open balls with $\epsilon$ taken to be half this distance will separate them).
In particular we will heavily use $\mathbb{R}^n$ viewed as a metric topological space (with the usual Pythagorean definition of distance $\|x - y\|$).
1.2 Manifolds
The use of topology allows one to define the notion of continuity on spaces while a manifold structure deals with the notion of smoothness. Roughly speaking, manifolds are topological spaces which, on suitably chosen open subsets, look like $\mathbb{R}^n$.
Definition 1.13. An n-dimensional chart on $M$ is a pair $(U, \phi)$ where $U$ is an open subset of $M$ and $\phi : U \to \mathbb{R}^n$ is a homeomorphism onto its image $\phi(U) \subset \mathbb{R}^n$.

Figure 1: An n-dimensional chart $(U, \phi)$

Definition 1.14. An n-dimensional differentiable structure on $M$ is a collection of n-dimensional charts $(U_i, \phi_i)$, $i \in I$ such that
(i) $M = \bigcup_{i \in I} U_i$
(ii) For any pair of charts $U_i$, $U_j$ with $U_i \cap U_j \neq \emptyset$, the map $\phi_j \circ \phi_i^{-1} : \phi_i(U_i \cap U_j) \to \phi_j(U_i \cap U_j)$ is $C^\infty$, i.e. all partial derivatives exist up to any order.
(iii) We always take a differentiable structure to be a maximal set of charts, i.e. the union over all charts which satisfy (i) and (ii).
N.B. The functions $\phi_j \circ \phi_i^{-1}$ are called transition functions.
Theorem 1.1. If $M$ is connected then $n$ is well defined, i.e. all charts have the same value of $n$.
Proof. Suppose that two charts had different values of $n$. Then they cannot intersect, because the map $\phi_j \circ \phi_i^{-1}$, which takes a subset of $\mathbb{R}^{n_i}$ to a subset of $\mathbb{R}^{n_j}$, is $C^\infty$ and invertible with a $C^\infty$ inverse. This in particular requires that the Jacobian of $\phi_j \circ \phi_i^{-1}$ is invertible and so the map preserves the dimension. Since this is true for all charts we see that $M$ must split into disjoint collections of charts, at least one for each different value of $n$. Since the domain of a chart is an open set we can therefore write $M$ as a union over disjoint open sets, one for each value of $n$. Thus if $M$ is connected it must have a unique value of $n$.
Henceforth we will only consider connected topological spaces.
Definition 1.15. A differentiable manifold M of dimension n is a connected Hausdorff
topological space equipped with an n-dimensional differentiable structure.

Figure 2: Transition functions

N.B. One can also study differentiable structures where the transition functions are only $C^k$ for some $k > 0$. Alternatively one could replace $\mathbb{R}^n$ by $\mathbb{C}^n$ and demand that the transition functions are holomorphic. These are therefore a special case of 2n-dimensional real manifolds. This leads to the beautiful and rich subject of complex differential geometry which we will not have time to consider here.
Example: Trivially $\mathbb{R}^n$ is an n-dimensional manifold. A single chart that covers the whole of $\mathbb{R}^n$ is $(\mathbb{R}^n, \mathrm{id})$ where id is the identity map $\mathrm{id}(x) = x$.
Example: Any open subset $U \subset \mathbb{R}^n$ is an n-dimensional manifold. Again the single chart $(U, \mathrm{id})$ is sufficient. In fact any open subset $U$ of a manifold $M$ with charts $(U_i, \phi_i)$ is also a manifold since one can use the charts $(U \cap U_i, \phi_i)$, where now $\phi_i$ is restricted to $U \cap U_i$.
Remark: Closed subsets may not be manifolds with respect to the induced charts and transition functions from $M$.
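Example: The circle is a 1-dimensional manifold.
Let $M = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$. We need at least two charts, say
\[ U_1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ y > -\tfrac{1}{\sqrt{2}}\} \]
\[ U_2 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ y < \tfrac{1}{\sqrt{2}}\} . \]
We define $\phi_i : U_i \to \mathbb{R}$ by
\[ \phi_1(x, y) = \theta \in \left(-\tfrac{\pi}{4}, \tfrac{5\pi}{4}\right) \quad \text{where } (x, y) = (\cos\theta, \sin\theta) \]
\[ \phi_2(x, y) = \theta' \in \left(-\tfrac{5\pi}{4}, \tfrac{\pi}{4}\right) \quad \text{where } (x, y) = (\cos\theta', \sin\theta') \]
Now
\[ U_1 \cap U_2 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ -\tfrac{1}{\sqrt{2}} < y < \tfrac{1}{\sqrt{2}}\} = V_L \cup V_R \]
where
\[ V_L = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ -\tfrac{1}{\sqrt{2}} < y < \tfrac{1}{\sqrt{2}},\ x < 0\} \]
and
\[ V_R = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ -\tfrac{1}{\sqrt{2}} < y < \tfrac{1}{\sqrt{2}},\ x > 0\} \]

Figure 3: The open sets $U_1$, $U_2$

Now on $\phi_1(V_L) = (\tfrac{3\pi}{4}, \tfrac{5\pi}{4})$, $\phi_2 \circ \phi_1^{-1}(\theta) = \theta - 2\pi$, whereas on $\phi_1(V_R) = (-\tfrac{\pi}{4}, \tfrac{\pi}{4})$, $\phi_2 \circ \phi_1^{-1}(\theta) = \theta$. Similarly $\phi_1 \circ \phi_2^{-1}(\theta') = \theta' + 2\pi$ on $\phi_2(V_L) = (-\tfrac{5\pi}{4}, -\tfrac{3\pi}{4})$, and $\phi_1 \circ \phi_2^{-1}(\theta') = \theta'$ on $\phi_2(V_R) = (-\tfrac{\pi}{4}, \tfrac{\pi}{4})$. These maps are $C^\infty$ and hence we have a manifold.
Here on the circle we see that locally we can define a single coordinate $\theta$, which we think of as an angle. But $\theta$ is not defined over the whole circle: $\theta = 0$ and $\theta = 2\pi$ are the same point.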
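The two angle charts on the circle and their transition function can be checked numerically. This is a minimal sketch, assuming the chart ranges $(-\pi/4, 5\pi/4)$ and $(-5\pi/4, \pi/4)$ for the two charts; the function names are hypothetical.

```python
import math

def phi1(x, y):
    """Angle of (x, y) taken in (-pi/4, 5pi/4); meaningful on U1 (y > -1/sqrt(2))."""
    theta = math.atan2(y, x)            # atan2 returns a value in (-pi, pi]
    return theta if theta > -math.pi / 4 else theta + 2 * math.pi

def phi2(x, y):
    """Angle of (x, y) taken in (-5pi/4, pi/4); meaningful on U2 (y < 1/sqrt(2))."""
    theta = math.atan2(y, x)
    return theta if theta < math.pi / 4 else theta - 2 * math.pi

def inverse(theta):
    """Both chart inverses are given by the same formula (cos, sin)."""
    return math.cos(theta), math.sin(theta)

def transition_21(theta):
    """The transition function phi2 o phi1^{-1}."""
    return phi2(*inverse(theta))

theta_left = 1.1 * math.pi    # lies in phi1(V_L) = (3pi/4, 5pi/4)
theta_right = 0.1             # lies in phi1(V_R) = (-pi/4, pi/4)
print(transition_21(theta_left) - theta_left)     # close to -2*pi
print(transition_21(theta_right) - theta_right)   # close to 0
```

On each connected component of the overlap the transition function is a translation, which is why it is smooth even though the angle itself cannot be defined continuously on the whole circle.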
This illustrates a key point: the maps $\phi_i$ provide coordinates, just like a map in an atlas provides coordinates in the form of longitude and latitude. However the coordinates will not in general work over the whole of the manifold. For example the surface of the earth can be mapped in an atlas but the notion of latitude and longitude will break down somewhere; at the poles longitude is not defined. Maps that one sees hanging on a wall always break down somewhere (usually at both poles) but an atlas can smoothly cover the whole earth.
Often the $U_i$ are called coordinate neighbourhoods, or patches, and the $\phi_i$ are coordinate maps. If $p \in M$ is some point contained in a given patch $U_i$ then the local coordinates of $p$ are
\[ \phi_i(p) = \left(x^1(p), x^2(p), x^3(p), \ldots, x^n(p)\right) \in \mathbb{R}^n \tag{1.3} \]

Figure 4: Local coordinates

Clearly there is a huge amount of choice of local coordinates. Typically in any given
patch Ui we could choose from an infinite number of different functions i . Furthermore
for a given manifold there will be infinitely many choices of open sets Ui which we use to
cover it with.
Differential geometry is the study of manifolds and uses tensorial objects which take
into account this huge redundancy in the actual way that we may choose to describe a given
manifold. This is the so-called coordinate free approach. Often, especially in older texts,
one fixes a covering and coordinate patches and writes any tensor in terms of its values in
some given local coordinate system. This may be convenient for some calculational purposes
but it obscures the true coordinate independent meaning of the important concepts. In
addition it should always be kept in mind when using explicit coordinates that they are
almost certainly not valid everywhere. One might often need to change coordinates, either
because we prefer to use a different choice of coordinates valid in the same patch, or because

we need to transform to a new patch which covers a different portion of the manifold. In
this course we will use the coordinate free approach as much as possible.
Example: Let us consider $\mathbb{R}P^n = (\mathbb{R}^{n+1} - \{0\})/\sim$ where $\sim$ is the equivalence relation
\[ (x^1, x^2, x^3, \ldots, x^{n+1}) \sim (\lambda x^1, \lambda x^2, \lambda x^3, \ldots, \lambda x^{n+1}) \tag{1.4} \]
for any $\lambda \in \mathbb{R} - \{0\}$. We denote the equivalence class of $(x^1, \ldots, x^{n+1})$ by $[x^1, x^2, x^3, \ldots, x^{n+1}]$. Let us choose for the charts
\[ U_i = \{[x^1, x^2, x^3, \ldots, x^{n+1}] \in \mathbb{R}P^n : x^i \neq 0\}, \qquad i = 1, \ldots, n+1 \]
with the functions
\[ \phi_1([x^1, x^2, x^3, \ldots, x^{n+1}]) = \left(\frac{x^2}{x^1}, \frac{x^3}{x^1}, \frac{x^4}{x^1}, \ldots, \frac{x^{n+1}}{x^1}\right) \in \mathbb{R}^n \]
\[ \phi_2([x^1, x^2, x^3, \ldots, x^{n+1}]) = \left(\frac{x^1}{x^2}, \frac{x^3}{x^2}, \frac{x^4}{x^2}, \ldots, \frac{x^{n+1}}{x^2}\right) \in \mathbb{R}^n \]
\[ \phi_3([x^1, x^2, x^3, \ldots, x^{n+1}]) = \left(\frac{x^1}{x^3}, \frac{x^2}{x^3}, \frac{x^4}{x^3}, \ldots, \frac{x^{n+1}}{x^3}\right) \in \mathbb{R}^n \]
\[ \vdots \]
\[ \phi_{n+1}([x^1, x^2, x^3, \ldots, x^{n+1}]) = \left(\frac{x^1}{x^{n+1}}, \frac{x^2}{x^{n+1}}, \frac{x^3}{x^{n+1}}, \ldots, \frac{x^n}{x^{n+1}}\right) \in \mathbb{R}^n \]
Clearly $\bigcup_{i=1}^{n+1} U_i = \mathbb{R}P^n$ and each $\phi_i$ is a homeomorphism (without loss of generality we can take $x^i = 1$ in $U_i$). Consider the intersection of $U_1$ and $U_2$, and consider
\[ (u^1, u^2, \ldots, u^n) \in \phi_1(U_1 \cap U_2) \tag{1.5} \]
One must therefore take $u^1 \neq 0$, and hence
\[ \phi_1^{-1}(u^1, u^2, u^3, \ldots, u^n) = [1, u^1, u^2, \ldots, u^n] \tag{1.6} \]
and hence
\[ \phi_2 \circ \phi_1^{-1}(u^1, u^2, u^3, \ldots, u^n) = \left(\frac{1}{u^1}, \frac{u^2}{u^1}, \frac{u^3}{u^1}, \ldots, \frac{u^n}{u^1}\right) \in \mathbb{R}^n . \tag{1.7} \]
Since $u^1 \neq 0$ this map is $C^\infty$ on $\phi_1(U_1 \cap U_2)$. All the other intersections follow the same way. Thus $\mathbb{R}P^n$ is an n-dimensional manifold.
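The charts on real projective space are easy to experiment with. The sketch below represents a point of real projective space by any one of its homogeneous coordinate tuples; the function names and the 0-based chart index are assumptions of this example.

```python
def chart(i, x):
    """phi_{i+1} applied to [x]: divide by x[i] and drop that slot (0-based index i)."""
    if x[i] == 0:
        raise ValueError("point is not in this coordinate patch")
    return tuple(xk / x[i] for k, xk in enumerate(x) if k != i)

def chart_inverse(i, u):
    """A representative of the inverse chart: insert 1 in slot i, as in (1.6)."""
    u = list(u)
    return tuple(u[:i] + [1.0] + u[i:])

def transition(j, i, u):
    """Transition function between chart i and chart j, as in (1.7)."""
    return chart(j, chart_inverse(i, u))

# A point of phi_1(U_1 ∩ U_2) for n = 3: the first entry must be non-zero.
u = (2.0, 3.0, 5.0)
print(transition(1, 0, u))   # compare with (1/u1, u2/u1, u3/u1)

# The chart does not depend on the representative chosen for the class.
print(chart(0, (2.0, 4.0, 6.0, 10.0)) == chart(0, (1.0, 2.0, 3.0, 5.0)))
```

The second check is the reason the charts are well defined on equivalence classes: rescaling all homogeneous coordinates by the same non-zero factor leaves every ratio unchanged.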

Problem 1.2. What is $\mathbb{R}P^1$?


Theorem 1.2. If $M$ and $N$ are m- and n-dimensional manifolds respectively then $M \times N$ is an (m + n)-dimensional manifold.
Proof. Let $(U_i, \phi_i)$, $i \in I$ be an m-dimensional differentiable structure for $M$ and $(V_a, \psi_a)$, $a \in A$ be an n-dimensional differentiable structure for $N$. We can construct a differentiable structure for $M \times N$ by taking the following charts:
\[ W_{ia} = U_i \times V_a, \qquad \phi_{ia} : W_{ia} \to \mathbb{R}^{m+n}, \qquad \phi_{ia}(x, y) = (\phi_i(x), \psi_a(y)), \qquad i \in I,\ a \in A \tag{1.8} \]
where $x \in M$, $y \in N$. Clearly $\bigcup_{ia} W_{ia} = M \times N$, and the $\phi_{ia}$ are homeomorphisms. It is also clear that the transition functions
\[ \phi_{jb} \circ \phi_{ia}^{-1} = (\phi_j \circ \phi_i^{-1}, \psi_b \circ \psi_a^{-1}) \tag{1.9} \]
are $C^\infty$.

Problem 1.3. Show that the following
\[ U_1 = \{(x, y) \in S^1 : y > 0\}, \qquad \phi_1(x, y) = x \]
\[ U_2 = \{(x, y) \in S^1 : y < 0\}, \qquad \phi_2(x, y) = x \]
\[ U_3 = \{(x, y) \in S^1 : x > 0\}, \qquad \phi_3(x, y) = y \]
\[ U_4 = \{(x, y) \in S^1 : x < 0\}, \qquad \phi_4(x, y) = y \]
are a set of charts which cover $S^1$.

Problem 1.4. Show that the 2-sphere $S^2 = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$ is a 2-dimensional manifold.
Hint: consider stereographic projection. This requires using two charts
\[ U_S = \{(x, y, z) \in S^2 \,|\, z < 1\} \quad \text{and} \quad U_N = \{(x, y, z) \in S^2 \,|\, z > -1\} \tag{1.10} \]
these are clearly open and cover $S^2$. In each chart one constructs $\phi_{N/S} : U_{N/S} \to \mathbb{R}^2$ by taking a straight line through either the south pole $(0, 0, -1)$ or north pole $(0, 0, 1)$ and then through the point $p = (x, y, z) \in U_{N/S}$. These lines are defined by the equation
\[ X(\lambda) = \begin{pmatrix} 0 \\ 0 \\ \mp 1 \end{pmatrix} + \lambda \begin{pmatrix} x \\ y \\ z \pm 1 \end{pmatrix} \tag{1.11} \]
(upper signs for $\phi_N$, lower signs for $\phi_S$) so that $X(0)$ is either the south or north pole and $X(1)$ is the point $p$ on $S^2$. We define $\phi_{N/S}(p)$ to be the point in the $(x, y)$-plane where the line intersects $z = 0$.

Figure 5: Stereographic projection from $U_S = S^2 - \{(0, 0, 1)\} \to \mathbb{R}^2$

2. The Tangent Space


An important notion in geometry is that of a tangent vector. This is intuitively familiar for a curve in $\mathbb{R}^n$. But the elementary definition of a tangent vector, or indeed any vector, relies on special properties of $\mathbb{R}^n$ such as a fixed coordinate system and its vector space structure.
Once given a vector $v = (v^1, v^2, v^3) \in \mathbb{R}^3$ for example, we can consider the derivative in the direction of $v$:
\[ v^1 \frac{\partial}{\partial x^1} + v^2 \frac{\partial}{\partial x^2} + v^3 \frac{\partial}{\partial x^3} \tag{2.1} \]
Here we view this expression as an operator acting on functions $f : \mathbb{R}^3 \to \mathbb{R}$. Changing coordinates, for example by performing a rotation, we must also change the coefficients $v^1, v^2, v^3$ in an appropriate way; however, the action on a function remains the same. We need to generalise the notion of a tangent vector to manifolds in a coordinate free way. There are several equivalent ways to do this but here we will use the identification of a tangent vector with an operator acting on functions.
Our first step is to introduce differentiable functions on manifolds. We will then proceed
to understand vectors as operators on differentiable functions.
2.1 Maps from M to R
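Definition 2.1. A function $f : M \to \mathbb{R}$ is $C^\infty$ iff for each chart $(U_i, \phi_i)$ in the differentiable structure of $M$
\[ f \circ \phi_i^{-1} : \phi_i(U_i) \to \mathbb{R} \tag{2.2} \]
is $C^\infty$. The set of such functions on a manifold $M$ is denoted $C^\infty(M)$.
Note that if $f \circ \phi_i^{-1}$ is $C^\infty$, and $(U_j, \phi_j)$ is another chart with $U_i \cap U_j \neq \emptyset$, then $f \circ \phi_j^{-1} = (f \circ \phi_i^{-1}) \circ (\phi_i \circ \phi_j^{-1})$ will be $C^\infty$ on $\phi_j(U_i \cap U_j)$, since the transition function $\phi_i \circ \phi_j^{-1}$ is $C^\infty$. Thus we need only check that $f \circ \phi_i^{-1}$ is $C^\infty$ over a set of charts that covers $M$.

Problem 2.1. Consider the circle $S^1$ as above. Show that $f : S^1 \to \mathbb{R}$ with $f(x, y) = x^2 + y$ is $C^\infty$.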
Definition 2.2. An algebra $V$ is a real vector space along with an operation $\star : V \times V \to V$ such that
(i) $v \star 0 = 0 \star v = 0$
(ii) $(\lambda v) \star u = v \star (\lambda u) = \lambda (v \star u)$
(iii) $v \star (u + w) = v \star u + v \star w$
(iv) $(u + w) \star v = u \star v + w \star v$
for all $u, v, w \in V$ and $\lambda \in \mathbb{R}$.
Theorem 2.1. $C^\infty(M)$ is an algebra with addition and multiplication defined pointwise
\[ (f + g)(p) = f(p) + g(p), \qquad (\lambda f)(p) = \lambda f(p), \qquad (f \star g)(p) = f(p) g(p) \tag{2.3} \]
Proof. Let us show that $f \star g$ is in $C^\infty(M)$. Note that
\[ (f \star g) \circ \phi_i^{-1} = (f \circ \phi_i^{-1}) \cdot (g \circ \phi_i^{-1}) \tag{2.4} \]
where $f \circ \phi_i^{-1}$ and $g \circ \phi_i^{-1}$ are $C^\infty$. Therefore their pointwise product is too. Hence $(f \star g) \circ \phi_i^{-1}$ is $C^\infty$, which is what we needed to show.
The other properties can be shown in a similar manner.
Definition 2.3. For a point $p \in M$, we let $C^\infty(p)$ be the set of functions such that
(i) $f : U \to \mathbb{R}$ where $p \in U \subset M$ and $U$ is an open set.
(ii) $f \in C^\infty(U)$ (recall that an open subset of a manifold is a manifold)
2.2 Tangent Vectors
We can now state our main definition.
Definition 2.4. A tangent vector at a point $p \in M$ is a map $X_p : C^\infty(p) \to \mathbb{R}$, which satisfies
(i) $X_p(f + g) = X_p(f) + X_p(g)$
(ii) $X_p(\text{constant function}) = 0$
(iii) $X_p(f g) = f(p) X_p(g) + X_p(f) g(p)$ (Leibniz rule)
The set of tangent vectors at a point $p \in M$ is called the tangent space to $M$ at $p$ and is denoted by $T_p M$.
The union of all tangent spaces to $M$ is called the tangent bundle
\[ TM = \bigcup_{p \in M} T_p M . \tag{2.5} \]
This is an example of a fibre bundle and is itself a 2n-dimensional manifold.
N.B.: In general objects which satisfy these properties are called derivations.
N.B.: With this definition a tangent vector is a linear map from $C^\infty(p)$ to $\mathbb{R}$: additivity is (i), and (ii) together with (iii) gives $X_p(\lambda f) = \lambda X_p(f)$ for constant $\lambda$. Since $C^\infty(p)$ is a vector space (it is an algebra) the tangent vectors are therefore elements of the dual vector space to $C^\infty(p)$. However $C^\infty(p)$ and hence its dual are infinite dimensional. The conditions (i), (ii) and (iii) restrict the possible linear maps that we identify as tangent vectors and in fact we will see that they form a finite dimensional vector space.
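Example: Consider $\mathbb{R}^n$ as a manifold with the obvious chart $U = \mathbb{R}^n$, $\phi : U \to \mathbb{R}^n$ taken to be the identity, then
\[ X = \sum_{\mu=1}^{n} \lambda^\mu \frac{\partial}{\partial x^\mu} \]
is a tangent vector. In fact, we will learn that all tangent vectors have this form.
In what follows we denote
\[ \partial_\mu f = \frac{\partial f}{\partial x^\mu}, \qquad \partial_\mu \partial_\nu f = \frac{\partial^2 f}{\partial x^\mu \partial x^\nu} \qquad \text{etc.} \tag{2.6} \]
where $f$ is defined on some open set in $\mathbb{R}^n$. We can extend this to manifolds by the following
Definition 2.5. Let $(x^1, \ldots, x^n)$ be local coordinates about a point $p \in M$. That is, there exists a chart $(U_i, \phi_i)$ with $p \in U_i$ and $\phi_i(q) = (x^1(q), x^2(q), \ldots, x^n(q))$ for all $q \in U_i$. We define
\[ \frac{\partial}{\partial x^\mu}\Big|_p : C^\infty(p) \to \mathbb{R} \tag{2.7} \]
by
\[ \frac{\partial}{\partial x^\mu}\Big|_p f = \partial_\mu (f \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) = \partial_\mu (f \circ \phi_i^{-1}) \circ \phi_i(p) \tag{2.8} \]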

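Theorem 2.2. $\frac{\partial}{\partial x^\mu}\big|_p$ is a tangent vector to $M$ at $p$.
Proof. Let $f, g \in C^\infty(p)$ be defined on an open set $U$ in $M$ that contains $p$, then
\[
\begin{aligned}
\frac{\partial}{\partial x^\mu}\Big|_p (f + g) &= \partial_\mu ((f + g) \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&= \partial_\mu (f \circ \phi_i^{-1} + g \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&= \partial_\mu (f \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) + \partial_\mu (g \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&= \frac{\partial}{\partial x^\mu}\Big|_p (f) + \frac{\partial}{\partial x^\mu}\Big|_p (g)
\end{aligned}
\tag{2.9}
\]
It is clear that $\frac{\partial}{\partial x^\mu}\big|_p (\text{constant map}) = 0$. Also
\[
\begin{aligned}
\frac{\partial}{\partial x^\mu}\Big|_p (f g) &= \partial_\mu ((f g) \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&= \partial_\mu ((f \circ \phi_i^{-1}) \cdot (g \circ \phi_i^{-1}))(x^1(p), \ldots, x^n(p)) \\
&= \partial_\mu (f \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \, (g \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&\quad + (f \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \, \partial_\mu (g \circ \phi_i^{-1})(x^1(p), \ldots, x^n(p)) \\
&= \frac{\partial}{\partial x^\mu}\Big|_p (f) \, g(p) + \frac{\partial}{\partial x^\mu}\Big|_p (g) \, f(p)
\end{aligned}
\tag{2.10}
\]
We will show that all tangent vectors arise in this way.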


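Example: Consider $\mathbb{R}P^1 = (\mathbb{R}^2 - \{0\})/\sim$ where $(x, y) \sim (\lambda x, \lambda y)$, $\lambda \neq 0$.
We have two charts
\[ U_t = \{[x^1, x^2] : x^1 \neq 0\}, \qquad t = \phi_t([x^1, x^2]) = \frac{x^2}{x^1} \]
\[ U_s = \{[x^1, x^2] : x^2 \neq 0\}, \qquad s = \phi_s([x^1, x^2]) = \frac{x^1}{x^2} \]
thus on the intersection $U_s \cap U_t$, $s = 1/t$. Let $p = [1, 3]$ and consider the tangent vector $X : C^\infty(p) \to \mathbb{R}$,
\[ X(f) = \frac{\partial}{\partial t}\Big|_p f = \frac{d}{dt}(f \circ \phi_t^{-1})(t = 3) . \tag{2.11} \]
How does $X$ act in the other coordinate system (where they overlap)? Recall that
\[ \frac{d}{dt}(f(t)) = \frac{ds(t)}{dt} \frac{d}{ds}(f(t(s))) \]
Now $s(t) = \phi_s \circ \phi_t^{-1}(t)$ and $t(s) = \phi_t \circ \phi_s^{-1}(s)$ so we have
\[
\begin{aligned}
X(f) &= \frac{d}{dt}(f \circ \phi_t^{-1})(t = 3) \\
&= \frac{ds(t)}{dt} \frac{d}{ds}\left(f \circ \phi_t^{-1} \circ (\phi_t \circ \phi_s^{-1})\right)\left(s = \tfrac{1}{3}\right) \\
&= -\frac{1}{t^2} \frac{d}{ds}(f \circ \phi_s^{-1})\Big|_{s = \frac{1}{3}} \\
&= -\frac{1}{9} \frac{d}{ds}(f \circ \phi_s^{-1})\Big|_{s = \frac{1}{3}} .
\end{aligned}
\tag{2.12}
\]
Thus the tangent vector can look rather different, depending on the coordinate system one chooses. However, its definition as a linear map from $C^\infty(p)$ to $\mathbb{R}$ is independent of the choice of coordinates, i.e. (2.11) and (2.12) will agree on any function in $C^\infty(p)$.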
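The agreement of (2.11) and (2.12) can be checked numerically for a sample function. This is a rough sketch: the particular function and the central-difference helper are assumptions chosen for illustration, and the overlap relation $t = 1/s$ is taken from the example above.

```python
import math

def f_in_t(t):
    """A hypothetical smooth function on RP^1, written in the t chart."""
    return math.sin(t) + t ** 2

def f_in_s(s):
    """The same function in the s chart, using t = 1/s on the overlap."""
    return f_in_t(1.0 / s)

def derivative(g, a, h=1e-6):
    """Central finite-difference approximation to g'(a)."""
    return (g(a + h) - g(a - h)) / (2 * h)

lhs = derivative(f_in_t, 3.0)                         # X(f) computed as in (2.11)
rhs = -(1.0 / 9.0) * derivative(f_in_s, 1.0 / 3.0)    # X(f) computed as in (2.12)
print(lhs, rhs)   # the two values agree up to finite-difference error
```

The factor of $-1/9$ is exactly $ds/dt$ evaluated at $t = 3$, so dropping it would make the two chart expressions disagree.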


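Lemma 2.1. Let $(x^1, \ldots, x^n)$ be a coordinate system about $p \in M$. Then for every function $f \in C^\infty(p)$ there exist n functions $f_1, \ldots, f_n \in C^\infty(p)$ such that
\[ f_\mu(p) = \frac{\partial}{\partial x^\mu}\Big|_p f \tag{2.13} \]
and
\[ f(q) = f(p) + \sum_\mu (x^\mu(q) - x^\mu(p)) f_\mu(q) \tag{2.14} \]
Proof. To begin, let $F = f \circ \phi_i^{-1}$, which is defined on $V = \phi_i(U \cap U_i)$ where $U$ is the domain of $f$. Let $B$ be an open ball in $V \subset \mathbb{R}^n$ centred on $v = \phi_i(p)$, and take $y \in B$. Then
\[
\begin{aligned}
F(y^1, \ldots, y^n) &= F(y^1, \ldots, y^n) - F(y^1, \ldots, y^{n-1}, v^n) \\
&\quad + F(y^1, \ldots, y^{n-1}, v^n) - F(y^1, \ldots, y^{n-2}, v^{n-1}, v^n) \\
&\quad + \ldots \\
&\quad + F(y^1, v^2, \ldots, v^n) - F(v^1, \ldots, v^n) \\
&\quad + F(v^1, \ldots, v^n) \\
&= F(v^1, \ldots, v^n) + \sum_\mu \left[ F(y^1, \ldots, y^\mu, v^{\mu+1}, \ldots, v^n) - F(y^1, \ldots, y^{\mu-1}, v^\mu, v^{\mu+1}, \ldots, v^n) \right] \\
&= F(v^1, \ldots, v^n) + \sum_\mu F(y^1, \ldots, y^{\mu-1}, v^\mu + t(y^\mu - v^\mu), v^{\mu+1}, \ldots, v^n) \Big|_{t=0}^{t=1} \\
&= F(v^1, \ldots, v^n) + \sum_\mu \int_0^1 \frac{d}{dt} F(y^1, \ldots, y^{\mu-1}, v^\mu + t(y^\mu - v^\mu), v^{\mu+1}, \ldots, v^n) \, dt \\
&= F(v^1, \ldots, v^n) + \sum_\mu \int_0^1 \partial_\mu F(y^1, \ldots, y^{\mu-1}, v^\mu + t(y^\mu - v^\mu), v^{\mu+1}, \ldots, v^n) (y^\mu - v^\mu) \, dt
\end{aligned}
\tag{2.15}
\]
If we define
\[ F_\mu(y^1, \ldots, y^n) = \int_0^1 \partial_\mu F(y^1, \ldots, y^{\mu-1}, v^\mu + t(y^\mu - v^\mu), v^{\mu+1}, \ldots, v^n) \, dt \tag{2.16} \]
then
\[ F(y^1, \ldots, y^n) = F(v^1, \ldots, v^n) + \sum_\mu (y^\mu - v^\mu) F_\mu(y^1, \ldots, y^n) \tag{2.17} \]
Finally, we recall that $F = f \circ \phi_i^{-1}$ so that if we let $(y^1, \ldots, y^n) = \phi_i(q) = (x^1(q), \ldots, x^n(q))$ then this condition can be rewritten as
\[ f \circ \phi_i^{-1} \circ \phi_i(q) = f \circ \phi_i^{-1} \circ \phi_i(p) + \sum_\mu (x^\mu(q) - x^\mu(p)) F_\mu \circ \phi_i(q) \tag{2.18} \]
and so
\[ f(q) = f(p) + \sum_\mu (x^\mu(q) - x^\mu(p)) f_\mu(q) \tag{2.19} \]
where we identify $f_\mu = F_\mu \circ \phi_i$.
It also follows that
\[
\begin{aligned}
\frac{\partial}{\partial x^\mu}\Big|_p f &= \partial_\mu (f \circ \phi_i^{-1}) \circ \phi_i(p) \\
&= \frac{\partial F}{\partial y^\mu} \circ \phi_i(p) \\
&= \frac{\partial}{\partial y^\mu} \left[ F(v^1, \ldots, v^n) + \sum_\nu (y^\nu - v^\nu) F_\nu(y^1, \ldots, y^n) \right]_{y = \phi_i(p)} \\
&= \left[ F_\mu(y^1, \ldots, y^n) + \sum_\nu (y^\nu - v^\nu) \partial_\mu F_\nu(y^1, \ldots, y^n) \right]_{y = \phi_i(p)} \\
&= F_\mu(\phi_i(p)) \\
&= f_\mu(p)
\end{aligned}
\tag{2.20}
\]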
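The construction in the proof can be tested numerically in two dimensions. The sketch below is only illustrative: the sample function, its hand-computed partial derivatives and the midpoint-rule integrator are all assumptions made for this example.

```python
import math

def F(y1, y2):
    """A hypothetical smooth function on R^2 playing the role of f o phi^{-1}."""
    return math.sin(y1) * math.exp(y2)

def dF1(y1, y2):
    """Partial derivative of F with respect to its first argument."""
    return math.cos(y1) * math.exp(y2)

def dF2(y1, y2):
    """Partial derivative of F with respect to its second argument."""
    return math.sin(y1) * math.exp(y2)

def integrate(g, n=2000):
    """Midpoint rule for the integral of g over [0, 1]."""
    return sum(g((k + 0.5) / n) for k in range(n)) / n

def F_mu(y, v):
    """The functions F_1, F_2 of (2.16) for n = 2."""
    y1, y2 = y
    v1, v2 = v
    F1 = integrate(lambda t: dF1(v1 + t * (y1 - v1), v2))
    F2 = integrate(lambda t: dF2(y1, v2 + t * (y2 - v2)))
    return F1, F2

v = (0.3, -0.2)
y = (1.0, 0.5)
F1, F2 = F_mu(y, v)
reconstructed = F(*v) + (y[0] - v[0]) * F1 + (y[1] - v[1]) * F2   # right side of (2.17)
print(F(*y), reconstructed)
```

Evaluating `F_mu(v, v)` returns the partial derivatives of `F` at `v`, which is the numerical counterpart of (2.13).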

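Theorem 2.3. $T_p(M)$ is an n-dimensional real vector space and a set of basis vectors is
\[ \left\{ \frac{\partial}{\partial x^\mu}\Big|_p , \ \mu = 1, \ldots, n \right\} \tag{2.21} \]
i.e. a general element of $T_p(M)$ can be written as
\[ X_p = \sum_\mu \lambda^\mu \frac{\partial}{\partial x^\mu}\Big|_p \tag{2.22} \]
Proof. First we note that if $X_p$ and $Y_p$ are two tangent vectors at a point $p \in M$ then we can add them or multiply them by a number $\lambda \in \mathbb{R}$:
(i) $(X_p + Y_p) : C^\infty(p) \to \mathbb{R}$, $f \mapsto X_p f + Y_p f$
(ii) $(\lambda X_p) : C^\infty(p) \to \mathbb{R}$, $f \mapsto \lambda (X_p f)$
(Convince yourself of this).
Next, we show that the basis elements (2.21) span the vector space of tangent vectors. To do this, note that the previous lemma implies
\[ X_p(f) = X_p \Big( f(p) + \sum_\mu (x^\mu - x^\mu(p)) f_\mu \Big) \tag{2.23} \]
Now $X_p(f(p)) = 0$ and $X_p(x^\mu(p)) = 0$ as $f(p)$ and $x^\mu(p)$ are constants. Thus we have
\[
\begin{aligned}
X_p(f) &= \sum_\mu \Big[ (x^\mu(q) - x^\mu(p)) X_p f_\mu + X_p(x^\mu) f_\mu(q) \Big]_{q = p} \\
&= \sum_\mu X_p(x^\mu) f_\mu(p) \\
&= \sum_\mu X_p(x^\mu) \frac{\partial}{\partial x^\mu}\Big|_p f
\end{aligned}
\tag{2.24}
\]
where we have used (2.20) in the last step. This shows that the elements (2.21) span $T_p(M)$. We must now show that they are also linearly independent. To this end, suppose that
\[ \sum_\mu \lambda^\mu \frac{\partial}{\partial x^\mu}\Big|_p = 0 \tag{2.25} \]
The coordinate functions $x^\nu$ are in $C^\infty(p)$, so we may consider
\[ 0 = \sum_\mu \lambda^\mu \frac{\partial}{\partial x^\mu}\Big|_p x^\nu = \sum_\mu \lambda^\mu \partial_\mu (x^\nu) = \sum_\mu \lambda^\mu \delta_\mu^\nu = \lambda^\nu \tag{2.26} \]
Thus $\lambda^\nu = 0$ for all $\nu$.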

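So why are they called tangent vectors? First consider $\mathbb{R}^n$, and a curve $C : (0, 1) \to \mathbb{R}^n$. Recall from elementary geometry that a tangent vector at a point $p = C(t_1)$ is a line through $p$ in the direction (i.e. with the slope)
\[ \left( \frac{dC^1}{dt}\Big|_{t = t_1}, \ldots, \frac{dC^n}{dt}\Big|_{t = t_1} \right) \tag{2.27} \]
where $C(t) = (C^1(t), \ldots, C^n(t)) \in \mathbb{R}^n$ is $C^\infty$.
Now if $f \in C^\infty(p)$, then by our definition, $X_p : C^\infty(p) \to \mathbb{R}$ defined by
\[ X_p(f) = \frac{d}{dt} f(C(t)) \Big|_{t = t_1} \tag{2.28} \]
is a tangent vector to $\mathbb{R}^n$ at $p = C(t_1)$, i.e. it acts linearly on the function $f$, vanishes on constant functions, and satisfies the Leibniz rule. On the other hand, by the chain rule we also have
\[ X_p(f) = \sum_\mu \frac{dC^\mu}{dt}\Big|_{t = t_1} (\partial_\mu f)\Big|_{C(t_1)} \tag{2.29} \]
so that $\frac{dC^\mu}{dt}\big|_{t = t_1}$ are the components of $X_p$ in the basis $\frac{\partial}{\partial x^\mu}\big|_p$.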
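A quick numerical comparison of (2.28) and (2.29) for a curve in the plane is sketched below. The curve, the test function and their hand-computed derivatives are hypothetical choices made only for this illustration.

```python
import math

def C(t):
    """A hypothetical smooth curve in R^2."""
    return (math.cos(t), t ** 2)

def dC(t):
    """Componentwise derivative of the curve C."""
    return (-math.sin(t), 2 * t)

def f(x, y):
    """A hypothetical smooth test function on R^2."""
    return x * y + math.exp(x)

def grad_f(x, y):
    """Partial derivatives of f with respect to x and y."""
    return (y + math.exp(x), x)

t1 = 0.7
h = 1e-6
# (2.28): differentiate the composite function f(C(t)) at t = t1
lhs = (f(*C(t1 + h)) - f(*C(t1 - h))) / (2 * h)
# (2.29): contract the velocity components with the partial derivatives of f
rhs = sum(c * g for c, g in zip(dC(t1), grad_f(*C(t1))))
print(lhs, rhs)
```

The left-hand value never refers to coordinates on the target of `f`, whereas the right-hand value is the same number expanded in the coordinate basis.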

2.3 Curves and their Tangents


We can now discuss curves on manifolds and their tangents.
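Definition 2.6. Consider an open interval $(a, b) \subset \mathbb{R}$. A map $C : (a, b) \to M$ to a manifold $M$ is called a smooth curve on $M$ if $\phi_i \circ C$ is $C^\infty$ where it is defined for any chart $(U_i, \phi_i)$ of $M$ (i.e. with $U_i \cap C((a, b)) \neq \emptyset$).
Definition 2.7. For a point $t_1 \in (a, b)$ with $C(t_1) = p$, we can define the tangent vector $T_p(C) \in T_p(M)$ to the curve $C$ at $p$ by
\[ T_p(C)(f) = \frac{d}{dt} f(C(t)) \Big|_{t = t_1} \tag{2.30} \]

Figure 6: A smooth curve

It should be clear that $T_p(C) \in T_p(M)$.
Let $p \in M$ be a point on a curve $C$ at $t = t_1$ which is covered by a chart $(U_i, \phi_i)$. Then there is some $\epsilon > 0$ such that
\[ C((t_1 - \epsilon, t_1 + \epsilon)) \subset U_i \tag{2.31} \]
We can express the tangent to $C$ at $p$ as
\[
\begin{aligned}
T_p(C)(f) &= \frac{d}{dt} (f(C(t))) \Big|_{t = t_1} \\
&= \frac{d}{dt} \left( (f \circ \phi_i^{-1}) \circ (\phi_i \circ C)(t) \right) \Big|_{t = t_1} \\
&= \sum_{\mu=1}^{n} \frac{d}{dt} \left( (\phi_i \circ C)^\mu(t) \right) \Big|_{t = t_1} \partial_\mu (f \circ \phi_i^{-1})(\phi_i(C(t_1)))
\end{aligned}
\tag{2.32}
\]
Here we have split $f \circ C : (t_1 - \epsilon, t_1 + \epsilon) \to \mathbb{R}$ as the composition of a function $f \circ \phi_i^{-1} : \mathbb{R}^n \to \mathbb{R}$ with $\phi_i \circ C : (t_1 - \epsilon, t_1 + \epsilon) \to \mathbb{R}^n$, and used the chain rule.
Thus we have
\[ T_p(C)(f) = \sum_{\mu=1}^{n} \frac{d}{dt} \left( (\phi_i \circ C)^\mu(t) \right) \Big|_{t = t_1} \frac{\partial}{\partial x^\mu}\Big|_p (f) \tag{2.33} \]
so that
\[ T_p(C) = \sum_{\mu=1}^{n} \frac{d}{dt} \left( (\phi_i \circ C)^\mu(t) \right) \Big|_{t = t_1} \frac{\partial}{\partial x^\mu}\Big|_p \tag{2.34} \]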

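Conversely, suppose that $T_p$ is a tangent vector to $M$ at $p$. We will now show that there exists a curve through $p$ such that $T_p$ is its tangent at $p$. Let $(x^1, \ldots, x^n) = \phi_i(q)$ be local coordinates about $p$ defined on an open set $U_i$. Hence we may write
\[ T_p = \sum_{\mu=1}^{n} T^\mu \frac{\partial}{\partial x^\mu}\Big|_p \tag{2.35} \]
for some numbers $T^\mu$. We define $C : (t_1 - \epsilon, t_1 + \epsilon) \to M$ by
\[ (\phi_i \circ C(t))^\mu = x^\mu(p) + (t - t_1) T^\mu \tag{2.36} \]
where we pick $\epsilon$ sufficiently small such that $\phi_i^{-1}(x^\mu(p) + (t - t_1) T^\mu) \in U_i$ for $t \in (t_1 - \epsilon, t_1 + \epsilon)$. It follows that
\[
\begin{aligned}
\frac{d}{dt} (f(C(t))) \Big|_{t = t_1} &= \frac{d}{dt} \left( (f \circ \phi_i^{-1}) \circ (\phi_i \circ C)(t) \right) \Big|_{t = t_1} \\
&= \sum_{\mu=1}^{n} \frac{d}{dt} (\phi_i \circ C)^\mu \Big|_{t = t_1} \partial_\mu (f \circ \phi_i^{-1})(\phi_i(C(t_1))) \\
&= \sum_{\mu=1}^{n} T^\mu \frac{\partial}{\partial x^\mu}\Big|_p (f)
\end{aligned}
\tag{2.37}
\]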

We have therefore shown that all curves through $p \in M$ define a tangent vector to $M$ at $p$, and that conversely, all tangent vectors to $M$ at $p$ can be realized as the tangent vector to some curve $C$. However, this correspondence is not unique. Clearly many distinct curves may have the same tangent vector at $p$, and conversely the construction of the curve through $p$ with tangent $T_p$ was not unique. But this does lead to the following theorem:
Theorem 2.4. $T_p M$ is isomorphic to the set of curves through $p \in M$ modulo the equivalence relation
\[ C(t) \sim \tilde{C}(t) \quad \text{iff} \quad \frac{d}{dt} (f \circ C) \Big|_{t = t_1} = \frac{d}{dt} (f \circ \tilde{C}) \Big|_{t = t_1} \tag{2.38} \]
for all $f \in C^\infty(p)$, where $C(t_1) = \tilde{C}(t_1) = p$.
Proof. It is easy to check that the construction above provides a bijection between these two spaces. The more difficult part, which we will not go into here, is to show that the vector space structure is preserved. Indeed, to do this we would need to give a vector space structure to the set of equivalence classes of curves through $p$.
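Theorem 2.5. Let $(x^1, \ldots, x^n) = \phi_1$, $(y^1, \ldots, y^n) = \phi_2$ be two coordinate systems at a point $p \in M$ with $U_1 \cap U_2 \neq \emptyset$, and suppose $X_p \in T_p M$. If
\[ X_p = \sum_{\mu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p \quad \text{and} \quad X_p = \sum_{\mu=1}^{n} B^\mu \frac{\partial}{\partial y^\mu}\Big|_p \tag{2.39} \]
then
\[ B^\nu = \sum_{\mu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu) \tag{2.40} \]
where $y^\nu(x^1, \ldots, x^n)$ are the components of $\phi_2 \circ \phi_1^{-1} : \phi_1(U_1 \cap U_2) \to \phi_2(U_1 \cap U_2)$, which is a smooth function.
Proof. We have in the second coordinate system that
\[ X_p(y^\nu) = \sum_{\mu=1}^{n} B^\mu \frac{\partial}{\partial y^\mu}\Big|_p (y^\nu) = B^\nu \tag{2.41} \]
but in the first coordinate system we also see that
\[ X_p(y^\nu) = \sum_{\mu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu) \tag{2.42} \]
and as these two expressions must agree, we have proved the theorem.
N.B.: This formula is often simply written as
\[ B^\nu = \sum_{\mu=1}^{n} A^\mu \frac{\partial y^\nu}{\partial x^\mu} \tag{2.43} \]
or even
\[ A'^\nu = A^\mu \frac{\partial x'^\nu}{\partial x^\mu} \tag{2.44} \]
with a sum over $\mu$ and a prime denoting quantities in the new coordinate system.
We can see that
\[ \frac{\partial}{\partial x^\mu}\Big|_p y^\nu \quad \text{and} \quad \frac{\partial}{\partial y^\nu}\Big|_p x^\lambda \tag{2.45} \]
are inverses of each other (when viewed as matrices). To see this note that on interchanging the coordinate systems, we must also have
\[ A^\lambda = \sum_{\nu=1}^{n} B^\nu \frac{\partial}{\partial y^\nu}\Big|_p (x^\lambda) \tag{2.46} \]
It follows that
\[ A^\lambda = \sum_{\nu=1}^{n} \frac{\partial}{\partial y^\nu}\Big|_p (x^\lambda) \sum_{\mu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu) = \sum_{\mu=1}^{n} \sum_{\nu=1}^{n} A^\mu \frac{\partial}{\partial y^\nu}\Big|_p (x^\lambda) \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu) \tag{2.47} \]
As this must be true for all possible $A^\mu$, it implies that
\[ \sum_{\nu=1}^{n} \frac{\partial}{\partial y^\nu}\Big|_p (x^\lambda) \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu) = \delta^\lambda_\mu \tag{2.48} \]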
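As a concrete instance of (2.48), one can take Cartesian coordinates and polar coordinates on the plane away from the origin. The choice of polar coordinates, the hand-computed partial derivatives and the helper names below are assumptions made for this sketch.

```python
import math

def dy_dx(x1, x2):
    """Matrix [nu][mu] of partial derivatives of y = (r, theta) with respect to x."""
    r2 = x1 * x1 + x2 * x2
    r = math.sqrt(r2)
    return [[x1 / r, x2 / r],
            [-x2 / r2, x1 / r2]]

def dx_dy(r, theta):
    """Matrix [lam][nu] of partial derivatives of x = (r cos, r sin) with respect to y."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x1, x2 = 1.0, 2.0
r, theta = math.hypot(x1, x2), math.atan2(x2, x1)
product = matmul(dx_dy(r, theta), dy_dx(x1, x2))   # left-hand side of (2.48)
print(product)   # close to the 2 x 2 identity matrix

A = [0.5, -1.0]                                     # components in the x basis
B = [sum(row[m] * A[m] for m in range(2)) for row in dy_dx(x1, x2)]        # (2.43)
A_back = [sum(row[n] * B[n] for n in range(2)) for row in dx_dy(r, theta)]  # (2.46)
print(A_back)    # recovers A up to rounding
```

Transforming the components to the new coordinates and back returns the original components, which is the content of (2.47).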

3. Maps Between Manifolds


3.1 Diffeomorphisms
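Definition 3.1. Suppose that $f : M \to N$, where $M$, $(U_i, \phi_i)$, $i \in I$, and $N$, $(V_a, \psi_a)$, $a \in A$ are two manifolds. We say that $f$ is $C^\infty$ iff
\[ \psi_a \circ f \circ \phi_i^{-1} : \phi_i(U_i \cap f^{-1}(V_a)) \to \mathbb{R}^n \tag{3.1} \]
is $C^\infty$ for all $i \in I$, $a \in A$ (such that $U_i \cap f^{-1}(V_a) \neq \emptyset$).
Problem 3.1. Show that $f : S^1 \to S^1$ defined by $f(e^{2\pi i \theta}) = e^{2\pi i n \theta}$ is $C^\infty$ for any $n$.
Theorem 3.1. Suppose that $M$, $N$ and $P$ are manifolds with $f : M \to N$ and $g : N \to P$ both $C^\infty$, then $g \circ f : M \to P$ is also $C^\infty$.
Proof. This follows from the chain rule.
Definition 3.2. If $f : M \to N$ is a bijection with both $f$ and $f^{-1}$ $C^\infty$ then $f$ is called a diffeomorphism.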
Two manifolds which are diffeomorphic, i.e. for which there exists a diffeomorphism
between them, are equivalent geometrically.
Problem 3.2. Show that the charts of two diffeomorphic manifolds are in a 1-1 correspondence.
Problem 3.3. Show that the set of diffeomorphisms from a manifold to itself forms a
group under composition.
3.2 Push Forward of Tangent Vectors
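Theorem 3.2. Let $f : M \to N$ be $C^\infty$. If $X_p \in T_p M$ then
\[ f_\star X_{f(p)} : C^\infty(f(p)) \to \mathbb{R} \quad \text{defined by} \quad g \mapsto X_p(g \circ f) \tag{3.2} \]
is in $T_{f(p)} N$, i.e. is a tangent vector to $N$ at $f(p)$.
Proof. Let $g_1, g_2 \in C^\infty(f(p))$ and let $\lambda \in \mathbb{R}$ be a constant. Then
\[
\begin{aligned}
f_\star X_{f(p)} (g_1 + g_2) &= X_p((g_1 + g_2) \circ f) \\
&= X_p(g_1 \circ f + g_2 \circ f) \\
&= X_p(g_1 \circ f) + X_p(g_2 \circ f) \\
&= f_\star X_{f(p)} (g_1) + f_\star X_{f(p)} (g_2)
\end{aligned}
\tag{3.3}
\]
and
\[ f_\star X_{f(p)} (\lambda) = X_p(\lambda \circ f) = X_p(\lambda) = 0 . \tag{3.4} \]
Finally, we have
\[
\begin{aligned}
f_\star X_{f(p)} (g_1 \cdot g_2) &= X_p((g_1 \cdot g_2) \circ f) \\
&= X_p((g_1 \circ f) \cdot (g_2 \circ f)) \\
&= X_p(g_1 \circ f) \, (g_2 \circ f)(p) + (g_1 \circ f)(p) \, X_p(g_2 \circ f) \\
&= f_\star X_{f(p)} (g_1) \, g_2(f(p)) + g_1(f(p)) \, f_\star X_{f(p)} (g_2)
\end{aligned}
\tag{3.5}
\]
Definition 3.3. $f_\star X_{f(p)}$ is called the push forward of $X_p$.

Figure 7: Push forward of tangent vector

Theorem 3.3. Suppose that $f : M \to N$ is $C^\infty$, $(x^1, \ldots, x^m)$ are local coordinates for a point $p \in M$ and $(y^1, \ldots, y^n)$ are local coordinates for the image $f(p) \in N$. If
\[ X_p = \sum_{\mu=1}^{m} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p \tag{3.6} \]
is in $T_p M$ then
\[ f_\star X_{f(p)} = \sum_{\mu=1}^{m} \sum_{\nu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu \circ f) \cdot \frac{\partial}{\partial y^\nu}\Big|_{f(p)} \tag{3.7} \]
Proof. Let $g \in C^\infty(f(p))$. Then
\[
\begin{aligned}
f_\star X_{f(p)} (g) &= X_p(g \circ f) \\
&= \sum_{\mu=1}^{m} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (g \circ f) \\
&= \sum_{\mu=1}^{m} A^\mu \partial_\mu (g \circ f \circ \phi_i^{-1}) \circ \phi_i(p) \\
&= \sum_{\mu=1}^{m} A^\mu \partial_\mu \left( (g \circ \psi_a^{-1}) \circ (\psi_a \circ f \circ \phi_i^{-1}) \right) \circ \phi_i(p) \\
&= \sum_{\mu=1}^{m} \sum_{\nu=1}^{n} A^\mu \partial_\mu (\psi_a \circ f \circ \phi_i^{-1})^\nu (\phi_i(p)) \, \partial_\nu (g \circ \psi_a^{-1})(\psi_a(f(p))) \\
&= \sum_{\mu=1}^{m} \sum_{\nu=1}^{n} A^\mu \frac{\partial}{\partial x^\mu}\Big|_p (y^\nu \circ f) \cdot \frac{\partial}{\partial y^\nu}\Big|_{f(p)} (g)
\end{aligned}
\tag{3.8}
\]
Corollary 3.1. The push forward acts linearly on vectors.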
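The coordinate formula (3.7) says that the components of the push forward are obtained by applying the Jacobian matrix of the map. The sketch below checks this against the defining relation (3.2) for a hypothetical map of the plane to itself; the map, the test function and the finite-difference helper are all assumptions of this example.

```python
import math

def f(x):
    """A hypothetical smooth map from R^2 to R^2."""
    return (x[0] ** 2, x[0] * x[1])

def jacobian_f(x):
    """Entries [nu][mu]: derivative of the nu-th component of f along x^mu."""
    return [[2 * x[0], 0.0],
            [x[1], x[0]]]

def g(y):
    """A hypothetical smooth test function on the target."""
    return math.sin(y[0]) + y[0] * y[1]

def directional(h, p, A, eps=1e-6):
    """Action on h of the vector at p with components A, by central differences."""
    plus = [pi + eps * ai for pi, ai in zip(p, A)]
    minus = [pi - eps * ai for pi, ai in zip(p, A)]
    return (h(plus) - h(minus)) / (2 * eps)

p = (1.0, 2.0)
A = (0.5, -1.0)
# components of the push forward from (3.7)
B = [sum(row[m] * A[m] for m in range(2)) for row in jacobian_f(p)]
lhs = directional(lambda x: g(f(x)), p, A)   # X_p(g o f), the definition (3.2)
rhs = directional(g, f(p), B)                # the vector with components B acting on g
print(B, lhs, rhs)
```

Linearity of the push forward (Corollary 3.1) is visible here: the components `B` depend on `A` through a matrix product.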

4. Vector Fields
4.1 Vector Fields
Next we consider vector fields.
Definition 4.1. A vector field is a map $X : M \to TM$ such that $X(p) = X_p \in T_p M$ and for all $f \in C^\infty(M)$ the mapping
\[ p \mapsto X(f)(p) \tag{4.1} \]
is $C^\infty$, where $X(f)(p) = X_p(f)$.
For vector fields we will drop the explicit subscript $p$. Thus a vector field assigns, in a smooth way, a vector in $T_p M$ to each point $p \in M$.
N.B.: As we defined it, a vector field is (globally) valid over all $M$, however it can also be defined over an open subset $U \subset M$ (i.e. locally).
Is the product of two vector fields also a vector field? To check this, take two vector fields $X, Y$ and $f, g \in C^\infty(M)$. With multiplication of vectors taken to mean $(X \cdot Y)(f) = X(Y(f))$, we see that
\[
\begin{aligned}
X(Y(f + g)) &= X(Y(f) + Y(g)) = X(Y(f)) + X(Y(g)) \\
X(Y(\text{constant map})) &= X(0) = 0
\end{aligned}
\tag{4.2}
\]
However, we find
\[
\begin{aligned}
X(Y(f \cdot g)) &= X(Y(f) g + f Y(g)) = g \cdot X(Y(f)) + f \cdot X(Y(g)) \\
&\quad + Y(f) \cdot X(g) + X(f) \cdot Y(g)
\end{aligned}
\tag{4.3}
\]
and this is not a vector field, due to the presence of the terms on the second line. However, note that the terms on the second line are symmetric in $X, Y$. Motivated by this, we construct a vector field by taking
\[ [X, Y](f) = X(Y(f)) - Y(X(f)) \tag{4.4} \]
so that
\[
\begin{aligned}
[X, Y](f \cdot g) &= g \cdot X(Y(f)) + f \cdot X(Y(g)) + Y(f) \cdot X(g) + X(f) \cdot Y(g) \\
&\quad - \left( g \cdot Y(X(f)) + f \cdot Y(X(g)) + X(f) \cdot Y(g) + Y(f) \cdot X(g) \right) \\
&= [X, Y](f) \cdot g + f \cdot [X, Y](g)
\end{aligned}
\tag{4.5}
\]
as required for a vector field.
Definition 4.2. $[X, Y]$ is called the commutator of two vector fields.
Problem 4.1. What goes wrong if we try to make a vector field using the definition $(X \cdot Y)(f) = X(f) \cdot Y(f)$?


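Problem 4.2. Show that, if in a particular coordinate system
\[ X = \sum_\mu X^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_p , \qquad Y = \sum_\mu Y^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_p \tag{4.6} \]
then
\[ [X, Y] = \sum_\mu \sum_\nu \left( X^\mu \partial_\mu Y^\nu - Y^\mu \partial_\mu X^\nu \right) \frac{\partial}{\partial x^\nu}\Big|_p \tag{4.7} \]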
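The component formula (4.7) is straightforward to evaluate numerically. The sketch below approximates the partial derivatives by central differences; the two sample vector fields (a rotation and a translation of the plane) and all function names are assumptions chosen for illustration, and the code does not replace the derivation asked for in the problem.

```python
def partial(F, x, mu, h=1e-6):
    """Central-difference approximation to the derivative of F along x^mu."""
    xp = list(x)
    xm = list(x)
    xp[mu] += h
    xm[mu] -= h
    return (F(xp) - F(xm)) / (2 * h)

def commutator(X, Y, x):
    """Components of [X, Y] at the point x, following (4.7).

    X and Y map a point (a list of coordinates) to a list of components.
    """
    n = len(x)
    out = []
    for nu in range(n):
        total = 0.0
        for mu in range(n):
            dY = partial(lambda z: Y(z)[nu], x, mu)
            dX = partial(lambda z: X(z)[nu], x, mu)
            total += X(x)[mu] * dY - Y(x)[mu] * dX
        out.append(total)
    return out

def rotation(x):
    """Components of the rotation field -x2 d/dx1 + x1 d/dx2."""
    return [-x[1], x[0]]

def translation(x):
    """Components of the constant field d/dx1."""
    return [1.0, 0.0]

c = commutator(rotation, translation, [0.3, 0.7])
d = commutator(translation, rotation, [0.3, 0.7])
print(c, d)   # c is close to [0, -1] and d is close to [0, 1]
```

Swapping the two arguments flips the sign of every component, reflecting the antisymmetry of the commutator.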

Theorem 4.1. With the product of two vector fields defined as the commutator, the space of vector fields is an algebra.
Proof. We have already seen that $T_p M$ is a vector space for a particular $p \in M$. This ensures that the space of vector fields is also a vector space, with addition and scalar multiplication defined pointwise. We need to check the conditions (i)-(iv) in the definition of an algebra. Conditions (i), (ii) are obviously satisfied. In addition, since $[X, Y] = -[Y, X]$, we need only check that
\[
\begin{aligned}
[X, Y + Z](f) &= X((Y + Z)(f)) - (Y + Z)(X(f)) \\
&= X(Y(f)) + X(Z(f)) - Y(X(f)) - Z(X(f)) \\
&= [X, Y](f) + [X, Z](f)
\end{aligned}
\tag{4.8}
\]
as required.
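Problem 4.3. Show that for three vector fields $X, Y, Z$ on $M$ the Jacobi identity holds:
$$[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] = 0 \tag{4.9}$$

Problem 4.4. Consider a manifold with a local coordinate system $(U_i, \phi_i = (x^1, \dots, x^n))$. In $U_i$ we can simply write
$$\frac{\partial}{\partial x^\mu}\Big|_p = \frac{\partial}{\partial x^\mu} \tag{4.10}$$
(i) Show that $\left[ \frac{\partial}{\partial x^\mu}, \frac{\partial}{\partial x^\nu} \right] = 0$.
(ii) Evaluate $\left[ \frac{\partial}{\partial x^1}, \varphi(x^1, x^2) \frac{\partial}{\partial x^2} \right]$ where $\varphi(x^1, x^2)$ is a $C^\infty$ function of $x^1, x^2$.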
4.2 Integral Curves and Local Flows
Given a vector field $X$, we can construct curves that pass through $p \in M$ for which the tangent vector at each point of the curve is the value of $X$ at that point.
Definition 4.3. Let $X$ be a vector field on $M$ and consider a point $p \in M$. An integral curve of $X$ passing through $p$ is a curve $C(t)$ in $M$ such that $C(0) = p$ and
$$C_\star\left(\frac{d}{dt}\right) = X_{C(t)} \tag{4.11}$$
for all $t$ in some open interval $(-\epsilon, \epsilon) \subset \mathbb{R}$. Here we are viewing $\frac{d}{dt}$ as a vector field on $\mathbb{R}$, so that the push forward $C_\star\left(\frac{d}{dt}\right)$, defined by
$$C_\star\left(\frac{d}{dt}\right)(f) = \frac{d}{dt}\big(f(C(t))\big) = T_p(C)(f) \tag{4.12}$$
is just the tangent vector to $C(t)$ at $p$.


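If we introduce a local coordinate system so that in an open set about $p \in M$,
$$X = \sum_\mu X^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_{\phi_i^{-1}(x)} \tag{4.13}$$
then we find that the tangent to the curve is
$$\begin{aligned} C_\star\left(\frac{d}{dt}\right)(f) &= \frac{d}{dt} f(C(t)) \\ &= \frac{d}{dt} \big(f \circ \phi_i^{-1}\big)\big(\phi_i \circ C(t)\big) \\ &= \sum_\mu \frac{dC^\mu}{dt}(t) \frac{\partial}{\partial x^\mu}\Big|_{C(t)} (f) \end{aligned} \tag{4.14}$$
where we have set $C^\mu = (\phi_i \circ C)^\mu$. On the other hand we have
$$X_{C(t)}(f) = \sum_\mu X^\mu(C(t)) \frac{\partial}{\partial x^\mu}\Big|_{C(t)} (f) . \tag{4.15}$$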

Thus we see that the condition for an integral curve is a first order differential equation for the coordinates of the curve $C^\mu(t)$:
$$\frac{d}{dt} C^\mu(t) = X^\mu(C(t)) \tag{4.16}$$
together with the initial condition $x^\mu(C(0)) = x^\mu(p)$. As a first order differential equation with smooth coefficients, it has (at least locally) a unique solution with the given initial condition (Picard's theorem). However, it is by no means clear whether or not the solution can be extended to all values of $t$. In particular, even if there is a solution to the differential equation (4.16) for all $t$, one must also worry about patching solutions together over the different coordinate patches. This leads to
Definition 4.4. A vector field $X$ on $M$ is complete if for every point $p \in M$ the integral curve of $X$ through $p$ can be extended to a curve on $M$ for all values of $t$.
Theorem 4.2. If $M$ is compact (i.e. all open covers have a finite subcover) then all vector fields on $M$ are complete.
Proof. We will not prove this.
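To see why completeness can fail on a non-compact manifold, the following sketch solves (4.16) for one illustrative vector field on $M = \mathbb{R}$. The choice $X = x^2\,\partial_x$ is an assumption made here for concreteness; it does not appear in the notes.

```latex
\begin{aligned}
&\frac{dC}{dt} = C(t)^2, \qquad C(0) = x_0 > 0 \\
&\Longrightarrow\quad C(t) = \frac{x_0}{1 - x_0 t}, \qquad t < \frac{1}{x_0} .
\end{aligned}
```

The solution is unique near $t = 0$, in line with Picard's theorem, but it escapes to infinity as $t \to 1/x_0$, so the integral curve cannot be extended to all $t$ and $X$ is not complete. This does not contradict Theorem 4.2, since $\mathbb{R}$ is not compact.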

Let $\sigma(t, p)$ be an integral curve of a vector field $X$ that passes through $p$ at $t = 0$. This therefore satisfies
$$\frac{d}{dt} \sigma^\mu(t, p) = X^\mu(\sigma(t, p)) \tag{4.17}$$
along with the initial condition $\sigma(0, p) = p$. Thus we have found the following, at least locally (and globally for complete vector fields):
Definition 4.5. The flow generated by the vector field $X$ is a differentiable map
$$\sigma : \mathbb{R} \times M \to M \tag{4.18}$$
such that
(i) At each point $p \in M$, the tangent to the curve $C_p(t) = \sigma(t, p)$ at $p$ is $X_p$.
(ii) $\sigma(0, p) = p$
(iii) $\sigma(t + s, p) = \sigma(t, \sigma(s, p))$
We must show that the third property holds. To do this we simply note that since (4.17) is a first order differential equation it has a unique solution for a fixed initial condition. So consider
$$\frac{d}{dt} \sigma^\mu(t+s, p) = \frac{d\,\sigma^\mu(t+s, p)}{d(t+s)} = X^\mu(\sigma(t+s, p)) \tag{4.19}$$
which satisfies the initial condition $\sigma(0 + s, p) = \sigma(s, p)$. On the other hand we also have
$$\frac{d}{dt} \sigma^\mu(t, \sigma(s, p)) = X^\mu(\sigma(t, \sigma(s, p))) \tag{4.20}$$
with the initial condition $\sigma(0, \sigma(s, p)) = \sigma(s, p)$. Thus both $\sigma(t+s, p)$ and $\sigma(t, \sigma(s, p))$ satisfy the same equation (4.17) with the same initial condition, and so they must be equal.
We see that each point $p \in M$ defines a curve $C_p(t) = \sigma(t, p)$ in $M$ whose tangent is $X$, and such that $C_p(0) = p$.
Another way of looking at this is that, for each $t$, the flow defines a map $\sigma_t : M \to M$ such that $\sigma_{t+s} = \sigma_t \circ \sigma_s$. Now for $t = \epsilon$ small, we have, from (4.17),
$$\sigma_\epsilon^\mu(p) = \sigma^\mu(\epsilon, p) = x^\mu(p) + \epsilon X^\mu(p) + O(\epsilon^2) \tag{4.21}$$
Thus, at least for a sufficiently small value of $t$, $\sigma_t$ is 1-1 and $C^\infty$. Hence it is a diffeomorphism onto its image (at least for some open set in $M$). This leads to another definition.
Definition 4.6. A 1-parameter family of diffeomorphisms of $M$ is a collection of diffeomorphisms $\sigma_t : M \to M$ with $t \in \mathbb{R}$, such that
(i) $\sigma_0 = \mathrm{id}$ where $\mathrm{id}$ is the identity map.
(ii) $\sigma_t \circ \sigma_s = \sigma_{t+s}$
(iii) $\sigma_{-t} \circ \sigma_t = \mathrm{id}$
So in effect, we can think of vector fields as generating infinitesimal diffeomorphisms through their flows. In this sense, vector fields can be identified with the Lie algebra of the diffeomorphism group.
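A concrete flow may help fix ideas. The sketch below takes $M = \mathbb{R}^2$ and the rotation field $X = -y\,\partial_x + x\,\partial_y$ (an illustrative choice, not from the notes) and writes $\sigma_t$ for the map $p \mapsto \sigma(t, p)$.

```latex
\begin{aligned}
\sigma_t(x, y) &= \big( x\cos t - y\sin t,\; x\sin t + y\cos t \big), \\
\frac{d}{dt}\sigma_t^x &= -x\sin t - y\cos t = -\sigma_t^y, \qquad
\frac{d}{dt}\sigma_t^y = x\cos t - y\sin t = \sigma_t^x, \\
\sigma_0 &= \mathrm{id}, \qquad \sigma_t \circ \sigma_s = \sigma_{t+s}, \qquad
\sigma_\epsilon(x, y) = (x - \epsilon y,\; y + \epsilon x) + O(\epsilon^2) .
\end{aligned}
```

The second line is the flow equation $\frac{d}{dt}\sigma^\mu = X^\mu(\sigma)$, the group property is just composition of rotations, and the last expansion reproduces $x^\mu + \epsilon X^\mu + O(\epsilon^2)$. Here the flow exists for all $t$, so this particular $X$ is complete.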
4.3 Lie Derivatives
A vector field allows us to introduce the notion of a derivative on a manifold. The problem with the usual derivative,
$$\frac{\partial f}{\partial p} = \lim_{\epsilon \to 0} \frac{f(p + \epsilon) - f(p)}{\epsilon} \tag{4.22}$$
is that we don't know how to add two points on a manifold, i.e. what is $p + \epsilon$? However, we saw that, at least locally, a vector field generates a unique integral flow about any given point $p$. Therefore we can use the flow to take us to a nearby point and hence form a derivative. This is the notion of a Lie derivative.
Definition 4.7. Let $X$ be a vector field on $M$ and $f \in C^\infty(M)$. We define the Lie derivative of $f$ along $X$ to be
$$L_X f(p) = \lim_{\epsilon \to 0} \frac{f(\sigma(\epsilon, p)) - f(p)}{\epsilon} \tag{4.23}$$
where $\sigma(\epsilon, p)$ is the flow generated by $X$ at $p$.


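Theorem 4.3.
$$L_X f = X(f) \tag{4.24}$$
Proof. We have from the definition
$$\begin{aligned} L_X f(p) &= \lim_{\epsilon \to 0} \frac{f(\sigma(\epsilon, p)) - f(\sigma(0, p))}{\epsilon} \\ &= \frac{d}{dt} f(\sigma(t, p)) \Big|_{t=0} \\ &= \sigma(t, p)_\star \left( \frac{d}{dt} \right)\Big|_{t=0} (f) \\ &= X_p(f) \end{aligned} \tag{4.25}$$
where we have used the defining property of the flow that the tangent at $p$ is $X_p$.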
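We can also define the Lie derivative of a vector field $Y$ along $X$:
Definition 4.8.
$$L_X Y_p = \lim_{\epsilon \to 0} \frac{\sigma(-\epsilon)_\star Y_p - Y_p}{\epsilon} \tag{4.26}$$
where again $\sigma(\epsilon)$ is the flow generated by $X$, and we have suppressed the dependence on the point $p$. Here $\sigma(-\epsilon)_\star Y_p$ is the vector at $p$ obtained by pushing forward $Y$ from the point $\sigma(\epsilon, p)$, i.e. $\sigma(-\epsilon)_\star Y_p (f) = Y_{\sigma(\epsilon)}(f \circ \sigma(-\epsilon))$.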

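Theorem 4.4.
$$L_X Y = [X, Y]$$
Proof. Let us introduce coordinates $(x^1, \dots, x^n)$ about $p \in M$ such that
$$X = \sum_\mu X^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_{\phi_i^{-1}(x)} , \tag{4.27}$$
$$Y = \sum_\mu Y^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_{\phi_i^{-1}(x)} \tag{4.28}$$
Recall that
$$\sigma^\mu(\epsilon, p) = x^\mu(p) + \epsilon X^\mu(p) + O(\epsilon^2) \tag{4.29}$$
and note for any $f$,
$$\begin{aligned} Y_{\sigma(\epsilon)}\big(f(\sigma(-\epsilon))\big) &= \sum_\mu Y^\mu(\sigma(\epsilon)) \frac{\partial}{\partial x^\mu}\Big|_{\sigma(\epsilon)} \big(f(\sigma(-\epsilon))\big) \\ &= \sum_\mu \big(Y^\mu + \epsilon X^\nu \partial_\nu Y^\mu\big) \frac{\partial}{\partial x^\mu}\Big|_{\sigma(\epsilon)} \big(f(\sigma(-\epsilon))\big) + O(\epsilon^2) \\ &= \sum_\mu \big(Y^\mu + \epsilon X^\nu \partial_\nu Y^\mu\big)\big(\partial_\mu f(\sigma(-\epsilon)) + \epsilon X^\nu \partial_\nu \partial_\mu f(\sigma(-\epsilon))\big) + O(\epsilon^2) \\ &= \sum_\mu \big(Y^\mu + \epsilon X^\nu \partial_\nu Y^\mu\big)\big(\partial_\mu (f - \epsilon X^\nu \partial_\nu f) + \epsilon X^\nu \partial_\nu \partial_\mu f\big) + O(\epsilon^2) \\ &= \sum_\mu \big(Y^\mu + \epsilon X^\nu \partial_\nu Y^\mu - \epsilon Y^\nu \partial_\nu X^\mu\big) \partial_\mu f + O(\epsilon^2) \\ &= \sum_\mu \big(Y^\mu + \epsilon [X, Y]^\mu\big) \partial_\mu f + O(\epsilon^2) \end{aligned} \tag{4.30}$$
N.B. In the 3rd line, we have suppressed the $\phi_i^{-1}$ term (in $f \circ \phi_i^{-1}$) for convenience. We also adopt the Einstein summation convention, whereby repeated indices are summed over.
It follows that
$$\begin{aligned} \frac{1}{\epsilon}\big(\sigma(-\epsilon)_\star Y_p (f) - Y_p(f)\big) &= \big(X^\nu \partial_\nu Y^\mu - Y^\nu \partial_\nu X^\mu\big) \partial_\mu f(p) + O(\epsilon) \\ &= [X, Y]_p(f) + O(\epsilon) \end{aligned} \tag{4.31}$$
and we are done.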
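As a sanity check of $L_X Y = [X, Y]$, one can evaluate the flow definition directly in a simple case. The fields $X = \partial_x$ and $Y = x\,\partial_y$ on $\mathbb{R}^2$ are illustrative choices made here, and $\sigma_\epsilon$ denotes the flow of $X$.

```latex
\begin{aligned}
\sigma_\epsilon(x, y) &= (x + \epsilon,\; y), \\
Y_{\sigma_\epsilon(p)}\big(f \circ \sigma_{-\epsilon}\big)
  &= (x + \epsilon)\, \partial_y f(x, y), \\
L_X Y_p(f) &= \lim_{\epsilon \to 0}
  \frac{(x + \epsilon)\,\partial_y f - x\,\partial_y f}{\epsilon}
  = \partial_y f .
\end{aligned}
```

So $L_X Y = \partial_y$, which agrees with the commutator $[\partial_x, x\,\partial_y] = \partial_y$ obtained from the component formula $X^\nu \partial_\nu Y^\mu - Y^\nu \partial_\nu X^\mu$.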

Remark: Sometimes the push forward of a vector field is defined as
$$\phi_\star X_p (f) = X_p(f \circ \phi) \tag{4.32}$$
i.e. although $\phi_\star X_p$ is a vector at $\phi(p)$, the subscript of $\phi_\star X$ is the point $p$ rather than the point $\phi(p)$ at which the vector is defined; compare with (3.2). In this notation, the Lie derivative is defined as
$$L_X Y_p = \lim_{\epsilon \to 0} \frac{\sigma(-\epsilon)_\star Y_{\sigma(\epsilon, p)} - Y_p}{\epsilon} \tag{4.33}$$
compare with (4.26). Of course, there is no difference; it is a matter of notation.

5. Tensors
5.1 Co-Tangent Vectors
Recall that the tangent space $T_pM$ at a point $p \in M$ is a vector space. For any vector space there is a natural notion of a dual vector space, which is defined as the space of linear maps from the vector space to $\mathbb{R}$.
Problem 5.1. Show that the dual space is a vector space, of the same dimension as the original vector space (provided it is finite-dimensional).
Thus we have the
Definition 5.1. The co-tangent space to $M$ at $p \in M$ is the dual vector space to $T_p(M)$ and is denoted by $T_p^\star M$.
In other words, $\omega_p \in T_p^\star M$ iff $\omega_p : T_pM \to \mathbb{R}$ is a linear map. We denote the action of $\omega_p$ on a vector $X_p \in T_pM$ by
$$\omega_p(X_p) = \langle \omega_p, X_p \rangle \tag{5.1}$$
Since $\omega_p$ is a linear map we have
$$\langle \omega_p, X_p + Y_p \rangle = \omega_p(X_p + Y_p) = \omega_p(X_p) + \omega_p(Y_p) = \langle \omega_p, X_p \rangle + \langle \omega_p, Y_p \rangle \tag{5.2}$$
Furthermore, as the dual space is a vector space,
$$\langle \omega_p + \eta_p, X_p \rangle = (\omega_p + \eta_p)(X_p) = \omega_p(X_p) + \eta_p(X_p) = \langle \omega_p, X_p \rangle + \langle \eta_p, X_p \rangle \tag{5.3}$$
Thus $\langle \cdot, \cdot \rangle$ is linear in each of its entries.
Now, for a finite-dimensional vector space, the dual of the dual is naturally identified with the original space itself. To see this, note that for a fixed vector $X_p$, we can construct the map
$$\omega_p \mapsto \omega_p(X_p) \in \mathbb{R} \tag{5.4}$$
The properties of dual vectors ensure that this is a linear map. Thus we can view vectors $X_p$ as linear maps acting on co-vectors $\omega_p$ via
$$X_p(\omega_p) = \langle \omega_p, X_p \rangle \tag{5.5}$$
Just as for the tangent bundle, we can define the co-tangent bundle to be
$$T^\star M = \bigcup_{p \in M} T_p^\star M \tag{5.6}$$
which is a $2n$-dimensional manifold.
Definition 5.2. A smooth co-vector field is a map $\omega : M \to T^\star M$ such that
(i) $\omega_p \in T_p^\star M$.
(ii) $\omega(X) : M \to \mathbb{R}$, defined by $\omega(X)(p) = \omega_p(X_p)$, is in $C^\infty(M)$ for all smooth vector fields $X$.
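Recall that a chart $(U_i, \phi_i)$ defines a natural basis of $T_pM$ for $p \in U_i$:
$$\frac{\partial}{\partial x^\mu}\Big|_p \tag{5.7}$$
where $\phi_i(p) = (x^1(p), \dots, x^n(p))$. This allows us to define a natural basis for $T_p^\star M$ by
$$\Big\langle dx^\mu|_p , \frac{\partial}{\partial x^\nu}\Big|_p \Big\rangle = \delta^\mu_\nu \tag{5.8}$$
i.e. $dx^\mu|_p$ is a linear map from $T_pM$ to $\mathbb{R}$ that maps the vector
$$\sum_{\nu=1}^n v^\nu \frac{\partial}{\partial x^\nu}\Big|_p \tag{5.9}$$
to $v^\mu$.
Thus, if in a local coordinate chart we have a vector
$$V(p) = \sum_{\mu=1}^n V^\mu(p) \frac{\partial}{\partial x^\mu}\Big|_p \tag{5.10}$$
and a co-vector
$$\omega(p) = \sum_{\mu=1}^n \omega_\mu(p)\, dx^\mu|_p \tag{5.11}$$
then
$$\begin{aligned} \langle \omega(p), V(p) \rangle &= \Big\langle \sum_{\mu=1}^n \omega_\mu(p)\, dx^\mu|_p , \sum_{\nu=1}^n V^\nu(p) \frac{\partial}{\partial x^\nu}\Big|_p \Big\rangle \\ &= \sum_{\mu=1}^n \sum_{\nu=1}^n \omega_\mu(p) V^\nu(p) \Big\langle dx^\mu|_p , \frac{\partial}{\partial x^\nu}\Big|_p \Big\rangle \\ &= \sum_{\mu=1}^n \sum_{\nu=1}^n \omega_\mu(p) V^\nu(p)\, \delta^\mu_\nu \\ &= \sum_{\mu=1}^n \omega_\mu(p) V^\mu(p) \end{aligned} \tag{5.12}$$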

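Theorem 5.1. Let $(x^1, \dots, x^n) = \phi_1$ and $(y^1, \dots, y^n) = \phi_2$ be two coordinate systems at a point $p \in M$ with $p \in U_1 \cap U_2$, and suppose that $\omega_p \in T_p^\star M$. If
$$\omega_p = \sum_{\mu=1}^n A_\mu\, dx^\mu|_p , \qquad \text{and} \qquad \omega_p = \sum_{\mu=1}^n B_\mu\, dy^\mu|_p \tag{5.13}$$
then
$$A_\mu = \sum_{\nu=1}^n B_\nu \frac{\partial y^\nu}{\partial x^\mu}\Big|_p \tag{5.14}$$
where $y^\nu(x^1, \dots, x^n) = \phi_2 \circ \phi_1^{-1}$ is a smooth function $\phi_1(U_1 \cap U_2) \to \phi_2(U_1 \cap U_2)$.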

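Proof. In the first coordinate system
$$\omega_p \left( \frac{\partial}{\partial x^\mu}\Big|_p \right) = \sum_{\nu=1}^n A_\nu \Big\langle dx^\nu|_p , \frac{\partial}{\partial x^\mu}\Big|_p \Big\rangle = A_\mu \tag{5.15}$$
In the second coordinate system, we see that
$$\omega_p \left( \frac{\partial}{\partial x^\mu}\Big|_p \right) = \sum_{\nu=1}^n B_\nu \Big\langle dy^\nu|_p , \frac{\partial}{\partial x^\mu}\Big|_p \Big\rangle \tag{5.16}$$
Also recall that
$$\frac{\partial}{\partial x^\mu}\Big|_p = \sum_\lambda \frac{\partial}{\partial x^\mu}(y^\lambda)\Big|_p \frac{\partial}{\partial y^\lambda}\Big|_p \tag{5.17}$$
Thus we find
$$\begin{aligned} \omega_p \left( \frac{\partial}{\partial x^\mu}\Big|_p \right) &= \sum_{\nu=1}^n B_\nu \Big\langle dy^\nu|_p , \sum_\lambda \frac{\partial}{\partial x^\mu}(y^\lambda)\Big|_p \frac{\partial}{\partial y^\lambda}\Big|_p \Big\rangle \\ &= \sum_{\nu=1}^n \sum_\lambda B_\nu \frac{\partial}{\partial x^\mu}(y^\lambda)\Big|_p \Big\langle dy^\nu|_p , \frac{\partial}{\partial y^\lambda}\Big|_p \Big\rangle \\ &= \sum_{\nu=1}^n B_\nu \frac{\partial y^\nu}{\partial x^\mu}\Big|_p \end{aligned} \tag{5.18}$$
and since these must agree the theorem is proved.

N.B.: This formula is often simply written as
$$A_\mu = \sum_{\nu=1}^n B_\nu \frac{\partial y^\nu}{\partial x^\mu} \tag{5.19}$$
or even
$$A'_\mu = A_\nu \frac{\partial x^\nu}{\partial x'^\mu} \tag{5.20}$$
with a sum over $\nu$ understood and a prime denoting quantities in the new coordinate system. Note the different positions of the primed and unprimed coordinates as compared to the analogous formula for a vector.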
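The transformation rule for co-vector components can be checked on a familiar change of coordinates. In the sketch below the $x$-coordinates are taken to be polar coordinates $(r, \theta)$ on a patch of $\mathbb{R}^2$ and the $y$-coordinates are Cartesian, $y^1 = r\cos\theta$, $y^2 = r\sin\theta$; this setup and the co-vector $\omega = dy^1$ are illustrative assumptions.

```latex
\begin{aligned}
\omega &= dy^1 \quad\Rightarrow\quad (B_1, B_2) = (1, 0), \\
A_r &= \sum_\nu B_\nu \frac{\partial y^\nu}{\partial r} = \cos\theta, \qquad
A_\theta = \sum_\nu B_\nu \frac{\partial y^\nu}{\partial \theta} = -r\sin\theta, \\
\omega &= \cos\theta\, dr - r\sin\theta\, d\theta .
\end{aligned}
```

This is the familiar expression for the differential of $r\cos\theta$, as one would expect: the components of a co-vector transform with $\partial y / \partial x$, whereas those of a vector transform with the inverse matrix.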

5.2 Pull-back and Lie Derivative of a co-vector


Suppose we have a smooth map $f : M \to N$. We saw that we could push-forward a vector $X_p \in T_pM$ to a vector $f_\star X_{f(p)} \in T_{f(p)}N$ by
$$f_\star X_{f(p)}(g) = X_p(g \circ f) \tag{5.21}$$
We therefore can consider the dual map $f^\star : T_{f(p)}^\star N \to T_p^\star M$ defined by
$$f^\star(\omega_{f(p)})(X_p) = \langle f^\star(\omega_{f(p)}), X_p \rangle = \langle \omega_{f(p)}, f_\star X_{f(p)} \rangle = \omega_{f(p)}(f_\star X_{f(p)}) \tag{5.22}$$
It is clear that if $\omega_{f(p)} \in T_{f(p)}^\star N$ then $f^\star(\omega_{f(p)})$ is a linear map $T_pM \to \mathbb{R}$; this follows from the linearity of the push-forward $f_\star$.

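Theorem 5.2. Let $f : M \to N$ be $C^\infty$, $(y^1, \dots, y^n)$ be local coordinates on $V \subset N$, and $(x^1, \dots, x^m)$ be local coordinates on $U \cap f^{-1}(V) \subset M$ (assuming $U \cap f^{-1}(V) \neq \emptyset$). If
$$\omega = \sum_{\mu=1}^n \omega_\mu\, dy^\mu|_{f(p)} \tag{5.23}$$
then
$$f^\star \omega = \sum_{\nu=1}^m \sum_{\mu=1}^n \omega_\mu \frac{\partial}{\partial x^\nu}(y^\mu \circ f)\Big|_p\, dx^\nu|_p \tag{5.24}$$
Proof. We have
$$\langle \omega, f_\star X_{f(p)} \rangle = \sum_{\mu=1}^n \omega_\mu (f_\star X_{f(p)})^\mu = \sum_{\mu=1}^n \sum_{\nu=1}^m \omega_\mu X^\nu \frac{\partial}{\partial x^\nu}(y^\mu \circ f)\Big|_p \tag{5.25}$$
where we used our earlier result for the components of the push forward of a vector. However, we also have
$$\langle f^\star \omega, X_p \rangle = \sum_{\nu=1}^m (f^\star \omega)_\nu X^\nu \tag{5.26}$$
by definition. As these two expressions must be equal for all possible choices of $X^\nu$, the result follows.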
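A short worked pull-back may make the component formula more tangible. The map $f : \mathbb{R} \to \mathbb{R}^2$, $f(t) = (\cos t, \sin t)$, and the co-vector field $\omega = -y^2\, dy^1 + y^1\, dy^2$ are illustrative choices, not taken from the notes; here $m = 1$, $n = 2$ and the single coordinate on $\mathbb{R}$ is $t$.

```latex
\begin{aligned}
f^\star \omega
 &= \Big( \omega_1 \frac{d}{dt}(y^1 \circ f)
        + \omega_2 \frac{d}{dt}(y^2 \circ f) \Big)\, dt \\
 &= \big( (-\sin t)(-\sin t) + (\cos t)(\cos t) \big)\, dt \\
 &= dt .
\end{aligned}
```

The components $\omega_\mu$ are evaluated at $f(t)$, so $\omega_1 = -\sin t$ and $\omega_2 = \cos t$. The result says that $\omega$ restricted to the unit circle measures the angle parameter.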

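We can also extend the definition of the Lie derivative to co-vector fields (and in fact all tensor fields).
Definition 5.3. If $X$ is a smooth vector field and $\omega$ a smooth co-vector field on $M$ then
$$L_X \omega = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \big( \sigma(\epsilon)^\star \omega - \omega \big) \tag{5.27}$$
Theorem 5.3. If, in a local coordinate system $(x^1, \dots, x^n)$,
$$X = \sum_\mu X^\mu(x) \frac{\partial}{\partial x^\mu}\Big|_p , \qquad \omega = \sum_\mu \omega_\mu(x)\, dx^\mu|_p \tag{5.28}$$
then
$$L_X \omega = \sum_\mu \sum_\nu \big( X^\nu \partial_\nu \omega_\mu + \omega_\nu \partial_\mu X^\nu \big)\, dx^\mu|_p \tag{5.29}$$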

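Proof. Suppose $Y \in T_pM$, and consider
$$\begin{aligned} \sigma(\epsilon)^\star \omega (Y) &= \omega_{\sigma(\epsilon)} \big( \sigma(\epsilon)_\star Y_{\sigma(\epsilon)} \big) \\ &= \sum_\mu \omega_\mu(\sigma(\epsilon)) \big( \sigma(\epsilon)_\star Y_{\sigma(\epsilon)} \big)^\mu \\ &= \sum_\mu \big( \omega_\mu + \epsilon X^\nu \partial_\nu \omega_\mu \big) \big( \sigma(\epsilon)_\star Y_{\sigma(\epsilon)} \big)^\mu + O(\epsilon^2) \end{aligned} \tag{5.30}$$
where
$$\begin{aligned} \big( \sigma(\epsilon)_\star Y_{\sigma(\epsilon)} \big)^\mu &= Y^\nu \partial_\nu \big( \sigma^\mu(\epsilon) \big) \\ &= Y^\nu \partial_\nu \big( x^\mu + \epsilon X^\mu \big) + O(\epsilon^2) \\ &= Y^\mu + \epsilon Y^\nu \partial_\nu X^\mu + O(\epsilon^2) \end{aligned} \tag{5.31}$$
as a consequence of Theorem 3.3.
Combining these expressions one obtains
$$\begin{aligned} \sigma(\epsilon)^\star \omega (Y) &= \sum_\mu \big( \omega_\mu + \epsilon X^\nu \partial_\nu \omega_\mu \big)\big( Y^\mu + \epsilon Y^\nu \partial_\nu X^\mu \big) + O(\epsilon^2) \\ &= \sum_\mu \omega_\mu Y^\mu + \epsilon \sum_\mu \big( X^\nu \partial_\nu \omega_\mu + \omega_\nu \partial_\mu X^\nu \big) Y^\mu + O(\epsilon^2) \end{aligned} \tag{5.32}$$
It follows that
$$\lim_{\epsilon \to 0} \frac{1}{\epsilon} \big( \sigma(\epsilon)^\star \omega - \omega \big)(Y) = \sum_\mu \big( X^\nu \partial_\nu \omega_\mu + \omega_\nu \partial_\mu X^\nu \big)\, dx^\mu (Y) \tag{5.33}$$
and the theorem follows.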
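For a concrete instance of the component formula $(L_X \omega)_\mu = X^\nu \partial_\nu \omega_\mu + \omega_\nu \partial_\mu X^\nu$, the sketch below uses the dilation field $X = x\,\partial_x + y\,\partial_y$ and the co-vector field $\omega = x\, dy$ on $\mathbb{R}^2$. Both are illustrative choices made here.

```latex
\begin{aligned}
\omega_x &= 0, \quad \omega_y = x, \qquad X^x = x, \quad X^y = y, \\
(L_X \omega)_x &= X^\nu \partial_\nu \omega_x + \omega_\nu \partial_x X^\nu
   = 0 + \omega_y\, \partial_x X^y = 0, \\
(L_X \omega)_y &= X^\nu \partial_\nu \omega_y + \omega_\nu \partial_y X^\nu
   = x + x = 2x, \\
L_X \omega &= 2x\, dy .
\end{aligned}
```

This matches what the flow picture suggests: the dilation flow is $\sigma_\epsilon(x, y) = (e^\epsilon x, e^\epsilon y)$, so $\sigma_\epsilon^\star (x\, dy) = e^{2\epsilon} x\, dy$, whose derivative at $\epsilon = 0$ is $2x\, dy$.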

5.3 Tensors
We can now construct the definition of an $(r, s)$ tensor. First we need to recall the definition of the tensor product. If $V, W$ are two vector spaces with bases $\{v_i : i = 1, \dots, n\}$ and $\{w_a : a = 1, \dots, m\}$ respectively, then the direct sum $V \oplus W$ is an $(n + m)$-dimensional vector space with basis $\{v_i, w_a : i = 1, \dots, n;\ a = 1, \dots, m\}$, i.e.
$$V \oplus W = \mathrm{Span}_{i,a}\{v_i, w_a\} \tag{5.34}$$
so that a general element is
$$\sum_{i=1}^n a^i v_i + \sum_{a=1}^m b^a w_a \tag{5.35}$$
where the sum is interpreted as a formal sum.

We can also construct the tensor product of $V$ and $W$, which is spanned by $\{v_i \otimes w_a : i = 1, \dots, n;\ a = 1, \dots, m\}$, i.e.
$$V \otimes W = \mathrm{Span}_{i,a}\{v_i \otimes w_a\} \tag{5.36}$$
where $v_i \otimes w_a$ is a formal product. A general element is
$$\sum_{i=1}^n \sum_{a=1}^m c^{ia}\, v_i \otimes w_a . \tag{5.37}$$
and we have the following identities:
$$\begin{aligned} (v_1 + v_2) \otimes w &= (v_1 \otimes w) + (v_2 \otimes w) \\ v \otimes (w_1 + w_2) &= (v \otimes w_1) + (v \otimes w_2) \\ (\lambda v) \otimes w &= v \otimes (\lambda w) = \lambda (v \otimes w) \end{aligned} \tag{5.38}$$
for $v, v_1, v_2 \in V$, $w, w_1, w_2 \in W$ and $\lambda \in \mathbb{R}$. $V \otimes W$ is an $nm$-dimensional vector space.


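Definition 5.4. An $(r, s)$-tensor $T$ at a point $p \in M$ is an element of
$$T_p^{(r,s)} M = \otimes^r T_pM \otimes \otimes^s T_p^\star M \tag{5.39}$$
where $\otimes^r$ denotes the $r$-th tensor product.
It follows that, given a local coordinate system, a local basis for $(r, s)$-tensors is given by
$$\frac{\partial}{\partial x^{\mu_1}}\Big|_p \otimes \dots \otimes \frac{\partial}{\partial x^{\mu_r}}\Big|_p \otimes dx^{\nu_1}|_p \otimes \dots \otimes dx^{\nu_s}|_p \tag{5.40}$$
and if we write
$$T = \sum T^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} \frac{\partial}{\partial x^{\mu_1}}\Big|_p \otimes \dots \otimes \frac{\partial}{\partial x^{\mu_r}}\Big|_p \otimes dx^{\nu_1}|_p \otimes \dots \otimes dx^{\nu_s}|_p \tag{5.41}$$
then the components can be computed as
$$T^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} = T\left( dx^{\mu_1}|_p, \dots, dx^{\mu_r}|_p, \frac{\partial}{\partial x^{\nu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\nu_s}}\Big|_p \right) \tag{5.42}$$
Problem 5.2. Show that if in some local coordinate system
$$T = \sum A^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} \frac{\partial}{\partial x^{\mu_1}}\Big|_p \otimes \dots \otimes \frac{\partial}{\partial x^{\mu_r}}\Big|_p \otimes dx^{\nu_1}|_p \otimes \dots \otimes dx^{\nu_s}|_p \tag{5.43}$$
is an $(r, s)$-tensor, and
$$X_i = \sum_\mu X_i^\mu \frac{\partial}{\partial x^\mu}\Big|_p , \qquad i = 1, \dots, s \tag{5.44}$$
are $s$ vectors and
$$\omega^I = \sum_\mu \omega^I_\mu\, dx^\mu|_p , \qquad I = 1, \dots, r \tag{5.45}$$
are $r$ co-vectors then
$$T(\omega^1, \dots, \omega^r, X_1, \dots, X_s) = \sum A^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s}\, \omega^1_{\mu_1} \dots \omega^r_{\mu_r} X_1^{\nu_1} \dots X_s^{\nu_s} \tag{5.46}$$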

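Problem 5.3. Suppose that $T$ is an $(r, s)$-tensor, and that $(U_1, \phi_1 = (x^1, \dots, x^n))$ and $(U_2, \phi_2 = (y^1, \dots, y^n))$ are two local coordinate charts with $U_1 \cap U_2 \neq \emptyset$, such that
$$T = \sum A^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} \frac{\partial}{\partial x^{\mu_1}}\Big|_p \otimes \dots \otimes \frac{\partial}{\partial x^{\mu_r}}\Big|_p \otimes dx^{\nu_1}|_p \otimes \dots \otimes dx^{\nu_s}|_p \tag{5.47}$$
and
$$T = \sum B^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} \frac{\partial}{\partial y^{\mu_1}}\Big|_p \otimes \dots \otimes \frac{\partial}{\partial y^{\mu_r}}\Big|_p \otimes dy^{\nu_1}|_p \otimes \dots \otimes dy^{\nu_s}|_p \tag{5.48}$$
Then
$$B^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} = \sum_{\lambda_1 = 1}^n \dots \sum_{\lambda_r = 1}^n \sum_{\rho_1 = 1}^n \dots \sum_{\rho_s = 1}^n A^{\lambda_1 \dots \lambda_r}{}_{\rho_1 \dots \rho_s} \frac{\partial y^{\mu_1}}{\partial x^{\lambda_1}}\Big|_p \dots \frac{\partial y^{\mu_r}}{\partial x^{\lambda_r}}\Big|_p \frac{\partial x^{\rho_1}}{\partial y^{\nu_1}}\Big|_p \dots \frac{\partial x^{\rho_s}}{\partial y^{\nu_s}}\Big|_p \tag{5.49}$$
where $y^\mu(x)$ is the $\mu$-th component of the transition function $\phi_2 \circ \phi_1^{-1}$, and $x^\rho(y)$ is the $\rho$-th component of the transition function $\phi_1 \circ \phi_2^{-1}$.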

Definition 5.5. An $(r, s)$-tensor field is a map
$$T : M \to (\otimes^r TM) \otimes (\otimes^s T^\star M) \quad \text{such that} \quad T(p) \in (\otimes^r T_pM) \otimes (\otimes^s T_p^\star M) \tag{5.50}$$
which is smooth in the sense that for any choice of $r$ smooth co-vector fields $\omega^1, \dots, \omega^r$, and $s$ smooth vector fields $V^1, \dots, V^s$, the map $T(\omega^1, \dots, \omega^r, V^1, \dots, V^s) : M \to \mathbb{R}$ is $C^\infty$.
Some tensor fields have special names.
(i) A $(0, 0)$-tensor is a scalar; as a field it assigns a number to each point in $M$.
(ii) A $(1, 0)$-tensor is a vector; as a field it assigns a tangent vector to each point in $M$.
(iii) A $(0, 1)$-tensor is a 1-form; as a field it assigns a co-vector to each point in $M$.
Sometimes, especially in older books, $(r, 0)$-tensors are called contravariant, and $(0, s)$-tensors are called covariant.
An $(r, 0)$ tensor is called symmetric if
$$T(\omega^{P(1)}, \dots, \omega^{P(r)}) = T(\omega^1, \dots, \omega^r) \tag{5.51}$$
and similarly a $(0, s)$ tensor is called symmetric if
$$T(V^{P(1)}, \dots, V^{P(s)}) = T(V^1, \dots, V^s) \tag{5.52}$$
On the other hand, they are called anti-symmetric if
$$T(\omega^{P(1)}, \dots, \omega^{P(r)}) = \mathrm{sgn}(P)\, T(\omega^1, \dots, \omega^r) \tag{5.53}$$
or
$$T(V^{P(1)}, \dots, V^{P(s)}) = \mathrm{sgn}(P)\, T(V^1, \dots, V^s) \tag{5.54}$$
Here $P$ is a permutation, and $\mathrm{sgn}(P)$ is its sign. Recall that $P$ is a bijection $P : \{1, \dots, k\} \to \{1, \dots, k\}$, and can always be written in terms of either an even ($\mathrm{sgn}(P) = 1$) or odd ($\mathrm{sgn}(P) = -1$) number of interchanges (where two neighbouring integers are swapped, and everything else is unaltered).
Similarly, one can talk about symmetry properties of the $(r, 0)$ and $(0, s)$ components of a mixed $(r, s)$-tensor separately.
It is straightforward to extend the Lie derivative to act on $(r, s)$ tensors by requiring that
(i) $L_X(T_1 + T_2) = L_X T_1 + L_X T_2$, where $T_1, T_2$ are $(r, s)$ tensors.
(ii) $L_X(\lambda T) = \lambda (L_X T)$, where $T$ is an $(r, s)$ tensor and $\lambda \in \mathbb{R}$ is constant.
(iii) $L_X(T_1 \otimes T_2) = (L_X T_1) \otimes T_2 + T_1 \otimes (L_X T_2)$, where $T_1, T_2$ are $(r, s)$ and $(r', s')$ tensors.

6. Differential Forms
N.B.: Conventions about forms can vary from book to book (by various factors
of p! and minus signs), so be careful when comparing different sources.
6.1 Forms
Definition 6.1. A $p$-form on a manifold $M$ is a smooth anti-symmetric $(0, p)$-tensor field on $M$. In particular, if $\omega$ is a $p$-form then
$$\omega(X_{P(1)}, \dots, X_{P(p)}) = \mathrm{sgn}(P)\, \omega(X_1, \dots, X_p) \tag{6.1}$$
Definition 6.2. A 0-form on $M$ is a function in $C^\infty(M)$.
Theorem 6.1. If $M$ is $n$-dimensional then all $p$-forms with $p > n$ vanish.
Proof. First note that a $p$-form acting on a set of vectors with the same vector appearing twice vanishes, because
$$\begin{aligned} \omega(X_1, \dots, Y, X_2, \dots, Y, X_3, \dots) &= \mathrm{sgn}(P)\, \omega(Y, Y, X_1, \dots) \\ &= (-1)\, \mathrm{sgn}(P)\, \omega(Y, Y, X_1, \dots) \\ &= (-1) (\mathrm{sgn}(P))^2\, \omega(X_1, \dots, Y, X_2, \dots, Y, X_3, \dots) \\ &= -\omega(X_1, \dots, Y, X_2, \dots, Y, X_3, \dots) \end{aligned} \tag{6.2}$$
where in the first line we used a permutation $P$ to place the two $Y$ vectors next to each other, in the second line we used an interchange to swap the two $Y$-vectors, which introduces an extra coefficient of $-1$, and in the third line we applied the inverse permutation to $P$ to restore the original order of the vector fields (recall that the sign of $P^{-1}$ is the same as that of $P$), which introduces another factor of $\mathrm{sgn}(P)$.
Now, if $\omega$ is a $p$-form with $p > n$, then in any collection of $p$ basis vectors, at least two must be the same. So $\omega$ evaluated on any such collection of basis vectors must vanish. Since it vanishes on any set of basis vectors, it must vanish identically.

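The space of $p$-forms on $M$ is denoted $\Omega^p(M, \mathbb{R})$, and we let
$$\Omega(M) = \Omega^0(M, \mathbb{R}) \oplus \Omega^1(M, \mathbb{R}) \oplus \dots \oplus \Omega^n(M, \mathbb{R}) \tag{6.3}$$
Here we have included an explicit reference to the field $\mathbb{R}$ over which the manifold is defined.
Note that if $\omega$ and $\eta$ are a $p$-form and $q$-form respectively, then $\omega \otimes \eta$ will be a $(0, p+q)$-tensor field, but not a $(p+q)$-form, since it will not be antisymmetric.
However, one can construct a $(p+q)$-form from a $p$-form and a $q$-form using the wedge product $\wedge$:

Definition 6.3. If $\omega \in \Omega^p(M)$ and $\eta \in \Omega^q(M)$ then
$$(\omega \wedge \eta)(X_1, \dots, X_{p+q}) = \sum_P \mathrm{sgn}(P)\, (\omega \otimes \eta)(X_{P(1)}, \dots, X_{P(p+q)}) \tag{6.4}$$
where
$$(\omega \otimes \eta)(X_{P(1)}, \dots, X_{P(p+q)}) = \omega(X_{P(1)}, \dots, X_{P(p)})\, .\, \eta(X_{P(p+1)}, \dots, X_{P(p+q)}) \tag{6.5}$$
Note that $\omega \wedge \eta$ is clearly a $(0, p+q)$ tensor, because each term in the above sum is. Also, if $Q$ is some permutation, then
$$\begin{aligned} (\omega \wedge \eta)(X_{Q(1)}, \dots, X_{Q(p+q)}) &= \sum_P \mathrm{sgn}(P)\, (\omega \otimes \eta)(X_{QP(1)}, \dots, X_{QP(p+q)}) \\ &= \sum_P \mathrm{sgn}(Q^{-1}P)\, (\omega \otimes \eta)(X_{QQ^{-1}P(1)}, \dots, X_{QQ^{-1}P(p+q)}) \\ &= \sum_P \mathrm{sgn}(Q^{-1}P)\, (\omega \otimes \eta)(X_{P(1)}, \dots, X_{P(p+q)}) \\ &= \mathrm{sgn}(Q) \sum_P \mathrm{sgn}(P)\, (\omega \otimes \eta)(X_{P(1)}, \dots, X_{P(p+q)}) \\ &= \mathrm{sgn}(Q)\, (\omega \wedge \eta)(X_1, \dots, X_{p+q}) \end{aligned} \tag{6.6}$$
where in going from the first to the second line, the permutation $P$ in the sum is replaced with $Q^{-1}P$ (this leaves the sum unaffected). So $\omega \wedge \eta$ is indeed a $(p+q)$-form.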
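Example:
$$dx^\mu|_p \wedge dx^\nu|_p = dx^\mu|_p \otimes dx^\nu|_p - dx^\nu|_p \otimes dx^\mu|_p \tag{6.7}$$
Example:
$$\begin{aligned} dx^\mu|_p \wedge (dx^\nu|_p \wedge dx^\lambda|_p) &= dx^\mu|_p \otimes dx^\nu|_p \otimes dx^\lambda|_p - dx^\mu|_p \otimes dx^\lambda|_p \otimes dx^\nu|_p \\ &\quad + dx^\nu|_p \otimes dx^\lambda|_p \otimes dx^\mu|_p - dx^\nu|_p \otimes dx^\mu|_p \otimes dx^\lambda|_p \\ &\quad + dx^\lambda|_p \otimes dx^\mu|_p \otimes dx^\nu|_p - dx^\lambda|_p \otimes dx^\nu|_p \otimes dx^\mu|_p \end{aligned} \tag{6.8}$$
Note: by convention if $f \in \Omega^0(M, \mathbb{R})$ and $\omega \in \Omega^k(M, \mathbb{R})$ then
$$f \wedge \omega = \omega \wedge f = f \omega . \tag{6.9}$$
Theorem 6.2. The wedge product satisfies:
(i) $(\omega_1 + \omega_2) \wedge \eta = (\omega_1 \wedge \eta) + (\omega_2 \wedge \eta)$
(ii) $\omega \wedge (\eta_1 + \eta_2) = (\omega \wedge \eta_1) + (\omega \wedge \eta_2)$
(iii) $(\lambda \omega) \wedge \eta = \omega \wedge (\lambda \eta) = \lambda (\omega \wedge \eta)$
for $\omega, \omega_1, \omega_2 \in \Omega^p(M)$, $\eta, \eta_1, \eta_2 \in \Omega^q(M)$ and $\lambda \in \mathbb{R}$.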

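Proof. This follows directly from the definition of the wedge product, and is left as an exercise.
Theorem 6.3. If $\omega \in \Omega^p(M)$ and $\eta \in \Omega^q(M)$ then $\omega \wedge \eta = (-1)^{pq}\, \eta \wedge \omega$.
Proof. Let $Q$ be the permutation that maps the ordered list
$$\{1, \dots, p, p+1, \dots, p+q\} \to \{q+1, \dots, q+p, 1, \dots, q\} \tag{6.10}$$
Note that $\mathrm{sgn}(Q) = (-1)^{pq}$.
Then
$$\begin{aligned} (\omega \wedge \eta)&(X_1, \dots, X_p, X_{p+1}, \dots, X_{p+q}) \\ &= \sum_P \mathrm{sgn}(P)\, (\omega \otimes \eta)(X_{P(1)}, \dots, X_{P(p)}, X_{P(p+1)}, \dots, X_{P(p+q)}) \\ &= \sum_P \mathrm{sgn}(PQ)\, (\omega \otimes \eta)(X_{PQ(1)}, \dots, X_{PQ(p)}, X_{PQ(p+1)}, \dots, X_{PQ(p+q)}) \\ &= \sum_P \mathrm{sgn}(P)\, \mathrm{sgn}(Q)\, (\omega \otimes \eta)(X_{P(q+1)}, \dots, X_{P(q+p)}, X_{P(1)}, \dots, X_{P(q)}) \\ &= (-1)^{pq} \sum_P \mathrm{sgn}(P)\, (\omega \otimes \eta)(X_{P(q+1)}, \dots, X_{P(q+p)}, X_{P(1)}, \dots, X_{P(q)}) \\ &= (-1)^{pq} \sum_P \mathrm{sgn}(P)\, (\eta \otimes \omega)(X_{P(1)}, \dots, X_{P(q)}, X_{P(q+1)}, \dots, X_{P(p+q)}) \\ &= (-1)^{pq}\, (\eta \wedge \omega)(X_1, \dots, X_p, X_{p+1}, \dots, X_{p+q}) \end{aligned} \tag{6.11}$$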
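A small example of the sign rule for two 1-forms ($p = q = 1$, so $(-1)^{pq} = -1$) is sketched below. The forms $\omega = dx^1 + dx^2$ and $\eta = dx^1 - dx^2$ are illustrative choices; the computation uses only bilinearity of the wedge product and $dx^\mu \wedge dx^\nu = -dx^\nu \wedge dx^\mu$.

```latex
\begin{aligned}
\omega \wedge \eta
 &= dx^1 \wedge dx^1 - dx^1 \wedge dx^2 + dx^2 \wedge dx^1 - dx^2 \wedge dx^2 \\
 &= -2\, dx^1 \wedge dx^2, \\
\eta \wedge \omega
 &= dx^1 \wedge dx^1 + dx^1 \wedge dx^2 - dx^2 \wedge dx^1 - dx^2 \wedge dx^2 \\
 &= 2\, dx^1 \wedge dx^2 = -\, \omega \wedge \eta .
\end{aligned}
```

In particular any 1-form wedged with itself vanishes, whereas two 2-forms commute under the wedge product because $(-1)^{2 \cdot 2} = 1$.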

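Problem 6.1. If
$$\begin{aligned} \omega &= A_{12}\, dx^1 \wedge dx^2 + A_{34}\, dx^3 \wedge dx^4 \\ \eta &= B_{123}\, dx^1 \wedge dx^2 \wedge dx^3 + B_{125}\, dx^1 \wedge dx^2 \wedge dx^5 \end{aligned} \tag{6.12}$$
then what is $\omega \wedge \eta$? (N.B. here we have dropped the subscript $p$ for convenience.)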
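Theorem 6.4. A basis for $\Omega^k(M)$ at a point $p \in M$ is given by
$$dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p \tag{6.13}$$
for $\mu_1 < \dots < \mu_k$.
Proof. Suppose $\omega \in \Omega^k(M)$; define
$$\omega_{\mu_1, \dots, \mu_k} = \omega\left( \frac{\partial}{\partial x^{\mu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_k}}\Big|_p \right) \tag{6.14}$$
Also, from the definition of a $k$-form
$$\begin{aligned} \omega_{\mu_{P(1)}, \dots, \mu_{P(k)}} &= \omega\left( \frac{\partial}{\partial x^{\mu_{P(1)}}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_{P(k)}}}\Big|_p \right) \\ &= \mathrm{sgn}(P)\, \omega\left( \frac{\partial}{\partial x^{\mu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_k}}\Big|_p \right) \\ &= \mathrm{sgn}(P)\, \omega_{\mu_1, \dots, \mu_k} \end{aligned} \tag{6.15}$$
for any permutation $P$. It then follows that
$$\begin{aligned} \frac{1}{k!} \sum_{\mu_i} \omega_{\mu_1, \dots, \mu_k}\, dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p &= \frac{1}{k!} \sum_{\mu_i} \sum_P \mathrm{sgn}(P)\, \omega_{\mu_1 \dots \mu_k}\, dx^{\mu_{P(1)}}|_p \otimes \dots \otimes dx^{\mu_{P(k)}}|_p \\ &= \frac{1}{k!} \sum_{\mu_i} \sum_P \omega_{\mu_{P(1)}, \dots, \mu_{P(k)}}\, dx^{\mu_{P(1)}}|_p \otimes \dots \otimes dx^{\mu_{P(k)}}|_p \\ &= \sum_{\mu_i} \omega_{\mu_1, \dots, \mu_k}\, dx^{\mu_1}|_p \otimes \dots \otimes dx^{\mu_k}|_p \\ &= \omega \end{aligned} \tag{6.16}$$
This shows that the $dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p$ span the space of $k$-forms.
Now consider the $k$-form $dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p$. This either vanishes, because not all the $\mu_i$ are distinct, or there exists a permutation $P$ of $1, \dots, k$ such that
$$dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p = \mathrm{sgn}(P)\, dx^{\mu_{P(1)}}|_p \wedge \dots \wedge dx^{\mu_{P(k)}}|_p \tag{6.17}$$
with $\mu_{P(1)} < \dots < \mu_{P(k)}$. It therefore follows that the $k$-forms
$$dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p \tag{6.18}$$
for $\mu_1 < \dots < \mu_k$ span $\Omega^k(M)$.
Furthermore, note that by definition, if $\mu_1 < \dots < \mu_k$ and $\nu_1 < \dots < \nu_k$ then
$$\big( dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p \big) \left( \frac{\partial}{\partial x^{\nu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\nu_k}}\Big|_p \right) = \begin{cases} 1 & \text{if } \mu_1 = \nu_1, \dots, \mu_k = \nu_k \\ 0 & \text{otherwise} \end{cases} \tag{6.19}$$
It follows that if
$$\sum_{\mu_1 < \dots < \mu_k} A_{\mu_1 \dots \mu_k}\, dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p = 0 \tag{6.20}$$
then acting on $\left( \frac{\partial}{\partial x^{\nu_1}}\big|_p, \dots, \frac{\partial}{\partial x^{\nu_k}}\big|_p \right)$ with $\nu_1 < \dots < \nu_k$ implies that $A_{\nu_1 \dots \nu_k} = 0$, so the $k$-forms $dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_k}|_p$ for $\mu_1 < \dots < \mu_k$ are linearly independent. Hence these $k$-forms form a basis for $\Omega^k(M)$.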
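One immediate consequence is a count of basis elements: choosing $\mu_1 < \dots < \mu_k$ from $n$ values gives $\binom{n}{k}$ independent $k$-forms at each point. The listing below spells this out for $n = 3$ as an illustration (the specific case is chosen here, not in the notes).

```latex
\begin{array}{lll}
k = 0: & 1 & \binom{3}{0} = 1 \\
k = 1: & dx^1,\; dx^2,\; dx^3 & \binom{3}{1} = 3 \\
k = 2: & dx^1 \wedge dx^2,\; dx^1 \wedge dx^3,\; dx^2 \wedge dx^3 & \binom{3}{2} = 3 \\
k = 3: & dx^1 \wedge dx^2 \wedge dx^3 & \binom{3}{3} = 1
\end{array}
```

The total is $1 + 3 + 3 + 1 = 2^3$, and there is nothing for $k > 3$, consistent with the statement that $p$-forms with $p > n$ vanish.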

6.2 Exterior Derivative


We can define a notion of a derivative on $k$-forms by
Definition 6.4. If $\omega \in \Omega^k(M)$, then the exterior derivative, $d\omega$, is given by
$$\begin{aligned} d\omega(X_1, \dots, X_{k+1}) &= \sum_{i}^{k+1} (-1)^{i+1} X_i\big( \omega(X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_{k+1}) \big) \\ &\quad + \sum_{i<j}^{k+1} (-1)^{i+j}\, \omega([X_i, X_j], X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_{j-1}, X_{j+1}, \dots, X_{k+1}) \end{aligned} \tag{6.21}$$

Theorem 6.5. If $\omega \in \Omega^k(M)$ then $d\omega \in \Omega^{k+1}(M)$.
Proof. To start with, we shall prove that $d\omega$ is a $(0, k+1)$ tensor. Having established that, we will prove that it is antisymmetric by evaluating its components in a local basis.
First, fix $\ell$ such that $1 \le \ell \le k+1$. Note that from the construction given in (6.21), one finds
$$d\omega(X_1, \dots, X_\ell + Y_\ell, \dots, X_{k+1}) = d\omega(X_1, \dots, X_\ell, \dots, X_{k+1}) + d\omega(X_1, \dots, Y_\ell, \dots, X_{k+1}) \tag{6.22}$$
Next, suppose that $f \in C^\infty(M)$.
To simplify the expressions in what follows, it will be convenient to denote by $\{X\}_i$ the (ordered) $k$-tuple obtained by removing $X_i$ from the (ordered) list of $k+1$ vector fields $X_1, \dots, X_{k+1}$; and by $\{X\}_{i,j}$ the (ordered) $(k-1)$-tuple obtained by removing $X_i$ and $X_j$ from $X_1, \dots, X_{k+1}$ ($i \neq j$).
We can proceed to evaluate
$$\begin{aligned} d\omega(X_1, \dots, fX_\ell, \dots, X_{k+1}) &= \sum_{i \neq \ell} (-1)^{i+1} X_i\big( f \omega(\{X\}_i) \big) + (-1)^{\ell+1} f X_\ell\big( \omega(\{X\}_\ell) \big) \\ &\quad + \sum_{\substack{i<j \\ i,j \neq \ell}} f (-1)^{i+j}\, \omega([X_i, X_j], \{X\}_{i,j}) \\ &\quad + \sum_{\ell<j} (-1)^{\ell+j}\, \omega([fX_\ell, X_j], \{X\}_{\ell,j}) + \sum_{i<\ell} (-1)^{i+\ell}\, \omega([X_i, fX_\ell], \{X\}_{i,\ell}) \end{aligned} \tag{6.23}$$
where the first line of (6.23) corresponds to the first line of (6.21) and the last two lines of (6.23) come from the second line of (6.21), on restricting the various summations appropriately. Recall that the Leibniz rule implies
$$X_i\big( f \omega(\{X\}_i) \big) = (X_i(f))\, \omega(\{X\}_i) + f X_i\big( \omega(\{X\}_i) \big) \tag{6.24}$$
Using this, the first line of (6.23) can be rewritten as
$$f \sum_i (-1)^{i+1} X_i\big( \omega(\{X\}_i) \big) + \sum_{i \neq \ell} (-1)^{i+1} (X_i(f))\, \omega(\{X\}_i) \tag{6.25}$$
Also, on recalling that
$$[X_i, fX_j] = f[X_i, X_j] + (X_i(f)) X_j \tag{6.26}$$
the remainder of (6.23) can be rewritten as
$$\begin{aligned} & f \sum_{i<j} (-1)^{i+j}\, \omega([X_i, X_j], \{X\}_{i,j}) \\ &\quad + \sum_{\ell<j} (-1)(-1)^{\ell+j} (X_j(f))\, \omega(X_\ell, \{X\}_{\ell,j}) + \sum_{i<\ell} (-1)^{i+\ell} (X_i(f))\, \omega(X_\ell, \{X\}_{i,\ell}) \\ &= f \sum_{i<j} (-1)^{i+j}\, \omega([X_i, X_j], \{X\}_{i,j}) + \sum_{i \neq \ell} (-1)^i (X_i(f))\, \omega(\{X\}_i) \end{aligned} \tag{6.27}$$
On adding together the contributions from (6.25) and (6.27), one finds that the terms involving $X_i(f)$ cancel, and one finds
$$d\omega(X_1, \dots, fX_\ell, \dots, X_{k+1}) = f\, d\omega(X_1, \dots, X_\ell, \dots, X_{k+1}) \tag{6.28}$$

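Having established linearity, we compute the components of $d\omega$ in the usual basis of vectors
$$d\omega\left( \frac{\partial}{\partial x^{\mu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_{k+1}}}\Big|_p \right) = \sum_{i=1}^{k+1} (-1)^{i+1} \frac{\partial}{\partial x^{\mu_i}}\Big|_p\, \omega\left( \frac{\partial}{\partial x^{\mu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_{i-1}}}\Big|_p, \frac{\partial}{\partial x^{\mu_{i+1}}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_{k+1}}}\Big|_p \right) \tag{6.29}$$
where we have used the fact that
$$\left[ \frac{\partial}{\partial x^\mu}\Big|_p, \frac{\partial}{\partial x^\nu}\Big|_p \right] = 0 \tag{6.30}$$
so that the second term in the definition of $d\omega$ (6.21) vanishes. So, on defining the components of $d\omega$ as
$$d\omega_{\mu_1, \dots, \mu_{k+1}} = d\omega\left( \frac{\partial}{\partial x^{\mu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\mu_{k+1}}}\Big|_p \right) \tag{6.31}$$
one obtains
$$\begin{aligned} d\omega_{\mu_1, \dots, \mu_{k+1}} &= \sum_{i=1}^{k+1} (-1)^{i+1} \partial_{\mu_i} \omega_{\mu_1, \dots, \mu_{i-1}, \mu_{i+1}, \dots, \mu_{k+1}} \\ &= (k+1)\, \partial_{[\mu_1} \omega_{\mu_2, \dots, \mu_{k+1}]} \end{aligned} \tag{6.32}$$
where the square brackets are defined as
$$Y_{[\mu_1, \dots, \mu_k]} = \frac{1}{k!} \sum_P \mathrm{sgn}(P)\, Y_{\mu_{P(1)}, \dots, \mu_{P(k)}} \tag{6.33}$$
It is clear that the components of $d\omega$ are antisymmetric.

Note that in this coordinate basis
$$\begin{aligned} d\omega &= \frac{1}{(k+1)!} \sum (d\omega)_{\mu_1, \dots, \mu_{k+1}}\, dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_{k+1}}|_p \\ &= \frac{1}{k!} \sum \partial_{[\mu_1} \omega_{\mu_2, \dots, \mu_{k+1}]}\, dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_{k+1}}|_p \\ &= \frac{1}{k!} \sum \partial_{\mu_1} \omega_{\mu_2, \dots, \mu_{k+1}}\, dx^{\mu_1}|_p \wedge \dots \wedge dx^{\mu_{k+1}}|_p \end{aligned} \tag{6.34}$$
where the last line follows due to the antisymmetry of the wedge product terms.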
Before proceeding further, we shall establish a correspondence between the exterior derivative acting on the co-ordinate functions (which although not globally defined, are nevertheless locally defined), and the dual operators.
In particular we have denoted by $dx^\mu|_p$ the dual vectors to the tangent vectors $\frac{\partial}{\partial x^\mu}\big|_p$ at $p \in M$, defined in equation (5.8).
Suppose that $(x^1, \dots, x^n)$ is some local co-ordinate system associated with the chart $(U_i, \phi_i)$. Note that $x^\mu(p) = (\phi_i(p))^\mu$ are $C^\infty(U_i)$ functions, and hence we can (locally) define their exterior derivative $dx^\mu$ where
$$dx^\mu(X) = X(x^\mu) \tag{6.35}$$
for any (locally defined) smooth vector field $X$. So $dx^\mu$ is a (smooth) 1-form locally defined on the patch $U_i$. To evaluate this 1-form at $p \in U_i$ note that
$$dx^\mu\Big|_p \left( \frac{\partial}{\partial x^\nu}\Big|_p \right) = \frac{\partial}{\partial x^\nu}\Big|_p (x^\mu) = \delta^\mu_\nu \tag{6.36}$$
where on the LHS of this expression $dx^\mu\big|_p$ denotes the evaluation of $dx^\mu$ at $p$. From the above equation it is clear that $dx^\mu$ evaluated at $p$ is identical to the dual vector $dx^\mu|_p$ as defined in equation (5.8), and so our choice of notation in section 5 is consistent with the exterior derivative as defined in this section.

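Example: Consider $\mathbb{R}^3$. A 0-form $f$ is just a function, and
$$df = \sum_\mu \partial_\mu f\, dx^\mu \tag{6.37}$$
is just the gradient of $f$. A 1-form $\omega = \sum_\mu \omega_\mu dx^\mu$ has
$$\begin{aligned} d\omega &= \sum_{\mu, \nu} \partial_\mu \omega_\nu\, dx^\mu \wedge dx^\nu \\ &= (\partial_1 \omega_2 - \partial_2 \omega_1)\, dx^1 \wedge dx^2 + (\partial_2 \omega_3 - \partial_3 \omega_2)\, dx^2 \wedge dx^3 + (\partial_3 \omega_1 - \partial_1 \omega_3)\, dx^3 \wedge dx^1 \end{aligned} \tag{6.38}$$
whose components are those of $\mathrm{curl}(\omega)$. Lastly, for a 2-form $\omega = \frac{1}{2} \sum_{\mu, \nu} \omega_{\mu\nu}\, dx^\mu \wedge dx^\nu$, we have
$$d\omega = \frac{1}{2} \sum_{\lambda, \mu, \nu} \partial_\lambda \omega_{\mu\nu}\, dx^\lambda \wedge dx^\mu \wedge dx^\nu = (\partial_1 \omega_{23} + \partial_2 \omega_{31} + \partial_3 \omega_{12})\, dx^1 \wedge dx^2 \wedge dx^3 \tag{6.39}$$
and these are the components of $\mathrm{div}(\tilde\omega)$ where
$$\tilde\omega = \omega_{23}\, dx^1 + \omega_{31}\, dx^2 + \omega_{12}\, dx^3 \tag{6.40}$$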

Next we prove the most important property of the exterior derivative


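Theorem 6.6. $d^2 = 0$
Proof. Let us choose a coordinate system as previously, so that
$$d\omega = \frac{1}{k!} \sum \partial_\mu \omega_{\mu_1, \dots, \mu_k}\, dx^\mu \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_k} \tag{6.41}$$
Then it follows that
$$d^2\omega = \frac{1}{k!} \sum \partial_\nu \partial_\mu \omega_{\mu_1, \dots, \mu_k}\, dx^\nu \wedge dx^\mu \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_k} \tag{6.42}$$
However this vanishes, because $dx^\nu \wedge dx^\mu$ is antisymmetric in $\mu, \nu$ whereas
$$\partial_\nu \partial_\mu \omega_{\mu_1, \dots, \mu_k} = \partial_\mu \partial_\nu \omega_{\mu_1, \dots, \mu_k} \tag{6.43}$$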
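The cancellation behind $d^2 = 0$ is easy to see on a single function. The sketch below takes $f = x^2 y$ on $\mathbb{R}^2$ with coordinates $(x, y)$; the function is an arbitrary illustrative choice.

```latex
\begin{aligned}
df &= 2xy\, dx + x^2\, dy, \\
d(df) &= \partial_y(2xy)\, dy \wedge dx + \partial_x(x^2)\, dx \wedge dy \\
      &= 2x\, dy \wedge dx + 2x\, dx \wedge dy = 0 .
\end{aligned}
```

The two terms are the mixed partial derivatives $\partial_y \partial_x f$ and $\partial_x \partial_y f$, which are equal, multiplied by $dy \wedge dx$ and $dx \wedge dy$, which differ by a sign. In $\mathbb{R}^3$ the same mechanism gives the familiar identities that the curl of a gradient and the divergence of a curl vanish.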

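Theorem 6.7. If $\omega \in \Omega^p(M)$ and $\eta \in \Omega^q(M)$, then
$$d(\omega \wedge \eta) = (d\omega) \wedge \eta + (-1)^p\, \omega \wedge d\eta \tag{6.44}$$
Proof. In components we have
$$\omega = \frac{1}{p!} \sum \omega_{\mu_1 \dots \mu_p}\, dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \tag{6.45}$$
and
$$\eta = \frac{1}{q!} \sum \eta_{\nu_1 \dots \nu_q}\, dx^{\nu_1} \wedge \dots \wedge dx^{\nu_q} \tag{6.46}$$
and hence
$$\begin{aligned} d(\omega \wedge \eta) &= \frac{1}{p!q!} \sum \partial_\lambda \big( \omega_{\mu_1 \dots \mu_p} \eta_{\nu_1 \dots \nu_q} \big)\, dx^\lambda \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_q} \\ &= \frac{1}{p!q!} \sum \big( \partial_\lambda \omega_{\mu_1 \dots \mu_p}\, \eta_{\nu_1 \dots \nu_q} + \omega_{\mu_1 \dots \mu_p}\, \partial_\lambda \eta_{\nu_1 \dots \nu_q} \big)\, dx^\lambda \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_q} \\ &= \frac{1}{p!q!} \sum \partial_\lambda \omega_{\mu_1 \dots \mu_p}\, \eta_{\nu_1 \dots \nu_q}\, dx^\lambda \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_q} \\ &\quad + (-1)^p \frac{1}{p!q!} \sum \omega_{\mu_1 \dots \mu_p}\, \partial_\lambda \eta_{\nu_1 \dots \nu_q}\, dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \wedge dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_q} \\ &= (d\omega) \wedge \eta + (-1)^p\, \omega \wedge d\eta \end{aligned} \tag{6.47}$$
where the factor of $(-1)^p$ appears because
$$dx^\lambda \wedge dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} = (-1)^p\, dx^{\mu_1} \wedge \dots \wedge dx^{\mu_p} \wedge dx^\lambda \tag{6.48}$$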

Let us return to the notion of a pull-back. This can be easily extended to any $p$-form. Consider a $C^\infty$ map $f : M \to N$ between two manifolds, and let $\omega$ be a $p$-form on $N$. Then we define
$$(f^\star \omega)_p (X_1, \dots, X_p) = \omega_{f(p)}(f_\star X_1, \dots, f_\star X_p) \tag{6.49}$$
for tangent vectors $X_i \in T_p(M)$. Clearly, $f^\star \omega$ is antisymmetric, because $\omega$ is, so if $\omega \in \Omega^p(N, \mathbb{R})$ then $f^\star \omega \in \Omega^p(M, \mathbb{R})$. We can also define the pull-back of a 0-form or function $g : N \to \mathbb{R}$ by the rule
$$f^\star g = g \circ f \tag{6.50}$$
so that $f^\star g : M \to \mathbb{R}$.

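Theorem 6.8. Let $f : M \to N$, $(y^1, \dots, y^n)$ be local coordinates on $V \subset N$, and $(x^1, \dots, x^m)$ local coordinates on $U \cap f^{-1}(V) \subset M$. If
$$\omega = \frac{1}{k!} \sum_{\mu=1}^n \omega_{\mu_1, \dots, \mu_k}(q)\, dy^{\mu_1}|_q \wedge \dots \wedge dy^{\mu_k}|_q \tag{6.51}$$
then
$$(f^\star \omega)_p = \frac{1}{k!} \sum_{\nu=1}^m \sum_{\mu=1}^n \omega_{\mu_1, \dots, \mu_k}(f(p)) \frac{\partial}{\partial x^{\nu_1}}(y^{\mu_1} \circ f) \dots \frac{\partial}{\partial x^{\nu_k}}(y^{\mu_k} \circ f)\, dx^{\nu_1}|_p \wedge \dots \wedge dx^{\nu_k}|_p \tag{6.52}$$
for $q \in N$, $p \in M$.
Proof. It is sufficient to compute
$$(f^\star \omega)_p \left( \frac{\partial}{\partial x^{\nu_1}}\Big|_p, \dots, \frac{\partial}{\partial x^{\nu_k}}\Big|_p \right) = \omega_{f(p)} \left( f_\star \frac{\partial}{\partial x^{\nu_1}}\Big|_p, \dots, f_\star \frac{\partial}{\partial x^{\nu_k}}\Big|_p \right) \tag{6.53}$$
We have already proven that
$$f_\star \frac{\partial}{\partial x^{\nu_i}}\Big|_p = \sum_{\mu_i} \frac{\partial}{\partial x^{\nu_i}}(y^{\mu_i} \circ f)\, \frac{\partial}{\partial y^{\mu_i}}\Big|_{f(p)} \tag{6.54}$$
and hence on substituting this back into (6.53), one obtains the result.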
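Theorem 6.9. The exterior derivative and the pull-back commute: $d(f^\star \omega) = f^\star d\omega$.
Proof. In a local coordinate system we have proven that
$$f^\star \omega = \frac{1}{k!} \sum \omega_{\mu_1, \dots, \mu_k}(f(x)) \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}}\, dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \tag{6.55}$$
where we have defined $f^\mu = y^\mu \circ f$. Therefore,
$$\begin{aligned} d f^\star \omega &= \frac{1}{k!} \sum \frac{\partial}{\partial x^\lambda} \left( \omega_{\mu_1, \dots, \mu_k}(f(x)) \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}} \right) dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &= \frac{1}{k!} \sum \frac{\partial}{\partial x^\lambda} \big( \omega_{\mu_1, \dots, \mu_k}(f(x)) \big) \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}}\, dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &\quad + \frac{1}{k!} \sum \omega_{\mu_1, \dots, \mu_k} \frac{\partial}{\partial x^\lambda} \left( \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}} \right) dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &= \frac{1}{k!} \sum \frac{\partial f^\rho}{\partial x^\lambda} \frac{\partial \omega_{\mu_1, \dots, \mu_k}}{\partial y^\rho}(f(x)) \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}}\, dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &\quad + \frac{1}{k!} \sum \omega_{\mu_1, \dots, \mu_k} \frac{\partial}{\partial x^\lambda} \left( \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}} \right) dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &= \frac{1}{k!} \sum \frac{\partial \omega_{\mu_1, \dots, \mu_k}}{\partial y^\rho}(f(x)) \frac{\partial f^\rho}{\partial x^\lambda} \frac{\partial f^{\mu_1}}{\partial x^{\nu_1}} \dots \frac{\partial f^{\mu_k}}{\partial x^{\nu_k}}\, dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k} \\ &= f^\star d\omega \end{aligned} \tag{6.56}$$
Note that line 4 of this expression is obtained by using the chain rule. The contribution from line 5 vanishes: this is because $\frac{\partial^2 f^{\mu_1}}{\partial x^\lambda \partial x^{\nu_1}}$ is symmetric in $\lambda, \nu_1$, whereas $dx^\lambda \wedge dx^{\nu_1} \wedge \dots \wedge dx^{\nu_k}$ is antisymmetric in $\lambda, \nu_1$; similarly all other contributions from $\frac{\partial^2 f^{\mu_j}}{\partial x^\lambda \partial x^{\nu_j}}$ terms vanish.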
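The statement $d(f^\star \omega) = f^\star d\omega$ can be verified by hand in a simple case. In the sketch below, $f : (r, \theta) \mapsto (x, y) = (r\cos\theta, r\sin\theta)$ is the polar coordinate map on a patch of $\mathbb{R}^2$ and $\omega = x\, dy$; both are illustrative choices not taken from the notes.

```latex
\begin{aligned}
f^\star \omega &= r\cos\theta\, \big( \sin\theta\, dr + r\cos\theta\, d\theta \big)
   = r\cos\theta\sin\theta\, dr + r^2\cos^2\theta\, d\theta, \\
d(f^\star \omega) &= \big( \partial_r (r^2\cos^2\theta)
   - \partial_\theta (r\cos\theta\sin\theta) \big)\, dr \wedge d\theta \\
   &= \big( 2r\cos^2\theta - r\cos^2\theta + r\sin^2\theta \big)\, dr \wedge d\theta
   = r\, dr \wedge d\theta, \\
f^\star (d\omega) &= f^\star (dx \wedge dy)
   = (\cos\theta\, dr - r\sin\theta\, d\theta) \wedge (\sin\theta\, dr + r\cos\theta\, d\theta)
   = r\, dr \wedge d\theta .
\end{aligned}
```

Both routes give $r\, dr \wedge d\theta$, the usual area element in polar coordinates.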

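There is also a particularly elegant relationship between the exterior derivative $d$, and the Lie derivative.
Definition 6.5. Given a vector field $Y$, the interior product $i_Y$ is a map $i_Y : \Omega^p(M) \to \Omega^{p-1}(M)$ defined by
$$(i_Y \omega)(X_1, \dots, X_{p-1}) = \omega(Y, X_1, \dots, X_{p-1}) \tag{6.57}$$
for $\omega \in \Omega^p(M)$ and vector fields $X_1, \dots, X_{p-1}$.
It is clear that if $\omega$ is a $p$-form then $i_Y \omega$ is a $(p-1)$-form, as it is linear in the $X_i$, and interchange of any pair $X_i, X_j$ changes the sign. Note that if $\omega$ is a 1-form then $i_Y \omega = \omega(Y)$.
Theorem 6.10. If $X$ is a vector field and $\omega$ a 1-form then
$$L_X \omega = d(i_X \omega) + i_X d\omega \tag{6.58}$$
Proof. Working in local co-ordinates $x^\mu$, note that
$$\begin{aligned} d(i_X \omega) + i_X d\omega &= \sum \partial_\mu (\omega_\nu X^\nu)\, dx^\mu + i_X d\omega \\ &= \sum \big( (\partial_\mu \omega_\nu) X^\nu + \omega_\nu \partial_\mu X^\nu + X^\nu (\partial_\nu \omega_\mu - \partial_\mu \omega_\nu) \big)\, dx^\mu \\ &= \sum \big( \omega_\nu \partial_\mu X^\nu + X^\nu \partial_\nu \omega_\mu \big)\, dx^\mu \\ &= L_X \omega \end{aligned} \tag{6.59}$$
In fact, one can show that $L_X \omega = d(i_X \omega) + i_X d\omega$ for any $p$-form $\omega$; however we shall not do this here.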
6.3 Integration on Manifolds
Let us first recall how we would integrate $\omega = p(x, y)\,dx + q(x, y)\,dy$ along a curve $C : [0, 1] \to \mathbb{R}^2$ in $\mathbb{R}^2$. A natural prescription is
$$\int_C p(x, y)\,dx + q(x, y)\,dy = \int_0^1 \left( p(C(t)) \frac{dC^x}{dt} + q(C(t)) \frac{dC^y}{dt} \right) dt \tag{6.60}$$
Here we are thinking of $dx$ and $dy$ as the infinitesimal change in $x$ and $y$ along the curve $C$; $(dx, dy) = \left( \frac{dC^x}{dt} dt, \frac{dC^y}{dt} dt \right)$. We can rewrite this as
$$\int_C \omega = \int_0^1 C^\star \omega \tag{6.61}$$

Problem 6.2. Show that
$$C^\star \omega = \left( p(C(t)) \frac{d}{dt}(x \circ C) + q(C(t)) \frac{d}{dt}(y \circ C) \right) dt \tag{6.62}$$
This definition clearly extends to the integral of a 1-form along a curve in an arbitrary manifold. To define the integral of a general $p$-form over a manifold, we need to generalize a curve to a $p$-dimensional surface.
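As an illustration of the prescription for integrating a 1-form along a curve, the sketch below evaluates one such integral explicitly. The 1-form $\omega = -y\,dx + x\,dy$ and the curve $C(t) = (\cos 2\pi t, \sin 2\pi t)$ on $[0, 1]$ are illustrative choices made here.

```latex
\begin{aligned}
\frac{dC^x}{dt} &= -2\pi \sin 2\pi t, \qquad \frac{dC^y}{dt} = 2\pi \cos 2\pi t, \\
\int_C \omega &= \int_0^1 \Big( (-\sin 2\pi t)(-2\pi \sin 2\pi t)
      + (\cos 2\pi t)(2\pi \cos 2\pi t) \Big)\, dt \\
 &= \int_0^1 2\pi\, dt = 2\pi .
\end{aligned}
```

Here $p(C(t)) = -\sin 2\pi t$ and $q(C(t)) = \cos 2\pi t$. The integrand is the pull-back $C^\star \omega = 2\pi\, dt$, so the integral is just the total angle swept out by one circuit of the unit circle.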
Definition 6.6. Let $I_p = [0, 1]^p = \{(x^1, \dots, x^p) \in \mathbb{R}^p : 0 \le x^\mu \le 1\}$ be the $p$-cube in $\mathbb{R}^p$.
(i) A $p$-simplex on $M$ is a $C^\infty$ map $C : J \to M$ where $J$ is an open set in $\mathbb{R}^p$ that contains $I_p$.
(ii) A 0-simplex is a map from $\{0\} \to M$, i.e. it is just a point in $M$.
(iii) The support $|C|$ of a $p$-simplex is the set $C(I_p) \subset M$.
Next we can consider sums of such surfaces.
Definition 6.7. A $p$-chain on $M$ is a finite formal linear combination of $p$-simplices on $M$ with real coefficients, i.e. a general $p$-chain is
$$c_p = r_1 C_1 + \dots + r_k C_k \tag{6.63}$$
where $r_i \in \mathbb{R}$ and $C_i$ are $p$-simplices. The support of a $p$-chain $c_p = r_1 C_1 + \dots + r_k C_k$ is
$$|c_p| = \bigcup_{i:\, r_i \neq 0} |C_i| \tag{6.64}$$
(1)

Definition 6.8. We define the maps i

(0)

: Ip1 Ip and i

: Ip1 Ip by

(1)

i (t1 , . . . , tp1 ) = (t1 , . . . , ti1 , 1, ti , . . . , tp1 )


(0)

i (t1 , . . . , tp1 ) = (t1 , . . . , ti1 , 0, ti , . . . , tp1 )

(6.65)

i.e. these project a (p 1)-cube in Rp1 onto the side of a p-cube in Rp .


The boundary of a p-cube can then be constructed as the sum of all sides weighted with
a plus sign for front sides and a minus sign for the back sides, in other words a $(p-1)$-chain.
Example: If we look at $I_2$ then
\[ \phi_1^{(0)}(t) = (0, t), \qquad \phi_1^{(1)}(t) = (1, t), \qquad \phi_2^{(0)}(t) = (t, 0), \qquad \phi_2^{(1)}(t) = (t, 1) \tag{6.66} \]

As a point set the boundary is
\[ \{(1, t) : t \in I_1\} \cup \{(t, 1) : t \in I_1\} \cup \{(0, t) : t \in I_1\} \cup \{(t, 0) : t \in I_1\} \tag{6.67} \]

but this does not take into account the fact that some sides are oriented differently to
others. This is achieved by considering the 1-chain
\[ \phi_1^{(1)} - \phi_1^{(0)} - \phi_2^{(1)} + \phi_2^{(0)} \tag{6.68} \]
whose support is the point set consisting of the boundary of $I_2$.

Figure 8: The (oriented) boundary of $I_2$

This allows us to define the boundary of a p-simplex in M.

Definition 6.9. If $C$ is a p-simplex in M then the boundary of $C$ is denoted by $\partial C$ and
is defined as the $(p-1)$-chain
\[ \partial C = \sum_{i=1}^{p} (-1)^{i+1} \left( C \circ \phi_i^{(1)} - C \circ \phi_i^{(0)} \right) \tag{6.69} \]

For a p-chain $\sigma = r_1 C_1 + \dots + r_k C_k$ we define
\[ \partial \sigma = \sum_i r_i\, \partial C_i \tag{6.70} \]

The next theorem summarizes the notion that boundaries have no boundaries.
Theorem 6.11. $\partial^2 = 0$
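Proof. It suffices to show this for a p-simplex $C$ in M. As
\[ \partial C = \sum_{i=1}^{p} (-1)^{i+1} \left( C \circ \phi_i^{(1)} - C \circ \phi_i^{(0)} \right) \tag{6.71} \]
it follows that
\begin{align*}
\partial^2 C &= \sum_{i=1}^{p} (-1)^{i+1} \left( \partial(C \circ \phi_i^{(1)}) - \partial(C \circ \phi_i^{(0)}) \right) \\
&= \sum_{j=1}^{p-1} \sum_{i=1}^{p} (-1)^{i+j} \left( C \circ \phi_i^{(1)} \circ \phi_j^{(1)} - C \circ \phi_i^{(1)} \circ \phi_j^{(0)} - C \circ \phi_i^{(0)} \circ \phi_j^{(1)} + C \circ \phi_i^{(0)} \circ \phi_j^{(0)} \right) \tag{6.72}
\end{align*}
Now, if $j < i$ then, for $\alpha, \beta \in \{0, 1\}$,
\[ \phi_i^{(\alpha)} \circ \phi_j^{(\beta)} (t^1, \dots, t^{p-2}) = \phi_i^{(\alpha)} (t^1, \dots, \beta, \dots, t^{p-2}) = (t^1, \dots, \beta, \dots, \alpha, \dots, t^{p-2}) \tag{6.73} \]
and also (as $j \le i - 1$)
\[ \phi_j^{(\beta)} \circ \phi_{i-1}^{(\alpha)} (t^1, \dots, t^{p-2}) = \phi_j^{(\beta)} (t^1, \dots, \alpha, \dots, t^{p-2}) = (t^1, \dots, \beta, \dots, \alpha, \dots, t^{p-2}) \tag{6.74} \]
where the final expression has $\beta$ in the j-th position and $\alpha$ in the i-th position. Thus if
$j < i$ then
\[ \phi_i^{(\alpha)} \circ \phi_j^{(\beta)} = \phi_j^{(\beta)} \circ \phi_{i-1}^{(\alpha)} \tag{6.75} \]
This shift of $i \to i - 1$ introduces a minus sign into the sum due to the $(-1)^{i+j}$ factor.
Hence, we see that the first term and the last term in $\partial^2 C$ each sum to zero, and the
middle two terms together sum to zero.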
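
The cancellation in the proof can be checked mechanically for the cube itself. The following is a minimal sketch, not part of the original notes: it assumes a face of $I_p$ is encoded as a tuple whose entries are 0, 1 or `None` (a free coordinate), so that composing with a face map of Definition 6.8 fixes the i-th free coordinate. The names `boundary` and `cube` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict


def boundary(chain):
    """Boundary of a formal chain of faces of the unit cube.

    A face is a tuple with entries 0, 1 or None (None marks a free coordinate).
    A chain is a dict mapping faces to real coefficients.
    """
    result = defaultdict(float)
    for face, coeff in chain.items():
        # positions of the free coordinates, numbered i = 1, 2, ... in order
        free = [pos for pos, entry in enumerate(face) if entry is None]
        for i, pos in enumerate(free, start=1):
            sign = (-1) ** (i + 1)
            # front side (coordinate = 1) enters with +, back side (= 0) with -
            for value, side_sign in ((1, +1), (0, -1)):
                new_face = face[:pos] + (value,) + face[pos + 1:]
                result[new_face] += coeff * sign * side_sign
    # drop faces whose coefficients have cancelled
    return {face: c for face, c in result.items() if c != 0}


def cube(p):
    """The p-cube I_p viewed as a single p-simplex with coefficient 1."""
    return {(None,) * p: 1.0}


square_boundary = boundary(cube(2))
double_boundaries = [boundary(boundary(cube(p))) for p in (2, 3, 4)]
```

Under this encoding `square_boundary` reproduces the oriented 1-chain of the $I_2$ example, and each entry of `double_boundaries` is the empty chain.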

Finally, we can define the integral of a p-form over a p-chain.


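Definition 6.10. Let $C$ be a p-simplex in M and $\omega$ a p-form. Then
\[ \int_C \omega = \int_{I_p} C^* \omega \tag{6.76} \]
where if $C^* \omega = f(t^1, \dots, t^p)\, dt^1 \wedge \dots \wedge dt^p$, the RHS is understood to mean the usual integral
\[ \int_{I_p} C^* \omega = \int_0^1 \dots \int_0^1 f(t^1, \dots, t^p)\, dt^1 \dots dt^p \tag{6.77} \]
If $\sigma = \sum_i r_i C_i$ is a p-chain then
\[ \int_\sigma \omega = \sum_i r_i \int_{C_i} \omega \tag{6.78} \]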

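Example: Consider the manifold $\mathbb{R}^2 - \{(0, 0)\}$, the 1-form
\[ \omega = \frac{y\,dx}{x^2 + y^2} - \frac{x\,dy}{x^2 + y^2} \tag{6.79} \]
and the curve $C(t) = (\cos(2\pi t), \sin(2\pi t))$. Then
\begin{align*}
\int_C \omega &= \int_0^1 C^* \omega \\
&= \int_0^1 \left( \sin(2\pi t) \frac{d}{dt}(\cos(2\pi t)) - \cos(2\pi t) \frac{d}{dt}(\sin(2\pi t)) \right) dt \\
&= -2\pi \int_0^1 (\sin^2(2\pi t) + \cos^2(2\pi t))\, dt \\
&= -2\pi \tag{6.80}
\end{align*}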
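
The example can be cross-checked numerically from the prescription (6.60). This is a rough sketch only: the function name `line_integral` and the midpoint discretisation are choices made here for illustration, and the 1-form is taken to be $(y\,dx - x\,dy)/(x^2 + y^2)$ as in the example.

```python
import math


def line_integral(p, q, curve, n=20000):
    """Approximate the integral of p dx + q dy along curve: [0, 1] -> R^2."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = curve(t)
        x0, y0 = curve(t - 0.5 * h)
        x1, y1 = curve(t + 0.5 * h)
        # (x1 - x0, y1 - y0) approximates (dC^x/dt, dC^y/dt) dt on this step
        total += p(x, y) * (x1 - x0) + q(x, y) * (y1 - y0)
    return total


def p(x, y):
    return y / (x * x + y * y)


def q(x, y):
    return -x / (x * x + y * y)


def unit_circle(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))


result = line_integral(p, q, unit_circle)
```

With these choices `result` agrees with $-2\pi$ to well within the discretisation error.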

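Problem 6.3. Consider the manifold $\mathbb{R}^2 - \{(0, 0)\}$, the 1-form
\[ \omega = \frac{x\,dy}{x^2 + y^2} - \frac{y\,dx}{x^2 + y^2} \tag{6.81} \]
What is
\[ \int_C \omega \tag{6.82} \]
along the curve $C(t) = (2 + \cos(2\pi t), 2 + \sin(2\pi t))$? Next consider the 2-form
\[ \omega = \frac{dx \wedge dy}{x^2 + y^2} \tag{6.83} \]
What is
\[ \int_C \omega \tag{6.84} \]
where $C : I_2 \to \mathbb{R}^2 - \{(0, 0)\}$ is given by $C(t^1, t^2) = (t^1 + 1)(\cos(2\pi t^2), \sin(2\pi t^2))$?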


Problem 6.4. Consider the manifold $\mathbb{R}^2 - \{(0, 0)\}$ and the 1-form
\[ \omega = \frac{y\,dx}{x^2 + y^2} - \frac{x\,dy}{x^2 + y^2} \tag{6.85} \]
Show that $d\omega = 0$. Is there a smooth function $f$ such that $\omega = df$?


The exterior derivative has one very important property:
\[ d^2 = 0 \tag{6.86} \]
Thus if $\omega = d\eta$ then it follows that $d\omega = 0$. This motivates two definitions:
Definition 6.11. A p-form $\omega$ is closed if $d\omega = 0$. We denote the set of closed p-forms on
M by $Z^p(M, \mathbb{R})$.
Definition 6.12. A p-form $\omega$ is exact if $\omega = d\eta$ for some $(p-1)$-form $\eta$ on M. We denote
the set of exact p-forms on M by $B^p(M, \mathbb{R})$.
Theorem 6.12. $B^p(M, \mathbb{R}) \subset Z^p(M, \mathbb{R})$
Proof. If $\omega \in B^p(M, \mathbb{R})$ then $\omega = d\eta$ for some $(p-1)$-form $\eta$, hence $d\omega = d^2\eta = 0$, so
$\omega \in Z^p(M, \mathbb{R})$.
Since the space of p-forms is a vector space over R, we can define the following:
Definition 6.13. The p-th de Rham cohomology group $H^p(M, \mathbb{R})$ is the quotient space
\[ H^p(M, \mathbb{R}) = \frac{Z^p(M, \mathbb{R})}{B^p(M, \mathbb{R})} \tag{6.87} \]
where two closed p-forms are viewed as equivalent iff their difference is an exact form. The
dimension of $H^p(M, \mathbb{R})$ is called the p-th Betti number $b_p$.

Theorem 6.13. If M and N are two manifolds, and $f : M \to N$ is a diffeomorphism
then $H^p(M, \mathbb{R}) \cong H^p(N, \mathbb{R})$.
Proof. Recall that we proved that the pull-back and exterior derivative commute (Theorem
6.9). Therefore if $\omega$ is a closed p-form on N then $f^*\omega$ is a closed form on M:
\[ d f^* \omega = f^* d\omega = 0 . \tag{6.88} \]
Furthermore, if $\omega = d\eta$ is an exact form on N then
\[ f^* \omega = f^* d\eta = d(f^* \eta) \tag{6.89} \]
is an exact p-form on M. Similarly closed forms on M are pulled back using $f^{-1}$ to closed
forms on N and exact forms on M are pulled back to exact forms on N. Hence $f^*$ induces a
linear map from $H^p(N, \mathbb{R})$ to $H^p(M, \mathbb{R})$ whose inverse is induced by $(f^{-1})^*$.
Thus the de Rham cohomology groups are capable of distinguishing between two distinct manifolds: if the cohomology groups of two manifolds differ, then the manifolds cannot be diffeomorphic. Here two manifolds are regarded as equivalent if there is a
diffeomorphism between them, and as distinct otherwise. Note though that the converse is not true. There are
plenty of examples of inequivalent manifolds that have the same de Rham cohomology
groups $H^k(M, \mathbb{R})$. The general idea of cohomology can be applied to any operator which
is nilpotent, i.e. whose action squares to zero, and is a central element of modern algebraic
and geometric topology.
We finally arrive at a central theorem in differential geometry.
Theorem 6.14 (Stokes's Theorem). If $\omega \in \Omega^{(p-1)}(M, \mathbb{R})$ and $\sigma$ is a p-chain then
\[ \int_\sigma d\omega = \int_{\partial \sigma} \omega \tag{6.90} \]
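Proof. By linearity, it suffices to show this is true for p-simplices $C$. Recall that as a
consequence of Theorem 6.9
\[ \int_C d\omega = \int_{I_p} C^* d\omega = \int_{I_p} d\, C^* \omega \tag{6.91} \]
and by definition
\[ \int_{\partial C} \omega = \int_{\partial I_p} C^* \omega \tag{6.92} \]
So to prove that these two quantities are equal, it is sufficient to show that
\[ \int_{I_p} d\psi = \int_{\partial I_p} \psi \tag{6.93} \]
for any $(p-1)$-form $\psi$ on $\mathbb{R}^p$. As this condition is linear in $\psi$ it is sufficient to consider
\[ \psi = f(x)\, dx^1 \wedge \dots \wedge dx^{p-1} \tag{6.94} \]
(the other components of a general $(p-1)$-form are treated in the same way after relabelling the coordinates), so
\[ d\psi = (-1)^{p-1} \partial_p f\, dx^1 \wedge \dots \wedge dx^p \tag{6.95} \]
and hence we evaluate
\begin{align*}
\int_{I_p} d\psi &= (-1)^{p-1} \int_{I_p} \partial_p f\, dx^1 \dots dx^p \\
&= (-1)^{p-1} \int_0^1 \dots \int_0^1 dx^1 \dots dx^{p-1} \int_0^1 \partial_p f\, dx^p \\
&= (-1)^{p-1} \int_0^1 \dots \int_0^1 dx^1 \dots dx^{p-1} \left( f(x^1, \dots, x^{p-1}, 1) - f(x^1, \dots, x^{p-1}, 0) \right) \tag{6.96}
\end{align*}
On the other hand, the boundary of $I_p$ is
\begin{align*}
\partial I_p &= \sum_{i=1}^{p} (-1)^{i+1} \left( \phi_i^{(1)} - \phi_i^{(0)} \right) \\
&= \sum_{i=1}^{p} (-1)^{i+1} \left( \{(x^1, \dots, x^{i-1}, 1, x^{i+1}, \dots, x^p) : x^j \in I_1\} - \{(x^1, \dots, x^{i-1}, 0, x^{i+1}, \dots, x^p) : x^j \in I_1\} \right) \tag{6.97}
\end{align*}
Now $\psi = f\, dx^1 \wedge \dots \wedge dx^{p-1}$ will only have a non-vanishing contribution to the total
integral on those boundary components with $x^p$ constant and $x^1, \dots, x^{p-1}$ varying. Hence
\begin{align*}
\int_{\partial I_p} \psi &= (-1)^{p+1} \int_{\{(x^1, \dots, x^{p-1}, 1) : x^i \in I_1\}} \psi - (-1)^{p+1} \int_{\{(x^1, \dots, x^{p-1}, 0) : x^i \in I_1\}} \psi \\
&= (-1)^{p+1} \int_0^1 \dots \int_0^1 dx^1 \dots dx^{p-1} f(x^1, \dots, x^{p-1}, 1) - (-1)^{p+1} \int_0^1 \dots \int_0^1 dx^1 \dots dx^{p-1} f(x^1, \dots, x^{p-1}, 0) \tag{6.98}
\end{align*}
Since $(-1)^{p+1} = (-1)^{p-1}$, this agrees with (6.96), and this establishes the proof.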
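
For $p = 2$ the statement just proved is Green's theorem on the unit square, which can be checked numerically. The sketch below is only an illustration under stated assumptions: the test 1-form $\psi = P\,dx + Q\,dy$ with $P = x y^2 + \sin y$ and $Q = e^x y$ is an arbitrary choice made here, and the midpoint-rule helpers `integrate_1d` and `integrate_2d` are hypothetical names.

```python
import math


def integrate_1d(f, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h


def integrate_2d(f, n=400):
    """Midpoint rule on the unit square I_2."""
    h = 1.0 / n
    return sum(f((a + 0.5) * h, (b + 0.5) * h)
               for a in range(n) for b in range(n)) * h * h


# Arbitrary test 1-form psi = P dx + Q dy (an assumption for illustration).
def P(x, y):
    return x * y * y + math.sin(y)


def Q(x, y):
    return math.exp(x) * y


def dpsi_coeff(x, y):
    # coefficient of dx ^ dy in d(psi): dQ/dx - dP/dy
    return math.exp(x) * y - (2 * x * y + math.cos(y))


lhs = integrate_2d(dpsi_coeff)

# Boundary chain of I_2: sides x = 1 and y = 0 enter with +, x = 0 and y = 1 with -.
rhs = (integrate_1d(lambda t: Q(1.0, t)) - integrate_1d(lambda t: Q(0.0, t))
       - integrate_1d(lambda t: P(t, 1.0)) + integrate_1d(lambda t: P(t, 0.0)))
```

Here `lhs` approximates the integral of $d\psi$ over $I_2$ and `rhs` the integral of $\psi$ over the oriented boundary; the two agree up to quadrature error.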
This is a beautiful generalization of the following well-known result for 1-forms on $\mathbb{R}$:
\[ \int_a^b df = f(b) - f(a) \tag{6.99} \]
In particular, Stokes's theorem enables us to see quite explicitly the connection between
the identities $d^2 = 0$ and $\partial^2 = 0$, because if $\sigma$ is a p-chain and $\omega$ is a $(p-2)$-form then
\[ 0 = \int_\sigma d^2 \omega = \int_{\partial \sigma} d\omega = \int_{\partial^2 \sigma} \omega = 0 \tag{6.100} \]
where the first equality (reading from left to right) follows from $d^2 = 0$, the next two
equalities follow from Stokes's theorem, and the last equality follows from $\partial^2 = 0$.

7. Connections, Curvature and Metrics


So far, everything that we have discussed about manifolds has been intrinsic to the manifold, defined as a topological space with a differentiable structure, and has not required
introducing any additional structures. However, it is very common and intuitive to introduce some additional structures.
7.1 Connections, Curvature and Torsion
The first additional structure that we can introduce is that of a connection. We have
been emphasising that we cannot just differentiate a generic object such as a tensor on a
manifold because we don't know how to construct something like $x + \epsilon$ where $x \in M$ and
$\epsilon$ is a small parameter.
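Problem 7.1. Show that if
\[ X = \sum_\mu X^\mu \frac{\partial}{\partial x^\mu}\Big|_p \tag{7.1} \]
is a vector field expanded in some local coordinate chart $\phi_i(p) = (x^1(p), \dots, x^n(p))$ and we
define
\[ \tilde{d}X = \sum_{\mu, \nu} \partial_\mu X^\nu\, dx^\mu|_p \otimes \frac{\partial}{\partial x^\nu}\Big|_p \tag{7.2} \]
then $\tilde{d}X$ is not a tensor. [N.B. $\tilde{d}$ is to be distinguished from the exterior derivative $d$; here
$\tilde{d}$ is simply an operator defined by the above equation!]
However, check that if $\omega = \sum_\mu \omega_\mu\, dx^\mu|_p$ is a co-vector (1-form) then
\[ d\omega = \sum_{\mu, \nu} (\partial_\mu \omega_\nu - \partial_\nu \omega_\mu)\, dx^\mu|_p \otimes dx^\nu|_p \tag{7.3} \]
is a tensor.
Hint: consider how things look in a different coordinate chart $\phi_j(p) = (y^1, \dots, y^n)$.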

However we can simply declare that there is a suitable derivative


Definition 7.1. A connection (or more accurately an affine connection) on a manifold M
is an operator $D$ which assigns to each vector field $X$ on M a mapping $D_X : TM \to TM$
such that for all $X, Y, Z \in TM$ and $f \in C^\infty(M)$:
(i) $D_X(Y + Z) = D_X Y + D_X Z$
(ii) $D_{X+Y} Z = D_X Z + D_Y Z$
(iii) $D_{fX} Y = f D_X Y$
(iv) $D_X(fY) = X(f) Y + f D_X Y$

DX Y is called the covariant derivative of Y along X.


N.B. The commutator (Lie derivative) obeys all but condition (iii).
In other words, DX acts as a directional derivative along the direction determined by
X. However the existence of a connection does not follow from the definition of a manifold
but requires us to add it in. In particular a typical manifold can be endowed with infinitely
many different connections.
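It is convenient to introduce a new notation. If $x^1, \dots, x^n$ are local coordinates on
some chart of M, we let
\[ D_\mu = D_{\frac{\partial}{\partial x^\mu}|_p} \tag{7.4} \]
It is straightforward to see that $D_X$ is entirely determined by its action on a set of basis
vectors, hence we introduce
\[ D_\mu \frac{\partial}{\partial x^\nu}\Big|_p = \sum_\lambda \Gamma^\lambda_{\mu\nu} \frac{\partial}{\partial x^\lambda}\Big|_p \tag{7.5} \]
and the $\Gamma^\lambda_{\mu\nu}$ are known as the connection coefficients. Thus if
\[ X = \sum_\mu X^\mu \frac{\partial}{\partial x^\mu}\Big|_p , \qquad \text{and} \qquad Y = \sum_\nu Y^\nu \frac{\partial}{\partial x^\nu}\Big|_p \tag{7.6} \]
then it follows from the definition of $D_X$ that
\begin{align*}
D_X Y &= D_{\sum_\mu X^\mu \frac{\partial}{\partial x^\mu}|_p} \left( \sum_\nu Y^\nu \frac{\partial}{\partial x^\nu}\Big|_p \right) \\
&= \sum_\mu X^\mu D_\mu \left( \sum_\nu Y^\nu \frac{\partial}{\partial x^\nu}\Big|_p \right) \\
&= \sum_{\mu, \nu} X^\mu \left( \partial_\mu (Y^\nu) \frac{\partial}{\partial x^\nu}\Big|_p + Y^\nu D_\mu \frac{\partial}{\partial x^\nu}\Big|_p \right) \\
&= \sum_{\mu, \lambda} \left( X^\mu \partial_\mu Y^\lambda + \sum_\nu X^\mu \Gamma^\lambda_{\mu\nu} Y^\nu \right) \frac{\partial}{\partial x^\lambda}\Big|_p \tag{7.7}
\end{align*}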

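Theorem 7.1. Let $\Gamma^\lambda_{\mu\nu}$ be the connection coefficients in a coordinate system $(x^1, \dots, x^n)$,
then in another overlapping coordinate system $(y^1, \dots, y^n)$, we have
\[ \tilde{\Gamma}^\lambda_{\mu\nu} = \sum_{\rho, \sigma, \tau} \frac{\partial x^\rho}{\partial y^\mu} \frac{\partial x^\sigma}{\partial y^\nu} \frac{\partial y^\lambda}{\partial x^\tau} \Gamma^\tau_{\rho\sigma} + \sum_\sigma \frac{\partial y^\lambda}{\partial x^\sigma} \frac{\partial^2 x^\sigma}{\partial y^\mu \partial y^\nu} \tag{7.8} \]
where we think of $x^\mu(y^\nu)$ as the transition functions.
Proof. We have seen that the relationship between the standard vector field basis elements
in two overlapping coordinate systems is
\[ \frac{\partial}{\partial x^\mu}\Big|_p = \sum_\nu \frac{\partial y^\nu(x)}{\partial x^\mu} \frac{\partial}{\partial y^\nu}\Big|_p \qquad \text{and} \qquad \frac{\partial}{\partial y^\mu}\Big|_p = \sum_\nu \frac{\partial x^\nu(y)}{\partial y^\mu} \frac{\partial}{\partial x^\nu}\Big|_p \tag{7.9} \]
By definition, we have
\[ D_{\frac{\partial}{\partial y^\mu}|_p} \frac{\partial}{\partial y^\nu}\Big|_p = \sum_\lambda \tilde{\Gamma}^\lambda_{\mu\nu} \frac{\partial}{\partial y^\lambda}\Big|_p \tag{7.10} \]
However, we also have
\begin{align*}
D_{\frac{\partial}{\partial y^\mu}|_p} \frac{\partial}{\partial y^\nu}\Big|_p &= D_{\sum_\rho \frac{\partial x^\rho(y)}{\partial y^\mu} \frac{\partial}{\partial x^\rho}|_p} \left( \sum_\sigma \frac{\partial x^\sigma(y)}{\partial y^\nu} \frac{\partial}{\partial x^\sigma}\Big|_p \right) \\
&= \sum_\rho \frac{\partial x^\rho(y)}{\partial y^\mu} D_\rho \left( \sum_\sigma \frac{\partial x^\sigma(y)}{\partial y^\nu} \frac{\partial}{\partial x^\sigma}\Big|_p \right) \\
&= \sum_{\rho, \sigma} \frac{\partial x^\rho(y)}{\partial y^\mu} \frac{\partial x^\sigma(y)}{\partial y^\nu} D_\rho \frac{\partial}{\partial x^\sigma}\Big|_p + \sum_{\rho, \sigma} \frac{\partial x^\rho(y)}{\partial y^\mu} \frac{\partial}{\partial x^\rho} \left( \frac{\partial x^\sigma(y)}{\partial y^\nu} \right) \frac{\partial}{\partial x^\sigma}\Big|_p \\
&= \sum_{\rho, \sigma, \tau} \frac{\partial x^\rho(y)}{\partial y^\mu} \frac{\partial x^\sigma(y)}{\partial y^\nu} \Gamma^\tau_{\rho\sigma} \frac{\partial}{\partial x^\tau}\Big|_p + \sum_\sigma \frac{\partial^2 x^\sigma(y)}{\partial y^\mu \partial y^\nu} \frac{\partial}{\partial x^\sigma}\Big|_p \\
&= \sum_\lambda \left( \sum_{\rho, \sigma, \tau} \frac{\partial x^\rho(y)}{\partial y^\mu} \frac{\partial x^\sigma(y)}{\partial y^\nu} \frac{\partial y^\lambda(x)}{\partial x^\tau} \Gamma^\tau_{\rho\sigma} + \sum_\sigma \frac{\partial^2 x^\sigma(y)}{\partial y^\mu \partial y^\nu} \frac{\partial y^\lambda(x)}{\partial x^\sigma} \right) \frac{\partial}{\partial y^\lambda}\Big|_p
\end{align*}
On comparing the coefficients of $\frac{\partial}{\partial y^\lambda}\big|_p$ in these two expressions, one establishes the proof.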
Thus the connection coefficients cannot be thought of as the components of a tensor,
because they do not transform in the appropriate way. However, from the connection we
can construct two associated tensors.
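Definition 7.2. The torsion tensor is a (1, 2) tensor defined by
\[ T(X, Y, \omega) = \omega(D_X Y - D_Y X - [X, Y]) \tag{7.11} \]
for $X, Y \in TM$, $\omega \in T^*M$.
Theorem 7.2. In a local coordinate system, the torsion tensor is
\[ T = \sum_{\mu, \nu, \lambda} T_{\mu\nu}{}^{\lambda}\, dx^\mu|_p \otimes dx^\nu|_p \otimes \frac{\partial}{\partial x^\lambda}\Big|_p \tag{7.12} \]
where
\[ T_{\mu\nu}{}^{\lambda} = \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} \tag{7.13} \]
Proof. Since the coordinate vector fields commute, we evaluate
\begin{align*}
T\left( \frac{\partial}{\partial x^\mu}, \frac{\partial}{\partial x^\nu}, dx^\lambda \right) &= dx^\lambda \left( D_\mu \frac{\partial}{\partial x^\nu} - D_\nu \frac{\partial}{\partial x^\mu} \right) \\
&= dx^\lambda \left( \sum_\rho \Gamma^\rho_{\mu\nu} \frac{\partial}{\partial x^\rho} - \sum_\rho \Gamma^\rho_{\nu\mu} \frac{\partial}{\partial x^\rho} \right) \\
&= \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} \tag{7.14}
\end{align*}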

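Secondly, we have the curvature (1, 3)-tensor R:
Definition 7.3. The curvature (1, 3) tensor is
\[ R(X, Y, Z, \omega) = \omega\left( -D_X(D_Y Z) + D_Y(D_X Z) + D_{[X,Y]} Z \right) \tag{7.15} \]
for $X, Y, Z \in TM$ and $\omega \in T^*M$.
Problem 7.2. Prove that R is a tensor.
Theorem 7.3. In a local coordinate system the curvature tensor is
\[ R = \sum_{\mu, \nu, \lambda, \rho} R_{\mu\nu\lambda}{}^{\rho}\, dx^\mu|_p \otimes dx^\nu|_p \otimes dx^\lambda|_p \otimes \frac{\partial}{\partial x^\rho}\Big|_p \tag{7.16} \]
where
\[ R_{\mu\nu\lambda}{}^{\rho} = -\partial_\mu \Gamma^\rho_{\nu\lambda} + \partial_\nu \Gamma^\rho_{\mu\lambda} + \sum_\sigma \left( \Gamma^\sigma_{\mu\lambda} \Gamma^\rho_{\nu\sigma} - \Gamma^\sigma_{\nu\lambda} \Gamma^\rho_{\mu\sigma} \right) \tag{7.17} \]
Problem 7.3. Prove this.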


The important property of these tensors is that they contain coordinate independent
information. In particular, if a tensor, such as torsion or curvature, vanishes in one coordinate system, then it vanishes in all. This cannot be said of things like the connection
coefficients or other quantities that one might encounter when working with a particular
coordinate system.
Definition 7.4. A vector field $X$ is parallel transported along a curve $C$ if
\[ D_{T_C} X = 0 \tag{7.18} \]
at each point on $C$, where $T_C$ is the tangent to $C$.


This means that we think of X as being transported in such a way that it points in
the same direction along the curve. This is possible because we have a connection which
tells us how to compare vectors in the tangent spaces at different points.
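We can give a geometric meaning to both the torsion and curvature tensors. First the
torsion: consider an infinitesimal displacement of the coordinate $x^\mu$ by a vector $X$:
\[ \delta_X x^\mu = \epsilon X^\mu \tag{7.19} \]
Then parallel transport this displacement along a direction $Y$. Parallel transport means
that
\[ 0 = \sum_\mu Y^\mu \partial_\mu X^\lambda + \sum_{\mu, \nu} \Gamma^\lambda_{\mu\nu} Y^\mu X^\nu \tag{7.20} \]
so that
\[ \delta_Y X^\lambda = \epsilon \sum_\mu Y^\mu \partial_\mu X^\lambda = -\epsilon \sum_{\mu, \nu} \Gamma^\lambda_{\mu\nu} Y^\mu X^\nu \tag{7.21} \]
Thus
\[ \delta_Y \delta_X x^\lambda = -\epsilon^2 \sum_{\mu, \nu} \Gamma^\lambda_{\mu\nu} Y^\mu X^\nu \tag{7.22} \]
On changing the order of displacement, first along $Y$ and then $X$, one similarly finds
\[ \delta_X \delta_Y x^\lambda = -\epsilon^2 \sum_{\mu, \nu} \Gamma^\lambda_{\mu\nu} X^\mu Y^\nu \tag{7.23} \]
Therefore the difference between these two is measured by the torsion
\[ [\delta_X, \delta_Y] x^\lambda = -\epsilon^2 \sum_{\mu, \nu} X^\mu Y^\nu\, T_{\mu\nu}{}^{\lambda} \tag{7.24} \]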

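To understand the curvature, we first parallel transport a vector $Z$ around a curve
with tangent $X$ by an infinitesimal amount. Using the above formula one finds
\[ Z^\lambda \to Z^\lambda - \epsilon \sum_{\mu, \nu} \Gamma^\lambda_{\mu\nu} X^\mu Z^\nu \tag{7.25} \]
Let us now parallel transport this around $Y$ by an infinitesimal amount:
\begin{align*}
Z^\lambda &\to Z^\lambda - \epsilon \sum \Gamma^\lambda_{\mu\nu} X^\mu Z^\nu - \epsilon \sum \Gamma^\lambda_{\mu\nu}(x + \epsilon X)\, Y^\mu \left( Z^\nu - \epsilon \sum \Gamma^\nu_{\rho\sigma} X^\rho Z^\sigma \right) \\
&= Z^\lambda - \epsilon \sum \Gamma^\lambda_{\mu\nu} X^\mu Z^\nu - \epsilon \sum Y^\mu \left( \Gamma^\lambda_{\mu\nu} + \epsilon \sum_\rho X^\rho \partial_\rho \Gamma^\lambda_{\mu\nu} \right) \left( Z^\nu - \epsilon \sum \Gamma^\nu_{\rho\sigma} X^\rho Z^\sigma \right) \\
&= Z^\lambda - \epsilon \sum \Gamma^\lambda_{\mu\nu} (X^\mu + Y^\mu) Z^\nu - \epsilon^2 \sum \partial_\rho \Gamma^\lambda_{\mu\nu}\, Y^\mu X^\rho Z^\nu + \epsilon^2 \sum \Gamma^\lambda_{\mu\nu} \Gamma^\nu_{\rho\sigma}\, Y^\mu X^\rho Z^\sigma + O(\epsilon^3) \tag{7.26}
\end{align*}
If we first transport around $Y$ and then $X$ we find
\[ Z^\lambda \to Z^\lambda - \epsilon \sum \Gamma^\lambda_{\mu\nu} (Y^\mu + X^\mu) Z^\nu - \epsilon^2 \sum \partial_\rho \Gamma^\lambda_{\mu\nu}\, X^\mu Y^\rho Z^\nu + \epsilon^2 \sum \Gamma^\lambda_{\mu\nu} \Gamma^\nu_{\rho\sigma}\, X^\mu Y^\rho Z^\sigma + O(\epsilon^3) \tag{7.27} \]
Then, with a little work, one sees that the difference between parallel transport along $X$
then $Y$, minus the parallel transport along $Y$ and then $X$ is expressed in terms of the
curvature:
\[ [\delta_X, \delta_Y] Z^\lambda = \epsilon^2 \sum_{\mu, \nu, \sigma} R_{\mu\nu\sigma}{}^{\lambda}\, X^\mu Y^\nu Z^\sigma \tag{7.28} \]

Figure 9: Parallel transport along great circle arcs on $S^2$

In the above, the vector field at A can be taken to C by parallel transport A → B → C
or by A → D → C. The resulting vector fields at C are not equal.

Figure 10: Parallel transport of vector field $Z$ around a small parallelogram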

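The connection can be extended to define a covariant derivative on any tensor field.
We start by defining it on a co-vector $\omega$ by
\[ (D_X \omega)(Y) = X(\omega(Y)) - \omega(D_X Y) \tag{7.29} \]
for any vector field $Y$. Clearly,
\[ (D_X \omega)(Y + Z) = (D_X \omega)(Y) + (D_X \omega)(Z) \tag{7.30} \]
for vector fields $X, Y, Z$, and if $f \in C^\infty(M)$ then
\begin{align*}
(D_X \omega)(fY) &= X(\omega(fY)) - \omega(D_X(fY)) \\
&= X(f \omega(Y)) - \omega(X(f) Y + f D_X Y) \\
&= X(f) \omega(Y) + f X(\omega(Y)) - X(f) \omega(Y) - f \omega(D_X Y) \\
&= f (D_X \omega)(Y) \tag{7.31}
\end{align*}
and hence $D_X \omega$ is a co-vector as claimed. In coordinates this is
\[ D_\mu \omega_\nu = \partial_\mu \omega_\nu - \sum_\lambda \Gamma^\lambda_{\mu\nu} \omega_\lambda \tag{7.32} \]
where we have taken $X = \frac{\partial}{\partial x^\mu}\big|_p$, $Y = \frac{\partial}{\partial x^\nu}\big|_p$ so that
\[ \omega(Y) = \omega_\nu \qquad \text{and} \qquad \omega(D_X Y) = \sum_\lambda \Gamma^\lambda_{\mu\nu} \omega_\lambda \tag{7.33} \]
This implies that
\[ D_\mu (dx^\nu) = -\sum_\lambda \Gamma^\nu_{\mu\lambda}\, dx^\lambda \tag{7.34} \]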

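The extension to a (r, s)-tensor is
\begin{align*}
(D_X T)(\omega^1, \dots, \omega^r, Y_1, \dots, Y_s) &= X(T(\omega^1, \dots, \omega^r, Y_1, \dots, Y_s)) \\
&\quad - \sum_i T(\omega^1, \dots, D_X \omega^i, \dots, \omega^r, Y_1, \dots, Y_s) \\
&\quad - \sum_i T(\omega^1, \dots, \omega^r, Y_1, \dots, D_X Y_i, \dots, Y_s) \tag{7.35}
\end{align*}
Problem 7.4. Convince yourself that $D_X T$ is a (r, s)-tensor and in a local coordinate
system
\begin{align*}
D_\lambda T^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} &= \partial_\lambda T^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \nu_s} \\
&\quad + \sum_i \sum_\rho \Gamma^{\mu_i}_{\lambda\rho}\, T^{\mu_1 \dots \rho \dots \mu_r}{}_{\nu_1 \dots \nu_s} \\
&\quad - \sum_i \sum_\rho \Gamma^{\rho}_{\lambda\nu_i}\, T^{\mu_1 \dots \mu_r}{}_{\nu_1 \dots \rho \dots \nu_s} \tag{7.36}
\end{align*}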

7.2 Riemannian Manifolds


Another object that is frequently discussed is a metric. This allows one to measure distances
and angles on manifolds. Again, this is not intrinsic to a manifold, and typically there are
infinitely many possible metrics for a given manifold. For example, General Relativity is a
theory of gravity which postulates that spacetime is a manifold. The dynamical equations
of General Relativity (Einstein's equations) then determine the metric.
Definition 7.5. A metric $g$ on a manifold M is a non-degenerate symmetric (0, 2) tensor
defined at each point $p \in M$; that is, a map $g_p : T_pM \times T_pM \to \mathbb{R}$ such that
\[ g(X, Y) = g(Y, X), \qquad g(X, Y + fZ) = g(X, Y) + f g(X, Z) \tag{7.37} \]
for $X, Y, Z \in TM$, $f \in C^\infty(M)$. We will assume that $g$ is a smooth (0, 2) tensor field on
M.
The non-degeneracy condition is equivalent to requiring that in any local co-ordinate
system $(x^1, \dots, x^n)$, the components of the metric
\[ g_{\mu\nu} = g\left( \frac{\partial}{\partial x^\mu}, \frac{\partial}{\partial x^\nu} \right) \tag{7.38} \]
form a matrix with nonzero determinant.
Definition 7.6. A Riemannian manifold is a manifold with a positive definite metric
tensor field (positive definite means $g(X, X) \ge 0$ with equality iff $X = 0$). If the metric is
not positive definite it is called a pseudo-Riemannian manifold.

From elementary linear algebra, an inner product allows us to define the lengths and
angles of vectors. Thus, with a (positive definite) metric we can define the length and
angles of tangent vectors. For example, we can define the angle between two intersecting
curves as
\[ \arccos\left( \frac{g(T_1, T_2)}{\sqrt{g(T_1, T_1)\, g(T_2, T_2)}} \right) \tag{7.39} \]
where $T_1, T_2$ are the tangent vectors to the two curves where they intersect. We can also
define the length of a curve $C$ with tangent vector $T$ to be
\[ \int_C \sqrt{g(T, T)}\, d\tau \tag{7.40} \]
So we simply integrate the length of the tangent vector at each point of the curve.
Thus we can give a metric structure to the manifold by defining
\[ d(p, q) = \inf_C \int_C \sqrt{g(T, T)}\, d\tau \tag{7.41} \]
where $C$ is a curve on M such that $C(0) = p$, $C(1) = q$.


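Example:
The Euclidean metric on $\mathbb{R}^n$ is simply
\[ g\left( \frac{\partial}{\partial x^\mu}, \frac{\partial}{\partial x^\nu} \right) = \delta_{\mu\nu} \tag{7.42} \]
This is in Cartesian coordinates. The length of a curve is therefore just
\[ \int_C \sqrt{\sum_\mu \frac{dC^\mu}{d\tau} \frac{dC^\mu}{d\tau}}\, d\tau \tag{7.43} \]
so in particular, for a straight line $C^\mu(\tau) = p^\mu + \tau(q^\mu - p^\mu)$ one has
\[ \int_0^1 \sqrt{\sum_\mu (p^\mu - q^\mu)(p^\mu - q^\mu)}\, d\tau = \sqrt{\sum_\mu (p^\mu - q^\mu)(p^\mu - q^\mu)} \tag{7.44} \]
which is the Pythagorean distance.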


However, in general, the metric coefficients $g_{\mu\nu}$ can be functions of the coordinates.
Indeed, even $\mathbb{R}^n$ with a different coordinate system will have non-trivial $g_{\mu\nu}$.
Example: We can think of $M = \mathbb{R}^2 - \{(0, 0)\}$. This is clearly a manifold, as it is
an open subset of $\mathbb{R}^2$. By putting different metrics on it, though, we can think of it in a
variety of ways. With the flat metric
\[ g = dr \otimes dr + r^2\, d\theta \otimes d\theta \tag{7.45} \]
this is just what we naturally think of as $M = \mathbb{R}^2 - \{(0, 0)\}$ as a subset of the plane.
But we could also consider
\[ g' = dr \otimes dr + d\theta \otimes d\theta \tag{7.46} \]
This turns the manifold into a cylinder $S^1 \times \mathbb{R}$, although since $r > 0$ it is really only half
a cylinder.
There are also more exotic possibilities, such as
\[ g'' = dr \otimes dr + \cosh^2 r\, d\theta \otimes d\theta \tag{7.47} \]
This looks like a funnel where the radius of the circle starts at one and then grows exponentially with $r$.
So it is clear that M admits numerous different metrics.
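As is well known, an inner product induces an isomorphism between a vector space
and its dual. A metric tensor induces an isomorphism between $T_pM$ and $T_p^*M$ at each
point $p \in M$. To be precise, given a vector field $X$, we can construct a co-vector $\omega_X$ by
\[ \omega_X(Y) = g(X, Y) \tag{7.48} \]
Clearly this defines a linear map, i.e. $\omega_X \in T^*M$. If in a local coordinate system we have
\[ X = \sum_\mu X^\mu \frac{\partial}{\partial x^\mu} , \qquad \text{and} \qquad g = \sum_{\mu, \nu} g_{\mu\nu}\, dx^\mu \otimes dx^\nu \tag{7.49} \]
then
\[ \omega_X(Y) = \sum_\mu (\omega_X)_\mu Y^\mu = \sum_{\mu, \nu} g_{\mu\nu} X^\nu Y^\mu \tag{7.50} \]
therefore we see that
\[ (\omega_X)_\mu = \sum_\nu g_{\mu\nu} X^\nu \tag{7.51} \]
To see that all co-vectors arise in this way, suppose that $\omega$ is a co-vector (i.e. a linear map
from $T_pM$ to $\mathbb{R}$). Then it is defined by its action on a basis
\[ \omega\left( \frac{\partial}{\partial x^\mu} \right) = \omega_\mu \tag{7.52} \]
where
\[ \omega = \sum_\mu \omega_\mu\, dx^\mu , \qquad \text{and} \qquad g = \sum_{\mu, \nu} g_{\mu\nu}\, dx^\mu \otimes dx^\nu \tag{7.53} \]
We can therefore define the vector
\[ X_\omega^\mu = \sum_\nu g^{\mu\nu} \omega_\nu \tag{7.54} \]
where $g^{\mu\nu}$ is the matrix inverse to $g_{\mu\nu}$, which exists since $g$ is non-degenerate. It now
follows that
\begin{align*}
\omega_{X_\omega}(Y) &= g(X_\omega, Y) \\
&= \sum_{\mu, \nu} g_{\mu\nu} X_\omega^\mu Y^\nu \\
&= \sum_{\mu, \nu, \lambda} g_{\mu\nu} g^{\mu\lambda} \omega_\lambda Y^\nu \\
&= \sum_\nu \omega_\nu Y^\nu \\
&= \omega(Y) \tag{7.55}
\end{align*}
for all vector fields $Y$, so $\omega_{X_\omega} = \omega$.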

A metric tensor gives rise to an inverse metric (2, 0) tensor by
\[ g^{-1}(\omega_X, \omega_Y) = g(X, Y) \tag{7.56} \]
where we have used the fact that each co-vector can be identified with a unique vector.
Since this identification is linear, we see that
\[ g^{-1} : T^*M \times T^*M \to \mathbb{R} \tag{7.57} \]
is bilinear, and also symmetric.
We have already used this tensor. Using the above form for $\omega_X$, $\omega_Y$ we find that in a
particular coordinate system
\[ g^{-1}(\omega_X, \omega_Y) = \sum_{\mu, \nu} (g^{-1})^{\mu\nu} (\omega_X)_\mu (\omega_Y)_\nu = \sum_{\mu, \nu, \lambda, \rho} (g^{-1})^{\mu\nu} g_{\mu\lambda} X^\lambda g_{\nu\rho} Y^\rho \tag{7.58} \]
However, we also have
\[ g(X, Y) = \sum_{\lambda, \rho} g_{\lambda\rho} X^\lambda Y^\rho \tag{7.59} \]
and as these two expressions must be equal for all choices of $X, Y$, we find
\[ (g^{-1})^{\mu\nu} = g^{\mu\nu} \tag{7.60} \]
So the components of the tensor $g^{-1}$ are those of the inverse metric. A metric tensor allows
us to raise and lower the indices on tensors.
Once a metric is supplied there is a natural choice of connection, known as the Levi-Civita connection.
Theorem 7.4. On a (pseudo) Riemannian manifold, there is a unique connection $D$ such
that
(i) $D_X g = 0$ for any vector $X$.
(ii) The torsion of $D$ vanishes.

Proof. Let us start by assuming that such a connection exists. From the definition of a
covariant derivative on a (0, 2)-tensor we have
\[ 0 = (D_X g)(Y, Z) = X(g(Y, Z)) - g(D_X Y, Z) - g(Y, D_X Z) \tag{7.61} \]
for three vector fields $X$, $Y$, $Z$. This implies, along with its cyclic permutations
\begin{align*}
X(g(Y, Z)) &= g(D_X Y, Z) + g(Y, D_X Z) \\
Y(g(Z, X)) &= g(D_Y Z, X) + g(Z, D_Y X) \\
Z(g(X, Y)) &= g(D_Z X, Y) + g(X, D_Z Y) \tag{7.62}
\end{align*}
Next, we assume that $D$ has no torsion so that $D_X Y - D_Y X = [X, Y]$. We use the
torsion-free condition to rewrite the first term on the RHS of all three lines above
\begin{align*}
X(g(Y, Z)) &= g(D_Y X, Z) + g(D_X Z, Y) + g([X, Y], Z) \\
Y(g(Z, X)) &= g(D_Z Y, X) + g(D_Y X, Z) + g([Y, Z], X) \\
Z(g(X, Y)) &= g(D_X Z, Y) + g(D_Z Y, X) + g([Z, X], Y) \tag{7.63}
\end{align*}
Consider the second plus the third minus the first of these expressions
\begin{align*}
Z(g(X, Y)) + Y(g(Z, X)) - X(g(Y, Z)) &= 2 g(D_Z Y, X) \\
&\quad + g([Y, Z], X) + g([Z, X], Y) - g([X, Y], Z) \tag{7.64}
\end{align*}
Rearranging gives
\begin{align*}
2 g(D_Z Y, X) &= Z(g(X, Y)) + Y(g(Z, X)) - X(g(Y, Z)) \\
&\quad - g([Y, Z], X) - g([Z, X], Y) + g([X, Y], Z) \tag{7.65}
\end{align*}
Because $g$ is non-degenerate and $X$ is arbitrary, this uniquely determines $D_Z Y$.
Conversely, if $D$ is defined by (7.65), observe that on taking (7.65) and interchanging
$Z \leftrightarrow Y$, and subtracting, one finds
\[ 2 g(D_Z Y - D_Y Z + [Y, Z], X) = 0 \tag{7.66} \]
and as this must hold for all $X$, we find the torsion must vanish. Also, on taking (7.65),
interchanging $X \leftrightarrow Y$, and adding, one finds
\[ 2 g(D_Z Y, X) + 2 g(D_Z X, Y) = 2 Z(g(X, Y)) \tag{7.67} \]
and we recover the condition $D_Z g = 0$.
Having established these identities, it remains to check that $D$ defined in (7.65) does
indeed define a connection. Properties (i), (ii) of the definition of the connection follow
trivially. To test property (iii) we substitute $Z \to fZ$ in (7.65) and find
\begin{align*}
2 g(D_{fZ} Y, X) &= f Z(g(X, Y)) + Y(g(fZ, X)) - X(g(Y, fZ)) \\
&\quad - g([Y, fZ], X) - g([fZ, X], Y) + g([X, Y], fZ) \\
&= 2 f g(D_Z Y, X) + Y(f) g(Z, X) - X(f) g(Y, Z) - Y(f) g(Z, X) + X(f) g(Z, Y) \\
&= 2 f g(D_Z Y, X) \tag{7.68}
\end{align*}
To test property (iv) we substitute $Y \to fY$ in (7.65) and find
\begin{align*}
2 g(D_Z (fY), X) &= Z(g(X, fY)) + f Y(g(Z, X)) - X(g(fY, Z)) \\
&\quad - g([fY, Z], X) - g([Z, X], fY) + g([X, fY], Z) \\
&= 2 f g(D_Z Y, X) + Z(f) g(X, Y) - X(f) g(Y, Z) + Z(f) g(Y, X) + X(f) g(Y, Z) \\
&= 2 g(f D_Z Y + Z(f) Y, X) \tag{7.69}
\end{align*}
and so $D$ is a connection.

Therefore, given a metric tensor we also find a natural curvature tensor, the Riemann
curvature, which is the curvature tensor of the Levi-Civita connection.
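Theorem 7.5. In a local coordinate system $(x^1, \dots, x^n)$ the coefficients of the Levi-Civita
connection are
\[ \Gamma^\lambda_{\mu\nu} = \frac{1}{2} \sum_\rho g^{\lambda\rho} \left( \partial_\mu g_{\rho\nu} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu} \right) \tag{7.70} \]
Problem 7.5. Prove this.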
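
As an illustration of how the Levi-Civita coefficients are evaluated in practice, the following sketch computes them by finite differences for the flat metric $g = dr \otimes dr + r^2 d\theta \otimes d\theta$ in polar coordinates. It assumes the standard coordinate expression of the coefficients in terms of the metric and its first derivatives; the function names `christoffel_2d` and `polar_metric` are hypothetical, and the restriction to two dimensions is only to keep the matrix inverse explicit.

```python
def christoffel_2d(metric, x, h=1e-5):
    """Return gamma[lam][mu][nu] for a 2D metric at the point x (finite differences)."""
    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    # dg[rho][mu][nu] approximates the derivative of g_{mu nu} with respect to x^rho
    dg = []
    for rho in range(2):
        xp, xm = list(x), list(x)
        xp[rho] += h
        xm[rho] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(2)]
                   for m in range(2)])
    # Gamma^lam_{mu nu} = 1/2 g^{lam rho} (d_mu g_{rho nu} + d_nu g_{rho mu} - d_rho g_{mu nu})
    return [[[0.5 * sum(ginv[lam][rho] * (dg[mu][rho][nu] + dg[nu][rho][mu] - dg[rho][mu][nu])
                        for rho in range(2))
              for nu in range(2)]
             for mu in range(2)]
            for lam in range(2)]


def polar_metric(x):
    """Flat metric in polar coordinates x = (r, theta)."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


gamma = christoffel_2d(polar_metric, [2.0, 0.3])
```

At $r = 2$ the only non-vanishing entries are `gamma[0][1][1]` (close to $-r$) and `gamma[1][0][1] = gamma[1][1][0]` (close to $1/r$), which shows that the coefficients need not vanish even for a flat metric.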
7.3 Symplectic Manifolds
Definition 7.7. A symplectic manifold $(M, \omega)$ is a manifold equipped with a non-degenerate
2-form $\omega$ which is closed (i.e. $d\omega = 0$).
The non-degeneracy condition is equivalent to requiring that $\det \omega \neq 0$, where here the
determinant is taken with respect to the components of $\omega$ in any local co-ordinate system,
treating $\omega$ as an antisymmetric matrix with components
\[ \omega_{\mu\nu} = \omega\left( \frac{\partial}{\partial x^\mu}, \frac{\partial}{\partial x^\nu} \right) . \tag{7.71} \]
The condition on the determinant is formulated in terms of co-ordinates, but can be
re-expressed in a co-ordinate independent fashion. The non-vanishing of $\det \omega$ is equivalent
to requiring the following: given any co-vector field $\alpha$, there exists a unique vector field $X$
such that
\[ \alpha = i_X \omega \tag{7.72} \]
(where we recall that $i_X \omega(Y) = \omega(X, Y)$ for vector field $Y$).
We also have the following theorem
Theorem 7.6. If $(M, \omega)$ is a symplectic manifold then the dimension of M must be even.
Proof. The fact that a symplectic manifold must have even dimension follows from
\[ \det \omega = \det \omega^T = \det(-\omega) = (-1)^{\dim(M)} \det \omega \tag{7.73} \]
where in the above line, we regard $\omega$ as an anti-symmetric matrix with components in
some local basis. So if M were odd-dimensional, this would force $\omega$ to have vanishing
determinant, which is a contradiction.

We remark that there are many examples of symplectic manifolds. Of particular
importance to symplectic manifolds is the following theorem:
Theorem 7.7 (Darboux Theorem). If $(M, \omega)$ is a 2m-dimensional symplectic manifold,
then there exists about each point a local co-ordinate system $(q^1, \dots, q^m, p_1, \dots, p_m)$ such
that
\[ \omega = \sum_{i=1}^{m} dp_i \wedge dq^i \tag{7.74} \]
Proof. We shall not prove this here. The proof is non-examinable.
Definition 7.8. If $(M, \omega)$ is a symplectic manifold, denote by $\omega^{\mu\nu}$ the matrix inverse of $\omega_{\mu\nu}$
(in local co-ordinates), which satisfies $\sum_\lambda \omega^{\mu\lambda} \omega_{\lambda\nu} = \delta^\mu_\nu$.
It is instructive to compare Riemannian or pseudo-Riemannian manifolds with symplectic manifolds. In the former case, the fundamental structure is the metric. In the latter
case, the fundamental structure is provided by the symplectic form $\omega$.
There are some similarities; for example for Riemannian or pseudo-Riemannian manifolds, we have seen that given a vector field $X$ one can construct an associated 1-form
$\omega_X$ via $\omega_X(Y) = g(X, Y)$. For symplectic manifolds, given $X$, there is a natural 1-form
associated with $X$ which is $i_X \omega$.
However, there are also crucial differences between Riemannian/pseudo-Riemannian
manifolds and symplectic manifolds. In particular, we have seen that symplectic manifolds must be even-dimensional, whereas Riemannian/pseudo-Riemannian manifolds can
be either even or odd-dimensional.
The Darboux theorem also indicates a significant difference. This can be seen when
one recalls that for a Riemannian or pseudo-Riemannian manifold, given a particular point
$p \in M$, one can always diagonalize the metric at $p$, the diagonal components of $g$ being
non-zero constants. However, in general, one cannot argue that this form of $g$ holds in
some patch containing $p$, for some local co-ordinates. This is because, if this were true,
then the connection coefficients of the Levi-Civita connection would vanish in these co-ordinates, which would in turn imply that the Riemann curvature would vanish; and the
vanishing of the Riemann curvature is a co-ordinate independent condition. However, there
are manifolds (such as $S^2$) for which it is known that the Riemann curvature is nonzero. So
it is apparent that in general, given $p \in M$, one cannot arrange for some local co-ordinate
system around $p$ in which the components of $g$ are constant. However, for symplectic
manifolds, the Darboux theorem implies that given $p \in M$ there does exist some local
co-ordinate system in which the components of $\omega$ are constant!
Definition 7.9. If $(M, \omega)$ is a symplectic manifold, then the Poisson bracket $\{ , \}$ is a map
$C^\infty(M) \times C^\infty(M) \to C^\infty(M)$ defined by
\[ \{f, g\} = \sum_{\mu, \nu} \omega^{\mu\nu}\, \partial_\mu f\, \partial_\nu g \tag{7.75} \]
Theorem 7.8. If $(M, \omega)$ is a symplectic manifold then the Poisson bracket satisfies
(i) $\{f, g\} = -\{g, f\}$
(ii) $\{f_1 + f_2, g\} = \{f_1, g\} + \{f_2, g\}$
(iii) $\{\lambda f, g\} = \lambda \{f, g\}$ (for constant $\lambda \in \mathbb{R}$)
(iv) $\{f, gh\} = \{f, g\} h + g \{f, h\}$
(v) $\{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0$
where $f, g, h, f_1, f_2 \in C^\infty(M)$.
Proof. (i)-(iv) follow straightforwardly from the definition of the Poisson bracket and are
left as an exercise. One does not require the condition $d\omega = 0$ for these.
To prove (v), one must make use of $d\omega = 0$, and the proof is left as a problem for the
final example sheet. (It is not necessary to use the Darboux theorem to prove (v)).

7.4 Applications to classical mechanics


There is a close relationship between symplectic manifolds and Hamiltonian dynamics.
Using the Darboux theorem, work with local co-ordinates $\{q^1, \dots, q^m, p_1, \dots, p_m\}$ such
that
\[ \omega = \sum_{i=1}^{m} dp_i \wedge dq^i \tag{7.76} \]
so that in components
\[ \omega_{\mu\nu} = \begin{pmatrix} 0 & -I_m \\ I_m & 0 \end{pmatrix} , \qquad \omega^{\mu\nu} = \begin{pmatrix} 0 & I_m \\ -I_m & 0 \end{pmatrix} \tag{7.77} \]
Suppose that a classical system is described by a Hamiltonian function $H = H(p, q) \in C^\infty(M)$. We define the Hamiltonian vector field $X_H$ by
\[ X_H^\mu = \sum_\nu \omega^{\mu\nu}\, \partial_\nu H \tag{7.78} \]
The classical trajectories of particles whose motion is constrained by $H$ correspond to
integral curves of $X_H$ parametrized by $t$. The integral curves are given by
\[ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} , \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q^i} \tag{7.79} \]
which are simply Hamilton's equations.
Furthermore along these curves
\[ \frac{dH}{dt} = L_{X_H} H = X_H H = \sum_{\mu, \nu} \omega^{\mu\nu}\, \partial_\nu H\, \partial_\mu H = 0 \tag{7.80} \]
i.e. $H$ is constant along the integral curves of $X_H$, which corresponds to conservation of
energy. Also note that for any function $f = f(p, q)$,
\[ L_{X_H} f = X_H(f) = \{f, H\} \tag{7.81} \]
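
The statements above can be illustrated numerically for a single degree of freedom. The sketch below is an assumption-laden example rather than part of the course material: it takes the harmonic oscillator $H = (p^2 + q^2)/2$ as a hypothetical Hamiltonian, builds the Hamiltonian vector field from Hamilton's equations, and follows an integral curve with a standard fourth-order Runge-Kutta step. The names `X_H`, `rk4_step` and `flow` are introduced here for illustration.

```python
import math

M = 1  # number of degrees of freedom; a point is z = [q, p]


def H(z):
    """Hypothetical Hamiltonian: a unit-mass, unit-frequency oscillator."""
    q, p = z
    return 0.5 * (p * p + q * q)


def grad_H(z):
    """Gradient [dH/dq, dH/dp] of H."""
    q, p = z
    return [q, p]


def X_H(z):
    """Hamiltonian vector field: (dq/dt, dp/dt) = (dH/dp, -dH/dq)."""
    dH = grad_H(z)
    return dH[M:] + [-c for c in dH[:M]]


def rk4_step(z, dt):
    """One classical Runge-Kutta step along the integral curve of X_H."""
    def shifted(base, k, s):
        return [b + s * c for b, c in zip(base, k)]
    k1 = X_H(z)
    k2 = X_H(shifted(z, k1, dt / 2))
    k3 = X_H(shifted(z, k2, dt / 2))
    k4 = X_H(shifted(z, k3, dt))
    return [zi + dt / 6 * (a + 2 * b + 2 * c + d)
            for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]


def flow(z0, t_final, steps=1000):
    """Approximate the point reached after time t_final starting from z0."""
    z = list(z0)
    dt = t_final / steps
    for _ in range(steps):
        z = rk4_step(z, dt)
    return z


z0 = [1.0, 0.0]
z1 = flow(z0, 2 * math.pi)
```

For this Hamiltonian the integral curves are circles in the $(q, p)$ plane traversed with period $2\pi$, so `z1` returns close to `z0` and `H(z1)` stays close to `H(z0)`, in line with conservation of energy.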

8. Spheres with different differentiable structures


For some time it was thought that every topological manifold admits a
differentiable structure, and that this structure is essentially unique. This is because it was thought that one can always find an associated
smooth atlas, i.e. that all the corners could be smoothed out. See example sheet 1 for how this can be done for a square.
It was therefore a surprise when Milnor noticed that there exist exotic 7-spheres which are homeomorphic but not diffeomorphic to the standard 7-sphere, i.e. the exotic spheres are the same
as the standard sphere as topological manifolds but different as differentiable manifolds. The
proof is too involved to be described in this introductory course, but one can describe
some of these exotic spheres fairly easily. For this, we need the definition of a fibre bundle.
Definition 8.1 (Fibre Bundle). Let $G$ be a topological group which acts effectively on a
space $F$ on the left. $(E, B, F, \pi)$ is a fibre bundle with total space $E$, base space $B$, fibre
$F$, structure group $G$ and projection $\pi : E \to B$ if $B$ admits an open cover $\{U_\alpha\}$ such that
there are fibre-preserving homeomorphisms
\[ \phi_\alpha : E|_{U_\alpha} \equiv \pi^{-1}(U_\alpha) \to U_\alpha \times F \tag{8.1} \]
and the transition functions are continuous functions with values in $G$:
\[ g_{\alpha\beta}(x) = \phi_\alpha \circ \phi_\beta^{-1} \big|_{\{x\} \times F} \in G \tag{8.2} \]
Since we are interested in manifolds, all spaces and maps can be taken to be smooth.
Fibre bundles are a generalization of the product of two manifolds. Examples of fibre bundles
include the tangent bundle $TM$ and cotangent bundle $T^*M$ of a manifold M, as well as all the bundles
associated with forms and tensors.
The transition functions on triple overlaps $U_\alpha \cap U_\beta \cap U_\gamma$ satisfy the cocycle condition
\[ g_{\alpha\beta}\, g_{\beta\gamma}\, g_{\gamma\alpha} = 1 \tag{8.3} \]
Conversely, given spaces $B$ and $F$, a topological group $G$ acting on $F$, and a set of
transition functions in $G$, one can construct a fibre bundle as
\[ E = \Big( \bigsqcup_\alpha U_\alpha \times F \Big) \Big/ \big( (x, y) \sim (x, g_{\alpha\beta}(x) y) \big) \tag{8.4} \]
for $(x, y) \in U_\beta \times F$ and $(x, g_{\alpha\beta}(x) y) \in U_\alpha \times F$.
Exotic 7-spheres are fibre bundles over $S^4$ with fibre $S^3$. The structure group is $SO(4)$.
The total space $M_k^7$ can be constructed from two copies of $\mathbb{R}^4 \times S^3$ by identifying
the subsets $(\mathbb{R}^4 - \{0\}) \times S^3$ under the diffeomorphisms
\[ (u, v) \to (u', v') = \left( \frac{u}{\|u\|^2} , \frac{u^h v u^j}{\|u\|} \right) \tag{8.5} \]
where $u, v$ are viewed as quaternions, and $h + j = 1$, $k = h - j$ is an odd integer. Observe that
$u \to u' = \frac{u}{\|u\|^2}$ is the transition function of $S^4$ (prove it!). Observe also that $\|v'\| = \|v\| = 1$,
i.e. this diffeomorphism is well-defined.

Theorem 8.1. For $k^2 \not\equiv 1 \bmod 7$, the $M_k^7$ are homeomorphic to the standard $S^7$ but not diffeomorphic to it.
Proof. Here we shall briefly sketch a proof. To prove that the $M_k^7$ are homeomorphic to the standard $S^7$, one uses Morse theory. In particular, one constructs a function on $M_k^7$ which has
precisely two critical points, both non-degenerate: one global maximum and one global minimum. Then Morse theory (Reeb's theorem) implies that
such a manifold is homeomorphic, but not necessarily diffeomorphic, to a sphere. This
function can be given explicitly as $f(x) = \mathrm{Re}(v)/(1 + \|u\|^2)^{1/2}$ in one patch and in the other
as $f(x) = \mathrm{Re}(u'')/(1 + \|u''\|^2)^{1/2}$, where $\mathrm{Re}(v)$ is the real part of the quaternion $v$ and we have
made an additional coordinate transformation $u'' = u' (v')^{-1}$, $v'' = v'$. Prove that on the
overlaps the values of $f$ are the same, so it is a good function on $M_k^7$.
The part of the proof which demonstrates that the $M_k^7$ are NOT diffeomorphic to the standard 7-sphere is more involved and requires the use of cobordism theory and the computation of characteristic classes, which are not part of this introductory module. Nevertheless,
as we have seen, exotic spheres can be constructed very explicitly.
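
The agreement of the two expressions for the function on the overlap can be checked numerically with quaternion arithmetic. This is a minimal sketch under stated assumptions: quaternions are stored as 4-tuples (real part first), the second patch is taken to use $\mathrm{Re}(u'')$ with $u'' = u'(v')^{-1}$, and the helper names `qmul`, `qinv`, `qpow`, `transition`, `f_first` and `f_second` are hypothetical. The choice $k = 3$, i.e. $h = 2$, $j = -1$, is just one example.

```python
import math
import random


def qmul(a, b):
    """Product of two quaternions stored as (real, i, j, k)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


def qnorm(a):
    return math.sqrt(sum(c * c for c in a))


def qinv(a):
    n2 = sum(c * c for c in a)
    return (a[0] / n2, -a[1] / n2, -a[2] / n2, -a[3] / n2)


def qpow(a, n):
    """Integer power of a quaternion; negative n uses the inverse."""
    base = a if n >= 0 else qinv(a)
    out = (1.0, 0.0, 0.0, 0.0)
    for _ in range(abs(n)):
        out = qmul(out, base)
    return out


def transition(u, v, h, j):
    """Gluing map (u, v) -> (u / |u|^2, u^h v u^j / |u|)."""
    nu = qnorm(u)
    u1 = tuple(c / nu ** 2 for c in u)
    v1 = tuple(c / nu for c in qmul(qmul(qpow(u, h), v), qpow(u, j)))
    return u1, v1


def f_first(u, v):
    return v[0] / math.sqrt(1 + qnorm(u) ** 2)


def f_second(u1, v1):
    u2 = qmul(u1, qinv(v1))  # u'' = u' (v')^{-1}
    return u2[0] / math.sqrt(1 + qnorm(u2) ** 2)


random.seed(0)
h, j = 2, -1  # h + j = 1 and k = h - j = 3
samples = []
for _ in range(20):
    u = tuple(random.uniform(0.5, 2.0) * random.choice((-1, 1)) for _ in range(4))
    w = tuple(random.gauss(0.0, 1.0) for _ in range(4))
    v = tuple(c / qnorm(w) for c in w)  # unit quaternion, a point of S^3
    u1, v1 = transition(u, v, h, j)
    samples.append((f_first(u, v), f_second(u1, v1), qnorm(v1)))
```

Each entry of `samples` holds the value of the function computed in the first patch, the value computed in the second patch, and the norm of $v'$; the first two coincide and the third equals 1 up to rounding.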
