
Discrete Structures

Prof. Stefan Wolf


Arne Hansen

January 18, 2016


Contents

1 Motivation 6
1.1 Swapping knights . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Combinatorics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Connections without crossings . . . . . . . . . . . . . . . . . . . . 7
1.4 Coloring maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Propositional Logic 13
2.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 Connectives as truth functions . . . . . . . . . . . . . . . 14
2.2 Syntax of propositional logic . . . . . . . . . . . . . . . . . . . . 16
2.3 Semantics of propositional logic . . . . . . . . . . . . . . . . . . . 18
2.4 Normal forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.5 Models and semantic conclusion . . . . . . . . . . . . . . . . . . . 23
2.6 Proof theory of propositional logic . . . . . . . . . . . . . . . . . 25
2.6.1 An Excursion into Complexity Theory . . . . . . . . . . . 27
2.7 The Resolution Calculus . . . . . . . . . . . . . . . . . . . . . . . 28

3 Set theory 32
3.1 Basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Cantor's paradise . . . . . . . . . . . . . . . . . . . . . . . 33
3.1.2 Zermelo-Fraenkel set theory . . . . . . . . . . . . . . . . . 33
3.1.3 Laws derived from logics . . . . . . . . . . . . . . . . . . . 37
3.1.4 The Cartesian Product . . . . . . . . . . . . . . . . . . . . 37
3.2 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2.1 Representation of relations . . . . . . . . . . . . . . . . . 44
3.2.2 Properties of relations . . . . . . . . . . . . . . . . . . . . . 44
3.2.3 Equivalence relations . . . . . . . . . . . . . . . . . . . . . 45
3.2.4 Order relations . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4 Combinatorics 57
4.1 Basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 Urn models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3 Combinatorial rules and counting strategies . . . . . . . . . . . . 64
4.3.1 The Pigeonhole Principle . . . . . . . . . . . . . . . . . . 68
4.3.2 Double Counting . . . . . . . . . . . . . . . . . . . . . . . 69
4.4 Binomial coefficients: Properties and approximations . . . . . . . 73
4.4.1 Symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

4.4.2 Vandermonde Identity . . . . . . . . . . . . . . . . . . . . 73
4.4.3 Binomial theorem . . . . . . . . . . . . . . . . . . . . . . 74
4.4.4 Approximation of the binomial coefficient . . . . . . . . . 74
4.5 An Excursion into information theory: Data compression . . . . 76
4.6 Special counting problems . . . . . . . . . . . . . . . . . . . . . . 79
4.6.1 Equivalence relations . . . . . . . . . . . . . . . . . . . . . 79
4.6.2 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . 80

5 Graph Theory 83
5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.2 Basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.2.1 Basic notions for simple undirected graphs . . . . . . . . . 86
5.3 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4 Some special graphs . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.5 Euler tours and Hamilton cycles . . . . . . . . . . . . . . . . . . 101
5.5.1 Seven bridges of Königsberg . . . . . . . . . . . . . . . . . 101
5.6 Planar graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.7 Graph colorings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

6 Cryptography 113
6.1 Diffie-Hellman key exchange . . . . . . . . . . . . . . . . . . . . . 113
6.2 RSA cryptosystem . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.2.1 Euclid's Algorithm . . . . . . . . . . . . . . . . . . . . . . 118
6.2.2 The Chinese remainder theorem . . . . . . . . . . . . . . 120
6.2.3 Fermat's theorem . . . . . . . . . . . . . . . . . . . . . . . 121
6.2.4 The RSA protocol . . . . . . . . . . . . . . . . . . . . . . 122

Appendices 124

A Proof techniques 125


A.1 Proof by contradiction . . . . . . . . . . . . . . . . . . . . . . . . 125
A.2 Proof by induction . . . . . . . . . . . . . . . . . . . . . . . . . . 125

What is discrete
mathematics all about?

Before venturing into formal details, let's briefly clarify what we refer to when we
speak about discrete mathematics.

What is mathematics?
The pillars of mathematics are logics and set theory. While logics determines
how to reason within mathematics (what is considered a proof, . . . ), set theory
describes the objects we deal with.
Example (Set theory). It is actually possible to define all natural numbers using
merely the empty set:

0 := ∅
1 := {∅}
2 := {∅, {∅}}
⋮

This exemplifies how set theory serves to define basic mathematical entities.

Figure 1: The pillars of mathematics: set theory and logic.

Figure 2: Matching the rational and the natural numbers

What does discrete mean?


With discrete we usually refer to finite or countably infinite sets. While it is intuitively clear what is meant by finite (a set whose elements we can label with the natural numbers 1, 2, 3, . . . , N), countably infinite needs slightly more explanation. A set S is called countably infinite if one can match each element in S with a unique natural number. More formally, such a matching would be a function, and we get the following definition.
Definition 0.1 (Countable set). A set S is called countable if there exists a one-to-one or injective¹ function f : S → ℕ. If f is also surjective², then S is called countably infinite.
To better understand the concept of countability, we'll consider two common examples.
Example (The rational numbers are countable). Even though there seem to be
a lot more rational numbers than natural numbers, we can construct a matching
showing that the set is countably infinite. The rational numbers can be asso-
ciated with dots in a two-dimensional space, with the numerator on the x-axis
and the denominator on the y-axis or vice versa. We can then number the dots
in a spiral as shown in figure 2.
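This numbering can also be carried out mechanically. The following sketch (function and variable names are our own, not part of the lecture notes) walks the grid of numerator/denominator pairs diagonal by diagonal, a variant of the spiral of figure 2, and assigns each reduced fraction a natural number:

from math import gcd

def enumerate_rationals(limit):
    # Number the positive rationals by walking the (numerator, denominator)
    # grid diagonal by diagonal, skipping duplicates such as 2/4 (= 1/2).
    matching, count, diagonal = {}, 0, 2
    while count < limit:
        for numerator in range(1, diagonal):
            denominator = diagonal - numerator
            if gcd(numerator, denominator) == 1:   # reduced fraction: a new rational
                count += 1
                matching[(numerator, denominator)] = count
                if count == limit:
                    break
        diagonal += 1
    return matching

print(enumerate_rationals(8))
# {(1, 1): 1, (1, 2): 2, (2, 1): 3, (1, 3): 4, (3, 1): 5, (1, 4): 6, (2, 3): 7, (3, 2): 8}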
Example (The real numbers are not). We cannot find such a matching for the
real numbers, as one can prove by contradiction. Let's assume that there is a
matching for all real numbers in the interval [0, 1] with a binary representation

1 ↦ 0.0101011100110 . . .
2 ↦ 0.1001100101001 . . .
3 ↦ 0.0001011001010 . . .
⋮
But one can construct a real number that is not contained in the list: the
ith digit of the binary representation is just the flipped value of the ith digit of
the ith real number.
1 ↦ 0.0101011100110 . . .
2 ↦ 0.1001100101001 . . .
3 ↦ 0.0001011001010 . . .
⋮
constructed real number: 0.111 . . .

¹ This is to say, each element gets assigned a unique natural number; in other words, no natural number is matched to two different elements of S.
² That is, every natural number is matched to an element of S.

The number we constructed differs from every element in the list and is therefore not contained in the list. This yields a contradiction with the initial assumption that there existed a complete matching. Thus the size of the set of real numbers is strictly larger than the size of the set of natural numbers (in the sense that there is no bijective map between the two):

|ℝ| > |ℚ| = |ℕ|    (1)
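The diagonal construction can be illustrated on a finite prefix of such a list; the actual argument of course concerns the infinite list. A small sketch of ours:

def diagonal_number(listed_expansions):
    # Given binary expansions (digits after "0."), return an expansion that
    # differs from the i-th listed one in its i-th digit.
    return ''.join('1' if expansion[i] == '0' else '0'
                   for i, expansion in enumerate(listed_expansions))

listing = ["0101011100110", "1001100101001", "0001011001010"]
print(diagonal_number(listing))   # '111', differing from entry i in digit i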

Why bother?
What is the relevance of discrete mathematics? On the one hand, logic is discrete, as the basic set is {TRUE, FALSE}. On the other hand, computer science is not only closely related to logics, but, as computers have only finitely many states, informatics is in a sense discrete.
Chapter 1

Motivation

After having clarified the meaning of discrete mathematics we will look into
interesting examples from the field.

1.1 Swapping knights


Consider a reduced chess board with two knights of each color as shown in
figure 1.1. Is it possible to transform the configuration on the left into the
configuration on the right using regular knight moves?
The knights never reach the middle field. Further, following the possible moves one eventually returns to the initial position. This can be represented nicely in a graph: each reachable field is a node, and the possible moves are the edges of the graph. Rearranging the nodes yields one closed loop with 8 nodes. The knights can merely move around the loop but not change their order. Thus the transformation is impossible.

Lessons learnt from the example above:

Modelling Find the right model to represent the problem. Graphs are often a good structure to model discrete problems.

Abstraction Concentrate on the relevant and get rid of the irrelevant.
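The modelling step can be checked by a short program. The sketch below (board coordinates and helper names are our own) builds the graph of knight moves on the 3×3 board without the centre field and confirms that it is a single closed loop of 8 nodes:

from itertools import product

squares = [(c, r) for c, r in product(range(3), range(3)) if (c, r) != (1, 1)]
jumps = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]

# Adjacency list: an edge for every legal knight move between two fields.
edges = {s: [] for s in squares}
for (c, r) in squares:
    for (dc, dr) in jumps:
        if (c + dc, r + dr) in edges:
            edges[(c, r)].append((c + dc, r + dr))

# Every field has exactly two neighbours, so the graph is a union of cycles;
# walking along it shows that all 8 fields lie on one single cycle.
assert all(len(nbrs) == 2 for nbrs in edges.values())
cycle, previous, current = [squares[0]], None, squares[0]
while True:
    nxt = [n for n in edges[current] if n != previous][0]
    if nxt == squares[0]:
        break
    cycle.append(nxt)
    previous, current = current, nxt
print(len(cycle))   # 8: the knights can only rotate around this loop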

Figure 1.1: Reduced chessboard with knights

Figure 1.2: Graphs corresponding to the knights swapping problem

Figure 1.3: Necklace with further symmetries, but p not being prime.

1.2 Combinatorics
Imagine you want to create a necklace with p pearls, where p is prime. Pearls in a different colors are available. How many different necklaces can you make? If we could distinguish the pearls, there would be a^p different possible necklaces. But the necklace has a rotational symmetry: two necklaces that are made up from the same sequence of pearls, except rotated by some angle, are indistinguishable. Just dividing by p does not yield the right solution though, as the one-colored necklaces appear only once. As p is a prime number these are the only exceptions; configurations like the one in figure 1.3 cannot occur. Therefore the right answer is

N = (a^p − a)/p + a    (1.1)

More generally we obtain that, for a prime number p, a^p − a is divisible by p. This is called Fermat's little theorem and plays a role in cryptography, namely in the public-key protocol RSA.
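The counting argument can be verified by brute force for small parameters. The following sketch (our own illustration) counts necklaces as equivalence classes of pearl sequences under rotation and compares the result with formula (1.1):

from itertools import product

def count_necklaces(a, p):
    # Count length-p sequences over a colors, identifying rotated copies.
    seen, necklaces = set(), 0
    for seq in product(range(a), repeat=p):
        if seq in seen:
            continue
        necklaces += 1
        for shift in range(p):                  # mark all rotations as counted
            seen.add(seq[shift:] + seq[:shift])
    return necklaces

for a, p in [(2, 3), (3, 5), (4, 7)]:
    assert count_necklaces(a, p) == (a**p - a) // p + a   # equation (1.1)
print("formula (1.1) confirmed for the tested cases")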

1.3 Connections without crossings


Imagine a settlement with three houses and three plants (power, waste water, . . . ) as shown in figure 1.4. The question is whether it is possible to connect the three houses to each of the plants without crossing other connections.
This problem can again be turned into a graph, namely K_{3,3}, as shown in figure 1.5. The question is then: can the associated graph be drawn such that edges only meet in nodes? If so, the graph is called planar. An example of a planar graph is K_4, shown in figure 1.6.
So we ultimately want to show that K_{3,3} is not planar. To do so we first try to better understand the properties of planar graphs.

Figure 1.4: Houses with connections. The third house cannot be connected
anymore without intersecting other connections.

Figure 1.5: The houses connection problem can be associated with the graph K_{3,3}

Figure 1.6: The graph K4 is an example of a planar graph

Figure 1.7: A cube can be transformed into a graph by removing the bottom
(turns into outer region) and pressing the top down while spreading out the
lower nodes.

Figure 1.8: A simple tree

Graphs and polyhedrons There is an interesting analogy between graphs and polyhedrons. A polyhedron can be associated with a graph: imagine the polyhedron was made out of elastic rubber. One could now remove one face and press the remaining part onto a piece of paper. This would yield a planar graph. Edges of the polyhedron become edges of the graph, corners turn into nodes of the graph. Note that the face that was initially removed now corresponds to the infinite region outside the graph (which is always considered as a proper region). Figure 1.7 shows the graph corresponding to a cube.
Looking at different polyhedra (or at planar graphs) it becomes apparent that the number of vertices (nodes) n, the number of edges e and the number of faces f (regions) are closely related. Indeed, Euler's polyhedron formula states that for any polyhedron (and equivalently for any planar graph)

n − e + f = 2    (1.2)

Before looking at a sketch of the proof we can apply it to K_{3,3} to answer our initial question: if K_{3,3} were planar, then n = 6 and e = 9 would force f = 2 − n + e = 5; but every region in a drawing of K_{3,3} is bounded by at least four edges (the graph contains no triangles), so 2e ≥ 4f would have to hold, i.e. 18 ≥ 20, a contradiction.

Proof sketch The proof consists of two parts: first we will show the proposition for trees, which form a simpler class of graphs. Then we will generalize this to arbitrary planar graphs.

Euler for trees A tree is a connected graph without any cycles. As it contains only one region, namely the outer one, it is always planar. We can now show by induction that Euler's law is always satisfied for trees.
If there is just one node, then the formula is satisfied as n = 1, e = 0 and f = 1. For the induction step we now assume that the law holds for a tree with n nodes and conclude that it will then hold for a tree with n + 1 nodes. Adding a node requires adding an edge as well. So the new tree will have e + 1 edges and n + 1 nodes (and still a single region), and therefore

(n + 1) − (e + 1) + 1 = n − e + f = 2    (1.3)

where the second equality holds due to the induction assumption.

Generalization A general planar graph can now be characterized by a spanning tree and a dual graph. A spanning tree is a subgraph that connects all nodes and is at the same time a tree. The dual graph to a given spanning tree is constructed by placing a node in each region and connecting neighbouring regions without crossing the spanning tree; the result is again a tree.
The spanning tree has the same number of nodes as the original graph, n_s = n, and one edge less than nodes, e_s = n_s − 1. The number of nodes of the dual tree equals the number of regions of the graph, n_d = f. Further, each edge of the graph is either part of the spanning tree or crossed by exactly one edge of the dual tree. Putting all this together we obtain

e = e_s + e_d = (n_s − 1) + (n_d − 1) = (n − 1) + (f − 1) = n + f − 2    (1.4)

which is just Euler's formula. The subsequent example will employ the formula as well.

1.4 Coloring maps


Imagine we were given a map showing countries and their borders¹. Every two countries sharing a border are considered neighbors. So how many colors are needed to color the map in a way that no neighbors have the same color? Three colors are not enough, as can be seen in figure 1.10.
Again the problem can be translated to planar graphs without multi-nodes². Each country is replaced by a node and neighbouring countries are connected by edges. So can the nodes be colored with 4 colors, requiring that neighbors are always colored differently? The answer is yes, as a computer-aided proof showed. As the proof can never be verified directly by a human being, there has been some suspicion towards the validity of the proof. A shorter proof shows that 5 colors are sufficient. In the following we sketch a proof for 6 colors.

¹ For simplicity we'll assume that there are no exclaves like Campione d'Italia.
² Multi-nodes are pairs of nodes with more than one edge connecting the two.

Figure 1.10: The figure on the left shows that it is possible to arrange four countries such that they are all neighbors to one another. Thus we need at least four colors. The corresponding graph is shown on the right.
Proposition 1.4.1 (6-Coloring). Six colors suffice to color a map (or equivalently
the nodes of a planar, non-multi graph).
Proof. As there is a finite number of nodes (countries) we will prove the propo-
sition by induction.

Base case For 6 or less nodes the proposition holds, as there are at least as
many colors as nodes.

Induction hypothesis We will assume that the proposition holds for k nodes.

Induction step The major part of the proof is now to show that, given the induction hypothesis, the proposition also holds for k + 1 nodes. The idea is as follows:

Eliminate a node v.

By the induction hypothesis there exists a coloring for the reduced graph.

One needs to find a coloring for the node v such that the proposition still holds. The goal is to choose the node v such that it has 5 or fewer neighbors, so that one color is still available for it.

While the first two points are clear, we still need to prove the last one, which essentially amounts to the following lemma.
Lemma 1.4.1. In every planar graph without multi-nodes there exists at least one node v with 5 or fewer neighbors.

Proof (of the lemma). We will prove the lemma by contradiction. We assume that the lemma was actually false, i.e. that each (!) node has at least 6 neighbors. Based on this assumption we will construct a contradiction with Euler's formula.
As every node has at least 6 neighbors and each edge connects two nodes, the edges and the nodes are related as follows:

2e ≥ 6n    (1.5)

Similarly the regions and the edges are related. Every region is bounded by at least 3 edges (NB: two edges only suffice for multi-graphs). So we obtain

2e ≥ 3f    (1.6)

Putting all this together we obtain

n + f ≤ e/3 + 2e/3 = e,  i.e.  n − e + f ≤ 0    (1.7)

This is in contradiction with Euler's formula and therefore completes the proof of the lemma.
The proof of the lemma was the last missing piece. We now know that we can always find a node v with 5 or fewer neighbors to reduce a graph with k + 1 nodes to a graph with k nodes. This completes the induction step and thus the proof of the proposition.

Chapter 2

Propositional Logic

In the introduction we mentioned logic as one of the pillars of mathematics. Now we will formally introduce propositional logic (PL), a branch of logic concerned with propositions and how they can be constructed from basic propositions, so-called atomic propositions.

2.1 Definitions
Logic is ultimately about statements and the question whether they are true or
false. This serves to formulate the following definition.

Definition 2.1 (Proposition). A proposition is a sentence or an expression which is either true or false.
Note that we don't ask whether the statement is physically meaningful. The only criterion is that a proposition is clearly either true or false. This property is called truth-definiteness.

Definition 2.2 (Atom). An atom or atomic proposition is a basic proposition that is not composed from other propositions.
Definition 2.3 (Connective). A connective relates two propositions logically.
Example 2.1.1. This example shows how atoms (or atomic propositions) can be connected by connectives to form a composed proposition.

If it rains (atom), then (connective) the streets are wet (atom).    (2.1)

The whole sentence is a composed proposition.

Figure 2.1: Logical AND gate

The following sentence connects three atoms A, B, C:

When the rooster crows on the dungheap (A), then the weather changes (B) or remains the same (C).

The conditioning atom A is without effect, as the second part is always true: the atom C is just the negation of the atom B, C = ¬B, and therefore C ∨ B = (¬B) ∨ B = true. Such a proposition, which is always true due to its form, is called a tautology.
Propositions can be confusing when they refer to themselves. The following negation is false:

This is not a proposition.

while the next one does not seem to be a proper proposition at all, as one cannot decide whether it is false or true:

This proposition is false.

2.1.1 Connectives as truth functions


A connective can be considered as a function of propositions, yielding yet an-
other proposition. They can be characterized by so-called truth tables (see
below). As propositions represent two values, just as bits in a computer, there
is a close relation to gates (i.e. the abstract building blocks of a processor).
Subsequently we will characterize some connectives.

Conjunction or AND gate is the connective that returns true if and only if both arguments are true.

A      B      A ∧ B
true   true   true
true   false  false    (2.2)
false  true   false
false  false  false

Negation or NOT gate returns the opposite of the input truth value.

A      ¬A
true   false    (2.3)
false  true

Together the AND and the NOT gate suffice to realize any other gate. We say these gates are universal. Interestingly, just one gate, namely the NAND, is universal by itself.

NAND gate The two gates above can be combined into a new gate, the negated AND gate, short NAND gate:

A | B := ¬(A ∧ B)    (2.4)

Consequently the truth table is just the inverted AND table:

A      B      A | B
true   true   false
true   false  true    (2.5)
false  true   true
false  false  true

To get an idea of how the NAND gate can serve to simulate all other gates, one obtains the NOT gate by using A for both inputs: ¬A = A | A. One can now use this NOT gate and the NAND to obtain an AND gate, and so forth.

Inclusive disjunction or OR gate returns true whenever at least one of the two inputs is true.

A      B      A ∨ B
true   true   true
true   false  true    (2.6)
false  true   true
false  false  false

Note though that this is usually not what we refer to by "or". In common language "or" means "either . . . or . . .", corresponding to the following gate.

Exclusive disjunction or exclusive OR or XOR gate returns true if and only if exactly one of the two arguments is true.

A      B      A ⊕ B        A  B  A ⊕ B
true   true   false        1  1  0
true   false  true         1  0  1    (2.7)
false  true   true         0  1  1
false  false  false        0  0  0

The XOR symbol resembles a plus for a reason: if we replace false by 0 and true by 1, then the XOR is just addition modulo 2 (as we will see later in the course). This replacement by 0 and 1 simplifies matters. We will therefore use digits from now on.
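Since connectives are just functions on {0, 1}, they are easy to model directly. The small sketch below (our own illustration, not part of the lecture notes) defines a few gates and checks that the NAND gate alone reproduces NOT and AND:

def NOT(a):      return 1 - a
def AND(a, b):   return a & b
def OR(a, b):    return a | b
def NAND(a, b):  return NOT(AND(a, b))
def XOR(a, b):   return (a + b) % 2            # addition modulo 2

# NAND is universal: NOT and AND (and hence every other gate) can be built from it.
for a in (0, 1):
    assert NAND(a, a) == NOT(a)
    for b in (0, 1):
        assert NAND(NAND(a, b), NAND(a, b)) == AND(a, b)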

Logical equivalence is the inverted XOR gate, returning true if and only if the arguments have the same truth value.

A  B  A ↔ B
0  0  1
0  1  0    (2.8)
1  0  0
1  1  1

So we get that A ↔ B = ¬(A ⊕ B), even though it is not yet clear what we mean by equality here. We will return to the issue soon.

Implication connects two atoms A and B in the way that if A is true, also B is true. A is called the premise, B the conclusion.

A  B  A → B
0  0  1
0  1  1    (2.9)
1  0  0
1  1  1

Conversely, if B is true, A is not necessarily true: A → B ≠ B → A. Thus the connective is not symmetric. Also the negated version is not equivalent: A → B ≠ (¬A) → (¬B). But the contraposition is: A → B = (¬B) → (¬A).
This asymmetry requires that one is particularly careful about how to prove a statement. The line of reasoning cannot simply be inverted.
Nonetheless, if both implications hold, then we obtain equivalence: (A → B) ∧ (B → A) = A ↔ B. It is therefore possible to prove statements of equivalence by separately showing both implications.
There is still the issue of the "=". The expressions on the right and on the left were not the same, so on the level of strings there is no equality. It is necessary to distinguish the syntax (the sequence of symbols) and the semantics, the meaning, of a statement. The "=" indicated that the meaning of the two expressions was the same. We say they are equivalent and write ≡. In the following we look at syntax and semantics of propositional logic in greater detail.

2.2 Syntax of propositional logic


The following definition specifies what entities we can use in propositional logic, that is, what the syntactically correct expressions, called formulas, are. The definition states how to construct formulas from atomic formulas.
Definition 2.4 (Syntax). The syntactically correct formulas E_D over a set of atoms D := {A, B, C, . . .} are given by:

- atomic formulas in D are syntactically correct;
- if f and g are syntactically correct formulas, then so are (¬f), (f ∧ g) and (f ∨ g).

These are all syntactically correct formulas.

E_D ::= D | (¬E_D) | (E_D ∧ E_D) | (E_D ∨ E_D)

Figure 2.2: The syntax diagram for propositional logic

Figure 2.3: Tree of the formula F

In short, the only permitted connectives to relate atoms or other formulas are ¬, ∧ and ∨. Remark that we listed more connectives above that are a priori not part of the syntax. As we will see later, one can construct those from the ones in the definition.
We can describe this in a syntax diagram as shown in figure 2.2.
Example 2.2.1. The following are syntactically correct formulas. Note that the brackets cannot be omitted yet.

A
(¬A)
(A ∧ B)
(A ∨ B)
(¬(A ∧ B))
F := (((A ∧ B) ∨ (¬C)) ∧ (A ∨ B))

The last formula, F, can be visualized as a tree (figure 2.3). Subtrees correspond to partial formulas. The partial formulas of F are

{F, A, B, C, (A ∧ B), (¬C), ((A ∧ B) ∨ (¬C)), (A ∨ B)}

The following formulas are not correct, as the brackets are missing:

A ∧ B ∧ C
A ∧ B ∨ C

The first can either be understood as ((A ∧ B) ∧ C) or (A ∧ (B ∧ C)). The meaning (i.e. the semantics) is the same. This does not hold for the second, which can be interpreted as either (A ∧ (B ∨ C)) or ((A ∧ B) ∨ C).

2.3 Semantics of propositional logic

Definition 2.5 (Assignment). A truth assignment A : D → {0, 1} is a function that assigns a truth value to all atoms. The function can naturally be extended to all formulas by simply evaluating the formula, given the truth values for its atoms.
Example 2.3.1 (Assignment). Let's consider a set of three atoms A, B, C. An assignment is for instance¹

A : A ↦ 0
    B ↦ 1
    C ↦ 0

The function can be extended to formulas. For F = ((A ∧ (¬B)) ∨ C) the assignment yields A(F) = 0.
Definition 2.6 (Semantic). The semantic of a proposition is the set of truth
values for all possible assignments.
In the example below each row corresponds to one of the possible assignments, and the semantic of the proposition is given by the truth vector of the whole formula (the last column of the table).
Example 2.3.2. For the formula F in the example above one could evaluate the tree bottom-up. Ultimately we require a full list of the results for all possible inputs. It is more efficient to evaluate the subformulas step by step, as done below.

A  B  C   A ∧ B   ¬C   (A ∧ B) ∨ (¬C)   A ∨ B   F
0  0  0     0      1          1           0     0
0  0  1     0      0          0           0     0
0  1  0     0      1          1           1     1
0  1  1     0      0          0           1     0    (2.10)
1  0  0     0      1          1           1     1
1  0  1     0      0          0           1     0
1  1  0     1      1          1           1     1
1  1  1     1      0          1           1     1
Consequently we can define two formulas to be equivalent if they yield the
same output on the same input.
Definition 2.7 (Semantically equivalent). Two formulas are semantically equivalent if they have the same truth value for all assignments of their atomic formulas. We then write F ≡ G.

¹ The symbol ↦ stands for "maps to". So for a function f the expression f : x ↦ y means "f maps x to y" and is just another way of writing f(x) = y.

Example 2.3.3 (Equivalent formulas). The following two formulas are equivalent; they yield the same truth vector.

A  B   A ∨ B   ¬A   ¬B   (¬A) ∧ (¬B)   ¬((¬A) ∧ (¬B))
0  0     0      1    1        1               0
0  1     1      1    0        0               1    (2.11)
1  0     1      0    1        0               1
1  1     1      0    0        0               1

Thus we obtain

A ∨ B ≡ ¬((¬A) ∧ (¬B))    (2.12)
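Checking a claimed equivalence mechanically amounts to comparing truth vectors. A minimal sketch, with formulas encoded as Python functions of an assignment (this encoding is our own choice, not the lecture's notation):

from itertools import product

def truth_vector(formula, atoms):
    # Evaluate the formula on every assignment of the listed atoms.
    return [int(formula(dict(zip(atoms, values))))
            for values in product((0, 1), repeat=len(atoms))]

def equivalent(f, g, atoms):
    return truth_vector(f, atoms) == truth_vector(g, atoms)

f = lambda v: v['A'] or v['B']                       # A ∨ B
g = lambda v: not ((not v['A']) and (not v['B']))    # ¬((¬A) ∧ (¬B))
print(equivalent(f, g, ['A', 'B']))                  # True, as in (2.12)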


We have been using the values 0 and 1 for quite a while without introducing them rigorously. They can be defined as being equivalent to formulas using merely basic syntax elements.
Definition 2.8. We define the formulas

0 := (A ∧ (¬A))    (2.13)
1 := (A ∨ (¬A))    (2.14)

The formulas equivalent to 0 and to 1 have their own names.
Definition 2.9 (Tautology, Unsatisfiability). If a formula F is semantically equivalent to 0, F ≡ 0, then F is called unsatisfiable.
If a formula F is semantically equivalent to 1, F ≡ 1, then F is called a tautology.
Similarly we can now formally introduce all other connectives, such as the XOR

A ⊕ B := ((A ∧ (¬B)) ∨ ((¬A) ∧ B))    (2.15)

or the equivalence connective

A ↔ B := ((A ∧ B) ∨ ((¬A) ∧ (¬B)))    (2.16)

Semantic equivalence is an equivalence relation, that is, a mathematical relation with particular properties, as we will see later in the course. For now it is enough to know that it groups formulas into disjoint subsets² of equivalent elements, so-called equivalence classes. It structures the set of all syntactically correct formulas as visualized in figure 2.4.
The formulas (B ∨ (¬B)), ((¬(¬A)) ∨ (¬(¬(¬A)))), . . . are all equivalent to 1 and therefore in the same equivalence class. They are all tautologies.

Number of equivalence classes The set of all syntactically correct formulas E_D is infinite. The number of equivalence classes, though, is finite if the number of atomic formulas is finite. Given, for instance, that there are 26 atomic formulas, D = {A, B, C, . . . , Z}, there are 2^26 different input configurations (or lines if we wrote it like in 2.10). As each line can yield a 0 or 1, there are 2^(2^26) different equivalence classes.
We will now relate the equivalence connective and semantic equivalence:

² Disjoint means that every formula is contained in exactly one of those subsets.

Figure 2.4: Equivalence classes in the set of all syntactically correct formulas E_D; one class contains all tautologies, another one all unsatisfiable formulas.

Theorem 2.3.1. Two formulas F and G are semantically equivalent, i.e. F ≡ G, if and only if (iff) the formula F ↔ G is a tautology.
Note that ↔ connects F and G syntactically, whereas ≡ connects the two semantically.
Proof. The formula F ↔ G ≡ (F ∧ G) ∨ ((¬F) ∧ (¬G)) is a tautology iff F and G have the same truth values for all assignments. Thus it is a tautology iff F and G are semantically equivalent.
Example 2.3.4 (Logical Laws). The following list contains important examples of semantic equivalences.

Idempotence:
(F ∧ F) ≡ F        (F ∨ F) ≡ F
(F ⊕ F) ≡ 0        (A ↔ A) ≡ 1

Symmetry:
(F ∧ G) ≡ (G ∧ F)        (F ∨ G) ≡ (G ∨ F)

Associativity:
(A ∧ (B ∧ C)) ≡ ((A ∧ B) ∧ C)        (A ∨ (B ∨ C)) ≡ ((A ∨ B) ∨ C)

Absorption:
((F ∧ G) ∨ F) ≡ F        ((F ∨ G) ∧ F) ≡ F

Distributivity:
(F ∧ (G ∨ H)) ≡ ((F ∧ G) ∨ (F ∧ H))        (F ∨ (G ∧ H)) ≡ ((F ∨ G) ∧ (F ∨ H))

de Morgan:
(¬(F ∧ G)) ≡ ((¬F) ∨ (¬G))        (¬(F ∨ G)) ≡ ((¬F) ∧ (¬G))

Double negation:
(¬(¬F)) ≡ F

Note that distributivity between AND and XOR holds only in one direction: AND distributes over XOR,

A ∧ (B ⊕ C) ≡ (A ∧ B) ⊕ (A ∧ C)    (2.17)

but XOR does not distribute over AND. Just as the XOR corresponds to addition, the AND corresponds to multiplication, i.e. A ∧ B = A · B. The distributivity of multiplication over addition thus extends to the logical gates.

Simplification of Notation So far we have been sticking closely to the syntax permitted by definition 2.4. We will now introduce some simplifications, motivated by the equivalences above.

- We will allow all connectives introduced in the first part (⊕, →, ↔), as they are equivalent to formulas built from the basic connectives.
- If brackets do not change the semantics they can be dropped, as for instance in associative formulas (e.g. A ∧ B ∧ C). Based on this, the following formulas are valid as well:

  ⋀_{i=1}^{n} A_i = A_1 ∧ A_2 ∧ . . . ∧ A_n        ⋁_{i=1}^{n} A_i = A_1 ∨ A_2 ∨ . . . ∨ A_n    (2.18)

- We will introduce the following priority rules (or operator precedence): ¬ binds strongest, then (∧, ∨), then (⊕, →, ↔). Again, brackets can be dropped as long as the formula is in accordance with these priority rules.

2.4 Normal forms


Given a syntactically correct formula one obtains the semantics by writing down
the truth table. Is it conversely possible to construct a syntactically correct
formula that reproduces a given truth vector? The following example shows
how this can be done.
Example 2.4.1. Let us consider the truth table with the atoms A, B and C. We
would like to find a composed formula F that produces the truth vector in the

last column.
A B C F
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 1 (2.19)
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 1
A first way to construct a formula F with the desired semantics is to consider all the rows in which F has to be true: rows 3, 4, 5, 6 and 8. To check whether the assignment of a single row is present, one connects the atoms or their negations with ANDs. Thus we obtain the formula

F = (row 3) ∨ (row 4) ∨ (row 5) ∨ (row 6) ∨ (row 8)
  = (¬A ∧ B ∧ ¬C) ∨ (¬A ∧ B ∧ C)
  ∨ (A ∧ ¬B ∧ ¬C) ∨ (A ∧ ¬B ∧ C) ∨ (A ∧ B ∧ C)

We will call this disjunction of conjunctions a Disjunctive Normal Form (DNF). Applying the equivalences above, we can reduce the formula to a simpler, semantically equivalent one. Employing distributivity we obtain

(¬A ∧ B ∧ ¬C) ∨ (¬A ∧ B ∧ C) ≡ (¬A ∧ B) ∧ (¬C ∨ C) ≡ ¬A ∧ B

So we obtain

F ≡ (¬A ∧ B) ∨ (A ∧ ¬B) ∨ (A ∧ B ∧ C) ≡ (A ⊕ B) ∨ ABC

using multiplication as an abbreviation for the AND.

A second approach to finding a formula F is to consider the false rows: rows 1, 2 and 7 have to be false. For each of these rows one forms a disjunction of the atoms (or their negations) that is false precisely on that row. Thus we connect the atoms (or their negations) by ORs:

F = ¬(row 1) ∧ ¬(row 2) ∧ ¬(row 7)
  = (A ∨ B ∨ C) ∧ (A ∨ B ∨ ¬C) ∧ (¬A ∨ ¬B ∨ C)

This conjunction of disjunctions is called a Conjunctive Normal Form (CNF). Using distributivity again we can simplify

(A ∨ B ∨ C) ∧ (A ∨ B ∨ ¬C) ≡ (A ∨ B) ∨ (C ∧ ¬C) ≡ A ∨ B

to obtain

F ≡ (A ∨ B) ∧ (¬A ∨ ¬B ∨ C)

In the following definitions the normal forms mentioned above will be introduced in a formally precise manner.

Definition 2.10 (Literal). If A ∈ D is an atom, then A and ¬A are called literals. In other words, a literal is an atom or a negated atom.
Definition 2.11 (Conjunctive Normal Form). A formula F is in Conjunctive Normal Form if there exist literals L_{i,j} such that

F = ⋀_{i=1}^{n} ⋁_{j=1}^{m_i} L_{i,j}
  = (L_{1,1} ∨ L_{1,2} ∨ . . . ∨ L_{1,m_1})
  ∧ (L_{2,1} ∨ . . . ∨ L_{2,m_2})
  ∧ . . .
  ∧ (L_{n,1} ∨ . . . ∨ L_{n,m_n})

Definition 2.12 (Disjunctive Normal Form). A formula F is in Disjunctive Normal Form if there exist literals L_{i,j} such that

F = ⋁_{i=1}^{n} ⋀_{j=1}^{m_i} L_{i,j}

Generally, both methods used in the example above can be applied to any truth vector. Thus any formula is semantically equivalent to a formula in CNF and to a formula in DNF.
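The first construction from example 2.4.1 can be phrased as a small algorithm: for every row with truth value 1, build the conjunction of literals that is true exactly on that row, and take the disjunction of all of them. A sketch (the string encoding of literals is our own convention):

from itertools import product

def dnf_from_truth_vector(atoms, truth_vector):
    # Return a DNF as a list of clauses, each clause a list of literal strings.
    rows = product((0, 1), repeat=len(atoms))
    return [[a if bit == 1 else '¬' + a for a, bit in zip(atoms, row)]
            for row, value in zip(rows, truth_vector) if value == 1]

# Truth vector of table (2.19): rows 3, 4, 5, 6 and 8 are true.
print(dnf_from_truth_vector(['A', 'B', 'C'], [0, 0, 1, 1, 1, 1, 0, 1]))
# [['¬A', 'B', '¬C'], ['¬A', 'B', 'C'], ['A', '¬B', '¬C'], ['A', '¬B', 'C'], ['A', 'B', 'C']]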

2.5 Models and semantic conclusion


While physics, for instance, is about which basic propositions are true, in logic the inverse question is usually asked. So instead of "how does the world work fundamentally?" one rather asks "which worlds are compatible with a given formula?"³. More practically: what assignments render a formula true?
A possible world is called a model, as specified in the following definition.
Definition 2.13 (Model). Let F be a formula and A an assignment which renders F true. Then A is a model of F, with the notation

A ⊨ F    (2.20)

Example 2.5.1. Let us consider the following formula with its truth table.

A  B  C   A ∧ B   F = (A ∧ B) → C
0  0  0     0            1
0  0  1     0            1
0  1  0     0            1
0  1  1     0            1
1  0  0     0            1
1  0  1     0            1
1  1  0     1            0
1  1  1     1            1

Only the assignment in the seventh row (A = 1, B = 1, C = 0) is not a model of F.

³ "World" is to be understood more philosophically as "reality".

Some of the properties of models are the following:

- Two formulas F and G are semantically equivalent, i.e. F ≡ G, if and only if⁴

  A ⊨ F iff A ⊨ G    (2.21)

  That is, if F and G have the same models.
- A formula F is a tautology iff any assignment A is a model, i.e. F is true in any world.
- A formula F is unsatisfiable iff there exists no model, i.e. F is true in no world.

Having introduced the notion of a model we can now define a semantic conclusion.
Definition 2.14 (Semantic conclusion). G is a semantic conclusion of F_1, . . . , F_n if any common model A of all F_i (that is, A ⊨ F_1, . . . , A ⊨ F_n) is also a model of G, A ⊨ G. One also says F_1, . . . , F_n semantically entail G. The semantic conclusion is denoted {F_1, . . . , F_n} ⊨ G.
Remark 1 (Relation to semantic equivalence). The semantic conclusion is the one-sided variant of semantic equivalence, similarly to → and ↔ on the syntactical level. Therefore two formulas F and G are semantically equivalent iff F is a semantic conclusion of G and vice versa.
Remark 2 (Conflicting notation). Note that the symbol ⊨ has two meanings: it indicates semantic conclusion as well as "models".
Example 2.5.2. Let's have a closer look at the following set of formulas: {A, A → B, B → C}.
The only model of the first formula is the assignment A_0(A) = 1. The models of the second are (corresponding to rows 1, 2 and 4 in 2.9)

A_1 : A ↦ 0, B ↦ 0
A_2 : A ↦ 0, B ↦ 1
A_3 : A ↦ 1, B ↦ 1

and similarly for the last formula

A_4 : B ↦ 0, C ↦ 0
A_5 : B ↦ 0, C ↦ 1
A_6 : B ↦ 1, C ↦ 1

So merely the assignment

A : A ↦ 1, B ↦ 1, C ↦ 1

is a model that renders all formulas true, i.e. a common model for all formulas. As this is also a model of the atomic formula C, we obtain

{A, A → B, B → C} ⊨ C

⁴ "if and only if" is usually abbreviated iff.

The semantic conclusion is related to the (syntactical) implication, analogously to equivalence in theorem 2.3.1.
Theorem 2.5.1. A set of formulas {F_1, . . . , F_n} semantically entails a formula G, {F_1, . . . , F_n} ⊨ G, iff the formula (⋀_{i=1}^{n} F_i) → G is a tautology.
One has to carefully distinguish the syntactical and the semantic level. The expression {F_1, . . . , F_n} ⊨ G relates the set of formulas {F_1, . . . , F_n} semantically with the formula G. The expression (⋀_{i=1}^{n} F_i) → G connects the two syntactically and yields another syntactically correct formula. Only by demanding this formula to be a tautology does a semantic criterion emerge.
Proof. The implication A → B is equivalent to ¬A ∨ B. Therefore we obtain

(⋀_{i=1}^{n} F_i) → G ≡ ¬(F_1 ∧ F_2 ∧ . . . ∧ F_n) ∨ G
                      ≡ ¬F_1 ∨ ¬F_2 ∨ . . . ∨ ¬F_n ∨ G

Let's now consider models of this formula. If an assignment renders all F_i true, A(F_i) = 1 ∀i, then the formula is true iff the assignment also renders G true. This is just the definition of a semantic conclusion.

Remark 3. A formula F is a tautology iff F is a conclusion of 1, i.e. 1 ⊨ F. So F is a tautology iff the implication 1 → F is a tautology.
A formula F is unsatisfiable iff 0 is a conclusion of F, i.e. F ⊨ 0. So F is unsatisfiable iff the implication F → 0 is a tautology.

2.6 Proof theory of propositional logic


The objective is to develop a calculus to decide semantic questions such as:

- Is F a tautology?
- Is F unsatisfiable?
- Does {F_1, . . . , F_n} ⊨ G hold?

In the following examples we will consider the tautology question for CNFs and DNFs.
Example 2.6.1 (Tautology problem of CNF). Given a formula F in CNF, can we decide whether it is a tautology? The formula can be rewritten as a conjunction of subformulas F_i:

F = (L_{1,1} ∨ . . . ∨ L_{1,m_1}) ∧ . . . ∧ (L_{n,1} ∨ . . . ∨ L_{n,m_n}) = F_1 ∧ . . . ∧ F_n

So F is a tautology iff all F_i are tautologies. The F_i in turn are tautologies iff at least one atom reappears negated, that is, F_i contains an atom A as well as ¬A.
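The criterion from example 2.6.1 translates into a simple procedure that is fast in the length of the formula. A sketch, with clauses written as sets of literal strings (our own encoding):

def clause_is_tautology(clause):
    # A disjunction of literals is a tautology iff it contains A and ¬A for some atom.
    return any(('¬' + lit) in clause for lit in clause if not lit.startswith('¬'))

def cnf_is_tautology(cnf):
    # A CNF is a tautology iff every clause is a tautology.
    return all(clause_is_tautology(c) for c in cnf)

print(cnf_is_tautology([{'A', '¬A', 'B'}, {'C', '¬C'}]))   # True
print(cnf_is_tautology([{'A', 'B'}, {'C', '¬C'}]))         # False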

Example 2.6.2 (Tautology problem of DNF). Can we derive a similar criterion for the subformulas of a DNF

F = ⋁_{i=1}^{n} ⋀_{j=1}^{m_i} L_{i,j} = F_1 ∨ . . . ∨ F_n

to decide whether F is a tautology or not? Unfortunately the problem does not localize in the same manner. For a CNF to be a tautology, all subformulas have to be true for any assignment, and therefore have to be tautologies themselves. For a DNF, in contrast, just one subformula has to be true for any assignment, and this could be a different subformula for different assignments. The question whether F is a tautology can only be decided by considering the entire formula and not by looking at the single subformulas independently.
There seems to be no other way than to write down the entire truth table. It contains 2^l entries for a formula containing l atoms. Thus the table grows very fast with the number of atoms.

Relating DNF and CNF For a formula F in CNF we immediately obtain a formula in DNF that is semantically equivalent to ¬F. Employing de Morgan's laws yields

¬F = ¬(⋀_{i=1}^{n} ⋁_{j=1}^{m_i} L_{i,j}) ≡ ⋁_{i=1}^{n} ¬(⋁_{j=1}^{m_i} L_{i,j}) ≡ ⋁_{i=1}^{n} ⋀_{j=1}^{m_i} ¬L_{i,j}

This is a DNF, as any negated literal is just another literal. The argument also works for F being a DNF; then one obtains a CNF semantically equivalent to ¬F.
As F being a tautology is equivalent to ¬F being unsatisfiable, the satisfiability problem of a DNF can be solved analogously to the tautology problem of a CNF. Conversely, the satisfiability problem of a CNF, like the tautology problem of a DNF, can so far only be decided by writing down the entire truth table.

Comparison of computational hardness How hard is it to decide the tautology problem for a CNF and for a DNF? Consider a formula F of length l (length being the number of literals). If F is in CNF one merely has to look at all pairs of literals in every subformula, following the criterion in the example above. So one has to perform at most c · l² steps (c being some finite constant) to decide whether F is a tautology.
In the case of F being a DNF one has to write down the entire truth table, containing 2^l rows. The number of computational steps is thus exponential in l, growing much faster⁵ than quadratically. In fact the two problems belong to different so-called complexity classes.

Figure 2.5: The set of computational problems is divided into classes of different complexity.

2.6.1 An Excursion into Complexity Theory


Complexity theory is about classifying computational problems by their computational difficulty. The set P contains all problems that can be solved in polynomial time, i.e. the number of computational steps is at most a polynomial in the input size. P is a subset of NP, which contains all problems for which one can verify a given solution in polynomial time, whereas a solution may not necessarily be found in polynomial time.
A third class is NP-complete (NPC). Problems in NPC are at least as hard as any other problem in NP. That means any problem in NP can be reduced to a problem in NPC in polynomial time. Consequently any problem in NPC can be reduced to another problem in NPC in polynomial time. Thus NPC is a subset of NP and is believed to be disjoint from P.
Example 2.6.3 (Elements in P). P contains

- the tautology problem for CNF,
- the (un)satisfiability problem for DNF.

Example 2.6.4 (Elements in NPC). NPC contains

- the tautology problem for DNF,
- the satisfiability problem for CNF,
- the problem of finding a semantically equivalent DNF for a given CNF and vice versa.

The last problem follows from the first two: if we could find a G in CNF for any F in DNF with F ≡ G in polynomial time, we could solve the tautology problem for a DNF in polynomial time by first turning it into a CNF and then applying the criterion above.
⁵ Consider for instance 10 literals: 2¹⁰ = 1024 ≫ 10² = 100, or 20 literals: 2²⁰ = 1048576 ≫ 20² = 400.

Real life problems? Are problems in real life hard? Yes, they are, as one
can see in the following example.
Example 2.6.5 (Sudoku). Sudoku is an NPC problem as we can reduce it to the
satisfiability of a CNF. As one has to find a solution satisfying all conditions on
the rows, columns and subboxes, it is a problem of the form
(condition 1) ∧ (condition 2) ∧ . . .    (2.22)
Therefore it is in CNF. To find a solution is just the same as proving satisfiability.
The satisfiability problem for a CNF is in NPC, and so is the Sudoku problem.
Actually this is a common structure of real life problems, such as flight
plans or schedules. Usually there is a number of necessary conditions yielding
a formula of the form 2.22. Each condition can be met in various ways. So
the subformulas are disjunctions. Thus one is usually looking for an assignment
satisfying a CNF.

2.7 The Resolution Calculus


Even though there is no easy way to solve the satisfiability problem for a CNF, we can do better than writing down the full truth table by using the so-called resolution calculus. Let's consider the subsequent example.
Example 2.7.1. The task is to prove the following semantic conclusion:

{A, A → B, B → C, C → D, D → E} ⊨ E

Employing theorem 2.5.1, the task turns into showing that the formula

(A ∧ (A → B) ∧ (B → C) ∧ (C → D) ∧ (D → E)) → E

is a tautology; we abbreviate the premise conjunction on the left by F. Resolving the definition of the implication yields F → E ≡ ¬F ∨ E. We can now turn this tautology problem into the unsatisfiability problem for the formula

¬(¬F ∨ E) ≡ F ∧ ¬E

Resolving F again yields a CNF

A ∧ (¬A ∨ B) ∧ (¬B ∨ C) ∧ (¬C ∨ D) ∧ (¬D ∨ E) ∧ ¬E

The subformulas of a CNF are called clauses and are usually written as sets containing the literals of the clause⁶:

{{A}, {¬A, B}, {¬B, C}, {¬C, D}, {¬D, E}, {¬E}}

We will now show by contradiction that there does not exist an assignment A that renders all clauses true. So let's assume that such an assignment A exists. We can gradually deduce further clauses from the existing ones. If the assignment A renders all clauses true, it must hold that A(A) = 1. Thus ¬A = 0, and the second clause is only true if A(B) = 1. This argument can be repeated and yields the tree shown in figure 2.6. In the last step we obtain E ∧ ¬E ≡ 0 and therefore a contradiction with the initial assumption that A renders all clauses true.

⁶ It resembles writing the L_{i,j} as a two-dimensional array, for instance in Mathematica.

Figure 2.6: Deducing a contradiction from the clauses {A}, {¬A, B}, {¬B, C}, {¬C, D}, {¬D, E}, {¬E}: one successively obtains {B}, {C}, {D} and {E}, which contradicts {¬E}.

In the last example the claim was somehow intuitive, as A and E were connected by a chain of implications. Another example is the following.
Example 2.7.2. The task is to prove the semantic conclusion

{A ∨ B, A → C, B → C} ⊨ C

corresponding to showing that the formula

(A ∨ B) ∧ (¬A ∨ C) ∧ (¬B ∨ C) ∧ ¬C

is unsatisfiable. The clauses and their resolution are shown in figure 2.7.


To better understand what happens in a resolution step well consider the
following example
Example 2.7.3. Every assignment that renders both tA, B, Cu and t A, Eu
true will also render tB, C, Eu true. To prove this we have to distinguish
the cases
1. ApAq 0: Then tB, Cu must be true and therefore also tB, C, Eu.
2. ApAq 1: Then t Eu must be true and therefore also tB, C, Eu.
So whatever truth value A is assigned, the clause tB, C, Eu is true.
More generally we can define the resolution step as
Definition 2.15 (Resolution step). Let C1 , C2 and R be clauses. R is the
resolvent of C1 and C2 if there exists a literal L such that L is in C1 and L is
in C2 (or vice versa) and R contains all literals in C1 and C2 except of L and
L.

Figure 2.7: Resolution of the clauses {A, B}, {¬A, C}, {¬B, C}, {¬C}: resolving yields {¬B}, then {A} and {¬A}.

Figure 2.8: Resolution applied to the clauses of the formula F from example 2.7.4.

In the notation of set theory (which will be introduced below) we obtain

R = (C_1 \ {L}) ∪ (C_2 \ {¬L})

Generalizing from the example above yields that C_1 and C_2 together semantically entail R:

{C_1, C_2} ⊨ R
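The resolution step itself is straightforward to implement. A sketch with clauses as frozensets of literal strings (our own encoding, not the notation of the lecture):

def negate(literal):
    return literal[1:] if literal.startswith('¬') else '¬' + literal

def resolvents(c1, c2):
    # All clauses obtainable from c1 and c2 by a single resolution step.
    return [(c1 - {lit}) | (c2 - {negate(lit)})
            for lit in c1 if negate(lit) in c2]

c1 = frozenset({'A', 'B', 'C'})
c2 = frozenset({'¬A', '¬E'})
print(resolvents(c1, c2))   # one resolvent: {B, C, ¬E}, as in example 2.7.3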
The following example shows that resolution does not merely show unsatisfiability but also yields an assignment if the formula is satisfiable.
Example 2.7.4. Is the following formula satisfiable?

F = (A ∨ B ∨ ¬C) ∧ (¬A ∨ ¬E) ∧ (¬C ∨ D ∨ E) ∧ C ∧ (¬D ∨ ¬C)

From the resolution in figure 2.8 one obtains a model

A : A ↦ 0
    B ↦ 1
    C ↦ 1
    D ↦ 0
    E ↦ 1

Remark 4. Summarizing the above:

- If a formula in CNF is satisfiable, then resolution can yield a model for this formula.
- If a formula in CNF is unsatisfiable, then resolution can show this.

A programming language based on the resolution calculus is Prolog.

Chapter 3

Set theory

After having introduced logics we will now turn to the second pillar of mathematics, namely set theory. As mentioned in the introduction, all mathematical objects are sets. For instance, the natural numbers can be defined inductively by merely nesting sets of the empty set:

0 := ∅
1 := {0} = {∅}
2 := {0, 1} = {∅, {∅}}
⋮
n := {0, 1, . . . , n − 1}

3.1 Basic notions


Intuitively a set is a collection of elements. This somehow suggests that there are basic elements to start off with. This is not true for formal set theory: the only entities in set theory are sets. These sets can then be put in relation to one another or combined to form new sets. Thus being an element of a set merely relates two sets in a particular way. Thus we define:
Definition 3.1 (Element relation). The relation "is an element of" relates some x (which is ultimately a set itself)¹ with a set A. One says "x is an element of A" or "x is in A" or "A contains x"² and writes

x ∈ A

On the contrary, if x is not in A, one writes

x ∉ A :⟺ ¬(x ∈ A)

This is rather clarifying some basic notation than characterizing what sets are. In the following we will look briefly into Cantor's definition of set theory, where it falls short, and how set theory is usually introduced in ZFC.
¹ It is common to label sets with capital letters and their elements with small letters. This notation is misleading, as formally there only exist sets and sets of sets. So elements are also sets and may contain other elements. Having said this, we will use this very notation as well.
2 The same expression might be used for subsets defined below.

3.1.1 Cantor's paradise
In an attempt to formally define sets, Cantor proposed the following definition.
Definition 3.2 (Cantor's naive approach). Any collection of distinguishable objects is a set. An object can be in any set; in particular a set can contain itself as a member.
Unfortunately this definition of a set is not logically consistent, as Russell's antinomy shows. One can group the sets into those that contain themselves and those that do not. Let

M := {B | B ∉ B}

be the set of sets that do not contain themselves. The question is now: does M contain itself or not? If M contains itself, then by the definition of M it is not an element of M. If it does not contain itself, then it is an element of M. Either way we obtain a contradiction, and therefore Cantor's approach is not theoretically sound.

3.1.2 Zermelo-Fraenkel set theory


As a consequence of Russell's antinomy a more restrictive set of basic rules is required as a foundation of set theory. Subsequently we will introduce ZFC set theory, named after the mathematicians Ernst Zermelo and Abraham Fraenkel, including the so-called axiom of choice.
ZFC comprises a set of axioms that suffice to build set theory upon. It is a set of basic premises or assumptions that allow one to logically derive all other results of set theory. A set of axioms is supposed to be irreducible in the sense that none of the contained statements can be derived from the others in the set. More practically, ZFC will provide us with basic rules for how to form sets.

The axiom of choice mentioned above states that, given any collection of bins each containing at least one element, it is possible to make a selection of exactly one object from each bin. If there is a distinguishing property for the elements in a bin, one can single out an element without the axiom of choice, whereas the axiom is needed to choose an element from a set of indistinguishable elements. Though the axiom of choice seems somehow intuitive, it can have quite strange implications; it leads for instance to the Banach-Tarski paradox. Nonetheless the axiom of choice is widely used and an important cornerstone of set theory.
Subsequently we will look at some of the axioms in greater detail. But before
doing so we will introduce some concepts from predicate logic.
Definition 3.3 (Quantifiers). In order to express that a statement holds for all cases of a kind one uses the universal quantifier ∀. If a statement holds for at least one instance of a kind, the existential quantifier ∃ is used.
Example. In terms of quantifiers the statement "all natural numbers are non-negative" is

∀x ∈ ℕ (x ≥ 0)

read "for all x in ℕ: x is greater than or equal to zero". Similarly the statement "there exists a natural number greater than 100" can be written as

∃x ∈ ℕ (x > 100)

We will now use quantifiers to formulate the axioms.
Axiom 1 (Extensionality axiom). The extensionality axiom states that two sets are equal iff they contain the same elements. The definition of equality already yields: if A and B are equal, A = B, then A and B contain the same elements. So we only need to include the inverse implication in the axiom: any two sets A and B containing the same elements are equal. Using logical quantifiers this reads

∀A ∀B ((∀x (x ∈ A ↔ x ∈ B)) → A = B)

From the extensionality axiom it follows that any set is uniquely determined by its elements.
Example 3.1.1. As the following sets contain the same elements, they are equal:

{a, b, c} = {b, c, a} = {a, a, b, c}

Neither the order nor multiple occurrences matter. On the contrary, the following sets are not equal:

{a, b, c} ≠ {{a}, {b}, {c}}

Note that a and the set containing a, i.e. {a}, are not equal.
Predicates, as defined below, yield subsets, as follows from the subsequent axiom.
Definition 3.4 (Predicate). For a given set A a predicate is a function P : A → {false, true}. P can also be regarded as a property that all those x ∈ A have for which P(x) = true.
Axiom 2 (Sets from predicates). Given a set A and a predicate P on A, the collection of all elements that have the property P (i.e. for which P is true),

B := {x ∈ A | P(x) = true} = {x ∈ A | P(x)},

is another set.
Implicitly we introduced the notation { · | · } for sets defined by predicates.
Example 3.1.2. One can define a predicate on the natural numbers to be true iff the argument is less than or equal to 10. This yields the set

A := {x ∈ ℕ | x ≤ 10}

With a second predicate on A returning true iff the argument is prime we obtain

B := {x ∈ A | Prime(x)} = {2, 3, 5, 7}

Definition 3.5 (Subset). A set A is a subset of another set B if all elements of A are also elements of B. In the notation of predicate logic this is³

A ⊆ B :⟺ ∀x (x ∈ A → x ∈ B)

³ The colon indicates a definition and is put on the side of the equivalence that is to be defined.

Figure 3.1: The intersection and the union of two sets

Using the subset relation we can formulate a first theorem.
Theorem 3.1.1. If a set A is a subset of B and B a subset of A, then the two sets are equal:

A ⊆ B ∧ B ⊆ A → A = B

Proof. The theorem is a direct consequence of the axiom of extensionality. As A is a subset of B, any element of A is an element of B, and vice versa any element of B is an element of A. Thus an element x is an element of A iff it is an element of B, i.e. A and B contain the same elements.
Using predicates we can now define the empty set, if we are given an arbitrary set A:

∅ := {x ∈ A | x ≠ x} = {}

Thus we defined the empty set as the subset of A not containing any element of A. By extensionality the empty set is unique. Further, it is a subset of any set.
In the following we will consider different ways to form new sets from given ones.
Definition 3.6 (Intersection). Given two sets A and B, the intersection A ∩ B consists of all elements contained in both:

x ∈ A ∩ B :⟺ x ∈ A ∧ x ∈ B

Note that this is not a new axiom. Any intersection can just be written in terms of predicates:

A ∩ B = {x ∈ A | x ∈ B}

The union of two sets cannot be written in terms of predicates. We thus require another axiom.
Axiom 3 (Union). Given two sets A and B, their union A ∪ B, containing all elements of A as well as all elements of B, is a set.

x ∈ A ∪ B :⟺ x ∈ A ∨ x ∈ B

Further we will define the difference and the symmetric difference as follows.

Figure 3.2: The difference and the symmetric difference of two sets

Definition 3.7 (Difference). The difference of two sets A and B, denoted A \ B, is defined as the set of all elements contained in A but not in B:

x ∈ A \ B :⟺ x ∈ A ∧ x ∉ B

Definition 3.8 (Symmetric difference). The symmetric difference is defined using the XOR:

x ∈ A △ B :⟺ x ∈ A ⊕ x ∈ B

The symmetric difference, as the name suggests, is the symmetric version of the difference. It is equal to the union of A \ B and B \ A, as well as to the union of A and B minus their intersection:

A △ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B)
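These operations correspond directly to Python's built-in set operators, which makes the stated identities easy to check on examples (a small illustration of ours):

A = {1, 2, 3, 4}
B = {3, 4, 5}

print(A & B)    # intersection {3, 4}
print(A | B)    # union {1, 2, 3, 4, 5}
print(A - B)    # difference {1, 2}
print(A ^ B)    # symmetric difference {1, 2, 5}

# A △ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B)
assert A ^ B == (A - B) | (B - A)
assert A ^ B == (A | B) - (A & B)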

The definitions above show the close relation between set theory and logics. Using the logical connectives we can define corresponding connectives for sets. In this sense the power set is exceptional: it has no correspondent in logics.
Definition 3.9 (Power set). The power set of a set A is the set of all subsets of A:

x ∈ P(A) :⟺ x ⊆ A

In particular the power set contains the empty set and A itself.
The power set is also denoted 2^A, as the cardinality, i.e. the number of elements, of the power set is just

|P(A)| = 2^{|A|}

if the set is finite. To see this, we can consider the number of all predicates. For finite sets each subset corresponds to a predicate. Every predicate can be seen as a bit string of length |A|, with the i-th bit being 1 iff P(x_i) = true. As there are 2^{|A|} different strings, there are also just as many subsets.
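The power set and the counting argument |P(A)| = 2^{|A|} can be reproduced in a few lines (a sketch using Python's itertools; the bit-string correspondence mirrors the argument above):

from itertools import combinations

def power_set(A):
    # All subsets of A, obtained by choosing k elements for every k = 0, ..., |A|.
    elements = list(A)
    return [set(c) for k in range(len(elements) + 1)
                   for c in combinations(elements, k)]

A = {'a', 'b', 'c'}
subsets = power_set(A)
print(len(subsets) == 2 ** len(A))   # True: |P(A)| = 2^|A|

# Each subset corresponds to a bit string of length |A|: bit i says whether
# the i-th element is in the subset, which is where the 2^|A| comes from.
for bits in range(2 ** len(A)):
    chosen = {x for i, x in enumerate(sorted(A)) if (bits >> i) & 1}
    assert chosen in subsets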
Definition 3.10 (Complement). If a set A is defined as a subset of some larger set U, usually called the universe, then the complement of the set A is defined as

A^c := U \ A.

Families of sets Any countable union or intersection of sets yields another set. So for a family of sets A_i with an index i ∈ I (the countable index set) we define

x ∈ ⋃_{i∈I} A_i :⟺ ∃i ∈ I (x ∈ A_i)
x ∈ ⋂_{i∈I} A_i :⟺ ∀i ∈ I (x ∈ A_i)

3.1.3 Laws derived from logics


The close relation between logics and set theory mentioned above allows us to infer some properties of sets from logical laws. One obtains the correspondent of a logical law in set theory by replacing ∧ by ∩, ∨ by ∪, ¬ by the complement, and semantic equivalence by equality of sets.

Idempotency in logics states that the formula A ∧ A is semantically equivalent to A, i.e. A ∧ A ≡ A. In terms of sets this becomes

A ∩ A = A

Similarly we obtain from A ∨ A ≡ A that A ∪ A = A.

Absorption states that a formula A ∧ (A ∨ B) is semantically equivalent to A. For sets this turns into

A ∩ (A ∪ B) = A

Distributivity for the logical conjunction is (A ∧ B) ∨ C ≡ (A ∨ C) ∧ (B ∨ C) and turns into

(A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)

for sets, as one can see in 4.13.
Finally, also de Morgan's laws have a set-theoretic correspondent. If we have a reference set and can thus use the complement of a set, we obtain

(A ∪ B)^c = A^c ∩ B^c

as the set analogue of ¬(A ∨ B) ≡ (¬A) ∧ (¬B).

3.1.4 The Cartesian Product


In the 17th century Rene Descartes introduced the Cartesian product to describe
the location of points by their coordinates. Only later Cartesian products were
formalised in set theory employing ordered pairs.
Consider for example points in the two dimensional plane, as shown in 3.3.
The points px, yq and py, xq are generally not the same. The order of the coor-
dinates does matter. This means we want two ordered pairs pa, bq and pc, dq to
be equal, if and only if a c and b d.
How do we define an ordered pair in set theory? The set containing x
and y, tx, yu ty, xu, is symmetric and thus not a sufficient set theoretic
characterization of an ordered pair. We therefore define


Figure 3.3: The order of the coordinates does matter.

Definition 3.11 (Ordered pair). An ordered pair is defined as

$(x, y) := \{\{x\}, \{x, y\}\}$

Let us verify whether this definition really entails the equality property for
two points $(a, b)$ and $(c, d)$ mentioned above. The ordered pairs correspond by
definition to the sets
$\{\{a\}, \{a, b\}\} \qquad \{\{c\}, \{c, d\}\}$

If the two sets are equal, then necessarily a and c have to be equal. Therefore,
as also the elements $\{a, b\}$ and $\{c, d\}$ have to be equal, b and d have to be equal.
A special case of an ordered pair is the one containing twice the same element.
Then the set is equal to the set containing the set containing the element.

$(x, x) = \{\{x\}, \{x, x\}\} = \{\{x\}, \{x\}\} = \{\{x\}\}$

Having clarified the notion of an ordered pair we can now define the Carte-
sian product of two sets.
Definition 3.12 (Cartesian Product). Given two sets A and B, their Cartesian
product is defined as the set containing all ordered pairs

$A \times B := \{(a, b) \mid a \in A \land b \in B\}$

So if one of the two is the empty set, the Cartesian product is the empty set

$A \times \emptyset = \emptyset \times B = \emptyset$

Generally, if neither A nor B is empty, and they are not equal, $A \neq B$, then
the Cartesian product is not symmetric, i.e.

$A \times B \neq B \times A$

as one can see in the following example.

Example 3.1.3. Considering the sets $A = \{1\}$ and $B = \{2, 3\}$ we obtain the
Cartesian products

$A \times B = \{(1, 2), (1, 3)\}$

$B \times A = \{(2, 1), (3, 1)\}$

Figure 3.4: The shaded area to the left is the Cartesian product $[0, 2]^2$, the one
to the right the difference of two products $[0, 2]^2 \setminus [0, 1]^2$.

Figure 3.5: The area above the diagonal corresponds to the order relation $\leq$.

The definition of ordered pairs extends naturally to ordered lists of more
than two numbers, so called tuples. So given a finite index set $I = \{1, \ldots, k\}$
we therefore define the Cartesian product of k sets as

$\prod_{i \in I} A_i := \{(a_1, \ldots, a_k) \mid \forall i \in I : a_i \in A_i\}$

Example 3.1.4. Given an interval

$[0, 2] := \{x \in \mathbb{R} \mid x \geq 0 \land x \leq 2\}$

the Cartesian product $[0, 2]^2 = [0, 2] \times [0, 2]$ is a square with side length 2 as shown
in the left figure of 3.4. Another subset of $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ is $[0, 2]^2 \setminus [0, 1]^2$, shown
in the right figure of 3.4.
Example 3.1.5 (Order relation). Using the order relation $\leq$ we can define the
following subset
$R_{\leq} := \{(x, y) \in \mathbb{R}^2 \mid x \leq y\}$
pictured in figure 3.5. So far order relations were not formally defined. We
will make up for that in the next section, taking the inverse approach. We will
define $\leq$ using the set $R_{\leq}$.

3.2 Relations
Before turning to order relations we define more generally binary relations as
follows

Figure 3.6: Merely the diagonal line with $x = y$ is contained in both relations.

Definition 3.13 (Binary relation). A (binary) relation R from A to B is a
subset of the Cartesian product of A and B.

$R \subseteq A \times B$

Note that generally relations are directed. Eventually this will lead to the
definition of functions. In the special case that $A = B$ a relation R is called a
relation on A. For a pair $(a, b) \in R$ we write $aRb$.
We can now apply the set calculus from above on relations as done in the
subsequent examples.
Example 3.2.1. The intersection of the relations $\leq$ and $\geq$ is equal to the equality
relation $=$. Set operations are indicated by blue, relations by red.

Similarly the symmetric difference of the two yields the inequality relation $\neq$.

The relation $<$ is contained in $\leq$, and $>$ in $\geq$.

The complement of $\leq$ is $>$.
Example 3.2.2. We will now introduce relations on the integer numbers that
are of particular interest later in number theory and cryptography. The integer
numbers contain all positive and negative natural numbers.

$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$

An integer b is said to be divisible by another integer a (or a is a divisor of b)
if there exists $c \in \mathbb{Z}$ such that $a \cdot c = b$. Using the notation $\mid$ we obtain the
following formal definition

$a \mid b \;:\Leftrightarrow\; \exists c \in \mathbb{Z} : a \cdot c = b$

So for instance 4 is divisible by 2, and 9 by $-3$, but 5 is not divisible by 2.

$2 \mid 4 \qquad -3 \mid 9 \qquad 2 \nmid 5$

Divisibility is a binary relation on $\mathbb{Z}$

$\mid \;\subseteq\; \mathbb{Z} \times \mathbb{Z} = \mathbb{Z}^2$

All relations we have seen before on $\mathbb{R}$ can be reduced to relations on $\mathbb{Z}$ by
intersection, like for instance

$\leq_{\mathbb{Z}} \;=\; \leq_{\mathbb{R}} \cap\; \mathbb{Z}^2$

Finally we will consider the congruence relation defined as

$a \equiv b \pmod{m} \;:\Leftrightarrow\; m \mid (a - b)$

i.e. two integer numbers a and b are congruent modulo m if the difference of the
two is divisible by m. This is equivalent to saying that the integer division of
a and of b by m yields the same remainder. For instance 3 and 5 are congruent
modulo 2, while 3 and 4 are not

$3 \equiv 5 \pmod{2} \qquad 3 \not\equiv 4 \pmod{2}$

So each natural number m defines a congruence relation $\equiv_m$ with the modulus
m. The intersection of two such congruence relations yields another one with
the least common multiple4 (lcm) being the modulus.

$\equiv_m \cap \equiv_n \;=\; \equiv_{\mathrm{lcm}(m,n)}$

So one obtains for example

$\equiv_2 \cap \equiv_2 \;=\; \equiv_2 \qquad \equiv_2 \cap \equiv_3 \;=\; \equiv_6$
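Viewing each congruence $\equiv_m$ as a set of pairs makes the intersection statement easy to test on a finite range of integers. The sketch below (range and helper names chosen for illustration) checks $\equiv_2 \cap \equiv_3 = \equiv_6$ and a few other cases.

```python
from math import gcd

def congruence(m, lo=-20, hi=20):
    """The relation ≡_m restricted to the integers in [lo, hi], as a set of pairs."""
    r = range(lo, hi + 1)
    return {(a, b) for a in r for b in r if (a - b) % m == 0}

def lcm(m, n):
    return m * n // gcd(m, n)

for m, n in [(2, 2), (2, 3), (4, 6)]:
    assert congruence(m) & congruence(n) == congruence(lcm(m, n))
print("the intersection of two congruences is the congruence modulo the lcm")
```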

Symmetry of relations Two important types of relations are so-called order
relations and equivalence relations. The two differ in that order relations are
anti-symmetric whereas equivalence relations are symmetric. Symmetric means
that an ordered pair $(a, b)$ is in R iff the swapped pair $(b, a)$ is also in R. A formal
definition of the properties of relations will follow. For now we will consider a
number of examples of order and equivalence relations.
Inequality relations such as $<$, $\leq$, and $\subseteq$ are anti-symmetric relations5.
They give the set they are defined on a hierarchy.
Equality, congruence and semantic equivalence are equivalence relations.
These relations yield a partition of the set they are defined on. That means
they will divide the set into subsets, each containing all elements equivalent to
one another. Equivalence relations are often denoted by $\sim$.
Let's consider some examples in greater detail.
4 The least common multiple of two integers a and b is the smallest positive number $c \in \mathbb{Z}$ such
that there exist integers $m, n \in \mathbb{Z}$ with $a \cdot m = c = b \cdot n$. For distinct prime numbers the least
common multiple is simply their product, $c = a \cdot b$.
5 Strictly speaking not all of them are order relations. This will become apparent once
order relations are formally defined.

Figure 3.7: The graph shows the divisibility of the set A. 1 divides all other
numbers and is thus the root of the directed path. The direct neighbours are
prime numbers. Finally the integers 4 and 6 are connected to their prime factors.

Example 3.2.3 (Order relations). Let's return to the example of divisibility on
$\mathbb{Z}$. It is an order relation. For now we restrict the relation (by intersection) to
the subset $A := \{1, 2, 3, 4, 5, 6\} \subseteq \mathbb{Z}$. The order relation on this finite set can
be visualized in a graph as done in figure 3.7. Following the arrows, also for
multiple steps, yields the ordered pairs in $\mid$.
Similarly we obtain the graph in figure 3.8 for the divisibility on the set
$B = \{1, 2, 3, 5, 6, 10, 15, 30\}$. The corresponding graph, called Hasse diagram,
yields the two-dimensional projection of a cube. The subset order relation on
the power set

$\mathcal{P}(\{1, 2, 3\}) = \{\emptyset, \{1\}, \{2\}, \ldots, \{1, 2, 3\}\}$

gives the same cube.

One can now extend the graph to a projection of a 4-dimensional hypercube.
This corresponds either to the divisibility relation on the set of divisors of 210,
$\{1, 2, 3, 5, 6, 7, \ldots, 210\}$, or the subset relation on the power set $\mathcal{P}(\{1, 2, 3, 4\})$.
Considering the latter we extend the cube by another linked cube containing the
sets with the element 4, as shown in figure 3.9.
Example 3.2.4 (Equivalence relation). Let us consider an equivalence relation
on $\{1, 2, 3, 4, 5, 6\}^2 = \{(i, j) \mid 1 \leq i, j \leq 6\}$ defined as

$(a, b) \sim (c, d) \;:\Leftrightarrow\; a \cdot d = b \cdot c$

So intuitively the ratio of equivalent pairs has to be the same. As the division
on integers is slightly more complicated we wrote it in form of a product. The
relation yields the following equivalence classes

$\{1/1, 2/2, 3/3, 4/4, 5/5, 6/6\}$,
$\{1/2, 2/4, 3/6\}$, $\{2/1, 4/2, 6/3\}$,
$\{1/3, 2/6\}$, ...
$\{1/4\}$, ...

This leads to the definition of the rational numbers as the set containing the
equivalence classes above,

$\mathbb{Q} := (\mathbb{Z} \times (\mathbb{Z} \setminus \{0\})) / \sim$


Figure 3.8: The graph on the left shows the divisibility of the set B. 1 divides all
other numbers and is thus the root of the directed path. The direct neighbours
are prime numbers. Now instead of building further bottom up one can construct
the next level top down by dividing 30 by all prime numbers in B. Finally one
connects the two graphs.


Figure 3.9: The graph of the subset relation on the power set Ppt1, 2, 3, 4uq
corresponds to the 4-dimensional hypercube.

3.2.1 Representation of relations
Relations on finite sets A and B can be represented by either binary matrices
or bipartite graphs.
In the matrix representation each row corresponds to an element in A and
each column to an element in B. The entry is then 1 i the corresponding pair
pa, bq is in R.
$\begin{array}{c|cccc} & b_1 & b_2 & \cdots & b_n \\ \hline a_1 & 1 & 0 & \cdots & 1 \\ a_2 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \\ a_m & 0 & 0 & \cdots & 1 \end{array}$
In the bipartite graph the nodes on the left correspond to elements in A, the
ones on the right to elements in B. The nodes are linked i pa, bq P R.

a1 b1

a2 b2

a3 b3

Later we will take the inverse approach and formally define graphs using
relations.
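Both representations are straightforward to generate from a relation given as a set of pairs. The small sketch below (sets and relation chosen for illustration) prints the binary matrix; the bipartite-graph picture corresponds to the same pairs read as edges.

```python
A = ["a1", "a2", "a3"]
B = ["b1", "b2", "b3"]
R = {("a1", "b1"), ("a2", "b2"), ("a3", "b1"), ("a3", "b3")}   # R ⊆ A × B

# matrix representation: rows indexed by A, columns by B, entry 1 iff (a, b) ∈ R
matrix = [[1 if (a, b) in R else 0 for b in B] for a in A]
for a, row in zip(A, matrix):
    print(a, row)

# bipartite-graph representation: the pairs themselves are the edges
print("edges:", sorted(R))
```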

3.2.2 Properties of relations

We will now consider different properties of relations in greater detail and in a
formally precise manner.

Reflexivity A relation R on A is reflexive if for any element $a \in A$ the pair
$(a, a)$ is in R.
$\forall a \in A : (a, a) \in R$
In case A is finite the corresponding matrix has only ones on the diagonal. The
corresponding graph has a loop for each of its nodes.

Example (Reflexive relations). For instance $=$, $\leq$, and $\equiv_m$ are reflexive.

Anti-reflexivity A relation R on A is anti-reflexive if

$\forall a \in A : (a, a) \notin R$

In other words: the diagonal of the corresponding matrix is zero and there are
no loops in the associated graph.
Example (Anti-reflexive relations).

Symmetry A relation R on A is symmetric if

$\forall a, b \in A : (a, b) \in R \Leftrightarrow (b, a) \in R$

For finite sets A the corresponding binary matrix is symmetric. That is
$M_R = M_R^T$6.
Example (Symmetric relations).

Anti-symmetry A relation R on A is anti-symmetric if

$\forall a, b \in A : (a, b) \in R \land (b, a) \in R \Rightarrow a = b$

Example (Anti-symmetric relations).

In this case also the matrix is anti-symmetric. Note that also $<$ is anti-
symmetric as there is merely a semantical conclusion in the definition and not
a semantical equivalence. For $<$ the condition $((a, b) \in R \land (b, a) \in R)$ is never
met.

Transitivity A relation R on A is transitive if

$\forall a, b, c \in A : (a, b) \in R \land (b, c) \in R \Rightarrow (a, c) \in R$

Example. Most common relations are transitive, like $=$, $\leq$, $\subseteq$, and $\mid$.

The negations of the above are not transitive, nor is the element relation $\in$.
Having introduced these properties of relations we can formally characterize


equivalence relations and order relations that were mentioned repeatedly above.
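For relations on a finite set these properties can be tested by brute force. The helper functions below are a minimal sketch (direct translations of the definitions, with illustrative names), applied to the divisibility relation on a small set.

```python
def is_reflexive(R, A):
    return all((a, a) in R for a in A)

def is_symmetric(R, A):
    return all(((b, a) in R) == ((a, b) in R) for a in A for b in A)

def is_antisymmetric(R, A):
    return all(not ((a, b) in R and (b, a) in R) or a == b for a in A for b in A)

def is_transitive(R, A):
    return all((a, c) in R
               for a in A for b in A for c in A
               if (a, b) in R and (b, c) in R)

A = set(range(1, 7))
divides = {(a, b) for a in A for b in A if b % a == 0}
print(is_reflexive(divides, A), is_antisymmetric(divides, A), is_transitive(divides, A))
# True True True: divisibility on {1,...,6} is a partial order
print(is_symmetric(divides, A))  # False
```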

3.2.3 Equivalence relations


Equivalence relations are formally defined as follows.

Definition 3.14 (Equivalence relation). A relation R on a set A is called an


equivalence relation if R is
reflexive
symmetric

transitive
6 For a matrix M, $M^T$ is the transposed matrix, with entries $(M^T)_{ij} = M_{ji}$. This corre-
sponds to a reflection about the diagonal.

Figure 3.10: The partition of humans into age groups by their year of birth
(1979, 1980, 1981, ...).

Example 3.2.5 (Age group). Let's consider the set of all humans, $A := \{\text{humans}\}$.
Their age group is an equivalence relation7 formally defined as

$a \sim b \;:\Leftrightarrow\; a$ and $b$ are born in the same year

This equivalence relation introduces a partition on the set of all humans. They
are being grouped by their year of birth.
We have mentioned several times that equivalence relations partition the set
they are defined on without clarifying what that means. We will now formally
define a partition and investigate its connection to equivalence relations in
greater detail.
Definition 3.15 (Partition). Let A be a set. A partition of A is a family of
sets $(A_i)_{i \in I}$ with the following two properties.

Their union yields A: $\bigcup_{i \in I} A_i = A$
They are disjoint: $A_i \cap A_j = \emptyset$ for all $i, j \in I$, $i \neq j$
Theorem 3.2.1. Any equivalence relation yields a partition and vice versa. So
if $\sim$ is an equivalence relation on A then the equivalence classes

$[a] := \{x \in A \mid x \sim a\} \subseteq A$

are a partition of A.
Inversely, if $(A_i)_{i \in I}$ is a partition of A, then the relation

$x \sim y \;:\Leftrightarrow\; \exists i \in I : x \in A_i \land y \in A_i$

is an equivalence relation.
Proof. We will first show that for a given equivalence relation $\sim$, the equivalence
classes yield a partition. As any equivalence class $[a]$ contains at least a by
reflexivity it follows

$A = \bigcup_{a \in A} \{a\} \subseteq \bigcup_{a \in A} [a] \subseteq A$

So it remains to show that the equivalence classes are disjoint. In fact if two
equivalence classes are not disjoint, then they are equal. This follows from
transitivity as follows.
7 To show that the relation the same age group is an equivalence relation, one needs to

show that it satisfies the properties in the definition. Rather obviously it does so.

Consider two equivalence classes, $[x]$ and $[y]$, with a non-empty intersection.
That implies there exists an element $z \in A$ such that $z \in [x] \cap [y]$. Therefore
$z \sim x$ and $z \sim y$ and thus by transitivity $x \sim y$. Employing again transitivity
we obtain that any other element $x' \in [x]$ is equivalent to y and thus in $[y]$.
Therefore we obtain that $[x] \subseteq [y]$. The same argument can be made the other
way around to obtain $[y] \subseteq [x]$ and thus $[x] = [y]$.
To prove the second part of the theorem we need to show that the relation
satisfies the properties of an equivalence relation. From the idempotency of the
logical AND follows the reflexivity and from its symmetry the symmetry of the
relation. It remains to show transitivity: if $x \sim y$ and $y \sim z$, then $x, y \in A_i$
and $y, z \in A_j$ for some $i, j \in I$; as the sets of a partition are disjoint and
$y \in A_i \cap A_j$, we have $A_i = A_j$ and thus $x \sim z$.

Example 3.2.6 (Semantic equivalence). As mentioned in the previous chapter,
semantic equivalence is an equivalence relation on the set of all syntactically
correct formulas. It gives rise to a partition with $2^{(2^n)}$ classes if there are n
atoms8.
Example 3.2.7 (Congruence modulo m). The relation congruence modulo m,
$\equiv_m$, is an equivalence relation on $\mathbb{Z}$ for any $m \in \mathbb{Z}$. It yields m equivalence classes.

$[0] = \{\ldots, (-2) \cdot m, (-1) \cdot m, 0, m, 2 \cdot m, 3 \cdot m, \ldots\}$
$[1] = \{\ldots, (-2) \cdot m + 1, (-1) \cdot m + 1, 1, m + 1, 2 \cdot m + 1, 3 \cdot m + 1, \ldots\}$
$[2] = \{\ldots, (-2) \cdot m + 2, (-1) \cdot m + 2, 2, m + 2, 2 \cdot m + 2, 3 \cdot m + 2, \ldots\}$
$[3] = \{\ldots, (-2) \cdot m + 3, (-1) \cdot m + 3, 3, m + 3, 2 \cdot m + 3, 3 \cdot m + 3, \ldots\}$
$\vdots$
$[m-1] = \{\ldots, (-2) \cdot m - 1, (-1) \cdot m - 1, -1, m - 1, 2 \cdot m - 1, 3 \cdot m - 1, \ldots\}$

We can now transfer the algebraic structure of $\mathbb{Z}$ to the set of equivalence classes.
On first sight it might seem strange to introduce a calculus for sets. But then,
also all numbers in $\mathbb{Z}$ can be reduced to sets. Recall the definition of the natural
numbers as nested sets of the empty set. Just as we can define an algebraic
structure on $\mathbb{N}$ we will now define a similar structure on the set of equivalence
classes
$\mathbb{Z}_m := \mathbb{Z} / \equiv_m \;=\; \{[0], [1], [2], \ldots, [m-1]\}$

Addition is defined as
$[a] + [b] := [a + b]$
We have chosen two particular representatives of the equivalence classes, namely
$a \in [a]$ and $b \in [b]$, and used these to define a new equivalence class, $[a + b]$,
as the result of the addition. This operation should though be independent of
the choice of the representative. That is, for any other pair of representatives,
$a' \in [a]$ and $b' \in [b]$, the sum should be in the corresponding equivalence class,
$a' + b' \in [a + b]$. If so, the addition is well-defined. To show that this is actually
the case we note that any $a' \in [a]$ differs from a only by an integer multiple
of m. The same holds for $b' \in [b]$.

$a' = a + k \cdot m \qquad b' = b + l \cdot m \qquad k, l \in \mathbb{Z}$

8 Each semantic is given by a truth table with $2^n$ entries. Thus there are $2^{(2^n)}$ different
truth tables, each representing an equivalence class.

Therefore the sum of $a'$ and $b'$ turns out to be
$a' + b' = a + b + (k + l) \cdot m$
and differs just by an integer multiple of m from $a + b$. Therefore it is in the same
equivalence class.

Multiplication Similar to addition above we now define the multiplication
in $\mathbb{Z}_m$ as
$[a] \cdot [b] := [a \cdot b]$
Again we have to verify that the multiplication is well-defined, that is, independent
of the choice of the representatives. In the same manner as above one multiplies
$a' = a + k \cdot m$ and $b' = b + l \cdot m$ to obtain
$a' \cdot b' = a \cdot b + m \cdot \underbrace{(a \cdot l + b \cdot k + k \cdot l \cdot m)}_{\in \mathbb{Z}}$

So $a \cdot b$ and $a' \cdot b'$ merely differ by an integer multiple of m and are thus congruent
modulo m. With the two operations $\cdot$ and $+$ on $\mathbb{Z}_m$ we obtain an algebraic
structure, called a ring. We'll encounter it again later when we look into
cryptography.
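The well-definedness argument can be illustrated by computing with arbitrary representatives. The sketch below (a toy check, not a full ring implementation) picks random representatives of two classes and verifies that addition and multiplication always land in the expected class of $\mathbb{Z}_m$.

```python
import random

m = 7

def cls(a):
    """The equivalence class of a in Z_m, identified by its canonical representative."""
    return a % m

for _ in range(1000):
    a, b = random.randint(-100, 100), random.randint(-100, 100)
    k, l = random.randint(-10, 10), random.randint(-10, 10)
    a2, b2 = a + k * m, b + l * m          # other representatives of [a] and [b]
    assert cls(a2 + b2) == cls(a + b)      # [a] + [b] does not depend on the representatives
    assert cls(a2 * b2) == cls(a * b)      # [a] · [b] does not depend on the representatives
print("addition and multiplication in Z_m are well-defined")
```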

3.2.4 Order relations


Definition 3.16 (Partial order). A relation $\preceq$ on a set A is a partial order if
it satisfies the following properties.
reflexive
anti-symmetric: $x \preceq y \land y \preceq x \Rightarrow x = y$
transitive
Example 3.2.8 (Partial orders). The following are common partial orders.
For the sets $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{R}$ the relations $\leq$ and $\geq$ are partial orders. The
relations $<$ and $>$ are anti-symmetric but not reflexive and therefore not
partial orders.
The divisibility relation $\mid$ on $\mathbb{N}$ is a partial order. This is not true for $\mathbb{Z}$ as
$-a \mid a$ and $a \mid -a$9.
The subset relation $\subseteq$ on a power set $\mathcal{P}(B)$ is a partial order.
Remark 5. Note that we do not ask for every pair to be comparable. This
condition would read
$\forall x, y \in A : x \preceq y \lor y \preceq x$
and defines a total order, also called linear order or a chain.
Whether an order is merely partial or total can easily be seen from the
directed graph associated with the order. In the graph 3.7 corresponding to
the divisibility order not all nodes are connected and thus the order is partial. The
order $\leq$ on $\mathbb{R}$ has a linear graph (or a chain) and is therefore a total order.
9 The divisibility on $\mathbb{Z}$ is not anti-symmetric: $-a$ is a divisor of a and vice versa, but they are
not equal (for all $a \neq 0$).

Figure 3.11: The figure on the left shows an order with just one maximum, the
greatest element. The figure on the right shows an order with two maxima and
no unique greatest element.

Important notions We will now introduce some important notions related


to orders.

The maximal elements x in a set A with partial order $\preceq$ satisfy

$\nexists y \in A : x \preceq y \land y \neq x$

An element $x \in A$ is the greatest element if

$\forall y \in A : y \preceq x$

Note that there are possibly multiple maxima whereas the greatest element is
unique, as schematically shown in figure 3.11.
Example 3.2.9. Let's consider the set $A = \{1, 2, 3\}^2$ first with the order

$a = (a_1, a_2) \preceq a' = (a'_1, a'_2) \;:\Leftrightarrow\; a_1 \leq a'_1 \land a_2 \leq a'_2$

This order is not total, as the pairs $(1, 2)$ and $(2, 1)$ are not comparable. Another
order on the same set is

$a = (a_1, a_2) \preceq a' = (a'_1, a'_2) \;:\Leftrightarrow\; a_1 < a'_1 \lor (a_1 = a'_1 \land a_2 \leq a'_2)$

This second order corresponds to an alphabetical order. The first component
where the two elements differ decides about the order. It is thus called lexico-
graphic order.
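The two orders can be compared directly in code. The sketch below (illustrative helper names) counts whether every pair in $\{1, 2, 3\}^2$ is comparable under the component-wise order and under the lexicographic order; only the latter is total.

```python
from itertools import product

A = list(product([1, 2, 3], repeat=2))            # the set {1,2,3}^2

def leq_componentwise(a, b):
    return a[0] <= b[0] and a[1] <= b[1]

def leq_lexicographic(a, b):
    return a[0] < b[0] or (a[0] == b[0] and a[1] <= b[1])

def is_total(leq):
    return all(leq(a, b) or leq(b, a) for a in A for b in A)

print(is_total(leq_componentwise))   # False: e.g. (1,2) and (2,1) are incomparable
print(is_total(leq_lexicographic))   # True: a chain, as in the linear Hasse diagram
```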

3.3 Functions
After having introduced the notion of relations we will now introduce functions
as a particular kind of relations.
Definition 3.17 (Function). A relation $f \subseteq A \times B$ is a functional relation or
just a function, written $f : A \to B$, if it satisfies the following properties.
$\forall a \in A \;\exists b \in B : (a, b) \in f$ (Existence)

Figure 3.12: The first order relation yields the Hasse diagram shown here, with
$(1, 1)$ as the smallest and $(3, 3)$ as the greatest element. From the diagram it
becomes evident that the order is not total.

Figure 3.13: The lexicographic order has a linear Hasse diagram as it is a total
order.

Figure 3.14: This is a valid function.

Figure 3.15: This is not a function.

$(a, b) \in f \land (a, b') \in f \Rightarrow b = b'$ (Uniqueness)

Sometimes existence and uniqueness are written more concisely as

$\forall a \in A \;\exists! b : (a, b) \in f$

where the exclamation mark indicates exactly one.

In other words: the definition identifies a function with the corresponding
set of pairs $(a, b)$, i.e. the graph of the function. The following notations are
commonly used for $(a, b) \in f$.

$f(a) = b \qquad f : a \mapsto b$

After having specified the notion of a function, we now introduce some im-
portant properties.
Definition 3.18. We distinguish the following characteristics of functions
A function $f : A \to B$ is called injective or one-to-one if

$\forall a, a' \in A : a \neq a' \Rightarrow f(a) \neq f(a')$

A function $f : A \to B$ is called surjective or onto if

$\forall b \in B \;\exists a \in A : f(a) = b$

Figure 3.16: The first function has a collision and is thus not injective. The
second has an image that is not equal to B and thus not surjective. The third
shows a bijective function.

Figure 3.17: These three functions $f_1, f_2, f_3 : \mathbb{R} \to \mathbb{R}$ illustrate the properties of
functions. The function on the left is bijective, the one in the middle is surjective
but not injective and the one on the right is injective but not surjective.

A function $f : A \to B$ is called bijective if it is both injective and surjec-
tive.
So a function is injective if each argument is mapped to a unique element.
There are no conflicts, as shown in figure 3.16. Surjectivity on the other hand
states that all points in the set B are reached (or the image of f is just B).
Finally bijectivity yields a one-to-one correspondence between the two sets. In
particular one obtains a function $g : B \to A$ from f.
Employing these properties of functions we can now introduce an order re-
lation on sets.
Definition 3.19 (Ordering sets). The following relations compare the sizes of
two sets, called the cardinality of the sets. The cardinality of a set A is less or
equal than the cardinality of another set B, written $A \preceq B$, if there exists an
injective function from A to B.
$A \preceq B \;:\Leftrightarrow\; \exists f : A \to B$ injective
The sets are equal in size if there exists a bijective function.
$A \sim B \;:\Leftrightarrow\; \exists f : A \to B$ bijective

Figure 3.18: Injective maps map the first set into the second and vice versa.

The relation $\preceq$ is an order relation, as shown below, and thus yields a hierarchy
of sets. First, the relation is reflexive, as the identity function

$\mathrm{id} : A \to A \qquad \mathrm{id}(a) = a$

is injective (actually bijective). Therefore $A \preceq A$.

In order to prove the transitivity of the relation we'll need to introduce
the concatenation of functions. Given a function $f : A \to B$ and a function
$g : B \to C$ we define a function $h := g \circ f : A \to C$ with $h(a) := g(f(a))$. If
both f and g are injective then also their concatenation $g \circ f$ is injective10.
So if $A \preceq B$ and $B \preceq C$ then there exist injective functions $f : A \to B$ and
$g : B \to C$. The existence of the injective function $g \circ f$ proves then $A \preceq C$ and
thus transitivity.
To complete the proof of $\preceq$ being a partial order, it remains to show anti-
symmetry. So the goal is to show

$A \preceq B \land B \preceq A \;\Rightarrow\; A \sim B$ (3.1)

It is not intuitively clear why this should hold. An injective map from A to B
maps A onto a subset of B. So we might get an intertwined mapping as shown in
3.18.
As it turns out, 3.1 is actually true. This is the statement of the subse-
quent theorem.
Theorem 3.3.1 (Cantor-Schröder-Bernstein). For any two sets A and B it
holds
$A \preceq B \land B \preceq A \;\Rightarrow\; A \sim B$
We will now sketch the idea of the proof with a simple analogy.
Proof. Imagine a park (associated with a set A) with a house (associated with
a set B) as shown in figure 3.19. In the house there is a map of the park. If we
look close enough, we see the house on the map again. This house on the map
in turn contains a map of the park containing a house containing. . . .
As the house is inside the park, i.e. $B \preceq A$, there exists an injective function
$g : B \to A$. On the other hand the map of the park (so somehow the park
itself)11 is inside the house and there exists an injective map $f : A \to B$.
10 A formal proof of this statement could be done by contradiction. If $g \circ f$ was not injective
there existed elements $a \neq a'$ s.t. $g(f(a)) = g(f(a'))$. Employing the injectivity of f we
know that $f(a) =: b \neq f(a') =: b'$. Thus there exist two arguments b and $b'$, $b \neq b'$, such that
$g(b) = g(b')$, yielding a contradiction with g being injective.
11 There is a bijective map connecting the park and the map of the park. Thus they are in

a one-to-one correspondence.


Figure 3.19: The park, represented by the green circle, with the hut depicted
by the red square. The smaller green circle in the hut corresponds to the map
of the park.

More precisely we started off with a set A and a set B, i.e. a park and
a house separately. Only by means of the injective functions $f : A \to B$ and
$g : B \to A$ we could put the house into the park, and the park into the house
(i.e. on the map inside the house) and thus construct the infinite nesting of the
two sets.

Bijection So we can define a bijection as follows: the sets park \ house (i.e.
$A \setminus g(B)$, or circle \ square in figure 3.20) are mapped to their correspondent one
level below. The sets house \ park (i.e. $B \setminus f(A)$, or square \ circle) are mapped
to themselves (on the same level). This idea is visualized in 3.20.
So $\preceq$ satisfies all requirements of a partial order. Indeed it is even a total
order (even though we will not prove this here). Thus it is natural to ask
whether there is a biggest element. As shown in the theorem below, the power
set of any set is strictly bigger than the set itself. Thus assuming that there
exists a biggest set A, we know that $A \prec \mathcal{P}(A)$. This yields a contradiction. So
the answer is: there does not exist a greatest element.
Theorem 3.3.2 (Cantor). The cardinality of any set A is not equal to the
cardinality of its power set,
$A \not\sim \mathcal{P}(A)$
and thus $A \prec \mathcal{P}(A)$.
Proof. We will show that no function

$f : A \to \mathcal{P}(A)$

is surjective. For any such function f we note that an element $a \in A$ is either
an element of its image $f(a)$ or not.

$a \in f(a) \lor a \notin f(a)$

Let's now define the set

$B := \{a \in A \mid a \notin f(a)\} \subseteq A$

We now show by contradiction that there does not exist an element $b \in A$ such
that $f(b) = B$. Assume such a b would exist. If $b \in f(b)$, then $b \notin B = f(b)$.
If on the other hand $b \notin f(b)$, then $b \in B = f(b)$.

Figure 3.20: The bijective map from A to B.

This is a contradiction. So there is no surjective function from A to its power
set and therefore no bijection. Thus the two are not equal.
There is a simple injective map from A to its power set $\mathcal{P}(A)$, mapping all
elements of A to the corresponding one-element sets.

$f : A \to \mathcal{P}(A) \qquad f : a \mapsto \{a\}$

Thus $A \preceq \mathcal{P}(A)$ and therefore $A \prec \mathcal{P}(A)$.
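For a finite set the diagonal construction can be carried out explicitly. The sketch below (illustrative only; the interesting case is of course the infinite one) takes an arbitrary function $f : A \to \mathcal{P}(A)$ and shows that the set B is never in its image.

```python
A = {1, 2, 3}

# an arbitrary function f : A -> P(A), given as a dictionary (values are subsets of A)
f = {1: frozenset({1, 2}), 2: frozenset(), 3: frozenset({1, 3})}

B = frozenset(a for a in A if a not in f[a])   # the diagonal set B = {a ∈ A | a ∉ f(a)}
print(B)                   # frozenset({2})
print(B in f.values())     # False: f misses B, so f is not surjective onto P(A)
```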

Chapter 4

Combinatorics

Combinatorics is a collection of methods, principles, tools, techniques and facts


to count the size of finite sets. It is however not sufficient to have a system-
atic approach. Solving combinatorial problems also requires a fair amount of
intuition.
We start by introducing some basic notions by means of an example.

4.1 Basic notions


Example 4.1.1 (Manhattan). Imagine walking from a point A to another point
B on a grid such as the streets of Manhattan, as shown in figure 4.1. The
question is, how many shortest ways are there from A to B. We will discuss
two dierent ways to answer this question.

Divide and conquer One way to find a solution is to reduce the problem
step by step to a base case. To do so, let us consider how to calculate the answer
if we knew it for some previous point B 1 . Lets assume that we want to calculate
the number of paths x, while in the previous points the number of paths is a
respectively b as shown in figure 4.2. Then the number of shortest paths to B
is just the sum of a and b. Repeating the argument for further preceding points


Figure 4.1: Some of the shortest paths are indicated by the colored paths.


Figure 4.2: In the figure to the left the variables a, b, c, d, e and x refer to the
number of paths from A to the corresponding points. Note that they do not
refer to the coordinates. The figure on the right shows the number of shortest
paths from A to its neighboring points.

n=0:           1
n=1:          1 1
n=2:         1 2 1
n=3:        1 3 3 1
n=4:       1 4 6 4 1

Figure 4.3: Pascals triangle. The diagonals are labeled by k, the horizontal
levels by n.

yields

$x = a + b$
$a = c + d$
$b = d + e$ (4.1)

In this manner we can reduce the computation of the number of shortest paths
until we reach the upper edge or the edge to the left. For both of these edges
there is only one shortest path. Thus we can compute the number of shortest
paths for all crossings in the grid by summing, starting from the edges with just
one path, as indicated by the red arrows in the right figure in 4.2. The pattern of
numbers that emerges is the Pascal triangle shown in figure 4.3, rotated by 45°.
The numbers in the Pascal triangle are constructed the same way, by setting the
outer nodes to 1 and then adding the two upper neighbors. The diagonal levels are
labeled by k, the horizontal levels are labeled by n. The entries of the triangle
are denoted by
$\binom{n}{k}$

This entity is usually called n choose k or the binomial coefficient. Writing
the observation that the number of paths can be calculated from the number of
paths ending in previous nodes (i.e. equation 4.1) in terms of n and k yields
the following recursive formula

$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ (4.2)

With the base case defined as

$\binom{n}{0} := \binom{n}{n} := 1$

the binomial coefficient is completely characterized, as it can be computed for
any n and k by building up the triangle employing the recursion relation in
equation 4.2.
We can now write the number of shortest paths by setting $n = d + r$ and
$k = r$ to obtain
$x = \binom{d + r}{r}$
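The recursive characterization translates directly into code. The sketch below (a straightforward illustration, with memoization so that repeated subproblems are not recomputed) computes $\binom{n}{k}$ from the recursion 4.2 and uses it to count the shortest paths on a grid.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def binom(n, k):
    """Binomial coefficient via the Pascal recursion (4.2), with the base cases set to 1."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)

d, r = 4, 4                        # steps down and steps to the right
print(binom(d + r, r))             # 70 shortest paths on a 4x4 grid
print(binom(4, 2), binom(5, 0))    # 6, 1
```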

Step sequences A second way to approach the problem is to take into account
the different step sequences. We will now denote a step down with a capital
D and a step to the right with R. Any sequence of steps forming one of the
shortest paths from A to B, as for instance

$R_1 D_1 R_2 R_3 \cdots D_{d-1} D_d \cdots R_r$

contains r steps to the right and d steps down. So at first one might think that
the number of shortest paths from A to B is the number of permutations of the
r steps R and the d steps D, i.e.1

$(d + r) \cdot (d + r - 1) \cdot (d + r - 2) \cdots 2 \cdot 1 = (d + r)!$

This is however wrong as the steps R and the steps D are indistinguishable.
Two sequences
$R_1 R_2 D_1 \qquad R_2 R_1 D_1$
with the Rs being permuted among themselves do not differ and are counted
double in the formula above. In order to fix this issue we have to divide $(d + r)!$
first by the number of permutations among the Rs and then by the number of
permutations among the Ds.
$x = \frac{(r + d)!}{r! \, d!}$
Assuming that both approaches yield the same result we can now conclude for
the binomial coefficient, replacing $d + r = n$, $r = k$ and $d = n - k$, that

$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)(k-2)\cdots 2 \cdot 1}$ (4.3)

1 The number of permutations of n different or distinguishable elements is n!. To understand
this imagine we put the n elements one by one into an order. So for the first element we can
choose among n elements, for the second among $n - 1$ and so on and so forth. We obtain therefore
$n \cdot (n - 1) \cdots 2 \cdot 1 = n!$ different orders (or permutations).

The example above provided us with two formulas for the binomial coeffi-
cient. Let's now verify that the formula in equation 4.3 actually holds and the
two yield the same result.
So we have to show that the right-hand side satisfies the base conditions and
the recursion relation. Let's first compute the base cases. By convention $0! = 1$.
Thus
$\binom{n}{0} = \frac{n!}{0! \, n!} = 1$
It remains to verify the recursion relation.
$\frac{n!}{k!(n-k)!} + \frac{n!}{(k-1)!(n-k+1)!} = \frac{n!}{(k-1)!(n-k)!\,k} + \frac{n!}{(k-1)!(n-k)!\,(n-k+1)}$
$= \frac{n!\,(n-k+1+k)}{k!\,(n-k+1)!} = \frac{(n+1)!}{k!\,(n+1-k)!}$
This yields the recursion relation (note that we merely shifted the index relative
to equation 4.2). Therefore the recursive and the direct characterization of
the binomial coefficient are consistent with one another and simply yield two
dierent ways to compute the same number. In the following we will introduce
two common interpretations and applications of the binomial coefficients.

Subsets Why do we say n choose k, referring to the binomial coefficient? As
it turns out, a finite set with n elements has $\binom{n}{k}$ subsets with k elements.
We will now prove this.
Proof. We show that the number of k-element subsets of an n-element set
satisfies the same recursion relation as the binomial coefficient. As we can choose
merely one subset with zero elements, the empty set, and one subset with
n elements, the set itself, the base conditions are satisfied. Together with
the recursion relation this will prove that the binomial coefficient is equal to the
number of subsets with k elements.
So back to the recursion relation. Let's consider a set containing $n + 1$
elements. How many k-element subsets exist in this set? Let's separate
the $(n + 1)$-th element in the set and consider first all k-element subsets not
containing that element. This leaves us with the case of k-element subsets of
an n-element set. It remains to consider the k-element subsets containing the
$(n + 1)$-th element. These are just the $(k - 1)$-element subsets in the set without
that element. Thus we obtain
$\#(k \text{ out of } (n+1)) = \#(k \text{ out of } n) + \#((k-1) \text{ out of } n)$
This is just the recursion relation from equation 4.2.

Binomial coefficient Let's phrase the question in the last paragraph the
other way around: How did n choose k get the label binomial coefficient?
The name stems from the following relation to calculus. If we want to calculate
the n-th power of the sum of two variables we usually want to turn the product
of a sum into a sum of products

$(x + y)^n = \underbrace{(x + y)(x + y)\cdots(x + y)}_{n \text{ factors}} = \underbrace{x \cdot x \cdots x}_{x^n} + \underbrace{y \cdot x \cdots x}_{y x^{n-1}} + \ldots + y^n$

We have to multiply one variable from each of the brackets. So there are $2^n$
summands, each containing n factors, on the right. These summands could also
be regarded as n-bit strings with 0 corresponding to x and 1 to y. There is
a one-to-one relation between n-bit strings and the subsets of an n-element set.
The i-th element is an element of a subset iff the i-th bit of the corresponding
string is 1. So we obtain a one-to-one correspondence between the summands
and the subsets of a finite set with n elements. We will refer to this set as A.
The summand $x^n$ corresponds to the empty set, $y \cdot x \cdots x$ to the set containing
the first element $a_1 \in A$, i.e. $\{a_1\} \in \mathcal{P}(A)$, and $y^n$ to the set containing all n
elements, i.e. A itself.
While the sets $\{a_1\}$ and $\{a_2\}$ are not the same, their corresponding sum-
mands are:

$y \cdot x \cdots x \cdot x = x \cdot y \cdot x \cdots x$

Analogously all summands with the same number of xs and ys are equal. How
many of these summands are there containing k times x and $(n - k)$ times y?
Following the reasoning above these correspond to the n-bit strings with exactly
k ones, or the k-element subsets of an n-element set. Thus there are $\binom{n}{k}$ such
summands and we obtain

$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$ (4.4)

4.2 Urn models


How many combinations of k elements can be drawn from an urn containing
n elements (as shown in figure 4.4)? The number of different combinations
depends on whether the order of the combinations matters, and whether the
elements are put back after each draw. To understand these cases better we'll
first consider the following example before turning to the general case.
Example 4.2.1. To keep matters simple let us set $n = 3$ and $k = 2$. We will now
go through all the four cases in detail.

Ordered, with repetition After each draw the element is put back into
the urn. So for each draw there are n choices. As the order matters the two
combinations $(x, y)$ and $(y, x)$ are not equal, similar to the Cartesian product,
and we therefore obtain the following 9 combinations

$(1, 1) \quad (1, 2) \quad (1, 3)$
$(2, 1) \quad (2, 2) \quad (2, 3)$
$(3, 1) \quad (3, 2) \quad (3, 3)$

Ordered, without repetition Elements once drawn from the urn are not
put back again. Thus the number of choices reduces by one with each draw and
we obtain the following 6 combinations

$(1, 2) \quad (1, 3)$
$(2, 1) \quad (2, 3)$
$(3, 1) \quad (3, 2)$

Unordered, with repetition Again we put the elements back after each
draw, but the order of the drawn combination does not matter. That is, we
regard the combinations above and below the diagonal in the matrices above as
equal. Therefore we are left with 6 combinations

$(1, 1) \quad (1, 2) \quad (1, 3)$
$(2, 2) \quad (2, 3)$
$(3, 3)$

Unordered, without repetition If we do not put back the elements after
each draw, the combinations with twice the same element, i.e. those on the
diagonal, cannot occur. Thus we are left with the following 3 cases

$(1, 2) \quad (1, 3)$
$(2, 3)$
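The four cases correspond exactly to four iterators in Python's itertools module, which makes the example easy to reproduce for any n and k. The following sketch counts the draws of $k = 2$ elements from an urn with $n = 3$ elements in all four regimes.

```python
from itertools import product, permutations, combinations, combinations_with_replacement

urn, k = [1, 2, 3], 2

print(len(list(product(urn, repeat=k))))                  # 9: ordered, with repetition (n^k)
print(len(list(permutations(urn, k))))                     # 6: ordered, without repetition
print(len(list(combinations_with_replacement(urn, k))))    # 6: unordered, with repetition
print(len(list(combinations(urn, k))))                     # 3: unordered, without repetition
```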

We would now like to generalize from the example above to arbitrary n and
k. The matrix picture is somewhat misleading as it is merely helpful in the case
$k = 2$. One rather multiplies the number of choices, as we will see.

Ordered, with repetition For each of the draws there are n choices. So the
number of combinations is
$\underbrace{n \cdot n \cdots n}_{k \text{ times}} = n^k$

Ordered, without repetition If the elements are not returned to the urn,
the number of choices decreases by one with each draw. Thus the number of
combinations is

$n \cdot (n - 1) \cdot (n - 2) \cdots (n - k + 1) =: n^{\underline{k}}$

The special case $k = n$ yields the number of permutations of n elements, i.e. n!,
as mentioned already before.

Unordered, without repetition The number of unordered combinations of
k elements without repetition corresponds to the k-element subsets of a set with
n elements. Recall that two sets are equal if they contain the same elements,
independent of the order. Thus the number of combinations is given by the
binomial coefficient
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$

Unordered, with repetition The last case, with repetition but unordered,
is slightly more complicated. A vote is an example of this case. The order of the
votes does not matter, but merely how many votes each candidate got. Imagine
3 candidates and 20 voters. Therefore $n = 3$ and $k = 20$.2 How many different
distributions of votes are there? As the order does not make a difference, we
might as well order the votes such that the votes for the first candidate come
first and then the ones for the second etc.

$\underbrace{1 \ldots 1 \mid 2 \ldots 2 \mid 3 \ldots 3}_{20 \text{ votes}}$

For arbitrary numbers of candidates and voters we similarly obtain

$\underbrace{1 \ldots 1 \mid 2 \ldots 2 \mid \ldots \mid n \ldots n}_{k \text{ votes}}$

with $n - 1$ separators $\mid$. In this notation the separators also indicate the votes.
Left of the first separator there are the votes for the first candidate, left of the
second separator the votes for the second candidate and so on and so forth.
We can therefore write

$\underbrace{\star \ldots \star \mid \star \ldots \star \mid \ldots \mid \star \ldots \star}_{k \text{ stars}}$

The combinations are merely characterized by the order of stars and separators.
To see how many arrangements there are, we take $n - 1 + k$ positions and fill
them either with stars or with separators. Then there are

$\binom{n - 1 + k}{n - 1} = \binom{n + k - 1}{k}$ (4.5)

different arrangements for k stars and $n - 1$ separators, as this corresponds to
$(n - 1)$-element subsets of an $(n + k - 1)$-element set (or equivalently to k-element
subsets by the symmetry of the binomial coefficient)3.
The extensionality axiom of set theory establishes a connection to unordered
combinations. Two sets that dier only by a permutation of the elements are
equal, analogue to unordered combinations. The emergence of the binomial
coefficient is a result of this analogy.
2 As repetitions are allowed, the number of draws k might as well be bigger than the number

of elements in the urn, n.


3 One could also consider the number of permutations of all $n + k - 1$ stars and separators.
Then we have to divide by the number of permutations among the stars and among the
separators as they are indistinguishable. So we obtain the binomial coefficient again.

Figure 4.5: The elements in the intersection are counted twice and have to be
subtracted again.

4.3 Combinatorial rules and counting strategies


After having considered how to determine the number of combinations in dif-
ferent urn models we will turn to calculating the sizes of sets constructed from
other sets by e.g. union.

Sum rule A family of sets $(A_i)_{i=1,\ldots,n}$ is mutually disjoint if

$A_i \cap A_j = \emptyset \quad \forall i \neq j$

The size of the union of mutually disjoint sets is the sum of the sizes of the sets.

$\left| \bigcup_{i=1}^{n} A_i \right| = \sum_{i=1}^{n} |A_i|$

Product rule For any family of sets $(A_i)_{i=1,\ldots,n}$, whether disjoint or not, the
size of their Cartesian product is just the product of the sizes of the single sets
$A_i$.

$\left| \prod_{i=1}^{n} A_i \right| = \prod_{i=1}^{n} |A_i|$

Equality rule Two finite sets A, B have the same size if there exists a bijec-
tive function $f : A \to B$. The function establishes a one-to-one correspondence
between the elements of A and B.

Principle of Inclusion/Exclusion We'd like to generalize the sum rule
above to sets that are not necessarily mutually disjoint. Let's first consider
the cases of 2 sets and 3 sets. The union of two arbitrary sets, whether disjoint
or not, has the size

$|A_1 \cup A_2| = |A_1| + |A_2| - |A_1 \cap A_2|$

The elements that are counted twice have to be subtracted again, as visualized
in figure 4.5.
Example 4.3.1 (Inclusion/Exclusion of 2 sets). How many of the numbers 1, 2,
3, . . . , 100 are divisible by 2 or by 5? The numbers that are divisible by 2 form

a set $A_1$ with 50 elements, the numbers divisible by 5 a set $A_2$ with 20 elements.
Their intersection has size 10. Therefore

$|A_1 \cup A_2| = |A_1| + |A_2| - |A_1 \cap A_2| = 60$

Let us now consider the case of three sets. Again we have to subtract the
intersection of each of the pairs. But then the intersection of all three sets is
subtracted three times and thus not counted at all. So we have to finally add
it again.

$|A_1 \cup A_2 \cup A_3| = |A_1| + |A_2| + |A_3| - |A_1 \cap A_2| - |A_1 \cap A_3| - |A_2 \cap A_3| + |A_1 \cap A_2 \cap A_3|$

Instead of considering the intersections one after the other we could as well
repeatedly apply the case of two sets from above. This yields

$|(A_1 \cup A_2) \cup A_3| = |A_1 \cup A_2| + |A_3| - \underbrace{|(A_1 \cup A_2) \cap A_3|}_{|(A_1 \cap A_3) \cup (A_2 \cap A_3)|}$
$= |A_1| + |A_2| - |A_1 \cap A_2| + |A_3| - (|A_1 \cap A_3| + |A_2 \cap A_3| - |A_1 \cap A_2 \cap A_3|)$

as above. This step turns out to be crucial in the proof of the general case
considered in the subsequent theorem.
Theorem 4.3.1 (Inclusion/Exclusion). The size of a union of sets $(A_i)_{i=1,\ldots,n}$
is given by

$\left| \bigcup_{i=1}^{n} A_i \right| = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right|$ (4.6)

The theorem can be proven by induction over n.

Proof. As we already discussed the cases $n = 2$ and $n = 3$ above, the base cases
are proven. It remains to show the induction step. We will consider a family of
$(n + 1)$ sets and split off the last set. Then we apply the case of two sets.

$\left| \bigcup_{i=1}^{n+1} A_i \right| = \left| \bigcup_{i=1}^{n} A_i \;\cup\; A_{n+1} \right|$

$= \left| \bigcup_{i=1}^{n} A_i \right| + |A_{n+1}| - \underbrace{\left| \bigcup_{i=1}^{n} A_i \,\cap\, A_{n+1} \right|}_{\left| \bigcup_{i=1}^{n} (A_i \cap A_{n+1}) \right|}$

$= \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| + |A_{n+1}| - \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} (A_{i_k} \cap A_{n+1}) \right|$

The last line is of the form of equation 4.6. To see this we will expand the
equation 4.6 with $(n + 1)$. First we split off the term $r = 1$, then all the terms
with $i_r = n + 1$.


$\sum_{r=1}^{n+1} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \ldots < i_r \le n+1} \left| \bigcap_{k=1}^{r} A_{i_k} \right|$

$= \sum_{k=1}^{n} |A_k| + |A_{n+1}| + \sum_{r=2}^{n+1} \ldots$

$= \sum_{k=1}^{n} |A_k| + |A_{n+1}| + \sum_{r=2}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| + \sum_{r=2}^{n+1} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_{r-1} \le n} \left| \bigcap_{k=1}^{r-1} A_{i_k} \cap A_{n+1} \right|$

$= \sum_{k=1}^{n} |A_k| + |A_{n+1}| + \sum_{r=2}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| - \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \cap A_{n+1} \right|$

$= |A_{n+1}| + \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| - \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \cap A_{n+1} \right|$

In the penultimate step the index r in the last term was shifted. The emerging
$(-1)$ factor was taken in front of the sum. In the last step the sum $\sum_{k=1}^{n} |A_k|$
was absorbed into the third term as its $r = 1$ term.
Even though the right-hand side of the equation 4.6 looks pretty complicated
it can be helpful as for instance in the example below.
Example 4.3.2 (Opera). During an opera evening all the coats in the cloakroom
get disordered and the guests get back a random coat afterwards. Is it more
likely that at least one person gets back his or her own coat or that nobody
gets back his own coat? Let us first consider a simpler case with 3 guests. The
three coats in the cloakroom are permuted. If at least one person retrieves his
or her own coat after the opera the permutation of the coats has at least one
so-called fix-point: at least one of the coats is at its former position after the
permutation in the cloakroom. An example is4

$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$

4 We write a permutation with the initial order in the first row and the final order in the
second row.
If no guest retrieves the own coat, then the corresponding permutation is fix-
point free (we abbreviate such a permutation by FPFP). For three coats there
are merely two FPFPs

$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$

So the probability that nobody retrieves the own coat is $1/3$ while the probability
that at least one person retrieves the own coat is $1 - 1/3 = 2/3$. How do we
calculate this if the number of guests (and therefore the number of coats) is
significantly larger? It turns out the number of fix-point permutations can be
calculated using equation 4.6. We will write the set containing all permutations
with at least one fix-point, denoted hereafter A, as a union of simpler sets and
use inclusion-exclusion to determine the size of this union.
Let us define $A_i$ as the set containing the permutations with a fix-point at
position i. Then

$\bigcup_{i=1}^{n} A_i = A$

and the size of A can be calculated using inclusion and exclusion. The size of
each of the $A_i$ is equal to the number of permutations of $(n - 1)$ elements. The
size of the intersection of two different $A_i$ is equal to the number of permutations
of $(n - 2)$ elements, and so on and so forth.

$|A| = \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < \ldots < i_r \le n} (n - r)!$

For each r there are $\binom{n}{r}$ terms in the inner sum. Employing the equality
$\binom{n}{r} = \frac{n!}{(n-r)!\,r!}$ we obtain


$|A| = \sum_{r=1}^{n} (-1)^{r-1} \binom{n}{r} (n - r)! = n! \sum_{r=1}^{n} \frac{(-1)^{r-1}}{r!}$

The number of fix-point-free permutations is then

$n! - |A| = n! \left( 1 - \sum_{r=1}^{n} \frac{(-1)^{r-1}}{r!} \right) = n! \sum_{r=0}^{n} \frac{(-1)^{r}}{r!}$

Using that the exponential can be written as

$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$ (4.7)

we obtain with $x = -1$ that the number of fix-point-free permutations tends to
$n!/e$:

$\lim_{n \to \infty} \frac{\#\mathrm{FPFP}(n)}{n!} = \frac{1}{e}$

So the probability that nobody will retrieve the own coat is $1/e \approx 1/2.71828$
and thus smaller than the probability that at least one person retrieves the own
coat.
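The claim can be checked by brute force for small n. The sketch below (exhaustive enumeration, so only feasible for small n) counts the fix-point-free permutations and compares the fraction with $1/e$.

```python
from itertools import permutations
from math import e, factorial

def count_fpfp(n):
    """Number of permutations of {0, ..., n-1} without a fix-point."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))

for n in range(1, 9):
    ratio = count_fpfp(n) / factorial(n)
    print(n, count_fpfp(n), round(ratio, 4))
print("1/e =", round(1 / e, 4))   # the ratios approach 1/e ≈ 0.3679
```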

4.3.1 The Pigeonhole Principle


If there are more pigeons inside a pigeonry than holes in the pigeonry then at
least two pigeons have to leave through the same hole. More formally we obtain
the following principle

Theorem 4.3.2 (Pigeonhole Principle). When n objects are distributed among
k boxes, with $k < n$, then there is at least one box containing at least two objects.
Even though the principle seems intuitively clear (one could for instance
think of a pigeonry to better understand it) we will provide a formal proof
employing induction over the number of boxes k.

Proof. In the base case with $k = 1$ there is merely one box containing all n
elements, with $n \geq 2$. In the induction step we prove that the assertion holds for
$k + 1$ boxes if it holds for k boxes. Let's consider the $(k + 1)$-th box separately
and distinguish the following two cases:
1. The box $(k + 1)$ contains 2 objects or more.

2. The box $(k + 1)$ contains only 1 object. Then the other $n - 1$ objects are
distributed among the remaining k boxes. If $n > k + 1$ then $n - 1 > k$ and
by the induction hypothesis there is one box among these k that contains at
least two objects.

This completes the induction step and thus the proof of the principle.
The following example illustrates how the pigeonhole principle can be
employed.
Example 4.3.3 (Monotonic subsequences). If we are given a finite sequence of
distinct numbers, as for instance

$(1, 17, 5, 3, 20, 2, 4)$ (4.8)

then a subsequence is a selection of these numbers with the order given by the
original sequence, such as
$(17, 20, 4)$ (4.9)
A subsequence is monotonically increasing if the numbers are in increasing
order. Analogously the sequence is monotonically decreasing if the numbers are
in decreasing order. The sequence

$(1, 3, 20)$

is monotonically increasing while the sequence

$(17, 5, 3, 2)$

is monotonically decreasing.
The question is: for a sequence of length n is there a lower bound on the length
of the longest monotonic subsequence? In fact a sequence of length $m^2 + 1$ always
has a monotonic subsequence of length $m + 1$. We will prove this statement
now by contradiction.

Proof. Let's consider a sequence of $m^2 + 1$ elements

$(a_1, \ldots, a_{m^2+1})$

We assume that the longest monotonic subsequence has length $l \leq m$. To each number
$a_i$ in the sequence we now associate the length $c_i$ of the longest monotonically
increasing subsequence starting with $a_i$, and the length $d_i$ of the longest mono-
tonically decreasing subsequence starting with $a_i$. So for each $a_i$ we obtain a
pair $(c_i, d_i)$. Both $c_i$ and $d_i$ are less or equal to l and therefore less or equal to
m. Thus there are at most $m^2$ different ordered pairs. The pigeonhole principle states
that there are at least two numbers $a_i$ and $a_j$, $i < j$, with the same values $(c, d)$.5

$(a_1, \ldots, a_i, \ldots, a_j, \ldots, a_{m^2+1})$ (4.10)

We distinguish now the two cases

$a_i < a_j$: Then we can construct a monotonically increasing subsequence of length
$c + 1$ by adding $a_i$ to the subsequence starting from $a_j$. This yields a
contradiction with c being the length of the longest increasing subsequence starting
from $a_i$.
$a_i > a_j$: One obtains a monotonically decreasing subsequence of length $d + 1$. This
is in contradiction with d being the length of the longest monotonically
decreasing subsequence starting from $a_i$.

Therefore the assumption that the length of the longest monotonic subsequence
is less or equal to m leads to a contradiction.
Is this lower bound optimal? In other words: is this lower bound tight?
Indeed, this is the case. We can come up with a sequence of length $m^2$ whose
longest monotonic subsequence has length m, as depicted in figure 4.6.

4.3.2 Double Counting


There are two equivalent ways to count the elements in a subset of a Cartesian
product $A \times B$, i.e. in a relation $S \subseteq A \times B$. We could either count the set
column by column using the subsets

$m_a := \{b \in B \mid (a, b) \in S\}$
5 The $m^2$ possible pairs correspond to the holes and the $m^2 + 1$ elements $a_i$ to the pigeons.

Figure 4.6: A sequence with $m^2$ elements and monotonic subsequences with m
elements.

Figure 4.7: One can count the elements in S either row after row, or column
after column.

Figure 4.8: The divisors l of the numbers $k \leq n$.

or count the elements in S row by row using the subsets

$n_b := \{a \in A \mid (a, b) \in S\}$

Then we obtain the size of S

$|S| = \sum_{a \in A} |m_a| = \sum_{b \in B} |n_b|$

The subsequent example shows how it can be useful to interchange a sum over
columns with a sum over rows.
Example 4.3.4. What is the average number of divisors of a number k? To
answer this question let us introduce a function

$\tau(k) := |\{l > 0 \mid l \text{ divides } k\}|$

that returns the number of divisors. So the function yields for instance

$\tau(1) = 1 \quad \tau(2) = 2 \quad \tau(3) = 2 \quad \tau(4) = 3 \quad \tau(5) = 2 \quad \tau(6) = 4$

For every prime number p we obtain $\tau(p) = 2$. Having introduced this function
we can define what we meant by average. So for a given number n we are
interested in calculating
$\frac{1}{n} \sum_{k=1}^{n} \tau(k)$
Summing the values of $\tau(k)$ is equivalent to counting the dots in figure 4.8 col-
umn by column. The columns have a rather fuzzy pattern whereas the number
Figure 4.9: Estimation using integrals.

of points within each row l is simply given by $\lfloor n/l \rfloor$: the fraction $n/l$ is rounded
down to the next smaller integer value. Interchanging the sum over columns by a sum
over rows we obtain
$\frac{1}{n} \sum_{k=1}^{n} \tau(k) = \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor$
How does this number scale with n? To answer this question we will derive an
estimate on the sum above. Note first
$\frac{n}{l} - 1 \;\leq\; \left\lfloor \frac{n}{l} \right\rfloor \;\leq\; \frac{n}{l}$
Therefore we can bound the average as follows.

$\sum_{l=1}^{n} \frac{1}{l} - 1 \;\leq\; \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor \;\leq\; \sum_{l=1}^{n} \frac{1}{l}$
Thus we require bounds on the sum $\sum_{l=1}^{n} 1/l$. As seen in figure 4.9 we can
approximate with functions and integrate to obtain the area below the graph.
The upper bound is then
$\sum_{l=1}^{n} \frac{1}{l} = 1 + \sum_{l=2}^{n} \frac{1}{l} \;\leq\; 1 + \int_{1}^{n} \frac{1}{x}\,dx = 1 + \ln(n)$

where ln is the natural logarithm with the base e. Similarly we obtain the lower
bound
$\sum_{l=1}^{n} \frac{1}{l} \;\geq\; \int_{0}^{n} \frac{1}{x + 1}\,dx = \ln(n + 1) \;\geq\; \ln(n)$
Putting all this together, the average of the number of divisors for the numbers
between 1 and n can be estimated as

$\ln(n) - 1 \;\leq\; \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor \;\leq\; \ln(n) + 1$
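The double-counting identity and the logarithmic estimate are easy to confirm numerically. The sketch below (the divisor-counting helper is named tau here for illustration) computes the sum both column-wise and row-wise and compares the average with $\ln n$.

```python
from math import log

def tau(k):
    """Number of divisors of k (column-wise count)."""
    return sum(1 for l in range(1, k + 1) if k % l == 0)

n = 1000
column_sum = sum(tau(k) for k in range(1, n + 1))
row_sum = sum(n // l for l in range(1, n + 1))      # row-wise count via floor(n / l)
print(column_sum == row_sum)                        # True: double counting
print(round(column_sum / n, 3), round(log(n), 3))   # the average lies within 1 of ln(n)
```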

Figure 4.10: An n-set with r blue and $n - r$ red elements, of which t and $k - t$
elements are chosen.

4.4 Binomial coefficients: Properties and approx-


imations
As shown above the number of k-sets that can be chosen from an n-set is given
by the binomial coefficient

$\binom{n}{k} = \frac{n!}{(n-k)!\,k!} = \binom{n}{n-k}$

4.4.1 Symmetry
The binomial coefficient reflects the symmetry of the Pascal triangle as

$\binom{n}{k} = \binom{n}{n-k}$

This can also be seen from the formula above.

4.4.2 Vandermonde Identity


There are $\binom{n}{k}$ possibilities to choose k elements from a set containing n elements.
Imagine now that there are r blue and $n - r$ red elements in the set of n elements.
Then there are
$\binom{r}{t} \binom{n-r}{k-t}$
possibilities to choose k elements such that t of these are blue while $k - t$ are
red. If we add the numbers of k-sets from an n-set with $t = 0$ elements blue,
$t = 1$ elements blue, $t = 2$ elements blue, until we reach k blue elements, it is just
the same as not to bother about the colors and we are left with $\binom{n}{k}$. Thus we
obtain the equality
$\binom{n}{k} = \sum_{t=0}^{k} \binom{r}{t} \binom{n-r}{k-t}$
Note that $\binom{n}{k}$ is set to zero whenever $k > n$.
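The identity is easy to spot-check numerically. The sketch below compares both sides for a few parameter choices; it relies on Python's math.comb, which returns zero whenever k exceeds n, matching the convention above.

```python
from math import comb

for n, r, k in [(10, 4, 3), (12, 5, 7), (8, 8, 5), (6, 2, 6)]:
    lhs = comb(n, k)
    rhs = sum(comb(r, t) * comb(n - r, k - t) for t in range(k + 1))
    print(n, r, k, lhs, rhs, lhs == rhs)   # comb(r, t) is 0 whenever t > r
```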

4.4.3 Binomial theorem


For the sake of completeness we mention again the binomial theorem (see also
4.3).
$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$
If we set $x = y = 1$ we obtain the sum over one row in Pascal's triangle
$\sum_{k=0}^{n} \binom{n}{k} = 2^n$
This yields the number of all subsets of an n-set and thus the size of the power
set.
Another interesting special case is $x = -1$ and $y = 1$. This yields
$\sum_{k=0}^{n} \binom{n}{k} (-1)^k = 0$ (4.11)
That is, summing over any row with alternating sign yields zero. For rows with
n being odd the number of terms in the row is even. So 4.11 for n being odd
follows directly from the symmetry of the Pascal triangle. For any row with n
even it is not obvious though.
From this we can draw a conclusion for the number of strings with even and
with odd parity. The parity of a string is even if the number of ones in the string
is even and it is odd if the number of ones in the string is odd. So the left-hand
side of the equation 4.11 is just the number of even parity strings minus the
number of odd parity strings. Thus the number of odd parity strings and the
number of even parity strings are always equal. For strings with an odd number
of elements there is a bijection that shows the equal number of even and odd
parity strings. One simply flips all bits. So for instance
$010 \mapsto 101$
For strings with an even number of elements the fact is not that obvious but
follows from equation 4.11.

4.4.4 Approximation of the binomial coefficient


For large numbers n and k the binomial coefficient is rather hard to compute.
Therefore we would like to have an estimate. First of all we will derive upper
and lower bounds. To obtain a lower bound we write the binomial coefficient
with factorials and rearrange the factors as follows.

$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)(k-2)\cdots 1} = \frac{n}{k} \cdot \frac{n-1}{k-1} \cdot \frac{n-2}{k-2} \cdots \frac{n-k+1}{1}$

Figure 4.11: The graph of the binary entropy $h(x)$.

If $n \geq k$ then the factor $n/k$ is less or equal to $(n-1)/(k-1)$ as

$\frac{n}{k} \leq \frac{n-1}{k-1} \;\Leftrightarrow\; n(k-1) = nk - n \leq nk - k = (n-1)k \;\Leftrightarrow\; k \leq n$

Repeating this argument for the subsequent factors we obtain the lower bound

$\binom{n}{k} \geq \left( \frac{n}{k} \right)^k$
How much bigger can $\binom{n}{k}$ be than $(n/k)^k$? To answer this question we will now
derive an upper bound. Consider first the following ratio.

$\binom{n}{k} \Big/ \left( \frac{n}{k} \right)^k = \underbrace{\frac{n(n-1)(n-2)\cdots(n-k+1)}{n^k}}_{\leq\, 1} \cdot \underbrace{\frac{k^k}{k(k-1)(k-2)\cdots 1}}_{\leq\, e^k} \;\leq\; e^k$

We have employed once more the formula with factorials and arranged the
factors in a convenient way. To upper bound the second part we have used the
expansion of the exponential (4.7): one summand, $k^k/k!$, is less than the entire
series $e^k$. Summarizing this we obtain the bounds

$\left( \frac{n}{k} \right)^k \;\leq\; \binom{n}{k} \;\leq\; \left( \frac{n}{k} \right)^k e^k$ (4.12)

Stirling's formula provides a more precise estimate of the factorial

$n! \approx \sqrt{2\pi n} \left( \frac{n}{e} \right)^n$ (4.13)

and can thus be used to estimate the binomial coefficient. The factors of e
cancel as $e^k \cdot e^{n-k} / e^n = e^n / e^n = 1$. Further we will drop all square root factors

and obtain

$\binom{n}{k} = \frac{n!}{k!(n-k)!} \approx \frac{1}{\sqrt{2\pi}} \sqrt{\frac{n}{k(n-k)}} \cdot \frac{n^n}{k^k (n-k)^{n-k}} \approx \frac{n^n}{k^k (n-k)^{n-k}} = \frac{1}{\left(\frac{k}{n}\right)^k \left(\frac{n-k}{n}\right)^{n-k}}$

In the last two steps we merely reformulate the expression in a form that will
turn out to yield some insight later on.
Let's now define $x := k/n$. Thus $(n - k)/n = 1 - x$ and we can write the
estimate above as

$\binom{n}{k} \approx \left( x^x (1 - x)^{1-x} \right)^{-n} = 2^{n(-x \log_2 x - (1-x) \log_2 (1-x))}$

Further we introduce the function

$h(x) := -x \log_2 x - (1 - x) \log_2 (1 - x)$

The function is symmetric about $x = 1/2$ and becomes zero for both $x = 0$ and
$x = 1$, and one for $x = 1/2$. The graph of h is shown in figure 4.11. All this new
formalism yields the estimate

$\log_2 \binom{n}{k} \approx n \cdot h(x)$ (4.14)
k

Why should we bother to write the estimate in such a complicated way?
The function h plays an important role in information theory, more concretely
in results on data compression.
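The quality of the estimate 4.14 can be inspected numerically. The sketch below compares $\log_2 \binom{n}{k}$ with $n \cdot h(k/n)$ for a moderately large n; the relative error shrinks as n grows, since only square-root factors were dropped.

```python
from math import comb, log2

def h(x):
    """Binary entropy h(x) = -x log2 x - (1-x) log2 (1-x), with h(0) = h(1) = 0."""
    if x in (0, 1):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

n = 1000
for k in (100, 250, 500, 900):
    exact = log2(comb(n, k))
    estimate = n * h(k / n)
    print(k, round(exact, 1), round(estimate, 1))   # the estimate is off only by O(log n)
```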

4.5 An Excursion into information theory: Data


compression
How much can we compress data without losing information? Let's consider a
bit string generated by n coin flips. If the probability for heads and tails is 1/2
we cannot compress this string. But if the probability of obtaining zero is larger
than 1/2, i.e.
$x := P(0) > \frac{1}{2}$
then there is redundancy and we can compress to a smaller bit string of length
l, as depicted in Fig. 4.12. As we are dealing with a probabilistic source of n-bit
strings, we refine the question. What is the best compression, i.e. the smallest
l, on average, without loss of information?
To answer this question let us consider the set of all possible n-bit strings.
Which strings are likely to occur if they are produced with a source generating a
zero with probability $x > 1/2$ and a one with probability $1 - x$?

Figure 4.12: Scheme for compression of an n-bit string to an l-bit string.

Figure 4.13: The set of all strings of length $n$ with various subsets: the single string $0^n$, the strings with $n/2$ ones and $n/2$ zeros, and the strings with $x \cdot n$ zeros and $(1-x) \cdot n$ ones.

Figure 4.14: The graph of the function $f(x) = \sqrt{x(1-x)}$.

The single most likely string is then the zero string $0^n$ with a probability $x^n$.
If for instance $x = 0.9$ then $P(0^n)$ is $0.81$ for $n = 2$ and $0.729$ for $n = 3$. For
$n \to \infty$ this probability tends to zero.
Any string with equally many zeros and ones occurs with a probability
$x^{n/2}(1-x)^{n/2}$. As there are $\binom{n}{n/2}$ of these strings, the probability to draw any such string is
$$\binom{n}{n/2}\, x^{n/2}(1-x)^{n/2} \approx 2^{n h(1/2)}\, x^{n/2}(1-x)^{n/2} = 2^n\, x^{n/2}(1-x)^{n/2} = \left(2\sqrt{x(1-x)}\right)^n \to 0 \text{ as } n \to \infty$$
Here we have used that $\sqrt{x(1-x)} < 1/2$ for any $x \ne 1/2$ as can be seen from
the graph shown in Fig. 4.14. This is related to the geometric fact that the
rectangle with the largest area for a given circumference is a square.
This is to say that for long strings, i.e. large $n$, the probability to draw a
string with equally many zeros and ones is very small.
Which strings are instead drawn with high probability? Intuitively we'd
expect the strings with $n \cdot x$ zeros and $n \cdot (1-x)$ ones to be rather likely. The
probability to draw any such string is in fact
$$P[\#\text{zeros} = x \cdot n] = \binom{n}{xn}\, x^{xn}\,(1-x)^{n(1-x)} \approx 2^{n h(x)} \cdot 2^{n\left(x\log_2 x + (1-x)\log_2(1-x)\right)} = 2^{n h(x)} \cdot 2^{-n h(x)} = 1$$
Therefore a lossless compression protocol merely encodes the roughly $\binom{n}{xn} \approx 2^{n h(x)}$ strings with $n \cdot x$ zeros and $(1-x) \cdot n$ ones using $l \approx n\,h(x)$ bits. So the
binary entropy $h(x)$ characterizes the ultimate data compression rate or the
relative information content of a source.

4.6 Special counting problems
4.6.1 Equivalence relations
Recall from set theory that equivalence relations yield a partition of a set. The
question we are now interested in is: How many equivalence relations are there
for a given set with $n$ elements? Let's first consider some simple cases with
small $n$.
$n = 1$: There is just one partition and therefore merely one equivalence
relation.
$n = 2$: There are two partitions, the one containing both elements
and the one with each element contained in its own set.
$n = 3$: There are 5 partitions: one containing all elements, one with each
element in its own set, and three with two elements contained in the
same set.

Instead of trying to directly find a general answer to the question above, we'll
first consider a slightly simpler question: How many partitions into exactly
$k$ sets are there? We denote this number by $S_{n,k}$. One can now deduce a
recursive formula similar to the binomial coefficient. To do so, we separate the
$n$-th element and distinguish the following two cases.
1. The $n$-th element is in its own set. Then we have to find a $(k-1)$-partition
of the remaining $n-1$ elements.
2. The $n$-th element is an element of a set containing also other elements.
Imagine the other $n-1$ elements are partitioned into $k$ sets. Then we
could add the $n$-th element to any of those $k$ sets.
These considerations yield the following recursive formula
$$S_{n,k} = S_{n-1,k-1} + k \cdot S_{n-1,k} \qquad (4.15)$$

This relation resembles the recursion relation for the binomial coefficient. There
is merely the additional factor k. Indeed, after considering the base cases

$$S_{0,0} = 1 \qquad S_{n,0} = 0 \;\;\text{for } n > 0 \qquad S_{n,n} = 1$$

n = 0:   1
n = 1:   0   1
n = 2:   0   1   1
n = 3:   0   1   3   1
n = 4:   0   1   7   6   1
n = 5:   0   1   15   25   10   1

Figure 4.15: Stirling's triangle of the second kind (row $n$, columns $k = 0, 1, 2, \ldots$).

we can build up a similar triangle, called Stirling's triangle of the second kind.
The number of all partitions is then given by the sum over row $n$ in the
triangle:
$$B_n := \sum_{k=0}^{n} S_{n,k}$$
There is no closed formula for this number. So to calculate the number, one has
to actually build up the triangle before calculating the sum over the row.
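Both the recursion (4.15) and the row sums are straightforward to evaluate mechanically. The following Python sketch (added for illustration; the function names are our own choice) does exactly that.

from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number S(n,k) of partitions of an n-element set into exactly k non-empty sets,
    via the recursion S(n,k) = S(n-1,k-1) + k*S(n-1,k)  (equation 4.15)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def bell(n):
    """Bell number B_n: total number of partitions (sum over row n of the triangle)."""
    return sum(stirling2(n, k) for k in range(n + 1))

print([stirling2(5, k) for k in range(6)])  # row n = 5: [0, 1, 15, 25, 10, 1]
print([bell(n) for n in range(6)])          # 1, 1, 2, 5, 15, 52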

4.6.2 Permutations
A permutation is a bijective map
$$\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}$$
A common notation for permutations is
$$\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi(1) & \pi(2) & \cdots & \pi(n) \end{pmatrix}$$
Example 4.6.1. The following is a permutation of 5 elements
$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix}$$
The permutation has a fixpoint, $\pi(3) = 3$, and two cycles of length 2:
$$1 \mapsto 5 \mapsto 1 \qquad 2 \mapsto 4 \mapsto 2$$
Therefore applying the permutation twice yields the identity, $\pi^2 = \mathrm{id}$. In other
words the permutation is self-inverse, $\pi = \pi^{-1}$.
Example 4.6.2. Let us consider a permutation with a more complicated cycle
structure.
$$\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 5 & 9 & 8 & 10 & 7 & 6 & 1 & 3 & 4 & 2 \end{pmatrix}$$
There are the following cycles:
a fixpoint at 6,
one cycle of length 2: $3 \mapsto 8 \mapsto 3$,
one cycle of length 3: $1 \mapsto 5 \mapsto 7 \mapsto 1$,
one cycle of length 4: $2 \mapsto 9 \mapsto 4 \mapsto 10 \mapsto 2$.
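Finding the cycle decomposition of a given permutation is a purely mechanical task; the following Python sketch (added for illustration) recovers the cycles of the permutation from Example 4.6.2.

def cycles(perm):
    """Cycle decomposition of a permutation given as a dict i -> perm(i)."""
    seen, result = set(), []
    for start in perm:
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(cycle)
    return result

# The permutation from Example 4.6.2.
pi = {1: 5, 2: 9, 3: 8, 4: 10, 5: 7, 6: 6, 7: 1, 8: 3, 9: 4, 10: 2}
print(cycles(pi))  # [[1, 5, 7], [2, 9, 4, 10], [3, 8], [6]]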

The counting problem we'll consider is: How many permutations of $n$ elements with
exactly $k$ cycles are there? As before we aim for a recursion relation. Abusing
slightly the notation, we denote the number of permutations of $n$ elements with
$k$ cycles by $S_{n,k}$. One can then separate the $n$-th element and distinguish the
following two cases.
1. $n$ is a fixpoint and therefore adds a cycle by itself. It remains to take into
consideration the permutations of $n-1$ elements with $k-1$ cycles.
2. We can fit $n$ into existing cycles. So for permutations of $n-1$ elements
with $k$ cycles there are $n-1$ positions where we could put $n$, essentially
after each of the existing elements, independent of which cycle they are in.
Therefore we obtain the following recursive formula
$$S_{n,k} = S_{n-1,k-1} + (n-1) \cdot S_{n-1,k} \qquad (4.16)$$

As there is no permutation of $n$ elements without any cycles and merely one
with $n$ cycles we obtain the base cases
$$S_{0,0} = 1 \qquad S_{n,0} = 0 \;\;\text{for } n > 0 \qquad S_{n,n} = 1$$
This yields another Stirling triangle.
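The recursion (4.16) can be evaluated in the same way as (4.15); the following Python sketch (again an addition for illustration) reproduces row n = 5 of the triangle below.

from functools import lru_cache

@lru_cache(maxsize=None)
def stirling1(n, k):
    """Number of permutations of n elements with exactly k cycles,
    via the recursion S(n,k) = S(n-1,k-1) + (n-1)*S(n-1,k)  (equation 4.16)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return stirling1(n - 1, k - 1) + (n - 1) * stirling1(n - 1, k)

print([stirling1(5, k) for k in range(6)])  # row n = 5: [0, 24, 50, 35, 10, 1]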

n = 0:   1
n = 1:   0   1
n = 2:   0   1   1
n = 3:   0   2   3   1
n = 4:   0   6   11   6   1
n = 5:   0   24   50   35   10   1

Figure 4.16: Stirling's triangle of the first kind (row $n$, columns $k = 0, 1, 2, \ldots$).

Chapter 5

Graph Theory

5.1 Motivation
Many discrete problems can be modeled as graphs. This is useful as there is a
well-developed theory on graphs.
Example 5.1.1 (Error-free communication over noisy channels). A channel is an
abstract model of a device used to transmit some information, as for instance
a wire, an optical fibre or a radio signal. As shown in Fig. 5.1 a channel
can be depicted by a graph, where the input is connected with each of the
possible outputs. Generally errors might occur while transmitting the signal.
The channel is noisy. Therefore a given input can potentially yield different
outputs in different uses of the channel.¹ Nodes representing the input in the
graph might be connected to more than one element in the output alphabet Y.
As outputs can stem from different inputs, the receiver cannot tell which input
the sender chose. Is there a way to transmit information without error through
such a noisy channel? To answer the question we will first consider a different
graph representation.
An ambiguity graph shows the conflicting inputs in a noisy channel. The
1 In a more complete model one would assign a probability distribution over the outputs

for each input. Typically the right output would occur with rather high probability. If the
channel is noisy there are other outputs that occur with probability larger than zero and cause
the ambiguity for the receiver.

Figure 5.1: A graph for a noisy channel. An input $x \in \mathcal{X}$ can be mapped to more than one output $y \in \mathcal{Y}$.
Figure 5.2: The ambiguity graph of a noisy channel. The red markers show an
optimal coding.

nodes correspond to the inputs in X . Two nodes are connected if the corre-
sponding inputs might yield the same output.
In order to resolve the ambiguity of the inputs, the sender and the receiver
agree on a coding. The sender will use merely a subset of the input alphabet X
without conflicts in the output. This subset has to be chosen such that none
of the elements are neighbors in the ambiguity graph. The optimal code is
the biggest such set. In the graph 5.2 the codewords of the optimal code are
highlighted in red.

The noisy typewriter We will now consider an explicit example of a noisy
channel. Claude Shannon, in his paper from 1948 laying the basis for information
theory, considered a noisy typewriter. For simplicity we consider a reduced
alphabet of 5 instead of 26 letters. Through some error the following letter is
also hit every now and then. This error happens cyclically. So when typing the last
letter, sometimes the first one is hit. We obtain the following graphs.


The corresponding optimal code has then 2 codewords and can therefore
transmit 1 bit with each use of the channel. It seems therefore that the zero
error capacity, the number of bits that can be sent unambiguously per use of
the channel, is 1. But one could achieve the same transmission rate if there
were just 4 letters. So the question is whether this is really optimal.
The zero error capacity is in fact an average over multiple uses of the channel.
Let us therefore consider the case of using the channel twice. The new alphabet
is then the Cartesian product X 2 with 25 elements as shown in Fig. 5.3. We
can now search for unambiguous codewords among these 25 ordered pairs. The
optimal code indicated by the red circles allows then for 5 distinct codewords to
be transmitted without error.

Figure 5.3: The ambiguity graph for the case of using the channel twice. Instead
of showing all edges, the blue and the red lines give two examples of
ambiguous pairs. The red circles indicate an optimal code for this new channel.

Therefore the zero error capacity of the channel
in bits is
$$\frac{\log_2 5}{2}$$
and is therefore larger than 1. It turns out that using the channel even more
than twice does not improve this result (as shown by Lovász).

5.2 Basic notions


Definition 5.1 (Graph). A graph $G = (V, E)$ consists of
a non-empty, finite vertex set $V$, i.e.
$$0 < |V| < \infty$$
The elements of $V$ are called vertices or nodes.
an edge set $E \subseteq V \times V$.
Definition 5.2 (Undirected graph). A graph $G$ is undirected if
$$(u, v) \in E \iff (v, u) \in E$$
If a graph is known to be undirected we usually assume $E$ to contain merely
one of the two, i.e. either $(u, v)$ or $(v, u)$.
Definition 5.3 (Simple graph). A graph G is called simple if it does neither
contain loops, nor multiple edges.

Definition 5.4 (Neighborhood of a vertex). For a vertex $v \in V$ we call the set
$$\Gamma(v) := \{ w \in V \mid (v, w) \in E \}$$
the neighborhood of $v$.

Definition 5.5. The number of edges towards a node $v$ is called the in-degree
$\deg^-(v)$.
The number of edges from a node $v$ is called the out-degree $\deg^+(v)$.
For undirected graphs the number of edges ending in a node $v$ is called the degree
$\deg(v)$.
In a directed graph, any ingoing edge of a node is an outgoing edge of another
node. Further any edge is an ingoing (respectively outgoing) edge for some node.
Thus we obtain for directed graphs the following equality.
$$\sum_{v \in V} \deg^-(v) = \sum_{v \in V} \deg^+(v) = |E| \qquad (5.1)$$
Similarly it holds for undirected graphs
$$\sum_{v \in V} \deg(v) = 2\,|E| \qquad (5.2)$$

5.2.1 Basic notions for simple undirected graphs

Definition 5.6 (Way). A $v_1$-$v_l$-way of length $l$ is a sequence of vertices $w = (v_1, \ldots, v_l)$ with
$$(v_i, v_{i+1}) \in E \quad \forall i \in \{1, \ldots, l-1\}$$

Definition 5.7 (Path). A path is a way such that all vertices are distinct.
That is, there cannot be loops within a path.

Definition 5.8 (Circuit or cycle). A circuit or cycle is a sequence of pair-wise
distinct vertices $c = (v_1, \ldots, v_l)$ satisfying
$$(v_i, v_{i+1}) \in E \quad \forall i = 1, \ldots, l-1 \qquad \text{and} \qquad (v_l, v_1) \in E$$
It is therefore a path with the last node being connected to the first by an edge.

Definition 5.9 (Subgraph). The pair $H = (V', E')$ is a subgraph of $G = (V, E)$
if
$$V' \subseteq V \qquad E' \subseteq E \qquad E' \subseteq V' \times V'$$
Given a vertex set $V' \subseteq V$ a subgraph $H = (V', E')$ is called induced by $V'$ if
$$\forall u, v \in V' : (u, v) \in E \iff (u, v) \in E'$$
The subgraph induced by $V'$ is denoted $H =: G[V']$.

Definition 5.10 (Connected components). Let $(V_i)_{i \in I}$ be a partition of the
vertex set of a graph $G$ such that all elements within each part $V_i$ are
connected by a path. That is
$$\exists\,(u,v)\text{-path} \iff \exists\, i \in I : u, v \in V_i$$
Then the induced subgraphs $G[V_i]$ are called the connected components of $G$.

Definition 5.11 (Bridge). An edge $e \in E$ is called a bridge if removing it yields
one more connected component. That is, the graph $G' := (V, E \setminus \{e\})$ has one
connected component more than $G$.
Theorem 5.2.1. A graph $G = (V, E)$ has at least
$$|V| - |E|$$
connected components.
Proof. To show the theorem above we give an inductive argument. The graph
G pV, Hq has |V | connected components, each containing just one vertex. Any
edge that is inserted into the graph reduces the number of connected components
by at most one. If it connects two previously separate connected components it
reduces the number by one. If the edge connects two vertices within a connected
component the number of connected components remains the same.
Corollary 5.2.2. If a graph $G$ is connected, i.e. all vertices are connected
with one another by paths and there is just one connected component, then the
number of edges and vertices is related by
$$|V| - |E| \le 1$$
The following graph is an example of a connected graph.

The number of vertices is $|V| = 5$ while the number of edges is $|E| = 10$.
Therefore the difference is well below the bound from the corollary above, as
$$|V| - |E| = -5 \le 1$$

Indeed we could remove all internal edges and one of the outer ones and still
obtain a connected graph.

In this minimally connected graph the difference of the number of edges and
the number of vertices reaches the bound from the corollary above. Further all
edges are bridges. These minimally connected graphs are called trees.

5.3 Trees
As mentioned above a tree is a minimally connected graph. A graph whose
connected components are trees is then called a forest, as formally defined below. The following
graph is a forest with 3 trees.

The following graph is not a tree, as it contains a cycle.

Definition 5.12 (Forest). An undirected simple graph without cycles is called


a forest.
Definition 5.13 (Tree). A connected forest is called a tree.
Definition 5.14 (Leaf). A node $v \in V$ with $\deg(v) = 1$ is called a leaf.
Theorem 5.3.1. Every tree with at least two vertices has at least 2 leaves.
Proof. As there are at least two vertices, and the tree is connected, there is
at least one edge $e \in E$ connecting two vertices $u, v \in V$. From both these
vertices we can walk in opposite directions away from one another. As long as
$\deg(v_i) > 1$ for the nodes along the way, one can choose another edge for the
next step.

As there are no cycles and merely finitely many vertices, one will eventually
reach a leaf.

After having introduced the concept of a tree we now turn to a number of
equivalent characterizations of trees.

Theorem 5.3.2. Given an undirected simple graph $G = (V, E)$ the following
statements are equivalent.
1. $G$ is a tree, i.e. a connected graph without cycles.
2. $G$ is connected and $|V| = |E| + 1$.
3. $G$ has no cycles and $|V| = |E| + 1$.
4. $G$ is connected and every edge is a bridge.
5. $G$ has no cycles and any further edge, added to $G$, creates a cycle.
6. For all vertices $v, u \in V$ there exists a unique $u$-$v$-path.
These are $2 \cdot \binom{6}{2} = 30$ implications. It is however sufficient to show a loop of
implications. Any statement then follows from any other by just following the
implications in the loop.
Proof. We will show the implications in the following cycle.
$$(1) \Rightarrow (6) \Rightarrow (4) \Rightarrow (2) \Rightarrow (3) \Rightarrow (5) \Rightarrow (1)$$

(1) ⇒ (6): By the definition of connectedness there exists a $u$-$v$-path for all
pairs of vertices $u, v \in V$. It remains to show that this path is unique. We will
show this by contradiction. More precisely we will show that if the path is not
unique we can construct a cycle in the graph.
Let's assume that there are two different paths connecting the vertices $u$ and $v$.
Where the two paths split, we can start constructing a cycle, taking one path
until we reach the vertex where they rejoin and returning on the other path.

(6) ⇒ (4): As there exists a path for any pair $u, v \in V$ we obtain immediately
that $G$ is connected. It remains to show that every edge is a bridge. Again we
will show this by contradiction. Assume there exists an edge $e \in E$ that is not
a bridge. Then we can remove $e$ and $G$ is still connected, i.e. there is a path
connecting the vertices adjacent to $e$. In other words, with $e$ there are two paths
connecting these two vertices. This is a contradiction with the uniqueness of
the path.

(4) ⇒ (2): As connectedness is already given, it remains to show that $|V| = |E| + 1$
if every edge is a bridge. We will show this by induction over the number
of edges.
Base case: If $|E| = 1$ then there are two vertices and the equation is satisfied.
Induction step: As all edges are bridges, we can simply split the graph into two connected
components $G_1$ and $G_2$ by removing any of the edges.
Then we are left with two connected graphs. By induction hypothesis we
have
$$|E_1| + 1 = |V_1| \qquad |E_2| + 1 = |V_2|$$
Therefore we obtain the equality we were looking for by adding these two
equations and using $|E_1| + |E_2| = |E| - 1$:
$$\underbrace{|E_1| + |E_2|}_{= |E| - 1} + 2 = |V_1| + |V_2| = |V|$$

(2) ⇒ (3): We show by contradiction that a connected graph $G$ satisfying the
equality $|V| = |E| + 1$ has no cycles. Let's assume that $G$ has a cycle. Then
there exists an edge $e$ that is not a bridge. We can therefore remove that edge to
obtain another connected graph with one edge less, $G' = (V, E \setminus \{e\})$. For this
connected graph it holds by Corollary 5.2.2
$$|V| \le (|E| - 1) + 1 = |E| = |V| - 1$$
which yields a contradiction.

(3) ⇒ (5): We have to show that adding an edge $e'$ to the graph creates a
cycle. Thus we have from (3) that $|V| = |E| + 1 = |E'|$ for $E' := E \cup \{e'\}$. We
show by induction over the number of vertices that whenever $|V| \le |E'|$ then
there is a cycle.
Base case: The first interesting case is $|V| = 3$. If the number of edges is also 3, we
obtain the triangle, which obviously contains a cycle.
Induction step: We will distinguish two cases. If there exists a leaf in the graph, we
merely remove that vertex and the corresponding edge. This reduces the
number of vertices and the number of edges by one. If $|V| \le |E|$ then also
$|V| - 1 \le |E| - 1$ and the graph contains a cycle by induction hypothesis.
So also the graph containing the leaf has a cycle.
If there is no leaf, then all vertices have $\deg(v) \ge 2$. Then we can simply
construct a cycle by going from one vertex to the next. As there is no leaf,
there is no dead end. And as the number of vertices is finite, we have to
end up at the initial vertex sooner or later.

(5) ⇒ (1): The graph $G$ does not contain any cycles. So it remains to show that
$G$ is connected. Again we will use an indirect proof. Let's assume that $G$ has two
connected components. Then we could add an edge $e'$ to $G$ without creating a
cycle by linking the two connected components. This yields a contradiction
with (5), as this insertion of an edge did not create a cycle.

The number of trees After having derived different characterizations of a
tree we now turn to the following question: How many trees are there with $n$
vertices?
To answer this question we will distinguish marked and unmarked trees. Marked
trees with the same shape but different labelings of the vertices are not equal,
while their corresponding unmarked trees are equal.

Let's first consider the number of unmarked trees with small numbers of
vertices.
Example 5.3.1. The number of different unmarked trees for $n = 1, \ldots, 5$:
n = 1:  1 tree
n = 2:  1 tree
n = 3:  1 tree
n = 4:  2 trees
n = 5:  3 trees

As we will see below the number of marked trees with n vertices is the same
as the number of so-called spanning trees of the completely connected graph Kn
with n vertices. A spanning tree is formally defined as follows.
Definition 5.15 (Spanning tree). Given a connected (undirected, simple) graph
$G = (V, E)$, a graph $H = (V, E')$ with the same vertex set is called a spanning
tree of $G$ if
$H$ is a tree, and
$E' \subseteq E$.
Example 5.3.2. The connected graph

has the spanning tree

Generally there exist multiple spanning trees for a given graph.
Spanning trees occur in algorithms. Usually the edges are assigned a price
or weight. Then one is interested in finding the spanning tree with the minimal
total weight (price).
Let's now return to the question above: How many spanning trees are there
for complete graphs $K_n$ with $n$ vertices? Let's first consider some cases with
small $n$.
Example 5.3.3. In a completely connected graph any vertex is connected with
any other. We will now consider the spanning trees for n 1, 2, 3, 4.
K1 : There is merely one vertex and thus only one spanning tree.

K2 : There is merely one connected graph with two nodes. The single
spanning tree is just the graph itself.
K3: The completely connected graph $K_3$ has three spanning trees.
K4: The completely connected graph $K_4$ has 16 spanning trees.

The first 12 of these are actually the same if we consider
them as unmarked graphs. By re-arranging the vertices of a given tree
among these 12 we can obtain any other of the 12.
Such a re-arrangement preserves neighbor relations: any pair in the edge set,
$(u, v) \in E$, is also in the edge set $E'$ of the graph obtained after the re-arrangement.
Functions with this property are called isometries and are
formally defined below.
Definition 5.16 (Isometry). Two graphs $G = (V, E)$ and $G' = (V', E')$ are
isometric, $G \simeq G'$, if there exists a bijective function $f : V \to V'$ such that
$$\forall u, v \in V : (u, v) \in E \iff (f(u), f(v)) \in E'$$
The bijective function $f$ that preserves the edge relations is called an isometry.
As the number of spanning trees of completely connected graphs grows pretty
fast, it is rather tedious to write down all of them for $K_5$. We can now take
a different approach employing isometries. First we consider the different unmarked trees with 5 vertices. Then we derive the number of isometric marked
(spanning) trees for each of them.
Example 5.3.4. We will now consider the unmarked trees with 5 vertices and
how many marked graphs are isometric to each of them.
The path graph
0 - 1 - 2 - 3 - 4
corresponds for instance to the marked graph with the vertices in increasing
order. For any permutation of the 5 vertices we obtain another
isometric graph. For the permutation that merely reverses the order, we
obtain the same graph as the edges are undirected. Therefore there are
$5!/2 = 60$ marked graphs corresponding to this unmarked graph.
The so-called star graph, one center connected to four leaves,
corresponds to 5 different marked graphs. Swapping the leaves does not
change the graph; merely changing the center of the star does. So for each
of the 5 vertices being the center we obtain a different graph.
Finally we consider the remaining unmarked tree with 5 vertices.
Again the permutations of the vertices give the isometries. But swapping
the two leaves attached to the same vertex does not change the graph.
We therefore have a similar symmetry as above and have to divide again
the number of permutations $5!$ by 2 to obtain 60 graphs.
So in total we are left with $125 = 5^3$ different spanning trees (or different marked
trees with 5 vertices).
Considering the examples above we might guess that in general the number
of spanning trees of completely connected graphs $K_n$ is $n^{n-2}$. Indeed this is
true.
Theorem 5.3.3 (Cayley). The number of different marked trees with $n$ nodes
(i.e. the number of spanning trees of the completely connected graph $K_n$) is
$n^{n-2}$.
Proof. Instead of counting marked trees we will count marked trees with two
additional markers, a green circle and a red square.
These two markers can be placed on any node in the graph, also both on the
same. Therefore there are $n^2$ different ways to place the two markers.
We will now show that there exist $n^n$ different marked trees with $n$ vertices
and two additional markers. Then there are
$$\frac{n^n}{n^2} = n^{n-2}$$
marked trees without the markers.
To derive the number of trees with markers we now construct a bijection
from these trees to the functions from a set of $n$ elements to another set of $n$
elements,
$$\{ f : \{1, \ldots, n\} \to \{1, \ldots, n\} \}$$
Note that we ask for neither injectivity nor surjectivity. Thus any of the $n$
elements in the first set can be mapped to any of the $n$ elements in the second.
Therefore there are $n^n$ such functions. Even though these are not necessarily
permutations we will adopt a similar notation. Let's consider the following
example.
$$f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 9 & 8 & 3 & 9 & 3 & 2 & 1 & 6 & 4 & 1 \end{pmatrix}$$
We will now represent this function by a directed graph.


In every component there has to exist a cycle, as there is only a finite number
of nodes and there is no end point if one applies the map over and over again
(i.e. follows the arrows). Restricting to the vertices that are contained in cycles
yields a permutation, i.e. a bijective map. The set of vertices contained in some
cycle is in the example above
$$M = \{2, 3, 4, 6, 8, 9\}$$
and the corresponding permutation is the restriction
$$f|_M = \begin{pmatrix} 2 & 3 & 4 & 6 & 8 & 9 \\ 8 & 3 & 9 & 2 & 6 & 4 \end{pmatrix}$$

We will now encode this permutation in a simple tree, a path through the vertices of $M$:
8 - 3 - 9 - 2 - 6 - 4
This is usually compared to an animal with the green circle being the head, the
red square the tail and the encoding of the permutation the spinal cord. We
can now add the remaining elements to the tree as follows.

5
8 9 2 6 4

3
1

7 10

The edges are to be read as arrows towards the spine. This procedure yields a
tree with head and tail for any function $f : \{1, \ldots, n\} \to \{1, \ldots, n\}$. It remains
to show that we have an inverse map from trees to functions.

Inverse direction Let's construct a function for the following tree with head
and tail.
From the spine we directly obtain the permutation, as the first row is given by
convention by the numbers in increasing order.
$$f|_M = \begin{pmatrix} 1 & 3 & 7 & 10 \\ 10 & 7 & 3 & 1 \end{pmatrix}$$
Reading the other edges as arrows towards the spine we obtain the entire map.
$$f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 10 & 1 & 7 & 2 & 2 & 3 & 3 & 7 & 7 & 1 \end{pmatrix}$$
This procedure works similarly for any other tree. Thus the proof is complete.
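As a sanity check of Cayley's formula (this check is an addition and not part of the proof), one can enumerate, for small n, all subsets of n - 1 edges of K_n and count those that form a tree; the following Python sketch reproduces the counts 1, 3, 16, 125.

from itertools import combinations

def is_tree(n, edges):
    """A graph on n vertices with n-1 edges is a tree iff it is connected (union-find check)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

def count_labeled_trees(n):
    all_edges = list(combinations(range(n), 2))
    return sum(1 for subset in combinations(all_edges, n - 1) if is_tree(n, subset))

for n in range(2, 6):
    print(n, count_labeled_trees(n), n ** (n - 2))  # both columns agree: 1, 3, 16, 125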

5.4 Some special graphs


Complete graphs or cliques are simple, undirected graphs with edges be-
tween any pair of vertices.

K1 K2 K3 K4 K5

For a complete graph with $n$ vertices, denoted $K_n$, there are then
$$\frac{n(n-1)}{2} = \binom{n}{2}$$
edges.

Cycles (also circles or circuits) are graphs with all nodes contained in one
single cycle without additional edges. As we only consider simple graphs here,
we are merely interested in cycles with 3 or more vertices.

C3 C4 C5

The cycle Cn with n vertices has n edges.

Mesh graphs are graphs with a node set
$$V = \{ (i, j) \mid 1 \le i \le m,\; 1 \le j \le n \}$$
The edge set contains merely the closest neighbors of each vertex,
$$((i_1, j_1), (i_2, j_2)) \in E \iff |i_1 - i_2| + |j_1 - j_2| = 1$$
and no diagonals.
Sometimes the mesh is interpreted in a cyclic way. Depending on how we connect
the boundaries we obtain graphs that resemble different topologies. For instance
the nodes on the very right can be connected to the ones on the very left as
follows.
This graph has the shape of a cylinder. One might not only connect the nodes
horizontally but also vertically and obtain a torus-shaped graph.
Another variant would be to invert the order to obtain a Moebius strip.

Complete bipartite graphs Instead of connecting all vertices in a graph
with one another one might form two groups (indicated by colors of the vertices)
and connect each member of one group with all members of the other. This
graph is called complete bipartite graph and denoted by $K_{m,n}$.

Hypercubes The vertices of the d-dimensional hypercube are the d-bit strings,
i.e.
$$V = \{0, 1\}^d = \{\text{d-bit strings}\}$$
Therefore there are $2^d$ vertices. The edges are then given by the pairs with
minimal Hamming distance, which is defined as follows.
Definition 5.17 (Hamming distance). The Hamming distance between two
bit strings of the same length, $d_H(x, y)$ with $x, y \in \{0, 1\}^d$, is the number of bits in
which $x$ and $y$ differ.
So for instance we obtain the Hamming distance $d_H(0110, 1010) = 2$ and
$d_H(0101, 1010) = d_H(0000, 1111) = 4$. Then the edges of the d-dimensional
hypercube are defined by
$$(u, v) \in E :\iff d_H(u, v) = 1$$
Therefore every vertex has $d$ neighbors, i.e. $\deg(v) = d$ for all $v \in V$. The total
number of edges is
$$|E| = \frac{d \cdot 2^d}{2} = d \cdot 2^{d-1}$$
as each of the $2^d$ vertices is connected to $d$ edges but then every edge is connected
to two vertices. We can now iteratively draw the graph. To obtain the graph
$Q_{d+1}$ we draw the graph $Q_d$ twice. To one of the two we add a 0 after all bit
strings labelling the vertices, to the second analogously a 1. Finally we add edges
to connect the corresponding vertices of the two copies of $Q_d$. We start from
the graph $Q_0$ containing merely the empty word.
Then the hypercube $Q_1$ is constructed following the procedure above.
Repeating the same process yields $Q_d$ for larger $d$. For $Q_2$ we get a square with
the vertices 00, 10, 11, 01. This corresponds directly to the Hasse diagram of the
divisors $1, 2, 3, 6$ of $6$. $Q_3$ is the usual
three-dimensional cube with the vertices $000, \ldots, 111$.
The Hasse diagram for the divisors of $30 = 2 \cdot 3 \cdot 5$ yields the same graph. Similarly we obtain
the 4-dimensional hypercube.
This is just the Hasse diagram for the divisors of $210 = 2 \cdot 3 \cdot 5 \cdot 7$.
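The definition via the Hamming distance translates directly into a construction of Q_d; the following Python sketch (added for illustration) builds the vertex and edge sets and confirms the edge count d·2^{d-1}.

from itertools import product

def hypercube(d):
    """Vertices and edges of the d-dimensional hypercube Q_d:
    vertices are d-bit strings, edges connect strings at Hamming distance 1."""
    vertices = [''.join(bits) for bits in product('01', repeat=d)]
    hamming = lambda x, y: sum(a != b for a, b in zip(x, y))
    edges = [(u, v) for i, u in enumerate(vertices)
                    for v in vertices[i + 1:] if hamming(u, v) == 1]
    return vertices, edges

V, E = hypercube(3)
print(len(V), len(E))  # 8 vertices and 3 * 2^2 = 12 edges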

5.5 Euler tours and Hamilton cycles


5.5.1 Seven bridges of Königsberg
In 1736 Leonhard Euler considered the question whether there is a tour crossing
every one of the seven bridges in Königsberg
exactly once, thereby laying the foundations of graph theory. We can translate
the setup with the bridges to a graph.
Then we are looking for a cycle that passes each edge exactly once, a so-called
Euler tour.
Definition 5.18 (Euler tour). An Euler tour is a closed sequence of edges of a
graph $G = (V, E)$ that contains each edge of the graph exactly once.
As the degrees of all vertices in the graph above are odd, there
doesn't exist an Euler tour.

Theorem 5.5.1 (Euler). A connected graph has an Euler tour if and only if
all degrees are even. That is,
$$G \text{ has an Euler tour} \iff \deg(v) \text{ is even for all } v \in V$$
Proof. We will first show that the condition
$$\forall v \in V : \deg(v) \text{ is even}$$
is necessary for the existence of an Euler tour. This follows from the observation
that any vertex is reached and left the same number of times. Whenever one
reaches a vertex via some edge, one has to leave it via another edge to continue
the tour. Thus the number of edges at each vertex has to be even if finally all
edges are to be passed. Note that the same holds for the starting vertex.
It remains to show that the condition is sufficient. We will now construct an
Euler tour for a graph satisfying the condition above. First one chooses an initial
vertex $v_1$. This might be any vertex in $V$. Following an unused edge one now
proceeds to other vertices. As there is an even number of edges ending at each
of the vertices one always finds such an unused edge to continue, except after
returning to $v_1$. As the number of edges and the number of vertices is finite one
will eventually end up in $v_1$ and thus obtain a first cycle with each edge in the
cycle used merely once.
This cycle might however not contain all edges. In that case, as the graph is
connected, there must be a vertex $v_2$ in the cycle above with (at least 2) unused
edges. We can therefore use this vertex $v_2$ as the starting point of a new cycle,
using the same procedure of following the unused edges, and splice the new cycle
into the first one.
Repeating this iteratively until we used all edges yields the desired Euler tour.

Example 5.5.1. All vertices in the complete graph K5 have degree 4. Therefore
K5 has an Euler tour that can be constructed following the procedure in the
proof above, starting from the vertex on the top. First we follow the outer circle
and then the inner edges as indicated by the color gradient.

Example 5.5.2. Some of the special graphs considered above contain Euler tours
under some conditions.
The complete graphs $K_n$ have an Euler tour if and only if $n$ is odd, as
$$\deg(v) = n - 1 \quad \forall v \in V$$
The hypercubes $Q_d$ have an Euler tour if and only if $d$ is even, as
$$\deg(v) = d \quad \forall v \in V$$
Note that the proof above directly yields a simple greedy algorithm², that
finds an Euler tour in time linear in the number of edges $|E|$; a sketch is given
below. We will then turn to Hamilton cycles.
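The following Python sketch (an illustrative addition; the stack-based formulation is a standard implementation choice, not taken from the lecture) follows exactly the idea of the proof: walk along unused edges until stuck, then splice in further cycles.

from collections import defaultdict

def euler_tour(edges):
    """Return a closed Euler tour of a connected graph with all degrees even.
    `edges` is a list of undirected edges (u, v)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = edges[0][0]
    stack, tour = [start], []
    while stack:
        v = stack[-1]
        if adj[v]:
            w = adj[v].pop()
            adj[w].remove(v)          # use the edge (v, w) exactly once
            stack.append(w)
        else:
            tour.append(stack.pop())  # backtrack: v has no unused edges left
    return tour[::-1]

# K5 has all degrees equal to 4, hence an Euler tour (cf. Example 5.5.1).
k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
print(euler_tour(k5))  # a closed walk using each of the 10 edges exactly once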
Definition 5.19 (Hamilton cycle). A cycle that visits every node exactly once,
is called a Hamilton cycle. A graph containing such a cycle is called hamiltonian.

Example 5.5.3. The following graphs are hamiltonian.


The cycles $C_k$ are hamiltonian for all $k \ge 3$.
The complete graphs $K_n$ contain the cycles $C_n$ for all $n \ge 3$, and therefore
a Hamilton cycle. Thus these are hamiltonian.
² At each vertex we can simply choose any unused edge, independent of any previous choice.
The choice does not depend on any global properties.

The wheel graph

Example 5.5.4. The following graphs are not hamiltonian.
Any tree is not hamiltonian as it does not contain any cycles at all.
The following graphs do not contain any Hamilton cycles either.
We would now like to address the question whether mesh graphs are hamiltonian. The graph $M_{1,2}$
is a tree and thus not hamiltonian. The following two mesh graphs, $M_{2,2}$
and $M_{2,3}$,
are hamiltonian as the outer cycles connect all nodes. The next mesh graph,
$M_{3,3}$,
however is not hamiltonian. To see this note that we can color the nodes
such that separating the blue nodes and the green nodes we obtain a bipartite
graph. There are 5 blue nodes and 4 green ones. So if we could find a Hamilton
cycle containing all nodes we could arrange the cycle in the following way.
But then, as the number of blue nodes and the number of green nodes are not
equal, there must be two neighbouring blue nodes. This yields a contradiction,
as there are no diagonal edges in mesh graphs. The same argument holds for
any mesh $M_{m,n}$ with $m \cdot n$ being odd.

More generally, a bipartite graph can only have a Hamilton cycle if the number
of vertices of one color equals the number of vertices of the other color.
For mesh graphs this condition is not only necessary but also sufficient. That
is,
$$M_{m,n} \text{ is hamiltonian} \iff m \cdot n \text{ is even}$$
Proof. From the argument above we directly obtain that $m \cdot n$ being even is a
necessary condition. This proves the direction from left to right. So it remains to show that the condition
is sufficient, i.e. whenever the product is even there exists a Hamilton cycle. If
the product is even, at least one of the two, $m$ and $n$, is even. Without loss of
generality we can assume that $m$ is even. Then we can construct a Hamilton
cycle as follows.
Similarly for complete bipartite graphs $K_{m,n}$ the condition $m = n$ is sufficient. Then one can construct a Hamilton cycle as follows.

Having considered mesh graphs and complete bipartite graphs we now turn
to Hamilton cycles in hypercubes. The smallest two, Q0 and Q1 , are not hamil-
tonian. The first is merely a point, the second is a tree. But the square Q2 and
the cube Q3 are hamiltonian.

Using the Hamilton cycle on Q3 we can now construct a Hamilton cycle on Q4 .
We remove two corresponding edges and link the two cycles at the corresponding
vertices.

This procedure can be applied iteratively to obtain a Hamilton cycle for any
$Q_d$, as we will show below formally by induction. Therefore the hypercubes $Q_d$
are hamiltonian for all $d \ge 2$.
Proof. We will show that there exists a Hamilton cycle for all $d \ge 2$ by induction
over $d$.
Base case: As we have seen above $Q_2$ is hamiltonian.
Induction step: By induction hypothesis there exists a Hamilton cycle for $Q_d$. We will
then construct a Hamilton cycle for $Q_{d+1}$, which consists of the two copies $Q_d 0$ and $Q_d 1$.
Take an edge $(a_1 0, a_2 0)$ of the Hamilton cycle in the first copy and the corresponding edge $(a_1 1, a_2 1)$ in the second.
The process is to remove the corresponding edges $(a_1 0, a_2 0)$ and $(a_1 1, a_2 1)$
and add the edges $(a_1 0, a_1 1)$ and $(a_2 0, a_2 1)$.
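One concrete way to write down such a Hamilton cycle is the reflected Gray code; the following Python sketch (an addition; it prefixes the bits instead of appending them, which yields an isometric labelling) lists the vertices of Q_d so that consecutive strings, as well as the last and the first, differ in exactly one bit.

def gray_code(d):
    """Hamilton cycle on the hypercube Q_d (d >= 2) as a list of d-bit strings:
    consecutive strings (and the last and the first) differ in exactly one bit."""
    if d == 1:
        return ['0', '1']
    previous = gray_code(d - 1)
    # reflect: prefix one copy with 0 and the reversed copy with 1
    return ['0' + s for s in previous] + ['1' + s for s in reversed(previous)]

print(gray_code(3))  # ['000', '001', '011', '010', '110', '111', '101', '100']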
Though in some special graphs there are Hamilton cycles, there is generally
no efficient algorithm known to decide whether a given graph is hamiltonian.

5.6 Planar graphs


Definition 5.20 (Planar graphs). A graph G pV, Eq is planar if it can be
drawn in the plane such that no edges cross.

Example 5.6.1. The complete graph K4 is planar as we can rearrange the edges
as follows

The complete graph K5 however is not planar. One might shift two of the
inner edges outside, but there remains a crossing. Shifting further edges outside
introduces crossings outside.

At the beginning of the course we proved that the bipartite graph K3,3 is not
planar.

To actually prove that the latter two graphs are not planar we could go
through all possible arrangements of edges to see whether there is any without
crossings. This is rather tedious. We therefore derive some properties of planar
graphs. If a graph does not have any of these properties it cannot be planar.
Theorem 5.6.1 (Euler's formula). Let $G = (V, E)$ be a planar connected graph
which divides the plane into $f$ regions (including the region outside the graph).
Then the number of regions, of vertices and of edges satisfy the following equation.
$$|V| + f - |E| = 2 \qquad (5.3)$$
Note that the converse implication does not hold. A graph satisfying equation
(5.3) is not necessarily planar.
In the introduction of the course we saw a proof of this statement using
spanning and dual trees. We will now consider a different one, reducing a given
planar graph to a tree.
Proof. Note first of all that equation (5.3) holds for trees. As there are no cycles
there is merely one region, i.e. $f = 1$. Further the number of edges is one less
than the number of vertices, $|E| = |V| - 1$. Thus one obtains
$$|V| + f - |E| = |V| + 1 - (|V| - 1) = 2$$
We can now reduce any other connected planar graph to a tree by removing
edges in cycles.
Removing an edge of a cycle reduces the number of regions and the number of
edges by one, while the number of vertices remains the same:
$$f \to f_1 := f - 1 \qquad |E| \to |E_1| := |E| - 1 \qquad |V| \to |V_1| = |V|$$
Thus the value of the left-hand side for the new graph $G_1 = (V_1, E_1)$ with $f_1$
regions is the same as before.
$$|V_1| + f_1 - |E_1| = |V| + f - |E|$$
The same holds for any other graph $G_i = (V_i, E_i)$ resulting from removing
further edges to break apart cycles. Finally we end up with a tree that satisfies
Euler's formula as seen above. Thus Euler's formula also holds for the initial
general planar graph.
Unfortunately Euler's formula is of no use to prove that $K_5$ or $K_{3,3}$ are not
planar. The notion of a region is ill-defined if there are crossings. It is thus not
clear what $f$ is in these two cases. We will now find bounds on the number of
regions, $f$, in terms of the number of edges and vertices. In a simple graph a
region is bounded by at least three edges
and each edge bounds two regions. Therefore we obtain
$$3f \le 2\,|E| \iff f \le \frac{2}{3}\,|E|$$
Inserting this into Euler's formula yields the following corollary.

Corollary 5.6.2. Any planar graph with $|V| \ge 3$ satisfies the following inequality.
$$|E| \le 3\,|V| - 6 \qquad (5.4)$$
The complete graph $K_5$ violates this inequality as $|V| = 5$ and $|E| = 10$: the
right-hand side is $9 < 10$.
The complete bipartite graph $K_{3,3}$ with $|E| = 9$ and $|V| = 6$ does however
not violate the inequality (5.4). We can fix this issue by improving the bound
on the number of regions for bipartite graphs. A region in a bipartite graph is
bounded by at least 4 edges.
Therefore we obtain the following improved bound for bipartite graphs
$$4f \le 2\,|E| \iff f \le \frac{1}{2}\,|E|$$
Insertion into Euler's formula (5.3) yields the following corollary.
Corollary 5.6.3. A bipartite planar graph $G = (V, E)$ satisfies the following
inequality.
$$|E| \le 2\,|V| - 4 \qquad (5.5)$$

The bipartite graph $K_{3,3}$ with $|E| = 9$ and $|V| = 6$ violates this
inequality and is therefore not planar.

Average degree The inequalities (5.4) and (5.5) can be expressed in terms
of the average degree
$$\overline{\deg}(v) := \frac{1}{|V|} \sum_{v \in V} \deg(v) = \frac{2\,|E|}{|V|}$$
Then (5.4) for general planar graphs yields
$$|E| \le 3\,|V| - 6 \implies \overline{\deg}(v) \le 6 - \frac{12}{|V|} < 6$$
and (5.5) for bipartite planar graphs
$$|E| \le 2\,|V| - 4 \implies \overline{\deg}(v) \le 4 - \frac{8}{|V|} < 4.$$

Subdivisions One can extend a given graph by adding a vertex on one of
the edges. More precisely, for a given graph $G = (V, E)$ we add a vertex $v'$ to
the vertex set $V$ and replace one of the edges $(u, v) \in E$ by the two edges $(u, v')$
and $(v', v)$. The new graph is then called a subdivision of $G$.
Surprisingly, subdivisions of $K_5$ and $K_{3,3}$ are at the core of non-planarity.
Theorem 5.6.4. All non-planar graphs contain subdivisions of $K_5$ or subdivisions of $K_{3,3}$ as subgraphs.
We will not prove this statement.

5.7 Graph colorings
Before giving the formal definition of a coloring let us consider an example.
Example 5.7.1. Imagine the following problem: a flight company wants to find
the minimal number of planes to operate some given flights. Each flight is represented by a node. Two nodes are connected if the flights cannot be operated
by the same plane. Then the minimal number of colors required to color the
vertices, such that any two neighbouring vertices have a different color, corresponds to the number of planes required. Let's consider colorings of the Petersen
graph.
Two colors do not suffice as can be seen from the outer cycle. It contains an
odd number of vertices and thus requires a coloring with three colors.
We can extend this coloring to the entire Petersen graph.
Therefore in this case of ten flights we would need at least three planes.
Definition 5.21 (Coloring). A k-coloring of a graph $G = (V, E)$ is a function
$$c : V \to \{1, \ldots, k\} \qquad (5.6)$$
such that any neighbours are assigned a different value, i.e.
$$(u, v) \in E \implies c(u) \ne c(v)$$
Definition 5.22 (Chromatic number). The chromatic number $\chi(G)$ of a graph
$G$ is the minimal $k$ such that a k-coloring of $G$ exists.
Example 5.7.2. We will now consider the chromatic number of some special
graphs.
Planar graphs $G_p$ have a chromatic number less or equal than 6, as we
have shown in the introduction 1.4:
$$\chi(G_p) \le 6$$
It can be shown that the chromatic number is actually smaller or equal
than 4.
Complete graphs $K_n$ have a chromatic number $\chi(K_n) = n$ as every vertex
needs its own color.
Bipartite graphs have a chromatic number $\chi(K_{m,n}) = 2$ by definition³.
Mesh graphs have a chromatic number $\chi(M_{m,n}) = 2$ as there are no
diagonal edges.
For hypercubes $\chi(Q_d) = 2$. This follows from an inductive argument.
Obviously $Q_1$ is two-colorable. If $Q_d$ is two-colorable then we can color
the second $Q_d$ with flipped colors before connecting the two to construct
$Q_{d+1}$.
The cycles have a chromatic number
$$\chi(C_n) = \begin{cases} 2 & n \text{ even} \\ 3 & n \text{ odd} \end{cases}$$
Trees are two-colorable, $\chi(\text{tree}) = 2$, as we can arrange the nodes in levels.
Then all levels with odd parity (depth) are colored in one color and those
with even parity in another.

Theorem 5.7.1. A graph $G = (V, E)$ is bipartite, i.e. 2-colorable, if and only
if it contains no odd cycle, i.e. no cycle of odd length.
³ Bipartite is equivalent to two-colorable.

Proof. The implication from left to right follows directly from the observation above. If $G$ is
bipartite it cannot contain an odd cycle, as then $\chi(G) \ge 3$ yields a contradiction.
It remains to show that if $G$ contains no odd cycle, then it is also bipartite. Let's
consider a spanning tree of $G$. We can then color the levels with two colors
as above. We now need to show that there are no edges in $G$ that connect two
levels of the same color. This follows from the observation that adding an edge
connecting two vertices of levels with the same parity always yields an odd cycle
and thus a contradiction.

The theorem yields an efficient way, i.e. in linear time, to decide whether a
graph is two-colorable; a sketch is given below. To decide however whether a graph is 3-colorable is an
NP-complete problem.
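A sketch of such a linear-time test (an illustrative addition, using a breadth-first search): color a start vertex, give every newly reached neighbour the opposite color, and report failure as soon as an edge connects two vertices of the same color.

from collections import deque

def two_coloring(vertices, edges):
    """Try to 2-color a graph; return a dict vertex -> color, or None if an odd cycle exists."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = {}
    for start in vertices:            # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return None       # odd cycle found: not bipartite
    return color

print(two_coloring(range(4), [(0, 1), (1, 2), (2, 3), (3, 0)]))  # C4: a valid 2-coloring
print(two_coloring(range(3), [(0, 1), (1, 2), (2, 0)]))          # C3: None (odd cycle)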

Chapter 6

Cryptography

6.1 Diffie-Hellman key exchange


For millennia information was hidden using cryptography. An example, though
badly broken, is the Caesar cipher, which relied on permuting the letters according to a given scheme. The permutation, i.e. the key, was merely known
to those that were supposed to receive the information. Similarly any cipher
used before the 1970s relied on a key that was shared among the communicating parties and kept hidden from possible adversaries. This required the
parties to meet sometime prior to the communication and exchange the key.
In 1976 however Whitfield Diffie and Martin Hellman published a protocol to establish a key using merely authenticated communication over a public
channel. Two parties, usually referred to as Alice and Bob, exchange unencrypted messages. Thus also a potential adversary Eve can read the messages.
Nonetheless Alice and Bob can generate a secret key Eve cannot retrieve.
First of all Alice sends Bob a large prime number $p$ (about 200 digits). They
can use this number $p$ to construct a trap door function, i.e. a function that is
easy to compute but hard to invert. Using this sort of function it is possible to
establish the key. We will now take a closer look at the trap door functions.

Trap door function Let us consider the function
$$x \mapsto R_p(2^x) = 2^x \bmod p$$
How can one compute this function efficiently? Multiplying
$$2 \cdot 2 \cdots 2$$
before taking the modulo requires $x$ multiplications. To find a more efficient
algorithm let's first assume that $x$ is a power of 2, i.e. $x = 2^k$. Then
$$2^x = 2^{2^k} = \left(\cdots\left(\left(2^2\right)^2\right)^2\cdots\right)^2 \qquad (k \text{ squarings})$$
and one can compute the power by $k$ multiplications, each time multiplying the result
with itself.
Figure 6.1: Alice and Bob communicate through a public channel. Even though
Eve receives all their messages she does not know their shared secret.

To generalize this to arbitrary $x$ we write the argument in its binary expansion,
$$x = (x_r x_{r-1} x_{r-2} \cdots x_0)_2 = x_0 + 2^1 x_1 + 2^2 x_2 + \ldots + 2^r x_r$$
Then the computation one has to perform turns into repeated squaring and
multiplying:
$$2^x = 2^{x_0 + 2^1 x_1 + \ldots + 2^r x_r} = \left(\cdots\left(\left(2^{x_r}\right)^2 \cdot 2^{x_{r-1}}\right)^2 \cdots 2^{x_1}\right)^2 \cdot 2^{x_0}$$
Thus we are left with about $2r \approx 2 \log x \le 2 \log p$ computational steps.
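This square-and-multiply idea looks as follows in code (a sketch added for illustration; Python's built-in pow(2, x, p) performs the same computation, and the prime used below is merely a small example).

def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by repeated squaring:
    process the binary expansion of the exponent, one bit per step."""
    result = 1
    factor = base % modulus
    while exponent > 0:
        if exponent & 1:                         # current binary digit is 1: multiply it in
            result = (result * factor) % modulus
        factor = (factor * factor) % modulus     # square for the next binary digit
        exponent >>= 1
    return result

p = 1000003  # a small prime, for illustration only
print(power_mod(2, 123456, p), pow(2, 123456, p))  # both agree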


Computing the inverse
$$R_p(2^x) \mapsto x$$
is harder. One way is to multiply $2 \cdot 2 \cdot 2 \cdots$ and compare at each step with $R_p(2^x)$
until one finds a match. There is however a slightly more efficient protocol. For
some fixed number $R < p$ one calculates the giant steps $2^R, 2^{2R}, 2^{3R}, \ldots$ and
the baby steps $2^{x+1} = 2^x \cdot 2,\; 2^{x+2} = 2^x \cdot 2^2,\; \ldots,\; 2^{x+R} = 2^x \cdot 2^R$ as shown in figure
6.2 until one finds a match $2^{jR} = 2^{x+i}$. Then $x = jR - i$. The number of
computational steps is then
$$R + \frac{p}{R}$$
Thus the optimal choice of $R$ is roughly $\sqrt{p}$.¹ There are therefore roughly $\sqrt{p}$
computational steps. Thus
$$\sqrt{p} = 2^{\frac{1}{2}\log p}$$
is still exponential in the input size $\log p$. So even the slightly more efficient
protocol is by far less efficient than the computation of $x \mapsto R_p(2^x)$. Therefore
the function is a trapdoor function. One way is easy while the other is not.
¹ One obtains this for example from setting the derivative with respect to $R$ of the convex
function $f(R) = R + p/R$ equal to zero.

Figure 6.2: This illustrates the giant-step-baby-step algorithm.
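A compact version of this search (an illustrative addition; here R is chosen as roughly the square root of p and the baby steps are stored in a dictionary) looks as follows.

from math import isqrt

def discrete_log(a, g, p):
    """Find some x with g**x == a (mod p) by the giant-step/baby-step method, or None."""
    R = isqrt(p) + 1
    baby = {}                         # baby steps: a * g^i  ->  i
    value = a % p
    for i in range(R):
        baby.setdefault(value, i)
        value = (value * g) % p
    giant = pow(g, R, p)              # g^R
    value = 1
    for j in range(1, R + 1):
        value = (value * giant) % p   # g^(j*R), the giant steps
        if value in baby:
            return j * R - baby[value]
    return None

p, g = 1000003, 2
a = pow(g, 777777, p)
x = discrete_log(a, g, p)
print(x, pow(g, x, p) == a)   # a valid exponent: 2^x is congruent to a modulo p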

The Diffie-Hellman key exchange protocol Employing the trapdoor function from above we can now devise a protocol for Alice and Bob to agree on a
shared key.
Alice chooses a large prime $p$ and sends it to Bob.
Alice chooses a random integer $x < p$ and sends $a := R_p(2^x)$ to Bob.
Bob chooses a random integer $y < p$ and sends $b := R_p(2^y)$ to Alice.
Alice computes $b^x \bmod p$ and Bob $a^y \bmod p$. These are the same numbers
and thus their shared secret:
$$s := a^y \bmod p = (2^x)^y \bmod p = (2^y)^x \bmod p = b^x \bmod p$$
Schematically the protocol is depicted in figure 6.3. Eve merely knows $a$, $b$ and
$p$ but neither $x$ nor $y$. There is no more efficient way known to compute $s$ from
$a, b, p$ than to first invert $R_p(2^x)$ and compute either $x$ or $y$. This inversion
will take a long time if the numbers $x$, $y$ and $p$ are sufficiently large. Thus Eve
practically cannot compute $s$.
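A toy run of the protocol (an addition for illustration; the prime used is far too small to offer any security) can be simulated directly.

import random

def diffie_hellman_demo(p=1000003, g=2):
    """Toy run of the key exchange."""
    x = random.randrange(2, p - 1)      # Alice's secret
    y = random.randrange(2, p - 1)      # Bob's secret
    a = pow(g, x, p)                    # Alice -> Bob
    b = pow(g, y, p)                    # Bob -> Alice
    s_alice = pow(b, x, p)              # Alice computes b^x mod p
    s_bob = pow(a, y, p)                # Bob computes a^y mod p
    assert s_alice == s_bob             # both hold the same shared secret
    return s_alice

print(diffie_hellman_demo())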
There is an analogy with locks, depicted in figure 6.4. Alice and Bob have
each two open locks of the same kind. Both close one of the locks and send it to
the other party. With the remaining lock they can lock the two locks together.
Eve can intercept the messages and copy the closed locks. But as she cannot
open any of the two, she cannot lock them together to obtain the secret $s$.²
² A more precise analogy would be that Eve can try to build a key for the locks. But as
Figure 6.3: The Diffie-Hellman key exchange protocol.

Figure 6.4: Eve can copy the two closed locks sent by Alice and Bob, but she
cannot intertwine them to obtain the secret s.

6.2 RSA cryptosystem
Symmetric cryptosystem Until the late 70s the following encryption scheme
was used exclusively: Alice encrypts a message $M$ with a key $K$ to obtain the
ciphertext $C$, and Bob decrypts $C$ with the same key $K$ to recover $M$.
The scheme employs the same secret key shared among Alice and Bob. This
key allows to both encrypt and decrypt the message. Therefore this kind of
cryptosystem is called symmetric.
Diffie and Hellman solved the issue of distributing the secret key using merely
a public authenticated channel as we have seen above. They also proposed a
completely new scheme employing different keys for encryption and decryption
and thus not relying on a shared secret key.

Public-key cryptosystems rely on a publicly known key $PK$ used to encrypt a message. This key however does not allow to decrypt the message.
Merely the recipient knows the private (or secret) key $SK$ required to decrypt
the message.
Diffie and Hellman could however not provide a realization of such an asymmetric or public-key cryptosystem. In 1977 Rivest, Shamir and Adleman realized a public-key cryptosystem employing the factoring problem. We will now
consider some of the mathematical basics before turning to the protocol itself.
she has no information what the key looks like, she has to try all possible keys. This process
will take a lot of time and thus practically Eve cannot open the lock.

6.2.1 Euclid's Algorithm
Given a terrace of dimensions $a \times b$, what is the length of the largest square tile
that can be used to exactly cover the terrace?
So we are actually looking for the greatest common divisor $\gcd(a, b)$. One
could compute the gcd from the prime factors of $a$ and $b$. There is however no
efficient algorithm known to compute the prime factors of a number. Instead
the terrace picture allows to derive a more efficient way to compute the gcd
without computing first all prime factors. One can reduce the question to smaller
rectangles. We can subtract the square $b \times b$ and ask for the largest tile exactly
covering $(a - b) \times b$.

The answer is the same as for the rectangle $a \times b$. One can then repeat this over
and over.
What if however $a$ is much larger than $b$, $a \gg b$? Then we have to subtract
the same square $b \times b$ over and over again.
It is more efficient to directly reduce to the remainder $R_b(a)$ of the integer
division $a/b$. In particular then $0 \le R_b(a) < b$. So the algorithm consists
of repeatedly reducing to smaller rectangles $a_k \times b_k$ with alternatingly
$a_k := R_{b_{k-1}}(a_{k-1})$, $b_k = b_{k-1}$ or $a_k = a_{k-1}$, $b_k = R_{a_{k-1}}(b_{k-1})$. Always the longer side is
replaced by the remainder. The algorithm terminates when one of the remainders
becomes zero. This happens latest when one reaches the square $1 \times 1$. This
yields Euclid's algorithm, an efficient algorithm to compute the gcd of two
numbers $a$ and $b$.
Example 6.2.1. Let's consider the case $a = 17$ and $b = 10$. First we reduce $a$ to
the remainder $R_{10}(17) = 7$, then $b$ to the remainder $R_7(10) = 3$ and so on and
so forth. We therefore obtain the sequence
$$(17, 10) \to (7, 10) \to (7, 3) \to (1, 3) \to (1, 0)$$
Thus we obtain $\gcd(17, 10) = 1$.
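In code the algorithm is only a few lines (an illustrative addition).

def gcd(a, b):
    """Euclid's algorithm: repeatedly replace the pair by the smaller number and the remainder."""
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(17, 10))  # 1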


A related question is the following. Given two hour glasses of different sizes,
i.e. with different time intervals of $a$ and $b$ minutes ($a, b \in \mathbb{N}$), can one measure
one minute?
As the gcd of 17 and 10 is 1 we can actually measure an interval of 1 minute. We
will now consider an iterative method to measure smaller time intervals with the
two hour glasses. We derive for each of the remainders in the Euclid algorithm
how to measure the corresponding time interval until we reach 1.
To measure an interval of 17 minutes we use the first hour glass once and
the second not at all. Similarly for 10 min one uses merely the second one. To
measure an interval of 7 minutes we turn both hour glasses at the same time.
But the interval we are interested in starts only after the sand in the second hour
glass has fallen through. Thus we delay the start of the interval by 10 minutes.
Indicating this delay with a minus we obtain the following number of uses
of the hour glasses.
interval    17-min glass    10-min glass
17          1               0
10          0               1
7           1               -1
3           -1              2
1           3               -5
The first two lines can be filled easily. Each of the following lines can be computed from the previous two. Take for instance the last line. From integer
division in the first column we obtain the factor $k := \lfloor 7/3 \rfloor = 2$. We obtain the
last line $(l_5)$ from the previous two, $(l_4)$ and $(l_3)$, by computing $(l_3) - k \cdot (l_4) = (l_5)$.
Thus to measure an interval of one minute we use the hour glasses as follows:
$$3 \cdot 17 = 51 \qquad 5 \cdot 10 = 50 \qquad 51 - 50 = 1$$
This means
$$1 = 3 \cdot 17 + (-5) \cdot 10$$
or in terms of modulo³
$$1 \equiv 3 \cdot 17 \pmod{10}$$
Therefore this yields an algorithm to compute inverses in modular arithmetic,
called the extended Euclidean algorithm:
$$3 \equiv 17^{-1} \pmod{10}$$
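The same computation, done systematically, is the extended Euclidean algorithm; the following Python sketch (an addition for illustration) returns the gcd together with the two coefficients.

def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    return g, t, s - (a // b) * t

g, s, t = extended_gcd(17, 10)
print(g, s, t)   # 1 3 -5, i.e. 1 = 3*17 + (-5)*10
print(s % 10)    # 3, the inverse of 17 modulo 10, since 3*17 = 51 is congruent to 1 mod 10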

6.2.2 The Chinese remainder theorem


In order to count the men in their army the ancient Chinese came up with
the following algorithm. The soldiers had to form blocks with $p$ soldiers, $p$
being a prime, in each row. For each prime number one would then record the
remainder of soldiers that do not fill a complete row, for instance for the
primes $p = 2$, $p = 3$ and $p = 5$.

³ Recall that $a \equiv b \pmod{m}$ is equivalent to $R_m(a) = R_m(b)$ or $m$ dividing the difference
$(a - b)$, i.e. $m \mid (a - b)$.

From the first block we conclude that the number of soldiers $x$ is odd. From
the second we obtain that it is a multiple of 3. We can write this as
$$x \equiv 1 \pmod{2}$$
$$x \equiv 0 \pmod{3}$$
$$x \equiv 2 \pmod{5}$$
Then the only solution to this system smaller than 30 is 27. We are in particular interested in the following special case. For two prime numbers $p$ and $q$ it
holds
$$\left.\begin{array}{l} x \equiv a \pmod{p} \\ x \equiv a \pmod{q} \end{array}\right\} \iff x \equiv a \pmod{pq} \qquad (6.1)$$

6.2.3 Fermat's theorem

Finally we will briefly revise Fermat's theorem, treated at the beginning of the
course. Imagine we want to make a necklace with $p$ pearls, each having one of
$a$ colors.
How many different necklaces are there? A priori we can choose for each of the
pearls among $a$ colors. So there are $a^p$ different necklaces. There are however
some that differ merely by a rotation. Thus we have to divide by the number of
rotations. This would reduce the number of necklaces too much, as we initially
counted the $a$ necklaces with all pearls having the same color merely once. So
we have to subtract those before the division and add them afterwards again.
Thus we obtain
$$\#\text{necklaces} = \frac{a^p - a}{p} + a \in \mathbb{N} \qquad (6.2)$$
Therefore $p$ divides $a^p - a$:
$$p \mid a^p - a \iff a^p \equiv a \pmod{p} \iff a^{p-1} \equiv 1 \pmod{p} \qquad (6.3)$$

where the last equivalence requires that $a$ is not a multiple of $p$, i.e. $\gcd(a, p) = 1$.
We can now combine the Chinese remainder theorem (6.1) and Fermat's
theorem (6.3) to obtain for two primes $p, q$, with $p \ne q$, and an integer number
$a$ with $\gcd(a, pq) = 1$
$$a^{(p-1)(q-1)} \equiv 1 \pmod{p}$$
$$a^{(p-1)(q-1)} \equiv 1 \pmod{q}$$
$$a^{(p-1)(q-1)} \equiv 1 \pmod{pq}$$

Equipped with these tools from number theory we can now introduce the RSA
protocol.

6.2.4 The RSA protocol


We will swap the roles of Alice and Bob. Instead of Alice sending a message to
Bob, Bob sends an encrypted message to Alice.
The protocol consists of the following steps:
Alice generates two large prime numbers $p, q$ and computes $n := p \cdot q$.
Further she chooses a number $e$ that is relatively prime to $(p-1)(q-1)$,
i.e. $\gcd(e, (p-1)(q-1)) = 1$. For instance $e = 3$ is a possible choice.
Alice computes
$$d := e^{-1} \pmod{(q-1)(p-1)}, \qquad \text{i.e.} \quad d \cdot e \equiv 1 \pmod{(q-1)(p-1)}$$
using the extended Euclidean algorithm.


Alice now sends the public key $PK = (n, e)$ to Bob while she keeps the
secret key $SK = (n, d)$ secure with herself.
Bob can now encrypt his message $M < n$ by computing $C := M^e \bmod n$,
employing the square-and-multiply algorithm. Then Bob sends $C$ to Alice.
Alice then decrypts the message as follows.
$$C^d \equiv (M^e)^d \equiv M^{ed} \equiv M^{1 + k(p-1)(q-1)} \equiv M \cdot \underbrace{\left(M^{(p-1)(q-1)}\right)^k}_{\equiv\, 1 \bmod n} \equiv M \pmod{n}$$
for some $k \in \mathbb{N}$.
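A toy instance of the protocol (an addition for illustration; the primes 61 and 53 are of course far too small for real use) can be written down directly.

def rsa_demo():
    """Toy RSA with tiny primes, for illustration only."""
    p, q = 61, 53
    n = p * q                       # 3233, the public modulus
    phi = (p - 1) * (q - 1)         # 3120
    e = 17                          # public exponent with gcd(e, phi) = 1
    d = pow(e, -1, phi)             # private exponent: d*e = 1 (mod phi); here 2753
    message = 1234
    cipher = pow(message, e, n)     # Bob encrypts with the public key (n, e)
    decrypted = pow(cipher, d, n)   # Alice decrypts with the secret key (n, d)
    return message, cipher, decrypted

print(rsa_demo())  # (1234, ..., 1234): decryption recovers the message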

The security of RSA is based on the hardness of factoring. To compute $p$
and $q$ from $n$ requires a lot of computational resources. An upper bound on the
number of computational steps is
$$\sqrt{n} = 2^{\frac{1}{2}\log n}$$
i.e. exponential in the size of the input. The best algorithm known so far is the
so-called number field sieve and requires roughly
$$2^{c\,(\log n)^{1/3}}$$
computational steps. This is less than exponential but not polynomial in the
input size, thus called subexponential. So it suffices to choose two large primes
$p, q$ to make RSA secure (assuming that the NSA has not secretly developed a
substantially more efficient algorithm for factoring and does not possess a
properly working quantum computer).

Appendices

Appendix A

Proof techniques

In this appendix we will briefly review common proof techniques used through-
out the course.

A.1 Proof by contradiction


In order to prove a proposition to be true, one can show that the assumption
of the proposition being false leads to a contradiction. So the usual approach is
to negate the proposition and then construct a contradiction.
Example A.1.1 ($\sqrt{2}$ is irrational).
Proposition A.1.1. The square root of 2 is irrational.
Proof. The negation of the proposition is "$\sqrt{2}$ is rational".
Given the negated proposition we will now construct a contradiction. If $\sqrt{2}$
is rational there is an irreducible fraction
$$\sqrt{2} = \frac{a}{b}, \quad a, b \in \mathbb{N} \implies a^2 = 2 b^2 \qquad (A.1)$$
As $a^2$ is even, $a$ has to be even. Thus $a^2 = 4 a'^2$ for some other natural number
$a' \in \mathbb{N}$. Therefore $b^2 = 2 a'^2$ and consequently $b$ is even as well. This contradicts
$a/b$ being an irreducible fraction.

A.2 Proof by induction


If the proposition is a statement for all natural numbers, induction is usually a
convenient approach.
One shows first that the proposition holds for a first natural number (the
base case). Then one proves that assuming the proposition holds for a general
natural number n it also holds for n ` 1. The second part of the proof is called
the induction step. Applying the induction step over and over again starting
from 1 one obtains that the proposition holds for 2,3,4,. . . and eventually all
natural numbers.

You might also like