1 Motivation
1.1 Swapping knights
1.2 Combinatorics
1.3 Connections without crossings
1.4 Coloring maps
2 Propositional Logic
2.1 Definitions
2.1.1 Connectives as truth functions
2.2 Syntax of propositional logic
2.3 Semantics of propositional logic
2.4 Normal forms
2.5 Models and semantic conclusion
2.6 Proof theory of propositional logic
2.6.1 An Excursion into Complexity Theory
2.7 The Resolution Calculus
3 Set theory
3.1 Basic notions
3.1.1 Cantor's paradise
3.1.2 Zermelo-Fraenkel set theory
3.1.3 Laws derived from logic
3.1.4 The Cartesian Product
3.2 Relations
3.2.1 Representation of relations
3.2.2 Properties of relations
3.2.3 Equivalence relations
3.2.4 Order relations
3.3 Functions
4 Combinatorics
4.1 Basic notions
4.2 Urn models
4.3 Combinatorial rules and counting strategies
4.3.1 The Pigeonhole Principle
4.3.2 Double Counting
4.4 Binomial coefficients: Properties and approximations
4.4.1 Symmetry
4.4.2 Vandermonde Identity
4.4.3 Binomial theorem
4.4.4 Approximation of the binomial coefficient
4.5 An Excursion into information theory: Data compression
4.6 Special counting problems
4.6.1 Equivalence relations
4.6.2 Permutations
5 Graph Theory
5.1 Motivation
5.2 Basic notions
5.2.1 Basic notions for simple undirected graphs
5.3 Trees
5.4 Some special graphs
5.5 Euler tours and Hamilton cycles
5.5.1 Seven bridges of Königsberg
5.6 Planar graphs
5.7 Graph colorings
6 Cryptography
6.1 Diffie-Hellman key exchange
6.2 RSA cryptosystem
6.2.1 Euclid's Algorithm
6.2.2 The Chinese remainder theorem
6.2.3 Fermat's theorem
6.2.4 The RSA protocol
Appendices
What is discrete
mathematics all about?
Before venturing into formal details, let us briefly clarify what we mean when we
speak about discrete mathematics.
What is mathematics?
The pillars of mathematics are logic and set theory. While logic determines
how to reason within mathematics (what is considered a proof, ...), set theory
describes the objects we deal with.
Example (Set theory). It is actually possible to define all natural numbers using
merely the empty set:
\begin{align*}
0 &:= \emptyset \\
1 &:= \{\emptyset\} \\
2 &:= \{\{\emptyset\}, \emptyset\} \\
&\;\;\vdots
\end{align*}
This exemplifies how set theory serves to define basic mathematical entities.
[Figure: mathematics resting on the two pillars set theory and logic]
Figure 2: Matching the rational and the natural numbers
1    0.0101011100110...
2    0.1001100101001...
3    0.0001011001010...
...  ...
But one can construct a real number that is not contained in the list: the
ith digit of the binary representation is just the flipped value of the ith digit of
the ith real number.
1 This is to say, each element gets assigned a unique natural number, or in other words, no
natural number is matched to two different elements in S.
2 That is, every natural number is matched to an element in S.
1    0.0101011100110...
2    0.1001100101001...
3    0.0001011001010...
...  ...
constructed real number    0.111...
The number we constructed differs from every element in the list and is
therefore not contained in the list. This contradicts the initial
assumption that there exists a complete matching. Thus the size of the set of
real numbers is strictly larger than the size of the set of natural numbers (in the sense
that there is no bijective map between the two).
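The diagonal construction can be tried out on the three listed prefixes. The following is only a sketch on finite bit strings; the function name `diagonal_flip` is chosen here for illustration and does not come from the notes.

```python
def diagonal_flip(rows):
    """Return a bit string whose i-th bit differs from the i-th bit of rows[i]."""
    return "".join("1" if row[i] == "0" else "0" for i, row in enumerate(rows))

# First binary digits (after "0.") of the three numbers listed above.
listed = ["0101011100110", "1001100101001", "0001011001010"]
constructed = diagonal_flip(listed)
print("0." + constructed + "...")

# The constructed prefix disagrees with row i at position i,
# so it cannot be equal to any row of the list.
assert all(constructed[i] != row[i] for i, row in enumerate(listed))
```

However long the list is made, the same function yields a string that differs from the i-th entry in the i-th digit.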
Why bother?
What is the relevance of discrete mathematics? On the one hand, logic is discrete,
as its basic set is $\{\mathrm{TRUE}, \mathrm{FALSE}\}$. On the other hand, computer science is not
only closely related to logic: as computers have only finitely many states, computer science
is in a sense discrete itself.
Chapter 1
Motivation
After having clarified the meaning of discrete mathematics, we will look into
some interesting examples from the field.
[Figure: three 3x3 chessboards (files a to c, ranks 1 to 3) with knights, illustrating the knight swapping problem]
Figure 1.3: Necklace with further symmetries, but p not being prime (here 8 pearls $a_0, \ldots, a_7$).
1.2 Combinatorics
Imagine you want to create a necklace with $p$ pearls, with $p$ being prime. There
are pearls of $a$ different colors available. How many different necklaces can you
make? If we could distinguish the positions of the pearls, there would be $a^p$ different possible
necklaces. But the necklace has a rotational symmetry: two necklaces
that are made up of the same sequence of pearls, except that one is rotated
by some angle, are indistinguishable. Just dividing by $p$ does not yield the right
solution though, as each of the $a$ one-colored necklaces appears only once. As $p$ is a prime
number, these are the only exceptions. Configurations like the one in Figure 1.3 cannot occur.
Therefore the right answer is
\[ N = \frac{a^p - a}{p} + a \tag{1.1} \]
More generally we obtain: for a prime number $p$, $a^p - a$ is divisible by $p$.
This is called Fermat's little theorem and plays a role in cryptography, namely in
the public-key protocol RSA.
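Formula (1.1) can be checked by brute force for small values. The sketch below treats two colorings as the same necklace iff one is a rotation of the other; the helper name `count_necklaces` is an assumption made for this illustration.

```python
from itertools import product

def count_necklaces(a, p):
    """Count colorings of p positions with a colors, up to rotation (brute force)."""
    seen = set()
    for seq in product(range(a), repeat=p):
        # Canonical representative: the lexicographically smallest rotation.
        seen.add(min(seq[i:] + seq[:i] for i in range(p)))
    return len(seen)

# For prime p the count agrees with N = (a^p - a) / p + a.
for a, p in [(2, 3), (3, 5), (2, 7)]:
    assert count_necklaces(a, p) == (a**p - a) // p + a

# For non-prime p the formula fails: a = 2, p = 4 gives 6 necklaces,
# while (2**4 - 2) / 4 + 2 is not even an integer.
assert count_necklaces(2, 4) == 6
```

The case p = 4 shows why primality matters: a pattern such as 0101 is mapped onto itself by a rotation of two positions, so its orbit has fewer than p elements.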
Figure 1.4: Houses with connections. The third house cannot be connected
anymore without intersecting other connections.
Figure 1.5: The houses connection problem can be associated to the graph
$K_{3,3}$ (nodes $h_1, h_2, h_3$ and $p_1, p_2, p_3$).
Figure 1.7: A cube can be transformed into a graph by removing the bottom
(turns into outer region) and pressing the top down while spreading out the
lower nodes.
\[ n - e + f = 2 \tag{1.2} \]
Before looking at a sketch of the proof we can apply it to $K_{3,3}$ to answer our
initial question. Indeed $n - e + f = 6 - 9 + 13 = 10 \neq 2$.
Proof sketch The proof consists of two parts: First we will show the proposition
for trees, which form a simpler class of graphs. Then we will generalize
this to arbitrary planar graphs.
Euler for trees A tree is a connected graph without any cycles. As it contains only one
region, namely the outer one, it is always planar. We can now show by induction
that Euler's formula is always satisfied for trees.
If there is just one node, then the formula is satisfied as $n = 1$, $e = 0$ and
$f = 1$. For the induction step we now assume that the formula holds for a tree with
Figure 1.9:
$n$ nodes and conclude that it also holds for a tree with $n + 1$ nodes. Adding a node
requires adding an edge as well. So the new tree will have $e + 1$ edges and $n + 1$
nodes and therefore
\[ \underbrace{n'}_{n+1} - \underbrace{e'}_{e+1} + \underbrace{f'}_{1} = n - e + f = 2 \tag{1.3} \]
which is just Euler's formula. The subsequent example will employ the
formula as well.
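The induction can be mirrored in a few lines of code: start with a single node and repeatedly attach a new node by one new edge, checking the formula after every step. This is only an illustration of the argument, not a proof; `grow_random_tree` is a name invented here.

```python
import random

def grow_random_tree(steps, seed=0):
    """Grow a tree node by node and check n - e + f == 2 after every step."""
    rng = random.Random(seed)
    nodes, edges = [0], []          # base case: one node, no edge
    f = 1                           # a tree has only the outer region
    assert len(nodes) - len(edges) + f == 2
    for new in range(1, steps + 1):
        parent = rng.choice(nodes)  # attach the new node to an existing one
        nodes.append(new)
        edges.append((parent, new))
        assert len(nodes) - len(edges) + f == 2
    return nodes, edges

nodes, edges = grow_random_tree(20)
```

Each step increases both n and e by one, which is exactly why the left-hand side of (1.3) does not change.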
Figure 1.10: The figure on the left shows, that it is possible to arrange four
countries such that they are all neighbors to one another. Thus we need at least
four colors. The corresponding graph is shown on the right.
been some suspicion towards the validity of the proof. A shorter proof shows
that 5 colors are sufficient. In the following we sketch a proof for 6 colors.
Proposition 1.4.1 (6-Coloring). Six colors suffice to color a map (or equivalently
the nodes of a planar graph without multi-edges).
Proof. As there is a finite number of nodes (countries) we will prove the proposition
by induction on the number of nodes.
Base case For 6 or fewer nodes the proposition holds, as there are at least as
many colors as nodes.
Induction hypothesis We will assume that the proposition holds for $k$ nodes.
Induction step The major part of the proof is now to show that, given the
induction hypothesis, the proposition also holds for $k + 1$ nodes. The idea is
as follows:
• Eliminate a node $v$.
• By the induction hypothesis there exists a coloring for the reduced graph.
• One needs to find a color for the node $v$ such that the proposition still
holds. The goal is to choose the node $v$ such that it has 5 or fewer neighbors, so that
at least one color is still available.
While the first two points are clear, we still need to prove the last one, which is
essentially the following lemma.
Lemma 1.4.1. In every planar graph without multi-edges there exists at least
one node $v$ with 5 or fewer neighbors.
Proof of the lemma. We will prove the lemma by contradiction. We assume that the
lemma is false, i.e. that each (!) node has at least 6 neighbors. Based
on this assumption we will construct a contradiction with Euler's formula.
As every node has at least 6 neighbors and each edge connects two nodes,
the edges and the nodes are related as follows
\[ 2e \geq 6n \tag{1.5} \]
Similarly the regions and the edges are related. Every region is bounded by at
least 3 edges (NB: two edges only suffice for multi-graphs), and every edge borders at most two regions. So we obtain
\[ 2e \geq 3f \tag{1.6} \]
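Assuming the proof continues in the standard way, the two inequalities can be combined with Euler's formula (1.2) as in the following sketch:

```latex
\begin{align*}
2e \geq 6n &\;\Rightarrow\; n \leq \tfrac{e}{3}, \\
2e \geq 3f &\;\Rightarrow\; f \leq \tfrac{2e}{3}, \\
n - e + f &\leq \tfrac{e}{3} - e + \tfrac{2e}{3} = 0 < 2 .
\end{align*}
```

This contradicts $n - e + f = 2$, so at least one node must have 5 or fewer neighbors.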
Chapter 2
Propositional Logic
2.1 Definitions
Logic is ultimately about statements and the question whether they are true or
false. This serves to formulate the following definition.
Figure 2.1: Logical AND gate
When the rooster crows on the dungheap,
while the next one does not seem to be a proper proposition at all, as one cannot
decide whether it is true or false.
Conjunction or AND gate is the connective that returns true if and only if
both arguments are true.
$A$      $B$      $A \land B$
true     true     true
true     false    false        (2.2)
false    true     false
false    false    false
Negation or NOT gate returns the opposite of the input truth value.
$A$      $\lnot A$
true     false                 (2.3)
false    true
Together the AND and the NOT gate suffice to realize any other gate. We say
that this set of gates is universal. Interestingly, a single gate, namely the NAND,
is universal by itself.
NAND gate The two gates above can be combined to a new gate, the negated
AND gate, NAND gate for short
\[ A \mid B := \lnot (A \land B) \tag{2.4} \]
$A$      $B$      $A \mid B$
true     true     false
true     false    true         (2.5)
false    true     true
false    false    true
To get an idea of how the NAND gate can serve to simulate all other gates: one
obtains the NOT gate by using $A$ for both inputs, $\lnot A \equiv A \mid A$. One can
now use this NOT gate and the NAND to obtain an AND gate, and so on and
so forth.
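How far "and so on" reaches can be checked mechanically. The sketch below builds NOT, AND, OR and XOR from NAND alone and compares them with Python's own operators on all inputs; the particular constructions are the usual textbook ones, and the function names are chosen here for illustration.

```python
from itertools import product

def nand(a, b):
    return not (a and b)

# Gates built from NAND only.
def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor_(a, b):
    return and_(or_(a, b), nand(a, b))

# Compare against the built-in operators on every assignment.
for a, b in product([False, True], repeat=2):
    assert not_(a) == (not a)
    assert and_(a, b) == (a and b)
    assert or_(a, b) == (a or b)
    assert xor_(a, b) == (a != b)
```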
$A$      $B$      $A \oplus B$        $A$  $B$  $A \oplus B$
true     true     false               1    1    0
true     false    true                1    0    1        (2.7)
false    true     true                0    1    1
false    false    false               0    0    0
The XOR symbol $\oplus$ resembles a plus for a reason: if we replace false by 0 and
true by 1, then the XOR is just addition modulo 2 (as we will see later in the
course). This replacement by 0 and 1 simplifies matters. From now on we will therefore
use digits.
Logical equivalence is the negation of the XOR, returning true if and
only if the arguments have the same truth value
$A$  $B$  $A \leftrightarrow B$
0    0    1
0    1    0        (2.8)
1    0    0
1    1    1
Implication connects two atoms $A$ and $B$ in such a way that if $A$ is true,
$B$ is true as well. $A$ is called the premise, $B$ the conclusion.
$A$  $B$  $A \rightarrow B$
0    0    1
0    1    1        (2.9)
1    0    0
1    1    1
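\[ E_D ::= D \;\mid\; (\lnot E_D) \;\mid\; (E_D \land E_D) \;\mid\; (E_D \lor E_D) \]
[Figure: syntax tree of $F$ with root $\land$, inner nodes $\lor$, $\lor$, $\land$, $\lnot$ and leaves $A$, $B$, $C$]
$(A \land B)$
$(A \lor B$
$(\lnot (A \land B))$
\[ F := (((A \land B) \lor (\lnot C)) \land (A \lor B)) \]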
\[ A \land B \lor C \]
The first can be understood either as $((A \land B) \land C)$ or as $(A \land (B \land C))$. The
meaning (i.e. the semantics) is the same. This does not hold for the second,
which can be interpreted as either $(A \land (B \lor C))$ or $((A \land B) \lor C)$.
\[ \mathcal{A} : \quad A \mapsto 0, \quad B \mapsto 1, \quad C \mapsto 0 \]
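Example 2.3.3 (Equivalent formulas). The following two formulas, $(A \lor B)$ and
$(\lnot((\lnot A) \land (\lnot B)))$, are equivalent: the columns below their main
connectives (the $\lor$ of the first and the outer $\lnot$ of the second) coincide.
 A  \lor  B   |  \lnot  ((\lnot  A)  \land  (\lnot  B))
 0    0   0   |    0        1    0     1       1    0
 0    1   1   |    1        1    0     0       0    1      (2.11)
 1    1   0   |    1        0    1     0       1    0
 1    1   1   |    1        0    1     0       0    1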
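Comparing the result columns of two truth tables is easy to automate. The following sketch encodes formulas as Python functions on the digits 0 and 1, an encoding chosen here for illustration, and checks the equivalence from the example.

```python
from itertools import product

def equivalent(f, g, n_atoms):
    """Two formulas are semantically equivalent iff they agree on every assignment."""
    return all(f(*bits) == g(*bits) for bits in product([0, 1], repeat=n_atoms))

f = lambda a, b: a | b                      # (A or B)
g = lambda a, b: 1 - ((1 - a) & (1 - b))    # not((not A) and (not B))

assert equivalent(f, g, 2)
assert not equivalent(f, lambda a, b: a & b, 2)
```

The loop visits all 2^n assignments, so this approach is only practical for a small number of atoms.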
Thus we obtain
\[ 0 := (A \land (\lnot A)) \tag{2.13} \]
\[ 1 := (A \lor (\lnot A)) \tag{2.14} \]
\[ F \leftrightarrow G := (F \land G) \lor ((\lnot F) \land (\lnot G)) \tag{2.16} \]
Semantic equivalence is an equivalence relation, that is a mathematical relation
with particular properties, as we will see later in the course. For now it is enough
to know that it groups formulas into disjoint subsets of equivalent elements,
so-called equivalence classes. It structures the set of all syntactically correct
formulas as visualized in Figure 2.4.
The formulas $(B \lor (\lnot B))$, $((\lnot(\lnot A)) \lor (\lnot(\lnot(\lnot A))))$ are all equivalent to 1
and therefore in the same equivalence class. They are all tautologies.
Figure 2.4: Equivalence classes in the set of all syntactically correct formulas
$E_D$. The equivalence class of 1 contains all tautologies.
\[ (F \oplus F) \equiv 0 \]
\[ (A \leftrightarrow A) \equiv 1 \]
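Symmetry
\[ (F \land G) \equiv (G \land F) \qquad (F \lor G) \equiv (G \lor F) \]
Associativity
\[ ((F \land G) \land H) \equiv (F \land (G \land H)) \qquad ((F \lor G) \lor H) \equiv (F \lor (G \lor H)) \]
Absorption
\[ ((F \land G) \lor F) \equiv F \qquad ((F \lor G) \land F) \equiv F \]
Distributivity
\[ (F \land (G \lor H)) \equiv ((F \land G) \lor (F \land H)) \qquad (F \lor (G \land H)) \equiv ((F \lor G) \land (F \lor H)) \]
de Morgan
\[ (\lnot (F \land G)) \equiv ((\lnot F) \lor (\lnot G)) \qquad (\lnot (F \lor G)) \equiv ((\lnot F) \land (\lnot G)) \]
double negation
\[ (\lnot (\lnot F)) \equiv F \]
Note that the distributivity for AND and XOR does not hold both ways but just for
\[ A \land (B \oplus C) \equiv (A \land B) \oplus (A \land C) \tag{2.17} \]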
last column.
A B C F
0 0 0 0
0 0 1 0
0 1 0 1
0 1 1 1 (2.19)
1 0 0 1
1 0 1 1
1 1 0 0
1 1 1 1
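A first way to construct a formula $F$ with the desired semantics is to consider
all the rows that have to be true. One has to be in row 3, row 4, row 5, row 6
or row 8. To check the validity of a single row one connects the atoms or their
negations with ANDs. Thus we obtain the formula
\[ F \equiv (\lnot A \land B \land \lnot C) \lor (\lnot A \land B \land C) \lor (A \land \lnot B \land \lnot C) \lor (A \land \lnot B \land C) \lor (A \land B \land C) \]
Pairs of terms can be simplified, for instance
\[ (\lnot A \land B \land \lnot C) \lor (\lnot A \land B \land C) \equiv (\lnot A \land B) \lor \underbrace{(\lnot C \land C)}_{0} \equiv \lnot A \land B \]
So we obtain
\[ \underbrace{(\lnot A \land B \land \lnot C) \lor (\lnot A \land B \land C)}_{\lnot A \land B} \lor \underbrace{(A \land \lnot B \land \lnot C) \lor (A \land \lnot B \land C)}_{A \land \lnot B} \lor (A \land B \land C) \]
\[ \equiv \underbrace{(\lnot A \land B) \lor (A \land \lnot B)}_{A \oplus B} \lor (A \land B \land C) \equiv (A \oplus B) \lor (A \land B \land C) \]
A second way is to consider the rows that have to be false (rows 1, 2 and 7) and to
exclude each of them by an OR of literals. Here one can simplify
\[ (A \lor B \lor C) \land (A \lor B \lor \lnot C) \equiv (A \lor B) \land \underbrace{(C \lor \lnot C)}_{1} \equiv A \lor B \]
to obtain
\[ F \equiv (A \lor B) \land (\lnot A \lor \lnot B \lor C) \]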
In the following definitions the normal forms mentioned above will be introduced
formally.
Definition 2.10 (Literal). If $A \in D$ is an atom then $A$ and $\lnot A$ are called
literals. In other words, a literal is an atom or a negated atom.
Definition 2.11 (Conjunctive Normal Form). A formula $F$ is in Conjunctive
Normal Form if there exist literals $L_{i,j}$ such that
\begin{align*}
F &= \bigwedge_{i=1}^{n} \bigvee_{j=1}^{m_i} L_{i,j} \\
  &= (L_{1,1} \lor L_{1,2} \lor \ldots \lor L_{1,m_1}) \\
  &\quad \land (L_{2,1} \lor \ldots \lor L_{2,m_2}) \\
  &\quad \land \ldots \\
  &\quad \land (L_{n,1} \lor \ldots \lor L_{n,m_n})
\end{align*}
Generally both methods we used in the example above can be applied to any
truth vector. Thus any formula is semantically equivalent to a formula in CNF
and to a formula in DNF.
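Both constructions can be written down generically for a truth vector. The sketch below uses the truth vector of table (2.19); the representation of a literal as an (atom, value) pair, meaning "the atom has this value", is an assumption made for this illustration.

```python
from itertools import product

# Truth vector of F from table (2.19), rows ordered 000, 001, ..., 111.
TRUTH = [0, 0, 1, 1, 1, 1, 0, 1]
ATOMS = "ABC"
ROWS = list(product([0, 1], repeat=3))

def dnf(truth):
    """One AND-term per true row; each term lists (atom, value) literals."""
    return [list(zip(ATOMS, row)) for row, t in zip(ROWS, truth) if t]

def cnf(truth):
    """One OR-clause per false row; the clause excludes exactly that row."""
    return [[(a, 1 - v) for a, v in zip(ATOMS, row)]
            for row, t in zip(ROWS, truth) if not t]

def eval_dnf(terms, row):
    env = dict(zip(ATOMS, row))
    return int(any(all(env[a] == v for a, v in term) for term in terms))

def eval_cnf(clauses, row):
    env = dict(zip(ATOMS, row))
    return int(all(any(env[a] == v for a, v in clause) for clause in clauses))

# Both normal forms reproduce the original truth vector.
assert [eval_dnf(dnf(TRUTH), r) for r in ROWS] == TRUTH
assert [eval_cnf(cnf(TRUTH), r) for r in ROWS] == TRUTH
```

The DNF has one term per true row (five here) and the CNF one clause per false row (three here); no simplification is attempted.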
\[ \mathcal{A} \models F \tag{2.20} \]
Example 2.5.1. Let us consider the following formula with its truth table
\[ F = (A \land B) \rightarrow C \]
(A  \land  B)  \rightarrow  C
 0    0    0        1       0
 0    0    0        1       1
 0    0    1        1       0
 0    0    1        1       1
 1    0    0        1       0
 1    0    0        1       1
 1    1    1        0       0
 1    1    1        1       1
Only the assignment in the seventh row is not a model of $F$.
3 World is to be understood more philosophically as reality.
Some of the properties of models are the following:
• Two formulas $F$ and $G$ are semantically equivalent, i.e. $F \equiv G$, if and
only if⁴
\[ \mathcal{A} \models F \text{ iff } \mathcal{A} \models G \tag{2.21} \]
That is, if $F$ and $G$ have the same models.
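\begin{align*}
\mathcal{A}_1 &: A \mapsto 0, \; B \mapsto 0 \\
\mathcal{A}_2 &: A \mapsto 0, \; B \mapsto 1 \\
\mathcal{A}_3 &: A \mapsto 1, \; B \mapsto 1 \\
\mathcal{A}_4 &: B \mapsto 0, \; C \mapsto 0 \\
\mathcal{A}_5 &: B \mapsto 0, \; C \mapsto 1 \\
\mathcal{A}_6 &: B \mapsto 1, \; C \mapsto 1
\end{align*}
\[ \mathcal{A} : A \mapsto 1, \; B \mapsto 1, \; C \mapsto 1 \]
is a model that renders all formulas true, i.e. a common model for all formulas.
As this is also a model for the atomic formula $C$ we obtain
\[ \{A, \; A \rightarrow B, \; B \rightarrow C\} \models C \]
4 "if and only if" is usually abbreviated "iff".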
The semantic conclusion is related to the (syntactical) implication, analogously
to the equivalence in Theorem 2.3.1.
Let us now consider models of this formula. If an assignment renders all $F_i$ true,
$\mathcal{A}(F_i) = 1 \ \forall i$, then the formula is true iff the assignment also renders $G$
true. This is just the definition of a semantic conclusion.
• Is $F$ a tautology?
• Is $F$ unsatisfiable?
• Does $\{F_1, \ldots, F_n\} \models G$ hold?
In the following examples we will consider the tautology question for CNFs
and DNFs.
Example 2.6.1 (Tautology problem of CNF). Given a formula $F$ in CNF, can we
decide whether it is a tautology? The formula can be rewritten as a conjunction
of subformulas $F_i$
\[ F = \underbrace{(L_{1,1} \lor \ldots \lor L_{1,m_1})}_{=: F_1} \land \ldots \land \underbrace{(L_{n,1} \lor \ldots \lor L_{n,m_n})}_{=: F_n} = F_1 \land \ldots \land F_n \]
Example 2.6.2 (Tautology problem of DNF). Can we derive a similar criterion
for the subformulas of a DNF
\[ F = \bigvee_{i=1}^{n} \bigwedge_{j=1}^{m_i} L_{i,j} = F_1 \lor \ldots \lor F_n \]
This is a DNF as any negated literal is just another literal. The argument also
works for $F$ being a DNF. Then one obtains a CNF semantically equivalent
to $\lnot F$.
As $F$ being a tautology is equivalent to $\lnot F$ being unsatisfiable, the satisfiability
problem of a DNF can be solved analogously to the tautology problem
of a CNF. Conversely, the satisfiability problem of a CNF, analogously to the
tautology problem of a DNF, can in general merely be decided by writing down the
entire truth table.
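The two easy cases can be sketched in code. The sketch assumes the standard criteria, which are not spelled out in this excerpt: a CNF is a tautology iff every clause contains some atom together with its negation, and a DNF is satisfiable iff at least one of its conjunctions contains no such pair. Literals are modelled as (atom, polarity) pairs, a representation chosen here for illustration.

```python
from itertools import product

def has_complementary_pair(literals):
    """True if some atom occurs both positively and negatively."""
    return any((atom, not pol) in literals for atom, pol in literals)

def cnf_is_tautology(clauses):
    # Every clause must be true on its own under every assignment.
    return all(has_complementary_pair(c) for c in clauses)

def dnf_is_satisfiable(terms):
    # One conjunction without contradiction is enough.
    return any(not has_complementary_pair(t) for t in terms)

def brute_force_tautology(clauses, atoms):
    """Reference check via the full truth table."""
    for bits in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, bits))
        if not all(any(env[a] == pol for a, pol in c) for c in clauses):
            return False
    return True

cnf1 = [{("A", True), ("A", False)}, {("B", True), ("C", True), ("B", False)}]
cnf2 = [{("A", True), ("B", False)}]
assert cnf_is_tautology(cnf1) and brute_force_tautology(cnf1, "ABC")
assert not cnf_is_tautology(cnf2) and not brute_force_tautology(cnf2, "AB")
assert dnf_is_satisfiable([{("A", True), ("A", False)}, {("B", True)}])
```

Both syntactic checks run in time linear in the size of the formula, in contrast to the truth table with its 2^n rows.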
Figure 2.5: The set of computational problems is divided into classes of different
complexity (shown: all computational problems, NP, NPC).
Real life problems? Are problems in real life hard? Yes, they are, as one
can see in the following example.
Example 2.6.5 (Sudoku). Sudoku can be reduced to the satisfiability of a CNF.
As one has to find a solution satisfying all conditions on
the rows, columns and subboxes, it is a problem of the form
\[ (\text{condition 1}) \land (\text{condition 2}) \land \ldots \tag{2.22} \]
Therefore it is in CNF. To find a solution is just the same as proving satisfiability.
The satisfiability problem for a CNF is in NPC. Note that the reduction above by itself
only shows that Sudoku is no harder than this problem; that the Sudoku problem
(generalized to grids of size $n^2 \times n^2$) is in NPC as well requires a reduction in the
opposite direction.
Actually this is a common structure of real life problems, such as flight
plans or schedules. Usually there is a number of necessary conditions, yielding
a formula of the form (2.22). Each condition can be met in various ways. So
the subformulas are disjunctions. Thus one is usually looking for an assignment
satisfying a CNF.
$\{A\}$   $\{\lnot A, B\}$   $\{\lnot B, C\}$   $\{\lnot C, D\}$   $\{\lnot D, E\}$   $\{\lnot E\}$
$\{B\}$
$\{C\}$
$\{D\}$
$\{E\}$
$\square$
In the last example the claim was somewhat intuitive, as $A$ and $E$ were connected
by a chain of implications. Another example is the following
Example 2.7.2. The task is to prove the semantic conclusion
\[ \{A \lor B, \; A \rightarrow C, \; B \rightarrow C\} \models C \]
\[ (A \lor B) \land (\lnot A \lor C) \land (\lnot B \lor C) \land \lnot C \]
$\{A, B\}$   $\{\lnot A, C\}$   $\{\lnot B, C\}$   $\{\lnot C\}$
$\{\lnot B\}$
$\{A\}$   $\{\lnot A\}$
$\square$
Figure 2.7:
tA, B, Cu t A, Eu t C, D, Eu tCu t D, Cu
tB, C, Eu tD, Eu t Du
tB, Eu tEu
tBu t Au
Figure 2.8:
Remark 4. Summarizing the above:
• If a formula in CNF is satisfiable then resolution yields a model for this
formula.
• If a formula in CNF is unsatisfiable then resolution can show this.
A programming language based on the resolution calculus is Prolog.
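The refutation of Example 2.7.2 can be reproduced with a naive saturation procedure. This is a minimal sketch, not an efficient prover: clauses are sets of string literals, a leading "~" marks negation, and all names are chosen here for illustration.

```python
from itertools import combinations

def neg(lit):
    """Complement of a literal: A <-> ~A."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary literal."""
    return [frozenset((c1 - {lit}) | (c2 - {neg(lit)}))
            for lit in c1 if neg(lit) in c2]

def unsatisfiable(clauses):
    """Saturate under resolution; report whether the empty clause is derived."""
    known = {frozenset(c) for c in clauses}
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True      # empty clause: contradiction found
                new.add(r)
        if new <= known:
            return False             # nothing new can be derived
        known |= new

# Example 2.7.2: A or B, A -> C, B -> C, together with the negated claim ~C.
assert unsatisfiable([{"A", "B"}, {"~A", "C"}, {"~B", "C"}, {"~C"}])
# Without the negated claim the clause set is satisfiable.
assert not unsatisfiable([{"A", "B"}, {"~A", "C"}, {"~B", "C"}])
```

The procedure terminates because only finitely many clauses exist over a finite set of literals, although their number can grow exponentially.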
Chapter 3
Set theory
After having introduced logic we will now turn to the second pillar of mathematics,
namely set theory. As mentioned in the introduction, all mathematical
objects are sets. For instance the natural numbers can be defined inductively
by merely nesting sets of the empty set.
\begin{align*}
0 &:= \emptyset \\
1 &:= \{0\} = \{\emptyset\} \\
2 &:= \{0, 1\} = \{\emptyset, \{\emptyset\}\} \\
&\;\;\vdots \\
n &:= \{0, 1, \ldots, n - 1\}
\end{align*}
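The nesting can be reproduced with Python's immutable sets. The sketch below uses the successor step $k + 1 := k \cup \{k\}$, which yields exactly $n = \{0, 1, \ldots, n - 1\}$; the function name `von_neumann` is chosen here for illustration.

```python
def von_neumann(n):
    """Return the set n = {0, 1, ..., n-1}, built from the empty set only."""
    current = frozenset()               # 0 := {}
    for _ in range(n):
        current = current | {current}   # successor: k + 1 := k ∪ {k}
    return current

assert von_neumann(0) == frozenset()
assert von_neumann(2) == frozenset({frozenset(), frozenset({frozenset()})})
assert all(len(von_neumann(k)) == k for k in range(6))
assert von_neumann(2) in von_neumann(3)   # 2 is an element of 3
```

In this encoding the number n is a set with exactly n elements, and "smaller than" coincides with "element of".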
notation is misleading as there formally only exist sets and sets of sets. So also elements are
sets and may contain other elements. Having said this we will use the very notation as well.
2 The same expression might be used for subsets defined below.
3.1.1 Cantor's paradise
In an attempt to formally define sets, Cantor proposed the following definition.
Definition 3.2 (Cantor's naive approach). Any collection of distinguishable
objects is a set. An object can be in any set, in particular a set can contain
itself as a member.
Unfortunately this definition of a set is not logically consistent, as Russell's
antinomy shows. One can group all sets that contain themselves into one set and
all sets that do not contain themselves into another. Let
\[ M := \{B \mid B \notin B\} \]
be the set of sets that do not contain themselves. The question is now: does $M$
contain itself or not? If $M$ contains itself, then by the defining property $M$ is not an
element of $M$. If $M$ does not contain itself, then it satisfies the defining property and thus
is an element of $M$. Either way we arrive at a contradiction, and therefore Cantor's approach
is not theoretically sound.
The axiom of choice mentioned above states that, given any collection of
bins each containing at least one element, it is possible to select
exactly one object from each bin. If there is a distinguishing property for the
elements in a bin, one can single out an element without the axiom of choice,
whereas the axiom is needed to choose an element from a set of indistinguishable
elements. Though the axiom of choice seems somewhat intuitive, it can have
quite strange implications. It leads for instance to the Banach-Tarski paradox.
Nonetheless the axiom of choice is widely used and an important cornerstone of
set theory.
Subsequently we will look at some of the axioms in greater detail. But before
doing so we will introduce some concepts from predicate logic.
Definition 3.3 (Quantifiers). In order to express that a statement holds for all
cases of a kind one uses the universal quantifier $\forall$. If a statement holds for at
least one instance of a kind the existential quantifier $\exists$ is used.
Example. In terms of quantifiers the statement "all natural numbers are non-negative" is
\[ \forall x \in \mathbb{N} \, (x \geq 0) \]
read "for all $x$ in $\mathbb{N}$: $x$ is greater than or equal to zero". Similarly the statement
"there exists a natural number greater than 100" can be written as
\[ \exists x \in \mathbb{N} \, (x > 100) \]
We will now use quantifiers to formulate the axioms.
Axiom 1 (Extensionality axiom). The extensionality axiom states that two sets
are equal iff they contain the same elements. The definition of equality yields:
if $A$ and $B$ are equal, $A = B$, then $A$ and $B$ contain the same elements.
So we only need to include the converse implication in the axiom: any two sets
$A$ and $B$ containing the same elements are equal. Using logical quantifiers this
reads
\[ \forall A \, \forall B \, (\forall x \, (x \in A \leftrightarrow x \in B) \rightarrow A = B) \]
From the extensionality axiom it follows that any set is uniquely determined
by its elements.
Example 3.1.1. As the following sets contain the same elements they are equal.
Note that $a$ and the set containing $a$, i.e. $\{a\}$, are not equal.
Predicates, as defined below, yield subsets as follows from the subsequent
axiom.
Definition 3.4 (Predicate). For a given set $A$ a predicate is a function
$P : A \rightarrow \{\text{false}, \text{true}\}$. $P$ can also be regarded as a property that is shared
by all $x \in A$ for which $P(x) = \text{true}$.
Axiom 2 (Sets from predicates). Given a set $A$ and a predicate $P$ on $A$, the
collection of all elements that have the property $P$ (i.e. for which $P$ is true)
is another set.
Implicitly we introduced the notation $\{\, \cdot \mid \cdot \,\}$ for sets defined by predicates.
Example 3.1.2. One can define a predicate on the natural numbers to be true
iff the argument is less than or equal to 10. This yields the set
\[ A := \{x \in \mathbb{N} \mid x \leq 10\} \]
\[ B := \{x \in A \mid \mathrm{Prime}(x)\} = \{2, 3, 5, 7\} \]
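Set-builder notation carries over almost literally to set comprehensions. In the sketch below a finite range stands in for the natural numbers, and `is_prime` is a helper written for this illustration.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))

N_PREFIX = range(0, 100)   # a finite stand-in for the natural numbers

A = {x for x in N_PREFIX if x <= 10}   # predicate: x <= 10
B = {x for x in A if is_prime(x)}      # predicate on A: Prime(x)

assert B == {2, 3, 5, 7}
assert B <= A              # every element of B is an element of A
```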
\[ A \subseteq B \;:\Leftrightarrow\; \forall x \, (x \in A \rightarrow x \in B) \]
3 The colon indicates a definition and is put on the side of the equivalence that is to be
defined.
Figure 3.1: The intersection and the union of two sets
\[ x \in A \cap B \;:\Leftrightarrow\; x \in A \land x \in B \]
Note that this is not a new axiom. Any intersection can just be written in
terms of predicates
\[ A \cap B = \{x \in A \mid x \in B\} \]
The union of two sets cannot be written in terms of predicates. We thus
require another axiom.
Axiom 3 (Union). Given two sets $A$ and $B$, their union $A \cup B$, containing all
elements of $A$ as well as all elements of $B$, is a set.
\[ x \in A \cup B \;:\Leftrightarrow\; x \in A \lor x \in B \]
Figure 3.2: The difference and the symmetric difference of two sets
\[ x \in A \setminus B \;:\Leftrightarrow\; x \in A \land x \notin B \]
The definitions above show the close relation between set theory and logic.
Using the logical connectives we can define corresponding connectives for sets.
In this sense the power set is exceptional. It has no counterpart in logic.
Definition 3.9 (Power set). The power set of a set $A$ is the set of all subsets
of $A$
\[ x \in \mathcal{P}(A) \;:\Leftrightarrow\; x \subseteq A \]
In particular the power set contains the empty set and $A$ itself.
The power set is also denoted $2^A$, as the cardinality, i.e. the number of
elements, of the power set is just
\[ |\mathcal{P}(A)| = 2^{|A|} \]
if the set is finite. To see this, we can consider the number of all predicates.
For finite sets each subset corresponds to a predicate. Every predicate can be
seen as a bit string of length $|A|$, with the $i$-th bit being 1 iff $P(x_i) = \text{true}$. As
there are $2^{|A|}$ different such strings, there are just as many subsets.
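The counting argument translates directly into an enumeration: every bit string of length $|A|$ selects one subset. A minimal sketch, with `power_set` as an illustrative name:

```python
from itertools import product

def power_set(elements):
    """Enumerate subsets via bit strings: bit i says whether elements[i] is included."""
    elements = list(elements)
    return [
        frozenset(x for x, bit in zip(elements, bits) if bit)
        for bits in product([0, 1], repeat=len(elements))
    ]

subsets = power_set({1, 2, 3})
assert len(subsets) == 2 ** 3
assert len(set(subsets)) == 8                       # all subsets are distinct
assert frozenset() in subsets and frozenset({1, 2, 3}) in subsets
```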
Definition 3.10 (Complement). If a set $A$ is defined as a subset of some
larger set $U$, usually called the universe, then the complement of $A$ is
defined as
\[ \bar{A} := U \setminus A. \]
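Families of sets Any countable union or intersection of sets yields another
set. So for a family of sets $A_i$ with an index $i \in I$ (the countable index set) we
define
\[ x \in \bigcup_{i \in I} A_i \;:\Leftrightarrow\; \exists i \in I \, (x \in A_i) \]
\[ x \in \bigcap_{i \in I} A_i \;:\Leftrightarrow\; \forall i \in I \, (x \in A_i) \]
\[ A \cap A = A \]
\[ A \cap (A \cup B) = A \]
\[ \overline{(A \cup B)} = \bar{A} \cap \bar{B} \]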
[Figure: the ordered pairs $(x, y)$ and $(y, x)$ as two different points in the plane]
Let us verify whether this definition really entails the equality property for
two points $(a, b)$ and $(c, d)$ mentioned above. The ordered pairs correspond by
definition to the sets
\[ \{\{a\}, \{a, b\}\} \qquad \{\{c\}, \{c, d\}\} \]
If the two sets are equal, then necessarily $a$ and $c$ have to be equal. Therefore,
as also the elements $\{a, b\}$ and $\{c, d\}$ have to be equal, $b$ and $d$ have to be equal.
A special case of an ordered pair is the one containing twice the same element.
Then the set is equal to the set containing the set containing the element, $(a, a) = \{\{a\}\}$.
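The equality property can be tested exhaustively on a small range of values. The sketch below encodes the pair $(a, b)$ as $\{\{a\}, \{a, b\}\}$ with immutable sets; the function name `pair` is chosen here for illustration.

```python
def pair(a, b):
    """Encode the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

assert pair(1, 2) != pair(2, 1)                         # order matters
assert pair(1, 2) == pair(1, 2)
assert pair(3, 3) == frozenset({frozenset({3})})        # (a, a) = {{a}}

# Equality of pairs forces equality of components (checked on a small range).
vals = range(4)
assert all((pair(a, b) == pair(c, d)) == (a == c and b == d)
           for a in vals for b in vals for c in vals for d in vals)
```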
Having clarified the notion of an ordered pair we can now define the Cartesian
product of two sets.
Definition 3.12 (Cartesian Product). Given two sets $A$ and $B$, their Cartesian
product is defined as the set containing all ordered pairs
\[ A \times B := \{(a, b) \mid a \in A \land b \in B\} \]
So if one of the two is the empty set, the Cartesian product is the empty set
\[ A \times \emptyset = \emptyset \times B = \emptyset \]
Generally, if neither $A$ nor $B$ is empty and they are not equal, $A \neq B$, then
the Cartesian product is not commutative, i.e.
\[ A \times B \neq B \times A \]
Figure 3.4: The shaded area to the left is the Cartesian product $[0, 2]^2$, the one
to the right the difference of two products $[0, 2]^2 \setminus [0, 1]^2$.
Figure 3.5: The area above the diagonal corresponds to the order relation $\leq$.
\[ [0, 2] := \{x \in \mathbb{R} \mid x \geq 0 \land x \leq 2\} \]
the Cartesian product $[0, 2]^2 = [0, 2] \times [0, 2]$ is a square with side length 2, as shown
in the left part of Figure 3.4. Another subset of $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$ is $[0, 2]^2 \setminus [0, 1]^2$, shown
in the right part of Figure 3.4.
Example 3.1.5 (Order relation). Using the order relation $\leq$ we can define the
following subset
\[ R_{\leq} := \{(x, y) \in \mathbb{R}^2 \mid x \leq y\} \]
pictured in Figure 3.5. So far order relations were not formally defined. We
will make up for that in the next section, taking the inverse approach. We will
define $\leq$ using the set $R_{\leq}$.
3.2 Relations
Before turning to order relations we define binary relations more generally as
follows
\[ R \subseteq A \times B \]
Note that generally relations are directed. Eventually this will lead to the
definition of functions. In the special case that $A = B$, a relation $R$ is called a
relation on $A$. For a pair $(a, b) \in R$ we write $aRb$.
We can now apply the set calculus from above to relations, as done in the
subsequent examples.
Example 3.2.1. The intersection of the relations $\leq$ and $\geq$ is equal to the equality
relation. Set operations are indicated by blue, relations by red.
Figure 3.6: Merely the diagonal line with $x = y$ is contained in both relations.
Similarly the symmetric difference of the two yields the inequality relation.
The complement of $\leq$ is $>$.
Example 3.2.2. We will now introduce relations on the integers that
are of particular interest later in number theory and cryptography. The integers
comprise the natural numbers and their negatives.
\[ \mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\} \]
\[ a \mid b \;:\Leftrightarrow\; \exists c \in \mathbb{Z} : a \cdot c = b \]
Divisibility is a binary relation on $\mathbb{Z}$
\[ \mid \;\subseteq\; \mathbb{Z} \times \mathbb{Z} = \mathbb{Z}^2 \]
\[ \leq_{\mathbb{Z}} \;=\; \leq_{\mathbb{R}} \cap \; \mathbb{Z}^2 \]
\[ a \equiv b \pmod{m} \;:\Leftrightarrow\; m \mid (a - b) \]
i.e. two integers $a$ and $b$ are congruent modulo $m$ if the difference of the
two is divisible by $m$. This is equivalent to saying that the integer division of each of
$a$ and $b$ by $m$ yields the same remainder. For instance 3 and 5 are congruent
modulo 2, while 3 and 4 are not
\[ 3 \equiv 5 \pmod{2} \qquad 3 \not\equiv 4 \pmod{2} \]
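Both definitions are short enough to write out and compare with the remainder characterization. A minimal sketch; the function names are chosen here for illustration, and the remainder comparison relies on Python's `%` returning a non-negative result for a positive modulus.

```python
def divides(a, b):
    """a | b  iff  there is an integer c with a * c == b."""
    return b == 0 if a == 0 else b % a == 0

def congruent(a, b, m):
    """a is congruent to b modulo m  iff  m | (a - b)."""
    return divides(m, a - b)

assert congruent(3, 5, 2)
assert not congruent(3, 4, 2)

# For m > 0 this agrees with comparing the remainders of a and b.
assert all(congruent(a, b, 7) == (a % 7 == b % 7)
           for a in range(-20, 21) for b in range(-20, 21))
```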
m X n lcmpm.nq
2 X 2 2 2 X 3 6
there exist integers $m, n \in \mathbb{Z}$ with $a \cdot m = c = b \cdot n$. For two distinct prime numbers the least common
multiple is simply their product, $c = a \cdot b$.
5 Strictly speaking not all of them are order relations. This will become apparent once
Figure 3.7: The graph shows the divisibility relation on the set $A$. 1 divides all other
numbers and is thus the root of the directed graph. Its direct neighbours are the
prime numbers. Finally the integers 4 and 6 are connected to their prime factors.
So intuitively the ratio of equivalent pairs has to be the same. As division
on the integers is slightly more complicated, we wrote it in the form of a product. The
relation yields the following equivalence classes
This leads to the definition of the rational numbers as the set containing the
equivalence classes above,
\[ \mathbb{Q} := \mathbb{Z} \times (\mathbb{Z} \setminus \{0\}) / \sim \]
Figure 3.8: The graph on the left shows the divisibility relation on the set $B$. 1 divides all
other numbers and is thus the root of the directed graph. Its direct neighbours
are the prime numbers. Now instead of building further bottom up one can construct
the next level top down by dividing 30 by all prime numbers in $B$. Finally one
connects the two graphs. (The left graph ranges from 1 to 30, the right one from $\emptyset$ to $\{1, 2, 3\}$.)
Figure 3.9: The graph of the subset relation on the power set $\mathcal{P}(\{1, 2, 3, 4\})$
corresponds to the 4-dimensional hypercube.
3.2.1 Representation of relations
Relations on finite sets $A$ and $B$ can be represented by either binary matrices
or bipartite graphs.
In the matrix representation each row corresponds to an element in $A$ and
each column to an element in $B$. The entry is then 1 iff the corresponding pair
$(a, b)$ is in $R$.
\[
\begin{array}{c|cccc}
       & b_1 & b_2 & \cdots & b_n \\ \hline
a_1    & 1   & 0   & \cdots & 1 \\
a_2    & 0   & 1   & \cdots & 0 \\
\vdots &     &     &        &   \\
a_m    & 0   & 0   & \cdots & 1
\end{array}
\]
In the bipartite graph the nodes on the left correspond to elements in $A$, the
ones on the right to elements in $B$. The nodes are linked iff $(a, b) \in R$.
[Figure: bipartite graph with nodes $a_1, a_2, a_3$ on the left and $b_1, b_2, b_3$ on the right]
Later we will take the inverse approach and formally define graphs using
relations.
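Both representations are easy to produce from a relation given as a set of pairs. The sketch below uses a small divisibility relation as example data; the sets and the function name `relation_matrix` are chosen here for illustration.

```python
def relation_matrix(A, B, R):
    """Binary matrix of a relation R on A x B: entry [i][j] is 1 iff (A[i], B[j]) is in R."""
    return [[1 if (a, b) in R else 0 for b in B] for a in A]

A = [1, 2, 3]
B = [2, 4, 6]
divides = {(a, b) for a in A for b in B if b % a == 0}

M = relation_matrix(A, B, divides)
assert M == [[1, 1, 1],
             [1, 1, 1],
             [0, 0, 1]]

# The edge list of the bipartite graph is the relation itself.
assert sorted(divides) == [(1, 2), (1, 4), (1, 6), (2, 2), (2, 4), (2, 6), (3, 6)]
```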
Symmetry A relation R on A is symmetric if
In this case also the matrix is anti-symmetric. Note that also $<$ is anti-symmetric,
as there is merely an implication in the definition and not
an equivalence. For $<$ the condition $((a, b) \in R \land (b, a) \in R)$ is never
met.
The negations of the above are not transitive as well as the element relation
transitive
6 For a matrix $M$, $M^T$ is the transposed matrix, with entries $(M^T)_{ij} = M_{ji}$. This corresponds
to a reflection about the diagonal.
Figure 3.10: [groups labelled by year of birth: 1979, 1980, 1981]
Example 3.2.5 (Age group). Let us consider the set of all humans, $A := \{\text{humans}\}$.
Their age group is an equivalence relation⁷ formally defined as
This equivalence relation introduces a partition on the set of all humans. They
are being grouped by their year of birth.
We have mentioned several times that equivalence relations partition the set
they are defined on, without clarifying what that means. We will now formally
define a partition and investigate its connection to equivalence relations in
greater detail.
Definition 3.15 (Partition). Let $A$ be a set. A partition of $A$ is a family of
sets $(A_i)_{i \in I}$ with the following two properties.
• Their union yields $A$: $\bigcup_{i \in I} A_i = A$
• They are disjoint: $A_i \cap A_j = \emptyset \ \ \forall i, j \in I, \ i \neq j$
Theorem 3.2.1. Any equivalence relation yields a partition and vice versa. So
if $\sim$ is an equivalence relation on A then the equivalence classes
\[ [a] := \{x \in A \mid x \sim a\} \subseteq A \]
are a partition of A.
Inversely, if $(A_i)_{i \in I}$ is a partition of A, then the relation
\[ x \sim y \ :\Leftrightarrow\ \exists i \in I : x \in A_i \land y \in A_i \]
is an equivalence relation.
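The correspondence stated in the theorem can be tried out on a small finite set. The following is a minimal sketch, not part of the original notes: the helper names `classes`, `is_partition` and `in_same_block` and the example relation "same remainder modulo 3" are assumptions chosen for illustration.

```python
from itertools import product

def classes(elements, related):
    """Group the elements into equivalence classes of the relation `related`."""
    result = []
    for x in elements:
        for cls in result:
            # Comparing with one representative suffices for an equivalence relation.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            result.append({x})
    return result

def is_partition(blocks, elements):
    """Check the two properties of a partition: union yields A, blocks are disjoint."""
    union_ok = set().union(*blocks) == set(elements)
    # If the union is A, the sizes add up to |A| exactly when the blocks are disjoint.
    disjoint_ok = sum(len(b) for b in blocks) == len(set(elements))
    return union_ok and disjoint_ok

A = range(10)
same_remainder = lambda x, y: x % 3 == y % 3   # hypothetical example relation

blocks = classes(A, same_remainder)

def in_same_block(x, y):
    """The relation induced by the partition `blocks`."""
    return any(x in b and y in b for b in blocks)

print(blocks)
print(is_partition(blocks, A))
```

Going from the relation to its classes and back to the induced relation returns the relation we started with, which is the "vice versa" part of the theorem on this example.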
Proof. We will first show that for a given equivalence relation $\sim$, the equivalence
classes yield a partition. As any equivalence class $[a]$ contains at least $a$ by
reflexivity it follows
\[ A \supseteq \bigcup_{a \in A} [a] \supseteq \bigcup_{a \in A} \{a\} = A \quad\Rightarrow\quad \bigcup_{a \in A} [a] = A \]
So it remains to show that the equivalence classes are disjoint. In fact if two
equivalence classes are not disjoint, then they are equal. This follows from
transitivity as follows.
7 To show that the relation "the same age group" is an equivalence relation, one needs to
show that it satisfies the properties in the definition. Rather obviously it does so.
Consider two equivalence classes, $[x]$ and $[y]$ with a non-empty intersection.
That implies, there exists an element $z \in A$, such that $z \in [x] \cap [y]$. Therefore
$z \sim x$ and $z \sim y$ and thus by symmetry and transitivity $x \sim y$. Employing again transitivity
we obtain that any other element $x' \in [x]$ is equivalent to $y$ and thus in $[y]$.
Therefore we obtain that $[x] \subseteq [y]$. The same argument can be made the other
way around to obtain $[y] \subseteq [x]$ and thus $[x] = [y]$.
To prove the second part of the theorem we need to show that the relation
satisfies the properties of an equivalence relation. From the idempotency of the
logical AND follows the reflexivity and from its symmetry the symmetry of the
relation. Thus it remains to show transitivity.
We can now transfer the algebraic structure of $\mathbb{Z}$ to the set of equivalence classes.
At first sight it might seem strange to introduce a calculus for sets. But then,
all numbers in $\mathbb{Z}$ can also be reduced to sets. Recall the definition of the natural
numbers as nested sets of the empty set. Just as we can define an algebraic
structure on $\mathbb{N}$ we will now define a similar structure on the set of equivalence
classes
\[ \mathbb{Z}_m := \mathbb{Z}/{\sim_m} := \{[0], [1], [2], \dots, [m-1]\} \]
Addition is defined as
\[ [a] + [b] := [a + b] \]
We have chosen two particular representatives of the equivalence classes, namely
$a \in [a]$ and $b \in [b]$, and used these to define a new equivalence class, $[a + b]$,
as the result of the addition. This operation should, however, be independent of
the choice of the representatives. That is, for any other pair of representatives,
$a' \in [a]$ and $b' \in [b]$, the sum should be in the corresponding equivalence class,
$a' + b' \in [a + b]$. If so, the addition is well-defined. To show that this is actually
the case we note that any $a' \in [a]$ differs from $a$ only by an integer multiple
of $m$. The same holds for $b' \in [b]$.
\[ a' = a + k \cdot m \qquad b' = b + l \cdot m \qquad k, l \in \mathbb{Z} \]
8 Each semantic is given by a truth table with $2^n$ entries. Thus there are $2^{(2^n)}$ different
truth tables, each representing an equivalence class.
Therefore the sum of $a'$ and $b'$ turns out to be
\[ a' + b' = a + b + (k + l) \cdot m \]
and differs just by an integer multiple of $m$ from $a + b$. Therefore it is in the same
equivalence class.
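The independence of the representatives can also be checked numerically. This is a small sketch under the assumption that a class $[a]$ is represented by its remainder `a % m`; the function names `cls` and `add_classes` are made up for this illustration.

```python
def cls(a, m):
    """Canonical representative of the class [a] in Z_m (a value in 0..m-1)."""
    return a % m

def add_classes(a, b, m):
    """[a] + [b] := [a + b], computed from arbitrary representatives a and b."""
    return cls(a + b, m)

m = 7
a, b = 3, 5
expected = add_classes(a, b, m)

# Replace a and b by other representatives a' = a + k*m and b' = b + l*m.
for k in range(-3, 4):
    for l in range(-3, 4):
        a1, b1 = a + k * m, b + l * m
        assert add_classes(a1, b1, m) == expected

print(expected)
```

Every choice of $k$ and $l$ lands in the same class, as the computation $a' + b' = a + b + (k + l) \cdot m$ predicts.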
Figure 3.11: The figure on the left shows an order with just one maximum, the
greatest element. The figure on the right shows an order with two maxima, and
no unique greatest element.
$\nexists y \in A : y \geq x \land y \neq x$
$\forall y \in A : x \geq y$
Note that there are possibly multiple maxima whereas the greatest element is
unique as schematically shown in figure 3.11.
Example 3.2.9. Let's consider the set $A = \{1, 2, 3\}^2$ first with the order
This order is not total, as the pairs $(1, 2)$ and $(2, 1)$ are not comparable. Another
order on the same set is
3.3 Functions
After having introduced the notion of relations we will now introduce functions
as a particular kind of relations.
Definition 3.17 (Function). A relation $f \subseteq A \times B$ is a functional relation or
just a function, written $f : A \to B$, if it satisfies the following properties.
$\forall a \in A\ \exists b \in B : (a, b) \in f$ (Existence)
Figure 3.12: The first order relation yields the Hasse diagram shown here. From
the diagram it becomes evident that the order is not total.
[Diagram labels, from top to bottom: $(3, 3)$, the greatest element; $(2, 3)$ and $(3, 2)$; $(1, 2)$ and $(2, 1)$.]
Figure 3.13: The lexicographic order has a linear Hasse diagram as it is a total
order.
[Diagram labels, from top to bottom: $(3, 3)$, $(3, 2)$, $(3, 1)$, $(2, 3)$, ...]
$\forall a \in A\ \exists! b : (a, b) \in f$
$f(a) = b \qquad f : a \mapsto b$
After having specified the notion of a function, we now introduce some im-
portant properties.
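Definition 3.18. We distinguish the following characteristics of functions
A function $f : A \to B$ is called injective or one-to-one if
$\forall a, a' \in A : f(a) = f(a') \Rightarrow a = a'$
It is called surjective or onto if
$\forall b \in B\ \exists a \in A : f(a) = b$
It is called bijective if it is both injective and surjective.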
Figure 3.16: The first function has a collision and is thus not injective. The
second has an image that is not equal to B and is thus not surjective. The third
shows a bijective function.
Figure 3.18: Injective maps map the first set into the second and vice versa.
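The relation $\preceq$ is an order relation as shown below and thus yields a hierarchy
of sets. First the relation is reflexive, as the identity function
\[ \mathrm{id} : A \to A \qquad \mathrm{id}(a) = a \]
\[ A \preceq B \land B \preceq A \ \Rightarrow\ A \cong B \qquad (3.1) \]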
It is not intuitively clear why this should hold. An injective map from A to B
maps A onto a subset of B. So we might get an intertwined mapping as shown in
figure 3.18.
As it turns out, (3.1) is actually true. This is the statement of the subsequent
theorem.
Theorem 3.3.1 (Cantor-Schröder-Bernstein). For any two sets A and B it
holds
\[ A \preceq B \land B \preceq A \ \Rightarrow\ A \cong B \]
We will now sketch the idea of the proof with a simple analogy.
Proof. Imagine a park (associated with a set A) with a house (associated with
a set B) as shown in figure 3.19. In the house there is a map of the park. If we
look close enough, we see the house on the map again. This house on the map
in turn contains a map of the park containing a house containing. . . .
As the house is inside the park, i.e. $B \subseteq A$, there exists an injective function
$g : B \to A$. On the other hand the map of the park (so somehow the park
itself)^{11} is inside the house and there exists an injective map $f : A \to B$.
10 A formal proof of this statement could be done by contradiction. If $g \circ f$ was not injective
there would exist elements $a \neq a'$ s.t. $g(f(a)) = g(f(a'))$. Employing the injectivity of $f$ we
know that $f(a) =: b \neq f(a') =: b'$. Thus there exist two arguments $b$ and $b'$, $b \neq b'$, such that
$g(b) = g(b')$, yielding a contradiction with $g$ being injective.
11 There is a bijective map connecting the park and the map of the park. Thus they are in
a one-to-one correspondence.
Figure 3.19: The park, represented by the green circle, with the hut depicted
by the red square. The smaller green circle in the hut corresponds to the map
of the park.
More precisely we started off with a set A and a set B, i.e. a park and
a house separately. Only by means of the injective functions $f : A \to B$ and
$g : B \to A$ we could put the house into the park, and the park into the house
(i.e. on the map inside the house) and thus construct the infinite nesting of the
two sets.
Bijection So we can define a bijection as follows: the sets park $\setminus$ house (i.e.
$A \setminus g(B)$ or circle $\setminus$ square in figure 3.20) are mapped to their correspondent one
level below. The sets house $\setminus$ park (i.e. $B \setminus f(A)$ or square $\setminus$ circle) are mapped
to themselves (on the same level). This idea is visualized in figure 3.20.
So $\preceq$ satisfies all requirements of a partial order. Indeed it is even a total
order (even though we will not prove this here). Thus it is natural to ask
whether there is a biggest element. As shown in the theorem below, the power
set of any set is strictly bigger than the set itself. Thus assuming that there
exists a biggest set A, we know that $A \prec \mathcal{P}(A)$. This yields a contradiction. So
the answer is: there does not exist a greatest element.
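Theorem 3.3.2 (Cantor). The cardinality of any set A is not equal to the
cardinality of its power set.
\[ A \not\cong \mathcal{P}(A) \]
and thus $A \prec \mathcal{P}(A)$.
Proof. We will show that no function
\[ f : A \to \mathcal{P}(A) \]
is surjective. For every $a \in A$ we have
\[ a \in f(a) \lor a \notin f(a) \]
so we can consider the set
\[ B := \{a \in A \mid a \notin f(a)\} \subseteq A \]
We now show by contradiction that there does not exist an element $b \in A$ such
that $f(b) = B$. Assume such a $b$ would exist. If $b \in f(b)$, then $b \notin B = f(b)$.
Conversely, if $b \notin f(b)$, then $b \in B = f(b)$.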
Figure 3.20: The bijective map from A to B.
This is a contradiction. So there is no surjective function from A to its power
set and therefore no bijection. Thus the two are not equal.
There is a simple injective map from A to its power set $\mathcal{P}(A)$, mapping all
elements in A to the corresponding one-element sets.
\[ f : A \to \mathcal{P}(A) \qquad f : a \mapsto \{a\} \]
Chapter 4
Combinatorics
Divide and conquer One way to find a solution is to reduce the problem
step by step to a base case. To do so, let us consider how to calculate the answer
if we knew it for some previous point $B'$. Let's assume that we want to calculate
the number of paths $x$, while in the previous points the number of paths is $a$
respectively $b$ as shown in figure 4.2. Then the number of shortest paths to B
is just the sum of $a$ and $b$. Repeating the argument for further preceding points
Figure 4.1: Some of the shortest paths are indicated by the colored paths.
Figure 4.2: In the figure to the left the variables a, b, c, d, e and x refer to the
number of paths from A to the corresponding points. Note that they do not
refer to the coordinates. The figure on the right shows the number of shortest
paths from A to its neighboring points.
[Right panel, number of shortest paths per grid point:]
1  1  1  1
1  2  3  4
1  3  6 10
1  4 10 20
Figure 4.3: Pascal's triangle. The diagonals are labeled by k, the horizontal
levels by n.
n = 0:  1
n = 1:  1 1
n = 2:  1 2 1
n = 3:  1 3 3 1
n = 4:  1 4 6 4 1
yields
\[ x = a + b, \qquad a = c + d, \qquad b = d + e \qquad (4.1) \]
In this manner we can reduce the computation of the number of shortest paths
until we reach the upper edge or the edge to the left. For both of these edges
there is only one shortest path. Thus we can compute the number of shortest
paths for all crossings in the grid by summing, starting from the edges with just
one path, as indicated by the red arrows in the right figure in 4.2. The pattern of
numbers that emerges is the Pascal triangle shown in figure 4.3 rotated by 45°.
The numbers in the Pascal triangle are constructed the same way, by setting the
outer nodes to 1 and then adding the two upper neighbors. The diagonal levels are
labeled by k, the horizontal levels are labeled by n. The entries of the triangle
are denoted by
\[ \binom{n}{k} \]
This entity is usually called "n choose k" or the binomial coefficient. Writing
the observation that the number of paths can be calculated from the number of
paths ending in previous nodes (i.e. equation 4.1), in terms of n and k yields
the following recursive formula
\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \qquad (4.2) \]
With the base case defined as
\[ \binom{n}{0} := \binom{n}{n} := 1 \]
the binomial coefficient is completely characterized as it can be computed for
any n and k by building up the triangle employing the recursion relation in
equation 4.2.
We can now write the number of shortest paths by setting $n = d + r$ and
$k = r$ to obtain
\[ x = \binom{d + r}{r} \]
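The divide-and-conquer argument translates directly into a recursive function. The sketch below is an illustration only; the function name `paths` and the convention that the arguments are the number of remaining steps down `d` and to the right `r` are assumptions.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def paths(d, r):
    """Number of shortest paths that use d steps down and r steps to the right."""
    # On the upper and the left edge there is only one shortest path.
    if d == 0 or r == 0:
        return 1
    # Otherwise the last step came either from above or from the left.
    return paths(d - 1, r) + paths(d, r - 1)

for d in range(4):
    print([paths(d, r) for r in range(4)])

# The recursion agrees with the binomial coefficient "d + r choose r".
print(paths(3, 3), comb(6, 3))
```

The printed rows reproduce the grid of figure 4.2, and the last line compares the recursive count with `math.comb`.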
Step sequences A second way to approach the problem is to take into account
the different step sequences. We will now denote a step down with a capital
D and a step to the right with R. Any sequence of steps forming one of the
shortest paths from A to B as for instance
\[ R_1 D_1 R_2 R_3 \dots D_{d-1} D_d R_r \]
contains r steps to the right and d steps down. So at first one might think that
the number of shortest paths from A to B is the number of permutations of the
r steps R and the d steps D, i.e.^1
\[ (d + r) \cdot (d + r - 1) \cdot (d + r - 2) \cdots 2 \cdot 1 = (d + r)! \]
This is however wrong as the steps R and the steps D are indistinguishable.
Two sequences
\[ R_1 R_2 D_1 \qquad R_2 R_1 D_1 \]
with the Rs being permuted among themselves do not differ and are counted
double in the formula above. In order to fix this issue we have to divide $(d + r)!$
first by the number of permutations among the Rs and then by the number of
permutations among the Ds.
\[ x = \frac{(r + d)!}{r! \, d!} \]
Assuming that both approaches yield the same result we can now conclude for
the binomial coefficient, replacing $d + r = n$, $r = k$ and $d = n - k$, that
\[ \binom{n}{k} = \frac{n!}{k! \, (n-k)!} = \frac{n (n-1)(n-2) \cdots (n-k+1)}{k (k-1)(k-2) \cdots 2 \cdot 1} \qquad (4.3) \]
1 The number of permutations of n different or distinguishable elements is $n!$. To understand
this imagine we put the n elements one by one into an order. So for the first element we can
choose among n elements, for the second among $n - 1$ and so on and so forth. We obtain therefore
$n \cdot (n-1) \cdots 2 \cdot 1 = n!$ different orders (or permutations).
The example above provided us with two formulas for the binomial coefficient.
Let's now verify that the formula in equation 4.3 actually holds and the
two yield the same result.
So we have to show that the right-hand side satisfies the base conditions and
the recursion relation. Let's first compute the base cases. By convention $0! = 1$.
Thus
\[ \frac{n!}{0! \, n!} = 1 \]
It remains to verify the recursion relation.
\[
\begin{aligned}
\frac{n!}{k! \, (n-k)!} + \frac{n!}{(k-1)! \, (n-k+1)!}
 &= \frac{n!}{(k-1)! \, (n-k)! \, k} + \frac{n!}{(k-1)! \, (n-k)! \, (n-k+1)} \\
 &= \frac{n! \, (n - k + 1 + k)}{k! \, (n-k+1)!}
  = \frac{(n+1)!}{k! \, (n+1-k)!}
\end{aligned}
\]
This yields the recursion relation (note that we merely shifted the index relative
to equation 4.2). Therefore the recursive and the direct characterization of
the binomial coefficient are consistent with one another and simply yield two
different ways to compute the same number. In the following we will introduce
two common interpretations and applications of the binomial coefficients.
Binomial coefficient Let's phrase the question in the last paragraph the
other way around: How did "n choose k" get the label binomial coefficient?
The name stems from the following relation to calculus. If we want to calculate
the n-th power of the sum of two variables we usually want to turn the product
of a sum into a sum of products
\[ (x + y)^n = (x + y) \cdot (x + y) \cdots (x + y) = \underbrace{(x \cdot x \cdots x)}_{x^n} + \underbrace{(y \cdot x \cdots x)}_{y x^{n-1}} + \dots + y^n \]
Figure 4.4: Urn with three elements.
We have to multiply one variable from each of the brackets. So there are $2^n$
summands containing each n factors on the right. These summands could also
be regarded as n-bit strings with 0 corresponding to x and 1 to y. There is
a one-to-one relation between n-bit strings and the subsets of an n-element set.
The i-th element is an element of a subset iff the i-th bit of the corresponding
string is 1. So we obtain a one-to-one correspondence between the summands
and the subsets of a finite set with n elements. We will refer to this set as A.
The summand $x^n$ corresponds to the empty set, $y \cdot x \cdots x$ to the set containing
the first element $a_1 \in A$, i.e. $\{a_1\} \in \mathcal{P}(A)$, and $y^n$ to the set containing all n
elements, i.e. A itself.
While the sets $\{a_1\}$ and $\{a_2\}$ are not the same, their correspondent summands
are
\[ y \cdot x \cdot x \cdots x = x \cdot y \cdot x \cdots x \]
Analogously all summands with the same number of x's and y's are equal. How
many of these summands are there containing k times x and $(n - k)$ times y?
Following the reasoning above these correspond to the n-bit strings with exactly
k ones or the k-element subsets of some n-element set. Thus there are $\binom{n}{k}$ such
summands and we obtain
\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k} \qquad (4.4) \]
Ordered, with repetition After each draw the element is put back into
the urn. So for each draw there are n choices. As the order matters the two
combinations $(x, y)$ and $(y, x)$ are not equal, similar to the Cartesian product,
and we therefore obtain the following 9 combinations
Ordered, without repetitions Elements once drawn from the urn are not
put back again. Thus the number of choices reduces by one with each draw and
we obtain the following 6 combinations
(1, 2) (1, 3)
(2, 1) (2, 3)
(3, 1) (3, 2)
Unordered, with repetition Again we put the elements back after each
draw, but the order of the drawn combination does not matter. That is, we
regard the combinations above and below the diagonal in the matrices above as
equal. Therefore we are left with 6 combinations
(1, 2) (1, 3)
(2, 3)
We would now like to generalize from the example above to arbitrary n and
k. The matrix picture is somewhat misleading as it is merely helpful in the case
$k = 2$. One rather multiplies the number of choices as we will see.
Ordered, with repetition For each of the draws there are n choices. So the
number of combinations is
\[ \underbrace{n \cdot n \cdots n}_{k \text{ times}} = n^k \]
Ordered, without repetition If the elements are not returned to the urn,
the number of choices decreases by one with each draw. Thus the number of
combinations is
\[ n \cdot (n-1) \cdot (n-2) \cdots (n-k+1) =: n^{\underline{k}} \]
Unordered, without repetition The number of unordered combinations of
k elements without repetition corresponds to the k-element subsets of a set with
n elements. Recall that two sets are equal if they contain the same elements,
independent of the order. Thus the number of combinations is given by the
binomial coefficient
\[ \binom{n}{k} = \frac{n!}{k! \, (n-k)!} \]
Unordered, with repetition The last case, with repetition but unordered,
is slightly more complicated. A vote is an example of this case. The order of the
votes does not matter, but merely how many votes each candidate got. Imagine
3 candidates and 20 voters. Therefore $n = 3$ and $k = 20$.^2 How many different
distributions of votes are there? As the order does not make a difference, we
might as well order the votes such that the votes for the first candidate come
first and then the ones for the second etc.
with $n - 1$ separators |. In this notation the separators also indicate the votes.
Left of the first separator there are the votes for the first candidate, left of the
second separator the votes for the second candidate and so on and so forth.
We can therefore write
The combinations are merely characterized by the order of stars and separators.
To see how many arrangements there are, we take $n - 1 + k$ positions and fill
them either with stars or with separators. Then there are
\[ \binom{n - 1 + k}{n - 1} = \binom{n + k - 1}{k} \qquad (4.5) \]
Then we have to divide by the number of permutations among the stars and among the
separators as they are indistinguishable. So we obtain the binomial coefficient again.
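The four urn models correspond to four iterators in Python's standard `itertools` module, so the counting formulas can be compared with an explicit enumeration for small n and k. This is a sketch for illustration; the function name `urn_counts` and the dictionary keys are assumptions.

```python
from itertools import (product, permutations, combinations,
                       combinations_with_replacement)
from math import comb, perm

def urn_counts(n, k):
    """Enumerate all draws of k elements from an urn with n elements."""
    urn = range(1, n + 1)
    return {
        "ordered, with repetition": len(list(product(urn, repeat=k))),
        "ordered, without repetition": len(list(permutations(urn, k))),
        "unordered, without repetition": len(list(combinations(urn, k))),
        "unordered, with repetition": len(list(combinations_with_replacement(urn, k))),
    }

n, k = 3, 2
formulas = {
    "ordered, with repetition": n ** k,
    "ordered, without repetition": perm(n, k),      # n (n-1) ... (n-k+1)
    "unordered, without repetition": comb(n, k),
    "unordered, with repetition": comb(n + k - 1, k),
}
print(urn_counts(n, k))
print(formulas)
```

For the vote with 3 candidates and 20 voters the enumeration of ordered draws would be far too large, but formula 4.5 can still be evaluated directly as `comb(3 + 20 - 1, 20)`.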
Figure 4.5: The elements in the intersection are counted twice and have to be
subtracted again.
\[ A_i \cap A_j = \emptyset \quad \forall i \neq j \]
The size of the union of mutually disjoint sets is the sum of the sizes of the sets.
\[ \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{i=1}^{n} |A_i| \]
Product rule For any family of sets $(A_i)_{i=1,\dots,n}$, whether disjoint or not, the
size of their Cartesian product is just the product of the sizes of the single sets
$A_i$.
\[ \left| A_1 \times A_2 \times \dots \times A_n \right| = \prod_{i=1}^{n} |A_i| \]
Equality rule Two finite sets A, B have the same size, if there exists a bijective
function $f : A \to B$. The function establishes a one-to-one correspondence
between the elements of A and B.
The elements that are counted twice have to be subtracted again as visualized
in figure 4.5.
Example 4.3.1 (Inclusion/Exclusion of 2 sets). How many of the numbers 1, 2,
3, . . . , 100 are divisible by 2 or by 5? The numbers that are divisible by 2 form
a set $A_1$ with 50 elements, the numbers divisible by 5 a set $A_2$ with 20 elements.
Their intersection has size 10. Therefore
\[ |A_1 \cup A_2| = |A_1| + |A_2| - |A_1 \cap A_2| = 50 + 20 - 10 = 60 \]
Let us now consider the case of three sets. Again we have to subtract the
intersection of each of the pairs. But then the intersection of all three sets is
subtracted three times and thus not counted at all. So we have to finally add
it again.
Instead of considering the intersections one after the other we could as well
repeatedly apply the case of two sets from above. This yields
as above. This step turns out to be crucial in the proof of the general case
considered in the subsequent theorem.
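Theorem 4.3.1 (Inclusion/Exclusion). The size of a union of sets $(A_i)_{i=1,\dots,n}$
is given by
\[ \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| \qquad (4.6) \]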
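Formula 4.6 can be evaluated literally for small families of finite sets and compared with the size of the actual union. The following sketch is only an illustration; the function name `union_size` and the third set $A_3$ (numbers divisible by 3) are assumptions added for the example.

```python
from itertools import combinations

def union_size(sets):
    """Size of the union of `sets`, computed with inclusion/exclusion."""
    n = len(sets)
    total = 0
    for r in range(1, n + 1):
        sign = (-1) ** (r - 1)
        # Inner sum over all index choices 1 <= i_1 < ... < i_r <= n.
        for chosen in combinations(sets, r):
            total += sign * len(set.intersection(*chosen))
    return total

numbers = range(1, 101)
A1 = {x for x in numbers if x % 2 == 0}
A2 = {x for x in numbers if x % 5 == 0}
A3 = {x for x in numbers if x % 3 == 0}   # hypothetical third set

print(union_size([A1, A2]), len(A1 | A2))
print(union_size([A1, A2, A3]), len(A1 | A2 | A3))
```

The number of summands grows like $2^n$, so this direct evaluation is only practical for a handful of sets.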
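The last line is of the form of equation 4.6. To see this we will expand the
equation 4.6 with $(n + 1)$. First we split off the term $r = 1$, then all the terms
with $i_r = n + 1$.
\[
\begin{aligned}
& \sum_{r=1}^{n+1} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n+1} \left| \bigcap_{k=1}^{r} A_{i_k} \right| \\
&= \sum_{k=1}^{n} |A_k| + |A_{n+1}| + \sum_{r=2}^{n+1} \dots \\
&= \sum_{k=1}^{n} |A_k| + |A_{n+1}|
   + \sum_{r=2}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| \\
&\quad + \sum_{r=2}^{n+1} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_{r-1} \le n} \left| \bigcap_{k=1}^{r-1} A_{i_k} \cap A_{n+1} \right| \\
&= \sum_{k=1}^{n} |A_k| + |A_{n+1}|
   + \sum_{r=2}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| \\
&\quad - \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \cap A_{n+1} \right| \\
&= |A_{n+1}|
   + \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right| \\
&\quad - \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \cap A_{n+1} \right|
\end{aligned}
\]
In the penultimate step the index r in the last term was shifted. The emerging
factor $(-1)$ was taken in front of the sum. In the last step the sum $\sum_{k=1}^{n} |A_k|$
was absorbed into the third term as $r = 1$.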
Even though the right-hand side of the equation 4.6 looks pretty complicated
it can be helpful as for instance in the example below.
Example 4.3.2 (Opera). During an opera evening all the coats in the cloakroom
get disordered and the guests get back a random coat afterwards. Is it more
likely that at least one person gets back his or her own coat or that nobody
gets back his or her own coat? Let us first consider a simpler case with 3 guests. The
three coats in the cloakroom are permuted. If at least one person retrieves his
or her own coat after the opera the permutation of the coats has at least one
so-called fix-point. At least one of the coats is at its former position after the
permutation in the cloakroom. An example is^4
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \]
4 We write a permutation with the initial order in the first row and the final order in the
second row.
If no guest retrieves his or her own coat, then the corresponding permutation is fix-point
free (we abbreviate such a permutation by FPFP). For three coats there
are merely two FPFPs
\[ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \qquad \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \]
So the probability that nobody retrieves his or her own coat is $1/3$ while the probability
that at least one person retrieves his or her own coat is $1 - 1/3 = 2/3$. How do we
calculate this if the number of guests (and therefore the number of coats) is
significantly larger? It turns out the number of fix-point permutations can be
calculated using equation 4.6. We will write the set containing all permutations
with at least one fix-point, denoted hereafter A, as a union of simpler sets and
use inclusion-exclusion to determine the size of this union.
Let us define $A_i$ as the set containing the permutations with a fix-point at
position i. Then
\[ \bigcup_{i=1}^{n} A_i = A \]
and the size of A can be calculated using inclusion and exclusion. The size of
each of the $A_i$ is equal to the number of permutations of $(n - 1)$ elements. The
size of the intersection of two different $A_i$ is equal to the number of permutations
of $(n - 2)$ elements and so on and so forth.
\[
|A| = \left| \bigcup_{i=1}^{n} A_i \right|
    = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} \left| \bigcap_{k=1}^{r} A_{i_k} \right|
    = \sum_{r=1}^{n} (-1)^{r-1} \sum_{1 \le i_1 < i_2 < \dots < i_r \le n} (n - r)!
\]
For each r there are $\binom{n}{r}$ terms in the inner sum. Employing the equality
$\binom{n}{r} = \frac{n!}{(n-r)! \, r!}$ we obtain
\[
|A| = \sum_{r=1}^{n} (-1)^{r-1} \binom{n}{r} (n - r)!
    = n! \sum_{r=1}^{n} \frac{(-1)^{r-1}}{r!}
\]
we obtain with $x = -1$ that the number of fix-point-free permutations tends to
$n!/e$.
\[ \lim_{n \to \infty} \frac{\#\mathrm{FPFP}(n)}{n!} = \frac{1}{e} \]
So the probability that nobody will retrieve his or her own coat is $1/e \approx 1/2.71828$
and thus smaller than the probability that at least one person retrieves his or her own
coat.
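For small n the count of fix-point-free permutations can be checked by brute force against the inclusion/exclusion result $|A| = n! \sum_{r=1}^{n} (-1)^{r-1}/r!$ derived above. The sketch below is an illustration; the function names `fpfp_formula` and `fpfp_brute` are assumptions.

```python
from itertools import permutations
from math import factorial, e

def fpfp_formula(n):
    """n! minus the number of permutations with at least one fix-point."""
    with_fixpoint = sum((-1) ** (r - 1) * (factorial(n) // factorial(r))
                        for r in range(1, n + 1))
    return factorial(n) - with_fixpoint

def fpfp_brute(n):
    """Count the permutations of 0..n-1 that leave no position unchanged."""
    return sum(all(p[i] != i for i in range(n))
               for p in permutations(range(n)))

for n in range(1, 8):
    print(n, fpfp_formula(n), fpfp_brute(n), fpfp_formula(n) / factorial(n))

print(1 / e)
```

Already for moderate n the fraction of fix-point-free permutations is very close to $1/e$.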
Proof. In the base case with $k = 1$ there is merely one box containing all n
elements, with $n \geq 2$. In the induction step we prove that the assertion holds for
$k + 1$ boxes if it holds for k boxes. Let's consider the $(k + 1)$-th box separately
and distinguish the following two cases:
1. The box $(k + 1)$ contains 2 objects or more.
2. The box $(k + 1)$ contains only 1 object. Then the other $n - 1$ objects are
distributed among the remaining k boxes. If $n > k + 1$ then $n - 1 > k$ and
by induction hypothesis there is one box among these k that contains at
least two objects.
This completes the induction step and thus the proof of the principle.
The following example illustrates how the pigeonhole principle can be
employed.
Example 4.3.3 (Monotonic subsequences). If we are given a finite sequence of
distinct numbers as for instance
then a subsequence is a selection of these numbers with the order given by the
original sequence such as
\[ (17, 20, 4) \qquad (4.9) \]
A subsequence is monotonically increasing if the numbers are in increasing
order. Analogously the sequence is monotonically decreasing if the numbers are
in decreasing order. The sequence
\[ (1, 3, 20) \]
is monotonically increasing while the sequence
\[ (17, 5, 3, 2) \]
is monotonically decreasing.
The question is: for a sequence of length n is there a lower bound on the length
of the longest monotonic subsequence? In fact a sequence of length $m^2 + 1$
always has a monotonic subsequence of length $m + 1$. We will prove this statement
now by contradiction.
\[ (a_1, \dots, a_{m^2 + 1}) \]
pc, dq pc, dq
Therefore the assumption that the length of the longest monotonic subsequence
is less than or equal to m leads to a contradiction.
Is this lower bound optimal? In other words: is this lower bound tight?
Indeed, this is the case. We can come up with a sequence of length $m^2$ whose
longest monotonic subsequence has length m, as depicted in figure 4.6.
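Both claims can be checked by brute force for small m. The sketch below is an illustration only: `longest_monotonic` is a hypothetical helper, and `tight_example` builds one possible sequence of length $m^2$ (m decreasing blocks placed at increasing levels), which is an assumed construction and not necessarily the one drawn in figure 4.6.

```python
from itertools import permutations

def longest_monotonic(seq):
    """Length of the longest increasing or decreasing subsequence of distinct numbers."""
    n = len(seq)
    inc = [1] * n   # inc[j]: longest increasing subsequence ending at position j
    dec = [1] * n   # dec[j]: longest decreasing subsequence ending at position j
    for j in range(n):
        for i in range(j):
            if seq[i] < seq[j]:
                inc[j] = max(inc[j], inc[i] + 1)
            elif seq[i] > seq[j]:
                dec[j] = max(dec[j], dec[i] + 1)
    return max(inc + dec) if n else 0

def tight_example(m):
    """m decreasing blocks of length m, e.g. 3 2 1 6 5 4 9 8 7 for m = 3."""
    return [b * m + (m - i) for b in range(m) for i in range(m)]

m = 2
# Every sequence of m^2 + 1 distinct numbers has a monotonic subsequence of length m + 1.
print(all(longest_monotonic(p) >= m + 1 for p in permutations(range(m * m + 1))))
# A sequence of length m^2 whose longest monotonic subsequence has only length m.
print(tight_example(3), longest_monotonic(tight_example(3)))
```

The exhaustive check covers only the relative order of the numbers, which is all that matters for monotonic subsequences.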
ma : tb P B | pa, bq P Su S
5 The $m^2$ possible pairs correspond to the holes and the $m^2 + 1$ elements $a_i$ to the pigeons.
Figure 4.7: One can count the elements in S either row after row, or column
after column.
[Figure 4.8: dot diagram with horizontal axis k and vertical axis l, both running from 1 to 10.]
nb : ta P A | pa, bq P Su S
The subsequent example shows how it can be useful to interchange a sum over
columns with a sum over rows.
Example 4.3.4. What is the average number of divisors of a number k? To
answer this question let us introduce a function
that returns the number of divisors. So the function yields for instance
For every prime number p we obtain $\tau(p) = 2$. Having introduced this function
we can define what we meant by average. So for a given number n we are
interested in calculating
\[ \frac{1}{n} \sum_{k=1}^{n} \tau(k) \]
Summing the values of $\tau(k)$ is equivalent to counting the dots in figure 4.8 column
by column. The columns have a rather fuzzy pattern whereas the number
of points within each row $l$ is simply given by $\lfloor n/l \rfloor$. The fraction $n/l$ is rounded
to the next smaller integer value. Interchanging the sum over columns by a sum
over rows we obtain
\[ \frac{1}{n} \sum_{k=1}^{n} \tau(k) = \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor \]
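How does this number scale with n? To answer this question we will derive an
estimate on the sum above. Note first
\[ \frac{n}{l} - 1 \le \left\lfloor \frac{n}{l} \right\rfloor \le \frac{n}{l} \]
Therefore we can bound the average as follows.
\[ \sum_{l=1}^{n} \frac{1}{l} - 1 \le \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor \le \sum_{l=1}^{n} \frac{1}{l} \]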
Thus we require bounds on the sum $\sum_{l=1}^{n} 1/l$. As seen in figure 4.9 we can
approximate with functions and integrate to obtain the area below the graph.
The upper bound is then
\[ \sum_{l=1}^{n} \frac{1}{l} = 1 + \sum_{l=2}^{n} \frac{1}{l} \le 1 + \int_{1}^{n} \frac{1}{x} \, dx = 1 + \ln(n) \]
where ln is the natural logarithm with the base e. Similarly we obtain the lower
bound
\[ \sum_{l=1}^{n} \frac{1}{l} \ge \int_{0}^{n} \frac{1}{x + 1} \, dx = \ln(n + 1) \ge \ln(n) \]
Putting all this together the average of the number of divisors for numbers
between 1 and n can be estimated as
\[ \ln(n) - 1 \le \frac{1}{n} \sum_{l=1}^{n} \left\lfloor \frac{n}{l} \right\rfloor \le \ln(n) + 1 \]
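The interchange of the two sums and the logarithmic bounds can be checked numerically. This is a sketch for illustration; the name `tau` for the divisor-counting function and the helper names `divisors_by_columns` and `divisors_by_rows` are assumptions.

```python
from math import log

def tau(k):
    """Number of divisors of k (naive count)."""
    return sum(1 for l in range(1, k + 1) if k % l == 0)

def divisors_by_columns(n):
    """Count the dots column by column: sum of tau(k) for k = 1..n."""
    return sum(tau(k) for k in range(1, n + 1))

def divisors_by_rows(n):
    """Count the dots row by row: row l contains floor(n / l) dots."""
    return sum(n // l for l in range(1, n + 1))

for n in (10, 100, 1000):
    average = divisors_by_rows(n) / n
    print(n, divisors_by_columns(n) == divisors_by_rows(n),
          log(n) - 1, average, log(n) + 1)
```

The row-wise count needs only n integer divisions, while the naive column-wise count needs on the order of $n^2$ divisibility tests.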
Figure 4.10: [Figure labels: n, r, n − r, t, k − t.]
4.4.1 Symmetry
The binomial coefficient reflects the symmetry of the Pascal triangle as
\[ \binom{n}{k} = \binom{n}{n - k} \]
obtain the equality
\[ \binom{n}{k} = \sum_{t=0}^{k} \binom{r}{t} \binom{n - r}{k - t} \]
Note that $\binom{n}{k}$ is set to zero whenever $k > n$.
We have employed once more the formula with factorials and arranged the
factors in a convenient way. To upper bound the second part we have used the
expansion of the exponential (4.7). One summand is less than the entire series.
Summarizing this we obtain the bounds
\[ \left( \frac{n}{k} \right)^{k} \le \binom{n}{k} \le e^{k} \left( \frac{n}{k} \right)^{k} \qquad (4.12) \]
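and obtain
\[
\begin{aligned}
\binom{n}{k} = \frac{n!}{k! \, (n-k)!}
 &\approx \frac{1}{\sqrt{2\pi}} \sqrt{\frac{n}{k (n-k)}} \;
    \frac{\left( \frac{n}{e} \right)^{n}}{\left( \frac{k}{e} \right)^{k} \left( \frac{n-k}{e} \right)^{n-k}} \\
 &\approx \frac{n^{n}}{k^{k} (n-k)^{n-k}}
  = \frac{1}{\left( \frac{k}{n} \right)^{k} \left( \frac{n-k}{n} \right)^{n-k}}
  = \left( \frac{1}{\left( \frac{k}{n} \right)^{k/n} \left( \frac{n-k}{n} \right)^{(n-k)/n}} \right)^{n}
\end{aligned}
\]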
In the last two steps we merely reformulate the expression in a form that will
turn out to yield some insight later on.
Let's now define $x := k/n$. Thus $(n - k)/n = 1 - x$ and we can write the
estimate above as
\[ \binom{n}{k} \approx \left( x^{x} (1 - x)^{1 - x} \right)^{-n} = 2^{-n (x \log_2 x + (1 - x) \log_2 (1 - x))} \]
The function $h(x) := -x \log_2 x - (1 - x) \log_2 (1 - x)$ appearing in the exponent
is symmetric about $x = 1/2$ and becomes zero for both $x = 0$ and
$x = 1$ and one for $x = 1/2$. The graph of h is shown in figure 4.13. All this new
formalism yields the estimate
\[ \log_2 \binom{n}{k} \approx n \, h(x) \qquad (4.14) \]
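The quality of estimate 4.14 can be examined numerically. The sketch below assumes that h is the binary entropy $h(x) = -x \log_2 x - (1 - x) \log_2 (1 - x)$, which matches the exponent in the estimate above; the helper name `compare` is made up for this illustration.

```python
from math import comb, log2

def h(x):
    """Binary entropy (assumed definition of h), with h(0) = h(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def compare(n, k):
    """Return the exact value of log2(n choose k) and the estimate n * h(k / n)."""
    return log2(comb(n, k)), n * h(k / n)

for n, k in ((10, 3), (100, 30), (1000, 300), (1000, 500)):
    exact, estimate = compare(n, k)
    print(n, k, round(exact, 2), round(estimate, 2), round(exact / estimate, 4))
```

The estimate slightly overshoots, since the square-root prefactor from Stirling's formula was dropped, but the ratio approaches one as n grows.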
Figure 4.13: The set of all strings of length n with various subsets.
The single most likely string is then the zero string $0^n$ with a probability $x^n$.
If for instance $x = 0.9$ then $P(0^n)$ is 0.81 for $n = 2$ and 0.729 for $n = 3$. For
$n \to \infty$ this probability tends to zero.
Any string with equally many zeros and ones occurs with a probability
\[ x^{n/2} (1 - x)^{n/2} \]
As there are $\binom{n}{n/2}$ of these strings, the probability to draw any such string is
\[
\binom{n}{n/2} x^{n/2} (1 - x)^{n/2}
 \approx 2^{n h(1/2)} x^{n/2} (1 - x)^{n/2}
 = 2^{n} x^{n/2} (1 - x)^{n/2}
 = \left( 2 \sqrt{x (1 - x)} \right)^{n} \to 0 \quad \text{as } n \to \infty
\]
Here we have used that $\sqrt{x (1 - x)} < 1/2$ for any $x \neq 1/2$ as can be seen from
the graph shown in Fig. 4.14. This is related to the geometric fact that the
rectangle with the largest area for a given circumference is a square.
This is to say that for long strings, i.e. large n, the probability to draw a
string with equally many zeros and ones is very small.
Which strings are instead drawn with high probability? Intuitively we'd
expect the strings with $n \cdot x$ zeros and $n \cdot (1 - x)$ ones to be rather likely. The
probability to draw any such string is in fact
\[
\begin{aligned}
P[\#\text{zeros} = x \cdot n]
 &= \binom{n}{xn} x^{xn} (1 - x)^{n (1 - x)}
  = \binom{n}{xn} 2^{n (x \log_2 x + (1 - x) \log_2 (1 - x))} \\
 &= \binom{n}{xn} 2^{-n h(x)}
  \approx 2^{n h(x)} \, 2^{-n h(x)} = 1
\end{aligned}
\]
Therefore a lossless compression protocol merely encodes the roughly $\binom{n}{xn} \approx
2^{n h(x)}$ strings with $n \cdot x$ zeros and $(1 - x) \cdot n$ ones using $l = n h(x)$ bits. So the
binary entropy $h(x)$ characterizes the ultimate data compression rate or the
relative information content of a source.
4.6 Special counting problems
4.6.1 Equivalence relations
Recall from set theory that equivalence relations yield a partition of a set. The
question we are now interested in is: How many equivalence relations are there
for a given set with n elements? Let's first consider some simple cases with
small n.
$n = 1$: There is just one partition and therefore merely one equivalence
relation.
$n = 2$: There are two partitions, the one containing both elements in a single set
and the one with each element being contained in its own set.
$n = 3$: There are 5 partitions: one containing all elements in a single set, one with each
element in its own set, and three with two elements contained in the
same set.
Instead of trying to directly find a general answer to the question above, we'll
first consider a slightly simpler question: How many partitions into exactly
k sets are there? We denote this number by $S_{n,k}$. One can now deduce a
recursive formula similar to the binomial coefficient. To do so, we separate the
n-th element and distinguish the following two cases
1. The n-th element is in its own set. Then we have to find a partition
of the remaining $n - 1$ elements into $k - 1$ sets.
2. The n-th element is an element of a set containing also other elements.
Imagine the other $n - 1$ elements are partitioned into k sets. Then we
could add the n-th element to any of those k sets.
These considerations yield the following recursive formula
\[ S_{n,k} = S_{n-1,k-1} + k \cdot S_{n-1,k} \]
This relation resembles the recursion relation for the binomial coefficient. There
is merely the additional factor k. Indeed, after considering the base cases
\[ S_{0,0} = 1 \qquad S_{n,0} = 0 \ \ \forall n > 0 \qquad S_{n,n} = 1 \]
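[Triangle of the numbers $S_{n,k}$; rows are labeled by n, diagonals by k = 0, 1, 2, ...]
n = 0:  1
n = 1:  0 1
n = 2:  0 1 1
n = 3:  0 1 3 1
n = 4:  0 1 7 6 1
n = 5:  0 1 15 25 10 1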
we can build up a similar triangle, called Stirling's triangle of the second kind.
The number of all partitions is then given by the sum over row n in the
triangle.
\[ B_n := \sum_{k=0}^{n} S_{n,k} \]
There is no closed formula for this number. So to calculate the number, one has
to actually build up the triangle before calculating the sum over the row.
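Building up the triangle is straightforward with a memoized recursion. The sketch below reads the two cases above as $S_{n,k} = S_{n-1,k-1} + k \cdot S_{n-1,k}$; the function names `S` and `bell` are assumptions chosen for this illustration.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def S(n, k):
    """Number of partitions of an n-element set into exactly k sets."""
    if n == k:
        return 1            # base cases S(0, 0) = 1 and S(n, n) = 1
    if k == 0 or k > n:
        return 0            # base case S(n, 0) = 0 for n > 0, and no partition for k > n
    # Case 1: the n-th element forms its own set.
    # Case 2: the n-th element joins one of the k existing sets.
    return S(n - 1, k - 1) + k * S(n - 1, k)

def bell(n):
    """Number of all partitions of an n-element set: sum over row n of the triangle."""
    return sum(S(n, k) for k in range(n + 1))

for n in range(6):
    print([S(n, k) for k in range(n + 1)], bell(n))
```

The printed rows reproduce the triangle, and the row sums for n = 1, 2, 3 match the counts 1, 2 and 5 found by hand above.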
4.6.2 Permutations
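A permutation is a bijective map
\[ \sigma : \{1, \dots, n\} \to \{1, \dots, n\} \]
A common notation for permutations is
\[ \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix} \]
Example 4.6.1. The following is a permutation of 5 elements
\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix} \]
The permutation has a fix point, $3 \to 3$, and two cycles of length 2
\[ 1 \to 5 \to 1 \qquad 2 \to 4 \to 2 \]
Therefore applying the permutation twice yields the identity, $\sigma^2 = \mathrm{id}$. In other
words the permutation is self-inverse, $\sigma = \sigma^{-1}$.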
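Example 4.6.2. Let us consider a permutation with a more complicated cycle
structure.
\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 5 & 9 & 8 & 10 & 7 & 6 & 1 & 3 & 4 & 2 \end{pmatrix} \]
There are the following cycles
A fixpoint at 6
A cycle of length 2: $3 \to 8 \to 3$
A cycle of length 3: $1 \to 5 \to 7 \to 1$
A cycle of length 4: $2 \to 9 \to 4 \to 10 \to 2$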
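The cycle structure of a permutation can be read off mechanically by following each element until it returns to its starting point. The sketch below is an illustration; the function name `cycles` and the convention of passing the second row of the two-row notation as a list are assumptions.

```python
def cycles(second_row):
    """Cycles of the permutation i -> second_row[i - 1] for i = 1..n."""
    seen = set()
    result = []
    for start in range(1, len(second_row) + 1):
        if start in seen:
            continue
        cycle = []
        x = start
        # Follow start -> sigma(start) -> sigma(sigma(start)) -> ... until the cycle closes.
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = second_row[x - 1]
        result.append(cycle)
    return result

print(cycles([5, 4, 3, 2, 1]))
print(cycles([5, 9, 8, 10, 7, 6, 1, 3, 4, 2]))
```

A cycle of length one is a fix point, so the number of lists in the result is the number of cycles counted in the following.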
The counting problem we'll consider is: How many permutations of n elements with
exactly k cycles are there? As before we aim for a recursion relation. Abusing
slightly the notation, we denote the number of permutations of n elements with
k cycles by $S_{n,k}$. One can then separate the n-th element and distinguish the
following two cases.
1. n is a fix-point and therefore adds a cycle by itself. It remains to take into
consideration the permutations of $n - 1$ elements with $k - 1$ cycles.
2. We can fit n into existing cycles. So for permutations of $n - 1$ elements
with k cycles there are $n - 1$ positions where we could put n, essentially
after each of the existing elements independent of which cycle they are in.
Together the two cases give $S_{n,k} = S_{n-1,k-1} + (n - 1) \cdot S_{n-1,k}$.
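[Triangle of the number of permutations of n elements with k cycles; rows are labeled by n, diagonals by k = 0, 1, 2, ...]
n = 0:  1
n = 1:  0 1
n = 2:  0 1 1
n = 3:  0 2 3 1
n = 4:  0 6 11 6 1
n = 5:  0 24 50 35 10 1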
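The triangle can be reproduced by reading the two cases above as the recursion $S_{n,k} = S_{n-1,k-1} + (n - 1) \cdot S_{n-1,k}$ and cross-checked by counting cycles in all permutations of a small set. The names `c` and `count_cycles` below are assumptions chosen for this sketch.

```python
from functools import lru_cache
from itertools import permutations

@lru_cache(maxsize=None)
def c(n, k):
    """Number of permutations of n elements with exactly k cycles."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Case 1: n is a fix-point. Case 2: n is inserted after one of the n - 1 other elements.
    return c(n - 1, k - 1) + (n - 1) * c(n - 1, k)

def count_cycles(p):
    """Number of cycles of a permutation p of 0..n-1 given as a tuple."""
    seen, total = set(), 0
    for start in range(len(p)):
        if start not in seen:
            total += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x]
    return total

n = 5
brute = [sum(count_cycles(p) == k for p in permutations(range(n)))
         for k in range(n + 1)]
print([c(n, k) for k in range(n + 1)])
print(brute)
```

Each row of the triangle sums to $n!$, since every permutation has some number of cycles between 1 and n.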
Chapter 5
Graph Theory
5.1 Motivation
Many discrete problems can be modeled as graphs. This is useful as there is a
well-developed theory on graphs.
Example 5.1.1 (Error-free communication over noisy channels). A channel is an
abstract model of a device used to transmit some information, for instance
a wire, an optical fibre or a radio signal. As shown in Fig. 5.1 a channel
can be depicted by a graph, where each input is connected with each of its
possible outputs. Generally errors might occur while transmitting the signal:
the channel is noisy. Therefore a given input can potentially yield different
outputs in different uses of the channel.[1] Nodes representing the input in the
graph might be connected to more than one element in the output alphabet Y.
As outputs can stem from different inputs, the receiver cannot tell which input
the sender chose. Is there a way to transmit information without error through
such a noisy channel? To answer the question we will first consider a different
graph representation.
An ambiguity graph shows the conflicting inputs in a noisy channel. The
nodes correspond to the inputs in X. Two nodes are connected if the
corresponding inputs might yield the same output.

[Figure 5.1: a channel drawn as a graph connecting the inputs x0, ..., xn with
the outputs y0, ..., yn.]

Figure 5.2: The ambiguity graph of a noisy channel (on the inputs x1, ..., x8).
The red markers show an optimal coding.

[1] In a more complete model one would assign a probability distribution over the
outputs for each input. Typically the right output would occur with rather high
probability. If the channel is noisy there are other outputs that occur with
probability larger than zero and cause the ambiguity for the receiver.
In order to resolve the ambiguity of the inputs, the sender and the receiver
agree on a coding. The sender will use merely a subset of the input alphabet X
without conflicts in the output. This subset has to be chosen such that none
of the elements are neighbors in the ambiguity graph. The optimal code is
the largest such set. In Figure 5.2 the codewords of the optimal code are
highlighted in red.
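[Figure: a channel with the input alphabet {0, 1, 2, 3, 4} and the output
alphabet {0, 1, 2, 3, 4}, next to its ambiguity graph on the nodes 0, ..., 4.]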
The corresponding optimal code then has 2 codewords and can therefore
transmit 1 bit with each use of the channel. It seems therefore that the
zero-error capacity, the number of bits that can be sent unambiguously per use
of the channel, is 1. But one could achieve the same transmission rate if there
were just 4 letters. So the question is whether this is really optimal.
The zero-error capacity is in fact an average over multiple uses of the channel.
Let us therefore consider the case of using the channel twice. The new alphabet
is then the Cartesian product $X^2$ with 25 elements as shown in Fig. 5.3. We
can now search for unambiguous codewords among these 25 ordered pairs. The
optimal code indicated by the red circles then allows for 5 distinct codewords to
be transmitted without error. Therefore the zero-error capacity of the channel
in bits is
\[ \frac{\log_2 5}{2} \approx 1.16 \]
and is therefore larger than 1. It turns out that using the channel even more
than twice does not improve this result (as shown by Lovász).

Figure 5.3: This is the ambiguity graph for the case of using the channel twice
(both coordinates range over 0, ..., 4). Instead of showing all edges, the blue
and the red lines give two examples of ambiguous pairs. The red circles indicate
an optimal code for this new channel.
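The two code sizes can be checked by a small exhaustive search. The Python sketch below rests on an assumption that the surviving text does not state explicitly: that the ambiguity graph of the five-letter channel is the cycle 0, 1, 2, 3, 4, 0 (the pentagon), which is consistent with the numbers 2, 25 and 5 quoted above. All function names are chosen for this illustration.

```python
from itertools import product

N = 5  # assumed pentagon channel with the inputs 0..4


def confusable(x, y):
    """Two letters conflict if they are equal or adjacent on the 5-cycle."""
    return (x - y) % N in (0, 1, N - 1)


def conflict(u, v):
    """Two distinct words conflict if they are confusable in every position."""
    return u != v and all(confusable(a, b) for a, b in zip(u, v))


def max_code_size(words):
    """Size of a largest conflict-free subset (plain branching search)."""
    if not words:
        return 0
    first, rest = words[0], words[1:]
    without_first = max_code_size(rest)
    with_first = 1 + max_code_size([w for w in rest if not conflict(first, w)])
    return max(without_first, with_first)


one_use = [(x,) for x in range(N)]
two_uses = list(product(range(N), repeat=2))
print(max_code_size(one_use), max_code_size(two_uses))

# One explicit code for two uses: the pairs (i, 2i mod 5).
code = [(i, (2 * i) % N) for i in range(N)]
print(all(not conflict(u, v) for u in code for v in code))
```

The search is exponential in general and only feasible here because the alphabet is tiny.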
\[ 0 < |V| < \infty \]
\[ (u, v) \in E \Leftrightarrow (v, u) \in E \]
\[ \Gamma(v) := \{ w \in V \mid (v, w) \in E \} \]
the neighborhood of v.
[Figure: a vertex v with its neighbors w1, ..., w5.]
Definition 5.5. The number of edges towards a node v is called the in-degree
$\deg^-(v)$.
The number of edges from a node v is called the out-degree $\deg^+(v)$.
For undirected graphs the number of edges ending in a node v is called the degree
$\deg(v)$.
[Figure: a graph on the vertices v1, ..., v6.]
Definition 5.7 (Path). A path is a way in which all vertices are distinct.
That is, there cannot be loops within a path.
[Figure: a graph on the vertices v1, ..., v4.]
\[ (v_i, v_{i+1}) \in E \quad \forall i = 1, \dots, l-1 \]
\[ (v_l, v_1) \in E \]
It is therefore a path whose last node is connected to the first by an edge.
\[ \exists\, (u, v)\text{-path} \Leftrightarrow \exists i \in I : u, v \in V_i \]
Then the induced subgraphs $G[V_i]$ are called the connected components of G.
[Figure: a graph on the vertices w1, ..., w4.]
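Connected components can be computed by exploring the graph from each vertex that has not been reached yet. The following Python sketch uses a breadth-first search; the function name and the representation of an undirected graph as a vertex list plus an edge list are assumptions made for this illustration.

```python
from collections import deque


def connected_components(vertices, edges):
    """Vertex sets V_i of the connected components of an undirected graph."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    components, seen = [], set()
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        component, queue = {start}, deque([start])
        while queue:  # collect everything reachable by a path from start
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


print(connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]))
```

Two vertices end up in the same set exactly if there is a path between them.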
Indeed we could remove all internal edges and one of the outer ones and still
obtain a connected graph.
[Figure: the resulting graph on the vertices w0, ..., w4.]
In this minimally connected graph the difference between the number of edges
and the number of vertices reaches the bound from the corollary above.
Furthermore, all edges are bridges. These minimally connected graphs are called
trees.
5.3 Trees
As mentioned above a tree is a minimally connected graph. A graph whose
connected components are all trees is called a forest, as formally defined
below. The following graph is a forest with 3 trees.
As there are no cycles and merely finitely many vertices, one will eventually
reach a leaf.
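After having introduced the concept of a tree we now turn to a number of
equivalent characterizations of trees.
[Diagram: the statements (1) to (6) arranged in a circle. The proof below
follows the implications (1) ⇒ (6) ⇒ (4) ⇒ (2) and (3) ⇒ (5) ⇒ (1).]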
(1) ⇒ (6) By the definition of connectedness there exists a u-v-path for all
pairs of vertices u, v ∈ V. It remains to show that this path is unique. We will
show this by contradiction. More precisely we will show that if the path is not
unique we can construct a cycle in the graph.
Let's assume that there are 2 different paths connecting the vertices u and v.
Where the two paths split, we can start constructing a loop, taking one path
until we reach the vertex where they rejoin and returning on the other path.
(6) ⇒ (4) As there exists a path for any pair u, v ∈ V we obtain immediately
that G is connected. It remains to show that every edge is a bridge. Again we
will show this by contradiction. Assume there exists an edge e ∈ E that is not
a bridge. Then we can remove e and G is still connected, i.e. there is a path
connecting the vertices adjacent to e. In other words, with e there are two paths
connecting these two vertices. This is a contradiction with the uniqueness of
the path.
(4) ⇒ (2) As connectedness is already given, it remains to show that
|V| = |E| + 1 if every edge is a bridge. We will show this by induction over the
number of edges.
Base case If |E| = 1 then there are two vertices and the equation is satisfied.
Induction step As all edges are bridges, we can simply split the graph into two
connected components G1 and G2 by removing any of the edges.
|V | p|E| 1q |V | |E| ` 1 1
(3) ⇒ (5) We have to show that adding an edge e′ to the graph creates a
cycle. From (3) we have that |V| = |E| + 1 = |E′| for E′ := E ∪ {e′}. We
show by induction over the number of vertices that whenever |V| = |E′| then
there is a cycle.
Base case The first interesting case is |V| = 3. If the number of edges is also 3,
we obtain the graph
[Figure: the triangle on three vertices.]
If there is no leaf, then all vertices have deg(v) ≥ 2. Then we can simply
construct a cycle by going from one vertex to the next. As there is no leaf,
there is no dead end. And as the number of vertices is finite, we have to
end up at a vertex we have already visited sooner or later.
(5) ⇒ (1) The graph G does not contain any cycles. So it remains to show that
G is connected. Again we will use an indirect proof. Let's assume that G has two
connected components. Then we could add an edge e′ to G without creating
cycles by linking the two connected components. This yields a contradiction
with (5), as this insertion of an edge did not create a cycle.
[Figure: three marked trees on the vertices v0, v1, v2.]
Let's first consider the number of unmarked trees with small numbers of
vertices.
Example 5.3.1. The number of different unmarked trees for n = 1, ..., 5 is
given in the last column.

# vertices   # trees
n = 1        1
n = 2        1
n = 3        1
n = 4        2
n = 5        3
As we will see below the number of marked trees with n vertices is the same
as the number of so-called spanning trees of the completely connected graph $K_n$
with n vertices. A spanning tree is formally defined as follows.
Definition 5.15 (Spanning tree). Given a connected (undirected, simple) graph
G = (V, E), a graph H = (V, E′) with the same vertex set is called a spanning
tree of G if
H is a tree.
E′ ⊆ E.
Example 5.3.2. Spanning trees of the completely connected graphs.
K2: There is merely one connected graph with two nodes. The single
spanning tree is just the graph itself.
K3: The completely connected graph K3
[Figure: K3 on the vertices v0, v1, v2.]
has 3 spanning trees.
K4: The completely connected graph K4
[Figure: K4 on the vertices v0, v1, v2, v3.]
has 16 spanning trees.
The first 12 of these, in the red box, are actually the same if we consider
them as unmarked graphs. By re-arranging the vertices of a given tree
among these 12 we can obtain any other of the 12. An example of such a
re-arrangement is the following.
[Figure: the path on the vertices 0, 1, 2, 3, 4.]
corresponds for instance to the marked graph with the vertices in increasing
order. For any permutation of the 5 vertices we obtain another isomorphic
graph. For the permutation that merely reverses the order, we obtain the
same graph as the edges are undirected. Therefore there are
5!/2 = 60 marked graphs corresponding to this unmarked graph.
The so-called star graph
Again the permutations of the vertices give the isomorphic graphs. But swapping
the two leaves as indicated by the red arrow does not change the graph.
We therefore have a similar symmetry as above and again have to divide
the number of permutations 5! by 2 to obtain 60 graphs.
So in total we are left with 125 = 5^3 different spanning trees (or different
marked trees with 5 vertices).
Considering the examples above we might guess that in general the number
of spanning trees of completely connected graphs $K_n$ is $n^{n-2}$. Indeed this is
true.
Theorem 5.3.3 (Cayley). The number of different marked trees with n nodes
(i.e. the number of spanning trees of the completely connected graph $K_n$) is
$n^{n-2}$.
Proof. Instead of counting marked trees we will count marked trees with two
additional markers, a green circle and a red square.
[Figure: a marked tree with the two markers placed on its vertices.]
These two markers can be placed on any node in the graph, also both on the
same. Therefore there are $n^2$ different ways to place the two markers. If for
instance n = 2 then we get the following possible placements
[Figure: the four placements of the two markers on the tree with two vertices.]
We will now show that there exist $n^n$ different marked trees with n vertices
and two additional markers. Then there are
\[ \frac{n^n}{n^2} = n^{n-2} \]
marked trees without the markers.
To derive the number of trees with markers we now construct a bijection
from these trees to the functions from a set of n elements to another set of n
elements.
\[ \{ f : \{1, \dots, n\} \to \{1, \dots, n\} \} \]
Note that we ask for neither injectivity nor surjectivity. Thus any of the n
elements in the first set can be mapped to any of the n elements in the second.
Therefore there are $n^n$ such functions. Even though these are not necessarily
permutations we will adopt a similar notation. Let's consider the following
example.
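\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 9 & 8 & 3 & 9 & 3 & 2 & 1 & 6 & 4 & 1 \end{pmatrix} \]
We will now represent this function by a directed graph.
[Figure: the directed graph on the vertices 1, ..., 10 with an arrow from each
i to f(i).]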
In every component there has to exist a cycle, as there is only a finite number
of nodes and there is no end point if one applies the map over and over again
(i.e. follows the arrows). Restricting to the vertices that are contained in cycles
yields a permutation, i.e. a bijective map. The set of vertices contained in some
cycle is in the example above
\[ M = \{2, 3, 4, 6, 8, 9\} \]
\[ f|_M = \begin{pmatrix} 2 & 3 & 4 & 6 & 8 & 9 \\ 8 & 3 & 9 & 2 & 6 & 4 \end{pmatrix} \]
We draw the lower row 8, 3, 9, 2, 6, 4 as a path and place the two markers on
its two ends.
This is usually compared to an animal with the green circle being the head, the
red square the tail and the encoding of the permutation the spinal cord. We
can now add the remaining elements to the tree as follows.
[Figure: the spine 8, 3, 9, 2, 6, 4 with the vertex 5 attached to 3, the vertex 1
attached to 9, and the vertices 7 and 10 attached to 1.]
The edges are to be read as arrows towards the spine. This procedure yields a
tree with head and tail for any function $f : \{1, \dots, n\} \to \{1, \dots, n\}$. It remains
to show that we have an inverse map from trees to functions.
Inverse direction Let's construct a function for the following tree with head
and tail.
[Figure: a tree on the vertices 1, ..., 10 with the spine 10, 7, 3, 1.]
From the spine we directly obtain the permutation, as the first row is given by
convention by the numbers in increasing order.
\[ f|_M = \begin{pmatrix} 1 & 3 & 7 & 10 \\ 10 & 7 & 3 & 1 \end{pmatrix} \]
Reading the other edges as arrows towards the spine we obtain the entire map.
\[ f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 10 & 1 & 7 & 2 & 2 & 3 & 3 & 7 & 7 & 1 \end{pmatrix} \]
This procedure works similarly for any other tree. Thus the proof is complete.
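For small n Cayley's formula can be confirmed by brute force. The Python sketch below does not follow the bijection of the proof; it simply enumerates all sets of n − 1 edges of $K_n$ and keeps those without a cycle, which is a different (and much slower) technique used here only as a cross-check. The function names are chosen for this illustration.

```python
from itertools import combinations


def is_spanning_tree(n, edges):
    """n - 1 edges on n vertices form a spanning tree iff they are acyclic."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True


def count_spanning_trees_of_complete_graph(n):
    all_edges = list(combinations(range(n), 2))
    return sum(is_spanning_tree(n, subset)
               for subset in combinations(all_edges, n - 1))


for n in range(2, 6):
    print(n, count_spanning_trees_of_complete_graph(n), n ** (n - 2))
```

The enumeration grows very quickly with n, so this check is only practical for a handful of vertices.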
[Figure: the complete graphs K1, K2, K3, K4, K5.]
For a complete graph with n vertices, denoted $K_n$, there are then
\[ \frac{n(n-1)}{2} = \binom{n}{2} \]
edges.
Cycles (also circles or circuits) are graphs with all nodes contained in one
single cycle without additional edges. As we only consider simple graphs here,
we are merely interested in cycles with 3 or more vertices.
[Figure: the cycles C3, C4, C5.]
\[ V = \{ (i, j) \mid 1 \le i \le m,\ 1 \le j \le n \} \]
The edge set contains merely the closest neighbors of each vertex
and no diagonals.
This graph has the shape of a cylinder. One might not only connect the nodes
horizontally but also vertically and obtain a torus-shaped graph.
graph is called complete bipartite graph and denoted by $K_{m,n}$.
[Figure: $K_{m,n}$ with m vertices on one side and n vertices on the other.]
Hypercubes The vertices of the d-dimensional hypercube are the d-bit strings,
i.e.
\[ V = \{0, 1\}^d = \{ d\text{-bit strings} \} \]
Therefore there are $2^d$ vertices. The edges are then given by the pairs with
minimal Hamming distance, which is defined as follows.
Definition 5.17 (Hamming distance). The Hamming distance between two
bit strings of the same length, $d_H(x, y)$ with $x, y \in \{0, 1\}^d$, is the number of
bits in which x and y differ.
\[ (u, v) \in E :\Leftrightarrow d_H(u, v) = 1 \]
Therefore every vertex has d neighbors, i.e. $\deg(v) = d \ \forall v \in V$. The total
number of edges is
\[ |E| = \frac{d \cdot 2^d}{2} = d \cdot 2^{d-1} \]
as each of the $2^d$ vertices is connected to d edges but then every edge is connected
to two vertices. We can now iteratively draw the graph. To obtain the graph
$Q_{d+1}$ we draw the graph $Q_d$ twice. In one of the two copies we add a 0 after all
bit strings labelling the vertices, in the second analogously a 1. Finally we add
edges to connect the corresponding vertices of the two copies of $Q_d$. We start
from the graph $Q_0$ containing merely the empty word
[Figure: the graphs Q0 and Q1; the vertices of Q1 are labelled 0 and 1.]
Repeating the same process yields $Q_d$ for larger d. For $Q_2$ we get
[Figure: the square Q2 with the vertices 00, 01, 10, 11.]
The Hasse diagram for 2 · 3 · 5 = 30 yields the same graph. Similarly we obtain
the 4-dimensional hypercube.
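The iterative construction of the hypercubes can be written down directly. The Python sketch below (function names chosen for this illustration) doubles the graph in every step exactly as described, appending a 0 to the labels of one copy and a 1 to the labels of the other, and then compares the result with the vertex and edge counts derived above.

```python
def hypercube(d):
    """Vertices and edges of Q_d, built by repeatedly doubling the graph."""
    vertices, edges = [""], []  # Q_0 contains merely the empty word
    for _ in range(d):
        zero = [v + "0" for v in vertices]
        one = [v + "1" for v in vertices]
        new_edges = [(u + "0", v + "0") for u, v in edges]
        new_edges += [(u + "1", v + "1") for u, v in edges]
        new_edges += list(zip(zero, one))  # link corresponding vertices
        vertices, edges = zero + one, new_edges
    return vertices, edges


def hamming_distance(x, y):
    """Number of positions in which two strings of equal length differ."""
    return sum(a != b for a, b in zip(x, y))


for d in range(5):
    vertices, edges = hypercube(d)
    print(d, len(vertices), len(edges))
```

Every edge produced this way joins two labels at Hamming distance 1, in line with the definition of the edge set.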
exactly once, thereby laying the foundations of graph theory. We can translate
the setup with the bridges to a graph.
Then we are looking for a cycle that passes each edge exactly once, a so-called
Euler tour.
Definition 5.18 (Euler tour). An Euler tour is a closed sequence of edges of a
graph G = (V, E) that contains each edge of the graph exactly once.
As the degrees of all vertices in the graph above are odd (one vertex has degree
5, the other three have degree 3), there doesn't exist an Euler tour.
Theorem 5.5.1 (Euler). A connected graph has an Euler tour if and only if
all degrees are even.
Proof. The condition
\[ \forall v \in V : \deg(v) \text{ is even} \]
is necessary for the existence of an Euler tour. This follows from the observation
that any vertex is reached and left the same number of times. Whenever one
reaches a vertex one has to use another edge to leave it again.
Thus the number of edges at each vertex has to be even if we finally have to have
passed all edges. Note that the same holds for the starting vertex.
It remains to show that the condition is sufficient. We will now construct an
Euler tour for a graph satisfying the condition above. First one chooses an initial
vertex $v_1$. This might be any vertex in V. Following an unused edge one now
proceeds to other vertices. As there is an even number of edges ending at each
of the vertices one always finds such an unused edge to continue, except after
returning to $v_1$. As the number of edges and the number of vertices is finite one
will eventually end up in $v_1$ and thus obtain a first cycle with each edge in the
cycle used merely once.
This cycle might however not contain all edges. In that case, as the graph is
connected, there must be a vertex $v_2$ in the cycle above with (at least 2) unused
edges. We can therefore use this vertex $v_2$ as the starting point of a new cycle
using the same procedure of following the unused edges. The red arrows show
how to interpret this as just one cycle.
[Figure: the first cycle through v1 and a second cycle starting at v2, joined
into a single cycle.]
Repeating this iteratively until we have used all edges yields the desired Euler tour.
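The construction in the proof can be turned into a short program. The Python sketch below is one possible realization (commonly attributed to Hierholzer), not necessarily the formulation intended in the notes: it walks along unused edges while keeping the current walk on a stack, and a vertex is written to the tour once no unused edge leaves it, which splices the sub-cycles together automatically. The function name and the edge-list input format are assumptions.

```python
from itertools import combinations


def euler_tour(edges):
    """Closed walk using every edge exactly once. Assumes a connected
    graph in which all degrees are even."""
    adjacency = {}
    for index, (u, v) in enumerate(edges):
        adjacency.setdefault(u, []).append((v, index))
        adjacency.setdefault(v, []).append((u, index))
    used = [False] * len(edges)
    stack, tour = [edges[0][0]], []
    while stack:
        v = stack[-1]
        # discard edges that were already used from their other endpoint
        while adjacency[v] and used[adjacency[v][-1][1]]:
            adjacency[v].pop()
        if adjacency[v]:
            w, index = adjacency[v].pop()
            used[index] = True
            stack.append(w)  # continue the walk along an unused edge
        else:
            tour.append(stack.pop())  # dead end: this vertex is finished
    return tour


k5_edges = list(combinations(range(5), 2))  # the complete graph K5
print(euler_tour(k5_edges))
```

Each edge is touched a constant number of times, so the running time is linear in the number of edges.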
Example 5.5.1. All vertices in the complete graph K5 have degree 4. Therefore
K5 has an Euler tour that can be constructed following the procedure in the
proof above, starting from the vertex on the top. First we follow the outer circle
and then the inner edges as indicated by the color gradient.
Example 5.5.2. Some of the special graphs considered above contain Euler tours
under some conditions.
The complete graphs $K_n$ have an Euler tour if and only if n is odd, as
\[ \deg(v) = n - 1 \quad \forall v \in V \]
The hypercubes $Q_d$ have an Euler tour if and only if d is even, as
\[ \deg(v) = d \quad \forall v \in V \]
Note that the proof above directly yields a simple greedy algorithm that
finds an Euler tour in time linear in the number of edges |E|. We will now turn
to Hamilton cycles defined as follows.
Definition 5.19 (Hamilton cycle). A cycle that visits every node exactly once
is called a Hamilton cycle. A graph containing such a cycle is called hamiltonian.
The wheel graph
We would now like to address the question whether mesh graphs are
hamiltonian. The graph $M_{1,2}$
is a tree and thus not hamiltonian. The following two mesh graphs, $M_{2,2}$
and $M_{2,3}$,
are hamiltonian as the outer cycles connect all nodes. The next mesh graph,
$M_{3,3}$,
however is not hamiltonian. To see this note that we can color the nodes
such that separating the blue nodes and the green nodes we obtain a bipartite
graph. There are 5 blue nodes and 4 green ones. So if we could find a Hamilton
cycle containing all nodes we could arrange the cycle in the following way.
But then, as the number of blue nodes and the number of green nodes are not
equal, there must be two neighbouring blue nodes. This yields a contradiction,
as there are no diagonal edges in mesh graphs. The same argument holds for
any mesh $M_{m,n}$ with m · n being odd.
More generally, a bipartite graph can only have a Hamilton cycle if the number
of vertices of one color equals the number of vertices of the other color.
For mesh graphs with m, n ≥ 2 this condition is not only necessary but also
sufficient. That is
\[ M_{m,n} \text{ is hamiltonian} \Leftrightarrow m \cdot n \text{ is even} \]
Proof. From the argument above we directly obtain that m · n being even is a
necessary condition. This proves ⇒. So it remains to show that the condition
is sufficient, i.e. whenever the product is even there exists a Hamilton cycle. If
the product is even, at least one of the two, m and n, is even. Without loss of
generality we can assume that m is even. Then we can construct a Hamilton
cycle as follows.
[Figure: the resulting Hamilton cycle on the mesh $M_{m,n}$.]
Having considered mesh graphs and complete bipartite graphs we now turn
to Hamilton cycles in hypercubes. The smallest two, $Q_0$ and $Q_1$, are not
hamiltonian. The first is merely a point, the second is a tree. But the square $Q_2$
and the cube $Q_3$ are hamiltonian.
Using the Hamilton cycle on $Q_3$ we can now construct a Hamilton cycle on $Q_4$.
We remove two corresponding edges and link the two cycles at the corresponding
vertices.
This procedure can be applied iteratively to obtain a Hamilton cycle for any
$Q_d$, as we will show below formally by induction. Therefore the hypercubes $Q_d$
are hamiltonian for all d ≥ 2.
Proof. We will show that there exists a Hamilton cycle for all d ≥ 2 by induction
over d.
Base case As we have seen above $Q_2$ is hamiltonian.
Induction step By induction hypothesis there exists a Hamilton cycle for $Q_d$. We
will then construct a Hamilton cycle for $Q_{d+1}$.
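[Figure: the two copies of $Q_d$, one with the suffix 0 and one with the suffix 1
appended to all labels. The edge between a1 and a2 on the Hamilton cycle is
removed in both copies, and the copies are linked by the edges from a1 0 to
a1 1 and from a2 0 to a2 1.]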
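The induction step can be sketched in Python as follows. The function name `hamilton_cycle` is chosen here, and the particular choice of which edge to remove (the one closing the cycle) is an assumption made for simplicity. The cycle of $Q_d$ is walked once in the copy with suffix 0, then the walk crosses to the copy with suffix 1 and runs through the same cycle backwards.

```python
def hamilton_cycle(d):
    """Hamilton cycle on Q_d for d >= 2, as a list of d-bit strings."""
    cycle = ["00", "01", "11", "10"]  # base case: the square Q_2
    for _ in range(d - 2):
        # The edge between the last and the first vertex is dropped in
        # both copies; the copies are linked at these two vertices.
        cycle = [v + "0" for v in cycle] + [v + "1" for v in reversed(cycle)]
    return cycle


print(hamilton_cycle(3))
```

Consecutive labels, including the last and the first, differ in exactly one bit; such a sequence is also known as a reflected Gray code.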
Example 5.6.1. The complete graph K4 is planar as we can rearrange the edges
as follows
The complete graph K5 however is not planar. One might shift two of the
inner edges outside, but there remains a crossing. Shifting further edges outside
introduces crossings outside.
At the beginning of the course we proved that the bipartite graph $K_{3,3}$ is not
planar.
To actually prove that the latter two graphs are not planar we could go
through all possible arrangements of edges to see whether there is any without
crossings. This is rather tedious. We therefore derive some properties of planar
graphs. If a graph does not have one of these properties it cannot be planar.
Theorem 5.6.1 (Euler's formula). Let G = (V, E) be a planar connected graph
which divides the plane into f regions (including the region outside the graph).
Then the numbers of regions, of vertices and of edges satisfy the following
equation.
\[ |V| + f - |E| = 2 \qquad (5.3) \]
Note that the converse implication does not hold. A graph satisfying equation
(5.3) is not necessarily planar.
In the introduction of the course we saw a proof of this statement using
spanning and dual trees. We will now consider a different one, reducing a given
planar graph to a tree.
Proof. Note first of all that equation (5.3) holds for trees. As there are no cycles
there is merely one region, i.e. f = 1. Further the number of edges is one less
than the number of vertices, |E| = |V| − 1. Thus one obtains
\[ |V| + f - |E| = |V| + 1 - (|V| - 1) = 2 \]
We can now reduce any other connected planar graph to a tree by removing
edges in cycles, as shown for the following graph.
Removing an edge of a cycle reduces the number of regions and the number of
edges by one, while the number of vertices remains the same.
Thus the value of the left-hand side for the new graph $G_1 = (V_1, E_1)$ with $f_1$
regions is the same as before.
\[ |V_1| + f_1 - |E_1| = |V| + (f - 1) - (|E| - 1) = |V| + f - |E| \]
The same holds for any other graph $G_i = (V_i, E_i)$ resulting from removing
further edges to break apart cycles. Finally we end up with a tree that satisfies
Euler's formula as seen above. Thus Euler's formula also holds for the initial
general planar graph.
Unfortunately Euler's formula alone is of no use to prove that K5 or $K_{3,3}$ are
not planar. The notion of a region is ill-defined if there are crossings. It is thus
not clear what f is in these two cases. We will now find bounds on the number of
regions, f, in terms of the number of edges and vertices. In a simple graph a
region is bounded by at least three edges, and every edge borders at most two
regions, so that 3f ≤ 2|E|. Inserting this into Euler's formula yields the
following bound.
Corollary 5.6.2. Any planar graph with |V| ≥ 3 satisfies the following
inequality.
\[ |E| \le 3 |V| - 6 \qquad (5.4) \]
The complete graph K5 violates this inequality as |V| = 5 and |E| = 10. Thus
the bound is 3 · 5 − 6 = 9 < 10 = |E|.
The complete bipartite graph $K_{3,3}$ with |E| = 9 and |V| = 6 does however
not violate the inequality (5.4), as 9 ≤ 3 · 6 − 6 = 12. We can fix this issue by
improving the bound on the number of regions for bipartite graphs. A region in a
bipartite graph is bounded by at least 4 edges. Therefore any planar bipartite
graph with |V| ≥ 3 satisfies
\[ |E| \le 2 |V| - 4 \qquad (5.5) \]
The bipartite graph $K_{3,3}$ with |E| = 9 and |V| = 6 violates this
inequality, as 2 · 6 − 4 = 8 < 9.
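Average degree The inequalities (5.4) and (5.5) can be expressed in terms
of the average degree
\[ \overline{\deg}(v) := \frac{1}{|V|} \sum_{v \in V} \deg(v) = \frac{2 |E|}{|V|} \]
\[ |E| \le 3 |V| - 6 < 3 |V| \ \Rightarrow\ \overline{\deg}(v) < 6 \]
\[ |E| \le 2 |V| - 4 < 2 |V| \ \Rightarrow\ \overline{\deg}(v) < 4. \]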
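The two edge bounds are easy to evaluate for concrete graphs. The Python sketch below (the function name is chosen here) only checks the necessary conditions (5.4) and (5.5) from the vertex and edge counts; passing the check does not prove that a graph is planar.

```python
def passes_planarity_bound(num_vertices, num_edges, bipartite=False):
    """Necessary edge bound for planar graphs with at least 3 vertices."""
    if bipartite:
        return num_edges <= 2 * num_vertices - 4  # inequality (5.5)
    return num_edges <= 3 * num_vertices - 6      # inequality (5.4)


print(passes_planarity_bound(5, 10))                 # K5
print(passes_planarity_bound(6, 9))                  # K3,3 with bound (5.4)
print(passes_planarity_bound(6, 9, bipartite=True))  # K3,3 with bound (5.5)
```

For K5 the first bound already fails, while for $K_{3,3}$ only the sharper bipartite bound rules out planarity.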
5.7 Graph colorings
Before giving the formal definition of a coloring let us consider an example.
Example 5.7.1. Imagine the following problem: a flight company wants to find
the minimal number of planes to operate some given flights. Each flight is
represented by a node. Two nodes are connected if the flights cannot be operated
by the same plane. Then the minimal number of colors required to color the
vertices, such that any two neighbouring vertices have a different color,
corresponds to the number of planes required. Let's consider colorings of the
Petersen graph.
Two colors do not suffice as can be seen from the outer cycle. It contains an
odd number of vertices and thus requires a coloring with three colors.
Therefore in this case of ten flights we would need at least three planes.
Definition 5.21 (Coloring). A k-coloring of a graph G = (V, E) is a function
\[ c : V \to \{1, \dots, k\} \qquad (5.6) \]
such that any two neighbours are assigned a different value, i.e.
$c(u) \ne c(v)$ for all $(u, v) \in E$.
pGp q 6
Proof. The implication ⇒ follows directly from the observation above. If G is
bipartite it cannot contain an odd cycle, as then $\chi(G) \ge 3$ yields a contradiction.
It remains to show that if G contains no odd cycle, then it is also bipartite. Let's
consider a spanning tree of G. We can then color the levels with two colors
as above. We now need to show that there are no edges in G that connect two
levels of the same color. This follows from the observation that adding an edge
connecting two vertices of levels with the same parity (examples are indicated by
dashed lines below) always yields an odd cycle and thus a contradiction.
The theorem yields an efficient way, i.e. in linear time, to decide whether a
graph is two-colorable. To decide however whether a graph is 3-colorable is an
NP-complete problem.
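The linear-time test for two-colorability can be sketched as follows. The Python code (function name and input format are assumptions of this illustration) colors the levels of a breadth-first search alternately and reports failure as soon as an edge joins two vertices of the same color, which can only happen if there is an odd cycle.

```python
from collections import deque


def two_coloring(vertices, edges):
    """Return a dict vertex -> 0 or 1, or None if there is an odd cycle."""
    adjacency = {v: [] for v in vertices}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    color = {}
    for start in vertices:  # handle every connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in color:
                    color[w] = 1 - color[u]  # next level, other color
                    queue.append(w)
                elif color[w] == color[u]:
                    return None  # edge inside one color class
    return color


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
pentagon = [(i, (i + 1) % 5) for i in range(5)]
print(two_coloring(range(4), square))
print(two_coloring(range(5), pentagon))
```

Every vertex and every edge is inspected a constant number of times, which gives the linear running time mentioned above.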
Chapter 6
Cryptography
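\[ x \mapsto R_p(2^x) = 2^x \bmod p \]
For $x = 2^k$, written as
\[ 2^{2^k} = \left( \cdots \left( \left( 2^2 \right)^2 \right) \cdots \right)^2 \]
one can compute the power by k multiplications, each time multiplying the result
with itself. To generalize this to arbitrary x we write the argument in its binary
expansion.
\[ x = x_0 + 2^1 x_1 + 2^2 x_2 + \dots + 2^r x_r, \qquad x_i \in \{0, 1\} \]
Then the computation one has to perform turns into repeated squaring and
multiplying.
\[ 2^x = 2^{x_0 + 2^1 x_1 + 2^2 x_2 + \dots + 2^r x_r}
       = \left( \cdots \left( \left( 2^{x_r} \right)^2 \cdot 2^{x_{r-1}} \right)^2 \cdots 2^{x_1} \right)^2 \cdot 2^{x_0} \]

Figure 6.1: Alice and Bob communicate through a public channel. Even though
Eve receives all their messages she does not know their shared secret.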
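Repeated squaring and multiplying can be sketched in Python as follows. The function name `power_of_two_mod` is chosen for this illustration. The loop reads the binary digits of x from the most significant one and mirrors the nested expression above: square in every step, and multiply by 2 whenever the current digit is 1.

```python
def power_of_two_mod(x, p):
    """Compute 2**x mod p by repeated squaring and multiplying."""
    result = 1
    for bit in bin(x)[2:]:  # binary digits x_r ... x_0 of the exponent
        result = (result * result) % p
        if bit == "1":
            result = (result * 2) % p
    return result


print(power_of_two_mod(1000, 101), pow(2, 1000, 101))
```

The number of multiplications grows with the number of binary digits of x rather than with x itself. Python's built-in three-argument `pow` computes the same value and is used here only for comparison.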
Schematically the protocol is depicted in figure 6.3. Eve merely knows a, b and
p but neither x nor y. There is no more efficient way known to compute s from
a, b, p than to first invert $R_p(2^x)$ and compute either x or y. This inversion
will take a long time if the numbers x, y and p are sufficiently large. Thus Eve
practically cannot compute s.
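A toy run of the key exchange can be written in a few lines of Python. The prime p = 101 and the way the secret exponents are drawn are assumptions of this sketch and far too small for any real use; the formulas for a, b and s are the ones of the protocol.

```python
import secrets

p = 101  # public prime, toy size only

x = secrets.randbelow(p - 2) + 1  # Alice's secret exponent
y = secrets.randbelow(p - 2) + 1  # Bob's secret exponent

a = pow(2, x, p)  # Alice sends a = R_p(2^x) over the public channel
b = pow(2, y, p)  # Bob sends b = R_p(2^y) over the public channel

s_alice = pow(b, x, p)  # Alice computes s = b^x mod p
s_bob = pow(a, y, p)    # Bob computes s = a^y mod p

print(s_alice == s_bob)
```

Both parties arrive at the same value because $b^x = 2^{xy} = a^y$ modulo p, while Eve only ever sees p, a and b.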
There is an analogy with locks, depicted in figure 6.4. Alice and Bob each
have two open locks of the same kind. Both close one of the locks and send it to
the other party. With the remaining lock they can lock the two locks together.
Eve can intercept the messages and copy the closed locks. But as she cannot
open any of the two, she cannot lock them together to obtain the secret s.[2]
[2] A more precise analogy would be that Eve can try to build a key for the locks. But as
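[Figure 6.3: Alice and Bob share the public prime p. Alice chooses x < p,
computes a := $R_p(2^x)$ and sends a to Bob. Bob chooses y < p, computes
b := $R_p(2^y)$ and sends b to Alice. Alice obtains s = $b^x$ mod p, Bob obtains
s = $a^y$ mod p.]
Figure 6.4: Eve can copy the two closed locks sent by Alice and Bob, but she
cannot intertwine them to obtain the secret s.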
6.2 RSA cryptosystem
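Symmetric cryptosystem Until the late 70s the following encryption scheme
was used exclusively.
[Diagram: Alice (A) encrypts the message M with the key K (Enc) and sends the
ciphertext C to Bob (B), who decrypts it with the same key K (Dec) to recover M.]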
The scheme employs the same secret key, shared between Alice and Bob. This
key allows one to both encrypt and decrypt the message. Therefore this kind of
cryptosystem is called symmetric.
Diffie and Hellman solved the issue of distributing the secret key using merely
a public authenticated channel as we have seen above. They also proposed a
completely new scheme employing different keys for encryption and decryption
and thus not relying on a shared secret key.
[Diagram: the message M is encrypted with the public key PK (Enc), sent as
the ciphertext C and decrypted with the secret key SK (Dec). The public key PK
is sent over the public channel.]
Diffie and Hellman could however not provide a realization of such an
asymmetric or public-key cryptosystem. In 1977 Rivest, Shamir and Adleman
realized a public-key cryptosystem based on the factoring problem. We will now
consider some of the mathematical basis before turning to the protocol itself.
[2, continued] she has no information about what the key looks like, she has to
try all possible keys. This process will take a lot of time and thus practically
Eve cannot open the lock.
6.2.1 Euclid's Algorithm
Given a terrace of dimensions a × b, what is the length of the largest square tile
that can be used to exactly cover the terrace?
So we are actually looking for the greatest common divisor gcd(a, b). One
could compute the gcd from the prime factors of a and b. There is however no
efficient algorithm known to compute the prime factors of a number. Instead
the terrace picture allows us to derive a more efficient way to compute the gcd
without first computing all prime factors. One can reduce the question to smaller
rectangles. We can subtract the square b × b and ask for the largest tile exactly
covering (a − b) × b.
The answer is the same as for the rectangle a × b. One can then repeat this over
and over.
\[ a_k = R_{b_{k-1}}(a_{k-1}),\ b_k = b_{k-1} \quad \text{or} \quad a_k = a_{k-1},\ b_k = R_{a_{k-1}}(b_{k-1}). \]
Always the longer side is replaced by the remainder. The algorithm terminates
when one of the remainders becomes zero. This happens at the latest when one
reaches the square 1 × 1. This yields Euclid's algorithm, an efficient algorithm
to compute the gcd of two numbers a and b.
Example 6.2.1. Let's consider the case a = 17 and b = 10. First we reduce a to
the remainder $R_{10}(17) = 7$, then b to the remainder $R_7(10) = 3$ and so on and
so forth. We therefore obtain the sequence

a = 17   b = 10
a = 7    b = 10
a = 7    b = 3
a = 1    b = 3
a = 1    b = 0
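The sequence of rectangles can be generated with a short Python sketch. The function names `euclid_steps` and `gcd_euclid` are chosen for this illustration; in each step the longer side is replaced by its remainder modulo the shorter side, as described above.

```python
def euclid_steps(a, b):
    """Successive rectangles (a, b) produced by Euclid's algorithm."""
    steps = [(a, b)]
    while a != 0 and b != 0:
        if a >= b:
            a = a % b  # replace the longer side a by R_b(a)
        else:
            b = b % a  # replace the longer side b by R_a(b)
        steps.append((a, b))
    return steps


def gcd_euclid(a, b):
    a, b = euclid_steps(a, b)[-1]
    return a + b  # one side is zero, the other one is the gcd


print(euclid_steps(17, 10))
print(gcd_euclid(17, 10))
```

The last rectangle has one side of length zero; the other side is the greatest common divisor.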
Thus to measure an interval of one minute we use the hour glasses as follows.
\[ 3 \cdot 17 = 51, \qquad 5 \cdot 10 = 50, \qquad 51 - 50 = 1 \]
This means
\[ 1 = 3 \cdot 17 + (-5) \cdot 10 \]
or in terms of modulo
\[ 1 \equiv 3 \cdot 17 \pmod{10} \]
Therefore this yields an algorithm to compute inverses in modular arithmetic,
called the extended Euclidean algorithm.
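A compact recursive form of the extended Euclidean algorithm is sketched below in Python. The function names are chosen here, and the recursive formulation is one common variant rather than necessarily the one intended in the notes. Besides the gcd it returns the two coefficients of the linear combination, from which the inverse modulo m can be read off.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s * a + t * b = g."""
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    # g = s * b + t * (a % b) = t * a + (s - (a // b) * t) * b
    return g, t, s - (a // b) * t


def inverse_mod(a, m):
    """Inverse of a modulo m, provided gcd(a, m) = 1."""
    g, s, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return s % m


print(extended_gcd(17, 10))
print(inverse_mod(17, 10))
```

For a = 17 and b = 10 the coefficients agree with the hour glass computation above.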
[Figure: the soldiers lined up in three different block formations, each leaving
a remainder.]
From the first block we conclude that the number of soldiers x is odd. From
the second we obtain that it is a multiple of 3. We can write this as
\[ x \equiv 1 \pmod 2 \]
\[ x \equiv 0 \pmod 3 \]
\[ x \equiv 2 \pmod 5 \]
Then the only solution to these equations smaller than 30 is 27. We are in
particular interested in the following special case. For two distinct prime numbers
p and q it holds
\[ \left. \begin{array}{l} x \equiv a \pmod p \\ x \equiv a \pmod q \end{array} \right\} \Leftrightarrow x \equiv a \pmod{pq} \qquad (6.1) \]
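The solution of such a system of congruences can be found by a simple search, sketched here in Python. The function name is chosen for this illustration and the moduli are assumed to be pairwise coprime. The search fixes one congruence after the other and steps by the product of the moduli already handled, so that earlier congruences stay satisfied.

```python
def chinese_remainder(remainders, moduli):
    """Smallest x >= 0 with x = r_i (mod m_i) for pairwise coprime m_i."""
    x, step = 0, 1
    for r, m in zip(remainders, moduli):
        while x % m != r % m:
            x += step  # keeps all congruences handled so far intact
        step *= m
    return x


print(chinese_remainder([1, 0, 2], [2, 3, 5]))
print(chinese_remainder([4, 4], [5, 7]))
```

The first call reproduces the soldier example, the second one illustrates the special case (6.1) with p = 5 and q = 7.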
How many different necklaces are there? A priori we can choose for each of the
pearls among a colors. So there are $a^p$ different necklaces. There are however
some that differ merely by a rotation. Thus we have to divide by the number of
rotations. This would reduce the number of necklaces too much, as we initially
counted the a necklaces with all pearls having the same color merely once. So
we have to subtract those before the division and add them again afterwards.
Thus we obtain
\[ \#\text{necklaces} = \frac{a^p - a}{p} + a \in \mathbb{N} \qquad (6.2) \]
Therefore p divides $a^p - a$, i.e.
\[ a^p \equiv a \pmod p \ \Leftrightarrow\ a^{p-1} \equiv 1 \pmod p \qquad (6.3) \]
where the last equivalence requires that a is not a multiple of p, i.e. gcd(a, p) = 1.
We can now combine the Chinese remainder theorem (6.1) and Fermat's
theorem (6.3) to obtain for two primes p, q, with p ≠ q, and an integer number
a with gcd(a, pq) = 1
\[ \left. \begin{array}{l} a^{(p-1)(q-1)} \equiv 1 \pmod p \\ a^{(p-1)(q-1)} \equiv 1 \pmod q \end{array} \right\} \Rightarrow a^{(p-1)(q-1)} \equiv 1 \pmod{pq} \]
Equipped with these tools from number theory we can now introduce the RSA
protocol.
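[Diagram: Alice (A) holds the secret key SK and publishes the public key PK.
Bob (B) encrypts the message M with PK (Enc) and sends the ciphertext C to
Alice, who decrypts it with SK (Dec) to recover M.]
\[ \begin{aligned}
C^d &\equiv (M^e)^d \pmod n \\
    &\equiv M^{ed} \pmod n \\
    &\equiv M^{1 + k(p-1)(q-1)} \pmod n \qquad \text{for some } k \in \mathbb{N} \\
    &\equiv M \cdot \underbrace{\left( M^{(p-1)(q-1)} \right)^k}_{\equiv 1 \bmod n} \pmod n \\
    &\equiv M \pmod n
\end{aligned} \]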
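The computation above can be tried out with toy numbers. In the Python sketch below the primes 61 and 53, the public exponent e = 17 and the message 65 are assumptions chosen for illustration and are far too small to be secure. The relations n = pq and ed ≡ 1 modulo (p − 1)(q − 1) are the ones used in the derivation.

```python
from math import gcd

p, q = 61, 53              # two small primes, for illustration only
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)

e = 17                     # public exponent, must be coprime to phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)        # secret exponent with e * d = 1 (mod phi);
                           # the negative exponent needs Python 3.8 or newer

M = 65                     # message with 0 <= M < n
C = pow(M, e, n)           # encryption with the public key (n, e)
decrypted = pow(C, d, n)   # decryption with the secret exponent d

print(decrypted == M)
```

The secret exponent d is computed with the extended Euclidean algorithm behind the scenes, which is why the factors p and q (or at least (p − 1)(q − 1)) must be known to derive it.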
The security of RSA is based on the hardness of factoring. To compute p
and q from n requires a lot of computational resources. An upper bound on the
number of computational steps is
\[ \sqrt{n} = 2^{\frac{1}{2} \log n} \]
i.e. exponential in the size of the input. The best algorithm known so far is the
so-called number field sieve and requires roughly
\[ 2^{c (\log n)^{1/3}} \]
computational steps (up to a further factor $(\log \log n)^{2/3}$ in the exponent).
This is less than exponential but not polynomial in the input size, thus called
subexponential. So it suffices to choose two large primes p, q to make RSA
secure (assuming that the NSA has not secretly developed a substantially more
efficient algorithm for factoring and does not possess a properly working
quantum computer).
Appendices
Appendix A
Proof techniques
In this appendix we will briefly review common proof techniques used
throughout the course.