
Normalisation to 3NF

Database Systems
Lecture 8
In This Lecture
• Normalisation to 3NF
  • Data redundancy
  • Functional dependencies
  • Normal forms
  • First, Second, and Third Normal Forms
• For more information
  • Connolly and Begg, chapter 13
  • Ullman and Widom, chapter 3.6.6 (2nd edition), 3.5 (3rd edition)
Redundancy and Normalisation

• Redundant data
  • Can be determined from other data in the database
  • Leads to various problems
    • INSERT anomalies
    • UPDATE anomalies
    • DELETE anomalies
• Normalisation
  • Aims to reduce data redundancy
  • Redundancy is expressed in terms of dependencies
  • Normal forms are defined that do not have certain types of dependency
First Normal Form
• In most definitions of the relational model
  • All data values should be atomic
  • This means that table entries should be single values, not sets or composite objects
• A relation is said to be in first normal form (1NF) if all data values are atomic
Normalisation to 1NF
To convert to a 1NF relation, split up any
non-atomic values
Unnormalised
Module  Dept  Lecturer  Texts
M1      D1    L1        T1, T2
M2      D1    L1        T1, T3
M3      D1    L2        T4
M4      D2    L3        T1, T5
M5      D2    L4        T6

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
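
The conversion above can be sketched in a few lines of Python. This is only an illustration: the tuple layout, the comma-separated Texts string, and the function name to_1nf are assumptions made for the example, not part of the lecture.

```python
# Unnormalised rows: (Module, Dept, Lecturer, Texts).
# Assumption: the non-atomic Texts value is stored as a comma-separated string.
unnormalised = [
    ("M1", "D1", "L1", "T1, T2"),
    ("M2", "D1", "L1", "T1, T3"),
    ("M3", "D1", "L2", "T4"),
    ("M4", "D2", "L3", "T1, T5"),
    ("M5", "D2", "L4", "T6"),
]


def to_1nf(rows):
    """Emit one row per (module, text) pair so that every value is atomic."""
    result = []
    for module, dept, lecturer, texts in rows:
        for text in texts.split(","):
            result.append((module, dept, lecturer, text.strip()))
    return result


first_nf = to_1nf(unnormalised)
for row in first_nf:
    print(*row)
```

Each module's Dept and Lecturer values are repeated once per text, which is the redundancy the following slides deal with.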
Problems in 1NF
• INSERT anomalies
  • Can't add a module with no texts
• UPDATE anomalies
  • To change lecturer for M1, we have to change two rows
• DELETE anomalies
  • If we remove M3, we remove L2 as well

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
Functional Dependencies
• Redundancy is often caused by a functional dependency
• A functional dependency (FD) is a link between two sets of attributes in a relation
• We can normalise a relation by removing undesirable FDs
• A set of attributes, A, functionally determines another set, B (or: there exists a functional dependency between A and B, written A → B) if, whenever two rows of the relation have the same values for all the attributes in A, they also have the same values for all the attributes in B.
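
The definition can be turned into a direct check on the rows of a table. The sketch below is a minimal illustration using the Module/Dept/Lecturer/Text relation from the 1NF slides; the helper name fd_holds and the dict-per-row representation are assumptions of the example. Note that it only tests whether the given rows violate an FD. An FD is a statement about every legal state of the relation, so a passing check on sample data does not prove the FD.

```python
ATTRS = ("Module", "Dept", "Lecturer", "Text")
DATA = [
    ("M1", "D1", "L1", "T1"), ("M1", "D1", "L1", "T2"),
    ("M2", "D1", "L1", "T1"), ("M2", "D1", "L1", "T3"),
    ("M3", "D1", "L2", "T4"), ("M4", "D2", "L3", "T1"),
    ("M4", "D2", "L3", "T5"), ("M5", "D2", "L4", "T6"),
]
rows = [dict(zip(ATTRS, values)) for values in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        # Remember the first rhs value seen for each lhs value;
        # a different rhs value for the same lhs value violates the FD.
        if seen.setdefault(a, b) != b:
            return False
    return True


print(fd_holds(rows, ["Module"], ["Lecturer"]))  # consistent with the data
print(fd_holds(rows, ["Dept"], ["Lecturer"]))    # violated: D1 has both L1 and L2
```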
Example
• {ID, modCode} → {First, Last, modName}
• {modCode} → {modName}
• {ID} → {First, Last}

ID   First  Last    modCode  modName
111  Joe    Bloggs  G51PRG   Programming
222  Anne   Smith   G51DBS   Databases


FDs and Normalisation
• We define a set of 'normal forms'
  • Each normal form has fewer FDs than the last
  • Since FDs represent redundancy, each normal form has less redundancy than the last
• Not all FDs cause a problem
  • We identify various sorts of FD that do
  • Each normal form removes a type of FD that is a problem
  • We will also need a way to remove FDs
Properties of FDs
• In any relation
  • The primary key functionally determines any set of attributes in that relation: K → X
    • K is the primary key, X is a set of attributes
  • Same for candidate keys
  • Any set of attributes is functionally dependent on itself: X → X
• Rules for FDs
  • Reflexivity: If B is a subset of A then A → B
  • Augmentation: If A → B then A ∪ C → B ∪ C
  • Transitivity: If A → B and B → C then A → C
FD Example
• The primary key is {Module, Text}, so
  {Module, Text} → {Dept, Lecturer}
• 'Trivial' FDs, e.g.:
  {Text, Dept} → {Text}
  {Module} → {Module}
  {Dept, Lecturer} → { }

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
FD Example
• Other FDs are
  • {Module} → {Lecturer}
  • {Module} → {Dept}
  • {Lecturer} → {Dept}
• These are non-trivial, and their determinants (the left hand side of each dependency) are not keys.

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
Partial FDs and 2NF
• Partial FDs:
  • An FD A → B is a partial FD if some attribute of A can be removed and the FD still holds
  • Formally, there is some proper subset of A, C ⊂ A, such that C → B
• Let us call attributes which are part of some candidate key key attributes, and the rest non-key attributes.
• Second normal form:
  • A relation is in second normal form (2NF) if it is in 1NF and no non-key attribute is partially dependent on a candidate key
  • In other words, there is no FD C → B where C is a strict subset of a candidate key and B is a non-key attribute.
Second Normal Form
• The relation 1NF is not in 2NF
• We have the FD
  {Module, Text} → {Lecturer, Dept}
• But also
  {Module} → {Lecturer, Dept}
• And so Lecturer and Dept are partially dependent on the primary key

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
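
The 2NF test above can be sketched as a search over the strict subsets of the key. This is a minimal illustration on the sample rows only: the names fd_holds and partial_dependencies are hypothetical, and the check reports the FDs that the data happens to satisfy, which is weaker than knowing the FDs that must hold for the relation.

```python
from itertools import combinations

ATTRS = ("Module", "Dept", "Lecturer", "Text")
DATA = [
    ("M1", "D1", "L1", "T1"), ("M1", "D1", "L1", "T2"),
    ("M2", "D1", "L1", "T1"), ("M2", "D1", "L1", "T3"),
    ("M3", "D1", "L2", "T4"), ("M4", "D2", "L3", "T1"),
    ("M4", "D2", "L3", "T5"), ("M5", "D2", "L4", "T6"),
]
rows = [dict(zip(ATTRS, values)) for values in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        if seen.setdefault(a, b) != b:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (subset, attribute) pairs where a non-empty strict subset of
    the key already determines a non-key attribute."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in non_key:
                if fd_holds(rows, subset, [attr]):
                    found.append((subset, attr))
    return found


violations = partial_dependencies(rows, ("Module", "Text"), ("Dept", "Lecturer"))
print(violations)  # a non-empty list means the relation is not in 2NF
```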
Removing FDs
• Suppose we have a relation R with scheme S and the FD A → B where A ∩ B = { }
• Let C = S – (A ∪ B)
• In other words:
  • A – attributes on the left hand side of the FD
  • B – attributes on the right hand side of the FD
  • C – all other attributes
• It turns out that we can split R into two parts:
  • R1, with scheme C ∪ A
  • R2, with scheme A ∪ B
• The original relation can be recovered as the natural join of R1 and R2:
  • R = R1 NATURAL JOIN R2
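
The split and the join can be tried out on the 1NF relation with A = {Module} and B = {Dept, Lecturer}. The sketch below is an illustration only: the helper names project, natural_join, decompose and as_set are hypothetical, and rows are represented as Python dicts.

```python
ATTRS = ("Module", "Dept", "Lecturer", "Text")
DATA = [
    ("M1", "D1", "L1", "T1"), ("M1", "D1", "L1", "T2"),
    ("M2", "D1", "L1", "T1"), ("M2", "D1", "L1", "T3"),
    ("M3", "D1", "L2", "T4"), ("M4", "D2", "L3", "T1"),
    ("M4", "D2", "L3", "T5"), ("M5", "D2", "L4", "T6"),
]
rows = [dict(zip(ATTRS, values)) for values in DATA]


def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {name: row[name] for name in attrs}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(r1, r2):
    """Combine every pair of rows that agree on all shared attributes."""
    joined = []
    for x in r1:
        for y in r2:
            if all(x[name] == y[name] for name in x.keys() & y.keys()):
                joined.append({**x, **y})
    return joined


def decompose(rows, scheme, lhs, rhs):
    """Split R on the FD lhs -> rhs into R1 (C u A) and R2 (A u B)."""
    rest = [name for name in scheme if name not in lhs and name not in rhs]
    return project(rows, rest + list(lhs)), project(rows, list(lhs) + list(rhs))


def as_set(rows, scheme):
    """Order-independent view of a relation, for comparison."""
    return {tuple(row[name] for name in scheme) for row in rows}


r1, r2 = decompose(rows, ATTRS, ["Module"], ["Dept", "Lecturer"])
recovered = natural_join(r1, r2)
print(as_set(recovered, ATTRS) == as_set(rows, ATTRS))
```

Here r1 holds the (Text, Module) pairs and r2 holds one (Module, Dept, Lecturer) row per module.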
1NF to 2NF – Example
1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

2NFb
Module  Text
M1      T1
M1      T2
M2      T1
M2      T3
M3      T4
M4      T1
M4      T5
M5      T6
Problems Resolved in 2NF
• Problems in 1NF
  • INSERT – Can't add a module with no texts
  • UPDATE – To change lecturer for M1, we have to change two rows
  • DELETE – If we remove M3, we remove L2 as well
• In 2NF the first two are resolved, but not the third one

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
Problems Remaining in 2NF
• INSERT anomalies
  • Can't add lecturers who teach no modules
• UPDATE anomalies
  • To change the department for L1 we must alter two rows
• DELETE anomalies
  • If we delete M3 we delete L2 as well

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
Transitive FDs and 3NF
• Transitive FDs:
  • An FD A → C is a transitive FD if there is some set B such that A → B and B → C are non-trivial FDs
  • A → B non-trivial means: B is not a subset of A
  • We have A → B → C
• Third normal form
  • A relation is in third normal form (3NF) if it is in 2NF and no non-key attribute is transitively dependent on a candidate key
Third Normal Form
• 2NFa is not in 3NF
• We have the FDs
  {Module} → {Lecturer}
  {Lecturer} → {Dept}
• So there is a transitive FD from the primary key {Module} to {Dept}

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
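
The chain from {Module} through {Lecturer} to {Dept} can be looked for mechanically. The sketch below is a simplified illustration: it only considers single non-key attributes as the intermediate set B, it tests FDs against the sample rows rather than the intended constraints, and the names fd_holds and transitive_dependencies are hypothetical.

```python
ATTRS = ("Module", "Dept", "Lecturer")
DATA = [
    ("M1", "D1", "L1"), ("M2", "D1", "L1"), ("M3", "D1", "L2"),
    ("M4", "D2", "L3"), ("M5", "D2", "L4"),
]
rows = [dict(zip(ATTRS, values)) for values in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[name] for name in lhs)
        b = tuple(row[name] for name in rhs)
        if seen.setdefault(a, b) != b:
            return False
    return True


def transitive_dependencies(rows, key, non_key):
    """List (b, c) pairs of distinct non-key attributes with key -> b and b -> c."""
    found = []
    for b in non_key:
        for c in non_key:
            if b != c and fd_holds(rows, key, [b]) and fd_holds(rows, [b], [c]):
                found.append((b, c))
    return found


chains = transitive_dependencies(rows, ["Module"], ["Dept", "Lecturer"])
print(chains)  # a non-empty list means the relation is not in 3NF
```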
2NF to 3NF – Example
2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4
Problems Resolved in 3NF
• Problems in 2NF
  • INSERT – Can't add lecturers who teach no modules
  • UPDATE – To change the department for L1 we must alter two rows
  • DELETE – If we delete M3 we delete L2 as well
• In 3NF all of these are resolved (for this relation, but 3NF can still have anomalies!)

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4
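
The three operations can be tried against the 3NF tables using Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table names lecturer_dept and module_lecturer, the column types, and the new lecturer L5 are made up for the example and do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lecturer_dept (      -- 3NFa
        lecturer TEXT PRIMARY KEY,
        dept     TEXT NOT NULL
    );
    CREATE TABLE module_lecturer (    -- 3NFb
        module   TEXT PRIMARY KEY,
        lecturer TEXT NOT NULL REFERENCES lecturer_dept(lecturer)
    );
""")
conn.executemany(
    "INSERT INTO lecturer_dept VALUES (?, ?)",
    [("L1", "D1"), ("L2", "D1"), ("L3", "D2"), ("L4", "D2")],
)
conn.executemany(
    "INSERT INTO module_lecturer VALUES (?, ?)",
    [("M1", "L1"), ("M2", "L1"), ("M3", "L2"), ("M4", "L3"), ("M5", "L4")],
)

# INSERT: a lecturer who teaches no modules (L5 is a made-up value).
conn.execute("INSERT INTO lecturer_dept VALUES ('L5', 'D1')")

# UPDATE: changing L1's department now touches a single row.
updated = conn.execute(
    "UPDATE lecturer_dept SET dept = 'D2' WHERE lecturer = 'L1'"
).rowcount

# DELETE: removing module M3 leaves lecturer L2 in the database.
conn.execute("DELETE FROM module_lecturer WHERE module = 'M3'")
l2_rows = conn.execute(
    "SELECT COUNT(*) FROM lecturer_dept WHERE lecturer = 'L2'"
).fetchone()[0]

print(updated, l2_rows)
```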
Normalisation and Design
• Normalisation is related to DB design
  • A database should normally be in 3NF at least
  • If your design leads to a non-3NF DB, then you might want to revise it
• When you find you have a non-3NF DB
  • Identify the FDs that are causing a problem
  • Think about whether they will lead to any insert, update, or delete anomalies
  • Try to remove them
Next Lecture
• More normalisation
  • Lossless decomposition; why our reduction to 2NF and 3NF is lossless
  • Boyce-Codd normal form (BCNF)
  • Higher normal forms
  • Denormalisation
• For more information
  • Connolly and Begg, chapter 14
  • Ullman and Widom, chapter 3.6
