NORMALIZATION

12/08/10
OBJECTIVES

• Purpose of normalization.
• Problems associated with redundant data.
• Identification of various types of update anomalies, such as insertion, deletion, and modification anomalies.
• How to recognize the appropriateness or quality of the design of relations.

OBJECTIVES

• How functional dependencies can be used to group attributes into relations that are in a known normal form.
• How to undertake the process of normalization.
• How to identify the most commonly used normal forms, namely 1NF, 2NF, 3NF, and Boyce–Codd normal form (BCNF).
• How to identify fourth (4NF) and fifth (5NF) normal forms.

NORMALIZATION

• Normalization is a technique for producing a set of well-designed relations that meet the requirements set out by the various levels of normalization (the normal forms).

NORMALIZATION
• The four most commonly used normal forms are the first (1NF), second (2NF), and third (3NF) normal forms, and Boyce–Codd normal form (BCNF).
• The underlying aim of normalization is to minimise information redundancy, avoid data inconsistency, and prevent insertion, deletion, and modification anomalies (update anomalies).

DATA REDUNDANCY
• A major aim of relational database design is to group attributes into relations so as to minimize data redundancy and reduce the file storage space required by the base relations.
• The problems associated with data redundancy are illustrated by comparing the Staff and Branch relations with the StaffBranch relation.

DATA REDUNDANCY

[Figure: the Staff and Branch relations compared with the StaffBranch relation]

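The redundancy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are hypothetical (the original figure is not available), and the attribute names `staffNo` and `sName` are invented; `branchNo` and `bAddress` are the names used elsewhere in these slides.

```python
# Hypothetical StaffBranch rows: the branch address is repeated for every
# member of staff who works at that branch.
staff_branch = [
    {"staffNo": "S1", "sName": "Ann", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S2", "sName": "Bob", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S3", "sName": "Cy", "branchNo": "B2", "bAddress": "9 Low Rd"},
]


def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in result:
            result.append(picked)
    return result


# Splitting StaffBranch into Staff and Branch stores each address once.
staff = project(staff_branch, ["staffNo", "sName", "branchNo"])
branch = project(staff_branch, ["branchNo", "bAddress"])

addresses_in_staff_branch = len(staff_branch)  # one copy per staff row
addresses_in_branch = len(branch)              # one copy per branch
```

In this sample the single-table design stores three address values where the two-table design stores two; the gap grows with the number of staff per branch.
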
UPDATE ANOMALIES
• Relations that contain redundant information may suffer from update anomalies.
• Types of update anomalies include:
  Insertion
  Deletion
  Modification

UPDATE ANOMALIES
• Insertion anomaly: occurs when extra data beyond the desired data must be added to the database.
• Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded.

UPDATE ANOMALIES
• Modification anomaly: occurs when changing the value of one column means making the same change in every row that repeats that value; missing any of those rows leaves the data inconsistent.
• Employee 519 is shown as having different addresses on different records.

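A minimal Python sketch of how an inconsistency like the one for Employee 519 could be detected. The table layout, the column names (`emp_id`, `address`, `project`) and all values are hypothetical; only the employee number comes from the slide.

```python
from collections import defaultdict

# Hypothetical rows: the employee's address is repeated once per assignment.
rows = [
    {"emp_id": 519, "address": "12 Oak Ave", "project": "P1"},
    {"emp_id": 519, "address": "7 Elm St", "project": "P2"},  # copy missed by an update
    {"emp_id": 520, "address": "3 Pine Rd", "project": "P1"},
]


def conflicting_values(rows, key, attr):
    """Return {key value: set of attr values} for keys with more than one value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[attr])
    return {k: v for k, v in seen.items() if len(v) > 1}


conflicts = conflicting_values(rows, "emp_id", "address")
```

Here `conflicts` contains one entry, for employee 519, holding both addresses. If the address were stored once per employee, this situation could not arise.
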
UPDATE ANOMALIES
• Deletion anomaly: occurs whenever deleting a row inadvertently causes other data to be deleted.
• All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.

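The deletion and insertion anomalies can both be reproduced on a small in-memory table. This is a sketch under assumptions: the table layout, the `office` column, the course codes, "Dr. Lee", and the choice of (`faculty`, `course`) as the key are all hypothetical; only the names Dr. Giddens and Dr. Newsome come from the slides.

```python
# Hypothetical table in which faculty details exist only inside
# course-assignment rows; the assumed key is (faculty, course).
teaching = [
    {"faculty": "Dr. Giddens", "office": "B-12", "course": "DB101"},
    {"faculty": "Dr. Lee", "office": "C-03", "course": "CS200"},
]


def known_faculty(rows):
    """Faculty members about whom the table still holds any information."""
    return {row["faculty"] for row in rows}


def unassign(rows, faculty, course):
    """Remove one course assignment and return the remaining rows."""
    return [r for r in rows if not (r["faculty"] == faculty and r["course"] == course)]


def can_insert(row, key_attrs):
    """A row can be stored only if every key attribute has a value."""
    return all(row.get(a) is not None for a in key_attrs)


# Deletion anomaly: removing the only assignment also removes the office.
after_delete = unassign(teaching, "Dr. Giddens", "DB101")
giddens_lost = "Dr. Giddens" not in known_faculty(after_delete)

# Insertion anomaly: no course yet, so the key is incomplete.
newsome = {"faculty": "Dr. Newsome", "office": "A-01", "course": None}
newsome_insertable = can_insert(newsome, ["faculty", "course"])
```
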
FUNCTIONAL DEPENDENCY

• Main concept associated with normalization.
• Functional dependency: describes the relationship between attributes in a relation.
• If A and B are attributes of relation R, B is functionally dependent on A (denoted A ➙ B) if each value of A in R is associated with exactly one value of B in R.

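The definition translates directly into a check over sample rows. The sketch below is illustrative: the function name `fd_holds` and the data are invented. Note that sample data can only show that a dependency is violated; whether a dependency truly holds depends on the meaning of the attributes, not on one set of rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        # setdefault stores b the first time a is seen, then returns the stored value.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical relation R with attributes A and B.
rows = [
    {"A": 1, "B": "x"},
    {"A": 1, "B": "x"},
    {"A": 2, "B": "x"},
]

a_determines_b = fd_holds(rows, ["A"], ["B"])  # each A value has one B value
b_determines_a = fd_holds(rows, ["B"], ["A"])  # B = "x" maps to two A values
```
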
FUNCTIONAL DEPENDENCY

• Diagrammatic representation: [figure not reproduced]

◆ The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.

EXAMPLE

• branchNo ➙ bAddress

EXAMPLE - FUNCTIONAL DEPENDENCY

• Given TEXT we know the COURSE: TEXT ➙ COURSE
• TEXT maps to a single value of COURSE.


THE PROCESS OF NORMALIZATION

• A formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes.
• Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
• As normalization proceeds, relations become progressively more restricted (stronger) in format and less vulnerable to update anomalies.

UNNORMALIZED FORM (UNF)
• A table that contains one or more repeating groups.
• Note: a repeating group is an attribute or group of attributes within a table that occurs with multiple values for a single occurrence of the nominated key attributes for that table, for example a book with multiple authors.
• To create an unnormalized table: transform the data from the information source (e.g. a form) into table format with columns and rows.

FIRST NORMAL FORM (1NF)
• A table is in First Normal Form (1NF) if and only if all its attributes are atomic.
• A domain is atomic if its elements are considered to be indivisible units. A 1NF relation is one in which the intersection of each row and column contains one and only one value.
• This implies that it should have no composite attributes or multivalued attributes.
• If a table is not in 1NF, we do two things:

UNF TO 1NF
• First identify a primary key, then either:
  Place each value of a repeating group on a tuple with duplicate values of the non-repeating data (called “flattening” the table)

UNF TO 1NF

Or:
• Make a new table to cater for the multivalued attributes.
• Place the repeating data, along with a copy of the original key attribute(s), into a separate relation.
• The new primary key should be a combination of the (multivalued) attribute and the primary key of the parent table.

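Both routes from UNF to 1NF can be sketched in Python, using the book-with-multiple-authors case mentioned earlier. The column names (`isbn`, `title`, `author`), the function names, and the data are all hypothetical, and `isbn` is assumed to be the primary key.

```python
# Hypothetical unnormalized rows: "author" is a repeating group.
books_unf = [
    {"isbn": "111", "title": "Databases", "author": ["Ann", "Bob"]},
    {"isbn": "222", "title": "Networks", "author": ["Cy"]},
]


def flatten(rows, group):
    """Option 1: one tuple per repeating value, duplicating the other data."""
    return [
        {**{k: v for k, v in row.items() if k != group}, group: value}
        for row in rows
        for value in row[group]
    ]


def split(rows, key, group):
    """Option 2: move the repeating group, with a copy of the key, to a new relation."""
    parent = [{k: v for k, v in row.items() if k != group} for row in rows]
    child = [{key: row[key], group: value} for row in rows for value in row[group]]
    return parent, child


books_flat = flatten(books_unf, "author")             # key becomes (isbn, author)
books, book_authors = split(books_unf, "isbn", "author")
```

Flattening repeats the title for every author, which reintroduces redundancy; the split keeps one row per book and gives the new relation the composite key (`isbn`, `author`).
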
SECOND NORMAL FORM (2NF)
• Based on the concept of full functional dependency: if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.
• 2NF: a relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
• It applies to relations that have a composite primary key.

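A 2NF check can be sketched by computing what each proper subset of the primary key determines. This is a simplified illustration: it assumes a single candidate key, the function names are invented, and the functional dependencies are assumed from the ORDER_LINE relation used in the worked solution at the end of these slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = set(attributes) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) & non_key
            if dependents:
                found[subset] = dependents
    return found


attributes = ["order#", "product#", "description", "quantity", "unitprice"]
key = ["order#", "product#"]
fds = [  # assumed dependencies
    (["order#", "product#"], ["quantity"]),
    (["product#"], ["description", "unitprice"]),
]

partials = partial_dependencies(key, attributes, fds)
in_2nf = not partials  # 2NF requires that no partial dependency exists
```

With these assumed dependencies, `description` and `unitprice` depend on `product#` alone, so the relation is not in 2NF.
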
1NF TO 2NF
• This involves the removal of partial dependencies.
• A partial dependency occurs when the primary key is made up of more than one attribute (i.e. it is a composite primary key) and there exists a non-primary-key attribute that is dependent on only part of the primary key.

1NF TO 2NF

• These partial dependencies can be removed by moving all of the partially dependent attributes into another relation, along with a copy of the determinant attribute (which is part of the primary key in the original relation).

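The decomposition can be sketched as two projections. The relation and attribute names follow the ORDER_LINE example from the worked solution at the end of these slides; the rows themselves are hypothetical.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in result:
            result.append(picked)
    return result


# Hypothetical 1NF rows with key (order#, product#); description and
# unitprice depend on product# alone, which is a partial dependency.
order_line_1nf = [
    {"order#": 1, "product#": "P1", "description": "Pen", "quantity": 2, "unitprice": 1.5},
    {"order#": 1, "product#": "P2", "description": "Pad", "quantity": 1, "unitprice": 3.0},
    {"order#": 2, "product#": "P1", "description": "Pen", "quantity": 5, "unitprice": 1.5},
]

# The partially dependent attributes move out with a copy of their determinant.
order_line = project(order_line_1nf, ["order#", "product#", "quantity"])
product = project(order_line_1nf, ["product#", "description", "unitprice"])
```

After the split, the description and price of product P1 are stored once instead of once per order line.
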
THIRD NORMAL FORM (3NF)
• Based on the concept of transitive dependency: if A, B and C are attributes of a relation such that A ➙ B and B ➙ C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
• 3NF: a relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.

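Transitive dependencies can be spotted by looking for determinants that are neither the key nor part of it. The sketch below is deliberately simplified: it assumes the relation is already in 2NF, has a single candidate key, and that the dependencies are listed explicitly. The function name is invented, and the dependencies are assumed from the ORDER relation in the worked solution at the end of these slides.

```python
def transitive_dependencies(key, fds):
    """List (determinant, dependents) pairs where a non-key determinant
    fixes non-key attributes."""
    key = set(key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        # A determinant inside the key gives a direct or partial dependency;
        # one that contains the key is a superkey. Neither is transitive.
        if lhs <= key or key <= lhs:
            continue
        dependents = rhs - key - lhs
        if dependents:
            found.append((tuple(sorted(lhs)), tuple(sorted(dependents))))
    return found


key = ["order#"]
fds = [  # assumed dependencies
    (["order#"], ["customer#", "orderdate"]),
    (["customer#"], ["name", "address"]),
]

transitive = transitive_dependencies(key, fds)
in_3nf = not transitive
```

Here `name` and `address` depend on `order#` only through `customer#`, so the relation is in 2NF but not in 3NF.
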
2NF TO 3NF
• Identify the primary key in the 2NF relation.
• Identify the functional dependencies in the relation.
• If transitive dependencies on the primary key exist, remove them by placing the transitively dependent attributes in a new relation along with a copy of their determinant.

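The steps above can be sketched in Python, together with a check that joining the two new relations gives back the original rows, so no information is lost. The relation and attribute names follow the ORDER example in the worked solution at the end of these slides; the rows and the helper names are hypothetical.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        picked = {a: row[a] for a in attrs}
        if picked not in result:
            result.append(picked)
    return result


def natural_join(left, right, on):
    """Join two lists of dict rows on one shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical 2NF rows: name and address depend on customer#, not on order#.
order_2nf = [
    {"order#": 1, "customer#": "C1", "name": "Ann", "address": "1 High St", "orderdate": "2010-08-12"},
    {"order#": 2, "customer#": "C1", "name": "Ann", "address": "1 High St", "orderdate": "2010-08-13"},
    {"order#": 3, "customer#": "C2", "name": "Bob", "address": "9 Low Rd", "orderdate": "2010-08-13"},
]

# Move the transitively dependent attributes out with their determinant.
order = project(order_2nf, ["order#", "customer#", "orderdate"])
customer = project(order_2nf, ["customer#", "name", "address"])

# Rejoining on the determinant reconstructs the original relation.
rejoined = natural_join(order, customer, "customer#")
```
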
INSTRUCTIONS

• The following tables are susceptible to update anomalies. Provide examples of insertion, deletion, and modification anomalies.
• Describe and illustrate the process of normalizing the tables to 3NF. State any assumptions you make about the data shown in these tables.

EXERCISES

[Exercise tables: figures not reproduced]

EXAMPLE

[Example source data: figure not reproduced]

SOLUTION
• 0NF
  ORDER(order#, customer#, name, address, orderdate, (product#, description, quantity, unitprice))
• 1NF
  ORDER(order#, customer#, name, address, orderdate)
  ORDER_LINE(order#, product#, description, quantity, unitprice)
• 2NF
  ORDER(order#, customer#, name, address, orderdate)
  ORDER_LINE(order#, product#, quantity)
  PRODUCT(product#, description, unitprice)
• 3NF
  ORDER(order#, customer#, orderdate)
  CUSTOMER(customer#, name, address)
  ORDER_LINE(order#, product#, quantity)
  PRODUCT(product#, description, unitprice)
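One possible way to realise the 3NF relations is sketched below with Python's built-in `sqlite3` module. The column types, the `_no` column names (used because `#` is awkward in SQL identifiers), the foreign keys, and the sample rows are all assumptions, not part of the original solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.executescript("""
CREATE TABLE customer (
    customer_no TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT NOT NULL
);
CREATE TABLE product (
    product_no  TEXT PRIMARY KEY,
    description TEXT NOT NULL,
    unitprice   REAL NOT NULL
);
CREATE TABLE "order" (  -- quoted because ORDER is an SQL keyword
    order_no    INTEGER PRIMARY KEY,
    customer_no TEXT NOT NULL REFERENCES customer(customer_no),
    orderdate   TEXT NOT NULL
);
CREATE TABLE order_line (
    order_no   INTEGER NOT NULL REFERENCES "order"(order_no),
    product_no TEXT NOT NULL REFERENCES product(product_no),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_no, product_no)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", ("C1", "Ann", "1 High St"))
conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("P1", "Pen", 1.5))
conn.execute('INSERT INTO "order" VALUES (?, ?, ?)', (1, "C1", "2010-08-12"))
conn.execute("INSERT INTO order_line VALUES (?, ?, ?)", (1, "P1", 4))

# The original unnormalized view of an order is recovered with joins.
rows = conn.execute("""
    SELECT o.order_no, c.name, p.description, l.quantity * p.unitprice
    FROM order_line AS l
    JOIN "order" AS o ON o.order_no = l.order_no
    JOIN customer AS c ON c.customer_no = o.customer_no
    JOIN product AS p ON p.product_no = l.product_no
""").fetchall()

# A change of address now touches exactly one row.
updated = conn.execute(
    "UPDATE customer SET address = ? WHERE customer_no = ?", ("2 New St", "C1")
).rowcount
```

Because each customer's address lives in a single row, the modification anomaly described earlier cannot occur in this design.
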
