
Exercise 4. Naïve Bayes for data with nominal attributes

Given the training data in the table below (Buy Computer data), predict the class of the following new example using Naïve Bayes classification: age<=30, income=medium, student=yes, credit-rating=fair.

RID  age     income  student  credit_rating  Class: buys_computer
1    <=30    high    no       fair           No
2    <=30    high    no       excellent      No
3    31..40  high    no       fair           Yes
4    >40     medium  no       fair           Yes
5    >40     low     yes      fair           Yes
6    >40     low     yes      excellent      No
7    31..40  low     yes      excellent      Yes
8    <=30    medium  no       fair           No
9    <=30    low     yes      fair           Yes
10   >40     medium  yes      fair           Yes
11   <=30    medium  yes      excellent      Yes
12   31..40  medium  no       excellent      Yes
13   31..40  high    yes      fair           Yes
14   >40     medium  no       excellent      No

Solution:
E = age<=30, income=medium, student=yes, credit-rating=fair.
E1 is age<=30, E2 is income=medium, E3 is student=yes, E4 is credit-rating=fair.
We need to compute P(yes|E) and P(no|E) and compare them.

P(yes|E) = P(E1|yes) P(E2|yes) P(E3|yes) P(E4|yes) P(yes) / P(E)

P(yes) = 9/14 = 0.643        P(no) = 5/14 = 0.357
P(E1|yes) = 2/9 = 0.222      P(E1|no) = 3/5 = 0.6
P(E2|yes) = 4/9 = 0.444      P(E2|no) = 2/5 = 0.4
P(E3|yes) = 6/9 = 0.667      P(E3|no) = 1/5 = 0.2
P(E4|yes) = 6/9 = 0.667      P(E4|no) = 2/5 = 0.4

P(yes|E) = (0.222 x 0.444 x 0.667 x 0.667 x 0.643) / P(E) = 0.028 / P(E)
P(no|E) = (0.6 x 0.4 x 0.2 x 0.4 x 0.357) / P(E) = 0.007 / P(E)

As 0.028 > 0.007, the Naïve Bayes classifier predicts buys_computer = yes for the new example.
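
As a sanity check, the comparison can be reproduced with a few lines of Python. This is an illustrative sketch, not part of the original exercise; the counts are read directly off the table above, and P(E) is omitted because it cancels in the comparison.

    # Naive Bayes for Exercise 4: compare the unnormalised posteriors.
    priors = {"yes": 9/14, "no": 5/14}

    # P(Ei|class) for E1=age<=30, E2=income=medium, E3=student=yes, E4=credit=fair
    cond = {
        "yes": [2/9, 4/9, 6/9, 6/9],
        "no":  [3/5, 2/5, 1/5, 2/5],
    }

    score = {}
    for cls in priors:
        p = priors[cls]
        for likelihood in cond[cls]:
            p *= likelihood          # multiply in each conditional probability
        score[cls] = p               # equals P(class, E); P(E) cancels out

    print(score)                      # {'yes': ~0.0282, 'no': ~0.0069}
    print(max(score, key=score.get))  # -> 'yes'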

Exercise 5. Applying Naïve Bayes to data with numerical attributes and using the Laplace correction (to be done at your own time, not in class)
Given the training data in the table below (Tennis data with some numerical attributes), predict the class of the following new example using Naïve Bayes classification:
outlook=overcast, temperature=60, humidity=62, windy=false.

Tip. You can use Excel or Matlab for the calculations of logarithm, mean and standard deviation. Matlab is installed on our undergraduate machines. The following Matlab functions can be used: log2 - logarithm with base 2, mean - mean value, std - standard deviation. Type help <function name> (e.g. help mean) for help on how to use these functions and for examples.

Solution:
First, we need to calculate the mean μ and standard deviation σ of the numerical attributes for each class:

μ = (1/n) Σ_{i=1..n} X_i        σ² = (1/(n−1)) Σ_{i=1..n} (X_i − μ)²

where X_i, i=1..n, is the i-th measurement and n is the number of measurements.

μ_temp|yes = 73,   σ_temp|yes = 6.2;    μ_temp|no = 74.6, σ_temp|no = 8.0
μ_hum|yes = 79.1,  σ_hum|yes = 10.2;    μ_hum|no = 86.2,  σ_hum|no = 9.7

Second, we need to calculate f(temperature=60|yes), f(temperature=60|no), f(humidity=62|yes) and f(humidity=62|no) using the probability density function for the normal distribution:

f(x) = (1 / (√(2π) σ)) e^(−(x−μ)² / (2σ²))

f(temperature=60|yes) = (1 / (6.2 √(2π))) e^(−(60−73)² / (2 x 6.2²)) = 0.0071
f(temperature=60|no) = (1 / (8 √(2π))) e^(−(60−74.6)² / (2 x 8²)) = 0.0094
f(humidity=62|yes) = (1 / (10.2 √(2π))) e^(−(62−79.1)² / (2 x 10.2²)) = 0.0096
f(humidity=62|no) = (1 / (9.7 √(2π))) e^(−(62−86.2)² / (2 x 9.7²)) = 0.0018
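
The same calculations can be done with Python's standard library instead of Excel or Matlab. The sketch below is illustrative only (the helper name normal_pdf is ours); it plugs in the means and standard deviations derived above.

    import math

    def normal_pdf(x, mu, sigma):
        # Density of the normal distribution N(mu, sigma^2) at point x.
        return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

    # mu and sigma would come from the table columns, e.g. via
    # statistics.mean(values) and statistics.stdev(values); here we reuse
    # the values computed in the solution.
    print(normal_pdf(60, 73.0, 6.2))   # f(temperature=60|yes) ~ 0.0071
    print(normal_pdf(60, 74.6, 8.0))   # f(temperature=60|no)  ~ 0.0094
    print(normal_pdf(62, 79.1, 10.2))  # f(humidity=62|yes)    ~ 0.0096
    print(normal_pdf(62, 86.2, 9.7))   # f(humidity=62|no)     ~ 0.0018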
Third, we can calculate the probabilities for the nominal attributes:

P(yes) = 9/14 = 0.643    P(no) = 5/14 = 0.357
P(outlook=overcast|yes) = 4/9 = 0.444    P(outlook=overcast|no) = 0/5 = 0
P(windy=false|yes) = 6/9 = 0.667    P(windy=false|no) = 2/5 = 0.4

As P(outlook=overcast|no) = 0, we need to use a Laplace estimator for the attribute outlook. We assume that the three values (sunny, overcast, rainy) are equally probable, i.e. we add 1 to each numerator and 3 to each denominator:

P(outlook=overcast|yes) = (4+1)/(9+3) = 5/12 = 0.4167
P(outlook=overcast|no) = (0+1)/(5+3) = 1/8 = 0.125

Fourth, we can calculate the final probabilities:

P(yes|E) = (0.4167 x 0.0071 x 0.0096 x 0.667 x 0.643) / P(E) = 1.22 x 10^-5 / P(E)
P(no|E) = (0.125 x 0.0094 x 0.0018 x 0.4 x 0.357) / P(E) = 3.02 x 10^-7 / P(E)

Therefore, the Naïve Bayes classifier predicts play=yes for the new example.
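
To double-check the final comparison, here is a small illustrative Python sketch that multiplies the Laplace-corrected outlook probability, the two densities, P(windy=false|class) and the prior for each class (P(E) again cancels):

    # Unnormalised posteriors for Exercise 5.
    p_yes = ((4 + 1) / (9 + 3)) * 0.0071 * 0.0096 * (6 / 9) * (9 / 14)
    p_no  = ((0 + 1) / (5 + 3)) * 0.0094 * 0.0018 * (2 / 5) * (5 / 14)

    print(f"{p_yes:.2e}")   # ~1.22e-05
    print(f"{p_no:.2e}")    # ~3.02e-07
    print("play =", "yes" if p_yes > p_no else "no")
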
Exercise 6. Using Weka (to be done at your own time, not in class)
Load the iris data (iris.arff). Choose 10-fold cross-validation. Run the Naïve Bayes and Multi-layer perceptron (trained with the backpropagation algorithm) classifiers and compare their performance. Which classifier produced the most accurate classification? Which one learns faster?


Exercise 7. k-Nearest neighbours
Given the training data in Exercise 4 (Buy Computer data), predict the class of the following new example using k-Nearest Neighbour for k=5: age<=30, income=medium, student=yes, credit-rating=fair. For the similarity measure use a simple match of attribute values:

Similarity(A,B) = (Σ_{i=1..4} w_i x δ(a_i, b_i)) / 4

where δ(a_i, b_i) is 1 if a_i equals b_i and 0 otherwise; a_i and b_i are either age, income, student or credit_rating. The weights are all 1 except for income, where it is 2.

Solution:



RID  Class  Similarity to the new example
1    No     (1+0+0+1)/4 = 0.5
2    No     (1+0+0+0)/4 = 0.25
3    Yes    (0+0+0+1)/4 = 0.25
4    Yes    (0+2+0+1)/4 = 0.75
5    Yes    (0+0+1+1)/4 = 0.5
6    No     (0+0+1+0)/4 = 0.25
7    Yes    (0+0+1+0)/4 = 0.25
8    No     (1+2+0+1)/4 = 1
9    Yes    (1+0+1+1)/4 = 0.75
10   Yes    (0+2+1+1)/4 = 1
11   Yes    (1+2+1+0)/4 = 1
12   Yes    (0+2+0+0)/4 = 0.5
13   Yes    (0+0+1+1)/4 = 0.5
14   No     (0+2+0+0)/4 = 0.5

Among the five nearest neighbours four are from class Yes and one is from class No. Hence, the k-NN classifier predicts buys_computer = yes for the new example.
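
The vote can be scripted as well. Below is an illustrative Python sketch (the tuples restate the Exercise 4 table) that scores all 14 records with the weighted match and takes a majority vote among the top k=5:

    # k-NN with weighted attribute matching for Exercise 7.
    # Each record: (age, income, student, credit_rating, class).
    train = [
        ("<=30",   "high",   "no",  "fair",      "No"),
        ("<=30",   "high",   "no",  "excellent", "No"),
        ("31..40", "high",   "no",  "fair",      "Yes"),
        (">40",    "medium", "no",  "fair",      "Yes"),
        (">40",    "low",    "yes", "fair",      "Yes"),
        (">40",    "low",    "yes", "excellent", "No"),
        ("31..40", "low",    "yes", "excellent", "Yes"),
        ("<=30",   "medium", "no",  "fair",      "No"),
        ("<=30",   "low",    "yes", "fair",      "Yes"),
        (">40",    "medium", "yes", "fair",      "Yes"),
        ("<=30",   "medium", "yes", "excellent", "Yes"),
        ("31..40", "medium", "no",  "excellent", "Yes"),
        ("31..40", "high",   "yes", "fair",      "Yes"),
        (">40",    "medium", "no",  "excellent", "No"),
    ]
    new = ("<=30", "medium", "yes", "fair")
    weights = [1, 2, 1, 1]  # income is weighted 2, everything else 1

    def similarity(record):
        # Weighted fraction of attribute values matching the new example.
        return sum(w * (a == b) for w, a, b in zip(weights, record, new)) / 4

    top5 = sorted(train, key=similarity, reverse=True)[:5]
    votes = [r[4] for r in top5]
    print(votes)                             # four 'Yes', one 'No'
    print(max(set(votes), key=votes.count))  # -> 'Yes'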



Exercise 8. Decision trees
Given the training data in Exercise 4 (Buy Computer data), build a decision tree and predict the class of the following new example: age<=30, income=medium, student=yes, credit-rating=fair.


Solution:
First check which attribute provides the highest Information Gain, in order to split the training set based on that attribute. We need to calculate the expected information to classify the set (the mutual information of the two classes) and the entropy of each attribute. The information gain is this mutual information minus the entropy of the attribute.

The mutual information of the two classes is I(S_Yes, S_No) = I(9,5) = −9/14 log2(9/14) − 5/14 log2(5/14) = 0.94

- For Age we have three values: age<=30 (2 yes and 3 no), age31..40 (4 yes and 0 no) and age>40 (3 yes and 2 no)

Entropy(age) = 5/14 (−2/5 log(2/5) − 3/5 log(3/5)) + 4/14 (0) + 5/14 (−3/5 log(3/5) − 2/5 log(2/5))
= 5/14 (0.9709) + 0 + 5/14 (0.9709) = 0.6935

Gain(age) = 0.94 − 0.6935 = 0.2465

- For Income we have three values: income=high (2 yes and 2 no), income=medium (4 yes and 2 no) and income=low (3 yes and 1 no)

Entropy(income) = 4/14 (−2/4 log(2/4) − 2/4 log(2/4)) + 6/14 (−4/6 log(4/6) − 2/6 log(2/6)) + 4/14 (−3/4 log(3/4) − 1/4 log(1/4))
= 4/14 (1) + 6/14 (0.918) + 4/14 (0.811) = 0.285714 + 0.393428 + 0.231714 = 0.9108

Gain(income) = 0.94 − 0.9108 = 0.0292

- For Student we have two values: student=yes (6 yes and 1 no) and student=no (3 yes and 4 no)

Entropy(student) = 7/14 (−6/7 log(6/7) − 1/7 log(1/7)) + 7/14 (−3/7 log(3/7) − 4/7 log(4/7))
= 7/14 (0.5916) + 7/14 (0.9852) = 0.2958 + 0.4926 = 0.7884

Gain(student) = 0.94 − 0.7884 = 0.1516

- For CreditRating we have two values: credit_rating=fair (6 yes and 2 no) and credit_rating=excellent (3 yes and 3 no)

Entropy(credit_rating) = 8/14 (−6/8 log(6/8) − 2/8 log(2/8)) + 6/14 (−3/6 log(3/6) − 3/6 log(3/6))
= 8/14 (0.8112) + 6/14 (1) = 0.4635 + 0.4285 = 0.8920

Gain(credit_rating) = 0.94 − 0.8920 = 0.048

Since Age has the highest Information Gain, we start splitting the dataset using the age attribute:

age <=30:
Income  student  credit_rating  Class
high    no       fair           No
high    no       excellent      No
medium  no       fair           No
low     yes      fair           Yes
medium  yes      excellent      Yes

age 31..40:
Income  student  credit_rating  Class
high    no       fair           Yes
low     yes      excellent      Yes
medium  no       excellent      Yes
high    yes      fair           Yes

age >40:
Income  student  credit_rating  Class
medium  no       fair           Yes
low     yes      fair           Yes
low     yes      excellent      No
medium  yes      fair           Yes
medium  no       excellent      No

Since all records under the branch age31..40 are of class Yes, we can replace that leaf with Class=Yes.
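
The entropy and gain numbers above can be verified with a short Python sketch (illustrative only; each split is given as a list of (yes, no) counts per attribute value, and the exact decimals differ slightly from the rounded values above):

    import math

    def entropy(counts):
        # Entropy in bits of a class distribution given as a list of counts.
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    def gain(groups):
        # Information gain of a split; groups = [(yes, no), ...] per value.
        total = sum(y + n for y, n in groups)
        all_yes = sum(y for y, _ in groups)
        before = entropy([all_yes, total - all_yes])          # I(9,5) = 0.94
        after = sum((y + n) / total * entropy([y, n]) for y, n in groups)
        return before - after

    print(gain([(2, 3), (4, 0), (3, 2)]))  # age           ~0.2468
    print(gain([(2, 2), (4, 2), (3, 1)]))  # income        ~0.0292
    print(gain([(6, 1), (3, 4)]))          # student       ~0.1518
    print(gain([(6, 2), (3, 3)]))          # credit_rating ~0.0481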


The same process of splitting has to happen for the two remaining branches. For branch age<=30 we still have the attributes income, student and credit_rating. Which one should be used to split the partition?

The mutual information is I(S_Yes, S_No) = I(2,3) = −2/5 log2(2/5) − 3/5 log2(3/5) = 0.97

- For Income we have three values: income=high (0 yes and 2 no), income=medium (1 yes and 1 no) and income=low (1 yes and 0 no)

Entropy(income) = 2/5 (0) + 2/5 (−1/2 log(1/2) − 1/2 log(1/2)) + 1/5 (0) = 2/5 (1) = 0.4

Gain(income) = 0.97 − 0.4 = 0.57

- For Student we have two values: student=yes (2 yes and 0 no) and student=no (0 yes and 3 no)

Entropy(student) = 2/5 (0) + 3/5 (0) = 0

Gain(student) = 0.97 − 0 = 0.97

We can then safely split on the attribute student without checking the other attributes, since its information gain is maximized.

Since these two new branches are from distinct classes, we make them into leaf nodes with their respective class as label:

[Figure: the tree after the first two splits. Root node age with three branches: 31..40 → Class=Yes; <=30 → node student (student=yes → Class=Yes, student=no → Class=No); >40 → partition not yet split (medium/no/fair/Yes, low/yes/fair/Yes, low/yes/excellent/No, medium/yes/fair/Yes, medium/no/excellent/No).]


Again the same process is needed for the other branch of age (age>40).

The mutual information is I(S_Yes, S_No) = I(3,2) = −3/5 log2(3/5) − 2/5 log2(2/5) = 0.97

- For Income we have two values: income=medium (2 yes and 1 no) and income=low (1 yes and 1 no)

Entropy(income) = 3/5 (−2/3 log(2/3) − 1/3 log(1/3)) + 2/5 (−1/2 log(1/2) − 1/2 log(1/2))
= 3/5 (0.9182) + 2/5 (1) = 0.55 + 0.4 = 0.95

Gain(income) = 0.97 − 0.95 = 0.02

- For Student we have two values: student=yes (2 yes and 1 no) and student=no (1 yes and 1 no)

Entropy(student) = 3/5 (−2/3 log(2/3) − 1/3 log(1/3)) + 2/5 (−1/2 log(1/2) − 1/2 log(1/2)) = 0.95

Gain(student) = 0.97 − 0.95 = 0.02

- For CreditRating we have two values: credit_rating=fair (3 yes and 0 no) and credit_rating=excellent (0 yes and 2 no)

Entropy(credit_rating) = 0

Gain(credit_rating) = 0.97 − 0 = 0.97

We then split based on credit_rating. These splits give partitions each with records from the same class. We just need to make these into leaf nodes with their class label attached:

[Figure: the final decision tree. Root node age; branch <=30 → node student (yes → Class=Yes, no → Class=No); branch 31..40 → Class=Yes; branch >40 → node credit_rating (fair → Class=Yes, excellent → Class=No).]

New example: age<=30, income=medium, student=yes, credit-rating=fair.
Follow the branch age<=30, then student=yes: we predict Class=Yes, i.e. buys_computer = yes.
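
For completeness, the finished tree can be encoded and queried directly. This is a minimal illustrative Python sketch; the nested-tuple encoding is our own choice, not part of the exercise:

    # The final decision tree from Exercise 8: inner nodes are
    # (attribute, branches) pairs, leaves are plain class labels.
    tree = ("age", {
        "<=30":   ("student", {"yes": "Yes", "no": "No"}),
        "31..40": "Yes",
        ">40":    ("credit_rating", {"fair": "Yes", "excellent": "No"}),
    })

    def predict(node, example):
        # Walk down the tree until a leaf (a plain class label) is reached.
        while not isinstance(node, str):
            attribute, branches = node
            node = branches[example[attribute]]
        return node

    new = {"age": "<=30", "income": "medium", "student": "yes", "credit_rating": "fair"}
    print(predict(tree, new))  # -> 'Yes' (buys_computer = yes)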
