LAST NAME: SOLUTIONS
FIRST NAME:

Problem   Score         Max Score
1         ___________   18
2         ___________   16
3         ___________   15
4         ___________
5         ___________
6         ___________   18
7         ___________   19
Total     ___________   100
[Figure: search graph over start node S, goal node G, and nodes A-E; each node is labeled with a heuristic value (the legible values are h = 6, 0, 4, 0, 6, 1, and 10) and each edge with a step cost (two edge costs of 4 are legible).]
When a node is expanded, assume its children are put in the Frontier set in alphabetical order
so that the child closest to the front of the alphabet is removed before its other siblings (for all
uninformed searches and for ties in informed searches).
For each of the search methods below, give (i) the sequence of nodes removed from the
Frontier (for expansion or before halting at the goal), and (ii) the solution path found.
(a) [6] Uniform-Cost graph search (i.e., use an Explored set)
Nodes removed: S B A D C E G
Solution: S B D G
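For reference, here is a minimal Python sketch of Uniform-Cost graph search with an Explored set and alphabetical tie-breaking. The graph used is a hypothetical reconstruction (the exam's figure is not reproduced); it was chosen so that the trace matches the answers above, but it is only an illustration.

import heapq

def uniform_cost_search(graph, start, goal):
    # Frontier entries are (path cost g, node, path); the heap orders by g
    # first and breaks ties alphabetically on the node name.
    frontier = [(0, start, [start])]
    explored = set()
    removed = []                        # order in which nodes are expanded
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node in explored:            # skip stale duplicate entries
            continue
        removed.append(node)
        if node == goal:                # goal test on removal, not generation
            return removed, path
        explored.add(node)
        for cost, child in graph[node]:
            if child not in explored:
                heapq.heappush(frontier, (g + cost, child, path + [child]))
    return removed, None

# Hypothetical edge costs, consistent with the answers above:
graph = {'S': [(1, 'B'), (3, 'A')], 'A': [(2, 'D')],
         'B': [(2, 'D'), (5, 'C')], 'C': [(1, 'E')],
         'D': [(4, 'G')], 'E': [(6, 'G')], 'G': []}
print(uniform_cost_search(graph, 'S', 'G'))
# (['S', 'B', 'A', 'D', 'C', 'E', 'G'], ['S', 'B', 'D', 'G'])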
(b) [6] Greedy Best-First tree search (i.e., no repeated state checking)
Nodes removed: S A G
Solution: S G
(c) [6] A* tree search (i.e., no repeated state checking)
Nodes removed: S A D B D G
Solution: S B D G
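Both of the last two methods are instances of best-first tree search; only the priority f differs (f = h for Greedy Best-First, f = g + h for A*). A minimal sketch, assuming the same (cost, child) adjacency format as above and a heuristic table h; because there is no Explored set, a state such as D can be removed more than once, as in the trace above.

import heapq

def best_first_tree_search(graph, h, start, goal, use_g):
    # Frontier entries are (f, node, g, path); ties on f break alphabetically.
    frontier = [(h[start], start, 0, [start])]
    removed = []
    while frontier:
        f, node, g, path = heapq.heappop(frontier)
        removed.append(node)            # no Explored set: repeats are allowed
        if node == goal:
            return removed, path
        for cost, child in graph[node]:
            g2 = g + cost
            f2 = (g2 + h[child]) if use_g else h[child]
            heapq.heappush(frontier, (f2, child, g2, path + [child]))
    return removed, None

# use_g=False gives Greedy Best-First; use_g=True gives A*.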
Consider the following game tree in which the root corresponds to a MAX node and the values
of a static evaluation function, if applied, are given at the leaves.
(a) [4] What are the minimax values computed at each node in this game tree? Write your
answers to the LEFT of each node in the tree above.
E=3, F=8, G=7, H=1, I=5, J=8, K=10, B=3, C=1, D=8, A=8
(b) [4] Which nodes are not examined when Alpha-Beta Pruning is performed? Assume
children are visited left to right.
O, Q, I, (T, U,) Y
(c) [3] Is there a different ordering of the children of the root for which more pruning would result
by Alpha-Beta? If so, give the order. If not, say why not.
Yes, when the children are ordered (D, B, C) or (D, C, B).
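For reference, a minimal Python sketch of minimax with alpha-beta pruning over an explicit tree, where a leaf is a number and an internal node is a list of children visited left to right (the node names above refer to the exam's figure, which is not reproduced):

def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if not isinstance(node, list):       # leaf: static evaluation value
        return node
    if maximizing:
        value = float('-inf')
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:            # remaining siblings are pruned
                break
        return value
    value = float('inf')
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value

Visiting the root's best subtree first (D, whose minimax value 8 is the largest) tightens alpha immediately, which is why the orderings beginning with D allow the most pruning.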
(d) [4] Now assume your opponent chooses her move uniformly at random (e.g., if there are two moves, half the time she picks the first move and half the time she picks the second) when it's her turn, and you know this. You still seek to maximize your chances of winning. What are the expected minimax values computed at each node in this case? Write your answers to the RIGHT of each node in the tree above.
At MAX nodes compute the maximum of the children's values, but at MIN nodes compute the average of the children's values. So, the backed-up values are E=3, F=8, G=7, H=1, I=5, J=8, K=10, B=(3+8+7)/3=6, C=(1+5)/2=3, D=(8+10)/2=9, A=9.
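The same backup can be sketched in Python by replacing the MIN step with an average, under the stated assumption that the opponent moves uniformly at random; here the already-computed values of E-K stand in as leaves:

def expectimax(node, maximizing):
    # MAX nodes take the max; the random opponent's nodes take the mean.
    if not isinstance(node, list):
        return node
    values = [expectimax(child, not maximizing) for child in node]
    return max(values) if maximizing else sum(values) / len(values)

tree = [[3, 8, 7], [1, 5], [8, 10]]      # A's children B, C, D
print(expectimax(tree, True))            # 9.0, reached via D = (8 + 10) / 2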
You are given the following table of distances between all pairs of five clusters:

        A     B     C     D     E
A       -  1075  2013  2054   996
B    1075     -  3273  2687  2037
C    2013  3273     -   808  1307
D    2054  2687   808     -  1059
E     996  2037  1307  1059     -
(a) [3] Which pair of clusters will be merged into one cluster at the next iteration of hierarchical agglomerative clustering using single linkage?
C and D, since their distance of 808 is the smallest in the table.
(b) [3] What will the new values be in the resulting table corresponding to the four new clusters?
Include the cluster names in the first row and first column; if clusters x and y were merged,
name that cluster x+y.
        A     B   C+D     E
A       -  1075  2013   996
B    1075     -  2687  2037
C+D  2013  2687     -  1059
E     996  2037  1059     -
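This single-linkage step can be sketched in Python directly from the table above:

# Pairwise distances from the table (upper triangle).
dist = {('A', 'B'): 1075, ('A', 'C'): 2013, ('A', 'D'): 2054, ('A', 'E'): 996,
        ('B', 'C'): 3273, ('B', 'D'): 2687, ('B', 'E'): 2037,
        ('C', 'D'): 808,  ('C', 'E'): 1307, ('D', 'E'): 1059}

pair = min(dist, key=dist.get)           # closest pair: ('C', 'D') at 808
x, y = pair

def d(p, q):
    # Symmetric lookup into the upper-triangular table.
    return dist[(p, q)] if (p, q) in dist else dist[(q, p)]

# Single linkage: distance from the merged cluster x+y to any other
# cluster z is min(d(x, z), d(y, z)).
merged = {z: min(d(x, z), d(y, z)) for z in 'ABCDE' if z not in pair}
print(pair, merged)                      # ('C', 'D') {'A': 2013, 'B': 2687, 'E': 1059}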
Given a set of three points, -2, 0, and 10, we want to use k-Means Clustering with k = 2 to
cluster them into two clusters.
(a) [5] If the initial cluster centers are c1 = -4.0 and c2 = 1.0, show each successive iteration of
k-Means Clustering until no points change cluster, indicating at each iteration which points
belong to each cluster and the coordinates of the two cluster centers.
First iteration, with c1 = -4 and c2 = 1:

Point   Distance to c1 = -4   Distance to c2 = 1   Cluster
 -2              2                     3               1
  0              4                     1               2
 10             14                     9               2

The new cluster centers are c1 = -2 and c2 = 5.

Second iteration, with c1 = -2 and c2 = 5:

Point   Distance to c1 = -2   Distance to c2 = 5   Cluster
 -2              0                     7               1
  0              2                     5               1
 10             12                     5               2

The new cluster centers are c1 = -1 and c2 = 10.

Third iteration, with c1 = -1 and c2 = 10:

Point   Distance to c1 = -1   Distance to c2 = 10   Cluster
 -2              1                    12                1
  0              1                    10                1
 10             11                     0                2

No points change cluster, so k-Means halts with clusters {-2, 0} and {10} and centers c1 = -1 and c2 = 10.
(b) [3] Yes or No: k-Means Clustering is guaranteed to find the same final clusters for the
above three points, no matter what the initial cluster center values are.
No. For example, if initially c1 = 0 and c2 = 50 then all three
points will be assigned to cluster 1 and no points will be in
cluster 2. Then, updating the cluster centers, they become c1 =
(-2+0+10)/3 = 8/3 (approximately 2.7) and c2 = 50 (no change because
there are no points). These two cluster centers and their associated
points do not change in the next iteration, so the final clustering
has all three points in one cluster and none in the other.
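A minimal 1-D k-means sketch in Python reproduces both the trace in (a) and the degenerate run in (b); the rule that an empty cluster keeps its center matches the answer above.

def kmeans_1d(points, centers):
    # Assign each point to its nearest center, then move each center to
    # the mean of its assigned points; stop when assignments are stable.
    assign = None
    while True:
        new = [min(range(len(centers)), key=lambda j: abs(p - centers[j]))
               for p in points]
        if new == assign:
            return centers, assign
        assign = new
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:                  # an empty cluster keeps its center
                centers[j] = sum(members) / len(members)

print(kmeans_1d([-2, 0, 10], [-4.0, 1.0]))   # ([-1.0, 10.0], [0, 0, 1])
print(kmeans_1d([-2, 0, 10], [0.0, 50.0]))   # ([2.666..., 50.0], [0, 0, 0])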
(c) [3] k-NN can be thought of as an ensemble learning method using k 1-NN classifiers.
Random forests are another ensemble learning method. For a 2-class classification
problem, what operation is the same in both ensemble methods?
This is a flawed question, in that k 1-NN classifiers would all
classify an example the same way. What was intended was that the
first 1-NN classifier uses the closest neighbor's class, the second
uses the second-closest neighbor's class, etc. In any event, the
common operation is the way the individual classifiers are combined:
a majority vote over their outputs determines the output class in
both the k-NN and random forest algorithms.
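The shared combination step is easy to sketch in Python: one vote per neighbor in k-NN, one vote per tree in a random forest, with the most common class winning (for a 2-class problem an odd number of voters avoids ties):

from collections import Counter

def majority_vote(predictions):
    # predictions: one class label per base classifier.
    return Counter(predictions).most_common(1)[0][0]

print(majority_vote(['+', '-', '+']))    # '+'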
(ii) [5] Write an expression in terms of logs and fractions for computing the conditional
entropy (also called Remainder) of choosing attribute A, i.e., H(C | A). You do not
need to simplify this or compute a numeric answer.
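The worked answer is not legible in this copy. In general, writing n for the total number of examples, n_v for those with A = v, and n_{v,c} for those with A = v and class c (notation assumed here, not taken from the exam), the Remainder is:

H(C \mid A) \;=\; \sum_{v \in \mathrm{Values}(A)} \frac{n_v}{n} \left( -\sum_{c} \frac{n_{v,c}}{n_v} \log_2 \frac{n_{v,c}}{n_v} \right)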
(b) [3] In a problem where each example has n binary attributes, to select the best attribute for
a decision tree node at depth k, where the root node is at depth 0, how many attributes are
candidates for selection at this node?
n - k (each binary attribute is tested at most once along a path, and the k ancestors of a node at depth k have already used k distinct attributes).
(c) [3] Say we use the following method to prune a decision tree: Iteratively remove non-leaf
nodes using a tuning set equal to the training set until no improvement is made in the
classification accuracy on the tuning set. How will the final pruned tree compare to the
original decision tree in terms of classification accuracy? Justify your answer.
Using a tuning set identical to the training set means the original
decision tree's accuracy on the tuning set is 100% (assuming the
training data is consistent), so no pruning will be done, because no
pruning can improve accuracy on the tuning set. The final pruned
tree will therefore be identical to the original decision tree, with
the same classification accuracy.
(d) [3] After constructing a decision tree from a training set that contains many attributes, you find that the training-set accuracy is very high but the testing-set accuracy is low. Explain the likely cause of this situation and what might be done to fix it.
The likely cause is overfitting the training data. One possible
solution is to prune the tree. Alternatively, use a random
forest.