
Birla Institute of Technology & Science, Pilani

Work-Integrated Learning Programmes Division


First Semester 2018-2019
Comprehensive Examination (EC-3 Regular)
Course No. : BA ZG522
Course Title : BUSINESS DATA MINING
Nature of Exam : Open Book
Weightage : 45%
No. of Pages : 2
Duration : 3 Hours
No. of Questions : 6
Date of Exam : 24/11/2018 (AN)
Note:
1. Please follow all the Instructions to Candidates given on the cover page of the answer book.
2. All parts of a question should be answered consecutively. Each answer should start from a fresh page.
3. Assumptions made if any, should be stated clearly at the beginning of your answer.
Q.1. Provide short & crisp answers to the following [5 X 2 = 10]
(a) Compare & Contrast Data Mining and Data Warehousing.
(b) Why is the k-NN (k-nearest-neighbor) classifier considered a lazy classifier? What is
the disadvantage of using lazy classifiers?
(c) Differentiate between the slice and dice operations in OLAP (online analytical processing).
(d) What is the significance of stationarity for a time series?
(e) Differentiate statistical outliers from distance-based outliers.

Q.2. Answer the following with necessary steps [5 X 3 = 15]


(a) For the following vectors x and y, calculate the indicated similarity or distance
measures:
x = (1,1,1,1), y = (3,3,3,3). Compute the cosine similarity and the Manhattan and Euclidean
distances.
(b) Given the following two objects with five binary attributes.
         Attribute1  Attribute2  Attribute3  Attribute4  Attribute5
Object1  1           1           0           0           1
Object2  1           0           1           0           1
a) What is the distance between the objects if variables are symmetric?
b) What is the distance between the objects if variables are asymmetric?
c) What is Jaccard coefficient for the objects?
(c) Consider the one-dimensional data set shown in Table below:
x   0.5  3.0  4.5  4.8  4.9  5.2  5.3  5.7  7.0  9.5
y   -    -    +    +    -    -    +    +    -    -
Classify the data point x = 5.0 according to its 1-, 3-, 5-, and 9-nearest neighbors
(using majority vote).
(d) Given below is the confusion matrix of a 2-class classifier:
                  Predicted Class
Actual Class      Class 1    Class 2
Class 1           3          2
Class 2           1          4
Compute the precision, recall, and F-measure for the model.

(e) Given three rules {p} → {q}, {p} → {q, r}, & {p, r} → {q}, can you identify the rule
that has the lowest confidence? Can you identify the rule with the highest confidence?
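
For reference, the measures named in Q.2(a) and the majority-vote procedure in Q.2(c) can be sketched in Python as follows. This is a minimal illustration, not a model answer: the function names are assumptions introduced here, and the nearest-neighbour ranking assumes plain absolute difference as the one-dimensional distance.

```python
import math
from collections import Counter


def cosine_similarity(x, y):
    """Dot product divided by the product of the two vector norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def manhattan(x, y):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Square root of the sum of squared differences (L2 distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_label(points, labels, query, k):
    """Majority vote among the k points closest to `query` (1-D data)."""
    ranked = sorted(zip(points, labels), key=lambda pair: abs(pair[0] - query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Data from Q.2(a)
x, y = (1, 1, 1, 1), (3, 3, 3, 3)
print(cosine_similarity(x, y), manhattan(x, y), euclidean(x, y))

# Data from Q.2(c)
xs = [0.5, 3.0, 4.5, 4.8, 4.9, 5.2, 5.3, 5.7, 7.0, 9.5]
ys = ["-", "-", "+", "+", "-", "-", "+", "+", "-", "-"]
for k in (1, 3, 5, 9):
    print(k, knn_label(xs, ys, 5.0, k))
```

Because every k used here is odd and there are only two classes, the vote can never tie, so no tie-breaking rule is needed.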

BA ZG522 (EC-3 Regular) First Semester 2018-2019 Page 1 of 2



Q.3. Suppose that Apriori algorithm is applied to the data set shown in table below with
minimum support = 30%. Answer the questions given below [2 + 3 = 5]

Transaction ID   Items Bought
1 {apple, banana, dates, guava}
2 {banana, coconut, dates}
3 {apple, banana, dates, guava}
4 {apple, coconut, dates, guava}
5 {banana, coconut, dates, guava}
6 {banana, dates, guava}
7 {coconut, dates}
8 {apple, banana, coconut}
9 {apple, dates, guava}
10 {banana, dates}

a. What is the percentage of frequent itemsets (with respect to all possible itemsets)?
b. What is the false alarm rate (i.e., percentage of candidate itemsets that are found to be
infrequent after performing support counting)?
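
The level-wise support counting that Q.3 refers to can be sketched as below. This is a minimal illustration under stated assumptions: candidates of size k are formed by joining frequent (k-1)-itemsets and pruning any candidate with an infrequent subset, only non-empty itemsets are counted, and the function and variable names are introduced here for illustration. Different conventions (for example, counting the empty itemset) change the resulting percentages.

```python
from itertools import combinations

# Transactions from the table in Q.3
transactions = [
    frozenset({"apple", "banana", "dates", "guava"}),
    frozenset({"banana", "coconut", "dates"}),
    frozenset({"apple", "banana", "dates", "guava"}),
    frozenset({"apple", "coconut", "dates", "guava"}),
    frozenset({"banana", "coconut", "dates", "guava"}),
    frozenset({"banana", "dates", "guava"}),
    frozenset({"coconut", "dates"}),
    frozenset({"apple", "banana", "coconut"}),
    frozenset({"apple", "dates", "guava"}),
    frozenset({"banana", "dates"}),
]


def apriori(transactions, min_support):
    """Return (frequent itemsets with counts, #candidates counted, #infrequent)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    candidates = [frozenset([item]) for item in items]
    frequent, n_candidates, n_infrequent = {}, 0, 0
    k = 1
    while candidates:
        n_candidates += len(candidates)
        level = {}
        for cand in candidates:
            count = sum(1 for t in transactions if cand <= t)
            if count / n >= min_support:
                level[cand] = count
            else:
                n_infrequent += 1
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep only candidates whose (k-1)-subsets are all frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent, n_candidates, n_infrequent


frequent, n_candidates, n_infrequent = apriori(transactions, 0.30)
n_items = len(set().union(*transactions))
print("frequent itemsets:", len(frequent), "of", 2 ** n_items - 1, "non-empty itemsets")
print("candidates counted:", n_candidates, "infrequent:", n_infrequent)
```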

Q.4. Draw the single-link dendrogram for the following distance data among 5 points. If we need
to restrict the inter-cluster distance to at least 0.20, what is the maximum number of clusters?
[3 + 2 = 5]
      p1     p2     p3     p4     p5
p1    0      0.95   0.69   0.48   0.65
p2    0.95   0      0.36   0.53   0.17
p3    0.69   0.36   0      0.56   0.21
p4    0.48   0.53   0.56   0      0.24
p5    0.65   0.17   0.21   0.24   0
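
The single-link merge order behind such a dendrogram can be sketched with a naive agglomerative loop. This is an illustrative sketch only: the names `single_link` and `max_clusters` are assumptions introduced here, points are indexed 0 to 4 for p1 to p5, and the cluster count relies on single-link merge heights never decreasing.

```python
from itertools import combinations

# Distance matrix from Q.4 (row/column order p1..p5)
D = [
    [0.00, 0.95, 0.69, 0.48, 0.65],
    [0.95, 0.00, 0.36, 0.53, 0.17],
    [0.69, 0.36, 0.00, 0.56, 0.21],
    [0.48, 0.53, 0.56, 0.00, 0.24],
    [0.65, 0.17, 0.21, 0.24, 0.00],
]


def single_link(dist):
    """Return the list of merges as (height, cluster_a, cluster_b)."""
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Single link: cluster distance is the closest pair across clusters.
            d = min(dist[i][j] for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((d, sorted(a), sorted(b)))
    return merges


def max_clusters(merges, n_points, min_distance):
    """Clusters left after performing every merge below `min_distance`."""
    return n_points - sum(1 for height, _, _ in merges if height < min_distance)


merges = single_link(D)
for height, a, b in merges:
    print(height, ["p%d" % (i + 1) for i in a], ["p%d" % (i + 1) for i in b])
print("clusters at threshold 0.20:", max_clusters(merges, len(D), 0.20))
```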

Q.5. Use the k-means algorithm to cluster the following points into three clusters. Use Manhattan
distance to compute the distance between points. Consider A, C, and G as the initial centroids for
the three clusters. [5]
A       B      C      D      E      F      G      H
(2,10)  (2,5)  (8,4)  (5,8)  (7,5)  (6,4)  (1,2)  (4,9)
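
The iteration Q.5 describes can be sketched as follows. This is a minimal illustration, assuming that points are assigned by Manhattan distance while centroids are updated as coordinate-wise means, as in standard k-means; the function name `kmeans_manhattan` and the stopping rule (stop when assignments no longer change) are assumptions introduced here.

```python
points = {
    "A": (2, 10), "B": (2, 5), "C": (8, 4), "D": (5, 8),
    "E": (7, 5), "F": (6, 4), "G": (1, 2), "H": (4, 9),
}


def manhattan(p, q):
    """L1 distance between two 2-D points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def kmeans_manhattan(points, initial, max_iter=100):
    """Assign by Manhattan distance, update centroids as means, until stable."""
    centroids = [points[name] for name in initial]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_assignment = {
            name: min(range(len(centroids)), key=lambda i: manhattan(p, centroids[i]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == i]
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    clusters = [
        sorted(n for n, c in assignment.items() if c == i)
        for i in range(len(centroids))
    ]
    return clusters, centroids


clusters, centroids = kmeans_manhattan(points, ["A", "C", "G"])
for members, centre in zip(clusters, centroids):
    print(members, centre)
```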

Q.6. The demand for a product for 5 consecutive months is given in the table below:
Month 1 2 3 4 5
Demand (in thousands) 20 21 23 24 25
Use exponential smoothing with smoothing factors (α) of 0.9 and 0.1 to forecast
demand for the 6th month. Compare the results obtained in both cases. [3 + 2 = 5]
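
The recurrence used in simple exponential smoothing, F(t+1) = α·D(t) + (1 − α)·F(t), can be sketched as below. This is a minimal illustration that assumes the first forecast equals the first observed demand (the paper does not fix an initial value); the function name is introduced here for illustration.

```python
def exponential_smoothing_forecast(demand, alpha, initial=None):
    """Return the forecast for the period following the last observation."""
    # Assumption: with no initial forecast given, start from the first demand.
    forecast = demand[0] if initial is None else initial
    for actual in demand:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Demand (in thousands) for months 1 to 5, from Q.6
demand = [20, 21, 23, 24, 25]
for alpha in (0.9, 0.1):
    print(alpha, round(exponential_smoothing_forecast(demand, alpha), 4))
```

A larger α weights recent demand more heavily, so the forecast tracks the latest observations closely; a smaller α smooths more and lags behind a rising series.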

************

