
MALAVIYA NATIONAL INSTITUTE OF TECHNOLOGY, JAIPUR
Department of Computer Science and Engineering
Machine Learning Lab 2018

Exercise Sheet

Machine learning is, as artificial intelligence pioneer Arthur Samuel said, a way of
programming that gives computers “the ability to learn without being explicitly programmed”.
It is a way to program computers to extract meaningful patterns from examples, and to use
those patterns to make decisions or predictions. This is a two-step process: first, a model
with adjustable parameters is fitted to the available data; then new test data is introduced
and the model is left to extrapolate. The aim is to develop a model that balances precisely
capturing the behavior of known data against generalizing well enough to predict the
behavior of unknown data.

Note:
(1) LFD refers to the textbook “Learning from Data”.
(2) All the problems are to be implemented in Python. The exercises are to be done
individually.
(3) The submission date is the last lab day for the exercise. The number of labs for each
exercise is mentioned alongside.

Following is the list of exercises to be done as part of the Machine Learning Lab course.

1. (10 points, Labs - 1) Generate a linearly separable (random) data set of size 20. Plot
the examples {(xn , yn )} as well as the target function f on a plane. Be sure to mark
the examples from different classes differently, and add labels to the axes of the plot.
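A minimal sketch of one possible setup, assuming the inputs are drawn uniformly from [-1, 1] x [-1, 1] and the target f is a random line through that square (any separable construction is acceptable):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)

    # Target function f: a line through two random points p and q, written as
    # sign(w_f . [1, x1, x2]) with w_f the homogeneous line coefficients.
    p, q = rng.uniform(-1, 1, size=(2, 2))
    w_f = np.array([p[0] * q[1] - p[1] * q[0], p[1] - q[1], q[0] - p[0]])

    # 20 random points labelled by f; by construction the set is linearly separable.
    X = rng.uniform(-1, 1, size=(20, 2))
    y = np.sign(np.c_[np.ones(20), X] @ w_f)

    # Plot the two classes with different markers and overlay the target line.
    plt.scatter(X[y == 1, 0], X[y == 1, 1], marker='o', label='y = +1')
    plt.scatter(X[y == -1, 0], X[y == -1, 1], marker='x', label='y = -1')
    x1 = np.linspace(-1, 1, 100)
    plt.plot(x1, -(w_f[0] + w_f[1] * x1) / w_f[2], 'k-', label='target f')
    plt.xlabel('x1'); plt.ylabel('x2'); plt.legend(); plt.show()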

2. Run the perceptron learning algorithm on the data set above. Report the number of
updates that the algorithm takes before converging. Plot the examples {(xn , yn )}, the
target function f , and the final hypothesis g in the same figure. Comment on whether
f is close to g.
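A minimal sketch of the perceptron learning algorithm, assuming the arrays X, y and the target weights w_f from the sketch under exercise 1 are still in scope:

    import numpy as np
    import matplotlib.pyplot as plt

    def pla(X, y):
        """Perceptron learning algorithm: returns the final weights and the update count."""
        Xb = np.c_[np.ones(len(X)), X]              # prepend the bias coordinate x0 = 1
        w = np.zeros(Xb.shape[1])
        updates = 0
        while True:
            wrong = np.flatnonzero(np.sign(Xb @ w) != y)
            if wrong.size == 0:                     # converged: no misclassified points
                break
            i = wrong[0]                            # pick any misclassified example
            w += y[i] * Xb[i]                       # PLA update rule
            updates += 1
        return w, updates

    w_g, n_updates = pla(X, y)
    print('updates before converging:', n_updates)

    # Plot the examples, the target f and the final hypothesis g in one figure.
    x1 = np.linspace(-1, 1, 100)
    plt.scatter(X[y == 1, 0], X[y == 1, 1], marker='o', label='y = +1')
    plt.scatter(X[y == -1, 0], X[y == -1, 1], marker='x', label='y = -1')
    plt.plot(x1, -(w_f[0] + w_f[1] * x1) / w_f[2], 'k-', label='target f')
    plt.plot(x1, -(w_g[0] + w_g[1] * x1) / w_g[2], 'r--', label='hypothesis g')
    plt.xlabel('x1'); plt.ylabel('x2'); plt.legend(); plt.show()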

3. Repeat everything in (2) with another randomly generated data set of size 100. Compare
your results with (2).

4. (10 points, Labs - 1) Write a Python script that can find w0 and w1 for an arbitrary
dataset of number of hours studied versus rank of a student, given as {(xn , yn )} pairs.
Find the linear model, y = w^T x, that minimizes the squared loss. Derive the optimal
parameter value, ŵ, for the total training loss L = Σ_{n=1}^{N} (yn − w^T xn)^2. Using the
model, predict the rank for a given number of hours studied.
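A minimal sketch of the closed-form least-squares solution; the arrays hours and rank_ below are hypothetical placeholders for the actual dataset:

    import numpy as np

    hours = np.array([1.0, 2.0, 3.0, 4.5, 6.0, 7.5])        # hypothetical (xn) values
    rank_ = np.array([40.0, 35.0, 28.0, 20.0, 12.0, 8.0])   # hypothetical (yn) values

    # Design matrix with xn = [1, hours_n], so y = w0 + w1 * hours.
    X = np.c_[np.ones_like(hours), hours]
    # Least-squares solution of L = sum_n (yn - w^T xn)^2, i.e. w = (X^T X)^-1 X^T y.
    w = np.linalg.lstsq(X, rank_, rcond=None)[0]
    print('w0 = %.3f, w1 = %.3f' % (w[0], w[1]))

    # Predict the rank for a given number of hours studied.
    hours_new = 5.0
    print('predicted rank for %.1f hours: %.1f' % (hours_new, w[0] + w[1] * hours_new))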




Load the data stored in the file syntheticdata.mat. Fit a 4th-order polynomial function
f(x; w) = w0 + w1 x + w2 x^2 + w3 x^3 + w4 x^4 to this data. What do you notice about w2
and w4?
Fit a function f(x; w) = w0 + w1 x + w2 sin((x − a)/b) to this data, assuming a and b are
fixed in some sensible range. Show a least-squares fit using this model. What do you
notice about w1 and w2? Comment on generalization and overfitting.
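A minimal sketch of both fits, assuming syntheticdata.mat stores the inputs and targets under the keys 'x' and 'y' (inspect the loaded dictionary if the actual key names differ) and that a and b are simply fixed by hand:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.io import loadmat

    data = loadmat('syntheticdata.mat')
    x = data['x'].ravel()          # assumed key names; check data.keys() first
    y = data['y'].ravel()

    # 4th-order polynomial fit: design matrix with columns [1, x, x^2, x^3, x^4].
    Phi = np.vander(x, 5, increasing=True)
    w_poly = np.linalg.lstsq(Phi, y, rcond=None)[0]
    print('polynomial weights w0..w4:', w_poly)

    # f(x; w) = w0 + w1*x + w2*sin((x - a)/b) with a, b fixed to hypothetical values.
    a, b = 0.0, 1.0
    Phi_sin = np.c_[np.ones_like(x), x, np.sin((x - a) / b)]
    w_sin = np.linalg.lstsq(Phi_sin, y, rcond=None)[0]
    print('sin-model weights w0..w2:', w_sin)

    # Overlay the two least-squares fits on the data.
    xs = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x, y, s=10, label='data')
    plt.plot(xs, np.vander(xs, 5, increasing=True) @ w_poly, label='4th-order polynomial')
    plt.plot(xs, np.c_[np.ones_like(xs), xs, np.sin((xs - a) / b)] @ w_sin, label='sin model')
    plt.xlabel('x'); plt.ylabel('y'); plt.legend(); plt.show()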

5. (15 points, Labs - 2) Handwritten Digits Data: Download the two handwritten-digit data
files: training data (ZipDigits.train) and test data (ZipDigits.test).
Each row is a data example. The first entry is the digit, and the next 256 are grayscale
values between −1 and 1. The 256 pixels correspond to a 16 × 16 image. For this
problem, we will only use the 1 and 4 digits, so remove the other digits from your
training and test examples. Please submit your Python code implementing the logistic
regression for classification using gradient descent.

• (5 points) Familiarize yourself with the data by giving a plot of two of the digit
images.
• (5 points) Develop two features to measure properties of the image that would
be useful in distinguishing between 1 and 4. You may use symmetry and average
intensity (as discussed in class).
• (5 points) As in the text, give a 2-D scatter plot of your features: for each data
example, plot the two features with a red 'x' if it is a 4 and a blue 'o' if it is
a 1.
• (35 points) Classifying Handwritten Digits: 1 vs. 4. Implement logistic regression
for classification using gradient descent to find the best separator you can using
the training data only (use your 2 features from the above question as the inputs);
a minimal sketch is given after this list. The output is +1 if the example is a 1
and −1 for a 4.
• (5 points) Give separate plots of the training and test data, together with the
separators.
• (10 points) Compute Ein on your training data and Etest , the test error on the
test data after 1000 iterations.
• (10 points) Now repeat the above using a 3rd order polynomial transform.
• (10 points) As your final deliverable to a customer, would you use the linear
model with or without the 3rd order polynomial transform? Explain.
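A minimal sketch of logistic regression trained by gradient descent on the two features, assuming they have already been extracted into an (N, 2) array features with labels in {+1, −1}; both names are placeholders:

    import numpy as np
    from scipy.special import expit      # expit(t) = 1 / (1 + exp(-t))

    def logistic_loss(w, Xb, y):
        """E(w) = (1/N) * sum_n ln(1 + exp(-y_n w^T x_n))."""
        return np.mean(np.logaddexp(0.0, -y * (Xb @ w)))

    def logistic_grad(w, Xb, y):
        """Gradient: -(1/N) * sum_n y_n x_n * sigma(-y_n w^T x_n)."""
        return -(Xb.T @ (y * expit(-y * (Xb @ w)))) / len(y)

    def gradient_descent(X, y, lr=0.1, iters=1000):
        Xb = np.c_[np.ones(len(X)), X]   # add the bias coordinate
        w = np.zeros(Xb.shape[1])
        for _ in range(iters):
            w -= lr * logistic_grad(w, Xb, y)
        return w

    # Hypothetical usage once `features` / `labels` (train) are built:
    # w = gradient_descent(features, labels, lr=0.1, iters=1000)
    # E_in = logistic_loss(w, np.c_[np.ones(len(features)), features], labels)

For the 3rd-order polynomial transform, the same functions can be reused after mapping each feature pair (x1, x2) to all monomials of degree up to 3.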

6. (20 points, Labs - 2) Logistic regression can also be augmented with the l2-norm
regularization:
    min_w  E(w) + λ ||w||_2^2,
where E(w) is the logistic loss. Please change your gradient descent algorithm accordingly
and use cross-validation to determine the best regularization parameter; a minimal sketch
is given after the list below.

• (10 points) Plot the training and testing performance curves.




• (10 points) Indicate in the plot the best regularization parameter you obtained
(using cross validation).
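A minimal sketch of the regularized version and of choosing λ by k-fold cross-validation; the features/labels placeholders are the same as in the sketch under exercise 5, and (for simplicity) the bias weight is regularized as well:

    import numpy as np
    from scipy.special import expit

    def logistic_grad(w, Xb, y):         # same helper as in the exercise 5 sketch
        return -(Xb.T @ (y * expit(-y * (Xb @ w)))) / len(y)

    def gd_regularized(X, y, lam, lr=0.1, iters=1000):
        """Gradient descent on E(w) + lam * ||w||_2^2 (gradient of the penalty: 2*lam*w)."""
        Xb = np.c_[np.ones(len(X)), X]
        w = np.zeros(Xb.shape[1])
        for _ in range(iters):
            w -= lr * (logistic_grad(w, Xb, y) + 2.0 * lam * w)
        return w

    def cv_error(X, y, lam, k=5, seed=0):
        """k-fold cross-validation classification error for one value of lam."""
        folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
        errs = []
        for i in range(k):
            val = folds[i]
            trn = np.concatenate([folds[j] for j in range(k) if j != i])
            w = gd_regularized(X[trn], y[trn], lam)
            Xv = np.c_[np.ones(len(val)), X[val]]
            errs.append(np.mean(np.sign(Xv @ w) != y[val]))
        return np.mean(errs)

    # Hypothetical usage: pick the lambda with the smallest cross-validation error.
    # lambdas = [0.0, 1e-3, 1e-2, 1e-1, 1.0]
    # best_lam = min(lambdas, key=lambda lam: cv_error(features, labels, lam))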

7. (20 points, Labs - 3) In this problem you will implement forward and backward
propagation methods for a multi-layer neural network with K hidden layers. Assume
that K is a user input less than 10. Implement the networks separately with the
following activation functions:

• Sigmoid: Derive the gradient of the activation function. Confirm with numerical
differentiation.
• Tanh: Derive the gradient of the activation function. Confirm with numerical
differentiation.

Assume that the last layer has a linear activation function and the loss function is
l(y, ŷ) = ||y − ŷ||_2^2. Submit your code (along with any instructions necessary to run it),
the forward-pass outputs at each layer, and the gradients of the parameters W^k_ij and b^k_i.
The input, output and the parameters of the network can be found in the MAT file
associated with this problem.
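A minimal sketch of the forward and backward passes for one example, assuming fully connected layers, K hidden layers with the chosen activation, a linear output layer and the loss ||y − ŷ||_2^2; the actual inputs and parameters come from the MAT file:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_grad(z):                      # derived: sigma'(z) = sigma(z) * (1 - sigma(z))
        s = sigmoid(z)
        return s * (1.0 - s)

    def tanh_grad(z):                         # derived: tanh'(z) = 1 - tanh(z)^2
        return 1.0 - np.tanh(z) ** 2

    def forward(x, Ws, bs, act):
        """Returns pre-activations zs and activations hs per layer (last layer linear)."""
        hs, zs = [x], []
        for k, (W, b) in enumerate(zip(Ws, bs)):
            z = W @ hs[-1] + b
            zs.append(z)
            hs.append(z if k == len(Ws) - 1 else act(z))
        return zs, hs

    def backward(y, zs, hs, Ws, act_grad):
        """Gradients of l = ||y - y_hat||_2^2 with respect to every W^k and b^k."""
        delta = 2.0 * (hs[-1] - y)            # dl/dz at the linear output layer
        dWs, dbs = [], []
        for k in reversed(range(len(Ws))):
            dWs.insert(0, np.outer(delta, hs[k]))
            dbs.insert(0, delta)
            if k > 0:
                delta = (Ws[k].T @ delta) * act_grad(zs[k - 1])
        return dWs, dbs

    # Confirm the analytic activation gradient with central-difference differentiation.
    z = np.linspace(-3.0, 3.0, 7)
    num = (sigmoid(z + 1e-6) - sigmoid(z - 1e-6)) / 2e-6
    print('max sigmoid-gradient error:', np.max(np.abs(num - sigmoid_grad(z))))  # close to 0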

8. (20 points, Labs - 2) In this problem you will train a multi-layer neural network
to recognize handwritten digits. Use the multi-layer neural network (with ReLU
activation) that you implemented in the previous homework. Use 32 nodes in each
layer and initialize the weights randomly. The data is also provided to you in a MAT
file.

• Report the training and validation accuracy as a function of iterations (with 5
hidden layers). Report the convergence speed of the training procedure (with 5
hidden layers) for the Stochastic Gradient Descent optimization algorithm.
• Determine the number of hidden layers required via cross-validation. Report the
training and validation accuracy for cross-validation.
• Finally, report the best test error that you can achieve.
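A minimal sketch of the training loop, reusing the forward and backward helpers from the sketch under exercise 7 with a ReLU activation; X_train, Y_train, X_val, Y_val are placeholders for the data loaded from the MAT file, with one-hot digit targets:

    import numpy as np

    relu = lambda z: np.maximum(0.0, z)
    relu_grad = lambda z: (z > 0).astype(float)

    def init_params(sizes, rng):
        """Random initialization for layer sizes such as [n_in, 32, ..., 32, 10]."""
        Ws = [rng.normal(0.0, 0.1, (sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
        bs = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]
        return Ws, bs

    def sgd_epoch(X, Y, Ws, bs, lr, rng):
        """One stochastic gradient descent pass over the training set."""
        for i in rng.permutation(len(X)):
            zs, hs = forward(X[i], Ws, bs, relu)              # from the exercise 7 sketch
            dWs, dbs = backward(Y[i], zs, hs, Ws, relu_grad)  # from the exercise 7 sketch
            for k in range(len(Ws)):
                Ws[k] -= lr * dWs[k]
                bs[k] -= lr * dbs[k]

    def accuracy(X, Y, Ws, bs):
        preds = np.array([np.argmax(forward(x, Ws, bs, relu)[1][-1]) for x in X])
        return np.mean(preds == np.argmax(Y, axis=1))         # assumes one-hot targets Y

    # Hypothetical usage with 5 hidden layers of 32 nodes each:
    # rng = np.random.default_rng(0)
    # Ws, bs = init_params([X_train.shape[1]] + [32] * 5 + [10], rng)
    # for epoch in range(50):
    #     sgd_epoch(X_train, Y_train, Ws, bs, lr=0.01, rng=rng)
    #     print(epoch, accuracy(X_train, Y_train, Ws, bs), accuracy(X_val, Y_val, Ws, bs))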

9. (20 points, Labs - 2) Classify the handwritten digits data given in exercise 5 above using
a Support Vector Machine. Compute the values of the weight vector w and the offset b, and
draw the separating hyperplane.
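A minimal sketch using a linear SVM from scikit-learn (one possible choice) on the same 2-D features/labels placeholders built for the digits exercise above; w and b are read off the fitted model:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.svm import SVC

    clf = SVC(kernel='linear', C=1.0)
    clf.fit(features, labels)                 # placeholders from the digits exercise
    w = clf.coef_[0]                          # normal vector of the separating hyperplane
    b = clf.intercept_[0]                     # offset
    print('w =', w, 'b =', b)

    # Draw the hyperplane w . x + b = 0 over a scatter plot of the two features.
    plt.scatter(features[:, 0], features[:, 1], c=labels, cmap='bwr', s=10)
    x1 = np.linspace(features[:, 0].min(), features[:, 0].max(), 100)
    plt.plot(x1, -(w[0] * x1 + b) / w[1], 'k-', label='w . x + b = 0')
    plt.xlabel('feature 1'); plt.ylabel('feature 2'); plt.legend(); plt.show()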

End of Exercise
