
List of Files for this assignment:

[o] ex1.m - main file to execute the program for the single-variable part of this assignment.
[o] ex1_multi.m - main file to execute the program for the multiple-variables part of this assignment.
[o] computeCost.m - Function to compute the cost of linear regression
[o] gradientDescent.m - Function to run gradient descent
[o] computeCostMulti.m - Cost function for multiple variables
[o] gradientDescentMulti.m - Gradient descent for multiple variables
[o] normalEqn.m - Function to compute the normal equations
[o] indicates files you will need to complete

Section 1: Linear regression with one variable


Part 1: Plotting the Data
The loading of training data (assigned to variables X and y in the code) has already been
done for you as follows:

data = load('ex1data1.txt'); % read comma separated data


X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

The training data can be visualized as a scatter plot in 2D. The code to do this has also
been written for you.
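
For reference, a minimal sketch of what such plotting code might look like (the marker
style and axis labels here are placeholder assumptions, not the provided code):

figure;                                 % open a new figure window
plot(X, y, 'rx', 'MarkerSize', 10);     % plot each training example as a red cross
xlabel('x (feature)');                  % placeholder axis labels
ylabel('y (target)');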

Part 2: Gradient Descent + Computing J


The backbone code for doing gradient descent is already written for you.

For this part, you should implement the computeCost() and gradientDescent() functions.
Note that for this assignment, gradientDescent() should not only perform gradient descent,
but also compute the value of the cost function (by invoking computeCost()) at the end of
every gradient descent iteration. Although this is irrelevant to the optimization process
itself, we want to check that the cost function value actually decreases at each iteration
(see Part 3 for details).
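
As an illustration only (not the required solution), a minimal vectorized sketch of the
two functions, assuming X already contains a leading column of ones, theta is a column
vector, and the per-iteration costs are returned in a vector named J_history (the name is
an assumption):

function J = computeCost(X, y, theta)
    m = length(y);                           % number of training examples
    errors = X * theta - y;                  % prediction errors
    J = (1 / (2 * m)) * (errors' * errors);  % squared-error cost
end

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y);
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
        theta = theta - (alpha / m) * (X' * (X * theta - y));  % batch update rule
        J_history(iter) = computeCost(X, y, theta);            % record cost every iteration
    end
end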

After implementing computeCost() and gradientDescent(), you should include lines of code
that visualize the best-fit line over the given training set.

A successful implementation of computeCost() and gradientDescent(), together with the
correct choice of plotting function(s), results in a figure like the following:
Aside from drawing the linear fit, the TAs will check whether you have labeled the x- and
y-axes. They will also require a legend describing what is shown in the plot. Failure to
do this will result in penalties.
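
One possible way to draw the fitted line on top of the data, with labels and a legend
(the variable names and label text are assumptions; X is assumed to have a column of ones
prepended):

hold on;                                         % keep the scatter plot of the training data
plot(X(:, 2), X * theta, '-');                   % fitted line over the feature values
legend('Training data', 'Linear regression');    % legend describing both plotted items
xlabel('x (feature)');                           % placeholder axis labels
ylabel('y (target)');
hold off;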

Part 3: Visualizing J as a function of iteration number


Now you should include lines of code that visualize J as a function of the iteration
number, to confirm that J decreases with each iteration.
A correct choice of plotting function(s) results in something like this:
As before, the TAs will check whether you have labeled the x- and y-axes. Failure to do
this will result in penalties.
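
If gradientDescent() returns the per-iteration costs in a vector (called J_history here as
an assumption), the convergence plot can be produced along these lines:

figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);  % cost value after each iteration
xlabel('Number of iterations');
ylabel('Cost J');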

Part 4: Visualizing J as a function of thetas


Now that you have seen the cost function value decrease properly with each iteration,
let's plot J as a function of the parameters theta0 and theta1. As stated in class, there
are two ways to do this: the 2D way or the 3D way. The 2D way is to draw a contour plot;
the 3D way is to draw a surface plot.
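
A sketch of both approaches, evaluating J over a grid of parameter values (the grid ranges
below are arbitrary placeholders and may need adjusting for your data):

theta0_vals = linspace(-10, 10, 100);       % grid for the intercept parameter
theta1_vals = linspace(-1, 4, 100);         % grid for the slope parameter
J_vals = zeros(length(theta0_vals), length(theta1_vals));
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        J_vals(i, j) = computeCost(X, y, [theta0_vals(i); theta1_vals(j)]);
    end
end
J_vals = J_vals';                           % transpose so rows/columns match surf/contour axes

figure;                                     % 3D way: surface plot
surf(theta0_vals, theta1_vals, J_vals);
xlabel('theta_0'); ylabel('theta_1');

figure;                                     % 2D way: contour plot with log-spaced levels
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20));
xlabel('theta_0'); ylabel('theta_1');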

Section 2: Linear regression with multiple variables


That's it for ex1.m, computeCost.m, and gradientDescent.m! In this part, you will
implement the rest of the source files to do linear regression with multiple variables to
predict the prices of houses.

ex1data2.txt contains a training set of housing prices in Portland, Oregon.


- The first column is the size of the house (in square feet).
- The second column is the number of bedrooms.
- The third column is the price of the house.

The initialization part of the ex1_multi.m script has been set up to help you step through
this exercise. Note that it performs feature scaling and mean normalization (in a single
line of code) to make gradient descent converge much more quickly.
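
For intuition, mean normalization and feature scaling amount to something like the sketch
below (applied before the column of ones is added; the element-wise form relies on
implicit expansion, available in Octave and MATLAB R2016b+, and the provided script may be
written differently):

mu = mean(X);              % per-feature mean (row vector)
sigma = std(X);            % per-feature standard deviation (row vector)
X = (X - mu) ./ sigma;     % subtract the mean and divide by the standard deviation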
Part 5: Gradient Descent + Computing J for Linear regression
with multiple variables
Previously, you implemented gradient descent on a univariate regression
problem. The only difference now is that there is one more feature in the
matrix X. The hypothesis function and the batch gradient descent update
rule remain unchanged.
You should complete the code in computeCostMulti.m and gradientDescentMulti.m to
implement the cost function and gradient descent for linear regression with multiple
variables. If your code in the previous part (single variable) already supports multiple
variables, you can use it here too.
Make sure your code supports any number of features and is well-vectorized. You can use
`size(X, 2)` to find out how many features are present in the dataset.
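
One way this plays out in practice (a sketch; the actual script may size and initialize
things differently):

m = length(y);                                    % number of training examples
n = size(X, 2);                                   % number of features, including the bias column
theta = zeros(n, 1);                              % parameter vector sized from the data
J = (1 / (2 * m)) * sum((X * theta - y) .^ 2);    % vectorized cost, valid for any n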

Part 6: Normal Equations for Linear regression with multiple variables

The closed-form solution to linear regression is the normal equation,
theta = (X' * X)^(-1) * X' * y (written here in MATLAB-style notation). Using this normal
equation gives an exact solution in one calculation, with no iterative loop.
Complete the code in normalEqn.m to use this equation to calculate the parameter vector.
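
A minimal sketch of normalEqn.m, using pinv for numerical robustness (an implementation
choice on my part, not a stated requirement):

function theta = normalEqn(X, y)
    theta = pinv(X' * X) * X' * y;    % closed-form least-squares solution
end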
