
Regression:

Predicting House Prices


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington

2015 Emily Fox & Carlos Guestrin

Machine Learning Specialization

Predicting house prices


How much is my house worth?

I want to list my house for sale


How much is my house worth?

$$ ????


Look at recent sales in my neighborhood


How much did they sell for?


Plot recent house sales (past 2 years)

[Figure: plot of recent house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Terminology:
x: feature, covariate, or predictor
y: observation or response

Predict your house by similar houses

[Figure: plot of recent house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

No house sold recently had exactly the same sq.ft.

Predict your house by similar houses

[Figure: plot of recent house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Look at average price in range
Still only 2 houses! Throwing out info from all other sales.
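The "average price in range" idea can be sketched in a few lines of Python. The sales data, the function name `average_price_in_range`, and the `window` parameter are made up for illustration and are not from the slides.

```python
# Hypothetical recent sales as (square feet, sale price in $). Made-up numbers.
sales = [(1100, 210_000), (1450, 295_000), (1520, 305_000),
         (2050, 410_000), (2600, 520_000)]

def average_price_in_range(sales, sqft, window):
    """Average the prices of houses whose size is within +/- window sq.ft."""
    nearby = [price for size, price in sales if abs(size - sqft) <= window]
    if not nearby:
        return None, 0  # no comparable sales in the range
    return sum(nearby) / len(nearby), len(nearby)

# Estimate for a 1,500 sq.ft. house using sales within 100 sq.ft. of it.
estimate, n_used = average_price_in_range(sales, sqft=1500, window=100)
print(estimate, n_used)
```

With this toy data only two of the five sales fall in the range, so the estimate ignores everything the other three sales say about how price varies with size.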

Linear regression


Use a linear regression model

Fit a line through the data:

\[ f(x) = w_0 + w_1 x \]

where \(w_0\) and \(w_1\) are the parameters of the model.

[Figure: line fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Use a linear regression model

Fit a line through the data:

\[ f_w(x) = w_0 + w_1 x \]

a function parameterized by \(w = (w_0, w_1)\).

[Figure: line fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Which line?

\[ f_w(x) = w_0 + w_1 x \]

[Figure: candidate lines for different parameters w; x-axis: square feet (sq.ft.), y-axis: price ($)]

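Cost of using a given line

Residual sum of squares (RSS)

\[
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={} & (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
& + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
& + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
& + \dots \text{ [include all houses]}
\end{aligned}
\]

[Figure: line with residuals to each house sale; x-axis: square feet (sq.ft.), y-axis: price ($)]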
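As a minimal sketch, the RSS of a given line can be computed directly from its definition. The three houses and the two candidate lines below are made-up values chosen only to show that different parameters give different costs.

```python
def rss(w0, w1, sqft, price):
    """Residual sum of squares of the line w0 + w1 * x over all houses."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(sqft, price))

# Hypothetical houses: size in sq.ft. and sale price in $.
sqft = [1000, 2000, 3000]
price = [200_000, 400_000, 650_000]

# Two candidate lines through the origin with different slopes.
rss_a = rss(0, 200, sqft, price)
rss_b = rss(0, 210, sqft, price)
print(rss_a, rss_b)
```

The second line has the smaller RSS on this data, so it is the better of the two candidates under this cost.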

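Find best line

Minimize cost over all possible \(w_0, w_1\):

\[
\begin{aligned}
\mathrm{RSS}(w_0, w_1) ={} & (\$_{\text{house 1}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 1}}])^2 \\
& + (\$_{\text{house 2}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 2}}])^2 \\
& + (\$_{\text{house 3}} - [w_0 + w_1\,\text{sq.ft.}_{\text{house 3}}])^2 \\
& + \dots \text{ [include all houses]}
\end{aligned}
\]

[Figure: best-fit line through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]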
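The slides do not say which algorithm performs the minimization. One standard option for a single feature is the closed-form least-squares solution, sketched below on made-up data; the function names `fit_line` and `rss` are illustrative.

```python
def rss(w0, w1, sqft, price):
    """Residual sum of squares of the line w0 + w1 * x."""
    return sum((y - (w0 + w1 * x)) ** 2 for x, y in zip(sqft, price))

def fit_line(sqft, price):
    """Closed-form least-squares intercept and slope for one feature."""
    n = len(sqft)
    x_mean = sum(sqft) / n
    y_mean = sum(price) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(sqft, price))
    s_xx = sum((x - x_mean) ** 2 for x in sqft)
    w1 = s_xy / s_xx              # slope
    w0 = y_mean - w1 * x_mean     # intercept
    return w0, w1

# Hypothetical houses: size in sq.ft. and sale price in $.
sqft = [1000, 2000, 3000]
price = [200_000, 400_000, 650_000]

w0_hat, w1_hat = fit_line(sqft, price)
best_rss = rss(w0_hat, w1_hat, sqft, price)
print(w0_hat, w1_hat, best_rss)
```

Nudging either parameter away from the fitted values can only increase the RSS, which is what "best line" means here.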

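Predicting your house price

\[ f_{\hat{w}}(x) = \hat{w}_0 + \hat{w}_1 x \]

Best guess of your house price:

\[ \hat{y} = \hat{w}_0 + \hat{w}_1\,\text{sq.ft.}_{\text{your house}} \]

where \(\hat{w} = (\hat{w}_0, \hat{w}_1)\) are the parameters that minimize RSS.

[Figure: best-fit line evaluated at your house's size; x-axis: square feet (sq.ft.), y-axis: price ($)]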
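As a worked example, forming a prediction is just plugging the house size into the fitted line. The parameter values and the house size below are hypothetical, not estimates from any real data.

```python
def predict_price(w0_hat, w1_hat, sqft):
    """Evaluate the fitted line at a given house size."""
    return w0_hat + w1_hat * sqft

# Assumed fitted parameters: $10,000 intercept and $200 per sq.ft.
w0_hat, w1_hat = 10_000.0, 200.0

# Best guess for a hypothetical 2,640 sq.ft. house.
my_price = predict_price(w0_hat, w1_hat, 2640)
print(my_price)
```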

Adding higher order effects

Fit data with a line or ?

[Figure: line fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

You show your friend your analysis

Fit data with a line or ?

[Figure: line fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

"Dude, it's not a linear relationship!"

What about a quadratic function?

[Figure: curve fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

"Dude, it's not a linear relationship!"

What about a quadratic function?

\[ f_w(x) = w_0 + w_1 x + w_2 x^2 \]

[Figure: quadratic fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]
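A quadratic model is still linear in the parameters, so it can be fit by least squares just like a line. The sketch below uses NumPy's `polyfit` on made-up data that follows an exact quadratic, so the recovered parameters are easy to check. Sizes are in thousands of sq.ft. and prices in thousands of dollars, an assumption made to keep the numbers small.

```python
import numpy as np

# Hypothetical data generated from price = 50 + 100*x + 20*x^2 (no noise).
sqft_k = np.array([1.0, 1.5, 2.0, 2.5, 3.0])              # thousands of sq.ft.
price_k = np.array([170.0, 245.0, 330.0, 425.0, 530.0])   # thousands of $

# polyfit returns coefficients from highest degree to lowest: [w2, w1, w0].
w2, w1, w0 = np.polyfit(sqft_k, price_k, 2)

# Evaluate the fitted quadratic at a new size (2,200 sq.ft.).
prediction = w0 + w1 * 2.2 + w2 * 2.2**2
print(w0, w1, w2, prediction)
```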

Even higher order polynomial

[Figure: high-order polynomial fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

"I can minimize your RSS"

Do you believe this fit?

[Figure: high-order polynomial fit evaluated at your house's size; x-axis: square feet (sq.ft.), y-axis: price ($)]

"My house isn't worth so little"
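To see why a very low RSS is not convincing on its own, here is a small sketch on made-up data: with six houses, a degree-5 polynomial passes through every point, so its RSS is essentially zero regardless of whether its predictions between the points are sensible. The data values and the use of `numpy.polyfit` are illustrative choices.

```python
import numpy as np

# Hypothetical sales: size in thousands of sq.ft., price in thousands of $.
sqft_k = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
price_k = np.array([210.0, 240.0, 400.0, 380.0, 560.0, 590.0])

def rss(coeffs, x, y):
    """Residual sum of squares of a polynomial given by its coefficients."""
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

line = np.polyfit(sqft_k, price_k, 1)      # 2 parameters
wiggly = np.polyfit(sqft_k, price_k, 5)    # 6 parameters for 6 points

rss_line = rss(line, sqft_k, price_k)
rss_wiggly = rss(wiggly, sqft_k, price_k)
print(rss_line, rss_wiggly)

# Compare the two models at a size between observed houses.
print(np.polyval(line, 3.25), np.polyval(wiggly, 3.25))
```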

Evaluating overfitting via training/test split

Do you believe this fit?

[Figure: high-order polynomial fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Minimizes RSS, but bad predictions

What about a quadratic function?

\[ f_w(x) = w_0 + w_1 x + w_2 x^2 \]

[Figure: quadratic fit through the house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

How to choose model order/complexity

Want good predictions, but can't observe the future

Simulate predictions:
1. Remove some houses
2. Fit model on remaining houses
3. Predict held-out houses
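Step 1, removing some houses, can be sketched as a random split. The helper name `train_test_split`, the 20% test fraction, and the data are assumptions for illustration; fitting and predicting (steps 2 and 3) would then use the two returned lists.

```python
import random

def train_test_split(houses, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction as the test set."""
    shuffled = houses[:]
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = max(1, int(round(test_fraction * len(shuffled))))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical sales as (square feet, price in $).
houses = [(1000 + 200 * i, 200_000 + 40_000 * i) for i in range(10)]

train, test = train_test_split(houses)
print(len(train), len(test))
```

Shuffling before splitting avoids holding out, say, only the largest houses when the data happens to be sorted.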

Training/test split

Terminology: training set, test set

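Training error

[Figure: model fit to the training houses; x-axis: square feet (sq.ft.), y-axis: price ($)]

\[
\begin{aligned}
\text{Training error}(w) ={} & (\$_{\text{train 1}} - f_w(\text{sq.ft.}_{\text{train 1}}))^2 \\
& + (\$_{\text{train 2}} - f_w(\text{sq.ft.}_{\text{train 2}}))^2 \\
& + (\$_{\text{train 3}} - f_w(\text{sq.ft.}_{\text{train 3}}))^2 \\
& + \dots \text{ [include all training houses]}
\end{aligned}
\]

Minimize to find \(\hat{w}\).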

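Test error

[Figure: fitted model compared with the test houses; x-axis: square feet (sq.ft.), y-axis: price ($)]

Assess predictions using the test error:

\[
\begin{aligned}
\text{Test error}(\hat{w}) ={} & (\$_{\text{test 1}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 1}}))^2 \\
& + (\$_{\text{test 2}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 2}}))^2 \\
& + (\$_{\text{test 3}} - f_{\hat{w}}(\text{sq.ft.}_{\text{test 3}}))^2 \\
& + \dots \text{ [include all test houses]}
\end{aligned}
\]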
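The two error definitions differ only in which houses they sum over. In the sketch below a line is fit on a made-up training set, then the same squared-error sum is evaluated on the training houses and on separate made-up test houses. Units are thousands of sq.ft. and thousands of dollars, chosen for readability.

```python
import numpy as np

# Hypothetical training and test houses.
x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([200.0, 400.0, 650.0])
x_test = np.array([1.5, 2.5])
y_test = np.array([300.0, 500.0])

# Fit on the training set only; polyfit returns [slope, intercept].
w1_hat, w0_hat = np.polyfit(x_train, y_train, 1)

def squared_error(x, y):
    """Sum of squared differences between prices and fitted-line predictions."""
    return float(np.sum((y - (w0_hat + w1_hat * x)) ** 2))

train_error = squared_error(x_train, y_train)
test_error = squared_error(x_test, y_test)
print(train_error, test_error)
```

The test houses play no part in choosing the parameters, which is why the test error is the more honest estimate of how predictions will fare on new houses.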

Training/Test Curves

[Figure: training and test error curves; x-axis: model complexity, y-axis: error]
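The curves can be traced numerically by fitting polynomials of increasing degree and recording both errors. This is a sketch on synthetic data: the quadratic ground truth, the noise level, the seed, and the alternating train/test assignment are all assumptions made for the example.

```python
import numpy as np

# Synthetic houses: quadratic trend plus noise (thousands of sq.ft. and $).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 4.0, 12)
y = 50 + 100 * x + 20 * x**2 + rng.normal(0, 20, size=x.size)

# Alternate houses between the training and test sets.
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

def rss(coeffs, xs, ys):
    """Residual sum of squares of a polynomial on the given houses."""
    return float(np.sum((ys - np.polyval(coeffs, xs)) ** 2))

train_errors, test_errors = [], []
for degree in range(1, 6):   # model complexity = polynomial degree
    coeffs = np.polyfit(x_train, y_train, degree)
    train_errors.append(rss(coeffs, x_train, y_train))
    test_errors.append(rss(coeffs, x_test, y_test))

for degree, (tr, te) in enumerate(zip(train_errors, test_errors), start=1):
    print(degree, tr, te)
```

Training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. Test error carries no such guarantee, and that difference is what makes it useful for choosing the model order.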

Adding other features


Predictions just based on house size

[Figure: plot of recent house sales; x-axis: square feet (sq.ft.), y-axis: price ($)]

Only 1 bathroom! Not same as my 3 bathrooms.

Add more features

\[ f_w(x) = w_0 + w_1\,\text{sq.ft.} + w_2\,\#\text{bath} \]

[Figure: price ($) plotted against two features; x1: square feet (sq.ft.), x2: # bathrooms]
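With two features the model is fit the same way, by least squares over a matrix whose columns are a constant, the size, and the bathroom count. The sketch below uses `numpy.linalg.lstsq` on made-up data generated exactly from w = (50, 100, 30), so the recovered parameters can be checked. Sizes are in thousands of sq.ft. and prices in thousands of dollars.

```python
import numpy as np

# Hypothetical houses generated from price = 50 + 100*sqft + 30*bath.
sqft_k = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
n_bath = np.array([1.0, 2.0, 1.0, 3.0, 2.0])
price_k = np.array([180.0, 260.0, 280.0, 390.0, 410.0])

# Design matrix: a column of ones for w0, then one column per feature.
X = np.column_stack([np.ones_like(sqft_k), sqft_k, n_bath])

# Least-squares solution minimizing the RSS over (w0, w1, w2).
w_hat, _, _, _ = np.linalg.lstsq(X, price_k, rcond=None)

# Predict a 2,000 sq.ft. house with 3 bathrooms.
prediction = float(np.array([1.0, 2.0, 3.0]) @ w_hat)
print(w_hat, prediction)
```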

How many features to use?


Possible choices:
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- ...

See Regression Course!

Other regression examples


Salary after ML specialization

hard work

How much will your salary be? (y = $$)

Depends on x = performance in courses, quality of capstone project, # of forum responses, ...

Salary after ML specialization

hard work

\[ \hat{y} = \hat{w}_0 + \hat{w}_1\,\text{performance} + \hat{w}_2\,\text{capstone} + \hat{w}_3\,\text{forum} \]

informed by other students who completed the specialization

Stock prediction
Predict the price of a stock
Depends on
-Recent history of stock price
-News events
-Related commodities


Tweet popularity
How many people will retweet your tweet?
Depends on
- # followers
- # of followers of followers
- Features of text tweeted
- Popularity of hashtag
- # of past retweets
- ...

Smart houses
Smart houses have many distributed sensors
What's the temperature at your desk? (no sensor)
- Learn spatial function to predict temp

Also depends on
- Thermostat setting
- Blinds open/closed or window tint
- Vents
- Temperature outside
- Time of day

Summary for regression


What you can do now


- Describe the input (features) and output (real-valued predictions) of a regression model
- Calculate a goodness-of-fit metric (e.g., RSS)
- Estimate model parameters by minimizing RSS (algorithms to come)
- Exploit the estimated model to form predictions
- Perform a training/test split of the data
- Analyze performance of various regression models in terms of test error
- Use test error to avoid overfitting when selecting amongst candidate models
- Describe a regression model using multiple features
- Describe other applications where regression is useful
