
Econometrics

Professor Robert H. Patrick


Finance and Economics Department
Rutgers Business School – Newark and New Brunswick

Multiple Regression Example: residential real estate model

A proposed model to forecast the price of residential properties in a large metropolitan area is:

\[ P_i = \beta_0 + \beta_1\,\mathrm{sqft}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{uptown}_i + \beta_4\,\mathrm{pool}_i + \beta_5\,\mathrm{fireplace}_i + e_i \tag{1} \]

where \(P_i\) is the selling price of property \(i\) ($),
\(\mathrm{sqft}_i\) is the size of home \(i\) (square feet),
\(\mathrm{age}_i\) is the age of the \(i\)th residential property (years),
\(\mathrm{uptown}_i\) is a dummy variable equal to 1 if property \(i\) is located uptown and 0 otherwise,
\(\mathrm{pool}_i\) is a dummy variable equal to 1 if property \(i\) has a pool and 0 otherwise, and
\(\mathrm{fireplace}_i\) is a dummy variable equal to 1 if property \(i\) has a fireplace and 0 otherwise.
Larger homes are expected to sell at a higher price on average, a home is generally expected to decrease in price as it ages, there is expected to be a premium on the prices of homes located uptown, and pools and fireplaces are expected to add to the price of a home. The expected signs are therefore positive for \(\beta_1\), \(\beta_3\), \(\beta_4\), and \(\beta_5\), and negative for \(\beta_2\).

The estimated model is

REGRESSION 1:
Dependent variable: \(P_i\), property \(i\) selling price ($).

Regression Statistics
Multiple R           0.9319731
R Square             0.868573859
Adjusted R Square    0.867912761
Standard Error       15334.4444
Observations         1000

ANOVA
              df     SS             MS             F
Regression    5      1.54471E+12    3.08942E+11    1313.836664
Residual      994    2.33734E+11    235145185
Total         999    1.77845E+12

              Coefficients     Standard Error    t Stat
Intercept     6911.880144      4289.365451       1.61139922
sqft          83.18324915      1.671728468       49.75882791
age           -192.9910111     51.56655384       -3.742561734
Uptown        60196.23307      971.5313159       61.96015721
Pool          4352.569786      1205.26063        3.611310017
Fireplace     1398.809937      976.807024        1.432022808

How does age affect the selling price of a home? Test this hypothesis at the 5% level
of significance.

Regression 1 is the estimated form of the model specified above. The estimated coefficient that relates the age of a house to its price is \(\hat\beta_2 \cong -192.99\). The model indicates that, on average and holding the other characteristics constant, the price of a house declines by $192.99 for each additional year of age.

To determine if this estimated coefficient is significantly different from zero at the 5% level of significance, either of the following provides the appropriate test.

(i) The hypothesis to test is \(H_0: \beta_2 = 0\) against \(H_1: \beta_2 \neq 0\). With 994 degrees of freedom, the two-tailed critical value is \(t_{c,.05} \cong 1.96\). From the regression printout, \(|t_{\hat\beta_2}| = |-3.742561734| > t_{c,.05} = 1.9623534\) ⇒ reject \(H_0\); \(\hat\beta_2\) is significantly different from 0. The absolute value of the t statistic is used since this is a two-tailed test.

(ii) The 95% confidence interval is \(\hat\beta_2 \pm t_{c,.05}\,SE_{\hat\beta_2}\) ⇒ \(-294.18 \le \beta_2 \le -91.80\) ⇒ reject \(H_0\); \(\hat\beta_2\) is significantly different from 0 since 0 is not within the bounds calculated above.
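
The two tests above can be checked numerically. The following is a minimal Python sketch, not part of the original handout: it copies the coefficient, standard error, and critical value from the Regression 1 printout and the text, and the variable names are illustrative assumptions.

```python
# Values copied from the Regression 1 printout and the text above.
beta2_hat = -192.9910111   # estimated coefficient on age
se_beta2 = 51.56655384     # standard error of that coefficient
t_crit = 1.9623534         # two-tailed 5% critical value, 994 df (from the text)

# (i) t test of H0: beta2 = 0 against a two-sided alternative
t_stat = beta2_hat / se_beta2
reject_by_t = abs(t_stat) > t_crit

# (ii) 95% confidence interval for beta2
lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
reject_by_ci = not (lower <= 0.0 <= upper)

print(f"t = {t_stat:.4f}, reject H0: {reject_by_t}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}], reject H0: {reject_by_ci}")
```

Both approaches necessarily agree, because 0 lies outside the interval exactly when the absolute t statistic exceeds the critical value.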

Test the hypothesis that \(\beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0\) at the .05 significance level.

The F-test in the ANOVA table of Regression 1 provides a test of the above null hypothesis against the alternative hypothesis that at least one of the coefficients in the null hypothesis is not zero. That is, test

\[ H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0 \quad \text{versus} \quad H_1: \beta_1, \beta_2, \beta_3, \beta_4, \text{ and/or } \beta_5 \neq 0. \]


From Regression 1 we have F ≅ 1,313.8. The critical value to compare to this F is based on 5 degrees of freedom for the numerator (the number of restrictions in the null) and 994 degrees of freedom for the denominator (1000 − 6, the observations used to estimate the regression less the number of parameters estimated). This critical value is 2.223 < 1,313.8, which implies rejecting the null hypothesis. This indicates that the independent variables in Regression 1 are jointly significant: at least one of the parameters associated with them is different from zero at the 5% level of significance.
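
As a check on the printout, the overall F statistic can be rebuilt from the ANOVA sums of squares. This is a short Python sketch under the assumption that the rounded sums of squares in the table are accurate enough for the purpose; the critical value is the one quoted in the text, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the Regression 1 ANOVA table.
ss_regression = 1.54471e12
ss_residual = 2.33734e11
df_regression = 5      # number of slope coefficients set to zero under H0
df_residual = 994      # 1000 observations less 6 estimated parameters
f_crit = 2.223         # 5% critical value for F(5, 994), as quoted in the text

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The overall F statistic is the ratio of the two mean squares.
f_stat = ms_regression / ms_residual

print(f"F = {f_stat:.1f}, critical value = {f_crit}, reject H0: {f_stat > f_crit}")
```

Because the table reports rounded sums of squares, the result matches the printed F only to about the first decimal place.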

A realtor argues that pools and fireplaces do not jointly affect home selling prices.
Test this joint hypothesis at the .05 significance level. What does your test reveal
about the realtor’s argument?

This question implies a joint test of the null and alternative hypotheses

\[ H_0: \beta_4 = \beta_5 = 0 \quad \text{versus} \quad H_1: \beta_4 \text{ and/or } \beta_5 \neq 0. \]

To carry out a test of this joint hypothesis, substitute the null hypothesis into the unrestricted model, (1) above, which leads to the restricted model

\[ P_i = \beta_0 + \beta_1\,\mathrm{sqft}_i + \beta_2\,\mathrm{age}_i + \beta_3\,\mathrm{uptown}_i + e_i. \]

The restricted model estimates are Regression 2. Regression 1, above, is the estimated unrestricted model.

REGRESSION 2:
Dependent variable: \(P_i\), property \(i\) selling price ($).

Regression Statistics
Multiple R           0.93093204
R Square             0.866634463
Adjusted R Square    0.86623276
Standard Error       15431.65475
Observations         1000

ANOVA
              df     SS             MS             F
Regression    3      1.54126E+12    5.13754E+11    2157.398744
Residual      996    2.37183E+11    238135968.3
Total         999    1.77845E+12

              Coefficients     Standard Error    t Stat
Intercept     7874.716693      4309.120649       1.827453287
sqft          83.39848828      1.673954815       49.82123027
age           -185.0955432     51.8373427        -3.570698913
Uptown        60259.63706      977.436039        61.65072153

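This null hypothesis can then be tested with an F-test,

\[ F = \frac{(ESS_R - ESS_U)/q}{ESS_U/(T-K)} = \frac{(2.37183 \times 10^{11} - 2.33734 \times 10^{11})/2}{(2.33734 \times 10^{11})/994} = 7.334002005 > F_{c,.05}(2,994) = 3.004779025 \]

where \(ESS_R\) and \(ESS_U\) are the residual sums of squares of the restricted model (Regression 2) and the unrestricted model (Regression 1), \(q = 2\) is the number of restrictions, and \(T - K = 1000 - 6 = 994\) is the residual degrees of freedom of the unrestricted model.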

Since the calculated F statistic is greater than the critical value, reject \(H_0\) at the .05 level of significance. The unrestricted model is indicated to better represent the data generating process. This provides empirical evidence that the realtor is wrong: pools and fireplaces jointly have a significant effect (at the 5% level) on residential house prices on average.
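
The restricted-versus-unrestricted comparison can be reproduced from the two residual sums of squares. The sketch below is illustrative only: the function name `restricted_f` and its argument names are assumptions introduced here, and the critical value is copied from the text rather than computed.

```python
def restricted_f(ess_restricted, ess_unrestricted, q, df_unrestricted):
    """F statistic for q restrictions, built from residual sums of squares."""
    numerator = (ess_restricted - ess_unrestricted) / q
    denominator = ess_unrestricted / df_unrestricted
    return numerator / denominator


# Residual SS from Regression 2 (restricted) and Regression 1 (unrestricted).
f_joint = restricted_f(2.37183e11, 2.33734e11, q=2, df_unrestricted=994)
f_crit_joint = 3.004779025   # 5% critical value for F(2, 994), from the text

print(f"F = {f_joint:.3f}, reject H0: {f_joint > f_crit_joint}")
```

With the rounded sums of squares from the printouts the statistic comes out at about 7.334, in line with the value reported above; small differences in later decimals reflect that rounding.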

Using the estimated model (Regression 1), what is the predicted price for a 2,500
square foot home that is 10 years old, is located uptown, and has a fireplace and a
pool?

Regression 3 is the forecast regression for model (1). The Data Table below shows that this regression has two forecasts computed, each from an added observation whose price is set to 0 and whose forecast variable (Price1 or Price2) equals −1 for that observation and 0 for all others:
1. Price1 is for the specification of the X variables in this question,
2. Price2 is for exactly the same property characteristics, except that property 2 does not have a fireplace.
Therefore the Price1 parameter estimate provides the predicted average house price (conditional on the characteristics given) = $278,887.71.

REGRESSION 3:
Dependent variable: \(P_i\), property \(i\) selling price ($).

Regression Statistics
Multiple R           0.934318482
R Square             0.872951025
Adjusted R Square    0.87218413
Standard Error       15334.4444
Observations         1002

ANOVA
              df     SS             MS             F
Regression    7      1.60598E+12    2.67664E+11    1138.292435
Residual      995    2.33734E+11    235145185
Total         1002   1.83972E+12

              Coefficients     Standard Error    t Stat
Intercept     6911.880144      4289.365451       1.61139922
sqft          83.18324915      1.671728468       49.75882791
age           -192.9910111     51.56655384       -3.742561734
Uptown        60196.23307      971.5313159       61.96015721
Pool          4352.569786      1205.26063        3.611310017
Fireplace     1398.809937      976.807024        1.432022808
Price1        278887.7057      15387.30073       18.12453728
Price2        277488.8958      15827.30000       17.53229520

DATA:
price     sqft    age    Uptown    Pool    Fireplace    Price1    Price2
205452    2346    6      0         0       1            0         0
185328    2003    5      0         0       1            0         0
248422    2777    6      0         0       0            0         0
154690    2017    1      0         0       0            0         0
221801    2645    0      0         0       1            0         0
199119    2156    6      0         0       1            0         0
272134    2991    9      0         0       1            0         0
250631    2798    0      0         0       1            0         0
197240    2480    0      0         1       0            0         0
235755    2750    0      0         0       0            0         0
189639    2082    14     0         0       0            0         0
227008    2338    12     0         0       1            0         0
.         .       .      .         .       .            .         .
.         .       .      .         .       .            .         .
.         .       .      .         .       .            .         .
263526    2399    6      1         0       0            0         0
300728    2874    9      1         0       0            0         0
220987    2093    2      1         0       1            0         0
0         2500    10     1         1       1            -1        0
0         2500    10     1         1       0            0         -1

This value can also be calculated by substituting the right-hand-side values specified in the question into the estimated model (Regression 1) to arrive at the same answer, i.e.,

\[ \hat P_i = \hat\beta_0 + \hat\beta_1\,\mathrm{sqft}_i + \hat\beta_2\,\mathrm{age}_i + \hat\beta_3\,\mathrm{uptown}_i + \hat\beta_4\,\mathrm{pool}_i + \hat\beta_5\,\mathrm{fireplace}_i + E(e_i) \]
\[ \cong 6911.88 + 83.183(2500) - 192.99(10) + 60196.23(1) + 4352.57(1) + 1398.81(1) + 0 \cong \$278{,}887.71 \]
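
The substitution above is easy to reproduce with the full-precision coefficients from Regression 1. This is an illustrative Python sketch; the dictionary `coefficients` and the helper `predict_price` are names assumed here, not part of the original handout.

```python
# Full-precision coefficient estimates from the Regression 1 printout.
coefficients = {
    "intercept": 6911.880144,
    "sqft": 83.18324915,
    "age": -192.9910111,
    "uptown": 60196.23307,
    "pool": 4352.569786,
    "fireplace": 1398.809937,
}


def predict_price(sqft, age, uptown, pool, fireplace):
    """Predicted average selling price ($) for the given characteristics."""
    return (
        coefficients["intercept"]
        + coefficients["sqft"] * sqft
        + coefficients["age"] * age
        + coefficients["uptown"] * uptown
        + coefficients["pool"] * pool
        + coefficients["fireplace"] * fireplace
    )


# 2,500 sq ft, 10 years old, uptown, with a pool, with and without a fireplace.
price1 = predict_price(2500, 10, 1, 1, 1)
price2 = predict_price(2500, 10, 1, 1, 0)

print(f"Price1 = {price1:,.2f}")
print(f"Price2 = {price2:,.2f}")
```

The two results correspond to the Price1 and Price2 coefficients reported in Regression 3, and they differ by exactly the fireplace coefficient.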

What is the 80% confidence interval around this predicted price?

This question requires the use of the forecast standard error to calculate the 80% probability bounds around the forecast value above. Use Regression 3 to obtain the forecast standard error, \(\sigma_{f,\mathrm{forecast1}} = SE_{\hat\beta_{\mathrm{forecast1}}} = \$15{,}387.30\), which is the standard error estimate associated with the Price1 parameter estimate.

Conditional on the specified house characteristics, using the resulting predicted average house price and forecast standard error with the critical value \(t_{c,.20,994} \cong 1.28\), the approximate 80% confidence interval for the forecast is \(\pm t_{c,.20,994}\,\sigma_{f,\mathrm{forecast1}}\) around the expected (predicted) house price, or

$259,154.97 ≤ predicted house price ≤ $298,620.44.

These bounds use the unrounded critical value (about 1.2824) rather than 1.28.

That is, for a 2,500 square foot house that is 10 years old, is located uptown, and has a fireplace and a pool, the model predicts an average selling price of $278,887.71, with an 80% probability that the selling price will be between $259,154.97 and $298,620.44.
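
The interval arithmetic can be sketched in Python using only the standard library. The critical value here is not taken from a t table: the helper `t_critical_approx` (a name assumed for this sketch) applies a first-order large-sample correction to the normal quantile, which is an approximation that should be adequate for 994 degrees of freedom but is not an exact t quantile.

```python
from statistics import NormalDist


def t_critical_approx(p, df):
    """Approximate p-quantile of Student's t for large df.

    Uses the normal quantile z plus the first-order correction
    (z**3 + z) / (4 * df); this is an approximation, not an exact value.
    """
    z = NormalDist().inv_cdf(p)
    return z + (z**3 + z) / (4 * df)


# Forecast and forecast standard error from the Price1 row of Regression 3.
predicted_price = 278887.7057
forecast_se = 15387.30073

# An 80% two-sided interval leaves 10% in each tail, so use the 0.90 quantile.
t_crit_80 = t_critical_approx(0.90, 994)
half_width = t_crit_80 * forecast_se
lower = predicted_price - half_width
upper = predicted_price + half_width

print(f"t critical = {t_crit_80:.4f}")
print(f"80% interval: [{lower:,.2f}, {upper:,.2f}]")
```

Under this approximation the bounds agree with the interval reported above to within a few cents; using the rounded value 1.28 instead would narrow the interval by several tens of dollars on each side.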

Other potential issues/questions:

How well does the model forecast?

Multicollinearity?

Have we made any errors that would affect our estimates? For example, are there
any omitted variables that should be in the model?
