
Advanced Econometrics II

Problem Set 2

Due before the class on Wednesday, April 3

1 Conceptual Problems
Problem 1. (W 2.8) Consider the standard simple regression model y = β0 + β1 x + u under the Gauss-Markov Assumptions SLR.1 through SLR.5. The usual OLS estimators β̂0 and β̂1 are unbiased for their respective population parameters. Let β̃1 be the estimator of β1 obtained by assuming the intercept is zero (see Section 2.6).

(i) Find E(β̃1) in terms of the xi, β0, and β1. Verify that β̃1 is unbiased for β1 when the population intercept (β0) is zero. Are there other cases where β̃1 is unbiased?

(ii) Find the variance of β̃1. (Hint: The variance does not depend on β0.)

(iii) Show that Var(β̃1) ≤ Var(β̂1). [Hint: For any sample of data, $\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2$, with strict inequality unless x̄ = 0.]

(iv) Comment on the tradeoff between bias and variance when choosing between β̂1 and β̃1.

Problem 2. (W 2.12) Consider the problem described at the end of Section 2.6: running a regression and only estimating an intercept.

(i) Given a sample {yi : i = 1, 2, ..., n}, let β̃0 be the solution to

$$\min_{b_0} \sum_{i=1}^{n} (y_i - b_0)^2.$$

Show that β̃0 = ȳ, that is, the sample average minimizes the sum of squared residuals. (Hint: You may use one-variable calculus or you can show the result directly by adding and subtracting ȳ inside the squared residual and then doing a little algebra.)

(ii) Define residuals ũi = yi − ȳ. Argue that these residuals always sum to zero.

Problem 3. (W 3.2) The data in WAGE2 on working men was used to estimate the following equation:

$$\widehat{educ} = 10.36 - .094\,sibs + .131\,meduc + .210\,feduc$$

n = 722, R² = .214,

where educ is years of schooling, sibs is number of siblings, meduc is mother's years of schooling, and feduc is father's years of schooling.

(i) Does sibs have the expected effect? Explain. Holding meduc and feduc fixed, by how much does sibs have to increase to reduce predicted years of education by one year? (A noninteger answer is acceptable here.)

(ii) Discuss the interpretation of the coefficient on meduc.

(iii) Suppose that Man A has no siblings, and his mother and father each have 12 years of education. Man B has no siblings, and his mother and father each have 16 years of education. What is the predicted difference in years of education between B and A?

Problem 4. (W 3.6) Consider the multiple regression model containing three independent variables, under Assumptions MLR.1 through MLR.4:

y = β0 + β1 x1 + β2 x2 + β3 x3 + u.

You are interested in estimating the sum of the parameters on x1 and x2; call this θ1 = β1 + β2.

(i) Show that θ̂1 = β̂1 + β̂2 is an unbiased estimator of θ1.

(ii) Find Var(θ̂1) in terms of Var(β̂1), Var(β̂2), and Corr(β̂1, β̂2).

2 Empirical Exercises
Exercise 1. (W C2.10) The data set in CATHOLIC includes test score information on over 7,000 students in the United States who were in eighth grade in 1988. The variables math12 and read12 are scores on twelfth grade standardized math and reading tests, respectively.

(i) How many students are in the sample? Find the means and standard deviations of math12 and read12.

(ii) Run the simple regression of math12 on read12 to obtain the OLS intercept and slope estimates. Report the results in the form

$$\widehat{math12} = \hat\beta_0 + \hat\beta_1\,read12$$

n = ?, R² = ?

where you fill in the values for β̂0 and β̂1 and also replace the question marks.

(iii) Does the intercept reported in part (ii) have a meaningful interpretation? Explain.

(iv) Are you surprised by the β̂1 that you found? What about R²?

(v) Suppose that you present your findings to a superintendent of a school district, and the superintendent says, "Your findings show that to improve math scores we just need to improve reading scores, so we should hire more reading tutors." How would you respond to this comment? (Hint: If you instead run the regression of read12 on math12, what would you expect to find?)

Exercise 2. (W C3.2) Use the data in HPRICE1 to estimate the model

price = β0 + β1 sqrft + β2 bdrms + u,

where price is the house price measured in thousands of dollars.

(i) Write out the results in equation form.

(ii) What is the estimated increase in price for a house with one more bedroom, holding square footage constant?

(iii) What is the estimated increase in price for a house with an additional bedroom that is 140 square feet in size? Compare this to your answer in part (ii).

(iv) What percentage of the variation in price is explained by square footage and number of bedrooms?

(v) The first house in the sample has sqrft = 2,438 and bdrms = 4. Find the predicted selling price for this house from the OLS regression line.

(vi) The actual selling price of the first house in the sample was $300,000 (so price = 300). Find the residual for this house. Does it suggest that the buyer underpaid or overpaid for the house?

Exercise 3. (W C3.6) Use the data set in WAGE2 for this problem. As usual, be sure all of the following regressions contain an intercept.

(i) Run a simple regression of IQ on educ to obtain the slope coefficient, say, δ̃1.

(ii) Run the simple regression of log(wage) on educ, and obtain the slope coefficient, β̃1.

(iii) Run the multiple regression of log(wage) on educ and IQ, and obtain the slope coefficients, β̂1 and β̂2, respectively.

(iv) Verify that β̃1 = β̂1 + β̂2 δ̃1.

CONCEPTUAL QUESTIONS

Problem 1.
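I) Find E(β̃1) in terms of the xi, β0, and β1. Verify that β̃1 is unbiased for β1 when the population intercept (β0) is zero. Are there other cases where β̃1 is unbiased?

The estimator obtained by forcing the intercept to zero is $\tilde\beta_1 = \sum_{i=1}^{n} x_i y_i \big/ \sum_{i=1}^{n} x_i^2$. We use $y_i = \beta_0 + \beta_1 x_i + u_i$ to find that

$$\tilde\beta_1 = \frac{\sum_{i=1}^{n} x_i(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} x_i^2}.$$

We can also rewrite the numerator as:

$$\beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} x_i u_i.$$

So we can plug it in again and rewrite β̃1 as:

$$\tilde\beta_1 = \beta_0 \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} + \beta_1 + \frac{\sum_{i=1}^{n} x_i u_i}{\sum_{i=1}^{n} x_i^2}.$$

Conditional on the xi, and because E(ui) = 0 for all i:

$$E(\tilde\beta_1) = \beta_0 \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} + \beta_1.$$

In this equation the bias is given by the first term. It is zero when β0 = 0. It is also zero when x̄ = 0 (equivalently, Σ xi = 0), whatever the value of β0.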
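
The expression for E(β̃1) can be checked numerically. The sketch below is only an illustration: the x values and parameters are hypothetical and the function names are made up for this example. With all errors set to zero, the through-origin estimate equals its conditional expectation exactly, and centering x removes the bias.

```python
def slope_through_origin(x, y):
    """OLS slope when the intercept is forced to zero: sum(x*y) / sum(x^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


def expected_slope_through_origin(x, beta0, beta1):
    """E(beta1_tilde | x) = beta1 + beta0 * sum(x) / sum(x^2)."""
    return beta1 + beta0 * sum(x) / sum(xi ** 2 for xi in x)


# Hypothetical illustration values (not taken from the problem set).
x = [1.0, 2.0, 3.0, 4.0]
beta0, beta1 = 2.0, 0.5

# With u_i = 0 for every i, the estimator equals its conditional expectation.
y_noise_free = [beta0 + beta1 * xi for xi in x]
estimate = slope_through_origin(x, y_noise_free)
bias = expected_slope_through_origin(x, beta0, beta1) - beta1

# Centering x so that its sample mean is zero removes the bias.
x_bar = sum(x) / len(x)
x_centered = [xi - x_bar for xi in x]
bias_centered = expected_slope_through_origin(x_centered, beta0, beta1) - beta1

print(estimate, bias, bias_centered)
```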

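II) Find the variance of β̃1.

Based on what we found in part I), conditional on the xi only the last term of β̃1 is random. Under random sampling and homoskedasticity, Var(ui) = σ² and the errors are uncorrelated across i, so:

$$Var(\tilde\beta_1) = \frac{Var\left(\sum_{i=1}^{n} x_i u_i\right)}{\left(\sum_{i=1}^{n} x_i^2\right)^2} = \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{\left(\sum_{i=1}^{n} x_i^2\right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}.$$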

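III) Show that Var(β̃1) ≤ Var(β̂1).

We know that for any sample of data:

$$\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

with strict inequality unless x̄ = 0.

And we also know that

$$Var(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}.$$

Knowing the above, we can see that $\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, which is less than $\sum_{i=1}^{n} x_i^2$ unless x̄ = 0. Since Var(β̃1) = σ² / Σ xi², its denominator is at least as large as that of Var(β̂1). So Var(β̃1) ≤ Var(β̂1).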

IV) Comment on the tradeoff between bias and variance when choosing between β̂1 and β̃1.

For any given sample size, if x̄ increases, the bias in β̃1 increases too (if we hold the sum of the xi² fixed). At the same time, the variance of β̂1 increases relative to Var(β̃1). When β0 is small, the bias in β̃1 is also small. So choosing between β̂1 and β̃1 on a mean squared error basis will be determined by the sizes of x̄, β0, and n (in addition to the size of $\sum_{i=1}^{n} x_i^2$).
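
The tradeoff can be made concrete by comparing mean squared errors directly. The following is a minimal sketch under assumed values: the x sample, σ², and the two β0 values are hypothetical, and `mse_comparison` is a name introduced only for this illustration. It uses MSE = bias² + variance with the bias and variance formulas from parts I) to III).

```python
def mse_comparison(x, beta0, sigma2):
    """Return (MSE of beta1_tilde, MSE of beta1_hat), conditional on x."""
    n = len(x)
    x_bar = sum(x) / n
    sum_x2 = sum(xi ** 2 for xi in x)
    sst_x = sum((xi - x_bar) ** 2 for xi in x)

    bias_tilde = beta0 * sum(x) / sum_x2   # bias of the through-origin slope
    var_tilde = sigma2 / sum_x2            # Var(beta1_tilde)
    var_hat = sigma2 / sst_x               # Var(beta1_hat); beta1_hat is unbiased
    return bias_tilde ** 2 + var_tilde, var_hat


# Hypothetical sample and error variance.
x = [1.0, 2.0, 3.0, 4.0]

# Small intercept: the bias is small and the through-origin estimator wins.
mse_tilde_small, mse_hat_small = mse_comparison(x, beta0=0.1, sigma2=1.0)

# Large intercept: the squared bias dominates and the usual OLS slope wins.
mse_tilde_large, mse_hat_large = mse_comparison(x, beta0=2.0, sigma2=1.0)

print(mse_tilde_small, mse_hat_small)
print(mse_tilde_large, mse_hat_large)
```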
Problem 2.
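I) I will use the method of adding and subtracting ȳ inside the squared residual and then doing a little algebra.
Let ȳ be the sample average of the yi. Then:

$$\sum_{i=1}^{n} (y_i - b_0)^2 = \sum_{i=1}^{n} \left[(y_i - \bar{y}) + (\bar{y} - b_0)\right]^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 + 2(\bar{y} - b_0)\sum_{i=1}^{n} (y_i - \bar{y}) + n(\bar{y} - b_0)^2.$$

Since $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$ always, the middle term vanishes and

$$\sum_{i=1}^{n} (y_i - b_0)^2 = \sum_{i=1}^{n} (y_i - \bar{y})^2 + n(\bar{y} - b_0)^2.$$

The first term does not depend on b0, and the second term, n(ȳ − b0)², which is nonnegative, is clearly minimized when b0 = ȳ.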


II) If we define ũi = yi − ȳ, then $\sum_{i=1}^{n} \tilde{u}_i = \sum_{i=1}^{n} (y_i - \bar{y}) = \sum_{i=1}^{n} y_i - n\bar{y} = 0$, and in the proof in part I) we already used the fact that this sum is zero.
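
Both results can be checked on a small example. This is a sketch with hypothetical data (the y values are not from any data set): it verifies the decomposition SSR(b0) = SSR(ȳ) + n(ȳ − b0)² at a few candidate values, picks the minimizer among them, and sums the residuals.

```python
def ssr(y, b0):
    """Sum of squared residuals when the fitted value is the constant b0."""
    return sum((yi - b0) ** 2 for yi in y)


# Hypothetical sample.
y = [3.0, 5.0, 4.0, 8.0]
n = len(y)
y_bar = sum(y) / n

# Decomposition from part I): SSR(b0) = SSR(y_bar) + n * (y_bar - b0)^2.
candidates = [y_bar - 1.0, y_bar - 0.1, y_bar, y_bar + 0.1, y_bar + 1.0]
decomposition_holds = all(
    abs(ssr(y, b0) - (ssr(y, y_bar) + n * (y_bar - b0) ** 2)) < 1e-9
    for b0 in candidates
)

# Among the candidates, the sample mean gives the smallest SSR.
best_b0 = min(candidates, key=lambda b0: ssr(y, b0))

# Part II): residuals around the sample mean sum to zero.
residual_sum = sum(yi - y_bar for yi in y)

print(decomposition_holds, best_b0, residual_sum)
```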

Problem 3.

I) Yes. Because of budget constraints and limited resources, it makes sense that the
more siblings there are in a family, the less education any one child in the family
has. To find the increase in the number of siblings that reduces predicted
education by one year, we solve 1 = 0.094Δsibs, and hence Δsibs = 1/0.094 ≈ 10.6.

II) Holding the number of siblings and father’s years of schooling fixed, one more
year of mother’s education implies 0.131 years more of predicted education.

III) Since the number of siblings is the same, but meduc and feduc are both different,
the coefficients on meduc and feduc both need to be accounted for. The predicted
difference in education between B and A is 0.131(4) + 0.210(4) = 1.364.
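
The arithmetic in parts I) and III) can be reproduced from the reported coefficients. The variable names below are chosen for this sketch only.

```python
# Coefficients reported in the estimated equation for educ.
b_sibs, b_meduc, b_feduc = -0.094, 0.131, 0.210

# Part I): increase in sibs that lowers predicted education by one year.
delta_sibs = -1.0 / b_sibs

# Part III): B's parents each have 16 - 12 = 4 more years of education than A's.
delta_parent_educ = 16 - 12
predicted_difference = b_meduc * delta_parent_educ + b_feduc * delta_parent_educ

print(round(delta_sibs, 1))
print(round(predicted_difference, 3))
```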

Problem 4.

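I) Since Assumptions MLR.1 through MLR.4 hold, we know that the OLS estimators are unbiased, which means
that E(β̂j) = βj. Therefore, E(θ̂1) = E(β̂1 + β̂2) = E(β̂1) + E(β̂2) = β1 + β2 = θ1.

II) The variance of θ̂1 is:

$$Var(\hat\theta_1) = E\left[(\hat\theta_1 - E(\hat\theta_1))^2\right] = E\left[(\hat\theta_1 - \theta_1)^2\right].$$

θ1 is a parameter, so the expectation E(θ̂1) = θ1 is a constant.

Now substitute the beta terms:

$$Var(\hat\theta_1) = E\left[\left((\hat\beta_1 - \beta_1) + (\hat\beta_2 - \beta_2)\right)^2\right] = E\left[(\hat\beta_1 - \beta_1)^2\right] + E\left[(\hat\beta_2 - \beta_2)^2\right] + 2E\left[(\hat\beta_1 - \beta_1)(\hat\beta_2 - \beta_2)\right].$$

The first and second terms are the variances of β̂1 and β̂2, and the last term is twice their covariance:

$$Var(\hat\theta_1) = Var(\hat\beta_1) + Var(\hat\beta_2) + 2\,Cov(\hat\beta_1, \hat\beta_2).$$

And therefore, since Cov(β̂1, β̂2) = Corr(β̂1, β̂2) · sd(β̂1) · sd(β̂2):

$$Var(\hat\theta_1) = Var(\hat\beta_1) + Var(\hat\beta_2) + 2\,Corr(\hat\beta_1, \hat\beta_2)\sqrt{Var(\hat\beta_1)\,Var(\hat\beta_2)}.$$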
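
The variance formula for a sum of two estimators can be sanity-checked with sample moments, for which the same identity holds exactly. In the sketch below the two lists are hypothetical numbers standing in for estimates from repeated samples, and the helper names are introduced only for this example.

```python
import math


def var_of_sum(var1, var2, corr):
    """Var(b1 + b2) = Var(b1) + Var(b2) + 2 * Corr(b1, b2) * sd(b1) * sd(b2)."""
    return var1 + var2 + 2.0 * corr * math.sqrt(var1 * var2)


def sample_var(a):
    """Sample variance with the n - 1 divisor."""
    m = sum(a) / len(a)
    return sum((ai - m) ** 2 for ai in a) / (len(a) - 1)


def sample_corr(a, b):
    """Sample correlation between two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)
    return cov / math.sqrt(sample_var(a) * sample_var(b))


# Hypothetical draws standing in for repeated-sample estimates of beta1 and beta2.
b1_draws = [0.9, 1.1, 1.0, 1.3, 0.7]
b2_draws = [2.2, 1.9, 2.0, 1.6, 2.3]
theta_draws = [u + v for u, v in zip(b1_draws, b2_draws)]

# Variance of the sum computed directly and through the formula.
direct = sample_var(theta_draws)
via_formula = var_of_sum(
    sample_var(b1_draws), sample_var(b2_draws), sample_corr(b1_draws, b2_draws)
)

print(direct, via_formula)
```

A negative correlation between β̂1 and β̂2, as in these made-up draws, makes the variance of the sum smaller than the sum of the variances.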

EMPIRICAL EXERCISES

Exercise 1.

I) The number of observations, and therefore the number of students in the sample, is
7,430. The mean of read12 is 51.7724 and the mean of math12 is 52.13362. The standard
deviation of read12 is 9.407761 and the standard deviation of math12 is 9.459117.

II) $$\widehat{math12} = 15.153 + 0.714\,read12$$

n = 7430, R² = 0.5047

III) The intercept is the predicted math score when the reading score is zero. The
minimum value of read12 in this data set is 29.15, so there is no observation at
or anywhere close to zero, and the intercept is an extrapolation far outside the
range of the data. Taken literally, the model predicts a math score of about 15.15
for a student with a reading score of zero. This fits the idea that very low reading
ability goes together with low math scores, but because no such student appears in
the sample, the number has little practical meaning on its own.

IV) The estimate indicates that a one-point higher reading score is associated with a
predicted math score that is 0.714 points higher. Assuming (it is not clearly stated
in the text) that the math problems are formulated in the same language that was
tested in the reading part, this seems plausible: the ability to read well helps in
understanding mathematical tasks too. The relationship may not be that direct, though.
It may simply reflect that some students study more for all subjects, or that a higher
IQ helps in getting higher scores in both reading and math.

The R-squared means that about 50.5 percent of the sample variation in the dependent
variable is explained by our independent variable. This is quite large considering
that we include only one regressor and that there are probably many other things
influencing math scores in reality.

V) In this case, when we run a regression of read12 on math12 we get:

$$\widehat{read12} = 14.937 + 0.706\,math12$$

n = 7430, R² = 0.5046

The slope and R-squared values are very similar to those from the regression of
math12 on read12. (In a simple regression the R-squared is the squared sample
correlation between the two variables, so it is the same in both directions up to
rounding.) A one-point higher math score is associated with a predicted reading
score that is 0.706 points higher. By the superintendent's logic, it would be just as
possible to get better results in reading by improving in math, so hiring more math
teachers could be suggested as a solution too. This shows that the regression alone
does not tell us the direction of causality. What is more, the suggestion "to improve
math scores we just need to improve reading scores" ignores the many other factors
that influence math scores besides reading (roughly the other 50% of the variation
that is not captured by the R-squared).
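
The symmetry between the two regressions follows from an algebraic fact: in a simple regression, the slope of y on x times the slope of x on y equals the R-squared. The sketch below illustrates this on hypothetical toy scores (not the CATHOLIC data) with a small hand-written helper, `simple_ols`, and then checks the product of the two slopes reported above.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) from regressing y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope, sxy ** 2 / (sxx * syy)


# Hypothetical toy scores, used only to illustrate the identity.
read = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
math_scores = [42.0, 50.0, 47.0, 58.0, 55.0, 66.0]

_, slope_math_on_read, r2_forward = simple_ols(read, math_scores)
_, slope_read_on_math, r2_reverse = simple_ols(math_scores, read)

# Product of the slopes reported in II) and V); it should be close to the R-squared.
reported_product = 0.714 * 0.706

print(r2_forward, r2_reverse, slope_math_on_read * slope_read_on_math)
print(reported_product)
```

Neither regression is more "causal" than the other: both summarize the same sample correlation.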

Exercise 2

I) $$\widehat{price} = -19.315 + 0.1284362\,sqrft + 15.19819\,bdrms$$

II) One additional bedroom, holding square footage constant, is estimated to increase
the price of the house by $15,198.19. This is β̂2 × 1000, because the price is measured
in thousands of dollars.

III) In part II), holding square footage constant means that the area taken by the
additional bedroom comes out of the rest of the house: the size of the house stays
the same and we only add a bedroom. Here, instead, we want to find out how much an
extra bedroom that also adds 140 square feet would increase the price of the house.
Using coefficients rounded to three decimals:

Change in price = 0.128(140) + 15.198(1) = 33.118

So in this case the predicted price increases by about $33,118. This seems reasonable:
we not only add the bedroom but also increase the area of the house, so the price
increase is considerably bigger than in II).

IV) R² = 0.6319, so 63.19% of the variation in price is explained by square footage
and number of bedrooms.

V) The first house has sqrft = 2,438 and bdrms = 4. So its predicted price is:

price = -19.315 + 0.1284362(2438) + 15.19819(4) = 354.605 (in thousands of dollars), that is, about $354,605.22.

VI) The predicted price of this house is $354,605.22 and the buyer paid $300,000, so
the residual is 300,000 − 354,605.22 = −$54,605.22. The residual is negative: the
actual price is below the predicted price, so we could conclude that the buyer
underpaid for the house. However, we have to remember that many more factors matter
than the number of bedrooms and the floor area of the house, and we have not
controlled for them. These include the location: whether the neighbourhood is good,
whether there are schools or conveniently located facilities nearby, how far the
house is from the city centre, and so on. For the first observation, it may be that
the location was not as good or the school was relatively far away, so the price was
below the average for a house of this size and number of bedrooms.
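
The numbers in this exercise can be recomputed from the reported coefficients. This is a small sketch (the names are chosen for illustration) that keeps everything in thousands of dollars, the unit of price. Note that the unrounded coefficients give a change of about 33.18 for the 140-square-foot bedroom; the figure of 33.118 in the text comes from rounding the coefficients to three decimals first.

```python
# Estimates reported in part I); price is measured in thousands of dollars.
b0, b_sqrft, b_bdrms = -19.315, 0.1284362, 15.19819


def predicted_price(sqrft, bdrms):
    """Fitted price (in thousands of dollars) from the estimated equation."""
    return b0 + b_sqrft * sqrft + b_bdrms * bdrms


# One more bedroom that also adds 140 square feet to the house.
change_with_area = b_sqrft * 140 + b_bdrms * 1

# First house in the sample: sqrft = 2438, bdrms = 4, actual price = 300.
fitted = predicted_price(2438, 4)
residual = 300 - fitted

print(round(change_with_area, 3))
print(round(fitted, 3), round(residual, 3))
```

A negative residual means the house sold for less than the regression line predicts given its size and number of bedrooms.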

Exercise 3

I) $$\widehat{IQ} = 53.68715 + 3.533829\,educ$$

n = 935, R² = 0.2659

The slope coefficient is δ̃1 = 3.533829.

II) Writing lwage for log(wage):

$$\widehat{lwage} = 5.973063 + 0.0598392\,educ$$

n = 935, R² = 0.0974

The slope coefficient is β̃1 = 0.0598392.

III) The slope coefficients from the regression of log(wage) on educ and IQ are
β̂1 = 0.0391199 and β̂2 = 0.0058631, respectively.

$$\widehat{\log(wage)} = 5.658288 + 0.0391199\,educ + 0.0058631\,IQ$$

n = 935, R² = 0.13


IV) We check whether β̃1 = β̂1 + β̂2 δ̃1:

0.0391199 + 0.0058631(3.533829) = 0.05983909

The value of β̃1 we calculated in part II) is 0.0598392, so the results are very
close, and the small difference is due to rounding of the reported coefficients.
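
The check can be reproduced from the reported estimates. The variable names below are chosen for this sketch only.

```python
delta1_tilde = 3.533829   # slope from IQ on educ, part I)
beta1_tilde = 0.0598392   # slope from log(wage) on educ, part II)
beta1_hat = 0.0391199     # coefficient on educ in the multiple regression, part III)
beta2_hat = 0.0058631     # coefficient on IQ in the multiple regression, part III)

# Right-hand side of the identity beta1_tilde = beta1_hat + beta2_hat * delta1_tilde.
implied = beta1_hat + beta2_hat * delta1_tilde
gap = abs(implied - beta1_tilde)

print(round(implied, 8), gap)
```

The gap is of the order of the rounding in the seven reported digits, which is what we expect from an identity that holds exactly in the unrounded estimates.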
