(Part I)
HE204A – Lecture 1
Fall 2010
Assistant Professor Kannika Damrongplasit
Topics to be Discussed
Y – Dependent Variable
X – Independent or Explanatory Variable
u – Stochastic disturbance or Error term
The dependent variable is related to a single
independent variable
The Two Variable Model
For each X, there is a population of Y values that are spread around the conditional
mean of those Y values.
Linearity
Linear regression means a regression that is
linear in the parameters. The β’s are raised
to the first power only
The conditional expectation of Y, E(Y|Xi) is a
linear function of the parameters. It may or
may not be linear in the variable X
E(Y|Xi) = β1 + β2Xi² is linear in the parameters
E(Y|Xi) = β1 + β2²Xi is nonlinear in the parameters
Linearity
Question: Are these models considered linear in parameters?
Significance of Stochastic
Disturbance Term
The relationships between economic variables are
inexact. The error term captures all other
variables omitted from the model but that
collectively affect Y.
The method of least squares chooses $\hat\beta_1$ and $\hat\beta_2$ in such a manner that, for a given sample or set of data, the sum of squared residuals
$$\sum \hat u_i^2 = \sum \left(Y_i - \hat\beta_1 - \hat\beta_2 X_i\right)^2$$
is as small as possible.
Ordinary Least Square (OLS)
Estimation
These simultaneous equations are known as normal equations. Solve them and
get the OLS estimators.
Ordinary Least Square Estimators
The OLS estimators are
$$\hat\beta_2 = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2}$$
$$\hat\beta_1 = \frac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} = \bar Y - \hat\beta_2 \bar X$$
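As an illustration, the two equivalent expressions for the estimators above can be computed directly. This is a minimal sketch with made-up sample values (not from the lecture):

```python
# Sketch of the OLS formulas above on made-up data (function names and the
# sample values are illustrative, not from the lecture).
def ols(X, Y):
    """OLS estimators from the raw-sums formula."""
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxy = sum(x * y for x, y in zip(X, Y))
    sxx = sum(x * x for x in X)
    # beta2_hat = [n*sum(XiYi) - sum(Xi)*sum(Yi)] / [n*sum(Xi^2) - (sum(Xi))^2]
    b2 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    # beta1_hat = Ybar - beta2_hat * Xbar
    b1 = sy / n - b2 * sx / n
    return b1, b2

def ols_deviation(X, Y):
    """Same estimators from the deviation form: sum(xi*yi) / sum(xi^2)."""
    xbar, ybar = sum(X) / len(X), sum(Y) / len(Y)
    b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / \
        sum((x - xbar) ** 2 for x in X)
    return ybar - b2 * xbar, b2
```

Both functions return identical estimates, since the deviation form is an algebraic rearrangement of the raw-sums form.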
Assumptions of Classical Linear
Regression Model (CLRM)
Assumption 4: Homoscedasticity (equal variance of ui): var(ui) = σ²
Assumptions of Classical Linear
Regression Model (CLRM)
[Figure: homoscedastic vs. heteroscedastic error variance]
Assumptions of Classical Linear
Regression Model (CLRM)
Assumption 5: The correlation between any
two ui and uj (i≠j) is zero
(No autocorrelation),
cov(ui,uj|Xi,Xj) = 0
Assumptions of Classical Linear
Regression Model (CLRM)
Assumption 6: The number of observations n
must be greater than the number of
parameters to be estimated
Standard Errors of Least Square
Estimates
$$\operatorname{var}(\hat\beta_1) = \frac{\sum X_i^2}{n\sum x_i^2}\,\sigma^2 \qquad \operatorname{se}(\hat\beta_1) = \sqrt{\frac{\sum X_i^2}{n\sum x_i^2}}\;\sigma$$
where $x_i = X_i - \bar X$.
The term $\sigma^2$ can be estimated by
$$\hat\sigma^2 = \frac{\sum \hat u_i^2}{n-2}$$
where the residual sum of squares can be computed as
$$\sum \hat u_i^2 = \sum y_i^2 - \frac{\left(\sum x_i y_i\right)^2}{\sum x_i^2}, \qquad y_i = Y_i - \bar Y$$
The term $\hat\sigma = \sqrt{\sum \hat u_i^2 / (n-2)}$ is known as the standard error of the regression.
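These estimates can be sketched numerically. The data below are made up; for a perfectly linear sample the residual variance and the standard error both come out (numerically) zero:

```python
# Sketch (illustrative data): estimate sigma^2 = RSS/(n-2) and the standard
# error of beta1_hat after fitting the two-variable model by OLS.
def fit_with_se(X, Y):
    n = len(X)
    xbar, ybar = sum(X) / n, sum(Y) / n
    sxx = sum((x - xbar) ** 2 for x in X)                    # sum of x_i^2
    b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sxx
    b1 = ybar - b2 * xbar
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))  # sum of u_hat^2
    sigma2_hat = rss / (n - 2)                               # sigma_hat^2
    # se(beta1_hat) = sqrt( sum(Xi^2) / (n * sum(xi^2)) * sigma_hat^2 )
    se_b1 = (sum(x * x for x in X) / (n * sxx) * sigma2_hat) ** 0.5
    return b1, b2, sigma2_hat, se_b1
```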
The total variation in Y can be decomposed as
$$\sum \left(Y_i - \bar Y\right)^2 = \sum \left(\hat Y_i - \bar Y\right)^2 + \sum \hat u_i^2$$
that is, TSS = ESS + RSS. Dividing through by TSS gives
$$1 = \frac{ESS}{TSS} + \frac{RSS}{TSS} = \frac{\sum (\hat Y_i - \bar Y)^2}{\sum (Y_i - \bar Y)^2} + \frac{\sum \hat u_i^2}{\sum (Y_i - \bar Y)^2}$$
Coefficient of Determination
$$r^2 = \frac{ESS}{TSS} = \frac{\sum (\hat Y_i - \bar Y)^2}{\sum (Y_i - \bar Y)^2} \quad\text{or}\quad r^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum \hat u_i^2}{\sum (Y_i - \bar Y)^2}$$
Equivalently,
$$r^2 = \hat\beta_2^2 \, \frac{\sum x_i^2}{\sum (Y_i - \bar Y)^2}$$
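A quick numerical check (sketch, made-up data) that the two expressions for r² agree:

```python
# Sketch: compute r^2 both as 1 - RSS/TSS and via beta2_hat^2 * sum(x^2)/TSS.
# The data used to exercise it are illustrative.
def r_squared_two_ways(X, Y):
    n = len(X)
    xbar, ybar = sum(X) / n, sum(Y) / n
    sxx = sum((x - xbar) ** 2 for x in X)          # sum of x_i^2
    tss = sum((y - ybar) ** 2 for y in Y)          # total sum of squares
    b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sxx
    b1 = ybar - b2 * xbar
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
    return 1 - rss / tss, b2 ** 2 * sxx / tss
```

The two return values coincide (up to floating-point error) because RSS = TSS − β̂₂²∑xᵢ² is an exact algebraic identity for the OLS fit.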
Sample Correlation Coefficient
The quantity r is known as the sample
correlation coefficient.
It is a measure of the degree of association
between two variables
$$r = \pm\sqrt{r^2}$$
$$r = \frac{n\sum X_i Y_i - \left(\sum X_i\right)\left(\sum Y_i\right)}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$$
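The computational formula above translates directly into code. A minimal sketch (illustrative data):

```python
# Sketch: sample correlation coefficient from the computational formula above.
def corr(X, Y):
    n = len(X)
    sx, sy = sum(X), sum(Y)
    sxy = sum(x * y for x, y in zip(X, Y))
    sxx = sum(x * x for x in X)
    syy = sum(y * y for y in Y)
    num = n * sxy - sx * sy
    den = ((n * sxx - sx ** 2) * (n * syy - sy ** 2)) ** 0.5
    return num / den
```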
Properties of r
It can be positive or negative
Its limits are -1 ≤ r ≤ 1
It is symmetric in nature, rXY = rYX
It is independent of the origin and scale
If X and Y are statistically independent, r = 0. But
if r = 0, it does not mean that the two variables are
independent
It is a measure of linear association only
It does not imply any cause-and-effect relationship
Correlation
Patterns
Numerical Example (Table 3.2
of Textbook)
Y = Mean hourly wage, X = Years of Schooling, n = 13.
The regression results are
$$\hat Y_i = -0.0144 + 0.724 X_i$$
$$\operatorname{se}(\hat\beta_1) = 0.932 \qquad \operatorname{se}(\hat\beta_2) = 0.07 \qquad r^2 = 0.9065$$
The value of β̂2 = 0.724 shows that, within the sample
range of X between 6 and 18 years of education, as X
increases by 1, the estimated increase in mean hourly
wage is about 72 cents.
The value of β̂1 = -0.0144 indicates the average level of
wages when the level of education is zero. Very often the
intercept has no viable meaning.
The r² value suggests that education explains about 90% of
the variation in hourly wage.
Normality Assumption
The classical normal linear regression model
assumes that each ui is distributed normally
with
Mean: E(ui) = 0
Variance: E[ui − E(ui)]² = E(ui²) = σ²
Cov(ui, uj) = E[ui − E(ui)][uj − E(uj)]
= E(uiuj) = 0 for i ≠ j
That is, ui ~ N(0, σ²)
Properties of OLS Estimators
under normality assumption
The OLS estimators have the following
properties:
(1) They are unbiased
(2) They have minimum variance (efficient)
(3) They have consistency, that is, as the
sample size increases indefinitely, the
estimators converge to their true
population values
Properties of OLS Estimators
under normality assumption
(4) $\hat\beta_1$ is normally distributed with mean
$E(\hat\beta_1) = \beta_1$, so that
$$Z = \frac{\hat\beta_1 - \beta_1}{\sigma\sqrt{\sum X_i^2 \big/ \left(n\sum x_i^2\right)}} \sim N(0,1)$$
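These sampling-distribution properties can be illustrated with a small Monte Carlo sketch. All parameter values below are assumed for the simulation, not taken from the lecture: repeated samples with normal errors give intercept estimates that center on the true β1.

```python
import random

# Monte Carlo sketch: draw many samples from Y = beta1 + beta2*X + u with
# u ~ N(0, sigma^2), estimate beta1 by OLS each time, and check that the
# estimates center on the true value. Parameter values are illustrative.
random.seed(0)
beta1, beta2, sigma = 1.0, 2.0, 1.0
X = [float(v) for v in range(1, 21)]           # fixed regressor values
xbar = sum(X) / len(X)
sxx = sum((x - xbar) ** 2 for x in X)          # sum of x_i^2

estimates = []
for _ in range(2000):
    Y = [beta1 + beta2 * x + random.gauss(0, sigma) for x in X]
    ybar = sum(Y) / len(Y)
    b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sxx
    estimates.append(ybar - b2 * xbar)         # beta1_hat = Ybar - beta2_hat*Xbar

mean_b1 = sum(estimates) / len(estimates)      # should be close to beta1
```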