
Two Variable Regression Analysis

(Part I)

HE204A – Lecture 1
Fall 2010
Assistant Professor Kannika Damrongplasit
Topics to be Discussed

 The Two Variable Model
 Ordinary Least Squares (OLS) Estimation
 Goodness of Fit
 Properties of OLS Estimators
 Reference: Gujarati and Porter, Chapters 2, 3 & 4
The Two Variable Model
 Econometrics – Application of mathematics and
statistics to economic data to analyze economic
models
 Simplest Model: Two Variable Model
 Y = β1 + β2X + u
 Y – Dependent Variable
 X – Independent or Explanatory Variable
 u – Stochastic disturbance or Error term
 The dependent variable is related to a single
independent variable
The Two Variable Model

 β1 and β2 are parameters
 β1 – Intercept coefficient
 β2 – Slope coefficient
 Use regression to estimate the numerical value of
β1 and β2
 Regression is concerned with estimating the
mean of the dependent variable on the basis of
fixed values of independent variables
Population Regression Function (PRF)

 Consider the relationship between weekly consumption expenditure (Y) and weekly income (X) for a population of 60 families divided into 10 income groups (subpopulations)
 Observe that, on average, weekly consumption expenditure increases as income increases
 The population regression curve is the curve that connects the means of the subpopulations of Y corresponding to the given values of X
[Figure: Weekly Family Income]
Linear Population Regression Function

 Each conditional mean E(Y|Xi) is a function of Xi, where Xi is the given value of X
 Assume a linear function, E(Y|Xi) = β1 + β2Xi
 This is known as the linear population regression function
Population Regression Line

For each X, there is a population of Y values that are spread around the conditional
mean of those Y values.
Linearity
 Linear regression means a regression that is linear in the parameters: the β's are raised to the first power only
 The conditional expectation of Y, E(Y|Xi), is a linear function of the parameters. It may or may not be linear in the variable X
 E(Y|Xi) = β1 + β2Xi² is linear in the parameters
 E(Y|Xi) = β1 + β2²Xi is nonlinear in the parameters
Linearity
Question: Are these models considered linear in parameters?
Significance of Stochastic
Disturbance Term
 The relationships between economic variables are inexact. The error term captures all other variables that are omitted from the model but that collectively affect Y.
 The error term is needed due to:
 (1) Vagueness of theory
 (2) Unavailability of data (e.g. family wealth)
 (3) Core variables versus peripheral variables
Significance of Stochastic
Disturbance Term
 (4) Intrinsic randomness in human behavior (i.e. intrinsic ability of the individual)
 (5) Poor proxy variables (i.e. errors of measurement)
 (6) Principle of parsimony (i.e. keep the model as simple as possible)
 (7) Wrong functional forms
Sample Regression Function (SRF)
 In practical situations we have only a sample
of Y values corresponding to some fixed X’s.
 May not be able to estimate the PRF
accurately because of sampling fluctuations
 A sample regression function is
 Ŷi = β̂1 + β̂2Xi
 Ŷi = estimator of E(Y|Xi)
 β̂1 = estimator of β1
 β̂2 = estimator of β2
Random samples from the
population
Regression lines based on two
different samples
Sample Regression Function (SRF)
 In stochastic form, the sample regression function is
 Yi = β̂1 + β̂2Xi + ûi
 ûi = the sample residual term
 The primary objective in regression analysis is to estimate the PRF Yi = β1 + β2Xi + ui on the basis of the SRF Yi = β̂1 + β̂2Xi + ûi
 The SRF can be expressed as Yi = Ŷi + ûi
 The SRF may underestimate or overestimate the PRF because of sampling fluctuations
Sample and Population Regression Lines
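To make the distinction between the PRF and the SRF concrete, here is a minimal simulation sketch (Python with numpy; the parameter values and income levels are invented for illustration, not taken from the textbook). Two samples drawn from the same population model yield two different fitted lines, neither of which coincides with the true PRF.

```python
# Sketch: PRF vs. SRF under sampling fluctuation (illustrative values only)
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = 17.0, 0.6, 5.0      # assumed "true" PRF parameters
X = np.linspace(80, 260, 10)              # fixed X values (10 income levels)

for sample in range(2):
    u = rng.normal(0.0, sigma, size=X.size)   # stochastic disturbance u_i
    Y = beta1 + beta2 * X + u                 # observed Y = PRF plus error
    # OLS slope and intercept for this sample (deviation-form formulas)
    b2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b1 = Y.mean() - b2 * X.mean()
    print(f"sample {sample + 1}: SRF  Y-hat = {b1:.2f} + {b2:.3f} X")
print(f"true PRF:  E(Y|X) = {beta1:.2f} + {beta2:.3f} X")
```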
Ordinary Least Square (OLS)
Estimation
 From the SRF, Yi = β̂1 + β̂2Xi + ûi = Ŷi + ûi
 We have ûi = Yi − Ŷi = Yi − β̂1 − β̂2Xi
 ⇒ ∑ûi² = ∑(Yi − β̂1 − β̂2Xi)²
 The method of least squares chooses β̂1 and β̂2 in such a manner that, for a given sample or set of data, ∑ûi² is as small as possible.
Ordinary Least Square (OLS)
Estimation

Setting the partial derivatives of ∑ûi² with respect to β̂1 and β̂2 equal to zero gives two simultaneous equations known as the normal equations. Solve them to get the OLS estimators.
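For reference, the two normal equations of the two-variable model take the standard form:

```latex
\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i ,
\qquad
\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2 .
```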
Ordinary Least Square
Estimators
 OLS estimators:

 β̂2 = [n∑XiYi − ∑Xi∑Yi] / [n∑Xi² − (∑Xi)²] = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)²

 β̂1 = [∑Xi²∑Yi − ∑Xi∑XiYi] / [n∑Xi² − (∑Xi)²] = Ȳ − β̂2X̄
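As a numerical sanity check, the sketch below (Python with numpy; the X and Y arrays are invented illustrative values) computes β̂1 and β̂2 with both the raw-sum and the deviation-form versions of these formulas; the two versions agree.

```python
# Sketch: OLS estimators for the two-variable model (illustrative data only)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size

# Raw-sum form
b2_raw = (n * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (n * np.sum(X**2) - np.sum(X)**2)
b1_raw = (np.sum(X**2) * np.sum(Y) - np.sum(X) * np.sum(X * Y)) / (n * np.sum(X**2) - np.sum(X)**2)

# Deviation form
x, y = X - X.mean(), Y - Y.mean()
b2_dev = np.sum(x * y) / np.sum(x**2)
b1_dev = Y.mean() - b2_dev * X.mean()

print(b1_raw, b2_raw)   # same values either way
print(b1_dev, b2_dev)
```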
Assumptions of Classical Linear
Regression Model (CLRM)

 Assumption 1: The regression model is linear in the parameters, Yi = β1 + β2Xi + ui
 Assumption 2: The X variables and the error term are independent, cov(Xi, ui) = 0
 Assumption 3: The mean value of the random disturbance term is 0, E(ui|Xi) = 0
Assumptions of Classical Linear
Regression Model (CLRM)

 Assumption 4: The variance of the error term is the same regardless of the value of X (homoscedasticity), var(ui) = σ²
Assumptions of Classical Linear
Regression Model (CLRM)

[Figure: homoscedasticity vs. heteroscedasticity]
Assumptions of Classical Linear
Regression Model (CLRM)
 Assumption 5: The correlation between any two ui and uj (i ≠ j) is zero (no autocorrelation), cov(ui, uj|Xi, Xj) = 0
Assumptions of Classical Linear
Regression Model (CLRM)
 Assumption 6: The number of observations n must be greater than the number of parameters to be estimated
 Assumption 7: The X values in a given sample must not all be the same, var(X) > 0
Standard Errors of Least Square
Estimates
 The least-squares estimates are functions of the sample data. Thus, we need to measure the precision of the estimators by their standard errors (se):

 var(β̂2) = σ² / ∑xi²          se(β̂2) = σ / √(∑xi²)
 var(β̂1) = (∑Xi² / n∑xi²) σ²   se(β̂1) = √(∑Xi² / n∑xi²) · σ
 where xi = Xi − X̄
Standard Errors of Least Square
Estimates

 The term σ² can be estimated by σ̂² = ∑ûi² / (n − 2)
 where ∑ûi² = ∑yi² − (∑xiyi)² / ∑xi², with yi = Yi − Ȳ
 The term σ̂ = √(∑ûi² / (n − 2)) is known as the standard error of estimate. It is the standard deviation of the Y values about the estimated regression line.
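Continuing the same illustrative sketch (invented data, not the textbook's), σ̂², the standard error of estimate, and the standard errors of β̂1 and β̂2 can be computed directly from these formulas:

```python
# Sketch: sigma-hat^2 and standard errors of the OLS estimators (illustrative data)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size
x = X - X.mean()

b2 = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
b1 = Y.mean() - b2 * X.mean()

resid = Y - (b1 + b2 * X)                  # u-hat_i
sigma2_hat = np.sum(resid**2) / (n - 2)    # sigma-hat^2 = RSS / (n - 2)

se_b2 = np.sqrt(sigma2_hat / np.sum(x**2))
se_b1 = np.sqrt(sigma2_hat * np.sum(X**2) / (n * np.sum(x**2)))

print(np.sqrt(sigma2_hat), se_b1, se_b2)   # standard error of estimate, se(b1), se(b2)
```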
Gauss Markov Theorem
 An estimator is said to be the best linear unbiased estimator (BLUE) if
 (1) It is linear
 (2) It is unbiased, E(β̂2) = β2
 (3) It has minimum variance (efficient estimator)
 Given the assumptions of the classical linear regression model, the least-squares estimators, in the class of linear unbiased estimators, have minimum variance; that is, they are BLUE
Goodness of Fit
 Consider the goodness of fit of the fitted regression line to a set of data, i.e. how well the sample regression line fits the data
 We hope that the residuals (positive and negative) around the regression line are as small as possible
 The coefficient of determination r² measures how well the sample regression line fits the data
Coefficient of Determination
 Recall that Yi = Ŷi + ûi
 In deviation form, Yi − Ȳ = (Ŷi − Ȳ) + ûi
 Squaring both sides and summing over the sample, we have
 ∑(Yi − Ȳ)² = ∑(Ŷi − Ȳ)² + ∑ûi² + 2∑(Ŷi − Ȳ)ûi
            = ∑(Ŷi − Ȳ)² + ∑ûi²   (the cross-product term sums to zero)
 TSS = ESS + RSS

Breakdown of Variation in Y
Coefficient of Determination
 TSS = Total sum of squares, total variation of
actual Y values about their sample mean
 ESS = Explained sum of squares, variation of
estimated Y about their mean due to the
explanatory variables
 RSS = Residual sum of squares, variation of the Y
values about the regression line due to the error

 1 = ESS/TSS + RSS/TSS = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)² + ∑ûi² / ∑(Yi − Ȳ)²
Coefficient of Determination
 r² = ESS/TSS = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)²,  or equivalently  r² = 1 − RSS/TSS = 1 − ∑ûi² / ∑(Yi − Ȳ)²
 The quantity r² is known as the coefficient of determination.
 It measures the proportion or percentage of the total variation in Y explained by the regression model.
Properties of r²
 (1) r² is a nonnegative quantity
 (2) The limits of r² are 0 ≤ r² ≤ 1
 r² = 1 means a perfect fit
 r² = 0 means there is no relationship between Y and the explanatory variables
 To compute r², we can use the formula
 r² = β̂2² ∑(Xi − X̄)² / ∑(Yi − Ȳ)²
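A small sketch (again with invented illustrative data) verifying that TSS = ESS + RSS and that the equivalent expressions for r² give the same number:

```python
# Sketch: TSS = ESS + RSS and the coefficient of determination r^2
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])

x = X - X.mean()
b2 = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

TSS = np.sum((Y - Y.mean())**2)      # total variation of Y about its mean
ESS = np.sum((Y_hat - Y.mean())**2)  # variation explained by the regression
RSS = np.sum((Y - Y_hat)**2)         # residual (unexplained) variation

r2_a = ESS / TSS
r2_b = 1 - RSS / TSS
r2_c = b2**2 * np.sum(x**2) / TSS
print(np.isclose(TSS, ESS + RSS), r2_a, r2_b, r2_c)   # all three r^2 forms agree
```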
Sample Correlation Coefficient
 The quantity r is known as the sample
correlation coefficient.
 It is a measure of the degree of association
between two variables

 r = ±√r²
 r = [n∑XiYi − (∑Xi)(∑Yi)] / √{[n∑Xi² − (∑Xi)²][n∑Yi² − (∑Yi)²]}
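Using the same invented data, the raw-sum formula for r can be checked against numpy's built-in correlation:

```python
# Sketch: sample correlation coefficient r (illustrative data)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size

num = n * np.sum(X * Y) - np.sum(X) * np.sum(Y)
den = np.sqrt((n * np.sum(X**2) - np.sum(X)**2) * (n * np.sum(Y**2) - np.sum(Y)**2))
r = num / den

print(r, np.corrcoef(X, Y)[0, 1])   # the formula matches numpy's corrcoef
```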
Properties of r
 It can be positive or negative
 Its limits are −1 ≤ r ≤ 1
 It is symmetric in nature, rXY = rYX
 It is independent of the origin and scale
 If X and Y are statistically independent, r = 0. But if r = 0, it does not mean that the two variables are independent
 It is a measure of linear association only
 It does not imply any cause-and-effect relationship
Correlation Patterns
Numerical Example (Table 3.2
of Textbook)
 Y = Mean hourly wage, X = Years of schooling, n = 13. The regression results are
 Ŷi = −0.0144 + 0.724Xi
 se(β̂1) = 0.932,  se(β̂2) = 0.07,  r² = 0.9065
 The value of β̂2 = 0.724 shows that, within the sample range of X between 6 and 18 years of education, as X increases by 1, the estimated increase in mean hourly wage is about 72 cents
 The value of β̂1 = −0.0144 indicates the average level of wages when the level of education is zero. Very often the intercept has no meaningful interpretation.
 The r² value suggests that education explains about 90% of the variation in hourly wage
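In practice, results like these are produced by a regression routine rather than by hand. The sketch below uses Python's statsmodels package with stand-in arrays (the actual Table 3.2 data are not reproduced here, so the printed numbers will differ from the slide unless the textbook values are substituted):

```python
# Sketch: two-variable OLS with statsmodels (stand-in data; substitute the
# Table 3.2 values for years of schooling and mean hourly wage to reproduce the slide)
import numpy as np
import statsmodels.api as sm

educ = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)   # stand-in for X
wage = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])    # stand-in for Y

X = sm.add_constant(educ)           # adds the intercept column for beta_1
results = sm.OLS(wage, X).fit()
print(results.params)               # beta_1-hat, beta_2-hat
print(results.bse)                  # se(beta_1-hat), se(beta_2-hat)
print(results.rsquared)             # r^2
```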
Normality Assumption
 The classical normal linear regression model assumes that each ui is distributed normally with
 Mean: E(ui) = 0
 Variance: E[ui − E(ui)]² = E(ui²) = σ²
 Cov(ui, uj) = E[ui − E(ui)][uj − E(uj)] = E(uiuj) = 0 for i ≠ j
 That is, ui ~ N(0, σ²)
Properties of OLS Estimators
under normality assumption
 The OLS estimators have the following
properties:
 (1) They are unbiased
 (2) They have minimum variance (efficient)
 (3) They have consistency, that is, as the
sample size increases indefinitely, the
estimators converge to their true
population values
Properties of OLS Estimators
under normality assumption
 (4) β̂1 is normally distributed with
 Mean: E(β̂1) = β1
 Variance: var(β̂1) = σ²β̂1 = (∑Xi² / n∑xi²) σ²
 Z = (β̂1 − β1) / σβ̂1 → N(0, 1)
 (5) β̂2 is normally distributed with
 Mean: E(β̂2) = β2
 Variance: var(β̂2) = σ²β̂2 = σ² / ∑xi²
 Z = (β̂2 − β2) / σβ̂2 → N(0, 1)
Probability Distribution of (β̂1, β̂2)
Properties of OLS Estimators
under normality assumption
 (6) (n − 2)(σ̂²/σ²) is distributed as the χ² distribution with (n − 2) df
 (7) (β̂1, β̂2) are distributed independently of σ̂²
 (8) (β̂1, β̂2) have minimum variance in the entire class of unbiased estimators
Properties of OLS Estimators
under normality assumption
 To sum up, the assumption ui ~ N(0, σ²) allows us to derive the probability distributions of (β̂1, β̂2) and σ̂²
 It will facilitate constructing confidence intervals and testing hypotheses later on
 Yi, being a linear function of ui, is itself normally distributed with mean and variance
 E(Yi) = β1 + β2Xi and var(Yi) = σ²
