
Two Variable Regression Analysis

(Part I)

HE204A – Lecture 1
Fall 2010
Assistant Professor Kannika Damrongplasit
Topics to be Discussed

 The Two Variable Model
 Ordinary Least Squares (OLS) Estimation
 Goodness of Fit
 Properties of OLS Estimators
 Reference: Gujarati and Porter, Chapters 2, 3 & 4
The Two Variable Model
 Econometrics – Application of mathematics and
statistics to economic data to analyze economic
models
 Simplest Model: Two Variable Model
 Y = β1 + β2X + u
 Y – Dependent Variable
 X – Independent or Explanatory Variable
 u – Stochastic disturbance or Error term
 The dependent variable is related to a single
independent variable
The Two Variable Model

 β1 and β2 are parameters
 β1 – Intercept coefficient
 β2 – Slope coefficient
 Use regression to estimate the numerical value of
β1 and β2
 Regression is concerned with estimating the
mean of the dependent variable on the basis of
fixed values of independent variables
Population Regression Function (PRF)

 Consider the relationship between weekly consumption expenditure (Y) and weekly income (X) for a population of 60 families divided into 10 income groups (subpopulations)
 Observe that, on average, weekly consumption expenditure increases as income increases
 The population regression curve is the curve that connects the means of the subpopulations of Y corresponding to the given values of X
[Figure: Weekly Family Income]
Linear Population Regression Function

 Each conditional mean E(Y|Xi) is a function of Xi, where Xi is the given value of X
 Assume a linear function, E(Y|Xi) = β1 + β2Xi
 This is known as the linear population regression function
Population Regression Line

For each X, there is a population of Y values that are spread around the conditional
mean of those Y values.
Linearity
 Linear regression means a regression that is linear in the parameters: the β's are raised to the first power only
 The conditional expectation of Y, E(Y|Xi), is a linear function of the parameters. It may or may not be linear in the variable X
 E(Y|Xi) = β1 + β2Xi² is linear in the parameters
 E(Y|Xi) = β1 + β2²Xi is nonlinear in the parameters
Linearity
Question: Are these models considered linear in parameters?
Significance of Stochastic
Disturbance Term
 The relationships between economic variables are inexact. The error term captures all other variables that are omitted from the model but that collectively affect Y.
 The error term is needed due to:
 (1) Vagueness of theory
 (2) Unavailability of data (e.g. family wealth)
 (3) Core variables versus peripheral variables
Significance of Stochastic
Disturbance Term
 (4) Intrinsic randomness in human behavior (i.e. intrinsic ability of the individual)
 (5) Poor proxy variables (i.e. errors of measurement)
 (6) Principle of parsimony (i.e. keep the model as simple as possible)
 (7) Wrong functional forms
Sample Regression Function (SRF)
 In practical situations we have only a sample
of Y values corresponding to some fixed X’s.
 May not be able to estimate the PRF
accurately because of sampling fluctuations
 A sample regression function is
 Ŷi = β̂1 + β̂2Xi
 Ŷi = estimator of E(Y|Xi)
 β̂1 = estimator of β1
 β̂2 = estimator of β2
Random samples from the
population
Regression lines based on two
different samples
Sample Regression Function (SRF)
 In stochastic form, the sample regression function is
 Yi = β̂1 + β̂2Xi + ûi
 ûi = the sample residual term
 The primary objective in regression analysis is to estimate the PRF Yi = β1 + β2Xi + ui on the basis of the SRF Yi = β̂1 + β̂2Xi + ûi
 The SRF can be expressed as Yi = Ŷi + ûi
 The SRF may underestimate or overestimate the PRF because of sampling fluctuations
Sample and Population Regression Lines
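To make the distinction between the PRF and the SRF concrete, here is a minimal simulation sketch (Python with numpy; the parameter values and income levels are invented for illustration, not taken from the textbook). Two samples drawn from the same population model yield two different fitted lines, neither of which coincides with the true PRF.

```python
# Sketch: PRF vs. SRF under sampling fluctuation (illustrative values only)
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = 17.0, 0.6, 5.0      # assumed "true" PRF parameters
X = np.linspace(80, 260, 10)              # fixed X values (10 income levels)

for sample in range(2):
    u = rng.normal(0.0, sigma, size=X.size)   # stochastic disturbance u_i
    Y = beta1 + beta2 * X + u                 # observed Y = PRF plus error
    # OLS slope and intercept for this sample (deviation-form formulas)
    b2 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b1 = Y.mean() - b2 * X.mean()
    print(f"sample {sample + 1}: SRF  Y-hat = {b1:.2f} + {b2:.3f} X")
print(f"true PRF:  E(Y|X) = {beta1:.2f} + {beta2:.3f} X")
```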
Ordinary Least Square (OLS)
Estimation
 From the SRF, Yi = β̂1 + β̂2Xi + ûi = Ŷi + ûi
 We have ûi = Yi − Ŷi = Yi − β̂1 − β̂2Xi
 ⇒ ∑ûi² = ∑(Yi − β̂1 − β̂2Xi)²
 The method of least squares chooses β̂1 and β̂2 in such a manner that, for a given sample or set of data, ∑ûi² is as small as possible.
Ordinary Least Square (OLS)
Estimation

Setting the partial derivatives of ∑ûi² with respect to β̂1 and β̂2 equal to zero gives two simultaneous equations known as the normal equations. Solve them to get the OLS estimators.
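For reference, the two normal equations of the two-variable model take the standard form:

```latex
\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_i ,
\qquad
\sum X_i Y_i = \hat{\beta}_1 \sum X_i + \hat{\beta}_2 \sum X_i^2 .
```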
Ordinary Least Square
Estimators
 OLS estimators:

 β̂2 = [n∑XiYi − ∑Xi∑Yi] / [n∑Xi² − (∑Xi)²] = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)²

 β̂1 = [∑Xi²∑Yi − ∑Xi∑XiYi] / [n∑Xi² − (∑Xi)²] = Ȳ − β̂2X̄
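As a numerical sanity check, the sketch below (Python with numpy; the X and Y arrays are invented illustrative values) computes β̂1 and β̂2 with both the raw-sum and the deviation-form versions of these formulas; the two versions agree.

```python
# Sketch: OLS estimators for the two-variable model (illustrative data only)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size

# Raw-sum form
b2_raw = (n * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (n * np.sum(X**2) - np.sum(X)**2)
b1_raw = (np.sum(X**2) * np.sum(Y) - np.sum(X) * np.sum(X * Y)) / (n * np.sum(X**2) - np.sum(X)**2)

# Deviation form
x, y = X - X.mean(), Y - Y.mean()
b2_dev = np.sum(x * y) / np.sum(x**2)
b1_dev = Y.mean() - b2_dev * X.mean()

print(b1_raw, b2_raw)   # same values either way
print(b1_dev, b2_dev)
```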
Assumptions of Classical Linear
Regression Model (CLRM)

 Assumption 1: The regression model is linear in the parameters, Yi = β1 + β2Xi + ui
 Assumption 2: The X variables and the error term are independent, cov(Xi, ui) = 0
 Assumption 3: The mean value of the random disturbance term is 0, E(ui|Xi) = 0
Assumptions of Classical Linear
Regression Model (CLRM)

 Assumption 4: The variance of the error term is the same regardless of the value of X (homoscedasticity), var(ui) = σ²
Assumptions of Classical Linear
Regression Model (CLRM)

[Figure: homoscedasticity vs. heteroscedasticity]
Assumptions of Classical Linear
Regression Model (CLRM)
 Assumption 5: The correlation between any two ui and uj (i ≠ j) is zero (no autocorrelation), cov(ui, uj|Xi, Xj) = 0
Assumptions of Classical Linear
Regression Model (CLRM)
 Assumption 6: The number of observations n must be greater than the number of parameters to be estimated
 Assumption 7: The X values in a given sample must not all be the same, var(X) > 0
Standard Errors of Least Square
Estimates
 The least-squares estimates are functions of the sample data. Thus, we need to measure the precision of the estimators by their standard errors (se):

 var(β̂2) = σ² / ∑xi²          se(β̂2) = σ / √(∑xi²)
 var(β̂1) = (∑Xi² / n∑xi²) σ²   se(β̂1) = √(∑Xi² / n∑xi²) · σ
 where xi = Xi − X̄
Standard Errors of Least Square
Estimates

 The term σ² can be estimated by σ̂² = ∑ûi² / (n − 2)
 where ∑ûi² = ∑yi² − (∑xiyi)² / ∑xi², with yi = Yi − Ȳ
 The term σ̂ = √(∑ûi² / (n − 2)) is known as the standard error of estimate. It is the standard deviation of the Y values about the estimated regression line.
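Continuing the same illustrative sketch (invented data, not the textbook's), σ̂², the standard error of estimate, and the standard errors of β̂1 and β̂2 can be computed directly from these formulas:

```python
# Sketch: sigma-hat^2 and standard errors of the OLS estimators (illustrative data)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size
x = X - X.mean()

b2 = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
b1 = Y.mean() - b2 * X.mean()

resid = Y - (b1 + b2 * X)                  # u-hat_i
sigma2_hat = np.sum(resid**2) / (n - 2)    # sigma-hat^2 = RSS / (n - 2)

se_b2 = np.sqrt(sigma2_hat / np.sum(x**2))
se_b1 = np.sqrt(sigma2_hat * np.sum(X**2) / (n * np.sum(x**2)))

print(np.sqrt(sigma2_hat), se_b1, se_b2)   # standard error of estimate, se(b1), se(b2)
```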
Gauss Markov Theorem
 An estimator is said to be the best linear unbiased estimator (BLUE) if
 (1) It is linear
 (2) It is unbiased, E(β̂2) = β2
 (3) It has minimum variance (efficient estimator)
 Given the assumptions of the classical linear regression model, the least-squares estimators, in the class of linear unbiased estimators, have minimum variance; that is, they are BLUE
Goodness of Fit
 Consider the goodness of fit of the fitted regression line to a set of data, i.e. how well the sample regression line fits the data
 We hope that the residuals (positive and negative) around the regression line are as small as possible
 The coefficient of determination r² measures how well the sample regression line fits the data
Coefficient of Determination
 Recall that Yi = Ŷi + ûi
 In deviation form, Yi − Ȳ = (Ŷi − Ȳ) + ûi
 Squaring both sides and summing over the sample, we have
 ∑(Yi − Ȳ)² = ∑(Ŷi − Ȳ)² + ∑ûi² + 2∑(Ŷi − Ȳ)ûi
            = ∑(Ŷi − Ȳ)² + ∑ûi²   (the cross-product term sums to zero)
 TSS = ESS + RSS

Breakdown of Variation in Y
Coefficient of Determination
 TSS = Total sum of squares, total variation of
actual Y values about their sample mean
 ESS = Explained sum of squares, variation of
estimated Y about their mean due to the
explanatory variables
 RSS = Residual sum of squares, variation of the Y
values about the regression line due to the error

 1 = ESS/TSS + RSS/TSS = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)² + ∑ûi² / ∑(Yi − Ȳ)²
Coefficient of Determination
 r² = ESS/TSS = ∑(Ŷi − Ȳ)² / ∑(Yi − Ȳ)²,  or equivalently  r² = 1 − RSS/TSS = 1 − ∑ûi² / ∑(Yi − Ȳ)²
 The quantity r² is known as the coefficient of determination.
 It measures the proportion or percentage of the total variation in Y explained by the regression model.
Properties of r²
 (1) r² is a nonnegative quantity
 (2) The limits of r² are 0 ≤ r² ≤ 1
 r² = 1 means a perfect fit
 r² = 0 means there is no relationship between Y and the explanatory variables
 To compute r², we can use the formula
 r² = β̂2² ∑(Xi − X̄)² / ∑(Yi − Ȳ)²
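A small sketch (again with invented illustrative data) verifying that TSS = ESS + RSS and that the equivalent expressions for r² give the same number:

```python
# Sketch: TSS = ESS + RSS and the coefficient of determination r^2
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])

x = X - X.mean()
b2 = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

TSS = np.sum((Y - Y.mean())**2)      # total variation of Y about its mean
ESS = np.sum((Y_hat - Y.mean())**2)  # variation explained by the regression
RSS = np.sum((Y - Y_hat)**2)         # residual (unexplained) variation

r2_a = ESS / TSS
r2_b = 1 - RSS / TSS
r2_c = b2**2 * np.sum(x**2) / TSS
print(np.isclose(TSS, ESS + RSS), r2_a, r2_b, r2_c)   # all three r^2 forms agree
```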
Sample Correlation Coefficient
 The quantity r is known as the sample
correlation coefficient.
 It is a measure of the degree of association
between two variables

 r = ±√r²
 r = [n∑XiYi − (∑Xi)(∑Yi)] / √{[n∑Xi² − (∑Xi)²][n∑Yi² − (∑Yi)²]}
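Using the same invented data, the raw-sum formula for r can be checked against numpy's built-in correlation:

```python
# Sketch: sample correlation coefficient r (illustrative data)
import numpy as np

X = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)
Y = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])
n = X.size

num = n * np.sum(X * Y) - np.sum(X) * np.sum(Y)
den = np.sqrt((n * np.sum(X**2) - np.sum(X)**2) * (n * np.sum(Y**2) - np.sum(Y)**2))
r = num / den

print(r, np.corrcoef(X, Y)[0, 1])   # the formula matches numpy's corrcoef
```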
Properties of r
 It can be positive or negative
 Its limits are −1 ≤ r ≤ 1
 It is symmetric in nature, rXY = rYX
 It is independent of the origin and scale
 If X and Y are statistically independent, r = 0. But if r = 0, it does not mean that the two variables are independent
 It is a measure of linear association only
 It does not imply any cause-and-effect relationship
Correlation Patterns
Numerical Example (Table 3.2
of Textbook)
 Y = Mean hourly wage, X = Years of schooling, n = 13. The regression results are
 Ŷi = −0.0144 + 0.724Xi
 se(β̂1) = 0.932,  se(β̂2) = 0.07,  r² = 0.9065
 The value of β̂2 = 0.724 shows that, within the sample range of X between 6 and 18 years of education, as X increases by 1, the estimated increase in mean hourly wage is about 72 cents
 The value of β̂1 = −0.0144 indicates the average level of wages when the level of education is zero. Very often the intercept has no meaningful interpretation.
 The r² value suggests that education explains about 90% of the variation in hourly wage
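In practice, results like these are produced by a regression routine rather than by hand. The sketch below uses Python's statsmodels package with stand-in arrays (the actual Table 3.2 data are not reproduced here, so the printed numbers will differ from the slide unless the textbook values are substituted):

```python
# Sketch: two-variable OLS with statsmodels (stand-in data; substitute the
# Table 3.2 values for years of schooling and mean hourly wage to reproduce the slide)
import numpy as np
import statsmodels.api as sm

educ = np.array([6, 7, 8, 9, 10, 11, 12, 13], dtype=float)   # stand-in for X
wage = np.array([4.5, 5.5, 5.7, 6.6, 7.3, 7.8, 8.6, 9.0])    # stand-in for Y

X = sm.add_constant(educ)           # adds the intercept column for beta_1
results = sm.OLS(wage, X).fit()
print(results.params)               # beta_1-hat, beta_2-hat
print(results.bse)                  # se(beta_1-hat), se(beta_2-hat)
print(results.rsquared)             # r^2
```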
Normality Assumption
 The classical normal linear regression model assumes that each ui is distributed normally with
 Mean: E(ui) = 0
 Variance: E[ui − E(ui)]² = E(ui²) = σ²
 Cov(ui, uj) = E[ui − E(ui)][uj − E(uj)] = E(uiuj) = 0 for i ≠ j
 That is, ui ~ N(0, σ²)
Properties of OLS Estimators
under normality assumption
 The OLS estimators have the following
properties:
 (1) They are unbiased
 (2) They have minimum variance (efficient)
 (3) They have consistency, that is, as the
sample size increases indefinitely, the
estimators converge to their true
population values
Properties of OLS Estimators
under normality assumption
 (4) β̂1 is normally distributed with
 Mean: E(β̂1) = β1
 Variance: var(β̂1) = σ²β̂1 = (∑Xi² / n∑xi²) σ²
 Z = (β̂1 − β1) / σβ̂1 → N(0, 1)
 (5) β̂2 is normally distributed with
 Mean: E(β̂2) = β2
 Variance: var(β̂2) = σ²β̂2 = σ² / ∑xi²
 Z = (β̂2 − β2) / σβ̂2 → N(0, 1)
Probability Distribution of (β̂1, β̂2)
Properties of OLS Estimators
under normality assumption
 (6) (n − 2)(σ̂²/σ²) is distributed as the χ² distribution with (n − 2) df
 (7) (β̂1, β̂2) are distributed independently of σ̂²
 (8) (β̂1, β̂2) have minimum variance in the entire class of unbiased estimators
Properties of OLS Estimators
under normality assumption
 To sum up, the assumption ui ~ N(0, σ²) allows us to derive the probability distributions of (β̂1, β̂2) and σ̂²
 It will facilitate constructing confidence intervals and testing hypotheses later on
 Yi, being a linear function of ui, is itself normally distributed with mean and variance
 E(Yi) = β1 + β2Xi and var(Yi) = σ²
