Statistics

Simple regression
Statistics for dummies
Statistics
Gabriel V. Montes-Rojas
Gabriel Montes-Rojas Statistics

Simple regression
y = β0 + β1x + u
Much of applied econometrics is concerned with the simple linear
regression model, which explains the relationship between y and x:
y = β0 + β1 x + u
where y and x go by several interchangeable names:

y                     x
dependent variable    independent variable
explained variable    explanatory variable
response variable     control variable
regressand            regressor or covariate
u is called the error term or disturbance (loosely, the residual)
and represents all factors other than x that affect y.
Our interest is in the effect of x on y in some population. The
error term u is assumed to have no systematic influence on y, and
therefore only x is of importance. Then, we believe that
y ≡ f(x) = β0 + β1 x.
The following definitions will be used extensively during the course:
β0 is the intercept, f(0) = β0.
This represents the value of y when x is set to 0.
β1 is the slope, Δy/Δx = β1.
This represents the change in y associated with a one-unit change
in x.
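
To make the two definitions concrete, here is a minimal sketch. The values of beta0 and beta1 are hypothetical and chosen only for illustration; the function name f mirrors the notation above.

```python
def f(x, beta0, beta1):
    """Systematic part of the simple regression model: f(x) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

# Hypothetical parameter values, for illustration only.
beta0, beta1 = 2.0, 0.5

# Intercept: the value of f when x is set to 0.
intercept = f(0, beta0, beta1)

# Slope: change in y divided by change in x, here for a one-unit change from 10 to 11.
slope = (f(11, beta0, beta1) - f(10, beta0, beta1)) / (11 - 10)

print(intercept, slope)  # 2.0 0.5
```

Because the model is linear, the same slope is obtained whichever pair of x values is used.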

Example 2.7 (p.41 in Wooldridge): Returns to education
wage = β0 + β1 educ + u
Wages are expected to be an increasing function of education,
i.e. more education means, on average, higher wages. Then, in
this linear model, we expect that β1 > 0.
What does u mean? Other factors, different from education,
that affect wages, such as age or ability.

Expectation
Simple regression
Variance
Regression model

Random variables (RV)

Why do we need random variables in econometrics?
We will (almost) never observe the whole population, only a
small portion of it.
A random sample is a randomly drawn subset of a population.
If we consider the random variable X, a random sample is
{x_i}_{i=1}^{n}, or x_1, x_2, ..., x_n: n realisations of the
variable X, indexed by i.
Example: If X is the return of an asset, a random sample consists
of actual observations of the asset's returns in the market. Say,
for a sample of three observations,
x_1 = $1000, x_2 = −$567, x_3 = $0
Example: Flipping a coin: let X = 0 be HEADS and X = 1
be TAILS. Then X takes values in {0, 1}. Moreover,
P[X = 0] = P[X = 1] = 0.5. (This is the Bernoulli
distribution with success probability 0.5.)
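
The coin example can be sketched as a simulated random sample. This is only an illustration: the seed and the sample size n = 10 are arbitrary assumptions, and the list name sample is hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 10                  # sample size, chosen arbitrarily

# Each draw is one realisation x_i of X: 0 = HEADS, 1 = TAILS, each with probability 0.5.
sample = [rng.randint(0, 1) for _ in range(n)]

for i, x_i in enumerate(sample, start=1):
    print(f"x_{i} = {x_i}")
```

Changing the seed gives a different random sample from the same population.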
Discrete vs Continuous RVs
A discrete random variable is one that takes on only a finite or
countably infinite number of values.
Example: Flipping a coin: let X = 0 be HEADS and X = 1 be
TAILS. Two possible values: 0 or 1.
Example: Number of £50 bills in your wallet: X can take any
value in {0, 1, 2, 3, ...}.
Each outcome of X has an associated probability,
p_j = P(X = x_j), j = 1, ..., k (with k = ∞ in the countably
infinite case). This probability measure satisfies:
p_j ≥ 0, j = 1, 2, ..., k
∑_{j=1}^{k} p_j = 1
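
The two conditions can be checked mechanically for any finite list of probabilities. A minimal sketch follows; the helper name is_valid_pmf and the example lists are hypothetical.

```python
import math

def is_valid_pmf(probs, tol=1e-12):
    """Return True if every p_j >= 0 and the p_j sum to 1 (up to rounding error)."""
    non_negative = all(p >= 0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

coin = [0.5, 0.5]        # the fair coin from the example above
too_much = [0.7, 0.5]    # violates the second condition (sums to 1.2)
negative = [1.2, -0.2]   # violates the first condition (a negative entry)

print(is_valid_pmf(coin), is_valid_pmf(too_much), is_valid_pmf(negative))
```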

A continuous random variable is one that can take any value in a
continuum, such as any real number.
Let X be a continuous random variable. Its probability measure is
described by a density function f(x) that satisfies
f(x) ≥ 0 for all x ∈ 𝒳, where 𝒳 is the domain of X, usually
𝒳 = ℝ
∫_𝒳 f(x) dx = 1
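Although the density function plays the role of a probability for
each value of x, its interpretation is tricky: there are so many
values in the domain of X that each individual value has
probability zero. Probabilities are instead obtained by integrating
the density over an interval: P[a ≤ X ≤ b] = ∫_a^b f(x) dx.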
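
The point can be illustrated numerically. The sketch below assumes a standard normal density (the slides do not single out any particular density) and uses a simple trapezoid rule; the names normal_pdf and integrate are hypothetical.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution (an assumed example density)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(f, a, b, steps=20_000):
    """Trapezoid-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

total_mass = integrate(normal_pdf, -10.0, 10.0)  # close to 1: the density integrates to one
p_interval = integrate(normal_pdf, -1.0, 1.0)    # P[-1 <= X <= 1], roughly 0.6827
p_point = integrate(normal_pdf, 1.0, 1.0)        # zero-width interval: probability 0

print(total_mass, p_interval, p_point)
```

The density at x = 1 is positive, yet the probability of the single value x = 1 is zero; only intervals carry positive probability.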

Expectation of a RV
Random variables can be described by some of their features:
Expectation: E[X]
What value should we expect from X? If we have a large number of
draws from the random variable X, what would be their average?
For the coin example:
E[X] = 0 × P[X = 0] + 1 × P[X = 1] = 0 × 0.5 + 1 × 0.5 = 0.5.
For discrete RVs: E[X] = ∑_{j=1}^{k} x_j × P[X = x_j].
For continuous RVs: E[X] = ∫_𝒳 x f(x) dx, where the integral is
taken over the domain 𝒳 of X.
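
The discrete formula and the "average of many draws" reading can be compared side by side. This is a sketch under stated assumptions: the helper name expectation, the seed and the number of draws are all illustrative choices.

```python
import random

def expectation(values, probs):
    """E[X] for a discrete random variable: sum of x_j * P[X = x_j]."""
    return sum(x * p for x, p in zip(values, probs))

# Coin example: X = 0 (HEADS) or X = 1 (TAILS), each with probability 0.5.
coin_mean = expectation([0, 1], [0.5, 0.5])

# Average of a large number of simulated draws from the same coin.
rng = random.Random(1)
draws = [rng.choice([0, 1]) for _ in range(20_000)]
average = sum(draws) / len(draws)

print(coin_mean, average)
```

The simulated average should be close to, but generally not exactly equal to, the expectation.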

Property of expectation: Let A and B be two random variables,
and c and d two constants. Then, E[cA + dB] = cE[A] + dE[B].
Property of expectation: Let A and B be two independent
random variables. Then, E[A × B] = E[A] × E[B].
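
Both properties can be verified exactly for small discrete distributions. In the sketch below, the distributions of A and B and the constants c and d are hypothetical; independence is imposed by building the joint probabilities as products of the marginals. Note that the first property does not need independence, while the second does.

```python
from itertools import product

# Hypothetical discrete random variables, written as {value: probability}.
A = {0: 0.5, 1: 0.5}
B = {1: 0.2, 2: 0.3, 3: 0.5}
c, d = 2.0, -3.0

def mean(pmf):
    """Expectation of a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

# Independence: the joint probability of (a, b) is the product of the marginals.
joint = {(a, b): pa * pb for (a, pa), (b, pb) in product(A.items(), B.items())}

# E[cA + dB] computed from the joint distribution versus c E[A] + d E[B].
lhs_linear = sum((c * a + d * b) * p for (a, b), p in joint.items())
rhs_linear = c * mean(A) + d * mean(B)

# E[A x B] computed from the joint distribution versus E[A] x E[B].
lhs_product = sum(a * b * p for (a, b), p in joint.items())
rhs_product = mean(A) * mean(B)

print(lhs_linear, rhs_linear, lhs_product, rhs_product)
```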

An estimator of the expectation of a random variable X is the
sample average.
Given a random sample {x_i}_{i=1}^{n}, define
x̄ = n^{-1} ∑_{i=1}^{n} x_i, which is simply the average.
An estimator µ̂ is unbiased for a given parameter µ if E(µ̂) = µ.
In words, if we consider all possible random samples, on average
we will obtain the parameter we want to estimate.
In our case, we can prove that E(x̄) = E(X).
Proof:...
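
One possible way to write out the argument is sketched below. It assumes that each draw x_i is treated as a random variable with the same distribution as X (so E(x_i) = E(X) for every i) and uses only the linearity of expectation; it is a sketch, not necessarily the proof given in the lecture.

```latex
\begin{align*}
E(\bar{x})
  &= E\!\left( n^{-1} \sum_{i=1}^{n} x_i \right) \\
  &= n^{-1} \sum_{i=1}^{n} E(x_i)   && \text{(linearity of expectation)} \\
  &= n^{-1} \sum_{i=1}^{n} E(X)     && \text{(each } x_i \text{ is drawn from the distribution of } X\text{)} \\
  &= n^{-1} \cdot n \, E(X) = E(X).
\end{align*}
```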

Variance of a RV
However, for a given realisation x of X, we may have x ≠ E[X].
But how much does this random variable deviate from E[X]?
Variance: Var[X] ≡ E[(X − E[X])^2]
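
For a discrete random variable the definition can be evaluated directly. A minimal sketch for the fair coin follows; the helper names expectation and variance are hypothetical.

```python
def expectation(values, probs):
    """E[X] for a discrete random variable."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Var[X] = E[(X - E[X])^2], computed from the definition."""
    mu = expectation(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Fair coin: X = 0 or 1, each with probability 0.5, so E[X] = 0.5.
coin_var = variance([0, 1], [0.5, 0.5])
print(coin_var)  # 0.25
```

Each outcome lies 0.5 away from the expectation, so the average squared deviation is 0.25.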

Prove that Var[X] = E[X^2] − (E[X])^2.
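
One way to approach this exercise is sketched below. It writes µ = E[X] (a shorthand introduced here), expands the square, and applies linearity of expectation, using the fact that µ is a constant.

```latex
\begin{align*}
\operatorname{Var}[X]
  &= E\big[(X - \mu)^2\big] \\
  &= E\big[X^2 - 2\mu X + \mu^2\big] \\
  &= E[X^2] - 2\mu\,E[X] + \mu^2 \\
  &= E[X^2] - 2\mu^2 + \mu^2 \\
  &= E[X^2] - (E[X])^2 .
\end{align*}
```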

Property of variance: Var[aX] = a^2 × Var[X]
Property of variance:
Var[aX + bY] = a^2 × Var[X] + b^2 × Var[Y] + 2ab × Cov[X, Y],
where Cov[X, Y] = E[XY] − E[X]E[Y]
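
The identity Var[aX + bY] = a^2 Var[X] + b^2 Var[Y] + 2ab Cov[X, Y] can be checked exactly on a small example. The joint distribution and the constants a and b below are hypothetical, and the helper E is an illustrative name.

```python
# Hypothetical joint distribution of (X, Y), written as {(x, y): probability}.
# X and Y tend to move together, so their covariance is positive.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
a, b = 2.0, 3.0

def E(g):
    """Expectation of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

var_x = E(lambda x, y: x ** 2) - E(lambda x, y: x) ** 2
var_y = E(lambda x, y: y ** 2) - E(lambda x, y: y) ** 2
cov_xy = E(lambda x, y: x * y) - E(lambda x, y: x) * E(lambda x, y: y)

# Left-hand side: variance of Z = aX + bY computed directly from the definition.
mean_z = E(lambda x, y: a * x + b * y)
var_z = E(lambda x, y: (a * x + b * y - mean_z) ** 2)

# Right-hand side: the property, including the factor 2 on the covariance term.
rhs = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy

print(var_z, rhs)
```

Dropping the factor 2 on the covariance term would leave the two sides unequal whenever Cov[X, Y] is non-zero.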

Covariance
The covariance of two random variables Y and X measures how
much co-movement they have.
Covariance: Cov[Y, X] ≡ E[YX] − E[Y]E[X]
Property of covariance: Let A and B be two independent random
variables. Then, Cov[A, B] = 0.
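
A small sketch contrasting two hypothetical joint distributions: one in which the variables move together and one in which they are independent (every joint probability is the product of the marginals). The function name cov is illustrative.

```python
def cov(joint):
    """Cov[A, B] = E[AB] - E[A]E[B] for a joint pmf written as {(a, b): probability}."""
    e_ab = sum(a * b * p for (a, b), p in joint.items())
    e_a = sum(a * p for (a, b), p in joint.items())
    e_b = sum(b * p for (a, b), p in joint.items())
    return e_ab - e_a * e_b

# Hypothetical case 1: A and B usually take the same value (co-movement).
comoving = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Hypothetical case 2: independent, each cell equals 0.5 * 0.5.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

print(cov(comoving), cov(independent))
```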

In the simple regression model...
In the simple regression model, Y, X and U are random
variables. β0 and β1 are population parameters, i.e. constants
that describe the relation between Y and X. Then,
E[Y] = E[β0 + β1 X + U] = β0 + β1 E[X] + E[U]
(Since U captures other factors, we will assume that E[U] = 0.)
However, our main interest is in the conditional expectation that
defines the population regression model:
E[Y|X] = E[β0 + β1 X + U|X] = β0 + β1 X + E[U|X] = β0 + β1 X
The last equality uses the following assumption: U and X are
independent, so that E[U|X] = E[U] = 0.
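
The two expectations can be illustrated by simulation. Everything specific in this sketch is an assumption made for illustration: the parameter values, the uniform distribution for X, the normal distribution for U, the seed, and the window around x = 3 used to approximate conditioning on X.

```python
import random

rng = random.Random(42)
beta0, beta1 = 1.0, 2.0  # hypothetical population parameters
n = 50_000

x = [rng.uniform(0, 10) for _ in range(n)]  # X ~ Uniform(0, 10), so E[X] = 5
u = [rng.gauss(0, 1) for _ in range(n)]     # U drawn independently of X, with E[U] = 0
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

# Unconditional mean: should be near beta0 + beta1 * E[X] = 1 + 2 * 5 = 11.
mean_y = sum(y) / n

# Approximate conditional mean: average y among observations with x close to 3.
# Should be near beta0 + beta1 * 3 = 7.
near_3 = [yi for xi, yi in zip(x, y) if 2.9 <= xi <= 3.1]
cond_mean = sum(near_3) / len(near_3)

print(mean_y, cond_mean)
```

Because U is generated independently of X, its average is close to zero within any slice of x values, which is why the conditional mean lines up with β0 + β1 x.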

Parameters vs Estimators
Note:
β0 and β1 are population parameters to be estimated.
β̂0 and β̂1 will be their estimators.
The parameters are just numbers: they are fixed. The estimators,
however, are random variables, because their values depend on the
random sample that happens to be drawn.
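
The distinction can be illustrated with the sample average as the estimator, since it is the one estimator defined so far. The sketch below is an assumption-laden illustration: the parameter is the expectation of the fair coin, and the seed, sample size and number of replications are arbitrary.

```python
import random

rng = random.Random(7)
mu = 0.5                     # population parameter (E[X] for the fair coin): a fixed number
n, replications = 50, 1000   # arbitrary sample size and number of repeated samples

def sample_mean():
    """Draw one random sample of size n from the coin and return its average."""
    draws = [rng.randint(0, 1) for _ in range(n)]
    return sum(draws) / n

# The estimator takes a different value in each random sample: it is a random variable.
estimates = [sample_mean() for _ in range(replications)]

# Averaged over many samples, the estimates are centred on the parameter.
average_estimate = sum(estimates) / replications

print(min(estimates), max(estimates), average_estimate)
```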
