Nonlinear Estimation
This week we'll start our exploration of nonlinear estimation with dichotomous Y variables. These arise in many social science problems:
- Legislator votes: Aye/Nay
- Regime type: Autocratic/Democratic
- Involved in an armed conflict: Yes/No
Link Functions
Y = Xβ + ε

Y = 0: Non-minority elected
Y = 1: Minority elected
The data look like this; the only values Y can take are 0 and 1.

[Figure: scatterplot of Y (0 or 1) against Black Voting Age Population]
And here's a linear fit of the data. Note that the line goes below 0 and above 1 — the line doesn't fit the data very well.

[Figure: linear fit (fitted values) of Y against Black Voting Age Population]
And if we take values of Y between 0 and 1 to be probabilities, this doesn't make sense: the linear prediction Xβ ranges over (−∞, ∞), but a probability must lie in [0, 1].

Y as a Probability

We need a function that maps the linear predictor from (−∞, ∞) into [0, 1].
Probit Estimation

One such function is the CDF of the standard normal distribution: set Prob(Y = 1) = Φ(Xβ). After estimation, you can back out probabilities using the standard normal distribution.

[Figure: standard normal density; the area below Xβ gives Prob(Y = 1)]
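As a quick illustration, backing out probabilities from the standard normal CDF can be sketched in Python with only the standard library. The coefficient values below are made up for illustration, not estimates from the lecture's data:

```python
import math

def phi(z):
    """Standard normal CDF: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical estimates: beta0 (intercept) and beta1 (slope) are
# made-up illustration values, not results from the lecture's data.
beta0, beta1 = -2.0, 5.0

for x in (0.1, 0.4, 0.7):
    xb = beta0 + beta1 * x          # linear predictor
    print(f"x = {x:.1f}  Xb = {xb:+.2f}  Pr(Y=1) = {phi(xb):.3f}")
```

Whatever the value of Xβ, Φ(Xβ) always lands in [0, 1].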
[Figure: probit fitted curve of Prob(Y = 1) against Black Voting Age Population]

This fits the data much better than the linear estimation, and the fitted curve always lies between 0 and 1.
Y as a Probability

For a probability p, define the odds O(p) = p/(1 − p):

p = 0:   O(p) = 0
p = 1/4: O(p) = 1/3 (odds are 1-to-3 against)
p = 1/2: O(p) = 1 (even odds)
p = 3/4: O(p) = 3 (odds are 3-to-1 in favor)
p = 1:   O(p) = ∞

So Y as a probability lives in [0, 1], the odds of Y live in [0, ∞), and the log-odds of Y live in (−∞, ∞).

Logit Function

logit(y) = log[O(y)] = log[y/(1 − y)]
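A minimal sketch of the odds and log-odds calculations, using the probability values from the table above:

```python
import math

def odds(p):
    """Odds of an event with probability p: O(p) = p / (1 - p)."""
    return p / (1.0 - p)

def log_odds(p):
    """Logit: log of the odds, mapping (0, 1) onto (-inf, +inf)."""
    return math.log(odds(p))

# Reproduce the table above: p -> odds -> log-odds
for p in (0.25, 0.5, 0.75):
    print(f"p = {p:.2f}  O(p) = {odds(p):.3f}  log-odds = {log_odds(p):+.3f}")
```

Note how the log-odds are symmetric around p = 1/2, where they equal 0.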
[Figure: logistic (Logit) density overlaid on the normal density]

The logistic distribution is similar in shape to the normal, but has fatter tails.
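The tail comparison is easy to check numerically. The sketch below compares upper-tail probabilities of a logistic distribution rescaled to unit variance (the standard logistic has variance π²/3, so the scale is √3/π) against the standard normal:

```python
import math

def normal_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def logistic_tail(x):
    """P(X > x) for a logistic r.v. rescaled to unit variance.
    The standard logistic has variance pi^2/3, so scale = sqrt(3)/pi."""
    s = math.sqrt(3.0) / math.pi
    return 1.0 / (1.0 + math.exp(x / s))

for x in (2.0, 3.0, 4.0):
    print(f"x = {x}: normal tail = {normal_tail(x):.2e}, "
          f"logistic tail = {logistic_tail(x):.2e}")
```

Even after matching variances, the logistic tail decays exponentially while the normal tail decays like exp(−x²/2), so the logistic puts more mass far from zero.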
Logit Function

Setting the log-odds equal to the linear predictor, log[Y/(1 − Y)] = Xβ, and solving for Y:

Y/(1 − Y) = e^(Xβ)
(1 + e^(Xβ)) Y = e^(Xβ)
Y = e^(Xβ) / (1 + e^(Xβ))
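The algebra above can be sketched directly; `inverse_logit` and `logit` below are hypothetical helper names, not notation from the lecture:

```python
import math

def inverse_logit(xb):
    """Y = e^(Xb) / (1 + e^(Xb)): maps (-inf, inf) into (0, 1)."""
    return math.exp(xb) / (1.0 + math.exp(xb))

def logit(y):
    """log[y / (1 - y)]: the inverse mapping, back to the linear scale."""
    return math.log(y / (1.0 - y))

print(inverse_logit(0.0))          # 0.5: even odds when Xb = 0
print(logit(inverse_logit(2.0)))   # recovers 2.0 (up to rounding)
```

The two functions are inverses, which is exactly what "solving the logit for Y" means.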
Latent Variables

For the rest of the lecture we'll talk in terms of probits, but everything holds for logits too.

One way to state what's going on is to assume that there is a latent variable Y* such that

Y*_i = X_i β + ε_i,  ε ~ N(0, σ²)   (normal errors ⇒ probit)

and what we observe is

y_i = 0 if y*_i ≤ 0
y_i = 1 if y*_i > 0
Latent Variables
Similarly,
x i
= Pr i >
x i
=
x i
Pr ( yi = 0 | x i ) = 1
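A quick Monte Carlo check of this result, with a made-up coefficient β and covariate value x: simulating the latent-variable model should reproduce Φ(xβ):

```python
import math
import random

random.seed(42)

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

beta = 1.0   # made-up coefficient, for illustration only
x = 0.5      # a fixed covariate value
n = 100_000

# y* = x*beta + eps with eps ~ N(0, 1); we observe y = 1 exactly when y* > 0.
ones = sum(1 for _ in range(n) if x * beta + random.gauss(0.0, 1.0) > 0)

print(f"simulated Pr(y=1|x): {ones / n:.4f}")
print(f"Phi(x*beta):         {phi(x * beta):.4f}")  # theory: 0.6915
```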
We can't estimate both β and σ, since they enter the equation only as the ratio β/σ. So we set σ = 1, making the distribution on ε a standard normal density.

One (big) question left: how do we actually estimate the values of the β coefficients?
Assuming the observations are independent, the likelihood of the whole sample is the product of the likelihoods of the individual observations:

L(y₁) · L(y₂) · L(y₃) ⋯ L(yₙ) = ∏ᵢ₌₁ⁿ L(yᵢ)
[Figure: likelihood contributions under the probit model y* = xβ + ε, where Φ maps (−∞, ∞) into [0, 1]. An observation with y_i = 1 contributes Φ(x_i β) to the likelihood; an observation with y_i = 0 contributes 1 − Φ(x_i β).]

Impact of changing β: since Pr(y₁ = 1) = Pr(y*₁ ≥ 0) = Pr(ε₁ ≥ −x₁β), changing β changes how much of the error distribution falls above the threshold, and hence the probability that y₁ = 1.
[Figure: binary time-series data — outcomes of 0 or 1 at t = 1, 2, 3 for Countries 1 through 4]
L(y₁) · L(y₂) · L(y₃) ⋯ L(yₙ) = ∏ᵢ₌₁ⁿ L(yᵢ) ≡ L

Taking logs turns the product into a sum:

log(L) = log[∏ᵢ₌₁ⁿ L(yᵢ)] = ∑ᵢ₌₁ⁿ log[L(yᵢ)]
Example: flipping a coin n times, with Pr(heads) = π. Intuitively, the likelihood of a single observation is

L(yᵢ) = π if yᵢ = 1, and 1 − π if yᵢ = 0

which we can write compactly as L(yᵢ) = π^(yᵢ) (1 − π)^(1 − yᵢ). So

L = ∏ᵢ₌₁ⁿ π^(yᵢ) (1 − π)^(1 − yᵢ)

log L = ∑ᵢ₌₁ⁿ [ yᵢ log(π) + (1 − yᵢ) log(1 − π) ]

Setting the derivative with respect to π equal to zero:

d log L / dπ = ∑ᵢ₌₁ⁿ [ yᵢ/π − (1 − yᵢ)/(1 − π) ] = 0

Multiplying through by π(1 − π):

∑ᵢ₌₁ⁿ [ yᵢ(1 − π) − (1 − yᵢ)π ] = 0
∑ᵢ₌₁ⁿ yᵢ − nπ = 0
π̂ = (1/n) ∑ᵢ₌₁ⁿ yᵢ

Since yᵢ is 1 for heads and 0 for tails, π̂ = (number of heads)/n is also the sample mean.
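The derivation above can be verified numerically: on a small made-up sample of coin flips, a grid search over π finds the maximum of the log-likelihood exactly at the sample mean:

```python
import math

# A small made-up sample of coin flips (1 = heads, 0 = tails).
y = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
n = len(y)

def log_lik(pi):
    """Bernoulli log-likelihood: sum of y*log(pi) + (1-y)*log(1-pi)."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi in y)

# Grid-search the maximizer and compare with the analytic answer pi-hat = mean(y).
grid = [k / 1000 for k in range(1, 1000)]
pi_hat = max(grid, key=log_lik)

print(f"grid maximizer = {pi_hat:.3f}")
print(f"sample mean    = {sum(y) / n:.3f}")
```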
The Stata probit output:

Iteration 0:  log likelihood = -735.15352
Iteration 1:  log likelihood = -292.89815
Iteration 2:  log likelihood = -221.90782
Iteration 3:  log likelihood = -202.46671
Iteration 4:  log likelihood = -198.94506
Iteration 5:  log likelihood = -198.78048
Iteration 6:  log likelihood = -198.78004

Probit estimates                              Number of obs =      1507
                                              LR chi2(1)    =   1072.75
                                              Prob > chi2   =    0.0000
                                              Pseudo R2     =    0.7296
------------------------------------------------------------------------------
       black |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        bvap |   0.092316   .5446756    16.95   0.000     0.081641    0.102992
       _cons |  -0.047147   0.027917   -16.89   0.000    -0.052619   -0.041676
------------------------------------------------------------------------------
Note the iteration log at the top: that is the maximization of the log-likelihood function in action.
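A rough sketch of what the estimation routine is doing: simulate data from the latent-variable model (with a made-up true slope, not the lecture's data) and maximize the probit log-likelihood, printing an iteration-style log. Successive grid refinement stands in for the Newton-type steps Stata actually uses:

```python
import math
import random

random.seed(1)

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Simulate data from the latent-variable model with a made-up true slope.
true_beta, n = 1.5, 2000
data = []
for _ in range(n):
    x = random.uniform(-2.0, 2.0)
    y = 1 if true_beta * x + random.gauss(0.0, 1.0) > 0 else 0
    data.append((x, y))

def log_lik(b):
    """Probit log-likelihood: y=1 contributes log Phi(xb), y=0 log(1 - Phi(xb))."""
    ll = 0.0
    for x, y in data:
        p = min(max(phi(b * x), 1e-10), 1.0 - 1e-10)  # guard against log(0)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

# Maximize by successive grid refinement (a stand-in for Newton-type steps).
lo, hi = 0.0, 5.0
for step in range(5):
    grid = [lo + k * (hi - lo) / 100 for k in range(101)]
    b_hat = max(grid, key=log_lik)
    print(f"Iteration {step}: log likelihood = {log_lik(b_hat):.5f} (b = {b_hat:.4f})")
    width = (hi - lo) / 100
    lo, hi = b_hat - width, b_hat + width
```

Each pass keeps the best point so far and zooms in around it, so the printed log-likelihood never decreases — the same qualitative pattern as Stata's iteration log.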
The coefficients are significant.

[Figure: probit fitted curve over the Y vs. Black Voting Age Population scatterplot]
Significant coefficients are not the same thing as big effects, so convert to probabilities. From the standard normal table:

Φ(1.52) = .936
Φ(1.52 − .339) = .881

So there's an increase of .936 − .881 = 5.5% in votes in favor of incumbents who avoid a quality challenger.
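The arithmetic can be checked against the standard normal CDF (the values 1.52 and .339 are the ones used in the lecture's example):

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probability-scale effect of avoiding a quality challenger,
# using the lecture's numbers (index 1.52, shift of .339).
p_no_challenger = phi(1.52)
p_challenger = phi(1.52 - 0.339)

print(f"Phi(1.52)        = {p_no_challenger:.3f}")   # about .936
print(f"Phi(1.52 - .339) = {p_challenger:.3f}")      # about .881
print(f"difference       = {p_no_challenger - p_challenger:.3f}")
```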