
Nonlinear Regression Functions (SW Ch. 6)

- Everything so far has been linear in the X's.
- The approximation that the regression function is linear might be good for some variables, but not for others.
- The multiple regression framework can be extended to handle regression functions that are nonlinear in one or more X.

The TestScore – STR relation looks approximately linear...

But the TestScore – average district income relation looks like it is nonlinear.

If a relation between Y and X is nonlinear:

- The effect on Y of a change in X depends on the value of X; that is, the marginal effect of X is not constant.
- A linear regression is mis-specified: the functional form is wrong.
- The estimator of the effect on Y of X is biased; it needn't even be right on average.
- The solution to this is to estimate a regression function that is nonlinear in X.

The General Nonlinear Population Regression Function

\[ Y_i = f(X_{1i}, X_{2i}, \ldots, X_{ki}) + u_i, \quad i = 1, \ldots, n \]

Assumptions
1. \(E(u_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = 0\) (same); implies that f is the conditional expectation of Y given the X's.
2. \((X_{1i}, \ldots, X_{ki}, Y_i)\) are i.i.d. (same).
3. "Enough" moments exist (same idea; the precise statement depends on the specific f).
4. No perfect multicollinearity (same idea; the precise statement depends on the specific f).

Nonlinear Functions of a Single Independent Variable (SW Section 6.2)

We'll look at two complementary approaches:

1. Polynomials in X
   The population regression function is approximated by a quadratic, cubic, or higher-degree polynomial.
2. Logarithmic transformations
   Y and/or X is transformed by taking its logarithm; this gives a "percentages" interpretation that makes sense in many applications.

1. Polynomials in X

Approximate the population regression function by a polynomial:

\[ Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_r X_i^r + u_i \]

- This is just the linear multiple regression model, except that the regressors are powers of X!
- Estimation, hypothesis testing, etc. proceed as in the multiple regression model using OLS.
- The coefficients are difficult to interpret, but the regression function itself is interpretable.

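Example: the TestScore – Income relation

\(\mathit{Income}_i\) = average district income in the i-th district (thousands of dollars per capita)

Quadratic specification:
\[ \mathit{TestScore}_i = \beta_0 + \beta_1 \mathit{Income}_i + \beta_2 (\mathit{Income}_i)^2 + u_i \]

Cubic specification:
\[ \mathit{TestScore}_i = \beta_0 + \beta_1 \mathit{Income}_i + \beta_2 (\mathit{Income}_i)^2 + \beta_3 (\mathit{Income}_i)^3 + u_i \]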

Estimation of the quadratic specification in STATA

    generate avginc2 = avginc*avginc;        Create a new regressor
    reg testscr avginc avginc2, r;

    Regression with robust standard errors        Number of obs =     420
                                                  F(  2,   417) =  428.52
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.5562
                                                  Root MSE      =  12.724

    ------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
      avginc |   3.850995   .2680941    14.36   0.000      3.32401    4.377979
     avginc2 |  -.0423085   .0047803    -8.85   0.000     -.051705   -.0329119
       _cons |   607.3017   2.901754   209.29   0.000     601.5978    613.0056
    ------------------------------------------------------------------------------

The t-statistic on \(\mathit{Income}^2\) is -8.85, so the hypothesis of linearity is rejected against the quadratic alternative at the 1% significance level.

Interpreting the estimated regression function:

(a) Plot the predicted values

\[ \widehat{\mathit{TestScore}} = \underset{(2.9)}{607.3} + \underset{(0.27)}{3.85}\,\mathit{Income}_i - \underset{(0.0048)}{0.0423}\,(\mathit{Income}_i)^2 \]

Interpreting the estimated regression function:

(b) Compute "effects" for different values of X

\[ \widehat{\mathit{TestScore}} = \underset{(2.9)}{607.3} + \underset{(0.27)}{3.85}\,\mathit{Income}_i - \underset{(0.0048)}{0.0423}\,(\mathit{Income}_i)^2 \]

Predicted change in TestScore for a change in income to $6,000 from $5,000 per capita:

\[ \Delta\widehat{\mathit{TestScore}} = 607.3 + 3.85 \times 6 - 0.0423 \times 6^2 - (607.3 + 3.85 \times 5 - 0.0423 \times 5^2) = 3.4 \]

Predicted "effects" for different values of X:

    Change in Income (thousands of $ per capita)    Change in predicted TestScore
    from 5 to 6                                      3.4
    from 25 to 26                                    1.7
    from 45 to 46                                    0.0

- The "effect" of a change in income is greater at low than at high income levels (perhaps a declining marginal benefit of an increase in school budgets?).
- Caution! What about a change from 65 to 66? Don't extrapolate outside the range of the data.
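The entries in the table of predicted effects can be checked directly from the estimated quadratic, 607.3 + 3.85 Income - 0.0423 Income^2 (income in thousands of dollars per capita). The following is a minimal Python sketch of that arithmetic; the function names `predicted_score` and `effect` are illustrative and do not come from the slides.

```python
def predicted_score(income):
    """Predicted TestScore from the estimated quadratic (income in $1000s per capita)."""
    return 607.3 + 3.85 * income - 0.0423 * income ** 2


def effect(income_from, income_to):
    """Predicted change in TestScore when income moves from income_from to income_to."""
    return predicted_score(income_to) - predicted_score(income_from)


if __name__ == "__main__":
    # The three one-unit changes considered on the slide.
    for start in (5, 25, 45):
        print(f"from {start} to {start + 1}: {effect(start, start + 1):.1f}")
```

Because the function is quadratic, the same one-unit change in income gives a different predicted change depending on the starting level, which is exactly why a single slope coefficient cannot summarize the effect.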

Estimation of the cubic specification in STATA

    gen avginc3 = avginc*avginc2;            Create the cubic regressor
    reg testscr avginc avginc2 avginc3, r;

    Regression with robust standard errors        Number of obs =     420
                                                  F(  3,   416) =  270.18
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.5584
                                                  Root MSE      =  12.707

    ------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
      avginc |   5.018677   .7073505     7.10   0.000     3.628251    6.409104
     avginc2 |  -.0958052   .0289537    -3.31   0.001    -.1527191   -.0388913
     avginc3 |   .0006855   .0003471     1.98   0.049     3.27e-06    .0013677
       _cons |    600.079   5.102062   117.61   0.000     590.0499     610.108
    ------------------------------------------------------------------------------

The cubic term is statistically significant at the 5%, but not the 1%, level.

Testing the null hypothesis of linearity, against the alternative that the population regression is quadratic and/or cubic, that is, it is a polynomial of degree up to 3:

\(H_0\): population coefficients on \(\mathit{Income}^2\) and \(\mathit{Income}^3\) = 0
\(H_1\): at least one of these coefficients is nonzero.

    test avginc2 avginc3;        Execute the test command after running the regression

     ( 1)  avginc2 = 0.0
     ( 2)  avginc3 = 0.0

           F(  2,   416) =   37.69
                Prob > F =    0.0000

The hypothesis that the population regression is linear is rejected at the 1% significance level against the alternative that it is a polynomial of degree up to 3.

Summary: polynomial regression functions

\[ Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \cdots + \beta_r X_i^r + u_i \]

- Estimation: by OLS after defining new regressors
- Coefficients have complicated interpretations
- To interpret the estimated regression function:
  o plot predicted values as a function of x
  o compute predicted \(\Delta Y / \Delta X\) at different values of x
- Hypotheses concerning degree r can be tested by t- and F-tests on the appropriate (blocks of) variable(s).
- Choice of degree r:
  o plot the data; t- and F-tests, check sensitivity of estimated effects; judgment.
  o Or use model selection criteria (maybe later)

2. Logarithmic functions of Y and/or X

- ln(X) = the natural logarithm of X
- Logarithmic transforms permit modeling relations in "percentage" terms (like elasticities), rather than linearly.

Here's why:

\[ \ln(x + \Delta x) - \ln(x) = \ln\left(1 + \frac{\Delta x}{x}\right) \approx \frac{\Delta x}{x} \qquad \left(\text{calculus: } \frac{d \ln(x)}{dx} = \frac{1}{x}\right) \]

Numerically:
ln(1.01) = .00995 ≈ .01;  ln(1.10) = .0953 ≈ .10 (sort of)
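The approximation ln(1 + Δx/x) ≈ Δx/x is easy to check numerically. The sketch below reproduces the two values quoted above (a 1% and a 10% change) and adds a 50% change, which is an illustrative value not taken from the slides, to show how the approximation degrades for large changes. The helper name `log_change` is hypothetical.

```python
import math


def log_change(pct):
    """Exact change in ln(x) when x rises by pct percent."""
    return math.log(1 + pct / 100)


if __name__ == "__main__":
    # 1% and 10% are the cases on the slide; 50% is an added illustration.
    for pct in (1, 10, 50):
        exact = log_change(pct)
        approx = pct / 100
        print(f"{pct:>3}% increase: change in ln(x) = {exact:.5f}, approximation = {approx:.2f}")
```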

Three cases:

    Case               Population regression function
    I.   linear-log    \(Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i\)
    II.  log-linear    \(\ln(Y_i) = \beta_0 + \beta_1 X_i + u_i\)
    III. log-log       \(\ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + u_i\)

- The interpretation of the slope coefficient differs in each case.
- The interpretation is found by applying the general "before and after" rule: "figure out the change in Y for a given change in X."

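I. Linear-log population regression function

\[ Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i \qquad \text{(b)} \]

Now change X:
\[ Y + \Delta Y = \beta_0 + \beta_1 \ln(X + \Delta X) \qquad \text{(a)} \]

Subtract (a) - (b):
\[ \Delta Y = \beta_1 [\ln(X + \Delta X) - \ln(X)] \]

now \(\ln(X + \Delta X) - \ln(X) \approx \dfrac{\Delta X}{X}\),

so \(\Delta Y \approx \beta_1 \dfrac{\Delta X}{X}\)

or \(\beta_1 \approx \dfrac{\Delta Y}{\Delta X / X}\) (small \(\Delta X\))

Linear-log case, continued

\[ Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i \]

for small \(\Delta X\), \(\beta_1 \approx \dfrac{\Delta Y}{\Delta X / X}\)

Now \(100 \times \dfrac{\Delta X}{X}\) = percentage change in X, so a 1% increase in X (multiplying X by 1.01) is associated with a \(.01\beta_1\) change in Y.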

Example: TestScore vs. ln(Income)

- First define the new regressor, ln(Income).
- The model is now linear in ln(Income), so the linear-log model can be estimated by OLS:

\[ \widehat{\mathit{TestScore}} = \underset{(3.8)}{557.8} + \underset{(1.40)}{36.42}\,\ln(\mathit{Income}_i) \]

  so a 1% increase in Income is associated with an increase in TestScore of 0.36 points on the test.
- Standard errors, confidence intervals, \(R^2\): all the usual tools of regression apply here.
- How does this compare to the cubic model?
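The 0.36-point figure for the linear-log example comes from the rule that a 1% increase in X changes Y by about .01β1, here with the estimated slope 36.42. A short Python sketch comparing that rule with the exact change implied by the fitted function; the names `BETA1`, `exact_change` and `approx_change` are illustrative only.

```python
import math

BETA1 = 36.42  # estimated coefficient on ln(Income) in the linear-log regression


def exact_change(pct):
    """Exact change in predicted TestScore when Income rises by pct percent."""
    return BETA1 * math.log(1 + pct / 100)


def approx_change(pct):
    """Rule-of-thumb change: 0.01 * beta1 for each 1% increase in Income."""
    return 0.01 * BETA1 * pct


if __name__ == "__main__":
    print(f"exact:  {exact_change(1):.4f}")
    print(f"approx: {approx_change(1):.4f}")
```

For a 1% change the two agree to two decimal places; the gap widens for larger percentage changes because the log approximation only holds for small changes.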

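II. Log-linear population regression function

\[ \ln(Y_i) = \beta_0 + \beta_1 X_i + u_i \qquad \text{(b)} \]

Now change X:
\[ \ln(Y + \Delta Y) = \beta_0 + \beta_1 (X + \Delta X) \qquad \text{(a)} \]

Subtract (a) - (b):
\[ \ln(Y + \Delta Y) - \ln(Y) = \beta_1 \Delta X \]

so \(\dfrac{\Delta Y}{Y} \approx \beta_1 \Delta X\)

or \(\beta_1 \approx \dfrac{\Delta Y / Y}{\Delta X}\) (small \(\Delta X\))

Log-linear case, continued

\[ \ln(Y_i) = \beta_0 + \beta_1 X_i + u_i \]

for small \(\Delta X\), \(\beta_1 \approx \dfrac{\Delta Y / Y}{\Delta X}\)

Now \(100 \times \dfrac{\Delta Y}{Y}\) = percentage change in Y, so a change in X by one unit (\(\Delta X = 1\)) is associated with a \(100\beta_1\)% change in Y (Y increases by a factor of \(1 + \beta_1\)).

Note: What are the units of \(u_i\) and the SER?
  o fractional (proportional) deviations
  o for example, SER = .2 means...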
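The "100β1 percent" reading of the log-linear slope is an approximation: for a one-unit change in X the exact proportional change in Y is exp(β1) - 1. The sketch below compares the two for a few hypothetical values of β1 (these values are assumptions chosen for illustration, not estimates from the slides); the function names are likewise illustrative.

```python
import math


def approx_pct_change(beta1):
    """Approximate percentage change in Y for a one-unit change in X."""
    return 100 * beta1


def exact_pct_change(beta1):
    """Exact percentage change in Y for a one-unit change in X in a log-linear model."""
    return 100 * (math.exp(beta1) - 1)


if __name__ == "__main__":
    # Hypothetical slope values, small to moderately large.
    for beta1 in (0.01, 0.05, 0.20):
        print(f"beta1 = {beta1:.2f}: approx {approx_pct_change(beta1):.2f}%, "
              f"exact {exact_pct_change(beta1):.2f}%")
```

The two readings are nearly identical when β1 is small and drift apart as β1 grows, which is the sense in which the interpretation holds "for small ΔX".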

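III. Log-log population regression function

\[ \ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + u_i \qquad \text{(b)} \]

Now change X:
\[ \ln(Y + \Delta Y) = \beta_0 + \beta_1 \ln(X + \Delta X) \qquad \text{(a)} \]

Subtract:
\[ \ln(Y + \Delta Y) - \ln(Y) = \beta_1 [\ln(X + \Delta X) - \ln(X)] \]

so \(\dfrac{\Delta Y}{Y} \approx \beta_1 \dfrac{\Delta X}{X}\)

or \(\beta_1 \approx \dfrac{\Delta Y / Y}{\Delta X / X}\) (small \(\Delta X\))

Log-log case, continued

\[ \ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + u_i \]

for small \(\Delta X\), \(\beta_1 \approx \dfrac{\Delta Y / Y}{\Delta X / X}\)

Now \(100 \times \dfrac{\Delta Y}{Y}\) = percentage change in Y, and \(100 \times \dfrac{\Delta X}{X}\) = percentage change in X, so a 1% change in X is associated with a \(\beta_1\)% change in Y.

In the log-log specification, \(\beta_1\) has the interpretation of an elasticity.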

Example: ln(TestScore) vs. ln(Income)

- First define a new dependent variable, ln(TestScore), and the new regressor, ln(Income).
- The model is now a linear regression of ln(TestScore) against ln(Income), which can be estimated by OLS:

\[ \widehat{\ln(\mathit{TestScore})} = \underset{(0.006)}{6.336} + \underset{(0.0021)}{0.0554}\,\ln(\mathit{Income}_i) \]

  A 1% increase in Income is associated with an increase of .0554% in TestScore (TestScore is multiplied by about 1.00055).
- How does this compare to the log-linear model?

Neither specification seems to fit as well as the cubic or linear-log.
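In the log-log example the estimated elasticity is 0.0554, so a 1% rise in income multiplies predicted TestScore by 1.01 raised to the power 0.0554. A minimal Python sketch of that calculation; `ELASTICITY` and `score_factor` are illustrative names, and the 10% case is an added illustration rather than a figure from the slides.

```python
ELASTICITY = 0.0554  # estimated coefficient on ln(Income) in the log-log regression


def score_factor(pct_income):
    """Factor by which predicted TestScore is multiplied when Income rises by pct_income percent."""
    return (1 + pct_income / 100) ** ELASTICITY


if __name__ == "__main__":
    for pct in (1, 10):  # 10% is an added illustration
        factor = score_factor(pct)
        print(f"{pct}% more income: factor {factor:.5f}, i.e. {100 * (factor - 1):.3f}% higher score")
```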

Summary: Logarithmic transformations

- Three cases, differing in whether Y and/or X is transformed by taking logarithms.
- After creating the new variable(s) ln(Y) and/or ln(X), the regression is linear in the new variables and the coefficients can be estimated by OLS.
- Hypothesis tests and confidence intervals are now standard.
- The interpretation of \(\beta_1\) differs from case to case.
- Choice of specification should be guided by judgment (which interpretation makes the most sense in your application?), tests, and plotting predicted values.

Interactions Between Independent Variables (SW Section 6.3)

- Perhaps a class size reduction is more effective in some circumstances than in others...
- Perhaps smaller classes help more if there are many English learners, who need individual attention.
- That is, \(\dfrac{\Delta \mathit{TestScore}}{\Delta \mathit{STR}}\) might depend on PctEL.
- More generally, \(\dfrac{\Delta Y}{\Delta X_1}\) might depend on \(X_2\).
- How to model such "interactions" between \(X_1\) and \(X_2\)?
- We first consider binary X's, then continuous X's.

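(a) Interactions between two binary variables

\[ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + u_i \]

- \(D_{1i}\), \(D_{2i}\) are binary.
- \(\beta_1\) is the effect of changing \(D_1 = 0\) to \(D_1 = 1\). In this specification, this effect doesn't depend on the value of \(D_2\).
- To allow the effect of changing \(D_1\) to depend on \(D_2\), include the "interaction term" \(D_{1i} \times D_{2i}\) as a regressor:

\[ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i \]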

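Interpreting the coefficients

\[ Y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i \]

General rule: compare the various cases

\[ E(Y_i \mid D_{1i} = 0, D_{2i} = d_2) = \beta_0 + \beta_2 d_2 \qquad \text{(b)} \]
\[ E(Y_i \mid D_{1i} = 1, D_{2i} = d_2) = \beta_0 + \beta_1 + \beta_2 d_2 + \beta_3 d_2 \qquad \text{(a)} \]

subtract (a) - (b):

\[ E(Y_i \mid D_{1i} = 1, D_{2i} = d_2) - E(Y_i \mid D_{1i} = 0, D_{2i} = d_2) = \beta_1 + \beta_3 d_2 \]

- The effect of \(D_1\) depends on \(d_2\) (what we wanted).
- \(\beta_3\) = increment to the effect of \(D_1\), when \(D_2 = 1\).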

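Example: TestScore, STR, English learners

Let
\[ \mathit{HiSTR} = \begin{cases} 1 & \text{if } \mathit{STR} \ge 20 \\ 0 & \text{if } \mathit{STR} < 20 \end{cases} \qquad \text{and} \qquad \mathit{HiEL} = \begin{cases} 1 & \text{if } \mathit{PctEL} \ge 10 \\ 0 & \text{if } \mathit{PctEL} < 10 \end{cases} \]

\[ \widehat{\mathit{TestScore}} = \underset{(1.4)}{664.1} - \underset{(2.3)}{18.2}\,\mathit{HiEL} - \underset{(1.9)}{1.9}\,\mathit{HiSTR} - \underset{(3.1)}{3.5}\,(\mathit{HiSTR} \times \mathit{HiEL}) \]

- "Effect" of HiSTR when HiEL = 0 is -1.9
- "Effect" of HiSTR when HiEL = 1 is -1.9 - 3.5 = -5.4
- Class size reduction is estimated to have a bigger effect when the percent of English learners is large.
- This interaction isn't statistically significant: t = -3.5/3.1 = -1.13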
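With two binary regressors and their interaction, the fitted regression simply gives one predicted mean for each of the four HiSTR/HiEL groups. The sketch below tabulates those four predictions and the implied "effect" of HiSTR in each HiEL group, using the estimated coefficients 664.1, -18.2, -1.9 and -3.5; the constant and function names are illustrative.

```python
B0, B_HIEL, B_HISTR, B_INTER = 664.1, -18.2, -1.9, -3.5  # estimated coefficients


def predicted(hi_str, hi_el):
    """Predicted TestScore for a district with the given binary indicators (0 or 1)."""
    return B0 + B_HIEL * hi_el + B_HISTR * hi_str + B_INTER * hi_str * hi_el


def histr_effect(hi_el):
    """Predicted difference between HiSTR = 1 and HiSTR = 0 districts, holding HiEL fixed."""
    return predicted(1, hi_el) - predicted(0, hi_el)


if __name__ == "__main__":
    for hi_el in (0, 1):
        for hi_str in (0, 1):
            print(f"HiEL={hi_el}, HiSTR={hi_str}: {predicted(hi_str, hi_el):.1f}")
        print(f"  effect of HiSTR when HiEL={hi_el}: {histr_effect(hi_el):.1f}")
```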

(b) Interactions between continuous and binary variables

\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + u_i \]

- \(D_i\) is binary, X is continuous.
- As specified above, the effect on Y of X (holding constant D) = \(\beta_2\), which does not depend on D.
- To allow the effect of X to depend on D, include the "interaction term" \(D_i \times X_i\) as a regressor:

\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 (D_i \times X_i) + u_i \]

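Interpreting the coefficients

\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 (D_i \times X_i) + u_i \]

General rule: compare the various cases

\[ Y = \beta_0 + \beta_1 D + \beta_2 X + \beta_3 (D \times X) \qquad \text{(b)} \]

Now change X:
\[ Y + \Delta Y = \beta_0 + \beta_1 D + \beta_2 (X + \Delta X) + \beta_3 [D \times (X + \Delta X)] \qquad \text{(a)} \]

subtract (a) - (b):
\[ \Delta Y = \beta_2 \Delta X + \beta_3 D \Delta X \qquad \text{or} \qquad \frac{\Delta Y}{\Delta X} = \beta_2 + \beta_3 D \]

- The effect of X depends on D (what we wanted).
- \(\beta_3\) = increment to the effect of X, when D = 1.

Example: TestScore, STR, HiEL (= 1 if PctEL ≥ 20)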

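\[ \widehat{\mathit{TestScore}} = \underset{(11.9)}{682.2} - \underset{(0.59)}{0.97}\,\mathit{STR} + \underset{(19.5)}{5.6}\,\mathit{HiEL} - \underset{(0.97)}{1.28}\,(\mathit{STR} \times \mathit{HiEL}) \]

- When HiEL = 0:
  \[ \widehat{\mathit{TestScore}} = 682.2 - 0.97\,\mathit{STR} \]
- When HiEL = 1:
  \[ \widehat{\mathit{TestScore}} = 682.2 - 0.97\,\mathit{STR} + 5.6 - 1.28\,\mathit{STR} = 687.8 - 2.25\,\mathit{STR} \]
- Two regression lines: one for each HiEL group.
- Class size reduction is estimated to have a larger effect when the percent of English learners is large.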

Example, ctd.

\[ \widehat{\mathit{TestScore}} = \underset{(11.9)}{682.2} - \underset{(0.59)}{0.97}\,\mathit{STR} + \underset{(19.5)}{5.6}\,\mathit{HiEL} - \underset{(0.97)}{1.28}\,(\mathit{STR} \times \mathit{HiEL}) \]

Testing various hypotheses:

- The two regression lines have the same slope ⇔ the coefficient on STR×HiEL is zero: t = -1.28/0.97 = -1.32 ⇒ can't reject
- The two regression lines have the same intercept ⇔ the coefficient on HiEL is zero: t = 5.6/19.5 = 0.29 ⇒ can't reject

- Joint hypothesis that the two regression lines are the same ⇔ population coefficient on HiEL = 0 and population coefficient on STR×HiEL = 0: F = 89.94 (p-value < .001) !!
- Why do we reject the joint hypothesis but neither individual hypothesis?
- Consequence of high but imperfect multicollinearity: high correlation between HiEL and STR×HiEL

Binary-continuous interactions: the two regression lines

\[ Y_i = \beta_0 + \beta_1 D_i + \beta_2 X_i + \beta_3 (D_i \times X_i) + u_i \]

Observations with \(D_i = 0\) (the "D = 0" group):
\[ Y_i = \beta_0 + \beta_2 X_i + u_i \]

Observations with \(D_i = 1\) (the "D = 1" group):
\[ Y_i = \beta_0 + \beta_1 + \beta_2 X_i + \beta_3 X_i + u_i = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) X_i + u_i \]
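For the STR/HiEL example (estimated coefficients 682.2, -0.97, 5.6 and -1.28, with standard errors 19.5 on HiEL and 0.97 on the interaction), the two group-specific lines and the two individual t-statistics follow from simple arithmetic. A hedged Python sketch; the names below are illustrative, and 1.96 is the usual two-sided 5% normal critical value.

```python
B0, B_STR, B_HIEL, B_INTER = 682.2, -0.97, 5.6, -1.28  # estimated coefficients
SE_HIEL, SE_INTER = 19.5, 0.97                          # robust standard errors


def line(hi_el):
    """Return (intercept, slope) of the TestScore-STR line for the HiEL = hi_el group."""
    return B0 + B_HIEL * hi_el, B_STR + B_INTER * hi_el


def t_stat(coef, se):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


if __name__ == "__main__":
    for hi_el in (0, 1):
        intercept, slope = line(hi_el)
        print(f"HiEL={hi_el}: TestScore = {intercept:.1f} + ({slope:.2f}) * STR")
    for name, coef, se in (("same slope", B_INTER, SE_INTER), ("same intercept", B_HIEL, SE_HIEL)):
        t = t_stat(coef, se)
        verdict = "reject" if abs(t) > 1.96 else "cannot reject"
        print(f"{name}: t = {t:.2f} -> {verdict} at the 5% level")
```

Neither individual t-statistic exceeds 1.96 in absolute value, which is consistent with the slides' point that only the joint F-test rejects.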

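(c) Interactions between two continuous variables

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \]

- \(X_1\), \(X_2\) are continuous.
- As specified, the effect of \(X_1\) doesn't depend on \(X_2\).
- As specified, the effect of \(X_2\) doesn't depend on \(X_1\).
- To allow the effect of \(X_1\) to depend on \(X_2\), include the "interaction term" \(X_{1i} \times X_{2i}\) as a regressor:

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 (X_{1i} \times X_{2i}) + u_i \]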

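Coefficients in continuous-continuous interactions

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 (X_{1i} \times X_{2i}) + u_i \]

General rule: compare the various cases

\[ Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 (X_1 \times X_2) \qquad \text{(b)} \]

Now change \(X_1\):
\[ Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2 + \beta_3 [(X_1 + \Delta X_1) \times X_2] \qquad \text{(a)} \]

subtract (a) - (b):
\[ \Delta Y = \beta_1 \Delta X_1 + \beta_3 X_2 \Delta X_1 \qquad \text{or} \qquad \frac{\Delta Y}{\Delta X_1} = \beta_1 + \beta_3 X_2 \]

- The effect of \(X_1\) depends on \(X_2\) (what we wanted).
- \(\beta_3\) = increment to the effect of \(X_1\) from a unit change in \(X_2\).

Example: TestScore, STR, PctEL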

\[ \widehat{\mathit{TestScore}} = \underset{(11.8)}{686.3} - \underset{(0.59)}{1.12}\,\mathit{STR} - \underset{(0.37)}{0.67}\,\mathit{PctEL} + \underset{(0.019)}{.0012}\,(\mathit{STR} \times \mathit{PctEL}) \]

The estimated effect of class size reduction is nonlinear because the size of the effect itself depends on PctEL:

\[ \frac{\Delta \widehat{\mathit{TestScore}}}{\Delta \mathit{STR}} = -1.12 + .0012\,\mathit{PctEL} \]

    PctEL    Estimated effect of a unit change in STR
    0        -1.12
    20%      -1.12 + .0012×20 = -1.10

Example, ctd: hypothesis tests

- Does population coefficient on STR×PctEL = 0?
  t = .0012/.019 = .06 ⇒ can't reject null at 5% level
- Does population coefficient on STR = 0?
  t = -1.12/0.59 = -1.90 ⇒ can't reject null at 5% level
- Do the coefficients on both STR and STR×PctEL = 0?
  F = 3.89 (p-value = .021) ⇒ reject null at 5% level (!!) (Why? high but imperfect multicollinearity)
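With a continuous-continuous interaction, the estimated effect of STR is itself a linear function of PctEL: -1.12 + .0012 PctEL. The sketch below evaluates it at the two values used above (0 and 20) plus one added illustrative value (50), and recomputes the two individual t-statistics from the reported coefficients and standard errors. All names are illustrative.

```python
B_STR, B_INTER = -1.12, 0.0012   # estimated coefficients on STR and STR x PctEL
SE_STR, SE_INTER = 0.59, 0.019   # their robust standard errors


def str_effect(pct_el):
    """Estimated change in TestScore per unit change in STR at a given PctEL."""
    return B_STR + B_INTER * pct_el


if __name__ == "__main__":
    for pct_el in (0, 20, 50):  # 50 is an added illustration
        print(f"PctEL = {pct_el:>2}: effect of STR = {str_effect(pct_el):.2f}")
    print(f"t on interaction: {B_INTER / SE_INTER:.2f}")
    print(f"t on STR:         {B_STR / SE_STR:.2f}")
```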

Application: Nonlinear Effects on Test Scores of the Student-Teacher Ratio (SW Section 6.4)

Focus on two questions:

1. Are there nonlinear effects of class size reduction on test scores? (Does a reduction from 35 to 30 have the same effect as a reduction from 20 to 15?)
2. Are there nonlinear interactions between PctEL and STR? (Are small classes more effective when there are many English learners?)

Strategy for Question #1 (different effects for different STR?)

- Estimate linear and nonlinear functions of STR, holding constant relevant demographic variables:
  o PctEL
  o Income (remember the nonlinear TestScore-Income relation!)
  o LunchPCT (fraction on free/subsidized lunch)
- See whether adding the nonlinear terms makes an "economically important" quantitative difference ("economic" or "real-world" importance is different from statistical significance).
- Test for whether the nonlinear terms are significant.

What is a good "base" specification?

The TestScore – Income relation

An advantage of the logarithmic specification is that it is better behaved near the ends of the sample, especially for large values of income.

Base specification

From the scatterplots and preceding analysis, here are plausible starting points for the demographic control variables:

Dependent variable: TestScore

    Independent variable    Functional form
    PctEL                   linear
    LunchPCT                linear
    Income                  ln(Income) (or could use cubic)

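Question #1: Investigate by considering a polynomial in STR

\[ \begin{aligned} \widehat{\mathit{TestScore}} = {} & \underset{(163.6)}{252.0} + \underset{(24.86)}{64.33}\,\mathit{STR} - \underset{(1.25)}{3.42}\,\mathit{STR}^2 + \underset{(.021)}{.059}\,\mathit{STR}^3 \\ & - \underset{(1.03)}{5.47}\,\mathit{HiEL} - \underset{(.029)}{.420}\,\mathit{LunchPCT} + \underset{(1.78)}{11.75}\,\ln(\mathit{Income}) \end{aligned} \]

Interpretation of coefficients on:
- HiEL?
- LunchPCT?
- ln(Income)?
- STR, STR², STR³?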
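Because the individual coefficients on STR, STR² and STR³ are hard to read, one way to interpret the cubic is to compute the predicted change in TestScore for a given cut in STR at different starting points, holding the other regressors fixed. The sketch below does this with the rounded coefficients 64.33, -3.42 and .059; the starting values 22 and 20 are illustrative choices, not taken from the slides, and because the three terms nearly cancel, results based on rounded coefficients are only approximate.

```python
def str_part(s):
    """Contribution of the STR polynomial to predicted TestScore (rounded coefficients)."""
    return 64.33 * s - 3.42 * s ** 2 + 0.059 * s ** 3


def gain_from_cut(str_from, str_to):
    """Predicted change in TestScore when STR is cut from str_from to str_to,
    holding HiEL, LunchPCT and ln(Income) fixed (their terms cancel in the difference)."""
    return str_part(str_to) - str_part(str_from)


if __name__ == "__main__":
    # Two-student reductions from two illustrative starting points.
    for start in (22, 20):
        print(f"STR {start} -> {start - 2}: {gain_from_cut(start, start - 2):+.2f} points")
```

The two predicted gains differ, which is the nonlinearity in STR that the t- and F-tests on the higher-order terms assess formally.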

Interpreting the regression function via plots (the preceding regression is labeled (5) in this figure)

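Are the higher order terms in STR statistically significant?

\[ \begin{aligned} \widehat{\mathit{TestScore}} = {} & \underset{(163.6)}{252.0} + \underset{(24.86)}{64.33}\,\mathit{STR} - \underset{(1.25)}{3.42}\,\mathit{STR}^2 + \underset{(.021)}{.059}\,\mathit{STR}^3 \\ & - \underset{(1.03)}{5.47}\,\mathit{HiEL} - \underset{(.029)}{.420}\,\mathit{LunchPCT} + \underset{(1.78)}{11.75}\,\ln(\mathit{Income}) \end{aligned} \]

(a) \(H_0\): quadratic in STR v. \(H_1\): cubic in STR?
    t = .059/.021 = 2.86 (p = .005)
(b) \(H_0\): linear in STR v. \(H_1\): nonlinear/up to cubic in STR?
    F = 6.17 (p = .002)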

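Question #2: STR-PctEL interactions (to simplify things, ignore the STR², STR³ terms for now)

\[ \begin{aligned} \widehat{\mathit{TestScore}} = {} & \underset{(9.9)}{653.6} - \underset{(.34)}{.53}\,\mathit{STR} + \underset{(9.80)}{5.50}\,\mathit{HiEL} - \underset{(.50)}{.58}\,\mathit{HiEL} \times \mathit{STR} \\ & - \underset{(.029)}{.411}\,\mathit{LunchPCT} + \underset{(1.80)}{12.12}\,\ln(\mathit{Income}) \end{aligned} \]

Interpretation of coefficients on:
- STR?
- HiEL? (wrong sign?)
- HiEL×STR?
- LunchPCT?
- ln(Income)?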

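Interpreting the regression functions via plots:

\[ \begin{aligned} \widehat{\mathit{TestScore}} = {} & \underset{(9.9)}{653.6} - \underset{(.34)}{.53}\,\mathit{STR} + \underset{(9.80)}{5.50}\,\mathit{HiEL} - \underset{(.50)}{.58}\,\mathit{HiEL} \times \mathit{STR} \\ & - \underset{(.029)}{.411}\,\mathit{LunchPCT} + \underset{(1.80)}{12.12}\,\ln(\mathit{Income}) \end{aligned} \]

"Real-world" ("policy" or "economic") importance of the interaction term:

\[ \frac{\Delta \widehat{\mathit{TestScore}}}{\Delta \mathit{STR}} = -.53 - .58\,\mathit{HiEL} = \begin{cases} -1.12 & \text{if } \mathit{HiEL} = 1 \\ -.53 & \text{if } \mathit{HiEL} = 0 \end{cases} \]

- The difference in the estimated effect of reducing the STR is substantial; class size reduction is more effective in districts with more English learners.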

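Is the interaction effect statistically significant?

\[ \begin{aligned} \widehat{\mathit{TestScore}} = {} & \underset{(9.9)}{653.6} - \underset{(.34)}{.53}\,\mathit{STR} + \underset{(9.80)}{5.50}\,\mathit{HiEL} - \underset{(.50)}{.58}\,\mathit{HiEL} \times \mathit{STR} \\ & - \underset{(.029)}{.411}\,\mathit{LunchPCT} + \underset{(1.80)}{12.12}\,\ln(\mathit{Income}) \end{aligned} \]

(a) \(H_0\): coeff. on interaction = 0 v. \(H_1\): nonzero interaction
    t = -1.17 ⇒ not significant at the 10% level
(b) \(H_0\): both coeffs involving STR = 0 vs. \(H_1\): at least one coefficient is nonzero (STR enters)
    F = 5.92 (p = .003)

Next: specifications with polynomials + interactions!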

Interpreting the regression functions via plots:

Tests of joint hypotheses:

Summary: Nonlinear Regression Functions

- Using functions of the independent variables such as ln(X) or \(X_1 \times X_2\) allows recasting a large family of nonlinear regression functions as multiple regression.
- Estimation and inference proceed in the same way as in the linear multiple regression model.
- Interpretation of the coefficients is model-specific, but the general rule is to compute effects by comparing different cases (different values of the original X's).
- Many nonlinear specifications are possible, so you must use judgment: What nonlinear effect do you want to analyze? What makes sense in your application?