
Published on STAT 501 (https://onlinecourses.science.psu.edu/stat501)

10.2 Stepwise Regression

In this section, we learn about the stepwise regression procedure. While we will soon learn the finer details, the general idea behind the stepwise regression procedure is that we build our regression model from a set of candidate predictor variables by entering and removing predictors, in a stepwise manner, into our model until there is no justifiable reason to enter or remove any more.

Our hope is, of course, that we end up with a reasonable and useful regression model. There is one sure way of ending up with a model that is certain to be underspecified, and that's if the set of candidate predictor variables doesn't include all of the variables that actually predict the response. This leads us to a fundamental rule of the stepwise regression procedure: the list of candidate predictor variables must include all of the variables that actually predict the response. Otherwise, we are sure to end up with a regression model that is underspecified and therefore misleading.

An example

Let's learn how the stepwise regression procedure works by considering a data set that concerns the hardening of cement. Sounds interesting, eh? In particular, the researchers were interested in learning how the composition of the cement affected the heat evolved during the hardening of the cement. Therefore, they measured and recorded the following data (cement.txt [1]) on 13 batches of cement:

- Response y: heat evolved in calories during hardening of cement on a per gram basis
- Predictor x1: % of tricalcium aluminate
- Predictor x2: % of tricalcium silicate
- Predictor x3: % of tetracalcium alumino ferrite
- Predictor x4: % of dicalcium silicate

Now, if you study the scatter plot matrix of the data:

[Figure: scatter plot matrix of y, x1, x2, x3, and x4]

you can get a hunch of which predictors are good candidates for being the first to enter the stepwise model. It looks as if the strongest relationship exists between either y and x2 or between y and x4, and therefore perhaps either x2 or x4 should enter the stepwise model first. Did you notice what else is going on in this data set, though? A strong correlation also exists between the predictors x2 and x4! How does this correlation among the predictor variables play out in the stepwise procedure? Let's see what happens when we use the stepwise regression method to find a model that is appropriate for these data.
Note. The number of predictors in this data set is not large. The stepwise procedure is typically used on much larger data sets, for which it is not feasible to attempt to fit all of the possible regression models. For the sake of illustration, the data set here is necessarily small, so that the largeness of the data set does not obscure the pedagogical point being made.
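To see why fitting every possible model quickly stops being feasible, it helps to count them. The sketch below assumes that each candidate predictor is simply either in or out of the model (no interactions or transformations), so that k candidates give 2^k possible models, counting the intercept-only model. The function name is ours, chosen for illustration.

```python
from itertools import combinations

def all_subsets(predictors):
    """Return every subset of the candidate predictors, from the
    intercept-only model (empty tuple) up to the full model."""
    subsets = []
    for size in range(len(predictors) + 1):
        subsets.extend(combinations(predictors, size))
    return subsets

cement = ["x1", "x2", "x3", "x4"]
print(len(all_subsets(cement)))  # 16 candidate models for 4 predictors
print(2 ** 30)                   # count for a hypothetical 30 candidates
```

With only four candidates we could easily fit all 16 models. The count doubles with every additional candidate, which is what motivates a stepwise search on larger data sets.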

The procedure

Again, before we learn the finer details, let me provide a broad overview of the steps involved. First, we start with no predictors in our "stepwise model." Then, at each step along the way, we either enter or remove a predictor based on the partial F-tests (that is, the t-tests for the slope parameters) that are obtained. We stop when no more predictors can be justifiably entered or removed from our stepwise model, thereby leading us to a "final model."
Now, let's make this process a bit more concrete. Here goes:

Starting the procedure. The first thing we need to do is set a significance level for deciding when to enter a predictor into the stepwise model. We'll call this the Alpha-to-Enter significance level and will denote it as α_E. Of course, we also need to set a significance level for deciding when to remove a predictor from the stepwise model. We'll call this the Alpha-to-Remove significance level and will denote it as α_R. That is, first:

- Specify an Alpha-to-Enter significance level. This will typically be greater than the usual 0.05 level so that it is not too difficult to enter predictors into the model. Many software packages (Minitab included) set this significance level by default to α_E = 0.15.
- Specify an Alpha-to-Remove significance level. This will typically be greater than the usual 0.05 level so that it is not too easy to remove predictors from the model. Again, many software packages (Minitab included) set this significance level by default to α_R = 0.15.
Step #1. Once we've specified the starting significance levels, then we:

1. Fit each of the one-predictor models (that is, regress y on x1, regress y on x2, ..., and regress y on x_{p-1}).
2. Of those predictors whose t-test P-value is less than α_E = 0.15, the first predictor put in the stepwise model is the predictor that has the smallest t-test P-value.
3. If no predictor has a t-test P-value less than α_E = 0.15, stop.
Step #2. Then:

1. Suppose x1 had the smallest t-test P-value below α_E = 0.15 and therefore was deemed the "best" single predictor arising from the first step.
2. Now, fit each of the two-predictor models that include x1 as a predictor (that is, regress y on x1 and x2, regress y on x1 and x3, ..., and regress y on x1 and x_{p-1}).
3. Of those predictors whose t-test P-value is less than α_E = 0.15, the second predictor put in the stepwise model is the predictor that has the smallest t-test P-value.
4. If no predictor has a t-test P-value less than α_E = 0.15, stop. The model with the one predictor obtained from the first step is your final model.
5. But, suppose instead that x2 was deemed the "best" second predictor and it is therefore entered into the stepwise model.
6. Now, since x1 was the first predictor in the model, step back and see if entering x2 into the stepwise model somehow affected the significance of the x1 predictor. That is, check the t-test P-value for testing β_1 = 0. If the t-test P-value for β_1 = 0 has become not significant (that is, the P-value is greater than α_R = 0.15), remove x1 from the stepwise model.
Step #3. Then:

1. Suppose both x1 and x2 made it into the two-predictor stepwise model and remained there.
2. Now, fit each of the three-predictor models that include x1 and x2 as predictors (that is, regress y on x1, x2, and x3, regress y on x1, x2, and x4, ..., and regress y on x1, x2, and x_{p-1}).
3. Of those predictors whose t-test P-value is less than α_E = 0.15, the third predictor put in the stepwise model is the predictor that has the smallest t-test P-value.
4. If no predictor has a t-test P-value less than α_E = 0.15, stop. The model containing the two predictors obtained from the second step is your final model.
5. But, suppose instead that x3 was deemed the "best" third predictor and it is therefore entered into the stepwise model.
6. Now, since x1 and x2 were the first predictors in the model, step back and see if entering x3 into the stepwise model somehow affected the significance of the x1 and x2 predictors. That is, check the t-test P-values for testing β_1 = 0 and β_2 = 0. If the t-test P-value for either β_1 = 0 or β_2 = 0 has become not significant (that is, the P-value is greater than α_R = 0.15), remove the predictor from the stepwise model.
Stopping the procedure. Continue the steps as described above until adding an additional predictor does not yield a t-test P-value below α_E = 0.15.

Whew! Let's return to our cement data example so we can try out the stepwise procedure as described above.
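The enter-then-check-for-removal loop described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions, not how Minitab or any other package implements it. The function `stepwise` and its `pvalues` argument are names we made up: `pvalues(model)` stands in for "fit the regression on this set of predictors and return the t-test P-value of each one". The small lookup table at the bottom holds invented P-values for three hypothetical predictors a, b, and c, used only to exercise the logic.

```python
def stepwise(candidates, pvalues, alpha_enter=0.15, alpha_remove=0.15,
             max_steps=100):
    """Sketch of the stepwise procedure.

    pvalues(frozenset_of_names) must return {name: t-test P-value}
    for every predictor in that model (caller-supplied, hypothetical).
    """
    model = []
    for _ in range(max_steps):  # guard against cycling forever
        # Entry: add each remaining candidate in turn to the current model
        # and record the P-value of the newly added predictor.
        trials = {x: pvalues(frozenset(model + [x]))[x]
                  for x in candidates if x not in model}
        eligible = {x: p for x, p in trials.items() if p < alpha_enter}
        if not eligible:
            break               # nothing can justifiably enter: stop
        best = min(eligible, key=eligible.get)
        model.append(best)
        # Removal: step back and re-check the predictors entered earlier.
        refit = pvalues(frozenset(model))
        model = [x for x in model
                 if x == best or refit[x] <= alpha_remove]
    return model

# Invented P-values for a toy problem (not from any real data set).
table = {
    frozenset({"a"}): {"a": 0.010},
    frozenset({"b"}): {"b": 0.020},
    frozenset({"c"}): {"c": 0.500},
    frozenset({"a", "b"}): {"a": 0.300, "b": 0.001},
    frozenset({"a", "c"}): {"a": 0.010, "c": 0.400},
    frozenset({"b", "c"}): {"b": 0.001, "c": 0.600},
}
print(stepwise(["a", "b", "c"], table.__getitem__))  # ['b']
```

In the toy run, a enters first, b enters second, and a is then removed because its P-value in the two-predictor model (0.300) exceeds the Alpha-to-Remove level. Passing the fitting step in as a function keeps the sketch independent of any particular regression library.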

The example again

To start our stepwise regression procedure, let's set our Alpha-to-Enter significance level at α_E = 0.15, and let's set our Alpha-to-Remove significance level at α_R = 0.15. Now, regressing y on x1, regressing y on x2, regressing y on x3, and regressing y on x4, we obtain:

[Minitab output: the four one-predictor regressions of y on x1, x2, x3, and x4]

Each of the predictors is a candidate to be entered into the stepwise model because each t-test P-value is less than α_E = 0.15. The predictors x2 and x4 tie for having the smallest t-test P-value (it is 0.001 in each case). But note the tie is an artifact of Minitab rounding to three decimal places. The t-statistic for x4 is larger in absolute value than the t-statistic for x2 (4.77 versus 4.69), and therefore the P-value for x4 must be smaller. As a result of the first step, we enter x4 into our stepwise model.
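To see that the tie really is a rounding artifact, we can compute the two-sided P-values to more decimal places. The sketch below is one way to do it using only Python's standard library: it integrates the t density numerically with Simpson's rule. The function names are ours, and the degrees of freedom, 13 - 2 = 11, are an assumption that follows from fitting an intercept and one slope to the 13 batches.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def two_sided_p(t, df, n=2000):
    """Two-sided P-value, 1 - P(-|t| < T < |t|), via Simpson's rule
    on [0, |t|] with n (even) subintervals."""
    b = abs(t)
    h = b / n
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0

df = 13 - 2  # 13 batches, minus intercept and one slope
for name, t in [("x4", 4.77), ("x2", 4.69)]:
    p = two_sided_p(t, df)
    print(name, round(p, 3), f"{p:.6f}")
```

Both P-values round to 0.001, but the unrounded value for x4 is the smaller of the two, in agreement with the comparison of the t-statistics.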
Now, following step #2, we fit each of the two-predictor models that include x4 as a predictor (that is, we regress y on x4 and x1, regress y on x4 and x2, and regress y on x4 and x3), obtaining:

[Minitab output: the three two-predictor regressions that include x4]

The predictor x2 is not eligible for entry into the stepwise model because its t-test P-value (0.687) is greater than α_E = 0.15. The predictors x1 and x3 are candidates because each t-test P-value is less than α_E = 0.15. The predictors x1 and x3 tie for having the smallest t-test P-value (it is < 0.001 in each case). But, again, the tie is an artifact of Minitab rounding to three decimal places. The t-statistic for x1 is larger in absolute value than the t-statistic for x3 (10.40 versus 6.35), and therefore the P-value for x1 must be smaller. As a result of the second step, we enter x1 into our stepwise model.
Now, since x4 was the first predictor in the model, we must step back and see if entering x1 into the stepwise model affected the significance of the x4 predictor. It did not: the t-test P-value for testing β_4 = 0 is less than 0.001, and thus smaller than α_R = 0.15. Therefore, we proceed to the third step with both x1 and x4 as predictors in our stepwise model.
Now, following step #3, we fit each of the three-predictor models that include x1 and x4 as predictors (that is, we regress y on x4, x1, and x2, and we regress y on x4, x1, and x3), obtaining:

[Minitab output: the two three-predictor regressions that include x1 and x4]

Both of the remaining predictors (x2 and x3) are candidates to be entered into the stepwise model because each t-test P-value is less than α_E = 0.15. The predictor x2 has the smallest t-test P-value (0.052). Therefore, as a result of the third step, we enter x2 into our stepwise model.
Now, since x1 and x4 were the first predictors in the model, we must step back and see if entering x2 into the stepwise model affected the significance of the x1 and x4 predictors. Indeed, it did: the t-test P-value for testing β_4 = 0 is 0.205, which is greater than α_R = 0.15. Therefore, we remove the predictor x4 from the stepwise model, leaving us with the predictors x1 and x2 in our stepwise model:

[Minitab output: regression of y on x1 and x2]
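Now, we proceed by fitting each of the three-predictor models that include x1 and x2 as predictors (that is, we regress y on x1, x2, and x3, and we regress y on x1, x2, and x4), obtaining:

[Minitab output: the two three-predictor regressions that include x1 and x2]

Neither of the remaining predictors (x3 and x4) is eligible for entry into our stepwise model, because each t-test P-value (0.209 and 0.205, respectively) is greater than α_E = 0.15. That is, we stop our stepwise regression procedure. Our final regression model, based on the stepwise procedure, contains only the predictors x1 and x2:

[Minitab output: final regression of y on x1 and x2]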
Whew! That took a lot of work! The good news is that most statistical software (including Minitab) provides a stepwise regression procedure that does all of the dirty work for us. For example, in Minitab v17, select Stat > Regression > Regression > Fit Regression Model, click the Stepwise button in the resulting Regression Dialog, select Stepwise for Method, and select Include details for each step under Display the table of model selection details. Here's what the Minitab stepwise regression output looks like for our cement data example:

[Minitab output: stepwise selection of terms for the cement data]
Minitab tells us that:

- a stepwise regression procedure was conducted on the response y and four predictors x1, x2, x3, and x4
- the Alpha-to-Enter significance level was set at α_E = 0.15 and the Alpha-to-Remove significance level was set at α_R = 0.15

The remaining portion of the output contains the results of the various steps of Minitab's stepwise regression procedure. One thing to keep in mind is that Minitab numbers the steps a little differently than described above. Minitab considers a step any addition or removal of a predictor from the stepwise model, whereas our steps (step #3, for example) consider the addition of one predictor and the removal of another as one step.

The results of each of Minitab's steps are reported in a column labeled by the step number. It took Minitab 4 steps before the procedure was stopped. Here's what the output tells us:
- Just as our work above showed, as a result of Minitab's first step, the predictor x4 is entered into the stepwise model. Minitab tells us that the estimated intercept ("Constant") b0 = 117.57 and the estimated slope b4 = -0.738. The P-value for testing β_4 = 0 is 0.001. The estimate S, which equals the square root of MSE, is 8.96. The R² value is 67.45% and the adjusted R² value is 64.50%. Mallows' Cp statistic, which we learn about in the next section, is 138.73. The output also includes a predicted R² value, which we'll come back to in Section 10.5.
- As a result of Minitab's second step, the predictor x1 is entered into the stepwise model already containing the predictor x4. Minitab tells us that the estimated intercept b0 = 103.10, the estimated slope b4 = -0.614, and the estimated slope b1 = 1.44. The P-value for testing β_4 = 0 is < 0.001. The P-value for testing β_1 = 0 is < 0.001. The estimate S is 2.73. The R² value is 97.25% and the adjusted R² value is 96.70%. Mallows' Cp statistic is 5.5.
- As a result of Minitab's third step, the predictor x2 is entered into the stepwise model already containing the predictors x1 and x4. Minitab tells us that the estimated intercept b0 = 71.6, the estimated slope b4 = -0.237, the estimated slope b1 = 1.452, and the estimated slope b2 = 0.416. The P-value for testing β_4 = 0 is 0.205. The P-value for testing β_1 = 0 is < 0.001. The P-value for testing β_2 = 0 is 0.052. The estimate S is 2.31. The R² value is 98.23% and the adjusted R² value is 97.64%. Mallows' Cp statistic is 3.02.
- As a result of Minitab's fourth and final step, the predictor x4 is removed from the stepwise model containing the predictors x1, x2, and x4, leaving us with the final model containing only the predictors x1 and x2. Minitab tells us that the estimated intercept b0 = 52.58, the estimated slope b1 = 1.468, and the estimated slope b2 = 0.6623. The P-value for testing β_1 = 0 is < 0.001. The P-value for testing β_2 = 0 is < 0.001. The estimate S is 2.41. The R² value is 97.87% and the adjusted R² value is 97.44%. Mallows' Cp statistic is 2.68.
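The estimates reported for the fourth step define the final fitted equation, which we can write as a small function. The coefficients below are the ones quoted from the Minitab output. The function name and the batch composition in the usage line are hypothetical, chosen only to show how the equation is used.

```python
def predict_heat(x1, x2):
    """Fitted final stepwise model for the cement data, using the
    reported estimates b0 = 52.58, b1 = 1.468, and b2 = 0.6623."""
    return 52.58 + 1.468 * x1 + 0.6623 * x2

# Hypothetical batch: 7% tricalcium aluminate, 26% tricalcium silicate.
print(round(predict_heat(7, 26), 2))  # 80.08 calories per gram
```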
Does the stepwise regression procedure lead us to the "best" model? No, not at all! Nothing occurs in the stepwise regression procedure to guarantee that we have found the optimal model. Case in point! Suppose we defined the best model to be the model with the largest adjusted R² value. Then, here, we would prefer the model containing the three predictors x1, x2, and x4, because its adjusted R² value is 97.64%, which is higher than the adjusted R² value of 97.44% for the final stepwise model containing just the two predictors x1 and x2.

Again, nothing occurs in the stepwise regression procedure to guarantee that we have found the optimal model. This, and other cautions of the stepwise regression procedure, are delineated in the next section.
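The two adjusted R² values being compared can be recovered from the reported R² values using the usual definition, adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of predictors. The sketch below assumes that definition and n = 13 batches. The function name is ours.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k predictors plus an
    intercept, fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 13  # batches of cement
print(round(100 * adjusted_r2(0.9823, n, 3), 2))  # model with x1, x2, x4
print(round(100 * adjusted_r2(0.9787, n, 2), 2))  # model with x1, x2
```

The results, 97.64 and 97.44, match the values in the Minitab output. Dropping x4 lowers R² only slightly, but the penalty for the extra parameter is not large enough here to make the smaller model win on adjusted R².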

Cautions!

Here are some things to keep in mind concerning the stepwise regression procedure:

- The final model is not guaranteed to be optimal in any specified sense.
- The procedure yields a single final model, although there are often several equally good models.
- Stepwise regression does not take into account a researcher's knowledge about the predictors. It may be necessary to force the procedure to include important predictors.
- One should not over-interpret the order in which predictors are entered into the model.
- One should not jump to the conclusion that all the important predictor variables for predicting y have been identified, or that all the unimportant predictor variables have been eliminated. It is, of course, possible that we may have committed a Type I or Type II error along the way.
- Many t-tests for testing β_k = 0 are conducted in a stepwise regression procedure. The probability is therefore high that we included some unimportant predictors or excluded some important predictors.

It's for all of these reasons that one should be careful not to overuse or overstate the results of any stepwise regression procedure.
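The caution about conducting many t-tests can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that m tests are independent and that every null hypothesis β_k = 0 is true. The t-tests in a real stepwise run are not independent, so treat the numbers as an indication of the trend rather than as exact error rates. The function name is ours.

```python
def prob_at_least_one_false_entry(alpha, m):
    """Chance that at least one of m independent tests of a true null
    hypothesis gives a P-value below alpha."""
    return 1 - (1 - alpha) ** m

for m in (1, 4, 10):
    print(m, round(prob_at_least_one_false_entry(0.15, m), 3))
```

With an Alpha-to-Enter level of 0.15, the chance of at least one spurious "significant" result grows quickly with the number of tests, which is why the entered predictors should not be read as a definitive list.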

More examples

Let's close up our discussion of stepwise regression by taking a quick look at two more examples.

Example #1. Are a person's brain size and body size predictive of his or her intelligence? Interested in this question, some researchers (Willerman et al., 1991) collected the following data (iqsize.txt [2]) on a sample of n = 38 college students:

- Response (y): Performance IQ scores (PIQ) from the revised Wechsler Adult Intelligence Scale. This variable served as the investigator's measure of the individual's intelligence.
- Potential predictor (x1): Brain size based on the count obtained from MRI scans (given as count/10,000).
- Potential predictor (x2): Height in inches.
- Potential predictor (x3): Weight in pounds.

A matrix plot of the resulting data looks like:

[Figure: matrix plot of PIQ, Brain, Height, and Weight]

Using Minitab to perform the stepwise regression procedure, we obtain:

[Minitab output: stepwise selection of terms for the IQ size data]
The output tells us:

- The first predictor entered into the stepwise model is Brain. Minitab tells us that the estimated intercept is 4.7 and the estimated slope for Brain is 1.177. The P-value for testing β_Brain = 0 is 0.019. The estimate S is 21.2, the R² value is 14.27%, the adjusted R² value is 11.89%, and Mallows' Cp statistic is 7.34.
- The second and final predictor entered into the stepwise model is Height. Minitab tells us that the estimated intercept is 111.3, the estimated slope for Brain is 2.061, and the estimated slope for Height is -2.730. The P-value for testing β_Brain = 0 is 0.001. The P-value for testing β_Height = 0 is 0.009. The estimate S is 19.5, the R² value is 29.49%, the adjusted R² value is 25.46%, and Mallows' Cp statistic is 2.00.
- At no step is a predictor removed from the stepwise model.
- When α_E = α_R = 0.15, the final stepwise regression model contains the predictors Brain and Height.
Example #2. Some researchers observed the following data (bloodpress.txt [3]) on 20 individuals with high blood pressure:

- blood pressure (y = BP, in mm Hg)
- age (x1 = Age, in years)
- weight (x2 = Weight, in kg)
- body surface area (x3 = BSA, in sq m)
- duration of hypertension (x4 = Dur, in years)
- basal pulse (x5 = Pulse, in beats per minute)
- stress index (x6 = Stress)

The researchers were interested in determining if a relationship exists between blood pressure and age, weight, body surface area, duration, pulse rate and/or stress level.

The matrix plot of BP, Age, Weight, and BSA looks like:

[Figure: matrix plot of BP, Age, Weight, and BSA]

and the matrix plot of BP, Dur, Pulse, and Stress looks like:

[Figure: matrix plot of BP, Dur, Pulse, and Stress]

Using Minitab to perform the stepwise regression procedure, we obtain:

[Minitab output: stepwise selection of terms for the blood pressure data]

When α_E = α_R = 0.15, the final stepwise regression model contains the predictors Weight, Age, and BSA.

PRACTICE PROBLEMS: Stepwise regression

Brain size and body size. Imagine that you do not have automated stepwise regression software at your disposal, and conduct the stepwise regression procedure on the iqsize.txt [2] data set. Setting Alpha-to-Remove and Alpha-to-Enter at 0.15, verify the final model obtained above by Minitab.

That is:

a. First, fit each of the three possible simple linear regression models. That is, regress PIQ on Brain, regress PIQ on Height, and regress PIQ on Weight. (See Minitab Help: Performing a basic regression analysis [4].) The first predictor that should be entered into the stepwise model is the predictor with the smallest P-value (or, equivalently, the largest t-statistic in absolute value) for testing β_k = 0, provided the P-value is smaller than 0.15. What is the first predictor that should be entered into the stepwise model?
b. Now, fit each of the possible two-predictor multiple linear regression models which include the first predictor identified above and each of the remaining two predictors. (See Minitab Help: Performing a basic regression analysis [4].) Which predictor should be entered into the model next?
c. Continue the stepwise regression procedure until you cannot justify entering or removing any more predictors. What is the final model identified by your stepwise regression procedure?

Source URL: https://onlinecourses.science.psu.edu/stat501/node/329

Links:
[1] https://onlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/cement.txt
[2] https://onlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/iqsize.txt
[3] https://onlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/bloodpress.txt
[4] https://onlinecourses.science.psu.edu/stat501/node/130
