
9/23/2016 math - Can't understand the cost function for Linear Regression - Stack Overflow


Can't understand the cost function for Linear Regression

I really can't understand the following equation, especially 1/(2m).

What's the purpose of this equation? And where does 1/(2m) come from?

J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [h_theta(x^i) - y^i]^2

Please explain. How it casts???

math machine-learning linear-algebra

edited Feb 19 at 5:10 by saurabhagarwal
asked Jan 13 '14 at 19:05 by Faheem

3 This question appears to be off-topic because it is about understanding of math and not the programming of math. - Lance Roberts, Jan 13 '14 at 19:07

1 This is in Machine Learning by Andrew Ng, a professor at Stanford, from his lectures on Coursera. This is related to computer science and programming. But if I don't understand its mathematics, then how can I develop machine learning programs?? - Faheem, Jan 13 '14 at 20:32

You should use a math site to understand the math, like this one: math.stackexchange.com. But it looks like you have a good answer now. - Lance Roberts, Jan 13 '14 at 20:35

Okay, yes, you're right. For math I'll look at math.stackexchange.com. Thanks. - Faheem, Jan 13 '14 at 20:47

Note that typing the variables as you do gives the wrong impression. For example, the x^i you write could better be written as x(i) or xi, since x^i gives the impression that x is raised to some power, which is not the case. - Pithikos, Oct 9 '14 at 12:05

1 Answer

The cost function is

J(theta_0, theta_1) = 1/(2m) * sum_(i=1)^m [h_theta(x^i) - y^i]^2

By h_theta(x^i) we denote what the model outputs for x^i, so h_theta(x^i) - y^i is its error (assuming that y^i is the correct output).

Now we calculate the square of this error, [h_theta(x^i) - y^i]^2 (which removes the sign, as this error could be either positive or negative), and sum it over all samples. To bound it somehow, we normalize it simply by dividing by m, so we have the mean (because we divide by the number of samples) squared (because we square) error (because we compute an error):

1/m * sum_(i=1)^m [h_theta(x^i) - y^i]^2
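
As a minimal sketch of the computation described above, the mean squared error could be written in Python as follows. This assumes a straight-line hypothesis h_theta(x) = theta_0 + theta_1 * x, which the post does not spell out, and uses made-up sample data; the function names are illustrative only.

```python
def hypothesis(theta_0, theta_1, x):
    """Model output h_theta(x) for a straight-line model (assumed form)."""
    return theta_0 + theta_1 * x


def mean_squared_error(theta_0, theta_1, xs, ys):
    """Compute 1/m * sum over samples of (h_theta(x^i) - y^i)^2."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = hypothesis(theta_0, theta_1, x) - y  # signed error for one sample
        total += error ** 2                          # squaring removes the sign
    return total / m                                 # normalize by the sample count


# Made-up data for illustration: y = 2x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(mean_squared_error(0.0, 2.0, xs, ys))  # perfect fit, so the error is 0.0
print(mean_squared_error(0.0, 1.0, xs, ys))  # errors are -1, -2, -3, so (1 + 4 + 9) / 3
```

Here x^i and y^i are simply the i-th elements of the lists, not powers.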

The 2 that appears in front (in 1/(2m)) is used only to simplify the derivative: when you try to minimize the function, you will use the steepest descent method, which is based on the derivative of this function. The derivative of a^2 is 2a, and our function is a square of something, so this 2 will cancel out. This is the only reason for its existence.
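
To make the cancellation concrete, here is a small Python sketch. It assumes a straight-line hypothesis h_theta(x) = theta_0 + theta_1 * x (not stated in the post) and made-up data. With the 1/(2m) factor, the partial derivatives come out as 1/m * sum (h_theta(x^i) - y^i) and 1/m * sum (h_theta(x^i) - y^i) * x^i, with no leftover 2. The sketch compares these formulas against a finite-difference estimate; all names are illustrative.

```python
def cost(theta_0, theta_1, xs, ys):
    """J(theta_0, theta_1) = 1/(2m) * sum (h_theta(x^i) - y^i)^2."""
    m = len(xs)
    return sum((theta_0 + theta_1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(theta_0, theta_1, xs, ys):
    """Analytic partial derivatives of J; the 2 from the square cancels the 1/2."""
    m = len(xs)
    errors = [theta_0 + theta_1 * x - y for x, y in zip(xs, ys)]
    d_theta_0 = sum(errors) / m
    d_theta_1 = sum(e * x for e, x in zip(errors, xs)) / m
    return d_theta_0, d_theta_1


def numeric_gradient(theta_0, theta_1, xs, ys, eps=1e-6):
    """Central finite differences, used only to cross-check the formulas."""
    d0 = (cost(theta_0 + eps, theta_1, xs, ys)
          - cost(theta_0 - eps, theta_1, xs, ys)) / (2 * eps)
    d1 = (cost(theta_0, theta_1 + eps, xs, ys)
          - cost(theta_0, theta_1 - eps, xs, ys)) / (2 * eps)
    return d0, d1


# Made-up data for illustration.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(gradient(0.0, 1.0, xs, ys))          # analytic values
print(numeric_gradient(0.0, 1.0, xs, ys))  # should agree closely
```

Dropping the 1/2 would only scale the gradient by 2, which moves the minimum nowhere; it would just change the effective step size in steepest descent.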

edited May 9 at 23:06
answered Jan 13 '14 at 20:28 by lejlot

1 Okay, so nice. This is a great answer. I understand now. But one question again, if you don't mind: [h_theta(x^i) - y^i]^2 is something like (a - b)^2, which is equal to a^2 + b^2 - 2ab. Why do we not expand [h_theta(x^i) - y^i]^2 like [h_theta(x^i)]^2 + [y^i]^2 - 2[h_theta(x^i)][y^i]? Thanks. - Faheem, Jan 13 '14 at 20:45

1 Because this expansion won't lead to any simplification and only adds additional operations (it is cheaper to compute (a - b)^2 than a^2 - 2ab + b^2, because the first one requires 2 arithmetic operations, while the second one requires 6). - lejlot, Jan 13 '14 at 20:46

http://stackoverflow.com/questions/21099289/cantunderstandthecostfunctionforlinearregression 1/2
Yes, but I think the results of both are different. And the correct way for (a - b)^2 is a^2 - 2ab + b^2, not to subtract b from a first and then take the square of the result. Maybe I'm wrong; not pretty sure. Sorry for asking again and again. - Faheem, Jan 13 '14 at 20:57

Well, you are wrong. (a - b)^2 is equal to a^2 - 2ab + b^2, always. - lejlot, Jan 13 '14 at 20:58

Oh yeah, sure. Sorry, I was wrong. Thanks a lot. I got it. - Faheem, Jan 13 '14 at 21:03
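
The identity discussed in these comments is easy to check numerically. The following Python sketch (function names are made up for illustration) computes the square both ways and compares the results. The direct form uses one subtraction and one multiplication, which matches the 2 operations mentioned above, while the expanded form uses 6.

```python
import random


def direct_square(a, b):
    """Subtract first, then square: 2 arithmetic operations."""
    d = a - b
    return d * d


def expanded_square(a, b):
    """Expanded form a^2 - 2ab + b^2: 6 arithmetic operations."""
    return a * a - 2 * a * b + b * b


random.seed(0)  # fixed seed so the run is repeatable

for _ in range(5):
    a = random.uniform(-10, 10)
    b = random.uniform(-10, 10)
    # Both forms agree up to floating-point rounding.
    assert abs(direct_square(a, b) - expanded_square(a, b)) < 1e-9

print(direct_square(5, 3), expanded_square(5, 3))  # both are 4
```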
