
14/03/2016

Matrix Factorization: A Simple Tutorial and Implementation in Python @ quuxlabs

Matrix Factorization: A Simple Tutorial and Implementation in Python

By Albert Au Yeung on September 16, 2010
There is probably no need to say that there is too much information on the Web nowadays. Search engines help us a little bit. What is better is to have something interesting recommended to us automatically without asking. Indeed, from as simple as a list of the most popular bookmarks on Delicious, to some more personalized recommendations we receive on Amazon, we are usually offered recommendations on the Web.
Recommendations can be generated by a wide range of algorithms. While user-based or item-based collaborative filtering methods are simple and intuitive, matrix factorization techniques are usually more effective because they allow us to discover the latent features underlying the interactions between users and items. Of course, matrix factorization is simply a mathematical tool for playing around with matrices, and is therefore applicable in many scenarios where one would like to find out something hidden under the data.

In this tutorial, we will go through the basic ideas and the mathematics of matrix factorization, and then we will present a simple implementation in Python. We will proceed with the assumption that we are dealing with user ratings (e.g. an integer score from the range of 1 to 5) of items in a recommendation system.

Table of Contents:
Basic Ideas
The mathematics of matrix factorization
Regularization
Implementation in Python
Further Information
Source Code
References

Basic Ideas

Just as its name suggests, matrix factorization is, obviously, to factorize a matrix, i.e. to find out two (or more) matrices such that when you multiply them you will get back the original matrix.
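As a minimal sketch of this idea in numpy (the matrices here are arbitrary illustrative values, not learned factors):

```python
import numpy as np

# Two small hand-picked factor matrices, 4x2 and 2x3.
P = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [3.0, 1.0],
              [2.0, 2.0]])
Q = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

# Multiplying them yields a 4x3 matrix, so P and Q together
# are a factorization of that larger matrix.
R = np.dot(P, Q)   # R has shape (4, 3)
```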

http://www.quuxlabs.com/blog/2010/09/matrixfactorizationasimpletutorialandimplementationinpython/

As I have mentioned above, from an application point of view, matrix factorization can be used to discover latent features underlying the interactions between two different kinds of entities. (Of course, you can consider more than two kinds of entities and you will be dealing with tensor factorization, which would be more complicated.) And one obvious application is to predict ratings in collaborative filtering.

In a recommendation system such as Netflix or MovieLens, there is a group of users and a set of items (movies for the above two systems). Given that each user has rated some items in the system, we would like to predict how the users would rate the items that they have not yet rated, such that we can make recommendations to the users. In this case, all the information we have about the existing ratings can be represented in a matrix. Assume now we have 5 users and 4 items, and ratings are integers ranging from 1 to 5; the matrix may look something like this (a hyphen means that the user has not yet rated the movie):
     D1   D2   D3   D4
U1    5    3    -    1
U2    4    -    -    1
U3    1    1    -    5
U4    1    -    -    4
U5    -    1    5    4
Hence, the task of predicting the missing ratings can be considered as filling in the blanks (the hyphens in the matrix) such that the values would be consistent with the existing ratings in the matrix.

The intuition behind using matrix factorization to solve this problem is that there should be some latent features that determine how a user rates an item. For example, two users would give high ratings to a certain movie if they both like the actors/actresses of the movie, or if the movie is an action movie, which is a genre preferred by both users. Hence, if we can discover these latent features, we should be able to predict a rating with respect to a certain user and a certain item, because the features associated with the user should match with the features associated with the item.
In trying to discover the different features, we also make the assumption that the number of features would be smaller than the number of users and the number of items. It should not be difficult to understand this assumption because clearly it would not be reasonable to assume that each user is associated with a unique feature (although this is not impossible). And anyway, if this were the case there would be no point in making recommendations, because each of these users would not be interested in the items rated by other users. Similarly, the same argument applies to the items.

The mathematics of matrix factorization

Having discussed the intuition behind matrix factorization, we can now go on to work on the mathematics. Firstly, we have a set $U$ of users, and a set $D$ of items. Let $\mathbf{R}$ of size $|U| \times |D|$ be the matrix that contains all the ratings that the users have assigned to the items. Also, we assume that we would like to discover $K$ latent features. Our task, then, is to find two matrices $\mathbf{P}$ (a $|U| \times K$ matrix) and $\mathbf{Q}$ (a $|D| \times K$ matrix) such that their product approximates $\mathbf{R}$:

$$\mathbf{R} \approx \mathbf{P} \times \mathbf{Q}^T = \hat{\mathbf{R}}$$

In this way, each row of $\mathbf{P}$ would represent the strength of the associations between a user and the features. Similarly, each row of $\mathbf{Q}$ would represent the strength of the associations between an item and the features. To get the prediction of a rating of an item $d_j$ by user $u_i$, we can calculate the dot product of the two vectors corresponding to $u_i$ and $d_j$:

$$\hat{r}_{ij} = p_i^T q_j = \sum_{k=1}^{K} p_{ik} q_{kj}$$
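The prediction formula can be illustrated numerically; the feature vectors below are hypothetical values standing in for one learned row of $\mathbf{P}$ and one of $\mathbf{Q}$:

```python
import numpy as np

# Hypothetical learned feature vectors (K = 2) for one user and one item.
p_i = np.array([1.2, 0.8])   # user u_i's associations with the latent features
q_j = np.array([1.5, 0.5])   # item d_j's associations with the same features

# The predicted rating is the dot product of the two vectors.
r_hat = np.dot(p_i, q_j)     # 1.2*1.5 + 0.8*0.5, approximately 2.2
```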

Now, we have to find a way to obtain $\mathbf{P}$ and $\mathbf{Q}$. One way to approach this problem is to first initialize the two matrices with some values, calculate how 'different' their product is to $\mathbf{R}$, and then try to minimize this difference iteratively. Such a method is called gradient descent, aiming at finding a local minimum of the difference.

The difference here, usually called the error between the estimated rating and the real rating, can be calculated by the following equation for each user-item pair:

$$e_{ij}^2 = (r_{ij} - \hat{r}_{ij})^2 = \left(r_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\right)^2$$

Here we consider the squared error because the estimated rating can be either higher or lower than the real rating.


To minimize the error, we have to know in which direction we have to modify the values of $p_{ik}$ and $q_{kj}$. In other words, we need to know the gradient at the current values, and therefore we differentiate the above equation with respect to these two variables separately:

$$\frac{\partial}{\partial p_{ik}} e_{ij}^2 = -2(r_{ij} - \hat{r}_{ij}) q_{kj} = -2 e_{ij} q_{kj}$$

$$\frac{\partial}{\partial q_{kj}} e_{ij}^2 = -2(r_{ij} - \hat{r}_{ij}) p_{ik} = -2 e_{ij} p_{ik}$$

Having obtained the gradient, we can now formulate the update rules for both $p_{ik}$ and $q_{kj}$:

$$p'_{ik} = p_{ik} - \alpha \frac{\partial}{\partial p_{ik}} e_{ij}^2 = p_{ik} + 2\alpha e_{ij} q_{kj}$$

$$q'_{kj} = q_{kj} - \alpha \frac{\partial}{\partial q_{kj}} e_{ij}^2 = q_{kj} + 2\alpha e_{ij} p_{ik}$$

Here, $\alpha$ is a constant whose value determines the rate of approaching the minimum. Usually we will choose a small value for $\alpha$, say 0.0002. This is because if we make too large a step towards the minimum we may run the risk of missing the minimum and end up oscillating around it.
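A single update step can be sketched directly from these rules; all the values below are illustrative assumptions, and both updates deliberately use the old values of the vectors:

```python
import numpy as np

alpha = 0.0002               # learning rate, as suggested above
r_ij = 5.0                   # one observed rating
p_i = np.array([0.5, 0.5])   # hypothetical current user features (K = 2)
q_j = np.array([0.5, 0.5])   # hypothetical current item features

e_ij = r_ij - np.dot(p_i, q_j)           # error for this user-item pair
p_new = p_i + alpha * 2 * e_ij * q_j     # update rule for p_ik, all k at once
q_new = q_j + alpha * 2 * e_ij * p_i     # update rule for q_kj, all k at once

# After the step, the prediction has moved slightly towards r_ij.
```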
A question might have come to your mind by now: if we find two matrices $\mathbf{P}$ and $\mathbf{Q}$ such that $\mathbf{P} \times \mathbf{Q}^T$ approximates $\mathbf{R}$, won't our predictions of all the unseen ratings all be zeros? In fact, we are not really trying to come up with $\mathbf{P}$ and $\mathbf{Q}$ such that we can reproduce $\mathbf{R}$ exactly. Instead, we will only try to minimise the errors of the observed user-item pairs. In other words, if we let $T$ be a set of tuples, each of which is in the form of $(u_i, d_j, r_{ij})$, such that $T$ contains all the observed user-item pairs together with the associated ratings, we are only trying to minimise every $e_{ij}$ for $(u_i, d_j, r_{ij}) \in T$. (In other words, $T$ is our set of training data.) As for the rest of the unknowns, we will be able to determine their values once the associations between the users, items and features have been learnt.

Using the above update rules, we can then iteratively perform the operation until the error converges to its minimum. We can check the overall error as calculated using the following equation and determine when we should stop the process:

$$E = \sum_{(u_i, d_j, r_{ij}) \in T} e_{ij}^2 = \sum_{(u_i, d_j, r_{ij}) \in T} \left(r_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\right)^2$$
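The overall error $E$ sums only over the observed pairs; a small illustration with made-up training tuples and factor matrices:

```python
import numpy as np

# T holds the observed (user index, item index, rating) training tuples.
T = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.5)]
P = np.array([[2.0, 1.0], [1.5, 1.0]])   # hypothetical user-feature matrix
Q = np.array([[2.0, 1.0], [1.0, 1.0]])   # hypothetical item-feature matrix (rows = items)

# Only the pairs in T contribute; unobserved entries are ignored entirely.
E = sum((r - np.dot(P[i], Q[j])) ** 2 for i, j, r in T)
```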


Regularization

The above algorithm is a very basic algorithm for factorizing a matrix. There are a lot of methods to make things look more complicated. A common extension to this basic algorithm is to introduce regularization to avoid overfitting. This is done by adding a parameter $\beta$ and modifying the squared error as follows:

$$e_{ij}^2 = \left(r_{ij} - \sum_{k=1}^{K} p_{ik} q_{kj}\right)^2 + \frac{\beta}{2} \sum_{k=1}^{K} \left(p_{ik}^2 + q_{kj}^2\right)$$

In other words, the new parameter $\beta$ is used to control the magnitudes of the user-feature and item-feature vectors, such that $\mathbf{P}$ and $\mathbf{Q}$ would give a good approximation of $\mathbf{R}$ without having to contain large numbers. In practice, $\beta$ is set to some small value such as 0.02. The new update rules for this squared error can be obtained by a procedure similar to the one described above. The new update rules are as follows:

$$p'_{ik} = p_{ik} + \alpha \left(2 e_{ij} q_{kj} - \beta p_{ik}\right)$$

$$q'_{kj} = q_{kj} + \alpha \left(2 e_{ij} p_{ik} - \beta q_{kj}\right)$$
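With regularization, the update step is almost identical, except that $\beta$ shrinks each parameter towards zero; again, all the values below are illustrative:

```python
import numpy as np

alpha, beta = 0.0002, 0.02   # learning rate and regularization parameter
r_ij = 5.0                   # one observed rating
p_i = np.array([0.5, 0.5])   # hypothetical current user features
q_j = np.array([0.5, 0.5])   # hypothetical current item features

e_ij = r_ij - np.dot(p_i, q_j)
# Regularized update rules: the -beta terms keep the factors small.
p_new = p_i + alpha * (2 * e_ij * q_j - beta * p_i)
q_new = q_j + alpha * (2 * e_ij * p_i - beta * q_j)
```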

Implementation in Python

Once we have derived the update rules as described above, it actually becomes very straightforward to implement the algorithm. The following is a function that implements the algorithm in Python (note that this implementation requires the numpy module).

Note: The complete Python code is available for download in the section Source Code at the end of this post.
import numpy

def matrix_factorization(R, P, Q, K, steps=5000, alpha=0.0002, beta=0.02):
    Q = Q.T
    for step in range(steps):
        # Update P and Q for every observed rating (0 means "not rated").
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    eij = R[i][j] - numpy.dot(P[i,:], Q[:,j])
                    for k in range(K):
                        P[i][k] = P[i][k] + alpha * (2 * eij * Q[k][j] - beta * P[i][k])
                        Q[k][j] = Q[k][j] + alpha * (2 * eij * P[i][k] - beta * Q[k][j])
        # Compute the overall regularized error over the observed ratings.
        e = 0
        for i in range(len(R)):
            for j in range(len(R[i])):
                if R[i][j] > 0:
                    e = e + pow(R[i][j] - numpy.dot(P[i,:], Q[:,j]), 2)
                    for k in range(K):
                        e = e + (beta/2) * (pow(P[i][k], 2) + pow(Q[k][j], 2))
        if e < 0.001:
            break
    return P, Q.T

We can try to apply it to our example mentioned above and see what we would get. Below is a code snippet in Python for running the example.
R = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
]

R = numpy.array(R)

N = len(R)
M = len(R[0])
K = 2

P = numpy.random.rand(N, K)
Q = numpy.random.rand(M, K)

nP, nQ = matrix_factorization(R, P, Q, K)
nR = numpy.dot(nP, nQ.T)

And the matrix obtained from the above process would look something like this:
     D1     D2     D3     D4
U1   4.97   2.98   2.18   0.98
U2   3.97   2.40   1.97   0.99
U3   1.02   0.93   5.32   4.93
U4   1.00   0.85   4.59   3.93
U5   1.36   1.07   4.89   4.12

We can see that for existing ratings we have approximations very close to the true values, and we also get some 'predictions' of the unknown values. In this simple example, we can easily see that U1 and U2 have similar taste and they both rated D1 and D2 high, while the rest of the users preferred D3 and D4. When the number of features (K in the Python code) is 2, the algorithm is able to associate the users and items to two different features, and the predictions also follow these associations. For example, we can see that the predicted rating of U4 on D3 is 4.59, because U4 and U5 both rated D4 high.

Further Information

We have discussed the intuitive meaning of the technique of matrix factorization and its use in collaborative filtering. In fact, there are many different extensions to the above technique. An important extension is the requirement that all the elements of the factor matrices ($\mathbf{P}$ and $\mathbf{Q}$ in the above example) should be non-negative. In this case it is called non-negative matrix factorization (NMF). One advantage of NMF is that it results in intuitive meanings of the resultant matrices. Since no elements are negative, the process of multiplying the resultant matrices to get back the original matrix would not involve subtraction, and can be considered as a process of generating the original data by linear combinations of the latent features.
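For experimentation, scikit-learn ships an NMF implementation; the sketch below assumes scikit-learn is installed, and note that its NMF treats the zeros in R as actual zero ratings rather than missing values, so this is only an illustration of the non-negativity constraint, not a drop-in replacement for the algorithm above:

```python
import numpy as np
from sklearn.decomposition import NMF

# The example rating matrix from this tutorial (0 = unobserved).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)

model = NMF(n_components=2, init='random', random_state=0, max_iter=500)
W = model.fit_transform(R)   # user-feature matrix, all entries >= 0
H = model.components_        # feature-item matrix, all entries >= 0
approx = np.dot(W, H)        # a (5, 4) non-negative approximation of R
```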

Source Code

The full Python source code of this tutorial is available for download at:
mf.py

References

There have been quite a lot of references on matrix factorization. Below are some of the related papers.

Gábor Takács et al (2008). Matrix factorization and neighbor based algorithms for the Netflix prize problem. In: Proceedings of the 2008 ACM Conference on Recommender Systems, Lausanne, Switzerland, October 23-25, 267-274.

Patrick Ott (2008). Incremental Matrix Factorization for Collaborative Filtering. Science, Technology and Design 01/2008, Anhalt University of Applied Sciences.

Daniel D. Lee and H. Sebastian Seung (2001). Algorithms for Non-negative Matrix Factorization. Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference. MIT Press. pp. 556-562.

Daniel D. Lee and H. Sebastian Seung (1999). Learning the parts of objects by non-negative matrix factorization. Nature, Vol. 401, No. 6755. (21 October 1999), pp. 788-791.
