
Analyzing Affiliation Networks

Stephen P. Borgatti and Daniel S. Halgin
LINKS Center for Social Network Analysis
Gatton College of Business and Economics
University of Kentucky
Lexington, KY 40506 USA

In social network analysis, the term "affiliations" usually refers to membership or participation data, such as when we have data on which actors have participated in which events. Often, the assumption is that co-membership in groups or events is an indicator of an underlying social tie. For example, Davis, Gardner and Gardner (1941) used data provided by the society pages of a local newspaper to uncover distinct social circles among a set of society women. Similarly, Domhoff (1967) and others have used co-membership in corporate boards to search for social elites (e.g., Allen, 1974; Carroll, Fox and Ornstein, 1982; Galaskiewicz, 1985; Westphal and Khanna, 2003). Alternatively, we can see co-participation as providing opportunities for social ties to develop, which in turn provide opportunities for things like ideas to flow between actors. For example, Davis (1991; Davis and Greeve, 1997) studied the diffusion of corporate practices such as poison pills and golden parachutes. He finds evidence that poison pills diffuse through chains of interlocking directorates, where board members who sit on multiple boards serve as conduits of strategic information between the different firms. An important advantage of affiliation data, especially in the case of studying elites, is that affiliations are often observable from a distance (e.g., government records, newspaper reports), without having to have special access to the actors.

In this chapter, we focus on issues involving the analysis of affiliation data, as opposed to the collection or the theoretical interpretation of affiliation data.

Basic Concepts & Terminology

Affiliations data consist of a set of binary relationships between members of two sets of items. For example, the well-known dataset collected by Davis, Gardner and Gardner (1941) records which women attended which social events in a small southern town. Thus, there are two sets of items, women and events, and there is a binary relation that connects them, namely the "attended" relation. Figure 1 gives the Davis, Gardner and Gardner (henceforth, DGG) data matrix in its original form. The rows correspond to the women and the columns are the events they attended.

            E1  E2  E3  E4  E5  E6  E7  E8  E9 E10 E11 E12 E13 E14
EVELYN       1   1   1   1   1   1   0   1   1   0   0   0   0   0
LAURA        1   1   1   0   1   1   1   1   0   0   0   0   0   0
THERESA      0   1   1   1   1   1   1   1   1   0   0   0   0   0
BRENDA       1   0   1   1   1   1   1   1   0   0   0   0   0   0
CHARLOTTE    0   0   1   1   1   0   1   0   0   0   0   0   0   0
FRANCES      0   0   1   0   1   1   0   1   0   0   0   0   0   0
ELEANOR      0   0   0   0   1   1   1   1   0   0   0   0   0   0
PEARL        0   0   0   0   0   1   0   1   1   0   0   0   0   0
RUTH         0   0   0   0   1   0   1   1   1   0   0   0   0   0
VERNE        0   0   0   0   0   0   1   1   1   0   0   1   0   0
MYRNA        0   0   0   0   0   0   0   1   1   1   0   1   0   0
KATHERINE    0   0   0   0   0   0   0   1   1   1   0   1   1   1
SYLVIA       0   0   0   0   0   0   1   1   1   1   0   1   1   1
NORA         0   0   0   0   0   1   1   0   1   1   1   1   1   1
HELEN        0   0   0   0   0   0   1   1   0   1   1   1   0   0
DOROTHY      0   0   0   0   0   0   0   1   1   0   0   0   0   0
OLIVIA       0   0   0   0   0   0   0   0   1   0   1   0   0   0
FLORA        0   0   0   0   0   0   0   0   1   0   1   0   0   0

Figure 1. DGG women-by-events matrix

In general, the kinds of binary relations we consider affiliations are limited to part/whole relations such as "is a member of" or "is a participant in" or "has" (in the sense of having a trait). Examples of affiliations data that have found their way into the social science literature include corporate board memberships (e.g., Mizruchi, 1983, 1992, 1996; Carroll, Fox and Ornstein, 1982; Davis, 1991; Lester and Canella, 2006; Robins and Alexander, 2004; Westphal, 1998), attendance at events (e.g., Davis, Gardner and Gardner, 1941; Faust, Willber, Rowlee and Skvoretz, 2002), membership in clubs (e.g., McPherson, 1982; McPherson and Smith-Lovin, 1986, 1987), participation in online groups (Allatta, 2003, 2005), authorship of articles (e.g., Gmür, 2006; Lazer, Mergel and Friedman, 2009; Newman, 2001), membership in production teams (Uzzi and Spiro, 2005), and even course-taking patterns of high school students (e.g., Field, Frank, Schiller, Riegle-Crumb, Muller, 2006). In addition, affiliations data are well-known outside the social sciences, as in the species-by-trait matrices of numerical taxonomy (Sokal and Sneath, 1973).

We can represent affiliations as mathematical graphs (Harary, 1969) in which nodes correspond to entities (such as women and events) and lines correspond to ties of affiliation among the entities. Figure 2 provides a representation of the DGG data. Affiliations graphs are distinctive in having the property of bipartiteness, which means that the graph's nodes can be partitioned into two classes such that all ties occur only between classes and never within classes. We see in Figure 2 that there are only lines between women and the events which they attended. While all affiliation graphs are bipartite, in our view the reverse is not necessarily true. In empirical network data, graphs can be bipartite by chance alone, perhaps because of sampling error. What makes affiliation graphs different is that the two node sets are different kinds of entities, and the lack of ties within sets is by design, not happenstance.

Formally, we define an affiliation graph as a bipartite graph G(V1, V2, E), in which V1 and V2 are sets of nodes corresponding to different classes of entities, and E is an affiliation relation that maps the elements of V1 to V2. The relation is typically conceived as a set of unordered pairs in which one element of each pair belongs to V1 and the other belongs to V2. In contexts where we are discussing multiple graphs, we use the notation V1(G) to indicate the V1 node set in graph G, and E(H) to refer to the ties in graph H.
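The formal definition of G(V1, V2, E) can be made concrete with a small sketch. The function name `is_affiliation_graph` and the three-woman fragment of the DGG data are illustrative choices, not part of the chapter; the check simply confirms that the two node sets are disjoint and that every tie runs between them.

```python
def is_affiliation_graph(v1, v2, ties):
    """Return True if v1 and v2 are disjoint and every tie joins V1 to V2.

    ties is an iterable of unordered pairs (u, v).
    """
    v1, v2 = set(v1), set(v2)
    if v1 & v2:
        return False  # the two classes of entities must not share nodes
    for u, v in ties:
        between_sets = (u in v1 and v in v2) or (u in v2 and v in v1)
        if not between_sets:
            return False  # a within-set tie (or an unknown node) violates the design
    return True


# Illustrative fragment of the DGG data: three women and the first three events.
women = {"EVELYN", "LAURA", "THERESA"}
events = {"E1", "E2", "E3"}
ties = [
    ("EVELYN", "E1"), ("EVELYN", "E2"), ("EVELYN", "E3"),
    ("LAURA", "E1"), ("LAURA", "E2"), ("LAURA", "E3"),
    ("THERESA", "E2"), ("THERESA", "E3"),
]

print(is_affiliation_graph(women, events, ties))                          # True
print(is_affiliation_graph(women, events, ties + [("EVELYN", "LAURA")]))  # False
```

Note that the check takes the partition as given, which mirrors the point made above: in an affiliation graph the two node sets are known in advance rather than discovered from the pattern of ties.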

Figure 2. DGG women-by-events graph

Affiliation graphs or networks are often called "2-mode" graphs. The terminology of "modes" refers to the number of different kinds of entities referenced in the rows and columns of a matrix. A 1-mode matrix is square, and its rows and columns refer to the same set of entities (a single mode). An example, drawn from the famous Hawthorne studies (Roethlisberger and Dickson, 1939), is shown in Figure 3.¹

      I1  I3  W1  W2  W3  W4  W5  W6  W7  W8  W9  S1  S2  S4
I1     0   0   0   0   1   0   0   0   0   0   0   0   0   0
I3     0   0   0   0   0   0   0   0   0   0   0   0   0   0
W1     0   0   0   0   1   1   0   0   0   0   0   1   0   0
W2     0   0   0   0   0   0   0   0   0   0   0   0   0   0
W3     1   0   1   0   0   1   0   0   0   0   0   1   0   0
W4     0   0   1   0   1   0   0   0   0   0   0   1   0   0
W5     0   0   0   0   0   0   0   0   0   0   0   0   0   0
W6     0   0   0   0   0   0   0   0   0   0   0   0   0   0
W7     0   0   0   0   0   0   0   0   0   1   1   1   0   0
W8     0   0   0   0   0   0   0   0   1   0   1   0   0   1
W9     0   0   0   0   0   0   0   0   1   1   0   0   0   1
S1     0   0   1   0   1   1   0   0   1   0   0   0   0   0
S2     0   0   0   0   0   0   0   0   0   0   0   0   0   0
S4     0   0   0   0   0   0   0   0   0   1   1   0   0   0

Figure 3. 1-mode person-by-person positive relationship matrix

In contrast, a 2-mode matrix is rectangular and the rows and columns refer to two different sets of entities (two modes). For example, Figure 4 shows a 2-mode, n-by-m person-by-group incidence matrix that is also based on the Hawthorne data. An incidence matrix has rows corresponding to nodes and columns corresponding to n-ary edges (also called hyperedges) that connect sets of nodes. In this case, the matrix indicates each individual's membership in each of five different groups.² The matrix clearly represents affiliations, and indeed all affiliation graphs can be represented as 2-mode matrices, where the two modes correspond to the affiliation graph's two node sets.

     Gr1 Gr2 Gr3 Gr4 Gr5
I1     1   0   0   0   0
I3     0   0   0   0   0
W1     1   1   1   0   0
W2     1   1   0   0   0
W3     1   1   1   0   0
W4     1   1   1   0   0
W5     0   0   1   0   0
W6     0   0   0   1   0
W7     0   0   0   1   1
W8     0   0   0   1   1
W9     0   0   0   1   1
S1     0   1   1   0   0
S2     0   0   0   0   0
S4     0   0   0   0   1

Figure 4. 2-mode person-by-group matrix

It is important to note that while affiliation graphs can be represented by 2-mode matrices, not all 2-mode matrices are considered affiliation graphs. For example, a standard sociological case-by-variables matrix (e.g., person by demographics) might be seen as 2-mode, but would not normally be called affiliations. The term affiliations is reserved for the case when the data consist of some kind of participation or membership, as in people in events, projects, or groups.³ In this chapter we focus on affiliations data, but the techniques we discuss apply to 2-mode data in general.

¹ The node labels indicate whether the individual is an Inspector (I), a Worker (W), or a Supervisor (S).

² The groups were constructed by the present authors for illustrative purposes, based on a clique analysis.

³ This is not to imply that the data must be binary, as we could have data in which persons have a degree of membership or participation in various groups or events.

Co-Affiliation

In some cases, the purpose of collecting affiliations data is not to understand the pattern of ties between the two sets, but to understand the pattern of ties within one of the sets. It would seem perverse, in that case, to collect affiliations data, since by definition affiliations data do not include ties among members of either set. However, given affiliations data, we can in fact construct some kind of tie among members of a node set simply by defining co-affiliation (e.g., attendance at the same events, membership on the same corporate board) as a tie. For example, for the DGG dataset, we can construct a woman-by-woman matrix S in which s_ij gives the number of events that woman i and woman j attended together (see Figure 5). If we like, we can then dichotomize so that there is a tie between two women if and only if they co-attended at least some number of events. Thus, affiliations data give rise to co-affiliation data, which constitute some kind of tie among nodes within a set.

           EVE LAU THE BRE CHA FRA ELE PEA RUT VER MYR KAT SYL NOR HEL DOR OLI FLO
EVELYN       8   6   7   6   3   4   3   3   3   2   2   2   2   2   1   2   1   1
LAURA        6   7   6   6   3   4   4   2   3   2   1   1   2   2   2   1   0   0
THERESA      7   6   8   6   4   4   4   3   4   3   2   2   3   3   2   2   1   1
BRENDA       6   6   6   7   4   4   4   2   3   2   1   1   2   2   2   1   0   0
CHARLOTTE    3   3   4   4   4   2   2   0   2   1   0   0   1   1   1   0   0   0
FRANCES      4   4   4   4   2   4   3   2   2   1   1   1   1   1   1   1   0   0
ELEANOR      3   4   4   4   2   3   4   2   3   2   1   1   2   2   2   1   0   0
PEARL        3   2   3   2   0   2   2   3   2   2   2   2   2   2   1   2   1   1
RUTH         3   3   4   3   2   2   3   2   4   3   2   2   3   2   2   2   1   1
VERNE        2   2   3   2   1   1   2   2   3   4   3   3   4   3   3   2   1   1
MYRNA        2   1   2   1   0   1   1   2   2   3   4   4   4   3   3   2   1   1
KATHERINE    2   1   2   1   0   1   1   2   2   3   4   6   6   5   3   2   1   1
SYLVIA       2   2   3   2   1   1   2   2   3   4   4   6   7   6   4   2   1   1
NORA         2   2   3   2   1   1   2   2   2   3   3   5   6   8   4   1   2   2
HELEN        1   2   2   2   1   1   2   1   2   3   3   3   4   4   5   1   1   1
DOROTHY      2   1   2   1   0   1   1   2   2   2   2   2   2   1   1   2   1   1
OLIVIA       1   0   1   0   0   0   0   1   1   1   1   1   1   2   1   1   2   2
FLORA        1   0   1   0   0   0   0   1   1   1   1   1   1   2   1   1   2   2

Figure 5. DGG women-by-women matrix of overlaps across events

One justification for relying on co-affiliation is the idea that co-affiliation provides the conditions for the development of social ties of various kinds. For example, the more often people attend the same events, the more likely it is they will interact and develop some kind of relationship. Feld (1981) suggests that individuals whose activities are organized around the same focus (e.g., voluntary organization, workplaces, hangouts, family, etc.) frequently become interpersonally connected over time. Physical proximity (which is simply co-affiliation with respect to spatial coordinates) is also clearly a major factor in enabling and, in the breach, preventing interaction (Allen, 1977). Another justification is almost the reverse of the first, namely that common affiliations can be the consequence of having a tie. For example, married couples attend a great number of events together, and belong to a great number of groups together, and indeed may come to share a great number of activities, interests and beliefs. Thus, co-affiliation can be viewed as an observable manifestation of a social relation that is perhaps unobservable directly (such as feelings).

If either of these justifications is valid, then we may collect affiliations data simply because it is more convenient than collecting direct ties among a set of nodes. For example, if we are interested in studying relationships among celebrities, we could try to interview them about their ties with other celebrities, but this could be quite difficult to arrange. If justifiable, it would most certainly be easier to simply read celebrity news and record who has attended what Hollywood social event, or who has worked on what project.
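As a minimal sketch of the construction described in this section, the co-affiliation counts s_ij can be obtained by intersecting each pair of women's event sets. The attendance lists below are transcribed from the DGG matrix in Figure 1; the function names and the threshold of 3 are assumptions made only for illustration.

```python
# Events attended by each woman, transcribed from Figure 1 (k stands for column Ek).
ATTENDANCE = {
    "EVELYN":    [1, 2, 3, 4, 5, 6, 8, 9],
    "LAURA":     [1, 2, 3, 5, 6, 7, 8],
    "THERESA":   [2, 3, 4, 5, 6, 7, 8, 9],
    "BRENDA":    [1, 3, 4, 5, 6, 7, 8],
    "CHARLOTTE": [3, 4, 5, 7],
    "FRANCES":   [3, 5, 6, 8],
    "ELEANOR":   [5, 6, 7, 8],
    "PEARL":     [6, 8, 9],
    "RUTH":      [5, 7, 8, 9],
    "VERNE":     [7, 8, 9, 12],
    "MYRNA":     [8, 9, 10, 12],
    "KATHERINE": [8, 9, 10, 12, 13, 14],
    "SYLVIA":    [7, 8, 9, 10, 12, 13, 14],
    "NORA":      [6, 7, 9, 10, 11, 12, 13, 14],
    "HELEN":     [7, 8, 10, 11, 12],
    "DOROTHY":   [8, 9],
    "OLIVIA":    [9, 11],
    "FLORA":     [9, 11],
}


def coaffiliation(attendance):
    """Return S as a nested dict: S[i][j] = number of events i and j both attended."""
    event_sets = {name: set(events) for name, events in attendance.items()}
    return {
        i: {j: len(event_sets[i] & event_sets[j]) for j in event_sets}
        for i in event_sets
    }


def dichotomize(matrix, threshold):
    """Tie between i and j (i != j) if and only if they share at least `threshold` events."""
    return {
        i: {j: int(i != j and value >= threshold) for j, value in row.items()}
        for i, row in matrix.items()
    }


S = coaffiliation(ATTENDANCE)
binary = dichotomize(S, threshold=3)  # hypothetical cut-off, for illustration only

print(S["EVELYN"]["EVELYN"], S["EVELYN"]["LAURA"])             # 8 6, as in Figure 5
print(binary["EVELYN"]["PEARL"], binary["EVELYN"]["VERNE"])    # 1 0
```

The diagonal of S holds each woman's total number of events attended, which is why it is excluded when ties are dichotomized.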

In deciding whether to use affiliations data as a proxy for social relations, it is useful to think about the conditions under which any of these justifications is likely to prove valid. One consideration is the size of affiliation events. For example, suppose we have a person-by-club matrix indicating who is a member of which club. If the clubs are small (like a board of directors), then our justifications seem, well, justifiable. But if the clubs are large (on the order of thousands of members), co-membership may indicate very little about the social tie between a given pair of members. Two people can be members of all the same (large) clubs or attend all the same (large) events, and yet not even be aware of each other's existence and never end up meeting.

It should also be noted that in adopting co-affiliations as a proxy for social ties, we confound the concept of social proximity with that of social similarity, which in other contexts are treated as competing alternatives (Burt, 1987; Friedkin, 1984). To see that co-affiliations are similarities data, consider the woman-by-woman co-affiliation network in Figure 5, constructed from the original 2-mode woman-by-event attendance data. For each pair of women, we look at their respective rows in X, and count the number of times that they have 1s in the same places. This is simply an unnormalized measure of similarity of rows. In effect, for any pair of women we construct a simple 2-by-2 contingency table, as shown in Figure 6, that shows the relationship between their pair of rows.

                   Woman j
                   1      0
Woman i     1      a      b      a+b
            0      c      d      c+d
                   a+c    b+d    n

Figure 6. Contingency table

The quantity a gives the number of times that the pair of women co-attended an event. The quantity a+b gives the total number of events that woman i attended, and a+c gives the corresponding value for woman j. The quantity n is simply the number of events, that is, the number of columns in matrix X. A simple way to bound a between 0 and 1 and promote comparability across datasets is to divide a by n, as shown in Equation 1.

$$a^{*} = \frac{a}{n}$$
Equation 1

Bounding a by the maximum possible score introduces the notion of other normalizations that take into account characteristics of the women, such as the number of events they attended. For example, if woman i and woman j attend three events in common, and woman k and woman l do as well, we would likely regard the two pairs as equally close. But if we knew that i and j each only attended 3 events, whereas k and l each attended 14 events, we would be more likely to conclude that the 100% overlap between i and j signals greater closeness than the 21% overlap between k and l. Therefore, if we wanted to normalize the quantity a for the number of events that each woman attended, we might divide a by the minimum of a+b and a+c, as shown in Equation 2. The resulting coefficient runs between 0 and 1, where 1 indicates the maximum possible overlap given the number of events attended by i and j. This approach takes into account that the number of overlaps between two women cannot exceed the number of events that either attended.


$$a^{*}_{ij} = \frac{a}{\min(a + b,\; a + c)}$$
Equation 2

Another well-known approach to normalizing a is provided by the Jaccard coefficient, which is described by Equation 3. It gives the number of events attended in common as a proportion of events that are "attendable", as determined by the fact that at least one of the two women attended the event.

$$a^{*}_{ij} = \frac{a}{a + b + c}$$
Equation 3

Alternatively, we could take a+d as a raw measure of social closeness. By including d, we effectively argue that choosing not to attend a given event is as much of a statement of social allegiance as attending an event. A well-known normalization of a+d is given by Equation 4, which is equal to the simple Pearson correlation between rows i and j of matrix X. Here u_i and s_i denote the mean and standard deviation of row i.

$$r_{ij} = \frac{\frac{1}{n}\sum_{k} x_{ik} x_{jk} - u_i u_j}{s_i s_j}$$
Equation 4
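The normalizations in Equations 1 to 4 can be sketched directly from the contingency table of Figure 6. The function names are illustrative, the two rows are those of Evelyn and Laura in Figure 1, and the sketch assumes that neither woman attended none or all of the events (otherwise some denominators are zero).

```python
from math import sqrt


def contingency(row_i, row_j):
    """Return the cells (a, b, c, d) of the 2-by-2 table for two 0/1 rows."""
    pairs = list(zip(row_i, row_j))
    a = sum(1 for x, y in pairs if x == 1 and y == 1)  # both attended
    b = sum(1 for x, y in pairs if x == 1 and y == 0)  # only i attended
    c = sum(1 for x, y in pairs if x == 0 and y == 1)  # only j attended
    d = sum(1 for x, y in pairs if x == 0 and y == 0)  # neither attended
    return a, b, c, d


def overlap_measures(row_i, row_j):
    """Equations 1 to 4 for a pair of rows; assumes no row is all 0s or all 1s."""
    a, b, c, d = contingency(row_i, row_j)
    n = a + b + c + d
    u_i, u_j = (a + b) / n, (a + c) / n      # row means
    s_i = sqrt(u_i * (1 - u_i))              # standard deviation of a 0/1 row
    s_j = sqrt(u_j * (1 - u_j))
    return {
        "share_of_events": a / n,                         # Equation 1
        "min_overlap": a / min(a + b, a + c),             # Equation 2
        "jaccard": a / (a + b + c),                       # Equation 3
        "pearson": (a / n - u_i * u_j) / (s_i * s_j),     # Equation 4
    }


# Rows of Evelyn and Laura from Figure 1 (events E1 to E14).
evelyn = [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0]
laura = [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(contingency(evelyn, laura))   # (6, 2, 1, 5)
measures = overlap_measures(evelyn, laura)
print(round(measures["jaccard"], 3), round(measures["pearson"], 3))   # 0.667 0.577
```

For the worked example in the text, a pair with a = 3 and row totals of 3 gives a minimum-overlap score of 3/3 = 1, whereas a pair with a = 3 and row totals of 14 gives 3/14, or about 0.21.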

Another approach, devised specifically for affiliations data, is provided by Bonacich (1972), who proposes normalizing the co-occurrence matrix according to Equation 5. Effectively, this measure gives the extent to which the overlap observed between i and j exceeds the amount of overlap we would expect by chance, given the number of events that i and j each attended.

$$a^{*}_{ij} = \frac{ad - \sqrt{adbc}}{ad - bc}, \quad \text{for } ad \neq bc$$
Equation 5

All of these normalizations essentially shift the nature of co-affiliation data from frequencies of co-occurrences to tendencies or revealed preferences to co-occur. If we interpret frequencies of co-occurrences as giving the number of opportunities for interaction or flow of information or goods, then the raw, unnormalized measures are the appropriate indices for measuring co-affiliation. In contrast, if the reason for studying affiliations is that co-affiliations reveal otherwise unseen relationships between people (e.g., sociometric preferences), the normalized measures are the most appropriate, as they essentially give us the tendency or preference for a pair of women to co-occur while controlling for nuisance variables such as the number of times a woman was observed. The normalized measures tell us how often two women are co-attending relative to the number of times they could have.

Consider the following hypothetical research project. Say that we are interested in analyzing connections between a group of 13 individuals based on their memberships in different social clubs (16 of them). Because we are interested in understanding relationships among the 13 individuals, we convert the affiliations data (person-by-social-club) into co-affiliations (person-by-person). We construct both a raw unnormalized co-affiliation matrix and a normalized co-affiliation matrix. Figure 7 is a graphical representation of the raw co-affiliation network using a standard graph layout algorithm. Individuals are labeled a through m. A line connecting two individuals indicates that they are members of at least two of the same social clubs. Node size varies by the number of social clubs that each individual is a member of; thus the larger the node, the more socially active the individual. Figure 8 is a depiction of Jaccard coefficients for each pair of individuals, such that a line connecting two individuals indicates that their social club membership profiles are correlated at greater than 0.38.

Figure 7. Co-membership in 2 or more social clubs. Node size is based on the number of social clubs that each individual is a member of.

Figure 8. Spring embedding of Jaccard coefficients. An edge is shown if c_ij > 0.38. Node size is based on the number of social clubs that each individual is a member of.

The raw co-affiliation network (presented in Figure 7) can be described as a core-periphery structure in that there are a set of core individuals who are members of multiple social clubs (persons e, f, g, h, i) surrounded by a collection of less connected individuals. We see that there are opportunities for interaction between many of the 13 individuals. However, the high social activity of the core individuals places them in the middle of the graph, which tends to obscure any subgrouping structure. Now consider the Jaccard similarity network (presented in Figure 8). This graph effectively highlights that there are two groupings of individuals with different membership profiles. The graph also effectively reveals the bridging role of individual i, which was not at all discernible when visualizing unnormalized co-occurrences among the individuals (see Figure 9).

Another kind of normalization worth mentioning has to do with the size of the events (or social clubs) that the individuals are affiliated with. If, in analyzing co-affiliation data, we are taking the point of view that greater co-affiliation creates more opportunities for social ties to develop, then when measuring person-to-person co-affiliations, we would probably want to take into account the

relative sizes of different events. For example, in the DGG data, if two women co-attend an event that included just five people in total, it would seem that the likelihood of being aware of each other, of meeting, and indeed of changing their relationship is reasonably high. We would want to give that event a lot of weight. On the other hand, if the same women co-attend an event in which thousands are present (such as a concert), we might want to weight that very little. An obvious approach, then, is to weight events inversely by their size. Thus, in Figure 6, the quantity n becomes the sum of weights of all events, and the quantity a is the sum of weights of the events that were co-attended by i and j. The measures described by Equations 1 to 4 can then be computed without modification.

Table 1 summarizes which normalization approaches are appropriate given one's attitude toward the nature of the co-affiliation data. For convenience, it is assumed that the 2-mode affiliations data are actor-by-event, and that we are interested in constructing the actor-by-actor co-affiliation matrix. As such, we refer to the actors/rows as "variables" and the events/columns as "cases". Therefore, the first kind of normalization discussed above can be referred to as "variable normalization" and the second as "case normalization".

Table 1. Appropriate normalizations by view of data

Co-Affiliation as Opportunity                  Co-Affiliation as Indicator
No normalization                               Variable normalization
(simple overlap counts)                        (e.g., Jaccard or Pearson correlations)
Case normalization
(e.g., weighting inversely by event sizes)

Analysis of Co-Affiliation

Having constructed a co-affiliation matrix, we would typically want to analyze the data using all the tools of social network analysis, as with any other kind of tie. For the most part, this is unproblematic, aside from the caveats already voiced. The biggest issue we typically encounter is that the co-affiliation matrix is valued and many network analytic techniques assume binary data, particularly those techniques with graph-theoretic roots. In those cases, the data will need to be dichotomized, and since the level of dichotomization is arbitrary, the normal procedure is to dichotomize at different levels and obtain centrality measures for networks constructed with different thresholds for what is considered a tie.

In other cases, there will be no need for dichotomization. For example, eigenvector centrality (Bonacich, 1972) and beta centrality (Bonacich, 1987, 2007) are quite happy to accept valued data, particularly when the values are positive in the sense that larger values can be interpreted as enhancing flows or coordination. Other centrality measures need to be modified to work with valued data. In general, measures based on lengths of paths, such as betweenness and closeness centrality, can easily be modified to handle valued data, provided the data can be sensibly transformed into distances or costs (Brandes, 2001). For example, the number of events co-attended by two women can be subtracted from the number of events in total and then submitted to a valued betweenness analysis.

Another possible difficulty with co-affiliation data is that similarity metrics tend to have certain mathematical properties that social networks in general need not have. For example, most similarity metrics are symmetric so that s(u,v) = s(v,u). We can construct non-symmetric similarity measures, but these are rarely used and none of the ones we consider above are non-symmetric. Similarity matrices such as Pearson correlation matrices have numerous other properties as well, such as being positive semi-definite (i.e., all eigenvalues are non-negative). The main consequence is that the norms or baseline expectations for network measures on co-affiliation data should not be based on norms or expectations developed for sociometric data in general (cf. Wang, Sharpe, Robins and Pattison, 2009).

At this point, we leave the discussion of co-affiliation data, and focus entirely on visualizing and analyzing affiliation graphs directly, without converting to co-affiliations.
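Before moving on, two of the co-affiliation steps discussed above can be sketched in a few lines: weighting events inversely by their size, and turning co-attendance counts into costs by subtracting them from the total number of events. The four-person attendance data and all function names are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def event_weights(attendance):
    """Weight each event by the inverse of its number of attendees."""
    sizes = {}
    for events in attendance.values():
        for event in events:
            sizes[event] = sizes.get(event, 0) + 1
    return {event: 1.0 / size for event, size in sizes.items()}


def weighted_coaffiliation(attendance):
    """Sum of event weights over the events each pair attended together."""
    weights = event_weights(attendance)
    event_sets = {name: set(events) for name, events in attendance.items()}
    return {
        i: {j: sum(weights[e] for e in event_sets[i] & event_sets[j])
            for j in event_sets}
        for i in event_sets
    }


def to_distances(counts, n_events):
    """Convert raw co-attendance counts into costs: n_events minus the count."""
    return {
        i: {j: n_events - value for j, value in row.items() if j != i}
        for i, row in counts.items()
    }


# Hypothetical data: a two-person dinner and a four-person concert.
attendance = {
    "p": ["dinner", "concert"],
    "q": ["dinner", "concert"],
    "r": ["concert"],
    "s": ["concert"],
}

weighted = weighted_coaffiliation(attendance)
print(weighted["p"]["q"], weighted["p"]["r"])   # 0.75 0.25

raw = {i: {j: len(set(attendance[i]) & set(attendance[j])) for j in attendance}
       for i in attendance}
costs = to_distances(raw, n_events=2)
print(costs["p"]["q"], costs["p"]["r"])         # 0 1
```

The small dinner contributes twice as much weight per co-attendance as the larger concert, which is the intended effect of case normalization. The resulting costs could then be passed to any valued shortest-path routine.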

Direct Visualization of Affiliation Graphs

Affiliation graphs are typically visualized using the same graph layout algorithms used for ordinary graphs. In principle, certain algorithms, such as spring embedders or multidimensional scaling of path distances, should be less than optimal when applied to bipartite graphs because these algorithms place nodes in space such that distances between them are loosely proportional to the path distances that separate them. Since nodes belonging to the same node set are necessarily a minimum of two links apart, we might expect some difficulty in detecting grouping in bipartite graphs. In practice, however, this is not a problem and ordinary graph layout algorithms work well on bipartite graphs.

The only adjustment that we typically have to make for affiliations data is to visually distinguish the two node sets, such as by using different colors and shapes for node symbols of different sets. For example, Figure 2 shows a visualization of the DGG dataset using the spring embedding procedure in NetDraw (Borgatti, 2002). Women are represented by circles and events are represented by squares. In the figure, we can see a group of women on the far right together with a group of events (E1 through E5) that only they attend. On the left, one can see another group of women who also have their exclusive events (E10 through E14). In the middle of the figure are four events (E6 through E9) that are attended by both groups of women. The figure also makes clear that Olivia and Flora are a bit separate from the rest of the network, and structurally similar because they attended exactly the same events.

Another approach is to use a 2-mode multivariate analysis technique such as correspondence analysis to locate nodes. Correspondence analysis delivers a map in which points corresponding to both the n rows and m columns of an n-by-m 2-mode matrix are represented in a joint space. Computationally, correspondence analysis consists of a double-normalization of the data matrix to reduce the influence of variation in the row and column sums, followed by a singular value decomposition. The result is that, in the case of a woman-by-event matrix, two women will be placed near each other to the extent they have similar event profiles, controlling for the sizes of the events, and two events will be near each other if they tend to have similar attendee profiles, controlling for the overall participation rates of the attendees. In the case of the DGG dataset, correspondence analysis gives the diagram shown in Figure 10. As a general rule, the advantage of correspondence representations is that, in principle, the map distances are meaningful and can be related precisely back to the input data. This is not the case with most graph layout algorithms, as they respond to multiple criteria such as avoiding the placement of nodes right on top of each other or keeping line lengths approximately equal. The disadvantage of correspondence analysis layouts is that they can be less readable. For example, in Figure 10, Olivia is obscured by Flora, and the (accurate) portrayal of exactly how different Flora, Olivia and Event 11 are from the rest makes the majority of the display very hard to read.

Figure 10. Correspondence analysis of 2-mode DGG matrix.

Direct Analysis of Affiliation Graphs

There are several different approaches to analyzing affiliations data without converting to co-affiliations. Since affiliation graphs are graphs, an obvious approach is to simply use all the standard algorithms and techniques in the network analysis toolkit that apply to graphs in general. In doing this, we effectively assume that either the special nature of affiliation graphs will not affect the techniques, or that we can pretend that ties within node sets could have occurred and just didn't. This approach works for a small class of methods, but by no means all. A case where it does not work is measuring transitivity: calculating transitivity fails because transitive triples are impossible in bipartite graphs (all ties are between node sets, which means that if a → b and b → c then a and c must be members of the same class, and therefore cannot be tied, making transitivity impossible).

An alternative approach is to develop new metrics and algorithms specifically designed for the bipartite case (affiliation graphs), taking into account the fact that the observed network is not just bipartite by happenstance but by design, similar to the concept of structural zeros in loglinear modeling. This sounds like a great deal more work, but in practice it is often possible to adjust metrics designed for general graphs by simply applying an appropriate post-hoc normalization. This is the strategy we shall take in applying centrality metrics to affiliations data. In other cases, a wholly different approach must be constructed. For example, for the case of measuring transitivity, we might redefine transitivity in terms of quadruples such that a quad is called transitive if a → b, b → c, c → d and a → d.

Centrality

As discussed elsewhere in this book (cf. Hanneman and Riddle's chapter), centrality refers to a family of properties of node positions. A number of centrality concepts have been developed, together with their associated measures (Borgatti and Everett, 2006). In this section, we consider the measurement of four well-known centrality measures.

Degree. In ordinary graphs, degree centrality, d_i, is defined as the number of ties incident upon a node i. In the affiliations case, of course, the degree of a node is the number of ties it has with members of the other node set. So in the DGG data, for women, it is the number of events they attended, and for events, it is the number of women who attended. If we represent affiliations as a bipartite graph, we can compute degree centrality as usual and obtain perfectly interpretable values, at least with respect to the raw counts. However, it is usual to normalize centrality measures by dividing by the maximum value possible in a graph of that size. For ordinary graphs, this value is n - 1, where n is the number of nodes in the graph. However, for affiliation graphs, this is not quite right because a node cannot have ties to its own node set, and so the value of n - 1 cannot be attained.⁴ The maximum degree is always the size of the other node set. In the DGG dataset, the maximum possible degree for a woman is the number of events (14), and the maximum possible degree for an event is the total number of women (18). Therefore, to normalize degree centrality in the case of affiliations data, we must apply two separate normalizations depending on which node set a node belongs to, as shown in Equation 6.

$$d^{*}_{i} = \frac{d_i}{n_2}, \quad \text{for } i \in V_1$$
$$d^{*}_{j} = \frac{d_j}{n_1}, \quad \text{for } j \in V_2$$
Equation 6
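A minimal sketch of Equation 6 follows. The function name and the toy graph (three actors, two events) are assumptions for illustration; each node's degree is divided by the size of the other node set.

```python
def normalized_degree(v1, v2, ties):
    """Equation 6: divide each node's degree by the size of the other node set."""
    v1, v2 = set(v1), set(v2)
    degree = {node: 0 for node in v1 | v2}
    for u, v in ties:
        degree[u] += 1
        degree[v] += 1
    n1, n2 = len(v1), len(v2)
    return {
        node: degree[node] / (n2 if node in v1 else n1)
        for node in degree
    }


# Hypothetical toy graph: actors A, B, C and events X, Y.
actors = ["A", "B", "C"]
events = ["X", "Y"]
ties = [("A", "X"), ("A", "Y"), ("B", "X"), ("C", "X")]

scores = normalized_degree(actors, events, ties)
print(scores["A"], scores["B"], scores["X"])   # 1.0 0.5 1.0
```

In the DGG data the same calculation gives 8/14 for Evelyn, who attended 8 of the 14 events, and 14/18 for event E8, which was attended by 14 of the 18 women.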

The key benefit of normalizing degree centrality in this way is that we can not only assess the relative centrality of two women or two events, but also whether a given woman is more central than a given event. Without such normalization, nodes with equal propensities to have ties could only have equal degrees if the node sets were the same size. However, while normalization handles the mathematical issues in comparability, the substantive interpretation of a woman's centrality relative to an event's is still an issue, and depends on the details of the research setting. For example, it may be that the events are open to all, and ties in the affiliation graph reflect a woman's agency only in choosing which events to attend. In this case, if a woman has greater degree than a given event, we might say that her gregariousness is greater than the event's attractiveness, although this implies that the degree centrality measurement does not measure the same thing for women as for events, which runs counter to the basic idea in the direct analysis of affiliation graphs. On the other hand, the events might be by invitation only, in which case both women and events have a kind of agency. In general, centrality measures in this context have the most straightforward interpretations when the affiliations result from some kind of bilateral matching process, such as speed dating.

Closeness. In ordinary graphs, closeness centrality, c_i, refers to the sum of geodesic distances from node i to all n - 1 others in the network. As such, it is an inverse measure of centrality in which greater centrality is indicated by a lower score. The lowest score possible occurs when the node has a tie to every other node, in which case the sum of distances to all others is n - 1. To normalize

⁴ Except for nodes that are the only members of their node sets.

closeness centrality, we usually divide the raw score into n - 1, which simultaneously reverses the measure so that high scores indicate greater centrality.⁵

As with degree centrality, raw closeness can be calculated in affiliation graphs using the same algorithms we use for any graph. But, also like degree centrality, we must do something different to normalize closeness in the affiliation case. In affiliation graphs, the closest that a node can be to all others is n2 + 2(n1 - 1), which is distance 1 from all nodes in the other node set and distance 2 from all other nodes in its own set. Therefore, to normalize (and simultaneously reverse) closeness in the bipartite case, we divide the raw closeness of a node in V1 into n2 + 2(n1 - 1) and the raw closeness of a node in V2 into n1 + 2(n2 - 1), as shown in Equation 7, in which c_i represents raw closeness centrality, and n1 and n2 represent the number of nodes in each node set.

$$c^{*}_{i} = \frac{n_2 + 2(n_1 - 1)}{c_i}, \quad \text{for } i \in V_1$$
$$c^{*}_{j} = \frac{n_1 + 2(n_2 - 1)}{c_j}, \quad \text{for } j \in V_2$$
Equation 7
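Equation 7 can be sketched with a breadth-first search for geodesic distances. The function names and the toy graph are illustrative assumptions, and the sketch assumes a connected graph, since unreachable nodes would otherwise be silently left out of the distance sum.

```python
from collections import deque


def distances_from(source, neighbours):
    """Geodesic distances from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def normalized_closeness(v1, v2, ties):
    """Equation 7: theoretical minimum distance sum divided by the raw distance sum.

    Assumes the graph is connected.
    """
    v1, v2 = set(v1), set(v2)
    neighbours = {node: set() for node in v1 | v2}
    for u, v in ties:
        neighbours[u].add(v)
        neighbours[v].add(u)
    n1, n2 = len(v1), len(v2)
    scores = {}
    for node in neighbours:
        raw = sum(distances_from(node, neighbours).values())
        best = n2 + 2 * (n1 - 1) if node in v1 else n1 + 2 * (n2 - 1)
        scores[node] = best / raw
    return scores


# Hypothetical toy graph: actors A, B, C and events X, Y.
actors = ["A", "B", "C"]
events = ["X", "Y"]
ties = [("A", "X"), ("A", "Y"), ("B", "X"), ("C", "X")]

closeness = normalized_closeness(actors, events, ties)
print(closeness["A"], closeness["B"], closeness["X"])   # 1.0 0.75 1.0
```

Actor A attends both events and so reaches the theoretical minimum of 2 + 2(3 - 1) = 6, while B has a raw score of 8 and a normalized score of 6/8.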

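Using the DGG dataset for illustration, we can see that the maximum number of nodes that can be distance 1 from a woman is 14 (since there are 14 events), and the maximum number of nodes that can be distance 2 from any of the 18 women is 17 (since there are 18 women). Thus, the theoretical minimum value of closeness centrality for a woman is 14 + 2*(18 - 1) = 48, and the theoretical minimum value for an event is 18 + 2*(14 - 1) = 44.

Betweenness. In any graph, betweenness centrality, b_i, refers to the share of shortest paths in a network that pass through a node i, as given by Equation 8 (written there for a generic node k), where g_ij is the number of geodesic paths from i to j and g_ikj is the number of those paths that pass through k.

$$b_k = \frac{1}{2} \sum_{i \neq k}^{n} \sum_{j \neq k,i}^{n} \frac{g_{ikj}}{g_{ij}}$$
Equation 8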

To normalize betweenness, we divide by the maximum possible value, which in the case of an ordinary graph is achieved by the center of a star-shaped network, as shown in Figure 11.

⁵ Of course, this is a non-linear transformation, unlike all other centrality normalizations. To maintain consistency, we could instead divide raw closeness by its maximum and simply remember that it is a reverse measure.

Figure 11. Star-shaped network

In the bipartite case, unless one node set contains just one node, an affiliation graph cannot attain that level of centralization. As a result, the maximum possible betweenness for any node in a bipartite graph is limited by the relative size of the two node sets, as given by Equation 9. To normalize betweenness, we simply divide a node's raw betweenness by the denominator in Equation 9 corresponding to its node set.
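$$b_{V_1 \max} = \tfrac{1}{2}\left[ n_2^2 (s + 1)^2 + n_2 (s + 1)(2t - s - 1) - t(2s - t + 3) \right]$$

$$s = (n_1 - 1) \operatorname{div} n_2, \quad t = (n_1 - 1) \operatorname{mod} n_2$$

$$b_{V_2 \max} = \tfrac{1}{2}\left[ n_1^2 (p + 1)^2 + n_1 (p + 1)(2r - p - 1) - r(2p - r + 3) \right]$$

$$p = (n_2 - 1) \operatorname{div} n_1, \quad r = (n_2 - 1) \operatorname{mod} n_1$$

Equation 9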
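The maxima in Equation 9 are straightforward to evaluate. The sketch below assumes that the $V_2$ case mirrors the $V_1$ case with the roles of the two node sets swapped, so a single hypothetical function `max_bipartite_betweenness(n_own, n_other)` serves both; the function names are illustrative only.

```python
def max_bipartite_betweenness(n_own, n_other):
    """Equation 9 for a node whose own set has n_own nodes and whose
    opposite set has n_other nodes."""
    q, rem = divmod(n_own - 1, n_other)  # the "div" and "mod" terms
    total = (n_other ** 2 * (q + 1) ** 2
             + n_other * (q + 1) * (2 * rem - q - 1)
             - rem * (2 * q - rem + 3))
    return total / 2

def normalized_betweenness(raw, n_own, n_other):
    """Raw betweenness divided by the maximum for the node's own set."""
    return raw / max_bipartite_betweenness(n_own, n_other)

# Sizes of the DGG data: 18 women and 14 events.
dgg_max_woman = max_bipartite_betweenness(18, 14)
dgg_max_event = max_bipartite_betweenness(14, 18)

# Sanity check: a node that is alone in its set is the center of a star,
# so with 5 nodes in the other set it lies between all 5 * 4 / 2 pairs.
star_center_max = max_bipartite_betweenness(1, 5)
```

The sanity check reflects the special case noted in the text: only when one node set contains a single node does the bipartite maximum coincide with the star-shaped maximum of an ordinary graph.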


Eigenvector. Eigenvector centrality, $e_i$, is defined as the principal eigenvector of the adjacency matrix of a graph (Bonacich, 1972), as defined by Equation 10. In eigenvector centrality, a node's score is proportional to the sum of the scores of its neighbors. In a bipartite graph such as DGG, this means a woman's centrality will be proportional to the sum of centralities of the events she attends, and similarly the centrality of an event will be proportional to the centralities of the women who attend it. As a result, eigenvector centrality applied to the adjacency matrix of an affiliation graph is conceptually and mathematically identical to singular value decomposition (Eckart and Young, 1936) of the 2-mode incidence matrix.⁶ In addition, both of these are equivalent to an eigenvector analysis of the simple co-affiliation matrix.

$$\lambda e_i = \sum_j a_{ij} e_j$$

where $\lambda$ is the principal eigenvalue of $A$.

Equation 10
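The equivalence between eigenvector centrality, singular value decomposition and the co-affiliation matrix can be checked numerically on a small example. The sketch below is illustrative only: the 3-by-2 incidence matrix `X` is hypothetical, and plain power iteration is used in place of a library routine. Because simple power iteration on a bipartite adjacency matrix can oscillate between the two modes, the sketch iterates two steps at a time, which amounts to iterating on the co-affiliation matrix $XX'$.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def norm(v):
    return sum(x * x for x in v) ** 0.5

# Hypothetical incidence matrix: 3 women (rows) by 2 events (columns).
X = [[1, 1],
     [1, 0],
     [0, 1]]
Xt = transpose(X)

u = [1.0] * len(X)              # women's scores
for _ in range(200):
    v = matvec(Xt, u)           # event score = sum of attendees' scores
    u = matvec(X, v)            # woman score = sum of her events' scores
    scale = norm(u)
    u = [x / scale for x in u]

v = matvec(Xt, u)
sigma = norm(v)                 # principal singular value of X
v = [x / sigma for x in v]      # events' scores, unit length

# Residual of the relation X v = sigma * u, which should be near zero.
residual = max(abs(a - sigma * b) for a, b in zip(matvec(X, v), u))
```

Stacking `u` and `v` gives a vector that the bipartite adjacency matrix maps to `sigma` times itself, so the singular value plays the role of $\lambda$ in Equation 10, while `u` alone is the principal eigenvector of the women-by-women co-affiliation matrix.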

Empirical Illustration of Centrality Measures. As an illustration, Figure 12 presents normalized centrality scores for all four types of centrality discussed above for the DGG bipartite graph presented in Figure 2. Note that three events (E8, E9, and E7) are more central than any of the
6 In addition, singular value decomposition yields the measures of hubs and authorities proposed by Kleinberg (1999). As a result, in affiliations data, eigenvector centrality and hubs and authorities are identical concepts, which is not true in ordinary graphs.

women on all of the measures except for normalized degree centrality. It is also worth highlighting that E7 has 10 ties while Nora has only 8, but Nora has a slightly higher normalized degree centrality because there are fewer events than women, so her 8 represents a greater percentage of the possible ties.
Node        No. of Ties   Norm. Degree   Norm. Closeness   Norm. Betweenness   Norm. Eigenvector
E8          14            0.78           0.85              0.24                0.51
E9          12            0.67           0.79              0.23                0.38
E7          10            0.56           0.73              0.13                0.38
Nora         8            0.57           0.80              0.11                0.26
Evelyn       8            0.57           0.80              0.10                0.33
Theresa      8            0.57           0.80              0.09                0.37
E6           8            0.44           0.69              0.07                0.33
Sylvia       7            0.50           0.77              0.07                0.28
Laura        7            0.50           0.73              0.05                0.31
Brenda       7            0.50           0.73              0.05                0.31
Katherine    6            0.43           0.73              0.05                0.22
E5           8            0.44           0.59              0.04                0.32
Helen        5            0.36           0.73              0.04                0.20
E3           6            0.33           0.56              0.02                0.25
Ruth         4            0.29           0.71              0.02                0.24
Verne        4            0.29           0.71              0.02                0.22
E12          6            0.33           0.56              0.02                0.20
Myrna        4            0.29           0.69              0.02                0.19
E11          4            0.22           0.54              0.02                0.09
Eleanor      4            0.29           0.67              0.01                0.23
Frances      4            0.29           0.67              0.01                0.21
Pearl        3            0.21           0.67              0.01                0.18
E4           4            0.22           0.54              0.01                0.18
Charlotte    4            0.29           0.60              0.01                0.17
E10          5            0.28           0.55              0.01                0.17
Olivia       2            0.14           0.59              0.01                0.07
Flora        2            0.14           0.59              0.01                0.07
E2           3            0.17           0.52              0.00                0.15
E1           3            0.17           0.52              0.00                0.14
Dorothy      2            0.14           0.65              0.00                0.13
E13          3            0.17           0.52              0.00                0.11
E14          3            0.17           0.52              0.00                0.11

Figure 12. Normalized centrality scores for the DGG affiliation graph.
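The comparison between E7 and Nora is easy to reproduce. The following sketch assumes, as the discussion above implies, that a node's degree is normalized by the size of the opposite node set, which is the largest number of ties it could have; the function name is hypothetical.

```python
def normalized_degree(degree, size_of_other_set):
    """Degree as a share of the ties a node could possibly have."""
    return degree / size_of_other_set

N_WOMEN, N_EVENTS = 18, 14   # sizes of the two DGG node sets

e7 = normalized_degree(10, N_WOMEN)    # an event can have at most 18 ties
nora = normalized_degree(8, N_EVENTS)  # a woman can have at most 14 ties
```

Although E7 has more ties in absolute terms, `nora` comes out slightly larger than `e7`, matching the 0.57 and 0.56 reported in Figure 12.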

Cohesive Subgroups

Cohesive subgroups refer to dense areas in a network that typically have more ties within group than with the rest of the network. Affiliations data pose special problems for cohesive subgroup analysis because the area around any given node can never be very dense since none of a node's friends can be friends with each other. As a result, some traditional graph-theoretic methods of finding subgroups need to be modified for the bipartite case.

One of the most fundamental subgroup concepts is that of a clique (Luce and Perry, 1949). A clique is defined as a maximally complete subgraph, which means that every member of the clique has a tie to every other (a property known as completeness), and there is no other node that could be added to the subgraph's set of vertices without violating the completeness requirement (this is the property of maximality). Cliques of large size are rare in ordinary graphs, and they are impossible in bipartite graphs. As a result, applying ordinary clique algorithms to affiliation graphs is not useful.

One solution is to use the n-clique concept, which is a relaxation of the clique idea. In an n-clique, we do not require each member of the clique to have a direct tie with every other, but instead that it be no more than distance n from every other. Choosing n = 2 gives us subgroups in which every pair of nodes is within 2 links of each other. Applied to an ordinary graph, this yields subgroups that are looser than ordinary cliques, meaning that they are less than 100% dense. However, when applied to an affiliation graph, a 2-clique can be regarded as complete, since all possible ties are present, due to the constraints of bipartite graphs. For this reason, Borgatti and Everett (1997) give 2-cliques in affiliation graphs a name of their own, the biclique. Effectively, a biclique is to affiliation graphs what a clique is to ordinary graphs.

Since bicliques can be numerous and overlapping, it is often useful to perform a secondary analysis by constructing a node-by-clique matrix, and correlating the profiles of each node across bicliques so that nodes that are members of many of the same bicliques will be given a high correlation. This correlation matrix can then be treated as a valued adjacency matrix and visualized using standard graph layout algorithms. Figure 13 shows the result of such an analysis. The results are striking in the way they differentiate between two groups of women tied to two distinct groups of events. In addition, the diagram clearly shows the separation of Flora and Olivia, and the bridging position of Ruth.

Figure 13. A tie indicates that the correlation between two nodes is greater than 0.60.
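To make the biclique idea concrete, the sketch below enumerates the maximal bicliques of a tiny affiliation graph by brute force. It is not the procedure used to produce Figure 13: the attendance data and the function `maximal_bicliques` are hypothetical, no minimum size is imposed on either side, and the exhaustive search over event subsets is only practical for very small data.

```python
from itertools import combinations

def maximal_bicliques(attendance):
    """attendance maps each woman to the set of events she attended.
    Returns a set of (women, events) pairs of frozensets in which every
    woman attended every event and nothing further can be added."""
    events = sorted(set().union(*attendance.values()))
    found = set()
    for size in range(1, len(events) + 1):
        for subset in combinations(events, size):
            chosen = set(subset)
            # all women who attended every chosen event
            members = frozenset(w for w, ev in attendance.items()
                                if chosen <= ev)
            if not members:
                continue
            # all events shared by those women (closes the event side)
            shared = frozenset(
                set.intersection(*(attendance[w] for w in members)))
            found.add((members, shared))
    return found

# Hypothetical attendance data.
attendance = {
    "w1": {"e1", "e2"},
    "w2": {"e1", "e2"},
    "w3": {"e2", "e3"},
    "w4": {"e3"},
}
bicliques = maximal_bicliques(attendance)
```

Each resulting pair could serve as one column of a node-by-clique membership matrix, whose rows are then correlated as described above.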

Structural Equivalence

Structural equivalence refers to the extent that pairs of nodes have ties to the same third parties. In affiliation graphs such as the DGG dataset, actors are structurally equivalent to the extent they attend the same events, and events are structurally equivalent to the extent they are attended by the same actors. Strictly speaking, in affiliation graphs there can be no equivalence between nodes of different node sets, since they cannot have any neighbors in common. As a result, structural equivalence analyses of affiliation graphs are virtually identical to analyses of the actor-by-actor and event-by-event co-affiliation matrices. For example, a standard approach to measuring structural equivalence in ordinary graphs is to correlate the rows (and columns) of the adjacency matrix, and then do a hierarchical cluster analysis of the correlation matrix to identify blocks of approximately equivalent nodes. If we take this approach to the (n+m)-by-(n+m) adjacency matrix of an affiliation graph, we are virtually guaranteed to find the two modes of the affiliations dataset as the dominant partition in the hierarchical clustering. The next partition will then split one of the two node sets, and so on. In the end, the results are essentially the same as if we had simply clustered each of the co-affiliation matrices separately.

An alternative approach to structural equivalence is blockmodeling (White, Boorman and Breiger, 1976). In ordinary graphs, blockmodeling refers to partitioning the rows and columns of the adjacency matrix such that those corresponding to nearly equivalent nodes are placed in the same classes, as shown in Figure 14. Partitioning the rows and columns based on structural equivalence has the effect of partitioning the cells of the adjacency matrix into matrix blocks that have a characteristic pattern of homogeneity: either all of the cells in the block are 1s (called 1-blocks), or they are all 0s (called 0-blocks). The job of a blockmodeling algorithm is to find a partitioning of the rows and columns that makes each matrix block as homogeneous as possible (Borgatti and Everett, 1992).

      A1  A2  A3  B1  B2  B3  B4  C1  C2  C3
A1     0   0   0   1   1   1   1   1   1   1
A2     0   0   0   1   1   1   1   1   1   1
A3     0   0   0   1   1   1   1   1   1   1
B1     1   1   1   0   0   0   0   1   1   1
B2     1   1   1   0   0   0   0   1   1   1
B3     1   1   1   0   0   0   0   1   1   1
B4     1   1   1   0   0   0   0   1   1   1
C1     0   0   0   1   1   1   1   0   0   0
C2     0   0   0   1   1   1   1   0   0   0
C3     0   0   0   1   1   1   1   0   0   0

Figure 14. Structural equivalence blockmodeling in an ordinary adjacency matrix

Applying this approach directly to affiliation graphs would mean partitioning the rows and columns of the (n+m)-by-(n+m) bipartite adjacency matrix B. This can be done, but the bipartite structure imposes certain constraints. For example, matrix blocks involving within-mode ties (e.g., woman-to-woman, event-to-event) are necessarily 0-blocks. In addition, the best 2-class partition will almost certainly be the mode partition (except in trivial cases), and in general, all other partitions will be refinements of the mode partition (i.e., they will be nested hierarchically within the mode partition).

A more elegant (and computationally efficient) approach is to work directly from the 2-mode incidence matrix X (Borgatti and Everett, 1992). To do this, we redefine the concept of a blockmodel

to refer to not one but two independent partitions, one for the rows and one for the columns. We then apply an algorithm to find the pair of partitions that yields the most homogeneous matrix blocks. In other words, a structural equivalence blockmodeling of the 2-mode incidence matrix is one in which row nodes are in the same class if they have similar rows, and column nodes are in the same class if they have similar columns. An example involving 3 classes of rows and 4 classes of columns is shown in Figure 15.
      A1  A2  A3  B1  B2  B3  B4  C1  C2  C3  D1  D2
E1     1   1   1   1   1   1   1   0   0   0   0   0
E2     1   1   1   1   1   1   1   0   0   0   0   0
E3     1   1   1   1   1   1   1   0   0   0   0   0
F1     1   1   1   0   0   0   0   1   1   1   1   1
F2     1   1   1   0   0   0   0   1   1   1   1   1
F3     1   1   1   0   0   0   0   1   1   1   1   1
F4     1   1   1   0   0   0   0   1   1   1   1   1
G1     0   0   0   0   0   0   0   0   0   0   1   1
G2     0   0   0   0   0   0   0   0   0   0   1   1
G3     0   0   0   0   0   0   0   0   0   0   1   1

Figure 15. 2-mode structural equivalence blockmodel.
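When equivalence is exact, as in Figure 15, the two partitions can be recovered simply by grouping identical rows and identical columns. The sketch below does this for the Figure 15 matrix; the helper `classes_by_profile` is a hypothetical name, and real data would normally call for an approximate method (for example, correlating profiles and clustering) rather than exact matching.

```python
def classes_by_profile(labels, profiles):
    """Group labels whose profiles (rows or columns) are identical."""
    groups = {}
    for label, profile in zip(labels, profiles):
        groups.setdefault(tuple(profile), []).append(label)
    return list(groups.values())

col_labels = ["A1", "A2", "A3", "B1", "B2", "B3", "B4",
              "C1", "C2", "C3", "D1", "D2"]
row_labels = ["E1", "E2", "E3", "F1", "F2", "F3", "F4", "G1", "G2", "G3"]

# Row profiles of the Figure 15 incidence matrix.
E = [1] * 7 + [0] * 5
F = [1] * 3 + [0] * 4 + [1] * 5
G = [0] * 10 + [1] * 2
X = [E] * 3 + [F] * 4 + [G] * 3

row_classes = classes_by_profile(row_labels, X)
col_classes = classes_by_profile(col_labels, zip(*X))  # columns of X
```

The two partitions are found independently of each other, which is the sense in which a 2-mode blockmodel consists of a pair of partitions rather than one.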

Regular Equivalence

In ordinary graphs, the idea of regular equivalence is that a pair of equivalent nodes is connected not necessarily to the same nodes (as in structural equivalence), but to equivalent nodes (White and Reitz, 1983). In other words, if nodes u and v are perfectly regularly equivalent, then if u has a friend p, we can expect v to have a friend q that is equivalent to p. In blockmodeling terms, this translates to a partitioning of the rows and columns of the adjacency matrix such that the resulting matrix blocks are either 0-blocks, or a special kind of 1-block in which every row and column in the matrix block has at least one 1.

In the case of structural equivalence, it was possible to apply the concept to the adjacency matrix of an affiliations graph, making it possible to use existing algorithms/programs to compute it. In the case of regular equivalence, there is a complication. Regular equivalence defines a lattice of partitions that all have the regularity property (Borgatti and Everett, 1989). Most standard regular equivalence algorithms deliver the maximum regular equivalence. Unfortunately, in undirected data, which is normally the case with affiliations graphs, the maximum regular equivalence is always trivial, placing all nodes in the same class. There are ways of handling this, but a better approach is to redefine regular equivalence for 2-mode incidence matrices, as developed by Borgatti and Everett (1992). As we did with structural equivalence, we redefine the concept of a blockmodel to refer to not one but two independent partitions, one for the rows and one for the columns. Regular equivalence implies that we can section the matrix into rectangular blocks such that each block is a 0-block or a regular 1-block. For example, if the affiliations graph indicates which consumers visit which restaurants, the 2-mode regular blockmodel shown in Figure 16 identifies four different types of consumers that visit three kinds of restaurants. Consumers of the same type do not necessarily visit the same restaurants, but they do visit the same kinds of restaurants. Thus all consumers in the first class visit the first two kinds of restaurants, while all consumers in the second class visit only the first and third kinds of restaurants.

      C1  C2  C3  C4  C5  C6  C7  C8  C9  C10  C11  C12
R1     1   0   0   0   0   1   1   0   1    0    0    1
R2     0   1   0   0   0   0   0   0   1    0    0    1
R3     1   1   0   0   1   0   1   0   0    0    1    0
R4     0   0   1   0   0   0   0   0   0    1    1    0
R5     1   0   1   0   0   0   0   1   0    1    0    1
R6     1   1   0   0   0   0   0   1   1    0    1    0
R7     0   0   1   0   0   0   0   0   0    1    0    1
R8     0   0   0   0   1   0   1   0   0    0    0    1
R9     0   0   0   0   0   0   1   1   1    0    1    1
R10    0   1   0   0   0   0   0   0   1    1    0    1

Figure 16. A 2-mode regular equivalence blockmodel.
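The regularity condition on blocks can be checked mechanically. The sketch below tests whether a given pair of row and column partitions sections an incidence matrix into 0-blocks and regular 1-blocks. The small consumer-by-restaurant matrix and both partitions are hypothetical examples (they are not the data of Figure 16), and the function checks a proposed blockmodel rather than searching for one.

```python
def is_regular_blockmodel(X, row_classes, col_classes):
    """True if every block is a 0-block or a regular 1-block, i.e. a block
    in which each row and each column contains at least one 1."""
    for rows in row_classes:
        for cols in col_classes:
            block = [[X[r][c] for c in cols] for r in rows]
            if not any(any(row) for row in block):
                continue  # 0-block: nothing more to check
            rows_ok = all(any(row) for row in block)
            cols_ok = all(any(col) for col in zip(*block))
            if not (rows_ok and cols_ok):
                return False
    return True

# Hypothetical data: 4 consumers (rows) by 4 restaurants (columns).
X = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 1, 1, 0]]

good = is_regular_blockmodel(X, [[0, 1], [2, 3]], [[0, 1], [2, 3]])
bad = is_regular_blockmodel(X, [[0, 2], [1, 3]], [[0, 1], [2, 3]])
```

In the first partition, consumers 0 and 1 visit different restaurants but the same kind of restaurant, so the blockmodel is regular without being structurally equivalent; the second partition fails because one block has a restaurant that none of the consumers in the class visits.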

2-Mode Relational Algebras

In social network analysis, the term relational algebra is typically used very loosely to refer to the composition of relations. For example, if we measure both friendship and "teacher of" relations among a set of nodes, we can construct new, compound relations that link the actors, such as "friend of a teacher of" or "teacher of a friend of", as well as "friend of a friend of" and "teacher of a teacher of". If the relations are represented as adjacency matrices, the composition relation can be equated to Boolean matrix multiplication⁷ of the adjacency matrices, so that if F represents the friendship relation and T represents the "teacher of" relation, then the Boolean matrix product FT represents the "friend of a teacher of" relation. Since the result of a composition is just another relation, we can construct compositions of compositions, yielding a long string of Boolean matrix products. For example, the string FT′TF gives a relation in which, if u is tied to v via this relation, it indicates that v is liked by a student of someone who is teacher of a friend of u. (Note that the transpose T′ is used to represent the inverse relation "is taught by".)

Relational composition is also possible with affiliations data, provided the incidence matrices are conformable. For example, suppose we have a binary person-by-organization matrix M indicating which persons are members of which organizations. Suppose we also have an organization-by-event matrix S, which indicates which organizations were sponsors of which events. Finally, suppose we have a person-by-event matrix A indicating which person attended which event. The product MS is a new matrix in which MS(u,v) > 0 indicates that person u belongs to at least one organization which sponsored event v. In a given research setting, we might use MS to explain matrix A, i.e., test the hypothesis that people are more likely to attend events that are sponsored by their organizations.

Relational algebras can incorporate a mix of affiliation and ordinary networks. For example, if we also had a matrix F indicating which persons were friends with which others, we could generate compositions such as FMS, in which FMS(u,v) > 0 indicates that a person u has a friend who is a member of an organization that sponsors an event v. Krackhardt and Carley (1998) use compositions of this type in their PCANS model, which relates persons, tasks and resources to each other, including person-person communications and task-task dependencies. For example, if matrix A indicates which person is assigned to which task, and matrix P indicates which task precedes another, then the product AP relates each person u to each task v, indicating whether person u has a task that precedes task v. The triple product APA′ relates each person u to each person v, indicating whether person u has a task that precedes a task that person v does, i.e., it indicates whether person v is dependent on person u to get their work done.

7 Boolean multiplication is simply ordinary matrix multiplication in which the resulting matrix is dichotomized so that any value greater than 0 is assigned a 1.
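The compositions described above reduce to Boolean matrix products, which the following sketch implements in plain Python. All matrices are small hypothetical examples, and the names `assign` and `precede` stand in for the PCANS matrices A and P to avoid clashing with the attendance matrix A.

```python
def boolean_product(a, b):
    """Ordinary matrix product, dichotomized so any positive entry is 1."""
    cols = list(zip(*b))
    return [[int(any(x and y for x, y in zip(row, col))) for col in cols]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Hypothetical data: 3 persons, 2 organizations, 2 events.
M = [[1, 0],        # person-by-organization membership
     [0, 1],
     [0, 0]]
S = [[1, 0],        # organization-by-event sponsorship
     [0, 1]]
F = [[0, 0, 1],     # person-by-person friendship
     [0, 0, 0],
     [1, 0, 0]]

MS = boolean_product(M, S)     # person u belongs to a sponsor of event v
FMS = boolean_product(F, MS)   # person u has a friend in a sponsor of v

# PCANS-style dependency: 2 persons, 2 tasks, task 0 precedes task 1.
assign = [[1, 0],
          [0, 1]]
precede = [[0, 1],
           [0, 0]]
dependency = boolean_product(boolean_product(assign, precede),
                             transpose(assign))
```

Here `dependency[0][1]` equals 1 because person 0 holds a task that precedes a task held by person 1. The transpose in the last step is what makes the triple product conformable and person-by-person.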

Conclusion

In this chapter we provide an introduction to the analysis of affiliations data. Two basic approaches are discussed: a conversion approach and a direct approach. The conversion approach consists of analyzing co-affiliations or similarities among elements of one node set with respect to their profiles across the other node set. The similarities are then treated as ties among the nodes. Co-affiliations are frequently analyzed to identify opportunities for interaction (e.g., the flow of goods or information) or unseen relationships between people (e.g., sociometric preferences). The direct approach consists of analyzing both node sets simultaneously, treating the elements of each on an equal footing. As discussed, the direct approach often requires the use of new metrics and algorithms specifically designed for bipartite graphs.

Our survey has focused on analysis, and within that, measurement of network concepts such as centrality, cohesive subgroups, structural equivalence, and regular equivalence. In doing so, we have ignored statistical modeling, such as the nascent field of exponential random graph models for affiliation data (see Robins' chapter in this book for a more detailed discussion).

We close with suggestions for future analyses of affiliations data. One element that is under-explored in affiliations work is the temporal dimension. There are two important ways in which time can be brought into affiliation analysis. First, there is the case of affiliation graphs changing over time. We can conceptualize this as a series of person-by-organization matrices representing different slices of time, or a single 3-mode affiliation network in which each tie links together a person, an organization and time period. Many of the direct analysis techniques discussed in this chapter can be generalized to this 3-mode case (Borgatti and Everett, 1992).
The other important case is in the analysis of 2-mode person-by-event data, where the events are ordered in time. For example, if we study Hollywood film projects, we typically have a data matrix that is actor-by-film, with the films ordered by release date (or start date, etc.). If we are interested in how actors' previous collaboration ties affect the quality of a film project they are jointly engaged in, we need to construct the collaboration network continuously over time, since we would not want to predict film success based on collaborations that occur after the film was produced. Social network analysis programs such as UCINET (Borgatti, Everett and Freeman, 2002) are just beginning to include tools for these kinds of analyses.

Another example of time-ordered affiliations data occurs in the study of career trajectories. Taking the 3-mode approach, we can examine how actors' co-location ties (in terms of both organization and time) affect their future careers. Or we can look at how individuals flow from organization to organization along directed paths. Here, the organizations can be ordered in time differently for each individual, although a key research question is whether an underlying ordering of the organizations (such as status) creates consistency in individual career moves.
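For the film example above, one simple way to respect time order is to process films chronologically and record, for each film, only the collaborations that occurred earlier. The sketch below is a minimal illustration under stated assumptions: the film data and the function `prior_collaborations` are hypothetical, each film is given as a (date, cast) pair, and no attempt is made to handle films sharing the same date.

```python
from itertools import combinations

def prior_collaborations(films):
    """films: list of (date, cast) pairs. Returns, for each film in date
    order, a dict giving the number of earlier films shared by each pair
    of its cast members."""
    seen = {}       # pair -> number of films together so far
    history = []
    for _, cast in sorted(films, key=lambda film: film[0]):
        pairs = [tuple(sorted(pair)) for pair in combinations(cast, 2)]
        # record counts before this film is added to the network
        history.append({pair: seen.get(pair, 0) for pair in pairs})
        for pair in pairs:
            seen[pair] = seen.get(pair, 0) + 1
    return history

# Hypothetical films, deliberately listed out of date order.
films = [
    (1990, ["ann", "bob"]),
    (1992, ["ann", "bob", "cy"]),
    (1991, ["bob", "cy"]),
]
history = prior_collaborations(films)
```

Because the counts are recorded before each film is added, a film's predictors never include collaborations that happened after it, which is the concern raised in the text.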

References

Allatta, J.T. 2003. Structural analysis of communities of practice: an investigation of job title, location, and management intention. In Communities and Technologies, Eds. Huysman, M., Wenger, E., and Wulf, V., pp. 23-42. Kluwer Academic Publishers.
Allatta, J.T. 2005. Worker Collaboration and Communities of Practice. Ph.D. dissertation, University of Pennsylvania, United States, Pennsylvania. Retrieved April 1, 2009, from Dissertations and Theses: Full Text database. (Publication No. AAT 3197643).
Allen, M. 1974. The structure of interorganizational elite cooptation: interlocking corporate directorates. American Sociological Review, Vol. 39(3): 393-406.
Allen, T. 1977. Managing the Flow of Technology. Cambridge, MA: MIT Press.
Bonacich, P. 1972. Factoring and weighting approaches to status scores and clique identification. Journal of Mathematical Sociology, Vol. 2: 112-120.
Bonacich, P. 1987. Power and centrality: a family of measures. American Journal of Sociology, Vol. 92: 1170-1182.
Bonacich, P. 1991. Simultaneous group and individual centralities. Social Networks, Vol. 13(2): 155-168.
Bonacich, P. 2007. Some unique properties of eigenvector centrality. Social Networks, Vol. 29(4): 555-564.
Borgatti, S.P. 1989. Regular equivalence in graphs, hypergraphs and matrices. University of California, Irvine, 1989, 109 pages; AAT 8915431.
Borgatti, S.P., Everett, M.G. and Freeman, L.C. 2002. Ucinet for Windows: Software for Social Network Analysis. Harvard, MA: Analytic Technologies.
Borgatti, S.P., and Everett, M.G. 1992. Regular blockmodels of multiway, multimode matrices. Social Networks, 14: 91-120.
Borgatti, S.P., and Everett, M.G. 1997. Network analysis of 2-mode data. Social Networks, 19(3): 243-269.
Borgatti, S.P., and Everett, M.G. 2006. A graph-theoretic framework for classifying centrality measures. Social Networks, 28(4): 466-484.
Brandes, U. 2001. A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, Vol. 25(2): 163-177.
Breiger, R.L. 1974. The duality of persons and groups. Social Forces, Vol. 24: 201-229.

Breiger, R., Boorman, S. and Arabie, P. 1975. An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, Vol. 12: 328-383.
Burt, R. 1987. Social Contagion and Innovation: Cohesion Versus Structural Equivalence. American Journal of Sociology, Vol. 92(6): 1287-1335.
Carroll, W.K., Fox, J. and Ornstein, M.D. 1982. The network of directorate interlocks among the largest Canadian firms. Canadian Review of Sociology and Anthropology: 245-68.
Davis, G. 1991. Agents without principles? The spread of the poison pill through the intercorporate network. Administrative Science Quarterly, Vol. 36(4): 583-613.
Davis, G., and Greve, H. 1997. Corporate Elite Networks and Governance Changes in the 1980s. American Journal of Sociology, Vol. 103(1): 1-37.
Davis, A., Gardner, B., and Gardner, R. 1941. Deep South. Chicago: University of Chicago Press.
Domhoff, W. 1967. Who Rules America? Englewood Cliffs, N.J.: Prentice-Hall.
Doreian, P., Batagelj, V., and Ferligoj, A. 2004. Generalized blockmodeling of two-mode network data. Social Networks, Vol. 26: 29-53.
Eckart, C. and Young, G. 1936. The approximation of one matrix by another of lower rank. Psychometrika, Vol. 1: 211-218.
Everett, M.G., and Borgatti, S.P. 1993. An extension of regular colouring of graphs to digraphs, networks and hypergraphs. Social Networks, 15: 237-254.
Faust, K. 2005. Using correspondence analysis for joint displays of affiliation networks. In P. Carrington, J. Scott and S. Wasserman, Editors, Models and Methods in Social Network Analysis. Cambridge University Press.
Faust, K., Willert, K., Rowlee, D. and Skvoretz, J. 2002. Scaling and statistical models for affiliation networks: Patterns of participation among Soviet politicians during the Brezhnev era. Social Networks, Vol. 24: 231-259.
Feld, S. 1981. The focused organization of social ties. American Journal of Sociology, Vol. 86: 1015-1035.
Field, S., Frank, K., Schiller, K., Riegle-Crumb, C., and Muller, C. 2006. Identifying social positions from affiliation networks: preserving the duality of people and events. Social Networks, 28(2): 97-186.
Freeman, Linton C. 2003. Finding social groups: A meta-analysis of the southern women data. In Ronald Breiger, Kathleen Carley and Philippa Pattison, eds. Dynamic Social Network Modeling and Analysis. Washington: The National Academies Press.
Friedkin, N. 1984. Structural Cohesion and Equivalence Explanations of Social Homogeneity. Sociological Methods and Research, 12: 235-61.

Galaskiewicz, J. 1985. Social Organization of an Urban Grants Economy. New York: Academic Press.
Gmür, M. 2006. Co-citation analysis and the search for invisible colleges: A methodological evaluation. Scientometrics, Vol. 57(1): 27-57.
Harary, F. 1969. Graph Theory. Reading, MA: Addison-Wesley.
Kleinberg, J. 1999. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5): 604-632.
Krackhardt, D. and Carley, K.M. 1998. A PCANS Model of Structure in Organization. In: Proceedings of the 1998 International Symposium on Command and Control Research and Technology, 113-119. June. Monterey, CA.
Lester, R., and Cannella, A. 2006. Interorganizational familiness: how family firms use interlocking directorates to build community-level social capital. Entrepreneurship: Theory & Practice, Vol. 30(6): 755-775.
Luce, R., and Perry, A. 1949. A method of matrix analysis of group structure. Psychometrika, Vol. 14(2): 95-116.
McPherson, J. 1982. Hypernetwork sampling: Duality and differentiation among voluntary organizations. Social Networks, Vol. 3: 225-249.
McPherson, J. and Smith-Lovin, L. 1986. Sex segregation in voluntary associations. American Sociological Review, Vol. 51(1): 61-79.
McPherson, J. and Smith-Lovin, L. 1987. Homophily in voluntary organizations: status distance and the composition of face-to-face groups. American Sociological Review, Vol. 52(3): 370-379.
Mizruchi, M. 1983. Who controls whom? An examination of the relation between management and boards of directors in large American corporations. Academy of Management Review, Vol. 8: 426-435.
Mizruchi, M. 1992. The structure of corporate political action: interfirm relations and their consequences. Cambridge, MA: Harvard University Press.
Mizruchi, M. 1996. What do interlocks do? An analysis, critique, and assessment of research on interlocking directorates. Annual Review of Sociology, Vol. 22: 217-298.
Newman, M., Strogatz, S. and Watts, D. 2001. Random graphs with arbitrary degree distributions and their applications. Physical Review E, 64: 1-17.
Robins, G., and Alexander, M. 2004. Small worlds among interlocking directors: Network structure and distance in bipartite graphs. Computational & Mathematical Organization Theory, Vol. 10(1): 69-94.
Roethlisberger, F. and Dickson, W. 1939. Management and the worker. Cambridge: Cambridge University Press.

Sokal, R., and Sneath, P. 1973. Numerical Taxonomy. San Francisco: W.H. Freeman.
Uzzi, B. and Spiro, J. 2005. Collaboration and creativity: the small world problem. American Journal of Sociology, Vol. 111(2): 447-504.
Wang, P., Sharpe, K., Robins, G., and Pattison, P. 2009. Exponential random graph (p*) models for affiliation networks. Social Networks, Vol. 31(1): 12-25.
Westphal, J., and Khanna, P. 2003. Keeping directors in line: social distancing as a control mechanism in the corporate elite. Administrative Science Quarterly, Vol. 48(3): 361-398.
Westphal, J.D. 1998. Board games: How CEOs adapt to increases in structural board independence from management. Administrative Science Quarterly, Vol. 43: 511-537.
White, H.C., Boorman, S.A., and Breiger, R.L. 1976. Social Structure From Multiple Networks, I: Blockmodels of Roles and Positions. American Journal of Sociology, 81: 730-780.
White, D., and Reitz, K. 1983. Graph and Semigroup Homomorphisms on Networks of Relations. Social Networks, Vol. 5: 193-224.