William Occam was a philosopher and monk who was a minimalist in life, and because of his frequent use of the simplicity principle, his name has become associated with simplicity. Another way the expression can be interpreted is: keep it simple. With QFabric, we have used simplicity to solve some of the toughest data center problems by keeping it simple. In fact, Juniper has always solved very hard networking problems with simplicity and innovation.

Recently, at a Gartner Data Center conference, customer surveys identified the following two items as burning topics:
Need for a cloud strategy > Experience
Power and cooling > Operations costs

There is always a tradeoff between experience and economics in the data center. In very large production data centers, like those of Google and Facebook, delivering a better experience is the priority because it is core to the business. On the other hand, in many classic IT data centers cost is an issue, and they might be willing to trade off experience for economics.

Fundamentally, the role of IT infrastructure is to provide users connectivity to applications and services. Mobility, Web 2.0 and newer technologies have introduced many new ways information can be exchanged: client device to client device, client device to computing utility, or computing utility to computing utility. In addition, a reliable, high-performance wide area network is enabling application delivery over the WAN. This is helping customers consolidate data center infrastructure and manage costs while delivering superior application performance. In this session, we will discuss data center evolution and how QFabric will help transform data center networking.

Application architectures have evolved from client-server to distributed apps, causing an underlying shift in traffic patterns. Applications have become federated, with architectures such as service-oriented architectures, web services, and software as a service. The newer applications are increasing server-to-server communication, causing more east-west traffic within the data center. Server virtualization is helping businesses gain efficiency by consolidating many physical servers into fewer high-performance virtualized servers. Of all the traffic types traversing an Ethernet network, storage has risen in prominence in recent years. This traffic type has its own unique characteristics and demands separate treatment from other traditional Ethernet traffic.

To support both virtualization and convergence, the data center network needs to offer:
high performance
scale
availability
a way to support strong user SLAs (around throughput, latency, packet loss and security)

Networking has not kept pace and is now a barrier slowing innovation in the rest of the data center.

Clouds come in many sizes and shapes. Let's look at how customers are building clouds. Customers are initially building small pools of compute and storage resources; each of these can be a small cloud. The larger the pools, the better the efficiencies, and the network is the foundation for building those pools, leading to a cloud-ready architecture.

The number of physical servers is flattening out, but the number of virtual servers has grown exponentially. Virtualization provides capital expenditure benefits, but the cost of managing virtual networks has drastically increased operational expenses.

The first order of data center transformation is server virtualization in small pools, primarily to gain operational efficiency in the compute infrastructure. The second order of transformation will be to extend consolidation beyond servers and create dynamic, cloud-ready infrastructure to support operational flexibility and business agility.

The Internet has forever changed, and is continuing to change, the data center. And where did it start? It started with the application. You see, prior to the Internet, all applications were client-server; that was state of the art. And in client-server, what you had was the monolithic server with a monolithic application in it, connected to a stateful client, and ninety-five percent of all the traffic that went out onto a physical network was traveling between the server and the client.

[CLICK] But inside that server, what we found was a bunch of encapsulated traffic. Remember, back in the day, we had direct-attached storage; there was no network for the storage. So all that traffic was inside the server, and all the interprocess communication that the application had was contained in the memory of the server. So only a fraction of the traffic was actually on a physical network. But along came the Internet, and what it exposed was the fundamental weakness in client-server: those architectures would never scale. See, they were designed to support hundreds of users, maybe a few thousand users. But what the Internet gave us was the opportunity to reach out to our employees, to our customers, to our partners, and touch not thousands but tens of thousands, maybe millions, of users. And you could never do that with client-server. So we had to change the way we designed the applications. It was a Darwinian event.

[CLICK] Necessity drove a change in the architecture, and we moved to what today we call service-oriented architectures (or SOA), with rich Web 2.0 front ends and federated applications. But fundamentally, what we did is we took that application that was contained within the server and we disaggregated it across the network. And so, suddenly, what we had was a fundamental shift in the data patterns in the network. Yes, we're still going after the client, and we have more traffic going to the client than ever before. But today it only accounts for about 25% of the total traffic on a data center network. The rest of it is traffic between servers and traffic between servers and storage; all that traffic that used to be in a single server is now exposed across a network: a fundamental shift in data patterns.

With 95% of data center traffic moving north and south in the tree between users and servers, the traditional hierarchical tree network architecture worked fine in the early days. However, in today's traffic patterns most network traffic, up to 75%, is east-west, traveling between devices within the data center. As a result, traversing the data center requires as many as five network hops, adding both latency and jitter, which directly impacts application performance. The newest role of the network is to provide a foundation for the cloud solution. This architecture needs flat networks with any-to-any connectivity that reduce latency and jitter, enabling optimal application behavior.

Legacy data center architectures have two fundamental problems: they're too complex and too topographically diverse, which impacts performance and the user experience.

In today's data center network, application performance often depends on the physical location of the servers that are trying to communicate, that is, where each of the resources resides in the tree hierarchy relative to the others. The more hops required to complete a transaction, the more latency that transaction is subjected to, contributing to unpredictable application performance. In a typical tree configuration, server and storage resources are typically contained in localized bubbles. The resources are as close together as possible, which works fine when we have few apps. But when apps grow, scalability becomes an issue with tree structures.

So why are bubbles interesting? When provisioning or dynamically migrating a virtual machine instance, the VM and its external data sources would ideally reside within the same bubble. However, if the VM is instantiated outside this bubble, it might suddenly find itself three network hops away from its data source, and the application will slow down. Thus trees force IT to proactively manage the location of the VM in the physical environment in order to maintain predictable application behavior.
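To make the placement effect concrete, here is a minimal sketch of how the hop count between two servers depends on where they sit in a tree. It assumes a conventional three-tier tree (access, aggregation, core) with uniquely named access switches; the names `Placement`, `switch_hops`, and the switch labels are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where a server sits in a hypothetical three-tier tree."""
    access: str       # access (top-of-rack) switch the server plugs into
    aggregation: str  # aggregation switch above that access switch


def switch_hops(a: Placement, b: Placement) -> int:
    """Count the switches a packet crosses between two servers."""
    if a.access == b.access:
        return 1  # both servers hang off the same access switch
    if a.aggregation == b.aggregation:
        return 3  # access -> aggregation -> access
    return 5      # access -> aggregation -> core -> aggregation -> access


vm = Placement(access="tor-1", aggregation="agg-1")
data_same_switch = Placement(access="tor-1", aggregation="agg-1")
data_same_branch = Placement(access="tor-2", aggregation="agg-1")
data_far_branch = Placement(access="tor-9", aggregation="agg-4")

for target in (data_same_switch, data_same_branch, data_far_branch):
    print(target.access, switch_hops(vm, target))
```

Under these assumptions the same VM sees one, three, or five switch hops to its data depending only on placement, which is the behavior the bubble discussion describes.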

Another problem with tree architecture arises when an appliance such as a firewall or load balancer is inserted inline in the network, or when a typical segmentation (VLAN) is defined. These attributes are generally relevant to a specific branch in the network; essentially, they cast a shadow down that branch of the tree. When the VM is instantiated within the shadow of its appliances, it will operate properly. However, if that same VM migrates outside the shadow, at best it will run slower. At worst, the application may cease to run at all.

In a network, there are two predominant points of management: the switches and the interactions between the switches. Since each switch is an autonomous device, IT must manage them individually, which creates significant overhead in large data centers. These switches also communicate with each other using shared network protocols. Routing, link aggregation groups (LAGs), quality of service (QoS) and security are all examples of protocols that must be applied across the network and configured on every device to support the overall design. This causes the number of interactions (often referred to as control traffic) between the switches, even if they are not directly connected, to grow geometrically.

This explosion of interactions follows a simple formula, n(n-1)/2, where n represents the number of switches. Based on this formula, 10 switches can generate 45 such interactions; 100 switches can generate nearly 5,000 potential interactions; and 1,000 switches can create nearly 500,000 interactions. Since it is nearly impossible to deal with this level of increasing complexity, network operations will typically segment the network into smaller subnetworks. Unfortunately, this runs counter to the desire to consolidate all servers and storage in the data center into a single large, more efficient and more elastic pool of resources, which is the vision behind cloud computing.
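The formula is easy to check directly. The short sketch below evaluates n(n-1)/2 for a few network sizes; the function name `pairwise_interactions` is only an illustrative choice. For 1,000 switches the formula yields 499,500 pairs, and for a single managed device it yields zero.

```python
def pairwise_interactions(n: int) -> int:
    """Number of distinct switch pairs among n switches: n(n-1)/2."""
    if n < 0:
        raise ValueError("number of switches cannot be negative")
    return n * (n - 1) // 2


for switches in (1, 10, 100, 1000):
    print(f"{switches:>5} switches -> {pairwise_interactions(switches):>7} interactions")
```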

Business Benefits of QFabric

This slide presents the business value points of QFabric. Application performance is an important component of moving to an architecture that uses QFabric. By speeding up the access layer and allowing for concepts such as vMotion (the ability to move a VM around the data center), not only the end users but also the administrators can benefit.

The second value point is that QFabric is the ideal foundation for the modern cloud, with virtualized server and storage environments. It scales well, allowing for over 6,000 ports, yet maintains the simplicity of managing a single device, with flat, any-to-any connectivity where every port is equal, and is the ideal foundation for moving workloads around the data center dynamically.

The third point is simplicity. Less hardware means there are fewer components, which improves performance, power, space and cooling, as well as increases reliability. QFabric is also simple to manage, since it has the operational simplicity of a switch.

Lastly, the nature of the architecture allows customers to lower their OpEx and CapEx.

Three Problems to be Solved

The problems are related to the basic flaws of a tree structure in a data center. The first problem involves having to manage several separate networks. This is why convergence is important: getting storage and all the networks onto one physical network, and then using virtualization to create the traffic separation that is required.

The second issue is called Metcalfe's Revenge, which means that, as the capacity of the network is expanded in a linear fashion, the complexity increases geometrically: N(N-1)/2. One of the features of QFabric is that it is managed as a single device: N=1.

Lastly, there's the tree structure itself. Traffic is trying to go east and west in an architecture optimized for north-south traffic. Processing has to be repeated at each hop in the network and, more importantly for the cloud, it matters where the application is placed relative to its data. This can lead to inconsistent application behavior when the application is moved around in the network. Juniper's QFabric provides a flat, any-to-any topology to connect everything in the data center together.

Three Types of Fabrics

Three types of fabrics can be considered in the marketplace today: marketing, overlay and switch fabrics.

Marketing Fabric

The basic tree structure made out of Ethernet networks is considered by some marketing departments to be a fabric. In reality it is not a fabric and thus has no real benefits for the customer.

Protocol Overlay Fabric

The second kind is called a protocol overlay fabric. This is a design that uses a classic spine-and-leaf topology, where the leaves are the access nodes into the network and the spine is the interconnect; each node in the spine is connected to every leaf and every leaf to every spine. This is a classic nonblocking topology that can be traced back to Charles Clos, who worked out the minimum number of devices required to build a nonblocking network.

This has been used in the wide area network for years. Why hasn't it been used in the data center? Because of the shortcomings of Ethernet. Ethernet is efficient at the first hop (plug and play), but once it gets to the point where there are multiple paths, it has shortcomings. In the past, the Spanning Tree Protocol (STP) was used to overcome the issue, but STP allows one path to be used while blocking the others. This invariably translates to inefficient use of resources, particularly bandwidth.
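As a rough illustration of the spine-and-leaf wiring and of what blocking all but one path costs, the sketch below builds a small full-mesh leaf/spine topology and lists the two-hop paths between a pair of leaves. The helper names (`build_leaf_spine`, `paths_between`) and the sizes are hypothetical, and keeping only the first path is merely a stand-in for spanning tree behavior, not a model of the protocol itself.

```python
from itertools import product


def build_leaf_spine(num_leaves: int, num_spines: int):
    """Connect every leaf to every spine, as in the topology described above."""
    leaves = [f"leaf-{i}" for i in range(num_leaves)]
    spines = [f"spine-{i}" for i in range(num_spines)]
    links = set(product(leaves, spines))  # one link per (leaf, spine) pair
    return leaves, spines, links


def paths_between(src: str, dst: str, spines, links):
    """All leaf -> spine -> leaf paths between two different leaves."""
    return [(src, s, dst) for s in spines
            if (src, s) in links and (dst, s) in links]


leaves, spines, links = build_leaf_spine(num_leaves=4, num_spines=2)
all_paths = paths_between("leaf-0", "leaf-3", spines, links)
single_active_path = all_paths[:1]  # stand-in for "one path used, others blocked"

print(len(links), "links in total")
print(len(all_paths), "paths available,", len(single_active_path), "usable if the rest are blocked")
```

With two spines, half of the available leaf-to-leaf capacity goes unused when only one path is active; the fraction wasted grows as spines are added.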

Protocol Overlay Fabric

To overcome STP's disadvantage, two new protocols have been developed. The first one is TRILL (Transparent Interconnection of Lots of Links) and the other is SPB (Shortest Path Bridging). They are used to create Layer 2 tunnels (similar to MPLS, but at the Layer 2 level), and traffic can then be spread across different tunnels. This leads to a flatter topology and makes every access port relatively equal, thus using all the bandwidth. More importantly, it eliminates the need for STP. However, there are still two tiers of individual switches. Although this is a dramatic improvement over the networks of the past, Juniper has developed an even better solution.

Switch Fabric

This leads to the third fabric type, known as a switch fabric. It means building a data center fabric that behaves as a single switch: it allows for a single network, as it enables convergence; it is managed as a single device; and it has flat, any-to-any connectivity.

This fabric type has the benefits of the previous fabric, but instead of two tiers, it has only one. It is the flattest of all the topologies, which makes it the fastest. Also, it virtualizes locality (like before, every port is equal) and it eliminates the need to run STP. There is no need to run any protocol (TRILL, SPB) inside the fabric, as it is the hardware that does the transport from point A to point B, just as the fabric inside a switch does; hence the name and behavior.

Other benefits include being more efficient. Fewer devices allow for power savings (which are directly related to cooling savings) as well as space savings. It is also simpler, as it is managed as a single device (N=1). A simpler solution with fewer devices implies that customers can realize substantial cost savings.

Given the problems with legacy data center network designs, a new architectural approach is required, one that is transformational and innovative, not incremental. As Yankee Group researchers put it: "The data center network is now the backplane of the virtual data center." The ideal next-generation network architecture for modern data centers would directly connect all processing and storage elements in a flat, any-to-any network fabric. Optimized for performance and simplicity, this next-generation architecture would address the latency requirements of today's applications; support virtualization, cloud computing, convergence and other data center trends; scale elegantly; and eliminate much of the operational expense and complexity of today's hierarchical architecture.

Juniper Networks has developed a comprehensive strategy designed to make this new fabric-based data center network design a reality. At the heart of this strategy is Juniper's 3-2-1 Data Center Network Architecture, which eliminates layers of switching to flatten today's three-tier tree network structure to two layers and, with the QFabric architecture, to just one. QFabric is the 1 in 3-2-1.

The simplest, most agile and most efficient network contains only two switches (for availability). It has the lowest latency and any-to-any connectivity at full port capacity over a nonblocking fabric. Unfortunately, real networks don't look like this, largely because:
a large number of ports need to be aggregated
multiple oversubscribed switching tiers are used to aggregate those large numbers of ports while still trying to do it at lower cost than a full-fledged any-to-any full mesh

With the SRX, we can deal with the physical flows. We acquired a company called Altor. vGW is a firewall for virtualized server environments that allows you to enforce policies on the network flows between VMs. The major benefit is getting visibility into what those flows are and how they operate, and setting one policy across the network, regardless of whether it's a physical flow or a virtual flow. Juniper is the only company today that can let you set one policy that runs across both the SRX and vGW.

The inter-data-center connectivity we do very well today with MPLS and VPLS.

Junos Space Virtual Control: one of the big problems that we have in a virtualized environment is that we have a set of virtual connections instead of physical connections. So Virtual Control is an application that we developed with VMware that keeps virtual and physical policies in sync.

Now we have a single pool, or single bubble, across the entire data center. Everything is an equal distance apart, and everything is one network hop away. From that standpoint, it's the ideal seamless resource pool foundation. The Stratus project is what created QFabric.

QFabric has been in development for three years, and we have 125 pending patent applications and have been granted three so far. We have been very active in filing patent applications because there are many unique innovations in QFabric.

QFabric truly is a revolutionary architecture. Interestingly enough, though, it has a really well-defined operational model that people understand: that of a single switch. What we are building is a flat, resilient fabric where everything is exactly one hop away, but it scales without complexity. N equals one.

We really had to rethink the fundamental way to scale a switch. We had to look at every aspect of the design to optimize how a switch can scale at high performance.

Data plane: The only way to scale the data plane is to push the intelligence to the edge. The intelligence is what deals with the real-world complexity of Ethernet, the different protocols the switch has to use to communicate with external devices. But what's in the middle is simply transport. It's simply moving the bits from point A to point B. We don't have to have intelligence in the middle. By having the intelligence out there on the edge, I can scale at speed. Modern data centers want to go as fast as possible; we are at 10 Gig now, and people are already asking for 40 and 100 Gig soon. That's the only way to scale.

Control plane: On the other hand, the control plane is a little bit different. We want the intelligence to be everywhere. Intelligence has to be local, where the switching happens, but the switch also has to have a central consciousness to it, so that things are well coordinated. That's really what we mean when we talk about federated intelligence. It addresses scale, but it also retains resiliency.

Management plane: Finally, we want a management plane where N=1. It has the operational model of a single switch. When we say N=1, it means that it is one device to manage and there are no interactions to manage within the fabric itself. The fabric takes care of the management details. We don't run spanning tree or TRILL to try to manage the loops, because loops are managed the same way they are in a switch: by the control plane itself. Since we truly have a single, federated control plane, we can have a completely different management scheme. So it is like a single pane of glass to manage QFabric as if it's a single device.

Let's take you through a bit of a journey here to try to explain the architecture. I'm going to start with the concept of a chassis switch. It could be an 8200 from us. It could be a Nexus 7000, Catalyst 6500, or an HP 12000. They are all basically very similar in how they behave. We have line cards, which contain the ports. We have some sort of center plane or backplane that you plug into. Then there is fabric circuitry, which interconnects all the ports in each line card to all the other ports in the line cards. Ethernet packets come in, some initial processing happens at the ingress port, and then the bits are sprayed across the backplane in a nonblocking fashion, reassembled at the far side, and then it's Ethernet out. What you experience as a user is Ethernet in and Ethernet out, but you don't manage these bits. It's a very efficient transport.

On the control plane side, there is a single CPU that ends up being the central brain. We call these route engines. Cisco calls it a supervisor card. It basically is the central brain, but the intelligence is also federated, because each of the line cards has intelligence. Where is the state managed, locally or centrally? The answer is yes. It's local, because that is what gives the switch its speed. It's also centralized, so it can do control plane learning. I learn a new MAC address. Every port knows about it, not because I had to flood to get there, but because it does control plane learning.

The CPU is also the point of connection for the management plane. It projects out the CLI, or the GUI, or the interface to a more sophisticated environment like Junos Space, Tivoli, and so forth. It also runs a set of software that automates a lot of the management of the switch. I plug a new line card in. It's automatically detected, discovered, configured, and then brought in as part of the system.

We like the operational model of the single switch. What we don't like is the scaling model, because when we fill up all the slots, that's as far as it scales. Now I have to go to a multitier type of tree structure. With most vendors' solutions, a new switch needs to be purchased to add capacity. This is where the QFabric innovation journey began. We questioned the traditional model and evolved the chassis architecture into a distributed chassis architecture. So we kept the operational model, but we changed the way it scales.

We start by methodically breaking the pieces up physically. On the data plane side, it really starts with the separation of the fabric from the line cards. Instead of using copper traces to connect things, we use fiber optics. This way we go from being able to support 8 line cards to 128 line cards. We get much better scale and physical connectivity, so we can span it across a large area.

What do the physical components actually look like? We put the fabric bits into a chassis, and we optimize the line cards for server connectivity: a 1 RU, top-of-rack type of design. In between, we have four 40 Gigabit fibers connecting it up to the center with standard fibers, so OM4. Basically, it's Ethernet in and Ethernet out. This works exactly the same way as the fabric inside of a chassis switch. It's simply transport. When it's transport, it means the devices can be simpler. They can be faster, denser, and less expensive, because they only have to do transport. They don't have to deal with the complexities of the real world. They don't have to run multiple protocols. They don't run protocols. They do one thing and one thing only.

We will never have a single point of failure; the minimum configuration is two of these chassis. Now, it is at this point that customers begin to get curious, because they want to treat this as a network. But these are not switches. Notice there are no connections between these two devices on the data plane itself. All links are active. The hardware, using an efficient algorithm, will say, "This packet is going up this fiber and coming back down that fiber." It's all built into the hardware itself. It's coordinated from the top to manage link status and so forth.

For large scale, we can have up to four of these interconnect chassis, with one fiber connection to each one. It also gives the greatest resilience. This configuration would support 128 of the QF/Node devices. Each QF/Node device has 48 10 Gig ports.

The reason we draw the picture as a circle is that that's really the way it behaves. Everything in the real world connects to the QF/Node, and the fabric is simply the interconnect; the bits here are simply transport. So we call these the QF/Nodes, and the center chassis are the interconnect chassis.

It's Ethernet in and it's Ethernet out, with an efficient transport protocol in between. Most of the processing is done at the ingress port and some processing is done at the egress port, just like a switch. Now, here's the amazing thing. This fabric is faster than any Ethernet chassis switch made today. It's less than five microseconds to go from a port here to that port there with maximum cable lengths. The other thing is, because this is a noncongesting type of environment, and because it is a relatively small number of ASICs, the jitter in latency is extremely low.

The Director is also what provides the single management plane. It is a single consciousness that we can talk to, and it projects out a CLI or GUI or a DMI interface into more sophisticated orchestration environments. It also does a lot of automation. For example, if I plug in a new QF/Node device, it's automatically detected and discovered. It's configured, and its local information is set up and spread across the fabric.

The distributed control plane is implemented on the interconnect and QF/Node devices and is centrally managed by control

Cluster of physical redundant servers. All control traffic is carried over two separate GE links to control. Control plane isolated from the data plane.

No single point of failure. QFabric has resiliency built in at many stages: hardware, data plane, control plane, management and software.

FCoE Transit Switch: The QFX3500 offers a full-featured DCB implementation that provides strong monitoring capabilities on the top-of-rack switch. In addition, FCoE Initialization Protocol (FIP) snooping provides perimeter protection, ensuring that the presence of an Ethernet layer does not impact existing SAN security policies.

FCoE-FC Gateway: In FCoE-FC gateway mode, the QFX3500 eliminates the need for FCoE enablement in the SAN backbone. Organizations can add a converged access layer and interoperate with existing SANs without disrupting the network. The QFX3500 allows up to 12 ports to be converted to Fibre Channel without requiring additional switch hardware modules.

There are three unique pieces of hardware: the interconnect chassis, the QF/Node, and the Director.

The interconnect chassis is a half-rack design, so we can actually stack two on top of each other. The front has the air intake and is the business side. These are the eight front-facing fabric cards. Each one has 16 40 Gigabit ports. These are standard QSFP connectors. The back side has the dual power supplies and redundant fans. The middle stage is fabrics and redundant control boards, which are about monitoring and managing the chassis itself. Each control board has two different connections to the control plane. We use standard Ethernet connectors everywhere.

We have eight times 16, or 128, 40 Gig connections. Again, these are not switches in that sense, but if you add it up using marketing math, which is counting both directions, that is 10.2 Terabits per second. We can have a maximum of four. That's about 41 Terabits per second, which is an inordinate amount of cross-sectional bandwidth.
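The "marketing math" can be worked through step by step. This is a minimal sketch using only the figures quoted in the notes (eight fabric cards, 16 ports per card, 40 Gbps per port, at most four chassis); the constant names are illustrative. Counting both directions, one chassis comes to 10.24 Tbps and four chassis to 40.96 Tbps.

```python
FABRIC_CARDS = 8       # front-facing fabric cards per interconnect chassis
PORTS_PER_CARD = 16    # 40 Gigabit ports on each card
PORT_SPEED_GBPS = 40
MAX_CHASSIS = 4        # maximum number of interconnect chassis

ports_per_chassis = FABRIC_CARDS * PORTS_PER_CARD               # 128 ports
one_way_tbps = ports_per_chassis * PORT_SPEED_GBPS / 1000       # one direction only
both_ways_tbps = 2 * one_way_tbps                               # "marketing math"
max_fabric_tbps = MAX_CHASSIS * both_ways_tbps

print(f"{ports_per_chassis} ports per chassis")
print(f"{both_ways_tbps:.2f} Tbps per chassis, counting both directions")
print(f"{max_fabric_tbps:.2f} Tbps across {MAX_CHASSIS} chassis")
```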

The Director is a 2 RU Intel-based server. It's a quad chip with enough memory and lots of networking. If you have two different power grids coming into your data center, we recommend connecting them to different power grids.

This picture shows the minimum configuration that will be shipped initially. Two chassis with two connections each will give three-to-one oversubscription at the QF/Node. This will be good for up to 64 devices, or just over 3,000 ports.

The maximum configuration would add two more chassis, and we can basically rewire it so each QF/Node has one connection to each chassis. That would take the configuration up to 6,144 ports.
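The port counts and the three-to-one ratio for both configurations follow from figures given elsewhere in these notes: 128 40 Gig ports per interconnect chassis, 48 10 Gig access ports per QF/Node, and 40 Gigabit uplinks. The sketch below reproduces them; the function name `describe` and the constant names are hypothetical, and it assumes every QF/Node uses the same number of uplinks to each chassis.

```python
CHASSIS_PORTS = 128      # 40G ports on one interconnect chassis
NODE_ACCESS_PORTS = 48   # 10G server-facing ports on one QF/Node
ACCESS_GBPS = 10
UPLINK_GBPS = 40


def describe(chassis: int, links_per_chassis: int) -> dict:
    """Size a fabric where each QF/Node has links_per_chassis uplinks to every chassis."""
    uplinks_per_node = chassis * links_per_chassis
    nodes = CHASSIS_PORTS // links_per_chassis  # limited by ports on each chassis
    downlink_gbps = NODE_ACCESS_PORTS * ACCESS_GBPS
    uplink_gbps = uplinks_per_node * UPLINK_GBPS
    return {
        "nodes": nodes,
        "ports": nodes * NODE_ACCESS_PORTS,
        "oversubscription": downlink_gbps / uplink_gbps,
    }


minimum = describe(chassis=2, links_per_chassis=2)
maximum = describe(chassis=4, links_per_chassis=1)
print("minimum:", minimum)
print("maximum:", maximum)
```

In both cases each QF/Node ends up with four 40 Gigabit uplinks, so the oversubscription stays at 3:1 while the node count doubles from 64 to 128.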

The QFabric architecture provides complete investment protection, allowing customers to deploy the QFX3500 first as a high-performance data center top-of-rack switch and then transition it to a QF/Node edge device in the QFabric architecture at their own pace, without requiring a wholesale replacement of the existing network. The QFabric architecture provides a clear path to a single-tier data center network, helping customers improve application performance and attain operational flexibility with minimal administrative overhead.

In some ways, we have now interconnected everything in the data center into one giant compute/storage farm. What are the applications that can take advantage of that? What will come as a result of it?

"...now the data center is the computer... I believe there will be cool applications that will show up as a result of QFabric."
Andy Ingram (VP of Marketing, FSG)

Assumptions:
* Ports: 6,000 ports
* Oversubscription (OS): 3:1 ToR > Fabric

If we build a 6,000-port network, we would have the four chassis and 125 of the edge nodes. This would give us a complete nonblocking environment that behaves as a single switch at Layer 2 and Layer 3.

Our competition is trying to do the same thing, but they have taken a different approach. We took a clean sheet of paper and designed from the ground up, which means we have minimized the number of pieces of hardware, the number of devices, and made it as simple as possible.

Our competitors are trying to build a nonblocking flat network out of the existing building blocks. What that looks like for Cisco FabricPath is that we have Nexus 5000s at the edge, two rows of Nexus 7000s, and another instantiation to get the Layer 3 functionality for the data center, because the protocol they run here is Layer 2 only. This is a pretty typical way to build a spine-and-leaf type of topology for a nonblocking fabric. I can't tell you how long it took to draw all these lines.

The scary thing is that each one of these lines represents 20 different links or more. What you find is you can't actually build this. When you do this, what you have is a whole bunch of loops, so rather than running spanning tree, which would turn off all this bandwidth, they are using a TRILL-like protocol called FabricPath. It is not TRILL and it is not interoperable with anything else, so it is still proprietary.

ButwhatitdoesisitcreatesLayer2tunnels butitisonlyLayer2.Theyactuallyhave separatehardwaretodoLayer3,whichcreatesitsownsetofchallenges,becausenow youactuallyhavetotrytodocapacityplanningbetweentwoflowsoftrafficthatyou neverhadtomeasurebefore. Asapointofcomparison,wehave1/3thenumberofdevices,require2/3lesspower, 90%lessfloorspace,andweare7to10timesfaster. Source: DataSheets DesignDocuments InternalTestingresults ReferenceLinks: http://www.cisco.com/en/US/docs/switches/datacenter/hw/nexus7000/installation/guid e/n7k_sys_specs.html http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/data_sheet_c78 618603.html http://www.cisco.com/en/US/products/ps9402/products_data_sheets_list.html *CompetitiveConfiguration: Spine: Chassis18slotwith16payloadslots Linecardsareonlycapableof230Gfabricswitching 10Glinerateperlinecard:23ports Latency:5usecwithinlinecard TRILLlikesupported Edge: ToRlatency:2usec 10Gdownlink:36ports 10Guplinks:12ports NoTRILLlikesupported


Some data centers build different kinds of networks for different kinds of apps, and we believe that one network should solve all the problems, if done right. We picked 6,000 as a big number, but that also plays out if you have 500, 1,000, 3,000, or 6,000.

The QFabric architecture can scale from just a few hundred to thousands of server/storage ports, helping customers build highly scalable, high-performance, highly efficient cloud-ready (private, public, or hybrid) data center infrastructures.

Source:
Data sheets
Design documents
Internal testing results

Reference links:
http://www.cisco.com/en/US/docs/switches/datacenter/hw/nexus7000/installation/guide/n7k_sys_specs.html
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/data_sheet_c78618603.html
http://www.cisco.com/en/US/products/ps9402/products_data_sheets_list.html
http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9402/white_paper_c11516396.html

*Competitive configuration:
Spine:
  Chassis: 18 slots with 16 payload slots
  Line cards are only capable of 230G fabric switching
  10G line rate per line card: 23 ports
  Latency: 5 usec within line card
  TRILL-like protocol supported
Edge:
  ToR latency: 2 usec
  10G downlinks: 36 ports
  10G uplinks: 12 ports
  No TRILL-like protocol supported


Assumptions (average numbers):
Power / kWh                $0.11
HVAC & lighting / kWh      $0.13
Rack space / year          $9,000
Network engineer / year    $200,000
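To show how these averages combine into a yearly operating cost, here is a minimal sketch. The function name and the example inputs (average power draw, rack count, engineer headcount) are hypothetical and not taken from the slides, and it assumes the HVAC and lighting rate is charged per kWh of equipment energy, on top of the power rate.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

# Average unit costs from the assumptions above (US dollars).
POWER_PER_KWH = 0.11
HVAC_LIGHTING_PER_KWH = 0.13
RACK_SPACE_PER_YEAR = 9_000
NETWORK_ENGINEER_PER_YEAR = 200_000


def annual_opex(avg_power_kw: float, racks: float, engineers: float) -> float:
    """Estimate yearly operating cost for a set of network devices.

    avg_power_kw: average combined power draw of the devices (hypothetical input)
    racks:        rack positions occupied (fractions allowed)
    engineers:    full-time-equivalent engineers needed to run the network
    """
    energy_kwh = avg_power_kw * HOURS_PER_YEAR
    # Assumption: cooling and lighting scale with the energy the devices consume.
    energy_cost = energy_kwh * (POWER_PER_KWH + HVAC_LIGHTING_PER_KWH)
    space_cost = racks * RACK_SPACE_PER_YEAR
    staff_cost = engineers * NETWORK_ENGINEER_PER_YEAR
    return energy_cost + space_cost + staff_cost


# Example with made-up inputs: 10 kW average draw, 2 racks, half an engineer.
example = annual_opex(avg_power_kw=10.0, racks=2, engineers=0.5)
print(f"${example:,.0f} per year")
```

Running the same function once for each design, with its own power, space, and staffing figures, gives a like-for-like comparison under these unit costs.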


The QFabric itself is in customer trials as we speak. Because this is a solution and not a product, we go through intensive and extended customer trials. We are expecting to bring it to market in Q3 of this year. The trials will probably last 4 to 6 months, depending on the customer.

In terms of future directions, we started in the middle in terms of how the fabric scales and how the price point scales. We will have mega fabrics that will ultimately scale to tens of thousands of physical ports and hundreds of thousands of virtual ports. We will also scale down to a version that will be very economical. We will have 40 and 100 Gigabit access speeds as well.


QFabric: The performance and operational simplicity of a single switch and the scalability and resiliency that you would expect from a single network.

We believe this will transform the data center network. QFabric will make it better for doing virtualization. QFabric will make it simpler to manage. QFabric will make the data center better, which gives it a better level of agility. Instead of having the tradeoff, we believe we can drive user experience, and the quality thereof, and the economics to a higher level, as well as take the cost to a lower level.


A change in data center network design is needed to ensure that organizations can take full advantage of their investments in new applications, virtualization, and storage and compute resources. The most efficient way for resources to interact is for them to be no more than a single hop away from each other. It's time to break the network barriers and build a network environment that is optimized for performance and simple to operate.

QFabric architecture would address the latency requirements of today's applications, eliminate the complexity of legacy hierarchical architectures, scale elegantly, and support virtualization, convergence, cloud computing, and other demanding requirements for the next-generation data center.

In some ways, we have now interconnected everything in the data center into one giant compute/storage farm. What are the applications that can take advantage of that? What will come as a result of it?

"...we always said the network is the computer, but now the data center is the computer... I believe there will be cool applications that will show up as a result of QFabric."
Andy Ingram (VP of Marketing, FSG)


Here is the rear view of the QFX3500 switch.

It is a high-performance, high-density 10G switch designed specifically for data center deployments.
It has a maximum of 63 10G ports, 4 40G ports.
It features 48 dual-mode 10G ports and four QSFP+ ports in a 1U form factor.

The QFX3500 delivers rich Layer 2 and Layer 3 connectivity to devices such as rack servers, blade servers, and storage systems.

It is a high-performance switch that offers flexible latency modes, cut-through or store-and-forward, both with sub-microsecond latency.

For converged server environments, the QFX3500 is also a standards-based Fibre Channel over Ethernet (FCoE) transit switch and FCoE-to-Fibre Channel (FCoE-FC) gateway, enabling customers to protect their investments in existing data center aggregation and Fibre Channel storage area network (SAN) infrastructures.

Features:
Junos L2 and L3
Rich DCB implementation
FCoE transit switch / FIP snooping
FCoE-FC gateway (NPIV)
40G-ready hardware


Layer 2 switching and feature-rich DCB:
Ultra-low latency (ULL) cut-through or store-and-forward, large MAC table of 96K, server virtualization support in hardware

Layer 3 switching:
ULL cut-through or store-and-forward, Layer 3 protocols in 2H 2011

FCoE transit switch / FIP snooping; can provide FCoE-FC gateway functionality for up to 12 FC ports


High-frequency trading in financial services uses real-time market feeds to enact millions of trades per second. Financial institutions such as JPMorgan rely on high-speed networks to carry these transactions. For them, faster access to transactions = more money.

High-performance computing (HPC) uses parallel processing for running advanced, intensive operations efficiently and quickly. The most common users of HPC systems are scientific researchers, seismographers, and military and academic institutions.

In the financial and HPC sectors, high performance (low latency and low oversubscription, in low-blocking or nonblocking configurations) is a key criterion for gaining competitive advantage. If we look at existing players such as Arista, Cisco, or BNT, there has been a tradeoff between latency, wire-speed performance, and port density.

Not anymore with the QFX3500.

Port density means fewer devices and fewer hops, which enables faster communication, minimizes overall application latency, and shrinks the data center footprint. Scalability and port density mean that the network can be expanded to meet increasing demands.

High throughput, with the ability to send large files or many small files within a certain amount of time, is critical for market data delivery.

Financials require


The QFX3500 combines high performance with ultra-low latency, consistent performance, and high port density.

Delivering sub-microsecond latencies demands cut-through and shared-memory switching technologies. The QFX3500's <1 us latency guarantees near-instantaneous information.

Arista offers two different products for two separate niches:

The Arista 7124 features low-latency characteristics and is suited for symmetric traffic flows and consistent environments such as HPC. The Arista 7124 cannot provide high port density, which leads to greater numbers of devices, higher CAPEX and OPEX, and more management complexity.

The Arista 7148SX offers up to 48 10GbE ports but has higher latency and jitter than the 7124 switch. Arista 7100 switches can only do cut-through mode, not store-and-forward latency mode, and cannot handle asymmetric 1 and 10GbE speeds or the types of traffic

Cisco has very high latency. Not only that, the latency is not consistent, so customers may not even find them suitable for this.

Juniper is superior to others that use similar merchant silicon (BNT), with full L3, advanced L2 features, and a migration path to fabric.


Cloud-based applications provide unique advantages for on-demand provisioning and elastic computing capacity with a pay-as-you-go pricing model.

The requirements are:
Scale
High port density
Large number of VLANs
Large MAC address tables (VMs): east-west VM traffic in a flat L2 network needs large MAC tables at the edge of the network
I/O convergence to reduce the number of CNAs and achieve economies

With the QFX3500, the target is small-scale private and public clouds for L2/L3 10G server access.

Cloud computing takes the form of web-based tools, applications, or storage that users can access through the web, dynamically.


Juniper's solution provides the scale and elasticity needed for cloud:
Large (63) 10GbE port density
With the largest MAC address tables (96K), it supports large L2 domains

Cisco can only provide a maximum of 48 ports with an expansion module; they don't offer full I/O convergence and have a limited MAC table.

We can easily beat Arista, as it has no scale, no path to fabric, and limited virtualization support.

The QFX3500 provides the management simplification required by cloud providers:
As part of a fabric, multiple QFX3500s can be managed as a single device, which leads to less complexity and savings in OPEX.


For a typical enterprise such as GM or Mercado, the data center is a cost center.

The enterprise market is segmented into small, mid-level, and large enterprise.

Aggregation: 10G, L3
Small/mid enterprise DC: 10G is used for DC aggregation. Key characteristics are high availability, port density, and availability of L2/L3 switching and routing protocols.
On top of these, large enterprises need 10G in access.

Also, storage convergence with FCoE transit and gateway functions helps consolidate resources and save cost.

The incumbent we see here will be Cisco, with the 4900M and Nexus 5548.


The QFX stands out among all the switches in these 3 segments:
In HFT/HPC, it is the fastest (sub-microsecond at all packet sizes), with L3 on all ports.
In cloud, it offers the largest port density and virtualization scale.
In enterprise, it provides optimum cost per port in terms of OPEX, power, and space usage, as well as manageability as part of a fabric.


Competitive vendors can support only a subset of LA features.

Cisco offers FCoE and FC gateway functionality but lacks:
ULL
Full L3 to every port
VEPA
40GE uplinks

Arista offers ULL, 40GE uplinks (maybe), and partial L3 but lacks full L3.

BNT offers ultra-low latency and L3 but lacks an FC gateway.


Su: To meet these different needs, customers have to choose different switches today.

For financial applications where low latency is critical, the Arista 7100 is preferred, though it does not offer port density and scale.

For scaled data centers, the Cisco 5548, with its 48 10G ports and path to fabric, is suitable.

Similarly, for enterprise data centers, customers are forced to compromise on performance and deploy the Catalyst 4900 to meet budget.

Introducing the performance-rich QFX3500. It can address multiple DC applications that previously could only be addressed with multiple switches or by sacrificing functionality.

The QFX3500 is superior in each segment of 10GbE ToR, with no compromises on faster, bigger, cheaper, and without adding complexity.


Depth: 28
Air flow: Front to back
