
Ichip Packet Flow

Dan Rautio

Copyright 2006 Juniper Networks, Inc.

Proprietary and Confidential

www.juniper.net

Overview
Complete I-chip packet flow
Iwi/Ifi: WAN and fabric input
Ipktwr: packet writer
Im: packet memory and notification memory buffer
Iif: incoming interface (IIF) index lookup
Ir: route lookup
Isr: route lookup and IIF/WO RLDRAM memory access
Ip: host packet handling
Imq: scheduling and queueing
Ipktrd: packet reader
Iwo/Ifo: WAN and fabric output

I-chip packet flow (diagram)


I-chip (diagram)


Iwi/Ifi WAN and fabric input

w_Pif: 8 type-1 PICs, or 2 type-2 PICs (BDIF)
w_Sif: SLIF PICs; remaps SLIF stream numbers to internal stream numbers
w_Inq: per-stream input buffering; asserts flow control (FC) to the PICs and dispatches the packet header to the L2/L3 engines and the payload to the dbuf
w_L23: L2/L3 microcode processing engines that decode the packet header (4 engines, 2 of them double engines); the notification, route key, and IIF key generated by the engines are passed to the segmentor
F_Kext: similar to w_L23, but on the fabric side; extracts the route key
W_dbuf: packet data buffering for the WAN side; absorbs the latency of header processing
W_seg: error checks (checksum, packet length, etc.), L2 header strip, counters, packet cellification; sends cells to Ipktwr

Iwi/Ifi WAN and fabric input

Iwi/Ifi receives packets from both the WAN and the fabric.
WAN packet header processing (at most the first 128 bytes):
- L2/L3 decapsulation and processing
- Route key extraction for the route lookup
- IIF key extraction for the IIF lookup
WAN packet data processing:
- Per-stream data queueing
- Packet-to-cell segmentation (see the sketch below)
- Cell type assignment
- Error counters
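
The packet-to-cell segmentation and cell type assignment can be pictured with a small software model. The sketch below assumes 64-byte cells (the cell unit quoted later for the Im pointers) and an invented single/first/middle/last type encoding; it is an illustration, not the W_seg hardware.

```c
#include <stdio.h>
#include <string.h>

#define CELL_SIZE 64  /* 64-byte cells, matching the cell unit quoted for Im */

/* Hypothetical cell type codes; the real encoding is not shown in this deck. */
enum cell_type { CELL_SINGLE, CELL_FIRST, CELL_MIDDLE, CELL_LAST };

struct cell {
    enum cell_type type;
    unsigned int   len;              /* valid bytes in this cell */
    unsigned char  data[CELL_SIZE];
};

/* Cellify a packet: split it into 64-byte cells and assign a cell type
 * to each one, the way W_seg hands cells to Ipktwr. Returns cell count. */
static int cellify(const unsigned char *pkt, unsigned int plen,
                   struct cell *out, int max_cells)
{
    int ncells = (plen + CELL_SIZE - 1) / CELL_SIZE;
    if (ncells > max_cells)
        return -1;
    for (int i = 0; i < ncells; i++) {
        unsigned int off   = (unsigned int)i * CELL_SIZE;
        unsigned int chunk = plen - off < CELL_SIZE ? plen - off : CELL_SIZE;
        memcpy(out[i].data, pkt + off, chunk);
        out[i].len = chunk;
        if (ncells == 1)          out[i].type = CELL_SINGLE;
        else if (i == 0)          out[i].type = CELL_FIRST;
        else if (i == ncells - 1) out[i].type = CELL_LAST;
        else                      out[i].type = CELL_MIDDLE;
    }
    return ncells;
}

int main(void)
{
    unsigned char pkt[200] = {0};
    struct cell cells[8];
    int n = cellify(pkt, sizeof(pkt), cells, 8);
    printf("200-byte packet -> %d cells (last cell carries %u bytes)\n",
           n, cells[n - 1].len);   /* 4 cells, last cell carries 8 bytes */
    return 0;
}
```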


Iwi/Ifi WAN and fabric input

Fabric packet header processing:
- Extract the 12B fabric notification and write it to the notification buffer
- Route key extraction for the route lookup
Fabric packet data processing:
- A single fabric input (Ifi) interface receives data from 16 sources
- Strip the 12B from the first fabric cell and re-cellify the packet data (see the sketch below)
- Cell type assignment
The results of the packet header/data processing are notifications and cells, which are sent to the packet writer (Ipktwr) block.
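
As a companion sketch for the fabric side, the fragment below strips the 12-byte fabric notification off the first cell and keeps the remainder for re-cellification. The notification layout is not given in this deck, so it is treated as an opaque 12-byte blob.

```c
#include <stdio.h>
#include <string.h>

#define FAB_NTF_LEN 12   /* 12-byte fabric notification in the first cell */
#define CELL_SIZE   64

/* Split the first fabric cell into the 12-byte notification (written to the
 * notification buffer) and the remaining payload bytes, which are then
 * re-cellified with the rest of the packet. Returns the payload byte count. */
static unsigned int split_first_fabric_cell(const unsigned char *cell,
                                            unsigned int cell_len,
                                            unsigned char ntf_out[FAB_NTF_LEN],
                                            unsigned char *payload_out)
{
    if (cell_len < FAB_NTF_LEN)
        return 0;                               /* runt first cell */
    memcpy(ntf_out, cell, FAB_NTF_LEN);         /* opaque 12B notification */
    memcpy(payload_out, cell + FAB_NTF_LEN, cell_len - FAB_NTF_LEN);
    return cell_len - FAB_NTF_LEN;
}

int main(void)
{
    unsigned char first_cell[CELL_SIZE] = {0};
    unsigned char ntf[FAB_NTF_LEN], payload[CELL_SIZE];
    unsigned int left = split_first_fabric_cell(first_cell, CELL_SIZE, ntf, payload);
    printf("payload bytes carried on from the first cell: %u\n", left);  /* 52 */
    return 0;
}
```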


Ipktwr packet writer

When Ipktwr receives notifications and data cells from the Iwi and Ip blocks (the Ip block injects host-originated traffic), it stores them in different buffer locations.
Notification cells are stored in the notification buffer (NTBUF) on a per notification queue (NTQ) basis.
Data cells are stored in the data cell buffer (DCBUF), and a write request is sent to the Spray block. The DCBUF address of the corresponding cell is sent to the memory interface block (MIF).
When the data rate from the host exceeds the permitted rate of 1 Gbps (notifications are not counted), the dma_vld signal sent to the Spray engine is de-asserted so that host traffic is ignored by the Spray engine (a small model is sketched below).
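
The 1 Gbps host gate can be modelled as simple byte-credit accounting on data cells (notifications excluded) driving a dma_vld flag. This is only a sketch under assumed granularity; the actual Spray interface timing is not described here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative host-rate gate: allow up to 1 Gbps of host data cells
 * (notification cells are not counted). dma_vld is de-asserted when the
 * running rate would exceed the budget, so the Spray block ignores host
 * traffic until credit accumulates again. */
#define HOST_RATE_BPS 1000000000ULL    /* 1 Gbps permitted host data rate */

struct host_gate {
    double budget_bytes;   /* accumulated byte credit */
    bool   dma_vld;        /* signal presented to the Spray block */
};

/* Called once per elapsed interval (in seconds) to refill credit. */
static void host_gate_tick(struct host_gate *g, double elapsed_s)
{
    g->budget_bytes += elapsed_s * (HOST_RATE_BPS / 8.0);
    if (g->budget_bytes > 0)
        g->dma_vld = true;                 /* back under the permitted rate */
}

static bool host_gate_send(struct host_gate *g, unsigned int cell_bytes)
{
    if (!g->dma_vld || g->budget_bytes < cell_bytes) {
        g->dma_vld = false;                /* Spray will ignore host traffic */
        return false;
    }
    g->budget_bytes -= cell_bytes;
    return true;
}

int main(void)
{
    struct host_gate g = { .budget_bytes = 0, .dma_vld = true };
    host_gate_tick(&g, 1e-6);                       /* 1 us of credit = 125 bytes */
    printf("64B cell accepted: %d\n", host_gate_send(&g, 64));   /* 1 */
    printf("64B cell accepted: %d\n", host_gate_send(&g, 64));   /* 0: over 1 Gbps */
    return 0;
}
```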


Ipktwr packet writer

A per-bank memory address is assigned by the Spray block for each cell write request, and the request is then pushed into the per-bank data cell write queues (DCWQ). There are 12 banks in total (3 DIMMs) that Ipktwr can use to write data cells.
The memory address offset is calculated and saved into the notification cell or the Icell (for packets of 6 cells or more) as required.
When an Icell is required, a reserved Icell space (ICBUF) holds the Icell before it is written into packet data memory via Im. The Spray block also skips one bank for the data cells and reserves the current bank number for the Icell (see the sketch below).
Once the Icell is constructed, it is enqueued into a separate Icell write queue (ICWQ).
The notification stays in the NTQ until all cells of the corresponding packet have been written into packet memory via Im.
Once that is done, the notification, along with the packet length (Plen) and address handle, is sent to the next block (Iif) for further processing.
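
A minimal model of the spraying behaviour, under the stated assumptions: 12 data banks, packets of 6 or more cells need an Icell, and the bank the rotation would have used next is reserved for that Icell. The real Spray address allocation is certainly more involved.

```c
#include <stdio.h>

#define NUM_DATA_BANKS  12   /* 3 DIMMs x 4 banks used for data cells */
#define ICELL_THRESHOLD 6    /* packets of >= 6 cells carry an indirect cell */

/* Spray the cells of one packet across the data banks. When an Icell is
 * needed, skip one bank in the rotation and reserve that bank number for
 * the Icell. Returns the bank chosen for the Icell, or -1 if none. */
static int spray_packet(int ncells, int *next_bank, int banks_out[])
{
    int icell_bank = -1;

    if (ncells >= ICELL_THRESHOLD) {
        icell_bank = *next_bank;                        /* reserve current bank */
        *next_bank = (*next_bank + 1) % NUM_DATA_BANKS; /* data cells skip it  */
    }
    for (int i = 0; i < ncells; i++) {
        banks_out[i] = *next_bank;
        *next_bank = (*next_bank + 1) % NUM_DATA_BANKS;
    }
    return icell_bank;
}

int main(void)
{
    int next_bank = 0, banks[16];
    int icell_bank = spray_packet(8, &next_bank, banks);
    printf("Icell bank: %d, data cell banks:", icell_bank);
    for (int i = 0; i < 8; i++)
        printf(" %d", banks[i]);
    printf("\n");   /* Icell bank: 0, data cell banks: 1 2 3 4 5 6 7 8 */
    return 0;
}
```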


Im packet memory

There are 4 DIMMs (16 banks) of memory in total, with 256 MB per DIMM. Three of them (12 banks) are used for packet data memory and one (4 banks) is used for notification memory (TNQ-to-DRAMQ / DRAMQ-to-HNQ transfers).
Cells being read are sent to the Iwo (WAN output), Ifo (fabric output), or Ip (host output) block.
Im sends bank request credits to Ipktwr, and Ipktwr asserts a grant signal with a valid cell address (cadr), base address (ba), and cell data to Im when the corresponding bank has positive credit.


Im packet memory

Im sends a bank credit to Ipktrd when Im is ready to accept more read requests. Ipktrd keeps a credit counter for each bank and generates a grant signal to Im when there is a bank read request and the bank credit is positive.
Im sends Imq_tnq a credit with a bank address when there is space available in the write request queue. Imq_tnq generates grant signals to Im when there is a pending write request and the corresponding bank request queue credit is positive.
Im sends Imq_hnq a credit with a bank address when there is space available in the read request queue. Imq_hnq generates grant signals to Im when there is a pending read request and the corresponding bank request queue credit is not zero.
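
All of the credit/grant handshakes above follow the same pattern: the memory side advertises per-bank credits, and the requester asserts a grant only when it has a pending request and the bank's credit counter is positive. A generic software sketch of that pattern (not the actual RTL) is shown below.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_BANKS 16   /* 4 DIMMs x 4 banks */

/* Per-bank credit counters kept by a requester (e.g. Ipktrd or Imq_tnq). */
struct credit_if {
    int credit[NUM_BANKS];
    int pending[NUM_BANKS];   /* outstanding requests per bank */
};

/* Im advertises one more free slot for this bank. */
static void rx_credit(struct credit_if *c, int bank)
{
    c->credit[bank]++;
}

/* Assert a grant toward Im only when a request is pending and the
 * corresponding bank credit is positive; otherwise hold the request. */
static bool try_grant(struct credit_if *c, int bank)
{
    if (c->pending[bank] > 0 && c->credit[bank] > 0) {
        c->pending[bank]--;
        c->credit[bank]--;
        return true;          /* grant asserted, request issued to Im */
    }
    return false;             /* wait for more credit from Im */
}

int main(void)
{
    struct credit_if rd = {0};
    rd.pending[3] = 2;                        /* two read requests for bank 3 */
    printf("grant: %d\n", try_grant(&rd, 3)); /* 0: no credit yet */
    rx_credit(&rd, 3);
    printf("grant: %d\n", try_grant(&rd, 3)); /* 1: credit arrived */
    return 0;
}
```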


Iif incoming interface

This is a new functional block designed for the I-chip only. In the Gimlet and Martini chipsets, a channel lookup method is used to determine the IIF index. The problem with the old channel lookup method is that the maximum lookup depth is 2, which limits flexibility.
The Iif block instead uses a JTREE-like (compressed JTREE) lookup format, so the IIF index lookup can go more than 2 levels deep.
The Iif data structure is stored in an 8 MB RLDRAM partition accessed via the Isr block. The final IIF value comes either from an on-chip per-stream state register or from the off-chip IIF data structure when the lookup succeeds.
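
The benefit over the 2-level channel lookup can be illustrated with a generic multi-level table walk. The node layout below is invented for illustration; the actual compressed-JTREE encoding is not shown in this deck. The fall-back to the on-chip per-stream IIF register mirrors the behaviour described above.

```c
#include <stdint.h>
#include <stdio.h>

/* Invented node format: each level consumes 4 bits of the lookup key and
 * either points at a child table or terminates with an IIF index. */
struct iif_node {
    int      is_leaf;
    uint32_t iif;        /* valid when is_leaf */
    int      child[16];  /* index of next-level node, 0 = no entry */
};

/* Walk an arbitrary number of levels (the old channel lookup stopped at 2). */
static uint32_t iif_lookup(const struct iif_node *tbl, uint32_t key,
                           uint32_t per_stream_iif)
{
    int idx = 0;                                  /* root node */
    for (int level = 0; level < 8; level++) {
        const struct iif_node *n = &tbl[idx];
        if (n->is_leaf)
            return n->iif;                        /* off-chip lookup result */
        int next = n->child[(key >> (28 - 4 * level)) & 0xF];
        if (next == 0)                            /* no entry at this level */
            return per_stream_iif;                /* on-chip per-stream IIF */
        idx = next;
    }
    return per_stream_iif;
}

int main(void)
{
    /* Tiny 3-level example: key nibbles 0x1, 0x2, 0x3 resolve to IIF 77. */
    struct iif_node tbl[4] = {
        { .is_leaf = 0, .child = { [1] = 1 } },   /* root: nibble 1 -> node 1 */
        { .is_leaf = 0, .child = { [2] = 2 } },   /* level 2: nibble 2 -> node 2 */
        { .is_leaf = 0, .child = { [3] = 3 } },   /* level 3: nibble 3 -> node 3 */
        { .is_leaf = 1, .iif = 77 },              /* leaf: IIF index 77 */
    };
    printf("IIF = %u\n", iif_lookup(tbl, 0x12300000u, 5));  /* 77 */
    printf("IIF = %u\n", iif_lookup(tbl, 0x99900000u, 5));  /* 5 (fallback) */
    return 0;
}
```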


Ir route lookup
This is the route lookup block on the I-chip. As on other platforms, the route lookup is based on a Jtree format; however, there are some enhancements to the Jtree structure that make the route lookup considerably more flexible.
RICP: receives notification/key pairs and sends them on to free key engines.
RKP: a pool of 13 key engines that perform the lookups for incoming packets.
RRCP: reorders the results coming back from the key engines; it also handles sampling.
RJTBL: holds the JTable memory and handles transactions from the various key engines as well as translation table transactions.
RMLP: performs multicast list processing.
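
Taken together, RICP/RKP/RRCP behave like a dispatch-to-free-workers pool with reordering at the output. The sketch below models only the reorder side in software; the engine count of 13 comes from the slide, everything else is assumed.

```c
#include <stdio.h>

#define NUM_KEY_ENGINES 13   /* RKP: pool of 13 key engines */
#define MAX_INFLIGHT    64

/* RRCP-style reorder buffer: results may finish out of order, but are
 * released strictly in arrival (sequence) order. */
struct reorder_buf {
    int done[MAX_INFLIGHT];     /* 1 when the lookup result is back */
    int result[MAX_INFLIGHT];
    int next_release;           /* next sequence number to hand downstream */
};

static void rrcp_complete(struct reorder_buf *rb, int seq, int result)
{
    rb->done[seq % MAX_INFLIGHT] = 1;
    rb->result[seq % MAX_INFLIGHT] = result;
}

/* Release as many in-order results as possible. */
static void rrcp_drain(struct reorder_buf *rb)
{
    while (rb->done[rb->next_release % MAX_INFLIGHT]) {
        int slot = rb->next_release % MAX_INFLIGHT;
        printf("release seq %d -> result %d\n", rb->next_release, rb->result[slot]);
        rb->done[slot] = 0;
        rb->next_release++;
    }
}

int main(void)
{
    struct reorder_buf rb = { .next_release = 0 };
    printf("RKP pool size: %d key engines\n", NUM_KEY_ENGINES);
    /* Key engines finish out of order: seq 1 before seq 0. */
    rrcp_complete(&rb, 1, 200); rrcp_drain(&rb);   /* nothing released yet */
    rrcp_complete(&rb, 0, 100); rrcp_drain(&rb);   /* releases seq 0 then 1 */
    return 0;
}
```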


Ir route lookup
Once the Jtree has been constructed, it is stored in RLDRAM. There are two partitions on the RLDRAM parts, each with an effective size of 16 MB.
The route lookup result is the Lout_key combined with a stream ID (SID), which indicates whether the packet should be forwarded to the WAN side or the fabric side.
When per-packet load sharing is enabled, a final next-hop selection process is performed to distribute the load evenly among the equal-cost paths.
When the "m" bit in the lookup result is 1, the result is multicast and the final next hop is interpreted as a pointer to a multicast list. The Rmlp (multicast list processor) block processes the multicast list on the I-chip. Its primary function is to retrieve the multicast final next-hop list at the end of key processing, after the reordering logic in the Rrcp block.
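
As a rough model of the final next-hop handling: if the multicast "m" bit is set, the result is a multicast list pointer handed to Rmlp; otherwise, with per-packet load sharing enabled, one of the equal-cost next hops is chosen so the load spreads evenly. The round-robin selection below is an assumption for illustration, not the documented I-chip algorithm.

```c
#include <stdint.h>
#include <stdio.h>

struct lookup_result {
    int      m_bit;        /* 1 = multicast: next_hop is a multicast list ptr */
    uint32_t next_hop;     /* unicast final next hop, or multicast list pointer */
    int      ecmp_count;   /* >1 when equal-cost paths exist */
    uint32_t ecmp[8];      /* equal-cost next hops */
};

/* Resolve the final next hop for one packet. rr_state is a per-route
 * round-robin counter, assumed here purely for illustration. */
static uint32_t resolve_next_hop(const struct lookup_result *r, uint32_t *rr_state)
{
    if (r->m_bit)
        return r->next_hop;    /* multicast list pointer, expanded by Rmlp */
    if (r->ecmp_count > 1)     /* per-packet load sharing: rotate the paths */
        return r->ecmp[(*rr_state)++ % (uint32_t)r->ecmp_count];
    return r->next_hop;
}

int main(void)
{
    struct lookup_result r = { .m_bit = 0, .ecmp_count = 3,
                               .ecmp = { 10, 11, 12 } };
    uint32_t rr = 0;
    for (int i = 0; i < 4; i++)
        printf("packet %d -> next hop %u\n", i, resolve_next_hop(&r, &rr));
    return 0;   /* prints 10, 11, 12, 10 */
}
```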


Isr RLDRAM access


This block acts as the memory controller for the I-chip RLDRAMs. Four RLDRAM parts are connected to each I2.0 chip, and each is a 288 Mbit RLDRAM part (32M entries x 9 bits/entry, with 8 data bits and 1 parity bit per entry). Hence, each RLDRAM part stores 32 MB of data.
The Irlkp subsystem has two RLDRAM parts dedicated to it. The JTREE data structures are replicated in both parts, because more memory bandwidth is needed rather than more capacity. Hence, the effective size of the RLDRAM memory for the Irlkp subsystem is 32 MB, or 8 Mwords (each word is 32 bits in the I2.0 architecture), even though the physical capacity is 64 MB.
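
The capacity figures follow directly from the part geometry; the short program below just reproduces the arithmetic (32M entries of 8 data bits per part, two replicated parts, 32-bit words).

```c
#include <stdio.h>

int main(void)
{
    /* One 288 Mbit RLDRAM part: 32M entries x 9 bits (8 data + 1 parity). */
    unsigned long long entries        = 32ULL * 1024 * 1024;
    unsigned long long data_bytes     = entries * 8 / 8;        /* 32 MB of data */

    /* Irlkp has two parts, but the JTREE is replicated across them for
     * bandwidth, so the effective capacity is that of a single part. */
    unsigned long long physical_bytes = 2 * data_bytes;         /* 64 MB physical */
    unsigned long long effective_mb   = data_bytes >> 20;       /* 32 MB effective */
    unsigned long long mwords_32bit   = data_bytes / 4 >> 20;   /* 8 Mwords */

    printf("per part: %llu MB, physical: %llu MB, effective: %llu MB = %llu Mwords\n",
           data_bytes >> 20, physical_bytes >> 20, effective_mb, mwords_32bit);
    return 0;
}
```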


Isr RLDRAM access

When a READ access is required (route lookup / firewall instruction lookup), both parts of the RLDRAM can be accessed. However, when a WRITE access is required for accounting / policer counters, only part 1 (R1) is used.
A similar arrangement applies to multicast traffic: multicast lists are kept only in part 2 (R2), so any traffic indicated by the MLP_SRM_RD counter affects only the R2 part.
The RLDRAM accesses from the Irlkp block are serviced in the order the requests are made. The route lookup key engine requests comprise JTREE lookups, firewall filtering read requests, accounting and policing transactions, and multicast list processing read requests.
The RLDRAM accesses from the Iif and Iwo blocks are serviced in round-robin fashion.
The maximum access rate for each RLDRAM part is 200 Mops at 95% efficiency. For example, for route lookups / firewall lookups the maximum is 200 Mops x 95% x 2 (two RLDRAM parts) = 380 Mops. However, for firewall counter writes the maximum access rate is only 200 Mops x 95% = 190 Mops, as these can only be done on partition 1 (R1).
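
The access-rate figures can be checked with the same back-of-the-envelope arithmetic, assuming the quoted 200 Mops per part at 95% efficiency:

```c
#include <stdio.h>

int main(void)
{
    double per_part_mops = 200.0;   /* max access rate per RLDRAM part */
    double efficiency    = 0.95;    /* quoted 95% */

    /* Reads (route lookup / firewall lookup) can use both parts. */
    double read_mops  = per_part_mops * efficiency * 2;   /* 380 Mops */

    /* Counter writes are confined to partition 1 (R1), i.e. one part. */
    double write_mops = per_part_mops * efficiency * 1;   /* 190 Mops */

    printf("read: %.0f Mops, counter write: %.0f Mops\n", read_mops, write_mops);
    return 0;
}
```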


Imq - queueing
Imq is the notification enqueue/dequeue block of the I-chip. It receives notifications from Ir, queues them in internal buffers (TNQ, HNQ) or in external DRAM (DRAMQ) through Im_ntf, and subsequently sends them to Ipktrd.
The memory queue is separated into three parts: TNQ, DRAMQ, and HNQ.
The TNQ receives notifications from the Irlkp block and maps the SID[6:0] and QS[2:0] fields of the incoming notification to the Imq internal queue number.
The DRAMQ is soft-partitioned per queue and configured by start and end pointers. Each pointer is 23 bits in cell (64-byte) units, consisting of 21 bits of bank cell address and 2 bits of bank address. Each cell holds 3 notifications, and the notification cells of each queue are sprayed evenly across the 4 DRAM banks (see the sketch below).
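
A sketch of the two bit-level details above: mapping SID[6:0] and QS[2:0] to a queue number, and packing a 23-bit DRAMQ cell pointer from a 21-bit bank cell address plus a 2-bit bank address. Only the field widths come from the slide; the bit ordering and the mapping function are assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed mapping for illustration: queue number = {SID[6:0], QS[2:0]}.
 * The slide only states that SID[6:0] and QS[2:0] are mapped to an internal
 * queue number, not how. */
static unsigned int imq_queue_number(unsigned int sid, unsigned int qs)
{
    return ((sid & 0x7F) << 3) | (qs & 0x7);
}

/* 23-bit DRAMQ pointer in 64-byte-cell units:
 * bits [22:2] = bank cell address (21 bits), bits [1:0] = bank (2 bits).
 * The bit ordering is an assumption; only the widths are from the slide. */
static uint32_t dramq_ptr(uint32_t bank_cell_addr, uint32_t bank)
{
    return ((bank_cell_addr & 0x1FFFFF) << 2) | (bank & 0x3);
}

int main(void)
{
    printf("queue = %u\n", imq_queue_number(5, 2));       /* 42 */
    printf("ptr   = 0x%06x\n", dramq_ptr(0x12345, 3));    /* 0x048d17 */
    return 0;
}
```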


Imq - queueing

The WAN is given enough DRAMQ space for a 100 ms buffer, and the rest is used for the fabric. The fabric memory is further partitioned equally among the 32 queues. This partitioning results in roughly a 100 ms delay-bandwidth buffer even with 256 MB DIMMs, assuming an average of 8 cells per packet (IMIX traffic).
HNQ: a global buffer that is soft-partitioned per queue and configured by a start and end pointer pair. The HNQ is carved statically among the queues: the fabric is allocated 512 notifications and the WAN 1536 notifications. The fabric space is split equally among the 32 queues, resulting in an HNQ size of 16 per queue.
RED is supported on both the WAN and fabric queues of Imq. For each queue, Imq maintains several counts (Mu, Mas, Bu, Buavg, Prv) that are used to determine which queues need to be visited by RED.
Imq supports 8 priority levels. Four priorities apply to queues with positive credit (hi, medium-hi, medium-lo, and low), and four different priorities apply to queues with negative credit (bonus-hi, bonus-medium-hi, bonus-medium-lo, and bonus-low).
Error statistics: aging, SID error, ECC error, Mu overflow.
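
The HNQ carving works out as stated (512 fabric notifications over 32 queues is 16 per queue), and the priority scheme splits into a normal group and a bonus group depending on the sign of a queue's credit. The sketch below reproduces the arithmetic; the numeric priority encoding is an assumption.

```c
#include <stdio.h>

/* HNQ static carving from the slide. */
#define HNQ_FABRIC_NTFS 512
#define HNQ_WAN_NTFS    1536
#define FABRIC_QUEUES   32

/* Assumed numeric encoding of the 8 Imq priority levels:
 * 0..3 for queues with positive credit (hi .. low),
 * 4..7 ("bonus" hi .. low) for queues with negative credit. */
static int imq_priority(int credit, int level /* 0=hi .. 3=low */)
{
    return (credit >= 0) ? level : 4 + level;
}

int main(void)
{
    printf("HNQ per fabric queue: %d notifications\n",
           HNQ_FABRIC_NTFS / FABRIC_QUEUES);                 /* 16 */
    printf("WAN HNQ allocation:   %d notifications\n", HNQ_WAN_NTFS);
    printf("hi priority, positive credit: %d\n", imq_priority(+5, 0));  /* 0 */
    printf("hi priority, negative credit: %d (bonus-hi)\n",
           imq_priority(-3, 0));                                         /* 4 */
    return 0;
}
```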


Ipktrd packet reader

Ipktrd is the packet reader block; it retrieves the data from the Im block using the notifications received from Imq and the host. This is basically the reverse of what the Ipktwr block does.
Ipktrd receives the notification from Imq and the host along with a QID. The QID range is 256-287 for the fabric and 0-255 for the WAN.
- The packet length (2 bytes) and address handle (8 bytes) fields are extracted from the notification, and the notification cell is stored in a free-pool notification buffer (NTBUF).
- The notification is then sent to either the Ifo or the Iwo block, depending on where the data should go. Before a notification is sent toward the fabric, Ipktrd first sends a cell buffer reservation request to Ifo to make sure there is room available in the fabric plane.


Ipktrd packet reader

The packet length and address handle are stored in the Plen_and_Handle buffer (PHBUF), a free-pool buffer, for every stream.
The incoming notifications of each stream form a first-in, first-out per-queue queue called the packet read queue (PRQ). Notifications are processed in order for the same stream. The stream arbitration (STARB) selects the next qualified stream for a context switch.
The arbitration scheme in STARB across all WAN streams is TDM; the time slots are based on the stream bandwidth programmed by software (a small model is sketched below).
For packets of more than 5 cells, the address handle is processed to calculate the first Icell address to read. If the stream speed is equal to or greater than GE, the first Icell of the following packet in the same PRQ is prefetched.
When a stream is selected by the arbitration, either the address handle, the Icell, or the stream state is fetched by the stream processing engine (STPRC).
The Icells are returned to Ipktrd and stored in the Icell buffer (ICBUF) for further processing.
Indirect cell prefetch is required for GE or higher speed streams. The ICBUF space of each stream is separated into two sections: one for regular Icells and one for prefetched Icells.
There is a software-programmed aging window used to check read addresses for aging against the latest write address from Ipktwr.
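
The WAN-side TDM arbitration in STARB can be modelled as a calendar of time slots allocated in proportion to the software-programmed stream bandwidths. The calendar length and the contiguous slot layout below are purely illustrative; a real calendar would interleave the slots.

```c
#include <stdio.h>

#define NUM_STREAMS    4
#define CALENDAR_SLOTS 16   /* assumed calendar length for illustration */

/* Build a TDM calendar where each stream gets slots in proportion to its
 * software-programmed bandwidth; the arbiter then walks it slot by slot. */
static void build_calendar(const int bw[NUM_STREAMS], int calendar[CALENDAR_SLOTS])
{
    int total = 0, slot = 0;
    for (int s = 0; s < NUM_STREAMS; s++)
        total += bw[s];
    for (int s = 0; s < NUM_STREAMS; s++) {
        int slots = bw[s] * CALENDAR_SLOTS / total;   /* proportional share */
        while (slots-- > 0 && slot < CALENDAR_SLOTS)
            calendar[slot++] = s;
    }
    while (slot < CALENDAR_SLOTS)                      /* fill any remainder */
        calendar[slot++] = 0;
}

int main(void)
{
    int bw[NUM_STREAMS] = { 1000, 1000, 2000, 4000 };  /* Mb/s, sw-programmed */
    int calendar[CALENDAR_SLOTS];
    build_calendar(bw, calendar);
    for (int i = 0; i < CALENDAR_SLOTS; i++)
        printf("%d ", calendar[i]);   /* stream chosen in each time slot */
    printf("\n");
    return 0;
}
```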


Iwo WAN output

Iwo receives cells from the Ipktrd block and performs the following actions before sending the packet out to the PIC:
- The L2/L3 micro-code engine performs IPv4 fragmentation and the redirect check
- Byte and packet counts per descriptor/stream
- Builds the L2/L3 header
- Transmits packets to the pif/sif blocks
The Iwo_ip block interfaces with the pktrd and Im controller. It collects notifications and data cells in a data buffer. Once enough data cells are collected for a packet, it sends the first two cells and the notification to the L23 engines to build the L2 and L3 headers. The remaining cells (if the packet has more than two cells) are sent to the iwo_lsif block.
The Iwo_desrd engine fetches the L2/Tag header data from the RLDRAM and the on-chip template. To do this, the Lout_key field of the notification is used for the first lookup.


Iwo WAN output

It generates the L2/L3 header and forwards the packet and L2 encapsulation to the L23 engines.
There are 4 L2/L3 engines in the I-chip to support the OC192-bandwidth data path. These four engines act as a free pool, and special logic has been added to avoid packet reordering within a stream.
There are two main processing units in each engine: the L2+Tag processing unit builds the L2 and Tag bytes, and the L3 processing unit handles the L3 processing. There are 320 entries of L2 instruction memory and 192 entries of L3 instruction memory.
At the end of the build for the first fragment, the engine unload logic checks whether fragmentation is required and then checks the DF bit. If the DF bit is set and the engine indicates that fragmentation is required, the hardware discards the current packet and sends an MTU error message to the host.
The IPv4 header checksum is also calculated by the micro-code engines (a software sketch of the standard algorithm follows below), and CRC verification is performed against the CRC value in the last cell of a packet as a data integrity check.
Finally, once the L2/L3 header and the data cells from the data buffer (Iwo_ip_dbuf) are ready, the packet is reassembled in the wo_spi output buffer and sent to the PIC.
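
The IPv4 header checksum that the micro-code engines compute is the standard one's-complement sum over the header's 16-bit words (RFC 791), with the checksum field treated as zero. A plain software version, not the micro-code, looks like this:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Standard IPv4 header checksum: one's-complement sum of all 16-bit words
 * in the header, with the checksum field itself taken as zero. */
static uint16_t ipv4_checksum(const uint8_t *hdr, size_t hdr_len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < hdr_len; i += 2) {
        if (i == 10)                     /* skip the checksum field (bytes 10-11) */
            continue;
        sum += (uint32_t)(hdr[i] << 8) | hdr[i + 1];
    }
    while (sum >> 16)                    /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}

int main(void)
{
    /* Commonly used 20-byte example header; the checksum field (offset 10)
     * is left as zero here and filled in by the sender. */
    uint8_t hdr[20] = {
        0x45, 0x00, 0x00, 0x3c, 0x1c, 0x46, 0x40, 0x00,
        0x40, 0x06, 0x00, 0x00, 0xac, 0x10, 0x0a, 0x63,
        0xac, 0x10, 0x0a, 0x0c
    };
    printf("checksum = 0x%04x\n", ipv4_checksum(hdr, sizeof(hdr)));  /* 0xb1e6 */
    return 0;
}
```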


Questions or Suggestions
If you have any questions regarding I-chip troubleshooting, please contact mx-escalation@juniper.net

If you have any questions about this presentation, contact drautio@juniper.net
