
DPDK single-core packet send/receive example: source code walkthrough


March 20, 2018 23:23:42
DPDK stores packets in mbuf structures, and uses a mempool to allocate and manage those mbufs.
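As a minimal sketch of that relationship (the pool name and helper function below are illustrative, not part of the example program): an application creates a pool once, then allocates and frees mbufs from it on the fast path.

    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* Illustrative helper: create a pool, allocate one mbuf, free it. */
    static int
    mbuf_pool_demo(void)
    {
        /* 8191 mbufs, 250-entry per-lcore cache, default buffer size. */
        struct rte_mempool *pool = rte_pktmbuf_pool_create("DEMO_POOL",
                8191, 250, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
        if (pool == NULL)
            return -1;

        struct rte_mbuf *m = rte_pktmbuf_alloc(pool); /* take one mbuf */
        if (m != NULL)
            rte_pktmbuf_free(m);  /* return it to the pool */
        return 0;
    }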

Data structures:
rte_mbuf - DPDK's encapsulation of a packet.
rte_ring - DPDK's lock-free ring buffer for high-performance producer/consumer scenarios, such as the front end and back end of virtio exchanging packets (a short usage sketch follows this list).
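A minimal usage sketch of the ring API (the ring name, size, and helper function are illustrative):

    #include <rte_lcore.h>
    #include <rte_ring.h>

    /* Illustrative single-producer/single-consumer ring, 1024 slots. */
    static void
    ring_demo(void)
    {
        struct rte_ring *r = rte_ring_create("DEMO_RING", 1024,
                rte_socket_id(), RING_F_SP_ENQ | RING_F_SC_DEQ);
        if (r == NULL)
            return;

        int value = 42;
        void *obj = NULL;

        /* Producer side: enqueue without taking a lock. */
        if (rte_ring_enqueue(r, &value) == 0) {
            /* Consumer side: dequeue the same object pointer. */
            rte_ring_dequeue(r, &obj);
        }
    }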
Commonly used functions:
rte_eal_init - DPDK initialization function. It reads the command-line parameters, parses the configuration, initializes hugepages and their management structures, and creates the DPDK threads; each thread then runs the user-written function via a launch call.
rte_memcpy - DPDK's copy function, which takes full advantage of single-instruction (SIMD) bandwidth.
rte_eal_remote_launch - launches the user function for execution on a given lcore.
rte_eth_rx_burst - receives a burst of packets from a physical port.
rte_eth_tx_burst - sends a burst of packets on a physical port.
The send/receive path can be roughly divided into two parts:
1. Configuration and initialization, mainly setting up the send and receive queues.
2. Acquiring and transmitting packets, mainly taking packets from a queue or placing packets into a queue.

Main function
    /*
     * The main function, which does initialization and calls the per-lcore
     * functions.
     */
    int
    main(int argc, char *argv[])
    {
        struct rte_mempool *mbuf_pool; /* pointer to the mbuf memory pool */
        unsigned nb_ports;             /* number of ports */
        uint8_t portid;                /* temporary port number */

        /* Initialize the Environment Abstraction Layer (EAL). */
        int ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");

        /* Skip the EAL arguments consumed by rte_eal_init(). */
        argc -= ret;
        argv += ret;

        /* Check that there is an even number of ports to send/receive on. */
        nb_ports = rte_eth_dev_count();      /* number of usable ports */
        if (nb_ports < 2 || (nb_ports & 1))  /* fewer than 2 ports, or an odd count */
            rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");

        /* Create a new mempool named "MBUF_POOL" to hold the mbufs, sized
         * NUM_MBUFS per port; a wrapper around rte_mempool_create(). */
        mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
            MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
        if (mbuf_pool == NULL)
            rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");

        /* Initialize all ports; each needs a port number and the mempool. */
        for (portid = 0; portid < nb_ports; portid++)
            if (port_init(portid, mbuf_pool) != 0)
                rte_exit(EXIT_FAILURE, "Cannot init port %" PRIu8 "\n",
                        portid);

        /* This program only uses one lcore; warn if more were enabled
         * (the lcore set is chosen with the -c coremask EAL option). */
        if (rte_lcore_count() > 1)
            printf("\nWARNING: Too many lcores enabled. Only 1 used.\n");

        /* Call lcore_main on the master core only. */
        lcore_main();

        return 0;
    }

1 Port initialization: port_init(portid, mbuf_pool)


    /*
     * Initializes a given port using global settings and with the RX buffers
     * coming from the mbuf_pool passed as a parameter. This example uses a
     * single queue in each of the tx and rx directions.
     */
    static inline int
    port_init(uint8_t port, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_conf port_conf = port_conf_default; /* start from the default config */
        const uint16_t rx_rings = 1, tx_rings = 1;  /* number of rx/tx queues */
        int retval;
        uint16_t q;

        /* rte_eth_dev_count() returns the number of usable Ethernet devices. */
        if (port >= rte_eth_dev_count())
            return -1;

        /* Configure the Ethernet device: port, queue counts, port config. */
        retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
        if (retval != 0)
            return retval;

        /* Allocate and set up 1 RX queue per Ethernet port: port, queue id,
         * ring size, socket id, queue options (NULL = defaults), mempool. */
        for (q = 0; q < rx_rings; q++) {
            retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL, mbuf_pool);
            if (retval < 0)
                return retval;
        }

        /* Allocate and set up 1 TX queue per Ethernet port: port, queue id,
         * ring size, socket id, queue options (NULL = defaults). */
        for (q = 0; q < tx_rings; q++) {
            retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL);
            if (retval < 0)
                return retval;
        }

        /* Start the Ethernet port. */
        retval = rte_eth_dev_start(port);
        if (retval < 0)
            return retval;

        /* Display the port MAC address. */
        struct ether_addr addr;
        rte_eth_macaddr_get(port, &addr);
        printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
                " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
                (unsigned)port,
                addr.addr_bytes[0], addr.addr_bytes[1],
                addr.addr_bytes[2], addr.addr_bytes[3],
                addr.addr_bytes[4], addr.addr_bytes[5]);

        /* Enable RX in promiscuous mode for the Ethernet device. */
        rte_eth_promiscuous_enable(port);

        return 0;
    }
The most important task in configuring the send/receive path is setting up the NIC's send and receive queues, including the addresses packets will be DMA-copied to. Once these addresses are configured, the NIC copies each arriving packet directly to the specified memory address through its DMA controller; to use a packet, the application only needs to fetch the data at that address from the corresponding queue.
Configuration starts from rte_eth_dev_configure(). Here the number of queues and the interface configuration, such as the queue usage mode and the multi-queue mode, are set according to the parameters.
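For a sense of what that configuration can express beyond this example's defaults, here is a sketch of enabling RSS over several RX queues; the flag names (ETH_MQ_RX_RSS, ETH_RSS_IP) are from the DPDK releases this article targets, and the function itself is illustrative, not part of the example:

    #include <rte_ethdev.h>

    /* Hypothetical multi-queue configuration: 4 RX queues with RSS. */
    static int
    configure_rss(uint8_t port)
    {
        struct rte_eth_conf conf = {
            .rxmode = {
                .mq_mode = ETH_MQ_RX_RSS,  /* spread flows across RX queues */
            },
            .rx_adv_conf = {
                .rss_conf = {
                    .rss_key = NULL,       /* use the driver's default key */
                    .rss_hf = ETH_RSS_IP,  /* hash on IP addresses */
                },
            },
        };

        /* 4 RX queues, 1 TX queue. */
        return rte_eth_dev_configure(port, 4, 1, &conf);
    }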
Receive queue initialization starts from rte_eth_rx_queue_setup(). The caller must specify the port_id, queue_id, and the number of descriptors, and can also pass an RX configuration, such as the thresholds for descriptor release and write-back.
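To make that RX configuration concrete, a caller could pass a struct rte_eth_rxconf instead of NULL; a sketch with illustrative values (the per-driver defaults are usually fine):

    #include <rte_ethdev.h>

    /* Illustrative RX queue setup with explicit thresholds. */
    static int
    setup_rx_queue(uint8_t port, uint16_t queue, struct rte_mempool *pool)
    {
        struct rte_eth_rxconf rx_conf = {
            .rx_free_thresh = 32,  /* free/refill descriptors in batches of 32 */
            .rx_drop_en = 0,       /* do not drop packets when out of descriptors */
        };

        return rte_eth_rx_queue_setup(port, queue, 128,
                rte_eth_dev_socket_id(port), &rx_conf, pool);
    }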
Various checks are performed first: whether the queue number is legal and valid, that the device has not already been started (a started device cannot be initialized), whether the function pointers are valid, and whether the mbuf data size matches the configuration in the default device information. Finally, the device's own queue setup function is called for the actual initialization. For ixgbe devices, rx_queue_setup is the function ixgbe_dev_rx_queue_setup(), which again checks first: the number of descriptors cannot be greater than IXGBE_MAX_RING_DESC or less than IXGBE_MIN_RING_DESC. Then come the key steps:
<1>. Allocate the queue structure and fill it in:

    /* Fill in members such as the owning mempool, descriptor count,
     * queue number, and owning port. */
    rxq = rte_zmalloc_socket("ethdev RX queue", sizeof(struct ixgbe_rx_queue),
            RTE_CACHE_LINE_SIZE, socket_id);

<2>. Allocate the descriptor ring, sized for the maximum number of descriptors:

    rz = rte_eth_dma_zone_reserve(dev, "rx_ring", queue_idx, RX_RING_SZ,
            IXGBE_ALIGN, socket_id);

    /* Get the addresses of the head and tail registers of the descriptor
     * queue; software updates these registers after packets are received
     * or sent. */
    rxq->rdt_reg_addr = IXGBE_PCI_REG_ADDR(hw, IXGBE_RDT(rxq->reg_idx));
    rxq->rdh_reg_addr = IXGBE_PCI_REG_ADDR(hw, IXGBE_RDH(rxq->reg_idx));

    /* Record the physical and virtual addresses of the RX descriptor ring. */
    rxq->rx_ring_phys_addr = rte_mem_phy2mch(rz->memseg_id, rz->phys_addr);
    rxq->rx_ring = (union ixgbe_adv_rx_desc *)rz->addr;
<3>. Allocate sw_ring. The objects stored in this ring are of type struct ixgbe_rx_entry, which is essentially a pointer to the packet mbuf:

    rxq->sw_ring = rte_zmalloc_socket("rxq->sw_ring",
            sizeof(struct ixgbe_rx_entry) * len,
            RTE_CACHE_LINE_SIZE, socket_id);

After these three steps, the important members of the newly allocated queue structure are filled in; the remaining members are then reset in ixgbe_reset_rx_queue():

    /* Zero the descriptor ring. (rte_zmalloc already zeroed the
     * allocation, so strictly speaking this is redundant.) */
    for (i = 0; i < len; i++) {
        rxq->rx_ring[i] = zeroed_desc;
    }

    /* Then initialize the remaining queue members. */
    rxq->rx_nb_avail = 0;
    rxq->rx_next_avail = 0;
    rxq->rx_free_trigger = (uint16_t)(rxq->rx_free_thresh - 1);
    rxq->rx_tail = 0;
    rxq->nb_rx_hold = 0;
    rxq->pkt_first_seg = NULL;
    rxq->pkt_last_seg = NULL;

With this, the receive queue is initialized. Send queue initialization performs the same preliminary checks as the receive queue; it differs only slightly in the setup part. After queue initialization, the descriptor ring and sw_ring are allocated, but the DMA engine still does not know where to copy packets. Since DPDK is zero-copy, how do the objects we allocated from the mempool get associated with the queues and the driver? Next comes the most interesting part: establishing the relationship between the mempool, the queue, DMA, and the rings.
Device startup begins at rte_eth_dev_start(): diag = (*dev->dev_ops->dev_start)(dev); which dispatches to the device's real start function, here ixgbe_dev_start(). It first checks the device's link settings; half-duplex and fixed-rate modes are not supported for now, so apparently only auto-negotiation is available. It then disables interrupts and stops the adapter with ixgbe_stop_adapter(hw), which calls ixgbe_stop_adapter_generic(); the main job there is to stop the transmit and receive units, done by writing registers directly. The hardware is then reset via ixgbe_pf_reset_hw() -> ixgbe_reset_hw() -> ixgbe_reset_hw_82599(), and finally the registers are set up (details omitted here). After that, the hardware is started.
Then the receive unit is initialized in ixgbe_dev_rx_init(). This function mainly sets various registers: configuring CRC checking, the corresponding registers if jumbo frames are supported, and the loopback registers if loopback mode is configured. The most important step is programming the DMA registers for each queue, identifying the address, length, head, and tail of each queue's descriptor ring:
    bus_addr = rxq->rx_ring_phys_addr;
    IXGBE_WRITE_REG(hw, IXGBE_RDBAL(rxq->reg_idx),
            (uint32_t)(bus_addr & 0x00000000ffffffffULL));
    IXGBE_WRITE_REG(hw, IXGBE_RDBAH(rxq->reg_idx), (uint32_t)(bus_addr >> 32));
    IXGBE_WRITE_REG(hw, IXGBE_RDLEN(rxq->reg_idx),
            rxq->nb_rx_desc * sizeof(union ixgbe_adv_rx_desc));
    IXGBE_WRITE_REG(hw, IXGBE_RDH(rxq->reg_idx), 0);
    IXGBE_WRITE_REG(hw, IXGBE_RDT(rxq->reg_idx), 0);

Here you can see that the physical address of the descriptor ring is written to a register, along with the ring's length. The packet buffer length is then calculated and written, and the NIC's multi-queue settings are configured as well. This completes the initialization of the receive unit.
Next, the transmit unit is initialized in ixgbe_dev_tx_init(). This mirrors the receive-unit initialization: register values are filled in, the key ones being the base address and length of the descriptor ring:
    bus_addr = txq->tx_ring_phys_addr;
    IXGBE_WRITE_REG(hw, IXGBE_TDBAL(txq->reg_idx),
            (uint32_t)(bus_addr & 0x00000000ffffffffULL));
    IXGBE_WRITE_REG(hw, IXGBE_TDBAH(txq->reg_idx), (uint32_t)(bus_addr >> 32));
    IXGBE_WRITE_REG(hw, IXGBE_TDLEN(txq->reg_idx),
            txq->nb_tx_desc * sizeof(union ixgbe_adv_tx_desc));
    /* Setup the HW Tx Head and Tail descriptor pointers */
    IXGBE_WRITE_REG(hw, IXGBE_TDH(txq->reg_idx), 0);
    IXGBE_WRITE_REG(hw, IXGBE_TDT(txq->reg_idx), 0);

After both units are initialized, the device's transmit and receive units can be started in ixgbe_dev_rxtx_start(). It first sets the threshold-related registers of each transmit queue (the thresholds used when sending, covered in the transmit section), then starts each receive queue in turn via ixgbe_dev_rx_queue_start():
    /* After checking that the queue to be started is valid, allocate the
     * actual mbufs that back this receive queue. */
    if (ixgbe_alloc_rx_queue_mbufs(rxq) != 0) {
        PMD_INIT_LOG(ERR, "Could not alloc mbuf for queue:%d",
                rx_queue_id);
        return -1;
    }

Here you will find the ultimate answer: the relationship between the mempool, the ring, the queue's descriptor ring, and the queue's sw_ring!
    static int __attribute__((cold))
    ixgbe_alloc_rx_queue_mbufs(struct ixgbe_rx_queue *rxq)
    {
        struct ixgbe_rx_entry *rxe = rxq->sw_ring;
        uint64_t dma_addr;
        unsigned int i;

        /* Initialize software ring entries */
        for (i = 0; i < rxq->nb_rx_desc; i++) {
            volatile union ixgbe_adv_rx_desc *rxd;
            struct rte_mbuf *mbuf = rte_mbuf_raw_alloc(rxq->mb_pool);

            if (mbuf == NULL) {
                PMD_INIT_LOG(ERR, "RX mbuf alloc failed queue_id=%u",
                        (unsigned) rxq->queue_id);
                return -ENOMEM;
            }

            rte_mbuf_refcnt_set(mbuf, 1);
            mbuf->next = NULL;
            mbuf->data_off = RTE_PKTMBUF_HEADROOM;
            mbuf->nb_segs = 1;
            mbuf->port = rxq->port_id;

            dma_addr = rte_cpu_to_le_64(rte_mbuf_data_dma_addr_default(mbuf));
            rxd = &rxq->rx_ring[i];
            rxd->read.hdr_addr = 0;
            rxd->read.pkt_addr = dma_addr;
            rxe[i].mbuf = mbuf;
        }
        return 0;
    }

We can see that nb_rx_desc mbuf pointers are allocated in a loop from the ring of the mempool that the queue belongs to, filling rxq->sw_ring; each pointer points to a packet buffer in the mempool. Each newly allocated mbuf is first initialized, most importantly by computing dma_addr; the queue's descriptor (rxd) is then initialized with it, telling the hardware to place the packet at dma_addr. The last statement stores the allocated mbuf in the queue's sw_ring, so a received packet lands directly in an mbuf reachable from sw_ring.
With the most important work done, the DMA engine can be enabled, ready to receive packets:

    hw->mac.ops.enable_rx_dma(hw, rxctrl);

Then the head and tail registers of the queue ring are set, which is also very important: the head is set to 0 and the tail to the number of descriptors minus 1, meaning that free descriptors fill the entire ring.

    IXGBE_WRITE_REG(hw, IXGBE_RDH(rxq->reg_idx), 0);
    IXGBE_WRITE_REG(hw, IXGBE_RDT(rxq->reg_idx), rxq->nb_rx_desc - 1);

Once this step is done, nothing else on the receive side is critical.
Starting a transmit queue is simpler than starting a receive queue: the txdctl register is configured, the code waits briefly for TX enable to complete, and finally the head and tail positions of the queue are both set to 0.

    txdctl = IXGBE_READ_REG(hw, IXGBE_TXDCTL(txq->reg_idx));
    txdctl |= IXGBE_TXDCTL_ENABLE;
    IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(txq->reg_idx), txdctl);
    IXGBE_WRITE_REG(hw, IXGBE_TDH(txq->reg_idx), 0);
    IXGBE_WRITE_REG(hw, IXGBE_TDT(txq->reg_idx), 0);

The send queue is started.

2 Main loop: lcore_main()


    /*
     * The lcore main. This is the main thread that does the work, reading
     * from an input port and writing to an output port. Marked
     * __attribute__((noreturn)) because the forwarding loop never exits.
     * It: 1. checks that each port is on the NUMA node of the polling thread,
     *     2. then runs the receive/forward while(1) loop.
     */
    static __attribute__((noreturn)) void
    lcore_main(void)
    {
        const uint8_t nb_ports = rte_eth_dev_count(); /* total port count */
        uint8_t port;                                 /* temporary port number */

        /*
         * Check that each port is on the same NUMA node as the polling
         * thread for best performance.
         */
        for (port = 0; port < nb_ports; port++)
            if (rte_eth_dev_socket_id(port) > 0 &&
                    rte_eth_dev_socket_id(port) != (int)rte_socket_id())
                printf("WARNING, port %u is on remote NUMA node to "
                        "polling thread.\n\tPerformance will "
                        "not be optimal.\n", port);

        printf("\nCore %u forwarding packets. [Ctrl+C to quit]\n",
                rte_lcore_id());

        /* Run until the application is quit or killed. */
        for (;;) {
            /*
             * Receive packets on a port and forward them on the paired
             * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc.
             */
            for (port = 0; port < nb_ports; port++) {

                /* Get burst of RX packets, from first port of pair:
                 * port, queue, buffer array, burst size. */
                struct rte_mbuf *bufs[BURST_SIZE];
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                        bufs, BURST_SIZE);

                if (unlikely(nb_rx == 0))
                    continue; /* nothing received; try the next port */

                /* Send burst of TX packets, to second port of pair.
                 * Packets received on port x are sent on port x ^ 1
                 * (0^1 = 1, 1^1 = 0), which pairs adjacent ports. */
                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                        bufs, nb_rx);

                /* Free any unsent packets: nb_rx were received, nb_tx were
                 * forwarded, so nb_rx - nb_tx remain to be freed. */
                if (unlikely(nb_tx < nb_rx)) {
                    uint16_t buf;
                    for (buf = nb_tx; buf < nb_rx; buf++)
                        rte_pktmbuf_free(bufs[buf]);
                }
            }
        }
    }

Note: "receiving" a packet means the driver has already placed it in memory and the upper-layer application extracts it from the queue; "sending" means the packet to be sent is placed into the transmit queue in preparation for the actual transmission by hardware.
The application gets packets starting from rte_eth_rx_burst():
    static inline uint16_t
    rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
             struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
    {
        struct rte_eth_dev *dev;

        dev = &rte_eth_devices[port_id];
        return (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
                rx_pkts, nb_pkts);
    }

rte_eth_tx_burst() takes ownership of the mbufs it successfully queues and frees them after transmission; mbufs it could not queue must be freed manually by the caller.
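A common pattern built on this behavior (a sketch, not code from the example) retries the unsent tail a few times before freeing what remains:

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Illustrative helper: send a burst, retrying the unsent tail,
     * then free whatever still could not be queued. */
    static void
    send_burst_with_retry(uint8_t port, struct rte_mbuf **bufs, uint16_t n)
    {
        uint16_t sent = 0;
        int retries = 3;

        while (sent < n && retries-- > 0)
            sent += rte_eth_tx_burst(port, 0, bufs + sent, n - sent);

        /* rte_eth_tx_burst() owns (and later frees) the queued mbufs;
         * the caller must free the rest. */
        while (sent < n)
            rte_pktmbuf_free(bufs[sent++]);
    }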
Complete code:
    #include <stdint.h>
    #include <inttypes.h>
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_cycles.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>

    #define RX_RING_SIZE 128    /* RX ring size */
    #define TX_RING_SIZE 512    /* TX ring size */
    #define NUM_MBUFS 8191
    #define MBUF_CACHE_SIZE 250
    #define BURST_SIZE 32

    static const struct rte_eth_conf port_conf_default = {
        .rxmode = { .max_rx_pkt_len = ETHER_MAX_LEN }
    };

    /* basicfwd.c: Basic DPDK skeleton forwarding example. */

    /*
     * Initializes a given port using global settings and with the RX buffers
     * coming from the mbuf_pool passed as a parameter. This example uses a
     * single queue in each of the tx and rx directions.
     */
    static inline int
    port_init(uint8_t port, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_conf port_conf = port_conf_default; /* default config */
        const uint16_t rx_rings = 1, tx_rings = 1;  /* number of rx/tx queues */
        int retval;
        uint16_t q;

        if (port >= rte_eth_dev_count())
            return -1;

        /* Configure the Ethernet device: port, queue counts, port config. */
        retval = rte_eth_dev_configure(port, rx_rings, tx_rings, &port_conf);
        if (retval != 0)
            return retval;

        /* Allocate and set up 1 RX queue per Ethernet port.
         * rte_eth_dev_socket_id(port) returns the NUMA socket the device is
         * attached to, so queue memory is allocated on the same node. */
        for (q = 0; q < rx_rings; q++) {
            retval = rte_eth_rx_queue_setup(port, q, RX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL, mbuf_pool);
            if (retval < 0)
                return retval;
        }

        /* Allocate and set up 1 TX queue per Ethernet port. No mempool is
         * passed here: TX transmits mbufs that were already allocated
         * elsewhere (e.g. on the RX side). */
        for (q = 0; q < tx_rings; q++) {
            retval = rte_eth_tx_queue_setup(port, q, TX_RING_SIZE,
                    rte_eth_dev_socket_id(port), NULL);
            if (retval < 0)
                return retval;
        }

        /* Start the Ethernet port. */
        retval = rte_eth_dev_start(port);
        if (retval < 0)
            return retval;

        /* Display the port MAC address. */
        struct ether_addr addr;
        rte_eth_macaddr_get(port, &addr);
        printf("Port %u MAC: %02" PRIx8 " %02" PRIx8 " %02" PRIx8
                " %02" PRIx8 " %02" PRIx8 " %02" PRIx8 "\n",
                (unsigned)port,
                addr.addr_bytes[0], addr.addr_bytes[1],
                addr.addr_bytes[2], addr.addr_bytes[3],
                addr.addr_bytes[4], addr.addr_bytes[5]);

        /* Enable RX in promiscuous mode for the Ethernet device. */
        rte_eth_promiscuous_enable(port);

        return 0;
    }

    /*
     * The lcore main. This is the main thread that does the work, reading
     * from an input port and writing to an output port.
     */
    static __attribute__((noreturn)) void
    lcore_main(void)
    {
        const uint8_t nb_ports = rte_eth_dev_count(); /* total port count */
        uint8_t port;

        /* Check that each port is on the same NUMA node as the polling
         * thread for best performance. */
        for (port = 0; port < nb_ports; port++)
            if (rte_eth_dev_socket_id(port) > 0 &&
                    rte_eth_dev_socket_id(port) != (int)rte_socket_id())
                printf("WARNING, port %u is on remote NUMA node to "
                        "polling thread.\n\tPerformance will "
                        "not be optimal.\n", port);

        printf("\nCore %u forwarding packets. [Ctrl+C to quit]\n",
                rte_lcore_id());

        /* Run until the application is quit or killed. */
        for (;;) {
            /* Receive packets on a port and forward them on the paired
             * port. The mapping is 0 -> 1, 1 -> 0, 2 -> 3, 3 -> 2, etc. */
            for (port = 0; port < nb_ports; port++) {

                /* Get burst of RX packets, from first port of pair. */
                struct rte_mbuf *bufs[BURST_SIZE];
                const uint16_t nb_rx = rte_eth_rx_burst(port, 0,
                        bufs, BURST_SIZE);

                if (unlikely(nb_rx == 0))
                    continue;

                /* Send burst of TX packets, to second port of pair
                 * (port ^ 1: 0^1 = 1, 1^1 = 0). */
                const uint16_t nb_tx = rte_eth_tx_burst(port ^ 1, 0,
                        bufs, nb_rx);

                /* Free any unsent packets (nb_rx - nb_tx remain). */
                if (unlikely(nb_tx < nb_rx)) {
                    uint16_t buf;
                    for (buf = nb_tx; buf < nb_rx; buf++)
                        rte_pktmbuf_free(bufs[buf]);
                }
            }
        }
    }

    /*
     * The main function, which does initialization and calls the per-lcore
     * functions.
     */
    int
    main(int argc, char *argv[])
    {
        struct rte_mempool *mbuf_pool; /* pointer to the mbuf memory pool */
        unsigned nb_ports;             /* number of ports */
        uint8_t portid;                /* temporary port number */

        /* Initialize the Environment Abstraction Layer (EAL). */
        int ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");

        /* Skip the EAL arguments consumed by rte_eal_init(). */
        argc -= ret;
        argv += ret;

        /* Check that there is an even number of ports to send/receive on. */
        nb_ports = rte_eth_dev_count();
        if (nb_ports < 2 || (nb_ports & 1))
            rte_exit(EXIT_FAILURE, "Error: number of ports must be even\n");

        /* Create a new mempool to hold the mbufs; a wrapper around
         * rte_mempool_create(). */
        mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NUM_MBUFS * nb_ports,
            MBUF_CACHE_SIZE, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
        if (mbuf_pool == NULL)
            rte_exit(EXIT_FAILURE, "Cannot create mbuf pool\n");

        /* Initialize all ports (port_init() is defined above). */
        for (portid = 0; portid < nb_ports; portid++)
            if (port_init(portid, mbuf_pool) != 0)
                rte_exit(EXIT_FAILURE, "Cannot init port %" PRIu8 "\n",
                        portid);

        /* This program only uses one lcore; warn if more were enabled
         * (the lcore set is chosen with the -c coremask EAL option). */
        if (rte_lcore_count() > 1)
            printf("\nWARNING: Too many lcores enabled. Only 1 used.\n");

        /* Call lcore_main on the master core only. */
        lcore_main();

        return 0;
    }
